Forum OpenACS Development: content_revision deleting

Collapse
Posted by Dave Bauer on
I am having a problem with a __delete function in static_pages. It
attempts to perform content_item__delete which is supposed to delete
all related revisions of a cr_item. It returns a RI error that it
cannot delete the live_revision or latest_revision.

Do I need to unset these columns before attempting the delete?

Collapse
Posted by Dan Wickstrom on
I just ran content-test.sql, and it worked correctly.  content-test.sql runs a variety of tests on the CR, and it deletes all of the content items at the end, so it looks like content_item__delete works correctly.  Are you running the latest code from cvs?  Also what version of postgresql are you running?
Collapse
Posted by Don Baccus on
Remember that his content items and revisions are of the "map existing file to CR object" variety, which postdates the writing of the tests.  Unless you updated the tests for this variant of CR objects it's possible this is a case missed by the tests...
Collapse
Posted by Dave Bauer on
I tried it with postgresql 7.1.2 and 7.1.3 and there was no difference. I will check into it more this evening.
Collapse
Posted by Dan Wickstrom on
When deleting content items, it shouldn't matter whether or not the content items are mapped to the file system - at least it shouldn't matter with respect to RI.  A trigger logs the content revisions that are deleted, and a scheduled proc comes along later and removes the deleted revisions from the file system.
Collapse
Posted by Don Baccus on
Hmmm...right.  Dave - the Schema Browser is working for Postgres and will list all tables that have RI checks on a given table, so poking around in the SB might help you figure out which tables might still have rows pointing to the entry being deleted.

(The SB doesn't work for Oracle due to some driver problems, I think(?)  I'm not sure but I've been informed it breaks in the older ACS 3.x Classic environment, too, ugh, I've been slowly looking into this as I have time).

Collapse
Posted by Dave Bauer on
OK I was wrong. It is only broken when the static-page resides in a static-folder. The deleting of folders and pages is all done in static_page__delete_stale_items. First it deletes all the static_pages, then all the folders. So apparently the static_page__delete_folder function is not seeing the pages as deleted because they are all in the same transaction??

Am I on the right track here?

Collapse
Posted by Dave Bauer on
Here is an example.
 static_page_id |          filename           | folder_id | show_comments_p
----------------+-----------------------------+-----------+-----------------
           2475 | /static1.html               |      2472 | t
           2478 | /static-folder/static1.html |      2477 | t
(2 rows)
static_page_id 2478 is supposed to be deleted. It lives in static_folder 2477. Here is what I am getting in the log:
[23/Oct/2001:05:41:47][771.8197][-conn1-] Notice: Querying 'select __exec_6_del
ete_old_files ();'
NOTICE:  ***Deleting id:2478
NOTICE:  *** Number of rows deleted: 1
NOTICE:  Deleting symlinks...
NOTICE:  Unscheduling item...
NOTICE:  Deleting associated revisions...
NOTICE:  Deleting associated item templates...
NOTICE:  Deleting item relationships...
NOTICE:  Deleting child relationships...
NOTICE:  Deleting parent relationships...
NOTICE:  Deleting associated permissions...
NOTICE:  Deleting keyword associations...
NOTICE:  Deleting associated comments...
NOTICE:  identifier "journal_entry__delete_for_object" will be truncated to "jo
urnal_entry__delete_for_objec"
NOTICE:  Deleting content item...
[23/Oct/2001:05:41:47][771.8197][-conn1-] Error: Ns_PgExec: result status: 7 message: ERROR:  -20000: 
Folder ID 2477 (/static-folder) cannot be deleted because it is not empty.
So it looks like performing both these deletes in the same function is causing the problem.
Collapse
Posted by Don Baccus on
It should see the deletes ... how is it determining that the folder's not empty?  Is this code checked into CVS yet, I'm really curious now...
Collapse
Posted by Dave Bauer on
Here is the offending query and error message from content_folder__delete.
  -- check if the folder contains any items

  select count(*) into v_count from cr_items
   where parent_id = delete__folder_id;

  if v_count > 0 then
    v_path := content_item__get_path(delete__folder_id, null);
    raise EXCEPTION ''-20000: Folder ID % (%) cannot be deleted because it is n
ot empty.'', delete__folder_id, v_path;
  end if;
From my log, it shows that content_item__delete ran with no error message. I will check-in this code, this is broken in previous versions anyway, it was never tested.
Collapse
Posted by Dave Bauer on
Oops.

This requires patches to content-item.sql and content-folder.sql in the acs-content-repository. I am sending them via email to Don.

Collapse
Posted by Dan Wickstrom on
There must be something else involved here. The deletion of content folders seems to work fine. Here is a test case where I create a folder with a content item. I then set the revision of the content item live, and then I delete the content item and its folder in one step using a pl/pgsql function.
openacs4=# select * from cr_items;
 item_id | parent_id |       name       | locale | live_revision | latest_revision | publish_status |   content_type   | storage_type | storage_area_key | tree_sortkey 
---------+-----------+------------------+--------+---------------+-----------------+----------------+------------------+--------------+------------------+--------------
    -100 |         0 | pages            |        |               |                 |                | content_folder   | text         | CR_FILES         | /00
    -200 |         0 | templates        |        |               |                 |                | content_folder   | text         | CR_FILES         | /01
     991 |      -200 | default_template |        |           992 |             992 |                | content_template | text         | CR_FILES         | /01/00
    2458 |      -100 | test folder      |        |               |                 |                | content_folder   | text         | CR_FILES         | /00/00
    2459 |      2458 | simple item      |        |          2461 |            2462 | ready          | content_revision | text         | CR_FILES         | /00/00/00
(5 rows)


And here is the deletion function:

create function delete_test_folder(integer)
returns integer as '
declare
        folder_id       alias for $1;
        v_rec           record;
begin
        for v_rec in select item_id from cr_items where parent_id = folder_id
        LOOP
                PERFORM content_item__delete(v_rec.item_id);
        end LOOP;

        PERFORM content_folder__delete(folder_id);

        return null;
end;' language 'plpgsql';

Here is the folder and its content item being deleted:

openacs4=# select delete_test_folder(2458);
NOTICE:  Deleting symlinks...
NOTICE:  Unscheduling item...
NOTICE:  Deleting associated revisions...
NOTICE:  Deleting associated item templates...
NOTICE:  Deleting item relationships...
NOTICE:  Deleting child relationships...
NOTICE:  Deleting parent relationships...
NOTICE:  Deleting associated permissions...
NOTICE:  Deleting keyword associations...
NOTICE:  Deleting associated comments...
NOTICE:  identifier "journal_entry__delete_for_object" will be truncated to "journal_entry__delete_for_objec"
NOTICE:  Deleting content item...
NOTICE:  identifier "content_folder__unregister_content_type" will be truncated to "content_folder__unregister_cont"
NOTICE:  Deleting symlinks...
NOTICE:  Unscheduling item...
NOTICE:  Deleting associated revisions...
NOTICE:  Deleting associated item templates...
NOTICE:  Deleting item relationships...
NOTICE:  Deleting child relationships...
NOTICE:  Deleting parent relationships...
NOTICE:  Deleting associated permissions...
NOTICE:  Deleting keyword associations...
NOTICE:  Deleting associated comments...
NOTICE:  Deleting content item...
 delete_test_folder 
--------------------
                   
(1 row)

And here is cr_items afterwards:

openacs4=# select * from cr_items;
 item_id | parent_id |       name       | locale | live_revision | latest_revision | publish_status |   content_type   | storage_type | storage_area_key | tree_sortkey 
---------+-----------+------------------+--------+---------------+-----------------+----------------+------------------+--------------+------------------+--------------
    -100 |         0 | pages            |        |               |                 |                | content_folder   | text         | CR_FILES         | /00
    -200 |         0 | templates        |        |               |                 |                | content_folder   | text         | CR_FILES         | /01
     991 |      -200 | default_template |        |           992 |             992 |                | content_template | text         | CR_FILES         | /01/00
(3 rows)

As you can see, there is no problem with deleting the folder. Are you sure the there is nothing else in the CR that has your folder as its parent? What is in cr_items before you try to delete the folder?
Collapse
Posted by Dave Bauer on
Thanks Dan and Don for helping my with this.

It turned out to be a silly mistake, but I sure learned alot about debugging pl/pgsql.

The trigger function to create sortkeys on insert into sp_folders was broken, causing static_page__delete_stale_items to attempt to delete all folders in the tree.

Collapse
Posted by Dave Bauer on
OK, I was wrong. I can delete a regular content_item, but not a static_page. It gives me the same error as before:

openacs-4-test=> select * from static_pages;
 static_page_id |       filename        | folder_id | show_comments_p
----------------+-----------------------+-----------+-----------------
           2388 | /static1.html         |      2387 | t
           2391 | /static1/static1.html |      2390 | t
(2 rows)

openacs-4-test=> select static_page__delete(2391);
NOTICE:  ***Deleting id:2391
NOTICE:  *** Number of rows deleted: 1
NOTICE:  *** selected 0 rows still in static_pages
NOTICE:  Deleting symlinks...
NOTICE:  Unscheduling item...
NOTICE:  Deleting associated revisions...
NOTICE:  Deleting associated item templates...
NOTICE:  Deleting item relationships...
NOTICE:  Deleting child relationships...
NOTICE:  Deleting parent relationships...
NOTICE:  Deleting associated permissions...
NOTICE:  Deleting keyword associations...
NOTICE:  Deleting associated comments...
NOTICE:  identifier "journal_entry__delete_for_object" will be truncated to "journal_entry__delete_for_objec"
NOTICE:  Deleting content item...
ERROR:  Referential Integrity: attempting to delete live_revision: 2392
If I use content_item__delete(2391) I get the exact same output. I created a generic content_item with content_item__new and created a live revision. The content_item and revision were deleted without error using content_item__delete. I can't think of where else to look. There is a constrainst on cr_revisions that will not allow a revision to be deleted if it is referred in the live_revision of latest_revision column of cr_items.
Collapse
Posted by Don Baccus on
If content_revisions had an "on delete cascade" on the key referring to content_items you could just delete the item and be done with it...this would still allow the deletion of individual revisions, of course.

This doesn't answer your question but sometimes I think we make this stuff a lot harder than necessary by not using SQL's referential integrity operations...

Collapse
Posted by Dan Wickstrom on
Try modifying static_page__delete so that it sets live_revision null if the static page is the live revision, and if the static page is the latest revision, set the latest_revision to the next latest revision if other revisions exist - otherwise set the latest_revision to null.

It would be nice to be able to use the RI features here, but in this particular case, postgresql is kind of broken.  If you have RI triggers on latest and live revision, the RI triggers think that live and latest revision point to something even if you set them to null during the transaction.  I've reimplemented the RI triggers using regular user-level triggers to avoid this particular bug.

Collapse
Posted by Robert Locke on
Hey there,

I just upgraded to the alpha release and find that I'm also having problems deleting content items.

Basically:

* I create a content item which has a single live/latest revision.
* I then call content_item__delete on said item.
* content_item__delete then calls acs_object__delete on all its revisions (of which there is only 1 in this case, but it doesn't really matter)
* acs_object__delete then deletes the revision from cr_revisions
* however, the trigger on cr_revisions prevents the deletion of the live/latest revision and an error results

Dan: is that the bug you're referring to?

Why did this work in the pre-alpha?  Is the solution to null out the live/latest revision before performing a content_item__delete?

Thanks!

Collapse
Posted by Dave Bauer on
Robert,

I can create and delete a default content_item just fine. It appears my problem is with the creation of the 'static_page' content_items. I set the content_item subtype to 'content_revision' instead of 'content_item' which screws up acs_object_delete where it attempts to delete the item in cr_items before the acs_object is deleted.

Collapse
Posted by Dan Wickstrom on
I looked at this a little, and I was able to recreate your problem.  This might be a RI bug in postgresql, but I'm not 100% sure yet.  I did notice that I was able to delete a static page (I had to set live and latest revision to null first), if I recreated the steps individually in the context of a transaction.  I'm going to look into this some more when I get the time, but in the mean time, I would suggest trying to convert the sql functions to tcl procs which then wrap the individual steps in a transaction.
Collapse
Posted by Robert Locke on
That fits in nicely with what I observed, Dave. Deleting content items only became a problem when I changed the type to content revision.

Also, as Dan mentioned, nulling out the live/latest revision allowed me to successfully delete the item.

Collapse
Posted by Dave Bauer on
It worked.

Deleting is fine.

Dan, sorry to send you on a wild goose chase. When I changed the content_type to static_page to properly support the new search package, I changed it in the wrong place.

In the object type, the supertype was static_page which should have been content_revision.

In content_item__new the subtype was content_revision, it should have been content_item.

Thanks!

Collapse
Posted by Dan Wickstrom on
Well I'm glad to hear that.  I was starting to think that it was another variation on that same !$#%@ postgresql RI bug.
Collapse
Posted by Robert Locke on
> In content_item__new the subtype was content_revision, it should
> have been content_item.

I tried it. And everything seems to work now. Deletion and search.

However, is this a documentation bug in packages/search/www/doc/guidelines.html?

  • Whenever you call content_item__new, call it with 'content_revision' as the item_subtype and 'your_content_type' as the content_type.

Thanks!

Collapse
Posted by Don Baccus on
Dan - have you reported the RI issue to the PG group?  Tom Lane's digging into the whole "data trigger change violation" issue (which is going to go away in practice, we've known for a long time that we (Jan Wieck, Steve Szabo and I) misinterpreted the standard in this case - read the SQL92 standard and you'll see why three reasonably smart people could all misinterpret it!.

Since he's mucking around in this area already he might have time to look into the problem you're reporting.

Collapse
Posted by Dan Wickstrom on
Vinod originally found this problem and reported it to Tom Lane.  IIRC, Tom passed him off to Steve Szabo.  Steve gave some vaque response about there being a problem with the RI triggers, but as far as I know, no definite plans for fixing the problem were forthcoming.  I think that may have been more than six months ago.