Forum OpenACS Q&A: possible big problem with content_item__get_path

Hi,

In the older bcms I tried to avoid using this plsql due to performance reason.  Although using th list builder and seeing that my convention for getting the path on the older bcms is pretty unconventional (returning rows of the path and concatinating in tcl).  Now I change my tcl procs to just use content_item__get_path, to get the path of a content item.

Having being frustrated with some other issue.  I did some very fast browser refresh.  Since pg is in the local machine, the request was pretty fast.  PG cpu util then suddenly climb up.  It boils down to the call content_item__get_path and i believe its cause by its use of cursors on the plsql.  Here is a what i got when i cancelled the query:

Cancel request sent
WARNING:  Error occurred while executing PL/pgSQL function content_item__create_abs_cursor
WARNING:  line 7 at SQL statement
ERROR:  Query was cancelled.

content_item__get_path is a pretty long plsql query.  I don't have full understanding, especially the cursors part.  Is it just a too complex plsql that needs a different implementation?

Collapse
Posted by Ola Hansson on
I agree that it looks really complex. Especially if get_path__root_folder_id is specified, which I believe is often the case.

This proc has been introduced in the new "pretty url"-aware code in file storage (on the forthcoming 5.0 HEAD), too, so if this function needs to perform better, let's do it 😊.

BTW, Jun, speaking of possible performance issues in the CR ... have you been seeing any performance issue with content_item__get_id(), if you have even used it in bcms, that is? I've started seeing a considerable amount of lack of speed in the new file storage when it comes to resolving an item from its path, i.e., downloading a file by means of the new index.vuh under /view, using content_item__get_id(). It seems to be a bit quicker on Oracle, though.

I am working with FS "pretty urls" for Heidelberg, and Janine is implementing it for Sloan, too, so I have good faith we will work it out soon.

/Ola

Collapse
Posted by Jun Yamog on
Hi Ola,

content_item__get_id runs fine on me.  But then I haven't done any stress testing.

Collapse
Posted by Thomas Senn on
Hello everybody,

I was kind of happy to read Jun's post about the proc content_item__get_path crashing postgres. I indeed have exactly the same problem. Having in mind that it is related to the use of cursors, I checked the source of the same proc from the CVS to see if the cursor thing had changed (we are using a 4.6.3 ditro of OpenACS). And it had effectively been rewritten, shorter and above all, not making use of cursors any more. So I just pasted the code of this proc into our test db to see if no side effects were to deplore. And it works great, letting me hope that the server will never crash again...

However, I discoverd before reading this post that the crashing request (the one calling content_item__get_path, which in turn calls content_item__create_abs_cursor, which hangs on the "update get_path_cursors set abs_cursor_pos = 0;" line) was not *always* crashing postgres. It means that under certain circumstances (like just after a server reboot) things were going smoothly, as long as a special "state" of the postgres database was not reached. A special state from which a single call to the mentioned proc would then crash postgres. So even if the replacement of the proc's source seems to solve the problem, two questions remain :

- Are they any other procedures of postgres making use of cursors (I guess a lot) and moreover, likely to crash the database? In other words, are cursors necessary and why are they related to pg crashes, is it a kind of bug in pg or is it the programmer's duty to write cleanly closing cursors procs?

- Has anybody an idea of what this postgres "special state" could be, and/or how it occurs? Is it likely to be related only to cursors or are they any other factors that could participate in such an evolution towards this special state. Note that a call to the db_release_unused_handles proc has already been added in the dotlrn-master, ensuring that potentially unused handles are closed regularly.

I hope someone very informed about cursors and/or pg internals could answer.

Clearly, my questions might be too general but I have been annoyed by this problem since a long time and even if I feel close to a solution, I would be delighted to know more about this less and less mysterious but still not totally obvious issue.

Thanks a lot and happy new year!