Forum OpenACS Q&A: Lots of tcl_url2file request processor errors in log file

We upgraded our intranet from ACS to OpenACS on Sunday, and it's been running fairly well these last few days. The only real downside has been that the memory demands of Postgres seem to be much higher than Oracle's, something which surprised me a bit. My load drifts between 1 and 4 throughout the day. Part of this may be due less to memory differences than to the fact that project-manager is much more demanding (and unoptimized) than the ticket tracker was in ACS, and that a proprietary application I ported to OpenACS now has some inefficient hacks to compensate for a lack of decent hierarchical support in Postgres.

When I came in to work this morning, the server wasn't running. I restarted the server, and everything started working again. However, every request seems to be accompanied by an error in the error.log file:

[21/Apr/2004:09:09:38][2108.4101][-conn:ibr::1] Notice: RP (10.544 ms): error in rp_handler: can't read "tcl_url2file(/)": no such element in array
[21/Apr/2004:09:09:52][2108.7176][-conn:ibr::4] Notice: RP (13.849 ms): error in rp_handler: can't read "tcl_url2file(/directorynamehere/)": no such element in array
Any ideas of what's going on? Is this something to worry about?
Ouch, something got trashed, yes, I'd worry.  If it happens again I'd add diagnostic code to see why tcl_url2file("/") fails.  The array caching the information got hosed somehow.
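If it helps, something along these lines near the failing lookup (just a sketch; $url is whatever URL the request processor is handling, and the log messages are made up) would show whether the whole array is gone or only the one entry:

    # hypothetical diagnostic, not the real rp_handler code
    global tcl_url2file
    if { ![array exists tcl_url2file] } {
        ns_log Warning "rp diagnostic: tcl_url2file array missing in this thread"
    } elseif { ![info exists tcl_url2file($url)] } {
        ns_log Warning "rp diagnostic: no tcl_url2file entry for '$url';\
            cached keys: [array names tcl_url2file]"
    }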

You were running in performance mode? Actually, "/directorynamehere/" sounds suspicious; where does that come from? Which package?

As far as PG vs. Oracle goes ... I haven't noticed drastic differences between the two.  Memory usage in PG should mostly be determined by your PG configuration parameters.  Speed should be roughly comparable for a wide range of operations.

Make sure you've ANALYZEd your tables.
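If you'd rather do that from inside the server than from psql, a hedged one-liner (assuming the OpenACS db API is loaded; "analyze_db" is just an arbitrary statement name, and whether a bare ANALYZE goes through db_dml depends on the PG driver):

    db_dml analyze_db "analyze"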

Of course, you give two reasons why it might be your fault rather than PG's, and I suspect you may be on to something :)

3: "hierarchical support"? (response to 1)
Posted by Andrew Piskorski
What does "decent hierarchical support in Postgres" mean?
Don wrote:

Ouch, something got trashed, yes, I'd worry. If it happens again I'd add diagnostic code to see why tcl_url2file("/") fails. The array caching the information got hosed somehow.

Ouch, those are all over our log file as well. Since they were always there, I assumed it was just a matter of a cache miss being logged the first time a url was requested.

Checking, I see 8000 occurrences this morning, with 3000 different URLs.

I just realized now that the error message is sometimes

no such element in array

which should be normal for a cache miss, and sometimes

no such variable

which seems BAD.
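For what it's worth, you can reproduce both messages in a plain tclsh; the first means the array exists but the key doesn't, the second means the array variable itself isn't there (the names below are just for illustration):

    array set demo {foo 1}
    catch { set demo(bar) } msg
    puts $msg    ;# can't read "demo(bar)": no such element in array

    catch { set other(bar) } msg
    puts $msg    ;# can't read "other(bar)": no such variable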

I might be totally off the mark here, since I don't really understand the inner workings of the request processor.

Re: tcl_url2file errors:
------------------

Thank you all for your comments. I always appreciate feedback.

I turned off performance mode and restarted, and the error messages went away.

Re: performance on Postgres
------------------------

I think these were mostly due to poor coding. While porting, my main focus was on getting things working, and quickly. There was so much to do, and the deadline was so inflexible, that I didn't have the time to worry about optimization.

Now I'm tackling the largest problems, and the response time on these pages has gone down significantly. So hierarchical queries are not the problem.

Actually, the system load may not be due to those pages at all, but to queries that get hung for some reason. Every day this week, I've looked at the monitoring page:

/acs-admin/monitor

and found that there are queries hanging. I'm not sure why it keeps happening, or what is causing it. It does look like it is happening on two pages in project-manager, both of which use list-builder and have fairly expensive queries.

Conn#  IP             State    Method  URL                              # Seconds     Bytes
167    63.225.86.72   running  GET     /intranet/project-manager/index  10825.447178  -186
1326   192.168.1.57   running  GET     /intranet/project-manager/tasks  198.430493    -180
1361   64.122.45.238  running  GET     /acs-admin/monitor               0.72462       0

How could something be returning negative bytes?

(Please pardon the old thread revival.)

I'm dealing with some nasty performance problems tonight. I tried switching on performance mode and got a pile of error messages. Nice to hear that some are just cache misses, but I also have a collection of these:

[11/Sep/2005:19:08:39][17774.114696][-conn4-] Notice: RP (2950.736 ms): error in rp_handler: can't read "tcl_url2file(/dotlrn/)": no such variable

dotlrn definitely exists, and resolves fine _mostly_. Likewise with a bunch of custom code, e.g.:

[11/Sep/2005:19:08:38][17774.180236][-conn8-] Notice: RP (1895.061 ms): error in rp_handler: can't read "tcl_url2file(/dotlrn/classes/vtchemistry/generalchemistry/drsariskyschemistry1035/chem-assess/question-feedback)": no such variable

Any suggestions for where to look for this? I'd -really- like to switch over to performance mode; I need all the help I can get with this server's load!

Thanks!

No worries. It just means that you're running in performance mode and haven't been running long enough yet for all your connection threads to have a cached url-to-file mapping.
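Very roughly, the caching pattern behind those messages looks like the sketch below (simplified, and rp_resolve_file is an invented name, not the actual proc). Each connection thread has its own copy of the global array, so a freshly started thread misses on the first request for every URL:

    global tcl_url2file
    if { [info exists tcl_url2file($url)] } {
        set file $tcl_url2file($url)      ;# warm: served from this thread's cache
    } else {
        set file [rp_resolve_file $url]   ;# hypothetical resolver; the slow path
        set tcl_url2file($url) $file      ;# cached for the life of this thread
    }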

Incidentally, anyone know why this is done with a global variable rather than an nsv_array? Seems like it would bootstrap the caching a lot quicker that way. Perhaps, like tcl_site_nodes, it could fall back to an nsv_array.
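A rough sketch of that fallback idea (an assumption, not existing code; the nsv name rp_url2file and rp_resolve_file are invented): any thread could consult a server-wide nsv before doing a full resolve, so a URL mapped once would be visible to every other thread immediately.

    if { ![nsv_exists rp_url2file $url] } {
        nsv_set rp_url2file $url [rp_resolve_file $url]
    }
    set file [nsv_get rp_url2file $url]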

Incidentally, to get rid of this message, you can either go into the request processor where it does the performance mode check and comment out the log command, or shut off the RP debug logging.