Forum OpenACS Development: Re: OpenACS Performance Tests

Posted by Gustaf Neumann on
what i can see from your log is that
 - connection threads time out (most likely due to
   your threadtimeout settings), it seems they
   are not fed by new requests.
 - the "destroy called" messages are not serious.
   When a connection thread terminates (the thread is
   destroyed) all the objects it contains are destroyed
   as well. Since the deletion semantics on the C
   level are quite tricky, i left the notice calls
   in. You just see here the messages of the
   thread proxy objects, XOTcl itself is not destroyed.

 - a few db-queries seem to be quite slow:
   
    5 seconds nsdb0 dml dbqd.acs-tcl.tcl.security-procs.sec_update_user_session_info.update_last_visit
    9 seconds nsdb0 dml dbqd.acs-tcl.tcl.security-procs.sec_sweep_sessions.sessions_sweep
    sched: excessive time taken by proc 4 (11 seconds)

Without more information, it is hard to guess what is happening.
Does the "performance issue" happen during/after the benchmark
or during normal operations?

To see what's happening in your database, use e.g.
http://search.cpan.org/dist/pgtop/pgtop
http://www.rot13.org/~dpavlin/sysadm.html

For a deeper understanding of postgres semantics,
in particular with checkpoints, see
http://www.westnet.com/~gsmith/content/postgresql/

Posted by Eduardo Santos on
Hi Gustaf,

Thank you for your quick answer. I have already installed ptop and some other tools on my machine to monitor PostgreSQL. From the PostgreSQL analysis, it seems that every time these messages appear in my log, all the DB queries get very slow, and ptop shows them waiting to be parsed (they are in waiting state). From this analysis, my first thought is that this destroy objects call is bringing the performance down.

So I'm asking myself: why does it happen? When I run the benchmark, the system uses all available DB connections and threads. After that, it destroys some of them, which produces the log messages above. However, while the threads are being destroyed, the system gets so slow that we just can't navigate.

So, there are some questions I get from here:

1 - Is the thread destroy process a performance problem?
2 - Why do the DB connections stay in waiting while this destroy is happening, if I have memory, processor and connections available?

I'll try to take a closer look at the PostgreSQL performance info at this time and try to see something I haven't seen yet. Thank you for your help.

Posted by Gustaf Neumann on
The primary question is to figure out what resources are running out. Candidates:
  - cpu
  - amount of memory
  - memory bandwidth
  - i/o

and who is causing it
  - aolserver (e.g. lock-contention)
  - postgres (e.g. checkpoint handling, vacuuming, 
    some complex queries...)

is the whole machine in a bad state (e.g.
load etc.; how does it react on the console?)
or just the aolserver or the database?

how many cpus do you have on your system 
(fgrep processor /proc/cpuinfo)?
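
The CPU check above could be scripted roughly as follows. This is only a sketch in Python, assuming a Linux host with /proc/cpuinfo; the helper names count_processors, load_per_cpu and report are made up for illustration and are not part of aolserver or OpenACS.

```python
import os


def count_processors(cpuinfo_text):
    """Count CPU entries, similar to `fgrep processor /proc/cpuinfo | wc -l`.

    Assumption: only lines that *start* with "processor" are counted, which
    is slightly stricter than fgrep (fgrep matches the word anywhere).
    """
    return sum(
        1 for line in cpuinfo_text.splitlines() if line.startswith("processor")
    )


def load_per_cpu(load_average, cpus):
    """Divide a load average by the CPU count (hypothetical helper)."""
    return load_average / max(cpus, 1)


def report(cpuinfo_path="/proc/cpuinfo"):
    """Build a one-line summary of CPU count and load (Linux/Unix only)."""
    with open(cpuinfo_path) as f:
        # Fall back to os.cpu_count() if no "processor" lines were found.
        cpus = count_processors(f.read()) or (os.cpu_count() or 1)
    load1, load5, load15 = os.getloadavg()  # 1, 5 and 15 minute averages
    return (
        f"cpus={cpus} load1={load1:.2f} load5={load5:.2f} "
        f"load15={load15:.2f} load1_per_cpu={load_per_cpu(load1, cpus):.2f}"
    )
```

Calling print(report()) during and after the benchmark gives a rough picture. A per-CPU load that stays well above 1 hints that processes are queuing for CPU or, on Linux, are blocked on I/O; it does not by itself tell whether aolserver or postgres is responsible.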

In response to your questions:
 - i have never seen thread destroy as a 
   performance problem, but under "normal
   conditions", not all threads end 
   at more or less the same time.
   Normally, the problem with thread
   destroys is just that the threads might be
   re-created soon afterwards, and thread
   creation is slow when the blueprint
   is large (when you have many packages
   installed). However, if one has a large
   blueprint, the thread cleanup has to 
   free all of the contents, which 
   might be a couple of million individual
   free operations. This might entail 
   as well quite a large number of 
   memory locks.

 - there are as well many reasons
   for possible waiting operations.
   Do you see error messages concerning
   DEADLOCKS in your database? OpenACS
   5.4 uses fewer locks (which were introduced
   to overcome some problems in PostgreSQL
   8.0 and 8.1).

-gustaf neumann