Forum OpenACS Q&A: Tracking expensive DB queries

Request notifications

Collapse
Posted by Malte Sussdorff on
Is there a way to detect long running DB queries in PostgreSQL? I would like to get a logfile of all queries that take longer than 5 seconds to complete. My guess is that someone is playing with custom searches in contacts on one of our installations which is killing the server.

Furthermore, I would like to automatically restart the aolserver whenever the load on the machine is 5 or higher (to give it some breathing space and kill off expensive long running queries). Has anyone written something like this already?

Collapse
Posted by Dave Bauer on
Restarting AOLserver will not stop the long-running queries.

Start with the PostgreSQL monitoring documentation
http://www.postgresql.org/docs/8.0/static/monitoring.html

In postgresql.conf you want to set

stats_start_collector true

and

stats_command_string true

You need to restart the PG backend for this to take effect.

This should give you useful information.

While a long running query is running

go to psql and type

select * from pg_stat_activity;

This should give you the queries currently running, and how long they have been running including the query text (if stats_command_string is set to true).

You can also get extensive statistical data on every table and index in the system using the views defined. Refer to the documentation for a detailed list.

Collapse
Posted by Dave Bauer on
I wrote an additional mode for developer support that can log queries to a database table over a certain time.

I'll check it in in the next couple of days.

I learned you can stop a long running query by sending a SIGINT signal to the backend running the query.

You can learn the PID of the process by looking in pg_stat_activity. That will tell you how long the query has been running by checking the start_time of the query.

I also wrote a handy view that doesn't include the procid but calculates the running time by subtracting from now(). This can be misleading because the query string could be "IDLE" which will tell you how long the backend has been idle.

For this to work you need to turn on the stats collector and command string in postgresql.conf.

stats_start_collector = on
stats_command_string = on

Collapse
Posted by Malte Sussdorff on
What would come in handy if we could look at the current long running queries in OpenACS and then send the sigint over the db connection.

Alternatively it would be great if the xotcl request processor could track the DB execution times, so you could with one click see which queries have been long running *and* what are the DB queries that caused this behaviour

Collapse
Posted by Dave Bauer on
Malte,

Since, a db_multirow or db_foreach does an ns_db getrow on every row, it is not scalable to track individual database operations on a production environment.

I am not sure why you'd want to rewrite all the developer support database code into the xotcl request monitor. It does not really matter where you put the code but you definitely can't send a signal to a postgres process from the nsd user. That seems like a really bad idea.

I'd suggest that a scheduled proc that checks for long running queries by looking in pg_stat_activity, and if it finds one, a n email or other message to an admin who can take the appropriate action.

In general, don't write queries that take that long, although I ran into a similar problem with dynamic flters that a user can build a bad query if you filter code is not smart enough. I fixed that problem by not joining on the CR views, but instead using the type specific table and joining explicitly on cr_items and cr_revisions as necessary. This worked really well and a query went from minitues to under 1 second.

Collapse
Posted by Nima Mazloumi on
Dave, your scheduled proc is a brilliant idea.
Collapse
Posted by Mark Aufflick on
Malte I have found pqa ( http://pqa.projects.postgresql.org/ ) VERY useful for providing log analysis with postgres. I haven't used it with pg8 - you may need to check that the following patch has been applied to the version you use for compatibility with pg8:

http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pqa/pqa/lib/pqa.rb.diff?r1=1.80&r2=1.81