Forum OpenACS Q&A: Killing Postgresql connections

Collapse
Posted by Bob OConnor on

I was testing a multi-table join using two Selects seperated by EXCEPT and I must have done something wierd like a Cartesian product because the tcl page started but didn't end for a long time.

I put an ns_log Notice into the tcl file, so I knew it started with:

[06/Nov/2000:09:21:22][29735.7244810][-conn5097-] 
Notice:...

I did a top command and found that the postmaster was hogging more than 50% of cpu cycles. Ten minutes later I stopped and closed the browser window. I looked in server.log and a short while later found:

nsthread(23841) error: ns_realloc: could not allocate 1432299 
bytes

After this, things appeared to halt and no pages were served, so I killed AOLServer, it restarted and all is well.

I would like learn:

  • Can I stop a running postgress process, above it was -conn5097-, by issuing some command.. short of restarting AOL server? I didn't have to restart postgresql to get things working again, only AOLserver.
  • Was AOL server about to return 1432299 bytes but because I closed the browser it didn't know what to do? ie: nsthread(23841) error:...
  • Have people successfully used UNION/EXCEPT/INTERSECT with pg version 7.01 ? (I'll start a separate post to discuss the select issues)
TIA -Bob
Collapse
Posted by Don Baccus on
We use UNION all over the place to expand outer joins that are in ACS
Classic (PG 7.0 doesn't have outer joins).  It works fine except that
any ORDER BY clause on orders the first SELECT of the UNION (fixed in
PG 7.1, which will beta soon).  I've not used EXCEPT/INTERSECT myself
because ACS Classic doesn't use these constructs AFAIK - it is
possible there are some in there, though, that ported directly without
effort or that others dealt with.

It's not a good idea to kill a postgres backend.  It's far better to
stop the postmaster using the script in init.d - it does a kill -TERM
on the POSTMASTER (not individual backend) process.  The postmaster
traps the SIGTERM and tries to stop all the individual backends in a
clean fashion before stopping itself.  If the kill -TERM fails, then
the script will issue a kill -KILL.

The error in AOLserver simply says that it couldn't allocate roughly
1.5MB to itself (using malloc).  I think you most likely did trigger a
cartesian product, causing PG to hog all your RAM and swap space when
it attempted to build the output from the query.  In essence, you ran
out of memory.  In PG, the value returned from a query must fit in
memory+swap.  Not really a problem for realistic queries on modern
computers but it does lead to runaway memory consumption if you have a
cartesian product on a large dataset.

Collapse
Posted by Bob OConnor on

Thank you Don. So far, my experiments with UNION have resulted in slow queries... but I keep discovering new SQL stuff as I solve new problems....

For those that haven't seen it, the best reference for Postgresql and SQL (IMHO!) is

PostgreSQL -- Introduction and Concepts by Bruce Momjian

Soon to be published, it's available online at: http://www.postgresql.org/docs/awbook.html

-Bob