Forum OpenACS Q&A: PostgreSQL deadlock? Need some help here.

Hello,

We are trying to add around 50.000 users to an OpenACS installation (not latest 5.1.1, but 5.x somewhere) and it seems to crawl to a halt. When looking at the server running the database, I can see:

postgres 13001  1.2  3.3 139048 130720 ?     S    Jul14  31:06 postgres: MYDB MYDB 10.0.0.2 SELECT waiting               
postgres 13003  0.6  3.3 138964 130460 ?     S    Jul14  15:31 postgres: MYDB MYDB 10.0.0.2 SELECT waiting               
postgres 13644  0.0  1.0 136232 38912 ?      S    Jul15   0:02 postgres: MYDB MYDB 10.0.0.2 idle                         
postgres 13649  0.0  2.9 137596 114572 ?     S    Jul15   0:13 postgres: MYDB MYDB 10.0.0.2 SELECT waiting               
postgres 14786  0.0  0.0  2024  876 ?        S    04:48   0:00 /bin/bash /var/lib/pgsql/vacuum.sh
postgres 14787  0.0  0.0  2048  944 ?        S    04:48   0:00 /bin/sh /usr/bin/vacuumdb -z -a
postgres 14795  0.0  0.0  4108 1360 ?        S    04:48   0:00 /usr/bin/psql -c SET autocommit TO 'on';VACUUM   ANALYZE  -d MYDB
postgres 14796  0.0  3.2 135732 127260 ?     S    04:48   0:12 postgres: postgres MYDB [local] VACUUM waiting           
postgres 15043  0.0  1.3 137608 52396 ?      S    12:17   0:03 postgres: MYDB MYDB 10.0.0.2 SELECT waiting               
postgres 13009 91.5  3.3 138752 130216 ?     S    Jul14 2305:48 postgres: MYDB MYDB 10.0.0.2 SELECT    
What I can understand from this is that VACUUM is blocking some operations ... Is it possible to kill the VACUUM without creating a mess in the database?

Thanks

Collapse
Posted by Bruno Mattarollo on

Just replying to myself ... There is a post where Tom Lane says that it's safe to kill vacuum ... so that's what I did but didn't solve the problem, so vacuum was also just waiting (like the process says).

Anyway, it's my last day in the office, it's 18:00 hs, it's time to go to the bar and not try to solve the last minute problems, right?

Next post will certainly be from my new home in Sydney, Australia.

Collapse
Posted by Jonathan Ellis on
try this:

select c.relname, l.*
from pg_class c, pg_locks l
where c.relfilenode = l.relation;

this gives you something like

   relname   | relation | database | transaction |  pid  |      mode       | granted
-------------+----------+----------+-------------+-------+-----------------+---------
 pg_locks    |    16759 |  8992864 |             | 18220 | AccessShareLock | t
 armor       |  8994951 |  8992864 |             |  7309 | AccessShareLock | t
 pg_class    |     1259 |  8992864 |             | 18220 | AccessShareLock | t
 armor_v     |  8994977 |  8992864 |             |  7309 | AccessShareLock | t
 items       |  8994975 |  8992864 |             |  7309 | AccessShareLock | t
 armor_types |  8994628 |  8992864 |             |  7309 | AccessShareLock | t
look for locks that haven't been granted, and see what has a granted lock on that relation.
Collapse
Posted by Lachlan Myers on
Hey Bruno!

Welcome to the underside of OpenACS! Downside! I meant downside! No ... Downunder! Got it.

You probably know these people already, but here's a few of us in Australia using OpenACS. Rafael Calvo at Sydney University, and Mark Aufflick, also in Sydney.

I'm in Melbourne, and I think there are others in Adelaide.