Forum OpenACS Q&A: pg miscellanea
First, a workaround if you get "Cannot insert a duplicate key into unique index pg_statistic_relid_att_index" during vacuum analyze in 7.2: other people have had this problem, and Tom Lane said basically you have to shut down the database and use a standalone backend to REINDEX. Shutting down sucks on a live system, so in the spirit of it-can't-hurt-to-try I found which specific table was having issues, dropped it, and recreated it. Vacuum was happy again. Obviously if the table has too many constraints to recreate by hand, this is even worse than the REINDEX solution, but for me in this case it was far better.
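For the record, the drop-and-recreate dance looked roughly like this (a sketch; "battles" here is just a stand-in for whatever table the error points at, and note it silently loses indexes, constraints, and triggers):

```sql
-- Copy the data aside, drop the problem table (which also drops its
-- pg_statistic rows), then put the copy back under the old name.
BEGIN;
CREATE TABLE battles_copy AS SELECT * FROM battles;
DROP TABLE battles;
ALTER TABLE battles_copy RENAME TO battles;
COMMIT;
-- Recreate any indexes and constraints by hand here, then:
VACUUM ANALYZE battles;
```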
Second, a couple questions.
I'm experiencing a VERY slow delete on my largest table (250k rows). I trim the table every day to prevent it from getting unmanageably large:
delete from battles where date_fought < ago('48 hours'::interval)

but this is ENORMOUSLY slow. Even after running this once, then vacuum analyze (and reindex just to be sure), the planner thinks it's going to hit 180k rows when run a second time, when really only 500-1000 rows will be hit. explain says:
NOTICE: QUERY PLAN: Seq Scan on battles (cost=0.00..28305.75 rows=182361 width=6)

If I set enable_seqscan=off, it does use an index scan, but it is not noticeably faster. I have the shared memory segment set to 256M, and the largest SHARE segment reported by top is 120M. But the fact that the index scan is slow (time in minutes, not seconds) even when run twice in a row would seem to indicate that the index isn't staying in memory. I'm baffled. (Note that I've created an iscachable function "ago" to avoid the problems with directly using now() in such a query.)
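For reference, ago() is defined along these lines (a sketch from memory, using 7.2's WITH (iscachable) syntax):

```sql
-- This lies to the planner slightly: now() isn't truly cachable, but
-- marking the wrapper iscachable lets it be evaluated once per query
-- rather than once per row, and lets the planner consider an index
-- scan on date_fought.
CREATE FUNCTION ago(interval) RETURNS timestamp AS '
    SELECT now() - $1
' LANGUAGE 'sql' WITH (iscachable);

-- Usage:
-- delete from battles where date_fought < ago('48 hours'::interval);
```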
Finally, related to the above: date_fought is declared timestamp, but in 7.2 that includes fractional seconds. Does anyone know if it is faster to use timestamp(0) when you don't need the extra precision?
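If anyone wants to experiment, the declaration would just be (a sketch; the column list is hypothetical):

```sql
-- timestamp(0) rounds away fractional seconds on input.
CREATE TABLE battles (
    battles_id  integer PRIMARY KEY,
    date_fought timestamp(0) DEFAULT now()
);
```

As far as I know both flavors are stored in the same 8-byte representation in 7.2, with the precision only controlling rounding, so I wouldn't expect a measurable speed difference — but I haven't benchmarked it.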
select max(battles_id) from battles where date_fought < now()::date - '48 hours'::interval;
delete from battles where battles_id < $battles_id;

(That also makes the cutoff midnight instead of 48 hours prior to the current time, due to the ::date cast.)
delete from battles where date_fought < ( select now()::date - '48 hours'::interval )

Does that still cause a seq scan? Is the index scan still slow?
The result returned by ago() presumably depends on more than just the parameters passed in (i.e. the current time when the call was made). I don't see how that would cause the problem you're seeing, but it still looks wrong to me...
Russel: yes, this is how it's supposed to be. Without iscachable, PG's query planner recomputes the function call for each row (which is what David was working around with the subselect above). Even if you set enable_seqscan=off in such a case, EXPLAIN will still output a seq scan plan. I'm not sure how PG knows to make this function not "globally cachable", but if you do select ago('1 minute') twice, you will get two different values.