Forum OpenACS Q&A: pg miscellanea
First, a workaround if you get "Cannot insert a duplicate key into unique index pg_statistic_relid_att_index" during vacuum analyze in 7.2: other people have had this problem, and Tom Lane said basically you have to shut down the database and use a standalone backend to REINDEX. Shutting down sucks on a live system, so in the spirit of it-can't-hurt-to-try I found which specific table was having issues, dropped it, and recreated it. Vacuum was happy again. Obviously if the table has too many constraints to recreate by hand, this is even worse than the REINDEX solution, but for me in this case it was far better.
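For the record, the drop-and-recreate dance looked roughly like this (a sketch; "battles" here is just a stand-in for whatever table the error points at, and note it silently loses indexes, constraints, and triggers):

```sql
-- Copy the data aside, drop the problem table (which also drops its
-- pg_statistic rows), then put the copy back under the old name.
BEGIN;
CREATE TABLE battles_copy AS SELECT * FROM battles;
DROP TABLE battles;
ALTER TABLE battles_copy RENAME TO battles;
COMMIT;
-- Recreate any indexes and constraints by hand here, then:
VACUUM ANALYZE battles;
```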
Second, a couple questions.
I'm experiencing a VERY slow delete on my largest table (250k rows). I trim the table every day to prevent it from getting unmanageably large:
delete from battles where date_fought < ago('48 hours'::interval)

but this is ENORMOUSLY slow. Even after running this once, then vacuum analyze (and reindex just to be sure), the planner thinks it's going to hit 180k rows when run a second time, when really only 500-1000 rows will be hit. explain says:
NOTICE: QUERY PLAN: Seq Scan on battles (cost=0.00..28305.75 rows=182361 width=6)

If I set enable_seqscan=off, it does use an index scan, but it is not noticeably faster. I have the shared memory segment set to 256M, and the largest SHARE segment reported by top is 120M. But the fact that the index scan is slow (time in minutes, not seconds) even when run twice in a row would seem to indicate that the index isn't staying in memory. I'm baffled. (Note that I've created an iscachable function "ago" to avoid the problems with directly using now() in such a query.)
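For reference, ago() is defined along these lines (a sketch from memory, using 7.2's WITH (iscachable) syntax):

```sql
-- This lies to the planner slightly: now() isn't truly cachable, but
-- marking the wrapper iscachable lets it be evaluated once per query
-- rather than once per row, and lets the planner consider an index
-- scan on date_fought.
CREATE FUNCTION ago(interval) RETURNS timestamp AS '
    SELECT now() - $1
' LANGUAGE 'sql' WITH (iscachable);

-- Usage:
-- delete from battles where date_fought < ago('48 hours'::interval);
```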
Finally, related to the above: date_fought is declared timestamp, but in 7.2 that includes fractional seconds. Does anyone know if it is faster to use timestamp(0) when you don't need the extra precision?
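If anyone wants to experiment, the declaration would just be (a sketch; the column list is hypothetical):

```sql
-- timestamp(0) rounds away fractional seconds on input.
CREATE TABLE battles (
    battles_id  integer PRIMARY KEY,
    date_fought timestamp(0) DEFAULT now()
);
```

As far as I know both flavors are stored in the same 8-byte representation in 7.2, with the precision only controlling rounding, so I wouldn't expect a measurable speed difference — but I haven't benchmarked it.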
select max(battles_id) from battles where date_fought < now()::date - '48 hours'::interval;
delete from battles where battles_id < $battles_id;

(That also makes the cutoff midnight instead of 48 hours prior to the current time, due to the ::date cast.)
delete from battles where date_fought < ( select now()::date - '48 hours'::interval )

Does that still cause a seq scan? Is the index scan still slow?
The result returned by ago() presumably depends on more than just the parameters passed in (i.e. the current time when the call was made). I don't see how that would cause the problem you're seeing, but it still looks wrong to me...
Russel: yes, this is how it's supposed to be. Without iscachable, PG's query planner recomputes the function call for each row (which is what David was working around with the subselect above). Even if you set enable_seqscan=off in such a case, EXPLAIN will still output a seq scan plan. I'm not sure how PG knows to make this function not "globally cachable", but if you do select ago('1 minute') twice, you will get two different values.