Forum OpenACS Q&A: best Postgres vacuum to do automatically

As per this bug, should the default vacuum command include a --full or --analyze command? If so, which?
Collapse
Posted by Roberto Mello on
We should be recommending at least ANALYZE (by itself). If we are going to recommend VACUUM, we should recommend VACUUM ANALYZE once an hour (or so) and a VACUUM FULL ANALYZE once a night.

The Debian PostgreSQL packages include scripts to do this periodic maintenance. We could provide them in the newly created /etc directory or in the docs themselves.

-Roberto

Collapse
Posted by Joel Aufrecht on
Currently the docs describe using crontab to run a command to vacuum. It's quick and dirty and works reasonably well. As Lars pointed out on a related cron-type issue, it's not cross-platform and doesn't take advantage of OpenACS for logging/email/failsafe. Is the OpenACS cron-type package stable and usable? If so, we should start moving automated tasks to that in the default install. In the meantime, should I just add a second cron line (one for hourly vacuum analyze, one for nightly vacuum full analyze)? Is there an advantage to using the scripts?

Automated tasks include vacuuming in its various flavors, nightly database backup, nightly everything backup ... anything else? Notifications has regular jobs. search re-indexing is automatic. Do those use the cron package?

Collapse
Posted by Tom Jackson on

The current cornjob package should be working fine. I have started thinking about updating the package. This would require a relatively big datamodel change. I'll start a new thread today with a link to the proposed changes.

One feature that I'll mention here is that each cron should be able to specify intervals like the familiar OS crons:

0,30 8-17 * * 1-5 

This time around I want to write the documentation first, since this package never had documentation.

Another feature that may be of interest is a lookback an lookforward feature. Admins should be able to see what has run, and what will run in a given interval. If an important report is missed because the server wasn't up at the moment it was scheduled to run, some kind of action should be specified, such as to run it asap, and/or notify the owner.

Collapse
Posted by Jonathan Ellis on
carnageblender.com probably does more updates than most oacs sites, and vacuum full nightly would still be overkill.  Particularly since full still takes out table locks, you're nuts to do that nightly to a busy service with a large amount of data.  Allowing a "lazy" vacuum that doesn't exclusively lock was a HUGE win for 7.2.  The contention doing VACUUM FULL on my entire database caused was bad enough that back in the 7.1 days I added a filter to just serve up a "routine database maintenance" page to the users for the half hour vacuum took; it was just too frustrating for them otherwise.

With 7.2 what I do instead is besides vacuum analyzing, I nightly vacuum full (and reindex) a handful of tables at a time.  I get all the tables in the database over a 10 day period -- so maybe 3 minutes a day, I can live with that -- and things don't get too crufty in the meantime.  (For comparison, this is roughly 250k updates and deletes a day.) I can post the code if people are interested.