Forum OpenACS Q&A: Table Design vs Perfomance Issues

I have modified the bboard system by adding a new table so that users can mark certain threads as their own personal favorites. We also have system favorites that everyone sees that I store as user_id=0 in the same table. It seems that performance is getting a bit worse, especially for display of system wide favorites that are hit frequently. I am already using an index for user_id and msg_id in the favorites table.

My question: Would it make any difference in performance if I added separate table with only the user_id=0 system favorites rather then leave them in the table with many rows?

A related issue is our bboard table that now has over 12,000 messages but most (guess 95%+) of the access is to the most recent posts. (OpenACS appears to have about the same number of messages)

It seem that over time, large parts of the database almost never get accessed.... Do recently accessed rows stay in memory? and for how long? As a web application grows in tables and fields and rows, is it just a matter of getting more memory to keep performance acceptable?

Other perfomance and design considerations appreciated.

-Bob

Collapse
Posted by Jonathan Ellis on
Re data caching, the answer is yes, and <a href=http://www.ca.postgresql.org/docs/momjian/hw_performance/>it depends on how big your cache is</a>.

Re performance degrading, I wouldn't think that 12K rows would be enough to see a performance drop with a simple query.  You could check your query in psql with EXPLAIN and make sure it's using the index(es) you think it should.  Also, you will definitely see degredation if you are not VACUUM ANALYZEing on a regular basis.

If DB activity is indeed your bottleneck, you could cache the top level threads at the AOLserver level with util_memoize...

Collapse
Posted by Don Baccus on
The growth-over-time issue is a complicated one, which PG at the moment won't help you with.

The general problem leads to the notion of "partioning" data.  Oracle supports this (but dumbly, at least when Philip Greenspun wrote about it, not sure about 9i etc but Oracle has been paying more attention to the webspace and not the long-term accumulation of data space per se so I doubt much has changed).

PG doesn't support partioning at all.

For the near term, adding RAM and making sure that queries grow non-linearly (in the postitive logN(n) sense, not the negative n^2 sense!) is probably the best defence.

photo.net has never need to resort to partioning (either in-db or by other hacks) and that's a busier site than most of us will ever spawn, so take heart ...