Forum OpenACS Q&A: Re: Random Book off bookshelf

Collapse
Posted by Mark Aufflick on
What I do for random photo and book on my site (http://mark.aufflick.com/) to minimise db hits on the front page is to util_memoize a list of photo (or book) id's, and then pick one at random in tcl.

Sure the list is pretty darn long, but not really a lot of data (they're just integers) and every db hit you save on your front page is worth it.

I stole this idea (and some of the code) from Lars' old photo and book packages he hacked for his old website. You can download my diffs from his code on my website (http://mark.aufflick.com/about) but I don't think you can download Lars' old modules from CVS any more. The new ones are much better anyway.

If you want, I can post the tcl/sql snippets that do the work.

Collapse
Posted by Jarkko Laine on
Mark,

Do you only memoize the id's? Doesn't that mean that you still need the db hit to get the other attributes (and the photo if it's in the db, too)? If you need to hit the db anyway, I don't think the random functionality adds too much overhead. But maybe I understood something wrong.

Collapse
Posted by Jeff Davis on
Jarkko, the "select ... orderby random() limit 1" can be pretty slow for a large table. On a users table with 8300 users I get:
head=# explain analyze select * from users order by random() limit 1 offset 0;
                                                     QUERY PLAN                                                      
---------------------------------------------------------------------------------------------------------------------
 Limit  (cost=803.12..803.12 rows=1 width=1225) (actual time=113.42..113.42 rows=1 loops=1)
   ->  Sort  (cost=803.12..823.86 rows=8298 width=1225) (actual time=113.41..113.41 rows=2 loops=1)
         Sort Key: random()
         ->  Seq Scan on users  (cost=0.00..262.98 rows=8298 width=1225) (actual time=0.02..39.11 rows=8298 loops=1)
 Total runtime: 114.64 msec
(5 rows)
whereas if you have the id picked randomly from a cached list the query to get the single row is small numbers of milliseconds.

I think caching the id's is generally a win...