Forum OpenACS Q&A: Simple caching of html content

Posted by Peter Alberer on

In my application there are a lot of dynamic html parts that could be cached, for example the table of contents of a textbook. Generating it takes a few db queries plus some computation to work out chapter numbers, hierarchy and so on.
I thought about putting some code into my master template that saves the content of the slave into a file or a db table. The name of the cached item and the location of the cached html could be saved in an nsv. A cached html item is removed from the cache when the item itself (or any of its children) changes and tells the cache about it. (Maybe this is the main problem and I have not given enough thought to it, but in my current situation it is quite easy.) This could be enhanced by saving some html parts separately for different users, view types or whatever.

But my current question is another one: Is the master template a good place to get hold of the html to cache? I thought of 2 other locations: the slave template itself and the request processor (I don't know much about that). But in the slave page, how would I get the generated html into some variable within the tcl part of the page, so that the cached html can be written to a file or the DB?
The request processor seems to be a bad place because it probably gets the whole page content with parts like menus that need not be cached. Any ideas?
TIA
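The invalidation idea described above can be sketched in plain Tcl. This is only an illustration under assumptions: a global array stands in for an nsv so the code runs in tclsh, and the proc names (cache_put, cache_get, cache_invalidate) and the item_parent map are hypothetical, not part of OpenACS.

```tcl
# Sketch of an item-keyed HTML cache with invalidation (hypothetical names).
# Global arrays stand in for nsv's so this runs in a plain tclsh.

array set html_cache {}   ;# item_id -> cached HTML
array set item_parent {}  ;# item_id -> parent item_id (missing for root items)

proc cache_put {item_id html} {
    global html_cache
    set html_cache($item_id) $html
}

# Return 1 and store the HTML in the caller's variable on a hit, else 0.
proc cache_get {item_id html_var} {
    global html_cache
    upvar 1 $html_var html
    if {[info exists html_cache($item_id)]} {
        set html $html_cache($item_id)
        return 1
    }
    return 0
}

# Called when an item changes: drop its entry and the entry of every
# ancestor, because a parent's cached HTML includes its children.
# Assumes the parent map has no cycles.
proc cache_invalidate {item_id} {
    global html_cache item_parent
    while {$item_id ne ""} {
        unset -nocomplain html_cache($item_id)
        if {[info exists item_parent($item_id)]} {
            set item_id $item_parent($item_id)
        } else {
            set item_id ""
        }
    }
}
```

With item_parent(chapter1) set to book, calling cache_invalidate chapter1 removes both the chapter and the book entry, while unrelated items stay cached.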

Posted by Jun Yamog on
Hi Peter,

If you are just worried about DB hits then you might want to just use the -cache switch on the db APIs.  I think the current db API has been modified by DonB to accept this switch.  This way your code stays the same and you just add -cache to db_multirow, db_string, etc.
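To illustrate the general idea of query-level caching (not the actual switch, whose syntax is not shown in this thread), here is a minimal sketch in plain Tcl. Both fake_db_string and cached_db_string are hypothetical stand-ins, and the "database" is simulated so the code runs in tclsh.

```tcl
# Sketch of query-result caching; hypothetical procs, not the OpenACS db API.

array set query_cache {}  ;# SQL text -> cached result
set query_runs 0          ;# how many times the "database" was really hit

# Stand-in for a real database call; it only counts invocations.
proc fake_db_string {sql} {
    global query_runs
    incr query_runs
    return "result of: $sql"
}

# Hit the database only on a cache miss; the cache is keyed by the SQL text.
proc cached_db_string {sql} {
    global query_cache
    if {![info exists query_cache($sql)]} {
        set query_cache($sql) [fake_db_string $sql]
    }
    return $query_cache($sql)
}
```

Calling cached_db_string twice with the same SQL runs the underlying query once. A real implementation would also need a way to flush or expire entries when the data changes.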

Posted by Peter Alberer on
I just had a look at my latest cvs checkout and there are no -cache switches on the db-procs. Are you sure this has already been added to cvs?
Posted by Jun Yamog on
Hi Peter,

Sorry, I am not sure if it's committed into CVS or not.  Maybe not.  You can read this thread about it.

https://openacs.org/bboard/q-and-a-fetch-msg.tcl?msg_id=00058m&topic_id=12&topic=OpenACS%204%2e0%20Design

Posted by Peter Alberer on

I have finished a very first prototype of a simple caching solution for whole pages.

Pages (without the master but with their properties) can be cached for a specific user or a certain query string. Cached pages are removed from the cache after a certain time or on request. These params are adjustable via adp properties.

To achieve the caching I have modified template::adp_parse and added a few procs in the new namespace template::cache.

If anyone else is interested in something like that let me know. I think the caching of whole pages is a good thing (for certain kinds of pages), as you need not change the logic in the page.
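The behaviour described above (entries keyed by page plus user or query string, removed after a certain time or on request) might look roughly like the following. This is a sketch under assumptions, not the template::cache code: the proc names are hypothetical, a global array replaces whatever storage the prototype uses, and lassign requires Tcl 8.5 or later.

```tcl
# Sketch of a page cache with per-user/per-query keys and time-based expiry.
# Hypothetical names; runs in a plain tclsh.

array set page_cache {}  ;# key -> list {expires_at html}

# Build a cache key; pass an empty user_id or query to share the entry.
proc page_cache_key {url {user_id ""} {query ""}} {
    return [list $url $user_id $query]
}

# Store html for ttl seconds. "now" can be passed explicitly for testing.
proc page_cache_put {key html ttl {now ""}} {
    global page_cache
    if {$now eq ""} { set now [clock seconds] }
    set page_cache($key) [list [expr {$now + $ttl}] $html]
}

# Return 1 and set the caller's variable on a fresh hit, else 0.
# Expired entries are removed when they are looked up.
proc page_cache_get {key html_var {now ""}} {
    global page_cache
    upvar 1 $html_var html
    if {$now eq ""} { set now [clock seconds] }
    if {![info exists page_cache($key)]} { return 0 }
    lassign $page_cache($key) expires_at cached
    if {$now >= $expires_at} {
        unset page_cache($key)
        return 0
    }
    set html $cached
    return 1
}

# Remove an entry on request.
proc page_cache_flush {key} {
    global page_cache
    unset -nocomplain page_cache($key)
}
```

Keying on a list of url, user and query string keeps entries for different users apart without any change to the logic of the page itself.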

Posted by Don Baccus on
I intend to add my caching db API changes for 4.7.  I think one can make a good argument for supporting caching at various levels, i.e. util_memoize, query caching, and as Peter mentions page caching.  Peter, yes, I'm sure your changes will be of interest for 4.7.
Posted by Lars Pind on
We talked about it a month or two ago: The memory footprint of an OpenACS server is pretty large, and my guess is that most of it is due to all the extensive caching. That's perfect for a production environment, but it can be annoying and wasteful on a development box where you may be running ten different servers at once, most of which have very low traffic.

I noticed that ns_cache create has an option to specify the max size in kilobytes. Maybe we could use that always, and have a config file option to set the size, so you can lower it on dev servers and up it on production servers.

This, of course, doesn't help with all the nsv-caching that's also going on, for example for the query dispatcher and the message catalog. This could be helped if those had an option to load from file every time which, btw, would also make developing easier.

I'm just tossing this out as an idea for someone with plenty of time on their hands to look into cleaning up at some point. I know I won't have the time to do this any time soon.

/Lars

Posted by Jeff Davis on
There is already a parameter for memoize:
ns_cache create util_memoize -size \
    [ad_parameter -package_id [ad_acs_kernel_id] MaxSize memoize 200000]
I think 200k bytes is absurdly small as a default for a production machine.

More memory is taken by the nsv's and of course they cannot really be sized. I checked and here are the top ten nsv's by size:

entries    bytes  nsv
    170     5236  locale
     49     8811  site_nodes
     41    10522  ad_page_contract_filters
    192    54013  api_library_doc
   1580    97454  apm_library_mtime
   2190   170279  proc_source_file
   2189   336690  proc_doc
     94   574815  apm_version_properties
   2189  1115145  api_proc_doc
   3134  1332667  OACS_FULLQUERIES
The memory footprint of my server is 37mb and as far as I can tell, there is only ~4mb cached (although there are some nsv's I skipped, like the parameter cache and some others, but I think they are smaller ones). This is on my testing site with all packages installed and mounted (some several times).
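As a quick check of the ~4mb figure, the byte column of the table above can be added up. The snippet below is just that arithmetic in plain Tcl, with the numbers copied from the table.

```tcl
# Sum the "bytes" column of the top-ten nsv table (values copied from the post).
set nsv_sizes {
    locale                      5236
    site_nodes                  8811
    ad_page_contract_filters   10522
    api_library_doc            54013
    apm_library_mtime          97454
    proc_source_file          170279
    proc_doc                  336690
    apm_version_properties    574815
    api_proc_doc             1115145
    OACS_FULLQUERIES         1332667
}

set total 0
foreach {name bytes} $nsv_sizes {
    incr total $bytes
}
puts "total bytes in the ten largest nsv's: $total"
```

The ten listed nsv's come to about 3.7 million bytes, which fits the estimate of roughly 4mb once the smaller, skipped nsv's are included.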

Anyway, the short answer is that things are bloated and it does not seem to really be overly aggressive caching. I think the real issue is that there are just a huge number of procedures defined.

The source for all the tcl libraries is ~3mb and I think when they get bytecode compiled you end up keeping src and bytecode around so that is a big piece of the pie. Also, I am not sure how much compiled templates end up taking, although they are done on demand so an empty idle server should not have very many loaded.

I suppose we could look at making queries cached on demand, put in a size-limited ns_cache, or not cached at all. Also we could add a parameter to disable the proc doc stuff by default and that might help, but again, it does not seem like either of these would make a huge difference.

Posted by Talli Somekh on
Lars, it sounds like what you may be asking for is John Sequeira's portable.nsd.

The advantage of portable.nsd is not just that it will allow for the use of other webservers but that it will allow for very small installations of oacs and development from tclsh.

Perhaps even better for many, portable.nsd will allow for clean and complete use of TclPro (http://activestate.com/Products/Tcl_Dev_Kit/?_x=1). In fact, John's been developing portable.nsd using TclPro and has been raving about it.

So I think this might answer your problem, plus solve a lot of other things.

talli

Posted by Don Baccus on
John's standalone thing will need to implement nsv vars so nsv caching will be there in his version, too.
Posted by John Sequeira on
Talli,

Thanks for the plug, but I expect the initial implementations of portable.nsd will be more memory-hungry than a straight AOLserver implementation. With OpenACS/FastCGI, we'll be running multiple single-threaded interpreter processes instead of one multithreaded process, each initially chewing up memory for caching. OpenACS/tclHTTPD has the potential to be multithreaded, but I'm not sure what it'll take to get all the dependency libraries to cooperate. Michael C is right now working on how to do multi-process nsv's with sockets, but based on recent progress I think I'll beat him to a deliverable 😊.

Right now I've reimplemented nsv's as globals, which probably took 15 lines of code. I was thinking that for development it might be good to have an implementation that writes to the database (like ASP.NET's caching) so you can keep an eye on the values. For production sites, this approach might make sense for low memory sites or web farms. For now, TclPro's variable inspection makes the debugging usefulness of db logging a bit moot. It's really easy to look at any variable at any stack level, so no rush there.
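For readers wondering what "nsv's as globals" could look like, here is a rough sketch. It is not the portable.nsd code: it covers only four of the nsv commands, is meant for a single-threaded tclsh, and the nsv_shim_ prefix on the backing global arrays is an assumption made here to avoid name clashes.

```tcl
# Sketch: single-threaded stand-ins for a few nsv commands, backed by
# global arrays. For plain tclsh only; inside AOLserver the real nsv
# commands already exist and should not be redefined.

proc nsv_set {arr key value} {
    upvar #0 nsv_shim_$arr a
    set a($key) $value
}

proc nsv_get {arr key} {
    upvar #0 nsv_shim_$arr a
    return $a($key)
}

proc nsv_exists {arr key} {
    upvar #0 nsv_shim_$arr a
    info exists a($key)
}

# nsv_unset arr ?key? : remove one key, or the whole array if no key is given.
proc nsv_unset {arr args} {
    upvar #0 nsv_shim_$arr a
    if {[llength $args] == 0} {
        unset a
    } else {
        unset a([lindex $args 0])
    }
}
```

Because every interpreter process gets its own globals, a shim like this gives each process a private copy of the data, which is one reason the multi-process setup is expected to use more memory.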

There is one slight memory optimization I'm working on right now: replacing the query dispatcher with a source code preprocessor. Instead of parsing/caching xql files, I'm parsing the source code and writing the correct query into the tcl files. I needed to do this to make nstcl happy (which has no qd), but it has some other benefits wrt interpreter launch time, a good thing when using FastCGI.