Forum OpenACS Q&A: How do I serve up a directory without any perms, security, or cookies?
According to the tools at web-caching.com, which crawled one URL at my request, each of these GIFs is served with a session-id cookie:
http://126.96.36.1991/templates/shared/background.gif
Date: Fri, 26 Sep 2003 08:14:01 GMT
Expires: -
Cache-Control: -
Last-Modified: 24 weeks 6 days ago (Fri, 04 Apr 2003 19:42:40 GMT), validated
ETag: -
Set-Cookie: ad_session_id=4040301%2c0%20%7b34%201065168841%2009237CF575A7A017302D2B066F11A9F1D60C2F63%7d; path=/; max-age=604800
Content-Length: 0.5K (514)
Server: AOLserver/3.3.1+ad13
"This object doesn't have any explicit freshness information set, so a cache may use Last-Modified to determine how fresh it is with an adaptive TTL (at this time, it could be, depending on the adaptive percent used, considered fresh for: 4 weeks 6 days (20%), 12 weeks 3 days (50%), 24 weeks 6 days (100%)). It can be validated with Last-Modified. This object requests that a Cookie be set; this makes it and other pages affected automatically stale; clients must check them upon every request."
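The adaptive-TTL figures in that report can be reproduced from the Date and Last-Modified headers alone. Here's a quick sketch of the heuristic (Python purely for illustration, nothing OpenACS-specific):

```python
from datetime import datetime, timedelta

fmt = "%a, %d %b %Y %H:%M:%S GMT"
date = datetime.strptime("Fri, 26 Sep 2003 08:14:01 GMT", fmt)
last_modified = datetime.strptime("Fri, 04 Apr 2003 19:42:40 GMT", fmt)

# The object's age: how long since it last changed.
age = date - last_modified

def adaptive_ttl(age, percent):
    """Adaptive freshness heuristic: treat the object as fresh for
    some fraction of its age, returned as (weeks, days)."""
    fresh = timedelta(seconds=age.total_seconds() * percent)
    weeks, days = divmod(fresh.days, 7)
    return weeks, days

for pct in (0.20, 0.50, 1.00):
    w, d = adaptive_ttl(age, pct)
    print(f"{int(pct * 100)}%: {w} weeks {d} days")
```

This reproduces the report's numbers (4 weeks 6 days at 20%, 12 weeks 3 days at 50%, 24 weeks 6 days at 100%): the heuristic just assumes that an object unchanged for N days probably won't change for some fraction of N — which is exactly the guess the Set-Cookie header forces caches to throw away.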
So for each "real page" the server has to go through 24 rounds of db permission checking and session-handling chokepoints, one per image. And the client's browser can't use the curv-tl.gif in its cache; it has to request it anew so the server can compare the Last-Modified date.
I'm just guessing, but I suspect the site would be a lot snappier without some of this rigamarole.
One solution is just to set up another web server (Tux?) to serve these images from a different port.
But is there a way within OpenACS to designate a particular directory or set of directories whose files get served up without all the standard OpenACS goodness?
Put them in package/www/resources/...
Refer to them as /resources/package/...
Not sure if it avoids a session_id, though ... would you care to test?
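The convention above amounts to a simple URL-to-filesystem mapping. A sketch of that mapping (Python for illustration only — the real request processor is Tcl, and the packages root path here is made up):

```python
def resource_path(url, packages_root="/web/service/packages"):
    """Map /resources/<package>/<path> to
    <packages_root>/<package>/www/resources/<path>.
    Returns None for URLs outside /resources/."""
    prefix = "/resources/"
    if not url.startswith(prefix):
        return None
    package, _, rest = url[len(prefix):].partition("/")
    return f"{packages_root}/{package}/www/resources/{rest}"
```

Under this scheme, /resources/forums/style.css would resolve to forums/www/resources/style.css inside the packages tree, so each package keeps its own static files without colliding with any other package's.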
Is there a global /resources directory?
Ack! Why does mathoppd remind me of Bill the Cat?
I'd like to see images be package-local since the whole idea of an APM package is that one can tarball it up, ship it to someone, and they can dump it in their packages directory without fear of overwriting any other package's resources.
You wouldn't have to invoke any Tcl code for resources like CSS, GIFs, etc., so it should be faster. You also wouldn't have to migrate content into separate /resources folders for the custom code you've already written.
You might have to make sure cache headers work correctly, but you could probably get around that by enabling caching on the basis of the URL extension (gif|css|jpg|etc).
Has anyone tried this?
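The extension-based idea could look something like this (sketched in Python just to show the logic; in AOLserver it would be a registered filter written in Tcl, and the 604800-second max-age simply mirrors the cookie lifetime seen in the headers above):

```python
import re
import time
from email.utils import formatdate

# Hypothetical policy: these extensions are treated as static,
# publicly cacheable resources.
STATIC_RE = re.compile(r"\.(gif|jpe?g|png|css|js)$", re.IGNORECASE)

def cache_headers(url, max_age=604800):
    """Return extra response headers for static resources,
    or an empty dict for dynamic pages (which keep their
    session cookie and no-cache behavior)."""
    if not STATIC_RE.search(url):
        return {}
    return {
        "Cache-Control": f"max-age={max_age}",
        "Expires": formatdate(time.time() + max_age, usegmt=True),
    }
```

With explicit freshness information like this on every gif and css hit, downstream caches and browsers stop revalidating on each page view, which is exactly the "24 rounds per page" cost described earlier.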
That solution would probably be easiest on the developer and on the production server itself. A lot of care needs to be taken with the headers to ensure things are working the way you want. Probably most images (though not most image requests) go through some kind of security check, and permissions can change at any time.
But the first step in tuning a production server is to remove public images from, at least, the AOLserver process. You could still run a pair of production servers, one handling secured images and the other page requests; you might get some benefit from that, but it would need to be tested.
It would be nice if the OpenACS toolkit could Do The Right Thing, such that the point where a user of OpenACS needs to dedicate engineering man-hours to special-purpose scalability solutions is delayed as late in the site's growth as is reasonably feasible. Ideally, a medium-sized (but what does that mean, exactly?) public site should be able to just use vanilla OpenACS (and thus AOLserver) without any special image-only web server, any special hacking of the OpenACS request processor, etc.
Don, it sounds like OpenACS 5.0 is already doing this Right Thing with the new resources directory filter? How far does this seem to take stock OpenACS up the "scalability without extra man-hours" curve? Is there anything else along these lines that OpenACS can or should consider doing in the future?
On my own personal site (http://rubick.com:8002), I wanted to upload my thesis, which contains about 80-85 images linked from one HTML page that was converted from Word, of all things. Ugh.
How am I supposed to upload this file?
#1 Make a new package, and put the images under a directory there?
#2 Put it in ETP? Well, that doesn't solve the image-permissions problem, and it actually didn't work because the file was too big. It would time out in Safari (more than 60 seconds), and it wouldn't copy and paste into IE on the Mac (probably for buffer-overflow-protection reasons; they didn't want to allow you to paste that much text).
#3 Put it under www, in an .adp and .tcl file, and put the images under /www/images or something like that.
I can't imagine a better solution than #3 right now. #1 is just too much work. While not up to the ideal of engineering purity, it still seems like the best option. No?
I would argue this type of thing is common enough that we should provide some default support for it. Sure, I can hack the request processor -- it looks pretty easy, I think. But I don't think I'm the only one who will run across this situation.
My hobby-related website, just a bunch of HTML pages, has numerous pages with lots of large photos. It doesn't see much traffic, yet it still consumes an average of 31MB of bandwidth per day. I host the HTML on my DSL line at home in preparation for moving to AOLserver in the near future, but to conserve bandwidth I would still like to host the photos off site (which I do).
Having a global @http@ parameter would make it very simple for even a small outfit to host its own dynamic pages while conserving precious bandwidth by letting an off-site ISP serve the bulk bandwidth for the in-document graphics.