Forum OpenACS Q&A: Re: How to return 304s for ADP pages?

Posted by carl garland on
You could put something like this in a filter:
# $filename is the path of the file being served
set mtime [ns_httptime [file mtime $filename]]
set none_header [ns_set iget [ns_conn headers] If-None-Match]
set since_header [ns_set iget [ns_conn headers] If-Modified-Since]

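if {$none_header == $mtime || $since_header == $mtime} {
    # client copy is current: send 304 with an empty body and stop processing
    ns_return 304 text/html ""
    return filter_return
} else {
    ns_set put [ns_conn outputheaders] ETag $mtime
    return filter_ok
}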

Posted by James Thornton on
Since the nsd.tcl setting
    ns_param   checkmodifiedsince   true
does not appear to work for ADP pages, and I haven't found a way to enable nsd's checking of the If-Modified-Since header for ADP pages, I took Carl's suggestion and wrote a proc to do it.

The site I wrote it for runs OpenACS 3.x, so I am invoking it at the top of ad_serve_adp_page:

# code from ad_serve_adp_page
if {[jt_return_304_p]} {
	ns_return 304 text/plain ""
	return
}
Here's the code for jt_return_304_p...
proc jt_return_304_p {} {

    set return_304_p 0

    # ad_conn only works for abstract urls
    set path [ad_conn file]

    if {[string equal "" $path]} {
	# ns_conn doesn't work for abstract urls
	set path "[ns_normalizepath [ns_info pageroot][ns_conn url]]"
    }

    if { [file exists $path]} {
	set modified_time [file mtime $path]

	# googlebot doesn't use ETags;
	#set none_header [ns_set iget [ns_conn headers] If-None-Match]

	# NS 4.x SGI sends If-Modified-Since with "; length=xx"
	# If-Modified-Since = Thu, 21 Aug 2003 04:37:53 GMT
	# If-Modified-Since = Tue, 30 Sep 2003 22:28:53 GMT; length=885
	set since_header_maybe_length [ns_set iget [ns_conn headers] If-Modified-Since]
	
	if {![string equal "" $since_header_maybe_length]} {
	    # If-Modified-Since header sent

	    # remove the "; length=xx" part if present
	    set since_header [string trim [lindex [split $since_header_maybe_length ";"] 0]]

	    # convert date string to epoch time; skip the check if it can't be parsed
	    if {![catch {clock scan $since_header} since_time]
		&& $modified_time <= $since_time} {
		# file hasn't changed since the client's copy
		set return_304_p 1
	    }
	}
    }
        
    return $return_304_p
}
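The date handling in the proc above can be tried outside the server in plain tclsh. This is only a sketch: ims_not_modified_p is a hypothetical name, it takes the mtime and header value as arguments instead of reading them from ns_conn, and it assumes clock scan accepts the HTTP date format in free-form mode.

```tcl
# Hypothetical helper, not part of the original proc: pure Tcl, no ns_* calls.
# Returns 1 when a file with the given mtime (epoch seconds) is unchanged
# relative to the If-Modified-Since header value, else 0.
proc ims_not_modified_p {mtime header} {
    # drop any "; length=xx" suffix that some older browsers append
    set date [string trim [lindex [split $header ";"] 0]]
    if {[string equal "" $date]} {
        return 0
    }
    # an unparseable date is treated as if no header had been sent
    if {[catch {clock scan $date} since]} {
        return 0
    }
    return [expr {$mtime <= $since}]
}

set header "Tue, 30 Sep 2003 22:28:53 GMT; length=885"
set since [clock scan "Tue, 30 Sep 2003 22:28:53 GMT"]
puts [ims_not_modified_p [expr {$since - 60}] $header]  ;# file older than header date
puts [ims_not_modified_p [expr {$since + 60}] $header]  ;# file newer than header date
```

The first call should report the file as unchanged and the second as modified; an empty or malformed header falls through to a normal 200 response.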
To test it, telnet to your Web server's port 80:
$ telnet jamesthornton.com 80
Trying 209.164.72.61...
Connected to jamesthornton.com.
Escape character is '^]'.
... and send something similar to the following HTTP commands (make sure not to begin with a blank line, but terminate with a blank line)...
GET /index.html HTTP/1.0
User-Agent: Googlebot/2.1 (+http://www.googlebot.com/bot.html)
From:  googlebot(at)googlebot.com
Accept: text/html,text/plain,application/*
Host: jamesthornton.com
If-Modified-Since: Thu, 21 Aug 2010 23:44:54 GMT
You should get something back like this:
HTTP/1.0 304 Not Modified
Set-Cookie: ad_browser_id=6789763; Path=/; Expires=Fri, 01-Jan-2010 01:00:00 GMT
Set-Cookie: ad_session_id=6789764,0,Y3ajJELPiYQI8bsl.KCRGRRjLVjV1izs,1064970966; Path=/; Max-Age=86400
Set-Cookie: last_visit=1064970966; path=/; expires=Fri, 01-Jan-2010 01:00:00 GMT
Set-Cookie: CurriculumProgress=start; path=/; expires=Fri, 01-Jan-2010 01:00:00 GMT
Content-Type: text/plain; charset=iso-8859-1
MIME-Version: 1.0
Date: Wed, 01 Oct 2003 01:16:07 GMT
Server: AOLserver/3.3.1+ad13
Connection: close

Connection closed by foreign host.
Posted by Jade Rubick on
It seems to me that the problem with Carl's suggestion is that unless the actual file has been modified, Google will never index your site again. This is fine for static ADPs, but a bummer for non-static ones...
Posted by James Thornton on
A simple hack would be to touch the file when a user posts a comment or adds other dynamic content.
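A minimal sketch of that touch, assuming a Tcl version whose file mtime command accepts a time argument. The proc name touch_page_file is hypothetical, and where you call it from (for example, the page that stores a new comment) depends on your site.

```tcl
# Hypothetical helper: set a file's modification time to now,
# like the Unix touch command, so the next If-Modified-Since
# check sees the page as changed.
proc touch_page_file {path} {
    if {![file exists $path]} {
        error "no such file: $path"
    }
    file mtime $path [clock seconds]
}
```

Call it with the same path the 304 check resolves, e.g. touch_page_file $path right after the new content is saved.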