Forum OpenACS Q&A: Re: Google & Co on dynamic content

Posted by Andrew Herne on
Like Brian, I'm a first-time poster here. I've been lurking too long. Hi all.

To introduce myself: I'm IT Director (and yes, that means programmer too!) at National Extension College, a distance learning not-for-profit in the UK.

http://www.nec.ac.uk

We've been running a very modified ACS 3.4 since summer 2001. Always heads down and no time to chat. [I know, lame excuse.]

This may or may not be relevant; I think it's the idea rather than the detail that may be helpful. We had big problems with Google in mid-2002, losing all presence. Tracking down the problem proved tortuous, and we never really got to the bottom of it.

We realised that user sessions were implicated, and in particular the usca_p query string appended to ACS 3.4 ecommerce pages. We use a modified ecommerce module for most of our public pages as a rudimentary CMS.

We were able to recover by hacking a standard proc in ecommerce-defs.tcl:

ec_create_new_session_if_necessary

At the top of that proc we now list known spider user agents (googlebot, scooter, slurp, etc.) and match them against the return value of util_GetUserAgentHeader. If there is a match, we set user_session_id to 0 and return from the proc straight away. My belief is that Google was getting stuck in a session loop, which is of no use to a spider, and that this is what stopped it from spidering the site.
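
For anyone who wants the idea without the ACS specifics, here is a rough sketch in Python rather than our actual Tcl. The names KNOWN_SPIDERS, is_spider and session_id_for are made up for illustration, the spider list is just the three named above, and create_session stands in for whatever your normal session code does. Matching is assumed to be a case-insensitive substring test on the User-Agent header.

```python
# Hypothetical spider list: only the three agents named in the post.
KNOWN_SPIDERS = ("googlebot", "scooter", "slurp")


def is_spider(user_agent):
    """Return True if the User-Agent header contains a known spider name."""
    if not user_agent:
        # A missing or empty header is treated as an ordinary visitor.
        return False
    ua = user_agent.lower()
    return any(name in ua for name in KNOWN_SPIDERS)


def session_id_for(user_agent, create_session):
    """Return 0 for spiders; otherwise defer to the normal session logic.

    create_session is a placeholder callable for the site's usual
    session-creation code (an assumption, not part of ACS).
    """
    if is_spider(user_agent):
        return 0
    return create_session()
```

The point is simply that the check runs before any session is created, so a spider never receives a session id and never sees session-bearing URLs.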