Forum OpenACS Q&A: Response to Search Engines and bboard postings

Posted by Dave Bauer on
Bob, it's already in there. See doc/robot-detection.html in your OpenACS docs:
Web Robot Detection
part of the ArsDigita Community System by Michael Yoon 
-------------------------------------------------------

User-accessible directory: none 
Site administrator directory: /admin/robot-detection/ 
Data model: /doc/sql/robot-detection.sql 
Tcl procedures: /tcl/ad-robot-defs.tcl 
The Big Picture

Many of the pages on an ACS-based website are hidden from robots 
(a.k.a. search engines) by virtue of the fact that login is required 
to access them. A generic way to expose login-required content to 
robots is to redirect all requests from robots to a special URL that 
is designed to give the robot what at least appear to be linked .html 
files. 
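
To make the redirect idea concrete, here is a rough sketch in Python (not the actual Tcl in /tcl/ad-robot-defs.tcl). The User-Agent substrings, the redirect URL, and the function names are all made up for illustration; the real module keeps its robot list in the database.

```python
from typing import Optional

# Hypothetical values, for illustration only. The real system loads its
# robot patterns from the robot-detection data model.
ROBOT_PATTERNS = ("googlebot", "slurp", "bingbot")
ROBOT_REDIRECT_URL = "/for-robots/"


def is_robot(user_agent: str) -> bool:
    """Return True if the User-Agent contains any known robot pattern."""
    ua = user_agent.lower()
    return any(pattern in ua for pattern in ROBOT_PATTERNS)


def route_request(path: str, user_agent: str) -> Optional[str]:
    """Return the URL to redirect to, or None to serve the page normally."""
    if path.startswith(ROBOT_REDIRECT_URL):
        # Already inside the robot area: do not redirect again,
        # otherwise the robot would loop forever.
        return None
    if is_robot(user_agent):
        return ROBOT_REDIRECT_URL
    return None
```

A request filter would call route_request() before the login check, so a robot gets sent to the special URL instead of the login page.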
You might want to use this software for situations where public (not 
password-protected) pages aren't getting indexed by a specific robot. 
Many robots won't visit pages that look like CGI scripts, e.g., with 
question marks and form vars (this is discussed in Chapter 7 of 
Philip and Alex's Guide to Web Publishing). 
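
One way to give a robot links that "at least appear to be .html files" is to fold the form vars into the path. The sketch below is only an assumption about how that could be done; to_static_url and the example URL are hypothetical, and this is not how the ACS module itself encodes its links.

```python
from urllib.parse import parse_qsl, quote, urlsplit


def to_static_url(url: str) -> str:
    """Rewrite /page?key=value into /page/key/value.html.

    URLs without a query string are returned unchanged.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if not pairs:
        return url
    # Percent-encode each name and value so that a "/" inside a value
    # cannot be mistaken for a path separator.
    segments = [f"{quote(k, safe='')}/{quote(v, safe='')}" for k, v in pairs]
    return parts.path.rstrip("/") + "/" + "/".join(segments) + ".html"
```

The server then needs a matching handler that splits the path back into name/value pairs before running the original script.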
Also, I believe Google indexes URLs with variables in them anyway.