Of course ACS has robot detection: http://serverspace.com/doc/robot-detection . However, I fear that relying on the easily faked USER_AGENT variable could be a problem.
Allan Regenbaum kindly sent me a repair to robot-detection:
repair of robots facility on 3.x
a couple of changes are required to make robots work ...
first .. some user agents are too long for the column, so ..
SQL> alter table robots modify ( robot_useragent varchar(200));
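To check whether 200 characters is actually wide enough, one could scan a copy of the robots list for its longest user-agent values. The following is only a rough sketch in Python, not part of ACS: it assumes the list uses `robot-useragent: value` lines, ignores any continuation lines, and uses made-up sample data (`SAMPLE`, `ExampleBot`, `VerboseBot` are all hypothetical).

```python
def useragent_values(db_text):
    """Yield the value of every 'robot-useragent:' line in the text.

    Assumption: one field per line, in 'name: value' form.
    Continuation lines are not handled in this sketch.
    """
    for line in db_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "robot-useragent":
            yield value.strip()


def too_long(db_text, column_width):
    """Return the user-agent values that would not fit in the column."""
    return [v for v in useragent_values(db_text) if len(v) > column_width]


# Hypothetical sample data, standing in for the downloaded robots list.
SAMPLE = (
    "robot-id: examplebot\n"
    "robot-useragent: ExampleBot/1.0\n"
    "\n"
    "robot-id: verbosebot\n"
    "robot-useragent: VerboseBot/2.0 " + "x" * 250 + "\n"
)

if __name__ == "__main__":
    for value in too_long(SAMPLE, 200):
        print(len(value), value[:40] + "...")
```

If `too_long` returns anything for the real file at width 200, the column would need to be wider still.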
second, the call that fetches the file, in ad_replicate_web_robots_db in /tcl/ad-robot-defs.tcl, needs to change from
set result [ns_geturl $web_robots_db_url headers]
to
set result [ns_httpget $web_robots_db_url]
The URL for fetching the list of robots has changed, per the response to Malcolm's post.
In your service.ini:
[ns/server/yourservername/acs/robot-detection]
; the URL of the Web Robots DB text file
; (this is the new URL)
WebRobotsDB=http://www.robotstxt.org/wc/active/all.txt
; which URLs should ad_robot_filter check (uncomment to turn system on)
; (example: this causes a robot check on any visit under /ecommerce/)
FilterPattern=/ecommerce/*
; FilterPattern=/members-only-stuff/*
; the URL where robots should be sent
; (create this directory, with pages that suit the robots)
RedirectURL=/robot-heaven/
; How frequently (in days) the robots table
; should be refreshed from the Web Robots DB
RefreshIntervalDays=30
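
To make the interplay of FilterPattern and RedirectURL concrete, here is a minimal sketch in Python of the kind of check these settings describe. It is not the actual ad_robot_filter code: the function name `robot_redirect`, the sample user agents, and the use of shell-style glob matching and exact User-Agent comparison are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-ins for the service.ini parameters above.
FILTER_PATTERNS = ["/ecommerce/*"]
REDIRECT_URL = "/robot-heaven/"

# Hypothetical sample of User-Agent strings, standing in for the
# contents of the robots table.
KNOWN_ROBOT_AGENTS = {"ExampleBot/1.0", "SampleSpider/2.3"}


def robot_redirect(url_path, user_agent):
    """Return the redirect target for a recognised robot, else None."""
    # Only URLs covered by a FilterPattern are checked at all.
    if not any(fnmatchcase(url_path, p) for p in FILTER_PATTERNS):
        return None
    # The decision rests entirely on the client-supplied User-Agent.
    if user_agent in KNOWN_ROBOT_AGENTS:
        return REDIRECT_URL
    return None


if __name__ == "__main__":
    print(robot_redirect("/ecommerce/cart", "ExampleBot/1.0"))
    print(robot_redirect("/ecommerce/cart", "Mozilla/4.0"))
    print(robot_redirect("/index.html", "ExampleBot/1.0"))
```

The second call shows the concern raised at the top: a robot that sends a browser-like User-Agent is not redirected, because the check only sees what the client chooses to report.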