Forum OpenACS Q&A: Search Engines and bboard postings

Posted by Bob OConnor on

When I search Google for, say, "openacs", I get results, yes, but NO results from bboard posts. I assume this is because Google and other search engines do NOT index dynamic content, or specifically URLs with query variables.

Yes, I know one can use the search here on the site, but having the posts indexed by the search engines would give them more exposure to the big world. I am amazed how often, when I use Google to search for the answer to some problem, a forum posting with the answer is returned.

OpenACS posts have URLs like

So, what about a module (or proc) that builds a directory tree of threads? It would look like:

And in the "0000LN" directory would be an index.tcl file that would do a redirect to the long url above.

The module would create this new directory and index.tcl file every time a new thread is started.
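To make the idea concrete, here is a minimal sketch of the stub-writing step, written in Python only for illustration (a real module would presumably be a Tcl proc). The function name `write_thread_stub`, the `root` directory, and the `long_url` argument are all hypothetical; the long bboard URL is passed in by the caller rather than assumed here. The generated file uses AOLserver's `ns_returnredirect` command.

```python
from pathlib import Path


def write_thread_stub(root: Path, thread_id: str, long_url: str) -> Path:
    """Create <root>/<thread_id>/index.tcl that redirects to long_url.

    Hypothetical helper: `root` and `long_url` are supplied by the caller,
    since the real directory layout and bboard URL are site-specific.
    """
    # Only allow plain alphanumeric IDs such as "0000LN", so a thread ID
    # can never escape the root directory (e.g. "../etc").
    if not thread_id.isalnum():
        raise ValueError(f"unexpected thread id: {thread_id!r}")
    # The URL is written inside Tcl braces, which suppress substitution;
    # braces or backslashes in the URL would break that quoting.
    if any(ch in long_url for ch in "{}\\"):
        raise ValueError("URL contains characters unsafe inside Tcl braces")

    thread_dir = root / thread_id
    thread_dir.mkdir(parents=True, exist_ok=True)

    stub = thread_dir / "index.tcl"
    stub.write_text(f"ns_returnredirect {{{long_url}}}\n")
    return stub
```

Calling this once whenever a new thread is started would leave one short, static-looking URL per thread for a robot to follow.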

Feedback please!


Posted by Dave Bauer on
Bob, it's already in there. See doc/robot-detection.html in your OpenACS docs:
Web Robot Detection
part of the ArsDigita Community System by Michael Yoon 

User-accessible directory: none 
Site administrator directory: /admin/robot-detection/ 
Data model: /doc/sql/robot-detection.sql 
Tcl procedures: /tcl/ad-robot-defs.tcl 
The Big Picture

Many of the pages on an ACS-based website are hidden from robots 
(a.k.a. search engines) by virtue of the fact that login is required 
to access them. A generic way to expose login-required content to 
robots is to redirect all requests from robots to a special URL that 
is designed to give the robot what at least appear to be linked .html 
files. 

You might want to use this software for situations where public (not 
password-protected) pages aren't getting indexed by a specific robot. 
Many robots won't visit pages that look like CGI scripts, e.g., with 
question marks and form vars (this is discussed in Chapter 7 of 
Philip and Alex's Guide to Web Publishing). 
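
The redirect decision described above can be sketched roughly as follows. This is not the code from /tcl/ad-robot-defs.tcl; it is an illustrative Python sketch, and the function name `robot_redirect`, the list of robot names, the path patterns, and the `/robot-heaven/` URL are all assumptions made for the example.

```python
from fnmatch import fnmatchcase
from typing import Optional, Sequence


def robot_redirect(
    user_agent: str,
    path: str,
    robot_agents: Sequence[str],
    filter_patterns: Sequence[str],
    redirect_url: str,
) -> Optional[str]:
    """Return redirect_url if a known robot requests a filtered path.

    Returns None when the request should be served normally.
    All parameter names are hypothetical, chosen for this sketch.
    """
    ua = user_agent.lower()
    # Treat the request as a robot if any known robot name appears in
    # the User-Agent header (case-insensitive substring match).
    is_robot = any(name and name.lower() in ua for name in robot_agents)
    if not is_robot:
        return None
    # Never redirect requests already under the robot area, or the
    # robot would be bounced in a loop.
    if path.startswith(redirect_url):
        return None
    # Redirect only for paths matching one of the glob-style patterns.
    if any(fnmatchcase(path, pattern) for pattern in filter_patterns):
        return redirect_url
    return None
```

For instance, with `robot_agents=["googlebot"]`, `filter_patterns=["/bboard/*"]` and `redirect_url="/robot-heaven/"`, a request for a bboard page from a User-Agent containing "Googlebot" is sent to `/robot-heaven/`, while an ordinary browser gets `None` and is served as usual.
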
Also, I believe that Google does index URLs with variables in them anyway.

Posted by Ola Hansson on
Philip Greenspun wrote about this in a chapter of his book:

In that chapter he publishes some procedures to overcome the problem you point out, and I have changed them to suit PostgreSQL. I suppose the db will be hit quite hard when a robot finds your site, though. I took the liberty of placing the file here...

Posted by Bob OConnor on

Thank you! I see that Google does index some URL vars, but in my brief scan using "openacs" as the search term, I found NO bboard entries.

Ola, good code! I'll probably rewrite it to use
ns_return 200 text/html $whole_page
instead of
I'll post my results...