Forum OpenACS Q&A: Search Engines and bboard postings
When I search Google for, say, "openacs", I get results, but none of them come from bboard posts. I assume this is because Google and other search engines do not index dynamic content, or more specifically URLs that carry query variables.
Yes, I know one can use the search here on the site, but having the posts in the search engines would give them more exposure to the wider world. I am amazed how often, when I use Google to look for the answer to some problem, a forum posting with the answer is returned.
OpenACS posts have URLs like
So, what about a module (or proc) that builds a directory tree of threads? It would look like:
In the "0000LN" directory there would be an index.tcl file that redirects to the long URL above.
The module would create this new directory and index.tcl file every time a new thread is started.
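To make the idea concrete, here is a minimal sketch of such a proc, under some assumptions: the proc name make_thread_stub, its arguments, and the example URL are all hypothetical, and the generated index.tcl relies on AOLserver's ns_returnredirect to send the robot on to the real page. The proc itself uses only core Tcl, so it can be tried in plain tclsh.

```tcl
# Hypothetical helper (not part of OpenACS): create <root>/<thread_id>/index.tcl
# containing a redirect to the thread's real, query-string URL.
proc make_thread_stub {root thread_id long_url} {
    set dir [file join $root $thread_id]
    # file mkdir creates missing parent directories and is a no-op if $dir exists
    file mkdir $dir
    set fh [open [file join $dir index.tcl] w]
    # [list ...] quotes the URL safely as a single Tcl word in the generated script
    puts $fh [list ns_returnredirect $long_url]
    close $fh
    return $dir
}
```

It would be called once from whatever code inserts a new thread, for example `make_thread_stub $pageroot 0000LN $thread_url`, where both variables are placeholders for values the bboard code already has at hand.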
Also, I believe that Google does index URLs with variables in them anyway.

Web Robot Detection
part of the ArsDigita Community System by Michael Yoon
-------------------------------------------------------
User-accessible directory: none
Site administrator directory: /admin/robot-detection/
Data model: /doc/sql/robot-detection.sql
Tcl procedures: /tcl/ad-robot-defs.tcl

The Big Picture

Many of the pages on an ACS-based website are hidden from robots (a.k.a. search engines) by virtue of the fact that login is required to access them. A generic way to expose login-required content to robots is to redirect all requests from robots to a special URL that is designed to give the robot what at least appear to be linked .html files.

You might want to use this software for situations where public (not password-protected) pages aren't getting indexed by a specific robot. Many robots won't visit pages that look like CGI scripts, e.g., with question marks and form vars (this is discussed in Chapter 7 of Philip and Alex's Guide to Web Publishing).
He publishes some procedures to overcome what you point at in that chapter and I have changed them to suit PostgreSQL. I suppose the db will be hit quite hard when a robot finds your site though. I took the liberty of placing the file here...
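For anyone who has not read the robot-detection docs quoted above, the core idea can be sketched roughly as follows. This is not the ACS code: the proc name robot_p, the pattern list, and the target URL in the comment are made up for illustration, and a real installation would presumably keep its robot list in the database rather than hard-coding it.

```tcl
# Hypothetical glob patterns; a real site would maintain its own list.
set robot_patterns {*googlebot* *slurp* *crawler* *spider*}

# Return 1 if the User-Agent string matches any pattern (case-insensitive), else 0.
proc robot_p {user_agent patterns} {
    foreach pattern $patterns {
        if {[string match -nocase $pattern $user_agent]} {
            return 1
        }
    }
    return 0
}

# Inside AOLserver (not runnable in plain tclsh) a filter might then do:
#   set ua [ns_set iget [ns_conn headers] User-Agent]
#   if {[robot_p $ua $robot_patterns]} { ns_returnredirect /some-static-view/ }
# where /some-static-view/ is a placeholder for the robot-friendly URL.
```

Matching on User-Agent is only a heuristic, since any client can send any string, but it is cheap and avoids a database hit on every request.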
Thank you! I see that Google does index some URL vars but in my brief scan using "openacs" as the search, I found NO bboard entries.
Ola, good code! I'll probably rewrite to use
ns_return 200 text/html $whole_page
I'll post my results...