Forum OpenACS Q&A: Re: Google & Co on dynamic content

Posted by Tom Jackson on

You used to have to worry about having query strings in your public URLs, because search engines would not index them for fear of falling into an infinitely deep sub-web.

Google now indexes any link on your page, even JavaScript links. However, they only crawl to a certain depth on any one site; they believe that any content of importance lives within a few levels of an index page.

I haven't figured out how they decide what an index page is, but it is interesting to listen to them pontificate on their bot technology. Put up a large site and eventually Googlebot will descend on it like _Attack of the Clones_.

This might have implications for pages like index.vuh, where the same page can occur at different depths, or under many subdirectories in a package. I don't think robots.txt allows wildcards in the path portion of a URL.
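For example, standard robots.txt rules match by URL prefix only, so you can block a whole subtree but not a pattern. A minimal sketch (the paths here are made up):

    # Prefix matching works: this blocks everything under one mount point.
    User-agent: *
    Disallow: /news/archive/

    # This is NOT part of the original exclusion standard, although some
    # engines have since added * in paths as a nonstandard extension:
    # Disallow: /*/index.vuh

So to keep index.vuh pages out of the index at every depth, you would have to list each directory prefix explicitly.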