Forum OpenACS Q&A: Re: Google & Co on dynamic content

Posted by Brian Fitzgearld on
Hey Don,

Alas, even the pages that don't contain a ? URL fail to be googled, and that includes pages that have not changed since launch.  See, for example, http://www.greenpeace.org/aboutus/

Contains the phrase "As one of the longest banners we've ever made" and has contained it since Greenpeace Planet inception.  Google is aware of this phrase only at an alternate site that quotes it.

As to what gets googled and what doesn't, our experience *sounds* closer to Tom's: some top-of-the-directory indexing and then the brainless bots get tired and go away.  I've seen articles indexed as part of the aggregating menus (News, Features, Press) while the articles themselves are absent from Google, so it may just be a matter of how many hops down the tree they are. There's a fairly detailed bug report on this in the GP Planet Bugzilla about what's getting hit and what's not if you're keen.

I'll ask Bruno and Alex to drop by this thread and call attention to the VUH idea and Michael's suggestions -- all look sound, but me, I only know enough to ask the question, not evaluate the answers. 😉

Great to see the community response to this.  Thanks all.

--b

Posted by Robert Locke on
Hi Brian,

<blockquote> See, for example, http://www.greenpeace.org/aboutus/
Contains the phrase "As one of the longest banners we've
ever made" and has contained it since Greenpeace Planet
inception.  Google is aware of this phrase only at an
alternate site that quotes it.
</blockquote>

I did a search on Google as follows:
    "as one of the longest banners we've" site:www.greenpeace.org

and www.greenpeace.org/aboutus/ appeared as the only result.

I'm guessing you didn't see it because Google filters out redundant results, but Google is definitely aware of the page.  Click on the "repeat the search with the omitted results included" link to see all results.

I checked a few of the "deeper" pages in the Greenpeace site and then ran a search on Google for a phrase within each page, adding "site:www.greenpeace.org" (to limit the search results).  Google appeared to be aware of them (at least the ones I checked).  Google also seemed to be aware of the various versions of the page (e.g., /ships/ship-detail?ship... and /international_en/ships/ship-detail?ship...).
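
If you want to repeat this check for a batch of pages, here is a rough sketch of how the query above could be turned into a search URL. The function name `site_search_url` is made up for illustration, and I'm assuming the usual `https://www.google.com/search?q=...` form; all it does is wrap the phrase in quotes and append the `site:` restriction.

```python
from urllib.parse import urlencode


def site_search_url(phrase: str, site: str) -> str:
    """Build a search URL for an exact phrase limited to one site.

    Hypothetical helper: the base URL and the "q" parameter are
    assumptions about Google's search form, not taken from the thread.
    """
    # Quote the phrase for an exact match and restrict results with site:
    query = f'"{phrase}" site:{site}'
    # urlencode percent-escapes the quotes, spaces and colon in the query
    return "https://www.google.com/search?" + urlencode({"q": query})


url = site_search_url("as one of the longest banners we've", "www.greenpeace.org")
print(url)
```

Paste the printed URL into a browser and, if the page is indexed, it should show up in the results (possibly behind the "omitted results" link mentioned above).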

Perhaps one of the problems is your ranking within Google, which is a separate issue.  I know there are companies/software which can supposedly help in that department, but I don't know if they are reliable.