Of course we don't want to do any cloaking. The quality of the content and whether or not people are linking to it is a bit outside my core mission, which is to identify and solve any technical glitches that cause poor indexing. I've put Eric Wolfram's top recommendation, fixing page titles, at the top of my list.
After reading all the notes and some of the linked items, I'm wondering:
- Is it worth trying to monitor results? greenpeace.org gets many Google hits every day, but not every page is hit every day, and some authors claim that pages go months without re-indexing. Maybe we should just make the obvious fixes and leave it alone, or check back in 6 months.
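If we do decide to monitor, one cheap approach is to grep the access logs for Googlebot and record the last crawl date per URL, rather than watching search results. A minimal sketch, assuming Apache combined-format logs (the regex and log layout are my assumptions, not our actual setup):

```python
import re

# Apache combined log format: host ident user [date] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'\S+ \S+ \S+ \[([^\]]+)\] "GET (\S+) [^"]*" \d+ \S+ "[^"]*" "([^"]*)"'
)

def last_crawl_dates(log_lines):
    """Return {path: timestamp of the most recent Googlebot GET}."""
    seen = {}
    for line in log_lines:
        m = LINE_RE.match(line)
        if m and "Googlebot" in m.group(3):
            timestamp, path = m.group(1), m.group(2)
            seen[path] = timestamp  # log lines assumed chronological
    return seen
```

Running this over a month of logs would tell us directly which pages Googlebot never touches, which is the real question behind the re-indexing worry.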
- Should I put any effort into better pretty URLs - not just /article/145 but putting a keyword into the pretty URL? We do the foundation work for this in some parts of OpenACS, where short_name is a locally unique string suitable for a URL. This is nicer for users, certainly - how standard can we make this in OpenACS? Is it worth trying to retrofit this to old apps that just have ids, by creating a short-name field and populating it?
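For the retrofit case, generating the short_name from the existing title is mostly mechanical. A sketch of the kind of slug helper I have in mind (the function name is mine; a real version would also have to de-duplicate within a folder, e.g. by appending the id on collision):

```python
import re

def make_short_name(title):
    """Turn an article title into a URL-safe short name (slug)."""
    slug = title.lower()
    slug = re.sub(r"[^a-z0-9]+", "-", slug)  # collapse spaces/punctuation to hyphens
    return slug.strip("-")

# make_short_name("Save the Whales: 2004 Report") -> "save-the-whales-2004-report"
```

So /article/145 could become /article/save-the-whales-2004-report, with the numeric id kept as a fallback lookup.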
- Where else should we be setting noindex,nofollow? So far:
- in edit and add mode of form-builder
- in packages with duplicates. Are the duplicates a bigger problem than the risk of not getting indexed at all - that is, if we block some pages from indexing and the "intended to be indexed" pages never get crawled? Maybe we're better off trusting the search engines' ability to hide duplicates.
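Wherever we land on that list, the rule itself could live in one template helper instead of being scattered across packages. A minimal sketch (the page-type names are illustrative, not real OpenACS identifiers):

```python
def robots_meta(page_type):
    """Return the robots meta tag for a page, or "" for indexable pages."""
    # Hypothetical page-type labels; the real check would live in the
    # OpenACS templating layer, keyed off form mode or package config.
    blocked = {"form-edit", "form-add", "duplicate-view"}
    if page_type in blocked:
        return '<meta name="robots" content="noindex,nofollow">'
    return ""
```

Centralizing it would also make it easy to flip the duplicate-pages decision later without touching every package.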