I've been playing with htDig and other site indexing tools, and have
had pretty good results
(http://www.theashergroup.com/demos/openacs/).
But I face a basic quandary and I'd like your help. Part of the
quandary is an AOLserver quirk, and part is fundamental to sites that
offer dynamic content. On the thought that what I am doing may be
useful to others in the ACS environment, I'd like to hear your
thoughts on search metaphors, design, and APIs.
I would like to offer visitors the ability to search for content
based on the date of the content. Now that's a completely
understandable and useful capability when searching your garage, your
tax records, or a static site. I know something happened last year
between April Fools' Day and Guy Fawkes Day. What was it?
It doesn't make as much sense on a site that's more application than
content (or at least it's not nearly as easy, or even doable, with a
typical site index engine): find me the Amazon quote as of last April.
A nice idea, but it can't be done by most site indexers, which are
geared more towards capturing and presenting the latest snapshot of a
site.
It does make sense for certain ACS elements: find me the thread on
JavaScript security holes that occurred around Halloween.
The problem is that this is hard to do with the site index tools I've
looked at so far when combined with our current bboard metaphor, and
it's hard to do in general within AOLserver.
The site index tools I've seen (htDig mostly) can only deal with one
date: the last modified date of the document. Yet a page that
contains dynamic content is an assembly of pieces, each of which has
its own "last modified" date, and most of those dates are unknown to
the indexer.
AOLserver does the cheap and mostly correct thing: for ADP pages (Tcl
pages too?) it reports the current time and date as the last modified
time and date, while for HTML pages it returns the actual last
modified time and date as recorded in the file system.
Our bboard metaphor makes it hard, because, well, what is the date of
a thread? Is it the date of the first post, or the date of the last
post? It's really a date range.
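
One way to make the "date range" idea concrete is to treat a search by date as an interval overlap test. The following is only a minimal sketch in Python, not anything bboard or htDig actually does; the `Thread` class and the `thread_matches` function are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class Thread:
    """Hypothetical stand-in for a bboard thread: a title plus post dates."""
    title: str
    post_dates: List[date]

    def date_range(self) -> Tuple[date, date]:
        # The thread "lives" from its first post to its last post.
        return min(self.post_dates), max(self.post_dates)


def thread_matches(thread: Thread, search_start: date, search_end: date) -> bool:
    """True if the thread's date range overlaps the searched date range."""
    first, last = thread.date_range()
    # Two closed intervals overlap when each starts before the other ends.
    return first <= search_end and search_start <= last


halloween_thread = Thread(
    title="javascript security holes",
    post_dates=[date(2000, 10, 28), date(2000, 11, 2)],
)
print(thread_matches(halloween_thread, date(2000, 10, 30), date(2000, 10, 31)))
```

With this view, neither the first nor the last post date has to be picked as "the" date of the thread; a query window matches if it touches any part of the thread's lifetime.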
So what's the answer?
Is the search by date metaphor just meaningless in a world of dynamic
content?
Should we formally support an ACS interface so that each page can set
its own last modified date and time if it needs or wants to?
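
To show roughly what such an interface would have to produce, here is a minimal sketch of turning a content timestamp into an HTTP Last-Modified header value. It is written in Python for illustration only (a real ACS version would be Tcl inside AOLserver), and `last_modified_header` is a hypothetical name, not an existing ACS or AOLserver call.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def last_modified_header(content_time: datetime) -> str:
    """Format a content timestamp as an HTTP Last-Modified header value.

    Hypothetical helper: the page decides which timestamp represents its
    content (for example, the newest row it displays) and passes it in.
    """
    if content_time.tzinfo is None:
        # Refuse naive timestamps rather than guess a time zone.
        raise ValueError("content_time must be timezone-aware")
    # HTTP dates are always expressed in GMT.
    utc_time = content_time.astimezone(timezone.utc)
    return format_datetime(utc_time, usegmt=True)


newest_post = datetime(2000, 10, 31, 12, 0, 0, tzinfo=timezone.utc)
print("Last-Modified:", last_modified_header(newest_post))
```

An indexer that only understands one date per document would then at least see a date chosen by the page, instead of the time of the crawl.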
In an application like bboard, what should the last modified date be
set to?
Should content assemblies like bboard have a special mode for search
indexing engines (presumably useful to more than just htDig, though I
don't know that) that exposes individual elements when each has a
meaningful last modified date and time that would be of interest to
folks searching a site?
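
As a rough picture of what such an indexing mode might expose, the sketch below flattens one thread into one record per post, each with its own date. It is an assumption-laden illustration in Python: the `index_records` function, the dictionary keys, the example URL, and the `#msg-<id>` URL scheme are all made up here, and nothing in htDig is known to consume this format.

```python
from datetime import datetime
from typing import Dict, List


def index_records(thread_url: str, posts: List[Dict]) -> List[Dict[str, str]]:
    """Emit one indexable record per post instead of one per thread page."""
    records = []
    for post in posts:
        records.append({
            # Hypothetical addressing scheme: an anchor per message.
            "url": f"{thread_url}#msg-{post['id']}",
            "title": post["subject"],
            # Each element carries its own meaningful last modified date.
            "last_modified": post["posted"].isoformat(),
        })
    return records


posts = [
    {"id": 1, "subject": "javascript security holes",
     "posted": datetime(2000, 10, 28, 9, 30)},
    {"id": 2, "subject": "Re: javascript security holes",
     "posted": datetime(2000, 11, 2, 14, 5)},
]
for record in index_records("http://example.com/bboard/thread-42", posts):
    print(record)
```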
What kind of an interface would you like to see?