Forum OpenACS Development: Response to OpenACS 4 Search Integration, what should it look like?
Well, if you'd done just a little bit of research you would've known that the CR was rewritten to allow both filesystem and db storage of content, so there's no need to propose a solution to this already-solved problem.
Well the research I did before posting was to
- participate in a lengthy discussion evaluating various OpenACS search strategies from htDig to SWISH, to something I now recognize as OpenFTS: here's your confident support of it just two months ago:
possibly synergistic efforts include an in-database search engine being developed by folks who run a Russian portal. If someone's interested in tracking this down and evaluating it both for usability and completeness (i.e. are they done yet?) e-mail me. I know that to make it work very well on their portal site they had to make a change to the PG core's optimization and evaluation of limit that Tom Lane didn't like all that well.
- implement htDig and SWISH on OpenACS 3.2.5
- set up an htDig based search for OpenACS.ORG and OpenNSD.org that lets users use boolean, wildcard, fuzzy, misspelling correction, stemming, and date ranges, while also letting the user specify which portions of the site to search. This demo lets folks index their Word, Excel, and PDF files. (This seemed like a damn fine search to start with, better than what is offered at /search, but I now understand how unsophisticated this search actually is.)
- searched OpenACS's /bboards, /doc, and /wp for the word OpenFTS
- read BOTH threads that contain the word OpenFTS in them (prior to this thread)
- tried to discover (but failed) where the OpenFTS discussion came from (the main OpenFTS thread starts off by saying this is an update)
- had a brief email discussion with Dan in which he seemed to agree that OpenFTS was PG-centric and that Oracle SE doesn't offer Intermedia (we think this is right), agreed there was room for both approaches (db and web crawling), and asked for my opinion of SWISH's applicability to OpenACS 4, which I offered back in this thread.
- found a link to a status report from one month ago suggesting that Swish++ (and htDig) based approaches would be a good thing
And the content repository should be the hook for searching dynamic content regardless of where it's stored, DB or file system. There's no reason why static content couldn't be mapped via the content repository, too, now that it knows about stuff stored in the file system ...
I think I did my preliminary bit of research. No one suggested the problem may not be real. No one suggested the OpenFTS/PG weirdness has been fixed. No one mentioned that OpenFTS has been found reliable and faster than the alternatives. And no one mentioned just what search features OpenFTS actually supports.
I stand corrected now, and I understand that before asking any questions, or stating any concerns, or prior to any effort to participate with you, I must first read every thread and every document available in the system and be prepared to defend my understanding of this material. I must make private inquiries to all the developers and ask what their status is and what their thoughts are.
So in ten weeks of Baccusland, OpenFTS moves from being unknown code that munges PG in not pleasant ways to being savior of OpenACS 4 search. And I'm not supposed to voice any concerns.
Got it, I think I understand the process now.
Thank you for the hammering reset.