Forum OpenACS Q&A: Full text search
included in version 3.2.4? Also, how is full text searching done?
Thanks in advance
See the thread of two days ago from Don Baccus entitled "Simple search tool available for bboard module" at http://new.openacs.org/bboard/q-and-a-fetch-msg.tcl?msg_id=0000Tz&topic_id=11&topic=OpenACS - Don says that this is only a stopgap measure, but it is clearly immensely better than nothing.
It does a simple ranking based on a list of keywords - it's not phrase-based. The more keywords that are matched, the higher the score you get. It doesn't weight for multiple occurrences of keywords or anything like that. It scales the return value so it lies between 0 and 100, 0 being "no keywords matched", 100 being "all keywords matched".
I suspect the simple Tcl ranking function could easily be twiddled to provide more finely-tuned search results - Tcl's a lot more fun for writing this kind of code than Oracle PL/SQL, that's for sure! The current ranking function is about 10 lines of code...
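For readers who want to see the idea concretely, here is a rough sketch of that kind of ranking. It is written in Python rather than Tcl and is not the actual bboard code; the function name keyword_score and the case-insensitive substring matching are assumptions made for illustration only.

```python
def keyword_score(keywords, text):
    """Return an integer from 0 to 100: the share of distinct keywords found in text.

    Hypothetical helper, not the OpenACS function. Repeated occurrences of a
    keyword do not raise the score, matching the behaviour described above.
    """
    # Assumption: matching is case-insensitive and duplicate keywords count once.
    words = {k.lower() for k in keywords if k.strip()}
    if not words:
        return 0
    haystack = text.lower()
    # Assumption: plain substring match, no stemming or phrase handling.
    matched = sum(1 for k in words if k in haystack)
    return round(100 * matched / len(words))
```

A query of "full text search" against a post that mentions only "full" and "text" scores 67, and a post that repeats one matching keyword many times scores no higher than a post that mentions it once.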
But there's no way to avoid the basic problem that this hack requires a sequential scan of the bboard table (or any table you decide to search), so it is inherently slow. This is the major reason it is a stopgap: it won't scale.
But as Greg mentions, it indeed is better than nothing. photo.net survived surprisingly well with this little hack for quite some time.
Right now Ben and I are leaning towards an out-of-database solution, since a good indexing solution in the database is likely to lead to slow inserts of posts, news items, and other searchable things. Experience with InterMedia tends to back up that point of view (if not outright poison our point of view!)
An out-of-database solution is fine, because you don't really need your search index to be ACID - if it hoses, you just rebuild it.
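To illustrate why a throwaway index is acceptable, here is a minimal sketch of an inverted index kept outside the database. It is only a toy under stated assumptions: build_index and lookup are hypothetical names, rows stands in for (msg_id, text) pairs pulled from whatever table is being searched, and the tokenizing rule is arbitrary. Real tools such as swish work differently in detail.

```python
import re
from collections import defaultdict


def build_index(rows):
    """Build a word -> set of msg_ids map from (msg_id, text) pairs.

    Hypothetical helper: the index is derived entirely from the rows, so if
    it is lost or corrupted it can simply be rebuilt by calling this again.
    """
    index = defaultdict(set)
    for msg_id, text in rows:
        # Assumption: words are runs of lowercase letters and digits.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(msg_id)
    return index


def lookup(index, keywords):
    """Return (score, msg_id) pairs, best first, scored 0-100 by share of keywords matched."""
    words = {k.lower() for k in keywords}
    hits = defaultdict(int)
    for word in words:
        # Only messages containing the word are touched; no full scan of the posts.
        for msg_id in index.get(word, ()):
            hits[msg_id] += 1
    return sorted(
        ((round(100 * n / len(words)), msg_id) for msg_id, n in hits.items()),
        reverse=True,
    )
```

Because the index holds nothing that is not already in the database, it needs no transactional guarantees of its own: rebuilding from the source rows gives back the same structure.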
I've been playing with swish and swish++...
"The PLWeb Turbo 3.0 source code distribution allows you to rebuild the application-level binaries of PLWeb Turbo 3.0. The libraries that are part of the search engine (CPL) are provided as binaries in the CPL 6.3 binary distribution and cannot be rebuilt (i.e., no source code is provided)."
In summary: it's not open-source software, and nothing suggests it ever will be.
2. License Grant. AOL hereby grants You a world-wide, royalty-free, non-exclusive license (a) to use, reproduce, sublicense and distribute the Licensed Software, including as part of one or more Integrated Works; (b) to provide support and maintenance to a third-party in the use of the Licensed Software and in the development and use of one or more Integrated Works; and (c) to use, reproduce, sublicense and distribute the Documentation in connection with all of the foregoing. Perl scripts incorporated with the Licensed Software may be used and modified to facilitate use of the Licensed Software or Integrated Works.
It may not be open source, but free is still free.
However, as the (real) Free Software philosophy goes: you've got the source (to OpenACS), and nothing can stop you from providing an OpenACS search interface for pls.