Forum OpenACS Development: ANNOUNCE: OpenFTS release 0.3.2

Posted by Dan Wickstrom on
OpenFTS 0.3.2 is now available for download at http://sourceforge.net/projects/openfts.

OpenFTS (Open Source Full Text Search engine) is an advanced PostgreSQL-based search engine that provides online indexing of data and relevance ranking for database searching. Close integration with the database allows metadata to be used to restrict search results.

The Tcl version (a Perl version is also available) can be built to work with tclsh, or it can be built as a loadable module for AOLserver (see http://www.aolserver.com). Package support for OpenFTS has also been built into OpenACS (see http://openacs.org) to provide a highly integrated search solution.

OpenFTS has the following features:

* Very fast index updates: No need for re-building the whole index on every submission to the indexer.

* Proximity: Results are ranked by how close together the keywords appear in a document. So if you have a document which includes the phrase "full text search" and another document which includes the scattered words "full", "text", "search", the first one will be ranked higher.

* Weighting scheme: Configurable weights for words in title and body. By default, words in the title weigh more than words in the body of the document, so documents that contain the search query in the title are ranked higher than those that contain it only in the body.

* Stemming: Finds the same word with different endings. For example, if the words "testing" or "tests" are found in a document, the indexer stores the word "test" instead. Search will likewise look for the word "test" if "testing" or "tests" is given in the search query. Note that this scheme rules out exact-form searches, but it usually reduces database size and makes searching faster.

* Stopwords: Ignores common words and characters (known as stopwords) as they tend to slow down searches without improving the quality of the results. Terms such as "where" and "how", as well as certain single digits and single letters, are not included in searches.

* Modular architecture: Makes it easy to add parsers and dictionaries. Snowball stemmer dictionary support has been added for the following languages: Danish, Dutch, English, Finnish, German, Italian, Norwegian, Portuguese, Russian, Spanish, and Swedish. Other Snowball stemmer dictionaries can easily be added as they become available. Ispell dictionary support is also available. This allows for easy integration of multilingual search capabilities.

* Multilingual support: Latin and Cyrillic are available. Others can be added.

* Search terms in context: Displays an excerpt from each document, showing how your search terms are used in context in that document. Your search terms are bolded so you can tell at a glance whether the result is a document you want to view.
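
To make the stopword, stemming, and weighting items above concrete, here is a minimal Python sketch. It is not OpenFTS code and does not use the OpenFTS API: the stopword list, the suffix-stripping rule, the function names, and the weights 2.0 and 1.0 are all assumptions made up for illustration (OpenFTS itself relies on Snowball and Ispell dictionaries).

```python
import re

# Hypothetical stopword list and suffix rules, for illustration only.
# OpenFTS uses Snowball/Ispell dictionaries, not this toy stemmer.
STOPWORDS = {"where", "how", "the", "a", "of"}
SUFFIXES = ("ing", "s")


def stem(word):
    """Strip one known suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Lowercase, tokenize, drop stopwords and single characters, then stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS and len(t) > 1]


def score(query, title, body, title_weight=2.0, body_weight=1.0):
    """Sum weighted counts of query terms found in the title and the body."""
    terms = set(normalize(query))
    title_hits = sum(1 for t in normalize(title) if t in terms)
    body_hits = sum(1 for t in normalize(body) if t in terms)
    return title_weight * title_hits + body_weight * body_hits
```

With these assumed weights, a query for "tests" scores a document titled "Testing guide" higher than one that only mentions "tests" in its body, because both forms reduce to the stem "test" and the title match counts double.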

For more information, see http://openfts.sf.net/

Posted by Andrei Popov on
What about openfts-driver -- has it been updated? It does not seem so, as I get:
Request Error

can't read "opt(txtidx_field)": no such variable
    while executing
"string length $opt(txtidx_field)"
    (procedure "Search::OpenFTS::Index::init" line 23)
    invoked from within
"Search::OpenFTS::Index::init opt"
    invoked from within
"array set idx [Search::OpenFTS::Index::init opt]"
    ("uplevel" body line 35)
    invoked from within
"uplevel {
    	  ad_page_contract {

    Initialize OpenFTS

    @author Neophytos Demetriou

} {
    table_name
    table_id
    dict
    parser
    ..."
when trying to init it with 0.3.2. Looks to me like openfts-driver/www/admin/initialize-2.tcl is completely out of sync with reality -- tables that it creates are anything but what 0.3.2 is expecting...
Posted by Dan Wickstrom on
It has only been updated on the 4.6 branch.
Posted by Andrei Popov on
I.e., it is not in HEAD?
Posted by Dan Wickstrom on
No, it's not. It will be merged back onto the main branch after the 4.6 release.