Forum OpenACS Q&A: OpenFTS

Posted by Richard Hamilton on
I just tried to install OpenFTS v0.3 (Tcl) and discovered that the 'create_func.sql' file referred to in 'load.sql' (ref. OpenACS Next Steps) does not exist in v0.3. In fact, the func_pgsql directory is not there at all. The link below indicates that the current version is not backward compatible with older versions, and that the query syntax and data structures have changed.

I assume, therefore, that OpenACS can at present only support OpenFTS up to v0.2.
See: http://openfts.sourceforge.net/primer.html

Can anyone confirm or refute this rather half-educated conclusion?

Regards

Richard


3. Changes
IMPORTANT NOTICE: This version is incompatible with earlier versions 
due to changes in the base data type, the structure of the indexing 
tables, and the interfaces of the dictionaries. 
OpenFTS is in what is likely to be one of many stages. The OpenFTS 
developers are experimenting with various features which should 
eventually result in a full-featured search engine within PostgreSQL. 
The latest incarnation has a more natural interface that is easier to 
understand. In the old system, search queries looked something like the 
following: 


      SELECT
          txt.tid
      FROM
          txt
      WHERE
          (txt.fts_index @ '{14054652}')

The new system uses a more natural query syntax that supports 
boolean operators; it looks like the following: 

      SELECT * FROM foo WHERE titleidx @@ '(the|this)&!we';

This is quite an improvement over the previous approach. Here's a 
more complete list of changes in the latest version: 
The latest version is based on tsearch, a PostgreSQL contrib module 
that implements a special text data type, txtidx, suitable for text 
indexing. It uses words 'as is', without hashing them to integers, and 
provides a more natural search interface. For example, it is now 
possible to test full text search from 
psql. More information about tsearch is available here. 
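
To make the boolean syntax concrete, here is a rough Python sketch of how a query such as '(the|this)&!we' could be evaluated against the set of words in a document. This is not OpenFTS or tsearch code: the names `matches` and `words_of` are made up for illustration, and the precedence (`!` binds tightest, then `&`, then `|`) is an assumption.

```python
import re

def words_of(text):
    # Hypothetical helper: lowercase the text and collect its distinct words.
    return set(re.findall(r"\w+", text.lower()))

def matches(query, words):
    """Return True if the word set satisfies the boolean query."""
    tokens = re.findall(r"[()&|!]|\w+", query)
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            rhs = parse_and()  # always parse the right side, then combine
            result = result or rhs
        return result

    def parse_and():
        nonlocal pos
        result = parse_not()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            rhs = parse_not()
            result = result and rhs
        return result

    def parse_not():
        nonlocal pos
        if tokens[pos] == "!":
            pos += 1
            return not parse_not()
        if tokens[pos] == "(":
            pos += 1
            result = parse_or()
            pos += 1  # skip the closing ")"
            return result
        word = tokens[pos]
        pos += 1
        return word in words  # words are compared as is, no integer hashing

    return parse_or()

query = "(the|this)&!we"
print(matches(query, words_of("This is the title")))   # contains "this", no "we"
print(matches(query, words_of("We like the title")))   # contains "we"
```

The point of the sketch is only that the query operates on the words themselves, which is why such a query can be typed and tested directly in psql.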

Implementations of dictionary interfaces are now required to work with 
lexemes instead of integers: the lemms method replaces lemmsid, and 
is_stoplexem replaces is_stoplemm. 

Better administration and maintenance API. Added: 

drop -- removes all OpenFTS tables, indices, dictionaries (if 
dictionary provides 'drop' method); 

drop_index -- removes all OpenFTS indices from index tables 
(INDEX1, ..., INDEXN) and the GiST index on the base table (where the 
documents are stored together with their primary key). 

Added generic interfaces to ISpell dictionaries and Snowball 
stemmers. ISpell dictionaries are free, available for many 
languages, and can be used to return the base forms of a word. This is 
very important for inflected languages, like Russian. Snowball 
stemmers, available from http://snowball.tartarus.org, can be used to 
stem a word, i.e. to cut off the word's ending and use the linguistic 
root for indexing and searching.
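
The idea of stemming can be sketched in a few lines of Python. The suffix list and the names `toy_stem` and `index_terms` are invented for illustration only; a real Snowball stemmer uses a much more careful, language-specific rule set.

```python
# Toy suffix list, checked in order. NOT a real Snowball rule set.
SUFFIXES = ("ing", "ed", "es", "s")

def toy_stem(word):
    """Cut a known ending off the word, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    # Index the stems rather than the surface forms of the words.
    return {toy_stem(w) for w in text.split()}

index = index_terms("Searching indexed documents")
# Stem the query word the same way, so different endings still match.
print(toy_stem("searches") in index)
```

Because the same function is applied at indexing time and at query time, "searches", "searched" and "searching" all reduce to one indexed term.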