Forum OpenACS Q&A: OpenFTS
I just tried to install OpenFTS v0.3 (Tcl) and discovered that
the 'create_func.sql' file referred to in 'load.sql' (ref. OpenACS
Next Steps) does not exist in v0.3. In fact, the func_pgsql directory
is not there at all.
The following link indicates that the current version is not
backward compatible with older versions and that the query syntax and
data structures have changed.
I assume, therefore, that OpenACS can at present only support
OpenFTS up to v0.2.
See:
http://openfts.sourceforge.net/primer.html
Can anyone confirm or refute this rather half-educated conclusion?
Regards
Richard
3. Changes
IMPORTANT NOTICE: This version is incompatible with earlier versions
due to changes in the base data type, the structure of the indexing
tables, and the interfaces of the dictionaries.
OpenFTS is in what is likely to be one of many stages. The OpenFTS
developers are experimenting with various features which should
eventually result in a full-featured search engine within PostgreSQL.
The latest incarnation has a more natural interface that is easier to
understand. In the old system, search queries looked something like the
following:
    SELECT
        txt.tid
    FROM
        txt
    WHERE
        (txt.fts_index @ '{14054652}')
The new system uses a more natural-language approach that supports
boolean operators, and a query looks like the following:
SELECT * FROM foo WHERE titleidx @@ '(the|this)&!we';
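
To make the meaning of a query string such as '(the|this)&!we' concrete, here is a minimal Python sketch that evaluates such an expression against a set of words. It is only an illustration of the boolean semantics (| is OR, & is AND, ! is NOT), not how tsearch or OpenFTS implement matching. The function name `matches` and the precedence it assumes (! binds tighter than &, which binds tighter than |) are assumptions made for this example.

```python
import re

OPERATORS = {"(", ")", "&", "|", "!"}


def matches(query, words):
    """Return True if the set `words` satisfies the boolean `query`.

    Hypothetical helper for illustration only; not part of tsearch/OpenFTS.
    """
    # Split the query into operator characters and plain words.
    tokens = re.findall(r"[()&|!]|[^\s()&|!]+", query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()  # always parse the right side before combining
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        tok = peek()
        if tok == "!":
            take()
            return not parse_not()
        if tok == "(":
            take()
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return value
        if tok is None or tok in OPERATORS:
            raise ValueError("expected a word")
        return take() in words

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: %r" % tokens[pos])
    return result
```

Under these assumptions, a title containing "this" but not "we" satisfies '(the|this)&!we', while a title containing both "the" and "we" does not.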
This is quite an improvement over the previous approach. Here's a
more complete list of changes in the latest version:
The latest version is based on tsearch, a PostgreSQL contrib module
that implements a special text data type, txtidx, suitable for text
indexing. It uses words 'as is', without hashing them to integers,
and provides a more natural search interface. For example, it is now
possible to test full text search from psql. More information about
tsearch is available here.
Implementations of dictionary interfaces are now required to work with
lexemes instead of integers: the lemms method replaces lemmsid, and
is_stoplexem replaces is_stoplemm.
A better administration and maintenance API. Added:
  drop -- removes all OpenFTS tables, indices, and dictionaries (if
  the dictionary provides a 'drop' method);
  drop_index -- removes all OpenFTS indices from the index tables
  (INDEX1, ..., INDEXN) and the GiST index on the base table (where
  the documents are stored together with their primary key).
Added generic interfaces to ISpell dictionaries and Snowball
stemmers. ISpell dictionaries are free, available for many
languages, and can be used to return the base forms of a word. This is
very important for highly inflected languages, such as Russian. Snowball
stemmers, available from http://snowball.tartarus.org, can be used to
stem a word, i.e. to cut off the word's ending and use the linguistic
root for indexing and searching.
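
As a rough illustration of the dictionary and stemming items in the list above, the following Python sketch shows a toy dictionary that works with lexemes (strings) rather than integer ids. The method names `lemms` and `is_stoplexem` are taken from the text, but their signatures, the class `ToyDictionary`, the helper `index_terms`, the stop list, and the naive suffix stripping are all assumptions made for this example. This is not the OpenFTS API, and the suffix rule is not a Snowball or ISpell algorithm.

```python
class ToyDictionary:
    """Toy stand-in for an OpenFTS-style dictionary (hypothetical, not the real API)."""

    STOP_LEXEMES = {"the", "a", "an", "of", "and"}  # assumed stop list
    SUFFIXES = ("ing", "ed", "es", "s")  # checked in this order

    def lemms(self, word):
        """Return a list of lexemes (plain strings) for `word`."""
        word = word.lower()
        for suffix in self.SUFFIXES:
            # Strip the first matching suffix, keeping a stem of at least 3 letters.
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                return [word[: -len(suffix)]]
        return [word]

    def is_stoplexem(self, lexeme):
        """Return True if `lexeme` should be left out of the index."""
        return lexeme in self.STOP_LEXEMES


def index_terms(text, dictionary):
    """Collect the lexemes that would be indexed for `text`."""
    terms = set()
    for word in text.split():
        for lexeme in dictionary.lemms(word):
            if not dictionary.is_stoplexem(lexeme):
                terms.add(lexeme)
    return terms
```

Because the same normalisation would be applied to both documents and query words, "searching" and "searched" end up as the same lexeme, which is the point of stemming before indexing and searching.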