Forum OpenACS Q&A: OpenFTS 0.3.1 Problems

Posted by Simon at TCB on
Hi all,

Can anyone (I'm desperate) help out with this one?

I've got pretty much all the way with installing the newer version of OpenFTS, but when I initialise it (via the admin screen) it bombs.

I kind of expected that, as I wasn't able to perform one of the steps in the installation instructions, namely

psql nsprint < /web/nsprint/packages/openfts-driver/sql/postgresql/load.sql

as it failed complaining that

ERROR: stat failed on file $libdir/_int : no such file or directory

Any ideas how I can get that to work?

Everything else has gone fine, but the problem is that the installation instructions are out of date and still refer to version 0.2, and my suspicion is there's a step to change/miss/add.

Any advice very, very gratefully received (muggins has a delivery date on Friday ... yuk)

2: Re: OpenFTS 0.3.1 Problems (response to 1)
Posted by Simon at TCB on
OK, as I hunt around I find that version 7.3.2 doesn't have an _int.sql, only an _int.sql.in,

and the create_func.sql file is not in version 0.3.1 of the search package.

I managed to get the _int.sql file generated. I had to compile intarray in the contrib directory; this produces the .sql file, which ran OK.

However, I still don't have the create_func.sql file. Where is this in version 0.3.1? Does it exist any more, and is this why I'm having problems?

3: Re: OpenFTS 0.3.1 Problems (response to 1)
Posted by Dan Wickstrom on
Openfts does not use intarray anymore. For installation directions, read the AOLSERVER.INSTALL file in the Openfts-0.31 distribution. Briefly, you need to do the following:


Installation:

1. Install aolserver.
2. Build and install postgresql tsearch option from contrib directory.
3. Build aolserver module:


        tar xvzf openfts-tcl-0.3.tar.gz
        cd openfts-tcl-0.3
        ./configure --with-aolserver-src=DIR 
        cd aolserver
        make
        cp fts.so [aolserver bin directory]
        cd ..
        cp fts_*.tcl [aolserver tcl lib directory] (don't do this if installing with openacs)
        (if not installing with openacs4 use: cp aolserver/test/10-database-procs.tcl [aolserver tcl lib dir])
        cd PGSQL_SRC_HOME/contrib/tsearch
        make
        make install
        psql template1 < tsearch.sql
        cp -r pgsql_contrib_openfts PGSQL_SRC_HOME/contrib      
        cd PGSQL_SRC_HOME/contrib/pgsql_contrib_openfts
        make 
        make install
        psql template1 < openfts.sql (installing in template1 makes openfts available to all new dbs that are created - if you don't want openfts in all of your newly created dbs, then only load this file into the db you want to use openfts with)

4. Edit nsd.tcl file and add entry for fts module:

ns_section "ns/server/${server}/modules" 
        ns_param   nssock          ${bindir}/nssock.so 
        ns_param   nslog           ${bindir}/nslog.so 
  ...
        ns_param   nsfts           ${bindir}/nsfts.so


4: Re: OpenFTS 0.3.1 Problems (response to 1)
Posted by Simon at TCB on
excellent! I'll give that a try now.

could you also confirm that the openfts-driver and the search package from the latest *4.6* branch are ok to work with this too?

Thanks

5: Re: OpenFTS 0.3.1 Problems (response to 1)
Posted by Dan Wickstrom on
On the 4.6 branch, I updated the INSTALL document in the openfts-driver package, so that it gives the overall sequence for installing openfts and the openfts-driver.  It is no longer necessary to run load.sql, and I've removed it from cvs.
6: Re: OpenFTS 0.3.1 Problems (response to 1)
Posted by Jon Griffin on
Is there an update procedure anywhere?
8: Re: OpenFTS 0.3.1 Problems (response to 1)
Posted by Simon at TCB on
Dan,

Just want to check I'm getting everything right from the 4.6 branch.

The INSTALL file I have has no indication it's been updated and still contains an instruction line telling you to install load.sql from this package's directory.

Am I looking at the right version? Sorry for being a pain, but it's tricky to tell.

7: Re: OpenFTS 0.3.1 Problems (response to 1)
Posted by Dan Wickstrom on
If by update you mean take an openfts-0.2 installation and upgrade it to an openfts-0.31 installation, then the answer is no.  Version 0.2 was based on intarray and version 0.31 is based on tsearch, and the two packages are incompatible.  Essentially, intarray stores lexem ids (crc-32 checksums on a lexem), while tsearch actually stores the lexem itself.  Tsearch is much better in that you can actually look in the db and see what has been indexed.  Since you can't go backwards from a crc-32 sum back to a lexem, I don't see how it would be possible to write an upgrade script.  To upgrade, you would need to install openfts-0.31 and then write a script to force all of the content on your site to be reindexed.  I've done this and it works, but it might take a while if your site has a lot of content.
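
To illustrate why a checksum-based index cannot be converted, here is a minimal Python sketch. It uses the standard-library zlib.crc32 as a stand-in; the exact CRC-32 variant and text encoding used by openfts-0.2 are assumptions, and lexem_id, id_index and text_index are hypothetical names rather than anything from the OpenFTS code.

```python
import zlib


def lexem_id(lexem: str) -> int:
    """Return an unsigned CRC-32 checksum for a lexem.

    Illustrative stand-in for an intarray-style lexem id; the real
    OpenFTS 0.2 checksum details are an assumption here.
    """
    return zlib.crc32(lexem.encode("utf-8")) & 0xFFFFFFFF


document = ["openfts", "stores", "lexems"]

# intarray-style index: only integer ids are kept, the words are gone.
id_index = [lexem_id(word) for word in document]

# tsearch-style index: the lexems themselves are kept and can be inspected.
text_index = list(document)

print(id_index)    # integers only; nothing maps them back to words
print(text_index)
```

Because the first index holds nothing but integers, the only way to rebuild a text-based index is to run the original content through the indexer again, which matches the reindexing approach described above.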
9: Re: OpenFTS 0.3.1 Problems (response to 1)
Posted by Dan Wickstrom on
The cvs log for the INSTALL file shows that I updated it:

44 eusdawi@edgedsp6:/home/unix/wickstrom/web/openacs-4-6/packages/openfts-driver/www/doc>cvs log INSTALL
danw@openacs.org's password: 

RCS file: /cvsroot/openacs-4/packages/openfts-driver/www/doc/INSTALL,v
Working file: INSTALL
head: 1.3
branch:
locks: strict
access list:
symbolic names:
        oacs-4-6: 1.3.0.4
        oacs-4-5-final: 1.3
        oacs-4-5-rc-1: 1.3
        don-merge-1: 1.3
        oacs-4-5-beta-1-2: 1.3
        oacs-4-5-beta-1-1: 1.3
        oacs-4-5-beta-1: 1.3
        oacs-4-5: 1.3.0.2
keyword substitution: kv
total revisions: 4;     selected revisions: 4
description:
----------------------------
revision 1.3
date: 2001/09/15 21:31:50;  author: neophytosd;  state: Exp;  lines: +6 -6
branches:  1.3.4;
enhanced search package
----------------------------
revision 1.2
date: 2001/09/04 15:39:51;  author: neophytosd;  state: Exp;  lines: +1 -1
changed gmake-install-headers to gmake install-all-headers
----------------------------
revision 1.1
date: 2001/09/01 20:33:52;  author: donb;  state: Exp;

Added this to the tree ...
----------------------------
revision 1.3.4.1
date: 2002/10/11 17:09:32;  author: danw;  state: Exp;  lines: +5 -27
updated docs and removed load.sql
=============================================================================

Have you updated from cvs lately?

Posted by Simon at TCB on
Dan,

I *think* I've re-done everything as you've described, but still, when I try a test using the search package with the notes package as an example, it falls over with an error about

can't read self(TXTTID): no such element in array invoked from within openfts_driver_search

So, I did a cvs update from the 4.6 branch and, suspiciously, load.sql is still there. Can you check your changes/work are actually in the 4.6 branch? Perhaps this is the problem?

Oh, and also, I noticed you've put an extra note in your posting about the AOLSERVER.INSTALL file.

You say that the fts_*.tcl files *shouldn't* be copied to the tcl lib dir if you're running openacs? Is that correct? If not, what do you mean by the aolserver tcl lib directory? /usr/local/aolserver/lib/tcl8.3 ?

Thanks

Posted by Simon at TCB on
Sorry Dan, posted my last one before I saw yours. (Not keen on this caching :o()

Anyway, I have a 4.6 checkout that I regularly cvs update, but each time I do it at the moment I'm not getting your changes.

Therefore I don't know if maybe my install is screwed? I wouldn't have thought so, as it's picked up other changes over the past few weeks.

Unfortunately I'm stuck at home now and don't have the bandwidth to bring down a new checkout entirely, but I'll try doing that in the morning if you're sure there's nothing else this could be.

I'll do a complete re-install, from pg upwards, tomorrow and follow your advice here exactly.

If you could mail me over a copy of the openfts-driver package that you have there, that you know is working, that would be a big help. I could at least compare that to what I have here.

Posted by Dan Wickstrom on
I'll send you a copy of the openfts-driver. One of us must have a hosed cvs. The log entry for load.sql shows that it has been deleted, and the log entry is the same whether I do it from the main branch or from the openacs-4-6 branch:


584 eusdawi@edgedsp6:/home/unix/wickstrom/web/openacs-4-6/packages/openfts-driver/sql/postgresql>cvs log load.sql
danw@openacs.org's password: 

RCS file: /cvsroot/openacs-4/packages/openfts-driver/sql/postgresql/load.sql,v
Working file: load.sql
head: 1.2
branch:
locks: strict
access list:
symbolic names:
        oacs-4-6: 1.2.0.4
        oacs-4-5-final: 1.2
        oacs-4-5-rc-1: 1.2
        don-merge-1: 1.2
        oacs-4-5-beta-1-2: 1.2
        oacs-4-5-beta-1-1: 1.2
        oacs-4-5-beta-1: 1.2
        oacs-4-5: 1.2.0.2
keyword substitution: kv
total revisions: 3;     selected revisions: 3
description:
----------------------------
revision 1.2
date: 2001/09/15 21:57:00;  author: neophytosd;  state: Exp;  lines: +2 -2
branches:  1.2.4;
latest pg and openfts-tcl
----------------------------
revision 1.1
date: 2001/09/01 20:33:52;  author: donb;  state: Exp;

Added this to the tree ...
----------------------------
revision 1.2.4.1
date: 2002/10/11 17:09:30;  author: danw;  state: dead;  lines: +0 -0
updated docs and removed load.sql
=============================================================================


Posted by Jon Griffin on
cvs appears hosed. I just did a fresh co and have the same problem of old files.

Can someone check this.

Posted by Jon Griffin on
Dan,

Can you do another checkin of the new openfts? It is not showing anywhere I can see including SDM.

Either that or send me a copy?

Thanks

Posted by Dan Wickstrom on
openfts is available from openfts.sf.net - follow the links for download.  If by openfts, you mean the openfts-driver, you can checkout the latest version from the 4.6 branch in cvs.  I've done a fresh checkout from cvs, and the openfts-driver has been updated correctly.
Posted by Jon Griffin on
If you are upgrading from an earlier version, you may need to wipe out some stuff. I ran into some problems on initialize, so hopefully this will help if you have the same problems.

You may need to drop some other tables:


drop table fts_unknown_lexem;
drop table txt;
drop table fts_conf;

select relname from pg_class where relname like 'index%';

I also changed the function create proc to create or replace in admin/initialize-2.tcl. Since this release is incompatible with < 7.2 PG I will check this fix in.

Posted by Jon Griffin on
Another Problem:

invalid command name "Search::OpenFTS::Parser::run_parser"
It appears this isn't sourced anywhere.

Posted by Dan Wickstrom on
This is sourced from the openfts-driver during startup.  You need to set the directory in the parameters section for openfts-driver in the site-map.
Posted by Jon Griffin on
Looking into this further, it is a problem with the config file. It can't find tcl headers (at least on my gentoo box) which is true if you are running 3.5 aolserver.

If you are using AOLserver >= 3.5 you also need to specify your tcl config: --with-tcl=/usr/lib or wherever.

Now everything is fine. I will add a little upgrade howto and I think that the default param in .info should be changed to 0.3.1 instead of 0.2

Posted by Dan Wickstrom on
Okay, thanks.  I've changed the default to 0.31.  I'll have to investigate the problem with aolserver 3.5, as it shouldn't be necessary to also specify --with-tcl.
Posted by Jon Griffin on
Dan,
I am using a non-standard distro (gentoo) but it still puts tclConfig.sh in /usr/lib. The ./configure script couldn't find it, so it may be that RH, SUSE etc. work.
Posted by Dan Wickstrom on
Okay.  Is it the case that aolserver 3.5 links to a stock tcl distro, or is that something that was going to happen for 4.0?  I know it's coming, but I can't remember the version where it was supposed to happen.
Posted by Jon Griffin on
Yes, 3.5 uses the stock distro. I haven't looked closely at the ./configure script, but it needs to look in the usual places I think.
Posted by Simon at TCB on
Hi,

It's also necessary to specify --with-tcl on Mandrake distributions.

Dan, I have got it working now (thanks for the help), but it appears that the Search package isn't showing the usual warning when using common words like 'as is a if' and so on.

It simply comes back saying 'if' didn't match any documents. Looking at the 0.3.1 code for OpenFTS:get_sql, which I believe is what is supposed to set the 'opt' array with this info, I can't actually see this getting done anywhere?

Is this a bug or is this something I may have done wrong or need to set up?

Thanks
Simon

Posted by Dan Wickstrom on
'if' is a stop-word in English, so it doesn't get indexed. You can also restrict indexing of other lexem types by specifying the types in the admin screen when the openfts-driver is created. This configuration is a little obscure, but I've put a script in the examples sub-directory which will list all of the lexem types supported by the current version of openfts. Running it, you will get the following output:

403 eusdawi@edgedsp6:/home/unix/wickstrom/web/openfts/tcl/examples>./types.tcl 
  1 => Latin word
  2 => Cyrillic word
  3 => Word
  4 => Email
  5 => URL
  6 => Host
  7 => Scientific notation
  8 => VERSION
  9 => Part of hyphenated word
 10 => Cyrillic part of hyphenated word
 11 => Latin part of hyphenated word
 12 => Space symbols
 13 => HTML Tag
 14 => HTTP head
 15 => Hyphenated word
 16 => Latin hyphenated word
 17 => Cyrillic hyphenated word
 18 => URI
 19 => File or path name
 20 => Decimal notation
 21 => Signed integer
 22 => Unsigned integer
 23 => HTML Entity


In my current setup, I have it configured to not index html tags, space symbols, and HTTP head. The openfts driver also allows you to restrict what is shown in a headline display.
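
As a rough illustration of that kind of type-based filtering, here is a minimal Python sketch. The type ids 12, 13 and 14 are taken from the listing above; the (type_id, text) pair format, the sample tokens and the names IGNORED_TYPES and indexable are assumptions made for the example and are not the openfts-driver's actual data structures.

```python
# Lexem type ids taken from the types.tcl listing above.
SPACE_SYMBOLS, HTML_TAG, HTTP_HEAD = 12, 13, 14

# Hypothetical configuration: the type ids that should not be indexed.
IGNORED_TYPES = {SPACE_SYMBOLS, HTML_TAG, HTTP_HEAD}


def indexable(tokens, ignored_types=IGNORED_TYPES):
    """Keep only the token texts whose lexem type is not ignored."""
    return [text for type_id, text in tokens if type_id not in ignored_types]


# Hypothetical parser output as (type_id, text) pairs.
parsed = [
    (13, "<b>"),
    (1, "search"),
    (12, " "),
    (4, "dan@example.com"),
    (13, "</b>"),
]

kept = indexable(parsed)
print(kept)  # ['search', 'dan@example.com']
```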

I don't recall the search package ever giving a warning for using stopwords for search terms.

Posted by Simon at TCB on
Dan,

I took the inclusion of stuff like this:



      </if>
        <if @nstopwords@ eq 1>
        <font color=6f6f6f>
          "<b>@stopwords@</b>" is a very common word and was not included in your search.
          [<a href=help/basics#stopwords>details</a>]<br>
        </font>
        </if>

to mean the search package should be displaying it, but as I said I'm not convinced the 'opt' variable from which nstopwords is created is ever populated.

Maybe it never has worked, but as the code's always been there this really means it's never *worked* ;o)

Anyway, so as far as we know there is a bug, it's just never been addressed?

I'd also like to suggest that, as the default behaviour is to ignore such words, it's probably better to say that in advance on the search pages than after you've stuck it in a search term?
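
For reference, this is roughly what the template fragment above expects to be handed. It is a minimal Python sketch, not the search package's or OpenFTS's actual code: the stop-word list, the prepare_query function and the "terms" key are assumptions, and only the stopwords and nstopwords names come from the template.

```python
# Assumed stop-word list for illustration; real lists come from the
# dictionary in use.
STOPWORDS = {"as", "is", "a", "if", "the"}


def prepare_query(query, stopwords=STOPWORDS):
    """Split a query into search terms and report the dropped stop words."""
    terms, dropped = [], []
    for word in query.lower().split():
        if word in stopwords:
            dropped.append(word)
        else:
            terms.append(word)
    return {
        "terms": terms,
        "stopwords": " ".join(dropped),  # shown as @stopwords@
        "nstopwords": len(dropped),      # tested as @nstopwords@
    }


opt = prepare_query("if openfts")
print(opt)
```

If the dropped words are never recorded at this step, nstopwords stays unset and the warning block in the template can never be rendered.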

Posted by Dan Wickstrom on
Simon,

Okay, now I understand.  I've never looked at the search package in detail, so I wasn't even aware of that bit of code.  I've fixed the problem, which requires changes to both openfts and the openfts-driver, so I will be releasing a new version of openfts soon.

Specifying all of the stop words in advance is not practical, as there can be hundreds of stopwords, and if you're doing multilingual searching, each dictionary can have its own list of stop-words.  I think the way it is implemented now is the most reasonable.

Posted by Neophytos Demetriou on
"Maybe it never has worked, but as the code's always been there this really means it's never *worked* ;o) Anyway, so as far as we know there is a bug, it's just never been addressed?"
Simon, stopwords used to work, but as I'm sure you understand, OpenFTS (not the driver) has changed a *lot* in the past year. I do not see how a "bug" that is revealed more than a year after the package was released, and especially using a different OpenFTS codebase, leads you to the conclusion that it *never* worked.
Posted by Simon at TCB on
Ok, apologies.. for 'never worked' I meant 'never worked in recent significant history' or 'never worked since changes to OpenFTS'

No offence meant, I just assumed you'd read between the lines a bit :o)

Anyway, don't jump all over me, but, as the Search package and the OpenFTS are so inherently intertwined, I just assumed when changes to one were going on, then someone would be looking after how that affects the operation of the other.

Any idea when you'll make the new versions available Dan?

PS, what I really meant about highlighting keywords in advance is not to necessarily list them all, just point out on the search screen that such a thing applies.

It's the very first thing that caught out my customer yesterday. Just a minor suggestion. Good help text/instructions can go a long way to even overcoming poor software or design.

Posted by Neophytos Demetriou on
No offence, Simon. As for your request, you can slightly modify OpenFTS to index/search stopwords as well. In fact Google is indexing all words (without stemming, though) and the default behavior is to point out that stopwords were not included in the search unless the user specifically asks for a stopword to be included, i.e. by prefixing the word with the plus (+) sign. I'm out of Cyprus, so my apologies if I cannot be more specific on this one. If you are still interested in this, I can be more specific next week when I'm back in Nicosia.
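
A minimal Python sketch of that plus-prefix behaviour, under stated assumptions: the stop-word list and the parse_query function are hypothetical, and this is neither Google's nor OpenFTS's implementation, only an illustration of the rule described above.

```python
# Assumed stop-word list for illustration only.
STOPWORDS = {"where", "are", "you", "if", "is", "a"}


def parse_query(query, stopwords=STOPWORDS):
    """Return (included, ignored) words for a query.

    A stop word is ignored unless the user prefixes it with '+'.
    """
    included, ignored = [], []
    for raw in query.split():
        forced = raw.startswith("+") and len(raw) > 1
        word = (raw[1:] if forced else raw).lower()
        if forced or word not in stopwords:
            included.append(word)
        else:
            ignored.append(word)
    return included, ignored


included, ignored = parse_query("retires where +are you")
print(included)  # ['retires', 'are']
print(ignored)   # ['where', 'you']
```

The ignored list is what a results page would use to tell the user which words were left out of the search.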
Posted by Neophytos Demetriou on
Just one last bit of info. Here's a working installation of OpenFTS where you can see how stopwords work (at Infogettable):

http://www.infogettable.net/search/search?q=retires+where+are+you

Posted by Dan Wickstrom on
Simon, the search package and openfts are intertwined,  but openfts is an independent project that, in addition to openacs, has support for perl, tclsh, and stand-alone aolserver applications.  Currently there are no maintainers for the search and openfts-driver packages, so search integration with openacs is suffering somewhat.

I will try and get a new version of openfts released by the end of next week.

Posted by tammy m on
It seems like the change: I also changed the function create proc to create or replace in admin/initialize-2.tcl. Since this release is incompatible with < 7.2 PG I will check this fix in.

did not make it into the code base, since I'm using the 4.6.1 release of oacs and having the same problem, with ERROR: Relation 'txt' already exists when installing the OpenFTS Driver.

When I manually drop the txt table, the driver install succeeds. I didn't have an older version of OpenFTS installed but had other issues on my install and dropped my OpenFTS driver and reinstalled it. Then I got the error with the txt table existing.