Forum OpenACS Development: Response to OpenACS 4 Search Integration, what should it look like?


I would love OpenFTS to be wonderfully terrific, but what if it's not? I know it's open source, but I have a hard enough time dealing with OpenACS issues as it is...

I guess my concerns are several:

One: With little information and little user experience out there regarding OpenFTS or its performance, reliability, or stability, I am concerned that relying on OpenFTS for OpenACS Postgres search is risky for OpenACS users. Will it hold up under system load? Is it actually faster/smaller/better than any of the other solutions? Will the initial releases be robust, and will releases make it out in a timely fashion? How does anyone know this? Is OpenFTS really far enough along that you're ready to narrow the design space?

Two: I want to make sure there is a cheap 80% solution (that is, the solution that costs 20% and solves 80% of the problems):
Right now, I have seen htDig and SWISH<fork> work with OpenACS and explored their use and implementation.  I know a good 80% (or 90%) solution can be made of them.  I can discuss their performance requirements and their reliability.  I can reference many sites around the net that are using these solutions and talk to the folks that have implemented them to discover how these things work in reality. I can't do that with OpenFTS, and it doesn't look as though I will be able to do so except in the long run.  I can't tell if OpenFTS is a 20/80 solution, but it doesn't appear to be.
I am concerned that an emphasis on two db crawlers (Intermedia and OpenFTS) will focus/narrow/bias the OpenACS interfaces in such a way as to make it difficult to implement the cheap 80% solutions.

Three: I would like to find a solution for OpenACS 4 that can be used now with OpenACS 3.2.5.  I am not sure how migration from 3.2.5 to 4 will work, and so I suspect there will be 3.2.5 sites out there for a long time.  I would prefer to master one set of search engine dependencies if possible.

Four: I already have dependencies on AOLserver and OpenACS, two organizations working very hard that still don't have enough resources.  I am concerned about adding another dependency on a fledgling search engine without much track record and without many developers regardless of its promise, and regardless of how well it may be performing on one or two demonstrator sites.

Five: When you write, "If someone wants to sit down and write alternative indexers that operate on the CR, that could be substituted for InterMedia or the OpenFTS solution, sure, they can go for it," I worry that emphasizing OpenFTS/Intermedia/db crawling now will limit what we can actually substitute in later on.

Think of today's big limitation of AOLserver -- Tcl. The implementers of AOLserver wanted db neutrality, extensibility, and lots of db support, so they created a simple db driver interface that lots of things could implement: Solid, Sybase, Oracle, and even Postgres. They wanted communications neutrality and extensibility, so they created a simple communications driver interface, and we have nssock, nsopenssl, nsssl, nsvhr, nsunix. But they targeted Tcl and Tcl alone for scripting, and so today we don't have wonderfully integrated Python, Java, Perl, JavaScript or Ruby solutions with AOLserver. Those exist only with Apache and Tomcat. Once again, I worry that by examining/emphasizing db crawlers today (Intermedia and OpenFTS) we will overly restrict the search engine/ACS interfaces.
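
To make the driver analogy concrete, here is a rough sketch of the kind of neutral search driver interface I have in mind. Everything in it is hypothetical: the names SearchDriver, InMemoryDriver, register_driver and get_driver are made up for illustration, it is written in Python only for brevity, and it is not an existing OpenACS, OpenFTS or AOLserver API. The toy driver just stands in for a real engine such as htDig, SWISH, Intermedia or OpenFTS.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class SearchDriver(ABC):
    """Hypothetical minimal contract that any search engine could implement."""

    @abstractmethod
    def index(self, object_id: int, text: str) -> None:
        """Add or replace the indexed text for one content object."""

    @abstractmethod
    def unindex(self, object_id: int) -> None:
        """Remove one content object from the index."""

    @abstractmethod
    def search(self, query: str) -> list[int]:
        """Return the ids of the objects matching the query."""


class InMemoryDriver(SearchDriver):
    """Toy stand-in for a real engine: matches objects containing every query word."""

    def __init__(self) -> None:
        self._docs: dict[int, set[str]] = {}

    def index(self, object_id: int, text: str) -> None:
        self._docs[object_id] = set(text.lower().split())

    def unindex(self, object_id: int) -> None:
        self._docs.pop(object_id, None)

    def search(self, query: str) -> list[int]:
        words = set(query.lower().split())
        if not words:
            return []
        return sorted(oid for oid, toks in self._docs.items() if words <= toks)


# Site-wide registry: application code asks for a driver by name and never
# needs to know which engine sits behind it.
_drivers: dict[str, SearchDriver] = {}


def register_driver(name: str, driver: SearchDriver) -> None:
    _drivers[name] = driver


def get_driver(name: str) -> SearchDriver:
    return _drivers[name]
```

The point of the sketch is only that the application talks to index/unindex/search, so a site already running brand foo could plug in its own driver instead of being steered toward a db crawler.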

Six: OpenACS integration into existing infrastructures. By focusing on OpenFTS now, we make it that much harder to sell OpenACS into an organization already using brand foo. "Well, you need two search engines, and your existing indexer will never really work as well with your OpenACS content as it does with your existing content. What we really ought to do is get rid of your other search engine and replace it with our OpenFTS search engine." This may well be the best thing to do technically, but it is a harder sell and may not have been required to produce a site that satisfies the requirements.

So what am I missing? I've visited several OpenFTS sites, including the one in Russia and the ones mentioned by Google. Why are you so confident that OpenFTS is the way of the future and will be reliable and timely for use in OpenACS?