Forum OpenACS Q&A: Searching a PostgeSQL database
PostgreSQL database? I'm interested in free as well as commercial
software packages. Comments anyone? Thanks.
The problem for web work is that solutions that depend entirely on the database for building and containing an index for searching tend to be slow, complex, and troublesome. Oracle's InterMedia, for instance. Photo.net's not automatically indexed at this point, but rather is done by hand every couple of weeks, because of InterMedia problems.
There is something out there called PLS, which is also owned by AOL. It builds indices on its own for searching. Since these can easily be reconstructed from the database, there's really no need for search indices to be maintained in the database. PLS has the reputation of being very fast and very good.
It's not open source and not supported under Linux, at the moment. There is some hope that AOL might open source it. If not, we need to search for another solution.
Postgresql has a contributed function for doing substrings searches more efficiently than LIKE. I know someone who wrote an alternative to it in PGPLSQL or whatever the procedural language for "triggers" is called, using a faster algorithm and accepting substring queries in the same format as LDAP. Works by indexing each of the 10 rotated versions of a 10 letter word separately so that a search for '*SUB' or '*SUB*' will be just as fast as for 'SUB*'. Doesn't need to search the whole table, just looks up the (much larger) index. So searches are dramatically faster (but inserts slower and much more disk space, though the latter is pretty irrelevant these days).
Last I saw it was working but hadn't been packaged up for release or integrated as an internal Postgresql function. I'm sure he'll be happy to pass it on "as is". If anyone is interested, let me know.
Doesn't deal directly with full text searching, only relatively small text fields within database. But could be relevant if people are looking at different ways to do search related things.
So the technique may have limited applicability for a general solution. For the right scenario, though, it could work really well.