Forum OpenACS Development: Search enhancements

Posted by Jon Griffin on
I added a quick hack to log and report on what is being searched for. This may or may not be of use to anyone else, but I wanted a way to know what was being searched for, by whom, and when.

Basically, I added a table (search_stats) and an admin interface. I simply insert a record every time a search result is output.
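For readers who want a feel for the idea, here is a minimal sketch of that kind of logging. It is not the actual patch: the column names, the `log_search` and `report` helpers, and the use of SQLite are all assumptions made for illustration (OpenACS itself runs on PostgreSQL or Oracle); only the table name search_stats comes from the post above.

```python
import sqlite3
from datetime import datetime, timezone


def create_stats_table(conn):
    """Create a hypothetical search_stats table (columns are assumed)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS search_stats (
            query       TEXT NOT NULL,   -- what was searched for
            user_id     INTEGER,         -- who searched (NULL if anonymous)
            searched_at TEXT NOT NULL    -- when, as an ISO 8601 UTC timestamp
        )
        """
    )


def log_search(conn, query, user_id=None, when=None):
    """Insert one record; meant to be called each time results are output."""
    when = when or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO search_stats (query, user_id, searched_at) VALUES (?, ?, ?)",
        (query, user_id, when.isoformat()),
    )
    conn.commit()


def report(conn):
    """Return (query, times_searched) rows, most frequent first."""
    return conn.execute(
        "SELECT query, COUNT(*) AS n FROM search_stats "
        "GROUP BY query ORDER BY n DESC, query"
    ).fetchall()


# Usage: log three searches in an in-memory database.
conn = sqlite3.connect(":memory:")
create_stats_table(conn)
log_search(conn, "tsearch2", user_id=42)
log_search(conn, "tsearch2")
log_search(conn, "acs-mail")
```

Calling `report(conn)` afterwards gives the per-query totals that an admin page could display.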

It is very simple and shouldn't break anything, but since it is not my package I won't add the diff to CVS yet.

It will temporarily be available at:
http://dev.jongriffin.com/blog/search-stats

2: Re: Search enhancements (response to 1)
Posted by Malte Sussdorff on
It looks great and is a nice idea. Maybe Dave will commit this patch to search?
3: Re: Search enhancements (response to 1)
Posted by Dave Bauer on
Jon, this looks good. I will work on integrating it with the search package. Thanks!

I think this can be used with an idea I have been working on to speed up tsearch2 searches. Basically, the index has to be in RAM to get the fastest searches. The easiest way to do this is to run some search queries in the background. Keeping track of the most popular searches and running those queries periodically with a scheduled proc should do the trick.
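A rough sketch of the warm-up idea, assuming the logged search strings are available as a list. The names `most_popular`, `warm_index`, and `run_search` are hypothetical, and `run_search` stands in for whatever actually executes a tsearch2 query; the scheduling itself is left out.

```python
from collections import Counter


def most_popular(logged_queries, limit=10):
    """Return the most frequently logged queries, ignoring case and blanks."""
    counts = Counter(q.strip().lower() for q in logged_queries if q.strip())
    return [query for query, _ in counts.most_common(limit)]


def warm_index(logged_queries, run_search, limit=10):
    """Replay the most popular searches so the index pages stay in memory.

    run_search is an assumed callable that executes one search; its
    results are discarded because only the side effect of touching the
    index matters here.
    """
    warmed = []
    for query in most_popular(logged_queries, limit):
        run_search(query)
        warmed.append(query)
    return warmed


# Usage: a stand-in search function that just records what it was asked.
log = ["tsearch2", "Tsearch2 ", "forums", "tsearch2", "forums", "cvs"]
executed = []
warmed = warm_index(log, executed.append, limit=2)
```

A scheduled job would call `warm_index` periodically with the latest log entries.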

I have a lot more enhancements from the search work Solution Grove is doing for MIT Sloan. This builds on the work Dirk Gomez and I started a while ago.

Some things I have learned: 1) Increase shared buffers. Shared buffers of about twice the size of the index on disk seem to speed up the search. 2) There is a way to avoid an explicit count query by generating reasonable estimates for searches with more than 100 results. 3) Results can be weighted by the age of the items and by object type. 4) All the search results can be fetched in one query instead of running 10 separate queries.
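To illustrate the weighting by age and object type, here is one possible sketch. The post does not say how the weighting is actually computed, so everything here is an assumption: the exponential half-life decay, the `TYPE_WEIGHTS` values, the object type names, and the field names on each result.

```python
# Hypothetical per-type multipliers; unknown types get DEFAULT_TYPE_WEIGHT.
TYPE_WEIGHTS = {"forums_message": 1.0, "file_storage_object": 0.8}
DEFAULT_TYPE_WEIGHT = 0.5
HALF_LIFE_DAYS = 365.0  # assumed: an item's weight halves every year


def weighted_score(rank, age_days, object_type):
    """Scale a raw text-match rank by item age and object type."""
    decay = 0.5 ** (max(age_days, 0.0) / HALF_LIFE_DAYS)
    return rank * decay * TYPE_WEIGHTS.get(object_type, DEFAULT_TYPE_WEIGHT)


def rerank(results):
    """Sort result dicts by weighted score, best first."""
    return sorted(
        results,
        key=lambda r: weighted_score(r["rank"], r["age_days"], r["object_type"]),
        reverse=True,
    )


# Usage: an older item with a better raw rank drops below a fresh one.
results = [
    {"id": 1, "rank": 0.9, "age_days": 730, "object_type": "forums_message"},
    {"id": 2, "rank": 0.6, "age_days": 0, "object_type": "forums_message"},
]
ordered_ids = [r["id"] for r in rerank(results)]
```

In a real deployment the same arithmetic would more likely live in the SQL ORDER BY clause, so that the weighting happens in the single query that fetches the results.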

The result of all this is that typical searches take under 5 seconds, and searches with many results take under 10 seconds. This all depends on the hardware, the size of the index, and the search being done. A very generic search that returns many results is slower than a specific search that returns fewer results.

I wanted to post this before all the code is done and committed as a reminder and incentive to make sure I get all this documented and committed back to OpenACS.

4: Re: Search enhancements (response to 3)
Posted by Jon Griffin on
Do you mind if I commit this patch of mine to HEAD? There is really no side effect that I can see.
5: Re: Search enhancements (response to 1)
Posted by Dave Bauer on
Jon,

Please go ahead and commit. Thanks. This is very useful.