Forum OpenACS Development: RFC:Advanced Search

Posted by Dave Bauer on
Some applications need to offer advanced search beyond the existing full-text index. This often means joining against tables other than the text index table to restrict the search results by additional criteria. For example, one might wish to search by the author or creation user of an object, search by the title or pretty name of that object, or restrict the search to a particular object type, package instance, or subsite.

I propose to add the ability to define operators that trigger search on additional database columns or tables. See http://www.google.com/help/operators.html for examples of the type of operators used at google.com.

Each operator would be defined by a service contract (or a Tcl callback; see https://openacs.org/forums/message-view?message_id=273782 ).

The Tcl procedure that implements an operator would accept a Tcl list argument containing all the query "words" associated with that operator. It would return a two-element list: the first element would be the name of a table to join against, and the second would be a where clause to add to the query.
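
To make the contract concrete, here is a minimal sketch of one operator handler. It is written in Python purely for illustration (the real implementation would be a Tcl proc), and the handler name, the table alias, and the choice to accept only integer ids are assumptions, not part of the proposal.

```python
def creation_user_operator(words):
    """Hypothetical handler for a creation_user: operator.

    Receives every query word attached to the operator and returns a
    two-element list: the table to join against and a where clause.
    """
    # Accept only integer ids so nothing unsafe is interpolated into SQL.
    # int() raises ValueError for anything that is not a plain integer.
    user_ids = [int(word) for word in words]
    id_list = ", ".join(str(user_id) for user_id in user_ids)
    # Table alias "o" is an assumption made for this sketch.
    return ["acs_objects o", f"o.creation_user in ({id_list})"]
```

With this sketch, creation_user_operator(["23456"]) yields the join table "acs_objects o" and the clause "o.creation_user in (23456)".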

If no implementation is found for an operator, the elements of the search query after that operator would be appended to the default full-text search query.
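
The fallback could look roughly like the following Python sketch. It assumes operators are written as name:value tokens and that a token whose name has no registered implementation is passed through whole to the full-text terms; both the function name and those choices are illustrative assumptions.

```python
def split_query(query, known_operators):
    """Split a query into operator words and plain full-text terms.

    Returns (operator_words, fulltext_terms), where operator_words maps
    each known operator name to the list of words attached to it.
    """
    operator_words = {}
    fulltext_terms = []
    for token in query.split():
        name, sep, value = token.partition(":")
        if sep and value and name in known_operators:
            operator_words.setdefault(name, []).append(value)
        else:
            # No implementation for this operator (or no operator at all):
            # the token falls back to the default full-text search.
            fulltext_terms.append(token)
    return operator_words, fulltext_terms
```

For example, with only creation_user registered, the query "creation_user:23456 title:foo bar" gives {"creation_user": ["23456"]} for the operators and ["title:foo", "bar"] for the full-text terms.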

I have already implemented a variation on this system for the Greenpeace CTK project, and I need something similar for a current project, so I will be working on this. I would like to find a solution that can be added back into OpenACS, that can eventually support PostgreSQL and Oracle, and that is independent of the full-text indexing solution that is used. It might be the case that each full-text indexing driver would need to take the query fragments and put them into the full query for the search results. Most of the work should be done by procedures in the Search package that can be accessed from any full-text indexing implementation.
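
As a rough picture of that split of responsibilities, the Python sketch below combines operator fragments with a full-text condition supplied by the driver. The function name, the "search_index i" table, and the selected column are all hypothetical; a real driver would substitute its own index table and condition.

```python
def build_search_sql(fragments, fulltext_condition):
    """Combine operator fragments with a driver-specific condition.

    fragments: list of (table, where_clause) pairs returned by operators.
    fulltext_condition: SQL fragment supplied by the indexing driver.
    """
    tables = ["search_index i"]  # hypothetical name for the text index table
    conditions = [fulltext_condition]
    for table, where_clause in fragments:
        if table not in tables:  # join each extra table only once
            tables.append(table)
        conditions.append(f"({where_clause})")
    return (
        "select i.object_id from " + ", ".join(tables)
        + " where " + " and ".join(conditions)
    )
```

Only the fulltext_condition argument differs between drivers in this sketch; the joining logic stays in one shared place, which is the point of keeping most of the work in the Search package.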

2: Re: RFC:Advanced Search (response to 1)
Posted by Dave Bauer on
Anyone care about this? I'd like to finish it; most of the code is in OpenACS already.

The idea is to allow Google-style search operators.

Then we can extend search with new operators arbitrarily without adding to the callback signature etc., and a package could use search like an API just by passing in a search query, for example:

"package_id:12345 creation_user:23456 some search terms"

could be sent to the search package, which would generate the correct query and return a multirow with the results.

3: Re: RFC:Advanced Search (response to 1)
Posted by Torben Brosten on
Power searching is an important part of making use of available data. Definitely useful!

Could you also expose available operators (and their meaning/use) via proc or adp include fragment, so applications that use this feature can have a user-accessible help page about it?
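
One possible shape for this, sketched in Python for illustration only: each operator registers a short help string along with its handler, and a help page simply lists them. The registry, the function names, and the help wording are all hypothetical.

```python
OPERATORS = {}  # operator name -> {"handler": callable, "help": str}


def register_operator(name, handler, help_text):
    """Record an operator together with its user-facing description."""
    OPERATORS[name] = {"handler": handler, "help": help_text}


def operator_help():
    """Return (name, help_text) pairs sorted by operator name."""
    return sorted((name, entry["help"]) for name, entry in OPERATORS.items())


def no_op_handler(words):
    """Placeholder handler used only to populate the example registry."""
    return ["", ""]


register_operator("package_id", no_op_handler,
                  "package_id:12345 limits results to one package instance")
register_operator("creation_user", no_op_handler,
                  "creation_user:23456 limits results to objects created by that user")
```

A help page could then loop over operator_help() and print one line per operator.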

Posted by Dave Bauer on
Sounds like a good idea.

I am looking at the latest tsearch2 from PG 8.2, which has some new features that look useful. It was backported to work on 8.1 as well, so it might be nice to support that soon in OpenACS 5.3.

5: Re: RFC:Advanced Search (response to 1)
Posted by Torben Brosten on
BTW, in your example you use 'package_id:12345'. Maybe instead of the package_id, include/evaluate context for the entire URL, as the equivalent of Google's site:example.com/subsite/application/page. Google accepts a URL as part of the site constraint. For example, to search just the OpenACS DocBook docs for version 5.2, include: site:openacs.org/doc/openacs-5-2/
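
A URL constraint like that could be reduced to the same package ids behind the scenes. The Python sketch below assumes a hypothetical site_map dictionary from mounted URL (always ending in "/") to package id; the function name and the ids in the usage note are made up for illustration.

```python
def packages_under(site_value, site_map):
    """Return package ids mounted at or below the path in a site: value."""
    # Drop the host part ("openacs.org") and keep the path ("doc/...").
    _host, slash, path = site_value.partition("/")
    prefix = "/" + path if slash else "/"
    if not prefix.endswith("/"):
        prefix += "/"
    return sorted(
        package_id
        for url, package_id in site_map.items()
        if url.startswith(prefix)
    )
```

Given a map such as {"/": 100, "/doc/": 200, "/doc/openacs-5-2/": 210}, the value "openacs.org/doc/openacs-5-2/" selects only package 210, while "openacs.org/doc" selects 200 and 210.
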
Posted by Dave Bauer on
Torben, great idea, I really like that. I could pass the subsite url in, instead of a list of package ids.

I think maybe still both formats should be supported, but I do like the url idea.

7: Re: RFC:Advanced Search (response to 1)
Posted by Torben Brosten on
Yeah, support both formats. That was just a limitation of my early morning pidgin English =)