Forum OpenACS Q&A: help for search package

Posted by David Spanberger on
I read the posting in this forum about full-text search of PDF and DOC files, and I want to implement this on my server. I'm using openacs-5-2-3 installed from the tarball and have installed the tsearch2 driver, but search doesn't find anything on my server. I tried searching for postings in the forums and for filenames in file-storage, and neither returned any results.
I checked the error.log, but there were no errors. I have also installed the FtsContentProvider bindings for these packages.

Can you give me a hint on how to get search to find items?

Posted by Dave Bauer on
The 5.2.3 version of search does not index PDF or DOC files. You'll need to extend this procedure to do that: https://openacs.org/api-doc/proc-view?proc=search%3a%3acontent%5ffilter

search::content_filter is just a switch statement keyed on the mime type of the content. The easiest solution is to exec an external program to extract the content.
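
As a rough illustration of that idea, here is a minimal sketch of a switch keyed on the mime type that shells out to an external extractor. This is not the actual body of search::content_filter. The proc names extractor_command and extract_text are hypothetical, and pdftotext and catdoc are only assumed to be installed on the server; substitute whatever converters you have.

```tcl
# Hypothetical helper: map a mime type to the external command that
# prints the file's plain text on stdout. Returns an empty list when
# no extractor is known for the mime type.
proc extractor_command {mime filename} {
    switch -- $mime {
        application/pdf {
            # Assumes pdftotext is installed; "-" sends the text to stdout.
            return [list pdftotext $filename -]
        }
        application/msword {
            # Assumes catdoc is installed; it writes the text to stdout.
            return [list catdoc $filename]
        }
        default {
            return {}
        }
    }
}

# Hypothetical helper: run the extractor and return the text, or an
# empty string if there is no extractor or the program fails.
# The {*} expansion syntax needs Tcl 8.5 or later.
proc extract_text {mime filename} {
    set cmd [extractor_command $mime $filename]
    if {[llength $cmd] == 0} {
        return ""
    }
    if {[catch {exec {*}$cmd} text]} {
        # exec also raises an error when the program writes to stderr.
        return ""
    }
    return $text
}
```

Returning an empty string on failure keeps one broken file from stopping the indexer, at the cost of that file silently having no searchable content, so you may prefer to log the error instead.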

You can see the future direction this is taking for the next release of OpenACS here http://cvs.openacs.org/cvs/openacs-4/packages/search/tcl/search-convert-procs.tcl?rev=1.1&view=markup

Posted by David Spanberger on
I found a solution for indexing pdf and doc: https://openacs.org/forums/message-view?message_id=295461

But I am not at that point yet. I can't get search to find anything at all. For example, I created a new forum with a new message in it and then tried to find that message with the search package, without success. It is the same with file-storage. Can you tell me how I can get search to cover these packages?

Posted by Gustaf Neumann on
Make sure you have mounted the search package.
Did you run the indexer? Check out reindex in xowiki.
Posted by David Spanberger on
Yes, the search package is mounted.

I looked at the reindex procedure of xowiki.
I tried to use this command:

search::queue -object_id $object_id -event INSERT

For the object_id I used the object_ids from the table "acs_objects".

Now I can search through the forums package too, but I still don't find anything in file-storage.

Any ideas?

Posted by Gustaf Neumann on
Did you run search::queue ... INSERT for the items in the file store? Did you install the search::content_filter mentioned above, and are you sure it is called and works without errors?
Posted by David Spanberger on
I did this insert for all object_ids in the acs_objects table.

No, I didn't install the changed filter. I wanted to have a working search before I change the Tcl files.

Posted by Gustaf Neumann on
Without the filter (or some other possible approach), the file contents are not accessed. I would not insert all acs_objects into the queue: for some content it makes sense to index based on the item_id, for other content based on the revision_id (both are acs_objects). Note that user_ids and package_ids are acs_objects as well, so the indexer won't be able to do anything with them. Inserting everything into the queue may not hurt, but I certainly would not do this for testing on a system with some content.

Try to understand what the FtsContentProvider service contract is about and how the datasource works (again, you might check xowiki or various other packages).
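
A minimal sketch of queueing a selected set of objects instead of every row in acs_objects. The proc name queue_objects is hypothetical, and the commented-out query is only an assumption about how file-storage content is stored (live revisions of cr_items with content_type 'file_storage_object'); check it against your installation before using it. The only call taken from this thread is search::queue -object_id ... -event INSERT.

```tcl
# Hypothetical helper: queue only the given object_ids for indexing.
# queue_cmd defaults to search::queue from the search package; pass a
# stub proc to try the loop outside OpenACS.
# The {*} expansion syntax needs Tcl 8.5 or later.
proc queue_objects {object_ids {queue_cmd search::queue}} {
    set count 0
    foreach object_id $object_ids {
        {*}$queue_cmd -object_id $object_id -event INSERT
        incr count
    }
    return $count
}

# Inside OpenACS the ids could come from a query like the following
# (assumed, not verified here): live revisions of file-storage files.
#
# set ids [db_list fs_live_revisions {
#     select live_revision
#     from cr_items
#     where content_type = 'file_storage_object'
#       and live_revision is not null
# }]
# queue_objects $ids
```

Restricting the queue to one content type at a time also makes it easier to see in the log whether the datasource for that package is actually being called.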