Forum OpenACS Q&A: site-wide-search with file storage objects

Hi,
I am trying to set up site-wide-search so that it works with file storage objects. I have successfully set up the interfaces and registered the object with the pot service. The content is now being indexed and I can search it using site-wide-search; however, the appropriate URL is not being found for file storage objects.

I notice that when a file storage object is created, the application_id and the node_id in table sws_search_contents are null. This differs from other object types such as bboard objects and ttracker_ticket. When these are created and the index is synchronised, the application_id and node_id are filled.

These fields should be given values when the index is synchronized and sws_service.rebuild_index is called. This calls sws_service.update_content_info for each row which in turn should fill the node_id and application_id. I am using sws_service_interface to implement the methods sws_url, sws_application_id and sws_site_node_id. These are returning null because this function is returning null:
sws_service.first_obj_type_in_context_tree (
        object_id   => object_id,
        object_type => 'apm_package');

If I look in the database then it’s obvious that the function is returning null because there’s no object_type apm_package in the context tree for an object of type file_storage_object:
OBJECT_ID    CONTEXT_ID    OBJECT_TYPE
2287         2286          file_storage_object
2286         2119          content_item
2119         -100          content_folder
-100         0             content_folder
0            (null)        person

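(I get these trees using the query below, putting the object_id of interest in the "start with" clause; 1390 is the bboard object shown further down:
select object_id, context_id, object_type
        from acs_objects
        start with object_id = 1390
        connect by prior context_id = object_id;)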

If I look at an object where the application_id and site_node_id are being filled correctly, e.g. a bboard object, then I can see that there is an apm_package in the context tree:
OBJECT_ID    CONTEXT_ID    OBJECT_TYPE
1390         1389          acs_message_revision
1389         704           bboard_message
704          380           bboard_forum
380          257           apm_package
257          -3            apm_service
-3           (null)        acs_object
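
The difference between the two trees can be reproduced with a small Python model. This is only a sketch, not the real PL/SQL: the rows are copied from the two listings above, and the function is an assumed re-implementation of what sws_service.first_obj_type_in_context_tree appears to do (walk up context_id until an object of the requested type is found). Whether the real function also tests the starting object itself is an assumption here.

```python
# Rows copied from the two context trees quoted in this post:
# object_id -> (context_id, object_type). None stands for a NULL context_id.
ACS_OBJECTS = {
    2287: (2286, "file_storage_object"),
    2286: (2119, "content_item"),
    2119: (-100, "content_folder"),
    -100: (0, "content_folder"),
    0: (None, "person"),
    1390: (1389, "acs_message_revision"),
    1389: (704, "bboard_message"),
    704: (380, "bboard_forum"),
    380: (257, "apm_package"),
    257: (-3, "apm_service"),
    -3: (None, "acs_object"),
}


def first_obj_type_in_context_tree(object_id, object_type, objects=ACS_OBJECTS):
    """Walk up the context chain and return the first object of the given type.

    Assumed behaviour, modelled on the description in the post; returns None
    when no object of that type is reachable through context_id.
    """
    seen = set()  # guards against an accidental cycle in the data
    current = object_id
    while current is not None and current in objects and current not in seen:
        seen.add(current)
        context_id, current_type = objects[current]
        if current_type == object_type:
            return current
        current = context_id
    return None


file_storage_package = first_obj_type_in_context_tree(2287, "apm_package")
bboard_package = first_obj_type_in_context_tree(1390, "apm_package")
print(file_storage_package)  # None: the chain ends at object 0 without an apm_package
print(bboard_package)        # 380: the apm_package above the bboard forum
```

In this model the file storage chain runs through the content folders and never reaches a package, which matches the null application_id and node_id seen in sws_search_contents.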

All this is causing problems when the results of site-wide-search include file_storage_objects. The appropriate content url for file storage objects is not being found (unless I hack it in!), so the user can’t click into the object.

I guess I need to implement my own sws_url, sws_application_id and sws_site_node_id for the file_storage_object to fix this problem. However, I’m wondering whether anyone has come across this before with file_storage_objects, and why a file_storage_object does not have an apm_package in its context tree while other object types do (e.g. ttracker_ticket, bboard).

Thanks in advance
Keith

Posted by Don Baccus on
This is a bug in file storage which I really wish I'd had time to fix in OpenACS 4.6.2.

I didn't realize it screwed up site wide search. What I did know, and why I think it is a bug, is that hooking the file storage root folder to object -100 means that the expected behavior of inheriting permissions from the subsite under which file storage is rooted isn't implemented. So if I have a subsite "foo", create an admin subgroup for the subsite and mount file storage under that subsite, the members of foo's admin subgroup can't admin the files!

Grrr.

Because of this dotLRN, for instance, turns off security inheritance and manually assigns the proper permissions to folders it creates for the file storage instance created for each class.  This makes creating classes or copying them slower, makes the code obtuse, is different than the way other packages do permissions, etc.

Now ... changing the code to do the right thing shouldn't be all that hard. Upgrade scripts may not be necessary, and might actually be dangerous: people will already have manipulated permissions to get around the problem, and changing them in an upgrade script would probably break things.

So maybe just changing the code without bothering with an upgrade script would be the right thing to do.

Posted by Dave Bauer on
Don,

So should each subsite have its own content root folder?

Posted by Don Baccus on
I was thinking that each file storage instance should have the context_id of the folder it creates set to its parent in the site map.

This doesn't have any relationship to where it physically sits in the folder hierarchy.

File storage needs a little rethinking; the original version wasn't subsite aware at all ...
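
A rough Python model may help picture the effect of that change on permission inheritance. It is a sketch under several assumptions, not OpenACS code: ids 500 (subsite "foo"), 600 (the file storage instance mounted under it) and 900 (foo's admin subgroup) are hypothetical, while 2119, -100 and 0 are the ids from the original post. The permission check is simplified to "walk up context_id while inheritance is on", ignoring group membership and privilege hierarchies. Re-pointing the folder at the instance (600) is one reading of the proposal; pointing it directly at the subsite would behave the same way in this model.

```python
# Simplified object table. "parent_id" models the folder hierarchy and is kept
# separate from "context_id", which models the security context.
objects = {
    500: {"context_id": None, "inherit": True},   # hypothetical subsite "foo"
    600: {"context_id": 500, "inherit": True},    # hypothetical file storage instance
    0: {"context_id": None, "inherit": True},
    -100: {"context_id": 0, "inherit": True},     # object -100 from the post
    2119: {"context_id": -100, "inherit": True, "parent_id": -100},
}

# (object_id, party_id, privilege): foo's admin subgroup (900) admins the subsite.
grants = {(500, 900, "admin")}


def has_permission(object_id, party_id, privilege, objects, grants):
    """Return True if a grant exists on the object or on an inherited context."""
    seen = set()  # guards against a cycle in context_id
    current = object_id
    while current is not None and current not in seen:
        seen.add(current)
        if (current, party_id, privilege) in grants:
            return True
        row = objects[current]
        if not row["inherit"]:
            return False  # inheritance switched off: stop climbing
        current = row["context_id"]
    return False


# Current behaviour: the folder's context chain is 2119 -> -100 -> 0,
# so the subsite grant is never reached.
before = has_permission(2119, 900, "admin", objects, grants)

# Proposed behaviour: only context_id changes; parent_id is left alone.
objects[2119]["context_id"] = 600
after = has_permission(2119, 900, "admin", objects, grants)
folder_parent = objects[2119]["parent_id"]

print(before, after, folder_parent)
```

In this model the admin subgroup gains access without any per-folder grants, and the folder's position in the folder hierarchy (parent_id) is untouched.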