Forum OpenACS Development: How to make an object type searchable?

To make a new custom content_type searchable, I followed the guidelines (available at https://openacs.org/doc/search/guidelines) and wrote custom service contracts for those new custom content_types.

Then I manually ran the SQL scripts (e.g. acs_sc_impl_alias__new... ) and restarted NS. However, the new custom content items are still not searchable.

https://dashboard.qonteo.com/search/

Those custom content_types use the content repository to store their items and have content_revision as their supertype, defined when the package is installed and its data model is created (see the content_type__create_type call below).

I've double-checked the service contracts and they were properly installed as FtsContentProvider.

However, the custom items are still not searchable.

What am I missing?

Furthermore, I tested file-storage package items, and they are not searchable either in this OACS instance.
Instance 1 (search is broken): https://dashboard.qonteo.com/search/

However, I have another OACS instance, where the lars-blogger package is installed, and there the search engine works just fine and returns the blog entries matching the keywords.

Instance 2 (search works fine with lars-blogger):
https://iurix.com/search/search?q=entrepreneurs&__csrf_token=8FD3917A574FC2771D0015E26ED403A68DBA1295&t=Buscar

References:
https://openacs.org/doc/search/guidelines
https://openacs.org/doc/tsearch2-driver/

select content_type__create_type (
    'qt_face',            -- content_type
    'content_revision',   -- supertype. We search revision content
                          -- first, before item metadata
    'Qonteo Face',        -- pretty_name
    'Qonteo Faces',       -- pretty_plural
    NULL,                 -- table_name
    -- IURI: acs_object_types supports a null table name so we do that
    -- instead of passing a false value so we can actually use the
    -- content repository instead of duplicating all the code in file-storage
    NULL,                 -- id_column
    'qt_face__get_title'  -- name_method
);

Posted by Antonio Pisano on
Dear Iuri,

the place where "the magic happens" in the search package is this proc http://openacs.org/api-doc/proc-view?proc=search::indexer&source_p=1.

This proc is scheduled to run every minute or so (configurable) and loops through the entries in the search_observer_queue table. Those entries are supposed to be created by a trigger on the cr_items table called "content_item_search__utrg" (object types that do not inherit from cr_items can define a similar trigger of their own).

For each such entry, the callback you have registered should be called and return some indexable text representation of your document that is then processed by the FTS engine.

To debug your situation, try adding searchable documents to your website (e.g. file-storage PDFs, xowiki pages and so on) and check:
- that upon creation, an entry in search_observer_queue is created (note that it will be deleted as soon as search::indexer is run, unless something goes wrong)
- put some debug statements in search::indexer and in your callback, to check that your item made it to the processing and your indexing callback was executed
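
The first check can be done directly in psql. This is only a minimal sketch, assuming the stock search package data model on PostgreSQL, where search_observer_queue has the columns object_id, event and event_date; the trigger name is the one mentioned above.

```sql
-- Objects currently waiting to be indexed. A row for the new document
-- should appear right after it is created and disappear again once
-- search::indexer has processed it.
select object_id, event, event_date
  from search_observer_queue
 order by event_date desc
 limit 20;

-- Confirm that the trigger feeding the queue is installed.
-- No row here would explain why the queue stays empty.
select tgname, tgrelid::regclass as table_name
  from pg_trigger
 where tgname = 'content_item_search__utrg';
```

If rows pile up in the queue and never go away, the indexer is either not scheduled or failing, and the server log is the next place to look.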

If this does not help, we should have a look at your service-contract definition.

Ciao
Antonio

Posted by Iuri Sampaio on
Antonio,
To rule out errors in my custom code, I used the FS and Xowiki packages to evaluate the search mechanism. That way, we also exclude potential errors in the service-contract definition.

I've created an FS file object (a PDF file). Nice reading, btw! Voltaire's masterpiece.

https://dashboard.qonteo.com/file-storage/view/E%cc%81pi%cc%82tre_de_la_mode%cc%81ration_en_%5b...%5dVoltaire_(1694-1778)_bpt6k6104957k.pdf

... and I've also created a Xowiki Page
https://dashboard.qonteo.com/xowiki/oacsforumthread

Then, I've tried to search for both items, but no results were returned.

https://dashboard.qonteo.com/search/search?q=forum&search_package_id=442093&__csrf_token=C9C66423DAD97C33AF9A28377EECC7AC9CE1E212

Thus, neither a file nor a xowiki page is searchable.
i. I waited a couple of hours and searched again, to make sure there was enough time for the reindexing procedures to run.
ii. I rebooted NS.

Furthermore, there is no need for an extra step to make the search observer aware of them, because the new object types are subtypes of content_revision.

OACS kernel is 5.9.2d2

At first my intuition pointed to the Tsearch2 driver or the PostgreSQL level. However, everything works just fine in the other instance, which uses the same RDBMS, even the same server.

Posted by Iuri Sampaio on
I've uninstalled and reinstalled the search package, and now both items, the file object and the xowiki page, are returned by the search.

I have no idea why or how. All I did was reinstall the search package. That's all the info I have.

Now I'm going to debug the custom objects... I'll paste the result here later on.

Posted by Iuri Sampaio on
... reinstalling the search package amounts to the same as...
"Have you tried turning it off and on again?"
The famous catchphrase from The IT Crowd https://pt.wikipedia.org/wiki/The_IT_Crowd
Posted by Iuri Sampaio on
Antonio,

One problem solved. The objects have been indexed successfully!

Notice: Running TCL ad_prc qt_vehicle__datasource
[06/Aug/2020:21:12:07][4365.7efbebfff700][-sched:0:3:10-] Notice: ROW 202759 {2020-08-07 00:07:22.410127+00} UPDATE
...

Lessons learned:

1. Never forget to set a revision as "live". Only then can the search return it.

2. I copied the datasource ad_proc implementation from fs_object, and its SQL query indexes only the "content" column. However, my object_type has data only in the title and description columns.

db_0or1row fs_datasource "
    select r.revision_id as object_id,
           i.name as title,
           case i.storage_type
               when 'lob' then r.lob::text
               when 'file' then '[cr_fs_path]' || r.content
               else r.content
           end as content,
           r.mime_type as mime,
           '' as keywords,
           i.storage_type as storage_type
    from cr_items i, cr_revisions r
    where r.item_id = i.item_id
    and r.revision_id = :revision_id
" -column_array datasource

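Both lessons can be checked or sketched in SQL. What follows is only an illustration, assuming the standard content repository tables (cr_items with content_type and live_revision, cr_revisions with title and description) and using qt_vehicle as the content type. The second statement is a hypothetical variant of the datasource query that feeds title and description to the indexer instead of content; it is not the code actually deployed, and :revision_id is a bind variable supplied by the db_* API.

```sql
-- Lesson 1: items of the custom type that have no live revision yet.
select i.item_id, i.name, i.latest_revision
  from cr_items i
 where i.content_type = 'qt_vehicle'
   and i.live_revision is null;

-- Lesson 2 (hypothetical variant): build the indexable text from the
-- title and description columns, since this type stores nothing in content.
select r.revision_id as object_id,
       r.title as title,
       coalesce(r.title, '') || ' ' || coalesce(r.description, '') as content,
       'text/plain' as mime,
       '' as keywords,
       'text' as storage_type
  from cr_revisions r
 where r.revision_id = :revision_id;
```

The coalesce calls matter because concatenating a NULL title or description with || would otherwise make the whole content value NULL.
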
Now the issue is to display the results.

The results are there: pagination shows a bunch of pages, but no results are visible to the user.

https://dashboard.qonteo.com/search/search?q=plate&__csrf_token=F192516B843660460C5FAC68FEBA6DD372F383BF

I noticed that when the search runs, the log shows:

Notice: Running TCL ad_prc qt_vehicle__datasource
[06/Aug/2020:21:30:56][6145.7efbf3d70700][-conn:qonteo:0:3-] Notice: Running TCL ad_proc qt_vehicle__url
[06/Aug/2020:21:30:56][6145.7efbf3d70700][-conn:qonteo:0:3-] Error: search.tcl object_id 108031 object_type qt_vehicle error Query did not return any rows.

However, the object exists:

select * from acs_objects where object_id = 108031;

 object_id | object_type | title | package_id | context_id | security_inherit_p | creation_user |         creation_date         |  creation_ip  |         last_modified         | modifying_user | modifying_ip
-----------+-------------+-------+------------+------------+--------------------+---------------+-------------------------------+---------------+-------------------------------+----------------+---------------
    108031 | qt_vehicle  | 70433 |        860 |     108029 | t                  |           726 | 2020-07-28 00:00:15.961343+00 | 178.62.211.78 | 2020-07-28 00:00:15.961343+00 |                | 178.62.211.78
(1 row)

Posted by Iuri Sampaio on
Resolved!

It was another problem in the content_type creation: as there's no folder_id in this context, the parent_id must be the same as the package_id, at least for now.
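
For anyone hitting the same error, a query along these lines would show the parent_id next to the package_id. It is a sketch only, assuming that object 108031 from the log above is a revision (its type has content_revision as supertype) and that the standard cr_revisions, cr_items and acs_objects tables are in use.

```sql
-- Compare the item's parent_id with the package_id stored on the object.
select r.revision_id,
       i.item_id,
       i.parent_id,
       o.package_id
  from cr_revisions r
  join cr_items i on i.item_id = r.item_id
  join acs_objects o on o.object_id = r.revision_id
 where r.revision_id = 108031;
```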

Best wishes,
I

Posted by Antonio Pisano on
Glad you could make it work!

All the best

Antonio