Forum OpenACS Q&A: Re: OpenFTS problem with non US_ASCII characters

Posted by Claudio Pasolini on
Thank you very much, Gary: now it works!

Actually I was already quite close to the solution, but I converted the lexem to unicode instead of utf-8, without success.

Furthermore, I had to shut down and restart aolserver to get things to work.

Posted by Joel Aufrecht on
I have two related problems. I applied the fix above and restarted but it didn't have any effect, which is unsurprising because my error messages are slightly different:
[08/Nov/2004:13:47:18][16217.163850][-sched:25-] Error: Ns_PgExec: result status: 7 message: ERROR:  Invalid UNICODE character sequence found (0xc200)

transaction error
[08/Nov/2004:13:47:18][16217.163850][-sched:25-] Error: Aborting transaction due to error:
Database operation "dml" failed (exception ERROR, "ERROR:  Invalid UNICODE character sequence found (0xc200)
")

ERROR:  Invalid UNICODE character sequence found (0xc200)

SQL:
                    insert into index8
                        (lexem,tid,pos)
                         values
                        ('0rÂ,16537,
                        '{1005}')
and
[08/Nov/2004:13:47:54][16217.163850][-sched:25-] Error: Ns_PgExec: result status: 7 message: ERROR:  Cannot insert a duplicate key into unique index index10_key

transaction error
[08/Nov/2004:13:47:54][16217.163850][-sched:25-] Error: Aborting transaction due to error:
Database operation "dml" failed (exception ERROR, "ERROR:  Cannot insert a duplicate key into unique index index10_key
")

ERROR:  Cannot insert a duplicate key into unique index index10_key

SQL:
                    insert into index10
                        (lexem,tid,pos)
                         values
                        ('2004',18789,
                        '{20}')
Posted by Claudio Pasolini on
Joel,

Regarding the UNICODE error, something has gone wrong: you are again trying to insert an unconverted (or badly converted) string. Perhaps you could try your patch in a sample tcl script and verify that it actually does the conversion.
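To illustrate the kind of check meant here, below is a minimal sketch in Python (used only for illustration; the same test could be written in Tcl with `encoding convertto utf-8`). It assumes the lexeme arrives as ISO-8859-1 bytes, which is a guess based on the `0rÂ` value in the failing insert and not something the log confirms. The helper `is_valid_utf8` and the sample bytes are hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical lexeme: "0r" followed by the single byte 0xC2,
# which is the character "Â" in ISO-8859-1 (assumed source encoding).
raw = b"0r\xc2"

# Unconverted, the bytes are not valid UTF-8: 0xC2 is a lead byte
# that must be followed by a continuation byte (0x80-0xBF).
print(is_valid_utf8(raw))        # False

# Decode with the assumed source encoding, then encode as UTF-8.
converted = raw.decode("iso-8859-1").encode("utf-8")
print(converted.hex())           # 3072c382
print(is_valid_utf8(converted))  # True
```

If the patched code still produces bytes like the first case, the conversion is not being applied to the string that reaches the insert.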

I also got the duplicate key problem, but I have ignored it for the moment, because it is caused by a double insertion into the search_observer_queue when you create a new content item (I observed this when creating a news item). In this case the content will still be processed correctly.