Forum OpenACS Q&A: Charset problem

Posted by Daniël Mantione on
Hi,

If I put a file whose name contains a non-ASCII character in /www, say "Daniël.html", I get the following error when I visit http://localhost/Daniël.html:

Database operation "0or1row" failed (exception NSDB, "Query was not a statement returning rows.")

ERROR: Invalid UNICODE character sequence found (0xeb6c2e)

SQL:
select sp.static_page_id, f.package_id
from static_pages sp, sp_folders f
where sp.filename = '/www/Daniël.html'
and sp.folder_id = f.folder_id
-- Only want pages from the Static Pages package.
and f.package_id in (
select package_id from apm_packages
where package_key = 'static-pages' )
-- If the same page is in more than one instance of
-- Static Pages for some reason, we only want one of
-- them, and we don't care which.
-- Oracle
--and rownum <= 1
-- PostgreSQL
limit 1

This problem disappears if I set UrlCharSet to ISO-8859-1 in my AOLserver config file. However, things then go completely wrong when I enter text containing non-ASCII characters into any HTML form.

This is probably because a URL sent by a browser is ISO-8859-1 encoded. Unicode is possible, but the browser then escapes the characters using the %XX notation. For a form, however, the browser is free to send the data UTF-8 encoded.
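
To illustrate the mismatch, here is a small Python sketch (not OpenACS or AOLserver code; the helper name is_valid_utf8 is made up for this example). It shows that the bytes 0xeb6c2e from the error message are exactly "ël." in ISO-8859-1, and that they do not form a valid UTF-8 sequence, which would explain why a database expecting UNICODE rejects them:

```python
from urllib.parse import quote

name = "Daniël.html"

# The same file name as raw bytes in the two encodings involved.
latin1_bytes = name.encode("iso-8859-1")
utf8_bytes = name.encode("utf-8")

# The three bytes quoted in the error message: "ël." in ISO-8859-1.
fragment = "ël.".encode("iso-8859-1")
print(fragment.hex())  # eb6c2e


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xEB starts a three-byte UTF-8 sequence, but 0x6C ("l") is not a
# continuation byte, so the ISO-8859-1 bytes are rejected as UTF-8.
print(is_valid_utf8(latin1_bytes))  # False
print(is_valid_utf8(utf8_bytes))    # True

# How the URL path looks when percent-escaped in each encoding.
print(quote(name, encoding="iso-8859-1"))  # Dani%EBl.html
print(quote(name, encoding="utf-8"))       # Dani%C3%ABl.html
```

So a request carrying the ISO-8859-1 form (%EB) only works if the server decodes URLs as ISO-8859-1, while form data sent as UTF-8 needs the opposite setting.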

Any idea how to fix this?