Forum OpenACS Development: Re: New Feature: Formbuilder maxlength

4: Re: New Feature: Formbuilder maxlength (response to 1)

Posted by Michael Hinds on 03/20/03 05:42 PM

Lars,

I'm not sure why you don't want to use string length. Here's what the manual says about string bytelength:

string bytelength string

       Returns a decimal string giving the number of bytes
       used to represent string in memory.  Because  UTF-8
       uses one to three bytes to represent Unicode characters,
       the byte length will not be the same as the
       character  length  in  general.   The cases where a
       script cares about the byte length  are  rare.   In
       almost  all cases, you should use the string length
       operation.  Refer  to  the  Tcl_NumUtfChars  manual
       entry for more details on the UTF-8 representation.

So it seems to me string length works fine. Have you seen evidence otherwise?
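
To make the distinction concrete, here is a rough sketch in Python rather than Tcl (the function names are made up for illustration). It assumes that counting code points corresponds to what string length reports and that counting UTF-8 bytes approximates what string bytelength reports. Tcl's internal representation differs from standard UTF-8 in a few edge cases, so treat this as an analogy only.

```python
# Character count vs. UTF-8 byte count.
# Assumption: these stand in for Tcl's "string length" and
# "string bytelength"; the helper names are hypothetical.

def char_length(text: str) -> int:
    """Number of characters (Unicode code points) in text."""
    return len(text)


def utf8_byte_length(text: str) -> int:
    """Number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    for sample in ("abc", "Größe"):
        print(sample, char_length(sample), utf8_byte_length(sample))
```

For plain ASCII input both counts agree. They diverge as soon as the string contains characters such as ö or ß, which is the case a maxlength check on user input presumably cares about.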

5: Re: New Feature: Formbuilder maxlength (response to 4)

Posted by Tilmann Singer on 03/21/03 08:20 AM

Type psql -l to find out the encoding of your PostgreSQL databases:

tils@tp:~$ psql -l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+----------
 beta      | tils     | UNICODE
 lari      | tils     | UNICODE
 lari2     | tils     | UNICODE
...

If you see something else there, for example SQL_ASCII, then those are single-byte encoded databases. As far as I understand, creating your database as UNICODE is almost always the right choice when you want to be able to store characters from different languages and character sets.

The error that your maxlength procedure catches indicates that something is already going wrong earlier. In that case you would end up storing a single international character (e.g. a German umlaut) as two characters in the db, which leads to lots of other problems. For example, a query that selects a substring could split the 2-byte character into two pieces. You should have created your database with UNICODE encoding, or with an encoding that understands the characters you need (e.g. LATIN1).
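
A small Python sketch of that failure mode, in case it helps. It does not talk to PostgreSQL at all. It only simulates a store that treats every byte as one character, and the helper names are invented for the example.

```python
# Simulates what happens when UTF-8 text lands in a store that
# treats each byte as one character. All helper names are hypothetical.

def as_single_byte_chars(text: str) -> str:
    """Reinterpret the UTF-8 bytes of text as one character per byte."""
    return text.encode("utf-8").decode("latin-1")


def byte_substring(text: str, start: int, length: int) -> bytes:
    """Take a substring by byte offsets, as a byte-oriented store would."""
    return text.encode("utf-8")[start:start + length]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    word = "für"                               # 3 characters, 4 bytes in UTF-8
    print(len(as_single_byte_chars(word)))     # the umlaut counts as two
    piece = byte_substring(word, 0, 2)         # cuts through the umlaut
    print(piece, is_valid_utf8(piece))
```

The first two bytes of "für" end in the middle of the ü, so the resulting piece is no longer valid UTF-8. That is the kind of breakage a byte-based substring would produce.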