Forum OpenACS CMS: XoWiki - Weird behaviour when editing form fields in numeric format

Hello everyone,

I am exploring the possibilities of XoWiki to create forms dynamically and I am experiencing a problem with numeric fields. Here is how to reproduce it:

- in an XoWiki instance, install the Italian locale it_IT and set it as the system locale
- create a new xowiki form with a numeric field
- fill in the form to create a xowiki FormPage, entering any number as the value of the numeric field
- once saved, open the newly created FormPage for editing again

As you will see, the number in the form becomes 100 times bigger every time you save the FormPage and open it for editing again.

The problem doesn't happen with the English locale, so I think it must be caused by some sort of "comma loop" in field retrieval and saving. Unfortunately, my knowledge of xowiki internals is still too poor to track down the issue in the code on my own.

For convenience I leave my test Form specification:

-- template

Number: @number@

-- form

(form tag opened)
@number@
(form tag closed)

-- form constraints

number:numeric,required

Any ideas?

I started digging: the problem doesn't seem to be xowiki specific, but rather in the lc_parse_number procedure.

In a system with it_IT locale installed, executing

lc_parse_number 12,00 it_IT 0

should return the number 12 in some form, but the result is 1200 instead.

lc_parse_number 12,00 it_IT 1

parses the number correctly, as we are forcing the proc to treat the number as an integer, but I think it should be able to recognize the real nature of the number in both situations.

I will look further

The problem is that the localization of Italian numbers in OpenACS has apparently been incorrect for ages. I am surprised that you did not run into this problem in other OpenACS applications with this locale.

Background 1: in its numeric form-fields, xowiki uses the standard OpenACS conversion functions: lc_numeric to convert an internal number to a localized number, and lc_parse_number to convert a localized number back into an internal one.

If one calls e.g. lc_numeric 123 %.2f it_IT (convert internal number 123 to external with locale it_IT) then the result is 123,00. When this number is converted back via lc_parse_number 123,00 it_IT 0, then the result is (incorrectly) 12300. If one tries the same with e.g. de_DE, then the result is correct.

In order to fix this for your local installation, browse to acs-lang/admin/, select it_IT and package acs-lang, and change the two entries thousands_sep and mon_thousands_sep from "," to ".".

Background 2: xowiki form-fields have in general two methods for converting back and forth between internal and external representations, namely convert_to_internal and convert_to_external. These functions use the locale of the current entry for the conversions. If you want to debug this locally, look for these...

Best regards
-g

Btw, on Wikipedia I see four styles of writing decimal marks and thousands separators for Italy: http://en.wikipedia.org/wiki/Decimal_mark
Is "1.234.567,89" the most common style?

I think I managed to solve the problem: it was caused by the special situation of the Italian locale, in which the decimal separator is set to the same character as the thousands separator.

Actually, in Italian it has long been a rule that no separator should be used for the thousands, only a space (at least, this is what I clearly remember from elementary school), but software often uses ".". The decimal separator is the comma, and this is correct in OpenACS.

In other software I had to work on, the company had created a customized set of procedures to handle number formatting, so we never ran into the problem.

Anyway, I looked into the lc_parse_number code: it first removes the thousands separator, and then converts the decimal separator to a dot. This leads to the problem, as in the Italian locale it erases the comma used for the decimal before it can be detected.
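
To make the order of operations concrete, here is a minimal Python sketch of the behaviour just described. It is only a model, not the Tcl code from localization-procs.tcl: the function name and its separator arguments are invented for illustration, and the formatting step ignores thousands grouping.

```python
def naive_parse_number(text, decimal_sep, thousands_sep):
    """Hypothetical model: strip thousands separators, then fix the decimal mark."""
    # Step 1: remove every thousands separator.
    stripped = text.replace(thousands_sep, "")
    # Step 2: turn the decimal separator into a dot.
    return stripped.replace(decimal_sep, ".")


# Distinct separators (as in de_DE): the decimal comma survives step 1.
print(naive_parse_number("1.234,50", ",", "."))  # 1234.50

# Same separator for both (the it_IT catalog discussed in this thread):
# step 1 erases the decimal comma, so step 2 has nothing left to convert.
print(naive_parse_number("12,00", ",", ","))  # 1200

# Two save/edit cycles: format with two decimals, then parse back.
value = 12
for _ in range(2):
    shown = ("%.2f" % value).replace(".", ",")  # simplified stand-in for lc_numeric
    value = int(naive_parse_number(shown, ",", ","))
print(value)  # 120000
```

Each cycle appends two decimal digits and then loses the comma, which matches the factor of 100 reported in the first post.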

I solved it by adding a few lines of code that first detect the last decimal separator (whatever it may be) and split the number into its integer and decimal parts, then remove the thousands separator from the integer part, and finally join the two parts again using the dot.

This fixes the special situation of locales using the same separator for both decimal and thousands.
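
The three steps above can be sketched in Python as follows. This is an illustration of the idea only, not the actual patch to localization-procs.tcl; the function name and the separator arguments are assumptions made for the example.

```python
def parse_number_split_first(text, decimal_sep, thousands_sep):
    """Hypothetical sketch: split at the last decimal separator before stripping."""
    # Step 1: split at the LAST occurrence of the decimal separator.
    # rpartition returns ('', '', text) when the separator is absent.
    integer_part, found, fraction = text.rpartition(decimal_sep)
    if not found:
        # No decimal part at all: the whole input is the integer part.
        return text.replace(thousands_sep, "")
    # Step 2: remove thousands separators from the integer part only.
    integer_part = integer_part.replace(thousands_sep, "")
    # Step 3: rejoin both parts with a dot.
    return integer_part + "." + fraction


print(parse_number_split_first("12,00", ",", ","))         # 12.00
print(parse_number_split_first("1,234,567,89", ",", ","))  # 1234567.89
print(parse_number_split_first("12 000,00", ",", " "))     # 12000.00
```

One limitation of this sketch: when both separators are identical, an input without decimals such as "1,234" remains ambiguous, and the code above reads it as 1.234.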

I then tried to put the "right" thousands separator in the translations. I could insert a space as "\ " and checked the proc output for "12 000,00": it is parsed correctly as "12000.00".

I can provide the modified localization-procs.tcl, which comes from oacs-HEAD installed just the other day. I also suggest switching the thousands separator from "," to "\ " or "." for the Italian locale in the standard installation, as the comma is really unusual.

All the best

Antonio

Played a little more with "\ " as separator: I don't think quoted values in translations are a good idea, as their effects become a bit unpredictable... The lc_numeric function will format the number 10000 as "10\ 000", so we would have to check for quoted separators inside the proc to fix it, opening the door to a fiddling hell of possible regressions.
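
A small Python sketch of why the quoted value leaks into the output: if the formatter inserts the catalog value literally between digit groups, the backslash comes along. The grouping function below is a made-up stand-in for illustration and does not reproduce how lc_numeric is implemented.

```python
def group_thousands(digits, thousands_sep):
    """Hypothetical grouping: insert the separator literally every three digits."""
    groups = []
    while len(digits) > 3:
        groups.insert(0, digits[-3:])  # peel off the last three digits
        digits = digits[:-3]
    groups.insert(0, digits)           # remaining leading digits
    return thousands_sep.join(groups)


print(group_thousands("10000", "\\ "))  # 10\ 000
print(group_thousands("10000", "."))    # 10.000
```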

If the " " value cannot be inserted into translations directly (without quoting), then the dot will be just fine as thousands separator.

Not sure whether you read my reply above. When using "." as the thousands separator, no code changes are necessary. The changes to the catalog files are already in CVS, and will be moved over to HEAD soon.

Of course changing the separator will do, but what I found is that the issue will occur whenever a locale uses the same separator for decimals and thousands. It is not related to the Italian locale only.

I understand that my fix touches very stable code in critical components of OpenACS, but I think this problem deserves to be considered a bug and solved (with whatever priority and solution you find appropriate). What is your opinion?

The check for invalid i18n settings should either be a constraint applied when message keys are entered, or part of the regression tests. Since the message key setting is agnostic about the semantics of the keys, and regression testing is not performed for all languages, I've now added a sanity check in lc_parse_number.
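
As a rough idea of what such a sanity check could look like, here is a Python sketch. The thread does not show the check that was actually added to lc_parse_number, so the function name, its arguments, and the choice to raise an error are all assumptions.

```python
def check_separators(locale, decimal_sep, thousands_sep):
    """Hypothetical sanity check: refuse locales with identical separators."""
    if decimal_sep == thousands_sep:
        raise ValueError(
            "locale %s: decimal and thousands separator are both %r"
            % (locale, decimal_sep)
        )


# Distinct separators pass silently.
check_separators("de_DE", ",", ".")

# Identical separators are reported instead of silently misparsing numbers.
try:
    check_separators("it_IT", ",", ",")
except ValueError as err:
    print(err)
```

Failing loudly at parse time makes a broken catalog entry visible, instead of letting stored values grow on every save.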

Probably the most realistic approach is to improve regression testing. Patches are welcome.

Came up with this, tell me what you think:

I put the check for the separators into a proc called lang::message::check, together with the check performed in lang::message::register to ensure that a message contains all the variables required for substitution. This proc is meant to hold all the semantic and sanity checks to be performed on messages, now and in the future.

The new proc is then called from three places:

- lang::message::register in place of the former check
- a new regression test called lang_messages_correct, which performs the check on every message
- the on_submit block of the message key setting form, displaying an error on the form field when the check fails

If this sounds reasonable to you, I can send the patches.

Sounds very reasonable... please send me the patches for review.

Patches sent!