Forum OpenACS Development: Rich text input

Posted by Lars Pind on
I'm working on standardizing textarea input/formatting in OpenACS.

The default input format, I think, should be like what MovableType has, namely HTML input where the software can automatically insert <p>'s, <br>'s, and &nbsp;'s for you. It could also make links out of URLs and emails. So essentially, this is the current plaintext format, except we don't quote HTML tags for you.

It's really convenient to be able to use simple HTML formatting, but it's really annoying to have to insert your own paragraph and line breaks.

I still think that keeping the plaintext format is nice for posts like this, where you want to write HTML tags that should be shown literally.

Also, preformatted text probably makes sense. And finally, a "raw HTML" format also makes sense, for when you're copy-pasting stuff from some other source.

Finally, I've added some simple [B], [I], [U], and [URL] buttons which will wrap your current selection in the relevant tags. This is ripped from MovableType (we might need to create our own version of this, though this code is floating around the net in several places).

This only works in IE, though; I haven't been able to figure out how to make it work in Mozilla or other browsers. If there are any JavaScript gurus out there, any help would be much appreciated.

I also considered coming up with a Wiki-style 'structured' input language, but Carl convinced me that we'd never manage to teach people yet another rich text formatting language, and that it's better to just stick with HTML and add a few buttons to make it easier to use.

Q: Do you people agree that this input format (HTML with <p>, <br>, &nbsp;, and email/URL->link translation) is the optimal format?

Q: How do we label these formats? We could keep HTML/Plaintext/Preformatted, and add two check boxes "Preserve line breaks" and "Make links out of emails and URLs", but that's really long and convoluted.

We could also call them HTML, Raw HTML, Text, and Preformatted. Is that better? Or "Rich HTML"/"Raw HTML"? How would we explain the difference to people?

Is there any need to be able to turn on/off line breaking and making links out of emails and URLs separately from each other?

If you want to play with this, try here:

Register to try.


2: Re: Rich text input (response to 1)
Posted by Jorge Garcia on
Hi, Lars

I get this error when I try to edit the message:

Request Error
The server has encountered an internal server error. The error has been logged and will be investigated by our system programmers.

3: Re: Rich text input (response to 1)
Posted by Janine Ohmer on
As I understand it, those buttons are generally implemented using ActiveX controls that are only available in IE on Windows.  Hopefully the code you're using does it some other way, but that might be why you can't get it to work elsewhere.
5: Re: Rich text input (response to 1)
Posted by Ola Hansson on
What will be inserted into the db under the various settings? Always the exact string the user wrote, or the resulting string?

Preformatted should stay but perhaps we can call it "fixed width" instead... I think that is more telling of what it does.

Perhaps we could have a bunch of formats to pick from via a kernel param... The more options we have available, the better it is, IMO - as long as the administrator can decide which ones to offer the users.

4: Re: Rich text input (response to 1)
Posted by Lars Pind on

I'm actively hacking on this server, that's probably why. Try again.


No, it's not an ActiveX control; it's a plain old textarea with a little JavaScript magic. The only thing the buttons get you is that they wrap your current selection in <b>...</b> tags. Writing the <b> tags by hand is as trivial as it always was. It's mainly a pedagogical tool that shows people who've never written HTML tags how to do it.
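
What the buttons do to the text can be sketched independently of the browser. The following is a minimal Python illustration, not the MovableType JavaScript; the function name `wrap_selection` and the idea of passing the selection as start/end offsets are assumptions made for the example.

```python
def wrap_selection(text: str, start: int, end: int, tag: str) -> str:
    """Wrap text[start:end] in <tag>...</tag> and leave the rest untouched.

    start/end stand in for the textarea's selection offsets (hypothetical).
    """
    if not 0 <= start <= end <= len(text):
        raise ValueError("selection out of range")
    selected = text[start:end]
    return f"{text[:start]}<{tag}>{selected}</{tag}>{text[end:]}"


# Selecting the word "bold" (offsets 10..14) and pressing [B]:
print(wrap_selection("make this bold", 10, 14, "b"))
```

The hard part in a browser is obtaining the selection offsets, which is where the IE-only limitation comes from; the string manipulation itself is this simple.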


6: Re: Rich text input (response to 1)
Posted by Roger Metcalf on
I think it would also be really convenient for the software to build single-level unordered lists, where lines beginning with "- " indicate a new list item and a double line break indicates end of list.
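
A minimal Python sketch of that list rule, under stated assumptions: the function name `render_dash_lists` is made up, and the rule is simplified so that any line not starting with "- " (which includes the blank line of a double line break) closes the list.

```python
def render_dash_lists(text: str) -> str:
    """Turn runs of lines starting with "- " into single-level <ul> lists."""
    out = []
    in_list = False
    for line in text.split("\n"):
        if line.startswith("- "):
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{line[2:].strip()}</li>")
        else:
            # Any other line, including a blank one, ends the current list.
            if in_list:
                out.append("</ul>")
                in_list = False
            out.append(line)
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


print(render_dash_lists("Todo:\n- one\n- two\n\nDone"))
```
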
7: Re: Rich text input (response to 1)
Posted by Lars Pind on

Yes, we never alter the user's input before storing it in the DB, only on output.

Regarding a multitude of formats, sure, but we might as well try to come up with one that we think is probably the best, the one that we'd use on, say.

Btw, right now I'm trying to figure out what to do with links. I added the turning URLs into links, but now it conflicts with your own manual links: If you write <a href="">link</a>, it's translated into <a href="<a href=""></a>">link</a> ...

Anyway, will keep hacking on this a bit.


25: Re: Rich text input (response to 7)
Posted by xx xx on

Concerning your remark on 26/1 (10.07): do <a href=>-tags work properly with richtext widgets?

I'm using the richtext widget with 4.6.3 (final) but still get this as output:

<p><a href="">link</a></p>
<input type="hidden" name="pa" value="<a href="">link</a>"><input type="hidden" name="pa.format" value="text/enhanced">

I would like to get rid of the <p> as well as the resulting ' link"> ' displayed in the browser.

26: Re: Rich text input (response to 25)
Posted by Lars Pind on
The resulting ' link"> ' is a quoting bug, which I've just now fixed on HEAD:

The <p> ... </p> around the contents is an attempt to make the  output proper HTML, but it does result in annoying whitespace.

Any better suggestion? Should we resort to open-ended <p>'s between paragraphs like we used to?

The code that does this is these two lines in util_convert_line_breaks_to_html in packages/acs-tcl/tcl/text-html-procs.tcl, in case you want to take it out on your own system:

    # Wrap P's around paragraphs
    set text "<p>$text</p>"
    regsub -all {([^\n\s])\n\n([^\n\s])} $text {\1</p><p>\2} text


27: Re: Rich text input (response to 26)
Posted by xx xx on
Thanks for fixing this bug.

As far as I can see "<p> ... </p> around the contents" prevents us from using this formwidget inline. An inline_p switch would therefore be useful, IMO.
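
To make the suggestion concrete, here is a rough Python rendering of the paragraph-wrapping step with such a switch. It is not the toolkit's Tcl code: the function name `wrap_paragraphs` and the `inline` parameter are hypothetical, and lookarounds are used in place of the capture groups in the regsub so that consecutive one-character paragraphs are also split.

```python
import re


def wrap_paragraphs(text: str, inline: bool = False) -> str:
    """Wrap text in <p>...</p> and split paragraphs on double line breaks.

    With inline=True (a hypothetical switch) the text is returned as-is,
    so the result can sit inside a sentence without block-level markup.
    """
    if inline:
        return text
    text = f"<p>{text}</p>"
    # A blank line between two non-whitespace characters separates paragraphs.
    return re.sub(r"(?<=\S)\n\n(?=\S)", "</p><p>", text)


print(wrap_paragraphs("one\n\ntwo"))
print(wrap_paragraphs("link", inline=True))
```
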

8: Re: Rich text input (response to 1)
Posted by Don Baccus on
I think these are useful.

An orthogonal issue ... 'way back on my back burner has been the notion of writing a new datatype for the form builder that handles our existing HTML/plaintext textarea boxes cleanly and without the need for the programmer building a form-handling page to do the work explicitly.

Among other things this would lead to uniformity in how such textareas are presented to the user, and in how they're handled by the submit code.

Do you have any interest in combining your ideas into a new smart text input type in the form builder?  Or are you already way ahead of me and doing this :)

9: Re: Rich text input (response to 1)
Posted by Lars Pind on

I am doing this as a form builder *widget*, which outputs both the textarea and the format selection drop down. But I'm not sure how it would work as a data type.

If you could sketch out the design you have in mind, I could probably do it.


10: Re: Rich text input (response to 1)
Posted by Don Baccus on
Let's see ... a widget implies something like (using ad_form notation)


How do you communicate the "HTML/plaintext/etc" values to the code that handles the form? If you're using the widget with the textarea datatype you should really only be returning the text itself, not any additional hidden stuff with a magic name, or a list, etc. At submit time, a form should be able to handle any element of type textarea without any knowledge of the widget you've associated with it.

On the other hand, a rich_text datatype (which would of course work with a rich_text widget) could be defined to return the list { text_area_data format }.  The widget would create the textarea block and the associated format dropdown (which I assume your existing widget is already doing), then at validation time (I think, I'd have to look to be sure since I haven't written a new datatype since GP/June) a list would be returned.  You'd want to create property procs to get the textarea and format values back, and an acquire proc to build one from passed-in data (useful for edit forms).

Probably the easiest example to look at is the currency datatype and its associated widget.  It's not well-documented but it is much more straightforward than the equally poorly documented date datatype and widget.

Anyway ... building both a datatype and widget allows you to  fully hide implementation details (i.e. the fact that textarea and  format are a list, or the order in which the two elements are stored in the list).
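
The idea of hiding the list layout behind accessor procs can be sketched as follows. This is Python, not the form builder's Tcl API; the function names are hypothetical and only modeled on the acquire and get_property procs mentioned above.

```python
from typing import List


def richtext_acquire(contents: str, fmt: str) -> List[str]:
    """Build a rich-text value from its parts (e.g. when filling an edit form)."""
    return [contents, fmt]


def richtext_get_property(what: str, value: List[str]) -> str:
    """Read one part of a rich-text value without exposing the list order."""
    if what == "contents":
        return value[0]
    if what == "format":
        return value[1]
    raise ValueError(f"unknown property: {what}")


value = richtext_acquire("Hello <b>world</b>", "text/enhanced")
print(richtext_get_property("format", value))
```

Application code that only goes through the two functions keeps working if the internal order of the list changes, which is the point of pairing a datatype with the widget.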

I was also thinking that Ola's spellcheck stuff could be implemented by creating spellchecked datatypes leading to a cleaner implementation (one which ad_form could mostly be ignorant of) but haven't had time to investigate ...

11: Re: Rich text input (response to 1)
Posted by Don Baccus on
OK I just looked.  If you have an ad_form snippet like:

{foo:rich_text ... }

your widget just puts out "foo.textarea" and "foo.format" (or perhaps "foo.1" and "foo.2").

The "values" attribute will contain the list of the two values.

The textarea datatype would work with this if you overload its meaning and submit processing code used the values rather than value attribute.

But ... as I said above, submit processing code should not need to be aware of what widget's used with a variable of a given datatype.  In this case it needs to know to use values if your rich_text widget is used.

Creating a rich_text datatype as well as widget avoids that slight messiness, and as I mentioned above allows you to provide get_property, set_property, acquire and transform methods that hide the data structure you're using to hold the textarea/format pair from the application code.

These also plug in to ad_form's to_html, from_sql and related ... ummm ... hacks (I eventually want to extend the datatype implementation model so ad_form can intuit these without explicit coding by the programmer).  to_html, for instance, would apply the right style of formatting to the data and pass back the HTML string - which is then suitable for display in a "-confirm_template" template.

13: Re: Rich text input (response to 1)
Posted by Lars Pind on

I haven't done that before, but it does sound like the right way to implement this. Will do that tomorrow.

The proposed standard data types are:

- Rich HTML (links, paragraphs, etc.)
- Raw HTML (no processing, only security checks)
- Text (links, paragraphs, quoting)
- Fixed-width text (links, paragraphs, quoting, wrap in PRE)
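
A simplified Python sketch of how output could be dispatched on these four formats. The format keys and function names are made up for the example, link detection and the security checks are left out to keep it short, and the fixed-width case is reduced to quoting plus a PRE wrapper.

```python
import html
import re


def _paragraphs(text: str) -> str:
    """Wrap in <p>...</p>, splitting paragraphs on blank lines."""
    return "<p>" + re.sub(r"(?<=\S)\n\n(?=\S)", "</p><p>", text) + "</p>"


def render(contents: str, fmt: str) -> str:
    """Convert stored input to HTML at output time, based on its format."""
    if fmt == "raw_html":        # no processing
        return contents
    if fmt == "rich_html":       # paragraphs, HTML tags kept
        return _paragraphs(contents)
    if fmt == "text":            # quoting, then paragraphs
        return _paragraphs(html.escape(contents))
    if fmt == "fixed_width":     # quoting, wrapped in PRE
        return "<pre>" + html.escape(contents) + "</pre>"
    raise ValueError(f"unknown format: {fmt}")


print(render("a <b>bold</b> claim\n\nnext paragraph", "rich_html"))
print(render("a <b>literal</b> tag", "text"))
```
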

We should probably have some kind of help page explaining what they are. Who's building the context-sensitive help system? :)


14: Re: Rich text input (response to 1)
Posted by Don Baccus on
The nice thing about burying this all in a form builder datatype is that it will be easy to extend/modify if we can get packages to use the standard get_property/set_property etc methods you provide ...

Of course the programmer still has to get the datamodel right.  And the DML statements probably need to be done by hand (using get_property to grab the textarea and format field).

Right now content datamodels seem to either have "html_p" or "mime_type" columns to flag the appropriate text/CLOB column so adopting your scheme will imply upgrade scripts etc.  Though of course no one expects you to rewrite all the packages to take advantage of this when you're done!

Perhaps this could be a target for a "hacking marathon" of the type we've talked about for bugs and documentation but which as of yet we've never done?

12: Re: Rich text input (response to 1)
Posted by Vadim Makarov on
> For security reasons we only accept the submission of HTML containing the following tags:
> You have a <form> tag in there.

Yeah, thanks. I'll try to describe it in words then. You can call your format options:
text with autolinks
and make text with autolinks default.

Note that you can't unambiguously detect URLs, because they may contain spaces (thanks to MS) if not for a bunch of other reasons.

To me, a mix of user-entered HTML and post-processing always screws something up, except in the simplest cases. I had to swear a lot at various forums that do that. It's better to leave HTML option intact, i.e. no post-processing on user input. If it's HTML, it's HTML, no strings attached (well, that seems to be already lost :).

18: Re: Rich text input (response to 12)
Posted by Andrei Popov on
> Note that you can't unambiguously detect URLs, because they
> may contain spaces (thanks to MS) if not for a bunch of 
> other reasons.
Shouldn't all non-plain ASCII's be quoted? I.e. your url.htm become After all, when one copy/pastes from address window it is likely to be formatted correctly. Otherwise it should be user's tough luck, I suppose.
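
For reference, quoting of this kind is usually done with percent-encoding. A small Python sketch using the standard library's `urllib.parse.quote`; the wrapper name `quote_url_path` is an assumption for the example.

```python
from urllib.parse import quote


def quote_url_path(path: str) -> str:
    """Percent-encode spaces and non-ASCII characters in a URL path."""
    return quote(path, safe="/")


print(quote_url_path("your url.htm"))
```

This only helps once the URL has been identified; it does not solve the problem of finding where a URL containing spaces ends.
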
15: Re: Rich text input (response to 1)
Posted by Lars Pind on

We're going to leave a raw HTML input option around, the question is just what to label it. I'm thinking "Raw HTML".

Then there's the question of whether to try and detect links. I think it's reasonable enough to try and detect them. If you wrap a URL in a link yourself, we'll leave it untouched.
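
One way to get that behavior, sketched in Python under stated assumptions: split the input on markup, then only link URLs found in the text between tags and outside existing <a> elements. The regular expressions and the function name `autolink` are illustrative, and the URL pattern is deliberately naive (it will, for example, swallow trailing punctuation).

```python
import re

# A whole <a ...>...</a> element, or any other single tag.
_SKIP = re.compile(r"(<a\b[^>]*>.*?</a>|<[^>]+>)", re.IGNORECASE | re.DOTALL)
# Naive URL pattern: scheme, then anything up to whitespace, quote or bracket.
_URL = re.compile(r"https?://[^\s<>\"]+")


def autolink(text: str) -> str:
    """Turn bare URLs into links, leaving hand-written markup untouched."""
    parts = _SKIP.split(text)
    # With one capture group, even indexes hold the text outside markup.
    for i in range(0, len(parts), 2):
        parts[i] = _URL.sub(
            lambda m: f'<a href="{m.group(0)}">{m.group(0)}</a>', parts[i]
        )
    return "".join(parts)


print(autolink("see http://openacs.org now"))
print(autolink('<a href="http://openacs.org">home</a>'))
```

Because the href attribute sits inside a skipped span, a manual link is never linked a second time.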

What should the default option be named? "Simple HTML"? "Rich Text"? I'm actually thinking that "Simple HTML" might be the best way to go.



16: Re: Rich text input (response to 1)
Posted by Lars Pind on
Actually, Mohan is saying something clever here.

We really want people to think about it as plain text with some extra frills (like bold text). Not as simplified HTML.

So how about "Smart Text"? Or "Enhanced Text"?


17: Re: Rich text input (response to 16)
Posted by Tilmann Singer on
I would like to back Roger's suggestion for automatic rendering of unordered lists, but more in a wiki fashion, so that

* some item
* another item
* last one

is rendered as

  • some item
  • another item
  • last one

I don't think that would result in unexpected formatting when added to the Rich/Smart/Enhanced text widget. Any objections? If not I'll gladly add it when the widget is available.

Regarding the form builder integration: that's great. I hope this will result in an acs-subsite (kernel?) parameter where the admin can define the default widgets that are available plus the ability for the programmer to override it in particular cases with options passed to the widget. (Just bluntly requesting features here ...)

19: Re: Rich text input (response to 1)
Posted by Malte Sussdorff on
Will it be possible to set parameters that control which input styles are available? Furthermore, once Lars has finished his work (and thanks for getting it started), will it be possible to add other input styles (like a wiki or the style used in ezboard or whatever else we might think of)? I think good documentation on how to do this, going a little beyond the ideas exchanged here, would help tremendously for future work in this area.
20: Re: Rich text input (response to 1)
Posted by Lars Pind on
I've added this code to HEAD.

There's a new datatype 'richtext', and a new widget 'richtext', which provides four formats, namely:


These are currently hard-coded in both acs-tcl/tcl/text-html-procs.tcl and acs-templating/tcl/richtext-procs.tcl. It would be relatively easy to make a service contract or some other form of API for adding new methods, but I didn't do it, because I'm more concerned about raising usability for users today than adding fancy options for admins :) Other people are more than welcome to add this, of course.

Also, I've added corresponding mime types to the content repository create scripts, so we can get packages that store data in the CR using this as well.

Available tasks for anyone who feels up to the job: Fix forums, general-comments, wimpy, faq, and whatever other applications we have to use form builder and the richtext widget/datatype.


21: Re: Rich text input (response to 1)
Posted by Ola Hansson on
Pretty cool, Lars!

I have two small complaints, though (aka feedback):

1) The richtext widget is treated as optional even though the "optional"  element attribute is not specified...

2) The "underline" tag is by default one of the unaccepted tags, and maybe for a reason, too (underlined text is easily mistaken for a link).

I propose that we either allow the <u> tag by default in the toolkit, or that you drop that button.
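
For illustration, a check of this kind can be sketched in Python with the standard library's `html.parser`. The allowed-tag set below is made up for the example and is not the toolkit's actual list, and the names `ALLOWED_TAGS` and `disallowed_tags` are hypothetical.

```python
from html.parser import HTMLParser

# Illustrative whitelist only; the real list lives in the toolkit's settings.
ALLOWED_TAGS = {"a", "b", "i", "p", "br", "em", "strong", "blockquote", "pre"}


class _TagCollector(HTMLParser):
    """Records the name of every opening tag it sees (lowercased by the parser)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def disallowed_tags(html_text: str, allowed=ALLOWED_TAGS) -> list:
    """Return the sorted names of tags in html_text that are not allowed."""
    collector = _TagCollector()
    collector.feed(html_text)
    collector.close()
    return sorted({t for t in collector.tags if t not in allowed})


print(disallowed_tags("<b>ok</b> <u>no</u> <form>"))
```

With a whitelist like this, a [U] button would generate markup that the submission check then rejects, which is the inconsistency described above.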


22: Re: Rich text input (response to 1)
Posted by Lars Pind on
Thanks, Ola, these problems have been fixed now (took out the U button, underline sucks, anyway).

I'd also forgotten to add the gifs used for the buttons. They're in acs-subsite/www/shared.

Perhaps they should be somewhere else?


23: Re: Rich text input (response to 1)
Posted by Don Baccus on
Ahhh ... the first is a bug by accepted form builder semantics  (just in case there's any doubt) ... fields are required unless declared optional.
24: Re: Rich text input (response to 1)
Posted by Staffan Hansson on
Lars, maybe there could be a "cite" button, combining <blockquote> and <i> (perhaps wrapping the text within quotation marks as well). This would be neat, and would standardize the look of quotes.
28: Re: Rich text input (response to 1)
Posted by Lars Pind on
What does that mean, to use the formwidget inline?
29: Re: Rich text input (response to 1)
Posted by xx xx on
What I mean is using a formwidget within a sentence or paragraph like this:


Hello, I am <formwidget id="">. My favorite link is <formwidget id="">....etcetera.

<if $admin_p><input type="submit" name="formbutton:c_1" value="change"></if>


30: Re: Rich text input (response to 1)
Posted by Lars Pind on
Hm. Not a very common way to use a textarea widget, it would seem to me.

Why don't you just cut out those two lines for now?