Forum OpenACS Development: Thread safety of OpenACS vs RoR et al.

1: Thread safety of OpenACS vs RoR et al.

Posted by Stan Kaufman on 09/05/06 10:50 PM

One of the compelling attributes of OpenACS compared to LAMP/RoR/virtually anything else is the thread safety that comes from AOLserver and Tcl. There's an interesting discussion about this right now in the AOLServer list, notably this post:

Of course, the more popular languages weren't implemented with
embeddability (embedibility?) in mind.  Tcl is one of the only mature
languages that can be safely embedded in a multi-threaded application.

Specifically, what I mean is, Tcl is one of a few languages that can
have multiple interpreters in the same process space executing
concurrently in multiple threads.

Python can be embedded, but according to this:

    http://docs.python.org/api/api.html

    8.1 Thread State and the Global Interpreter Lock
    http://docs.python.org/api/threads.html

    "The Python interpreter is not fully thread safe. In order to
    support multi-threaded Python programs, there's a global lock that
    must be held by the current thread before it can safely access
    Python objects. [...]"

A global lock?  There goes scalability and the benefits of embedding
Python.  (You'd be better off running it in a separate process ala
FastCGI, AOLserver would gain no performance by embedding it, only
risk.)

Same problem with Ruby:

    http://metaeditor.sourceforge.net/embed/

    "ruby is not thread-safe. If you embed ruby into a multi-threaded
    application then you will need to semaphore protect all access to
    ruby. Warning: Insufficient semaphore protection can be hard
    debuggable."

So, they're pointing out the need for a global lock, too.  What's worse,
the Ruby interpreter uses global variables to manage interpreter state,
so even if you did embed it and use a global lock, you couldn't spin up
two instances of the interpreter in the same process, anyway!

PHP isn't even worth considering, really:

    Re: [PHP-DEV] Re: PHP and Apache2
    http://marc.theaimsgroup.com/?l=php-dev&m=108736540021355&w=2

    "The major feature that draws people to Apache2 is threading. [...]
    However, on UNIX there are a lot of basic libraries where thread
    safety is an unknown.  [...]

    And yes, you could use the prefork mpm with Apache2 to avoid the
    threading, and yes you could use a standalone fastcgi mechanism to
    avoid the threading, but you are going out of your way to avoid the
    defining characteristic of the web server you have decided to use."

Rasmus makes it very clear that the reason PHP is so popular and
powerful is its large list of third-party extensions that add a lot of
"out of the box" functionality for PHP developers.  However, it's that
same attribute of PHP that keeps it from playing nicely, embedded in a
multi-threaded application.

Let's not forget this, too:

    http://developers.slashdot.org/developers/06/07/28/0552240.shtml

    "Jani Taskinen, one of the lead developers of the Zend Engine (the
    engine that powers PHP), as well as a lead developer for the thread
    safety system and other core components of the PHP project, has quit
    in a relatively cryptic message to the php-internals mailing list.
    Jani has been involved with PHP for about 6 years and his loss will
    undoubtedly be a big blow for the PHP project."

It'll be interesting to see where the future of PHP takes it.

Now, let's look at Perl:

    http://search.cpan.org/dist/perl/pod/perlembed.pod

    "Now suppose we have more than one interpreter instance running at
    the same time. This is feasible, but only if you used the Configure
    option -Dusemultiplicity or the options -Dusethreads -Duseithreads
    when building perl. [...]

    Using -Dusethreads -Duseithreads rather than -Dusemultiplicity is
    more appropriate if you intend to run multiple interpreters
    concurrently in different threads, because it enables support for
    linking in the thread libraries of your system with the
    interpreter."

Oh, nice.  I don't see any mention of using a globally shared lock to
prevent two Perl interpreters running concurrently.  (This could just be
an omission from the docs, but I have the feeling it's not the case.)

So, of the "languages more popular than Perl" being Python, Ruby, PHP
and Perl ... Perl is the only suitable candidate for replacing Tcl in
AOLserver.

Considering that Perl is no longer the "flavor of the day" and has been
replaced by Python and Ruby amongst the "language geeks" and PHP amongst
the unwashed masses ... who really wants to go replacing AOLserver's
embedded Tcl with embedded Perl?  How many months or years before people
see Perl like they see Tcl today (not popular, etc.)?

The irony here is so few people have even heard of Tcl that if it was
evangelized and marketed correctly, it could be brought to the
marketplace as a "new" language to replace Ruby and Python and PHP.
Write a slew of books, release a mound of free Tcl code and add simple
enhancements to Tcl like closures/lambdas and other language features
that'll make the language geeks go "ooh!" and the unwashed masses will
download all the free code they need ...

... and then the cycle starts all over again.

After having said all this, I would like to explore the idea of
Server-side JavaScript (SSJS):

    http://en.wikipedia.org/wiki/Server-side_JavaScript

Netscape took a shot at this with its LiveWire stuff back in
1996.  Now that it's 10 years later, could it be the right time to try
again?  (I wouldn't try to re-implement LiveWire, but use what we've
learned about embedding a scripting language inside a webserver and use
JS as the language.)

    SpiderMonkey: JavaScript C Engine Embedder's Guide
    http://developer.mozilla.org/en/docs/JavaScript_C_Engine_Embedder%27s_Guide

    "The run time is the space in which the variables, objects, and
    contexts used by your application are maintained. A context is the
    script execution state for a thread used by the JS engine. Each
    simultaneously existent script or thread must have its own context.
    A single JS run time may contain many contexts, objects, and
    variables."

I'd like to work on an AOLserver module and ADP parser (extend the fancy
parser to recognize "") to embed SpiderMonkey, as a start.

I'd argue that there's more JavaScript expertise out there today than
there is Java, PHP, etc. because with the rise of RIA's and AJAX
techniques, regardless of what backend language you use, you still have
to use some amount of JavaScript.  To me, it makes sense to leverage
that same JS knowledge (and code!) on the server side, too.

IMHO, this change could put AOLserver back in the beauty pageant.

As always, I welcome any comments or criticisms or flaming of my
half-baked off-the-cuff email-to-the-whole-list ideas I presented here.
I know there's plenty of them ... 😊

-- Dossy

Here is more about trying to add threads to RoR in the Mongrel project:

Here's the things that blow up--even with this setting--unless there's
been some *major* improvements in how Rails was structured:

* Classes mysteriously "disappear" or aren't loaded.  Ruby's dynamic
class loading and instance_eval is not thread safe, so when multiple
threads load rails actions, models, and helpers the Dispatcher throws
random exceptions.
* Model objects used in one thread show up in another thread.  Last time
this was because there were problems with how shared connections were
being used between the different threads.
* Requests mysteriously pick up other threads' request input or output.
No idea why this was.  It's really really weird.
* High load causes database connections to explode exponentially into
the thousands.  This is caused by using Thread.current[] to store the DB
connection, which causes the rails application to connect one time for
each thread.
* Once connected, the connection doesn't go away, but the thread does.
This means that when each thread is created, connects, and then finishes
its response, the connection is kept around until the GC kills it.
This causes file descriptor leaks that eventually make the whole Mongrel
server stop functioning because the number of open files is greater than
the 1024 select() limit on most systems.
* Long running Mongrel servers start eating lots of memory or stop
running.  This is mostly caused by the connections per thread model not
being cleaned up, but even when this is cleaned out it's still not good
enough.  Files, other threads, popen calls, fork, and many other
resources are notoriously poorly managed by nearly every Rails
programmer.

I started working on fixing these problems way back in the SCGI days,
but the task was so daunting that I just gave up.  So, yes, your little
patch is very easy.  It probably won't work without a lot of extra
effort.

If you're interested in doing some hack-o-tronic work to make this
function, then you'll have to either rule out the above problems (with
more than one request with a browser) or find a way to implement fixes
outside of rails.  Some possible solutions:

1) Rip out anything left over in a Thread.current[] after the rails
request is done.
2) Find a way to mark what request/response/IO objects are assigned to
each thread, and then blow-up if they change during processing.  This
would most likely be a debugging option for people suspecting crossover.
3) Find a way to keep track of what a rails controller opens, and then
make sure it gets closed completely.  This'll be a real pain in
the ass since people love to open crap at random and not clean it up.
4) Do a complete pre-load when a rails application is started in
production mode so that all classes are properly loaded ahead of time
and you don't have to worry about Ruby's lack of thread safety here.

If you're up to the challenge, then give it a shot.  You'll want to
setup a test harness with your current code that thrashes the living
hell out of a real rails application for a long period of time and then
fix everything that comes up.

-- 
Zed A. Shaw
http://www.zedshaw.com/
http://mongrel.rubyforge.org/

Yikes. This seems like an important consideration to anyone building enterprise-class apps who is thinking of OpenACS vs RoR.

2: Re: Thread safety of OpenACS vs RoR et al. (response to 1)

Posted by Rocael Hernández Rizzardini on 09/06/06 05:44 AM

hey Stan, thanks for the post, I'm pretty sure people will find it useful, plus not everyone follows the aolserver lists.

3: Re: Thread safety of OpenACS vs RoR et al. (response to 1)

Posted by Mark Aufflick on 10/29/06 02:42 PM

I started hacking together an nsruby, and then discovered exactly what you did.

There is a proof of concept for an nsperl by Jim Lynch on sourceforge:

http://sourceforge.net/projects/perl-aol

I've previously compiled and played with it. Probably too scary if your heart isn't up to merging the aolserver internals and the perl internals in your brain. Has definite promise though.

One of these days if I ever get to spend more time on aolserver based projects I will make some more of this - it would be so cool.

4: Re: Thread safety of OpenACS vs RoR et al. (response to 3)

Posted by Stan Kaufman on 10/29/06 06:20 PM

It's hard to know how to evaluate these thread issues; they certainly haven't dented the enthusiasm for RoR out there, and RoR certainly has much otherwise to commend it.

5: Re: Thread safety of OpenACS vs RoR et al. (response to 4)

Posted by Andrew Piskorski on 10/31/06 08:50 PM

Actually, it's fairly straightforward to evaluate these thread issues:

If you want to use a programming language with AOLserver, or with any other genuinely multi-threaded program, and you want to use it in a general purpose, fully functional way, then you definitely don't want to use Ruby. Nor R or Python, for that matter. What you want is a fully thread-safe and reentrant language implementation, period.

If on the other hand, your program is strictly single threaded, or perhaps also if you are willing to use said programming language from one thread only and you are willing to ad-hoc figure out how to do so safely for your particular non-thread-safe language, then the doors are wide open - use Ruby, Python, or whatever else you're willing to hack in, and make work.

And that "one thread in my multi-threaded program" approach will involve trial and error on the thread-safety bits, it is not a matter of just following some simple recipe.

(Btw, the reason is somewhat deep: All current threading libraries use mutex locks, and locks do not compose. Transactions do. This is basically why transactions are awesome and locks suck. But I digress...)
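
To make "locks do not compose" concrete, here is a minimal Python sketch. The `Account` class and both transfer functions are hypothetical, invented for this illustration. Each account operation is individually lock-protected, yet gluing two of them together is not atomic. Fixing that requires the caller to reach inside both objects and agree on a lock order.

```python
import threading


class Account:
    """A balance whose individual operations are each protected by a lock."""

    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

    def withdraw(self, amount):
        with self.lock:
            self.balance -= amount

    def deposit(self, amount):
        with self.lock:
            self.balance += amount


def naive_transfer(src, dst, amount, observe=None):
    """Two atomic steps glued together: the pair is NOT atomic."""
    src.withdraw(amount)
    if observe is not None:
        observe()  # another thread running here would see money missing
    dst.deposit(amount)


def locked_transfer(src, dst, amount):
    """Atomic with respect to any other code that takes both locks.

    The cost: the caller must know about every lock involved and pick a
    global order for them (here: by id()) to avoid deadlock.
    """
    first, second = sorted((src, dst), key=id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount
```

The `observe` hook just makes the intermediate state visible without relying on thread timing. With a transaction, the withdraw and deposit would commit together and no such window would exist.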

Practically speaking, I think the above means that you might want to use something like R or Ruby from AOLserver for some one special purpose, which you know ahead of time. Typically, you have an existing special purpose code library, and you want to call it from your web server - cool. You might want to embed it, you might want to just fork off scripts as needed, I suppose it depends on the exact use case.

But given the limitations above, I can't see it ever making practical sense to embed a non-multithreaded language implementation into AOLserver for general purpose use. (Tcl is AOLserver's principal general purpose language.)

In the case of, say, AOLserver and Ruby, the dark horse way around all the limitations above might be to change all of AOLserver underneath to use multiple processes and shared memory rather than threads. At the Tcl level, everything would look the same. At least some of the C code underneath would definitely see differences, though. Multi-process AOLserver should certainly be possible to do, somehow, but I've no basis for even guessing how much hacking it would require, nor what performance problems or other difficulties might crop up. Oh, and it probably would never work right on MS Windows at all, only Unixes.

I doubt you want to go there with AOLserver. Neat hack, and it might be the easier route, but it'd probably be more fun to "just" write your own single-threaded web server in Ruby, or implement your own multi-threaded version of Ruby for AOLserver. :)

Now, Zoran Vasiljevic, creator of the AOLserver-inspired Tcl Threads Extension, has mentioned several times how he would like to add complete multi-process support to the Threads Extension, with an API essentially identical to the current multi-threaded one. He's just never done it because he couldn't come up with any practical need for it in his work. I bet you won't either. :)

Btw, supposedly, way back in the mid 1990's when the Navisoft guys were first writing AOLserver, Tcl was not thread safe. But, due to its clean implementation, intended for easy embedding in larger C programs, the Naviserver guys were able to make Tcl multi-threaded quickly and easily. Presumably that's one of the single biggest reasons why Tcl was ever used in AOLserver at all. (That's all hearsay, but you can ask Jim Davidson if you want to know for sure.)

Clearly Python and Ruby were not so much intended for easy embedding in other programs, and certainly not in multi-threaded programs. I think the creators of each have publicly said as much, too.

6: Re: Thread safety of OpenACS vs RoR et al. (response to 5)

Posted by Stan Kaufman on 11/01/06 07:11 PM

Andrew, thanks for your Deep Plumbing analysis. I was musing on the rather more superficial question of whether -- for the sorts of community-oriented web apps that most people build with OpenACS or RoR -- the threading issues are really important. On one hand, it would appear that using a completely thread-safe environment would be superior, but yet it also is clear that many/most people are using non-thread-safe stacks without evidence of disaster. So what do we make of this?

7: Re: Thread safety of OpenACS vs RoR et al. (response to 6)

Posted by Andrew Piskorski on 11/01/06 10:18 PM

Oh, I don't think your, "should I make my app multi-threaded or not?" question is at all superficial. Note that I never addressed that question above at all - that was on purpose. I didn't address it because it's mostly orthogonal to the issues discussed above, and because it's more contentious, and I'm less sure of the right answer.

Achieving at least the illusion of concurrency is important for many apps. (GUIs, moderate traffic web servers, etc.) Currently, the main ways to get it are either various flavors of single-cpu user threads or event-driven code.

Achieving real concurrency - multiple threads of execution running at the same time on multiple processor cores - is less widely needed - but if you need it, you may really, really need it. The trick is that you may not know ahead of time whether you need it or not. ("Oh awesome, my web service just got hugely popular! Oh shit, my web service just got hugely popular...")

(Note also that there are many different - some wildly different - approaches in the basic multiple-threads-of-execution bin. User threads, OS threads, processes with shared memory, programs on different physical boxes exchanging messages with MPI or with TCP/IP sockets, etc.)

In practice I've been quite happy with Tcl's (and AOLserver's) high-level apartment thread model built on top of real, OS-driven POSIX threads. So I am biased in that direction. (Note though that these can be fairly heavyweight threads, often much heavier than the basic POSIX threads underneath.) There are definitely other concurrency models in real world production use worth checking out, though. Erlang is one of those, there are no doubt others.

Also, there is definitely ongoing, much needed research on better models and tools for concurrent programming. (E.g., there seem to be academic folks seriously working on replacing locks with transactions. Also lots of lower level stuff, trying to make efficient use of "weird" cpu architectures like graphics cards and IBM's Cell, etc.) My bet is that 15 years from now, we'll be using concurrent programming tools quite a lot different than the POSIX threads, MPI-based message passing, etc. that we're using now.

But to get back to the real world of the here and now...

Code that is thread-safe and reentrant is always a plus, because it gives you the option of using it in a multi-threaded environment, even if you don't really care to at the moment. But naturally, software development has trade-offs. Some developers choose to punt on thread-safety in favor of spending their time elsewhere. Whether that's a wise trade-off or not in their particular case, I don't know.

I do know I would try to avoid having to make that trade-off. I tend to write most of my code in a re-entrant style even if I'm using it single-threaded. ("Repeat after me class, global variables are EEEEEvil.")
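
A tiny Python sketch of the difference, with hypothetical function names. The first version keeps its state in a module-level global, so two independent callers share (and corrupt) one running total. The second takes all its state as arguments and returns the result, so it is safe to call from any number of threads or interpreters.

```python
# Non-reentrant: the running total lives in a module-level global, so two
# logically separate callers end up sharing one piece of state.
_total = 0


def add_global(x):
    global _total
    _total += x
    return _total


# Re-entrant: all state is passed in and returned; nothing is shared.
def add_reentrant(total, x):
    return total + x
```

If caller A runs `add_global(1)` and caller B then runs `add_global(10)`, B gets back a total that includes A's contribution, whereas `add_reentrant(0, 1)` and `add_reentrant(0, 10)` stay independent.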

Multi-threaded Tcl and AOLserver do work very well. This is not so much because they just "use threads", but because they use POSIX threads in a rather well thought out way, and provide lots of nice APIs to make doing the right thing fairly easy, and difficult things possible.

8: Re: Thread safety of OpenACS vs RoR et al. (response to 7)

Posted by Stan Kaufman on 11/01/06 11:10 PM

At the risk of revealing how shallow my Pool of Understanding is on all this ("Dammit, Jim, I'm an app developer not a systems engineer!"), to what extent does a typical OpenACS installation use the multi-threaded facilities that Tcl and AOLServer make possible (or is this a function of loading)? None of the package code appears to have any explicit design decisions based on the question of "should this be multi-threaded or not?". Is that because OpenACS is intrinsically multi-threaded and concurrency simply comes for free with OpenACS?

9: Re: Thread safety of OpenACS vs RoR et al. (response to 8)

Posted by Andrew Piskorski on 11/01/06 11:49 PM

Stan, yes, that's basically correct. The multi-threading largely comes for free: for typical OpenACS app development, you don't need to think much about it, and the parts you do need to think about have a gentle learning curve.

Note that you probably do need to think about concurrency in the way you write your SQL insert or update statements - that's got nothing to do with AOLserver nor Tcl per se. But Oracle and PostgreSQL both have MVCC and Transactions, which help enormously in dealing with database concurrency.

AOLserver uses multi-threading implicitly and rather pervasively. However, as an OpenACS or AOLserver/Tcl programmer you normally don't need to think about this. This is the beauty of the AOLserver and Tcl thread model and APIs vs. nasty low-level stuff like the POSIX thread APIs.

Note that Tcl's threading model is default shared-nothing. (While the POSIX threads underneath are default shared everything. The C code deals with that.)

In AOLserver each page hit runs in its own connection thread with its own Tcl interpreter. (The actual architecture is a bit more complex than that, but that statement is true as far as it goes.) You program what each page does essentially as if it was a single threaded application. When you as the programmer use nsv_get, nsv_set, or ad_schedule_proc, you are using AOLserver's multi-threaded services, but typically there's nothing special you need to do because of that. Each single nsv_* operation is atomic, the mutex locking is done automatically underneath.

With OpenACS you would typically not use lower level Tcl multi-threading stuff like ns_mutex or ns_cond - but those features are there if you need them.

Now, say you want to read the value 5 from an nsv, increment it by 1, and then write it back. Or more realistically, use nsv_get to read a Tcl list, lappend to it, and then use nsv_set to write it back. Ah, now you do need to think about threads and concurrency, at least a little bit. Another thread could be reading or writing that nsv at the same time, and correct operation depends on a sequence of two nsv_* commands being atomic, not just one. So you need to manually protect access to the nsv with a mutex lock. Or better yet, upgrade to Zoran's Tcl Threads Extension which gives you an atomic tsv::lappend command, etc.
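
The same read-modify-write pattern can be sketched in Python. This is an analogy, not AOLserver code: `SharedArray` is a hypothetical stand-in for an nsv array whose single get and set operations are each atomic, and `counter_mutex` plays the role of the explicit mutex you would add around the two-step sequence.

```python
import threading


class SharedArray:
    """Rough stand-in for an nsv array: each single get/set is atomic."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data[key]

    def set(self, key, value):
        with self._lock:
            self._data[key] = value


# Explicit mutex guarding the get-then-set sequence as a whole.
counter_mutex = threading.Lock()


def increment(shared, key):
    """Read-modify-write: the outer mutex makes the get+set pair atomic."""
    with counter_mutex:
        shared.set(key, shared.get(key) + 1)


def run(n_threads=8, per_thread=1000):
    """Hammer one counter from several threads and return its final value."""
    shared = SharedArray()
    shared.set("hits", 0)

    def worker():
        for _ in range(per_thread):
            increment(shared, "hits")

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.get("hits")
```

Without `counter_mutex`, two threads could both read the same old value and both write back old + 1, losing an update even though each individual `get` and `set` is atomic.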

11: Re: Thread safety of OpenACS vs RoR et al. (response to 5)

Posted by Andrew Piskorski on 11/04/06 01:58 PM

There are also other techniques that fall in between the two extremes of, "embed this programming language in-process in AOLserver, just like Tcl" and "just fork off a CGI or shell script for each call".

In particular, ns_proxy and FastCGI. I've never used either one, and I'm not clear on how much, if any, of the AOLserver API they let you use. But if you have a single-threaded programming language that you want to use with AOLserver (or with a similar multi-threaded server), then those two approaches are probably both worth investigating.

12: Re: Thread safety of OpenACS vs RoR et al. (response to 11)

Posted by Stan Kaufman on 11/16/06 07:09 PM

Apparently a Ruby interpreter is only 20-100K in size, so with FastCGI and reasonable hardware, you can spawn a lot of them without breaking a sweat. So the threaded-vs-forked issue really isn't much of an issue in most web apps.

13: Re: Thread safety of OpenACS vs RoR et al. (response to 12)

Posted by Andrew Piskorski on 11/16/06 07:29 PM

For performance, sure, FastCGI should be good. But I've never yet seen a good explanation of the programming power (or lack thereof) of the FastCGI approach (and I've never had a good reason to try it out myself). John Sequeira would probably know.

14: Re: Thread safety of OpenACS vs RoR et al. (response to 13)

Posted by John Sequeira on 11/16/06 11:06 PM

The benefits of FastCGI are mainly simplicity, robustness and portability at the expense of memory use (ie. the multiprocess tax).

You don't have to worry about library thread-safety since you just have a bunch of single-threaded processes. As a result, you can easily do things like run your fcgi processes in a debugger ( I used Komodo + TclPro way back when). I know it's possible to do this in AOLServer, but it's more complicated than hitting the "go" button.

FastCGI's out-of-process architecture is also a big win when interfacing with other systems. If any of your external system library code leaks memory or has long-term stability issues you can easily cull processes after a certain number of requests. Not elegant, but when you have code you can't or don't want to maintain that has issues, simply giving it a short halflife works wonders and is not really available as an option in webserver-integrated APIs.
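
The culling policy itself is simple. The sketch below models it in-process in Python. It is not a real FastCGI process manager: `Worker` and `RecyclingSupervisor` are hypothetical names, and a `Worker` object merely stands in for one single-threaded FastCGI process.

```python
class Worker:
    """Hypothetical stand-in for one single-threaded FastCGI process."""

    _next_id = 0

    def __init__(self):
        Worker._next_id += 1
        self.worker_id = Worker._next_id
        self.served = 0  # requests handled by this worker so far

    def handle(self, request):
        self.served += 1
        return f"worker {self.worker_id} handled {request}"


class RecyclingSupervisor:
    """Replace the worker once it has served max_requests requests."""

    def __init__(self, max_requests):
        self.max_requests = max_requests
        self.worker = Worker()
        self.recycled = 0  # how many workers have been retired

    def dispatch(self, request):
        response = self.worker.handle(request)
        if self.worker.served >= self.max_requests:
            # A real process manager would let the process exit here and
            # spawn a fresh one, discarding any leaked memory with it.
            self.worker = Worker()
            self.recycled += 1
        return response
```

With `max_requests=3`, seven requests go through three workers: two are retired after three requests each and the third is still serving.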

On the flip side, of course, not being integrated means that you have access to no AOLServer (or apache) APIs, just like in CGI. That may or may not be an issue, but I've found it not terribly difficult to emulate the APIs I've needed ( moving ns_schedule* to cron, reimplementing filters).

Going server-api-agnostic means you get to run everywhere - even Microsoft has gotten the fastcgi bug and will be supporting an ISAPI module (it's in beta.) Even without explicit vendor fcgi support, the cgi-fcgi bridge works great now on every webserver that supports CGI. I'm using it in production at many sites.

I have some small issues with FastCGI, but it has withstood the test of time and continues to do what it was designed to do over a century ago (in web time). Like openacs 😊.

10: Re: Thread safety of OpenACS vs RoR et al. (response to 1)

Posted by Robert Taylor on 11/03/06 01:31 AM

wow.

this thread answered a whole range of 'language choice' level questions in one fell swoop ... and then some.

my brain hurts.

thank you for these amazing explanations ... truly truly helpful.