Forum OpenACS Development: Re: Thread safety of OpenACS vs RoR et al.

Posted by Andrew Piskorski on
Actually, it's fairly straightforward to evaluate these thread issues:

If you want to use a programming language with AOLserver, or with any other genuinely multi-threaded program, and you want to use it in a general purpose, fully functional way, then you definitely don't want to use Ruby. Nor R or Python, for that matter. What you want is a fully thread-safe and reentrant language implementation, period.

If on the other hand, your program is strictly single threaded, or perhaps also if you are willing to use said programming language from one thread only and you are willing to ad-hoc figure out how to do so safely for your particular non-thread-safe language, then the doors are wide open - use Ruby, Python, or whatever else you're willing to hack in, and make work.

And that "one thread in my multi-threaded program" approach will involve trial and error on the thread-safety bits; it is not a matter of just following some simple recipe.

(Btw, the reason is somewhat deep:  All current threading libraries use mutex locks, and locks do not compose. Transactions do. This is basically why transactions are awesome and locks suck. But I digress...)
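
To make the "locks do not compose" remark concrete, here is a minimal Python sketch. The Account class and transfer_steps helper are hypothetical names made up for illustration. Each method is atomic on its own, yet a transfer built from two of them exposes an intermediate state that another thread could observe. The generator pauses between the two steps only so that state can be inspected deterministically.

```python
import threading

class Account:
    """Hypothetical example class: every method is atomic by itself."""
    def __init__(self, balance):
        self._lock = threading.Lock()
        self._balance = balance

    def withdraw(self, amount):
        with self._lock:
            self._balance -= amount

    def deposit(self, amount):
        with self._lock:
            self._balance += amount

    def balance(self):
        with self._lock:
            return self._balance

def transfer_steps(src, dst, amount):
    """Compose two atomic operations. The yield marks the gap in which
    another thread could run and see money missing from both accounts."""
    src.withdraw(amount)
    yield
    dst.deposit(amount)

a, b = Account(100), Account(0)
steps = transfer_steps(a, b, 30)
next(steps)                                # withdraw done, deposit not yet
mid_total = a.balance() + b.balance()      # what an observer would see
for _ in steps:                            # finish the transfer
    pass
end_total = a.balance() + b.balance()
print(mid_total, end_total)
```

Making the transfer atomic with locks means taking both account locks in some agreed order, which the two methods cannot arrange by themselves. A transaction wrapping both steps would not have that problem.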

Practically speaking, I think the above means that you might want to use something like R or Ruby from AOLserver for some one special purpose, which you know ahead of time. Typically, you have an existing special purpose code library, and you want to call it from your web server - cool. You might want to embed it, you might want to just fork off scripts as needed, I suppose it depends on the exact use case.

But given the limitations above, I can't see it ever making practical sense to embed a non-multithreaded language implementation into AOLserver for general purpose use. (Tcl is AOLserver's principal general purpose language.)

In the case of, say, AOLserver and Ruby, the dark horse way around all the limitations above might be to change all of AOLserver underneath to use multiple processes and shared memory rather than threads. At the Tcl level, everything would look the same. At least some of the C code underneath would definitely see differences, though. Multi-process AOLserver should certainly be possible to do, somehow, but I've no basis for even guessing how much hacking it would require, nor what performance problems or other difficulties might crop up. Oh, and it probably would never work right on MS Windows at all, only Unixes.

I doubt you want to go there with AOLserver. Neat hack, and it might be the easier route, but it'd probably be more fun to "just" write your own single-threaded web server in Ruby, or implement your own multi-threaded version of Ruby for AOLserver. :)

Now, Zoran Vasiljevic, creator of the AOLserver-inspired Tcl Threads Extension, has mentioned several times how he would like to add complete multi-process support to the Threads Extension, with an API essentially identical to the current multi-threaded one. He's just never done it because he couldn't come up with any practical need for it in his work. I bet you won't either. :)

Btw, supposedly, way back in the mid 1990's when the Navisoft guys were first writing AOLserver, Tcl was not thread safe. But, due to its clean implementation, intended for easy embedding in larger C programs, the Naviserver guys were able to make Tcl multi-threaded quickly and easily. Presumably that's one of the single biggest reasons why Tcl was ever used in AOLserver at all. (That's all hearsay, but you can ask Jim Davidson if you want to know for sure.)

Clearly Python and Ruby were not so much intended for easy embedding in other programs, and certainly not in multi-threaded programs. I think the creators of each have publicly said as much, too.

Posted by Stan Kaufman on
Andrew, thanks for your Deep Plumbing analysis. I was musing on the rather more superficial question of whether -- for the sorts of community-oriented web apps that most people build with OpenACS or RoR -- the threading issues are really important. On one hand, it would appear that using a completely thread-safe environment would be superior, but yet it also is clear that many/most people are using non-thread-safe stacks without evidence of disaster. So what do we make of this?
Posted by Andrew Piskorski on
Oh, I don't think your, "should I make my app multi-threaded or not?" question is at all superficial. Note that I never addressed that question above at all - that was on purpose. I didn't address it because it's mostly orthogonal to the issues discussed above, and because it's more contentious, and I'm less sure of the right answer.

Achieving at least the illusion of concurrency is important for many apps. (GUIs, moderate traffic web servers, etc.) Currently, the main ways to get it are either various flavors of single-CPU user threads or event-driven code.

Achieving real concurrency - multiple threads of execution running at the same time on multiple processor cores - is less widely needed - but if you need it, you may really, really need it. The trick is that you may not know ahead of time whether you need it or not. ("Oh awesome, my web service just got hugely popular! Oh shit, my web service just got hugely popular...")

(Note also that there are many different - some wildly different - approaches in the basic multiple-threads-of-execution bin. User threads, OS threads, processes with shared memory, programs on different physical boxes exchanging messages with MPI or with TCP/IP sockets, etc.)

In practice I've been quite happy with Tcl's (and AOLserver's) high-level apartment thread model built on top of real, OS-driven POSIX threads. So I am biased in that direction. (Note though that these can be fairly heavyweight threads, often much heavier than the basic POSIX threads underneath.) There are definitely other concurrency models in real world production use worth checking out, though. Erlang is one of those, there are no doubt others.

Also, there is definitely ongoing, much needed research on better models and tools for concurrent programming. (E.g., there seem to be academic folks seriously working on replacing locks with transactions. Also lots of lower level stuff, trying to make efficient use of "weird" cpu architectures like graphics cards and IBM's Cell, etc.) My bet is that 15 years from now, we'll be using concurrent programming tools quite a lot different than the POSIX threads, MPI-based message passing, etc. that we're using now.

But to get back to the real world of the here and now...

Code that is thread-safe and reentrant is always a plus, because it gives you the option of using it in a multi-threaded environment, even if you don't really care to at the moment. But naturally, software development has trade-offs. Some developers choose to punt on thread-safety in favor of spending their time elsewhere. Whether that's a wise trade-off or not in their particular case, I don't know.

I do know I would try to avoid having to make that trade-off. I tend to write most of my code in a re-entrant style even if I'm using it single-threaded. ("Repeat after me class, global variables are EEEEEvil.")
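
As a rough illustration of what "re-entrant style" buys you even in single-threaded code, here is a small Python sketch. Both function names are hypothetical. The first keeps its position in a module-level global (much like C's strtok), the second takes its state as an argument and hands it back.

```python
# Non-reentrant: hidden module-level state shared by every caller.
_rest = ""

def first_token_global(text=None):
    """Return the next space-separated token; passing text restarts parsing."""
    global _rest
    if text is not None:
        _rest = text
    token, _, _rest = _rest.partition(" ")
    return token

# Re-entrant: all state is passed in and returned, nothing is shared.
def first_token(rest):
    """Return (token, remaining text) without touching any global."""
    token, _, rest = rest.partition(" ")
    return token, rest

# Two interleaved parses with the global version: the second clobbers the first.
first_token_global("a b c")            # "a", remembers "b c"
first_token_global("x y")              # "x", overwrites the remembered text
clobbered = first_token_global()       # continues "x y", not "a b c"

# The same interleaving with the re-entrant version keeps both parses intact.
_, rest1 = first_token("a b c")
_, rest2 = first_token("x y")
intact, rest1 = first_token(rest1)
print(clobbered, intact)
```

The re-entrant version can be called from several threads, or from nested calls, without any locking, because there is nothing shared to protect.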

Multi-threaded Tcl and AOLserver do work very well. This is not so much because they just "use threads", but because they use POSIX threads in a rather well thought out way, and provide lots of nice APIs to make doing the right thing fairly easy, and difficult things possible.

Posted by Stan Kaufman on
At the risk of revealing how shallow my Pool of Understanding is on all this ("Dammit, Jim, I'm an app developer not a systems engineer!"), to what extent does a typical OpenACS installation use the multi-threaded facilities that Tcl and AOLServer make possible (or is this a function of loading)? None of the package code appears to have any explicit design decisions based on the question of "should this be multi-threaded or not?". Is that because OpenACS is intrinsically multi-threaded and concurrency simply comes for free with OpenACS?
Posted by Andrew Piskorski on
Stan, yes, that's basically correct. The multi-threading largely comes for free. For typical OpenACS app development, you don't need to think much about it, and the parts you do need to think about have a gentle learning curve.

Note that you probably do need to think about concurrency in the way you write your SQL insert or update statements - that's got nothing to do with AOLserver nor Tcl per se. But Oracle and PostgreSQL both have MVCC and Transactions, which help enormously in dealing with database concurrency.
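
A minimal sketch of the kind of SQL-level concurrency issue meant here, using Python's built-in sqlite3 module purely as a stand-in for Oracle or PostgreSQL. The page_hits table and its columns are hypothetical, and the "second session" is simulated on the same connection so the outcome is deterministic.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_hits (page TEXT PRIMARY KEY, hits INTEGER NOT NULL)")
conn.execute("INSERT INTO page_hits (page, hits) VALUES ('home', 5)")

def read_hits():
    return conn.execute("SELECT hits FROM page_hits WHERE page = 'home'").fetchone()[0]

# Racy pattern: read in one statement, write the computed value in another.
stale = read_hits()                                                        # session A reads 5
conn.execute("UPDATE page_hits SET hits = hits + 1 WHERE page = 'home'")   # session B bumps it to 6
conn.execute("UPDATE page_hits SET hits = ? WHERE page = 'home'",
             (stale + 1,))                                                 # session A writes 6 again
after_racy = read_hits()       # two increments happened, only one survived

# Safer pattern: let the database do the read-modify-write in one statement.
conn.execute("UPDATE page_hits SET hits = hits + 1 WHERE page = 'home'")
conn.execute("UPDATE page_hits SET hits = hits + 1 WHERE page = 'home'")
conn.commit()
after_atomic = read_hits()
print(after_racy, after_atomic)
```

When the new value cannot be expressed in a single statement, the usual alternative is to do the read and the write inside one transaction and lock the row while reading (for example with SELECT ... FOR UPDATE in Oracle or PostgreSQL).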

AOLserver uses multi-threading implicitly and rather pervasively. However, as an OpenACS or AOLserver/Tcl programmer you normally don't need to think about this. This is the beauty of the AOLserver and Tcl thread model and APIs vs. nasty low-level stuff like the POSIX thread APIs.

Note that Tcl's threading model is default shared-nothing. (While the POSIX threads underneath are default shared everything. The C code deals with that.)

In AOLserver each page hit runs in its own connection thread with its own Tcl interpreter. (The actual architecture is a bit more complex than that, but that statement is true as far as it goes.) You program what each page does essentially as if it was a single threaded application. When you as the programmer use nsv_get, nsv_set, or ad_schedule_proc, you are using AOLserver's multi-threaded services, but typically there's nothing special you need to do because of that. Each single nsv_* operation is atomic; the mutex locking is done automatically underneath.

With OpenACS you would typically not use lower level Tcl multi-threading stuff like ns_mutex or ns_cond - but those features are there if you need them.

Now, say you want to read the value 5 from an nsv, increment it by 1, and then write it back. Or more realistically, use nsv_get to read a Tcl list, lappend to it, and then use nsv_set to write it back. Ah, now you do need to think about threads and concurrency, at least a little bit. Another thread could be reading or writing that nsv at the same time, and correct operation depends on a sequence of two nsv_* commands being atomic, not just one. So you need to manually protect access to the nsv with a mutex lock. Or better yet, upgrade to Zoran's Tcl Threads Extension which gives you an atomic tsv::lappend command, etc.
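
The get / lappend / set pattern described above can be sketched in Python as follows. This is only an analogy, not AOLserver code. SharedVars is a hypothetical stand-in for nsv storage in which each get or set is atomic by itself, and append_mutex plays the role of the mutex you would create manually.

```python
import threading

class SharedVars:
    """Hypothetical stand-in for nsv-style storage: each call is atomic."""
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, key):
        with self._lock:
            return list(self._data[key])     # hand back a private copy

    def set(self, key, value):
        with self._lock:
            self._data[key] = list(value)

store = SharedVars()
store.set("events", [])
append_mutex = threading.Lock()   # the extra lock around the two-step sequence

def safe_append(key, item):
    """Make get + append + set one atomic unit by holding append_mutex."""
    with append_mutex:
        items = store.get(key)
        items.append(item)
        store.set(key, items)

def worker(n):
    for i in range(200):
        safe_append("events", (n, i))

threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = len(store.get("events"))
print(total)
```

Without append_mutex, two threads could both read the same list, each append one item, and each write back, so one of the appends would be silently lost. An atomic append command provided by the storage layer removes the need for the extra lock.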

Posted by Andrew Piskorski on
There are also other techniques that fall in between the two extremes of, "embed this programming language in-process in AOLserver, just like Tcl" and "just fork off a CGI or shell script for each call".

In particular, ns_proxy and FastCGI. I've never used either one, and I'm not clear on how much, if any, of the AOLserver API they let you use. But if you have a single-threaded programming language that you want to use with AOLserver (or with a similar multi-threaded server), then those two approaches are probably both worth investigating.

Posted by Stan Kaufman on
Apparently a Ruby interpreter is only 20-100K in size, so with FastCGI and reasonable hardware, you can spawn a lot of them without breaking a sweat. So the threaded-vs-forked issue really isn't much of an issue in most web apps.
Posted by Andrew Piskorski on
For performance, sure, FastCGI should be good. But I've never yet seen a good explanation of the programming power (or lack thereof) of the FastCGI approach (and I've never had a good reason to try it out myself). John Sequeira would probably know.
Posted by John Sequeira on
The benefits of FastCGI are mainly simplicity, robustness and portability at the expense of memory use (ie. the multiprocess tax).

You don't have to worry about library thread-safety since you just have a bunch of single-threaded processes. As a result, you can easily do things like run your fcgi processes in a debugger ( I used Komodo + TclPro way back when). I know it's possible to do this in AOLServer, but it's more complicated than hitting the "go" button.

FastCGI's out-of-process architecture is also a big win when interfacing with other systems. If any of your external system library code leaks memory or has long-term stability issues you can easily cull processes after a certain number of requests. Not elegant, but when you have code you can't or don't want to maintain that has issues, simply giving it a short half-life works wonders and is not really available as an option in webserver-integrated APIs.

On the flip side, of course, not being integrated means that you have access to no AOLServer (or apache) APIs, just like in CGI. That may or may not be an issue, but I've found it not terribly difficult to emulate the APIs I've needed ( moving ns_schedule* to cron, reimplementing filters).

Going server-api-agnostic means you get to run everywhere - even Microsoft has gotten the fastcgi bug and will be supporting an ISAPI module (it's in beta.) Even without explicit vendor fcgi support, the cgi-fcgi bridge works great now on every webserver that supports CGI. I'm using it in production at many sites.

I have some small issues with FastCGI, but it has withstood the test of time and continues to do what it was designed to do over a century ago (in web time). Like openacs 😊.