Forum OpenACS Q&A: ns_xml thread safety
In the process I've begun to explore the libxml and libxslt libraries that do the heavy lifting and discovered in the mailing list archives that there is a lot of confusion over how thread safe these libraries are.
Additionally I've talked with one of the lead programmers and what I got out of him is that he's tried to do clean design, but doesn't use threads so hasn't worried about it too much, and more importantly hasn't done thorough testing.
So I would just like to bring up this issue with the community at large.
Is there anybody out there who has used ns_xml intensively? What were the results?
And has anybody actually done specific tests or code reviews? I'd be really curious as to how they went about it and what the results were.
Thanks a lot...
and of course many thanks to Curtis and Yon for bringing us this module
It seems that libxml/libxslt are not thread safe. I'm not sure whether ns_xml is thread safe either. You can always make a non-thread-safe library thread safe by wrapping all calls in mutexes (at the risk of slowing the system down through lock contention, or of deadlocks if you're not careful). Some calls in ns_xml are wrapped with mutexes, but not many. I guess only Curtis can tell at this point what they are protecting and how confident he is in the thread safety of the code.
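A minimal sketch of the mutex-wrapping idea, written in Python for brevity rather than in the C that ns_xml uses. Everything here is hypothetical: `unsafe_transform` is only a stand-in for a library call that relies on shared global state, and `serialized` is one coarse lock around it, not how ns_xml actually does its locking.

```python
import functools
import threading
import time

_lib_lock = threading.Lock()  # one coarse lock for the whole "library"
_scratch = []                 # shared state standing in for library globals


def unsafe_transform(text):
    """Hypothetical non-thread-safe call: builds its result in a shared buffer."""
    _scratch.clear()
    for ch in text:
        _scratch.append(ch.upper())
        time.sleep(0)  # yield, so unprotected concurrent calls could interleave
    return "".join(_scratch)


def serialized(func):
    """Return a wrapper that holds _lib_lock for the duration of each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with _lib_lock:
            return func(*args, **kwargs)
    return wrapper


# Callers use the wrapped version; only one thread is inside the library at a time.
safe_transform = serialized(unsafe_transform)
```

The cost is the one mentioned above: every caller queues on the same lock, so throughput is limited to one transformation at a time.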
The nature of threading problems is such that it's almost impossible to find them by code inspection (especially if the code is largely unknown, as libxml and ns_xml are to me).
All you can do is increase your level of confidence by running stress tests. So far I've only run a stress test with ab (ApacheBench): 1000 requests, 10 simultaneous connections, against a page that does an XSLT transformation of one XML doc. It seemed to work fine. Surely more elaborate tests can be done.
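One way to make such a test slightly more elaborate is to compare every concurrent result against a single-threaded reference, so that corrupted output is caught and not just crashes. The sketch below is in Python and is only an illustration: `stress` is a hypothetical helper, and `func` stands in for whatever performs the transformation (for example, a function that fetches the test page). The defaults mirror the 1000 requests and 10 connections mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


def stress(func, arg, requests=1000, concurrency=10):
    """Call func(arg) `requests` times from `concurrency` threads.

    Returns the number of results that differ from a single-threaded
    reference call. An exception raised by func propagates to the caller.
    """
    expected = func(arg)  # reference result, computed with no concurrency
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: func(arg), range(requests)))
    return sum(1 for r in results if r != expected)
```

A return value of 0 does not prove thread safety; it only raises the level of confidence, as noted above.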
To summarize: there are no known threading problems in ns_xml. If you run into a problem that you think is caused by ns_xml threading issues (I think it would show up as either corrupted output or an AOLserver crash), please let me know and I'll investigate. In that case I'll need a very detailed description of the problem, the setup, etc.
I plan to use ns_xml in my own web services, so if you find any issues, send them to me; I'll actively maintain ns_xml. I'll fix any reported bugs that I can reproduce and release a new version. I might be less receptive to new feature requests because of time limitations (for example, it would be nice to add a SAX API, but as long as I don't need it badly I won't work on it myself; if, on the other hand, someone sends me a patch that implements new functionality and the patch looks good, I'll add it).
You can probably become confident over time that ns_xml is thread safe, but if you find that the underlying libxml/libxslt isn't, there's another way to make safe use of something that isn't thread safe: only use it from one thread.
I had a similar problem with swish-e a few weeks back. It had been rewritten to be thread safe, but there had been no real testing of it. Plan A was to go for it, with the typical AOLserver embedding, but there was a Plan B. Plan B was to use Rob Mayoff's threadpool code and create an ns_swish-e server that runs in its own AOLserver thread. Code in other threads that needs swish-e services would put a request on a swish-e queue that gets handled in the separate swish-e thread. One way of looking at that is that it lets Rob handle all your mutexes for you. Swish-e has a reasonable API for that: there's an initial call that gives you a swish-e handle, which you then pass to the other calls. I don't know if ns_xml could be set up the same way.
Anyway, that was Plan B.
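The shape of Plan B can be sketched as follows. This is a generic Python illustration of the "one dedicated thread plus a request queue" pattern, not Rob Mayoff's threadpool code or its API; `SingleThreadService` and `handler` are hypothetical names, with `handler` standing in for the non-thread-safe library call.

```python
import queue
import threading


class SingleThreadService:
    """Run every call to `handler` on one dedicated worker thread."""

    def __init__(self, handler):
        self._handler = handler
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        # Only this thread ever touches the handler.
        while True:
            item = self._requests.get()
            if item is None:  # shutdown sentinel
                break
            arg, reply = item
            try:
                reply.put((True, self._handler(arg)))
            except Exception as exc:
                reply.put((False, exc))

    def call(self, arg):
        """Submit a request from any thread and block until the worker replies."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((arg, reply))
        ok, value = reply.get()
        if not ok:
            raise value
        return value

    def close(self):
        """Stop the worker after it has drained requests queued before this call."""
        self._requests.put(None)
        self._worker.join()
```

The queue does the synchronization, so the library itself never sees more than one caller. As with the mutex approach, requests are still processed one at a time.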
Anyway, I will hold off on changes in this area until someone can prove that ns_xml *isn't* thread safe.