Forum OpenACS Development: Re: Not handled by server cluster procs?

5: Re: Not handled by server cluster procs? (response to 4)

Posted by Harish Krishnan on 05/04/04 04:55 PM

Currently they don't. I have added another proc to the set "synchronize_cluster_nodes" which tries to synchronize the nsv set information of the site_nodes across the cluster. But the way I have currently implemented it is not efficient enough as well as not "the" solution but a compromise for time. I am currently exploring possiblities to achieve synchronization via request-processor .. i.e when a user requests for a page(which is available on a new node) , no matter which server picks up the request it will check with the db in the case of failure in finding the node and only then give the user the "no-page-found" error. But this is turning out to be tricky.

6: Re: Not handled by server cluster procs? (response to 5)

Posted by Tom Jackson on 05/04/04 05:27 PM

That is interesting, I thought arsDigita had a way of syncing nodes. Some time back Jim Davidson mentioned that AOL used something called an NV: network variable. I've asked many times how this was implimented, but never got a response. I think I can take a guess at how it would be done, athought I don't know how efficient it would be.

First the NV would be pre-declaired as such. After that its value would be maintained on a separate 'parent' node. On the parent node, this would just be an nsv array so the threads of the parent would have serialized access. Each child node (your frontend cluster servers) would maintain a pool of tcp connections to the parent, the pool access controlled by a mutex/condition variable. The interface would be the same as to nsv arrays.

I just finished writing code for a generalized pooled query API, which would work for the client side, leaving the communication protocol, on the server, plus the client code for the query.

7: Re: Not handled by server cluster procs? (response to 6)

Posted by Andrew Piskorski on 05/05/04 07:56 AM

Tom, your NV stuff is interesting. (I think it should be called NetV though, to eliminate any confusion with NSV.) I don't understand the bit about the tcp pool on the client side having a mutex around it though, what for?

It should be interesting to see what the performance looks like. (Obviously it's going to be much worse than nsv using process-local memory, but probably plenty good enough for many or most applications.) I wonder if UDP might work better than TCP; but then there's no UDP access from Tcl, so certainly ignore that idea for now. You probably need to worry about handling broken tcp connections in the pool - Cleverly's Burrow could be helpful for that.

The enhanced nsv/tsv provided by the Tcl Threads Extension might be of particular value, as in some cases it's more powerful API let's you do more work in a single atomic call, which means only one round trip to the server rather than several. Also, in order for NetV to work the same as NSV, you need NetMutex too, otherwise it's impossible for the client to properly serialize access when doing complex operations on a Nsv. That's also a good reason to use the Tcl Threads Extension tsv/nsv rather than the legacy AOLserver nsv implementation, as it's enhanced API should let you avoid explicit use of mutexes much more often.

You may also want some sort of notification from the server to the client, saying, "Oh shit, I just crashed and restarted!". Edge cases like that are never going to work quite the same way as local Nsvs, but that should be ok.

This NetV stuff is all basically a re-invention of the Linda/Tuplespace stuff of course. That Wiki page also mentions that Ruby ships with a standard tuplespace implementation, so it might be worth looking there briefly for any lessons learned, performance hacks, or that sort of thing.

If you ever had a site making really big use of NetV (like AOL Digital City, say), I wonder whether it would pay to have NetV do its message passing over the network via MPI rather than straight TCP sockets. As long as the MPI is still talking over TCP/IP underneath, my guess is no, as the NetV communication patterns are very simple. But if you had an alternate low-latency ethernet driver (no TCP/IP, OS bypass, etc.), MPI might be the easiest way to interface to it (because someone else will have already done the work). And of course, if for some bizarre reason you actually had an SCI interconnect between your boxes, like you'd need for Clusgres, you could use the real SCI shared memory for NetV too. (All very very blue sky, this whole last paragraph.)

8: Re: Not handled by server cluster procs? (response to 7)

Posted by Andrew Piskorski on 05/05/04 09:20 AM

A possible alternate design to NetV would be to simply use AOLserver's database APIs to talk to a single remote in-memory RDBMS. Unfortunately, SQLite doesn't (yet?) support the necessary concurrency, nor does Metakit. And the NetV approach is a lot simpler, and would probably be noticeably faster for the simpler cases it supports.

Actually, Mnesia probably does support the the necessary concurrency, as well as other neat featuers. But I haven't really looked into it, it's all in Erlang, and AOLserver would need to talk to it via either SNMP or CORBA. Seems strange, but it looks like Erlang support client-side but not server-side ODBC. Erlang can use ODBC to talk to some external SQL database (Oracle or whatever), but a Tcl or C client cannot use ODBC to talk to Erang's Mnesia RDBMS. Ick.

9: Re: Not handled by server cluster procs? (response to 8)

Posted by Tom Jackson on 05/05/04 06:11 PM

Andrew, I have a module already working for this type of application. Essentially is is an abstraction layer which looks much like the ns_db sp_* api, and has many of the same goals of the OpenACS db_* api, although more general, and much simplified. Right now I have it working with odbc and postgresql drivers, these are called providers. I just finished removing all the ns_db calls into callbacks, so you can easily create a new provider with your own callbacks in this case.

The mutex/condition variables are used to protect a pool of connections/handles so you don't need to reconnect for each request. Also, you can reuse a connection for the life of your ns_conn without re-using the mutex/condition. It has a built in explicit transaction start/commit/rollback set of callbacks as well which could be used to issue special commands, such as locking a mutex on the remote server. That might be a little tricky in the case of a misbehaving or crashed client. Also transactions are handled without the need for catch/rollback. You could use these if you want, but when you start a transaction, the initializing routine registers a callback (once per ns_conn) to rollback any outstanding and uncommitted transactions upon connection/scheduled proc close, using ns_atclose. Ns_atclose also returns the token to the pool and signals any waiting threads that a token is now available. All this is created automatically when you create a new datasource and add a list of connection handles. Your callback code would need to translate this handle into an actual connection to use, for instance the callback code for odbc/postgresql is simply: 'return [ns_db gethandle $pool]'.

As far as maintaining the connections, I haven't yet looked at burrow. But I remember setting up something like this when I wrote a jabber module. It would vigorously maintain a connection with the jabber server.

Too bad we don't yet have udp in AOLserver. The changes to ns_sock are almost zero, but the problem is you need configuration changes so that a particular sock knows if it is udp or tcp. But the application doesn't need to know anything. With udp you could efficiently support server push. It might be worth writing a simple udp echo server. Whenever a change is made to an NV, the change is echoed back (to all connected nodes), which record the change. Using udp would probably complicate some things on both ends. How do you know when the server is down? Also you would probably need a timeout on every NV, where it would become stale and need to be rechecked. But this should be perfect for session management, with the session manager sitting on your database server and providing simple answers that might initially be stored in a database, but live in memory so multiple servers can track a single user session.

12: NetV, GAMMA, MPI, etc. (response to 7)

Posted by Andrew Piskorski on 10/16/04 04:32 AM

Tom, how's your NetV stuff going? Used it for anything interesting yet?

I'm curious whether AOL has ever had their NV (NetV) became a performance bottleneck. Even using TCP/IP, it is probably faster than querying an RDBMS, although lots slower than using a local NSV in shared memory. Since they are probably using it in a query-like rather than nsv-in-an-inner-loop like fashion, my guess is it's probably fast enough.

On the other hand, if you were really hitting NetV with lots of small requests, the TCP/IP latency might bite you. In that case, GAMMA might be the answer. It supports very low latency reliable MPI over cheap Intel PRO/1000 Gigabit Ethernet cards.

That seems like one of those atypical edge cases where "enterprise" style server computing and high performance cluster computing overlap. However, offhand I can't think of any web-oriented application at all that is likely to be that latency sensitive. I'm curious whether anyone has seen one.

17: AOL to relase their NetV, nsdci (response to 6)

Posted by Andrew Piskorski on 07/03/06 08:47 PM

On the AOLserver list, in the "Scaling at the high end" thread, Nathan Folkman just announced that AOL is preparing its nsdci "Digital City Module" for Open Source release. Among other things, it includes their network "NV" code.

18: Re: AOL to relase their NetV, nsdci (response to 17)

Posted by Andrew Piskorski on 03/01/07 10:49 PM

I'd forgotten all about this, but apparently AOL really did open source their nsdci Digital City Infrastructure codebase some time ago, on Google Code.