Forum OpenACS Q&A: The whole java thing, Tclblend and Webmail

I've been investigating porting some of the java stuff coming out of aD, and as a test case I decided to port webmail, which relies heavily on oracle's sqlj java implementation. I'm at the point where I am now able to send and receive emails without attachments using the webmail interface. Attachments don't work yet, since I'm only able to pull lob data out of the database, but with a little more work I should be able to put lob data back into the database through the java JDBC interface to postgres.

The webmail module consists of two main parts that execute java code inside of oracle. One is a message parser that reads in messages from qmail and performs mime decoding, and the other is a message composer that performs mime encoding.

I was able to reuse most of the java code without modification. The only exception is the sql statements, which are declared using a c-preprocessor-looking syntax that starts each sql statement with #sql. Why oracle chose to do it that way is beyond me. Why not, for instance, just extend the java sql classes that are already specified in the product api? The postgres JDBC interface is handled this way, and the result is much cleaner.
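As a rough illustration of the kind of rewrite this involves: an sqlj clause embeds java host variables as `:name`, while plain JDBC uses `?` placeholders bound by position. The sketch below is a hypothetical helper (`SqljToJdbc` is not part of webmail or of any JDBC driver, and the table and column names in the usage note are made up). It rewrites the body of a `#sql { ... }` clause into a JDBC statement string plus the ordered list of variables to bind. It deliberately ignores colons inside quoted string literals, so treat it as a starting point only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not taken from the webmail sources.
final class SqljToJdbc {
    // Matches sqlj-style host variables such as :msg_id,
    // but skips the postgres "::" cast operator.
    private static final Pattern HOST_VAR =
        Pattern.compile("(?<!:):([A-Za-z_][A-Za-z0-9_]*)");

    final String sql;          // statement text with ? placeholders
    final List<String> binds;  // host variable names, in bind order

    private SqljToJdbc(String sql, List<String> binds) {
        this.sql = sql;
        this.binds = binds;
    }

    static SqljToJdbc convert(String sqljBody) {
        Matcher m = HOST_VAR.matcher(sqljBody);
        List<String> names = new ArrayList<>();
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            names.add(m.group(1));          // remember which variable goes here
            m.appendReplacement(out, "?");  // swap it for a JDBC placeholder
        }
        m.appendTail(out);
        return new SqljToJdbc(out.toString(), names);
    }
}
```

For example, `SqljToJdbc.convert("select body from wm_messages where msg_id = :msg_id")` gives a `sql` string ending in `msg_id = ?` and a `binds` list containing `msg_id`. The string would then go to `Connection.prepareStatement`, with the values bound in that order through `setInt`, `setString` and so on.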

The acs implementation of message composition went as follows:

  1. Stuff message headers and body into the database.
  2. Call java function to perform mime encoding.
  3. Pull mime encoded message out of the database and send it using qmail-inject.

The ported implementation works as follows:

  1. Stuff message headers and body into the database.
  2. Call the java function to perform mime encoding via the exec function in tcl; java talks to postgres using the JDBC interface.
  3. Pull mime encoded message out of the database and send using qmail-inject.

The message parsing transformation is similar. The oracle/acs implementation is as follows:

  1. Oracle schedules java process to check the qmail rx queue once a minute.
  2. New messages are mime decoded and stuffed into the database when scheduled process runs.
  3. User sees new messages when /webmail url is reloaded.

The ported version works as follows:

  1. Aolserver schedules process to check qmail queue once every two minutes.
  2. New messages are mime decoded and stuffed into the database using JDBC interface when scheduled process is run.
  3. User sees new messages when /webmail url is reloaded.

Java is definitely not suited for web development. I've never been a big fan of TCL, but I now find it to be a dream compared to doing this stuff in java. I found that when writing java code, I spent half my time scanning the api library docs trying to figure out how to do something that I would have easily done in TCL.

Tclblend seems to be a non-starter as it requires the tcl library to be compiled as a shared library, and aolserver compiles tcl as a static library. I'm guessing that aolserver uses static libraries for reasons of thread safety and speed, and I wouldn't want to start messing with custom compilations of either aolserver or tclblend.

Given that aolserver and java both are multi-threaded and have good support for sockets, a better approach might be to start up a standalone java application that talks to postgres through JDBC, and is controlled by aolserver via a socket connection. Doing it this way would be more portable and easier for users to install. The only extra requirement is that the user install java and compile the JDBC interface.
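A minimal sketch of what such a standalone java process could look like, assuming a made-up one-line text protocol (`ENCODE <msg_id>` answered by `OK <msg_id>`). The class name, the protocol and the reply format are all assumptions for illustration; nothing here comes from the webmail code, and the real encoder call is only marked by a comment.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical command server: one long-lived JVM, one thread per connection.
final class CommandServer implements Runnable {
    private final ServerSocket listener;

    CommandServer(int port) throws IOException {
        listener = new ServerSocket(port); // port 0 lets the OS pick a free port
    }

    int port() {
        return listener.getLocalPort();
    }

    // Pure dispatch step, kept separate so it can be tested without a socket.
    static String handle(String line) {
        String[] parts = line.trim().split("\\s+", 2);
        if (parts.length == 2 && parts[0].equals("ENCODE")) {
            // A real version would run the mime encoder against the database here.
            return "OK " + parts[1];
        }
        return "ERR unknown command";
    }

    @Override
    public void run() {
        while (!listener.isClosed()) {
            try {
                Socket client = listener.accept();
                new Thread(() -> serve(client)).start();
            } catch (IOException e) {
                return; // listener was closed or failed
            }
        }
    }

    private static void serve(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(c.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(handle(line)); // one reply line per command line
            }
        } catch (IOException e) {
            // Connection dropped; try-with-resources has already closed it.
        }
    }
}
```

Starting it would be something like `new Thread(new CommandServer(9000)).start()`, with the port number chosen arbitrarily here. The aolserver side would then open a socket to that port, write one command line and read one reply line.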

Of course, if aD gets too carried away with this java inside of oracle thing we'll just have to fork from acs, and provide similar functionality using tcl/postgres.

Posted by Roberto Mello on
This is way cool, Dan! Thanks! And I thought we wouldn't have webmail ported to OpenACS.

I don't know much about Java, but your approach of having a standalone Java app that talks to Postgres via JDBC, and to which AOLserver talks, looks pretty clean to me. Eventually, when other RDBMS folks decide to port OpenACS, it will be a lot easier for them.

Keep up the good work!

Posted by Ben Adida on
Dan, this is a *great* exploration of our possibilities. I've done the Java web development thing, and I agree with you that it is absolutely horrendous. However, if aD writes the Java, and we just have to change the SQL, it's not that big a deal.

What concerns me deeply, though, is the fact that the interface is through process execution. This is a very bad solution from a scalability standpoint, because each call spawns a process which spawns a Java VM. That means that if there are 10 active connections, you have 10 parallel Java VMs that get created and killed, and that's pretty bad. That's as bad as CGI scripts each launching a Java VM (eek!).

I'm not downplaying your work; what you've done is 90% of what we need, and I'm psyched! The difference is that we need to have, as you mention, a standalone Java VM constantly running, accepting some kind of communication from AOLserver in a threaded manner.

Since Tcl Blend doesn't do it, we should go ahead and look into JNI and the C interface that allows us to launch a JVM and maybe interface that from an AOLserver module. I see no reason why this wouldn't work, but it should be investigated more.

This is great progress.

Posted by David Creemer on

I'm not sure that involving a Java VM at all is the most efficient strategy. If I had the time (which I unfortunately don't...) I'd look at porting the MIME functionality to TCL, possibly using tcllib [which may itself have to be ported from TCL 8.x], and then running the MIME code right inside AOLserver. aD is probably using Java because it can run inside Oracle in a relatively efficient and convenient manner. That's not the case with Postgres.

As a distant second, I'd suggest looking into using the JVM as a server, handling each request from OpenACS within the same process. But if you're going to do that much coding, why not implement the mail handling server in something like Python? There are already server frameworks, MIME parsers, etc.

Posted by Dan Wickstrom on
I would like to emphasize that this is just a proof of concept for using java in openACS development and not a final solution for porting webmail. Most of the work on this project has involved adding java classes to replace the functionality lost by converting the sqlj statements. Yes, it's true that mime coding/decoding would probably be better handled in TCL, but given the small number of people doing development/porting on openACS, it's easier for us to find ways to port the stuff from aD than to reinvent it.

I agree with Ben's comment about scalability, but for this phase of the development, I've been mainly concerned with the JDBC end of things and making the interface to the database functional in a generic way, so that future porting of java code will mainly involve rewriting the sql statements. In the short term it would be easy to convert the message parsing side of webmail to run as a standalone application that would run in a loop where it would sleep for some fixed amount of time, wake itself up and then process the mail queue, putting itself back to sleep at the end. The process could put logging information in the database each time it parsed the mail queue, and a watchdog process could be scheduled in aolserver to run periodically and check if the queue processing were working correctly, and if not, it could kill the current message parsing process and restart it.
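The sleep/wake loop and the watchdog check could be sketched roughly as follows. This is only an outline under stated assumptions: `QueuePoller` is a hypothetical class, the queue parser is passed in as a plain `Runnable` stand-in, and the "last run" marker is kept in memory, whereas the scheme described above would write it to the database so that a process scheduled in aolserver can read it.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical polling loop with a heartbeat that a watchdog can inspect.
final class QueuePoller implements Runnable {
    private final Runnable parseQueue;  // stands in for the real mail-queue parser
    private final long sleepMillis;     // fixed pause between passes
    private final AtomicLong lastRun = new AtomicLong(0);

    QueuePoller(Runnable parseQueue, long sleepMillis) {
        this.parseQueue = parseQueue;
        this.sleepMillis = sleepMillis;
    }

    // One pass: process the queue, then record when the pass finished.
    void runOnce(long nowMillis) {
        parseQueue.run();
        lastRun.set(nowMillis);
    }

    // The watchdog's question: did a pass complete recently enough?
    boolean isHealthy(long nowMillis, long maxAgeMillis) {
        return nowMillis - lastRun.get() <= maxAgeMillis;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            runOnce(System.currentTimeMillis());
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag and leave the loop
            }
        }
    }
}
```

With a two-minute sleep, a watchdog might treat anything older than, say, five minutes as a stalled parser and restart the process; both figures are illustrative.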

The message encoding is the main bottleneck in the current scheme, and it will require some investigation to work out a method for controlling a standalone java application in a robust and portable manner. Once I finish up the blob stuff and verify that I can send and receive attachments, I will look into some of the possible alternatives.

The two main alternatives as I see it are to use JNI to embed a jvm in a loadable aolserver c-module, or to control a standalone jvm through a socket connection.

A JNI interface would probably provide a seamless interface to the java libraries, but the main problem with this approach would be that we would need to develop a loadable jvm c-module for aolserver. It seems like this would be doable, but I'm sure it's not a trivial exercise. Maybe somebody has already done this? I looked over at the aolserver site, and the only mention of java was the tomcat servlet engine, which from the sound of it, is not what we want.

The other alternative for interfacing to java would be to use a socket interface. Both aolserver and java have strong support for socket programming, and I think it would be possible to provide a generic api in aolserver for issuing commands to a standalone jvm.

Anyway, I'm going to do some investigation on the two methods mentioned here, and if anybody out there has some other ideas, I would be willing to look into those also.

Posted by Ben Adida on
Yes, what Dan is saying here is tremendously important: the point here is to make the best engineering decision given our limited resources and the fact that we are trying to remain as much in sync as possible with ACS/Oracle. We can't redevelop a solution for every Java-in-Oracle situation that aD develops (and they *are* developing a number of such modules).

Dan, I'm not at all worried about launching a single, scheduled JVM every 2 minutes. I'm worried about the "more serious" problem, as you mention it. Sockets would be much slower, and would require starting two separate processes, so I would much prefer the JNI-in-module solution. I'm happy to help out and discuss this (on or offline).

Posted by Don Baccus on
First - this is a tremendous coup, Dan.  I'm extremely impressed.

The fact that you got JDBC working with Postgres is a minor coup in itself, as the PG hacker's group gets a steady trickle of pleas for help from folks trying to do this.

JNI-in-module seems like the best way to go, by far.  It would be a valuable contribution to AOLserver, too, I'm sure other folks would want it.

A question about BLOBs.  Are you doing an emulation hack like I did or  using built-in LOs?  The built-ins aren't a viable solution as they can't be dumped, and secondarily because they make two files for each LO in your database directory.  Moving the hack I did for the driver into Java would be easy, and of course the SQL's already defined, you'd just use the same table the driver uses.

It looks like real large objects are going to be in PG 7.1 for sure.  Jan Wieck (who also did PL/pgSQL and referential integrity) has already committed his first cut to the PG development CVS tree.  He's doing a CLOB type at the moment in order to test the SQL and internal routines (it actually works a bit like my driver hack), so it can be tested without folks having to wait for him to hook it up to built-in types.

If he or someone else doesn't write character-encoded I/O for large byte arrays, I'll add it in, to make this built-in type pg_dumpable.

At that point we can drop my driver kludge and anything you come up with for Java.

The point is simply that you can view any solution you come up with as  being strictly a short-term thing, just like my driver hack.

Posted by Dan Wickstrom on
I pretty much lifted the lob hack from your driver code verbatim. For a first cut I implemented it in 'c', and I run the code/decode routines using piped input/output streams. Yesterday I came across a reference for a uu_encode/decode class which was implemented in java using a table lookup method, but source code was only available for a fee. I looked at the algorithm that you're using, and I think it can be converted to use a table lookup method that relies on several small tables to generate the coding/decoding. This method would probably give acceptable conversion speeds even in java.
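For reference, a table-lookup coder of this general kind is short to write in java. The sketch below uses the classic uuencode character mapping (6-bit value plus 32, with 0 written as a backquote) and a single encode/decode table pair. It is not the algorithm from the driver hack, whose details are not shown in this thread, and `UuTable` is a hypothetical name. It also omits the per-line length prefix that real uuencode output carries, so the caller passes the original byte count to `decode`.

```java
// Hypothetical table-lookup coder: 3 input bytes <-> 4 printable characters.
final class UuTable {
    private static final char[] ENC = new char[64];   // 6-bit value -> character
    private static final byte[] DEC = new byte[128];  // character -> 6-bit value
    static {
        for (int i = 0; i < 64; i++) {
            ENC[i] = (i == 0) ? '`' : (char) (i + 32);
            DEC[ENC[i]] = (byte) i;
        }
    }

    static String encode(byte[] data) {
        StringBuilder sb = new StringBuilder((data.length + 2) / 3 * 4);
        for (int i = 0; i < data.length; i += 3) {
            int b0 = data[i] & 0xff;
            int b1 = (i + 1 < data.length) ? data[i + 1] & 0xff : 0; // zero padding
            int b2 = (i + 2 < data.length) ? data[i + 2] & 0xff : 0;
            sb.append(ENC[b0 >> 2]);
            sb.append(ENC[((b0 & 0x03) << 4) | (b1 >> 4)]);
            sb.append(ENC[((b1 & 0x0f) << 2) | (b2 >> 6)]);
            sb.append(ENC[b2 & 0x3f]);
        }
        return sb.toString();
    }

    // length is the number of original bytes, used to drop the padding.
    static byte[] decode(String text, int length) {
        byte[] out = new byte[length];
        int o = 0;
        for (int i = 0; i + 3 < text.length() && o < length; i += 4) {
            int c0 = DEC[text.charAt(i)];
            int c1 = DEC[text.charAt(i + 1)];
            int c2 = DEC[text.charAt(i + 2)];
            int c3 = DEC[text.charAt(i + 3)];
            if (o < length) out[o++] = (byte) ((c0 << 2) | (c1 >> 4));
            if (o < length) out[o++] = (byte) ((c1 << 4) | (c2 >> 2));
            if (o < length) out[o++] = (byte) ((c2 << 6) | c3);
        }
        return out;
    }
}
```

Encoding the three ASCII bytes of `Cat` this way yields `0V%T`. Timing `encode` over a few megabytes would be a quick way to see whether the speed is acceptable before spending effort on a multi-table variant.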

Of course if we do use JNI, then I can keep the 'c' routines and call them directly. I've pretty much isolated all of this lob stuff in a single class, so it should be easy to rip it out once pg supports true large objects.

Posted by Don Baccus on
Table lookup would be fine, but I wouldn't expect the standard uuencode stuff to run terribly slowly anyway, even in Java (I lifted it from uuencode myself).  I'm surprised someone would want to charge  money for such a trivial piece of code!

You might just write it up in Java and see just how slow it is.  Since  we only need it for a few months, we might not care.

Posted by Brent Fulgham on
I wrote the initial implementation of PyWX (Python interpreter) for AOLserver using a few hours a night for about two weeks.  So, I don't think you'll have much trouble embedding Java in the same way.  You could look at my code for ideas, although it's quite Python-specific.

A good place to start would probably be any Apache module that embeds a Java interpreter. Then, you only really need to convert the various Apache API calls to the equivalent AOLserver calls. It's surprisingly easy -- AOLserver is an excellent program. It is very easy to extend, and very easy to read/understand.

I would be more than happy to discuss your ideas via e-mail or this web board, and help you with any pitfalls I can remember.  This would also be a nice "gift" to the AOLserver community. 😊

Posted by Dan Wickstrom on
Brent,

I tried to check out PyWX using anonymous cvs access, but I didn't have any luck. cvs said that it didn't recognize PyWX as a module name. I downloaded the tar file, but it seems that you've added quite a few files since the last release.

Posted by Brent Fulgham on
Okay -- it's really time I fixed this.  I screwed up in my
initial import, and it's done at "root" level.  So to check
it out, just use "." as your directory, instead of PyWX.

Sorry!

Posted by Brent Fulgham on
Okay -- it should now be possible to check it out of CVS
by following the directions posted on the sourceforge site.  It
is now in the "PyWX" directory, as it should be.

Have fun!

Posted by Dan Wickstrom on
Brent,

I looked at your sources for PyWX and it looks to me like you get most of your functionality in python by putting a python wrapper around the tcl interpreter eval command.  Is there any advantage to doing it this way, or is this just a quick way to get a lot of functionality without too much trouble?  Do you plan to eventually wrap most of the aolserver api functions and call them directly from python?  I think I could use your idea in java, at least at the start, to test out the functionality of a loadable java-c module.

Also, what does the aolbuffer class do?

Posted by Brent Fulgham on
Dan,

We do wrap many pieces of the Tcl API, but that's mainly as a means to get things working quickly.  Furthermore, we are seeking a high level of Tcl integration, so some of what you are seeing are Tcl callbacks to Python, Python callbacks to Tcl, etc.

We plan on exporting all AOLserver API functions through Python, using the ns_python and pywx modules that are already present (in an early form) in PyWX.

The important thing to note is how the python interpreter is created and cached by AOLserver. You will also need to devise a way of connecting the Java I/O streams to AOLserver's input and output buffers. That's what the AOLbuffer C++ class does (in a messy way).

It might interest you to know that I'm getting started embedding a Scheme interpreter (Guile) in AOLserver.  It's not because I'm a
language-crazy fiend (I prefer to remember as few as possible), but
it dovetails with another effort I've got going on.  At any rate,
we might want to discuss embedding issues together so we can have a somewhat unified system of handling these projects.

I believe that once your Java interpreter is in-process with AOLserver, its standard error will automatically be written to the logfile. Then you need to decide if you want to redirect all I/O through AOLserver's HTTP ports, or if you want to implement specific commands (like ns_write) that do this for you and leave standard I/O alone.

Anyway, let me know if I can answer any questions.

Posted by Dan Wickstrom on
Brent,

Since you're such a language fiend, maybe you should consider using jpython instead of cpython, so we could have a real inter-language tower of babble :). Just think - tcl calling python, python calling java, java calling tcl... ad nauseum. Oh yeh, and you could also work guile in there somehow.

Seriously, I think it would be good to collaborate, though I think that what I intend to do initially will be much simpler than what you're trying to achieve. My main goal is to provide the same java functionality in aolserver that acs classic is getting out of oracle. To do that I need two things:

  1. A means of calling arbitrary new java methods without having to do any new JNI extension work to support it.
  2. From within the java module I want the same access to the database that I get from within aolserver/tcl.

The first item, from what I've seen so far, should be obtainable by using java's reflection capabilities, and the second item should be easily obtained by wrapping the aolserver db functions.
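A minimal sketch of the reflection idea, assuming the simplest possible calling convention: the caller names a class and a public static method, and every argument is passed as a string. `ReflectiveCall` is a hypothetical name, and a real module would need a richer mapping between tcl values and java types than this.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical dispatcher: look a method up by name at run time and call it,
// so that new java code needs no new JNI glue.
final class ReflectiveCall {
    // Invokes a public static method whose parameters are all of type String.
    static Object invokeStatic(String className, String methodName, String... args) {
        try {
            Class<?>[] types = new Class<?>[args.length];
            Arrays.fill(types, String.class);
            Method m = Class.forName(className).getMethod(methodName, types);
            return m.invoke(null, (Object[]) args); // null receiver: static call
        } catch (ReflectiveOperationException e) {
            // Covers unknown class, unknown method, access errors and
            // exceptions thrown by the called method itself.
            throw new IllegalArgumentException(
                "cannot call " + className + "." + methodName, e);
        }
    }
}
```

For instance, `ReflectiveCall.invokeStatic("java.lang.Integer", "parseInt", "42")` returns the boxed integer 42. A single JNI entry point that forwards a class name, a method name and a string array to a method like this would cover arbitrary new java code, at the cost of the per-call lookup.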

Although I have no intention of generating and caching pages that are written to the http connection, I don't want to make it difficult for somebody to extend it in that direction if they so desire.

Posted by Brent Fulgham on
Dan,

Check out "Kawa" for a Scheme implementation on top of Java.  😊

The Amaya web development kit embeds the Kaffe version of Java.  So we could probably look there for ideas as to how to go about doing so.

-Brent

Posted by Edward Dao on
Has anyone successfully implemented ns_tomcat on an AOLserver that runs the ACS system? I was able to install ns_tomcat with AOLserver, but I can't seem to get it to work when ACS is installed. The reason is that ACS uses abstract URLs for anything after the /, but ns_tomcat needs a virtual directory (e.g. /servlet). ACS seems to have higher control over the URL than ns_tomcat. Is there any way to get ACS to ignore a URL pattern like /servlet/* or /jsp/*?

Posted by Roberto Mello on
I don't know which ACS functionalities this would break (if any) but you can disable abstract-urls entirely at your parameters/servicename.tcl:
ns_section ns/server/${server}/acs/abstract-url

        # enable abstract url handling?
        ns_param EnableAbstractURLsP 1

Just change "1" to "0".

Posted by Don Baccus on
I'd recommend you take a shot running nsjava unless you really need nstomcat (.jsp pages?  can nsjava be made to serve these, too?)

The main reason is that nstomcat talks to the DB via JDBC. This isn't a very efficient way to talk to PG. One of the key virtues of AOLserver is its management of database handles, something you really don't want to lose when working in Java in the AOLserver environment.

Posted by Li-fan Chen on
Hi, anyone got any cool ideas about nsjava? Mainly, what would you use it for? I haven't touched it yet. But I understand that threaded Java chat servers are common, so my question is how could Tcl and Tcl access to the RDBMS benefit from having a Java VM in process nearby? Any neat ideas?

Posted by Li-fan Chen on
Another question: I played with JDK1.2.x from Sun (Blackdown) some. But the future of GNU java-related projects seems bright as time passes (and the MS<->Sun<->IBM<->standards groups politics flames on). I thought maybe we could try to have OpenACS use the GNU tools instead of Sun's or IBM's tools. That means dealing with beta compilers (GCJ) and Personal Java (Kaffe) instead of the glorious JDK1.3.x from the masters of VMs (Sun and IBM)... but in time things should get better. So are you all interested in GNU java-related tools now? Or do you think we should wait and stick with Sun's idiot-proof development kits (Blackdown) and licenses?