Forum OpenACS Q&A: A few general questions

Posted by ultra newb on 06/07/11 08:41 AM

1. What's the standard way for me to get a private "library" proc "registered" with the system so that I don't have to source that proc every time one of my pages wants to use the proc? (If it makes any difference in your answer, I haven't developed a "package" or "application," I'm just using the www directory which is working fine for my needs so far.)

2. How do I have a custom proc run each time the system is started?

3. What does "ad" stand for in the various system procs such as ad_page_contract, ad_form, etc?

4. I recently wrote a proc to access a backend. Originally I wrote it using ns_sockopen. Later I changed this to use the standard Tcl socket facilities. Any reason to go back to using the ns version (i.e. thread safety or some other concern)?

Thanks.

2: Re: A few general questions (response to 1)

Posted by Ryan Gallimore on 06/07/11 03:20 PM

Welcome to OpenACS!

1. You can "watch" a Tcl library or XQL file from the APM: http://SYSTEM/acs-admin/apm. Click "reload" next to your package and then "watch" to reload the listed files automatically with each request.

2. Take a look at the /tcl/*-init.tcl files in many packages. These always run on startup after all library files have been loaded for the package.

3. "ad" stands for "ArsDigita" - the commercial company that started "ACS" or ArsDigita Community System. When aD went under they open-sourced the project and it was renamed "Open Architecture Community System" or OpenACS.

4. Not that I know of...
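For 2, a minimal sketch of what such a startup file could look like. The package name, file path, and proc name are made up for illustration; ns_log is the AOLserver logging command, and the fallback to puts is only there so the snippet can also be tried in a plain tclsh.

```tcl
# packages/my-package/tcl/my-package-init.tcl   (hypothetical path)

# Hypothetical startup task. In a real package this proc would normally
# live in a *-procs.tcl file and only be called from the *-init.tcl file.
proc my_startup_task {} {
    set msg "my-package: startup task ran"
    if {[llength [info commands ns_log]] > 0} {
        # Running inside AOLserver: write to the server log.
        ns_log Notice $msg
    } else {
        # Running in a plain tclsh: print instead.
        puts $msg
    }
    return $msg
}

# Top-level code in a *-init.tcl file is executed when the server starts,
# after the package's library files have been loaded.
my_startup_task
```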

See the Documentation section, and the Tutorial for more information. Good luck!

3: Re: A few general questions (response to 2)

Posted by Andrew Piskorski on 06/08/11 12:46 AM

Ryan wrote: "When aD went under they open-sourced the project and it was renamed 'Open Architecture Community System' or OpenACS."

Well, no. Fortunately the ACS was open source from very early on: at least by 1998, possibly as early as 1995. That's how OpenACS could be created, first as a fork of ACS to use PostgreSQL, and then with support for both databases starting with version 4.x. The OpenACS project had been active for a good while (years?) before aD went out of business.

4: Re: A few general questions (response to 1)

Posted by Jim Lynch on 06/08/11 01:18 AM

For 1, notice that when you created your package, it made the directories sql, www, and tcl. If you put a file named some-purpose-procs.tcl in the tcl directory, the procs in that file will become available the next time the server is restarted. As Ryan mentioned, you can also reload that file, and you can watch it.
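As a rough illustration, such a file might contain something like the following. The package name, file name, namespace, and proc are all hypothetical; a plain Tcl proc is used so the snippet also runs in tclsh, whereas OpenACS library code more commonly defines procs with ad_proc.

```tcl
# packages/my-package/tcl/backend-procs.tcl   (hypothetical path)

# Keep private library procs in their own namespace to avoid name clashes.
namespace eval mypkg {}

# Plain Tcl proc. In OpenACS one would typically write
#   ad_proc -public mypkg::greeting {name} { doc string } { body }
# which also records documentation for the API browser.
proc mypkg::greeting {name} {
    return "Hello, $name!"
}
```

Once the file has been loaded at server start, any page could call mypkg::greeting without sourcing anything first.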

5: Re: A few general questions (response to 1)

Posted by Jeff Rogers on 06/08/11 06:07 AM

On 4: Tcl's standard socket facilities work best with the Tcl event loop. AOLserver doesn't use Tcl's built-in event loop; it has its own event loop that works slightly differently. If you structure your code correctly to take advantage of AOLserver's event loop, you should be able to handle more load.

However, you're unlikely to see any difference in load handling capabilities until you're doing quite a lot, probably hundreds of requests per second (depending on how slow your backend is).

You can use Tcl's event loop in AOLserver if you start it by hand with vwait. There's a background delivery mechanism somewhere in OpenACS that does this, but I forget where.
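For reference, starting the event loop by hand looks roughly like this in plain Tcl. The variable name and the 100 ms timer are arbitrary choices for the example; note that the calling thread does nothing else while it sits in vwait.

```tcl
# Schedule an event 100 ms from now.
set ::done 0
after 100 {set ::done 1}

# Enter the event loop. The calling thread blocks here, processing
# events, until the variable ::done is written.
vwait ::done

puts "done = $::done"
```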

One last problem: the ns_sock commands and the other features you could use to get maximum performance (such as queuewait and cls) are not well documented, and there aren't any large examples of usage that I'm aware of.

Short answer: if you want to do really low-level, high-performance tweaking, the ns_sock commands might be better, but there's nothing wrong with using Tcl's socket command for now.

6: Re: A few general questions (response to 5)

Posted by Gustaf Neumann on 06/08/11 09:56 AM

Concerning the use of the Tcl event loop in AOLserver:

When the plain Tcl event loop is used in AOLserver threads (e.g. connection threads), there are two problems: (1) under heavy load, we observed blocking (connection threads stop processing events), and (2) connection threads are expensive and a scarce resource: when you have, e.g., five connection threads defined, and all five run into a request that enters the event loop (i.e. a "vwait"), the server cannot accept further requests. Long-running connection threads are in general something to avoid in AOLserver if you want scalability.

The Tcl event loop works perfectly within AOLserver when the Tcl thread library is loaded (see http://www.openacs.org/xowiki/libthread).

The background delivery mechanism (http://www.openacs.org/xowiki/Boost_your_application_performance_to_serve_large_files!) is based on the Tcl thread library. We often deliver several hundred thousand files per day via background delivery on our production system, including pseudo-streaming for mp4 files (which requires local rewriting of the video stream when someone jumps to an arbitrary position). The advantage of the Tcl thread library is that, with event-based processing, a single thread can simultaneously deliver several thousand streams with different transfer rates without blocking request threads. We use multiple Tcl threads for different purposes on our production system without any problems.

7: Re: A few general questions (response to 6)

Posted by ultra newb on 06/08/11 03:41 PM

So short answer... use the aolserver version if I care about scalability.

Thanks.

8: Re: A few general questions (response to 7)

Posted by Gustaf Neumann on 06/09/11 10:41 AM

No. The short answer is: if you care about scalability, don't block your connection threads; use background delivery and friends.

What do I mean by scalability? We often have more than 10,000 users logged in concurrently, more than 2,000 of them concurrently active. At this scale, we frequently see 200 views per second (and about 5 times this number as hits).

Say the server has 10 connection threads configured. If, e.g., a query is delivering a large file, the time this query takes to finish depends on the connection quality between the server and the client (which you can't influence). For a client with a good connection, time-to-finish might be, e.g., 0.5 secs; for one with a bad connection, 10 secs or a minute. So, without background delivery, the connection thread might be blocked for 10 secs, 1 min... Suppose there are 10 clients requesting the file over a bad connection at about the same time. In this case, all 10 connection threads will be occupied for this time, and the server won't be able to serve any other requests. If we serve, e.g., 100 queries per sec, the 10 sec case means that 1000 queries have to be queued (for 1 min: 6000). Increasing the number of connection threads by a factor of 2 or 5 does not change the picture if really slow operations can occupy all connection threads.
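The queue figures above are simply the arrival rate multiplied by the time during which all connection threads are blocked. As a small worked example (the proc name is made up for illustration):

```tcl
# Requests that pile up while every connection thread is blocked:
# arrival rate (requests/sec) multiplied by the blocked time (sec).
proc queued_requests {rate_per_sec blocked_secs} {
    return [expr {$rate_per_sec * $blocked_secs}]
}

foreach secs {10 60} {
    puts "blocked ${secs}s at 100 req/s -> [queued_requests 100 $secs] queued"
}
```

This prints 1000 for the 10 sec case and 6000 for the 1 min case. The number of connection threads does not appear in the formula, since the backlog grows for as long as all of them are occupied.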

With background delivery, the processing time in a connection thread is in the range of milliseconds, independent of the client's connection quality. Therefore, one can keep the number of connection threads (and therefore the memory footprint) low and still ensure scalability (for this kind of load).

The numbers above are in some respects conservative figures; when a site serves e.g. video content, the delivery times might be much larger.

What does this have to do with your question? If you have a request that has to fetch content from a different site (via ns_sock or whatever), you are in a similar situation if you don't know the connection quality or the size of the content that has to be transferred.

Recommendations: try to occupy connection threads as little as possible. If you are confident in the performance of transfers from other sites within a connection thread, use ns_sockopen and friends, and cache the transferred content if possible. If you care about scalability, decouple spool time from processing time and use Tcl threads and async I/O.

Hope this helps.

9: Re: A few general questions (response to 8)

Posted by ultra newb on 06/09/11 08:39 PM

That would help if I were at a higher level in understanding OpenACS and AOLserver, but I'm not 😊

Could you boil your "recommendations" down to a simple "use X, do not use Y"? 😊

Question: If I use the "file-procs" method described above to get my custom procs loaded and available, is this file re-sourced every time I use the proc, or is it sourced just once?

Thanks.

10: Re: A few general questions (response to 9)

Posted by Gustaf Neumann on 06/10/11 09:13 AM

OK, I'll try once again. Three rules:

(a) The easiest approach is to use ns_sockopen and friends.

(b) Don't use Tcl socket operations in connection threads, since this might result in lockups under high load.

Approach (a) requires the least knowledge on your side. Whether or not it is sufficiently scalable depends on your application and setup. If you care about scalability, try to guarantee short processing times in your connection threads. Socket operations tend to depend on other servers, so it is hard to give bounds on how long they take, and scalability degrades.

If (a) is not sufficiently scalable, use (c): async I/O based on Tcl's non-blocking I/O in a separate thread, using the Tcl thread library.

Approach (c) is used by the background delivery (as I tried to explain above), guaranteeing short processing times in the connection threads.
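A rough sketch of the idea behind (c), assuming the Tcl Thread package is available. This is not the actual background delivery code; the proc names, the buffer handling, and the backend host and port in the commented call are invented for illustration. A worker thread runs its own event loop and does all socket I/O non-blocking, so the connection thread only has to queue a job.

```tcl
package require Thread

# Worker thread: defines the fetch logic, then waits in its own event loop.
set worker [thread::create {
    # Collect data whenever the backend socket becomes readable.
    proc on_readable {sock} {
        global buffer
        append buffer($sock) [read $sock]
        if {[eof $sock]} {
            close $sock
            # Hypothetical hand-off: a real application would cache or
            # deliver the data (and log via ns_log) instead of printing.
            puts "received [string length $buffer($sock)] bytes"
            unset buffer($sock)
        }
    }

    # Called once the asynchronous connect has finished.
    proc on_writable {sock request} {
        fileevent $sock writable {}
        if {[fconfigure $sock -error] ne ""} {
            close $sock
            return
        }
        puts -nonewline $sock $request
        flush $sock
        fileevent $sock readable [list on_readable $sock]
    }

    # Start a non-blocking fetch; returns immediately.
    proc fetch_async {host port request} {
        global buffer
        set sock [socket -async $host $port]
        fconfigure $sock -blocking 0 -translation binary
        set buffer($sock) ""
        fileevent $sock writable [list on_writable $sock $request]
    }

    # Enter this thread's event loop and wait for jobs.
    thread::wait
}]

# In a connection thread, one would queue the job and return at once, e.g.:
# thread::send -async $worker \
#     [list fetch_async backend.example.com 8080 "GET / HTTP/1.0\r\n\r\n"]
```

The connection thread's cost is then just the thread::send -async call, regardless of how slow the backend is.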

The *-procs.tcl files are sourced at server start time, not on each use (that would be very inefficient). During startup, AOLserver builds a "blueprint" containing all library procs. This blueprint is used to initialize every connection thread (the threads processing incoming requests). Therefore, if you want to update these procs in the running threads, you have to reload the file explicitly via the admin interface (see above) or restart the server.

11: Re: A few general questions (response to 1)

Posted by ultra newb on 06/10/11 09:44 AM

Those are the answers I was looking for... thanks!