Forum OpenACS Q&A: File Table Overflow.

Posted by Tom Jackson on

I installed the newest CVS version of the ACS (3.2) from SourceForge. I immediately had an incident where Linux reported "file table overflow", and I have since had other similar incidents. Might it be related to PostgreSQL 7, or to a Perl script in WatchDog? This affects the entire machine. Has anyone else seen this error or had any experience with it?

BTW, you can check out my installation of this software at http://acspg3.zmbh.com:8080/

Posted by Tom Jackson on

The above installation was on AOLserver 2.3.3 (libc5), running on Slackware Linux with kernel 2.0.34.

I decided to try it on my workstation, which runs Red Hat 6.1 and AOLserver 3.0rc1. I haven't had any problems on this machine. From what I can find out, the Slackware machine had too many files open. This affected multiple programs, especially ls, and also showed up when visiting a web page with many images.

Posted by Don Baccus on
Well, you've answered your own question, really - your 2.0.34 kernel's limit on the number of files that can be open at once is set too low for the suite of programs you're trying to run simultaneously.

Raising it is a kernel compile thing, I bet...
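
For what it's worth, on kernels that expose the file table through /proc you can see how close you are to the limit without guessing. Here is a minimal Python sketch, assuming /proc/sys/fs/file-nr exists and holds its usual three counters (allocated handles, free handles, maximum). An old 2.0.x kernel may not provide this path at all, and parse_file_nr is just a name made up for illustration:

```python
from pathlib import Path

# Assumed location of the counters; not every kernel provides this file.
FILE_NR = Path("/proc/sys/fs/file-nr")


def parse_file_nr(text):
    """Split the three whitespace-separated counters into integers."""
    allocated, free, maximum = (int(field) for field in text.split()[:3])
    return allocated, free, maximum


if __name__ == "__main__":
    if FILE_NR.exists():
        allocated, free, maximum = parse_file_nr(FILE_NR.read_text())
        print(f"{allocated - free} of {maximum} file handles in use")
    else:
        print("no file-nr on this system; check your kernel's documentation")
```

If the first number creeps toward the last one, the file table is about to overflow.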

Posted by Robert O'Gwynn on

I've encountered the same problem with my initial install, only with FreeBSD instead of Linux.

Actually, I ran flat out of available open files (pstat -T showed me 1300+, which was the upper limit as defined by the kernel compile). I recompiled the kernel with "MAXUSERS 512" -- up from the default "MAXUSERS 32" -- and now have 16424 available open files.
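
The 16424 figure can be reproduced with a little arithmetic. This is a sketch only, assuming the kernel of that era derives its defaults as NPROC = 20 + 16 * MAXUSERS and MAXFILES = 2 * NPROC; the function names are made up for illustration, and your kernel's param.c is the real authority:

```python
def nproc(maxusers):
    # Assumed formula for the process limit derived from MAXUSERS.
    return 20 + 16 * maxusers


def maxfiles(maxusers):
    # Assumed default: the open-file limit is twice the process limit.
    return 2 * nproc(maxusers)


if __name__ == "__main__":
    for maxusers in (32, 512):
        print(f"MAXUSERS {maxusers}: {maxfiles(maxusers)} open files")
```

Under these assumptions MAXUSERS 512 gives exactly 16424, which matches the number above. MAXUSERS 32 gives 1064 rather than 1300+, so either the formula or the default on that particular kernel differed slightly.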

I rebooted the machine earlier today (for unrelated reasons) and since boot some 4 hours ago, I am up to 890 open files. I think my previous problem was compounded by the fact that sendmail was NOT configured (properly or otherwise).

However, every time I load an ACS page from this server, my open file count increments by 2-4 files. Usually, after a bit, it will decrement by 1-2 files. Being a world-class idiot, I have no clue what's being opened or not opened, and pstat -f tells me bupkus.

I guess I get to look forward to my machine slowly using up its max open files, tanking, and needing a reboot, unless somebody else with a clue can answer this.

Posted by Dan Wickstrom on
You can use lsof to tell which process is opening what files.  It might help you track down the source of your problem.
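
As a rough illustration of the lsof suggestion, the sketch below tallies open files per command. It assumes lsof is installed and uses its field output mode (-F with the p, c and f fields: process ID, command name, file descriptor); tally_open_files is a made-up helper name, not part of lsof:

```python
import subprocess
from collections import Counter


def tally_open_files(field_output):
    """Count 'f' (file descriptor) records per command in `lsof -F pcf` output."""
    counts = Counter()
    command = "?"
    for line in field_output.splitlines():
        if not line:
            continue
        tag, value = line[0], line[1:]
        if tag == "p":
            command = "?"  # new process record; its command name follows
        elif tag == "c":
            command = value
        elif tag == "f":
            counts[command] += 1
    return counts


if __name__ == "__main__":
    # Run as root to see every process; otherwise you only see your own.
    result = subprocess.run(["lsof", "-F", "pcf"], capture_output=True, text=True)
    for command, n in tally_open_files(result.stdout).most_common(10):
        print(f"{n:6d}  {command}")
```

The ten commands holding the most open files come out on top, which is usually enough to pick out a suspect.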
Posted by Don Baccus on
Robert's problem sounds like a FreeBSD issue with AOLserver, perhaps?  It sounds as though file handles might not be properly reclaimed after threads end execution, that kind of thing.

It would be good to isolate the problem, if you can.  Forget the ACS for the moment, and put up an AOLserver to a different pageroot.

Try simple .html, .adp, and .tcl files and see if you get the same problem.  If so, talk to the folks at aolserver.com about it.

As far as Linux goes, apparently 2.0.* had some problems in this area that have been cleaned up for 2.2.* kernels.  I've seen discussion of it in newsgroups between folks who've never heard of AOLserver, much less the ACS.

Posted by Robert O'Gwynn on

Thanks, Dan and Don, the wonder twins. I took Dan's suggestion of using "lsof" (a new util to me, tho prolly not for smarter people). I installed it from the ports tree, and lo and behold, I think I've got a culprit, or at least a couple of suspects.

The box I'm running all this on is a poor benighted Dual Pentium Pro, that's running development for all of our web projects, as well as a forwarding DNS, Samba services, AppleTalk services, and a partridge in a pear tree. Shameful, I know...

The two biggest open file abusers appear to be postgres and apache -- I'll keep watching it, and will probably write some poor excuse for a perl script to watch it and see who's being naughty or nice.
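
A watcher along those lines mostly needs to compare two snapshots of per-command open-file counts and report who grew. A minimal Python sketch of that comparison follows; the growth helper and the numbers in the demo are made up purely for illustration, and how the snapshots get collected (lsof, pstat, whatever) is left open:

```python
from collections import Counter


def growth(before, after):
    """Return (command, increase) pairs for commands whose count rose, largest first."""
    delta = Counter(after)
    delta.subtract(before)
    return [(cmd, n) for cmd, n in delta.most_common() if n > 0]


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the output.
    earlier = {"postgres": 210, "httpd": 95, "smbd": 40}
    later = {"postgres": 260, "httpd": 97, "smbd": 38}
    for cmd, n in growth(earlier, later):
        print(f"{cmd}: +{n}")
```

Taking a snapshot every few minutes and feeding consecutive pairs through this would show which process keeps climbing rather than just which one is largest.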

I'm also going to try your advice, Don, and install a vanilla AOLserver and hammer it. I think, though, it's more likely a postgres problem. I am running the last beta release of 7.0, so I'll try the release version as well.

If I'm forced to move to Linux, I might cry, though. I sure do like my FreeBSD... :)

Posted by Don Baccus on
Oh ... Postgres.

AOLserver, at least, keeps persistent connections open to Postgres backend processes.  Each of these processes will open files, one file per table and one file per index.  The ACS has lots of tables and lots of indices.  Actually, come to think of it, the shared buffer manager does file management, so presumably file handles are shared between child processes.  Still, it's easy to see that you might need quite a few file handles to run the ACS!

I've been through the Postgres shared buffer management code before, but didn't dig into how and when file handles are grabbed (i.e. files opened and closed).

This is one potential drawback to the PG model of one file per table (as opposed to, say, Oracle, which stuffs tables into a tablespace which in turn is stuffed into a file).  Still, with modern Linux/Unix systems you can have lots and lots of file handles.  Three more cheers for cheap hardware!

Also, PG will open temporary files to do sorts and hash joins if you haven't started it with the appropriate switches that tell it to use lots of RAM for these things.  I'll be writing something on this in the next couple of weeks.