Forum OpenACS Q&A: What does "couldn't create pipe: too many open files" mean?

Hi everyone, I'm running OpenACS 3.2.5 and WatchDog just emailed me this message:
[01/Apr/2002:05:39:06]
    Error: couldn't create pipe: too many open files
    couldn't create pipe: too many open files
        while executing
    "exec $command $options $error_log"
        (procedure "wd_errors" line 17)
        invoked from within
    "wd_errors $num_minutes"
        (procedure "wd_mail_errors" line 8)
        invoked from within
    "wd_mail_errors"
        ("eval" body line 1)
        invoked from within
    "eval [concat [list $proc] $args]"
        (procedure "ad_run_scheduled_proc" line 43)
        invoked from within
    "ad_run_scheduled_proc {f f 900 wd_mail_errors {} 1017346118 0 t}"
    Notice: Running scheduled proc wd_mail_errors...
This error brought my server down for a couple of hours until I restarted nsd. Can someone please explain how it happens and how to prevent it from happening again? Thanks!
See this thread. (maybe it's related - I don't know).
In the thread Ola referenced, S.Y. mentions a 4MB PDF entitled "Securing and Optimizing Linux: RedHat Edition" -- this is also available in HTML format at http://www.linuxdoc.org/LDP/solrhe/Securing-Optimizing-Linux-RH-Edition-v1.3.
Just thought I'd give an update on this, for future reference!

The solution mentioned in the earlier thread sort of works: it increases the kernel's max-file limit. Summary:
echo "24576" > /proc/sys/fs/file-max 
echo "98304" > /proc/sys/fs/inode-max
What bugged me was that my server had opened some 8192 files (the default file-max). Increasing the file-max helped, but after a couple of days, the same error came up again -- nsd had opened even more files this time, and it hit the new limit.
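To see how close the box actually is to file-max, the kernel exports a counter; a quick sketch (the meaning of the middle field varies a bit between kernel versions, so treat it as approximate):

```shell
# /proc/sys/fs/file-nr reports, roughly: allocated handles,
# free/unused handles, and the file-max ceiling.
read allocated free max < /proc/sys/fs/file-nr
echo "allocated=$allocated free=$free max=$max"
```

If "allocated" keeps creeping toward "max", something is leaking handles rather than the limit simply being too low.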

A little more searching (had to compensate for starting this thread without searching first :) brought me to this: http://freshmeat.net/projects/lsof/
Lsof is a Unix-specific diagnostic tool. Its name stands for LiSt Open Files, and it does just that. It lists information about any files that are open by processes currently running on the system. It can also list communications open by each process. (Quote from Freshmeat)

Using lsof (which was in /usr/sbin on my Red Hat 7.2 server) I was able to quickly trace the offending Tcl scripts (mine :) and solve *most* of the problem. Still working on the rest.
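To give an idea of the kind of digging involved, here's a sketch (the `count_open_fds` helper is just my own illustration using /proc; the lsof lines assume nsd is actually running, so they're commented out):

```shell
# Illustration: count a process's open descriptors via /proc,
# which works even without lsof installed.
count_open_fds() {
  ls "/proc/$1/fd" | wc -l
}

# This shell's own count -- at least stdin, stdout and stderr:
count_open_fds $$

# With lsof, the equivalent for the nsd server process would be:
#   lsof -p "$(pidof nsd)" | wc -l
# and grouping by file name shows what is piling up:
#   lsof -p "$(pidof nsd)" | awk '{print $NF}' | sort | uniq -c | sort -rn | head
```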

Perhaps this can be useful to 1) help anyone else who encounters this problem in the future -- push up the file limit AND fix whatever's making your server open 9000 files at a time; and 2) buy myself some pardon for opening my mouth too soon :)