Forum OpenACS Q&A: Server Monitoring: What do people use?

Well, after having our customer find some small defects before we did,
I am preparing to install a full suite of monitoring tools on our
production and development systems.

I have thought about writing extensions for the existing arsDigita
tools (keepalive, cassandrix...) but then I discovered the wonderful
world of Big Brother and NetSaint.

I am curious what tools others use to keep tabs on cpu load, disk
space, server functionality, and postgres functionality, and whether people
would like a document describing how to install NetSaint for OpenACS
services.

I did a small install of netsaint this afternoon and it promises to be
a good way to monitor all of our critical services for multiple
customers and multiple machines.

-ccm

Posted by David Eison on
I don't know much about it since I try to stick to programming, but I know we use netsaint now and I've heard good things about big brother from weather.com (in addition to keepalive, which is still useful because it will actually take action beyond just e-mailing you).  So I assume you're on the right track here.
Posted by Carl Coryell-Martin on
NetSaint is looking pretty entertaining. As it happens, it can also take action when processes are down, so in theory it can replace keepalive.  I'll put a full report together when I am done.
Posted by Arjun Sanyal on
After reviewing the open-source monitoring tools for their usefulness
in the context of our own OpenACS systems, we (OpenForce) decided on a
NetSaint-based monitoring system. Our goal was not only to monitor our
own servers, but to start the discussion of a comprehensive OpenACS
monitoring system (that also monitors mailservers, the UNIX layer,
postgres, etc) that could, in the future, be integrated into OpenACS.

I'm in LA at the moment, but when I get back to New York tomorrow, I'll post some of our thinking on this topic.

Also, I'm in the process of writing some documents about using
NetSaint as an OpenACS monitoring tool. This could be the basis of
a guide. Let me know if you (or anyone) would like to collaborate
on this.

Posted by Carl Coryell-Martin on
That sounds great.  I did a shallow review at best, but went for netsaint.  So far it looks pretty good.  I would love to collaborate on this project.  I have started instrumenting our development server and should have a production system up early next week.

My only complaint is that it only does warnings; it doesn't actually collect data, so I am thinking about using another mechanism in parallel to monitor and log various bits (Cricket or MRTG, perhaps).
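
To make the "collect data in parallel" idea concrete, here is a rough sketch of the kind of logging I mean. It is not how Cricket or MRTG work; it just appends the 1-minute load average and disk usage to a CSV file. The function names, the column names, and the log file path are all made up for illustration, and it assumes a Unix host with Python 3.

```python
import csv
import os
import shutil
import time


def sample_metrics(path="/"):
    """Return one row: Unix timestamp, 1-minute load average, percent of disk used."""
    load_1min = os.getloadavg()[0]        # only available on Unix-like systems
    usage = shutil.disk_usage(path)       # total/used/free in bytes for the filesystem at `path`
    disk_pct = 100.0 * usage.used / usage.total
    return [int(time.time()), round(load_1min, 2), round(disk_pct, 1)]


def append_sample(log_file, path="/"):
    """Append one sample to a CSV log, writing a header first if the file is new or empty."""
    is_new = not os.path.exists(log_file) or os.path.getsize(log_file) == 0
    row = sample_metrics(path)
    with open(log_file, "a", newline="") as fh:
        writer = csv.writer(fh)
        if is_new:
            # Hypothetical column names, chosen for this sketch only.
            writer.writerow(["timestamp", "load_1min", "disk_used_pct"])
        writer.writerow(row)
    return row
```

Something like this could be called from cron every few minutes (for example `append_sample("/var/log/metrics.csv")`, where that path is only an example), leaving the alerting itself to NetSaint.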

Cheers,

Posted by Janine Ohmer on
I'd love to know why folks are choosing NetSaint over Big
Brother?  I know that we (furfly) are using Big Brother, but beyond
that I know very little about either of them.  If there's some
compelling feature of NetSaint, we should check it out.
Posted by Carl Coryell-Martin on
The easy answer for us is that NetSaint is free and open source.  We figured we would give it a try, and if we didn't like it we would install Big Brother and pay them the $600 for non-commercial use.
Posted by Arjun Sanyal on
I am new to NetSaint too, but I think that NetSaint can currently save state information in a text file, and in the next version (v0.0.7 in beta RSN) there will be support for native storage of "status, retention, comment, and extended data" in PostgreSQL! See Nos. 9 and 17 in What's new in 0.0.7

Janine: One of my goals is to begin to build an OpenACS monitoring solution, and because of Big Brother's non-open-source license I don't want to build an OpenACS module and OpenACS-centric extensions that depend on it. From a technical standpoint, Big Brother looks like a fine tool, but I've concentrated on its open-source competition.