Forum OpenACS Q&A: any webalizer gurus out there?

Posted by Janine Ohmer on
I've never used webalizer before, and in fact I didn't set up our
installation.  Which became painfully obvious when I changed the way
our log files are named, not knowing that the webalizer scripts were
set up to depend on the just-rolled file being named <whatever>.log.000.

I've now changed the names back, but have been unable to re-run
webalizer to get it to pick up the log entries it missed.  I've tried
all the likely-looking switches (-i, -f) and it always just ignores
all rows in each log file.

There don't seem to be many resources out there on this program, and
the docs don't cover this.  I'm waiting for my membership to the Yahoo
Group to be approved, but it doesn't look like much answering happens
there, either.  So I'm turning to my usual source of knowledge - does
anyone out there know how I can fix this?

Posted by David Walker on
Webalizer ignores all entries prior to the latest entry in its cache file. (A pain if you thought you could combine logs from 2 machines.)

Delete the webalizer cache file, rerun webalizer, and it will catch the old stuff.

Posted by Janine Ohmer on
Thanks, David!  Two more newbie questions, though:

The only cache I see is a DNS cache.  That doesn't sound like what you're talking about.  Where would I look to see where it is and what it's called on this installation?

The way this installation is set up, it runs webazolver and then webalizer on the just-rolled log.  As far as I could see from reading the docs, webalizer always expects to be given the name of a single log file.  Is there some way to tell it "reprocess all these log files"?

Actually, now that I think about it, I have a third question:  if I remove the cache, am I going to have to rerun it on every log file since the beginning of the site?  Or just the ones that got missed?


Posted by Ola Hansson on

I think it's the "history" file you should remove or, rather, rename and keep around for a bit since I'm not totally sure. It should be found in the -o output dir, or in the current directory (~/nsadmin ?) if the -o option has been omitted.
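A minimal sketch of the rename-and-keep step, in case it helps. It assumes the state files use the names webalizer.hist and webalizer.current, which I believe are the defaults; the HistoryName and IncrementalName settings in webalizer.conf can change them, so check this installation first. The function name and the .bak suffix are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path

# Assumed default state file names; check HistoryName / IncrementalName
# in this installation's webalizer.conf before relying on them.
STATE_FILES = ("webalizer.hist", "webalizer.current")


def set_aside_state(output_dir, suffix=".bak"):
    """Rename any state files found in output_dir, keeping them as backups."""
    moved = []
    for name in STATE_FILES:
        src = Path(output_dir) / name
        if src.exists():
            dst = src.with_name(src.name + suffix)
            shutil.move(str(src), str(dst))
            moved.append(dst.name)
    return moved


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than a real output dir.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "webalizer.hist").write_text("dummy history\n")
        print(set_aside_state(tmp))
```

Renaming rather than deleting means the old files can be moved back if the rerun does not go as hoped.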

Have a look at the proc that I recently submitted to new-file-storage:

It incrementally processes all the rolled logs I have in a certain dir, which of course may be wasteful because it does it every night and I keep 5 log files around... However it should be easy to change the "bash" FOR loop to suit your needs.

Good luck!

Posted by David Walker on
It has been a long time since I set up my webalizer scripts or I'd give you the correct filenames. Thankfully they've just kept working.

Ola is right. It is the history file.

You would have to rerun from the beginning. Webalizer does not do fill-in-the-blanks type of stuff at all. It has to process things in order.
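Rerunning from the beginning, in order, could be sketched roughly like this. It assumes the rolled logs carry a numeric suffix with .000 as the most recently rolled file, so a higher number means an older log; the file names, output directory and function names are hypothetical. The -p (incremental) and -o (output dir) switches are real webalizer options, but the sketch only prints the commands unless dry_run is turned off, and it leaves out the webazolver step.

```python
import re
import subprocess

# Matches a trailing numeric roll suffix such as ".000" or ".012".
_ROLLED = re.compile(r"\.(\d+)$")


def oldest_first(log_names):
    """Order rolled logs oldest first, assuming a higher suffix means older."""
    numbered = []
    for name in log_names:
        match = _ROLLED.search(name)
        if match:  # skip the live log, which has no numeric suffix
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered, reverse=True)]


def build_commands(log_names, output_dir):
    """Return one webalizer invocation per rolled log, in processing order."""
    # -p keeps state between runs (incremental mode); -o sets the output dir.
    return [["webalizer", "-p", "-o", output_dir, name]
            for name in oldest_first(log_names)]


def reprocess(log_names, output_dir, dry_run=True):
    """Print the commands, or run them one by one when dry_run is False."""
    for cmd in build_commands(log_names, output_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names and output directory, for illustration only.
    logs = ["server.log", "server.log.000", "server.log.001", "server.log.002"]
    reprocess(logs, "/tmp/webalizer-out")
```

Looking over the printed commands first is a cheap way to confirm the order before letting it touch the real output directory.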