wwwstat is a good program, but as released it has design flaws that make it unsuited to large sites: reports are difficult to reconfigure, characters that should be escaped are handled badly, additional log formats are hard to add, support for multiple servers is poor, and graphic reports were retrofitted rather 'after the fact'.
My experience using and heavily customizing wwwstat led me to conclude that I needed a new program written from the ground up for flexibility: FTPWebLog was the result.
wwwstat still does some things that FTPWebLog does not - most notably filtering of reports by date. On the flip side, FTPWebLog does several things that wwwstat does not and is much easier to customize to match a site's particular needs.
I have added 'archive section' reporting to the main text report, a CGI script for generating 'extract' reports on the fly, and a hostname lookup function that can convert raw IP addresses to hostnames as a log is being scanned. The documentation on all changes since 1.0.1 is still quite thin.
My "to do" list for 1.0.3 includes adding command line support for archive sections, adding the archive sections to the graphical report, 'local' machine name handling, date filtering, a new layout for the daily graph that allows more than one month of data, a "by the month" summary report, the ability to include more than one old report at a time, and some speed enhancements.
IOW: Check back often. Things will be a-changin'.
For example, a 'stats lite' version of the same report above is easily generated by extracting the needed information from the full report. It is only 24 Kbytes.
If you like it - just download it and set it up.
If you want to do graphical reports, you will also need some additional support:
Identify where your access_log is stored. Change $LogFile in the 'ftpweblog' program to point to it.
If using 'graphftpweblog', set $GraphFTPWebLogURL in the 'ftpweblog' program to point to the URL where you intend to put the graphic report HTML file generated by 'graphftpweblog'.
Create any directories that 'graphftpweblog' will use to store the GIF files it generates.
Run 'ftpweblog' - directing its output to a file:
ftpweblog > stats.html
If using graphftpweblog, run it - also directing its output to a file.
graphftpweblog > graphs.html
You should now have a report. It is that easy. By fine tuning the report options, you can make it as short or as in depth as you like.
ftpweblog [-h] [-i pathname] [-t www|ftp] [-x perlregex] [-X perlregex] [-r perlregex] [-R perlregex] [-A 0|1] [-H 0|1] [-f N] [-d N] [-S 0|1] [-D 0|1] [-F 0|1] [-N systemname] [-T perlregex] [-B perlregex] [-Q quota] [-q quotarate] [logfile ...] [logfile.gz ...] [logfile.Z ...]
GraphFTPWebLog processes an FTPWebLog report and produces graphs of the information in it. An HTML page tying the graphs together is sent to STDOUT.
#!/bin/bash
cd /home/users/snowhare/bin/stats # Where I keep the FTPWebLog scripts
# Directory where I am going to keep all my stats
basestatsdir="/usr/local/lib/httpd/htdocs/statistics"
# Location of my access_log
sourcelog="/usr/local/lib/httpd/logs/access_log"
# Name of my server
name="www.someplace.com"
# Type of log I am processing (www or ftp)
type="www"
# Name of the full stats report
statsfile="$basestatsdir/httpstats.html"
# Generate a FULL stats report, all reports.
./ftpweblog -t "$type" -N "Web Log Report for $name" \
-d 40 -D 1 -L 1 -f 40 -F 1 -S 1 -A 1 -H 1 \
-g "/statistics/graph.html" \
$sourcelog > ${statsfile}.$$
mv ${statsfile}.$$ ${statsfile} # Doing the two step to keep the time
# when there are NO stats to a minimum
# Generate a stats lite
# Only the Summary, Daily, Hourly and Top Level domains.
litestatsfile="$basestatsdir/httpstats-lite.html"
./ftpweblog -t "$type" \
-N "Lite Web Log Report for $name" -i $statsfile \
-d 0 -D 0 -L 1 -F 0 -f 0 -S 1 -A 1 -H 1 \
-g "/statistics/graph.html" \
/dev/null > ${litestatsfile}.$$
mv ${litestatsfile}.$$ ${litestatsfile} # Doing the two step to keep the time
# when there are NO stats to a minimum
# Make the graphical log report.
./graphftpweblog -N "Graphical Web Log Report for $name" \
-U "/statistics" \
-P "$basestatsdir" \
-A 1 -D 1 -d 40 -f 40 -H 1 \
$statsfile > $basestatsdir/graph.html
# Just to be sure file permissions are correct
chmod 644 $litestatsfile $statsfile $basestatsdir/graph.html
chmod 644 $basestatsdir/*Stats.gif
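The "two step" used in the script above (write to a temporary name, then mv it into place) can be tried on its own. The following is a minimal sketch, not part of FTPWebLog: 'generate_report' is a hypothetical stand-in for the real ftpweblog invocation, and a scratch directory from mktemp stands in for $basestatsdir.

```shell
#!/bin/bash
# Sketch of the "two step": build the report under a temporary name,
# then rename it over the live file.
set -eu

workdir=$(mktemp -d)                  # stand-in for $basestatsdir
statsfile="$workdir/httpstats.html"

# Hypothetical placeholder for the real ftpweblog command line.
generate_report() {
    echo "<html><body>report built at run $1</body></html>"
}

# publish RUN_LABEL DESTINATION
publish() {
    local tmp="$2.$$"                 # same directory as the destination,
                                      # so mv is a rename on one filesystem
    if generate_report "$1" > "$tmp"; then
        mv "$tmp" "$2"                # swap the finished report into place
    else
        rm -f "$tmp"                  # on failure keep the previous report
        return 1
    fi
}

publish 1 "$statsfile"
publish 2 "$statsfile"
cat "$statsfile"
```

Keeping the temporary file in the same directory as the final report is the important design choice here: the mv is then a simple rename, so the old report stays available for the whole time the new one is being generated.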
Note: You can't extract domains and meaningfully associate them with file sections from an old log report. You have to do that particular trick using the original access_log. You can extract domains from an old log report for analysis OR extract file names from an old report and have it mean something. But not both.
ftpweblog -t www -N 'Web Pages for John Doe' -D 0 -d 0 -L 0 -i fullreport.html -r '^/~johndoe' /dev/null > johndoe.html
Breaking it down:
This is an extremely powerful feature - you can use it to extract reports on graphic files, individual users, and archive sections.
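To get a feel for what an anchored pattern such as '^/~johndoe' selects, it can be tried against a handful of paths before running a real extract. This is only an illustration under assumptions: the sample paths are invented, the pattern is assumed to be matched against the requested path, and grep -E stands in for the Perl regex matching done by ftpweblog (the two agree for a simple pattern like this one).

```shell
#!/bin/bash
# Invented sample of request paths, one per line.
paths='/~johndoe/index.html
/~johndoe/pics/cat.gif
/~janedoe/index.html
/docs/~johndoe.txt
/index.html'

# Keep only the paths that START with /~johndoe (the ^ anchors the match).
matched=$(printf '%s\n' "$paths" | grep -E '^/~johndoe')
printf '%s\n' "$matched"
```

Only the two paths under /~johndoe are printed; /docs/~johndoe.txt is skipped because of the leading ^. Note that '^/~johndoe' would also match a user named johndoe2, so '^/~johndoe/' is the safer pattern when user names share a prefix.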
You will also find in this distribution a 'ftpweblog-103a1' file - this is an experimental version of FTPWebLog that supports Apache's mod_log_config module and improves FTPWebLog's memory management (you should save TONS of memory now if you turn off the domain related reports). You should be able to directly copy your 'LogFormat' directive value into the appropriate line and have the program parse your custom log format. It is nowhere near complete, but it does work.
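The general idea of parsing a log from a 'LogFormat' value can be sketched in a few lines of shell. This is an assumption-laden illustration, not how ftpweblog-103a1 works internally: it handles only the directives %h %l %u %t %r %s %b, turns each one into a piece of a regular expression with sed, and uses grep -E to check log lines against the result. The log lines are invented.

```shell
#!/bin/bash
# Common Log Format written as a LogFormat value.
fmt='%h %l %u %t "%r" %s %b'

# Translate each directive into a regex fragment:
#   %t          -> a bracketed timestamp
#   "%r"        -> a quoted request line
#   %h %l %u %s %b -> a run of non-space characters
regex=$(printf '%s' "$fmt" | sed \
    -e 's/%t/\\[[^]]*]/' \
    -e 's/"%r"/"[^"]*"/' \
    -e 's/%[hlusb]/[^ ]+/g')

# Invented sample input: one well-formed line and one that is not.
lines='192.0.2.7 - - [10/Oct/1996:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
this line is not in the expected format'

# Count the lines that match the whole generated pattern.
matches=$(printf '%s\n' "$lines" | grep -Ec "^${regex}\$")
echo "pattern: $regex"
echo "matching lines: $matches"
```

A real parser would also need to capture the individual fields and cope with the many other directives a custom format may contain; the sketch only shows why a copied format string is enough information to recognize a log line.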
Benjamin "Snowhare" Franz / snowhare@netimages.com
Webmaster for Net Images - A Full Service Internet Presence Provider