You've spent innumerable
hours patiently configuring and customizing your Open SiteSearch environment.
You've conducted an endless array of tests to ferret out bugs and build a safe,
reliable system. You know all about what it has and what it does, but do you know
if your patrons have figured this out?
Well, the answers are all within -- within your web server log files, that
is.
Like a diligent court reporter, your web server log files have recorded all
sorts of information about who accesses your system and how they use it. Log
analyzers harness this data and package it into tangible reports and graphs
to help you evaluate the performance of your environment and better understand
the needs of your patrons.
There are a number of log analyzers available on the market today, from open
source vanilla to high-end glitz. Herein is a small primer on what is available
and what to look for. In preparation for this piece, OCLC did not test the log
analyzers. It does not endorse any particular product. Rather, the following
is for your information purposes only. If you have a recommendation or an experience
with log analyzers that you'd like to share with the SiteSearch community, let
us know, and we'll place it in the Community Zone.
Features to check:
Log Formats ~ Web servers can write logs in various log formats and
log analyzers can read various log formats. Check to make sure that the log
analyzer and your web server are compatible.
Most log analyzers read the Common Log Format -- a loose industry standard
for reporting basic web activity. The Common Log Format tracks:
- Hostname or IP address of the client browser
- Time and date of transaction
- Pages requested by client
- HTTP response status returned to client
- Number of bytes delivered to client
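To make those fields concrete, here is a minimal Python sketch that pulls them out of a single Common Log Format line. The sample line, the regular expression, and the name `parse_clf` are illustrative assumptions, not part of any log analyzer listed here. Common Log Format lines also carry two identity fields (often just "-") that the sketch captures but does not use.

```python
import re
from datetime import datetime

# Assumed layout: host ident user [time] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf(line):
    """Return a dict of Common Log Format fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Timestamps look like 10/Oct/2000:13:55:36 -0700
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A "-" in the size field means no bytes were delivered
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record

# Made-up sample line for illustration only
sample = '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
record = parse_clf(sample)
print(record["host"], record["request"], record["status"], record["size"])
```

A real analyzer does far more than this, but it starts from fields like these.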
Combined Log Format and Extended Log Format are based on the Common Log Format.
Combined Log Format tracks everything in Common Log Format plus two other
fields -- user-agent (type of browser used) and referrer (URL of webpage that
linked patron to your site). Extended Log Format is a customized list
of HTTP header fields you elect to track.
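As a rough illustration of what the two extra Combined Log Format fields make possible, the Python sketch below counts which referring pages sent patrons to a site. The pattern, the sample lines, and the name `tally_referrers` are assumptions made for this example. It also assumes the referrer and user-agent appear as two quoted strings at the end of each line.

```python
import re
from collections import Counter

# Assumed layout: the Common Log Format fields, then "referrer" "user-agent"
COMBINED_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def tally_referrers(lines):
    """Count how often each referring URL appears; skip lines with no referrer."""
    counts = Counter()
    for line in lines:
        match = COMBINED_PATTERN.match(line)
        if match and match.group("referrer") not in ("", "-"):
            counts[match.group("referrer")] += 1
    return counts

# Made-up sample lines for illustration only
sample_lines = [
    '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" '
    '200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I)"',
    '192.0.2.11 - - [10/Oct/2000:13:56:01 -0700] "GET /search.html HTTP/1.0" '
    '200 512 "-" "Mozilla/4.08 [en] (Win98; I)"',
    '192.0.2.12 - - [10/Oct/2000:13:57:12 -0700] "GET /index.html HTTP/1.0" '
    '200 2326 "http://www.example.com/start.html" "Mozilla/4.5 [en] (WinNT; I)"',
]
referrer_counts = tally_referrers(sample_lines)
for url, hits in referrer_counts.most_common():
    print(hits, url)
```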
Platforms ~ Check not only for the platforms the log analyzer runs on,
but also whether it can read logs from different web servers.
Processing speed ~ If you have 50MB log files or larger, processing
speed will be an important consideration.
Reverse IP resolution ~ Resolving Internet Protocol (IP) addresses to their
corresponding Domain Name System (DNS) host names will save you time and
frustration when developing a profile of your online patrons.
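For a sense of what reverse resolution involves, here is a minimal Python sketch built on the standard library call `socket.gethostbyaddr`. The function name `resolve_hosts`, the sample addresses, and the stand-in lookup table are hypothetical. The stand-in lets the sketch run without network access; leave out the `lookup` argument to query your real resolver.

```python
import socket

def resolve_hosts(ip_addresses, lookup=socket.gethostbyaddr):
    """Map each distinct IP address to a host name, falling back to the IP itself."""
    cache = {}
    for ip in ip_addresses:
        if ip in cache:
            continue  # each address is looked up only once
        try:
            cache[ip] = lookup(ip)[0]  # first element is the primary host name
        except OSError:
            # socket.herror and socket.gaierror are subclasses of OSError
            cache[ip] = ip
    return cache

def fake_lookup(ip):
    """Hypothetical stand-in for a DNS query, used only in this example."""
    table = {"192.0.2.10": "pc10.library.example.edu"}
    if ip not in table:
        raise socket.herror(1, "Unknown host")
    return (table[ip], [], [ip])

names = resolve_hosts(["192.0.2.10", "192.0.2.10", "192.0.2.99"], lookup=fake_lookup)
print(names)
```

Caching matters because busy logs repeat the same addresses many times and each live lookup can be slow. That is one reason some analyzers hand this step to a helper application.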
Types of reports ~ Low-end log analyzers may provide simple lists and
a bar chart or two. High-end analyzers can overwhelm with their variety of
reports. Try to determine the types of graphical and statistical reports you'll
need and how customizable the reports are. Check whether the log analyzer can
run static and/or dynamic reports.
Accessibility ~ Some log analyzers provide security features to protect
access to data. You'll also want to see if the log analyzer can be used remotely
from the web server.
Database functionality ~ Some log analyzers have a built-in database
component. Others can format reports for easy export into a database software
application.
Sample List of Log Analyzers:
NOTE: The following information was taken from vendor websites.
AccessWatch
Fast and inexpensive. Open source shareware based on Perl 5.0. The source code
is available for editing, but the changes may not be distributed per the
license agreement. Reads Apache and Netscape server log formats.
Analog
Written in C. Available for most platforms and reads most log formats.
According to its website, Analog can handle 25GB log files. Offers reports
in 31 languages. Does not support DNS lookup, but helper applications
are available.
Bazaar Analyzer
Java-based application housed on the web server. Graphical path analysis
shows the path taken by a patron through your website. Offers a number
of security and report options.
NetIntellect
Supports Unix webservers including Apache. Can schedule generation of
reports. Path analysis shows the most traveled paths taken by patrons
on your site. DNS lookup. Reports based on user profiles. Customizable
reports.
NetTracker
Server-based application. On-the-fly reports. "Drill-down"
tracking of patron traffic.
Sawmill
Runs on web server as a CGI program. Hierarchical statistics -- not
report-based.
W3Perl
Shareware. Converts URLs into document titles. Options for excluding
frame pages and robot hits.
Webalizer
C language application. Reads Common Log Format and Combined Log Format.
Supports search string analysis. Tracks page exit and entry statistics.
WebTrends
Built-in scheduler for printing reports. Compatible with Apache, Netscape
and IIS servers. Available only for Windows and NT systems.
Wusage
Report macros. Multiple report formats keyed to different users. Enhanced
trend analysis. Supports cookies. Ability to filter elements, such as
.gif and .jpg files, from being included in statistics.
What they say:
(Internet sources for product reviews)
Cnet.com
Internet Product Watch
Internet World June 1997
PC Magazine
WebDeveloper.com
WebReview.com