
Analyzing Web Server Logs

Log files are not fun to look at. They are ugly, contain too much information, and often lead to massive headaches. Fortunately, these beasts can be tamed and put to uses beyond debugging: they can be used to generate wonderful reports that make sense. A number of programs are out there to analyze Web server logs, and this article will cast the spotlight in their direction.
Before evaluating the software packages, determine the type of data you wish to see. Although most of the software we looked at supports more than just Web server logs, this article discusses only Web server output. Log analysis programs can show everything from a list of IP addresses connected to the Web server to a pie chart detailing which files were accessed most often. The majority of popular Web log analysis tools try to make sense of every piece of data in the logs, but few succeed in making the data readable.
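To make those two kinds of report concrete, here is a minimal sketch that tallies client IP addresses and requested files from a handful of log lines. It is not taken from any of the packages reviewed here. The sample lines, the addresses in them, and the function name `tally` are all made up for illustration, and the parsing assumes each line follows the usual Apache layout with the request enclosed in double quotes.

```python
from collections import Counter

# Hypothetical sample lines (addresses come from the documentation ranges).
SAMPLE_LINES = [
    '192.0.2.10 - - [10/Oct/2005:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.10 - - [10/Oct/2005:13:55:40 -0700] "GET /about.html HTTP/1.0" 200 1200',
    '198.51.100.7 - - [10/Oct/2005:13:56:01 -0700] "GET /index.html HTTP/1.0" 200 2326',
]


def tally(lines):
    """Count hits per client IP and per requested path.

    Assumes the client address is the first whitespace-separated field
    and the request ("METHOD PATH PROTOCOL") is the first quoted field.
    Lines that do not fit that shape are skipped.
    """
    ips = Counter()
    paths = Counter()
    for line in lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # no quoted request field
        prefix = parts[0].split()
        request = parts[1].split()
        if not prefix or len(request) < 2:
            continue  # missing client address or request path
        ips[prefix[0]] += 1
        paths[request[1]] += 1
    return ips, paths


if __name__ == "__main__":
    ips, paths = tally(SAMPLE_LINES)
    print("Hits per IP:", ips.most_common())
    print("Most requested files:", paths.most_common(3))
```

A real analysis tool does far more than this (sessions, referrers, charts), but the counts above are the raw material behind an IP listing or a "most accessed files" pie chart.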
Some log file analysis packages cannot distinguish pertinent information from the raw log file itself. Displaying statistics in an aesthetically pleasing manner is a very important attribute. Every once in a while, user interface designers create a new paradigm, setting a standard that other designers attempt to emulate. Arguably, Apple has done this with its OS X desktop environment. No Web log analysis program has gone that far, but some present their data much better than others.
Webalizer is one popular log analysis tool. Many people prefer it because it is written in C and runs quite fast. The graphics, however, are not optimal. The gd graphics library supplies some readable charts, but they are not as attractive as they could be. The reports themselves are sufficient for providing a quick glimpse of a few important data points, namely "what pages are accessed" and "how many hits are we getting." A wealth of information can be extracted from Web logs, and when it is presented properly, that information is not so overwhelming. Webalizer is adequate, but its mediocre graphics and limited statistics earn it a mere three stars in our five-star ad hoc award system.
Analog, favored by a small group of die-hard fans, is another worthy contender. Analog attempts to present everything, but it is an example of how to include too much information for normal human consumption. By default, everything is displayed on the same Web page. A navigation bar at the top allows users to click on a specific report, which jumps to the matching section of the page. Analog’s saving grace is that the navigation bar is repeated at the top of each section, which simplifies the navigation, at least somewhat. Analog’s more interesting reports include how many hits come from each country (TLD, actually), the search engine queries that brought users to the Web site, and which browsers and operating systems visitors used. The software is capable of presenting just about everything else derivable from Web server logs. The graphics are a slight improvement over Webalizer’s gd-based graphics, but the pie and bar charts still leave much to be desired. Because Analog includes so much useful information, and the navigation isn’t completely unusable, we feel it deserves an apprehensive four out of five stars.
Summary is a commercial log analysis tool for which a 30-day trial is available. This package includes all possible information and lists options in a text Web page for users to click on. When you follow a link, for example, "Bandwidth Peak," you are brought to a fairly decent Web page that lists bandwidth usage by time. A small bar graph accompanies each entry, but the graphics in Summary are quite minimal. Here, minimal is not a defect. Quite the contrary: Summary looks really decent. However, the overall GUI is cumbersome, and it took us a good bit of time to browse to each report we wished to see. The cost of Summary is not prohibitive, and the reports are decent, albeit not awe-inspiring. We rate it four out of five.
No discussion of Web log analysis software would be complete without at least a nod to WebTrends. The sheer scope of WebTrends Web Log Analyzer (another commercial offering) earns it an honorable mention here. Its Web site makes the bold claim of increasing return on investment, and even asserts "This is Complete Web Analysis." Not surprisingly, WebTrends is not for organizations with skinny wallets. The online demos show how great GUI design should look. The company’s claims of usability appear founded, and it has even included a way to access all of the information available from Web server logs. WebTrends has been around for more than a decade and plays nicely with IIS. We are giving it four out of five stars, based solely on what we learned in the product’s impressive Web-based demo.
The grail of log analysis, AWStats, is by far the best looking of all of the free Web log analysis tools we’ve seen. AWStats is also the only Perl-based application on the list. Its graphics are superb, and its information is presented in an excellent manner. At a glance, users can view all available reports and navigate seamlessly between them. Many users will be amazed at the amount of detail the program can extract from the log files. Small browser icons and flags for various countries add to the already-pleasing GUI. AWStats includes all of the features mentioned above for other programs, and presents them in a readable format, to boot. We give it the full five stars.
Of course, there are countless other log analysis programs, but these are the more commonly deployed ones.
Compatibility, which is normally a key issue, is not a great concern when it comes to log analysis tools. The Apache Web server can write its logs in a standardized format known as the NCSA combined log format. The W3C-conformant format used by IIS is also supported by most of the analysis programs listed here.
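For readers who have never looked closely at the combined format, the following sketch shows how one line breaks down into fields. It is an illustration only, not how any of the reviewed tools parse logs. The regular expression, the field names, the function name `parse_combined`, and the sample line are our own assumptions, and the pattern does not handle escaped quotes inside the referrer or user-agent fields.

```python
import re

# One line of the combined format, as commonly written by Apache:
#   host ident user [time] "request" status size "referer" "user-agent"
COMBINED_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical sample line (address and URLs are placeholders).
SAMPLE = (
    '192.0.2.10 - frank [10/Oct/2005:13:55:36 -0700] '
    '"GET /index.html HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)


def parse_combined(line):
    """Return a dict of fields for one combined-format line, or None."""
    match = COMBINED_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A size of "-" means no response body was sent; treat it as zero bytes.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


if __name__ == "__main__":
    record = parse_combined(SAMPLE)
    for name, value in record.items():
        print(f"{name:8} {value}")
```

Because the format is this regular, the tools reviewed above can all read the same Apache logs, which is why compatibility rarely decides the choice between them.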
In a later article, we will explore the other types of log files most of these programs can work on, including mail and FTP.