
This page last modified: Jul 26 2006
title:Analog report tutorial
keywords:analog,howto,web,logs,page,traffic,request,reports,apache,http,httpd,search,engine,web,server,statistics
description:A quick overview on how to interpret web site traffic reports from Analog.


Quick overview
--------------

Use one month's data when possible, and copy/paste into a spreadsheet
if you need several months' figures.

Use "successful requests for pages" to measure your site traffic;
specifically, look at the line "Average successful requests for pages
per day". Note: "requests" are different from "requests for pages". A
nice small site with some good content can expect roughly 500 requests
for pages per day. A good (but not famous) site will be more like 3000
page requests per day. Somewhere around 500,000 page requests per day is
where you start hitting the big time.

Use the "Daily Report" to get an idea of traffic shape throughout the
month. 

Use the "Request Report" to understand the relative popularity of
pages and sections of your web site.



Introduction
------------

Analog creates summary statistics of web logs from the Apache web
server. There are several aspects of the HTTP protocol (used to
request and serve web pages) that limit the amount of information
available in a web log. Specifically, it is difficult to track an
individual user. Many companies would like to track "sessions" as
opposed to the number of times pages are viewed. Cookies and some
related technologies can help, but these features are not available
from Analog. Even with special tracking techniques, there is a large
margin of error in tracking sessions. For instance, it is impossible
to tell if the user has ended a session, or is merely taking a
break. Web site features that attempt to focus on user tracking
generally fail to meet business goals. I strongly recommend that
businesses use standard metrics such as revenue, and spend less time
on weak numbers such as number of sessions or session length in
minutes. Incidentally, web page requests are not quite the same as
advertising industry "impressions". 

Ignore the stats which are simply called "requests". (We are only
interested in "page requests".) A given web page may require many
requests since each image is a separate request. This number of
requests is only useful for system administrators following server
load (and sysadmins have other more germane stats for server load).
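
To make the difference between "requests" and "page requests" concrete,
here is a minimal Python sketch that counts both from a few log lines
in Apache Common Log Format. It is not how Analog itself is
implemented. The sample log lines are invented, and two rules are
assumptions for illustration: a request counts as successful when its
status code is in the 200s or is 304, and a "page" is any URL ending in
"/", ".html" or ".htm". Check your own Analog configuration for the
rules it actually applies.

```python
import re

# Picks the method, path and status code out of a Common Log Format line.
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) ')

# Assumption: a "page" is a URL ending in one of these suffixes.
PAGE_SUFFIXES = ("/", ".html", ".htm")


def count_requests(lines):
    """Return (successful_requests, successful_page_requests)."""
    requests = pages = 0
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        status = int(match.group("status"))
        # Assumption: 2xx and 304 responses count as successful.
        if not (200 <= status < 300 or status == 304):
            continue
        requests += 1
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if path.lower().endswith(PAGE_SUFFIXES):
            pages += 1
    return requests, pages


# Invented sample: one page view that pulls in two images and a stylesheet,
# plus one failed request for a page that does not exist.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Jul/2006:10:00:00 -0400] "GET /index.html HTTP/1.1" 200 5120',
    '10.0.0.1 - - [01/Jul/2006:10:00:01 -0400] "GET /images/logo.gif HTTP/1.1" 200 2048',
    '10.0.0.1 - - [01/Jul/2006:10:00:01 -0400] "GET /images/photo.jpg HTTP/1.1" 200 40960',
    '10.0.0.1 - - [01/Jul/2006:10:00:02 -0400] "GET /style.css HTTP/1.1" 200 900',
    '10.0.0.2 - - [01/Jul/2006:10:05:00 -0400] "GET /missing.html HTTP/1.1" 404 300',
]

requests, pages = count_requests(SAMPLE_LOG)
print(requests, "successful requests,", pages, "successful page requests")
```

In this sample a single visitor viewing a single page produces four
successful requests but only one successful page request, which is why
the plain "requests" figure overstates traffic.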

The default setup for Analog may not be quite optimal for most sites. I
recommend that you have at least these three Analog reports: General
Summary, Daily Report, and Request Report. 



Page requests
-------------

Start reading an Analog report by looking at the "Analyzed requests from
... to ..." line at the top of the report (third line). I recommend looking
at a month of data in each report. Depending on what your system
administrators have done, the report could contain a few hours of the
preceding or following month. A few hours won't make a difference in
the stats, but your sysadmin(s) should be able to create a report that
contains all of a single month with no overlap. 

The first numbers of interest are page requests, specifically
successful requests for pages. With a month of data, we can assume
the average daily stats are reasonably accurate.

In the "General Summary", look at the first number on the line
"Average successful requests for pages per day". The numbers in
parentheses refer to the prior 7 days, but we don't care about this
since there is a more granular view of daily page requests in the
Daily Report.
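
The daily average is simple arithmetic: successful page requests
divided by the number of days covered. The Python sketch below shows
the calculation for a full calendar month. The function name and the
figure of 15,500 page requests are hypothetical, and Analog may measure
the reporting period more precisely than whole days, so expect small
differences from the number in the General Summary.

```python
from datetime import date


def average_pages_per_day(total_pages, first_day, last_day):
    """Average page requests per day over an inclusive range of dates."""
    days = (last_day - first_day).days + 1  # +1 because both ends count
    return total_pages / days


# Hypothetical month: 15,500 successful page requests during July 2006.
average = average_pages_per_day(15500, date(2006, 7, 1), date(2006, 7, 31))
print(round(average), "page requests per day")
```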

For a more visual picture, and to get an idea of how traffic varies
during the month, go to the Daily Report section. There are three
columns: date, reqs, pages. You are interested in date and
pages. There is generally some variability through the week. If you
start a successful promotional campaign, you should be able to see the
page requests spike (or ramp up) when the campaign starts.
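
If you copy the date and pages columns into a script instead of a
spreadsheet, a spike can be flagged mechanically. The Python sketch
below is one simple approach, not something Analog provides: it marks
any day whose page count is more than 1.5 times the median day. The
threshold, the function name, and the daily figures are all invented
for illustration.

```python
from datetime import date, timedelta
from statistics import median


def spike_days(daily_pages, factor=1.5):
    """Return the dates whose page count exceeds factor times the median day."""
    typical = median(daily_pages.values())
    return sorted(day for day, pages in daily_pages.items()
                  if pages > factor * typical)


# Hypothetical figures: about 500 pages per day, then a campaign
# that starts on July 10.
start = date(2006, 7, 1)
counts = [480, 450, 510, 520, 500, 530, 490, 470, 460,
          900, 950, 870, 820, 800]
daily_pages = {start + timedelta(days=i): n for i, n in enumerate(counts)}

for day in spike_days(daily_pages):
    print(day.isoformat(), daily_pages[day])
```

The median is used rather than the mean so that the campaign days
themselves do not pull the "typical day" figure upward too much.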



Request Report
--------------

This section is the popularity ranking of pages in your site. There
are many factors here, including how well you have implemented a
program of search engine optimization. Your site's best page may not
be the most popular due to outside factors. Web page popularity is
partly due to the quality and appeal of the content, and partly due to
the Google ranking of each page. 

Remember, people coming in from a search engine can land on any page
in your site (unless you have made one of several possible grievous
errors that I'll mention later). Pages are ranked based on several
factors; the most common is how well the search term matches the text
content on your pages.

On my sites, I've found that pages with basic information for
beginners using popular brands tend to be my most popular pages. My
most extensive content relates to the classic VW Beetle. However, the
Beetle was sold in the U.S. between 1959 and 1979. There is a limited
audience for cars that are more than 20 years old. My VW content is
popular, but mixed in are other items relating to newer
vehicles. Honda started making the VLX motorcycle in the 1980s and
still makes it. This is a popular motorcycle for people who are
starting out. It is no surprise that the single most popular page on
my site is the home page for my VLX content.



Search Engine Optimization
--------------------------

Very briefly, search engine optimization requires your web pages
to have at least these four attributes (the complete list is closer
to thirty items):

1) Use keywords important to your pages in: page title, keywords meta tag, description
meta tag, links to your pages, and in the text of the pages.

2) Have as much text as possible for the search engines to index, and
keep that text in normal HTML (not in images or other forms that are
difficult or impossible to index via computer). Three hundred words is
a good starting point.

3) Use ALT text (the "alt" attribute, often called an ALT tag) on all images.

4) Avoid JavaScript navigation, image-only buttons and navigation,
Flash-only designs and navigation, and frames. Google can't index text
inside Flash or images, so that text gains you nothing except visual
appeal if a human being manages to find your page. Google (and the
other search engines) can't follow links that are JavaScript only, or
are hidden inside Flash animations. If Google can only find your home
page, and cannot send its spider crawling through your entire web
site, you will have a lower ranking. Frames are horrible for at least
12 reasons. The single reason I'll cover here is that frames prevent
linking to interior pages. This makes it impossible for any search
engine to send people to pages inside your site. Since these pages
outnumber your main page(s), frames restrict how much traffic the
search engines can send you.
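
Several of the four points above can be checked mechanically. The
Python sketch below uses the standard library's html.parser to look for
a page title, keywords and description meta tags, images without ALT
text, frames, and the amount of indexable text. It is only a rough
illustration: the class and function names and the sample page are
invented, it does not check whether your keywords appear in links or
body text, and it cannot detect JavaScript-only or Flash-only
navigation.

```python
from html.parser import HTMLParser


class SeoCheck(HTMLParser):
    """Collects the few facts about a page that the checks below need."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.images_missing_alt = 0
        self.frames = 0
        self.words = 0
        self._in_title = False
        self._skip = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "img" and not attrs.get("alt"):
            self.images_missing_alt += 1
        elif tag in ("frame", "frameset"):
            self.frames += 1
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip:
            self.words += len(data.split())  # count visible words only


def seo_problems(html, min_words=300):
    """Return a list of plain-English problems found in one HTML page."""
    check = SeoCheck()
    check.feed(html)
    check.close()
    problems = []
    if not check.title.strip():
        problems.append("missing page title")
    for name in ("keywords", "description"):
        if not check.meta.get(name, "").strip():
            problems.append("missing %s meta tag" % name)
    if check.images_missing_alt:
        problems.append("%d image(s) without ALT text" % check.images_missing_alt)
    if check.frames:
        problems.append("page uses frames")
    if check.words < min_words:
        problems.append("only %d words of indexable text" % check.words)
    return problems


# Invented sample page: no keywords meta tag, one image without ALT text,
# and very little text.
SAMPLE_PAGE = """<html><head><title>VLX starter guide</title>
<meta name="description" content="Basic information for new VLX riders.">
</head><body><h1>VLX starter guide</h1>
<p>Basic information for beginners.</p>
<img src="vlx.jpg"></body></html>"""

for problem in seo_problems(SAMPLE_PAGE):
    print(problem)
```

The 300-word minimum is taken from point 2 above; pass a different
min_words value if your pages are intentionally short.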