This page last modified: Jul 26 2006
title:Analog report tutorial
keywords:analog,howto,web,logs,page,traffic,request,reports,apache,http,httpd,search,engine,web,server,statistics
description:A quick overview on how to interpret web site traffic reports from Analog.

Quick overview
--------------

Use one month's data when possible, and copy/paste into a spreadsheet if you need several months' figures.

Use "successful requests for pages" to measure your site traffic. Specifically, look at the line "Average successful requests for pages per day". Note: "requests" are different from "requests for pages".

A nice small site with some good content can expect roughly 500 requests for pages per day. A good site (but not famous) will be more like 3000 page requests per day. Somewhere around 500,000 page requests per day is where you start hitting the big time.

Use the "Daily Report" to get an idea of traffic shape throughout the month.

Use the "Request Report" to understand the relative popularity of pages and sections of your web site.

Introduction
------------

Analog creates summary statistics of web logs from the Apache web server. There are several aspects of HTTP (the protocol used to request and serve web pages) that limit the amount of information available in a web log. Specifically, it is difficult to track an individual user. Many companies would like to track "sessions" as opposed to the number of times pages are viewed. Cookies and some related technologies can help, but these features are not available from Analog.

Even with special tracking techniques, there is a large margin of error in tracking sessions. For instance, it is impossible to tell if the user has ended a session or is merely taking a break. Web site features that attempt to focus on user tracking generally fail to meet business goals. I strongly recommend that businesses use standard metrics such as revenue, and spend less time on weak numbers such as number of sessions or session length in minutes.

Incidentally, web page requests are not quite the same as advertising industry "impressions".

Ignore the stats which are simply called "requests". (We are only interested in "page requests".) A given web page may require many requests, since each image is a separate request. The raw number of requests is only useful for system administrators following server load (and sysadmins have other, more germane stats for server load).

The default setup for Analog may not be quite optimal for most sites. I recommend that you have at least these three Analog reports: General Summary, Daily Report, and Request Report.

Page requests
-------------

Start with an Analog report by looking at the "Analyzed requests from ... to ..." line at the top of the report (third line). I recommend looking at a month of data in each report. Depending on what your system administrators have done, the report could contain a few hours of the preceding or following month. A few hours won't make a difference in the stats, but your sysadmin(s) should be able to create a report that contains all of a single month with no overlap.

The first numbers of interest are page requests, specifically successful requests for pages. With a month of data, we can assume the average daily stats are reasonably accurate. In the "General Summary", look at the first number on the line "Average successful requests for pages per day". The numbers in parentheses refer to the prior 7 days, but we don't care about this since there is a more granular view of daily page requests in the Daily Report.

For a more visual picture, and to get an idea of how traffic varies during the month, go to the Daily Report section. There are three columns: date, reqs, pages. You are interested in date and pages. There is generally some variability through the week. If you start a successful promotional campaign, you should be able to see the page requests spike (or ramp up) when the campaign starts.
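
To make the difference between "reqs" and "pages" concrete, here is a rough Python sketch that builds the same two daily columns from raw Apache log lines. It is only an illustration under some assumptions, not how Analog itself is implemented. I am assuming the log is in Common Log Format, that a "successful" request means a 2xx or 304 status, and that a "page" is any URL ending in .html, .htm, or / (as far as I know that matches Analog's defaults, but your configuration may differ). The names daily_counts and average_pages_per_day are made up for this example.

```python
import re
from collections import defaultdict
from datetime import date

# Common Log Format: host ident user [day/Mon/year:time zone] "request" status bytes
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<d>\d{2})/(?P<mon>[A-Za-z]{3})/(?P<y>\d{4}):[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})(?:\s|$)'
)
MONTHS = {m: i + 1 for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split())}

# Assumption: what counts as a "page". Adjust to match your own setup.
PAGE_SUFFIXES = (".html", ".htm", "/")


def is_success(status):
    """Assumption: 2xx responses and 304 (not modified) count as successful."""
    return 200 <= status < 300 or status == 304


def is_page(path):
    """True if the URL, ignoring any query string, looks like a page."""
    return path.split("?", 1)[0].lower().endswith(PAGE_SUFFIXES)


def daily_counts(lines):
    """Return {date: [successful requests, successful page requests]}."""
    counts = defaultdict(lambda: [0, 0])
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or not is_success(int(m.group("status"))):
            continue  # skip unparseable lines and failed requests
        day = date(int(m.group("y")), MONTHS[m.group("mon")], int(m.group("d")))
        counts[day][0] += 1
        if is_page(m.group("path")):
            counts[day][1] += 1
    return dict(counts)


def average_pages_per_day(counts):
    """Mean page requests over the days that appear in the log.

    Simplification: this divides by the number of days seen, which is not
    necessarily how Analog computes its own daily average.
    """
    if not counts:
        return 0.0
    return sum(pages for _reqs, pages in counts.values()) / len(counts)


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as log:  # path to an access log
        per_day = daily_counts(log)
    for day in sorted(per_day):
        reqs, pages = per_day[day]
        print(day, reqs, pages)
    print("average pages per day:", average_pages_per_day(per_day))
```

A page with one image shows up here as two requests but only one page request, which is why the "pages" column is the one to watch.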

Request Report
--------------

This section is the popularity ranking of pages in your site. There are many factors here, including how well you have implemented a program of search engine optimization. Your site's best page may not be the most popular due to outside factors. Web page popularity is partly due to the quality and appeal of the content, and partly due to the Google ranking of each page.

Remember, people coming in from a search engine can land on any page in your site (unless you have made one of several possible grievous errors that I'll mention later). Pages are ranked based on several factors; the most common is how well the search term matches the text content on your pages.

On my sites, I've found that pages with basic information for beginners using popular brands tend to be my most popular pages. My most extensive content relates to the classic VW Beetle. However, the Beetle was sold in the U.S. between 1959 and 1979. There is a limited audience for cars that are more than 20 years old. My VW content is popular, but mixed in are other items relating to newer vehicles. Honda started making the VLX motorcycle in the 1980s and still makes it. This is a popular motorcycle for people who are starting out. It is no surprise that the single most popular page on my site is the home page for my VLX content.

Search Engine Optimization
--------------------------

Very briefly, search engine optimization requires your web pages to have at least these four attributes (the complete list is closer to thirty items):

1) Use keywords important to your pages in: page title, keywords meta tag, description meta tag, links to your pages, and in the text of the pages.
2) Have as much text as possible for the search engines to index, with the text in normal HTML (not in images or other forms that are difficult or impossible to index via computer). Three hundred words is a good starting point.
3) Use ALT text (the alt attribute, often called an ALT tag) on all images.
4) Avoid JavaScript navigation, image-only buttons and navigation, Flash-only designs and navigation, and frames.

Google can't index text inside Flash or images, so that text gains you nothing except visual appeal if a human being manages to find your page. Google (and the other search engines) can't follow links that are JavaScript only, or are hidden inside Flash animations. If Google can only find your home page, and cannot send its spider crawling through your entire web site, you will have a lower ranking.

Frames are horrible for at least 12 reasons. The single reason I'll cover here is that frames prevent linking to interior pages. This makes it impossible for any search engine to send people to pages inside your site. Since these pages outnumber your main page(s), frames restrict how much traffic the search engines can send you.
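
A few of the items in the checklist above can be checked mechanically. The following is a minimal Python sketch, using only the standard library's html.parser, that reports the page title, counts visible words against the 300-word starting point, counts images without ALT text, and flags frames. The names SeoCheck and check_page are made up for this example, and the word count is a crude approximation (it simply splits text on whitespace). It does not try to detect JavaScript-only or Flash-only navigation.

```python
from html.parser import HTMLParser

MIN_WORDS = 300  # the "good starting point" from the checklist


class SeoCheck(HTMLParser):
    """Collects a few simple facts about one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.word_count = 0
        self.images_missing_alt = 0
        self.has_frames = False
        self._in_title = False
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            # A missing, empty, or valueless alt attribute counts as missing.
            if not (attrs.get("alt") or "").strip():
                self.images_missing_alt += 1
        elif tag in ("frameset", "frame"):
            self.has_frames = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth:
            self.word_count += len(data.split())


def check_page(html):
    """Return a dict summarizing the checks for one page of HTML text."""
    parser = SeoCheck()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "word_count": parser.word_count,
        "enough_text": parser.word_count >= MIN_WORDS,
        "images_missing_alt": parser.images_missing_alt,
        "has_frames": parser.has_frames,
    }
```

Calling check_page() on the HTML of each of your most popular pages from the Request Report is a reasonable place to start.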