Navicosoft, Hosting and Development Company, Lahore Pakistan, Canada Hosting
Understanding Site Statistics
How do you know how many people have visited your site? How do you know which pages they viewed? And what is meant by unique visitors, hits, page views and raw log access? This article explains these terms so that you will have a better grasp of what is happening on your site. We will start by defining the terms used in judging site statistics and then move on to how they are used. The first term defined is raw log access, because it is the foundation for the other terms and for all site statistics.
Raw log access is a term used to describe a file which contains a record of all of the visits by people to your site. When a person visits your site, their web browser sends a request for the web page to the server, and in response the server sends the web page back to the browser so it can be displayed on the computer. In reality, though, the browser's request is not just "send the web page". A web page is actually made up of many separate pieces. For example, if you have four pictures on your web page the browser will send five different requests to the server: one for the web page and one for each picture.
This is important because each of these requests is logged separately in the raw log access file. You might also know these requests by their common name, "hits". When someone says they have had one million hits on their web site, it means they have had one million requests for web pages AND for the pieces of those web pages. Hits used to be a fairly common measure of how popular a site was, but they are not a good measure of how busy a site is.
The reason is that if you have one web page with 100 pictures, one visitor who views this page will log 101 hits in the access file (one for the web page and one for each picture on it). If this web page had five visitors it would log 505 hits. That sounds like a lot, but only five people viewed the page. Now compare this to a page which has just one picture and some text on it. Each person viewing it would log only two hits (one for the page and one for the picture). If 100 people visited this second page the hit count would only be 200. On the face of it the first page has more traffic, but in reality it doesn't. This is why the number of hits a site receives doesn't mean a lot: it is totally dependent on how the site is built. So what is a better method to compare two sites?
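The arithmetic in this comparison can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it assumes every visitor loads the page exactly once and that the browser requests every picture each time (no caching).

```python
def hits_per_view(pictures_on_page: int) -> int:
    """One request for the page itself plus one request per picture."""
    return 1 + pictures_on_page


def total_hits(pictures_on_page: int, visitors: int) -> int:
    """Hits logged when each visitor loads the page exactly once."""
    return hits_per_view(pictures_on_page) * visitors


# The two pages from the example above.
first_page = total_hits(pictures_on_page=100, visitors=5)
second_page = total_hits(pictures_on_page=1, visitors=100)

print("First page: ", first_page, "hits from 5 visitors")
print("Second page:", second_page, "hits from 100 visitors")
```

The first page logs more than twice as many hits even though it had one twentieth of the visitors.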
The next most commonly used term is "page views". Page views seek to compare apples with apples, so that you can evaluate web pages against each other independent of how each page is constructed. The term basically says that there were x requests for a web page in a certain amount of time. It is calculated by going through the raw log access file, picking out just the requests for the web pages themselves and ignoring all the rest. In other words: never mind how many hits there were, just tell me how many requests there were for the web page. This is a good indicator of how many times a particular web page was viewed, or how many pages were viewed on a site.
On the face of it this is a good indicator, but it also has its problems. Let us look again at our example with the two web pages. The first page had 505 hits; the second, 200. The page views for these two pages would be 5 for the first and 100 for the second. That is a much better indicator, but what happens when one person views a page, leaves, then comes back and views the same page again? Or when one person refreshes the web page in the browser? Each time this happens another web page request is logged. So you could have ten repeat visitors who each view a site a couple of times a month, or one person who looks at your site once a day. How can you tell which has the most people visiting? You can't from page views. With page views you get duplication which is unaccounted for when viewing the results. So as a measure of overall traffic it is good, but for a detailed analysis of how people are coming to your site it is ineffective.
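As a rough sketch of how page views can be separated from hits, the Python below walks a list of requested paths and counts only the ones that look like web pages. The rule used to recognise a page (a path ending in `.htm`, `.html` or `/`) and the sample paths are assumptions made for this illustration; a real statistics program would let you configure which file types count as pages.

```python
# Assumption for this sketch: these endings mark a request for a page.
PAGE_EXTENSIONS = (".htm", ".html")


def is_page_request(path: str) -> bool:
    """Return True if the requested path looks like a web page."""
    path = path.split("?", 1)[0].lower()  # drop any query string
    return path.endswith("/") or path.endswith(PAGE_EXTENSIONS)


def count_hits_and_page_views(paths):
    """Every request is a hit; only page requests are page views."""
    hits = 0
    page_views = 0
    for path in paths:
        hits += 1
        if is_page_request(path):
            page_views += 1
    return hits, page_views


# Made-up sample: one page with two pictures, loaded twice.
requested = [
    "/gallery.htm", "/img/a.jpg", "/img/b.jpg",
    "/gallery.htm", "/img/a.jpg", "/img/b.jpg",
]

hits, page_views = count_hits_and_page_views(requested)
print(hits, "hits,", page_views, "page views")
```

Note that the two page views here could come from two different people or from one person pressing refresh. The count alone cannot tell them apart.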
We have gone over two methods of evaluating web site traffic so far, but there is one more method in current use: counting individual, or unique, visitors. When a request is logged in the raw log access file it might look like this:
198.162.0.1 - - [09/Jul/2003:15:30:19 -0400] "GET /index.htm HTTP/1.1" 200 - "http://www.Navicosoft.com/index.htm"
The first part of the line is the IP address of the computer which requested the web page. We can see the date and time of the request in the square brackets. In the quoted "GET" part we can see that they requested the index.htm file; the address in quotes at the end of the line is the page the visitor came from. We can also see that the code "200" (a successful request, and the web page was sent) was logged. (If the web page was not found you would see an error code of 404. Another common code is 304, which means that the web page has not been modified since the visitor last requested it.)
The important part is that whenever the computer located at the IP address 198.162.0.1 requested a web page from the site, its IP address was logged. If you went through the entire raw log access file and counted the different IP addresses, you would have the number of unique IP addresses and therefore a rough estimate of how many different people visited your site. So if we went back to our examples and counted the number of different people who visited, we might find that the first site had more unique visitors than the second one.
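The counting described above can be sketched in Python as follows. The regular expression is an assumption based on the single sample line shown in this article, not a complete parser for every server's log format, and the second and third sample lines are made up for illustration.

```python
import re

# Pattern written to fit the sample line in this article (an assumption;
# real log formats vary from server to server).
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def unique_ips(lines):
    """Collect the distinct IP addresses found in the given log lines."""
    addresses = set()
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            addresses.add(entry["ip"])
    return addresses


SAMPLE_LINES = [
    # The line from this article.
    '198.162.0.1 - - [09/Jul/2003:15:30:19 -0400] "GET /index.htm HTTP/1.1" 200 - "http://www.Navicosoft.com/index.htm"',
    # Two made-up lines for illustration.
    '198.162.0.1 - - [09/Jul/2003:15:30:20 -0400] "GET /logo.gif HTTP/1.1" 304 - "http://www.Navicosoft.com/index.htm"',
    '198.162.0.5 - - [10/Jul/2003:09:12:44 -0400] "GET /index.htm HTTP/1.1" 200 - "-"',
]

first = parse_line(SAMPLE_LINES[0])
print(first["ip"], first["path"], first["status"])
print(len(SAMPLE_LINES), "hits from", len(unique_ips(SAMPLE_LINES)), "unique IP addresses")
```

Using a set means each address is counted once no matter how many requests it made, which is exactly the difference between hits and unique visitors.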
But this is not a perfect method either. Simply counting IP addresses might give you a lower (or higher) estimate of the number of people visiting your site than the real figure. This is because visitors who use dial-up modems are assigned a temporary IP address each time they dial in to the internet. So if one person visits your site and is assigned the IP 198.162.0.1 on one day and 198.162.0.5 on the next, they will be logged as two unique visitors when in reality there was only one. Also, one person might dial in, use one IP to visit your site and then log off, and then another person might dial in, be assigned the same IP and visit you. This will show up as one unique IP even though two different people visited. Even so, this method will give you a general view of how many people visited.
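Both kinds of error can be shown with a small Python sketch. The visitor names and the visit lists are invented for this illustration; a real log file records only the IP address, which is exactly why the true number of people cannot be recovered from it.

```python
def count_people_and_ips(visits):
    """Given (person, ip) pairs, return (number of people, number of distinct IPs)."""
    people = {person for person, _ in visits}
    addresses = {ip for _, ip in visits}
    return len(people), len(addresses)


# Over-count: one dial-up user is given a new address on the second day.
changing_address = [
    ("visitor_a", "198.162.0.1"),  # day one
    ("visitor_a", "198.162.0.5"),  # day two, new temporary IP
]

# Under-count: two users are handed the same address one after the other.
shared_address = [
    ("visitor_b", "198.162.0.1"),
    ("visitor_c", "198.162.0.1"),  # assigned after visitor_b logged off
]

for label, visits in [("changing address", changing_address),
                      ("shared address", shared_address)]:
    people, addresses = count_people_and_ips(visits)
    print(f"{label}: {people} people, {addresses} unique IPs")
```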
For example, a quick snapshot of the statistics for Navicosoft.com shows the following:
There were 2,606 unique visitors who visited a total of 3,557 times who requested 11,236 pages and recorded 62,905 hits.
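One way to relate these four figures to each other is to divide each by the next, as in the Python sketch below. The ratio names are labels chosen for this illustration, not terms reported by any particular statistics program.

```python
# Figures from the snapshot above.
unique_visitors = 2606
visits = 3557
pages = 11236
hits = 62905

visits_per_visitor = visits / unique_visitors  # how often people came back
pages_per_visit = pages / visits               # how much they looked at each time
hits_per_page = hits / pages                   # how many pieces make up a typical page

print(f"Visits per unique visitor: {visits_per_visitor:.2f}")
print(f"Pages per visit:           {pages_per_visit:.2f}")
print(f"Hits per page:             {hits_per_page:.2f}")
```

Roughly speaking, each unique IP visited a little more than once, each visit covered about three pages, and each page request brought five or six hits with it.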
Which number is better? It really depends on what you want to know. We can see that several thousand different IP addresses were logged (give or take some duplication or underestimating). We can also see that they requested a little over 11,000 web pages (were some of those browser refreshes?). And because of our site design there were a little over 62,000 individual requests in the raw log access file. So how do we know which pages these people visited? Where did they come from? How long did they stay on the site? And, more importantly, do I have to count several thousand log entries in the file to determine all of this? The answer to all of these questions lies in using a graphical site statistics program. With the click of a button all this information is within your reach. In our next article we will discuss three of the most popular programs that will read your raw log access file and provide you with this information instantly. But before you can understand the information coming from those programs you have to understand the terminology they display. And that is what we have accomplished here.
None of these methods is perfect, but without knowing their flaws (and their strengths) you cannot effectively tell what is happening on your site. Each is useful in filling in one aspect of the picture, and all should be taken with a grain of salt: combined, they show in general how your site is doing, not exactly how it is doing.
Copyright © 2002-2006 Navico Inc. Lahore Pakistan. All Rights Reserved.