A bigger technical watershed separating traffic-reporting tools, though, may be their ability to track cached pages. When PC users request Web pages through Internet service providers, they are often viewing copies that were cached once in the ISP's data centers and then served to multiple users. The result: requests answered from cache never reach the originating site, so some page views go uncounted and throw off Web-traffic reports. How many are missed? It's difficult to say. Critics of traffic-analysis tools put the number as high as 10 percent for some sites, but the vendors of those tools say the number is negligible.
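To get a feel for what a 10 percent miss rate would mean, here is a rough back-of-the-envelope sketch. It assumes the figure is the share of all page views that never reach the site's own servers; the function name and the sample numbers are illustrative, not taken from any vendor.

```python
def estimated_actual_views(logged_views, missed_percent):
    """Estimate true page views from logged views.

    Assumes missed_percent (0-99) is the share of ALL page views
    that were served from a cache and never logged by the site.
    """
    if not 0 <= missed_percent < 100:
        raise ValueError("missed_percent must be between 0 and 99")
    return logged_views * 100 / (100 - missed_percent)


# A site that logs 90,000 views while missing 10 percent of its traffic
# would really have had about 100,000 views under this assumption.
estimate = estimated_actual_views(90_000, 10)
```

If the vendors are right and the miss rate is close to zero, the estimate and the logged count are nearly identical.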
The Web traffic tools that are most susceptible to missing cached pages, critics argue, are those that use data-collection methods known as log-file analysis and network packet sniffing, such as WebTrends and NetGenesis. Network packet sniffers, which usually reside on standalone servers between the Web server and the firewall, scan Web server data packets that stream past, copy them, and then forward them to a database. Log-file analysis reads the requests recorded by Web, proxy, and other Internet servers--noting such things as the visitor's IP address and the time it took to process the request--and sends them to a database for subsequent number crunching.
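A minimal sketch of log-file analysis might look like the following. It assumes the server writes the widely used Common Log Format with one extra trailing field for the seconds taken to process the request; that extra field, the sample lines, and the rule that only `.html` requests count as page views are assumptions made for illustration, not a description of how WebTrends or NetGenesis work.

```python
import re
from collections import Counter

# Common Log Format plus a trailing "seconds taken" field.
# The trailing field is an assumption for this sketch.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) (?P<seconds>\d+)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["seconds"] = int(record["seconds"])
    return record


def page_views(lines):
    """Count successful GET requests for .html pages, keyed by path."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if (record and record["method"] == "GET"
                and record["status"] == 200
                and record["path"].endswith(".html")):
            counts[record["path"]] += 1
    return counts


SAMPLE = [
    '10.0.0.1 - - [12/Jun/2000:10:15:32 -0700] "GET /index.html HTTP/1.0" 200 5120 1',
    '10.0.0.2 - - [12/Jun/2000:10:15:40 -0700] "GET /index.html HTTP/1.0" 200 5120 0',
    '10.0.0.2 - - [12/Jun/2000:10:15:41 -0700] "GET /logo.gif HTTP/1.0" 200 830 0',
]
views = page_views(SAMPLE)
```

The weakness the critics point to is visible here: a page view answered by an ISP's cache produces no line in the site's log, so there is nothing for the parser to count.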
Proponents of log-file analysis insist that tracking cached pages is not the problem it was with the Web design tools available just a few years ago. "I understand why advertisers are concerned about this, but it's a bit of an urban legend now," says Kevin Epstein, director of product management in Inktomi's networks products division. "All you need to do is put a noncacheable object in your page, like a piece of text." That way, even if graphics and banner ads are served from cache memory, the pages will still be tracked.
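One common way to make an object noncacheable is to send it with HTTP headers that tell caches not to store or reuse it. The sketch below uses Python's standard `http.server` module; the handler name, the `ok` response body, and the exact header combination are illustrative choices, not a method attributed to Inktomi.

```python
from http.server import BaseHTTPRequestHandler


def no_cache_headers():
    """Headers asking browsers and proxy caches not to reuse the response."""
    return [
        ("Cache-Control", "no-cache, no-store, must-revalidate"),  # HTTP/1.1 caches
        ("Pragma", "no-cache"),                                    # older HTTP/1.0 caches
        ("Expires", "0"),                                          # treated as already expired
    ]


class NoCacheTextHandler(BaseHTTPRequestHandler):
    """Serves a tiny piece of text that must be fetched from the origin each time."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        for name, value in no_cache_headers():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)
```

To try it, pass `NoCacheTextHandler` to `http.server.HTTPServer` and embed a reference to that server in the page. Each page view then triggers a request that reaches the origin server's log, even when the rest of the page comes from a cache.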
But even vendors that consistently count cached pages aren't always on the same page. WebSideStory and PC Data Online, for example, do capture traffic routed through caching servers, but their services still report different traffic numbers for the same Web pages over the same period. Why? Again, different methods. WebSideStory uses a technique known as page tagging, in which the company's clients place a few lines of code at the bottom of each Web page they want tracked. Each time that page is requested, whether or not the page has been cached, WebSideStory is notified.
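The idea behind page tagging can be sketched in a few lines. In this simplified model, the tag on each page makes a request to a collector with a unique value in the URL, so no cache can answer it, and the collector tallies the hits. The collector address, the parameter names, and the counter-based nonce are hypothetical; WebSideStory's actual tag is not described in this article.

```python
import itertools
from collections import Counter
from urllib.parse import parse_qs, urlencode, urlsplit

COLLECTOR_URL = "http://stats.example.com/hit"  # hypothetical collector endpoint
_nonce = itertools.count(1)


def tag_request_url(page):
    """Build the URL a page's tag would request on each view.

    The changing "n" value makes every URL unique, so a cache never
    holds a stored copy that matches it.
    """
    query = urlencode({"page": page, "n": next(_nonce)})
    return f"{COLLECTOR_URL}?{query}"


def record_hit(counts, url):
    """Collector side: read the page name from the URL and count the view."""
    params = parse_qs(urlsplit(url).query)
    page = params["page"][0]
    counts[page] += 1
    return page


counts = Counter()
first = tag_request_url("/index.html")
second = tag_request_url("/index.html")
record_hit(counts, first)
record_hit(counts, second)
```

Because the tag runs when the page is displayed, it fires even when the page itself was delivered from an ISP's cache.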
PC Data Online, on the other hand, doesn't code each page, but captures the URL requested by placing tracking software on survey participants' hard drives. (PC Data's tracking software, @PC Data, starts tracking Internet usage as soon as users open up their browsers. @PC Data collects and temporarily stores a log of participants' Web activities for 15 minutes. The data is then sent in real time in an encrypted message to PC Data.) Caching, therefore, is not an issue for either service: both observe the request at the user's browser rather than at the Web server.
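The collect-then-send behavior of panel software can be modeled roughly as follows. This is only a sketch of the general buffering pattern: the class name, the `send` callback, and the use of plain timestamps in seconds are assumptions, and the encryption step mentioned above is left out entirely. It is not PC Data's actual software.

```python
class PanelLogger:
    """Buffers visited URLs on the participant's machine and sends them in batches."""

    def __init__(self, send, flush_after=15 * 60):
        self.send = send                # callback that transmits one batch
        self.flush_after = flush_after  # seconds to hold a batch (15 minutes)
        self.buffer = []
        self.window_start = None

    def visit(self, url, now):
        """Record one URL; 'now' is a timestamp in seconds."""
        if self.window_start is None:
            self.window_start = now
        self.buffer.append((now, url))
        if now - self.window_start >= self.flush_after:
            self.flush()

    def flush(self):
        """Send whatever is buffered and start a new window."""
        if self.buffer:
            self.send(list(self.buffer))
        self.buffer = []
        self.window_start = None


sent_batches = []
logger = PanelLogger(sent_batches.append)
logger.visit("http://example.com/", now=0)
logger.visit("http://example.com/news.html", now=600)
logger.visit("http://example.com/sports.html", now=900)  # 15 minutes reached
```

Since the URL is captured on the participant's own machine, it is recorded whether the page came from the origin server or from a cache along the way.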
The best bet is to use a combination of third-party auditing tools, such as those offered by Media Metrix or PC Data, and analysis tools from NetGenesis, WebTrends, and the like. The auditing tools will help you compare your site to others in your market segment. The analysis tools will give you more specifics on your site.