WebLogAnalysis


Taking a look at one's web page access records can be both fun and instructive. (Many ISPs offer free log service, sometimes "raw", in other cases via an automatic analyzer, e.g. "Analog".) Here, for example, the first fortnight of May 2001 saw hundreds of page hits on http://www.his.com/~z/ and its associated pages from one or more individuals at "bellglobal.com" ... similar activity coming via various AOL proxies ... dozens of visits by search engine robots such as "fast-search.net", "googlebot.com", and "inktomi.com" ... international stop-overs from "utaonline.at" (Austria), "avantel.net.mx" (Mexico), "time.net.my" (Malaysia), "haifa6.actcom.co.il" (Israel), "ecolint.ch" (Switzerland), "labs.itu.edu.tr" (Turkey), etc.

The local page that got by far the most visits was http://www.his.com/~z/gibbon.html with a count of several thousand. (I fear that most of these hits were by students seeking quick quotations for their term papers ... but perhaps a few came from more voluntary seekers of knowledge.) The late Eugene Ho's essay on Edward Gibbon (http://www.his.com/~z/gibho1.html ) attracted many hundreds of looks, as did the HIS.COM Gibbonic quotation "Fortune Cookie" service http://www.his.com/cgi-bin/fortune.gibbon. The Montgomery County Coin Club (see http://www.money.org/club_mccc.html ) has its monthly newsletter and other pages hosted at the American Numismatic Association, but since those pages load an image (typically one of my 1852 large cents) stored on HIS.COM, their hits also register on the web logs here; there were a few hundred of them. The next most popular pages were a set of favorite Gibbon quotes that Eugene Ho assembled (http://www.his.com/~z/passage.html ), followed by the (in)famous ^zhurnal, formerly in a "guestbook" directory at his.com and now on http://zhurnaly.com/, where this item itself appears.

The little program that I use to analyze HIS.COM web logs is an elementary example of Perl. The guts of it are three loops: the first uses associative arrays to count hits by looker and by page looked at; the second sorts and prints the lookers in descending order of hit count; the third does the same for the pages. For the record:

 # analyze HIS.COM web logs - ^z - 20010218
 # assumes one hit per line in the format:
 #   client_addr day month year time 0 /~z/page.html
 # to run try:
 #   perl zweban <infile >results
 use strict;
 use warnings;
 my (%looker, %mypage);
 # count lookers and mypages that they look at
 while (my $line = <STDIN>) {
   my @fields = split(" ", $line);
   next unless @fields >= 7;     # skip malformed lines
   ++$looker{$fields[0]};        # field 0: client address
   ++$mypage{$fields[6]};        # field 6: page requested
 }
 print " *** Lookers at web pages ***\n";
 # sort lookers into descending order by hit count and print
 foreach my $key (sort { $looker{$b} <=> $looker{$a} } keys %looker) {
   print $looker{$key}, " ---- ", $key, "\n";
 }
 print "\n *** Mypages being looked at ***\n";
 # sort mypages into descending order by hit count and print
 foreach my $key (sort { $mypage{$b} <=> $mypage{$a} } keys %mypage) {
   print $mypage{$key}, " ---- ", $key, "\n";
 }

Straightforward stuff, which could be extended to do a more detailed analysis of who's looking at what and when. Maybe some day!
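
For instance, here is a minimal sketch of the "when" part, assuming the same seven-field log format as above: a third associative array, keyed on the date fields, tallies hits per day. (The sketch is illustrative only, not part of the original zweban.)

 # sketch: count hits per day from the same log format
 use strict;
 use warnings;
 my %byday;
 while (my $line = <STDIN>) {
   my @fields = split(" ", $line);
   next unless @fields >= 7;              # skip malformed lines
   my $date = join(" ", @fields[1..3]);   # fields 1-3: day month year
   ++$byday{$date};                       # tally hits on each date
 }
 # print dates in descending order of hit count
 foreach my $date (sort { $byday{$b} <=> $byday{$a} } keys %byday) {
   print $byday{$date}, " ---- ", $date, "\n";
 }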

Saturday, June 02, 2001 at 05:33:38 (EDT) = 2001-06-02


TopicProgramming


(correlates: GibbonomaticRequiem, LowProfile, Meaning of Blog, ...)