VisitorStats

 

Every once in a while I like to look at information about those who are visiting the ZhurnalWiki and the ^zhurnal archive at http://zhurnaly.com/ . Browsing the logs helps me separate issues worth worrying about from topics to ignore. A snapshot based on the first half of October 2003 offers some semi-surprising (to me) statistics.

But before zooming in on any details, at top-level what does it all mean? Tough to say. My tentative conclusions include:

  • ZhurnalWiki's main page deserves improvement — it's the dominant entry in the logs, since it serves as a gateway to everything as well as an instant-search facility, a quick-start intro, a set of Wiki services, and a list of the newest postings ... hence, the recent redesign
  • "CryptoQuip" is attracting folks for the wrong reason — so I've added a disclaimer at the top of that page
  • The ^zhurnal on his.com is still being read and deserves to be retained — but it's not a growth industry, and I probably shouldn't mess much with it
  • 'Bots are big — they account for almost half of all traffic on zhurnal.net
  • Everything you know is not wrong — Google still stands tall as the king kahuna of search engines and Internet Explorer remains the 600 pound orangu-browser

Now for some slightly gory minutiae. The total ZhurnalWiki hit count is averaging ~1,000/day, plus or minus 50%, as it has for most of the past year. This corresponds to ~300 "visits" daily, each of which averages 3-4 pages fetched. But that average is deceptive, since "visitors" include many search engine robots, which fetch lots of pages while crawling.
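As a quick sanity check on that arithmetic, here is a minimal Python sketch. The function names are made up for illustration, and the inputs are just the rounded figures quoted above, not values read from the actual logs.

```python
def pages_per_visit(hits_per_day, visits_per_day):
    """Average number of pages fetched in one visit."""
    return hits_per_day / visits_per_day


def daily_range(mean_hits, fractional_spread):
    """Low and high daily hit counts for a 'plus or minus' spread."""
    low = mean_hits * (1 - fractional_spread)
    high = mean_hits * (1 + fractional_spread)
    return low, high


# Rounded figures from the text: ~1,000 hits/day, ~300 visits/day, +/- 50%
print(round(pages_per_visit(1000, 300), 1))  # 3.3, i.e. "3-4 pages"
print(daily_range(1000, 0.5))                # (500.0, 1500.0)
```

So "plus or minus 50%" means anywhere from roughly 500 to 1,500 hits on a given day, and 1,000 hits spread over 300 visits is consistent with the 3-4 pages per visit figure.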

What are people and 'bots fetching? The main ZhurnalWiki page is the big winner here, followed by FindPage (which snags many malformed URL requests), CryptoQuip, an ActionNotDefined catch-all bin, and then the ever-popular RecentChanges.

In the various zhurnal.net subsites, my daughter's pages at http://zhurnal.net/~violconey/ of photos from her summer music camps are relatively popular, presumably with folks trying to decide where to go next year. My wife's pages of library-related speeches and other talks (now at http://librariesfriend.com/ ) are likewise strong. Her presentation titled "Magic Wands" leads the list, doubtless via its allure for Harry Potter fans and the like.

And what sorts of specific searches point people to the ^zhurnal? Leading the pack, as it has month after month, is "cryptoquip" (~9%) — a simple-substitution cipher puzzle that appears in various newspapers, mostly in North America. The ZhurnalWiki page CryptoQuip, which Google seems to enjoy, has nothing to do with such puzzles; it's a cypherpunk aphorism that I read many years ago and posted in April 2001. Second place among Zhurnal-linked search strings is the run-together phrase "allyourbasearebelongtous" (~3%) which leads to AllYourBaseAreBelongToUs. Coming in third (~2%) is the similarly concatenated "worldtradecenter" (see WorldTradeCenter). It briefly outshone the cryptoquip-seekers a month ago during the September 11 anniversary timeframe.
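For anyone curious how such search-string percentages might be tallied, here is a hedged Python sketch that pulls the query phrase out of referrer URLs and counts it. It is not the tool behind the numbers above. The parameter names (`q`, and `p` for Yahoo-style URLs) and the sample URLs in the usage comment are assumptions for illustration only.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumption: search engines put the phrase in a "q" or "p" query parameter.
QUERY_PARAMS = ("q", "p")


def search_phrase(referrer):
    """Return the lower-cased search phrase in a referrer URL, or None."""
    query = parse_qs(urlparse(referrer).query)
    for name in QUERY_PARAMS:
        if name in query:
            return query[name][0].strip().lower()
    return None


def phrase_shares(referrers):
    """Map each search phrase to its percentage of all search referrals."""
    phrases = [p for p in map(search_phrase, referrers) if p]
    counts = Counter(phrases)
    total = len(phrases)
    return {phrase: 100 * n / total for phrase, n in counts.most_common()}


# Hypothetical usage with made-up referrers:
# phrase_shares([
#     "http://www.google.com/search?q=CryptoQuip&hl=en",
#     "http://search.yahoo.com/search?p=worldtradecenter",
# ])
```

Referrers with no recognizable search parameter (ordinary links from other pages, say) are simply skipped, so the percentages are relative to search referrals only.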

Where do folks arrive from? Google as expected leads the pack, with twice as many referrals as Yahoo, which in turn outpaces MSN and AOL by a similar ratio. Cross-linkages from the ^zhurnal and other http://www.his.com/~z/ pages are smaller but not insignificant, led by my various Edward Gibbon quotation-collection pages.

Google's "Googlebot" leads the list of Zhurnal watchers, with ~12% of all hits to its credit. Slightly behind it at ~11% is a crawler from the French "serveur.com" apparently associated with Art-Online.com and artmarket.com. This robot is possibly seeking email addresses for art-related mass mailings, at least according to one recent post on webmasterworld.com. That hypothesis correlates with my experience: I received some advertising email from Art-Online many months ago. But I don't believe that these folks are truly evil spammers; when I asked, they immediately dropped me from their mailing lists and I have seen nothing from them since.

Following in the standings at ~10% is an automated crawler from inktomi.com. Further down the charts are 'bots from search.msn.com (~3%) and FAST aka alltheweb.com aka fastsearch.net (~1%). And there are hits from looksmart.com and turnitin.com — that last being a plagiarism-catching service for teachers. Watch out, students, if you hope to take a ^zhurnal essay and submit it as your own work ...

Other crawlers in recent weeks include active agents of almaden.ibm.com, learninglab.uni-hannover.de, openfind.com.tw, av.com (Altavista), directhit.com, teoma.com, and numerous unidentified IP addresses. And there are doubtless downloading programs that I've overlooked, or that are disguising themselves, or that didn't happen to pass by during the date range of the logs I have glanced at. But the bottom line total of likely robotic Zhurnal activity adds up to ~40% — much higher than I anticipated, but still a minority.
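The ~40% figure can be roughly cross-checked against the individual crawler shares quoted above. This is a minimal sketch using only those rounded percentages. The dictionary keys are just labels, and the leftover attributed to unnamed crawlers is inferred by subtraction rather than measured.

```python
# Rounded shares of all traffic, in percent, as quoted in the text
NAMED_CRAWLER_SHARES = {
    "Googlebot": 12,
    "serveur.com": 11,
    "inktomi.com": 10,
    "search.msn.com": 3,
    "FAST": 1,
}

QUOTED_ROBOT_TOTAL = 40  # the "~40%" bottom line


def named_total(shares):
    """Sum of the shares for crawlers that were given explicit percentages."""
    return sum(shares.values())


named = named_total(NAMED_CRAWLER_SHARES)
print(named)                       # 37
print(QUOTED_ROBOT_TOTAL - named)  # 3, left for all the other crawlers
```

In other words, the five named crawlers account for about 37 points, leaving only about 3 for looksmart.com, turnitin.com, and everything else on the list.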

As for human readers of the ZhurnalWiki, Microsoft Internet Explorer users account for at least 20%, and perhaps 30%, of all hits. I would have expected more, and perhaps some of their browsers aren't being properly logged. Macintosh visitors are ~3%, but I suspect that half of those are my own activities. Other browsers and platforms come in at ~1%. These numbers leave perhaps a quarter of all activity uncounted; I don't know why.
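The "perhaps a quarter" remainder follows from simple subtraction, sketched here in Python. This assumes every percentage above is a share of the same total and that the categories don't overlap (a Macintosh visitor using Internet Explorer would be counted twice, for example). The constant names are illustrative only.

```python
# Rounded shares of all hits, in percent, as quoted in the text
ROBOTS = 40
MSIE_LOW, MSIE_HIGH = 20, 30
MACINTOSH = 3
OTHER = 1


def uncounted(msie_share):
    """Percentage of hits not attributed to any category above."""
    return 100 - (ROBOTS + msie_share + MACINTOSH + OTHER)


print(uncounted(MSIE_HIGH))  # 26
print(uncounted(MSIE_LOW))   # 36
```

So the unexplained share is about a quarter if Internet Explorer is at the high end of its range, and closer to a third if it is at the low end.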

Looking back at the original ^zhurnal on his.com, although it gets ~30 hits per day, its activity is dwarfed in the logs by the "Best of" Gibbon's Decline and Fall collection of quotations — http://www.his.com/~z/gibbon.html — which sees 10 times as many passers-by, including not a few in search of cheap filler for their term papers. In terms of total his.com traffic the mass of ^z pages sits in a virtual tie for first place with a collection of local area realtor image files, at ~10%. That number has crept up in recent years as more individuals graduate to their own domains and therefore drop out of the his.com user home page list.

And there's heaps o' more data in them there logs yet to be mined ... but I think the above pretty well captures the largest nuggets at this time.

(see also WebLogAnalysis (2 Jun 2001), CrudeMetrics (9 Feb 2003), GotLibrary (17 Sep 2003), ... )


TopicZhurnal - TopicProgramming - 2003-10-17



(correlates: CryptoQuip, PopularityContest2005, PlusOrMinus, ...)