Numbers are keys to knowledge. As William Thomson (aka Lord Kelvin) famously said:

When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind: it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the state of science.

As is probably obvious from many of the posts here, semi-quantitative analysis is one of my bags. On a related theme, DarrenNeimke recently kicked off a discussion of metrics for programmer productivity (see "Lies, D*mned Lies, and Statistics"). Many other examples of the power of numbers in a visual context are offered in Edward Tufte's classic books (see TufteThoughts (18 Dec 2000)).

But numbers, used incorrectly, can be worse than useless in advancing one's understanding of a topic. For instance, an op-ed piece by a famous columnist caught my eye earlier this week. The author wanted to suggest that the wealthy are being overtaxed, and as "proof" reported, "In 1979 the top 1 percent of earners paid 19.75 percent of income taxes. Today they pay 36.3 percent."

Well, duh! That's absolutely the wrong computation to make. The fraction of total tax revenues that the top 1% pay does speak volumes about income distribution --- and the rise in that percentage implies an increasingly skewed concentration of wealth. (Think: if 99% of the population had zero income, then the top 1% would pay all of the taxes, no matter how low the tax rates might be.) In fact, a quick check of the Statistical Abstract of the United States (2003 edition, Table No. 688) shows that the top 5% of all families received 14.6% of aggregate income in 1980 but increased that share to 21% by 2001. Adjusted Gross Income is even more sharply peaked; the share of AGI received by the top 1% rose from 8.5% in 1980 to 17.5% in 2001. (The popping of the dot-com bubble hurt the top 1% somewhat; their share was a bit higher during 1998-2000.)

The number that an honest author should have quoted is the fraction of the income of the top 1% that is taken by taxes. In 1980 that tax rate was 34.5%; it fell to 27.5% in 2001. So the tax burden on the wealthy has arguably fallen, moderately, during the past two decades. (source: Internal Revenue Service data, presented by the Tax Foundation, Inc.; note that tax law changes in 1986 account for part of the difference)
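For anybody who wants to see the two ratios side by side, here is a minimal sketch in Python. The function name and all of the incomes and rates in it are made up for illustration (they are not the IRS or Statistical Abstract figures quoted above); the point is only that the top group's share of all taxes paid and the top group's own tax rate are different fractions and can move in opposite directions.

```python
def top_group_stats(top_income, top_rate, rest_income, rest_rate):
    """Return (top group's share of all taxes paid, top group's effective tax rate).

    Hypothetical helper: a two-group world with one flat rate per group.
    """
    top_tax = top_income * top_rate
    rest_tax = rest_income * rest_rate
    share_of_taxes = top_tax / (top_tax + rest_tax)   # the columnist's number
    effective_rate = top_tax / top_income             # the number that measures burden
    return share_of_taxes, effective_rate


# Invented "before" year: top group earns 10 of 100 units, taxed at 35%.
before = top_group_stats(top_income=10, top_rate=0.35, rest_income=90, rest_rate=0.15)

# Invented "after" year: top group earns 20 of 100 units, taxed at only 28%.
after = top_group_stats(top_income=20, top_rate=0.28, rest_income=80, rest_rate=0.12)

# Thought experiment from the text: everybody else has zero income,
# so the top group pays all the taxes even at a 1% rate.
extreme = top_group_stats(top_income=100, top_rate=0.01, rest_income=0, rest_rate=0.15)

for label, (share, rate) in [("before", before), ("after", after), ("extreme", extreme)]:
    print(f"{label:8s} share of all taxes: {share:6.1%}   effective rate: {rate:6.1%}")
```

With these invented inputs the top group's share of all taxes climbs from roughly 21% to roughly 37% while its own tax rate drops from 35% to 28%. The rising share is driven by the rising share of income, not by a heavier burden.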

This is the same kind of statistical fallacy that often crops up in criminal DNA evidence arguments, or in analyses of the benefits of vaccination, or in countless other places. When taking ratios, you've gotta do them in the right direction, and when you're done, you've gotta think about the results ...

(see also ScienceVersusStampCollecting (20 Jun 2000), BasementWorries (15 Jun 2002), ModernPhrenology (19 Oct 2003), ... )

TopicScience - TopicWriting - 2004-02-24

Mark, I have a love/hate relationship with statistics. Let me explain...

I love statistics because they offer something concrete to grab hold of, but I also seem to be blinded by them at times. As an example, I really doubt that I would have stopped to check the statistic that you have pulled apart in this ^Zhurnal entry. It's a worry that, with so much information and raw data "out there", much of it must be composed of errors and wrong assumptions; finding and recognizing them is the difficult part, though! -- DarrenNeimke
