Benford’s Law: The Tale of Too Many Ones
By Sriram
November 2006

Take a collection of numbers, for example an arbitrarily chosen set of 100 closing stock prices from the Business section of The New York Times. How many of these numbers would have 1 as their first digit? How about 2, 3, 4, …, 9? Naively, one would expect each of the nine digits to occur with equal frequency, about 11.1% of the time. Well, you are in for a surprise.

For a large class of “real-life” collections of numbers, the distribution of first digits is logarithmic rather than uniform. More precisely, the probability that the first significant digit is d (1 ≤ d ≤ 9) is log₁₀(1 + 1/d). In plain language, the first digit will be 1 in about 30.1% of the cases and 2 in about 17.6%, and so on down to the digit 9, which leads in a mere 4.6% of the cases. This empirical observation is referred to as Benford’s law.
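The percentages quoted above follow directly from the formula. As a minimal sketch (the function name benford_probability is made up for this illustration), the full table of nine probabilities can be computed as follows:

```python
import math


def benford_probability(d: int) -> float:
    """Probability that the first significant digit equals d (1 <= d <= 9)."""
    if not 1 <= d <= 9:
        raise ValueError("d must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Table of first-digit probabilities under Benford's law.
benford = {d: benford_probability(d) for d in range(1, 10)}

for d, p in benford.items():
    print(f"{d}: {p:.1%}")
```

The nine probabilities add up to exactly 1, because the product of the terms (d + 1)/d for d = 1, …, 9 telescopes to 10, and log₁₀(10) = 1.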

Benford’s law was first expounded by Simon Newcomb, an astronomer, in 1881, after he observed that the first few pages of logarithmic tables (which astronomers used heavily for calculations) wore out faster than the last ones. The law was rediscovered by Frank Benford, a physicist, in 1938, after he analyzed 20,229 entries from 20 different sources, which tabulated quantities as disparate as the catchment areas of 335 rivers, the specific heats of 1,389 chemical compounds, numbers taken from Reader’s Digest articles and front pages of newspapers, American League baseball statistics, and the street addresses of the first 342 people listed in the book American Men of Science [3,6]. Since then, the law has been found to hold true for many other sources, including lists of commonly used physical constants, half-lives of radioactive substances, and the populations of 3,141 counties as determined in the 1990 US census [6]. In 1996, Theodore Hill gave a general mathematical explanation for the widespread occurrence of this logarithmic distribution [6].

Is Benford’s law a mere mathematical curiosity? Not quite. Numbers in “honest” tax returns and other financial documents tend to follow the logarithmic distribution, whereas “faked” numbers tend to deviate from it, so comparing the first digits of reported figures against Benford’s law can serve as a test for fraud. In 1995, the District Attorney’s office in Brooklyn stated that this test identified all seven cases of admitted financial fraud that it had handled. Incidentally, the test was also applied to former President Bill Clinton’s tax return, and it found nothing wrong except for the rounding-off of certain numbers. More applications of this law in detecting accounting fraud are described in reference 5.
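To give a feel for what such a test might look like, here is a rough sketch that compares observed first-digit counts with the Benford expectation using a standard chi-square goodness-of-fit statistic. This is an assumption made for illustration only: the article does not say which statistic the Brooklyn office or reference 5 actually used, and the names first_digit, benford_chi_square and CHI2_CRITICAL_5PCT are invented here.

```python
import math
from collections import Counter

# 5% critical value of the chi-square distribution with 8 degrees of freedom
# (nine digit classes minus one).
CHI2_CRITICAL_5PCT = 15.507


def first_digit(x) -> int:
    """Return the first significant digit of a nonzero number."""
    # Strip leading zeros and the decimal point: '0.042' -> '42', '1234.5' stays.
    s = str(abs(x)).lstrip("0.")
    if not s or not s[0].isdigit():
        raise ValueError(f"{x!r} has no first significant digit")
    return int(s[0])


def benford_chi_square(values) -> float:
    """Chi-square distance between observed first digits and Benford's law."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat


# Powers of 2 are a classic sequence that follows Benford's law closely.
powers = [2 ** k for k in range(100)]
# Numbers whose first digits are spread evenly, as a naive faker might produce.
uniform = list(range(1, 10)) * 50

for name, data in [("powers of 2", powers), ("uniform digits", uniform)]:
    stat = benford_chi_square(data)
    verdict = "suspicious" if stat > CHI2_CRITICAL_5PCT else "consistent with Benford"
    print(f"{name}: chi-square = {stat:.2f} ({verdict})")
```

A large statistic only says that the digits do not look Benford-like. Many legitimate data sets, such as prices clustered in a narrow range, also fail to follow the law, so a flag of this kind is a reason to look closer rather than proof of fraud.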

References:

1. http://en.wikipedia.org/wiki/Benford's_law

2. http://mathworld.wolfram.com/BenfordsLaw.html

3. http://www.rexswain.com/benford.html

4. http://plus.maths.org/issue9/features/benford/index-gifd.html

5. http://www.aicpa.org/pubs/jofa/may1999/nigrini.htm

6. Theodore P. Hill, “A Statistical Derivation of the Significant-Digit Law,” Stat. Sci. 10, no. 4 (1996): 354-363.