What Is Benford's Law?
Benford's Law (also called the "First-Digit Law" or "Law of Anomalous Numbers") is a mathematical principle stating that in many naturally occurring datasets, smaller digits appear as the first digit more frequently than larger digits.
Contrary to what intuition suggests, the digits 1-9 do NOT appear with equal frequency (11.1% each) as leading digits. Instead, the digit 1 appears first about 30% of the time, while 9 appears first less than 5% of the time.
The phenomenon was first discovered by astronomer Simon Newcomb in 1881, who noticed that logarithm tables were more worn on pages beginning with 1 and 2. Physicist Frank Benford rediscovered and tested it in 1938 across 20 different datasets—and the law bears his name.
The Expected Distribution
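According to Benford's Law, the probability that d is the first digit is:
P(d) = log10(1 + 1/d), for d = 1, 2, ..., 9
Notice: Each digit is less likely than the one before it. The ratio between neighboring digits rises from about 58% (digit 2 vs. digit 1) to about 89% (digit 9 vs. digit 8).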
Exact Percentages Table
| First Digit | Probability | Approximate |
|---|---|---|
| 1 | 30.103% | 30% |
| 2 | 17.609% | 18% |
| 3 | 12.494% | 12% |
| 4 | 9.691% | 10% |
| 5 | 7.918% | 8% |
| 6 | 6.695% | 7% |
| 7 | 5.799% | 6% |
| 8 | 5.115% | 5% |
| 9 | 4.576% | 5% |
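The percentages in the table come from the formula P(d) = log10(1 + 1/d). As a minimal sketch, they can be reproduced in a few lines of Python; the function name `benford_first_digit` is illustrative and not part of any library.

```python
import math

def benford_first_digit(d: int) -> float:
    """Expected probability that d (1-9) is the leading digit."""
    if not 1 <= d <= 9:
        raise ValueError("first digit must be between 1 and 9")
    return math.log10(1 + 1 / d)

# Expected first-digit distribution, keyed by digit.
expected = {d: benford_first_digit(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.3%}")
```

The nine probabilities sum to 1 because the terms telescope to log10(10).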
Why Does Benford's Law Work?
The explanation lies in logarithmic growth patterns. Consider counting from 1 to 100:
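- Numbers 1-9: each digit 1-9 leads exactly once (9 numbers)
- Numbers 10-19: all start with 1 (10 numbers)
- Numbers 20-99: spread evenly across digits 2-9 (10 numbers each)
Stop counting at 19 and digit 1 leads 11 of the 19 numbers (58%). The other digits only catch up as the count nears 99, and then 100-199 puts digit 1 ahead again.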
To go from a first digit of 1 to 2, a number must grow by 100% (from 100 to 200, or from 1,000 to 2,000). But to go from 8 to 9, it only needs to grow by 12.5% (from 800 to 900). This means numbers "spend more time" with lower leading digits.
Think of it this way: If a company's revenue grows at a steady percentage rate from $100K to $999K, it spends much more time in the $100-199K range than in the $900-999K range. Numbers "linger" at lower leading digits because it takes proportionally more growth to advance to the next digit.
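The "lingering" effect can be checked with a small simulation. The sketch below assumes a hypothetical revenue figure that starts at $100 and grows 1% per period until it reaches $1,000,000, then counts how many periods are spent on each leading digit. The growth rate, range, and helper name `first_digit` are illustrative choices, not from the text above.

```python
def first_digit(x: float) -> int:
    """Leading nonzero digit of a positive number."""
    if x <= 0:
        raise ValueError("x must be positive")
    # Scientific notation puts the leading digit first, e.g. '1.2345000000e+02'.
    return int(f"{x:.10e}"[0])

# Hypothetical scenario: steady 1% growth across four orders of magnitude.
revenue = 100.0
periods_by_digit = {d: 0 for d in range(1, 10)}
while revenue < 1_000_000:
    periods_by_digit[first_digit(revenue)] += 1
    revenue *= 1.01

total = sum(periods_by_digit.values())
share = {d: n / total for d, n in periods_by_digit.items()}
for d in range(1, 10):
    print(f"digit {d}: {share[d]:.1%} of periods")
```

Under these assumptions the share of periods spent on each digit lands close to the Benford percentages, with digit 1 near 30%.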
Using Benford's Law for Fraud Detection
Fraud examiners use Benford's Law because fabricated numbers typically don't follow natural patterns. When humans invent numbers, they unconsciously introduce biases:
- They may avoid "obvious" numbers starting with 1
- They may cluster around round numbers ($500, $1,000)
- They may stay just below approval thresholds ($4,999 to avoid $5,000 review)
- They may use middle digits (5, 6, 7) more often
How Fraud Examiners Apply It
- Extract first digits from a dataset (e.g., all vendor payments for a year)
- Calculate the actual distribution of first digits in the data
- Compare to Benford's expected distribution
- Identify significant deviations that warrant investigation
- Investigate anomalies—deviations may indicate fraud OR have legitimate explanations
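The first four steps above can be sketched in Python. This is a minimal illustration, not a production tool: the `payments` list is hypothetical, the function names are made up for this example, and the `tolerance` cutoff is an arbitrary stand-in for a proper statistical test.

```python
import math
from collections import Counter

def first_digit(amount: float) -> int:
    """Leading nonzero digit of a nonzero amount (sign is ignored)."""
    if amount == 0:
        raise ValueError("zero has no leading digit")
    # Scientific notation puts the leading digit first, e.g. '4.9990000000e+03'.
    return int(f"{abs(amount):.10e}"[0])

def benford_screen(amounts, tolerance=0.02):
    """Return (digit, observed, expected, flagged) for digits 1-9.

    `tolerance` is an arbitrary illustrative cutoff, not a statistical test.
    """
    digits = [first_digit(a) for a in amounts if a != 0]   # step 1: extract
    if not digits:
        raise ValueError("no nonzero amounts to analyze")
    counts = Counter(digits)
    report = []
    for d in range(1, 10):
        observed = counts[d] / len(digits)                 # step 2: actual
        expected = math.log10(1 + 1 / d)                   # step 3: Benford
        flagged = abs(observed - expected) > tolerance     # step 4: deviation
        report.append((d, observed, expected, flagged))
    return report

# Hypothetical vendor payments; a real analysis needs far more records.
payments = [120.50, 4800, 4950, 13.75, 4999, 1890, 230, 4700, 17.20, 4990]
report = benford_screen(payments)
for d, observed, expected, flagged in report:
    marker = "  <-- review" if flagged else ""
    print(f"{d}: observed {observed:.1%}, expected {expected:.1%}{marker}")
```

Step 5 stays manual: a flagged digit is only a pointer to the transactions worth pulling for review.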
A company analyzes 10,000 expense reimbursements. Expected: ~30% should start with digit 1. Actual finding: Only 15% start with 1, while 35% start with 4 or 9.
Red flag: Employees may be submitting expenses just below the $500 threshold (starting with 4) or inflating small expenses into the $900 range. This warrants investigation—but isn't proof of fraud by itself.
The First-Two Digits Test
For more granular analysis, fraud examiners often use the first-two digits test, which examines the first two digits together (10, 11, 12... through 99). This is especially effective for detecting:
- Threshold fraud: Spikes at values like 49 or 99 (just below $500 or $1,000 limits)
- Round number bias: Unusual clustering at first-two-digit values such as 10 or 50 (amounts like $50, $100, $500)
- Duplicate amounts: Repeated specific values
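The first-two digits test uses the same logarithmic formula, applied to the 90 values 10 through 99: P(n) = log10(1 + 1/n). The sketch below is one possible way to compute it in Python. The `expenses` list is hypothetical and the function names are illustrative.

```python
import math
from collections import Counter

def first_two_digits(amount: float) -> int:
    """First two significant digits (10-99) of a nonzero amount."""
    if amount == 0:
        raise ValueError("zero has no leading digits")
    sci = f"{abs(amount):.10e}"      # e.g. '4.9990000000e+03'
    return int(sci[0] + sci[2])      # skip the decimal point at index 1

# Expected proportion for each first-two-digit value.
expected_two = {n: math.log10(1 + 1 / n) for n in range(10, 100)}

def first_two_profile(amounts):
    """Map each value 10-99 to (observed, expected) proportions."""
    pairs = [first_two_digits(a) for a in amounts if a != 0]
    if not pairs:
        raise ValueError("no nonzero amounts to analyze")
    counts = Counter(pairs)
    return {n: (counts[n] / len(pairs), expected_two[n]) for n in range(10, 100)}

# Hypothetical expenses clustered just under a $500 approval limit.
expenses = [499.00, 495.50, 1200, 132.10, 498.75, 49.90, 2150, 499.99, 87.40, 310]
profile = first_two_profile(expenses)
observed_49, expected_49 = profile[49]
print(f"49: observed {observed_49:.1%}, expected {expected_49:.2%}")
```

Because the expected share for 49 is under 1%, even a modest cluster just below a threshold stands out sharply in this test.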
Suitable vs. Unsuitable Data
Benford's Law doesn't apply to all datasets. Understanding when it works—and when it doesn't—is crucial for the CFE exam.
Suitable Data
- Accounts payable amounts
- Sales transactions
- Population data
- Tax returns
- Stock prices
- Utility bills
- Invoice amounts
- Insurance claims
- River lengths, lake areas
- Financial statements
Unsuitable Data
- Assigned numbers (SSNs, phone numbers)
- Numbers with fixed ranges (percentages 0-100)
- Numbers influenced by psychology ($9.99 pricing)
- Numbers with minimum/maximum constraints
- Randomly generated numbers
- ATM withdrawals (fixed amounts)
- ZIP codes, addresses
- Small datasets (<500 records)
- Data with narrow range (all between $50-$100)
Benford's Law requires data that spans multiple orders of magnitude (e.g., values from $10 to $10,000). If all values are in a narrow range (like $50-$100), the law won't apply. The CFE exam often tests whether candidates can identify appropriate vs. inappropriate datasets for Benford analysis.
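One quick way to check the range requirement is to measure how many orders of magnitude a dataset spans. The helper below is a rough sketch: the name `orders_of_magnitude` is illustrative, and what counts as "enough" span is a judgment call that this code does not settle.

```python
import math

def orders_of_magnitude(amounts) -> float:
    """log10 of the ratio between the largest and smallest nonzero magnitudes."""
    magnitudes = [abs(a) for a in amounts if a != 0]
    if not magnitudes:
        raise ValueError("need at least one nonzero amount")
    return math.log10(max(magnitudes) / min(magnitudes))

# Values from $10 to $10,000 span three orders of magnitude.
print(orders_of_magnitude([10, 250, 4_800, 10_000]))

# Values between $50 and $100 span well under one order of magnitude.
print(orders_of_magnitude([50, 75, 100]))
```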
Statistical Tests for Conformity
To determine if deviations from Benford's Law are statistically significant, fraud examiners use these tests:
1. Z-Statistic (Individual Digit Test)
Tests whether a single digit's observed frequency deviates significantly from its expected frequency. If |Z| > 1.96, the deviation is significant at the 5% level (95% confidence).
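As a worked illustration, the sketch below computes a Z-statistic for the earlier expense example (10,000 reimbursements, only 15% starting with 1). It assumes the form of the statistic commonly attributed to Nigrini, which includes a continuity correction of 1/(2n) applied only when it is smaller than the absolute difference. The function name `benford_z` is illustrative.

```python
import math

def benford_z(observed_prop: float, digit: int, n: int) -> float:
    """Z-statistic for one first digit, with a continuity correction.

    Assumed form: (|observed - expected| - 1/(2n)) / sqrt(expected*(1-expected)/n),
    where the correction is dropped if it exceeds the absolute difference.
    """
    expected = math.log10(1 + 1 / digit)
    diff = abs(observed_prop - expected)
    correction = 1 / (2 * n)
    if correction < diff:
        diff -= correction
    return diff / math.sqrt(expected * (1 - expected) / n)

# Example from the text: 10,000 records, 15% start with digit 1.
z = benford_z(0.15, digit=1, n=10_000)
print(f"z = {z:.1f}, significant at 5% level: {z > 1.96}")
```

With a sample this large, a 15-point shortfall produces a Z-statistic far above 1.96, so the deviation is not plausibly due to chance.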
2. Chi-Square Test (χ²)
Tests whether the entire distribution conforms to Benford's Law by comparing all nine digits simultaneously. The first-digit test has 8 degrees of freedom, so a chi-square value above 15.51 indicates significant deviation at the 5% level.
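The chi-square statistic sums (observed - expected)² / expected over the nine digits, using counts rather than proportions. The sketch below uses two hypothetical sets of 1,000 records and compares the result to 15.507, the 5% critical value for 8 degrees of freedom. The names are illustrative.

```python
import math

# Chi-square critical value for 8 degrees of freedom at the 5% level.
CHI2_CRITICAL_DF8_05 = 15.507

def benford_chi_square(counts: dict) -> float:
    """Chi-square statistic for first-digit counts (digit -> observed count)."""
    n = sum(counts.get(d, 0) for d in range(1, 10))
    if n == 0:
        raise ValueError("no observations")
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)   # expected count for digit d
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat

# Hypothetical counts for 1,000 records each.
conforming = {1: 301, 2: 176, 3: 125, 4: 97, 5: 79, 6: 67, 7: 58, 8: 51, 9: 46}
skewed = {1: 150, 2: 150, 3: 120, 4: 200, 5: 80, 6: 70, 7: 60, 8: 50, 9: 120}

for name, counts in [("conforming", conforming), ("skewed", skewed)]:
    stat = benford_chi_square(counts)
    verdict = "deviates" if stat > CHI2_CRITICAL_DF8_05 else "conforms"
    print(f"{name}: chi-square = {stat:.2f} ({verdict} at the 5% level)")
```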
3. Mean Absolute Deviation (MAD)
Measures the average difference between observed and expected frequencies. Mark Nigrini (leading Benford's Law researcher) suggests these MAD thresholds for first-digit tests:
- 0.000 to 0.006: Close conformity
- 0.006 to 0.012: Acceptable conformity
- 0.012 to 0.015: Marginally acceptable
- >0.015: Non-conformity (investigate further)
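MAD is simple to compute: average the absolute gaps between observed and expected proportions across the nine digits. The sketch below applies the thresholds listed above to two hypothetical sets of counts. How the exact boundary values (0.006, 0.012, 0.015) are assigned is an assumption here, since the ranges as listed share their endpoints.

```python
import math

def benford_mad(counts: dict) -> float:
    """Mean absolute deviation between observed and expected first-digit proportions."""
    n = sum(counts.get(d, 0) for d in range(1, 10))
    if n == 0:
        raise ValueError("no observations")
    total = sum(abs(counts.get(d, 0) / n - math.log10(1 + 1 / d)) for d in range(1, 10))
    return total / 9

def classify_mad(mad: float) -> str:
    """Apply the first-digit thresholds; boundary handling is an assumption."""
    if mad <= 0.006:
        return "Close conformity"
    if mad <= 0.012:
        return "Acceptable conformity"
    if mad <= 0.015:
        return "Marginally acceptable"
    return "Non-conformity"

# Hypothetical counts for 1,000 records each.
conforming = {1: 301, 2: 176, 3: 125, 4: 97, 5: 79, 6: 67, 7: 58, 8: 51, 9: 46}
skewed = {1: 150, 2: 150, 3: 120, 4: 200, 5: 80, 6: 70, 7: 60, 8: 50, 9: 120}

for name, counts in [("conforming", conforming), ("skewed", skewed)]:
    mad = benford_mad(counts)
    print(f"{name}: MAD = {mad:.4f} -> {classify_mad(mad)}")
```

Unlike the Z and chi-square statistics, MAD does not grow with the number of records, which is one reason it is often preferred for very large datasets.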
Real-World Cases
Benford's Law analysis of Enron's financial statements revealed significant deviations from expected first-digit distributions. The analysis showed anomalies in reported revenue and asset figures that later proved to be fabricated through mark-to-market accounting fraud. While Benford's Law wasn't the primary detection method, it provided corroborating evidence of data manipulation.
Researchers applied Benford's Law to macroeconomic data Greece reported to the European Union before entering the Eurozone. The analysis revealed significant deviations suggesting the data was manipulated to meet EU requirements for deficit and debt levels. This case demonstrated Benford's Law's applicability to government fraud detection.
Political scientist Walter Mebane applied the second-digit Benford test to vote counts in Iran's 2009 presidential election. The analysis found statistical anomalies in the reported results for the winning candidate. However, this case also illustrates the limitations—Benford's Law analysis alone cannot prove fraud, only suggest areas needing investigation.
Limitations & Caveats
Benford's Law is a powerful tool, but it has important limitations that the CFE exam tests:
- Deviation ≠ Fraud: A dataset that doesn't follow Benford's Law isn't necessarily fraudulent. There may be legitimate business reasons (approval thresholds, standardized pricing, etc.).
- Conformity ≠ No Fraud: A dataset that perfectly follows Benford's Law may still contain fraud. Sophisticated fraudsters can manipulate numbers to conform.
- Sample Size Matters: The test requires large datasets (typically 500+ records) to be reliable. Small samples may deviate from Benford's Law by chance.
- Not All Data Applies: The law only works for naturally occurring data spanning multiple orders of magnitude. Assigned numbers, narrow-range data, and psychologically-influenced numbers don't apply.
- It's a Screening Tool: Benford's Law identifies areas for further investigation—it doesn't prove fraud. Always follow up with traditional investigation techniques.
The exam often tests whether candidates understand that Benford's Law is a screening tool, not a definitive fraud detector. A correct answer will acknowledge that deviations warrant investigation but don't prove fraud.
CFE Exam Tips
Benford's Law appears in the Investigation section of the CFE exam, specifically under Data Analysis. Here's what to know:
- The approximate distribution: 30-18-12-10-8-7-6-5-5
- Digit 1 appears first ~30% of the time; digit 9 only ~5%
- Data must span multiple orders of magnitude
- Minimum sample size: 500-1,000+ records
- It's a screening tool, not proof of fraud
- The first-two digits test is best for detecting threshold fraud
- Assigned numbers (SSNs, phone numbers) don't follow the law
Practice Questions
According to Benford's Law, the digit 1 appears as the first digit approximately 30.1% of the time in naturally occurring datasets. This is the most frequently tested statistic about Benford's Law on the CFE exam.
Vendor payments are naturally occurring numbers that span multiple orders of magnitude (from small supplies to large contracts). SSNs are assigned numbers, ATM withdrawals have fixed amounts, and percentages have constrained ranges—none of which follow Benford's Law.
Expected frequency for digit 4 is ~9.7%, so 35% is a significant deviation. This could indicate employees are keeping expenses just below a $5,000 threshold. However, this warrants investigation—it doesn't prove fraud. There may be legitimate reasons, or the anomaly might lead to discovering actual fraud.
Benford's Law analysis typically requires 500-1,000+ records for reliable results. With only 200 transactions, conformity (or deviation) may occur by chance. Additionally, conformity doesn't prove the absence of fraud—sophisticated fraudsters can manipulate data to appear normal.
Conclusion
Benford's Law is a powerful digital analysis tool in the fraud examiner's toolkit. By understanding that naturally occurring numbers follow a predictable first-digit distribution, fraud examiners can identify anomalies that warrant investigation.
For the CFE exam, remember these key points:
- The digit 1 appears first ~30% of the time; digit 9 appears only ~5%
- The law applies to naturally occurring data spanning multiple orders of magnitude
- It's a screening tool—deviations indicate further investigation is needed, not proof of fraud
- Assigned numbers, constrained ranges, and small datasets don't follow the law