Benford's Law for the CFE Exam: Complete Guide to Digital Analysis

📋 Table of Contents

What Is Benford's Law?
The Expected Distribution
Why It Works
Using It for Fraud Detection
Suitable vs Unsuitable Data
Statistical Tests
Real-World Cases
Limitations & Caveats
CFE Exam Tips
Practice Questions

What Is Benford's Law?

Benford's Law (also called the "First-Digit Law" or "Law of Anomalous Numbers") is a mathematical principle stating that in many naturally occurring datasets, smaller digits appear as the first digit more frequently than larger digits.

Contrary to what intuition suggests, the digits 1-9 do NOT appear with equal frequency (11.1% each) as leading digits. Instead, the digit 1 appears first about 30% of the time, while 9 appears first less than 5% of the time.

📜 Brief History

The phenomenon was first discovered by astronomer Simon Newcomb in 1881, who noticed that logarithm tables were more worn on pages beginning with 1 and 2. Physicist Frank Benford rediscovered and tested it in 1938 across 20 different datasets—and the law bears his name.

The Expected Distribution

According to Benford's Law, the probability of a digit d being the first digit follows this distribution:

Benford's Law First-Digit Distribution (chart, digits 1 through 9 in order): 30.1%, 17.6%, 12.5%, 9.7%, 7.9%, 6.7%, 5.8%, 5.1%, 4.6%

🧠 CFE Exam Memory Trick

30-18-12-10-8-7-6-5-5

Round the percentages: 30%, 18%, 12%, 10%, 8%, 7%, 6%, 5%, 5%
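Notice: Each digit is less common than the one before it, and the drop is steepest at the start. Digit 2 appears about 58% as often as digit 1, while digit 9 appears about 89% as often as digit 8.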

Exact Percentages Table

First Digit	Probability	Approximate
1	30.103%	30%
2	17.609%	18%
3	12.494%	12%
4	9.691%	10%
5	7.918%	8%
6	6.695%	7%
7	5.799%	6%
8	5.115%	5%
9	4.576%	5%

The Mathematical Formula

P(d) = log₁₀(1 + 1/d)

Where d = the first digit (1-9)
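
As a quick check, the formula can be evaluated for each digit to reproduce the percentages in the table. This is a minimal sketch; the function name is chosen for illustration only.

```python
import math

def benford_first_digit_probabilities():
    """Return {digit: probability} for leading digits 1-9 using P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

probs = benford_first_digit_probabilities()
for digit, p in probs.items():
    print(f"{digit}: {p:.3%}")

# The nine probabilities sum to 1 because the product of (d + 1) / d
# for d = 1..9 telescopes to 10, and log10(10) = 1.
print(f"total: {sum(probs.values()):.6f}")
```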

Why Does Benford's Law Work?

The explanation lies in logarithmic growth patterns. Consider counting from 1 to 100:

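Numbers 1-9 contribute one number to each leading digit 1-9 (9 numbers)
Numbers 10-19 all start with 1 (10 numbers), so if the count stops at 19, 11 of the 19 numbers start with 1
Numbers 20-99 are spread across digits 2-9, which only catch up once the count reaches 99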

To go from a first digit of 1 to 2, a number must grow by 100% (from 100 to 200, or from 1,000 to 2,000). But to go from 8 to 9, it only needs to grow by 12.5% (from 800 to 900). This means numbers "spend more time" with lower leading digits.

💡 Simple Explanation

Think of it this way: If a company's revenue grows at a steady percentage rate from $100K to $999K, it spends much more time in the $100-199K range than in the $900-999K range. Numbers naturally "linger" at lower leading digits because moving from 1 to 2 takes far more percentage growth than moving from 8 to 9.
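
The "lingering" effect can be seen in a small simulation. The sketch below compounds a balance at a fixed rate and counts how often each leading digit occurs. The starting value, the 1% monthly rate, and the number of months are illustrative assumptions, not figures from any real company.

```python
from collections import Counter

def leading_digit(value):
    """Return the first significant digit of a positive number."""
    # Scientific notation puts the first significant digit first, e.g. '1.234560e+05'.
    # Values within rounding distance of a power of ten may round up; that is
    # negligible for this illustration.
    return int(f"{value:e}"[0])

def simulate_growth(start=100_000.0, monthly_rate=0.01, months=5_000):
    """Count leading digits of a balance that compounds at a fixed monthly rate."""
    counts = Counter()
    value = start
    for _ in range(months):
        counts[leading_digit(value)] += 1
        value *= 1 + monthly_rate
    return counts

counts = simulate_growth()
total = sum(counts.values())
for digit in range(1, 10):
    print(f"{digit}: {counts[digit] / total:.1%}")
```

Under these assumptions the balance passes through many orders of magnitude, and the share of months spent at each leading digit lands close to the Benford percentages.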

Using Benford's Law for Fraud Detection

Fraud examiners use Benford's Law because fabricated numbers typically don't follow natural patterns. When humans invent numbers, they unconsciously introduce biases:

They may avoid "obvious" numbers starting with 1
They may cluster around round numbers ($500, $1,000)
They may stay just below approval thresholds ($4,999 to avoid $5,000 review)
They may use middle digits (5, 6, 7) more often

How Fraud Examiners Apply It

Extract first digits from a dataset (e.g., all vendor payments for a year)
Calculate the actual distribution of first digits in the data
Compare to Benford's expected distribution
Identify significant deviations that warrant investigation
Investigate anomalies—deviations may indicate fraud OR have legitimate explanations
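
The first three steps above can be sketched in a few lines of code. The payment amounts are hypothetical and far too few for a real test; the function names are chosen for illustration.

```python
import math
from collections import Counter

def first_digit(amount):
    """Return the first nonzero digit of an amount, or None if there is none."""
    for ch in str(abs(amount)):
        if ch in "123456789":
            return int(ch)
    return None

def first_digit_report(amounts):
    """Return rows of (digit, observed share, expected Benford share, difference)."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    if not digits:
        raise ValueError("no nonzero amounts to analyze")
    counts = Counter(digits)
    n = len(digits)
    rows = []
    for d in range(1, 10):
        observed = counts[d] / n
        expected = math.log10(1 + 1 / d)
        rows.append((d, observed, expected, observed - expected))
    return rows

# Hypothetical payment amounts, for illustration only.
payments = [1250.00, 118.40, 4999.00, 4875.50, 23.10,
            1780.00, 4990.00, 310.25, 96.00, 1432.10]
for d, obs, exp, diff in first_digit_report(payments):
    print(f"{d}: observed {obs:.1%}, expected {exp:.1%}, difference {diff:+.1%}")
```

Deciding which differences are large enough to matter is a separate step that calls for a statistical test and a much larger dataset.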

🔍 Example: Detecting Expense Fraud

How a CFE might apply Benford's Law

A company analyzes 10,000 expense reimbursements. Expected: ~30% should start with digit 1. Actual finding: Only 15% start with 1, while 35% start with 4 or 9.

Red flag: Employees may be submitting expenses just below the $500 threshold (starting with 4) or inflating small expenses into the $900 range. This warrants investigation—but isn't proof of fraud by itself.

The First-Two Digits Test

For more granular analysis, fraud examiners often use the first-two digits test, which examines the first two digits together (10, 11, 12... through 99). This is especially effective for detecting:

Threshold fraud: Spikes at values like 49 or 99 (just below $500 or $1,000 limits)
Round number bias: Unusual clustering at first-two digits such as 10 or 50 (from amounts like $50, $100, or $500)
Duplicate amounts: Repeated specific values
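
A first-two digits test can be sketched as follows. The expected share for a two-digit combination n (10 through 99) is log₁₀(1 + 1/n), the same formula as the first-digit law applied to the two-digit number. The amounts are hypothetical and the function names are illustrative.

```python
import math
from collections import Counter
from decimal import Decimal

def first_two_digits(amount):
    """Return the first two significant digits (10-99), or None for zero."""
    # Decimal drops leading zeros and truncates rather than rounds,
    # so 4999.00 gives 49 and 0.05 gives 50.
    digits = Decimal(str(abs(amount))).as_tuple().digits
    if not any(digits):
        return None
    padded = digits + (0,)  # a single-digit value such as 5 is treated as 50
    return padded[0] * 10 + padded[1]

def first_two_digit_excess(amounts):
    """Map each value 10-99 to its observed share minus Benford's expected share."""
    pairs = [p for p in map(first_two_digits, amounts) if p is not None]
    if not pairs:
        raise ValueError("no nonzero amounts to analyze")
    counts = Counter(pairs)
    n = len(pairs)
    return {k: counts[k] / n - math.log10(1 + 1 / k) for k in range(10, 100)}

# Hypothetical amounts clustered just below $50, $500, and $5,000.
amounts = [4999.00, 4950.00, 499.99, 1200.00, 49.50, 730.10, 4990.00, 2650.00]
excess = first_two_digit_excess(amounts)
largest = max(excess, key=excess.get)
print(f"largest excess at {largest}: {excess[largest]:+.1%}")
```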

Suitable vs. Unsuitable Data

Benford's Law doesn't apply to all datasets. Understanding when it works—and when it doesn't—is crucial for the CFE exam.

✅ Data That Follows Benford's Law

Accounts payable amounts
Sales transactions
Population data
Tax returns
Stock prices
Utility bills
Invoice amounts
Insurance claims
River lengths, lake areas
Financial statements

❌ Data That Doesn't Follow It

Assigned numbers (SSNs, phone numbers)
Numbers with fixed ranges (percentages 0-100)
Numbers influenced by psychology ($9.99 pricing)
Numbers with minimum/maximum constraints
Randomly generated numbers
ATM withdrawals (fixed amounts)
ZIP codes, addresses
Small datasets (<500 records)
Data with narrow range (all between $50-$100)

⚠️ CFE Exam Alert

Benford's Law requires data that spans multiple orders of magnitude (e.g., values from $10 to $10,000). If all values are in a narrow range (like $50-$100), the law won't apply. The CFE exam often tests whether candidates can identify appropriate vs. inappropriate datasets for Benford analysis.
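
One rough way to check this before running a Benford analysis is to measure how many orders of magnitude the data covers. The sketch below is an assumption about how such a pre-check might look; the article gives no numeric cutoff, so the function only reports the span and leaves the judgment to the examiner.

```python
import math

def orders_of_magnitude(values):
    """Return log10(largest / smallest) over the nonzero absolute values."""
    nonzero = [abs(v) for v in values if v != 0]
    if not nonzero:
        raise ValueError("no nonzero values to analyze")
    return math.log10(max(nonzero) / min(nonzero))

# Values from $10 to $10,000 span about three orders of magnitude.
print(f"{orders_of_magnitude([10, 250, 10_000]):.2f}")

# Values from $50 to $100 span well under one order of magnitude.
print(f"{orders_of_magnitude([50, 75, 100]):.2f}")
```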

Statistical Tests for Conformity

To determine if deviations from Benford's Law are statistically significant, fraud examiners use these tests:

1. Z-Statistic (Individual Digit Test)

Tests whether a single digit significantly deviates from expected. If the absolute value of Z exceeds 1.96, the deviation is significant at the 5% level (95% confidence).
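
The article does not state a formula for the Z-statistic. The sketch below assumes the standard one-proportion z-test with a continuity correction of 1/(2n), applied only when it is smaller than the observed gap, which is a common form in digital analysis. The counts in the usage line are illustrative.

```python
import math

def benford_z_statistic(observed_count, n, digit):
    """Z-statistic for one first digit: observed share versus Benford's expected share."""
    expected = math.log10(1 + 1 / digit)
    observed = observed_count / n
    diff = abs(observed - expected)
    correction = 1 / (2 * n)  # continuity correction (assumed form)
    if correction < diff:     # apply it only when it does not exceed the gap
        diff -= correction
    std_error = math.sqrt(expected * (1 - expected) / n)
    return diff / std_error

# Illustrative case: 1,500 of 10,000 amounts (15%) start with the digit 1.
z = benford_z_statistic(observed_count=1_500, n=10_000, digit=1)
print(f"z = {z:.1f}, significant at the 5% level: {z > 1.96}")
```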

2. Chi-Square Test (χ²)

Tests whether the entire distribution conforms to Benford's Law. Compares all nine digits simultaneously. With nine digits there are 8 degrees of freedom, so a chi-square value above 15.507 indicates a significant deviation at the 5% level.
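
A minimal sketch of the chi-square calculation for first digits follows. It sums (observed count − expected count)² / expected count over the nine digits and compares the result with 15.507, the 5% critical value for 8 degrees of freedom. The digit counts are hypothetical.

```python
import math

CHI2_CRITICAL_8DF_5PCT = 15.507  # critical value for 8 degrees of freedom at the 5% level

def benford_chi_square(digit_counts):
    """Chi-square statistic for first-digit counts given as {digit: count}."""
    n = sum(digit_counts.get(d, 0) for d in range(1, 10))
    if n == 0:
        raise ValueError("no observations to analyze")
    chi2 = 0.0
    for d in range(1, 10):
        expected_count = n * math.log10(1 + 1 / d)
        chi2 += (digit_counts.get(d, 0) - expected_count) ** 2 / expected_count
    return chi2

# Hypothetical first-digit counts for 1,000 transactions.
counts = {1: 150, 2: 170, 3: 120, 4: 200, 5: 80, 6: 70, 7: 60, 8: 50, 9: 100}
chi2 = benford_chi_square(counts)
print(f"chi-square = {chi2:.1f}, exceeds critical value: {chi2 > CHI2_CRITICAL_8DF_5PCT}")
```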

3. Mean Absolute Deviation (MAD)

Measures the average absolute difference between observed and expected digit proportions. Mark Nigrini, a leading Benford's Law researcher, suggests these MAD thresholds for first-digit tests:

0.000 to 0.006: Close conformity
0.006 to 0.012: Acceptable conformity
0.012 to 0.015: Marginally acceptable
>0.015: Non-conformity (investigate further)
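
The MAD calculation and the thresholds above can be sketched as follows. Proportions are expressed as decimals (0.301, not 30.1). The listed ranges share their boundary values, so this sketch assumes each upper bound belongs to the lower band; that choice is an assumption, not part of the published thresholds.

```python
import math

def benford_mad(digit_counts):
    """Mean absolute deviation between observed and expected first-digit proportions."""
    n = sum(digit_counts.get(d, 0) for d in range(1, 10))
    if n == 0:
        raise ValueError("no observations to analyze")
    total = 0.0
    for d in range(1, 10):
        total += abs(digit_counts.get(d, 0) / n - math.log10(1 + 1 / d))
    return total / 9  # average over the nine first digits

def conformity_label(mad):
    """Map a first-digit MAD value to the conformity bands listed above."""
    if mad <= 0.006:
        return "Close conformity"
    if mad <= 0.012:
        return "Acceptable conformity"
    if mad <= 0.015:
        return "Marginally acceptable"
    return "Non-conformity (investigate further)"

# Hypothetical first-digit counts for 1,000 transactions.
counts = {1: 150, 2: 170, 3: 120, 4: 200, 5: 80, 6: 70, 7: 60, 8: 50, 9: 100}
mad = benford_mad(counts)
print(f"MAD = {mad:.4f}: {conformity_label(mad)}")
```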

Real-World Cases

🏢 Enron Scandal (2001)

About $74 billion in shareholder losses

Benford's Law analysis of Enron's financial statements revealed significant deviations from expected first-digit distributions. The analysis showed anomalies in reported revenue and asset figures that later proved to be fabricated through mark-to-market accounting fraud. While Benford's Law wasn't the primary detection method, it provided corroborating evidence of data manipulation.

🏛️ Greek Economic Data (2011)

EU debt crisis investigation

Researchers applied Benford's Law to macroeconomic data Greece reported to the European Union before entering the Eurozone. The analysis revealed significant deviations suggesting the data was manipulated to meet EU requirements for deficit and debt levels. This case demonstrated Benford's Law's applicability to government fraud detection.

🗳️ Iran Election (2009)

Electoral fraud analysis

Political scientist Walter Mebane applied the second-digit Benford test to vote counts in Iran's 2009 presidential election. The analysis found statistical anomalies in the reported results for the winning candidate. However, this case also illustrates the limitations—Benford's Law analysis alone cannot prove fraud, only suggest areas needing investigation.

Limitations & Caveats

Benford's Law is a powerful tool, but it has important limitations that the CFE exam tests:

Deviation ≠ Fraud: A dataset that doesn't follow Benford's Law isn't necessarily fraudulent. There may be legitimate business reasons (approval thresholds, standardized pricing, etc.).
Conformity ≠ No Fraud: A dataset that perfectly follows Benford's Law may still contain fraud. Sophisticated fraudsters can manipulate numbers to conform.
Sample Size Matters: The test requires large datasets (typically 500+ records) to be reliable. Small samples may deviate from Benford's Law by chance.
Not All Data Applies: The law only works for naturally occurring data spanning multiple orders of magnitude. Assigned numbers, narrow-range data, and psychologically-influenced numbers don't apply.
It's a Screening Tool: Benford's Law identifies areas for further investigation—it doesn't prove fraud. Always follow up with traditional investigation techniques.

🎯 CFE Exam Key Point

The exam often tests whether candidates understand that Benford's Law is a screening tool, not a definitive fraud detector. A correct answer will acknowledge that deviations warrant investigation but don't prove fraud.

CFE Exam Tips

Benford's Law appears in the Investigation section of the CFE exam, specifically under Data Analysis. Here's what to know:

✅ Know These Cold for the Exam

The approximate distribution: 30-18-12-10-8-7-6-5-5
Digit 1 appears first ~30% of the time; digit 9 only ~5%
Data must span multiple orders of magnitude
Minimum sample size: 500-1,000+ records
It's a screening tool, not proof of fraud
The first-two digits test is best for detecting threshold fraud
Assigned numbers (SSNs, phone numbers) don't follow the law

Practice Questions

CFE Exam Practice

According to Benford's Law, approximately what percentage of numbers in a naturally occurring dataset will have 1 as the first digit?

A) 11.1%

B) 20.0%

C) 30.1%

D) 45.0%

✓ Correct Answer: C) 30.1%

According to Benford's Law, the digit 1 appears as the first digit approximately 30.1% of the time in naturally occurring datasets. This is the most frequently tested statistic about Benford's Law on the CFE exam.

CFE Exam Practice

Which of the following datasets would be MOST appropriate for a Benford's Law analysis?

A) Employee Social Security numbers

B) A company's vendor payment amounts

C) ATM withdrawal amounts

D) Percentages on a performance review

✓ Correct Answer: B) A company's vendor payment amounts

Vendor payments are naturally occurring numbers that span multiple orders of magnitude (from small supplies to large contracts). SSNs are assigned numbers, ATM withdrawals have fixed amounts, and percentages have constrained ranges—none of which follow Benford's Law.

CFE Exam Practice

A fraud examiner runs a Benford's Law test on expense reports and finds that 35% of entries begin with the digit 4. What should the examiner conclude?

A) Fraud has definitely occurred

B) The deviation warrants further investigation

C) Employees may be keeping expenses just below an approval threshold, such as $500 or $5,000

D) Both B and C

✓ Correct Answer: D) Both B and C

Expected frequency for digit 4 is ~9.7%, so 35% is a significant deviation. This could indicate employees are keeping expenses just below a $5,000 threshold. However, this warrants investigation—it doesn't prove fraud. There may be legitimate reasons, or the anomaly might lead to discovering actual fraud.

CFE Exam Practice

A dataset of 200 transactions perfectly conforms to Benford's Law. Which statement is TRUE?

A) The dataset is definitely free of fraud

B) The sample size may be too small for reliable conclusions

C) No further testing is needed

D) The company has strong internal controls

✓ Correct Answer: B) The sample size may be too small for reliable conclusions

Benford's Law analysis typically requires 500-1,000+ records for reliable results. With only 200 transactions, conformity (or deviation) may occur by chance. Additionally, conformity doesn't prove the absence of fraud—sophisticated fraudsters can manipulate data to appear normal.

Conclusion

Benford's Law is a powerful digital analysis tool in the fraud examiner's toolkit. By understanding that naturally occurring numbers follow a predictable first-digit distribution, fraud examiners can identify anomalies that warrant investigation.

For the CFE exam, remember these key points:

The digit 1 appears first ~30% of the time; digit 9 appears only ~5%
The law applies to naturally occurring data spanning multiple orders of magnitude
It's a screening tool—deviations indicate further investigation is needed, not proof of fraud
Assigned numbers, constrained ranges, and small datasets don't follow the law
