
CFPB Consumer Financial Complaints 2011–Present

Consumer Financial Protection Bureau — Complaint Database

The CFPB Consumer Complaint Database is the largest public repository of consumer financial complaints in the United States. This dataset contains every complaint submitted to the CFPB since 2011, covering credit reporting, debt collection, mortgages, credit cards, student loans, and banking. Records are cleaned, structured, and enriched with normalised product categories. More than 3.75M records include free-text consumer narratives, which makes the dataset well suited to NLP, compliance monitoring, and fintech risk research.

Sample: Public Domain · Paid: Commercial License · CSV + Parquet · Daily Updates · 2011–Present · 3.75M+ Narratives

Why not just pull it from the CFPB API?

You can: it's public domain. Here's what we save you:

  • Stable point-in-time Parquet snapshot: the CFPB API is live and paginated, so pulling 14M+ records takes hours of API calls and pagination handling
  • All 3.75M+ consumer narratives preserved with their structured metadata intact; the API returns these inline but bulk export doesn't guarantee completeness
  • Normalised product, issue, and sub-issue categories across 14+ years of CFPB taxonomy changes
  • Standardised date fields and state codes; the API returns mixed formats
  • Parquet output: CFPB only provides JSON via API or CSV bulk download; we provide columnar Parquet for fast analytical queries

⏱ Skip ~1–2 hours of API pagination and format conversion. 14M+ records at the CFPB's default page size means thousands of requests.

What you'd need to do yourself:
  • Paginate through the CFPB API endpoint; 14M+ records means thousands of paginated requests
  • Handle API rate limits and implement retry logic for failed pages
  • Convert the JSON response stream to a tabular format without losing nested fields
  • Or start from the CFPB's bulk CSV download, which is a live snapshot with no versioning: useful for testing, not for reproducible analysis
  • Convert to Parquet for efficient column-level queries
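
The pagination and retry steps above could be sketched roughly as follows. This is a minimal illustration, not a client for the real API: `fetch_page` is a hypothetical callable standing in for whatever HTTP wrapper you write around the CFPB endpoint, and its signature, the page size, and the backoff settings are all assumptions.

```python
import time
from typing import Callable, Iterator, List


def fetch_all(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 1000,
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield every record from a paginated source, retrying failed pages.

    fetch_page(offset, size) is a hypothetical callable that returns a list
    of records and raises an exception on a failed or rate-limited request.
    """
    offset = 0
    while True:
        for attempt in range(max_retries):
            try:
                page = fetch_page(offset, page_size)
                break
            except Exception:
                if attempt == max_retries - 1:
                    raise  # give up after the final attempt
                sleep(base_delay * 2 ** attempt)  # exponential backoff
        if not page:
            return  # an empty page marks the end of the data
        yield from page
        offset += len(page)
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. A production version would likely catch only the specific exceptions your HTTP client raises.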

14M+ Total Complaints
3.75M+ With Narratives
7,800+ Companies
14+ Years of Data

Use Cases

Complaint Trend Analysis

Track complaint volumes by product, company, state, and channel over time. Identify surges tied to regulatory changes, market events, or product launches across 14+ years of data.

NLP & Sentiment Analysis

3.75M+ free-text consumer narratives for training complaint classification, sentiment analysis, topic modelling, and named entity recognition models. Ideal for fintech NLP benchmarks.

Compliance Monitoring

Monitor complaint rates by company for UDAP/UDAAP signals, identify repeat-issue patterns, and benchmark company response timeliness against industry peers.

Fintech Risk Assessment

Assess consumer complaint risk profiles for financial institutions. Join against FFIEC data to correlate complaint rates with exam outcomes and enforcement actions.

Consumer Research

Analyse complaint patterns by geography, product type, and submission channel. Study how complaint behaviour varies across servicemember and older American populations.
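
Comparing servicemember and older American populations requires splitting the combined tags value into separate flags first. A minimal sketch, assuming tags holds only the three values listed in the schema and is null otherwise; the `is_servicemember` and `is_older_american` column names are illustrative and not part of the dataset.

```python
import pandas as pd


def add_tag_flags(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with one boolean column per consumer tag."""
    tags = df["tags"].fillna("")
    out = df.copy()
    # regex=False makes str.contains a plain substring match
    out["is_servicemember"] = tags.str.contains("Servicemember", regex=False)
    out["is_older_american"] = tags.str.contains("Older American", regex=False)
    return out


sample = pd.DataFrame(
    {"tags": ["Servicemember", None, "Older American, Servicemember", "Older American"]}
)
flags = add_tag_flags(sample)
print(flags[["is_servicemember", "is_older_american"]].sum())
```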

Regulatory Intelligence

Identify emerging consumer protection issues before they become enforcement actions. Track how CFPB's complaint taxonomy has evolved and normalise categories for longitudinal analysis.

Schema

Single table (cfpb_complaints), delivered as both cfpb_complaints.csv and cfpb_complaints.parquet.

Field | Type | Description
complaint_id | int | CFPB unique complaint identifier
date_received | string | Date CFPB received complaint (YYYY-MM-DD)
year | int | Year received (derived)
month | int | Month received 1–12 (derived)
product | string | Financial product category (original CFPB value)
product_normalised | string | Standardised product category (consolidates renamed categories across years)
sub_product | string | Product sub-category
issue | string | Issue type reported by consumer
sub_issue | string | Issue sub-type
consumer_narrative | string | Consumer complaint text (free text; 3.75M+ records have this)
has_narrative | int | 1 if narrative was provided, 0 otherwise
company | string | Company the complaint is about
company_public_response | string | Company public statement (if provided)
company_response | string | Company response to consumer
state | string | US state (2-letter code)
zip_code | string | 5-digit ZIP code
submitted_via | string | Channel: Web / Phone / Referral / Postal mail / Fax / Email
tags | string | Servicemember / Older American / Older American, Servicemember
consumer_consent | string | Whether consumer consented to publish narrative
timely_response | int | 1 if company responded within CFPB timeframe, 0 otherwise
consumer_disputed | int | 1 if consumer disputed company response (historical; CFPB stopped collecting this in April 2017, null afterwards)
date_sent_to_company | string | Date complaint forwarded to company (YYYY-MM-DD)
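
Because dates are delivered as strings and flags as integers, a short type-preparation step is often useful before analysis. The sketch below assumes the column names and formats listed in the schema. The `prepare_types` helper and the choice of the nullable `Int64` dtype for consumer_disputed are illustrative assumptions, not part of the delivered pipeline.

```python
import pandas as pd


def prepare_types(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with dates parsed and flag columns given explicit dtypes."""
    out = df.copy()
    for col in ("date_received", "date_sent_to_company"):
        out[col] = pd.to_datetime(out[col], format="%Y-%m-%d")
    for col in ("has_narrative", "timely_response"):
        out[col] = out[col].astype(bool)
    # consumer_disputed can be null, so keep it as a nullable integer
    out["consumer_disputed"] = out["consumer_disputed"].astype("Int64")
    return out


sample = pd.DataFrame(
    {
        "date_received": ["2016-03-01", "2020-07-15"],
        "date_sent_to_company": ["2016-03-02", "2020-07-15"],
        "has_narrative": [1, 0],
        "timely_response": [1, 1],
        "consumer_disputed": [0, None],
    }
)
typed = prepare_types(sample)
print(typed.dtypes)
```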

Quick Start

import pandas as pd

df = pd.read_parquet("cfpb_complaints.parquet")

# Complaint volume by year
print(df.groupby("year").size().sort_index())

# Top products by complaint volume
print(df["product_normalised"].value_counts().head(10))

# Companies with most complaints
print(df["company"].value_counts().head(20))

# Complaints with narratives only (.loc selects rows and column in one step)
narratives = df.loc[df["has_narrative"] == 1, "consumer_narrative"]
print(f"{len(narratives):,} complaints have free-text narratives")

# Timely response rate by company (top 10 by volume)
top_companies = df["company"].value_counts().head(10).index
rates = df[df["company"].isin(top_companies)].groupby("company")["timely_response"].mean()
print(rates.sort_values())

# State breakdown
print(df.groupby("state").size().sort_values(ascending=False).head(15))
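
For trend analysis, the derived year and month columns can be combined into a monthly series. The following sketch flags months whose volume exceeds a multiple of the trailing average. The `monthly_surges` name, the 3-month window, and the 1.5 ratio are arbitrary illustrative choices, not a recommended detection method.

```python
import pandas as pd


def monthly_surges(df: pd.DataFrame, window: int = 3, ratio: float = 1.5) -> pd.DataFrame:
    """Count complaints per month and flag months above ratio x the trailing mean."""
    counts = (
        df.groupby(["year", "month"]).size().rename("complaints").reset_index()
    )
    counts = counts.sort_values(["year", "month"], ignore_index=True)
    # Mean of the previous `window` observed months, excluding the current one
    baseline = counts["complaints"].shift(1).rolling(window).mean()
    counts["surge"] = counts["complaints"] > ratio * baseline
    return counts


sample = pd.DataFrame(
    {"year": [2023] * 14, "month": [1] * 2 + [2] * 2 + [3] * 2 + [4] * 8}
)
result = monthly_surges(sample)
print(result)
```

Months with no complaints at all do not appear in the grouped counts, so the window covers observed months rather than calendar months. The first `window` months never flag because they have no full baseline.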

Pairs Well With

External: FFIEC Call Reports

Join CFPB complaint rates against Federal Financial Institutions Examination Council (FFIEC) call report data to correlate consumer complaint volumes with bank financial health metrics.

External: CFPB Enforcement Actions

Cross-reference complaint spikes with CFPB public enforcement actions (available via CFPB API) to study how consumer complaint patterns precede regulatory action.

Pricing

Sample

Free

1,000 rows (CSV) + schema docs

Public Domain

Download Sample

Complete

$79

Full dataset — 14M+ complaints (2011–present), CSV + Parquet

Commercial License

Buy Complete

Annual

$149/yr

Full dataset + annual updates as CFPB adds new complaints

Commercial License

Subscribe

Data Provenance

Source: Consumer Financial Protection Bureau (CFPB), Consumer Complaint Database

Portal: CFPB Consumer Complaint Database

Coverage: 2011 to present. Bulk CSV updated daily by CFPB.

Product normalisation: CFPB has renamed and restructured product categories multiple times since 2011. We map all historical variants to a consistent product_normalised field for longitudinal analysis, while preserving the original product field.
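
The idea behind this kind of mapping can be illustrated with a small lookup table. The entries below are a hypothetical excerpt built from product labels CFPB has used at different times. They are not the actual mapping behind product_normalised, and the consolidated category names are illustrative.

```python
# Hypothetical excerpt: historical product label -> consolidated category.
# Not the actual mapping used to build product_normalised.
PRODUCT_MAP = {
    "Credit reporting": "Credit reporting",
    "Credit reporting, credit repair services, or other personal consumer reports": "Credit reporting",
    "Credit reporting or other personal consumer reports": "Credit reporting",
    "Payday loan": "Payday / personal loan",
    "Payday loan, title loan, or personal loan": "Payday / personal loan",
    "Payday loan, title loan, personal loan, or advance loan": "Payday / personal loan",
}


def normalise_product(product: str) -> str:
    """Map a raw product label to its consolidated category.

    Labels missing from the table are returned unchanged (whitespace-trimmed),
    so unexpected categories stay visible instead of being dropped.
    """
    cleaned = product.strip()
    return PRODUCT_MAP.get(cleaned, cleaned)


print(normalise_product("Credit reporting or other personal consumer reports"))
```

Returning unknown labels unchanged is a deliberate choice in this sketch: a new CFPB category then shows up in value counts and can be added to the table.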

Narratives: 3.75M+ complaints include consumer-written narratives where the consumer consented to publication. The has_narrative flag makes it easy to filter to narrative-only subsets for NLP work.

Consumer disputed field: CFPB stopped collecting the “Consumer disputed?” field in April 2017, so consumer_disputed is null for records after that date.
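
Because of this cutoff, dispute rates are only meaningful over records where the field was actually collected. A minimal sketch, assuming the year and consumer_disputed columns from the schema; the `dispute_rate_by_year` helper is illustrative. It filters on non-null values rather than a hard-coded date.

```python
import pandas as pd


def dispute_rate_by_year(df: pd.DataFrame) -> pd.Series:
    """Share of complaints disputed per year, over rows where the field is populated."""
    collected = df[df["consumer_disputed"].notna()]
    return collected.groupby("year")["consumer_disputed"].mean()


sample = pd.DataFrame(
    {
        "year": [2016, 2016, 2017, 2018],
        "consumer_disputed": [1, 0, 1, None],
    }
)
rates = dispute_rate_by_year(sample)
print(rates)
```

Years with no populated values are dropped from the result entirely, which avoids reporting a misleading rate for later years.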

Update frequency: CFPB updates the bulk CSV daily. Annual subscribers receive refreshes when ClarityStorm re-runs the pipeline.

License: CFPB complaint data is a US federal government work in the public domain. Paid tiers are licensed under the ClarityStorm Commercial Data License covering our pipeline and enrichment work (normalisation, derived fields, Parquet conversion).

Need custom data cuts, multi-year snapshots, or bulk licensing?

Contact Sales