NHTSA Vehicle Recalls 1967–Present
Complete US Vehicle Safety Recall Database
Every NHTSA safety-related defect and compliance recall campaign since 1967 — cleaned, structured, and ready for analysis. Covers vehicles, equipment, child restraints, and tires with defect summaries, corrective actions, affected unit counts, and manufacturer details. Pairs naturally with NHTSA Complaints for end-to-end vehicle safety pipelines.
Why not just download it from NHTSA?
You can — it's public domain. Here's what we saved you:
- ✓ Merged pre-2010 and post-2010 flat files: NHTSA distributes these as two separate downloads with different schemas
- ✓ Added human-readable recall_type: the raw source uses single-letter codes (V/E/C/T)
- ✓ Normalized model_year: the source contains mixed formats and null representations
- ✓ Added derived fields: recall_year, mfg_span_days (manufacturing window duration)
- ✓ ISO 8601 dates: all date fields standardized from NHTSA's raw format
- ✓ Pairs with Complaints: normalized make/model/year for joining complaint volume to recall campaigns
⏱ Skip ~1–2 hours of merging and normalization. Simple dataset — but the two-file schema mismatch catches everyone.
What you'd need to do yourself:
- Find and download two separate flat files from NHTSA — the split at the 2010 boundary is not obvious from the portal
- Reconcile the schema differences between the pre-2010 and post-2010 files
- Decode the single-letter recall type codes (V/E/C/T) from the NHTSA data dictionary
- Parse and standardize the raw date fields and model year values
- Convert to Parquet for efficient analytical use
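For a sense of what the decoding and standardization steps involve, here is a minimal sketch using only the Python standard library. It is not the pipeline behind this dataset. The function names are hypothetical, and two details are assumptions about the raw files rather than facts stated on this page: that raw dates arrive as YYYYMMDD strings, and that 9999 or non-numeric text stands in for an unknown model year.

```python
from datetime import datetime
from typing import Optional

# Code-to-label mapping as listed for the recall_type field on this page.
RECALL_TYPES = {"V": "Vehicle", "E": "Equipment", "C": "Child Restraint", "T": "Tire"}


def decode_recall_type(code: Optional[str]) -> Optional[str]:
    """Map a single-letter code to its label; unknown codes become None."""
    if code is None:
        return None
    return RECALL_TYPES.get(code.strip().upper())


def normalize_model_year(raw: Optional[str]) -> Optional[int]:
    """Return the model year as an int, or None for blanks and placeholders.

    Assumption: 9999, out-of-range values, and non-numeric text mean unknown.
    """
    if raw is None:
        return None
    try:
        value = float(raw.strip())  # accepts "2012" as well as "2012.0"
    except ValueError:
        return None
    if not value.is_integer():  # also rejects nan and inf
        return None
    year = int(value)
    if year == 9999 or not 1900 <= year <= 2100:
        return None
    return year


def to_iso_date(raw: Optional[str]) -> Optional[str]:
    """Convert an assumed YYYYMMDD string to YYYY-MM-DD; None if unparseable."""
    if raw is None:
        return None
    text = raw.strip()
    if len(text) != 8 or not text.isdigit():
        return None
    try:
        return datetime.strptime(text, "%Y%m%d").date().isoformat()
    except ValueError:  # e.g. February 30
        return None
```

Returning None for anything unparseable keeps bad values out of the typed columns. Whether to drop, flag, or repair such rows is a separate decision.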
- 50K+ recalls
- 57+ years
- 200+ makes
- 500M+ units affected (est.)
Use Cases
Build models that predict recall likelihood for specific makes, models, and model years based on component defect patterns and manufacturer history.
Join with NHTSA Complaints to measure how many consumer complaints preceded each recall — a proxy for NHTSA's investigation pipeline.
Rank manufacturers and models by recall frequency, units affected, and defect severity for competitive safety analysis.
50K+ structured defect summaries and corrective action descriptions — ideal for topic modeling, defect classification, and entity extraction.
Analyze recall initiation patterns (manufacturer-initiated vs. ODI-directed), FMVSS compliance violations, and regulatory trends since 1967.
Track component failure trends across model years, identify chronic defect categories, and support due diligence for fleet purchases.
Schema
Single table: nhtsa_recalls.parquet / nhtsa_recalls.csv, with one row per recall campaign. Derived fields added: recall_year, model_year (normalized), recall_type (human-readable), mfg_span_days.
| Field | Type | Description |
|---|---|---|
| campno | string | NHTSA recall campaign number |
| maketxt | string | Vehicle/equipment make |
| modeltxt | string | Vehicle/equipment model |
| model_year | int | Model year (null if unknown/N/A) |
| mfgname | string | Manufacturer that filed Part 573 report |
| mfgtxt | string | Manufacturers of recalled vehicles/equipment |
| mfgcampno | string | Manufacturer's own campaign number |
| compname | string | Component description |
| rcltypecd | string | Recall type code (V/E/C/T) |
| recall_type | string | Recall type (Vehicle/Equipment/Child Restraint/Tire) |
| potaff | int | Potential number of units affected |
| influenced_by | string | Recall initiated by (MFR/OVSC/ODI) |
| rcdate | string | Part 573 report received date (YYYY-MM-DD) |
| recall_year | int | Recall year (derived) |
| odate | string | Date manufacturer notified owners (YYYY-MM-DD) |
| bgman | string | Manufacturing begin date (YYYY-MM-DD) |
| endman | string | Manufacturing end date (YYYY-MM-DD) |
| mfg_span_days | int | Manufacturing span in days (derived) |
| desc_defect | string | Defect summary (free text, up to 6,000 chars) |
| conequence_defect | string | Consequence summary (free text) |
| corrective_action | string | Corrective action summary (free text) |
| notes | string | Additional recall notes |
| fmvss | string | Federal Motor Vehicle Safety Standard number |
| do_not_drive | int | Consumer advisory: do not drive (1/0) |
| park_outside | int | Consumer advisory: park outside (1/0) |
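The two date-derived fields in the table can be recomputed from the ISO date strings, which is handy for spot-checking rows. The sketch below is one plausible derivation, not the documented pipeline logic. It assumes recall_year is the calendar year of rcdate and that mfg_span_days is the plain difference endman minus bgman. Whether the shipped value counts the end date inclusively, and how reversed dates are handled, is not stated on this page. Here reversed dates yield None.

```python
from datetime import date
from typing import Optional


def recall_year(rcdate: Optional[str]) -> Optional[int]:
    """Calendar year of the Part 573 received date, or None if missing/invalid."""
    if not rcdate:
        return None
    try:
        return date.fromisoformat(rcdate).year
    except ValueError:
        return None


def mfg_span_days(bgman: Optional[str], endman: Optional[str]) -> Optional[int]:
    """Days between manufacturing begin and end dates (end minus begin).

    Assumption: a negative span is treated as a data error and returned as None.
    """
    if not bgman or not endman:
        return None
    try:
        span = (date.fromisoformat(endman) - date.fromisoformat(bgman)).days
    except ValueError:
        return None
    return span if span >= 0 else None
```

For example, `mfg_span_days("2015-01-01", "2015-12-31")` gives 364 under this definition.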
Quick Start
import pandas as pd
df = pd.read_parquet("nhtsa_recalls.parquet")
# Top makes by recall count
print(df["maketxt"].value_counts().head(10))
# Total units affected by year
yearly = df.groupby("recall_year")["potaff"].sum()
print(yearly.tail(10))
# Manufacturer-initiated vs. regulator-directed
print(df["influenced_by"].value_counts())
# Filter to vehicle-type recalls only
vehicles = df[df["recall_type"] == "Vehicle"]
print(f"{len(vehicles):,} vehicle recalls")
Pairs Well With
Join complaints to recalls on make/model/year to measure how many consumer complaints preceded each recall campaign — the full NHTSA defect investigation pipeline in one dataset pair.
Cross-reference recall campaigns with fatal crash data to evaluate the safety impact of recalls on actual crash outcomes by vehicle and model year.
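The complaints-to-recalls join described above can be sketched with plain dictionaries. The recall keys (campno, maketxt, modeltxt, model_year, rcdate) come from the schema on this page. The complaint keys (make, model, year, date_filed) are hypothetical placeholders, since the Complaints schema is not shown here. Treat this as an illustration of the matching logic rather than a reference implementation.

```python
from collections import defaultdict
from datetime import date


def complaints_before_recall(recalls, complaints):
    """Count complaints filed before each recall's rcdate for the same vehicle.

    recalls:    iterable of dicts using the field names from the schema table.
    complaints: iterable of dicts with hypothetical keys
                make, model, year, date_filed (YYYY-MM-DD).
    Returns a dict mapping campno to the number of earlier complaints.
    """
    # Index complaint dates by a case-insensitive (make, model, year) key.
    by_vehicle = defaultdict(list)
    for c in complaints:
        key = (c["make"].upper(), c["model"].upper(), c["year"])
        by_vehicle[key].append(date.fromisoformat(c["date_filed"]))

    counts = {}
    for r in recalls:
        # Skip rows that cannot be matched or dated.
        if not r.get("rcdate") or r.get("model_year") is None:
            continue
        key = (r["maketxt"].upper(), r["modeltxt"].upper(), r["model_year"])
        received = date.fromisoformat(r["rcdate"])
        earlier = sum(1 for filed in by_vehicle.get(key, []) if filed < received)
        # Accumulate in case a campaign number appears on more than one row.
        counts[r["campno"]] = counts.get(r["campno"], 0) + earlier
    return counts
```

Exact matching on make and model text is a simplification. In practice, spelling variants between the two datasets may need a lookup table before the counts are trustworthy.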
Data Provenance
Source: National Highway Traffic Safety Administration (NHTSA), Office of Defects Investigation (ODI)
Portal: NHTSA Vehicle Safety Recalls
Coverage: All recall campaigns since 1967 — pre-2010 and post-2010 flat files combined into a single cleaned table.
Update frequency: NHTSA updates the flat file continuously as new recalls are issued. Annual subscribers receive quarterly refreshes.
License: NHTSA recall data is a US federal government work in the public domain. Paid tiers are licensed under the ClarityStorm Commercial Data License, which covers our pipeline and enrichment work (date standardization, type normalization, derived fields).
Need custom data cuts, API access, or bulk licensing?
Contact Sales