NHTSA Vehicle Recalls 1967–Present

Name: NHTSA Vehicle Recalls 1967–Present
Creator: ClarityStorm
License: https://www.claritystorm.com/license

Complete US Vehicle Safety Recall Database

Every NHTSA safety-related defect and compliance recall campaign since 1967 — cleaned, structured, and ready for analysis. Covers vehicles, equipment, child restraints, and tires with defect summaries, corrective actions, affected unit counts, and manufacturer details. Pairs naturally with NHTSA Complaints for end-to-end vehicle safety pipelines.

Sample: Public Domain | Paid: Commercial License | CSV + Parquet | 50K+ recalls | 1967–Present

Why not just download it from NHTSA?

You can — it's public domain. Here's what we saved you:

✓ Merged pre-2010 and post-2010 flat files — NHTSA distributes these as two separate downloads with different schemas
✓ Added human-readable recall_type — the raw source uses single-letter codes (V/E/C/T)
✓ Normalized model_year — the source contains mixed formats and null representations
✓ Added derived fields — recall_year, mfg_span_days (manufacturing window duration)
✓ ISO 8601 dates — all date fields standardized from NHTSA's raw format
✓ Pairs with Complaints — normalized make/model/year for joining complaint volume to recall campaigns

⏱ Skip ~1–2 hours of merging and normalization. Simple dataset — but the two-file schema mismatch catches everyone.

What you'd need to do yourself:

Find and download two separate flat files from NHTSA — the split at the 2010 boundary is not obvious from the portal
Reconcile the schema differences between the pre-2010 and post-2010 files
Decode the single-letter recall type codes (V/E/C/T) from the NHTSA data dictionary
Parse and standardize the raw date fields and model year values
Convert to Parquet for efficient analytical use
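
For a sense of what the decoding and standardization steps above involve, here is a minimal pandas sketch. It is not the pipeline used to build this dataset. It assumes the raw dates arrive as YYYYMMDD strings and that 9999 or non-numeric text stands for an unknown model year; both are assumptions to check against the NHTSA data dictionary. The function name normalize_recalls and the sample rows are made up for illustration.

```python
import pandas as pd

# Code-to-label mapping, matching the recall_type values listed in the schema.
RECALL_TYPES = {"V": "Vehicle", "E": "Equipment", "C": "Child Restraint", "T": "Tire"}
DATE_COLS = ["rcdate", "odate", "bgman", "endman"]


def normalize_recalls(raw: pd.DataFrame) -> pd.DataFrame:
    """Decode type codes, standardize dates, and clean model years."""
    out = raw.copy()
    out["recall_type"] = out["rcltypecd"].str.strip().str.upper().map(RECALL_TYPES)

    # ASSUMPTION: raw dates are YYYYMMDD strings. Unparseable values become missing.
    for col in DATE_COLS:
        parsed = pd.to_datetime(out[col], format="%Y%m%d", errors="coerce")
        out[col] = parsed.dt.strftime("%Y-%m-%d")

    # ASSUMPTION: 9999 and non-numeric text mean "unknown model year".
    year = pd.to_numeric(out["model_year"], errors="coerce")
    out["model_year"] = year.where(year.between(1900, 2100)).astype("Int64")
    return out


# Illustrative placeholder rows, not real recall records.
raw = pd.DataFrame({
    "rcltypecd": ["V", "t"],
    "rcdate": ["20150312", ""],
    "odate": ["20150401", None],
    "bgman": ["20140101", "20130101"],
    "endman": ["20141231", "bad"],
    "model_year": ["2015", "9999"],
})
clean = normalize_recalls(raw)
print(clean[["recall_type", "rcdate", "model_year"]])
```

Coercing bad values to missing rather than raising keeps the whole file loadable; count the resulting nulls afterwards to see how much was dropped.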

50K+ Recalls

57+ Years

200+ Makes

500M+ Units Affected (est.)

Use Cases

Recall Risk Scoring

Build models that predict recall likelihood for specific makes, models, and model years based on component defect patterns and manufacturer history.

Complaints-to-Recalls Analysis

Join with NHTSA Complaints to measure how many consumer complaints preceded each recall — a proxy for NHTSA's investigation pipeline.

Vehicle Safety Benchmarking

Rank manufacturers and models by recall frequency, units affected, and defect severity for competitive safety analysis.
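
A ranking along these lines can be sketched with a single groupby. The rows below are placeholders rather than real recalls, and the output column names recalls and units_affected are chosen for illustration. The schema has no severity score, so this sketch covers frequency and units affected only.

```python
import pandas as pd

# Illustrative placeholder rows, not real recall records.
# With the real file: df = pd.read_parquet("nhtsa_recalls.parquet")
df = pd.DataFrame({
    "campno": ["TEST-001", "TEST-002", "TEST-003", "TEST-004"],
    "maketxt": ["MAKE_A", "MAKE_A", "MAKE_B", "MAKE_C"],
    "potaff": [1200, 300, 5000, None],
})

# Count distinct campaigns and total potentially affected units per make.
ranking = (
    df.groupby("maketxt")
    .agg(recalls=("campno", "nunique"), units_affected=("potaff", "sum"))
    .sort_values(["recalls", "units_affected"], ascending=False)
)
print(ranking)
```

Note that sum() skips missing potaff values, so a make whose unit counts are all unknown shows 0 rather than null.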

NLP on Defect Narratives

50K+ structured defect summaries and corrective action descriptions — ideal for topic modeling, defect classification, and entity extraction.

Regulatory & Compliance Research

Analyze recall initiation patterns (manufacturer-initiated vs. ODI-directed), FMVSS compliance violations, and regulatory trends since 1967.
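
One way to look at initiation patterns over time is a normalized cross-tabulation of recall_year against influenced_by. The sketch below uses placeholder rows and assumes the MFR/OVSC/ODI codes listed in the schema.

```python
import pandas as pd

# Illustrative placeholder rows, not real recall records.
# With the real file: df = pd.read_parquet("nhtsa_recalls.parquet")
df = pd.DataFrame({
    "recall_year": [2019, 2019, 2019, 2019, 2020, 2020],
    "influenced_by": ["MFR", "MFR", "ODI", "OVSC", "MFR", "ODI"],
})

# Share of each year's campaigns by initiator; every row sums to 1.0.
share = pd.crosstab(df["recall_year"], df["influenced_by"], normalize="index")
print(share.round(2))
```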

Automotive Intelligence

Track component failure trends across model years, identify chronic defect categories, and support due diligence for fleet purchases.

Schema

Single table: nhtsa_recalls.parquet / nhtsa_recalls.csv, one row per recall campaign. Fields added or normalized by our pipeline: recall_year (derived), model_year (normalized), recall_type (human-readable), mfg_span_days (derived).

Field	Type	Description
campno	string	NHTSA recall campaign number
maketxt	string	Vehicle/equipment make
modeltxt	string	Vehicle/equipment model
model_year	int	Model year (null if unknown/N/A)
mfgname	string	Manufacturer that filed Part 573 report
mfgtxt	string	Manufacturers of recalled vehicles/equipment
mfgcampno	string	Manufacturer's own campaign number
compname	string	Component description
rcltypecd	string	Recall type code (V/E/C/T)
recall_type	string	Recall type (Vehicle/Equipment/Child Restraint/Tire)
potaff	int	Potential number of units affected
influenced_by	string	Recall initiated by (MFR/OVSC/ODI)
rcdate	string	Part 573 report received date (YYYY-MM-DD)
recall_year	int	Recall year (derived)
odate	string	Date manufacturer notified owners (YYYY-MM-DD)
bgman	string	Manufacturing begin date (YYYY-MM-DD)
endman	string	Manufacturing end date (YYYY-MM-DD)
mfg_span_days	int	Manufacturing span in days (derived)
desc_defect	string	Defect summary (free text, up to 6,000 chars)
conequence_defect	string	Consequence summary (free text); the field name keeps the spelling used in NHTSA's source file
corrective_action	string	Corrective action summary (free text)
notes	string	Additional recall notes
fmvss	string	Federal Motor Vehicle Safety Standard number
do_not_drive	int	Consumer advisory: do not drive (1/0)
park_outside	int	Consumer advisory: park outside (1/0)
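
Because the date fields are stored as YYYY-MM-DD strings, they need parsing before any date arithmetic. The sketch below recomputes the two derived fields as a sanity check. It assumes mfg_span_days is the plain difference endman minus bgman; the exact definition used in the dataset (for example, whether the end date is counted inclusively) is not stated here, so treat the result as an approximation. The rows and the _check column names are placeholders.

```python
import pandas as pd

# Illustrative placeholder rows, not real recall records.
df = pd.DataFrame({
    "rcdate": ["2015-03-12", "2020-11-02"],
    "bgman": ["2014-01-01", None],
    "endman": ["2014-12-31", "2019-06-30"],
})

# Parse the ISO 8601 strings; missing or malformed values become NaT.
rcdate = pd.to_datetime(df["rcdate"], errors="coerce")
bgman = pd.to_datetime(df["bgman"], errors="coerce")
endman = pd.to_datetime(df["endman"], errors="coerce")

# Nullable Int64 keeps missing spans as <NA> instead of forcing floats.
df["recall_year_check"] = rcdate.dt.year.astype("Int64")
df["mfg_span_days_check"] = (endman - bgman).dt.days.astype("Int64")
print(df)
```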

Quick Start

import pandas as pd

df = pd.read_parquet("nhtsa_recalls.parquet")

# Top makes by recall count
print(df["maketxt"].value_counts().head(10))

# Total units affected by year
yearly = df.groupby("recall_year")["potaff"].sum()
print(yearly.tail(10))

# Manufacturer-initiated vs. regulator-directed
print(df["influenced_by"].value_counts())

# Filter to vehicle-type recalls only
vehicles = df[df["recall_type"] == "Vehicle"]
print(f"{len(vehicles):,} vehicle recalls")

Pairs Well With

NHTSA Vehicle Complaints 1995–Present

Join complaints to recalls on make/model/year to measure how many consumer complaints preceded each recall campaign — the full NHTSA defect investigation pipeline in one dataset pair.
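
A rough sketch of that join follows. The complaints frame and its complaint_date column are hypothetical, since the Complaints schema is not shown here; the sketch also assumes the complaints table uses the same maketxt, modeltxt, and model_year columns. All rows are placeholders.

```python
import pandas as pd

KEYS = ["maketxt", "modeltxt", "model_year"]

# Illustrative placeholder rows, not real records.
recalls = pd.DataFrame({
    "campno": ["TEST-001", "TEST-002"],
    "maketxt": ["MAKE_A", "MAKE_B"],
    "modeltxt": ["MODEL_X", "MODEL_Y"],
    "model_year": [2015, 2018],
    "rcdate": ["2016-06-01", "2019-01-15"],
})
# HYPOTHETICAL complaints schema: complaint_date is an assumed column name.
complaints = pd.DataFrame({
    "maketxt": ["MAKE_A", "MAKE_A", "MAKE_A", "MAKE_B"],
    "modeltxt": ["MODEL_X", "MODEL_X", "MODEL_X", "MODEL_Y"],
    "model_year": [2015, 2015, 2015, 2018],
    "complaint_date": ["2015-09-01", "2016-02-10", "2016-08-01", "2019-03-01"],
})

# Attach every matching recall to each complaint, then parse both dates.
joined = complaints.merge(recalls[["campno", "rcdate"] + KEYS], on=KEYS, how="inner")
joined["complaint_date"] = pd.to_datetime(joined["complaint_date"])
joined["rcdate"] = pd.to_datetime(joined["rcdate"])

# Keep complaints dated before the recall and count them per campaign.
prior = joined[joined["complaint_date"] < joined["rcdate"]]
prior_counts = prior.groupby("campno").size().reindex(recalls["campno"], fill_value=0)
print(prior_counts)
```

Matching on make/model/year alone counts every earlier complaint for that vehicle, whatever the component. A tighter analysis would also compare the component fields on both sides.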

NHTSA FARS 1975–2023

Cross-reference recall campaigns with fatal crash data to evaluate the safety impact of recalls on actual crash outcomes by vehicle and model year.

Pricing

Sample

Free

1,000 rows (CSV) + schema docs

Public Domain

Complete

$59

Full dataset — 50K+ recalls since 1967, CSV + Parquet

Commercial License

Annual

$119/yr

Full dataset + quarterly updates as NHTSA issues new recalls

Commercial License

Data Provenance

Source: National Highway Traffic Safety Administration (NHTSA), Office of Defects Investigation (ODI)

Portal: NHTSA Vehicle Safety Recalls

Coverage: All recall campaigns since 1967 — pre-2010 and post-2010 flat files combined into a single cleaned table.

Update frequency: NHTSA updates its source data continuously as new recalls are issued. Annual subscribers receive quarterly refreshes.

License: NHTSA recall data is a US federal government work in the public domain. Paid tiers are licensed under the ClarityStorm Commercial Data License, which covers our pipeline and enrichment work (date standardization, type normalization, derived fields).

Need custom data cuts, API access, or bulk licensing?

Contact Sales