Dataset Changelog

Schema changes, quarterly data refreshes, and enrichment updates for all ClarityStorm datasets — newest first.

Refresh schedule: Annual subscribers receive updated data as upstream sources publish it. Most federal datasets are published annually (Q1/Q2). Continuously updated sources (CFPB, NHTSA) are refreshed quarterly.

All Changes

2025-12-01

Initial release (NOAA Storm Events Database 1950–Present)

225 .csv.gz files merged — details, fatalities, and locations tables. Damage strings parsed to USD.

2025-12-01

Initial release (Vehicle Safety Profile)

Pre-joined NHTSA Complaints + Recalls + FARS fatal crashes. 33K+ vehicle-year profiles. One row per make/model/year.

2025-09-01

Enrichment: computed fields added (NHTSA Vehicle Complaints 1995–Present)

Added derived fields: complaint_category (AI-classified), component_group, and the is_crash flag.

2025-09-01

Enrichment: computed fields added (FDA FAERS Drug Adverse Events 2023)

Added seriousness_label (AI-classified), reaction_category, and reporter_type_normalized.

2025-09-01

Initial release (OSHA Workplace Injury & Illness 2016–Present)

8 annual ITA files merged. DART & TCIR rates computed. NAICS zero-padded.

2025-09-01

Enrichment: computed fields added (CFPB Consumer Financial Complaints 2011–Present)

Added sentiment_score, product_category_normalized, and timely_response_flag.

2025-09-01

Initial release (CFPB Consumer Financial Complaints 2011–Present)

Stable Parquet snapshot from live API. 14M+ complaints, 3.75M+ with narratives.

2025-09-01

Initial release (DOT Airline On-Time Performance 2018–Present)

84 monthly BTS files merged. 35M+ flights. Year-partitioned Parquet.

2025-06-01

Initial release (EPA Toxic Release Inventory 1987–Present)

37 annual files merged. Facilities + releases linked. Environmental Justice linker included.

2025-06-01

Initial release (FDA FAERS Drug Adverse Events 2023)

4 quarterly zips merged, deduplicated across 7 tables. 1.5M+ reports.

2025-01-15

Initial release (NJ Crash Records 2001–2022)

First publication. 22 years, 5 linked tables, ~6.3M crash events. CSV + Parquet.

2025-01-15

Initial release (NHTSA FARS Fatal Crashes 1975–2023)

49 years, 3 normalized tables. Schema harmonized across 49 years of SAS format changes.

2025-01-15

Initial release (NHTSA Vehicle Complaints 1995–Present)

~2.2M complaints. Component hierarchy parsed from free-text. CSV + Parquet.

2025-01-15

Initial release (NHTSA Vehicle Recalls 1967–Present)

Pre-2010 + post-2010 files merged. 57+ years of recall campaigns.

2025-01-15

Initial release (NTSB Aviation Accidents 1982–Present)

6 relational tables extracted from Access MDB. No Access required.

By Dataset

NJ Crash Records 2001–2022

2025-01-15
New · Initial release · v1.0.0

First publication. 22 years, 5 linked tables, ~6.3M crash events. CSV + Parquet.

NHTSA FARS Fatal Crashes 1975–2023

2025-01-15
New · Initial release · v1.0.0

49 years, 3 normalized tables. Schema harmonized across 49 years of SAS format changes.

NHTSA Vehicle Complaints 1995–Present

2025-09-01
Schema · Enrichment: computed fields added · v1.1.0

Added derived fields: complaint_category (AI-classified), component_group, and the is_crash flag.

2025-01-15
New · Initial release · v1.0.0

~2.2M complaints. Component hierarchy parsed from free-text. CSV + Parquet.

NHTSA Vehicle Recalls 1967–Present

2025-01-15
New · Initial release · v1.0.0

Pre-2010 + post-2010 files merged. 57+ years of recall campaigns.

NTSB Aviation Accidents 1982–Present

2025-01-15
New · Initial release · v1.0.0

6 relational tables extracted from Access MDB. No Access required.

EPA Toxic Release Inventory 1987–Present

2025-06-01
New · Initial release · v1.0.0

37 annual files merged. Facilities + releases linked. Environmental Justice linker included.

FDA FAERS Drug Adverse Events 2023

2025-09-01
Schema · Enrichment: computed fields added · v1.1.0

Added seriousness_label (AI-classified), reaction_category, and reporter_type_normalized.

2025-06-01
New · Initial release · v1.0.0

4 quarterly zips merged, deduplicated across 7 tables. 1.5M+ reports.

OSHA Workplace Injury & Illness 2016–Present

2025-09-01
New · Initial release · v1.0.0

8 annual ITA files merged. DART & TCIR rates computed. NAICS zero-padded.
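
For readers who want to sanity-check the computed rates, the sketch below applies the standard OSHA incidence-rate formula: cases × 200,000 / hours worked, where 200,000 is the number of hours worked by 100 full-time employees in a year. The field names mirror the case-count columns in OSHA's ITA files but are assumptions here, as is the two-decimal rounding; the dataset's actual column names and precision may differ.

```python
from dataclasses import dataclass
from typing import Optional

# 100 full-time workers x 40 hours/week x 50 weeks/year.
OSHA_BASE_HOURS = 200_000


@dataclass
class EstablishmentYear:
    """One establishment's annual summary (field names are assumed)."""
    total_deaths: int
    total_dafw_cases: int   # cases with days away from work
    total_djtr_cases: int   # cases with job transfer or restriction
    total_other_cases: int  # other recordable cases
    total_hours_worked: float


def _rate(cases: int, hours: float) -> Optional[float]:
    # A rate is undefined when no hours were reported.
    if hours <= 0:
        return None
    return round(cases * OSHA_BASE_HOURS / hours, 2)


def dart_rate(rec: EstablishmentYear) -> Optional[float]:
    """Days Away, Restricted, or Transferred rate."""
    cases = rec.total_dafw_cases + rec.total_djtr_cases
    return _rate(cases, rec.total_hours_worked)


def tcir_rate(rec: EstablishmentYear) -> Optional[float]:
    """Total Case Incident Rate: all recordable cases, including deaths."""
    cases = (rec.total_deaths + rec.total_dafw_cases
             + rec.total_djtr_cases + rec.total_other_cases)
    return _rate(cases, rec.total_hours_worked)


example = EstablishmentYear(
    total_deaths=0,
    total_dafw_cases=3,
    total_djtr_cases=2,
    total_other_cases=4,
    total_hours_worked=400_000,
)
print(dart_rate(example), tcir_rate(example))  # 2.5 4.5
```

Returning None for establishments with zero reported hours avoids a division by zero and keeps undefined rates distinguishable from a true rate of 0.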

CFPB Consumer Financial Complaints 2011–Present

2025-09-01
Schema · Enrichment: computed fields added · v1.1.0

Added sentiment_score, product_category_normalized, and timely_response_flag.

2025-09-01
New · Initial release · v1.0.0

Stable Parquet snapshot from live API. 14M+ complaints, 3.75M+ with narratives.

DOT Airline On-Time Performance 2018–Present

2025-09-01
New · Initial release · v1.0.0

84 monthly BTS files merged. 35M+ flights. Year-partitioned Parquet.

NOAA Storm Events Database 1950–Present

2025-12-01
New · Initial release · v1.0.0

225 .csv.gz files merged — details, fatalities, and locations tables. Damage strings parsed to USD.
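
NOAA's raw damage fields are strings with a magnitude suffix, such as "10.00K" or "2.5M". The sketch below shows one way such strings can be converted to whole US dollars. The function name parse_damage_usd is hypothetical, and returning None for blank or malformed values is an assumption rather than a description of how this dataset treats them.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional

# Magnitude suffixes used in raw damage strings (thousand, million, billion).
_MULTIPLIERS = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_damage_usd(raw: Optional[str]) -> Optional[int]:
    """Convert a damage string such as "2.5M" to whole US dollars.

    Blank or unparseable input yields None instead of a guessed value.
    """
    if raw is None:
        return None
    text = raw.strip().upper()
    if not text:
        return None

    multiplier = 1
    if text[-1] in _MULTIPLIERS:
        multiplier = _MULTIPLIERS[text[-1]]
        text = text[:-1]  # a bare suffix leaves "" and is rejected below

    try:
        # Decimal avoids binary floating-point error on values like "0.07K".
        return int(Decimal(text) * multiplier)
    except (InvalidOperation, ValueError, OverflowError):
        return None


for sample in ["10.00K", "2.5M", "1B", "500", "", "K"]:
    print(repr(sample), "->", parse_damage_usd(sample))
```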

Vehicle Safety Profile — Complaints + Recalls + Fatal Crashes

2025-12-01
New · Initial release · v1.0.0

Pre-joined NHTSA Complaints + Recalls + FARS fatal crashes. 33K+ vehicle-year profiles. One row per make/model/year.