Datasets

Analysis-ready datasets built from public government sources, with normalized schemas, CSV and Parquet formats, derived fields, and cross-dataset linking done for you. Free samples are available; commercial licenses are offered for full datasets.

Dataset Bundles — save up to 40%

Themed bundles: Transportation Safety, Aviation, ESG, Healthcare & Finance, or all datasets.


All datasets refreshed quarterly.

NJ Crash Records 2001–2022

22 yrs of NJ crash data — 5 linked tables, schema normalized

Records: ~6.3M crash events

Coverage: 22 years, 21 counties

Format: CSV + Parquet

traffic, road-safety, geospatial, insurance, AI-ready
From $99

Updated Q1 2025

NHTSA FARS Fatal Crashes 1975–2023

49 yrs of schema changes harmonized — all 50 states

Records: ~5.3M records across 3 tables

Coverage: 49 years, all 50 states

Format: CSV + Parquet

fatal-crashes, road-safety, federal, vehicles, AI-ready
From $99

Updated Q1 2025

NHTSA Vehicle Complaints 1995–Present

Component hierarchy parsed — pairs with Recalls for full defect pipeline

Records: ~2.2M complaints

Coverage: 30+ years, all makes & models

Format: CSV + Parquet

vehicles, safety, consumer, automotive, AI-ready
From $79

Updated Q1 2025

NTSB Aviation Accidents 1982–Present

Extracted from Access MDB — 6 relational tables, no Access needed

Records: ~30K accidents across 6 tables

Coverage: 40+ years, all 50 states

Format: CSV + Parquet

aviation, safety, federal, geospatial, AI-ready
From $79

Updated Q1 2025

NHTSA Vehicle Recalls 1967–Present

Pre-2010 + post-2010 files merged — pairs with Complaints

Records: 50K+ recall campaigns

Coverage: 57+ years, all makes & models

Format: CSV + Parquet

vehicles, recalls, safety, automotive, AI-ready
From $59

Updated Q1 2025

EPA Toxic Release Inventory 1987–Present

37 annual files merged — facilities + releases linked, ESG-ready

Records: 3M+ release records across 2 tables

Coverage: 37 years, all US states & territories

Format: CSV + Parquet

environment, chemicals, esg, pollution, geospatial, AI-ready
From $99

Updated Q2 2025

FDA FAERS Drug Adverse Events 2023

4 quarterly zips merged, deduplicated across 7 tables

Records: 1.5M+ reports across 7 tables

Coverage: 2023 (Q1–Q4), deduplicated

Format: CSV + Parquet

healthcare, drug-safety, pharmacovigilance, FDA, AI-ready
From $149

Updated Q2 2025

OSHA Workplace Injury & Illness 2016–Present

8 annual ITA files merged — DART & TCIR rates computed, NAICS normalized

Records: 4M+ establishment-year records

Coverage: 8 years, all US states

Format: CSV + Parquet

workplace-safety, osha, esg, insurance, compliance, AI-ready
From $79

Updated Q3 2025
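
For readers who want to check the computed rates, the sketch below applies the standard OSHA incidence-rate formula (cases × 200,000 ÷ hours worked, where 200,000 represents 100 full-time workers for a year). The function and argument names are hypothetical and are not the dataset's actual column names; how the published files handle missing or zero hours is not stated, so returning None here is an assumption.

```python
from typing import Optional

# OSHA's standard base: 100 full-time workers x 40 hours x 50 weeks.
HOURS_BASE = 200_000


def incidence_rate(cases: int, hours_worked: float) -> Optional[float]:
    """Cases per 100 full-time-equivalent workers.

    Returns None when hours are missing or non-positive (an assumption;
    the listing does not say how such rows are treated).
    """
    if not hours_worked or hours_worked <= 0:
        return None
    return cases * HOURS_BASE / hours_worked


def dart_rate(days_away_cases: int, restricted_or_transfer_cases: int,
              hours_worked: float) -> Optional[float]:
    """DART: Days Away, Restricted, or Transferred case rate."""
    return incidence_rate(days_away_cases + restricted_or_transfer_cases,
                          hours_worked)


def tcir(total_recordable_cases: int, hours_worked: float) -> Optional[float]:
    """TCIR: Total Case Incident Rate over all recordable cases."""
    return incidence_rate(total_recordable_cases, hours_worked)
```

As a usage example, an establishment with 2 days-away cases, 1 restricted or transfer case, and 400,000 hours worked has `dart_rate(2, 1, 400_000)` equal to 1.5.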

CFPB Consumer Financial Complaints 2011–Present

Stable Parquet snapshot from live API — 3.75M+ narratives preserved

Records: 14M+ complaints, 3.75M+ with narratives

Coverage: 13+ years, all US states

Format: CSV + Parquet

finance, compliance, consumer-protection, nlp, fintech, AI-ready
From $79

Updated Q3 2025

DOT Airline On-Time Performance 2018–Present

84 monthly BTS files merged — delay causes, cancellations, year-partitioned Parquet

Records: 35M+ domestic flights

Coverage: 7 years, 20+ carriers

Format: CSV + Parquet

aviation, flights, delays, ml, transportation, AI-ready
From $99

Updated Q2 2026

NOAA Storm Events Database 1950–Present

225 .csv.gz files merged: damage amounts parsed to numeric USD, 3 linked tables, 74 years of weather

Records: 2M+ storm events

Coverage: 74 years, all US states

Format: CSV + Parquet

weather, climate, insurance, tornado, flood, risk, AI-ready
From $79

Updated Q4 2025
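
To illustrate what parsing the damage fields involves: the raw NOAA files record damage as short strings such as "10.00K" or "1.5M". The sketch below is one way to convert them to dollar amounts. It assumes only the K, M, and B suffixes and treats blank or unparseable values as missing; `parse_damage` is a hypothetical helper, not the function used to build this dataset.

```python
from typing import Optional

# Suffix multipliers assumed for this sketch (thousands, millions, billions).
_MULTIPLIERS = {"K": 1e3, "M": 1e6, "B": 1e9}


def parse_damage(raw: Optional[str]) -> Optional[float]:
    """Convert a damage string like '10.00K' to a dollar amount.

    Returns None for missing, blank, or unparseable input, including a
    bare suffix such as 'K' with no number in front of it.
    """
    if raw is None:
        return None
    text = raw.strip().upper()
    if not text:
        return None
    suffix = text[-1]
    if suffix in _MULTIPLIERS:
        number, factor = text[:-1], _MULTIPLIERS[suffix]
    else:
        number, factor = text, 1.0
    try:
        return float(number) * factor
    except ValueError:
        return None
```

With these rules, `parse_damage("10.00K")` gives 10000.0 and `parse_damage("")` gives None.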

Vehicle Safety Profile — Complaints + Recalls + Fatal Crashes

The only pre-joined vehicle safety dataset — NHTSA + FARS cross-linked by make/model/year

Records: 33K+ vehicle-year profiles

Coverage: 1985–2025, all makes

Format: CSV + Parquet

automotive, recalls, safety, insurance, litigation, used-car, AI-ready
From $149

Updated Q4 2025
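
As a rough picture of what cross-linking by make/model/year means, the sketch below counts complaints, recalls, and fatal crashes per normalized (make, model, year) key. The row layout and the field names `make`, `model`, and `year` are hypothetical. Real sources may encode make and model differently from one another, so an actual pipeline would likely need a crosswalk step that is not shown here.

```python
from collections import defaultdict


def vehicle_key(make, model, year):
    """Normalize a vehicle identity to an upper-cased (make, model, year) tuple."""
    return (str(make).strip().upper(), str(model).strip().upper(), int(year))


def build_profiles(complaints, recalls, fatal_crashes):
    """Count records from each source per vehicle key.

    Each argument is an iterable of dicts with 'make', 'model', and 'year'
    entries (hypothetical field names chosen for this sketch).
    """
    profiles = defaultdict(
        lambda: {"complaints": 0, "recalls": 0, "fatal_crashes": 0}
    )
    sources = (
        ("complaints", complaints),
        ("recalls", recalls),
        ("fatal_crashes", fatal_crashes),
    )
    for source_name, rows in sources:
        for row in rows:
            key = vehicle_key(row["make"], row["model"], row["year"])
            profiles[key][source_name] += 1
    return dict(profiles)
```

Normalizing case and whitespace before joining means "Honda Civic " and "HONDA civic" land on the same profile row.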

FEMA NFIP Flood Insurance Claims 1978–Present

2.7M+ paid claims enriched — ZIP risk scores, coverage ratios, building characteristics

Records: 2.7M+ paid claims

Coverage: 48+ years, all 50 states

Format: CSV + Parquet

insurance, flood, climate-risk, fema, real-estate, geospatial, AI-ready
From $99

Updated Q1 2026

USACE Waterway Lock Inventory

234 US navigable waterway locks — physical specs, chamber dimensions, gate types, capacity tiers

Records: 234 locks, 56 fields

Coverage: 26 states, 60+ waterways, all USACE districts

Format: CSV + Parquet

waterway, logistics, infrastructure, barge, USACE, geospatial, AI-ready
From $99
FAA Wildlife Strike Database 1990–Present

Every reported strike since 1990 — species risk tiers, damage severity, engine ingestion flags

Records: 341,090 strike reports, 113 fields

Coverage: 36+ years, 2,764 airports, 952 species

Format: CSV + Parquet

aviation, wildlife, safety, FAA, birdstrike, geospatial, AI-ready
From $99
CDC WONDER Mortality + SDOH 1999–2016

18 yrs of county-level deaths — mortality tiers, YPLL, crude rates, 3,100+ counties, all 50 states + DC

Records: 55K+ county × year mortality records

Coverage: 18 years, 3,100+ counties, all 50 states + DC

Format: CSV + Parquet

healthcare, mortality, public-health, epidemiology, AI-ready
From $129

Updated Q2 2026
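
For orientation, the two derived measures named above can be sketched as follows. The crude rate is deaths per 100,000 population. YPLL (years of potential life lost) is shown here with an assumed reference age of 75 and age-group midpoints. The listing does not state which reference age or age grouping the dataset uses, so both are assumptions, and the function names are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def crude_rate(deaths: int, population: int, per: int = 100_000) -> Optional[float]:
    """Deaths per `per` population; None when population is not positive."""
    if population <= 0:
        return None
    return deaths * per / population


def ypll(age_groups: Iterable[Tuple[float, int]], reference_age: float = 75) -> float:
    """Years of potential life lost before `reference_age`.

    `age_groups` holds (midpoint_age, deaths) pairs. Deaths at or above
    the reference age contribute zero years.
    """
    return sum(max(0.0, reference_age - midpoint) * deaths
               for midpoint, deaths in age_groups)
```

For example, a county with 50 deaths and a population of 200,000 has a crude rate of 25.0 per 100,000.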

USDA Crop Insurance Indemnities + Weather 1989–2023

35 yrs of crop loss records pre-joined with NOAA drought/weather by county — 130+ crops, all 50 states

Records: 4.2M+ indemnity records across 3 tables

Coverage: 35 years, all US crop counties

Format: CSV + Parquet

agriculture, climate-risk, insurance, drought, USDA, AgTech, AI-ready
From $99

Updated Q2 2026
