Blog

Python tutorials and analysis guides for our government datasets — from loading raw data to building production ML models.

Learn how to load and analyze 49 years of US fatal crash data from NHTSA FARS using Python and pandas. Discover fatality trends, DUI patterns, and geospatial hotspots.

Tags: FARS, traffic safety, Python, pandas

Use 2.2M NHTSA vehicle complaint narratives to build an NLP defect detection model in Python. Cluster complaint text, predict recall likelihood, and profile high-risk makes.

Tags: NHTSA, NLP, vehicle safety, scikit-learn

Explore 40+ years of US aviation accident data from NTSB using Python. Analyze accident rates, aircraft types, injury patterns, and probable causes across 30K events.

Tags: NTSB, aviation safety, Python, pandas

Use Python to map 37 years of industrial toxic releases from the EPA Toxics Release Inventory. Analyze PFAS, carcinogen trends, and facility-level pollution across US states.

Tags: EPA TRI, environment, PFAS, geospatial