NJ Crash Records 2001-2022
AI-Ready Structured Dataset
22 years of New Jersey traffic crash records, cleaned and standardised for machine learning. Sourced directly from the NJ Department of Transportation NJTR-1 crash report system. Over 6.3 million records across all 21 NJ counties.
| Crash Events | Years | Counties | Features |
|---|---|---|---|
| ~6.3M | 22 | 21 | 31 |
Use Cases
Train multi-class classifiers to predict fatal, injury, or property-damage-only (PDO) outcomes from road, weather, and speed features.
Cluster crashes by lat/lon to identify high-risk corridors and intersections for infrastructure investment.
Build route-level risk features for actuarial scoring, telematics, and underwriting models.
Detect seasonal, weekly, and time-of-day crash patterns for traffic management and policy.
Use real-world incident data for ADAS validation, simulation scenarios, and safety benchmarking.
Support infrastructure investment decisions, Vision Zero initiatives, and Complete Streets analysis.
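As a rough illustration of the corridor and intersection use case, the sketch below counts crashes per latitude/longitude grid cell. This is plain grid binning, a crude stand-in for a real clustering method, and not something the dataset ships with. The column names come from the schema; the cell size `CELL_DEG`, the `is_fatal` helper column, and the toy rows are assumptions made up for the example.

```python
import numpy as np
import pandas as pd

# Toy coordinates invented for illustration; real rows come from the dataset.
df = pd.DataFrame({
    "latitude":  [40.7357, 40.7359, 40.7361, 39.9259, np.nan],
    "longitude": [-74.1724, -74.1722, -74.1726, -75.1196, np.nan],
    "severity":  ["injury", "pdo", "fatal", "pdo", "pdo"],
})

# Keep only geocoded rows (the schema marks coordinates as 2013+).
geo = df.dropna(subset=["latitude", "longitude"]).copy()

CELL_DEG = 0.01  # assumed cell size: about 1.1 km north to south

# Snap each crash to the integer index of its grid cell.
geo["cell_lat"] = np.floor(geo["latitude"] / CELL_DEG).astype(int)
geo["cell_lon"] = np.floor(geo["longitude"] / CELL_DEG).astype(int)
geo["is_fatal"] = geo["severity"].eq("fatal")

# Count crashes and fatal crashes per cell, busiest cells first.
hot_cells = (
    geo.groupby(["cell_lat", "cell_lon"])
       .agg(crashes=("severity", "size"), fatal=("is_fatal", "sum"))
       .sort_values("crashes", ascending=False)
       .reset_index()
)
print(hot_cells.head())
```

A fixed grid is easy to reason about and scales to millions of rows, but it splits hot spots that straddle a cell boundary; a density-based method would be the natural next step for real analysis.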
Schema
Primary table: nj_crash_records.csv / nj_crash_records.parquet, with 31 columns and one row per crash event. The table below lists 22 of the 31 columns.
| Field | Type | Description |
|---|---|---|
| crash_id | string | NJ DOT unique crash identifier |
| year | int | Data year (2001-2022) |
| date | string | Crash date (YYYY-MM-DD) |
| day_of_week | string | Day name |
| time | string | Time (HHMM 24hr) |
| county | string | NJ county |
| municipality | string | Municipality name |
| severity | string | fatal, injury, or pdo |
| vehicle_count | int | Vehicles involved |
| total_killed | int | Fatalities |
| total_injured | int | Injuries |
| casualty_count | int | Total casualties |
| pedestrians_killed | int | Pedestrian fatalities |
| pedestrians_injured | int | Pedestrian injuries |
| alcohol_involved | string | Y/N flag |
| hazmat_involved | string | Y/N flag |
| weather | string | Weather condition code |
| road_condition | string | Surface condition code |
| light_condition | string | Lighting code |
| posted_speed | int | Speed limit (mph) |
| latitude | float | Decimal degrees (2013 onward only) |
| longitude | float | Decimal degrees (2013 onward only) |
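Because `time` is stored as an HHMM string and the flags as Y/N, a little parsing is usually needed before analysis. The following is a minimal sketch under assumptions: the toy rows are invented, and it assumes the CSV keeps leading zeros in `time` (hence reading it as a string) and that unparseable times should become missing values. The `timestamp` column and `crashes_by_hour` are hypothetical names introduced here.

```python
import io
import pandas as pd

# Toy rows that follow the documented schema; the values are invented.
toy_csv = io.StringIO(
    "crash_id,date,time,severity,alcohol_involved\n"
    "A1,2019-03-02,0030,injury,Y\n"
    "A2,2019-03-02,1745,pdo,N\n"
    "A3,2019-03-04,0815,pdo,N\n"
)

# Read `time` as a string so leading zeros survive ("0030" must not become 30).
df = pd.read_csv(toy_csv, dtype={"crash_id": str, "time": str})

# Combine date (YYYY-MM-DD) and time (HHMM) into one timestamp.
df["timestamp"] = pd.to_datetime(
    df["date"] + " " + df["time"].str.zfill(4),
    format="%Y-%m-%d %H%M",
    errors="coerce",  # malformed or missing times become NaT instead of raising
)

# Convert the Y/N flag to a nullable boolean.
df["alcohol_involved"] = (
    df["alcohol_involved"].map({"Y": True, "N": False}).astype("boolean")
)

# Crashes per hour of day, ordered by hour.
crashes_by_hour = df["timestamp"].dt.hour.value_counts().sort_index()
print(crashes_by_hour)
```

The same pattern applies to `hazmat_involved`, and the parsed timestamp gives day-of-week and seasonal groupings without relying on the separate `day_of_week` column.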
Quick Start
```python
from datasets import load_dataset

# Load the free 1,000-row sample
ds = load_dataset("claritystorm/nj-crash-records-2001-2022")

# Or with pandas
import pandas as pd

df = pd.read_csv(
    "https://huggingface.co/datasets/claritystorm/"
    "nj-crash-records-2001-2022/resolve/main/sample_1000.csv"
)

# Severity distribution
print(df["severity"].value_counts())
```
Pricing
Complete: $99
All tables (Accidents, Vehicles, Drivers, Occupants, Pedestrians) in CSV and Parquet
Commercial License
Data Provenance
Source: New Jersey Department of Transportation (NJ DOT)
Portal: NJ DOT Crash Data Portal
License: Split licensing. Free sample (1,000 rows on Hugging Face): CC-BY 4.0. Paid tiers: ClarityStorm Commercial Data License — internal use only, no redistribution or resale of raw data. Derivative works (models, analysis, research papers) are permitted. NJ DOT crash data is factual government data collected under statutory duty and formally cleared for commercial resale.
Attribution: “NJ Crash Records 2001-2022, sourced from NJ DOT public crash data, processed by ClarityStorm Data.”