NJ Crash Records 2001-2022

AI-Ready Structured Dataset

22 years of New Jersey traffic crash records, cleaned and standardised for machine learning. Sourced directly from the NJ Department of Transportation NJTR-1 crash report system. Over 6.3 million records across all 21 NJ counties.

Sample: CC-BY 4.0 | Paid: Commercial License | CSV + Parquet | ~6.3M records | 2001-2022

~6.3M crash events | 22 years | 21 counties | 31 features

Use Cases

Crash Severity Prediction

Train multi-class classifiers to predict fatal/injury/PDO outcomes from road, weather, and speed features.
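As a rough illustration of the preparation step for such a classifier, the sketch below turns a few schema columns into a feature matrix and an integer label. The rows and the weather and road-condition code values are made up for illustration, and the label ordering in `label_map` is an assumption, not something the dataset defines.

```python
import pandas as pd

# Made-up rows; the code values for weather and road_condition are placeholders.
df = pd.DataFrame({
    "severity": ["pdo", "injury", "fatal", "pdo"],
    "weather": ["1", "2", "1", "3"],
    "road_condition": ["1", "2", "1", "1"],
    "posted_speed": [25, 50, 65, 35],
})

# Assumed ordering of the three severity classes listed in the schema.
label_map = {"pdo": 0, "injury": 1, "fatal": 2}
y = df["severity"].map(label_map)

# One-hot encode the categorical codes; keep posted_speed numeric.
X = pd.get_dummies(
    df[["weather", "road_condition", "posted_speed"]],
    columns=["weather", "road_condition"],
)

print(X.columns.tolist())
print(y.tolist())
```

`X` and `y` can then be passed to any multi-class classifier of your choice.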

Geospatial Hotspot Detection

Cluster crashes by lat/lon to identify high-risk corridors and intersections for infrastructure investment.
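One simple way to approximate this is to snap each coordinate to a grid cell and count crashes per cell. The sketch below uses grid binning as a stand-in for a proper clustering algorithm; the coordinates and the 0.01 degree cell size are made up for illustration, and `grid_cell` is a hypothetical helper.

```python
import math
from collections import Counter

# Made-up (latitude, longitude) pairs; real values would come from the dataset.
points = [
    (40.7357, -74.1724),
    (40.7359, -74.1721),
    (40.7351, -74.1729),
    (39.3643, -74.4229),
    (None, None),  # coordinates are only populated for 2013+ per the schema
]

def grid_cell(lat, lon, cell_deg=0.01):
    """Return integer grid indices for a coordinate (hypothetical helper)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

# Count crashes per cell, skipping rows without coordinates.
counts = Counter(
    grid_cell(lat, lon)
    for lat, lon in points
    if lat is not None and lon is not None
)

# Cells with the most crashes are hotspot candidates.
for cell, n in counts.most_common(3):
    print(cell, n)
```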

Insurance Risk Modeling

Build route-level risk features for actuarial scoring, telematics, and underwriting models.

Time-Series Analysis

Detect seasonal, weekly, and time-of-day crash patterns for traffic management and policy.
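A minimal sketch of these groupings, assuming `date` is a YYYY-MM-DD string and `time` is a zero-padded HHMM string as the schema describes. The rows are made up for illustration.

```python
import pandas as pd

# Made-up rows using the `date` and `time` columns from the schema.
df = pd.DataFrame({
    "date": ["2019-01-04", "2019-01-04", "2019-07-13", "2019-07-15"],
    "time": ["0815", "1730", "1745", "0005"],
})

dates = pd.to_datetime(df["date"], format="%Y-%m-%d")
df["month"] = dates.dt.month
df["weekday"] = dates.dt.day_name()
# Assumes zero-padded HHMM, so the first two characters are the hour.
df["hour"] = df["time"].str[:2].astype(int)

by_month = df.groupby("month").size()      # seasonal pattern
by_weekday = df.groupby("weekday").size()  # weekly pattern
by_hour = df.groupby("hour").size()        # time-of-day pattern

print(by_month, by_weekday, by_hour, sep="\n\n")
```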

Autonomous Vehicle Research

Real-world incident data for ADAS validation, simulation scenarios, and safety benchmarking.

Urban Planning

Support infrastructure investment decisions, Vision Zero initiatives, and Complete Streets analysis.

Schema

Primary table: nj_crash_records.csv / nj_crash_records.parquet — 31 columns, one row per crash event.

Field                Type    Description
crash_id             string  NJ DOT unique crash identifier
year                 int     Data year (2001-2022)
date                 string  Crash date (YYYY-MM-DD)
day_of_week          string  Day name
time                 string  Time (HHMM 24hr)
county               string  NJ county
municipality         string  Municipality name
severity             string  fatal, injury, or pdo
vehicle_count        int     Vehicles involved
total_killed         int     Fatalities
total_injured        int     Injuries
casualty_count       int     Total casualties
pedestrians_killed   int     Pedestrian fatalities
pedestrians_injured  int     Pedestrian injuries
alcohol_involved     string  Y/N flag
hazmat_involved      string  Y/N flag
weather              string  Weather condition code
road_condition       string  Surface condition code
light_condition      string  Lighting code
posted_speed         int     Speed limit (mph)
latitude             float   Decimal degrees (2013+)
longitude            float   Decimal degrees (2013+)
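Several of these fields arrive as strings and usually need converting before analysis. The sketch below shows one possible approach on made-up rows: it maps a Y/N flag to booleans, combines `date` and `time` into a timestamp (assuming `time` is zero-padded HHMM), and keeps only rows that have coordinates. The `timestamp` column and the `geo` frame are names introduced here, not part of the dataset.

```python
import pandas as pd

# Made-up rows; the pre-2013 record has no coordinates, as the schema notes.
df = pd.DataFrame({
    "year": [2005, 2016],
    "date": ["2005-03-02", "2016-11-20"],
    "time": ["0930", "2315"],
    "alcohol_involved": ["N", "Y"],
    "latitude": [None, 40.2206],
    "longitude": [None, -74.7597],
})

# Y/N flag to boolean.
df["alcohol_involved"] = df["alcohol_involved"].map({"Y": True, "N": False})

# Combine date and HHMM time into one timestamp column (name is illustrative).
df["timestamp"] = pd.to_datetime(
    df["date"] + " " + df["time"], format="%Y-%m-%d %H%M"
)

# Keep only rows with coordinates for geospatial work.
geo = df.dropna(subset=["latitude", "longitude"])

print(df.dtypes)
print(len(geo), "of", len(df), "rows have coordinates")
```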

Quick Start

from datasets import load_dataset

# Load 1,000-row sample (free)
ds = load_dataset("claritystorm/nj-crash-records-2001-2022")

# Or with pandas
import pandas as pd
df = pd.read_csv(
    "https://huggingface.co/datasets/claritystorm/"
    "nj-crash-records-2001-2022/resolve/main/sample_1000.csv"
)

# Severity distribution
print(df["severity"].value_counts())
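After loading, a quick sanity check against the documented schema can catch surprises early. The sketch below is one way to do that; `schema_problems` is a hypothetical helper, and the frame it runs on is made up rather than taken from the sample file.

```python
import pandas as pd

def schema_problems(df):
    """Return a list of human-readable issues (hypothetical helper)."""
    problems = []
    # The schema lists exactly three severity values.
    bad_sev = ~df["severity"].isin(["fatal", "injury", "pdo"])
    if bad_sev.any():
        problems.append(f"{int(bad_sev.sum())} rows with unexpected severity")
    # The schema gives the data year range as 2001-2022.
    bad_year = ~df["year"].between(2001, 2022)
    if bad_year.any():
        problems.append(f"{int(bad_year.sum())} rows with year outside 2001-2022")
    return problems

# Made-up rows: the third has an unknown severity, the fourth a bad year.
check_df = pd.DataFrame({
    "severity": ["pdo", "injury", "unknown", "fatal"],
    "year": [2001, 2022, 2010, 1999],
})

problems = schema_problems(check_df)
print(problems)
```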

Pricing

Sample

Free

1,000 rows (CSV) + schema docs

CC-BY 4.0

Download on Hugging Face

Complete

$99

All tables (Accidents, Vehicles, Drivers, Occupants, Pedestrians) — CSV + Parquet

Commercial License

Buy Complete

Annual

$299/yr

All files + annual updates when NJ DOT releases new data

Commercial License

Subscribe

Data Provenance

Source: New Jersey Department of Transportation (NJ DOT)

Portal: NJ DOT Crash Data Portal

License: Split licensing. The free sample (1,000 rows on Hugging Face) is released under CC-BY 4.0. Paid tiers use the ClarityStorm Commercial Data License: internal use only, with no redistribution or resale of raw data. Derivative works (models, analysis, research papers) are permitted. NJ DOT crash data is factual government data collected under statutory duty and formally cleared for commercial resale.

Attribution: “NJ Crash Records 2001-2022, sourced from NJ DOT public crash data, processed by ClarityStorm Data.”

Need custom data cuts, API access, or bulk licensing?

Contact Sales