
Mapping US County Mortality Trends with CDC WONDER Data

Explore 18 years of county-level death rates across 3,100+ US counties. Identify geographic mortality disparities and compute population-adjusted trends in Python.


CDC WONDER's Compressed Mortality File is a standard source for population health research: county-level death counts and crude rates for every US county, every year. It's the dataset behind countless academic papers, NIH grant proposals, and public health surveillance dashboards. The ClarityStorm release covers 1999 through 2016, with 55,000+ county-year records across 3,133 counties.

In this tutorial we'll load the dataset, compute the national mortality trend, and rank states by pooled crude death rate to surface geographic disparities. All in under 50 lines of Python.

What's in the Dataset

  • 55K+ county-year mortality records covering 1999–2016
  • 3,133 unique counties across all 50 states + DC
  • Deaths, population, and crude rate (per 100,000) for each county-year
  • Computed mortality tiers (Very Low to Very High quintiles)
  • Years of Potential Life Lost (YPLL) — a key measure of premature death burden
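
The tiers ship precomputed, and the exact method behind them isn't described here. As a rough sketch of how quintile tiers like these could be derived, the snippet below ranks counties by `crude_rate` within each year and cuts the ranks into five equal bands. The `tier_sketch` column name, the three middle labels, and the choice to rank within each year are all assumptions for illustration, not the dataset's definition.

```python
import pandas as pd

# Only "Very Low" and "Very High" come from the dataset description;
# the three middle labels are placeholders.
TIER_LABELS = ["Very Low", "Low", "Average", "High", "Very High"]

def add_quintile_tiers(df: pd.DataFrame) -> pd.DataFrame:
    """Label each county-year with its crude-rate quintile within that year."""
    out = df.copy()
    # Percentile rank of each county among all counties in the same year (0 < pct <= 1)
    pct = out.groupby("year")["crude_rate"].rank(pct=True)
    out["tier_sketch"] = pd.cut(pct, bins=[0, 0.2, 0.4, 0.6, 0.8, 1.0], labels=TIER_LABELS)
    return out

# Toy rows with made-up rates, only to show the call
toy = pd.DataFrame({
    "county_fips": ["A", "B", "C", "D", "E"],
    "year": [2016] * 5,
    "crude_rate": [500.0, 700.0, 900.0, 1100.0, 1300.0],
})
print(add_quintile_tiers(toy)[["county_fips", "crude_rate", "tier_sketch"]])
```

Ranking within each year keeps the tiers comparable over time even as the national rate drifts; cutting on the pooled 1999–2016 distribution would be an equally defensible choice.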

Loading the Data

python
import pandas as pd

# Reading Parquet requires the pyarrow or fastparquet engine to be installed
mortality = pd.read_parquet("cdc_wonder_mortality.parquet")

print(f"Records: {len(mortality):,}")
print(f"Years: {mortality['year'].min()} – {mortality['year'].max()}")
print(f"Counties: {mortality['county_fips'].nunique()}")
print(f"States: {mortality['state_name'].nunique()}")
print(mortality[["county_name", "state_name", "year", "deaths", "population", "crude_rate"]].head())
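
Since a crude rate is just deaths divided by population, scaled to 100,000, it's worth a quick check that the shipped `crude_rate` column agrees with the `deaths` and `population` columns before building on it. The helper below is a minimal sketch; the function name and the 0.1 tolerance are arbitrary choices, and it assumes the three columns are numeric.

```python
import pandas as pd

def crude_rate_mismatches(df: pd.DataFrame, tol: float = 0.1) -> pd.DataFrame:
    """Return rows whose stored crude_rate differs from deaths / population * 100,000."""
    recomputed = df["deaths"] / df["population"] * 100_000
    # Comparisons against NaN are False, so rows with missing values are not flagged
    off = (recomputed - df["crude_rate"]).abs() > tol
    return df.loc[off].assign(recomputed_rate=recomputed[off])
```

Calling `crude_rate_mismatches(mortality)` should return an empty frame if the columns are consistent; any rows it does return are worth inspecting for rounding or missing-value conventions.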

National Mortality Trend

Summing deaths and population across all counties before dividing gives a population-weighted national crude death rate and reveals the long-term trend. Between 1999 and 2016, the US crude death rate showed a notable inflection: it declined through the 2000s, then reversed upward around 2010 as the opioid epidemic, rising metabolic disease, and an aging population pushed mortality higher.

python
import matplotlib.pyplot as plt

national = (
    mortality.groupby("year")
    .agg(total_deaths=("deaths", "sum"), total_pop=("population", "sum"))
    .reset_index()
)
national["crude_rate"] = (national["total_deaths"] / national["total_pop"]) * 100_000

plt.figure(figsize=(10, 5))
plt.plot(national["year"], national["crude_rate"], color="#dc2626", linewidth=2.5, marker="o", markersize=4)
plt.title("US Crude Death Rate per 100K Population (1999–2016)", fontsize=14, fontweight="bold")
plt.xlabel("Year")
plt.ylabel("Deaths per 100K")
plt.grid(alpha=0.3)
plt.tight_layout()
plt.savefig("national_mortality_trend.png", dpi=150)
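
To put numbers on the inflection rather than eyeballing the chart, one option is to find the lowest-rate year and measure the change on either side of it. This is a sketch that expects a frame shaped like `national` above (`year` and `crude_rate` columns); the function name and the returned keys are made up for illustration.

```python
import pandas as pd

def summarize_trend(national: pd.DataFrame) -> dict:
    """Locate the lowest-rate year and the percent change before and after it."""
    by_year = national.sort_values("year").set_index("year")["crude_rate"]
    trough_year = by_year.idxmin()
    first_year, last_year = by_year.index[0], by_year.index[-1]
    return {
        "trough_year": int(trough_year),
        # Percent change from the first year down to the trough
        "pct_change_to_trough": (by_year.loc[trough_year] / by_year.loc[first_year] - 1) * 100,
        # Percent change from the trough up to the last year
        "pct_change_from_trough": (by_year.loc[last_year] / by_year.loc[trough_year] - 1) * 100,
    }
```

`summarize_trend(national)` returns a small dict you can print or drop into the chart title.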

Geographic Disparities

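Mortality rates vary dramatically by geography. Rural counties in the Deep South, Appalachia, and the Northern Plains consistently report the highest crude death rates, while urban counties on the coasts trend lower. The gap between the highest-rate and lowest-rate counties can exceed 3x, though crude rates are not age-adjusted, so part of that gap reflects older populations rather than worse health outcomes. As a first cut, the code below ranks states by pooled crude rate.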

python
# Pooled crude rate per state: total deaths over total population, all years combined
state_avg = (
    mortality.groupby("state_name")
    .agg(total_deaths=("deaths", "sum"), total_pop=("population", "sum"))
    .reset_index()
)
state_avg["avg_crude_rate"] = (state_avg["total_deaths"] / state_avg["total_pop"]) * 100_000

top5 = state_avg.nlargest(5, "avg_crude_rate")[["state_name", "avg_crude_rate"]]
bot5 = state_avg.nsmallest(5, "avg_crude_rate")[["state_name", "avg_crude_rate"]]

print("Highest mortality states:")
print(top5.to_string(index=False))
print("\nLowest mortality states:")
print(bot5.to_string(index=False))
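
The same pooling works one level down, at the county level, where the widest gaps show up. The sketch below pools deaths and population per county across all years and returns the highest and lowest rates. The `min_pop` filter on average annual population is an assumption added here so that very small counties, whose rates swing on a handful of deaths, don't dominate the extremes; the 10,000 threshold is arbitrary.

```python
import pandas as pd

def rank_counties(df: pd.DataFrame, min_pop: int = 10_000, n: int = 10):
    """Pool deaths and population per county, then return the n highest and n lowest rates."""
    county = (
        df.groupby(["county_fips", "county_name", "state_name"], as_index=False)
        .agg(
            total_deaths=("deaths", "sum"),
            total_pop=("population", "sum"),
            years=("year", "nunique"),
        )
    )
    # Keep counties whose average annual population meets the threshold
    county = county[county["total_pop"] / county["years"] >= min_pop].copy()
    county["pooled_rate"] = county["total_deaths"] / county["total_pop"] * 100_000
    return county.nlargest(n, "pooled_rate"), county.nsmallest(n, "pooled_rate")
```

With `top, bottom = rank_counties(mortality)`, the ratio `top["pooled_rate"].iloc[0] / bottom["pooled_rate"].iloc[0]` gives the spread between the two extremes.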

What to Build Next

  • Health equity dashboard: map county-level mortality tiers alongside poverty and uninsurance rates
  • Regression modeling: predict county crude rates from socioeconomic features (combine with Census ACS data)
  • Opioid epidemic tracking: correlate the post-2010 mortality uptick with drug overdose indicators by county
  • Time-series forecasting: project county-level death rates to 2030 under various demographic scenarios
  • Policy impact analysis: compare pre/post intervention mortality trends for counties with specific health programs
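
As a starting point for the trend-oriented ideas above, here is a minimal sketch that fits a straight line to each county's crude rate over time and returns the slope in rate points per year. The function name, the `slope_per_year` label, and the `min_years` cutoff are assumptions for illustration. A linear slope is a descriptive summary only, not a forecast.

```python
import numpy as np
import pandas as pd

def county_trend_slopes(df: pd.DataFrame, min_years: int = 5) -> pd.Series:
    """Least-squares slope of crude_rate on year for each county."""
    def slope(group: pd.DataFrame) -> float:
        g = group.dropna(subset=["year", "crude_rate"])
        if g["year"].nunique() < min_years:
            return np.nan  # too few observations for a meaningful fit
        # Center the years so the fit is numerically well-conditioned
        x = (g["year"] - g["year"].mean()).to_numpy(dtype=float)
        y = g["crude_rate"].to_numpy(dtype=float)
        return float(np.polyfit(x, y, deg=1)[0])

    slopes = df.groupby("county_fips")[["year", "crude_rate"]].apply(slope)
    return slopes.rename("slope_per_year")
```

Sorting the result, for example with `county_trend_slopes(mortality).sort_values()`, surfaces the counties with the steepest declines and increases.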

The free sample contains 1,000 rows. The complete CDC WONDER Mortality dataset covers 55K+ county-year records (1999–2016) as CSV and Parquet with a commercial license.
