FEMA NFIP Flood Insurance Claims 1978–Present
National Flood Insurance Program (NFIP) — OpenFEMA Redacted Claims v2
The FEMA National Flood Insurance Program is the authoritative source for US residential and commercial flood loss data. This dataset covers 2M+ paid claims since 1978 — geocoded by ZIP and county, enriched with flood zone classifications, building characteristics, and ClarityStorm-computed ZIP-level risk scores. The go-to dataset for insurance underwriters, real estate analytics, and climate risk modelers.
Why not pull it directly from FEMA OpenFEMA?
FEMA's raw OpenFEMA API returns paginated JSON with camelCase fields and no computed risk context. Here's what we handled:
- ✓ 2M+ record bulk download, parsed and typed — FEMA's API is paginated with opaque field names; we download the full dataset and normalize all column names to snake_case with correct types
- ✓ Flood zone classification simplified — FEMA uses 50+ zone codes; we map all codes to four risk categories (High Risk SFHA, Coastal SFHA, Moderate Risk, Minimal Risk) for immediate use in risk models
- ✓ Occupancy type labels added — FEMA stores occupancy as integer codes 1–7; we add the human-readable label (Single Family, Commercial, etc.)
- ✓ Total claim amounts computed — building + contents + ICC payouts summed into total_amount_paid; coverage ratio derived from policy limits
- ✓ ZIP-level risk scores computed — for every ZIP code in the dataset we compute claim frequency, average payout, and a composite 1–100 risk score ready for joins with property data
- ✓ Decade buckets added — decade_of_loss enables decade-over-decade trend analysis without time-series wrangling
- ✓ Parquet output — the raw file is 500MB+ CSV; Parquet columnar storage is 5–10× faster for analytics on 2M+ rows
⏱ Skip the pagination loop, camelCase normalization, zone code lookup, and risk score computation. Ready for analysis in minutes.
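To give a feel for the normalization steps listed above, here is a minimal sketch of the camelCase-to-snake_case renaming and the flood zone simplification. The helper names (`to_snake_case`, `categorize_zone`), the sample column names, and the zone-to-category rules are assumptions made for illustration; they are not ClarityStorm's actual pipeline, and the category labels follow the ones listed in the schema.

```python
import re

import pandas as pd


def to_snake_case(name: str) -> str:
    """Convert a camelCase field name to snake_case."""
    # Insert "_" wherever a lowercase letter or digit is followed by a capital.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def categorize_zone(zone) -> str:
    """Collapse a raw flood zone code into a simplified category (assumed rules)."""
    if not isinstance(zone, str) or not zone.strip():
        return "Other"  # missing or blank zone
    code = zone.strip().upper()
    if code.startswith(("A", "V")):
        # A, AE, AO, AH, V, VE, ... are Special Flood Hazard Area zones.
        return "High Risk (SFHA)"
    if code == "B":
        return "Moderate Risk"  # assumption: zone B only
    if code in {"C", "X"}:
        return "Minimal Risk"  # assumption: X treated as minimal risk
    return "Other"


# Hypothetical raw rows with camelCase names, as an API response might carry.
raw = pd.DataFrame(
    {
        "yearOfLoss": [2005, 2017],
        "reportedZipCode": ["70117", "77002"],
        "floodZone": ["AE", "X"],
    }
)

clean = raw.rename(columns=to_snake_case)
clean["flood_zone_category"] = clean["flood_zone"].map(categorize_zone)
print(clean)
```

Keeping the zone rules in one small function makes it easy to swap in a different grouping if your risk model needs finer categories.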
2.7M+ Paid Claims
46+ Years of Data
All 50 States Covered
38 Schema Fields
Use Cases
Model flood risk by ZIP code, county, and flood zone using 45+ years of actual paid claims. Build actuarial tables for claim frequency and severity by occupancy type, construction era, and elevation. Validate flood zone designations against historical loss experience.
Score properties and geographies by flood claim history. Identify ZIP codes with chronic repeat flooding, high coverage ratios, or rising claim trends. Integrate with property listings, mortgage underwriting, or climate-risk disclosures.
Analyze decade-over-decade shifts in flood claim frequency, severity, and geographic spread. Study how climate-driven flood patterns are changing NFIP loss exposure — particularly in coastal (VE/AE) and riverine (AE/AO) zones.
Cross-reference FIRM flood zone designations with actual claim rates. Identify zones where the official risk classification under- or overstates historical loss experience — valuable for FIRM map revision advocacy and community rating appeals.
Use ZIP-level claim counts and risk scores to study repetitive loss patterns. Identify high-frequency communities for disaster mitigation investment analysis, FEMA BRIC grant targeting, or floodplain buyout program eligibility modeling.
Stress-test mortgage portfolios, REIT holdings, or municipal bond exposures against historical flood scenarios. The decade_of_loss field enables scenario analysis (e.g. Katrina-era vs. Harvey-era flood losses) by geography and property type.
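As a small illustration of the decade-based scenario comparison described above, the sketch below pivots total payouts by state and decade. The rows are made-up toy values standing in for the real table; only the column names come from the schema.

```python
import pandas as pd

# Toy rows standing in for the real table; the amounts are invented.
claims = pd.DataFrame(
    {
        "property_state": ["LA", "LA", "TX", "TX", "FL"],
        "decade_of_loss": [2000, 2000, 2010, 2000, 2010],
        "total_amount_paid": [100000.0, 50000.0, 60000.0, 10000.0, 30000.0],
    }
)

# One row per state, one column per decade, cells hold total paid losses.
scenario = claims.pivot_table(
    index="property_state",
    columns="decade_of_loss",
    values="total_amount_paid",
    aggfunc="sum",
    fill_value=0,
)

# Change in paid losses between the two decades being compared.
scenario["change"] = scenario[2010] - scenario[2000]
print(scenario)
```

With the full dataset you would load the Parquet file instead of building the frame by hand, and could add `occupancy_type_label` to the index to split by property type.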
Schema
Single table — fema_flood_insurance_claims — delivered as CSV + Parquet. Fields marked computed are derived by ClarityStorm from raw FEMA data.
| Field | Type | Description |
|---|---|---|
| claim_id | string | Unique claim identifier (OpenFEMA record ID) |
| year_of_loss | int | Year the flood loss occurred |
| date_of_loss | string | Exact date of loss (YYYY-MM-DD) |
| property_state | string | State abbreviation of the insured property |
| county_code | string | 5-digit FIPS county code |
| reported_zip_code | string | 5-digit ZIP code of the insured property |
| reported_city | string | City name as reported |
| latitude | float | Property latitude (where available) |
| longitude | float | Property longitude (where available) |
| occupancy_type | int | FEMA occupancy code (1=single family, 2=2-4 unit, 3=other residential, 4=commercial, 5=agriculture, 6=government) |
| occupancy_type_label | string | Human-readable occupancy type label (computed) |
| flood_zone | string | FEMA flood zone designation (e.g. AE, VE, X, AO) |
| flood_zone_category | string | Simplified risk category: High Risk (SFHA), Moderate Risk, Minimal Risk, Other (computed) |
| total_building_coverage | float | Building insurance coverage limit ($) |
| total_contents_coverage | float | Contents coverage limit ($) |
| number_of_floors | int | Number of floors in the insured building |
| original_construction_date | string | Original construction date of the building (YYYY-MM-DD) |
| original_nb_date | string | Original policy issue date (YYYY-MM-DD) |
| post_firm_construction_indicator | bool | Built after the community's Flood Insurance Rate Map (FIRM) effective date |
| elevated_building_indicator | bool | Building meets the NFIP definition of an elevated building (lowest floor raised above ground level on piers, posts, columns, or foundation walls) |
| elevation_certificate_indicator | bool | Property has an elevation certificate on file |
| elevation_difference | float | Difference between lowest floor and base flood elevation (feet) |
| base_flood_elevation | float | Base flood elevation at the site (feet, where available) |
| basement_enclosure_crawlspace | int | Basement/enclosure type code (FEMA codebook) |
| primary_residence_indicator | bool | Property is the owner's primary residence |
| agriculture_structure_indicator | bool | Agricultural structure |
| condominium_indicator | bool | Condominium unit |
| small_business_indicator | bool | Small business property |
| building_damage_amount | float | Assessed damage to building ($) |
| contents_damage_amount | float | Assessed damage to contents ($) |
| amount_paid_building | float | Amount paid on building claim ($) |
| amount_paid_contents | float | Amount paid on contents claim ($) |
| amount_paid_icc | float | Amount paid for Increased Cost of Compliance ($) |
| total_amount_paid | float | Total claim payment — building + contents + ICC (computed) |
| total_coverage | float | Total coverage limit — building + contents (computed) |
| coverage_ratio | float | Claim payout as fraction of coverage limit (computed) |
| decade_of_loss | int | Decade bucket of the loss year — e.g. 1990, 2000, 2010 (computed) |
| zip_claim_count | int | Total NFIP claims filed from this ZIP code across the full dataset (computed) |
| zip_avg_claim | float | Average total claim payout for this ZIP code (computed) |
| zip_risk_score | float | ZIP-level risk score 1–100 based on claim frequency × average severity (computed) |
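The computed fields can be approximated from the raw columns. The sketch below is one plausible reading of the descriptions in the table, run on made-up rows. The handling of missing payments (treated as $0) and the percentile-rank scaling used for the 1–100 score are assumptions, not ClarityStorm's published method.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration; column names follow the schema above.
claims = pd.DataFrame(
    {
        "reported_zip_code": ["70117", "70117", "77002", "33139"],
        "year_of_loss": [2005, 2012, 2017, 1992],
        "amount_paid_building": [80000.0, 20000.0, 50000.0, np.nan],
        "amount_paid_contents": [20000.0, np.nan, 10000.0, 5000.0],
        "amount_paid_icc": [np.nan, np.nan, 0.0, np.nan],
        "total_building_coverage": [100000.0, 100000.0, 250000.0, 0.0],
        "total_contents_coverage": [25000.0, 25000.0, 50000.0, 10000.0],
    }
)

paid_cols = ["amount_paid_building", "amount_paid_contents", "amount_paid_icc"]
cover_cols = ["total_building_coverage", "total_contents_coverage"]

# Assumption: a missing payment or coverage component counts as $0.
claims["total_amount_paid"] = claims[paid_cols].fillna(0).sum(axis=1)
claims["total_coverage"] = claims[cover_cols].fillna(0).sum(axis=1)

# Replace zero coverage with NaN so the ratio is undefined, not infinite.
claims["coverage_ratio"] = claims["total_amount_paid"] / claims["total_coverage"].replace(0, np.nan)

# Floor the loss year to its decade: 2005 -> 2000, 2017 -> 2010.
claims["decade_of_loss"] = claims["year_of_loss"] // 10 * 10

# ZIP-level aggregates: claim count and mean payout per ZIP code.
zip_stats = claims.groupby("reported_zip_code")["total_amount_paid"].agg(
    zip_claim_count="count", zip_avg_claim="mean"
)

# Assumed scoring: frequency x severity, percentile-ranked and scaled to at most 100.
raw_score = zip_stats["zip_claim_count"] * zip_stats["zip_avg_claim"]
zip_stats["zip_risk_score"] = (raw_score.rank(pct=True) * 99 + 1).round(1)

# Attach the ZIP-level columns back onto each claim row.
claims = claims.join(zip_stats, on="reported_zip_code")
print(claims[["reported_zip_code", "total_amount_paid", "coverage_ratio", "zip_risk_score"]])
```

Note that claim count multiplied by average payout is simply the ZIP's total paid losses, so under this reading the score ranks ZIP codes by cumulative payout. A production score may weight frequency and severity differently.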
Quick Start
```python
import pandas as pd

df = pd.read_parquet("fema_flood_insurance_claims.parquet")

# Total claims paid by state
print(df.groupby("property_state")["total_amount_paid"].sum().sort_values(ascending=False).head(10))

# Claim counts by flood zone category
print(df["flood_zone_category"].value_counts())

# Average claim payout by occupancy type
print(df.groupby("occupancy_type_label")["total_amount_paid"].mean().sort_values(ascending=False))

# Top 10 highest-risk ZIP codes (by zip_risk_score)
top_zips = (
    df[["reported_zip_code", "property_state", "zip_risk_score", "zip_claim_count", "zip_avg_claim"]]
    .drop_duplicates("reported_zip_code")
    .nlargest(10, "zip_risk_score")
)
print(top_zips)

# Decade-over-decade trend: average payout per claim
print(df.groupby("decade_of_loss")["total_amount_paid"].mean().round(0))

# Claims in high-risk SFHA zones vs. outside
sfha = df[df["flood_zone_category"] == "High Risk (SFHA)"]
other = df[df["flood_zone_category"] != "High Risk (SFHA)"]
print("SFHA avg claim:", sfha["total_amount_paid"].mean().round(0))
print("Non-SFHA avg claim:", other["total_amount_paid"].mean().round(0))

# Post-FIRM vs. pre-FIRM building loss comparison
print(df.groupby("post_firm_construction_indicator")["total_amount_paid"].mean())
```

Pairs Well With
Join NFIP claims (date_of_loss, county_code) with NOAA flood and hurricane events (EVENT_ID, CZ_FIPS, BEGIN_DATE) to validate claim spikes against actual storm events and build event-driven loss models.
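One way to sketch this join is to match claims to storm events in the same county and keep only claims whose loss date falls within a short window after the event began. The sample rows are invented. The sketch assumes `CZ_FIPS` has already been normalized to the same 5-digit county FIPS string as `county_code` and that `BEGIN_DATE` parses as a date; `WINDOW_DAYS` is a hypothetical tuning parameter.

```python
import pandas as pd

# Invented sample rows; real data would come from the two datasets.
claims = pd.DataFrame(
    {
        "claim_id": ["c1", "c2", "c3"],
        "county_code": ["22071", "22071", "48201"],
        "date_of_loss": ["2005-08-29", "2005-08-30", "2017-08-27"],
        "total_amount_paid": [100000.0, 40000.0, 60000.0],
    }
)
events = pd.DataFrame(
    {
        "EVENT_ID": [1, 2],
        "CZ_FIPS": ["22071", "48201"],  # assumed: 5-digit county FIPS
        "BEGIN_DATE": ["2005-08-29", "2017-08-25"],
    }
)

claims["date_of_loss"] = pd.to_datetime(claims["date_of_loss"])
events["BEGIN_DATE"] = pd.to_datetime(events["BEGIN_DATE"])

WINDOW_DAYS = 7  # hypothetical: attribute losses up to a week after event start

# Pair every claim with every event in the same county, then filter by date.
matched = claims.merge(events, left_on="county_code", right_on="CZ_FIPS", how="inner")
lag_days = (matched["date_of_loss"] - matched["BEGIN_DATE"]).dt.days
matched = matched[(lag_days >= 0) & (lag_days <= WINDOW_DAYS)]

# Claim count and total paid losses attributed to each event.
per_event = matched.groupby("EVENT_ID")["total_amount_paid"].agg(["count", "sum"])
print(per_event)
```

On the full tables the county-level merge can produce many candidate pairs, so filtering both sides to the years or states of interest before merging keeps memory use manageable.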
Cross-reference claims with the separate NFIP Policies dataset (policy counts, coverage amounts, and flood zone distribution by community) to compute community-level loss ratios and penetration rates for insurance market analysis.
Pricing
$99
Full dataset — 2M+ claims, 1978–present, all states, CSV + Parquet
Commercial License
Buy Complete Dataset
Data Provenance
Source: Federal Emergency Management Agency (FEMA), OpenFEMA — FIMA NFIP Redacted Claims v2
Portal: FEMA OpenFEMA — NFIP Redacted Claims
Coverage: 1978–present. The NFIP was established in 1968, but digitized claim records with complete coverage begin in 1978. Includes all paid claims across all 50 states, DC, and US territories where NFIP coverage applies.
Privacy: FEMA redacts all personally identifiable information (PII) from the published dataset. Property addresses and owner names are suppressed; geographic detail is limited to ZIP code, county FIPS, and the latitude/longitude values FEMA publishes (where available).
Computed fields: total_amount_paid, total_coverage, coverage_ratio, decade_of_loss, occupancy_type_label, flood_zone_category, zip_claim_count, zip_avg_claim, and zip_risk_score are derived by ClarityStorm from the raw FEMA data.
Update frequency: FEMA updates OpenFEMA claims data periodically as new claims are processed and older claims are finalized. Annual subscribers receive updated files when ClarityStorm re-runs the pipeline.
License: FEMA OpenFEMA data is a US federal government work in the public domain. Paid tiers are licensed under the ClarityStorm Commercial Data License covering our pipeline work (typing, normalization, flood zone classification, risk score computation, and Parquet conversion).
Need custom date ranges, specific states, or bulk licensing for flood risk portfolios?
Contact Sales