NHTSA Data Explorer

Methodology

What this is

The NHTSA Data Explorer is a structured view of the three primary public datasets published by the National Highway Traffic Safety Administration:

Complaints — safety issues reported directly by vehicle owners.
Recalls — corrective campaigns initiated by manufacturers or required by NHTSA.
Technical Service Bulletins (TSBs) — manufacturer guidance issued to dealerships for diagnosing and repairing known issues.

Each record is run through a classification pipeline that tags it with structured issue categories. The results are aggregated into facet breakdowns, monthly timelines, and trending-issue rates — queryable at three levels: across all vehicles, by make, and by model.
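As a rough illustration of the aggregation step, the sketch below counts facet values at the three query levels and builds a monthly timeline. The record layout, field names, and function names are assumptions made for this example; they are not the Explorer's actual schema or code.

```python
from collections import Counter

# Hypothetical classified records; the field names are illustrative assumptions.
records = [
    {"make": "FORD", "model": "F150", "date": "2024-01-15",
     "facets": {"component system": "brakes"}},
    {"make": "FORD", "model": "MUSTANG", "date": "2024-01-20",
     "facets": {"component system": "engine"}},
    {"make": "TOYOTA", "model": "CAMRY", "date": "2024-02-03",
     "facets": {"component system": "brakes"}},
]


def facet_breakdown(records, facet, make=None, model=None):
    """Count values of one facet across all vehicles, one make, or one model."""
    counts = Counter()
    for record in records:
        if make is not None and record["make"] != make:
            continue
        if model is not None and record["model"] != model:
            continue
        counts[record["facets"].get(facet, "Other")] += 1
    return counts


def monthly_timeline(records):
    """Count records per calendar month, keyed by 'YYYY-MM' in date order."""
    counts = Counter(record["date"][:7] for record in records)
    return dict(sorted(counts.items()))


all_vehicles = facet_breakdown(records, "component system")
ford_only = facet_breakdown(records, "component system", make="FORD")
f150_only = facet_breakdown(records, "component system", make="FORD", model="F150")
timeline = monthly_timeline(records)
```

The same function serves all three levels because the make and model filters are optional; leaving both unset gives the all-vehicles view.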

Data sources

All data is pulled from NHTSA's public flat-file releases at nhtsa.gov. NHTSA updates those files daily; this Explorer ingests them on the same schedule. No third-party or paid data sources are used.

The underlying NHTSA data is public domain. The classifications, aggregations, and presentation here are produced by Repair Surge.

Update schedule

The data updates every night. After ingesting new and updated records, all make and model rollups are rebuilt. Pages are pre-rendered with a 24-hour revalidation window, so a page reflects the latest nightly run once its cached copy has been regenerated, which can take up to a day.

The exact refresh date is shown in the footer of every page.

How classification works

Each record contains free text describing the issue. We deduplicate identical records, then run each unique record through a process that tags it with structured categories, such as "component system," "failure mode," and "reporting context," along with the specific values that apply.

Two records can be worded differently but describe the same problem; they'll receive the same classifications. That consistency is what makes the aggregations across hundreds of thousands of records reliable.
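The deduplicate-then-classify flow can be sketched as follows. This page does not describe how tags are actually assigned, so the keyword lookup in classify is only a placeholder, and the whitespace and case normalization used to decide that two records are identical is also an assumption. All function names are hypothetical.

```python
import hashlib


def record_key(text: str) -> str:
    """Hash of case- and whitespace-normalized text, used to spot identical records.

    The normalization rule is an assumption for this sketch.
    """
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify(text: str) -> dict:
    """Placeholder tagger: a keyword lookup standing in for the real process."""
    lowered = text.lower()
    tags = {}
    if "brake" in lowered:
        tags["component system"] = "brakes"
    if "leak" in lowered:
        tags["failure mode"] = "leak"
    return tags


def classify_unique(texts):
    """Classify each unique record once, keyed by its normalized-text hash."""
    results = {}
    for text in texts:
        key = record_key(text)
        if key not in results:  # skip records already seen
            results[key] = classify(text)
    return results
```

With this placeholder, "Brake fluid is leaking" and "Leak at the brake line" receive the same tags even though they are worded differently, which is the property the aggregations depend on.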

Interpreting the data

A few things to keep in mind when reading the charts and counts on this Explorer:

Leaderboards rank by volume, not rate

The leaderboard ranks makes by total record count, not by records per vehicle on the road. A make with higher sales volume will typically rank higher regardless of its per-vehicle rate. Use it to understand where records are concentrated — not as a reliability ranking.
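To see why volume and rate can disagree, consider the sketch below. The two makes and all of their numbers are invented for illustration; they are not NHTSA or sales figures, and the Explorer itself does not compute a per-vehicle rate.

```python
# Made-up numbers for illustration only; not real NHTSA or sales data.
makes = {
    "Make A": {"records": 12_000, "vehicles_on_road": 6_000_000},
    "Make B": {"records": 3_000, "vehicles_on_road": 500_000},
}


def records_per_100k(make: str) -> float:
    """Records per 100,000 vehicles on the road for one make."""
    entry = makes[make]
    return entry["records"] / entry["vehicles_on_road"] * 100_000


# Ranking by raw record count, as the leaderboard does.
by_volume = sorted(makes, key=lambda m: makes[m]["records"], reverse=True)

# Ranking by per-vehicle rate, which the leaderboard does not do.
by_rate = sorted(makes, key=records_per_100k, reverse=True)
```

Here Make A leads by volume while Make B leads by rate, so the two orderings are reversed.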

Severity data comes from complaints only

Crash, fire, injury, and death counts are sourced from consumer complaints only. Recalls and TSBs do not include these fields. The severity strip reflects what complaint filers reported — it is not a complete safety assessment.

Per-model totals can add up to more than the overall total

A single NHTSA model name (e.g. "F150") can map to multiple Repair Surge entries (e.g. "F-150" and "F-150 Heritage"). A record filed under "F150" will appear on both model pages. This is intentional: each page shows every record relevant to that vehicle. As a result, per-model totals can add up to more than the overall total.
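A small sketch makes the double counting concrete. The mapping table and function name are hypothetical, and only the "F150" entry is taken from the example above.

```python
from collections import Counter

# Hypothetical mapping from an NHTSA model name to Repair Surge entries.
MODEL_MAP = {
    "F150": ["F-150", "F-150 Heritage"],
    "MUSTANG": ["Mustang"],
}


def per_model_totals(nhtsa_model_names):
    """Count each record once for every entry its NHTSA model name maps to."""
    totals = Counter()
    for name in nhtsa_model_names:
        # Names without a mapping are counted under their own name.
        for entry in MODEL_MAP.get(name, [name]):
            totals[entry] += 1
    return totals


filed = ["F150", "F150", "MUSTANG"]  # three records overall
totals = per_model_totals(filed)
overall_total = len(filed)
sum_of_model_totals = sum(totals.values())
```

Three records produce per-model totals that sum to five, because each "F150" record is counted on both the F-150 and F-150 Heritage pages.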

"Other" indicates no classification match

When a record's text doesn't match any allowed value for a given facet, it's classified as "Other." This is typically a small fraction of records. The facet bars include it so you can see what proportion was unclassified.
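The fallback can be sketched as a lookup against each facet's allowed values. The allowed values listed here and the function names are assumptions for illustration, not the Explorer's actual vocabulary.

```python
# Hypothetical allowed values per facet.
ALLOWED_VALUES = {
    "failure mode": {"leak", "stall", "noise"},
}


def resolve_value(facet: str, candidate: str) -> str:
    """Return the candidate if the facet allows it, otherwise "Other"."""
    allowed = ALLOWED_VALUES.get(facet, set())
    return candidate if candidate in allowed else "Other"


def other_share(values) -> float:
    """Fraction of resolved values that ended up as "Other"."""
    if not values:
        return 0.0
    return values.count("Other") / len(values)


resolved = [
    resolve_value("failure mode", v)
    for v in ["leak", "melting", "stall", "leak"]
]
```

Keeping "Other" in the counts, rather than dropping it, is what lets a facet bar show the unclassified proportion.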

How trending is calculated

A facet/value pair is marked as trending when its last-quarter count is significantly higher than its historical baseline. A "5×" lift means the last-quarter count is roughly five times the baseline. Pairs with very low baselines are excluded as unreliable.
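One way to express the lift calculation is sketched below. The page does not state how the baseline is computed or what the cutoffs are, so the mean of prior quarters as the baseline, the min_baseline value, and the min_lift value are all assumptions chosen for the example.

```python
def lift(last_quarter_count, prior_quarter_counts, min_baseline=5.0):
    """Last-quarter count divided by the mean prior-quarter count.

    Returns None when there is no history or the baseline falls below
    min_baseline (an assumed cutoff), since such pairs are treated as unreliable.
    """
    if not prior_quarter_counts:
        return None
    baseline = sum(prior_quarter_counts) / len(prior_quarter_counts)
    if baseline < min_baseline:
        return None
    return last_quarter_count / baseline


def is_trending(last_quarter_count, prior_quarter_counts, min_lift=2.0):
    """True when the lift exists and meets min_lift (an assumed threshold)."""
    value = lift(last_quarter_count, prior_quarter_counts)
    return value is not None and value >= min_lift
```

For example, a pair with 50 records last quarter against prior quarters of 10 each has a lift of 5, while a pair that averaged 1 record per quarter is skipped no matter how large its latest count looks in relative terms.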

Questions?

If you spot an error or want to request data we don't currently show, reach out through our contact form.
