Appendix · Reference

Data, Research & Sources

The evidence base behind the plan: the data we build on, the research that makes our analytics credible, the early-warning signals that give us a head start, and a plain-language glossary of every term used across this site.

40+ catalogued datasets across overdose, claims, SDOH, provider & mobility layers
2 load-bearing join utilities: HUD ZIP crosswalk + NPPES NPI
2–8 wks lead time of our early-warning signals over death-certificate data
5 peer-reviewed method literatures behind the data edge

How to read this page

This is a reference appendix, organized for lookup rather than reading start to finish. The Targeting page turns these datasets into a ranked county list; the Risks page covers the rules that limit access to some data (for example, 42 CFR Part 2 and restricted prescription-monitoring data). Every dataset link was checked live on 2026-06-25. Figures marked "verify" are early estimates we still need to confirm.

1 · The data we build on

Our analytics run on a single connected dataset that links information by county and by provider, almost all of it from free, public sources. That public data is widely available, so it isn't where our advantage lies — the advantage is the early-warning and county-targeting layer we build on top of it. Two reference files connect everything, so we set them up first: the HUD USPS ZIP crosswalk, which translates between ZIP codes, census tracts, and counties, and the NPPES NPI registry, which gives every healthcare provider a unique ID.

How the data connects (the join keys)

Join key | What it links | Role
FIPS county (5-digit) | WONDER, SUDORS, VSRR, SVI, PLACES, AHRQ-SDOH, County Health Rankings, ACS, BLS, BEA, IRS-SOI, ARCOS, TEDS, Part D (agg.) | Primary spine: every county-week risk score keys here
Census tract (11-digit) / block group | SVI, PLACES, AHRQ-SDOH, ADI, Food Access Atlas, EJScreen, NLCD, LODES, ACS, RTI SynthPop | Sub-county vulnerability & SDOH detail
ZIP / ZCTA | PLACES (ZCTA), AHRQ-SDOH, IRS-SOI, APCDs, MarketScan, HUD crosswalk | Bridges claims/survey data to tract & county
NPI (provider) | NPPES, Part D prescribers, Part B PUF, Medicare/Medicaid claims, all-payer claims | Master provider key: ties prescribing & capability to geography
ICD-10 + HCPCS / NDC | WONDER, HCUP, CMS claims, MIMIC, MarketScan, Synthea, Part D (NDC) | Clinical event & drug coding
Lat/long → geocode → tract/county | EPA AQS/TRI, USGS, OSM, SafeGraph/Advan, NEMSIS (coarse), facility locators | Spatial joins for point data
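The geographic keys in the table nest: the first five digits of an 11-digit census tract GEOID are the county FIPS code (two-digit state plus three-digit county). A minimal sketch of that key hygiene, assuming IDs may arrive as integers or as strings that have lost their leading zeros; the function names are illustrative, not part of any existing codebase.

```python
def normalize_county_fips(value) -> str:
    """Return a 5-character county FIPS string, restoring dropped leading zeros."""
    text = str(value).strip()
    if not text.isdigit() or len(text) > 5:
        raise ValueError(f"not a county FIPS code: {value!r}")
    return text.zfill(5)


def county_from_tract(tract_geoid) -> str:
    """Derive the county FIPS (first 5 digits) from an 11-digit tract GEOID."""
    text = str(tract_geoid).strip()
    if not text.isdigit() or len(text) > 11:
        raise ValueError(f"not a tract GEOID: {tract_geoid!r}")
    return text.zfill(11)[:5]


# A spreadsheet that stored Autauga County, AL (01001) as the integer 1001:
print(normalize_county_fips(1001))
# A Cook County, IL tract rolls up to county 17031:
print(county_from_tract("17031010100"))
```

Keeping every key as a zero-padded string avoids silent mismatches for states whose FIPS codes begin with 0.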
Set up these two first

The HUD USPS ZIP crosswalk and the NPPES NPI registry hold everything else together, so they come first. Two timing notes: the older NPPES interface (V1) shuts down on 2026-03-03, so we use V2; and EJScreen was pulled from the EPA site in February 2025, so we use the PEDP Azure or Harvard Dataverse copy instead.
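As a rough illustration of how the crosswalk bridges ZIP-level data to counties, the sketch below splits ZIP-level counts across counties in proportion to a residential-address ratio. The rows, the ZIP codes, and the `res_ratio` field name are assumptions made for the example, not real HUD records; check the published file layout before relying on any column name.

```python
from collections import defaultdict

# Illustrative crosswalk rows, NOT real HUD data: (zip, county_fips, res_ratio).
# res_ratio is taken to be the share of the ZIP's residential addresses in
# that county; the ratios for one ZIP are assumed to sum to 1.
CROSSWALK = [
    ("99901", "17031", 0.75),
    ("99901", "17097", 0.25),
    ("99902", "17097", 1.00),
]


def allocate_zip_to_county(zip_counts, crosswalk):
    """Split ZIP-level counts across counties in proportion to res_ratio.

    Returns (county_totals, unmatched), where unmatched holds ZIPs that
    have no crosswalk row and therefore could not be allocated.
    """
    shares_by_zip = defaultdict(list)
    for zip_code, county_fips, ratio in crosswalk:
        shares_by_zip[zip_code].append((county_fips, ratio))

    county_totals = defaultdict(float)
    unmatched = {}
    for zip_code, count in zip_counts.items():
        if zip_code not in shares_by_zip:
            unmatched[zip_code] = count
            continue
        for county_fips, ratio in shares_by_zip[zip_code]:
            county_totals[county_fips] += count * ratio
    return dict(county_totals), unmatched


totals, unmatched = allocate_zip_to_county(
    {"99901": 8, "99902": 4, "99999": 3}, CROSSWALK
)
print(totals, unmatched)
```

Returning the unmatched ZIPs separately, rather than dropping them, makes it visible how much of the input failed to join.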

Key datasets by layer

Dataset | Source | Layer | Granularity | Access
CDC WONDER: Multiple Cause of Death | CDC NCHS | Overdose mortality | County | Open
SUDORS / DOSE (OD2A) | CDC | Fatal-OD detail (toxicology) | State | Open
VSRR Provisional OD counts | CDC NCHS | Timely OD counts (3–6 mo lag) | State | Open + API
TEDS / TEDS-D | SAMHSA | Treatment admissions & discharges | State | Open PUF
N-SUMHSS / FindTreatment | SAMHSA | Facility capability census | Facility | Open
WaPo ARCOS pain-pill DB | WaPo / litigation | Pill-level opioid shipments '06–'14 | Pharmacy / county | Open + API
NPPES NPI Registry | CMS | Provider master key | NPI | Open
Medicare Part D Prescribers | CMS | Opioid / buprenorphine prescribing | NPI | Open
CDC Buprenorphine Dispensing Maps | CDC / IQVIA | MOUD access proxy | County | Open
CDC/ATSDR SVI | CDC/ATSDR | Social vulnerability index | Tract / county | Open
CDC PLACES | CDC | Local chronic-disease estimates | County / ZCTA / tract | Open + API
AHRQ SDOH Database | AHRQ | Pre-joined SDOH (5 domains) | County / ZIP / tract | Open
County Health Rankings | UW / RWJF | 90+ outcome & factor measures | County | Open
HUD USPS ZIP Crosswalk | HUD USER | The join bridge | ZIP ↔ tract/county | Open
Census ACS | Census | Universal denominator (demographics) | Nation → block group | Open API
Census LODES / LEHD | Census | Origin-destination commuting | Census block | Open
HCUP: SID/SEDD/NEDS | AHRQ | Inpatient & ED discharge records | State / hospital | App + DUA
Medicaid T-MSIS (TAF) | CMS / ResDAC | Medicaid claims (pays most US SUD care) | Beneficiary → county | DUA / App
NEMSIS (EMS) | NEMSIS TAC | EMS naloxone & OD responses | National (de-id) | Application
IDHS/SUPR DARTS | Illinois IDHS | True IL client/provider treatment records | Provider / client | Relationship-gated
Synthea / RTI SynthPop | MITRE / RTI | Synthetic build-and-test substrate | Synthetic → any geo | Open

Build order, by how easy the data is to get

Start now (all free and open): CDC mortality data (WONDER, VSRR, SUDORS); provider and prescribing data (NPPES, Medicare Part D); the Washington Post ARCOS pill-shipment database; core demographic and social layers (Census ACS, HUD crosswalk, SVI, PLACES, AHRQ-SDOH, County Health Rankings); commuting and map data (LODES, OpenStreetMap); and the synthetic datasets (Synthea, SynthPop) used to test the system on fake data first.

Apply early — these take time to clear: hospital discharge records (HCUP), Medicare and Medicaid claims through ResDAC, EMS data (NEMSIS), and the All of Us research cohort.

Paid or commercial: MarketScan, Optum, and IQVIA claims; Advan mobility data; and state all-payer claims databases. Detailed prescription-monitoring (PDMP) records are not public — only the policy summaries (PDAPS) are open.

2 · The research behind our analytics

Our analytics are transparent and tested: the model's reasoning can be inspected, and its predictions are checked against real outcomes. That credibility rests on published research across five areas, plus the evidence on provider outcomes and opioid modeling that shapes the product. What follows maps the fields we draw on, not a full literature review; complete citations are in the source notes.

Causal transport

Does a result travel from one place to another?

Bareinboim & Pearl on combining and transporting data (PNAS 2016; Stat. Science 2014), with practical methods from Stuart et al. (2011) and Dahabreh et al. (2020). The formal basis for asking: does an effect proven in one state hold up in another?

Policy evaluation

Measuring what a policy change actually did

Synthetic control (Abadie et al. 2010), synthetic difference-in-differences (Arkhangelsky et al. 2021), corrections for staggered rollouts (Callaway & Sant'Anna 2021; Goodman-Bacon 2021), and interrupted time series (Bernal et al. 2017) — to measure the real effect of Illinois policy changes and hold settlement spending accountable.
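The simplest member of this family is the two-group, two-period difference-in-differences, which the cited estimators generalize. The sketch below shows only that basic arithmetic, with invented numbers; it is not the synthetic-control or staggered-rollout method itself, and the function name is illustrative.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Basic 2x2 difference-in-differences on group means.

    The estimate is the treated group's change minus the control group's
    change, which is only a causal effect if both groups would have
    followed parallel trends without the policy.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Invented overdose rates per 100k, before and after a policy change.
effect = diff_in_diff(
    treated_pre=[10, 12], treated_post=[9, 9],
    control_pre=[8, 10], control_post=[10, 10],
)
print(effect)  # treated fell by 2 while control rose by 1, so the estimate is -3
```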

Resource allocation

Where to direct limited funding

Restless multi-armed bandits (Whittle 1988) and a real public-health deployment of them (Mate/Tambe/ARMMAN, AAAI 2022). The math behind "which county do we fund this quarter" — already proven in the field.

Forecasting

Predicting where overdoses are rising

Overdose forecasting (Sumetsky et al. 2021), neighborhood risk modeling (Bozorgi et al. 2021), self-exciting point processes (Mohler et al. 2011), and graph-based spatiotemporal networks (DCRNN, Li et al. 2018) — these power our county-by-week risk scores.

Staying accurate

Keeping models honest across places

Importance weighting (Sugiyama et al. 2007), domain-adaptation bounds (Ben-David et al. 2010), and stable-feature transport (Subbaswamy/Saria 2019): keeping a model accurate when applied to a region whose data looks different from where it was trained.
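To make importance weighting concrete, here is a minimal sketch for a single categorical covariate: each training record is weighted by how much more common its group is in the target region than in the training region. The groups and outcome values are invented, and real applications estimate the density ratio over many covariates rather than one.

```python
from collections import Counter


def importance_weights(source_groups, target_groups):
    """Weight each source record by p_target(group) / p_source(group)."""
    source_freq = Counter(source_groups)
    target_freq = Counter(target_groups)
    n_source, n_target = len(source_groups), len(target_groups)
    return [
        (target_freq[g] / n_target) / (source_freq[g] / n_source)
        for g in source_groups
    ]


def weighted_mean(values, weights):
    """Weighted average; raises if every weight is zero."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Training region is mostly urban; the target region is mostly rural.
source_groups = ["urban", "urban", "urban", "rural"]
source_outcomes = [1.0, 1.0, 1.0, 3.0]
target_groups = ["urban", "rural", "rural", "rural"]

weights = importance_weights(source_groups, target_groups)
print(sum(source_outcomes) / len(source_outcomes))  # unweighted source mean
print(weighted_mean(source_outcomes, weights))      # reweighted toward the target mix
```

A group that appears in the target but never in the training data cannot be reweighted this way, which is one situation where the honest answer is that the result does not transfer.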

Testing the model

Calibration and backtesting

Pattern-oriented modeling (Grimm et al. 2005), validation batteries (Barlas 1996), and simulation-based and Bayesian calibration. Opioid examples include the FDA's SOURCE model (Lim et al., PNAS 2022) and RESPOND (PLOS One 2024). Only about 61% of opioid models are calibrated and 31% validated (Cerdá et al. 2021) — that gap is our opening.

Provider outcome data is rare — and valuable

Data on how individual treatment providers actually perform is the hardest thing to get in this field. Public data gives state-level retention and completion rates (TEDS-D, the Medicaid Adult Core Set, mandatory since FFY2024), county-level access to medication (CDC buprenorphine maps), and a facility-by-facility capability census (N-SUMHSS). The claims-based "what works" standards (HEDIS measures like IET, FUA, and POD, which tracks staying on medication 180+ days, plus the academic Washington Circle measures) can only be calculated at the provider or county level with claims access (Illinois T-MSIS or DARTS), which is exactly where Dr. Barthwell's relationships open doors. On the demand side, a 2025 HHS-OIG report found Medicare could have saved $301.5M (53%) on opioid-treatment bundles where the full set of services wasn't actually delivered, a clear sign of the need for the spending accountability we provide.

3 · Early-warning signals

Death-certificate data (CDC WONDER) runs four to six months behind, and even the faster provisional counts (VSRR) lag three to six months. Our edge turns real-time, upstream signals into county-by-week risk scores that lead overdose deaths by two to eight weeks, the early warning that health departments, insurers, harm-reduction groups, and treatment networks will pay for. Most of these signals already come tagged by county or EMS region, so they slot straight into our data.
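One simple way to check a claimed lead time is a lagged cross-correlation: shift the upstream signal forward week by week and see which shift best matches the death series. The sketch below uses invented weekly counts, and the function names and the `min_overlap` guard are illustrative assumptions rather than the production method.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def best_lead_weeks(signal, deaths, max_lead=8, min_overlap=5):
    """Find the lead (in weeks) at which signal[t] best correlates with deaths[t + lead]."""
    best_lead, best_r = None, float("-inf")
    for lead in range(max_lead + 1):
        overlap = min(len(signal), len(deaths) - lead)
        if overlap < min_overlap:
            break  # too few paired weeks to trust a correlation
        r = pearson(signal[:overlap], deaths[lead:lead + overlap])
        if r > best_r:
            best_lead, best_r = lead, r
    return best_lead, best_r


# Invented weekly counts: deaths echo the signal two weeks later.
signal = [1, 4, 2, 7, 3, 1, 5, 2, 6, 1]
deaths = [0, 0, 1, 4, 2, 7, 3, 1, 5, 2]
print(best_lead_weeks(signal, deaths))
```

Correlation at a lag shows timing, not causation, so a result like this is a screening step before any formal backtest.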

The signals, ranked by how early they warn us

Signal | Predicts | Lead time | County join | Access
ODMAP (suspected ODs) | Real-time OD spikes, automated alerts | Real-time | Excellent | MOU-gated
EMS / NEMSIS naloxone | Nonfatal-OD burden (best proxy) | Real-time | Good (agency→FIPS) | State-EMS partner
NPDS poison-center calls | Acute toxic exposures (~6-min latency) | ~Real-time | Good | Paid license
CDC NWSS wastewater | New/more-potent supply in a sewershed | Days | Crosswalk needed | MOU / utility
NPS Discovery / NDEWS | Novel adulterants (nitazenes, xylazine) | Days–weeks | Region only | Open
988 / helpline volume | Population distress demand | Weeks | State / region | Mostly open
Syringe-services (SSP) data | Active-use population size | Weeks | Crosswalk | Partnership
Jail bookings / releases | Release = peak OD-risk window (lost tolerance) | Weeks (1–4) | Good | County scraping
HMIS homelessness | Structural risk-population flux | Months | CoC→FIPS | CoC-gated
Google Trends / social | Search/attention demand | Weeks | Coarse (DMA) | Open
PDMP / dispensing | Rx shifts; bup access (protective) | Mixed | NPI / FIPS (native) | Regulatory

The three strongest products we can build from these

Product 1

County Surge Score

Combines suspected-overdose spikes (ODMAP) with EMS naloxone use. It joins cleanly to county data, has the clearest buyer (state and county health departments), and carries little technical risk. Output: a daily or weekly surge score plus automatic spike alerts per Illinois county. It needs only an ODMAP data agreement, which Dr. Barthwell's federal-policy network can open.
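One plausible shape for such a score, offered as a sketch under stated assumptions: standardize this week's count for each signal against its own recent history, then average the two z-scores. The equal weights, the alert threshold of 2.0, and all names and numbers below are hypothetical; the plan does not specify a formula.

```python
from statistics import mean, stdev


def z_score(history, current):
    """How many standard deviations `current` sits above the historical mean."""
    if len(history) < 2:
        raise ValueError("need at least two historical weeks")
    spread = stdev(history)
    if spread == 0:
        return 0.0  # flat history: no basis for calling a spike
    return (current - mean(history)) / spread


def surge_score(odmap_history, odmap_now, ems_history, ems_now,
                alert_threshold=2.0):
    """Average the two signals' z-scores and flag an alert above the threshold.

    Equal weighting and the threshold are illustrative assumptions.
    """
    score = 0.5 * z_score(odmap_history, odmap_now) \
        + 0.5 * z_score(ems_history, ems_now)
    return score, score >= alert_threshold


# Invented weekly counts for one county.
score, alert = surge_score(
    odmap_history=[2, 4, 6], odmap_now=10,
    ems_history=[10, 12, 14], ems_now=14,
)
print(score, alert)
```

Standardizing each signal against its own baseline keeps a high-volume feed from drowning out a low-volume one.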

Product 2

Drug-Supply Early Warning

A "new threat in your county" alert combining national tracking of dangerous new additives (NPS Discovery, free) with local wastewater testing and syringe-program drug checking — the upstream signal almost nobody packages well. We can prototype it for free, then add Illinois-specific local data in a premium tier.

Product 3

Risk vs. Treatment-Access Map

Compares how fast acute risk is rising (poison-center call data) against how much medication treatment is actually available, flagging counties where danger is climbing but access is thin. We can build it quickly on data we already have (Part D, ARCOS, NPPES); the only added cost is the poison-center data license.
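The comparison can be sketched as a two-condition filter: growth in acute-risk calls above one threshold and treatment access below another. The field names, both thresholds, and every number here are hypothetical placeholders chosen for illustration.

```python
def flag_gap_counties(counties, min_risk_growth=0.20,
                      max_prescribers_per_100k=5.0):
    """Return counties where risk is climbing but medication access is thin.

    Each county is a dict with hypothetical fields: fips, calls_prior,
    calls_recent, bup_prescribers, population.
    """
    flagged = []
    for county in counties:
        if county["calls_prior"] <= 0 or county["population"] <= 0:
            continue  # growth or rate is undefined without a baseline
        growth = (county["calls_recent"] - county["calls_prior"]) \
            / county["calls_prior"]
        access = county["bup_prescribers"] * 100_000 / county["population"]
        if growth >= min_risk_growth and access <= max_prescribers_per_100k:
            flagged.append((county["fips"], round(growth, 3), round(access, 2)))
    # Fastest-rising risk first.
    return sorted(flagged, key=lambda row: row[1], reverse=True)


# Invented inputs for three counties.
example = [
    {"fips": "17001", "calls_prior": 40, "calls_recent": 60,
     "bup_prescribers": 2, "population": 50_000},
    {"fips": "17003", "calls_prior": 100, "calls_recent": 105,
     "bup_prescribers": 1, "population": 50_000},
    {"fips": "17005", "calls_prior": 20, "calls_recent": 30,
     "bup_prescribers": 20, "population": 100_000},
]
print(flag_gap_counties(example))
```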

4 · Glossary

Plain-language definitions for every acronym and term used across this plan.

Term | Full name | What it means here
MOUD | Medications for Opioid Use Disorder | The three FDA-approved meds: buprenorphine, methadone, naltrexone. Our clinical front door.
MAT | Medication-Assisted Treatment | Older term for MOUD (meds + behavioral support); now largely superseded by "MOUD."
OBOT | Office-Based Opioid Treatment | Office/telehealth buprenorphine prescribing. Billable in 4–8 weeks under Two Dreams' existing SUPR license; no X-waiver needed post-MAT Act 2023.
OTP | Opioid Treatment Program | Federally certified program that can dispense methadone. Requires SAMHSA cert + CARF/JC accreditation + DEA NTP registration (9–14 month build).
NTP | Narcotic Treatment Program | DEA's registration category for an OTP. A registered OTP may add a mobile van as an extension (2021 DEA rule); standalone mobile NTPs are barred.
MMHU | Medication-Assisted Recovery Mobile Health Units | Illinois IDHS/SUPR grant program (AG-certified up to $15M; MMHU-2 = $8.4M / 3 yrs / 4+ awards) funding the van, all three OUD meds, and staff. Two Dreams is eligible.
SUPR | Division of Substance Use Prevention & Recovery | The IDHS division that licenses SUD providers and administers SUD funding in Illinois. Two Dreams holds a SUPR license.
IDHS | Illinois Department of Human Services | The state agency that houses SUPR; the grantor/contracting body for MMHU, SOR, and related funds.
SOR | State Opioid Response | SAMHSA grant funding flowing to states (IDHS/SUPR), available to providers as subawards. Part of the capital stack.
RCORP | Rural Communities Opioid Response Program | HRSA grant program; RCORP-Impact is up to $750K/yr × 4 (rural), with for-profits eligible as a consortium lead.
SAMHSA | Substance Abuse & Mental Health Services Administration | The federal agency funding SOR, certifying OTPs, and collecting TEDS / N-SUMHSS data.
HRSA | Health Resources & Services Administration | The federal agency that runs RCORP and other rural-health funding.
ONDCP | Office of National Drug Control Policy | The White House drug-policy office. Dr. Andrea Barthwell is a former Deputy Director, the relationship moat for MOU-gated data.
ASAM | American Society of Addiction Medicine | The professional society for addiction medicine; Dr. Barthwell is a past president.
CoCM | Collaborative Care Model | Team-based integrated behavioral-health billing model (CPT 99492–99494, G2214; 2026 G0568 ≈ $162 [verify]). A core LTV layer in the BH stack.
IOP | Intensive Outpatient Program | A structured level of behavioral-health care; part of the separately-billable BH service stack.
SDOH | Social Determinants of Health | Non-clinical conditions (income, housing, food access) driving health outcomes; a major covariate layer in the fabric (SVI, AHRQ-SDOH, ADI).
SVI | Social Vulnerability Index | CDC/ATSDR 16-variable tract/county vulnerability index; the flagship SDOH join layer and a weight in the targeting model.
MAT Act | Mainstreaming Addiction Treatment Act (2023) | Eliminated the DEA X-waiver, letting any DEA-registered prescriber treat OUD with buprenorphine; the reason OBOT is billable in weeks.
IORAB | Illinois Opioid Remediation Advisory Board | The state body advising on allocation of Illinois opioid-settlement (abatement) funds. [verify]
IMPACT | Illinois Medicaid Provider Enrollment (IMPACT) | The IL portal through which a provider enrolls in Medicaid before MCO contracting and billing.
MCO | Managed Care Organization | Medicaid managed-care plans that must be contracted with to bill Medicaid volume in Illinois.
OASEE | Opioid Abatement Strategies Effectiveness Evaluator | An IL NOFO ($1.5M / 3 yrs to one org) to evaluate settlement-spending effectiveness; a direct buyer signal for our analytics.
DARTS | Departmental Automated Reporting & Tracking System | IDHS/SUPR's provider-reported, client-level treatment records: real IL provider-outcome data, not public (relationship-gated).
ODMAP | Overdose Detection Mapping Application Program | HIDTA-run real-time suspected-OD mapping with automated spike alerts; the strongest county-week early-warning join (MOU-gated).
NPDS | National Poison Data System | America's Poison Centers' near-real-time (~6-min) poisoning-surveillance feed; licensable acute-exposure signal.
NEMSIS | National EMS Information System | National EMS data system; source of naloxone-administration and OD-response signals (the best validated nonfatal proxy).
NWSS | National Wastewater Surveillance System | CDC system whose drug-metabolite layer detects new/more-potent supply in a sewershed days ahead of ED/morgue.
PDMP | Prescription Drug Monitoring Program | State controlled-substance dispensing database; near-real-time but record-level access is regulatory-restricted (clinician/agency only).
NPPES / NPI | National Plan & Provider Enumeration System / National Provider Identifier | The master US provider registry and key; the load-bearing provider join for the fabric (V1 retires Mar 2026 → V2).
ARCOS | Automation of Reports & Consolidated Orders System | DEA's controlled-substance distribution data; the WaPo litigation release gives transaction-level pill flows 2006–2014.
TEDS / TEDS-D | Treatment Episode Data Set (Discharges) | SAMHSA admissions/discharge records; the closest open thing to outcome data, but state-level only.
HEDIS | Healthcare Effectiveness Data & Information Set | NCQA's claims-based quality measures (incl. IET, FUA, POD); the payer-trusted "what-works" standard.
POD | Pharmacotherapy for Opioid Use Disorder | HEDIS measure: % of OUD pharmacotherapy events lasting ≥180 days; the best claims-based MOUD-retention metric.
IET / FUA | Initiation & Engagement / Follow-Up After ED | HEDIS SUD process measures (treatment initiation/engagement; follow-up after an SUD-related ED visit).
T-MSIS / TAF | Transformed Medicaid Statistical Information System (Analytic Files) | CMS national Medicaid claims; the gated path to provider-level IL outcomes via ResDAC.
APCD | All-Payer Claims Database | State-held multi-payer claims (~20+ states); a Tier-3 commercial/DUA route to claims-based outcomes.
SUD / OUD | Substance Use Disorder / Opioid Use Disorder | The clinical conditions treated; OUD is the opioid-specific subset and our primary focus.
LTV | Lifetime Value | Per-patient revenue over the relationship; integrating BH lifts it ~2.6× annual / ~4.9× lifetime over MOUD-only.
FIPS | Federal Information Processing Standards (county code) | The 5-digit county code that is the primary spine of the data fabric.
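As an illustration of how a retention metric like POD reduces to arithmetic once claims are available, the sketch below computes the share of treatment episodes lasting at least 180 days. It is a deliberate simplification: the actual HEDIS specification has eligibility and gap rules that are not modeled here, and the episode tuples are invented.

```python
from datetime import date, timedelta


def pod_rate(episodes, min_days=180):
    """Share of episodes whose inclusive length is at least `min_days`.

    `episodes` is a list of (start_date, end_date) pairs. Returns None
    when there are no episodes, since the rate is undefined.
    """
    if not episodes:
        return None
    retained = sum(
        1 for start, end in episodes
        if (end - start).days + 1 >= min_days
    )
    return retained / len(episodes)


start = date(2024, 1, 1)
episodes = [
    (start, start + timedelta(days=179)),  # exactly 180 days, counts as retained
    (start, start + timedelta(days=60)),   # 61 days, does not count
]
print(pod_rate(episodes))
```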

5 · How we work (methodology notes)

Build and test on synthetic data first

Before touching any real records, we build and test the whole system end to end on synthetic (computer-generated) patient data from Synthea and RTI SynthPop. This lowers engineering risk, lets us demo to partners, and keeps us clear of the strict privacy rules (42 CFR Part 2) during development.

Checking our predictions against reality

Illinois policy changes give us reliable, real-world cause-and-effect results that do double duty: a useful product on their own, and the answer key we test our prediction engine against. The claim we can stand behind — "we predicted how a result would carry over to a new area, and we were right" — is provable against these known outcomes.

Saying when a prediction doesn't apply

Every prediction comes with a confidence level. When the math says a result from one place won't reliably carry over to another, the product says so plainly — "collect local data first" — rather than overstate its confidence. We'd rather be clear about the limits than overclaim.
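One common diagnostic that could sit behind such a rule is the Kish effective sample size of the importance weights used to carry a result to a new place: when a few records carry almost all the weight, the estimate effectively rests on very little data. This is a sketch of that idea only; the threshold and the function names are assumptions, and the text above does not say which test the product uses.

```python
def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / sum of w^2."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        return 0.0
    return total * total / total_sq


def transport_verdict(weights, min_ess=30.0):
    """Abstain when too few effective records support the transported estimate.

    min_ess is a hypothetical cutoff chosen for illustration.
    """
    ess = effective_sample_size(weights)
    if ess < min_ess:
        return "collect local data first", ess
    return "prediction applies", ess


print(transport_verdict([1.0] * 50))                         # evenly weighted records
print(transport_verdict([10.0, 0.1, 0.1, 0.1], min_ess=2.0))  # one record dominates
```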

Caveats and data hygiene

Provider counts and access estimates point in the right direction but aren't exact (the CDC county maps are summarized from licensed IQVIA data, not raw records). In CDC WONDER, any county-level count of fewer than 10 deaths is suppressed to protect privacy. All links were checked live on 2026-06-25. Source-note identifiers marked "verify" come from expert recall and should be double-checked before being cited outside this document. Any financial figure on this site is preliminary until confirmed.
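Suppressed cells should be carried through as a range rather than read as zero. A minimal sketch, assuming the export marks hidden cells with the text "Suppressed" and that a hidden cell means 0 to 9 deaths; both are assumptions to confirm against the actual export, and the function names are illustrative.

```python
def parse_death_count(cell):
    """Return (low, high) bounds for one exported death-count cell.

    Assumes hidden cells are labeled "Suppressed" and stand for 0-9 deaths.
    """
    text = str(cell).strip()
    if text.lower() == "suppressed":
        return (0, 9)
    value = int(text.replace(",", ""))
    return (value, value)


def total_bounds(cells):
    """Sum the per-cell bounds to get a range for the combined count."""
    lows, highs = zip(*(parse_death_count(c) for c in cells))
    return (sum(lows), sum(highs))


# Three county-year cells, two of them hidden.
print(total_bounds(["Suppressed", "12", "Suppressed"]))
```

Carrying the range forward keeps small rural counties in the model without pretending their counts are known exactly.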

Why this advantage lasts

The clinical foundation — Two Dreams plus Dr. Barthwell — opens the door. What makes the advantage durable is the rest: a single connected dataset, an early-warning surge layer, and a transparent, evidence-backed methodology. Together they power our county-targeting today and a growing analytics business over time. See Deployment Targeting for the data in action, and Year 1·2·5+ for how it scales.