Opioid Prescriber Risk Prediction
Medicare Part D prescriber forecasting with peer-normalized risk scoring

Project Goal

The primary objective is to build a predictive model that identifies Medicare Part D prescribers (by NPI) who are likely to exceed a peer-normalized opioid prescribing threshold in the following year.

Target Variable

The model predicts the probability that a provider's opioid prescription share next year will be in the top 20% of their peer group (defined by specialty and state). The target is formally defined as:

Pr(opioid_share_{t+1} ≥ peer_q80_t)

Intended Action

The model's output is intended to be a supportive tool. It should be used to prioritize providers for positive interventions like education, clinical reviews, and promoting naloxone co-prescribing. It is explicitly designed for non-punitive actions.

Description of the Datasets

The model integrates five public datasets to create a comprehensive view of each provider's prescribing patterns and their local community context.

| Dataset | Owner | Scope | Grain | Rows × Columns | Primary Join Keys | Example Feature Fields | Refresh Cadence | Notes / Pitfalls |
|---|---|---|---|---|---|---|---|---|
| CMS Medicare Part D Prescriber — Provider × Drug | CMS | National Part D claims at prescriber–drug level | NPI × NDC × Year | 1,332,309 × 84 (2022); 1,380,665 × 84 (2023) | NPI, product_ndc (11-digit), prescriber ZIP/state | total_claim_count, total_day_supply, total_drug_cost, bene_count, specialty | Annual | NDC formatting (11- vs. 10-digit dashed); small-cell suppression; ZIP reflects enrollment address |
| FDA National Drug Code (NDC) Directory (openFDA) | FDA | Drug product attributes and labeling | product_ndc (10-digit dashed), package level | 132,707 × 32 | product_ndc (convert to NDC11 to join with Part D) | generic_name, brand_name, dosage_form, route, pharm_class_epc/cs, dea_schedule, marketing_start/stop | Nightly (bulk); API mirrors index | Normalize NDC to 11-digit; inactive products present; class strings are free-form |
| County Health Rankings (CHR) — Analytic File | CHR&R (UWPHI) | County health outcomes & drivers | County FIPS | 3,205 × 796 | fipscode (5-digit) | smoking_rate, adult_obesity, excessive_drinking, uninsured, primary_care_access, premature_death | Annual | Some indicators are modeled estimates; endpoint may block HEAD/Range |
| CDC Provisional Drug Overdose Deaths | CDC/NCHS | Overdose counts and rates by month | County FIPS × Month | 76,860 × 12 | county_fips_code (5-digit), month_date | age_adjusted_rate, death_count (by cause grouping), rolling 12-mo mean | Monthly/Quarterly (provisional) | Provisional and revised over time; prefer rates; aggregate to trailing windows for stability |
| U.S. Census ACS 5-Year | U.S. Census Bureau | Socio-demographics | County FIPS | 3,222 × 6 | state_fips + county_fips → 5-digit FIPS | population, poverty_count, insurance, income, education | Annual (rolling 5-year) | Estimates with MOE; document variables pulled; avoid mixing 1-year with 5-year for counties |

Why XGBoost?

XGBoost is a fast, regularized, tree-boosting method that is well-suited for this project's tabular data, heterogeneous features, missing values, and non-linear interactions. It provides strong baselines and remains explainable for clinical and governance review.

1. Tabular + Mixed Feature Types

Works well with numeric and one-hot categorical features, learning complex interaction terms automatically.

2. Missing-Value Handling

Trees in XGBoost learn a "default direction," so imputation pipelines are not needed for many features.

3. Nonlinear Signals

Captures threshold effects and interactions (e.g., state × specialty × opioid share) effectively.

4. Class Imbalance Tools

Features like scale_pos_weight help when positive cases are rare.

5. Regularization & Robustness

Techniques like shrinkage, subsampling, and L1/L2 regularization prevent overfitting.

6. Speed & Scale

Highly optimized for fast iteration and broad hyperparameter sweeps.

7. Explainability

SHAP TreeExplainer produces faithful attributions for audits and provider-level reason codes.

Implementation Notes

NDC Normalization & Opioid Flag

  • The pipeline converts openFDA's 10-digit dashed product_ndc to an 11-digit format by zero-padding according to FDA rules to match Part D data.
  • An opioid flag is created using a conservative heuristic based on text in pharmacologic class fields and DEA schedule signals (e.g., CII/CIII).
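The two steps above can be sketched in a few lines. The padding rule follows the FDA's three 10-digit layouts (4-4-2, 5-3-2, 5-4-1), each zero-padded to 5-4-2; the opioid-flag helper and its field names (`pharm_class`, `dea_schedule`) are an illustrative simplification of the pipeline's heuristic, not its exact implementation.

```python
def ndc10_to_ndc11(product_ndc: str) -> str:
    """Zero-pad a 10-digit dashed NDC (4-4-2, 5-3-2, or 5-4-1 layout)
    into the undashed 11-digit 5-4-2 format used by Part D claims."""
    labeler, product, package = product_ndc.split("-")
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)

def looks_like_opioid(pharm_class: str, dea_schedule: str) -> bool:
    """Conservative heuristic: pharmacologic-class text mentions opioids
    AND the product is Schedule II or III. Field names are illustrative."""
    class_hit = "opioid" in (pharm_class or "").lower()
    schedule_hit = (dea_schedule or "").upper() in {"CII", "CIII"}
    return class_hit and schedule_hit
```

For example, `ndc10_to_ndc11("1234-5678-90")` yields `"01234567890"`, matching the Part D join key.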

Feature Engineering (NPI×year)

  • From Part D: total_claims, total_30ds, total_cost, opioid_claims (used to derive opioid_share), and avg_cost_per_30ds.
  • County Features: Data is merged via a ZIP-to-FIPS bridge. This includes ACS data (population, poverty), CHR indicators (smoking, obesity, etc.), and a rolling 12-month mean overdose rate from the CDC.
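A minimal sketch of the Part D aggregation to the NPI×year grain, using pandas with toy rows and assumed column names; here total_30ds is taken to mean 30-day-supply equivalents (total day supply ÷ 30), which is an assumption about the feature's definition:

```python
import pandas as pd

# Illustrative Part D rows at NPI × drug × year grain (column names assumed).
rows = pd.DataFrame({
    "npi":               [111, 111, 222, 222],
    "year":              [2022, 2022, 2022, 2022],
    "total_claim_count": [100, 50, 200, 10],
    "total_day_supply":  [3000, 1500, 6000, 300],
    "total_drug_cost":   [5000.0, 900.0, 7000.0, 400.0],
    "is_opioid":         [True, False, False, True],
})

features = (
    rows.assign(opioid_claims=rows["total_claim_count"].where(rows["is_opioid"], 0))
        .groupby(["npi", "year"], as_index=False)
        .agg(total_claims=("total_claim_count", "sum"),
             total_day_supply=("total_day_supply", "sum"),
             total_cost=("total_drug_cost", "sum"),
             opioid_claims=("opioid_claims", "sum"))
)
features["opioid_share"] = features["opioid_claims"] / features["total_claims"]
features["total_30ds"] = features["total_day_supply"] / 30   # 30-day equivalents (assumption)
features["avg_cost_per_30ds"] = features["total_cost"] / features["total_30ds"]
```

The county features would then be left-joined onto this frame via the ZIP-to-FIPS bridge.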

Labeling (Peer-Normalized)

  • Providers are grouped by state × specialty.
  • The 80th percentile of opioid_share is computed within each peer group to define the threshold peer_q80.
  • The label is 1 if a provider's opioid_share is greater than or equal to their peer group's threshold.
  • For true forecasting, the model is trained on features at year t to predict the label at year t+1.
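The labeling rule above can be expressed with a grouped quantile transform; this is a sketch with toy data and assumed column names, and in the real pipeline the label computed in year t+1 would be joined back to features from year t:

```python
import pandas as pd

# Illustrative NPI × year table; one peer group (state × specialty) for brevity.
df = pd.DataFrame({
    "state":        ["TX"] * 5,
    "specialty":    ["Family Practice"] * 5,
    "opioid_share": [0.05, 0.10, 0.20, 0.30, 0.90],
})

# 80th percentile of opioid_share within each state × specialty peer group.
df["peer_q80"] = (
    df.groupby(["state", "specialty"])["opioid_share"]
      .transform(lambda s: s.quantile(0.80))
)

# Label = 1 if the provider meets or exceeds their peer threshold.
df["high_risk"] = (df["opioid_share"] >= df["peer_q80"]).astype(int)
```

With these toy shares the peer threshold is 0.42, so only the 0.90 provider is labeled high-risk.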

Modeling

  • Classifier: XGBClassifier with imbalance handling (scale_pos_weight).
  • Split: A temporal split is preferred.
  • Metrics: Key evaluation metrics are AUPRC, ROC AUC, and lift@k.
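Two of the bullets above reduce to a few lines of setup. This sketch (toy arrays, assumed year boundary) shows the temporal split and the usual scale_pos_weight heuristic of negatives ÷ positives, computed on the training fold only; the resulting value would be passed as xgboost.XGBClassifier(scale_pos_weight=spw, ...).

```python
import numpy as np

# Illustrative labels and feature years; real data comes from the NPI × year table.
y     = np.array([0, 0, 0, 0, 1, 0, 0, 1, 0, 0])
years = np.array([2021, 2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022])

# Temporal split: train on earlier years, evaluate on the latest year
# (avoids leaking future prescribing behavior into training).
train_mask = years < 2022
test_mask  = ~train_mask

# Common imbalance heuristic: negatives / positives on the TRAINING fold.
y_train = y[train_mask]
spw = (y_train == 0).sum() / (y_train == 1).sum()
```

AUPRC, ROC AUC, and lift@k would then be computed on the held-out year only.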

Overall Model Performance

The model substantially outperforms random selection at identifying high-risk prescribers and is effective at rank-ordering providers by risk.

0.850
ROC AUC

Indicates a very good ability to distinguish between high-risk and non-high-risk providers.

0.640
Average Precision (AUPRC)

A good result, especially given the positive class prevalence of about 21.5%.

0.581
F1 Score (@ 0.5 Threshold)

Shows a decent balance between precision (46.8%) and recall (76.7%) for the positive class.

Prioritization Performance (Lift Analysis)

Lift measures the model's effectiveness at finding high-risk cases compared to random selection, which is crucial for prioritizing resources.

  • Lift @ 1% (k=1%): 4.384
    The top 1% of providers ranked by risk are 4.4 times more likely to be high-risk than a random selection.
  • Lift @ 5% (k=5%): 3.989
    A provider in the model's top 5% is nearly 4 times more likely to be high-risk than one selected at random.
  • Lift @ 10% (k=10%): 3.441
    In the top 10%, the model is still 3.4 times more effective than random chance.
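Lift@k is simply precision in the top-k fraction of score-ranked providers divided by the overall positive rate. A minimal sketch with a toy example (not the project's actual scores):

```python
import numpy as np

def lift_at_k(y_true: np.ndarray, scores: np.ndarray, k: float) -> float:
    """Precision among the top-k fraction of providers ranked by score,
    divided by the base positive rate (lift over random selection)."""
    n_top = max(1, int(round(len(y_true) * k)))
    top_idx = np.argsort(scores)[::-1][:n_top]
    return y_true[top_idx].mean() / y_true.mean()

# Toy example: a model that ranks both positives first out of ten providers.
y = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0])
s = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05])
lift_at_k(y, s, k=0.20)  # precision@20% = 1.0, base rate = 0.2, so lift = 5.0
```

With a 21.5% base rate, the theoretical maximum lift is about 4.65 (1 / 0.215), so the observed lift@1% of 4.384 is close to the ceiling.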

Data Governance & Ethics

Purpose

To support education and patient-safety quality programs, not for punitive action.

Bias & Fairness

The model should be evaluated by specialty and geography. Minimum-volume filters and peer group calibration are used to mitigate bias.

Privacy & Transparency

All sources are public aggregates, and patient-level inference should not be attempted. SHAP reasons should be provided with any risk score.

Demo

A short walkthrough showing how the workflow identifies high-risk prescribers and how the scoring output can be used for non-punitive interventions.
