Opioid Prescriber Risk Prediction
Medicare Part D prescriber forecasting with peer-normalized risk scoring

Project Goal

The primary objective is to build a predictive model that identifies Medicare Part D prescribers (by NPI) who are likely to exceed a peer-normalized opioid prescribing threshold in the following year.

Target Variable

The model predicts the probability that a provider's opioid prescription share next year will be in the top 20% of their peer group (defined by specialty and state). The target is formally defined as:

Pr(opioid_share_{t+1} ≥ peer_q80_t)

Intended Action

The model's output is intended to be a supportive tool. It should be used to prioritize providers for positive interventions like education, clinical reviews, and promoting naloxone co-prescribing. It is explicitly designed for non-punitive actions.

Description of the Datasets

The model integrates five public datasets to create a comprehensive view of each provider's prescribing patterns and their local community context.

| Dataset | Owner | Scope | Grain | Rows × Columns | Primary Join Keys | Example Feature Fields | Refresh Cadence | Notes / Pitfalls |
|---|---|---|---|---|---|---|---|---|
| CMS Medicare Part D Prescriber — Provider × Drug | CMS | National Part D claims at prescriber–drug level | NPI × NDC × Year | 1,332,309 × 84 (2022); 1,380,665 × 84 (2023) | NPI, product_ndc (11-digit), prescriber ZIP/state | total_claim_count, total_day_supply, total_drug_cost, bene_count, specialty | Annual | NDC formatting (11- vs. 10-digit dashed); small-cell suppression; ZIP reflects enrollment address |
| FDA National Drug Code (NDC) Directory (openFDA) | FDA | Drug product attributes and labeling | product_ndc (10-digit dashed), package level | 132,707 × 32 | product_ndc (convert to NDC11 to join with Part D) | generic_name, brand_name, dosage_form, route, pharm_class_epc/cs, dea_schedule, marketing_start/stop | Nightly (bulk); API mirrors index | Normalize NDC to 11-digit; inactive products present; class strings are free-form |
| County Health Rankings (CHR) — Analytic File | CHR&R (UWPHI) | County health outcomes & drivers | County FIPS | 3,205 × 796 | fipscode (5-digit) | smoking_rate, adult_obesity, excessive_drinking, uninsured, primary_care_access, premature_death | Annual | Some indicators are modeled estimates; endpoint may block HEAD/Range |
| CDC Provisional Drug Overdose Deaths | CDC/NCHS | Overdose counts and rates by month | County FIPS × Month | 76,860 × 12 | county_fips_code (5-digit), month_date | age_adjusted_rate, death_count (by cause grouping), rolling 12-mo mean | Monthly/Quarterly (provisional) | Provisional and revised over time; prefer rates; aggregate to trailing windows for stability |
| U.S. Census ACS 5-Year | U.S. Census Bureau | Socio-demographics | County FIPS | 3,222 × 6 | state_fips + county_fips → 5-digit FIPS | population, poverty_count, insurance, income, education | Annual (rolling 5-year) | Estimates with MOE; document variables pulled; avoid mixing 1-year with 5-year for counties |

Why XGBoost?

XGBoost is a fast, regularized, tree-boosting method that is well-suited for this project's tabular data, heterogeneous features, missing values, and non-linear interactions. It provides strong baselines and remains explainable for clinical and governance review.

1. Tabular + Mixed Feature Types

Works well with numeric and one-hot categorical features, learning complex interaction terms automatically.

2. Missing-Value Handling

Trees in XGBoost learn a "default direction," so imputation pipelines are not needed for many features.

3. Nonlinear Signals

Captures threshold effects and interactions (e.g., state × specialty × opioid share) effectively.

4. Class Imbalance Tools

Features like scale_pos_weight help when positive cases are rare.

5. Regularization & Robustness

Techniques like shrinkage, subsampling, and L1/L2 regularization prevent overfitting.

6. Speed & Scale

Highly optimized for fast iteration and broad hyperparameter sweeps.

7. Explainability

SHAP TreeExplainer produces faithful attributions for audits and provider-level reason codes.

Implementation Notes

NDC Normalization & Opioid Flag

  • The pipeline converts openFDA's 10-digit dashed product_ndc to an 11-digit format by zero-padding according to FDA rules to match Part D data.
  • An opioid flag is created using a conservative heuristic based on text in pharmacologic class fields and DEA schedule signals (e.g., CII/CIII).
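The two steps above can be sketched in a few lines. The padding rule follows the FDA's three 10-digit layouts (4-4-2, 5-3-2, 5-4-1), each zero-padded to 5-4-2; the opioid-flag helper and its field names (`pharm_class`, `dea_schedule`) are an illustrative simplification of the pipeline's heuristic, not its exact implementation.

```python
def ndc10_to_ndc11(product_ndc: str) -> str:
    """Zero-pad a 10-digit dashed NDC (4-4-2, 5-3-2, or 5-4-1 layout)
    into the undashed 11-digit 5-4-2 format used by Part D claims."""
    labeler, product, package = product_ndc.split("-")
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)

def looks_like_opioid(pharm_class: str, dea_schedule: str) -> bool:
    """Conservative heuristic: pharmacologic-class text mentions opioids
    AND the product is Schedule II or III. Field names are illustrative."""
    class_hit = "opioid" in (pharm_class or "").lower()
    schedule_hit = (dea_schedule or "").upper() in {"CII", "CIII"}
    return class_hit and schedule_hit
```

For example, `ndc10_to_ndc11("1234-5678-90")` yields `"01234567890"`, matching the Part D join key.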

Feature Engineering (NPI×year)

  • From Part D: total_claims, total_30ds, total_cost, opioid_claims (used to derive opioid_share), and avg_cost_per_30ds.
  • County Features: Data is merged via a ZIP-to-FIPS bridge. This includes ACS data (population, poverty), CHR indicators (smoking, obesity, etc.), and a rolling 12-month mean overdose rate from the CDC.
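A minimal sketch of the Part D aggregation to the NPI×year grain, using pandas with toy rows and assumed column names; here total_30ds is taken to mean 30-day-supply equivalents (total day supply ÷ 30), which is an assumption about the feature's definition:

```python
import pandas as pd

# Illustrative Part D rows at NPI × drug × year grain (column names assumed).
rows = pd.DataFrame({
    "npi":               [111, 111, 222, 222],
    "year":              [2022, 2022, 2022, 2022],
    "total_claim_count": [100, 50, 200, 10],
    "total_day_supply":  [3000, 1500, 6000, 300],
    "total_drug_cost":   [5000.0, 900.0, 7000.0, 400.0],
    "is_opioid":         [True, False, False, True],
})

features = (
    rows.assign(opioid_claims=rows["total_claim_count"].where(rows["is_opioid"], 0))
        .groupby(["npi", "year"], as_index=False)
        .agg(total_claims=("total_claim_count", "sum"),
             total_day_supply=("total_day_supply", "sum"),
             total_cost=("total_drug_cost", "sum"),
             opioid_claims=("opioid_claims", "sum"))
)
features["opioid_share"] = features["opioid_claims"] / features["total_claims"]
features["total_30ds"] = features["total_day_supply"] / 30   # 30-day equivalents (assumption)
features["avg_cost_per_30ds"] = features["total_cost"] / features["total_30ds"]
```

The county features would then be left-joined onto this frame via the ZIP-to-FIPS bridge.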

Labeling (Peer-Normalized)

  • Providers are grouped by state × specialty.
  • The 80th percentile of opioid_share is computed within each peer group to define the threshold peer_q80.
  • The label is 1 if a provider's opioid_share is greater than or equal to their peer group's threshold.
  • For true forecasting, the model is trained on features at year t to predict the label at year t+1.
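The labeling rule above can be expressed with a grouped quantile transform; this is a sketch with toy data and assumed column names, and in the real pipeline the label computed in year t+1 would be joined back to features from year t:

```python
import pandas as pd

# Illustrative NPI × year table; one peer group (state × specialty) for brevity.
df = pd.DataFrame({
    "state":        ["TX"] * 5,
    "specialty":    ["Family Practice"] * 5,
    "opioid_share": [0.05, 0.10, 0.20, 0.30, 0.90],
})

# 80th percentile of opioid_share within each state × specialty peer group.
df["peer_q80"] = (
    df.groupby(["state", "specialty"])["opioid_share"]
      .transform(lambda s: s.quantile(0.80))
)

# Label = 1 if the provider meets or exceeds their peer threshold.
df["high_risk"] = (df["opioid_share"] >= df["peer_q80"]).astype(int)
```

With these toy shares the peer threshold is 0.42, so only the 0.90 provider is labeled high-risk.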

Modeling

  • Classifier: XGBClassifier with imbalance handling (scale_pos_weight).
  • Split: A temporal split is preferred.
  • Metrics: Key evaluation metrics are AUPRC, ROC AUC, and lift@k.
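Two of the bullets above reduce to a few lines of setup. This sketch (toy arrays, assumed year boundary) shows the temporal split and the usual scale_pos_weight heuristic of negatives ÷ positives, computed on the training fold only; the resulting value would be passed as xgboost.XGBClassifier(scale_pos_weight=spw, ...).

```python
import numpy as np

# Illustrative labels and feature years; real data comes from the NPI × year table.
y     = np.array([0, 0, 0, 0, 1, 0, 0, 1, 0, 0])
years = np.array([2021, 2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022])

# Temporal split: train on earlier years, evaluate on the latest year
# (avoids leaking future prescribing behavior into training).
train_mask = years < 2022
test_mask  = ~train_mask

# Common imbalance heuristic: negatives / positives on the TRAINING fold.
y_train = y[train_mask]
spw = (y_train == 0).sum() / (y_train == 1).sum()
```

AUPRC, ROC AUC, and lift@k would then be computed on the held-out year only.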

Overall Model Performance

The model substantially outperforms random selection at identifying high-risk prescribers and is effective at rank-ordering providers by risk.

0.850
ROC AUC

Indicates a very good ability to distinguish between high-risk and non-high-risk providers.

0.640
Average Precision (AUPRC)

A good result, especially given the positive class prevalence of about 21.5%.

0.581
F1 Score (@ 0.5 Threshold)

Shows a decent balance between precision (46.8%) and recall (76.7%) for the positive class.

Prioritization Performance (Lift Analysis)

Lift measures the model's effectiveness at finding high-risk cases compared to random selection, which is crucial for prioritizing resources.

  • Lift @ 1% (k=1%): 4.384
    The top 1% of providers ranked by risk are 4.4 times more likely to be high-risk than a random selection.
  • Lift @ 5% (k=5%): 3.989
    A provider in the model's top 5% is nearly 4 times more likely to be high-risk than one selected at random.
  • Lift @ 10% (k=10%): 3.441
    In the top 10%, the model is still 3.4 times more effective than random chance.
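Lift@k is simply precision in the top-k fraction of score-ranked providers divided by the overall positive rate. A minimal sketch with a toy example (not the project's actual scores):

```python
import numpy as np

def lift_at_k(y_true: np.ndarray, scores: np.ndarray, k: float) -> float:
    """Precision among the top-k fraction of providers ranked by score,
    divided by the base positive rate (lift over random selection)."""
    n_top = max(1, int(round(len(y_true) * k)))
    top_idx = np.argsort(scores)[::-1][:n_top]
    return y_true[top_idx].mean() / y_true.mean()

# Toy example: a model that ranks both positives first out of ten providers.
y = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0])
s = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05])
lift_at_k(y, s, k=0.20)  # precision@20% = 1.0, base rate = 0.2, so lift = 5.0
```

With a 21.5% base rate, the theoretical maximum lift is about 4.65 (1 / 0.215), so the observed lift@1% of 4.384 is close to the ceiling.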

Data Governance & Ethics

Purpose

To support education and patient-safety quality programs, not for punitive action.

Bias & Fairness

The model should be evaluated by specialty and geography. Minimum-volume filters and peer group calibration are used to mitigate bias.

Privacy & Transparency

All sources are public aggregates, and patient-level inference should not be attempted. SHAP reasons should be provided with any risk score.

Demo

A short walkthrough showing how the workflow identifies high-risk prescribers and how the scoring output can be used for non-punitive interventions.
