Developing a practical suicide risk prediction model for targeting high-risk patients in the Veterans Health Administration

Objectives: The US Veterans Health Administration (VHA) has begun using predictive modeling to identify Veterans at high suicide risk in order to target care. Initial analyses are reported here.

Methods: A penalized logistic regression model was compared with an earlier proof-of-concept logistic model. Exploratory analyses then considered commonly used machine learning algorithms. Analyses were based on electronic medical records for all 6,360 individuals classified in the National Death Index as having died by suicide in fiscal years 2009–2011 who used VHA services in the year of their death or the prior year, and for a 1% probability sample of time-matched VHA service users alive at the index date (n = 2,112,008).

Results: A penalized logistic model with 61 predictors had sensitivity comparable to that of the proof-of-concept model (which had 381 predictors) at target thresholds. The machine learning algorithms had relatively similar sensitivities, the highest being for Bayesian additive regression trees, with 10.7% of suicides occurring among the 1.0% of Veterans with the highest predicted risk and 28.1% among the 5.0% with the highest predicted risk.

Conclusions: Based on these results, VHA is using penalized logistic regression in initial intervention implementation. The paper concludes with a discussion of other practical issues that might be explored to increase model performance.
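The sensitivities reported above are of the form "share of all suicides that occurred among the top x% of predicted risk". As a minimal sketch of how such a figure can be computed, the Python below ranks individuals by predicted risk, flags the top fraction, and reports the proportion of all cases that were flagged. The function name, argument names, and toy data are hypothetical and are not taken from the paper, and the sketch ignores the sampling weights that a design combining all cases with a 1% sample of controls would likely require.

```python
def sensitivity_at_top_fraction(risk_scores, outcomes, fraction):
    """Share of all positive outcomes found among the highest-risk `fraction`.

    risk_scores: predicted risks, one per individual (hypothetical input).
    outcomes:    1 if the individual was a case, 0 otherwise.
    fraction:    size of the flagged group as a proportion, e.g. 0.01 or 0.05.
    Assumption: every individual carries equal weight (no sampling weights).
    """
    if len(risk_scores) != len(outcomes):
        raise ValueError("risk_scores and outcomes must have the same length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total_cases = sum(outcomes)
    if total_cases == 0:
        raise ValueError("sensitivity is undefined when there are no cases")

    n = len(risk_scores)
    # Flag at least one individual, even for very small fractions.
    n_flagged = max(1, round(fraction * n))

    # Indices ordered from highest to lowest predicted risk.
    ranked = sorted(range(n), key=lambda i: risk_scores[i], reverse=True)
    captured = sum(outcomes[i] for i in ranked[:n_flagged])
    return captured / total_cases


# Toy illustration with made-up numbers (not study data).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_outcomes = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
top_20_sensitivity = sensitivity_at_top_fraction(toy_scores, toy_outcomes, 0.2)
```

In this toy example, flagging the top 20% captures one of the three cases. Applying the same calculation at fractions of 0.01 and 0.05 corresponds to the 1.0% and 5.0% thresholds described in the Results.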