CLINICAL PREDICTION MODELS TO FORECAST DEPRESSION IN PATIENTS WITH DIABETES AND APPLICATIONS IN DEPRESSION SCREENING POLICYMAKING

BY HAOMIAO JIN

A DISSERTATION SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN INDUSTRIAL AND SYSTEMS ENGINEERING

FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
MAY 2016

Table of Contents

Chapter 1 – Introduction
Chapter 2 - Paper One: Development of a Clinical Forecasting Model to Predict Comorbid Depression among Diabetes Patients and an Application in Depression Screening Policy Making
2.1 Introduction
2.2 Methods
2.2.1 Depression Measure
2.2.2 Data set
2.2.3 Candidate predictors and predictor selection
2.2.4 Model development and validation
2.2.5 Evaluating and comparing the model-based depression screening policy
2.3 Results
2.3.1 The PreDICD model
2.3.2 Evaluating and comparing the model-based depression screening policy
2.4 Discussion
References of Chapter 2
Chapter 3 - Paper Two: Developing Prediction Models to Improve Policymaking for Current and Future Depression Screening in Patients with Diabetes
3.1 Introduction
3.2 Methods
3.2.1 Research Framework
3.2.2 Dataset
3.2.3 Development of the Prediction Models
3.2.4 Evaluation of the Prediction Model Based Approach
3.3 Results
3.3.1 Prediction model to assess current risk of depression
3.3.2 Prediction models to assess future risk of depression
3.3.3 Evaluation of the Prediction Model Based Approach
3.4 Discussion
References of Chapter 3
Chapter 4 - Paper Three: Predicting Depression among Patients with Diabetes Using Longitudinal Data: A Multilevel Regression Model
4.1 Introduction
4.2 Objective
4.3 Methods
4.3.1 Predicted Outcome
4.3.2 Dataset
4.3.3 Modeling the Change of PHQ-9 Score Over Time
4.3.4 Using the Model for Prediction
4.3.5 Prediction Model Development
4.4 Results
4.5 Discussion
4.6 Conclusions
References of Chapter 4
Chapter 5 - Future Directions
Appendix I - Description of the Two Clinical Trials Used to Develop Clinical Prediction Models in the Dissertation
Acknowledgements

Chapter 1
Introduction

The field of Industrial and Systems Engineering (ISE) has a long history of designing and improving complex systems across many industries. Over the past decades, there has been particular interest in using the tools and techniques of ISE to improve the health care delivery system. In particular, growing attention has been devoted to applying evidence-based analytics to engineer more targeted and efficient care (1-3). Evidence-based predictive analytics encompasses a variety of techniques that analyze historical and/or current facts to make predictions about unknown outcomes.
Such analytics can be used to develop clinical prediction models that forecast outcomes related to patient health, such as disease occurrence and treatment effect. Clinical prediction is a key component in facilitating population health management (3, 4). That is, by using a clinical prediction model, health care providers can identify from a population those patients who are in need of care and then proactively engage them in the most effective interventions. For patients identified by the model as being without urgent need of care, providers can prioritize resources and time without compromising patients' health care needs. This dissertation organizes three original research articles that investigate the application of evidence-based predictive analytics to engineer population depression care management for patients with diabetes. Concurrent depression among patients with diabetes is a common but stigmatized and often undiagnosed condition (5-7). Prior studies have shown that the prevalence of concurrent depression among diabetes patients is about 20%, twice the prevalence of depression in the general population (8, 9), and that the rate of undiagnosed, and thus untreated, depression among diabetes patients is about 45% (7). An effective intervention to reduce undiagnosed depression and thereby improve clinical outcomes for patients with diabetes is to assign depression screening to those patients (10, 11). However, adoption of such an intervention is often impeded by its significant requirements for provider resources and time. This problem is particularly acute for providers with a high volume of patients who need to be screened (12-14). In addition, although the prevalence of depression is higher in patients with diabetes than in the general population, the majority of diabetes patients are still non-depressed. This indicates that there is great potential to improve the efficiency of depression screening. This dissertation uses clinical prediction to identify the diabetes patients who are in most need of depression screening, and thus to help providers save resources and time and improve efficiency by avoiding unnecessary depression screening of patients who are at low risk of depression. The dissertation comprises three studies, each of which examines a research question related to clinical prediction. Accordingly, three separate papers are presented as three chapters of the dissertation. Although each paper stands alone, together they examine the development of depression prediction models for diabetes care providers and the application of those models to improving the delivery of depression screening for patients with diabetes. All three studies use the same data set, combined from two clinical trials. These trials sampled the same population of diabetes patients at risk of depression from clinics of the Los Angeles County Department of Health Services. One trial included only depressed patients (15, 16) and the other included both depressed and non-depressed patients (17, 18). The trials were combined to obtain a larger and more balanced sample of depressed and non-depressed diabetes patients, so that the clinical forecasting is not biased toward one subgroup of patients. Appendix I provides further descriptions of the two trials.
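The population-management idea sketched above can be illustrated with a short example. The R code below (R is the software used to fit the models later in the dissertation) is only a minimal sketch on simulated data: it assumes a hypothetical data frame named patients with a binary depression indicator and a few illustrative predictors, fits a logistic regression, and flags for screening only the patients whose predicted risk exceeds a chosen threshold.

```r
# Minimal illustration of model-based screening triage (simulated data, not trial data).
set.seed(1)
patients <- data.frame(
  depressed       = rbinom(200, 1, 0.40),  # hypothetical outcome: depressed vs not
  female          = rbinom(200, 1, 0.67),
  n_complications = rpois(200, 1.3),
  mdd_history     = rbinom(200, 1, 0.11),
  chronic_pain    = rbinom(200, 1, 0.27)
)

# Fit a logistic regression relating the predictors to depression status
fit <- glm(depressed ~ female + n_complications + mdd_history + chronic_pain,
           data = patients, family = binomial)

# Predicted probability of depression for each patient
patients$risk <- predict(fit, type = "response")

# Partial-screening rule: screen only patients above a risk threshold
patients$screen <- patients$risk >= 0.5
mean(patients$screen)   # share of the population that would be screened
```

In a real deployment the threshold would be tuned to the provider's tolerance for missed cases versus screening workload, which is exactly the trade-off examined in the first paper.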
In the first article, a clinical prediction model for major depression is developed using supervised learning techniques, based on predictors that are related to diabetes care and easy to acquire in clinical practice and that were identified from the baseline data of the two trials. The derived prediction model can support a partial-screening policy that assigns depression screening only to those diabetes patients predicted by the model to be depressed. The rate of depression identification and measures relevant to provider resources and time are compared between the model-based policy and alternative policies that can be used in clinical practice, including universal depression screening of every diabetes patient and heuristic-based partial-screening policies based on patient characteristics such as depression history, severity of diabetes, or either criterion. The model-based approach proposed in this article is expected to help providers save resources and time, improve efficiency, and achieve better policymaking regarding depression screening.

Building on the first article, the second article extends the scope to repeated depression screening and develops a model-based approach to systematically manage current and future depression screening. Because depression is often a life-long and highly recurrent condition (19, 20), repeated screening is important for tracking changes in depression symptoms and thus for diagnosing and treating major depression in a timely fashion. A prior study suggests that providers assign one-time depression screening to most individuals and may elect to assign subsequent screening based on patient characteristics (21). However, which patient characteristics warrant an increased frequency of depression screening remains a research question. The second article further enhances the model from the first article and develops two new models to forecast future depression, specifically the presence of major depression in 6 months. The operational performance of the prediction model based approach, which assigns current and future depression screening to diabetes patients at high current and/or future risk of depression, is evaluated and compared with three alternative policies. Implications for better depression screening policymaking are discussed.

An important limitation of the prediction models developed in the first two papers is that they make predictions from predictors measured at a single time point and thus cannot use the longitudinal information that is typically obtainable from patients' historical clinic visits. The third paper focuses on this challenge in predictive analytics and develops a generalized multilevel regression model that uses a patient's historical clinical course to predict the occurrence of depression among patients with diabetes. Two types of prediction, population-average prediction and subject-specific prediction, are derived and compared to examine the influence of incorporating historical patient records when predicting depression among patients with diabetes.

The dissertation is organized into five chapters. In this chapter (Chapter 1), the general background of the dissertation is introduced and the contents of the three papers are briefly described. Chapters 2 to 4 each present one of the three papers. In the last chapter (Chapter 5), future directions for research are discussed.

References of Chapter 1

1. Reid, P. P., Compton, W. D., Grossman, J. H., Fanjiang, G.
Building a better delivery system: a new engineering/health care partnership. Washington, DC: National Academics Press; 2005. 2. Valdez, R. S., Ramly, E. Industrial and systems engineering and health care: Critical areas of research: final report. Agency for Healthcare Research and Quality, US Department of Health and Human Services. 2010. 3. Institute for Health Technology Transformation. Population health management: a roadmap for provider-based automation in a new era of health care. Institute for Health Technology Transformation, 2012. New York, NY. 4. Bellazzi, R., Zupan, B. Predictive data mining in clinical medicine: current issues and guidelines. Int J Med Inform 2008;77(2):81-97. 5. Berger, M., Wagner, T. H., Baker, L. C. Internet use and stigmatized illness. Social Sci Med 2005;61(8):1821-27. 6. Brown, C., Conner, K. O., Copeland, V. C., Grote, N, et al. Depression stigma, race, and treatment seeking behavior and attitudes. J Community Psychol 2010;38(3):350-68. 7. Li, C., Ford, E. S., Zhao, G., Ahluwalia, I. B., et al. Prevalence and correlates of undiagnosed depression among US adults with diabetes: the Behavioral Risk Factor Surveillance System, 2006. Diabetes Res Clin Pract 2009;83(2):268-79. 8. Ducat, L., Philipson, L. H., Anderson, B. J. The mental health comorbidities of diabetes. JAMA 2014;312(7):691-2. 9. Egede, L. E., Zheng, D., Simpson, K. Comorbid depression is associated with increased health care use and expenditures in individuals with diabetes. Diabetes Care 2002;25(3):464-470. 10. Ell, K., Katon, W., Cabassa, L. J., Xie, B., Lee, P. J, et al. Depression and Diabetes among Low-Income Hispanics: Design Elements of a Socio-Culturally Adapted 5 Collaborative Care Model Randomized Controlled Trial. Int J Psychiatry Med 2009;39(2):113-32. 11. Ell, K., Katon, W., Xie, B., Lee, P.-J, et al. Collaborative Care Management of Major Depression Among Low-Income, Predominantly Hispanic Subjects With Diabetes A randomized controlled trial. Diabetes Care 2010;33(4):706-13. 12. Klinkman, M. S Competing demands in psychosocial care: a model for the identification and treatment of depressive disorders in primary care. Gen Hosp Psychiatry 1997;19(2):98-111. 13. Kroenke, K. Discovering depression in medical patients: reasonable expectations. Ann Intern Med 1997;126(6):463-5. 14. Schierhout, G., Nagel, T., Si, D. M., Connors, C., et al. Do competing demands of physical illness in type 2 diabetes influence depression screening, documentation and management in primary care: a cross-sectional analytic study in Aboriginal and Torres Strait Islander primary health care settings. Int J Ment Health Syst 2013;7(1):16. 15. Ell, K., Xie, B., Quon, B., Quinn, D. I., et al. Randomized controlled trial of collaborative care management of depression among low-income patients with cancer. J Clin Oncol 2008;26(27):4488-96. 16. Ell, K., Xie, B., Kapetanovic, S., Quinn, D. I., et al. One-Year Follow-Up of Collaborative Depression Care for Low-Income, Predominantly Hispanic Patients With Cancer. Psychiatr Serv 2011;62(2):162-70. 17. Wu, S., Ell, K., Gross-Schulman, S. G., Sklaroff, L. M., et al. Technology-facilitated depression care management among predominantly Latino diabetes patients within a public safety net care system: Comparative effectiveness trial design. Contemp Clin Trials 2014;37(2):342-54. 18. Wu, S., Vidyanti, I., Liu, P., Hawkins, C., et al. Patient-Centered Technological Assessment and Monitoring of Depression for Low-Income Patients. J Ambul Care Manage 2014;37(2):138-47. 6 19. 
Burcusa, S. L., Iacono, W. G. Risk for recurrence in depression. Clin Psychol Rev 2007;27(8):959-89. 20. Kennedy, N., Abbott, R., Paykel, E. Remission and recurrence of depression in the maintenance era: long-term outcome in a Cambridge cohort. Psychol Med 2003;33(5):827-38. 21. Nease, D. E., Malouin, J. M. Depression screening: a practical strategy. J Fam Pract 2003;52(2):118-26.

Chapter 2
Paper One: Development of a Clinical Forecasting Model to Predict Comorbid Depression among Diabetes Patients and an Application in Depression Screening Policy Making

Abstract

Introduction: Depression is a common but often undiagnosed comorbid condition of people with diabetes. Mass screening can detect undiagnosed depression but may require significant resources and time. The objectives of this study were 1) to develop a clinical forecasting model that predicts comorbid depression among patients with diabetes and 2) to evaluate a model-based screening policy that saves resources and time by screening only patients considered depressed by the clinical forecasting model.

Methods: We trained and validated 4 machine learning models by using data from 2 safety-net clinical trials; we chose the one with the best overall predictive ability as the ultimate model. We compared the model-based policy with alternative policies, including mass screening and partial screening on the basis of depression history or diabetes severity.

Results: Logistic regression had the best overall predictive ability of the 4 models evaluated and was chosen as the ultimate forecasting model. Compared with mass screening, the model-based policy can save approximately 50% to 60% of provider resources and time but will miss identifying about 30% of patients with depression. A partial-screening policy based on depression history alone identified only a low rate of depression. Two other heuristic-based partial screening policies identified depression at rates similar to those of the model-based policy but cost more in resources and time.

Conclusion: The depression prediction model developed in this study has compelling predictive ability. By adopting the model-based depression screening policy, health care providers can make better use of their resources and time and increase their efficiency in managing their patients with depression.

Keywords: Decision support techniques; Clinical prediction rule; Chronic disease; Diabetes mellitus; Depression; Comorbidity; Mass screening; Population management; Machine learning; Logistic regression

* This paper has been published in Preventing Chronic Disease. Please cite: Jin H, Wu S, Di Capua P. Development of a Clinical Forecasting Model to Predict Comorbid Depression Among Diabetes Patients and an Application in Depression Screening Policy Making. Prev Chronic Dis. 2015;12:E142.

2.1 Introduction

Clinical forecasting analyzes current and historical facts to predict clinical outcomes. Such forecasting has important applications for underdiagnosed conditions such as comorbid depression among patients with diabetes (1,2), who are twice as likely to suffer depression as the general population (prevalence, 10%–15%) (3,4). For approximately 45% of patients with diabetes, depression goes undiagnosed (3). Mass depression screening improves diagnosis rates (5) but requires significant resources, which prevents providers (6), especially providers in resource-constrained safety-net clinics (7), from adopting this screening method.
Providers could screen only diabetes patients at high risk of depression, but the complex relationships between depression and its risk factors make it difficult to identify only patients at high risk (8). Machine learning methods can automatically detect patterns in data and use the patterns to predict future data (9). Machine learning is related to statistics but emphasizes individual-level prediction rather than population-level inference (10). Machine learning was used to develop prediction models for outcomes such as mortality (11,12) and depression (13–15). The objectives of our study, Predicting Diabetes Patients with Comorbid Depression (PreDICD), were 1) to apply machine learning methods to developing an individual-level clinical forecasting model by using diabetes care-related predictors that are easy to acquire or are recommended in clinical practice and 2) to evaluate a model-based screening policy that assigns depression screening only to patients predicted as being depressed by the model. Such a model could save time and resources by not screening patients predicted as nondepressed unless warranted by further model forecasting or clinical observation. 2.2 Methods We developed the PreDICD model by using machine learning methods. Then, we compared the model-based screening policy with mass screening to evaluate the policy’s influence on provider resources and time and on the rate of depression identification. We also compared the model- based policy with 3 heuristic-based partial screening policies that assign depression screening to patients with certain risk factors (including depression history or severe diabetes or both) and assessed the implications for provider’s choice of depression screening policy. 10 2.2.1 Depression Measure The study measured depression by using Patient Health Questionnaires PHQ-9 and PHQ-2, well- validated tools for depression screening (16,17). PHQ-9 consists of 9 questions that are the same 9 criteria used for the diagnosis of depressive disorders as defined by the Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM-IV). Each question has 4 ordinal responses with assigned scores from 0 to 3; therefore, the overall scale has a possible score from 0 to 27, where the higher scores indicate more severe depression. PHQ-2 consists of the first 2 PHQ-9 questions. PHQ-2 often serves as a fast screening tool (16, 17); a score of 3 or higher on PHQ-2 warrants a PHQ-9 evaluation to formally diagnose a depressive disorder or assess severity of depression (17). Major depression (the predicted outcome in this study) is indicated by a PHQ-9 score of 10 or higher. Validity of this cutoff point has been established by Kroenke, Spitzer (16). 2.2.2 Data set Data used to develop the PreDICD model were obtained from 2 clinical trials with underserved, predominantly Hispanic, patients with diabetes: the Diabetes-Depression Care-Management Adoption Trial (DCAT) and the Multifaceted Diabetes and Depression Program (MDDP). DCAT is a comparative effectiveness study conducted from 2010 through 2013 in safety-net clinics in the Los Angeles County Department of Health Services (LACDHS), the second largest safety-net healthcare system in the United States. DCAT tested an automated telephone depression screening and monitoring system integrated with a collaborative care management program to facilitate adoption of a collaborative depression care model (18,19). 
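As a concrete illustration of the PHQ scoring rules described in section 2.2.1, the following R sketch computes a PHQ-9 total and applies the cutoffs used in this study. It uses simulated item responses rather than trial data and is intended only to illustrate the 2-step screening logic.

```r
# Illustrative PHQ scoring (simulated responses; each item is scored 0-3).
set.seed(2)
items <- matrix(sample(0:3, 9 * 5, replace = TRUE), nrow = 5,
                dimnames = list(NULL, paste0("phq", 1:9)))

phq9_total <- rowSums(items)          # total score, 0-27
phq2_total <- rowSums(items[, 1:2])   # first 2 items, 0-6

# Cutoffs used in this study
major_depression <- phq9_total >= 10  # PHQ-9 cutoff for major depression
needs_phq9       <- phq2_total >= 3   # 2-step rule: PHQ-2 >= 3 triggers PHQ-9

data.frame(phq2_total, needs_phq9, phq9_total, major_depression)
```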
MDDP is a randomized trial conducted from 2005 through 2009 testing the collaborative depression care model for underserved LACDHS patients with comorbid depression and diabetes. The 2 trials are described elsewhere (5,19). The combined data sets provided the important benefit of balancing the proportions of depressed (PHQ-9 score ≥10, 43.8%) and nondepressed patients (PHQ-9 score <10, 56.2%). In a prior analysis (20), we investigated the use of DCAT data alone to predict depression. Because nondepressed instances dominated over depressed instances in the DCAT data, the derived model tended to overfit to the nondepressed instances. The balanced proportions of depressed and nondepressed patients help the PreDICD model avoid overfitting to either nondepressed or depressed instances and thus improve the predictive ability of the model.

2.2.3 Candidate predictors and predictor selection

We identified 20 candidate predictors from the combined DCAT–MDDP data in accordance with 2 criteria: 1) the candidate predictors were relevant to diabetes care and measured aspects that 2 prior systematic reviews (4,8) support as being correlated with depression, and 2) the candidate predictors were typically obtainable from electronic medical records (EMR) or were recommended for providers to routinely collect during diabetes clinic visits. The 20 candidate predictors are summarized in Table 1. They included common demographics, diabetes characteristics, depression history, other health conditions, and level of health care use. From the candidate predictors we selected predictors for developing the PreDICD model. Available selection methods were variable ranking, subset evaluation, and the wrapper method (21). For this study, we adopted a correlation-based subset evaluation method developed by MA Hall (unpublished doctoral dissertation, Correlation-Based Feature Selection for Machine Learning. Hamilton (Waikato Region, New Zealand): The University of Waikato; 1999) that searches the predictor space with a greedy hill-climbing algorithm and aims to select a subset of predictors that are highly correlated with the outcome measure while having low intercorrelation. This predictor selection procedure was carried out with the machine learning software Weka, version 3.6.11 (Slashdot Media).

2.2.4 Model development and validation

To derive the appropriate model, we trained and cross-validated (10-fold) 2 linear machine learning models, logistic regression (with Ridge parameter to improve predictive ability [22]) and multilayer perceptron, and 2 nonlinear models, support vector machine (SVM) and random forest. Model selection was based on the 4 models' predictive ability. The primary criterion was the area under the receiver operating characteristic curve (AUROC), where a larger AUROC indicates better overall predictive ability. We also evaluated the percentages of correctly classified instances, sensitivity, and specificity. We used the model with the best overall predictive ability, measured by AUROC, as the ultimate PreDICD model. Model validation was also carried out in Weka, version 3.6.11, and the ultimate PreDICD model was fitted in R, version 3.1.1 (https://cran.r-project.org/bin/windows/base/old/3.1.1/), by using the whole data set.

2.2.5 Evaluating and comparing the model-based depression screening policy

The PreDICD model can support a model-based screening policy that assigns depression screening only to patients predicted by the model to be depressed.
We compared the model-based policy with mass depression screening to evaluate the influence of the model-based policy on provider resources and time and on the rate of depression identification. In addition, we compared the model-based policy to 3 heuristic-based partial screening policies used by providers to save resources and time. The first heuristic, which requires depression screening for patients with a previous diagnosis of major depressive disorder, is based on the fact that depression is a highly recurrent disease (23). The second heuristic, which requires depression screening for patients with severe diabetes (hemoglobin A1c ≥9.0%), is based on the evidence that diabetes and depression are often comorbid conditions (3,4). The third heuristic combines the other 2, requiring patients with either a previous diagnosis of major depressive disorder or severe diabetes to be screened for depression. We evaluated the model-based policy and compared it with mass screening and the 3 heuristic-based policies under the clinical context that the PHQ is used for depression screening. We assumed the scenario in which patients meeting screening policy inclusion criteria were evaluated using the 2-step PHQ screening suggested by Kroenke et al (17): PHQ-2 is first assigned, and then patients with a PHQ-2 score of 3 or higher are further evaluated by PHQ-9. We compared the rate of depression identification and 3 measures relevant to provider resources and time: proportion of patients receiving PHQ-2 screening, proportion of patients receiving PHQ-9 screening, and the number of questions asked per patient. We further evaluated and compared policies in another scenario in which the 2-step PHQ screening is bypassed in favor of complete PHQ-9 screening for all patients meeting screening policy inclusion criteria. We compared the same measures as in the first scenario. To evaluate the model-based policy, we trained the PreDICD model on the combined DCAT–MDDP data; however, we cross-validated (10-fold) only the DCAT data. That is, we randomly divided the samples from DCAT into 10 roughly equal parts. In each single round of validation, samples from 9 of the 10 parts of DCAT data plus samples from MDDP were used to train the prediction model; we then validated the trained model on samples from the remaining part. Mass screening and the 3 heuristic-based policies were also evaluated only on the DCAT data. Because the DCAT data included data on both depressed and nondepressed patients, they represented the LACDHS safety-net population better than the MDDP data. All comparisons were 2-sided and carried out with the statistical software R, version 3.1.1.

2.3 Results

2.3.1 The PreDICD model

We identified 1,793 patients from the combined DCAT and MDDP data. The MDDP trial enrolled only depressed patients with diabetes (PHQ-9 score ≥10), and the DCAT trial enrolled both depressed (PHQ-9 score ≥10, 28.4%) and nondepressed (PHQ-9 score <10, 71.6%) patients with diabetes (Table 1). The combined sample was predominantly Hispanic with balanced proportions of depressed and nondepressed patients (PHQ-9 score ≥10, 43.8%; PHQ-9 score <10, 56.2%).
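The screening-policy measures defined in section 2.2.5 can be computed with a few lines of code. The R sketch below is only an illustration on simulated data; the variable names (phq2, phq9, predicted_depressed) are hypothetical and the question counts assume, as in Table 3, that a patient answers the 2 PHQ-2 items and then, if triggered, the remaining 7 PHQ-9 items.

```r
# Illustrative computation of the 2-step screening measures from section 2.2.5
# (simulated data; hypothetical column names).
set.seed(3)
n <- 500
dat <- data.frame(
  phq2                = sample(0:6,  n, replace = TRUE),
  phq9                = sample(0:27, n, replace = TRUE),
  predicted_depressed = rbinom(n, 1, 0.32) == 1   # output of the prediction model
)
dat$truly_depressed <- dat$phq9 >= 10

# 2-step PHQ screening assigned only to patients the model flags
dat$gets_phq2 <- dat$predicted_depressed
dat$gets_phq9 <- dat$gets_phq2 & dat$phq2 >= 3

# Measures compared across policies
prop_phq2  <- mean(dat$gets_phq2)                          # proportion receiving PHQ-2
prop_phq9  <- mean(dat$gets_phq9)                          # proportion receiving PHQ-9
questions  <- mean(2 * dat$gets_phq2 + 7 * dat$gets_phq9)  # questions asked per patient
ident_rate <- sum(dat$gets_phq9 & dat$truly_depressed) / sum(dat$truly_depressed)
c(prop_phq2, prop_phq9, questions, ident_rate)
```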
We used a correlation-based subset evaluation predictor selection method for the PreDICD model to select 7 predictors that are highly correlated with major depression and have low intercorrelation: 1) female, 2) Toobert diabetes self-care, 3) total number of diabetes complications, 4) previous diagnosis of major depressive disorder, 5) number of ICD-9 diagnoses in past 6 months, 6) chronic pain, and 7) self-rated health status. We trained 4 machine learning models (logistic regression, multilayer perceptron, SVM, and random forest) by using the 7 selected predictors. On the basis of the 10-fold cross-validation results, we chose logistic regression as the ultimate PreDICD model because it outperformed the other 3 models in AUROC (logistic regression = 0.81, multilayer perceptron = 0.80, SVM = 0.73, random forest = 0.78). The logistic regression model also had the highest percentage of correctly classified instances of depression (logistic regression = 74.0%, multilayer perceptron = 73.5%, SVM = 71.6%, random forest = 72.6%) and sensitivity (logistic regression = 0.65, multilayer perceptron = 0.55, SVM = 0.61, random forest = 0.65), and the second highest specificity (logistic regression = 0.81, multilayer perceptron = 0.88, SVM = 0.80, random forest = 0.79) among the 4 models. The predictors of depression used for the PreDICD model are listed in Table 2. The results show that the following 5 predictors collectively increased the likelihood that the patient would be depressed: female (odds ratio [OR] = 2.35, P < .001), total number of complications from diabetes (OR = 1.35, P < .001), a history of major depressive disorder (OR = 4.03, P < .001), number of comorbidities, measured by the number of ICD-9 diagnoses in previous 6 months (OR = 1.03, P = .04), and chronic pain (OR = 2.13, P < .001). Two predictors decreased the likelihood that the patient would be depressed: good diabetes self-care, measured by Toobert diabetes self-care (OR = 0.66, P < .001), and self-rated good health status (OR = 0.45, P < .001).

2.3.2 Evaluating and comparing the model-based depression screening policy

The policy that assigns 2-step PHQ screening only to patients predicted by the PreDICD model as being depressed was compared with mass screening and with 3 heuristic-based partial screening policies. Results (Table 3) show that, compared with mass screening, the model-based policy can save resources and time; specifically, the policy reduces the proportion of patients receiving PHQ-2 screening from 100% to 32.3%, the proportion of patients receiving PHQ-9 screening from 29.1% to 16.5%, and the number of screening questions asked per patient from about 4 to 1.8. However, the model-based policy also decreases the rate of depression identification from about 80% to 50%. The heuristic-based policy that assigned 2-step PHQ screening to patients with a previous diagnosis of major depressive disorder could identify only about 20% of depressed patients. Compared with the model-based policy, the other 2 heuristic-based policies had rates of depression identification that did not differ significantly but cost significantly more in provider resources and time.
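For readers who want to reproduce the translation from logistic regression coefficients to odds ratios of the kind reported in Table 2, the R sketch below shows the standard computation. It refits a small model on simulated data (hypothetical variables, not the PreDICD data), since exponentiating a coefficient and its confidence limits is the same operation regardless of the data set.

```r
# Odds ratios and 95% confidence intervals from a fitted logistic regression
# (simulated data; the computation is identical for the PreDICD model).
set.seed(4)
d <- data.frame(depressed    = rbinom(300, 1, 0.40),
                female       = rbinom(300, 1, 0.67),
                chronic_pain = rbinom(300, 1, 0.27))
fit <- glm(depressed ~ female + chronic_pain, data = d, family = binomial)

est <- coef(summary(fit))              # estimate, SE, z value, P value
or  <- exp(est[, "Estimate"])          # odds ratio = exp(coefficient)
ci  <- exp(confint.default(fit))       # Wald 95% CI on the odds-ratio scale
round(cbind(OR = or, ci, P = est[, "Pr(>|z|)"]), 3)
```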
A comparison of the model-based depression screening policy using 1-step PHQ-9 with mass PHQ-9 screening (Table 3) revealed that the model-based policy saved provider resources and time; specifically, the policy reduced the proportion of patients receiving PHQ-9 screening from 100% to 32.3% and the number of screening questions asked per patient from about 9 to 2.9. The rate of depression identification, however, decreased from 100% to about 63%. The heuristic-based policy that assigns PHQ-9 screening to patients with a previous diagnosis of major depressive disorder had a low (20.6%) depression identification rate. Similar to the results for 2-step PHQ screening, the other 2 heuristic-based policies had rates of depression identification that did not differ significantly from the model-based policy but cost significantly more in provider resources and time.

2.4 Discussion

The PreDICD study developed a clinical forecasting model predicting the occurrence of depression among patients with diabetes by using data from 2 clinical trials. The study considered 20 candidate predictors and compared 4 machine learning models: logistic regression, multilayer perceptron, SVM, and random forest. The ultimate PreDICD model is logistic regression, with 7 predictors in the model: 1) female, 2) Toobert diabetes self-care, 3) total number of diabetes complications, 4) previous diagnosis of major depressive disorder, 5) number of ICD-9 diagnoses in previous 6 months, 6) presence of chronic pain, and 7) self-rated health status. Five of the 7 predictors typically can be acquired from EMR: female sex, total number of diabetes complications, previous diagnosis of major depressive disorder (ICD-9 diagnosis codes 296.2 and 296.3), number of ICD-9 diagnoses in previous 6 months, and presence of chronic pain (ICD-9 diagnosis code 338.2). Diabetes treatment guidelines recommend that health care providers collect data on 2 of the predictors during clinic visits: the Toobert diabetes self-care scale, because most of the day-to-day care inherent in diabetes is handled by patients or their families (24), and self-rated health status, because it is strongly correlated with clinical outcomes such as mortality (25). Three prior studies also predicted the occurrence of depression on the basis of health-related data. King et al (13) developed a model that forecasts depression diagnosed by DSM-IV major depression criteria from prospectively collected data from Europe and Chile; and Wang et al (14) developed a similar prediction model by using data from a US national survey. Huang et al (15) developed a prediction model for depression, measured by PHQ-9, from the EMR of a health system. The PreDICD model has comparable predictive ability (AUROC = 0.81) to those 3 studies (AUROC = 0.75–0.85). However, we emphasize that the predictive ability of those studies cannot be easily compared because they either focused on different patient populations or used different depression measures as the outcome. The model-based screening policy that assigns depression screening only to patients predicted as being depressed by the PreDICD model can improve efficiency in identifying depressed patients with diabetes compared with mass screening (ie, saving about 50% to 60% of provider resources and time at the price of missing identification of about 30% of patients with depression).
Such a finding is an encouraging step toward implementing a decision-support system based on available medical information that allows providers to better prioritize the use of resources and time. As health delivery systems increasingly take on responsibility for managing population health, model-based screening can help providers reach out to patients who are identified as at-risk by the model. For example, the National Committee for Quality Assurance’s standard requires patient-centered medical homes to provide depression screening (26). The PreDICD model- based policy could establish a preliminary screening step for medical homes to routinely survey patients and target high-risk patients, especially nonengaged ones, for depression screening. Our findings also suggest that providers should refrain from using heuristic-based screening policies that assign depression screening to patients with diabetes and a history of depression, severe diabetes, or both, because those policies either have low rates of depression identification or higher cost in provider resources and time than the model-based policy. This study has several limitations. The PreDICD model combines 2 data sets with somewhat different populations recruited at different times and does not account for possible cohort and period effects on the health conditions of the study populations. Study patients were predominantly Hispanics from the safety-net population with diabetes, which may limit the generalizability of the PreDICD model to wider patient populations because underlying determinants of depression may differ by racial/ethnic group (27). Culling available medical 17 information introduces limitations, including limitations on accuracy and completeness of ICD-9 codes, and the total number of diabetes complications. Another limitation is that 2 of the 7 predictors, Toobert diabetes self-care scores and self-rated health status, despite recommendations, are not currently available in many medical practices. This could reduce the benefit from using the model-based policy if practitioners need to expend additional effort to collect information for those predictors. However, an Institute of Medicine committee recommended ways to cull EMR to capture social and behavioral determinants of health (28). If this recommendation is implemented, information availability may not be a barrier to adopting the model-based policy. Future work should validate and refine the PreDICD model for broader patient populations to improve its generalizability. Also, research to extend the PreDICD model from predicting current depression to forecasting future depression could help health care providers to identify patients with diabetes who are at high future risk of depression and thus warrant repeated depression screening. The model could alternatively be extended from single-level to multilevel logistic regression to account for possible cohort and period effects, and thus improve the model’s predictive ability. The model should also be tested in a clinical environment to verify the feasibility of implementing a decision-support system and to evaluate its influences on clinical outcomes and operations, including costs and cost-savings. Finally, the machine learning methods demonstrated in the study can be applied to predicting clinical outcomes related to other conditions and could be useful in future research initiatives, such as the National Institutes of Health’s recently launched Precision Medicine Initiative (29). 
Our PreDICD study developed a prediction model with compelling predictive ability for forecasting comorbid depression among patients with diabetes. Adopting such a model-based policy has the potential to outperform other heuristic approaches by better assisting health care providers to increase efficiency in managing their patients with depression and better prioritize the use of their resources and time to deliver effective care for high-risk patients. 18 References of Chapter 2 1. Brown C, Conner KO, Copeland VC, Grote N, Beach S, Battista D, et al. Depression stigma, race, and treatment seeking behavior and attitudes. J Community Psychol 2010;38(3):350–68. http://dx.doi.org/10.1002/jcop.20368 PubMed 2. Li C, Ford ES, Zhao G, Ahluwalia IB, Pearson WS, Mokdad AH. Prevalence and correlates of undiagnosed depression among US adults with diabetes: the Behavioral Risk Factor Surveillance System, 2006. Diabetes Res Clin Pract 2009;83(2):268–79. http://dx.doi.org/10.1016/j.diabres.2008.11.006 PubMed 3. Ducat L, Philipson LH, Anderson BJ. The mental health comorbidities of diabetes. JAMA 2014;312(7):691–2. http://dx.doi.org/10.1001/jama.2014.8040 PubMed</jrn> 4. Roy T, Lloyd CE. Epidemiology of depression and diabetes: a systematic review. J Affect Disord 2012;142(Suppl):S8–21. http://dx.doi.org/10.1016/S0165-0327(12)70004- 6 PubMed 5. Ell K, Katon W, Xie B, Lee P-J, Kapetanovic S, Guterman J, et al. Collaborative care management of major depression among low-income, predominantly Hispanic subjects with diabetes: a randomized controlled trial. Diabetes Care 2010;33(4):706–13. http://dx.doi.org/10.2337/dc09-1711 PubMed 6. Kroenke K. Discovering depression in medical patients: reasonable expectations. Ann Intern Med 1997;126(6):463–5. http://dx.doi.org/10.7326/0003-4819-126-6-199703150- 00008 PubMed 7. Taylor TB. Threats to the health care safety net. Acad Emerg Med 2001;8(11):1080–7. http://dx.doi.org/10.1111/j.1553-2712.2001.tb01119.x PubMed 8. Dobson KS. Risk factors in depression. Waltham (MA): Acadmic Press; 2011. 9. Murphy KP. Machine learning: a probabilistic perspective. Boston (MA): MIT Press; 2012. 10. Breiman L. Statistical modeling: the two cultures. Stat Sci 2001;16(3):199–231. http://dx.doi.org/10.1214/ss/1009213726 19 11. Austin PC, Tu JV, Lee DS. Logistic regression had superior performance compared with regression trees for predicting in-hospital mortality in patients hospitalized with heart failure. J Clin Epidemiol 2010;63(10):1145–55. http://dx.doi.org/10.1016/j.jclinepi.2009.12.004 PubMed 12. Rose S. Mortality risk score prediction in an elderly population using machine learning. Am J Epidemiol 2013;177(5):443–52. http://dx.doi.org/10.1093/aje/kws241 PubMed 13. King M, Walker C, Levy G, Bottomley C, Royston P, Weich S, et al. Development and validation of an international risk prediction algorithm for episodes of major depression in general practice attendees: the PredictD study. Arch Gen Psychiatry 2008;65(12):1368–76. http://dx.doi.org/10.1001/archpsyc.65.12.1368 PubMed 14. Wang J, Sareen J, Patten S, Bolton J, Schmitz N, Birney A. A prediction algorithm for first onset of major depression in the general population: development and validation. J Epidemiol Community Health 2014;68(5):418–24. http://dx.doi.org/10.1136/jech-2013- 202845 PubMed 15. Huang SH, LePendu P, Iyer SV, Tai-Seale M, Carrell D, Shah NH. Toward personalizing treatment for depression: predicting diagnosis and severity. J Am Med Inform Assoc 2014;21(6):1069–75. 
http://dx.doi.org/10.1136/amiajnl-2014-002733 PubMed 16. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001;16(9):606–13. http://dx.doi.org/10.1046/j.1525- 1497.2001.016009606.x PubMed 17. Kroenke K, Spitzer RL, Williams JB. The Patient Health Questionnaire-2: validity of a two-item depression screener. Med Care 2003;41(11):1284–92. http://dx.doi.org/10.1097/01.MLR.0000093487.78664.3C PubMed 18. Wu B, Jin H, Vidyanti I, Lee P-J, Ell K, Wu S. Collaborative depression care among Latino patients in diabetes disease management, Los Angeles, 2011-2013. Prev Chronic Dis 2014;11:E148. http://www.cdc.gov/pcd/issues/2014/14_0081.htm Accessed April 28, 2015 http://dx.doi.org/10.5888/pcd11.140081 PubMed 20 19. Wu S, Ell K, Gross-Schulman SG, Sklaroff LM, Katon WJ, Nezu AM, et al. Technology- facilitated depression care management among predominantly Latino diabetes patients within a public safety net care system: comparative effectiveness trial design. Contemp Clin Trials 2014;37(2):342–54. http://dx.doi.org/10.1016/j.cct.2013.11.002 PubMed 20. Jin H, Wu S. Developing depression symptoms prediction models to improve depression care outcomes: preliminary results. Proceedings of the 2nd International Conference on Big Data and Analytics in Healthcare; 2014 Jun 22–24; Singapore. 21. Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res 2003;3:1157–82. 22. Le Cessie S, Van Houwelingen JC. Ridge estimators in logistic regression. Appl Stat 1992;41(1):191–201. http://dx.doi.org/10.2307/2347628 23. Kennedy N, Abbott R, Paykel ES. Remission and recurrence of depression in the maintenance era: long-term outcome in a Cambridge cohort. Psychol Med 2003;33(5):827–38. http://dx.doi.org/10.1017/S003329170300744X PubMed 24. Toobert DJ, Hampson SE, Glasgow RE. The summary of diabetes self-care activities measure: results from 7 studies and a revised scale. Diabetes Care 2000;23(7):943–50. http://dx.doi.org/10.2337/diacare.23.7.943 PubMed 25. McEwen LN, Kim C, Haan MN, Ghosh D, Lantz PM, Thompson TJ, et al. Are health- related quality-of-life and self-rated health associated with mortality? Insights from Translating Research Into Action for Diabetes (TRIAD). Prim Care Diabetes 2009;3(1):37–42. http://dx.doi.org/10.1016/j.pcd.2009.01.001 PubMed 26. Patient Centered Medical Home. Patient Centered Medical Home 2014 standards. Washington (DC): National Committee for Quality Assurance; 2014. 27. Riolo SA, Nguyen TA, Greden JF, King CA. Prevalence of depression by race/ethnicity: findings from the National Health and Nutrition Examination Survey III. Am J Public Health 2005;95(6):998–1000. http://dx.doi.org/10.2105/AJPH.2004.047225 PubMed 21 28. Adler NE, Stead WW. Patients in context — EHR capture of social and behavioral determinants of health. N Engl J Med 2015;372(8):689–701. http://dx.doi.org/10.1056/NEJMp1413945 PubMed 29. Precision medicine initiatives. Washington (DC): National Institutes of Health. http://www.nih.gov/precisionmedicine/. Accessed April 28, 2015. 22 Table 1. 
Data on Patients (N = 1,793) Served by Los Angeles County Safety-Net Clinics, DCAT (2010–2013) and MDDP (2005–2009), Used to Train and Validate the PreDICD Prediction Model

Parameter | DCAT: N (a), Statistics (b) | MDDP: N (a), Statistics (b) | P (c) | Combined data set: N (a), Statistics (b)

Depression symptoms
PHQ-9 (possible score: 0–27; higher = more severe depression) | 1,406, 6.67 (6.00) | 387, 14.72 (2.95) | <.001 | 1,793, 8.41 (6.41)
PHQ-9 score ≥10 | 1,406, 399 (28.38%) | 387, 387 (100%) | <.001 | 1,793, 786 (43.84%)

Demographics
Age, y | 1,406, 53.27 (9.24) | 387, 53.97 (8.74) | .17 | 1,793, 53.42 (9.13)
Hispanic/Latino | 1,403, 1,254 (89.38%) | 387, 372 (96.12%) | <.001 | 1,790, 1,626 (90.84%)
BMI | 1,385, 32.73 (7.28) | 383, 32.90 (7.55) | .69 | 1,768, 32.77 (7.34)
Female | 1,406, 892 (63.44%) | 387, 318 (82.17%) | <.001 | 1,793, 1,210 (67.48%)

Diabetes characteristics
Years with diabetes | 1,379, 10.27 (7.64) | 385, 10.32 (8.60) | .92 | 1,764, 10.28 (7.86)
Hemoglobin A1c (%) | 1,344, 9.24 (2.12) | 374, 9.03 (2.19) | .10 | 1,718, 9.19 (2.14)
Hemoglobin A1c tested | 1,406, 1,344 (95.59%) | 387, 374 (96.64%) | .36 | 1,793, 1,718 (95.82%)
Toobert diabetes self-care (range 0–7; higher = better diabetes self-care) | 1,406, 4.33 (1.31) | 387, 3.38 (1.45) | <.001 | 1,793, 4.12 (1.40)
Total number of diabetes complications | 1,406, 1.27 (1.15) | 387, 1.45 (1.04) | .004 | 1,793, 1.31 (1.13)
On insulin | 1,406, 742 (52.77%) | 387, 107 (27.65%) | <.001 | 1,793, 849 (47.35%)
On diabetes oral medication | 1,406, 1,227 (87.27%) | 387, 321 (82.95%) | .03 | 1,793, 1,548 (86.34%)

Depression history
Previous diagnosis of major depressive disorder | 1,406, 120 (8.53%) | 387, 74 (19.12%) | <.001 | 1,793, 194 (10.82%)

Other health conditions
Previous diagnosis of panic | 1,406, 7 (0.50%) | 387, 5 (1.29%) | .09 | 1,793, 12 (0.67%)
Previous diagnosis of anxiety | 1,406, 14 (1.00%) | 387, 11 (2.84%) | .006 | 1,793, 25 (1.39%)
Number of ICD-9 diagnoses in past 6 months | 1,389, 7.03 (4.45) | 387, 7.93 (3.56) | <.001 | 1,776, 7.23 (4.29)
Chronic pain | 1,406, 354 (25.18%) | 387, 126 (32.56%) | .004 | 1,793, 480 (26.77%)
Self-rated health status | 1,406 | 387 | <.001 | 1,793
  1 (Poor) | 223 (15.86%) | 144 (37.21%) | | 367 (20.47%)
  2 (Fair) | 633 (45.02%) | 206 (53.23%) | | 839 (46.79%)
  3 (Good) | 468 (33.29%) | 27 (6.98%) | | 495 (27.61%)
  4 (Very good) | 69 (4.91%) | 7 (1.81%) | | 76 (4.24%)
  5 (Excellent) | 13 (0.92%) | 3 (0.78%) | | 16 (0.89%)

Health care use
Hospitalization in past 6 months | 1,406, 218 (15.50%) | 387, 62 (16.02%) | .80 | 1,793, 280 (15.62%)
Admitted to Emergency Department in past 6 months | 1,404, 385 (27.42%) | 387, 63 (16.28%) | <.001 | 1,791, 448 (25.01%)
Number of outpatient clinic visits in past 6 months | 1,406, 2.81 (3.56) | 387, 2.96 (2.81) | .38 | 1,793, 2.84 (3.41)

Abbreviations: BMI, body mass index; DCAT, Diabetes–Depression Care-Management Adoption Trial; ICD-9, International Classification of Diseases, 9th Revision; MDDP, Multifaceted Diabetes and Depression Program; PHQ-9, Patient Health Questionnaire, 9 items; PreDICD, Predicting Diabetes Patients with Comorbid Depression.
a Number of respondents.
b Values are numbers (column percentages) for categorical variables and mean (standard deviation) for continuous variables.
c P values were calculated by using the chi-square test for categorical variables and the t test for continuous variables.

Table 2.
Ultimate PreDICD Model (a): Predictors of Depression Among Patients with Diabetes

Predictor | Estimate (SE) | Odds Ratio (95% Confidence Interval) | P Value
Female | 0.86 (0.13) | 2.35 (1.83–3.03) | <.001
Toobert diabetes self-care | -0.42 (0.04) | 0.66 (0.61–0.72) | <.001
Total number of diabetes complications | 0.30 (0.06) | 1.35 (1.21–1.51) | <.001
History of major depressive disorder | 1.39 (0.21) | 4.03 (2.66–6.10) | <.001
Number of ICD-9 diagnoses in past 6 months | 0.03 (0.01) | 1.03 (1.00–1.06) | .04
Chronic pain | 0.75 (0.13) | 2.13 (1.61–2.74) | <.001
Self-rated health status | -0.81 (0.08) | 0.45 (0.38–0.52) | <.001

Abbreviations: ICD-9, International Classification of Diseases, 9th Revision; PreDICD, Predicting Diabetes Patients with Comorbid Depression; SE, Standard Error.
a Logistic regression model: N = 1,776, estimate of intercept = 1.635, Ridge parameter for avoiding overfitting and improving predictive ability = 10^-10.

Table 3. Comparison of Model-Based Depression Screening Policy with Other Screening Policies

Measure | Model-Based Policy (a,b) | Mass Screening (a): Value, P (f) | Heuristic Policy No. 1 (c): Value, P (f) | Heuristic Policy No. 2 (d): Value, P (f) | Heuristic Policy No. 3 (e): Value, P (f)

Scenario 1: 2-step PHQ screening (g)
Proportion of patients receiving PHQ-2 screening | 32.3 | 100, <.001 | 8.6, <.001 | 52.4, <.001 | 56.2, <.001
Proportion of patients receiving PHQ-9 screening | 16.5 | 29.1, <.001 | 5.5, <.001 | 16.9, .726 | 19.2, .007
Depression identification rate | 49.5 | 78.7, <.001 | 18.5, <.001 | 46.4, .372 | 53.8, .15
Number of screening questions asked per patient | 1.80 | 4.04, <.001 | 0.56, <.001 | 2.23, <.001 | 2.47, <.001

Scenario 2: complete PHQ-9 screening (h)
Proportion of patients receiving PHQ-9 screening | 32.3 | 100, <.001 | 8.6, <.001 | 52.4, <.001 | 56.2, <.001
Depression identification rate | 62.9 | 100, <.001 | 20.6, <.001 | 58.6, .247 | 67.3, .21
Number of screening questions asked per patient | 2.91 | 9.00, <.001 | 0.77, <.001 | 4.72, <.001 | 5.06, <.001

Abbreviations: PHQ, Patient Health Questionnaire; PHQ-2, Patient Health Questionnaire, 2 items; PHQ-9, Patient Health Questionnaire, 9 items; PreDICD, Predicting Diabetes Patients with Comorbid Depression.
a Values are percentages unless otherwise indicated.
b Model-based policy: assigning 2-step PHQ screening or full PHQ-9 screening to patients predicted by the PreDICD model as being depressed.
c Heuristic-based partial screening policy no. 1: assigning 2-step PHQ screening or full PHQ-9 screening to patients with a previous diagnosis of major depressive disorder.
d Heuristic-based partial screening policy no. 2: assigning 2-step PHQ screening or full PHQ-9 screening to patients with severe diabetes (hemoglobin A1c ≥9%).
e Heuristic-based partial screening policy no. 3: assigning 2-step PHQ screening or full PHQ-9 screening to patients with either a previous diagnosis of major depressive disorder or severe diabetes (hemoglobin A1c ≥9%).
f McNemar's test for paired dichotomous variables for comparing the proportion of patients receiving PHQ-2 screening, the proportion of patients receiving PHQ-9 screening, and the depression identification rate; paired t test for comparing the number of screening questions asked per patient.
g Patients who meet screening policy inclusion criteria are evaluated using the 2-step PHQ screening (ie, PHQ-2 is first assigned, and then patients with a PHQ-2 score ≥3 are further evaluated by PHQ-9).
h Complete PHQ-9 screening is assigned for all patients who meet screening policy inclusion criteria.
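Footnote f above describes the paired tests used for the policy comparisons in Table 3. The R sketch below shows how such paired comparisons can be computed; it uses simulated indicators of whether each patient would be screened under two hypothetical policies, not the actual DCAT validation data.

```r
# Illustrative paired comparisons of two screening policies (simulated data,
# hypothetical policies; not the DCAT validation data).
set.seed(5)
n <- 400
screened_model     <- rbinom(n, 1, 0.32)   # flagged by the model-based policy
screened_heuristic <- rbinom(n, 1, 0.52)   # flagged by a heuristic policy

# McNemar's test for a paired dichotomous measure (eg, proportion screened)
mcnemar.test(table(screened_model, screened_heuristic))

# Paired t test for a continuous measure (eg, screening questions per patient)
questions_model     <- 2 * screened_model     + 7 * rbinom(n, 1, 0.16)
questions_heuristic <- 2 * screened_heuristic + 7 * rbinom(n, 1, 0.19)
t.test(questions_model, questions_heuristic, paired = TRUE)
```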
Chapter 3
Paper Two: Developing Prediction Models to Improve Policymaking for Current and Future Depression Screening in Patients with Diabetes

Abstract

Introduction: Clinical prediction has important applications for stigmatized and underdiagnosed conditions such as concurrent depression among patients with diabetes. Depression screening can identify depressed patients, but many barriers reduce effective detection of depression, including resource constraints and patient resistance. This paper aims to develop a prediction model based approach to support providers in delivering targeted and efficient depression screening.

Methods: Three prediction models to assess current and future risk of depression in patients with diabetes are developed from data pooled from two multi-center safety-net diabetes-depression trials. Statistical learning methods are used to train and validate the models. The operational performance of a prediction model based approach that assigns depression screening to diabetes patients who are at high current and/or future risk of depression is evaluated and compared with three alternative policies.

Results: The dataset includes a total of 1,793 patients. The prediction model to assess current risk of depression has an area under the receiver operating characteristic curve (AUROC) of 0.85, with sensitivity 0.89 and specificity 0.58 for detecting major depression. The two prediction models to assess future risk of depression have an AUROC of 0.73 for forecasting the presence of depression in 6 months. The prediction model based approach identifies 90% of patients who are depressed currently or in the future by screening about 55% of patients. Alternative policies are either less efficient than the model based approach or miss most depressed patients.

Conclusion: This study develops prediction models for depression among patients with diabetes and establishes a model based approach that assists providers in delivering targeted and efficient depression screening. The model based approach outperforms alternative policies that screen and/or monitor patients universally or based on diabetes status.

Keywords: Depression; Diabetes; Comorbidities; Statistical Learning; Medical Decision Making; Mental Health Delivery.

3.1 Introduction

Clinical prediction has important applications for stigmatized and underdiagnosed conditions such as concurrent depression among patients with diabetes (1, 2). The prevalence of depression in patients with diabetes is about 20-30%, twice the prevalence of depression in the general population (3, 4), and the rate of undiagnosed, and thus untreated, depression in patients with diabetes is about 45% (2). Depression screening can identify depressed patients, but many barriers reduce effective detection of depression, including the resources required of providers and patient resistance to depression assessment (5, 6). Depression is also a highly recurrent condition for diabetes patients (7, 8). More than 50% of patients with concurrent depression and diabetes whose depression remits during treatment become depressed again within one year. Repeated depression screening is an effective intervention to improve clinical outcomes through identification of undiagnosed depression and early detection of depression recurrence. However, there is scarce evidence to guide the choice of a depression screening interval. The recent recommendation statement for depression screening in adults by the US Preventive Services Task Force states that "the optimal interval for screening is unknown" (9).
Providers are advised to assign additional screening based on patient characteristics (9, 10), but which patient characteristics warrant more frequent depression screening remains an open research question. In this paper we present a prediction model based approach that systematically assigns depression screening to diabetes patients who are at high current and/or future risk of depression. At the core of the approach are three prediction models based on clinically available information. The first model assesses current risk of depression and identifies high-risk patients who warrant an immediate depression screening. The other two models forecast future depression risk, specifically at 6 months after a baseline clinic visit, for patients with and without baseline depression screening, respectively. Patients who are at high future risk of depression, and who therefore warrant screening in 6 months to monitor depression symptoms, can be identified by the models. We evaluate the operational performance of the model-based approach and compare it to alternative depression screening policies used in clinics. Implications of the comparison results for depression screening policymaking are discussed.

3.2 Methods

3.2.1 Research Framework

Focusing on the needs of clinical settings, this study proposes a prediction model based approach for managing current and future depression screening for patients with diabetes (Figure 1). Based on three prediction models, the approach assigns depression screening only to patients at high current and/or future risk of depression and therefore saves resources and minimizes patient resistance through more efficient and targeted depression interventions. The first step of the approach is to collect patient demographics and diabetes status (eg, diabetes complications, diabetes self-care) after the patient enters the health system to seek diabetes care. The collected information is used by the first prediction model to assess the patient's current risk of depression. Patients at high current risk of depression are assigned depression screening to evaluate and diagnose their depression; those with positive screening results are further assigned depression treatment. The depression screening result, depression treatment information, patient demographics, and diabetes status are used by a second prediction model to assess the patient's risk of depression in 6 months. For patients receiving depression treatment, this model can be used to proactively assess the effectiveness of the treatment. For those not currently depressed, the model is used to determine whether they need re-screening for depression in 6 months. Patients at low current risk of depression are not assigned depression screening. Their predicted current risk of depression, demographics, and diabetes status are input into a third prediction model to assess their risk of depression in 6 months. Patients at high future risk of depression are scheduled for screening in 6 months to monitor their depression symptoms. In this paper we demonstrate the model-based approach using data pooled from two multi-center depression-diabetes clinical trials. Machine learning methods are used to develop the prediction models, and the operational performance of the model-based approach is compared with alternative depression screening policies. Details of the dataset, model development, and performance evaluation are described below.
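As a rough illustration of the workflow in Figure 1, the R sketch below chains the three models into a single screening-assignment step. The three risk functions, their coefficients, and the thresholds are hypothetical placeholders for illustration only; the actual fitted models and cut-offs are developed in the sections that follow.

```r
# Minimal sketch of the triage logic in Figure 1. The three model functions and their
# coefficients are hypothetical stand-ins, not the models developed in this chapter.
predict_current_risk  <- function(p) plogis(0.5 + 0.9 * p$female - 0.4 * p$selfcare)
predict_future_risk_s <- function(p, phq9) plogis(-4 + 0.13 * phq9)          # model 2
predict_future_risk_u <- function(p, current_risk) plogis(-3 + 2 * current_risk)  # model 3

assign_screening <- function(p, phq9 = NULL, current_cut = 0.5, future_cut = 0.1) {
  current_risk <- predict_current_risk(p)              # model 1: current risk
  screen_now <- current_risk >= current_cut            # assign immediate screening?
  future_risk <- if (screen_now && !is.null(phq9)) {
    predict_future_risk_s(p, phq9)                     # baseline screening result available
  } else {
    predict_future_risk_u(p, current_risk)             # no baseline screening result
  }
  list(screen_now = screen_now,
       rescreen_in_6_months = future_risk >= future_cut)
}

# Hypothetical patient: female, Toobert self-care = 2, observed baseline PHQ-9 = 12
assign_screening(list(female = 1, selfcare = 2), phq9 = 12)
```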
3.2.2 Dataset

Data used in this paper are pooled from 2 clinical trials with underserved, predominantly Hispanic patients with diabetes: the Diabetes-Depression Care-Management Adoption Trial (DCAT) and the Multifaceted Diabetes and Depression Program (MDDP). DCAT is a comparative effectiveness study conducted from 2010 through 2013 in safety-net clinics in the Los Angeles County Department of Health Services (LACDHS), the second largest safety-net healthcare system in the United States. DCAT tested an automated telephone depression screening and monitoring system integrated with a collaborative care management program to facilitate adoption of a collaborative depression care model (11, 12). MDDP is a randomized trial conducted from 2005 through 2009 that tested the collaborative depression care model for underserved LACDHS patients with comorbid depression and diabetes. The 2 trials are described elsewhere (12, 13). This paper is based on the baseline and 6-month data of the combined DCAT-MDDP dataset. We randomly select 70% of the sample to train the prediction models and use the remaining samples to validate predictive accuracy and to evaluate the performance of the model-based depression screening approach. Predictive accuracy is measured by the area under the receiver operating characteristic curve (AUROC). Methods for evaluating the model-based approach are described below.

Depression symptoms in the dataset are measured by the 9-item Patient Health Questionnaire (PHQ-9), a widely used depression screening and diagnosis tool in primary care (14). The PHQ-9 consists of nine questions that correspond to the nine criteria used for the diagnosis of depressive disorders as defined by the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV). Each question has 4 ordinal response categories scored 0 to 3; thus the overall scale ranges from 0 to 27, with higher scores indicating more severe depression. Occurrence of major depression is defined as a PHQ-9 score ≥10. The validity of this cutoff point (ie, 10) has been evaluated by Kroenke et al (14). Patient demographics in the dataset include gender, age, race, body mass index, and socioeconomic status such as financial problems and employment status. Diabetes status includes diabetes onset age, diabetes complications, Toobert diabetes self-care, hemoglobin A1c, and diabetes treatment such as insulin and diabetes pills. Other variables include depression intervention program, inpatient stays, and number of outpatient visits in the past 6 months.

3.2.3 Development of the Prediction Models

We train and validate three depression prediction models for patients with diabetes: the first model assesses current risk of depression, and the second and third models forecast risk of depression in 6 months for patients with and without depression screening at baseline, respectively.

3.2.3.1 Development of the prediction model for assessing current risk of depression

We develop the prediction model for assessing current risk of depression based on a prior Predicting Diabetes Patients with Comorbid Depression (PreDICD) study (15). The PreDICD study, described elsewhere (15), used the same dataset and developed a logistic regression model with seven predictors to assess current risk of depression. Three of the seven predictors in the PreDICD model are based on International Classification of Diseases (ICD) codes, which have been shown to have poor validity in clinical records (16).
Therefore, we enhance the model by removing two such predictors, previous diagnosis of major depression and number of ICD-9 diagnoses in the past 6 months, and by replacing the third predictor, diagnosis of chronic pain, with a 5-point Likert-scale measure of pain interference with normal work. The PreDICD model predicts a probability of major depression (measured by a Patient Health Questionnaire 9-item (PHQ-9) score ≥10) for each patient, and then classifies those with a probability ≥0.5 as being at high risk of depression. Such a classifier does not account for the different consequences of false-positive and false-negative classifications. We enhance the classifier by using the approach of Phelps and Mushlin (17) to determine the optimal threshold from disease prevalence and treatment cost and utility, so as to maximize the net benefit of providing depression screening and treatment only to high-risk patients. The depression prevalence in patients with diabetes was derived from a systematic review (4). The costs and utility of providing depression care for those patients were reported by Katon et al (18).

3.2.3.2 Development of the prediction models for assessing future risk of depression

We use Lasso logistic regression (19, 20) to develop the two prediction models for assessing future risk of depression. Lasso regression is a technique that regularizes the coefficient estimates or, equivalently, shrinks the coefficient estimates towards zero. Compared with ordinary regression, Lasso regression has often been shown to improve predictive accuracy and is a useful technique for predictor selection. The model for patients with depression screening at baseline is developed using the depression screening result, depression treatment information, patient demographics, diabetes status, and other health-related variables in the study dataset. The model for patients without depression screening at baseline is developed using the same group of variables, except that the depression screening result is replaced by the current risk of depression determined by the first prediction model described above. Such a technique, that is, predicting an alternative response based on an existing prediction model, is known as model recalibration and has been applied in previous studies. Cross-validation is used to determine the Lasso parameter based on the training set, and predictive accuracy is evaluated on the validation set.

3.2.4 Evaluation of the Prediction Model Based Approach

We use bootstrapping to evaluate the prediction model based approach shown in Figure 1. We bootstrap 1000 replicates of a typical patient panel from the validation set to evaluate 5 performance measures: 1) number of patients screened for depression, 2) number of depressed patients identified, 3) depression identification rate, 4) hours spent on depression screening, and 5) efficiency of screening, defined as the ratio of the number of identified depressed patients to the number of patients screened. We evaluate these measures at baseline and at 6 months. We compare the model-based approach to three alternative depression screening policies. The first alternative policy is universal screening, that is, all patients receive depression screening at baseline and at 6 months. The second alternative policy screens all patients at baseline and, at 6 months, monitors only those identified as having major depression at baseline.
The third alternative policy screens only patients with poorly managed diabetes (hemoglobin A1c ≥9.0%) at baseline and, at 6 months, monitors only those identified as having major depression at baseline.

3.3 Results

We identify a total of 1793 patients with diabetes; 1255 (70%) of them are used for training the prediction models, and the remaining 538 (30%) for validation.

3.3.1 Prediction model to assess current risk of depression

The prediction model to assess current risk of depression is an ordinary logistic regression with 5 predictors to detect major depression (Table 1): 1) gender, 2) Toobert diabetes self-care score, 3) number of diabetes complications, 4) pain interference with normal work, and 5) self-rated health status. The log odds of current major depression can be predicted as:

\[
\log\mathrm{Odds}(\text{current major depression}) = 0.67 + 0.85 \times \text{female} - 0.38 \times \text{diabetes self-care} + 0.28 \times \text{diabetes complications} + 0.47 \times \text{pain interference} - 0.68 \times \text{self-rated health status}
\]

The AUROC of the model is 0.85, improved from the PreDICD model (AUROC = 0.81). The optimal threshold is 0.27, which gives a sensitivity of 0.89 and a specificity of 0.58 to detect major depression.

3.3.2 Prediction models to assess future risk of depression

The prediction model to assess future risk of depression for patients at high current risk (ie, patients assigned depression screening at baseline) is a Lasso logistic regression with 12 predictors: 1) body mass index (BMI), 2) Whitty-9 diabetes symptoms score, 3) number of diabetes complications, 4) pain interference with normal work, 5) enrolled in collaborative depression care program, 6) receiving automatic telephone depression monitoring, 7) PHQ-9 score at baseline, 8) number of outpatient visits in the past 6 months, 9) feeling that financial situation is getting worse, 10) having difficulty in paying bills, 11) doing work for extra income, and 12) employment status. The log odds of major depression in 6 months can be predicted as:

\[
\begin{aligned}
\log\mathrm{Odds}(\text{major depression in 6 months}) = {} & -4.84 + 0.02 \times \text{BMI} + 0.54 \times \text{diabetes symptoms} + 0.15 \times \text{diabetes complications} \\
& + 0.12 \times \text{pain interference} - 0.75 \times \text{enrolled in collaborative depression care} \\
& - 0.003 \times \text{receiving automatic telephone depression monitoring} + 0.13 \times \text{PHQ-9 score at baseline} \\
& + 0.01 \times \text{outpatient visits} + 0.11 \times \text{financial worse} + 0.40 \times \text{having difficulty in paying bills} \\
& + 0.19 \times \text{doing extra work} + 0.14 \times \text{unemployed}
\end{aligned}
\]

The AUROC of the above model is 0.73. The model can be used to proactively assess treatment effectiveness for those depressed at baseline (ie, baseline PHQ-9 score ≥10). For patients screened but not depressed at baseline, the optimal threshold to classify those who are at high future risk of depression, and who therefore warrant re-screening for depression in 6 months, is 0.10. This threshold gives a sensitivity of 0.92 and a specificity of 0.53 to predict major depression in 6 months. The prediction model to assess future risk of depression for patients at low current risk (ie, patients with no depression screening at baseline) is a Lasso logistic regression with 10 predictors: 1) body mass index, 2) Toobert diabetes self-care score, 3) Whitty-9 diabetes symptoms score, 4) number of diabetes complications, 5) pain interference with normal work, 6) enrolled in collaborative depression care program, 7) receiving automatic telephone monitoring, 8) feeling that financial situation is getting worse, 9) doing work for extra income, and 10) employment status.
The AUROC of this model for predicting the presence of major depression in 6 months is 0.73. The log odds of major depression in 6 months can be predicted as:

\[
\begin{aligned}
\log\mathrm{Odds}(\text{major depression in 6 months}) = {} & -3.15 - 0.001 \times \text{BMI} + 0.01 \times \text{diabetes self-care} + 0.55 \times \text{diabetes symptoms} \\
& + 0.31 \times \text{diabetes complications} + 0.06 \times \text{pain interference} \\
& - 0.32 \times \text{enrolled in collaborative depression care} - 0.12 \times \text{receiving automatic telephone depression monitoring} \\
& + 0.71 \times \text{financial worse} - 0.30 \times \text{doing extra work} + 0.22 \times \text{unemployed}
\end{aligned}
\]

The optimal threshold to classify those who are at high future risk of depression, and who therefore warrant screening for depression in 6 months, is 0.10. This threshold gives a sensitivity of 0.88 and a specificity of 0.55 to predict major depression in 6 months.

3.3.3 Evaluation of the Prediction Model Based Approach

Results in Table 1 indicate that the prediction model based approach identifies 88% of depressed individuals by screening about half of the patients (361 vs 690) compared with the two policies that screen all patients with diabetes at baseline, making the efficiency of the model-based approach significantly higher. Screening patients with poorly managed diabetes at baseline identifies only 48% of depressed patients and has significantly lower efficiency than the model-based approach. The model based approach identifies 93% of depressed individuals at 6 months by screening 57% (390 of 690) of patients with diabetes. The efficiency of the model-based approach is significantly higher than that of the policy that monitors all patients at 6 months. The policy that screens all patients at baseline and monitors those depressed at baseline identifies only 44% of depressed patients at 6 months, although it has significantly higher efficiency than the model-based approach. The policy that screens patients with poorly managed diabetes at baseline and monitors those depressed at baseline identifies 20% of depressed patients at 6 months.

3.4 Discussion

In this paper we present a prediction model based approach that systematically assigns depression screening to patients with diabetes who are at high current and/or future risk of depression. The approach involves the development of three prediction models that use clinically available information to proactively assess current and future risk of depression for patients with diabetes. In particular, we present an accurate and parsimonious prediction model to assess current risk of depression. The five predictors in the model are either common demographic characteristics or measures that the American Diabetes Association recommends collecting in the clinical care of patients with diabetes (21). The prediction models for assessing future risk of depression suggest that patients with severe diabetes, pain that interferes with normal work, and low socioeconomic status are likely to develop depression in the future and therefore warrant routine depression screening to monitor depressive symptoms. Evaluation of the operational performance of the model based approach demonstrates that the approach can help health care providers prioritize their resources for patients at high current and/or future risk of depression. Compared with universal depression screening, the model based approach can save resources and reduce patient resistance by providing more efficient and targeted depression screening.
The model based approach outperforms screening only patients with poorly managed diabetes, a policy currently used by LACDHS clinics, as it significantly improves depression identification and screening efficiency. The model based approach also outperforms monitoring only patients receiving depression treatment, since the latter policy misses the majority of patients who develop depression in the future. The model based approach holds potential to improve population health. Providers can establish a preliminary step to proactively assess a patient's current and future risk of depression and thereby reach out to at-risk patients. Specifically, the model based approach suggests that patients with poor diabetes self-care, low self-rated health status, pain that interferes with normal work, and diabetes complications are at elevated risk of current depression. Patients with poor diabetes status, pain that interferes with normal work, poor financial status, and no current enrollment in a disease management program are at high future risk of depression. Limitations of the study include that we combine 2 datasets and do not account for possible cohort and period effects on the health conditions of the study populations. Patients in this study are predominantly Hispanic and drawn from a safety-net population with diabetes, which may limit the generalizability of the model based approach to broader populations. Finally, the accuracy of predicting future depression could be further improved by investigating a broader range of predictors and by using nonlinear models such as support vector machines to better capture the relationship between depression and its predictors. In conclusion, this study develops prediction models for depression among patients with diabetes and establishes a model based approach that assists providers in delivering targeted and efficient depression screening. The model based approach outperforms alternative policies that screen and/or monitor patients universally or based on diabetes status.

References of Chapter 3

1. Brown C, Conner KO, Copeland VC, Grote N, Beach S, Battista D, et al. Depression stigma, race, and treatment seeking behavior and attitudes. J Community Psychol 2010;38(3):350-68. http://dx.doi.org/10.1002/jcop.20368

2. Li C, Ford ES, Zhao G, Ahluwalia IB, Pearson WS, Mokdad AH. Prevalence and correlates of undiagnosed depression among US adults with diabetes: the Behavioral Risk Factor Surveillance System, 2006. Diabetes Res Clin Pract 2009;83(2):268-79. http://dx.doi.org/10.1016/j.diabres.2008.11.006

3. Ducat L, Philipson LH, Anderson BJ. The mental health comorbidities of diabetes. JAMA 2014;312(7):691-2. http://dx.doi.org/10.1001/jama.2014.8040

4. Roy T, Lloyd CE. Epidemiology of depression and diabetes: a systematic review. J Affect Disord 2012;142(Suppl):S8-21. http://dx.doi.org/10.1016/S0165-0327(12)70004-6

5. Coventry PA, Hays R, Dickens C, et al. Talking about depression: a qualitative study of barriers to managing depression in people with long term conditions in primary care. BMC Fam Pract 2011;12(1):10.

6. Gjerdingen DK, Yawn BP. Postpartum depression screening: importance, methods, barriers, and recommendations for practice. J Am Board Fam Med 2007;20(3):280-8.

7. Monroe SM, Harkness KL. Life stress, the "kindling" hypothesis, and the recurrence of depression: considerations from a life stress perspective. Psychol Rev 2005;112(2):417-45.

8.
Lustman PJ, Griffith LS, Freedland KE, Clouse RE. The course of major depression in diabetes. Gen Hosp Psychiatry 1997;19(2):138-43.

9. Siu AL, and the US Preventive Services Task Force. Screening for depression in adults: US Preventive Services Task Force recommendation statement. JAMA 2016;315(4):380-387.

10. Nease DE, Malouin JM. Depression screening: a practical strategy. J Fam Pract 2003;52(2):118-26.

11. Wu B, Jin H, Vidyanti I, Lee P-J, Ell K, Wu S. Collaborative depression care among Latino patients in diabetes disease management, Los Angeles, 2011-2013. Prev Chronic Dis 2014;11:E148. http://www.cdc.gov/pcd/issues/2014/14_0081.htm. Accessed April 28, 2015. http://dx.doi.org/10.5888/pcd11.140081

12. Wu S, Ell K, Gross-Schulman SG, Sklaroff LM, Katon WJ, Nezu AM, et al. Technology-facilitated depression care management among predominantly Latino diabetes patients within a public safety net care system: comparative effectiveness trial design. Contemp Clin Trials 2014;37(2):342-54. http://dx.doi.org/10.1016/j.cct.2013.11.002

13. Ell K, Katon W, Xie B, Lee P-J, Kapetanovic S, Guterman J, et al. Collaborative care management of major depression among low-income, predominantly Hispanic subjects with diabetes: a randomized controlled trial. Diabetes Care 2010;33(4):706-13. http://dx.doi.org/10.2337/dc09-1711

14. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001;16(9):606-13. http://dx.doi.org/10.1046/j.1525-1497.2001.016009606.x

15. Jin H, Wu S, Di Capua P. Development of a clinical forecasting model to predict comorbid depression among diabetes patients and an application in depression screening policy making. Prev Chronic Dis 2015;12:E142.

16. Quan H, Li B, Duncan Saunders L, et al. Assessing validity of ICD-9-CM and ICD-10 administrative data in recording clinical conditions in a unique dually coded database. Health Serv Res 2008;43(4):1424-1441.

17.

18. Katon W, Unützer J, Fan M-Y, et al. Cost-effectiveness and net benefit of enhanced treatment of depression for older adults with diabetes and depression. Diabetes Care 2006;29(2):265-270.

19. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Series B 1996;58(1):267-88.

20. James G, Witten D, Hastie T, Tibshirani R. An introduction to statistical learning with applications in R. New York: Springer; 2013.

21. American Diabetes Association. Foundations of care: education, nutrition, physical activity, smoking cessation, psychosocial care, and immunization. Diabetes Care 2015;38(Suppl 1):S20-S30.

Figure 1. Conceptual Framework
Table 1. Comparison of Operational Performance Between the Model-Supported Screening and Alternative Screening Policies Using Bootstrap^a

Measure | Model-Supported Screening, mean (SD) | Universal Screening^b, mean (SD) | Screening All Patients and Monitoring Only Depressed Patients^b, mean (SD) | Screening Patients with Poorly Managed Diabetes (hemoglobin A1c ≥9.0%) and Monitoring Only Depressed Patients^b, mean (SD)
Number of patients screened for depression, baseline | 361 (18) | 690 (22) | 690 (22) | 311 (17)
Number of patients screened for depression, 6 months | 390 (18) | 690 (22) | 131 (12) | 63 (8)
Number of depressed patients identified, baseline | 115 (11) | 131 (12) | 131 (12) | 63 (8)
Number of depressed patients identified, 6 months | 110 (11) | 118 (11) | 52 (7) | 24 (5)
Depression identification rate^c, baseline | 0.88 (0.03) | 1.00 (0.00) | 1.00 (0.00) | 0.48 (0.04)
Depression identification rate^c, 6 months | 0.93 (0.02) | 1.00 (0.00) | 0.44 (0.05) | 0.20 (0.04)
Hours spent on depression screening^d, baseline | 60 (3) | 115 (4) | 115 (4) | 52 (3)
Hours spent on depression screening^d, 6 months | 65 (3) | 115 (4) | 22 (2) | 11 (1)
Efficiency of screening^e, baseline | 0.32 (0.03) | 0.19 (0.02) | 0.19 (0.02) | 0.20 (0.02)
Efficiency of screening^e, 6 months | 0.28 (0.02) | 0.17 (0.01) | 0.40 (0.04) | 0.38 (0.06)

SD, standard deviation.
a Bootstrap settings: number of replicates = 1000; patient panel size in each replicate = 2300; probability of diabetes = 30%; probability of concurrent depression in patients with diabetes = 19.1%.
b Paired t test is used for comparison with the model-supported screening. All comparisons are statistically significant at P < .001.
c Depression identification rate is the ratio of the number of depressed patients identified to the number of patients with a Patient Health Questionnaire 9-item (PHQ-9) score ≥10.
d Hours spent on depression screening are estimated under the assumption that each screening takes 10 minutes to complete.
e Efficiency of screening is the ratio of the number of depressed patients identified to the number of patients screened; it ranges from 0 to 1, with higher values indicating more efficient screening.

Chapter 4

Paper Three: Predicting Depression among Patients with Diabetes Using Longitudinal Data. A Multilevel Regression Model

Abstract

Background: Depression is a common and often undiagnosed condition in patients with diabetes. It also significantly affects healthcare outcomes, use, and cost, and elevates suicide risk. A model to predict depression among diabetes patients is therefore a promising and valuable tool that can help providers proactively assess depressive symptoms and identify patients with depression.

Objectives: This study seeks to develop a generalized multilevel regression model, using a longitudinal dataset from a recent large-scale clinical trial, to predict depression severity and the presence of major depression among patients with diabetes.

Methods: Severity of depression was measured by the Patient Health Questionnaire (PHQ-9) score. Predictors were selected from 29 candidate factors to develop a 2-level Poisson regression model that can make population-average predictions for all patients and subject-specific predictions for individual patients with historical records. Newly obtained patient records can be incorporated with historical records to update the prediction model. Root-mean-square error (RMSE) was used to evaluate the accuracy of the predicted PHQ-9 scores. The study also evaluated the ability of the predicted PHQ-9 scores to classify patients as having major depression.

Results: Two time-invariant and 10 time-varying predictors were selected for the model.
Incorporating historical records and using them to update the model may improve both the predictive accuracy of PHQ-9 scores and the classification ability of the predicted scores. Subject-specific predictions (for individual patients with historical records) achieved RMSEs of about 4 and areas under the receiver operating characteristic (ROC) curve of about 0.9 and are better than population-average predictions.

Conclusion: The study developed a generalized multilevel regression model to predict depression and demonstrated that generalized multilevel regression based on longitudinal patient records can achieve high predictive ability.

Keywords: Depression; Diabetes mellitus; Comorbidity; Machine learning; Multilevel regression

* This paper has been published in Methods of Information in Medicine. Please cite the paper: Jin H, Wu S, Vidyanti I, Di Capua P, Wu B. Predicting depression among patients with diabetes using longitudinal data. A multilevel regression model. Methods Inf Med 2015;54(6):553-559.

4.1 Introduction

Individuals with diabetes are twice as likely as the general population to experience clinically significant depressive symptoms (1, 2); however, 45% of those with depression are not assessed for depressive symptoms, and their depression goes undiagnosed and untreated (3). Since depression has a significant impact on healthcare outcomes, utilization, and costs, and elevates suicide risk (4-7), a model to predict depression among diabetes patients is a promising and valuable tool for providers to proactively assess depressive symptoms and identify those with depression. Depression is also an often lifelong and highly recurrent condition, and its symptoms may change and be correlated over time (8-11). Thus, a longitudinal model that takes into account patients' historical records may achieve better predictive accuracy than a model that uses only information assessed at one time point. There are only a few studies in the literature on developing depression prediction models. The predictD study (12, 13) developed regression models to predict the onset of major depression, using predictors selected from 39 known depression risk factors; those models were trained on a dataset from European countries and validated on an independent dataset from Chile. Huang et al (14) developed regression-based models to predict depression, its severity, and its response to treatment, using electronic health record data that included structured diagnosis and medication codes as well as free-text clinical reports. Wang et al (15) developed a regression model to predict first onset of major depression in the general US population, using data from a national survey. Longitudinal data are desirable for use in a prediction model because they contain information about developments and progressions in patient condition and health status over time. In a longitudinal dataset, the observations on one subject are not statistically independent over time (16). Since this correlation must be taken into account to make valid predictions, many widely used prediction models, such as the ordinary regression methods mentioned above (12-15), become inappropriate because they lack the ability to model the correlations.
4.2 Objective

The objective of this study is to develop a generalized multilevel regression model, using a longitudinal dataset from a recent large-scale clinical trial, to predict depression severity and the presence of major depression among patients with diabetes.

4.3 Methods

4.3.1 Predicted Outcome

The predicted outcome in this study is the Patient Health Questionnaire (PHQ-9) score. The PHQ-9, a well-validated scale for assessing the severity of depressive symptoms (17, 18), consists of nine questions that correspond to the nine criteria used for the diagnosis of depressive disorders as defined by the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV). Each question has four ordinal response categories with assigned scores from 0 to 3; thus, the overall scale has a score range from 0 to 27, where a higher score indicates more severe depressive symptoms. The PHQ-9 can also be used to diagnose major depression. Kroenke et al (18) validated and suggested using a PHQ-9 score ≥ 10 as the criterion to classify patients as having major depression.

4.3.2 Dataset

The dataset used in this study was from the Diabetes-Depression Care-management Adoption Trial (DCAT), a comparative effectiveness study from 2010 to 2013 with three arms: Usual Care (UC), Supported Care (SC), and Technology Care (TC). Patients in the UC arm received usual care from clinics in the Los Angeles County Department of Health Services (LACDHS), the second largest safety-net healthcare system in the United States. Patients in the SC and TC arms enrolled in a collaborative care management program (19) during the first 6 months of the trial, after which they went back to usual care. Additionally, patients in the TC arm received automated telephonic depression screening and monitoring during the first 12 months of the trial. DCAT details have been described elsewhere (20-22). The prediction model was developed from the 853 patients who completed all four assessments: one to establish the study baseline plus three follow-up assessments (at 6, 12, and 18 months).

4.3.3 Modeling the Change of PHQ-9 Score Over Time

Generalized multilevel regression is an extension of the ordinary generalized linear regression method and is useful for modeling longitudinal data (23). As the PHQ-9 score is nonnegative and only takes on integer values, we used a 2-level Poisson regression model to relate both time-varying and time-invariant predictors to the PHQ-9 score. The level-1 model, excluding the residual, can be written as:

\[
\ln(\mathrm{PHQ9}_{i,t}) \sim \pi_{0,i} + \pi_{1,i}\,t + \pi_{tvp_1} x_{tvp_1,t} + \cdots + \pi_{tvp_l} x_{tvp_l,t} \qquad \text{(Eq. 1)}
\]

\(t\) is the time variable (represented in years), where \(t = 0\) is the time when a patient is assigned the PHQ-9 screening for the first time and the values of the predictors are recorded. In DCAT, \(t = 0\) indicates the study baseline. \(\mathrm{PHQ9}_{i,t}\) is subject \(i\)'s PHQ-9 score at time \(t\). \(x_{tvp_1,t}\) to \(x_{tvp_l,t}\) are the values of the \(l\) time-varying predictors at time \(t\). The level-1 model links the right-hand-side linear predictor to the natural log-transformed PHQ-9 score. The level-2 model relates subject \(i\)'s time-invariant predictors (such as gender and depression history) to the level-1 intercept \(\pi_{0,i}\) and the coefficient of the time variable \(\pi_{1,i}\). The level-2 model can be written as:

\[
\pi_{0,i} = \gamma_{0,0} + \gamma_{0,1} x_{tip_{0,1}} + \cdots + \gamma_{0,m} x_{tip_{0,m}} + \zeta_{0,i} \qquad \text{(Eq. 2)}
\]

\[
\pi_{1,i} = \gamma_{1,0} + \gamma_{1,1} x_{tip_{1,1}} + \cdots + \gamma_{1,n} x_{tip_{1,n}} + \zeta_{1,i} \qquad \text{(Eq. 3)}
\]

\[
\begin{bmatrix} \zeta_{0,i} \\ \zeta_{1,i} \end{bmatrix} \sim N\!\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix},\; \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & \sigma_1^2 \end{bmatrix} \right) \qquad \text{(Eq. 4)}
\]

\(x_{tip_{0,1}}\) to \(x_{tip_{0,m}}\) are the \(m\) time-invariant predictors for predicting the level-1 intercept \(\pi_{0,i}\). \(x_{tip_{1,1}}\) to \(x_{tip_{1,n}}\) are the \(n\) time-invariant predictors for predicting the level-1 coefficient of the time variable \(\pi_{1,i}\). A time-invariant predictor can be used to predict \(\pi_{0,i}\), \(\pi_{1,i}\), or both. \(\zeta_{0,i}\) and \(\zeta_{1,i}\) are between-individual random effects, which are assumed to be bivariate normally distributed with mean 0, unknown variances \(\sigma_0^2\) and \(\sigma_1^2\), respectively, and unknown covariance \(\sigma_{01}\). Substituting the level-2 model into the level-1 model leads to the so-called mixed-effects form of the model:

\[
\ln(\mathrm{PHQ9}_{i,t}) \sim \left( \gamma_{0,0} + \gamma_{0,1} x_{tip_{0,1}} + \cdots + \gamma_{0,m} x_{tip_{0,m}} + \pi_{tvp_1} x_{tvp_1,t} + \cdots + \pi_{tvp_l} x_{tvp_l,t} \right) + \left( \gamma_{1,0} + \gamma_{1,1} x_{tip_{1,1}} + \cdots + \gamma_{1,n} x_{tip_{1,n}} \right) t + \left( \zeta_{0,i} + \zeta_{1,i}\,t \right) \qquad \text{(Eq. 5)}
\]

\(\gamma_{0,0}\) to \(\gamma_{0,m}\), \(\pi_{tvp_1}\) to \(\pi_{tvp_l}\), and \(\gamma_{1,1}\) to \(\gamma_{1,n}\) are collectively known as the fixed effects. \(\zeta_{0,i}\) and \(\zeta_{1,i}\) are the random effects.

4.3.4 Using the Model for Prediction

We can use the 2-level Poisson regression model described above to make two types of prediction: the population-average prediction, which is based only on the fixed effects, and the subject-specific prediction, which is based on both the fixed and the random effects (24). Specifically, the population-average prediction of the PHQ-9 score for patient \(i\) at time \(t_0 \ge 0\), \(\overline{\mathrm{PHQ9}}^{\,pop}_{i,t_0}\), is:

\[
\overline{\mathrm{PHQ9}}^{\,pop}_{i,t_0} = \exp\!\left( \hat{\gamma}_{0,0} + \hat{\gamma}_{0,1} x_{tip_{0,1}} + \cdots + \hat{\gamma}_{0,m} x_{tip_{0,m}} + \hat{\pi}_{tvp_1} x_{tvp_1,t_0} + \cdots + \hat{\pi}_{tvp_l} x_{tvp_l,t_0} + \left( \hat{\gamma}_{1,0} + \hat{\gamma}_{1,1} x_{tip_{1,1}} + \cdots + \hat{\gamma}_{1,n} x_{tip_{1,n}} \right) t_0 \right) \qquad \text{(Eq. 6)}
\]

and the subject-specific prediction of the PHQ-9 score for patient \(i\) at time \(t_0 > 0\), \(\overline{\mathrm{PHQ9}}^{\,sub}_{i,t_0}\), is:

\[
\overline{\mathrm{PHQ9}}^{\,sub}_{i,t_0} = \exp\!\left( \hat{\gamma}_{0,0} + \hat{\gamma}_{0,1} x_{tip_{0,1}} + \cdots + \hat{\gamma}_{0,m} x_{tip_{0,m}} + \hat{\pi}_{tvp_1} x_{tvp_1,t_0} + \cdots + \hat{\pi}_{tvp_l} x_{tvp_l,t_0} + \left( \hat{\gamma}_{1,0} + \hat{\gamma}_{1,1} x_{tip_{1,1}} + \cdots + \hat{\gamma}_{1,n} x_{tip_{1,n}} \right) t_0 + \left( \hat{\zeta}_{0,i} + \hat{\zeta}_{1,i}\,t_0 \right) \right) \qquad \text{(Eq. 7)}
\]

\(\hat{\gamma}_{0,0}\) to \(\hat{\gamma}_{0,m}\), \(\hat{\pi}_{tvp_1}\) to \(\hat{\pi}_{tvp_l}\), and \(\hat{\gamma}_{1,1}\) to \(\hat{\gamma}_{1,n}\) are the estimates of the fixed effects. \(\hat{\zeta}_{0,i}\) and \(\hat{\zeta}_{1,i}\) are the estimates of the random effects. It is worth noting that subject-specific predictions cannot be made for PHQ-9 scores at time \(t = 0\). That is, we cannot make subject-specific predictions for patients without historical records of the values of the PHQ-9 and the predictors, because estimates of the random effects are not available for those patients.

4.3.5 Prediction Model Development

Many factors have been shown to be correlated with depression (8). We selected 29 factors related to diabetes care from the DCAT dataset as the candidate predictors to develop the prediction model: 20 time-varying factors and nine time-invariant factors, measuring demographics, diabetes, health conditions, healthcare utilization, and the intervention programs in which the patients enrolled (patients from the SC and TC arms of DCAT enrolled in the collaborative care management program for 6 months, and patients from the TC arm received automated telephonic depression screening and monitoring for 12 months). Summaries of the time-invariant and time-varying predictors are provided in Online Tables 1 and 2, respectively. The dataset was divided into two parts: 80% of randomly selected patients (number of patients = 682, number of samples = 2728) formed the training set and the rest (number of patients = 171, number of samples = 684) the validation set. Predictor selection was based on the training set and involved two phases.
First, we estimated the fixed effect of each candidate predictor in a univariate manner and obtained their p-values. We used univariate versions of Eq. 5, that is, Eq. 8 to Eq. 10, to conduct this analysis. Specifically, Eq. 8 and Eq. 9 were used for each time-invariant candidate predictor to estimate its univariate fixed effect on the model intercept (ie, \(\gamma_{0,1}\) in Eq. 8) and on the model coefficient of the time variable (ie, \(\gamma_{1,1}\) in Eq. 9), respectively. Eq. 10 was used to estimate the univariate fixed effect of each time-varying candidate predictor (ie, \(\pi_{tvp}\) in Eq. 10). Univariate analysis results of the time-invariant and time-varying candidate predictors are provided in Online Tables 3 and 4, respectively. Then, we included all factors significant at the 0.05 level in the univariate analysis and eliminated them backward using Eq. 5 until all predictors were significant at p < 0.05. The selected predictors were used to develop the final prediction model.

\[
\ln(\mathrm{PHQ9}_{i,t}) \sim (\gamma_{0,0} + \gamma_{0,1} x_{tip}) + \gamma_{1,0}\,t + (\zeta_{0,i} + \zeta_{1,i}\,t) \qquad \text{(Eq. 8)}
\]

\[
\ln(\mathrm{PHQ9}_{i,t}) \sim \gamma_{0,0} + (\gamma_{1,0} + \gamma_{1,1} x_{tip})\,t + (\zeta_{0,i} + \zeta_{1,i}\,t) \qquad \text{(Eq. 9)}
\]

\[
\ln(\mathrm{PHQ9}_{i,t}) \sim \gamma_{0,0} + \gamma_{1,0}\,t + \pi_{tvp}\,x_{tvp,t} + (\zeta_{0,i} + \zeta_{1,i}\,t) \qquad \text{(Eq. 10)}
\]

We validated the predictive ability of the prediction model in a longitudinal context in which newly generated patient records are incorporated with historical records to refit and update the model. First, we fitted a model based on the training set and used the model to predict PHQ-9 scores at study baseline (\(t = 0\)) for patients in the validation set. Then, we combined the baseline records from the validation set with the training set to refit and update the prediction model and used the updated model to make predictions for PHQ-9 scores at the 6-month follow-up in the validation set. We repeated these updating procedures another two times, using the 6- and 12-month follow-up records to update the model, and we made predictions for PHQ-9 scores at the 12- and 18-month follow-ups, respectively. As discussed above, only population-average prediction can be made for PHQ-9 scores at study baseline (\(t = 0\)), and both population-average and subject-specific predictions can be made for PHQ-9 scores at the 6-, 12-, and 18-month follow-ups. We used root-mean-square error (RMSE) to evaluate the accuracy of predicting the severity of depression (ie, the accuracy of the predicted PHQ-9 score). Lower values of RMSE indicate better prediction. We evaluated the ability of the predicted PHQ-9 scores to classify patients as having major depression by comparing the areas under the receiver operating characteristic (ROC) curves for the predicted PHQ-9 scores against the real PHQ-9 scores ≥ 10. Statistical analyses in the study were carried out in R, version 3.1.3 (25). Fixed and random effects for a generalized multilevel model were estimated using quasi-likelihood estimation implemented by the "glmmPQL" function in the R package "MASS" (26). Quasi-likelihood estimation is an approximation method for fitting a multilevel model because exact evaluation of the likelihood function is often computationally infeasible, since it involves an integral over the random-effect space (27, 28). P-values of the fixed effect coefficients in a model were computed by conditional t-test, a variant of Student's t-test that tests the marginal significance of each coefficient (29).
95% confidence intervals (CIs) of the fixed effects were obtained using a normal approximation to the distribution of the parameter estimators, implemented by the "intervals" function in the R package "nlme" (29).

4.4 Results

The resulting prediction model is a 2-level Poisson regression using two time-invariant predictors and 10 time-varying predictors. Specifically, we selected one of the nine time-invariant predictors, previous diagnosis of major depressive disorder before baseline, to predict the level-1 model intercept. We selected one time-invariant predictor, age at study baseline, to predict the level-1 model coefficient of the time variable. We selected ten of the 20 time-varying predictors: 1) diabetes emotional burden, 2) diabetes regimen distress, 3) number of International Classification of Diseases, 9th Revision (ICD-9) diagnoses in the past 6 months > 10, 4) self-rated health (1 = poor to 5 = excellent) ≥ 3, 5) unemployed, 6) feeling that my financial situation is getting worse, 7) having difficulty in paying bills, 8) hospitalized overnight, 9) chronic pain, and 10) enrolled in the collaborative care management program. Table 1 shows the RMSE of the population-average and subject-specific predictions for the PHQ-9 scores of patients in the validation set. Predictive accuracy was improved by incorporating historical records and using them to update the model, as the RMSEs of both population-average and subject-specific predictions were smaller at the three follow-ups than that of the population-average prediction at study baseline. In addition, subject-specific predictions were shown to be better than population-average predictions for the three follow-ups. Table 2 shows the areas under the ROC curve when the predicted PHQ-9 scores are used to classify patients as having major depression. Consistent with the results in Table 1, classification ability was improved by incorporating historical records and using them to update the model, as the areas under the ROC curve of both population-average and subject-specific predictions were larger at the three follow-ups than that of the population-average prediction at study baseline. Comparisons between the two types of predictions for the three follow-ups show that subject-specific prediction has better classification ability. ROC curves of the population-average prediction for the study baseline and the subject-specific predictions for the three follow-ups are shown in Figure 1. Table 3 shows the fixed effects estimated from the whole dataset, which can be used to derive population-average predictions for individuals outside the DCAT dataset using Eq. 6. The exponential of the estimated effect of a predictor is the multiplicative term used to calculate the predicted PHQ-9 score when the predictor is increased by one unit. The way to make subject-specific predictions for individuals outside the DCAT dataset is discussed below.
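As a minimal sketch of the fitting and prediction steps described above, the R code below fits a 2-level Poisson model with MASS::glmmPQL and derives population-average and subject-specific predictions. The data frames (train, valid), the variable names, and the use of the pROC package for the area under the ROC curve are illustrative assumptions, not the study's actual code.

```r
# Minimal sketch of the modeling steps described above. 'train' and 'valid' are
# hypothetical long-format data frames (one row per patient per assessment wave);
# the predictors shown are illustrative names drawn from the variables in Table 3.
library(MASS)    # glmmPQL: penalized quasi-likelihood fitting
library(pROC)    # one option for computing the area under the ROC curve

fit <- glmmPQL(
  phq9 ~ time + mdd_history + chronic_pain + unemployed,  # illustrative fixed effects
  random = ~ time | patient_id,                           # random intercept and slope (Eq. 2-4)
  family = poisson, data = train)

# Population-average prediction uses fixed effects only (Eq. 6); subject-specific
# prediction adds the estimated random effects (Eq. 7) and therefore requires that
# the patients' earlier records were included when the model was (re)fitted.
pred_pop  <- predict(fit, newdata = valid, type = "response", level = 0)
pred_subj <- predict(fit, newdata = valid, type = "response", level = 1)

sqrt(mean((valid$phq9 - pred_subj)^2))    # RMSE of the predicted PHQ-9 scores
auc(roc(valid$phq9 >= 10, pred_subj))     # classification of major depression (PHQ-9 >= 10)
```

In the updating procedure described above, the refitting step corresponds to re-running glmmPQL on the training set augmented with the newly observed wave before predicting the next follow-up.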
To make subject-specific predictions for individuals out of the DCAT dataset, users would need a group of patients with multiple historical records that contain PHQ-9 scores and the 12 predictors to train an initial model. With those data, the model could then be used to make subject-specific predictions for patients in the group. Similar to the validation process described above, newly generated records could be incorporated to update the model. An important implication of the study results is that incorporating historical records and using them to update the model may improve both the accuracy of predicting PHQ-9 scores and the ability to classify patients with major depression. The subject-specific predictions can achieve very good to excellent classification of major depression, as demonstrated by the 0.88–0.91 areas under ROC curve. These values are significantly better than the 0.72–0.80 areas under ROC curve achieved in prior depression prediction studies (12-15). Besides multilevel regression, another applicable longitudinal modeling method is generalized estimating equation (GEE). The main difference between using multilevel regression and using GEE for prediction is that GEE can only derive population-average prediction (24). Application of GEE with the same 12 predictors to the dataset used in this study leads to better predictions than the population-average predictions of generalized multilevel regression, but the predictive ability of GEE for the three follow-ups is worse than the subject-specific predictions of generalized multilevel regression (GEE: RMSE for baseline=5.26, for 6-month=4.46, for 12- month=4.27, for 18-month=4.54; area under ROC curve for baseline=0.73, for 6-month=0.87, for 12-month=0.84, for 18-month=0.86). More discussions about GEE and its comparison to multilevel regression can be found elsewhere in the literature (24, 30, 31). 53 The study has several limitations and opportunities for future research. First, although our results suggest that incorporating historical records can improve predictive ability, the finding is not conclusive because the DCAT dataset includes only four waves of data in 6-month intervals. Future research is indicated to further investigate the influences of using historical records for prediction. Second, some predictors used in the model, such as diabetes emotional burden and diabetes regimen distress (32), are not typically available in current medical practices, which may limit the applicability of the model. Third, the number of candidate predictors in the study is limited, which calls for future research to investigate a broader range of predictors. Fourth, predictor selection in the study is based on p-values obtained from conditional t-tests, which are sometimes not reliable (27, 33). Future research is needed to address this issue. Finally, study patients are predominantly Hispanics from safety-net clinics in Los Angeles area, which may limit the generalizability of the model and calls for future research on a broader population. 4.6 Conclusions The study developed a generalized multilevel regression model to predict depression severity and presence of major depression for patients with diabetes. The study demonstrated that generalized multilevel regression can be used to achieve high predictive accuracy based on longitudinal patient records. 54 References of Chapter 4 1. Egede LE, Zheng D, Simpson K. Comorbid depression is associated with increased health care use and expenditures in individuals with diabetes. 
Diabetes Care 2002; 25(3): 464-470. 2. Ducat L, Philipson LH, Anderson BJ. The mental health comorbidities of diabetes. JAMA 2014; 312(7): 691-692. 3. Li C, Ford ES, Zhao G, Ahluwalia IB, Pearson WS, Mokdad AH. Prevalence and correlates of undiagnosed depression among US adults with diabetes: the Behavioral Risk Factor Surveillance System, 2006. Diabetes Res Clin Pract 2009; 83(2): 268-279. 4. Lin EHB, Von Korff M, Ciechanowski P, Peterson D, Ludman EJ, Rutter CM, et al. Treatment Adjustment and Medication Adherence for Complex Patients With Diabetes, Heart Disease, and Depression: A Randomized Controlled Trial. Annals of Family Medicine 2012; 10(1): 6-14. 5. Carper M, Traeger L, Gonzalez J, Wexler D, Psaros C, Safren S. The differential associations of depression and diabetes distress with quality of life domains in type 2 diabetes. J Behav Med 2014; 37(3): 501-510. 6. Simon GE, VonKorff M, Barlow W. Health care costs of primary care patients with recognized depression. Arch Gen Psychiatry 1995; 52(10): 850. 7. Goodwin RD, Kroenke K, Hoven CW, Spitzer RL. Major depression, physical illness, and suicidal ideation in primary care. Psychosom Med 2003; 65(4): 501-505. 8. Dobson KS, Dozois DJ. Risk factors in depression. Academic Press; 2011. 9. Kennedy N, Abbott R, Paykel E. Remission and recurrence of depression in the maintenance era: long-term outcome in a Cambridge cohort. Psychol Med 2003; 33(05): 827-838. 10. Burcusa SL, Iacono WG. Risk for recurrence in depression. Clin Psychol Rev 2007; 27(8): 959-985. 55 11. Lustman PJ, Griffith LS, Freedland KE, Clouse RE. The course of major depression in diabetes. Gen Hosp Psychiatry 1997; 19(2): 138-143. 12. King M, Walker C, Levy G, Bottomley C, Royston P, Weich S, et al. Development and validation of an international risk prediction algorithm for episodes of major depression in general practice attendees: the PredictD study. Arch Gen Psychiatry 2008; 65(12): 1368-1376. 13. King M, Bottomley C, Bellón-Saameño J, Torres-Gonzalez F, Švab I, Rotar D, et al. Predicting onset of major depression in general practice attendees in Europe: extending the application of the predictD risk algorithm from 12 to 24 months. Psychol Med 2013; 43(09): 1929-1939. 14. Huang SH, LePendu P, Iyer SV, Tai-Seale M, Carrell D, Shah NH. Toward personalizing treatment for depression: predicting diagnosis and severity. J Am Med Inform Assoc 2014; 21(6): 1069-1075. 15. Wang J, Sareen J, Patten S, Bolton J, Schmitz N, Birney A. A prediction algorithm for first onset of major depression in the general population: development and validation. J Epidemiol Community Health 2014; 68(5): 418-424. 16. Diggle P, Heagerty P, Liang KY, Zeger S. Analysis of longitudinal data. Oxford University Press; 2002. 17. Kroenke K, Spitzer RL. The PHQ-9: a new depression diagnostic and severity measure. Psychiatr Ann 2002; 32(9): 1-7. 18. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001; 16(9): 606-613. 19. Katon WJ, Lin EH, Von Korff M, Ciechanowski P, Ludman EJ, Young B, et al. Collaborative care for patients with depression and chronic illnesses. N Engl J Med 2010; 363(27): 2611-2620. 20. Wu S, Ell K, Gross-Schulman SG, Sklaroff LM, Katon WJ, Nezu AM, et al. Technology- facilitated depression care management among predominantly Latino diabetes patients 56 within a public safety net care system: Comparative effectiveness trial design. Contemp Clin Trials 2014; 37(2): 342-354. 21. Wu B, Jin H, Vidyanti I, Lee P-J, Ell K, Wu S. 
Collaborative depression care among Latino patients in diabetes disease management, Los Angeles, 2011-2013. Prev Chronic Dis 2014; 11: E148.

22. Wu S, Vidyanti I, Liu P, Hawkins C, Ramirez M, Guterman J, et al. Patient-centered technological assessment and monitoring of depression for low-income patients. J Ambul Care Manage 2014; 37(2): 138-147.

23. Singer JD, Willett JB. Applied longitudinal data analysis: modeling change and event occurrence. Oxford University Press; 2003.

24. Gardiner JC, Luo Z, Roman LA. Fixed effects, random effects and GEE: what are the differences? Stat Med 2009; 28(2): 221-239.

25. R Core Team. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2015.

26. Venables WN, Ripley BD. Modern applied statistics with S. Springer Science & Business Media; 2002.

27. Breslow NE, Clayton DG. Approximate inference in generalized linear mixed models. J Am Stat Assoc 1993; 88(421): 9-25.

28. Bolker BM, Brooks ME, Clark CJ, Geange SW, Poulsen JR, Stevens MHH, et al. Generalized linear mixed models: a practical guide for ecology and evolution. Trends Ecol Evol 2009; 24(3): 127-35.

29. Pinheiro JC, Bates DM. Mixed-effects models in S and S-PLUS. Springer; 2000.

30. Breitung J, Chaganty N, Daniel R, Kenward M, Lechner M, Martus P, et al. Discussion of "Generalized estimating equations: notes on the choice of the working correlation matrix". Methods Inf Med 2010; 49(5): 426.

31. Hubbard AE, Ahern J, Fleischer NL, Van der Laan M, Lippman SA, Jewell N, et al. To GEE or not to GEE: comparing population average and mixed models for estimating the associations between neighborhood risk factors and health. Epidemiology 2010; 21(4): 467-474.

32. Fisher L, Glasgow RE, Mullan JT, Skaff MM, Polonsky WH. Development of a brief diabetes distress screening instrument. Ann Fam Med 2008; 6(3): 246-252.

33. Bates D. Lmer, p-values and all that. 2006. https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html.

Table 1. RMSE of Population-Average and Subject-Specific Predictions for PHQ-9 Scores of Patients in the Validation Set

Prediction Type | Baseline | 6-Month Follow-up | 12-Month Follow-up | 18-Month Follow-up
Population-Average^a | 5.45 | 4.75 | 4.79 | 5.16
Subject-Specific^b | NA | 4.12 | 3.51 | 4.08

a Population-average prediction derived using Eq. 6
b Subject-specific prediction derived using Eq. 7

Table 2. Area under the ROC Curve When the Population-Average and Subject-Specific Predictions for PHQ-9 Scores Are Used to Classify Patients as Having Major Depression in the Validation Set

Prediction Type | Baseline | 6-Month Follow-up | 12-Month Follow-up | 18-Month Follow-up
Population-Average^a | 0.70 | 0.84 | 0.82 | 0.84
Subject-Specific^b | NA | 0.88 | 0.91 | 0.89

a Population-average prediction derived using Eq. 6
b Subject-specific prediction derived using Eq. 7
Table 3. Fixed Effects of Predictors in the 2-Level Poisson Regression Model, Estimated from the Whole Dataset

Predictor | Estimate | 95% CI | p^a

Level-2 time-invariant predictor to predict level-1 model intercept^b
Previous diagnosis of major depressive disorder | 0.482 | (0.302, 0.662) | <0.001

Level-2 time-invariant predictor to predict level-1 model coefficient of time variable^c
Age at study baseline | 0.008 | (0.003, 0.013) | 0.002

Time-varying predictors^d
Diabetes emotional burden | 0.267 | (0.197, 0.338) | <0.001
Diabetes regimen distress | 0.201 | (0.134, 0.268) | <0.001
Number of International Classification of Diseases, 9th Revision (ICD-9) diagnoses in past 6 months > 10 | 0.078 | (0.013, 0.144) | 0.02
Self-rated health (1 = poor to 5 = excellent) ≥ 3 | −0.358 | (−0.421, −0.296) | <0.001
Unemployed | 0.203 | (0.103, 0.303) | <0.001
Feeling that my financial situation is getting worse | 0.130 | (0.073, 0.188) | <0.001
Having difficulty in paying bills | 0.203 | (0.139, 0.268) | <0.001
Hospitalized overnight | 0.136 | (0.056, 0.217) | <0.001
Chronic pain | 0.220 | (0.159, 0.281) | <0.001
Enrolled in the collaborative care management program | −0.196 | (−0.269, −0.123) | <0.001

a Conditional t-test
b Model that relates level-2 time-invariant predictors to the level-1 model intercept, described in Eq. 2; estimated intercept = 1.140, 95% CI = (1.022, 1.258)
c Model that relates level-2 time-invariant predictors to the level-1 model coefficient of the time variable, described in Eq. 3; estimated intercept = −0.243, 95% CI = (−0.305, −0.181)
d Model that relates time-varying predictors to the predicted outcome, described in Eq. 1

Figure 1. ROC Curves for the Predicted PHQ-9 Scores against Real PHQ-9 Scores ≥ 10

Chapter 5

Future Directions

This dissertation presents a series of three studies that use statistical learning methods to develop accurate prediction models for a common and complex mental condition. The studies investigate using predictive models to improve population health management and demonstrate the benefits of such model-based care. The dissertation provides new applications of engineering methods to transform the health care delivery system into a more patient-centered and efficient system. The series of studies reveals several opportunities for future research. First, the studies yield encouraging results for a model based approach to delivering depression screening and monitoring. The model based approach warrants further testing and implementation in a clinical setting. Second, the studies demonstrate that high accuracy in predicting depression among safety-net patients with diabetes can be achieved when proper predictors and statistical learning methods are used. It is worth investigating the generalizability of the prediction models presented in the dissertation to broader patient populations, such as the general diabetic population. Third, the studies establish a prediction model based approach to deliver precision depression screening and monitoring. The methods used in developing the approach could be applied to the development of other precision health care interventions. Precision health care refers to the delivery of interventions tailored to the individual characteristics of each patient and is endorsed by the National Institutes of Health's recent Precision Medicine Initiative (1). Fourth, the prediction model based approach developed in this dissertation has the potential to be an organic part of more comprehensive e-health interventions to facilitate the delivery of patient-centered, efficient, and effective depression care.
For example, the model-based approach could be integrated with the Diabetes-Depression Care Adoption Technology (2) to provide more targeted and efficient automated telephone assessment of depressive symptoms. The integration of predictive analytics with broader e-health interventions merits further investigation.
Finally, broader applications of predictive analytics in promoting population health are worth investigating. As health care delivery systems take on more responsibility for managing population health, prediction models like the ones developed in this dissertation could play an important role in identifying the patients most in need of care, thereby supporting providers in proactively reaching out to those patients.

References of Chapter 5
1. Precision medicine initiatives. Washington (DC): National Institutes of Health. http://www.nih.gov/precisionmedicine/. Accessed Jan 24, 2016.
2. Wu S, Vidyanti I, Liu P, Hawkins C, et al. Patient-Centered Technological Assessment and Monitoring of Depression for Low-Income Patients. J Ambul Care Manage 2014; 37(2): 138-47.

Appendix I – Description of the Two Clinical Trials Used to Develop Clinical Prediction Models in the Dissertation

The clinical prediction models presented in this dissertation were developed from two clinical trials: the Diabetes-Depression Care-management Adoption Trial (DCAT) and the Multifaceted Diabetes and Depression Program (MDDP). This appendix summarizes the design and main results of the two trials and provides references for further details.

The Multifaceted Diabetes and Depression Program (MDDP) Study
MDDP was a randomized controlled trial of 387 underserved, predominantly Hispanic (96.5%) diabetes patients with major depressive disorder, recruited from two safety-net clinics in the Los Angeles County Department of Health Services (LACDHS), the 2nd-largest safety-net healthcare system, from August 2005 to July 2007 and followed over 18 months. An intervention (INT group) based on the collaborative care model (1), including first-line treatment choice of problem-solving therapy and/or antidepressant medication; telephone follow-up of treatment response, adherence, and relapse prevention over 12 months; and systems navigation assistance, was compared with usual care (UC group), consisting of standard clinic care plus patient receipt of depression educational pamphlets and a community resource list. The primary hypothesis of the trial was that the intervention would improve depressive symptom outcomes, functional status, quality of life, and A1C levels at 6, 12, and 18 months. A Patient Health Questionnaire-9 (PHQ-9) score ≥10 was used to identify patients meeting the inclusion criterion for major depressive disorder. Of 1803 diabetes patients screened, 387 (30.2%) met the inclusion criteria. Among the enrolled patients, 98% had Type 2 diabetes and 83% had A1C levels ≥7. Comparison of important baseline characteristics showed that depression severity was significantly associated with diabetes complications, medical comorbidity, greater anxiety, dysthymia, financial worries, social stress, and poorer quality of life.
Analysis reported elsewhere (2) showed a significant group-by-time interaction over 18 months in diabetes symptoms, anxiety, functional status measured by the Short-Form Health Survey (SF-12) mental and physical scores, pain-related functioning, disability scale, financial situation, and number of social stressors, but no group-by-time interaction in A1C, diabetes complications, self-care management, or Body Mass Index (BMI). The primary conclusion of the MDDP trial is that collaborative depression care can improve depression, functional outcomes, and receipt of depression treatment among predominantly Hispanic patients with diabetes in safety-net clinics. Further details of the MDDP trial have been described elsewhere (2, 3).

The Diabetes-Depression Care-management Adoption Trial (DCAT) Study
DCAT was a comparative effectiveness study based on MDDP. DCAT employed a quasi-experimental design with three intervention arms: Usual Care (UC), which was the same as the usual care intervention in MDDP; care-management Supported Care (SC), which was the same as the intervention tested in MDDP, including a collaborative care management program for patients with poorly controlled diabetes (1); and Technology-facilitated Care-management supported care (TC), which tested an innovative system of automated telephonic depression screening and monitoring integrated with the collaborative care management program to facilitate adoption of the collaborative depression care model (2, 4, 5). A total of 1406 patients were enrolled in DCAT (484 in UC, 480 in SC, and 442 in TC). Patients were assessed on demographics; diabetes and depression history, status, and medication; functional status; health services utilization; financial status; and stress at baseline and at 6, 12, and 18 months.
There were no significant differences in PHQ-9 depression scores, Medical Outcomes Study Short-Form Health Survey (SF-12) mental scores, Sheehan disability scale ratings, or BMI in pairwise comparisons between groups. Other demographic and diabetes variables varied significantly across study groups. This was anticipated because, in the quasi-experimental DCAT design, pre-treatment differences are more common than would be expected in a randomized experimental design. After using the generalized propensity score (GPS) method (6) to control for pre-treatment differences, statistical analyses indicated that, compared with UC, both TC and SC significantly decreased PHQ-9 scores, reduced the number of patients with major depression, improved depression remission, increased patient satisfaction with care for emotional problems, and improved Sheehan disability scale and A1C measures. Only TC significantly reduced cholesterol levels and improved emotional care satisfaction among depressed patients and patient satisfaction with diabetes care. There were no statistically significant differences between TC and SC on any outcome measure except A1C. TC is likely to be more cost-effective than SC and UC. Further details of the DCAT trial have been described elsewhere (4, 5).

References of Appendix I
1. Katon WJ, Lin EH, Von Korff M, Ciechanowski P, et al. Collaborative care for patients with depression and chronic illnesses. N Engl J Med 2010; 363(27): 2611-20.
2. Ell K, Katon W, Xie B, Lee PJ, et al. Collaborative Care Management of Major Depression Among Low-Income, Predominantly Hispanic Subjects With Diabetes: a randomized controlled trial. Diabetes Care 2010; 33(4): 706-13.
3. Ell K, Katon W, Cabassa LJ, Xie B, Lee PJ, et al.
Depression and Diabetes among Low-Income Hispanics: Design Elements of a Socio-Culturally Adapted Collaborative Care Model Randomized Controlled Trial. Int J Psychiatry Med 2009; 39(2): 113-32.
4. Wu S, Ell K, Gross-Schulman SG, Sklaroff LM, et al. Technology-facilitated depression care management among predominantly Latino diabetes patients within a public safety net care system: Comparative effectiveness trial design. Contemp Clin Trials 2014; 37(2): 342-54.
5. Wu S, Vidyanti I, Liu P, Hawkins C, et al. Patient-Centered Technological Assessment and Monitoring of Depression for Low-Income Patients. J Ambul Care Manage 2014; 37(2): 138-47.
6. Imbens GW. The role of the propensity score in estimating dose-response functions. Biometrika 2000; 87(3): 706-10.

Acknowledgements
I would like to thank all the people who have mentored and supported my doctoral research over the past four years. I would like to give special thanks to my PhD adviser, Prof. Shinyi Wu, who has been a great mentor, a good friend, and a strong supporter of my research. I would like to thank my dissertation advisory committee members, Prof. Qiang Huang, Prof. Chih-Ping Chou, Prof. Sheldon Ross, and Prof. Yan Liu. I would like to thank Dr. Paul Di Capua, Dr. Armen Arevian, Prof. Kathleen Ell, Prof. Amy Cohn, Pey-Jiuan Lee, Dr. Irene Vidyanti, Maggie Ramirez, Abdullah Alibrahim, and John Franklin for their contributions to and help with my doctoral research and study. Finally, I would like to thank my wife, Jianhui, and my son, Brian, for their constant love and strong support.
Abstract
Clinical prediction is a key component of effective population health management. This dissertation comprises three original research articles that investigate applications of evidence-based predictive analytics to engineer population depression care management for patients with diabetes. In the first article, a clinical prediction model for major depression is developed using supervised learning techniques, based on predictors that are related to diabetes care, easy to acquire in clinical practice, and identified from the baseline data of two clinical trials. The derived prediction model can support a partial-screening policy that helps providers save resources and time, improve efficiency, and make better-informed decisions about depression screening. The second article extends the scope to repeated depression screening and develops a model-based approach to systematically managing current and future depression screening. The third article focuses on the challenges of utilizing longitudinal patient information in predictive modeling and develops a generalized multilevel regression model that uses a patient's historical clinical course to predict the occurrence of depression among patients with diabetes.