University of Southern California Dissertations and Theses
Enabling clinically based knowledge discovery in pharmacy claims data: An application in bioinformatics
INFORMATION TO USERS

This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer.

The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adversely affect reproduction.

In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion.

Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand corner and continuing from left to right in equal sections with small overlaps. Photographs included in the original manuscript have been reproduced xerographically in this copy. Higher quality 6" x 9" black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directly to order.

ProQuest Information and Learning
300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA
800-521-0600

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

ENABLING CLINICALLY BASED KNOWLEDGE DISCOVERY IN PHARMACY CLAIMS DATA: AN APPLICATION IN BIOINFORMATICS

by

Jason Peter Jones

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree of
DOCTOR OF PHILOSOPHY
(PREVENTIVE MEDICINE)

August 2001

Copyright 2001 Jason Peter Jones
UMI Number: 3054758

Copyright 2001 by Jones, Jason Peter
All rights reserved.

UMI Microform 3054758
Copyright 2002 by ProQuest Information and Learning Company.
All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.

ProQuest Information and Learning Company
300 North Zeeb Road
P.O. Box 1346
Ann Arbor, MI 48106-1346

UNIVERSITY OF SOUTHERN CALIFORNIA
THE GRADUATE SCHOOL
UNIVERSITY PARK
LOS ANGELES, CALIFORNIA 90007

This dissertation, written by .........................................................., under the direction of his Dissertation Committee, and approved by all its members, has been presented to and accepted by The Graduate School in partial fulfillment of requirements for the degree of

DOCTOR OF PHILOSOPHY

Dean of Graduate Studies

DISSERTATION COMMITTEE

Acknowledgements

Having never been accused of being overly optimistic, it probably is not surprising to many that I had doubts about this dissertation ever being written, let alone accepted. I was honest and consistent in my belief that only a successful defense and a completed list of signatures would afford me meaningful security. Such is the nature of the less traveled path. Fortunately, I was in excellent company during my walk.

Harland Sather's knowledge, humility, and generosity are an inspiration to me and every other student of his. Doug Stahl's fresh experience and enthusiasm have made this a better document. Cyrus Shahabi gave me some outstanding opportunities and provided a framework that extended the utility of my practical experience. Mike Nichol provided astounding, diverse, and consistent support and expertise.
Even beyond the expected motivation and support, I believe there are three attributes given to me by others that also deserve mention. I would like to thank my parents for giving me the ability to set lofty, solid aspirations and the tenacity to finish. The best gifts are ones that are appreciated more by the recipient than the giver. My wife and kids provided me with a pride-based focus that allowed me to convert failures into a more meaningful success.

I don't want to mince words in my last acknowledgement. In the strong sense, this dissertation would not have been possible without my advisor, Stan Azen. The utility of this dissertation is entirely dependent on my ability to take advantage of a man who has a rare ability to create, develop, and implement visionary work.

Table of Contents

Acknowledgements
List of Figures
List of Tables
Abstract
1 Introduction
  1.1 Purpose of Dissertation
  1.2 Dissertation Organization
2 Background
  2.1 Health Care Delivery Systems
  2.2 Efficacy versus Effectiveness
  2.3 Pharmacoeconomics and Outcomes Research
  2.4 Medical Data Repositories in Theory
  2.5 Medical Data Repositories in Practice
  2.6 Clinical Pharmacy Constructs
  2.7 Three Methods for Transforming Pharmacy Claims
  2.8 Knowledge Discovery in Databases (KDD)
  2.9 Computing Technology
  2.10 Statistical Methods for Learning from Experts
  2.11 Transforming Pharmacy Claims into Clinical Constructs
3 Methods for Transforming Data
  3.1 Pharmacy Claims Database
  3.2 Data Extraction
  3.3 Analytic Database
  3.4 Case Sampling for Expert Opinion
  3.5 Clinical Expert Processing
  3.6 Modeling Expert Opinion
  3.7 Expert Model Refinement (Optional)
  3.8 Code Generation for Claims Processing
  3.9 Model Assessment
  3.10 Predictor Variable Set Modification
  3.11 Application of Code to Larger Database
  3.12 Output Clinical Database
4 Comparison of C&RT Methods and Predictor Variable Sets
  4.1 Introduction
  4.2 Hypotheses
  4.3 Methods
    4.3.1 Description of the Sample
    4.3.2 Rule Generation Methods
    4.3.3 Predictor Variable Sets
    4.3.4 Assessment Method
  4.4 Results
    4.4.1 Overall Model Performance
    4.4.2 Predictor Variable Set Modification
  4.5 Discussion
    4.5.1 Overall Model Performance
    4.5.2 Predictor Variable Set Modification
  4.6 Conclusion
5 Application to Antidepressant Treatment
  5.1 Introduction
  5.2 Hypotheses
  5.3 Methods
    5.3.1 Description of Sample
    5.3.2 Creation of Key Episodes and Group Assignment
    5.3.3 Covariates
    5.3.4 Outcome Variables
    5.3.5 Analytic Techniques
  5.4 Results
    5.4.1 Treatment Selection Bias
    5.4.2 Treatment Discontinuation
    5.4.3 Treatment Events
    5.4.4 Prescription Costs
  5.5 Discussion
  5.6 Conclusion
6 Conclusions and Directions for Future Work
  6.1 Success in Modeling Expert Opinion
  6.2 Process Integration
  6.3 Improving Computing Efficiency
  6.4 Incorporating Multiple Experts
  6.5 Expanding Prescription Classes
  6.6 Expanding Opinions Modeled
7 Conclusion
8 References
9 Bibliography
A1 Appendix 1: Sample Pharmacy Claims Patterns
A2 Appendix 2: Analytic Data Dictionary
A3 Appendix 3: RxReview Help
A4 Appendix 4: Sample Query of Treatment Tables
A5 Appendix 5: List of Evaluated Classification Tree Packages
A6 Appendix 6: Description and Sample Use of Outcomes Visualyzer
List of Figures

2a Example Medical Claim
2b Example Pharmacy Claim
2c Knowledge Discovery Process Overview
3a Basic Process Diagram
3b Storage Implementation
3c Basic Analytic Database Structure
3d Case Selection
3e Sample Decision Interface
3f Pruning or Growing the Tree
3g Changing a Splitter
3h Conversion of Tree to SQL Code
3i Basic Clinical Constructs
3j Application Processing
3k Treatment Windows
3l RDBMS Structure of Treatments
4a Graphical Representation of BMatch
4b Multi-Drug C-Statistics by Method and Run
4c Median (Min-Max) Multi-Drug C-Statistics
4d Mono-Drug C-Statistics by Method and Run
4e Median (Min-Max) Mono-Drug C-Statistics
4f Node 1 from CARTi
4g Node 1 from CARTs
4h Sample Problem Claim Pair
5a Total Number of Prescriptions by Product and Month
5b Treatment Discontinuation
5c Treatment Events
5d Prescription Costs
5e Smeared Treatment Cost Estimates Before Formulary Change
5f Smeared Treatment Cost Estimates After Formulary Change
A5a Sample Initial Screen
A5b Sample Graphing Controls
A5c Sample Plot with Grouping
A5d Sample Plot with Grouping and Filtering
A5e Sample Plot with Grouping and Filtering
List of Tables

2a Types of Treatment Changes
2b Summary of Methods Presented in Relevant Literature
2c Approach Summary
3a Sample Extract Code
3b Basic Predictor Variable Set (PVS) Structure
3c Null Values
4a Claim Pair Classification
4b Decisiveness and Disagreement
4c Multi-Drug C-Statistics
4d Mono-Drug C-Statistics
4e Formal Method Comparison
4f Decisiveness and Disagreement
5a Treatment Selection
5b Treatment Discontinuation
5c Treatment Events
5d Prescription Costs
5e Smeared Treatment Cost Estimates Before Formulary Change
5f Smeared Treatment Cost Estimates After Formulary Change

Abstract

This dissertation describes the development, application, and evaluation of a set of methods for transforming standard pharmacy claims data into a clinically relevant database that can facilitate healthcare research. Prescription claims data are an abundant, accessible, detailed, and reliable source of healthcare information. Unfortunately, the standard structure of these databases makes them difficult to use for clinically oriented research. Historically, data complexities have been ignored or simple rules have been used to transform claims data because they were too voluminous for human expert review. In this dissertation, concepts from statistics, computer science, and pharmacy were integrated under the general framework of Knowledge Discovery in Databases (KDD) to build a process for focusing and inducing expert opinion to transform prescription claims data. In their raw form, prescription claims are stored serially as a list of prescription purchases over time.
To be clinically useful, the claims must be combined into prescription treatments. A graphical interface was developed to gather decisions from a clinical expert about whether or not individual claim pairs should be combined into prescription treatments. An analytic database was created using 11,654 expert-reviewed claim pairs derived from an existing prescription claims database. A classification tree methodology was then applied to the database in an attempt to induce expert decisions based on a flexible set of predictor variables generated directly from the prescription claims.

Two different classification tree approaches and four versions of the predictor variable sets (PVSs) were compared with each other and with a fixed heuristic for data transformation. An m-items-out approach was used in the testing procedure to independently train and test the models. The model-based classification rules significantly outperformed the simple rule when claim pairs consisted of different drugs, and performed as well as the simple rule when the drugs were the same. The best combination of classification tree approach and PVS was used to generate a set of rules that was subsequently applied to a larger dataset and used to generate and describe prescription treatment episodes.

A sample analysis was conducted using the output database to specify inclusion/exclusion criteria, group assignment, stratification, and outcomes such as treatment discontinuation. Both visual and formal techniques were used in a way that would be common in an outcomes or pharmacoeconomic research endeavor. Pharmacy claims-based analysis complements randomized, controlled clinical trials. Claims databases represent relatively inexpensive and largely unexploited exploratory ground for understanding the relationships between prescription treatments and their healthcare and cost outcomes.
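The claim-combination task summarized above can be illustrated with the kind of fixed heuristic the model-based rules were compared against. This is a minimal sketch under assumptions: the 15-day gap threshold, the field names, and the `combine_into_episodes` helper are illustrative inventions, not the dissertation's actual rule.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Claim:
    member: str
    ndc: str            # drug identifier (National Drug Code)
    fill_date: date
    days_supplied: int

def combine_into_episodes(claims, max_gap_days=15):
    """Fixed-rule sketch: sort one member's claims by fill date and start a
    new treatment episode whenever the gap between a fill date and the prior
    coverage end exceeds max_gap_days (threshold is an assumption)."""
    episodes, current, coverage_end = [], [], None
    for claim in sorted(claims, key=lambda c: c.fill_date):
        if current and (claim.fill_date - coverage_end).days > max_gap_days:
            episodes.append(current)
            current, coverage_end = [], None
        current.append(claim)
        end = claim.fill_date + timedelta(days=claim.days_supplied)
        coverage_end = end if coverage_end is None else max(coverage_end, end)
    if current:
        episodes.append(current)
    return episodes

fills = [Claim("XXX", "00003011101", date(1997, 1, 1), 30),
         Claim("XXX", "00003011101", date(1997, 2, 5), 30),
         Claim("XXX", "00003011101", date(1997, 4, 1), 30)]
# The first two fills chain into one episode; the April fill starts a new
# episode because the gap after the prior coverage end exceeds 15 days.
```

A rule like this ignores drug switches, dose changes, and other clinical nuance, which is precisely the weakness that modeling expert claim-pair decisions is meant to address.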
The dissertation concludes with discussion of how the described methodologies could be improved and expanded.

1 Introduction

1.1 Purpose of Dissertation

In clinical medicine it is essential to evaluate the benefits and risks associated with new treatments that are developed for a given disease. Phase II and III clinical trials are designed to determine the risks and benefits of new treatments.1,2 By design, these safety and efficacy trials involve the careful selection and control of subjects. They employ careful inclusion/exclusion criteria, careful monitoring of patients, and strict adherence to planned study protocols for treatment to ensure valid assessments of causal relationships. The primary drawback to this level of control is that it produces an artificial environment, leaving questions about how treatments will perform in the real world.2,3

With the developments of pharmacoeconomics and outcomes research, changes in healthcare delivery, and advances in computing, researchers now have the ability to evaluate the benefits and risks associated with treatments in a natural setting.4,5 Organizations charged with delivering and managing healthcare services maintain massive data repositories (large, well-defined, central databases) containing detailed information related to services and treatments provided to patients.5 Though not capable of establishing treatment effects in the same way as randomized controlled trials (RCTs), these repositories can be used for assessing outcomes in actual practice. The primary obstacle in conducting clinically oriented research with these repositories is that the data are collected for the purpose of managing financial risk.5 A substantial transformation is required to mine (apply
statistical methods to produce knowledge) this wealth of untapped information that could be used to answer questions related to outcomes in actual clinical practice. The purpose of this dissertation is to present, apply, and evaluate a set of methods for transforming the data and removing the barriers preventing the efficient mining of these repositories. To date, there are no known published methods for performing this transformation. This dissertation is an application in bioinformatics, drawing upon statistical methods, pharmacy, and computer science to facilitate clinical knowledge discovery in prescription claims databases.

1.2 Organization of the Dissertation

Chapter 2 provides a general overview of the key concepts for this dissertation. First, the substantive areas of healthcare delivery, common analysis approaches, claims data systems, and existing methods for utilizing the claims data are covered. Second, the Knowledge Discovery approach is presented as a framework in which to make better use of the claims data. Finally, applicable computational and statistical methods are presented.

Chapter 3 presents the set of methods developed in this dissertation. After providing an overview of the entire process, each step (from extracting the original data, to inducing expert human judgment with increasing accuracy, to outputting data) is described.

Chapters 4 and 5 present independent sets of experiments. Each chapter contains its own introduction, hypotheses, methods, conclusions, and discussion sections. Chapter 4 evaluates the variable creation, selection, and particular statistical techniques for modeling the human expert opinion. Chapter 5 uses the data output by the dissertation methods to study treatment compliance, events, and cost in patients taking antidepressant medication.
Chapters 6 and 7 present an overall summary of the dissertation, conclusions, and directions for future research.

2 Background

The general nature and context of the research problem are explained in Sections 2.1 through 2.8. Section 2.1 describes healthcare delivery systems in the United States. Section 2.2 distinguishes between the concepts of efficacy and effectiveness. The definition, usefulness, and scope of pharmacoeconomics and outcomes research are provided in Section 2.3. Sections 2.4 and 2.5 explain medical data repositories and address how information about health care delivery gets stored electronically, both in theory and in practice. To be useful for outcomes research, these data must be transformed into clinically relevant constructs. Section 2.6 details the types and nature of these constructs. Having established the starting and ending points, Section 2.7 describes the three types of existing methods for moving from pharmacy claims to clinical constructs. The general underlying theoretical framework for understanding this dissertation, Knowledge Discovery in Databases (KDD), is presented in Section 2.8.

Having explained the general nature and context of the problem, Sections 2.9 and 2.10 provide background information for understanding the more technical aspects of the dissertation. Database management systems (DBMS) are explained in Section 2.9. The presented methods are based upon the use of statistical techniques to model human expert decisions. Section 2.10 defines and compares, in general terms, several statistical methods that are available for modeling expert decisions. Section 2.11 gives a description of how the strengths of human expertise and computing efficiency can be synergistically
combined to create a solution for problems where expert opinion needs to be applied to very large quantities of data.

2.1 Health Care Delivery Systems

The United States health care delivery system has evolved dramatically and with increasing speed over the last 100 years. The currently dominant system, known as managed care, began in the 1930s with the Ross-Loos Clinic, the Palo Alto Clinic, and the Kaiser Permanente System.6 The shift to providing medical care and associated services via Managed Care Organizations (MCOs) accelerated from the 1970s through the 1990s as a response to unacceptable health care inflation and federal legislation mandating the availability of managed care to employees of large organizations. While there are a number of MCO models for operation, the central goal is to balance services and cost to provide the appropriate care to the largest number of people.7

The Genesis of MCOs

If a person develops an illness or injury that requires medical attention, the cost of treatment often extends beyond what the patient can afford as an individual. To guard against financial hardship, individuals may obtain medical insurance by purchasing it themselves, through their employers, or, in some cases, from the government. In most cases, the insurer does not treat the patient. Instead, the insurer enters into a contractual agreement with the individuals (e.g., physicians, pharmacists, or nurses) or organizations (e.g., physician groups, pharmacies, or hospitals) that actually administer the services. The patient typically pays a small percentage of the overall healthcare costs and the insurer pays the rest.

Through the 1960s and 1970s, the most common reimbursement system was called "Fee for Service" (FFS).7 Under this model, the healthcare provider would submit a claim and be reimbursed for each provided service.
The primary role of the insurance company was to adjudicate the claim and pay for the services. The FFS model set up a situation whereby providers were encouraged to submit as many claims as possible, and this drove the cost of healthcare to unsustainable levels. In 1973, the United States Congress passed the HMO Act(a) as a way to indirectly control healthcare costs through the creation of private organizations that were "federally qualified" to manage the administration of healthcare services to their members (patients).7

In the 1980s and 1990s MCOs became strong enough to exert substantial control over healthcare providers by selectively offering contracts to providers at standardized rates. MCOs more carefully scrutinized the claims their providers submitted as part of their efforts to provide the highest level of care to the greatest number of patients in a market economy. As a result, providers were limited in the services they could provide, and many services would not be reimbursed without prior authorization from the MCO itself.

(a) There are different types of MCOs, but this dissertation will consider MCO and HMO to be synonymous to avoid confusion.

Financial Risk Segmentation

Insurance companies exist to manage financial risk, and the 1973 HMO Act gave MCOs more leverage in managing this risk. Towards the end of the 1980s and through the 1990s, MCOs began segmenting their financial risk in three ways.

One segmentation method is known as a carve-out. Carve-outs break up the financial risk, typically based on a clinical condition or group of clinical conditions that are usually treated by a particular subclass of physicians.2 For instance, a common carve-out is for psychiatric services.
With a mental health carve-out, the treatment and payment of most or all mental health services (e.g., counseling, hospital stays, and medications) is handled by a select group of providers and managed by a subcontracted MCO.

Another way to segment financial risk is by the type of service provided.(a) A common segmentation by service type is for an MCO to have a separate entity manage pharmacy (medication) benefits. Due to the large number of new medications that become available each year, the increasing costs of these medications, and the increasing number of people taking medications, the management of pharmacy benefits has become an industry in and of itself. Organizations that manage these benefits and assume the financial responsibility associated with them are called Pharmacy Benefit Managers (PBMs).8

(a) Service type segmentation is also a type of carve-out.

The third way MCOs have segmented financial risk is called risk-sharing. As MCOs increased in popularity, the concept of the Primary Care Physician (PCP) was created. The PCP's function in an MCO was that of a gatekeeper: to manage patient access to more expensive specialists and services. To motivate PCPs, MCOs began paying flat rates per member assigned to the PCP and setting up incentives based on the PCP's management of her/his patients. The management model assumes that, if the PCP manages her/his patients well, the members will have lower healthcare utilization. Since the PCP receives a flat rate per member, better health management should mean the PCP will make more money per member. Poorly managed patients will result in less profit per member. For instance, providing check-ups and medication for diabetic patients on a regular basis could prove less costly than having to perform leg amputations or dealing with massive organ failure as a result of unmanaged diabetes.
The concept of paying a flat rate per person is called capitation.2 Capitation can be extended beyond the PCP to specialists, hospitals, or entire medical groups.7,9 Capitation agreements can shift small amounts or all of the financial risk between the MCO and the provider, to the point where the role of the MCO becomes more of an administration, marketing, and sales organization. Towards the late 1990s, providers were assuming so much of the financial risk without the ability to manage it that, in some areas of the country, they began losing money and eventually going bankrupt.6

2.2 Efficacy versus Effectiveness

Before a medication can be prescribed and sold in the United States, it must be approved by the Food and Drug Administration (FDA). The FDA requires controlled clinical trials to prove the safety and efficacy of the drug. How effective the drug will actually be outside the controlled trial in the clinical market is not always obvious based on the results of the clinical trial.3,4

For the purposes of this dissertation, efficacy is defined as:

The degree to which a therapeutic outcome is achieved in a patient population under rigorously controlled and monitored circumstances, such as randomized controlled clinical trials. Alternatively, the probability of a benefit to individuals in a defined population from a medical technology applied for a given medical problem under ideal conditions of use.2

Effectiveness is defined as:

The degree to which a therapeutic outcome is achieved in a general patient population from a medical technology applied for a given medical problem under actual or average conditions of use.2

For instance, when drugs are administered in an RCT, drug A might be proven to be slightly more efficacious than drug B.
However, drug A might have a more severe side effect profile than drug B, to the extent that most patients treated with drug A become noncompliant. Since a treatment cannot be effective unless it is taken, drug B might be more effective than drug A in actual clinical practice.

2.3 Pharmacoeconomics and Outcomes Research

The International Society for Pharmacoeconomics and Outcomes Research (ISPOR) defines pharmacoeconomics as:

The field of study that evaluates the behavior of individuals, firms and markets relevant to the use of pharmaceutical products, services, and programs, and which frequently focuses on the costs (inputs) and consequences (outcomes) of that use.2

Outcomes research is defined as:

The collection and analysis of data on the use of health care products, procedures, services and programs, and the evaluation of the clinical, economic, quality of life, and patient satisfaction outcomes of that care.2

This dissertation focuses on an area of pharmacoeconomics and outcomes research based on information that can be gathered from existing medical claims databases. When available, this information often provides the foundation for other areas of pharmacoeconomic and outcomes research, including issues such as cost, quality of life, and patient satisfaction.

Usefulness

The appropriate use of medical services and medications has been shown to improve patient health, decrease treatment costs, and save money for the employers who often cover their employees' medical benefits.10 The appropriateness of treatment is not always obvious and is becoming less so as the number and
Pharmacoeconomics and outcomes research attempt to determine the relative appropriateness of available treatments in natural settings: the relative effectiveness of treatments. The results of this research have both immediate and long-term consequences. PBM organizations maintain drug formularies to limit the list of medications a physician can prescribe for her/his patients. The formulary decisions are often based on pharmacoeconomic and outcomes research. Therefore, shortly after the research is made available, it could directly impact millions of members’ access to certain medications.4 The results of these research endeavors may have longer-term impact by motivating further research along a number of fronts. Pharmaceutical companies can use the research to find potentially lucrative voids in available treatments. The results could also lead to further academic or industry-based research broadly based on the original work or specifically designed to address weak components. Limitations The impetus for pharmacoeconomics and outcomes research often comes from existing literature or the results of controlled clinical trials. The data typically come from existing databases maintained by MCOs. If the data are inadequate the results are likely to be inaccurate. Even under the best circumstances, the lack of randomization and experimental control allows for potential bias that would prevent 11 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. pharmacoeconomic and outcomes research from establishing risk/benefit profiles with the same degree of certainty that can be attained with traditional RCTs. 2.4 Medical Data Repositories in Theory A data repository is a central database for large quantities of data. The data often come from many different sources and are used by many different users for many different purposes. 
MCOs maintain data repositories with information ranging from member demographics and eligibility, to physician accreditation and contracting, to hospital and physician office visit claims, to pharmacy claims. Distinct departments from many different companies feed data into the data repository, but many data elements often are required to build a complete understanding of the healthcare experience for each member. Once populated, the data repository is segmented again by different users looking for answers to questions ranging from which physicians a member can visit in a particular geographic region to how well physicians in that region are maintaining their patients.

Medical Service Data

Medical services (e.g., check-ups or surgeries) are provided to patients in places like physician offices, clinics, and hospitals. Each service involves procedures (e.g., x-raying an arm and putting the arm in a cast) and diagnoses (e.g., broken arm). In a Fee for Service (FFS) environment, the provider must document what was done and why, using standard forms and codes, in order to be reimbursed. This claim would then be processed by the MCO and stored in the data repository such that (at the very least) the provider, patient, procedure, diagnosis, and date could be determined. Standard coding systems exist for both procedures (CPT codes) and diagnoses (ICD codes) so that retrieving and categorizing claims is efficient (Figure 2a).(a),5

Figure 2a: Example Medical Claim

Member  Date        CPT    ICD9    Cost
XXX     1997-06-21  67228  250.00  $50.00
Retinopathy (Eye Exam)

Pharmacy Claims

A pharmacy claim is generated every time a member fills a prescription for a medication.
The National Council for Prescription Drug Programs (NCPDP), which is recognized by the American National Standards Institute (ANSI), has developed the process and content for transferring pharmacy claims from a pharmacy to the PBM.5 Minimally, the data transfer includes the prescriber, patient, drug identifier, prescription fill date, dose, quantity, and prescription duration (Figure 2b). The system was originally designed to enforce formularies and to make sure that pharmacies did not provide medications to patients who were not eligible for them.5 Over time, the system has been enhanced to prevent adverse events such as drug-drug interactions that could be harmful to the patient. Non-prescription medications do not generate pharmacy claims records unless they are covered by the health plan.

a Methods exist for handling cases where multiple diagnoses or procedures are involved in a single visit or when a single visit lasts for extended periods of time (e.g., a hospital stay).

Ancillary Data

Besides the information about services the MCO administers, information relating to the claims administration itself is also stored. There are differences in the types of ancillary data maintained by different organizations, but the following types of information are often included:
• Physician, hospital, pharmacy, and member names and addresses.
• Physician accreditation and contract information.
• Employer group contract information.
• Records specifying which benefits the members were eligible for and at what times.
• Information explaining how to decode procedure, diagnosis, and drug codes.
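The claim record layouts described above and in Figures 2a and 2b can be made concrete with a small sketch. The class and field names below are illustrative stand-ins for this discussion only, not the NCPDP transfer format or any actual MCO schema:

```python
from dataclasses import dataclass
from datetime import date

# Minimal record layouts mirroring the fields shown in Figures 2a and 2b.
# Field names are illustrative, not a standard schema.

@dataclass
class MedicalClaim:
    member: str
    service_date: date
    cpt: str        # procedure code
    icd9: str       # diagnosis code
    cost: float

@dataclass
class PharmacyClaim:
    prescriber: str
    member: str
    fill_date: date
    ndc: str            # drug identifier
    quantity: int
    days_supplied: int
    cost: float

# The example rows from Figures 2a and 2b:
medical = MedicalClaim("XXX", date(1997, 6, 21), "67228", "250.00", 50.00)
pharmacy = PharmacyClaim("YYY", "XXX", date(1997, 6, 21),
                         "00003011101", 200, 30, 50.00)
```

Representing each claim as one typed record mirrors the fact that each fill or service generates its own row in the repository.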
Figure 2b: Example Pharmacy Claim

  Prescriber  Member  Date        NDC          Quantity  Days Supplied  Cost
  YYY         XXX     1997-06-21  00003011101  200       30             $50.00
                                  Injectable Insulin 100 U/ML

Various departments inside and outside the MCO maintain this information. For instance, many MCOs have their own accreditation and contracting departments for physicians. However, procedure, diagnosis, and drug codes are usually developed and supplied by third parties such as the American Medical Association (AMA) for procedure codes or the FDA for drug codes.2

2.5 Medical Data Repositories in Practice

The quality of medical data repositories varies across MCOs. Expenditures on information technology have traditionally been lower in healthcare than in other fields such as banking.11 Like other industries, the health care industry has focused first on collecting and maintaining data directly related to business operations or finances.5 This often leaves medical data repositories lacking both accuracy and completeness, making them difficult to use for clinically oriented research.

Data Incompleteness and Errors

Even in the FFS environment, providers were paid based on procedures and not diagnoses. Therefore, if patients had conditions that were not directly related to the primary, reimbursable service, the diagnosis typically would not be included on the claim.12 Furthermore, to maximize payments, it is possible that physicians might modify the procedure coding or diagnosis slightly.

In the era of shared risk, the problem has become worse. Some MCOs have begun providing incentives for providers to submit claims for medical services, but many do not provide incentives, or the incentives are often insufficient to convince
the provider to submit a claim at all. When data are incomplete in semi-predictable but unrecoverable ways:

  The bad news is that often the available data are not representative of the population of interest and the worse news is that the data itself contains no hint that there is a potential bias present.13

Submitted claims often represent a biased sample, selected to improve the provider's chance of renegotiating improved capitation contracts. When claims are submitted, they are often inaccurate and go unchecked because they are not considered to be of critical financial importance to the MCO in the way that FFS claims were.

In contrast, pharmacy claims are highly regulated and well maintained. Of the 2.8 billion transactions processed by pharmacies in 1998, 95% were processed immediately and electronically.11 This means that the pharmacy claims data tend to be the highest quality and most complete data in the healthcare industry.5

Clinical Disconnect

While the completeness and accuracy of MCO pharmacy claims are very good, they are not immediately useful for pharmacoeconomic or outcomes research. The purpose of storing the claims was to facilitate claim adjudication and overall financial risk management. It was only after standards were in place that people started to consider the use of these data for clinically oriented research. Because pharmacoeconomic and outcomes research is clinically oriented, not having access to clinically oriented constructs in the pharmacy claims data presents a significant

Table 2a: Types of Treatment Changes

  Type         Description                                                  Example
  Dose Change  Patient changes from one dose to another for the same drug.  {A at X mg} => {A at Y mg}
  Drug Add     Patient keeps using one drug and adds one or more drugs
               to the treatment regimen.                                    {A} => {A, B}
  Drug Drop    Patient stops taking one drug but keeps taking other drugs.
               For instance, {A, B} => {A}.
  Drug Switch  Patient stops using one drug and starts using another.       {A} => {B}

obstacle to “mining” these data to evaluate things such as treatment effectiveness.5,11 PBMs and MCOs would benefit from having pharmacoeconomic and outcomes research studies on which to base their formulary decisions; however, they are rarely able to use their data to support their decisions.4

2.6 Clinical Pharmacy Constructs

Prescription Treatments

For this dissertation, the foundation on which all other clinical constructs are built is the prescription treatment (referred to as treatment for the remainder of the dissertation). Each prescription a patient has filled generates its own row in a pharmacy claims table. Clinically, a treatment may be composed of one or more medications (and therefore claims) taken together. For instance, people with pneumonia often take both an antibiotic and a steroid inhaler.

Changes to Treatments

Many diseases require multiple treatments. For instance, high blood pressure or diabetes may require hundreds of prescription treatments over the course of a patient's lifetime. Over time, the treatment choice may change. The four types of treatment changes defined in this dissertation are shown in Table 2a. Switches are not strictly necessary, as {A for X days} => {B for Y days} could be rewritten {A for X days} => {A, B for 0 days} => {B for Y days}. However, a drug switch has clinical relevance not necessarily captured by an add and a drop. It is possible to combine the basic change types to capture more complex changes (e.g., {A, B} => {A, C, D}).

Episodes of Care

Multiple treatments can be combined into episodes of care (referred to as episodes for the remainder of the dissertation). These are periods of time during which treatments are received for a given disease. This is not the same as an episode of disease.
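The four basic change types in Table 2a can be expressed as simple comparisons between successive regimens. The sketch below is a hypothetical illustration, not the operational definitions developed later in this dissertation; it represents a regimen as a mapping from drug to dose:

```python
def classify_change(before, after):
    """Classify the transition between two treatment regimens,
    each given as a dict mapping drug name -> dose.
    Returns a list of labels drawn from Table 2a's basic types."""
    changes = []
    dropped = set(before) - set(after)
    added = set(after) - set(before)
    kept = set(before) & set(after)
    if dropped and added:
        # One drug (or set of drugs) replaced by another.
        changes.append("drug switch")
    elif added:
        changes.append("drug add")
    elif dropped:
        changes.append("drug drop")
    for drug in sorted(kept):
        if before[drug] != after[drug]:
            changes.append("dose change")
    return changes

# The examples from Table 2a, with A and B as drug names:
assert classify_change({"A": 10}, {"A": 20}) == ["dose change"]
assert classify_change({"A": 10}, {"A": 10, "B": 5}) == ["drug add"]
assert classify_change({"A": 10, "B": 5}, {"A": 10}) == ["drug drop"]
assert classify_change({"A": 10}, {"B": 5}) == ["drug switch"]
```

As the text notes, the basic types combine: {A, B} => {A, C, D} yields a switch plus an add under this simple scheme, and a clinically meaningful switch may need more context than set comparison alone provides.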
Some diseases, such as diabetes, typically are considered chronic or incurable, and a patient can only have one “episode” of diabetes in a lifetime. Specifically, the episode of disease starts with the first diabetes diagnosis and can only end when the patient dies. Over the course of her/his disease, however, the patient may have clinically distinct episodes when the patient was actually receiving medication. In the case of a chronic disease in which treatment was continuously prescribed, episodes of treatment may be punctuated with periods of non-compliance.

Outcomes of Interest

One of the most frequently used outcomes in a pharmacy claims database is the cost associated with each prescription. An example of a clinically relevant outcome of interest is the period of non-compliance that punctuates episodes in the treatment of a chronic disease. Other outcomes of interest include:
• Treatment changes: whether and when the patient experienced dose changes, drug switches, drug adds, or drug drops.
• Length of episode: treatment duration.
• Treatment relapse: once treatment stopped, whether and when a patient resumed treatment.
Many more outcomes may be investigated by incorporating cost information or other medically related information such as diagnosis, procedure, lab, or patient satisfaction data.

2.7 Three Methods for Transforming Pharmacy Claims

Table 2b presents a representative summary of recently published research in the area of pharmacy and claims based analysis. Much of the work has focused on compliance issues and very little has focused on therapy changes. This is surprising for two reasons. First, RCT protocols typically prohibit therapy changes; if therapy changes are required, they are dealt with either by excluding the patients who changed or by including them in the analysis as intent-to-treat (ITT).
Table 2b: Summary of Methods Presented in Relevant Literature

  14 | 2000 | Am. J. Health-Syst. Pharm. | Compliance | ACE Inhibitor (Drug)
     Define compliance, dose calculations, and multi-drug therapy. Multi-therapy defined as the presence of a single other prescription with no discussion of therapy changes.
  15 | 1999 | J. Managed Care Pharmacy | Compliance | Antihypertensive (Class)
     Mention compliance calculation. No mention of dosing, multi-therapy, or treatment changes.
  16 | 1999 | Value in Health | Compliance | Antidepressant (Drug)
     Define compliance, dose calculations, multi-drug therapy, and therapy changes.
  17 | 2000 | ISPOR | Compliance | Antipsychotic (Class)
     Define compliance and multi-drug therapy. Analysis intent-to-treat with therapy changes as outcomes.
  18 | 1998 | Arch. of General Psychiatry | Compliance | SSRI (Drug)
     Define compliance. Contains switch/augment cohort but no methodological definition given.
  19 | 2000 | Value in Health | Compliance | Statin (Drug)
     Extensive discussion of compliance methods and mention of switching therapy. No methodological discussion of dosing, therapy changes, or multi-therapy.
  20 | 2000 | Value in Health | Cost | —
     No discussion of multi-therapy. Therapy changes based on presence of a drug not in index treatment. Therapy change used as predictor variable.
  21 | 2000 | Value in Health | Cost/Outcome | SSRI (Drug)
     Therapy changes (dose and drug switching) considered as outcomes, but not operationally defined. No discussion of multi-therapy.
  22 | 2000 | J. of General Internal Med. | Rx Practice | SSRI (Drug)
     Mono-therapy starts only, though no methodological definition given. No discussion of therapy changes.
  23 | 1999 | Value in Health | Rx Practice | Antipsychotic (Drug)
     Treatment change defined as any presence of another drug within the first treatment year. No discussion of multi-therapy or dose changes.
  24 | 1994 | Clinical Therapeutics | Rx Practice | Antihypertensive (Drug)
     Study of dose titration with no discussion of dose calculations, drug changes, or multi-therapy.
  25 | 1998 | Pharmacotherapy | Rx Practice | SSRI (Drug)
     Concomitant therapy defined as receiving a single prescription for another medication between the second and sixtieth day of treatment.
  26 | 2000 | Value in Health | Design of economic & effectiveness studies
     Suggests excluding multi-therapy cases and other measures to increase internal validity more relevant to prospective designs.
  27 | 1997 | Clinical Therapeutics | Use of claims databases
     Discusses importance of selecting and disclosing methods but provides no specific methodological suggestions.
  28 | 1997 | J. of Social & Admin. Pharm. | Validation of pharmacy records for compliance
     Definition of compliance and comparison methods. Indicates dosing may be variable. No mention of treatment changes or multi-therapy.
  29 | 1999 | Medical Care | Validation of pharmacy records for compliance
     Definition of compliance and comparison methods. Indicates dosing may be variable. No mention of treatment changes or multi-therapy.
  30 | 1997 | Clinical Epidemiology | Compliance literature review with claims data
     Definition of various compliance methods. No mention of dosing, treatment changes, or multi-therapy.
  31 | 1999 | Medical Care | Use of claims databases
     Indicates treatments may change but makes no mention of methods to deal with this.
  32 | 1995 | Inquiry | Construction of episodes of care from claims data
     Episodes of care defined by procedures and diagnoses and not prescriptions.

Adverse reactions are often studied, but the ability to study therapeutic changes in actual practice presents a unique opportunity in claims based analyses.
Second, conducting ITT analyses in protocol-free settings without accounting for therapy changes may result in misleading conclusions. For instance, if a comparison of compliance between patients on drug A and patients on drug B were conducted as ITT, the results would be misleading if all drug A patients switched to drug B two weeks into a one-year course of therapy. This is an extreme case, but the possibility of therapeutic changes outside RCTs is real and should be considered in conducting analyses. Before complex or changing therapies can be studied, operational definitions must be provided for converting the pharmacy claims into prescription treatments that represent the regimens and show change over time.

Table 2b also shows that the details of the methods for transforming pharmacy claims into clinical constructs for pharmacoeconomic and outcomes research are frequently omitted from publications. Three general approaches exist, however.

Complexity Avoidance

Even in diseases where the cost of treatment failure is extreme and treatment standards exist (e.g., various cancers), patient treatment choices often seem unpredictable and difficult to describe on a micro level. Researchers often use an intent-to-treat approach as if the data came from a randomized controlled clinical trial. Specifically, they look at the first treatment the patient received and conduct the analysis under the assumptions that (a) the patient remained on the treatment for the entire episode or (b) changes to treatment occurred at random without impacting the results.

The primary advantage of the complexity avoidance method is that it is the easiest, fastest, and least expensive method to implement. Another advantage is that
it facilitates the creation of indices across disease states because there are no disease-specific methodological differences to reconcile.

The problem with the method is that patients often do change treatments within episodes, and these changes are usually not random. Changes are frequently made to treatments due to effectiveness or side-effect problems. For example, in some diseases, optimal treatment requires planned changes to treatment known as “titration,” in which, for instance, the drug dose is modulated over time. Not accounting for the details and changes in treatment may lead to an incomplete representation of treatments and treatment events in subsequent analyses.

Expert Review

Pharmacy claims data do contain enough information to reconstruct the details of treatments and episodes. The reason the complexity avoidance method is utilized is that the unregimented prescribing patterns encountered in practice often require a human expert to interpret. For instance, patients often do not fill their prescriptions exactly on the day that their last prescription ran out. This can lead to questions about whether there are significant gaps in the episode or whether the prescribed daily dosage has been increased (i.e., the patient should be taking multiple pills per day). Patients often change treatments before an existing treatment is complete. Just looking at the pharmacy claims, it is not always clear whether the intent was for the patient to switch between drugs or add the second drug to the first.a

The expert review method uses a human expert such as a pharmacist or physician to manually review the pharmacy claims and reconstruct the clinical constructs from those claims. The primary advantage of this method is that it is the most accurate and detailed method available.
The human expert can combine the pharmacy claim detail with her/his inherent pattern recognition abilities and trained domain-specific knowledge (i.e., medical treatment) to transform the data.

The problem with expert review is that it is very slow and expensive. Roughly 2.8 billion pharmacy transactions took place in 1998 alone.11 Trying to utilize a substantial quantity of the data is prohibitively time-consuming and expensive. Using a smaller portion of the claims detracts from a primary reason for using claims data in the first place: the ability to consider large patient groups.

Explicit Rule Based

Because the medical claims are stored electronically in databases, it is possible to have a computer access them. Furthermore, a clinical expert could extract information from the claims data and then explain the rules (s)he used to do so. For instance, the expert could list the medications that could be used for a particular disease. A programmer could then write code to have the computer implement the expert's rules. To the extent that the programmer can artificially implement the human expert's thought processes, the richness of the human expert's judgment can be replicated.

a Appendix 1 shows sample pharmacy claims and discusses the pattern-recognition-driven process of transforming them into clinical constructs.

The primary benefit of having a computer apply an expert's rules is the speed with which large quantities of data can be accessed. Armed with the aforementioned list of possible treatments, a programmer could have a computer search through billions of records for all patients who received at least one of those medications much faster than a human could. The problem that plagues a rule-based methodology is that all the rules must be known and explained to the system.
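In its simplest form, an explicit rule of the kind just described — the expert lists the medications that could be used for a particular disease — reduces to a coded lookup. The sketch below is purely illustrative; the NDC values are placeholders, not a vetted drug list:

```python
# Hypothetical explicit rule supplied by a clinical expert: the set of
# drug identifiers (NDC codes) whose presence marks a claim as diabetes
# therapy. These codes are invented placeholders for illustration.
DIABETES_NDCS = {"00003011101", "55555000001"}

def members_matching_rule(claims, rule_ndcs):
    """Apply the rule to a list of claim records: return every member
    with at least one claim for a drug in the rule set."""
    return {claim["member"] for claim in claims if claim["ndc"] in rule_ndcs}

claims = [
    {"member": "XXX", "ndc": "00003011101"},   # matches the rule
    {"member": "ZZZ", "ndc": "99999999999"},   # does not match
]
matched = members_matching_rule(claims, DIABETES_NDCS)
```

A computer can apply such a lookup to billions of rows quickly, but the lookup is only as complete and as current as the expert's enumerated list — which is exactly the limitation discussed next.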
Complete rule-based methods are often difficult to assemble and are often limited to particular diseases or types of diseases. Finally, because explicit rule-based systems are typically used in place of expert review, the accuracy of the systems is difficult to evaluate thoroughly. This difficulty increases as the complexity of the rules increases.

Summary

Table 2c summarizes the advantages and disadvantages associated with each of the three approaches across three important measures: speed, clinical richness, and generalizability. Each approach excels in at least one but not all areas.

Table 2c: Approach Summary

  Approach              Speed  Clinical Richness                Generalizability
  Complexity Avoidance  Fast   Low                              Excellent
  Expert Review         Slow   High                             Poor
  Explicit Rule Based   Fast   Decreases with generalizability  Decreases with richness

2.8 Knowledge Discovery in Databases

Figure 2c: Knowledge Discovery Process Overview

Process Overview

The process of extracting useful information from large databases is often called Knowledge Discovery in Databases (KDD). KDD is not a single method but a process involving a series of steps. These steps can be conceptualized in different ways, but they always involve selecting, transforming, analyzing, and interpreting data.33 Figure 2c depicts a typical conceptualization of the KDD process.33

The KDD process always starts with an existing database. Specific data from the database are then selected based on some criteria (e.g., women at least 45 years of age). The subset of data is then processed and transformed so that it is clean and in the format necessary to present to an analytic tool such as a statistical analysis program.
The data are then analyzed, often in a semi-automated or computer-assisted manner due to the size of the database. The results of the analysis are then interpreted and presented. The entire process is iterative, and parts of the process or the entire process may be repeated based on factors such as the availability of new data or information or insight gained in the process itself.

“KDD” and “data mining” are often considered synonymous and sometimes confused with data dredging.33 Data mining is often part of the KDD process and is generally included in the analyzing step of the KDD process. However, the KDD literature explicitly discourages blind data dredging. The active incorporation of domain expertise is a required and critical component of the KDD process.34 Results cannot be useful if they are incorrect or irrelevant.

Relation to Statistics

The analysis step in the KDD process is typically dependent on statistical analysis. Therefore, “it would be unfortunate if the KDD community dismissed statistical methods”13 and the converse is also true. The KDD process cannot succeed without being able to assign levels of certainty to the results; however, there are differences between working with a few hundred well-understood observations and working with the sorts of data typically encountered in a KDD venture:33
• Massive data present problems for traditional statistical software.
• Large numbers of potential predictor and outcome variables make the search for potential patterns more difficult and spurious pattern finding more likely.
• Data that have both many observations and many variables can cause problems for traditional statistical significance testing.
• There is often no fixed analysis point. The data and the rules that generate the data are often not completely understood and fluctuating.
• The data are often messy and incomplete.
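The second point — that many candidate variables make spurious pattern finding more likely — is easy to demonstrate with a toy simulation: among enough purely random predictors, some will appear to classify a random outcome well by chance alone. The sample sizes below are arbitrary choices for illustration:

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

n_obs, n_vars = 50, 500
# A purely random binary outcome for 50 observations.
outcome = [random.randint(0, 1) for _ in range(n_obs)]

best_agreement = 0.0
for _ in range(n_vars):
    # Each candidate predictor is also pure noise.
    predictor = [random.randint(0, 1) for _ in range(n_obs)]
    agree = sum(p == y for p, y in zip(predictor, outcome)) / n_obs
    # Count agreement in either direction (a predictor that is always
    # "wrong" would classify perfectly once inverted).
    best_agreement = max(best_agreement, agree, 1 - agree)

# Every predictor is noise, yet the best one "classifies" the outcome
# well above the 50% expected of any single noise variable.
print(best_agreement)
```

This is exactly why blind data dredging is discouraged: with hundreds of variables, impressive-looking patterns arise by chance, and domain expertise plus honest significance adjustment are needed to filter them.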
These problems are not entirely new to statisticians. What is different in KDD is the scale of the problems.

Statisticians and KDD researchers have often underappreciated the relationship between statistics and KDD. Part of the reason for this may be the strong link between statistics and mathematics. This, coupled with the fact that the widespread possibility of working with large datasets is a recent development, has led to the tendency for statisticians to spend much of their time finding more efficient methods to handle small datasets and attempting to model relationships between a well-understood and pre-defined set of predictor and outcome variables.13

Parallel efforts, ranging from the creation of faster, cheaper computers to the implementation of improved database management systems, have been ongoing in the fields of engineering and computer science to facilitate the ability to work efficiently with massive databases. The investigators conducting this research have as little exposure to and training in statistics as most statisticians have to engineering and computer science. KDD is dependent on both statistics and computer science.13 Advances in both areas have made it possible for the two to come together in novel and productive ways to solve problems that would have been very difficult if the areas were isolated.

Application to Current Problem

The KDD framework as described by Fayyad33 and others is a useful one for the current problem because it provides a structure on which to build a methodology for converting large quantities of data into useful information.
The problem addressed in this dissertation represents a novel application of KDD in which the domain expertise of a clinical expert will be modeled and then utilized in a semi-automated fashion to transform large quantities of pharmacy claims data into clinical constructs. These constructs, which will also comprise a large database, will then be used by clinical experts and analysts to find useful patterns in ways that were previously inaccessible.

2.9 Computing Technology

Database Management Systems (DBMS) and Standards

A database is a collection of related data. A database management system (DBMS) is a computer application that allows users to create, maintain, and use databases.35 In practice, a database usually is conceptualized as a set of tables, which are made up of rows and columns. These tables are analogous to the datasets statisticians may be more familiar with, which are made up of observations and variables. Features that distinguish commercially available DBMSs from the applications statisticians may be more familiar with (e.g., BMDP™, SAS™, or SPSS™) include:
• The quantity of data DBMSs can handle and the efficiency with which these systems handle the data.
• The number of users who can simultaneously access the data via DBMSs.
• The flexibility with which complex relationships between data elements can be modeled and enforced within the DBMS.
• The fact that most commercial DBMSs adhere to standards, which often allows the data to be accessed across different hardware, operating system, and DBMS platforms without having to change computer code.
For these reasons, MCO data repositories are maintained using DBMSs.
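A minimal sketch of the table-and-query model these systems provide, using the SQLite engine bundled with Python's standard library; the table and column names are invented for illustration, not an actual MCO schema:

```python
import sqlite3

# An in-memory database standing in for an MCO data repository.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pharmacy_claims (
        member        TEXT,
        fill_date     TEXT,
        ndc           TEXT,
        quantity      INTEGER,
        days_supplied INTEGER,
        cost          REAL
    )""")
conn.executemany(
    "INSERT INTO pharmacy_claims VALUES (?, ?, ?, ?, ?, ?)",
    [("XXX", "1997-06-21", "00003011101", 200, 30, 50.00),
     ("XXX", "1997-07-23", "00003011101", 200, 30, 50.00)])

# Plain SQL of the kind a SQL-92-compliant system accepts:
rows = conn.execute("""
    SELECT member, COUNT(*) AS fills, SUM(cost) AS total_cost
    FROM pharmacy_claims
    GROUP BY member""").fetchall()
print(rows)  # one summary row per member
```

The same query text would run largely unchanged against a commercial RDBMS, which is the portability benefit of the standards discussed below; what differs across vendors is scale, concurrency, and the edges of standard compliance.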
This dissertation focuses on a particular type of DBMS known as Relational DBMS (RDBMS) technology because it is by far the most common implementation of DBMS technology for MCO data repositories. In many applications, three pieces of software are required to work with RDBMS data:
1) Client software to process and manipulate the data.
2) Driver software to connect the database server and the client (available from the DBMS vendor or third parties).
3) A language for inserting, extracting, and manipulating data (e.g., Structured Query Language).

Structured Query Language (SQL) is the result of an ongoing effort by ANSI, the International Standards Organization (ISO), and a variety of DBMS vendors and related industry members to create a standard language for creating and manipulating databases. There are several different versions of SQL, and different DBMS vendors have adhered to the standard to differing degrees.35,36 For instance, SAS™ implements a version that does not include all the variable types included in the standard but does include certain additional functions not included in the standard.37 The methods used in this dissertation assume a minimal level of compliance with the ANSI/ISO SQL-92 specification.35,36

2.10 Statistical Methods for Learning from Experts

There are many different methods available for building statistical models of events or decisions. For this dissertation, it is important to be able to model the binary (yes/no) decisions of a human expert based on a number of highly related and sometimes incomplete predictor variables. This section describes the traditional and more recently developed statistical techniques typically chosen for this task.

Logistic Regression

Logistic regression is a traditional linear modeling technique. In many respects, logistic regression is very similar to standard linear regression.
The difference is that, rather than trying to predict some continuous outcome (e.g., a person's height based on that person's weight), logistic regression is used to predict the probability of an event. To ensure that the model always produces a value between 0 and 1 (having the model produce a probability of -0.3 or 5.7 would be unacceptable), the logit transform is applied:38

  logit(p) = ln(p / (1 - p))

where p is the probability of the event occurring. The logit(p) is then fitted using some combination of predictor variables so that the result is a model of the form:

  logit(p) = α + β′x

where α is the y-intercept parameter, β is the vector of slope parameters, and x is the vector of predictor variables.

The advantages of logistic regression include:
1) Modeling performance is often as good or better than alternate techniques.
2) The process often results in a model that is relatively easy to interpret.
3) Individual parameter estimates can be modified by the user.
4) Many computer statistical packages are available for performing logistic regression, and these packages often also contain methods that facilitate model construction and evaluation.a For instance, stepwise methods can be employed to select variables that should be included in the model. Some packages produce plots that indicate whether the data are adhering to the modeling assumptions and even how well the model is predicting the outcome.

However, logistic regression may not always be optimal because:
1) The modeling assumptions may not be met. For instance, the relationship between the outcome and predictor variables may not be linear. While transformations can be applied to improve the model fit, this is a manual process and may detract from interpretability.
2) Incorporating interactions amongst predictor variables can be a tedious art.
a Various stepwise methods facilitate the model construction process, and diagnostic tools such as residual plots facilitate model evaluation.

3) Any observation containing a null value for a single predictor variable must either have that value imputed or the observation will be excluded from the analysis and also cannot have its outcome predicted.

Neural Networks

Neural networks (NNs) are sophisticated computer programs that model the capabilities of the human brain by mimicking the structure and function of neurons in the brain.39 Utilizing principles of artificial intelligence, NNs permit the modeling of complex functional relationships.40 A NN consists of layers of nodes (analogous to neurons) linked by inter-connections (analogous to axons/dendrites), together with rules that specify how the output of each node is determined by input values from all nodes at the level below. A layered architecture of neurons in the brain can be used to provide a progressively more abstract representation of input stimuli as the information is filtered through successive layers. Neural networks attempt to reproduce this effect, although most networks are limited in practice to three or four layers in total. Theoretical work suggests that NNs can consistently match or exceed the performance of traditional statistical methods.39

The primary advantages of NNs include:42
1) Variable selection and weighting is always automatic.
2) NNs have the implicit ability to handle complex relationships among predictor variables and between the predictor and outcome variables without requiring user input. For instance, interactions between variables would have to be
explicitly requested, and sometimes explicitly built, in logistic regression packages, whereas these relationships are automatically discovered and used by NN applications.
3) NNs are not linear models and therefore do not make as many assumptions about the data.

Problems with NNs include:42
1) In many cases they do not outperform, or even match, the performance of logistic regression models.
2) NNs are black boxes insofar as the user puts values in and gets values out but is given no insight into how the results were generated (in contrast to the parameter estimates from logistic regression, for example) and no ability to modify the parameters.
3) Like logistic regression, most NNs cannot handle missing data.

Classification & Regression Trees

Classification and Regression Trees (C&RT)41 are a collection of parent and child nodes built using recursive binary partitioning. An outcome and a set of predictor variables are specified. Initially, the probability of the outcome variable is simply the proportion of observations having the event (e.g., the clinical expert responds "yes" to a question). In this initial state there is only one node, the root, and no predictors have been used. The next step is to find the single best predictor of the outcome variable and force a binary split of the parent node into two child nodes. In the case of a non-dichotomous predictor variable, this means the optimal splitting point must first be determined for that variable. Each child node is then split independently, and the process continues until some stopping rule is met (e.g., the node is homogeneous or further splitting does not improve outcome prediction sufficiently). The primary advantages of C&RT include:42
1) The method makes no distributional assumptions about the data.
2) Variable interactions are handled automatically.
3) The output is often easy for non-statisticians to interpret because it is a decision tree. As long as the user can understand a basic flow chart, the C&RT output can probably be understood.
4) The tree (model) can be modified easily by a user with no statistical training.
5) Null values are handled implicitly and naturally. Because the method considers every predictor variable at each step, if the best overall predictor is null for a particular observation, the method simply selects the next best predictor variable (called a surrogate splitter).

The primary problems with C&RT include:
1) If the assumptions for logistic regression are met, C&RT will often be less efficient.
2) Strong predictors that are linearly related to the outcome can result in very complex trees that could be represented much more simply with logistic regression models.

Comparing Methods

Each of the above methods has the ability to "learn" from the human expert and induce human decision-making. Many studies have compared the different methods. Each method makes certain assumptions about the underlying data, and while it may be argued that one method is generally more powerful than another in theory, actual applications of these methods to real and simulated data indicate that there is no a priori winner.42

2.11 Transforming Pharmacy Claims into Clinical Constructs

Pharmacy claims data provide a reliable record of patient treatment in a managed care healthcare delivery system. The fields of pharmacoeconomics and outcomes research have developed to utilize information gained from these claims to further research and direct policy in healthcare. However, there is a disconnect between the pharmacy claims data as they exist in databases and the way they must exist to be exploited for clinical research purposes.
Existing methods to handle this disconnect have weaknesses that substantially limit the utility of the data (Section 2.7). This dissertation develops a set of statistical and computational methods within a Knowledge Discovery in Databases (KDD) framework for optimally utilizing the domain-specific knowledge of a clinical expert to transform a standard pharmacy claims database into a useful research resource.

3. Methods for Transforming Data

Figure 3a presents a basic overview of the components and process used in this dissertation to convert pharmacy claims into clinical constructs by statistically deriving rules from a sample of expert human decisions. Sections 3.1 to 3.11 describe the components and methods in detail, and each section number ties to the corresponding yellow numbered section of Figure 3a. Sections 4.3 and 5.3 describe the particular methods for the methodological and applied sections of this dissertation.

Figure 3a: Basic Process Diagram (components include Raw Claims, Data Extraction, Patient List, Sampling for Expert Opinion, Expert Processing, Modeling, Model Assessment, Expert Refinement, Variable Modification, Code Generation, Set of Rules for Processing, Application Processing, Classified Claims, and Clinical Constructs; diagram not reproduced)

The transformation process is closely tied to the KDD framework (Section 2.8). Process 2 is applied to the original data (component 1), resulting in a select database (component 3). Component 3 also consists of transformed data resulting from transformation methods. Further selection is applied to sample for expert opinion in process 4.
Clinical domain expertise is applied in process 5 and again, coupled with pattern interpretation, in process 7, after data mining is conducted in process 6. The modeling process is more formally evaluated in process 9 by comparing the transformation rules created in process 8 to the induced clinical domain expertise. Clinical domain expertise is utilized again in process 10 to modify the transformed data (component 3). Finally, the knowledge gained for transforming the data is applied in process 11 to the component 3 data in order to generate component 12, a new database for clinical knowledge discovery via pharmacoeconomic and outcomes research.

In this dissertation, the methods were applied to a particular problem in transforming pharmacy claims: the case in which the start and end dates of two prescription claims overlap. This overlap raises the question: should the two claims be combined into a single treatment, or should they remain separate? While the dissertation methods were designed to induce clinical expert decisions for these claim pairs, the general approach was intended to be useful for other types of decisions.

3.1 Pharmacy Claims Database

The starting point was a PBM database containing a standard, minimal set of fields that specify or allow for the identification of the following information:
1) Unique patient identifier
2) Date the prescription was filled
3) Name of the drug received
4) Average daily dose of the drug received
5) Days supplied (e.g., a 30- or 60-day prescription)

For this project (as in general practice), items 3 and 4 were not stored directly but could be readily derived from the database. Each row in the pharmacy claims table contained both NDC and GPI codes, which can be linked to another table to determine the drug name and strength (e.g., 10 mg or 20 mg).
There was also a field in the pharmacy claims table that specified the quantity of drug received (e.g., the number of pills, patches, or mL). The following formula was then used to obtain the average daily dose:

Average Daily Dose = (Strength * Quantity) / Days Supplied

3.2 Data Extraction

The pharmacy claims used for this project were retrieved from a Pharmacy Benefit Management (PBM) organization. Table 3a shows sample code that would typically be sufficient to extract all pharmacy claims for any patient who had ever received a prescription for an antidepressant medication.

Table 3a: Sample Extract Code

SELECT Patient_Id, Date_Filled, GPI, Days_Supplied
FROM Rx_Claims
WHERE Patient_Id IN
  (SELECT DISTINCT Patient_Id
   FROM Rx_Claims
   WHERE LEFT(GPI,2)='58');

The sample code might have to be altered slightly depending on the exact structure of the pharmacy claims database, but it is valid in any SQL-92 compliant database management system (DBMS). In this particular case, the large size of the database and poor database optimization resulted in a long query execution time. This, coupled with planned and unplanned server interruptions and an overtaxed computer network, meant that the above query could never be completed. To work around these practical issues, a more robust, multi-threaded extraction tool was written to extract the data.a The software sent many simultaneous but atomic requests for data on small sets (packets) of patients. It tracked the requests and responses so that it could re-request any lost packets and maintain the integrity of the resulting dataset. Utilizing this method also allows for simpler filtering of patients against an external, select list of patients.b

a The extraction tool was written in Java 1.3 and run under Microsoft Windows 98 and NT.
Multi-threaded applications are often able to make better use of computer resources by partitioning a large task into multiple components and processing them in parallel. The technique can result in significant performance gains, especially for applications requiring substantial data transfer.

b The ability to filter patients outside the main PBM database is not critical to the success of this project, but it would be in practice. Here it would be common to select patients based on particular interventions, health or business profiles, or randomly to facilitate exploratory studies.

3.3 Analytic Database Implementation

Due to performance limitations of the PBM data repository, data were extracted as described above and stored separately. The analytic database needed to be more efficiently accessed by both the human clinical experts (Section 3.5) and the analytic algorithms (Section 3.6). The analytic database is also where the rules were applied to generate the "Clinical Constructs" (Section 3.12) database. In practice, the analytic database was implemented on two separate machines, as illustrated in Figure 3b. To facilitate expert clinical processing, a web-based application was created (Section 3.5). The DBMS chosen to drive the web application was MySQL 2.3 (Component (3.2) in Figure 3b). MySQL did not support key components of the SQL standard (i.e., sub-selects and multi-table updates) that were necessary to create the analytic variables used in the modeling process.

Figure 3b: Storage Implementation (Analytic Database spanning local data and web data; diagram not reproduced)

Figure 3c: Basic Analytic Database Structure (tables with fields such as Member_Id, RxIni, Drug, Strength, Quantity, Days, C1Drug, and C2Drug; diagram not reproduced)
A synchronized copy of the database was maintained on a Microsoft SQL Server 7.0 DBMS on a local machine to generate the analytic variables (Component (3.1) in Figure 3b). Because both databases supported Open Database Connectivity (ODBC), modeling could be conducted using either database.

Structure and Content

A graphical summary of the analytic database structure is provided in Figure 3c. The original prescription claim data were stored in table tblRxClaim. Expert and administrative user information, including login information, was stored in table lkpUser. The table tblPair contained one row for every possible claim pairing in which both claims involved the same patient and there was at least one day of overlap between the claims (referred to as a claim pair). tblPair also contained all the predictor variables, collectively referred to as the predictor variable set (PVS). The expert decisions were stored in table tblCombine. For modeling purposes, a viewa was constructed to link tblPair (where the claim pairs and PVS were stored) to tblCombine (the decisions). Separating the raw data, claim pairs and PVS, and expert decisions was done primarily to isolate the claim pairs and PVS. The isolation ensured that tblPair could be altered or completely regenerated:
• New claim pairs could be added as they became available

a A view is a virtual table created as a stored SQL statement. Views are helpful for allowing quick access to data in a particular format that would otherwise require repetitive transformations.

Table 3b: Basic Predictor Variable Set (PVS) Structure
# | Descriptor | Example
1 | Pair Id Information | Claim Id 1
2 | Basic Rx Information | Drug Name
3 | Overlap | IniDiff = (Claim 2 Start) - (Claim 1 Start)
4 | History* | Pri1Cnt = # Prior Drug 1 Prescriptions
5 | Future* | Pst1Cnt = # Post Drug 1 Prescriptions
6 | Miscellaneous | DrugSame = Drug 1 same as Drug 2
* Some variables could be null.

• Predictor variables could be modified, generated, or removed
• Case sampling could be altered dynamically and without impacting the expert decisions

Predictor Variable Set

In making decisions about whether or not to combine the members of a claim pair, human experts would take into account not just the two claims themselves but the context in which the claims fell. In this case, the context was often a complex pattern of prescribing that could span years. The predictor variable set (PVS) was an attempt to represent to a computer the constructs that the human expert was creating and using to make her/his decisions. A data dictionary for the analytic database is contained in Appendix 2.

The table tblPair, which contained the PVS, can be broken into six basic categories of variables (Table 3b). (1) Identification information was used to uniquely identify claim pairs based on claim Ids and to determine which claim pairs belonged to which patients. (2) The basic prescription information was included in the modeling process, but was largely used to construct the other predictor variables. (3) Several variables were constructed to indicate the completeness of the overlap for the two claims involved. For example, the number of days between the start of the first and second prescriptions, as well as the number of days of overlap, were considered both as absolute values and as percentages to account for varying-length prescriptions.
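As a minimal sketch of how overlap descriptors of this kind might be computed, consider the following (integer day offsets stand in for dates; the names IniDiff and OlpDays echo the PVS naming in Table 3b, but the exact definitions here are assumptions, not the dissertation's code):

```python
def overlap_descriptors(claim1, claim2):
    """claim1/claim2: (start_day, days_supplied) for the two claims.

    Returns the gap between starts (IniDiff-like), the shared supply
    days (OlpDays-like), and the shared days as a fraction of each
    claim's own supply (percentage forms)."""
    s1, d1 = claim1
    s2, d2 = claim2
    e1, e2 = s1 + d1 - 1, s2 + d2 - 1            # last covered day of each claim
    ini_diff = s2 - s1                           # start-to-start gap in days
    olp_days = max(min(e1, e2) - max(s1, s2) + 1, 0)   # absolute overlap
    return ini_diff, olp_days, olp_days / d1, olp_days / d2
```

For example, a 30-day fill starting on day 0 and a second 30-day fill starting on day 25 share 5 days of supply, so both percentage forms are 5/30; expressing the overlap both ways is what lets varying-length prescriptions be compared on equal footing.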
(4-5) To determine the context in which the overlap occurred, historical and future prescribing patterns were captured for each drug individually and for both drugs together. Context information was described in terms of the number of prescriptions, days supplied, and the total span during which each drug and combination was available. If the claim pair represented the first or last prescription available for the patient, the past or future indicator variables, respectively, were left null. If the drugs in the claim pair were the same (mono-drug as opposed to multi-drug), past and future indicator variables were calculated only for both drugs together. (6) Whether the two prescriptions represented a known combination therapya and whether a better match for the two drugs involved could be found elsewhere in the database was also represented in the database. The "better match" variable was used to describe the common occurrence of combination treatments being filled early. In these cases, the better match variable could describe a situation where Drug A and Drug B should be paired, but the pairing should be {A1,B1} and {A2,B2} and not {A1,B2} or {A2,B1}. The development of this variable is further described in Section 4.

a The known combination therapy variable was not used in this project due to the nature of the medications being prescribed. However, it was thought the field would be valuable in particular diseases that have short and/or one-time combination treatments that are otherwise difficult to categorize.

Null Values

Table 3c shows the percentage of null values in the PVS by claim pair type (multi-drug or mono-drug) and PVS group for the variables that could be null.b The Prior 1, 2, and Both groups were sets of variables that described the prior prescribing patterns of drug 1, drug 2, and drugs 1 and 2 together, respectively.
For instance, Pri1Cnt was the number of times the first drug in the claim pair had been prescribed in the past, Pri2Cnt was the number of times the second drug had been prescribed in the past, and PriBCnt represented the number of times both drugs had been prescribed together in the past. Likewise, the Post 1, 2, and Both sets of variables described prescriptions filled after the claim pair under review. Of the variables that could be null, the Prior 1 group in the multi-drug claim pairs was the least likely to be null (8.69%), and prior information on both drugs for the mono-drug group was the most likely to be null (20.11%).

b Basic prescribing information and overlap information could not be null.

Table 3c: Null Values

Pair Type  | PVS Group  | Missing
Multi-Drug | Prior 1    | 8.69%
Multi-Drug | Prior 2    | 9.09%
Multi-Drug | Prior Both | 18.58%
Multi-Drug | Post 1     | 9.09%
Multi-Drug | Post 2     | 10.35%
Multi-Drug | Post Both  | 17.57%
Mono-Drug  | Prior 1    | N/A
Mono-Drug  | Prior 2    | N/A
Mono-Drug  | Prior Both | 20.11%
Mono-Drug  | Post 1     | N/A
Mono-Drug  | Post 2     | N/A
Mono-Drug  | Post Both  | 19.51%

3.4 Case Sampling for Expert Opinion

Figure 3d illustrates the basic process for extracting claim pairs for an expert decision.a When the expert first logged in to begin classifying claims, or when no more claims were left for a given patient, a new patient was randomly selected through a three-step process. First, a claim pair was randomly selected from the pool of availableb claim pairs. The likelihood that any individual claim pair (and patient) would be selected for review could be modulated using the "Interesting" variable in the tblPair table. This allowed for biased sampling in favor of particular types of cases. Once the claim pair was selected, the second step was to determine the patient associated with the claim pair. Finally, the first claim pair (based on prescription date) for that patient was determined and presented to the expert.
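The three-step selection just described, with the Interesting-weighted bias, can be sketched as follows (the dict-based rows and field names are illustrative stand-ins for tblPair's columns, not the actual schema):

```python
import random

def select_patient(pairs, rng=random):
    """pairs: list of dicts with pair_id, patient_id, rx_date, an
    'available' flag, and an 'interesting' sampling weight.

    Step 1: draw one available claim pair, weighted by 'interesting'.
    Step 2: identify the patient who owns that pair.
    Step 3: return that patient's first claim pair by prescription date,
    which is what gets presented to the expert."""
    pool = [p for p in pairs if p["available"]]
    chosen = rng.choices(pool, weights=[p["interesting"] for p in pool], k=1)[0]
    patient = chosen["patient_id"]
    own = [p for p in pairs if p["patient_id"] == patient]
    return min(own, key=lambda p: p["rx_date"])
```

Raising a pair's weight makes its patient more likely to enter review, which is how sampling could be biased toward particular kinds of cases without touching the expert-decision tables.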
As a decision for each claim pair was submitted, the next claim pair (based on prescription date) was selected and then presented. All claims for a given patient were presented sequentially to the expert before moving on to the next patient in order to maintain a natural flow for review.

a Sections 3.4 and 3.5 were implemented using Perl 5.04 CGI code running on a Linux 6.2 server with a MySQL database backend. The help section of the application, which also explains how the application worked, is contained in Appendix 3.

b Availability was determined by the "Available" flag in the tblPair table.

Figure 3d: Case Selection (flow: if a claim pair is available for the current patient, present the next claim pair; otherwise randomly select a new claim pair and present that patient's first claim pair; diagram not reproduced)

3.5 Clinical Expert Processing

Expert processing consisted of reviewing each possible claim pairing and determining whether the two claims should be combined into a single treatment or should remain separate. A web-based application (described in Appendix 3) was developed to present each claim pair in isolation while providing the context in which the prescriptions were filled. The interface is illustrated in Figure 3e. The upper left section of the screen presented the details for the two claims in question and allowed the expert to indicate whether (s)he wanted to combine, not combine, or was undecided about the claim pair. The right side of the screen showed the entire prescription history for the patient. The two claims in question were highlighted in yellow, and green triangles marked all claims that overlapped with either claim in the claim pair. Because date arithmetic is difficult for people and a large number of claims were involved, a graphical representation of the claims history was presented in the lower left portion of the screen.
The graph showed the dose of each drug available to the patient within 365 days before or after the start of the first claim.a No attempt was made to interpret prescribing intent at this point; the graph was based solely on the data available. The y-axes represented the doses for each drug and were independently scaled based on the observed doses during the two-year period. If the same drug was involved in both claims, only one y-axis was presented. The x-axis represented the number of days before and after the start of the first prescription in the claim pair (day 0). The dose lines for the two drugs were colored red and blue to match the respective y-axes. If the two lines happened to be exactly overlaid, a thicker, green line was plotted instead. The actual period of overlap was highlighted in yellow to differentiate it from other claims. If the overlap was imperfect, shaded red and blue tails extended beyond the yellow highlighted region to show where the claims started and stopped (Figure 4a, image 2).

a The pre and post periods were selected to provide a sufficient window for viewing the prescription claims history while allowing sufficient detail to be observed.

Figure 3e: Sample Decision Interface (screenshot showing the claims under review, Nefazodone 300.00 and Trazodone 100.00 filled 1996-04-04, the combine/do-not-combine controls, the two-year graphical Rx history, and the patient's full claim list; image not reproduced)
3.6 Modeling Expert Opinion

Application

Analysis of the expert-classified data was performed using SPSS AnswerTree™ v2.1. The primary reason for using a C&RT algorithm was the extent to which variables in the PVS were null and the fact that imputation would not be meaningful in this context. C&RT methods were also selected because they are generally easier for clinical experts to understand, make few modeling assumptions, and handle variable interaction and collinearity well. Twenty-two C&RT packages were reviewed (Appendix 5), but AnswerTree™ was chosen because it was the only package that:
1) Implemented surrogate splitting
2) Could be scripted to handle bulk processing
3) Output complete information that could be used to transform subsequent observations

While not critical, AnswerTree™ had the added benefits of:
1) Having a user interface that was easy for a non-technical person to use
2) The ability to connect to ODBC data sources, which largely removed the manual steps involved in extracting/transforming the data for analysis.

Process

Data were brought into AnswerTree™ using the ODBC interface, which could be connected to either the web or local copy of the database.
Because ODBC was used, variable types were read directly from the original database and did not need to be specified again. Separate analyses were conducted for mono-drug claim pairs, where both claims involved the same drug (e.g., Fluoxetine-Fluoxetine), versus multi-drug claim pairs (e.g., Fluoxetine-Trazodone). This was done to take advantage of the simplified rule structure resulting from the fact that mono-drug pairs have a substantial number of null variable values. A less complicated rule structure can significantly improve application processing performance (Section 3.11). Model construction was done in batch using scripts. The parameters common to all runs were:a
• Methodology: either CART or QUEST (see Section 4 for descriptions)
• Impurity measure: Gini for CART (N/A for QUEST)
• Maximum tree depth: 10
• Minimum parent cases: 100
• Minimum child cases: 50
• Number of surrogates to consider: 5
• Subtree selection: based on 1 standard error
• All trees fully grown, then pruned

a These settings were the default settings for AnswerTree's implementation of CART and QUEST. It is possible that modeling performance could be improved by selecting different options.

3.7 Expert Model Refinement (Optional*)

Due to the AnswerTree™ interface, it is relatively easy for a non-technical individual to interact with the model itself and change the way trees are constructed. For instance, Figure 3f shows how a user could quickly prune or extend a branch if (s)he felt the tree was overlearning or oversimplifying, respectively. Figure 3g shows how simple it would be for the user to select a different splitter if (s)he felt a different splitter was a better predictor at that point. These methods could be combined such that the user effectively built the entire tree from scratch. The trees were not manually modified for this project.
However, the decision to split the analyses based on whether or not both claims in the pair contained the same drug was reinforced by looking at the simplification that resulted from forcing "DrugSame" to be the first split. Allowing the expert to easily interact with the model also facilitated the process of modifying and creating predictor variables based on problematic areas such as heterogeneous nodes or decision rules that frequently contradicted expert opinion.

* The KDD process is explicitly iterative based on the incorporation of domain expertise. Expert model refinement was not conducted in this dissertation in order to maintain consistency and comparability across modeling approaches. Domain expertise should be incorporated in practice.

Figure 3f: Pruning or Growing the Tree (AnswerTree™ screenshot; image not reproduced)

Figure 3g: Changing a Splitter (AnswerTree™ screenshot listing alternative splitters, e.g., Pst1WOt, Pri2WOt, DifPct1, OlpPct1, AGroup, C1Drug, and C2Drug, with their improvement values; image not reproduced)
Figure 3h: Conversion of Tree to SQL Code (tree diagram with node counts not reproduced; the generated statement follows)

UPDATE tblOutput SET Prob = 1-0.987
WHERE DrugSame=1
AND (((OlpPct1 Not Null AND OlpPct1<=0.9449)
  OR OlpPct1 Null AND ((DifPct1 Not Null AND DifPct1>0.04675)
  OR DifPct1 Null AND ((IniDiff Not Null AND IniDiff>1.5)
  OR IniDiff Null AND ((DifPct2 Not Null AND DifPct2>0.0139)
  OR DifPct2 Null AND ((OlpPct2 Not Null AND OlpPct2<=0.9616)
  OR OlpPct2 Null AND (OlpDays Null OR OlpDays<=28.5))))))
AND (((OlpPct2 Not Null AND OlpPct2<=0.56115)
  OR OlpPct2 Null AND ((DifPct1 Not Null AND DifPct1>0.4606)
  OR DifPct1 Null AND ((OlpPct1 Not Null AND OlpPct1<=0.5394)
  OR OlpPct1 Null AND ((IniDiff Not Null AND IniDiff>13.5)
  OR IniDiff Null AND ((PstBWOt Not Null AND PstBWOt>8.5)
  OR PstBWOt Null AND (OlpDays Null OR OlpDays<=16.5)))))));

3.8 Code Generation for Claims Processing

Once grown, the rules represented by the tree were converted into standard SQL UPDATE statements. Figure 3h illustrates the conversion of one of the nodes from one of the mono-drug models into SQL code.* Each statement updates the Prob variable to the probability that the decision to combine the claims is "Yes". The WHERE clause specifies which rows should be updated. The first part of the WHERE clause stipulates that only mono-drug claim pairs should be considered. The second part (after the first AND) handles the first split in the tree from the root node: OlpPct1<=0.945. If OlpPct1 is null, the next best splitter is DifPct1, so DifPct1 will be used as the first surrogate.
The second through fifth surrogates are (in order): IniDiff, DifPct2, OlpPct2, and OlpDays. Each surrogate is used only if the prior (better) splitter is null. The third part (after the second AND) handles the second split, at OlpPct2<=0.561, and all of its surrogates. Similar UPDATE statements were generated for every terminal node. The combined code represented a mutually exclusive and exhaustive specification for each row in the output table (assuming the PVS has not changed).

* AnswerTree was the only product that would actually output SQL code directly, but most of the C&RT packages will output some sort of logic describing the tree that could be translated into SQL code. What the other products would not do was output information about the surrogate splitters.

3.9 Model Assessment

After applying the UPDATE statements from Section 3.8, each observation in the expert-classified set was assigned a probability of combination that could be compared to the expert's decision. Beyond the expert's interaction with the tree to determine whether the rules generally make sense, a statistical measure was used to assess model performance. The method is explained more completely in Section 4.3, but is described briefly here as well. The c-statistic measures the predictive accuracy of models on independent data. The expert-classified dataset was randomly partitioned into two sets: a 90% training sample used to build the tree and a 10% sample used as an independent test of the tree's performance. The C&RT methods used in this project use a similar approach in producing the trees in the first place,a but the c-statistic provides a mechanism for comparing different models and even modeling techniques directly.
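As a concrete illustration (not the dissertation's own code), the c-statistic can be computed as a probability of concordance: the chance that a randomly chosen combined pair receives a higher predicted probability than a randomly chosen uncombined pair, with ties counting one half.

```python
def c_statistic(probs, outcomes):
    """Probability that a randomly chosen event case (outcome 1) gets a
    higher predicted probability than a randomly chosen non-event case
    (outcome 0), ties counting 0.5. Equivalent to the ROC area."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    nonevents = [p for p, y in zip(probs, outcomes) if y == 0]
    concordant = sum((e > n) + 0.5 * (e == n)
                     for e in events for n in nonevents)
    return concordant / (len(events) * len(nonevents))
```

A value of 0.5 means the model ranks cases no better than chance, while 1.0 means every expert-combined pair outranks every pair the expert left separate; because the statistic depends only on rankings, it can compare C&RT output against logistic regression or NN predictions directly.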
3.10 Predictor Variable Set Modification

One of the aspects unique to this project was the extent to which the variables in the PVS were manufactured rather than collected. In many biomedical studies, researchers collect data and are generally limited to the variables on which they collect information.b As was indicated in Section 3.3, the PVS could be changed dynamically. Since the PVS modification process was isolated from the raw claims and the classification of claim pairs, a new PVS could be used to re-model the classified claims, and the resulting rules could be used to reprocess existing data. This ability would be important if the model assessment suggested the decision rules were doing a poor job of predicting expert opinion. To assist with the variable modification process, a method of summarizing individual node decisiveness and disagreement with expert classification was developed. This method, this process, and the importance of variable modification are addressed in Section 4.

a Most C&RT packages use "v-fold cross-validation" (a jack-knife technique) for pruning the trees.

b Of course, variables can be transformed and created in other biomedical studies, and the other perspective is that the creation of variables in this project is only necessary because the data are poor to begin with.

3.11 Application of Code to Larger Database

The expert-classified dataset to which the UPDATE statements from Section 3.8 were applied was a subset of the original analytic database. Once created, the same set of UPDATE statements was applied to the larger analytic database. The resulting dataset contained a column indicating the model-generated probability that each claim pair should be combined.

Figure 3i: Basic Clinical Constructs (diagram not reproduced)
Constructs

The basic clinical constructs output by the methods described in this dissertation are illustrated in Figure 3i. Each patient could receive multiple treatments. Each treatment was assigned a patient identifier, start date, and end date, and was tied to one or more claims, prescriptions, and events. These constructs were tied together by the unique combination of patient identifier and treatment number. Treatments were then used to build Treatment Episodes (episodes) and were linked by the patient identifier.

Processing

Figure 3j illustrates the components and methods implemented in the application created to handle the data conversion from the pharmacy claims to the clinical constructs using the model based UPDATE statements.a Each patient was handled atomically, but many patients could be processed simultaneously because the application was multi-threaded.b Atomicity was maintained by keeping track of which patients had already been processed or were being processed by the application.

[Figure 3j: Application Processing. For each selected patient: mark in process; build Tx windows; fill Tx windows; simplify Tx windows; set events; build episodes; output results to the Clinical Constructs database; mark complete. The analytic database supplies the Tx windows and intermediate results.]

Figure 3k (SQL code for constructing the initial treatment windows, where 'X' is the selected patient identifier):

SELECT WinIni, MIN(WinEnd) AS WinEnd
FROM ((SELECT Member_Id, DATEADD(day, 0, RxIni) AS WinIni FROM tblRxClaim)
      UNION
      (SELECT Member_Id, DATEADD(day, Days+1, RxIni) AS WinIni FROM tblRxClaim)) I,
     ((SELECT Member_Id, DATEADD(day, -1, RxIni) AS WinEnd FROM tblRxClaim)
      UNION
      (SELECT Member_Id, DATEADD(day, Days, RxIni) AS WinEnd FROM tblRxClaim)) E
WHERE I.Member_Id = E.Member_Id AND I.Member_Id = 'X' AND WinEnd >= WinIni
GROUP BY WinIni;
When a thread was available for processing a patient, it requested a patient from the list of unprocessed patients and marked the patient as "in process." The first step for each patient was to use the raw prescription claims to build the initial treatment windows. A sample treatment window construction and the standard SQL code are presented in Figure 3k. The initial treatment windows were determined by finding all possible unique start and end claim dates. Figure 3k shows three claims: A, B, and C. The drugs and doses of the claims were irrelevant at this point; only the dates were used. Because of the time frame for each claim and their relationships to each other, 5 treatment windows were developed:

1. The start of claim A to the day before claim B starts.
2. The start of claim B to the end of claim A.
3. The day after the end of claim A to the end of claim B.
4. The day after the end of claim B to the day before the start of claim C.
5. The start of claim C to the end of claim C.

Having generated all possible treatment windows, the raw claims in tblRxClaim were used to fill the treatment windows. If a window was filled by more than one claim (e.g., treatment window 2 in Figure 3k), the claim pair combination probability was used to determine whether both claims should be represented in the window. If the predicted combination probability was >=0.30, the claims were combined; otherwise, the claim with the later end date was kept in the window and the other claim was excluded. Once all claims were placed into treatment windows, a method was used to simplify treatment windows.

a The application was written in Java 1.3 and run on Microsoft Windows 2000.
b The application could be run from several clients simultaneously because patients were handled atomically and all data were stored on the same DBMS server.
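The window-construction step above can be sketched as a short Python function. This is a hypothetical re-implementation of the Figure 3k logic for a single patient (the function and variable names are mine, not the application's): candidate window starts are each claim's start date and the day after each claim's end date; candidate ends are each claim's end date and the day before each claim's start date; each start is paired with the earliest end on or after it.

```python
from datetime import date, timedelta

def build_windows(claims):
    """Build candidate treatment windows from (start, end) claim dates.

    Mirrors the SQL in Figure 3k: cross the candidate start dates with the
    candidate end dates and keep, for each start, the MIN(WinEnd) >= WinIni.
    """
    one = timedelta(days=1)
    starts = sorted({c[0] for c in claims} | {c[1] + one for c in claims})
    ends = sorted({c[1] for c in claims} | {c[0] - one for c in claims})
    windows = []
    for s in starts:
        later = [e for e in ends if e >= s]  # ends on or after this start
        if later:
            windows.append((s, min(later)))
    return windows

# Three claims shaped like A, B, and C in Figure 3k.
claims = [(date(1996, 1, 1), date(1996, 1, 30)),   # A
          (date(1996, 1, 20), date(1996, 2, 18)),  # B
          (date(1996, 3, 1), date(1996, 3, 30))]   # C
```

Applied to these three claims, the function yields the five windows enumerated above; the boundary after the last claim is discarded because no end date follows it.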
Treatment windows with no claims (e.g., treatment window 4 in Figure 3k) were removed. Consecutive treatment windows with no significant gaps (e.g., treatment windows 1, 2, and 3, but not 3 and 5, in Figure 3k) were combined if they contained exactly the same drugs and doses. The simplified treatment windows represent prescription treatments. The following events were tracked for treatments:

• Treatment Start/End: First/last recorded treatment in class
• Treatment Changes: Dose changes, drops, adds, changes (Section 2.6)
• Break: Gap between the end of the last treatment and the beginning of the current treatment

Events were recorded by comparing consecutive treatments (except for the "start" and "end" events). When processing was complete, results were recorded in the Clinical Constructs database (Section 3.12) and the member was marked as "completed" in the patients table.

Most analysis takes place at the treatment episode level. As claims were combined into treatment windows and treatment windows were collapsed into simplified treatments, series of treatments were collapsed into episodes. Whether treatments were combined into a single episode or formed two separate episodes was based on a washout period (gaps between treatments where no medication was received). The appropriate washout period is disease specific. For this dissertation, treatments were arbitrarily split into separate episodes if more than 30 days separated the end of one treatment from the beginning of the next.

By definition, episodes are summaries of a group of treatments.a The washout period, start and end dates, and length were calculated for each episode. If the last treatment in an episode ended within 30 days (the washout period) of the end

a Researchers could choose to construct different summary measures from the prescription treatment foundation depicted in Figure 3l.
of the data available for the patient,a the episode was marked as censored. The first treatment in the episode was also recorded.

Treatment events were summarized for time-to-event analyses. The number of days from each episode start to treatment events (i.e., changes in the treatment drugs, doses, and breaks in therapy) was recorded for each episode. Finally, the number of days to the next treatment episode was calculated.

3.12 Output Clinical Database

To be accessed by most common analytic applications, the clinical constructs must be stored in a relational database. Figure 3l shows how treatment information was stored in a relational format. The structure is very similar to the model presented in Figure 3i. Each row in the cvtTx table was uniquely identified by the combination of patient identifier and treatment number, and this primary key was used to link to each of the three supporting tables. The cvtTx table also contained information about the start and end of each treatment. The cvtTxClm table allowed the prescription treatments to be readily linked back to the raw pharmacy claims they were based on. The cvtTxRx and cvtTxEv tables were used to store detailed prescription treatment and event information, respectively. The structure allowed for multiple drugs and events to be independently associated with each prescription treatment.

a Data availability was limited by two known factors. The original database contained data from 1995-01-01 to 1999-09-30. Secondly, data availability for each patient was limited by her/his eligibility. The database limits and patient eligibility were used to determine data availability for each patient.
[Figure 3l: RDBMS Structure of Treatments. The cvtTx table (primary key: patient identifier and treatment number) links to the three supporting tables cvtTxClm (claim identifiers), cvtTxRx (drug and dose), and cvtTxEv (event modifiers).]

The treatment structure presented in Figure 3l is more consistent with clinical practice and less similar to pharmacy claims transactions. The treatment tables could be queried to produce analyses. Appendix 4 shows a sample query for producing a treatment augmentation report directly from the treatment tables.

The processing application also produced an episode summary table to facilitate episode-based analysis. Each episode was summarized in one row. Columns contained information about each episode, treatment events during the episode, and when the prior and next episodes started. The table cvtTxEp is described fully in Appendix 2. A sample use of this table for analysis is presented in Section 5.
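The episode-building rules of Section 3.11 (a new episode whenever more than 30 days separate consecutive treatments, with censoring when an episode ends within 30 days of the end of the patient's available data) can be sketched as follows. This is an illustrative Python re-implementation, not the dissertation's Java application; the function and field names are assumptions:

```python
from datetime import date, timedelta

WASHOUT = timedelta(days=30)  # disease specific; 30 days in this dissertation

def build_episodes(treatments, data_end):
    """Group (start, end) treatments into episodes.

    Treatments are sorted by start date; a gap of more than WASHOUT between
    the end of one treatment and the start of the next begins a new episode.
    An episode ending within WASHOUT of data_end is flagged as censored.
    """
    episodes = []
    for start, end in sorted(treatments):
        if episodes and start - episodes[-1]["end"] <= WASHOUT:
            episodes[-1]["end"] = max(episodes[-1]["end"], end)
        else:
            episodes.append({"start": start, "end": end})
    for ep in episodes:
        ep["censored"] = (data_end - ep["end"]) <= WASHOUT
        ep["length"] = (ep["end"] - ep["start"]).days + 1
    return episodes
```

For example, two treatments separated by an 11-day gap merge into one episode, a treatment 30+ days later starts a second, and an episode ending 5 days before the end of the data window is marked censored.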
Section 4 deals with two steps that can be taken to improve the performance of the model in inducing human expert in claim pair combination decisions: • Testing the modeling algorithm to ensure the modeling process itself is optimal • Looking for classes of observations that are being poorly predicted and modifying the predictor variable set (PVS) to better represent the claim pairs for the modeling algorithm. The first step can be carried out using different modeling algorithms and testing their relative performance using the c-statistic. The second step can be tested using c- statistics, but also requires a more in depth look at individual classes of observations as well as an understanding of how experts are processing the claims. 65 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Section 4 evaluates two different Classification and Regression Tree (C&RT) methods and four different forms of the PVS. A comparison is also made to an existing method for classifying claim pairs based on a fixed rule. These evaluations demonstrate the methods and importance of selecting algorithms and PVSs in inducing expert human decisions. 4.2 Hypotheses 1) There is a statistically significant difference between the CART and QUEST algorithms for modeling expert opinion on claim pair evaluation. 2) There is a statistically significant difference between the PVSs for modeling expert opinion on claim pair evaluation. The author reviewed 11,654 claim pairs over the course of roughly one month.3 Table 4a shows a breakdown of the 11,654 expert classified claim pairs. Of the 11,654 claim pairs, 6,523 were multi-drug * The application was designed for use by pharmacists or other clinical experts. The author interviewed several pharmacy students and residents and found 6 who were capable and willing to classify claims. Unfortunately, too few claims were classified by the pharmacy students and residents to be used. 
4.3 Methods

4.3.1 Description of the Sample

Table 4a: Claim Pair Classification

Pair Type   Combine  Claim Pairs
Multi-Drug  Yes      4,184 (64.14%)
            No       2,339 (35.86%)
            Total    6,523
Mono-Drug   Yes        450 (8.77%)
            No       4,681 (91.23%)
            Total    5,131
Total       Yes      4,634 (39.76%)
            No       7,020 (60.24%)
            Total    11,654

pairs (where the members of the pair contained different drugs) and 5,131 were mono-drug pairs. Multi-drug pairs were more likely to be classified as combined (64.14% versus 8.77% for mono-drug pairs). The original prescription claims and analytic databases are described in Sections 3.3 and 3.5, respectively. A data dictionary, explaining the elements of the PVS, is presented in the tblPair section of Appendix 2 and described in Section 3.3.

4.3.2 Rule Generation Methods

Two C&RT methods were used for generating the decision trees and resulting rules: CART40 and QUEST.43 Both methods were implemented in SPSS AnswerTree.a Both methods utilize a recursive partitioning algorithm with binary splits and were implemented with a v-fold cross-validation pruning method to prevent overlearning. The primary difference between the two algorithms is in the splitting algorithm at each intermediate node. CART uses a one-step process in which the optimal split point and optimal splitting variable are chosen simultaneously. Loh and Shih contend that this slows down the algorithm and, more significantly, introduces bias in favor of selecting categorical splitting variables with many different categories.43 QUEST uses a two-step splitting method. In the first step, QUEST

a There are two substantial differences between AnswerTree's implementations of CART and QUEST and the original implementations. For CART, AnswerTree includes stopping rules, which Salford Systems' CART™ does not. For QUEST, AnswerTree implemented a surrogate splitting algorithm where the original Loh and Shih package used imputation.
determines the optimal splitting variable by using either an F-test for continuous predictors or a χ²-test for categorical predictors. The variable with the lowest p-value is chosen as the splitter, and then the optimal splitting point is chosen for that variable.

Twenty-five different products were considered for predicting expert decisions, including logistic regression, artificial neural networks, and other recursive partitioning algorithms.a All were rejected because they either did not handle null values or did not produce rules that could be used for mass application processing.

Besides the model based rule sets, an existing method for classifying claim pairs, FixedX, was also used. If the difference between the prescription start dates (equivalent to the PVS variable IniDiff) was less than or equal to X {X: 3, 5, 7}, then the claims were combined; otherwise, the claims were not combined. This basic rule has been used for several years and across many studies.

4.3.3 Predictor Variable Sets

Set Overview

Four PVSs were tested in predicting expert claim pair classification:

1) CARTi/QUESTi
2) CARTp/QUESTp
3) CARTf/QUESTf
4) CARTs

CARTi/QUESTi use the same variable (IniDiff) used in FixedX. However, rather than setting X a priori, IniDiff is used as the only variable in the C&RT model to derive the optimal split points for IniDiff. CARTp/QUESTp use all the variables in tblPair described in Appendix 2 except BMatch. CARTf/QUESTf use the full PVS. CARTs is CARTi plus the BMatch variable. CARTx/QUESTx will be used to refer to the class of CART and QUEST methods as a whole.

a Methods are listed in Appendix 5 and others can be found at http://www.recursive-partitioning.com.
CARTs was built to illustrate the method of finding problematic nodes and building a better model by altering the PVS.

The BMatch Variable

The BMatch variable was created dynamically in the midst of the claim pair review process. It was intended to describe the situation illustrated in Figure 4a, which seemed difficult to describe in a simple, single variable. Though it is clear that the patient is supposed to be taking Amitriptyline (A) and Trazodone (T) together, not all of the possible claim pairs should be combined. All three images in Figure 4a deal with the same 4 claims, marked with green arrows. The following combinations are possible based on date overlap: {A1,T1}, {A1,A2}, {A1,T2}, {T1,A2}, {T1,T2}, {A2,T2}. Only the highlighted pairs are shown in Figure 4a. {A1,T1} and {A2,T2} represent claim pairs that should be combined. {T1,A2} represents a claim pair where the drugs should be combined, but these particular claims should not be combined. There is a better
match for T1 (A1) and for A2 (T2) in both cases. Graphically, perfect overlap is seen in the first and third pictures. The following observations can be made about the second picture:

• The overlap is for a shorter period of time (the vertical yellow band is narrower)
• The overlap is less perfect (there are "tails" extending before and after the overlap period for Trazodone and Amitriptyline, respectively)
• The period of overlap coincides perfectly with abnormally high doses (the small band of green at the top of the plot relative to the long band of green at the bottom of the plot; both doses have been doubled)

There are situations where the first two observations held but it made sense to combine the claims, such as when the fill dates did not line up perfectly (e.g., A filled on the 1st of every month and T filled on the 5th of every month). It is less likely that claims should be combined if all three statements are true.

[Figure 4a: Graphical Representation of BMatch. Three views of the same two-year prescription history for Amitriptyline and Trazodone; each panel highlights one candidate claim pair and its period of overlap.]

Symbol  Drug                Dates
A1      Amitriptyline 75mg  1996-08-02 to 1996-09-01
T1      Trazodone 50mg      1996-08-02 to 1996-09-01
A2      Amitriptyline 75mg  1996-08-27 to 1996-09-26
T2      Trazodone 50mg      1996-08-27 to 1996-09-26
4.3.4 Assessment Methods

C-Statistic

The 6,523 multi-drug and 5,131 mono-drug claim pairs were randomly assigned to 10 analysis groups. Each analysis group was further split in two, with 90% of the claim pairs being used for model building (training set) and 10% reserved for testing (test set). The 10 test sets were orthogonal. This method is sometimes referred to as the m-items-out approach and was helpful for determining c-statistic stability.44

The c-statistic was used to determine the relative performance of the models in predicting expert classification of claim combination. The c-statistic is equivalent to the area under the receiver operating characteristic (ROC) curve.45,46 For this discussion, "event" and "expert decided to combine the claims" are synonymous, as are "non-event" and "expert decided not to combine the claims." The ROC curve is a plot of combinations of sensitivity (the model's ability to correctly predict events) versus specificity (the model's ability to correctly predict non-events).

Each observation in the test dataset was assigned a probability of being classified as having an event by the model. These predicted probability values were then used in sequence as the critical value for event assignment (i.e., event versus non-event). For example, if there were two observations in the test dataset and their respective predicted probabilities of an event were 0.2 and 0.5, there would be 4 points on the ROC curve {0, 0.2, 0.5, 1}. These predicted probabilities are then used as critical points such that all observations with a predicted probability of at least 0, then 0.2, then 0.5, and then 1 would be classified as events and all other observations would be classified as non-events. Since the actual events are known, computing the sensitivity and specificity at each point tests the classification accuracy.
In this example, if the observation with a predicted probability of 0.2 was actually a non-event and the other observation was an event, the sensitivity and specificity would break down in the following way. When the cutoff was 0 or 0.2, the sensitivity would be perfect (all events were classified as events) but the specificity would be bad (all non-events were also classified as events). The exact opposite would be true when the cutoff was 1. At 0.5, both the sensitivity and specificity are perfect. The ROC curve is always fixed at the coordinates (0,0: perfect sensitivity and bad specificity) and (1,1: bad sensitivity and perfect specificity).

A model that performs well will have an ROC curve that rises very quickly (more area under the curve), while a poorly performing model will have an ROC curve that rises slowly (less area under the curve). A curve that rises quickly indicates that the model does a good job of separating events from non-events (high specificity and sensitivity). Therefore, the area under the ROC curve is a useful summary of prognostic accuracy.45

To calculate the c-statistic, the following procedure was carried out. The data were split into two groups based on the expert's classification:

1) Observations classified as having the event (claims should be combined)
2) Observations classified as not having the event (claims should not be combined)

Set membership was independent of the predicted likelihood of having the event. Then the Cartesian product of these sets was produced. For each pairing there were three possible outcomes:

1) Concordant: The actual event had a predicted probability higher than its non-event partner.
2) Discordant: The actual non-event had a predicted probability higher than its event partner.
3) Tie: The pair had equal predicted probabilities.

The c-statistic is then calculated as:45

    c = (nc + 0.5(t - nc - nd)) / t

where c is the area under the ROC curve, nc is the number of concordant pairs, nd is the number of discordant pairs, and t is the total number of pairs.

Decisiveness and Disagreement

The c-statistic was useful for giving an overall view of how well models were working. If the modeling algorithms themselves could not be improved, a more detailed look at individual problem nodes was required to improve the PVS. Problem nodes were defined as nodes that were relatively heterogeneous (indecisive) or more likely to disagree with expert decisions (disagreement). Due to the detail involved and the non-statistical nature of the process, the decisiveness and disagreement indices were built using the entire multi and mono-drug claim pair sets. Both decisiveness and disagreement were summarized for each node as shown in Table 4b.

Table 4b: Decisiveness and Disagreement

SELECT CASE WHEN(DrugSame=0) THEN 'Multi' ELSE 'Mono' END AS Type,
       Node, COUNT(*) AS Cnt,
       0.5+(ABS((P-0.5))) AS Decisiveness,
       SUM(CASE WHEN(P>=0.5 AND Combine='No') THEN 1
                WHEN(P<0.5 AND Combine='Yes') THEN 1
                ELSE 0 END) AS Incorrect
FROM tblOutput
GROUP BY DrugSame, Node, P
ORDER BY DrugSame, Decisiveness;

The decisiveness index looked at how far away each node was from concluding that the probability of the expert combining the claims was 50%. For calculating disagreement, the cutoff for combining claims in a pair was set at 50%. If the predicted probability of combination was at least 50% and the expert decided not to combine the claims, or the predicted probability was less than 50% and the expert decided to combine the claims, the observation was counted as a disagreement.
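The pairing procedure and the formula for c translate directly into code. The following sketch is illustrative (not the dissertation's implementation); it takes (predicted probability, actual event) pairs, forms the Cartesian product of events and non-events, and applies c = (nc + 0.5(t - nc - nd)) / t:

```python
def c_statistic(pairs):
    """Area under the ROC curve from (predicted_probability, is_event) pairs.

    Counts concordant (event scored higher), discordant (non-event scored
    higher), and tied pairings over the Cartesian product of the event and
    non-event sets. Assumes at least one event and one non-event.
    """
    events = [p for p, is_event in pairs if is_event]
    non_events = [p for p, is_event in pairs if not is_event]
    nc = nd = 0
    t = len(events) * len(non_events)  # total number of pairings
    for pe in events:
        for pn in non_events:
            if pe > pn:
                nc += 1
            elif pe < pn:
                nd += 1
    return (nc + 0.5 * (t - nc - nd)) / t
```

With the two-observation example above (0.2 actually a non-event, 0.5 an event), the single pairing is concordant and c = 1.0; if both shared the same predicted probability, the pairing would be a tie and c = 0.5.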
4.4 Results

4.4.1 Overall Model Performance

Details of relative model performance were not consistent across the multi and mono-drug claim pairs. Generally, CARTf did as well as or better than all competitors within a claim pair type, and FixedX was always worst. However, Fixed7 did outperform CARTi in the mono-drug claim pairs. Due to the discrepancies, results are presented separately by multi versus mono-drug type in Tables 4c and 4d and Figures 4b-4e.a

For the multi-drug claim pairs, CARTf was consistently the best performer (median c-statistic = 93.51%) and CARTp was a close second (median c-statistic = 92.70%). There was a significant degradation in performance for the CARTi model, which performed worse than all QUESTx methods (median c-statistic = 83.23%). The same ordering that held for the CARTx methods held for the QUESTx methods, but there was a more linear decline from the full, to partial, to IniDiff-only models (median c-statistics = 89.55%, 88.80%, and 86.57%, respectively). The FixedX models did consistently worse than all other models. Fixed7 was the best FixedX performer (median c-statistic = 73.69%).

a To conserve space, Fixed3 and Fixed5 were omitted. Fixed7 did better than either of the other two.
Table 4c and Figures 4b and 4c: Multi-Drug C-Statistics

Run  CARTf   CARTp   CARTi   QUESTf  QUESTp  QUESTi  Fixed7
1    95.19%  94.16%  86.33%  91.39%  91.14%  87.91%  75.33%
2    91.33%  91.18%  81.88%  89.56%  88.84%  84.52%  73.88%
3    94.20%  91.93%  82.19%  88.68%  88.42%  84.62%  73.22%
4    94.51%  93.22%  82.69%  89.38%  89.70%  85.23%  72.49%
5    93.42%  91.34%  82.53%  87.30%  86.96%  84.87%  73.55%
6    93.56%  92.70%  86.22%  90.44%  90.31%  88.44%  74.67%
7    93.45%  93.40%  82.40%  90.23%  90.47%  84.30%  74.40%
8    91.29%  92.70%  85.64%  88.83%  88.53%  88.65%  71.06%
9    94.58%  92.64%  83.76%  89.54%  88.70%  87.93%  71.49%
10   93.30%  93.16%  85.27%  90.12%  88.76%  88.61%  73.84%

Max-Median  1.69%  1.46%  3.11%  1.84%  2.34%  2.08%  1.64%
Median     93.51% 92.70% 83.23% 89.55% 88.80% 86.57% 73.69%
Median-Min  2.22%  1.53%  1.35%  2.25%  1.84%  2.28%  2.63%

[Figures 4b and 4c: C-statistics by method and run, and median (min-max) c-statistics by method, for the multi-drug claim pairs.]
Table 4d and Figures 4d and 4e: Mono-Drug C-Statistics

Run  CARTf   CARTp   CARTi   QUESTf  QUESTp  QUESTi  Fixed7
1    90.06%  90.06%  81.84%  89.73%  89.73%  88.82%  86.13%
2    80.22%  80.22%  80.77%  92.79%  92.79%  95.86%  91.77%
3    91.23%  91.23%  81.78%  91.38%  91.38%  93.83%  87.82%
4    93.66%  93.66%  82.23%  91.37%  91.37%  92.34%  88.61%
5    86.26%  86.26%  78.62%  84.22%  84.22%  84.27%  80.95%
6    92.99%  92.99%  80.74%  92.44%  92.44%  93.10%  89.98%
7    77.13%  77.13%  76.25%  86.13%  86.13%  93.24%  83.89%
8    94.15%  93.84%  84.30%  91.59%  91.59%  94.87%  89.98%
9    91.73%  91.73%  81.04%  87.37%  87.37%  90.38%  82.89%
10   91.03%  91.03%  83.56%  90.17%  90.17%  91.26%  88.17%

Max-Median  3.02%  2.71%  2.89%  2.02%  2.02%  3.15%  3.77%
Median     91.13% 91.13% 81.41% 90.77% 90.77% 92.72% 87.99%
Median-Min 14.00% 14.00%  5.16%  6.55%  6.55%  8.45%  7.05%

[Figures 4d and 4e: C-statistics by method and run, and median (min-max) c-statistics by method, for the mono-drug claim pairs.]

For the mono-drug claim pairs, the best performer was QUESTi (median c-statistic = 92.72%). The BMatch variable was not used by CARTf/QUESTf, so CARTf and CARTp were equivalent, as were QUESTf and QUESTp. CARTi was the worst performing model (median c-statistic = 81.41%). CARTf/p and QUESTf/p were within 0.5% of each other (median c-statistics = 91.13% and 90.77%, respectively).

Formal hypothesis tests were carried out between select algorithm-PVS methods using the Wilcoxon Signed Rank test (Table 4e). For the multi-drug claim pairs, CARTf performed significantly better than QUESTf, Fixed7, and CARTi (p=0.0020). QUESTf outperformed QUESTi (p=0.0098). The differences between CARTf/QUESTf and CARTp/QUESTp were marginal (p=0.0371/0.0488), especially given the multiple comparisons. For the mono-drug claim pairs, the single strong result was that CARTf/QUESTf outperformed CARTi/QUESTi (p=0.0039/0.0098).
Unexpectedly, QUESTi outperformed QUESTf and CARTf, though the difference between CARTf and QUESTi was not statistically significant. All other differences were not statistically significant.

Table 4e: Formal Method Comparison

Pair Type   Comparison        p-value
Multi-Drug  CARTf - QUESTf    0.0020
            CARTf - Fixed7    0.0020
            CARTf - CARTp     0.0371
            CARTf - CARTi     0.0020
            QUESTf - QUESTp   0.0488
            QUESTf - QUESTi   0.0098
            CARTs - CARTi     0.0020
Mono-Drug   CARTf - QUESTf    0.4922
            CARTf - Fixed7    0.3750
            CARTf - CARTi     0.0039
            QUESTf - QUESTi   0.0098
            CARTf - QUESTi    0.6250

(Red in the original indicates the expected direction was reversed.)

4.4.2 Predictor Variable Set Modification

By nature, the process of evaluating terminal nodes for decisiveness and disagreement is not easily summarized. Table 4f shows the summary for each terminal node in each decision tree: the number of observations in each node, and their decisiveness and disagreement indices.

Table 4f: Decisiveness and Disagreement

PVS    Node  N      Decisiveness  Disagreement
CARTs  8     280    59.30%        114 (40.71%)
       7     760    66.30%        256 (33.68%)
       2     293    73.00%         79 (26.96%)
       3     3,994  91.50%        341 (8.54%)
       5     1,196  97.50%         30 (2.51%)
CARTi  7     204    59.30%         83 (40.69%)
       6     129    69.80%         39 (30.23%)
       3     1,790  84.90%        271 (15.14%)
       1     4,287  87.10%        555 (12.95%)
       8     113    92.90%          8 (7.08%)

In practice, CARTi would have been produced first. Within CARTi, node 1 has a large number of observations with decisiveness below 90% and 555 cases where the expert and tree disagree. Figure 4f shows node 1 in relation to the root node.a The criterion for getting to node 1 was IniDiff<=22.5 (the prescriptions were filled within 22.5 days of each other). Since most (78% in this sample) prescriptions are written for 30 days, this rule is effectively saying that claims should be combined as long as the second claim does not start in the last week of the first prescription. In many cases this rule will probably work because most prescriptions are not filled that early.

a Neither the root node nor node 2 is represented in Table 4f because they are not terminal nodes.

[Figure 4f: Node 1 from CARTi. The root node (Yes 64.14%, n=4,184; No 35.86%, n=2,339; total 6,523) splits on IniDiff (improvement = 0.2013). IniDiff<=22.5 leads to node 1 (Yes 87.05%, n=3,732; No 12.95%, n=555; 65.72% of the sample); IniDiff>22.5 leads to a node with Yes 20.21%, n=452; No 79.79%, n=1,784 (34.28% of the sample).]

[Figure 4g: Node 1 from CARTs. Node 1 (Yes 87.05%, n=3,732; No 12.95%, n=555) splits on BMatch (improvement = 0.0348). BMatch=Yes leads to a node with Yes 26.96%, n=79; No 73.04%, n=214 (4.49% of the sample); BMatch=No leads to a node with Yes 91.46%, n=3,653; No 8.54%, n=341 (61.23% of the sample).]
In many cases this rule will probably work because most prescriptions are not filled that early. 1 Neither the root node nor node 2 are represented in Table 4d because they are not terminal nodes. 79 Figure 4f: Node 1 from CARTi Combine Cat Y es 64.14 4184 No 35.88 2330 To5IW 55JW 5T I --- <*22.5 1 ---------- IniDiff lmpiovemenM).2013 ________I ________ c a t « II n Yes 87 05 3732 No 12.05 555 Total (JJ5.72) 4287 —I >22.5 P e t % n Yes 20.21 452 No 79.79 1784 Total (34.28) 2238 Figure 4g: Node 1 from CARTs Yes <=22.5 n Cat. ■ i % n Yes 87.05 3732 No 12.95 555 Total (65.72) 4287 r BMatch Improvements .0348 ________ I ________ Cat. % n Yes 26.96 79 No 73J4 214 Total (4.49) 293 “ I No Cat. % n Yes 91.46 3653 No 8.54 341 Total (61.23) 3994 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure 4h: Sample Problem Claim Pair 2 Years Rx H istory B o tn 3(0 278 25b 233 211 189 187 1 4 4 122 100 -345 -270 -180 -90 120 111 1(2 93 84 74 (7 58 49 4 ( 18 ( 278 345 1997-02-07 1997-03-09 Trazodone 13000 1997-03-11 1997-04-10 Trazodone 150.00 1997-04-07 1997-05-07 Trazodone 150.00 1997-03-12 1997-06-11 Trazodone 130.00 ► 1997-07-08 1997-08-07 Trazodone 130.00 1997-08-06 1997-09-03 Trazodone 130.00 With BMatch in the model, node 1 splits into nodes 2 and 3. Although node 2 in CARTs is now more heterogeneous than node 1 in CARTi, node 3 in CARTs is more homogenous and node 2 in CARTs represents a comparatively small portion of the sample. Also, the number of disagreements between the model and human expert has been reduced from 553 in node 1 for CARTi to a combined 420 in nodes 2 and 3 for CARTs. Figure 4h shows a sample claim that would be classified as being combined by CARTi and not combined by CARTs. In this case, the patient may have been filling the prescription for Trazodone on 1997-08-07 early in order to synchronize the prescription filling with Fluoxetine. 
The patient is taking Fluoxetine and Trazodone together, but the two claims that are highlighted and under review in Figure 4h probably do not represent an intended treatment, which would mean the patient was taking an abnormally high dose of Trazodone (300mg) for a very short period of time.

The addition of BMatch was intended to improve modeling for cases such as the one presented in Figure 4h (IniDiff <= 22.5). However, BMatch also was chosen as the second splitter for the right side of the tree (IniDiff > 22.5). This had the effect of simplifying the right side of the tree (from 3 splits to 2). Overall, the median c-statistic increased from 83.23% for CARTi to 89.82% for CARTs (p=0.0020).

4.5 Discussion

4.5.1 Overall Model Performance

The c-statistic was used to assess overall model and PVS performance.* With the exception of the CART C&RT method coupled with only the number of days between prescription starts in the PVS (CARTi) in the mono-drug claim pairs, the model-based rules consistently outperformed the simple rule set. Among the model-based rule generators, the QUEST C&RT method coupled with the number of days between prescription starts as the only variable in the PVS (QUESTi) had the highest median c-statistic (92.72%) for the mono-drug claim pairs. However, the CART C&RT method with the full PVS (CARTf) was the second-best performer (median c-statistic = 91.13%). Furthermore, for the multi-drug claim pairs, CARTf was clearly the best performer (median c-statistic = 93.51%) and QUESTi performed relatively poorly (fifth overall, with median c-statistic = 86.57%). Based on these results, CARTf is the preferred overall technique and PVS.

* While the c-statistic was not helpful in identifying and addressing particular problem nodes, it is useful for confirming PVS modifications for the entire model.
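The c-statistic itself is simply the concordance between expert labels and model scores (equivalently, the area under the ROC curve); a minimal sketch, with scores invented for illustration:

```python
def c_statistic(labels, scores):
    """Concordance statistic: over all (positive, negative) pairs, the
    fraction where the positive case receives the higher score, ties
    counting one half -- equivalent to the area under the ROC curve."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical combine-probabilities for five expert-labeled claim pairs.
print(c_statistic([1, 1, 0, 0, 1], [0.9, 0.7, 0.4, 0.7, 0.8]))
```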
The reason for QUESTi outperforming CARTf/QUESTf in the mono-drug claim pairs is not clear. The difference between CARTf and QUESTi was not statistically significant, but the difference between QUESTi and QUESTf was. Possible reasons for the poor performance of QUESTf include the relatively small percentage of "events" (8.77%) and an improvable PVS; QUESTf appears to be a classic case of overlearning. The results did not support Loh and Shih's proposition that CART would result in biased models.43 One possible reason for this is that the PVS did not contain many categorical predictors, and the only variables with many categories were the drug names involved. However, Lim found that the so-called unbiased tree algorithms, QUEST and his own algorithm PLUS, did not outperform CART and other decision tree methods, and Lim suggests QUEST and PLUS may in fact be biased themselves.47

4.5.2 Predictor Variable Set Modification

The PVS is an attempt to make available to the modeling algorithm the same constructs the human expert is using to decide whether to combine the members of a claim pair. Because the constructs in the PVS are almost entirely manufactured, it is always possible to modify the PVS, either by changing existing variable definitions or by introducing new ones. The challenge is first to find problem classes of observations and then to find the appropriate PVS modification.

The c-statistic is helpful in assessing the overall impact of PVS modifications, but it is insufficiently granular to facilitate the identification of node/PVS problem areas. Fortunately, decision trees are among the easiest modeling techniques for non-statisticians to review structurally. Each node can be examined to determine whether it is heterogeneous or often in disagreement with the human expert.
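That screening step can be sketched as follows (node contents are invented to resemble CARTi node 1; the 90% cut-off follows the discussion of node 1 above):

```python
from collections import Counter

def node_summary(expert_labels, tree_predictions):
    """Per-node indices: decisiveness is the share of the majority expert
    label; disagreement counts cases where the tree's predicted class
    differs from the expert's label."""
    n = len(expert_labels)
    majority = Counter(expert_labels).most_common(1)[0][1]
    disagree = sum(e != p for e, p in zip(expert_labels, tree_predictions))
    return {"n": n, "decisiveness": majority / n, "disagreement": disagree}

def flag_problem_nodes(nodes, min_decisiveness=0.90):
    """Return names of terminal nodes heterogeneous enough to review."""
    return [name for name, (expert, pred) in nodes.items()
            if node_summary(expert, pred)["decisiveness"] < min_decisiveness]

# A node resembling CARTi node 1: the tree says "combine" for all 4,287
# cases while the expert disagrees on 555 of them.
nodes = {"node 1": (["Y"] * 3732 + ["N"] * 555, ["Y"] * 4287)}
print(flag_problem_nodes(nodes))   # prints ['node 1']
```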
Once problematic nodes are identified, representative cases can be sampled to determine how the PVS could be modified to build a better tree. This method was demonstrated by starting with a CART model that had only the initial difference between prescription starts in the PVS (CARTi). A problematic node was found and the variable BMatch was introduced. BMatch represented the relatively complex construct of whether or not a better match existed for the two drugs involved in the claim pair. Including BMatch changed the structure of CARTi substantially and improved its overall performance.

However, while adding BMatch to a full PVS (moving from CARTp/QUESTp to CARTf/QUESTf) did change the structure of the tree, the overall performance gain was moderate. Analysis of the trees suggested that there was enough information in the partial PVS to almost replicate the impact of BMatch with simpler constructs. It seemed as though the combination of PstlWOt and OlpPctl may have provided a good surrogate for BMatch. PstlWOt was the number of days from the end of the claim pair overlap period (the part highlighted in yellow in the dosing history graph) to the beginning of the next prescription for drug 1 in the claim pair. OlpPctl was equal to the number of days of overlap in the claim pair divided by the days supplied in claim 1. The smaller these numbers were, the less likely it was that the claims would be combined. An example of a small PstlWOt and OlpPctl is shown graphically in Figure 4a: an early fill. It is important not to forget about surrogate splitters in evaluating PVS modifications when a C&RT method is being used as a modeling algorithm.
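The two constructs just defined, PstlWOt and OlpPctl, can be sketched as derived fields on a claim pair (variable names come from the text; the claim-pair layout and date handling are assumptions). Unlike BMatch, both are always computable, which is part of what lets them stand in for it:

```python
from datetime import date, timedelta

def derived_predictors(start1, supply1, start2, supply2, next_fill_drug1):
    """Sketch of IniDiff, OlpPctl, and PstlWOt for one claim pair.
    OlpPctl = days of claim-pair overlap / days supplied in claim 1;
    PstlWOt = days from the end of the overlap period to the next
    prescription for drug 1 (negative if that fill comes earlier)."""
    end1 = start1 + timedelta(days=supply1)
    end2 = start2 + timedelta(days=supply2)
    overlap_end = min(end1, end2)
    overlap_days = max(0, (overlap_end - max(start1, start2)).days)
    return {
        "IniDiff": (start2 - start1).days,
        "OlpPctl": overlap_days / supply1,
        "PstlWOt": (next_fill_drug1 - overlap_end).days,
    }

# Hypothetical early fill: claim 2 starts 8 days before claim 1 runs out,
# and the next fill for drug 1 comes a day before the overlap window ends.
print(derived_predictors(date(1997, 7, 8), 30, date(1997, 7, 30), 30,
                         date(1997, 8, 6)))
```

The small OlpPctl and negative PstlWOt in this example correspond to the early-fill pattern described above.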
One of the common questions new C&RT users have is "why did the tree change when I dropped a variable that wasn't even being used as a splitter?"* The answer is that some variables can be very important as surrogate splitters (when other variables have missing values) and can change the way the model is built, even though they do not show up as primary splitters. In the above example, BMatch may sometimes be null, but OlpPctl is never null and was used as a surrogate for BMatch in CARTf. Some modifications to the PVS may not have an obvious impact on the model's primary splitters, but may serve an important secondary role and improve overall performance.

* This question often arises when users remove variables that seem unimportant in an attempt to speed up processing.
* PstlWOt could be negative if the next prescription for drug 1 occurred before the end of the claim pair overlap window.

4.6 Conclusion

Building PVSs and using modeling techniques is more labor intensive in the short term than using simple rules to determine claim pair combination. This effort is justified only when the model-based method outperforms existing methods in agreeing with expert opinion. Two modeling techniques and four PVS configurations were compared to each other and to an existing simple rule set. The results indicated that the appropriate modeling technique and PVS could substantially outperform existing rule sets and that there was also substantial performance variation across modeling techniques and PVSs.

5 Application to Antidepressant Treatment

5.1 Introduction

The goal of this dissertation is to produce a viable set of methods for converting raw pharmacy claims into clinical constructs.
The usefulness of these methods is increased to the extent that they facilitate biomedical research conducted by a typical clinician or analyst. The standard use for pharmacy claims data is in the area of outcomes or pharmaceutical economics research. A sample analysis was conducted using the methods described in this dissertation. The dissertation output dataset contained the variables for selecting, grouping, and evaluating patients and treatments. Section 5 presents a sample analysis of treatment discontinuation, events, and costs for 6,577 patients starting antidepressant treatment between 1996 and 1998 with a class of antidepressants known as selective serotonin reuptake inhibitors (SSRIs).

5.2 Hypotheses

1) There is a statistically significant difference in time to treatment discontinuation between SSRIs.
2) There is a statistically significant difference in time to the first treatment event between SSRIs.
3) There is a statistically significant difference in treatment cost between SSRIs.

5.3 Methods

5.3.1 Description of Sample

The study employed data from a pharmacy claims database for a federally qualified health maintenance organization (HMO) with over 5 million members across California, Oklahoma, Oregon, Texas, and Washington. All pharmacy claims and eligibility information were pulled for the 194,632 patients who received at least one prescription for an SSRI antidepressant medication between January 1, 1996 and December 31, 1998. A random sample of 13,543 patients was selected for analysis.

5.3.2 Creation of Key Episodes and Group Assignment

The methods described in Section 3 were used to build antidepressant treatment episodes for all patients.
A key episode was chosen for each patient based on the following criteria:

1) Patient was 18 years of age or older at the start of treatment.
2) Initial treatment was a monotherapy SSRI (fluoxetine, paroxetine, or sertraline alone).
3) Treatment began between 1996-01-01 and 1998-12-31.
4) No other antidepressant treatment was received within the prior 120 days (washout period).

In the event that a patient had more than one treatment episode that met the criteria, the first one was used. There were 6,577 patients (and key episodes) fitting the criteria and used in the analysis. Analyses were conducted as intent-to-treat based on the initial SSRI treatment of the key episode.

5.3.3 Covariates

Table 5a presents the list of covariates and the distribution of the covariates across the three treatment groups. Market, product,* age, gender, and marital status were available in the eligibility table. If a member's status changed for any of these variables, the value at the time of treatment start was used. The year the episode started and whether or not there was a record of the patient having received prior antidepressant therapy were available in the treatment episode summary dataset. Post treatment availability was calculated as the number of days from the end of the key episode to the last observable date (due to loss of eligibility or end of data availability). The average overall monthly prescription costs for the prior year were calculated because prior cost is often a good predictor of future costs and a coarse indicator of general patient health status.

5.3.4 Outcome Variables

Outcome variables were constructed to measure discontinuation, treatment events, and cost of treatment.
Discontinuation and treatment events were time-to-event (TTE) outcomes, defined as the number of days from the start of the key episode to the event of interest and a binary variable specifying whether the event occurred or the measurement was censored.

* Market and product are often not included in health research studies, but are potentially important because approved drug list (formulary) restrictions can take effect at these levels and result in business-related (as opposed to medical) treatment decisions or events. For instance, a particular drug may be off one market's formulary and on another's. This would result in differential selection of the drug for non-medical reasons.

The following TTE events were measured for one year after treatment start:

• Time to discontinuation: The number of days from the start of therapy to the end of the treatment episode. An episode was terminated if the patient did not receive treatment for more than 30 days or no more data were available for the patient. Treatment access was based on the prescription date and recorded days supplied. If the patient became ineligible or the end of the file was reached before the end of treatment (i.e., date filled + days supplied > end of data), then the last date for which data were available was used (i.e., end of data).
• Time to treatment change: The number of days from the start of therapy to the first drug switch, augment, or drop. If patients reached the end of treatment with no treatment changes, the observation was considered censored.
• Time to dose change: The number of days from the start of therapy to the first dose change. If a patient had a treatment change, the patient was censored for dose changes.
• Time to therapy break: The number of days from the start of therapy to the first non-terminating gap (<=30 days) in treatment of more than 7 days (grace period).
• Time to any treatment event: The number of days from the start of therapy to the first treatment change, dose change, or therapy break.
• Time to next episode: The number of days from the end of the key episode to the beginning of the next episode.

Formal analyses were conducted only for time to discontinuation and time to any treatment event.

For visualization purposes, cost data were stored in 24 one-month segments based on prescription fill date (12 segments on either side of the key episode start date). For formal analysis purposes, the periods were collapsed into one year prior and one year post by using the average monthly cost for each patient.* In all cases, cost was based on the amount the insurer paid for the prescriptions. The following cost variables were calculated:

• Total Rx cost: The overall cost for all prescriptions received by the patient.
• Antidepressant Rx cost: The cost for antidepressant prescriptions.
• Non-antidepressant Rx cost: The cost for all non-antidepressant prescriptions.

Finally, as with cost, 24 binary variables were defined to look at the months during which patients filled antidepressant medications (0=no prescription, 1=some prescription). These variables were used for visualization only.

5.3.5 Analytic Techniques

Time to Event Analysis

Treatment compliance/completion analysis has frequently been done using logistic regression analysis. The problems with logistic regression analysis based on a fixed acceptable treatment length include:

* If a patient was not eligible for a portion of the year, the average monthly cost for the available months was used. The implicit assumption is that the available months were representative of months during which the patient was ineligible.
The reasonableness of this assumption was checked by stratifying based on the data availability variable and seeing whether the results changed substantially. The results were stable.

1. Setting an acceptable length is arbitrary (between 4 and 9 months for depression).
2. Patients who become ineligible or whose treatments start too close to the end of the data file must be excluded from the analysis.

Using TTE analysis methods allows the analyst to use more data without having to set arbitrary definitions of treatment success. Rather than having a single binary outcome (success or failure), two variables are used. One variable specifies the amount of time (days) to the event of interest or to the last observable date for each patient. The second variable specifies whether the period of time was terminated by an event or by loss to observation.

TTE analyses were conducted visually using Kaplan-Meier plots in a web-browser-based application that facilitated rapid stratification and filtering using predictor variables and switching between outcome variables.* Once distributions and simple relationships had been checked, formal analyses were conducted using Cox regression.11 Cox proportional hazards models make few distributional assumptions and allow for multivariable analyses.

Cost Analysis

Medical costs typically have skewed distributions, with most patients using relatively few services and a few patients having very high utilization. Several methods were employed to understand and analyze the cost data.

* The outcome visualyzer is a web-browser-based Java 1.1 applet developed by the author (Appendix 6).
* SAS v8.1 PROC PHREG was used to conduct the TTE analyses.
Visually, the same browser-based application used for producing Kaplan-Meier plots was used to plot the utilization variables over time (12 months before and after the start of the key treatment episode). For continuous outcomes such as cost, the mean, median, and custom percentile measures could be selected and switched between to understand the distributions.

Formal analyses were based on natural-logged cost values.* These models were useful for building parameter estimates, understanding relationships between the outcome and predictor variables, and conducting formal statistical tests while adjusting for possible confounding variables. However, the use of log costs presents interpretation problems because the units are unnatural. To provide a more natural understanding of the estimates, the smearing technique was employed to return the predicted values to their original units while allowing statistical adjustment. The smeared cost estimate for each observation was calculated as:

    Smeared estimate = ( Σ e^(p + r_i) ) / n

where:
    n: total number of patients
    p: model-based predicted (log-scale) cost value for the patient being estimated
    r_i: residual (log-scale) for patient i, with the sum taken over all n patients

The resulting values were then merged with the original dataset so that mean costs could be generated based on treatment group.

* SAS v8.1 PROC GLM was used to conduct the cost analyses.

Selection Bias

One of the primary challenges to the validity of claims-based analysis is the lack of random patient assignment. Without random assignment there is the possibility that patients or their providers will select treatments based on diagnostic or prognostic factors that have an impact on the outcome variables. This selection results in a confounding of the relationship between the primary predictor of interest (e.g., initial drug treatment) and the outcomes of interest (e.g., compliance or cost).
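The smearing retransformation described above (commonly known as Duan's smearing estimator) can be sketched directly; the fitted values and residuals below are invented for illustration:

```python
import math

def smeared_estimates(log_predictions, log_residuals):
    """Smearing retransformation: for each model prediction p (on the
    log scale), average exp(p + r) over every residual r, returning an
    estimate back in the original (dollar) units."""
    n = len(log_residuals)
    return [sum(math.exp(p + r) for r in log_residuals) / n
            for p in log_predictions]

# Hypothetical log-cost fitted values and residuals from a log-cost model.
preds = [3.0, 3.5]
resid = [-0.2, 0.0, 0.2]
print(smeared_estimates(preds, resid))
```

Note that the smeared values exceed the naive exp(p) back-transformation, reflecting the retransformation bias that simple exponentiation would ignore.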
Often, confounding can be handled statistically by including confounding variables in the Cox (TTE) and linear (cost) regression models or by employing a stratified analysis. The primary variable of interest for this analysis was SSRI treatment selection. Bivariate analyses provided strong evidence for non-random assignment of patients to treatment groups (Table 5a). A classification tree was built to facilitate the process of understanding the multivariable nature of the selection bias and to help select key variables for stratification or adjustment.

In the process of conducting the analysis, it became apparent that results varied based on the year the patient began therapy. Figure 5a shows the number of prescriptions for each treatment by month. A formulary* change was implemented in January 1997, whereby paroxetine became the preferred drug and special approval was required to prescribe fluoxetine or sertraline. Formulary restrictions such as this one represent a special type of selection bias possible in a managed care setting. Analyses were subsequently stratified based on whether the treatment began before or after the formulary change.

* A formulary is a list of acceptable or preferred agents within a therapeutic class.

5.4 Results

5.4.1 Treatment Selection Bias

Table 5a shows the bivariate relationships between the key episode's initial treatment and the variables available to predict treatment selection (the potential confounding variables for the final models). Paroxetine was the most common initial treatment (68.50%), followed by fluoxetine (17.70%) and sertraline (13.81%). The only predictor variable that was not strongly statistically significantly related to treatment selection was gender (p=0.089).a,b The choice of initial treatment varied based on the business segments of the insurance carrier: market and insurance product.
Patients who received fluoxetine or sertraline as a first treatment were more likely than paroxetine patients to be in Oregon or Washington or in a commercial (employer-paid) plan.

a The Likelihood Ratio chi-square test was used for all univariate categorical variable statistical tests. The Wilcoxon Rank Sums test was used for overall prior year prescription cost.
b Unless otherwise stated, all bivariate analyses had p<0.001.

Table 5a: Treatment Selection

                                    Fluoxetine       Paroxetine       Sertraline
Market (p<0.001)1
  CA                              842 (72.34%)   3,817 (84.73%)     546 (60.13%)
  OK                               76 (6.53%)      255 (5.66%)       59 (6.50%)
  OR                               94 (8.08%)      101 (2.24%)      111 (12.22%)
  TX                               79 (6.79%)      245 (5.44%)       90 (9.91%)
  WA                               73 (6.27%)       87 (1.93%)      102 (11.23%)
Product (p<0.001)
  Commercial                      861 (73.97%)   1,975 (43.84%)     683 (75.22%)
  Medicaid                          3 (0.26%)        4 (0.09%)        6 (0.66%)
  Senior                          300 (25.77%)   2,526 (56.07%)     219 (24.12%)
Age (p<0.001)
  <30                             125 (10.74%)     288 (6.39%)      103 (11.34%)
  30-39                           293 (25.17%)     616 (13.67%)     194 (21.37%)
  40-49                           268 (23.02%)     625 (13.87%)     238 (26.21%)
  50-59                           158 (13.57%)     399 (8.86%)      133 (14.65%)
  60-69                           117 (10.05%)     561 (12.45%)      67 (7.38%)
  70+                             203 (17.44%)   2,016 (44.75%)     173 (19.05%)
Gender (p=0.089)
  Female                          828 (71.13%)   3,086 (68.50%)     647 (71.26%)
  Male                            336 (28.87%)   1,419 (31.50%)     261 (28.74%)
Marital Status (p<0.001)
  Married                         480 (41.24%)   1,172 (26.02%)     399 (43.94%)
  Single                          684 (58.76%)   3,333 (73.98%)     509 (56.06%)
Year Start (p<0.001)
  1996                            588 (50.52%)   1,119 (24.84%)     448 (49.34%)
  1997                            279 (23.97%)   1,548 (34.36%)     209 (23.02%)
  1998                            297 (25.52%)   1,838 (40.80%)     251 (27.64%)
First Episode (p<0.001)
  No                              210 (18.04%)     567 (12.59%)     147 (16.19%)
  Yes                             954 (81.96%)   3,938 (87.41%)     761 (83.81%)
Post Treatment Availability (p<0.001)
  <12 Months                      801 (68.81%)   2,674 (59.36%)     634 (69.82%)
  12+ Months                      363 (31.19%)   1,831 (40.64%)     274 (30.18%)
Prior Year Average Monthly Overall Rx Costs (p<0.001)2
  Median                                10.25            15.25             9.66
Overall                         1,164 (17.70%)  4,505 (68.50%)     908 (13.81%)

1 p-values calculated using the Likelihood Ratio chi-square test.
2 p-values calculated using the Wilcoxon Rank Sums test.

Paroxetine patients tended to be older, with 44.75% of the group being 70+ years old compared to 17.44% and 19.05% of fluoxetine and sertraline patients, respectively. This was expected given the relationship between treatment selection and insurance product, and it was also expected to have a large impact on the outcome variables. Age is known to be an important predictor of overall health.

Overall prescription costs can be used as an indicator of health. The paroxetine patients had a median monthly cost of $15.25 for all prescriptions in the year prior to the key episode. Fluoxetine and sertraline patients had median monthly costs of $10.25 and $9.66, respectively.

The gender distribution across treatments was similar: 71.13%, 68.50%, and 71.26% female for fluoxetine, paroxetine, and sertraline, respectively. Paroxetine patients were more likely to be single (73.98%) than fluoxetine (58.76%) or sertraline (56.06%) patients. A larger percentage of fluoxetine (50.52%) and sertraline (49.34%) patients began treatment in 1996, compared to paroxetine patients (24.84%).

Figure 5a shows the total number of prescriptions for the three antidepressants by month. Paroxetine prescriptions increased from the beginning of the study period through December 1998. Fluoxetine and sertraline prescriptions did not grow in the same way, and there was a perceptible drop in January 1997. This drop was coupled with an increased prescription rate for paroxetine.

Paroxetine patients were more likely to have received a prior prescription for an antidepressant (87.41%) than either fluoxetine or sertraline patients (81.96% and 83.81%, respectively). Fluoxetine and sertraline patients had at least 12 months of post treatment availability 31.19% and 30.18% of the time, respectively. Paroxetine patients tended to have longer post treatment data availability, with 40.64% of patients having at least 12 months. Overall, fluoxetine and sertraline patients tended to be more similar to each other, with paroxetine patients having a different profile.

Figure 5a: Total Number of Prescriptions by Product and Month. [Monthly prescription fill counts for fluoxetine, paroxetine, and sertraline.]

Figure 5b: Treatment Discontinuation. [Kaplan-Meier plots of percent not discontinuing vs. days from Rx start for fluoxetine (n=1,164), paroxetine (n=4,505), and sertraline (n=908); additional panels restrict to treatment started prior to the formulary change (n=588, 1,119, 448) and after the formulary change (n=576, 3,386, 460).]

Table 5b: Treatment Discontinuation

                    Prior (n=2,155)               Post (n=4,422)
Parameter      Hazard Ratio   95% CI         Hazard Ratio   95% CI
Paroxetine        0.936    (0.818, 1.071)       1.455    (1.276, 1.660)
Sertraline        1.121    (0.958, 1.312)       1.223    (1.030, 1.452)
OK                0.936    (0.734, 1.195)       0.923    (0.779, 1.093)
OR                1.058    (0.818, 1.368)       0.959    (0.786, 1.171)
TX                1.179    (0.958, 1.451)       1.215    (1.033, 1.428)
WA                0.729    (0.478, 1.110)       1.090    (0.902, 1.317)
Commercial        0.756    (0.607, 0.941)       0.844    (0.726, 0.982)
Age               0.992    (0.987, 0.998)       0.992    (0.988, 0.995)
First Episode     1.034    (0.892, 1.198)       1.061    (0.939, 1.200)
Prior Cost        0.963    (0.933, 0.994)       0.977    (0.955, 0.999)

5.4.2 Treatment Discontinuation

Figure 5b shows sample Kaplan-Meier plots generated by the outcome visualyzer application for treatment discontinuation. The plots were generally consistent across available covariates. The one exception, whether treatment began before or after the formulary change, is highlighted in Figure 5b. Patients were more likely to start with paroxetine and less likely to start with fluoxetine or sertraline after the formulary change. Also, patients generally remained on treatment longer after the formulary change. Combining strata, fluoxetine patients tended to discontinue therapy later than either paroxetine or sertraline patients, which appeared roughly equivalent to each other. Formal analyses were stratified due to the observed discrepancy in results based on the relationship between treatment start date and the formulary change.

Table 5b shows the results of the Cox proportional hazards model for treatment discontinuation within the first year of therapy, stratified by whether treatment began before or after the formulary change and controlling for market, product, age, whether the key episode was the patient's first treatment, and prior year overall prescription cost. All results are presented after adjusting for all other terms in the model.
Before the formulary change, there were no statistically significant differences between fluoxetine patients and either paroxetine (p=0.339) or sertraline patients (p=0.153). After the formulary change, patients starting on paroxetine were 1.46 (95% CI: 1.28, 1.66) times more likely to discontinue therapy than fluoxetine patients. Sertraline patients were 1.22 (95% CI: 1.03, 1.45) times more likely than fluoxetine patients to discontinue therapy within one year.

There were no statistically significant differences in treatment discontinuation by market at the p=0.05 level. Commercial patients in employer-paid plans were less likely than senior plan members to discontinue therapy before one year, both before (HR=0.76; 95% CI: 0.61, 0.94) and after the formulary change (HR=0.84; 95% CI: 0.73, 0.98). Paradoxically, the model showed an inverse relationship between age and discontinuation: the hazard of discontinuation decreased by a factor of 0.99 for each additional year of age (prior p=0.005, post p<0.001). There were no statistically significant differences before or after the formulary change based on whether or not the key episode was the patient's first episode (p=0.661 and p=0.342, respectively). Increased prior year overall prescription costs* were associated with a decreased risk of discontinuation both before (HR=0.963; 95% CI: 0.93, 0.99) and after the formulary change (HR=0.977; 95% CI: 0.96, 0.99).

* Units were natural-logged average monthly dollars.
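Kaplan-Meier curves like those in the discontinuation plots are computed from the paired (time, event) variables described in Section 5.3.5; a minimal product-limit sketch on toy data:

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.  `events[i]` is
    True when the i-th subject had the event at `times[i]` and False
    when censored; censored subjects simply leave the risk set."""
    data = sorted(zip(times, events))
    n_risk = len(data)
    surv, curve, i = 1.0, [], 0
    while i < len(data):
        t, d, c = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                d += 1          # events at time t
            else:
                c += 1          # censorings at time t
            i += 1
        if d:
            surv *= 1 - d / n_risk
            curve.append((t, surv))
        n_risk -= d + c
    return curve

# Toy data: days to discontinuation, with three censored observations.
times = [5, 8, 8, 12, 15, 15, 20]
events = [True, True, False, True, False, True, False]
print(kaplan_meier(times, events))
```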
Figure 5c: Treatment Events. [Kaplan-Meier plots of percent without any treatment event vs. days from Rx start for fluoxetine (n=1,164), paroxetine (n=4,505), and sertraline (n=908); additional panels restrict to treatment started prior to the formulary change (n=588, 1,119, 448) and after the formulary change (n=576, 3,386, 460).]

Table 5c: Treatment Events

                    Prior (n=2,155)               Post (n=4,422)
Parameter      Hazard Ratio   95% CI         Hazard Ratio   95% CI
Paroxetine        0.739    (0.617, 0.885)       1.049    (0.889, 1.238)
Sertraline        1.052    (0.855, 1.295)       1.195    (0.968, 1.475)
OK                0.404    (0.267, 0.613)       0.817    (0.642, 1.040)
OR                0.759    (0.525, 1.096)       1.034    (0.802, 1.332)
TX                0.809    (0.592, 1.105)       0.803    (0.615, 1.047)
WA                1.154    (0.731, 1.822)       1.191    (0.934, 1.517)
Commercial        1.074    (0.799, 1.443)       1.202    (0.979, 1.477)
Age               0.998    (0.990, 1.005)       0.998    (0.993, 1.004)
First Episode     1.061    (0.871, 1.292)       0.980    (0.832, 1.155)
Prior Cost        0.989    (0.946, 1.034)       1.014    (0.982, 1.047)

5.4.3 Treatment Events

Figure 5c shows sample Kaplan-Meier plots generated by the outcome visualyzer application for treatment events (i.e., prescription changes or breaks in therapy). As with treatment discontinuation, results were consistent across most stratification variables except treatment start relative to the January 1997 formulary change. Prior to the formulary change, paroxetine patients were less likely to experience significant treatment events than either fluoxetine or sertraline patients.
After the formulary change, the experiences of paroxetine and fluoxetine patients were more similar, and sertraline patients were still more likely to experience treatment events. Combining all treatment start years, it appeared paroxetine patients were less likely to experience treatment events than sertraline patients, with fluoxetine patients falling between the other two groups.

Table 5c shows the results of the Cox proportional hazards model for treatment events within the first year of therapy, stratified by whether treatment began before or after the formulary change and controlling for market, product, age, whether the key episode was the patient's first treatment, and prior year overall prescription cost. All results are presented after adjusting for all other terms in the model.

Prior to the formulary change, paroxetine patients were 0.74 (95% CI: 0.62, 0.89) times as likely to have a significant treatment event as fluoxetine patients. There were no statistically significant differences between fluoxetine and sertraline patients (p=0.631). After the formulary change, there were no statistically significant differences between fluoxetine patients and either paroxetine (p=0.568) or sertraline patients (p=0.098). There were no statistically significant differences in treatment events by market at the p=0.05 level, except for Oklahoma in the group starting treatment before the formulary change (HR=0.40; 95% CI: 0.27, 0.61, compared to California). There were no statistically significant differences based on plan type (commercial versus senior), age, whether the key episode was the patient's first recorded antidepressant treatment, or prior year total prescription costs, either before or after the formulary change at the p=0.05 level.
Figure 5d: Prescription Costs. [Three line plots of mean monthly cost for Fluoxetine (n=1,164), Paroxetine (n=4,505), and Sertraline (n=908) patients over months -12 through +12 from Rx start: overall Rx cost, antidepressant Rx cost, and non-antidepressant Rx cost.]

Table 5d: Prescription Costs

Overall Prescription Cost
                 Prior (n=2,155; R²=25.47%)    Post (n=4,422; R²=29.40%)
                              95% C.I.                       95% C.I.
Parameter        Estimate   Lower    Upper     Estimate   Lower    Upper
Intercept          2.774     2.454    3.093      3.144     2.906    3.381
Paroxetine        -0.251    -0.349   -0.153     -0.431    -0.522   -0.341
Sertraline        -0.307    -0.425   -0.188     -0.388    -0.508   -0.268
OK                 0.055    -0.121    0.230     -0.144    -0.267   -0.021
OR                -0.032    -0.225    0.161     -0.190    -0.334   -0.046
TX                -0.226    -0.389   -0.063     -0.297    -0.419   -0.175
WA                -0.407    -0.702   -0.112     -0.964    -1.101   -0.826
Commercial         0.175     0.018    0.333      0.154     0.044    0.264
Age                0.004     0.000    0.008      0.004     0.002    0.007
First Episode      0.133     0.026    0.240      0.069    -0.020    0.157
Prior Cost         0.302     0.278    0.326      0.306     0.289    0.322

Antidepressant Prescription Cost
                 Prior (n=2,155; R²=10.59%)    Post (n=4,422; R²=8.36%)
                              95% C.I.                       95% C.I.
Parameter        Estimate   Lower    Upper     Estimate   Lower    Upper
Intercept          2.667     2.288    3.047      2.902     2.641    3.164
Paroxetine        -0.379    -0.496   -0.263     -0.637    -0.737   -0.538
Sertraline        -0.694    -0.835   -0.554     -0.570    -0.702   -0.438
OK                -0.153    -0.362    0.056     -0.089    -0.224    0.047
OR                -0.033    -0.262    0.197     -0.051    -0.209    0.108
TX                -0.510    -0.704   -0.317     -0.239    -0.373   -0.104
WA                -0.318    -0.669    0.033     -0.777    -0.928   -0.626
Commercial         0.574     0.386    0.762      0.430     0.308    0.551
Age                0.000    -0.005    0.005      0.004     0.001    0.007
First Episode     -0.017    -0.145    0.110     -0.082    -0.179    0.015
Prior Cost         0.054     0.026    0.083      0.061     0.043    0.079

5.4.4 Prescription Costs

Figure 5d shows the mean overall, antidepressant, and non-antidepressant prescription costs for the 12 months before and after the key episode start date by initial treatment. Peaks were evident for all groups and cost measures within the first month of treatment. The peak was particularly marked for the antidepressant-related prescription costs.

Table 5d (continued): Prescription Costs

Non-Antidepressant Prescription Cost
                 Prior (n=2,155; R²=42.33%)    Post (n=4,422; R²=42.11%)
                              95% C.I.                       95% C.I.
Parameter        Estimate   Lower    Upper     Estimate   Lower    Upper
Intercept          0.180    -0.288    0.647      1.087     0.740    1.434
Paroxetine        -0.010    -0.154    0.133     -0.129    -0.262    0.003
Sertraline         0.069    -0.104    0.242     -0.002    -0.177    0.173
OK                 0.169    -0.089    0.426     -0.067    -0.247    0.113
OR                -0.048    -0.331    0.234     -0.338    -0.548   -0.127
TX                 0.037    -0.201    0.276     -0.309    -0.487   -0.132
WA                 0.049    -0.383    0.481     -0.616    -0.817   -0.415
Commercial        -0.098    -0.330    0.133     -0.254    -0.415   -0.092
Age                0.012     0.006    0.017      0.006     0.002    0.010
First Episode      0.341     0.183    0.498      0.126    -0.003    0.255
Prior Cost         0.612     0.577    0.647      0.599     0.575    0.623

The antidepressant utilization graph shows no cost in the 4-5 months prior to the key episode start.
This is because one of the criteria for key episodes was that there be no antidepressant prescriptions in the prior 120 days (washout period). For overall prescription costs, the three groups appeared similar. However, it also appeared as though fluoxetine patients had slightly lower costs than paroxetine and sertraline patients for the prior year and slightly higher costs for the year after the start of therapy. Fluoxetine patients had higher antidepressant prescription costs, as expected, because fluoxetine is a more expensive drug. Sertraline patients appeared to be similar to the paroxetine patients. For non-antidepressant costs, paroxetine patients tended to have higher costs than either fluoxetine or sertraline patients for the entire observation period.

Table and Figure 5e: Smeared Treatment Cost* Estimates Before Formulary Change

                        Cost Type
                       Total           Antidepressant   Non-Antidepressant
Fluoxetine (n=588)     77.16 (39.17)   38.52 (10.32)    44.46 (46.85)
Paroxetine (n=1,119)   66.86 (30.63)   23.23 (6.78)     61.13 (54.98)
Sertraline (n=448)     59.10 (30.76)   19.42 (5.54)     51.89 (55.89)

[Bar chart of average monthly antidepressant and non-antidepressant costs by initial treatment.]

* Mean (Standard Deviation) average monthly costs.

Table and Figure 5f: Smeared Treatment Cost* Estimates After Formulary Change

                        Cost Type
                       Total           Antidepressant   Non-Antidepressant
Fluoxetine (n=576)     95.14 (58.20)   49.52 (12.14)    47.62 (64.37)
Paroxetine (n=3,386)   77.31 (40.63)   25.82 (5.07)     69.53 (69.45)
Sertraline (n=460)     63.26 (41.96)   26.86 (7.90)     48.71 (60.64)

[Bar chart of average monthly antidepressant and non-antidepressant costs by initial treatment.]

* Mean (Standard Deviation) average monthly costs.

Table 5d shows the results of the formal statistical analyses for cost.
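Because the models in Table 5d were fit on log-transformed costs, each estimate translates into a multiplicative (percent) effect on cost via exponentiation. A small sketch of that conversion, using the paroxetine estimate from the overall-cost model before the formulary change (the helper function is ours, for illustration only):

```python
import math

def pct_effect(log_estimate):
    """Convert a coefficient from a log-cost regression into the
    implied percent change in (geometric mean) cost."""
    return (math.exp(log_estimate) - 1.0) * 100.0

# Table 5d, overall cost prior to the formulary change: paroxetine -0.251,
# i.e., roughly 22% lower overall cost relative to fluoxetine.
effect = pct_effect(-0.251)
```

The same conversion applies to the interval bounds, so a log-scale estimate of -0.251 (95% C.I. -0.349, -0.153) reads as roughly a 14-29% reduction.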
Paroxetine and sertraline patients had statistically significantly lower overall and antidepressant prescription costs than fluoxetine patients both prior to and after the formulary change, after adjusting for market, product, age, whether the key episode was the patient's first treatment, and prior year overall prescription cost (all p<0.001). There were no statistically significant differences in non-antidepressant costs across groups at the p=0.05 level. Market, product, and age were all statistically significantly related to overall and antidepressant prescription costs before and after the formulary change. Neither market nor product was statistically significantly related to non-antidepressant costs before the formulary change, but both were significantly related after the formulary change. While statistical significance was approached for the relationships between all cost measures and whether or not the patient had a documented history of prior antidepressant treatment both before and after the formulary change, the relationship was only significant at the p=0.05 level for non-antidepressant costs in the group starting treatment before the formulary change (p<0.001). In this case, patients who had no documented prior antidepressant treatment had higher non-antidepressant costs. Prior year average monthly cost for all prescriptions was always highly related to all types of prescription costs regardless of formulary status (all p<0.001).

The smeared estimates in Tables 5e and 5f show the expected, unadjusted* monthly prescription costs for each treatment group in untransformed units before and after the formulary change, respectively.
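The smeared estimates retransform predictions from the log-scale cost models back to dollar units. One common approach, sketched below with hypothetical inputs rather than the fitted models' actual values, is Duan-style smearing: multiply the exponentiated prediction by the mean of the exponentiated in-sample residuals.

```python
import math

def smearing_estimate(log_preds, log_residuals):
    """Retransform log-scale OLS predictions to the original (dollar)
    scale: E[cost] is approximated by exp(Xb) times the mean of the
    exponentiated residuals, avoiding a normality assumption."""
    factor = sum(math.exp(r) for r in log_residuals) / len(log_residuals)
    return [math.exp(p) * factor for p in log_preds]

# Hypothetical fitted values and residuals on the log-dollar scale:
costs = smearing_estimate([3.2, 3.5], [0.1, -0.2, 0.05, 0.05])
```

Since mean(exp(r)) >= exp(mean(r)) by Jensen's inequality, naively exponentiating fitted values would understate expected costs; the smearing factor corrects for this.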
* Because the treatment groups were heterogeneous with respect to important indicators of healthcare utilization and the smeared cost estimates are not adjusted for these differences, it is inappropriate to compare smeared costs across treatment groups.

Prior to the formulary change, antidepressant costs were $38.52, $23.23, and $19.42 per month for fluoxetine, paroxetine, and sertraline patients, respectively. These costs were uniformly higher after the formulary change: $49.52, $25.82, and $26.86 for fluoxetine, paroxetine, and sertraline, respectively. A similar pattern was observed for overall average monthly prescription costs. Fluoxetine, paroxetine, and sertraline patients averaged $77.16, $66.86, and $59.10 prior to the formulary change and $95.14, $77.31, and $63.26 after the formulary change, respectively.

5.5 Discussion

Fluoxetine was the first SSRI to market and was more successful in treating many clinically depressed patients, with fewer side effects, than previously available treatments. Subsequently, other SSRIs, including paroxetine and sertraline, became available at a lower cost per pill. Safety and efficacy trials had been required to make paroxetine and sertraline available in the U.S., but there were still questions about how the SSRIs would perform relative to each other in actual clinical practice. Due to the number of patients taking antidepressant medication, insurers could save millions of dollars a year if a cheaper SSRI could be used in place of fluoxetine without causing adverse events that would increase costs in other areas. The results of this sample analysis suggest that paroxetine or sertraline could be substituted for fluoxetine without increasing prescription costs. Antidepressant costs decreased dramatically without increasing other medication costs. Though
paroxetine patients were shown to have higher absolute non-antidepressant costs, this difference was reversed after adjusting for treatment selection bias.

There were differences in treatment discontinuation and events based on the year the key treatment episode started. There are several hypotheses that might explain these results. One explanation that has been suggested for discontinuation and event differences between treatments made available at different times is the learning effect. Essentially, physicians and patients learn how to use treatments more effectively over time, and this should increase patient compliance and decrease treatment events (e.g., switching drugs or changing doses). In this analysis, however, the change in discontinuation and treatment events went in the wrong direction for the learning effect. A peculiarity of the data used in this analysis is that the insurer removed fluoxetine from its list of preferred medications (formulary) and replaced it with paroxetine. Therefore, patients or their physicians would have had to request special authorization to use fluoxetine instead of one of the other SSRIs. The selection bias adjustments should have accounted for many of the differences between patients in the treatment groups. However, physicians prescribing fluoxetine after the formulary change were probably more likely to be psychiatrists and/or have stronger relationships with their patients. No data are available to support this hypothesis directly. It is also not possible to assess the impact of treatment selection on other types of utilization, such as ambulatory, hospital, or emergency room costs, with the available data. Finally, SSRIs are used to treat conditions other than depression (e.g., social anxiety).
It is possible that many of the patients included in this study did not have depression, and the distribution of depressed patients may have been unequal across treatment groups.

5.6 Conclusion

This analysis demonstrates how pharmacy claims can be utilized to begin to understand the implications of treatment decisions in actual clinical practice. The results can be used as a starting point for requesting further data and directing future research. For instance, the analysis could be used to request data for other types of utilization or physician specialty. Ultimately, the results could be used to justify the expense of conducting chart reviews, surveys, and/or prospective studies into particular classes of patients or treatments that may have been overlooked or excluded in the original clinical trials required for drug approval.

6 Conclusions and Directions for Future Work

6.1 Success in Modeling Expert Opinion

The overall c-statistics for the models exceeded a priori expectations. Previous work with various modeling techniques had never yielded c-statistics above 80%. Discussions with pharmacists and other clinical experts over the course of 4 to 5 years had strongly suggested that it would not be possible to develop a methodology for accurately classifying claims the way this system has. However, there are notable improvements that need to be made before the system is fully functional and evaluable.

6.2 Process Integration

There was too much manual effort required to traverse some of the steps described in Section 3. The overall project would be improved if the following three stages could be more tightly integrated:

1. Gathering expert opinion
2. Modeling expert opinion
3. Applying model results

The reason these stages were not combined had to do with the tools that existed for implementing them. Expert opinion was acquired via the web.
The application was served by Apache on a Linux machine, with MySQL as the database and Perl handling the interface and much of the processing. MySQL does not support sub-queries (part of the SQL standard), which meant the analytic variables had to be constructed on another platform (Microsoft SQL Server, which does not run on Linux) and then uploaded to the MySQL server. Once MySQL supports sub-queries (expected in the next major release), the stages listed above can be more tightly integrated. As expert opinion is gathered, analytic variables can be built immediately and made available for modeling without having to involve a second DBMS platform.*

* In practice, multiple servers would probably be used for performance reasons. However, cross-platform operating system and database management system issues could be avoided.

The only C&RT application that could handle missing values and output rules incorporating handling of the missing values was SPSS AnswerTree. This made traversing the above stages particularly challenging. The author is currently working with the author of a classification tree package to remove this obstacle. When MySQL is upgraded and a classification tree package can be more tightly integrated, the entire process will be much more seamless to the end user. For instance, it would be possible to get immediate feedback on how the model is changing based on new decisions or manual changes to the model. Tighter integration will also improve the processing itself.

6.3 Improving Computing Efficiency

The processing components of this project were implemented with either Java or Perl, depending on the part of the overall process. These languages were chosen because they facilitate rapid application development, where methods are being prototyped and frequently changed in a short period of time. For large-scale
production, however, these languages (Java especially) are not fast enough. Performance could be improved by rewriting the processing components in a different language, probably C/C++.

6.4 Incorporating Multiple Experts

The author was responsible for providing the expert opinions, building the models, and evaluating their performance. While the author does have several years of experience in classifying prescription claims, the system was designed for use by trained clinical experts. It is possible that the models are specific to the author's opinions about how claims should be combined and may not be representative of other experts' opinions.* Regardless of this particular bias, it is expected that there would be disagreement between clinical experts on whether some claim pairs should be combined. The system is designed to receive input from multiple experts on the same claim pairs, but this feature was not used. It is necessary to collect input from multiple clinical experts to replicate the findings in this dissertation and measure the level of disagreement between clinical experts.

* It should be noted that the resulting models were a surprise to the author. He anticipated that different models would be built.

6.5 Expanding Prescription Classes

This project covered only antidepressant medications. While helpful, the methods must be proven with other prescription classes and with multiple prescription classes.

6.6 Expanding Opinions Modeled

The only decision the current methods are designed to model is whether or not claims in a claim pair should be combined.
This decision is the most basic one required to construct treatment histories from pharmacy claims databases, but the modeling of other judgments would both corroborate and extend the claim combination decisions.

One concept that may be particularly interesting and useful without being overly complex to model is that of the medicine cabinet. Anecdotally, members of senior plans are reported to stockpile medications for future use. More commonly, patients fill prescriptions earlier because they happen to be at the pharmacy early or perhaps because they are going on vacation and will have difficulty getting their prescriptions filled elsewhere. The concept has been used in the past when calculating medication possession ratios (MPRs), a common index for assessing treatment compliance. Various simple rules have been proposed for handling overfilling and gaps in treatment, but none of these rules deal with cases where multiple medications
The implementation of a semi automated system for inferring disease states would be significantly different from what was implemented here, but the overarching methodology could be similar. However, disease inference may be a better area to test the merits of model based and explicit rule systems because the rule systems have been validated. This dissertation showed that explicit rules could be replaced by a set of methods for deriving the rules using statistical techniques. In theory any of the explicit transformation rules used in this dissertation could be replaced by analogous methods. Whether doing so is cost-effective in terms of human and machine effort is proposed as a direction for future research. 116 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 7 Conclusion This dissertation has proposed and implemented a set of methods based on the fields of statistics, computer science and pharmacy to improve pharmacy claims based outcomes and pharmacoeconomic research. There has been a significant gap between the way prescription claims repositories are structured and the data elements required to support clinically oriented researcher. This dissertation bridged this gap by creating a system of computer applications and analyses that efficiently and effectively utilize expert human input to transform large financial data into clinical information. Although, the project has been successful in its primary aim, significant improvements must be made to make the system more accessible to clinicians and analysts. More importantly, this dissertation provides a benchmark for improvement and a platform on which to build transformations that will allow researchers to focus on their substantive expertise and exploit available resources to an even greater extent. This dissertation has also demonstrated two of the complexities involved in conducting research with large databases. 
First, significant, expert-driven transformation methods were required to make the data usable. Second, the analysis process was more iterative and fluid than the typical randomized controlled trial analyses familiar to many biostatisticians. Knowledge discovery based fields such as outcomes research will improve to the extent that researchers disclose and discuss the methods they use in conducting their research.

8 References

1 Pocock, Stuart J. Clinical Trials: A Practical Approach. John Wiley & Sons Ltd; 1983.
2 ISPOR. ISPOR Lexicon First Edition. ISPOR, 1998.
3 http://www.ajmc.com/outcomes.html as available 2000-08-26.
4 Dillon, Michael J. Drug Formulary Management. Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999.
5 Navarro, Robert P., Da Silva, Robert, Rivkin, Suzanne. Health Informatics Systems. Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999.
6 Bodenheimer, Thomas. California's Beleaguered Physician Groups-Will They Survive? New England Journal of Medicine, 2000; 342, 14. http://nejm.org/content/2000/0342/0014/1064.asp.
7 Navarro, Robert P., Cahill, Judith A. The U.S. Health Care System and the Development of Managed Care. Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999.
8 Navarro, Robert P. Pharmacy Benefit Management Principles and Practices. Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999.
9 Stern, Craig S., Stern, Carol J., Cronin, John M. Pharmacy Benefit Management Principles and Practices. Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999.
10 Brixner, Diana I., Szeinback, Sheryl L., Mehta, Shilpa, Ryu, Seonyoung, Shah, Hemal. "Pharmacoeconomic Research and Applications in Managed Care" in Managed Care Pharmacy Practice, Robert P.
Navarro, Ed., Aspen Publishers, Inc., 1999.
11 Romza, John H., Black, Garth E. Pharmacy Data and Information Systems. Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999.
12 Motheral, Brenda. Outcomes Management: The Why, What, and How of Data Collection. Journal of Managed Care Pharmacy. 1997: 3,3: 345-351.
13 Elder, John F., Pregibon, Daryl. A Statistical Perspective on Knowledge Discovery in Databases. Advances in Knowledge Discovery and Data Mining, Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, Ramasamy Uthurusamy, Eds., AAAI Press/MIT Press, 1996.
14 Roe, Catherine M., Motheral, Brenda R., Teitelbaum, Fred, Rich, Michael W. Compliance with and Dosing of Angiotensin-Converting-Enzyme Inhibitors Before and After Hospitalization. American Journal of Health-Systems Pharmacy. 2000: 57: 139-145.
15 Blandford, Larry, Dans, Peter E., Ober, Joseph D., Wheelock, Clare. Analyzing Variations in Medication Compliance Related to Individual Drug, Drug Class, and Prescribing Physician. Journal of Managed Care Pharmacy. 1999: 5,1: 47-51.
16 McCombs, Jeffrey S., Nichol, Michael B., Stimmel, Glen L. The Role of SSRI Antidepressants for Treating Depressed Patients in the California Medicaid (Medi-Cal) Program. Value in Health. 1999: 2,4: 269-280.
17 Nichol, MB, Harada, ASM, Jones, JP, McCombs, JS, Grogg, A, Gilderman, A, Vaccaro, J. Utilization of Antipsychotic Medications in the Treatment of Schizophrenia in a Managed Care Population. ISPOR Fifth Annual International Meeting. May 20-24, 2000, Poster PMH15.
18 Melfi, Catherine A., Chawla, Anita J., Croghan, Thomas W., Hanna, Mark P., Kennedy, Sean, Sredl, Kate. The Effects of Adherence to Antidepressant Treatment Guidelines on Relapse and Recurrence of Depression. Archives of General Psychiatry. 1998: 55,2: 1128-1132.
19 Catalan, Vanessa S., LeLorier, Jacques.
Predictors of Long-term Persistence on Statins in a Subsidized Clinical Population. Value in Health. 2000: 3,6: 417-426.
20 McCombs, Jeffrey S., Luo, Michelle, Johnstone, Bryan M. The Use of Conventional Antipsychotic Medications for Patients with Schizophrenia in a Medicaid Population: Therapeutic and Cost Outcomes over 2 Years. Value in Health. 2000: 3,3: 222-231.
21 Berndt, Ernst R., Russell, James M., Miceli, Robert, Colucci, Salvatore V., Xu, Yikang, Grudzinski, Amy N. Comparing SSRI Treatment Costs for Depression Using Retrospective Claims Data: The Role of Nonrandom Selection and Skewed Data. Value in Health. 2000: 3,3: 208-221.
22 Knight, Eric L., Glynn, Robert J., Levin, Raisa, Ganz, David A., Avorn, Jerry. Failure of Evidence-based Medicine in the Treatment of Hypertension in Older Patients. Journal of General Internal Medicine. 2000: 15,10: 702-709.
23 McCombs, Jeffrey S., Nichol, Michael B., Stimmel, Glen L., Shi, Jinhai, Smith, Raymond. Use Patterns for Antipsychotic Medications in Medicaid Patients with Schizophrenia. Value in Health. 1999: 60 (suppl 19): 5-11.
24 Gregor, Karl J., Overhage, J. Marc, Coons, Stephen J., McDonald, Robert C. Selective Serotonin Reuptake Inhibitor Dose Titration in the Naturalistic Setting. Clinical Therapeutics. 1994: 16,2: 306-315.
25 Shields, Shelly A., Gregor, Karl J., Young, Christopher H., James, Steven P. Concomitant Therapy with Anxiolytics or Hypnotics and Maintenance of Initial SSRI Therapy. Pharmacotherapy. 1998: 18,6: 1298-1303.
26 Tunis, Sandra L., Johnstone, Bryan M., Kinon, Bruce J., Barber, Beth L., Browne, Robert A. Designing Naturalistic Prospective Studies of Economic and Effectiveness Outcomes Associated with Novel Antipsychotic Therapies. Value in Health. 2000: 3,3: 232-242.
27 Motheral, Brenda R., Fairman, Kathleen A.
The Use of Claims Databases for Outcomes Research: Rationale, Challenges, and Strategies. Clinical Therapeutics. 1997: 19,2: 346-366.
28 Gregoire, Jean-Pierre, Guibert, Remi, Archambault, Andre, Contandriopoulos, Andre-Pierre. Measurement of Non-Compliance to Antihypertensive Medication Using Pill Counts and Pharmacy Records. Journal of Social and Administrative Pharmacy. 1997: 14,4: 198-207.
29 Choo, Peter W., Rand, Cynthia S., Inui, Thomas S., Lee, Mei-Ling T., Cain, Emily, Cordeiro-Breault, Michelle, Canning, Claire, Platt, Richard. Validation of Patient Reports, Automated Pharmacy Records, and Pill Counts with Electronic Monitoring of Adherence to Antihypertensive Therapy. Medical Care. 1999: 37,9: 846-857.
30 Steiner, John F., Prochazka, Allan V. The Assessment of Refill Compliance Using Pharmacy Records: Methods, Validity, and Applications. Clinical Epidemiology. 1997: 50,1: 105-116.
31 Melfi, Catherine A., Croghan, Thomas W. Use of Claims Data for Research on Treatment and Outcomes of Depression Care. Medical Care. 1999: 37,4 (Lilly Supplement): AS77-AS80.
32 Wingert, Terence D., Kralewski, John E., Lindquist, Tammie J., Knutson, David J. Constructing Episodes of Care from Encounter and Claims Data: Some Methodological Issues. Inquiry, 1995: 32(4), 430-43.
33 Fayyad, Usama M., Piatetsky-Shapiro, Gregory, Smyth, Padhraic. From Data Mining to Knowledge Discovery: An Overview. Advances in Knowledge Discovery and Data Mining, Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, Ramasamy Uthurusamy, Eds., AAAI Press/MIT Press, 1996.
34 Brachman, Ronald J., Anand, Tej. The Process of Knowledge Discovery in Databases. Advances in Knowledge Discovery and Data Mining, Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, Ramasamy Uthurusamy, Eds., AAAI Press/MIT Press, 1996.
35 Elmasri, Ramez, Navathe, Shamkant B.
Fundamentals of Database Systems Second Edition. Addison-Wesley Publishing Company, 1994.
36 Celko, Joe. SQL for Smarties, Morgan Kaufmann Publishers, Inc., 1995.
37 SAS Institute, Inc. SAS Guide to the SQL Procedure Usage and Reference Version 6 First Edition, SAS Institute Inc., 1989.
38 Kleinbaum, David G., Kupper, Lawrence L., Muller, Keith E., Nizam, Azhar. Applied Regression and Other Multivariable Methods Third Edition, Duxbury Press, 1998.
39 Hornik, K. Multilayer feedforward networks are universal approximators. Neural Networks 2, 359-366 (1989).
40 Warner, B., Misra, M. Understanding neural networks as statistical tools. American Statistician 50, 184-293, 1996.
41 Breiman, L., Friedman, J., Olshen, R., Stone, C. Classification and Regression Trees, 1984.
42 Weiss, Sholom M., Kulikowski, Casimir A. Computer Systems that Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems, Morgan Kaufmann Publishers, Inc., 1991.
43 Loh, W.Y., Shih, Y.S. Split Selection Methods for Classification Trees. Statistica Sinica. 1997: 7: 815-840.
44 Lachenbruch, P.A., Mickey, M.R. Estimation of error rates in discriminant analysis. Technometrics 1968: 10: 1-11.
45 Hanley, J.A., McNeil, B.J. The Meaning and Use of the Area Under a Receiver Operating Characteristic (ROC) Curve. Radiology. 1982: 143: 29-36.
46 Bamber, D. The Area Above the Ordinal Dominance Graph and the Area Below the Receiver Operating Characteristic Graph. Mathematical Psychology. 1975: 12: 387-415.
47 Lim, T.S. Polytomous Logistic Regression Trees. Unpublished manuscript available at http://recursive-partitioning.com/plus/plrt.pdf (Available 2001-01-31).

9 Bibliography

Bamber, D.
The Area Above the Ordinal Dominance Graph and the Area Below the Receiver Operating Characteristic Graph. Mathematical Psychology. 1975: 12: 387-415.
Berndt, Ernst R., Russell, James M., Miceli, Robert, Colucci, Salvatore V., Xu, Yikang, Grudzinski, Amy N. Comparing SSRI Treatment Costs for Depression Using Retrospective Claims Data: The Role of Nonrandom Selection and Skewed Data. Value in Health. 2000: 3,3: 208-221.
Blandford, Larry, Dans, Peter E., Ober, Joseph D., Wheelock, Clare. Analyzing Variations in Medication Compliance Related to Individual Drug, Drug Class, and Prescribing Physician. Journal of Managed Care Pharmacy. 1999: 5,1: 47-51.
Bodenheimer, Thomas. California's Beleaguered Physician Groups-Will They Survive? New England Journal of Medicine, 2000; 342, 14. http://nejm.org/content/2000/0342/0014/1064.asp.
Brachman, Ronald J., Anand, Tej. The Process of Knowledge Discovery in Databases. Advances in Knowledge Discovery and Data Mining, Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, Ramasamy Uthurusamy, Eds., AAAI Press/MIT Press, 1996.
Breiman, L., Friedman, J., Olshen, R., Stone, C. Classification and Regression Trees, 1984.
Brixner, Diana I., Szeinback, Sheryl L., Mehta, Shilpa, Ryu, Seonyoung, Shah, Hemal. "Pharmacoeconomic Research and Applications in Managed Care" in Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999.
Catalan, Vanessa S., LeLorier, Jacques. Predictors of Long-term Persistence on Statins in a Subsidized Clinical Population. Value in Health. 2000: 3,6: 417-426.
Celko, Joe. SQL for Smarties, Morgan Kaufmann Publishers, Inc., 1995.
Choo, Peter W., Rand, Cynthia S., Inui, Thomas S., Lee, Mei-Ling T., Cain, Emily, Cordeiro-Breault, Michelle, Canning, Claire, Platt, Richard. Validation of Patient Reports, Automated Pharmacy Records, and Pill Counts with Electronic Monitoring of Adherence to Antihypertensive Therapy. Medical Care. 1999: 37,9: 846-857.
Dillon, Michael J. Drug Formulary Management.
Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999. 123 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Elder, John F., Pregibon, Daryl. A Statistical Perspective on Knowledge Discovery in Databases. Advances in Knowledge Discovery and Data Mining, Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, Ramasamy Uthurusamy, Eds, AAAI Press/MIT Press, 1996. Elmasri, Ramez, Navathe, Shamkant B. Fundamentals of Database Systems Second Edition. Addison-Wesley Publishing Company, 1994. Fayyad, Usama M., Piatetsky-Shaapiro, Gregory, Smyth, Padhraic. From Data Mining to Knowledge Discovery: An Overview. Advances in Knowledge Discovery and Data Mining, Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, Ramasamy Uthurusamy, Eds, AAAI Press/MIT Press, 1996. Gregoire, Jean-Pierre, Glbert, Remi, Archambault, Andre, Contandriopoulos, Adre- Pierre. Measurement of Non-Compliance to Antihypertensive Medication Using Pill Counts and Pharmacy Records. Journal of Social and Administrative Pharmacy. 1997:14,4, 198-207. Gregor, Karl J., Overhage, Marc J., Joe, Coons, Stephen J., McDonald, Robert C. Selective Serotonin Reuptake Inhibitor Dose Titration in the Naturalistic Setting. Clinical Therapeutics. 1994: 16, 2: 306-315. Hanley, J.A., McNeil B.J. The Meaning and Use of the Area Under a Receiver Operating Characteristic (ROC) Curve. Radiology. 1982: 143: 29-36. Homik, K. Multilayer feedforward networks are universal approximators. Neural Networks 2,359-366 (1989). http://www.ajmc.com/outcomes.html as available 2000-08-26. ISPOR. ISPOR Lexicon First Edition. ISPOR, 1998. Kleinbaum, David G., Kupper, Lawrence L., Muller, Keith E., Nizam, Azhar. Applied Regression and Other Multivariable Methods Third Edition, Duxbury Press, 1998. Knight, Eric L., Glynn, Rober J., Levin, Raisa, Ganz, David A., Avom, Jerry. 
Failure of Evidence-based Medicine in the Treatment of Hypertension in Older Patients. Journal of General Internal Medicine. 2000: 15,10: 702-709. Lachenbruch, P.A., Mickey, M.R. Estimation of error rates in discriminant analysis. Technometrics 1968:10: 1-11. Lim, T.S. Polytomous Logistic Regression Trees. Unpublished manuscript available at http://recursive-partitioning.com/plus/plrt.pdf (Available 2001-01-31). 124 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Loh, W.Y., Shih, Y.S. Split Selection Methods for Classification Trees. Statistica Sinica. 1997: 7: 815-840. McCombs, Jeffrey S., Luo, Michelle, Johnstone, Bryan M. The Use of Conventional Antipsychotic Medications for Patients with Schizophrenia in a Medicaid Population: Therapeutic and Cost Outcomes over 2 Years. Value in Health. 2000: 3, 3: 222-231. McCombs, Jeffrey S., Nichol, Michael B., Stimmel, Glen L. The Role of SSRI Antidepressants for Treating Depressed Patients in the California Medicaid (Medi-Cal) Program. Value in Health. 1999: 2,4: 269-280. McCombs, Jeffrey S., Nichol, Michael B., Stimmel, Glen L., Shi, Jinhai, Smith, Raymond. Use Patterns for Antipsychotic Medications in Medicaid Patients with Schizophrenia. Value in Health. 1999: 60 (suppl 19): 5-11. Melfi, Catherine A., Chawla, Anita J., Croghan, Thomas W., Hanna, Mark P., Kennedy, Sean, Sredl, Kate. The Effects of Adherence to Antidepressant Treatment Guidelines on Relapse and Recurrence of Depression. Archives of General Psychiatry. 1998: 55,2: 1128-1132. Melfi, Catherine A., Croghan, Thomas W. Use of Claims Data for Research on Treatment and Outcomes of Depression Care. Medical Care. 1999: 37,4 (Lilly Supplement), AS77-AS80. Motheral, Brenda R., Fairman, Kathleen A. The Use of Claims Databases for Outcomes Research: Rationale, Challenges, and Strategies. Clinical Therapeutics. 1997: 19,2,346-366. Motheral, Brenda. Outcomes Management: The Why, What, and How of Data Collection. 
Journal of Managed Care Pharmacy. 1997: 3,3: 345-351. Navarro, Robert P, Cahill, Judith A. The U.S. Health Care System and the Development of Managed Care. Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999. Navarro, Robert P. Pharmacy Benefit Management Principles and Practices. Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999. Navarro, Robert P., Da Silva, Robert, Rivkin, Suzanne. Health Informatics Systems. Managed Care Pharmacy Practice. Robert P. Navarro, Ed., Aspen Publishers, Inc.; 1999. 125 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Nichol, MB, Harada, ASM, Jones, JP, McCombs JS, Grogg, A, Gilderman, A, Vaccaro, J. Utilization of Antipsychotic medications in the treatment of Schizophrenia in a Managed Care Population. ISPOR Fifth Annual International Meeting. May 20-24,2000, Poster PMH15. Pocock, Stuart J. Clinical Trials: A Practical Approach. John-Wiley & Sons Ltd; 1983. Roe, Catherine M., Motheral, Brenda R., Teitelbaum, Fred, Rich, Michael W. Compliance with and Dosing of Angiotensin-Con verting-Enzyme Inhibitors Before and After Hospitalization. American Journal Health-Systems Pharmacy. 2000: 57: 139-145. Romza, John H., Black, Garth E. Pharmacy Data and Information Systems. Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999. SAS Institute, Inc. SAS Guide to the SQL Procedure Usage and Reference Version 6 First Edition, SAS Institute Inc, 1989. Shields, Shelly A., Gregor, Karl J., Young, Christopher H., James, Steven P. Concomitant Therapy with Anxiolytics or Hypnotics and Maintenance of Initial SSRI Therapy. Pharmacotherapy. 1998: 18,6: 1298-1303. Steiner, John F., Prochazka, Allan V. The Assessment of Refill Compliance Using Pharmacy Records: Methods, Validity, and Applications. Clinical Epidemiology. 1997: 50, 1, 105-116. Stem, Craig S., Stem, Carol J., Cronin, John M. 
Pharmacy Benefit Management Principles and Practices. Managed Care Pharmacy Practice, Robert P. Navarro, Ed., Aspen Publishers, Inc., 1999. Tunis, Sandra L., Johnsone, Bryan M., Kinon, Bruce J., Barber, Beth L., Browne, Robert A. Designing Naturalistic Prospective Studies of Economic and Effectiveness Outcomes Associated with Novel Antipsychotic Therapies. Value in Health. 2000: 3,3, 232-242. Warner, B., Misra, M. Understanding neural networks as statistical tools. American Statistician 50,184-293,1996. Weiss, Sholom M., Kulikowski, Casimir A., Computer Systems that Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems, Morgan Kaufmann Publishers, Inc, 1991. 126 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Wingert, Terence D., Kralewski, John E., Lindquist, Tammie J., Knutson, David J. Constructing Episodes of Care from Encounter and Claims Data: Some Methodological Issues. Inquiry, 1995: 32(4), 430-43. 127 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Appendix 1: Sample Pharmacy Claims Sample 1: Simple Mono-Therapy Member Date Drug Days XXX 1999-05-01 A 30 XXX 1999-06-01 A 30 XXX 1999-07-01 A 30 XXX 1999-08-01 A 30 XXX 1999-09-01 A 30 XXX 1999-10-01 A 30 Sample 2: Simple Switch Member Date Drug Days XXX 1999-05-01 A 30 XXX 1999-06-01 A 30 XXX 1999-06-01 A 30 XXX 1999-06-15 B 30 Sample 5: Complex Switch Member Date Drug Days XXX 1999-05-01 A 30 XXX 1999-06-01 A 30 XXX 1999-08-15 B 30 XXX 1999-09-15 B 30 XXX 1999-09-01 B 30 XXX 1999-10-01 B 30 Sample 3: Simple Augment Drug Days XXX 1999-06-01 A 30 XXX 1999-06-01 B 30 Sample 4: Complex Augment Patterns Samples 1 through 5 represent realistic types of pharmacy claims as they would appear in a typical pharmacy claims database. Sample 1 represents a simple episode with monotherapy treatment. 
The patient begins medication with drug A on 1999-05-01 and continues on the medication through the end of October 1999. There are no changes in treatment and no significant gaps between treatments. The treatments for this episode may be more succinctly represented as {A}(180): the patient received drug A for 180 days. Sample 2 is identical to Sample 1 except that the patient in Sample 2 changes from drug A to drug B on 1999-08-01. This is an example of a simple switch, {A}(90) => {B}(90). The patient received drug A for 90 days and then drug B for 90 days.

Sample 3 is an example of a simple augmentation. In all cases, the patient receives drugs A and B on the same day: {A, B}(90). The patient received drugs A and B together for 90 days. Sample 4 is similar to Sample 3 in that the patient has access to both A and B over a prolonged period of time. What is different is that B is filled 15 days after drug A on each occasion. Because the patient did not have access to drug B for the first 15 days of treatment within this episode, the treatment may be represented as {A}(15) => {A, B}(75) => {B}(15). The patient received A alone for 15 days until B was added on 1999-05-15. The patient had access to both drugs until drug A ran out and was not refilled after the 30-day prescription starting on 1999-07-01 (a total of 75 days). Finally, the patient had access to B alone for 15 days, from 1999-07-31, when drug A ran out, to 1999-08-14, when drug B ran out.

Sample 5 is similar to both Sample 2 and Sample 4. The patient has access to A alone, then A and B together, then B alone. The difference between Sample 5 and Sample 4 is that the patient does not refill the prescription for drug A after starting drug B.
Although there is some overlap, the fact that drug A is not refilled would probably be taken to indicate that the patient switched from drug A to drug B, implying that the remaining quantity of drug A was thrown away. The episode could be rewritten {A}(75) => {B}(90): fifteen days of A are lost.

Appendix 2: Analytic Data Dictionary

User    Categorical (Char)  Unique User (Expert) Id
FName   Categorical (Char)  First Name
LName   Categorical (Char)  Last Name
Phone   Categorical (Char)  Phone Number
Email   Categorical (Char)  E-Mail Address
PWD     Categorical (Char)  Password (Encrypted)

Table: tblDecision (Table Storing Expert Decisions)

CID1     Categorical (Char)  Unique Claim Id for Claim 1
CID2     Categorical (Char)  Unique Claim Id for Claim 2
User     Categorical (Char)  Unique User Identifier
ReqTime  Datetime            Time User Requested Claim Pair for Review
DecTime  Datetime            Time User Submitted Decision for Claim Pair
Comb     Categorical (Int)   User Decision (-1=Skip, 0=No, 1=Yes)
AGroup   Continuous (+Int)   Randomly Assigned Analytic Group for Testing Purposes

Table: tblRxClaim (Table Storing Raw Prescription Claims)

ClaimId    Categorical (Char)  Unique Claim Id
Member_Id  Categorical (Char)  Unique Patient Id
RxIni      Date                Date Prescription Filled
Drug       Categorical (Char)  Name of Drug Prescribed
Strength   Continuous          Strength of Prescription (e.g., 10 mg)
Quantity   Continuous          Quantity of Prescription Provided (e.g., 30 pills)
Days       Continuous          Days Supplied

Table: tblPair (Analytic Table for Determining Relationship Between Claims)

Identification Information
CID1       Categorical (Char)  Unique Claim Id for Claim 1
CID2       Categorical (Char)  Unique Claim Id for Claim 2
Member_Id  Categorical (Char)  Unique Patient Id

Basic Prescription Information
CXDrug  Categorical (Char)  Claim X Drug {X: 1,2}
CXIni   Date                Claim X Start Date {X: 1,2}
CXEnd   Date                Claim X End Date {X: 1,2}
CXDays  Continuous (+Int)   Claim X Days Supplied {X: 1,2}

Overlap Information
IniDiff  Continuous (+Int)  C2Ini - C1Ini (Difference in Start Dates)
DifPctX  Percentage         IniDiff / CXDays (Difference in Start Dates as a Percentage of Days Supplied) {X: 1,2}
OlpDays  Continuous (+Int)  Total Days of Overlap
OlpPctX  Percentage         OlpDays / CXDays (Days of Overlap as a Percentage of Days Supplied) {X: 1,2}
BMatch   Dichotomous (1/0)  Better Match Available for Same Drugs in Pair

Historical Prescribing Information
PriXCnt  Continuous (+Int)  Number of Times X Received Before Start of Claim X {X: 1,2,B}
PriXDay  Continuous (+Int)  Days Supplied for X Before Start of Claim X {X: 1,2,B}
PriXSpn  Continuous (+Int)  Days from Start of 1st Rx for X to End of Last Rx for X Before Start of Claim X {X: 1,2,B}
PriXWOt  Continuous (+Int)  Days from End of Last Rx for X to Beginning of Window {X: 1,2,B}

Future Prescribing Information
PstXCnt  Continuous (+Int)  Number of Times Drug X Received After Start of Claim X {X: 1,2,B}
PstXDay  Continuous (+Int)  Days Supplied for Drug X After Start of Claim X {X: 1,2,B}
PstXSpn  Continuous (Int)   Days from Start of 1st Rx for X to End of Last Rx for X After Start of Claim X {X: 1,2,B}
PstXWOt  Continuous (Int)   Days from End of Current Window to Start of Next Rx for X {X: 1,2,B}

Miscellaneous
DrugSame     Dichotomous (1/0)  Same Drug for Both Claims
KnownCombo   Dichotomous (1/0)  Combination Is Known Combination Therapy
Interesting  Continuous (+Int)  Variable for Modulating Likelihood That Row Will Be Selected

Table: cvtTx (Table for Storing Treatment Information)

Member_Id  Categorical (Char)  Unique Patient Id
Tx_Num     Integer             Unique Treatment Number for Each Patient
Tx_Ini     Date                Treatment Start Date
Tx_End     Date                Treatment End Date

Table: cvtTxClm (Table for Linking Treatments Back to Raw Claims)

Member_Id  Categorical (Char)  Unique Patient Id
Tx_Num     Integer             Unique Treatment Number for Each Patient
ClaimId    Categorical (Char)  Unique Claim Id

Table: cvtTxRx (Table for Storing Prescription Treatment Information)

Member_Id  Categorical (Char)  Unique Patient Id
Tx_Num     Integer             Unique Treatment Number for Each Patient
Drug       Categorical (Char)  Drug Name
Dose       Continuous          Drug Dose

Table: cvtTxEv (Table for Storing Treatment Events)

Member_Id  Categorical (Char)  Unique Patient Id
Tx_Num     Integer             Unique Treatment Number for Each Patient
Type       Categorical (Char)  Type of Event (e.g., Switch)
Modifier   Categorical (Char)  Type Modifier (e.g., Fluoxetine->Paroxetine)

Table: cvtTxEp (Table for Storing Treatment Episode Summary Information)

Member_Id  Categorical (Char)  Unique Patient Id
Ep_Num     Integer             Unique Treatment Episode Number for Each Patient
Ep_Ini     Date                Episode Start Date
Ep_End     Date                Episode End Date
IniDrug    Categorical (Char)  Drug Name(s) for First Treatment in Episode
IniDose    Continuous          Dose for Drug for First Treatment in Episode (Mono-therapy Only)
Washout    Continuous (+Int)   Days Since End of Last Episode (or Beginning of Data for Patient)
Ep_Len     Continuous (+Int)   Episode Length (Days)
Ep_LenC    Dichotomous (1/0)   Episode Right Censored (Data Ended Before Episode End)
Ep_Chg     Continuous (+Int)   Days to First Change in Treatment Drug (Add, Drop, Switch)
Ep_ChgC    Dichotomous (1/0)   Censored for Treatment Change
Ep_Dos     Continuous (+Int)   Days to First Dose Change (Must Remain on IniDrug)
Ep_DosC    Dichotomous (1/0)   Censored for Dose Change
Ep_Brk     Continuous (+Int)   Days to Break in Therapy (a Break Is a Gap in Therapy Less Than the Critical Washout Period)
Ep_BrkC    Dichotomous (1/0)   Censored for Therapy Break
Ep_Any     Continuous (+Int)   Days to First Treatment Event of Any Type
Ep_AnyC    Dichotomous (1/0)   Censored for Treatment Events
Ep_Nxt     Continuous (+Int)   Days to Next Treatment Episode (Must Be Greater Than the Critical Washout Period)
Ep_NxtC    Dichotomous (1/0)   Censored for Subsequent Treatment Episode

Appendix 3: RxReview Help

Page 1

Main Help Section

If you have any questions, please do not hesitate to contact Jason Jones by phone (213.924.3111) or by sending e-mail to jjones@vnusmfo.com.

Frequently Asked Questions

• Why are my decisions so important?
• How do I get started?
• How are claims selected?
• How do I use the patient review screen?
  o Which claims am I reviewing?
  o How do I indicate a response?
  o How do I interpret the graph?
  o How do I use the "Entire History" section?
• What does it mean to combine or not combine two claims?
• Is it better to guess or skip the question?

Page 2

Why are my decisions so important?

The short answer is: you are the expert. Your clinical background, coupled with your human ability to recognise complex patterns, means you can convert these lists of claims into meaningful treatment histories. For instance, if you saw the following list of claims:

Drug              Start       Stop
Fluoxetine 10 mg  2000-01-01  2000-01-31
Fluoxetine 5 mg   2000-01-01  2000-01-31
Fluoxetine 10 mg  2000-02-01  2000-02-29
Fluoxetine 5 mg   2000-02-01  2000-02-29
Fluoxetine 10 mg  2000-03-01  2000-03-31
Fluoxetine 5 mg   2000-03-01  2000-03-31

you probably wouldn't have much trouble figuring out that the prescriber wanted the patient to take 15 mg/day. How could you tell? Among other things, the patient receives the same two prescriptions repeatedly and the two prescriptions are always filled on the same day.
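The same-day, repeated-fill reasoning above is easy to express mechanically. Here is a minimal sketch of the idea (a hypothetical helper written for illustration; it is not part of RxReview or the dissertation's code):

```python
from datetime import date

def daily_dose(claims, day):
    """Total strength available on `day`, summed over all claims
    whose start..stop range covers that day."""
    return sum(dose for drug, dose, start, stop in claims
               if start <= day <= stop)

# The repeated 10 mg + 5 mg fluoxetine fills from the example above:
claims = [
    ("Fluoxetine", 10, date(2000, 1, 1), date(2000, 1, 31)),
    ("Fluoxetine",  5, date(2000, 1, 1), date(2000, 1, 31)),
    ("Fluoxetine", 10, date(2000, 2, 1), date(2000, 2, 29)),
    ("Fluoxetine",  5, date(2000, 2, 1), date(2000, 2, 29)),
]
print(daily_dose(claims, date(2000, 1, 15)))  # 15 -> the intended 15 mg/day
```

Because the two prescriptions always cover the same dates, every covered day sums to 15 mg; the hard part, as the rest of this page explains, is deciding when overlapping claims should be summed at all.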
The following claim pattern probably wouldn't give you any more trouble:

Drug              Start       Stop
Fluoxetine 10 mg  2000-01-01  2000-01-31
Fluoxetine 5 mg   2000-01-15  2000-02-14
Fluoxetine 10 mg  2000-02-01  2000-02-29
Fluoxetine 5 mg   2000-02-15  2000-03-14
Fluoxetine 10 mg  2000-03-01  2000-03-31
Fluoxetine 5 mg   2000-03-15  2000-04-14

Although the prescriptions are not filled on the same day, you can see that the patient had access to both doses for most of the period. Claims patterns get much more complicated than this, and it quickly becomes very difficult to sit down and explain all of your rules and exceptions to a computer.

Why explain them to a computer? Do you want to spend the rest of your life classifying prescription claims? Probably not. Millions of prescriptions get filled weekly, and you and all your expert friends could never go through all of them. If a computer could learn from you, it could process the claims much more quickly; most of them are pretty boring anyway.

Each time you make a decision about how to classify the claims in this application, a computer program looks at your decision and tries to learn from it. The computer is learning to imitate you. Once the computer gets good enough at imitating you, we will feed hundreds of millions of prescription claims into it to try to learn things about prescription treatments. For instance, are certain drugs more effective or better tolerated than others by certain types of people with certain types of medical histories? In order to answer these types of questions, the computer must first be able to figure out what treatments patients got and what happened over the course of the treatment. For instance, did the patient switch drugs, change doses, add drugs, etc.?

So, you are the expert. You are teaching the computer how to convert prescription claims into treatment histories.

Page 3

How do I get started?
The first thing to do is to get a user id and password. You can do this by contacting Jason Jones. Once you have a user name and password, just click 'Review a Patient' on the green navigation bar to the left. This will take you to a screen that will ask you a series of questions about the claims for a given patient. If you have questions about interpreting the screen, go here. Once you finish the claims for one patient, you will be prompted to move on to the next one. You can quit at any point. <TOP>

How are claims selected?

Each time you request a patient, a patient is randomly selected from all the patients in the database. Once a patient is selected, the application goes through each possible claim pair and asks you for a decision on each pairing. What's a pairing? A pairing is simply defined as two claims that overlap based on the prescription start and stop dates. Whether the overlap is complete and perfect or there is only one day of overlap, you will still be asked to make a decision. It is possible for the same claim to occur in multiple pairings. For example, it could be that the treatment regimen requires the use of three drugs. Or, in the following example:

Drug              Start       Stop
Fluoxetine 10 mg  2000-01-01  2000-01-31
Fluoxetine 5 mg   2000-01-15  2000-02-14
Fluoxetine 10 mg  2000-02-01  2000-02-29

Fluoxetine 5 mg shares the first half of its prescription with the first claim and the second half of its prescription with the last claim, so you would be asked to decide two different pairings. <TOP>

Page 4

How do I use the patient review screen?

To see an image of the whole Review screen, click here for 1 drug and here for 2 drugs. The screen can be divided into 5 sections.

1. The main navigation section.
2. The specific 2 claims in question.
3. Where you indicate your decision.
4. A graph of 2 years of dosing for the drug(s) in question.
5. The complete prescribing history for this patient.

In each case, you are being asked to indicate whether the two claims should be combined into a single treatment (i.e., was the intent to have the patient take both medications together) or not. The two claims you are being asked about are represented in section 2. You can indicate your response in section 3. Sections 4 and 5 are there to give you more information about the prescription history in order to make a better decision. <TOP>

Single Drug Image (Link from Page 4): [screenshot of the Review screen for a single drug]

Two Drug Image (Link from Page 4): [screenshot of the Review screen for two drugs]

Page 5

Which claims am I reviewing?
The pair of claims you need to review are in area 2, directly under 'Claims to Review' in the Review screen. A sample looks like this:

Start       Stop        Drug        Daily Dose
1996-03-05  1996-04-04  Nefazodone  300.00
1996-03-05  1996-04-04  Trazodone   100.00

Only the two claims you are making a decision about are represented. In the claims history section, all the patient's claims are displayed and the two claims you are reviewing are highlighted in yellow. In the graph below the claim pair and the area where you indicate your decision, up to two years of dosing history are displayed for any drugs represented in the claim pair. In the graph, the yellow section indicates the period of time during which the two claims overlap. <TOP>

How do I indicate a response?

Once you have come to a decision about whether or not the claims should be combined into a single treatment, you can indicate your decision by clicking "Yes", "No", or "Skip" next to the question "Combine Claims?". It looks like this:

Combine Claims?  Yes  No  Skip

As soon as you indicate your response, the application will move you to the next claim pair for the patient or, if there are no claims left, ask you if you would like to select another patient. <TOP>

Page 6a

How do I interpret the graph?

The graph is intended to give you a quick feel for the claims history related to the drug(s) in the claim pair. The first drug's dose appears in red and the values appear on the left-most Y-Axis in the same color. If there is a second drug involved, its dosing appears in blue and the values appear on the right-most Y-Axis, also in blue. Sometimes, the dosing for the two drugs overlaps on the graph. If this happens, the line is colored green so that you can still see that both drugs were available at that time.
The X-Axis represents the number of days before and after the start of the claims' overlap. The period of time during which the two claims in question overlapped is shaded in yellow. The graph shows the dose of each drug available to the patient based entirely on the claims record. In other words, no inferences are made about the records in putting the graph together. The easiest way to understand how to interpret the graph may be by example. Here is a sample of a picture when only one drug is involved:

[screenshot: "2 Years Rx History" graph for a single drug]

In this case, the patient typically receives Nortriptyline 50 mg. There are a couple of spots where the patient received no drug at all and other spots where the patient received 100 mg. The overlap for this particular claim pair is represented as a vertical yellow bar between days 0 and 90 (on the X-Axis). Looking at the history for this patient, it looks as though (s)he simply fills her/his prescriptions early sometimes. That leads to available doses of 100 mg, but the prescriber probably still intended the patient to take 50 mg. In this case, it probably makes sense to indicate "No": don't combine the claims.

Page 6b

Here is an example of a claim pair involving two drugs:

[screenshot: "2 Years Rx History" graph for two drugs]

In this case, the patient starts off receiving Nortriptyline by itself. Then, at the point marked "Period of Overlap", (s)he receives Trazodone as well. In this case, both drugs are received at the 50 mg level*, so a green line appears to show that both drugs are being received (otherwise one line would overwrite the other).
At one point, around day 220, the patient overlaps her/his Trazodone prescriptions and achieves 100 mg while the Nortriptyline dose remains at 50 mg. In this case, there are separate red and blue lines because they would not overwrite each other. The patient continues to get both drugs together for at least a year, so we would probably conclude that the intent with the particular claim pair we are considering was for the drugs to be taken together.

* In this plot, both drugs have a range of 50-100 mg. If the ranges were different, the 2 Y-Axes would be scaled separately. <TOP>

Page 7

How do I use the "Entire History" section?

Section 2 (claims to consider) only shows the two claims in the pair under consideration. Section 4 (graph) shows a longer history, but only for the drug(s) under consideration. Section 5 (entire history) shows all the claims known to the system for this patient. In that sense, it is more complete, if sometimes harder to navigate. Here is an example of the claims history:

Start        Stop        Drug        Daily Dose
►1996-02-08  1996-03-09  Nefazodone  300.00
►1996-04-02  1996-05-02  Nefazodone  300.00

To help you navigate the history, the claims are sorted by date and the claims under consideration are highlighted in yellow. Often these are right next to each other (as in the above example); however, it is possible for pairs to be separated by claims:

Start        Stop        Drug        Daily Dose
►1996-02-08  1996-03-09  Nefazodone  300.00
►1996-04-02  1996-05-02  Nefazodone  300.00
1996-04-29   1996-05-29  Nefazodone  300.00

Your decision making process should be the same. Claims that have any overlap with either of the claims in the pair have little green arrows to the left of them.
This can be particularly helpful when you feel that the two drugs should be combined, but you are not sure if the two highlighted claims are the ones that should be combined to form the treatment. Both of the above images involve the combination of the same two drugs, and all the claims overlap. The highlighted claims in the first image probably should be combined. However, although there is overlap, the second image probably represents an early prescription fill and not a true drug therapy combination. By looking at the claims with green arrows in the second image, you can see that there is another prescription for Trazodone much closer to the Nefazodone prescription, specifically the one highlighted in the first image. <TOP>

Page 8

What does it mean to combine or not combine two claims?

Here is a picture of what happens based on whether or not two claims should be combined:

[flow diagram: a new claim overlapping an existing treatment is either combined, changing the existing dose, or not combined, truncating the existing treatment]

The process starts when there is an existing treatment (Tx) based on a prior claim. Then, a new claim shows up, and the new claim starts before the old claim ends. At this point, we have to decide whether to combine the two claims into a single treatment or not. If we decide NOT to combine the two claims, then that means we should truncate the first treatment and simply start the new treatment with the new claim all by itself. This means that, although the prior prescription would have continued, we are stopping it early. Effectively, this is like saving the unused pills in the medicine cabinet for a rainy day. If we decide the claims SHOULD be combined, then the resulting treatment will be a composite of the two claims. If the two claims were for the same drug, then the dose will be adjusted upwards (e.g., Fluoxetine 10 mg + Fluoxetine 5 mg becomes Fluoxetine 15 mg). If the drugs were not the same, then the treatment simply becomes the combination of the two drugs (e.g., Nortriptyline 50 mg + Trazodone 50 mg).
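The combine/truncate decision described above can be sketched in a few lines of code. This is an illustrative sketch only (the data structures and the helper name `resolve_overlap` are hypothetical, not the dissertation's implementation):

```python
from datetime import date, timedelta

def resolve_overlap(existing, new, combine):
    """Each treatment is (drug->dose dict, start date, end date).
    If not combined, the existing treatment is truncated at the new
    claim's start and the new claim stands alone; if combined, the
    overlap becomes a composite treatment with same-drug doses summed."""
    doses, start, end = existing
    ndoses, nstart, nend = new
    if not combine:
        # Stop the prior treatment early; unused supply is ignored.
        truncated = (doses, start, nstart - timedelta(days=1))
        return [truncated, new]
    merged = dict(doses)
    for drug, dose in ndoses.items():
        merged[drug] = merged.get(drug, 0) + dose  # same drug -> dose adds up
    return [(merged, nstart, min(end, nend))]

tx = ({"Fluoxetine": 10}, date(2000, 1, 1), date(2000, 1, 31))
new = ({"Fluoxetine": 5}, date(2000, 1, 1), date(2000, 1, 31))
print(resolve_overlap(tx, new, combine=True)[0][0])  # {'Fluoxetine': 15}
```

With `combine=False` the same inputs would instead yield a truncated fluoxetine 10 mg treatment followed by a separate 5 mg treatment, which is exactly the "saving the unused pills" case above.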
If die drugs were not the same, then the treatment simply becomes die combination of die two drugs (e.g., Nortriptyline 5 Omg + Trazodone 5 Omg). (Actually, things can get a little more tricky if the overlapping days across claims is not perfect, but you don't have to worry about that) <TOP> 142 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Page 9: Is it better to guess or skip the question? Sometimes it's just not possible to figure out what a prescnber was trying to do with a patient or what a patient was doing filling prescriptions. In these cases, you may not have any confidence in responding ‘Yes or "No" to die ‘Combine?' question. You can select ‘Skip* instead. If you feel any sort of push in either direction, however, please do NOT select ‘Skip*. If you select 'Skip' the computer gets no information at aD . The hierarchy here is: If you don't answer the question, someone else will have to...and that someone probably doesn't know any more than you. <TOP> D efinitely ¥cong Answer 143 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Appendix 4: Sample Query of Treatment Tables Query 1: Augmentation Summary SELECT R1.Drug, SUM(IF(R2 .Drug=Rl .Drug, 1,0)) AS Txs, SUM(IF(R2 .DrugoRl .Drug, 1, 0) AS [...Augmented], SUM(IF(R2.Drug='Trazodone',1,0) AS [...w/ Traz], SUM(IF(R2 .Drug=Rl.Drug, (T .TxEnd-T.Txlni) /30, 0) AS Months, SUM (IF(R2 .DrugoRl .Drug, (T.TxEnd-T.TxIni)/30, 0) AS [...Augmented], SUM (IF (R2 .Drug=' Trazodone', (T. TxEnd-T. Txlni) /30, 0) AS [...w/ Traz] FROM cvtTx T, cvtTxRx Rl, cvtTxRx R2 WHERE T . Member_Id=Rl. Member_Id AND T . Tx_Num=Rl. Tx_Num AND Rl. Member_Id=R2 .Member_Id AND Rl. Tx_Num=R2 . 
Tx_Num AND Rl.Drug IN {'Fluoxetine','Paroxetine') GROUP BY Rl.Drug ORDER BY Rl.Drug; Drug Treatments —Augmented -With Trazodone _ Fluoxetine 14,117 4,133 1,602 Paroxetine 23,954 6,057 2,086 _ Drug Months -Augmented -With Trazodone Fluoxetine 44,082 2,500 1,109 Paroxetine 79,870 3,384 1,413 Query 1 returns the total number of treatments and number of months the treatments for all treatments involving either fluoxetine or paroxetine. Also, the subsets of these treatments that involved any augmentation and augmentation with trazodone in particular are presented. 144 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Appendix 5: List of Evaluated Classification Tree Packages Product URL Non-Classification Tree Packages PROC LOGISTIC http://www.sas.com NeuralWare http://www.neuralware.com Neural Connection http://www.spss.com/neuro Classification Tree Products AnswerTree http://www.spss.com/AnswerTree/ C4.5 http://www.cse.unsw.edu.au/-quinlan/ C5.0 http://www.rulequest.com/seeS-info.html CART® http://www.saIford-systems.com Cognos http://www.cognos.com/products/scenario/index.html dtree http://fuzzy.cs.uni-magdeburg.de/~borgelt/#Software EC4.5 http://maiden.di.unipi.it/kdd/ec4.SZ Enterprise Miner http://www.sas.com/products/miner/ FACT ftp://ftp.stat.wisc.edu/pub/loh/treeprogs/fact ID3 http://www.ddj .com/ftp/1996/1996.06/aa696.zip IND http://ic-www.arc.nasa.gov/ic/projects/bayes-group/ind/IND-program.html Intelligent Miner http://www-4.ibm.com/software/data/iminer/ KnowledgeSeeker http://www.angoss.com/ LMDT http://yake.ecn.purdue.edu/~brodley/software/lmdt.html MS’ class http://www.cs.waikato.ac.nz/-ml/weka/ MineSet http://www.sgi.com/software/mineset/ MLC++ http://www.sgi.com/tech/mlc/ QUEST http://www.stat.wisc.edu/-loh/quest.html Ripley’s Tree() http://www.stats.ox.ac.Uk/pub/S/Tree.sh.gz RPART http://lib.stat.cmu.edU/S/rpart S+ tree() http://www.splus.mathsoft.com/ TREEDISC.SAS 
http://www.recursive-partitioning.com/treedisc.sas 145 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Appendix 6: Description and Samples Use of Outcomes Visualyzer The outcomes visualyzer application was created to allow both analysts and decision makers in healthcare related fields to rapidly understand the relationships between potentially important predictor and outcome variables. The application runs within a standard web browser and can be deployed over the Internet. In conjunction with a screen shot application, the outcomes visualyzer can be used to produce presentation ready plots with customizable axes and titles. Outcome variables can be continuous or time-to-event (TTE) types. For continuous variables, the user has the option of viewing mean, median, or custom percentiles. For TTE variables, the user can choose to limit the follow-up period and effectively censor all observations exceeding a certain time period. Predictor variables must be categorical and can be used as grouping and/or filtering variables. Criteria and groupings can be combined to produce increasingly Figure A5a: Sample Initial Screen Days to Discontinuation 1 0 0 % t — 0 a v > u » a • H o c « j • H u ( I a. x u 0 2 10%- 40 80 -+ - -t- -t- Overall (6,577) 120 161 201 241 281 321 361 402+ Days from Rx Start 146 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. complex and targeted analyses. Figure ASa shows a sample initial screen. The default variable to plot was time to antidepressant discontinuation. 
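The "percent still on treatment" curve behind such a plot can be sketched as follows. This is a simplified illustration that assumes every patient is followed until discontinuation, so the Kaplan-Meier estimate reduces to one minus the empirical CDF; the actual application also handles censored follow-up:

```python
def discontinuation_curve(durations, days):
    """Percent of patients still on treatment at each day.

    Simplified: with no censoring, the Kaplan-Meier "survival" estimate
    at day t is just the fraction of durations exceeding t.
    """
    n = len(durations)
    return [100.0 * sum(1 for d in durations if d > t) / n for t in days]

# hypothetical treatment durations (days from Rx start)
durations = [30, 30, 90, 120, 200, 365, 400]
curve = discontinuation_curve(durations, days=[0, 40, 80, 120])
print(curve)  # starts at 100% and declines as patients discontinue
```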
A Kaplan-Meier plot is displayed in the main plotting section, with the number of days from the start of treatment as the x-axis and "survival", or the percent of patients still receiving treatment, as the y-axis. The groups (in this case "Overall") and sample sizes appear to the right of the graph. No grouping or filtering has been performed.

[Figure A5b: Sample Graphing Controls. Grouping and filtering variables include Tx Drug, First Tx, Tx Start, High Dose, Tx Completion, PostBg, Age, and Gender (e.g., Tx Completion levels 00-02 Months through 12+ Months from Rx Start; Age levels <30, 30-39, 40-49, 50-59). Selectable outcome variables include Days to Tx Drug Change, Days to Dose Change, Days to Therapy Break, Days to Any Tx Event, Days to Next Episode, Antidepressant Rx Cost, and Overall Rx Cost.]

147

[Figure A5c: Sample Plot with Grouping. "Days to Discontinuation" curves for Fluoxetine (1,164), Paroxetine (4,505), and Sertraline (908).]

The configuration options available by pushing the "Group By...", "Filter On...", or "Chart Controls" buttons are shown in Figure A5b. The "Group By" and "Filter On" dialog boxes contain the same variables. Selecting a variable or variables from the "Group By" dialog box will result in one line being drawn for each unique value of the grouping variable(s). If more than one variable is selected, the unique combinations of values are used for producing lines. Figure A5c shows a sample plot after "Tx Group" was selected in the "Group By" dialog box. Figure A5c also shows how the user can change the chart title and limit the number of days of follow-up (in this case to 365) for TTE outcomes.

148
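Drawing one line per unique combination of grouping-variable values can be sketched as follows (hypothetical record fields for illustration; the application's own data model is not shown here):

```python
from collections import defaultdict

def group_durations(records, group_vars):
    """Bucket treatment durations by the unique combination of values
    of the selected grouping variables; each bucket becomes one line."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[v] for v in group_vars)  # e.g. ('Fluoxetine', '1996')
        groups[key].append(rec["duration_days"])
    return dict(groups)

records = [
    {"drug": "Fluoxetine", "tx_start": "1996", "duration_days": 90},
    {"drug": "Fluoxetine", "tx_start": "1997", "duration_days": 30},
    {"drug": "Paroxetine", "tx_start": "1996", "duration_days": 180},
]
by_drug = group_durations(records, ["drug"])                 # one line per drug
by_drug_year = group_durations(records, ["drug", "tx_start"])  # one line per combination
print(len(by_drug), len(by_drug_year))  # 2 3
```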
[Figure A5d: Sample Plot with Grouping and Filtering. "Days to Discontinuation (1996 Only)" curves for Fluoxetine (588), Paroxetine (1,119), and Sertraline (443).]

Users can select particular levels of the predictor variables to include in the calculations in the "Filter By" dialog. If no levels are selected within a variable, all values are included. If at least one level is selected, only selected levels within a predictor variable will be represented in the plot. The final set of observations used in producing the plots is the set that satisfies all criteria (i.e., criteria for variable 1 AND criteria for variable 2 AND ...). Figure A5d shows the same plot as Figure A5c when "1996" has been selected in the "Tx Start" box of the "Filter By" dialog (note the change in sample size to the right of the chart). The chart title also has been changed in the "Graph Options" dialog box to reflect the change in criteria.

149

The contents of the "Graph Options" dialog box change slightly depending on the type of outcome variable selected. Outcome variables are selected at the top of the "Graph Controls" dialog. Figure A5b (lower left corner) shows a sample of the outcome variables that could be selected for a particular dataset. The "Plot Title" and "X-Axis Label" controls in the dialog do not change based on the outcome variable type. If the outcome variable is of the TTE type, the user can restrict the follow-up time by changing the value in the "X-Axis Max" control (e.g., Figures A5a and A5c use the same data, but A5c has been restricted to 365 days).

[Figure A5e: Sample Plot with Grouping and Filtering. "Cost Overall Rx" by month from Rx start (months -12 through 12) for Fluoxetine (1,164), Paroxetine (4,505), and Sertraline (908), plotted twice: once using the mean and once using the median.]

150

If the x-axis is restricted, observations having follow-up times beyond the chosen time period are censored, and a "+" symbol appears next to the maximum x-axis value to indicate an x-axis restriction. To reset the x-axis to the maximum observed follow-up time, the user may press the "Reset" button in the "Chart Options" dialog.

If the outcome variable is continuous (e.g., counts or costs), the "X-Axis Max" and "Reset" controls are replaced by "Statistic to Plot." The statistic options are shown in the lower middle section of Figure A5b ("Mean", "Median", and "Custom %"). If "Custom %" is selected, a control appears to allow the user to enter the percentile to plot (e.g., the 50th percentile would be the same as selecting "Median"). Figure A5e shows a plot of cost based on mean and median statistics.

The x-axis and y-axis are more flexible when the outcome variable is of the continuous type. For TTE variables, the x-axis must start at 0 and the y-axis ranges from 0 to 100%. For continuous variables, the x-axis can represent any period of time and the y-axis scales automatically based on the observed statistical values. Figure A5e also shows that the user can change the x-axis label.

151
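The "Statistic to Plot" options can be sketched as follows. This is a minimal illustration, not the application's code; the "Custom %" case uses linear interpolation between order statistics, which is one common percentile convention:

```python
def statistic(values, stat="mean", pct=None):
    """Mean, median, or custom percentile of a continuous outcome
    (e.g., monthly Rx cost), as in the 'Statistic to Plot' control."""
    vals = sorted(values)
    if stat == "mean":
        return sum(vals) / len(vals)
    if stat == "median":
        pct = 50.0  # the median is just the 50th percentile
    # linear-interpolation percentile over the sorted values
    k = (len(vals) - 1) * pct / 100.0
    lo, hi = int(k), min(int(k) + 1, len(vals) - 1)
    return vals[lo] + (vals[hi] - vals[lo]) * (k - lo)

costs = [20.0, 35.0, 40.0, 60.0, 145.0]  # hypothetical monthly costs
print(statistic(costs, "mean"))            # 60.0
print(statistic(costs, "median"))          # 40.0
print(statistic(costs, "custom", pct=50))  # 40.0, same as the median
```

Note how the mean (60.0) sits well above the median (40.0) here: cost distributions are typically right-skewed, which is why the tool offers both statistics.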
Asset Metadata

Creator: Jones, Jason Peter (author)
Core Title: Enabling clinically based knowledge discovery in pharmacy claims data: An application in bioinformatics
School: Graduate School
Degree: Doctor of Philosophy
Degree Program: Preventive Medicine
Publisher: University of Southern California (original); University of Southern California. Libraries (digital)
Tags: Computer Science; Health Sciences, Pharmacy; statistics
Language: English
Contributor: Digitized by ProQuest (provenance)
Advisor: Azen, Stanley (committee chair); Nichol, Michael (committee member); Sather, Harland (committee member); Shahabi, Cyrus (committee member); Stahl, Douglas (committee member)
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c16-152892
Unique Identifier: UC11329555
Document Type: Dissertation
Rights: Jones, Jason Peter
Source: University of Southern California (contributing entity); University of Southern California Dissertations and Theses (collection)
Repository: USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA