Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
M-estimation and non-parametric estimation of a random diffusion equation-based population model for the transdermal transport of ethanol: deconvolution and uncertainty quantification
(USC Thesis Other)
M-estimation and non-parametric estimation of a random diffusion equation-based population model for the transdermal transport of ethanol: deconvolution and uncertainty quantification
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
M-Estimation and Non-Parametric Estimation of a Random Diffusion Equation-Based Population Model for the Transdermal Transport of Ethanol: Deconvolution and Uncertainty Quantification by Maria Allayioti A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (APPLIED MATHEMATICS) August 2023 Copyright 2023 Maria Allayioti Acknowledgements I would like to express my sincere gratitude to my advisor, Gary Rosen, for his invaluable mentorship, guidance, and unwavering support. Additionally, I extend my appreciation to Larry Goldstein and Susan Luczak for their collaborative efforts and the wealth of knowledge I have acquired through their guidance. My family has been a constant source of support and love. I am beyond grateful for my parents Andrie and Kypros and my sisters Anastasia and Danae. Knowing that they are constantly thinking of me and believing in me during both the highs and lows of my journey served as a powerful source of inner strength. Thanks for always encouraging me to make decisions based on what I believe is best for myself and for reassuring me that you will always next to me no matter what. The incredible friends I have back home have been truly amazing. Their support and the excitement they expressed every time I shared the news of my return made me feel incredibly loved. Eleni, Andria, Andrea, Pani, Eliza, Nandia, Christina, Eftychia, Georgia, Liana, Gianni, Vasili, Sotiri I’m grateful for the unforgettable memories we created together each time I returned home, the laughter and your love. My PhD journey would have been completely different without the presence of my intelligent and kind hearted officemates, Lernik and Irmak. We continuously supported one another through the challenging times we faced. The conversations we shared and the moments we spent together are memories that I will forever have in my heart. Also, I am grateful for the friendship of J.E. Paguyo. The best thing that happenend to me in LA was meeting extraodrinary individuals that I am lucky enough to call my friends. Constantina, Tommy, Chrysovalanti, baby Michael, Evangele, Andrea, Amy, ii Vasili, Dimitri you have made my seven years in LA truly unforgettable. Our friendship, our road trips, our cooking nights and so much more. Paraskevi, you have been an incredible friend and my forever walking buddy. Our non-stop chats filled with laughter and our endless conversations, are moments I will forever cherish. Panayioti, Panayiota, Pavlo, each of you deserves an entire paragraph. No words can describe how grateful I am for our friendship and unconditional love. During the most challenging moments of my PhD journey, you stood by my side, offering your unwavering support. We went together through a lot and our special bond will be forever in my heart. My godson Apollonas came into my life during the final year of my PhD and changed it forever. Apol- lona, thank you for your light, the happiness you brought in my life and the smiles you put on my face every time I see you. I will always be next to you, even when I am far away. I want to express my gratitude for having a truly special person in my life, a forever best friend who has become like family to me. Nicola, what we have been through since the beginning our PhD journey back in 2016 will connect us forever. Words cannot express the depth of love I have for you, and I truly believe that I wouldn’t be the person I am today if you weren’t a part of my life. Thanks for being my forever best friend and travel buddy. Thanks for exploring the world with me. Thanks for the unforgettable memories that we created together. Thanks for being you. Last but certainly not least, I want to thank Zacharias, a person who entered my life during the final year of my PhD journey. Zacharia, this thesis would never be done if you wouldn’t be there for me. Thanks for believing in me. Your constant encouragement lifted me up when I felt like I had no more strength to carry on. Thank you for loving me and being there for me every step of the way. I feel strong because of you. I don’t know what the future holds but I will be forever grateful for having you in my life and I will forever have you in my heart. iii TableofContents Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Chapter 2: A Diffusion-Based Model for the Transdermal Transport of Ethanol . . . . . . . . . . . 6 2.1 The First Principles Individual Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2.2 The State Space Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.3 Finite Dimensional Approximation and Convergence . . . . . . . . . . . . . . . . . . . . . 11 Chapter 3: M-estimation in a diffusion model with application to biosensor transdermal blood alcohol monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3.1 Introduction and Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3.1.1 Diffusion model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3.1.2 Nonlinear least squares estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3.2 M-estimation: Existence, consistency, and limiting distribution . . . . . . . . . . . . . . . 17 3.2.1 Set up and summary of results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 3.2.2 Estimating equations, consistency, and asymptotic normality . . . . . . . . . . . . 20 3.2.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 3.3 Application to a diffusion equation model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 3.4 Inference on the BrAC curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 3.5 Basis and Regularization Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 3.6 Transdermal blood alcohol monitoring: Simulations and data analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 3.6.1 Simulation studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 3.6.2 Real Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 Chapter 4: Non-Parametric Estimation of a Random Diffusion Equation-Based Population Model and Blood/Breath Alcohol Concentration Deconvolution from Transdermal Alcohol Biosensor Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 4.1 The Population Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 4.2 Finite Dimensional Approximation and Convergence . . . . . . . . . . . . . . . . . . . . . 83 4.3 Deconvolution and Uncertainty Quantification . . . . . . . . . . . . . . . . . . . . . . . . . 84 iv 4.4 Estimating the Distribution of the Parameters . . . . . . . . . . . . . . . . . . . . . . . . . 86 4.4.1 Non-Parametric Estimation of the Population Distribution . . . . . . . . . . . . . . 87 Chapter 5: Consistency Under Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 Chapter 6: Regression Models for the Prediction of the TAC Parameters and Sub-population Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 6.1 Regression Models for the Prediction of the TAC Parameters from Subject Covariates . . . 106 6.1.1 Regression Models Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 6.2 Sub-population Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 Chapter 7: Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 7.1 Dataset3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 7.2 Simulator Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 7.2.1 Covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 7.2.2 Drinking Diaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 7.2.3 BrAC parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 7.2.4 TAC parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 7.2.5 Estimated BrAC curve - A Nonlinear Pharmacokinetic Model for Alcohol Absorption, Metabolism, and Elimination . . . . . . . . . . . . . . . . . . . . . . . 125 7.2.6 Estimated TAC curve - A linear input/output model for TAC, as a function of BAC/BrAC in the form of a convolution . . . . . . . . . . . . . . . . . . . . . . . . 127 7.3 Validating the Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 7.4 Demonstration of the Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 7.4.1 Simulator Experiment 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 7.4.2 Experiment 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 Chapter 8: Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 Chapter 9: Discussion and Conluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150 v ListofTables 7.1 Dataset3 Covariate List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 7.2 Dataset3 Covariate Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 7.3 Simulator Drinking Diaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 7.4 D 1 Covariate Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 7.5 Simulator Parameter Values for Efficacy Experiment . . . . . . . . . . . . . . . . . . . . . . 130 7.6 Experiment 1 Covariate Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 7.7 Experiment 2 Covariate Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 7.8 Experiment 2 Drinking Diary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 8.1 Estimated coefficients of the linear regression models that predict the log-link transfor- mation ofq 1 andq 2 from the covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 8.2 Covariate values of the test data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 8.3 95% Confidence intervals for q 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144 8.4 95% Confidence intervals for q 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144 vi ListofFigures 3.1 Values of theb q estimators obtained when using 20 TAC observations overT =1 hour. . . 68 3.2 Values of theb q estimators obtained when using 60 TAC observations overT =1 hour. . . 68 3.3 Values of theb q estimators obtained when using 100 TAC observations overT =1 hour. . 69 3.4 TAC errors set to zero, no regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 3.5 TAC errors2.5× 10 − 5 , no regularization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 3.6 Lab and field TAC error 2.5× 10 − 4 and2.5× 10 − 5 , no regularization. . . . . . . . . . . . 73 3.7 Lab and field TAC errors 2.5× 10 − 3 and2.5× 10 − 4 , both regularization parameters set to2.5× 10 − 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 3.8 Confidence Interval: Area under the curve. . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 3.9 Lab and field TAC error 2.5× 10 − 4 and7.5× 10 − 5 , no regularization. . . . . . . . . . . . 75 3.10 Lab and field TAC error 2.5× 10 − 4 and7.5× 10 − 5 , both regularization parameters set to 2.5× 10 − 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 3.11 BrAC from recordings of two TAC device observations of dataset BT311 Session1 . . . . . . 78 3.12 BrAC from TAC observations of dataset BT311 Session1 06132019 and Estimated BrAC that results from using the minimizerb q =(0.5611,0.7655). Estimated standard deviation of TAC errors is2.5× 10 − 3 , both regularization parameters set to0.006. . . . . . . . . . . 79 6.1 A mapping from the subject covariates to the TAC parameterq . . . . . . . . . . . . . . . 107 7.1 Schematic diagram of the Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 7.2 Flowchart of the Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 7.3 Dataset3 TAC Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 vii 7.4 Dataset3 Categorical Covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 7.5 Dataset3 Numerical Covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 7.6 90% bounds forq based on the covariate vectors inD 1 . . . . . . . . . . . . . . . . . . . . 131 7.7 90% bounds for TAC based on the covariate vectors inD 1 . . . . . . . . . . . . . . . . . . 132 7.8 90% bounds for BrAC based on the covariate vectors inD 1 . . . . . . . . . . . . . . . . . . 132 7.9 Simulator Experiment 1 - Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 7.10 Simulator Experiment 1 - Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 7.11 Simulator Experiment 1 - Example 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 7.12 Simulator Experiment 2 - Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 7.13 Simulator Experiment 2 - Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 7.14 Simulator Experiment 2 - Example 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 8.1 Marginal distributions of theq 1 andq 2 populations ofD train . . . . . . . . . . . . . . . . . 145 8.2 Scatter plot and kernel density estimate of the fullq population ofD train . . . . . . . . . . 145 8.3 Test Data 1 sub-population selection and conditional kernel density estimate . . . . . . . . 146 8.4 Test Data 1 deconvolved BrAC signal along with error bands . . . . . . . . . . . . . . . . . 146 8.5 Test Data 2 sub-population selection and conditional kernel density estimate . . . . . . . . 147 8.6 Test Data 2 deconvolved BrAC signal along with error bands . . . . . . . . . . . . . . . . . 147 viii Abstract Being motivated by an alcohol biosensor problem, this thesis presents two approaches. First, the develop- ment ofM-Estimation and deconvolution methodology with the goal of making well-founded statistical inference on an individual’s blood alcohol level based on noisy measurements of their skin alcohol con- tent. The results are applied to a nonlinear least squares estimator of the key parameter that specifies the blood/skin alcohol relation in a diffusion model, establishing its existence, consistency, and asymptotic normality. To make inference on the unknown underlying blood alcohol curve, a basis space deconvo- lution approach with regularization is developed and the asymptotic distribution of the error process is determined, thus allowing the computation of uniform confidence bands on the curve. Second, the de- velopment of a population model for the transdermal transport of ethanol from the blood to an alcohol biosensor on the surface of the skin in the form of a random abstract parabolic hybrid system of coupled ordinary and partial differential equations. Linear semigroup theory in a Gelfand triple of Bochner spaces is used to first formulate the model as an equivalent deterministic system in state space form and to then develop a finite dimensional approximation and convergence theory. Kernel density estimation, is used to estimate the distributions of the random parameters that appear in the model. Sub-population selection based on regression models’ predictions for the transdermal alcohol random parameters based on subject covariates is used to restrict the kernel density estimate of the population density and produce narrower error bands for the deconvolved BrAC signal. ix Chapter1 Introduction The ideas and results presented here are motivated by the following problem. Biosensors that measure transdermal alcohol concentration (TAC) are being developed which record the amount of ethanol excreted through the skin after consuming alcoholic beverages. These devices which rely on fuel cell technology and are currently at the research and prototype stage of their development, require only passive participation by the patient or research participant, yield TAC readings that are essentially continuous in time, and can collect, record, and transmit data for relatively long periods of time. For these reasons, TAC biosensors have the potential to become an invaluable tool for researchers and clinicians who want to observe drinking behavior in a naturalistic setting ([66]). Currently the only way alcohol researchers and clinicians can collect data for studying drinking pat- terns and behaviors is through the use of breath analyzers and self-report. Breath analyzers require active participation by the research participant or patient and they only provide a measurement at a single point in time. They are also subject to error due to 1) improper exhalation by the user, 2) readings taken too close together in time, and 3) readings taken too soon after drinking when there is still alcohol in the mouth. The self-report and drinking diaries can also be plagued by inaccuracies as a result of alcohol’s tendency to impair memory and the difficulty inherent in attempting to determine the precise amount of alcohol consumed, for example, as in the case of mixed drinks or pitchers of beer. 1 TAC measurements, on the other hand, have the potential to eliminate these sources of error, but un- fortunately, however, they also have their limitations. The most significant of these is that TAC is not yet reliably able to be directly converted into the current standard measures of intoxication, namely, blood (BAC) and/or breath (BrAC) alcohol concentration. It is generally well accepted ([36]) that BrAC is closely correlated with BAC across different subjects, different devices, and different environmental conditions. TAC’s relation to BAC or BrAC, on the other hand, being the result of the more complex biological process of ethanol molecules diffusing from the blood through the skin, and then their collection and measurement by the biosensor which is external to the body, can be affected by a number of largely unpredictable con- founding physiological (e.g. sex, age, body composition, etc.), environmental (e.g. ambient temperature and humidity, altitude, etc.), and technological factors. The result is significant variability in the relation- ship of TAC to either BAC or BrAC across people, environmental conditions, and device manufacturers. Thus TAC devices to date have typically been primarily used only in legal and research settings as ab- stinence monitors (e.g., in court mandated monitoring of DUI offenders) because of difficulties researchers have found translating raw TAC to the quantity of alcohol in the blood. Still, TAC measured by a wearable biosensor device has great potential as a tool to improve personal and public health. It provides a passive, unobtrusive way to collect naturalistic data for extended periods of time. Creating a system that reliably converts TAC data into estimates of BAC (or BrAC) would greatly bene- fit the alcohol research and clinical communities who, along with public health institutes, have been quite interested in such models. Such a tool would dramatically improve the accuracy of field data and the valid- ity of naturalistic studies of alcohol-related health outcomes, disease progression, treatment efficacy, and recovery. A wearable alcohol monitoring device could have consumer appeal as well, helping individuals monitor their own alcohol levels and make better health choices. Previous work on the TAC-BAC/BrAC relationship began with deterministic models ([4, 6, 12, 25, 48, 63, 69]) for the “forward process” of the propagation of alcohol from the blood, through the skin, 2 and its measurement by the sensor. Later approaches reversed the forward process to estimate BrAC based on the TAC ([13, 18, 38, 39, 40, 49, 50, 74]). Other statistical modeling approaches include Hill- Kapturczak et al.’s [30] regression model for peak BrAC using peak TAC, time of peak TAC, and gender using controlled laboratory data. Time delays from peak BrAC to peak TAC were examined in [34]. In [72, 73] physics-based statistical models were used for the TAC-BAC/BrAC relationship but ultimately concluded that, “due to the highly variable relationship between the BAC and TAC curves, transdermal sensing of real-time BAC using only skin surface measurements may prove to be very challenging”. Some other methods developed involve machine learning techniques, some of which were strictly data-based and others that were also physics-informed, including hidden Markov models ([45]) and standard and long- short term memory artificial neural networks ([43, 44]). More recently methods for estimating BrAC from TAC observations based on random population models that do not require calibration to each individual subject or patient, environmental conditions, or the particular device being worn have been developed ([29, 28, 59, 60, 61]). Another group has developed methods based on traditional linear regression models ([15, 16, 17]), while others have investigated alternative strictly data based machine learning approaches, for example, ones involving random forests ([22]). In this work, the challenge of the highly variable relationship between the BAC and TAC curves is met by using a physics-based statistical model that allows individual, device, and drinking episode level variation by treating the data from each person/device/episode triple as resulting from its own model parameters. The large sample behavior of estimates of these parameters are determined and conditions under which these estimates are consistent and have a limiting normal distribution are provided. Then, those results are used to give a statistically rigorous asymptotic characterization of the properties of the BAC/BrAC curve estimates obtained from measured TAC, including information on estimation error. As these estimates are made on an individualized basis, they will not be adversely affected when used in 3 a study of a population whose characteristics vary widely. On the other hand, these estimates require individualized calibration over subject, device and environmental conditions. The second part of this work is based on the development of a population model for the transdermal transport of ethanol from the blood to the TAC biosensor on the surface of the skin in the form of a random abstract parabolic hybrid system of coupled ordinary and partial differential equations. Relying on linear semigroup theory ([47]) in a Gelfand triple ([68]) of Bochner spaces the model is first formulated as an equivalent deterministic system in state space form and then this formulation is used to 1) develop a finite dimensional approximation and convergence theory, and 2) deconvolve an estimate of BrAC from the observed TAC measurements along with a quantification of the uncertainty in that estimate. Then, it is shown how non-parametric statistical techniques, in particular the kernel density estimation, can be used together with clinically collected drinking episode data to estimate off-line the distributions of the random parameters that appear in the model. After developing a population model for the transfer of ethanol from the blood to the TAC biosensor on the skin, the next step is to apply sub-population selection to restrict the distribution estimate of the parameters with the goal of reducing the uncertainty in the BrAC estimate as suggested by the Implicit Function Theorem. During the sub-population selection process, regression models are used to predict the TAC parameters based on the physical characteristics of the subjects (covariates). The predicted TAC parameter will determine the sub-population selection that will subsequently limit the population density used during the BrAC deconvolution phase. By estimating the TAC parameters from the covariates, there is no longer a need to calibrate the model to the individual who supplied the TAC data, thus removing the burden of calibration. In order to demonstrate the effectiveness of the methods developed in this work, it was necessary to have data from multiple drinking episodes, along with the corresponding subject covariates that would show variability in both drinking patterns and covariates. Unfortunately, the available dataset did not 4 possess the desired level of variability, which necessitated the use of simulated data. As a result, a simulator was developed that can generate realistic BrAC and TAC data, as well as BrAC and TAC parameters and subject covariates, based on different drinking patterns and characteristics of the existing real dataset. An outline of the remainder of this thesis is as follows. In Chapter 2, the mathematical model for the alcohol biosensor problem is derived. In Chapter 3 M-estimation in a diffusion model with application to biosensor transdermal blood alcohol monitoring is developed. In Chapter 4, the non-parametric esti- mation of a random diffusion equation-based population model is discussed along with finite dimensional approximation and blood/breath alcohol concentration deconvolution from transdermal alcohol biosensor data. In Chapter 5 consistency results for the TAC and the deconvolved BrAC signals are established under finite dimensional approximation of the operators that appear in the model and for the case where the TAC parameter’s population density is estimated using the kernel density estimate method. Then, in Chapter 6, regression models with covariate data for the prediction of the transdermal alcohol parameters are in- troduced and their contribution to the sub-population selection for the uncertainty quantification of the blood/breath alcohol concentration curves is presented. In Chapter 7, the development of a Simulator that generates data to accommodate experiments for the validation of the methods described in Chapters 4 and 6 is explained. The numerical results are in Chapter 8 and in Chapter 9, some discussion and concluding remarks are presented. 5 Chapter2 ADiffusion-BasedModelfortheTransdermalTransportofEthanol 2.1 TheFirstPrinciplesIndividualModel Consider a first principles, linear, hybrid, single input/single output system for the transdermal transport of ethanol from the blood to the TAC biosensor consisting of a coupled ordinary and parabolic partial differential equations along with boundary and initial conditions. The ODE models the inflow/outflow dynamics of the sensor and the PDE models the diffusion of ethanol molecules in the interstitial fluid between the cells in the epidermal layer of the skin. Note that the epidermal layer does not have an active blood supply but rather consists of a combination of living and dead cells which obtain nourishment from the dermal layer via the interstitial fluid. For t>0 and0<η < 1, the model takes the form ∂x ∂t (t,η )=q 1 ∂ 2 x ∂η 2 (t,η ), dw dt (t)=q 1 ∂x ∂η (t,0), x(t,0)=w(t), x(t,1)=q 2 u(t), (2.1) w(0)=w 0 , x(0,η )=ϕ 0 (η ), y(t)=w(t), 6 In the system (2.1), x(t,η ) models the ethanol concentration in the epidermal layer of the skin at time t and dimensionless depthη ∈[0,1]. The quantityw(t) denotes the ethanol concentration in the collection chamber of the biosensor and it is assumed that the TAC, which is the system output, y(t) agrees with w(t). The input to the system is the BrAC or BAC, u(t). The system has two, directly unmeasurable, physiologically, hardware, and environmentally-dependent parameters,q = (q 1 ,q 2 ). The dimensionless parameterq 1 describes the diffusivity ethanol in the epidermal layer and, due to the linearity of the system, the dimensionless parameterq 2 models the product of the alcohol impedance between the epidermal and dermal layers of the skin and the alcohol impedance between the epidermal layer of the skin and the membrane covering the collection chamber of the sensor at the location where it comes into contact with the epidermal layer. The sensor is digital with sampling timeτ > 0; hence the discrete time version of (2.1) is considered. Note that the the input to the system (2.1) is on the boundary of the spatial domain,[0,1], which ordinarily would result in a state space system with unbounded input operator making well-posedness and approx- imation and convergence arguments somewhat more technical. By formulating the model in discrete or sampled time this added complexity is avoided. Assume zero-order hold input, i.e. u(t) = u k ,t∈ [kτ, (k+1)τ ),k = 0,1,2,...,K with a finite time horizon, T > 0, where Kτ = T . Then setx k = x k (η ) = x(kτ,η ),w k = w(kτ ), andy k = y(kτ ),k = 0,1,...,K. Using the change of variablesv(t,η ) = x(t,η )− ξ 0 (η )u k withξ 0 (η ) = q 2 η , for0 < η < 1, andkτ ≤ t<(k+1)τ , for eachk =0,1,2,...,K, the system (2.1) becomes ∂v ∂t (t,η )=q 1 ∂ 2 v ∂η 2 (t,η ), dw dt (t)=q 1 ∂v ∂η (t,0)+q 1 q 2 u k , (2.2) v(t,0)=w(t), v(t,1)=0, 7 v(kτ, ·)=x(kτ, ·)− ξ 0 (·)u k =x k − ξ 0 u k . 2.2 TheStateSpaceFormulation LetQ be a compact subset of the positive orthant ofR 2 . Define H =R× L 2 (0,1) forq =(q 1 ,q 2 )∈Q and equip it with the inner product⟨(θ 1 ,ϕ 1 ),(θ 2 ,ϕ 2 )⟩=θ 1 θ 2 + R 1 0 ϕ 1 (η )ϕ 2 (η )dη . Further, letV ={(θ,ϕ )∈ H :ϕ ∈H 1 (0,1),θ =ϕ (0),ϕ (1)=0} be a Hilbert space with inner product⟨(ϕ 1 (0),ϕ 1 ),(ϕ 2 (0),ϕ 2 )⟩ V = ⟨ϕ ′ 1 ,ϕ ′ 2 ⟩ L 2 (0,1) , whereH 1 (0,1) is the Sobolev spaceW 1,2 (0,1) and⟨·,·⟩ L 2 (0,1) denotes the standardL 2 (0,1) inner product. Let |·| and ||·|| denote respectively the norms induced by these two inner products on their respective spaces. In this way, standard arguments yield the dense and continuous embeddings V ,→H ,→V ∗ ([56]). Define the bilinear form a(q;·,·):V × V →R for eachq∈Q by a(q; ˆ ϕ 1 , ˆ ϕ 2 )=q 1 Z 1 0 ϕ ′ 1 (η )ϕ ′ 2 (η )dη (2.3) for ϕ 1 ,ϕ 2 ∈ V and with ˆ ϕ 1 = (ϕ 1 (0),ϕ 1 ), ˆ ϕ 2 = (ϕ 2 (0),ϕ 2 ). It is not difficult to show that a(q,·,·) satisfies the three properties: i Boundedness There exists a constantα 0 >0 such that |a(q; ˆ ϕ 1 , ˆ ϕ 2 )|≤ α 0 || ˆ ϕ 1 |||| ˆ ϕ 2 ||, ˆ ϕ 1 , ˆ ϕ 2 ∈V . ii Coercivity There exist constantλ 0 ∈R andβ 0 >0 such thata(q, ˆ ϕ, ˆ ϕ )+λ 0 | ˆ ϕ | 2 ≥ β 0 || ˆ ϕ || 2 , ˆ ϕ ∈V . iii Continuity For all ˆ ϕ 1 , ˆ ϕ 2 ∈ V , we have thatq→ a(q, ˆ ϕ 1 , ˆ ϕ 2 ) is a continuous mapping fromQ into R. 8 Moreover, recalling thatQ was assumed to be a compact subset ofR 2 , it follows that the bilinear form a(q;·,·) is uniformly bounded, coercive, and continuously dependent onq∈ Q, i.e. the constantsα 0 ,λ 0 andβ 0 do not depend onq. See [29] for more details. Define the operator A(q): Dom(A(q))⊂ H →H as follows. First let⟨A(q) ˆ ϕ, ˆ ψ ⟩ V ∗ ,V =− a(q; ˆ ϕ, ˆ ψ ) and set Dom(A(q)) ={ ˆ ϕ = (ϕ (0),ϕ )∈ V : ϕ ∈ H 2 (0,1)}. Note that the domain Dom(A(q)) does not depend onq, and for ˆ ϕ ∈ Dom(A(q)) A(q) ˆ ϕ =A(q)(ϕ (0),ϕ )=(q 1 ϕ ′ (0),q 1 ϕ ′′ ). (2.4) Standard arguments yield that A(q) is densely defined on H, regularly dissipative and self-adjoint ([3], [68]). Hence,A(q) is the infinitesimal generator of a uniformly exponentially stable, self-adjoint, analytic semigroup of bounded linear operators,{e A(q)t :t≥ 0}, onH ([3], [47], [68]). With these definitions, the system (2.2) can be rewritten as dˆ v dt (t) = A(q)ˆ v(t)+q 1 q 2 (1,0)u k where ˆ v(kτ )=(w k ,x k − ξ 0 u k ) in the intervalkτ ≤ t<(k+1)τ . Set ˆ x k =(w k ,x k ), and define the operators ˆ A(q)=e A(q)τ ∈L(H,H) and B(q)=q 2 q 1 (1,0)− A(q)(0,ξ 0 ) (2.5) Then using the variation of constants formula, the operator is obtained as follows ˆ B(q)= Z τ 0 e A(q)s B(q)ds=A(q) − 1 e A(q)s B(q)| τ 0 =( ˆ A(q)− I)A(q) − 1 B(q) =q 2 (I− ˆ A(q)) (0,η )− q 1 A(q) − 1 (1,0) ∈L(R,H), Note that the ellipticity ofA(q) ensures thatA(q) − 1 ∈L(H,H) exists. 9 These definitions together with the system (2.2) then yield the state space form of the model given by ˆ x k+1 =(w k+1 ,x k+1 )=(w((k+1)τ ),x((k+1)τ, ·)) = ˆ v((k+1)τ )+(0,ξ 0 )u k =e A(q)τ (w k ,x k − ξ 0 u k )+q 2 Z τ 0 e A(q)s (1,0)dsu k +(0,ξ 0 )u k = ˆ A(q)ˆ x k +(I− ˆ A(q))(0,ξ 0 )u k +q 1 q 2 A(q) − 1 ( ˆ A(q)− I)(1,0)u k = ˆ A(q)ˆ x k + ˆ B(q)u k . Finally, set the output operator ˆ C ∈L(H,R) to be ˆ C(θ,ϕ ) = θ for(θ,ϕ )∈ H. Then, the discrete time model given in (2.2) becomes ˆ x k+1 = ˆ A(q)ˆ x k + ˆ B(q)u k , k =0,1,2,...,K ˆ x 0 =(w 0 ,ϕ 0 ), (2.6) y k = ˆ Cˆ x k , k =0,1,2,...,K. With the additional assumption that neither the epidermal layer of the skin nor the collection chamber of the sensor contain alcohol at timet=0, that is ˆ x 0 =(w 0 ,ϕ 0 )=(0,0), the outputy of the system (2.2) or equivalently, (2.6), can be written as a discrete time convolution of the input,u, with a filter, h(q), as y k = k− 1 X j=0 ˆ C ˆ A(q) k− j− 1 ˆ B(q)u j = k− 1 X j=0 h k− j (q)u j , k =0,1,2,.... (2.7) Here, the filter is h i (q)= ˆ C ˆ A(q) i− 1 ˆ B(q),i=1,2,.... 10 2.3 FiniteDimensionalApproximationandConvergence Estimating the parameter q requires finite dimensional approximation. A standard Galerkin approach is used to discretize. The approach is based on a series of approximating finite-dimensional subspaces V N of V as well as sequences of operators ˆ A N , ˆ B N , and ˆ C N that approximate ˆ A, ˆ B, and ˆ C, respec- tively. For N = 1,2,... let{φ N i } N i=0 be the standard linear B-splines on the interval [0,1] defined with respect to the uniform mesh{0, 1 N , 2 N ,..., N− 1 N ,1}. Then define the subspace V N = span{ˆ φ N i } N− 1 i=0 = span{(φ N i (0),φ N i )} N− 1 i=0 and denote the orthogonal projection ofH on toV N byP N : H → V N . Basic arguments from the theory of splines (see, for example, [53]) can be employed to show that|P N (θ,ψ )− (θ,ψ )|→ 0, asn→∞, for all(θ,ψ )∈ H. Further, these same types of arguments can be used to argue that||P n ˆ φ− ˆ φ||→0, asN →∞, for all ˆ φ∈V . The operator A(q) is approximated using a Galerkin approach. As such, for N = 1,2,... and k = 0,1,2,... set ˆ x N k (η )= N− 1 X i=0 X N,k i ˆ φ N i (η ), and define the approximating operator A N ∈ L(V N ,V N ) by restricting the bilinear form a(q,·,·) to V N × V N . Then set ˆ A N (q)=e A N (q)τ , and ˆ B N (q)=( ˆ A(q) N − I)(A N (q)) − 1 P N B(q)=q 2 (I− ˆ A N (q)) P N (0,η )− q 1 (A(q) N ) − 1 P N (1,0) =q 2 (I− ˆ A N (q)) (M N ) − 1 Z 1 0 ηφ j (η )dη +(K N ) − 1 M N (M N ) − 1 [1,0,...,0] T =q 2 (I− ˆ A N (q)) (M N ) − 1 Ξ N 0 +(K N ) − 1 [1,0,...,0] T (2.8) 11 where ˆ B N ∈ L(R,V N ), and Ξ N 0,j = R 1 0 ηφ N j (η )dη for j = 0,...,N − 1. With respect to the basis {ˆ φ N i } N− 1 i=0 , the matrix representation of the above operators are given by [A N ]=− q 1 [M N ] − 1 K N , [ ˆ A N ]=e − q 1 [M N ] − 1 K N τ and [ ˆ B N ]=q 2 (I− [ ˆ A N ]) [M N ] − 1 Ξ N 0 +[K N ] − 1 [1,0,0,...,0] ⊤ where[M] N− 1 i,j=0 = R 1 0 φ N i φ N j dη ,[K] N− 1 i,j=0 = R 1 0 φ N i ′ φ N j ′ dη . Further,[ ˆ C N ]=[1,0,0,...,0]∈R 1× N . The discrete time model given in (2.6) inV N becomes ˆ x N k+1 = ˆ A N (q)ˆ x N k + ˆ B N (q)u k , k =0,1,2,...,K ˆ x N 0 =(0,0), (2.9) y N k = ˆ Cˆ x N k , k =0,1,2,...,K. or equivalently inR N is given in matrix form by X N,k+1 =[ ˆ A N ]X N,k +[ ˆ B N ]u k , k =0,1,2,...,K X N,0 =[0,0,...,0] T ∈R N , (2.10) y N k =[ ˆ C] N X N,k , k =0,1,2,...,K. and hence y N k = k− 1 X j=0 ˆ C N ( ˆ A N ) k− j− 1 ˆ B N u j = k− 1 X j=0 [ ˆ C N ][ ˆ A N ] k− j− 1 [ ˆ B N ]u j = k− 1 X j=0 h N k− j u j , k =0,1,2,...,K, whereh N i =[ ˆ C N ][ ˆ A N ] i− 1 [ ˆ B N ]∈R,i=1,2,...,K. 12 Using linear semigroup theory (see [50], [6], [3]) and more specifically the Trotter-Kato semigroup approximation theorem (see [35] and [47]) it can be shown that the operators ˆ A N , ˆ B N and ˆ C N converge to ˆ A, ˆ B and ˆ C respectively. 13 Chapter3 M-estimationinadiffusionmodelwithapplicationtobiosensor transdermalbloodalcoholmonitoring 3.1 IntroductionandBackground This Chapter presents an attempt to address the challenge of modeling the highly variable TAC-BAC/BrAC relationship. A physics-based statistical model is proposed, which accounts for individual, device, and drinking episode level variation by treating the data from each person/device/episode triple as resulting from its own model parameters. The aim is to achieve well-founded statistical inference on an individual’s blood alcohol level based on noisy measurements of their skin alcohol content. To this end, M-estimation methodology in a general setting is developed, which is then applied to the diffusion equation-based model for the blood/skin alcohol relationship. The Chapter establishes the existence, consistency, and asymptotic normality of the nonlinear least squares estimator of the diffusion model’s parameter and the resulting estimated blood alcohol curve. 3.1.1 Diffusionmodel Consider the individual model (2.1) as stated in Chapter 2 for the transport of ethanol from the blood to the TAC biosensor which depends on the unknown, 2-dimensional parameterq = (q 1 ,q 2 ). Also, recall 14 that TAC can be expressed as a convolution of BAC/BrAC with a kernel or filter as in (2.7). The unknown parameter q is estimated in this Chapter via nonlinear least squares as described in Section 3.1.2 and its properties are considered in Section 3.3. These properties determine the inferential consequences for BAC/BrAC estimation, and in particular have a large impact on the accuracy of the estimated BrAC curve, as studied in Section 3.4. The system given by equation (2.1) incorporates Dirichlet boundary conditions. The methods pre- sented in this Chapter are inspired by the system introduced in Chapter 2, although with a modification where Neumann boundary conditions are utilized instead. The system with its boundary conditions can be solved in continuous time in terms of unbounded linear operators (see Section 2 in [59]), with solution x(t)=e A(q)t x(0)+ Z t 0 e A(q)(t− s) B(q)u(s)ds. (3.1) where the unbounded linear operators are as in (2.4) and (2.5). In cases considered in this work, x(0) will be the zero function, that is, observation begins at, or before, the time of first intake of alcohol. By taking a discretization of the distance η from skin level intoN steps for someN sufficiently large, the operators in (3.1) can be approximated by N dimensional linear operators (i.e., matrices) which can be derived as in Section 2.3 with a modification to account for the Neumann boundary conditions. Note that in Chapter 2,[ ˆ A N ],[A N ] and[B N ] were used to represent the finite dimensional approximations of the operators ˆ A N , A N and B N respectively. However, in this Chapter, for the sake of simplicity, the same notation of ˆ A N ,A N , andB N will be used to denote the finite dimensional approximations of these operators. The approximation to the solution to (2.1) given by x N (t)= Z t 0 e A N (q)(t− s) B N (q)u(s)ds. (3.2) 15 An observation taken at timet can be represented as the linear function ofx N (t) given by f N u (t;q)= Z t 0 C N e A N (q)(t− s) B N (q)u(s)ds, (3.3) plus an additive error term. For observations taken at skin level, the vector C N will have a one in its first component, and zeros elsewhere. Note that for the ease of notation the dependence of f u (t;q) on N through the matricesA N (q) andB(q) N will be dropped and thereforef N u (t;q) will be denoted with f u (t;q) in what follows. The matrices in (3.3) depend on the unknown parameterq as A N (q)=q 1 D N +E N and B N (q)=q 2 F N , (3.4) whereC N ,D N ,E N , andF N are known matrices that result from making the finite-dimensional approx- imation discussed; methods of computation and consistency for approximating the infinite dimensional solution have been established in [59]. Note that when the boundary conditions are specified as Dirichlet, the matrixE N becomes the zero matrix. More precise assumptions and properties of these matrices, and the domain ofq, will be specified in Section 3.3. 3.1.2 Nonlinearleastsquaresestimation To estimate the parameterq, assume that TAC data{y ij ,1≤ i≤ n,1≤ j ≤ m i } is collected on a single individual overn different drinking episodes at the m i times0≤ t i,1 <··· < t i,m i ≤ T i , for given BrAC curvesu i on[0,T i ]. Withm=(m 1 ,...,m n ), the estimator minimizes J n,m (q)= 1 2 P n i=1 m i n X i=1 m i X j=1 (f u i (t ij ;q)− y ij ) 2 , (3.5) 16 wheref u i (t ij ;q) is given by the right hand side of (3.3) withu replaced byu i , the BrAC curve for drink- ing episode i. The model specified by (3.3) and (3.4) is deterministic, but to account for measurement variability, we include additive, homoscedastic errors on the observed values TAC values. The constant variance condition implies that all TAC observations are ‘equally reliable’, and that the error variances, in particular, do not depend on the length of time elapsed since the last observation. For that reason, the least squares objective functions give equal weight to their summands, and when appropriate, weights, inversely proportional to variance, could be included. The length of the time intervalT i of thei th episode, and the location of the sampling times, may also be allowed to be stochastic. 3.2 M-estimation: Existence,consistency,andlimitingdistribution In this sectionM-estimation in a general setting that contains what will be required to handle for the diffu- sion model is considered. The results may be viewed as an extension of existing results onM-estimation. Textbooks that coverM-estimation tend to focus on the case of a univariate parameter ([41], [55]), whereas ours covers the multivariate case. The closest results to ours are by [33], who obtained similar results but in a setting that is more restrictive in a number of ways. First, [33] considers only least squares estimation whereas our results apply to the more general estimating equation (3.6). Second, these previous results only apply to approximate normality and require i.i.d. error terms, whereas our Theorem 3.2.2 can be applied to other limiting distributions and relaxed conditions on the error terms, although our main appli- cation is to limiting normality. Finally, these previous results are more restrictive in terms of a number of technical conditions, such as compactness of the parameter spaceΘ which our results do not require, and the existence of “tail products” of vectors of observation means and error terms, which our results eschew in favor of more conventional regularity conditions on the score type functionU n . 17 After establishing the notation and setup in Section 3.2.1, our main results are stated in Section 3.2.2. In Section 3.2.3 some general examples of the applications of our results to least squares and maximum likelihood estimation are provided. 3.2.1 Setupandsummaryofresults Forn≥ 1, observed dataX (n) in a spaceχ (n) , a parameter spaceΘ ⊂ R p having non-empty interior, and a functionU n :Θ × χ (n) →R p , consider the estimating equation U n (θ )=0, θ ∈Θ , (3.6) where the dependence ofU n on the data is suppressed. In the examples χ (n) will a Euclidean space en- dowed with a family of densitiesp n (x (n) ;θ ),θ ∈Θ which generate the data from this family withθ =θ 0 . Two important situations in which the solutions of such equations arise are maximum likelihood and least squares estimation. For maximum likelihood, under smoothness conditions on the densities, the maximizer of the log likelihoodL n (θ )=logp n (x (n) ;θ ) is given as a solution to (3.6) with U n (θ )=∂ θ L n (θ ;X (n) ), (3.7) where∂ θ denotes taking derivative with respect toθ , resulting in a column vector of partial derivatives whenθ itself is a vector. When the dataX (n) consists ofn independent random vectorsX 1 ,...,X n in R d , each with distribution p(x;θ 0 ), the space χ (n) can be identified with R d× n , and p n (x (n) ,θ ) is the product of the marginal densitiesp(x i ;θ ) fori=1,...,n. 18 To introduce least squares estimation, suppose that pairs(x i ,y i )∈R d × R,i=1,...,n, are observed with distribution depending onθ for which E θ [y i |x i ]=f i (x i ;θ ) forf i (x;θ ) in some parametric class of functions. Withx (n) = (x 1 ,...,x n ), the least squares estimate ofθ is given as the minimizer of J(θ ;x (n) )= 1 2n n X i=1 (y i − f i (x i ;θ )) 2 , which under smoothness conditions can be obtained via (3.6) with U n (θ )=∂ θ J(θ ;x (n) )= 1 n n X i=1 (f i (x i ;θ )− y i )∂ θ f i (x i ;θ ). (3.8) The aim of the estimating equationU n (θ )=0 is to provide a value close to the one where the function U n (θ ) takes the value of 0 in some expected, or asymptotic, sense. In particular, in Theorem 3.2.1 it will be showed, under that whenU n (θ 0 ) is, under an appropriate scaling, close to zero as n → ∞, then the sequence of estimates obtained via the estimating equations will be consistent for the true parameter. In Theorem 3.2.2, a corresponding limiting distribution for solutions to the estimating equation (3.6) will be provided. LetU n (,θ ) have components U n (θ )=(U n,j (θ )) 1≤ j≤ p where U n,j :R n × Θ →R. 19 U ′ n (θ ) will be written as short for∂ θ U ⊤ n (θ )∈R p× p . This quantity is the observed information matrix in the case of maximum likelihood estimation, and for (3.7) under the assumption of the existence and continuity of second derivatives ofL n forθ ∈Θ , itsk,j th component is given by ∂U n,j (θ ) ∂θ k = ∂ 2 L n (θ ) ∂θ k ∂θ j = ∂ 2 L n (θ ) ∂θ j ∂θ k = ∂U n,k (θ ) ∂θ j . In this case, the third condition in (3.10) below is equivalent to the condition that the limiting information matrixI is positive definite. Tolerating a slight abuse of notation, ∂ j will be written rather than∂ θ j when taking a partial with respect to the j th coordinate variable, and ∂ m j for the m th order derivative, so for instance denoting thek,j th entry ofU ′ n (θ ) by∂ k U n,j (θ ). Over each coordinate j = 1,...,p, under second order differentiabilty conditions, the second order Taylor expansion ofU n,j (θ ) around someθ 0 ∈Θ will be used, U n,j (θ )=U n,j (θ 0 )+ p X k=1 ∂ k U n,j (θ 0 )(θ k − θ k,0 )+ 1 2 X 1≤ k,l≤ p (θ k − θ k,0 )∂ k,l U n,j (θ ∗ n,j )(θ l − θ l,0 ), (3.9) where eachθ ∗ n,j lies on the line segment connectingθ andθ 0 . In the following, let∥·∥ denote the Euclidean norm of a vector, the operator norm of a matrix, and the supremum norm of a function. 3.2.2 Estimatingequations,consistency,andasymptoticnormality Results that provide conditions for the consistency and existence of a non-trivial limiting distribution for a properly centered and scaled sequence of estimating equation solutions are presented below. Results on the consistent estimation of parameters on which the asymptotic distribution of the estimate may depend are also included. 20 Theorem 3.2.1. Suppose thatU n : Θ × χ (n) → R p is twice continuously differentiable in an open set Θ 0 ⊂ Θ containingθ 0 , and that there exist a sequence of real numbersa n , a matrixΓ ∈ R p× p andγ > 0 such that a n U n (θ 0 )→ p 0 and a n U ′ n (θ 0 )→ p Γ asn→∞ with inf ∥θ ∥=1 θ ⊤ Γ θ =γ. (3.10) Suppose further that for anyτ ∈(0,1), that there exists aK such that for alln sufficiently large, P(|a n ∂ k,l U n,j (θ )|≤ K,1≤ k,l,j≤ p,θ ∈Θ 0 ) ≥ 1− τ. (3.11) Thenforanygivenϵ> 0andτ ∈(0,1),forallnsufficientlylarge,withprobabilityatleast 1− τ thereexists b θ n ∈ Θ satisfyingU n ( b θ n ) = 0 and|| b θ n − θ 0 ||≤ ϵ , that is, a sequence of roots to the estimating equation (3.6) consistent forθ 0 . In addition, for any sequence b θ n → p θ 0 , we have a n U ′ n ( b θ n )→ p Γ , (3.12) that is,Γ can be consistently estimated bya n U ′ n ( b θ n ) from any sequence consistent forθ 0 . Proof: By replacingU n bya n U n andθ byθ − θ 0 , it may be assumed that the conditions of Theorem 3.2.1 hold witha n =1 andθ 0 =0. Forδ > 0 let B δ ={θ :∥θ ∥≤ δ }. 21 For the givenτ ∈(0,1), letK andn 0 be such that (3.11) holds withτ replaced byτ/ 2 forn≥ n 0 . For the givenϵ> 0, takeδ ∈(0,ϵ ) such that B δ ⊂ Θ 0 and Cδ <γ where C =2+ Kp 3/2 2 . By (3.10) there existsn 1 ≥ n 0 such that forn≥ n 1 P(∥U n (0)∥<δ 2 )≥ 1− τ/ 3 P(∥U ′ n (0)− Γ ∥<δ )≥ 1− τ/ 3, (3.13) and also taken large enough so that (3.11) holds with τ replaced by τ/ 3. By the union bound, all three events hold with probability at least1− τ . Forθ ∈ B δ andθ ∗ n,j given by (3.9), the components ofR n (θ ) = (R n,1 (θ ),...,R n,p (θ )) ⊤ as defined by R n,j (θ )= X 1≤ k,l≤ p θ k ∂ k,l U n,j (θ ∗ n,j )θ l satisfy |R n,j (θ )|≤ K p X i=1 |θ i | ! 2 ≤ Kp∥θ ∥ 2 . Then, forn≥ n 1 , with probability at least1− τ , from (3.9), (3.13) and (3.11), ∥U n (θ )− Γ θ ∥ ≤ ∥ U n (θ )− U ′ n (0)θ ∥+∥U ′ n (0)θ − Γ θ ∥ = ∥U n (0)+ 1 2 R n (θ )∥+∥(U ′ n (0)− Γ) θ ∥ < δ 2 + Kp 3/2 2 ∥θ ∥ 2 +δ ∥θ ∥≤ Cδ 2 , so ∥θ ⊤ U n (θ )− θ ⊤ Γ θ ∥<Cδ 3 . 22 Hence, if∥θ ∥=δ , θ T U n (θ )>θ T Γ θ − Cδ 3 ≥ γδ 2 − Cδ 3 =δ 2 (γ − Cδ )>0. (3.14) Now it is argued as in Lemma 2 of [1]. Assume for the sake of contradiction thatU n (θ ) does not have a root inB δ . Then forθ ∈B δ , the functionf(θ )=− δ U n (θ )/∥U n (θ )∥ continuously mapsB δ to itself. By the Brouwer fixed point theorem, there exists ϑ∈ B δ , withf(ϑ) =ϑ. Since∥f(θ )∥ = δ for allθ ∈ B δ , then∥f(ϑ)∥ =∥ϑ∥ = δ , contradicting (3.14) viaδ 2 =∥ϑ∥ 2 =ϑ T ϑ =ϑ T f(ϑ) < 0. HenceU n (θ ) has a root withinδ of 0, and sinceδ <ϵ , therefore withinϵ , with probability at least1− τ , as required. To prove (3.12), taking b θ n to be any consistent sequence forθ 0 , a first order Talyor expansion yields, for all1≤ j,k≤ p, ∂ k U n,j ( b θ n )=∂ k U n,j (0)+ p X l=1 ∂ k,l U n,j (θ ∗ n,j ) b θ n,l =∂ k U n,j (0)+Q T k,n,j b θ n where Q T k,n,j :=(∂ k,1 U n,j (θ ∗ n,j ),...,∂ k,p U n,j (θ ∗ n,j )), whereθ ∗ n,j lies along the line segment connecting b θ n and0. Writing this identity in matrix notation, U ′ n ( b θ n )− U ′ n (0)=Q n ( b θ n ) where (Q n (θ )) k,j =Q T n,k,j θ . Letτ ∈ (0,1) andϵ > 0 be given, chooseδ ∈ (0,ϵ/K √ p) so thatB δ ⊂ Θ 0 , and letn 2 be such that for alln≥ n 2 , with probability at least1− τ ,|∂ k,l U n (θ )|≤ K for all1≤ k,l≤ p and∥ b θ n ∥≤ δ . Then, forn≥ n 2 with probability at least1− τ |Q T n,k,j b θ n |≤ K √ pδ <ϵ, or equivalently, ∥U ′ n ( b θ n )− U ′ n (0)∥ ∞ <ϵ 23 where∥A∥ ∞ =max i,j |A i,j | forA∈R p× p . The claim follows, sinceϵ andτ are arbitrary, andU ′ n (0)→ p Γ by assumption. The next result of this work provides conditions under which a consistent estimator sequence, properly centered and scaled, converges in distribution. Theorem 3.2.2. Suppose the sequence of solutions b θ n ,n ≥ 1 to (3.6) is consistent forθ 0 , that (3.11) and the second condition of (3.10) hold for some sequencea n ,n≥ 1 of real numbers, that the matrixΓ in (3.10) is non-singular and thatU n (θ ) is twice continuously differentiable in an open set Θ 0 ⊂ Θ containingθ 0 . Further, letb n be a sequence of real numbers such that for some random variableY, b n U n (θ 0 ) → d Y. (3.15) Then b n a n ( b θ n − θ 0 )→ d − Γ − 1 Y. Proof: As in the proof of Theorem 3.2.1, by replacinga n U n byU n without loss of generality takea n = 1, and also as done there, takeθ 0 =0. Since a limit in distribution does not depend on events of vanishingly small probability, by the consistency of b θ n and (3.11) it may be assumed that for eachn, sufficiently large, that b θ n ∈ Θ 0 , and for some K that|∂ k,j U n (θ )| ≤ K for all 1 ≤ j,k ≤ p andθ ∈ Θ 0 . For such n the expansion (3.9) holds, and substituting b θ n forθ and usingU n ( b θ n )=0 yields − U n (0)=(U ′ n (0)+ϵ n ) b θ n :=Γ n b θ n where (ϵ n ) j,l = 1 2 p X k=1 b θ n,k ∂ k,l U n,j (θ ∗ n,j ). By the Cauchy-Schwarz inequality, |(ϵ n ) j,l |≤ K √ p 2 ∥ b θ n ∥→ p 0. 24 HenceΓ n → p Γ so thatΓ − 1 n exists with probability tending to 1, and converges in probability toΓ − 1 . Now using (3.15), Slutsky’s theorem, on an event of probability tending to one asn tends to infinity, b n b θ n =Γ − 1 n b n Γ n b θ n =− Γ − 1 n (b n U n (0))→ d − Γ − 1 Y. In the most common case the distributional convergence in (3.15) is to the normal, and shown by applying the Central Limit Theorem to a sum of independent random vectors. This situation is illustrated in the following lemma, in which distributional limits that may have covariance matrices of less than full rank are included. For a given vectoru and non-negative definite matrix Σ , X ∼ N(u,Σ) when E[e t T X ]=exp 1 2 t T Σ t+t T u . In particular, in one dimensionN(u,0) is unit mass atu. Lemma 3.2.1. LetA ℓ ,ℓ = 1,2,... be a sequence of arbitrary index sets satisfying|A ℓ | → ∞ asℓ → ∞, and let{X ℓ,a ,a ∈ A ℓ } be a collection ofR d valued independent, mean zero random vectors such that for some matrixΣ and someδ > 0 lim ℓ→∞ X a∈A ℓ Var(X ℓ,a )=Σ and lim ℓ→∞ X a∈A ℓ ,1≤ k≤ d E h |X ℓ,a,k | 2+δ i =0. (3.16) Then S ℓ = X a∈A ℓ X ℓ,a satisfies S ℓ →N(0,Σ) asℓ→∞. 25 Proof: First the result inR will be proved. By the Lindeberg theorem, (e.g. Theorem 3.4.5, [19])if for all ℓ≥ 1 the random variables{X ℓ,a ,a∈A l } are independent, mean zero, and satisfy lim ℓ→∞ X a∈A ℓ Var(X ℓ,a )=σ 2 >0, and for allϵ> 0 lim ℓ→∞ X a∈A ℓ E[X 2 ℓ,a 1(|X ℓ,a |≥ ϵ )]=0, (3.17) then S ℓ → d N(0,σ 2 ) where S ℓ = P a∈A ℓ X n,i . InR, the second condition in (3.16) implies the second condition in (3.17), as for anyϵ> 0, withp=1+δ/ 2 andq =1+2/δ , using Hölder’s inequality followed by Markov’s, E[X 2 ℓ,a 1(|X ℓ,a |≥ ϵ )]≤ E[X 2p ℓ,a ] 1/p P(|X ℓ,a |≥ ϵ ) 1/q ≤ E[X 2p ℓ,a ] 1/p E[X 2p ℓ,a ] ϵ 2p ! 1/q = E[X 2p ℓ,a ] ϵ 2p/q = E[X 2+δ ℓ,a ] ϵ 2p/q . Hence, the claim holds inR when the limiting variance is positive. When this limit is zero, Chebyshev’s inequality yields thatS ℓ → p 0, and henceS ℓ converges as well to zero in distribution, which is the normal distribution with mean and variance 0. Hence the conclusion of the lemma holds ford=1. In general, given a collection of random vectors satisfying the given hypotheses, takingv to be of norm 1, the variablesY ℓ,a =v T X ℓ,a fora∈A ℓ are independent and mean zero for eachℓ, and satisfy the first condition of (3.16) withΣ replaced byv T Σ v, and the second condition of (3.16) by virtue of this condition holding by assumption for the vector arrayX ℓ,a , that∥v∥ ∞ ≤ 1 and hence that |Y ℓ,a | 2+δ =|v T X ℓ,a | 2+δ ≤ X a∈A ℓ ,1≤ k≤ d |X ℓ,a,k | 2+δ . As the claim holds ind = 1 for linear combinations given by anyv of norm 1, the general result follows by the Cramer-Wold device. □ . 26 3.2.3 Examples In the section the scope of the results in Section 3.2.2 is demonstrated by presenting two applications, one to least squares and the other to maximum likelihood. The following lemma, a direct application of the dominated convergence theorem, is used to handle the technical matter of interchanges between integration and differentiation with respect to θ ∈Θ ⊂ R p . Lemma3.2.2. Letf :R m × Θ →R be differentiable with respect to θ in an open setΘ 0 ⊂ Θ , and suppose that there existsg :R m →R such that ∂ θ f(x;θ ) ≤ g(x) for allθ ∈Θ 0 and Z R m g(x)dx<∞. Then for allθ ∈Θ 0 , ∂ θ Z R m f(x;θ )dx= Z R m ∂ θ f(x;θ )dx. (3.18) Example3.2.1. Least squares estimation. Suppose the observations y i =f(x i ,θ 0 )+ϵ i i=1,...,n wheref(x i ,θ ),θ ∈ Θ ⊂ R is some specified parametric family of functions; one dimensional parameter is taken here for simplicity. θ 0 is estimated via least squares, minimizing J n (θ, x (n) )= 1 2n n X i=1 (f(x i ,θ )− y i ) 2 = 1 2n n X i=1 (f(x i ,θ )− f(x i ,θ 0 )− ϵ i ) 2 . 27 Assume that f(x,θ ) has three continuous derivatives with respect to θ that are uniformly bounded, say by K, over some open subset Θ 0 of Θ that contains θ 0 , and that ϵ 1 ,ϵ 2 ,... are independent random variable distributed asϵ , a mean zero, varianceσ 2 random variable withE|ϵ | 2+η =τ 2+η <∞ for someη > 0. Taking one derivative with respect toθ , the estimating equationU n (θ )=0 is obtained where U n (θ )= 1 n n X i=1 (f(x i ,θ )− f(x i ,θ 0 )− ϵ i )∂ θ f(x i ,θ ) so in particular U n (θ 0 )=− 1 n n X i=1 ϵ i ∂ θ f(x i ,θ 0 ). (3.19) The first condition of (3.10) of Theorem 3.2.1 is satisfied with a n = 1, as the errors have zero mean, are uncorrelated and have uniformly bounded variances, implying thatE θ 0 [U n (θ 0 )]=0 andVar θ 0 [U n (θ 0 )]→ 0. Regarding the second condition of (3.10) taking another derivative, it is obtained that U ′ n (θ 0 )= 1 n n X i=1 (∂ θ f(x i ,θ )) 2 +(f(x i ,θ )− f(x i ,θ 0 )− ϵ i )∂ 2 θ f(x i ,θ ) θ 0 = 1 n n X i=1 (∂ θ f(x i ,θ 0 )) 2 − 1 n n X i=1 ϵ i ∂ 2 θ f(x i ,θ 0 ). (3.20) Arguing as for (3.19), the second sum tends to zero in probability. Now takex i ,i = 1,2,... to be inde- pendent random vectors distributed as somex, then the law of large numbers yields that 1 n n X i=1 (∂ θ f(x i ,θ 0 )) 2 → p γ =E θ 0 (∂ θ f(x i ,θ 0 )) 2 , (3.21) showing the second condition of (3.10), and this limit will be positive when ∂ θ f(x,θ 0 ) is a non-degenerate random variable, thus verifying the final condition in (3.10) in that case. It is easy to see that taking another derivative in (3.20) yields an average of functions that are bounded overΘ 0 ,plusaweightedaverageoftheerrorvariables,eachonemultipliedbysomeboundedfunction. Asthe 28 second weighted average can be seen to be bounded in probability by applying reasoning similar to that used for the scoreU n (θ 0 ), condition (3.11) holds. The only remaining verification needed to invoke Theorem 3.2.2 is to show the properly scaled score at θ 0 has a limiting distribution. Takingb n = √ n, it follows that Var 1 √ n U n (θ 0 ) →σ 2 γ, by (3.21), and in addition using the representation ofU n (θ 0 ) from (3.19), n X i=1 E ϵ i ∂ θ f(x i ,θ 0 ) √ n 2+δ ≤ K 2+δ n − δ/ 2 τ 2+δ →0. Hence, invoking Lemma 3.2.1, for any consistent sequence of roots, √ n b θ n − θ 0 → d N(0,σ 2 γ − 1 ). Example 3.2.2. Maximum likelihood. Let p(x,θ ),θ ∈ Θ 0 be a family of density functions for Θ ⊂ R p , and for someθ 0 ∈ Θ , letX 1 ,...,X n be independent random vectors with densityp(x,θ 0 ). Letp(x,θ ) be three times continuosly differentiable in θ with the first two derivatives of p(x,θ ), and the third derivative of q(x,θ )=logp(x,θ ), dominated by an integrable function in some neighborhoodΘ 0 ofθ 0 . Assume further that the Fisher information matrix atθ 0 is positive definite. The maximum likelihood estimate ofθ 0 is obtained by maximizing the log likelihood of the data, and hence given by a solution to the estimating equation (3.6) with U n (θ )= 1 n n X i=1 ∂ θ p(X i ,θ ) p(X i ,θ ) . 29 By Lemma 3.2.2, forθ ∈Θ 0 we have E θ [∂ θ logp(X,θ )]= Z R ∂ θ p(x,θ )dx=∂ θ Z R p(x,θ )dx=0, (3.22) and likewise that the Fisher informationI(θ ) satisfies I(θ )=− E θ [∂ 2 θ logp(X,θ )]=Var θ (∂ θ logp(X,θ )). Hence, by the law of large numbers the first two conditions of (3.10) are satisfied with a n = 1 and Γ = I(θ 0 ), and the last holds by our assumption on the Fisher information. Next it is shown that (3.11) is satisfied. Writing ∂ j short for∂ θ j , U n (θ )= 1 n n X i=1 ∂ θ q(X i ,θ ) and hence ∂ k,l U n,j (θ )= 1 n n X i=1 ∂ k,l,j q(X i ,θ ). Condition (3.11) can be verified by invoking the following uniform strong law of large numbers with h(x,θ ) applied to the components∂ k,l,j q(x,θ ). Theorem3.2.3 (Le Cam 1953, Corollary 4.1, [37]). LetΘ beacompactmetricspaceandχ aspaceonwhich a probability distributionF is defined. Let h(x,θ ) be measurable inx for eachθ ∈ Θ and continuous inθ foralmosteveryx. AssumethereexistsK(x)suchthatE[K(X)]<∞and|h(x,θ )|≤ K(x)forallxand θ . Then, withm(θ )=E[h(X,θ )], P lim n→∞ sup θ ∈Θ 1 n n X i=1 h(X i ,θ )− m(θ ) =0 ! =1, whereX 1 ,X 2 ,... are independent with distributionF. 30 Lastly, under the given assumptions, the classical central limit theorem yields √ nU n (θ 0 )→ d N(0,I(θ 0 )) so that, via Theorem 3.2.2, √ n b θ n − θ 0 → d N(0,I(θ 0 ) − 1 ). For the exponential family p(x;θ )=h(x)exp(η (θ )T(x)− A(θ )) it is obtained that q(x;θ )=logh(x)+η (θ )T(x)− A(θ ). Hence, the needed conditions are satisfied if A(θ ) andη (θ ) have three bounded derivatives in some neigh- borhood ofθ 0 , andE θ 0 [T(X)] exists. 3.3 Applicationtoadiffusionequationmodel To more fully specify the output function of the diffusion model arising from PDE (2.1) as described in Subsection 3.1.1, consider the parameter space Q ={(q 1 ,q 2 )∈R 2 : q 2 >0}, (3.23) and for given matricesD N ,E N ∈R N× N , a vectorF ∈R N andq∈Q, recall from (3.4) that A N =A N (q)=q 1 D N +E N , B N =B N (q)=q 2 F N , (3.24) 31 and that the TAC at timet is given by f u (t;q)= Z t 0 C N e A N (t− s) B N u(s)ds, (3.25) where(C N ) T ∈ R N , andu(s) is the BAC/BrAC at times. Though the methods developed here work in the given generality, in the physics based model the matrixA N will have eigenvalues with negative real parts, andq 1 will be strictly positive. The dependence off onA N ,B N ,C N ,u orq may be dropped in the following for ease of notation, or included to emphasize some particular feature of interest. Consider an individual whose data has been collected overi = 1,...,n drinking sessions, where the BrAC curveu i for episodei is integrable on[0,T i ], and for someq 0 ∈Q andm i observations of TAC plus a mean zero error y ij =f u i (t i,j ;q 0 )+ϵ i,j , (3.26) are taken at the times 0 ≤ t i,1 ≤ ··· ≤ t i,m i ≤ T i ≤ T , for some T > 0. For notational simplicity some of the parameters in (3.26) may be suppressed, for instance, denotingf u i (t i,j ;q) byf ij (q), say. The observation times of episodei as the probability measure putting mass1/m i on each observation time are encoded, and form the vector of probability measuresν n = (ν 1,m 1 , ,...,ν n,mn ). Whenn = 1, that is, for the case of a single episode, the indexi is dropped. For asymptotics, a sequence of experiments indexed byℓ = 1,2,... is considered, wheren andm = (m 1 ,...,m n ) may depend onℓ, and hence indexℓ could be used in place ofn,m, though this dependence may at times be suppressed in the notation. In the case of a single drinking episode, that is, whenn = 1, letℓ=m. For consistency and asymptotic normality, it is required that n X i=1 m i →∞ asℓ→∞. (3.27) 32 In the special case where the number of observationsm i for eachn equals a constantm, the requirement (3.27) becomesnm→∞, and in the sub-case of a single drinking episode, thatm→∞. The methods developed in Section 3.2 are applied to the least squares estimator achieved as a solution toU ℓ (q)=0 where U ℓ (q)=∂ q J ℓ (q) with J ℓ (q)= 1 2 P n i=1 m i n X i=1 m i X j=1 (f u i (t ij ;q)− y ij ) 2 , (3.28) and ∂ q produces the gradient. For i ∈ {1,2}, continue to let ∂ i denote taking the partial derivative with respect toq i ; this notation will extend in the natural way to denote higher order, and mixed partial derivatives. Theorem 3.3.1 below gives conditions under which the least squares estimate is consistent and has a limiting, asymptotically normal distribution, and as well provides the form of the limiting covariance matrix. Theorem 3.3.1 is an immediate consequence of Theorems 3.3.2 and 3.3.3, that verify the conditions of Theorems 3.2.1 and 3.2.2 in the previous section. Now the main result of this work regarding the least squares estimator for the diffusion model is presented. Theorem3.3.1. Suppose the errorsϵ i,j ,1≤ i≤ n,1≤ j≤ m i in model (3.26) are mean zero, uncorrelated and have constant positive variance σ 2 , and let u i and ν i,n be, respectively, the BrAC curve and empirical measure of the observation times for episode i = 1,...,n over [0,T i ], with sup i T i < ∞. Assume the existence of the limit Γ= lim ℓ→∞ Γ ℓ where Γ ℓ = n X i=1 m i P n k=1 m k Z T i 0 ∂ q f u i (v;q 0 )∂ q f u i (v;q 0 ) ⊤ dν i,n . (3.29) and thatΓ is positive definite. Further, suppose there exists a constant K such that ∥u i ∥ 1 ≤ K for alli≥ 1, (3.30) 33 that is, that the L 1 norms of all drinking episode BrAC curves are uniformly bounded. Lastly, suppose that (3.27) holds. Then there exists a consistent sequence of solutions b q ℓ to the estimating equationU ℓ (q) = 0, given in (3.28). Ifinadditiontheerrorsϵ i,j arei.i.d.,andforsomeδ > 0satisfyE|ϵ i,j | 2+δ =τ 2+δ <∞,then,alongany such consistent sequenceb q ℓ , √ m(b q ℓ − q 0 )→ d N 0,σ 2 Γ − 1 where m= n X i=1 m i , and b σ 2 ℓ → p σ 2 where b σ 2 ℓ = 1 m n X i=1 m i X j=1 (y ij − f ij (b q ℓ )) 2 . (3.31) When the errorsϵ i,j in (3.26) are Gaussian, then the least squares estimate that minimizes the sum of squares (3.5) is also maximum likelihood. In this case the contribution to the Fisher information from the single observation in (3.26) is obtained by taking the covariance matrix of the gradient of the log of the density of the observation, Var(∂ q logp(y ij ;f ij ))=Var ∂ q log 1 √ 2πσ exp − 1 2σ 2 (y ij − f ij ) 2 =Var ∂ q − 1 2σ 2 (y ij − f ij ) 2 =Var ϵ ij σ 2 ∂ q f ij = 1 σ 2 ∂ q f ij ∂ q f ⊤ ij . Summing over the observation times yieldsmΓ ℓ /σ 2 as in (3.29), hence taking the limit and comparing with the asymptotic variance obtained we see that for normal errors the least squares estimate ofq achieves the lower bound of the information inequality in an asymptotic sense. To discuss the satisfaction of condition (3.29) recall that a sequence of measures{ν m ,m≥ 1} onR is said to converge weakly to a measureν if for all bounded continuous functionsg :R→R lim m→∞ Z R g(v)dν m = Z R g(v)dν. (3.32) 34 By considering components, the same relation holds when g continuously maps [0,T] to the space of matrices of some fixed dimension. In particular, when ν i,n has weak limit ν i then, as the integrand is bounded and continuous by Lemma 3.3.3, the integral in termi of the sum in (3.29) will converge to the same integral with respect to ν i . When ν m is the m th element of the sequence of discrete probability measures that gives equal weights to the times t m,1 ,...,t m,m in [0,T] with weak limit ν , then for any continuous functiong :[0,T]→R, 1 m m X j=1 g(t m,j )= Z T 0 g(v)dν m → Z T 0 g(v)dν. There are two special cases of note. One is where the distances between consecutive observation times on [0,T] are constant; in this case, the weak limit is the uniform probability measure on [0,T]. A second case is when the observation times are chosen independently according to a probability measure ν supported on [0,T]; in this case, the weak limit in probability is ν , where the convergence in (3.32) is in probability, or almost sure if all the observation times are generated successively by a common iid sequence, see [70]. Consider also the situation where the data fromn drinking episodes are independent and identically distributed replicates of the error distribution and canonical M,T,u,ν , where M is the distribution of m i ,1 ≤ i ≤ n, making the summands in (3.29) i.i.d. When T < ∞ a.s. Lemma 3.3.3 shows that the integrals in (3.29) are uniformly bounded, and one can show that asn→∞, Γ n → p Γ= E M E[M] Z T 0 ∂ q f u (v;q 0 )∂ q f u (v;q 0 ) ⊤ dν , where the expectation is taken overM,T,u andν , whenever the expectation on the right hand side exists. See also Assumption 3.4.1, and the discussion following, in Section 3.4. 35 Before proceeding, the smoothness of the derivatives off u (t;q) in (3.25) with respect toq = (q 1 ,q 2 ) must be verified. Because of the form of the dependence of the matrix A N on q 1 as given in (3.24), to differentiate f with respect toq 1 , directional derivatives of matrix exponentials need to be considered. For square matricesW andV of the same dimension andv∈R, define the first derivative of e vW in direction V by D 1 V (v,W)= lim h→0 exp(v(W +hV))− exp(vW) h . Higher order derivativesD k V (v,W),k ≥ 0 are defined in the natural way, with k = 0 returning e vW . Now withA N =q 1 D N +E N as in (3.24), the partial derivative with respect toq 1 ofe vA N is represented as ∂ 1 e vA N =∂ 1 e v(q 1 D N +E N ) = lim h→0 e v((q 1 +h)D N +E N ) − e v(q 1 D N +E N ) h = lim h→0 e v(A N +hD N ) − e vA N h =D 1 D N (v,A N ), and extending to higher order derivatives ∂ n 1 (e (q 1 D N +E N )v )=D n D N (v,A N ). (3.33) For anyn≥ 0, lettingB n be the(n+1)× (n+1) block matrix given by B n = W V 0 ··· 0 0 W V ··· 0 ··· ··· ··· ··· ··· 0 0 ··· 0 W , (3.34) 36 Theorem 4.13 of [42] (and also [59]) provides that e vBn = e vW D 1 V (v,W) 1! D 2 V (v,W) 2! ··· D n V (v,W) n! 0 e vW D 1 V (v,W) 1! ··· D n− 1 V (v,W) (n− 1)! ··· ··· ··· ··· ··· 0 0 ··· 0 e vW . (3.35) Now apply (3.35) to obtain bounds on higher order derivatives of the matrix exponentiale vA N with respect toq 1 . Lemma 3.3.1. LetW andV be square matrices of the same dimension. Then for alln≥ 0 the directional derivativeD n V (v,W) is analytic inv onR and satisfies the bound ∥D n V (v,W)∥≤ n!∥e vBn ∥ for allv∈R, (3.36) whereB n is given by (3.34). Foralln≥ 0,q 1 ∈RandA N =q 1 D N +E N ,thepartialderivative∂ n 1 e A N v exists,isanalyticinq 1 and, withB n given by (3.34) withW =A N andV =D N , satisfies ∥∂ n 1 e vA N ∥≤ n!e v∥Bn∥ with sup q 1 ∈Q ∥B n ∥<∞ forQ any bounded subset ofR. Proof: As the left hand side e vBn of (3.35) is analytic in each component, the matrix on the right hand side must also be analytic, thus yielding the first claim. Next, for F the submatrix obtained by taking row 37 and column indicesi,j of a given matrixE, applying an alternate form for the spectral norm in the first equality, ∥F∥= sup ∥y∥=1,∥x∥=1 y ⊤ Fx≤ sup ∥v∥=1,∥w∥=1 v ⊤ Ew =∥E∥, as any value over which the first supremum is taken can be achieved in the second by padding x andy with zeros in coordinates that are not ini andj, respectively. Hence, inequality (3.36) now follows from (3.35). The remaining claims now follow in light of (3.33) and that∥B n ∥ is continuous in q 1 , and hence bounded on any set with compact closure. □ The following result is required so to handle the derivatives of matrix products. For k ≥ 0,Q as in (3.23), a matrixM is said to be depending on(q,v)∈Q× R isk-smooth if for any0≤ a,b≤ k, the mixed partials∂ a 1 ∂ b 2 M exist and are continuous forq∈Q, and for any bounded subsetsD⊂ Q andI ⊂ R, sup (q,v)∈D× I,0≤ a,b≤ k ∥∂ a 1 ∂ b 2 M∥<∞. M is said to be smooth if it isk-smooth for allk≥ 0. Lemma3.3.2. LetM i ,i=1,...,d be matrices having dimensions such that the product M = d Y i=1 M i . may be formed. IfM 1 ,...,M d arek-smooth then so isM. Proof: The proof follows directly from the multivariate Leibniz rule that expresses the derivative∂ a 1 ∂ b 2 M i for 0 ≤ a,b ≤ k as a finite linear combination of products of derivatives of M i , each one with order no greater thank, and recalling that for conformable matrices∥AB∥≤∥ A∥∥B∥. □ The next lemma provides additional smoothness estimates, and the forms of derivatives that later appear. 38 Lemma3.3.3. Forq∈Q, and ∂ 1 (e A N v B N )=∂ 1 (e A N v )B N and ∂ 2 (e A N v B N )=q − 1 2 e A N v B N . (3.37) For allv∈R, the matrix functione A N v B is smooth inq. Ifγ (·) is integrable on[0,T], thenf γ (t;q) as in (3.25) is smooth inq, continuous fort∈[0,T] and satisfies ∂ a 1 ∂ b 2 f γ (t;q)= Z t 0 C N ∂ a 1 e A N (t− s) F N γ (s)ds ∂ b 2 q 2 , and sup t∈[0,T] |∂ a 1 ∂ b 2 f γ (t;q)|≤ a!e T∥Ba∥ ∥γ ∥ 1 ∂ b 2 q 2 , whereB n is defined in Lemma 3.3.1. Proof: The claims in (3.37) follow by recalling thatA N does not depend onq 2 and thatB N =q 2 F N . That e A N v is smooth follows from Lemma 3.3.1, and one easily verifies the smoothness of B N directly from (3.24); hence, the product is smooth by Lemma 3.3.2. Differentiation under the integral with respect to q 1 is then justified by the dominated convergence theorem, from which the smoothness of f γ (t;q) inq and the bound on its partials then follows, in light of Lemma 3.3.1; continuity off γ (t;q) fort∈[0,T] follows immediately from the integral representation (3.25). A lemma that is used to verify the conditions of Theorems 3.2.1 and 3.2.2 is presented. Lemma3.3.4. The score functionU ℓ given by (3.28) may be written as U ℓ (q)=V ℓ,1 (q)− V ℓ,2 (q) (3.38) 39 where V ℓ,1 (q)= 1 P n i=1 m i n X i=1 m i X j=1 ∂ q f ij (q)(f ij (q)− f ij (q 0 )) and V ℓ,2 (q)= 1 P n i=1 m i n X i=1 m i X j=1 ∂ q f ij (q)ϵ ij , (3.39) and U ℓ (q 0 )=− 1 P n i=1 m i n X i=1 m i X j=1 ∂ q f ij (q 0 )ϵ ij and U ′ ℓ (q 0 )=Γ ℓ − 1 P n i=1 m i n X i=1 m i X j=1 ∂ 2 q f ij (q 0 ). (3.40) Whentheerrorsϵ ij ,1≤ i≤ n,1≤ j≤ m i havemeanzerowithcommonvarianceσ 2 andareuncorrelated, then Var(U ℓ (q 0 ))= σ 2 P n i=1 m i Γ ℓ . (3.41) Proof: The decomposition (3.38) and the expressions in (3.39) that result follow directly from (3.26), and then the first expression in (3.40) may be obtained by evaluation at q 0 . Identity (3.41) now follows from the first expression in (3.40), which yields that Var(U ℓ (q 0 ))= σ 2 ( P n i=1 m i ) 2 n X i=1 m i X j=1 ∂ q f ij (q 0 )∂ q f ij (q 0 ) ⊤ , followed by (3.29). Differentiation now yields V ′ ℓ,1 (q 0 )= 1 P n i=1 m i n X i=1 m i X j=1 ∂ q f ij (q 0 )∂ q f ij (q 0 ) ⊤ and 40 V ′ ℓ,2 (q 0 )= 1 P n i=1 m i n X i=1 m i X j=1 ∂ 2 q f ij (q 0 )ϵ ij , thus yielding the second expression in (3.40) in light of (3.29). Theorem3.3.2. Suppose the errorsϵ i,j ,1≤ i≤ n,1≤ j≤ m i in model (3.26) are mean zero, uncorrelated and have constant positive variance σ 2 . Assume in addition that the limit Γ in (3.29) exists and is positive definite,andthat (3.27)holds. ThenwithU ℓ givenby (3.28),thehypotheses (3.10)ofTheorem3.2.1aresatisfied withΓ asin (3.29),a n =1andanyboundedneighborhoodΘ 0 ⊂ Qofq 0 . Ifinadditionthereexistsaconstant K such that (3.30) holds, then (3.11) also holds. Proof: Let Θ 0 be any bounded neighborhood ofq 0 . By Lemma 3.3.3, the partial derivatives of f ij (q) := f u i (t i,j ;q) of (3.25) of all orders exist, and are continuous and uniformly bounded overΘ 0 . HenceU ℓ is twice continuously differentiable, with uniformly bounded derivatives, over Θ 0 . The first condition in (3.10) holds as E[U ℓ (q 0 )] = 0 via the first identity in (3.40) of Lemma 3.3.4, and by (3.41) in that same lemma, which yields thatVar(U ℓ (q 0 ))→0 by virtue of (3.29) and (3.27). Focusing now on the the second identity in (3.40), to show the second condition in (3.10), as the limit Γ in (3.29) exists, it suffices that the second term in that identity tends to zero in probability. In fact, the variance of this mean zero term tends to zero, since the components of the matrix∂ 2 q f ij (q 0 ) are uniformly bounded on Θ 0 by Lemma 3.3.3 and multiply mean zero, uncorrelated random variables with uniformly bounded variances. The matrixΓ is positive definite by assumption, so the last condition in (3.10) holds. Lastly, it will be shown that the inequality (3.11) is satisfied. From the decomposition (3.38) ∂ k,l U ℓ,r (q) may be written as a difference of the form R ℓ − S ℓ := 1 P n i=1 m i n X i=1 m i X j=1 g 1,ij (q)− 1 P n i=1 m i n X i=1 m i X j=1 g 2,ij (q)ϵ i,j 41 where, by Lemmas 3.3.2, 3.3.3 and (3.30), there existsK 1 such that sup q∈Θ 0 ,p∈{1,2} g p,ij (q) ≤ K 1 . Hence, for the first component, |R ℓ |= 1 P n i=1 m i n X i=1 m i X j=1 g 1,ij (q) ≤ 1 P n i=1 m i n X i=1 m i X j=1 K 1 =K 1 , while for the second component, Var(S ℓ )≤ σ 2 ( P n i=1 m i ) 2 n X i=1 m i X j=1 K 2 1 ≤ σ 2 K 2 1 P n i=1 m i →0. Hence, for anyτ ∈ (0,1), by Chebyshev’s inequality,K 2 may be picked such thatP(|S ℓ |≥ K 2 )≤ τ/ 8 for alln≥ 1. Thus, settingK =K 1 +K 2 ,for alln≥ 1, it is obtained P (|R n − S n |>K,q∈Θ 0 )≤ P (|R n |+|S n |>K,q∈Θ 0 ) ≤ P (K 1 +|S n |>K 1 +K 2 ,q∈Θ 0 ) =P (|S n |>K 2 ,q∈Θ 0 )≤ τ 8 . The claim now follows by taking a union bound over the eight choices fork,l andr. □ Theorem3.3.3. Assumetheerrorsϵ ij ,1≤ i≤ n,1≤ j≤ m i arei.i.dwithmeanzero, varianceσ 2 andfor someδ > 0 satisfyE|ϵ ij | 2+δ = τ 2+δ <∞. Assume that (3.27) holds and that the limitΓ as given in (3.29) exists. Then forU ℓ (q) given by (3.28), b ℓ U ℓ (q 0 )→ d N 0,σ 2 Γ where b ℓ = v u u t n X i=1 m i . 42 Proof: Verify (3.16) of Lemma 3.2.1. The first condition holds by (3.41) of Lemma 3.3.4, and that the limit Γ in (3.29) exists. For the second condition of (3.16), by (3.40), write b ℓ U ℓ (q 0 )= n X i=1 m i X j=1 X ij where X ij =− 1 p P n i=1 m i ∂ q f ij (q 0 )ϵ ij . By the assumptionE|ϵ ij | 2+δ ≤ τ 2+δ and Lemma 3.3.3 there existsC such that X 1≤ i≤ n,1≤ j≤ m i ,1≤ k≤ 2 E|X ij,k | 2+δ ≤ Cτ 2+δ ( P n i=1 m i ) δ/ 2 , which tends to zero by (3.27). This section is concluded with: Proof of Theorem 3.3.1: Theorems 3.3.2 and 3.3.3 show that the hypotheses of Theorems 3.2.1 and 3.2.2 are satisfied, yielding the claims for consistency and asymptotic normality. It remains to prove the claims on the consistency of the variance estimator. By (3.31), and recalling m= P n i=1 m i , we have b σ 2 ℓ = 1 m n X i=1 m i X j=1 (ϵ ij +f ij (q 0 )− f ij (b q ℓ )) 2 = 1 m n X i=1 m i X j=1 ϵ 2 ij +2ϵ ij (f ij (q 0 )− f ij (b q ℓ )+(f ij (q 0 )− f ij (b q ℓ ) 2 . The first term tends to σ 2 in probability by the weak law of large numbers. To handle the second term, letting R= 1 m n X i=1 m i X j=1 |ϵ ij | then E[R]= 1 m n X i=1 m i X j=1 E|ϵ ij |≤ 1 m n X i=1 m i X j=1 q Eϵ 2 ij =σ. 43 With anyr ∈ (0,1) such that the ballB r of radiusr centered atq 0 is contained inQ, Lemmas 3.3.1 and 3.3.3 show that the first derivatives of f j (q) are uniformly bounded for (q,t) ∈ B r × [0,T], and hence, that there exists someK >0 such that over this set |f ij (q 0 )− f ij (q)|≤ K∥q 0 − q∥. (3.42) Letδ ∈(0,K) be arbitrary,F ={|q 0 − b q ℓ |≤ δ/K } and note that S = 1 m n X i=1 m i X j=1 ϵ ij (f ij (q 0 )− f ij (b q ℓ ) satisfies S1 F ≤ δR 1 F . By Markov’s inequality, for anyτ > 0, P(S≥ τ )≤ P(S1 F ≥ τ )+P(F c )≤ P(δR 1 F ≥ τ )+P(F c )≤ δE [R1 F ] τ +P(F c ) ≤ δE [R] τ +P(F c )≤ δσ τ +P(F c )→ δσ τ , using the non-negativity ofR in the fourth inequality, and the consistency ofq ℓ when taking the limit. As δ can be made arbitrarily small it is concluded thatP(S≥ τ )→0, and asτ is arbitrary, thatS→ p 0. Similarly decomposing the third term on the good eventF and its complement, onF the inequality (3.42) shows that this term is bounded as K 2 m n X i=1 m i X j=1 ∥q 0 − b q ℓ ∥ 2 =K 2 ∥q 0 − b q ℓ ∥ 2 , which tends to zero in probability in view of the consistency ofb q ℓ . □ 44 3.4 InferenceontheBrACcurve In this section confidence bounds on the BrAC curve generated by a drinking episode of a subject in the field are, and estimated using η TAC observations and an estimateq m computed fromm measurements in a previous calibration experiment are obtained. The notation here differs from that used in previous sections, the parametern now being absorbed in the numberm of total observations for calibration. The uniform confidence bounds for the reconstructed BrAC curve are obtained by applying a variation on the standard multivariate delta method, as given in Theorem 7 in [23], using the properties provided by Theorem 3.3.1 onq m , and the assumed properties of the TAC measurement error. First, the process of obtaining the estimate of the BrAC curve is specified in detail. Independently of q m , n TAC observations y η, 1 ,...,y η,η are collected from a drinking episode at the increasing times 0≤ s η, 1 <··· 0 such that λ 1 (H 0 (q)) > ϵ for allq in some bounded open neighborhoodQ 0 centered atq 0 , and with radiusϵ taken small enough that the closure ofQ 0 lies inQ. By continuity this same inequality holds in the non-strict sense over the closure, thus showing the claim made following (3.60). 52 By the first claim in (3.59) and under the assumption bounding ∥M η − M 0 ∥, it is concluded that there exists a constantC such that∥H η (q)− H 0 (q)∥≤ Cr η . Asr η → 0,r η < ϵ/ 2 for allη sufficiently large. Hence, for suchη by (3.62) theninf q∈Q 0 λ 1 (H η (q))>ϵ/ 2, and therefore, for allq∈Q 0 , ∥H − 1 η (q)− H − 1 0 (q)∥=∥H − 1 η (q)(H η (q)− H 0 (q))H − 1 0 (q)∥ ≤∥ H − 1 η (q)∥∥H η (q)− H 0 (q)∥∥H − 1 0 (q)∥≤ 2C ϵ 2 r η , this demonstrating the first claim in (3.61). The second follows from the first, and the triangle inequality. □ Now moving to the properties ofβ η (q m ), note the decomposition β η (q m )− β 0 (q 0 )=(β η (q m )− β 0 (q m ))+(β 0 (q m )− β 0 (q 0 )), (3.63) and for the first term, applying (3.49), β η (q m )− β 0 (q m )= H η (q m ) − 1 − H 0 (q m ) − 1 Z η (q m )+H 0 (q m ) − 1 (Z η (q m )− Z 0 (q m )) +H η (q m ) − 1 ϵ η (q m ). (3.64) Now, a consistency result forβ η (q m ), will be proved and it will be applied to show that the BrAC curve estimate converges uniformly in probability tou 0 (s;q 0 ), given in (3.51). Theorem3.4.1. Supposethatq m isconsistentforq 0 asm→∞,thatH 0 (q 0 )isinvertible,andthattheerror variablesϵ 1 ,...,ϵ η have mean zero, and are uncorrelated with variances dominated byσ f 2 . In addition, let Assumption 3.4.1 hold. Then whenm andη tend to infinity β η (q m )→ p β 0 (q 0 ), 53 and whensup k≥ 1 ∥ϕ k ∥<∞, the supremum norm∥u η (·;q m )− u 0 (·;q 0 )∥→ p 0. Proof: By the triangle inequality, to showβ η (q m ) is consistent forβ 0 (q 0 ), it suffices to verify that the two terms on the right side of tend to zero in probability. The first term converges to zero in probability by the consistency ofq m forq 0 , which in particular guarantees thatq m will lie in the setQ 0 given by Lemma 3.4.2 with probability tending to one, and then also invoking (3.64), Lemma 3.4.2 and Assumption 3.4.1. The second term tends to zero by virtue of the consistency ofq m and the continuity of the functionβ 0 (·). The last claim follows from the first, using that ϕ (s) is bounded on[0,S], and from (3.51) yielding sup s∈[0,S] |u η (s;q m )− u 0 (s;q 0 )|≤∥ ϕ ∥∥β η (q)− β 0 (q 0 )∥. □ The following distributional limit theorem the first term in (3.63) is proved. Lemma3.4.3. AssumeH 0 (q) − 1 exists forq 0 ∈Q. In addition, let the errorsϵ i ,i≥ 1 be independent, mean zerowithcommonvarianceσ 2 f ∈(0,∞)andindependentofq m ,andsupposeforsomeδ > 0andτ > 0that E|ϵ i | 2+δ ≤ τ 2+δ for alli≥ 1. If √ m(q m − q 0 ) = O p (1), Assumption 3.4.1 holds, and thatη,m →∞ so thatmr 2 η /η →0, then, withY ∼ N(0,σ 2 f H 0 (q 0 ) − 1 G 0 (q 0 )H 0 (q 0 ) − 1 ), H η (q m ) − 1 ϵ η (q m )=H 0 (q 0 ) − 1 ϵ η (q 0 )+o p (1/ √ m) and √ ηH 0 (q 0 ) − 1 ϵ η (q 0 )→ d Y where Y ∼ N(0,σ 2 f H 0 (q 0 ) − 1 G 0 (q 0 )H 0 (q 0 ) − 1 ). (3.65) Ifη/m is bounded away from infinity and √ ηr η →0, then √ η (β η (q m )− β 0 (q m ))→ d Y. (3.66) 54 Note that the condition that √ m(q m − q 0 ) = O p (1) is implied by the conditions of Theorem 3.3.1, as they provide the stronger conclusion that √ m(q m − q 0 ) converges in distribution. Note as well that mr 2 η /η →0 wheneverm andη are of comparable size, andr η →0. Proof: Since √ m(q m − q 0 ) = O p (1) thenq m → p q 0 , and without loss of generality it may be assumed thatq m is contained in the setQ 0 given in Lemma 3.4.2. Writing H η (q m ) − 1 ϵ η (q m )=(H η (q m ) − 1 − H 0 (q m ) − 1 )ϵ η (q m )+H 0 (q m ) − 1 (ϵ η (q m )− ϵ η (q 0 )) +(H 0 (q m ) − 1 − H 0 (q 0 ) − 1 )ϵ η (q 0 )+H 0 (q 0 ) − 1 ϵ η (q 0 ), (3.67) (3.65) is shown by demonstrating that the first three terms tend to zero in probability after scaling by √ m. The squared norm of the covariance matrix of the first term scaled by m and conditional onq, by applying the given assumptions on the errors, can be bounded by m∥(H η (q m ) − 1 − H 0 (q m ) − 1 )Var(ϵ η (q m )|q m )(H η (q m ) − 1 − H 0 (q m ) − 1 )∥≤ Cmr 2 η /η using the first inequality in (3.61) and (3.60); the upper bound clearly bounds the squared norm of the scaled unconditional covariance matrix of the first term. For the second term, properly scaled, define A η,m = √ m(ϵ η (q m )− ϵ η (q 0 ))= √ m η η X j=1 (ψ (s j ,q m )− ψ (s j ,q 0 ))ϵ j . Letτ ∈ (0,1) be given andQ 0 be as in Lemma 3.4.2. Since √ m(q m − q 0 ) = O p (1), there existsM such that liminf m P(Ω M,m )≥ 1− τ where Ω M,m ={ √ m∥q m − q 0 ∥≤ M}, 55 andm is chosen sufficiently large so that the closed ball of radius M/ √ m centered atq 0 lies in the setQ 0 given in Lemma 3.4.1. By the independence betweenq m andϵ j ,j =1,...,η , E(A η,m 1 Ω M,m |q m )=0. (3.68) Hence, via the conditional variance formula, and thatΩ M,m is measurable with respect toq m , it is obtained that Var A η,m 1 Ω M,m =E Var(A η,m 1 Ω M,m |q m ) = mσ 2 f η 2 E η X j=1 (ψ (s j ,q m )− ψ (s j ,q 0 )) 2 1 Ω M,m ≤ σ 2 f L 2 η E[m∥q m − q 0 ∥ 2 1 Ω M,m ] ≤ σ 2 f L 2 η M 2 →0 asη →∞, where Lemma 3.4.1 is used in the first inequality. Hence, as E[A η,m 1 Ω M,m ] = 0 by (3.68), and thatτ > 0 is arbitrary,A η,m →0 asη →∞. By (3.61) of Lemma 3.4.1 and thatq m → p q 0 , the second term iso p (1). For the third term, first note that when q m ∈Q 0 , which occurs with probability tending to 1 asm→∞, there exists a constantC such that ∥H − 1 0 (q m )− H − 1 0 (q 0 )∥=∥H − 1 0 (q m )(H 0 (q m )− H 0 (q 0 ))H − 1 0 (q 0 )∥ ≤∥ H − 1 0 (q m )∥∥H 0 (q m )− H 0 (q 0 )∥∥H − 1 0 (q 0 )∥ =∥H − 1 0 (q m )∥∥G 0 (q m )− G 0 (q 0 )∥∥H − 1 0 (q 0 )∥ ≤ C∥G 0 (q m )− G 0 (q 0 )∥, 56 by (3.61) and (3.56) of Lemma 3.4.1. Now by (3.57) of Lemma 3.4.1 there existsC such that √ m∥G 0 (q m )− G 0 (q 0 )∥≤ C √ m∥q m − q 0 ∥=O(1). Hence, by (3.60), this term will tend to zero asη →∞. The proof of (3.65) is complete. To verify the distributional convergence claimed in (3.65) it suffices that √ η ϵ n (q 0 )= η X i=1 X η,i → d N(0,σ 2 f G 0 (q 0 )) where X η,i = 1 √ η ψ (s i ;q 0 )ϵ i . Lemma 3.2.1, is applied noting that the first condition in (3.16) holds, as G η (q 0 ) converges toG 0 (q 0 ) by Lemma 3.4.2. It remains to verify the second condition in (3.16) to complete the proof of (3.65). The vectorψ (s,q) is uniformly bounded overs∈ [0,S] andq ∈Q 0 by (3.56) of Lemma 3.4.1. Hence, for some constantC, asη →∞, η X i=1 E∥X η,i ∥ 2+δ ≤ 1 η 1+δ/ 2 η X i=1 ∥ψ (s i ;q 0 )∥ 2+δ E|ϵ i | 2+δ ≤ C 2+δ η − δ/ 2 τ 2+δ →0, completing the proof of (3.65). For the final claim, after scaling by √ η , by the triangle inequality that the first two terms in (3.64) tend to zero by the consistency ofq m forq 0 , Lemma 3.4.2 and that √ ηr n →0. The final claim (3.66) now follows from (3.65), applying the assumption thatη/m is bounded. □ Lastly, the asymptotic distribution of the estimated BrAC curve, properly scaled is determined. Using that results, in Remark 3.4.3it is shown how confidence intervals for certain functionals of interest, and a uniform confidence bound, for the reconstructed curve can be constructed asymptotically. An expression 57 for the partials required for the computation of the matrix K defined in (3.69) is provided by (3.58) and (3.57). Theorem3.4.2. Suppose that √ m(q m − q 0 )→ d N(0,σ 2 Γ − 1 ) asm→∞ forsomeinvertiblematrixΓ ,thatH 0 (q 0 )isinvertible,Assumption3.4.1holds,thatϵ i ,i=1,...,η aremean zero random variables with common variance σ 2 f and uniformly bounded 2+η moments for some η > 0, independent of each other and ofq m , and thatsup k≥ 1 ∥ϕ k ∥<∞. Assume also that √ ηr η →0. Ifm/η →ρ ∈[0,∞) then √ m(β η (q m )− β 0 (q 0 ))→ d N(0,σ 2 K T Γ − 1 K +ρσ 2 f H − 1 0 (q 0 )G 0 (q 0 )H − 1 0 (q 0 )), (3.69) whereK =∂ q β 0 (q 0 ) T , and W m (s)= √ m(u η (s;q m )− u 0 (s;q 0 ))→ d W σ,ρ (s) (3.70) whereW σ,ρ (s)=ϕ (s) T σK T Γ − 1/2 Z 1 + √ ρσ f H − 1 0 (q 0 )G 1/2 0 (q 0 )Z 2 asprocessesonthespaceC[0,S]of continuousfunctionson[0,S],endowedwiththesupremumnorm,whereΓ − 1/2 andG 1/2 0 (q 0 )aretheunique positive definite square roots of Γ − 1 and G 0 (q 0 ) respectively, andZ 1 ∼ N 2 (0,I) andZ 2 ∼ N p (0,I) are independent. Ifm/η →ρ =∞then (3.69)and (3.70)holdwiththescaling √ mreplacedby √ η andtheparametersof the limiting distributions in those displays set to(σ,ρ )=(0,1). 58 Remark3.4.1. AsK andΓ in (3.70)dependontheunknownq 0 ,inpracticethesequantitiescanbeestimated by their values along a sequence of consistent estimatesq m ,m≥ 1. AsK andΓ are continuous atq 0 , these resulting estimates will likewise be consistent. Similar remarks apply as to the estimation ofσ 2 f andG 0 (q 0 ). Remark3.4.2. Theboundarycaseρ =0correspondstothesituationwherethenumberofobservationstaken in the field is so large that the variability of the resulting BrAC estimate depends only on the uncertainty in the parameter estimateq 0 , hence asymptotically equivalent to the situation where the field observations are taken without noise. At the other extreme, the caseρ =∞ reflects the situation where the number of observations taken in the calibration experiment in the lab is so large that for the purposes of BrAC estimation, the parameterq 0 is, in a practical sense, known. Remark3.4.3. Whenthedistributionalconvergencein (3.70)holdsthenforanycontinuousF :C[0,S]→R then F(W m (s))→ d F(W σ,ρ (s)) asm→∞. (3.71) Consider the following examples: 1. Foru∈C[0,S] andt∈[0,S] let F t (u)= Z t 0 u(s)ds, that is, the area underu at timet. Of special interest is the caset = S, the total area under the curve. In this case (3.70) and the linearity of the integral yields √ m Z t 0 u η (s:q m )ds− Z t 0 u 0 (s;q 0 )ds → d Z t 0 W σ,ρ (s)ds 59 2. Let F(u)= 1 S Z S 0 |u(s)|ds, that is F(u)=∥u∥ 1 , the scaledL 1 norm ofu on[0,S]. From (3.71) we obtain √ m S Z S 0 |u η (s;q m )− u 0 (s;q 0 )|ds→ d 1 S Z S 0 |W σ,ρ (s)|ds=∥W σ,ρ ∥ 1 . (3.72) Thescaled,orunscaledsuchnormonanynon-emptysub-interval[a,b]⊂ [0,S]mayalsobeconsidered. 3. Taking F(u)= sup s∈[0,S] |u(s)|, that is F(u)=∥u∥ ∞ theL ∞ norm, yields √ m u η (· :q m )− Z t 0 u 0 (·;q 0 ) ∞ → d ∥W σ,ρ ∥ ∞ . (3.73) The performance of the asymptotic confidence intervals these examples yield for the total area under the curve,andL 1 anduniformconfidenceboundsonthereconstructedcurve,respectively,areconsideredinSection 3.6. Forα ∈ (0,1) andL andU having the distributions of∥W σ,ρ ∥ 1 and∥W σ,ρ ∥ ∞ respectively, letl 1− α be the infimum over all t such thatP(L≤ t)≥ 1− α , and define u 1− α similarly usingU in place ofL. Then theL 1 anduniform1− α confidencesetscenteredat u 0 aregivenbyC q l 1− α forq =1andq =∞respectively, where C q β ={u:∥u− u 0 ∥ q ≤ β }. Since ∥W σ,ρ ∥ 1 ≤∥ W σ,ρ ∥ ∞ we have L≤ st U, 60 where≤ st denotesstochasticdominancebetweendistributions. Concludethatl 1− α ≤ u 1− α ,andhenceitmay be derived that C 1 l 1− α ⊂ C 1 u 1− α and that C ∞ u 1− α ⊂ C 1 u 1− α but no implications as to relations betweenC 1 l 1− α andC ∞ u 1− α . Proof: By the delta method as in Theorem 7, Chapter 7 in [23], using that ∂ q β (q) is continuous in a neighborhood ofq 0 by Lemma 3.4.1,it is obtained that √ m(β 0 (q m )− β 0 (q 0 ))→ d σU ∼ N(0,σ 2 K T Γ − 1 K). (3.74) Now suppose thatm/η →ρ ∈[0,∞). By Lemma 3.4.3, adopting the notation in (3.66), √ m(β η (q m )− β 0 (q m ))= r m η ( √ η (β η (q m )− β 0 (q m ))→ d √ ρY, (3.75) and by (3.65)Y is the distributional limit of a quantity not depending onq m , plus a term that tends to zero in probability, thus showing thatU andY are asymptotically independent. Hence, adding the expressions in (3.74) and (3.75), we find that √ m(β η (q m )− β 0 (q 0 ))→ d σU + √ ρY, (3.76) completing the proof of (3.69). Lettingα (η,m ) = √ m(β η (q m )− β 0 (q 0 )), by the definition of W m in (3.70), u η in (3.51), and the convergence in (3.76), for d ≥ 1 the finite dimensional distributions of W m at the d arbitrarily chosen times points0≤ s 1 <...a)≤ τ for allm≥ 1, (3.78) and for everyτ ∈(0,1] and positiveϵ , there existsδ > 0 and an integerm 0 such that P(Ω Wm (δ )≥ ϵ )≤ τ for allm≥ m 0 . (3.79) Condition (3.78) follows from (3.77) with d = 1 and s 1 = 0; as W m (0) converges in distribution, the sequenceW m (0) is tight. Let τ ∈ (0,1] and ϵ > 0 be given. As α (η,m ) converges in distribution there exists C such that P(∥α (η,m )∥≤ C)≥ 1− τ for allm≥ 1. Let δ =inf{ζ :Ω ϕ k (ζ )<ϵ/Cp, 1≤ k≤ p}; 62 this quantity will be positive for allϵ> 0 as each basis function in the set{ϕ k ,k =1,2,...,p} is contin- uous on[0,S], and therefore this set of functions is uniformly continuous. Thus, with probability at least 1− τ , for|s− t|<δ, 0≤ s,t≤ S and allm≥ 1, |W m (s)− W m (t)|=|(ϕ (s)− ϕ (t)) T α (η,m )|≤∥ ϕ (s)− ϕ (t)∥∥α (η,m )∥ ≤ C∥ϕ (s)− ϕ (t)∥≤ C p X k=1 |ϕ k (s)− ϕ k (t)|≤ C p X k=1 Ω ϕ k (δ )≤ ϵ, from which (3.79) follows withm 0 =1. Lastly, consider the case m/η → ∞. Scaling (3.63), by √ η , for the second term, using Slutsky’s theorem, it follows that √ η (β 0 (q m )− β 0 (q 0 ))= r η m √ m(β 0 (q m )− β 0 (q 0 ))→ p 0, as the term in parenthesis convergences in distribution. Hence, the only term contributing in the de- composition (3.63) is (3.66), and the argument for the previous case carries through with essentially no modification. □ Remark 3.4.4. The regularization matrix M n in the objective function (3.46) is used to avoid numerical instability;detailsontherelevantchoiceofM n usedherecanbefoundinSection3.5. However,regularization caninducebias. Toillustrate,assumeforsomepthatthetrueBrACcurveliesinthespanofthebasisfunctions ϕ 1 (s),...,ϕ p (s), so that u(s)=ϕ (s) T β ⋆ for someβ ⋆ ∈R p . (3.80) 63 In the limiting casen=0 of (3.49) forq =q 0 , in light of (3.44), (3.45) and (3.80), β 0 (q 0 )=(G 0 (q 0 )+M 0 ) − 1 Z 0 (q 0 )=(G 0 (q 0 )+M 0 ) − 1 G 0 (q 0 )β ⋆ . In particular, the limiting coefficient vector β 0 (q 0 ) may be biased for the trueβ ⋆ unlessM 0 =0. 3.5 BasisandRegularizationDetails For a BrAC curve estimateu in the span of a basis{ϕ i ,i = 1,...,p} of absolutely continuous functions on[0,S], there exists a unique vectorβ =[β 1 ,...,β p ] T ∈R p such that u(t)= p X i=1 β i ϕ i (t)=ϕ (t) T β and hence u ′ (t)= p X i=1 β i ϕ ′ i (t)=ϕ ′ (t) T β whereϕ (t)=[ϕ 1 (t),...,ϕ p (t)] T ∈R p . In this case, theL 2 norm ofu and its derivative can be expressed as quadratic forms involving matricesR andQ respectively, given by Z S 0 u 2 (t)dt= Z S 0 (ϕ (t) T β ) T ϕ (t) T β dt=β T Z S 0 ϕ (t)ϕ (t) T dt β =:β T Rβ , and likewise Z S 0 [u ′ (t)] 2 dt= Z S 0 (ϕ ′ (t) T β ) T ϕ ′ (t) T β dt=β T Z S 0 ϕ ′ (t)ϕ ′ (t) T dt β =:β T Qβ , and then regularize viaM, resulting in a linear combination of these penalty terms, defined by M =λ Z S 0 u 2 (t)dt+u Z S 0 [u ′ (t)] 2 dt=λR +uQ. (3.81) 64 The B-spline basis is one convenient choice. These functions are constructed by first choosing an integerN ≥ 0, the number of interior knots, non-decreasing real numbers t 0 ≤ t 1 ≤···≤ t N ≤ t N+1 , and the degreen≥ 1, of the spline, from which one defines the augmented knot set t − n =··· =t 0 ≤ t 1 ≤···≤ t N+1 =··· =t N+n+1 . Now, for each augmented knott i ,i=− n,...,N +n+1, let B i,0 (x)= 1 t i ≤ x<t i+1 0 otherwise, and forj =0,...,n− 1 recursively define i th B-spline basis functionB i,j of orderj by B i,j+1 (x)=α i,j+1 (x)B i,j (x)+[1− α i+1,j+1 (x)]B i+1,j (x), − n+j+1≤ i≤ N +n+1 where α i,j (x)= x− t i t i+j − t i t i ̸=t i+j 0 otherwise, see [14]. For these functions explicit forms for the integrals that produceR andQ that form the regularizing matrixM cannot be written in general. 65 3.6 Transdermalbloodalcoholmonitoring: Simulationsanddata analysis In both the simulation and real data study presented below the case where data are collected from single drinking episodes is investigated. The computations were carried out in MATLAB and the optimization producing the estimate of the parameterq was solved using the Optimization Toolbox routine FMINCON. 3.6.1 Simulationstudies In this Section the theoretical results on the consistency and asymptotic normality of the parameter esti- mate given in Theorem 3.3.1 are validated, and the practical impact of the number of observations on its behavior is illustrated. Additionally, the behavior of the BrAC curve estimates and the associated confi- dence bounds are evaluated. To reflect a simple real-world situation, BrAC was simulated using a small but realistic drinking diary that consists of a single drink 6 minutes after the beginning of the drinking session. BrAC was computed using the Michaelis-Menten approach (see [13]) that models the metabolic effects of the ethanol specific enzymes ADH and ALDH typically found in the liver, and also known to be present in trace amounts in the skin. For simplicity, the true value of the parameterq, was set to beq 0 = (1,1) and the duration of the drinking session was set toT =1 hour. The vectors and matrices in (3.3) and (3.4) were D N =I 2 , E N =O 2 , C N =(1,0) and F N =(1,0) T . 66 Equally spaced TAC measurement were calculated after adding independent error terms to the expression in (3.25), thus resulting in (3.26), with each error term distributed asN(0,σ 2 ) withσ =0.01. Calculating the theoretical limiting covariance matrix in Theorem 3.3.1 it was obtained that Σ= 16.4404 − 7.2947 − 7.2947 3.4586 . Table 3.6.1 displays the results of three simulations, each of 100q estimates, for three experiments hav- ing 20, 60 and 100 TAC samples. A comparison of the results obtained to the true values of the parameters agrees with the theoretical consistency results. Number of TAC observations Mean Parameter Estimate Scaled Sample Covariance Matrix 20 0.9447± 0.1549 1.0597± 0.0684 12.6231 − 5.2525 − 5.2525 2.4586 60 1.0375± 0.0997 1.0042± 0.0435 15.6790 − 6.6024 − 6.6024 2.9912 100 0.9762± 0.0805 1.0228± 0.0381 17.0397 − 7.8260 − 7.8260 3.8215 Table 3.6.1 Sample mean and covariance matrices from 100 simulation replicates ofb q. To experimentally validate the limiting bivariate normal distribution obtained in Theorem 3.3.1 forb q, Figures 3.1, 3.2, and 3.3 plot the vectorR 2 value of 100 estimators calculated from the synthetic data for 20, 60 and 100 observations, respectively, along with levels curves of the corresponding limiting bi-variate normal distribution. The amount of probability mass contained in the level curves pictured is indicated by the color coding on the right hand side of the figure, which when all taken together show increasingly good fits to the theoretical result as the number of TAC samples increases. 67 Figure 3.1: Values of theb q estimators obtained when using 20 TAC observations overT =1 hour. Figure 3.2: Values of theb q estimators obtained when using 60 TAC observations overT =1 hour. 68 Figure 3.3: Values of theb q estimators obtained when using 100 TAC observations overT =1 hour. The running time of these experiments may be long due to the computation of the matrix exponential ofA N in (3.3), which in general is not symmetric. For that reason, note how speed can be improved using the following diagonalization procedure. From [50], see also [62], withω i ,i = 1,...,N the basis for the finite dimensional approximation in (3.2), define matrices K 1 ,K 2 andV inR N× N by K 1,ij = Z 1 0 ω ′ i (v)ω ′ j (v)dv, K 2,ij =ω i (0)ω j (0) and V ij = Z 1 0 ω i (v)ω j (v)dv; note that the matricesV andK 1 ,K 2 are symmetric. Then obtain D N =− V − 1 K 1 ,E N =− V − 1 K 2 and hence A N =q 1 D N +E N =− V − 1 (q 1 K 1 +K 2 ). (3.82) Multiplying the final expression for V in (3.82) on the left and right byV 1/2 andV − 1/2 respectively yields V 1/2 A N V − 1/2 =− V − 1/2 (q 1 K 1 +K 2 )V − 1/2 . (3.83) 69 As the right hand side of (3.83) is symmetric the spectral theorem may be applied to write V 1/2 A N V − 1/2 =S q 1 L q 1 S − 1 q 1 and so, fort∈R V 1/2 tA N V − 1/2 =S q 1 tL q 1 S − 1 q 1 (3.84) where S q 1 is invertible and L q 1 is diagonal with real entries. Using (3.84) in the final equality, for any non-negative integer powerp, V 1/2 (tA N ) p V − 1/2 =(V 1/2 (tA N )V − 1/2 ) p =S q 1 (tL q 1 ) p S − 1 q 1 , is obtained, which, by substitution into the power series for the exponential function, yields V 1/2 e tA N V − 1/2 =S q 1 e tLq 1 S − 1 q 1 and hence e tA N =V − 1/2 S q 1 e tLq 1 S − 1 q 1 V 1/2 , thus allowing the computation of the matrix exponential of tA N to require the exponentiation of the diagonal matrixL q 1 only. The reconstruction of the BrAC curve from TAC measurements is explored, with a view to the practi- cality of applying Theorem 3.4.2 and Remark 3.4.3, with particular attention to the uniform andL 1 norm confidence bands, and confidence intervals for the total area under the true BrAC curve. In brief, the les- son here is that due to the inherent numerical instability of the inverse problem being solved, the theory works well for TAC errors of a magnitude well under those that can be obtained by the measurement tools currently available, having an estimated standard deviation of2.5× 10 − 3 . Following the approach of [13], the discretization level of N = 4 was used in (3.25) and computed the matricesC N ,D N ,E N andF N as there. The curves in this section are standardized to be on the time interval[0,1], and as described at the start of Section 3.4, with aB-spline basis used in (3.44), as detailed in Section 3.5, with degreem,M interior knots ati/(M+1),i=1,...,M and boundary knots at{0,1}, 70 thus producing the curvesB i,m ,1 = 1,...,M +m+1. Removing the first and last to accommodate the zero boundary conditions results in the curves indexed by i = 2,...,M +m. Set m = 3 and M = 8, yieldingM +m+1− 2=10 curves, of whichB 3,3 is taken as the true BrAC curve. To set a ‘best case’ baseline instance, first consider the case where the TAC errors are set to zero, which results in the single, non-random ‘Estimated-BrAC’ curve shown in Figure 3.4, that overlaps with the true BrAC curve without the need for any regularization. Figure 3.4: TAC errors set to zero, no regularization This figure, and an appeal to continuity, suggests that the confidence regions provided by Theorem 3.4.2 should hold in practice over some substantial time interval in[0,1] in settings where the TAC errors have sufficiently small standard deviations. Six experiments were run, all with non-trivial independent Gaussian TAC noise, having varying mag- nitudes of the standard deviations of the lab and field TAC errors and of the regularization parameters u andλ in (3.81). In each experiment 10 BrAC curve estimates were generated, and the results are compared to the 70% uniform and L 1 confidence bands, and for one case, the 70% confidence interval for the total area under the curve, as provided by Theorem 3.4.2 and Remark 3.4.3. 71 In the first such experiment, whose outcome is shown in Figure 3.5, both TAC error standard deviations are set to be2.5× 10 − 5 . The small value of the error allows for the problem to be sufficiently numerically stable over a large enough portion of the interval that regularization is not needed, and both regularization parameters were set to zero. The uniform confidence band is represented by a pink band in the figure, which is more easily seen in the enlargement over a smaller time interval visualized by the interior box. Of the 10 BrAC curves generated, 7 are completely contained within the 70% uniform confidence band, and there are also 7 within the 70% L 1 confidence band, over the truncated time interval [0,0.63]; note the instability in the reconstructed curve towards the end of the interval. From the detail provided by the interior box of the figure, it can be seen that two curves exit the uniform confidence band at the given time threshold. This experiment, where the confidence bands perform at the level desired over the truncated interval, serves as a reference case for experiments that increasingly approach the more realistic levels of TAC errors having two full orders of magnitude larger. Hence, in these later cases the same time interval as the one here will be considered in order to better make comparisons. Figure 3.5: TAC errors2.5× 10 − 5 , no regularization. 72 In the next experiment, illustrated in Figure 3.6, the lab error was increased by a factor of 10 and maintain the same field TAC error. Over the same time interval considered in Figure 3.5 it can be seen that the uniform 70% confidence band contains only 5 of the 10 generated BrAC curves, while the L 1 confidence band continues to hold 7, bolstering the intuition that theL 1 band should be more robust. Nevertheless, the interior box in the figure illustrates that an increased standard deviation of the lab error causes the curve to have more variability around its peak. Figure 3.6: Lab and field TAC error 2.5× 10 − 4 and2.5× 10 − 5 , no regularization. Increasing the noise level further, Figure 3.7 shows the result of setting both TAC error standard de- viations at 10 times the levels of the previous experiment. Adding a small amount of regularization, there are 7 of 10, and 8 of 10 curves, respectively, within the uniform band and theL 1 band. However, its clear from the figure that the curves have much more variability due to the higher noise levels, and fluctuate strongly within wider confidence bands. 73 Figure 3.7: Lab and field TAC errors 2.5× 10 − 3 and 2.5× 10 − 4 , both regularization parameters set to 2.5× 10 − 10 . For this experiment, having the largest noise levels so far, Figure 3.8 additionally notes the performance of the 70% confidence interval for the total area under the curve; in particular, for 8 of the 10 curves generated the confidence interval covered the true area value. Figure 3.8: Confidence Interval: Area under the curve. 74 Lastly, the results of an experiment that illustrates the effect of regularization, demonstrating why it becomes necessary in certain instances are presented. The TAC error values are the same in the following two plots, the only difference being that no regularization was applied when generating Figure 3.9, and a very small amount of regularization was used for Figure 3.10. Instability in the first instance causes the width of the uniform 70% confidence band to be overly wide, while the band obtained when regularizing in the second plot captures 7 of the 10 generated curves upon restricting to the interval[0,0.63]. Figure 3.9: Lab and field TAC error 2.5× 10 − 4 and7.5× 10 − 5 , no regularization. 75 Figure 3.10: Lab and field TAC error 2.5× 10 − 4 and 7.5× 10 − 5 , both regularization parameters set to 2.5× 10 − 10 . Increasing errors further results in an instability that requires an amount of regularization that may induce significant bias. Again, recall that the sensors on which these experiments are based were designed only to be abstinence monitors for individuals under house arrest, that is, made only to test the presence of alcohol in an individual’s system. Improvements of measurement technology currently in progress will allow the methods developed here to soon be of practical use. In particular, confidence bands for the data plot in Figure 3.12 are not considered. 3.6.2 RealDataAnalysis The data analyzed were collected during a single drinking session that was conducted in Dr. Susan Luczak’s laboratory at the University of Southern California as part of a larger study involving 40 participants. This human subjects research was approved by the USC Institutional Review Board and written informed con- sent was obtained prior to starting the drinking episode. In this session, a participant drank alcohol evenly 76 over a 15-minute period at a dose designed to reach a peak of approximately .05mg%. TAC was first mea- sured 30 minutes prior to the first BrAC measurement of .000 (just as drinking commenced) and then BrAC readings continued until BrAC returned to 0.000 and TAC readings continued until TAC returned to 0.000. Once drinking began, TAC and BrAC observations were taken approximately every 10 minutes ex- cept while drinking (due to mouth alcohol interfering with the breathalyzer’s ability to accurately capture BrAC). After BrAC returned to .000, BrAC was no longer measured and TAC observations were taken every 30 minutes. The first non-zero TAC observation was 67 minutes after the first non-zero BrAC observation. TAC and BrAC observations were taken over 6.3 hours. A total of 70 TAC measurements were taken with two TAC devices, one worn on each arm of the subject, and 28 BrAC measurements were taken with the breath analyzer. Because of the variability in how the TAC devices trigger readings and the time it takes for the breath analyzer to be ready to take a reading, the TAC and BrAC times may not have temporally coincided exactly, but were within minutes of one another while BrAC was greater than 0.000. In general the theory in this work accommodates the situation where multiple devices are used, as it assumes only that the observations are unbiased with independent Gaussian errors having the same variance. 77 Figure 3.11: BrAC from recordings of two TAC device observations of dataset BT311 Session1 . Nevertheless, as indicated in Figure 3.11, the two devices used in this experiment exhibited some vari- ation in the magnitude of their responses, which likely inflated the estimate of the standard deviation, thus also enlarging the confidence bands and contributed to the numerical instability of the deconvolution. In particular, improved performance of our results is anticipated when any multiple devices used in a single experiment are more equally calibrated. Minimizing (3.28) for this data resulted in the estimate b q = (0.5611,0.7655). The cyan curve in the figure was computed as for the experiments in Section 3.6.1, that is, using (3.49), and thus with no constraints when least squares optimizing. The purple curve was created by further constraining all basis coefficients to be non-negative, thus producing a non-negative BrAC curve, which is visibly close to the true BrAC. 78 Figure 3.12: BrAC from TAC observations of dataset BT311 Session1 06132019 and Estimated BrAC that results from using the minimizer b q = (0.5611,0.7655). Estimated standard deviation of TAC errors is 2.5× 10 − 3 , both regularization parameters set to0.006. Minimizing (3.28) for this data resulted in the estimateb q = (0.5611,0.7655). The cyan curve in the figure was computed using (3.49), and thus with no constraints when least squares optimizing. The purple curve was created by further constraining all basis coefficients to be non-negative, thus producing a non- negative BrAC curve, which is visibly close to the true BrAC. 79 Chapter4 Non-ParametricEstimationofaRandomDiffusionEquation-Based PopulationModelandBlood/BreathAlcoholConcentration DeconvolutionfromTransdermalAlcoholBiosensorData 4.1 ThePopulationModel The model in (2.6)-(2.7) provides output TAC signal when provided a particular value for the unknown model parameterq and a BrAC signalu. Now, assume thatq is a random vector, and denote this random vector byq. Further, assume thatq is distributed according to an absolutely continuous cumulative distribution function,F(q), or equivalently by a push forward measureπ (q) associated with the identity function and writeπ (q)∼ f(q) withf being the corresponding density function such thatf ∈F with F given by F ={f ∈C(Q):0<α ≤ f(q)≤ β, ZZ Q f(q)dq =1} (4.1) for some constantsα andβ with0<α ≤ β whereQ is a compact subset of the positive orthant ofR 2 as defined in Section 2.2. In this Chapter, the model described above is developed and it accommodates not only a fixed set of unknown parameters, q, but an entire population. This approach is relied on a recent 80 treatment of random abstract parabolic systems developed in [27], [54] wherein the now random system (2.6) is reformulated in weak form as an equivalent deterministic system in a Gelfand triple ([56],[68]) of Bochner spaces ([58], [60]). Recall that in Chapter 2 and in particular Section 2.2, H was defined as H = R× L 2 (0,1) forq = (q 1 ,q 2 )∈Q andV was defined as V ={(θ,ϕ )∈H :ϕ ∈H 1 (0,1),θ =ϕ (0),ϕ (1)=0}. Usingπ , define the Bochner spacesV = L 2 π (Q;V) andH = L 2 π (Q;H) to obtain the Gelfand tripleV ,→H ,→ V ∗ whereV ∗ denotes the dual ofV. Now, define the bilinear form a(f;·,·) onV× V as a π -weighted version ofa(q;·,·) given in (2.3) by a(f; ˆ ϕ, ˆ ψ )=E π a q; ˆ ϕ (q), ˆ ψ (q) = Z Q a q; ˆ ϕ (q), ˆ ψ (q) dπ (q) for ˆ ϕ, ˆ ψ ∈ V. Using the properties (i)-(iii) specified in Section 2.2 and the Cauchy-Schwarz inequality, compute |a(f; ˆ ϕ, ˆ ψ )| H ≤ Z Q |a q; ˆ ϕ (q), ˆ ψ (q) |dπ (q) ≤ α 0 Z Q || ˆ ϕ (q)|||| ˆ ψ (q)||dπ (q) ≤ α 0 Z Q || ˆ ϕ (q)|| 2 dπ (q) 1/2 · Z Q || ˆ ψ (q)|| 2 dπ (q) 1/2 =α 0 || ˆ ϕ || V ||ψ || V , as well as a(f; ˆ ϕ, ˆ ϕ )+λ 0 | ˆ ϕ | 2 H = Z Q a(q; ˆ ϕ (q), ˆ ϕ (q))dπ (q) 81 +λ 0 Z Q | ˆ ϕ (q)| 2 H dπ (q) = Z Q a(q; ˆ ϕ (q), ˆ ϕ (q))+λ 0 | ˆ ϕ (q)| 2 H dπ (q) ≥ κ 0 Z Q || ˆ ϕ (q)|| 2 V dπ (q) =κ 0 || ˆ ϕ || 2 V , and thus,a(·;·,·) is also a bounded and coercive bilinear form onV× V with the same constants as a(·;·,·). With that, define the bounded linear map A(f):V→V ∗ by⟨A ˆ ϕ, ˆ ψ ⟩ V ∗ ,V =− a(f; ˆ ϕ, ˆ ψ ). The operatorA, appropriately restricted or extended, is an infinitesimal generator of an analytic semigroup of bounded linear operators,T(t;f) ={e A(f)t : t≥ 0}, onV,H andV ∗ , see [3], [5], [68]. Analogously, define the operator B(f)∈L(R,H) by ⟨B(f)u, ˆ ψ ⟩=E π (q) ⟨B(q)u, ˆ ψ ⟩ = Z Q ⟨B(q)u, ˆ ψ ⟩dπ (q), u∈R and set ˆ A(f) = T(τ ;f) ∈ L(H,H), ˆ B(f) = A(f) − 1 ( ˆ A(f)− I)B(f) ∈ L(R,H), whereI is the identity onH. Note that by the coercivity ofa(·;·,·) assume thatA(f) − 1 exists and is bounded. Finally, the output operator ˆ C(f) is given by ˆ C(f) ˆ ϕ =E π (q) ( ˆ C ˆ ϕ )∈L(H,R). In an effort to keep the notation readable, henceforth showing the explicit dependence on the population density f, is omitted unless special emphasis is desired. These definitions yield the discrete-time system ˆ x k+1 = ˆ Aˆ x k + ˆ Bu k (4.2) y k = ˆ Cx k . (4.3) 82 As before, the random outputy k can be expressed using a convolution as y k = k− 1 X j=0 ˆ C ˆ A k− j− 1 ˆ Bu j (4.4) = k− 1 X j=0 h k− j u j , k =0,1,2,...,K, where the population filter is given by h i = ˆ C ˆ A i− 1 ˆ B,i=1,2,.... 4.2 FiniteDimensionalApproximationandConvergence Estimating the distribution of the random parameterq in the population model, (4.2) requires finite di- mensional approximation along with an associated convergence theory. The approach is based on a series of approximating finite-dimensional subspaces V N ofV as well as sequences of operators ˆ A N , ˆ B N , and ˆ C N that approximate ˆ A, ˆ B, and ˆ C, respectively. The process of finite dimensional approximation follows a similar approach as described in Section 2.3. For this, consider the multi-index N =(n,m 1 ,m 2 ). (4.5) The random parametersq i are assumed to be supported in compact intervals of the form [a i ,b i ] for i = 1,2. Partition the interval [a i ,b i ] into m i subintervals of equal width for i = 1,2 and denote the characteristic function on the j-the subinterval by χ m i j for j = 1,...,m i . For n = 1,2,... let {ϕ n j } n j=0 be the standard linear B-splines on[0,1] with respect to the uniform mesh{0, 1 n , 2 n ,...,1} ([53]) and define ˆ ϕ n j = (ϕ n j (0),ϕ n j ) ∈ V . Let the multi-index J = (j 0 ,j 1 ,j 2 ) with j 0 ∈ {0,1,...,n} and j i ∈{1,2,...,m i },i = 1,2. Define Φ N J = ˆ ϕ n j 0 χ m 1 j 1 χ m 2 j 2 and letV N = span J {Φ N J }. Denote the orthogo- nal projection ofH ontoV N byP N :H→V N . It is not difficult to show that P N converges strongly to the identity in bothH andV ([53]). Finally, a standard Galerkin approach is employed to come up with 83 an approximationA N of the operatorA, and set ˆ A N = exp(A N τ )∈L(V N ,V N ), for details see [75]. Similarly, define ˆ B N = (A N ) − 1 ˆ A N − I N P N B ∈ L(R,V N ) and ˆ C N = ˆ CP N . As before (see Section 2.3), the discretized outputy N takes the form of a convolution as follows y N k = k− 1 X j=0 ˆ C N ( ˆ A N ) k− j− 1 ˆ B N u j (4.6) = k− 1 X j=0 h N k− j u j , k =0,1,2,...,K, where the approximating population filter is given by h N i = ˆ C N ( ˆ A N ) i− 1 ˆ B N ,i=1,2,... The choice of approximating subspaces described above and the definitions of the approximating op- erators imply that|y k − y N k |→ 0 uniformly ink,k =0,1,2,,...,K, asN →∞, wherey k andy N k are given by (4.4) and (4.6), respectively. Indeed, using the boundedness and coercivity assumptions, (i) and (ii), it is not difficult to argue resolvent convergence for the operators A N to the resolvent of the operatorA. The use of Galerkin approximation to define the operators A N together with the the coercivity condition (ii) imply that the approximating semigroups,{exp(A N t) : t ≥ 0} are uniformly (in N) exponentially stable. It follows from the Trotter Kato Theorem from linear semigroup theory (see, for example, [47], [58], [60]) that the operators ˆ A N = e A N τ converge strongly to the operator ˆ A = e Aτ . The definition of the operators ˆ B N and ˆ C N in terms of the orthogonal projectionsP N together with the strong con- vergence of the operatorsP N are then sufficient to conclude the convergence of the convolution filters |h k − h N k |→ 0 uniformly ink,k = 0,1,2,,...,K, asN →∞, from which the convergence ofy N k to y k immediately follows. The arguments above will be discussed in detail in Chapter 5. 4.3 DeconvolutionandUncertaintyQuantification The model that was derived in the previous subsection, culminating in the convolutions (4.4) or (4.6), is a forward model that computes TAC,y, based on BrAC,u. In this section, the forward model for inversion, 84 that is, the deconvolution of a BrAC signal from a given TAC signal is used. This input estimation problem is formulated as a positively constrained, regularized, linear least squares optimization problem. Assume the densityf ofq is either known or have been fit to data with the known or fitted density denoted by f ∗ (with the corresponding push forward measureπ ∗ ) together with its supportQ ∗ . Then consider the input u of the model as a random continuous time signalu, whose sampled version isu j =u(jτ )∈L 2 π ∗ (Q ∗ ), where τ > 0 again denotes the length of a sampling interval. It is clear thatu depends on both the parametersq and the timet, and so define the space S(0,T) = H 1 (0,T;L 2 π ∗ (Q ∗ )) and letu∈ S(0,T). Then consider a compact subsetU ofS(0,T), so that the input estimation problem can be posed as u ∗ =argmin U J(u) (4.7) =argmin U K X k=1 |y k (u)− ˆ y k | 2 +||u|| 2 S(0,T) . In this problem,||u|| 2 S(0,T) denotes a regularization term with an appropriate norm||·|| S(0,T) onS(0,T), andy(u) is as in (4.4). For the approximating input or BrAC estimation problem, define the multi-index M = (m,m 1 ,m 2 ), and setL = (N,M) withN as defined above in (4.5). Based on this, introduce the discretized space U M as a closed subset ofU that is contained in a finite dimensional subspace of S(0,T). In this way, the compactness ofU leads to the compactness ofU M . ByN →∞,M →∞, orL→∞, it is implied that all entries of the multi-index approach infinity. With these definitions, the approximating input estimation problem is given by u L∗ =argmin U M J L (u) (4.8) =argmin U M K X k=1 |y N k (u)− ˆ y k | 2 +||u|| 2 S(0,T) , 85 wherey N (u) is as in (4.6). Under the assumptions made earlier together with either the compactness assumption or, by taking advantage of the fact that the performance index in (4.7) and (4.8) is quadratic, a convexity assumption, on U, it can be argued that either a sub-sequence of, or the sequenceu L∗ itself, converges tou ∗ , asL→∞ ([60], [61]). The convergence will be discussed in detail in Chapter 5. As can be seen in (4.8), the eBrAC signalu depends on the random functiony N (u). This forward function is random in the sense that the underlying parametersq are assumed to be randomly distributed and denoted by a random vectorq. Assume that an estimated distribution forq is known or has been estimated (the process of estimating the distribution is described in Section 4.4). Then, by sampling from the distribution ofq and solving the optimization problem (4.8), samples of the deconvolved signalu can be obtained. In this way, by sampling a large number of estimated input signals, conservative bands in which a certain percentage of these signals fall can be found. Indeed, the event{q∈Q} is contained in the event{u(q)∈u(Q)}, and consequently, thereforeP({u(q)∈u(Q)})≥ P({q∈Q}). This imme- diately leads to an uncertainty quantification; this technique will be demonstrated in Chapter 8. Reducing the width of the conservative bands based on sub-population selection will be discussed in Chapter 6. 4.4 EstimatingtheDistributionoftheParameters Assume that BrAC and TAC data pairs, {{u j,k } K k=0 ,{y j,k } K k=0 } M j=1 for M drinking episodes have been collected in the lab or clinic under the assumption that M is sufficiently large that the population has adequate representation with respect to physiological characteristics of the cohort of subjects of interest (e.g. sex, age, BMI, race, etc.), variations in sensor devices and (e.g. manufacturer, calibration, etc.), and environmental conditions (e.g. ambient temperature, humidity, etc.). Assume further that these BrAC/TAC data pairs are used to estimate values for the parametersq = [q 1 ,q 2 ] that appear in the individual model (2.1) or (2.2) yielding theM individual model parameter estimates,{q j =[q 1,j ,q 2,j ]} M j=1 . In the subsection 86 to follow below the process of using the individual model parameters{q j = [q 1,j ,q 2,j ]} M j=1 to obtain an estimate for the distribution of the parameters across the entire cohort of interest that can then be used in the population model described in Section 4.1 is described. In particular, a non-parametric approach based on kernel density estimation and spline-based interpolation and resampling is presented. 4.4.1 Non-ParametricEstimationofthePopulationDistribution Instead of pre-selecting a particular parameterized family of distributions forq and then using the samples ofq, probability density function forq is estimated directly using what is known as the kernel density estimation method (KDE). This method requires only that the sampled data is from a distribution that has a densityf that satisfies basic smoothness conditions (see [64]). It is worth noting that the estimation of an unknown densityf by KDE achieves the optimal rate of convergence (see again [64]). The first uni-variate estimate of this type was proposed in [24], further generalized in [46] and [51], and then extended to the multi-variable case in [9] and [21]. Here the KDE method for the estimation of a uni-variate and a bivariate distribution is explained; the generalization to higher dimensions should be clear. Note that if one wants to assume independence, then a uni-variate kernel density estimator can be fit to each component of q with the estimated joint density then taken to be the product of the uni-variate kernel density estimators. For the uni-variate case, letq 1 ,q 2 ,...,q n be independent and identically distributed samples drawn from a uni-variate distribution with an unknown densityf. Then the kernel density estimate off atq is given by b f(q;h)= 1 n n X i=1 K h (q− q i ) (4.9) In 4.9, K h (x) = 1 h K(h − 1 x), x ∈ R is known as the kernel smoothing function, a non-negative, square integrable function whose integral overR is 1 andh>0 is a smoothing parameter called the bandwidth. 87 For the bivariate case, letq 1 ,q 2 ,...,q n be independent and identically distributed samples drawn from a bivariate distribution with an unknown density f. Then the kernel density estimate of f atq = [q 1 ,q 2 ] is given by b f(q;H)= 1 n n X i=1 K H (q− q i ) (4.10) whereq = [q 1 ,q 2 ] andq i = (q 1,i ,q 2,i ), i = 1,...,n. In (4.10)H ∈ R 2× 2 is known as the bandwidth matrix assumed to be positive definite symmetric and K H (x) =|H| − 1/2 K(H − 1/2 x),x∈R 2 whereK is the kernel smoothing function, a non-negative, square integrable function whose integral overR 2 is 1. So, for example, the Matlab routine MVKSDENSITY takesH =diag(h 1 ,h 2 ) withh i >0,i=1,2 andK to be a tensor product of one dimensional kernel smoothing functions,K(x)=k 1 (x 1 )k 2 (x 2 ),x∈R 2 . In this case b f(q;H)= b f(q,h 1 ,h 2 ) (4.11) = 1 n n X i=1 1 √ h 1 h 2 k 1 q 1 − q 1,i √ h 1 k 2 q 2 − q 2,i √ h 2 . In addition, Matlab’s routine relies on kernel smoothing functions derived from standard normal den- sities which are then, as is clear from (4.10) and (4.11), shifted to have means at the sample points and variances determined by the band width matrix. Once the estimated kernel has been fit, it can be evalu- ated at any desired resolution and, if its derivatives are needed, it can be interpolated with, for example, spline functions. 88 Chapter5 ConsistencyUnderApproximation In this Chapter, the consistency under approximation of the discrete time TAC and BrAC signals,y k and u k , as appear in the model (4.2)-(4.3) is established. Here, by “approximation”, both the approximation of the spaces and operators with respect to the spacial discretization parameterN, as well as the estimation of the true distribution ofq that is based on m samples is implied. For the purposes of establishing consistency under approximation,q 2 inq = (q 1 ,q 2 ) is assumed to be constant instead of a random parameter. In particular, sincey k is linear with respect to q 2 , without loss of generality setq 2 to 1. In what follows,q will be used to denote theq 1 population. Recall thatq is distributed according to an absolutely continuous cumulative distribution function,F(q) or equivalently by a push forward measure π (q) associated with the identity function,π (q)∼ f(q), withf being the corresponding density function such thatf ∈F withF the uni-variate version of (4.1) given by F ={f ∈C(Q):0<α ≤ f(q)≤ β, Z Q f(q)dq =1} whereQ is now a compact subset ofR + . In this Chapter the consistency under approximation will be established in situations when the density of the populationq is estimated using the uni-variate kernel density estimate method as described in 4.4.1. 89 Recall that the kernel density estimatef m of the unknown uni-variate densityf atq (introduced in [51]) is given by f m (q)= m X i=1 1 mh K(h − 1 (q− q i )) (5.1) whereq 1 ,...,q m are independent observations from the density,K is a kernel function andh=h(m) is the bandwidth. The explicit dependence ofh onm will generally be suppressed. The expressions sup and R when unqualified will be taken to be over the range (−∞ ,∞). Given any real functionv of bounded variation,v induces a signed Lebesgue-Stieltjes measureµ v on the Borel sets of the real line ; this measure ascribes measurev(b)− v(a) to the real interval(a,b]. The notation R g(x)|dv(x)| will be used to denote an integral with respect to the absolute value, or total variation, of the measureµ v In the case wherev is differentiable, R g(x)|dv(x)| is equivalent to R g(x)|v ′ (x)|dx. The following Theorem 5.0.1 presented in [57] establishes the uniform consistency of the kernel estimates under very mild conditions onh. The condidtions onK below are used in the Theorem. 1. K is uniformly continuous and of bounded variationsV(K) 2. R |K(x)|dx<∞ andK(x)→0 as|x|→∞ 3. R K(x)dx=1 4. R |xlog|x|| 1 2 |dK(x)|<∞ Theorem 5.0.1. Suppose K satisfies the conditions (1)-(4) and f is uniformly continuous. Suppose h → 0 and(mh) − 1 log(m)→0 asm→∞. Then defining f m as in (5.1) sup|f m − f|→0 a.s as m→∞ (5.2) 90 Below, the definition of consistency under approximation is provided, where N denotes the discretiza- tion parameter of the spaces and operators. Definition 5.0.1. LetW be a normed linear space, and let{f m } be a sequence of estimators of the density f. Then a doubly indexed sequence ofW-valued estimatorsτ N m = τ N (f m ) = τ N (f m (X 1 ,X 2 ,...,X m )) is saidtobeanapproximatingconsistentsequenceofestimatorsofthefunctionoff,τ (f),whereτ iscontinuous from C into the normed linear space W, if for every ϵ > 0 lim N,m→∞ P(∥τ N m − τ (f)∥ W ≤ ϵ ) = 1, or equivalently, lim N,m→∞ P(∥τ N m − τ (f)∥ W >ϵ )=0. Theorem5.0.2. LetX 1 ,X 2 ,..., be a sequenceof samplesor observationsaccording tothe distribution given byf wheref ∈F. For eachm = 1,2,... let{f m } ={f m (X 1 ,X 2 ,...,X m )},m = 1,2,... be a consistent sequence of estimator of the densityf, that is, sup|f m − f| → 0 a.s as m → ∞. Let{W,∥·∥ W } be a normed linear space, let τ :F → W be a continuous map and let{τ N } ∞ N=1 , τ N :F → W, be a family of equi-continuous maps with the property that lim N→∞ ∥τ N (f)− τ (f)∥ W = 0, for each f ∈ F. Then τ N m = τ N (f m ) is an approximating consistent sequence of estimators for the function of f, τ (f). That is, givenϵ> 0 andf ∈F, lim N,m→∞ P(∥τ N m − τ (f)∥ W >ϵ )= lim N,m→∞ P(∥τ N (f m )− τ (f)∥ W >ϵ )=0; and soτ N m =τ N (f m )→τ (f) in probability asN,m→∞ for everyf ∈F. Proof. Letϵ> 0 andf ∈F be given. Let ˆ N be such that ∥τ N (f)− τ (f)∥ W ≤ ϵ 2 , N ≥ ˆ N. 91 LetB N n ={ω :∥τ N (f m )− τ N (f)∥ W > ϵ 2 }beanevent,and,bytheequi-continuityofthefamily{τ N } ∞ N=1 , letδ =δ ( ϵ 2 ,f)>0 be such thatsup|f m − f|≤ δ implies∥τ N (f m )− τ N (f)∥ W ≤ ϵ/ 2,N =1,2,.... Then for eachN =1,2,... andm=1,2,..., define A N m ={ω :∥τ N (f m )− τ (f)∥ W >ϵ }. (5.3) Lettingω∈A N m , the definition of A N m given in (5.3), the triangle inequality, andN ≥ ˆ N yield that ϵ< ∥τ N (f m )− τ (f)∥ W ≤∥ τ N (f m )− τ N (f)∥ W +∥τ N (f)− τ (f)∥ W (5.4) ≤∥ τ N (f m )− τ N (f)∥ W + ϵ 2 . The estimate given in (5.4) then implies that ∥τ N (f m )− τ N (f)∥ W >ϵ − ϵ 2 = ϵ 2 , for all N ≥ ˆ N, and therefore that ω ∈ B N m whenever N ≥ ˆ N. Consequently, it follows that A N m ⊆ B N m wheneverN ≥ ˆ N. Now letC m ={ω : sup|f m − f| > δ }. By the equi-continuity of the family{τ N } ∞ N=1 , it follows that B N m ⊆ C m , for allN = 1,2,... and allm = 1,2,.... Thus ifN ≥ ˆ N, it is implied thatA N m ⊆ B N m ⊆ C m , for allm=1,2,.... The assumption that the sequence{f m }, m = 1,2,... is consistent for f implies that P(C m ) → 0 as m→∞,orequivalentlythat,givenη > 0,thereexistsan ˆ msuchthatP(C m )≤ η forallm≥ ˆ m. Itfollows that P(∥τ N (f m )− τ (f)∥ W >ϵ )=P(A N m )≤ P(B N m )≤ P(C m )≤ η, 92 for allN ≥ ˆ N and allm≥ ˆ m and the theorem is proved. Theorem 5.0.2 is based on the following assumptions: the continuity ofτ with respect tof, the equi- continuity of the family{τ N } ∞ N=1 with respect to f, and the convergence of the approximation τ N . In what follows, the assumptions are taken care separately. First, the continuity assumption is addressed. As described in Chapter 4, the spacesH andV (or more specifically their inner products) depend on the distribution ofq that is given by the density functionf. Hence, a continuity result for the semigroup A(f) onf needs to take into account the changing spacesH m andV m asf m → f. Define the spaces H m = L 2 π m (Q,H) andV m = L 2 π m (Q,V), and letI m : H → H m be the identity map such that I m (ψ ) = ψ for any ψ ∈H. The continuity proof is based on a variant of the Trotter-Kato semigroup approximation theorem [5, 35, 47]. This variant was presented in [2] and more detailed in [60] and deals with the situation where the approximating spaces are not subspaces of the underlying infinite dimensional state space. For a Hilbert spaceH, introduce the notationL∈G(M,γ ) for an operatorL :Dom(L)⊂ H →H that is the infinitesimal generator of an analytic semigroup of bounded linear operators {S(t)} that satisfies |S(t)|≤ Me γt . In the following theoremL andS are used to denote an operatorL and the corresponding semigroupS as described above. Assuming a bound, uniformly inm, on the semigroups generated by the operatorsL m , the following theorem can be proved. Theorem5.0.3. LetH,H m be Hilbert spaces as defined above, where |·| m denotes the norm onH m . Let I m :H →H m be an operator such thatIm(I m ) =H m and|I m z| m ≤ k|z| for some constantk > 0. LetL∈G(M,λ 0 ) onH, andL m ∈G(M,λ 0 ) onH m . Suppose that for someλ ≥ λ 0 , |I m R λ (L)z− R λ (L m )I m z| m →0, asm→∞, (5.5) 93 foreveryz∈H,whereR λ (L)=(λI − L) − 1 andR λ (L m )=(λI − L m ) − 1 denotetheresolventoperators ofL andL m , respectively. Then |I m S(t)z− S m (t)I m z| m →0, asm→∞, inH m for everyz∈H, uniformly int on compactt-intervals. Proof. Without loss of generality, let λ 0 = 0. For any operator B, if R λ (B) is the corresponding resolvent operator, then it follows that BR λ (B) = λR λ (B)− I from the definition of the resolvent operator. As S(t)R λ (L) andS m (t)R λ (L m ) are both strongly differentiable in t, it follows that d dt S(t)R λ (L)=LS(t)R λ (L)=S(t)LR λ (L)=S(t)(λR λ (L)− I). Inaddition,ifT(t)representsthecorrespondingsemigroupoperatorforB,thenB commuteswithbothR λ (B) andT(t). Also,R λ (B)andT(t)commuteswitheachother,aswell. Withthat,andananalogousidentityfor S m (t)R λ (L m ), d ds (S m (t− s)R λ (L m )I m S(s)R λ (L))=S m (t− s)(I m R λ (L− R λ (L m )I m )S(s). (5.6) Compute S m (t− s)R λ (L m )I m S(s)R λ (L)| s=t s=0 =R λ (L m )(I m S(t)− S m (t)I m )R λ (L), (5.7) and together, (5.6) and (5.7) yield R λ (L m )(I m S(t)− S m (t)I m )R λ (L)= Z t 0 S m (t− s)(I m R λ (L)− R λ (L m )I m )S(s)ds. (5.8) 94 Then, using (5.8) and|S m (t− s)|≤ M, for anyu∈H, obtain |R λ (L m )(I m S(t)− S m (t)I m )R λ (L)u| m ≤ M Z t 0 |(I m R λ (L)− R λ (L m )I m )S(s)u| m ds. (5.9) Using the assumption (5.5), the integrand in (5.9) converges to 0 for a fixed s. Moreover, it is bounded by 2M 2 k|u|/λ and so, by the Dominated Convergence Theorem, the upper bound in (5.9) converges to 0 as m→∞, and the convergence is uniform int on compact intervals. Denotev =R λ (L)u, then for allv∈H |R λ (L m )(I m S(t)− S m (t)I m )v| m →0, asm→∞, (5.10) becauseDom(L) is dense inH. Using again the assumption (5.5) and the fact that|S m (t)|≤ M, |R λ (L m )S m (t)I m v− S m (t)I m R λ (L)v| m =|S m (t)(R λ (L m )I m v− I m R λ (L)v)| m →0, (5.11) and analogously, since|S m (t)|≤ M, it follows that |R λ (L m )I m S(t)v− I m S(t)R λ (L)v| m =|(R λ (L m )I m − I m R λ (L))S(t)v| m →0. (5.12) The triangle inequality together with (5.11) and (5.12) yields |R λ (L m )(S m (t)I m − I m S(t))v+(I m S(t)− S m (t)I m )R λ (L)v| m →0, (5.13) asm→∞. Invoking (5.10), the triangle inequality then gives |(I m S(t)− S m (t)I m )R λ (L)v| m →0, asm→∞. (5.14) 95 Lastly, forw = R λ (L)v, notice thatw∈ Dom(L 2 ). SinceDom(L 2 ) is dense inH, the assumption (5.5), (5.13) and (5.14) provide |S m (t)I m z− I m S(t)z| m →0, asm→∞, for allz∈H uniformly int on compact intervals. In Theorem 5.0.3 letL = A(f),L m = A(f m ),S(t) = T(t;f) andS m (t) = T(t;f m ) defined as in Chapter 4 Section 4.1. Based on Theorem 5.0.3, to prove continuity of the semigroupT(t,f) with respect to f, the resolvent convergence as in (5.5) needs to be established. Recall that properties (i)-(iii) in Section 2.2 hold for these operators as shown in Section 4.1. Properties (i)-(iii) imply thatA(f) and A(f m )∈G(λ 0 ,M) which satisfy the assumption of Theorem 5.0.3. Theorem 5.0.4. For the model in (4.2)-(4.3), assume thatq ∼ f, with f ∈F as defined in (4.1). Then if {f m },m = 1,2,..., is a consistent sequence of estimates off, that issup|f m − f|→ 0 a.s asm→∞, it follows thatT(f m )I m x− I m T(f)x→ 0 inH m for eachx∈H, whereI m denotes the identity map fromH ontoH m . Proof. First note that since the assumptions (i)-(iii) hold it is implied thatA∈ G(M,λ 0 ) onH, andA m ∈ G(M,λ 0 ) onH m . SinceV is dense inH, andI m R λ 0 (A) and R λ 0 (A m )I m are uniformly bounded, it suffices to show resolvent convergence for every z ∈ V as this will imply resolvent convergence for every z∈H. Let z ∈ V and set w = R λ 0 (A)z as well as w m = R λ 0 (A m )I m z. Define z m = w m − w. Since w m ∈V m , compute a(f m ;w m ,z m )=⟨− A m w m ,z m ⟩ H m =⟨(λ 0 I− A m )R λ 0 (A m )I m z,z m ⟩ H m− λ 0 ⟨w m ,z m ⟩ H m 96 =⟨I m z,z m ⟩ H m− λ 0 ⟨w m ,z m ⟩ H m. (5.15) DenotetheMoore-Penrosegeneralizedinverse[10]ofI m byI m + . Then,duetow∈Dom(A),itisobtained that a(f;w,I m + z m )=⟨− Aw,I m + z m ⟩ H =⟨(λ 0 I− A)R λ 0 (A)z,I m + z n ⟩ H − λ 0 ⟨w,I m + z n ⟩ H =⟨z,I m + z m ⟩ H − λ 0 ⟨w,I m + z m ⟩ H . (5.16) Combining (5.15) and (5.16) yields a(f m ;w m ,z m )− a(f;w,I m + z m )=⟨I m z,z m ⟩ H m− λ 0 ⟨w m ,z m ⟩ H m (5.17) −⟨ z,I m + z m ⟩ H +λ 0 ⟨w,I m + z m ⟩ H . Theresolventconvergenceisnowbasedontheboundednessandcoercivityoftheforma(·;·,·)andtherespec- tive coefficients are denoted by ˜ α 0 , ˜ µ 0 , and ˜ λ 0 . Together with the Cauchy-Schwarz inequality, the Peter-Paul inequality, the continuous embeddings of the spaceV in the spaceH (i.e. that there exist a constantK such that|·| H ≤ K∥·∥ V and (5.17), for anyϵ > 0, compute ˜ µ 0 ||z m || 2 V m ≤ a(f m ;z n ,z n )+ ˜ λ 0 |z m | 2 H m =a(f m ;w m ,z m )− a(f m ;w,z m )+ ˜ λ 0 |z m | 2 H m =a(f m ;w m ,z m )− a(f;w,I m + z m )+a(f;w,I m + z m )− a(f m ;w,z m )+ ˜ λ 0 |z m | 2 H m =⟨I m z,z m ⟩ H m− λ 0 ⟨w m ,z m ⟩ H m−⟨ z,I m + z m ⟩ H +λ 0 ⟨w,I m + z m ⟩ H 97 + Z Q (a(q;w,I m + z m )f(q)− a(q;w,I m + z m )f m (q))dq+ ˜ λ 0 |z m | 2 H m = Z Q ⟨z,I m + z m ⟩ H (f m (q)− f(q))dq+ ˜ λ 0 Z Q (⟨w,I m + z m ⟩ H f(q)−⟨ w,I m + z m ⟩ H f m (q))dq + Z Q (a(q;w,I m + z m )f(q)− a(q;w,I m + z m )f m (q))dq ≤ Z Q |z| H |I m + z m | H |f m (q)− f(q)|dq+ ˜ λ 0 Z Q |w| H |I m + z m | H |f(q)− f m (q))|dq + ˜ a 0 Z Q |w| V |I m + z m | V |f(q)− f m (q)|dq ≤ K Z Q |z| V |I m + z m | V |f m (q)− f(q)|dq+K ˜ λ 0 Z Q |w| V |I m + z m | V |f(q)− f m (q))|dq +K ˜ a 0 Z Q |w| V |I m + z m | V |f(q)− f m (q)|dq UsingtheDominatedConvergenceTheoremandthehypothesesoftheTheoremitfollowsthat ˜ µ 0 ||z m || 2 V m → 0 asm→∞. This shows resolvent convergence, and with Theorem 5.0.3 the claim is proven. Applying Theorems 5.0.3 and 5.0.4 it can be proved that |y k (f m )− y k (f)| → 0 and |y N k (f m )− y N k (f)|→0 as follows. Corollary5.0.1. UnderthesamehypothesesofTheorems5.0.3and5.0.4,itfollowsthat|y k (f m )− y k (f)|→0 asm→∞foreachk =1,2,...,K, wherey k (f)= P k− 1 j=0 h k− j− 1 (f)u j foreverysequence{u j } ∞ j=0 , with u j ∈R. Proof. Recall thath i (f)= ˆ C(f) ˆ A(f) i− 1 ˆ B(f),i=1,2,.... Using Theorems 5.0.3 and 5.0.4, it follows that ˆ A(f m ) i I m x− I m ˆ A(f) i x→ 0 inH m for eachx∈H, for eachi = 0,1,..., since ˆ A(f) =T(τ ;f) = e A(f)τ . 98 The operator ˆ B can be expressed as ˆ B(f)= Z τ 0 T(s;f)B(f)ds=A(f) − 1 T(s;f)B(f)| τ 0 =A(f) − 1 ( ˆ A(f)− I)B(f). By Theorems 5.0.3 and 5.0.4, A and ˆ A depend continuously on f. The resolvent convergence shown in Theorem 5.0.4 also establishes the continuity ofA(f) − 1 . Recall thatB(f) is defined by ⟨B(f)u, ˆ ψ ⟩ = E π (q) ⟨B(q)u, ˆ ψ ⟩ = R Q ⟨B(q)u, ˆ ψ ⟩dπ (q) = R Q ⟨B(q)u, ˆ ψ ⟩f(q)dq, u ∈ R and by the assumption that {f m },m=1,...,misaconsistentsequenceofestimatorsforf,theDominatedConvergenceTheoremyields the continuity ofB onf. Thus, ˆ B(f m )r− I n ˆ B(f)r→0 inH n for eachr∈R. Finally, the operator ˆ C is defined by ˆ C ˆ ϕ =E π ( ˆ C ˆ ϕ ). Again, the Dominated Convergence Theorem shows that ˆ C is continuous inf. It follows thath i is a concatenation of operators that are continuous in f, thereforeh i depends contin- uously on f for every i = 1,2,.... That is, |h i (f m )− h i (f)| → 0 for each i = 1,2,.... Asy k (f) = P k− 1 j=0 h k− j (f)u j is a finite sum of terms involving h i (f), the claim follows. Corollary5.0.2. UnderthesamehypothesesofTheorems5.0.3and5.0.4,itfollowsthat|y N k (f m )− y N k (f)|→ 0 for each k = 1,2,..., uniformly in N, N = 1,2,..., wherey N k (f) = P k− 1 j=0 h N k− 1− j (f)u j for every se- quence{u j } ∞ j=0 , withu j ∈R. Proof. Theoperators ˆ A N (f), ˆ B N (f)and ˆ C N (f)aretheinfinite-dimensionaloperators ˆ A(f), ˆ B(f), ˆ C(f) restrictedtothefinite-dimensionalspaces H N ,N =1,2,.... Hence,thecontinuityof ˆ A(f), ˆ B(f)and ˆ C(f) gives rise to the equicontinuity of the families{ ˆ A N (f)} ∞ N=1 ,{ ˆ B N (f)} ∞ N=1 and{ ˆ C N (f)} ∞ N=1 . Then, the claim follows with Corollary 5.0.1. Theorem5.0.5. Under the same hypotheses of Theorems 5.0.3 and 5.0.4, the following statements are true 1. |u k (f m )− u k (f)| → 0 for each k = 0,1,2,...,K, where{u k (f)} denotes the optimal input as described in (4.7). 99 2. |u N k (f m )− u N k (f)|→0foreachk =0,1,2,...,K, uniformlyinN,N =1,2,...,where{u N k (f)} denotes the optimal input as described in (4.8) forM fixed. Proof. Observe that by Corollary 5.0.1 , the objective functionalJ(u) as given in (4.7) depends continuously onf. By Corollary 5.0.2, the same holds true for the objective functionalJ L (u) for fixed M. Together with the convexity ofJ andJ L this immediately yields the claims. The previous findings have primarily focused on the continuous dependence of the estimators y andu onf. However, in practice, the operators ˆ A, ˆ B, and ˆ C defined on the infinite dimensional spaces H and V cannot be directly utilized. Instead, they need to be approximated by finite-dimensional counterparts ˆ A N , ˆ B N , and ˆ C N as described in Section 4.2. In the following, the focus shifts to convergence results for these approximating operators, which are based on variations of the Trotter-Kato theorem as documented in [3]. Theorem5.0.6. LetL N andL be infinitesimal generators of C 0 semigroupsS N (t) andS(t) onH N and H, respectively, satisfying: 1. Thereexistconstantsγ ,M suchthat|S N (t)|≤ Me γt foreachN. Thatis,L N ∈G(M,γ )uniformly inN. 2. There exists λ ∈ ρ (L)∩ ∞ N=1 ρ (L N ) with Reλ > γ such that R λ (L N )P N x → R λ (L)x for each x∈H. Then, for eachx∈H,S N (t)P N x→S(t)x uniformly int on any compact interval. The proof can be found in [3]. In the following Theorem the resolvent convergence assumed in Theorem 5.0.6 is established when L = A(f),L N = A N (f),S(t) = T(t;f) andS N (t) = T N (t;f) defined as in Chapter 4 Section 100 4.2. Properties (i)-(iii) in Section 2.2 hold for these operators so it follows that thatA(f) andA N (f) ∈ G(λ 0 ,M) therefore assumption 1 in Theorem 5.0.6 is satisfied. Theorem5.0.7. Foreachz∈V,assumethatthereexistsasequence{ˆ z N }∈H N suchthat∥z− ˆ z N ∥ V →0 asN →∞. Then, forλ =λ 0 ,R λ (A N (f))P N z→R λ (A(f))z in theV norm for anyz∈H. Proof. Firstly, note thatA N (f) results from the restriction ofa toH N × H N and so λ = λ 0 can be chosen in (ii) such thatλ ∈ ρ (A(f))∩ ∞ N=1 ρ (A N (f)) withλ > λ 0 . Forz ∈H, denotew = R λ (A(f))z and w N = R λ (A N (f))P N z. As w ∈ Dom(A(f)) ⊂ V, then there exists a sequence{ˆ w N } such that ∥ˆ w N − w∥ V →0 asN →∞. The goal is to show∥w N − w∥ V →0. By the triangle inequality, ∥w N − w∥ V ≤∥ w N − ˆ w N ∥ V +∥ˆ w N − w∥ V , andthereforeusingtheassumptionthat∥ˆ w N − w∥ V →0,itsufficestoshow ∥w N − ˆ w N ∥ V →0asN →∞. Define z N = w N − ˆ w N ∈H N ⊂ V. Note that other than in Theorem 5.0.4, all the spacesH N here are subspaces ofH, so the only appearing inner product is⟨·,·⟩ H which for readability will be denoted as⟨·,·⟩ . Then, a(f;w,z N )=⟨− A(f)w,z N ⟩ =⟨(λ 0 I− A(f))R λ 0 (A(f))z,z N ⟩− λ 0 ⟨w,z N ⟩ =⟨z,z N ⟩− λ 0 ⟨w,z N ⟩ (5.18) as well as a(f;w N ,z N )=⟨− A N (f)w N ,z N ⟩ 101 =⟨(λ 0 I− A N (f))R λ 0 (A N (f))P N z,z N ⟩− λ 0 ⟨w N ,z N ⟩ =⟨z,z N ⟩− λ 0 ⟨w N ,z N ⟩. (5.19) Combining (5.18) and (5.19) yields a(f;w N ,z N )=a(f;w,z N )+λ 0 ⟨w− w N ,z N ⟩. (5.20) Using the coercivity ofa(f;·,·) together with (5.20) the following is obtained ˜ µ 0 ||z N || 2 V ≤ a(f;z N ,z N )+ ˜ λ 0 |z N | 2 H =a(f;w N ,z N )− a(f; ˆ w N ,z N )+ ˜ λ 0 |z N | 2 H =a(f;w,z N )+λ 0 ⟨w− w N ,z N ⟩− a(f; ˆ w N ,z N )+ ˜ λ 0 |z N | 2 H =a(f;w− ˆ w N ,z N )+ ˜ λ 0 (⟨w− w N ,z N ⟩+|z N | 2 H ). The boundedness ofa(f;·,·) further gives ˜ µ 0 ∥z N ∥ 2 V ≤ α 0 ∥w− ˆ w N ∥ V ∥z N ∥ V + ˜ λ 0 (⟨w− w N ,z N ⟩+|z N | 2 H ) (5.21) Using ⟨w− w N ,z N ⟩=⟨w− ˆ w N ,z N ⟩+⟨ˆ w N − w N ,z N ⟩=⟨w− ˆ w N ,z N ⟩−⟨ z N ,z N ⟩, in (5.21) it then turns into ˜ µ 0 ∥z N ∥ 2 V ≤ α 0 ∥w− ˆ w N ∥ V ∥z N ∥ V + ˜ λ 0 ⟨w− ˆ w N ,z N ⟩ ≤ α 0 ∥w− ˆ w N ∥ V ∥z N ∥ V +| ˜ λ 0 ||w− ˆ w N | H |z N | H . 102 Finally, the embedding|·| H ≤ K|·| V results in ˜ µ 0 ∥z N ∥ 2 V ≤ α 0 ∥w− ˆ w N ∥ V ∥z N ∥ V +| ˜ λ 0 |K 2 ∥w− ˆ w N ∥ V ∥z N ∥ V =∥z N ∥ V ∥w− ˆ w N ∥ V (α 0 +| ˜ λ 0 |K 2 ), and therefore ˜ c 0 ∥z N ∥ V ≤∥ w− ˆ w N ∥ V , where˜ c 0 = ˜ µ 0 α 0 +| ˜ λ 0 |K 2 which concludes the proof since∥ˆ w N − w∥ V →0 asN →∞. Corollary5.0.3. Let the assumptions of Theorem 5.0.7 hold. For the model in (4.2)-(4.3) assume thatq∼ f, with f ∈ F as defined in (4.1). Then |y N k (f)− y k (f)| → 0 as N → ∞ for each k = 1,2,..., where y N k (f)= P k− 1 j=0 h N k− 1− j (f)u j for every sequence{u j } ∞ j=0 , withu j ∈R. Proof. By Theorems 5.0.6 and 5.0.7, it is obtained that ˆ A N (f)P N x− ˆ A(f)x → 0 as N → ∞ inH for eachx∈H, whereP N denotes the orthogonal projection ofH ontoH N . Then, it immediately follows that ( ˆ A N (f)) k P N x− ( ˆ A(f)) k x→0asN →∞inHforeachx∈Handeachk =1,2,.... Further,recallthat ˆ B(f) = A(f) − 1 ( ˆ A(f)− I)B(f) and ˆ B N (f) = A N (f) − 1 ˆ A N (f)− I N P N B(f), which implies ˆ B N (f)r− ˆ B(f)r→0 inH for eachr∈R. Finally, with ˆ C N definedas ˆ C N = ˆ CP N , itisconcluded that |h N k (f)− h k (f)|→0asN →∞foreachk =1,2,...,whereh N k (f)= ˆ C N (f)( ˆ A N (f)) k− 1 ˆ B N (f). Then, sincey k (f)= P k− 1 j=0 h k− j− 1 (f)u j andy N k (f)= P k− 1 j=0 h N k− j− 1 (f)u j itfollowsthat|y N k (f)− y k (f)|→0 asN →∞ for eachk =1,2,..., for every sequence{u j } ∞ j=0 , withu j ∈R. Theorem5.0.8. Let the assumptions of Theorem 5.0.7 hold. Then for fixed f ∈F,|u N k (f)− u k (f)|→ 0 as N → ∞ for eachk = 0,1,2,... where{u k (f)} denotes the optimal input determined as given in (4.7), 103 and{u N k (f)} denotes the optimal input as given in (4.8) corresponding tof ∈F. Proof. Recall that, for fixed f, the finite dimensional approximating deconvolution problem is given by u ∗ L =argmin U M J L (u)=argmin U M K X k=1 |y N k (u)− ˆ y k | 2 +||u|| 2 S(0,T) . (5.22) For anyL, it will be shown that the problem (5.22) admits a solutionu ∗ L , and that there exists a subsequence {u ∗ L k }⊂{ u ∗ L } withu ∗ L k →u ∗ ask→∞, whereu ∗ is the solution to the infinite-dimensional deconvolu- tion problem as in (4.7). By(4.6),y N k dependscontinuouslyontheinputsequenceuandhenceJ L iscontinuousonu. SinceU M is assumed to be compact, for everyL there exists a solution to (5.22). Consider a convergent sequence{u M } in U⊂ S(0,T) withu M ∈U M , and||u M − u|| S(0,T) →0 asM →∞, for anyu∈U. Then, by Corollary 5.0.3, it follows that |y N k (u M )− y k (u)|→0, asL=(N,M)→∞, k =1,2,...,K. This shows J L (u M ) → J(u) as L → ∞. Now, consider a sequence of solutions to the finite dimensional approximating optimization problem (5.22),{u ∗ L }⊂ U M ⊂ U. Leveraging the compactness ofU, a subse- quence{u ∗ L k } ⊂ { u ∗ L } withu ∗ L k →u ∗ ∈U ask → ∞ is obtained. Letu ∈U be arbitrary and denote L k =(N k ,M k ). Then, J(u ∗ )=J( lim k→∞ u ∗ L k )= lim k→∞ J L k (u ∗ L k )≤ lim k→∞ J L k (u M k )=J( lim k→∞ u M k )=J(u). Here,{u M k }⊂ U withu M k ∈U M k is the sequence of approximations tou whose existence is given by the the assumptions of Theorem 5.0.7. Hence,u ∗ solves the problem (4.7). 104 Finally, the following Theorem proves consistency under approximation forh N k,m = {h N k (f m )} K k=0 , y N k,m = {y N k (f m )} K k=0 , andu N k,m = {u N k (f m )} K k=0 when f is approximated by a sequence of Kernel Density Estimators. Theorem 5.0.9. Letq ∼ f with f ∈ F. Let{f m }, m = 1,2... be a sequence of Kernel Density Esti- mates forf as in (5.1). Let the assumptions of Theorem 5.0.1 hold. Thenh N k,m ,y N k,m , andu N k,m are approx- imating consistent sequences of estimators forh k ,y k , andu k , respectively, whereh N k,m = {h N k (f m )} K k=0 , y N k,m ={y N k (f m )} K k=0 , andu N k,m ={u N k (f m )} K k=0 . Proof. Theorem5.0.2isapplied. First,sincetheAssumptionsofTheorem5.0.1aresatisfieditfollowsthat {f m } isaconsistentsequenceofestimatorsforf,thatissup|f m − f|→0a.sasm→∞. InCorollary5.0.1itwas establishedthath k (f)andy k (f)arecontinuousinf. ByTheorem5.0.5thesameistrueforu k (f). Corollary 5.0.2andTheorem5.0.5showthatthisisalsotrueforh N k ,y N k andu N k ,uniformlyinN,N =1,2,.... Finally, Corollary 5.0.3 showslim N→∞ |h N k (f)− h k (f)|=0 andlim N→∞ |y N k (f)− y k (f)|=0, respectively, for everyk = 1,2,...,K, and Theorem 5.0.8 shows|u N k (f)− u k (f)| → 0 fork = 1,2,...,K asN → ∞. Thus, all the assumptions of Theorem 5.0.2 are satisfied and the desired result is obtained. 105 Chapter6 RegressionModelsforthePredictionoftheTACParametersand Sub-populationSelection Once a population model has been established to study the transfer of ethanol from the blood to the TAC biosensor on the skin (see Chapter 4), sub-population selection is employed to limit the distribution estimate of parameters and reduce uncertainty in the BrAC estimate. The sub-population selection process involves utilizing regression models that rely on the physical characteristics of the subjects (covariates) to predict the TAC parameterq. By estimating the TAC parameters from covariates, the need for individual calibration of the model for each subject who provided the TAC data is eliminated, thereby reducing the calibration burden. 6.1 Regression Models for the Prediction of the TAC Parameters from SubjectCovariates The TAC parameterq consists of two entries, q 1 and q 2 . The dimensionless q 1 describes the diffusivity ethanol in the epidermal layer and the dimensionless q 2 models the product of the alcohol impedance between the epidermal and dermal layers of the skin and the alcohol impedance between the epidermal layer of the skin and the membrane covering the collection chamber of the senor at the location where it 106 comes into contact with the epidermal layer. This Chapter outlines the efforts to improve the accuracy of BAC/BrAC estimates by incorporating subject covariates such as ethnicity, gender, and physical character- istics into the TAC parameters, thereby reducing the need for calibration. The development of regression models that use subject covariates as predictors to determine TAC parameters is a significant advancement towards a more reliable and valid method for quantitatively measuring BAC/BrAC from TAC. 6.1.1 RegressionModelsFormulation To train regression models that can predict the TAC parameters based on subject covariates, a dataset comprising of both theq parameters and the corresponding covariate values of the subjects is required. Assume that BrAC and TAC data pairs,{{u j,k } K k=0 ,{y j,k } K k=0 } M j=1 for M drinking episodes (of M individ- uals) have been collected in the lab or clinic. Assume further that these BrAC/TAC data pairs are used to obtain the least squares estimators for the parametersq =[q 1 ,q 2 ] that appear in the individual model (2.1) or (2.2) yielding the M individual model parameter estimates,{q j = [q 1,j ,q 2,j ]} M j=1 . Also, assume that covariate vectors for the M individuals have been collected. To obtain a prediction for theq parameter of a new individual, it is necessary to establish a mapping from the covariates to theq parameters. Figure 6.1: A mapping from the subject covariates to the TAC parameterq LetC = [c 1 ,c 2 ,...,c n ]∈R M× n denote the covariate matrix wherec i ,i = 1,...,n are the vectors of then covariates, containing the covariate values of the M individuals. Also, letQ = [q 1 ,q 2 ]∈R M× 2 denote the corresponding individual model parameter estimates. A mapping from the covariate matrix C to the TAC parameters inQ is desired. The mapping can follow a linear approach of modelling the relationship between target variableq and explanatory variables (covariates), i.e. linear model, a linear approach of modelling the relationship between transformed expected target variable in terms of the link 107 function and the explanatory variables, i.e. generalized linear model or other learning algorithms, for instance decision tree, random forest, neural nets, etc. Note that the TAC parametersq 1 andq 2 can either be treated as dependent or independent random variables. In this work, and in particular in Chapter 8, to demonstrate the methods presented here, q 1 andq 2 are treated as independent random variables. For instance, if a linear regression model is selected as the mapping from the covariates to the TAC parameters, two separate functionsf 1 :R n →R andf 2 :R n →R are required to be learned such that q 1,i =f 1 (C i )+ϵ 1,i =β 1,0 + n X j=1 β 1,j C i,j +ϵ 1,i and q 2,i =f 2 (C i )+ϵ 2,i =β 2,0 + n X j=1 β 2,j C i,j +ϵ 2,i for i=1,...,M whereC i ,i = 1,...,M is the i th row of the covariate matrixC, [q 1,i ,q 2,i ] is the i th row ofQ and ϵ 1,i , ϵ 2,i , i = 1,...,M are iid random variables fromN(0,σ 2 1 ) andN(0,σ 2 2 ) respectively. Therefore, after obtaining least squares estimates b β 1,0 ,..., b β 1,n , b β 2,0 ,..., b β 2,n for β 1,0 ,...,β 1,n ,β 2,0 ,...,β 2,n the TAC parameterb q =[b q 1 ,b q 2 ] of a new individual with covariate vectore c=[e c 1 ,e c 2 ,...,e c n ] T is given by b q 1 = b β 1,0 + n X j=1 b β 1,j e c j and b q 2 = b β 2,0 + n X j=1 b β 2,j e c j Various regression models and machine learning algorithms can be trained and evaluated using differ- ent evaluation metrics to determine their ability to predict the TAC parameterq. The algorithm with the highest empirical performance can then be chosen and the predictionsb q ofq can be used as explained in Section 6.2 for sup-population selection with the aim to reduce the width of the conservative band of the deconvolved signalu. 108 6.2 Sub-populationSelection Now, letq be a random vector, and denote this random vector byq as in Chapter 4. In Section 4.3, it was explained that by sampling from the distribution of the random vectorq and solving the optimization problem (4.8), one can obtain samples of the deconvolved signalu. Furthermore, by sampling a large num- ber of estimated input signals, it is possible to find conservative bands within which a certain percentage of these signals fall. This section introduces a method to achieve narrow bands for the deconvolved signal u. The method is based on sub-population selection and the Implicit Function Theorem stated below, the proof of which can be found in [76]. Theorem 6.2.1. Let X, Y and Z be Banach spaces and let Ω be an open subset of X × Y. Let F be a continuously differentiable map from Ω to Z. If (b x,b y) ∈ Ω is a point such that D y F(b x,b y) is a bounded, invertible, linear map from Y to Z, then there is an open neighborhood G ofb x, and a unique function f : G→Y, such that F(x,f(x))=F(b x,b y) , ∀x∈G Consider that the input TAC signaly is expressed as a function of the random vectorq and the output eBrAC signalu,y = F(q,u), such that F : R 2 × R n → R m . For a fixed point (q 0 ,u 0 ), if the assumptions of Theorem 6.2.1 are met then there exists an open neighborhoodG ofq 0 and a unique functionf : G → R n , such thatf(q 0 ) = u 0 , andF(q,f(q)) = F(q 0 ,u 0 ) for allq ∈ G. This implies that when a setQ such thatP(q∈Q)=p is chosen, then p=P(q∈Q)≤ P(f(q)∈f(Q))=P(u∈f(Q)) Thus, given a fixed probability p, reducing the variability in theq spaceQ results in reduced variability inf(Q), leading to narrower conservative bands for the estimated output signalu compared to when the 109 entire population was used. The process of selecting a sub-population to achieve narrow conservative bands is explained below. In Section 6.1.1, it was explained how the parameter q of an individual is predicted from the indi- vidual’s covariates through regression models or machine learning algorithms. Let the population of the random parametersq be denoted by{q i } M i=1 , and consider a new individual with covariate vector e c = [e c 1 ,e c 2 ,...,e c n ] T for which a predicted TAC parameter b q = [b q 1 ,b q 2 ] was obtained using regression models or machine learning algorithms as described in Section 6.1.1. The predictedb q is then used to select the sub-population of the population{q i } M i=1 to be considered for constructing the bands. Along with b q, to select the sub-population, an estimate of the variance of the model or algorithm used to obtainb q is required. This can be obtained by using the cross-validation (CV) approach as follows. The population {q i } M i=1 is divided intoK folds, each consisting ofl samples. For each fold, predictions are obtained by training the regression model or machine learning algorithm on the remainingK− 1 folds. This process is repeatedK times, each time leaving out a different fold. Then, the cross-validation estimate of the variance is calculated by e σ 2 1,j = 1 lK K X k=1 l X i=1 (q 1,i,k − b q 1,i,k ) 2 and e σ 2 2,j = 1 lK K X k=1 l X i=1 (q 2,i,k − b q 2,i,k ) 2 whereq 1,i,k ,q 2,i,k i = 1,...,n, k = 1,...,K are the true values of the i th sample of the k th fold andb q 2,i,k ,b q 2,i,k are the corresponding predicted values. Repeating N times with different splits of the population, N estimates of the model variance are calculated and then the final estimate b σ 2 = [b σ 2 1 ,b σ 2 2 ] of the variance is taken to be the average of the N cross validation estimates b σ 2 1 = 1 N N X j=1 e σ 2 1,j and b σ 2 2 = 1 N N X i=j e σ 2 2,j (6.1) 110 The restriction of the population{q i } M i=1 is then the rectangle [b q 1 ± Lb σ 1 ]× [b q 2 ± Lb σ 2 ] (6.2) whereL is a constant andb σ 1 ,b σ 2 are as in (6.1). The choice of L depends on the choice of the regression model or machine learning algorithm for the prediction ofq (more in Chapter 8). When the sub-population is selected, it is used to restrict the kernel density estimate that has already been computed based on the entire population{q i } M i=1 , as in (4.9) for a uni-variate random variable or in (4.10) for a bivariate random variable. Note that when the TAC parameters are assumed to be independent (as in the experiments presented in Chapter 8) the kernel density estimate ofq is the product of the two uni-variate kernel densities, one fitted on q 1 and one fitted on q 2 . After conditioning the kernel density estimate based on the restriction in 6.2, the support of the resulting density estimate shifts from the entire population to the particular sub-population. Therefore, to maintain the properties of a density function, normalization is necessary to ensure that the restricted estimate integrates to a value of 1. Thus, the conditional kernel density estimate e f based on the sub-population selection can be expressed as e f(q;H)= 1 R b q 1 +Lc σ 1 b q 1 − Lc σ 1 R b q 2 +Lc σ 2 b q 2 − Lc σ 2 b f(q,H)dq 1 dq 2 b f(q;H) , where q∈[b q 1 ± Lc σ 1 ]× [b q 2 ± Lc σ 2 ] (6.3) 111 Chapter7 Simulator A software named Simulator was created using Matlab to produce datasets with records that include co- variates, a drinking diary, as well as TAC and BrAC parameters and signals that correspond to an individ- ual’s drinking session. The Simulator is an interactive software that allows users to enter the number of records to be generated and those records are simulating a realistic drinking scenario for an individual. The purpose of developing the Simulator is to produce data with the aim of evaluating the impact of the sub-population selection method described in Chapter 6 on the reconstructed BrAC curves and the con- servative bands. The need for the simulator arose due to limitations in the available lab data (Dataset3), which only comprised 267 drinking sessions from 40 subjects, making the dataset insufficiently large and lacking variability in the covariate values among the subjects. In Figure 7.1 a schematic diagram of the Simulator is presented. The covariates and a drinking diary can be inputted either manually by the user or generated by the Simulator. In the diagram, the latter method is demonstrated (for the first method the purple boxes are omitted). The Simulator uses the covariates to calculate the BrAC parameterv in the Michaelis-Menten model for absorption, metabolism, and elimina- tion of alcohol from the bloodstream which will be presented in (7.8). Additionally, it uses the covariates to calculate the TAC parameterq of the subject which appears in (2.6)-(2.7). Then, the Michaelis–Menten alcohol clearance model, which operates on a nonlinear pharmacokinetic basis, is utilized to estimate the 112 BrAC from the drinking diary andv. Finally, the estimated BrAC is used withq to obtain the estimated TAC. Figure 7.1: Schematic diagram of the Simulator In Figure 7.2 a flowchart of the Simulator is presented. The Simulator inquires, “ Do you wish to enter covariates? ”. If the user answers yes, they manually input the covariate vector. On the other hand, if the user chooses not to enter covariates, the Simulator generates them from statistical models that use relevant statistical quantities of Dataset3. Next, the Simulator inquires, “ Do you want to enter a drinking diary? ". If the answer is yes, the user inputs the drinking diary manually. If the answer is no, the Simulator selects one of the predefined drinking diaries stored. Upon completion of the above steps, the Simulator calculates the BrAC parameter (v), TAC parameter (q) and the estimated BrAC (eBrAC) and TAC (eTAC) curves. The output of the Simulator consists of the covariate vector, the drinking diary,v,q, eBrAC and eTAC. 113 Figure 7.2: Flowchart of the Simulator All calculations needed for the development of the Simulator were carried out on either standard PC laptops or desktop machines, or on Apple Macbook Pro laptops. In either case, all codes were developed in Matlab and Python. The Simulator software has been developed for utilization in Matlab. 114 7.1 Dataset3 The goal of the simulator is to create datasets that contain meaningful covariate data as well as BrAC and TAC curves that reflect a real drinking episode. To achieve this, the statistical properties (mean and standrad deviation) of the numerical covariates and the proportions of each category of the categorical covariates from Dataset3 were used in the Simulator to generate covariate values (see Subsection 7.2.1). Furthermore, the BrAC and TAC parameters of the subjects in Dataset3, along with their corresponding covariate values, were utilized to determine how the Simulator generates the BrAC and TAC parameters (see Subsections 7.2.3 and 7.2.4). Each of the 40 participants in Dataset3 completed 4 laboratory drinking sessions. Each participant consumed the same amount of alcohol in all four sessions, which was designed to reach a maximum BrAC peak of .080 for that person based on body water weight. An Intoximeter IV breath analyzer (Intoximeters, Inc., St. Louis, MO) was used to obtain BrAC, and two SCRAM-CAM transdermal alcohol sensors in the form of wearable bracelets (Alcohol Monitoring Systems, AMS, Littleton, CO) were placed on the upper arms to obtain TAC. Measurements were taken at 10-minute intervals over the entire BrAC curve. After BrAC returned to .000, the TAC devices continued to record automatically at 30- minute intervals until TAC also returned to .000. Several sets of BrAC-TAC data were excluded from analyses due to a change in the protocol, and with two sets of TAC data recorded in each session, our final sample consisted of 239 usable BrAC-TAC datasets of 32 participants ([52]). In addition, for each subject tested, a set of covariates that have been observed to have an impact on the relationship between BrAC and TAC were obtained. The list of covariates consists of 7 numerical and 2 categorical attributes (A-G) and (H-I) respectively (see Table 7.1). Statistical quantities of the Dataset3 covariates are shown in Table 7.2. Also, Figures 7.3-7.5 provide a better understanding of Dataset3. 115 Covariate List Name Code Explanation HT A Height WT B Weight BF C Body Fat VF D Visceral Fat RM E Resting Metabolism WH F Waist-Hip Ratio AGE G Age BIOSEX H Sex (biological) ETH I Ethnic identity Table 7.1: Dataset3 Covariate List Dataset3 Covariate Statistics Covariate Mean Std Min Max HT 171.15 10.61 149 188.5 WT 77.86 17.41 49.4 113.3 BF 30.11 11.43 10.2 53.5 VF 6.84 3.99 2 18 RM 1639.52 288.49 1121 2219 WH 0.87 0.075 0.72 1.05 AGE 24.57 3.25 21 33 Table 7.2: Dataset3 Covariate Statistics 116 Figure 7.3: Dataset3 TAC Parameters Figure 7.4: Dataset3 Categorical Covariates 117 Figure 7.5: Dataset3 Numerical Covariates 7.2 SimulatorDevelopment 7.2.1 Covariates The Simulator presents the user with two choices: (1) manual entry of the subject’s covariate vector, or (2) automatic generation of covariates using statistical models that rely on information from Dataset3. If the second option is chosen, the following steps are followed to generate a subject’s covariate vector. Letµ A ,...,µ G represent the sample mean values andσ A ,...,σ G represent the sample standard devi- ation values multiplied by the constant 0.75 of the respective covariate columns in Dataset3 (the codes A-G 118 are as in Table 7.1). Also, letp∈R 5 be a vector containing the sample proportions for each of the 5 cate- gories (african american, asian, caucasian, hispanic, mixed) of the ethnicity covariate (code I) in Dataset3. The generated covariate vector of a subject will be produced based on the following statistical models: HT ∼ N(µ A , σ 2 A ) WT ∼ N(µ B , σ 2 B ) BF ∼ N(µ C , σ 2 C ) RM ∼⌊ X⌋ , X ∼ N(µ D , σ 2 D ) WH ∼ N(µ E , σ 2 E ) VF ∼⌊ Y⌋ , X ∼ N(µ F , σ 2 F ) AGE∼⌊ Z⌋ , Y ∼ N(µ G , σ 2 G ) BIOSEX = 1 K>0.5 where K∼ U(0,1) ETH =i1 I i =1 where [I 1 ,I 2 ,I 3 ,I 4 ,I 5 ]∼ Multinomial(1,p) This rounding to the nearest integer and scaling the sample standard deviations of the statistical models by 0.75 are both important steps in ensuring the generated covariates are within the acceptable range of values. By rounding to the nearest integer, the generated values of RM, VF and AGE will match the values in Dataset3, which is important for ensuring the generated data is comparable to the real data. By scaling the sample standard deviations by 0.75, the range of values that can be generated is reduced, avoiding the possibility of generating negative values for the covariates. After trying out various other values, the constant 0.75 was selected because it is the highest value that ensures the generated covariates remain positive, even when generating datasets with one million records. Negative values for the covariates would not be acceptable because the domain of all the covariates is strictly positive. Note that the categorical covariate ETH is stored in the simulator in the form of dummy variables. 119 7.2.2 DrinkingDiaries The simulator offers users two options for inputting drinking data: (1) manual entry or (2) random as- signment of predefined drinking diaries. A drinking diary consists of two columns that record the time elapsed (in hours) and the cumulative amount of standard drinks consumed. A standard drink is defined by the National Institute of Alcohol Abuse and Alcoholism (NIAAA), a research institute under the National Institutes of Health (NIH), as containing 14 grams or 0.56 fl. oz. of pure ethanol ([17], [32]). This can be represented by one 12 fl. oz. beer, 5 fl. oz. of table wine, 2-3 fl. oz. of aperitif, or 1.5 fl. oz. of 80-proof spirits (hard liquor). The predefined drinking diaries in the Simulator can be found in Table 7.3. The purpose of creating these drinking diaries is to capture a range of drinking scenarios that could occur in real-life situations. The diaries are designed to vary in factors such as the total number of standard drinks consumed and the duration of the drinking session. For instance, Diary 3 portrays an individual who had only one standard drink over a span of 1.75 hours, whereas Diary 10 portrays an individual who consumed four standard drinks in a period of 3.25 hours. 120 Drinking Diaries (Cumulative Number of Standard Drinks) Time (hours) D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 0 0 0 0 0 0 0 0 0 0 0 0.25 0.5 0.5 0.5 0.25 0.5 0.5 0.5 0.5 0.5 0.5 0.5 1 0.5 0.5 0.25 0.5 0.5 0.5 0.5 1 0.75 0.75 1 0.5 0.5 0.5 0.75 0.5 1 0.75 1 1 1 1 0.5 0.5 0.5 1 0.5 1 0.75 1 1 1.25 1.5 1 0.5 0.75 1 0.5 1 1 1 1 1.5 1.5 1 0.5 0.75 1 1 1 1 1 1 1.75 1.5 1 1 1 1 1 1.5 1.5 1 1 2 1.75 1 1 1 1 1.5 1.5 1 1.5 2.25 2 1 1.25 1 1 1.5 2 1 2 2.5 1.5 1.25 1 1 1.5 2 1 2.5 2.75 1.5 1.5 1 1 2 2 1 3 3 1.5 1.5 1 1.75 2 3 1.5 3.5 3.25 2 1.75 1 2 2 3 2 4 3.5 2.5 2 2 2.5 3 3 3.75 2 3 3 4 2 3 4.25 2.5 3 4.5 2.5 3.5 5 2.75 Table 7.3: Simulator Drinking Diaries 121 7.2.3 BrACparameters Each covariate vector - drinking diary pair either generated by the Simulator or entered manually by the user is assigned with a BrAC parameter vectorv∈R 4 in the form v =(K,M,α, 1) , K,M,α> 0, (7.1) whereα is the normalized rate of ethanol absorption from the gut into the bloodstream in units of mg/(dlsdsh), K is the maximum rate of metabolism/elimination achieved when u(t) >> M in units of mg/(dlh) and M, the Michaelis-Menten parameter, is the substrate concentration in units of mg/dl when the metabolism/elimination rate is K/2. The BrAC parameter vector (when obtained) along with the drinking diary and a nonlinear pharmacokinetic (Michaelis-Menten) based model for the absorption, metabolism, and elimination of alcohol from the bloodstream will then be used to obtain the BrAC estimate for a particular drinking episode (see Subsection 7.2.5). For the purpose of simulating data, it is assumed that the BrAC parameters of an individual are related to their covariate vector. In (7.1), the BrAC parameter v contains four entries, one of which is set to 1 as it equals to 1 across all records in Dataset3. To maintain consistency with the Dataset3, the simulated parameterv also has its fourth entry set to 1. For each of the remaining three entries inv a regression model was fitted to the covariate - BrAC parameter pairs in Dataset3. LetC = [1,c 1 ,...,c n ] denote the covariate matrix of the subjects in Dataset3, where c i is the column vector of thei th covariate. Letv D1 ,v D2 ,v D3 be the vectors of the BrAC parameter values in Dataset3 that correspond toK,M andα respectively. Let e v Di =log((v Di − min(v Di ))/(max(v Di )− min(v Di ))) , i=1,2,3 (7.2) 122 Then, three individual linear regression models are fitted as follows: e v Di =Ca i +ϵ i , i=1,2,3 (7.3) wherea i =(a i0 ,a i1 ,a i2 ,...,a in ) T ,i=1,2,3 are the linear regression models’ coefficients. Note that the BrAC parametersv D1 ,v D2 ,v D3 of Dataset3 are transformed as indicated in (7.2) to prevent the regression models from predicting negative values. The least squares estimates of the coefficient vectors a i ,i=1,2,3 in (7.3), denoted byˆ a i ,i=1,2,3, are stored in the Simulator. Then, when the Simulator generates covariate data for a new subjectj,j = 1,...,J the corresponding BrAC parametersv ij ,i = 1,2,3 are calculated based on their covariate vector(x j1 ,...,x jn ) by e v ij = n X k=1 ˆ a ik x kj +e ϵ ij , v ij =(e e v ij ∗ max(e v ij )+min(e v ij ))/(e e v ij +1) , i=1,2,3 , j =1,...,J (7.4) wheree ϵ ij ∼ N(0,σ B i 2 )i = 1,2,3,j = 1,...,J independent. Note that the focus of this work is not on the BrAC parameters themselves. Therefore, in the experiments presented in Chapter 8, the values ofσ 2 Bi , i = 1,2,3 are selected based solely on the criterion of providing BrAC parameter values that are similar to those found in Dataset3. 7.2.4 TACparameters The TAC parametersq = (q 1 ,q 2 ) are the main focus of the Simulator and this work. The objective of the Simulator is to derive TAC parameters from the covariates. Then, regression models can be trained using the simulated covariates with the corresponding simulated TAC parameters and predictions for the TAC parameters of a new individual can be made, thereby validating the sub-population selection method. After extensive experimentation on various methods for generating TAC parameters from the covariates, 123 the approach described below was selected for the Simulator. Note that this approach yields reasonable and realistic TAC parameters. Some approaches that did not produce reasonable results are fitting linear regression models and linear regression models with interaction terms using covariates as predictors and TAC parameters as outputs. The former produced negative TAC parameter predictions in certain instances, while the latter failed to generate predictions that were comparable to those in Dataset3. The dimensionless parameter q 1 describes the diffusivity ethanol in the epidermal layer and, due to the linearity of the system, the dimensionless parameterq 2 models the product of the alcohol impedance between the epidermal and dermal layers of the skin and the alcohol impedance between the epidermal layer of the skin and the membrane covering the collection chamber of the senor at the location where it comes into contact with the epidermal layer. For the purposes of Simulating data to validate the methods described in Chapter 6,q 1 andq 2 are treated as independent random variables. However, note that sinceq 1 andq 2 depend on the same subject covariates they are neither functionally nor stochastically independent. LetC = [1,c 1 ,...,c n ] denote the covariate matrix of the subjects in Dataset3, wherec i is the column vector of thei th covariate. Letq D1 ,q D2 be the vectors of the estimated TAC parameter values in Dataset3 that appear in the individual model (2.1) or (2.2), obtained from the corresponding BrAC/TAC data pairs. Let e q Di =log((q Di − min(q Di ))/(max(q Di )− q Di )) , i=1,2 (7.5) be the log-link transformation so positivity is guaranteed. Then, two individual linear regression models are fitted as follows: e q Di =Cb i +δ i , i=1,2 (7.6) whereb i =(b i0 ,b i1 ,b i2 ,...,b in ) T ,i=1,2 are the linear regression models’ coefficients. 124 The least squares estimates of the coefficient vectors b i , in (7.3), denoted by ˆ b i ,i = 1,2, are stored in the Simulator. Then, when the Simulator generates covariate data for a new subject j, j = 1,...,J the corresponding TAC parametersq ij ,i = 1,2 are calculated based on their covariate vector(x j1 ,...,x jn ) by e q ij = ˆ b i0 n X k=1 ˆ b ik x kj , q ij =(e e q ij ∗ max(e q ij )+min(e q ij ))/(e e q ij +1)+ e δ ij , i=1,2 , j =1,...,J (7.7) whereδ ij ∼ N(m T i ,σ T i 2 )i=1,2 , j =1,...,J independent. Since linear regression models will be fit to simulated data to predict the TAC parameters of a new individual so to validate the sub-population selection method, the approach described above avoids gen- erating TAC parameters in the same manner as they will be predicted, as this would result in an overly simple solution. This is why noise δ is included in the final calculation of q = (q 1 ,q 2 ) where the in- verse log-link transformation (7.7) is applied instead of the step above like what was done for the BrAC parameters (Subsection 7.2.3). 7.2.5 EstimatedBrACcurve-ANonlinearPharmacokineticModelforAlcohol Absorption,Metabolism,andElimination The drinking diary and BrAC parameters generated by the Simulator as described in Subsections 7.2.2 and 7.2.3 , together with a nonlinear pharmacokinetic (Michaelis-Menten) based model for the absorption, metabolism, and elimination of alcohol from the bloodstream are used to obtain the BrAC estimate for a particular drinking episode. Before introducing the Michaelis-Menten model ([20]), a brief overview of the absorption, metabolism, and elimination of alcohol from the bloodstream is presented. Ethanol, when consumed, is absorbed into 125 the bloodstream through the stomach and intestines. Due to its high water solubility and the body’s com- position of70% water, the ingested ethanol is widely distributed throughout the body. Although a small amount of ethanol is processed out of the body by the kidneys and excreted through sweat, most of the ethanol is metabolized into waste products in the liver with the help of alcohol and aldehyde dehydro- genase enzymes (ADH and ALDH) ([11], [32], [65]). The rate of this reaction follows first-order kinetics when the substrate concentration is low, but as the concentration increases, the reaction becomes satu- rated, leading to zero-order kinetics. Existing models that are mainly used to determine if a person is within the legal limit for drunk driving only account for zero-order kinetics, which are accurate only at relatively high levels of blood alcohol content (BAC). These models are based on large-scale studies of stratified population that take into account various factors such as gender, body weight, drinking history, etc. However, to achieve capturing BAC at all levels, a linear term to model ethanol absorption through the gut and a Michaelis-Menten ([20]) term to capture the metabolism and elimination kinetics, including the switch from first-order to zero-order were combined. The model starts with the assumption that there is no initial ethanol in the blood and takes the form ˙ u(t)=αd (t)− Ku(t) M +u(t) , t>0 , u(0)=0 (7.8) where t is time in hours, u(t) is the subject’s BAC or BrAC in units of mg/dl, d(t) is the number of standard drinks (sds) ingested by the subject at timet,α is the normalized rate of ethanol absorption from the gut into the bloodstream in units ofmg/(dlsdsh),K is the maximum rate of metabolism/elimination achieved whenu(t) >> M in unitsmg/(dlh) andM, the Michaelis-Menten parameter, is the substrate 126 concentration in units of mg/dl when the metabolism/elimination rate is K/2. Then, (7.8) is sampled using an explicit forward difference approximation to the derivative to obtain u k+1 =u k +τ k αd k − τ k Ku k M +u k , k =0,1,2,... , u 0 =0 (7.9) whereτ k denotes the length of thek th sampling interval in units of hours. In practice, the parameters in equations (7.8) or (7.9) would be determined using data from a population of individuals, specifically in the form of records of drinks consumed and time-stamped measurements of BAC/BrAC. However, for the purposes of the Simulator, the parametersK,M, andα for each individual are assigned using the method described in Subsection 7.2.3, where these three parameters are obtained from regression models that use covariate data as predictors. The Simulator is designed to generate BrAC observations for a duration of 14 hours for every simulated drinking episode it generates. 7.2.6 Estimated TAC curve - A linear input/output model for TAC, as a function of BAC/BrACintheformofaconvolution The estimate of BrAC for a particular episode along with the TAC parameters of the individual are used in a linear input/output model model that describes the transdermal transport of ethanol from the blood to the TAC biosensor to obtain an estimate of TAC. The model is explained in detail in Section 2. In particular, all the details of how TAC observations are generated by the Simulator are provided in Sections 2.2 and 2.3. The Simulator is designed to generate TAC observations for a duration of 20 hours for every simulated drinking episode. The matrices used for the finite dimensional approximation of the operators required for TAC observations generation are given by the matrices in (2.8), withN =32. 127 7.3 ValidatingtheSimulation The Simulator generates values for BrAC and TAC parameters that are based on the covariates, as described in Subsections 7.2.3 and 7.2.4. As a result, the estimated BrAC and TAC curves obtained as in Subsections 7.2.5 and 7.2.6 are also dependent on the covariates. The objective of this experiment is to provide some evidence that the Simulator generates reasonable data by investigating how constraining the covariate space affects the TAC parameter space as well as the BrAC and TAC curve spaces, which are the focal points of this work. It should be noted that in order to gain a thorough understanding of how the errors in the regression models in (7.4) and (7.7) propagate into the BrAC and TAC curves, a more extensive analysis is required. One example of such an analysis is the use of the Delta method, which is outside the scope of this work. Two datasets,D 1 andD 2 , each containing 1000 records of covariates, BrAC, and TAC parameters, as well as the corresponding estimated BrAC and TAC curve observations were generated by the Simulator. The datasetD 1 will be utilized to limit the covariate space and determine margins forq, TAC and BrAC curves. On the other hand, D 2 will serve as unseen data to assess the number of itsq, TAC, and BrAC curves that fall within the established margins. All of the drinking sessions in these datasets were based on the same drinking diary, which consisted of one standard drink consumed 45 minutes after time 0. Also, the parametersσ B 1 ,σ B 2 ,σ B 3 ,µ T 1 ,µ T 2 ,σ T 1 ,σ T 2 in (7.4) and (7.7) are chosen as in Table 7.5. For ease of notation, the codes for the covariates, as listed in Table 7.1, will be used in the following analysis. Consider the numerical covariates A-G. The covariate space ofD 1 is given by: R=[A min ,A max ]× [B min ,B max ]× ...× [G min ,G max ] 128 where A min ,A max ,...G min ,G max are the minimum and maximum values of each of the covariates in D 1 as shown in Table 7.4. Then,R is restricted such that it includes90% of the covariate vectors inD 1 . In particular, forx=594.1 the90% restriction ofR is given by e R= ¯ A± xσ A σ A +...+σ G × ¯ B± xσ B σ A +...+σ G × ...× ¯ G± xσ G σ A +...+σ G ⊂ R where ¯ A,..., ¯ G and σ A ,...,σ G are the sample mean and the sample standard deviation values of the covariates A-G in D 1 as shown in Table 7.4. It should be noted that in order to conduct a more com- prehensive analysis, it would be necessary to limit the covariate space in various ways, such as including different percentages of D 1 ’s covariates and setting varying thresholds for each coordinate inR. However, this work does not aim to present further experiments. The90% bounds in theq parameter and BrAC and TAC curve spaces are formed by theq values and BrAC and TAC curves corresponding to the90% of the covariate vectors that belong in e R. It is expected that at least90%, of theq parameters and BrAC and TAC curves inD 2 will be contained within these bounds. As depicted in Figures 7.6-7.8, only 12, 20, and 23 of the 1000q parameters, BrAC, and TAC curves in D 2 respectively, lie outside of the 90% bounds, which provides some indication that the Simulator generates reasonable data. 129 Covariate Sample Mean Sample Std A 171.10 7.71 B 75.71 12.76 C 30.05 8.50 D 6.85 2.99 E 1624.40 214.54 F 0.88 0.056 G 24.78 2.67 Table 7.4: D 1 Covariate Statistics Parameter Value σ B 1 0.1 σ B 2 0.1 σ B 3 0.1 µ T 1 0.48 µ T 2 0.38 σ T 1 0.2 σ T 2 0.16 Table 7.5: Simulator Parameter Values for Efficacy Experiment In Figure 7.6 the rectangle represents the90% bounds based on the covariate vectors inD 1 that belong in e R. The results show that98.8% of theq parameters inD 2 are contained within these bounds. In Figure 7.7 the grey area represents the90% bounds based on the covariate vectors inD 1 that belong in e R.97.7% 130 of the TAC curves inD 2 lie within these bounds. In Figure 7.8 the grey area represents the90% bounds based on the covariate vectors inD 1 that belong in e R.98% of the BrAC curves inD 2 lie within the bounds. Figure 7.6: 90% bounds forq based on the covariate vectors inD 1 . 131 Figure 7.7: 90% bounds for TAC based on the covariate vectors inD 1 Figure 7.8: 90% bounds for BrAC based on the covariate vectors inD 1 132 7.4 DemonstrationoftheSimulator In this section, two experiments are presented. The first experiment demonstrates the impact of ran- domly selecting different drinking diaries from the 10 predefined drinking diaries in the Simulator while keeping covariates constant (i.e., same individual), on the BrAC and TAC curves. The second experiment demonstrates the impact of different covariate vectors (representing different individuals) generated by the Simulator, but with the same manually entered drinking pattern (i.e., drinking diary is kept constant), on the BrAC and TAC curves. 7.4.1 SimulatorExperiment1 For this experiment a covariate vector is entered manually in the Simulator as in table 7.6. Then 3 drinking diaries are randomly chosen among the 10 predefined drinking diaries in the Simulator. In particular, those three drinking diaries were D7, D6 and D1 as appear in Table 7.3. Figures 7.9, 7.10 and 7.11 demonstrate the changes in BrAC and TAC curves based on the three different drinking diaries, while keeping the covariate vector constant. 133 Covariate Values Covariate Value HT 170 WT 65 BF 20 VF 5 RM 1600 WH 0.9 AGE 29 BIOSEX female Eth_Id_C caucasian Table 7.6: Experiment 1 Covariate Values Figure 7.9: Simulator Experiment 1 - Example 1 134 Figure 7.10: Simulator Experiment 1 - Example 2 Figure 7.11: Simulator Experiment 1 - Example 3 135 7.4.2 Experiment2 For this experiment the Simulator generates 3 vectors of covariates as in Table 7.7 (examples 1, 2 and 3). The drinking pattern in Table 7.8 is chosen and entered manually for all three examples. By examining the BrAC and TAC curves depicted in Figures 7.12, 7.13, and 7.14, it is evident that the peak values, peak times, and the time taken for BrAC and TAC to return to 0 vary across examples. This discrepancy is expected, as each example possesses a distinct covariate vector, thereby influencing the resulting curves uniquely. Covariate Values Covariate Example 1 Example 2 Example 3 HT 178.10 167.44 171.30 WT 75.17 92.42 83.50 BF 23.95 28.22 12.04 VF 6 4 4 RM 1514 1848 1659 WH 0.90 0.90 0.81 AGE 26 25 28 BIOSEX female female male Eth_Id_C mixed hispanic caucasian Table 7.7: Experiment 2 Covariate Values 136 Drinking Diary Time (hours) Cumulative Num- ber of Standard Drinks 0 0 0.25 0.5 0.5 0.5 0.75 0.5 1 1 Table 7.8: Experiment 2 Drinking Diary Figure 7.12: Simulator Experiment 2 - Example 1 137 Figure 7.13: Simulator Experiment 2 - Example 2 Figure 7.14: Simulator Experiment 2 - Example 3 138 Chapter8 NumericalResults In this Chapter, the application of kernel density estimation to a population model based on a random diffusion equation for the transdermal transport of ethanol as explained in Chapter 4 is demonstrated. Additionally, the sub-population selection of the TAC parameter’sq population, based on predictions from regression models with covariates, as discussed in Chapter 6, is showcased. In particular, data produced by the Simulator described in Chapter 7 are used to illustrate how sub-population selection can decrease the level of uncertainty in the resulting BrAC signalu. All calculations to be presented below were carried out on either standard PC laptops or desktop ma- chines, or on Apple Macbook Pro laptops. In either case, all codes were developed in Matlab and Python. Wherever possible, Matlab built-in functions and Python libraries were used. For example, to fit the in- dividual model the constrained optimization routine FMINCON was used, and to solve the deconvolution problem the non-negative least squares solver LSQNONNEG was utilized. Similarly the Python library scikit-learn was used to build the regression models. A dataset D train of 3000 records is produced by the Simulator. For the purposes of the experiment, the covariate vectors are not entered by the user but instead they are generated by the Simulator (see Subsection 7.2.1). Additionally, each subject (record) is assigned at random with one of the pre-specified drinking diaries (see Table 7.3). Recall that the Simulator produces covariate vectors consisting of nine 139 variables: height, weight, body fat, visceral fat, resting metabolism, waist-hip ratio, age, sex, and ethnic identity (see Table 7.1). For each record, the trueq parameter is calculated using the regression models in (7.6)-(7.7). Theq parameters contained inD train will serve as theq population in what follows. In the initial phase of the experiment, regression models are required to be build in order to provide predictions for the TAC parameterq of new individuals based on their covariates. Note that for the purposes of the experiment the entries of the parameterq = [q 1 ,q 2 ] are considered independent. This is because even though the regression models for q 1 and q 2 use the same predictors, the errors in the two models are independent random variables. Therefore, two individual regression models need to be trained onD train so to provide predictions forq 1 andq 2 . Theq 1 andq 2 values inD train are transformed using the log-link transformation e q ij =log((q ij − min i )/(max i − q ij )) , i=1,2 , j =1,...,3000 (8.1) where min i and max i , i = 1,2 are the minimun and maximum values of q 1 and q 2 in D train . Then, linear regression models are trained onD train using the values in (8.1) as outputs and the corresponding covariates as predictors. Note that more complex models (neural nets, random forests etc.) may have better performance, however for the purposes of demonstrating the sub-population selection method simpler models are sufficient. Also, it is important to highlight that while linear regression models were used to obtain simulated values of q 1 and q 2 through the log-link transformation as in (7.6)-(7.7), errors were added to the simulatedq 1 andq 2 after applying the inverse log-link transformation (see 7.7). This approach enables us to utilize the same model for prediction purposes. To prepareD train the numerical covariates are transformed using e x= x− µ X σ X (8.2) 140 where x is a value of the covariate column X and µ X , σ X are the sample mean and sample standard deviation ofX respectively. Note that the categorical covariates as produced by the Simulator are already in the form of dummy variables. Then, the two linear regression models are trained on D train . The estimated coefficients of the two models are the following: Models’s Coefficients Coefficient q 1 model q 2 model Intercept -1.22 -0.10 HT 0.19 0.07 WT -0.75 0.05 BF 0.60 -0.43 VF 0.15 0.29 RM 0.25 -0.32 WH 0.05 0.29 AGE -0.10 -0.29 BIOSEX 0.92 -0.49 Eth/afamer 0.03 -0.13 Eth/asian -0.30 -0.10 Eth/caucasian 0.42 0.04 Eth/hispanic -0.006 0.28 Eth/mixed -0.14 -0.08 Table 8.1: Estimated coefficients of the linear regression models that predict the log-link transformation ofq 1 andq 2 from the covariates For the next step of the experiment, the Simulator was utilized to generate new records of data that will be treated as unseen data of new individuals (test data) and will be used to test the sub-population 141 selection method. Note that the covariates of the test data were normalized based on the covariates of D train as in (8.2). The covariate values of the test data can be found in Table 8.2. Note that for BIOSEX the value 0 correpsonds to female and the value 1 corresponds to male. Also note that Eth/afamer-Eth/mixed are the dummy variables of the ethnic identity covariate. Test Data Covariates Covariate Test data 1 Test Data 2 HT 174.19 178.47 WT 56.54 60.13 BF 37.82 25.73 VF 9 4 RM 1833 1435 WH 0.91 0.91 AGE 23 24 BIOSEX 0 0 Eth/afamer 0 0 Eth/asian 0 1 Eth/caucasian 0 0 Eth/hispanic 1 0 Eth/mixed 0 0 Table 8.2: Covariate values of the test data The drinking diaries associated with Test data 1 is Diary 9 and with Test data 2 is Diary 2 in Table 7.3. When the covariates of the test data are generated, they are then scaled as in (8.2) and predictions for 142 the corresponding log-link transformations ofq 1 andq 2 are obtained using the estimated linear regression model coefficients in Table 8.1. In order to demonstrate the method described in Section 6.2, theq 1 andq 2 predictions will be utilized to identify the sub-population of theq population inD train to be used, instead of the entire population, in the next step of the experiment. In particular, the sub-population will be selected based on confidence intervals of q 1 and q 2 . Let the prediction of the log-link transformationse q i denoted by b e q i , i = 1,2. A (1− α )% confidence interval for e q i ,i=1,2 is given by b e q i ± z α/ 2 b σ i (8.3) where ˆ σ i , i = 1,2 is the square root of the mean squared error ofe q i based onD train . Then a (1− α )% confidence interval for q i ,i=1,2 is obtained using the inverse log-link transformation as follows h e b e q i − z α/ 2 b σ i ∗ max i +min i e b e q i − z α/ 2 b σ i +1 , e b e q i +z α/ 2 b σ i ∗ max i +min i e b e q i +z α/ 2 b σ i +1 i wheremin i andmax i ,i = 1,2 are the minimun and maximum values ofq 1 andq 2 inD train . Also, note that the predictions of q 1 and q 2 , denoted byb q 1 andb q 2 respectively are also obrtained using the inverse log-link transformation b q i = e b e q i ∗ max i +min i e b e q i +1 , i=1,2 The predictedb q 1 andb q 2 values for the test data in Table 8.2 along with95% confidence intervals are the following: 143 Confidence Intervals for Test Data Test Data Lower bound b q 1 Upper bound 1 2.7434 3.6993 4.3374 2 0.8999 1.5650 2.5376 Table 8.3: 95% Confidence intervals for q 1 Confidence Intervals for Test Data Test Data Lower bound b q 2 Upper bound 1 1.2978 1.7158 2.1328 2 1.4972 1.9236 2.3158 Table 8.4: 95% Confidence intervals for q 2 The last stage of the experiment involves the kernel density estimation of theq population density ofD train when the full population is considered (see Chapter 4) and then it’s restriction according to the confidence intervals in Tables 8.3 and 8.4 (see Section 6.2). The scheme described in Sections 4.2 and 4.3 was applied withn = m1 = m2 = 4, orN = (4,4,4) andM = (24,4,4). Sinceq 1 andq 2 are assumed to be independent, marginal distributions are fit to each separately with the joint pdf then taken to be the product of the marginals. In Figure 8.1 is the one dimensional KDE estimate for the distribution ofq 1 on the left and forq 2 on the right. 144 Figure 8.1: Marginal distributions of theq 1 andq 2 populations ofD train Finally, the input signalu is deconvolved as described in 4.3. The effect of the sub-population selection on the deconvolved signalu is illustrated in the figures below. It is obvious from the Figures 8.4 and 8.6 that conditioning the kernel density estimate of theq population based on the sub-population selection results in narrower error bands which validates the arguments in Section 6.2. On the left of Figure 8.2 is a scatter plot theD train population and on the right is the KDE fitted based on the entire population. Figure 8.2: Scatter plot and kernel density estimate of the fullq population ofD train On the upper left of Figure 8.3 is the scatter plot of the full q population of D train along with the restriction (red rectangle) based on the confidence interval [2.7434,4.3374]× [1.2978,2.1328] of the Test Data 1 parameter q. On the upper right is the conditional joint density conditioned on the event q ∈ [2.7434,4.3374]× [1.2978,2.1328]. Figure 8.4 shows the result of using the resulting population model to 145 deconvolve eBrAC from the test episode TAC, (4.7) and (4.8), along with the simulated or modeled TAC, eTAC, the BrAC and TAC data, and the95% credible or error band. The left plot of Figure 8.4 is based on the full population kernel density estimate shown in Figure 8.2 and the right plot is based on the conditional kernel density estimate shown in Figure 8.3. Figure 8.3: Test Data 1 sub-population selection and conditional kernel density estimate Figure 8.4: Test Data 1 deconvolved BrAC signal along with error bands On the upper left of Figure 8.5 is the scatter plot of the full q population of D train along with the restriction (red rectangle) based on the confidence interval [0.8999,2.5376]× [1.4972,2.3158] of the Test Data 2 parametersq. On the upper right is the conditional joint density conditioned on the eventq ∈ [0.8999,2.5376]× [1.4972,2.3158]. Figure 8.6 is as Figure 8.4. 146 Figure 8.5: Test Data 2 sub-population selection and conditional kernel density estimate Figure 8.6: Test Data 2 deconvolved BrAC signal along with error bands 147 Chapter9 DiscussionandConludingRemarks The results presented in this thesis suggest a number of open questions that deserve further consideration. This work represents the initial effort to construct regression models capable of predicting the TAC param- eterq based on subject’s covariates. Through the utilization of simulated data from the Simulator and two distinct linear regression models (one for each entry of the TAC parameterq), some preliminary insights were gained. However, several challenges have arisen. The primary difficulty is associated with the con- straints of the available real-world dataset. Specifically, the existing dataset exhibits limited variability in both the covariate values and drinking patterns, as well as a relatively small number of recorded drinking episodes. As a result, only simulated data has been employed thus far to verify the techniques outlined in Chapter 6. An additional difficulty arises regarding the adequacy of the covariates utilized in the current regression models. The goal of the team is to find ways to sufficiently predict the TAC parameters of an individual based on readily observable person-level covariates that do not usually change dramatically over time, for example height, weight, sex, ethnicity etc. However, these covariates might not be adequate for accurately predictingq. Incorporating environmental factors such as weather or location data into the models may be necessary to account for episode-level variations and thereby improve the results. Finally, to achieve more precise predictions forq, it may be necessary to consider also within-subject variation across episodes, such as changes in skin temperature or hydration. 148 To obtain accurate predictions for the TAC parameterq, a range of existing and novel covariate combi- nations should be employed in regression models. Furthermore, this study utilized basic regression models to generate simulated data in the Simulator and then to predictq. The actual relationship betweenq and the covariates in real data could be considerably more complex. Experimentation with more sophisticated models such as neural networks and XGBoost in the near future is necessary. Additionally, regression models that treat the entries of the parameterq, q 1 and q 2 , as dependent random variables could be ex- plored, since in this work they were treated as independent variables for the purposes of demonstrating the methods in Chapters 4 and 6. 149 Bibliography [1] J. Aitchison and S. D. Silvey. “Maximum-likelihood estimation of parameters subject to restraints”. In: The Annals of Mathematical Statistics 29.3 (1958), pp. 813–828. [2] H. T. Banks, J. A. Burns, and E. M. Cliff. “Parameter estimation and identification for systems with delays”. In: SIAM Journal on Control and Optimization 19.6 (1981), pp. 791–828.doi: 10.1137/0319051. eprint: http://dx.doi.org/10.1137/0319051. [3] H. T. Banks and K. Ito. “A unified framework for approximation in inverse problems for distributed parameter systems”. In: CTAT 4.1 (1988), pp. 73–90.url: https://apps.dtic.mil/dtic/tr/fulltext/u2/a193780.pdf. [4] H. T. Banks and K. Ito. “Approximation in LQR problems for infinite dimensional systems with unbounded input operators”. In: Journal of Mathematical Systems, Estimation and Control 7 (1997), pp. 1–34. [5] H. T. Banks and Karl Kunisch. Estimation Techniques for Distributed Parameter Systems. Springer Science & Business Media, 2012.url: https://www.springer.com/us/book/9780817634339. [6] H. Thomas Banks and Karl Kunisch. Estimation Techniques for Distributed Parameter Systems. Birkhauser, Boston-Basel-Berlin, 1989. [7] Patrick Billingsley. Convergence of Probability Measures. 2nd. John Wiley & Sons, 2013.isbn: 978-1118162290. [8] Olivier Bousquet. “Concentration Inequalities for Sub-additive Functions Using the Entropy Method”. In: Stochastic Inequalities and Applications. Vol. 56. Progress in Probability. Basel: Birkhäuser, 2003, pp. 213–247. [9] T. Cacoullos. “Estimation of a multivariate density”. In:Ann.Inst.Stat.Math. 18 (1966), pp. 179–189. [10] Stephen L Campbell and Carl D Meyer. Generalized Inverses of Linear Transformations. SIAM, 2009. url: http://bookstore.siam.org/cl56/. [11] Kate B. Carey and John T. Hustad. “Are retrospectively reconstructed blood alcohol concentrations accurate? Preliminary results from a field study”. In: Journal of Studies on Alcohol 63 (2002), pp. 762–766. 150 [12] Ruth F Curtain and Dietmar Salamon. “Finite-dimensional compensators for infinite-dimensional systems with unbounded input operators”. In: SIAM Journal on Control and Optimization 24.4 (1986), pp. 797–816.doi: https://doi.org/10.1137/0324050. [13] Z. Dai, I. G. Rosen, C. Wang, N. Barnett, and S. E. Luczak. “Using drinking data and pharmacokinetic modeling to calibrate transport model and blind deconvolution based data analysis software for transdermal alcohol biosensors”. In: Math. Biosci. and Eng. 13.5 (2016), pp. 911–934.issn: 1551-0018.doi: 10.3934/mbe.2016023. [14] Carl De Boor. A practical guide to splines. Vol. 27. Applied Mathematical Sciences. Springer-Verlag, New York, 1978.isbn: 978-0-387-90223-1.doi: 10.1007/978-1-4757-0313-1. [15] D. M. Dougherty, N. E. Charles, A. Acheson, S. John, R. M. Furr, and N. Hill-Kapturczak. “Comparing the detection of transdermal and breath alcohol concentrations during periods of alcohol consumption ranging from moderate drinking to binge drinking”. In: Experimental and Clinical Psychopharmacology 20 (2012), pp. 373–381. [16] D. M. Dougherty, N. Hill-Kapturczak, Y. Liang, T. E. Karns, S. E. Cates, S. L. Lake, and J. D. Roache. “Use of continuous transdermal alcohol monitoring during a contingency management procedure to reduce excessive alcohol use”. In: Drug and Alcohol Dependence 142 (2014), pp. 301–306. [17] D. M. Dougherty, T. E. Karns, J. Mullen, Y. Liang, S. L. Lake, J. D. Roache, and N. Hill-Kapturczak. “Transdermal alcohol concentration data collected during a contingency management program to reduce at-risk drinking”. In: Drug and Alcohol Dependence 148 (2015), pp. 77–84. [18] M. A. Dumett, I. G. Rosen, J. Sabat, A. Shaman, L. Tempelman, C. Wang, and R. M. Swift. “Deconvolving an estimate of breath measured blood alcohol concentration from biosensor collected transdermal ethanol data”. In: Applied Mathematics and Computation 196.2 (2008), pp. 724–743.doi: 10.1016/j.amc.2007.07.026. [19] Rick Durrett. Probability: Theory and Examples. Vol. 49. Cambridge University Press, 2019. [20] L. Edelstein-Keshet. Mathematical Models in Biology. Philadelphia: Society for Industrial and Applied Mathematics, 2005. [21] V. A. Epanechnikov. “Non-parametric estimation of a multivariate probability density”. In: Theory of Probability and its Applications 14.1 (1969), pp. 153–156. [22] C. E. Fairbairn, D. Kang, and N. Bosch. “Using machine learning for real-time BAC estimation from a new-generation transdermal biosensor in the laboratory”. In: Drug and Alcohol Dependence 216 (2020), p. 108205.issn: 0376-8716.doi: https://doi.org/10.1016/j.drugalcdep.2020.108205. [23] Thomas Ferguson. A Course in Large Sample Theory. Routledge, 2017. [24] E. Fix and J. L. Hodges. “Discriminatory Analysis. Nonparametric Discrimination: Consistency Properties”. In: International Statistical Review / Revue Internationale de Statistique 57.3 (1989), pp. 238–247. 151 [25] J. S. Gibson and I. G. Rosen. “Numerical approximation for the infinite-dimensional discrete-time optimal linear-quadratic regulator problem”. In: SIAM J. Control Optim. 26.2 (1988), pp. 428–451. [26] Evarist Giné and Martin Nikl. Mathematical foundations of infinite-dimensional statistical models . Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2016. [27] C. J. Gittelson, R. Andreev, and C. Schwab. “Optimality of adaptive Galerkin methods for random parabolic partial differential equations”. In: J. Computational Applied Mathematics 263 (2014), pp. 189–201.doi: 10.1016/j.cam.2013.12.031. [28] K. Hawekotte, S. E. Luczak, and I. G. Rosen. “Deconvolving breath alcohol concentration from biosensor measured transdermal alcohol concentration: A multivariate normal Bayesian approach”. submitted. 2022. [29] K. Hawekotte, S. E. Luczak, and I. G. Rosen. “Deconvolving breath alcohol concentration from biosensor measured transdermal alcohol level under uncertainty: a Bayesian approach”. In: Mathematical Biosciences and Engineering 18.5 (2021).doi: 10.3934/mbe.2021335. [30] Nathalie Hill-Kapturczak, John Roache, Yuanyuan Liang, Tara Karns-Wright, Sharon Cates, and Donald Dougherty. “Accounting for sex-related differences in the estimation of breath alcohol levels using transdermal alcohol monitoring”. In: Psychopharmacology 232 (June 2014).doi: 10.1007/s00213-014-3644-9. [31] R. Horn and C. Johnson. Matrix Analysis. Cambridge University Press, 2012. [32] John T. Hustad and Kate B. Carey. “Using calculations to estimate blood alcohol concentrations for naturally occurring drinking episodes: a validity study”. In: Journal of Studies on Alcohol 66.1 (2005), pp. 130–138. [33] Robert I Jennrich. “Asymptotic properties of non-linear least squares estimators”. In: Ann. Math. Statist. 40.2 (1969), pp. 633–643. [34] Tara Karns-Wright, John Roache, Nathalie Hill-Kapturczak, Yuanyuan Liang, Jillian Mullen, and Donald Dougherty. “Time Delays in Transdermal Alcohol Concentrations Relative to Breath Alcohol Concentrations”. In: Alcohol and Alcoholism 52 (Aug. 2016).doi: 10.1093/alcalc/agw058. [35] Tosio Kato. Perturbation Theory For Linear Operators. Grundlehren der Mathematischen Wissenschaften. Springer Berlin Heidelberg, 2013.isbn: 978-3-662-12677-3.doi: 10.1007/978-3-662-12678-0. [36] D. A. Labianca. “The chemical basis of the breathalyzer: a critical analysis”. In: J. Chem. Educ. 67.3 (1990), pp. 259–261. [37] Lucien LeCam. “On some asymptotic properties of maximum likelihood estimates and related Bayes estimates”. In: University of California Publications in Statistics 1 (1953), pp. 277–330. 152 [38] S. E. Luczak, A. L. Hawkins, Z. Dai, R. Wichmann, C. Wang, and I. G. Rosen. “Obtaining continuous brac estimates in the field: A hybrid system integrating transdermal alcohol biosensor, intellidrink smartphone app, and brac estimator software tools”. In: Addictive Behaviors 83 (2018). Ambulatory Assessment of Addictive Disorders, pp. 48–55. [39] S. E. Luczak and I. G. Rosen. “Estimating BrAC from transdermal alcohol concentration data using the BrAC estimator software program”. In: Alcoholism: Clinical and Experimental Research 38 (2014), pp. 2243–2252. [40] Susan E. Luczak, I. Gary Rosen, and Tamara L. Wall. “Development of a Real-Time Repeated-Measures Assessment Protocol to Capture Change over the Course of a Drinking Episode”. In: Alcohol and Alcoholism 50.1 (2015), pp. 1–8. [41] Ricardo A Maronna, RD Martin, Víctor J Yohai, and Matias Salibián-Barrera. Robust Statistics: Theory and Methods (with R). John Wiley & Sons, 2019. [42] Izabela Najfeld and Timothy F Havel. “Derivatives of the matrix exponential and their computation”. In: Advances in Applied Mathematics 16.3 (1995), pp. 321–375. [43] C. Oszkinat, S. E. Luczak, and I. G. Rosen. “An abstract parabolic system-based physics-informed long short-term memory network for estimating breath alcohol concentration from transdermal alcohol biosensor data”. In: Neural Computing and Applications (2022).doi: 10.1007/s00521-022-07505-w. [44] C. Oszkinat, S. E. Luczak, and I. G. Rosen. “Uncertainty quantification in estimating blood alcohol concentration from Transdermal alcohol level with physics-informed neural networks”. In: IEEE Transactions on Neural Networks and Learning Systems (2022).doi: 10.1109/TNNLS.2022.3140726. [45] C. Oszkinat, T. Shao, C. Wang, I. G. Rosen, A. Rosen, E. Saldich, and S. E. Luczak. “Blood and breath alcohol concentration from transdermal alcohol biosensor data: estimation and uncertainty quantification via forward and inverse filtering for a covariate-dependent, physics-informed, hidden Markov model”. In: Inverse Problems 38.5 (2022), p. 055002.doi: 10.1088/1361-6420/ac5ac7. [46] E. Parzen. “On Estimation of a Probability Density Function and Mode”. In: The Annals of Mathematical Statistics 33.3 (1962), pp. 1065–1076. [47] A. Pazy. Semigroups of Linear Operators and Applications to Partial Differential Equations . Applied Mathematical Sciences. Springer, 1983.isbn: 9783540908456.url: https://books.google.com/books?id=80XYPwAACAAJ. [48] Anthony J Pritchard and Dietmar Salamon. “The linear quadratic control problem for infinite dimensional systems with unbounded input and output operators”. In: SIAM Journal on Control and Optimization 25.1 (1987), pp. 121–144.doi: https://doi.org/10.1137/0325009. [49] I. G. Rosen, S. E. Luczak, W. W. Hu, and Michael Hankin. “Discrete-Time Blind Deconvolution for Distributed Parameter Systems with Dirichlet Boundary Input and Unbounded Output with Application to a Transdermal Alcohol Biosensor”. In: July 2013, pp. 160–167.isbn: 978-1-61197-327-3.doi: 10.1137/1.9781611973273.22. 153 [50] I. G. Rosen, S. E. Luczak, and Jordan Weiss. “Blind deconvolution for distributed parameter systems with unbounded input and output and determining blood alcohol concentration from transdermal biosensor data”. In: Applied Mathematics and Computation 231 (Mar. 2014), pp. 357–376.doi: 10.1016/j.amc.2013.12.099. [51] M. Rosenblatt. “Remarks on Some Nonparametric Estimates of a Density Function”. In: The Annals of Mathematical Statistics 27.3 (1956), pp. 832–837. [52] E. Saldich, C. Wang, I.G. Rosen, L. Goldstein, J. Bartroff, R. M. Swift, and S. E. Luczak. “Obtaining high resolution multi-biosensor data for modeling transdermal alcohol concentration data”. In: ACER 44 (2020), pp. 357–376. [53] M. H. Schultz. Spline Analysis. Prentice-Hall Series in Automatic Computation. Pearson Education, Limited, 1972.isbn: 9780138354053.url: https://books.google.com/books?id=AdRQAAAAMAAJ. [54] C. Schwab and C. J. Gittelson. “Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs”. In: Acta Numerica 20 (2011), pp. 291–467.doi: 10.1017/S0962492911000055. [55] Robert J Serfling. Approximation Theorems of Mathematical Statistics. New York, NY: John Wiley & Sons, 1980. [56] Ralph E Showalter. Hilbert Space Methods in Partial Differential Equations . Courier Corporation, 2010. [57] B. W. Silverman. “Weak and strong uniform consistency of the kernel density estimate of a density and its derivatives”. In: The Annals of Statistics, 6.1 (1978). [58] M. Sirlanci. “Finite dimensional approximation and convergence in the estimation of the distribution of, and input to, random abstract parabolic systems with application to the deconvolution of blood/breath alcohol concentration from biosensor measured transdermal alcohol level”. PhD thesis. University of Southern California, 2018. [59] M. Sirlanci, S. E. Luczak, C. E. Fairbairn, Kang D., R. Pan, X. Yu, and I. G. Rosen. “Estimating the distribution of random parameters in a diffusion equation forward model for a transdermal alcohol biosensor”. In: Automatica 106 (2019), pp. 101–109. [60] M. Sirlanci, S. E. Luczak, and I. G. Rosen. “Estimation of the Distribution of Random Parameters in Discrete Time Abstract Parabolic Systems with Unbounded Input and Output: Approximation and Convergence”. In: Communications in Applied Analysis 23 (Jan. 2019), pp. 287–329.doi: 10.12732/caa.v23i2.4. [61] M. Sirlanci, I. G. Rosen, S. E. Luczak, C. E. Fairbairn, K. Bresin, and D. Kang. “Deconvolving the input to random abstract parabolic systems; A population model-based approach to estimating blood/breath alcohol concentration from transdermal alcohol biosensor data”. In: Inverse Problems 34.12 (2018), pp. 2–27. 154 [62] Melike Sirlanci, Susan Luczak, and I Gary Rosen. “Approximation and convergence in the estimation of random parameters in linear holomorphic semigroups generated by regularly dissipative operators”. In: American Control Conference (ACC), 2017. IEEE. 2017, pp. 3171–3176.doi: 10.23919/ACC.2017.7963435. [63] O. J. Staffans. Well-posed linear systems. Cambridge: Cambridge University Press, 2005. [64] Charles J Stone. “Optimal choice of the smoothing parameter for regression estimates”. In: The Annals of Statistics 8.6 (1980), pp. 1348–1360. [65] Robert M. Swift. “Direct measurement of alcohol and its metabolites”. In: Addiction 98 (2003), pp. 73–80.issn: 1360-0443.doi: 10.1046/j.1359-6357.2003.00605.x. [66] Robert M. Swift. “Transdermal measurement of alcohol consumption”. In: Addiction 88.8 (1993), pp. 1037–1039.issn: 1360-0443.doi: 10.1111/j.1360-0443.1993.tb02122.x. [67] Michel Talagrand. “New concentration inequalities in product spaces”. In: Invent. Math. 126.3 (1996), pp. 505–563. [68] H. Tanabe. Equations of Evolution. Monographs and Studies in Mathematics. Pitman, London, 1979. isbn: 9780273011378.url: https://books.google.com/books?id=Dn6zAAAAIAAJ. [69] M. Tucsnak and G. Weiss. Observation and control for operator semigroups. Basel: Birkhauser, 2009. [70] V. S. Varadarajan. “On the Convergence of Sample Probability Distributions”. In: Sankhy¯ a, The Indian Journal of Statistics 19 (1958), pp. 23–26. [71] Roman Vershynin. High-dimensional probability. An introduction with applications in data science. Cambridge Series in Statistical and Probabilistic Mathematics 47. Cambridge: Cambridge University Press, 2018.isbn: 9781108415194.doi: 10.1017/9781108655592. [72] Gregory D. Webster and Hampton C. Gabler. “Feasibility of Transdermal Ethanol Sensing for the Detection of Intoxicated Drivers”. In: Annual Proceedings for the Advancement of Medicine 51 (2007), pp. 449–464. [73] Gregory D. Webster and Hampton C. Gabler. “Modeling of transdermal transport of alcohol: effect of body mass and gender”. In: Biomedical Sciences Instrumentation 44 (2008), pp. 361–366. [74] J. Weiss, I. G. Rosen, T. L. Wall, and S. E. Luczak. “Creating an automated drinking episode identifier software program utilizing transdermal alcohol sensor data from real-time drinking episodes”. In: Alcoholism: Clinical & Experimental Research 38 (2014). [75] M. Yao, S. E. Luczak, E. B. Saldich, and I. G. Rosen. “A population model-based LQG compensator for the control of intravenously-infused alcohol studies and withdrawal symptom prophylaxis using transdermal sensing”. In: Optimal Control Applications and Methods (2022). to appear. [76] Eberhard Zeidler. Nonlinear Functional Analysis and Its Applications: Part 1: Fixed-Point Theorems. Revised and updated edition. Springer, 2013.isbn: 978-1-4471-5241-8. 155
Abstract (if available)
Abstract
Being motivated by an alcohol biosensor problem, this thesis presents two approaches. First, the development of M-Estimation and deconvolution methodology with the goal of making well-founded statistical inference on an individual's blood alcohol level based on noisy measurements of their skin alcohol content. The results are applied to a nonlinear least squares estimator of the key parameter that specifies the blood/skin alcohol relation in a diffusion model, establishing its existence, consistency, and asymptotic normality. To make inference on the unknown underlying blood alcohol curve, a basis space deconvolution approach with regularization is developed and the asymptotic distribution of the error process is determined, thus allowing the computation of uniform confidence bands on the curve. Second, the development of a population model for the transdermal transport of ethanol from the blood to an alcohol biosensor on the surface of the skin in the form of a random abstract parabolic hybrid system of coupled ordinary and partial differential equations. Linear semigroup theory in a Gelfand triple of Bochner spaces is used to first formulate the model as an equivalent deterministic system in state space form and to then develop a finite dimensional approximation and convergence theory. Kernel density estimation is used to estimate the distributions of the random parameters that appear in the model. Sub-population selection based on regression models' predictions for the transdermal alcohol random parameters based on subject covariates is used to restrict the kernel density estimate of the population density and produce narrower error bands for the deconvolved BrAC signal.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
An abstract hyperbolic population model for the transdermal transport of ethanol in humans: estimating the distribution of random parameters and the deconvolution of breath alcohol concentration
PDF
Population modeling and Bayesian estimation for the deconvolution of blood alcohol concentration from transdermal alcohol biosensor data
PDF
Finite dimensional approximation and convergence in the estimation of the distribution of, and input to, random abstract parabolic systems with application to the deconvolution of blood/breath al...
PDF
Delta Method confidence bands for parameter-dependent impulse response functions, convolutions, and deconvolutions arising from evolution systems described by…
PDF
Obtaining breath alcohol concentration from transdermal alcohol concentration using Bayesian approaches
PDF
Linear quadratic control, estimation, and tracking for random abstract parabolic systems with application to transdermal alcohol biosensing
PDF
Prohorov Metric-Based Nonparametric Estimation of the Distribution of Random Parameters in Abstract Parabolic Systems with Application to the Transdermal Transport of Alcohol
PDF
Physics-informed machine learning techniques for the estimation and uncertainty quantification of breath alcohol concentration from transdermal alcohol biosensor data
PDF
Estimation of random input to semi-linear abstract parabolic systems with application to quantitative description of drinking behavior based on trans-dermal alcohol concentration
PDF
Distributed parameter model based system identification and filtering in the estimation of blood alcohol concentration from transdermal alcohol biosensor data
PDF
A nonlinear pharmacokinetic model used in calibrating a transdermal alcohol transport concentration biosensor data analysis software
PDF
Determining blood alcohol concentration from transdermal alcohol data: calibrating a mathematical model using a drinking diary
PDF
Simultaneous parameter estimation and semi-blind deconvolution in infinite-dimensional linear systems with unbounded input and output
PDF
Robust estimation of high dimensional parameters
PDF
Modeling of Earth's ionosphere and variational approach for data assimilation
PDF
A comparative study of non-blind and blind deconvolution of ultrasound images
PDF
Essays on estimation and inference for heterogeneous panel data models with large n and short T
PDF
Reconstruction and estimation of shear flows using physics-based and data-driven models
PDF
Differentially private and fair optimization for machine learning: tight error bounds and efficient algorithms
PDF
Probabilistic divide-and-conquer -- a new method of exact simulation -- and lower bound expansions for random Bernoulli matrices via novel integer partitions
Asset Metadata
Creator
Allayioti, Maria
(author)
Core Title
M-estimation and non-parametric estimation of a random diffusion equation-based population model for the transdermal transport of ethanol: deconvolution and uncertainty quantification
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Applied Mathematics
Degree Conferral Date
2023-08
Publication Date
06/01/2023
Defense Date
05/11/2023
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
biosensor data,blood alcohol concentration,density estimation,M-estimation,OAI-PMH Harvest,transdermal alcohol concentration
Format
theses
(aat)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Rosen, Gary (
committee chair
), Goldstein, Larry (
committee member
), Luczak, Susan (
committee member
)
Creator Email
allayiot@gmail.com,allayiot@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113144564
Unique identifier
UC113144564
Identifier
etd-AllayiotiM-11914.pdf (filename)
Legacy Identifier
etd-AllayiotiM-11914
Document Type
Thesis
Format
theses (aat)
Rights
Allayioti, Maria
Internet Media Type
application/pdf
Type
texts
Source
20230602-usctheses-batch-1051
(batch),
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu
Tags
biosensor data
blood alcohol concentration
density estimation
M-estimation
transdermal alcohol concentration