POPULATION MODELING AND BAYESIAN ESTIMATION FOR THE DECONVOLUTION OF BLOOD ALCOHOL CONCENTRATION FROM TRANSDERMAL ALCOHOL BIOSENSOR DATA

by Yuliya Piterbarg

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, in Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (APPLIED MATHEMATICS)

May 2009

Copyright 2009 Yuliya Piterbarg

Dedication

This dissertation is dedicated to Ulyana and Victor.

Acknowledgements

I am most grateful to my advisers Gary Rosen and Alan Schumitzky for their instruction, guidance and immense patience. I would like to thank them for being supportive and understanding, and for always being able to allay my research concerns. This dissertation would not have been possible without them. I am also very grateful to Women in Science and Engineering (WISE) and the Department of Mathematics at the University of Southern California, Linda Templeman at Giner, Inc., and NIH/NIAAA (Grant No. 2R44 AA014118-03S1) for their financial support of my research project and graduate studies. Additionally, I appreciate the helpful comments and assistance I received from my guidance committee, Chunming Wang, Remigius Mikulevicius and David D'Argenio. I am also very grateful to Miguel Dummet, Amy Young and Leonid Piterbarg for their support.

Table of Contents

Dedication ii
Acknowledgements iii
List of Tables vi
List of Figures vii
Abstract x
Chapter 1: Introduction 1
  1.1 Device 1
  1.2 Motivation 2
  1.3 Approaches 3
Chapter 2: Simple Skin Model 5
  2.1 Discretization in time and space 8
Chapter 3: Population Analysis 9
  3.1 Data Generation Scheme 10
  3.2 Maximum Likelihood Approach for Identifying System Parameters 14
  3.3 GTS Method 20
    3.3.1 Adjoint Gradient Method for Computation of LS Covariance Matrix 23
  3.4 Numerical Results 27
Chapter 4: Bayesian Deconvolution of the BAC 34
  4.1 Stochastic Model for TAS Data for the BAC Deconvolution Problem 35
  4.2 Deconvolution Problem for the BAC Signal 38
  4.3 Theoretical Approach to the Computation of MAP Estimates 41
  4.4 Theoretical Approach to the Computation of Interval Estimates 49
  4.5 Results and Discussion 54
    4.5.1 Data Generation for the Deconvolution Problem 55
    4.5.2 Implementation of Bender's Search Method. Discussion. 57
    4.5.3 Results for the MAP Estimation 69
    4.5.4 Computation of the Confidence Intervals for the Blood Input. Discussion. 78
    4.5.5 Results for the Interval Estimates 86
Chapter 5: Proposed Research 94
References 98
Appendix: WinBUGS 101

List of Tables

1 GTS results for the model with known ML estimates. 28
2 GTS results for generated data 29
3 GTS results for generated data 30
4 GTS results for clinical data. 32
5 MAP estimates of the SSM coefficients for Patient 1 72
6 MAP estimates of the SSM coefficients for Patient 2 73
7 MAP estimates for Patient 1 and Patient 2 assuming precisions 1000 and 2 73
8 Sample statistics for an MCMC sample of (·, l), q = (.15, 1.7), input approximation by pup-tents 107
9 Sample statistics for an MCMC sample of (·, l), q = (.15, 1.7), input approximation by piecewise constants 109
10 Sample statistics for an MCMC sample of (q, l) 111

List of Figures

1 The Giner WrisTAS V 2
2 Full Body Model 10
3 Skin model for the Full Body Model 12
4 Two-week BAC data for patient #103 and fitted results 13
5 Two-week TAS data for patient #103 and fitted results 14
6 Scheme for generating hospital tests in the population analysis problem 14
7 Scatter plot for SSM coefficients obtained from actual clinical hospital tests. 30
8 Boxplot for SSM coefficients obtained from actual clinical hospital tests. 31
9 Clinical data and TAS measurements computed based on GTS and average estimates 33
10 Generated data for two new patients 56
11 Comparison of generated data sets 56
12 LS input estimate based on GTS parameters for SSM, N = 5 splines 69
13 LS input estimate based on GTS parameters for SSM, N = 10 splines 69
14 LS input estimate based on GTS parameters for SSM, N = 20 splines 70
15 Joint prior distribution function for the SSM parameters defined on Q̂. 70
16 Plot of minimization functional F(q) computed for Patient 1 on the first stage of Bender's Decomposition, P = 5 splines 71
17 Plot of minimization functional F(q) computed for Patient 1 on the first stage of Bender's Decomposition, P = 10 splines 71
18 Plot of minimization functional F(q) computed for Patient 1 on the first stage of Bender's Decomposition, P = 20 splines 72
19 MAP estimates of the input for Patient 1 using different numbers of splines. 74
20 MAP estimates of the input for Patient 2 using different numbers of splines. 75
21 MAP estimation for Patient 1 with 20 splines, noise σ = 1, precisions 1000 and 2, spline precision .01 76
22 MAP estimates of the BAC signal for different noise levels for both patients 76
23 MAP estimates of the BAC signal for both patients assuming different values for the prior mean l̂ 77
24 MAP estimates of the BAC signal for both patients assuming different values for the spline precision 78
25 Histograms for SSM coefficients simulated from non-discretized and discretized priors 86
26 CI's for Patient 1 TAS data observed with noise σ = 1 87
27 CI's for Patient 2 TAS data observed with noise σ = 1 88
28 Dependence of CI's on the observed noise level 89
29 Dependence of CI's on the spline precision 90
30 Dependence of CI's on the prior mean l̂ 91
31 Prior and posterior samples for the SSM coefficients 92
32 Dependence of CI's on the assumed noise level 93
33 History plots for …, l_2, l_3, input approximation by pup-tents 108
34 History plots for …, l_2, l_3, input approximation by piecewise constants 110
35 Autocorrelation plots for …, l_2, l_3 111
36 History plots for the components of q, input approximation by piecewise constants 112
37 History plots for l_2 and l_3 in MCMC of (q, l), input approximation by piecewise constants 112

Abstract

We develop statistical methods to estimate blood alcohol concentration (BAC) from transdermal alcohol concentration (TAC) measurements supplied by a transdermal alcohol sensor (TAS). This eliminates the need for clinical test data for each patient to calibrate the underlying mathematical models to each subject and each device. We parametrically estimate the distribution of the transport parameters in a transdermal ethanol model based on population analysis and simulated data using the Global Two-Stage Method (GTS). We implement the GTS method using generated data and clinical data obtained from 17 healthy volunteers with approximately the same body mass index. Next, we develop a Bayesian approach to estimate BAC for new patients from the population using TAS data generated by a full body alcohol model. The BAC signal is approximated by piecewise-constant (zero-order spline) functions. Prior distributions are assumed for the skin model parameters (based on the population analysis) and the BAC spline coefficients. The new patient's TAS data is used to estimate BAC via maximization of a posterior distribution for the skin parameters and input coefficients. This yields a maximum a posteriori (MAP) estimate. With basic assumptions on the BAC spline coefficients and the TAS noise model, we obtain direct samples from the posterior distribution. This allows us to construct 95% confidence intervals (CI) for BAC.

Chapter 1
Introduction

1.1 Device

The idea of using a transdermal alcohol sensor to measure blood alcohol concentration (BAC) originated in the 1930s, when it was determined that ingested ethanol can be measured in perspiration in concentrations that are approximately equal to its concentration in the blood [13]. The appeal of a skin-surface-based sensor is that it is noninvasive and relatively simple. Giner, Inc. of Newton, Massachusetts, has developed the Transdermal Alcohol Sensor/Recorder (WrisTAS, Figure 1), a wearable electrochemical alcohol detector. It measures transdermal alcohol continuously, sampling the alcohol, skin temperature and skin resistance signals every 10-30 seconds, and stores average values at predetermined intervals ranging from 0.5 to 10 minutes for later retrieval. The sensor is placed on the skin surface and continuously oxidizes alcohol, generating a current that is linearly related to the local alcohol concentration. It responds to 10-200 mg/dl BAC with a lag time of approximately 30 minutes to plateau.
The electrochemical sensor is specific for ethanol; it does not respond to oxygen, CO or acetone at their maximum physiological levels.

Figure 1: The Giner WrisTAS V

Detailed descriptions of the development and performance of the Giner transdermal alcohol sensors can be found in [17], [19], [16] and [18].

1.2 Motivation

The Giner WrisTAS V (TAS) was developed primarily to monitor alcohol abstinence rather than to provide quantitative measurements of BAC. However, the accurate and reliable determination of BAC over both short and long time periods is important for treatment, research and forensic applications. In the clinical setting, assessing BAC is important for identifying problem drinking, monitoring treatment compliance, understanding sensitivity, tolerance, dependence, withdrawal and satiety, and preventing relapse. In a research setting, assessing BAC is necessary to evaluate treatment outcomes under different experimental conditions, to understand patterns in the drinking behavior of groups such as teens, college students, pregnant women and the elderly, and to better understand alcohol physiology. It is important to be able to correlate BAC at a specific time to behavioral impairments at that time, and to correlate cumulative BAC over time to alcohol-related pathologies.

1.3 Approaches

Adapting the Giner device to the more challenging task of providing a quantitative estimate of BAC from TAS field data requires the development of a sophisticated data analysis system that is capable of recovering the BAC signal from the observed TAS measurements. To attack this problem we start by designing a primitive skin model that describes the transport of ethanol from the blood through the skin to the TAS sensor. Once the problem is solved for the simple skin model, the techniques we design will be applied to the more complicated realizations described in [7]. Two important directions of research are of interest: studying the parameters of the model, and deconvolution of the BAC signal from the measured TAS data for a random individual.

The former approach has been to first conduct a hospital test to determine a patient's parameters, and then use them to recover blood inputs. We want to eliminate the first stage and be able to obtain model parameters and deconvolve blood alcohol concentrations at once for an incoming patient.

First, we adopt the population analysis approach of obtaining central estimates for parameters in the same way it is done in pharmacokinetic/pharmacodynamic (PK/PD) problems. The classical schemes and examples are highlighted in the literature [20], [4], [14]. We present some preliminary results in this area and propose directions for further research.

The second approach uses Bayesian techniques. We will obtain point and interval estimates for parameters and blood concentrations. The point estimation involves maximization of the posterior distribution, while the sampling approach allows us to obtain a complete probabilistic structure for the model parameters and blood concentrations. This makes possible the construction of confidence intervals for different functionals of the parameters and the BAC inputs.

The techniques described above differ from the purely deterministic approach taken previously by our research group. On the other hand, the methods that we are going to apply are widely used in various applications [10], [1], etc.
We will have to answer many questions on how to better model the data, on the choice of methods, and theoretical and numerical issues relating to the accuracy of the estimates. We will discuss these issues later in detail. 4 Chapter 2 Simple Skin Model For this model the skin is thought to consist of just one layer of length L. The spread of ethanol through the skin is governed by the one dimensional diusion equation. Such a model may not give a perfect t to the data, but provides us with a certain degree of accuracy. The need for simplicity comes from the necessity to apply statistical methods which can be tested and then transferred to more sophisticated analogues. The diusion equation for the simple skin model (SSM) is: @w @t = @ 2 w @x 2 t 0; 0xL; (2.1) where w(t;x) is the ethanol concentration in the skin at depth x and time t. The boundary and initial conditions have the form: w x (t; 0) =u(t) w x (t;L) = w(t;L) y(t) =w(t;L) (2.2) 5 w(0;x) 0 for any 0xL; (2.3) where at time t, u(t) is the ethanol concentration in the blood and y(t) is the TAS measurement. The problem (1-3) of nding the concentration of the ethanol in the skin can be restated in the weak form. Determine w(t)2 H 1 (0;L) such that for any '2H 1 (0;L), hw t ;'i L 2 =hw xx ;'i =hw x ;' 0 i +w x 'j L 0 ; w(0) = 0 y(t) =w(t;L) or hw t ;'i L 2 =hw x ;' 0 i +u(t)'(0) w(t;L)'(L); w(0) = 0 y(t) =w(t;L): (2.4) Problem (2.4) can be discretized in space, by approximatingH 1 (0;L) with nite-dimensional subspaces H 1 N (0;L) =Spanf' i g N i=0 , where ' i are linear spline functions on the interval [0;L] with respect to the uniform meshf0;L=N; 2L=N;:::;Lg. The solution to the dis- cretized equation (2.4) is w N (t;x) = N X i=0 w N i (t)' i (x); such that for any ' i , j = 0;:::;N satises, N X i=0 _ w N i (t)h' i ;' j i = N X i=0 w N i (t)h' 0 i ;' 0 j i +u(t; 0)' j (0) N X i=0 w N i (t)' i (L)' j (L) (2.5) 6 y N (t) = N X i=0 w N i (t)' i (L): Or if we write the same problem in the matrix form: M N _ w N (t) = (K N 1 K N 2 )w N (t) +F N u(t) y N (t) =H N w N (t); (2.6) where M N = L 6N 0 B B B B B B B B B B B B @ 2 1 0 0 ::: 0 0 1 4 1 0 ::: 0 0 0 1 4 1 ::: 0 0 ::: ::: ::: ::: ::: ::: ::: 0 ::: ::: ::: ::: 1 2 1 C C C C C C C C C C C C A K N 1 = N L 0 B B B B B B B B B B B B @ 1 1 0 0 ::: 0 0 1 2 1 0 ::: 0 0 0 1 2 1 ::: 0 0 ::: ::: ::: ::: ::: ::: ::: 0 ::: ::: ::: ::: 1 1 1 C C C C C C C C C C C C A ; K N 2 = 0 B B B B B B B B @ 0 ::: 0 0 ::: ::: ::: ::: 0 ::: 0 0 0 ::: 0 1 1 C C C C C C C C A ; M,K 1 and K 2 are N + 1 by N + 1 matrices; H N = [0;:::; 0; 1] is a 1 by N + 1 matrix. F N = 0 B B B B B B B B @ 1 0 . . . 0 1 C C C C C C C C A and the solution w N (t) = 0 B B B B B @ w N 0 (t) . . . w N N (t) 1 C C C C C A are N+1 by 1 vectors: 7 The system of equations (2.6) can also be rewritten as: _ w N (t) = (M N ) 1 (K N 1 K N 2 )w N (t) + (M N ) 1 F N u(t) y N (t) =H N w N (t): (2.7) Parameters of the system (2.7) will be further denoted by q = (; ). 2.1 Discretization in time and space Numerical solution of the problem considered above sometimes needs to be discretized in time as well. The backward Euler scheme will be used in this case, since it is more stable than the forward scheme. 
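To make the time stepping concrete before the formal derivation below, here is a minimal sketch of a backward Euler solver for a linear system of the form w'(t) = A(q)w(t) + B(q)u(t), y(t) = Hw(t). The matrices, step size and input below are small illustrative stand-ins, not the assembled finite element matrices M^N, K^N of (2.6).

```python
import numpy as np

def backward_euler(A, B, H, u, h, n_steps):
    """Implicit (backward) Euler for w' = A w + B u(t), y = H w, w(0) = 0.

    Each step solves (I - h A) w_{k+1} = w_k + h B u_{k+1}, which is
    unconditionally stable for stiff diffusion-type systems.
    """
    n = A.shape[0]
    w = np.zeros(n)
    lhs = np.eye(n) - h * A          # constant step size, so form this once
    y = np.zeros(n_steps + 1)
    y[0] = H @ w
    for k in range(n_steps):
        rhs = w + h * B * u((k + 1) * h)
        w = np.linalg.solve(lhs, rhs)
        y[k + 1] = H @ w
    return y

if __name__ == "__main__":
    # Toy 3-node system standing in for the assembled SSM matrices.
    A = np.array([[-2.0, 1.0, 0.0],
                  [1.0, -2.0, 1.0],
                  [0.0, 1.0, -2.0]])
    B = np.array([1.0, 0.0, 0.0])        # input enters at one boundary
    H = np.array([0.0, 0.0, 1.0])        # observation at the other boundary
    u = lambda t: np.exp(-0.5 * t)       # illustrative smooth input
    y = backward_euler(A, B, H, u, h=1.0 / 30.0, n_steps=300)
    print(y[:5])
```

Because the step is implicit, each update requires only a linear solve with (I - hA), which is why the backward scheme remains stable where an explicit scheme would need a much smaller step.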
For chosen discretization points t k 0 consider w k =w N (t k ), u k =u(t k ), h k =t k+1 t k , A N (q) = (M N ) 1 (K N 1 K N 2 ) and B N (q) = (M N ) 1 F N the rst equation in (2.7) can be written as: w k+1 w k h k =A(q) N w k+1 +B N (q)u k+1 or (Ih k A N (q))w k+1 =w k +B N (q)u k+1 : Thus, system (2.7), discretized in time, becomes, w k+1 = (Ih k A(q)) 1 w k + (Ih k A(q)) 1 B(q)u k+1 y k+1 =H N w k+1 : (2.8) 8 Chapter 3 Population Analysis A traditional pharmacokinetic problem considers an individual's response to the drug in dierent parts of the body at various times. This response is often analyzed as a non- linear regression model. Initial estimates of parameters are obtained from well-designed experiments. The population analysis problem is to use information obtained for model parameters from healthy volunteers and apply it to predict drug eects for a patient undergoing therapy [20]. Traditionally the initial estimates are used to estimate central values and sometimes variability. The SSM described above represents the TAS signal, a result of ethanol diusing through the skin from the blood stream, as a (non-linear in parametersq = (; )) model (2:8) which captures the physiological characteristics of the skin for a particular patient. Adopting population analysis approach we will obtain blood and TAS observations based on generated hospital tests and use them in an iterative scheme to obtain parameter estimates. In the rst part of this section we describe thoroughly the generation of the hospital test data, blood and TAS measurements, which will be used further to test methods 9 we develop. To generate blood inputs we will use a full body model (FBM) which was developed and tted earlier by our research group. The second part describes a maximum likelihood approach for estimating parameter values. Later in this section we present the Global Two Stage method under the assump- tion thatq's are normally distributed and obtain estimates of their mean and covariance matrix . Numerical results are attached at the end of this section. 3.1 Data Generation Scheme We start from describing the full body model used in data generation for both population analysis and deconvolution problems. Next we proceed with detail description of the generation scheme used in the deconvolution problem. The full body model consists of 5 body compartments with three nonlinear Michaelis- Menten elimination pathways from the liver, and distributed parameter (PDE) models for the skin and TAS sensor. 
u(t) c 1 (t) c 2 (t) c 4 (t) c 3 (t) c 5 (t) k a q pv q eh , q pa q pv q am q ah q vh GI Portal Vein Liver Blood Other Body Tissue Skin f 1 , f 2 , f 3 Figure 2: Full Body Model c 1 (t) { Alcohol concentration in the gastro-intestinal tract 10 c 2 (t) { Alcohol concentration in the portal vein c 3 (t) { Alcohol content of the liver c 4 (t) { Alcohol concentration in the blood c 5 (t) { Alcohol concentration in other body tissue u(t) { Rate of alcohol consumption through the mouth The following ODE's were used to model the rate of change of the alcohol concentration in body compartments: _ c 1 (t) =k a c 1 (t) +f(t) _ c 2 (t) =k a c 1 (t)q vp c 2 (t) +q am c 4 (t) _ c 3 (t) =q vp c 2 (t) +q ah c 4 (t)q eh c 3 (t)q vh c 3 (t)f 1 (c 3 (t))f 2 (c 3 (t))f 3 (c 3 (t)) _ c 4 (t) =q vh c 3 (t)q ah c 4 (t)q am c 4 (t)q pa c 4 (t) +q pv c 5 (t) _ c 5 (t) =q pa c 4 (t)q pv c 5 (t): Separately, the skin model was designed to capture the diusion of the ethanol molecules through several layers of the skin (hypodermis (H), dermis (D), epidermis (E), and horny or stratum corneum (SC)), the excretion of ethanol by exocrine sweat glands, and the diusion of the ethanol across the control membrane in the TAS sensor itself. 11 Dermis Hypo - Dermis Epi - Dermis Stratum Corneum Exocrine Sweat Gland TAS Control Membrane v(t ) y(t ) z(t ) L TAS 0 L l D l SC l E 0 Dermis Hypo - Dermis Epi - Dermis Stratum Corneum Exocrine Sweat Gland TAS Control Membrane BA v(t ) y(t ) z(t ) Dermis Hypo - Dermis Epi - Dermis Stratum Corneum Exocrine Sweat Gland TAS Control Membrane BAC v(t ) y(t ) z(t ) L TAS 0 L l D l SC l E 0 L l D l SC l E 0 C 4 (t) Figure 3: Skin model for the Full Body Model Exocrine Sweat Glands: dv(t) dt =v(t) +f(t); t 0 v(0) = 0: Skin Diusion: @u(t;x) @t = @ @x (x) @u(t;x) @x + (x)c 4 (t); y(t) =u(t; 0) t 0 0xL @u(t;L) @x = 0 @u(t; 0) @x = u(t; 0) u(0;x) = 0 (x) = 8 > > > > > > > > > > > > < > > > > > > > > > > > > : SC 0xl SC E l SC <xl E D l E <xl D H l D <xL TAS Control Membrane: @w(t;x) @t = @ 2 w(t;x) @x 2 ; t 0 0xL TAS z(t) =w(t; 0) t 0 12 @w(t; 0) @x =w(t; 0) w(t;L TAS ) =y(t) +v(t) w(0;x) = 0: To t parameters the model was discretized in time and space and reduced to a linear non- homogeneous system of rst order ODE's that depend on 24 parameters. The data was obtained from the hospital test of patient #103 wearing WrisTAS device #04 on August 21, 2002, with the intake dose of 61.6 ml. Then a non-negative least squares optimization technique was used to obtain parameters. It appears that the chosen approach does an excellent job of capturing the locations of peaks in both BAC and TAS. The results are presented below in Figure 4 and Figure 5 : 0 50 100 150 200 250 300 350 0 50 100 150 200 250 300 Time (hrs) BAC (mg/dl) Model BAC P 103 BAC Data Figure 4: Two week BAC data for patient #103 and tted results To produce a hospital test for new patients, we assume that each patient is given the same amount of the alcohol to drink,U(t), and that the hospital test is conducted during the next 10 hours starting from the moment the patient starts drinking. First, we obtain the FBM model parameters by normally perturbing patient #103's parameters with a 15% deviation. 
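As a sketch of the perturbation step just described (the forward FBM simulation and the added observation noise are covered in the remainder of this subsection), a simulated population can be generated as follows. The parameter names, nominal values and cohort size are placeholders, not the actual fitted patient #103 values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Nominal FBM parameter values fitted to patient #103 (placeholder numbers).
nominal = {"k_a": 1.2, "q_vp": 0.8, "q_vh": 0.5, "q_ah": 0.3, "q_pa": 0.4}

def perturb(params, rel_sd=0.15):
    """Normally perturb each nominal parameter with a 15% relative deviation."""
    return {name: value * (1.0 + rel_sd * rng.standard_normal())
            for name, value in params.items()}

# One perturbed parameter set per simulated patient in the population study.
cohort = [perturb(nominal) for _ in range(100)]
print(cohort[0])
```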
Based on the amount of alcohol administered and the chosen parameter values, the FBM model generates nely sampled blood input u(t) and TAS observations 13 0 50 100 150 200 250 300 350 0 50 100 150 200 250 300 Time (hrs) TAS (mg/dl) Model TAS P 103 TAS Data Figure 5: Two week TAS data for patient #103 and tted results y using a high space discretization for the full body model. In most cases we added Gaussian noise to generated TAS values. For the detailed generation scheme refer to Figure 6 below. FBM U(t) Perturbed P103 FBM Parameters N(0, ) Figure 6: Scheme for generating hospital tests in the population analysis problem 3.2 Maximum Likelihood Approach for Identifying System Parameters In this section the parametric case of the general population modeling problem is considered. The random vectorY i takes values inR M and is observed, the random vector 14 q i takes values in R L and is not observed, i = 1;:::;K. The following assumptions are made: q i 's are i.i.d. with common but unknown density function . Y i 's are independent but not necessarily identically distributed. The conditional density p i (Y i jq i ) is known. The population density belongs to a family of parameterized probability densities of the formf(qj!) :!2 g, where 2R W , so that =(qj! ). The Y i 's usually represent measurements for the i'th subject of the experiment, q i is the collection of the unknown parameters associated with the adopted model. In our case the observed values are TAS measurements sampled every 2 minutes, and q = (; ) represents parameters for the SSM. For the generated model, assumptions of independence of observations and parameters for dierent patients are satised. We try to estimate a probabilistic structure of parameters q based on observed data. For some true parameter !2 the conditional density of Y i given ! is dened as, p i (Y i j!) = Z p i (Y i jq)(qj!)dq: The log likelihood of the data is, L(!) = K X i=1 logP i (Y i j!): (3.1) 15 To estimate! we consider! ML = arg maxfL(!) :!2 g, in other words the maximum likelihood estimator. To nd the necessary condition for the existence of the maximum at an interior point of we use the following, @ @! L(!) = K X i=1 @ @! logp i (Y i j!) = 0: Notice that, @ @! logp i (Y i j!) = R p i (Y i jq) @ @! (qj!)dq p i (Y i j!) or that @ @! logp i (Y i j!) = R @ @! (qj!) p i (Y i jq)(qj!)dq p i (Y i j!) =E @ @! log(qj!)jY i ;! ; hence, @ @! L(!) = K X i=1 E @ @! log(qj!)jY i ;! = 0: (3.2) We propose to nd ! ML using the Expectation-Maximization (EM) algorithm [6], [14]. TheY i 's are thought to be incomplete data as opposed to pairs (Y i ;q i )'s. The EM principle is then to condition the complete likelihood function (3.1) on the incomplete data Y i 's and then to maximize it. We may start from any point ! 0 2 and for the k'th step of the algorithm dene, E-Step Q(!;! K ) = K X i=1 E[log(qj!)jY i ;! k ] M-Step ! k+1 = arg maxfQ(!;! k ) :!2 g: (3.3) 16 The following fact is well known and is shown for example in [14]. Lemma 1. For ! k dened in (3.3) and for any k, we have L(! k+1 )L(! k ). Proof. First note that, p i (Y i j! k+1 ) p i (Y i j! k ) = Z p i (Y i jq) p i (Y i j! k ) (qj! k+1 )dq: Also for any events A, B, C we have, P (AjB) = P (AjC;B)P (CjB) P (CjA;B) : So that, p i (Y i j! k+1 ) p i (Y i j! k ) = Z p i (Y i jq)P (qjY i ;! k ) p i (Y i jq;! k )(qj! k ) (qj! k+1 )dq = Z p i (qjY i ;! k ) (qj! k+1 ) (qj! k ) dq; and, nally, p i (Y i j! k+1 ) p i (Y i j! k ) =E (qj! k+1 ) (qj! k ) jY i ;! k : By Jensen's inequality: log p i (Y i j! k+1 ) p i (Y i j! 
k ) E (qj! k+1 ) (qj! k ) jY i ;! k : 17 Thus, L(! k+1 )L(! k ) = K X i=1 log p i (Y i j! k+1 ) p i (Y i j! k ) K X i=1 E log (qj! k+1 ) (qj! k ) jY i ;! k = Q(! k+1 ;! k )Q(! k ;! k ) 0 : If ! k+1 is a maximum of Q(!;! k ), then @ @! Q(!;! k )j !=! k+1 = K X i=1 E[ @ @! log(qj! k+1 )jY i ;! k ) = 0: This expression is equivalent to equation (3:2). If the sequencef! k g converges, it will converge to the maximum likelihood estimate ! ML . The EM algorithm is stopped when ! k and ! k+1 are close enough. The convergence depends on the log likelihood function L(!). As a special case, assume that qN(; ), and ! = (; ), the parameter space =f(; ) :2R and is positive deniteg is an open subset of the Euclidean space. Then, log(qj!) = 1 2 log det() 1 2 (q) 0 1 (q) +C and equation (3:2) can be rewritten in the form: K X i=1 E( 1 (q))jY i ;!) = 0 K X i=1 E( 1 + 1 (q)(q) 0 1 )jY i ;!) = 0: 18 Thus, = 1 K K X i=1 i (!) = 1 K K X i=1 [P i (!) + ( i (!))( i (!)) t ]; (3.4) where i (!) =E(qjY i ;!) P i (!) =E[(q i (!))(q i (!)) 0 jY i ;!]: (3.5) It may be shown that (3:5) is equivalent to the E-step of the EM algorithm and (3:4) is equivalent to the M-step in the sense that for the k-th iteration: E-step : i (! k ) =E(qjY i ;! k ) P i (! k ) =E[(q i (! k ))(q i (! k )) t jY i ;! k ] M-step: k = 1 K K X i=1 i (! k ) k+1 = 1 K K X i=1 [P i (! k ) + ( i (! k ) k )( i (! k ) k ) 0 ]: (3.6) To summarize, note that (3:6) is the algorithm for obtaining the maximum likelihood estimate of the hyper parameter! = (; ) of the unobserved parametersq i based on the observed data Y i from patient i = 1;K (K = 100) provided that q i have the distribution N(; ). 19 3.3 GTS Method Let us restate one more time the form of the generated SSM considered: y i =f N (q i ;u i ) +e i i = 1; 100 e i N(0; 2 i I) y i ;u i 2R M q2R L ; (3.7) where y i are observations of TAS measurements for the i-th patient sampled every T M minutes (we agreed to get TAS observations every 2 minutes). T - the length of the test(10 hours); u i - known blood inputs; q i = (;; ) unknown model parameters; f N (q i ;u i ) is obtained from (2:8). In reality TAS observations can never be negative but we would like to choose the model (3:7), because of its simplicity. Adopting this model we make an assumption that it is possible for a Wrist TAS device to show negative values. Assume further that: q i N(; ): (3.8) We are in a \data rich" situation, since there are 100 patients with approximately 300 observations for each person. We will apply the Global Two Stage (GTS) method [14], [4] to estimate (; ), mainly because this method involves nding an estimate forq i , and then estimating the covariance of ^ q i for each patient. This requires a lot of data for each 20 experiment. The GTS method is also a modied version of the EM algorithm and is very easy to implement. As we mentioned before, the rst stage of the method consists of obtaining a least squares estimate of the q i 's, ^ q i , by minimizing the functional J i (q) with respect to q: J i (q) = y obs i f N (q;u i ) 2 = M X j=1 (y i (t j )f N (q;u i (t j ))) 2 ; i = 1; 100; (3.9) wheref N (q;u i (t j )) =H N w N i (q;t j ) is computed for the i'th patient from the system 2:8 via an adaptive time-stepping algorithm. At this stage the least square's estimate of the noise variance can be computed as well [4], (^ N i ) 2 = 1 Mp M X j=1 (y i (t j )f N (q;u i (t j ))) 2 : (3.10) The second stage of the GTS requires some considerations. 
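Before turning to the second stage, here is a minimal sketch of this first stage for one patient: a per-subject least-squares fit of q_i followed by the residual-based noise variance estimate (3.10). The forward map used here is a crude first-order stand-in for f^N(q, u_i), not the discretized skin model, and all numbers are illustrative.

```python
import numpy as np
from scipy.optimize import least_squares

def forward_tas(q, u, t, h=2.0 / 60.0):
    """Crude stand-in for f^N(q, u): a first-order lag of the BAC input.

    q = (gain, rate); the real model is the discretized diffusion system.
    """
    gain, rate = q
    y = np.zeros_like(t)
    for k in range(1, len(t)):
        y[k] = y[k - 1] + h * (-rate * y[k - 1] + gain * u[k])
    return y

def fit_patient(y_obs, u, t, q0=(1.0, 1.0)):
    """Stage 1 of GTS: least-squares estimate of q_i plus the noise variance (3.10)."""
    resid = lambda q: forward_tas(q, u, t) - y_obs
    sol = least_squares(resid, x0=np.asarray(q0), bounds=(0.0, np.inf))
    m, p = len(y_obs), len(sol.x)
    sigma2_hat = np.sum(sol.fun ** 2) / (m - p)
    return sol.x, sigma2_hat

# Illustrative data: a triangular BAC episode observed every 2 minutes for 10 h.
t = np.arange(0, 10, 2.0 / 60.0)
u = np.interp(t, [0, 1, 6, 10], [0, 80, 0, 0])
rng = np.random.default_rng(1)
y_obs = forward_tas((0.9, 1.4), u, t) + rng.normal(0, 1.0, size=t.size)
q_hat, s2 = fit_patient(y_obs, u, t)
print(q_hat, s2)
```

Running this fit independently for each simulated patient supplies the collection of estimates that the second stage, described next, combines into population-level estimates.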
For the least square estimate, ^ q i =q i +V 1=2 i e; eN(0;I) q i = + 1=2 ; N(0;I); (3.11) where V i is V N i = ( N i ) 2 ((X N i ) 0 X N i ) 1 - a covariance matrix of the estimate with X N i = d dq f N (^ q i ;u i ), ande independent[4]. From this point the ^ q i are considered to be observed data and (; ) is missing, so that the iterative steps (3:6) of the EM algorithm will apply to nd the ML estimate of (; ) based on the dataf^ q i g, i = 1; 100. The main concern 21 is computation of the E-step. In general if the data depends linearly on the parameter as in, Y =Cq +V 1=2 e; where eN(0;I); then formulas for the conditional mean and covariance are well known [2]: (!) = +P (!)C 0 V 1 (YC) P (!) = ( 1 +C 0 V 1 C) 1 : According to (3:11) and the latter expressions, the EM algorithm (3:6) is, E-step: P i (k) = ( 1 k +V 1 i ) 1 (3.12) i (!) = k +P i (k)V 1 i (^ q i k ) (3.13) M-step: k = 1 K K X i=1 i (k) (3.14) k+1 = 1 K K X i=1 [P i (k) + ( i (k) k )( i (k) k ) 0 ]: (3.15) The initial point of the algorithm is arbitrary but usually the least squares estimates of the mean and covariance of ^ q i are considered, 0 = 1 K K X i=1 ^ q i 0 = 1 K K X i=1 (^ q i 0 )(^ q i 0 ) 0 : 22 This concludes the GTS method. At the end just note that the algorithm converges, sincef^ q i g are normally distributed so the log likelihood of the data points has a unique maximum at the interior point of the parameter space. 3.3.1 Adjoint Gradient Method for Computation of LS Covariance Matrix First, to implement the above-stated algorithm we need to obtain the least squares estimatesf^ q i g. To run the second stage of the GTS described by the equations (3:12)- (3:14) we need to compute the covariance matrix of the least squares estimateV i . Only an estimate of V i can be obtained: ^ V N i = (^ N i ) 2 ((X N i ) 0 X N i ) 1 ; (3.16) where X N i = @ @q f N (^ q i ;u i ) is an M by p matrix, and ^ N i is obtained from (3:10). Luckily the matrixX N i can be computed exactly for the time-discretized system (2:8) by applying the adjoint gradient method. Lemma 2. For the linear system of ordinary dierential equations discretized in time, w k+1 =D k (q)w k +G k (q)u k+1 y k+1 =C(q)w k+1 ; w 0 = 0; k = 0;M 23 the following holds: d dq y k (q) = d dq C(q)w k + k1 X j=0 z j [ d dq D j (q)w j + d dq G j (q)u j+1 ] (3.17) for any k = 1;M, where z k1 =C(q); z j1 =z j D j (q) (3.18) for any j = 1;k 1. Proof. Let J(q) := P k j=0 j y j , where k = 1, j = 0 for j = 1;k 1. Then, d dq J(q) = k X j=0 j f d dq C(q)w j +C(q) w j q g = d dq C(q)w k + k X j=0 j C(q) dw j dq : Next consider functions z j (q), j =1;k such that z j1 (q)z j (q)D j (q) = j C(q)z k (q) = 0: It follows that: d dq J(q) = d dq C(q)w k + k X j=0 z j1 dw j dq k1 X j=0 z j [ dw j+1 dq d dq D j (q)w j d dq G j (q)u j+1 ]: Since dw 0 dq = 0, the last expression can be rewritten as: d dq J(q) = d dq C(q)w k + k1 X j=0 z j [ d dq D j (q)w j + d dq G j (q)u j+1 ]: 24 Note that for the last expression d dq y k (q) = d dq J(q); thus, we obtained (3:17) for z j (q) that satisfy (3:18). To compute X N i = @ @q f N (^ q i ;u i ) for the skin model problem apply Lemma 2 to the discretized system of equations (2:8). Notice that, C =H N D j = (Ih j A N (q)) 1 G j = (Ih j A N (q)) 1 B N (q) where the time step h j equals 2 minutes. We will show now that Proposition 1. d dq C = 0 d dq D j (q) =h j 1 j d dq K N (q) 1 j M N d dq G j (q) = 1 j [h 2 j d dq K N (q) 1 j F N +h j d dq (F N )] (3.19) where K N (q) = (K N 1 K N 2 ), j = (Mh j K(q)). Proof. 
For C =H, D j = (Ih j A(q)) 1 , G j = (Ih j A(q)) 1 B(q): d dq C = 0 d dq D j (q) =h j (Ih j A(q)) 1 dA(q) dq (Ih j A j (q)) 1 d dq G j (q) =h j [ d dq (Ih j A(q)) 1 B(q) + (Ih j A j (q)) 1 d d B(q)] (3.20) 25 From (2:7): dA(q) dq =M 1 d dq K(q); where K(q) = (K 1 K 2 ) d dq B(q) =M 1 d dq (F ) Let j =Mh j K(q) then(Ih j A(q)) 1 = (Ih j M 1 K(q)) = 1 j M, and equations (3:20) can be rewritten as: d dq C = 0 d dq D j (q) =h j 1 j d dq K(q) 1 j M d dq G j (q) = 1 j [h 2 j d dq K(q) 1 j F +h j d dq (F )] Equations (3.19) are easily computed explicitly. The last issue to worry about is a non-singularity of (X N i ) 0 X N i which is a p by p matrix. If X i has rank p, then (X N i ) 0 X N i is invertible, but nevertheless it may be nearly singular and cause problems in running the GTS. The latter situation arises when the model is over parametrized and sometimes depends on the choice of observations [15]. 26 3.4 Numerical Results First, we decided to test the GTS algorithm using a simple model, for which we know the result a priori. For known , V , and we generated k = 100 values of q i 's: q i = + (V + ) 1=2 " i : (3.21) Then we applied the GTS method to the obtained data, in an attempt to recover and . The GTS method as was described above obtains a maximum-likelihood estimate for the parameters. In this particular case we know that ML = q; ML = (k 1)s 2 k V . For = [:5; 1:2; 3]; = 1 4 I; V = 0 B B B B B @ 1 :2 :3 :2 1 :5 :3 :5 1 1 C C C C C A ; tolerance - 10 6 the simulated 100 values produced: ML = [:3593; 1:1369; 2:8956]; ML = 0 B B B B B @ 0:4125 0:0310 0:0156 0:0310 0:1985 0:0997 0:0156 0:0997 0:1037 1 C C C C C A : while the GTS results are shown in the Table 1. When the algorithm starts from the ML estimate, it converges in one step, it requires longer to run when starting from a random point. Next, we simulated TAS data for 100 patients using the scheme described in section 27 Initial Step GTS Estimates Steps 0 0 0 0 0 @ 0:3593 1:1369 2:8956 1 A 0 @ 0:4125 0:0310 0:0156 0:0310 0:1985 0:0997 0:0156 0:0997 0:1037 1 A 0 @ 0:3593 1:1369 2:8956 1 A 0 @ 0:4125 0:0310 0:0156 0:0310 0:1985 0:0997 0:0156 0:0997 0:1037 1 A 1 0 @ 1 1 1 1 A 0 @ 1 0 0 0 1 0 0 0 1 1 A 0 @ 0:3593 1:1369 2:8956 1 A 0 @ 0:4125 0:0310 0:0156 0:0310 0:1985 0:0997 0:0156 0:0997 0:1039 1 A 1096 Table 1: GTS results for the model with known ML estimates. 3:1 of chapter 3, and added Gaussian noise with variance 1/16, 1, and 2 to nd out if and how the GTS estimates are aected by the noise variance. For the rst stage of the GTS method we obtained a least squares estimate ^ q i of J(q) (3:3) using skin discretizations N = 4; 16; 32; 64. Next, we precomputed V 1 = 1 ^ 2 (X 0 X) using the adjoint method, initialized a starting point for the algorithm 0 and 0 and compared results for dierent skin discretizations and noise levels in Tables 2 and 3. For this particular example we do not know the exact ML estimates. As a benchmark we will use the results obtained with the skin discretization N = 64 for each noise level and compare a norm of the dierence between the benchmark and the estimated values. In theory as the skin discretization increases the solution to the discretized diusion equation converges to the true solution. In order to demonstrate convergence, one needs to show that the absolute error between successive trials gets smaller as discretization increases. Both results for and show receding errors with increasing discretization. As the nal illustration of the GTS method we used some actual clinical data provided by Dr. 
L.Templeman from Giner, Inc. We considered 17 patients with approximately the 28 N 2 noise = 1 16 2 noise = 1 2 noise = 2 N = 4 0:1862 1:9997 0:1862 1:9999 0:185 1:9955 Error 0:00014677 0:00014666 0:00015154 N = 16 0:1857 1:9883 0:1858 1:9885 0:1845 1:9839 Error 5E 07 5E 07 4:9E 07 N = 32 0:1856 1:9877 0:1857 1:9879 0:1845 1:9834 Error 1E 08 1E 08 4E 08 N = 64 0:1856 1:9876 0:1857 1:9878 0:1845 1:9832 Table 2: GTS results for for generated data same body mass indices (BMI), who were asked to consume from 20 to 70ml of alcohol within 15 minutes. For the next 5-6 hours they were asked to wear the TAS device and to take measurements with a breathanalyser every 10 minutes until the device showed a sucient drop of the blood alcohol content in a patient's breath. The breathanalyser data was used as a blood input for the diusion problem. For each of these patients we obtained LS estimates of their skin model coecients using N = 64. The scatter plot for the estimates we obtained is shown in Figure 7. Parameters do not seem to be correlated. To assess the distribution of the estimates we obtained, consider boxplots, also known as box-and-whisker diagrams, in Figure 8. To build a boxplot each set of parameters was ordered. A box in a diagram represents rst and third quartiles of the dataset with a vertical red line being the median. Whiskers usually show values within 1.5IQR, where IQR stands for inter-quartile range computed as (Q 75 Q 25 ). All other values of 29 N 2 noise = 1 16 2 noise = 1 2 noise = 2 N = 4 0:0009 0:0023 0:0023 0:3685 0:0009 0:0022 0:0022 0:3656 0:0009 0:0021 0:0021 0:3677 Error 2:118E 05 2:118E 05 2:209E 05 N = 16 0:0009 0:0022 0:0022 0:3642 0:0009 0:0021 0:0021 0:3612 0:0009 0:0021 0:0021 0:3632 Error 9E 08 4E 08 4E 08 N = 32 0:0009 0:0022 0:0022 0:364 0:0009 0:0021 0:0021 0:361 0:0009 0:0021 0:0021 0:363 Error 1E 08 0 0 N = 64 0:0009 0:0022 0:0022 0:3639 0:0009 0:0021 0:0021 0:361 0:0009 0:0021 0:0021 0:363 Table 3: GTS results for for generated data 0 0.5 1 1.5 2 2.5 0 5 10 15 20 25 alpha : 2.424 gamma : 2.196 α γ alpha : 0.3286 gamma : 24.93 Figure 7: Scatter plot for SSM coecients obtained from real clinical hospital tests. 30 .33 .55 .88 2 2.42 1 α 1.3 3.5 8.3 20 24.9 1 γ −1.11 −.57 −.13 .8856 1 log α .25 1.23 1.5 2.1 2.5 3.22 1 log γ Figure 8: Scatter plot for SSM coecients obtained from real clinical hospital tests the dataset are considered outliers. Since the GTS method relies on the assumption of normality for skin coecients, boxplots are very useful in understanding if the datasets under consideration can be considered approximately normal. There are no problems with coecients except for one outliers, whereas 's distribution is highly skewed to the right. We tried to improve this situation by considering log and log shown on the right in Figure 8. The boxplot shows some improvement in symmetry for the log coecient, and the most important discovery is that the outlier has disappeared and now lies within 1.5IQR. To conclude, we will run the GTS method for (log; log ) using all data sets, and then repeat the GTS excluding the Patient #5 data set, whose coecient may be an outlier. Results are presented in Table 4. One can notice that there is not much dierence between the sample and the GTS estimates produced with or without the presence of the outlier. A sample estimate for is the sample average, and for is the sample covariance. To investigate the eciency of 31 Type of Estimator Error(^ ) GTS 0:4885 1:4462 0:1937 0:1474 0:1474 0:4463 725:8784 Sample Est. 
0:4865 1:4461 0:2123 0:1579 0:1579 0:4744 887:814 GTS, No Outliers 0:5725 1:4873 0:0868 0:098 0:098 0:4454 633:1726 Sample Estimates, No Outliers 0:5722 1:4873 0:0931 0:1043 0:1043 0:4752 729:5283 Table 4: GTS results for clinical data. each method we computed the average error between observed TAS data and the output produced by each estimate of . Namely, for each ^ and for each patient we computed: E i (^ ) =jjy i f(^ ;u i )jj 2 1 T i ; where T i is the length of the experiment for the ith patient. Then Error (^ ) = X i E i (^ ): We see that computed errors are smaller for GTS estimates compared to the sample ones, and also decrease without the presence of the outlier. Compare these values to the errors computed in the same way for the least squares estimates: Error (d q LS ) = 30:0149 Error (d q LS ; No outliers) = 26:1398: 32 0 2 4 6 8 10 0 10 20 30 40 50 60 70 80 90 100 TAS BAC TAS LS TAS AVE TAS GTS (a) 0 1 2 3 4 5 6 7 0 10 20 30 40 50 60 70 80 TAS BAC TAS LS TAS AVE TAS GTS (b) Figure 9: Clinical data and TAS measurements computed based on GTS and average estimates Least squares estimates provide the best t possible for a new patient, but only when the BAC input is known, which is possible if the patient is admitted for the hospital test. Compare modeled TAS observations with observed data for two patients in Figure 9. As mentioned before our goal is to eliminate a hospital test for an incoming patient that belong to the analyzed population group, and in particular, to obtain a better estimate for the population skin model parameters. The GTS method provides us with better estimates than sample statistics. It is very easy to implement, the estimators are more robust and provide better overall t when comparing the SSM output based on these values to the TAS data observed. We are not sure how fast the GTS results converge with increased skin discretization and propose to investigate this aspect in the future. 33 Chapter 4 Bayesian Deconvolution of the BAC As mentioned in the introduction, the most important challenges for the WrisTAS device is to be able to predict the blood alcohol concentration by observing noisy TAS data. We tried to simplify this problem by assuming that a new patient belongs to a group studied earlier at the population analysis stage. In the Bayesian perspective all unknown parameters are treated as variables; the uncertainty about their values is re ected in the probability distribution. The basic principle at work is to assume a prior distribution for parameter values, and then to obtain an updated (posterior) distribution for them based on observed data. In this way we do not obtain a single solution but rather a distribution of the value of interest that can be used to produce estimates. The updated distribution expresses a degree of condence about the unknown value after the measurement has been observed. [10]. Similar input deconvolution problems are considered in [10], [1], and [5]. However, most of them consider a known system, or are interested in other estimators. They all approximate the input by the linear combination of basis functions and use regularization techniques. 34 In the rst part of this section we carefully describe the deconvolution problem and un- known quantities to be computed . Next we consider theoretical methods and approaches used to obtain the estimates, as well as numerical implementation and results. 
4.1 Stochastic Model for TAS Data for the BAC Deconvolution Problem To model the TAS signal we use the discretized in space simple skin model described in chapter 2 to represent observed data. However, we approach the solution of the ODE's in a dierent way, since now the inputu(t) is a parameter itself and needs to be estimated together with SSM parameter values. Recall the system of ODE's for the discretized in space SSM: w N (t) = A N (q)w N (t) +B N (q)u(t) (4.1) y N (t) = H N w N (t); where A N (q) = (M N ) 1 (K 1 K 2 ) and B N (q) = (M N ) 1 F N areR N+1N+1 matri- ces, N is skin discretization, and M N ;H N ;F N are as they were dened in the chapter 2. The general solution to the above system of ODE's for the TAS signal computed from (4.1) is written as: y N (t) =H N t Z 0 e A N (q)(t) B N (q)u(t)d; where 0tT and T is the moment we stopped observing TAS data. 35 The blood concentration is a smooth signal (the consumed alcoholic beverage having been \ltered" by the gastro-intestinal system of the body) so it is well approximated by splines. Consider u i (t), linear spline functions on the interval [0;T ] with respect to the uniform meshf0; T P ; 2T P ;:::;Tg, for some integer value P and consider u P (t) = P1 P i=1 l P i u i (t); where l P i are the coecients to be estimated. Dene y N;P (t) = P X i=1 l P i C N;P i (t;q); where C N;P i (t;q) =H N t Z 0 e A N (q)(t) B N (q)u i (t)d: (4.2) It can be shown under certain conditions that y N;P (t)!y N (t) as P!1. For the times 0t 1 t 2 :::t M =T dene y N;P j =y N;P (t j ). Then the system of ODE's (4.1) for the discretized input can be written in matrix form as y N;P = 0 B B B B B B @ y N;P 1 . . . y N;P M 1 C C C C C C A =C N;P (q)l P ; where C N;P is PM matrix with entries C N;P ij =H N t j Z 0 e A N (q)(t j ) B N (q)u i (t)d (4.3) 36 and l P = 0 B B B B B B @ l P 1 . . . l P M 1 C C C C C C A : The actual TAS data is noisy, and the signal produced by the TAS device is nonnega- tive, but for simplicity in our model we introduce an additive Gaussian noise with mean 0 and known variance 2 . Once we obtain adequate estimations for the case of Gaussian noise it is not hard to generalize results for nonnegative TAS observations, i.e. consider noise with log normal distribution. The nal formula for the actual TAS observations is: y TAS =C N (q)l +e; where eN(0; 2 I MM ); (4.4) y TAS 2R M + - observed TAS measurements, C(q)2R MP , M - is the number of observations, P - number of splines used to approximate the input, N - SSM discretization level, q2R 2 + , l2R P + - parameters to be estimated. Questions to answer now are how to estimate the parameters (q;l) from observing noisy measurementsy TAS for chosen SSM discretizationN and number of splinesP , and how these estimates depend on the choice of the discretization parameters. 37 4.2 Deconvolution Problem for the BAC Signal The idea of the Bayesian method is to assume a prior distribution for parameter values, and to obtain an updated (posterior) distribution for them based on observed data. Further, we set notation for the prior density function to be pr () and for the posterior to be post () . We assume that a priori random variablesq = (; ) andl are independent, since the blood alcohol concentration depends on the amount of alcohol consumed by the patient, while the skin parametersq describe physiological characteristics of human skin. 
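Before building the prior and posterior on top of (4.4), here is a minimal sketch of how the matrix C^{N,P}(q) can be assembled column by column: each basis spline u_i drives the linear system, and the output is recorded at the observation times. The matrices A(q), B(q), H, the hat-function basis and the implicit Euler stepping are illustrative stand-ins in the spirit of Chapter 2, not the exact implementation used in the thesis.

```python
import numpy as np

def simulate_output(A, B, H, u_vals, h):
    """Backward Euler simulation of w' = A w + B u, y = H w, w(0) = 0."""
    w = np.zeros(A.shape[0])
    lhs = np.eye(A.shape[0]) - h * A
    y = np.zeros(len(u_vals))
    for k in range(1, len(u_vals)):
        w = np.linalg.solve(lhs, w + h * B * u_vals[k])
        y[k] = H @ w
    return y

def build_C(A, B, H, t, P):
    """C[j, i] = output at time t_j when the input is the i-th hat-function spline."""
    h = t[1] - t[0]
    knots = np.linspace(t[0], t[-1], P + 2)
    C = np.zeros((len(t), P))
    for i in range(P):
        # Hat function ('pup tent') centred at knots[i + 1].
        u_i = np.interp(t, [knots[i], knots[i + 1], knots[i + 2]], [0.0, 1.0, 0.0])
        C[:, i] = simulate_output(A, B, H, u_i, h)
    return C

A = np.array([[-2.0, 1.0], [1.0, -2.0]])
B = np.array([1.0, 0.0])
H = np.array([0.0, 1.0])
t = np.linspace(0.0, 10.0, 301)
C = build_C(A, B, H, t, P=10)
print(C.shape)   # (301, 10): modelled observations are C @ l for coefficients l
```

With C(q) in hand, simulated observations follow the model (4.4) as y = C(q) l + noise; the prior and posterior built on this model are developed next.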
Then the joint posterior density of q andl giveny is a product of the marginal densities: (q;ljy) =c(yjq;l) pr (q) pr (l): (4.5) Note thatc is a normalizing constant that guarantees that (q;ljy) is a density function. We can obtain the density (yjq;l) of observed values from the additive models (4.4). Since parameter values (q;l) do not depend on the noise, one can conclude that (yjq;l) = noise (yC N (q)l): (4.6) Now we are ready to state the problem. Problem 2 (Bayesian Formulation of the BAC Deconvolution Problem). Given TAS observations y for a new patient who belongs to the population group analyzed earlier, compute the estimate of their posterior density (q;ljy). For this, assume skin discretiza- tion N; choose basis splinesfu i (t)g P i=1 and prior distributions for parameters of interest, i.e. q and l . 38 One of the ways to solve this problem is to compute the maximum a posteriori (MAP) estimate for the system and the input parameters: (q MAP ;l MAP ) = arg maxf(q;ljy);q2Q;l2R P + g: (4.7) Another possibility is to obtain an actual sample from the posterior distribution. This could be done directly or using Markov Chain Monte Carlo (MCMC) simulation methods. Then virtually any point or interval estimate of parameters or their functions can be obtained. In any case before trying to solve the problem one need to choose a prior distribu- tion for q and l. This prior has to re ect as much known a priori information about parameters as possible, but also it has to allow enough variability to include true val- ues. As mentioned in the deconvolution problem statement a new patient is coming from the population group analyzed earlier. Consequently, set pr () =N( GTS ; ) and pr ( ) =N( GTS ; ) where GTS , and GTS are estimates obtained via the GTS scheme, and = 1 2 and = 1 2 are called "precisions" whose values are assigned at the imple- mentation stage. Further in this section we may include a positivity condition for the SSM parameters in a prior. Namely, one may see the notation pr () = N + ( GTS ; ), or pr ( ) =N + ( GTS ; ) which means that N( GTS ; ), N( GTS ; ) but all the negative values for and are omitted. 39 For the spline coecients we choose a Gaussian prior as well. The mean for the spline coecients can be obtained by the nonlinear least squares estimation for the parameter l of the functional F (l) =jjy TAS C(q GTS )ljj 2 +R(l), where q GTS = ( GTS ; GTS ); (4.8) where ( GTS ; GTS ) are the population simple skin model parameters, and R(l) is a regularization term described in section 4:5:2 of chapter 4. Indeed, without any prior calibration for a new patient, to base conclusions on population analysis estimates seems to be the best guess about the ethanol concentration in her/his blood. To dene the variance for spline coecients recall that the blood input is smooth; thus, we will penalize l for having too many \bumps" or \jumps" by introducing L = 0 B B B B B B B B B B B B @ 1 1 ::: ::: ::: ::: 1 1 ::: ::: ::: ::: ::: ::: ::: ::: ::: ::: 1 1 ::: ::: ::: ::: 1 1 C C C C C C C C C C C C A :2R PP Next assume that pr (l) =N( b l; ( l L 0 L) 1 ): To better understand the choice of correlation matrix, write the density explicitly pr (l) =Ce l (l ^ l) 0 L 0 L(l ^ l)=2 : 40 The above expression is equivalent to penalizing the dierence vector (l ^ l) for having too many changes within its components. The larger the parameter l is, the less change for (l ^ l) is possible, which means that we force the solution to accept the shape of ^ l. 
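A minimal sketch of this smoothness prior is given below, assuming L is a square first-difference operator whose last row is left as an identity row so that L'L is invertible (the exact convention in the thesis may differ). It evaluates the log-prior term that enters the least-squares functional (4.8) and the MAP objective of Section 4.3; the values of P, the precision and the prior mean are illustrative.

```python
import numpy as np

def difference_penalty(P):
    """Square bidiagonal operator: rows (1, -1, 0, ...), last row e_P.

    Any full-rank L that maps l to (approximate) increments yields a proper
    Gaussian prior N(l_hat, (lam * L'L)^-1); this is one simple choice.
    """
    L = np.eye(P)
    L -= np.diag(np.ones(P - 1), k=1)   # rows become (1, -1); last row stays e_P
    return L

def log_prior_l(l, l_hat, lam, L):
    """log pi_pr(l) up to a constant: -lam * (l - l_hat)' L'L (l - l_hat) / 2."""
    d = L @ (l - l_hat)
    return -0.5 * lam * d @ d

P = 10
L = difference_penalty(P)
lam = 0.01                                   # small precision keeps the prior loose
l_hat = np.linspace(0.0, 50.0, P)            # e.g. a least-squares input estimate
prior_cov = np.linalg.inv(lam * L.T @ L)

# A rougher deviation from l_hat is penalised more than a smooth one.
smooth = l_hat + 2.0
rough = l_hat + 2.0 * (-1.0) ** np.arange(P)
print(log_prior_l(smooth, l_hat, lam, L), log_prior_l(rough, l_hat, lam, L))
```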
In a Bayesian perspective, we have to allow l to stay rather small, to guarantee that the true l will be included in the prior. The value for l is discussed in depth when we consider implementation issues in section 4:5 of chapter 4. The idea of introducing a correlation matrix that penalizes jumps for the input can be compared with regularization techniques used in inverse problems, [5]. Since inverse prob- lems are typically known to be ill-posed there are many solutions that t the optimization criteria used in input deconvolution. Hence, restricting the space of admissible solutions makes the problem well-posed or at least better-posed. Similarly, in the Bayesian ap- proach by introducing the correlation function described above, we simply restricted the domain of possible solutions to make the problem more \identiable." Note also that the input u(t) can not be negative in practice, but we will ignore this condition in assigning a prior distribution for the spline coecients l in some methods. Next, we introduce the MAP estimation problem and its solution via Bender's De- composition. 4.3 Theoretical Approach to the Computation of MAP Estimates Consider our stochastic model (4.4) for the new patient, y TAS =C N (q)l +e; where eN(0; 2 I MM ); 41 where q2Q, l2R P + , and is known. Assume now thatQ = [a;b][c;d], andl2B, whereB is a compact set inR P + : As was noted above a maximum a posteriori estimate is computed as (4.7) which is equivalent to: (q MAP ;l MAP ) = arg maxflog(q;ljy) :q2Q;l2Bg: Since (4.5), it follows that: (q MAP ;l MAP ) = arg maxflog(yjq;l) + log pr (q) + log pr (l) :q2Q;l2Bg: Hence, from (4:4) and (4:6) we obtain: (q MAP ;l MAP ) = arg max 1 2 2 jjy TAS C N (q)ljj 2 + log pr (q) + log pr (l) :q2Q;l2B : (4.9) To compute MAP estimates we rely on an especially useful optimization technique in which we partition the domain. This is usually done when optimization with respect to one set of the variables is much simpler than with respect to another, see for example [1]. Fixing the variable q will result in the linear least squares estimation for the variable l, while simultaneous optimization with respect to (q;l) requires computation of derivatives for the log(q;ljy) and is extremely time consuming. With this in mind, the approach of partitioning variables look like it may be of some value to us. 42 Next, we start from the duality theorem that allows one to solve maximization prob- lems in the conjugate space, and we will use it to prove the general Bender's Decomposi- tion method, and a slight modication. Finally, we demonstrate that the method works well conclude that the method works for the ethanol deconvolution problem. Theorem 3 (Duality Theorem). Let f be a real-valued convex functional dened on a convex subset of a vector space X, and let G be a convex mapping of X into a normed space Z. Suppose there exist x 1 such that G(x 1 ) < and that 0 = infff(x) : G(x) ;x2 g is nite. Then inf G(x);x2 f(x) = max z '(z ) and the maximum on the right is achieved by some z 0 : If the inmum on the left is achieved by some x 0 2 , then hG(x 0 );z 0 i = 0 and x 0 minimizes f(x) +hG(x);z 0 i; x2 . Proof. The actual proof may be found in [12]. Note also that: '(z) = inf x2 ff(x) +hG(x);zig: Theorem 4 (Generalized Bender's Decomposition I). Let A, B be convex sets, G(x;y) - a convex mapping of AB into a normed space Z. Assume also that: 1. f(x;y) is a real-valued convex functional that achieves a minimum value on AB; 2. 
For every xed x2A, f(x;y) is convex and achieves a minimum value; 43 3. There exist (x 1 ;y 1 ) such that G(x 1 ;y 1 )<. If y (x) = arg minflogf(x;y) :y2B;G(x;y)g (4.10) x = arg minfy (x) :x2Ag; then (x ;y (x )) = arg minff(x;y) :x2A;y2B;G(x;y)g: (4.11) Lemma 3. Assume f;G;A;B satisfy Theorem 4 then min x2A 8 < : min y2B G(x;y) f(x;y) 9 = ; = min x2A y2B G(x;y) f(x;y): Proof. To prove the lemma we will use the formulation of a dual optimization problem. Note that by Theorem 3 min y2B G(x;y) f(x;y) = max z ' x (z); where ' x (z) = inf y2B ff(x;y) +hG(x;y);zig: In the same way: min x2A y2B G(x;y) f(x;y) = max z '(z): 44 Note also that if (x ;y ) minimizes the left side of the above inequality and z maxi- mizes the right side, then by Theorem 3, (x ;y ) minimizesff(x;y) +hG(x;y);z ig and hG(x ;y )z i = 0. Now we prove the main statement. For any x2A : min y2B G(x;y) f(x;y) min x2A y2B G(x;y) f(x;y): Hence, min x2A 8 < : min y2B G(x;y) f(x;y) 9 = ; min x2A y2B G(x;y) f(x;y): Backwards, for some y inf y2B ff(x;y) +hG(x;y);zigff(x;y ) +hG(x;y );zig: Hence, for any x2A : max z ' x (z) max z ff(x;y ) +hG(x;y );zig: In particular, for some x : max z ' x (z) max z ff(x ;y ) +hG(x ;y );zig: If (x ;y ) is a solution to the problem inf G(x) x2 f(x), the duality Theorem states that max z ff(x ;y ) +hG(x ;y );zig =f(x ;y ): (4.12) 45 On the other hand, by the duality theorem max z ' x (z) = min y2B;G(x ;y) f(x ;y); and min x2A min y2B;G(x;y) f(x;y) min y2B;G(x ;y) f(x ;y); ; hence, min x2A min y2B;G(x;y) f(x;y) f(x ;y ) = min x2A;y2B;G(x;y) f(x;y): Generalized Bender's Decomposition I. Proof. Note that (x ;y ) is a solution to 4:11 i f(x ;y ) = min x2A;y2B;G(x;y) f(x;y), where (x ;y ) satises all the necessary restrictions. Assume that (^ x; ^ y(^ x)) is a solution to (4:10) thenf(^ x; ^ y(^ x)) = min x2A min y2B;G(x;y) f(x;y) and (^ x; ^ y(^ x))2 AB, G(^ x; ^ y(^ x) . By lemma 3, f(^ x; ^ y(^ x)) = min x2A;y2B;G(x;y) f(x;y) means that (^ x; ^ y(^ x)) is a solution to (4:11). Theorem 5 (Generalized Bender's Decomposition II). Let A, B be convex sets, G(x;y) - a convex mapping of AB into a normed space Z. Assume also that: 1. f(x;y) is a real-valued functional that achieves a minimum value on AB; 2. For every xed x2A, f(x;y) is convex and achieves a minimum value; 3. '(z ) = infff(x;y) +hG(x;y);z i;x2A;y2Bg is nite for some z 4. There exist (x 1 ;y 1 ) such that G(x 1 ;y 1 )<. 46 If y (x) = arg minflogf(x;y) :y2B;G(x;y)g (4.13) x = arg minfy (x) :x2Ag; then (x ;y (x )) = arg minff(x;y) :x2A;y2B;G(x;y)g: (4.14) Proof. By Luenberger [12] p.225 if f(x;y) is not convex, then Theorem 3 can no longer be used, although the following is true. Provided '(z ) is nite for some z , max z '(z ) min x2 ;G(x) f(x): So to prove the above theorem we just change (4:10) to max z ff(x ;y ) +G(x ;y )zgf(x ;y ) = min x2 ;G(x) f(x) in Theorem 4 and the proof follows. Theorem 6 (Bender's Decomposition For the Deconvolution Problem). Let Q, L be compact convex sets in R K 1 and R K 2 , respectively . Assume also that: 1. (q;ljy) is a density function on R K 1 +K 2 ; 2. For every xed q2Q, (q;ljy) is a log concave function w.r.t l; 3. There exist (q 1 ;l 1 )2QB such that C(q 1 )l 1 > 0 47 . For xed q let l (q) = arg maxflog(q;ljy) :l2L; C(q)l 0g (4.15) q = arg maxflog(q;l(q)jy) :q2Qg (4.16) then (q ;l (q )) = arg maxflog(q;ljy) :q2Q;l2L;C(q)l 0g: Proof. 
To apply Theorem 5 we take f(q;l) = log(q;ljy) and take G(q;l) =C(q)l: Note that since Q, L are compact sets, and (q;ljy) is a density function dened on R K 1 +K 2 , it is bounded and therefore achieves its extreme values on QL. Remark 7. The result of the theorem holds without any restrictions on C(q 1 )l 1 : For xed q let l (q) = arg maxflog(q;ljy) :l2Lg q = arg maxflog(q;l(q)jy) :q2Qg; Then (q ;l (q )) = arg maxflog(q;ljy) :q2Q;l2Lg: By (4:9), log(q;ljy) = 1 2 2 jjy TAS C N (q)ljj 2 + log pr (q) + log pr (l) 48 We have to make sure that our priors for parameters (l;q) are log concave to be able to run Bender's decomposition. In the numerical implementation of Bender's decomposition, section 4:5:2 of chapter 4, Q = [a;b] [c;d] is approximated with a suciently ne discrete mesh of values f 1 ;:::; V gf 1 ;:::; W g. Then for every xed pair q ij = ( i ; j ) one can compute l (q ij ) from (4:15), and then implement a direct search to compute (4:16). Approximating Q = [a;b] [c;d] by a nite mesh is equivalent to the discretization over the system parameters' space. Implementation, numerical results and benets of this method are discussed in the already mentioned section 4:5:2 of chapter 4 below. It appears to be very dicult to explore the statistical behavior of the MAP estimator, since it is highly nonlinear in the parameter q. We could invoke bootstrap methods but they here shown to be extremely time consuming for our problem. An alternative to point estimates is condence intervals which, with a certain condence, give us an estimate of the range for the true parameter values. Below we will describe a method to produce condence intervals for the system, output and input parameters. 4.4 Theoretical Approach to the Computation of Interval Estimates The idea for this method came from [1] although those authors use a Reproducing Kernel Hilbert Space(RKHS) approach for the space of input functions and some restric- tions which are not necessary for our purposes. The method we are about to unveil is suitable for more general cases than SSM deconvolution. 49 Consider a more general stochastic model than model (4.4): y =C(q)l +e; where eN(0; ); where C(q)2R MP , C ij (q) = t i Z 0 K(t;;q)u j d: y TAS 2R M + - observed TAS measurements, = diagf 2 1 ;:::; 2 M g. In [1] it is assumed that the spline coecientsfl i g are independent r.v. with l i N(0; i ) and known precision i , andf j ;u j (t)g are eigenvalues and eigenfunctions of the RKHS kernel. We are not bound to these restrictions and will just assume that l N(l 0 ; ) with known mean l 0 and covariance matrix , and that e is independent froml. Note that we intentionally did not include a non-negativity property for the spline coecients l. Next we show how to simulate independent samples (q k ;l k ) with k = 1;:::;B from (q;ljy). Taking samples we can explore the whole posterior distribution for parameters (q;l), and look at the condence intervals as well. For the input and output CI's consider time values 0 s 1 s M 0 T , then compute y k (s i ) = C(q k )l k and u k (s i ) = P P j=1 l j u j (s i ) fork = 1;:::;B andi = 1;:::;M 0 . For each time points i build a (100)% CI by discarding 2 B largest and smallest values, and connectingu max (s i ) fori = 1;:::;M 0 andu min (s i ) fori = 1;:::;M 0 to get the upper and lower bounds respectively for the input function. In the same way a CI for the output can be obtained and compared to actual measured TAS values, y TAS . We start with a series of lemmas and summarize the result with a nal theorem. 
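Before stating the lemmas, note that the only model-specific ingredient above is the matrix C(q): each entry is a quadrature of the kernel against one basis function. A minimal MATLAB sketch of this assembly follows; the kernel handle K, the trapezoidal rule, and the piecewise-constant basis on breakpoints s are illustrative assumptions rather than the code actually used in this work. For the simple skin model the kernel plays the role of the discretized semigroup expression H_N e^{A_N(q)(t-sigma)} B_N(q).

% Sketch: assemble C(q) with C_ij(q) = int_0^{t_i} K(t_i, sigma; q) u_j(sigma) dsigma,
% for piecewise-constant basis functions u_j on the breakpoints s.
% K is a hypothetical handle K(t, sigma, q) that accepts a vector of sigma values.
function C = assemble_C(K, q, t, s)
M = numel(t);  P = numel(s) - 1;
C = zeros(M, P);
for i = 1:M
    for j = 1:P
        a = s(j);  b = min(s(j+1), t(i));   % overlap of supp(u_j) with [0, t_i]
        if b <= a, continue; end            % basis interval lies beyond t_i
        sig = linspace(a, b, 50);           % quadrature nodes on the overlap
        C(i, j) = trapz(sig, K(t(i), sig, q));
    end
end
end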
50 Lemma 4. Let y =C(q)l +e, lN(l 0 ; ) , eN(0; ). Then the following statements are true assuming l and e are independent: yjqN(E[yjq];Cov[yjq]) E[yjq] =C(q)l 0 Cov[yjq] =C(q)C(q) T + : Proof. Since y = C(q)l +e and l N(l 0 ; ) then y = C(q)l 0 +C(q)w +e, for some wN(0; ); hence, E[yjq] =C(q)l 0 Cov[yjq] =C(q)C(q) T + : Lemma 5. Let Z = 2 6 6 4 l q 3 7 7 5 , where l and y are from Lemma 4. Then E[Zjq] = 2 6 6 4 l 0 C(q)l 0 3 7 7 5 Cov[Zjq] = 2 6 6 4 C(q) T C(q) C(q)C(q) T + 3 7 7 5 : 51 Proof. The rst statement and E[Zjq] = 2 6 6 4 E[ljq] E[yjq] 3 7 7 5 follow from Lemma 4. Now as for Cov[Zjq] = 2 6 6 4 Cov[ljq] Cov[l;yjq] Cov[y;ljq] Cov[yjq] 3 7 7 5 ; since we have Cov[ljq] = and Cov[yjq] =C(q)C(q) T + from Lemma 4, and Cov[y;l] =E[(yC(q)l 0 )(ll 0 ) T jq] =E[(C(q)l 0 +C(q)w +eC(q)l 0 )(l 0 +wl 0 ) T jq] =E[C(q)ww T +ew T jq] =C(q); the second statement follows. Lemma 6. For y, l, and q from Lemma 4-5 ljy;qN(E[ljy;q];Cov[ljy;q]); 52 where E[ljy;q] =l 0 + C(q) T (C(q)C(q) T + ) 1 (yC(q)l 0 ) (4.17) Cov[ljy;q] = C(q) T (C(q)C(q) T + ) 1 C(q): (4.18) Proof. Consider a moment generating function of W =Zjq whereZ comes from Lemma 5. It is easy to show that W is in fact a multivariate normal r.v. with mean E[W ] = 2 6 6 4 l 0 C(q)l 0 3 7 7 5 and covariance matrix Cov[W ] = 2 6 6 4 C(q) T C(q) C(q)C(q) T + 3 7 7 5 : From [10], if 2 6 6 4 X Y 3 7 7 5 has a multivariate normal distribution with mean 2 6 6 4 X Y 3 7 7 5 and covariance matrix 2 6 6 4 XX XY YX YY 3 7 7 5 , then XjY N(m;S) where m =E[XjY ] = X + XY ( YY ) 1 (y Y ) S =Cov[XjY ] = XX XY ( YY ) 1 ( XY ) T : 53 Applying this statement to W , and using formulas for the joint mean and covariance matrix, we obtain (4.17) and (4.18). Theorem 8 (Sampling Theorem). Let y =C(q)l +e, lN(l 0 ; ), eN(0; ), l and e are independent, C(q)2R MP . If q k (qjy) and l k N(m k ;S k ), where m k =l 0 + C(q k ) T (C(q k )C(q k ) T + ) 1 (yC(q k )l 0 ) S k = C(q k ) T (C(q k )C(q k ) T + ) 1 C(q k ); then (q k ;l k )p(q;ljy). Proof. Lemmas 4, 5 and 6 prove thatl k jq k ;yN(m k ;S k ) for= statedm k andS k . Since (q;ljy) =(ljy;q)(qjy) the statement of the theorem follows. The problem of sampling (q k ;l k ) p(q;ljy) is reduced to the question of how to sample q(qjy). Note that (qjy) =(yjq)(q), and (yjq;l) = noise (yC(q)l) from (4.6). Since we assumed that l N(l 0 ; ), data log likelihood (yjq;l) = (yjq) and yjqN(yC(q)l 0 ; ) that does not depend on l any more. The next chapter will describe the actual algorithm of constructing condence inter- vals and provide some of our numerical results as well. 4.5 Results and Discussion The practical implementation of any theoretical method is always a very challenging task. Solving an inverse problem is even more challenging since aside from numerical concerns stability and identiability problems may become an issue that has to be dealt with. MAP 54 and interval estimations turn to be very sensitive to the choice of prior parameters, as well as to the number of approximating splines and noise levels for the TAS data. Tuning prior parameters can be done using examples where the BAC input is known and then use the same prior parameter values to analyze new data sets. The subject of optimizing the number of approximating splines is very challenging and is left for the future. We, rst, consider implementation for MAP estimates via Bender's decomposition and compare some of them to the interval estimators later in the section. 
Also we describe challenges that were encountered when building the condence intervals (in particular, sampling from(qjy)), and compare plots of CI estimators for dierent parameter values. 4.5.1 Data Generation for the Deconvolution Problem We started by generating two hospital data sets using the full body model, similar to the scheme described in section 3:1 of chapter 3. For this we perturbed Patient 103's FBM parameters and obtained BAC and TAS data, then proceeded in a slightly dierent way. Namely, we obtained SSM parameters for both patients by non linear least squares minimization of: ^ l = arg minfjjy TAS f 64 (t;q;u(t))jj 2 g; where f 64 is a numerical solution to the SSM diusion equation discretized in time and space discussed in section 2:1 of chapter 2. Obtained parameters are SSM coecients:( P 1 ; P 1 ) = [0:1976; 1:3245] for Patient 1 and ( P 2 ; P 2 ) = [0:1813; 2:0577] for Patient 2. Further, we generated TAS data using obtained above SSM values and corresponding BAC's at the skin discretization 64. The 55 nal step was to add Gaussian noise with precisions .5, 1, and 4. The generated data sets are plotted in Figure 10 and Figure 11. 0 2 4 6 8 10 −10 0 10 20 30 40 50 60 70 80 BAC TAS by FBM TAS by SSM, τ noise =1 0 2 4 6 8 10 −10 0 10 20 30 40 50 60 70 80 90 BAC TAS by FBM TAS by SSM, τ noise =1 Figure 10: (a) Patient 1, (; ) = [0:1976; 1:3245], (b) Patient 2, (; ) = [0:1813; 2:0577] 0 2 4 6 8 10 −10 0 10 20 30 40 50 60 70 80 90 Patient1 − BAC Patient2 − BAC Patient1 − TAS Patient2 − TAS Figure 11: Comparative plot for generated data sets. In the population analysis study, chapter 3, the GTS method produced (; ) = [:186; 1:99] with the estimated precisions = 1000 and = 2, so that Patient 1's SSM parameters deviate more from the central values than Patient 2's parameters. 56 4.5.2 Implementation of Bender's Search Method. Discussion. The problem of obtaining a maximum a posteriori estimate of the input and SSM pa- rameters is solved by maximizing the data log likelihood function with respect to system parameters and input spline coecients by the algorithm suggested in Theorem 6. As was noted earlier the choice of prior distributions specied in section 4:2 of chapter 4 allows us to rewrite the formula for the MAP estimation in the following way: (q MAP ;l MAP ) = arg max noise 2 jjy TAS C N (q)ljj 2 + log pr (q) + log pr (l) :q2Q;l2B ; (4.19) noise = 1 2 is a noise precision, pr (q) = pr () pr ( ) or pr (q) =N + ( GTS ; )N( GTS ; ), pr (l) =N( b l; ( l L 0 L) 1 ), b l is a prior mean for the spline coecients computed as a least squares estimate by tting observed data to the SSM model assuming the skin parameters obtained on a population analysis stage. Next, we will proceed as follows: 1. Describe, in a detail, computation of the prior mean, b l . 2. Specify all parameters for prior distributions: pr (), pr ( ), pr (l): 3. Specify a grid for the system parameters. 57 4. Proceed with Bender's Decomposition: First, optimize with respect to spline coef- cients and then with respect to the system parameters. The GTS estimate for system parameters is (; ) GTS = [:186; 1:99]. We will compute b l as a nonnegative least squares estimate of (4:8): F (l) =jjy TAS C(q GTS )ljj 2 + 2 jjMljj; (4.20) where M = 0 B B B B B B B B B B B B @ 1 1 0 ::: ::: 0 0 1 1 ::: ::: 0 ::: ::: ::: ::: ::: ::: 0 ::: ::: ::: 1 1 0 ::: ::: ::: 0 1 1 C C C C C C C C C C C C A : The addend 2 jjMljj is a Tikhonov regularization term which is a basis of one of the classical inverse problem solving methods [10]. 
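A minimal MATLAB sketch of this computation, writing the penalty in the usual squared Tikhonov form delta^2 ||Ml||^2 and taking M to be the first-difference matrix that penalizes the jumps l_i - l_{i+1} (with the last coefficient entering alone, as in the structure displayed above), is given below; the function name and argument layout are illustrative assumptions.

% Sketch of (4.20): nonnegative least squares with a first-difference
% Tikhonov penalty.  C is C(q_GTS), yTAS is a column vector of observations,
% delta2 is the regularization weight chosen experimentally below.
function l_hat = prior_mean_spline(C, yTAS, delta2)
P = size(C, 2);
M = eye(P) - diag(ones(P-1, 1), 1);      % rows l_i - l_{i+1}; last row keeps l_P alone
A = [C; sqrt(delta2) * M];               % stack so that ||A*l - b||^2 equals
b = [yTAS; zeros(P, 1)];                 % ||C*l - yTAS||^2 + delta2*||M*l||^2
l_hat = lsqnonneg(A, b);                 % nonnegative least squares solution
end

For example, the prior mean l̂(.1) used below corresponds to prior_mean_spline(C_gts, yTAS, 0.1), with C_gts denoting the precomputed C(q_GTS).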
Approximating the smooth signal with a linear combination of piecewise constants leads to a Gibbs phenomenon which results in large oscillations of estimated spline coecients around the input. To avoid this we penalizel for having too many jumps. The choice of the regularization term 2 is a central issue in the literature discussing Tikhonov regularization. Making it too large will result in under tting, and making it too small won't solve the oscillation problem. We will pick 2 experimentally. In Figures(12), (13), and (14) LS estimates ofl based on GTS skin model parameters are compared to the real input for dierent patients and spline coecients. We noticed 58 earlier that the noise precision does not substantially change LS estimation, and included results only for data sets generated with the noise precision 1. Notice that for Patient 1 the optimal values of the regularization parameter are 2 = 1 and 2 = :5 for any number of splines. For values 2 = :1 and 2 = :05 the estimated mean for the spline coecients overestimates the input. The situation is dierent for Patient 2: the optimal regularization values seem to be 2 = :1 and 2 = :05, and for the rest of the possible's the input is underestimated. From a Bayesian perspective the distribution forl is our best guess about the input, and will be corrected after computing posterior distribution. We set 2 = 0:1; and will investigate the signicance of this choice later. Now we specify all parameters for prior distributions. Specically, we need to choose values for , , l . It would be logical to choose precisions for SSM parameters based on population analysis. As was noted earlier, they were computed to be = 1000, = 2. We would like to allow more freedom for parameter , choosing = 10, and less for parameter , setting = 10. This choice is arbitrary and could be changed as desired. Thus, pr () =N + (:186; 10) and pr () =N(1:99; 10): The last parameter to discuss is the covariance for the spline coecients, ( l L 0 L) 1 , in pr (l) =N( b l; ( l L 0 L) 1 ): As discussed in section 4:2 of chapter 4 suggesting such covariance results in estimates of the input l for which the distance from the prior mean (l b l) changes from one 59 component to another according to l . The larger l is, the less change is allowed. To better understand the motivation for introducing the covariance ( l L 0 L) 1 , we will rewrite the MAP formula (4:19) taking into account the prior densities dened above, (q MAP ;l MAP ) = arg min noise 2 jjy TAS C N (; )ljj 2 + 10 2 jj:186jj 2 + 10 2 jj1:99 + jj 2 + l 2 jjL( b l( 2 )l)jj 2 (; )2Q;l2B : (4.21) By examining (4:21) one may notice that the term l 2 jjL( b l( 2 )l)jj 2 is equivalent to the Tikhonov regularization term in the minimization problem given in (4:20). Thus, depending on the parameter l the spline covariance is responsible for the well-posedness of the problem. Dening the spline precision brings us back to the challenge of coming up with a value for the Tikhonov regularization term. One possibility would be to run the minimization algorithm several times for dierent values of l , choosing the ones for which the solution is dierent from the prior mean but at the same time that would provide us with adequate results. Such values are, for example, l =:01, and l =:005, which do not result in the best MAP estimates for the input, but they could be used for other estimations. Some authors [3] have also suggested treating l as a parameter and assigning a distribution to it. 
We tried the approach suggested by [3] assigning a uniform distribution for the precision l , but did not get an overall better t for the spline coecients l. 60 Finally we will dene the boundaries of the domain of the skin and input parameters in (4:21), namely, B and Q. Since the lethal concentration of the alcohol in the blood is well below 400mg/dl, let B = [50; 400] P : We allow spline coecients to be negative but expect this behavior only when the esti- mating signal is close to 0. This condition is essential for simulating from the posterior distribution in section 4:5:4 of chapter 4, but not for MAP estimation. We hope to com- pare MAP estimates with interval estimates and would like these two methods to have the same prior setting. To determine the boundaries of Q, we plotted prior densities for the parameters and , and marked regions where these functions take values signicantly greater than 0. Additionally, for the parameter we are interested in positive values only, as specied in a prior. Since we can not choose 0 as a lower bound for , we will choose a value that is suciently small, Q = [10 4 ; 1:2] [:5; 3]: To tackle the primary or master problem in the Bender's Decomposition (optimization w.r.t. system parameters), we will apply a direct search method. For this, let us discretize the domains of the SSM parameters :f 1 ;:::; V g; and :f 1 ;:::; W g within Q: Let, f 1 ;:::; V g =f10 4 ;:0011;:0021;:::; 2;:21;:22;:::; 1:2g; f 1 ;:::; W g =f:5;:051;:052;::::; 2:99; 3g; 61 and consider b Q =f 1 ;:::; V gf 1 ;:::; W g: We plotted the joint prior density for parameters (; ) in Figure 15, and at this point state the algorithm for the decomposition. Algorithm9. Bender's Decomposition Method for computing MAP estimate of the BAC input 1. Choose number of splines P . Precompute C(; )2R MP on b Q from (4:3). Consider TAS data with known noise precision noise . Compute b l( 2 ) for 2 =:1. Choose values for l ; ; . 2. For every q = (; )2 b Q nd l (q) = arg min n noise 2 jjy TAS C N (; )ljj 2 + 10 2 jj:186jj 2 + 10 2 jj1:99 jj 2 + l 2 jjL( b l( 2 )l)jj 2 ,l2B o (4.22) using least squares minimization and compute F (q) = noise 2 jjy TAS C N (q)l (q)jj 2 + 10 2 jj:186jj 2 + 10 2 jj1:99 jj 2 + l 2 jjL( b l( 2 )l (q))jj 2 : (4.23) 62 3. Use a direct search method to nd q = arg minfF (q);q2 b Qg. Note: F (q) = log((q;l (q)jy)) or (q;l (q)jy) =e F (q) . Bender's Decomposition was tested for Patient 1 and Patient 2 TAS data at the noise precisions 0.5, 1, and 4. Note that the precision satises = 1 2 , so that the noise variance is smaller when the precision is larger. We tried to approximate the input with 5, 10, and 20 equally spaced spline functions, using spline precisions l = :01 and l = :005. Overall, there have been 6 runs for each combination of splines for each patient given the following sets of parameters: 1. noise =:5, l =:01 and noise =:5, l =:005, 2. noise = 1, l =:01 and noise =:5, l =:005, 3. noise = 4, l =:01 and noise = 4, l =:005. We also tested the Bender's Decomposition method for the case when the non negativity assumption for the spline coecients are present, that is when: pr (l) =N + ( b l; ( l L 0 L) 1 ): (4.24) In this case the domain for the spline coecients B changes to B + = [0; 400] P , and the computation of (4:22) in step 2 of the Bender's Decomposition will become non negative least squares minimization. This added one more run of Bender's Decomposition for each set of prior parameters. 
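Before examining the shape of F(q), here is a minimal MATLAB sketch of the search in Algorithm 9. The helper name bender_map, the cell-array layout of precomputed C(q) matrices, and the struct of precisions are illustrative assumptions; the box constraint on l is ignored, and replacing the backslash solve by lsqnonneg(A, b) gives the non-negative variant just described.

% Sketch of Algorithm 9: inner least squares in l at every grid point of Qhat,
% outer direct search over F(q).  Cgrid{g} is the precomputed C(q) for row g of
% Qhat, L is the difference operator from the spline prior, l_hat is from (4.20).
function [qMAP, lMAP, Fbest] = bender_map(yTAS, Cgrid, Qhat, L, l_hat, tau)
% tau is a struct with fields noise, a, g, l (the precisions appearing in (4.21))
Fbest = Inf;
for g = 1:size(Qhat, 1)
    C = Cgrid{g};
    A = [sqrt(tau.noise) * C; sqrt(tau.l) * L];           % weighted stacked LS, step 2
    b = [sqrt(tau.noise) * yTAS; sqrt(tau.l) * L * l_hat];
    l = A \ b;                                            % use lsqnonneg(A, b) for B+
    F = tau.noise/2 * norm(yTAS - C*l)^2 ...
        + tau.a/2 * (0.186 - Qhat(g, 1))^2 ...
        + tau.g/2 * (1.99  - Qhat(g, 2))^2 ...
        + tau.l/2 * norm(L * (l_hat - l))^2;              % F(q) as in (4.23)
    if F < Fbest, Fbest = F; qMAP = Qhat(g, :); lMAP = l; end
end
end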
To make this discussion more clear, let us further dene F (q) computed over B + , for the case when the prior of the spline coecients is non negative, to be F + (q). 63 We plotted the function F (q) next to F + (q)in Figures 16, 17, and 18 for Patient 1 TAS data observed with the noise precision noise = 1. The overall shape of the function F (q) does not change when setting noise , or spline precision, l , so we will not investigate examples where the latter prior parameters have been changed. For each plot the global minimum is marked with a data tip. The global minimum of F (q) in eect is a result of the Bender's Search method for the SSM parameters. Figures 16 and 18 contain large at valleys where the minimum value is found. This suggests that the problem may have several solutions that minimize F (q), so deconvolution problem admits several solutions as well, at least, with respect to the SSM parameters. The shape for F + (q) in this case also contains at regions and does not present a better situation in terms of nding a unique pair of SSM parameters. Notice that in Figure 17 the shape of F (q) is very dierent. The global minimum of the function F (q) is more \expressed". Hence, the deconvolution problem is better posed, since there is no set of parametersq that could be substituted for another set of completely dierent values for the parameters that would yield a similar result. By varying the priors, in particular, by increasing , , we can change the shape of F (q) in Figures 16 and 18, since in this way we add more weight to the terms 2 jj:186jj 2 and 2 jj1:99 jj 2 in the least squares minimization of F (q), so that the minimum will be located close to q GTS = (:186; 1:99). Another conclusion that can be drawn from Figures 16, 17 and 18 is that we can expect to have better results for the deconvolution problem using 10 approximating splines rather than using 5 and 20, since, at least for the SSM coecients, the problem of nding the MAP estimate is better posed when using 64 10 spline functions. Computation of F (q) based on Patient 2's TAS observations lead to the similar shaped plots. Consequently, they are not included. We next look at the MAP estimates for Patient 1 data in Figure 19. We used the data with noise = 1 and l =:01. The plots (a), (b), and (c) contain the true input compared to the estimated inputs; one - with a non negativity assumption for the spline coecients and another one - without. The fourth plot, (d), is included as a proof of concept to show how close the outputs produced using estimated inputs in combination with SSM parameters are to the observed data and to each other. The runs displayed in plots (a), (b), and (c) all produced a good estimate of the input functions. At this point the MAP estimation computed with the non negativity condition for the spline coecients does not produce a better estimate of the input function, but it did require a much longer time to run. The estimation using 20 splines in plot (c) seem to overestimate the input just a little bit. Refer to the Figure 20 to compare the estimates obtained for the Patient 2 TAS observations with noise = 1 and l = :01. As with Patient 1, plots (a), (b), and (c) contain the true input compared to the estimated inputs, and plot (d) compares outputs produced using estimated blood inputs in combination with SSM parameters and the observed data. In plots (a) and (b) MAP estimates t very well to the true input. In plot (d), both MAP estimates signicantly underestimate the true BAC. 
This means that for chosen set of prior parameters the problem is ill-posed when approximating the input with 20 splines. To obtain a better t, one must try to change the value of the prior parameters. Notice that even when the MAP estimate in plot (d) is very dierent from the true blood input, it nevertheless produces an output the ts the data essentially perfectly. 65 Tables 5 and 6 contain MAP estimates of the SSM parameters without prior assump- tions on non negativity of the spline coecients. The goal of the deconvolution problem stated in section 4:2 of chapter 4 is not to obtain exact SSM parameters for a new pa- tient, but to tune them. The population parameters are q GTS = [:186; 1:99] and Patient 1's SSM parameters areq P 1 = [0:1976; 1:3245]. We can not state that the MAP estimates for the SSM parameters is a better approximation than the GTS values for Patient 1. The same goes for Patient 2 with q P 2 = [0:1813; 2:0577]. When approximating with 20 splines, the parameter is consistently overestimated for both patients, which implies that the prior precision for the coecient might be increased to allow for better results. To investigate this notion, we ran Bender's Decomposition, setting , to be equal to 1000 and 2 respectively, the values suggested by the GTS method. The results are summarized in table 7. We notice a signicant improvement for the estimation of the SSM coecients using 20 splines for both patients, but, overall, it is hard to state whether changing the precisions improved the estimation of the SSM coecients. In particular, when approximating the blood input using 10 splines based on observations with the noise precision noise = 4, the estimation of the SSM coecients gets worse, possibly because was changed to the smaller value. Also in Figure 21 we compare the input for the second patient estimated using = 1000 and = 2 with the previous estimation using = 10, = 10 and illustrated in Figure 20. Changing precisions for the SSM coecients in most of the cases improved their estimates, but did not improve the input approximation. We next investigate dependence of the Bender's Search method results on noise , l , and ^ l(q) in Figures 22, 24, and 23. We are interested in the MAP approximations using 66 20 splines, since there seems to be a room for improvement in estimating the input. The SSM precisions are set once again to = 10, = 10. The Figure 22 illustrates dependence of input estimates on the noise level. For Patient 1 observations with known noise = 4, the input estimation deteriorated, and at the same time for Patient 2 there is no enhancement in the results when using the data with higher precisions. Also, in Tables 5 and 6 the estimation of SSM coecients seem to be worse when estimating data with the smallest known variance. This could be explained as follows. When the noise precision is high, we are forced to t the output of the SSM model to the observed data in the LS minimization step of Bender's Decomposition. Since the deconvolution problem is ill-conditioned the algorithm nds a set of (q;l) that provide a good t for the data but not for estimated parameters . This idea will be discussed further in section 4:5:4 of chapter 4. In Figure 24 we tried to compare input estimates for several increasing values of l . For Patient 2 the estimation has dramatically improved when l = :01 was increased to l = :1. For both patient's the estimates stopped changing at a certain point while l continued to grow. 
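In code, such a sweep amounts to re-running the hypothetical bender_map sketch from section 4.5.2 with only the spline precision changed; the values below are the ones compared in Figure 24, tau_noise is set to the precision of the data set at hand, and all other workspace variables are as in that sketch.

% Sensitivity of the MAP estimate to the spline precision tau_l
% (all other prior parameters held fixed).
tau = struct('noise', tau_noise, 'a', 10, 'g', 10, 'l', 0.01);
vals = [0.01 0.1 1 5];
lMAPs = cell(1, numel(vals));
for k = 1:numel(vals)
    tau.l = vals(k);
    [qtmp, lMAPs{k}] = bender_map(yTAS, Cgrid, Qhat, L, l_hat, tau);  % store for plotting
end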
Finally, we investigate the dependence of the prior mean for the spline coecients, ^ l on the nal estimation in Figure 23. Plot (a) on the left compares the MAP estimate, computed for the Patient 1 data given noise = 1, l = :01, ^ l(:1), and a prior mean, ^ l(:1). On the right the MAP estimate is computed for the same set of prior parameters except for the prior mean, which is set to ^ l(:5) and according to the Figure 14 seems to approximate the true input a little better than ^ l(:1). The same consideration was used in choosing ^ l =:05 as a prior mean for the Patient 2 MAP estimate on the right in plot 67 (b). While the MAP estimate denitely depends on it's prior mean and can improve the estimation, we can not conclude that setting the prior mean at the true value alone would necessarily produce a good estimation result as seen from plot(b). To summarize, we have observed that to obtain better MAP estimates, increase the prior precision, , for the SSM parameter to GTS = 1000, and increase the spline regularization precision l to 1. These conclusions were made in a presence of the true input. However, the goal is to be able to obtain estimations for blood alcohol concentra- tion without any prior knowledge. The question is, how ever can one be sure about the credibility of obtained results? To examine the error of the MAP estimator, a bootstrap method can be applied to build condence intervals for estimated values. The other possibility is to estimate the input with an interval estimate rather than with a point estimate. One way to obtain such estimates is discussed in detail in the next section. 68 4.5.3 Results for the MAP estimation Compare estimates of the blood input, equation (4:20), assuming that SSM coecients come from the population analysis. 0 2 4 6 8 10 0 10 20 30 40 50 60 70 80 90 delta 2 = 1 delta 2 = .5 delta 2 = .1 delta 2 = .05 Real Input (a) 0 2 4 6 8 10 0 10 20 30 40 50 60 70 80 90 delta 2 = 1 delta 2 = .5 delta 2 = .1 delta 2 = .05 Real Input (b) Figure 12: Comparative plot for [ l( 2 ) estimates, P = 5. (a) Patient 1, noise = 1 (b) Patient 2, noise = 1 0 2 4 6 8 10 0 10 20 30 40 50 60 70 80 90 100 δ 2 = 1 δ 2 = .5 δ 2 = .1 δ 2 = .05 Real Input (a) 0 2 4 6 8 10 0 10 20 30 40 50 60 70 80 90 δ 2 = 1 δ 2 = .5 δ 2 = .1 δ 2 = .05 Real Input (b) Figure 13: Comparative plot for [ l( 2 ) estimates, P = 10. (a) Patient 1, noise = 1 (b) Patient 2, noise = 1 69 0 2 4 6 8 10 0 20 40 60 80 100 δ 2 = 1 δ 2 = .5 δ 2 = .1 δ 2 = .05 Real Input (a) 0 2 4 6 8 10 0 10 20 30 40 50 60 70 80 90 δ 2 = 1 δ 2 = .5 δ 2 = .1 δ 2 = .05 Real Input (b) Figure 14: Comparative plot for [ l( 2 ) estimates, P = 20. (a) Patient 1, noise = 1 (b) Patient 2, noise = 1 Figure 15: Joint prior distribution function for (; ) dened on b Q. 70 Compare plots of F(q), (4:23), discussed in section 4:5:2 of chapter 4 (a) (b) Figure 16: Plot of F (q) computed for Patient 1 TAS data, noise = 1, P=5. (b) Plot of F + (q) computed for Patient 1 TAS data, noise = 1, P=5. (a) (b) Figure 17: Plot of F (q) computed for Patient 1 TAS data, noise = 1, P=10. (b) Plot of F + (q) computed for Patient 1 TAS data, noise = 1, P=10. 71 (a) (b) Figure 18: Plot of F (q) computed for Patient 1 TAS data, noise = 1, P=20. (b) Plot of F + (q) computed for Patient 1 TAS data, noise = 1, P=20. Compare MAP estimates of SSM coecients for Patient 1 given l =:01, = = 10. 
Number of splines    τ_noise = 4       τ_noise = 1       τ_noise = 0.5
P = 5                (1.371, 1.35)     (0.1431, 1.46)    (0.1491, 1.57)
P = 10               (0.126, 1.29)     (0.157, 1.72)     (0.167, 1.81)
P = 20               (0.48, 1.7)       (0.177, 1.83)     (0.178, 1.88)

Table 5: Patient 1, q_P1 = [0.1976, 1.3245]. MAP estimates of (α, γ).

Compare MAP estimates of SSM coefficients for Patient 2 given τ_l = .01, τ_α = τ_γ = 10.

Number of splines    τ_noise = 4       τ_noise = 1       τ_noise = 0.5
P = 5                (0.1061, 0.95)    (0.1211, 1.37)    (0.1257, 1.51)
P = 10               (0.16, 1.86)      (0.16, 1.79)      (0.117, 1.53)
P = 20               (0.35, 1.83)      (0.34, 1.85)      (0.41, 1.84)

Table 6: Patient 2, q_P2 = [0.1813, 2.0577]. MAP estimates of (α, γ).

Compare MAP estimates of SSM coefficients for both patients given τ_α = 1000, τ_γ = 2, τ_l = .01.

Splines / Patient       τ_noise = 4      τ_noise = 1      τ_noise = 0.5
P = 10, Patient 1       (0.126, 0.69)    (0.16, 1.47)     (0.17, 1.61)
P = 10, Patient 2       (0.12, 1.16)     (0.174, 1.53)    (0.176, 1.79)
P = 20, Patient 1       (0.23, 1.04)     (0.176, 1.79)    (0.18, 1.7)
P = 20, Patient 2       (0.22, 1.32)     (0.198, 1.57)    (0.189, 1.69)

Table 7: MAP estimates of (α, γ); q_P1 = [0.1976, 1.3245], q_P2 = [0.1813, 2.0577].

Compare MAP estimates for Patient 1 blood alcohol concentration assuming τ_noise = 1, τ_l = .01, and the prior mean l̂(.1).

Figure 19: (a) MAP estimate, 5 splines. (b) MAP estimate, 10 splines. (c) MAP estimate, 20 splines. (d) Comparison of outputs produced by the obtained MAP estimates for different numbers of splines.

Compare MAP estimates for Patient 2 blood alcohol concentration assuming τ_noise = 1, τ_l = .01, τ_α = τ_γ = 10, and the prior mean l̂(.1).

Figure 20: (a) MAP estimate, 5 splines. (b) MAP estimate, 10 splines. (c) MAP estimate, 20 splines. (d) Comparison of outputs produced by the obtained MAP estimates for different numbers of splines.

Figure 21: MAP estimation for Patient 1 with 20 splines, τ_noise = 1, τ_α = 1000, τ_γ = 2, τ_l = .01.

Compare MAP estimates of the BAC signal for different noise levels for both patients based on 20 approximating splines given τ_l = .01, τ_α = τ_γ = 10.

Figure 22: (a) MAPs of BAC computed for Patient 1. (b) MAPs of BAC computed for Patient 2.

Compare MAP estimates of the BAC signal computed for different values of l̂ for both patients based on 20 approximating splines given τ_l = .01, τ_α = τ_γ = 10.
0 2 4 6 8 10 −20 0 20 40 60 80 100 0 2 4 6 8 10 −10 0 10 20 30 40 50 60 70 80 90 MAP, Prior mean based on l(δ 2 )=(.1) Prior mean, l(δ 2 )=(.1) True Input BS, Prior mean based on l(δ 2 )=(.5) Prior mean, l(δ 2 )=(.5) True Input (a) 0 2 4 6 8 10 0 10 20 30 40 50 60 70 80 90 0 2 4 6 8 10 0 10 20 30 40 50 60 70 80 90 MAP, Prior mean based on l(δ 2 )=(.05) Prior mean, l(δ 2 )=(.05) True Input MAP, Prior mean based on l(δ 2 )=(.1) Prior mean, l(δ 2 )=(.1) True Input (b) Figure 23: (a) MAP for Patient 1 computed assuming ^ l =l(:1) and ^ l =l(:5). (b) MAP for Patient 2 computed assuming ^ l =l(:1) and ^ l =l(:05) 77 Investigate dependence of MAP estimates on l for both patients based on 20 approximating splines given noise = 1, = = 10. 0 2 4 6 8 10 −20 0 20 40 60 80 100 MAP, τ l =.01 MAP, τ l =.1 MAP, τ l =1 MAP, τ l =5 BAC 0 2 4 6 8 10 0 10 20 30 40 50 60 70 80 90 MAP, τ l =.01 MAP, τ l =.1 MAP, τ l =1 MAP, τ l =5 BAC Figure 24: MAPs of BAC computed for Patient 2 TAS data with noise = 4 4.5.4 Computation of the Condence Intervals for the Blood Input. Discussion. Our rst approach to building condence intervals for the blood input was to use Monte Carlo Markov Chain (MCMC) software, WinBUGs (Windows Bayesian inference Using Gibbs Sampling), popular among statisticians, that can be used to obtain samples from the posterior distribution . The WinBUGs project is essentially an interactive software for Bayesian analysis of statistical models using MCMC methods. It began in 1989 at the MRC Biostatistics Unit, branched out to several similar versions and has been carried forward by the Imperial College School of Medicine at St Mary's, London, and the University of Helsinki, Finland. Running this software for the deconvolution problem unfortunately did not lead to any useful results, as revealed in the Appendix. 78 Next, we decided to proceed with a direct sampling method from the posterior dis- tribution. This required an assumption that a-priori spline coecients l are normally distributed and may take negative values. We begin with an algorithm for building (1)100% CI for the input function, where the simulation is suggested by Theorem 6. But, rst, it would be useful to recall all the assumptions. In particular, the stochastic formula for the TAS observations is y =C(q)l +e; lN(l 0 ; ); eN(0; ) where l and e are independent, (q;l) are parameters of interest C(q)2 R MP and is computed form 4:3. The prior densities for parameters (q;l) are duplicated from section 4:5:2 of chapter 4: pr (l) =N( b l; ( l L 0 L) 1 ) () =N( GTS ; ) ( ) =N( GTS ; ): To build a (1)100% CI for the input function proceed as follows: Algorithm 10. on the kth step.. 1. Simulate q k (qjy). 79 2. Simulate l k N(m k ;S k ), where m k = b l + C(q k ) T (C(q k )C(q k ) T + ) 1 (yC(q k ) b l) S k = C(q k ) T (C(q k )C(q k ) T + ) 1 C(q k ) = noise I MM = ( l LL 0 ) 1 . 3. Repeat steps 1-2 until K =B. 4. For every j = 1;:::;P putfl k j g B k=1 , in the ascending order. 5. Throw away 100=2 of the rst and the last elements, thenLB BAC = (l min 1 ;:::;l min 1 ); and UB BAC = (l max 1 ;:::;l max 1 ): We also can compute CI's for the output and compare it to the TAS data, just to make sure that the estimated band is consistent with the observed TAS data. For this, for every pair (q k ;l k ); computey k =C(q k )l k , and repeat steps 4-5 for every i = 1;:::;M andfy k i g B k=1 , where y k i = y k (t i ) to obtain LB TAS = (y min 1 ;:::;y min 1 ); and UB TAS = (y max 1 ;:::;y max 1 ). 
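As a concrete illustration of steps 2-5, the MATLAB sketch below draws the conditional spline samples and extracts the percentile band. It assumes the step-1 samples are already available as rows of qsamp (how to obtain them is discussed next), that Sigma = (τ_l L'L)^{-1} and Gamma = τ_noise^{-1} I are the spline prior and noise covariances, and it reuses the hypothetical assemble_C helper from the earlier sketch (a lookup into precomputed C matrices works equally well). The output band is obtained in the same way from y_k = C(q_k) l_k.

% Sketch of Algorithm 10, steps 2-5: conditional Gaussian draws of l given each
% sampled q, followed by a 95% percentile band for the spline coefficients.
B = size(qsamp, 1);  P = numel(l_hat);
Lsamp = zeros(B, P);
for k = 1:B
    C = assemble_C(K, qsamp(k, :), t, s);
    G = C * Sigma * C' + Gamma;                           % Cov[y | q]
    m = l_hat + Sigma * C' * (G \ (yTAS - C * l_hat));    % posterior mean, (4.17)
    S = Sigma - Sigma * C' * (G \ (C * Sigma));           % posterior covariance, (4.18)
    S = (S + S') / 2 + 1e-10 * eye(P);                    % symmetrize/stabilize for chol
    Lsamp(k, :) = (m + chol(S, 'lower') * randn(P, 1))';  % draw l_k ~ N(m, S)
end
Ls  = sort(Lsamp, 1);                                     % order each coefficient's samples
cut = ceil(0.025 * B);                                    % discard alpha/2 from each tail
LB_BAC = Ls(cut + 1, :);                                  % lower envelope of the 95% band
UB_BAC = Ls(B - cut, :);                                  % upper envelope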
To accomplish the rst step we notice that since (qjy) =(yjq)(q); it follows that (qjy) = ; (yjq)(q); 80 where by Lemma 4 =C(q) b l =C(q)( l LL 0 ) 1 C(q) T + : Before proceeding with the algorithm, notice that when computing the MAP estimates we discretized the domain of the SSM coecients to perform a direct search method, and o-line computed C(q) for all possible values of q. This was absolutely essential for reducing the overall runtime of the MAP algorithm. We would like to use the same approach in computing CI estimates. In particular, we will discretize the prior for the SSM parameters, and then use already computed C(q) matrices in order to proceed with simulations from the posterior distribution. Thus, the domain of q is set to Q = [10 4 ; 1:2] [:5; 3]. Similarly to what was done in the section 4:5:2 of chapter 4, assume that 2 f 1 ;:::; V g =f10 4 ;:0011;:0021;:::; 2;:21;:22;:::; 1:2g 2 f 1 ;:::; W g =f:5;:051;:052;:::; 2:99; 3g: The prior distributionP (q =q ij ) is then computed asP (q =q ij ) =P ( = i )P ( = i ), where P ( = i ) = GTS ; ( i ) GTS ; ( i1 ) (4.25) P ( = j ) = GTS ; ( j ) GTS ; ( j1 ): (4.26) 81 For points at the boundaries of the domain, we have P ( = 1 ) = GTS ; ( 1 ) GTS ; ( 1 =2) P ( = 1 ) = GTS ; ( 1 ) GTS ; ( 1 =2): To compute P (q = q ij jy), we rst nd ; (yjq ij ) for every i = 1;:::;V , j = 1;:::;W , and then multiply it by P (q = q ij ). As a proof of concept, in Figure 25 we show the comparison between histograms of a sample obtained from the continuous version of pr (q) and a sample from the discretized distribution drawn from ^ Q with weights P (q =q ij ). Next, we discuss the results presented in section 4:5:5 of chapter 4. Once again consider Patient 1's and Patient 2's TAS data generated by the skin model using SSM coecients ( P 1 ; P 1 ) = [0:1976; 1:3245] and ( P 2 ; P 2 ) = [0:1813; 2:0577] respectively, together with blood inputs generated by the FBM. See section 4:5:1 of chapter 4 for details. The data is observed at noise precisions noise = 4, noise = 1, and noise =:5. We try to approximate the input with 5, 10, and 20 splines. The prior parameters , , l , b l are chosen to be the same as in the MAP estimation problem so that we can compare its solution to the interval estimates. In this case a MAP estimator possesses the same prior setting and maximizes the data log likelihood. Thus, by denition, MAP values should have high posterior probabilities and, consequently, should belong to the corresponding condence intervals. The prior parameters are set to = = 10, b l = d l(:1), and l =:01 or l =:005 as in the section 4:5:2 of chapter 4. All intervals are at 95% condence level. Consider condence intervals for Patient 1 and Patient 2 data sets, noise = 1, l =:01 computed using 5, 10, and 20 spline functions in Figures 26 and 27, where each set of 82 condence intervals is compared to a corresponding MAP estimator. On the left side we compare input estimations, and on the right - compare outputs produced by the estimates and the SSM model. We notice that, for Patient 1, MAP estimators are located inside the corresponding condence bands, which capture the true input but almost miss it during the rst 2 hours of the experiment. This should not be surprising since we noted in the data generation section that Patient 1's parameters are quite dierent from the population estimates, which may be one of the reasons that we observed estimating eects. For Patient 2 the condence intervals are wider and capture the true input perfectly. 
A special attention deserves the plot where the estimation is done using 20 splines. As discussed in the previous section, the MAP estimator in this case fails to approximate the input as well as the SSM parameters, but the condence interval does. Surprisingly, the MAP estimate does not lie in the condence interval. Unfortunately we can not explain this fact, but rather we can explain the failure in the computation of a true MAP estimator. The plot ofF (q), Figure 18, is very at. It may be possible that accumulation of some numerical errors could produce very similar results for F (q) value. This as was suggested earlier can be corrected by changing prior parameters. Next, we will compare condence intervals in Figure 28 built using data with dierent noise precisions. The top two plots compare CI's for the blood alcohol concentration for Patient 1 and Patient 2. For both patients the intervals computed observing data with the noise precision noise = 4 are narrow and almost miss the downhill slope of the curve. The same eect was noticed in the MAP estimation section: the SSM coecient estimates were worse for higher precisions. As a proof of concept, the plots (b) below compare 95% condence intervals built for TAS observations. We may see that the intervals for lower 83 precision noise = :5 are wider as they allow more variation in the output. The higher noise precision leads to more narrow condence intervals which can lead to unfavorable results. Also we computed a MAP estimate and a CI for the Patient 1 data observed with known noise precision noise = 4 in Figure 32, assuming the prior noise to be smaller, in particular, noise =:5. The estimations of MAP and CI's dramatically improved as seeing in the leftmost plot and the plot in the middle. The plot on the right just compares the output CI's produced by simulated values. Next we compare the dependence of the spline precision on the width of the CI's in Figure 29. The condence intervals were compared for data sets observed with noise precision noise = 1 using 10 splines for the following values of the spline precisions l = :1, l = :01, and l = :005. Since the noise precision is inversely proportional to the variance, we expect that simulated variables from the posterior distribution will have smaller variance, and the condence intervals would be more narrow. This eect can be observed in plots (a) and (b) in the left sub gures. CI's for l =:01, and l =:005 are almost identical. Sub gures on the right compare 95% CI's for the outputs produced by simulated values and the SSM. There is almost no dierence between the width of such intervals. Assuming low precision for the spline coecients, i.e. considering \wide" priors, in bayesian estimation is essential for a true solution to belong to posterior distribution. We can also compare the dependence of the prior mean for the spline coecients on the CI's in Figure 30. The top two plots compare the prior mean ^ l( 2 ) computed from (4:20) for 2 =:1 and 2 =:5 for Patient 1 TAS data to the corresponding condence intervals. The estimation ^ l(:5) seemed to be superior to ^ l(:1) (see Figure 13, but it is hard to state the same for condence intervals. Part (b) of the same gure compares prior means ^ l(:01) 84 and ^ l(1) for Patient 2 TAS data to the corresponding condence intervals. As seen in Figure 13), the prior mean ^ l(1) underestimates the true input, while the corresponding CI captures the maximum of the BAC, but only an upper boundary captures the downward slope. 
Thus, compared to MAP estimates the CI have more opportunities to capture the true input even starting from a very unfavorable initial guess ^ l. Overall the CI's are more stable to the choice of the prior mean for the spline coecients than MAP estimates. Since we also obtained samples of the system parameters from the posterior distribu- tion , we can analyze the simulated values by looking at their histograms. Without loss of generality consider Patient 1 and Patient 2 data sets with observed noise precision 1, when trying to t BAC with 10 splines. In Figure 31 we compare histograms of simulated SSM coecients to the corresponding samples of the equal size obtained from the prior distribution for both patients. Most of the sampled 's take values between .1 and .2 for both patients. At the same time the range for 's essentially remained unchanged. These results are hardly of any use and show that the SSM coecient's can be measured very approximately in hopes to improve the population (GTS) estimates for a new patient. To summarize, the condence intervals were able to capture the true blood alcohol concentration in almost all of the experiments. They showed to be more stable to over parametrization and choice of the prior mean compared to MAP estimators. To guarantee better results, one should not increase the assumed noise precision in modeling TAS observations even if the noise variance is known to be small. 85 4.5.5 Results for the Interval Estimates Histograms for samples obtained from pr (q) = pr () pr ( ) and discretization of pr (q) computed by (4:25-4:26) on ^ Q. Figure 25: Left: samples obtained from pr (q). Right: samples obtained from a discretization pr (q) on ^ Q. 86 Condence intervals for Patient 1 BAC input are compared with corresponding MAP estimates computed for the same set of prior parameters. 0 2 4 6 8 10 −20 0 20 40 60 80 100 L.Bound Input Up.Bound Input MAP Estimate True Input 0 2 4 6 8 10 −5 0 5 10 15 20 25 L.Bound Output Up.Bound Output MAP Estimate Output Data (a) 0 2 4 6 8 10 −20 0 20 40 60 80 100 120 0 2 4 6 8 10 −5 0 5 10 15 20 25 L.Bound Input Up.Bound Input MAP Estimate True Input L.Bound Output Up.Bound Output MAP Estimate Output Data (b) 0 2 4 6 8 10 −40 −20 0 20 40 60 80 100 120 140 L.Bound Input Up.Bound Input MAP Estimate True Input 0 2 4 6 8 10 −5 0 5 10 15 20 25 L.Bound Output Up.Bound Output MAP Estimate Output Data (c) Figure 26: Patient 1, noise = 1. (a) P = 5.(b) P = 10.(c) P =20. 87 Condence intervals for Patient 2 BAC input are compared with corresponding MAP estimates computed for the same set of prior parameters. 0 2 4 6 8 10 −10 0 10 20 30 40 50 60 70 80 90 L.Bound Input Up.Bound Input MAP Estimate True Input 0 2 4 6 8 10 −5 0 5 10 15 20 25 L.Bound Output Up.Bound Output MAP Estimate Output Data (a) 0 2 4 6 8 10 −20 0 20 40 60 80 100 120 L.Bound Input Up.Bound Input MAP Estimate True Input 0 2 4 6 8 10 −5 0 5 10 15 20 25 L.Bound Output Up.Bound Output MAP Estimate Output Data (b) 0 2 4 6 8 10 −40 −20 0 20 40 60 80 100 120 L.Bound Input Up.Bound Input MAP Estimate True Input 0 2 4 6 8 10 −5 0 5 10 15 20 25 L.Bound Output Up.Bound Output MAP Estimate Output Data (c) Figure 27: Patient 2, noise = 1. (a) P = 5.( b) P = 10. (c) P = 20. 88 For both patients condence intervals for BAC input are compared with corresponding MAP estimates computed given the same set of prior parameters. 
Figure 28: (a) 95% CI's for the input, left: Patient 1, right: Patient 2. (b) Fragment of the 95% CI's for the output, left: Patient 1, right: Patient 2.

CI's computed for different values of τ_l, when observing data with τ_noise = 1 and using 10 splines to approximate the input.

Figure 29: (a) Patient 1. (b) Patient 2.

Compare CI's for different values of the prior mean l̂, when using TAS data observed with the noise precision 1 and input approximation with 10 splines.

Figure 30: (a) Patient 1. (b) Patient 2.

Histograms for the SSM coefficients sampled from the prior and posterior distributions, when observing data with τ_noise = 1 and using approximation with 10 splines.

Figure 31: (a) Patient 1, left: α samples, right: γ samples. (b) Patient 2, left: α samples, right: γ samples.

Compare CI's for Patient 1 TAS data, observed with the noise precision 4, when using a different noise precision in the stochastic model.

Figure 32: Compare estimated results for Patient 1 TAS data observed with τ_noise = 4 when modeling using τ_noise = .5 and τ_noise = 4.

Chapter 5
Proposed Research

The challenging task of using the WrisTAS device to obtain a quantitative estimate of a patient's blood alcohol content resulted in two separate problems: first, estimating the population SSM coefficients by applying a modification of the EM algorithm, the Global Two Stage method, and, second, deconvolving the blood alcohol content and tuning the SSM parameters for an incoming patient by analyzing the posterior distributions of the parameters of interest. The GTS iterative scheme for estimating the population skin model parameters was straightforward to implement; it demonstrated stable results and rapid convergence. One aspect that was not analyzed is whether the method converges with increasing skin discretization. As was discussed in section 3.4 of chapter 3, it is well known that a solution of the diffusion equation discretized in space converges to the true solution with increasing discretization when the approximation uses linear spline functions.
However, it may be hard to show and analyze numerical convergence for the GTS method, since its results 94 depend on the LS estimation method and are used for obtaining nal estimates for pa- rameters of a distribution for the skin model coecients. Thus, we propose to investigate if and how ( N GTS ; N GTS ) ! N!1 (; ): A very interesting problem is dening population groups based on a number of real clinical tests. This topic is discussed to some extent in [4]. In the second part of this thesis, we analyzed the posterior distribution for the SSM coecients and blood content approximated by a linear combination of piecewise constant functions. We obtained satisfactory results for two examples, but there are many of aspects of the problem which require improvement and further research. We point out some of the primary directions one might go in. First, we can try to compute a MAP estimator by assuming the precision for spline coecients l to be a random variable, assigning to it a prior distribution, and trying to estimate it together with input and skin model parameters as in [3]. It is not clear if this will signicantly improve results, but since the choice of l is essential in a MAP estimation as seen in section 4:5:2 of chapter 4, we hope that in the suggested approach a suitable precision will be found automatically. Second, it is necessary to extend our estimation methods to TAS data with several drinking epics. The rst question to ask is how many splines we will need to get ade- quate results for input approximation, and if and how it is possible to obtain them in a reasonable amount of time. This is a very important question for both MAP and inter- val estimations. As a result the next problem is to formulate an adaptive algorithm for 95 choosing breakpoints for approximating spline functions and for dening a total number of splines as it may signicantly decrease runtime and eectively improve estimation. For this we propose the so-called reversible jump Markov chain Monte Carlo (RJMCMC) approach, which is a modication of the standard Markov Chain Monter Carlo algorithm. The Markov Chain Monte Carlo (MCMC) method originated in physics as a tool for exploring equilibrium distributions of interacting molecules. In statistical applications, it is used to explore and estimate features of likelihood surfaces and Bayesian posterior distributions by simulating pseudo random draws from multidimensional distributions. The basic idea of an MCMC scheme is that we wish to generate random variables from a distribution (x), for some x2R n , but can not do it directly, since the form of (x) is intractable. Instead we obtain a Markov chain - a sequence of random variables for which the distribution of each element depends only on the value of the previous one. If the generated Markov chain is aperiodic and irreducible, then distributions of the chain elements converge to the equilibrium distribution which is in fact (x) [11], [16]. Some classical sampling methods are the Metropolis-Hastings and Gibbs Sampler algorithms. RJMCMC is dierent from MCMC in the sense that it allows the dimension of a simulated parameter vector x to be a variable as well [9]. Applied to the deconvolution problem, the RJMCMC approach is to consider possible models for the input function M k = (k; k ), where k is a number of splines that is usually bounded, k = (l k ;s k ) is a vector of parameters withl k being the spline coecients, ands k is a vector of breakpoints for the spline functions. 
Then the goal is to simulate (q k ;M k ) from (q k ;M k jy), where q k is still a vector of SSM parameters. In particular, we suggest: 1. Start from M 0 . 96 2. On the i'th step: A. Simulate q i from p(qjy;M i1 ) =p(yjq;M i1 )p(qjM i1 )p(M i1 ). B. Run RJMCMC to simulate M i from p(Mjy;q i ) . 3. Repeat step 2 to obtain samples (q i ;M i ) and based on the generated samples build the condence interval for the input. The algorithm suggested above is a hybrid MCMC simulation, since it consists of two embedded Markov chains, one of which, 2B, is an RJMCMC. In RJMCMC simulation, samples from a target distributionp(Mjy;q i ) are obtained by sequentially sampling from a candidate distribution and switching to a dierent candidate based on the previously obtained sample. In particular, for the deconvolution problems similar to ours the fol- lowing realization for the step 2B can be considered [9], [11]: Start from a model M i1 , on the j'th step: 1. Choose one of the breakpoints, say, s . 2. Choose one of the following steps to switch to the next model: (a) Update the values of the spline coecient s . (b) Add a new breakpoint. (c) Move a breakpoint. (d) Delete s . 3. Repeat steps 1-2 until the chain converges. 97 In order to prove theoretically that such a scheme converges, one needs to develop a numerical way to verify that obtained samples come from a target distribution,p(Mjy;q i ). In the same way it is necessary to prove and verify convergence of a scheme that produces simulations (q k ;M k ). The suggested method can be applied in the estimation of non smooth inputs by adding one more scenario to step 2 in the above algorithm, i.e. \assign spline coecient valuesl andl + to the points ". This is equivalent to introducing a jump discontinuity at the point s , and can be potentially used in a more general problem of deconvolving the number and the sizes of drinks consumed by a patient in the case of the full body model. 98 References [1] B. M. Bell and G. Pillonnetto. Bayes and empirical bayes semi-blind deconvolution using eigenfunctions of a prior covariance. Automatica, 43:1698{1712, 2007. [2] P.E. Caines. Linear Stochastich Systems. Wiley, 1988. [3] R.K. Dash, E. Somersalo, M.E. Cabrera, and D. Calvetti. An ecient deconvolution algorithm for estimating oxygen consumption during muscle activities. Computer Methods and Programs in Biomedicine, 85(3), March 2007. [4] M. Davidian and D. M. Giltinan. Nonlinear Models for Repeated Measurement Data, pages 138{141. Chapman & Hall, 1995. [5] G. De Nicolao, G. Sparacino, and C. Cobelli. Population pharmacoki- netic/pharmocodynamic methodology and applications: A bibliography. Automat- ica, 33(5), 1997. [6] A. Dempster, N. Laird, D. Rubin, and J. Maximum. Likelihood for incomplete data via the EM algorithm. Royal Statistical Society, 401, 1977. [7] M.A. Dumett, I. G. Rosen, J. Sabat, A. Shaman, L. Tempelman, C. Wang, and R. Swift. Four distributed parameter models for estimating simulating and inverting the transdermal transport of alcohol. University of Southern California, 2005. [8] W.R. Gilks, S. Richardson, and D. Spiegelhalter. Markov Chain Monte Carlo in Practice: Interdisciplinary Statistics. Chapman & Hall/CRC, 1996. [9] P. J. Green. Reversible jump markov chain monte carlo computation and bayesian model. Biometrika, 82(4), December 1995. [10] J. Kaipio and E. Somersalo. Statistical and Computational Inverse Problems, pages 56{108. Springer, 2004. [11] D. Kang and D. Verrotta. Reversible jump markov chain monte carlo for the decon- volution. 
Journal of Pharmocokinetics and Pharmacodynamics, 34(3), June 2007. [12] D.G. Luenberger. Optimization by Vector Space Methods. John Wiley & Sons, Inc, 1969. [13] E. Nyman and A. Palmlov. The elimination of ethyl alcohol in sweat. Scand. Arch. Physiol., 74, 1936. 99 [14] A. Schumitzky. EM algorithms and two stage methods in pharmacokinetic popula- tion analysis. Advanced Methods of Pharmacokinetic and Pharmacodynamic System Analysis, 1995. Edited by D'Argenio DZ. New York:Plenum. [15] G.A.F. Seber and C.J. Wild. Non-linear Regression, pages 25{30, 120{126. Wiley - Interscience, 2003. [16] L. Swette, A. Grith, and A. La Conti. Potential and diusion controlled solid elec- trolyte sensor for continuous measurement of very low levels of transdermal alcohol. Pat. U.S. 5,944,661. Giner, Inc., Waltham, MA. Appl. No. 840802, 16.4.1997 (led), 31.8.1999 (issued). [17] R.M. Swift. Transdermal measurement of alcohol consumption. Addiction, 88(8), August 1993. [18] R.M. Swift. Transdermal alcohol measurement for estimation of blood alcohol con- centration. Alcoholism: Clinical & Experimental Research, 24(4), April 2000. [19] R.M. Swift, C.S. Martin, L. Swette, A. La Conti, and N. Kackley. Studies on a wearable, electronic, transdermal alcohol sensor. Alcoholism: Clinical & Experimen- tal Research, 16(4), August 1992. [20] L. Yuh, S. Beal, M. Davidian, F. Harrison, A. Hester, K. Kowalski, E. Vonesh, and R. Wolnger. Population pharmacokinetic/pharmocodynamic methodology and applications: A bibliography. Biometrics, 50(2), June 1994. 100 Appendix: WinBUGs The BUGS (Bayesian inference Using Gibbs Sampling) project is aimed at academia and industry professionals interested in exible software for Bayesian analysis of complex statistical models using Markov Chain Monte Carlo (MCMC) methods. As mentioned earlier the project began in 1989 in the MRC Biostatistics Unit, and continued developing as a shared project by the Imperial College School of Medicine at St Mary's, London, and the University of Helsinki, Finland. The software has a graphical user interface and a scripting language for automating routing analysis. To run an MCMC in WinBUGs a user has to specify a stochastic model using a limited PASCAL syntax, assign prior distributions for all variables, and include text les containing data and Markov chain initializations. Once a model has been compiled, a user is given an option to choose the number of samples to produce and variables to display. At the end, the samples obtained can be extracted into a text le, so that they can be tested for convergence and other statistical properties. A simulation method for each variable is chosen according to its conditional distribution and is usually a hybrid of classical update algorithms, such as Gibbs Sampler, Metropolis-Hastings, rejection sampling and so on. There is not a lot of exibility for the user to choose an update method. To solve a system of ODEs it is necessary to install an additional package, WBDi. In order to 101 provide faster results a user has to change a template le written in PASCAL to solve the target system of ODEs. Still, computational speed in WinBUGS is inferior to Matlab. Every time the updater generates a sample of parameters that are a part of a system of dierential equations, the ODE solver will run a numerical iterative (Runge-Kutta) scheme. 
Nevertheless, WinBUGS seemed attractive, since, if successful, it would have spared us the effort of implementing and testing our own samplers.

First, we tested WinBUGS on the stochastic model

y_k = f(q, u_k) + ε_k, k = 1, ..., N, ε ∼ N(0, I_{N×N}),

where f(q, u_k) is the numerical solution, evaluated at the times t_k, of the following linear system of ODEs:

dC_1(t)/dt = α C_1(t) + A exp(−(t − b)²),
dC_2(t)/dt = β C_1(t) + γ C_2(t).

We generated data in Matlab assuming α = −1, β = 0.5, γ = −1, A = 30, b = 2, and added Gaussian noise with precision 1 to obtain a realistic noisy data set. We had to specify prior distributions for the parameters q = (α, β, γ), A, and b. We chose Gaussian priors centered at the true values for q and A, and a uniform distribution on [0, 10] for the parameter b. Finally, we launched WinBUGS to obtain 100,000 samples from π(α, β, γ, A, b | y). The sample average of the parameter vector over the last 50,000 samples estimated the true values with high precision.

In our first attempt to solve the deconvolution problem, the goal was to obtain samples from the posterior distribution π(l, q | y_TAS) for the simple skin model parameters q and the spline coefficients l using WinBUGS. The first SSM we considered consisted of three parameters (α, β, γ), with β being the rate of alcohol absorption from the blood into the skin:

∂w/∂t = α ∂²w/∂x², t ≥ 0, 0 ≤ x ≤ L,
w_x(t, 0) = β u(t),
w_x(t, L) = −γ w(t, L),
y(t) = w(t, L).

We also approximated the input by 10 and by 20 linear spline functions, as they are more suitable than piecewise constants for approximating continuous functions. Next, we considered the SSM with skin discretization N = 4. The data, sampled every 5 minutes, were generated by the SSM with q = (0.15, 1.7), assuming the input from Example 5, i.e. u(t) = A exp(−(t − b)²). Finally, Gaussian noise with precision 4 was added to each data point. The final stochastic model considered was

y = f(t, q, u) + ε, ε ∼ N(0, τ⁻¹ I), (5.1)

where y is the observed TAS data, u the BAC input, q the skin model parameters, and τ the unknown noise precision. We assigned pr(α) = N(0.15, 10), pr(γ) = N(1.7, 10), pr(τ) = N(5, 0.5), pr(A) = N(30, 0.1) and, finally, pr(b) = N(2, 1). Next it was time to run the MCMC, but the obtained samples for β and A were quite disappointing. It was easy to spot the problem once we examined the theoretical solution of the SSM:

y_N(t) = H_N ∫_0^t e^{A_N(q)(t−s)} B_N(q) A exp(−(s − b)²) ds.

In fact, we had been trying to estimate the SSM coefficient β and the input parameter A while observing only their product. As a result, we had to assume further that β = 1, which means that u(t) is the part of the alcohol in the bloodstream that is transferred through the skin. Beyond this, we did not encounter any problems in MCMC simulations of the SSM parameters with a known input.

To continue testing the solvability of the deconvolution problem in WinBUGS, we assumed the SSM parameters to be known and fixed, and estimated the input function using a linear spline approximation with 10 pup-tent functions. The prior for the spline coefficients was taken to be l_i ∼ N(l_i^true, 0.1), with

l^true = [0.5495, 11.0364, 30.0000, 11.0364, 0.5495, 0.0037, 0, 0, 0, 0, 0],

and the prior for the noise precision was set to pr(τ) = N(5, 0.5). The results for the simulated spline coefficient and noise precision samples are illustrated in Table 8 and Figure 33. The overall runtime to produce 100,000 samples was about 1.5 hours. The sample averages for the spline coefficients and for τ are close to the true values, but the significant autocorrelation of some spline coefficients in Figure 35 suggests that the chain mixes slowly.
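The slow mixing suggested by Figure 35 can be quantified directly from the exported samples. The following is a minimal diagnostic sketch, not part of the WinBUGS runs described here: it computes the lag-k sample autocorrelation of a single chain and a crude effective sample size. The AR(1) chain at the bottom merely illustrates the behavior of a strongly autocorrelated sampler, and all names and numbers are illustrative only.

import numpy as np

def autocorr(chain, max_lag=40):
    """Sample autocorrelation of a 1-D MCMC chain for lags 0..max_lag."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / np.dot(x, x) for k in range(max_lag + 1)])

def effective_sample_size(chain, max_lag=200):
    """Crude effective sample size: n / (1 + 2 * sum of positive-lag autocorrelations)."""
    acf = autocorr(chain, max_lag)
    s = 0.0
    for rho in acf[1:]:
        if rho <= 0:          # truncate at the first non-positive autocorrelation
            break
        s += rho
    return len(chain) / (1.0 + 2.0 * s)

# Synthetic, strongly autocorrelated chain (AR(1) with coefficient 0.9) for illustration
rng = np.random.default_rng(0)
chain = np.zeros(50_000)
for t in range(1, len(chain)):
    chain[t] = 0.9 * chain[t - 1] + rng.normal()
print(autocorr(chain, 5))             # slowly decaying autocorrelations
print(effective_sample_size(chain))   # far fewer "independent" samples than 50,000

A small effective sample size relative to the chain length indicates that the chain must be run longer, and possibly thinned, before the samples are used to form estimates.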
To obtain samples from the target distribution, we may therefore have to run the chain longer and then test it for convergence; we may also want to thin the chain before using the samples to form estimates. No thorough analysis was done to verify this. At the next step we changed the approximating splines to piecewise constants, in the hope that the corresponding spline coefficients would exhibit better convergence behavior. The following prior was set for the spline coefficients: l_i ∼ N(l_i^true, 0.1), with

l^true = [3.1620, 23.3640, 23.3640, 3.1620, 0.0579, 0.0001, 0, 0, 0, 0].

The sample statistics and history plots for the same coefficients are shown in Table 9 and Figure 34. The results are very similar to those of the previous example, but the overall runtime was less than 1 hour.

As the results from the previous runs were satisfactory, we were eager to move to the case in which both the SSM coefficients and the spline coefficients were unknown. The following priors were assumed:

l_i ∼ N(l_i^true, 0.1), α ∼ N(0.15, 1000), γ ∼ N(1.7, 10), τ ∼ N(5, 0.5),

where l^true comes from the previous example. Some results are shown in Figures 36 and 37 and in Table 10. The overall runtime for this task was about 10 hours. The true values are estimated well by the obtained sample averages, but the history plots for the SSM coefficients and for some spline coefficients exhibited a certain trend, which suggested that the MCMC chain had not converged.

It is still possible to improve the convergence of the Markov chain by carefully choosing the prior distributions. Unfortunately, the WinBUGS website contains only a list of the updaters used for particular conditional distributions; it does not offer any recipes for choosing the most efficient or optimal sampling algorithm for a given problem. Aside from choosing an appropriate updater, one can reparametrize problematic variables [8]. Another obvious reason for poor convergence is the fact that a deconvolution problem is, in general, ill-posed. We can regularize it by introducing a smoothing prior distribution for the spline coefficients [3] and hope that this improves the overall convergence of the MCMC. However, introducing new variables or additional computational steps will increase the runtime. Ultimately, it is highly desirable to increase the skin discretization and to be able to analyze several drinking episodes at once, which would take even longer to run. Since the runtime of the BAC estimation matters, we decided to abandon WinBUGS and to proceed with MAP estimation and direct sampling from the posterior distribution.
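The MAP estimation used in Chapter 4 amounts to minimizing the negative log-posterior over the skin parameters q and the spline coefficients l. The following is a minimal sketch of that idea, not the thesis implementation: it assumes Gaussian priors and Gaussian measurement noise as above, forward_model() is only a crude first-order stand-in for the discretized SSM, and all names and numerical values are illustrative only.

import numpy as np
from scipy.optimize import minimize

t = np.linspace(0, 10, 121)                      # observation times (hypothetical)

def forward_model(q, l, t):
    """Stand-in forward map: TAS response to a piecewise-constant BAC input.

    A first-order lag with rate q[0] and gain q[1] replaces the discretized
    heat-equation model used in the thesis.
    """
    knots = np.linspace(t[0], t[-1], len(l) + 1)
    u = np.asarray(l)[np.clip(np.searchsorted(knots, t, side="right") - 1, 0, len(l) - 1)]
    y = np.zeros_like(t)
    for k in range(1, len(t)):                   # explicit Euler step of dy/dt = q0*(q1*u - y)
        dt = t[k] - t[k - 1]
        y[k] = y[k - 1] + dt * q[0] * (q[1] * u[k - 1] - y[k - 1])
    return y

def neg_log_posterior(theta, y_obs, t, q_prior, l_prior, tau):
    q, l = theta[:2], theta[2:]
    resid = y_obs - forward_model(q, l, t)
    # -log p(y|q,l) - log p(q) - log p(l), dropping additive constants
    return (0.5 * tau * resid @ resid
            + 0.5 * np.sum((q - q_prior["mean"]) ** 2 * q_prior["prec"])
            + 0.5 * np.sum((l - l_prior["mean"]) ** 2 * l_prior["prec"]))

# Hypothetical data and priors
q_prior = {"mean": np.array([0.15, 1.7]), "prec": np.array([10.0, 10.0])}
l_prior = {"mean": np.zeros(10), "prec": np.full(10, 0.01)}
rng = np.random.default_rng(1)
l_true = 20 * np.exp(-(np.linspace(0, 9.5, 10) - 2) ** 2)
y_obs = forward_model([0.15, 1.7], l_true, t) + rng.normal(scale=0.5, size=t.shape)

theta0 = np.concatenate([q_prior["mean"], l_prior["mean"]])
fit = minimize(neg_log_posterior, theta0, args=(y_obs, t, q_prior, l_prior, 4.0),
               method="L-BFGS-B")
q_map, l_map = fit.x[:2], fit.x[2:]

Because everything here is Gaussian apart from the forward map, the same objective also underlies the direct posterior sampling of Chapter 4; the optimizer merely replaces the sampler when only the point estimate is required.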
Node   Mean      SD      MC error   2.5%     Median    97.5%    Start   Sample
l1    -0.114     2.216   0.04095   -4.452    -0.1175    4.237   10000   90001
l2    11.4       1.695   0.03807    8.079    11.41     14.72    10000   90001
l3    31.4       1.721   0.0404    28.06     31.38     34.76    10000   90001
l4    12.7       1.74    0.03979    9.262    12.69     16.07    10000   90001
l5    -1.379     1.754   0.03957   -4.738    -1.387     2.137   10000   90001
l6    -1.008     1.746   0.04014   -4.531    -0.9981    2.397   10000   90001
l7    -0.991     1.763   0.04029   -4.468    -0.9839    2.475   10000   90001
l8     0.6027    1.772   0.03797   -2.855     0.6127    4.031   10000   90001
l9    -0.08072   1.786   0.03029   -3.559    -0.09035   3.394   10000   90001
l10   -0.05764   2.311   0.02627   -4.637    -0.04912   4.459   10000   90001
l11    0.1383    3.143   0.02383   -6.018     0.1394    6.293   10000   90001
τ      4.033     0.5207  0.002134   3.073     4.014     5.108   10000   90001

Table 8: Sample statistics for an MCMC sample of (τ, l), q = (0.15, 1.7), input approximation by pup-tent functions.

Figure 33: History plots for τ, l2, l3; input approximation by pup-tent functions.

Node   Mean      SD      MC error   2.5%     Median    97.5%    Start   Sample
l1     2.864     1.067   0.0201     0.7852    2.874     4.954   10000   90001
l2    22.18      1.462   0.03412   19.33     22.17     25.02    10000   90001
l3    23.7       1.475   0.03393   20.83     23.71     26.57    10000   90001
l4     3.722     1.485   0.03403    0.8197    3.722     6.595   10000   90001
l5     0.1928    1.47    0.03284   -2.658     0.1922    3.053   10000   90001
l6    -0.6044    1.449   0.03218   -3.534    -0.5975    2.262   10000   90001
l7     0.708     1.46    0.03081   -2.184     0.7082    3.545   10000   90001
l8    -1.272     1.464   0.02703   -4.119    -1.285     1.58    10000   90001
l9     0.7067    1.514   0.02209   -2.267     0.7159    3.647   10000   90001
l10   -0.6235    2.905   0.02526   -6.379    -0.6125    5.089   10000   90001
τ      4.696     0.565   0.002509   3.648     4.676     5.851   10000   90001

Table 9: Sample statistics for an MCMC sample of (τ, l), q = (0.15, 1.7), input approximation by piecewise constants.

Figure 34: History plots for τ, l2, l3; input approximation by piecewise constants.

Figure 35: Autocorrelation plots for τ, l2, l3; (a) input approximation by pup-tent functions, (b) input approximation by piecewise constants.

Node   Mean      SD       2.5%      Median    97.5%    Start   Sample
α      0.1487    0.0132   0.1259    0.1476    0.1766   10000   90001
γ      1.7999    2.1613   1.4880    1.7990    2.1170   10000   90001
l2    23.4167    2.6625  18.3300   23.3800   28.5600   10000   90001
l3    24.9027    2.3459  20.3500   24.9100   29.4600   10000   90001

Table 10: Sample statistics for an MCMC sample of (q, l).

Figure 36: History plots for (γ, α); input approximation by piecewise constants.

Figure 37: History plots for l2 and l3 in the MCMC of (q, l); input approximation by piecewise constants.
Abstract
We develop statistical methods to estimate blood alcohol concentration (BAC) from transdermal alcohol concentration (TAC) measurements supplied by a transdermal alcohol sensor (TAS). This eliminates the need for clinical test data for each patient to calibrate the underlying mathematical models to each subject and each device. We parametrically estimate the distribution for transport parameters in a transdermal ethanol model based on population analysis and simulated data using the Global Two-Stage Method (GTS). We implement the GTS method using generated data and clinical data obtained from 17 healthy volunteers with approximately the same body mass index assuming log-normal distributions for their skin model parameters. Next, we develop a Bayesian approach to estimate BAC for new patients from the population using TAS data generated by a full body alcohol model. The BAC signal is approximated by piecewise-constant (zero-order spline) functions. Prior distributions are assumed for skin model parameters (based on population analysis) and BAC spline coefficients. The new patient's TAS data is used to estimate BAC via maximization of a posterior distribution for skin parameters and input coefficients. This yields a maximum a-posteriori (MAP) estimate. With basic assumptions on the BAC spline coefficients and the TAS noise model, we obtain direct samples from the posterior distribution. This allows us to construct 95% confidence intervals (CI) for BAC.