Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Delta Method confidence bands for parameter-dependent impulse response functions, convolutions, and deconvolutions arising from evolution systems described by…
(USC Thesis Other)
Delta Method confidence bands for parameter-dependent impulse response functions, convolutions, and deconvolutions arising from evolution systems described by…
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
Delta Method Confidence Bands for Parameter-Dependent Impulse Response Functions,
Convolutions, and Deconvolutions Arising from Evolution Systems Described by Analytic
Semigroups of Operators with Regularly Dissipative Generators
by
Haoxing Liu
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(APPLIED MATHEMATICS)
August 2024
Copyright 2024 Haoxing Liu
Dedication
To my dear family and friends.
ii
Acknowledgements
I would like to express my deepest gratitude to those who have supported me throughout my Ph.D. studies. First and foremost, I am profoundly grateful to my advisor, Professor Gary Rosen, for his unwavering
guidance, insightful feedback, and continuous encouragement during this process. I was definitely not an
expert in many parts of the study to begin with, but his support and patience guided me through not only
the completion of this thesis, but also all aspects of my academic research life. I would not have imagined
a better experience for my Ph.D. studies at USC.
In addition, I extend my heartfelt thanks to the members of my thesis committee, Professor Larry Goldstein and Dr. Susan E. Luczak. Professor Goldstein introduced me to this amazing project and shared
many helpful suggestions in statistics and probability. Dr. Luczak provided many valuable inputs from the
biological and psychological point of view of this project that I was not originally familiar with. Thank
you for your constructive criticism and dedicated support.
I thank Dr. Luczak and her lab at the University of Southern California for carefully collecting the data
and allowing me to use it towards this thesis.
iii
I would also like to thank the National Institute of Health for the generous support under the grants R21
AA17711 and R01 AA26368.
Last but not least, I am deeply grateful for my parents and other family members. Your constant love
and support have been my foundation. Their belief in me has been a source of strength and inspiration
throughout this journey. I also want to thank my girlfriend Yejia. Her love and companionship have been a
constant source of motivation and comfort. I would not have been able to complete the degree without you.
To all of you, I extend my sincerest thanks. This thesis would not have been possible without your support.
iv
Table of Contents
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Chapter 2: Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1 Functional Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Probability and Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Linear Regression and Least Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4 Differentiation in Banach Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Chapter 3: A Single Input Single Output Model for the Transdermal Transport of Ethanol . . . . . 24
Chapter 4: The Abstract Evolution System and Finite Dimensional Approximation and Convergence 27
4.1 The Continuous-Time Infinite Dimensional Evolution System . . . . . . . . . . . . . . . . 27
4.2 The Discrete-Time Convolution Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.3 Finite Dimensional Approximation and Convergence for the Discrete-Time System . . . . 30
Chapter 5: The Fréchet Derivative of Analytic Semigroup, its Finite Dimensional Approximation
and Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.1 The Fréchet Derivative of the Analytic Semigroup . . . . . . . . . . . . . . . . . . . . . . . 33
5.2 Evaluation of the Fréchet Derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.3 Convergence of the Finite Dimensional Approximating Derivatives . . . . . . . . . . . . . 37
5.4 Derivative of the Impulse Response Function . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Chapter 6: The Deconvolution Problem and Regularization . . . . . . . . . . . . . . . . . . . . . . 43
6.1 Reformulating the Convolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
6.2 The Reconstructed BrAC with Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.3 Derivative of the Inverse Operator and Convergence . . . . . . . . . . . . . . . . . . . . . 49
Chapter 7: The Delta Method Induced Confidence Bands . . . . . . . . . . . . . . . . . . . . . . . 52
7.1 The Delta Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
7.2 Asymptotic distribution of the Impulse Response Matrix . . . . . . . . . . . . . . . . . . . 54
v
7.3 Confidence Bands for the Reconstructed BrAC . . . . . . . . . . . . . . . . . . . . . . . . . 58
7.4 The Choice of Uniform Confidence Bands and Constructions . . . . . . . . . . . . . . . . . 59
Chapter 8: Application of Our Framework to Our SISO Model for the Transdermal Transport of
Ethanol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
8.1 The Abstract State Space Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
8.2 Finite Dimensional Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
8.3 Choice of Regularization Pamameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Chapter 9: Numerical Examples with SISO Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
9.1 Human Subject Clinical Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
9.2 The Impulse Response Function and Confidence Bands . . . . . . . . . . . . . . . . . . . . 72
9.3 Conficence Intervals for Forward and Backward Estimation Based on Actual Drinking
Episode Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
9.4 Convergence of our Approximating Estimates and Confidence Bands . . . . . . . . . . . . 77
Chapter 10: Concluding Remarks and Future Research . . . . . . . . . . . . . . . . . . . . . . . . . 82
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
vi
List of Figures
1.1 (Left) WristTASTM 7 Continuous Alcohol Monitoring device, (Right) SCRAM Systems
transdermal continuous alcohol monitoring device. . . . . . . . . . . . . . . . . . . . . . . 2
7.1 90% single-parameter rectangular confidence region of the form N2
k=1[yk −cσk, yk +cσk]
and confidence ellipse for a two-demensional normal distribution with mean (0, 0) and
covariance matrix
1 0.20.2 0.25
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
9.1 Plot of 70% confidence intervals with impulse response function based on a single impulse
at time k = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
9.2 Plot of 70% confidence bands for deconvolving the impulse response function based on
Single Impulse. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
9.3 Unregularized Deconvolution the Impulse Response Function based on Single Impulse . . 74
9.4 Plot of 70% confidence bands for forward estimation of TAC based on observed BrAC data. 75
9.5 Plot of 70% confidence bands for the deconvolution problem and estimated BrAC based on
Observed TAC Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
9.6 Plot of sequential difference for the forward and backward estimates as the dimension of
approximation N increases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
9.7 Plot of sequential differences for the finite dimensional confidence band widths with the
impulse response function. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
9.8 Plot of sequential differences for the finite dimensional confidence band widths computed
using actual human subject BrAC and TAC data. . . . . . . . . . . . . . . . . . . . . . . . . 80
vii
Abstract
We present a comprehensive study of the propagation of parameter uncertainty in a single input/single
output (SISO) infinite dimensional random evolution system that models the relationship between blood or
breath alcohol concentration (BAC/BrAC), the input signal, and biosensor measured transdermal alcohol
concentration (TAC), the output signal, passively collected and noninvasively observed at the stratum
corneum, the outer surface of the skin or epidermal layer. Described by a first principles physics-based
random SISO diffusion model in the form of an abstract parabolic system set in a Gelfand triple of Hilbert
spaces, the resulting input output model takes the form of a discrete convolution with filter involving
an analytic semigroup of bounded linear operators with regularly dissipative generator that depends on
the model’s random parameters known only up to their distributions. To quantify the propagation of
uncertainty in more general multi-input channel/multi-output channel systems of this type, we use Fréchet
derivatives of the semigroup with respect to the parameters together with a statistical technique known as
the Delta Method. In particular, we consider the system’s impulse response matrix or function, and both
the forward and inverse or deconvolution problems for the purpose of estimating the input signals (e.g.
BrAC) based on observations of the output signals (e.g. TAC). The deconvolution process is formulated
as an abstract regularized least-squares problem. A finite-dimensional approximation and convergence
theory is developed, and the results of our numerical studies involving actual human subjects drinking
episode data are presented and discussed.
viii
Chapter 1
Introduction
The research and results reported here were motivated by the following estimation problem. Currently the
only means by which clinicians treating patients suffering from alcohol use disorder (AUD) and researchers
studying the causes and developing new treatments for AUD can collect drinking data is by either selfreporting or by having the patient or participant use a breath analyzer. For a number of reasons, both of
these methods are plagued by inaccuracies and make it impossible to collect data continuously in a naturalized setting, in the field, and in real time [27]. Recently, relatively inexpensive and wearable biosensors
that can collect drinking data in the field in near continuous time, both passively and non-invasively, have
been developed. These devices in the form of wearable wrist or ankle bracelets similar to a digital watch,
a Fitbit, or mobile tracking devices, collect and process ethanol (the type of alcohol in alcoholic beverages) evaporating from the surface of the skin. When alcoholic beverages are consumed, ethanol, which is
highly miscible [53] rapidly finds its way into the water that makes up 70% of the body’s mass including
that which can be found in tears, sweat, urine, interstitial fluid, and perspiration [35, 16, 39, 4]. While
ethanol’s primary pathway out of the body is its enzyme catalyzed transformation into aldehydes in the
liver, roughly 1% finds its way out via evaporation from the surface of the skin. A transdermal alcohol
biosensor collects ethanol in the vapor state on the surface of the skin and performs an oxidation reduction reaction similar to that which occurs in a fuel cell. The oxidation of each ethanol molecule produces
1
4 electrons in the form of an electrical current which can then be measured thus producing a count of
ethanol molecules. The biosensors can be bench calibrated allowing the measured current to be converted
into what is known as transdermal alcohol concentration (TAC) with units of percent alcohol (grams per
100 mL of blood). Two TAC devices are shown in Fig. 1.1.
Figure 1.1: (Left) WristTASTM 7 Continuous Alcohol Monitoring device, (Right) SCRAM Systems transdermal continuous alcohol monitoring device.
Although these devices represent a significant advancement in the ability to collect drinking data in
the field, there remains a significant obstacle to wide spread use of this technology by scientists and clinical
practitioners who specialize in addiction. Indeed, alcohol researchers and clinicians are most accustomed
to using blood alcohol concentration (BAC) and its surrogate, breath alcohol concentration (BrAC), as
the standard quantitative measures of alcohol intoxication. Now TAC biosensors are already available
on the commercial market as abstinence monitors (e.g. for court ordered monitoring of DUI and spousal
abuse offenders). However, due to a number of confounding factors including contact of the sample with
the environment, variations in physiological characteristics (e.g. skin thickness, porosity and tortuosity,
circulatory issues, propensity to sweat as opposed to perspire, etc.), and manufacturing variations in sensor
components, that are generally not as problematic for the breath analyzer [25], wide spread acceptance of
bench calibrated TAC as a suitable quantitative substitute for BrAC has not occurred.
Over the years, researchers have examined and demonstrated the effectiveness of utilizing TAC data
from transdermal alcohol biosensors as a means of estimating BAC/BrAC levels or as a direct indication
2
of drinking activities in general. A group at the University of Texas Health Science Center in San Antonio
have contributed significantly to verifying the efficacy of TAC levels subject to different degrees of alcohol
consumption, ranging from low-level drinking to binge drinking. For minimal drinking activities that are
traditionally harder to detect by the existing Alcohol Monitoring System (AMS) SCRAM, using TAC data
has proven to be successful in detecting small changes in drinking activities under specific decision rules
(e.g. drink-no drink) without inducing a large number of false-positives [41]. Alternatively, for moderate to intense drinking levels, the TAC and BAC/BrAC readings are found to be consistently positively
correlated and the researchers were able to determine decision rules based on peak TAC values for participants with moderate and more serious drinking habits [8]. In addition, it has been observed that TAC
observations have a significant time lag between the drinking episode and the signal captured by the TAC
sensor. This might diminish the effectiveness of the biosensor as a real-time monitoring device. The time
lag phenomenon was found to be influenced by the amount of alcohol consumed as well as other factors
related to the characteristics of the individual such as biological sex, etc. [11]. This team has also carried
out a preliminary study on the authenticity of TAC monitoring by spontaneously collecting daily drinking
reports from participants along with passively measured TAC levels using a wearable device [22]. They
compared TAC events during the days when participants reported drinking and abstaining separately, and
in both scenarios, the correspondence appeared to be promising.
A research group at the University of Illinois Urbana-Champaign has conducted a more extensive examination of the correlation between TAC inputs and BAC/BrAC outputs [21]. They collected results from
31 relevant laboratory-derived samples involving various conditions for participants, and estimated the
correlation coefficients and time lag via a sophisticated meta-regression. The outcomes indicated strong
performance for the transdermal alcohol sensors in revealing BAC/BrAC levels in a controlled environment. Moreover, the researchers noted that the location where the biosensor was attached to the body
3
played a vital role in increasing the correlation and reducing the time lag. They concluded that a wristworn device is preferable. On the other hand, another team in London conducted a similar review on
32 studies related to testing the accuracy of wearable biosensors across different experimental designs,
environmental conditions, and types of devices[10]. Their results suggested that even though a positive
correlation is commonly reported, the lack of consistency in these studies and, correspondingly, in their
conclusions compromised the results. Therefore, a more universal approach that accounts for the environmental variations is required for modeling the connection between TAC signals and BAC/BrAC levels.
However, at this point, despite technological advances, it has remained a challenge to construct a reliable
and accurate model that transforms the raw TAC data, which varies significantly across different individuals, devices, and other environmental factors, to corresponding BAC or BrAC levels that are found to be
generally consistent[13] [26] [30].
A number of studies have been undertaken with the goal of quantifying the relationship between TAC
and BrAC. One group’s approach involved regression-based models [8, 19, 23] with others investigating
physics-based models that capture the path alcohol, or more precisely, ethanol takes through the human
body [56, 55]. In a series of papers, our research group at the University of Southern California developed
schemes based on a first principles, physics based forward model first being fit to a set of training data
in the form of alcohol challenges. In an alcohol challenge, a pre-measured bolus of ethanol mixed with
fruit juice is consumed by a test subject while they are wearing one or more TAC sensors. Simultaneously
breath measurements at prescribed time intervals are also taken. The input to the model is the BrAC
and the output is the TAC. Then, in a second phase, the fit model is used to estimate BrAC from a TAC
signal that was not used in training. Nonlinear least squares and spline-based approximation are used to
solve the resulting ill-posed inverse problem. In these studies (see, for example, [9, 7, 42, 43]) the entire
computation is performed off-line and after the fact; in other words, not in real time. Moreover, they require
re-calibration of the model to each individual participant, sensor, and current environmental conditions.
4
More recently, our group has looked at developing population model based techniques for converting
TAC to equivalent BrAC in real time (see, for example, [51, 49, 50, 1, 2, 18, 60, 61, 59]).Population models
allow us to eliminate the requirement that the model be trained for each individual test subject, TAC
device, and environmental conditions. This is achieved by replacing the underlying deterministic diffusion
equation by a random one. Then, in the training phase, rather than estimate the values of the model
parameters, we use the training data to estimate the distribution of the random model parameters based
on the population data. This yields a population model that describes the aggregate response of a cohort
of the population. When this fit random model together with a deconvolution scheme are applied to TAC
data collected in the field, the result is an estimated BrAC signal along with conservative error bands based
on the estimated distribution of the model parameters.
In still other studies our group has developed TAC to BrAC conversion schemes using physics-informed
machine learning techniques including hidden Markov models, general adversarial neural networks, and
long-short term memory neural network models [38, 37, 36]. A group at the University of Illinois has also
developed data based methods for estimating BrAC from TAC using random forests [14].
In our treatment here, we investigate a population model-based approach of a different sort. We consider the unknown parameters to be the response variables of a statistical regression model with covariates
such as sex, weight, hip-to-waist ratio, body mass index, etc. as predictor variables. Our focus here though,
is not on the performance of these models, but rather on mathematically rigorously quantifying the propagation of the uncertainty inherent in the regression models for the parameters through our first principles
physics based forward convolution model for the transdermal transport of ethanol to 1) the system’s impulse response function (IRF), 2) the convolved TAC output signal, and 3) the deconvolved estimate for
BrAC based on the biosensor measured TAC signal. In addition, since our physics based model takes the
form of a diffusion and the convolution filter is finite dimensional, the convolution kernel function is defined in terms of the parameter dependent analytic semigroup of operators generated by the diffusion
5
equation. Consequently, any calculations require finite dimensional approximation. In our study here, we
develop an abstract approximation framework and corresponding convergence arguments. Our functional
analytic approach is based on a Taylor series-based statistical technique known as the multi-variate delta
method [5, 15] and the Fréchet derivative of the analytic semigroup [46] that describes the evolution of the
system. In an earlier effort [28] we used this approach to estimate conservative naive confidence bands
for the forward problem; here we construct tighter uniform confidence bands via techniques specifically
applicable to convolutions [32] and we also extend our results to the inverse or deconvolution problem as
well.
An outline of the remainder of the thesis is as follows. In Chapter 2, we list some of the preliminary
definitions, theorems and results that we will use in later chapters. In Chapter 3, we summarize the derivation of our diffusion-based model for the transdermal transport of ethanol through the epidermal layer of
the skin and its collection and measurement by the TAC biosensor. In Chapter 4 we use linear semigroup
theory to reformulate our model as an abstract evolution equation in an infinite dimensional Hilbert state
space. In this chapter we also develop our abstract approximation framework and convergence theory.
In Chapter 5, in preparation for applying the delta method, we discuss the Fréchet derivative of an analytic semigroup, its finite dimensional approximation and convergence. In Chapter 6 we formulate the
inverse or deconvolution problem and present the relevant regularity conditions. In Chapter 7, we consider
the application of the delta method to develop confidence bands for the forward convolution of TAC and
the inverse deconvolution of the BrAC signal. In Chapter 8, we show how the abstract theory derived in
Chapters 3 - 7 can be applied to the BrAC estimation problem, while in Chapter 9 we discuss our numerical
findings based on human subject drinking data collected in the Luczak laboratory in the Department of
Psychology at the University of Southern California. A final Chapter 10 contains some discussion, a few
concluding remarks, and possible future research.
6
Chapter 2
Preliminaries
In this chapter, we briefly state some of the definitions and theorems needed for the rest of the thesis. While
most of the information and more detailed proofs can be found in general textbooks of related subjects, we
provide a list of sources that guide our summary here. The functional analysis topics, including Sobolev
spaces and semigroup theory, can be found in [57, 40, 24, 52, 12]. Statistics and probability related concepts
can be found in [5, 15]. For the discussion of Fréchet derivatives and linear least-squares problems, we rely
on the notes [6, 17] in addition to the previously mentioned texts.
2.1 Functional Analysis
We begin by explaining the analytic semigroups of operators with regularly dissipative generators. Let X
be a Banach space, and let T(t), 0 ≤ t < ∞ be a one parameter family of bounded linear operators from
X onto X.
Definition 2.1.1. T(t) is a semigroup of bounded linear operators on X if
• T(0) = I, where I is the identity on X.
• T(t + s) = T(t)T(s) for every t, s ≥ 0.
7
Furthermore, the semigroup of operators T(t) is uniformly continuous if
lim
t↘0
T(t) − I
= 0.
Definition 2.1.2. The linear operator A defined on the domain
D(A)
x ∈ X : lim
t↘0
T(t)x − x
t
exists
,
and given by
Ax = lim
t↘0
T(t)x − x
t
:=
d
+T(t)x
dt
t=0
,
for x ∈ D(A) is the infinitesimal generator of the semigroup T(t).
The uniformly continuous semigroup of operators can be defined by its generator.
Theorem 2.1.3. A linear operator A is the infinitesimal generator of a uniformly continuous semigroup if
and only if A is a bounded linear operator. The following is statements are true if T(t) is a uniformly continous
semigroup of bounded linear operators,
• There exists a constant ω ≥ 0 such that ∥T(t)∥ ≤ e
ωt
.
• There exists a unique bounded linear operator A such that T(t) = e
tA.
• The operator A above is the infinitesimal generator of T(t).
We also define a slightly weaker family of semigroups.
Definition 2.1.4. T(t), 0 ≤ t < ∞ is a strongly continuous semigroup of bounded linear operators, or a
C0 semigroup, on X if
lim
t↘0
T(t)x = x for every x ∈ X.
8
Moreover, the C0 semigroup is analytic in X if the mapping t → T(t)x is analytic on (0, ∞) for all x ∈ X.
We note that a uniformly continuous semigroup is strongly continuous, and t → T(t)x is a continuous
function from the non-negative real line into X.
Theorem 2.1.5. Let T(t) be a C0 semigroup. Then there exists ω ≥ 0 and M ≥ 1 such that
∥T(t)∥ ≤ Meωt for 0 ≤ t < ∞.
The C0 semigroup T(t) is a generalization of the exponential function e
At, and maintains many properties of the latter regarding growth, domain, integration, differentiation, etc.
Theorem 2.1.6. Let T(t) be a C0 semigroup with infinitesimal generator A, then
1. D(A), the domain of A, is dense in X and A is closed.
2. For x ∈ D(A),
R t
0
T(s)ds ∈ D(A) and A
R t
0
T(s)ds
= T(t)x − x.
3. For x ∈ D(A), T(t)x ∈ D(A) and d
dtT(t)x = AT(t)x = T(t)Ax.
When M = 1 and ω = 0 in the theorem 2.1.5, in other words ∥T(t)∥ ≤ 1, the semigroup is called a
C0 semigroup of contractions. The generators of such semigroup of contractions have specific characteristics.
Definition 2.1.7. Let A be a linear, not necessarily bounded, operator on X. the resolvent set ρ(A) is the
set of all complex numbers λ for which λI − A is invertible. Equivalently,
(λI − A)
−1
is a bounded linear operator in X.
The family R(λ, A) = (λI − A)
−1
, λ ∈ ρ(A), of bounded linear operators is called the resolvent of A.
9
Theorem 2.1.8. (Hille-Yosida). A linear operator A is the infinitesimal generator of a C0 semigroup of
contractions T(t), t ≥ 0, if and only if
1. A is closed and D(A) = X.
2. The resolvent set ρ(A) of A contains R
+ and for every λ > 0
R(λ, A)
≤
1
λ
.
Next we move on to the dissipative characteristic of the generator. Denote X∗
as the dual space, the
space of all linear operators on X, of a Banach space X, and denote the value of x
∗ ∈ X∗
at x ∈ X as
⟨x
∗
, x⟩ = ⟨x, x∗
⟩.
Definition 2.1.9. For every x ∈ X, the duality set F(x) ⊆ X∗
is given by
F(x) = n
x
∗
: x
∗ ∈ X∗
,
x
∗
, x
= ∥x∥
2 = ∥x
∗
∥
2
o
Definition 2.1.10. A linear operator A is dissipative if for every x ∈ D(A) there is a x
∗ ∈ F(x) such that
ℜ⟨Ax, x∗
⟩ ≤ 0.
Intuitively, dissipative operators represent the decaying nature of the semigroup, similar to the semigroup of contractions. It can be characterized by the following theorem.
Theorem 2.1.11. A linear operator A is dissipative if and only if
(λI − A)x
≥ λ∥x∥, for all x ∈ D(A) and λ > 0.
The next theorem connects dissipative operators with the C0 semigroup of contractions.
Theorem 2.1.12. (Lumer-Phillips). Let A be a linear operator with dense domain D(A) in X.
10
• If A is dissipative and there is a λ0 > 0 such that the range of (λ0I−A) is X, then A is the infinitesimal
generator of a C0 semigroup of contractions on X.
• If A is the infinitesimal generator of a C0 semigroup of contractions on X, then the range of (λ0I −A) is
X for all λ > 0 and A is dissipative. Moreover, for every x ∈ D(A) and every x
∗ ∈ F(x), ℜ⟨Ax, x∗
⟩ ≤
0.
We will finish the sections on semigroups of operators by discussing the mutual continuous dependence between T(t) and its generator A, and introduce a convergence theorem. First, we denote the
infinitesimal generator of a C0 semigroup T(t) as A ∈ G(M, ω), where M and ω are the constants in
theorem 2.1.5. The convergence of the semigroup can be characterized as follows.
Theorem 2.1.13. Let A, An ∈ G(M, ω) and let T(t), Tn(t) be the corresponding semigroups generated.
Then the following are equivalent:
• For every x ∈ X and λ, ℜ(λ) > 0, R(λ, An)x → R(λ, A)x as n → ∞.
• For every x ∈ X and t ≥ 0, Tn(t)x → T(t)x as n → ∞.
The convergence of the semigroup of operators is equivalent to the convergence in the resolvent of
the corresponding generators. The following theorem provides the conditions for converging generators.
Theorem 2.1.14. Let An ∈ G(M, ω). If there exists a λ0 with ℜ(λ0) > ω such that
1. for every x ∈ X, R(λ0, An)x → R(λ0) as n → ∞ and
2. the range of R(λ0) is dense in X.
Then there exists a unique operator A ∈ G(M, ω) such that R(λ0) = R(λ0, A).
Together, we present the following theorem.
11
Theorem 2.1.15. (Trotter-Kato). Let An ∈ G(M, ω) and let Tn(t) be the semigroup whose infinitesimal
generator is An. If for some λ0 with ℜ(λ0) > ω we have
1. for every x ∈ X, R(λ0, An)x → R(λ0) as n → ∞ and
2. the range of R(λ0) is dense in X.
Then there exists a unique operator A ∈ G(M, ω) such that R(λ0) = R(λ0, A). If T(t) is the C0 semigroup
generated by A, then as n → ∞, Tn(t)x → T(t)x for all t ≥ 0 and x ∈ X. The limit is uniform in t for t in
bounded intervals.
Next we look at some functional analysis tools for studying strongly elliptic differential operators.
Again denote X∗
as the dual or antidual space, the space of all antilinear continuous functionals, of a
Banach space X. The difference is not essential in our study but makes more sense when working with
differential operators in Hilbert spaces.
Definition 2.1.16. Let V be an reflexive Banach space, V = V
∗∗, and H a Hilbert space. Suppose that
V ,→ H, and that the embedding is continuous, injective, and dense in H. Then the embedding H ,→ V
∗
is
also continuous, injective, and dense in V
∗
. Altogether we have
V ,→ H ,→ V
∗
,
where both embeddings are continous, injective, and have dense images in H and V
∗
respectively. Such a
scheme is called a Gelfand triple.
Since H is a Hilbert space, by the Riesz’s theorem we have H = H∗
. Hence, the embeddings can be
further expanded as V ,→ H = H∗
,→ V
∗ where the second embedding is more natural. In a Gelfand
triple setup, by definition we have
∥x∥V ∗ ≤ c1∥x∥H ≤ c2∥x∥V for all x ∈ V, c1, c2 constants.
12
Some common pairs of examples that can form a Gelfand triple includes H1
,→ H0
, W1
2
,→ L 2
, and
much more. In addition, we need one more definition to work with strongly elliptical equations and state
the main theorem.
Definition 2.1.17. Let H1, H2 be two Hilbert spaces. A map a(·, ·) : H1 × H2 → R/C is bilinear /
sesquilinear if it is linear in the first variable and linear / antilinear in the second. Moreover, the form is
continuous if
a(x, y)
≤ c∥x∥1∥y∥2, x ∈ H1, y ∈ H2.
Theorem 2.1.18. (Lax-Milgram). Let H be a Hilbert space and a : H × H → R/C be a bilinear /
sesquilinear map, where
1. |a(x, y)| ≤ γ∥x∥H∥y∥H, x, y ∈ H, and
2. |a(x, y)| ≥ δ∥x∥
2
H, x ∈ H,
for γ, δ > 0 constants. Then there exists a unique bijective lienar map A : H → H continuous in both
directions and uniquely determined by a(·, ·), given by
a(x, y) = ⟨Ax, y⟩H, a(A
−1x, y) = ⟨x, y⟩H, x, y ∈ H, (2.1)
with operator norms ∥A∥ ≤ γ and ∥A−1∥ ≤ 1/δ.
Two other relevant definitions are listed below. Let V ,→ H ,→ V
∗ be a Gelfand triple, where V and
H are Hilbert spaces, and let a(x, y) be a bilinear / sesquilinear form on V .
Definition 2.1.19. a(x, y) is V-elliptic if the following conditions are satisfied:
1. |a(x, y)| ≤ c1∥x∥V ∥y∥V for all x, y ∈ V , and
2. |a(x, x)| ≥ c2∥x∥
2
V
for all x ∈ V ,
13
where c1, c2 > 0 constants independent of x, y.
It follows immediately from theorem 2.1.18 that there exists a linear topological isomorphism between
V and V
∗ given in the form presented in (2.1).
Definition 2.1.20. a(x, y) is V-Coercive if the following conditions are satisfied:
1. |a(x, y)| ≤ c1∥x∥V ∥y∥V , c1 > 0, for all x, y ∈ V , and
2. there exists a constant k ∈ C and c2 > 0 with |a(x, y) + k∥x∥
2
H| ≥ c2∥x∥V for all x ∈ V ,
where k, c1, c2 constants independent of x, y.
Recall that from the embedding, |⟨x, y⟩H| ≤ ∥x∥V ∥y∥V . Therefore, a(x, y) being V -coercive implies
that the sesquilinear form a˜(x, y) = a(x, y) + k⟨x, y⟩H is V -elliptic, and the above results apply in this
case. Later in the thesis we will appeal to similar assumptions and arguments to operators and sesquilinear
forms depending on addition parameters.
2.2 Probability and Statistics
We will start by introducing the concepts related to the Delta method with scalar-valued random variables.
Consider a sequence of random variables X1, X2, ..., and another random variable X, then
Definition 2.2.1. Let FX be the distribution function of a random variable X. A sequence of random variables
X1, X2, ... converges in distribution to the random variable X if
limn→∞
FXn
(x) = FX(x),
for all points of continuity of FX,
Convergence in distribution is usually viewed as the weakest form of convergence compared to convergence in measure (probability) or almost sure convergence. In particular, one can easily show that
14
convergence in probability implies convergence in distribution. Once we have established the different
types of convergence we can have with random variables, we can present the central limit theorem.
Theorem 2.2.2. (Central Limit Theorem). Let X1, X2, ... be a sequence of independently and identically
distributed random variables with E[Xi
] = µ and 0 < VAR(Xi) ≤ σ
2 < ∞ for all i. Define Xn =
Pn
i=1 Xi/n, and let N(µ, σ2
) denote the normal random variable with mean µ and variance σ
2
, then
√
n(Xn − µ) → N(0, σ2
),
in distribution as n → ∞.
The intuition behind the central limit theorem is that given independent copies of any random variable,
we can recover normality in the limit. This justifies the assumption we make later for implementing the
delta method in various contexts in regression studies. But, one more tool is needed before we discuss the
delta method.
Theorem 2.2.3. (Slutsky’s Theorem). If Xn → X in distribution and Yn → a, a ∈ R a constant, in
probability. Then
YnXn → aX in distribution,
Xn + Yn → X + a in distribution.
Our goal is to estimate the distribution of functions of random variables provided the asymptotic distribution of the random variables (potentially given by the central limit theorem).
Theorem 2.2.4. (Univariate Delta Method). Let Y1, Y2, ... be sequence of random variables such that
√
n(Yn − θ) → N(0, σ2
) in distribution,
15
for some θ ∈ R and σ
2 > 0. Consider a function g : R 7→ R continuously differentiable at θ, with g
′
(θ) ̸= 0,
then
√
n
g(Yn) − g(θ)
→ g
′
(θ)N(0, σ2
) in distribution.
Proof. (sketch) The result comes from the first-order Taylor approximation of g(Yn). From the expansion,
we have
g(Yn) = g(θ) + g
′
(θ)(Yn − θ) + Reminder.
Since Yn → θ in probability, the higher order terms in the reminder go to 0. Then rearranging the equation
gives
√
n
g(Yn) − g(θ)
=
√
ng′
(θ)(Yn − θ).
The right-hand side of the equation converges in distribution to g
′
(θ)N(0, σ2
) from the assumption and
Theorem 2.2.3 with g
′
(θ) as a constant.
Many generalizations can be made on the standard delta method presented above. For instance, we
can look at a higher-order approximation.
Theorem 2.2.5. (Second-order Delta Method). Let Y1, Y2, ... be a sequence of random variables such that
√
n(Yn − θ) → N(0, σ2
) in distribution,
for some θ ∈ R and σ
2 > 0. Consider a function g : R 7→ R twice continuously differentiable at θ, with
g
′
(θ) = 0 and g
′′(θ) exists and nonzero, then
n
g(Yn) − g(θ)
→ σ
2
g
′′(θ)
2
χ
2
1
in distribution,
where χ
2
1 ∼ Z 2
is the Chi-square distribution with degree of freedom 1, and Z represents the standard
normal distribution.
We may also apply the delta method to random variables of higher dimensions,which will be stated in
later chapters when needed.
2.3 Linear Regression and Least Squares
Consider the linear equation Ax = y, where y ∈ R
m, A ∈ R
m×n
are given, and x ∈ R
n needs to be
solved. In general regression studies, the solution does not exist, but we aim to look for the "best" possible
xˆ such that the cost function, J(x) = ∥Ax − y∥
2
is minimized.
Definition 2.3.1. The optimization problem given by
xˆ = arg min
x∈Rn
∥Ax − y∥
2
,
is called a least squares problem.
The solution is given by solving the following equation.
Definition 2.3.2. The equation A⊺Axˆ = A⊺y is called the normal equation.
The problem can be interpreted alternatively as finding the projection of y in the column space C(A)
of A. Then it must be the case that ⟨Ax, y ˆ − Axˆ⟩ = 0 or ⟨x, A ˆ
⊺y − A⊺Axˆ)⟩ = 0, where the second half
gives the normal equation. The solution to the normal equation is given by xˆ = (A⊺A)
−1A⊺y.
Definition 2.3.3. The matrix (A⊺A)
−1A⊺
is called the pseudo-inverse or left-inverse of A.
We note the while xˆ given above minimizes the selected error metric J(x), the matrix (A⊺A) might
not be invertible and needs additional care, or regularization.
17
2.4 Differentiation in Banach Spaces
In this section, we will discuss the definition of the Fréchet derivative of an operator on general Banach
spaces and various rules for calculating it as analogs to the standard differentiation techniques in Euclidean
spaces. Let X and Y be Banach spaces with corresponding norms ∥ · ∥X and ∥ · ∥Y , and denote L(X, Y )
as all continuous linear operators from X to Y . Consider x, h ∈ X and U an open neighborhood of x, and
let F : U 7→ Y be the function of interest.
Definition 2.4.1. The Gâteaux variation of F at x in the direction of h is given by
δF(x; h) = ϕ
′
(0),where ϕ(λ) : λ 7→ F(x + λh) is differentiable at 0.
If there exists an operator A ∈ L(X, Y ) such that δF(x; h) = Ah, then A is called the Gâteaux derivative
of F at x, which is an analog to the directional derivative of functions in Euclidean spaces.
Definition 2.4.2. If there exists an operator A ∈ L(X, Y ) such that
lim
∥h∥X→0
F(x + h) − F(x) − Ah
Y
∥h∥X
= 0.
Then F is said to be Fréchet differentiable and A is the Fréchet derivative of F at x.
The Fréchet derivative is an analog to the derivative of functions in Euclidean spaces, and various
rules for calculation still hold under this definition. Two of the useful differentiation rules similar to their
familiar counterparts from elementary calculus will be presented here for later reference.
Lemma 2.4.3. (Fréchet Derivative of an Inner Product). Let H be a Hilbert space equipped with the usual
inner product ⟨·, ·⟩H. Consider A, B : X 7→ H bounded and continuous mappings from the Banach Space X
18
with norm ∥ · ∥X to the Hilbert space, and let DA(x) = DxA(x), DB(x) = DxB(x) be the corresponding
Fréchet derivatives at x ∈ X. Then for fixed x ∈ X,
lim
∥h∥X→0
⟨A(x + h), B(x + h)⟩ − ⟨A(x), B(x)⟩ − ⟨DA(x)h, B(x)⟩ − ⟨A(x), DB(x)h⟩
∥h∥X
= 0. (2.2)
In other words, the Fréchet derivative of the inner product ⟨A(x), B(x)⟩ at x ∈ X is given by
D⟨A(x), B(x)⟩h = ⟨DA(x)h, B(x)⟩ + ⟨A(x), DB(x)h⟩ ∈ C or R,
for h ∈ X admissible, formally at least, as in the standard calculus.
Another useful property is a version of the product rule (or, depending on how it is viewed, the chain
rule) for Fréchet derivatives.
Lemma 2.4.4. (Fréchet Derivative of a Product or Composition). Let X and Y be Banach spaces occupied
with corresponding norms ∥ · ∥X and ∥ · ∥Y . Consider two bounded and continuous operators A(x) : X 7→
L(Y, Y ) and B(x) : X 7→ Y , and define C(x) := A(x)B(x) : X 7→ Y . Denote DA(x), DB(x) to be the
corresponding Fréchet derivative of A and B respectively. Then for admissible h ∈ X, the Fréchet derivative
of C at x ∈ X is given by
DC(x)h = (DA(x)h)(B(x)) + A(x)(DB(x)h) ∈ Y,
for admissible h ∈ X.
Proof. We begin with Lemma 2.4.3. It suffices to show that the limit is less than or equal to 0. Rearranging
the numerator in equation (2.2) gives the following result.
19
⟨A(x + h), B(x + h)⟩ − ⟨A(x), B(x)⟩ − ⟨DA(x)h, B(x)⟩ − ⟨A(x), DB(x)h⟩
≤
⟨A(x + h), B(x + h)⟩ − ⟨A(x), B(x + h)⟩ − ⟨DA(x)h, B(x + h)⟩
(2.3)
+
⟨A(x), B(x + h)⟩ − ⟨A(x), B(x)⟩⟩ − ⟨A(x), DB(x)h⟩
(2.4)
+
⟨DA(x)h, B(x + h)⟩ − ⟨DA(x)h, B(x)⟩
. (2.5)
We will show that the limit of each of these three parts is 0. Indeed, for equation (2.3),
lim
∥h∥X→0
⟨A(x + h), B(x + h)⟩ − ⟨A(x), B(x + h)⟩ − ⟨DA(x)h, B(x + h)⟩
∥h∥X
= lim
∥h∥X→0
⟨A(x + h) − A(x) − DA(x)h, B(x + h)⟩
∥h∥X
= lim
∥h∥X→0
⟨
A(x + h) − A(x) − DA(x)h
∥h∥X
, B(x + h)⟩
≤ lim
∥h∥X→0
A(x + h) − A(x) − DA(x)h
H
∥h∥X
∥B(x + h)∥H
= 0 ∗ ∥B(x)∥H
= 0.
20
The last line follows from the boundedness of B and the definition of DA(x) being the Fréchet derivative of A at x. Similarly, for equation (2.4), we have
lim
∥h∥X→0
⟨A(x), B(x + h)⟩ − ⟨A(x), B(x)⟩ − ⟨A(x), DB(x)h⟩
∥h∥X
= lim
∥h∥X→0
⟨A(x), B(x + h) − B(x) − DB(x)h⟩
∥h∥X
= lim
∥h∥X→0
⟨A(x),
B(x + h) − B(x) − DB(x)h
∥h∥X
⟩
≤ lim
∥h∥X→0
∥A(x + h)∥H
B(x + h) − B(x) − DB(x)h
H
∥h∥X
= ∥A(x)∥H ∗ 0
= 0.
Finally, equation (2.5) follows from the continuity of B
lim
∥h∥X→0
⟨DA(x)h, B(x + h)⟩ − ⟨DA(x)h, B(x)⟩
∥h∥X
≤ lim
∥h∥X→0
∥DA(x)∥L(X,H)∥B(x + h) − B(x)∥H
= 0.
For Lemma 2.4.4, similar to the previous proof, it suffices to show that
lim
∥h∥X→0
A(x + h)B(x + h) − A(x)B(x) − DA(x)hB(x) − A(x)DB(x)h
Y
∥h∥X
≤ 0.
Again we will break down the numerator into three parts,
21
A(x + h)B(x + h) − A(x)B(x) − DA(x)hB(x) − A(x)DB(x)h
Y
≤
A(x + h)B(x + h) − A(x)B(x + h) − DA(x)hB(x + h)
Y
(2.6)
+
A(x)B(x + h) − A(x)B(x) − A(x)DB(x)h
Y
(2.7)
+
DA(x)hB(x + h) − DA(x)hB(x)
Y
. (2.8)
First, using the definition of DA(x) and boundedness of B, for equation (2.6) we have
lim
∥h∥X→0
A(x + h)B(x + h) − A(x)B(x + h) − DA(x)hB(x + h)
Y
∥h∥X
≤ lim
∥h∥X→0
A(x + h) − A(x) − DA(x)h
L(Y,Y )
∥B(x + h)∥Y
∥h∥X
= 0 ∗ ∥B(x)∥Y
= 0.
For equation (2.7), Since A(x) is a linear operator on B(x), it follows directly that
lim
∥h∥X→0
A(x)B(x + h) − A(x)B(x) − A(x)DB(x)h
Y
∥h∥X
= lim
∥h∥X→0
A(x)
B(x + h) − B(x) − DB(x)h
Y
∥h∥X
≤ lim
∥h∥X→0
A(x)
L(Y,Y )
∥B(x + h) − B(x) − DB(x)h∥Y
∥h∥X
=
A(x)
L(Y,Y )
∗ 0
= 0.
2
Finally, inequality for equation (2.8) follows from the fact that DA(x)h is a linear operator on B(x)
and the continuity of B
lim
∥h∥X→0
DA(x)hB(x + h) − DA(x)hB(x)
Y
∥h∥X
= lim
∥h∥X→0
DA(x)h
B(x + h) − B(x)
Y
∥h∥X
≤ lim
∥h∥X→0
DA(x)
X,L(Y,Y )
∥h∥X
B(x + h) − B(x)
Y
∥h∥X
= 0.
2
Chapter 3
A Single Input Single Output Model for the Transdermal Transport of
Ethanol
We model the transdermal transport of alcohol using a linear, hybrid, single input single output (SISO)
system. In the present context, hybrid refers to a system that involves both ordinary (ODE) and partial
(PDE) differential equations. The input is blood alcohol concentration (BAC) or its surrogate, breath alcohol
concentration (BrAC) and the output is the biosensor measured transdermal alcohol concentration (TAC).
For t > 0 and 0 < η < 1, our model takes the form
∂x
∂t (t, η) = q1
∂
2x
∂η2
(t, η),
dw
dt (t) = q1
∂x
∂η (t, 0), (3.1)
y(t) = w(t),
with x(t, 0) = w(t), x(t, 1) = q2u(t), w(0) = w0, and x(0, η) = φ0(η). The inflow and outflow dynamics
of the sensor are described by the ODE in the form of a compartment model. It is coupled through a flux
term at the boundary of the skin with a parabolic PDE in the form of a one dimensional diffusion equation.
It relies on a Fickian model for the transport and mixing of ethanol molecules within the interstitial fluid
between the cells in the skin’s epidermal layer. There is no active blood supply to the epidermal layer
24
which is comprised of both living and dead cells. The interstitial fluid serves to provide nourishment to
the living cells in the epidermal layer via transport from the adjacent deeper dermal layer of the skin which
does have an active blood supply.
The variables in the system (3.1) and their definitions are: x(t, η), the concentration of ethanol in the
epidermal layer at time t and dimensionless depth η ∈ [0, 1] and w(t), the ethanol concentration in the
biosensor’s collection reservoir. The pair x(t, η) and w(t) together describe the state of the system and
are referred to as the state variables. The system output, y(t), is the observed biosensor’s TAC signal in
the form of an electric current. In our model we consider it to be proportional to w(t). We note that there
is no loss of generality in setting the constant of proportionality to one since the system being linear, this
constant is simply absorbed into the gain that multiplies the system input, u(t), the BAC or BrAC signal.
The model has two in general, unmeasurable, physiologically-, hardware-, and environmentally-dependent
parameters, q = [q1, q2]. The dimensionless parameter, q1, is essentially the diffusivity of ethanol in the
interstitial fluid in the epidermal layer and the dimensionless parameter q2 is the input gain; it captures
the combined or effective impedance of alcohol between the epidermal and dermal layers and between the
epidermal layer and the membrane that covers the collection reservoir of the biosensor where it comes in
contact with the outer surface of the skin’s epidermal layer or the stratum corneum.
Although the transdermal transport of ethanol and correspondingly, our model, are continuous time,
the processors and electronics are digital. Consequently, we convert our model to sampled or discrete
time. In addition, we note that the input to the system (3.1) is on the boundary of the spatial domain [0, 1].
When this is the case, converting the model to state-space form for analysis and approximation can be
somewhat delicate and mathematically technical. Fortunately, when the system is converted to discrete
time, the input operator becomes bounded [18].
Toward this end, we assume zero-order hold input of the form, u(t) = uk, t ∈ [kτ,(k + 1)τ ), k =
0, 1, 2, . . . , K where T = Kτ . We set xk = xk(η) = x(kτ, η), wk = w(kτ ), and yk = y(kτ ), k =
25
0, 1, . . . , K. Letting v(t, η) = x(t, η) − ξ(η)uk with ξ(η) = q2η, for 0 < η < 1, and kτ ≤ t < (k + 1)τ ,
each k = 0, 1, 2, . . . , K, the system (3.1) now becomes
∂v
∂t(t, η) = q1
∂
2v
∂η2
(t, η),
dw
dt (t) = q1
∂v
∂η (t, 0) + q2uk,
v(t, 0) = w(t), v(t, 1) = 0, (3.2)
v(kτ, ·) = x(kτ, ·) − ξ(·)uk = xk − ξuk
yk = w(kτ ).
Henceforth, this is form of our model for transdermal transport of ethanol that we will be working with.
In the subsequent chapters we develop our abstract approximation framework and convergence theory
for quantifying the propagation of the uncertainty in the model parameters, q, through the dynamical
system given in (3.2) to the system output signal, or TAC, yk, and to deconvolved estimates of the input
signal, or BrAC, uk. We do this by using linear semigroup theory to refomulate both the continuous time
system (3.1) and the discrete or sampled time system, (3.2), in infinite dimensional state space form. To
facilitate numerical computations, we rely on finite dimensional approximations to the system (3.2) and
argue convergence with the aid of the abstract approximation theory for linear semigroups. After we have
developed our abstract framework and approximation and convergence theory, in Chapter 8 we will see
how to apply it to the system given in (3.2).
26
Chapter 4
The Abstract Evolution System and Finite Dimensional Approximation
and Convergence
In this chapter, we begin with the class of infinite-dimensional evolution systems that are used to describe
the diffusion model relevant to our problem of interest. Then we transform the system into a discrete-time
convolution and finish with their finite-dimensional approximation alongside the theory regarding their
convergence.
4.1 The Continuous-Time Infinite Dimensional Evolution System
The setting for our abstract framework is a Gelfand triple of state spaces, V ,→ H ,→ V
∗
, where V and
H are arbitrary Hilbert spaces that satisfy the requirement that V is continuously and densely embedded
in H. Denote the norms on H and V by | · |H and ∥ · ∥V , respectively. We further define the admissible
parameter space Q as a compact subset of r-dimensional Euclidean space with respect to a metric dQ,
and let a (q; ·, ·) : V × V → R denotes a bilinear form with parameter q ∈ Q. We make the following
assumptions on a (q; ·, ·):
Assumption 4.1.1. (Boundedness) There exists a constant α0 > 0 such that ∀q ∈ Q, ∀φ, ψ ∈ V , |a(q; φ, ψ)| ≤
α0∥φ∥V ∥ψ∥V .
27
Assumption 4.1.2. (Coercivity) There exist constants λ0 ∈ R and µ0 > 0 such that ∀q ∈ Q, ∀φ ∈ V ,
a(q; φ, φ) + λ0|φ|
2
H ≥ µ0∥φ∥
2
V
.
Assumption 4.1.3. (Continuity) For q, qe∈ Q, |a(q; φ, ψ) − a(˜q; φ, ψ)| ≤ dQ(q, q˜)∥φ∥∥ψ∥, ∀φ, ψ ∈ V .
Note that all constants α0, λ0, and µ0, do not depend on q, for q ∈ Q. While the above assumptions will
allow us to construct the analytic semigroup of operators, we will need one more assumption to discuss
its derivatives:
Assumption 4.1.4. (Affinity) The map q → a (q; ·, ·) is affine in the sense that a (q; φ, ψ) = a0 (φ, ψ) +
a1 (q; φ, ψ), ∀φ, ψ ∈ V , where a0 is independent of q and a1 is linear in q.
Under Assumptions 4.1.1-4.1.3, we define a bounded linear operator A(q) : V → V
∗
as follows. For all
φ, ψ ∈ V , A(q)φ satisfies the equation −a(q; φ, ψ) = ⟨A(q)φ, ψ⟩V ∗,V . It can be shown that, restricting
the range of A(q) to H, the set
Dom(A(q)) = {φ ∈ V : A(q)φ ∈ H},
consists of the infinitesimal generator of a uniformly exponentially stable analytic semigroup of bounded
linear operators on H, written as {T(q;t) : t ≥ 0} = {e
A(q)t
: t ≥ 0}, if λ0 ≤ 0. Moreover, we can further
extend or restrict the semigroup in V and V
∗
, respectively, by modifying Dom(A(q)) accordingly[52, 3].
The continuous state-space system can now be described as
x˙(t) = A(q)x(t) + B(q)u(t), x(0) = x0, y(t) = C(q)x(t), 0 < t ≤ T, (4.1)
28
where x0 ∈ H is the given initial condition. u(·) : [0, T] → R
m and y(·) : [0, T] → R
p
are the inputs and
outputs of the evolution system. B(q) ∈ L(R
m, H) and C(q) ∈ L(H, R
p
) are continuous maps from Q
into their respective co-domains of operators.
4.2 The Discrete-Time Convolution Formulation
Throughout this thesis we will focus primarily on the discrete-, or sampled-time system. Let τ > 0 denote
the sampling interval and let K ∈ Z
+ denote the number of sampling intervals. For q ∈ Q, similar to
B(q) and C(q), we denote the discrete-time operators as Bˆ(q) ∈ L(R
m, H) and Cˆ(q) ∈ L(H, R
p
) and
assume that all mappings are continuous with respect to q. Furthermore, we consider the zero-order hold
input of the form u(t) = uk, t ∈ [kτ,(k + 1)τ ), uk ∈ R
m, k = 0, 1, 2, ..., K − 1. Setting x0 ∈ H be
the same initial condition as in the continuous-time system, the discrete- or sampled-time linear evolution
input/output system in H given by
xk+1 = Aˆ(q)xk + Bˆ(q)uk, yk = Cˆ(q)xk, (4.2)
k = 0, 1, 2, ...K,where Aˆ(q) = T(q; τ ) = e
A(q)τ ∈ L(H, H).
Typically, the sampled-time system will be derived directly from a continuous-time system stated in
the previous section, where xk = x(kτ ) and yk = y(kτ ). We can compute, through the means of the
variation of parameters formula, that Bˆ(q) = R τ
0
T(q; s)B(q)ds. Moreover, when A(q) is invertible or
a(q; , ·, ·) is elliptic (that is, λ0 ≤ 0 in Assumption 4.1.2), we can further reduce the equation to Bˆ(q) =
R τ
0
T(q; s)B(q)ds = A(q)
−1
(T(q; τ ) − I). However, we have chosen here to leave the precise definition
of Bˆ(q) unspecified since in the case of certain distributed parameter models (e.g. a parabolic partial
differential equation) with the input u on the boundary of the spatial domain, the appropriate choice of
29
Bˆ(q) is somewhat more technical, as we will demonstrate with a numerical example later. In general, we
have Cˆ(q) = C(q).
Assuming x0 = 0, the output sequence {yk}
K
k=1 in (4.2) can be written as a discrete-time convolution
of the input {uk}
K−1
k=o with a convolution filter or impulse response function (IRF) as
yk =
X
k−1
j=0
hk−j−1(q)uj k = 1, 2, ..., K. (4.3)
where hi(q) = Cˆ(q)Aˆ(q)
iBˆ(q) ∈ R
p×m, i = 0, 1, ...K − 1. Here we note that while each hi(q) involves
infinite dimensional operators on or into the Hilbert space H, on the face of it hi(·) is a matrix-valued
function mapping from the parameter space R
r
to R
p×m. One can expect that under some conditions, we
may consider hi(·) as a standard vector-valued function and discuss its regularity properties, in particular,
the derivatives.
4.3 Finite Dimensional Approximation and Convergence for the DiscreteTime System
In practice, the infinite-dimensional system is not suitable for numerical computation. As a result, we
develop a finite dimensional approximation framework for the system using a standard Galerkin approach
as described in [3]. For each N = 1, 2, ..., we construct a sequence of N-dimensional subspaces V
N =
span{φ
N
i
}
N
i=0 with the property that for any v ∈ V , there exists a sequence {v
N : v
N ∈ V
N }∞
1
such that
||v
N − v||V → 0 as N → ∞. (4.4)
Consider the orthogonal projection of H onto V
N , P
N : H → V
N . For commonly used Galerkin bases
(e.g. linear and cubic spline functions) it can be argued that P
N converges strongly to the identity on
30
H, and moreover, using the derivative approximating properties of these functions, then (4.4) holds with
the approximating sequence v
N = P
N v. We define our approximations AN (q) : V
N → V
N to the
operator A(q) on the subspace V
N by < AN (q)φ
N , ψN >= −a(q; φ
N , ψN ), for φ
N , ψN ∈ V
N . The
finite-dimensional discrete-time evolution system is then given by
x
N
k+1 = AˆN (q)x
N
k + BˆN (q)uk,
y
N
k = CˆN (q)x
N
k
, k = 0, 1, 2, ...
where AˆN (q) = e
AN (q)τ ∈ L(V
N , V N ), BˆN (q) ∈ L(R
m, V N ), and CˆN (q) = Cˆ(q) ∈ L(V
N , R
p
). It
then follows from the coercivity assumption 4.1.2 and the compactness of the parameter spaceQ that there
exists a uniform exponential bound on AˆN = e
AN (q)t
. Furthermore, it can be argued as in [3] that the
resolvent operator of AN (q), RN (λ; q) P
N = (λI − AN (q))−1P
N , converges strongly to the resolvent
operator of A(q), R (λ; q) = (λI − A(q))−1
, in the sense that for u ∈ H, RN (λ; q) P
N u → R (λ; q) u in
the V -norm, and consequently the H-norm, as N → ∞. Applying the Trotter-Kato theorem from linear
semigroup theory [40], we conclude that AˆN (q) = e
AN (q)τP
N converges strongly to Aˆ(q) = T(q; τ ) on
H uniformly in q for q ∈ Q. Since the precise definitions for Bˆ and Cˆ have not been given, we assume
that BˆN (q) and CˆN (q) have been defined so that they converge strongly to Bˆ(q) and Cˆ(q), respectively,
uniformly in q for q ∈ Q. We note that as a result of the assumptions that the input and output spaces are
finite dimensional, these operators are finite rank and therefore in fact converge in the uniform operator
topologies.
As in the infinite-dimensional case, we have
y
N
k =
X
k−1
j=0
C
N (q)AˆN (q)
k−j−1BˆN (q)uj
=
X
k−1
j=0
h
N
k−j−1
(q)uj , k = 1, 2, ...K,
(4.5)
31
where h
N
i
(q) = C
N (q)AˆN (q)
iBˆN (q), i = 0, 1, 2, ...K − 1, with h
N
i
(q) converging to hi(q) in R
p×m,
uniformly in q for q ∈ Q and i in {0, 1, 2, ..., K − 1} and y
N
k
converging to yk in R
p
, uniformly in q for
q ∈ Q and k in {1, 2, ..., K}.
32
Chapter 5
The Fréchet Derivative of Analytic Semigroup, its Finite Dimensional
Approximation and Convergence
In this chapter, we will discuss the regularity and Fréchet differentiability of the analytic semigroup {T(q;t) :
t ≥ 0} constructed in the previous chapter with respect to the parameter q ∈ Q, which will ultimately
lead to the differentiability of the impulse response functions hi(q) that will be the building blocks of our
subsequent discussion. In addition, we will explore the convergence of such derivatives, in the standard
sense, of the finite-dimensional approximation described at the end of the previous chapter. The impulse
response function hi(q) involves infinite-dimensional operators defined on a Hilbert space H, which do
not process derivatives in the sense of standard elementary calculus. Alternatively, we will consider the
Fréchet derivative described in Chapter 2.3.
5.1 The Fréchet Derivative of the Analytic Semigroup
In this chapter, we will follow the roadmap presented in [46] to demonstrate the existence of the Fréchet
derivative under assumptions made in 4.1, and further extend the method to the finite-dimensional approximating operators in 4.3.
33
Recall that we define the operator A(q) based on a sesquilinear form a (q; ·, ·) : V × V → C with
parameter q ∈ Q. Under assumptions 4.1.1-4.1.4, we define a related bounded linear operator A1(δq) :
V → V
∗
restricted to the linear component in assumption 4.1.4 for each δq ∈ ∆Q, where
∆Q :=
δq ∈ Q| δq = q − q, e for some q, qe∈ Q
,
such that −a1(q; φ, ψ) = ⟨A1(q)φ, ψ⟩V ∗,V , ∀φ, ψ ∈ V . According to assumption 4.1.4, the derivative of A
or the sesquilinear operator a (q; ·, ·), if it exists, will only depend on the linear part A1 or a1, respectively.
Towards this end, we define an obtuse sector in C related to the constants given in the assumptions
4.1.1 - 4.1.3.
Definition 5.1.1. Fix γ ∈ (0, 1) constant. For α0, µ0, and λ0 the constants in Assumptions 4.1.1 and 4.1.2,
let α1 = 1 + α0/µ0 and θγ = tan−1
α1/(1 − γ)
. Define the obtuse sector
Σγ =
λ ∈ C| arg(λ − λ0)| ≤ π
2
+ θγ
.
The sector, independent of q, is a subset of the resolvent set ρ(A(q)) for all q ∈ Q, and more importantly, the resolvent, R (λ; q) =
λI − A(q)
−1
, is uniformly bounded for all λ ∈ Σγ in the sense that
∥R(λ, q)∥V ∗,V ≤ 1/(µ0γ)[46].
The sector Σγ serves as a domain for the resolvent R(λ; q) to have a uniformly norm-convergent
power series representation in terms of A(q) and more specifically A1(δq) in the neighborhood where
∥δq∥Q ≤ µ0γ/α0 [46]. The Fréchet derivative DqR(λ, q) acting on δq can thus be deduced as a linear
map from Q to L(V
∗
, V ). Moreover, for fixed γ ∈ (0, 1) and q ∈ Q, we can further decompose the
derivative by DqR(λ, q)δq = R(λ, q)A1(δq)R(λ, q) [46].
With the assumptions 4.1.1 - 4.1.4 we have made about the system, we can write the operator T(q;t)
using the inverse Laplace transform as a contour integral [40]
T(q;t) = 1
2πi Z
∂Σγ
e
λtR(λ, q) dλ for t > 0 and q ∈ Q.
The expression combined with the uniform boundedness and analyticity of the resolvent, R(λ, q),
shown above, one can conclude [46] with the dominated convergence theorem that the Fréchet derivative
of operator T(q;t) acting on δq exists and is given by
DT(q;t)δq =
1
2πi Z
∂Σγ
e
λtDqR(λ, q)δqdλ (5.1)
=
1
2πi Z
∂Σγ
e
λtR(λ, q)A1(δq)R(λ, q)dλ.
Following the definitions of the approximating semigroup in 4.3 and in [3], we can argue that Σγ
is contained in the resolvent set ρ(AN (q)) and that for q ∈ Q, the resolvent operator RN (λ, q) is uniformly bounded for all λ ∈ Σγ. Again, we choose to express the operators T
N (q;t) in its contour integral form[40], even though in the finite-dimensional case it possess standard elementary total and partial
derivatives. Setting
T
N (q;t) = 1
2πi Z
∂Σγ
e
λtR
N (λ, q) dλ for t > 0 and q ∈ Q.
35
We can further define the linear component of the generator AN
1
(δq) and follow the steps presented for
the infinite-dimensional case, along with the dominated convergence theorem and uniform boundedness
of the resolvent operators, to conclude that
DqT
N (q;t)δq =
1
2πi Z
∂Σγ
e
λtDqR
N (λ, q)δq dλ (5.2)
=
1
2πi Z
∂Σγ
e
λtR
N (λ, q)A
N
1
(δq)R
N (λ, q)dλ.
5.2 Evaluation of the Fréchet Derivative
At this point we spend a moment discussing how the Fréchet derivative of the finite dimensional approximations to the analytic semigroup, {T(q;t) : t ≥ 0}, presented in 5.1 can actually be computed in practice.
Indeed, the contour integral may not be convenient for the actual computation. One of the alternatives
presented in [46] is treating the derivative as a solution to a modified differential equation. For fixed q ∈ Q
and admissible δq, let x(q;t) be the solution to the abstract initial value problem
x˙(q;t) = A(q)x(q;t), 0 < t ≤ T, x(q; 0) = x0, (5.3)
in the sense that x(q;t) = T(q;t)x0. Formally, denoting w = w(q; δq;t) = Dqx(q;t)δq = DqT(q;t)x0δq,
and applying the product rule Lemma 2.4.4 along with assumption 4.1.4 to w˙ (q; δq;t), we obtain another
initial value problem
w˙ (q; δq;t) = A(q)w(q; δq;t) + A1(δq)x(q;t),
w(q; δq; 0) = 0.
36
The variation of constants formula then yields
w(q; δq;t) = Z t
0
T(q;t − s)A1(δq)x(q; s)ds. (5.4)
In the case of the finite-dimensional approximations, as noted in [28], the operators are replaced with
their corresponding matrix exponential representations. The derivative DqT
N (q; τ )δq =
∂AˆN (q)
∂q δq can
be obtained as a tensor in the following form. For t ≥ 0, set Φ
N (t; q) = e
AN (q)t
; differentiation then
yields Φ˙ N (t; q) = AN (q)ΦN (t; q), Φ
N (0; q) = I. Then, setting ΨN (t; q) = ∂ΦN (t;q)
∂q , differentiating the
previous expression with respect to q, and interchanging the order of differentiation, we obtain Ψ˙ N (t; q) =
AN (q)ΨN (t; q) + ∂AN
1
(q)
∂q Φ
N (t; q), with ΨN (0; q) = 0. Combining these two equations and solving the
resulting system, we obtain
ΨN (t; q)
Φ
N (t; q)
= exp
AN (q) ∂AN
1
(q)/∂q
0 AN (q)
t
0
I
.
Then, setting t = τ we have that DqT
N (q; τ )δq =
∂AˆN (q)
∂q δq = ΨN (τ ; q)δq.
5.3 Convergence of the Finite Dimensional Approximating Derivatives
Under assumptions 4.1.1 - 4.1.4, we were able to prove the following theorem [28]
Theorem 5.3.1. For every φ ∈ H, DqT
N (q;t)P
N φ → DqT(q;t)φ in the V
∗
-norm as N → ∞, uniformly
in t for t > 0.
Proof. Our argument is based on the expression given in (5.1) and (5.2). From [46], we have that for all λ
in the sector Σγ, both the resolvent operator and the linear component A1 are both uniformly bounded in
q and δq, respectively, with ∥R(λ, q)∥V ∗,V ≤ 1/(µ0γ) and ∥A1(δq)∥V,V ∗ ≤ 2α0. Theorem 2.2 from [3]
37
asserts that for any φ ∈ H, RN (λ, q)P
N φ → R(λ, q)φ in the V-norm, and convergence from AN (q) to
A(q) ensures the convergence of AN
1
to A1. As a result, we have that for any φ ∈ H,
R
N (λ, q)A
N
1
(δq)R
N (λ, q)P
N φ → R(λ, q)A1(δq)R(λ, q)φ,
for any admissible q and δq, λ ∈ Σγ in the V-norm. Moreover, Lemma 3.6.1 in [52] guarantees that for every
φ ∈ H or V
∗
, ∥R(λ, q)φ∥V ∗ ≤
α1
|λ|
∥φ∥V ∗ where α1 is defined in Definition 5.1.1. Consequently, we see that
for any φ ∈ H, the integrand in (5.1) is bounded above by ∥R(λ, q)A1(δq)R(λ, q)φ∥V ∗ ≤
2α0α
2
1
|λ|
2 ∥φ∥V ∗ .
The same bound can be applied to the approximation applied to the projection P
N φ. The 1/|λ|
2
in the
upper bound asserts that the integral will be bounded over the boundary of the sector where the real part of
λ extends to negative infinity. Taking the limit as N → ∞ for equation (5.2) and applying the Dominated
Convergence Theorem, we obtain that for admissible q and δq, any t > 0, and any φ ∈ H and thus in V
∗
,
lim
N→∞
DqT
N (q;t)P
N φ = lim
N→∞
1
2πi Z
∂Σγ
e
λtR
N (λ, q)A
N
1
(δq)R
N (λ, q)P
N φdλ
=
1
2πi Z
∂Σγ
lim
N→∞
e
λtR
N (λ, q)A
N
1
(δq)R
N (λ, q)P
N φdλ
=
1
2πi Z
∂Σγ
e
λtR(λ, q)A1(δq)R(λ, q)φdλ
= DqT(q;t)φ in the V
∗
-norm uniformly in t > 0.
An alternative argument can be given using the characterization of the Fréchet derivative given in
Chapter 5.2. Indeed, using the fact that {T(q;t) : t ≥ 0} is also a semigroup on V
∗
[52], the bounds for
the semigroup given in Lemma 3.6.2 of [52] yield that
T(q;t − s)A1(δq)T(q; s), T N (q;t − s)A
N
1
(δq)T
N (q; s)P
N ∈ L(H, V ∗
)
38
with norm less than L2
t
1
2
, for some L2 > 0. The Trotter-Kato Theorem [40] and the Dominated Convergence
Theorem then yield the desired result.
5.4 Derivative of the Impulse Response Function
In this chapter, we will combine all our previous results to obtain the derivative of the impulse response
function hk(q) for all k = 0, 1, ..., K − 1, the components of which can be interpreted as the response
of a particular output channel to a unit impulse input in a particular input channel. Since the precise
definitions of the operators Bˆ(q) and Cˆ(q) are not yet been specified at this stage, for now we assume that
they are Fréchet differentiable with respect to the parameter q for the sake of generalization. In the specific
application to the Alcohol Biosensor problem, we will show that these operators and their derivatives can
be explicitly computed.
Recall that we denote hk(q) = Cˆ(q)Aˆ(q)
kBˆ(q) ∈ R
p×m, k = 0, 1, ...K − 1. Here we summarize our
results in [28, 29] and present the following theorem.
Theorem 5.4.1. Assume that Bˆ(q) and Cˆ(q) are Fréchet differentiable with respect to the parameter q and
that the assumptions 4.1.1 - 4.1.4 are satisfied. Then hk(q) is differentiable with respect to q ∈ Q for all
k = 0, 1, ..., K − 1. Let Dhk(q) denote the tensor derivative, and let DhN
k
(q) be the derivative of the
corresponding finite-dimensional approximation h
N
k
(q) defined in Chapter 4.3. Further assume that the Fréchet
derivatives of Bˆ(q) and Cˆ(q) have the corresponding converging approximations in H and V respectively.
Then DhN
k
(q) converges uniformly in q to Dhk(q) for all k = 0, 1, ..., K − 1.
Proof. To begin the proof, we rely on the fact that the operator Bˆ(q) maps R
m into H and rewrite it as an
m-dimensional row vector of elements in the Hilbert space H, Bˆ(q) =
ˆb1(q), ...,
ˆbm(q)
,
ˆbj (q) : Q 7→
H, ∀j = 1, ..., m. Similarly, we use the fact the operator Cˆ(q) maps into R
p
and the Riesz Representation
39
Theorem to rewrite it as a p-dimensional column vector Cˆ(q) =
cˆ1(q), ..., cˆp(q)
⊺
, cˆi(q) : Q 7→ H, ∀i =
1, ..., p. It then suffices to show that the derivative for the (i, j)-th entry of hk(q) exists and has a converging approximation. The (i, j)-th entry of hk(q), which is scalar-valued, can be represented in terms of an
inner product in H of the following form
h
i,j
k
(q) = D
cˆi(q), Aˆk
(q)
ˆbj (q)
E
H
=
D
cˆi(q), T(q; kτ )
ˆbj (q)
E
H
,
where τ is the length of the sampling interval. The result then follows from the semigroup property
of the operators T(q; kτ ) = Aˆk
and the representation theorem. Using Lemma 2.4.3 and 2.4.4, and the
result from Chapter 5.1 regarding the analytic semigroup, we can explicitly compute this derivative. Let
the Fréchet derivatives of cˆi(q) and ˆbj (q) be given by cˆ
′
k
(q) and ˆb
′
j
(q) respectively, and denote the Fréchet
derivative of Aˆ(q) = T(q; τ ) as DqT(q; τ ). For admissible δq, we have
Dqh
i,j
k
(q)δq =
D
cˆ
′
i
(q)(δq), T(q; kτ )
ˆbj (q)
E
H
+
D
cˆi(q), DqT(q; kτ )(δq)
ˆbj (q)
E
H
+
D
cˆi(q), T(q; kτ )
ˆb
′
j
(q)(δq)
E
H
.
Here we note that the h
i,j
k
(q) : Q 7→ R can be viewed as a multivariate scalar-valued function with
directional derivative in the classical sense. On the other hand, cˆ
′
i
(q) and ˆb
′
j
(q) are mappings from Q to H,
and DqT(q; kτ )δq ∈ L(H, H) is understood as a linear operator on H applying to ˆbj (q). We can recover
4
the classical partial derivatives by applying specific δq in the Euclidian space, namely the canonical basis
of R
r
.
∂hi,j
k
(q)
∂qs
= Dqh
i,j
k
(q)es, s = 1, 2, ..., r, (5.5)
es = (0, .., 0, 1, 0, ..0) ∈ R
r
denotes the canonical basis.
Next, we consider the finite-dimensional approximation discussed in Chapter 4.3. To argue convergence, we further require that Cˆ(q) ∈ L(V, R
p
) and thus each of the cˆi(q) now maps from Q into V .
Recall in 4.3 we construct the approximation in V
N → V , and thus the additional requirement on cˆi(q)
does not change the approximation process. Furthermore, it allows us to replace the inner product of H
with the duality pairing between V and V
∗
, ⟨·, ·⟩V,V ∗ . In particular, we can then rewrite our derivative as
Dqh
i,j
k
(q)δq =
D
cˆ
′
i
(q)(δq), T(q; kτ )
ˆbj (q)
E
V,V ∗
(5.6)
+
D
cˆi(q), DqT(q; kτ )(δq)
ˆbj (q)
E
V,V ∗
(5.7)
+
D
cˆi(q), T(q; kτ )
ˆb
′
j
(q)(δq)
E
V,V ∗
. (5.8)
In the finite-dimensional case, derivatives of CˆN , BˆN , and AˆN can be found in the classical sense as
matrix derivatives, and the corresponding inner product can be viewed as the dot product in R
N where
N is the dimension of the approximation. The expression of (Dqh
i,j
k
)
N (q)δq can be easily obtained by
replacing all the operators on the right-hand side by their corresponding approximations,
Dqh
i,j
k
N
(q)δq =
D
(ˆc
N
i
)
′
(q)(δq), T N (q; kτ )
ˆb
N
j
(q)
E
V,V ∗
(5.9)
+
D
cˆ
N
i
(q), DqT
N (q; kτ )(δq)
ˆb
N
j
(q)
E
V,V ∗
(5.10)
+
D
cˆ
N
i
(q), T(q; kτ )(ˆb
N
j
)
′
(q)(δq)
E
V,V ∗
. (5.11)
41
Under the assumption that (ˆc
N
i
)
′
(q) and (
ˆb
N
j
)
′
(q) respectively converge strongly as linear operators
to cˆ
′
i
(q) and ˆb
′
i
(q) in V and H, uniformly in q for q ∈ Q, for all i = 1, 2, ..., p and j = 1, 2, ..., m, and
applying Theorem 5.3.1, together with the triangle inequality and the uniform boundedness principle, the
convergence of each of the terms (5.9), (5.10), and (5.11) to the terms (5.6), (5.7), and (5.8), respectively, for
each δq ∈ R
r
as N → ∞ can readily be established. Therefore, Dqh
i,j
k
N
(q) converges strongly for each
δq ∈ R
r
to Dqh
i,j
k
(q) as N → ∞, uniformly in q for q ∈ Q, for all i = 1, 2, ..., p and j = 1, 2, ..., m, ,and
therefore in the uniform operator topology as well. It then follows from (5.5) that ∂hi,j
k
N
(q)
∂qs
converges to
∂hi,j
k
(q)
∂qs
, as N → ∞, uniformly in q for q ∈ Q, for all i = 1, 2, ..., p, j = 1, 2, ..., m, and s = 1, 2, ..., r.
We note that if P
N φ
N → φ for φ ∈ H or φ ∈ V as is frequently the case for many commonly
used Galerkin finite element bases (for example, linear and cubic spline functions), one can take b
N
j
(q) =
P
N bj (q) and c
N
i
(q) = P
N ci(q) i = 1, 2, ..., p, j = 1, 2, ..., m and all the requisite hypotheses of our
convergence theory will be satisfied.
42
Chapter 6
The Deconvolution Problem and Regularization
In this chapter, we address the deconvolution problem: Given observed data, y(tτ ) = yk, i.e. TAC data,
determine an estimate for the input, u(kτ ) = uk, or the BrAC. We focus on the discrete system as described in Chapter 4.2 and reformulate the convolution in terms of block matrix operations. In addition to
determining this estimate, we establish a convergence result for the reconstructed BrAC, uˆ(kτ ) = ˆuk, and
its finite-dimensional approximation, uˆ
N (kτ ) = ˆu
N
k
.
6.1 Reformulating the Convolution
Recall from Chapter 4.2, equation (4.3) indicates that
yk =
X
k−1
j=0
hk−j−1(q)uj k = 1, 2, ..., K, (6.1)
where yk ∈ R
p
, k = 1, ..., K, is the observed output, uj ∈ R
m, j = 0, ..., K − 1, is the given input, and
K represents the level of discretization. In Chapter 5.4 we view the impulse response operator hk(q) ∈
R
p×m, k = 0, ..., K−1, as a matrix operator mapping from R
m to R
p
and establish its regularity properties
including continuity and differentiablilty, and the convergence of the approximation h
N
k
(q).
43
We now consider equation (6.1) in the following block matrix form. Let y = (y1, ..., yK)
⊺ ∈
NK
k=1 R
p =:
(R
p
)
K be the collection of outputs and u = (u0, ..., uK−1)
⊺ ∈
NK−1
k=0 R
m =: (R
m)
K be the collection
of inputs, where both are considered as block column vectors in R
K with components in their corresponding domain, i.e. R
p or R
m, respectively. The convolution can then be written as y = H(q)u with
H(q) ∈
NK
i,j=1 R
p×m =: (R
p×m)
K×K, where the lower trianguler block K × K matrix H(q) takes the
following form.
y =
y1
y2
.
.
.
yK
=
h0(q) 0 . . . 0
h1(q) h0(q)
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. 0
hK−1(q) . . . h1(q) h0(q)
u0
u1
.
.
.
uK−1
= H(q)u. (6.2)
To study the propagation of uncertainty in the forward problem discussed in Chapter 5, H(q) must be
continuously differentiable at a point q0 with the matrix derivative DqH(q) consisting of the corresponding
derivatives Dqhk(q) defined in Chapter 5. We also consider the finite-dimensional approximation given
by
HN (q) =
h
N
0
(q) 0 . . . 0
h
N
1
(q) h
N
0
(q)
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. 0
h
N
K−1
(q) . . . hN
1
(q) h
N
0
(q)
. (6.3)
It follows directly from our earlier results in Chapters 4.3 and 5.4 that HN (q) → H(q) element-wise
and in any induced matrix norm, and that the element-wise matrix derivatives DqHN (q) converge to the
corresponding matrix derivative DqH(q) as well for any admissible q.
44
6.2 The Reconstructed BrAC with Regularization
With the new reformulation, we will now address the inverse procedure of finding the optimal estimate for
the input signal u that generates the given output y. Given equation (6.2), the problem can be formulated
as an optimization problem with a quadratic cost function or performance index J = J(u). In other words,
as an abstract least squares problem described in Chapter 2.3.
Minimize J(u) = ∥Hu − y∥
2
2
, or (6.4)
Determine uˆ = arg min
u
∥Hu − y∥
2
2
Here we choose the standard vector 2-norm for evaluating the squared difference and uˆ is the estimated
input. Recall that H, u, and y are given in the form of block matrices and vectors, where the components
belong to R
p×m, R
m and R
p
respectively. Viewing equation 6.4 as an ordinary least squares problem with
dimensions Kp Kp× Km and Km, the solution, or uˆ, can be found by solving the corresponding normal
equations.
In what follows, we require the following lemma.
Lemma 6.2.1. If h0 = h0(q) has rank m, or is left invertible, then so is H = H(q).
Proof. The result can be proved in multiple ways, one of which comes from the definition of convolution
and the linear independence of the columns of H. From equation (6.1), it is trivially true that if uj =
0 ∈ R
m for all j = 0, ..., K − 1, then yk = 0 ∈ R
p
for all k = 1, ..., K. On the other hand, if y1 =
h0(q)u0 = 0 and h0(q) has rank m, or linearly independent columns, we must have u0 = 0. Next we
consider y1 = 0 and y2 = 0. The first condition gives u0 = 0, and the second condition, by definition,
is y2 = h1(q)u0 + h0(q)u1 = h0(q)u1 = 0 =⇒ u1 = 0. We can then recursively deduce that
45
y1, ..., yk = 0 =⇒ u0, ..., uk−1 = 0, and thus that the columns of H are also linearly independent.
Hence the generalized left inverse (H⊺H)
−1 H⊺
exists.
Alternatively, one may also argue using the singular value decomposition of h0(q) and the diagonal
property of the block matrix H that
Rank(H(q)) ≥
X
K
i=1
Rank(h0(q)) = K ∗ Rank(h0(q)) = Km,
which then yields the same result.
Now, with the existence of the left inverse of H, the problem is then easily solved via standard least
squares theory as
uˆ = (H⊺H)
−1 H⊺
y. (6.5)
This result should be interpreted as the projection of y onto the image of H which minimizes the
specified distance. Unfortunately, however, we note that the nature of the deconvolution problem is illconditioned since in practice h0(q) is often singular, close to singular, or rectangular without full column
rank. Moreover, the true u, which represents the BrAC level, we would expect to be smooth with only
relatively gradual changes over time. Consequently, we address this issue by modifying the cost functional
with the introduction of regularization. We require another lemma.
Lemma 6.2.2. Let R =
Rij
be a K×K block matrix with entries Rij ∈ R
m×m. Assume that Rii ≻ 0, that
is, symmetric positive definite, for all i = 1, ..., K and that R is block symmetric Rij =
Rji⊺
. Furthermore,
decompose R recursively via the sequence of block matrices Rn, n = 1, ..., K, as
R1 = R11, R2 =
R1 B2
B
⊺
2 R22
, R3 =
R2 B3
B
⊺
3 R33
, ..., RK =
RK−1 BK
B
⊺
K RKK
= R,
4
and assume that Rnn − B
⊺
nR
−1
n−1Bn ⪰ 0 for all n = 2, ..., K. Then R is also positive definite or R ≻ 0.
Similarly, if Rii ⪰ 0 for all i = 1, ..., K, and the other conditions stay the same, then R ⪰ 0.
Proof. We will show inductively that Rn ≻ 0 for all n = 2, ..., K and hence R ≻ 0. First consider the case
for R2,
R2 =
R1 B2
B
⊺
2 R22
=
R11 B2
B
⊺
2 R22
.
Since R11 is positive definite and thus invertible, we consider the Schur complement of R11 in R2, R22 −
B
⊺
2R
−1
11 B2, and look at the following factorization.
R11 B2
B
⊺
2 R22
=
I 0
B
⊺
2R
−1
11 I
R11 0
0 R22 − B
⊺
2R
−1
11 B2
I R−1
11 B2
0 I
:= T2S2T
⊺
2
.
Here T2 and T
⊺
2
are block triangular matrices with the identity matrix on the diagonal, and hence invertible.
It follows that S2 ≻ 0 ⇐⇒ R2 ≻ 0. We can further deduce that S2 ≻ 0 as it is block-diagonal with
R11 ≻ 0 and R22 − B
⊺
2R
−1
11 B2 = R22 − B
⊺
2R
−1
1 B2 ⪰ 0 from the hypotheses of the lemma. As a result,
R2 is positive definite.Then, for any Rn, assuming Rn−1 ≻ 0, we have
Rn =
Rn−1 Bn
B
⊺
n Rnn
=
I 0
B
⊺
nR
−1
n−1
I
Rn−1 0
0 Rnn − B
⊺
nR
−1
n−1Bn
I R−1
n−1Bn
0 I
:= TnSnT
⊺
n
.
47
A similar argument shows that Tn is invertible and Sn ≻ 0 which leads to the conclusion that Rn ≻ 0.
We note that the hypotheses of the lemma trivially hold if Rii ≻ 0, i = 1, ..., K and Rij = 0, i ̸= j. The
same arguments apply to the positive semi-definite case replacing ≻ with ⪰ in the statements above.
Let R1, R2 be two block matrices Rl =
h
R
i,j
l
iK
i,j=1
, l = 1, 2 of dimension K × K satisfying the
assumptions in Lemma 6.2.2 so that R1 ≻ 0 positive definite and R2 ⪰ 0 positive semi-definite. Define
the regularized cost functional
J˜(u) = ∥Hu − y∥
2
2 + u
⊺R1u + (Du)
⊺R2(Du), (6.6)
where D ∈ (R
m)
K×K represents the discrete difference operator of u ∈ (R
m)
K in block vector format.
We can rewrite the above equation by augmenting H with R1 and R2 as
J˜(u) =
Hu˜ − y˜
2
2
, with H˜ =
H
R
1/2
1
R
1/2
2 D
and y˜ =
y
0
0
. (6.7)
The solution to the regularized least square problem defined in terms of equation (6.6) is then given by
u˜ = arg min
u
Hu˜ − y˜
2
2
=
H˜ ⊺H
−1
H˜ ⊺
y˜ (6.8)
= (H⊺H + R1 + D⊺R2D)
−1 H⊺
y = G
−1H⊺
y.
Since H⊺H, D⊺R2D ⪰ 0 and R1 ⪰ 0, we have G ≻ 0 and therefore invertible. Consequently, expression (6.8) is well defined for appropriate choices of R1 and R2. We note that compared to equation (6.5),
the estimate u˜ deviates from the optimal estimate uˆ by a quantity whose magnitude depends on R1 and
R2. However, the resulting estimate will always exist regardless of the conditioning of h0. In addition
48
it should provide a smoother reconstruction by penalizing large amplitude and rapid changes in the resulting estimated input (i.e. BrAC) signal. In Chapters 8 and 9 we discuss in greater detail the choice of
these regularization matrices in the context of our SISO model. We will also demonstrate that, with only
minimally compromising accuracy, we can obtain a much more stable reconstructed BrAC curve.
6.3 Derivative of the Inverse Operator and Convergence
We have defined the solution to the regularized deconvolution problem as given in expression (6.8). Our
goal in this chapter is to examine the regularity condition for u˜ = ˜u(q) with respect to our parameters
q ∈ Q and establish a convergence result for the finite-dimensional approximation. In Chapter 6.1 we
have shown that HN (q) → H(q) and DqHN (q) → DqH(q) element-wise as matrices and hence in any
induced matrix norm, specifically in the Forbenius norm. Denote
u˜
N =
G
N
−1
HN
⊺
y (6.9)
=
HN
⊺
HN + R1 + D⊺R2D
−1
HN
⊺
y, (6.10)
and consider the following result.
Theorem 6.3.1. Under the assumptions stated in the previous chapters and the appropriate choice of R1 and
R2, we have u˜
N → u˜ in the Frobenius Matrix norm ∥ · ∥F as N → ∞. Furthermore, the matrix derivatives
Dqu˜(q) and Dqu˜
N (q) exist as Km × r matrix operators, where q ∈ Q ⊂ R
r
, and Dqu˜
N (q) → Dqu˜(q) in
the Frobenius Matrix norm ∥ · ∥F as N → ∞.
Proof. We begin by looking at GN =
HN
⊺
HN + R1 + D⊺R2D
and G = (H⊺H + R1 + D⊺R2D).
From the convergence of HN , and its transpose
HN
⊺
and the fact that each of the hk’s is bounded, we
have
HN
⊺
HN → H⊺H in ∥ · ∥F as N → ∞. Since the rest of the terms in G are bounded constants,
we naturally have GN → G in ∥ · ∥F as N → ∞. Recall that both GN and G are positive definite
49
and therefore
GN
−1
F
and
G−1
F
can be bounded by a constant multiple of the reciprocals of the
smallest eigenvalue of R1 for all N. Then by the sub-multiplicativity of the Frobenius norm, it follows that
G
N
−1
− G
−1
F
=
G
N
−1
G − G
N
G
−1
F
≤
G
N
−1
F
G − G
N
F
G
−1
F
.
Thus the convergence of GN to G implies that of the inverses as well. Combining the convergence and
boundedness results for GN and
HN
⊺
, we obtain that for any y,
u˜
N − u˜
F
=
G
N
−1
HN
⊺
y − G
−1H⊺
y
F
≤
G
N
−1
F
∥H⊺
∥F
G
N
−1
− G
−1
F
HN
⊺
− H⊺
F
∥y∥F ,
which implies that
u˜
N − u˜
F
→ 0, as N → ∞. Note that the Frobenius norm for vectors u˜ and y are
simply the standard vector norm ∥ · ∥2, which conforms with the norm needed in equations (6.6) and (6.8).
For the derivatives, observe first that the q dependence of G is restricted to only the H⊺H term. Thus,
we have
DqG(q) = Dq
H(q)
⊺H(q) + R1 + D⊺R2D
= Dq
H(q)
⊺H(q)
=
DqH(q)
⊺ H(q) + H(q)
⊺DqH(q)
=
H(q)
⊺
(DqH(q))⊺ + H(q)
⊺
(DqH(q))
= M(q)
⊺ + M(q),
where in the above expression M(q) = H(q)
⊺
(DqH(q)). The derivative of the inverse can be written as
Dq
G
−1
(q)
= −G
−1
(q)(DqG(q))G
−1
(q)
= −G
−1
(q)
M(q)
⊺ + M(q)
G
−1
(q).
The finite-dimensional version can be obtained similarly by replacing all occurrences of H by HN and
G by GN . Convergence of MN can be deduced based on that of HN and DqHN stated in Chapter 6.1 and
the boundedness under the Frobenius norm of H and DqH. Indeed,
MN (q) = HN (q)
⊺
(DqHN (q)) → H(q)
⊺
(DqH(q)) = M(q),
Dq
G
N
−1
(q)
= −
G
N
−1
(q)(DqG
N (q))
G
N
−1
(q)
= −
G
N
−1
(q)
MN
⊺
+ MN
G
N
−1
(q)
→ Dq
G
−1
(q)
in ∥ · ∥F as N → ∞.
Finally, we have
Dqu˜
N (q) =
Dq
G
N
−1
(q)
HN
⊺
y +
G
N
−1
(q)
DqHN (q)
⊺
y,
Dqu˜(q) =
DqG
−1
(q)
H⊺
y + G
−1
(q)
DqH(q)
⊺
y.
where the convergence in ∥· ∥F can easily be shown based on our previous convergence and boundedness
results for each of the terms in these expressions.
Chapter 7
The Delta Method Induced Confidence Bands
So far we have established how the derivatives of the impulse response functions or convolution filters,
h
i,j
k
(q), and their finite-dimensional approximations can be computed. In this chapter, we step away from
functional analysis and semigroups of operators and consider the case where the parameters, q, are nondeterministic. More specifically, we are interested in quantifying how randomness or uncertainty in the
parameters propagates through the forward model and how the randomness is observed in the response
function. We rely on a well-known statistical tool known as the Delta Method to compute the asymptotic
distribution of the system’s response by using their derivatives with respect to the parameters. More precisely, we demonstrate how to construct confidence sets or bands for the estimates. We will also apply the
same method as described to study the asymptotic distribution of the deconvolution estimate and establish
confidence bands.
7.1 The Delta Method
We begin by generalizing the delta method introduced in chapter 2.2 to meet the multivariate conditions
[5, 15]. Let q0 ∈ Q ⊆ R
r be a fixed constant and Γ ∈ R
r×r be a positive definite matrix. We consider a
multivariate vector-valued function g : R
r
7→ R
d with continuous partial derivatives ∂gi
qj
, i = 1, ..., d, j =
1, ..., r. Define the derivative matrix of g as Dg(q) :=
h
∂gi
qj
i
i.j
∈ R
d×r
.
52
Theorem 7.1.1. (Multivariate Delta Method). Suppose that there exists a sequence of random variables
q1, ..., qn ∈ Q with constant expectation E[qi
] = q0 for i = 1, ..., n such that √
n (qn − q0) → ZΓ in
distribution as n → ∞, where ZΓ ∼ N(0, Γ). Furthermore, assume that the matrix (Dg(q0))Γ(Dg(q0))⊺
is
positive definite. Then
√
n
g(qn) − g(q0)
→ Zg,
in distribution as n → ∞, where Zg ∼ N
0,(Dg(q0))Γ(Dg(q0))⊺
.
Here N(0, Σ) denotes the multi-dimensional multivariate normal distribution with mean zero and
covariance matrix Σ. We note that the version of the delta method that we present here is tailored to
our specific needs in this study. Indeed, many of the assumptions we make, while satisfied by our model
desecribed in Chapters 3 and 8, can in fact be relaxed and generalized. For instance, the condition on the
expectation can be reduced to E[qi
] = q0 + o
1/
√
n
where 1/
√
n is a scaling factor that can be replaced
with another decreasing sequence. The positive definiteness condition guaranteeing that the resulting
random variable does not degenerate can be replaced with positive semidefinite if a little more care is
exercised in the proof of the theorem. Higher-order approximations are also possible when more precise
estimation is required or when (Dg(q0)) = 0. In our study here, we take for granted that the parameters
q satisfy the stated hypotheses, but in actual practice the parameters are typically the response variables
of a standard regression model with the asymptotic normality assumption being realized with relatively
minor assumptions on the data. In particular, to simplify the presentation and without any significant loss
of generality, we focus primarily on the case where the function g has scalar output (d = 1), and that the
positive definite requirement can be easily satisfied as long as Dg(q0) ̸= 0.
Here we omit the proof of the delta method, which is a direct application of the multivariate Taylor
expansion and Slutsky’s theorem 2.2.3, which can be found in most statistics textbooks. Instead, we comment on the mode of convergence stated in the theorem. The convergence in distribution for random
variables is a weak form of convergence that can be interpreted as point-wise convergence where the cumulative distribution function is continuous. In the case of a sequence of normal random variables where
the parameters (i.e. the mean and variance) converge, then the convergence in the density function, which
is by definition continuous and bounded, will imply convergence in distribution. We will use just such
an argument below when we argue the convergence of our finite-dimensional approximating systems in
terms of their distributions.
7.2 Asymptotic distribution of the Impulse Response Matrix
Recall that in Chapter 5.4 we discussed the impulse response matrices hk(q) and its (i, j)-th entry, h
i,j
k
(q),
which is scalar-valued. Then h
i,j
k
(q) can be interpreted as the response of the i-th output channel of the
evolution system to the unit impulse in the j-th input channel, u = (0, .., 0, 1, 0, ..0) ∈ R
m with 1 in the
j-th component and 0 elsewhere. Theorem 5.4.1 established the existence of the derivative with respect to
q of h
i,j
k
(q). Applying the delta method (Theorem 7.1.1), we obtain the following result.
Theorem 7.2.1. Assuming that the first partial derivatives of h
i,j
k
(·) are continuous and non-zero at q0 and
letting ∇h
i,j
k
(q)
⊺ = ∇qh
i,j
k
(q)
⊺ ∈ R
r denote the gradient vector with repsect to q of h
i,j
k
, then
√
n
h
i,j
k
(qn) − h
i,j
k
(q0)
→ Z
i,j
k
,
in distribution as n → ∞, where q0, qn, Γ are as defined in Chapter 7.1 and
Z
i,j
k ∼ N
0, ∇h
i,j
k
(q0)Γ∇h
i,j
k
(q0)
⊺
.
In the scalar case, the confidence interval can be directly constructed on the basis of the distribution.
Denote Γ
i.j
k = ∇h
i,j
k
(q0)Γ∇h
i,j
k
(q0)
⊺ ∈ R. For large enough nk, the response h
i,j
k
(qnk
) is approximately
54
normally distributed with mean h
i,j
k
(q0) and variance 1
nk
Γ
i.j
k
. Intuitively, h
i,j
k
(qnk
) lies in a neighborhood
of h
i,j
k
(q0) with small enough variance to allow the difference to be approximated by the distribution of
the parameters similar to a linear approximation by the tangent line. Note that h
i,j
k
(qnk
) ∈ R, and thus
the (1 − α)% confidence interval of the response component is given by
h
i,j
k
(q0) − zα/2
1
√
nk
(Γi.j
k
)
1
2 , hi,j
k
(q0) + zα/2
1
√
nk
(Γi.j
k
)
1
2
!
, (7.1)
where zα/2 denotes the α
2
-th quantile of the standard normal distribution. A confidence interval of this
form is typically referred to as a single parameter family, where the width of the interval is determined by
a single parameter z = zα/2
that depends only on the expected coverage, but not on the distribution.
Now we can consider the following theorem related to the finite-dimensional approximations of h
i,j
k
discussed in Chapter 4.3.
Theorem 7.2.2. Suppose that the random parameter q ∈ Q satisfies the assumptions of Theorem 7.1.1 with
center q0 and variance matrix Γ, and that hk(q) = h
h
i,j
k
(q)
i
satisfies the conditions in Theorem 5.4.1 with
∂hi,j
k
(q)
∂qs
̸= 0 for all i = 1, 2, ..., p, j = 1, 2, ..., m, and s = 1, 2, ..., r, then
√
n
h
i,j
k
N
(qn) − h
i,j
k
N
(q0)
→ Z
i,j
k
,
in distribution as n, N → ∞.
Proof. The result is a direct consequence of Theorem 5.4.1 and the delta method, Theorem 7.1.1. For q0 ∈
Q, h
i,j
k
N
(q) is continuous around q0 and differentiable by construction as discussed in Chapter 5.4, and
55
∂hi,j
k
N
(q)
∂qs
̸= 0 for all i,j,s since as a matrix exponential, DqT
N (q; kτ )(es) ̸= 0 for any admissible q and
canonical basis element es. As a result, we can apply the delta method to h
i,j
k
N
(qn) and obtain
√
n
h
i,j
k
N
(qn) − h
i,j
k
N
(q0)
→ Z
i,j
k
N
,
in distribution as n → ∞, where Z
i,j
k
N
∼ N
0, ∇h
i,j
k
N
(q0)Γ∇h
i,j
k
N
(q0)
⊺
. Similarly we denote
∇h
i,j
k
N
(q0)Γ∇h
i,j
k
N
(q0)
⊺ = Γi,j
k
N
∈ R. From Theorem 5.4.1, we have that ∂hi,j
k
N
(q)
∂qs
converges to ∂hi,j
k
(q)
∂qs
,
as N → ∞, uniformly in q for q ∈ Q, for all i = 1, 2, ..., p, j = 1, 2, ..., m, and s = 1, 2, ..., r. Consequently the gradient ∇h
i,j
k
N
(q) =
∂hi,j
k
N
(q)
∂q1
, ...,
∂hi,j
k
N
(q)
∂qr
converges and Γ
i,j
k
N
→ Γ
i,j
k
as N → ∞ for all
values of i and j. Then it follows that the random variable Z
i,j
k
N
converges to Z
i,j
k
in distribution since
the density converges for all points in R.
Alternatively, we denote Xn and Yn to be the random variables √
n
h
i,j
k
N
(qn) − h
i,j
k
N
(q0)
and
√
n
h
i,j
k
(qn) − h
i,j
k
(q0)
, respectively. From Theorem 5.4.1 we have that |Yn − Xn| → 0 in distribution
since the functions converge almost surely for all q. Then by Slutsky’s theorem for Xn = Yn − (Yn − Xn)
or by the dominated convergence theorem with the characteristic functions of normal random variables,
the desired convergence can be established.
To extend these results for the impulse response functions h
i,j
k
(q) to the impulse response matrix
hk(q) or to an arbitrary response yk(q), the response at time k, to a given input u, we recall as in (4.3) that
yk(q) = Pk−1
j=0 hk−j−1(q)uj k = 1, 2, ..., K. It then follows from the linearity of the convolution and
the differentiability of hK−j−1(q) that yk(q) is differentiable with respect to q as a function from R
r
to R
p
.
56
Letting ul = (ul,1, ..., ul,m)
⊺
, the derivative Dqyk(q) =
∂yi
k
(q)
∂qj
can be explicitly computed in terms of
Dqh
i,j
k
(q) as
∂yi
k
(q)
∂qj
=
X
k−1
l=0
∂
hk−l−1(q)ul
∂qj
=
X
k−1
l=0
Xm
n=1
Dqh
i,n
k−l−1
(q)ej
ul,n. (7.2)
We can then apply the delta method to yk and obtain its asymptotic distribution as
√
n
yk(qn) − yk(q0)
→ Zi,j
yk
, in distribution as n → ∞,
with Z
i,j
yk ∼ N
0,(Dqyk(q0))Γ(Dqyk(q0))⊺
. Here, N
0,(Dqyk(q0))Γ(Dqyk(q0))⊺
:= N (0, Σk) is the
p-dimensional normal distribution with mean zero and covariance matrix Σk = (Dqyk(q0))Γ(Dqyk(q0))⊺
.
Again choosing nk sufficiently large, the distribution of yk = yk(qnk
) ∈ R
p
is approximately normal
with mean yk(q0) and covariance matrix Σk/nk. To compute the corresponding (1 − α)% confidence set,
consider the value zk = nk(yk − yk(q0)
⊺Σ
−1
k
(yk − yk(q0)) ∈ R
+. Then zk has a χ
2
p distribution (a χ
2
distribution with p degrees of freedom), with the upper and lower bounds corresponding to the quantile of
desired confidence level for the χ
2
p distribution easily determined. Solving the inverse system with these
boundary values of zk gives the boundary for the confidence set in R
p
. In the cases where p = 1 or 2, this
confidence set agrees with the confidence interval or the elliptical confidence region in R
2
. However, as
one can imagine, as the dimension increases, the level of computation needed to solve the inverse system
increases drastically, and, more importantly, we cannot properly visualize the confidence set. Practically
speaking, for a vector in R
p
, we prefer to observe a confidence set of the form [y
−
k
, y+
k
], where y
−
k
and y
+
k
are vectors in R
p
that together form a band-like confidence set around the target curve. The end of this
chapter will be dedicated to techniques for constructing such confidence bands.
We will summarize the result in terms of the new formulation in Chapter 6.1. Assume that the parameter q satisfies our assumption that there exists a random sequence qn ∈ Q such that E[qi
] = q0 for all
i = 1, 2, . . . and √
n (qn − q0) → ZΓ where ZΓ ∼ N(0, Γ) in distribution as n → ∞, with Γ ∈ R
r×r
positive definite. For given fixed input u, y
N (q0) = HN (q0)u → H(q0)u = y(q0) almost surely as N →
∞, and √
n
y
N (qn) − y
N (q0)
→ ZΣ, in distribution as n, N → ∞, where ZΣ ∼ N (0, Σ) with
Σ = (Dqy(q0))Γ(Dqy(q0))⊺ = (DqH(q0)u)Γ(DqH(q0)u)
⊺
. The approximating output converges, and
we are able to find the asymptotic distribution of the infinite-dimensional estimation based on its finitedimensional approximation.
7.3 Confidence Bands for the Reconstructed BrAC
In this chapter, we will combine all our previous results in Chapter 6, along with the methodology developed earlier in this chapter, and establish the asymptotic distribution of the reconstructed BrAC input u˜
based on the random parameter q.
Theorem 7.3.1. Assume that there exists a random sequence qn ∈ Q such that E[qi
] = q0 ∈ Q for all
i = 1, 2, . . . and √
n (qn − q0) → ZΓ in distribution as n → ∞, where ZΓ ∼ N(0, Γ) with Γ ∈ R
r×r
positive definite. Further assume that all conditions on the convolution system have been met and Dqu˜(q0)
has rank r. Then √
n
u˜
N (qn) − u˜
N (q0)
→ ZΣN in distribution as n → ∞, where ZΣN ∼ N
0, Σ
N
with Σ
N = Dqu˜
N (q0)Γ
Dqu˜
N (q0))⊺
. Moreover, for fixed n
∗
, sufficiently large, we have
u˜
N (qn∗ ) ∼ N
u˜
N (q0),
1
n∗ Dqu˜
N (q0)Γ
Dqu˜
N (q0))⊺
,
and
u˜
N (qn∗ ) → Zu˜(q0),Σ, in distribution as N → ∞,
58
whereZu˜(q0),Σ ∼ N
u˜(q0), Σ
with Σ = 1
n∗ Dqu˜(q0)ΓDqu˜(q0))⊺
. In other words, taking large enough n and
N, the asymptotic distribution of the reconstructed input u˜ can be estimated by that of its finite-dimensional
approximation.
Proof. The proof is essentially the same as that of Theorem 7.2.2. The first convergence in n and the
estimation on asymptotic distribution come from the delta method (Theorem 7.1.1), whereas the second
convergence in N can be derived from our results in the previous chapter. Note that since Dqu˜(q) has finite
dimensions, despite the infinite-dimensional structure underneath, the convergence in the Frobenius norm
in Theorem 6.3.1 implies that of the entry-wise convergence for the covariance matrix, which then leads
to the convergence in distribution for the normal random variables.
Once we have the asymptotic distribution for the reconstructed u˜, we may construct the confidence set
of the random vector. Alternatively, similar to the discussion in chapter 7.2, if might be more convenient
and reasonable to construct various types of confidence bands based on the asymptotic distribution for
each of the block entries u˜k, k = 0, ..., K − 1, obtained directly from the corresponding component of
the overall derivative Dqu˜ =
Dqu˜0, ..., Dqu˜K−1
. The convergence of the distribution can be deduced
from that of the derivatives, and more specifically from that of the covariance matrix. We also remark that
the confidence bands will center at the regularized estimation, which, depending on the magnitude of the
regularization applied, is inherently biased compared to the true estimator uˆ, and should be interpreted
with caution. Further discussion will be presented in chapter 9 with a concrete example.
7.4 The Choice of Uniform Confidence Bands and Constructions
In this final chapter, we will discuss methods to construct bands around a curve that covers the desired
confidence level as summarized in [29]. For simplicity of notation, we denote y(q) = (y1, ..., yK) and
assume that for each of the yk(q), k = 1, ..., K is scalar-valued and follows the distribution N (ˆyk, σk).
The center yˆ = (ˆy1, ..., yˆK) represents our estimation for the response function based on the input. In
the SISO model in Chapter 3, the yk’s are points of estimation and K represents the resolution of the time
discretization. For a more general discussion as mentioned in Chapter 5, we can build the confidence set for
yk based on the distribution of h
i,j
k
. Similarly, the notation applies to the confidence band of reconstructed
BrAC, as described at the end of Chapter 7.3 in a similar format.
The single-parameter family of confidence bands will be our main focus in this discussion, namely
the bands in the form of NK
k=1[y
−
k
, y+
k
] = NK
k=1[yk − cασk, yk + cασk]. It has been shown [33, 34]
that most methods for constructing confidence bands, including computing the confidence set through χ
2
distributions, belong to the family when cα > 0 is properly picked depending on the desired confidence
level α and independent of the distribution of yk. An immediate consequence of this construction is that
the band-width cασ
N
k
of the confidence set for the finite dimensional approximation discussed in chapter
4.3 will converge to that of the infinite dimensional system, cασk since, as we have shown in Chapters 7.2
and 7.3, the distribution converges. The only exception we will discuss in this chapter is the Supremum
Band, where the critical value also depends on the distribution of yk and hence needs a bit more care in
arguing for convergence.
We begin with the naïve approach of simply directly combining the individual confidence intervals for
each of the yk’s, as was done in (7.1), where the cα = zα are the corresponding statistics for the standard
normal distribution. However,if we assume that the yk’s are independent, then the confidence level of the
naïve band is (1 − α)
K, and for the correlated case, simple analysis for the probability coverage shows
that it will be substantially smaller than the target level. To compensate for this phenomenon, we can
adjust the confidence level α so that the overall coverage becomes more conservative. Bonferroni bands
and ŠidáK bands [48] are well-known improvements over the naïve bands. Indeed, this accomplished by
taking (1 − α/K) and (1 − α)
1/K as the marginal confidence levels for individual yk’s instead of (1 − α).
We note that the modification to the marginal confidence levels do not depend on the distribution of the
60
yk’s, but rather rely on probability inequalities to achieve a desired overall confidence level. Both of these
bands are conservative in the sense that the overall coverage is at least (1 − α) [29, 32], and for small
α and large K, (1 − α)
1/K and
1 − α/(K)
are reasonably similar [33]. Consequently, there is very
little difference in the width of the resulting confidence bands. However, one should keep in mind that
these methods do not take the correlation between yk’s into account, and hence may very well be ignoring
hidden information in the data.
Next we will consider a similar construction mentioned at the end of previous chapter. This is the
method referred to as the Wald’s Band in [20] and [32]. Suppose that y follows a distribution of N(˜y, Γ)
for y, y˜ ∈ R
K and Γ ∈ R
K×K positive definite. The confidence set, known as Wald’s Ellipse, is defined as
W(1 − α) :=
n
y ∈ R
K|(y − y˜)
⊺Γ
−1
(y − y˜) ≤ χ
2
K,1−α
o
,
where χ
2
K,1−α
gives the (1−α) quantitle of the χ
2 distribution with K degrees of freedom. The set is optimal in the sense that it has the exact confidence level we desire and the covariance matrix, Γ, incorporates
all the possible correlation structures within the yk’s. On the other hand, visualization is a challenge. In
order to obtain a confidence band in the form of NK
k=1[y
−
k
, y+
k
], we will take the projection of W(1 − α)
onto the coordinates of y by
WCI (1 − α) = O
K
k=1
[ min
y∈W(1−α)
yk, max
y∈W(1−α)
yk].
While computation of the projection adds an additional level of complexity, we note that the requirement that Γ be invertible is, in general, not the case in applications where the dimension of the parameter
space r (in our case, r = 2) is much smaller than the level of discretization K. The existence of an asymptotically valid joint confidence set with a degenerate covariance matrix has been established in [20], but
the construction of such a confidence set relies on approximation methods such as bootstrapping and the
6
power of the confidence set suffers a significant loss due to projection. Alternatively, we can approximate
the critical value of the projected Wald’s confidence band by picking cα to be the corresponding q
χ
2
K,1−α
statistic [34]. We also note that the projection set, which includes the Wald ellipse W(1 − α), is overly
conservative [34] compared to other options.
Finally, we introduce the asymptotically optimal choice of cα for the single-parameter family of confidence bands. Given any y satisfying √
n(y − y˜) → ZΓ, in distribution, where ZΓ ∼ N(0, Γ), the supremum of the difference between its component yk and the center y˜k is bounded by a constant multiple of its
standard deviation. According to [34], the asymptotic coverage of any single-parameter confidence band
can be realized as
P
y ∈ B(c)
→ P
max
k∈1,...,K
|yk − y˜k|
√
Γkk
≤ c
,
where Γkk = σk denotes the k-th diagonal element of Γ.
We look for a critical value c
crit
α such that the probability on the right-hand side of the convergence
is exactly (1 − α). Note that since the yk’s are jointly normal, the cumulative distribution, F(c; Γ), of
maxk∈1,...,K
|yk−y˜k|
√
Γkk
exists. The critical value c
crit
α is determined by the (1 − α)% quantile of the random
variable with cdf F(c; Γ), with the confidence band known as the Sup-t band then given by NK
k=1[yk −
c
crit
α σk, yk + c
crit
α σk]. By construction, the asymptotic coverage of the Sup-t band is exactly (1 − α),
and provides a narrower bandwidth compared to the other types of single-parameter confidence bands
discussed earlier [34]. However, we note that in this case the choice of critical value might depend on the
parameter Γ and the numerical process of finding such a value relies on empirical methods for generating
the distribution of the maximum of a normal random vector. Our result in Theorem 7.2.2 shows that
Γ
N → Γ, which in turn ensures the convergence of F
N (c; Γ) = F(c; ΓN ) to F(c; Γ). It follows that the
choice of c
crit
α will converge as well. Unfortunately however, as demonstrated below in Chapter 9, the
process of obtaining the critical value tends to jeopardize the rate of convergence.
6
Figure 7.1: 90% single-parameter rectangular confidence region of the form N2
k=1[yk − cσk, yk + cσk]
and confidence ellipse for a two-demensional normal distribution with mean
(0, 0) and covariance matrix
1 0.2
0.2 0.25!
.
To summarize, we presented methods for constructing confidence bands of two types. Methods of
the first type, of which Bonferroni,ŠidáK and naïve bands are examples, are relatively straight forward to
implement with readily determined critical values. On the other hand however, they tend to ignore the
correlations which exist between the yk’s resulting in either under or over estimates of the coverage or
width. The second type of methods of which the Wald and Sup-t bands are examples, are more theoretically robust in finding the ideal coverage, but the computational challenges involved in constructing them
undermines the benefits derived from their use. Figure 7.1 displays all the confidence bands discussed in
this chapter on the same scale for a hypothetical two dimensional normal random variable.
63
Chapter 8
Application of Our Framework to Our SISO Model for the Transdermal
Transport of Ethanol
8.1 The Abstract State Space Formulation
Let Q denote our feasible parameter set which we assume to be a compact subset of the positive orthant of R
2
. For the state space, we define the Hilbert space H = R × L
2
(0, 1) with the inner product
⟨(θ1, φ1),(θ2, φ2)⟩ = θ1θ2 +
R 1
0
φ1(η)φ2(η)dη. Further, for the energy space V we set V = {(θ, φ) ∈ H :
φ ∈ H1
(0, 1), θ = φ(0), φ(1) = 0}, the Hilbert space with inner product ⟨(φ1(0), φ1),(φ2(0), φ2)⟩V =
⟨φ
′
1
, φ′
2
⟩L2(0,1), where ⟨·, ·⟩L2(0,1) denotes the standard L
2
(0, 1) inner product. Let | · | and || · || denote respectively the norms induced by these two inner products on their respective spaces. In this way, standard
arguments yield the dense and continuous embeddings V ,→ H ,→ V
∗
[47].
For each q ∈ Q, a(q; ·, ·) : V × V → R is a(q; ˆφ1, φˆ2) = q1
R 1
0
φ
′
1
(η)φ
′
2
(η)dη, φ1, φ2 ∈ V where
φˆ1 = (φ1(0), φ1), φˆ2 = (φ2(0), φ2). It is straight forward to show that this bilinear form is bounded
and coercive uniformly in q for q ∈ Q and that it is continuous and affine with respect to q ∈ Q [18].
Define the operator A(q) : Dom(A(q)) ⊂ H → H by letting ⟨A(q) ˆφ, ψˆ⟩V ∗,V = −a(q; ˆφ, ψˆ) and setting
Dom(A(q)) = {ϕˆ = (φ(0), φ) ∈ V : φ ∈ H2
(0, 1)}. Note that the domain Dom(A(q))is independent of
q, and for φˆ ∈ Dom(A(q)), we have A(q) ˆφ = A(q)(φ(0), φ) = (q1φ
′
(0), q1φ
′′). It is not difficult to argue
64
using standard arguments that the operator A(q) defined above is densely defined on H, regularly dissipative and self-adjoint [3, 52]. It follows that, A(q) is the infinitesimal generator of a uniformly exponentially
stable, self-adjoint, analytic semigroup of bounded linear operators, {T(q;t) : t ≥ 0}, or {e
A(q)t
: t ≥ 0},
on H [3, 40, 52].
Using these definitions, the model (3.2)on kτ ≤ t < (k + 1)τ can be rewritten as
dvˆ
dt(t) = A(q)ˆv(t) + q1q2 (1, 0) uk,
with initial conditions
vˆ(kτ ) = (wk, xk − ξuk).
Next we set xˆk = (wk, xk), and define
Aˆ(q) = T(q; τ ) = e
A(q)τ ∈ L(H, H)
B(q) = q1q2 (1, 0) ∈ L(R
1
, H)
Bˆ(q) =
I − Aˆ(q)
((0, ξ) −
Z τ
0
e
A(q)s
)B(q)ds
= q2
I − Aˆ(q)
(0, η) − q1A(q)
−1
(1, 0)
∈ L(R
1
, H)
The fact that A(q) is in fact elliptic implies that A(q)
−1 ∈ L(H, H) exists. Recalling that v(t, η) =
x(t, η) − ξ(η)uk, these definitions of our spaces and operators together with the variation of parameters
formula from linear semigroup theory, the following calculation yields the state space form of the model:
xˆk+1 = (wk+1, xk+1) = (w((k + 1)τ ), x((k + 1)τ, ·))
= ˆv((k + 1)τ ) + (0, ξ)uk
= e
A(q)τ
(wk, xk − ξuk) + q1q2
Z τ
0
e
A(q)s
(1, 0) dsuk + (0, ξ)uk.
65
Defining our output operator to be Cˆ ∈ L(H, R) as Cˆ(θ, ϕ) = θ for (θ, ϕ) ∈ H, the discrete time model
now becomes
xˆk+1 = Aˆ(q)ˆxk + Bˆ(q)uk, k = 0, 1, 2, . . . , K, (8.1)
yk = Cˆxˆk, k = 0, 1, 2, . . . , K.
We assume that neither the epidermal layer nor the sensor collection chamber contains any alcohol at time
t = 0; consequently the initial conditions are given by xˆ0 = (w0, x0) = (0, 0). It follows that the output
or TAC y of the discrete time system (8.1) can then be written as a discrete time convolution of the input
or BrAC, u, with a filter, h(q), as
yk =
X
k−1
j=0
CˆAˆ(q)
k−j−1Bˆ(q)uj
=
X
k−1
j=0
hk−j−1(q)uj , k = 1, 2, ..., K. (8.2)
The convolution kernel or filter is given by hi(q) = CˆAˆ(q)
iBˆ(q), i = 0, 1, 2, ..., K − 1.
For N = 1, 2, ..., define the approximating subspaces using spline functions. Let {φ
N
j
}
N
j=0 ⊂ H1
(0, 1)
be the set of linear B-splines on the interval [0, 1] corresponding to the uniform mesh {0,
1
N
,
2
N
, ..., 1} given
by φ
N
0
(η) = 1−Nη if η ∈ [0,
1
N
], φ
N
0
(η) = 0 otherwise, φ
N
N (η) = Nη−N +1 if η ∈ [
N−1
N
, 1], φ
N
N (η) = 0
otherwise, and for j = 1, 2, ..., N − 1, φ
N
j
(η) = Nη − j + 1 if η ∈ [
j−1
N
,
j
N
], φ
N
j
(η) = j + 1 − Nη if
η ∈ [
j
N
,
j+1
N
], φ
N
j
(η) = 0 otherwise. Let V
N = span{φˆ
N
j
}
N−1
j=0 = span{(φ
N
j
(0), φN
j
)}
N−1
j=0 . It then follows
that V
N is a subspace of V , and the properties of splines [3, 45] yield that for any v ∈ V , there exists a
u
N ∈ V
N with ||u
N − v||V → 0 as N → ∞.
66
8.2 Finite Dimensional Approximation
Let P
N : H → V
N denote the orthogonal projection of H onto V
N and let AN (q) ∈ L(V
N , V N ) be
given by < AN (q) ˆφ
N , ψˆN >= −a(q; ˆφ
N , ψˆN ), for φˆ
N , ψˆN ∈ V
N . It follows from the properties of
linear splines [45] that P
N converges strongly to the identity on H. We then set AˆN (q) = e
AN (q)τ ∈
L(V
N , V N ), BˆN (q) = q2(I − AˆN (q))
P
N (0, η) − q1AN (q)
−1P
N (1, 0)
∈ L(R
m, V N ), and CˆN =
Cˆ ∈ L(V
N , R). The definition of AN (q), the compactness of Q, and the uniform coercivity of a(q; ·, ·)
yield the uniform exponential bound on e
AN (q)t
and that Rλ(AN (q))P
N φˆ → Rλ(A(q)) ˆφ as N → ∞ in
H, V and V
∗
for every φˆ ∈ H (or V or V
∗
, as the case may be). It follows that AˆN (q) = e
AN (q)τP
N
converges strongly to Aˆ(q) = T(q; τ ) on H uniformly in q for q ∈ Q.
The approximating model is then given in convolution form by
y
N
k =
X
k−1
j=0
h
N
k−j−1
(q)uj , k = 1, 2, ...K, (8.3)
where h
N
i
(q) = C
N AˆN (q)
iBˆN (q), i = 0, 1, 2, ...K − 1, with h
N
i
(q) converging to hi(q) in R, uniformly
in q for q ∈ Q and i in {0, 1, 2, ..., K − 1} and y
N
k
converging to yk in R, uniformly in q for q ∈ Q and k
in {1, 2, ..., K}.
67
8.3 Choice of Regularization Pamameters
In Chapter 6.2, we solved the deconvolution problem with regularization. For the SISO case where m = 1,
we consider the following choice of regularization matrix for equation (6.8). We set R1 = αIK, with α > 0
and R2 = βIK, with β ≥ 0, where
D = ˜IK =
1 0 . . . . . . 0
−1 1 .
.
.
.
.
.
0 −1 1 .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. 0
0 . . . 0 −1 1
∈ R
K×K (8.4)
where IK denotes the identity in R
K×K and α, β real constants of choice. It is easy to verify that the
requirements stated in Chapter 6.2 are satisfied, as the conditions in Lemma 6.2.2 reduced to R1 positive
definite and and R2 positive semi-definite. Du ∈ R
K gives the finite difference of u, a discrete simulation
of the derivative that we want to control. The positive and non-negative parameters α and β, respectively,
should be calibrated based on a large amount of observed data if available, or by statistical means such as
cross-validation or bootstrapping based on a single set of observed data, in order to obtain the best possible values that minimize the difference between estimated and observed BrAC. However, in our physical
studies, these options might not be feasible. The ideal regularization could be unique to the specific participant or drinking episode, and more importantly, the true BrAC data might not be obtainable for actual
applications.
In this study, we choose the parameters based on their intuitional influences on the inverse system.
The first parameter α serves the purpose of making the matrix G in equation (6.8) positive definite and
invertible, which stabilizes the inverse estimation from the ill-conditioned H. The larger the magnitude of
68
α, the more well-posed the deconvolution will be. The second parameter β, penalizing the cost function by
the rate of change in u, controls the smoothness of our estimated curve, under the assumption that BrAC
evolves continuously in our body. On the other hand, the parameters together contribute to the amount of
deviation from the original system, and the larger the magnitudes, the more biased our estimation might
be. When choosing the parameters, one should take into consideration the smoothness preference over
reconstructed BrAC and the trade-off between precision and stability. For the purpose of examining the
efficacy of our deconvolution method and the convergence of the approximation, the process of obtaining
the optimal values for regularization is not included.
69
Chapter 9
Numerical Examples with SISO Model
9.1 Human Subject Clinical Data
The numerical studies presented in this chapter utilize primarily two human subjects datasets, referred
to as Trial 3 and Trial 4. These experiments were approved by the University of Southern California
(USC) Institutional Review Board (IRB) and conducted in the Luczak Laboratory within the Department
of Psychology at USC in Los Angeles, California [44]. The drinking episodes in Trial 3 consist of BrAC
and TAC data from 40 participants, each completing four drinking sessions. Each participant consumes
a predetermined amount of alcohol in a number of different consumption patterns including evenly over
15 minutes in two episodes, in two 15-minute periods over 60 minutes in one episode, and evenly over 60
minutes in one episode. The amount consumed was designed, based on body water weight, for a maximum
BrAC peak of 0.08 over a 15-minute period of time. The BrAC data was measured using an Alco-sensor
IV breath analyzer (Intoximeters, Inc., St. Louis, MO), and the TAC data was recorded using two SCRAMCAM transdermal alcohol sensor bracelets (Alcohol Monitoring Systems, AMS, Littleton, CO), shown in
Figure 1.1, worn on each of the upper arms, producing two sets of TAC data recorded in each session.
Until the BrAC reading reached 0.000, both BrAC and TAC measurements were taken every 10 minutes,
after which the TAC continued to be recorded every 30 minutes until it arrived at 0.000 as well. Although
some BrAC-TAC data sets were excluded due to protocol changes, Trial 3 contains 267 usable BrAC-TAC
70
datasets[44]. Trial 4 involved 40 regular drinkers in the laboratory, each wearing a sensor on each arm,
drinking approximately 2 drinks in 15 minutes in two sessions and drinking ad libitum for 2-3 hours in
two sessions [54, 58]. BrAC and TAC readings were taken at 15-minute intervals, resulting in a total of 282
usable BrAC-TAC datasets recorded simultaneously.
With this experimental design, it might be natural to question whether drinking episodes can be treated
as independent observations. Indeed, the skin physiology (which may vary on a single individual’s right
and left arms), environmental conditions, different sensors, and different drinking episodes are all contributing factors to the potential correlation in observed BrAC and TAC data. The advantage of introducing
randomness in the parameter design is to capture such variations caused by the above factors so that we
can perform a quantitative analysis of the influences. In our study, each pair of observed BrAC-TAC data
is associated with a corresponding parameter pair q = (q1, q2) calculated based on methods and software
previously established by our team [31].
For the remainder of this chapter, we consider a total observation time of T = 7 hours and a discretization step of τ = 1/60, or 1 minute, resulting in K = T /τ = 420 points for each of the BrAC and TAC
curves. These parameters were chosen to match the time interval and collection frequency stated above.
One of the episodes in Trial 4 was randomly picked as our test subject in Chapter 9.3. For the parameters q = (q1, q2), we set q1 = 1.2481 and q2 = 1.5647, which comply with our selected data, and the
covariance matrix Γ
0.069 −0.004
−0.004 0.101
was estimated using all the calculated pairs of parameters in
Trials 3 and 4. We will consider an approximation of dimension N = 15, and we will discuss the relevant
convergence results in Chapter 9.4. We choose a 0.3 confidence level for the construction of the confidence
bands, resulting in an expected coverage of 1 − 0.3 = 70%.
71
9.2 The Impulse Response Function and Confidence Bands
We begin with a single impulse at time t = k = 0, given by u = (u0, ..., uK−1), with u0 = 1 and uj = 0
for j = 1, ..., K − 1, as our input to the SISO model (8.2),(8.3). Direct computation yields that yk = hk−1
and y
N
k = h
N
k−1
. In other words, we examine the impulse response function hk, k = 0, ..., K − 1 and its
confidence band directly.
Figure 9.1: Plot of 70% confidence intervals with impulse response function based on a single impulse at
time k = 0.
Figure 9.1 shows our forward estimation for the response of a single impulse at time k = 0 as a solid black
curve. We plot the curves from T = 0 to T = 4, when the impulse response function becomes nearly 0
afterwards. In this figure, we have also plotted the 5 types of confidence bands discussed in Chapter 7.4.
The Naïve, Bonferroni, ŠidáK and Wald bands are plotted with dashed lines. In addition to obtaining the
distribution of hk according to (7.1), the critical values for each of these single parameter confidence bands
are computed based on their corresponding 70% quantiles from the standard normal distribution or the
χ
2 distribution. The supremum confidence band (Sup-t), plotted as a pink region enclosed by solid green
lines, is obtained from empirical simulation of the joint distribution of the maximum of the normal vector
72
with calculated distribution. As described in Chapter 7.4, the widths of the confidence bands in decreasing
order are: 1) The Wald’s band, which drastically over-estimates the necessary coverage, 2) the Bonferroni
and ŠidáK bands, which almost coincide and guarantees a coverage of at least 70%, 3) the Sup-t band that
yields the asymptotically optimal coverage, and 4) the Naïve band with coverage smaller than the target
level.
Figure 9.2: Plot of 70% confidence bands for deconvolving the impulse response function based on Single
Impulse.
In the next two figures, we consider the deconvolution problem in this scenario with the input being the
impulse response h, again as a solid black curve, calculated above. Figure 9.2 demonstrates the estimated
uˆ, plotted in red, with a minimal amount of regularization. We present the curves till T = 2, as we expect
the reconstructed unit impulse to quickly decrease to 0. The regularization parameters (see equation (8.4)
in Chapter 8.3) are taken to be α = 0.01, the amount needed to stabilize the estimate in this case, and
β = 0, since we aim to recover the impulse function u that is discontinuous by construction. We note
that the regularization choices here, as described in Chapter 8.3, serve to demonstrate our deconvolution
method and, therefore, are not necessarily optimal. Compared to the expected impulse u, we observe that
73
the estimation uˆ at time t = k = 0 has a smaller initial value (< 1) and a smoother decay. The deviations
in the result are expected to occur when regularization is applied. The five types of confidence bands
are presented in the same notation as in Figure 9.1, and we observe a similar pattern in the coverage as
described above.
Figure 9.3: Unregularized Deconvolution the Impulse Response Function based on Single Impulse
Figure 9.3 shows the necessity of regularization. In this case, we observe that the initial value of the
impulse at time t = 0 was properly recovered. However, the resulting estimation curve is highly unstable
towards the later part of the estimation due to the fact that h0, shown in the figure as the beginning of the
black curve, is very close to zero, resulting in a nearly singular H matrix in equation (6.5). Although the
inverse can be directly obtained for any strictly positive h0 as discussed in Chapter 6.2, the small values
and their numerical inverses lead to large turbulence when the resulting estimate uˆ takes on a small value
as time increases, demonstrating the ill-conditioned nature of the deconvolution problem.
74
9.3 Conficence Intervals for Forward and Backward Estimation Based
on Actual Drinking Episode Data
In this chapter, we apply our methods to BrAC and TAC data collected from a single drinking episode from
Trial 4. In the following figures, the curves with red solid dots represent the collected BrAC data, whereas
the blue solid dots represent the observed TAC data. In order to apply our scheme we must first resample the data. We use a simple first-order spline or piece-wise linear continuous interpolation scheme
that essentially connects the data points with straight lines. Estimated TAC/BrAC and corresponding
confidence bands are plotted, but to better illustrate our results, the overly conservative Wald bands have
been omitted from the collection of confidence bands shown in the figures.
Figure 9.4: Plot of 70% confidence bands for forward estimation of TAC based on observed BrAC data.
Figure 9.4 shows our forward estimation with collected BrAC data. The red dotted curve is the input,
and the output or estimated TAC is the solid black line. The different types of 70% confidence bands
discussed earlier are presented again with dashed lines with different colors except for the Sup-t band,
which is represented by the pink region enclosed by solid green lines. Compared with the observed TAC
75
data plotted with blue dots, our estimated TAC captures most of the movement trend, with a correlation
coefficient of 0.9614. The majority of the observed TAC points were included in our confidence band,
demonstrating its efficacy in reflecting parts of the hidden noise, the uncertainty in the parameters q, in our
evolution model. The width and coverage of different confidence bands follow from that in our discussion
in Chapter 7.4, with the Naïve band being the narrowest and most aggressive, and the Bonferroni/ŠidáK
bands being more conservative (excluding the Wald band).
Figure 9.5: Plot of 70% confidence bands for the deconvolution problem and estimated BrAC based on
Observed TAC Data.
In Figure 9.5 we have plotted the solution to the deconvolution problem for the same set of BrAC-TAC
data shown in Figure 9.4. In this case, we choose more regularization than we used in the impulse response
case, with regularization parameters α = 0.08 and β = 0.08. The choices are made with regard to the
higher level of oscillation observed in the collected data. We anticipated that these oscillations would affect
the stability and smoothness of our estimated outcomes. The solid red line represents our estimated BrAC
uˆ, and the solid blue line is the reconstructed TAC which is the output of the forward model when the input
is uˆ. Both curves align well with the observed data, with corresponding correlation coefficients of 0.9408
76
and 0.9807, respectively. The 70% confidence bands are plotted as they were in the previous figure, Figure
9.4 with the pink region again representing the Sup-t band. The majority of the observed BrAC data are
enclosed or near the constructed Sup-t confidence band, demonstrating the efficacy of our approach. We
also want to remark that in this case we chose to adjust the second regularization parameter, β, to a value
that maintains a reasonable amount of smoothness while tracking the oscillations in the observed BrAC
data. One may choose to further increase the value of β for the purpose of obtaining a smoother estimate.
In the case of actual drinking episode data, one should keep in mind that the observed data will include
un-modeled observation (e.g. sensor) noise and might not accurately represent the true levels of BrAC and
TAC. In the conclusion chapter, we briefly discuss how we might extend our method to incorporate these
noise processes.
9.4 Convergence of our Approximating Estimates and Confidence Bands
So far we have shown how our approach and results can be used to construct confidence bands for both the
forward and backward estimates. These results were obtained by setting the level of spatial discretization
in the PDE model at a sufficiently high level. In this chapter we illustrate our convergence results; that
is we illustrate the convergence of the confidence bands as the level of discretization or approximation
increases. Since it is of course impossible to numerically compute the actual confidence bands for the infinite dimensional model, we quantify the convergence by looking at sequential differences of our estimated
curves and confidence bands, computing errors by treating them as vectors in R
K.
Figure 9.6 displays the sequential error for the 4 constructed curves in previous chapters. For simplicity,
we label the forward estimates as y1 and y2, corresponding to those in Chapters 9.2 and 9.3. Similarly the
backward or deconvolution estimates are labeled u1 and u2 respectively. Recall that each of the estimates is
a vector in R
K. Taking the initial values u
0
i = y
0
i = 0, we directly compute the standard vector 2-norm for
the sequential difference at stage N as
u
N+1
i − u
N
i
2
and
y
N+1
i − y
N
i
2
, i = 1, 2. Since the estimates
77
Figure 9.6: Plot of sequential difference for the forward and backward estimates as the dimension of approximation N increases.
are naturally bounded, the diminishing sequential difference implies convergence. As we can observe from
the diagram, the differences drop drastically for all scenarios, with a magnitude of about 10−5 obtained as
early as N = 10. These plots not only illustrate our theoretical convergence results in Chapters 7.2 and
7.3, but they also exhibit the rather fast rate of convergence leading to high-quality estimates even when
the discretization level is quite low.
To illustrate the convergence of the confidence bands, we quantify the convergence through the sequential difference in the bandwidth at each state of N. Denote BN
j =
BN
j,1
, ..., BN
j,K
∈ R
K as the vector
of confidence bandwidth at stage N for the five types of confidence bands we consider, where j = 1, 2, ..., 5
corresponds to the Naïve, Bonferroni, ŠidáK, Wald, and Sup-t bands respectively. Recall in Chapter 7.4 we
compute the width of these single-parameter family of bands as BN
j,k = cjσ
N
k
, where cj depends on the
type of the confidence bands and the selected confidence level (0.3), and σ
N
k
, the k-th standard deviation
at stage N, is calculated based on the estimated asymptotic distribution using the delta method. Again
setting the initial value B0
j = 0 for all j = 1, 2, ..., 5, the sequential difference at stage N with respect to
78
the vector 2-norm can be directly computed as
B
N+1
j − BN
j
2
. Note that the dependence on the type of
confidence band and confidence level lies entirely in the critical value, which is in general deterministic,
except for the Sup-t band, and hence does not play a major role in convergence. However, the convergence
rate for different types of confidence bands can be largely influenced by the magnitude of the critical values. On the other hand, since cασk is bounded, the Cauchy sequence is sufficient to show convergence. If
the difference approaches 0 as N → ∞, then the bandwidth is converging, which verifies our results in
Chapters 7.2 and 7.3. The next two figures plot the sequential errors against the approximation dimension
N for all the different estimates and bands we did earlier in this chapter.
Figure 9.7: Plot of sequential differences for the finite dimensional confidence band widths with the impulse
response function.
In Figure 9.7 we have plotted the sequential differences of confidence bandwidths for all the types of
confidence bands we considered in Chapter 9.2 against the approximation dimension for the case with unit
impulse in both the forward calculation, shown in Figure 9.1, and the deconvolution, shown in in Figure
9.2. In all cases, we observe a rapid decay in the sequential differences as the level of discretization,N,
increases, with the difference in convergence rate primarily due to the differences in the magnitudes of
79
the critical values. While the convergence can be naturally deduced for most types of confidence bands,
the Sup-t band displays signs of instability as N increases. Such behavior arises because of the numerical
method we use to estimate the Sup-t critical value, which depends on the calculated distribution and
hence the dimension of approximation N. Moreover, the process involves generating a large number of
random vectors with normal distributions. As one can imagine, even if the distribution converges, as
shown in Chapter 7.4, extreme values that lead to the small jumps we observed should be expected, since
convergence in distribution is weaker than almost sure convergence. In general, the overall sequential
differences are reasonably small, which agrees with the convergence theory presented in previous chapters.
Figure 9.8: Plot of sequential differences for the finite dimensional confidence band widths computed using
actual human subject BrAC and TAC data.
Similarly, Figure 9.8 illustrates our convergence results by showing that the confidence bandwidths
converge in both the forward and the backward cases with actual human subject drinking episode data as
in Chapter 9.3. In this case, the rate of convergence is even slower, but behavior similar to that indicated
in the previous discussion can be observed.
80
The above results pose a natural question as to which type of confidence band is most suitable for
our method. For this example, we selected the asymptotically optimal Sup-t band to describe our data.
However, as observed in the convergence studies discussed above, the Sup-t band has the disadvantage of
exhibiting a slower convergence rate and possible instability in their widths due to the empirical techniques
used to compute them. Different types of bands have their own advantages and it largely depends on the
goal of the study as to which would be the most appropriate band type with either more aggressive or more
conservative coverage. In addition, the band widths were calculated based on the assumed distribution of
our parameters. Our study focuses on the propagation of such a distribution through our system, but the
result could be less convincing with a more empirical distribution for the parameters as a starting point.
81
Chapter 10
Concluding Remarks and Future Research
In this study, we constructed an infinite-dimensional evolution model for the purpose of modeling the
connection between blood/breath alcohol concentration (BAC/BrAC) and biosensor measured transdermal alcohol concentration (TAC). We have also constructed a finite-dimensional approximation framework
for our system and attempted to solve both the forward problem, where we estimated the TAC outputs
based on given BrAC inputs, and the deconvolution problem, where we reconstructed the BrAC readings
based on recorded TAC data. Our primary focus, however, was the propagation of uncertainty in the
model parameters 1) through the forward model to the estimated TAC, and more significantly, 2) through
the inversion or deconvolution process to the estimated BrAC signal based on a biosensor measured and
observed TAC signal. We were able to quantify this propagated uncertainty in the form of confidence
bands by formulating the forward problem as a convolution and the otherwise ill-posed inverse problem
as a regularized least-squares optimization problem. The challenge in doing this was that due to the infinite dimensional nature of the underlying model in the form of an abstract parabolic partial differential
equation, the forward convolution filter involved a parameter-dependent analytic semigroup of bounded
linear operators on an infinite dimensional Hilbert space. Using Fréchet derivatives and a statistical technique known as the Delta method, we were able to compute the asymptotic distribution of the impulse
response function, the forward TAC signal and the deconvolved BrAC signal based on the distribution of
82
the random parameters q. In particular, we investigated and compared multiple options for using the distribution to construct confidence bands. In addition, since the actual computation of the confidence bands
necessitated convergence results finite-dimensional approximation of the infinite dimensional semigroup,
we rigorously established the convergence of the approximating confidence bands to the theoretical, but
computationally out of reach, infinite dimensional confidence bands. We presented numerical studies involving actual human subject drinking episode data that demonstrated the efficacy of our approach and
made it possible for us to compare the different types of confidence bands.
As a result of our study, a number of open questions remain. For example, it would be desirable to have
a quantitative metric that could be used to evaluate the relative merits and disadvantages of the different
types of confidence bands we considered in order to definitively decide which one performs optimally in
various situations. In addition, the our study here neglected the observational sensor error that is no doubt
present in our data. One approach would be consider a mixed effects approach combining the random
effects both across individuals in the population as was done here and random effects within an individuals
observed TAC signal. Another idea is to include the observational noises as part of the joint distribution of
our parameters. Denote ϵu and ϵy be the normally distributed sensor errors in BrAC and TAC respectively.
Under proper assumptions, such as these noises and the parameters q are all uncorrelated, we may argue
by the property of normal distribution that the joint distribution of (q, ϵu, ϵy) satisfies the conditions for
applying the delta method. Then by replacing u and y with u+ϵu and y+ϵy in our formulation in Chapter
6, we can compute the additional partial derivatives needed in the delta method and a similar procedure
follows as stated in Chapters 6 and 7.
Our research also suggested some possible new directions and challenges. The standard Delta method
is based on approximation by a first-order Taylor polynomial. One could consider a higher-order Delta
Method, as shown in Chapter 2.2, for improved accuracy and to avoid the possibility of derivative zeroes.
We derived our approach under the assumption of m input channels and p output channels. However,
83
since our primary motivation for this study was the transdermal alcohol biosensor and in particular, BrAC
deconvolution problem in which m = p = 1. It would be interesting to consider some examples in which
m, and/or p are greater than one. One source of such a system would again be the transdermal alcohol
biosensor. One extension we are currently looking at involves the assimilation of signals from other sensors
including ones that measure sweat, sweat rate, heart rate, and blood flow, so as to mitigate a number of
confounding effects. We are also currently working on problems related to time-sensitive monitoring. One
of the central goals for transdermal alcohol sensing is the ability to monitor quantitative BrAC levels based
on TAC in real time. However, as our model in the present study is based on a Fickian diffusion paradigm,
the resulting system is characterized by infinite speed of propagation. But this is at odds with the observed
behavior of the underlying physical system which exhibits an inherent delay or hysteresis. One idea would
be to replace the parabolic diffusion equation with a hyperbolic telegraph equation based transport model
which in essence takes into account the inertia of the ethanol molecules. This system would have three
parameters with uncertainty rather than the two appearing in the model we considered here.
84
Bibliography
[1] Maria Allayioti. “M-Estimation and Non-Parametric Estimation of a Random Diffusion
Equation-Based Population Model for the Transdermal Transport of Ethanol: Deconvolution and
Uncertainty Quantification”. PhD thesis. University of Southern California, 2023.
[2] L. Asserian, S. E. Luczak, and I.G. Rosen. “Computation of nonparametric, mixed effects, maximum
likelihood, biosensor data based-estimators for the distributions of random parameters in an
abstract parabolic model for the transdermal transport of alcohol”. In: Math Biosci Eng 20.11 (2023),
pp. 20345–20377. doi: 10.3934/mbe.2023900.
[3] H. T. Banks and K. Ito. “A unified framework for approximation in inverse problems for distributed
parameter systems”. In: CTAT 4.1 (1988), pp. 73–90. url:
https://apps.dtic.mil/dtic/tr/fulltext/u2/a193780.pdf.
[4] Saul W. Brusilow and Ellen H. Gordes. “The Permeability of the Sweat Gland To Nonelectrolytes”.
In: American Journal of Diseases of Children 112.4 (1966), pp. 328–333.
[5] George Casella and Roger L Berger. Statistical Inference. Vol. 2. Duxbury, Pacific Grove, CA, 2002.
url: https://books.google.com/books/about/Statistical%5C_Inference.html?id=0x%5C_vAAAAMAAJ.
[6] John D. Cook. Differentiation in Banach Spaces. unpublished notes. May 1994. url:
https://www.johndcook.com/Differentiation_in_Banach_spaces.pdf.
[7] Z. Dai, I. G. Rosen, C. Wang, N. P. Barnett, and S. E. Luczak. “Using Drinking Data and
Pharmacokinetic Modeling to Calibrate Transport Model and Blind Deconvolution Based Data
Analysis Software for Transdermal Alcohol Biosensors”. In: Mathematical Biosciences and
Engineering 13.5 (Oct. 2016), pp. 911–934.
[8] Donald M Dougherty, Nora E Charles, Ashley Acheson, Samantha John, R. Michael Furr, and
Nathalie Hill-Kapturczak. “Comparing the Detection of Transdermal and Breath Alcohol
Concentrations During Periods of Alcohol Consumption Ranging From Moderate Drinking to
Binge Drinking.” In: Experimental and Clinical Psychopharmacology 20.5 (2012), p. 373.
[9] Miguel Dumett, I. G. Rosen, J Sabat, A Shaman, Linda Tempelman, Color Wang, and Robert Swift.
“Deconvolving an Estimate of Breath Measured Blood Alcohol Concentration from Biosensor
Collected Transdermal Ethanol Data”. In: Applied Mathematics and Computation 196 (Apr. 2008),
pp. 724–743. doi: 10.1016/j.amc.2007.07.026.
85
[10] Brobbin E, Deluca P, Hemrage S, and Drummond C. “Accuracy of Wearable Transdermal Alcohol
Sensors: Systematic Review.” In: J Med Internet Res 2022 24(4) (2022), e35178. doi: 10.2196/35178.
[11] Karns-Wright Tara E., John D. Roache, Nathalie Hill-Kapturczak, Yuanyuan Liang, Jillian Mullen,
and Donald M. Dougherty. “Time Delays in Transdermal Alcohol Concentrations Relative to
Breath Alcohol Concentrations”. In: Alcohol and Alcoholism 52.1 (2016), pp. 35–41. issn: 0735-0414.
doi: 10.1093/alcalc/agw058. eprint:
https://academic.oup.com/alcalc/article-pdf/52/1/35/8566843/agw058.pdf.
[12] Lawrence C. Evans. Partial differential equations. Providence, R.I.: American Mathematical Society,
2010. isbn: 9780821849743.
[13] C. E. Fairbairn and D. Kang. “Transdermal alcohol monitors: research, applications, and future
directions.” In: The Handbook of Alcohol Use and Abuse. Elsevier. 24 (2020). doi:
doi.org/10.1016/B978-0-12-816720-5.00014-1.
[14] C. E. Fairbairn, D. Kang, and N. Bosch. “Using machine learning for real-time BAC estimation from
a new-generation transdermal biosensor in the laboratory”. In: Drug and Alcohol Dependence 216
(2020), p. 108205. issn: 0376-8716. doi: https://doi.org/10.1016/j.drugalcdep.2020.108205.
[15] Thomas S. Ferguson. A Course in Large Sample Theory (1st Ed.) Chapman & Hall/CRC, 1996. doi:
https://doi.org/10.1201/9781315136288.
[16] M. Gamella, S. Campuzano, J. Manso, G. González de Rivera, F. López-Colino, A.J. Reviejo, and
J.M. Pingarrón. “A Novel Non-Invasive Electrochemical Biosensing Device for in situ
Determination of the Alcohol Content in Blood by Monitoring Ethanol in Sweat”. In: Analytica
Chimica Acta 806 (Jan. 2014), pp. 1–7. doi: 10.1016/j.aca.2013.09.020.
[17] Alexey Grigorev. Normal Equation. MLWiki. Mar. 2020. url:
http://mlwiki.org/index.php/Normal_Equation#Sources.
[18] K. Hawekotte, S. E. Luczak, and I. G. Rosen. “Deconvolving breath alcohol concentration from
biosensor measured transdermal alcohol level under uncertainty: a Bayesian approach”. In:
Mathematical Biosciences and Engineering 18.5 (2021). doi: 10.3934/mbe.2021335.
[19] Nathalie Hill-Kapturczak, John D. Roache, Yuanyuan Liang, Tara E. Karns, Sharon E. Cates, and
Donald M. Dougherty. “Accounting for Sex-Related Differences in the Estimation of Breath
Alcohol Concentrations Using Transdermal Alcohol Monitoring”. In: Psychopharmacology 232
(2014), pp. 115–123.
[20] Atsushi Inoue and Lutz Kilian. “Joint confidence sets for structural impulse responses”. In: Journal
of Econometrics 192.2 (2016). Innovations in Multiple Time Series Analysis, pp. 421–432. issn:
0304-4076. doi: https://doi.org/10.1016/j.jeconom.2016.02.008.
[21] Yu J, Fairbairn CE, Gurrieri L, and Caumiant EP. “Validating transdermal alcohol biosensors: a
meta-analysis of associations between blood/breath-based measures and transdermal alcohol
sensor output.” In: Addiction. 117 (2022), pp. 2805–2815. doi: 10.1111/add.15953.
86
[22] Tara E. Karns-Wright, Donald M. Dougherty, Nathalie Hill-Kapturczak, Charles W. Mathias, and
John D. Roache. “The correspondence between transdermal alcohol monitoring and daily
self-reported alcohol consumption”. In: Addictive Behaviors 85 (2018), pp. 147–152. issn: 0306-4603.
doi: https://doi.org/10.1016/j.addbeh.2018.06.006.
[23] Tara E. Karns-Wright, John D. Roache, Nathalie Hill-Kapturczak, Yuanyuan Liang, Jillian Mullen,
and Donald M. Dougherty. “Time Delays in Transdermal Alcohol Concentrations Relative to
Breath Alcohol Concentrations”. In: Alcohol and Alcoholism 52 (2017), pp. 35–41.
[24] T. Kato. Perturbation Theory for Linear Operators. Vol. 132. Springer Science & Business Media,
1995. url: https://www.springer.com/us/book/9783540586616.
[25] Dominick A. Labianca. “The chemical basis of the breathalyzer: a critical analysis”. In: J. Chem.
Educ 67.3 (1990), pp. 259–261. doi: DOI:10.1021/ed067p259.
[26] T. R. Leffingwell, N. J. Cooney, J. G. Murphy, S. E. Luczak, I. G. Rosen, D. M. Dougherty, and
N. P. Barnett. “Transdermal alcohol monitoring: 21st century measurement for an age-old
problem.” In: Alcoholism: Clinical and Experimental Research 37 (2013), pp. 16–22.
[27] Richard G. Lister, Clarice Gorenstein, Debra Risher-Flowers, Herbert J. Weingartner, and
Michael J. Eckardt. “Dissociation of the Acute Effects of Alcohol on Implicit and Explicit Memory
Processes”. In: Neuropsychologia 29.12 (Dec. 1991), pp. 1205–1212. doi:
10.1016/0028-3932(91)90034-6.
[28] Haoxing Liu, Larry Goldstein, Susan Luczak, and I. Rosen. “Confidence Bands for Evolution
Systems Described by Parameter-Dependent Analytic Semigroups”. In: Proceedings of the 2023
SIAM Conference on Control and its Applications. Society for Industrial and Applied Mathematics,
July 2023, pp. 125–132. isbn: 978-1-61197-774-5. doi: 10.1137/1.9781611977745.17.
[29] Haoxing Liu, Larry Goldstein, Susan E. Luczak, and I. G. Rosen. “Delta-Method Induced
Confidence Bands for a Parameter-Dependent Evolution System with Application to Transdermal
Alcohol Concentration Monitoring”. In: 2023 62nd IEEE Conference on Decision and Control (CDC).
2023, pp. 6211–6216. doi: 10.1109/CDC49753.2023.10383767.
[30] S. E. Luczak and V. A. Ramchandani. “Special issue on alcohol biosensors: Development, use, and
state of the field: Summary, conclusions, and future directions”. In: Alcohol 81 (2019), pp. 161–165.
[31] S. E. Luczak and I. G. Rosen. “Estimating BrAC from transdermal alcohol concentration data using
the BrAC Estimator software program.” In: Alcoholism: Clinical and Experimental Research 38
(2014), pp. 2243–2252.
[32] Helmut Lütkepohl, Anna Staszewska-Bystrova, and Peter Winker. “Constructing Joint Confidence
Bands for Impulse Response Functions of VAR Models”. In: Econometrics and Statistics 13 (Sept.
2018), pp. 69–83.
[33] José Luis Montiel Olea and Mikkel Plagborg-Møller. “Simultaneous Confidence Bands: Theoretical
Comparisons and Recommendations for Practice”. In: Unpublished manuscript (2017).
87
[34] José Luis Montiel Olea and Mikkel Plagborg-Møller. “Simultaneous confidence bands: Theory,
implementation, and an application to SVARs”. In: Journal of Applied Econometrics 34.1 (2019),
pp. 1–17. doi: https://doi.org/10.1002/jae.2656.
[35] E. Nyman and A. Palmlöv. “The Elimination of Ethyl Alcohol in Sweat 1”. In: Skandinavisches
Archiv Für Physiologie 74.2 (1936), pp. 155–159.
[36] C. Oszkinat, S. E. Luczak, and I. G. Rosen. “An abstract parabolic system-based physics-informed
long short-term memory network for estimating breath alcohol concentration from transdermal
alcohol biosensor data”. In: Neural Computing and Applications (2022). doi:
10.1007/s00521-022-07505-w.
[37] C. Oszkinat, S. E. Luczak, and I. G. Rosen. “Uncertainty quantification in estimating blood alcohol
concentration from Transdermal alcohol level with physics-informed neural networks”. In: IEEE
Transactions on Neural Networks and Learning Systems (2022). doi: 10.1109/TNNLS.2022.3140726.
[38] C. Oszkinat, T. Shao, C. Wang, I. G. Rosen, A. Rosen, E. Saldich, and S. E. Luczak. “Blood and breath
alcohol concentration from transdermal alcohol biosensor data: estimation and uncertainty
quantification via forward and inverse filtering for a covariate-dependent, physics-informed,
hidden Markov model”. In: Inverse Problems 38.5 (2022), p. 055002. doi: 10.1088/1361-6420/ac5ac7.
[39] Gaston L. S. Pawan and K. Grice. “Distribution of Alcohol in Urine and Sweat After Drinking”. In:
The Lancet 292.7576 (1968), p. 1016.
[40] A. Pazy. Semigroups of Linear Operators and Applications to Partial Differential Equations. Applied
Mathematical Sciences. Springer, 1983. isbn: 9783540908456. url:
https://books.google.com/books?id=80XYPwAACAAJ.
[41] John D. Roache, Tara E. Karns-Wright, Martin Goros, Nathalie Hill-Kapturczak,
Charles W. Mathias, and Donald M. Dougherty. “Processing transdermal alcohol concentration
(TAC) data to detect low-level drinking”. In: Alcohol 81 (2019), pp. 101–110. issn: 0741-8329. doi:
https://doi.org/10.1016/j.alcohol.2018.08.014.
[42] I. G. Rosen, S. E. Luczak, W. W. Hu, and M. Hankin. “Discrete-Time Blind Deconvolution for
Distributed Parameter Systems with Dirichlet Boundary Input and Unbounded Output with
Application to a Transdermal Alcohol Biosensor”. In: 2013 Proceedings of the Conference on Control
and its Applications. IEEE, 2013, pp. 160–167. doi: 10.1137/1.9781611973273.22.
[43] I. G. Rosen, Susan E Luczak, and Jordan Weiss. “Blind Deconvolution for Distributed Parameter
Systems with Unbounded Input and Output and Determining Blood Alcohol Concentration from
Transdermal Biosensor Data”. In: Applied mathematics and computation 231 (Mar. 2014),
pp. 357–376. issn: 0096-3003. doi: 10.1016/j.amc.2013.12.099.
[44] E. B. Saldich, C. Wang, I. G. Rosen, L. Goldstein, J. Bartroff, R. M. Swift, and S. E. Luczak.
“Obtaining high-resolution multi-biosensor data for modeling transdermal alcohol concentration
data”. In: Alcoholism: Clinical and Experimental Research 44 (2020). Suppl. 1.
[45] M. H. Schultz. Spline Analysis. Prentice-Hall Series in Automatic Computation. Pearson Education,
Limited, 1972. isbn: 9780138354053. url: https://books.google.com/books?id=AdRQAAAAMAAJ.
88
[46] S Seubert and J.G Wade. “Fréchet Differentiability of Parameter-Dependent Analytic Semigroups”.
In: Journal of Mathematical Analysis and Applications 232.1 (1999), pp. 119–137. issn: 0022-247X.
doi: doi.org/10.1006/jmaa.1998.6249.
[47] Ralph E Showalter. Hilbert Space Methods in Partial Differential Equations. Courier Corporation,
2010.
[48] Zbynek Sidak. “Rectangular Confidence Regions for the Means of Multivariate Normal
Distributions”. In: Journal of the American Statistical Association 62.318 (1967), pp. 626–633. issn:
01621459. url: http://www.jstor.org/stable/2283989 (visited on 03/27/2023).
[49] Melike Sirlanci, Susan E. Luczak, Catharine E. Fairbairn, Dahyeon Kang, Ruoxi Pan, Xin Yu, and
I. G. Rosen. “Estimating the distribution of random parameters in a diffusion equation forward
model for a transdermal alcohol biosensor”. In: Automatica 106 (2019), pp. 101–109. issn:
0005-1098. doi: doi.org/10.1016/j.automatica.2019.04.026.
[50] Melike Sirlanci, Susan E. Luczak, and I. G. Rosen. “Estimation of the Distribution of Random
Parameters in Discrete Time Abstract Parabolic Systems with Unbounded Input and Output:
Approximation and Convergence”. In: Commun Appl Anal 23.2 (2019), pp. 287–329. doi:
10.12732/caa.v23i2.4.
[51] Melike Sirlanci Tuysuzoglu. “Finite Dimensional Approximation and Convergence in the
Estimation of the Distribution of, and Input to, Random Abstract Parabolic Systems with
Application to the Deconvolution of Blood/Breath Alcohol Concentration from Biosensor
Measured Transdermal Alcohol Level”. PhD thesis. University of Southern California, 2018.
[52] H. Tanabe. Equations of Evolution. Monographs and Studies in Mathematics. Pitman, London, 1979.
isbn: 9780273011378. url: https://books.google.com/books?id=Dn6zAAAAIAAJ.
[53] L. G. Wade. Organic Chemistry. Prentice Hall, 2003. isbn: 9780130338327.
[54] K. R. Walden, E. B. Saldich, G. Wong, H. Liu, C. Wang, I. G. Rosen, L. Goldstein, J. Bartroff,
R. M. Swift, and S. E. Luczak. “Momentary assessment of drinking: Past methods, current
approaches incorporating biosensors, and future directions”. In: In C. Fairbairn and K. Federmeier,
editors, Special Issue on New Developments in Addiction Science, Psychology of Learning and
Motivation, Vol 79. Academic Press, 2023, pp. 271–301.
[55] G. D. Webster and H. C. Gabler. “Modeling of Transdermal Transport of Alcohol Effect of Body
Mass and Gender”. In: Biomedical Sciences Instrumentation 44 (Feb. 2008), pp. 361–366.
[56] Gregory D. Webster and Hampton C. Gabler. “Feasibility of Transdermal Ethanol Sensing for the
Detection of Intoxicated Drivers”. In: Annual Proceedings / Association for the Advancement of
Automotive Medicine. Association for the Advancement of Automotive Medicine 51 (Feb. 2007),
pp. 449–64.
[57] J. Wloka. Partial Differential Equations. Cambridge University Press, Cambridge, 1987.
89
[58] G. Y. Wong, E. B. Saldich, K. R. Walden, C. Wang, I. G. Rosen, L. Goldstein, J. Bartroff, R. M. Swift,
and S. E. Luczak. “A laboratory protocol for obtaining transdermal and breath alcohol biosensor
data: Next steps toward collecting data in the field”. In: Alcoholism: Clinical and Experimental
Research 47 (2023), 249A.
[59] M. Yao, E. Saldich, G. Wong, S. E. Luczak, M. Allayioti, C. Wang, and I. G. Rosen. “Real-time
recursive estimation of, and uncertainty quantification for, breath alcohol concentration via LQ
tracking control-based inverse filtering of transdermal alcohol biosensor signals”. In: Applied
Mathematics Modern Challenges 2.1 (2024), pp. 38–69. doi: 10.3934/ammc.2024003.
[60] Mengsha Yao, Susan E. Luczak, Emily B. Saldich, and I. G. Rosen. “A population model-based
linear-quadratic Gaussian compensator for the control of intravenously infused alcohol studies
and withdrawal symptom prophylaxis using transdermal sensing”. In: Optim Control Appl Meth.
(2023), pp. 1–29. doi: DOI:10.1002/oca.2934.
[61] Mengsha Yao, Susan E. Luczak, Emily B. Saldich, and I. G. Rosen. “Tracking and blind
deconvolution of blood alcohol concentration from transdermal alcohol biosensor data: A
population model-based LQG approach in Hilbert space”. In: Automatica 147 (Jan. 2023), p. 110699.
doi: https://doi.org/10.1016/j.automatica.2022.110699.
90
Abstract (if available)
Abstract
Full title: Delta Method confidence bands for parameter-dependent impulse response functions, convolutions, and deconvolutions arising from evolution systems described by analytic semigroups of operators with regularly dissipative generators. Abstract: We present a comprehensive study of the propagation of parameter uncertainty in a single input/single output (SISO) infinite dimensional random evolution system that models the relationship between blood or breath alcohol concentration (BAC/BrAC), the input signal, and biosensor measured transdermal alcohol concentration (TAC), the output signal, passively collected and noninvasively observed at the stratum corneum, the outer surface of the skin or epidermal layer. Described by a first principles physics-based random SISO diffusion model in the form of an abstract parabolic system set in a Gelfand triple of Hilbert spaces, the resulting input output model takes the form of a discrete convolution with filter involving an analytic semigroup of bounded linear operators with regularly dissipative generator that depends on the model's random parameters known only up to their distributions. To quantify the propagation of uncertainty in more general multi-input channel/multi-output channel systems of this type, we use Fr\'echet derivatives of the semigroup with respect to the parameters together with a statistical technique known as the Delta Method. In particular, we consider the system's impulse response matrix or function, and both the forward and inverse or deconvolution problems for the purpose of estimating the input signals (e.g. BrAC) based on observations of the output signals (e.g. TAC). The deconvolution process is formulated as an abstract regularized least-squares problem. A finite-dimensional approximation and convergence theory is developed, and the results of our numerical studies involving actual human subjects drinking episode data are presented and discussed.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Obtaining breath alcohol concentration from transdermal alcohol concentration using Bayesian approaches
PDF
Finite dimensional approximation and convergence in the estimation of the distribution of, and input to, random abstract parabolic systems with application to the deconvolution of blood/breath al...
PDF
M-estimation and non-parametric estimation of a random diffusion equation-based population model for the transdermal transport of ethanol: deconvolution and uncertainty quantification
PDF
An abstract hyperbolic population model for the transdermal transport of ethanol in humans: estimating the distribution of random parameters and the deconvolution of breath alcohol concentration
PDF
Physics-informed machine learning techniques for the estimation and uncertainty quantification of breath alcohol concentration from transdermal alcohol biosensor data
PDF
Prohorov Metric-Based Nonparametric Estimation of the Distribution of Random Parameters in Abstract Parabolic Systems with Application to the Transdermal Transport of Alcohol
PDF
Linear quadratic control, estimation, and tracking for random abstract parabolic systems with application to transdermal alcohol biosensing
PDF
Distributed parameter model based system identification and filtering in the estimation of blood alcohol concentration from transdermal alcohol biosensor data
PDF
Population modeling and Bayesian estimation for the deconvolution of blood alcohol concentration from transdermal alcohol biosensor data
PDF
Simultaneous parameter estimation and semi-blind deconvolution in infinite-dimensional linear systems with unbounded input and output
Asset Metadata
Creator
Liu, Haoxing
(author)
Core Title
Delta Method confidence bands for parameter-dependent impulse response functions, convolutions, and deconvolutions arising from evolution systems described by…
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Applied Mathematics
Degree Conferral Date
2024-08
Publication Date
07/01/2024
Defense Date
06/24/2024
Publisher
Los Angeles, California
(original),
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
analytic semigroups of operators,blood and breath alcohol concentration,confidence bands,Delta Method,evolution system,Fréchet derivatives,OAI-PMH Harvest,transdermal alcohol concentration and biosensor
Format
theses
(aat)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Rosen, Gary (
committee chair
), Goldstein, Larry (
committee member
), Luczak, Susan (
committee member
)
Creator Email
antoniusliu82@gmail.com,haoxingl@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113997AIZ
Unique identifier
UC113997AIZ
Identifier
etd-LiuHaoxing-13168.pdf (filename)
Legacy Identifier
etd-LiuHaoxing-13168
Document Type
Dissertation
Format
theses (aat)
Rights
Liu, Haoxing
Internet Media Type
application/pdf
Type
texts
Source
20240701-usctheses-batch-1176
(batch),
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu
Tags
analytic semigroups of operators
blood and breath alcohol concentration
confidence bands
Delta Method
evolution system
Fréchet derivatives
transdermal alcohol concentration and biosensor