THE WAVELET FILTER: A NEW METHOD FOR NONLINEAR,
NONGAUSSIAN FILTERING IN ONE DIMENSION
by
Daniel M. Johnson
A Thesis Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF SCIENCE
(COMPUTER SCIENCE)
August 2009
Copyright 2009 Daniel M. Johnson
Dedication
This thesis is dedicated to Eileen, whose patience and cheerful support in the face of so much uncertainty have been critical. Also to Philip, Benjamin and Miriam; although this thesis has been around longer than some of them, I like them more.
Acknowledgements
I would like to thank Gaurav Sukhatme for allowing me wide latitude to pursue my own research interests.
I would also like to thank Northrop Grumman for its generous tuition reimbursement program.
Table of Contents
Dedication
Acknowledgements
List of Figures
Abstract
Chapter 1: The Discrete Time Nonlinear Filtering Problem
Chapter 2: Classical Orthogonal Function Systems (COFS) and Nonlinear Filtering
Chapter 3: Five Problems with Applying COFS to Nonlinear Filtering
3.1 Adapting Location and Scale Parameters
3.2 Preserving Positivity
3.3 Determining the Coefficients in the First Place
3.4 Propagating Coefficients Through an Action Step
3.5 Propagating Coefficients Through a Measurement Step
Chapter 4: Orthogonal Wavelets and the Specialized Meyer-Type Wavelets
4.1 Orthogonal Wavelets
4.2 Approximation Properties
4.3 Meyer-type Wavelets
4.4 The Coiflet Property
Chapter 5: How Wavelets Solve the Five Problems
5.1 Adapting Location and Scale Parameters
5.2 Preserving Positivity
5.3 Determining the Coefficients in the First Place
5.4 Propagating Coefficients Through an Action Step
5.5 Propagating Coefficients Through a Measurement Step
Chapter 6: Related and Future Work
References
List of Figures
3.1 N(0, 1) approximated by Hermite functions derived from w(x) = N(μ, 1) for μ = 0, 3, 6, 9, 12; 10 terms.
3.2 N(0, 1) approximated by Hermite functions derived from w(x) = N(0, σ) for σ = 1, 2, 3, 4, 5; 10 terms.
3.3 N(0, 1) approximated by Hermite functions derived from w(x) = N(0, σ) for σ = 1, 1/2, 1/3, 1/4, 1/5; 10 terms.
4.1 Step function projected onto V_0
4.2 Step function projected onto V_1
4.3 Step function projected onto V_2
4.4 Step function projected onto V_3
4.5 Step function approximated by complex exponentials over (−4, 4), wavenumbers from −30 to 30
5.1 N(0, 1) approximated with 11 terms of j = 0
5.2 N(0, 10) approximated with 11 terms of j = 3
5.3 N(10, 1) approximated with 11 terms of j = 0
5.4 f(X) = X² in an approximately linear regime: φ_{0,9}(x²) (red), φ_{−2,12}(x) (green), φ_{−3,24}(x) (blue)
5.5 f(X) = X² is approximately linear at x = 3
5.6 f(X) = X² around a point where f′(x) = 0: φ_{−3,0}(x²), φ_{0,0}(x²), φ_{3,0}(x²)
5.7 f(X) = X² around a point where f′(x) = 0: φ_{0,{−5,...,5}}(x²)
5.8 Y = X² around a boundary of the domain of Y: φ_{−2,{−10,...,−1}}(x²)
Abstract
While classical orthogonal function systems (COFS) are very successful at approximating static functions, there are several problems with using them in multi-stage estimation. These problems are adapting location and scale parameters between time steps, preserving positivity in the PDF approximation, determining the initial values of the decomposition coefficients, propagating the coefficients through general nonlinear functions, and propagating the coefficients through the multiplications required for application of Bayes's rule. Orthogonal wavelets, particularly Meyer-type wavelets, have several attractive properties missing from COFS. We show how wavelets avoid each of the problems encountered with COFS in nonlinear filtering.
Chapter 1
The Discrete Time Nonlinear Filtering Problem
The general filtering problem is to estimate the state of a dynamical system. The system is subject to an input control which is known exactly but which is corrupted by noise during application. We are able to measure some function of the system state, but these measurements are likewise corrupted by noise in the course of observation. For the sake of simplicity, we make the following common assumptions:
- The system state is finite dimensional.
- The system both evolves and is observed in discrete time.
- Both the state update and the measurements are Markovian, i.e. they depend only on the current system state and not any past or future state values.
- The process and measurement noise processes are independent of each other and uncorrelated with their past and future values.
- Both the process and observation noise are additive.
In terms of equations we have
\[ x_{k+1} = f(x_k, u_k) + w_k \tag{1.1} \]
\[ z_k = h(x_k) + v_k \tag{1.2} \]
where x_k is the system state, u_k is the known applied control, w_k is the process noise, and f is the system transition function; z_k is the measurement, the only new information directly available at time t_k, v_k is the measurement noise, and h is the measurement function which determines the value of z_k. Any one of the above variables may be vector-valued and each may have a different dimension, but for the sake of simplicity we will write scalar equations from now on.
Our goal is to estimate the probability distribution of the state at each time step given our estimate of the previous time step and the current control input and measurement output. Any relevant question about the state of the system may be answered using this distribution. We therefore need to convert (1.1) and (1.2) from equations of random variables to equations of probability density functions.
For (1.1) we can begin with the definition of the transformation of a random variable [2]: If Y, X_1, ..., X_n are scalar random variables and y = f(x_1, ..., x_n), then
\[ P_Y(y) = \int \cdots \int P_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)\, \delta\bigl(y - f(x_1, x_2, \ldots, x_n)\bigr)\, dx_1\, dx_2 \cdots dx_n \tag{1.3} \]
We can therefore derive
\[ p(x_{k+1} \mid Z_k, u_k) = \int\!\!\int p(x_k, w_k \mid Z_k)\, \delta\bigl(x_{k+1} - f(x_k, u_k) - w_k\bigr)\, dx_k\, dw_k \tag{1.4} \]
where p(x | Z_k) means p(x | z_0, z_1, ..., z_k). It is also common to implicitly marginalize the noise w_k and write (1.4) as a Chapman-Kolmogorov equation
\[ p(x_{k+1} \mid Z_k, u_k) = \int p(x_k \mid Z_k)\, p(x_{k+1} \mid x_k, Z_k, u_k)\, dx_k \tag{1.5} \]
in which case p(x_{k+1} | x_k, Z_k, u_k) can be derived from (1.3) by
\[ p(x_{k+1} \mid x_k, Z_k, u_k) = \int p(w_k \mid x_k)\, \delta\bigl(x_{k+1} - f(x_k, u_k) - w_k\bigr)\, dw_k \]
This latter form is usually preferred when one wishes to avoid introducing δ functions for pedagogical or other reasons, because it is easier to give a heuristic derivation of p(x_{k+1} | x_k, Z_k, u_k) than of the whole update equation. However, we prefer to work directly with the form (1.4) because it makes it clear that the system update is really a kind of nonlinear convolution of density functions. Also note that (1.4) is still valid in the cases of non-additive noise, noise correlated with past and future noise values, and noise correlated with the system state (with the obvious modifications); heuristic derivations quickly become impossible in these conditions.
For the measurement step (1.2), the same reasoning can be used to derive p(z_k | x_k, Z_{k−1}). Bayes's rule can then be employed to derive
\[ p(x_k \mid u_k, Z_k) = \frac{p(x_k \mid u_{k-1}, Z_{k-1})\, p(z_k \mid x_k)}{\int p(x_k \mid u_{k-1}, Z_{k-1})\, p(z_k \mid x_k)\, dx_k} \tag{1.6} \]
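To make the prediction and measurement updates concrete, here is a minimal numerical sketch of (1.5) and (1.6) carried out on a fixed grid of state values. The scalar system x_{k+1} = x_k + u_k + w_k, z_k = x_k + v_k with Gaussian noise is an assumption made only for this illustration, not a system treated in the thesis.

```python
import numpy as np

# Hypothetical scalar system, for illustration only:
#   x_{k+1} = x_k + u_k + w_k,  z_k = x_k + v_k,  with zero-mean Gaussian noises.
x = np.linspace(-10.0, 10.0, 401)          # state grid
dx = x[1] - x[0]

def gauss(t, sigma):
    return np.exp(-0.5 * (t / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

prior = gauss(x - 0.0, 1.0)                # p(x_k | Z_k)

# Prediction step, a discretized form of (1.5).
u = 1.0
trans = gauss(x[:, None] - (x[None, :] + u), 0.5)   # p(x_{k+1} | x_k, u_k) on the grid
predicted = trans @ prior * dx

# Measurement step, a discretized form of (1.6).
z = 1.3
likelihood = gauss(z - x, 0.8)             # p(z_k | x_k)
posterior = predicted * likelihood
posterior /= np.sum(posterior) * dx        # the normalization integral in (1.6)

print("posterior mean:", np.sum(x * posterior) * dx)
```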
It is not obvious how to deal with equations (1.4) and (1.6) in a computationally efficient way for general f, h and p(x_0). Historically, this problem has been discussed in terms of finding a "parameterization" of p(x_k | Z_k) such that the number of parameters to track is sufficiently small.
Most early efforts concentrated on parameterizing by the set of conditional moments ∫ p(x_k | Z_k) x_k^n dx_k, n = 1, 2, ... The first successful parameterization was given by Kalman and Bucy in terms of the first and second conditional moments in the case of linear f and h [25, 26]. When the initial state distribution p(x_0) is Gaussian, this parameterization is exact. However, later research showed that the moment parameterization in the general case requires all moments to be exact, i.e. the evolution of each moment depends on higher-order moments [8, 29, 30]. (Note that these references and many others that follow refer to the continuous time filtering problem, but the results on parameterization are still applicable to the discrete time problem.) Analogous problems in the physical sciences have been successfully approximated by assuming a single function that gives the value of all moments of a certain order and above (the simplest function, f(m_i) = 0, leading to a simple truncation of the moment equations) [49, 38]. This approach can create large instabilities, however, and was never very successful when applied to nonlinear filtering [4, 39, 29].
One might ask whether the moment parameterization approach is simply inadequate and whether another, more powerful approach might yield a finite dimensional parameterization in the general case. Indeed, in addition to the Kalman-Bucy filter, there do exist parameterizations of p(x_k | Z_k) with a finite number of parameters for some other specific forms of f, h and p(x_0) [5, 54, 6, 14, 15]. However, it has been proven that there exists at least one system which will not admit an exact finite parameterization [20, 37]. Therefore, in the most general case any exact parameterization will involve an infinite number of parameters.
Chapter 2
Classical Orthogonal Function Systems (COFS) and Nonlinear Filtering
One promising possible parameterization of the density function is as an orthogonal series expansion
\[ p(x_k \mid Z_k) = \sum_{i=0}^{\infty} c_i\, \phi_i(x) \tag{2.1} \]
where the φ_i, i = 0, 1, ..., form a complete orthonormal function set with an inner product given by
\[ \int \phi_i(x)\, \phi_j(x)\, dx = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]
(We will mostly ignore completeness questions and assume the completeness of the given set φ_i has already been established.) Taking the inner product of both sides of (2.1) with φ_j gives the coefficients c_i as
\[ c_j = \int p(x_k \mid Z_k)\, \phi_j(x)\, dx \]
The series (2.1) is then made into a finite parameterization by keeping only a finite number of terms. This approach is attractive because classical orthogonal function systems are well understood and are known to produce excellent approximations for static functions. (We use the abbreviation COFS to mean a classical orthogonal function system, which roughly means a complete orthonormal set where frequency increases with the "wavenumber" n.)
Almost all previous attempts at using orthogonal function systems to attack the nonlinear filtering problem have used either complex exponentials when the domain is periodic [10, 11, 32, 52, 53] or Hermite polynomials when the domain is infinite and the density expected to be almost Gaussian [12, 21, 22, 27, 31, 44, 50].
The case of Hermite polynomials is interesting because they have been used both as a COFS and as a biorthogonal function system, a difference we outline briefly. Like all classical orthogonal polynomials, the Hermite polynomials may be defined in several ways, including via a recurrence relation or a Rodrigues equation [7]. They are orthogonal with respect to the weight function e^{−x²}:
\[ \int H_i(x)\, H_j(x)\, e^{-x^2}\, dx = \begin{cases} 0 & \text{if } i \neq j \\ \sqrt{\pi}\, 2^i\, i! & \text{if } i = j \end{cases} \tag{2.2} \]
The more common (and more complicated) use of Hermite polynomials in the nonlinear filtering literature is to use (2.2) to define a biorthogonal function system that expands a function f as:
\[ \phi_i(x) = \frac{1}{\sqrt{2^i\, i!}}\, H_i(x) \]
\[ \tilde{\phi}_j(x) = H_j(x)\, e^{-x^2} \]
\[ f(x) = \sum_{i=0}^{\infty} c_i\, \tilde{\phi}_i(x) \]
\[ c_i = \int f(x)\, \phi_i(x)\, dx \]
This approach has the advantage that each coefficient c_i is a linear combination of the moments of f of order i and below, so computing moments from the c_i is easy. For this reason the c_i are called quasimoments in the literature [31]. However, this approach suffers from numerical instabilities similar to those encountered when approximating by moments directly, and from somewhat stringent conditions on f to guarantee convergence [27]. A more stable and straightforward expansion may be had from (2.2):
\[ \phi_i(x) = \frac{1}{\sqrt{\sqrt{\pi}\, 2^i\, i!}}\, H_i(x)\, e^{-x^2/2} \tag{2.3} \]
\[ f(x) = \sum_{i=0}^{\infty} c_i\, \phi_i(x) \]
\[ c_i = \int f(x)\, \phi_i(x)\, dx \]
which converges in the mean-square sense for all f ∈ L²(−∞, ∞) [27, 40].
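As a small numerical sketch of the expansion (2.3) (an illustration, not code from the thesis), the coefficients c_i of a density can be approximated with Gauss-Hermite quadrature; the target density and the number of terms below are arbitrary choices.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval
from math import factorial, pi, sqrt

def hermite_function(i, x):
    """phi_i(x) = H_i(x) exp(-x^2/2) / sqrt(sqrt(pi) 2^i i!), as in (2.3)."""
    coeffs = np.zeros(i + 1)
    coeffs[i] = 1.0
    Hi = hermval(x, coeffs)                             # physicists' Hermite H_i(x)
    return Hi * np.exp(-0.5 * x ** 2) / sqrt(sqrt(pi) * (2.0 ** i) * factorial(i))

def expansion_coefficients(f, n_terms, quad_deg=80):
    """c_i = integral of f(x) phi_i(x) dx, via Gauss-Hermite quadrature."""
    nodes, weights = hermgauss(quad_deg)                # rule integrates g(x) e^{-x^2}
    c = np.empty(n_terms)
    for i in range(n_terms):
        g = f(nodes) * hermite_function(i, nodes) * np.exp(nodes ** 2)
        c[i] = np.sum(weights * g)
    return c

# Example: expand a standard normal density in 10 terms, then evaluate the series.
f = lambda x: np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)
c = expansion_coefficients(f, 10)
x = np.linspace(-5, 5, 201)
approx = sum(c[i] * hermite_function(i, x) for i in range(len(c)))
print("max abs error:", np.max(np.abs(approx - f(x))))
```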
The paper by Kizner is by far the best source for applications of Hermite polynomials to nonlinear filtering [27].
Chapter 3
Five Problems with Applying COFS to Nonlinear Filtering
As successful as classical orthogonal function systems are at single stage estimation, they encounter several serious problems when used in multistage estimation.
3.1 Adapting Location and Scale Parameters
First is the question of how to set the location and scale parameters that all classical function systems require, e.g. the mean and variance of the Gaussian weight function for Hermite polynomials or the scaling and translation to (−π, π) for complex exponentials. Although COFS expansions will nominally converge in the mean-square sense for any function in the appropriate domain, the number of terms required for a good approximation is highly dependent on these location and scale parameters.
For example, with N(μ, σ) denoting a Gaussian of mean μ and standard deviation σ, consider approximating N(0, 1) by Hermite functions as in (2.3) but where the underlying weight function w(x) from (2.2) has been shifted and scaled from N(0, √2) to arbitrary N(μ, σ). Approximations with w(x) of varying means μ are shown in Figure 3.1, with w(x) of increasing σ in Figure 3.2, and with w(x) of decreasing σ in Figure 3.3. Each approximation uses 10 terms.
[Figure 3.1: N(0, 1) approximated by Hermite functions derived from w(x) = N(μ, 1) for μ = 0, 3, 6, 9, 12; 10 terms.]
[Figure 3.2: N(0, 1) approximated by Hermite functions derived from w(x) = N(0, σ) for σ = 1, 2, 3, 4, 5; 10 terms.]
[Figure 3.3: N(0, 1) approximated by Hermite functions derived from w(x) = N(0, σ) for σ = 1, 1/2, 1/3, 1/4, 1/5; 10 terms.]
Clearly the approximation gets worse rapidly as the function being approximated diverges from the underlying weight function. (In the case where the weight function is at a substantially larger scale than the function to be approximated, this problem may be understood as a manifestation of the Heisenberg uncertainty principle.) This is the single most important problem encountered when COFS are used in nonlinear filtering, where any interesting problem will involve state that moves away from its initial values and estimates whose certainty grows and shrinks over time. Indeed, this problem will be seen to contribute substantially to all the others.
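The degradation shown in Figures 3.1-3.3 is easy to reproduce in outline. The sketch below (an illustration under its own conventions, not the code behind the thesis figures) measures the L² error of a fixed 10-term Hermite-function expansion of N(0, 1) as the weight function's mean μ is shifted, as in Figure 3.1.

```python
import numpy as np
from numpy.polynomial.hermite import hermval, hermgauss
from math import factorial, pi, sqrt

def phi(i, x):
    # Hermite function of (2.3): H_i(x) e^{-x^2/2} / sqrt(sqrt(pi) 2^i i!)
    c = np.zeros(i + 1); c[i] = 1.0
    return hermval(x, c) * np.exp(-0.5 * x**2) / sqrt(sqrt(pi) * 2.0**i * factorial(i))

target = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)      # N(0, 1)
nodes, weights = hermgauss(80)                                      # Gauss-Hermite rule
x = np.linspace(-8, 8, 401)

# A 10-term expansion in the shifted basis phi_i(x - mu); larger mu means a weight
# function centred farther from the target, as in Figure 3.1.
for mu in [0.0, 3.0, 6.0, 9.0, 12.0]:
    c = [np.sum(weights * target(nodes + mu) * phi(i, nodes) * np.exp(nodes**2))
         for i in range(10)]
    approx = sum(c[i] * phi(i, x - mu) for i in range(10))
    err = np.sqrt(np.trapz((approx - target(x))**2, x))
    print(f"mu = {mu:4.1f}   L2 error = {err:.4f}")
```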
Past researchers have outlined several ways to deal with this problem. The simplest way is to ignore it and hope for the best. This is the course typically taken when using complex exponentials over a periodic domain, where the lack of an explicit weight function makes this question implicit and shifting or scaling the domain would introduce aliasing problems. It is clear, however, that such filters will require more terms in their approximations when the filter is tracking well and probability mass is concentrated in a small portion of the domain (−π, π) than when it is tracking poorly and probability mass is more diffuse [53]. This corresponds to the case illustrated in Figure 3.2.
The most interesting attempt to deal with this problem in the general case was developed by Center [13], who in the case of a COFS approximation proposed using a least squares optimization method on the location and scale parameters before computing the c_i by the normal inner products. This would require a gradient-descent or some other minimization algorithm to pick the best parameters at each time step. (Center's method is actually much more general, and he gives examples of using it to pick the best parameters for Gaussian sum approximations [1, 41] and the point-mass method [9].) Because of the computational load required, this method has not seen heavy use.
The most common way to set the location and scale parameters when using Hermite polynomials (in either incarnation) is to calculate the μ and σ of the current approximation and use a weight function of that μ and some multiple of that σ [21, 22, 27]. A multiple of σ is used to increase the probability that the weight function effectively overlaps most of the probability mass. (In the case of multiple dimensions, where tensor products of Hermite expansions in the different dimensions have been considered, the coordinate system is usually rotated to align with the eigenvectors of the covariance matrix, although some researchers have preferred simply to assume a separable product to begin with [44].)
Of course, asking how location and scale parameters are adjusted from one time step to the next begs the question of how the c_i themselves are propagated, which we discuss next.
3.2 Preserving Positivity
Because the only way to get more accuracy when using a COFS is to use higher frequency terms, it's common to end up with large oscillations that cause the approximate PDF to be negative over large chunks of the domain. Past researchers have reported that their filters were very sensitive to this problem [11, 29, 23]. In fact, Bucy seems to have developed the point-mass method [9] specifically to avoid this problem.
3.3 Determining the Coefficients in the First Place
Before we can do anything else about nonlinear filtering with a COFS, we have to have a way to get the coefficients c_n in the first place. Obviously there is no general closed form for their initial values
\[ c_n = \int p(x_0)\, \phi_n(x_0)\, dx_0 \]
For complex exponentials, the fast Fourier transform is an obvious candidate and seemed to work well [51]. In addition, for several of the specific problems considered with complex exponentials, it is possible to derive an explicit (series) formula for the coefficients [11, 52, 53].
For Hermite polynomials, almost all researchers used Gaussian quadrature to approximate the initial coefficients. This approach seemed to work well. See Kizner's paper for details [27].
All in all, this is the only problem introduced by using COFS in nonlinear filtering that could fairly be said to have been satisfactorily solved.
3.4 Propagating Coefficients Through an Action Step
Having discussed how to determine the functions φ_i at time t_{k+1} given the estimate at time t_k and how to get the initial coefficients c_i at time t = t_0, it is still not at all clear how the coefficients c_i transform under f or h.
There have been a few basic approaches, and several can be used together at once. One is to use some quadrature rules to approximate the integrations involved, with an implicit discretization of x_{k+1}. Most researchers used Gauss-Hermite quadrature [21] but some used the quadrature rules associated with other orthogonal polynomials such as the Legendre polynomials [28].
Another approach is to only worry about propagating some lower-order moments. This was the common approach when using the quasimoments. A related approach is to expand f and h in Taylor series and either come up with filters that are exact when f and h are low-order polynomials [50] or else use the series to propagate the moments [43]. Some researchers have reported that their filters were very sensitive to these approximations [43, 23].
Another approach is to use an assumed density expansion, i.e. given the c_i at time t_k and either f or h, find the best fit from a family of functions and then expand that function in the COFS. Typically this was used in connection with linearized f or h [43]. Note that this case covers truncation of a COFS series as a whole, since truncation implicitly assumes a form for the density with N or fewer terms for some N; some researchers have understood the truncations this way [52, 53].
The approach that seems most appealing at first blush is to use (1.5) and the expansion
\[ p(x_{k+1} \mid x_k) = \sum_{i,j} d_{ij}\, \phi_i(x_{k+1})\, \phi_j(x_k) \]
and simplify with the orthogonality condition:
\[ \int p(x_k \mid Z_k)\, p(x_{k+1} \mid x_k, Z_k, u_k)\, dx_k = \sum_{i,j} \sum_{l} d_{ij}\, c_l\, \phi_i(x_{k+1}) \int \phi_j(x_k)\, \phi_l(x_k)\, dx_k = \sum_{i} \Bigl( \sum_{j} d_{ij}\, c_j \Bigr) \phi_i(x_{k+1}) \]
This approach has the advantage of being very straightforward and even seems like the best use of a COFS, since it makes explicit use of the orthogonality property [12]. The problem is that it doubles the problem's effective dimension, which is a problem since the complexity can in general be exponential in the dimension.
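Numerically, the simplification above says that one action step maps the coefficient vector c of p(x_k | Z_k) to Dc, where D_{ij} = d_{ij} is the truncated expansion of the transition density. The sketch below, with made-up values for D and c, is meant only to show that D is the quadratic-in-terms object whose storage, and in higher dimensions exponential growth, the text warns about.

```python
import numpy as np

N = 20                                   # number of retained basis functions
rng = np.random.default_rng(0)

c = rng.normal(size=N)                   # coefficients of p(x_k | Z_k), made up for illustration
D = rng.normal(size=(N, N))              # d_{ij}: expansion of p(x_{k+1} | x_k), also made up

c_next = D @ c                           # one action step: c'_i = sum_j d_{ij} c_j
print(c_next.shape, "coefficients propagated; D holds", D.size, "numbers")
```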
3.5 Propagating Coefficients Through a Measurement Step
In addition to the problems involved in propagating the coefficients c_n through a measurement caused by the nonlinearity of h, which are essentially the same as the problems involved with propagating them through f in the action step, it is not clear how to propagate the coefficients through the multiplication of PDFs required by application of Bayes's rule in (1.6). At first it seems like this might be an easy problem, since many COFS, and in particular both complex exponentials and polynomials, are closed under multiplication. Different location and scale parameters for the approximations of p(z | x) and p(x) would present a problem, as already discussed.
However, the bigger problem, unique to the multiplication required by application of Bayes's rule, is that with a COFS higher order terms are higher frequency terms, and any such multiplication will require tracking additional coefficients without making any others negligible.
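A quick way to see the growth, as a sketch outside the thesis: if two densities are truncated complex-exponential series with wavenumbers |n| ≤ N, the coefficients of their pointwise product are the convolution of the two coefficient sequences, so wavenumbers up to |n| ≤ 2N appear, and none of them need be negligible.

```python
import numpy as np

N = 15                                            # wavenumbers -N..N kept in each factor
rng = np.random.default_rng(1)
a = rng.normal(size=2 * N + 1)                    # made-up Fourier coefficients of p(x)
b = rng.normal(size=2 * N + 1)                    # made-up Fourier coefficients of p(z | x)

prod = np.convolve(a, b)                          # coefficients of the product series
print("factor length:", len(a), "-> product length:", len(prod))   # 31 -> 61
```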
Some researchers simply truncated the new series for n above some threshold, which usually causes positivity problems as already discussed. Lo introduced a mathematical formalism which essentially used exponentials of the density to turn multiplication into addition [32]. Johnson and Stear give Lie-algebraic criteria for a given basis set to be reproducing in this sense [24]. However, the problem was never really solved.
Chapter 4
Orthogonal Wavelets and the Specialized Meyer-Type Wavelets
We present a brief introduction to orthogonal wavelets before applying them to our nonlinear filtering problem. The theory of one-dimensional orthogonal wavelets is by now well enough developed that the best references are books instead of journal articles. The book by Mallat is an excellent treatment from which the following presentation is adapted [33].
4.1 Orthogonal Wavelets
A sequence {V_j}_{j∈Z} of closed subspaces of L²(R) with ... ⊂ V_{j+1} ⊂ V_j ⊂ V_{j−1} ⊂ ... is called a multiresolution approximation if the following properties are satisfied:
\[ \forall (j, k) \in \mathbb{Z}^2, \quad f(t) \in V_j \Leftrightarrow f(t - 2^j k) \in V_j \]
\[ \forall j \in \mathbb{Z}, \quad V_{j+1} \subset V_j \]
\[ \forall j \in \mathbb{Z}, \quad f(t) \in V_j \Leftrightarrow f\!\left(\tfrac{t}{2}\right) \in V_{j+1} \]
\[ \lim_{j \to +\infty} V_j = \bigcap_{j=-\infty}^{+\infty} V_j = \{0\} \]
\[ \lim_{j \to -\infty} V_j = \overline{\bigcup_{j=-\infty}^{+\infty} V_j} = L^2(\mathbb{R}) \]
and in addition there exists θ such that {θ(t − n)}_{n∈Z} is a Riesz basis of V_0.
If the previous conditions hold, then there exists φ(t), called the scaling function, such that {φ(t − n)}_{n∈Z} is a Riesz basis of V_0 like θ, and in addition φ is normalized and orthogonal to its integer translates, i.e.
\[ \langle \phi(t - n_1), \phi(t - n_2) \rangle = \begin{cases} 0 & \text{if } n_1 \neq n_2 \\ 1 & \text{if } n_1 = n_2 \end{cases} \]
This orthogonality condition translated to the Fourier domain becomes
\[ \sum_{k=-\infty}^{+\infty} \bigl| \hat{\phi}(\omega + 2k\pi) \bigr|^2 = 1 \tag{4.1} \]
so that φ may be derived from θ as
\[ \hat{\phi}(\omega) = \frac{\hat{\theta}(\omega)}{\Bigl( \sum_{k=-\infty}^{+\infty} \bigl| \hat{\theta}(\omega + 2k\pi) \bigr|^2 \Bigr)^{1/2}} \]
The function φ can be expanded in terms of its translates at the next finer dyadic resolution as
\[ \frac{1}{\sqrt{2}}\, \phi\!\left(\frac{t}{2}\right) = \sum_{n=-\infty}^{+\infty} h[n]\, \phi(t - n) \]
\[ h[n] = \left\langle \frac{1}{\sqrt{2}}\, \phi\!\left(\frac{t}{2}\right),\ \phi(t - n) \right\rangle \]
(It is also possible to begin with the filter h and derive the function φ as an infinite cascade of convolutions of h at finer scales; this is how the Daubechies wavelets are constructed. See Mallat's book [33] for details.)
We now have a way to express the basis of V_j in terms of the basis of V_{j−1}, but we would like to do this with a basis orthogonal to V_j. Let W_j be the orthogonal complement of V_j in V_{j−1}:
\[ V_{j-1} = V_j \oplus W_j \]
Then there exists a function ψ called the wavelet function such that {ψ(t − n)}_{n∈Z} is an orthonormal Riesz basis of W_0 in the same way that {φ(t − n)}_{n∈Z} is an orthonormal basis of V_0. ψ may be derived from the scaling function as
\[ \frac{1}{\sqrt{2}}\, \psi\!\left(\frac{t}{2}\right) = \sum_{n=-\infty}^{+\infty} g[n]\, \phi(t - n) \]
\[ g[n] = \left\langle \frac{1}{\sqrt{2}}\, \psi\!\left(\frac{t}{2}\right),\ \phi(t - n) \right\rangle \]
The digital filter g is related to h as
\[ g[n] = (-1)^{1-n}\, h[1 - n] \]
4.2 Approximation Properties
W_j contains the "details" of an expanded function that are present in V_{j−1} but not in V_j. Clearly
\[ V_j = \bigoplus_{j'=j+1}^{+\infty} W_{j'} \]
Letting f_{j,n}(t) = 2^{−j/2} f(2^{−j} t − n), we can therefore expand f ∈ L²(R) as
\[ f(t) = \sum_{j=-\infty}^{+\infty} \sum_{n=-\infty}^{+\infty} \langle f, \psi_{j,n} \rangle\, \psi_{j,n} = \sum_{n=-\infty}^{+\infty} \langle f, \phi_{j,n} \rangle\, \phi_{j,n} + \sum_{j'=-\infty}^{j} \sum_{n=-\infty}^{+\infty} \langle f, \psi_{j',n} \rangle\, \psi_{j',n} \tag{4.2} \]
Given a reasonably well-behaved function f (such as one is likely to encounter in a nonlinear filtering problem), an excellent approximation to f can be had by truncating equation (4.2) at some finest resolution 2^J, j' ≥ J.
The approximation properties of wavelets were analyzed in a series of papers in the mid 1990s [16, 17, 18, 19]. In an asymptotic minimax sense, simple thresholding of the wavelet coefficients works better for compression, estimation and recovery than essentially any other method. In particular, thresholding of wavelet coefficients is optimal for representing functions with a discrete set of discontinuities [16].
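The projections in Figures 4.1-4.4 were computed with the Meyer-type wavelets of the following section. The sketch below uses the Haar scaling function instead, purely because its projection onto V_j reduces to local averaging and is therefore easy to write down; it illustrates projecting a step function onto successively coarser V_j, not the thesis's own construction.

```python
import numpy as np

def project_haar(samples, block):
    """Project a sampled function onto a Haar approximation space by replacing
    each block of samples with its mean (block = 2^j / grid spacing)."""
    means = samples.reshape(-1, block).mean(axis=1)
    return np.repeat(means, block)

# Step function on (-4, 4) with a jump at x = 1/3 (an arbitrary, non-dyadic point),
# sampled on a grid of spacing 2^-6.
x = np.arange(-4, 4, 2.0 ** -6)
step = np.where(x < 1.0 / 3.0, 1.0, 2.0)

for j in range(4):                       # projections onto V_0, V_1, V_2, V_3
    approx = project_haar(step, 2 ** (j + 6))
    err = np.sqrt(np.mean((approx - step) ** 2))
    print(f"V_{j}: RMS error {err:.4f}")
```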
For example, the approximations of a step function by its projection onto V_j, j ∈ {0, 1, 2, 3}, are shown in Figures 4.1-4.4. (See the following section for information on the specific wavelets used.) Also shown in Figure 4.5 for comparison is an approximation with complex exponentials over (−4, 4) with wavenumbers from −30 to 30. Notice the lack of ringing in the wavelet approximations.
[Figure 4.1: Step function projected onto V_0]
[Figure 4.2: Step function projected onto V_1]
[Figure 4.3: Step function projected onto V_2]
[Figure 4.4: Step function projected onto V_3]
[Figure 4.5: Step function approximated by complex exponentials over (−4, 4), wavenumbers from −30 to 30]
4.3 Meyer-type Wavelets
It is obvious that one way to construct φ to satisfy the orthogonality condition of Equation (4.1)
\[ \sum_{k=-\infty}^{+\infty} \bigl| \hat{\phi}(\omega + 2k\pi) \bigr|^2 = 1 \]
is to derive \(\hat{\phi}^2\) as the convolution of the square function and a band-limited non-negative function m̂. We consider especially wavelets derived from such a function m̂ with support in (−ε, ε) for 0 < ε ≤ π/3:
\[ \hat{\phi}(\omega) = \sqrt{ \int_{\omega - \pi}^{\omega + \pi} \hat{m}(\omega')\, d\omega' } \tag{4.3} \]
(Strictly speaking, m̂ could go negative over its range so long as the resulting \(\hat{\phi}\) is real, but we will ignore this case.) These wavelets are called Meyer-type wavelets, because the Meyer wavelets [33] are the earliest example. Notice from the support (−ε, ε) of m̂, 0 < ε ≤ π/3, and the definition of \(\hat{\phi}\) in Equation (4.3) that \(\hat{\phi}(\omega) = 0\) for |ω| ≥ 2(π + ε) and \(\hat{\phi}(\omega) = 1\) for |ω| ≤ π − ε.
Unless otherwise noted, all the figures in this thesis use wavelets derived from
\[ \hat{m}(\omega) = \frac{32}{45} \cos^5\!\left(\frac{3}{2}\,\omega\right), \qquad -\frac{\pi}{3} \le \omega \le \frac{\pi}{3} \]
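As a quick numerical sanity check (not from the thesis), \(\hat{\phi}\) can be evaluated directly from (4.3) on a grid and the orthogonality condition (4.1) verified; the code normalizes m̂ numerically so that its integral is one, which is what (4.1) requires of this construction.

```python
import numpy as np

eps = np.pi / 3.0

def m_hat(w):
    """Band-limited bump of Section 4.3 (unnormalized)."""
    return np.where(np.abs(w) <= eps, np.cos(1.5 * w) ** 5, 0.0)

# Normalize m_hat so that its integral is 1, as required for (4.1) to hold.
wq = np.linspace(-eps, eps, 20001)
Z = np.trapz(m_hat(wq), wq)

def phi_hat(w):
    """phi_hat(w) = sqrt( integral of m_hat over (w - pi, w + pi) ), as in (4.3)."""
    w = np.atleast_1d(w)
    vals = np.empty_like(w, dtype=float)
    for i, wi in enumerate(w):
        grid = np.linspace(wi - np.pi, wi + np.pi, 4001)
        vals[i] = np.trapz(m_hat(grid) / Z, grid)
    return np.sqrt(np.clip(vals, 0.0, None))

# Check the orthogonality condition (4.1): sum_k |phi_hat(w + 2 pi k)|^2 = 1.
w = np.linspace(-np.pi, np.pi, 9)
total = sum(phi_hat(w + 2.0 * np.pi * k) ** 2 for k in range(-3, 4))
print(np.round(total, 4))          # should be all ones, up to quadrature error
```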
Meyer-type wavelets are important because they are "almost closed" under several important transformations. More precisely, Walter has proved the following [47].
- Let T_τ be the operator of translation by an amount τ. Then T_τ(φ_{j,n}) ∈ V_{j−1}.
- Let D_α be the operator of dilation by a factor α. If α < 2π/(π + ε) then D_{2^m α}(φ_{j,n}) ∈ V_{j−(m+1)}.
- Let H be the operator of convolution with a function h ∈ L¹(R). Then H(φ_{j,n}) ∈ V_{j−1}.
Each of these properties will play an important role in the application of wavelets to nonlinear filtering and is established by the same proof pattern, which we illustrate by proving one more important almost-closure result.
Almost-Closure Under Multiplication 1. Let φ be a Meyer-type wavelet, let φ_{j,n} ∈ V_j and φ_{k,m} ∈ V_k with j ≤ k. If ε = π/3 then φ_{j,n}(x) φ_{k,m}(x) ∈ V_{j−1}, otherwise φ_{j,n}(x) φ_{k,m}(x) ∈ V_{j−2}.
Proof. The Fourier transform of φ_{l,p} is given by
\[ \int e^{-ix\omega}\, \phi_{l,p}(x)\, dx = 2^{l/2}\, e^{-i 2^l p \omega}\, \hat{\phi}(2^l \omega) \]
We can therefore convert the condition
\[ \phi_{j,n}\phi_{k,m} \in V_l \;\Rightarrow\; \phi_{j,n}(x)\,\phi_{k,m}(x) = \sum_p a[p]\, \phi_{l,p}(x) \]
to the Fourier domain as
\[ \widehat{\phi_{j,n}\phi_{k,m}}(\omega) = \hat{a}(\omega)\, 2^{l/2}\, e^{-i 2^l p \omega}\, \hat{\phi}(2^l \omega) \tag{4.4} \]
where â is a 2^{−l} 2π-periodic function, since a[p] = ⟨φ_{j,n}φ_{k,m}, φ_{l,p}⟩ is sampled at multiples of 2^l. If such an â exists, then φ_{j,n}φ_{k,m} ∈ V_l. We will prove the theorem by constructing such an â for l = j − 1.
Since multiplication in the time domain corresponds to convolution in the Fourier domain, we have
\[ \widehat{\phi_{j,n}\phi_{k,m}}(\omega) = \int 2^{j/2}\, e^{-i 2^j n (\omega - \omega')}\, \hat{\phi}\bigl(2^j(\omega - \omega')\bigr)\, 2^{k/2}\, e^{-i 2^k m \omega'}\, \hat{\phi}(2^k \omega')\, d\omega' \tag{4.5} \]
\[ = \sqrt{2^{j+k}}\, e^{-i 2^j n \omega} \int e^{-i(2^k m - 2^j n)\omega'}\, \hat{\phi}\bigl(2^j(\omega - \omega')\bigr)\, \hat{\phi}(2^k \omega')\, d\omega' \tag{4.6} \]
If we choose l such that \(\widehat{\phi_{j,n}\phi_{k,m}}(\omega) = 0\) outside the region where \(\hat{\phi}(2^l \omega) = 1\), then we can construct â as the 2^{−l} 2π-periodic extension of
\[ \frac{\widehat{\phi_{j,n}\phi_{k,m}}(\omega)}{\hat{\phi}(2^l \omega)} = \widehat{\phi_{j,n}\phi_{k,m}}(\omega) \]
since a translation of \(\widehat{\phi_{j,n}\phi_{k,m}}(\omega)\) by any non-zero integer multiple of 2^{−l} 2π will have its support outside the support of \(\hat{\phi}(2^l \omega)\). (We ignore the factor of e^{−i 2^l p ω} since it is already 2^{−l} 2π-periodic.)
Because Meyer-type wavelets have \(\hat{\phi}(\omega) = 0\) for |ω| > 2(π + ε), the integrand in (4.6) is only non-zero if both −2(π + ε) 2^{−j} ≤ ω − ω′ ≤ 2(π + ε) 2^{−j} and −2(π + ε) 2^{−k} ≤ ω′ ≤ 2(π + ε) 2^{−k}, which imply |ω| ≤ 2(π + ε)(2^{−j} + 2^{−k}). Recalling that \(\hat{\phi}(2^l \omega) = 1\) for |ω| ≤ 2^{−l}(π − ε), our condition on l is seen to be 2(π + ε)(2^{−j} + 2^{−k}) ≤ 2^{−l}(π − ε). Since j ≤ k by assumption, 2^{−j} + 2^{−k} = 2^{−j}(1 + 2^{−(k−j)}) ≤ 2^{−(j−1)} and the final condition on l becomes
\[ 2\,\frac{\pi + \epsilon}{\pi - \epsilon}\; 2^{-(j-1)} \le 2^{-l} \tag{4.8} \]
Since 0 < ε ≤ π/3, 1 ≤ 2(π − ε)/(π + ε) < 2 with the minimum at ε = π/3. If ε = π/3 then choosing l = j − 1 will satisfy (4.8) with an equality; otherwise we must choose l = j − 2.
4.4 The Coiflet Property
The accuracy of a wavelet approximation is typically given in terms of the number of vanishing moments of ψ, i.e. ψ has p vanishing moments if
\[ \int t^k\, \psi(t)\, dt = 0, \qquad 0 \le k < p \]
This is because if f is locally C^k (i.e. at a fine enough resolution 2^j), then it is well approximated by a Taylor series of order k, which will be orthogonal to a ψ_{j,n} with p vanishing moments. See Mallat's book for more details [33].
In addition to vanishing moments of ψ, the scaling function φ can also have vanishing moments of order greater than zero:
\[ \int \phi(t)\, dt = 1 \quad \text{and} \quad \int t^k\, \phi(t)\, dt = 0, \qquad 1 \le k < p \tag{4.9} \]
Scaling functions with this property are called coiflets. The advantage of the coiflet property is that a function f which is locally C^k, so that it is well approximated by a Taylor series of order k about 2^j n, will be orthogonal to every power except the zeroth:
\[ 2^{-j/2}\, \langle f, \phi_{j,n} \rangle \approx f(2^j n) + O\bigl(2^{(k+1)j}\bigr) \tag{4.10} \]
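For completeness, here is the one-line argument behind (4.10), written out under the coiflet property (4.9); it is a standard calculation rather than a quotation from the thesis. Substituting the order-k Taylor expansion of f about 2^j n into the inner product and using (4.9), only the zeroth-order term survives:
\begin{align*}
\langle f, \phi_{j,n} \rangle
  &= \int f(t)\, 2^{-j/2} \phi(2^{-j} t - n)\, dt
   = 2^{j/2} \int f\bigl(2^j (u + n)\bigr)\, \phi(u)\, du \\
  &= 2^{j/2} \int \Bigl[ \sum_{m=0}^{k} \frac{f^{(m)}(2^j n)}{m!}\, (2^j u)^m
        + O\bigl(2^{(k+1)j} u^{k+1}\bigr) \Bigr] \phi(u)\, du \\
  &= 2^{j/2} \Bigl[ f(2^j n) + O\bigl(2^{(k+1)j}\bigr) \Bigr],
\end{align*}
since \(\int \phi = 1\) and \(\int u^m \phi(u)\, du = 0\) for 1 ≤ m < p (taking p > k).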
Meyer-type wavelets will have both φ and ψ with about k/2 vanishing moments (except the zeroth moment for φ) if the function m̂ ∈ C^k [33, 46].
Coiflets therefore provide an important quadrature property and substantially help ease the computational burden of computing the inner products associated with the wavelet system.
Chapter 5
How Wavelets Solve the Five Problems
5.1 Adapting Location and Scale Parameters
Using wavelets to approximate p(x_k | Z_k) essentially takes care of the problem of adapting location and scale parameters automatically; it offers all the power of Center's approach of optimizing the location and scale parameters in a least-squares sense with none of the computational burden. As an example, a wavelet expansion of N(0, 1) is shown in Figure 5.1, of N(0, 10) in Figure 5.2, and of N(10, 1) in Figure 5.3. Each expansion keeps 11 terms. Despite varying in both location and scale, each expansion is about as accurate as the others. Compare these expansions to those shown in Figures 3.1, 3.2, and 3.3.
5.2 Preserving Positivity
Because wavelets are well-localized in both time and frequency, they avoid the ringing characteristic of COFS expansions with either singularities as in Figure 4.5 or with mismatched location and scale parameters as in Figures 3.1, 3.2, and 3.3.
[Figure 5.1: N(0, 1) approximated with 11 terms of j = 0]
[Figure 5.2: N(0, 10) approximated with 11 terms of j = 3]
[Figure 5.3: N(10, 1) approximated with 11 terms of j = 0]
It is still possible for a wavelet approximation of a nonnegative function to be negative for part of its domain, as the approximations in Figures 4.1-4.4 show. In fact, any truncation of a wavelet series will probably induce localized oscillations around areas of small coefficients, since the functions φ and ψ are themselves oscillatory. Because the φ and ψ functions die off quickly, however (relative to their scales), such a burst of negativity will also be localized; a negative value in a wavelet approximation of a PDF can safely be interpreted as meaning zero or low probability.
5.3 Determining the Coefficients in the First Place
Assuming the scaling coefficients ⟨f, φ_{J,n}⟩ for the finest resolution 2^J are available, Mallat's algorithm [33] gives a fast way to calculate the coefficients ⟨f, φ_{j,n}⟩, j > J, and ⟨f, ψ_{j,n}⟩, j > J. These coefficients ⟨f, φ_{J,n}⟩ at the finest resolution may be obtained from (4.10) as
\[ \langle f, \phi_{J,n} \rangle \approx 2^{J/2}\, f(2^J n) \tag{5.1} \]
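A sketch of the fast decomposition this section refers to: starting from the finest-scale scaling coefficients a_J[n] ≈ 2^{J/2} f(2^J n) as in (5.1), one level of Mallat's algorithm convolves with the filters h and g and downsamples by two. The Haar filter pair below is an assumption made only so the example is self-contained; the thesis's Meyer-type filters would be substituted in practice.

```python
import numpy as np

def analysis_step(a, h, g):
    """One level of Mallat's algorithm with periodic boundary handling:
    returns the coarser scaling coefficients and the wavelet coefficients."""
    N = len(a)
    a_coarse = np.zeros(N // 2)
    d = np.zeros(N // 2)
    for n in range(N // 2):
        for k in range(len(h)):
            a_coarse[n] += h[k] * a[(2 * n + k) % N]
            d[n]        += g[k] * a[(2 * n + k) % N]
    return a_coarse, d

# Haar filters (an assumption for this sketch, not the thesis's Meyer-type filters).
h = np.array([1.0, 1.0]) / np.sqrt(2.0)
g = np.array([1.0, -1.0]) / np.sqrt(2.0)

# Finest-scale coefficients from the quadrature rule (5.1), for a Gaussian density.
J = -6
n = np.arange(-256, 256)
f = lambda x: np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)
a = 2.0 ** (J / 2) * f(2.0 ** J * n)

a1, d1 = analysis_step(a, h, g)
print(len(a), "->", len(a1), "scaling and", len(d1), "wavelet coefficients")
```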
5.4 Propagating Coefficients Through an Action Step
We focus on finding the PDF P_Y(y) of a random variable Y = f(X) given the PDF P_X(x). The application to equations (1.1) and (1.2) is obvious. All the examples in this section use y = f(x) = x².
Using (1.3) we can write
\[ P_Y(y) = \int P_X(x)\, \delta\bigl(y - f(x)\bigr)\, dx \]
Assume P_X(x) has been approximated as a projection onto V_j, so
\[ P_X(x) = \sum_n c_n\, \phi_{j,n}(x) \]
At this point, we can expand δ(y − f(x)) in a multiresolution approximation in y (even though δ ∉ L²(R), lim_{j→−∞} φ_{j,0}(x) = δ(x)):
\[ \delta\bigl(y - f(x)\bigr) = \lim_{k \to -\infty} \sum_m \left[ \int \delta\bigl(y - f(x)\bigr)\, \phi_{k,m}(y)\, dy \right] \phi_{k,m}(y) = \lim_{k \to -\infty} \sum_m \phi_{k,m}\bigl(f(x)\bigr)\, \phi_{k,m}(y) \]
and substitute into (1.3) to get
\[ P_Y(y) = \sum_n \sum_m c_n \left[ \int \phi_{k,m}\bigl(f(x)\bigr)\, \phi_{j,n}(x)\, dx \right] \phi_{k,m}(y) \tag{5.2} \]
The inner products ⟨φ_{k,m}(f(x)), φ_{j,n}(x)⟩ which characterize the transformation f may be computed offline and stored. At run time all that is necessary is a lookup from (j, n) to find the significant (k, m) and a multiplication by the c_n.
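A sketch of the offline/online split described above (an illustration only; it uses a generic `phi(j, n, x)` callable as a stand-in for the thesis's Meyer-type scaling functions): the inner products are tabulated once by numerical quadrature into a matrix T, and a run-time action step is then a single matrix-vector product, as in (5.2).

```python
import numpy as np

def transfer_matrix(f, phi, j, k, n_range, m_range, x):
    """T[m, n] = integral of phi(k, m, f(x)) * phi(j, n, x) dx, by trapezoid rule.
    `phi(j, n, x)` is a placeholder for evaluating the scaling function phi_{j,n}."""
    T = np.zeros((len(m_range), len(n_range)))
    for a, m in enumerate(m_range):
        for b, n in enumerate(n_range):
            T[a, b] = np.trapz(phi(k, m, f(x)) * phi(j, n, x), x)
    return T

# Offline: tabulate T for y = f(x) = x^2 once (the grid and index ranges are
# illustrative choices, not values from the thesis).
f = lambda x: x ** 2
x = np.linspace(-6.0, 6.0, 2401)
# Placeholder scaling function: a smooth bump centred at 2^j n with width ~2^j.
# This is NOT a Meyer-type scaling function; it only stands in for phi_{j,n}.
phi = lambda j, n, t: 2.0 ** (-j / 2) * np.exp(-0.5 * ((t - 2.0 ** j * n) / 2.0 ** j) ** 2)
n_range = range(-16, 17)
m_range = range(0, 40)
T = transfer_matrix(f, phi, j=-2, k=-2, n_range=n_range, m_range=m_range, x=x)

# Online: one action step is a matrix-vector product, as in (5.2).
c = np.exp(-0.5 * (np.array(n_range) * 2.0 ** -2) ** 2)     # made-up coefficients of P_X
c /= np.sum(c)
b = T @ c                                                    # coefficients of P_Y in phi_{k,m}
print("largest |b_m| at m =", int(m_range[np.argmax(np.abs(b))]))
```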
Notice that none of the preceding derivation depends on the fact that {φ_{k,n}} comes from a multiresolution; the preceding derivation could apply equally well to a COFS. What makes it useful is that a wavelet expansion can approximate a δ-function relative to another approximated function with few parameters.
In analyzing the accuracy of (5.2) there are three cases to consider. We take the simplest case first. Suppose the resolution 2^k of y is very fine relative to f(x) and f′(2^k m) exists so that f is approximately linear around x = 2^k m. Then
\[ \phi_{k,m}\bigl(f(x)\bigr) \approx \phi_{k,m}\bigl(2^k m + f'(2^k m)(x - 2^k m)\bigr) = \sum_{m'} c_{m'}\, \phi_{k',m'}(x) \]
because of the almost-closure of Meyer-type wavelets under translation and dilation. We can therefore interpret the inner product ⟨φ_{k,m}(f(x)), φ_{j,n}(x)⟩ from (5.2) as simply performing a combination of translation and dilation from X to Y where f is locally linear. As an example, φ_{0,9}(x²) is shown in red in Figure 5.4 compared to φ_{−3,24}(x) in green and φ_{−2,12}(x) in blue; clearly it has been transformed to a scaling function with a resolution between 2^{−2} and 2^{−3}. The graph of f(x) = x² along with its linearization about x = 3 is shown in Figure 5.5, confirming this translation and dilation happened where f is locally linear compared to the relevant scale.
[Figure 5.4: f(X) = X² in an approximately linear regime: φ_{0,9}(x²) (red), φ_{−2,12}(x) (green), φ_{−3,24}(x) (blue)]
[Figure 5.5: f(X) = X² is approximately linear at x = 3]
The second case is when φ_{k,m} centers around a point x_0 with f′(x_0) = 0. No matter how fine the scale, there is never a point where it is fine enough to become an effective translation and dilation. Graphs of φ_{−3,0}(x²), φ_{0,0}(x²) and φ_{3,0}(x²) are shown in Figure 5.6. A flattening at the top in the neighborhood of x = 0 relative to the corresponding φ_{j,0} never goes away. This flattening is even more clear in Figure 5.7, which shows φ_{0,n}(x²), n from −5 to 5.
[Figure 5.6: f(X) = X² around a point where f′(x) = 0: φ_{−3,0}(x²), φ_{0,0}(x²), φ_{3,0}(x²)]
[Figure 5.7: f(X) = X² around a point where f′(x) = 0: φ_{0,{−5,...,5}}(x²)]
Of course, all we really care about is that the expansion be accurate; it seems that picking a fine enough k relative to the input φ_{j,n} should still give good results. Although φ_{k,m}(f(x)) will never convert to a single φ_{j,n}, it may convert to only a few, which is good enough.
The last case to consider is when the range of f has a boundary, e.g. the range of Y = X² is [0, ∞). This case basically creates a potential discontinuity in the wavelet expansion of y. This situation can be easily diagnosed when φ_{k,m}(f(x))'s peak value does not occur at f^{−1}(2^k m) and is analogous to the high wavelet coefficients just on the other side of a singularity. As an example, φ_{−2,n}(x²), n from −10 to −1, are shown in Figure 5.8. Note that because there is no value x_0 such that f(x_0) = 2^{−2} n, there is no peak at these values, but rather the peak of each is around x such that f(x) = 0, i.e. in the neighborhood of the singularity.
[Figure 5.8: Y = X² around a boundary of the domain of Y: φ_{−2,{−10,...,−1}}(x²)]
It is instructive to consider what happens when P_X(x) is replaced with unity in (5.2), i.e. we treat the problem as a simple change of variable instead of a problem of probabilities. In this case we have
\[ \int 1 \cdot \delta\bigl(y - f(x)\bigr)\, dx = \sum_m \left[ \int \phi_{k,m}\bigl(f(x)\bigr)\, dx \right] \phi_{k,m}(y) = \sum_m \left[ \int \phi_{k,m}(y)\, \frac{d}{dy} f^{-1}(y)\, dy \right] \phi_{k,m}(y) \]
where we have made the usual assumptions that f^{−1} exist and be monotonic over the domain under consideration [3]. In some sense, f(X) is characterized by this expansion, and the particular distribution P_X(x) only enters by providing non-uniform c_n coefficients by which to multiply the expansion above. At the very least, the above expansion would be very useful when computing (5.2) for knowing which φ_{k,m} are the important ones.
As a final note on transformation of random variables, we note that the special case of addition of two random variables is expedited by the almost-closure of convolution for Meyer-type wavelets.
5.5 Propagating Coefficients Through a Measurement Step
In addition to the transformation of nonlinear variables as outlined above, we have to deal with the measurement process z_k = h(x_k) + v_k. Dropping the k subscripts and projecting P_V(v) and p(x) onto V_j and V_k respectively, we have
\[ p(z \mid x)\, p(x) = \int P_V(v)\, \delta\bigl(z - (h(x) + v)\bigr)\, dv\; p(x) = P_V\bigl(z - h(x)\bigr)\, p(x) = \left( \sum_n c_n\, \phi_{j,n}\bigl(z - h(x)\bigr) \right) \left( \sum_m d_m\, \phi_{k,m}(x) \right) \]
One way to attack this equation would be to expand φ_{j,n}(z − h(x)) in both z and x beforehand, but current approaches to multidimensional wavelet expansions would leave much to be desired [33]. A more clever approach would be to do the expansion (5.2) for the function h and then compute the translation by z when a new measurement z becomes available. The almost-closure of Meyer-type wavelets under translations means this expansion will not require many terms, and the relevant inner products can be approximated very well by the quadrature formula (5.1). The multiplication required for the update of Bayes's rule could then be carried out in the same way, i.e. use the almost-closure of Meyer-type wavelets under multiplication and use (5.1) to estimate the relevant inner products.
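To make the proposed measurement update concrete, here is a small sketch consistent with the quadrature rule (5.1), but not code from the thesis: the prior and the likelihood are evaluated at the dyadic points 2^J n, multiplied pointwise as in (1.6), and the product's finest-scale coefficients are read off again with (5.1). The Gaussian prior and likelihood are arbitrary stand-ins.

```python
import numpy as np

J = -5                                   # finest resolution 2^J
n = np.arange(-512, 513)
x = 2.0 ** J * n                         # dyadic sample points 2^J n

gauss = lambda t, s: np.exp(-0.5 * (t / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
prior = gauss(x - 1.0, 1.0)              # p(x | Z_{k-1}), a stand-in
z, h = 2.1, (lambda x: x)                # measurement and (here linear) h, stand-ins
likelihood = gauss(z - h(x), 0.5)        # p(z | x)

# Pointwise Bayes update at the dyadic points, then normalize as in (1.6).
posterior = prior * likelihood
posterior /= np.sum(posterior) * 2.0 ** J

# Finest-scale coefficients of the posterior via the coiflet quadrature (5.1):
#   <posterior, phi_{J,n}> ~= 2^{J/2} posterior(2^J n).
coeffs = 2.0 ** (J / 2) * posterior
print("posterior mean from the dyadic grid:", np.sum(x * posterior) * 2.0 ** J)
```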
Chapter 6
Related and Future Work
Since the advent of fast computers enabling Monte Carlo methods, there has been little research on application of orthogonal function systems to nonlinear filtering. One exception was the paper by Chun and Chun, who also sought to apply wavelet decompositions to the nonlinear filtering problem. Another is the paper by Wei and Cheng, who use wavelets to estimate the moments of a stochastic process, but not the whole conditional PDF [48]. The methods for computing transformations of random variables and the multiplications required by Bayes's law outlined here seem to be novel.
The statistics community has used wavelets for many years. Their standard operating procedure when using wavelets in estimation problems is to define a prior probability on the wavelet coefficients themselves. See the papers by Muller and others for a good overview [35, 36, 34]. They seem to be mainly concerned with what engineers would call single-stage estimation, for which COFS have proven more or less adequate.
The key enabling research for applying wavelets to nonlinear filtering is a satisfactory extension of the wavelet theory to multiple dimensions. Otherwise, the application will be limited to one-dimensional problems, and will have to resort to trickery to avoid having to expand conditional densities in all variables. See Mallat's book for an overview of possible approaches [33].
References
[1] Daniel L. Alspach. Gaussian sum approximations in nonlinear filtering and control. Information Sciences, 7, 1974.
[2] Chi Au and Judy Tam. Transforming variables using the Dirac generalized function. The American Statistician, 53(3):270-272, August 1999.
[3] Lee J. Bain and Max Englehardt. Introduction to Probability and Mathematical Statistics. Dover, second edition, 1992.
[4] Richard Bellman and John M. Richardson. Closure and preservation of moment properties. Journal of Mathematical Analysis and Applications, 23:639-644, 1968.
[5] V.E. Benes. Exact finite-dimensional filters for certain diffusions with nonlinear drift. Stochastics, pages 65-92, 1981.
[6] V.E. Benes. New exact nonlinear filters with large Lie algebras. Systems and Control Letters, 5:217-221, February 1985.
[7] James Ward Brown and Ruel V. Churchill. Fourier Series and Boundary Value Problems. McGraw-Hill, Inc., 5th edition, 1993.
[8] Richard S. Bucy. Nonlinear filtering theory. IEEE Transactions on Automatic Control, 10(2):198, 1965.
[9] R.S. Bucy and K.D. Senne. Digital synthesis of non-linear filters. Automatica, 7:287-298, 1971.
[10] R.S. Bucy and Hussein Youssef. Fourier realization of the optimal phase demodulator. In Stear et al. [45], pages 34-38.
[11] R.S. Bucy and Hussein Youssef. Optimal phase demodulation. IEEE Transactions on Automatic Control, 21(5):732-737, October 1976.
[12] G. Celant and G.B. Di Masi. Hermite polynomials expansions for discrete-time nonlinear filtering. Statistica, pages 759-769, 2002.
[13] Julian L. Center. Practical nonlinear filtering of discrete observations by generalized least squares approximation of the conditional probability distribution. In Sorenson et al. [42], pages 88-99.
[14] Frederick E. Daum. Solution of the Zakai equation by separation of variables. IEEE Transactions on Automatic Control, 32(10):941-943, October 1987.
[15] Frederick E. Daum. Bayesian Analysis of Time Series and Dynamic Models, chapter New exact nonlinear filters. Marcel Dekker, 1988.
[16] David L. Donoho. Unconditional bases are optimal bases for data compression and for statistical estimation. Applied and Computational Harmonic Analysis, 1(1):100-115, 1993.
[17] David L. Donoho. De-noising by soft-thresholding. IEEE Transactions on Information Theory, 41(3):613-627, 1995.
[18] David L. Donoho and Iain M. Johnstone. Adapting to unknown smoothness by wavelet shrinkage. Journal of the American Statistical Association, 90:1200-1224, 1995.
[19] D.L. Donoho, I.M. Johnstone, G. Kerkyacharian, and D. Picard. Wavelet shrinkage: Asymptopia? Journal of the Royal Statistical Society Series B, 57:301-369, 1995.
[20] M. Hazewinkel, S.I. Marcus, and H.J. Sussman. Filtering and Control of Random Processes, volume 61 of Lecture Notes in Control and Information Sciences, chapter Nonexistence of finite dimensional filters for conditional statistics of the cubic sensor problem. Springer, Berlin / Heidelberg, 1984.
[21] Calvin Hecht. Digital realization of non-linear filters. In Sorenson et al. [42], pages 152-158.
[22] Calvin Hecht. System identification using Bayesian estimation. In Stear et al. [45], pages 107-113.
[23] Andrew H. Jazwinski. Stochastic Processes and Filtering Theory. Academic Press, 1970.
[24] Carl Johnson and Edwin B. Stear. Reproducing probability density classes on Lie groups with application to discrete time filtering. IEEE Transactions on Information Theory, 26:124-129, 1980.
[25] R.E. Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME, Journal of Basic Engineering, 82(1):34-35, 1960.
[26] R.E. Kalman and R.S. Bucy. New results in linear filtering and prediction theory. Transactions of the ASME, Journal of Basic Engineering, 83:95-108, 1961.
[27] William Kizner. Optimal nonlinear estimation based on orthogonal expansions. Technical Report 32-1366, Jet Propulsion Laboratory, Pasadena, California, April 1969.
[28] R.L. Klein and A. Wang. Gauss quadrature estimators. IEEE Transactions on Automatic Control, 22(1):70-73, February 1977.
[29] Harold J. Kushner. Approximations to optimal nonlinear filters. IEEE Transactions on Automatic Control, 12(5):546-556, October 1967.
[30] Harold J. Kushner. Dynamical equations for optimal nonlinear filtering. Journal of Differential Equations, 3:179-190, 1967.
[31] P.I. Kuznetsov, R.L. Stratonovich, and V.I. Tikhonov. Quasi-moment functions in the theory of random processes. Theory of Probability and its Applications, (1):80-97, 1960.
[32] James Ting-Ho Lo. Exponential Fourier densities and optimal estimation and detection on the circle. IEEE Transactions on Information Theory, 23(1):110-116, January 1977.
[33] Stephane Mallat. A Wavelet Tour of Signal Processing. Academic Press, 2nd edition, 1998.
[34] Peter Mueller and Brani Vidakovic. Bayesian Inference in Wavelet-Based Models, chapter MCMC Methods in Wavelet Shrinkage: Non-Equally Spaced Regression, Density and Spectral Density Estimation, pages 187-202. Springer-Verlag, New York, 1999.
[35] Peter Muller and Fernando A. Quintana. Nonparametric Bayesian data analysis. Statistical Science, 19(1):95-110, 2004.
[36] Peter Muller and Brani Vidakovic. Bayesian inference with wavelets: Density estimation. Journal of Computational and Graphical Statistics, 7(4):456-468, December 1998.
[37] Daniel Ocone. Probability densities for conditional statistics in the cubic sensor problem. Mathematics of Control, Signals, and Systems, 1:183-202, 1988.
[38] John M. Richardson and Leo C. Levitt. Linear closure approximation method for classical statistical mechanics. Journal of Mathematical Physics, 8(9):1707-1715, September 1967.
[39] N.G.F. Sancho. On the approximate moment equations of a nonlinear stochastic differential equation. Journal of Mathematical Analysis and Applications, 29:384-391, 1970.
[40] Stuart C. Schwartz. Estimation of probability density by an orthogonal series. The Annals of Mathematical Statistics, 38(4):1261-1265, August 1967.
[41] Harold W. Sorenson and Daniel L. Alspach. Recursive Bayesian estimation using Gaussian sums. Automatica, 7:465-479, 1971.
[42] Harold W. Sorenson, Allen D. Dayton, Richard E. Mortensen, Edwin B. Stear, Allen R. Stubberud, and Robert C. Kolb, editors. Proceedings of the Second Symposium on Nonlinear Estimation Theory and its Applications, San Diego, September 1971. Western Periodicals.
[43] H.W. Sorenson and A.R. Stubberud. Non-linear filtering by approximation of the a posteriori density. International Journal of Control, 8(1):33-51, July 1968.
[44] Krishnaswamy Srinivasan. State estimation by orthogonal expansion of probability distributions. IEEE Transactions on Automatic Control, 15(1):3-10, February 1970.
[45] Edwin B. Stear, Richard E. Mortensen, Walter J. Rabe, Harold W. Sorenson, and Allen R. Stubberud, editors. Proceedings of the Fourth Symposium on Nonlinear Estimation Theory and its Applications, San Diego, September 1973. Western Periodicals.
[46] G.G. Walter and J. Zhang. Orthonormal wavelets with simple closed-form expressions. IEEE Transactions on Signal Processing, 46(8):2248-2251, August 1998.
[47] Gilbert G. Walter. Translation and dilation invariance in orthogonal wavelets. Applied and Computational Harmonic Analysis, 1(4):344-349, 1994.
[48] Dong Wei and Haiguang Cheng. Representations of stochastic processes using coiflet-type wavelets. In Proceedings of the Tenth IEEE Workshop on Statistical Signal and Array Processing, pages 549-553. IEEE, 2000.
[49] Ralph M. Wilcox and Richard Bellman. Truncation and preservation of moment properties for Fokker-Planck moment equations. Journal of Mathematical Analysis and Applications, 32:532-542, 1970.
[50] Warren W. Willman. Edgeworth expansions in state perturbation estimation. IEEE Transactions on Automatic Control, 26(2):493-498, April 1981.
[51] Alan S. Willsky. A finite Fourier transform approach to estimation on cyclic groups. In Allen R. Stubberud, James S. Meditch, Richard E. Mortensen, Edwin B. Stear, and Harold W. Sorenson, editors, Proceedings of the Fifth Symposium on Nonlinear Estimation Theory and its Applications, pages 301-303, San Diego, September 1974. Western Periodicals.
[52] Alan S. Willsky. Fourier series and estimation on the circle with applications to synchronous communications - Part I: Analysis. IEEE Transactions on Information Theory, 20(5):577-583, September 1974.
[53] Alan S. Willsky. Fourier series and estimation on the circle with applications to synchronous communications - Part II: Implementation. IEEE Transactions on Information Theory, 20(5):584-590, September 1974.
[54] W.S. Wong. New classes of finite-dimensional nonlinear filters. Systems and Control Letters, 3:155-164, 1983.
wavelets