PROCESS DATA ANALYTICS AND MONITORING BASED
ON CAUSALITY ANALYSIS TECHNIQUES

by

Tao Yuan

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)

August 2014

Copyright 2014 Tao Yuan
Dedication

To my family.
Acknowledgments

First of all, I would like to express my sincerest gratitude to my advisor, Professor S. Joe Qin. I am deeply grateful for his continuous encouragement, intellectual guidance, and financial support over the past few years. My doctoral study could not have been completed without his ingenious direction and constructive criticism. It has been a great pleasure to work with him, and he has influenced me in many aspects of my life. He guided me to acquire background knowledge and to develop new ideas to solve practical problems. More importantly, he taught me the innovative thinking and professional analytical skills that will benefit me throughout my career.

I appreciate the effort of Dr. Ioannou from the Electrical Engineering Department and Dr. K. Shing from the Chemical Engineering Department for serving on my committee. I would like to thank them for their insightful comments and advice.

It has been an honor and privilege to be a member of Professor Qin's research group. It is a collection of talented and dedicated individuals who granted me an enjoyable and memorable experience. I am grateful to the former group members who offered me valuable advice and patient help: Carlos, Quan, Yamin, Bo, Zhijie, Jingran, Hu, and Yu. I am also grateful to all the current members for their friendship and consistent assistance: Johnny, Yining, Zora, Alisha, Yuan, Qinqin, postdoc Gang Li, and visiting scholars Lijuan, Ying, and Le.

I would like to thank the CSC Fellowship and the Center for Interactive Smart Oilfield Technologies (CiSoft) for the financial support during my graduate studies.

Last but not least, I want to thank my parents for their endless love, care, and encouragement.
Table of Contents

Dedication
Acknowledgments
List of Tables
List of Figures
Abstract
Chapter 1. Introduction
  1.1 Background
  1.2 Existing Methods for Root Cause Diagnosis
    1.2.1 Non-linearity test based method
    1.2.2 Spectral independent component analysis
    1.2.3 Adjacency matrix method
    1.2.4 Transfer entropy based disturbance propagation method
  1.3 Outline of the Dissertation
Chapter 2. Root Cause Diagnosis of Plant-wide Oscillations Using Granger Causality
  2.1 Introduction
  2.2 Methods and Implementation
    2.2.1 Time-domain Granger Causality
    2.2.2 Spectral Granger Causality
    2.2.3 Feature Selection
  2.3 Case Studies
    2.3.1 Simulated 4×4 System
    2.3.2 Industrial Application Case Study
  2.4 Summary
Chapter 3. Deep causal mining for plant-wide oscillations with multilevel Granger causality analysis
  3.1 Introduction
  3.2 Proposed Multilevel Granger Causality Analysis Framework
    3.2.1 Dynamic Time Warping-based K-means Clustering
    3.2.2 Group-level Granger causality
    3.2.3 Feature group formulation based on PCA
    3.2.4 Within-group Granger causality test based on PLS
  3.3 Industrial Case Study
    3.3.1 DTW-based k-means clustering
    3.3.2 Group-level causal test
    3.3.3 PLS-based causality analysis within each group
  3.4 Summary
Chapter 4. Nonstationary process monitoring for dynamic processes with cointegration analysis
  4.1 Introduction
  4.2 Nonstationarity test for univariate time series
    4.2.1 Unit root test
    4.2.2 Stationarity test
  4.3 Cointegration analysis for multivariate integrated series
    4.3.1 Vector error correction and Johansen test
    4.3.2 Common trend representation and Stock-Watson test
  4.4 Fault detection based on cointegration representation
  4.5 Case study on TE process
  4.6 Summary
Chapter 5. Conclusions
Bibliography
List of Tables

Table 4.1 Fault detection indices
Table 4.2 Nonstationary tests for 22 measurements and 11 MVs
Table 4.3 Nonstationary tests for common trends and error process
List of Figures

Figure 2.1 Time series plot of two simple oscillating signals S1 and S2.
Figure 2.2 Spectral Granger causality magnitudes between S1 and S2.
Figure 2.3 Time series plot and autocorrelation plot of the simulation example.
Figure 2.4 Time-domain causality map of four CVs in the simulation example.
Figure 2.5 Spectral Granger causality for each CV of the simulation example.
Figure 2.6 Process schematic of a plant from Eastman Chemical Company.
Figure 2.7 Five principal components for the industrial case study.
Figure 2.8 Oscillation significance index for the industrial case study.
Figure 2.9 Time-domain causality map of 15 selected features in the industrial case study.
Figure 2.10 Causal flow for each node (variable) in the industrial case study.
Figure 2.11 Spectral Granger causality for 8 selected features of the industrial case study.
Figure 3.1 DTW.
Figure 3.2 Process schematic of a plant from Eastman Chemical Company.
Figure 3.3 Group 1 time series plot.
Figure 3.4 Group 2 time series plot.
Figure 3.5 Group 3 time series plot.
Figure 3.6 Group 4 time series plot.
Figure 3.7 Feature group time series plot.
Figure 3.8 Group Granger causal magnitude (trace version) from G1 ∼ G4 to GT.
Figure 3.9 Group Granger causal magnitude (determinant version) from G1 ∼ G4 to GT.
Figure 3.10 Group-level causal network.
Figure 3.11 Group 1 causal structure.
Figure 3.12 Group 2 causal structure.
Figure 3.13 Group 3 causal network by PLS-based Granger causality.
Figure 3.14 Group 4 causal network by PLS-based Granger causality.
Figure 3.15 Schematic diagram of the multilevel Granger causality analysis.
Figure 4.1 Schematic diagram of the Tennessee Eastman process.
Figure 4.2 Nonstationary series in TE process.
Figure 4.3 Common trends τ1, τ2 and error processes z1, z2, z3, z4.
Figure 4.4 Fault detection result with T²_z and T²_τ for IDV1.
Figure 4.5 Fault detection result with T²_z and T²_τ for IDV10.
Figure 4.6 Fault detection result with T²_z and T²_τ for IDV13.
Abstract

A typical industrial process or plant operates with hundreds of control loops, and those primary loops should operate at desired levels for safety and efficiency. Monitoring and maintaining such a large-scale complex system is challenging; thus, control performance monitoring (CPM) has drawn tremendous research interest and progressed steadily over the last two decades. After detection of poor control performance, an even more challenging and meaningful task is to diagnose and locate the root cause of plant-wide degradation, so that targeted maintenance can be performed to improve the situation. In this dissertation, new data-driven approaches based on causality methods are proposed to diagnose the root cause of plant-wide disturbances.

Oscillations are a common reason for poor performance in closed-loop controlled processes; once generated, they can propagate along process flows and feedback paths of the whole plant. A new data-driven time series method for diagnosing the sources and propagation paths of plant-wide oscillations is presented. First, a latent variable based feature selection scheme is proposed to select candidate variables that share common oscillations. Time-domain Granger causality and spectral Granger causality are then applied to reveal the causal relationships in terms of time series prediction.

The proposed method is then extended to extract a more complete causal structure in a large-scale complex industrial plant. A novel multilevel Granger causality framework for root cause diagnosis is proposed, which considers both the group-wise and within-group causal structures. A dynamic time warping-based K-means clustering method is applied to group time-series variables more accurately. A group Granger causal test is performed to find the causal contribution with regard to a specific disturbance feature, so that dominant and major causal relationships can be extracted from a group point of view. Moreover, a partial least squares modified Granger causal test is developed to overcome the multicollinearity issue and make the generated causal network sparser and easier to interpret.

For nonstationary series in chemical processes, traditional latent variable or other statistical modeling methods are inadequate because the statistical properties are time variant. Nonstationarity tests are implemented to identify the nonstationary variables; then cointegration tests are utilized to capture the common trends and dynamic structure. A monitoring index and control limit are derived for monitoring and fault detection purposes.
Chapter 1
Introduction

1.1 Background

In large-scale industrial plants, there are hundreds or even thousands of control loops operating under varying conditions. The task of each control loop is to maintain the process at the desired level steadily and efficiently. For such complicated and highly integrated systems, the corresponding maintenance is a difficult and challenging task for process control engineers. Even if these controllers initially perform quite well, many factors can cause their performance to deteriorate after a period of time, such as sensor/actuator failure, product changes, and seasonal disturbances [27]. The surveys by Ender and Bialkowski [19] reported that only one third of all control loops worked satisfactorily. In a recent study of 9000 controllers in Eastman Chemical plants by Paulonis and Cox [61], over 40 percent of control loops were grouped in the categories of "fair" and/or "poor". Poor control performance of industrial processes may be caused by various factors such as inappropriate controller tuning or control structure design, dynamic changes in process or disturbance, instrument failure, and equipment malfunction. A control loop with poor performance may result in disrupted process operation, inferior product quality, and more material or energy consumption. Therefore, control loops have become important assets which should be routinely monitored and maintained. Monitoring the performance of chemical processes is becoming increasingly necessary for early detection of abnormal events/faults and any performance degradation before unexpected disruptions and breakdowns happen.
The development of control loop performance monitoring techniques using routine operating data has received much attention during the last two decades [32, 42]. Minimum variance control (MVC) [26, 33] as a popular performance benchmark of control loops has been well studied in both SISO and MIMO cases. The MVC benchmark is a theoretical lower bound of the output variance and thus represents the best achievable control performance under input-output time delay constraints. Several review articles and books have summarized the latest research results on control performance monitoring theory, approaches, and industrial applications [40, 36].
A challenging task after detection of poor control performance is to diagnose and locate the root cause in order to regulate and improve the situation. It was reported that in about 30% of control loops, the increased output variability and cyclic pattern were due to instrument performance [6]. Therefore, a further investigation and solution to any control-loop performance problem often involves diagnosis and root cause analysis of oscillation and valve-stiction problems. The Entech control engineering firm pointed out that eighty percent of valves failed dynamic performance standards due to stiction, backlash, or oversized design. Sustained oscillations around the setpoint are a major cause of increased process variability, and reducing or removing such oscillations can yield substantial commercial benefits. Effectively diagnosing the root cause, recovering the propagation paths, and understanding the propagation mechanism play a significant role in ensuring efficient and safe operations.

Oscillations in control loops can occur due to valve stiction, aggressive controller tuning, or external oscillatory disturbances. Since industrial controllers are often tuned conservatively, bad tuning is not a very common cause of oscillation; even when it does occur, it can be viewed as a "software problem" and eliminated by adjusting the control parameters. Many industrial studies indicate that valve nonlinearity is the largest cause of increased variability [37, 70]. Detecting oscillations and diagnosing stiction in valves from routine operating data is a challenging task. An effective and reliable methodology that can perform root cause analysis of whole-plant oscillations is highly sought.
In a plant facility, operators are the end users of the controllers and are responsible for the day-to-day safe and economic operation of the plant. Control engineers' responsibilities include troubleshooting, maintenance of advanced control software, and interaction. It is very difficult for a process engineer who is taking care of hundreds of loops to prioritize the control loop problems. Current performance monitoring tools can detect the oscillating loops and present them to the operators, but they do not have an efficient capability to locate the source of the oscillation and recover the corresponding propagation paths. It is costly and time-consuming to inspect each oscillating valve for further maintenance; therefore, people tend to ignore such problematic loop reports if there is no severe fault. Thus, there is an urgent need for a systematic and reliable approach that can help process engineers and operators find the root cause of the oscillating loops accurately and keep the plant running in a healthy condition.
Compared to single loop testing methods, a plant-wide approach that considers the mapped-out distribution of disturbances from an overall plant perspective is preferable [78]. Some key requirements and wish-list items for such a methodology are: detection of the presence of one or multiple periodic oscillations/disturbances; determination of the locations of the various oscillations/disturbances in the plant and their most likely root causes; facility-wide approaches such as behavior clustering; automated model-free causal analysis; smart incorporation of process knowledge; and so on [40, 61, 14].
Another kind of degradation in control performance can be caused by sensor faults or external disturbances such as step or random changes of the process variables [41]. They can be modeled as additive disturbances. Statistical process monitoring (SPM) based on process variables and quality variables has been developed and implemented successfully in many industries over the last twenty years [64]. Typically, data-driven models from latent variable methods such as principal component analysis and partial least squares are built to derive process monitoring statistics [48, 93]. However, many existing methods assume that the measured data have stationary means and variances. In practical situations, some observed series exhibit nonstationary characteristics. Directly applying linear regression to nonstationary series will lead to the 'spurious regression' issue and misleading relationship results [24]. If the nonstationary series are converted to stationary series through differencing, much potentially useful dynamic information is lost in the procedure. Therefore, how to capture the inner relationships within nonstationary series and utilize them to help monitor the whole industrial process remains an important and active area.
1.2 Existing Methods for Root Cause Diagnosis

1.2.1 Non-linearity test based method

Choudhury et al. utilized higher-order statistics of closed-loop data for diagnosing the causes of poor control-loop performance [69]. The basic principle is to calculate the squared bicoherence to search for nonlinear interactions in a single signal:
\[
\mathrm{bic}^2(f_1,f_2) = \frac{\left|E\{X(f_1)X(f_2)X^{*}(f_1+f_2)\}\right|^2}{E\{|X(f_1)X(f_2)|^2\}\,E\{|X(f_1+f_2)|^2\}} \tag{1.1}
\]
where $X(f_1)$ is the discrete Fourier transform of data $x(k)$ at the frequency $f_1$. Thus a non-Gaussian index (NGI) is defined as:
\[
\mathrm{NGI} = \frac{\sum \mathrm{bic}^2_{\mathrm{significant}}}{L} - \frac{c_{\chi^2_\alpha}}{2K} \tag{1.2}
\]
where $\sum \mathrm{bic}^2_{\mathrm{significant}}$ sums the bicoherences that fail the hypothesis test $P\left(2K\,\mathrm{bic}^2(f_1,f_2) > c_{\chi^2_\alpha}\right) = \alpha$, and $L$ is the number of significant bicoherences. If $\mathrm{NGI} \le \alpha$, the signal is Gaussian; thus the possible cause of the poor performance is a linear external oscillation, a tightly tuned controller, and so on. If $\mathrm{NGI} > \alpha$, then the signal is non-Gaussian and an NLI (non-linearity index) is calculated as:
\[
\mathrm{NLI} = \mathrm{bic}^2_{\max} - \left(\overline{\mathrm{bic}}^2 + 2\sigma_{\mathrm{bic}^2}\right) \tag{1.3}
\]
where the largest and smallest 10% of the bicoherences are excluded to ensure robustness. If $\mathrm{NLI} > 0$, the signal-generating process is non-linear; the PV-OP plot and typical actuator signatures can then be further analyzed to diagnose the nonlinearity-induced performance degradation.
7
In order to find the possible source of non-linearity in a plant-wide
oscillation case, a total non-linearity index(TNLI) [11]has been introduced
as a measure to quantify non-linearities: TNLI =
∑
bic
2
significant
. The vari-
ables with highest TNLI are diagnosed to be the root cause of the whole
plant oscillation. The drawback of this method is that it doesn’t consider
the correlation and dynamic propagation between each signals, the TNLI
is calculated separately instead. It is reasonable to assume the root cause
belongs to the signal set with higher TNLI. However it is still difficult to
accurately determine which one is the true cause especially among several
variableswithcomparablehighTNLIindex.
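To make Eq. (1.1) concrete, the following sketch estimates the squared bicoherence by averaging segment DFTs, on synthetic segments with and without quadratic phase coupling. This is a minimal sketch: the segment length, bin indices, and synthetic signals are illustrative assumptions, not data from this work.

```python
import numpy as np

def squared_bicoherence(x, nfft=128, f1=5, f2=8):
    """Estimate bic^2(f1, f2), Eq. (1.1), by averaging the DFT over
    K = len(x) // nfft non-overlapping segments; f1, f2 are bin indices."""
    segs = np.reshape(x[: len(x) // nfft * nfft], (-1, nfft))
    X = np.fft.fft(segs, axis=1)
    triple = X[:, f1] * X[:, f2] * np.conj(X[:, f1 + f2])
    num = np.abs(triple.mean()) ** 2
    den = (np.mean(np.abs(X[:, f1] * X[:, f2]) ** 2)
           * np.mean(np.abs(X[:, f1 + f2]) ** 2))
    return num / den

def synth(K, nfft, coupled, rng):
    """K segments with components at bins 5, 8, 13; if coupled, the phase
    at bin 13 is the sum of the phases at bins 5 and 8 (phase coupling)."""
    t = np.arange(nfft)
    out = []
    for _ in range(K):
        p1, p2 = rng.uniform(0, 2 * np.pi, 2)
        p3 = p1 + p2 if coupled else rng.uniform(0, 2 * np.pi)
        out.append(np.cos(2 * np.pi * 5 * t / nfft + p1)
                   + np.cos(2 * np.pi * 8 * t / nfft + p2)
                   + np.cos(2 * np.pi * 13 * t / nfft + p3)
                   + 0.1 * rng.standard_normal(nfft))
    return np.concatenate(out)

rng = np.random.default_rng(0)
bc_coupled = squared_bicoherence(synth(200, 128, True, rng))
bc_random = squared_bicoherence(synth(200, 128, False, rng))
print(bc_coupled, bc_random)  # coupled near 1, uncoupled near 0
```

The contrast illustrates why bicoherence isolates nonlinearity: only phase coupling survives the averaging in the numerator, while the denominator normalizes out amplitude effects.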
1.2.2 Spectral independent component analysis

For automated detection of plant-wide disturbances and isolation of their sources, multi-resolution spectral ICA was proposed by Xia et al. [86] to isolate the sources of multiple oscillations in an industrial process. The spectral data matrix is formulated as follows by performing the discrete Fourier transform (DFT):
\[
X = \begin{bmatrix} P_1(f_1) & \cdots & P_1(f_N) \\ \vdots & & \vdots \\ P_m(f_1) & \cdots & P_m(f_N) \end{bmatrix} \tag{1.4}
\]
where $m$ is the number of process variables. Then ICA can be performed on the $X$ matrix:
\[
X = AS \tag{1.5}
\]
8
where A is the mixing matrix and S contains the independent compo-
nents(ICs) which maximizes the non-Gaussianity. Those ICs can capture
the dominant feature of narrow-band peak in spectrum to represent
different frequency of oscillation sources. Then a dominance ratio(DR)
could be calculated as the ratio of the total energy carried by that specific
IC over the total energy carried by all dominant oscillatory ICs. For each
oscillatory IC with significant DR, a significant index based on A matrix is
proposed to quantify and rank the contribution of each original variables
to this IC. Therefore, for a plant-wide disturbance with particular interest
of frequency range, those variables with highest significant index to the
correspondingICareselectedtobethecandidatesoftherootcause. Itisan
efficientwaytoisolatemultipleoscillationsandcouldgiveasuggestionon
possible root causes. However, since the significance index is based on the
correlationofrawsignalandIC,ahighersignificantindexcouldimplythat
the variable are more relevant to the IC compared to other variables, but it
does not necessary confirm any causal relationship between them. Thus,
further investigation or diagnosis methodologies need to be performed
on the candidates root causes to find out the true cause of the plant-wide
disturbance.
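A rough sketch of the spectral ICA idea, using NumPy and scikit-learn's FastICA on synthetic data. The two hidden oscillation frequencies, the random mixing matrix, and ranking variables by the magnitude of the mixing column are illustrative assumptions, not Xia et al.'s exact procedure.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(1)
n = 2048
t = np.arange(n)
# Two hidden oscillation sources at exact FFT bins 40 and 225, mixed into
# five measured "process variables" (hypothetical example data).
s1 = np.sin(2 * np.pi * 40 * t / n)
s2 = np.sin(2 * np.pi * 225 * t / n)
M = rng.uniform(0.2, 1.0, size=(5, 2))           # random mixing weights
V = M @ np.vstack([s1, s2]) + 0.05 * rng.standard_normal((5, n))

# Spectral data matrix, Eq. (1.4): row i holds the magnitude DFT of variable i.
X = np.abs(np.fft.rfft(V, axis=1))

# ICA on the spectra: each IC should pick up one narrow-band peak.
ica = FastICA(n_components=2, random_state=0)
S = ica.fit_transform(X.T).T                     # ICs over frequency bins
A = ica.mixing_                                  # (5, 2) mixing matrix

# Rank variables by |A| in each column as a simple significance index.
for k in range(2):
    peak_bin = int(np.argmax(np.abs(S[k])))
    order = np.argsort(-np.abs(A[:, k]))
    print(f"IC{k}: spectral peak at bin {peak_bin}, "
          f"variable ranking {order.tolist()}")
```

Because the two sources occupy disjoint spectral peaks, each recovered IC concentrates on one narrow band, and the mixing-matrix column gives a per-variable contribution ranking for that band.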
1.2.3 Adjacency matrix method

Many frequency-domain tools such as the power spectrum and spectral envelope methods have been applied successfully to detect oscillation phenomena; however, efficient strategies to locate the root cause are relatively few. Jiang et al. proposed an adjacency matrix based topology method to diagnose the cause of plant-wide oscillations [38].

First, a control loop digraph is formulated based on the process flowsheet of the industrial plant. Each controller is denoted as a node; if the controller output (OP) of the $i$th controller can directly affect the controlled variable of the $j$th controller, an edge from node $i$ to node $j$ is generated. Putting this idea into mathematical form gives the adjacency matrix. For a system with 5 nodes, the adjacency matrix could be:
\[
X = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \tag{1.6}
\]
where the $(i,j)$th element is 1 if the row variable has a directed edge pointing to the column variable, and 0 otherwise.

Therefore, the reachability matrix of $X$ is defined as:
\[
R = \left(X + X^2 + X^3 + \cdots + X^N\right)^{\#} \tag{1.7}
\]
where $A^{\#}$ is the Boolean equivalent of matrix $A$. The reachability matrix represents the process topology of a chemical system, since its $(i,j)$th element indicates whether a path exists from node $i$ to node $j$. Therefore, the variables that have access to all the affected variables are regarded as root causes. This is a model-based method: it requires process plant knowledge, which is not always available. Another drawback is that a variable that can influence all the oscillating variables may not be the true source. Although the propagation seems reasonable from a process structure point of view, the true dynamics hidden in the routine operating data are not investigated.
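The adjacency-to-reachability computation of Eqs. (1.6)-(1.7) is a few lines of Boolean matrix algebra. A minimal sketch on the 5-node example; the final screening rule (a root-cause candidate reaches every other node) is a simplified reading of the method:

```python
import numpy as np

def reachability(X):
    """Boolean reachability matrix R = (X + X^2 + ... + X^N)^#, Eq. (1.7)."""
    N = X.shape[0]
    acc = np.zeros((N, N), dtype=bool)
    P = X.astype(bool)                    # current power X^k, as Boolean
    for _ in range(N):
        acc |= P                          # accumulate X^1 ... X^N
        P = (P.astype(int) @ X).astype(bool)
    return acc.astype(int)

# Adjacency matrix from Eq. (1.6): node 0 feeds 1; 1 feeds 2 and 3; etc.
X = np.array([
    [0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])
R = reachability(X)
print(R)
# A node with a path to every other (affected) node is a root-cause candidate.
candidates = [i for i in range(5) if all(R[i, j] for j in range(5) if j != i)]
print(candidates)  # -> [0]: only node 0 reaches all others
```

For the chain-like topology of Eq. (1.6), only node 0 can reach every other node, so it is the sole candidate, matching the intuition that the disturbance propagates downstream from the first controller.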
1.2.4 Transfer entropy based disturbance propagation method

Bauer et al. [4] developed a method based on transfer entropy [67] to diagnose the root cause and find the direction of disturbance propagation. Partial and direct transfer entropy have recently been proposed to detect partial and direct causality, respectively [18, 81].

Transfer entropy is a non-parametric statistic measuring the amount of directed transfer of information between two random processes. Based on the transition probabilities, which contain all information on causality between two variables, the amount of information transferred from variable $y$ to $x$ is:
\[
t(x|y) = \sum p(x_{i+h}, x_i, y_i)\cdot \log\frac{p(x_{i+h}\,|\,x_i, y_i)}{p(x_{i+h}\,|\,x_i)} \tag{1.8}
\]
where $x_i$ and $y_i$ denote embedding vectors with elements from past values of $x$ and $y$, $p(\cdot|\cdot)$ denotes the conditional PDF, and $h$ is the future horizon. A causality measure is derived by comparing the influence of $x$ on $y$ with the influence of $y$ on $x$:
\[
t_{x\to y} = t(y|x) - t(x|y) \tag{1.9}
\]
Significantly large positive values of $t_{x\to y}$ mean that $x$ causes $y$, while significantly large negative values mean the reverse. The confidence level is established by a Monte Carlo method using surrogate data.

The advantage of the transfer entropy method is that it is suitable for both linear and nonlinear relationships between variables. However, it requires a large amount of time series data to estimate the joint probabilities for calculating the transition probabilities, and it is sensitive to parameter changes such as the prediction horizon, embedding horizon, and time interval. As a result, it is relatively difficult to implement and the corresponding computational burden is large.
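A minimal histogram-based sketch of Eq. (1.8) with embedding length 1 illustrates both the idea and the data hunger noted above. The bin count, horizon, and the synthetic coupled series are illustrative assumptions:

```python
import numpy as np

def transfer_entropy(x, y, bins=8, h=1):
    """Histogram estimate of t(x|y), Eq. (1.8), with scalar embedding:
    how much the past of y improves prediction of x's next value."""
    xf, xp, yp = x[h:], x[:-h], y[:-h]
    # Joint histogram estimate of p(x_{i+h}, x_i, y_i)
    p_xyz, _ = np.histogramdd(np.column_stack([xf, xp, yp]), bins=bins)
    p_xyz /= p_xyz.sum()
    p_xz = p_xyz.sum(axis=2)              # p(x_{i+h}, x_i)
    p_z = p_xyz.sum(axis=0)               # p(x_i, y_i)
    p_x = p_xz.sum(axis=0)                # p(x_i)
    te = 0.0
    for i, j, k in zip(*np.nonzero(p_xyz)):
        # p(xf|xp,yp)/p(xf|xp) = p_xyz * p_x / (p_xz * p_z)
        te += p_xyz[i, j, k] * np.log(
            p_xyz[i, j, k] * p_x[j] / (p_xz[i, j] * p_z[j, k]))
    return te

rng = np.random.default_rng(2)
n = 20000
y = rng.standard_normal(n)
x = np.zeros(n)
for i in range(1, n):                     # x is driven by the past of y
    x[i] = 0.6 * x[i - 1] + 0.8 * y[i - 1] + 0.2 * rng.standard_normal()

t_y_to_x = transfer_entropy(x, y)         # t(x|y): y's past helps predict x
t_x_to_y = transfer_entropy(y, x)         # t(y|x): x's past should not help
print(t_y_to_x, t_x_to_y)
```

Even this toy case needs tens of thousands of samples for a stable 3-D histogram, which is exactly the practical burden described in the text; longer embedding vectors make the estimation exponentially harder.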
1.3 Outline of the Dissertation

This dissertation focuses on developing a causal structure based analysis framework for diagnosing the root cause of plant-wide oscillations and monitoring dynamic processes. The dissertation is organized as follows:

In Chapter 2, the necessary background on plant-wide disturbances is discussed. A new data-driven time series method for diagnosing the sources and propagation paths of plant-wide oscillations is presented. The proposed method first uses a latent variable method to select features which carry significant common oscillations, then applies both time-domain Granger causality and spectral Granger causality to provide reliable diagnosis. Simulation tests and an industrial case study are shown to demonstrate the effectiveness of the proposed methodology.

In Chapter 3, a novel multilevel Granger causality framework is proposed for complete causal mining and root cause analysis in a large-scale complex industrial process. The high level is a group-wise analysis, in which variables are clustered by a dynamic time warping-based K-means method and investigated using group Granger causality. The low level is individual causal reasoning within each group, where a partial least squares modified Granger causal test is developed to overcome multicollinearity and prediction power comparison issues. The proposed causality analysis framework is validated through a benchmark industrial case study to show its effectiveness and superiority.

Chapter 4 introduces the concept of cointegration and its application to fault detection in dynamic processes. A nonstationarity test is adopted to distinguish nonstationary series from stationary series. After that, cointegration analysis is used to describe the stochastic common trends and equilibrium errors, which can be used to construct monitoring indices for abnormality detection. A case study on the Tennessee Eastman process shows that the proposed nonstationary process monitoring can efficiently detect faults in nonstationary dynamic processes.

Chapter 5 gives the concluding remarks for the dissertation.
Chapter 2
Root Cause Diagnosis of Plant-wide Oscillations Using Granger Causality

2.1 Introduction

There are typically hundreds or even thousands of control loops in a modern chemical plant. In order to maximize productivity and efficiency, all critical loops must operate at or near optimal status. Efficient and effective execution of control loops is very important to the successful implementation of upper-level optimization, scheduling, and management [61]. In practice, however, it is reported by Ender and Bialkowski [19, 6] that only one third of the control loops work satisfactorily. There are several typical reasons for poor control performance, including inappropriate controller tuning, process nonlinearity, malfunctioning of actuators and sensors, and severe model mismatch for model-based control. Among the many possible types of performance degradation, the presence of plant-wide oscillations is a common situation, and yet their cause is difficult to diagnose due to plant and control interactions. Sustained oscillations can dramatically increase process variability. Therefore, promptly reducing or removing such oscillations can lead to substantial economic benefits [1, 82].
Oscillations are a common type of plant-wide variability which can be due to multiple causes such as aggressive controller tuning, valve stiction, and external oscillatory disturbances [35]. An oscillation is a periodic phenomenon with a somewhat well-defined frequency that can occur continuously or intermittently depending on the process operating conditions. Once an oscillation is generated somewhere in the plant, it can propagate to other parts of the plant through mass and heat transfer, making the root cause difficult to diagnose. Plant-wide oscillations cause poor control and operation performance, low product quality, and increased energy consumption and equipment wear-out [78, 45].

Since industrial processes are usually interacting and the most popular industrial controllers are single loop PID controllers, oscillations often occur due to process interactions even though each PID controller is tuned using conservative rules. This cause of oscillations can be viewed as a "software problem", and it can be eliminated by adjusting the control parameters or by the use of multivariable model-based controllers. Many industrial studies indicate that valve stiction is another major source of oscillations; it is considered a "hardware problem" and can only be eliminated by identifying the sticking valve and performing maintenance [56, 13]. Due to the interaction and complexity of industrial plants, a single cause of oscillations can result in plant-wide oscillations, increasing the difficulty of determining cause-effect relationships among oscillating loops or controlled variables [40]. Therefore, strategies for detecting and diagnosing plant-wide oscillations are essential to solve these problems.
The task of detecting single loop oscillations from routine operation data has been studied intensively. The auto-covariance function (ACF), as a smoothed version of the original time-domain data, is often capable of detecting oscillations reliably. One of the efficient ways to automatically detect oscillations is based on zero crossing methods. Kedem and Yakowitz [47] proposed a method to determine oscillatory behavior from the higher-order zero crossings of time-series data. Thornhill and Hägglund used zero crossings and calculated the integral absolute error to indicate the onset of an oscillation [76].
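A toy version of the zero-crossing idea checks how regular the zero-crossing intervals of the autocovariance function are: an oscillating signal yields evenly spaced crossings, while noise does not. This is a simplified sketch, not the cited authors' exact indices; the lag horizon and the regularity ratio are assumptions.

```python
import numpy as np

def oscillation_index(x, max_lag=400):
    """Regularity of the zero crossings of the autocovariance function:
    a small value means evenly spaced crossings, i.e. a likely oscillation."""
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1: len(x) - 1 + max_lag]
    acf /= acf[0]                                  # normalize to lag 0
    crossings = np.nonzero(np.diff(np.sign(acf)))[0]
    if len(crossings) < 3:
        return np.inf                              # too few crossings
    intervals = np.diff(crossings)
    return intervals.std() / intervals.mean()      # coefficient of variation

rng = np.random.default_rng(3)
t = np.arange(4000)
osc = np.sin(2 * np.pi * t / 50) + 0.3 * rng.standard_normal(4000)
noise = rng.standard_normal(4000)

oi_osc = oscillation_index(osc)
oi_noise = oscillation_index(noise)
print(oi_osc, oi_noise)  # small for the oscillation, larger for noise
```

Working on the ACF rather than the raw signal is what makes the check robust: the averaging in the ACF suppresses the measurement noise that would otherwise produce spurious zero crossings.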
While sticking valves tend to cause oscillations in the control loop, they usually have specific characteristics that can be used to distinguish them from other causes of oscillations. Several stiction detection methods analyze the signal shapes of closed-loop operation data. Horch [31] showed that the covariance of the manipulated variable (MV) and controlled variable (CV) would be an odd function if the oscillation is caused by valve stiction rather than loop interactions. He et al. [28] proposed to identify valve stiction by finding a better fit of the oscillatory data to a triangular wave rather than sinusoidal waves. Singhal and Salsbury [72] compared the areas before and after the peaks of the control error signal to indicate a probable presence of stiction. Kano et al. [44] compared the shape of MV versus controller output (OP) plots with reference to the stiction pattern to detect stiction. Choudhury et al. [10] proposed to change the controller gain to distinguish an internally generated oscillation from an external oscillatory disturbance. There are also model-based approaches to quantify valve stiction in control loops, which can provide more accurate analysis even though they are more computationally demanding [34, 52, 46].
Root cause diagnosis in a multivariable and plant-wide setting is a challenging issue. Some methods have been proposed for root cause diagnosis of plant-wide oscillations which consider many loops as a whole and find the most probable source. Thornhill et al. [75] proposed a non-linearity index, combined with process understanding, to find the root cause due to valve stiction. The assumption behind this method is that oscillations at or near the source tend to be more nonlinear, with higher orders of harmonics. Xia et al. [86] proposed multi-resolution spectral independent component analysis to detect and isolate the sources of multiple oscillations, where a significance index is utilized to indicate possible root causes. Recently, Jiang et al. [38] incorporated process knowledge into a graph-theoretic approach to diagnose the root cause. Bauer et al. [4] proposed a data-driven method to find the propagation direction of disturbances using the idea of transfer entropy.
Given a set of time series data, one of the most popular definitions of causality is Granger causality (GC) [23]. Due to its simplicity, interpretability, and ease of implementation, Granger causality has found wide applications in economics [25, 29, 8]. Recently, Granger causality has gained great attention in many other areas [68, 30, 15] to extract useful dynamic information and study inner causal relationships. Granger causality builds a straightforward connection between causality and prediction, and employs a statistical hypothesis test to determine whether one time series is helpful in forecasting another. This chapter proposes a strategy that applies Granger causality to analyze the cause and effect relationships in plant-wide oscillations.

The main idea of this chapter is to combine feature selection using principal components with Granger causality, both in the time domain and in the frequency domain, to diagnose the root cause of plant-wide oscillations. It is a data-based method and can be implemented during normal operations without any plant perturbations. The rest of this chapter is organized as follows. Section 2 introduces the main causality analysis algorithm of the proposed diagnosis strategy, which consists of three parts: time-domain Granger causality, spectral Granger causality and feature selection. In Section 3, through a simulated example and an industrial case study, the effectiveness of the presented root cause diagnosis method is shown. Section 4 summarizes the work and gives conclusions.
2.2 Methods and Implementation
In this section, the concept of Granger causality is introduced and a significance test is utilized to determine whether a cause-effect relationship is significant or not. Granger causality (G-causality) is a measure of the causal effect from one time series to another and is based on linear predictions of time series. Based on G-causality, we say that $X_2$ causes $X_1$ if the inclusion of past observations of $X_2$ in a regression model reduces the prediction error of $X_1$ as compared to a model without $X_2$ information.
2.2.1 Time-domain Granger Causality
For two time series $X_1(t)$ and $X_2(t)$ from two stationary stochastic processes, a bivariate autoregressive (AR) model can be built as follows,

$$X_1(t) = \sum_{l=1}^{k} a_{11,l}\,X_1(t-l) + \sum_{l=1}^{k} a_{12,l}\,X_2(t-l) + e_1(t) \qquad (2.1)$$
$$X_2(t) = \sum_{l=1}^{k} a_{21,l}\,X_1(t-l) + \sum_{l=1}^{k} a_{22,l}\,X_2(t-l) + e_2(t) \qquad (2.2)$$
where $a_{ij,l}$ are the AR coefficients and $e_i(t)$ are the prediction errors or residuals of the model. $k$ is the model order, which defines how many time lags are included in the regression model. Equations (2.1) and (2.2) are referred to as the unrestricted model or the full model. We can also exclude the other variable, $X_1$ or $X_2$, and perform AR modeling on each time series alone as follows,
$$X_1(t) = \sum_{l=1}^{k} b_{1,l}\,X_1(t-l) + e_{1(2)}(t) \qquad (2.3)$$

$$X_2(t) = \sum_{l=1}^{k} b_{2,l}\,X_2(t-l) + e_{2(1)}(t) \qquad (2.4)$$
where $e_{i(j)}(t)$ refers to the prediction error from a model that predicts the $i$th variable by excluding the $j$th variable. If the variance of $e_i(t)$ is smaller than that of $e_{i(j)}(t)$ with some statistical significance, the prediction of time series $X_i(t)$ is significantly more accurate by incorporating the past values of $X_j(t)$. As a consequence, it is concluded that there is a causal effect from $X_j(t)$ to $X_i(t)$. The difference of the variances can be quantified by the following ratio index,

$$F_{j \to i} = \ln \frac{\mathrm{var}(e_{i(j)})}{\mathrm{var}(e_i)} \qquad (2.5)$$
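As an illustration of index (2.5), the sketch below (a minimal numpy example of ours, not the exact implementation used in this work) simulates a pair in which $X_2$ drives $X_1$, fits the restricted and full AR models by ordinary least squares, and evaluates $F_{2\to1}$ and $F_{1\to2}$:

```python
import numpy as np

def lagged_design(series_list, k):
    """Stack k lags of each series in series_list into a regressor matrix."""
    n = len(series_list[0])
    cols = [s[k - l:n - l] for s in series_list for l in range(1, k + 1)]
    return np.column_stack(cols)

def granger_index(x_target, x_aux, k=2):
    """F_{aux -> target} = ln(var(restricted residual) / var(full residual))."""
    y = x_target[k:]
    X_r = lagged_design([x_target], k)           # restricted: own lags only
    X_f = lagged_design([x_target, x_aux], k)    # full: own lags + aux lags
    var_r = np.var(y - X_r @ np.linalg.lstsq(X_r, y, rcond=None)[0])
    var_f = np.var(y - X_f @ np.linalg.lstsq(X_f, y, rcond=None)[0])
    return np.log(var_r / var_f)

# Simulated pair: x2 is AR(1), and x2(t-1) feeds into x1.
rng = np.random.default_rng(0)
n = 2000
x1, x2 = np.zeros(n), np.zeros(n)
for t in range(1, n):
    x2[t] = 0.6 * x2[t - 1] + rng.standard_normal()
    x1[t] = 0.5 * x1[t - 1] + 0.8 * x2[t - 1] + rng.standard_normal()

f_2to1 = granger_index(x1, x2)   # clearly positive: x2 helps predict x1
f_1to2 = granger_index(x2, x1)   # near zero: x1 adds little for x2
```

Because the restricted regressors are a subset of the full ones, the index is always nonnegative on the training data; only a significance test (introduced below) decides whether it is meaningfully large.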
It is straightforward to generalize the bivariate Granger causality to the multivariate case. Defining

$$A_{ij}(q^{-1}) = \sum_{l=1}^{k} a_{ij,l}\,q^{-l}$$

where $q^{-1}$ is the backward shift operator, (2.1) and (2.2) can be extended to the multivariate case as follows for a system of $n$ variables,
themultivariatecaseasfollowsforasystemofnvariables,
X
1
(t)
X
2
(t)
.
.
.
X
n
(t)
=
A
11
(q
−1
) A
12
(q
−1
) ··· A
n1
(q
−1
)
A
21
(q
−1
) A
22
(q
−1
) ··· A
n2
(q
−1
)
.
.
.
.
.
.
.
.
. ···
A
n1
(q
−1
) A
n2
(q
−1
) ··· A
nn
(q
−1
)
X
1
(t)
X
2
(t)
.
.
.
X
n
(t)
+
e
1
(t)
e
2
(t)
.
.
.
e
n
(t)
(2.6)
We say that $X_j$ causes $X_i$ if excluding $X_j$ reduces the ability to predict $X_i$ even if all other variables are included in the regression model. If each variable $X_j$ is excluded once in predicting all other variables $X_i$, $\forall i \neq j$, there will be $(n-1) \times n$ prediction error sequences $e_{i(j)}(t)$ and correspondingly $(n-1) \times n$ variances denoted as $\sigma^2_{i(j)} \equiv \mathrm{var}(e_{i(j)})$. We can form the following cause-effect matrix of variances, where the $ij$th element is the variance of predicting the $i$th row variable by excluding the $j$th column variable for $i \neq j$. The diagonal elements give the residual variances of the $n$ full models, denoted as $\sigma^2_i \equiv \mathrm{var}(e_i)$.
$$\begin{array}{c|cccc} & (X_1) & (X_2) & \cdots & (X_n) \\ \hline X_1 & \sigma^2_1 & \sigma^2_{1(2)} & \cdots & \sigma^2_{1(n)} \\ X_2 & \sigma^2_{2(1)} & \sigma^2_2 & \cdots & \sigma^2_{2(n)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ X_n & \sigma^2_{n(1)} & \sigma^2_{n(2)} & \cdots & \sigma^2_n \end{array}$$
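A sketch of how this cause-effect variance matrix can be assembled with ordinary least squares (a minimal illustration of ours; the helper names are assumptions, not from the text):

```python
import numpy as np

def var_residual(y, X):
    """Residual variance of an OLS fit of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return np.var(y - X @ beta)

def lags(data, k, cols):
    """Regressor matrix of k lags of the selected columns of data (m x n)."""
    m = data.shape[0]
    blocks = [data[k - l:m - l, c] for c in cols for l in range(1, k + 1)]
    return np.column_stack(blocks)

def variance_matrix(data, k=2):
    """S[i, j]: residual variance predicting variable i while excluding
    variable j; S[i, i]: full-model residual variance."""
    m, n = data.shape
    S = np.zeros((n, n))
    for i in range(n):
        y = data[k:, i]
        S[i, i] = var_residual(y, lags(data, k, range(n)))
        for j in range(n):
            if j != i:
                keep = [c for c in range(n) if c != j]
                S[i, j] = var_residual(y, lags(data, k, keep))
    return S

# Three-variable example: variable 0 drives variable 1; variable 2 is isolated.
rng = np.random.default_rng(1)
m = 1500
data = np.zeros((m, 3))
for t in range(1, m):
    data[t, 0] = 0.7 * data[t - 1, 0] + rng.standard_normal()
    data[t, 1] = 0.5 * data[t - 1, 1] + 0.6 * data[t - 1, 0] + rng.standard_normal()
    data[t, 2] = 0.4 * data[t - 1, 2] + rng.standard_normal()

S = variance_matrix(data)
# The diagonal (full models) is the smallest entry in each row, and
# S[1, 0] is noticeably larger than S[1, 1] because variable 0 drives variable 1.
```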
It is expected that the diagonal elements of the matrix are smaller than the other elements in the same row, since the full models use more inputs in the regression models to fit the data. If a variable $X_j$ has a significant causal effect on another variable $X_i$, the $ij$th element in the matrix should be significantly greater than the $ii$th element. Therefore, the Granger causality from Variable $j$ to Variable $i$, given all other variables, is measured by

$$F_{j \to i} = \ln \frac{\sigma^2_{i(j)}}{\sigma^2_i}$$
which is the G-causality index. However, statistical significance needs to be established before making an inference on the causal relationship. Consider two models, Models 1 and 2. Model 1 is the restricted model with $p_1$ parameters, while Model 2 is the full model with $p_2$ parameters, where $p_2 > p_1$. The full model can fit the data at least as well as the restricted model. We want to determine whether Model 2 gives a significantly better fit to the data than Model 1. The following $F$ test is applied,

$$\frac{(RSS_1 - RSS_2)/(p_2 - p_1)}{RSS_2/(m - p_2)} \sim F_{p_2 - p_1,\, m - p_2} \qquad (2.7)$$

where $RSS_i$ is the residual sum of squares of Model $i$, and $m$ is the total number of observations. If the resulting p-value is less than the significance level $\alpha$ (typically 0.01 or 0.05), the null hypothesis that there is no causality can be rejected.
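The test statistic in (2.7) is simple to compute once the two residual sums of squares are known; a small helper of ours (the function name and example numbers are illustrative) might look like:

```python
def f_statistic(rss1, rss2, p1, p2, m):
    """F statistic of eq. (2.7) for nested models: restricted model
    (rss1, p1 parameters) vs. full model (rss2, p2 parameters),
    both fitted on m observations."""
    return ((rss1 - rss2) / (p2 - p1)) / (rss2 / (m - p2))

# Example: restricted RSS = 120, full RSS = 100, p1 = 4, p2 = 8, m = 1008.
# (120 - 100)/4 = 5 and 100/1000 = 0.1, so the statistic is 50.
stat = f_statistic(120.0, 100.0, 4, 8, 1008)
```

The statistic is then compared against the $F_{p_2-p_1,\,m-p_2}$ distribution (e.g., via `scipy.stats.f.sf`) to obtain the p-value.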
2.2.2 Spectral Granger Causality
Since the interest is to determine whether oscillating variables in a plant-wide setting have cause-effect relationships among themselves, it is appropriate to use spectral Granger analysis to focus the analysis on a specific frequency range. For multiple time series that are autoregressive and nondeterministic, the spectral decomposition of Granger's time-domain causality was proposed by Geweke [22]. Geweke showed that, for data from one time series, the power at a specific frequency can be decomposed into an intrinsic part and another part that is predicted by the data from another time series. To derive the frequency-domain Granger causality, we start with (2.1) and (2.2) and perform the Fourier transform to obtain
$$\begin{bmatrix} A_{11}(f) & A_{12}(f) \\ A_{21}(f) & A_{22}(f) \end{bmatrix} \begin{bmatrix} X_1(f) \\ X_2(f) \end{bmatrix} = \begin{bmatrix} E_1(f) \\ E_2(f) \end{bmatrix} \qquad (2.8)$$
where the components of the coefficient matrix $[A_{ij}(f)]$ are $A_{ij}(f) = \delta_{ij} - \sum_{l=1}^{k} a_{ij,l}\, e^{-2\pi i f l}$. Denoting the transfer function matrix $H(f) = [A_{ij}(f)]^{-1}$, (2.8) can be rewritten as follows.
$$\begin{bmatrix} X_1(f) \\ X_2(f) \end{bmatrix} = \begin{bmatrix} H_{11}(f) & H_{12}(f) \\ H_{21}(f) & H_{22}(f) \end{bmatrix} \begin{bmatrix} E_1(f) \\ E_2(f) \end{bmatrix} \qquad (2.9)$$
Thus, the spectral density matrix $S(f)$ is derived as:

$$S(f) = \langle X(f) X^{*}(f) \rangle = H(f)\,\Sigma\,H^{*}(f) \qquad (2.10)$$

where $\Sigma$ is the covariance of the full model residuals. For the $X_1$ process, pre-multiply (2.8) on both sides with

$$\begin{bmatrix} 1 & 0 \\ -\Sigma_{12}/\Sigma_{11} & 1 \end{bmatrix} \qquad (2.11)$$

and obtain
$$\begin{bmatrix} X_1(f) \\ X_2(f) \end{bmatrix} = \begin{bmatrix} \tilde{H}_{11}(f) & \tilde{H}_{12}(f) \\ \tilde{H}_{21}(f) & \tilde{H}_{22}(f) \end{bmatrix} \begin{bmatrix} E_1(f) \\ \tilde{E}_2(f) \end{bmatrix} \qquad (2.12)$$
where

$$\tilde{E}_2(f) = E_2(f) - \frac{\Sigma_{12}}{\Sigma_{11}} E_1(f) \qquad (2.13)$$
and

$$\begin{bmatrix} \tilde{H}_{11}(f) & \tilde{H}_{12}(f) \\ \tilde{H}_{21}(f) & \tilde{H}_{22}(f) \end{bmatrix} = \begin{bmatrix} H_{11}(f) & H_{12}(f) \\ H_{21}(f) & H_{22}(f) \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\Sigma_{12}/\Sigma_{11} & 1 \end{bmatrix}^{-1} = \begin{bmatrix} H_{11}(f) + \dfrac{\Sigma_{12}}{\Sigma_{11}} H_{12}(f) & H_{12}(f) \\[2mm] H_{21}(f) + \dfrac{\Sigma_{12}}{\Sigma_{11}} H_{22}(f) & H_{22}(f) \end{bmatrix}$$
Equation (2.13) makes $\tilde{E}_2$ uncorrelated with $E_1$. The spectrum of $X_1(t)$ is then decomposed into two parts:

$$S_{11}(f) = \tilde{H}_{11}(f)\,\Sigma_{11}\,\tilde{H}^{*}_{11}(f) + \tilde{H}_{12}(f)\,\tilde{\Sigma}_{22}\,\tilde{H}^{*}_{12}(f) \qquad (2.14)$$
where $\tilde{\Sigma}_{22} = \Sigma_{22} - \Sigma_{12}^2/\Sigma_{11}$. Therefore, the spectral Granger causality from $X_2$ to $X_1$ at frequency $f$ is

$$I_{2 \to 1}(f) = \ln \frac{S_{11}(f)}{S_{11}(f) - \left(\Sigma_{22} - \dfrac{\Sigma_{12}^2}{\Sigma_{11}}\right) |H_{12}(f)|^2} \qquad (2.15)$$

which could be interpreted in this way: at a given frequency, how much power of $X_1$ could be explained or predicted by the history of $X_2$.
As an example we examine the following two simple sinusoidal signals,

$$S_1 = e^{-0.01t} \sin\!\left(\frac{\pi}{2} t\right) + e(t)$$

$$S_2 = e^{-0.01t} \sin\!\left(\frac{\pi}{2} t + \frac{3\pi}{8}\right) + e(t)$$

where $S_1$ and $S_2$ share the same oscillation at frequency $\pi/2$, but there is a phase difference which makes $S_1$ follow $S_2$ with a delay. $e(t)$ is a white noise sequence. The sampling time is 0.1 seconds and the total number of observations is 1000.
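The spectral measure (2.15) can be evaluated numerically from a fitted bivariate VAR model. The sketch below is our own illustration, not the code used in this work: the model order, noise level, and a pure time shift in place of the exact phase term are all assumptions. It generates two phase-shifted noisy sinusoids analogous to $S_1$ and $S_2$, fits a VAR, and evaluates $I_{2\to1}(f)$ on a normalized frequency grid.

```python
import numpy as np

def fit_var(x1, x2, k):
    """Fit a bivariate VAR(k) by OLS; return the lag coefficient matrices
    A_l (2x2 each) and the residual covariance Sigma."""
    n = len(x1)
    Y = np.column_stack([x1[k:], x2[k:]])
    X = np.column_stack([np.column_stack([x1[k - l:n - l], x2[k - l:n - l]])
                         for l in range(1, k + 1)])
    B = np.linalg.lstsq(X, Y, rcond=None)[0]                  # (2k, 2)
    Sigma = np.cov((Y - X @ B).T)
    A = [B[2 * (l - 1):2 * l, :].T for l in range(1, k + 1)]  # A_l, 2x2
    return A, Sigma

def spectral_gc_2to1(A, Sigma, freqs):
    """I_{2->1}(f) from eq. (2.15); freqs are in cycles per sample."""
    out = []
    for f in freqs:
        Af = np.eye(2, dtype=complex)
        for l, Al in enumerate(A, start=1):
            Af -= Al * np.exp(-2j * np.pi * f * l)            # A(f)
        H = np.linalg.inv(Af)                                 # transfer matrix
        S11 = (H @ Sigma @ H.conj().T)[0, 0].real             # spectrum of X1
        s22_tilde = Sigma[1, 1] - Sigma[0, 1] ** 2 / Sigma[0, 0]
        out.append(np.log(S11 / (S11 - s22_tilde * abs(H[0, 1]) ** 2)))
    return np.array(out)

# s1 lags s2 by 3 samples (0.3 s at a 0.1 s sampling time).
rng = np.random.default_rng(2)
t = np.arange(1000) * 0.1
s2 = np.sin(0.5 * np.pi * t) + 0.2 * rng.standard_normal(1000)
s1 = np.sin(0.5 * np.pi * (t - 0.3)) + 0.2 * rng.standard_normal(1000)
A, Sigma = fit_var(s1, s2, k=5)
freqs = np.linspace(0.005, 0.5, 100)
gc = spectral_gc_2to1(A, Sigma, freqs)   # nonnegative by construction
```

The denominator in (2.15) equals $\Sigma_{11}|\tilde{H}_{11}(f)|^2$, so the computed measure is nonnegative at every frequency.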
Figure 2.1 shows the time series plot of the two exponentially decaying sine signals carrying the same frequency but with different phase components. It is easy to see the pattern that $S_1$ follows $S_2$ with a phase delay; thus the current value of $S_2$ helps to predict the future value of $S_1$. In this case, we say $S_2$ causes $S_1$ in terms of G-causality. If we examine the power spectral plots of the two signals, both would show high peaks at the frequency $\pi/2$, which cannot give any indication about a causal relationship.

Figure 2.2 shows the magnitude of Granger causality from $S_1$ to $S_2$ and from $S_2$ to $S_1$, respectively, at various normalized frequencies. The peak around normalized frequency 0.023 corresponds to the common oscillation frequency of the two sine waves. A larger magnitude for $2 \to 1$ than $1 \to 2$ at the peak frequency implies that $S_2$ drives $S_1$ more significantly than $S_1$ does $S_2$.
This is just a simple example to help understand the spectral causal relationship between two time series. It might not be the perfect way to show the power of spectral Granger causality, since a phase shift of $2\pi$ could exist and somewhat mislead the causality relationship. However, for oscillating process data without any knowledge about their relationships, some valuable hints about the root cause can be inferred by analyzing several pair-wise spectral Granger causalities, especially when combined with feature selection and time-domain causality network analysis.

Figure 2.1: Time series plot of two simple oscillating signals S1 and S2.

Figure 2.2: Spectral Granger causality magnitudes between S1 and S2.

Therefore, for plant-wide oscillations, it is straightforward to obtain the oscillation frequency through spectrum plots of time-series data, which gives a frequency range of concern for root cause diagnosis. Then variables with peak magnitudes at the oscillation frequency should be selected and pair-wise spectral Granger causality can be performed. If some variables are found to have strong causal effects on other variables and receive negligible effects from others, this could lead to the possible root cause of the plant-wide oscillation.
2.2.3 Feature Selection
Feature selection serves as a tool to detect the common oscillation frequency. Feature selection, also known as variable selection, is concerned with selecting a subset of relevant features for building models. In this work, we wish to locate the plant-wide oscillation source by analyzing the causality between different variables. With many process variables and controller outputs in an oscillating plant, some of them are irrelevant to the diagnosis problem and can be excluded from causality analysis. Feature selection can reduce dimensionality effectively, improve result interpretability, and provide a better understanding of the underlying process behavior. Therefore, feature selection is usually a necessary preprocessing step for oscillation source diagnosis.
There are a number of feature selection methods, including clustering, linear transformation, wavelets, and convolutions of kernels. Since we want to eliminate variables that are irrelevant to the plant-wide oscillations, it is reasonable to assume that, if a variable does not show significant power near or at the oscillation frequency of interest, then it cannot be the source of the corresponding plant-wide oscillation.
Principal component analysis (PCA) has been widely used for treating multivariate data in various areas such as process monitoring, quality control and fault diagnosis. In this subsection, a general procedure of using PCA as a feature selection method is introduced. Consider a data matrix $X \in \Re^{m \times n}$ consisting of $m$ observation rows and $n$ variables. The matrix $X$ can be decomposed into a score matrix $T$ and a loading matrix $P$ as follows,

$$X = TP^{T} + E \qquad (2.16)$$

where $E$ is the residual matrix. Each column of $T$ is referred to as a principal component (PC). Each PC can be decomposed into a linear combination of the set of original variables, i.e.,

$$t_i = x_1 p_{1i} + x_2 p_{2i} + \cdots + x_j p_{ji} + \cdots + x_n p_{ni} \qquad (2.17)$$
The squared loading of the $j$th variable, $p_{ji}^2$, is a measure of the contribution from that variable to the $i$th PC. Leading PCs by definition extract the largest variations in the data, and thus extract oscillations that are shared among multiple variables. The PCs that carry the plant-wide oscillation pattern are chosen, and variables that make strong contributions to those oscillating PCs are selected as candidates for oscillation source diagnosis. The following oscillation significance index (OSI) is computed to rank the contributions of the variables to the oscillation pattern,

$$OSI_j = \sum_{i \in O_{pc}} \lambda_i\, p_{ji}^2 \qquad (2.18)$$

where $\lambda_i$ is the eigenvalue of the corresponding PC in the set $O_{pc}$ of oscillating PCs.
2.3 Case Studies
2.3.1 Simulated 4×4 System
Consider a four-input, four-output system with correlated disturbances. The open-loop process transfer function matrix $G(q^{-1})$ and disturbance model are given as follows

$$G(q^{-1}) = \begin{bmatrix} \dfrac{0.05q^{-3}}{1-0.95q^{-1}} & 0 & \dfrac{0.7q^{-3}}{1-0.3q^{-1}} & 0 \\[2mm] \dfrac{0.02966q^{-3}}{1-1.627q^{-1}+0.706q^{-2}} & \dfrac{0.0627q^{-6}}{1-0.937q^{-1}} & 0 & 0 \\[2mm] 0 & \dfrac{0.235q^{-5}}{1-0.765q^{-1}} & \dfrac{0.5q^{-2}}{1-q^{-1}+0.25q^{-2}} & 0 \\[2mm] \dfrac{0.5q^{-5}-0.4875q^{-6}}{1-1.395q^{-1}+0.455q^{-2}} & 0 & 0 & \dfrac{0.2q^{-6}}{1-0.8q^{-1}} \end{bmatrix} \qquad (2.19)$$

and

$$N(q^{-1}) = \mathrm{diag}\!\left(\frac{1-0.1875q^{-1}}{1-0.9875q^{-1}},\; \frac{1-0.1875q^{-1}}{1-0.9875q^{-1}},\; \frac{1-0.1875q^{-1}}{1-0.9875q^{-1}},\; \frac{1-0.1875q^{-1}}{1-0.9875q^{-1}}\right) \qquad (2.20)$$
In this system, four single-loop PI controllers of the form $K_c\{1 + (\frac{1}{T_r})[\frac{\Delta T}{1-q^{-1}}]\}$ are utilized, where $K_c$ is the proportional gain, $\Delta T$ is the controller sampling time and $T_r$ is the integral time. When we set $K_c = [\,0.816\;\; 0.625\;\; 0.184\;\; 0.37\,]$ and $T_r = [\,20\;\; 16\;\; 2.86\;\; 5\,]$, the four loops are well controlled and there is no obvious oscillation, which implies that aggressive tuning is not the cause of any oscillation problem. To introduce oscillations, a data-driven valve stiction model [12] is added to Loop 3.
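For reference, the discrete PI law above can be realized in incremental (velocity) form. The sketch below is a generic illustration of ours, not the simulation code of this chapter; the first-order plant and the gains are assumptions chosen only to show the closed loop settling.

```python
def pi_controller_step(e, e_prev, u_prev, Kc, Tr, dT):
    """Incremental form of u = Kc*{1 + (1/Tr)*[dT/(1 - q^-1)]} * e:
    u(t) = u(t-1) + Kc*[e(t) - e(t-1)] + (Kc*dT/Tr)*e(t)."""
    return u_prev + Kc * (e - e_prev) + Kc * dT / Tr * e

# Close the loop around a simple first-order plant y(t) = 0.9*y(t-1) + 0.1*u(t-1).
sp, y, u, e_prev = 1.0, 0.0, 0.0, 0.0
for _ in range(300):
    e = sp - y
    u = pi_controller_step(e, e_prev, u, Kc=0.8, Tr=5.0, dT=1.0)
    e_prev = e
    y = 0.9 * y + 0.1 * u
# The integral action drives y to the setpoint with zero steady-state error.
```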
The time series plot of the four controlled variables, along with the corresponding autocorrelation plot, is shown in Fig. 2.3. It is easy to see that there exists a common oscillation pattern in several loops. The time-domain Granger causality method is applied to all four controlled variables. Fig. 2.4 shows the time-domain causality graph of the four variables. In this graph, each node represents one variable. If there is a significant causal relationship from one variable to another, a directed edge with an arrow is drawn to represent it. The root cause suggested by this graph is obviously CV3. Meanwhile, an oscillation propagation path is recovered, which starts from CV3 to CV1 and then to CV4. CV2 is not in this path, which is consistent with the autocorrelation plot of Fig. 2.3(b), where CV2 does not show the similar periodic patterns as the other three loops.
By examining the autocorrelation plot in Fig. 2.3(b), we know the period of the plant-wide oscillation is about 33 samples per cycle, corresponding to a normalized frequency of 0.03.

Figure 2.3: Time series plot and autocorrelation plot of the simulation example.

Figure 2.4: Time-domain causality map of four CVs in the simulation example.

We wish to further examine the cause and effect relationship using the spectral Granger causality around this frequency range. Fig. 2.5 shows the spectral Granger causality magnitudes from each CV to all the other CVs over the frequency range. Several high peaks around frequency 0.03 are observed, which implies the presence of a common oscillation pattern. From the CV3 point of view, it has a strong causal effect on all the other three loops at the frequency of oscillation, which can be seen in the third column. Furthermore, the third row shows that CV3 receives no or negligible causal effect from CV1, CV2 or CV4 at this frequency. Therefore, CV3 has a dominant causal effect on the others and is likely to be the root cause of the oscillation.
Figure 2.5: Spectral Granger causality for each CV of the simulation example.
2.3.2 Industrial Application Case Study
Fig. 2.6 shows a process schematic diagram from the Eastman Chemical Company, USA. The advanced control group at Eastman Chemical had identified a need to diagnose a common oscillation of about two hours (320 samples/cycle). Circle-marked variables are those affected by this plant-wide oscillation that show similar oscillation patterns. FC, LC, PC and TC represent flow, level, pressure, and temperature controlled loops. Fifteen controlled variables along with 15 controller output (OP) signals are available.

Figure 2.6: Process schematic of a plant from Eastman Chemical Company.

To carry out this industrial application, feature selection is implemented first to select the most relevant variables for causality analysis. PCA is applied to the 15 controlled variables and 15 controller outputs in a data matrix with 8640 samples. Five leading principal components are shown in Fig. 2.7. PC1 and PC3 carry the two-hour oscillation pattern, so we calculate the oscillation significance index from each variable to these two oscillating PCs. Fig. 2.8 shows the OSI of the 30 variables. The horizontal red line marks the threshold, which is chosen to be 1/3 of the maximum OSI in this group. The threshold value is chosen by empirical experience, and whether or not some marginal variables are included in the features does not much affect the final diagnosis result. The 15 variables with OSI larger than the threshold are selected for further causality analysis.
Time-domain Granger causality is applied to measure the cause-effect relationships among the selected variables. The model order of the vector AR model is chosen to be 4, as suggested by the Bayesian information criterion. Fig. 2.9 depicts the causality graph, where a line connecting two nodes without an arrow indicates a two-way causality, known as causality feedback. There are several important observations to be noted from this graph. First, Variables 10 and 11 (CV and OP in the LC2 loop) are clearly the source, since they have many causal effects on other variables but do not receive any significant causal effects from other variables. Second, two major oscillation propagation paths have been detected. One is from LC2 (the Decanter level loop) to TC1 (the temperature loop in Column 2); the other starts from LC2 and goes to TC2 (the temperature loop in Column 3). Furthermore, a cascade feedback structure (TC1 loop and FC5 loop) is recovered in the form of causality feedback.

Figure 2.7: Five principal components for the industrial case study.

Figure 2.8: Oscillation significance index for the industrial case study.
From the time-domain causality network in Fig. 2.9, a net causal flow can be computed for each node, equal to the number of outgoing flows minus the number of incoming flows. If a node has a high positive causal flow, it is likely to be a source, while a high negative causal flow indicates a likely sink in the network. Fig. 2.10 depicts the causal flow for the 15 variables. Variables 10 and 11 have the largest positive causal flows and thus are regarded as sources.
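Given the directed causality graph as a 0/1 adjacency matrix, the net causal flow reduces to column sums minus row sums. In the small illustration below (our own toy graph, using the same convention as the cause-effect matrix above, i.e., entry $(i, j)$ refers to variable $j$ acting on variable $i$):

```python
import numpy as np

def causal_flow(C):
    """Net causal flow per node: outgoing edges minus incoming edges.
    C[i, j] = 1 if variable j has a significant causal effect on variable i."""
    return C.sum(axis=0) - C.sum(axis=1)

# Toy graph: node 0 drives nodes 1 and 2; node 2 drives node 1.
C = np.array([[0, 0, 0],
              [1, 0, 1],
              [1, 0, 0]])
flow = causal_flow(C)   # node 0 is a source (+2), node 1 a sink (-2)
```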
To apply pair-wise spectral Granger causality to this problem, eight variables with significant oscillations are chosen among the 15 features. They are Variables 2, 3, 4, 6, 8, 11, 13 and 14, representing all the important oscillating loops. Fig. 2.11 depicts the spectral causality graph for these variables.
The subplots under Column (11) have many strong peaks at frequencies around 0.003, which means that Variable 11 has dominant prediction power and causal effect on the other variables at this frequency. At the same time, the subplots in the row marked 11 show the causal effects from other variables to Variable 11, which are all very small and negligible. Furthermore, strong peaks exist from Variable 11 to Variables 2, 3 and 4, which indicates another oscillation propagation path (the LC1 and FC1 cascade feedback structure) that was not recovered in Fig. 2.9 using the time-domain causality method. Therefore, Variable 11 is inferred as the root cause of the oscillation at the 0.003 normalized frequency.

Figure 2.9: Time-domain causality map of 15 selected features in the industrial case study. The variables are: 1. LC1.PV, 2. LC1.OP, 3. FC1.PV, 4. FC2.PV, 5. FC2.OP, 6. TC1.PV, 7. TC1.OP, 8. FC5.PV, 9. FC5.OP, 10. LC2.PV, 11. LC2.OP, 12. FC8.PV, 13. FC8.OP, 14. TC2.PV, 15. TC2.OP.

Figure 2.10: Causal flow for each node (variable) in the industrial case study.
Figure 2.11: Spectral Granger causality for 8 selected features of the industrial case study.

2.4 Summary

Oscillations in industrial processes often propagate from one unit to other units. It is of interest to extract the information flow from process operation data and diagnose the root cause of plant-wide oscillations. In this chapter, the time-domain Granger causality and spectral Granger causality have been applied successfully to reveal cause and effect relationships in terms of time series prediction. A novel data-driven method for the diagnosis of plant-wide oscillations is presented. This method combines time-domain Granger causality and spectral Granger causality with PCA-based feature selection to locate the source of oscillations and explore related propagations. The time-domain causality is shown to be good at recovering the propagation path, while spectral causality enhances the analysis of pair-wise causality in a specified frequency range. Feature selection serves as a pre-processing step to select variables relevant to the oscillations for the root cause diagnosis. Furthermore, the proposed method requires neither process models nor plant perturbations to be implemented. Two case studies show that the proposed method is effective and promising for root cause diagnosis of plant-wide oscillations.
Chapter 3
Deep causal mining for plant-wide oscillations with
multilevel Granger causality analysis
3.1 Introduction
In a large-scale complex industrial plant, process units and control systems interact to keep the whole process running in a satisfactory state. A disturbance or process fault can occur at one point of the plant and propagate to other units through process flows and feedback paths, which is regarded as a plant-wide disturbance. Its presence may impact the overall control performance and lead to inferior quality and excessive energy consumption. Therefore, it is important to detect and diagnose the root cause of such disturbances and take further targeted maintenance actions [40, 78].

Oscillation is a common type of disturbance with well-defined amplitudes and frequencies. It may occur due to valve sticking, improper controller tuning, loop interaction or external disturbances. There are two important tasks in dealing with plant-wide oscillations. One is detection of the presence of one or more periodic oscillations, which has been studied intensively [55, 86, 37, 28]. The other is diagnosing the most probable root cause, which is still quite challenging and valuable.
Several data analysis methods have been proposed to identify the root cause of plant-wide oscillations. Choudhury et al. [69] utilized higher-order statistical techniques such as bicoherence to develop a nonlinearity index, then searched for the highest nonlinearity as the possible root cause. Xia et al. [86] proposed multi-resolution spectral independent component analysis to detect and isolate the sources of multiple oscillations, where a significance index is utilized to indicate possible root causes. Zang et al. [92] developed a bi-amplitude ratio which quantifies harmonic attenuation or amplification as the oscillation propagates, so as to isolate the source of nonlinearity-induced oscillations. Jiang et al. [39] implemented the spectral envelope method to detect and diagnose the oscillations according to the frequency power and the contribution to the oscillations.
Some model-based methods have been developed as well to help diagnose the root cause. Petersen presented a method based on Multilevel Flow Modeling along with the actual causal structure as a framework for reasoning about the source in supervisory control of complex systems. Recently, adjacency matrices [38] and signed directed graphs [87] were proposed as graphical models to locate the root cause. The major limitations of these model-based methods are that detailed process knowledge is not always available, and it is often time consuming to construct a model for a specific plant, which is nearly non-reusable. Furthermore, the main dynamic propagation sometimes may not be highlighted in the complete causal structure network, which makes the source difficult to diagnose.
Data-driven causality and topology analysis methods are gaining increasing attention; they aim to capture the dynamic process topology and find fault propagation pathways to determine the root cause of a certain plant-wide disturbance [17]. Bayesian networks have been utilized to measure causal influence in terms of conditional dependency and can help to recover the fault propagation paths and reveal the source [83, 94, 90]. Based on the transition probabilities on causality between two variables, transfer entropy has been successfully implemented to find the direction of disturbance propagation in chemical processes [4]. The transfer entropy method can handle both linear and non-linear processes but is relatively difficult to implement and sensitive to parameter tuning. Recently, Granger causality (GC) has been widely used in some other domains [7, 60, 21] to extract useful dynamic information and analyze causal relationships. Granger causality builds a straightforward connection between causality and prediction and employs a statistical hypothesis test to determine whether one time series is helpful in forecasting another. Yuan and Qin successfully applied time-domain and frequency-domain Granger causality along with a latent variable feature selection method to find the root cause and propagation paths of plant-wide oscillations [91].
The existing root cause diagnosis methods in the chemical process field have been limited to the study of interactions between one-dimensional time series. Some dynamic cause and effect relationships may exist between group-level subsets of the whole data set and may be overlooked by traditional causality methodologies. Little work has been done to deal with group interactions and causality; therefore it is difficult to obtain a complete overview of the whole causal structure in a large-scale complex process. In this chapter, a multilevel Granger causality framework is proposed, where causal reasoning can be inferred at a high level between groups and at a low, individual level within groups. Group Granger causality [51, 3] is adopted to analyze causal effects between different groups of time series data at the high level, clustered by a modified K-means clustering method. For the low-level individuals inside each group, a partial least squares modified Granger test is developed and implemented to overcome the issue of multicollinearity, since correlation and collinearity within a group are usually quite strong. Combining the different levels of causal analysis leads to a deep and complete causality mining of the whole dynamic propagation in the large-scale process.
The remainder of this chapter is organized as follows: Section 3.2 introduces our proposed causal analysis framework, which mainly consists of three important parts. First, a dynamic time warping based K-means clustering is presented to group the whole data set into several groups. Then Group Granger causality is introduced along with the disturbance feature group formulation. After that, a PLS-based modified Granger causality is proposed to analyze the within-group causal relationships. In Section 3.3, an industrial benchmark example is studied under our proposed causality analysis framework and the effectiveness and superiority are discussed. The chapter ends with conclusions in Section 3.4.
3.2 Proposed Multilevel Granger Causality Analysis Framework
3.2.1 Dynamic Time Warping-based K-means Clustering
Before studying causal relationships in a group sense, it is important to adopt an appropriate grouping strategy to form the high-level elements. The objective of grouping is to keep together those time series that share similar dynamics. Clustering is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups; therefore the within-group dissimilarity is minimized and the between-group dissimilarity is maximized. K-means clustering is quite popular for cluster analysis in data mining; it aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean.
Given a set of observations $(x_1, x_2, \cdots, x_n)$, where each observation is a $d$-dimensional real vector, k-means clustering aims to partition the $n$ observations into $k$ sets $S = \{S_1, S_2, \cdots, S_k\}$ so as to minimize the within-cluster sum of squares:

$$\arg\min_{S} \sum_{i=1}^{k} \sum_{x_j \in S_i} \| x_j - \mu_i \|^2 \qquad (3.1)$$

where $\mu_i$ is the mean of the objects in cluster $S_i$.
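Objective (3.1) is usually minimized with Lloyd's algorithm, alternating assignment and centroid-update steps. The numpy sketch below is our own minimal illustration (with a deterministic farthest-point initialization as an assumption; production code would typically use a library such as scikit-learn's `KMeans`):

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Lloyd's algorithm for objective (3.1)."""
    # Deterministic farthest-point initialization.
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assignment step: nearest center under squared Euclidean distance.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each center becomes the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)),   # cluster near (0, 0)
               rng.normal(5.0, 0.3, (50, 2))])  # cluster near (5, 5)
labels, centers = kmeans(X, k=2)
```

As discussed next, the squared Euclidean distance used here is what the DTW-based variant replaces for time series data.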
A similarity measure is a real-valued function that quantifies the similarity between two objects. In (3.1), the squared Euclidean distance is utilized to quantify the similarity between each data object and the centers of the different clusters. Data are called static if all their features do not change over time; many existing traditional clustering programs are designed to deal with static data [57].

In our case, the data we have for an industrial plant are time series of process variables (PV) and controller outputs (OP), which are measured and recorded through sensors and computers and can be used for process monitoring and diagnosis purposes. We are interested in determining groups of similar time series; within each group the time series tend to show a common pattern, have high correlation and share similar dynamics. Unlike static data, the time series of a feature comprise values that change over time. Work devoted to cluster analysis of time series is relatively scant compared with that focusing on static data; however, time series data clustering is becoming an active research area and there is a trend of increased interest and activity.
Although the Euclidean distance is popular and easy to calculate, it is
not suitable for time series data because it compares two sequences
in a one-to-one manner [58]. As a result, k-means with the Euclidean distance
does not cluster well, because time shifting among data sequences
in the same class usually occurs. In process control systems, time delays
also appear frequently due to the process transfer functions and process integration. We
would like the variables which share similar dynamics but may
have time shifts between them to fall in the same cluster. Dynamic Time Warping (DTW) [5]
has been proven to give a more accurate distance measure than the Euclidean
distance when dealing with time series clustering or classification.
DTW is a time series alignment algorithm developed for measuring
the similarity between two temporal sequences which may vary in time
or speed. In general, DTW is a method that calculates an optimal match between
two given time series under certain restrictions. Consider two time
series A = [a_1, a_2, ..., a_n] and B = [b_1, b_2, ..., b_m]. The two sequences can
be arranged on the sides of a grid, with A on the top and B on the
left-hand side, as in Fig. 3.1. Inside each cell a distance measure can be calculated,
comparing the corresponding elements of the two time series. The
procedure is to find the best alignment between A and B, which involves
finding all possible routes through the grid and, for each one, computing the
overall distance, defined as the sum of the distances between the
individual elements on the warping path.

Figure 3.1: DTW
There are several constraints in the DTW algorithm: the alignment
path cannot turn back on itself, and it advances one step at a time. The
warping window-size parameter determines the farthest possible point
that can be matched when obtaining the optimal distance. The power of the
DTW algorithm resides in the fact that, instead of finding all possible routes
through the grid by brute-force search, DTW adopts dynamic programming
and works by keeping track of the cost of the best path at each point
in the grid:
\gamma(1,1) = d(1,1)
\gamma(i,1) = d(i,1) + \gamma(i-1,1)
\gamma(1,j) = d(1,j) + \gamma(1,j-1)
\gamma(i,j) = d(i,j) + min( \gamma(i-1,j-1), \gamma(i,j-1), \gamma(i-1,j) )

where d(i,j) = |a_i - b_j| is the local distance between the elements of A and B.
Then the final DTW distance between A and B is normalized
by the data lengths of the time series to ensure a fair similarity measure:

dtw(A,B) = \gamma(n,m) / (n + m)    (3.2)
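The recurrence and the normalization in (3.2) can be sketched in a short routine. This is an illustrative implementation rather than the exact code used in this work, and the `window` argument is an assumed name for the warping window-size parameter discussed above.

```python
import numpy as np

def dtw_distance(a, b, window=10):
    """Length-normalized DTW distance between two 1-D series, with a
    warping window constraint (a sketch of the measure in (3.2))."""
    n, m = len(a), len(b)
    w = max(window, abs(n - m))  # window must at least cover the length gap
    g = np.full((n + 1, m + 1), np.inf)
    g[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = abs(a[i - 1] - b[j - 1])
            # keep the cost of the best path reaching each grid cell
            g[i, j] = d + min(g[i - 1, j - 1], g[i, j - 1], g[i - 1, j])
    return g[n, m] / (n + m)  # normalize by total length as in (3.2)
```

Note that a pure time shift within the window contributes no cost, which is exactly the property the Euclidean distance lacks.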
Once we know how to define the similarity between two time series,
k-means clustering can be generalized to DTW-based k-means, which
considers the dynamic patterns of the objects to be grouped. The algorithm
is composed of the following steps:

1. First define how many clusters we would like to obtain, for example
k clusters. k points are randomly picked as the centroids of the
clusters, which form the initial condition.

2. For each time series object, calculate its DTW distances to the k
cluster centers and assign it to the cluster with maximum similarity
(minimal DTW distance).

3. When all the objects have been assigned, recalculate the center
positions of the k groups.

4. Repeat steps 2 and 3 until the centroids no longer move.
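The four steps can be sketched as follows. This is a minimal illustration assuming equal-length series; the pointwise mean is used as a simple cluster center (the text does not prescribe a particular centroid computation, so that choice is an assumption), and `_dtw` restates the normalized distance of (3.2).

```python
import numpy as np

def _dtw(a, b, window=10):
    # normalized DTW distance, as in (3.2)
    n, m = len(a), len(b)
    w = max(window, abs(n - m))
    g = np.full((n + 1, m + 1), np.inf)
    g[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            g[i, j] = abs(a[i - 1] - b[j - 1]) + min(
                g[i - 1, j - 1], g[i, j - 1], g[i - 1, j])
    return g[n, m] / (n + m)

def dtw_kmeans(series, k, n_iter=20, seed=0):
    """DTW-based k-means following steps 1-4 above (a sketch)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(series, dtype=float)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()  # step 1
    labels = -np.ones(len(X), dtype=int)
    for _ in range(n_iter):
        # step 2: assign each series to the nearest center in DTW sense
        new = np.array([int(np.argmin([_dtw(x, c) for c in centers]))
                        for x in X])
        if np.array_equal(new, labels):
            break  # step 4: assignments (hence centers) no longer move
        labels = new
        # step 3: recompute the center of each non-empty group
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers
```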
After grouping the time series using DTW-based k-means clustering,
we have a preliminary idea of the whole time series data set.
Those time series which have high correlation and share similar dynamics
will be clustered in the same group. The causal test between the groups will
give a macroscopic and complete view of the whole dynamics of the plant.
Furthermore, it is reasonable to assume that, within a group, the causal relationships
are likely to be inferred more accurately due to the stronger interaction
and similar dynamics.
3.2.2 Group-level Granger causality
At the high level, with the different groups formed by the previous clustering method,
a group-level causal analysis is needed to capture the big picture of the causal
networks. The standard measure of Granger causality used in the literature
is defined only for the univariate case. We now consider the situation where
the predicted and predictor variables are no longer constrained to be univariate.
Suppose we have n time series (X_1(t), X_2(t), ..., X_n(t)); after appropriate
clustering we obtain k groups. Here we choose k equal to 3 to
illustrate the idea. Furthermore, we can always arrange the X(t) according
to the order of the groups without loss of generality. Then three multiple
time series M_1(t), M_2(t), M_3(t) can be defined as follows:

M_1(t) = [X_1(t), X_2(t), ..., X_{n_1}(t)]^T,
M_2(t) = [X_{n_1+1}(t), X_{n_1+2}(t), ..., X_{n_1+n_2}(t)]^T,
M_3(t) = [X_{n_1+n_2+1}(t), X_{n_1+n_2+2}(t), ..., X_n(t)]^T    (3.3)
n_1, n_2 and n_3, corresponding to the dimensions of M_1(t), M_2(t) and
M_3(t), are the numbers of group members in each cluster, and n_1 + n_2 + n_3 = n,
which is the total number of univariate time series. Define

A_{ij}(q^{-1}) = \sum_{l=1}^{r} a_{ij,l} q^{-l}    (3.4)

where q^{-1} is the backward shift operator and r is the number of lags, or
model order, which determines how much time-lagged information is
included in the VAR model. The vector autoregressive model of a multivariate
system with n time series can be represented as follows:
[ X_1(t) ]   [ A_{11}(q^{-1})  A_{12}(q^{-1})  ...  A_{1n}(q^{-1}) ] [ X_1(t) ]   [ e_1(t) ]
[ X_2(t) ] = [ A_{21}(q^{-1})  A_{22}(q^{-1})  ...  A_{2n}(q^{-1}) ] [ X_2(t) ] + [ e_2(t) ]
[   ...  ]   [      ...             ...        ...       ...       ] [   ...  ]   [   ...  ]
[ X_n(t) ]   [ A_{n1}(q^{-1})  A_{n2}(q^{-1})  ...  A_{nn}(q^{-1}) ] [ X_n(t) ]   [ e_n(t) ]
                                                                                       (3.5)
Substituting (3.3) into (3.5), we obtain the vector AR model between the multiple
time series:

[ M_1(t) ]   [ G_{11}(q^{-1})  G_{12}(q^{-1})  G_{13}(q^{-1}) ] [ M_1(t) ]   [ \xi_1(t) ]
[ M_2(t) ] = [ G_{21}(q^{-1})  G_{22}(q^{-1})  G_{23}(q^{-1}) ] [ M_2(t) ] + [ \xi_2(t) ]
[ M_3(t) ]   [ G_{31}(q^{-1})  G_{32}(q^{-1})  G_{33}(q^{-1}) ] [ M_3(t) ]   [ \xi_3(t) ]
                                                                                  (3.6)
where G_{ij}(q^{-1}) is a matrix block consisting of the corresponding A_{ij}(q^{-1}) terms in
(3.5), and \xi_i(t) is also multi-dimensional, representing the residual sequences
for M_i(t). Therefore, the property of \xi_i(t) can be measured in covariance
form, defining \Sigma_i = cov(\xi_i(t)), i = 1, 2, 3. This is regarded as the full model,
because we include all the groups of time series as predictors.
In order to quantify whether one group of variables has a causal effect
on another group of variables, we need to construct a restricted model to
compare with the full model. For example, in the three-group case, if we would
like to see whether the history data of Group 2 have a significant impact on
predicting the variables in Group 1, we start from (3.6), exclude the variables
in M_2(t), and then redo the identification. The following restricted model is
obtained:
[ M_1(t) ]   [ G_{11(2)}(q^{-1})  G_{13(2)}(q^{-1}) ] [ M_1(t) ]   [ \xi_{1(2)}(t) ]
[ M_3(t) ] = [ G_{31(2)}(q^{-1})  G_{33(2)}(q^{-1}) ] [ M_3(t) ] + [ \xi_{3(2)}(t) ]
                                                                          (3.7)
As a result, we get \Sigma_{1(2)} = cov(\xi_{1(2)}(t)) and \Sigma_{3(2)} = cov(\xi_{3(2)}(t)) as
the covariance matrices of the group residual sequences for the restricted model
when excluding M_2(t).
We say that M_j causes M_i if excluding M_j from the set of predictors significantly
reduces the ability to predict M_i. When each group M_j is excluded once in predicting
every other group M_i, for all i ≠ j, there are (k-1)×k multi-dimensional
residual sequences \xi_{i(j)}(t) and correspondingly (k-1)×k covariance matrices,
denoted as \Sigma_{i(j)} ≡ cov(\xi_{i(j)}). The following cause-effect matrix in terms of
covariance matrices can be formed, where the ij-th element is the covariance
matrix of predicting the i-th group while excluding the j-th column group for
i ≠ j. The diagonal elements are the residual covariances of the k full
models, denoted as \Sigma_i ≡ cov(\xi_i).
        (M_1)          (M_2)          ...   (M_k)
M_1     \Sigma_1       \Sigma_{1(2)}  ...   \Sigma_{1(k)}
M_2     \Sigma_{2(1)}  \Sigma_2       ...   \Sigma_{2(k)}
...     ...            ...            ...   ...
M_k     \Sigma_{k(1)}  \Sigma_{k(2)}  ...   \Sigma_k
It is expected that the diagonal elements in the above table are smaller
than the other elements in the same row, since the full models use more
predictors in the regression models to fit the data.
There is no standard definition of group Granger causality [3]. The
main question is how to compare the two covariance matrices of the residual
sequences, one from the full model and the other from the restricted model. The
trace version of group Granger causality seems to be a natural extension of
G-causality, because it sums up all the variances of the residual errors in the
group, which is a reasonable index for measuring how well a model
fits the data [51], leading to

F^{tr}_{M_j -> M_i} = ln{ tr[\Sigma_{i(j)}] / tr[\Sigma_i] }    (3.8)
The value of tr[\Sigma_i] measures the accuracy of predicting the present
value of M_i based on all the previous values, including M_1, M_2, ..., M_k. The
value of tr[\Sigma_{i(j)}] gives the accuracy of predicting M_i based on all the
history values except those of the M_j group. According to Granger's definition
of causality, the multi-dimensional case can be stated as follows: if
the trace of the residual error for one group of time series is significantly
reduced by the inclusion of the past of another group of time series, then
a causal relation from the latter group of multiple time series to the former
exists. If F^{tr}_{M_j -> M_i} = 0, we say there is no causal influence from M_j to M_i;
otherwise F^{tr}_{M_j -> M_i} > 0. Moreover, if both M_i and M_j are one-dimensional
(only one time series member in each group), then the definition of group
causality is consistent with the traditional Granger causality.
An alternative way to compare two covariance matrices is to use the generalized
variance |\Sigma| [89, 22] to quantify the volume spanned by the residual space.
Calculating the ratio of the determinants of the two covariance matrices leads
to

F^{det}_{M_j -> M_i} = ln{ det[\Sigma_{i(j)}] / det[\Sigma_i] }    (3.9)
In the geometric sense, the determinants give the volumes of the hyper-ellipsoids
formed by the covariance matrices of the full-model residuals \xi_i
and the restricted-model residuals \xi_{i(j)}; the determinant of each covariance
matrix is equivalent to the product of all its eigenvalues. Compared to the
trace ratio, the determinant ratio considers more complete information
represented by the covariance matrix; however, it is also argued that it can
bring numerical stability issues. Like the trace-based measure, this measure
is non-negative and reduces to the traditional Granger causality in the
univariate time series case. In our methodology, both measures are adopted
to give a more convincing inference of the causality.
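Both measures can be computed directly from the residual sequences of the two fitted models. A minimal sketch (`group_gc` is a hypothetical helper name, not from the original work):

```python
import numpy as np

def group_gc(res_full, res_restricted):
    """Trace (3.8) and determinant (3.9) versions of the group Granger
    causality measure, computed from residual sample matrices
    (rows = time samples, columns = group members). slogdet is used
    instead of det for numerical stability."""
    S_f = np.atleast_2d(np.cov(res_full, rowvar=False))
    S_r = np.atleast_2d(np.cov(res_restricted, rowvar=False))
    f_tr = np.log(np.trace(S_r) / np.trace(S_f))
    # ln(det(S_r)/det(S_f)) computed as a difference of log-determinants
    f_det = np.linalg.slogdet(S_r)[1] - np.linalg.slogdet(S_f)[1]
    return f_tr, f_det
```

Both values are zero when the restricted model predicts as well as the full model, and positive when excluding the candidate group inflates the residuals.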
With only the magnitude of F^{tr}_{M_j -> M_i} or F^{det}_{M_j -> M_i}, it is still not enough
to make a correct judgement about whether a causal path exists or not. Therefore,
the statistical significance needs to be established before making an inference
on the causal relationship. For traditional Granger causality, the F-test
is frequently used as a statistical tool to reject the null hypothesis of no causal
impact. Here we extend the idea of the F-test to test group Granger causality.
The null hypothesis is represented as H_0: there is no causal impact from M_j
to M_i. This can be determined by comparing the two models for
predicting M_i: one is the full model (Model 2) and the other is the restricted
model, which is 'nested' within the full model (Model 1). RSS is the residual
sum of squares and p is the number of parameters used in the regression model.
We know that the full model can fit the data at least as well as the restricted
model, but we would like to see whether it is a significantly better fit; thus the
following F-test is applied:

[ (RSS_1 - RSS_2)/(p_2 - p_1) ] / [ RSS_2/(m - p_2) ] ~ F_{p_2 - p_1, m - p_2}    (3.10)
For our multi-dimensional causal test, the parameters are calculated as follows:

RSS_1 = tr(\Sigma_{i(j)})
RSS_2 = tr(\Sigma_i)
p_1 = r(n - n_j) n_i
p_2 = r n n_i
m = (w - r) n_i    (3.11)
Substituting (3.11) into (3.10), we obtain the corresponding f value:

[ (tr(\Sigma_{i(j)}) - tr(\Sigma_i)) / (n_j r) ] / [ tr(\Sigma_i) / (w - r - rn) ] ~ F_{n_i n_j r, n_i(w - r - rn)}    (3.12)
where n_j is the number of time series in M_j, r is the number of lags
included in the regression model, w is the total number of observations
in each time series, and n is the total number of time series (n = n_1 + n_2 + ... +
n_k). If f_{i(j)} > F_{n_i n_j r, n_i(w-r-rn), \alpha}, then the null hypothesis that there is no
causal impact from M_j to M_i can be rejected with (1-\alpha) confidence. The critical value
F_{n_i n_j r, n_i(w-r-rn), \alpha} can be obtained by computing the inverse of the F cumulative
distribution with numerator degrees of freedom n_i n_j r and denominator
degrees of freedom n_i(w - r - rn) at probability (1-\alpha). The significance level \alpha
is typically chosen as 0.01 or 0.05.
3.2.3 Feature group formulation based on PCA
Besides the groups clustered from the original time series data set,
we can form a specific feature group which consists of the interesting characteristics
or dynamics to be diagnosed. For the plant-wide oscillation diagnosis
problem, the structure or shape of the oscillation can be detected, and
we would like to form a group to represent such interesting features
so that the further group causality test can be applied in a well-targeted
manner. Principal component analysis (PCA) models are predominantly
used to extract variable correlation from data [41], and important features of
the data set can also be extracted as principal components [91].
Consider a data matrix X ∈ R^{w×n} consisting of w observation rows
and n variables (time series); it can be decomposed into a score matrix T and
a loading matrix P by the singular value decomposition:

X = T P^T + E = [ t_1  t_2  ...  t_{n'} ] [ p_1^T ; p_2^T ; ... ; p_{n'}^T ] + E    (3.13)
where t_1 and p_1 are the score vector and loading vector of the first principal
component, respectively, E is the residual matrix and n' is the number
of principal components. For a given diagnosis problem, we can
construct a corresponding feature group M_f whose members are those
score vectors t carrying the important fault characteristics. This feature
group M_f is then added to the raw data groups {M_1, ..., M_k} for the further group
causality test. As a result, we can find which groups have a stronger causal
relationship with, or contribute more to, the feature group, and the groups
which are not relevant to the feature group can be isolated. Different
from the feature selection process using contribution plots and significance
indices [91], the proposed group causality test framework allows us to select
groups of candidate variables related to the fault features based on causal
relationships rather than correlation at the early stage of diagnosis.
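The decomposition (3.13) and the selection of score vectors for M_f can be sketched via the SVD; the function and variable names here are illustrative.

```python
import numpy as np

def pca_decompose(X, n_pc):
    """Score matrix T and loading matrix P of (3.13), obtained from the
    singular value decomposition of the mean-centered data matrix X
    (w observation rows x n variables)."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_pc] * s[:n_pc]   # score vectors t_1 ... t_{n'}
    P = Vt[:n_pc].T              # loading vectors p_1 ... p_{n'}
    return T, P

# A feature group M_f would then collect the score vectors carrying the
# fault signature, e.g. PC1 and PC3 (columns 0 and 2) in the case study:
# M_f = pca_decompose(X, 3)[0][:, [0, 2]]
```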
3.2.4 Within group Granger causality test based on PLS
Besides analyzing the high-level causal influence between different
groups, further diagnosis can be implemented within each individual
group, which is regarded as low-level diagnosis in multilevel systems.
Traditionally, ordinary least squares (OLS) is employed to estimate the regression
coefficients of the multivariate autoregressive models in the Granger
causality test [74, 79]. Assume all channels of time series are of the same
length: X_i(t) = [ x_i(1)  x_i(2)  ...  x_i(w) ], where w is the total number of
available samples in each time series. To parameterize the coefficients in (3.5),
denote the following:
\Gamma_{i,t} = [ x_i(r+1); x_i(r+2); ...; x_i(t) ],
\Gamma_{i,t-1} = [ x_i(r); x_i(r+1); ...; x_i(t-1) ],
...,
\Gamma_{i,t-r} = [ x_i(1); x_i(2); ...; x_i(t-r) ]    (3.14)
where i ∈ {1, 2, ..., n_i} is the index of the group member within a group which has
n_i time series in total, and r is the number of time lags in the vector AR model.
Therefore, we can formulate the regression model in this structure:
[ x_1(r+1)  x_2(r+1)  ...  x_{n_i}(r+1) ]
[ x_1(r+2)  x_2(r+2)  ...  x_{n_i}(r+2) ]
[   ...       ...     ...      ...      ]
[ x_1(t)    x_2(t)    ...  x_{n_i}(t)   ]
=
[ x_1(r)    x_1(r-1)  ...  x_1(1)    x_2(r)    ...  x_{n_i}(1)   ]
[ x_1(r+1)  x_1(r)    ...  x_1(2)    x_2(r+1)  ...  x_{n_i}(2)   ]
[   ...       ...     ...    ...       ...     ...     ...       ]
[ x_1(t-1)  x_1(t-2)  ...  x_1(t-r)  x_2(t-1)  ...  x_{n_i}(t-r) ]
\beta    (3.15)
which can also be represented as:

Y = X \beta    (3.16)

where

Y = [ \Gamma_{1,t}  \Gamma_{2,t}  ...  \Gamma_{n_i,t} ]
X = [ \Gamma_{1,t-1}  ...  \Gamma_{1,t-r}  \Gamma_{2,t-1}  ...  \Gamma_{n_i,t-r} ]
We call Y the output matrix and X the input matrix, and the ordinary least squares
solution is:

\hat{\beta} = (X^T X)^{-1} X^T Y    (3.17)

which minimizes the sum of squared residuals ||Y - X \hat{\beta}||^2.
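The construction of the lagged matrices (3.14)-(3.16) and the least-squares solution (3.17) can be sketched as follows; `var_ols` is a hypothetical helper, and `lstsq` is used instead of forming (X^T X)^{-1} explicitly, a standard numerically safer equivalent.

```python
import numpy as np

def var_ols(data, r):
    """Build the lagged regressor matrix X and output matrix Y of
    (3.15)-(3.16) from a (w x n_i) data array and solve the least
    squares problem (3.17)."""
    w, n_i = data.shape
    Y = data[r:]  # rows x_i(r+1) ... x_i(t), group members as columns
    # column blocks [Gamma_{1,t-1} ... Gamma_{1,t-r} Gamma_{2,t-1} ... Gamma_{n_i,t-r}]
    X = np.hstack([np.column_stack([data[r - l:w - l, i] for l in range(1, r + 1)])
                   for i in range(n_i)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return beta, X, Y
```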
.
One major drawback of the standard Granger causal test using OLS
is that it is not able to handle the multicollinearity well. The parameteriza-
tionofmultivariateARmodelwilloftenleadstoanoverrejectionofnullhy-
pothesisofcausality,thereforetheresultinggraphicalcausalnetworkbeing
61
generatedwilltendtobetoodenseandisverydifficulttodrawconclusion
of the source and corresponding significant propagation paths. The pro-
posed PLS-based causality test has several improvements: First is to han-
dlemulticollinearityofinputmatrixmuchbetter. Especiallyfortimeseries
which are clustered into one group, correlation between variables usually
areratherhighandmulticollinearityissuebecomeevensevereininputma-
trixX (3.17). PLSfindsthefundamentalrelationsbetweeninputandoutput
matrices(X andY)byprojectingbothX andY tonewspacesandtrytofind
themultidimensionaldirectioninXspacethatcouldexplainsthemaximum
multidimensional variance direction in the Y space. This kind of inherent
scheme is quite matched with the art of Granger causality which builds a
straightforwardconnectionbetweenpredictionandcausality. GivenX and
Y,thegeneralunderlyingmodelofPLSis:
X = T P^T + E
Y = T Q^T + F    (3.18)
where T = [t_1, ..., t_A] is the score matrix, P = [p_1, ..., p_A] is the loading
matrix for X, and Q = [q_1, ..., q_A] is the loading matrix for Y (A is the number of PLS
components). E and F are the model residuals of X and Y, respectively.
PLS projects (X, Y) onto a low-dimensional space defined by a small number of
latent variables (t_1, ..., t_A). The standard algorithm to perform PLS is the
nonlinear iterative partial least squares (NIPALS) algorithm [84]. The following can
be calculated using NIPALS:
R = W (P^T W)^{-1}
T = X R    (3.19)

Then the relationship between Y and X can be obtained as:

Y = X R Q^T + F    (3.20)

and the PLS estimate of the regression coefficients is:

\hat{\beta}_{pls} = R Q^T    (3.21)
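A compact NIPALS sketch that returns the regression estimate of (3.21); this is an illustrative single-block implementation assuming mean-centered X and Y, not the exact code used in this work.

```python
import numpy as np

def nipals_pls(X, Y, A, tol=1e-10, max_iter=500):
    """NIPALS PLS returning beta_pls = R Q^T of (3.19)-(3.21).
    X, Y are assumed mean-centered; A is the number of components."""
    X, Y = X.copy().astype(float), Y.copy().astype(float)
    W, P, Q = [], [], []
    for _ in range(A):
        u = Y[:, [0]].copy()
        for _ in range(max_iter):
            w = X.T @ u
            w = w / np.linalg.norm(w)        # X weight vector
            t = X @ w                        # X score vector
            q = Y.T @ t / (t.T @ t)          # Y loading vector
            u_new = Y @ q / (q.T @ q)
            if np.linalg.norm(u_new - u) < tol:
                u = u_new
                break
            u = u_new
        p = X.T @ t / (t.T @ t)              # X loading vector
        X = X - t @ p.T                      # deflate X
        Y = Y - t @ q.T                      # deflate Y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.hstack(W), np.hstack(P), np.hstack(Q)
    R = W @ np.linalg.inv(P.T @ W)           # (3.19)
    return R @ Q.T                           # beta_pls, (3.21)
```

With A equal to the rank of X, the PLS fit coincides with the OLS fit; the regularizing effect comes from choosing a smaller A.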
It’s important to choose the number of components A to minimize
the expected error when predicting the response from future observations
onthepredictorvariables. Simplyusingalargenumberofcomponentswill
do a good job in fitting the current observed data, but will lead to overfit-
ting.
Cross-validation is a statistically tool for choosing the number of
components in PLS regression. It is mainly used where the goal is to
estimate how accurately a predictive model will perform in practice. The
idea of 2-fold cross-validation is utilized in our algorithm to pick up the
optimalnumberofcomponents. Theproceduresisgivenasfollows,
(i) Partition the data into two equal-size subsamples. The first
half of the dataset is marked as d_1 and the second half as d_2.
Data shuffling is not appropriate since we are treating time
series, which have time dependency.

(ii) Use d_1 as the training set and d_2 as the testing set for PLS modeling.
Perform PLS regression on d_1 to obtain a model, then apply
the model to the input of d_2 to calculate the predicted output. Accordingly,
the prediction error can be obtained.

(iii) Interchange the roles of d_1 and d_2 and repeat the
operation in step (ii) to calculate another prediction error. Sum
the two prediction errors into one index, known as the
predicted residual sum of squares (PRESS).

(iv) For each number of PLS components, a unique PRESS is calculated
by going through steps (ii) and (iii). The number that has
minimum PRESS, or the one beyond which the PRESS no longer
shows significant improvement, is chosen as the optimal for our
PLS model.
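Steps (i)-(iv) can be sketched as follows; `pls_fit` is an assumed callable (e.g. a PLS routine returning a coefficient matrix), and the split preserves time order as required in step (i).

```python
import numpy as np

def press_2fold(X, Y, max_components, pls_fit):
    """2-fold cross-validated PRESS curve for selecting the number of
    PLS components, following steps (i)-(iv)."""
    half = len(X) // 2  # (i) split in time order; no shuffling
    folds = ((X[:half], Y[:half], X[half:], Y[half:]),
             (X[half:], Y[half:], X[:half], Y[:half]))
    press = []
    for A in range(1, max_components + 1):
        err = 0.0
        for Xtr, Ytr, Xte, Yte in folds:  # (ii) train/test, (iii) swapped
            xm, ym = Xtr.mean(axis=0), Ytr.mean(axis=0)
            beta = pls_fit(Xtr - xm, Ytr - ym, A)
            err += np.sum((Yte - ((Xte - xm) @ beta + ym)) ** 2)
        press.append(err)  # (iv) one PRESS value per component count
    best = int(np.argmin(press)) + 1
    return best, press
```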
Cross-validation yields meaningful results if the validation set and
training set are drawn from the same population, and we assume that the
main structure of the system being studied does not change much during
the time period being analyzed.
Another advantage is that our PLS-based Granger causality test uses
prediction error rather than residual error, compared with the traditional Granger
causality test. The key idea of the causal test in the Granger sense is to find
whether the full model gives a significantly better prediction than the restricted
model. We incorporate the training and testing framework widely used in
supervised learning into the causal test in order to give a better, more
realistic measure of the prediction power of both the full and restricted models. The
prediction error can be obtained in the 2-fold cross-validation procedure,
and the F-test is still used to determine the significance of the causal strength.
3.3 Industrial Case Study
An industrial data set was provided by the Advanced Control Technology
group of the Eastman Chemical Company. Fig. 3.2 shows the process
schematic diagram, which consists of three distillation columns, two
decanters, two reboilers, several condensers and recycle streams. FC, LC,
PC and TC represent flow, level, pressure and temperature control loops. FI,
LI, PI and TI are indicators for flow, level, pressure and temperature, respectively.
The process engineers had identified a need for the diagnosis of
a common disturbance with an oscillation period of about 2 hours. The loops
marked with circles are all affected to varying degrees by this plant-wide oscillation.
These process data are often used as benchmark data for validating
root cause diagnosis methodologies, because other researchers
[77, 85, 92, 38] have analyzed them and the plant operators have reported
a valve-stiction issue in the LC2 loop.
Figure 3.2: Process schematic of a plant from Eastman Chemical Company
Fifteen controlled variables (PV) along with 15 controller outputs (OP)
are used in our proposed method. The sampling interval is 20 s and there
are 8640 samples in each channel of time series. Each time series represents
the sensor or controller dynamic information in a time window of 48 h.
3.3.1 DTW-based k-means clustering
The first step is to separate the 30 time series into several groups.
We would like to select those variables which have high correlation and
share similar dynamics to be grouped together. The DTW-based k-means
clustering is utilized here to achieve this goal. We select k equal to 4 and the
warping window for each DTW distance calculation to be 10. After the clustering
algorithm has converged, 4 groups of time series have been formed, and the
time series plots of Groups 1, 2, 3 and 4 are displayed in Figs. 3.3, 3.4, 3.5 and
3.6, respectively.
In Group 1, we can see that six variables carry very low frequency
dynamics and share a similar long-term trend. Referring to the
process diagram, the group members are all around the two steam flows going
into the reboilers. Therefore, it is reasonable that those variables share
similar dynamics and are grouped into the same cluster for further causality
analysis.

In Group 2, there are 11 variables and most of them carry a common
oscillation of about 2 h (320 samples/cycle). Therefore, one of the major
characteristics of this group is having such a plant-wide disturbance, and
these variables are regarded as having high correlation with this phenomenon.
However, it needs to be further analyzed whether a specific group member
is a source, a result or just an intermediate propagation point.

For Group 3, the seven members carry energy in a much higher
frequency range; there may exist another kind of noise disturbance which
is also worth further study.
The important features of Group 4 are quite similar to those of Group 2. It has 6
group members in total.
This kind of clustering procedure gives us a general idea of the whole
data set. The motivation and objective of this step are: first, in a
controlled industrial plant with mass and heat integration, some causal relationships
may exist between one group of variables and another group of
variables rather than in a one-to-one manner. Second, by appropriate grouping,
the within-group causality can be inferred more accurately than by including
all the redundant information. In our clustering step, we also compared
this grouping result with the one obtained from traditional k-means
and found that the modified k-means using DTW as a distance measure
handles this kind of time series clustering much better. This improvement
is mainly because DTW is a more accurate way to measure the true
similarity between two time series.
3.3.2 Group-level causal test
The main objective of the diagnosis is to find the source of the 2 h period
plant-wide oscillations. Therefore, we have to construct a feature group
which can represent this major process disturbance. Principal component
decomposition is applied to a data matrix which contains the 15 controlled variables
and 15 controller outputs with 8640 samples. After examining the first
several large principal components, we find that PC1 and PC3 carry the 2 h
oscillation pattern; thus we group these two PCs together as a new group,
marked as Group T in Fig. 3.7.

Figure 3.3: Group 1 time series plot (PC1.OP, FC3.PV, FC3.OP, FC6.PV, FC6.OP, LC3.OP)
With Group T added to the groups generated by the clustering
method, the diagnosis and causal analysis have a clear main objective:
the most important information flows which are relevant to the feature
structure will be recovered. Following (3.3), denote M_1, M_2, M_3, M_4 and
M_T to represent Group 1, Group 2, Group 3, Group 4 and the feature group.
The model order is chosen to be 20, which is the r parameter in (3.4). The group
causality method is applied in both the trace and determinant versions
to give a more comprehensive view.
Figure 3.4: Group 2 time series plot (LC1.PV, FC4.OP, TC1.OP, FC5.PV, FC5.OP, LC2.PV, LC2.OP, FC8.PV, FC8.OP, TC2.OP, FC7.OP)

Figure 3.5: Group 3 time series plot (PC1.PV, FC1.OP, FC4.PV, PC2.PV, PC2.OP, LC3.PV, FC7.PV)

Fig. 3.8 shows the magnitudes of the trace version of group Granger causality
from the four groups to the feature group. It indicates that Group 2
has the highest causal strength to the feature group with the oscillation characteristics.
Therefore it is highly recommended to search for the root cause of this
type of process fault propagation within Group 2. Fig. 3.9 shows the corresponding
determinant version of group Granger causality, which also implies that
Group 2 contributes most to the fault feature Group T.

With the proposed significance test of group Granger causality,
a causal network can be constructed to better represent the causal
relationships between the groups of variables. Fig. 3.10 shows the causality
graph of the five groups. In this graph, each node represents one group of
time series. If a significant causal relationship from one variable to another
Figure 3.6: Group 4 time series plot (LC1.OP, FC1.PV, FC2.PV, FC2.OP, TC1.PV, TC2.PV)

Figure 3.7: Feature Group time series plot (PC1, PC3)

Figure 3.8: Group Granger causal magnitude (trace version) from G_1 ~ G_4 to G_T

Figure 3.9: Group Granger causal magnitude (determinant version) from G_1 ~ G_4 to G_T
exists, a directed green edge with an arrow is drawn to represent it. If the
causal relationship is bidirectional, then a red line is used to mark it.
First, we can see that M_2, M_3 and M_4 all have a causal effect on
M_T, but M_2 has the highest causal strength among the three sources. Besides
this, M_1 and M_4 both receive causal effects from M_3. This may refer
to the propagation of another oscillation disturbance of about
20 samples per cycle. This effect was also discussed and analyzed in [85]
using a spectral ICA method. It can be observed that several variables in
M_4 (FC1.PV, FC2.OP, TC1.PV) carry a mixed oscillation pattern: one is the
main interest of our diagnosis task, which possibly transits from M_2, and
the other is a much faster oscillation possibly caused by the variables in M_3.

The group-level Granger causal network at the high level
shows the general causal structure of the whole data set. Besides the main
target of oscillation propagation, other important causal relationships can
also be recovered from the big picture. More exhaustive causal
relationships need to be analyzed in a 'zoom in' manner, which is to
perform causal tests at a lower level between individual time series within
groups to see more detailed cause-and-effect relationships.
3.3.3 PLS-based causality analysis within each Group
For each group of time series, a further causal test is applied using the
PLS-based Granger causality algorithm. The cause-and-effect relationships
are analyzed within a subset structure, which may yield more meaningful and
accurate inference than using the total data set.

Figure 3.10: Group-level causal network (nodes M_1, M_2, M_3, M_4 and M_T)
In Fig. 3.11(a), the causal network of Group 1 has been identified
using the traditional Granger causal test, which makes use of ordinary least
squares (OLS) to estimate the VAR model. The model order is set to 10. In this
graph, the identified causal network is too dense, and it is very difficult to draw
any convincing conclusion about the important causal relationships.
There are too many bidirectional causal paths between the six variables.
Other researchers have also mentioned that the original Granger
causality with OLS tends to over-reject the null hypothesis of no causality,
leading to too many inferred causal paths. This issue becomes severe if
the predictors have high correlation or collinearity, which is what happens
here. The six variables clustered in Group 1 are highly correlated; their
minimum pair-wise correlation is 0.69. Therefore, from this graph, it can
only be inferred that there is a strong causal feedback structure between
these variables.
Then our newly proposed PLS-based Granger causal test is performed
on Group 1, and the resulting causal network is shown in Fig. 3.11(b). The
causal structure now becomes much sparser than the previous one. The causal
network is successfully separated into two subsets: (PC1.OP, FC3.PV, FC3.OP)
and (FC6.PV, FC6.OP, LC3.OP). From the process diagram, we find that
the first subset represents the cascade flow loop going into the reboiler of
Column 1 and the second subset represents another cascade flow loop going
into the reboiler of Column 2. In the first subset, the three variables are inferred
to have causal effects on each other. In the second subset, LC3.OP is
the controller output of the master loop, which is the set-point for the secondary
loop; therefore the value of LC3.OP should have a causal effect on the inner
loop variables: the PV and OP of the steam flow loop. This cascade mechanism
is clearly displayed in our causal network. Furthermore, there is
no significant causal relationship between the first and second subsets, which
may imply that these two cascade loops are likely to operate separately and
there is not much information flow between them. These two
subsets both manipulate the steam inflow rate to the reboilers; therefore
it is quite reasonable that they have high correlation and share the
similar dynamics or patterns. This correlation has successfully
been captured by the DTW-based clustering, which puts those variables
in the same group. However, the more detailed causal test shows that these
two cascade loops are not causing each other during this period of time.
For Group 2 data, it has been inferred to be the most possible sub-
set where the root cause of the plant-wide oscillation lies after the group
Granger test with feature group included. Within-group causal test is also
performedonthegroupmemberstofurtheranalyzetherelationshipinside
78
Figure 3.11: Group 1 causal structure. (a) Causal network by OLS-based Granger causality; (b) causal network by PLS-based Granger causality. Nodes: PC1.OP, FC3.PV, FC3.OP, FC6.PV, FC6.OP, LC3.OP.
the group.
Fig. 3.12 shows the causal structure inside Group 2 identified by OLS- and PLS-based Granger causality. The model order is selected to be 4, as suggested by the Bayesian information criterion. There is a common observation to be noted from these two graphs: LC2.PV and LC2.OP have several causal effects on the other variables but do not receive any causal effects from others, so they are considered to be the source of the major propagation.

However, in order to analyze the mechanisms of oscillation propagation, Fig. 3.12(a) is still not easy to interpret and diagnose. In Fig. 3.12(b), the causal network becomes much sparser and easier to understand while still preserving the key causal paths. There are in total three propagation paths starting from the source loop LC2. The first causal path is from LC2.OP to TC2.OP, then from TC2.OP to FC8.OP and FC8.PV. This is consistent with the process understanding that an oscillation flowing through the control valve of LC2 will affect FI3 and propagate to Column 3, including TC2. In addition, a disturbance of TC2 will change the set-point of the inner-loop controller of a cascade structure and upset the recycling flow rate of FC8. It is interesting to point out that the path from LC2.OP to FC7.OP disappears after replacing the OLS-based test with the PLS-based one, which is closer to the true situation. On one side, FC7.OP controls the steam flow rate of the column and is not affected by the plant-wide oscillation. On the other side, it does affect the temperature of Column 3 (TC2), and this relationship is still preserved in Fig. 3.12(b).
The second propagation path is from LC2.PV to LC1.PV, which has not been clearly identified in [91]. The successful detection of this causal relationship is mainly due to the appropriate grouping methodology, which allows a more focused and accurate causal test to be performed. Referring to the process diagram, it can be argued that after the first causal path, the disturbance reaches the FC8 loop, which controls the recycle flow rate back into Column 1, and hence disturbs the level controller LC1. The third propagation path is from LC2.PV to the TC1 cascade flow loop structure. TC1.OP, the control signal of the master loop, is the set-point of the FC5 secondary loop; therefore TC1 brings the oscillation disturbance to the FC5 loop.

To sum up, the LC2 control loop is the likely root cause in this group, and it transfers the oscillation disturbance to three parts of the whole plant. The PLS-based causal test gives a clearer and more precise propagation diagram than the original method.
Figure 3.12: Group 2 causal structure. (a) Causal network by OLS-based Granger causality; (b) causal network by PLS-based Granger causality. Nodes: LC1.PV, FC4.OP, TC1.OP, FC5.PV, FC5.OP, LC2.PV, LC2.OP, FC8.PV, FC8.OP, TC2.OP, FC7.OP.

Fig. 3.13 shows the causal structure identified for Group 3. Although the time series in this group do not carry the plant-wide oscillation with a 2-hour period, some other useful causal relationships regarding the plant can still be inferred. A causal path from LC3.PV to PC2.OP is recovered, which means that the level in Column 2 has an impact on the pressure of the corresponding separator. This observation matches the process schematic design: in order to compensate for the variation of the level in Column 2, the separator adjusts its pressure by regulating its outlet flow rate. The time series of LC3.PV and PC2.OP are almost in anti-phase. These three variables (LC3.PV, PC2.PV, PC2.OP), along with an indicator TI6.PV, share another common oscillation frequency at around 20 samples per cycle [86]. Another causal test, in which TI6.PV was also added into Group 3, was performed, and TI6.PV was found to be a sink of the propagation. Therefore, from the subset causal structure analysis, it is highly possible that the source of this disturbance is in the LC3 control loop.
Another noticeable causal path is from PC1.PV to FC1.OP. This is also understandable: the pressure in Column 1 affects the level in Column 1 through regulating the steam flow rate into the corresponding reboiler; meanwhile, the level of Column 1 is used as a set-point to an inner loop of FC1 and has a direct effect on the manipulated variable of FC1 (FC1.OP). This kind of indirect causal relationship can be recovered even though some intermediate variables (FC3.PV, FC3.OP) of the complete propagation are not included.
Figure 3.13: Group 3 causal network by PLS-based Granger causality. Nodes: PC1.PV, FC1.OP, FC4.PV, PC2.PV, PC2.OP, LC3.PV, FC7.PV.

Group 4 also contains many variables that share the common plant-wide oscillation to be diagnosed; however, during the group causal test it has a much smaller causal contribution to the feature group than Group 2. Therefore it is more likely to be a subset of variables that are being infected rather than the root cause.
Fig. 3.14 shows the causal network inside Group 4. There is a causal path starting from TC1.PV to LC1.OP. This path is reasonable because the variation in TC1 affects the FC5 reflux rate back into the reboiler for Column 1 and finally disturbs the level in Column 1 (LC1). After the disturbance has arrived at LC1, it can further affect FC1.PV through a cascade control structure similar to the one mentioned before. FC2.PV and FC2.OP are not shown in the process diagram, but it can be inferred that they receive the causal effect from FC1.PV and LC1.OP and thus exhibit the similar common oscillation. Another interesting point is the independence of TC2.PV in the causal network. It implies that although TC2.PV carries similar oscillation dynamics and has high correlation with the other members of this group, it does not have causal relationships with them. Combining this with the causal analysis in Group 2, it can be concluded that the TC2 loop receives a causal effect from the LC2 loop and disturbs the reboiler of Column 1 through the reflux rate of FC8, but this is a different propagation path from the one that goes through TC1 and FC5. Therefore, the isolation of TC2 in Fig. 3.14 further confirms the mechanism of the oscillation propagation.
Figure 3.14: Group 4 causal network by PLS-based Granger causality. Nodes: LC1.OP, FC1.PV, FC2.PV, FC2.OP, TC1.PV, TC2.PV.
3.4 Summary
In this chapter, a deep causal mining framework using multilevel Granger causality for root cause diagnosis has been proposed for industrial controlled processes. The main procedure of the methodology is shown in Fig. 3.15. It starts with time series of routine operating data and groups them into several groups using a DTW-based k-means method, such that the correlation and similarity within each group are high. Then, for a diagnosis task with a specific objective, relevant principal components are formed into a feature group. Group Granger causality can therefore be utilized to analyze the causal effects between the groups along with the feature group, which is considered the high-level analysis. To further investigate the causality within each group, a PLS-based modified Granger causality test has been proposed for the low-level element-wise system, which has two major improvements: one is to better handle multicollinearity, and the other is to accurately measure the true improvement of prediction power during causal inference. Finally, the root cause and the corresponding complete propagation paths can be diagnosed by combining the results of the two levels of causal reasoning. A case study on a benchmark industrial data set is given to illustrate the presented strategies. The root cause and complete propagation paths of the plant-wide oscillation can be clearly diagnosed, along with other important causal relationships that are consistent with the process understanding and system design.

Figure 3.15: Schematic diagram of the multilevel Granger causality analysis.
This framework is totally data-driven and can be carried out without process models or plant perturbations. Since it utilizes all the available data, it can be regarded as a complete causal modeling approach that does not lose any potentially useful information. Moreover, it provides a flexible diagnosis structure in which root cause inference can be placed at different levels of precision. This method has other potential as well: it could be further extended to locate the source of many other fault propagations besides the oscillation disturbance.
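The element-wise causal tests above are built on the regression formulation of Granger causality. As a minimal illustration of that formulation only (the classical OLS-based bivariate test, not the PLS-based variant proposed in this chapter), the following sketch compares a restricted autoregression of y against an unrestricted one that adds lags of x; the helper name, the synthetic data, and all parameter choices are assumptions for the demo.

```python
import numpy as np

def granger_f_stat(y, x, p=2):
    """F-statistic for the null 'x does not Granger-cause y'.

    Fits a restricted AR(p) model of y and an unrestricted model that
    adds p lags of x, both by ordinary least squares, and compares
    their residual sums of squares.
    """
    n = len(y)
    Y = y[p:]
    # Lagged regressors: column i holds the i-step-lagged series.
    lags_y = np.column_stack([y[p - i:n - i] for i in range(1, p + 1)])
    lags_x = np.column_stack([x[p - i:n - i] for i in range(1, p + 1)])
    ones = np.ones((n - p, 1))
    Xr = np.hstack([ones, lags_y])            # restricted model
    Xu = np.hstack([ones, lags_y, lags_x])    # unrestricted model
    rss = lambda X: np.sum((Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]) ** 2)
    rss_r, rss_u = rss(Xr), rss(Xu)
    df_u = (n - p) - Xu.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / df_u)

# Synthetic example: x drives y with a one-sample delay, but not vice versa.
rng = np.random.default_rng(0)
n = 500
x = rng.standard_normal(n)
y = np.zeros(n)
for k in range(2, n):
    y[k] = 0.3 * y[k - 1] + 0.8 * x[k - 1] + 0.1 * rng.standard_normal()

print(granger_f_stat(y, x, p=2))   # large: x Granger-causes y
print(granger_f_stat(x, y, p=2))   # small: no feedback from y to x
```

The F-statistic is compared against an F(p, n − p − 2p − 1) critical value in practice; the PLS-based test of this chapter replaces the OLS fits to cope with multicollinearity.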
Chapter 4
Nonstationary process monitoring for dynamic processes with cointegration analysis
4.1 Introduction
Principal component analysis (PCA) and its variants are widely used in the area of statistical process monitoring for industrial processes [63]. Generally speaking, when an industrial process operates under normal conditions, the measured variables approximately follow a multivariate stationary stochastic process. If the sampling interval is long and the samples appear to be time independent, a static PCA model performs well in describing cross-correlations among measurements. Otherwise, if the auto-correlations of the variables are evident, dynamic models are preferred for modeling the process. Different from static PCA models, dynamic models capture not only cross-correlations between variables but also serial correlations among the measurement series.
One of the most widely used dynamic models was proposed by Ku et al., who applied a lagged version of PCA to multivariate variables [49]; it is called the dynamic PCA (DPCA) model. DPCA conducts the singular value decomposition on an augmented matrix of time-lagged process variables. Li and Qin investigated the relationship between dynamic PCA and a subspace identification method (SIM) and proposed a consistent dynamic PCA algorithm, called indirect dynamic PCA (IDPCA) [54]. Ding et al. combined SIM and model-based fault detection technologies to propose another fault detection scheme [16]. Using the linear dynamic state space description of processes, the residual generator of the parity space method after identifying a subspace model is equivalent to indirect dynamic PCA modeling for normal data. Negiz and Cinar used a canonical variate (CV) state space model to describe dynamic processes, which is equivalent to a vector autoregressive moving-average (VARMA) time-series model [59]. In order to reveal the latent structure hidden in multivariate time series, a dynamic latent variable model was proposed by Li et al., similar to the concept of dynamic factor analysis [53]. In their model, multivariate time series share several common autocorrelated stochastic trends, while the remaining subspace consists of independently identically distributed series and modeling residuals.
However, the aforementioned approaches all assume that the measured series possess stationary (i.e., time-invariant) means and variances, rather than being nonstationary. In practice, some observed series contain a slowly varying trend due to equipment aging, and other observed series act like a random walk under the influence of various disturbances. In this case, traditional PCA or DPCA models are inadequate to describe relations among variables, and a new model framework is desired to depict the process structure. A nonstationarity test is a tool for checking whether a time series is stationary, studied frequently in econometrics [62]. If two or more time series are nonstationary, but a linear combination of these time series is stationary, these series are cointegrated [2]. This chapter applies nonstationarity test techniques to identify nonstationary time series, uses a cointegration representation to describe latent structures for nonstationary multivariate processes, and further proposes statistical indices for process monitoring.
The rest of the chapter is organized as follows. In Section 2, several nonstationarity tests are reviewed and compared for identifying whether a time series is stationary. Then, cointegration analysis is adopted to describe nonstationary multivariate time series in Section 3. Following that, fault detection indices are proposed in Section 4 based on the proposed model. In Section 5, a case study on the Tennessee Eastman process illustrates the effectiveness of the method. Finally, conclusions are given in the last section.
4.2 Nonstationarity test for univariate time series
4.2.1 Unit root test
Unit root tests are a popular tool to test whether a time series is unit root nonstationary. A unit root process is a data-generating process whose first difference is stationary. In other words, the simplest unit root process $x_k$ has the following form:

$$x_k = x_{k-1} + e_k \tag{4.1}$$

where $e_k$ is a stationary series. A unit root test attempts to determine whether a given time series is consistent with a unit root process. Unit root tests often use the following regression model:
$$x_k = \mu + \phi x_{k-1} + \sum_{i=1}^{p} \alpha_i \triangle x_{k-i} + \epsilon_k \tag{4.2}$$
where $\mu$ is the intercept, $\triangle$ is the difference operator such that $\triangle x_k = x_k - x_{k-1}$, and $\epsilon_k$ is an independently and identically distributed (iid) innovation process with zero mean. The null hypothesis of the unit root test is $\phi = 1$ in (4.2), which indicates that the process is unit root nonstationary. The alternative hypothesis is $|\phi| < 1$, which indicates that the process is trend stationary. Define $\hat{\phi}$ as the ordinary least squares estimator of $\phi$, i.e.,

$$[\hat{\mu}, \hat{\phi}, \hat{\alpha}_1, \cdots, \hat{\alpha}_p]^T = (\Psi^T \Psi)^{-1} \Psi^T X \tag{4.3}$$
where

$$\Psi = \begin{bmatrix} 1 & x_0 & \triangle x_0 & \cdots & \triangle x_{1-p} \\ 1 & x_1 & \triangle x_1 & \cdots & \triangle x_{2-p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n-1} & \triangle x_{n-1} & \cdots & \triangle x_{n-p} \end{bmatrix} \tag{4.4}$$
and $X = [x_1, x_2, \ldots, x_n]^T$, where $n$ is the sample length of the regression. Under the null hypothesis of $\phi = 1$, according to the functional central limit theorem, $n(\hat{\phi} - 1)$ converges in distribution to the following function as $n \to \infty$ [71]:

$$n(\hat{\phi} - 1) \xrightarrow{d} \frac{\sigma \left[ W(1)^2 - 1 - 2 W(1) \int_0^1 W(t)\,dt \right]}{2 \omega \left[ \int_0^1 W^2(t)\,dt - \left( \int_0^1 W(t)\,dt \right)^2 \right]} \tag{4.5}$$
where $W(t)$ is a standard Brownian motion on $[0,1]$. Further, $\sigma$ and $\omega$ are two parameters that can be estimated consistently as $\hat{\sigma}^2 = \sum \hat{\epsilon}_k^2 / n$ and $\hat{\omega}^2 = \hat{\sigma}^2 / (1 - \sum \hat{\alpha}_j)^2$, respectively. Because the limiting distribution (4.5) does not have a closed form, quantiles of the distribution must be computed by numerical approximation or by simulation. The lower-tail critical value of the distribution can be found in a table for a given significance level. If the test statistic is greater than the critical value, the null hypothesis is accepted.
If $p = 0$ is selected, $\sigma = \omega$ in distribution (4.5), and the test statistic $n(\hat{\phi} - 1)$ is well known as the unit root or Dickey-Fuller (DF) statistic. If $p > 0$, the DF test is extended to the so-called augmented Dickey-Fuller (ADF) test. In the ADF test, the choice of $p$ is crucial and should be large enough to capture the essential correlation structure in the original series; according to the work in [66], a lag length of $p = n^{1/3}$ is sufficient. An alternative test is the Phillips-Perron (PP) test, which differs from the ADF test mainly in how it deals with serial correlation and heteroskedasticity in the errors [71]. The ADF and PP tests are two commonly used unit root tests, both based on the coefficient estimates of the regression model (4.2).
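The regression (4.2) and statistic $n(\hat{\phi}-1)$ can be sketched numerically. The following is an illustrative numpy sketch only (the helper name and simulated series are assumptions); in practice the statistic is compared against tabulated Dickey-Fuller critical values, or a library implementation such as statsmodels' `adfuller` is used.

```python
import numpy as np

def df_stat(x, p=0):
    """Unit root statistic n*(phi_hat - 1) from regression (4.2).

    Regresses x_k on a constant, x_{k-1}, and p lagged differences by
    least squares; p = 0 gives the Dickey-Fuller case, p > 0 the
    ADF-style extension.
    """
    dx = np.diff(x)
    N = len(x)
    start = p + 1
    n = N - start                           # usable sample length
    Y = x[start:]
    cols = [np.ones(n), x[start - 1:-1]]    # intercept and x_{k-1}
    for i in range(1, p + 1):
        cols.append(dx[p - i:N - 1 - i])    # lagged differences dx_{k-i}
    Psi = np.column_stack(cols)
    coef = np.linalg.lstsq(Psi, Y, rcond=None)[0]
    return n * (coef[1] - 1.0)              # coef[1] is phi_hat

# Illustrative series: a random walk (unit root) vs. a stationary AR(1).
rng = np.random.default_rng(1)
walk = np.cumsum(rng.standard_normal(480))
ar = np.zeros(480)
for k in range(1, 480):
    ar[k] = 0.5 * ar[k - 1] + rng.standard_normal()

print(df_stat(walk))   # stays above the lower-tail rejection region
print(df_stat(ar))     # strongly negative: rejects the unit root
```

The unit root null is rejected only when the statistic falls below the tabulated lower-tail critical value, consistent with the acceptance rule stated above.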
4.2.2 Stationarity test
While the null hypothesis of a unit root test is nonstationarity, stationarity tests based on the Lagrange multiplier (LM) principle test the null hypothesis of trend stationarity. Kwiatkowski et al. proposed a well-known test to distinguish a random walk from a stationary series, the so-called KPSS test [50]. It assumes that the target series can be decomposed into the sum of a deterministic trend, a random walk, and a stationary error:

$$\begin{cases} x_k = \mu + r_k + e_k \\ r_k = r_{k-1} + u_k \end{cases} \tag{4.6}$$
where $e_k$ is a stationary series, $r_k$ is a standard random walk, and $u_k$ is iid$(0, \sigma_u^2)$. This model is known as the component representation. Let $\hat{e}_k$ be the residuals from the regression of $x_k$ on the deterministic trend, as follows:

$$\hat{e}_k = x_k - \hat{\mu} \tag{4.7}$$
Define $\hat{\sigma}_e^2 = \sum \hat{e}_k^2 / n$ as the estimate of the error variance from the regression, and $S_k = \sum_{i=1}^{k} \hat{e}_i$ as the partial sum process of the residuals; then the LM statistic can be constructed as follows [62]:

$$LM = \frac{\sum S_k^2 / n^2}{\hat{\sigma}_e^2} \tag{4.8}$$
Under the null hypothesis of stationarity, when $e_k$ is an iid process, the LM statistic converges in distribution to the following function:

$$LM \xrightarrow{d} \int_0^1 V^2(t)\,dt \tag{4.9}$$

where $V(t) = W(t) - t W(1)$ is a standard Brownian bridge and $W(t)$ is a Brownian motion. When there is serial dependence in $e_k$, the statistic can be modified by replacing the variance estimate $\hat{\sigma}_e^2$ in (4.8) with the long-run variance $s^2(l)$ [50]:
$$s^2(l) = \hat{\sigma}_e^2 + \frac{2}{n} \sum_{s=1}^{l} \left[ \left(1 - \frac{s}{l+1}\right) \sum_{k=s+1}^{n} \hat{e}_k \hat{e}_{k-s} \right] \tag{4.10}$$
The parameter $l$ can be chosen as $n^{1/2}$. Given a significance level, the critical values can be calculated by numerical simulation or found in a table. The KPSS test is an upper-tail test; thus, when the test statistic (4.8) is below the critical value, the null hypothesis is accepted.

There are also other efficient tests for nonstationarity, such as the point optimal test and the locally best invariant test, which consider the case where $\phi$ in model (4.2) is close to 1.
4.3 Cointegration analysis for multivariate integrated series
If the observed series are found to be unit root nonstationary, the population variance of the series is not constant, which makes monitoring with traditional statistical methods inadequate. However, in econometrics, a phenomenon called cointegration has been observed frequently. When two or more observed series are nonstationary but a linear combination of these series is stationary, the time series are called cointegrated. Engle and Granger initiated the discussion on the representation, estimation, and testing of cointegrated series and introduced the following concepts [20].
Definition 4.1. If a series $x_k$ has a stationary, invertible, autoregressive moving average (ARMA) representation after differencing $d$ times, with no deterministic component, it is called integrated of order $d$, denoted as $x_k \sim I(d)$.

For simplicity, only $d = 0, 1$ will be discussed in this chapter; the results can be generalized to other cases. According to the definition, $x_k \sim I(1)$ is unit root nonstationary, and $x_k \sim I(0)$ is stationary.
Definition 4.2. For a vector series $x_k$, if all components of $x_k$ are $I(d)$, and there exists a nonzero vector $\beta$ such that $z_k = \beta^T x_k \sim I(d-b)$, $b > 0$, then $x_k$ is called cointegrated of order $(d, b)$, denoted as $x_k \sim CI(d, b)$, and $\beta$ is called the cointegrating vector.
For the bivariate case, there is at most one cointegration relation in $x_k$, which was handled with a residual-based procedure in [20]. In the case where more than one relation may exist, say $r$ cointegration relations, the cointegrating vector is extended to a cointegration matrix $B$ with rank $r$, and $z_k$ is a vector series of dimension $r$. There are two tasks in cointegration analysis: i) testing $r$, and ii) identifying $B$. There are several representations for cointegrated processes, including the vector error correction and common trends models.
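Definition 4.2 can be illustrated with a small synthetic sketch (all series and coefficients below are illustrative assumptions): two $I(1)$ series built from one shared random walk are individually nonstationary, yet the cointegrating combination is stationary with a far smaller variance.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
trend = np.cumsum(rng.standard_normal(n))   # shared I(1) stochastic trend

# Two I(1) series driven by the same trend plus stationary noise.
x1 = 1.0 * trend + 0.3 * rng.standard_normal(n)
x2 = 2.0 * trend + 0.3 * rng.standard_normal(n)

# The cointegrating vector beta = [2, -1] removes the common trend,
# leaving a stationary equilibrium error z_k ~ I(0).
z = 2.0 * x1 - x2

print(np.var(x2), np.var(z))   # variance of levels vs. equilibrium error
```

Here $d = 1$ and $b = 1$, so $x_k \sim CI(1,1)$; the next two subsections cover how $r$ and $B$ are estimated when the cointegrating vector is unknown.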
4.3.1 Vector error correction and the Johansen test
Suppose that a multivariate series is generated by a vector autoregressive model of order $p$ with Gaussian errors:

$$x_k = \mu + \sum_{i=1}^{p} A_i x_{k-i} + \epsilon_k \tag{4.11}$$
where $x_k \in \mathbb{R}^m$ is an $m$-dimensional series, $A_i \in \mathbb{R}^{m \times m}$ is the coefficient matrix, and $\epsilon_k$ is an $m$-dimensional white noise process distributed as $N(0, \Sigma_\epsilon)$. Here, we consider that $x_k \sim I(1)$ and $\triangle x_k$ is stationary. Define

$$\Pi = \sum_{i=1}^{p} A_i - I_m \tag{4.12}$$
Then $\Pi x_k \sim I(0)$, and $\mathrm{rank}(\Pi) = r$ is the number of linearly independent cointegration relations [65]. Decompose $\Pi$ into two full-rank matrices, $\Pi = A B^T$, where $A \in \mathbb{R}^{m \times r}$ is the loading matrix and $B \in \mathbb{R}^{m \times r}$ is the cointegration matrix; note that the decomposition is not unique. Define $B_i = -\sum_{j=i+1}^{p} A_j$; then the model (4.11) can be reparameterized as a vector error-correction (VEC) model as follows [65]:

$$\triangle x_k = \mu + A B^T x_{k-1} + \sum_{i=1}^{p-1} B_i \triangle x_{k-i} + \epsilon_k \tag{4.13}$$
Remark 4.1. On one hand, if the goal of a VAR analysis is to determine relationships among the original variables, differencing loses information and may lead to model misspecification, since long-term information is reflected in the levels. On the other hand, if the goal is to simulate an underlying data-generating process, integrated levels data can cause a number of problems. Fortunately, the VEC model provides an intermediate option between differences and levels by mixing them together through the cointegrating relations. Since all terms of the VEC model are stationary, problems with unit roots are eliminated.
Johansen proposed a maximum likelihood procedure to estimate the parameters in (4.13) and formulated a likelihood ratio test of the hypothesis $H_0: r = r_0$ against $H_1: r > r_0$ based on the statistic [43]:

$$-2 \ln Q = -n \sum_{i=r_0+1}^{m} \ln(1 - \hat{\lambda}_i) \tag{4.14}$$
where $r_0$ is the given value of $r$ to be tested, and the $\hat{\lambda}_i$ are the $m - r_0$ smallest eigenvalues obtained by solving the eigenvalue equation

$$|\lambda S_{pp} - S_{p0} S_{00}^{-1} S_{0p}| = 0 \tag{4.15}$$
where

$$S_{ij} = \frac{1}{n} R_i R_j^T, \quad (i, j = 0, p)$$
$$R_0 = \triangle X - \triangle X\, Y^T (Y Y^T)^{-1} Y$$
$$R_p = X_{-p} - X_{-p}\, Y^T (Y Y^T)^{-1} Y$$
$$\triangle X = (\triangle x_1, \ldots, \triangle x_n)$$
$$X_{-p} = (x_{-p+1}, \ldots, x_{n-p})$$
$$Y = (y_0, \ldots, y_{n-1})$$
$$y_k = (1, \triangle x_{k-1}^T, \ldots, \triangle x_{k-p+1}^T)^T$$
The asymptotic distribution of the statistic is nonstandard and depends on the restriction of the deterministic term, if any; it can be found in [80]. The lag parameter $p$ can be jointly estimated with $r_0$ using an order criterion [65]. Once the rank $r$ and lag $p$ are determined, the cointegration matrix $B$ can be estimated by an ML procedure or an LS method from (4.13).
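The eigenvalue computation in (4.14)-(4.15) can be sketched for the simplest case $p = 1$, where the VEC model has no lagged difference terms and $y_k$ reduces to the constant regressor, so the residual regressions amount to demeaning. This is an illustrative simplification (helper name and synthetic data are assumptions), not Johansen's full ML procedure, and the trace statistics still have to be compared against tabulated critical values.

```python
import numpy as np

def johansen_trace(X):
    """Eigenvalues of (4.15) and trace statistics (4.14) for p = 1.

    With p = 1 the only regressor in Y is the constant, so R0 and Rp
    are simply the demeaned differences and demeaned lagged levels.
    X is an n-by-m data matrix.
    """
    dX = np.diff(X, axis=0)
    Xlag = X[:-1, :]
    n, m = dX.shape
    R0 = dX - dX.mean(axis=0)
    Rp = Xlag - Xlag.mean(axis=0)
    S00 = R0.T @ R0 / n
    Spp = Rp.T @ Rp / n
    Sp0 = Rp.T @ R0 / n
    # Eigenvalues solving |lambda*Spp - Sp0 S00^{-1} S0p| = 0 are those
    # of Spp^{-1} Sp0 S00^{-1} S0p (note S0p = Sp0^T).
    M = np.linalg.solve(Spp, Sp0 @ np.linalg.solve(S00, Sp0.T))
    lam = np.sort(np.real(np.linalg.eigvals(M)))[::-1]
    # Trace statistic for each null r = r0 uses the m - r0 smallest roots.
    trace = [-n * np.sum(np.log(1.0 - lam[r0:])) for r0 in range(m)]
    return lam, trace

# Synthetic pair with exactly one cointegration relation (r = 1).
rng = np.random.default_rng(4)
trend = np.cumsum(rng.standard_normal(1000))
X = np.column_stack([trend + 0.2 * rng.standard_normal(1000),
                     2.0 * trend + 0.2 * rng.standard_normal(1000)])
lam, trace = johansen_trace(X)
print(lam, trace)  # trace[0] rejects r = 0; trace[1] is consistent with r = 1
```

The rank is chosen as the smallest $r_0$ whose trace statistic falls below its critical value; the eigenvector of the largest root then estimates a column of $B$.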
4.3.2 Common trend representation and the Stock-Watson test
Another representation for cointegrated variables is known as the common-trend representation or dynamic factor model:

$$\begin{cases} x_k = \Gamma \tau_k + e_k \\ \tau_k = \tau_{k-1} + u_k \end{cases} \tag{4.16}$$
where $e_k$ is a stationary multivariate series, $\Gamma \in \mathbb{R}^{m \times (m-r)}$ is the loading matrix, $\tau_k \in \mathbb{R}^{m-r}$ is an $(m-r)$-dimensional random walk process, and $u_k$ is an iid process. [73] proposed an estimation procedure for the common trend representation (4.16) and a test for the number of random walks. First, principal component analysis is used to estimate $\tau_k$ with $m - r$ principal components. Then the autoregressive coefficient matrix $\Phi$ in the form $\tau_k = \Phi \tau_{k-1} + \epsilon_k$ is estimated by least squares regression. Finally, the following statistic is used to test the hypothesis $H_0: r = r_0$ against $H_1: r > r_0$:

$$q_{m-r} = n(\mathrm{real}(\hat{\lambda}_{\min}) - 1) \tag{4.17}$$
where $\hat{\lambda}_{\min}$ is the smallest eigenvalue of $\Phi$ and $\mathrm{real}(\cdot)$ denotes the real part of a number. Under the $H_0$ hypothesis, the estimated minimum eigenvalue should be insignificantly different from one. The asymptotic distribution of $q_{m-r}$ is nonstandard and depends on $r$ [73].
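The three steps above (PCA, AR(1) fit, eigenvalue check) can be sketched as follows. This is an illustrative simplification with assumed helper names and synthetic data; the nonstandard critical values for $q_{m-r}$ must still come from the tables in [73].

```python
import numpy as np

def stock_watson_q(X, r0):
    """Statistic (4.17) for H0: r = r0, i.e. m - r0 common trends.

    Estimates tau_k as the leading m - r0 principal-component scores,
    fits tau_k = Phi tau_{k-1} by least squares, and returns
    q = n * (real(lambda_min) - 1) for the smallest eigenvalue of Phi.
    """
    n, m = X.shape
    k = m - r0                              # hypothesized number of trends
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    tau = Xc @ Vt[:k].T                     # n x k principal-component scores
    Phi = np.linalg.lstsq(tau[:-1], tau[1:], rcond=None)[0].T
    lam = np.linalg.eigvals(Phi)
    lam_min = lam[np.argmin(lam.real)]      # smallest eigenvalue (by real part)
    return n * (lam_min.real - 1.0)

# Synthetic pair with one true common trend (m = 2, r = 1).
rng = np.random.default_rng(6)
trend = np.cumsum(rng.standard_normal(1000))
X = np.column_stack([trend + 0.2 * rng.standard_normal(1000),
                     2.0 * trend + 0.2 * rng.standard_normal(1000)])
print(stock_watson_q(X, r0=1))  # close to zero: the one trend is a random walk
print(stock_watson_q(X, r0=0))  # far below zero: there is no second random walk
```

Under the correct rank, every root of $\Phi$ sits near one and $q$ stays small; overstating the number of trends drags a stationary direction into $\tau_k$ and pushes $q$ deep into the rejection region.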
4.4 Fault detection based on cointegration representation
In economics, nonstationarity tests and cointegration analysis are directly performed on economic data collected over the long term. Here, these kinds of models are used for the monitoring of industrial processes, which requires some adaptation. As mentioned above, a cointegration model can describe nonstationary variables with strong correlations, which are called cointegrated variables. However, it is desired to construct monitoring indices that indicate whether there is a disturbance or a break in the normal structure. Different from traditional PCA, only the equilibrium error process $z_k$ can be acquired as a stationary multivariate series, which is suitable for monitoring.

Moreover, if $x_k \sim I(1)$, then $\triangle \tau_k \sim I(0)$ is stationary, and the difference information of the common trends can also be used for monitoring purposes. The remaining directions, which are orthogonal to $\mathrm{col}(B)$, contain the common stochastic trends in $x_k$, which can be extracted as

$$\tau_k = B^{\perp T} x_k \tag{4.18}$$
where $\mathrm{col}(B)$ denotes the space spanned by the columns of $B$, and $B^\perp$ denotes the full-column-rank matrix that is orthogonal to $\hat{B}$. For a new observation $x_k$, we calculate $z_k$ and $\triangle \tau_k$ as

$$z_k = \hat{B}^T x_k + \hat{\mu}, \qquad \triangle \tau_k = B^{\perp T} (x_k - x_{k-1}) \tag{4.19}$$
The $T^2$ statistics for $z_k$ and $\triangle \tau_k$ are listed in Table 4.1.

Table 4.1: Fault detection indices

Statistic     Calculation                                    Control limit
$T^2_z$       $z^T \Lambda_z^{-1} z$                         $\frac{r(n^2-1)}{n(n-r)} F_{r,\,n-r,\,\alpha}$
$T^2_\tau$    $\triangle\tau^T \Lambda_\tau^{-1} \triangle\tau$    $\frac{(m-r)(n^2-1)}{n(n-m+r)} F_{m-r,\,n-m+r,\,\alpha}$

Here $\Lambda_z$ and $\Lambda_\tau$ are the covariances of $z_k$ and $\triangle \tau_k$, respectively, and $F_{a,b,\alpha}$ is the critical value of the F-distribution with degrees of freedom $a$ and $b$ at significance level $\alpha$.
The statistic $T^2_z$ monitors the equilibrium relations among the nonstationary series, which is similar to the SPE index in a PCA model, while the statistic $T^2_\tau$ monitors the variations in the nonstationary components, which is similar to the $T^2$ index in a PCA model.
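As a minimal numerical sketch of the indices in Table 4.1 (the synthetic cointegrated data, the known cointegrating vector, and $\hat{\mu} = 0$ are all assumptions of the demo), the two statistics can be computed as follows; a step fault that breaks the equilibrium shows up sharply in $T^2_z$.

```python
import numpy as np

rng = np.random.default_rng(5)

def series(n, rng):
    """Two I(1) series sharing one common stochastic trend."""
    trend = np.cumsum(rng.standard_normal(n))
    return np.column_stack([trend + 0.2 * rng.standard_normal(n),
                            2.0 * trend + 0.2 * rng.standard_normal(n)])

B = np.array([[2.0], [-1.0]])                  # cointegrating vector: z = 2*x1 - x2
Bperp = np.array([[1.0], [2.0]]) / np.sqrt(5)  # orthogonal complement (common trend)

# Estimate Lambda_z and Lambda_tau from normal training data.
Xtrain = series(500, rng)
Lz = np.atleast_2d(np.cov(Xtrain @ B, rowvar=False))
Lt = np.atleast_2d(np.cov(np.diff(Xtrain, axis=0) @ Bperp, rowvar=False))

# Test data: a step fault on x1 after sample 250 breaks the equilibrium.
Xtest = series(500, rng)
Xtest[250:, 0] += 3.0

z = Xtest @ B                                  # equilibrium error (mu = 0 here)
dtau = np.diff(Xtest, axis=0) @ Bperp          # differenced common trend
T2z = np.sum((z @ np.linalg.inv(Lz)) * z, axis=1)
T2tau = np.sum((dtau @ np.linalg.inv(Lt)) * dtau, axis=1)
print(T2z[:250].mean(), T2z[250:].mean())      # T2_z jumps after the fault
```

Because the step fault leaves the differenced trend nearly untouched except at the fault instant, $T^2_\tau$ barely reacts here, which anticipates the case-study observation that $T^2_z$ is the more sensitive index.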
4.5 Case study on TE process
In this section, the proposed method is investigated for nonstationary variables in the Tennessee Eastman process (TEP). The TEP was created to evaluate process control and monitoring methods, such as PCA, PLS, and Fisher discriminant analysis (FDA) [9].

Figure 4.1: Schematic diagram of the Tennessee Eastman process.

Fig. 4.1 shows the schematic diagram of the TEP, which consists of five major units: a reactor, a condenser, a compressor, a separator, and a stripper. Twelve manipulated variables and 41 measured variables are recorded. Process measurements are sampled with an interval of 3 minutes. Nineteen composition measurements are sampled with time delays that vary from six to fifteen minutes and are not used in this study. Therefore, 22 process measurements and 11 manipulated variables, i.e., XMEAS(1-22) and XMV(1-11), are chosen as X.
First, 480 normal samples are chosen to test the nonstationarity of all variables and to find the nonstationary ones. We use the ADF test, the PP test, and the KPSS test without a linear time trend to identify nonstationary series in the TE process. The number of lags is appropriately chosen considering both the t statistics of the regression coefficients and the BIC criterion; in this case, for most variables, a model order of 2 or 3 is adequate. The test results are listed in Table 4.2, where '0' represents accepting the hypothesis of unit root nonstationarity and '1' represents accepting the hypothesis of trend stationarity. From the table, 6 variables appear to be nonstationary; they are plotted in Figure 4.2. The results show that the ADF test outperforms the PP and KPSS tests in unit root testing for industrial process data. Let $x = [x_7, x_{13}, x_{18}, x_{19}, x_{20}, x_{31}]^T$. The Johansen test is then performed on $x_k$ and yields $r = 4$, which means there exist 4 cointegration relations. The cointegration matrix and the intercept are estimated as
$$B = \begin{bmatrix} -1.6166 & -1.2341 & 0.0055 & -0.5631 \\ 1.5929 & 1.2440 & 0.0913 & 0.2577 \\ 2.2897 & -4.9380 & 10.9607 & -1.0464 \\ 0.3233 & -0.4269 & -0.2824 & -0.0557 \\ -0.3023 & -0.4590 & 0.2237 & 1.0152 \\ -1.6000 & 2.4273 & -0.5862 & -0.0236 \end{bmatrix}, \quad \mu = \begin{bmatrix} 131.9253 \\ 526.3619 \\ -959.3754 \\ 580.7272 \end{bmatrix}$$
Following that, the common trends $\tau_k$ and the error process $z_k$ are calculated by (4.18) and (4.19), respectively, and plotted in Fig. 4.3. The nonstationarity tests for $\tau_k$ and $z_k$ are listed in Table 4.3. After the linear transform by the cointegration matrix, the nonstationary data generate the corresponding error process $z_k$, which should be stationary, while $\tau_k$ captures the common trends and appears nonstationary. Thus the nonstationarity testing result is consistent with the cointegration test design.
There are 15 known faults in the TE process [88]. IDV(1), IDV(10), and IDV(13) are used as three different types of fault cases to show the monitoring effect. When IDV(1) is introduced to the testing data at the 160th sample, a step change is induced in the A/C feed ratio in Stream 4. IDV(10) induces a random variation in the C feed temperature, and IDV(13) makes a slow drift in the reaction kinetics, also at the 160th sample. All three faults can be detected effectively by the proposed statistics, as plotted in Fig. 4.4, Fig. 4.5, and Fig. 4.6. It can be observed that $T^2_z$ outperforms $T^2_\tau$ in all three cases, because $T^2_\tau$ utilizes the differenced series of the original data, which may lose too much important information.
For the other, stationary variables, traditional PCA or DPCA can be applied to model and monitor the process effectively. However, for these six nonstationary variables, cointegration analysis is more effective at handling the nonstationary series with common trends.
4.6 Summary
In this chapter, the process monitoring problem for nonstationary multivariate series is addressed. As the statistical properties of nonstationary variables vary over time, traditional multivariate statistical process monitoring techniques are inadequate for the modeling and monitoring of nonstationary processes. This chapter utilizes nonstationarity
Figure 4.2: Nonstationary series in the TE process (x7, x13, x18, x19, x20, x31 plotted against sample index).
Figure 4.3: Common trends τ1, τ2 and error processes z1, z2, z3, z4 plotted against sample index.
Table 4.2: Nonstationarity tests for 22 measurements and 11 MVs

Variable    Description                            ADF  PP  KPSS
Xmeas(1)    A feed (stream 1)                       1   1   1
Xmeas(2)    D feed (stream 2)                       1   1   1
Xmeas(3)    E feed (stream 3)                       1   1   1
Xmeas(4)    Total feed (stream 4)                   1   1   1
Xmeas(5)    Recycle flow (stream 8)                 1   1   1
Xmeas(6)    Reactor feed rate (stream 6)            1   1   1
Xmeas(7)    Reactor pressure                        0   1   0
Xmeas(8)    Reactor level                           1   1   1
Xmeas(9)    Reactor temperature                     1   1   1
Xmeas(10)   Purge rate (stream 9)                   1   1   1
Xmeas(11)   Separator temperature                   1   1   1
Xmeas(12)   Separator level                         1   1   1
Xmeas(13)   Separator pressure                      0   1   0
Xmeas(14)   Separator underflow                     1   1   1
Xmeas(15)   Stripper level                          1   1   1
Xmeas(16)   Stripper pressure                       1   1   0
Xmeas(17)   Stripper underflow                      1   1   1
Xmeas(18)   Stripper temperature                    0   0   0
Xmeas(19)   Stripper steam flow                     0   0   0
Xmeas(20)   Compressor work                         0   1   0
Xmeas(21)   Reactor cooling water outlet temp       1   1   1
Xmeas(22)   Condenser cooling water outlet temp     1   1   1
Xmv(1)      D feed flow (stream 2)                  1   1   1
Xmv(2)      E feed flow (stream 3)                  1   1   1
Xmv(3)      A feed flow (stream 1)                  1   1   1
Xmv(4)      Total feed flow (stream 4)              1   1   1
Xmv(5)      Compressor recycle valve                1   1   0
Xmv(6)      Purge valve (stream 9)                  1   1   1
Xmv(7)      Separator pot liquid flow               1   1   1
Xmv(8)      Stripper liquid product flow            1   1   1
Xmv(9)      Stripper steam valve                    0   0   0
Xmv(10)     Reactor cooling water flow              1   1   1
Xmv(11)     Condenser cooling water flow            1   1   1
Table 4.3: Nonstationarity tests for common trends and error process

Series   ADF test   PP test   KPSS test
τ1          0          0         0
τ2          0          0         0
z1          1          1         1
z2          1          1         1
z3          1          1         1
z4          1          1         0
Figure 4.4: Fault detection result with $T^2_z$ and $T^2_\tau$ for IDV(1).
Figure 4.5: Fault detection result with $T^2_z$ and $T^2_\tau$ for IDV(10).
Figure 4.6: Fault detection result with $T^2_z$ and $T^2_\tau$ for IDV(13).
tests to identify the nonstationary variables in a dynamic process, and then adopts cointegration analysis to describe the common trends among cointegrated variables. Based on these methods and models, a new fault detection policy is established for detecting faults in multivariate nonstationary processes. The proposed method is demonstrated by the case study on the TE process, and the results show its effectiveness.
Chapter 5
Conclusions
This dissertation presents a novel data-driven causality analysis framework for root cause diagnosis in large-scale industrial processes. For the plant-wide oscillation problem, the idea of Granger causality is first introduced into the area of control performance monitoring and diagnosis. It is demonstrated that time-domain causality is good at discovering the information flow and propagation path, while spectral causality can uncover the pair-wise causal magnitude in a specified frequency range. In addition, a latent-variable-based feature selection is developed and incorporated into the causal modeling in order to highlight the interesting causal paths that lead to the plant-wide propagation.
The proposed method is further extended into a complete causal structure mining methodology using multilevel Granger causality. The causal analysis is performed at two levels: group-wise and within each group. For a given diagnosis task, a specific feature group is constructed through PCA and added to the set of other time series groups, which are clustered by a DTW-based k-means method, for the group-level causal test. To further recover the propagation path and locate the most probable root cause, a PLS-based modified Granger causality test is developed to address the multicollinearity and prediction power comparison issues. By combining the causal test results at the two levels, the root cause and the corresponding complete propagation paths can be diagnosed. Furthermore, the important causal structure of a controlled industrial plant can be recovered to help understand the dynamics of the current operation mode.
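The DTW distance underlying the k-means grouping step can be sketched with the textbook dynamic programming recursion, using an absolute-difference local cost. This is an illustrative sketch, not the exact clustering code used in the dissertation; the series and names are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D series.

    D[i, j] holds the minimal accumulated cost of aligning a[:i] with
    b[:j]; each step may advance either series or both (the diagonal).
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two phase-shifted sines align cheaply under warping, while a sine
# against a flat series cannot be aligned away.
t = np.linspace(0, 2 * np.pi, 60)
a = np.sin(t)
b = np.sin(t + 0.5)
print(dtw_distance(a, b) < dtw_distance(a, np.zeros_like(a)))
```

A k-means variant built on this distance groups oscillating tags that share a common waveform even when their phases differ, which is exactly why DTW is preferred over the Euclidean distance for clustering plant-wide oscillation data.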
The process monitoring problem for multivariate nonstationary time series is also addressed. The ADF, PP and KPSS tests are utilized to give a reliable judgment of nonstationarity. A cointegration test is adopted to capture the common trends and to generate stationary equilibrium error series, which can be used for monitoring purposes to detect abnormal changes due to different process faults. Thus, a new fault detection policy is established for monitoring nonstationary processes, and its efficiency is validated through a case study on the TE process.
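The cointegration-based monitoring idea can be illustrated with a minimal Engle-Granger style sketch: estimate the cointegrating relation on fault-free training data, form the stationary equilibrium error, set control limits from the training residuals, and flag samples that leave the limits. The simulated data, the OLS estimation of the relation, and the 3-sigma limits are illustrative assumptions, not the dissertation's exact monitoring index.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two nonstationary series sharing one common stochastic trend:
# a minimal cointegrated pair.
trend = np.cumsum(rng.standard_normal(800))
x1 = trend + 0.2 * rng.standard_normal(800)
x2 = 2.0 * trend + 0.2 * rng.standard_normal(800)

# Step 1: estimate the cointegrating relation by OLS on a
# fault-free training window.
train = slice(0, 400)
beta = np.polyfit(x1[train], x2[train], 1)   # slope and intercept
resid = x2 - np.polyval(beta, x1)            # stationary equilibrium error

# Step 2: control limits from the training residuals (approx. 3-sigma).
mu, sigma = resid[train].mean(), resid[train].std()
lo, hi = mu - 3 * sigma, mu + 3 * sigma

# Step 3: inject a step fault in x2 after sample 600; the equilibrium
# error leaves the limits even though both raw series were already
# nonstationary before the fault.
x2_fault = x2.copy()
x2_fault[600:] += 5.0
resid_fault = x2_fault - np.polyval(beta, x1)
alarms = (resid_fault < lo) | (resid_fault > hi)
print(alarms[:600].mean(), alarms[600:].mean())  # few false alarms vs. sustained alarms
```

The key point is that the raw series would violate any fixed limit simply by drifting, whereas the equilibrium error stays in control until the cointegrating relation itself is broken by a fault.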
Future research directions include further exploration or generalization of the proposed causal analysis framework. Nonlinear Granger causality could possibly handle highly nonlinear processes better; one promising approach is to use kernel regression in place of linear regression. In addition, a multiscale causal test using wavelets could be studied to extract the causal paths between different variables or groups at the various detail levels as well as the approximation level. Besides this, for nonstationary monitoring, it has been shown that if two processes are cointegrated then at least one Granger causal path exists, either one-way or bidirectional. How to find the cause of a process fault such as a step change or slow drift remains a challenging and open problem.
Abstract
A typical industrial process or plant operates with hundreds of control loops, and the primary loops should operate at desired levels for safety and efficiency. There exist challenges in how to monitor and maintain such a large-scale complex system; thus control performance monitoring (CPM) has drawn tremendous research interest and progressed steadily over the last two decades. After detection of poor control performance, an even more challenging and meaningful task is to diagnose and locate the root cause of whole-plant degradation, so that targeted maintenance can be performed to improve the situation. In this dissertation, new data-driven approaches based on causality methods are proposed to diagnose the root cause of plant-wide disturbances.

Oscillations are a common reason for poor performance in closed-loop controlled processes and, once generated, can propagate along the process flows and feedback paths of the whole plant. A new data-driven time series method for diagnosing the sources and propagation paths of plant-wide oscillations is presented. First, a latent variable based feature variable selection scheme is proposed to select candidate variables that share common oscillations. Time-domain Granger causality and spectral Granger causality are then applied to reveal the causal relationships in terms of time series prediction.

The proposed method is then extended to extract a more complete causal structure in a large-scale complex industrial plant. A novel multilevel Granger causality framework for root cause diagnosis is proposed, which considers both the group-wise and within-group causal structures. A dynamic time warping based k-means clustering method is applied to group the time series variables more accurately. A group Granger causal test is performed to find the causal contributions with regard to a specific disturbance feature, and the dominant and major causal relationships can be extracted from a group point of view. Moreover, a partial least squares modified Granger causal test is developed to overcome the multicollinearity issue and to make the generated causal network graph sparser and easier to interpret.

For nonstationary series in chemical processes, traditional latent variable or other statistical modeling methods are inadequate because the statistical properties are time-varying. Nonstationarity tests are implemented to identify the nonstationary variables; then cointegration tests are utilized to capture the common trends and dynamic structure. A monitoring index and control limit are derived for monitoring and fault detection purposes.