Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Inferential modeling and process monitoring for applications in gas and oil production
(USC Thesis Other)
Inferential modeling and process monitoring for applications in gas and oil production
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
INFERENTIAL MODELING AND PROCESS MONITORING
FOR APPLICATIONS IN GAS AND OIL PRODUCTION
by
Yu Pan
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(CHEMICAL ENGINEERING)
May 2015
Copyright 2015 Yu Pan
ii
Dedication
To my father and my family.
iii
Acknowledgments
I would like to thank everyone who supported me, encouraged me
and inspired me during my doctoral work. In fact, I could not possibly
finish this work without these people influencing me in every aspects of my
life.
Foremost, I would like to gratefully and sincerely thank Prof. S. Joe
Qin for his guidance, understanding, patience, and most importantly, his
friendship during my graduate studies at the University of Southern Cali-
fornia. He encouraged me to not only grow as an experimentalist and an
engineer but also as an instructor and an independent thinker. His men-
torship was paramount in providing a well rounded experience consistent
with my long-term career goals. I am also grateful for having an exceptional
dissertation committee and would like to thank all the members including
Prof. Qiang Huang and Prof. Iraj Ershaghi for their guidance and insights.
Working experience with the entire Qin group was memorable and
enjoyable. I feel very fortunate to have the former members who offered
me useful advices and helpful suggestions, including Carlos, Hu, Yingying,
Jingran, Yu, Tao, Zhijie, and visiting scholars Li, Le, Prof. Liu, and Prof.
Zheng. I would also like to thank the current members of the group, Yin-
iv
ing, Alisha, Yuan, Zora, Qinqin, Wei, and postdoc Gang, and Qiang for their
friendliness and consistent help. Especially I appreciate the time spending
with Yining Dong on our joint work. She had a strong mathematical back-
ground and taught me everything she knew about partial least-squares and
linear algebra.
I would also like to thank the USC Provost PhD Fellowship, and the
Center for Interactive Smart Oilfield Technologies (CiSoft) for their financial
support during my graduate studies. Deep gratitude also goes to Chevron
Champions, Michael Barham, Phi Nguyen, and Lisa Brenskelle for their
valuable guidance and insights on my CiSoft projects. Their experience pro-
vided me with the unique opportunity to gain a wider breadth of industrial
world while being a graduate student.
Last but not the least, I am indebted to thank my family for their con-
tinuous support. It is very unfortunate that my father passed away during
the last year of my doctoral study. His exceptional guidance and encour-
agement will always be remembered.
v
Table of Contents
Dedication ii
Acknowledgments iii
ListofTables viii
ListofFigures ix
Abstract xi
Chapter1. Introduction 1
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Inferential Sensors . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Partial Least Squares . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Oil and Gas Production . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Chapter2. PartialLeastSquaresforProcessModelingandMonitor-
ing 16
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 Partial Lease Squares Algorithm . . . . . . . . . . . . . . . . . . 16
2.2.1 PLS Objective . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.2 The Number of PLS Components . . . . . . . . . . . . . 20
2.2.3 PLS Model Prediction . . . . . . . . . . . . . . . . . . . . 21
2.3 Statistical Process Monitoring . . . . . . . . . . . . . . . . . . . 21
2.3.1 PLS Monitoring Indices . . . . . . . . . . . . . . . . . . . 22
2.4 Fault Diagnosis by Contributions . . . . . . . . . . . . . . . . . 25
2.4.1 Sensor Faults and Process Faults . . . . . . . . . . . . . . 26
2.4.2 Complete Decomposition Contributions . . . . . . . . . 28
2.4.3 Reconstruction-based Contributions . . . . . . . . . . . . 28
2.4.4 Analysis of Diagnosability . . . . . . . . . . . . . . . . . 29
vi
2.4.5 Diagnosis Using Complete Decomposition Contributions 30
2.4.6 Diagnosis Using Reconstruction-based Contributions . . 31
2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Chapter3. InferentialSensorforBasicSedimentsandWaterinGas
andOilProduction 33
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.2 Partial Least-squares Algorithm . . . . . . . . . . . . . . . . . . 36
3.3 BS&W for Crude Oil . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Chapter 4. Vapor Pressure Inferential Sensors for Gas and Oil Pro-
duction 43
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.2 The Vapor Pressure Equation . . . . . . . . . . . . . . . . . . . . 47
4.3 Hybrid Inferential Model . . . . . . . . . . . . . . . . . . . . . . 51
4.3.1 Partial Least Squares . . . . . . . . . . . . . . . . . . . . . 53
4.3.2 Statistical Process Monitoring . . . . . . . . . . . . . . . . 54
4.3.3 Fault Diagnosis Using Contribution Methods . . . . . . 54
4.4 Vapor Pressure for Gas Condensate . . . . . . . . . . . . . . . . 55
4.5 Vapor Pressure for Crude Oil . . . . . . . . . . . . . . . . . . . . 58
4.5.1 The RVP Inferential Sensor . . . . . . . . . . . . . . . . . 63
4.5.2 The TVP Inferential Sensor . . . . . . . . . . . . . . . . . 64
4.5.3 Monitoring Methods . . . . . . . . . . . . . . . . . . . . . 67
4.5.4 Fault Diagnosis . . . . . . . . . . . . . . . . . . . . . . . . 69
4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Chapter5. ConcurrentPLS-basedContributionforFaultDiagnosis 76
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.2 PLS for Process and Quality Monitoring . . . . . . . . . . . . . 80
5.3 Concurrent Projection to Latent Structures . . . . . . . . . . . . 83
5.3.1 CPLS Algorithm . . . . . . . . . . . . . . . . . . . . . . . 83
5.3.2 CPLS Properties . . . . . . . . . . . . . . . . . . . . . . . 87
5.4 CPLS-based Fault Detection . . . . . . . . . . . . . . . . . . . . 88
5.4.1 Monitoring Indices . . . . . . . . . . . . . . . . . . . . . . 89
5.4.2 Control Limits . . . . . . . . . . . . . . . . . . . . . . . . 90
vii
5.4.3 Indices General Forms . . . . . . . . . . . . . . . . . . . . 92
5.5 Fault Diagnosis by Contributions . . . . . . . . . . . . . . . . . 93
5.5.1 Complete Decomposition Contributions for Sensor Faults 94
5.5.2 Reconstruction-based Contributions for Sensor Faults . 95
5.5.3 Reconstruction-based Contributions for Process Faults . 96
5.5.4 Extraction of Fault Subspace . . . . . . . . . . . . . . . . 98
5.6 Analysis of diagnosability . . . . . . . . . . . . . . . . . . . . . 100
5.6.1 Diagnosis Sensor Faults Using Complete Decomposi-
tion Contributions . . . . . . . . . . . . . . . . . . . . . . 101
5.6.2 Diagnosis Sensor Faults Using Reconstruction-based
Contributions . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.6.3 Diagnosis Process Faults Using Reconstruction-based
Contributions . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.7 Case Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.7.1 Synthetic Study . . . . . . . . . . . . . . . . . . . . . . . . 104
5.7.2 Case Study on the Tennessee Eastman Process . . . . . . 106
5.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Chapter6. Conclusions 118
Bibliography 121
viii
List of Tables
Table 1.1 Comparison between two categories of inferential sensors . 3
Table 2.1 NIPALS PLS algorithm . . . . . . . . . . . . . . . . . . . . . . 20
Table 2.2 Values of M . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Table 3.1 Process variables used in the inferential sensor . . . . . . . . 39
Table 4.1 Process variables used in the inferential sensor . . . . . . . . 57
Table 4.2 Process variables used in the inferential sensor . . . . . . . . 63
Table 4.3 Average errors for the inferential sensors . . . . . . . . . . . 75
Table 5.1 Values for M . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Table 5.2 Values for N, a andc(x) . . . . . . . . . . . . . . . . . . . . . 93
Table 5.3 Percent rates of correct diagnosis for scenario 1 & 2 . . . . . 106
Table 5.4 Percent rates of correct diagnosis for scenario 3 . . . . . . . . 110
Table 5.5 Fault description . . . . . . . . . . . . . . . . . . . . . . . . . 112
ix
List of Figures
Figure 1.1 Typical oilfield process flow diagram. [48] . . . . . . . . . . 9
Figure 3.1 Process flow diagram around the wash tanks. . . . . . . . . 38
Figure 3.2 Predicted error sum of squares by cross-validation. . . . . . 40
Figure 3.3 Comparison between analyzer BS&W and inferential BS&W. 41
Figure 4.1 Vapor pressure curves of n-paraffins. . . . . . . . . . . . . . 49
Figure 4.2 A schematic illustration of the hybrid inferential sensor
building method. . . . . . . . . . . . . . . . . . . . . . . . . . 53
Figure 4.3 Process flow diagram around the debutanizer . . . . . . . . 56
Figure 4.4 Predicted error sum of squares by cross-validation . . . . . 59
Figure 4.5 Comparison between Korsten RVP and Analyzer RVP . . . 60
Figure 4.6 Comparison between Korsten RVP and Analyzer RVP . . . 61
Figure 4.7 Process flow diagram around the stabilizer. . . . . . . . . . 62
Figure 4.8 Hybrid inferential sensor for RVP prediction . . . . . . . . . 65
Figure 4.9 Predicted error sum of squares by cross-validation . . . . . 66
Figure 4.10 Hybrid inferential sensor for TVP prediction with a tem-
perature correction . . . . . . . . . . . . . . . . . . . . . . . . 68
Figure 4.11 PLS Monitoring Indices . . . . . . . . . . . . . . . . . . . . . 70
Figure 4.12 Contributions for Fault One . . . . . . . . . . . . . . . . . . 72
x
Figure 4.13 Selected variables to demonstrate the effectiveness of the
diagnosing result . . . . . . . . . . . . . . . . . . . . . . . . . 73
Figure 4.14 Contributions for Fault Two . . . . . . . . . . . . . . . . . . 74
Figure 5.1 CPLS Monitoring Indices for Scenario 1 . . . . . . . . . . . 107
Figure 5.2 CPLS Monitoring Indices for Scenario 2 . . . . . . . . . . . 108
Figure 5.3 CPLS Monitoring Indices for Scenario 3 . . . . . . . . . . . 109
Figure 5.4 TEP Process Flow Diagram . . . . . . . . . . . . . . . . . . . 111
Figure 5.5 CPLS-based Monitoring Result for a Step Change in B
Composition . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Figure 5.6 CPLS-based Monitoring Result for a Step Change in Reac-
tor Cooling Water Inlet Temperature . . . . . . . . . . . . . . 114
Figure 5.7 CPLS-based Monitoring Result for a Step Change in Con-
denser Cooling Water Inlet Temperature . . . . . . . . . . . 115
xi
Abstract
In industrial processes, including gas and oil processing, many im-
portant physical properties are difficult to measure due to limitations such
as cost, reliability, and large time delay. These properties are usually qual-
ity variables and are directly related to the economic interest. Traditionally,
these quality variables are either measured off-line by experimental analy-
ses or on-line by dedicated analyzers. Both methods can be expensive and
time consuming. An alternative solution is to use inferential sensors. An in-
ferential sensor utilizes other easy-to-measure process variables to estimate
those hard-to-measure quality variables. Inferential sensors are economi-
cal due to their software nature, and they can produce accurate estimations
without incurring any measurement delay. Accordingly, inferential sensors
gained increasing popularity and have been applied in many different pro-
cess industries.
In this dissertation, we focus on developing inferential sensors in the
field of gas and oil production industry. Inferential sensors on three prop-
erties are developed, including basic sediment and water (BS&W), Reid va-
por pressure (RVP), and true vapor pressure (TVP). The BS&W inferential
sensor is a data-driven one, which is based on partial least-squares (PLS)
xii
regression. The RVP and TVP inferential sensors are hybrid ones, which
consist of a model-driven part and a data-driven part. The model-driven
part is based on the Korsten vapor pressure equation, and PLS is used to
capture the modeling residual, which denotes the data-driven part. Case
studies on real data demonstrate the effectiveness of the proposed inferen-
tial sensors.
Since inferential sensors depend on the correlations among variables
within the process, their estimations are no longer valid if these correlations
are changed. To amend this shortage, process monitoring methods have
been developed by researchers. We first utilized the traditional monitoring
method based on PLS on the RVP inferential sensor. Then we develop a new
monitoring method based on a newly proposed concurrent partial least-
squares (CPLS) algorithm. At the end we demonstrate the effectiveness of
the CPLS monitoring methods by two case studies.
1
Chapter 1
Introduction
1.1 Overview
In industrial processes, including gas and oil processing, many im-
portant physical properties are difficult to measure due to limitations such
as cost, reliability, and large time delay. These properties are usually qual-
ity variables and are directly related to the economic interest. Traditionally,
these quality variables are either measured off-line by experimental anal-
yses or on-line by dedicated analyzers. Both methods have several draw-
backs. The problem of laboratory analyses is that they are time consuming.
Not only do they need preparation time, tedious procedures, but they also
induce large time delays. On the other hand, on-line analyzers are usually
expensive, the installation can be difficult, and maintaining these equip-
ments is not always easy. As a result, researchers started to investigate
alternative solutions using models as inferential sensors to estimate these
hard-to-measure quality variables [64, 38, 62, 18].
In the remaining of the chapter, an overview on the inferential sen-
sors is first given. Following by that, the partial least squares (PLS) model,
which is one of inferential sensors that has been studied through out the dis-
2
sertation, is introduced. The aim of this work is to build inferential sensors
for the applications in the petroleum industry therefore the background in-
formation on gas and oil production process is then presented. Finally we
conclude the chapter by giving an outline of the dissertation.
1.2 Inferential Sensors
An inferential sensor utilizes other easy-to-measure process vari-
ables to estimate those hard-to-measure quality variables. Inferential
sensors are economical due to their software nature, and they can produce
accurate estimations without incurring any measurement delay. Accord-
ingly, inferential sensors gained increasing popularity and have been
applied in many different process industries. A detailed survey of recent
inferential sensor developments can be found in Kadlec et al. [31].
Many inferential sensors exist in different applications. At a very
general level, they can be classified into two categories, namely, model-
driven and data-driven. Model-driven inferential sensors are based on first
principles models (FPM), which describe the physical and chemical mech-
anisms of the process. Data-driven inferential sensors extract empirical sta-
tistical correlations between the process variables and quality variables by
various techniques such as multivariate statistics and artificial neural net-
works.
Both categories have their own advantages and disadvantages. One
3
major advantage of model-driven inferential sensors is that they can gen-
erally work on different operating modes because they are derived based
on complex process knowledge. The process knowledge stays the same
when switching to different operating modes. However, the requirement of
complex process knowledge is also their biggest disadvantage, namely they
can be hard to derived even for the experts in the application domain. An-
other disadvantage of model-driven inferential sensors is their assumption
of ideal states, which a real process is never going to achieve. Data-driven
inferential sensors do not need complex process knowledge and its corre-
sponding model equations. They simply extract the statistical correlations
from the historical data, and therefore the same technique can be applied
to different application domains easily. One disadvantage of data-driven
inferential sensors is the potential requirement of different models for dif-
ferent operating modes due to changes in statistical correlations. The com-
parison between two categories of inferential sensors is given in Table 1.1.
Table 1.1: Comparison between two categories of inferential sensors
Model-driven Data-driven
Pros
Work on different modes Do not need complex
of operation process modelling equations
Cons
Hard to derive and assume Need different models for
assume ideal states different operating modes
It is possible to build a hybrid model, which is composed of both
4
a model-driven and data-driven model to compensate each other. Jutan
et al. [30] used a combination of FPM with multivariate statistical distur-
bance models to estimate the transport properties and catalyst activity for a
packed-bed reactor. Model-driven inferential sensors based on neural net-
works have also been published. [12]. Qin et al [55] proposed to validate
the input measurements before using them for predicting the outputs to
improve the reliability of the inferential sensors in case some inputs contain
faults or gross errors.
1.3 Partial Least Squares
While the form of a model-driven inferential sensor depends on its
application, the data-driven inferential sensors through out this disserta-
tion are based on partial least squares model. Partial least squares (PLS),
also known as projection to latent structures, is a statistical method that
bears some relation to principal components analysis (PCA). Unlike PCA,
which projects the input data X onto a low-dimensional latent space which
captures most of the variance of X, PLS projects both the input data X and
output data Y onto a low-dimensional latent space that captures most of
their covariance. Many of the industrial processes involve high dimension-
ality and collinearity, which can be overcome by partial least squares. Par-
tial least squares is know to be effective in dealing with large number of
variables with collinearity.
5
Partial least-squares was first introduced in the chemometrics liter-
atures by Swedish statistician Herman Wold [78], who then developed it
with his son, Svante Wold. Ever since then, many researchers started to
investigate its properties and improve its efficiency. We list some of the im-
portant developments of PLS here for reference. H¨ oskuldsson [28] gave an
overall explanation on the PLS algorithm and its properties. Lindgren et
al. [45] developed an efficient PLS kernel algorithm for the special case of
data set with many samples and few variables. Inspired by Lindgren et al.
[45], a similar PLS kernel algorithm for data set with many variables and
few samples is developed by R¨ annar et al. [67]. Dayal and MacGregor [17]
proved that during the sequential process of computing PLS score vectors,
the deflation step does not need to be carried out on both the input X and
the output Y; deflating only one of the two results the same. Bennett and
Embrechts [10] gave an optimization perspective on PLS regression. While
mathematical properties of PLS are well understood, its geometric proper-
ties have also been analyzed by Li et al. [43].
The underlining structure of PLS consists of a outer model and a
inner model. However both of them are linear models, and therefore it is
not able to capture the process nonlinearity. Many of its variants were being
developed to solve this issue. One popular method is to retain the linear
relations of the outer model, but incorporate nonlinear relations into the
6
inner model, such as neural net PLS [62] [76], spline PLS [79], quadratic PLS
[80], and fuzzy PLS [9]. Another popular method is to update the model
parameters recursively and assume that the nonlinear relations in the near
future can be approximated by the current linear relations [27, 56, 18, 1].
Kernelized PLS methods are also been developed [10, 70]. In kernel PLS, the
original variables are transformed into a high-dimensional feature space.
Then a traditional PLS is performed. In this way the kernel PLS method is
similar to its linear counterpart.
Another problem of PLS is that it is a static model, namely it only
considers the cross-variation among variables at each timestamp. Therefore
when the underlining relationship of the process is time dependent, the
traditional PLS along is not sufficient to capture the dynamic relations. To
solve this problem, many modifications have being made. One popular
method is to use a dynamic outer model, but the inner model remains static
[41, 69, 60, 22, 35, 33]. Recently a new dynamic PLS has been developed in
which both the outer model and the inner model are dynamic.
Since all data-driven inferential models rely on the statistical corre-
lations within the past normal operating measurements, their performance
will deteriorate if abnormality happens. Multivariate statistic process mon-
itoring (MSPM) methods such as PLS is proved to be an effective approach
for detecting and diagnosing abnormal operating situations in many indus-
7
trial applications including chemicals [37], polymers [50, 47], and microelec-
tronic manufacturing, thus making it an advantageous option among other
data-driven models.
Here we briefly explain how PLS is used to monitor a process. A
more detail explanation is given in the later chapters. Partial least squares
partitions the input measurements into two subspaces, a principal subspace
and a residual subspace. Each subspace is monitored by an statistical index.
The principal subspace, which contains variations related to the output, is
monitored by the T
2
index. The residual subspace, which contains vari-
ations not related to the output, is monitored by the Q index. A fault is
detected when a new measurement breaks the normal statistical correla-
tion causing one of the monitoring indices to go beyond its control limit.
Several modifications including orthogonal PLS (OPLS) [72], total (TPLS)
[86, 44, 42], concurrent PLS (CPLS) [65], and their variants [85, 41, 84, 52]
have been proposed to improve the interpretability of the traditional PLS
monitoring method.
Once an abnormal situation has been detected, the next step is to di-
agnose its cause. There are several methods for fault diagnosis based on
historical data, such as discriminant analysis [66, 82], pattern matching us-
ing dissimilarity factors [32, 34], and structured residual-based approaches
[58, 59, 25]. One popular method among them is the contribution method.
8
Contribution methods determine the contribution of each variable to the
fault monitoring index calculated. The variable that contributes the most
are considered as a possible root cause of the abnormality. Several contribu-
tions have been defined and used for fault diagnosis[13, 6, 40, 5, 77]. Alcala
and Qin [3] further analyzed these contributions and generalized them into
three categories. Despite other existing diagnosing methods in multivariate
statistical process monitoring (MSPM), the approach of contribution plots
seems to work very well in many applications [51, 74, 61, 15].
1.4 Oil and Gas Production
One potential application domain for the inferential sensors is in the
gas reservoir and oilfield. In this section, we first give an brief overview of a
typical oil and gas production process, then the processing of the individual
separated streams is reviewed.
As produced, wellhead fluids, namely crude oil, nature gas, and
brine, must be processed before sale, transpor, reinjection, or disposal.
Therefore, oil and gas production involves a number of surface unit opera-
tions between the well head and the point of custody transfer or transport
from the production facilities. Fig. 1.1 shows a typical oilfield process flow
diagram. Oilfield processing generally consists of two distinct categories of
operations [48, 19] :
9
Figure 1.1: Typical oilfield process flow diagram. [48]
10
1. Separation of the gas-oil-brine wellstream into its individual phases.
2. Removal of impurities from the separated phases to meet sales, trans-
portation, reinjection sepcifications, and environmental regulations.
Category one has three specific operations: separation, dehydration, and
stabilization. Separation describes the preliminary separation of vapor, oil
and water phases of a produced wellhead stream. Dehydration describes
the operation of removing water droplets or basic sediment and water
(BS&W) from crude oil. Stabilization is to remove the most volatile com-
ponents of a crude oil to reduce its vapor pressure. Category two has two
specific operations: desalting and sweetening. While desalting is to reduce
the salt content of a crude oil, sweetening, also known as treating, is to
remove H
2
S and other sulfur compounds.
Gas processing begins with acid gas removal as shown in Fig. 1.1.
The acidic gases typically refer to hydrogen sulfide and carbon dioxide.
Both gases are very corrosive when liquid water is present. Moreover, hy-
drogen sulfide is highly toxic and environmental regulations almost always
prohibit the release of large amounts of hydrogen sulfide to the atmosphere.
Gas treating or sweetening usually involves using aqueous solutions of var-
ious chemicals and therefore sweetening usually precede dehydration. The
purpose of dehydration not only is to increase the product purity, but more
11
importantly, to prevent the formation of gas hydrates, which may plug the
pipelines and processing equipment at high pressure and temperature.
In some gas reservoirs, a significant amounts of liquefiable hydrocar-
bons is present. This liquefied natural gas (LNG), namely ethane, propane,
or heavies, can produce condensate upon cooling and compressing, which
may induce difficulties in the subsequent pipeling and processing. Some-
times extracting LNG may be profitable due to its economical value, and
sometimes it is simply required to meet the dew-point specification. Recov-
ered condensate may have to be stabilized before turning to a transportable
product, and therefore it is generally avoided in remote locations.
How to handle the processed natural gas depends on situations. The
recovered gas may simply be flared if a gas pipeline is not available. Con-
servation is often required by law, therefore a more acceptable practice is
to conserve the gas by compressing and rejecting it to the formation with a
view of its eventual recovery and sales. When a gas pipeline is available,
transport it for sales is the common situation.
In oil processing, dehydration and desalting are often performed al-
ternatively. After removal of free water, produced oil often still contains a
considerable amounts of residual emulsified water, which must be reduced
to an acceptable value for transportation or sales. Water-in-oil emulsion,
which is caused by turbulence or agitation, is a quasi-stable suspension of
fine drops of water dispersed in oil. There are generally four techniques
12
to remove the emulsified water: chemical addition, heat, electrostatic fields,
and simply by gravitational force with enough residence time. After the ini-
tial dehydration, dilution water must occasionally be added to reduce the
salt content of the emulsified water, which is then followed by another de-
hydration step. More cycles of desalting and dehydration may be required
depending on the situations.
Hydrogen sulfide in crude oil must be removed, again due to its tox-
icity, corrosive property, and environmental regulations. And gas stripping
or heating is usually applied for this purpose. The final step of oil pro-
cessing, before storage or sales, is the stabilization. Crude oil stabilization
refers to removing the most volatile hydrocarbons as vapor, and results in
lowering the crude oil vapor pressure to a acceptable value that allows safe
handling and transport.
Produced water is a waste material, but processing it is often
required before reinjection to the formation or disposal to the surrounding.
Water skimming is the process of removing oil-in-water emulsion. To
prevent reservoir plugging, reinjected water must first be filtered and
deaerated. Deaeration refers to removing dissolved oxygen by either
chemical scavengers, gas stripping, or catalytic reaction. Deaeration will
reduce the corrosivity of water drastically.
Sand may or may not be produced depending on the reservoir prop-
erties and process operating conditions. When sand is produced, it is often
13
gathered in the locations of the process where the flow velocity and turbu-
lence is low, such as bottom of tanks. To prevent process upsets, sand needs
to be periodically cleaned up and removed.
After a general overview, readers can now understand there are
many product specifications in the gas and oil production process. For
gas processing, typical specifications are water content, hydrogen sulfide
content, gross heating value, hydrocarbon dew point, carbon dioxide con-
tent, oxygen content, and condensate vapor pressure. For oil processing,
typical specifications are basic sediment and water (BS&W), crude oil vapor
pressure, pour point, sulfur content, and viscosity. Almost none of them
can be easily measured without either an analyzer or a laboratory analysis.
In this dissertation, we develop and apply inferential sensors on three
properties, namely condensate vapor pressure, crude oil vapor pressure,
and crude oil BS&W content.
1.5 Outline
To conclude this chapter, we present the outline of the dissertation
here.
In Chapter 2, we start by presenting the partial least squares algo-
rithm with process monitoring methods and contribution methods. Since
PLS is used in every subsequent chapters except the conclusion, we take
this chance to give a more detail explanation of the PLS algorithm. This
14
would allow the formalization of the notations, prevent too many redun-
dancies in the later chapters, and make the rest of the dissertation easier to
read.
In Chapter 3, the target property is the crude oil BS&W content.
The inferential sensor presented in this chapter is simply a data-driven
one, which is based on PLS. At the end, the proposed inferential sensor is
applied to a crude production process.
In Chapter 4, we focus on developing inferential sensors for vapor
pressure. This includes both gas condensate vapor pressure and crude oil
vapor pressure. The inferential sensor presented in this chapter is a hy-
brid one. The model-driven part is based on Korsten vapor pressure equa-
tion. This equation is a simple way to estimate vapor pressure curve of
pure hydrocarbons. The only information needed is one measurement of
equilibrium temperature and pressure. Partial least squares, which is the
data-driven part, is used to model the residuals between the estimated va-
por pressure by the equation and the measured vapor pressure by an online
analyzer. This proposed inferential sensor is applied to both gas plant con-
densate and oilfield crude. Finally, process monitoring and fault diagnosis
based on PLS is applied to the oilfield.
In Chapter 5, we turn our attention away from inferential sensors.
Instead, we study the concurrent partial least squares (CPLS) in details. Re-
cently proposed by Qin and Zheng [65], CPLS is one of the improved PLS
15
process monitoring methods. We first derived some of its properties, and
then fault diagnosis methods based on CPLS are developed. Finally we ap-
ply the developed diagnosing methods on synthetic case studies to demon-
strate its performance.
Chapter gives some concluding remarks and potential future works
for the dissertation.
16
Chapter 2
Partial Least Squares for Process Modeling and
Monitoring
2.1 Introduction
As mentioned in Chapter 1, partial least squares (PLS) algorithm is
used profoundly in this dissertation, and therefore this chapter is dedicated
to explain it in a great detail and to formalize its notations. First, the PLS
principle and its algorithm is presented. Then the traditional PLS monitor-
ing method with its monitoring indices is given. After that we describe and
compare the two mostly used contribution methods, namely the complete
decomposition contributions (CDC) and the reconstruction-based contribu-
tions (RBC). Finally, we conclude the chapter in the last section.
2.2 Partial Lease Squares Algorithm
PLS is a multivariate statistical method based on linear projection
to a latent-variable subspace, and is mainly used in process modeling and
monitoring to deal with a large number of variables with collinearity. It
consists of an outer model and an inner model. Suppose we gather all the
input measurements withn samples andm variables to form an input ma-
17
trix X 2 R
nm
, and all the output measurements with n samples and p
variables to form an output matrix Y2 R
np
, the outer model will try to
project both X and Y to a subspace that maximize their covariance. Then
the inner model is used to find the relations between these two subspaces.
The underlying structure of the PLS is the follows,
X = TP
T
+ E (2.1)
Y = TQ
T
+ F (2.2)
where T2 R
nl
is the score matrix, and P2 R
ml
and Q2 R
pl
are or-
thogonal loading matrices for X and Y, respectively. The matrix E is the
residual matrix for X, and the matrix F is the residual matrix for Y. Next,
we describe how (3.1) and (3.2) are achieved.
2.2.1 PLS Objective
The objective of PLS is to maximize the following objective function,
max
t;u
J = t
T
1
u
1
(2.3)
where t
1
is calculated with the aid of a weighting vector w
1
,
t
1
= Xw
1
(2.4)
and u
1
is calculated with the aid of a loading vector q
1
,
u
1
= Yq
1
(2.5)
18
subject to the following constraints:
kw
1
k
2
= 1 (2.6)
kq
1
k
2
= 1 (2.7)
By using Lagrange multipliers, the objective function becomes
max
t;u
J = w
T
1
X
T
Yq
1
+
1
2
w
(1 w
T
1
w
1
) +
1
2
q
(1 q
T
1
q
1
) (2.8)
Taking derivative with respect to w
1
and q
1
gives
@J
@w
1
= X
T
Yq
1
w
w
1
= 0 (2.9)
@J
@q
1
= Y
T
Xw
1
q
q
1
= 0 (2.10)
After few manipulations to eliminate w
1
and q
1
, (2.9) and (2.10) become the
two following equations
X
T
YY
T
Xw
1
=
w
q
w
1
(2.11)
Y
T
XX
T
Yq
1
=
q
w
q
1
(2.12)
Therefore, the weighting vectors w
1
and the loading vector q
1
are eigenvec-
tors of X
T
YY
T
X and Y
T
XX
T
Y, respectively. The score vectors t
1
and u
1
19
can then be calculated by (2.4) and (2.5). The score vectors are related by a
linear inner model:
u
1
=
1
t
1
+ r
1
(2.13)
where
1
is a constant coefficient and is determined by minimizing the norm
of the residual r
1
. To make sure all score vectors t
i
are orthogonal to each
other, the X matrix is deflated by
E
1
= X t
1
p
T
1
(2.14)
where the loading vector p
1
is determined by
p
1
= X
T
t
1
=t
T
1
t
1
(2.15)
To deflate the matrix Y, the score vector u
1
is substituted by its estimate,
1
t
1
:
F
1
= Y
1
t
1
q
T
1
(2.16)
then the entire procedure is repeated until the desired number of PLS com-
ponents is reached. The overall PLS algorithm can be done iteratively by
a nonlinear iterative partial least-squares algorithm (NIPALS), and is sum-
marized in Table 2.1. Note that in the third step of the NIPALS algorithm,
q
h
is normalized during each iteration. If we relax this restriction, the in-
ner model coefficient
h
becomes unity, and the estimated score vector u
h
is equal to t
h
. Arranging all of the score and loading vectors into a matrix,
20
Table 2.1: NIPALS PLS algorithm
1. Scale X and Y to zero-mean and unit-variance
E
0
= X; F
0
= Y
2. Leth =h + 1 and take u
h
as some column of F
h1
3. Iterate the following equations until converge:
w
h
= E
T
h1
u
h
=kE
T
h1
u
h
k
t
h
= E
T
h1
w
h
q
h
= F
T
h1
t
h
=kF
T
h1
t
h
k
u
h
= F
h1
q
h
4. Find the regression coefficient
h
h
= u
T
h
t
h
=t
T
h
t
h
5. Calculate the X-loadings:
p
h
= E
T
h1
t
h
=t
T
h
t
h
6. Calculate the residuals:
E
h
= E
h1
t
h
p
T
h
F
h
= F
h1
h
t
h
q
T
h
7. Return to step 2 until all PLS components are calculated.
namely T = [t
1
;:::; t
l
], P = [p
1
;:::; p
l
], and Q = [q
1
;:::; q
l
], we achieve
(3.1) and (3.2).
2.2.2 The Number of PLS Components
The desired number of PLS components,l, is usually determined by
cross-validation. A standard way of doing cross-validation is as follows.
1. Divide the data intos subsets.
2. Leave out a subset of data at a time and build a PLS model with the
remaining subsets.
21
3. Test the model on the subset which is not used in modeling.
4. Repeat this procedure until every subset has been left out once.
5. Calculated the total prediction errors of all subsets, known as the pre-
dicted error sum of squares (PRESS).
6. The optimal number of PLS components is the one that gives either
the minimum PRESS or after which the PRESS no longer decreases at
a noticeable rate.
2.2.3 PLS Model Prediction
Once the PLS model is built, it can be used to make prediction of the
future output with a given input measurement x
k
by the following equa-
tion:
b y
k
= QR
T
x
k
(2.17)
where
R = W(P
T
W)
1
(2.18)
A detail tutorial of PLS algorithm and cross-validation can be found in
Geladi and Kowalski[24].
2.3 Statistical Process Monitoring
Since the PLS model is based on the statistical correlation among pro-
cess vaiables, it is no longer reliable if abnormal operating situations hap-
22
pen, which results in breaking this correlation. Multivariate statistic pro-
cess monitoring is effective for detecting and diagnosing faults or abnor-
mal operating situation in many industrial processes. Two popular mul-
tivariate statistic process monitoring methods based on principal compo-
nent projection are principal component analysis (PCA) and partial least
squares (PLS).[63, 6, 42] Both methods project the measurement data onto
a lower dimensional subspaces, and the process is then monitored in these
subspaces. These two methods are popular due to their ability to handle
missing data and large numbers of highly correlated variables. PCA-based
monitoring method should be used if one wants to monitor all the vari-
ations and abnormal situations in process variables X. PLS-based mon-
itoring method is more effective when only the variations in the process
variables that are most influential on quality variables Y is of interest. In
this dissertation, we are not interested to monitor all the variations in a
process. Rather, monitoring variations that are related to the output mea-
surements of our inferential sensors is more meaningful, and therefore PLS-
based monitoring is used.
2.3.1 PLS Monitoring Indices
The PLS model divides the measured variable space into two sub-
spaces. One of them, the principal subspace, is believed to be able to explain
most of the variations in Y, and the residual subspace, the second subspace,
23
is believed not to have much influence on Y. The typical approach is to use
theT
2
index to monitor the principal subspace and use theQ index to mon-
itor the residual subspace.
TheT
2
index is defined as,
T
2
= t
T
1
t = x
T
R
1
R
T
x; (2.19)
where x is a given sample vector, and R = W(P
T
W)
1
is the matrix men-
tioned previously. The diagonal matrix , is defined as,
=
1
N 1
T
T
T; (2.20)
whereN is the number of training sample and T is the score matrix from
the PLS model. Under the condition that the process is normal and the data
follows a multivariate normal distribution, Tracy et al. [71] showed that
theT
2
index is related to anF distribution considering that the population
mean and covariance are estimated from the data:
N(Nl)
l(N
2
1)
T
2
F
l;Nl
; (2.21)
whereF
l;Nl
is anF distribution withl andNl degrees of freedom. For a
given significance level the process is considered normal if
T
2
T
2
l(N
2
1)
N(N 1)
F
l;Nl;
: (2.22)
IfN is large, then Box [11] showed thatT
2
index can be well approximated
with a
2
distribution withl degrees of freedom and
T
2
2
l;
: (2.23)
24
TheQ index is defined as
Q =jj~ xjj
2
= x
T
(I PR
T
)
T
(I PR
T
)x; (2.24)
where P is the loading matrix defined in the previous subsection. The pro-
cess in considered normal if
Q
2
; (2.25)
where
2
denotes the control limit for theQ index. Two popular expressions
for the control limit
2
are proposed by Nomikos and MacGregor [51], and
Jackson et al [29]. In this dissertation, expression proposed by Nomikos
and MacGregor is used. This expression, derived by using the results of
Box [11], is the following:
2
=g
2
h;
; (2.26)
where
g =
trfS(I PR
T
)g
2
trfS(I PR
T
)g
; (2.27)
and
h =
trfS(I PR
T
)g
2
trfS(I PR
T
)g
2
: (2.28)
The expressiontr(A) denotes the trace of matrix A, and S is the input co-
variance matrix,
S =
1
n 1
X
T
X: (2.29)
Note that both theT
2
index and theQ index are in quadratic forms.
The notation can be simplified by considering just one general expression
25
as
Index(x) = x
T
Mx; (2.30)
where the positive semi-definite matrix M for corresponding index in
shown is Table 2.2. This general expression is very helpful in defining the
contributions for fault diagnosis methods, which are discussed in the next
section.
Table 2.2: Values of M
Index T
2
Q
M R
1
R
T
(I PR
T
)
2.4 Fault Diagnosis by Contributions
Once a fault is detected, it is desirable to diagnose its cause. Many
methods are proposed to solve this problem. One popular category among
them is the contribution analysis methods. Contribution methods deter-
mine the contribution of each variable to the fault detection indices calcu-
lated. The idea is that faulty variables should have high contributions to the
fault detection index. Several contributions have been defined and used for
fault diagnosis [13, 61]. Alcala and Qin [3] showed that they can be unified
into three general categories: diagonal contribution, general decompositive
contribution, and reconstruction-based contribution. Diagonal contribution
was proposed by Qin et al.[61], and it is specialized in dealing with mulit-
26
block process monitoring. Among the general decompositive contributions,
the complete decomposition contribution is mostly widely used in industry.
In this section, we will first explain the difference between a sensor fault and
a process fault. Then, the complete decomposition contributions (CDC) and
the reconstruction-based contributions (RBC) are discussed and compared
using the general expression of the monitoring indices.
2.4.1 Sensor Faults and Process Faults
There are two kinds of fault that can happen in a process, namely
a sensor fault and a process fault. A sensor fault happens when one of
the sensors is not working properly, and therefore it produces inaccurate
measurements while the process is actually operated at normal conditions.
There are many types of sensor fault, and the four most common ones are
listed here [21]:
1. A bias fault happens when the measurement is affected by a constant
term;
2. A complete sensor failure happens when the sensor stops functioning,
and its measurement is merely a constant;
3. A drifting fault happens when the sensor signal is added by an auto-
correlated term with a constant forcing function
27
4. A precision degradation happens when measurement noise is
increased, causing a higher noise-to-signal ratio.
A sensor fault is quite isolated by itself. It won’t affect other variables, and
therefore, it is generally easier to identify using contribution methods. We
simply need to define a contribution for each of the sensors, and the one
with the highest contribution is treated as the faulty one.
On the other hand, a process fault is much different. A process fault
happens when the process is operated in an abnormal situation while all the
sensor signals correctly reflect the actual operating conditions. A process
fault typically involves more than one variable, and as a result, its identi-
fication is considerably much difficult. One example of a process fault is a
sudden change in the distillation column feed flow, which could then affect
the column temperature and pressure drastically.
In the case of a process fault, contribution method is best used when
historical information on a set of faults is available, and the user is trying to
find out which one within this set is currently happening. A contribution is
assigned to each of the known faults, and the one with the highest contribu-
tion is treated as the one that is under investigation. Often, especially when
process monitoring is never applied before, historical fault information is
not available. In this situation, contributions defined for sensor faults can
be used. Although it cannot give information on exactly what is going on
28
in the process, it provides the variable that is mostly affected by this abnor-
mality, and thus, offers a starting point on the diagnosis.
In the next two subsections, two contribution methods based on sen-
sor faults are defined. They can be easily generalized to process faults by
substituting
i
with proper fault directions, which can be extracted from
historical faulty samples.
2.4.2 Complete Decomposition Contributions
In general, the CDC for monitoring indices with a quadratic form is
defined as
Index(x) = x
T
Mx =jjM
(1=2)
xjj
2
=
P
n
i=1
T
i
M
(1=2)
x
2
=
P
n
i=1
CDC
Index(x)
i
(2.31)
where
i
is thei th column of the identity matrix, and
CDC
Index(x)
i
=
T
i
M
(1=2)
x
2
: (2.32)
2.4.3 Reconstruction-based Contributions
The RBC was proposed by [4] & [5]. It used the amount of reconstruc-
tion of a fault detection index along a variable direction as the contribution
of that variable. The reconstructed index with a quadratic form along a
variable direction
i
is
Index (x
r
i
) =jjM
(1=2)
x
r
i
jj
2
=jjM
(1=2)
(x
i
f)jj
2
(2.33)
29
wheref is the reconstructed portion to be determined. The best reconstruc-
tion by minimizing (2.33) gives the optimal value off. Taking derivative of
Index(x
r
i
) with respect tof and set it equals to zero. The expression off can
be solved as
f =
T
i
M
i
1
T
i
Mx
: (2.34)
The RBC is defined as
RBC
Index(x)
i
= Index(x) Index(x
r
i
)
= x
T
Mx (x
i
f)
T
M(x
i
f)
= 2x
T
M
i
ff
T
T
i
Mf
=
(
T
i
Mx)
2
T
i
M
i
(2.35)
Note thatf is a scalar, and therefore its transpose is equals to itself.
The forth equality in (5.48) is the result of applying (2.34).
2.4.4 Analysis of Diagnosability
Although the contribution methods for fault diagnosis is popular
and has been adopted by many researchers for decades, not much fun-
damental analysis on their diagnosability has been developed. However,
there are reports showing that contribution plots involve fault ”smearing”,
which can lead to misdiagnosis [63, 74, 81]. Smearing is when a fault in the
i th variable affects the contribution of other variables, and it is unavoidable
in both CDC and RBC. In order to give some theoretical analysis on the
smearing effect, Alcala and Qin [4] proposed to examine the case where
30
a sensor fault happens in the
j
direction with a sufficiently large fault
magnitudef.
A fault in sensorj is represented as x = x
+
j
f where x
is the fault-
free part of the measurement. Whenf is sufficiently large, x
is negligible
compared to
j
f, and therefore
x
j
f: (2.36)
This case will be utilize to examine the diagnosability of the above defined
contributions.
2.4.5 Diagnosis Using Complete Decomposition Contributions
Substituting the fault in (5.62) into (5.45) we get
CDC
Index(x)
i
=
(
[M
(1=2)
]
2
ij
f
2
fori6=j
[M
(1=2)
]
2
jj
f
2
fori =j
; (2.37)
where [A]
ij
=
T
i
A
j
is the ij th element of the matrix A. From (2.37), it
is easily showed that a fault happens in the j th variable can affect con-
tributions of other variables. This is the smearing effect of CDC. Correct
diagnosis using CDC is guaranteed only if
[M
(1=2)
]
2
jj
[M
(1=2)
]
2
ij
: (2.38)
Equation (2.38) however, is not always true. It is worth noting that if we
assume the data are stationary, the model is fixed, and so is M. Therefore,
when (2.38) does not hold, the CDC method completely fails, and the correct
diagnosing rate is zero.
31
2.4.6 Diagnosis Using Reconstruction-based Contributions
Substituting the fault in (5.62) into (5.48) we get
RBC
Index(x)
i
=
(
[M]
2
ij
[M]
1
ii
f
2
fori6=j
[M]
jj
f
2
fori =j
(2.39)
Again, the smearing effect of the RBC is easily seen. Correct diagnosis using
RBC is guaranteed only if
[M]
jj
[M]
2
ij
[M]
1
ii
(2.40)
Since M is positive semi-definite matrices, (2.40) always holds. The prove
is given next.
Proof. [M]
jj
[M]
2
ij
[M]
1
ii
Since M is positive semi-definite, we have
[
i
j
]
T
M [
i
j
] =
2
4
[M]
ii
[M]
ij
[M]
ij
[M]
jj
3
5
0 (2.41)
which implies
det
2
4
[M]
ii
[M]
ij
[M]
ij
[M]
jj
3
5
= [M]
ii
[M]
jj
[M]
2
ij
0 (2.42)
wheredet [A] is the determinant of the matrix A. Equation (2.40) is easily
proved by solving (2.42) for [M]
jj
.
32
2.5 Summary
In this chapter, we present the PLS algorithm in details, including
PLS objective, process modeling, and process monitoring. Definitions and
control limits of the two monitoring indices, theT
2
index and theQ index,
are described. Finally, we give the definitions of the two most popular con-
tribution methods for fault diagnosis, namely the complete decomposition
contribution (CDC) and the reconstruction-based contribution (RBC). Their
diagnosability is analyzed with the conclusion that the RBC method guar-
antees correct fault daignosis for the simplest case of a sufficiently large
sensor fault, but the CDC method does not.
33
Chapter 3
Inferential Sensor for Basic Sediments and Water
in Gas and Oil Production
3.1 Introduction
Basic Sediments and water (BS&W), also known as bottom solids and
water, is a technical specification of certain impurities in crude oil. As men-
tioned in Chapter 1, the crude oil usually contain some amount of water and
suspended solids from the reservoir formation when extracted. The water
content can vary greatly among different fields, and it can reach very high
values when oil extraction is enhanced using water flooding technology.
The bulk of the water and sediment is usually removed at the production
site to minimize the quantity that unavoidably needs to be transported fur-
ther. BS&W measures the residual content of these unwanted impurities.
There are several reasons for dehydrating crude oil. The most ob-
vious reason is to meet the product specifications given by the purchasers.
One other reason is to increase its selling price. Crude oil is bought and
sold on a
API gravity basis. High-gravity oils command higher prices,
and water content lowers its
API gravity. Another reason is to minimize
shipping costs by avoiding transportation of valueless water. Finally, the
34
mineral salts present in oilfield water can corrode production equipment,
storage tanks, and pipelines. Therefore, reducing the BS&W content saves
equipment maintenance costs in the long term.
Monitoring the BS&W content is typically perform at the production
site in the lease automatic custody transfer (LACT) unit to prevent excessive
amounts entering the pipeline system. How much the pipeline company is
willing to accept into its system depends on geographic location, market
competitiveness, and its ability to handle BS&W in the system [75].
BS&W online analyzers are usually installed in a vertical pipe
with downward flow to provide the best mixing which, if the water
content is present in reasonable amounts, will ensure the existence of an
oil-continuous emulsion and uniformly distributed water particles. There
are many types of BS&W online analyzer, and the three most commonly
used are capacitance, density, and energy-absorption analyzers [75]. The
price of one analyzer can be as high as $50,000.
The capacitance analyzer works by measuring the dielectric permit-
tivity of the targeted fluid and compare it with the dielectric permittivity of
either pure water or pure oil. The output from the capacitance probe will
be proportional to the water content. However, the capacitance analyzer
assumes the dielectric strengths of the oil and water remain relatively con-
stant, and it’s performance can be affected by the salinity, emulsion charac-
teristics, temperature, presence of free gas, and paraffin deposition. There-
35
fore, frequent calibrations using field tests are usually required.
The density analyzer takes the advantage of the fact that the den-
sities of oil and water are generally very different. Therefore, measuring
the density of the emulsion can yield the amounts of water content. In the
case of heavy crudes, where densities of the crude and water are similar, the
density analyzer does not work well. Temperature, on the other hand, can
greatly affect fluid density. Thus, this type of analyzers must either have
temperature compensation or be installed on a steady-temperature stream.
Other factors that can disturb the analyzer performance including presence
of free gas, vibration, and paraffin deposition.
The energy absorption analyzer measures the electromagnetic
energy-absorption rate. The difference in the energy-absorption rates be-
tween water and hydrocarbons is utilized to determine the water content.
This type of analyzers is less affected by paraffin deposition, but can still be
significantly impacted by the presence of free gas and temperature.
Field testing is performed whenever recalibration of the above an-
alyzes is required, and its measuring procedure is determined by the test
method ASTM D-4007 [8]. The procedure involves mixing with solvent,
preheating, centrifugation, and reading the BS&W content. It could take
more than 10 minutes for just one single measurement.
Besides online analyzers and field testings, another way to monitor
the BS&W content is by utilizing an inferential sensor. It is relatively cheap
36
to develop, and does not require much maintenance after it is employed.
In this chapter, the proposed BS&W inferential sensor is a data-driven one,
which is based on partial least-squares (PLS) regression. PLS first extracts
the statistical correlations among process variables to build an inferential
model. It then apply this model to make predictions in the future.
The rest of the chapter is organized as follows. In Section 3.2, partial
least squares algorithm is briefly reviewed. In Section 3.3, we apply our
inferential sensor on a crude production plant. Finally, we conclude this
chapter in Section 3.4.
3.2 Partial Least-squares Algorithm
Partial least-squares (PLS) regression is a method for statistical mod-
eling, and it is able to give a robust solution in the case of collinear or corre-
lated input variables, where the ordinary least squares regression gives rise
to the ill-conditioned problem. It is carried out in two steps. First the sub-
spaces within the input matrix X and the output matrix Y that maximize
their covariance are extracted by an outer model. Then the relation between
these two subspaces is captured by an inner model. The underlying struc-
ture of the PLS is the follows,
X = TP
T
+ E (3.1)
37
Y = TQ
T
+ F (3.2)
How to obtained this structure and utilize it to make prediction can be
found in Chapter 2. The details of the PLS algorithm is not repeated here to
avoid too much repetition.
3.3 BS&W for Crude Oil
In this section we apply our BS&W inferential sensor on a crude pro-
duction process. In this process, a free-water knockout (FWKO) is used
to remove the free water. A FWKO is a large separation vessel that sepa-
rates reservoir fluids into gas, oil, and water simply by gravitational force.
Crude oil leaving the FWKO can still contain as much as 40% emulsified wa-
ter. Four wash tanks are used to further reducing the crude oil BS&W con-
tent. Wash tanks are large holding vessels usually operated with produced
water filling the bottom one-third and crude oil the top two-thirds. The
emulsion feed is introduced below the oil-water interface using a spreader.
This ”washing” action facilitates coalescence of water droplets, and thus
removes it from the oil stream.
Fig. 3.1 shows the simplified process flow diagram around the four
wash tanks. The oil-water emulsion feed heads into the four wash tanks
in parallel. The removed water leaves from the bottom and merges into a
single stream, which then goes to the water plant for further processing.
Figure 3.1: Process flow diagram around the wash tanks.
The purified crude oil leaves from the top of the wash tanks and merges
into one stream as well. If the BS&W content is within the specification, the
crude oil is stored in the lease automatic custody transfer (LACT) tank and
ready to ship. However, if the BS&W is not within the specification, then it
is stored in the reject tank. Three pipelines with different sizes are available
for shipping. Each pipeline has its own BS&W online analyzer installed.
Data for around 25 days were collected from the process at a one-minute
sampling interval. Each data point consists of 9 variables, which are listed
in Table 3.1.
Table 3.1: Process variables used in the inferential sensor
variable no.  variable name                 unit
1             BS&W 4"                       %
2             BS&W 8"                       %
3             T-21 wash tank pressure       bar
4             T-22 wash tank temperature    °C
5             T-22 wash tank pressure       bar
6             T-23 wash tank temperature    °C
7             T-23 wash tank pressure       bar
8             T-24 wash tank temperature    °C
9             T-24 wash tank pressure       bar
Notice that readings from the BS&W meter on the 6" pipe are not used in
the modeling because the meter was not working properly during the period
in which the data were collected. The T-21 wash tank temperature is
also not used for the same reason. The inputs of our inferential sensor are
variables 3 to 9, and the output is the average of variables 1 and 2. To
prevent overfitting, the data are separated into a training set and a testing
set. Since we have plenty of samples, only around 30% of the data is used as
the training set, and the other 70% is used as the testing set.
From the cross-validation result shown in Fig. 3.2, the optimal
number of PLS components is chosen to be 6. The BS&W comparison is
shown in Fig. 3.3. Notice that only around 4 days of data are shown here for
clarity. The resulting average errors for the training set and the testing set
are 4.65% and 7.98%, respectively.
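The component selection by cross-validated PRESS can be prototyped as in the hedged sketch below (scikit-learn assumed; press_curve, X_train, and y_train are hypothetical names, and the five-fold split is an arbitrary choice, not the scheme used for Fig. 3.2):

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import KFold

    def press_curve(X, y, max_components=7, n_splits=5):
        """Cumulative PRESS for 1..max_components PLS components."""
        press = []
        for a in range(1, max_components + 1):
            sse = 0.0
            for train_idx, val_idx in KFold(n_splits=n_splits, shuffle=False).split(X):
                model = PLSRegression(n_components=a).fit(X[train_idx], y[train_idx])
                resid = y[val_idx] - model.predict(X[val_idx]).ravel()
                sse += float(np.sum(resid ** 2))
            press.append(sse)
        return np.array(press)

    # the number of components with the smallest PRESS is selected, e.g.
    # press = press_curve(X_train, y_train)
    # n_opt = int(np.argmin(press)) + 1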
Figure 3.2: Predicted error sum of squares by cross-validation.
Figure 3.3: Comparison between analyzer BS&W and inferential BS&W.
3.4 Summary
In this chapter, we present our proposed BS&W inferential sensor for
gas and oil applications. This inferential sensor is simply a PLS regression
model with the wash tank temperatures and pressures as inputs and the
BS&W measurement as the output. The inferential sensor is applied to a
crude production process with average errors of less than 8%.
Chapter 4
Vapor Pressure Inferential Sensors for Gas and Oil Production
4.1 Introduction
In this chapter, we develop vapor pressure inferential sensors for the
petroleum industry. In general, vapor pressure is defined as the pressure exerted
by a vapor in thermodynamic equilibrium with its condensed phases
at a given temperature in a closed system. The vapor pressure is an indication
of a liquid's evaporation rate, and the term "volatile" is used to refer to a
substance with a high vapor pressure at normal temperatures.
There are, however, two different vapor pressures in the gas and oil
industry. The vapor pressure previously defined refers to the true vapor
pressure (TVP). The other vapor pressure, denoted the Reid vapor pressure
(RVP), is defined as the absolute vapor pressure exerted by the condensed
phases at 100 °F, and its measuring procedure is specified by the test
method ASTM D-323 [7]. Reid vapor pressure is more commonly used, and
therefore it will be introduced next.
Unlike TVP, which is a function of temperature, RVP is defined at
100 °F. As a result, RVP depends only on the composition of the hydrocarbon
mixture, which makes it a simple indication of the amount of light
components dissolved in either the crude oil or the gas condensate. From the
perspective of crude producers, it is desirable to retain light components
in the oil phase since liquid hydrocarbons are much more valuable than gaseous
hydrocarbons. However, this is not the case for crude buyers. Light components
tend to evaporate during downstream processing and do not end
up in their products. Also, too many light components dissolved in the
oil can lead to safety concerns. For these reasons, crude producers
need to measure and monitor RVP values in order to meet the specification
and to optimize profits.
Measuring RVP in a laboratory is not an easy job. It involves precooling
of the sample and apparatus, sample transfer, assembly of the apparatus, time
to reach thermal equilibrium, measurement of vapor pressure, and lastly,
cleaning of the apparatus for the next test. Moreover, during the entire measurement
procedure, operators need to constantly look for any leakage. If a
leak is observed, the entire procedure needs to start over. It can take more
than 20 minutes for just one single RVP measurement.
Online analyzers, on the other hand, can help eliminate human errors
and shorten the sampling time to 1-10 minutes depending on the analyzer
capability. Notwithstanding their benefits, analyzers also have their
downsides. RVP analyzers are expensive and can cost up to $200,000. Also,
they often require frequent maintenance, which can cost around $50,000 per
year.
TVP cannot be used as an indication of crude oil quality because hydrocarbon
mixtures with the same composition can have different TVPs if the
temperatures are different. However, it is still an important property to ensure
safety in crude oil processing, transport, and storage. One of the major
risks when transporting crude oil is pump cavitation [54]. In a pumping
system, the crude oil is accelerated, generating areas of low pressure. When
the surrounding pressure is lower than the TVP of the crude oil, bubbles
form, grow, and then collapse, generating high pressures and high temperatures
at the bubble surface, which can damage the transportation system or the
pump. TVP is also widely used in storage tank vapor loss calculations
[53]. Crude oil storage tanks are often designed according to government
emission regulations, and engineers often need TVP to develop emission
estimates.
Although simpler measuring procedures have been developed [39],
an inferential sensor is still a better solution to the above problems. It is
relatively cheap to develop, and does not require much maintenance after it is
deployed. Also, the sampling time is drastically reduced. There are some
vapor pressure inferential sensors in the gasoline blending industry with
known vapor pressures of the feedstocks [68, 23], but not much has been
reported for gas and oil production applications. The only one that can be
found in the literature is proposed by Al-Thamis [2], who utilized the
composition results from gas chromatography to calculate the vapor pressure.
The inferential sensor we propose in this chapter does not need
gas chromatography or any other composition analysis, which makes it a better choice.
The inferential sensor in this chapter is a hybrid inferential sensor
which consists of a model-driven part and a data-driven part. The model-driven
part is based on the Korsten vapor pressure equation [36]. This
equation, nevertheless, can only provide a rough estimate when applied
to a hydrocarbon mixture. To overcome this shortcoming, partial least-squares
(PLS) regression is used to model the residual, which constitutes the data-driven
part of the inferential sensor.
The PLS model is built based on the statistical correlation among variables.
If a sensor fails or provides a wrong signal, or if the operation mode
is changed, the PLS model is no longer accurate, because the correlation
among variables is no longer the same. To ensure the reliability of the
PLS model, process monitoring methods based on PLS can be used to detect
any fault that might occur.
Once a fault has been detected, it is important to diagnose the fault
and to make any necessary correction as soon as possible. Fault diagnosis
using contribution methods has been adopted by many researchers and
applications [26, 14, 73, 49], and such methods are also used in this chapter.
The rest of the chapter is organized as follows. In Section 4.2, the
Korsten vapor pressure equation is described in detail. In Section 4.3, we
explain the structure of our hybrid inferential sensor. Partial least squares,
monitoring indices, and contribution methods are briefly mentioned in this
section as well. In Section 4.4, we apply our inferential sensor to gas condensate.
Then in Section 4.5, we apply it to crude oil. Finally, we conclude
this chapter in Section 4.6.
4.2 The Vapor Pressure Equation
Most of the vapor pressure equations proposed in the literature find
their origin in the Clapeyron equation

d ln p / d(1/T) = -Δ_V H / (R Δ_V Z)    (4.1)

where p is the pressure, T is the temperature, and R represents the gas constant.
Δ_V Z and Δ_V H are the changes in the compressibility factor and enthalpy
associated with vaporization. Assuming a constant ratio (Δ_V H)/(Δ_V Z), a
simpler vapor pressure equation with reasonable accuracy within a low-pressure
range can be derived as follows,

ln p = A + B/T.    (4.2)
To improve the quality of vapor pressure representations, several modifications
of (4.2) have been proposed, including the one proposed by Cox [16],

ln p = A + B/T + CT + DT^2,    (4.3)

and the famous Antoine equation,

ln p = A + B/(C + T).    (4.4)
Cox is also the first person to find that the vapor pressure curves can be
represented by straight lines that seem to converge to a common point if the
temperature axis is plotted as a proper function of T.
Korsten [36] found that (1/T) in the previous vapor pressure equations is not
the right functionality for hydrocarbon components. Instead, it should be
(1/T^1.3). He then proposed the following vapor pressure equation for hydrocarbons,

ln p = A + B/T^1.3.    (4.5)
In the original paper, the accuracy of this equation was demonstrated by
plotting ln p against 1/T^1.3 for different chemical components, as shown in
Fig. 4.1.
The vapor pressure curves for n-paraffins appear as straight
lines in Fig. 4.1. Similar figures can be drawn for n-alkylbenzenes,
n-alkylcyclopentanes, n-alkylcyclohexanes, monoolefins, alkanethiols,
chloroalkanes, and other miscellaneous components. Also shown in Fig.
4.1, the vapor pressure curves of all the components merge together at a
common point.
Figure 4.1: Vapor pressure curves of n-paraffins.
The pressure at the common point, however, is an
imaginary vapor pressure since this common point is above the critical
points where the vapor-liquid boundaries do not exist. Nevertheless, it can
be utilized to eliminate one of the two parameters in (4.5), and the resulting
equation is

ln p = ln p* + B(1/T^1.3 - 1/T*^1.3),    (4.6)

where B is a constant parameter that represents a characteristic of each
component. p* and T* are the pressure and temperature at the common
point (T* = 1994.49 K, p* = 1867 bar) for all hydrocarbons
and most other chemical species investigated. The parameter B in (4.6) can
be determined by using one measurement of equilibrium temperature and
pressure. Once the parameter B is determined, the parameter A in (4.5) can
be solved, and the vapor pressure curve is obtained.
Unlike RVP, which is defined at 100 °F, TVP is a function of temperature.
But often, there is no temperature sensor at the exact point of the
process where the TVP measurement is required. We could improvise by using
adjacent temperature sensors. However, in the above vapor pressure
equations, a logarithmic scale is applied to the pressure, and therefore the
accuracy of the model is highly sensitive to the temperature measurement.
Depending on the process characteristics and the reliability of the sensors,
the estimation error from (4.6) can be as much as 100%.
The inferential sensor will be more robust if this large residual can be further attenuated.
Inspired by the Antoine equation, we modify (4.5) for this purpose, and
the resulting equation is the following

ln p = A + B/(T + C)^1.3    (4.7)

where the new parameter C accounts for the temperature correction, and
is solved by minimizing the mean squared error. Solving (4.7) for C, we
obtain

C = (B/(ln p - A))^(1/1.3) - T.    (4.8)

By utilizing the available historical data, C is simply determined by taking
the average of all the C's.
Substituting the appropriate temperature into (4.5) or (4.7) allows
the desired vapor pressure to be obtained. However, as will be discussed
in the next section, the Korsten equation can only provide a rough estimate of
the vapor pressure when dealing with a hydrocarbon mixture. The resulting
residual is modeled by PLS regression against other process operating conditions.
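To make the model-driven part concrete, the following Python sketch (a minimal illustration only; the functions korsten_B and korsten_vp and the sample equilibrium values 350 K / 2.0 bar are hypothetical) determines B in (4.6) from a single equilibrium measurement and then evaluates the estimated vapor pressure at another temperature:

    import numpy as np

    # Common point for hydrocarbons reported by Korsten: T* = 1994.49 K, p* = 1867 bar
    T_STAR = 1994.49   # K
    P_STAR = 1867.0    # bar

    def korsten_B(T_eq, p_eq):
        """Solve (4.6) for B using one equilibrium temperature/pressure pair."""
        return np.log(p_eq / P_STAR) / (1.0 / T_eq**1.3 - 1.0 / T_STAR**1.3)

    def korsten_vp(T, B):
        """Estimated vapor pressure at temperature T (in K) from (4.6)."""
        return P_STAR * np.exp(B * (1.0 / T**1.3 - 1.0 / T_STAR**1.3))

    # Example with made-up numbers: one equilibrium point at 350 K and 2.0 bar
    B = korsten_B(350.0, 2.0)
    print(korsten_vp(310.0, B))   # rough vapor pressure estimate at 310 K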
4.3 Hybrid Inferential Model
The proposed hybrid inferential sensor is described in this section.
With one measurement of equilibrium pressure and temperature from the
process, the parameter B in (4.6) is calculated. The estimated vapor pres-
sures at various temperatures can then be obtained. The accuracy of this es-
timated vapor pressure, however, is not guaranteed for the following three
reasons
1. The expression in (4.6) is designed for pure components only and does
not guarantee the generalization to mixtures;
2. Disturbances are usually pervasive throughout a real continuous pro-
cess. As a result, the temperature and pressure measurement required
by the Korsten equation might not be taken at the equilibrium state;
and
3. The model is highly sensitive to temperature. As a consequence, even
a slight measurement deviation will induce a large estimation error.
To deal with these systematic errors, a data-driven supplementary
model is built to model the residuals. This data-driven model is a PLS
regression model. The inputs are the process operating conditions plus the
estimated vapor pressure, and the output is the modeling residual from the
Korsten equation. In the modeling step, if a temperature sensor is not available
at the exact location where TVP measurements are needed, an adjacent
temperature sensor and (4.7) are used, where the new parameter
C accounts for the temperature correction. The parameter C is determined
by utilizing the available historical data.
Figure 4.2: A schematic illustration of the hybrid inferential sensor building
method.
The above procedure is applied to build the hybrid inferential sensor.
A schematic illustration of the building method is given in Fig. 4.2.
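The building method of Fig. 4.2 can be summarized in the short Python sketch below (a hedged, simplified implementation, not the thesis's own code; korsten_B and korsten_vp are the helpers from the earlier sketch, scikit-learn is assumed, and T_eq, p_eq, X_ops, vp_measured, and n_components are hypothetical training arrays and settings; fitting B from a median equilibrium point is a simplification of the single-measurement procedure described above):

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    def build_hybrid_sensor(T_eq, p_eq, X_ops, vp_measured, n_components):
        """Korsten estimate plus a PLS model of the residual (one possible implementation)."""
        B = korsten_B(np.median(T_eq), np.median(p_eq))   # parameter B from an equilibrium point
        vp_korsten = korsten_vp(T_eq, B)                  # model-driven estimate
        residual = vp_measured - vp_korsten               # Y for the data-driven part
        X = np.column_stack([X_ops, vp_korsten])          # operating conditions + estimate
        pls = PLSRegression(n_components=n_components).fit(X, residual)
        return B, pls

    def predict_hybrid(B, pls, T_eq_new, X_ops_new):
        """Final inferred vapor pressure = Korsten estimate + PLS-modelled residual."""
        vp_korsten = korsten_vp(T_eq_new, B)
        X_new = np.column_stack([X_ops_new, vp_korsten])
        return vp_korsten + pls.predict(X_new).ravel()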
4.3.1 Partial Least Squares
Partial least squares (PLS) is a statistical modeling method. It extracts
the statistical correlations between the model inputs and the model outputs.
The objective is to find two subspaces, one in the input space and the other
in the output space, that maximize their covariance. Partial least squares
performs the following decomposition,

X = Σ_{i=1}^{l} t_i p_i^T + E = TP^T + E
Y = Σ_{i=1}^{l} t_i q_i^T + F = TQ^T + F.    (4.9)

Once the PLS model is built, it can be used to predict unknown outputs from
known inputs. The detailed algorithm is presented in Chapter 2. Readers can
also refer to the tutorial given by Geladi and Kowalski [24].
4.3.2 Statistical Process Monitoring
While the analyzers rely on their mechanical and physical reliability
to ensure the measurement quality, the inferential sensors rely on the process
normality. The inferential sensors cannot provide accurate predictions
if a fault happens in the process, causing changes in the statistical correlations
among variables. Statistical process monitoring provides a way to automatically
detect process abnormality, and therefore it can serve as a complement
to the inferential sensors.
The traditional PLS process monitoring involves two monitoring indices,
the T^2 index and the Q index. With the assumption that the data
follow a multivariate normal distribution, their control limits can be calculated.
A fault is detected if one of the monitoring indices goes beyond
its control limit. The detailed calculation of the monitoring indices and their
control limits is given in Chapter 2.
4.3.3 Fault Diagnosis Using Contribution Methods
Diagnosing a fault, especially a process fault, is not always easy. In
an industrial process, a process fault can propagate through different units
due to process interactions and closed-loop control actions. Contribution
methods can provide information on the possible root cause and facilitate fault
identification. The basic idea behind the contribution methods is that faulty
variables have high contributions to the monitoring detection index. The
two most commonly used contribution methods, namely the complete decomposition
contribution (CDC) and the reconstruction-based contribution
(RBC), are explained in Chapter 2, and they are also applied in a later
section.
4.4 Vapor Pressure for Gas Condensate
We first apply the above inferential sensor to a gas production
plant. Gas condensate from a gas production plant can be recovered and
sold. Nevertheless, dissolved light components must be removed in order
to meet its RVP specification. A debutanizer is used for this purpose.
Fig. 4.3 shows the simplified process flow diagram around the debutanizer.
The debutanizer in this plant is a typical distillation column
with a condenser, a reflux drum, and a reboiler. It separates the feed stream
into butane, pentane, and the condensate. Butane leaves as the outlet stream
of the reflux drum. Pentane leaves from one of the middle trays of the
debutanizer. The condensate leaves from the bottom of the debutanizer. A
rundown cooler is used to cool the condensate before its RVP is measured
by the online analyzer.
Five days of data at a 9-minute sampling interval were collected.
Each data point consists of seventeen process variables around the debutanizer,
which are listed in Table 4.1. Assuming the debutanizer operates
according to its design conditions, the fluids within each tray should be
close to equilibrium states.
Figure 4.3: Process flow diagram around the debutanizer.
Table 4.1: Process variables used in the inferential sensor
variable no.  variable name                               unit
1             Reid vapor pressure                         mmHg
2             Debutanizer bottom temperature              °C
3             Debutanizer cooler downstream temperature   °C
4             Debutanizer cooler downstream pressure      bar
5             Debutanizer cooler downstream flow          m^3/h
6             RVP analyzer downstream temperature         °C
7             Reboiler temperature                        °C
8             Debutanizer bottom pressure                 bar
9             Debutanizer feed                            m^3/h
10            Pentane outlet flow                         m^3/h
11            Debutanizer top temperature                 °C
12            Butane outlet flow                          m^3/h
13            Condenser level                             %
14            Condenser pressure                          bar
15            Condenser temperature                       °C
16            Reboiler hot oil supply                     m^3/h
Therefore, the debutanizer bottom temperature
and pressure, namely variables 2 and 8, are used as the equilibrium
measurement required by (4.6). The residuals from (4.6) are modeled
by PLS regression using variables 2 to 16 plus the estimated RVP calculated
by (4.6). To prevent overfitting, the data are separated
into a training set and a testing set. The training set, which consists of
70% of the data, is used to build the inferential sensor, and the testing set,
which consists of 30% of the data, is used to test the sensor performance.
From the cross-validation result shown in Fig. 4.4, the optimal number
of PLS components is chosen to be 8.
The comparison between the estimated RVP and the measured RVP is shown
in Fig. 4.5. In Fig. 4.5, the values on the y-axis are purposely not shown in
order to conceal the original data, and this is done throughout this chapter.
There are two lines in this figure. The black line denotes the RVP measured
by the online analyzer, and the red line denotes the RVP estimated by (4.6).
Notice that even though both lines follow a similar trend, the red one is
roughly 54 minutes ahead in time. This is because we are using the operating
conditions of the debutanizer to calculate the estimated RVP, and the online
analyzer, which is used to generate the measured RVP, is located downstream
of the debutanizer. This gives another great advantage of using our
inferential sensor, namely that it is able to detect abnormal variations of RVP
earlier than the online analyzer.
Fig. 4.6 shows the RVP comparison after we shift the trend of the Korsten
RVP 54 minutes back in time and apply our hybrid inferential sensor.
Again, the black line denotes the RVP measured by the online analyzer, and the
red line denotes the RVP calculated by (4.6). The new green line represents
the RVP calculated by the proposed hybrid inferential sensor. The average
error of the training set is 1.51%, and the average error of the testing set is 1.47%.
4.5 Vapor Pressure for Crude Oil
In this section, we apply our inferential sensor to the crude oil vapor
pressure, including both the RVP and the TVP. Then the process monitoring
and fault diagnosis methods are applied.
Figure 4.4: Predicted error sum of squares by cross-validation.
Figure 4.5: Comparison between Korsten RVP and Analyzer RVP.
Figure 4.6: Comparison between Korsten RVP and Analyzer RVP.
Figure 4.7: Process flow diagram around the stabilizer.
As mentioned in the introduction, dissolved light components must be removed
from the crude oil in order to meet its vapor pressure specification. In a crude production process,
a stabilizer is used to serve this purpose.
Fig. 4.7 shows the simplified process flow diagram around the stabilizer.
The stabilizer is the last step to remove extra light components from
the crude oil before shipping or storage. Gas leaves from the top of the
stabilizer, and the oil leaves from the bottom. There are two reboilers for
this stabilizer: reboiler 2 is in operation and reboiler 1 is the spare. A rundown
cooler is used to cool the crude. The vapor pressure online analyzer
is located downstream of the rundown cooler to monitor both the RVP and
the TVP.
Data for around 60 days were collected from an oil and gas upstream
process at a 20-minute sampling interval. Each data point consists of 14
variables, which are listed in Table 4.2.
Table 4.2: Process variables used in the inferential sensor
variable no.  variable name                          unit
1             Reid vapor pressure                    mmHg
2             True vapor pressure                    mmHg
3             Stabilizer top temperature             °C
4             Rundown cooler crude temperature       °C
5             Stabilizer bottom temperature          °C
6             Stabilizer feed temperature            °C
7             Reboiler 2 vapor temperature           °C
8             Stabilizer feed flow                   °C
9             Stabilizer bottom level control valve  m^3/h
10            Steam to reboiler                      %
11            Stabilizer overhead pressure           kg/h
12            Stabilizer bottom pressure             bar
13            Stabilizer pressure difference         bar
14            Stabilizer bottom level                bar
Assuming the fluids within each tray are close to equilibrium states, the
stabilizer bottom temperature and pressure, namely variables 5 and 12, are
chosen as the equilibrium measurement required by (4.6) or (4.7).
Again, to prevent overfitting, the data are separated into a training set
and a testing set. Since we have plenty of samples, only 50% of the data is used
as the training set, and the other 50% is used as the testing set.
4.5.1 The RVP Inferential Sensor
In this subsection, we focus on the RVP inferential sensor. This inferential
sensor consists of two parts. In the first part, (4.6) is used to
calculate the estimated RVP, and the result is shown in Fig. 4.8 (a). Notice
that only around 4 days of data are shown here for clarity. Half of them
are from the training set, and the other half are from the testing set. While
the estimated RVP from (4.6) is close to the measured one, there still exists
some discrepancy. This discrepancy is attenuated after introducing the PLS
regression as the second part. The estimated RVP plus all other operating
conditions are used in the PLS modeling. The reboiler 1 temperature is not
used because it is the spare reboiler and was not in operation at the time
the training data set was collected.
From the cross-validation result shown in Fig. 4.9, the optimal
number of PLS components is chosen to be 8. The residual comparison is
shown in Fig. 4.8 (b). After combining the two parts, the
final inferred RVP is obtained, and the result is shown in Fig. 4.8 (c). The
resulting average errors for the training set and the testing set are 2.67% and 3.12%,
respectively.
4.5.2 The TVP Inferential Sensor
As mentioned previously, the TVP of crude oil is also of interest.
Building an inferential sensor for the TVP is basically the same as building
an inferential sensor for the RVP with one exception, namely that the TVP is
not defined solely at 100 °F; it is a function of temperature. Ideally, we need
to replace 100 °F with the crude temperature at the exact place where the
vapor pressure online analyzer is located.
Figure 4.8: Hybrid inferential sensor for RVP prediction. (a) Measured RVP
and estimated RVP from (4.6); (b) actual residual and PLS-modelled residual;
(c) measured RVP and estimated RVP from the inferential sensor.
Figure 4.9: Predicted error sum of squares by cross-validation.
In practice, we do not have access to
this temperature, and instead, the rundown cooler temperature is used. The
rundown cooler and the online analyzer are very close to each other, and therefore
their temperature difference is small. Nevertheless, this temperature difference
can still cause large estimation errors, as mentioned in Section 4.2. In
this situation, we propose to incorporate a temperature correction into the
vapor pressure equation, and (4.7) is used.
The parameter C in (4.7) accounts for the temperature correction,
and it can be calculated by (4.8) for each sample. The optimal C for our
inferential sensor is calculated as the average of all the C's in the training set.
In this particular case study, the optimal value of C is equal to -25.55. The
estimated TVP from (4.7) is shown in Fig. 4.10 (a). Again, we use PLS to
model the residual, and the number of PLS components is chosen to be 8
based on the cross-validation result. The residual comparison is shown in
Fig. 4.10 (b). After combining the two parts, the final inferred TVP
is obtained, and the result is shown in Fig. 4.10 (c). The resulting average
errors for the training set and the testing set are 3.34% and 4.79%, respectively.
4.5.3 Monitoring Methods
In this subsection, we demonstrate an example of using the PLS monitoring
methods. A 4-day period between day 43 and day 47 is selected for this
purpose. The corresponding T^2 index and Q index are shown in Fig. 4.11.
Figure 4.10: Hybrid inferential sensor for TVP prediction with a temperature
correction. (a) Measured TVP and estimated TVP from (4.7); (b) actual
residual and PLS-modelled residual; (c) measured TVP and estimated TVP
from the inferential sensor.
The horizontal blue lines are the control limits. Two faults can be observed
from both indices within this 4-day period. The first one begins at day 43 and
ends at day 44. The second fault is relatively short, and it happens during
day 46. In the next subsection, we use contribution methods to diagnose
these two faults.
4.5.4 Fault Diagnosis
Two faults have been observed, and both of them are process faults.
As mentioned in Chapter 2, the contribution method for process faults is best used
when historical information on a set of faults is available. In our case, since
the process monitoring method was never applied to this process, the historical
fault information is not available. Nevertheless, we can still use the contributions
defined for sensor faults and make some diagnosis with suitable
process knowledge.
To diagnose the first fault, a sample at day 44 is selected. Its complete
decomposition contributions (CDC) and reconstruction-based contributions
(RBC) for both indices are shown in Fig. 4.12. The CDC and RBC for the
T^2 index give similar results, which show that variable 4, namely
the rundown cooler crude temperature, has the highest contribution. In
the case of the Q index, three variables have relatively high contributions
among the others, namely variables 7, 8, and 10, which denote the reboiler 2
temperature, the stabilizer feed flow, and the steam to reboiler, respectively.
Figure 4.11: PLS Monitoring Indices.
Besides the stabilizer feed flow, variables 4, 7, and 10 are all related to the bottom of the
stabilizer. Therefore, although we cannot state the actual root cause of this
fault, we can postulate that the fault probably started in the stabilizer feed
coming from its upstream unit. It then propagated through the stabilizer
bottom, and finally further to the downstream rundown cooler. To check
the validity of the diagnosis result, a few variables are selected and plotted
in Fig. 4.13. From the figure, both the stabilizer feed flow and the reboiler
2 temperature clearly show abnormal behaviors during the time in which
fault one happened.
Similarly, to diagnose the second fault, a corresponding faulty sample
is selected. Its contributions for the indices are shown in Fig. 4.14.
From this figure, three variables appear to have relatively high contributions
among the others. They are variables 5, 7, and 10, namely the stabilizer
bottom temperature, the reboiler 2 temperature, and the steam to reboiler, respectively.
Since all three variables are related to the bottom of the stabilizer,
we can postulate that this fault probably started at the stabilizer bottom
and did not propagate to other parts of the process. The reboiler 2 temperature
shown in Fig. 4.13 clearly shows abnormal behavior during the time
in which fault two happened, while the stabilizer feed flow was normal.
This result agrees with our hypothesis.
Figure 4.12: Contributions for Fault One (CDCs and RBCs for the T^2 and Q indices).
Figure 4.13: Selected variables (stabilizer feed flow, reboiler 2 temperature,
and reboiler 1 temperature) to demonstrate the effectiveness of the diagnosing
result.
Figure 4.14: Contributions for Fault Two (CDCs and RBCs for the T^2 and Q indices).
One interesting thing to note in Fig. 4.13 is that the temperature of the spare
reboiler, namely the reboiler 1 temperature, also increased dramatically during
the time in which fault two happened. Further diagnosis of how and why
this occurred, with proper process knowledge, should reveal the correct root
cause of this fault.
4.6 Summary
In this chapter, we present our proposed vapor pressure hybrid inferential
sensor for gas and oil applications. This inferential sensor is applied
to both the gas condensate and the crude oil with average errors of less
than 5%, as shown in Table 4.3. At the end, we demonstrate the effectiveness
of the PLS-based process monitoring and fault diagnosis methods. Two
faults have been detected. Although their exact root causes are still unclear,
contribution methods provide helpful insights and can be used to facilitate
further diagnosis when additional process knowledge is available.
Table 4.3: Average errors for the inferential sensors
Gas condensate RVP Crude oil RVP Crude oil TVP
Training set 1.51 % 2.67 % 3.34 %
Testing set 1.47 % 3.12 % 4.79 %
Chapter 5
Concurrent PLS-based Contribution for Fault Diagnosis
5.1 Introduction
Over the last two decades, multivariate statistical methods such as
principal component analysis (PCA) and projection to latent structures
(PLS) have been successfully applied to the monitoring of industrial
processes [50, 77, 57]. These methods build statistical models from normal
operation data, and they partition the measurement into a number of
subspaces. Each subspace is monitored by an statistical index. A fault is
detected when a new measurement breaks the normal statistical correlation
causing one of the monitoring indices to go beyond its control limit.
Both PCA and PLS partition the process measurements X into a
principal subspace and a residual subspace, and use the T^2 and Q indices
to monitor them, respectively. However, these two types of monitoring
methods have different interpretations. PCA-based methods are effective
in monitoring all the variations and abnormal situations only in the process
measurements. The T^2 index monitors the principal variations, and the
Q index monitors the residual variations. Often, large amounts of process
measurements are collected and stored by the process control systems
at high sampling rates, whereas quality variables Y are measured at much
slower rates and typically come with a significant time delay. If the quality
measurements are expensive or difficult to obtain, PCA-based methods can
be a better solution. One obvious drawback of PCA-based methods is that
no information from the quality variables is incorporated, and therefore they
cannot reveal whether or not a fault detected in the process measurements is
relevant to the quality variables.
On the other hand, the PLS model is used to extract statistical correlations
between the process variables and the quality variables, and therefore it
can be used to monitor both X and Y. Traditional PLS-based monitoring
methods use the T^2 index to monitor the process variations that are most
relevant to the quality variables, and use the Q index to monitor the process
variations that are not related to the quality variables. This monitoring
method for PLS has two problems. First, the principal subspace in PLS,
which is thought to reflect the major variations related to the quality measurements
Y, still contains variation orthogonal to Y. Second, PLS does not
extract variations of the process measurements in a descending order, and
therefore the residual subspace can still contain large variations, making it
inappropriate to be monitored by the Q index.
To solve the problems mentioned above, Zhou et al. [85] proposed a
total PLS (TPLS) algorithm. TPLS partitions the process measurements X
into four different subspaces to detect output-relevant faults and output-irrelevant
faults. Four monitoring indices are developed to monitor each
of the four subspaces. TPLS-based monitoring methods amend some of
the disadvantages of the traditional PLS methods, but still have two drawbacks,
as pointed out by Qin and Zheng [65]. The first drawback is that
the output-relevant monitoring index only monitors quality variations that
are predictable from the process data. In some cases, PLS performs poorly
in predicting the quality variables due to a lack of excitation in the process
measurements and the existence of large unmeasured disturbances. Consequently,
a large portion of the unpredictable quality variations is unmonitored
by the TPLS-based monitoring method. The second drawback is that
instead of one subspace, TPLS unnecessarily uses two subspaces to monitor
the input-relevant variations.
To resolve the above issues, Qin and Zheng [65] proposed a concurrent
PLS (CPLS) algorithm. CPLS partitions the process measurements X
into three different subspaces to detect output-relevant faults and output-irrelevant
faults. These three subspaces are the covariation subspace (CVS),
the input-principal subspace (IPS), and the input-residual subspace (IRS).
CPLS also partitions the unpredictable quality measurements Y into two
different subspaces, namely the output-principal subspace (OPS) and the
output-residual subspace (ORS). Details on these five subspaces are given
in later sections. Five monitoring indices and their corresponding control
limits are developed to monitor each of the above five subspaces.
There are two kinds of faults, namely sensor faults and process
faults. A sensor fault happens when one of the sensors is not working properly,
and therefore it produces inaccurate measurements while the process
is actually operating at normal conditions. A process fault happens when
the process is operating in an abnormal situation while all the sensors are
working properly. Once a fault is detected, it is desirable to diagnose its
cause. Many methods exist in the literature. One popular category
among them is the contribution analysis methods. For sensor faults,
contribution methods determine the contribution of each variable to the
monitoring indices. The assumption is that faulty variables should give
higher contributions compared to normal variables. For process faults, fault
directions must first be extracted to form a set of known historical faults.
Then, contribution methods calculate the contribution of each fault direction.
The one that gives the highest contribution is treated as the targeted
fault.
While the CPLS algorithm has been demonstrated to provide better
monitoring results than the traditional PLS algorithm, its fault diagnosis
and identification methods have not been developed. In this chapter, the
complete decomposition contributions (CDC) and reconstruction-based
contributions (RBC) are defined for the CPLS monitoring indices.
The remaining part of this chapter is organized as follows. Fault detection
based on PLS models is briefly reviewed in Section 5.2. The CPLS
algorithm is presented and its properties derived in Section 5.3. The CPLS
fault detection indices and their general forms are calculated in Section 5.4.
The CDCs and RBCs are defined for CPLS in Section 5.5, and their
diagnosability is analyzed in Section 5.6. Simulation case studies are presented in
Section 5.7. Finally, we conclude the chapter in Section 5.8.
5.2 PLS for Process and Quality Monitoring
Given an input matrix X ∈ R^{n×m} consisting of n samples with m process
variables, and an output matrix Y ∈ R^{n×p} with p quality variables, the
PLS algorithm first scales, and then projects X and Y to a low-dimensional
space, which is defined by a small number of latent variables (t_1, ..., t_l),
where l is the number of PLS components. The mean-centered and scaled X
and Y are decomposed as:

X = Σ_{i=1}^{l} t_i p_i^T + E = TP^T + E
Y = Σ_{i=1}^{l} t_i q_i^T + F = TQ^T + F    (5.1)
In (5.1), T = [t_1, ..., t_l] are the latent score vectors, P = [p_1, ..., p_l] and Q =
[q_1, ..., q_l] are the loading vectors for X and Y, respectively. The matrices E
and F are the corresponding residuals of X and Y. In general, the PLS decomposition
is carried out iteratively. The first latent vector t_1 is extracted
by maximizing the covariance between X and Y, and then both matrices
are deflated to form X_1 and Y_1. The second latent vector is then extracted
by maximizing the covariance between X_1 and Y_1, and the process is repeated
until enough latent components have been extracted. Intuitively, it
is desired to choose the number of PLS components l that gives the maximum
prediction power to the PLS model based on data that are excluded from
the training data, and l is usually determined by cross-validation. Although
the PLS decomposition is an iterative process, once the model is built and
the parameters stored, all score vectors can be computed directly from the original
X:
T = XR    (5.2)

and

R = W(P^T W)^{-1}    (5.3)

where the weight vectors W = [w_1, ..., w_l] are also parameters in the PLS
decomposition. They are used to calculate the scores t_i = X_i w_i. Readers
who are interested in the details of the PLS algorithm can refer to Chapter 2 or the
tutorial given by Geladi [24].
To perform process monitoring on a new data sample x, the PLS
model projects it onto a principal subspace x̂, which is thought to reflect the
major variations related to Y, and a residual subspace x̃, which is thought
to contain variation unrelated to the output Y. However, unlike the orthogonal
projections in PCA, Li et al. [43] showed that PLS induces an
oblique projection decomposition. In other words, x̂ is the projection of x
onto Span{P} along Span{R}^⊥, and x̃ is the projection of x onto Span{R}^⊥
along Span{P}.

x = x̂ + x̃
x̂ = PR^T x ∈ Span{P}
x̃ = (I - PR^T)x ∈ Span{R}^⊥    (5.4)
Early literature (e.g., MacGregor et al. [47]) suggested monitoring the
principal subspace by the T^2 index and the residual subspace by the Q index,

T^2 = t^T Λ^{-1} t ≤ (l(n^2 - 1))/(n(n - l)) F_{l, n-l; α}    (5.5)

Q = ||x̃||^2 = x^T (I - PR^T) x ≤ g χ^2_{h; α}    (5.6)

where

t = R^T x    (5.7)

and

Λ = (1/(n-1)) T^T T.    (5.8)

F_{l, n-l; α} is the F-distribution with l and n - l degrees of freedom, α is the level
of significance, and χ^2_h is the χ^2-distribution with h degrees of freedom. The
calculation of g and h can be found in Chapter 2.
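A hedged Python sketch of this monitoring computation is given below (the function pls_monitoring and its arguments are hypothetical; SciPy is assumed for the F and χ^2 quantiles, the Q index is evaluated as the squared residual norm, and the g and h parameters are obtained by the usual moment-matching approximation, which is an assumption here since the detailed derivation is deferred to Chapter 2):

    import numpy as np
    from scipy import stats

    def pls_monitoring(X_train, R, P, alpha=0.01):
        """T^2 and Q indices with control limits for a PLS model (T = X R, X ~ T P^T)."""
        n = X_train.shape[0]
        T = X_train @ R
        Lambda = (T.T @ T) / (n - 1)                      # (5.8)
        l = T.shape[1]
        t2_lim = l * (n**2 - 1) / (n * (n - l)) * stats.f.ppf(1 - alpha, l, n - l)

        E = X_train - T @ P.T                             # residual part of X
        q_train = np.sum(E**2, axis=1)
        g = np.var(q_train, ddof=1) / (2 * np.mean(q_train))   # moment matching for g, h
        h = 2 * np.mean(q_train)**2 / np.var(q_train, ddof=1)
        q_lim = g * stats.chi2.ppf(1 - alpha, h)

        def indices(x):
            t = R.T @ x                                   # (5.7)
            t2 = float(t @ np.linalg.solve(Lambda, t))    # (5.5)
            e = x - P @ t                                 # (I - P R^T) x
            q = float(e @ e)                              # squared residual norm, cf. (5.6)
            return t2, q

        return indices, t2_lim, q_lim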
5.3 Concurrent Projection to Latent Structures
Unlike PLS, the CPLS algorithm projects the input and output data
spaces concurrently onto five subspaces. Three objectives are achieved by the
CPLS model: (1) based on the result of the traditional PLS model, the scores
that are directly relevant to the predictable quality variations are extracted,
which form the covariation subspace (CVS); (2) in order to monitor the unpredictable
quality variations, they are further projected onto an output-principal
subspace (OPS) and an output-residual subspace (ORS); (3) the process
variations are further projected onto an input-principal subspace (IPS) and
an input-residual subspace (IRS) to monitor abnormal variations in these
subspaces. In this section, we first introduce the CPLS algorithm, and then we
derive some of its properties.
5.3.1 CPLS Algorithm
The CPLS algorithm for multiple input and multiple output data is
given as follows.
1. Scale X and Y to zero-mean and unit-variance. Perform PLS on X and
Y using (4.9) to give T, Q, and R. The number of PLS components l
is determined by cross-validation.

2. Perform singular value decomposition (SVD) on the "predictable output" Ŷ = TQ^T,

Ŷ = U_c D_c V_c^T = U_c Q_c^T    (5.9)

where Q_c = V_c D_c includes all l_c nonzero singular values in descending
order and the corresponding right singular vectors.

3. Set U_c = XR_c, where R_c = RQ^T V_c D_c^{-1}.

4. Form the "unpredictable output" Ỹ_c = Y - U_c Q_c^T and compare the
variance of Ỹ_c with that of Y,

R_y = var(Ỹ_c) / var(Y)    (5.10)

where var(A) is simply the sum of the squared singular values of A. If
R_y < 0.05, there is no output-principal subspace, and Ỹ_c = Ỹ is simply
the output residual. Skip step 5 and go to step 6. Otherwise, if
R_y ≥ 0.05, go to step 5.

5. Perform PCA on Ỹ_c with l_y principal components,

Ỹ_c = T_y P_y^T + T_ỹ P_ỹ^T = T_y P_y^T + Ỹ    (5.11)

to yield the output-principal scores T_y and the output residuals Ỹ.

6. Form the "output-irrelevant input" X̃_c = X - U_c R_c^† = X - U_c (R_c^T R_c)^{-1} R_c^T
and compare the variance of X̃_c with that of X,

R_x = var(X̃_c) / var(X)    (5.12)

If R_x < 0.05, there is no input-principal subspace, and X̃_c = X̃ is
simply the input residual. Skip step 7 and stop. Otherwise, if R_x ≥
0.05, go to step 7.

7. Perform PCA on X̃_c with l_x principal components,

X̃_c = T_x P_x^T + T_x̃ P_x̃^T = T_x P_x^T + X̃    (5.13)

to yield the input-principal scores T_x and the input residuals X̃.

Note that there is a small modification to the original algorithm proposed
in [65]. In step 4, the ratio between the variance of Ỹ_c and that of Y is computed.
If this ratio is small, essentially all of Y is predictable; then Ỹ_c = Ỹ
is simply the output residual, and there are no output-principal variations.
A similar modification has been made in step 6 for the input space.
Based on the CPLS algorithm, the data matrices X and Y are decomposed
as follows

X = U_c R_c^† + T_x P_x^T + X̃
Y = U_c Q_c^T + T_y P_y^T + Ỹ    (5.14)

In the above decomposition, U_c represents the scores in the CVS, T_x represents
the scores in the IPS, T_y represents the scores in the OPS, X̃ represents
the residual matrix of the input, and Ỹ represents the residual matrix of the
output. The matrices R_c ∈ R^{m×l_c}, P_x ∈ R^{m×l_x}, Q_c ∈ R^{p×l_c}, and P_y ∈ R^{p×l_y}
denote the loadings of the corresponding subspaces. Transposing a specific
row of the above matrices and writing it as a lower-case column vector
(e.g., a column vector u_c corresponds to a specific row of U_c), the CPLS
decomposition can be written in terms of a single sample as follows
x = R_c^{†T} u_c + P_x t_x + x̃
y = Q_c u_c + P_y t_y + ỹ    (5.15)

or

x̃_c = x - R_c^{†T} u_c = P_x t_x + x̃
ỹ_c = y - Q_c u_c = P_y t_y + ỹ    (5.16)

where

u_c = R_c^T x    (5.17)
t_x = P_x^T x̃_c = P_x^T x    (5.18)
t_y = P_y^T ỹ_c = P_y^T (y - Q_c u_c)    (5.19)
x̃ = (I - P_x P_x^T) x̃_c = (I - P_x P_x^T) x    (5.20)
ỹ = (I - P_y P_y^T) ỹ_c = (I - P_y P_y^T)(y - Q_c u_c).    (5.21)
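The decomposition in (5.14) can be reproduced with the short Python sketch below (a hedged, simplified implementation of the algorithm in Section 5.3.1, not the thesis's own code; scikit-learn and NumPy are assumed, X and Y are assumed to be already mean-centered and scaled, the numbers of components l, l_x, and l_y are taken as given, and the 0.05 variance-ratio checks of steps 4 and 6 are omitted):

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    def cpls(X, Y, l, l_x, l_y):
        """Concurrent PLS decomposition: returns (U_c, R_c, Q_c, T_x, P_x, T_y, P_y)."""
        # Step 1: PLS on the (pre-scaled) data to obtain T, Q, and R
        pls = PLSRegression(n_components=l, scale=False).fit(X, Y)
        T, Q, R = pls.x_scores_, pls.y_loadings_, pls.x_rotations_

        # Steps 2-3: SVD of the predictable output Y_hat = T Q^T
        Y_hat = T @ Q.T
        Uc, s, Vt = np.linalg.svd(Y_hat, full_matrices=False)
        lc = int(np.sum(s > 1e-10 * s[0]))                # number of nonzero singular values
        Dc_inv = np.diag(1.0 / s[:lc])
        Vc = Vt[:lc].T
        Qc = Vc @ np.diag(s[:lc])
        Rc = R @ Q.T @ Vc @ Dc_inv
        Uc = X @ Rc                                       # scores in the CVS

        # Steps 4-5: PCA on the unpredictable output
        Yc = Y - Uc @ Qc.T
        _, _, Vy = np.linalg.svd(Yc, full_matrices=False)
        Py = Vy[:l_y].T
        Ty = Yc @ Py

        # Steps 6-7: PCA on the output-irrelevant input; pinv(Rc) = (Rc^T Rc)^{-1} Rc^T
        Xc = X - Uc @ np.linalg.pinv(Rc)
        _, _, Vx = np.linalg.svd(Xc, full_matrices=False)
        Px = Vx[:l_x].T
        Tx = Xc @ Px

        return Uc, Rc, Qc, Tx, Px, Ty, Py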
5.3.2 CPLS Properties
The second equalities in (5.18) and (5.20) are not obvious. To prove
them, we derive some properties of CPLS here.
Lemma 5.1. X̃_c R_c = 0.
Proof. This can easily be shown by using the CPLS relations given in the
above algorithm,

X̃_c R_c = (X - U_c R_c^†) R_c = XR_c - U_c (R_c^T R_c)^{-1} R_c^T R_c = U_c - U_c = 0.    (5.22)
Lemma 5.2. P_x^T R_c^{†T} = 0 and P_x̃^T R_c^{†T} = 0.
Proof. By using (5.13) and Lemma 5.1,

X̃_c R_c = [T_x  T_x̃] [P_x  P_x̃]^T R_c = 0.    (5.23)

Since [T_x  T_x̃] is full rank, P_x^T R_c = 0 and P_x̃^T R_c = 0. Therefore P_x^T R_c^{†T} =
P_x^T R_c (R_c^T R_c)^{-1} = 0 and P_x̃^T R_c^{†T} = P_x̃^T R_c (R_c^T R_c)^{-1} = 0.
We are now ready to prove the second equalities in (5.18) and (5.20).
Lemma 5.3. t_x = P_x^T x and x̃ = (I - P_x P_x^T) x.
Proof. This can easily be proved by using Lemma 5.2.

t_x = P_x^T x̃_c = P_x^T (x - R_c^{†T} u_c) = P_x^T x - P_x^T R_c^{†T} u_c = P_x^T x    (5.24)

x̃ = (I - P_x P_x^T) x̃_c = (P_x̃ P_x̃^T)(x - R_c^{†T} u_c)
  = P_x̃ P_x̃^T x - P_x̃ P_x̃^T R_c^{†T} u_c = P_x̃ P_x̃^T x = (I - P_x P_x^T) x.    (5.25)
5.4 CPLS-based Fault Detection
From the CPLS model given in the previous section, it is straightforward
to design fault monitoring indices. Similar to the monitoring methods
of traditional PCA and PLS, T^2 indices are used in the subspaces with
large variations, such as the covariation subspace (CVS), the input-principal
subspace (IPS), and the output-principal subspace (OPS). The Q indices are used in
the subspaces with small variations, such as the input-residual subspace (IRS)
and the output-residual subspace (ORS). In this section, we present the CPLS
monitoring indices and their corresponding control limits in detail.
5.4.1 Monitoring Indices
In the CVS, the scores U_c are orthonormalized and hence each element
of u_c has zero mean and variance 1/(n-1). Therefore, this subspace
can be monitored by the following T^2 index,

T_c^2 = (n-1) u_c^T u_c = (n-1) x^T R_c R_c^T x.    (5.26)

The IPS and IRS can be monitored by the following T^2 and Q indices,

T_x^2 = t_x^T Λ_x^{-1} t_x = x^T P_x Λ_x^{-1} P_x^T x    (5.27)

Q_x = ||x̃||^2 = x^T (I - P_x P_x^T)^T (I - P_x P_x^T) x = x^T (I - P_x P_x^T) x    (5.28)

where the diagonal elements of Λ_x denote the sample variances of the input
principal components,

Λ_x = (1/(n-1)) T_x^T T_x = diag(λ_{x,1}, λ_{x,2}, ..., λ_{x,l_x}).    (5.29)

Similarly, the OPS and ORS can be monitored by the following T^2 and Q
indices,

T_y^2 = t_y^T Λ_y^{-1} t_y = ỹ_c^T P_y Λ_y^{-1} P_y^T ỹ_c = (y - Q_c u_c)^T P_y Λ_y^{-1} P_y^T (y - Q_c u_c)    (5.30)

Q_y = ||ỹ||^2 = ỹ_c^T (I - P_y P_y^T)^T (I - P_y P_y^T) ỹ_c = ỹ_c^T (I - P_y P_y^T) ỹ_c
    = (y - Q_c u_c)^T (I - P_y P_y^T)(y - Q_c u_c)    (5.31)

where the diagonal elements of Λ_y denote the sample variances of the output
principal components,

Λ_y = (1/(n-1)) T_y^T T_y = diag(λ_{y,1}, λ_{y,2}, ..., λ_{y,l_y}).    (5.32)

Note that both the OPS and ORS have similar purposes, namely to
monitor the unpredictable output variations. Therefore, an alternative combined
index, φ_y, can be used. This combined index was proposed by Yue and
Qin [83], and it is defined as follows,

φ_y = Q_y / δ_{y,α}^2 + T_y^2 / τ_{y,α}^2,    (5.33)

where τ_{y,α}^2 and δ_{y,α}^2 are the control limits for the indices T_y^2 and Q_y, respectively.
They will be introduced in the next subsection.
5.4.2 Control Limits
To perform the monitoring based on the above indices, their control
limits should be calculated from the statistics of the normal data. Since singular
value decomposition (SVD) is used to perform the CPLS decomposition,
all of the scores are orthogonal. Therefore, their corresponding control
limits can be calculated similarly to those used in PCA-based or PLS-based
monitoring. If n is large, the T^2 index and the Q index approximately follow
χ^2 distributions [11].
The control limits for each of the indices are summarized here:
1. If T_c^2 > χ^2_{l_c,α}, an output-relevant fault is detected. Here α denotes
the level of significance, and χ^2_{l_c} denotes the χ^2 distribution with l_c
degrees of freedom.

2. If Q_x > g_x χ^2_{h_x,α}, a potentially output-relevant fault is detected.
The parameters g_x and h_x are calculated as follows,

g_x = tr{[S(I - P_x P_x^T)]^2} / tr{S(I - P_x P_x^T)}    (5.34)

and

h_x = [tr{S(I - P_x P_x^T)}]^2 / tr{[S(I - P_x P_x^T)]^2},    (5.35)

where

S = (1/(n-1)) X^T X.    (5.36)

3. If T_x^2 > χ^2_{l_x,α}, an output-irrelevant but input-relevant fault is
detected.

4. If Q_y > δ_{y,α}^2 = g_y χ^2_{h_y,α}, an unpredictable output-relevant fault is detected.
The parameters g_y and h_y are calculated as follows,

g_y = tr{[S_y(I - P_y P_y^T)]^2} / tr{S_y(I - P_y P_y^T)}    (5.37)

and

h_y = [tr{S_y(I - P_y P_y^T)}]^2 / tr{[S_y(I - P_y P_y^T)]^2},    (5.38)

where

S_y = (1/(n-1)) Ỹ_c^T Ỹ_c.    (5.39)

5. If T_y^2 > τ_{y,α}^2 = χ^2_{l_y,α}, an unpredictable output-relevant fault is detected.

6. If φ_y > g_φ χ^2_{h_φ,α}, an unpredictable output-relevant fault is detected.
The parameters g_φ and h_φ are calculated as follows,

g_φ = tr{(S_y Φ_y)^2} / tr{S_y Φ_y}    (5.40)

and

h_φ = [tr{S_y Φ_y}]^2 / tr{(S_y Φ_y)^2},    (5.41)

where

Φ_y = P_y Λ_y^{-1} P_y^T / τ_{y,α}^2 + (I - P_y P_y^T) / δ_{y,α}^2.    (5.42)
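The detection logic above can be prototyped as in the following hedged Python sketch (building on the cpls sketch of Section 5.3; only the input-space indices T_c^2, T_x^2, and Q_x of (5.26)-(5.28) with their χ^2-based limits are shown for brevity, and the function and argument names are hypothetical):

    import numpy as np
    from scipy import stats

    def cpls_input_monitor(X_train, Rc, Px, alpha=0.01):
        """T_c^2, T_x^2, and Q_x of (5.26)-(5.28) with the control limits of items 1-3."""
        n = X_train.shape[0]
        lc, lx = Rc.shape[1], Px.shape[1]
        Tx = X_train @ Px
        Lx = (Tx.T @ Tx) / (n - 1)                        # Lambda_x, (5.29)
        S = (X_train.T @ X_train) / (n - 1)               # (5.36)

        # chi-square limits for the T^2 indices
        tc2_lim = stats.chi2.ppf(1 - alpha, lc)
        tx2_lim = stats.chi2.ppf(1 - alpha, lx)

        # g_x and h_x of (5.34)-(5.35) for the Q_x limit
        A = S @ (np.eye(S.shape[0]) - Px @ Px.T)
        gx = np.trace(A @ A) / np.trace(A)
        hx = np.trace(A) ** 2 / np.trace(A @ A)
        qx_lim = gx * stats.chi2.ppf(1 - alpha, hx)

        def indices(x):
            tc2 = (n - 1) * float(x @ Rc @ (Rc.T @ x))     # (5.26)
            tx = Px.T @ x
            tx2 = float(tx @ np.linalg.solve(Lx, tx))      # (5.27)
            qx = float(x @ (x - Px @ (Px.T @ x)))          # (5.28)
            return tc2, tx2, qx

        return indices, tc2_lim, tx2_lim, qx_lim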
5.4.3 Indices General Forms
The indices T_c^2, T_x^2, and Q_x that monitor the input space can be written
in quadratic forms in terms of x. To simplify the notation, we can
express them in a general form

Index(x) = x^T M x    (5.43)

where M is shown in Table 5.1 for each detection index.

Table 5.1: Values for M
Index:  T_c^2             T_x^2                 Q_x
M:      (n-1) R_c R_c^T   P_x Λ_x^{-1} P_x^T    I - P_x P_x^T
The indices T_y^2, Q_y, and φ_y that monitor the output space can be written
in quadratic forms only in terms of ỹ_c and not in terms of y. Expanding
(5.30), (5.31), and (5.33) results in quadratic polynomials in terms of y.
Again, to simplify the notation, we can express them in a general form

Index(y) = y^T N y - 2 y^T a + c(x)    (5.44)

where N, a, and c(x) are shown in Table 5.2 for each detection index, and
the expression for Φ_y is given in (5.42). These two general forms will be
used to define the contributions and analyze their diagnosabilities.

Table 5.2: Values for N, a and c(x)
Index:  T_y^2                 Q_y                   φ_y
N:      P_y Λ_y^{-1} P_y^T    I - P_y P_y^T         Φ_y
a:      N Q_c R_c^T x         N Q_c R_c^T x         N Q_c R_c^T x
c(x):   x^T R_c Q_c^T N Q_c R_c^T x for each of the three indices
5.5 Fault Diagnosis by Contributions
Once a fault is detected, the next step is to diagnose its root cause.
There are two types of faults, namely sensor faults and process faults. Both
complete decomposition contributions (CDC) and reconstruction-based
contributions (RBC) are widely used to diagnose sensor faults. While
CDCs are very easy to calculate, they do not always point to the correct
answer even for just sensor faults. This fact is shown in Chapter 2, and
it is briefly reviewed in the next section. On the other hand, RBCs are
guaranteed to give the correct diagnosis result in the case of sensor faults
with sufficiently large fault magnitudes. Another problem of CDCs is that
their definitions for some monitoring indices are not clear. Due
to the above problems, process faults are only diagnosed by RBCs in this
chapter.
5.5.1 Complete Decomposition Contributions for Sensor Faults
In this subsection, we define the complete decomposition contributions
(CDC) for sensor faults from the general forms. The CDC for monitoring
indices with a quadratic form is already defined in Chapter 2 as

CDC_i^{Index(x)} = (ξ_i^T M^{1/2} x)^2,    (5.45)

where ξ_i is the i-th column of the identity matrix.
There is no general way to define the CDC for monitoring indices
with a quadratic polynomial. However, from the expression of Index(y),

Index(y) = y^T N y - 2 y^T a + c(x)
         = ||N^{1/2} y||^2 - 2 y^T a + c(x)
         = Σ_{i=1}^{n} (ξ_i^T N^{1/2} y)^2 - 2 y^T a + c(x),    (5.46)

we propose the CDC to be

CDC_i^{Index(y)} = (ξ_i^T N^{1/2} y)^2 - 2 y_i a_i + c(x)/n    (5.47)

where y_i and a_i are the i-th components of the vectors y and a, respectively. This
definition allows the sum of all CDCs to equal Index(y) while eliminating
the "smearing" effect on the linear and constant terms of the quadratic polynomial.
Smearing is when a fault in the i-th variable affects the contributions
of other variables [74]. Smearing is unavoidable in both CDCs and RBCs,
and can lead to misdiagnosis. The smearing effect will be further studied in
Section 5.6.
5.5.2 Reconstruction-based Contributions for Sensor Faults
In this subsection, we define the reconstruction-based contribution
(RBC) for sensor faults from the general forms. The RBC was proposed by
Alcala and Qin [4, 5]. It uses the amount of reconstruction of a fault detection
index along a variable direction as the contribution of that variable.
The RBC for monitoring indices with a quadratic form is already defined in
Chapter 2 as

RBC_i^{Index(x)} = (ξ_i^T M x)^2 / (ξ_i^T M ξ_i).    (5.48)

We can follow a similar procedure to obtain the RBC for monitoring
indices with a quadratic polynomial. The expression for the reconstructed
index along a variable direction ξ_i is

Index(y_i^r) = y_i^{rT} N y_i^r - 2 y_i^{rT} a + c(x)
             = (y - ξ_i f)^T N (y - ξ_i f) - 2 (y - ξ_i f)^T a + c(x),    (5.49)

where f denotes the fault magnitude.
Minimizing (5.49) gives the optimal value of f. Taking the derivative of
Index(y_i^r) with respect to f and setting it to zero, the expression for f can be
solved as

f = (y^T N ξ_i - a_i)(ξ_i^T N ξ_i)^{-1}.    (5.50)

The RBC for a quadratic polynomial is defined as

RBC_i^{Index(y)} = Index(y) - Index(y_i^r)
               = y^T N y - 2 y^T a + c(x) - (y - ξ_i f)^T N (y - ξ_i f) + 2 (y - ξ_i f)^T a - c(x)
               = 2 y^T N ξ_i f - f^2 ξ_i^T N ξ_i - 2 f ξ_i^T a
               = (y^T N ξ_i - a_i)^2 / (ξ_i^T N ξ_i).    (5.51)

The last equality in (5.51) is the result of applying (5.50).
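For a quadratic index Index(x) = x^T M x, the CDC of (5.45) and the RBC of (5.48) can be computed as in the following hedged Python sketch (M is whichever detection-index matrix from Table 5.1 applies; the function names are placeholders introduced here):

    import numpy as np
    from scipy.linalg import sqrtm

    def cdc_quadratic(M, x):
        """Complete decomposition contributions, (5.45): (xi_i^T M^{1/2} x)^2."""
        Mx = np.real(sqrtm(M)) @ x       # M is positive semidefinite, so M^{1/2} is real
        return Mx ** 2                   # one contribution per variable

    def rbc_quadratic(M, x):
        """Reconstruction-based contributions, (5.48): (xi_i^T M x)^2 / (xi_i^T M xi_i)."""
        Mx = M @ x
        return (Mx ** 2) / np.diag(M)    # assumes the diagonal of M is nonzero

    # Diagnosis: the variable with the largest contribution is the suspected faulty sensor,
    # e.g. cdc = cdc_quadratic(M, x_fault); suspect = int(np.argmax(cdc))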
5.5.3 Reconstruction-based Contributions for Process Faults
Process faults affect more than one variables at the same time, and
it can be either unidimensional or multidimensional. In the case of an uni-
dimensional process fault, the fault direction remains as a vector, and the
same symbol,
i
, is used to represent the i th fault. To make every fault
direction comparable, it is scaled to unit norm.
In the case of a multidimensional fault, the fault subspace can be
represented as a matrix, and the symbol,
i
, is used. This matrix composed
by multiple unidimensional directions may not have linearly independent
97
columns. However, we can always select a collection of independent direc-
tions to compose
i
. Singular-value decomposition (SVD) can be applied to
obtain an orthonormal basis of the fault subspace. For this reason,
i
in this
chapter is assume to be orthonormal without loss of generality.
Since unidimensional fault is simply a special case of multidimen-
sional fault, in this subsection, we define the reconstruction-based contri-
bution (RBC) for multidimensional process faults only.
Following a procedure similar to the previous subsection, the first step is to
find the fault magnitudes f that minimize the reconstructed index. Note that
the fault magnitude f for a multidimensional process fault is no longer a scalar
but a vector. The reconstructed index with a quadratic form along a fault
subspace $\Xi_i$ is
$$ \mathrm{Index}(x_i^r) = \left\| M^{1/2} x_i^r \right\|^2 = \left\| M^{1/2} (x - \Xi_i f) \right\|^2. \tag{5.52} $$
The optimal value of f can be obtained by taking the derivative of $\mathrm{Index}(x_i^r)$
with respect to f and setting it to zero. The resulting expression is
$$ f = \left(\Xi_i^T M \Xi_i\right)^{-1} \Xi_i^T M x. \tag{5.53} $$
Finally, the RBC is defined as
$$ \begin{aligned} \mathrm{RBC}_i^{\mathrm{Index}(x)} &= \mathrm{Index}(x) - \mathrm{Index}(x_i^r) \\
&= x^T M x - (x - \Xi_i f)^T M (x - \Xi_i f) \\
&= 2 x^T M \Xi_i f - f^T \Xi_i^T M \Xi_i f \\
&= x^T M \Xi_i \left(\Xi_i^T M \Xi_i\right)^{-1} \Xi_i^T M x. \end{aligned} \tag{5.54} $$
Note that (5.48) is obtained when the fault subspace $\Xi_i$ in (5.54) reduces to a
column of the identity matrix $\xi_i$.
The reconstructed index with a quadratic polynomial along a fault subspace
$\Xi_i$ is
$$ \begin{aligned} \mathrm{Index}(y_i^r) &= y_i^{rT} N y_i^r - 2 y_i^{rT} a + c(x) \\
&= (y - \Xi_i f)^T N (y - \Xi_i f) - 2 (y - \Xi_i f)^T a + c(x). \end{aligned} \tag{5.55} $$
Again, the optimal value of f can be obtained by taking the derivative of
$\mathrm{Index}(y_i^r)$ with respect to f and setting it to zero. The resulting expression is
$$ f = \left(\Xi_i^T N \Xi_i\right)^{-1} \Xi_i^T (N y - a). \tag{5.56} $$
Finally, the RBC is defined as
$$ \begin{aligned} \mathrm{RBC}_i^{\mathrm{Index}(y)} &= \mathrm{Index}(y) - \mathrm{Index}(y_i^r) \\
&= y^T N y - 2 y^T a + c(x) - (y - \Xi_i f)^T N (y - \Xi_i f) + 2 (y - \Xi_i f)^T a - c(x) \\
&= 2 y^T N \Xi_i f - f^T \Xi_i^T N \Xi_i f - 2 a^T \Xi_i f \\
&= (y^T N - a^T)\, \Xi_i f \\
&= (y^T N - a^T)\, \Xi_i \left(\Xi_i^T N \Xi_i\right)^{-1} \Xi_i^T (N y - a). \end{aligned} \tag{5.57} $$
Note that (5.51) is obtained when the fault subspace $\Xi_i$ in (5.57) reduces to a
column of the identity matrix $\xi_i$.
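A minimal sketch of the multidimensional RBCs in (5.54) and (5.57) is shown
below, assuming Xi holds an orthonormal basis of a candidate fault subspace
and M, N, and a come from the monitoring model; the names are illustrative
only, not part of the original development.

```python
import numpy as np

def rbc_process_quadratic_form(x, M, Xi):
    """RBC of (5.54): x' M Xi (Xi' M Xi)^{-1} Xi' M x for fault subspace Xi."""
    MXi = M @ Xi
    return float(x @ MXi @ np.linalg.solve(Xi.T @ MXi, MXi.T @ x))

def rbc_process_quadratic_poly(y, N, a, Xi):
    """RBC of (5.57): (y'N - a') Xi (Xi' N Xi)^{-1} Xi' (N y - a)."""
    g = Xi.T @ (N @ y - a)
    return float(g @ np.linalg.solve(Xi.T @ N @ Xi, g))

# Diagnosis picks the candidate fault subspace with the largest RBC:
# j_hat = int(np.argmax([rbc_process_quadratic_form(x, M, Xi) for Xi in subspaces]))
```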
5.5.4 Extraction of Fault Subspace
In the previous subsection, we assume the fault direction matrix $\Xi_i$ is known
beforehand. In the case of sensor faults, the fault direction is easily derived.
However, it is hard to derive for process faults. Several methods have been
proposed by researchers to extract the fault direction matrix from historical
faulty samples [44, 83]. In this chapter, we use the method proposed by Yue
and Qin [83] to extract the fault direction matrix, or subspace, as follows. Let
$X_f$ represent the faulty data matrix under the fault direction $\Xi$. Denote the
$k$th sample as $x_k$; then
$$ x_k = x_k^* + \Xi f_k, \tag{5.58} $$
where $x_k^*$ denotes the normal value. As $x_k^*$ is zero-mean, it can be neglected
with some averaging scheme, such as the moving window average. Denote
$\bar{x}_k$ as the sample processed with the moving window average; then
$$ \bar{x}_k \approx \Xi f_k. \tag{5.59} $$
We can then form a matrix composed of averaged faulty data,
$$ \bar{X}_f^T \approx [\Xi f_1,\ \Xi f_2,\ \ldots,\ \Xi f_n], \tag{5.60} $$
and perform SVD on $\bar{X}_f^T$,
$$ \bar{X}_f^T = U D V^T, \tag{5.61} $$
where the diagonal matrix $D$ has nonzero singular values in descending order.
The first $l$ columns of $U$ are chosen as the fault subspace $\Xi$. In this chapter,
the appropriate fault dimension $l$ is determined so that 95% of the variation
in $\bar{X}_f^T$ has been captured.
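The extraction procedure in (5.58)-(5.61) can be summarized in a short NumPy
sketch; the window length and the 95% energy threshold are the only tuning
choices, and the function name and defaults are illustrative assumptions rather
than the original implementation.

```python
import numpy as np

def extract_fault_subspace(X_faulty, window=20, energy=0.95):
    """Estimate the fault subspace Xi from faulty samples (rows of X_faulty).

    Follows (5.58)-(5.61): moving-window averaging suppresses the zero-mean
    normal part, and the SVD of the averaged data yields an orthonormal basis.
    """
    # Moving-window average of each variable along the sample (row) axis.
    kernel = np.ones(window) / window
    X_bar = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, X_faulty)
    # SVD of the averaged faulty data; its columns are approximately Xi f_k.
    U, s, Vt = np.linalg.svd(X_bar.T, full_matrices=False)
    # Keep the smallest number of directions capturing `energy` of the variation.
    captured = np.cumsum(s ** 2) / np.sum(s ** 2)
    l = int(np.searchsorted(captured, energy)) + 1
    return U[:, :l]   # columns form the orthonormal fault subspace Xi
```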
5.6 Analysis of Diagnosability
Contribution methods have been used in practice, but not much fundamental
analysis of their diagnosability has been developed. Alcala and Qin [4] proposed
to approach this by examining the case where a sensor fault happens in the
$\xi_j$ direction with a sufficiently large fault magnitude $f$. A fault in sensor $j$ is
represented as $x = x^* + \xi_j f$, where $x^*$ is the fault-free part of the measurement.
When $f$ is sufficiently large, $x^*$ is negligible compared to $\xi_j f$, and therefore
$$ x \approx \xi_j f. \tag{5.62} $$
Similarly, for the faulty sample $y$,
$$ y \approx \xi_j f. \tag{5.63} $$
This case was utilized in Chapter 2 to examine the sensor-fault diagnosability
of the contributions with a quadratic form. The conclusion was that the RBC
methods can guarantee correct fault diagnosis, but the CDC methods cannot.
In this section, we first show that similar results can be obtained for
contributions with a quadratic polynomial. Then we extend this analysis to
process-fault RBCs.
5.6.1 Diagnosis of Sensor Faults Using Complete Decomposition Contributions

Substituting the fault in (5.63) into (5.47), we obtain the expression of the CDC
with a quadratic polynomial,
$$ \mathrm{CDC}_i^{\mathrm{Index}(y)} = \begin{cases} [N^{1/2}]_{ij}^2 f^2 + c(x)/n & \text{for } i \neq j, \\ [N^{1/2}]_{jj}^2 f^2 - 2 f a_j + c(x)/n & \text{for } i = j, \end{cases} \tag{5.64} $$
where $[A]_{ij} = \xi_i^T A \xi_j$ is the $ij$th element of the matrix $A$. Correct diagnosis
using the CDC is guaranteed only if
$$ [N^{1/2}]_{jj}^2 f^2 - 2 f a_j \geq [N^{1/2}]_{ij}^2 f^2. \tag{5.65} $$
Equation (5.65), however, does not always hold.
5.6.2 Diagnosis of Sensor Faults Using Reconstruction-based Contributions

Substituting the fault in (5.63) into (5.51), we obtain the expression of the RBC
with a quadratic polynomial,
$$ \mathrm{RBC}_i^{\mathrm{Index}(y)} = \begin{cases} \left([N]_{ij} f - a_i\right)^2 [N]_{ii}^{-1} & \text{for } i \neq j, \\ \left([N]_{jj} f - a_j\right)^2 [N]_{jj}^{-1} & \text{for } i = j, \end{cases} \;\approx\; \begin{cases} [N]_{ij}^2 [N]_{ii}^{-1} f^2 & \text{for } i \neq j, \\ [N]_{jj} f^2 & \text{for } i = j. \end{cases} \tag{5.66} $$
The approximation in (5.66) assumes $f$ is sufficiently large and therefore $a_i$
and $a_j$ are negligible compared to $[N]_{ij} f$ and $[N]_{jj} f$. Correct diagnosis using
the RBC is guaranteed only if
$$ [N]_{jj} \geq [N]_{ij}^2 [N]_{ii}^{-1}. \tag{5.67} $$
Since N is a positive semi-definite matrix, (5.67) always holds. The proof is
given in Chapter 2.
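The inequality (5.67) follows from the nonnegativity of the 2x2 principal minors
of a positive semi-definite matrix; the following sketch is a simple numerical
sanity check of this fact on randomly generated positive semi-definite matrices,
added here only as an illustration and not part of the original proof.

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    A = rng.standard_normal((5, 5))
    N = A @ A.T                              # random positive semi-definite matrix
    for i in range(5):
        for j in range(5):
            if i != j:
                # (5.67): N_jj >= N_ij^2 / N_ii, i.e., N_ii N_jj - N_ij^2 >= 0
                assert N[j, j] * N[i, i] - N[i, j] ** 2 >= -1e-9
```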
5.6.3 Diagnosis of Process Faults Using Reconstruction-based Contributions

The above approach to analyzing the diagnosability of the contribution methods
can be extended to process faults. In the case of a process fault happening in
the $\Xi_j$ subspace with a sufficiently large fault magnitude f, the faulty
measurement is represented as $x = x^* + \Xi_j f$, where $x^*$ is the fault-free part
of the measurement. When the norm of the vector f is large, $x^*$ is negligible,
and therefore
$$ x \approx \Xi_j f. \tag{5.68} $$
Similarly, for the faulty sample y,
$$ y \approx \Xi_j f. \tag{5.69} $$
This case will be utilized to examine the diagnosability of the
reconstruction-based contributions for process faults.

It can be shown that when a faulty measurement has the form $x = \Xi_j f$, the
following expression always holds true for contributions with a quadratic form,
$$ \mathrm{RBC}_j^{\mathrm{Index}(x)} \geq \mathrm{RBC}_i^{\mathrm{Index}(x)} \quad \text{for all other } i. \tag{5.70} $$
Equation (5.70) was proved by Li et al. [40], and the proof is reproduced here.
Proof. If the faulty measurement $x = \Xi_j f$, then
$\mathrm{RBC}_j^{\mathrm{Index}(x)} - \mathrm{RBC}_i^{\mathrm{Index}(x)} \geq 0$ for all other $i$.

This can be easily proved by substituting (5.54) into the expression and
expanding it,
$$ \begin{aligned} \mathrm{RBC}_j^{\mathrm{Index}(x)} - \mathrm{RBC}_i^{\mathrm{Index}(x)} &= x^T M \Xi_j \left(\Xi_j^T M \Xi_j\right)^{-1} \Xi_j^T M x - x^T M \Xi_i \left(\Xi_i^T M \Xi_i\right)^{-1} \Xi_i^T M x \\
&= f^T \Xi_j^T M \Xi_j f - f^T \Xi_j^T M \Xi_i \left(\Xi_i^T M \Xi_i\right)^{-1} \Xi_i^T M \Xi_j f \\
&= \left\| \left(I - M^{1/2} \Xi_i \left(\Xi_i^T M \Xi_i\right)^{-1} \Xi_i^T M^{1/2}\right) M^{1/2} \Xi_j f \right\|^2 \geq 0. \end{aligned} \tag{5.71} $$
Note that in the third equality we use the property
$$ \left(I - M^{1/2} \Xi_i \left(\Xi_i^T M \Xi_i\right)^{-1} \Xi_i^T M^{1/2}\right)^2 = I - M^{1/2} \Xi_i \left(\Xi_i^T M \Xi_i\right)^{-1} \Xi_i^T M^{1/2}. \tag{5.72} $$
Similar to the case of sensor faults, process-fault RBCs with a quadratic
polynomial converge to RBCs with a quadratic form if the vector f is sufficiently
large. Substituting (5.69) into (5.57), we obtain
$$ \begin{aligned} \mathrm{RBC}_j^{\mathrm{Index}(y)} &= (y^T N - a^T)\, \Xi_j \left(\Xi_j^T N \Xi_j\right)^{-1} \Xi_j^T (N y - a) \\
&= (f^T \Xi_j^T N - a^T)\, \Xi_j \left(\Xi_j^T N \Xi_j\right)^{-1} \Xi_j^T (N \Xi_j f - a) \\
&\approx f^T \Xi_j^T N \Xi_j \left(\Xi_j^T N \Xi_j\right)^{-1} \Xi_j^T N \Xi_j f, \end{aligned} \tag{5.73} $$
and
$$ \mathrm{RBC}_i^{\mathrm{Index}(y)} \approx f^T \Xi_j^T N \Xi_i \left(\Xi_i^T N \Xi_i\right)^{-1} \Xi_i^T N \Xi_j f. \tag{5.74} $$
The approximations assume the vector f is sufficiently large and therefore the
vector a is negligible compared to $f^T \Xi_j^T N$ and $f^T \Xi_i^T N$. By using (5.73)
and (5.74), it can be easily shown that the following expression also holds for
sufficiently large f,
$$ \mathrm{RBC}_j^{\mathrm{Index}(y)} \geq \mathrm{RBC}_i^{\mathrm{Index}(y)} \quad \text{for all other } i. \tag{5.75} $$
In summary, we have shown that, in the case of a sufficiently large fault
magnitude, reconstruction-based contributions are guaranteed to give correct
fault diagnosis, but the complete decomposition contributions are not. However,
for modest fault magnitudes the randomness in the fault-free portion $x^*$ will
likely affect the diagnosis results, which is studied next through case studies.
5.7 Case Studies
In this section, we perform two case studies. First, we compare the
performance of RBCs and CDCs in the case of sensor faults by using a set of
synthetic data. Then, we demonstrate the effectiveness of RBCs in the case
of process faults on the Tennessee Eastman Process.
5.7.1 Synthetic Study
In this subsection, we use synthetic simulations to create a number of
representative sensor-fault scenarios to demonstrate and compare the
effectiveness of the above-defined contributions for fault diagnosis.
The simulated numerical example without faults is as follows:
$$ \begin{cases} x_k = A z_k + e_k \\ y_k = C x_k + v_k \end{cases} \tag{5.76} $$
where
$$ A = \begin{pmatrix} 1 & 3 & 4 & 0 & 0 \\ 3 & 0 & 4 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}^T, \quad C = \begin{pmatrix} 2 & 2 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}, $$
$z_k \in \mathbb{R}^4 \sim U([0,1])$, $e_k \in \mathbb{R}^5 \sim N(0, 0.1^2)$, and
$v_k \in \mathbb{R}^2 \sim N(0, 0.08^2)$. $U([0,1])$ is the uniform distribution on the
interval $[0,1]$.
All of the parameters are more or less randomly chosen, except that $x_4$ is
independent of the other input variables, and it is the only input variable that
contributes to $y_2$. This ensures that $x_4$ is in the input-output covariance
subspace.

Equation (5.76) is used to generate 500 normal operation data points, and the
number of PLS components $l = 3$ is determined by 10-fold cross-validation. In
this model, $R_y$ in (5.10) is less than 0.05, and therefore there is no output
principal subspace.
A sensor fault is added in the following form in the input space,
$$ x_k = x_k^* + \xi_x f_x, \tag{5.77} $$
and in the output space,
$$ y_k = y_k^* + \xi_y f_y, \tag{5.78} $$
where $x_k^*$ and $y_k^*$ are the fault-free values, $\xi_x$ and $\xi_y$ are the fault
directions, and $f_x$ and $f_y$ are the respective fault magnitudes.
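For reference, a minimal NumPy sketch of the data generation in (5.76)-(5.78)
is given below; the subsequent PLS/CPLS modeling and index computation are
omitted, and all names and the chosen example fault magnitude are illustrative
assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[1, 3, 4, 0, 0],
              [3, 0, 4, 0, 1],
              [1, 1, 0, 0, 0],
              [0, 0, 0, 1, 0]]).T            # 5 x 4, as in (5.76)
C = np.array([[2, 2, 1, 1, 0],
              [0, 0, 0, 1, 0]])              # 2 x 5

def simulate(n, fault_dir_x=None, fx=0.0, fault_dir_y=None, fy=0.0):
    """Generate n samples of (5.76), optionally adding the sensor faults (5.77)-(5.78)."""
    Z = rng.uniform(0.0, 1.0, size=(n, 4))
    X = Z @ A.T + rng.normal(0.0, 0.1, size=(n, 5))
    Y = X @ C.T + rng.normal(0.0, 0.08, size=(n, 2))
    if fault_dir_x is not None:
        X = X + fx * np.asarray(fault_dir_x)     # (5.77)
    if fault_dir_y is not None:
        Y = Y + fy * np.asarray(fault_dir_y)     # (5.78)
    return X, Y

X_train, Y_train = simulate(500)                               # normal operating data
X_f, Y_f = simulate(500, fault_dir_x=np.eye(5)[3], fx=2.0)     # fault in x_4 (scenario 2)
```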
Three scenarios are studied:

1. A sensor fault was added to $y_1$, which was detected only by $Q_y$;

2. A sensor fault was added to $x_4$, which was detected only by $T_c^2$; and

3. A sensor fault was added to $x_2$, which was detected by both $T_x^2$ and $Q_x$.

The CPLS monitoring indices for all three scenarios are presented in Figs. 5.1,
5.2, and 5.3. The percent rates of correct diagnosis and the fault detection
rates (FDR) for various fault magnitudes are given in Tables 5.3 and 5.4. It is
interesting to note that the CDC method completely failed on the $T_x^2$ index
in scenario 3.
Table 5.3: Percent rates of correct diagnosis for scenarios 1 & 2

              Scenario 1 (Q_y)             Scenario 2 (T_c^2)
  f        FDR      CDC      RBC        FDR       CDC       RBC
  0.5      5.8     37.93     0.00       33.8      98.82     94.08
  1        7.8     38.46     0.00       86.6     100.00     99.31
  2       31.6     51.90    60.13      100.0     100.00    100.00
  3      100.0     66.40    98.60      100.0     100.00    100.00
  10     100.0     99.20   100.00      100.0     100.00    100.00
Figure 5.1: CPLS Monitoring Indices for Scenario 1 (panels: T_c^2, Q_y, T_x^2, Q_x)

Figure 5.2: CPLS Monitoring Indices for Scenario 2 (panels: T_c^2, Q_y, T_x^2, Q_x)

Figure 5.3: CPLS Monitoring Indices for Scenario 3 (panels: T_c^2, Q_y, T_x^2, Q_x)

Table 5.4: Percent rates of correct diagnosis for scenario 3

              Scenario 3 (T_x^2)           Scenario 3 (Q_x)
  f        FDR      CDC      RBC        FDR       CDC       RBC
  0.5      1.8      0.00    22.22       26.0      30.77     34.62
  1        8.2      0.00    31.71       75.8      43.01     45.91
  2       28.8      0.00    30.56      100.0      63.20     62.80
  3       53.6      0.00    35.45      100.0      79.20     73.60
  10     100.0      0.00    65.60      100.0     100.00    100.00

5.7.2 Case Study on the Tennessee Eastman Process

In this subsection, the Tennessee Eastman Process (TEP) [20] is used to
evaluate the effectiveness of the proposed CPLS diagnosis methods for
process faults. TEP was created by the Eastman Chemical Company to provide a
realistic industrial process for evaluating process control and monitoring
methods. The process flow diagram is shown in Fig. 5.4. TEP is composed of
five unit operations, including a chemical reactor, condenser, compressor,
vapor/liquid separator, and stripper, and eight chemical components: A, B, C,
D, E, F, G, and H. The gaseous reactants A, C, D, and E and the inert B are
fed to the reactor, where the liquid products G and H are formed. The component
F is a by-product of the process. The goal is to remove the unreacted reactants,
inert, and by-product from the product stream. A detailed description of the
TEP can be found in Downs and Vogel [20]. The control strategy applied to the
process is described in Lyman and Georgakis [46]. The simulation data were
downloaded from Professor Richard D. Braatz's website.
Figure 5.4: TEP Process Flow Diagram

TEP contains two blocks of variables: the XMV block of 12 manipulated
variables and the XMEAS block of 41 measured variables. In this case
study, the input variables are XMEAS(1-36) and XMV(1-11), and the output
variables are XMEAS(37-41). The first 500 samples, collected under normal
conditions, are used to train the CPLS model, and the last 480 samples are
faulty data. The number of PLS components, $l = 4$, is determined by 10-fold
cross-validation. From the CPLS training, we have $p = 5$ and $l_y = 5$; thus
$Q_y$ is null and does not need to be monitored.
There are a total of 15 known faults in TEP, which are represented as
IDV(1)-(15). In this case study we selected IDV(2), (4), and (5) as our
historical set of known process faults, and their detailed descriptions are
shown in Table 5.5.
Table 5.5: Fault description

  Fault      Detailed description                             Type
  IDV(2)     B composition, A/C ratio constant (Stream 4)     Step
  IDV(4)     Reactor cooling water inlet temperature          Step
  IDV(5)     Condenser cooling water inlet temperature        Step
In IDV(2), a step change occurs in the B composition of the stripper inlet
stream, and its CPLS monitoring result is shown in Fig. 5.5. While all four
monitoring indices successfully detect the fault, only $T_c^2$ reduces toward
normal values after the step change occurs. This indicates that quality
variables tend to return to normal due to fault alleviation by feedback
controllers.

Figure 5.5: CPLS-based Monitoring Result for a Step Change in B Composition (panels: T_c^2, T_y^2, T_x^2, Q_x)

In IDV(4), a step change occurs in the reactor cooling water inlet temperature,
and its CPLS monitoring result is shown in Fig. 5.6. In the figure, both $T_x^2$
and $Q_x$ successfully detect the fault, while the fault detections by $T_c^2$ and
$T_y^2$ are not clear.

Figure 5.6: CPLS-based Monitoring Result for a Step Change in Reactor Cooling Water Inlet Temperature (panels: T_c^2, T_y^2, T_x^2, Q_x)

In IDV(5), a step change occurs in the condenser cooling water inlet
temperature, and its CPLS monitoring result is shown in Fig. 5.7. All four
monitoring indices successfully detect the fault, and they all reduce toward
normal values after the step change occurs.

Figure 5.7: CPLS-based Monitoring Result for a Step Change in Condenser Cooling Water Inlet Temperature (panels: T_c^2, T_y^2, T_x^2, Q_x)
To perform fault diagnosis, we first extract the fault subspaces of
IDV(2), (4), and (5) using the method described in subsection 5.5.4. All of the
data are preprocessed by the moving window average; then 50 samples,
corresponding to the period where the monitoring indices have the highest
magnitudes, are selected to form the averaged faulty data matrix $\bar{X}_f^T$.
The SVD result shows that in all cases a fault subspace of one dimension is
enough to capture 95% of the variation in $\bar{X}_f^T$. Therefore, each of the
fault subspaces is simply a vector.
To perform fault diagnosis, each of the three faults takes its turn as the true
fault, and the percent rates of correct diagnosis for all three cases are shown
in Table 5.6.

Table 5.6: Percent rates of correct diagnosis for TEP

                       Monitoring Indices
  True Fault     T_c^2    T_x^2    T_y^2    Q_x
  IDV(2)          100      100      66       24
  IDV(4)            -       14       -       56
  IDV(5)           77       20      44       54

Note that as long as the percent rate of correct diagnosis is more than 50%,
the true fault is correctly identified. In Table 5.6, we can observe that the
proposed method performs well most of the time; however, there are still three
scenarios where the percent rates of correct diagnosis fall below 25%. One
possible reason is that the angle between two fault subspaces is small, which
makes fault isolation more difficult and therefore induces false diagnoses.
Nevertheless, in theory they should all increase toward 100% when the fault
magnitude is sufficiently large.
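The closeness of two extracted fault subspaces can be quantified by their
principal angles, which are obtained from the singular values of $\Xi_1^T \Xi_2$
when both bases are orthonormal; the sketch below is one way to compute them
and is added only to illustrate this diagnosis limitation, not as part of the
original analysis.

```python
import numpy as np

def principal_angles(Xi1, Xi2):
    """Principal angles (radians) between two subspaces with orthonormal column bases."""
    s = np.linalg.svd(Xi1.T @ Xi2, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

# A small minimum angle indicates nearly overlapping fault subspaces,
# which makes isolating one candidate fault from the other difficult.
# angles = principal_angles(Xi_idv2, Xi_idv5)
```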
5.8 Summary
In this chapter, CPLS-based contributions for fault diagnosis are studied. We
unified the five CPLS monitoring indices into two general forms, and based on
these general forms, we defined their complete decomposition contributions
(CDC) and reconstruction-based contributions (RBC) for sensor faults. For
process faults, only the RBCs are defined. The diagnosability of the CDCs and
RBCs is analyzed, and a method proposed by previous researchers is used to
extract the fault subspace. At the end, we demonstrate the effectiveness of the
proposed fault diagnosis methods on a synthetic case study and the Tennessee
Eastman Process.
Chapter 6
Conclusions
In this dissertation, we present three inferential sensors for the gas and oil
production industry: the BS&W, RVP, and TVP inferential sensors. The BS&W
sensor is studied first. This inferential sensor is a data-driven one, based on
PLS regression. The inputs to the model are the temperatures and pressures of
the wash tanks, and the output of the model is the BS&W estimate. A case study
on a crude production process indicates that, although simple, the proposed
inferential sensor performs well.
The inferential sensors for both RVP and TVP are studied within a single
chapter due to their similar characteristics. While TVP is a function of
temperature, RVP is simply an estimate of TVP at 100 °F. These inferential
sensors are hybrid ones, which consist of a model-driven part and a data-driven
part. The model-driven part is based on the Korsten vapor pressure equation,
and PLS is used to capture the modeling residual, which constitutes the
data-driven part. The proposed inferential sensors were applied to two
processes: a crude production plant and a gas condensate production plant.
Again, these studies give sound results, which indicates that the proposed
inferential sensors perform well.
Since inferential sensors depend on the correlations among variables within the
process, their estimates are no longer valid if these correlations change.
Process correlations can change for many reasons. One possible reason is a
change in operation mode. Another is the occurrence of faults within the
process. To address this shortcoming, we utilized process monitoring and fault
diagnosis methods based on the PLS algorithm. These methods are applied to two
observed process faults of the crude production plant. Although their exact
root causes are still unclear, they provide helpful insights and can be used to
facilitate further diagnosis when additional process knowledge is available.
Although traditional PLS-based monitoring methods have been applied to
industrial processes over the last two decades, they are still not perfect.
Many researchers have proposed modifications, including CPLS, to improve their
capability. The CPLS algorithm has been shown by previous researchers to offer
complete overall monitoring of faults that happen in the process. In this
dissertation, we develop its diagnosis methods for both sensor faults and
process faults. The proposed methods are applied to two case studies to
demonstrate their effectiveness.
Future directions of research could include developing inferential sensors for
other properties in the gas and oil industry. One example is the C2% in LNG,
which is an essential indicator of the LNG dew point. Other examples are
chemical impurities, such as hydrogen sulfide and carbon dioxide.
Bibliography
[1] F. Ahmed, S. Nazir, and Y. K. Yeo, “A recursive pls-based soft sensor for
prediction of the melt index during grade change operations in hdpe
plant,” Korean Journal of Chemical Engineering, vol. 26, no. 1, pp. 14–20,
2009.
[2] W. Al-Thamir, “A new method to determine reid vapour pressure for
stabilized crude oils by gas chromatography,” Chromatographia, vol. 22,
no. 1-6, pp. 63–64, 1986.
[3] C. F. Alcala and S. Joe Qin, “Analysis and generalization of fault di-
agnosis methods for process monitoring,” Journal of Process Control,
vol. 21, no. 3, pp. 322–330, 2011.
[4] C. F. Alcala and S. J. Qin, “Reconstruction-based contribution for pro-
cess monitoring,” Automatica, vol. 45, no. 7, pp. 1593–1600, 2009.
[5] C. F. Alcala and S. J. Qin, “Reconstruction-based contribution for pro-
cess monitoring with kernel principal component analysis,” Industrial
& Engineering Chemistry Research, vol. 49, no. 17, pp. 7849–7857, 2010.
[6] C. F. Alcala and S. Qin, "Reconstruction-based contribution for process
monitoring," Automatica, vol. 45, pp. 1593–1600, 2009.
[7] ASTM Standard, Standard test method for vapor pressure of petroleum prod-
ucts (Reid method), D323 - 08 ed. West Conshohocken, PA: ASTM In-
ternational, 2009.
[8] ASTM Standard, Standard test method for water and sediment in crude oil
by the centrifuge method (laboratory procedure), D4007 - 11e1 ed. West
Conshohocken, PA: ASTM International, 2011.
[9] Y. H. Bang, C. K. Yoo, and I.-B. Lee, “Nonlinear pls modeling with
fuzzy inference system,” Chemometrics and intelligent laboratory systems,
vol. 64, no. 2, pp. 137–155, 2002.
[10] K. P . Bennett and M. J. Embrechts, Advances in learning theory: methods,
models and applications. Amsterdam: ISO Press, 2003, vol. 190.
[11] G. Box, "Some theorems on quadratic forms applied in the study of
analysis of variance problems. i. effect of inequality of variance in the
one-way classification," Ann. Math. Statist., vol. 25, pp. 290–302, 1954.
[12] L. Chen, O. Bernard, G. Bastin, and P. Angelov, "Hybrid modelling of
biotechnological processes using neural networks." Control. Eng. Pract.,
vol. 8, pp. 821–827, 2000.
[13] G. A. Cherry and S. J. Qin, “Multiblock principal component analy-
sis based on a combined index for semiconductor fault detection and
diagnosis,” Semiconductor Manufacturing, IEEE Transactions on, vol. 19,
no. 2, pp. 159–172, 2006.
[14] L. H. Chiang, M. E. Kotanchek, and A. K. Kordon, “Fault diagnosis
based on fisher discriminant analysis and support vector machines,”
Computers & chemical engineering, vol. 28, no. 8, pp. 1389–1401, 2004.
[15] S. W. Choi and I.-B. Lee, “Multiblock pls-based localized process diag-
nosis,” Journal of Process Control, vol. 15, no. 3, pp. 295–306, 2005.
[16] E. R. Cox, “Pressure-temperature chart for hydrocarbon vapors.” In-
dustrial & Engineering Chemistry, vol. 15, no. 6, pp. 592–593, 1923.
[17] B. S. Dayal and J. F. MacGregor, “Improved pls algorithms,” J.
Chemometr., vol. 11, pp. 73–85, 1997.
[18] B. Dayal and J. MacGregor, “Recursive exponentially weighted PLS
and its applications to adaptive control and prediction.” J. Process.
Contr., vol. 7, pp. 169–179, 1997.
[19] H. Devold, “Oil and gas production handbook,” ABB ATP A Oil and
Gas, 2006.
[20] J. J. Downs and E. F. Vogel, “A plant-wide industrial process control
problem,” Computers & Chemical Engineering, vol. 17, no. 3, pp. 245–
255, 1993.
[21] R. Dunia, S. J. Qin, T. F. Edgar, and T. J. McAvoy, “Identification of
faulty sensors using principal component analysis,” AIChE Journal,
vol. 42, no. 10, pp. 2797–2812, 1996.
[22] N. Fletcher, A. Morris, G. Montague, and E. Martin, “Local dynamic
partial least squares approaches for the modelling of batch processes,”
The Canadian Journal of Chemical Engineering, vol. 86, no. 5, pp. 960–970,
2008.
[23] I. Furzer, “Prediction of the reid vapor pressure of gasolines with mtbe
and other oxygenates,” Developments in Chemical Engineering and Min-
eral Processing, vol. 3, no. 1, pp. 50–55, 1995.
[24] P . Geladi and B. R. Kowalski, “Partial least-squares regression: a tuto-
rial,” Anal. Chim. Acta, vol. 185, pp. 1–17, 1986.
[25] J. Gertler, W. Li, Y. Huang, and T. McAvoy, “Isolation enhanced prin-
cipal component analysis,” AIChE Journal, vol. 45, no. 2, pp. 323–334,
1999.
[26] Q. P . He, S. J. Qin, and J. Wang, “A new fault diagnosis method using
fault directions in fisher discriminant analysis,” AIChE journal, vol. 51,
no. 2, pp. 555–571, 2005.
[27] K. Helland, H. E. Berntsen, O. S. Borgen, and H. Martens, “Recursive
algorithm for partial least squares regression,” Chemometrics and intel-
ligent laboratory systems, vol. 14, no. 1, pp. 129–137, 1992.
[28] A. Höskuldsson, "Pls regression methods," J. Chemometr., vol. 2, pp.
211–228, 1988.
[29] J. E. Jackson and G. S. Mudholkar, “Control procedures for residuals
associated with principal component analysis,” Technometrics, vol. 21,
no. 3, pp. 341–349, 1979.
[30] A. Jutan, J. F. MacGregor, and J. Wright, “Multivariable computer con-
trol of a butane hydrogenlysis reactor, part ii - data collection, param-
eter estimation, and stochastic disturbance identification.” AIChE J.,
vol. 23, pp. 742–750, 1977.
[31] P . Kadlec, B. Gabrys, and S. Strandt, “Data-driven soft sensors in the
process industry,” Comput. Chem. Eng., vol. 33, pp. 795–814, 2009.
[32] M. Kano, S. Hasebe, I. Hashimoto, and H. Ohno, “Statistical process
monitoring based on dissimilarity of process data,” AIChE journal,
vol. 48, no. 6, pp. 1231–1240, 2002.
[33] M. Kano, K. Miyazaki, S. Hasebe, and I. Hashimoto, “Inferential con-
trol system of distillation compositions using dynamic partial least
squares regression,” Journal of Process Control, vol. 10, no. 2, pp. 157–
166, 2000.
[34] M. Kano, K. Nagao, S. Hasebe, I. Hashimoto, H. Ohno, R. Strauss,
and B. R. Bakshi, “Comparison of multivariate statistical process mon-
itoring methods with applications to the eastman challenge problem,”
Computers & chemical engineering, vol. 26, no. 2, pp. 161–174, 2002.
[35] T. Komulainen, M. Sourander, and S.-L. Jämsä-Jounela, "An online
application of dynamic pls to a dearomatization process," Computers &
Chemical Engineering, vol. 28, no. 12, pp. 2611–2619, 2004.
[36] H. Korsten, “Internally consistent prediction of vapor pressure and re-
lated properties,” Ind. Eng. Chem. Res., vol. 39, pp. 813–820, 2000.
[37] J. V . Kresta, J. F. MacGregor, and T. E. Marlin, “Multivariate statistical
monitoring of process operating performance,” The Canadian Journal of
Chemical Engineering, vol. 69, no. 1, pp. 35–47, 1991.
[38] J. Kresta, T. Marlin, and J. MacGregor, “Development of inferential pro-
cess models using pls.” Comput. Chem. Eng., vol. 18, pp. 597–611, 1994.
[39] R. LeTourneau, J. Johnson, and W. Ellis, “Reduced-scale reid vapor
pressure apparatus,” Analytical Chemistry, vol. 27, no. 1, pp. 142–144,
1955.
[40] G. Li, C. F. Alcala, S. J. Qin, and D. Zhou, “Generalized reconstruction-
based contributions for output-relevant fault diagnosis with applica-
tion to the tennessee eastman process,” Control Systems Technology,
IEEE Transactions on, vol. 19, no. 5, pp. 1114–1127, 2011.
[41] G. Li, B. Liu, S. J. Qin, and D. Zhou, “Quality relevant data-driven mod-
eling and monitoring of multivariate dynamic processes: The dynamic
t-pls approach,” Neural Networks, IEEE Transactions on, vol. 22, no. 12,
pp. 2262–2271, 2011.
[42] G. Li, S. J. Qin, Y. Ji, and D. Zhou, “Total pls based contribution plots
for fault diagnosis,” Acta Automat. Sinica, vol. 35, pp. 759–765, 2009.
[43] G. Li, S. J. Qin, and D. Zhou, “Geometric properties of partial least
squares for process monitoring,” Automatica, vol. 46, pp. 204–210, 2010.
[44] G. Li, S. J. Qin, and D. Zhou, “Output relevant fault reconstruction and
fault subspace extraction in total projection to latent structures mod-
els,” Ind. Eng. Chem. Res., vol. 49, pp. 9175–9183, 2010.
[45] F. Lindgren, P. Geladi, and S. Wold, "The kernel algorithm for pls," J.
Chemometr., vol. 7, pp. 45–59, 1993.
[46] P . R. Lyman and C. Georgakis, “Plant-wide control of the tennessee
eastman problem,” Computers & chemical engineering, vol. 19, no. 3, pp.
321–331, 1995.
[47] J. F. MacGregor, C. Jaeckle, C. Kiparissides, and M. Koutoudi, “Process
monitoring and diagnosis by multiblock pls methods,” AIChE Journal,
vol. 40, no. 5, pp. 826–838, 1994.
[48] F. S. Manning and R. E. Thompson, Oilfield processing II: crude oil.
Tulsa, Oklahoma: PennWell Publishing Company, 1995.
[49] M. Misra, H. H. Yue, S. J. Qin, and C. Ling, “Multivariate process mon-
itoring and fault diagnosis by multi-scale pca,” Computers & Chemical
Engineering, vol. 26, no. 9, pp. 1281–1293, 2002.
[50] P . Nomikos and J. F. MacGregor, “Multi-way partial least squares in
monitoring batch processes,” Chemometrics and intelligent laboratory sys-
tems, vol. 30, no. 1, pp. 97–108, 1995.
[51] P . Nomikos and J. F. MacGregor, “Multivariate spc charts for monitor-
ing batch processes,” Technometrics, vol. 37, no. 1, pp. 41–59, 1995.
[52] K. Peng, K. Zhang, G. Li, and D. Zhou, “Contribution rate plot for
nonlinear quality-related fault diagnosis with application to the hot
strip mill process,” Control Engineering Practice, vol. 21, no. 4, pp. 360–
369, 2013.
[53] J. Peress, “Estimate storage tank emissions,” Chemical engineering
progress, vol. 97, no. 8, pp. 44–46, 2001.
[54] H. Pichler and K. Hense, “Crude oil vapour pressure testing,”
Petroleum Technology Quarterly, vol. 17, no. 1, p. 102, 2012.
[55] S. J. Qin, H. Yue, and R. Dunia, “Self-validating inferential sensors with
application to air emission monitoring.” Ind. Eng. Chem. Res., vol. 36,
pp. 1675–1685, 1997.
[56] S. J. Qin, “Recursive pls algorithms for adaptive data modeling,” Com-
put. Chem. Eng., vol. 22, pp. 503–514, 1998.
[57] S. J. Qin, “Survey on data-driven industrial process monitoring and
diagnosis,” Annual Reviews in Control, vol. 36, no. 2, pp. 220–234, 2012.
[58] S. J. Qin and W. Li, “Detection, identification, and reconstruction of
faulty sensors with maximized sensitivity,” AIChE Journal, vol. 45,
no. 9, pp. 1963–1976, 1999.
[59] S. J. Qin and W. Li, “Detection and identification of faulty sensors in
dynamic processes,” AIChE Journal, vol. 47, no. 7, pp. 1581–1593, 2001.
[60] S. J. Qin and T. McAvoy, “Nonlinear fir modeling via a neural net pls
approach,” Computers & chemical engineering, vol. 20, no. 2, pp. 147–159,
1996.
[61] S. J. Qin, S. Valle, and M. J. Piovoso, “On unifying multiblock anal-
ysis with application to decentralized process monitoring,” Journal of
chemometrics, vol. 15, no. 9, pp. 715–742, 2001.
[62] S. Qin, “Neural networks for intelligent sensors and control – practical
issues and some solutions.” in Neural Systems Control., O. Omidvar and
D. L. Elliott., Eds. Academic Press., 1997, pp. 213–234.
[63] S. Qin, “Statistical process monitoring: basics and beyond,” J.
Chemometr., vol. 17, pp. 480–502, 2003.
[64] S. Qin and T. McAvoy, “A data-based process modeling approach and
its applications,” Proceedings of the 3rd IFAC DYCORD+ Symposium,
April 1992.
[65] S. Qin and Y. Zheng, “Quality-relevant and process-relevant fault
monitoring with concurrent projection to latent structures,” AIChE J.,
vol. 59, pp. 496–504, 2013.
[66] A. Raich and A. Cinar, “Statistical process monitoring and distur-
bance diagnosis in multivariable continuous processes,” AIChE Journal,
vol. 42, no. 4, pp. 995–1009, 1996.
[67] S. Rännar, F. Lindgren, P. Geladi, and S. Wold, "A pls kernel algorithm
for data sets with many variables and fewer objects. part 1: theory and
algorithm," J. Chemometr., vol. 8, pp. 111–125, 1994.
[68] M. Riazi, T. Albahri, and A. Alqattan, “Prediction of reid vapor pres-
sure of petroleum fuels,” Petroleum science and technology, vol. 23, no. 1,
pp. 75–86, 2005.
[69] N. L. Ricker, “The use of biased least-squares estimators for parame-
ters in discrete-time pulse-response models,” Industrial & engineering
chemistry research, vol. 27, no. 2, pp. 343–350, 1988.
[70] R. Rosipal and L. J. Trejo, “Kernel partial least squares regression in
reproducing kernel hilbert space,” The Journal of Machine Learning Re-
search, vol. 2, pp. 97–123, 2002.
[71] N. Tracy, J. Young, and R. Mason, “Multivariate control charts for indi-
vidual observations,” J. Qual. Technol., vol. 24, pp. 88–95, 1992.
[72] J. Trygg and S. Wold, “Orthogonal projections to latent structures (o-
pls),” Journal of chemometrics, vol. 16, no. 3, pp. 119–128, 2002.
[73] S. Wang and F. Xiao, “Ahu sensor fault diagnosis using principal com-
ponent analysis method,” Energy and Buildings, vol. 36, no. 2, pp. 147–
160, 2004.
[74] J. A. Westerhuis, S. P . Gurden, and A. K. Smilde, “Generalized contri-
bution plots in multivariate statistical process monitoring,” Chemomet-
rics and Intelligent Laboratory Systems, vol. 51, no. 1, pp. 95–114, 2000.
[75] J. Williams, “Accurate bs&w testing important for crude-oil custody
transfer,” Oil & Gas Journal, 1990.
[76] D. Wilson, G. Irwin, and G. Lightbody, “Nonlinear pls using radial
basis functions,” Transactions of the Institute of Measurement and Control,
vol. 19, no. 4, pp. 211–220, 1997.
[77] B. M. Wise, N. L. Ricker, and D. J. Veltkamp, “Upset and sensor failure
detection in multivariate processes,” in AIChE 1989 Annual Meeting,
1989.
[78] H. Wold, “Partial least squares,” Encyclopedia of statistical sciences, 1985.
[79] S. Wold, “Nonlinear partial least squares modelling ii. spline inner re-
lation,” Chemometrics and Intelligent Laboratory Systems, vol. 14, no. 1,
pp. 71–84, 1992.
[80] S. Wold, N. Kettaneh-Wold, and B. Skagerberg, “Nonlinear pls model-
ing,” Chemometrics and Intelligent Laboratory Systems, vol. 7, no. 1, pp.
53–65, 1989.
[81] S. Yoon and J. F. MacGregor, “Statistical and causal model-based ap-
proaches to fault detection and isolation,” AIChE Journal, vol. 46, no. 9,
pp. 1813–1824, 2000.
[82] S. Yoon and J. F. MacGregor, “Fault diagnosis with multivariate statisti-
cal models part i: using steady state fault signatures,” Journal of process
control, vol. 11, no. 4, pp. 387–400, 2001.
[83] H. H. Yue and S. J. Qin, “Reconstruction-based fault identification
using a combined index,” Industrial & engineering chemistry research,
vol. 40, no. 20, pp. 4403–4414, 2001.
[84] C. Zhao and Y. Sun, “The multi-space generalization of total projection
to latent structures (mst-pls) and its application to online process mon-
itoring,” in Control and Automation (ICCA), 2013 10th IEEE International
Conference on. IEEE, 2013, pp. 1441–1446.
[85] Z. Zhao, Q. Li, M. Huang, and F. Liu, “Concurrent pls-based process
monitoring with incomplete input and quality measurements,” Com-
puters & Chemical Engineering, vol. 67, pp. 69–82, 2014.
[86] D. Zhou, G. Li, and S. J. Qin, “Total projection to latent structures for
process monitoring,” AIChE J., vol. 56, pp. 168–178, 2010.
Abstract

In industrial processes, including gas and oil processing, many important physical properties are difficult to measure due to limitations such as cost, reliability, and large time delay. These properties are usually quality variables and are directly related to the economic interest. Traditionally, these quality variables are either measured off-line by experimental analyses or on-line by dedicated analyzers. Both methods can be expensive and time consuming. An alternative solution is to use inferential sensors. An inferential sensor utilizes other easy-to-measure process variables to estimate those hard-to-measure quality variables. Inferential sensors are economical due to their software nature, and they can produce accurate estimations without incurring any measurement delay. Accordingly, inferential sensors gained increasing popularity and have been applied in many different process industries.

In this dissertation, we focus on developing inferential sensors in the field of gas and oil production industry. Inferential sensors on three properties are developed, including basic sediment and water (BS&W), Reid vapor pressure (RVP), and true vapor pressure (TVP). The BS&W inferential sensor is a data-driven one, which is based on partial least-squares (PLS) regression. The RVP and TVP inferential sensors are hybrid ones, which consist of a model-driven part and a data-driven part. The model-driven part is based on the Korsten vapor pressure equation, and PLS is used to capture the modeling residual, which denotes the data-driven part. Case studies on real data demonstrate the effectiveness of the proposed inferential sensors.

Since inferential sensors depend on the correlations among variables within the process, their estimations are no longer valid if these correlations are changed. To amend this shortage, process monitoring methods have been developed by researchers. We first utilized the traditional monitoring method based on PLS on the RVP inferential sensor. Then we develop a new monitoring method based on a newly proposed concurrent partial least-squares (CPLS) algorithm. At the end we demonstrate the effectiveness of the CPLS monitoring methods by two case studies.