CONCURRENT MONITORING AND DIAGNOSIS OF PROCESS AND QUALITY FAULTS WITH CANONICAL CORRELATION ANALYSIS

by Qinqin Zhu

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, in Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (CHEMICAL ENGINEERING)

December 2017

Copyright 2017 Qinqin Zhu

Dedication

To my family.

Acknowledgments

I would like to thank everyone who supported me, encouraged me and inspired me during my doctoral work. Foremost, I would like to gratefully and sincerely thank my advisor, Prof. Joe S. Qin, for all of his support and encouragement throughout my graduate studies at the University of Southern California. When I first joined Qin's lab, I knew little about what it means to do research. Prof. Qin, with his endless patience and faithful encouragement, guided me through each step of doing meaningful research. I will never forget the moments when Prof. Qin stayed up late to work with me before conference deadlines, challenged me to push every idea to its limit, and iterated over papers and talks numerous times until they reached excellence. In addition, I truly appreciate the freedom he gave me to explore the research domains I am interested in, thanks to which I thoroughly enjoyed the past four years of research. Prof. Qin's passion for research and commitment to his students make him the best academic advisor I have ever met. I learned a lot from him, not only about research but also about how to be a good mentor with care and patience.

I would also like to thank Drs. Pin Wang, Qiang Huang, Iraj Ershaghi, Cyrus Shahabi and Theodore Tsotsis for serving on my qualifying and defense committees and providing valuable feedback on my research. Their stimulating questions and comments made my qualifying and defense exams enjoyable.

It has been an honor and a privilege to be a member of Prof. Qin's research group.
This group is a collection of talented and dedicated individuals who gave me an enjoyable and memorable experience. I feel very fortunate to have known the former members who offered me useful advice and helpful suggestions, including Tao, Johnny, Yining, Hu, Yingying, Jingran, and the visiting scholars Le, Gang, Qiang, Lijuan, and Ying. I am also grateful to all the current members, Alisha, Zora, Wei and Yuan, for their friendship and consistent assistance. I especially appreciate the time spent with Dr. Qiang Liu on our joint work. He has a strong background in my research area, and it is always a great pleasure to discuss with him.

I would also like to thank the Viterbi Graduate School PhD Fellowship and the Center for Interactive Smart Oilfield Technologies (CiSoft) for their financial support during my graduate studies.

Last but not least, I am indebted to my family and friends for their endless love, care and encouragement. I would like to thank my parents, who always encourage me and believe the best in me. I would like to thank my husband, Dr. Liang Liu, for his tremendous support and encouragement. This thesis would never have been possible without him.

Table of Contents

Dedication
Acknowledgments
List of Tables
List of Figures
List of Algorithms
Abstract

Chapter 1. Introduction
1.1 Overview
1.2 Existing Latent Variable Methods
1.2.1 Principal Component Analysis
1.2.2 Partial Least Squares
1.2.3 Canonical Correlation Analysis
1.2.4 Performance Comparison
1.3 Outline of the Thesis

Chapter 2. Concurrent Quality and Process Monitoring with Canonical Correlation Analysis
2.1 Introduction
2.2 Fault Monitoring Schemes
2.3 RCCA for Quality-Relevant Monitoring
2.3.1 Regularized CCA Model
2.3.2 Comparison of CCA and PLS on Data Modeling
2.4 CCCA for Process Monitoring
2.4.1 Concurrent CCA Model
2.4.2 CCCA-based Monitoring
2.4.3 CCCA-based Modeling and Online Monitoring
2.5 Synthetic Case Studies
2.6 Tennessee Eastman Process Case Studies
2.7 Summary

Chapter 3. Quality-Relevant Fault Detection for Nonlinear Processes Based on Concurrent Kernel Canonical Correlation Analysis
3.1 Introduction
3.2 KCCA Model for Process and Quality Monitoring
3.2.1 Kernel CCA Model
3.2.2 KCCA-based Monitoring
3.3 CKCCA for Nonlinear Process Monitoring
3.3.1 Concurrent Kernel CCA Model
3.3.2 CKCCA-based Monitoring
3.4 Synthetic Case Studies
3.5 Tennessee Eastman Process Case Studies
3.6 Summary

Chapter 4. Concurrent Diagnosis of Process and Quality Faults with Regularized Canonical Correlation Analysis
4.1 Introduction
4.2 CCCA-based Fault Diagnosis
4.2.1 Quality-Relevant Monitoring with Combined Index
4.2.2 Analysis of Detectability
4.3 Quality-Relevant Diagnosis
4.3.1 Contribution Plots and RBC
4.3.2 Analysis of Diagnosability
4.3.3 Extended Reconstruction-based Contribution
4.4 Tennessee Eastman Process Case Studies
4.5 Summary

Chapter 5. Dynamic Concurrent Kernel Canonical Correlation Analysis and Its Application to a Continuous Annealing Process
5.1 Introduction
5.2 Process Description and Strip-Thickness Relevant Fault Description
5.2.1 Process Description
5.2.2 Strip-Thickness Relevant Fault Description
5.3 DCKCCA for Process Modeling and Monitoring
5.3.1 Concurrent Kernel CCA Model
5.3.2 Dynamic Concurrent Kernel CCA Model
5.3.3 DCKCCA-based Monitoring
5.4 Multi-Block DCKCCA for Fault Diagnosis
5.5 Continuous Annealing Process Case Studies
5.6 Summary

Chapter 6. Conclusions and Future Directions
6.1 Conclusions
6.2 Future Directions

Appendices
Appendix A: The CCA Algorithm
Appendix B: The Regularized CCA Algorithm
Appendix C: Calculations of Scores and Residuals of CKCCA
Appendix D: The Simplified Kernel CCA Algorithm
Appendix E: Proof of Lemma 1 and Lemma 2

Bibliography

List of Tables

Table 2.1 Monitoring Statistics and Control Limits for CCCA
Table 2.2 Root Mean Squared Errors for Numerical Simulation
Table 2.3 Weighting Matrices R for Scenario 1
Table 2.4 Weighting Matrices R for Scenario 2
Table 2.5 The Angles between Weighting Vectors in Scenarios 1 and 2 (degrees)
Table 2.6 Disturbance Description for TEP
Table 2.7 Fault Detection Rates for Quality-Relevant Disturbances with PLS, CPLS and CCCA (%)
Table 2.8 False Alarm Rates for Quality-Irrelevant Disturbances with PLS, CPLS and CCCA (%)
Table 3.1 Monitoring Statistics and Control Limits for CKCCA
Table 3.2 Fault Detection Rates for Quality-Relevant Disturbances with PLS, CCA, KCCA, CCCA and CKCCA (%)
Table 3.3 False Alarm Rates for Quality-Irrelevant Disturbances with PLS, CCA, KCCA, CCCA and CKCCA (%)
Table 4.1 Formulations of M for Monitoring Statistics
Table 4.2 Formulations of D for Contribution Plots and RBC
Table 4.3 Fault Detection Rates for Quality-Relevant Disturbances with PCA, PLS and CCCA (%)
Table 4.4 False Alarm Rates for Quality-Irrelevant Disturbances with PCA, PLS and CCCA (%)

List of Figures

Figure 2.1 Offline Modeling Scheme for CCCA
Figure 2.2 Online Monitoring Scheme for CCCA
Figure 2.3 Quality-Relevant Fault 1 Occurred in the CRS Subspace
Figure 2.4 Monitoring Results for Fault 1 with Regularized CCCA (f = 6)
Figure 2.5 Quality-Irrelevant Fault 2 Occurred in the PPS Subspace
Figure 2.6 Monitoring Results for Fault 2 with Regularized CCCA (f = 4)
Figure 2.7 Quality Variables and Quality Monitoring Results for IDV(1)
Figure 2.8 PLS-based Monitoring Results for IDV(1)
Figure 2.9 CCCA-based Monitoring Results for IDV(1)
Figure 2.10 Quality Variables and Quality Monitoring Results for IDV(4)
Figure 2.11 PLS-based Monitoring Results for IDV(4)
Figure 2.12 CCCA-based Monitoring Results for IDV(4)
Figure 3.1 Monitoring Results for Fault 1 with KCCA (f1 = 3)
Figure 3.2 Monitoring Results for Fault 1 with CCCA (f1 = 3)
Figure 3.3 Monitoring Results for Fault 1 with CKCCA (f1 = 3)
Figure 3.4 Monitoring Results for Fault 2 with KCCA (f2 = 1)
Figure 3.5 Monitoring Results for Fault 2 with CCCA (f2 = 1)
Figure 3.6 Monitoring Results for Fault 2 with CKCCA (f2 = 1)
Figure 3.7 CCA-based Monitoring Results for IDV(2)
Figure 3.8 KCCA-based Monitoring Results for IDV(2)
Figure 3.9 CCCA-based Monitoring Results for IDV(2)
Figure 3.10 CKCCA-based Monitoring Results for IDV(2)
Figure 3.11 CCA-based Monitoring Results for IDV(4)
Figure 3.12 KCCA-based Monitoring Results for IDV(4)
Figure 3.13 CKCCA-based Monitoring Results for IDV(4)
Figure 4.1 CCCA-based Fault Monitoring Results for IDV(1)
Figure 4.2 Fault Diagnosis Results for the 42nd Sample Using Contribution Plots (Upper) and RBC (Lower) with $T_c^2$
Figure 4.3 Fault Diagnosis Results for the 42nd Sample Using Contribution Plots (Upper) and RBC (Lower) with $\tilde{Q}_x$
Figure 4.4 Fault Diagnosis Results for the 42nd Sample Using Contribution Plots (Upper) and RBC (Lower) with the Combined Index
Figure 4.5 RBC Diagnosis Results for All Samples (Upper) and Average RBC (Lower)
Figure 4.6 RBC Diagnosis Results After Reconstructing along Variable 4 (Upper: RBC for All Samples; Lower: Average RBC)
Figure 4.7 RBC Diagnosis Results After Reconstructing along Variables 4 and 23 (Upper: RBC for All Samples; Lower: Average RBC)
Figure 4.8 RBC Diagnosis Results After Reconstructing along Variables 4, 23 and 8 (Upper: RBC for All Samples; Lower: Average RBC)
Figure 4.9 Quality-Relevant Monitoring and Reconstruction Results for IDV(1) Using the Combined Index
Figure 4.10 CCCA-based Fault Monitoring Results for IDV(3)
Figure 4.11 Fault Diagnosis Results for the 50th Sample Using Contribution Plots (Upper) and RBC (Lower) with $T_c^2$
Figure 4.12 Fault Diagnosis Results for the 50th Sample Using Contribution Plots (Upper) and RBC (Lower) with $\tilde{Q}_x$
Figure 4.13 Fault Diagnosis Results for the 50th Sample Using Contribution Plots (Upper) and RBC (Lower) with the Combined Index
Figure 5.1 Layout of Continuous Annealing Process
Figure 5.2 DCKCCA-based Quality Relevant Fault Diagnosis Flow Chart
Figure 5.3 Quality Variable and Selected Contribution Process Variables
Figure 5.4 Measured Strip Thickness and Predicted Strip Thickness (Normal Case)
Figure 5.5 Auto-Correlation Curves of Prediction Errors (Normal Case)
Figure 5.6 Measured Strip Thickness and Predicted Strip Thickness (Faulty Case 1)
Figure 5.7 CAP Process Monitoring Results (Faulty Case 1)
Figure 5.8 CAP Strip-Thickness Relevant Diagnosis Results Using Multi-Block DCKCCA (Faulty Case 1)
Figure 5.9 Original Data for Contributed Variables (Faulty Case 1)
Figure 5.10 CAP Process Monitoring Results (Faulty Case 2)
Figure 5.11 Measured Strip Thickness (Faulty Case 2)

List of Algorithms

Algorithm 1 The Nonlinear Iterative Partial Least Squares Algorithm
Algorithm 2 Cross Validation for the Regularized CCA Model
Algorithm 3 The Concurrent CCA Algorithm
Algorithm 4 The Concurrent Kernel CCA Algorithm
Algorithm 5 The Extended Reconstruction-based Contribution Algorithm
Algorithm 6 The Simplified Concurrent Kernel CCA Algorithm

Abstract

Statistical process monitoring and fault diagnosis apply multivariate statistical analysis techniques to process and quality data to monitor and diagnose disturbances in industrial processes. Partial least squares (PLS) and canonical correlation analysis (CCA) are two popular supervised learning methods for this purpose. The PLS model and PLS-based monitoring framework have been widely studied and used for quality-relevant monitoring. However, the discrepant objectives of the PLS inner and outer models lead to problems such as irrelevant components in the extracted latent variables and large variances in the process residual subspace. CCA extracts the multi-dimensional correlation structure between process and quality variables, which enables it to maximize the quality prediction from process data.
CCA can be used to overcome these drawbacks of PLS; however, CCA focuses only on the correlation and ignores the variance information, so the process residual subspace may still contain a large portion of unexploited variance. In addition, CCA suffers from the collinearity problems that often exist in process data. Thus, in this thesis, a concurrent CCA (CCCA) approach with regularization is proposed to exploit the variance and covariance structure in the process-specific and quality-specific spaces. The CCCA method retains CCA's efficiency in predicting the quality while exploiting the variance structure for quality and process monitoring using subsequent principal component decompositions. The corresponding monitoring statistics and control limits are then developed in these subspaces. Additionally, several types of monitoring schemes are clearly formulated, including process monitoring, quality monitoring, quality-relevant monitoring and process-specific monitoring, each serving a particular purpose.

To extend the applicability of CCCA, a concurrent kernel CCA (CKCCA) is proposed for nonlinear process monitoring, which decomposes the original space into five subspaces: the correlation subspace, quality-principal subspace, quality-residual subspace, process-principal subspace and process-residual subspace. The proposed CKCCA considers the nonlinearity in both process and quality variables, and incorporates a regularization term for numerical robustness.

The above methods are applicable to data from steady-state operations in industry. A dynamic concurrent kernel canonical correlation analysis (DCKCCA) approach is developed to monitor and diagnose dynamic processes. A continuous annealing process (CAP) is used as a practical example to illustrate the performance of the DCKCCA algorithm.
First, a DCKCCA algorithm is proposed to capture dynamic nonlinear correlations between strip thickness and process variables. Strip-thickness-specific variations, process-specific variations, and thickness-process covariations are monitored respectively. Then, in order to localize faults that are relevant to abnormal strip thickness, a multi-block extension of DCKCCA is designed to compute the contributions according to the block partitions of lagged variables.

After faults are detected, their root causes should be localized. Therefore, the contribution plots and reconstruction-based contribution (RBC) diagnosis methods are developed for concurrent fault diagnosis. For multi-dimensional faults, the diagnosis results of these two methods can be ambiguous and subjective due to smearing effects, so an extended RBC approach is proposed to generalize them to multi-dimensional faulty variables. An analysis of the detectability and diagnosability of these diagnosis approaches is also presented.

Detailed case studies on the Tennessee Eastman process and the continuous annealing process illustrate the performance of all the proposed monitoring and diagnosis methods on process and quality faults, as well as the prognosis of quality-relevant faults.

Chapter 1. Introduction

1.1 Overview

Industrial processes employ a large number of sensors to measure flows, temperatures and pressures, among other variables. Accurate measurements of such variables are important for the safe operation and control of the processes. However, there are times when faults occur in the sensors or in some equipment in the process. In these cases, it is important to detect and diagnose the faults; otherwise, the consequences might be expensive for the process, and its safety would be compromised. Therefore, it is necessary to implement mechanisms to detect and diagnose faults in sensors and process variables as soon as possible.
The implementation and use of such mechanisms is called process monitoring.

At a general level, the methods employed for process monitoring can be classified into two categories: model-driven and data-driven. Model-driven methods are based on first-principles models (FPM), which describe the physical and chemical mechanisms of the processes. However, these approaches are hard to implement when the process has a large number of variables and its analytical model is difficult to obtain. Data-driven methods, or statistical methods, in contrast, do not involve explicit knowledge of the process models, since they build statistical models using process measurements only.

Given the large number of sensors and measurements involved in modern process operations, multivariate statistical process monitoring based on process variables and quality variables has been widely used in industrial processes, including chemicals, polymers, microelectronics manufacturing and pharmaceutical processes [1–7]. Among these methods, principal component analysis (PCA), partial least squares (PLS) and canonical correlation analysis (CCA) are three basic multivariate statistical methods [8].

1.2 Existing Latent Variable Methods

1.2.1 Principal Component Analysis

PCA models are predominantly used in statistical process monitoring to extract variable correlations from data [9, 10]. Let $\mathbf{x} \in \mathbb{R}^m$ denote a sample of $m$ sensors. Assuming that there are $N$ samples for each sensor, a data matrix $\mathbf{X} \in \mathbb{R}^{N \times m}$ is composed with each row representing a sample. The matrix $\mathbf{X}$ is scaled to zero mean for covariance-based PCA and, in addition, to unit variance for correlation-based PCA [3]. The covariance of $\mathbf{X}$ is approximated by the sample covariance matrix

$$\mathbf{S} = \frac{1}{N-1}\mathbf{X}^\top\mathbf{X} \tag{1.1}$$

By either the NIPALS algorithm [11] or the singular value decomposition (SVD) method, the matrix $\mathbf{X}$ can be decomposed into a score matrix $\mathbf{T}$ and a loading matrix $\mathbf{P}$ as follows.
$$\mathbf{X} = \mathbf{T}\mathbf{P}^\top + \tilde{\mathbf{X}} = \mathbf{T}\mathbf{P}^\top + \tilde{\mathbf{T}}\tilde{\mathbf{P}}^\top = \begin{bmatrix} \mathbf{T} & \tilde{\mathbf{T}} \end{bmatrix} \begin{bmatrix} \mathbf{P} & \tilde{\mathbf{P}} \end{bmatrix}^\top \equiv \bar{\mathbf{T}}\bar{\mathbf{P}}^\top \tag{1.2}$$

where $\tilde{\mathbf{X}} = \tilde{\mathbf{T}}\tilde{\mathbf{P}}^\top$ is the residual matrix, $\bar{\mathbf{T}} = [\mathbf{T}\ \tilde{\mathbf{T}}]$, and $\bar{\mathbf{P}} = [\mathbf{P}\ \tilde{\mathbf{P}}]$. Since the columns of $\bar{\mathbf{T}}$ are orthogonal, Eq. (1.1) can be represented as

$$\mathbf{S} = \bar{\mathbf{P}}\bar{\boldsymbol{\Lambda}}\bar{\mathbf{P}}^\top = \mathbf{P}\boldsymbol{\Lambda}\mathbf{P}^\top + \tilde{\mathbf{P}}\tilde{\boldsymbol{\Lambda}}\tilde{\mathbf{P}}^\top = \hat{\mathbf{S}} + \tilde{\mathbf{S}} \tag{1.3}$$

where

$$\bar{\boldsymbol{\Lambda}} = \frac{1}{N-1}\bar{\mathbf{T}}^\top\bar{\mathbf{T}} = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_m\}, \qquad \lambda_i = \frac{1}{N-1}\mathbf{t}_i^\top\mathbf{t}_i$$

$\hat{\mathbf{S}} = \mathbf{P}\boldsymbol{\Lambda}\mathbf{P}^\top$, $\tilde{\mathbf{S}} = \tilde{\mathbf{P}}\tilde{\boldsymbol{\Lambda}}\tilde{\mathbf{P}}^\top$, and $\bar{\mathbf{T}} = [\mathbf{t}_1, \mathbf{t}_2, \ldots, \mathbf{t}_m]$. From Eq. (1.3), the original space is decomposed into the principal component subspace (PCS) $\mathcal{S}_p = \mathrm{span}\{\mathbf{P}\}$ and the residual subspace (RS) $\mathcal{S}_r = \mathrm{span}\{\tilde{\mathbf{P}}\}$. A new measurement $\mathbf{x}$ can be decomposed as $\mathbf{x} = \hat{\mathbf{x}} + \tilde{\mathbf{x}}$, where $\hat{\mathbf{x}} = \mathbf{P}\mathbf{P}^\top\mathbf{x}$ is the projection onto the PCS and $\tilde{\mathbf{x}} = \tilde{\mathbf{P}}\tilde{\mathbf{P}}^\top\mathbf{x}$ is the projection onto the RS.

1.2.2 Partial Least Squares

PLS, proposed by Wold et al. [12], is a family of methods to model relations between two sets of variables. It has been a popular tool for regression and classification as well as dimensionality reduction [13–15]. It has received a great amount of attention in the field of chemometrics and has become a standard tool for processing a wide spectrum of chemical data problems. The success of PLS in chemometrics has led to many applications in other scientific areas, including bioinformatics, food research, medicine, pharmacology and the social sciences, to name but a few [16–22].

PLS extracts a latent structure between an input matrix $\mathbf{X} \in \mathbb{R}^{N \times m}$ and an output matrix $\mathbf{Y} \in \mathbb{R}^{N \times p}$, where $N$ and $m$ are the same as those specified in Subsection 1.2.1, and $p$ is the number of quality variables. The latent variables of PLS are extracted by maximizing the covariance between $\mathbf{X}$ and $\mathbf{Y}$. Mathematically, this is expressed as

$$\max_{\mathbf{t},\,\mathbf{u}} J_{\mathrm{PLS}} = \mathbf{t}^\top\mathbf{u} \quad \text{s.t. } \|\mathbf{w}\|_2 = 1,\ \|\mathbf{q}\|_2 = 1 \tag{1.4}$$

where $\mathbf{t}$ and $\mathbf{u}$ are score vectors for $\mathbf{X}$ and $\mathbf{Y}$, related to the weighting vectors $\mathbf{w}$ and $\mathbf{q}$ by

$$\mathbf{t} = \mathbf{X}\mathbf{w}, \qquad \mathbf{u} = \mathbf{Y}\mathbf{q} \tag{1.5}$$

The solution to this optimization problem can be obtained with Lagrange multipliers, which leads to

$$\mathbf{X}^\top\mathbf{Y}\mathbf{q} = \lambda_w\mathbf{w}, \qquad \mathbf{Y}^\top\mathbf{X}\mathbf{w} = \lambda_q\mathbf{q} \tag{1.6}$$

where $\lambda_w$ and $\lambda_q$ are Lagrange multipliers.
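The PCS/RS split of Eqs. (1.2)–(1.3) can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the thesis: the function names are invented, the data are assumed already mean-centered, and the eigendecomposition of the sample covariance stands in for the NIPALS/SVD routes mentioned above.

```python
import numpy as np

def pca_subspaces(X, l):
    """Split the loading space of mean-centered X (N x m) into
    PCS loadings P (m x l) and RS loadings P~ (m x (m-l))."""
    N, m = X.shape
    S = X.T @ X / (N - 1)              # sample covariance, Eq. (1.1)
    lam, Pbar = np.linalg.eigh(S)      # eigenvectors of S, ascending eigenvalues
    order = np.argsort(lam)[::-1]      # reorder to descending variance
    Pbar = Pbar[:, order]
    P, Ptil = Pbar[:, :l], Pbar[:, l:] # principal vs residual loadings
    return P, Ptil

def project(x, P, Ptil):
    """Decompose a new sample x as x = x_hat + x_til (PCS + RS parts)."""
    x_hat = P @ (P.T @ x)              # projection onto the PCS
    x_til = Ptil @ (Ptil.T @ x)        # projection onto the RS
    return x_hat, x_til
```

Because $\bar{\mathbf{P}}$ is orthonormal, the two projections always sum back to the original sample, which is the property the $T^2$/$Q$ monitoring statistics later exploit separately in each subspace.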
The process that builds the relation between the latent score vectors and the corresponding observations is called outer modeling, which is obtained by iterating through Eqs. (1.5) and (1.6). A linear inner model can then be built between the latent scores $\mathbf{t}$ and $\mathbf{u}$. In order to extract the next latent component, $\mathbf{X}$ and $\mathbf{Y}$ are deflated with the input score $\mathbf{t}$ and the estimated output score $\hat{\mathbf{u}}$, respectively. The detailed PLS algorithm is presented in Algorithm 1 [23, 24].

Algorithm 1: The Nonlinear Iterative Partial Least Squares (NIPALS) Algorithm

1. Scale $\mathbf{X}$ and $\mathbf{Y}$ to zero mean and unit variance, or use some other scaling rule. Set $\mathbf{X}_1 = \mathbf{X}$, $\mathbf{Y}_1 = \mathbf{Y}$, and $i = 1$.
2. Outer modeling: choose a starting $\mathbf{u}_i$ as some column of $\mathbf{Y}_i$ and iterate the following equations until $\mathbf{t}_i$ converges or a maximum number of iterations has elapsed:
   $\mathbf{w}_i = \mathbf{X}_i^\top\mathbf{u}_i / \|\mathbf{X}_i^\top\mathbf{u}_i\|$; $\quad \mathbf{t}_i = \mathbf{X}_i\mathbf{w}_i$; $\quad \mathbf{q}_i = \mathbf{Y}_i^\top\mathbf{t}_i / \|\mathbf{Y}_i^\top\mathbf{t}_i\|$; $\quad \mathbf{u}_i = \mathbf{Y}_i\mathbf{q}_i$.
3. Inner modeling: calculate the inner regression coefficient $b_i = \mathbf{u}_i^\top\mathbf{t}_i / \mathbf{t}_i^\top\mathbf{t}_i$.
4. Residual deflation: $\mathbf{p}_i = \mathbf{X}_i^\top\mathbf{t}_i / \mathbf{t}_i^\top\mathbf{t}_i$; $\quad \mathbf{X}_{i+1} = \mathbf{X}_i - \mathbf{t}_i\mathbf{p}_i^\top$; $\quad \mathbf{Y}_{i+1} = \mathbf{Y}_i - b_i\mathbf{t}_i\mathbf{q}_i^\top$.
5. Set $i := i + 1$ and return to Step 2 until $i = i_{\max}$.

After performing PLS, the scaled and mean-centered $\mathbf{X}$ and $\mathbf{Y}$ are decomposed as

$$\mathbf{X} = \mathbf{T}\mathbf{P}^\top + \mathbf{E}, \qquad \mathbf{Y} = \mathbf{T}\mathbf{Q}^\top + \mathbf{F} \tag{1.7}$$

where $\mathbf{T} = [\mathbf{t}_1, \mathbf{t}_2, \ldots, \mathbf{t}_l]$, $\mathbf{P} = [\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_l]$, and $\mathbf{Q} = [\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_l]$. $\mathbf{E}$ and $\mathbf{F}$ are the residuals of $\mathbf{X}$ and $\mathbf{Y}$, and $l$ is the number of PLS factors, determined by cross validation.

1.2.3 Canonical Correlation Analysis

CCA [25] is commonly used to find the correlations between two sets of variables. CCA seeks a pair of linear transformations, one for each set of variables, such that the data are maximally correlated in the transformed space. CCA has been applied successfully in various areas, including regression, discrimination, and dimensionality reduction [26, 27].

CCA focuses on the multi-dimensional correlation between $\mathbf{X}$ and $\mathbf{Y}$, which enables it to build a more efficient prediction. The objective of CCA is expressed as

$$\max_{\mathbf{t},\,\mathbf{u}} J_{\mathrm{CCA}} = \mathbf{t}^\top\mathbf{u} \quad \text{s.t. } \|\mathbf{t}\|_2 = 1,\ \|\mathbf{u}\|_2 = 1 \tag{1.8}$$

where $\mathbf{t}$ and $\mathbf{u}$ are the score vectors for $\mathbf{X}$ and $\mathbf{Y}$ in CCA.

Assume that both $\mathbf{X}^\top\mathbf{X}$ and $\mathbf{Y}^\top\mathbf{Y}$ are nonsingular. Then the weighting vectors $\mathbf{w}$ and $\mathbf{q}$ are given by the $l$ principal eigenvectors of the following generalized eigenvalue problems:

$$(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{Y}(\mathbf{Y}^\top\mathbf{Y})^{-1}\mathbf{Y}^\top\mathbf{X}\,\mathbf{w} = \lambda_t\lambda_u\,\mathbf{w}$$
$$(\mathbf{Y}^\top\mathbf{Y})^{-1}\mathbf{Y}^\top\mathbf{X}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{Y}\,\mathbf{q} = \lambda_t\lambda_u\,\mathbf{q} \tag{1.9}$$

where $\lambda_t$ and $\lambda_u$ are the Lagrange multipliers for CCA. The number of non-zero eigenvalues of Eq. (1.9) is equal to the rank of $\mathbf{X}^\top\mathbf{Y}$, which is less than or equal to the smaller of the dimensions of $\mathbf{X}$ and $\mathbf{Y}$.

The weighting vectors ($\mathbf{w}$ and $\mathbf{q}$) and the canonical variates ($\mathbf{t}$ and $\mathbf{u}$) of CCA can be solved by either a combination of eigenvalue decomposition and SVD, or by three SVDs only (Appendix A). Similar to PLS, the mean-centered $\mathbf{X}$ and $\mathbf{Y}$ are decomposed by CCA as

$$\mathbf{X} = \mathbf{T}\mathbf{P}^\top + \mathbf{E}, \qquad \mathbf{Y} = \mathbf{T}\mathbf{Q}^\top + \mathbf{F} \tag{1.10}$$

where $\mathbf{T}$, $\mathbf{P}$ and $\mathbf{Q}$ have the same meanings as in PLS.

In essence, CCA finds the directions of maximum correlation while PLS finds the directions of maximum covariance. Covariance and correlation are two different statistical metrics that quantify how variables covary. It has been shown that there is a close connection between PLS and CCA in discrimination [28]. Rosipal and Krämer [29] derived a unified framework for PLS and CCA, which (writing the hyper-parameters as $\gamma_x$ and $\gamma_y$) is

$$\max_{\mathbf{w},\,\mathbf{q}} J_{\mathrm{unified}} = \frac{\mathbf{w}^\top\mathbf{X}^\top\mathbf{Y}\mathbf{q}}{\sqrt{\mathbf{w}^\top\left[(1-\gamma_x)\,\mathbf{X}^\top\mathbf{X} + \gamma_x\mathbf{I}\right]\mathbf{w}}\ \sqrt{\mathbf{q}^\top\left[(1-\gamma_y)\,\mathbf{Y}^\top\mathbf{Y} + \gamma_y\mathbf{I}\right]\mathbf{q}}} \tag{1.11}$$

When $\gamma_x = 0$ and $\gamma_y = 0$, Eq. (1.11) reduces to the objective of CCA, while when $\gamma_x = 1$ and $\gamma_y = 1$, Eq. (1.11) reduces to the objective of PLS. A more detailed comparison of PLS and CCA is presented in Chapter 2.
1.2.4 Performance Comparison

When the quality measurements are expensive or difficult to obtain, PCA is widely used for process monitoring on account of its ability to handle high-dimensional, noisy, and highly correlated data by projecting the data onto a lower-dimensional subspace that contains most of the variance of the original data [2]. However, PCA-based monitoring methods are only effective for monitoring variations in the process variables (X); no information from the quality variables (Y) is extracted, which may lead to nuisance alarms that reduce their reliability. In industrial processes, variations or disturbances in process variables may be compensated by feedback controllers and other operator corrections, and thus have no influence on the quality variables. Monitoring only the process variables will therefore lead to nuisance alarms on disturbances that have no sensible effect on the quality variables.

In order to include the information of the quality variables, PLS-based and CCA-based modeling methods are employed to find the covariances or correlations between process variables and quality variables [6]. PLS is a data decomposition method for maximizing covariances between X and Y, and it has been studied intensively. Li et al. [30] studied the effect of quality variables on the X-space decomposition and the geometric properties of PLS. Unlike PCA, PLS does not decompose the X-factors in descending order of variance magnitude. Therefore, the residuals that are not useful for predicting the Y data can be large in terms of variance magnitude. In process monitoring based on PLS methods, the X-space is decomposed into two subspaces, which are monitored by the $T^2$ and $Q$ statistics [3, 31], respectively. Although this works well in some cases, there are several problems with this monitoring scheme. Firstly, the PLS components that form $T^2$ usually contain variations orthogonal to Y, which are useless for predicting Y.
Even for a single quality variable, PLS takes many factors to achieve optimal prediction. Moreover, PLS does not maximize the variance of the input data with each factor, so its residuals are not necessarily small, and it is not appropriate to apply the $Q$ statistic to the PLS residuals.

In order to detect quality-relevant faults with X, it is desirable to extract the subspace in X that is exactly relevant to predicting Y, no more and no less. If a fault has no effect in this subspace, it is quality-irrelevant. Recent work in [32] proposed a total PLS (T-PLS), which further decomposes the PLS leading factors and the PLS residuals and divides the X-space into four parts. Qin et al. [33] put forward a concurrent PLS (CPLS) to overcome the drawbacks of T-PLS by monitoring process-relevant and quality-relevant faults separately. In both methods, an orthogonalization step is required to remove the excessive dimensions in the PLS latent factors that are irrelevant to the output Y, since PLS tends to use more latent factors than the number of quality variables to adequately predict the quality variables.

PLS-related methods are classes of affine transformations of the process and quality variables, and are more robust to collinearity when predicting the quality variables. CCA, in contrast, focuses only on extracting the multi-dimensional correlation between X and Y. CCA is widely used in signal processing, computer vision and behavioral studies [34, 35]. CCA-based monitoring, however, has not been the favored choice due to its lack of attention to the variance structure in the data [36]. In addition, extracting the CCA components involves matrix inversions that can lead to ill-conditioning when strong collinearity exists in the X or Y space.

1.3 Outline of the Thesis

This thesis focuses on developing a concurrent CCA based fault monitoring and diagnosis framework, and is organized as follows.
In Chapter 2, a description and analysis of quality-relevant monitoring is provided, and several quality-relevant detection statistics are analyzed. A concurrent canonical correlation analysis (CCCA) algorithm for quality-relevant fault detection based on the regularized CCA model is then proposed. Similar to CPLS, CCCA decomposes the original data space into five subspaces. First, the correlation subspace (CRS) is obtained using CCA with regularization, by extracting the canonical variates that are directly relevant to the predictable variations of the quality variables. The remaining part of the process variables is then divided further into a process-principal subspace (PPS) and a process-residual subspace (PRS). The remaining part of the quality variables is likewise decomposed into a quality-principal subspace (QPS) and a quality-residual subspace (QRS). The corresponding fault indices and control limits for the five subspaces are also developed. The efficiency of the proposed CCCA is validated through a synthetic case study and the Tennessee Eastman process.

In Chapter 3, a novel concurrent kernel CCA (CKCCA) method is proposed, which employs KCCA to extract the correlated part between the feature spaces of the input and output. In CKCCA, unlike in previous kernel methods, the nonlinearity existing in the quality variables is also considered. Thus both the process variables X and the quality variables Y are mapped from the original spaces into higher-dimensional feature spaces. Meanwhile, regularization is introduced to control the flexibility of the projections by penalizing the norms of the associated weight vectors. The remaining parts in both the process and quality feature spaces are then decomposed further by PCA. The corresponding fault indices and control limits for the subspaces are also developed. The effectiveness of CKCCA is demonstrated through a numerical simulation and an industrial case study.
In Chapter 5, a dynamic nonlinear version of concurrent CCA is proposed and applied to strip-thickness-relevant fault diagnosis, considering the nonlinearities and dynamics of the continuous annealing process (CAP). The new dynamic concurrent kernel CCA (DCKCCA) modeling and monitoring method deals with the following three aspects: (i) lagged values of process and quality variables are used to generate augmented data matrices, which provides an easy way to model dynamic cross-correlations and auto-correlations; (ii) the kernel technique derived from kernel CCA is used to model nonlinearities; (iii) concurrent decomposition of the dynamic kernel CCA residuals in both process and quality variables is used to achieve comprehensive monitoring. In addition, to deal with the large dimensions induced by the dynamic augmentation of the data matrices, DCKCCA is extended to multi-block DCKCCA, where each original variable and its corresponding lagged values are incorporated into a single block to generate a new contribution index to evaluate the fault effects on each variable.

In Chapter 4, the traditional diagnosis methods, contribution plots and reconstruction-based contribution (RBC), are developed for concurrent CCA, and an analysis of diagnosability is also presented. RBC is an improved diagnosis method compared to contribution plots; however, for multi-dimensional faults, its diagnosis result can still be ambiguous. Thus, an extended RBC (ERBC) diagnosis approach is proposed for multi-dimensional quality-relevant faults. Finally, a detailed case study on the Tennessee Eastman process is shown to illustrate the diagnosis of process and quality faults and the prognosis of quality-relevant faults.

Chapter 6 gives some concluding remarks and potential future work along the direction of the thesis.
Chapter 2
Concurrent Quality and Process Monitoring with Canonical Correlation Analysis

2.1 Introduction

Statistical process monitoring and fault diagnosis apply multivariate statistical analysis techniques to process data to monitor and diagnose disturbances in a process, which has been one of the most active research areas in process systems engineering over the past several decades [3, 37]. If real-time data from a process fall outside the predefined normal control limit, an alarm is signaled to alert the operation personnel. If the alarm persists, intervention or even shutdown of the process operation is recommended in, for instance, semiconductor manufacturing operations. However, in practice, anomalies in process variables alone may not lead to an anomaly in product quality due to corrective effort by human operators and feedback controllers. Alarming on faults due to process variable deviations alone may lead to nuisance alarms and reduce the reliability of the fault detection methods. The quality-relevant variations should receive a higher level of attention than the process-relevant variations. Thus, quality-relevant monitoring and diagnosis is the main focus of this chapter.

With the availability of high-dimensional process data or big data, data-driven latent structure modeling methods, such as PCA, PLS and CCA, are typical modeling tools for fault detection and diagnosis in industry [2, 33, 36, 38]. PLS is a data decomposition method that maximizes the covariance between the process variables X and the quality variables Y. With iterative calculations, PLS decomposes the original spaces into principal and residual subspaces, which can be monitored by the $T^2$ and $Q$ statistics, respectively. However, PLS usually requires many factors to predict even one output variable, making a large fraction of the latent space orthogonal to the output to be predicted.
In addition, PLS can leave large variances in the residual subspace if they are irrelevant to predicting the output, which is different from PCA residuals and thus should be monitored differently from PCA-based monitoring indices. Recent efforts have been devoted to overcoming these issues, including total PLS [32], concurrent PLS [33], and concurrent CCA [36]. PLS-related methods simultaneously exploit the process and quality structures, and are robust to collinearity. CCA, in contrast, extracts the multidimensional correlation between X and Y with no attention to the magnitude of the variance in each set of variables, which enables it to build an efficient model with as few latent factors as possible. In doing so, however, CCA requires inverting the input and output covariance matrices, making it susceptible to collinearity or strong correlations.

In this chapter, we propose concurrent monitoring of process and quality faults based on CCA and PCA, where regularization terms are added to the objective of CCA to handle the collinearity problem. The proposed algorithm is referred to as concurrent canonical correlation analysis (CCCA) with regularization for quality-relevant fault detection. The CCCA monitoring approach is inspired by the CPLS approach, while the efficiency of CCA in process-quality prediction is incorporated into the CCCA modeling.

The remainder of this chapter is organized as follows. Section 2.2 defines and analyzes quality-relevant monitoring based on the popular latent structure modeling methods described in Chapter 1, and the associated quality-relevant monitoring statistics are defined. A regularized version of CCA is presented in Section 2.3 to deal with the ill-conditioning in CCA caused by strong collinearity in the X or Y space. A comparison between CCA and PLS is demonstrated as well. In Section 2.4, the concurrent CCA modeling algorithm and the corresponding fault monitoring scheme are developed.
The detailed procedure of offline modeling and online monitoring is also described in this section. Section 2.5 shows that regularized CCCA is robust to noise, and that it can correctly detect quality-relevant faults in strongly collinear cases. The Tennessee Eastman process is employed to illustrate the effectiveness of CCCA against PLS and CPLS in Section 2.6. Finally, conclusions are summarized in the last section.

2.2 Fault Monitoring Schemes

In this section, several types of fault monitoring schemes are described and analyzed, including process monitoring, quality monitoring, quality-relevant monitoring and quality-irrelevant process monitoring. Quality-relevant monitoring is highlighted as a new scheme, and some of its recent developments are presented.

As described in Chapter 1, two classes of methodologies are available for fault monitoring. One is to build models for the process based on first principles, and the other is to build a model with normal data and use it to detect faults that deviate from the normal case, which is referred to as the data-driven method. It is obvious that the former approach requires much more modeling effort than the latter, and it can be difficult to build rigorous models related to various types of faults and quality variables. Thus, the data-driven approach is more popular in industry, with tools ranging from PCA, PLS and CCA to other variants. Depending on the variations in the process and quality data, monitoring schemes can be classified into four types.

(1) Process monitoring (PM). PM applies multivariate statistics and machine learning methods to fault detection and diagnosis based on process data, and PCA is one of the most popular methods. PM focuses on monitoring variations inside process variables, and no information on the quality variables is included or required.
It works well for monitoring and diagnosing upsets in the process; however, the variations or disturbances among process variables may have no influence on the final quality, since they can be compensated by feedback controllers or corrective actions by the human operator. Thus, monitoring the process variables only can lead to nuisance alarms that have no sensible effect on quality variables.

(2) Quality monitoring (QM). Since product quality is the main concern in industry, QM has been practiced, which focuses on the variations in quality variables. Typically, the Hotelling's $T^2$ and $Q$ statistics are used to detect abnormal cases in QM [39]. However, QM does not pinpoint which process variables contribute to the quality problems, since no models are built to correlate quality variables with process variables. Additionally, quality variables are usually measured at a much slower rate than process variables and with measurement delays, thus a large delay often occurs in QM-based fault detection and diagnosis.

(3) Quality-relevant monitoring (QRM). QRM refers to the fault detection and diagnosis of quality variables that can be inferred or predicted from process variables. Input-output data-driven models built from, e.g., PLS and CCA are usually employed in QRM [32, 33, 36]. QRM can have tiers in its structure depending on the use of mid-course, intermediate or final quality variables. For example, the quality variables in Tier 1 can be composed of final or main product quality, and variables in Tier 2 can be composed of intermediate quality variables.

(4) Quality-irrelevant process monitoring (QIPM). Process variations that are not quality-relevant can also be monitored, although the attention level should be much lower than for QRM, since they are excited in the data but have no relevance to quality. The monitoring of this portion of variations is the same as regular PM, with the portion of quality-relevant variations removed.
In the following sections, QRM will be the main focus, since it can provide monitoring of quality variables with process measurements using an inferential model that can be executed as frequently and as soon as the process measurements are available. In this sense, QRM is prognostic. On the other hand, QM is the final authority to determine whether product quality is indeed abnormal or not, although it usually involves long time delays and long sampling intervals. Since the quality variables cannot be perfectly predicted from process variables, depending on the goodness of fit of the model, there can still be false alarms when using QRM to predict quality anomalies. Nevertheless, since QRM uses supervised models with the help of quality data in the model building phase, while PM uses unsupervised models, the fault detection rates (FDR) of QRM should be higher than those of PM, and its false alarm rates (FAR) lower.

It is noted that process faults are different from process disturbances that can be well compensated by feedback control, such that no sensible effect results in the product quality. PM that uses process variables alone is prone to false alarms caused by process disturbances that can be well compensated by feedback control. Therefore, it is inappropriate to treat process disturbances as process faults, for instance, when the disturbances in the Tennessee Eastman process are used to demonstrate the detection rates of various methods (e.g., [8]). Many papers try to show an incrementally higher detection rate for a disturbance with their methods, but it is actually a high false alarm rate in the case of quality-irrelevant disturbances, which can be handled by feedback control and thus should not be of concern at all.

2.3 RCCA for Quality-Relevant Monitoring

2.3.1 Regularized CCA Model

The CCA algorithm in Appendix A involves matrix inversions or pseudo-inversions of $X^\top X$ and $Y^\top Y$ in Eqs. (A.3)-(A.4).
However, when there exists strong collinearity in the X or Y space, which is very common in practical industrial processes, some eigenvalues of $X^\top X$ and $Y^\top Y$ are zero or quite close to zero, and the components extracted by the CCA algorithm in Appendix A will be very sensitive to noise in the modeling data, suffering from the same ill-conditioned problem as ordinary least squares.

In order to deal with the ill-conditioning issue, a regularized CCA (RCCA) is developed. In RCCA, regularization terms are added to the objective function to ensure non-zero eigenvalues,

$$\max_{r,\,c} \; J_{RCCA} = r^\top X^\top Y c \quad \text{s.t.} \quad r^\top X^\top X r + \lambda_1 \|r\|^2 = 1, \quad c^\top Y^\top Y c + \lambda_2 \|c\|^2 = 1 \tag{2.1}$$

where r and c correspond to w and q defined in Section 1.2.3, respectively, and $\lambda_i$ ($i = 1, 2$) is the regularization parameter employed to ensure non-zero eigenvalues. When $\lambda_i$ is set to zero, the original CCA is recovered; when $\lambda_i$ is large enough that the constraints in Eq. (2.1) are dominated by $\lambda_1 \|r\|^2$ and $\lambda_2 \|c\|^2$, RCCA reduces to PLS. The detailed algorithm of RCCA is shown in Appendix B, and RCCA is employed in the following sections.

After performing RCCA, X and Y are decomposed as follows, similar to Eq. (1.10),

$$X = T P^\top + E, \qquad Y = T Q^\top + F \tag{2.2}$$

where $T = [t_1, \ldots, t_l]$ is the set of canonical variates for X. The loading matrices P and Q are obtained by minimizing $\|X - T P^\top\|^2$ and $\|Y - T Q^\top\|^2$, which leads to

$$P = X^\top T, \qquad Q = Y^\top T \tag{2.3}$$

and the canonical variates T are calculated by

$$T = X R \tag{2.4}$$

where $R = [r_1, \ldots, r_l]$ is the weighting matrix in RCCA.

In the regularized CCA model, the number of latent variables $l$ and the regularization parameters $\lambda_1$ and $\lambda_2$ should be determined. A cross-validation with three parameters is used to determine $l$, $\lambda_1$ and $\lambda_2$ jointly, as illustrated in Algorithm 2.
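As a concrete illustration, the RCCA objective of Eq. (2.1) can be solved via a whitened SVD of the cross-product matrix; a minimal NumPy sketch follows. The solver choice and all names here are illustrative assumptions, not the exact Appendix B algorithm.

```python
import numpy as np

def rcca(X, Y, n_components, lam1=0.0, lam2=0.0):
    """Sketch of regularized CCA: maximize r'X'Yc subject to the
    regularized unit-variance constraints of Eq. (2.1). Solved here
    via a whitened SVD; this is an illustrative stand-in for the
    thesis's Appendix B algorithm."""
    Sxx = X.T @ X + lam1 * np.eye(X.shape[1])   # regularized X'X
    Syy = Y.T @ Y + lam2 * np.eye(Y.shape[1])   # regularized Y'Y
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-product; its singular vectors give the weights.
    K = np.linalg.solve(Lx, X.T @ Y) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    R = np.linalg.solve(Lx.T, U[:, :n_components])   # weighting matrix R
    C = np.linalg.solve(Ly.T, Vt[:n_components].T)
    T = X @ R                                        # canonical variates, Eq. (2.4)
    P, Q = X.T @ T, Y.T @ T                          # loadings, Eq. (2.3)
    return R, C, T, P, Q
```

By construction each weight vector satisfies $r^\top (X^\top X + \lambda_1 I) r = 1$, so setting $\lambda_1 = \lambda_2 = 0$ recovers ordinary CCA, while large $\lambda_i$ pushes the constraints toward $\|r\| = \|c\| = 1$ as in PLS.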
It is noted that the determination of the configurable parameters $l_{min}$, $l_{max}$, $\lambda_{1,min}$, $\lambda_{1,max}$, $\lambda_{2,min}$, $\lambda_{2,max}$ and their step sizes is based on a trade-off between the accuracy and efficiency of the algorithm.

Algorithm 2: Cross-Validation for the Regularized CCA Model
1. Divide the training data (X, Y) into $N_{cv}$ subsets randomly, and confine the parameters to a certain region: $l \in [l_{min}, l_{max}]$, $\lambda_1 \in [\lambda_{1,min}, \lambda_{1,max}]$, and $\lambda_2 \in [\lambda_{2,min}, \lambda_{2,max}]$.
2. For each possible combination $(l, \lambda_1, \lambda_2)$:
   (1) Set aside one subset $(X_i, Y_i)$ $(i = 1, \ldots, N_{cv})$, and train a regularized CCA model with the remaining subsets;
   (2) Calculate the predicted residual sum of squares (PRESS) of $Y_i$ with the trained model running on $X_i$;
   (3) Repeat (1) and (2) for each subset and calculate the cumulative PRESS.
3. Select the combination $(l, \lambda_1, \lambda_2)$ that corresponds to the minimum PRESS.

2.3.2 Comparison of CCA and PLS on Data Modeling

Both CCA and PLS are used to extract the relations between process and quality variables. However, CCA maximizes the correlation between linear combinations of the process and quality variables, while PLS aims to extract the covariance of the two sets of variables.

PLS has been widely studied and used in multivariate statistical quality control. The objective function of PLS is presented in Eq. (1.4). From the objective, we can see that the scaling of the variables affects the solutions of PLS, since it is based on a maximum covariance criterion. For the principal components, apart from extracting the ones that reflect the relation between process and quality variables, PLS also considers the effects of the variance structures. Thus, PLS exploits the variance structure well. However, in industrial processes, the process data variations in a particular direction can be large due to a large disturbance and a feedback controller successfully compensating for that disturbance, thus yielding no effect on the product quality.
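The grid search of Algorithm 2 can be sketched as follows. The fold handling, the minimal RCCA weight computation, and the least-squares prediction of Y from the canonical variates are all illustrative assumptions for this sketch, not the thesis's exact procedure.

```python
import numpy as np
from itertools import product

def rcca_weights(X, Y, l, lam1, lam2):
    """Minimal RCCA weight computation (illustrative; see Appendix B
    of the thesis for the exact algorithm)."""
    Lx = np.linalg.cholesky(X.T @ X + lam1 * np.eye(X.shape[1]))
    Ly = np.linalg.cholesky(Y.T @ Y + lam2 * np.eye(Y.shape[1]))
    U, _, _ = np.linalg.svd(np.linalg.solve(Lx, X.T @ Y) @ np.linalg.inv(Ly).T)
    return np.linalg.solve(Lx.T, U[:, :l])  # weighting matrix R

def cv_press(X, Y, l_grid, lam_grid, n_cv=5, seed=0):
    """Algorithm 2 sketch: grid-search (l, lam1, lam2) by cumulative PRESS."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, n_cv)
    best, best_press = None, np.inf
    for l, lam1, lam2 in product(l_grid, lam_grid, lam_grid):
        press = 0.0
        for k in range(n_cv):
            test = folds[k]
            train = np.hstack([folds[j] for j in range(n_cv) if j != k])
            R = rcca_weights(X[train], Y[train], l, lam1, lam2)
            T = X[train] @ R
            B = np.linalg.lstsq(T, Y[train], rcond=None)[0]  # regress Y on scores
            press += np.sum((Y[test] - X[test] @ R @ B) ** 2)
        if press < best_press:
            best, best_press = (l, lam1, lam2), press
    return best, best_press
```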
In this case, while PLS tries to exploit this large variance in the data, it cannot be simultaneously efficient in predicting the quality.

On the other hand, CCA or RCCA tries to maximize the correlation between X and Y, which is given in Eq. (1.8) or (2.1). The correlation between process and quality variables is invariant to the variable magnitudes [40]. Therefore, CCA is efficient in predicting the quality variables using the process data. However, the magnitudes of the variances of the variables have no impact on the result of CCA, which makes it unsuitable for the monitoring of process variance structures.

2.4 CCCA for Process Monitoring

2.4.1 Concurrent CCA Model

Inspired by the spirit of CPLS for fault monitoring in the work of Qin et al. [33], a CCCA-based monitoring method is proposed. To realize a complete monitoring scheme for the quality variables and exclude unrelated variables, CCCA decomposes the data into five subspaces: the correlation subspace (CRS), responsible for the predictable quality part; the process-principal subspace (PPS), monitoring process-relevant faults; the process-residual subspace (PRS), monitoring potentially quality-relevant faults; and the quality-principal subspace (QPS) and quality-residual subspace (QRS), monitoring abnormal variations that are quality-relevant and unpredictable from process variables, respectively. The CCCA algorithm for data with multiple process variables and multiple quality variables is given in Algorithm 3, where the regularized CCA algorithm is adopted.
From Algorithm 3, the matrices X and Y can be decomposed as

$$X = T_c R_c^\dagger + \tilde T_x \tilde P_x^\top + \tilde X, \qquad Y = T_c Q_c^\top + \tilde T_y \tilde P_y^\top + \tilde Y \tag{2.5}$$

where the loadings $R_c \in \mathbb{R}^{m \times l_c}$, $\tilde P_x \in \mathbb{R}^{m \times l_x}$, $Q_c \in \mathbb{R}^{p \times l_c}$ and $\tilde P_y \in \mathbb{R}^{p \times l_y}$ characterize the CCCA model, and the scores $T_c \in \mathbb{R}^{N \times l_c}$, $\tilde T_x \in \mathbb{R}^{N \times l_x}$ and $\tilde T_y \in \mathbb{R}^{N \times l_y}$ represent, respectively, the correlations in X related to the predictable part of Y, the variations in X that are useless for predicting Y, and the variations in Y that are unpredictable from X. In Algorithm 3, the numbers of principal components $l_c$, $l_x$ and $l_y$ are determined as follows: $l_c$ is determined by cross-validation together with $\lambda_1$ and $\lambda_2$ as described in Algorithm 2, while $l_x$ and $l_y$ are determined based on the cumulative percentage of variance [41].

Algorithm 3: The Concurrent CCA Algorithm
1. Scale the process matrix X and the quality matrix Y to zero mean and unit variance. Perform the regularized CCA (Appendix B) on X and Y to give $R_c$, $T_c$, $Q_c$ and $P_c$ with $l_c$ latent factors.
2. Obtain the unpredictable quality part $\tilde Y_c = Y - T_c Q_c^\top$, and perform PCA with $l_y$ principal components, $\tilde Y_c = \tilde T_y \tilde P_y^\top + \tilde Y$. This gives the quality-principal scores $\tilde T_y$ and the quality residuals $\tilde Y$.
3. Obtain the quality-irrelevant process part by projecting onto the orthogonal complement of $\mathrm{Span}\{R_c\}$, $\tilde X_c = X - T_c R_c^\dagger$, where $R_c^\dagger = (R_c^\top R_c)^{-1} R_c^\top$, and perform PCA with $l_x$ principal components, $\tilde X_c = \tilde T_x \tilde P_x^\top + \tilde X$, to yield the process-principal scores $\tilde T_x$ and the process residuals $\tilde X$.

Given a new data sample x and y, CCCA projects them as follows according to Eq. (2.5),

$$x = R_c^{\dagger\top} t_c + \tilde P_x \tilde t_x + \tilde x \tag{2.6}$$
$$y = Q_c t_c + \tilde P_y \tilde t_y + \tilde y \tag{2.7}$$

where

$$t_c = R_c^\top x \tag{2.8}$$
$$\tilde t_x = \tilde P_x^\top (x - R_c^{\dagger\top} t_c) \tag{2.9}$$
$$\tilde t_y = \tilde P_y^\top (y - Q_c t_c) \tag{2.10}$$

For the residual parts $\tilde x$ and $\tilde y$, we have

$$\tilde x = (I - \tilde P_x \tilde P_x^\top)(x - R_c^{\dagger\top} t_c) \tag{2.11}$$
$$\tilde y = (I - \tilde P_y \tilde P_y^\top)(y - Q_c t_c) \tag{2.12}$$

2.4.2 CCCA-based Monitoring

The CCCA-based monitoring scheme works in a similar way as CCA.
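The concurrent decomposition of Algorithm 3 can be sketched in a few lines of NumPy. The RCCA step here uses a whitened SVD with least-squares loadings as an illustrative stand-in for the Appendix B algorithm; the function assumes the data are already centered and scaled, and all names are assumptions for this sketch.

```python
import numpy as np

def ccca(X, Y, lc, lx, ly, lam1=1e-3, lam2=1e-3):
    """Sketch of Algorithm 3 (concurrent CCA decomposition) on
    pre-scaled data. Illustrative, not the thesis's exact code."""
    # Step 1: RCCA on X, Y -> weights R_c, scores T_c, loadings Q_c
    Lx = np.linalg.cholesky(X.T @ X + lam1 * np.eye(X.shape[1]))
    Ly = np.linalg.cholesky(Y.T @ Y + lam2 * np.eye(Y.shape[1]))
    U = np.linalg.svd(np.linalg.solve(Lx, X.T @ Y) @ np.linalg.inv(Ly).T,
                      full_matrices=False)[0]
    Rc = np.linalg.solve(Lx.T, U[:, :lc])
    Tc = X @ Rc
    Qc = Y.T @ Tc @ np.linalg.inv(Tc.T @ Tc)    # least-squares loadings
    # Step 2: PCA on the unpredictable quality part
    Yc = Y - Tc @ Qc.T
    Py = np.linalg.svd(Yc, full_matrices=False)[2][:ly].T
    Ty = Yc @ Py
    # Step 3: PCA on the quality-irrelevant process part
    Rc_pinv = np.linalg.pinv(Rc)                # R_c^dagger
    Xc = X - Tc @ Rc_pinv
    Px = np.linalg.svd(Xc, full_matrices=False)[2][:lx].T
    Tx = Xc @ Px
    return dict(Rc=Rc, Tc=Tc, Qc=Qc, Px=Px, Py=Py,
                Xres=Xc - Tx @ Px.T, Yres=Yc - Ty @ Py.T)
```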
First, we build a CCCA model from the normal data sets X and Y. Then, for a new sample, all scores and residuals are calculated through Eqs. (2.8)-(2.12). Finally, several control plots are constructed with corresponding control limits, which are used for fault detection.

In multivariate statistical process monitoring, the $T^2$ and $Q$ statistics are widely used for monitoring systematic and residual variations, respectively [32]. In CCCA, the CRS, PPS and QPS subspaces contain systematic variations, for which the $T^2$ statistic is suitable. The PRS and QRS subspaces represent the residual variations, which should be monitored with the $Q$ index [42].

For a sample x and y, the $T_c^2$ statistic for the predictable part $\hat y$ can be calculated by

$$T_c^2 = t_c^\top \Lambda_c^{-1} t_c = x^\top R_c \Lambda_c^{-1} R_c^\top x \tag{2.13}$$

where $\Lambda_c = \mathrm{diag}\{\lambda_{c1}, \lambda_{c2}, \ldots, \lambda_{c l_c}\}$ and $\lambda_{ci}$ denotes the variance of the $i$th principal component. The process-relevant scores in Eq. (2.9) and residuals in Eq. (2.11) can be monitored as in PCA by $T^2$ and $Q$ as follows,

$$\tilde T_x^2 = \tilde t_x^\top \tilde\Lambda_x^{-1} \tilde t_x \tag{2.14}$$
$$\tilde Q_x = \tilde x^\top \tilde x \tag{2.15}$$

where $\tilde\Lambda_x = \frac{1}{N-1} \tilde T_x^\top \tilde T_x$. Similarly, the quality-relevant scores in Eq. (2.10) and residuals in Eq. (2.12) are monitored by

$$\tilde T_y^2 = \tilde t_y^\top \tilde\Lambda_y^{-1} \tilde t_y \tag{2.16}$$
$$\tilde Q_y = \tilde y^\top \tilde y \tag{2.17}$$

where $\tilde\Lambda_y = \frac{1}{N-1} \tilde T_y^\top \tilde T_y$.

Table 2.1: Monitoring Statistics and Control Limits for CCCA
Statistic        Calculation                                          Control Limit
$T_c^2$          $t_c^\top \Lambda_c^{-1} t_c$                        $\frac{l_c(N^2-1)}{N(N-l_c)} F_{l_c,\,N-l_c;\,\alpha}$
$\tilde T_x^2$   $\tilde t_x^\top \tilde\Lambda_x^{-1} \tilde t_x$    $\frac{l_x(N^2-1)}{N(N-l_x)} F_{l_x,\,N-l_x;\,\alpha}$
$\tilde T_y^2$   $\tilde t_y^\top \tilde\Lambda_y^{-1} \tilde t_y$    $\frac{l_y(N^2-1)}{N(N-l_y)} F_{l_y,\,N-l_y;\,\alpha}$
$\tilde Q_x$     $\tilde x^\top \tilde x$                             $g_x \chi^2_{h_x;\,\alpha}$
$\tilde Q_y$     $\tilde y^\top \tilde y$                             $g_y \chi^2_{h_y;\,\alpha}$
$N$, number of training samples; $l_c$, number of principal components in the CRS subspace; $l_x$, number of principal components in the PPS subspace; $l_y$, number of principal components in the QPS subspace; the calculation of $g_x$, $h_x$, $g_y$ and $h_y$ can be found in [3].

Assume that the data are sampled from a multivariate normal distribution.
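Under this normality assumption, the indices above and their limits from Table 2.1 can be computed as in the following sketch; the moment-matching estimate of $g$ and $h$ from the training $Q$ values is one common choice (consistent with the approach referenced as [3]), and the function names are illustrative.

```python
import numpy as np
from scipy import stats

def t2_statistic(t, T_train):
    """T^2 index of a score vector t (form shared by Eqs. (2.13),
    (2.14), (2.16)): t' Lambda^{-1} t, with Lambda estimated from
    the training scores."""
    Lam = T_train.T @ T_train / (len(T_train) - 1)
    return float(t @ np.linalg.solve(Lam, t))

def t2_limit(N, l, alpha=0.01):
    """T^2 control limit from Table 2.1: l(N^2-1)/(N(N-l)) F_{l,N-l;alpha}."""
    return l * (N ** 2 - 1) / (N * (N - l)) * stats.f.ppf(1 - alpha, l, N - l)

def q_statistic(res):
    """Q index (Eqs. (2.15), (2.17)): squared norm of the residual."""
    return float(res @ res)

def q_limit(q_train, alpha=0.01):
    """Q control limit g * chi2_{h;alpha}; g and h are matched to the
    mean and variance of the training Q values (moment matching)."""
    mu, var = np.mean(q_train), np.var(q_train)
    g, h = var / (2 * mu), 2 * mu ** 2 / var
    return g * stats.chi2.ppf(1 - alpha, h)
```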
Then the control limits of $T_c^2$, $\tilde T_x^2$ and $\tilde T_y^2$ can be obtained from an $F$-distribution, and the control limits for $\tilde Q_x$ and $\tilde Q_y$ from a $\chi^2$-distribution [1]. The control limits of these indices are listed in Table 2.1.

According to the monitoring statistics and control limits in Table 2.1, we can monitor the process-relevant and quality-relevant faults as follows:

(1) If $T_c^2$ exceeds its control limit, a quality-relevant fault is detected with $(1-\alpha) \times 100\%$ confidence.

(2) If $\tilde T_x^2$ exceeds its control limit, the fault is identified as quality-irrelevant but process-relevant with $(1-\alpha) \times 100\%$ confidence. These faults can receive less attention if only quality variables are considered significant.

(3) If $\tilde Q_x$ exceeds its control limit, a potentially quality-relevant fault is detected with $(1-\alpha) \times 100\%$ confidence, since it may contain variations that are not excited in the training dataset.

(4) If $\tilde T_y^2$ or $\tilde Q_y$ exceeds its control limit, a quality-relevant fault is detected that is unpredictable from the process variables.

2.4.3 CCCA-based Modeling and Online Monitoring

In practice, in order to employ CCCA for quality-relevant monitoring, both offline modeling and online monitoring are needed, which are shown in Figures 2.1 and 2.2, respectively. Note that in Figure 2.2, two CCCA models are shown for clarification purposes only; they are exactly the same model.

For offline modeling, after the raw data X and Y are collected, a preprocessing step should be executed first, which includes:

(1) Handling the lagging effects in quality variables. In practical industrial processes, it always takes a longer time to obtain the quality measurements Y, while the measurements of the process variables X are relatively faster, so the number of samples in X is typically much larger than in Y.
Therefore, it is essential to remove the data in X that have no corresponding quality measurements in Y.

Figure 2.1: Offline Modeling Scheme for CCCA

Figure 2.2: Online Monitoring Scheme for CCCA

(2) Scaling the data to zero mean and unit variance, since correlation matrices should be fed into the regularized CCA algorithm in Appendix B.

Then the CCCA algorithm in Algorithm 3 can be used to extract the five subspaces and develop the corresponding monitoring indices. After the CCCA model is obtained, we can employ it for online process monitoring. Different from the preprocessing step in the modeling phase, we can monitor quality-relevant faults with x directly based on the CCCA model, and we do not need to wait for the measurements of the quality variables y. Therefore, in the online monitoring phase, x can be used to monitor the process as soon as it is available. When the quality variables y are obtained, the variations that are not predictable from x in the QPS and QRS subspaces can be monitored.

2.5 Synthetic Case Studies

Prediction Effectiveness and Robustness of Regularized CCA

In this section, we use numerical simulations to create cases where strong collinearity exists. The prediction performance and robustness of two versions of CCA (the original CCA and the regularized CCA) are tested with these cases. The simulated numerical examples are generated as follows,

$$x = A z + e, \qquad y = C x + v \tag{2.18}$$

where

$$A = \begin{pmatrix} 1 & 3 & 4 & 4 & 3 & 2 & 6 & 8 \\ 3 & 0 & 1 & 4 & 1 & 6 & 0 & 2 \\ 1 & 1 & 3 & 0 & 0 & 2 & 2 & 6 \end{pmatrix}^\top$$

$$C = \begin{pmatrix} 2 & 4 & 1 & 1 & 0 & 1 & 2 & 2 \\ 3 & 3 & 0 & 4 & 0 & 2 & 1 & 3 \\ 1 & 2 & 2 & 1 & 0 & 0 & 0 & 1 \\ 4 & 3 & 1 & 2 & 0 & 0 & 1 & 2 \\ 4 & 3 & 1 & 2 & 0 & 0 & 1 & 2 \end{pmatrix}$$

$$z \in \mathbb{R}^3 \sim U([0, 2]), \qquad e \in \mathbb{R}^8 \sim N(0, 0.02^2)$$

$U([0, 2])$ is the uniform distribution on the interval $[0, 2]$, and $N(\mu, \sigma^2)$ is the normal distribution with mean $\mu$ and variance $\sigma^2$. In Eq. (2.18), $x \in \mathbb{R}^8$ and $y \in \mathbb{R}^5$. It is noted that the first three variables and the last three variables of x are strongly collinear by proper choice of the corresponding entries in A and the addition of the noise term e.
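The data model of Eq. (2.18) can be simulated directly; the matrices below are taken from the text, while the sample count, noise level and seed handling are configurable assumptions of this sketch. Note that rows 6-8 of A are exactly twice rows 1-3, which is what makes $x_6, x_7, x_8$ collinear with $x_1, x_2, x_3$ up to the noise e.

```python
import numpy as np

# Matrices A and C from Eq. (2.18) in the text.
A = np.array([[1, 3, 4, 4, 3, 2, 6, 8],
              [3, 0, 1, 4, 1, 6, 0, 2],
              [1, 1, 3, 0, 0, 2, 2, 6]]).T
C = np.array([[2, 4, 1, 1, 0, 1, 2, 2],
              [3, 3, 0, 4, 0, 2, 1, 3],
              [1, 2, 2, 1, 0, 0, 0, 1],
              [4, 3, 1, 2, 0, 0, 1, 2],
              [4, 3, 1, 2, 0, 0, 1, 2]])

def generate(n=200, sigma_v=0.1, seed=0):
    """Draw n samples of (x, y): x = A z + e, y = C x + v, with
    z ~ U([0, 2]) and e ~ N(0, 0.02^2). sigma_v = 0.1 or 0.5 gives
    Scenario 1 or 2 of the case study."""
    rng = np.random.default_rng(seed)
    Z = rng.uniform(0, 2, size=(n, 3))
    X = Z @ A.T + rng.normal(0, 0.02, size=(n, 8))
    Y = X @ C.T + rng.normal(0, sigma_v, size=(n, 5))
    return X, Y
```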
In order to compare the effectiveness of the original CCA and the regularized CCA, two scenarios of the noise v with different magnitudes are generated:

Scenario 1: $v \in \mathbb{R}^5 \sim N(0, 0.1^2)$  (2.19)
Scenario 2: $v \in \mathbb{R}^5 \sim N(0, 0.5^2)$  (2.20)

For each scenario, 200 normal samples are generated. Among them, the first 100 samples are used to train the models, and the remaining samples are for testing. With the cross-validation method described in Section 2.3, the parameters for Scenario 1 are determined: for the original CCA, $l = 3$; and for the regularized CCA, $l = 3$, $\lambda_1 = 0.028$, and $\lambda_2 = 0.001$. Those for Scenario 2 are: for the original CCA, $l = 3$; and for the regularized CCA, $l = 3$, $\lambda_1 = 0.097$, and $\lambda_2 = 0.001$.

The root mean squared errors (RMSE) of the original CCA and the regularized CCA are presented in Table 2.2 for both scenarios. From the results, we can see that the regularized CCA is at least 20% better than the original CCA in terms of prediction on the test data.

Table 2.2: Root Mean Squared Errors for the Numerical Simulation
              Original CCA    Regularized CCA
Scenario 1    0.0062          0.0051
Scenario 2    0.029           0.021

Furthermore, their weighting matrices R (defined in Eq. (2.4)) are summarized in Tables 2.3 and 2.4, which show that the magnitudes of the weighting matrix R of the original CCA are relatively large due to data collinearity and ill-conditioning.

Table 2.3: Weighting Matrices R for Scenario 1
  Original CCA                       Regularized CCA
  r_{1,1}   r_{2,1}   r_{3,1}        r_{1,1}   r_{2,1}   r_{3,1}
  0.212     -0.804    30.15          0.015     0.036     0.041
  0.110     0.045     53.40          0.013     -0.027    -0.024
  0.054     -2.979    38.26          0.019     -0.034    0.064
  -0.120    -0.181    9.909          0.010     0.043     -0.069
  0.073     2.702     -51.03         0.009     0.018     -0.096
  -0.179    -0.678    -6.582         0.015     0.036     0.042
  -0.131    -4.736    27.03          0.013     -0.027    -0.022
  -0.126    6.200     -93.75         0.019     -0.034    0.064
Additionally, in order to compare the sensitivity to noise of these two models, the angles $\theta_i$ between the corresponding weighting vectors $r_{i,1}$ in Scenario 1 and $r_{i,2}$ in Scenario 2 are calculated by

$$\theta_i = \arccos\left(\frac{r_{i,1}^\top r_{i,2}}{\|r_{i,1}\|\,\|r_{i,2}\|}\right)$$

and the results are summarized in Table 2.5.

Table 2.4: Weighting Matrices R for Scenario 2
  Original CCA                       Regularized CCA
  r_{1,2}   r_{2,2}   r_{3,2}        r_{1,2}   r_{2,2}   r_{3,2}
  0.757     -3.454    -38.51         0.016     0.036     0.038
  -0.573    -5.828    39.59          0.013     -0.027    -0.015
  0.172     10.81     10.48          0.020     -0.031    0.060
  -1.280    2.294     -123.1         0.010     0.044     -0.071
  0.372     -9.806    91.67          0.008     0.016     -0.093
  0.493     7.612     116.4          0.016     0.036     0.037
  1.964     21.95     -23.97         0.013     -0.027    -0.015
  -1.533    -21.99    -60.46         0.020     -0.031    0.060

Table 2.5: The Angles between Weighting Vectors in Scenarios 1 and 2 (degrees)
             Original CCA    Regularized CCA
$\theta_1$   83.25           0.81
$\theta_2$   17.82           2.29
$\theta_3$   89.40           4.36

It is observed that the regularized CCA is more robust to noise: the directions of its weighting vectors change little even as the noise variance increases. The original CCA model, however, is ill-conditioned and very sensitive to disturbances, and the directions of its weighting vectors change significantly.

Monitoring Performance of CCCA with Regularization

In this subsection, the regularized CCA is adopted for the CCCA model, and its monitoring performance is analyzed. 100 normal samples are generated with Eq. (2.18), and 100 samples with disturbances are generated as follows,

$$x^* = x + \Xi f \tag{2.21}$$

where x is the normal value without disturbances, $\Xi$ is the disturbance direction matrix or vector with orthogonal columns, and f is the disturbance magnitude. Two kinds of disturbances (a quality-relevant one and a quality-irrelevant one) are generated with Eq. (2.21).

Fault 1: Fault Occurs in the CRS Subspace Only

To simulate a fault that occurs in the CRS subspace only, we take the first column of $R_c$ as $\Xi$. As shown in Figure 2.3, the fault affects the quality variables y, so it is a quality-relevant fault.
The fault monitoring results of the regularized CCCA are shown in Figure 2.4. From the results, we can see that when strong collinearity exists in the process space, the regularized CCCA model successfully extracts the latent components and detects the abnormal variations with the $T_c^2$ index. The $\tilde T_y^2$, $\tilde T_x^2$ and $\tilde Q_x$ indices are concurrently monitored in CCCA, and no violations are detected in these subspaces.

Figure 2.3: Quality-Relevant Fault 1 Occurred in the CRS Subspace

Figure 2.4: Monitoring Results for Fault 1 with Regularized CCCA (f = 6)

Fault 2: Fault Occurs in the PPS Subspace Only

In this case study, $\Xi$ is composed of the first column of $\tilde P_x$, and the fault occurs in the PPS subspace only. As shown in Figure 2.5, the fault is quality-irrelevant. Figure 2.6 shows the monitoring results, and only $\tilde T_x^2$ detects the disturbance. If quality variables are the main concern in the process, the attention level for this kind of fault can be set to a lower value.

Figure 2.5: Quality-Irrelevant Fault 2 Occurred in the PPS Subspace

Figure 2.6: Monitoring Results for Fault 2 with Regularized CCCA (f = 4)

2.6 Tennessee Eastman Process Case Studies

The Tennessee Eastman process (TEP) [43] was created by the Eastman Chemical Company to provide a realistic industrial chemical process for the purpose of developing, studying and evaluating process control technology.
The process produces two products (G and H) from four reactants (A, C, D and E), and the reactions are:

A(g) + C(g) + D(g) → G(l)
A(g) + C(g) + E(g) → H(l)
A(g) + E(g) → F(l)
3D(g) → 2F(l)

where F is a byproduct. There are 53 variables in the TEP model. Among them, 41 variables are measurements in the XMEAS common block, and 12 additional variables are available for manipulation by controllers in the XMV common block.

Although TEP is widely used for process and quality monitoring, many applications show little understanding of the process and suffer from two common misuses.

(1) Incorrect selection of quality variables. In TEP, G and H are the main products, which are extracted in Stream 11, and they should be regarded as the main quality variables. Additionally, most of the byproduct F and a small amount of reactants and products are purged from the separator and flow into Stream 9. The abnormal disturbances in the purge gases should also be monitored (either alone or combined with G and H in Stream 11) in QRM.

(2) Ignoring differences in sampling frequencies. It takes a relatively longer time to obtain quality measurements, and the TEP simulation takes this into account. The measurements of the process variables XMEAS(1-22) are real-time; the sampling frequency is 0.1 h for Streams 6 and 9, and 0.25 h for Stream 11. However, most work fails to handle these delays. Due to the time delays, it is essential to preprocess the collected data before building the models: align the process samples with their corresponding quality measurements. For online monitoring, it is desirable to detect quality-relevant faults before y is measured.

In this case study, PLS, CPLS [33] and CCCA are performed on TEP. For all these monitoring schemes, XMEAS(1-22) and XMV(1-11) are regarded as the process variables, where XMEAS(1-22) are the process measurements and XMV(1-11) are the manipulated variables. XMEAS(35-36) are selected as the quality variables, which are the product G and H composition analyses in Stream 9.
After preprocessing, 250 normal samples are employed to build the PLS, CPLS and CCCA models. All samples are centered to zero mean and scaled to unit variance. The number of latent factors for PLS is 6, determined by cross-validation. For CPLS, $l_c = 2$, $l_y = 2$ and $l_x = 18$; for CCCA, $l_c = 1$, $l_y = 2$, $l_x = 20$, $\lambda_1 = 0.001$ and $\lambda_2 = 0.068$. The control limit is set with 99% confidence.

There are 15 known disturbances in TEP [43], which are listed in Table 2.6. In Table 2.6, the disturbances are divided into two groups, a quality-relevant one and a quality-irrelevant one, based on the monitoring result on the quality variables Y [44]. For quality monitoring, if the $T^2$ or $Q$ statistic exceeds its corresponding control limit, Y is affected by the disturbance; otherwise, the disturbance is classified as quality-irrelevant. The alarms on quality-relevant disturbances are effective, while alarms on quality-irrelevant disturbances are nuisances, which should receive less attention. The monitoring results for these two groups are listed in

Table 2.6: Disturbance Description for TEP

Disturbance  Detailed Description                                      Type              Classification
IDV(1)       A/C feed ratio, B composition constant (Stream 4)         Step              (G,H)-Relevant
IDV(2)       B composition, A/C ratio constant (Stream 4)              Step              (G,H)-Relevant
IDV(3)       D feed temperature (Stream 2)                             Step              (G,H)-Irrelevant
IDV(4)       Reactor cooling water inlet temperature                   Step              (G,H)-Irrelevant
IDV(5)       Condenser cooling water inlet temperature                 Step              (G,H)-Relevant
IDV(6)       A feed loss (Stream 1)                                    Step              (G,H)-Relevant
IDV(7)       C header pressure loss - reduced availability (Stream 4)  Step              (G,H)-Relevant
IDV(8)       A, B, C feed composition (Stream 4)                       Random variation  (G,H)-Relevant
IDV(9)       D feed temperature (Stream 2)                             Random variation  (G,H)-Irrelevant
IDV(10)      C feed temperature (Stream 4)                             Random variation  (G,H)-Irrelevant
IDV(11)      Reactor cooling water inlet temperature                   Random variation  (G,H)-Irrelevant
IDV(12)      Condenser cooling water inlet temperature                 Random variation  (G,H)-Relevant
IDV(13)
Reaction kinetics                                         Slow drift        (G,H)-Relevant
IDV(14)      Reactor cooling water valve                               Sticking          (G,H)-Irrelevant
IDV(15)      Condenser cooling water valve                             Sticking          (G,H)-Irrelevant

Table 2.7: Fault Detection Rates for Quality-Relevant Disturbances with PLS, CPLS and CCCA (%)

Disturbance  PLS     CPLS    CCCA
IDV(1)       85.14   93.24    98.65
IDV(2)       96.00   99.38   100.00
IDV(5)       76.60   93.62    93.16
IDV(6)       99.74  100.00   100.00
IDV(7)       86.76   95.59    97.06
IDV(8)       86.90   97.82    98.25
IDV(12)      87.73   96.65    98.88
IDV(13)      86.62   99.65    99.65

Table 2.8: False Alarm Rates for Quality-Irrelevant Disturbances with PLS, CPLS and CCCA (%)

Disturbance  PLS     CPLS   CCCA
IDV(3)       13.42   7.11   6.84
IDV(4)       33.33   8.42   5.53
IDV(9)        6.58   5.00   4.74
IDV(10)      33.32   8.74   4.11
IDV(11)      21.58   8.95   6.05
IDV(14)      77.11   8.16   9.21
IDV(15)       7.11   6.37   5.53

Tables 2.7 and 2.8, respectively. For PLS, we use the $T^2$ index to monitor the quality-relevant faults, while for CPLS and CCCA, $T_c^2$ and $\tilde{T}_y^2$ are employed to monitor quality-relevant faults for the process.

From Tables 2.7 and 2.8, we can see that CCCA outperforms PLS and CPLS in most cases, since apart from the correlation structure, it employs the concurrent decomposition to exploit the variance information. Additionally, based on Tables 2.6 - 2.8, the effectiveness of CCCA over the other methods is observed for all types of disturbances.

[Figure 2.7: Quality Variables and Quality Monitoring Results for IDV(1)]

Apart from the higher fault detection rates and lower false alarm rates, CCCA outperforms PLS and CCA since it can decompose the data spaces completely to monitor the quality-specific and process-specific changes efficiently. To show this point, we use IDV(1) and IDV(4) as examples.
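The rates reported in Tables 2.7 and 2.8 are both instances of the same computation: the fraction of samples whose monitoring index exceeds its control limit (a fault detection rate on faulty data, a false alarm rate on data whose quality is unaffected). A minimal sketch; the `t2` trajectory and limit below are made-up numbers, not TEP output:

```python
import numpy as np

def alarm_rate(index_values, control_limit):
    """Fraction of samples whose monitoring index exceeds its control
    limit.  On faulty data this is the fault detection rate; on
    quality-irrelevant data it is the false alarm rate."""
    index_values = np.asarray(index_values)
    return float(np.mean(index_values > control_limit))

# Hypothetical T2 trajectory: in-control values sit well below the limit,
# the fault pushes five of the eight samples past it.
t2 = np.array([3.1, 4.2, 9.8, 12.5, 15.0, 2.9, 11.1, 13.4])
print(alarm_rate(t2, control_limit=9.2))  # 5 of 8 samples alarm -> 0.625
```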
IDV(1) is a step disturbance in the A/C ratio, and as shown in the $T^2$-based quality monitoring result of Figure 2.7, it is a short-term quality-relevant fault. The effects of this fault are compensated by the controllers in the process, so the quality variables return to normal after a short period.

[Figure 2.8: PLS-based Monitoring Results for IDV(1)]

[Figure 2.9: CCCA-based Monitoring Results for IDV(1)]

[Figure 2.10: Quality Variables and Quality Monitoring Results for IDV(4)]

In Figures 2.8 and 2.9, although both PLS and CCCA detect this disturbance in all subspaces, $T_c^2$ of the CCCA-based model returns to normal after about 150 sampling periods. This is consistent with the quality monitoring of Figure 2.7, but it detects the quality fault sooner than the results in Figure 2.7. On the other hand, $T^2$ of PLS also tends to return to normal, but it hovers above the control limit for the entire period. Additionally, $\tilde{T}_x^2$ presents the variations irrelevant to the quality variables.

For the step disturbance IDV(4) that comes from the reactor cooling water inlet temperature, the monitoring results of PLS and CCCA are presented in Figures 2.11 and 2.12.
In this scenario, due to the cascade controllers in the system, the variation of the reactor cooling water inlet temperature will not affect the quality variables (as shown in Figure 2.10).

[Figure 2.11: PLS-based Monitoring Results for IDV(4)]

[Figure 2.12: CCCA-based Monitoring Results for IDV(4)]

For CCCA, the monitoring result shows that this fault is quality-irrelevant and should receive less attention during the process. PLS, however, detects the disturbance with both the $T^2$ and $Q$ statistics, raising the alarm incorrectly.

It is worth noting that the process-residual subspace (measured by $\tilde{Q}_x$) can be indicative of a quality-relevant fault. The process-residual subspace is the part of the X-space with little excitation or variation during normal operations. If this subspace suddenly exhibits excessive variations, it is potentially quality-relevant. This case is observed in the $\tilde{Q}_x$ plot of IDV(1) in Figure 2.9. Not only does $\tilde{Q}_x$ show excessive variations, the trend of the variations in $\tilde{Q}_x$ matches that of $\tilde{T}_y^2$ of the quality data, which comes after a measurement delay. In this case, $\tilde{Q}_x$ happens to be predictive of $\tilde{T}_y^2$, which cannot be predicted from the model based on normal data. On the other hand, for IDV(4), which is quality-irrelevant as shown in Figure 2.12, the excessive variations in $\tilde{Q}_x$ remain process-specific.

2.7 Summary

In this chapter, several types of monitoring schemes are clearly formulated, including process monitoring, quality monitoring, quality-relevant monitoring and quality-irrelevant monitoring, with each serving a particular purpose.
Additionally, it is shown that CCA alone does not yield a reliable fault detection model, since it focuses only on the extraction of the multidimensional correlation structure and ignores the variance structure in the data. Thus, a concurrent CCA model with regularization is proposed to fix this issue and to monitor process-relevant and quality-relevant faults separately; it is based on a regularized CCA algorithm that deals with collinearity problems. In CCCA, the process and quality spaces are concurrently projected into five subspaces, and the corresponding monitoring statistics and control limits are developed. The detailed procedure of offline modeling and online monitoring is also illustrated.

A numerical simulation example is employed to illustrate the advantages of the regularized CCA over the original CCA. The application results on TEP show that the concurrent regularized CCA achieves higher fault detection rates and lower false alarm rates than PLS and CPLS. CCCA also outperforms the other methods with its efficient decomposition. The comparison of CCA and PLS performed in this chapter shows the advantages of CCCA for quality-relevant fault monitoring.

Chapter 3: Quality-Relevant Fault Detection for Nonlinear Processes based on Concurrent Kernel Canonical Correlation Analysis

3.1 Introduction

As shown in Chapter 2, PLS, CCA and their variants are widely applied in industry for quality-relevant process monitoring and diagnosis. However, all these methods imply a linearity assumption, which limits their further application to nonlinear processes. In order to model nonlinear data, several nonlinear methods have been proposed [45-49]. Qin et al. [46] embedded neural networks into the framework of PLS modeling methods, which can capture the nonlinearity and obtain robust generalization properties. Akaho [48] studied kernel CCA (KCCA) and proved its effectiveness.
KCCA maps the original process data into a high-dimensional feature space to describe the nonlinear relations between the process data and the quality data.

The process monitoring techniques are similar for these methods. The original feature space and quality space are decomposed into two subspaces, which are monitored by the $T^2$ and $Q$ statistics [3]. Although this works well in some cases, there are several problems involved in this monitoring scheme. Firstly, the PLS factors that contribute to $T^2$ may contain variations orthogonal to the quality variables. Additionally, PLS and CCA extract principal components based on a maximum covariance or correlation criterion, respectively, and the residuals of the process variables are not necessarily small.

Several approaches have been proposed to overcome the above issues. For example, based on the linear concurrent PLS proposed by Qin and Zheng [33], Sheng et al. [50] developed a concurrent kernel PLS (CKPLS) model, which realized comprehensive fault detection. In CKPLS, the nonlinear process and quality spaces are decomposed into five subspaces, and the corresponding monitoring scheme is also discussed in their work. However, CKPLS does not consider the nonlinearity in the quality variables, which is common in industrial processes [48]. For example, the composition variables in high-purity distillation exhibit high nonlinearity as the purity approaches 1. In other cases where the quality variables are grades, the grade levels are hardly linear with respect to the process conditions.

Owing to the advantages of CCA over PLS in predicting the output variables [36], a novel concurrent kernel CCA (CKCCA) method is proposed in this chapter. KCCA is employed in CKCCA to extract the correlated part between the feature spaces of the input and output. Different from the previous kernel computation in CKPLS, we consider the nonlinearity existing in the quality variables as well.
Thus both the process variables X and the quality variables Y are mapped from the original spaces into higher-dimensional feature spaces. To overcome the drawback that CCA is sensitive to collinearity in the feature spaces, regularization is introduced to control the flexibility of the projections by penalizing the norms of the associated weight vectors. Then PCA is performed on the residuals in both the process and quality feature spaces for further decomposition. Therefore, after performing CKCCA, the original feature space is partitioned into five subspaces: the correlation subspace, quality-principal subspace, quality-residual subspace, process-principal subspace and process-residual subspace. The corresponding fault indices and control limits for the five subspaces are also developed. The effectiveness of CKCCA is demonstrated through a synthetic simulation and an industrial case study.

The remaining part of this chapter is organized as follows. Fault detection based on KCCA is first summarized in Section 3.2. The CKCCA algorithm and the CKCCA-based monitoring scheme are proposed in Section 3.3. Section 3.4 simulates a nonlinear numerical case to compare the effectiveness of KCCA, CCCA and CKCCA in terms of detecting quality-relevant and process-relevant faults. In Section 3.5, the Tennessee Eastman process is employed to demonstrate the advantages of CKCCA. The chapter ends with a summary in Section 3.6.

3.2 KCCA Model for Process and Quality Monitoring

3.2.1 Kernel CCA Model

The data collected when the process operates in normal condition are X and Y, where $X \in \mathbb{R}^{N \times m}$ consists of $N$ samples with $m$ process variables, and $Y \in \mathbb{R}^{N \times p}$ consists of $N$ samples with $p$ quality variables. In order to model the nonlinear process, a nonlinear map is adopted. In this chapter, we consider the case in which the nonlinearity exists in both the process and quality variables.
Both X and Y are mapped from the original space into the feature space $\mathcal{F}$, in which they are approximately linearly related [51]. After the nonlinear mapping, X and Y become $\Phi_X \in \mathbb{R}^{N \times h}$ and $\Phi_Y \in \mathbb{R}^{N \times h}$, where $h$ is the dimension of the feature space. Further, in order to avoid the trivial solution explained in [34], control over the flexibility of the projections is introduced by penalizing the norms of the associated weight vectors. The objective of KCCA is then

$$\max_{\alpha, \beta} \; J_{KCCA} = \alpha^\top K_X K_Y \beta \tag{3.1}$$
$$\text{s.t.} \quad \alpha^\top K_X^2 \alpha + \lambda_X \|\Phi_X^\top \alpha\|^2 = 1, \qquad \beta^\top K_Y^2 \beta + \lambda_Y \|\Phi_Y^\top \beta\|^2 = 1$$

where $K_X = \Phi_X \Phi_X^\top$ and $K_Y = \Phi_Y \Phi_Y^\top$ are the Gram kernel matrices; $\alpha$ and $\beta$ are weighting vectors for $\Phi_X$ and $\Phi_Y$ in the feature space; and $\lambda_X$ and $\lambda_Y$ are the regularization parameters, used to control the magnitudes of $\|\Phi_X^\top \alpha\|^2$ and $\|\Phi_Y^\top \beta\|^2$. $\alpha$ and $\beta$ are transformed directions in the feature spaces, and they are connected with the weighting vectors $r$ and $c$ by $r = \Phi_X^\top \alpha$ and $c = \Phi_Y^\top \beta$, respectively.

The objective in Eq. (3.1) can be solved via Cholesky decomposition [48]. However, the complete decomposition of a kernel matrix is an expensive step and should be avoided, especially for real-world data. Therefore, the incomplete Cholesky decomposition is adopted, whose details are described in [52]. After the KCCA decomposition, the matrices can be represented as

$$\Phi_X = T P^\top + \tilde{\Phi}_X, \qquad \Phi_Y = T Q^\top + \tilde{\Phi}_Y \tag{3.2}$$

where $T$ is the set of canonical variates for $\Phi_X$, $P$ and $Q$ are the loadings for $\Phi_X$ and $\Phi_Y$, respectively, and $\tilde{\Phi}_X$ and $\tilde{\Phi}_Y$ are the residuals. $T$ can be computed through the weighting matrix $R$ by

$$T = \Phi_X R \tag{3.3}$$

where $R = [r_1, r_2, \ldots, r_l] = \Phi_X^\top A$, $A = [\alpha_1, \alpha_2, \ldots, \alpha_l]$ is the transformed weighting matrix, and $l$ is the number of components.

The use of a kernel function allows inner products in the feature space to be computed without performing the nonlinear mappings. That is, performing the nonlinear mappings and computing inner products in the feature space can be avoided by introducing a kernel function of the form $k(a, b) = \langle \phi(a), \phi(b) \rangle$. In KCCA, the Gaussian kernel function is
In KCCA, the Gaussian kernel function is 55 adopted, which is defined as k (a; b) = exp ka bk 2 c ! (3.4) where c is the width of the Gaussian function. In process monitoring, the value ofc controls the robustness and sensitivity of the model. In general, whenc is large, the robustness of the model increases whereas the sensitiv- ity decreases. Namely, false alarms decrease while missing alarms increase. Before performing KCCA, we need to center the data points as fol- lows, K = K 1 N K K1 N + 1 N K1 N (3.5) where 1 N 2R NN and each element in 1 N is 1=N. K is the centered gram matrix. For simplicity, we still denote K as K for centered matrix in the following discussion. 3.2.2 KCCA-based Monitoring For a set of new samples X new 2R Lm , we map them into the hyper- dimensional feature space using function , and the kernel vector is com- puted as K X new = X new > X , where X new 2 R Lh . Similar to the training samples, mean centering of the new kernel vector can be done by substitut- ing the kernel matrix K X new with K X new , where K X new = K X new 1 L K X K X new 1 N + 1 L K X 1 N (3.6) 56 with 1 L 2R LL and each element is 1=L. The same centering process should perform on Y new 2R Lp as well. After performing kernel trick and mean centering, a new pair sam- ple (x new ; y new ) is mapped to ((x new );(y new )). Then the KCCA model is written as (x new ) = ^ (x new ) + ~ (x new ) (3.7) where ^ (x new ) = PR > (x new )2 SpanfPg (3.8) ~ (x new ) = (I PR > )(x new )2 SpanfRg ? (3.9) P = X K X A A > K 2 X A 1 (3.10) Also, the score t new and the residual ~ (x new ) are t new = R > (x new ) = A > k xnew (3.11) ~ (x new ) = h I X K X A A > K 2 X A 1 A > X i (x new ) (3.12) where k xnew = X (x new ). Further, the two statistics T 2 and Q and their control limits are de- fined [3] as follows, T 2 = t new 1 t new l(N 2 1) N(N 1) F l;Nl; (3.13) Q =jj ~ (x new )jj 2 g 2 h (3.14) where = 1=(N 1)T > T. 
$F_{l, N-l; \alpha}$ is an F-distribution with $l$ and $N-l$ degrees of freedom, and the confidence level is defined with $\alpha$ as $(1-\alpha) \times 100\%$. $g \chi^2_h$ is the $\chi^2$-distribution with scaling factor $g$ and $h$ degrees of freedom. The calculations of $g$ and $h$ can be found in [6]. Note that both $T^2$ and $Q$ can be derived directly from $K_X$ and $k_{x_{new}}$, where $k_{x_{new}} = \Phi_X \phi(x_{new})$.

3.3 CKCCA for Nonlinear Process Monitoring

3.3.1 Concurrent Kernel CCA Model

The concurrent KCCA decomposes the process and quality spaces similarly to CCCA as described in Section 2.4. However, instead of partitioning the original process and quality spaces, CKCCA decomposes the high-dimensional feature spaces of the process and quality variables into five subspaces: the correlation subspace, quality-principal subspace, quality-residual subspace, process-principal subspace and process-residual subspace. The detailed algorithm is given in Algorithm 4. KCCA is first performed on the mapped data to extract the approximately linear relation between the process and quality data in the feature space. Then PCA is employed to decompose the remaining variations into principal and residual parts for both the process and quality variables. The CKCCA algorithm decomposes $\Phi_X$ and $\Phi_Y$ as follows,

$$\Phi_X = T_c R_c^{\dagger} + T_x P_x^\top + \tilde{\Phi}_X, \qquad \Phi_Y = T_c Q_c^\top + T_y P_y^\top + \tilde{\Phi}_Y \tag{3.15}$$

where $T_c R_c^{\dagger}$ is the variation that is related to $\Phi_Y$; $T_x P_x^\top$ is the process-principal subspace, which is useless for predicting $\Phi_Y$; and $\tilde{\Phi}_X$ is the process-residual subspace.

Algorithm 4: The Concurrent Kernel CCA Algorithm

1. Normalize the training data X and Y, and perform KCCA by solving Eq. (3.1) via incomplete Cholesky decomposition to obtain $R$, $A$, $T$, $Q$ and $P$.

2. For the predictable quality part $\hat{\Phi}_Y = T Q^\top$, perform SVD on $\hat{\Phi}_Y$: $\hat{\Phi}_Y = U_c D_c V_c^\top = T_c Q_c^\top$, where $T_c = U_c$ contains the left singular vectors.
$Q_c = V_c D_c^\top = \hat{\Phi}_Y^\top W_c$ is the product of the $l_c$ nonzero singular values and the corresponding right singular vectors, where $W_c$ is the unitary eigenvector matrix of $\frac{1}{N} \hat{\Phi}_Y \hat{\Phi}_Y^\top$ corresponding to the first $l_c$ eigenvalues.

3. For the unpredictable quality variations $\tilde{\Phi}_{Yc} = \Phi_Y - T_c Q_c^\top$, perform PCA with $l_y$ principal components and obtain $\tilde{\Phi}_{Yc} = T_y P_y^\top + \tilde{\Phi}_Y$, where $P_y = \tilde{\Phi}_{Yc}^\top W_y$, $T_y = K_{yc} W_y$, $K_{yc} = \tilde{\Phi}_{Yc} \tilde{\Phi}_{Yc}^\top$, and $W_y$ is the unitary eigenvector matrix of $\frac{1}{N} \tilde{\Phi}_{Yc} \tilde{\Phi}_{Yc}^\top$ corresponding to the largest $l_y$ eigenvalues.

4. Obtain the quality-irrelevant process variations by $\tilde{\Phi}_{Xc} = \Phi_X - T_c R_c^{\dagger}$, where $R_c = R Q^\top V_c D_c^{-1}$ and $R_c^{\dagger} = (R_c^\top R_c)^{-1} R_c^\top$. Then perform PCA with $l_x$ principal components, $\tilde{\Phi}_{Xc} = T_x P_x^\top + \tilde{\Phi}_X$, where $P_x = \tilde{\Phi}_{Xc}^\top W_x$, $T_x = K_{xc} W_x$, $K_{xc} = \tilde{\Phi}_{Xc} \tilde{\Phi}_{Xc}^\top$, and $W_x$ is the unitary eigenvector matrix of $\frac{1}{N} \tilde{\Phi}_{Xc} \tilde{\Phi}_{Xc}^\top$ corresponding to the largest $l_x$ eigenvalues.

$R_c \in \mathbb{R}^{h \times l_c}$, $P_x \in \mathbb{R}^{h \times l_x}$, $Q_c \in \mathbb{R}^{h \times l_c}$ and $P_y \in \mathbb{R}^{h \times l_y}$ are the loadings of the CKCCA model; and $T_c \in \mathbb{R}^{N \times l_c}$, $T_x \in \mathbb{R}^{N \times l_x}$ and $T_y \in \mathbb{R}^{N \times l_y}$ are the score matrices.

Here $\tilde{\Phi}_X$, the process-residual subspace, is not excited in $\Phi_Y$; $T_c Q_c^\top$ can be predicted from $\Phi_X$; $T_y P_y^\top$ is the quality-principal subspace, which cannot be predicted from $\Phi_X$; and $\tilde{\Phi}_Y$ is the quality-residual subspace.

For a new data sample $\phi(x_{new})$ and $\phi(y_{new})$, CKCCA projects them as follows,

$$\phi(x_{new}) = R_c^{\dagger\top} t_c + P_x t_x + \tilde{\phi}(x_{new}), \qquad \phi(y_{new}) = Q_c t_c + P_y t_y + \tilde{\phi}(y_{new}) \tag{3.16}$$

where the calculations of $t_c$, $t_x$, $t_y$, $\tilde{\phi}(x_{new})$ and $\tilde{\phi}(y_{new})$ can be found in Appendix C.

3.3.2 CKCCA-based Monitoring

From Eqs. (C.1) - (C.5), the monitoring statistics can be easily derived. Since the variations in the correlation subspace, process-principal subspace and quality-principal subspace contain the systematic part, it is suitable to monitor them with the $T^2$ statistic, while the residual parts in the process-residual and quality-residual subspaces can be monitored with the $Q$ index [42].
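As a rough linear analogue of the concurrent decomposition in Algorithm 4 (a shared correlation part extracted first, then PCA on what remains of each block), the following numpy sketch may help fix ideas. It replaces the regularized KCCA step with a plain SVD of the cross-product and works in the original variable space, so it is only a stand-in for the kernelized algorithm, with illustrative names throughout:

```python
import numpy as np

def concurrent_decompose(X, Y, lc, lx, ly):
    """Split X and Y into a shared correlation part plus principal and
    residual parts of each block, mirroring the five subspaces of
    Algorithm 4 in a purely linear setting (a sketch, not the method)."""
    # Correlation directions: SVD of the cross-product stands in for the
    # regularized (K)CCA step.
    U, _, _ = np.linalg.svd(X.T @ Y, full_matrices=False)
    R = U[:, :lc]
    Tc = X @ R                              # scores shared by X and Y
    # Quality part predictable from Tc, and the unpredictable remainder.
    Qc = Y.T @ Tc @ np.linalg.inv(Tc.T @ Tc)
    Yres = Y - Tc @ Qc.T
    # Quality-irrelevant process part, then PCA loadings of both residuals.
    Xres = X - Tc @ np.linalg.pinv(R)
    Px = np.linalg.svd(Xres, full_matrices=False)[2][:lx].T
    Py = np.linalg.svd(Yres, full_matrices=False)[2][:ly].T
    Ex = Xres - (Xres @ Px) @ Px.T          # process-residual subspace
    Ey = Yres - (Yres @ Py) @ Py.T          # quality-residual subspace
    return Tc, Qc, Px, Py, Ex, Ey
```

The returned pieces correspond, subspace by subspace, to the correlation, process-principal, process-residual, quality-principal and quality-residual parts monitored in the next subsection.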
The $T^2$ statistics in the correlation subspace, process-principal subspace and quality-principal subspace can be calculated by

$$T_c^2 = t_c^\top \Lambda_c^{-1} t_c \tag{3.17}$$
$$T_x^2 = t_x^\top \Lambda_x^{-1} t_x \tag{3.18}$$
$$T_y^2 = t_y^\top \Lambda_y^{-1} t_y \tag{3.19}$$

where $\Lambda_c$, $\Lambda_x$ and $\Lambda_y$ contain the largest $l_c$, $l_x$ and $l_y$ principal components in each subspace, respectively.

The process-relevant residuals in Eq. (C.4) can be monitored by the $Q$ statistic,

$$Q_x = \tilde{\phi}(x_{new})^\top \tilde{\phi}(x_{new}) = \left(\tilde{\phi}_c(x_{new}) - P_x t_x\right)^\top \left(\tilde{\phi}_c(x_{new}) - P_x t_x\right) = \tilde{\phi}_c(x_{new})^\top \tilde{\phi}_c(x_{new}) - 2 \tilde{\phi}_c(x_{new})^\top P_x t_x + t_x^\top P_x^\top P_x t_x \tag{3.20}$$

For the first part in Eq. (3.20),

$$\tilde{\phi}_c(x_{new}) = \phi(x_{new}) - R_c^{\dagger\top} t_c = \phi(x_{new}) - \Phi_X^\top M^\top t_c \tag{3.21}$$

where $M$ is defined in Appendix C for simplicity. Then, based on Eq. (3.21), $\tilde{\phi}_c(x_{new})^\top \tilde{\phi}_c(x_{new})$ can be represented by the Gram matrices $K_X$ and $k_{x_{new}}$, both of which are known. For the second term in Eq. (3.20),

$$\tilde{\phi}_c(x_{new})^\top P_x t_x = \left(k_{x_{new}}^\top - t_c^\top M K_X\right) (I - T_c M)^\top W_x t_x \tag{3.22}$$

Both $t_c$ and $t_x$ in Eq. (3.22) can be transformed into $K_X$ and $k_{x_{new}}$. For the last term in Eq. (3.20),

$$t_x^\top P_x^\top P_x t_x = t_x^\top W_x^\top (I - T_c M) K_X (I - T_c M)^\top W_x t_x \tag{3.23}$$

Therefore, based on Eqs. (3.21) - (3.23), we obtain

$$Q_x = 1 - 2 t_c^\top M k_{x_{new}} + t_c^\top M K_X M^\top t_c - 2 \left(k_{x_{new}}^\top - t_c^\top M K_X\right) (I - T_c M)^\top W_x t_x + t_x^\top W_x^\top (I - T_c M) K_X (I - T_c M)^\top W_x t_x \tag{3.24}$$

Table 3.1: Monitoring Statistics and Control Limits for CKCCA

Statistic  Calculation                              Control Limit
$T_c^2$    $t_c^\top \Lambda_c^{-1} t_c$            $\frac{l_c(N^2-1)}{N(N-l_c)} F_{l_c, N-l_c; \alpha}$
$T_x^2$    $t_x^\top \Lambda_x^{-1} t_x$            $\frac{l_x(N^2-1)}{N(N-l_x)} F_{l_x, N-l_x; \alpha}$
$T_y^2$    $t_y^\top \Lambda_y^{-1} t_y$            $\frac{l_y(N^2-1)}{N(N-l_y)} F_{l_y, N-l_y; \alpha}$
$Q_x$      $\tilde{\phi}(x)^\top \tilde{\phi}(x)$   $g_x \chi^2_{h_x; \alpha}$
$Q_y$      $\tilde{\phi}(y)^\top \tilde{\phi}(y)$   $g_y \chi^2_{h_y; \alpha}$

The calculations of $g_x$, $h_x$, $g_y$ and $h_y$ can be found in [3].
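The control limits in Table 3.1 can be computed from standard distributions. A sketch using scipy: the $T^2$ limit is the tabulated F-based formula, and for the $Q$ limit the scaling factor and degrees of freedom are obtained here by matching the mean $\mu$ and variance $v$ of $Q$ on normal data, $g = v/(2\mu)$ and $h = 2\mu^2/v$, the usual moment-matching approximation (the source defers the exact $g$, $h$ derivation to [3]):

```python
import numpy as np
from scipy import stats

def t2_limit(l, N, alpha=0.01):
    """T^2 control limit  l(N^2-1)/(N(N-l)) * F_{l, N-l; alpha}  (Table 3.1)."""
    return l * (N**2 - 1) / (N * (N - l)) * stats.f.ppf(1 - alpha, l, N - l)

def q_limit(q_normal, alpha=0.01):
    """Q control limit g * chi2_{h; alpha}, with g and h matched to the
    mean and variance of Q over normal operating data."""
    mu, v = np.mean(q_normal), np.var(q_normal)
    g, h = v / (2.0 * mu), 2.0 * mu**2 / v
    return g * stats.chi2.ppf(1 - alpha, h)
```

For example, `t2_limit(l_c, N)` gives the $T_c^2$ limit, and `q_limit(Qx_normal)` the $Q_x$ limit, each at 99% confidence by default.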
$Q_y$ can be obtained in a similar way,

$$Q_y = 1 - 2 k_{y_{new}}^\top W_c t_c + t_c^\top W_c^\top K_Y W_c t_c - 2 \left(k_{y_{new}}^\top - t_c^\top W_c^\top K_Y\right) \left(I - T_c W_c^\top\right)^\top W_y t_y + t_y^\top W_y^\top \left(I - T_c W_c^\top\right) K_Y \left(I - T_c W_c^\top\right)^\top W_y t_y \tag{3.25}$$

To perform monitoring based on the above indices, control limits should be calculated from the statistics of the normal data, which are listed in Table 3.1. The monitoring scheme is as follows. If $T_c^2$ exceeds its control limit, a quality-relevant fault is detected with $(1-\alpha) \times 100\%$ confidence; if $T_x^2$ exceeds its control limit, the fault is identified as quality-irrelevant but process-relevant with $(1-\alpha) \times 100\%$ confidence; if $Q_x$ exceeds its control limit, a potentially quality-relevant fault is detected with $(1-\alpha) \times 100\%$ confidence; and if $T_y^2$ or $Q_y$ exceeds its control limit, a quality-relevant fault is detected which is unpredictable from the process variables.

3.4 Synthetic Case Studies

In this section, a nonlinear numerical case study is used to compare the effectiveness of KCCA, CCCA and CKCCA in terms of detecting quality-relevant and process-relevant faults. The advantages of CKCCA-based monitoring over other existing methods are pointed out using this case study.

The simulated numerical example without faults is generated as follows.

Process variables:

$$x_1 \sim N(0, 1)$$
$$x_2 \sim U(0, 1)$$
$$x_3 = \sin(x_1) + e_1 \tag{3.26}$$
$$x_4 = x_1^2 + 3x_1 + 4 + e_2$$
$$x_5 = x_2^2 + \cos(x_2^2) + 1 + e_3$$

Quality variable:

$$y = x_3^2 + x_3 x_4 + x_1^2 + v \tag{3.27}$$

where $e_k \sim N(0, 0.01^2)$ $(k = 1, 2, 3)$ and $v \sim N(0, 0.05^2)$. $U([0, 1])$ is a uniform distribution on the interval $[0, 1]$, and $N(0, \sigma^2)$ is a normal distribution with mean 0 and variance $\sigma^2$. With the above process variables and quality variable, we generate 200 samples to build the KCCA, CCCA and CKCCA models. For KCCA, $l = 1$, which is equal to the rank of y. For CCCA, $l_c = 1$, $l_y = 1$ and $l_x = 2$. For CKCCA, $l_c = 1$, $l_y = 1$ and $l_x = 2$. Since $l_y$ is equal to the number of quality variables, $Q_y$ is null in this simulation case.
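The fault-free simulation of Eqs. (3.26)-(3.27) can be generated directly; the function name and the use of a seeded generator are illustrative choices, not from the original study:

```python
import numpy as np

def generate_data(N, rng):
    """Sample the synthetic process of Eqs. (3.26)-(3.27)."""
    x1 = rng.normal(0.0, 1.0, N)
    x2 = rng.uniform(0.0, 1.0, N)
    e = rng.normal(0.0, 0.01, (3, N))          # e_k ~ N(0, 0.01^2)
    x3 = np.sin(x1) + e[0]
    x4 = x1**2 + 3 * x1 + 4 + e[1]
    x5 = x2**2 + np.cos(x2**2) + 1 + e[2]
    v = rng.normal(0.0, 0.05, N)               # v ~ N(0, 0.05^2)
    y = x3**2 + x3 * x4 + x1**2 + v
    return np.column_stack([x1, x2, x3, x4, x5]), y

X, y = generate_data(200, np.random.default_rng(0))
```

Faulty data are then obtained by adding a step of magnitude $f_1$ or $f_2$ to the fault-free $x_1$ or $x_2$ before the downstream variables are recomputed.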
A 99% control limit is chosen for this case. Additionally, 400 faulty samples are generated by adding the following disturbances into the input space [3]:

Fault 1: a step change in $x_1$ by $x_1^* = x_1 + f_1 \mathbf{1}$.
Fault 2: a step change in $x_2$ by $x_2^* = x_2 + f_2 \mathbf{1}$.

where $x_1$ and $x_2$ are the fault-free vectors, $f_1$ and $f_2$ represent the magnitudes of the faults, and each element of $\mathbf{1} \in \mathbb{R}^{400}$ is 1.

From Eqs. (3.26) and (3.27), we can see that Fault 1 is quality-relevant. The monitoring results for $f_1 = 3$ are presented in Figures 3.1 - 3.3. All three methods detect the fault and classify it as quality-relevant. However, due to the comprehensive decomposition, with CKCCA the fault in each subspace can be analyzed separately and assigned a different attention level, which cannot be achieved by KCCA. Moreover, CKCCA is more sensitive than CCCA since it overcomes the linear assumption between the process variables and the quality variable.

The predefined Fault 2 is quality-irrelevant. Set $f_2 = 1$; the monitoring results for KCCA, CCCA and CKCCA are shown in Figures 3.4 - 3.6.

[Figure 3.1: Monitoring Results for Fault 1 with KCCA (f_1 = 3)]

[Figure 3.2: Monitoring Results for Fault 1 with CCCA (f_1 = 3)]

[Figure 3.3: Monitoring Results for Fault 1 with CKCCA (f_1 = 3)]

[Figure 3.4: Monitoring Results for Fault 2 with KCCA (f_2 = 1)]

[Figure 3.5: Monitoring
Results for Fault 2 with CCCA (f_2 = 1)]

[Figure 3.6: Monitoring Results for Fault 2 with CKCCA (f_2 = 1)]

From Figure 3.6, we can conclude that Fault 2 affects the process variables but not the quality variable, since only $T_x^2$ exceeds its control limit. However, we cannot draw this conclusion from Figure 3.4, because the index $Q$ for KCCA is potentially related to the quality variable. In Figure 3.5, since CCCA cannot model the nonlinearity in the system, the fault is even detected in $T_c^2$, which is obviously a false alarm. Therefore, CKCCA outperforms KCCA and CCCA with its complete decomposition and nonlinearity modeling.

3.5 Tennessee Eastman Process Case Studies

The Tennessee Eastman process, described in Section 2.6, is employed to compare the performance of CKCCA with the other models. In this case study, PLS, CCA, KCCA, CCCA and CKCCA based monitoring methods are performed on TEP. For these monitoring schemes, XMEAS(1-22) and XMV(1-11) are selected as the process variables, where XMEAS(1-22) are process measurements and XMV(1-11) are manipulated variables. XMEAS(37-41) are selected as the quality variables, which are quality measurements.

The preprocessing steps illustrated in Section 2.4.3 are employed to handle the lagging effects in the quality variables. Then 100 normal samples are used to build the PLS, CCA, KCCA, CCCA and CKCCA models in the training phase. For kernel methods, the selection of the kernel parameter $c$ affects the detection results significantly [53].
In this study, $c = 4000$ is chosen for both the KCCA and CKCCA models. The numbers of factors for CCA and KCCA are 4 and 5, respectively, determined by cross-validation. For CCCA, $l_c = 4$, $l_y = 5$ and $l_x = 32$; for CKCCA, $l_c = 5$, $l_y = 5$ and $l_x = 22$. The control limit for these processes is 99%.

Similar to Section 2.6, the 15 disturbances in Table 2.6 are divided into two groups: quality-relevant faults and quality-irrelevant disturbances. The fault detection rates for the quality-relevant faults and the false alarm rates for the quality-irrelevant disturbances are shown in Tables 3.2 and 3.3, respectively.

Table 3.2: Fault Detection Rates for Quality-Relevant Disturbances with PLS, CCA, KCCA, CCCA and CKCCA (%)

Disturbance  PLS   CCA   KCCA  CCCA  CKCCA
IDV(1)       83.3  63.2  82.5  85.0  85.8
IDV(2)       82.7  78.4  83.0  83.9  85.6
IDV(5)       31.3  83.1  83.7  84.5  84.3
IDV(6)       83.4  81.6  83.5  84.1  84.5
IDV(8)       82.0  66.0  81.8  81.3  83.9
IDV(10)      74.3  50.9  76.0  72.1  80.5
IDV(12)      83.2  71.7  85.8  83.3  85.8
IDV(13)      79.8  65.4  80.9  80.6  80.9

As shown in Table 3.2, since KCCA and CKCCA can model the nonlinearity in the TEP well, their fault detection rates are higher than those of the other methods in most cases. CKCCA has higher fault detection rates than KCCA, since it includes the information in both the correlation subspace and the quality-principal subspace. Additionally, by comparing the results of PLS and CCA, we can see that the fault detection performance of CCA alone is not as good as that of PLS, which is the same conclusion as drawn in Section 2.6.

Table 3.3: False Alarm Rates for Quality-Irrelevant Disturbances with PLS, CCA, KCCA, CCCA and CKCCA (%)

Disturbance  PLS   CCA   KCCA  CCCA  CKCCA
IDV(3)        7.5   8.1   6.7   3.4   3.2
IDV(4)       84.0  84.2  11.0   2.5   3.5
IDV(9)        8.5   9.8  10.2   3.2   2.9
IDV(11)      68.0  19.6  11.8   7.5   3.4
IDV(15)      14.2   8.6   5.6   3.0   3.7

In Table 3.3, the false alarm rates of KCCA are much smaller than those of CCA due to its nonlinearity modeling ability, especially for IDV(4) and IDV(11).
Among all methods, CKCCA has the smallest false alarm rates in most cases. Although the false alarm rates of CCCA are smaller for IDV(4) and IDV(15), the values are quite close.

Case 1: Step Change in B Composition, IDV(2)

In this process disturbance, the A/C ratio is kept constant and a step change occurs in the B composition of the stripper inlet stream, which is the IDV(2) faulty case in [43]. The monitoring results for the CCA, KCCA, CCCA and CKCCA models are shown in Figures 3.7 - 3.10, respectively.

[Figure 3.7: CCA-based Monitoring Results for IDV(2)]

[Figure 3.8: KCCA-based Monitoring Results for IDV(2)]

[Figure 3.9: CCCA-based Monitoring Results for IDV(2)]

[Figure 3.10: CKCCA-based Monitoring Results for IDV(2)]

[Figure 3.11: CCA-based Monitoring Results for IDV(4)]

From Figures 3.9 and 3.10, we can see that CKCCA can model and track the nonlinear part in the process better than CCCA, which makes $T_c^2$ in CKCCA return to normal completely after the adjustment of the feedback controllers in the process. The same conclusion can be drawn by comparing Figures 3.7 and 3.8.
Moreover, compared with KCCA in Figure 3.8, CKCCA decomposes the process and quality spaces further: the $T_c^2$ and $T_y^2$ subspaces monitor the quality-relevant variations, while the $T_x^2$ and $Q_x$ subspaces monitor the variations that are irrelevant to the quality variables. Although both $T_c^2$ and $T^2$ in Figures 3.8 and 3.10 return to normal, $T_y^2$ still monitors the quality-relevant variations in CKCCA when the quality measurements are available.

[Figure 3.12: KCCA-based Monitoring Results for IDV(4)]

[Figure 3.13: CKCCA-based Monitoring Results for IDV(4)]

Case 2: Step Change in Reactor Cooling Water Inlet Temperature, IDV(4)

In this process disturbance, the reactor cooling water inlet temperature has a step change, which is the IDV(4) disturbance case in [43]. The monitoring results for the CCA, KCCA, CCCA and CKCCA models are presented in Figures 3.11 - 3.13. In this scenario, the variation of the reactor cooling water inlet temperature will not affect the quality variables due to the effect of the cascade controller in the system. For CKCCA, the quality-relevant index $T_c^2$ is within its control limit, while $T_x^2$ and $Q_x$ exceed their control limits, which shows that this disturbance is process-relevant but quality-irrelevant, and thus should receive less attention when monitoring. The advantage of the comprehensive decomposition in CKCCA cannot be achieved with CCA and KCCA, although KCCA outperforms CCA by modeling the nonlinearity well (Figures 3.11 and 3.12).

3.6 Summary

In this chapter, concurrent kernel CCA is proposed to comprehensively monitor process-relevant and quality-relevant faults for nonlinear industrial processes; it projects the original feature space into five subspaces.
Different from previous work, CKCCA also considers the nonlinearity in the quality variables. The monitoring statistics and corresponding control limits are developed for CKCCA. The simulation results in the industrial case studies show that CKCCA gives complete monitoring of faults that happen in the predictable quality subspace and the unpredictable quality-residual subspace, as well as faults that only affect the process spaces, which effectively decreases the false alarm rates. Additionally, the results demonstrate that CKCCA models and tracks the nonlinear part of the process better, at the expense of the kernel treatment requiring more parameters than its linear counterpart. The good news, however, is that the computational cost is linear due to the kernel trick.

Chapter 4
Concurrent Diagnosis of Process and Quality Faults with Regularized Canonical Correlation Analysis

4.1 Introduction

Considering the efficiency of CCA over PLS, concurrent CCA was proposed in Chapter 2, which decomposes the inputs into quality-relevant and quality-irrelevant spaces; the corresponding monitoring scheme was also developed. For multivariate quality-relevant monitoring, the root causes of a detected fault should also be analyzed. Contribution plots, an early and popular approach, diagnose a fault by determining the contribution of each variable to the fault detection indices [1, 74]. However, Westerhuis et al. [75] showed that contribution plots have smearing effects, which can lead to misleading results.

To avoid the ambiguity of the contribution plot method, reconstruction-based diagnosis methods have been proposed, where rigorous diagnosability analysis is available given a fault direction [76]. The advantage of the reconstruction-based method is that faults with known fault directions can be diagnosed without ambiguity, but it requires prior knowledge
of fault directions. To overcome this problem, Alcala and Qin [77] proposed the reconstruction-based contribution (RBC) method, which defines the RBC of a variable as the amount of reconstruction along that variable's direction that minimizes the fault detection index. Alternatively, when fault data are available for a particular fault, the fault direction can be extracted in the residual space or principal component space using singular value decomposition [78]. With knowledge of the fault directions, an extended RBC was proposed for multi-dimensional fault diagnosis [79, 80].

The remaining sections of this chapter are organized as follows. In Section 4.2, a combined index for CCCA is proposed to monitor quality-relevant faults with process variables, and its detectability analysis is presented. The traditional contribution plots and RBC diagnosis approaches are defined for CCCA in Section 4.3. Additionally, an extended RBC method is proposed to diagnose multi-dimensional faults. In Section 4.4, monitoring schemes for the Tennessee Eastman process on quality-relevant monitoring and diagnosis are employed to illustrate the performance of these methods. Finally, conclusions are drawn in the last section.

4.2 CCCA-based Fault Diagnosis

4.2.1 Quality-Relevant Monitoring with Combined Index

The indices in Table 2.1 cover both process-relevant and quality-relevant faults. However, in practice, quality-relevant faults predicted from process data should receive higher attention. Process-relevant faults, such as disturbances in process variables, can often be compensated by feedback controllers and thus have no effect on the quality variables; they are of less concern.
Therefore, the $T_c^2$ and $\tilde{Q}_x$ statistics are adopted here to monitor quality-relevant faults:

$$T_c^2 = \mathbf{x}^\top \mathbf{R}_c \Lambda_c^{-1} \mathbf{R}_c^\top \mathbf{x} \le \tau_c^2$$
$$\tilde{Q}_x = \mathbf{x}^\top (\mathbf{I} - \mathbf{R}_c \mathbf{R}_c^\dagger)(\mathbf{I} - \tilde{\mathbf{P}}_x \tilde{\mathbf{P}}_x^\top)(\mathbf{I} - \mathbf{R}_c \mathbf{R}_c^\dagger)\mathbf{x} \le \delta_x^2$$

where $\Lambda_c = \frac{1}{n-1}\mathbf{T}_c^\top \mathbf{T}_c$, and $\tau_c^2$ and $\delta_x^2$ are the control limits for $T_c^2$ and $\tilde{Q}_x$, respectively.

Yue and Qin [81] and Negiz and Çinar [82] pointed out that it is convenient to monitor one index rather than two. Therefore, a combined index for CCCA is adopted here to monitor the quality-relevant faults using process variables:

$$\varphi = \frac{T_c^2}{\tau_c^2} + \frac{\tilde{Q}_x}{\delta_x^2} = \mathbf{x}^\top \Phi \mathbf{x} \quad (4.1)$$

where $\Phi = \frac{\mathbf{R}_c \Lambda_c^{-1} \mathbf{R}_c^\top}{\tau_c^2} + \frac{(\mathbf{I} - \mathbf{R}_c \mathbf{R}_c^\dagger)(\mathbf{I} - \tilde{\mathbf{P}}_x \tilde{\mathbf{P}}_x^\top)(\mathbf{I} - \mathbf{R}_c \mathbf{R}_c^\dagger)}{\delta_x^2}$.

Since $\varphi$ is a quadratic function of $\mathbf{x}$, its control limit can be obtained by [83]

$$\zeta^2 = g \chi^2_{h,\alpha} \quad (4.2)$$

where $g = \mathrm{tr}\{(\mathbf{S}\Phi)^2\}/\mathrm{tr}(\mathbf{S}\Phi)$, $h = [\mathrm{tr}(\mathbf{S}\Phi)]^2/\mathrm{tr}\{(\mathbf{S}\Phi)^2\}$, and $\mathbf{S} = \frac{1}{n-1}\mathbf{X}^\top \mathbf{X}$ is the estimated covariance of $\mathbf{X}$.

4.2.2 Analysis of Detectability

When a fault occurs, the faulty sample vector can be represented as

$$\mathbf{x} = \mathbf{x}^* + \Xi_i \mathbf{f} \quad (4.3)$$

where $\mathbf{x}^*$ is the sample vector under normal operating conditions and $\Xi_i \mathbf{f}$ is the fault part added to $\mathbf{x}^*$. In Eq. (4.3), $\Xi_i \in \mathbb{R}^{m \times A_f}$ is an orthonormal matrix that spans the fault subspace of dimension $A_f$, and $\mathbf{f}$ is the magnitude of the fault. When $A_f = 1$, $\Xi_i$ reduces to a vector with unit norm and the fault is classified as uni-dimensional.

Substituting Eq. (4.3) into Eq. (4.1), we obtain

$$\varphi = \|\Phi^{\frac{1}{2}} \mathbf{x}^* + \Phi^{\frac{1}{2}} \Xi_i \mathbf{f}\|^2 = \|\bar{\mathbf{x}}^* + \bar{\Xi}_i \mathbf{f}\|^2 \quad (4.4)$$

where $\bar{\mathbf{x}}^* = \Phi^{\frac{1}{2}} \mathbf{x}^*$ and $\bar{\Xi}_i = \Phi^{\frac{1}{2}} \Xi_i$. Although $\Xi_i$ has full column rank, $\bar{\Xi}_i$ may not. Applying singular value decomposition to $\bar{\Xi}_i$, we get

$$\bar{\Xi}_i = \begin{bmatrix} \mathbf{U}_i & \mathbf{U}_i^\perp \end{bmatrix} \begin{bmatrix} \mathbf{D}_i \\ \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{V}_i & \mathbf{V}_i^\perp \end{bmatrix}^\top = \mathbf{U}_i \mathbf{D}_i \mathbf{V}_i^\top \quad (4.5)$$
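As a concrete sketch of Eqs. (4.1) and (4.2), the combined-index matrix $\Phi$ and the parameters $g$ and $h$ of the $\chi^2$ control limit can be computed as below, assuming an already-fitted CCCA model supplies $\mathbf{R}_c$, $\mathbf{T}_c$, the process loadings $\tilde{\mathbf{P}}_x$, and the limits $\tau_c^2$ and $\delta_x^2$. All function and variable names here are illustrative, not from the dissertation's code.

```python
import numpy as np

def combined_index_matrix(R_c, T_c, P_x, tau2_c, delta2_x):
    """Phi = R_c Lambda_c^{-1} R_c' / tau2_c
           + (I - R_c R_c^+)(I - P_x P_x')(I - R_c R_c^+) / delta2_x  (Eq. 4.1)."""
    n = T_c.shape[0]
    m = R_c.shape[0]
    Lambda_c = T_c.T @ T_c / (n - 1)
    quality_part = R_c @ np.linalg.inv(Lambda_c) @ R_c.T / tau2_c
    Pi = np.eye(m) - R_c @ np.linalg.pinv(R_c)   # I - R_c R_c^+
    process_part = Pi @ (np.eye(m) - P_x @ P_x.T) @ Pi / delta2_x
    return quality_part + process_part

def limit_parameters(Phi, X):
    """Return (g, h) so that the control limit is zeta^2 = g * chi2_{h, alpha} (Eq. 4.2)."""
    n = X.shape[0]
    S = X.T @ X / (n - 1)                        # estimated covariance of X
    SP = S @ Phi
    tr1, tr2 = np.trace(SP), np.trace(SP @ SP)
    return tr2 / tr1, tr1 ** 2 / tr2
```

The combined index for a new sample `x` is then `x @ Phi @ x`, flagged as quality-relevant when it exceeds `g` times the $\chi^2$ quantile with `h` degrees of freedom.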
According to the rank of i , the quality-relevant faults can be di- vided into three groups, which are illustrated in Lemma 1. Lemma 1. The detectability of the quality-relevant faults can be de- termined as follows, (1) If rank i = 0, the fault is undetectable no matter what f is. (2) If 0< rank i <A f , the fault is not detectable if f2R V ? i . (3) If rank i =A f and f = 2R V ? i , the fault is detectable. Lemma 1 is only the necessary condition for a fault to be detectable. However, in order to guarantee the fault detection when the fault occurs, there is also a restriction on the magnitude of the fault, which is shown in Lemma 2. 81 Lemma2. When f = 2R V ? i , the quality-relevant faults can be guar- anteed to be detected ifjjfjj> 2&=d min , whered min is the minimum nonzero singular values of i . The proof of Lemma 1 and Lemma 2 is given in Appendix E. 4.3 Quality-Relevant Diagnosis Fault detection is only the first step in process monitoring. The next step is to find the root causes of the faults, which is called fault diagnosis. In this section, the traditional contribution plots and RBC fault diagnosis methods are developed for quality-relevant diagnosis using CCCA, that is, to give prognosis of quality-related faults once they are detected by moni- toring the real-time process data. These two methods are simple to use and do not require prior knowledge or data of the faults, but they can have poor performance when they are used to diagnose multi-dimensional faults, thus an extended RBC algorithm for multi-dimensional fault diagnosis is also proposed. 4.3.1 Contribution Plots and RBC Contribution plots is a popular approach for fault diagnosis, and its idea is that variables with the largest contributions to the fault detection index are most likely the faulty variables. 
For quality-relevant diagnosis based on CCCA, the contribution plots can be defined as

$$\mathrm{Index}(\mathbf{M}) = \mathbf{x}^\top \mathbf{M} \mathbf{x} = \|\mathbf{M}^{\frac{1}{2}} \mathbf{x}\|^2 = \sum_{i=1}^{m} \left(\xi_i^\top \mathbf{M}^{\frac{1}{2}} \mathbf{x}\right)^2 = \sum_{i=1}^{m} c_i^{\mathbf{M}} \quad (4.7)$$

where $c_i^{\mathbf{M}} = (\xi_i^\top \mathbf{M}^{\frac{1}{2}} \mathbf{x})^2$ is the contribution of variable $x_i$ to $\mathrm{Index}(\mathbf{M})$, $\xi_i$ is the $i$th column of the identity matrix, and the formulations of $\mathbf{M}$ are summarized in Table 4.1.

Table 4.1: Formulations of M for Monitoring Statistics

Index          M
$T_c^2$        $\mathbf{R}_c \Lambda_c^{-1} \mathbf{R}_c^\top$
$\tilde{Q}_x$  $(\mathbf{I} - \mathbf{R}_c \mathbf{R}_c^\dagger)(\mathbf{I} - \tilde{\mathbf{P}}_x \tilde{\mathbf{P}}_x^\top)(\mathbf{I} - \mathbf{R}_c \mathbf{R}_c^\dagger)$

The contribution plot method works well in many cases, but its assumption that the variables with the largest contributions are most likely the faulty variables has no theoretical basis, and it may give misleading diagnosis results due to smearing effects [75, 84].

Alcala and Qin [77] then proposed the RBC approach for diagnosis of faulty variables. RBC combines contribution analysis with reconstruction-based identification, which improves the diagnosis performance. After detecting quality-relevant faults with $\mathrm{Index}(\mathbf{M})$ for the CCCA model, the corresponding RBC is derived as follows.

For a faulty sample $\mathbf{x}$, the reconstructed vector along the $\xi_i$ direction is

$$\mathbf{z} = \mathbf{x} - \xi_i f_i \quad (4.8)$$

and the $\mathrm{Index}(\mathbf{M})$ for the reconstructed vector is

$$\mathrm{Index}(\mathbf{M}) = \mathbf{z}^\top \mathbf{M} \mathbf{z} = \|\mathbf{z}\|_{\mathbf{M}}^2 = \|\mathbf{x} - \xi_i f_i\|_{\mathbf{M}}^2 \quad (4.9)$$

To reconstruct the sample, an $f_i$ is found such that $\mathrm{Index}(\mathbf{M})$ is minimized. Taking the first derivative of Eq. (4.9) with respect to $f_i$ and setting it to zero leads to

$$f_i = (\xi_i^\top \mathbf{M} \xi_i)^{-1} \xi_i^\top \mathbf{M} \mathbf{x} \quad (4.10)$$

The RBC of variable $x_i$ for the fault detection index $\mathrm{Index}(\mathbf{M})$, $\mathrm{RBC}_i^{\mathbf{M}}$, is then defined as the amount of reconstruction along the direction $\xi_i$:

$$\mathrm{RBC}_i^{\mathbf{M}} = \|\xi_i f_i\|_{\mathbf{M}}^2 = \mathbf{x}^\top \mathbf{M} \xi_i (\xi_i^\top \mathbf{M} \xi_i)^{-1} \xi_i^\top \mathbf{M} \mathbf{x} \quad (4.11)$$

Since both Eqs. (4.7) and (4.11) have quadratic forms, their control limits can be defined as [83]

$$\zeta_i^2 = g_i \chi^2_{h_i,\alpha} \quad (4.12)$$

where $g_i = \mathrm{tr}\{(\mathbf{S}\mathbf{D})^2\}/\mathrm{tr}(\mathbf{S}\mathbf{D})$, $h_i = [\mathrm{tr}(\mathbf{S}\mathbf{D})]^2/\mathrm{tr}\{(\mathbf{S}\mathbf{D})^2\}$, and $\mathbf{S}$ is the same as in Eq. (4.2).
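The two diagnosis indices just defined can be sketched in a few lines, assuming the index matrix $\mathbf{M}$ from Table 4.1 is available as a numpy array (names illustrative). The short demo at the end illustrates the guarantee analyzed in Section 4.3.2: for a sample exactly along one variable direction, RBC always peaks at that variable.

```python
import numpy as np

def contributions(M, x):
    """Contribution plots (Eq. 4.7): c_i = (xi_i' M^{1/2} x)^2; they sum to x' M x."""
    w, V = np.linalg.eigh(M)
    M_half = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T
    return (M_half @ x) ** 2

def rbc(M, x):
    """RBC (Eq. 4.11): RBC_i = (xi_i' M x)^2 / m_ii."""
    return (M @ x) ** 2 / np.diag(M)

# Demo: positive semi-definite M, fault purely along variable j = 2
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
M = A @ A.T                         # a positive semi-definite index matrix
x = np.zeros(5); x[2] = 3.0         # x = xi_j * f with f = 3
assert int(np.argmax(rbc(M, x))) == 2   # RBC points at the true variable
```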
The matrices $\mathbf{D}$ for contribution plots and RBC are listed in Table 4.2.

Table 4.2: Formulations of D for Contribution Plots and RBC

Matrix        Contribution Plot                                                       RBC
$\mathbf{D}$  $\mathbf{M}^{\frac{1}{2}} \xi_i \xi_i^\top \mathbf{M}^{\frac{1}{2}}$    $\mathbf{M} \xi_i (\xi_i^\top \mathbf{M} \xi_i)^{-1} \xi_i^\top \mathbf{M}$

4.3.2 Analysis of Diagnosability

Contribution plots have been used in practice, but there is no fundamental analysis of their diagnosability. Alcala and Qin [77] showed that both contribution plots and RBC have smearing effects, but RBC always gives correct fault diagnosis, while contribution plots cannot guarantee correct diagnosis even if the fault is in a single variable direction. In this section, the diagnosability of contribution plots and RBC is analyzed.

A fault with a single variable direction is considered, so Eq. (4.3) reduces to

$$\mathbf{x} = \mathbf{x}^* + \xi_j f \quad (4.13)$$

where $\xi_j f$ is the faulty part, composed of the fault direction $\xi_j$ and the fault magnitude $f$.

Based on Eqs. (4.7) and (4.13), the variable contributions of a faulty measurement to $\mathrm{Index}(\mathbf{M})$ can be written as

$$c_i^{\mathbf{M}} = \left(\xi_i^\top \mathbf{M}^{\frac{1}{2}} \mathbf{x}\right)^2 = \left[\xi_i^\top \mathbf{M}^{\frac{1}{2}} (\mathbf{x}^* + \xi_j f)\right]^2 = \left(\bar{x}_i^* + [\mathbf{M}^{\frac{1}{2}}]_{ij} f\right)^2 \quad (4.14)$$

where $\bar{x}_i^* = \xi_i^\top \mathbf{M}^{\frac{1}{2}} \mathbf{x}^*$ is the contribution of the fault-free part of the measurement $\mathbf{x}$, and $[\mathbf{M}^{\frac{1}{2}}]_{ij} = \xi_i^\top \mathbf{M}^{\frac{1}{2}} \xi_j$ is the $ij$th element of the matrix $\mathbf{M}^{\frac{1}{2}}$.

Similarly, with Eqs. (4.11) and (4.13), $\mathrm{RBC}_i^{\mathbf{M}}$ can be rearranged as

$$\mathrm{RBC}_i^{\mathbf{M}} = \mathbf{x}^\top \mathbf{M} \xi_i (\xi_i^\top \mathbf{M} \xi_i)^{-1} \xi_i^\top \mathbf{M} \mathbf{x} = \frac{(\xi_i^\top \mathbf{M} \mathbf{x})^2}{m_{ii}} = \frac{[\xi_i^\top \mathbf{M} (\mathbf{x}^* + \xi_j f)]^2}{m_{ii}} = \left(m_{ii}^{-\frac{1}{2}} \tilde{x}_i^* + m_{ii}^{-\frac{1}{2}} m_{ij} f\right)^2 \quad (4.15)$$

where $\tilde{x}_i^* = \xi_i^\top \mathbf{M} \mathbf{x}^*$, and $m_{ij}$ and $m_{ii}$ are the $ij$th and $ii$th elements of $\mathbf{M}$, respectively.

From Eqs. (4.14) and (4.15), we can see that a fault in variable $j$ smears into the values of $c_i^{\mathbf{M}}$ and $\mathrm{RBC}_i^{\mathbf{M}}$. The issue is then to determine whether this smearing effect may lead to mis-diagnosis by $c_i^{\mathbf{M}}$ and $\mathrm{RBC}_i^{\mathbf{M}}$. For simplicity, we assume that the faulty sample $\mathbf{x}$ is exactly in the $\xi_j$ direction, which holds when $f$ is sufficiently large and $\mathbf{x}^*$ is negligible compared to $\xi_j f$. Therefore, $\mathbf{x}$ can be represented as

$$\mathbf{x} = \xi_j f \quad (4.16)$$

Then Eqs.
(4.14) and (4.15) reduce to

$$c_i^{\mathbf{M}} = \begin{cases} [\mathbf{M}^{\frac{1}{2}}]_{ij}^2 f^2, & i \neq j \\ [\mathbf{M}^{\frac{1}{2}}]_{jj}^2 f^2, & i = j \end{cases} \quad (4.17)$$

and

$$\mathrm{RBC}_i^{\mathbf{M}} = \begin{cases} m_{ii}^{-1} m_{ij}^2 f^2, & i \neq j \\ m_{jj} f^2, & i = j \end{cases} \quad (4.18)$$

Correct diagnosis with $c_i^{\mathbf{M}}$ is guaranteed only if

$$[\mathbf{M}^{\frac{1}{2}}]_{jj}^2 \geq [\mathbf{M}^{\frac{1}{2}}]_{ij}^2 \quad (4.19)$$

and that with $\mathrm{RBC}_i^{\mathbf{M}}$ is guaranteed only if

$$m_{jj} \geq m_{ii}^{-1} m_{ij}^2 \quad (4.20)$$

Alcala and Qin [77] showed that there is no guarantee that $c_j^{\mathbf{M}} \geq c_i^{\mathbf{M}}$ for $i \neq j$, but $\mathrm{RBC}_j^{\mathbf{M}} \geq \mathrm{RBC}_i^{\mathbf{M}}$ always holds. The proof is repeated here [77]. Since $\mathbf{M}$ is positive semi-definite, $\mathbf{M}^{\frac{1}{2}}$ is also positive semi-definite. Thus,

$$[\xi_i \; \xi_j]^\top \mathbf{M}^{\frac{1}{2}} [\xi_i \; \xi_j] = \begin{pmatrix} [\mathbf{M}^{\frac{1}{2}}]_{ii} & [\mathbf{M}^{\frac{1}{2}}]_{ij} \\ [\mathbf{M}^{\frac{1}{2}}]_{ij} & [\mathbf{M}^{\frac{1}{2}}]_{jj} \end{pmatrix} \succeq 0 \quad (4.21)$$

Eq. (4.19) does not always follow from Eq. (4.21); therefore, $c_j^{\mathbf{M}} \geq c_i^{\mathbf{M}}$ cannot be guaranteed. However, from the positive semi-definiteness of $\mathbf{M}$, we have

$$[\xi_i \; \xi_j]^\top \mathbf{M} [\xi_i \; \xi_j] = \begin{pmatrix} m_{ii} & m_{ij} \\ m_{ij} & m_{jj} \end{pmatrix} \succeq 0 \quad (4.22)$$

which always implies Eq. (4.20), since

$$\det \begin{pmatrix} m_{ii} & m_{ij} \\ m_{ij} & m_{jj} \end{pmatrix} = m_{ii} m_{jj} - m_{ij}^2 \geq 0 \quad (4.23)$$

Therefore, $\mathrm{RBC}_j^{\mathbf{M}} \geq \mathrm{RBC}_i^{\mathbf{M}}$ is guaranteed.

4.3.3 Extended Reconstruction-based Contribution

Compared with contribution plots, $\mathrm{RBC}_i^{\mathbf{M}}$ is an improved index for quality-relevant fault diagnosis. For multi-dimensional faults, the RBC method can be used just like the contribution plot method, but the

Algorithm 5: The Extended Reconstruction-based Contribution Algorithm
1. Initialize the fault direction candidate set $S_c = \emptyset$; the faulty samples are $\mathbf{X}$ and $\mathbf{Y}$.
2. For every process variable $i$ ($i = 1, 2, \ldots, m$) that is not in $S_c$:
   1) Create a potential fault direction $\xi_i \in \mathbb{R}^m$ whose $i$th element is 1 and all other elements are zero; initialize the sum of accumulated errors, $\mathrm{SAE}_i$, to zero, and set $\mathbf{X}_{\mathrm{temp}} = \mathbf{X}$;
   2) Define $S_c' = [S_c, \xi_i]$, and reconstruct every quality-relevant faulty sample of $\mathbf{X}_{\mathrm{temp}}$ along the $S_c'$ directions. If the reconstructed index still exceeds the control limit, add the difference $\mathrm{Index}(\mathbf{M}) - \zeta^2$ to $\mathrm{SAE}_i$.
3.
Select the process variable $j$ with the minimum SAE as the next fault direction candidate and add $\xi_j$ to $S_c$, i.e., update $S_c := [S_c, \xi_j]$.
4. Go back to Step 2 until the number of fault direction candidates reaches $m$ or $\mathrm{SAE}_j$ for the newly selected variable is less than a threshold $\epsilon$.

diagnosis results can be ambiguous and subjective due to smearing effects. To deal with these cases, we propose an extended RBC (ERBC) approach, shown in Algorithm 5, which is adapted from [80]. Some remarks on this algorithm follow.

Remark 1. The selection of the threshold $\epsilon$ determines the size of the candidate set $S_c$. When $\epsilon$ is small, more faulty variables will be included in order to satisfy the reconstruction requirement; when $\epsilon$ is large, the reconstruction performance may not be as good. In our study, the value of $\epsilon$ is determined based on the development dataset.

Remark 2. From Eqs. (4.7) and (4.11) and the diagnosability analysis in Section 4.3.2, we can see that both contribution plots and RBC have smearing effects, since the fault contribution of one variable propagates to the contributions of other variables. ERBC, however, eliminates the smearing effects to a large extent: once the root causes are detected, their effects are removed from the samples, and so are their smearing effects on the other variables. Further evidence is shown in the case studies.

4.4 Tennessee Eastman Process Case Studies

In this section, the Tennessee Eastman process (TEP) described in Section 2.6 is used to illustrate the quality-relevant monitoring performance of the existing methods and to show the advantages of the ERBC algorithm over other diagnosis approaches. In this work, PCA, PLS and CCCA are selected for comparing effectiveness in terms of QRM, and when faults are detected, CCCA-based contribution plots, RBC and ERBC are employed to locate their root causes. For PCA, XMEAS(1-22) and XMV(1-11) are the process variables.
For PLS and CCCA, XMEAS(1-22) and XMV(1-11) are selected as process variables, and XMEAS(34, 40, 41) are selected as quality variables. After handling the asynchronous sampling and measurement delay in the quality data, 100 normal samples are employed to build these models. The numbers of components are selected by cross-validation: for PCA, $l = 19$; for PLS, $l = 10$; and for CCCA, $l_c = 2$, $l_x = 19$, $l_y = 3$, with the two regularization parameters set to 0.021 and 0.002. The control limit level is 99%.

Table 4.3: Fault Detection Rates for Quality-Relevant Disturbances with PCA, PLS and CCCA (%)

Disturbance   PCA     PLS     CCCA
IDV(1)        99.36   100     100
IDV(2)        100     100     100
IDV(5)        75.76   72.73   93.26
IDV(6)        100     100     100
IDV(7)        95.40   98.85   100
IDV(8)        100     100     100
IDV(12)       100     100     100
IDV(13)       99.56   100     100

Table 4.4: False Alarm Rates for Quality-Irrelevant Disturbances with PCA, PLS and CCCA (%)

Disturbance   PCA     PLS     CCCA
IDV(3)        15.34   14.29   3.17
IDV(4)        83.68   57.37   4.74
IDV(9)        16.84   11.58   2.63
IDV(10)       44.27   44.27   8.85
IDV(11)       65.97   50.79   8.38
IDV(14)       84.13   83.60   13.23
IDV(15)       9.04    11.70   2.66

Figure 4.1: CCCA-based Fault Monitoring Results for IDV(1) ($T_c^2$, $\tilde{Q}_x$, $\tilde{T}_x^2$ and $T_y^2$)
Figure 4.2: Fault Diagnosis Results for the 42nd Sample Using Contribution Plots (Upper) and RBC (Lower) with $T_c^2$
Figure 4.3: Fault Diagnosis Results for the 42nd Sample Using Contribution Plots (Upper) and
RBC (Lower) with $\tilde{Q}_x$
Figure 4.4: Fault Diagnosis Results for the 42nd Sample Using Contribution Plots (Upper) and RBC (Lower) with $\varphi$

The 15 process disturbances described in [43] can be classified into two groups by monitoring the quality variables: a quality-relevant group and a quality-irrelevant group. The monitoring statistics for these groups are summarized in Tables 4.3 and 4.4. From Table 4.3, we can see that all three methods perform well when quality-relevant faults occur, with CCCA performing the best in all cases. For the quality-irrelevant disturbances (Table 4.4), however, CCCA-based monitoring is clearly the winner among the three methods. Although PLS is marginally better than PCA in terms of false alarm rates, both have much higher false alarm rates than CCCA in almost all cases.

To perform fault diagnosis for the faults and disturbances in Tables 4.3 and 4.4, CCCA-based contribution plots, RBC and ERBC are applied to the dataset, and the combined index in Table 4.1 is employed to illustrate their quality-relevant diagnosis performance. IDV(1) and IDV(3) are adopted to show the performance of these diagnosis approaches on a quality-relevant fault and a quality-irrelevant disturbance, respectively.

In IDV(1), the B composition is kept constant and a step change occurs in the A/C feed ratio [43]. The fault monitoring result of CCCA is shown in Figure 4.1, where the monitoring indices are normalized by their corresponding control limits. $T_c^2$ and $\tilde{Q}_x$ are used to detect predicted and potential quality-relevant faults; as shown in Figure 4.1, IDV(1) is classified as quality-relevant, which is validated by the quality monitoring with $T_y^2$.
It is also observed that the quality monitoring with $T_y^2$ comes with delays in Figure 4.1. The diagnosis results using contribution plots and RBC for the 42nd faulty sample with $T_c^2$, $\tilde{Q}_x$ and $\varphi$ are presented in Figures 4.2 - 4.4, respectively; note that the relative contribution plots and relative RBC are adopted in these figures. In Figure 4.4, based on the magnitudes of the contributions, we may conclude that Variables 4, 16 and 20 are the contributing variables for both contribution plots and RBC.

With ERBC, however, the extracted variables are 4, 23 and 8, ordered by extraction sequence. Figures 4.5 - 4.8 show the extraction process of ERBC. The fault is introduced after the 36th sample, and in Figure 4.5, the variable RBCs for Samples 36-50 are high while the others are low due to the efforts of the controllers in the process. The average RBC of each variable direction over all samples is shown in the lower plot, which indicates that Variable 4 has the highest contribution and is selected as the first fault direction. Its smearing effects on Variables 5 and 16 are removed after reconstructing along Variable 4. Similarly, in Figures 4.6 and 4.7, Variables 23 and 8 are selected as the next fault directions, and the fault diagnosis result after reconstruction along these three directions is shown in Figure 4.8, where all variables have relatively low RBCs.
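The greedy extraction illustrated in Figures 4.5 - 4.8 follows Algorithm 5. A minimal sketch, assuming a quadratic index $\mathbf{x}^\top\mathbf{M}\mathbf{x}$ with control limit `zeta2` and unit variable directions; all names are illustrative:

```python
import numpy as np

def reconstruct_index(M, x, dirs):
    """Minimize (x - Xi f)' M (x - Xi f) over f, with Xi the unit directions
    listed in dirs (the multi-dimensional analogue of Eq. 4.10)."""
    Xi = np.eye(len(x))[:, dirs]
    f = np.linalg.solve(Xi.T @ M @ Xi, Xi.T @ M @ x)
    z = x - Xi @ f
    return z @ M @ z

def erbc(M, X, zeta2, eps=1e-6):
    """Greedily select fault directions until the reconstructed index of the
    faulty samples (rows of X) is explained, as in Algorithm 5."""
    m = X.shape[1]
    S_c = []                                   # selected fault direction indices
    while len(S_c) < m:
        sae = np.full(m, np.inf)
        for i in range(m):
            if i in S_c:
                continue
            idx = [reconstruct_index(M, x, S_c + [i]) for x in X]
            sae[i] = sum(max(v - zeta2, 0.0) for v in idx)  # accumulated excess
        j = int(np.argmin(sae))
        S_c.append(j)
        if sae[j] < eps:                       # newly selected SAE below threshold
            break
    return S_c
```

On a toy case with a single faulty variable, the sketch stops after selecting that one direction, since reconstructing along it already brings the index below the limit.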
Figure 4.5: RBC Diagnosis Results for All Samples (Upper) and Average RBC (Lower)
Figure 4.6: RBC Diagnosis Results After Reconstructing along Variable 4 (Upper: RBC for All Samples; Lower: Average RBC)
Figure 4.7: RBC Diagnosis Results After Reconstructing along Variables 4 and 23 (Upper: RBC for All Samples; Lower: Average RBC)
Figure 4.8: RBC Diagnosis Results After Reconstructing along Variables 4, 23 and 8 (Upper: RBC for All Samples; Lower: Average RBC)
Figure 4.9: Quality-Relevant Monitoring and Reconstruction Results for IDV(1) Using the Combined Index

The fault detection index after reconstruction is plotted in Figure 4.9, which brings all out-of-control samples back within the normal control region. When the step change is introduced in the A/C feed ratio, Variable 4, the A/C feed in Stream 4, is affected immediately. Then, due to the reactions, the feed flow of another reactant, D, in Stream 2 (Variable 23) decreases, which leads to a decrease in the throughput of product G. Since G is a liquid, the reactor level (Variable 8) is also affected.
The effect of these variables then propagates to other variables (such as Variables 5, 16 and 20), which leads to the high RBCs of these variables in Figure 4.4. Therefore, after reconstructing along Variables 4, 23 and 8, their smearing effects on the other variables are removed as well.

IDV(3), a step change in the D feed temperature, is a quality-irrelevant disturbance, as shown in Figure 4.10. The fault diagnosis results with contribution plots and RBC for the 50th sample are shown in Figures 4.11 - 4.13. The contributions for $T_c^2$ are small, while for $\tilde{Q}_x$ and $\varphi$, Variable 32 has a relatively high contribution. Variable 32 is the reactor cooling water flow, which is directly affected by the D feed temperature and is process-relevant only. Thus, the proposed fault diagnosis methods can be used for quality-irrelevant disturbances as well.

4.5 Summary

In this chapter, concurrent CCA is employed to diagnose the root causes of quality-relevant faults, and a new combined monitoring index is defined for CCCA. The contribution plots and RBC diagnosis methods are then developed with this index, and their diagnosability is analyzed. Additionally, an extended RBC approach is proposed for quality-relevant fault diagnosis, which generalizes the traditional RBC method to multi-dimensional faulty variables. Finally, TEP is used to illustrate the performance of quality-relevant monitoring and diagnosis based on the CCCA model.
Figure 4.10: CCCA-based Fault Monitoring Results for IDV(3)
Figure 4.11: Fault Diagnosis Results for the 50th Sample Using Contribution Plots (Upper) and RBC (Lower) with $T_c^2$
Figure 4.12: Fault Diagnosis Results for the 50th Sample Using Contribution Plots (Upper) and RBC (Lower) with $\tilde{Q}_x$
Figure 4.13: Fault Diagnosis Results for the 50th Sample Using Contribution Plots (Upper) and RBC (Lower) with $\varphi$

Chapter 5
Dynamic Concurrent Kernel Canonical Correlation Analysis and its Application to a Continuous Annealing Process

5.1 Introduction

The continuous annealing process (CAP) is an important unit for producing high-quality cold rolling strip. During the production phase, the cold rolling strip is recrystallized and annealed to improve its microstructure, plasticity and metal-forming characteristics. Strip thickness is one of the most important quality indices. Since practitioners are most concerned with faults that are relevant to abnormal product quality, strip-thickness relevant fault diagnosis has received increasing attention.
With highly correlated variables and the difficulty of modeling the dynamic nonlinear continuous annealing process, it is hard to apply model-based fault diagnosis methods. Recently, multi-layer data modeling has become one of the most active topics in big data modeling [38]. In particular, data-driven multivariate statistical methods have attracted increasing attention and have been successfully applied to fault diagnosis of continuous annealing processes [54–58]. PCA and PLS algorithms are used to model correlations in historical process data, e.g., speeds and currents of carrying rolls and measured tensions, by projecting the original data onto a few reduced-dimensional subspaces. The correlations are then used to establish the corresponding statistical indices for process monitoring, after which contribution plots, multi-block contribution and reconstruction methods are used for fault diagnosis [54–58]. For example, the work in [54, 57] applied PCA to diagnose tension-relevant faults, specifically abnormal tension conditions, strip-break and roll-slippage faults, using the measured speeds and currents of carrying rolls. While PCA-based methods only monitor faults that appear in the speeds and currents, PLS-based methods also include the measured tensions when building the latent structure model, paying attention to tension-relevant faults. For example, the work of [58] applied the concurrent PLS (CPLS) proposed in [33] to give a more comprehensive decomposition of the data space for tension-relevant fault monitoring. The advantage of [58] comes from the fact that CPLS can decompose the data much more clearly than the traditional PLS-based method in [9]. The CPLS model further divides the oblique subspaces of PLS and incorporates two unnecessary subspaces of total PLS [32] to generate tension-process covariations, process-specific variations and tension-specific variations.
Monitoring methods that use process data (e.g., currents, speeds and tensions) while leaving quality data (e.g., strip thickness) unused cannot tell whether a detected fault is relevant to product quality. For example, abnormal tensions may not necessarily lead to unexpected strip thickness. From this point of view, strip-thickness relevant fault diagnosis of CAPs remains unsolved. Meanwhile, a great effort has been made to diagnose strip-thickness relevant faults in hot rolling and rolling mill processes in steel manufacturing [53, 59–62]. For example, taking process nonlinearities into consideration, a total kernel PLS (TKPLS) based strip-thickness relevant fault monitoring method was proposed in [53]. The TKPLS method incorporates the kernel technique into the total PLS algorithm to give a clearer decomposition than kernel PLS [63] for nonlinear processes. Using the TKPLS model, a contribution plot method was proposed in [59] to diagnose strip-thickness relevant faults. In addition, the work of [60] proposed a residual-based nonlinear rolling mill fault detection method using a new genetic variant of Box-Cox models and Takagi-Sugeno fuzzy models. In the work of [62], PLS was combined with a fuzzy modeling approach to achieve a nonlinear version of PLS without a large kernel Gram matrix, whereas fault isolation was achieved based on gradient-based contributions in the system identification models. However, none of the above works can capture the dynamic correlations and auto-correlations in the processes, which makes them unsuitable for
When the process and quality variables exhibit auto-correlations or dy- namic relations, the normal control region defined by the static principal components that are themselves auto-correlated is usually too large and tends to have a high rate of missed detections. Dynamic extensions are straightforward to include a number of lagged variables of process data X and quality data Y when deriving the latent structures from the data. For example, the dynamic version of PCA in [64] modeled the lagged variables. Reference [65] modeled the lagged variables of both X and Y. Some other work which did not form augmented matrices of lagged data were also discussed in [66] and [67]. The existing dynamic modeling methods mainly focus on deriving the dynamic relations between X and Y, rather than providing a statistical model for process monitoring. Moreover, the lagged values in the above methods lead to matrices with a large number of columns, which complicates contribution analysis. Multi-block partition of the lagged data becomes an effective manner to enhance fault isolation [68, 69]. The work of [70] proposed a dynamic CPLS monitoring method using lagged process data and quality data, and the corresponding multi- block dynamic CPLS fault isolation method was also developed, without 104 capturing the nonlinear relations among the measured variables. Taking the process nonlinearities into consideration, the recent work of [61] dealt with the integration of time lags into the non-linear PLS. The non-linear finite impulse response model was combined with PLS while the residuals regarding to various forms of error feedbacks were dynamically analyzed. In the above work on quality-relevant fault monitoring, PLS is the backbone algorithm that focuses on maximizing covariances between qual- ity data and process data. The objective of PLS is to extract two score vectors with maximum covariance between X and Y. 
From this objective, PLS may extract latent variables of X which have large magnitudes of variations but not necessarily highly correlated to the quality. However, this objective that maximizes the extracted covariations is not necessarily efficient for qual- ity prediction, since it usually requires many latent dimensions to model a single output. CCA, in contrast, can overcome this drawback as it focuses on maximizing correlation between quality data and process data but ig- nores variance information. The concurrent CCA (CCCA) is thus proposed in Chapter 2 to retain the CCA’s efficiency in predicting the quality while exploiting the variance structure for process monitoring using subsequent principal component decompositions in the process and quality spaces, re- spectively. In this chapter, a dynamic nonlinear version of concurrent CCA is 105 proposed and applied for strip-thickness relevant fault diagnosis consider- ing the nonlinearities and dynamics of the CAP . The new dynamic concur- rent kernel CCA (DCKCCA) modeling and monitoring method deals with the following three aspects: 1. Lagged values of process and quality variables are used to generate augmented data matrices to provide an easy way to model dynamic correlations and auto-relations; 2. The kernel technique derived from kernel CCA is used to model non- linearities; 3. Concurrent decomposition of the dynamic kernel CCA residuals in both process and quality variables is used to achieve comprehensive monitoring. In addition, to deal with the large dimensions induced by the dy- namic augmentation of the data matrices, DCKCCA is extended to multi- block DCKCCA while each original variable and the corresponding lagged values are incorporated into a single block to generate a new contribution to evaluate the fault effects on each variable. 
Figure 5.1: Layout of the Continuous Annealing Process (BF: bending force; FT: furnace temperature; ST: strip temperature; TM: tension meter)

5.2 Process Description and Strip-Thickness Relevant Fault Description

5.2.1 Process Description

The physical layout of the CAP is shown in Figure 5.1. The raw material, i.e., cold-rolling steel strip, is annealed in the furnace zone, where temperature and tension control are realized as the strip goes through each section. Thereafter, the temper mill is used to improve the flatness and thickness of the strip.

The manufacturing line is a typical multivariate dynamic nonlinear process. The process and quality variables are dynamically cross-correlated and auto-correlated because of the inertia of the carrying rolls, the thermal inertia of the strip, and the closed-loop control of the measured tensions and temperatures. Additionally, the process variables are nonlinearly correlated due to the effects of bearing eccentricity of the carrying rolls, the complex coupling of the strip and carrying rolls, nonlinear terms in the tension model, etc.

5.2.2 Strip-Thickness Relevant Fault Description

As an important quality index, strip thickness is mainly affected by 17 distributed strip tensions, 21 furnace temperatures, 8 strip temperatures, 2 roll forces, and 6 speeds and currents of the temper mill. These are grouped into the process variables, while strip thickness is considered the quality variable.

It is desirable to monitor and diagnose strip-thickness specific and process specific abnormalities for the dynamic nonlinear continuous annealing process. To achieve this, a dynamic concurrent kernel CCA modeling and monitoring method is proposed in Section 5.3, and the corresponding fault diagnosis method is proposed in Section 5.4.
5.3 DCKCCA for Process Modeling and Monitoring

5.3.1 Concurrent Kernel CCA Model

As described in Chapter 2, while CCA is efficient in extracting latent scores that are most correlated with the output Y, the unexploited residuals can have large variances, which should be modeled for monitoring purposes. To overcome this drawback of CCA, which focuses on correlation but ignores variance information, a concurrent CCA method was proposed to exclude the quality-irrelevant part of T to obtain the process-quality covariations, and to further combine the quality-irrelevant part with the large variations in the residuals to obtain the process-specific variations. However, both CCA and CCCA are linear modeling methods, which cannot capture nonlinear relations. To overcome this limitation, a concurrent kernel CCA algorithm is proposed in this subsection, in a similar way to CKPLS in [50, 71], for nonlinear process monitoring. It is noted that, unlike the CKCCA algorithm proposed in Chapter 3, the CKCCA developed in this chapter implicitly maps only the process data onto a high-dimensional feature space while leaving the quality data unmapped, which makes it more meaningful for quality prediction. For differentiation, we refer to it as the simplified CKCCA.

Instead of decomposing the original process data X and quality data Y as in CCCA, the simplified CKCCA applies the kernel technique to map the original process data to a high-dimensional feature space $\Phi$. After that, a linear CCA model is built between $\Phi$ and Y to describe the nonlinear relations between the original X and Y. The high-dimensional feature space is generated by a nonlinear mapping $\phi: \mathbf{x}_i \in \mathbb{R}^m \rightarrow \phi(\mathbf{x}_i)$, $i = 1, 2, \ldots, N$. Although $\phi$ cannot be described explicitly, the kernel matrix $\mathbf{K}$ can be used to represent $\Phi\Phi^\top$, with each element $K_{i_1 i_2} = K(\mathbf{x}_{i_1}, \mathbf{x}_{i_2}) = \langle \phi(\mathbf{x}_{i_1}), \phi(\mathbf{x}_{i_2}) \rangle$, $i_1, i_2 = 1, 2, \ldots, N$, where $K(\cdot,\cdot)$ is the kernel function.
In this work, the Gaussian kernel is adopted:

$K(\mathbf{x}_{i_1}, \mathbf{x}_{i_2}) = \exp\left(-\dfrac{\|\mathbf{x}_{i_1} - \mathbf{x}_{i_2}\|^2}{c}\right)$

where $c$ is the width of the Gaussian function. After that, a concurrent CCA decomposition is performed on the high-dimensional feature space $\Phi$ and the quality data space Y.

In order to eliminate the mean effect in the high-dimensional feature space, the training data and test data are first preprocessed as follows:

$\mathbf{K} = \left(\mathbf{I}_N - \frac{1}{N}\mathbf{1}_N \mathbf{1}_N^\top\right) \mathbf{K}_{raw} \left(\mathbf{I}_N - \frac{1}{N}\mathbf{1}_N \mathbf{1}_N^\top\right)$  (5.2)

$\mathbf{K}_{new} = \left(\mathbf{K}_{new}^{raw} - \frac{1}{N}\mathbf{1}_L \mathbf{1}_N^\top \mathbf{K}_{raw}\right)\left(\mathbf{I}_N - \frac{1}{N}\mathbf{1}_N \mathbf{1}_N^\top\right)$  (5.3)

Algorithm 6  The Simplified Concurrent Kernel CCA Algorithm

1. Perform the simplified kernel CCA in Appendix D on X and Y with $l$ factors to obtain T, Q, and $\hat{\mathbf{Y}} = \mathbf{T}\mathbf{Q}^\top$.

2. Perform singular value decomposition on $\hat{\mathbf{Y}}$: $\hat{\mathbf{Y}} = \mathbf{T}_c \mathbf{D}_c^\top \mathbf{V}_c^\top = \mathbf{T}_c \mathbf{Q}_c^\top$, where $\mathbf{Q}_c = \mathbf{V}_c \mathbf{D}_c^\top$ includes all $l_c$ nonzero singular values in descending order and the corresponding right singular vectors.

3. Perform PCA with $l_y$ components on $\tilde{\mathbf{Y}}_c = \mathbf{Y} - \hat{\mathbf{Y}}$: $\tilde{\mathbf{Y}}_c = \mathbf{T}_y \mathbf{P}_y^\top + \tilde{\mathbf{Y}}$.

4. Perform PCA with $l_x$ components on $\tilde{\Phi}_c = \Phi - \mathbf{T}_c \mathbf{R}_c^{\dagger}$: $\tilde{\Phi}_c = \mathbf{T}_x \mathbf{P}_x^\top + \tilde{\Phi}$, where $\tilde{\Phi}$ is the process residual. $\mathbf{P}_x = \tilde{\Phi}_c^\top \mathbf{W}_x$ and $\mathbf{T}_x = \mathbf{K}_c \mathbf{W}_x$, with $\mathbf{K}_c = \tilde{\Phi}_c \tilde{\Phi}_c^\top$ a known matrix. $\mathbf{W}_x$ contains the scaled eigenvectors of $(1/N)\tilde{\Phi}_c \tilde{\Phi}_c^\top$ corresponding to its $l_x$ largest eigenvalues. $\mathbf{R}_c = \mathbf{R}\mathbf{Q}^\top \mathbf{V}_c \mathbf{D}_c^{-1}$, and $\mathbf{R}_c^{\dagger} = (\mathbf{R}_c^\top \mathbf{R}_c)^{-1}\mathbf{R}_c^\top$ is the pseudo-inverse of $\mathbf{R}_c$. $\mathbf{R}_c^{\dagger}$ can also be written as $\mathbf{R}_c^{\dagger} = \Gamma\Phi$, where

$\Gamma = \left(\mathbf{D}_c^\top \mathbf{V}_c^\top \mathbf{Q} (\mathbf{T}^\top \mathbf{K}\mathbf{U})^\top \mathbf{U}^\top \mathbf{K}\mathbf{U} (\mathbf{T}^\top \mathbf{K}\mathbf{U})^{-1} \mathbf{Q}^\top \mathbf{V}_c \mathbf{D}_c^{-1}\right)^{-1} \mathbf{D}_c^\top \mathbf{V}_c^\top \mathbf{Q} (\mathbf{T}^\top \mathbf{K}\mathbf{U})^\top \mathbf{U}^\top$  (5.1)

with U a matrix computed from the iteration algorithm of kernel CCA.

Here $\mathbf{K}_{raw}$ is the Gram matrix, $\mathbf{K}_{new}^{raw}$ is the directly mapped kernel matrix for new test samples, $\mathbf{K}$ is the centered kernel matrix for the modeling samples, and $\mathbf{K}_{new}$ is the centered kernel matrix for new test samples. $\mathbf{I}_N \in \mathbb{R}^{N\times N}$ is the $N$-dimensional identity matrix, $\mathbf{1}_N$ and $\mathbf{1}_L$ are all-ones vectors of length $N$ and $L$ respectively, and $L$ is the number of test samples.
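The Gaussian kernel and the feature-space centering of Eqs. (5.2)-(5.3) can be sketched in a few lines of NumPy. This is a minimal illustration with arbitrary toy data and an assumed kernel width `c`; it is not the distributed implementation discussed later for the CAP data.

```python
import numpy as np

def gaussian_kernel(X1, X2, c):
    """Gaussian kernel matrix: K[i, j] = exp(-||x1_i - x2_j||^2 / c)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / c)

def center_kernel(K_raw, K_new_raw=None):
    """Feature-space mean centering, following Eqs. (5.2)-(5.3)."""
    N = K_raw.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N            # I_N - (1/N) 1_N 1_N^T
    K = J @ K_raw @ J                              # Eq. (5.2): training kernel
    if K_new_raw is None:
        return K
    L = K_new_raw.shape[0]
    K_new = (K_new_raw - np.ones((L, N)) / N @ K_raw) @ J   # Eq. (5.3): test kernel
    return K, K_new

# Toy data standing in for process measurements (hypothetical values)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 5))
X_test = rng.normal(size=(4, 5))
c = 10.0
K, K_new = center_kernel(gaussian_kernel(X_train, X_train, c),
                         gaussian_kernel(X_test, X_train, c))
```

After centering, the rows and columns of the training kernel sum to zero, which is exactly the mean-removal effect Eqs. (5.2)-(5.3) are meant to achieve.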
The simplified CKCCA decomposes $\Phi$ and Y as follows; the detailed algorithm is given in Algorithm 6.

$\Phi = \mathbf{T}_c \mathbf{R}_c^{\dagger} + \mathbf{T}_x \mathbf{P}_x^\top + \tilde{\Phi}$
$\mathbf{Y} = \mathbf{T}_c \mathbf{Q}_c^\top + \mathbf{T}_y \mathbf{P}_y^\top + \tilde{\mathbf{Y}}$  (5.4)

where $\mathbf{T}_c$ represents the covariations in $\Phi$ related to Y, $\mathbf{T}_x$ represents the variations in $\Phi$ that are useless for predicting Y, and $\mathbf{T}_y$ represents the variations in Y that are unpredictable from $\Phi$.

5.3.2 Dynamic Concurrent Kernel CCA Model

For a dynamic process such as the CAP, with the effects of dynamic units and closed-loop control, samples measured at different times $k$ are not independent: they can be both auto-correlated and cross-correlated. Thus there is a need to capture the dynamic relations between X and Y. It is assumed that the dynamic orders are known as prior knowledge, i.e., $a$ for the process variables and $b$ for the quality variables, and the lagged data matrices of both quality and process data are formed accordingly.

Owing to the complexity of the kernel computation and the memory requirement of the kernel matrix, two computational considerations are taken into account in the proposed DCKCCA: (i) the incomplete Cholesky decomposition and the iteration algorithm are employed in the simplified KCCA algorithm, as shown in Appendix D; and (ii) distributed computation and storage are applied to compute the kernel matrix and the related matrices in the proposed DCKCCA algorithm.

The proposed dynamic concurrent kernel CCA modeling algorithm is summarized as follows.

Step 1. Collect historical process and quality data, i.e., X and Y:

$\mathbf{X} = \begin{bmatrix} x_1(1) & x_2(1) & \cdots & x_m(1) \\ x_1(2) & x_2(2) & \cdots & x_m(2) \\ \vdots & \vdots & \ddots & \vdots \\ x_1(N) & x_2(N) & \cdots & x_m(N) \end{bmatrix}_{N\times m} \qquad \mathbf{Y} = \begin{bmatrix} y_1(1) & y_2(1) & \cdots & y_p(1) \\ y_1(2) & y_2(2) & \cdots & y_p(2) \\ \vdots & \vdots & \ddots & \vdots \\ y_1(N) & y_2(N) & \cdots & y_p(N) \end{bmatrix}_{N\times p}$

Step 2. Construct the augmented data matrices $\mathbf{X}^g$ and $\mathbf{Y}^g$:

$\mathbf{X}^g = [\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_m] \qquad \mathbf{Y}^g = [\mathbf{Y}_1, \mathbf{Y}_2, \ldots, \mathbf{Y}_p]$

where $m$ and $p$ are the numbers of process variables and quality variables, respectively.
$\mathbf{X}_i = \begin{bmatrix} x_i(k) & x_i(k-1) & \cdots & x_i(k-a+1) \\ x_i(k+1) & x_i(k) & \cdots & x_i(k-a+2) \\ \vdots & \vdots & \ddots & \vdots \\ x_i(k+N-1) & x_i(k+N-2) & \cdots & x_i(k-a+N) \end{bmatrix}, \quad i = 1, 2, \ldots, m$

$\mathbf{Y}_j = \begin{bmatrix} y_j(k) & y_j(k+1) & \cdots & y_j(k+b-1) \\ y_j(k+1) & y_j(k+2) & \cdots & y_j(k+b) \\ \vdots & \vdots & \ddots & \vdots \\ y_j(k+N-1) & y_j(k+N) & \cdots & y_j(k+b+N-2) \end{bmatrix}, \quad j = 1, 2, \ldots, p$

Step 3. Perform the simplified CKCCA of Section 5.3.1 on the augmented data matrices $(\mathbf{X}^g, \mathbf{Y}^g)$. The resulting dynamic CKCCA (DCKCCA) decomposition can be represented as

$\Phi^g = \mathbf{T}_c \mathbf{R}_c^{\dagger} + \mathbf{T}_x \mathbf{P}_x^\top + \tilde{\Phi}^g$
$\mathbf{Y}^g = \mathbf{T}_c \mathbf{Q}_c^\top + \mathbf{T}_y \mathbf{P}_y^\top + \tilde{\mathbf{Y}}^g$  (5.5)

where $\Phi^g$ is the feature data for the dynamic process data $\mathbf{X}^g$, $\mathbf{T}_c$ represents the covariations in $\Phi^g$ that are related to Y, $\mathbf{T}_x$ represents the variations in $\Phi^g$ that are useless for predicting Y, $\mathbf{T}_y$ represents the variations in Y that are unpredictable from $\Phi^g$, and $\tilde{\Phi}^g$ and $\tilde{\mathbf{Y}}^g$ are the residuals of the process data and quality data, respectively. It is noted that the dynamic feature data $\Phi^g$ and quality data $\mathbf{Y}^g$ are used for DCKCCA, instead of $(\Phi, \mathbf{Y})$ for the simplified CKCCA. The kernel matrix $\mathbf{K}^g$ for the dynamic feature data $\Phi^g$ is defined as $\mathbf{K}^g = \Phi^g \Phi^{g\top}$, with each element $K^g_{ij} = K(\mathbf{x}^g_i, \mathbf{x}^g_j) = \langle \phi(\mathbf{x}^g_i), \phi(\mathbf{x}^g_j) \rangle$.

Step 4. The five dynamic latent structures, i.e., the process-specific subspace $\mathbf{T}_x$, the quality-specific subspace $\mathbf{T}_y$, the process-quality covariation $\mathbf{T}_c$, the potentially process-relevant subspace $\tilde{\Phi}^g$, and the potentially quality-relevant subspace $\tilde{\mathbf{Y}}^g$, are built similarly to those of the simplified CKCCA. Different from the five subspaces defined for the simplified CKCCA in Eq. (5.4), those of DCKCCA can exhibit dynamic cross-correlations and auto-correlations between $\Phi^g$ and $\mathbf{Y}^g$ because of the data expansion with lagged values.

5.3.3 DCKCCA-based Monitoring

To monitor the five latent subspaces, the corresponding statistics and control limits are computed and compared accordingly.
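As a brief aside, the block-wise lagged augmentation used in Step 2 of the modeling algorithm can be sketched in NumPy as follows. The function names, toy dimensions, and lag orders are illustrative only; they are not part of the original algorithm description.

```python
import numpy as np

def augment_process(X, a):
    """Block-wise lagged process matrix X_g (Step 2).
    Block i holds [x_i(k), x_i(k-1), ..., x_i(k-a+1)] per row."""
    T, m = X.shape
    blocks = []
    for i in range(m):
        # current value first, then increasingly delayed values
        cols = [X[a - 1 - j : T - j, i] for j in range(a)]
        blocks.append(np.column_stack(cols))
    return np.hstack(blocks)              # shape (T - a + 1, m * a)

def augment_quality(Y, b):
    """Block-wise lead matrix Y_g: [y_j(k), y_j(k+1), ..., y_j(k+b-1)]."""
    T, p = Y.shape
    blocks = []
    for j in range(p):
        cols = [Y[i : T - b + 1 + i, j] for i in range(b)]
        blocks.append(np.column_stack(cols))
    return np.hstack(blocks)              # shape (T - b + 1, p * b)

# Toy example (hypothetical dimensions: 6 samples, m = 2 variables, a = 3)
X = np.arange(12, dtype=float).reshape(6, 2)
Xg = augment_process(X, a=3)              # shape (4, 6)
```

Each block of `Xg` keeps one variable together with its own lags, which is exactly the partition the multi-block diagnosis in Section 5.4 relies on.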
The detailed DCKCCA-based monitoring scheme is described as follows.

Step 1. Collect the monitored process and quality samples, i.e., $\mathbf{x}$ and $\mathbf{y}$. To project the monitored samples onto the subspaces of Section 5.3.2, the original samples are augmented in the same way as $\mathbf{X}^g$ and $\mathbf{Y}^g$ are obtained from X and Y. The augmented samples $\mathbf{x}^g$ and $\mathbf{y}^g$ are constructed from $\mathbf{x}$ and $\mathbf{y}$ as

$\mathbf{x}^g = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m]^\top \qquad \mathbf{y}^g = [\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_p]^\top$

where

$\mathbf{x}_i = [x_i(k),\ x_i(k-1),\ \ldots,\ x_i(k-a+1)]$
$\mathbf{y}_j = [y_j(k),\ y_j(k+1),\ \ldots,\ y_j(k+b-1)]$

Step 2. The DCKCCA scores and residuals are calculated for a single sample as

$\mathbf{t}_c = \mathbf{R}_c^\top \phi(\mathbf{x}^g) = \mathbf{D}_c^\top \mathbf{V}_c^\top \mathbf{Q}(\mathbf{T}^\top \mathbf{K}^g \mathbf{U})^\top \mathbf{U}^\top \mathbf{K}_{\mathbf{x}^g}$  (5.6)

$\mathbf{t}_x = \mathbf{P}_x^\top \tilde{\phi}_c(\mathbf{x}^g) = \mathbf{W}_x^\top (\mathbf{I} - \mathbf{T}_c \Gamma)\left(\mathbf{K}_{\mathbf{x}^g} - \mathbf{K}^g \Gamma^\top \mathbf{t}_c\right)$  (5.7)

$\mathbf{t}_y = \mathbf{P}_y^\top \tilde{\mathbf{y}}_c = \mathbf{P}_y^\top (\mathbf{y}^g - \mathbf{Q}_c \mathbf{t}_c)$  (5.8)

and

$\tilde{\mathbf{y}}_c = \mathbf{y}^g - \mathbf{Q}_c \mathbf{R}_c^\top \phi(\mathbf{x}^g) = \mathbf{y}^g - \mathbf{Q}_c \mathbf{t}_c$  (5.9)

$\tilde{\phi}(\mathbf{x}^g) = \tilde{\phi}_c(\mathbf{x}^g) - \mathbf{P}_x \mathbf{t}_x$  (5.10)

$\tilde{\mathbf{y}}^g = (\mathbf{I} - \mathbf{P}_y \mathbf{P}_y^\top)\tilde{\mathbf{y}}_c$  (5.11)

In the above expressions, $\Phi^g$ and $\phi(\mathbf{x}^g)$ cannot be calculated explicitly, but the calculation of $\mathbf{t}_c$, $\mathbf{t}_x$, and $\mathbf{t}_y$ can be achieved by replacing $\Phi^g \Phi^{g\top}$ and $\Phi^g \phi(\mathbf{x}^g)$ with $\mathbf{K}^g$ and $\mathbf{K}_{\mathbf{x}^g}$, respectively, where $\mathbf{K}_{\mathbf{x}^g}$ is a vector composed of a column of $\mathbf{K}^g$ for modeling samples or of $\mathbf{K}^g_{new}$ for test samples.

Step 3. The fault detection statistics for the five subspaces in DCKCCA can be computed: three $T^2$ statistics for $\mathbf{t}_c$, $\mathbf{t}_x$, and $\mathbf{t}_y$, i.e., $T_c^2$, $T_x^2$, and $T_y^2$, and two $Q$ statistics for $\tilde{\phi}(\mathbf{x}^g)$ and $\tilde{\mathbf{y}}^g$, i.e., $Q_x$ and $Q_y$:

$T_c^2 = \mathbf{t}_c^\top \Lambda_c^{-1} \mathbf{t}_c$  (5.12)
$T_x^2 = \mathbf{t}_x^\top \Lambda_x^{-1} \mathbf{t}_x$  (5.13)
$T_y^2 = \mathbf{t}_y^\top \Lambda_y^{-1} \mathbf{t}_y$  (5.14)

$Q_x = \|\tilde{\phi}(\mathbf{x}^g)\|^2 = \|\tilde{\phi}_c(\mathbf{x}^g) - \mathbf{P}_x \mathbf{t}_x\|^2 = \tilde{\phi}_c(\mathbf{x}^g)^\top \tilde{\phi}_c(\mathbf{x}^g) - 2\tilde{\phi}_c(\mathbf{x}^g)^\top \mathbf{P}_x \mathbf{t}_x + \mathbf{t}_x^\top \mathbf{P}_x^\top \mathbf{P}_x \mathbf{t}_x$  (5.15)

$Q_y = \|\tilde{\mathbf{y}}^g\|^2 = \tilde{\mathbf{y}}_c^\top (\mathbf{I} - \mathbf{P}_y \mathbf{P}_y^\top)\tilde{\mathbf{y}}_c = (\mathbf{y}^g - \mathbf{Q}_c \mathbf{t}_c)^\top (\mathbf{I} - \mathbf{P}_y \mathbf{P}_y^\top)(\mathbf{y}^g - \mathbf{Q}_c \mathbf{t}_c)$  (5.16)

Although $\mathbf{P}_x$ and $\tilde{\phi}_c(\mathbf{x}^g)$ in Eq.
(5.15) cannot be described explicitly, the statistic $Q_x$ can be rewritten using the kernel trick as

$Q_x = 1 - 2\mathbf{t}_c^\top \Gamma \mathbf{K}_{\mathbf{x}^g} + \mathbf{t}_c^\top \Gamma \mathbf{K}^g \Gamma^\top \mathbf{t}_c - 2\left(\mathbf{K}_{\mathbf{x}^g}^\top - \mathbf{t}_c^\top \Gamma \mathbf{K}^g\right)(\mathbf{I} - \mathbf{T}_c \Gamma)^\top \mathbf{W}_x \mathbf{t}_x + \mathbf{t}_x^\top \mathbf{W}_x^\top (\mathbf{I} - \mathbf{T}_c \Gamma)\mathbf{K}^g (\mathbf{I} - \mathbf{T}_c \Gamma)^\top \mathbf{W}_x \mathbf{t}_x$  (5.17)

where the leading 1 is $K(\mathbf{x}^g, \mathbf{x}^g)$ for the Gaussian kernel.

Step 4. The control limits for the five statistics are computed. Considering that variables following a Gaussian distribution can become non-Gaussian after the nonlinear mapping, the control limits are computed similarly to those in [59], with a nonparametric kernel density estimation approach using the results of [72].

Step 5. The five statistics defined in Step 3 are compared with the corresponding control limits computed in Step 4 to achieve comprehensive process monitoring in the X-specific, Y-specific, X-Y correlated, potentially X-relevant, and potentially Y-relevant subspaces, respectively. As $T_y^2$ and $Q_y$ are both quality relevant, the two statistics can be incorporated into a combined index $\varphi_y$ after being normalized by their corresponding control limits, similar to the combined index of CPLS.

It is noted that the above algorithm gives complete monitoring of abnormal dynamic variations, but a fault detection delay is inevitable, and its length depends on the time-delay orders of the augmented data matrix. This is a trade-off between the monitoring improvement and the time delay. Furthermore, one should keep in mind that a single abnormal sample flagged by a statistic denotes an abnormal dynamic variation over a period of time.

5.4 Multi-Block DCKCCA for Fault Diagnosis

After an abnormality is detected, it is often necessary to pinpoint the variables that are related to the abnormality and to interpret the monitoring results further.
Although one can derive contribution plots for DCKCCA-based fault diagnosis similar to those for TKPLS in [59], such plots display many variable contributions, because the dimension of the augmented process data matrix may be many times (4-50) larger than the number of original process variables. To overcome this problem, the following multi-block DCKCCA is proposed based on the multi-block partition of the augmented data matrices $(\mathbf{X}^g, \mathbf{Y}^g)$.

Step 1. Rearrange the elements in $\mathbf{x}^g$ so that each variable and its lagged values form a single block sample vector $\mathbf{x}_i$.

Step 2. As practitioners are often concerned with quality-relevant faults and process-specific faults, contribution rates for $T_c^2$ and $T_x^2$ are proposed to pinpoint the variables relevant to each. The contribution rates for the original variables and the augmented variables with lagged values are computed in a similar way to the work of [59].

(a) The variable contributions of sample $\mathbf{x}^g$ to $T_c^2$ are defined as

$VC_{T_c^2}(\mathbf{x}^g, r) = \left.\dfrac{\partial T_c^2(\mathbf{x}^g \circ \mathbf{v})}{\partial v_r}\right|_{\mathbf{v}=\mathbf{1}_{ma}} = \mathrm{trace}\left(\left.\dfrac{\partial (\mathbf{K}_{\mathbf{x}^g}\mathbf{K}_{\mathbf{x}^g}^\top)}{\partial v_r}\right|_{\mathbf{v}=\mathbf{1}_{ma}} \Theta_{T_c^2}\right)$  (5.18)

where $r = 1, 2, \ldots, ma$, and $\mathbf{x}^g \circ \mathbf{v}$ denotes the vector $[x_1(k)v_1, \ldots, x_1(k-a+1)v_a, \ldots, x_m(k)v_{(m-1)a+1}, \ldots, x_m(k-a+1)v_{ma}]^\top$, with $\mathbf{v} = [v_1, \ldots, v_{ma}]^\top$ and $v_r > 0$. $\Theta_{T_c^2} = \Psi_1^\top \Lambda_c^{-1} \Psi_1$, with $\Psi_1$ denoted as

$\Psi_1 = \mathbf{D}_c^\top \mathbf{V}_c^\top \mathbf{Q}(\mathbf{T}^\top \mathbf{K}^g \mathbf{U})^\top \mathbf{U}^\top$  (5.19)

(b) The variable contributions of sample $\mathbf{x}^g$ to $T_x^2$ are defined as

$VC_{T_x^2}(\mathbf{x}^g, r) = \left.\dfrac{\partial T_x^2(\mathbf{x}^g \circ \mathbf{v})}{\partial v_r}\right|_{\mathbf{v}=\mathbf{1}_{ma}} = \mathrm{trace}\left(\left.\dfrac{\partial (\mathbf{K}_{\mathbf{x}^g}\mathbf{K}_{\mathbf{x}^g}^\top)}{\partial v_r}\right|_{\mathbf{v}=\mathbf{1}_{ma}} \Theta_{T_x^2}\right)$  (5.20)

where $\Theta_{T_x^2} = \Psi_2^\top \Lambda_x^{-1} \Psi_2$, with $\Psi_2$ denoted as

$\Psi_2 = \mathbf{W}_x^\top (\mathbf{I} - \mathbf{T}_c \Gamma)\left(\mathbf{I} - \mathbf{K}^g \Gamma^\top \Psi_1\right)$  (5.21)

$VC_{T_c^2}(\mathbf{x}^g, r)$ and $VC_{T_x^2}(\mathbf{x}^g, r)$ in the above expressions denote the variable contribution gradients to $T_c^2$ and $T_x^2$, respectively. Each element of $\partial (\mathbf{K}_{\mathbf{x}^g}\mathbf{K}_{\mathbf{x}^g}^\top)/\partial v_r$ can be obtained as follows [73].
$\left.\dfrac{\partial (\mathbf{K}_{\mathbf{x}^g}\mathbf{K}_{\mathbf{x}^g}^\top)_{p,q}}{\partial v_r}\right|_{\mathbf{v}=\mathbf{1}_{ma}} = -\dfrac{2}{c}\left[K_{\mathbf{x}^g}(p) K_{new}^{raw}(q)\, x_r^g \left(x_r^g(q) - x_r^g\right) + K_{\mathbf{x}^g}(q) K_{new}^{raw}(p)\, x_r^g \left(x_r^g(p) - x_r^g\right)\right] + \dfrac{1}{nc}\left(K_{\mathbf{x}^g}(p) + K_{\mathbf{x}^g}(q)\right) x_r^g \sum_{k=1}^{n}\left[\left(x_r^g(k) - x_r^g\right) K_{new}^{raw}(k)\right], \quad r = 1, 2, \ldots, ma$  (5.22)

where $K_{\mathbf{x}^g}(s)$ is the $s$th element of $\mathbf{K}_{\mathbf{x}^g}$, $K_{new}^{raw}(s)$ is the $s$th element of $\mathbf{K}_{new}^{raw}$, $x_r^g$ denotes the $r$th element of the augmented data vector at the current time, and $x_r^g(k)$ denotes the corresponding element of the augmented data vector at time $k$.

It is noted that, under normal conditions, the contributions of the lagged values of each variable differ from one another. It is therefore meaningful to normalize the original contributions so that the contributions of all lagged values of the same variable are additive. The relative variable contributions are defined as

$rVC_{T_c^2}(\mathbf{x}^g, r) = VC_{T_c^2}(\mathbf{x}^g, r)/mvc_{T_c^2}(r)$  (5.23)
$rVC_{T_x^2}(\mathbf{x}^g, r) = VC_{T_x^2}(\mathbf{x}^g, r)/mvc_{T_x^2}(r)$  (5.24)

where $mvc_{T_c^2}(r) = (1/n)\sum_{k=1}^{n} VC_{T_c^2}(\mathbf{x}^g(k), r)$ and $mvc_{T_x^2}(r) = (1/n)\sum_{k=1}^{n} VC_{T_x^2}(\mathbf{x}^g(k), r)$.

Step 3. Block statistics for the two indices are defined according to the variable partition of the augmented data with lagged values.

(a) The block contributions $BC_{T_c^2}(\mathbf{x}^g, i)$ for $T_c^2$ are defined as

$BC_{T_c^2}(\mathbf{x}^g, i) = \sum_{r=1+(i-1)a}^{ia} rVC_{T_c^2}(\mathbf{x}^g, r)$  (5.25)

(b) The block contributions $BC_{T_x^2}(\mathbf{x}^g, i)$ for $T_x^2$ are defined as

$BC_{T_x^2}(\mathbf{x}^g, i) = \sum_{r=1+(i-1)a}^{ia} rVC_{T_x^2}(\mathbf{x}^g, r)$  (5.26)

Step 4. The block contributions are used to analyze the fault effects on the auto-correlations and cross-correlations of each variable in the process and quality spaces concurrently.

The flow chart of the DCKCCA-based quality-relevant fault diagnosis method is summarized in Figure 5.2.

5.5 Continuous Annealing Process Case Studies

This section applies the proposed DCKCCA and multi-block DCKCCA methods to strip-thickness relevant monitoring and diagnosis of the cold-rolling continuous annealing process.
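Before turning to the case study, the normalization and block aggregation of Eqs. (5.23)-(5.26) can be sketched as below. The contribution values are hypothetical stand-ins for the kernel-gradient computation of Eq. (5.22); only the aggregation logic is illustrated.

```python
import numpy as np

def block_contributions(vc, vc_normal, a):
    """Relative and block contributions, following Eqs. (5.23)-(5.26).
    vc        : raw contributions of one augmented sample, length m*a
    vc_normal : raw contributions of n normal samples, shape (n, m*a)
    a         : lag order, so block i covers columns (i-1)*a ... i*a - 1
    """
    mvc = vc_normal.mean(axis=0)          # mvc(r) = (1/n) sum_k VC(x_g(k), r)
    rvc = vc / mvc                        # Eqs. (5.23)-(5.24): relative contributions
    m = vc.size // a
    bc = rvc.reshape(m, a).sum(axis=1)    # Eqs. (5.25)-(5.26): sum within each block
    return rvc, bc

# Toy numbers (hypothetical contributions, m = 2 variables, a = 3 lags)
vc = np.array([2.0, 1.0, 1.0, 4.0, 2.0, 2.0])
vc_normal = np.ones((10, 6))
rvc, bc = block_contributions(vc, vc_normal, a=3)
```

Normalizing by the normal-condition mean makes the lagged contributions of a variable comparable, so summing them per block yields one interpretable value per original variable.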
The studied process section consists of 54 process variables and 1 quality variable, i.e., $m = 54$ and $p = 1$. The original quality variable, i.e., strip thickness, and some contributing process variables are shown in Figure 5.3. The data are collected with a sampling time of 0.1 s and modeled by applying the DCKCCA algorithm described in Section 5.3.2. After that, the fault diagnosis strategies discussed in Sections 5.3.3 and 5.4 are demonstrated for the CAP operating in a normal situation and in two faulty cases that include strip-thickness relevant faults. It is noted that, because the strip thickness is measured by a sensor located at a distance from the stands, the time delay is taken into account for modeling and real-time monitoring, and the process data are aligned accordingly.

Figure 5.2: DCKCCA-based Quality Relevant Fault Diagnosis Flow Chart

Figure 5.3: Quality Variable and Selected Contributing Process Variables

Figure 5.4: Measured Strip-Thickness and Predicted Strip-Thickness (Normal Case); panels: (a) CKPLS, (b) DKPLS, (c) DCKCCA

Figure 5.5: Auto-Correlation Curves of Prediction Errors (Normal Case); panels: (a) CKPLS, (b) DKPLS, (c) DCKCCA
To demonstrate the effectiveness of the proposed method, DCKCCA is compared with other bi-layer nonlinear modeling and monitoring methods, namely concurrent kernel PLS (CKPLS) and dynamic kernel PLS (DKPLS).

Normal case: When the CAP operates in a normal situation, the strip thickness is estimated and compared with the measured value using the CKPLS, DKPLS, and DCKCCA algorithms. The results are shown in Figure 5.4 (a), (b), and (c), respectively. From the three sub-figures, the proposed DCKCCA demonstrates better prediction than CKPLS and DKPLS, owing to the dynamic modeling and the advantage of CCA over PLS in prediction. In addition, from the auto-correlation curves of the prediction errors shown in Figure 5.5 (a), (b), and (c), the auto-correlation peak around lag zero is much sharper for DCKCCA, which indicates that the prediction errors of DCKCCA are mainly noise compared with those of CKPLS and DKPLS. Figures 5.4 and 5.5 together demonstrate that a clear improvement in quality prediction is achieved by the proposed DCKCCA.
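The whiteness check behind Figure 5.5 amounts to computing the sample autocorrelation of the prediction errors: residuals that are mainly noise show coefficients near zero at all nonzero lags. A minimal sketch, with synthetic white noise standing in for the actual residuals:

```python
import numpy as np

def autocorrelation(e, max_lag):
    """Sample autocorrelation of a prediction-error sequence e.
    White residuals give coefficients near zero for all nonzero lags."""
    e = np.asarray(e, dtype=float) - np.mean(e)
    denom = np.dot(e, e)
    return np.array([1.0 if k == 0 else np.dot(e[:-k], e[k:]) / denom
                     for k in range(max_lag + 1)])

# A white-noise error sequence decays immediately after lag 0
rng = np.random.default_rng(1)
acf = autocorrelation(rng.normal(size=5000), max_lag=5)
```

A slowly decaying autocorrelation curve, by contrast, would indicate that predictable dynamics remain in the errors, as observed for CKPLS and DKPLS.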
Faulty case 1: When the CAP operates in a faulty situation (faulty case 1), the quality prediction results using the CKPLS, DKPLS, and DCKCCA approaches are shown in Figure 5.6 (a), (b), and (c), respectively. From the strip thickness prediction results in Figure 5.6, DCKCCA outperforms the CKPLS and DKPLS methods.

Figure 5.6: Measured Strip-Thickness and Predicted Strip-Thickness (Faulty Case 1); panels: (a) CKPLS, (b) DKPLS, (c) DCKCCA

Figure 5.7: CAP Process Monitoring Results (Faulty Case 1); panels: (a) CKPLS, (b) DKPLS, (c) DCKCCA

For monitoring, the five statistics of CKPLS are used, but only the two statistics of DKPLS, because of the oblique decomposition of the quality and process spaces in the DKPLS algorithm. For this application with only one quality variable, $Q_y$ is null, so four statistics are shown in Figure 5.7 (a). From Figure 5.7 (a)-(c), the fault can be detected by CKPLS, DKPLS, and DCKCCA. However, from the DKPLS result in Figure 5.7 (b), one cannot analyze how the fault affects the strip thickness. The CKPLS result in Figure 5.7 (a) gives more comprehensive monitoring, but the fault is flagged as quality relevant only for a few samples of the $T_c^2$ statistic around Sample 320.
In particular, the unexpected strip thickness variations around Sample 390 in Figure 5.6 (a) are not observed by the $T_c^2$ statistic of CKPLS in Figure 5.7 (a), but are detected successfully by the $T_c^2$ statistic of DCKCCA in Figure 5.7 (c). Also, from Figure 5.6 (c), a large portion of the variations after Sample 260 remains in the unpredicted quality. The deviation between the measured and predicted strip thickness in Figure 5.6 (c) represents the quality-specific variations, which cannot be estimated from the process measurements. The CKPLS $\varphi_y$ statistic in Figure 5.7 (a) cannot detect this kind of variation, whereas the $\varphi_y$ statistic of DCKCCA in Figure 5.7 (c) detects it successfully, which demonstrates the advantage of the proposed DCKCCA. For this application, the unexpected thickness variations can be caused by unmeasured process variations, e.g., the gas flow in the furnace zone.

Figure 5.8: CAP Strip-Thickness Relevant Diagnosis Results using Multi-Block DCKCCA (Faulty Case 1)

After that, the multi-block DCKCCA based contribution plot is shown in Figure 5.8. From the $T_c^2$ plot in Figure 5.8, the contributing variable for the faulty sample No. 361 is variable No. 21, i.e., the No. 3 measured tension, whose strip-thickness relevant abnormality cannot be observed from the original data shown in the top sub-figure of Figure 5.9. The abnormalities appearing at variables No. 1, No. 3, and No. 5 for $T_x^2$ in Figure 5.8 can be considered not relevant to the strip thickness. By detailed analyses of
The corresponding process data are displayed in Figure 5.9 from which the changes of the rolling speed are strip thickness irrelevant and can be ignored while monitoring the strip thickness relevant faults. Faulty case 2: When the CAP operates in another faulty situation (faulty case 2), the monitoring results using CKPLS, DKPLS, and DCKCCA approaches are shown in Figure 5.10 (a) - (c), respectively. From the T 2 statistic of DCKCCA in Figure 5.10 (c), the strip-thickness relevant down- 131 0 200 400 0 10 20 T c 2 0 200 400 0 100 200 T x 2 statistic control limit 0 200 400 0 5 10 15 φ y 0 200 400 0 1 2 3 Q x (a) CKPLS 0 50 100 150 200 250 300 350 400 0 10 20 30 T 2 statistic control limit 0 50 100 150 200 250 300 350 400 0 0.2 0.4 Q statistic control limit (b) DKPLS 0 200 400 0 10 20 30 T c 2 0 200 400 200 400 600 800 1000 1200 T x 2 statistic control limit 0 200 400 0 5 10 φ y 0 200 400 0 2 4 6 8 x 10 −3 Q x (c) DCKCCA Figure 5.10: CAP Process Monitoring Results (Faulty Case 2) 132 0 50 100 150 200 250 300 350 400 0.28 0.282 0.284 0.286 0.288 0.29 0.292 0.294 Sample number Strip thickness(mm) Figure 5.11: Measured Strip Thickness (Faulty Case 2) trend fault is successfully detected around Sample 75 and goes back to nor- mal region after Sample 150. The result is consistent with strip thickness curve in Figure 5.11. However, theT 2 c statistic of CKPLS in Figure 5.10 (a) and T 2 statistic of DKPLS in Figure 5.10 (b) still exceed the control limits for Samples 50-140, 210-240, and 301-400, which can be considered as false alarms. Although theT 2 x statistic of DCKCCA in Figure 5.10 (c) exceeds the control limits, the process-specific variations can be considered as normal and be ignored when the practitioners focus on monitoring the strip thick- ness relevant faults. The T 2 c statistic of DCKCCA successfully detects the fault that appears around Sample 75, which demonstrates the advantage of the proposed DCKCCA method. 
5.6 Summary

This chapter proposes a dynamic concurrent kernel CCA modeling and diagnosis method for dynamic nonlinear processes, which is able to monitor and diagnose quality and process variations concurrently. It provides an easy way to achieve comprehensive monitoring and helps localize faults. The advantages of the proposed method are demonstrated with data from a real cold-rolling continuous annealing process. Additionally, multi-block DCKCCA is developed to address the large dimensions induced by the dynamic augmentation of the process and quality matrices. In multi-block DCKCCA, each original variable and its corresponding lagged values are combined into a single block to create a new contribution that evaluates the fault effects on each variable. Its effectiveness is also demonstrated on the CAP.

Chapter 6 Conclusions and Future Directions

6.1 Conclusions

This thesis presents a novel monitoring and diagnosis framework based on canonical correlation analysis. Different from partial least squares, CCA focuses on the multivariate correlations between the process variables X and the quality variables Y, and the latent variables extracted by CCA are directly relevant to both X and Y. Thus, CCA is shown to have more prediction power than PLS. However, CCA has not previously been used for monitoring. The major contributions of the thesis are as follows.

1. Proposed the regularized CCA to handle strong collinearity. Matrix inversion is involved in CCA when extracting its principal components, which leads to ill-conditioning when strong collinearity exists. Thus, in the regularized CCA, regularization terms are added to both the process variables and the quality variables to ensure nonzero eigenvalues. Compared with the original CCA, we have shown that the regularized CCA has better prediction effectiveness and is more robust to noise.

2. Proposed concurrent CCA and its corresponding monitoring and diagnosis framework.
The major reason that CCA has not been adopted in process monitoring is that it pays no attention to the process variance structure and the output variance structure, which still contain a large amount of unexploited information. The concurrent CCA algorithm proposed here is able to exploit the residual subspaces of X and Y in addition to extracting the correlation subspace between X and Y. It extracts the latent variables from X and Y with CCA based on the maximum correlation criterion, and then further decomposes the residual subspaces with PCA to separate the principal component subspace, which contains large variances, from the residual subspace, which mainly consists of noise. The CCCA-based monitoring and diagnosis scheme is shown to perform better than other existing schemes.

3. Extended concurrent CCA to nonlinear and dynamic cases, namely concurrent kernel CCA and dynamic concurrent kernel CCA, respectively. CKCCA is proposed based on the kernel CCA model and incorporates a regularization term in the objective to prevent the trivial solutions that exist in the original KCCA model. DCKCCA can capture the dynamic nonlinear correlations between the strip thickness and the process variables, and the strip-thickness specific variations, process-specific variations, and thickness-process covariations are monitored simultaneously in the DCKCCA model. Experiments on synthetic simulations, the Tennessee Eastman process, and the continuous annealing process have shown the modeling, monitoring, and diagnosis effectiveness of the above concurrent algorithms over other existing models.

4. Defined and analyzed several types of fault monitoring schemes, including process monitoring, quality monitoring, quality-relevant monitoring, and quality-irrelevant process monitoring. Quality monitoring is of the most concern in operations, but it usually comes with large delays and slow sampling.
With quality-relevant monitoring, however, these two issues can be circumvented, which is validated through experiments.

5. Pointed out some common misuses of the Tennessee Eastman process. When studying the Tennessee Eastman process, we found some common misuses in the literature, and it is important to note that not all disturbances are created equal. Some disturbances are compensated by the feedback control or by corrective operations in the system, and they do not affect the final quality variables; this kind of disturbance deserves relatively less attention. For quality-relevant faults, more importance should be attached.

6.2 Future Directions

The concurrent quality-relevant monitoring and diagnosis framework based on canonical correlation analysis works well, but several potential future directions remain.

1. CCA is an improved algorithm compared with PLS. However, both PLS and CCA need further processing to build comprehensive models, which is not efficient. One possible direction is to propose a latent variable method that exploits the input variance structure and simultaneously retains the process-quality prediction power of CCA. We have made some preliminary progress in this direction and proposed a latent variable least squares (LVLS) algorithm, which aims to minimize the prediction errors of the input and output score models, but more properties of LVLS need to be studied.

2. For the CKCCA proposed in Chapter 3, because the kernel computation scales linearly with the number of samples and the augmented process matrix has a large number of columns, the computational load increases significantly. Future work will discuss a MapReduce-based distributed fast implementation of the modeling algorithm. Additionally, we want to try other nonlinear techniques, such as support vector machines (SVM) and neural networks, and compare their performance with CKCCA.
3. For the DCKCCA in Chapter 5, the lagged data matrices of both quality and process data are employed to capture the dynamic relations between X and Y, but the relation between the inner model and the outer model is not explicit or consistent. Recently, Dong and Qin [85] revised the objective of the inner model and proposed a dynamic PLS (DiPLS) for dynamic system modeling, which gives an explicit description of the dynamic inner and outer models. A similar dynamic structure can be tried for CCA as well, and its corresponding monitoring and diagnosis framework should be developed.

4. The fault diagnosis methods work well on the TEP dataset, and they should be further validated on other industrial processes.

Appendices

Appendix A: The CCA Algorithm

Method 1: CCA via eigenvalue decomposition and SVD.

1. Scale the data to zero mean and unit variance.

2. Perform eigenvalue decomposition to calculate the square root factors:

$[\mathbf{V}_x, \mathbf{D}_x] = \mathrm{eig}(\mathbf{X}^\top \mathbf{X})$  (A.1)
$[\mathbf{V}_y, \mathbf{D}_y] = \mathrm{eig}(\mathbf{Y}^\top \mathbf{Y})$  (A.2)
$\Sigma_{xx}^{-\frac{1}{2}} = \mathbf{V}_x \mathbf{D}_x^{-\frac{1}{2}} \mathbf{V}_x^\top$  (A.3)
$\Sigma_{yy}^{-\frac{1}{2}} = \mathbf{V}_y \mathbf{D}_y^{-\frac{1}{2}} \mathbf{V}_y^\top$  (A.4)

3. Perform SVD to calculate the weighting matrices W and Q:

$[\mathbf{U}_z, \mathbf{S}_z, \mathbf{V}_z] = \mathrm{svd}\left(\Sigma_{xx}^{-\frac{1}{2}} \mathbf{X}^\top \mathbf{Y}\, \Sigma_{yy}^{-\frac{1}{2}}\right)$
$\mathbf{W} = \Sigma_{xx}^{-\frac{1}{2}} \mathbf{U}_z \qquad \mathbf{Q} = \Sigma_{yy}^{-\frac{1}{2}} \mathbf{V}_z$

4. Obtain the canonical variates T and U for X and Y:

$\mathbf{T} = \mathbf{X}\mathbf{W} \qquad \mathbf{U} = \mathbf{Y}\mathbf{Q}$

Method 2: CCA via three SVDs.

1. Scale the data to zero mean and unit variance.

2. Perform SVD on the data matrices:

$[\mathbf{U}_x, \mathbf{S}_x, \mathbf{V}_x] = \mathrm{svd}(\mathbf{X})$
$[\mathbf{U}_y, \mathbf{S}_y, \mathbf{V}_y] = \mathrm{svd}(\mathbf{Y})$

3. Perform SVD to calculate W and Q:

$[\mathbf{U}_z, \mathbf{S}_z, \mathbf{V}_z] = \mathrm{svd}\left(\mathbf{U}_x^\top \mathbf{U}_y\right)$
$\mathbf{W} = \mathbf{V}_x \mathbf{S}_x^{-1} \mathbf{U}_z \qquad \mathbf{Q} = \mathbf{V}_y \mathbf{S}_y^{-1} \mathbf{V}_z$

4. Obtain the canonical variates T and U for X and Y:

$\mathbf{T} = \mathbf{X}\mathbf{W} \qquad \mathbf{U} = \mathbf{Y}\mathbf{Q}$

Appendix B: The Regularized CCA Algorithm

1. Scale the data to zero mean and unit variance.
2. Perform eigenvalue decomposition on the matrices X^T X and Y^T Y to calculate the regularized square root factors,

[V_x, D_x] = eig(X^T X)
[V_y, D_y] = eig(Y^T Y)
Σ_xx^{−1/2} = V_x (D_x + γ_1 I)^{−1/2} V_x^T
Σ_yy^{−1/2} = V_y (D_y + γ_2 I)^{−1/2} V_y^T

where γ_1 and γ_2 are the regularization parameters.

3. Perform SVD to calculate the weighting matrices R and C,

[U_z, S_z, V_z] = svd(Σ_xx^{−1/2} X^T Y Σ_yy^{−1/2})
R = Σ_xx^{−1/2} U_z
C = Σ_yy^{−1/2} V_z

4. Obtain the canonical variates T and U for X and Y,

T = XR
U = YC

Appendix C: Calculations of Scores and Residuals of CKCCA

The score vector t_c in the correlation subspace is calculated as follows:

t_c = R_c^T φ(x_new) = D_c^{−1} V_c^T Q A^T k_xnew    (C.1)

where k_xnew = Φ_X φ(x_new) is the kernel vector of the new sample against the training data. The score vector t_x in the process-principal subspace can be represented as

t_x = P_x^T φ̃_c(x_new) = W_x^T (I − K_X M^T D_c^{−1} V_c^T Q A^T) k_xnew    (C.2)

where M is used to simplify the equation, and it is calculated by

M = (D_c^{−1} V_c^T Q A^T K_X A Q^T V_c D_c^{−1})^{−1} D_c^{−1} V_c^T Q A^T

The score vector t_y in the quality-principal subspace can be computed similarly to t_x:

t_y = W_y^T (I − T_c W_c^T)(k_ynew − K_Y W_c t_c)    (C.3)

where k_ynew = Φ_Y φ(y_new). The residuals φ̃(x_new) and φ̃(y_new) are calculated by

φ̃(x_new) = φ̃_c(x_new) − P_x t_x    (C.4)
φ̃(y_new) = φ̃_c(y_new) − P_y t_y    (C.5)

Appendix D: The Simplified Kernel CCA Algorithm

The simplified kernel CCA (referred to as kernel CCA in the following discussion) aims to solve for the projection vectors w and c so that the following objective function is maximized:

max_{t,u} J_KCCA = t^T u / (‖t‖ ‖u‖)    (D.1)

where t = Φ(X) w and u = Y c. There exists an N-dimensional column vector α such that w = Φ(X)^T α.
The objective function becomes

max_{α,c} J_KCCA = α^T K Y c / (√(α^T K² α) √(c^T Y^T Y c))    (D.2)

The kernel CCA objective is equivalent to

max_{α,c} J_KCCA = α^T K Y c    (D.3)

subject to the following constraints:

α^T K² α = 1
c^T Y^T Y c = 1    (D.4)

The corresponding Lagrangian is

L(α, c) = α^T K Y c − (λ_α/2)(α^T K² α − 1) − (λ_c/2)(c^T Y^T Y c − 1)    (D.5)

Taking derivatives with respect to α and c, we obtain

∂L/∂α = K Y c − λ_α K² α = 0    (D.6)
∂L/∂c = Y^T K α − λ_c Y^T Y c = 0    (D.7)

The above two equations imply that λ_α = λ_c = α^T K Y c. Let λ = λ_α = λ_c. Assuming that Y^T Y is invertible, we have

c = (1/λ)(Y^T Y)^{−1} Y^T K α    (D.8)

Substituting c into Eq. (D.6), we obtain

K Y (Y^T Y)^{−1} Y^T K α = λ² K² α

Assuming K is invertible, we have

K^{−1} Y (Y^T Y)^{−1} Y^T K α = λ² α

We are left with a standard eigendecomposition problem: α is the eigenvector corresponding to the largest eigenvalue of the above equation. In order to avoid the trivial solution presented in [34], the constraint is revised by penalizing the norm of the associated weight vector. The objective of kernel CCA in Eq. (D.2) then becomes

max_{α,c} J_KCCA = α^T K Y c / (√(α^T K² α + κ‖w‖²) √(c^T Y^T Y c))
                 = α^T K Y c / (√(α^T K² α + κ α^T K α) √(c^T Y^T Y c))    (D.9)

The kernel CCA objective is equivalent to

max_{α,c} J_KCCA = α^T K Y c    (D.10)

subject to the constraints that incorporate the norm of the associated weight vector:

α^T K² α + κ‖w‖² = α^T K² α + κ α^T K α = 1
c^T Y^T Y c = 1    (D.11)

Following the same approach as in the section above, where K is assumed to be invertible, the following equation can be obtained:

(K + κI)^{−1} Y (Y^T Y)^{−1} Y^T K α = λ² α

The eigenvector α is computed from the eigendecomposition of the above equation. Since a complete decomposition of a kernel matrix is an expensive step and should be avoided, the incomplete Cholesky decomposition is adopted instead of the Cholesky decomposition [52].
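The regularized eigenproblem just derived can be sketched numerically. The following is a minimal illustration, not the implementation used in this work: the RBF kernel, its width `gamma`, the penalty `kappa`, the kernel-centering step, and the toy data are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Gram matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_cca_first_pair(X, Y, gamma=1.0, kappa=0.1):
    """First regularized kernel CCA direction: the dominant eigenvector of
    (K + kappa*I)^{-1} Y (Y'Y)^{-1} Y' K, as in the derivation above."""
    N = X.shape[0]
    Y = Y - Y.mean(axis=0)
    K = rbf_kernel(X, gamma)
    H = np.eye(N) - np.ones((N, N)) / N       # centering in feature space
    K = H @ K @ H
    Py = Y @ np.linalg.solve(Y.T @ Y, Y.T)    # projector onto col(Y)
    M = np.linalg.solve(K + kappa * np.eye(N), Py @ K)
    evals, evecs = np.linalg.eig(M)
    alpha = evecs[:, np.argmax(evals.real)].real
    t = K @ alpha                             # score vector of phi(X)
    t /= np.linalg.norm(t)
    u = Y @ (Y.T @ t)                         # score vector of Y
    u /= np.linalg.norm(u)
    return alpha, t, u

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
Y = np.sin(X[:, :1]) + 0.05 * rng.normal(size=(60, 1))  # nonlinear quality
alpha, t, u = kernel_cca_first_pair(X, Y)
corr = abs(float(t @ u))   # canonical correlation of the first pair
print(round(corr, 3))
```

Because the quality variable here is a smooth nonlinear function of the process data, the first canonical pair recovers a high correlation; a linear CCA on the raw data would miss much of it.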
After obtaining the eigenvector α, score vectors and loading vectors are computed from the iterative algorithm as follows:

1) score vector of Φ(X): t = Kα, t ← t/‖t‖;
2) score vector of Y: u = Y Y^T t, u ← u/‖u‖;
3) weight vector of Φ(X): w = Φ(X)^T u;
4) loading vector of Φ(X): p = Φ(X)^T t;
5) loading vector of Y: q = Y^T t.

Then, deflate K and Y as follows:

K ← (I − t t^T) K (I − t t^T)
Y ← (I − t t^T) Y

Using the scores and loadings for l latent variables computed from the above algorithm, the score matrices T = [t_1, t_2, …, t_l] and U = [u_1, u_2, …, u_l], and the loading matrices P = [p_1, p_2, …, p_l] and Q = [q_1, q_2, …, q_l], can be built respectively. The following kernel CCA decomposition is thus achieved:

Φ(X) = Φ̂(X) + Φ_r = T P^T + Φ_r
Y = Ŷ + Y_r = T Q^T + Y_r    (D.12)

where T = Φ(X) R with the weight matrix R = W (P^T W)^{−1} = Φ(X)^T U (T^T K U)^{−1}, and Φ_r = (I − T T^T) Φ(X).

Appendix E: Proof of Lemma 1 and Lemma 2

When rank Ξ_i = 0 (Ξ_i = 0), then according to Eq. (4.4) the fault cannot be detected by the index, indicating that the fault is completely quality-irrelevant.

When f ∈ R(V_i)^⊥, then

Ξ_i f = D_i V_i^T f = 0    (E.1)

This makes the second term of Eq. (4.6) equal to zero, so the index stays within ζ². Thus, in this case, the fault is not detectable either.

When f ∉ R(V_i)^⊥, in order for the fault to be detected,

‖x̃ + Ξ_i f‖² ≥ (‖Ξ_i f‖ − ‖x̃‖)² > ζ²    (E.2)

Since ‖x̃‖ < ζ, it suffices that ‖Ξ_i f‖ > 2ζ. Then:

1. If rank Ξ_i = A_f, Ξ_i is a full column-rank matrix and all its singular values are greater than zero. Then ‖Ξ_i f‖ ≥ d_min ‖f‖ > 2ζ.
2. If 0 < rank Ξ_i < A_f, some of the singular values are zero. Then the minimum singular value d_min is taken among the nonzero ones, which gives d_min ‖f‖ > 2ζ.

Thus, when ‖f‖ > 2ζ/d_min, the fault is guaranteed to be detected.

Bibliography

[1] P. Nomikos and J. F. MacGregor, “Multivariate SPC charts for monitoring batch processes,” Technometrics, vol. 37, no. 1, pp. 41–59, 1995.
[2] B. M. Wise and N. B.
Gallagher, “The process chemometrics approach to process monitoring and fault detection,” Journal of Process Control, vol. 6, no. 6, pp. 329–348, 1996.
[3] S. J. Qin, “Statistical process monitoring: basics and beyond,” Journal of Chemometrics, vol. 17, no. 8-9, pp. 480–502, 2003.
[4] L. H. Chiang, E. L. Russell, and R. D. Braatz, “Fault diagnosis in chemical processes using Fisher discriminant analysis, discriminant partial least squares, and principal component analysis,” Chemometrics and Intelligent Laboratory Systems, vol. 50, no. 2, pp. 243–252, 2000.
[5] S. Wold, N. Kettaneh-Wold, and B. Skagerberg, “Nonlinear PLS modeling,” Chemometrics and Intelligent Laboratory Systems, vol. 7, no. 1, pp. 53–65, 1989.
[6] J. F. MacGregor, C. Jaeckle, C. Kiparissides, and M. Koutoudi, “Process monitoring and diagnosis by multiblock PLS methods,” AIChE Journal, vol. 40, no. 5, pp. 826–838, 1994.
[7] J. F. MacGregor, H. Yu, S. G. Muñoz, and J. Flores-Cerrillo, “Data-based latent variable methods for process analysis, monitoring and control,” Computers & Chemical Engineering, vol. 29, no. 6, pp. 1217–1223, 2005.
[8] S. Yin, S. X. Ding, A. Haghani, H. Hao, and P. Zhang, “A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process,” Journal of Process Control, vol. 22, no. 9, pp. 1567–1581, 2012.
[9] J. V. Kresta, J. F. MacGregor, and T. E. Marlin, “Multivariate statistical monitoring of process operating performance,” The Canadian Journal of Chemical Engineering, vol. 69, no. 1, pp. 35–47, 1991.
[10] B. Wise and N. Ricker, “Recent advances in multivariate statistical process control: improving robustness and sensitivity,” in IFAC Symposium on Advanced Control of Chemical Processes, Toulouse, France, pp. 125–130, 1991.
[11] S. Wold, K. Esbensen, and P. Geladi, “Principal component analysis,” Chemometrics and Intelligent Laboratory Systems, vol. 2, no. 1-3, pp. 37–52, 1987.
[12] S. Wold, A. Ruhe, H. Wold, and W. Dunn, III, “The collinearity problem in linear regression. The partial least squares (PLS) approach to generalized inverses,” SIAM Journal on Scientific and Statistical Computing, vol. 5, no. 3, pp. 735–743, 1984.
[13] B. R. Kowalski, Chemometrics: Mathematics and Statistics in Chemistry, vol. 138. Springer Science & Business Media, 2013.
[14] D. R. Hardoon, Semantic Models for Machine Learning. PhD thesis, University of Southampton, 2006.
[15] L. Sun, S. Ji, S. Yu, and J. Ye, “On the equivalence between canonical correlation analysis and orthonormalized partial least squares,” in IJCAI, vol. 9, pp. 1230–1235, 2009.
[16] J. Hulland, “Use of partial least squares (PLS) in strategic management research: A review of four recent studies,” Strategic Management Journal, vol. 20, no. 2, pp. 195–204, 1999.
[17] N. J. Lobaugh, R. West, and A. R. McIntosh, “Spatiotemporal analysis of experimental differences in event-related potential data with partial least squares,” Psychophysiology, vol. 38, no. 3, pp. 517–530, 2001.
[18] M. Martens and H. Martens, “Partial least squares regression,” Statistical Procedures in Food Research, pp. 293–359, 1986.
[19] D. V. Nguyen and D. M. Rocke, “Tumor classification by partial least squares using microarray gene expression data,” Bioinformatics, vol. 18, no. 1, pp. 39–50, 2002.
[20] J. Nilsson, S. de Jong, A. K. Smilde, et al., “Multiway calibration in 3D QSAR,” Journal of Chemometrics, vol. 11, no. 6, pp. 511–524, 1997.
[21] P. D. Sampson, A. P. Streissguth, H. M. Barr, and F. L. Bookstein, “Neurobehavioral effects of prenatal alcohol: Part II. Partial least squares analysis,” Neurotoxicology and Teratology, vol. 11, no. 5, pp. 477–491, 1989.
[22] K. J. Worsley, “An overview and some new developments in the statistical analysis of PET and fMRI data,” Human Brain Mapping, vol. 5, no. 4, pp. 254–258, 1997.
[23] A.
Höskuldsson, “PLS regression methods,” Journal of Chemometrics, vol. 2, no. 3, pp. 211–228, 1988.
[24] P. Geladi and B. R. Kowalski, “Partial least-squares regression: a tutorial,” Analytica Chimica Acta, vol. 185, pp. 1–17, 1986.
[25] H. Hotelling, “Relations between two sets of variates,” Biometrika, pp. 321–377, 1936.
[26] D. R. Hardoon and J. Shawe-Taylor, “KCCA for different level precision in content-based image retrieval,” in Proceedings of the Third International Workshop on Content-Based Multimedia Indexing, IRISA, Rennes, France, 2003.
[27] J.-P. Vert and M. Kanehisa, “Graph-driven feature extraction from microarray data using diffusion kernels and kernel CCA,” in Advances in Neural Information Processing Systems, pp. 1425–1432, 2002.
[28] M. Barker and W. Rayens, “Partial least squares for discrimination,” Journal of Chemometrics, vol. 17, no. 3, pp. 166–173, 2003.
[29] R. Rosipal and N. Krämer, “Overview and recent advances in partial least squares,” in Subspace, Latent Structure and Feature Selection, pp. 34–51, Springer, 2006.
[30] G. Li, S. J. Qin, and D. Zhou, “Geometric properties of partial least squares for process monitoring,” Automatica, vol. 46, no. 1, pp. 204–210, 2010.
[31] J. F. MacGregor and T. Kourti, “Statistical process control of multivariate processes,” Control Engineering Practice, vol. 3, no. 3, pp. 403–414, 1995.
[32] D. Zhou, G. Li, and S. J. Qin, “Total projection to latent structures for process monitoring,” AIChE Journal, vol. 56, no. 1, pp. 168–178, 2010.
[33] S. J. Qin and Y. Zheng, “Quality-relevant and process-relevant fault monitoring with concurrent projection to latent structures,” AIChE Journal, vol. 59, no. 2, pp. 496–504, 2013.
[34] D. R. Hardoon, S. Szedmak, and J. Shawe-Taylor, “Canonical correlation analysis: An overview with application to learning methods,” Neural Computation, vol. 16, no. 12, pp. 2639–2664, 2004.
[35] A. Sherry and R. K.
Henson, “Conducting and interpreting canonical correlation analysis in personality research: A user-friendly primer,” Journal of Personality Assessment, vol. 84, no. 1, pp. 37–48, 2005.
[36] Q. Zhu, Q. Liu, and S. J. Qin, “Concurrent canonical correlation analysis modeling for quality-relevant monitoring,” IFAC-PapersOnLine, vol. 49, no. 7, pp. 1044–1049, 2016.
[37] S. J. Qin, “Survey on data-driven industrial process monitoring and diagnosis,” Annual Reviews in Control, vol. 36, no. 2, pp. 220–234, 2012.
[38] S. J. Qin, “Process data analytics in the era of big data,” AIChE Journal, vol. 60, no. 9, pp. 3092–3100, 2014.
[39] J. E. Jackson, A User’s Guide to Principal Components, vol. 587. New York: Wiley-Interscience, 1991.
[40] M. Borga, T. Landelius, and H. Knutsson, “A unified approach to PCA, PLS, MLR and CCA,” 1997.
[41] S. Valle, W. Li, and S. J. Qin, “Selection of the number of principal components: the variance of the reconstruction error criterion with a comparison to other methods,” Industrial & Engineering Chemistry Research, vol. 38, no. 11, pp. 4389–4401, 1999.
[42] J. E. Jackson and G. S. Mudholkar, “Control procedures for residuals associated with principal component analysis,” Technometrics, vol. 21, no. 3, pp. 341–349, 1979.
[43] J. J. Downs and E. F. Vogel, “A plant-wide industrial process control problem,” Computers & Chemical Engineering, vol. 17, no. 3, pp. 245–255, 1993.
[44] Q. Zhu, Q. Liu, and S. J. Qin, “Concurrent monitoring and diagnosis of process and quality faults with canonical correlation analysis,” IFAC 2017 World Congress (the 20th World Congress of the International Federation of Automatic Control), 2017.
[45] I. E. Frank, “A nonlinear PLS model,” Chemometrics and Intelligent Laboratory Systems, vol. 8, no. 2, pp. 109–119, 1990.
[46] S. J. Qin and T. J. McAvoy, “Nonlinear PLS modeling using neural networks,” Computers & Chemical Engineering, vol. 16, no. 4, pp. 379–391, 1992.
[47] S. J. Qin and T. A.
Badgwell, “An overview of nonlinear model predictive control applications,” in Nonlinear Model Predictive Control, pp. 369–392, Springer, 2000.
[48] S. Akaho, “A kernel method for canonical correlation analysis,” arXiv preprint cs/0609071, 2006.
[49] B. Schölkopf, A. Smola, and K.-R. Müller, “Nonlinear component analysis as a kernel eigenvalue problem,” Neural Computation, vol. 10, no. 5, pp. 1299–1319, 1998.
[50] N. Sheng, Q. Liu, S. J. Qin, and T. Chai, “Comprehensive monitoring of nonlinear processes based on concurrent kernel projection to latent structures,” IEEE Transactions on Automation Science and Engineering, vol. 13, no. 2, pp. 1129–1137, 2016.
[51] M. Kubat, “Neural networks: a comprehensive foundation by Simon Haykin, Macmillan, 1994, ISBN 0-02-352781-7,” 1999.
[52] F. R. Bach and M. I. Jordan, “Kernel independent component analysis,” The Journal of Machine Learning Research, vol. 3, pp. 1–48, 2003.
[53] K. Peng, K. Zhang, and G. Li, “Quality-related process monitoring based on total kernel PLS model and its industrial application,” Mathematical Problems in Engineering, vol. 2013, 2013.
[54] Q. Liu, T. Chai, and S. Qin, “Fault diagnosis of continuous annealing processes using a reconstruction-based method,” Control Engineering Practice, vol. 20, no. 5, pp. 511–518, 2012.
[55] Y. Zhang, H. Zhou, S. J. Qin, and T. Chai, “Decentralized fault diagnosis of large-scale processes using multiblock kernel partial least squares,” IEEE Transactions on Industrial Informatics, vol. 6, no. 1, pp. 3–10, 2010.
[56] Y. Zhang, H. Zhou, and S. J. Qin, “Decentralized fault diagnosis of large-scale processes using multiblock kernel principal component analysis,” Acta Automatica Sinica, vol. 36, no. 4, pp. 593–597, 2010.
[57] Q. Liu, S. Qin, and T. Chai, “Decentralized fault diagnosis of continuous annealing processes based on multi-level PCA,” IEEE Transactions on Automation Science and Engineering, vol. 10, no. 3, pp. 687–698, 2013.
[58] Q. Liu, S.
Qin, and T. Chai, “Multi-block concurrent PLS for decentralized monitoring of continuous annealing processes,” IEEE Transactions on Industrial Electronics, vol. 61, no. 11, pp. 6429–6437, 2014.
[59] K. Peng, K. Zhang, G. Li, and D. Zhou, “Contribution rate plot for nonlinear quality-related fault diagnosis with application to the hot strip mill process,” Control Engineering Practice, vol. 21, no. 4, pp. 360–369, 2013.
[60] F. Serdio, E. Lughofer, K. Pichler, T. Buchegger, and H. Efendic, “Residual-based fault detection using soft computing techniques for condition monitoring at rolling mills,” Information Sciences, vol. 259, pp. 304–320, 2014.
[61] F. Serdio, E. Lughofer, K. Pichler, M. Pichler, T. Buchegger, and H. Efendic, “Fault detection in multi-sensor networks based on multivariate time-series models and orthogonal transformations,” Information Fusion, vol. 20, pp. 272–291, 2014.
[62] F. Serdio, E. Lughofer, K. Pichler, M. Pichler, T. Buchegger, and H. Efendic, “Fuzzy fault isolation using gradient information and quality criteria from system identification models,” Information Sciences, vol. 316, pp. 18–39, 2015.
[63] R. Rosipal and L. J. Trejo, “Kernel partial least squares regression in reproducing kernel Hilbert space,” The Journal of Machine Learning Research, vol. 2, pp. 97–123, 2002.
[64] W. Ku, R. H. Storer, and C. Georgakis, “Disturbance detection and isolation by dynamic principal component analysis,” Chemometrics and Intelligent Laboratory Systems, vol. 30, no. 1, pp. 179–196, 1995.
[65] S. J. Qin and T. McAvoy, “Nonlinear FIR modeling via a neural net PLS approach,” Computers & Chemical Engineering, vol. 20, no. 2, pp. 147–159, 1996.
[66] M. H. Kaspar and W. H. Ray, “Dynamic PLS modelling for process control,” Chemical Engineering Science, vol. 48, no. 20, pp. 3447–3461, 1993.
[67] S. Lakshminarayanan, S. L. Shah, and K. Nandakumar, “Modeling and control of multivariable processes: Dynamic PLS approach,” AIChE Journal, vol.
43, no. 9, pp. 2307–2322, 1997.
[68] J. Hong, J. Zhang, and J. Morris, “Progressive multi-block modelling for enhanced fault isolation in batch processes,” Journal of Process Control, vol. 24, pp. 13–26, 2014.
[69] Q. Jiang and X. Yan, “Nonlinear plant-wide process monitoring using MI-spectral clustering and Bayesian inference-based multiblock KPCA,” Journal of Process Control, vol. 32, pp. 38–50, 2015.
[70] Q. Liu, S. Qin, and T. Chai, “Quality relevant monitoring and diagnosis with dynamic concurrent projection to latent structures,” in the 19th IFAC World Congress, Cape Town, South Africa, pp. 2740–2745, August 2014.
[71] Y. Zhang, R. Sun, and Y. Fan, “Fault diagnosis of nonlinear process based on KCPLS reconstruction,” Chemometrics and Intelligent Laboratory Systems, vol. 140, pp. 49–60, 2015.
[72] P. Odiowei and Y. Cao, “Nonlinear dynamic process monitoring using canonical variate analysis and kernel density estimation,” IEEE Transactions on Industrial Informatics, vol. 6, no. 1, pp. 36–44, 2010.
[73] R. Rosipal and L. J. Trejo, “Kernel partial least squares regression in reproducing kernel Hilbert space,” Journal of Machine Learning Research, vol. 2, no. Dec, pp. 97–123, 2001.
[74] P. Miller, R. E. Swanson, and C. E. Heckler, “Contribution plots: a missing link in multivariate quality control,” Applied Mathematics and Computer Science, vol. 8, no. 4, pp. 775–792, 1998.
[75] J. A. Westerhuis, S. P. Gurden, and A. K. Smilde, “Generalized contribution plots in multivariate statistical process monitoring,” Chemometrics and Intelligent Laboratory Systems, vol. 51, no. 1, pp. 95–114, 2000.
[76] R. Dunia and S. J. Qin, “Subspace approach to multidimensional fault identification and reconstruction,” AIChE Journal, vol. 44, no. 8, pp. 1813–1831, 1998.
[77] C. F. Alcala and S. J. Qin, “Reconstruction-based contribution for process monitoring,” Automatica, vol. 45, no. 7, pp. 1593–1600, 2009.
[78] S. J. Qin, S. Valle, and M. J.
Piovoso, “On unifying multiblock analysis with application to decentralized process monitoring,” Journal of Chemometrics, vol. 15, no. 9, pp. 715–742, 2001.
[79] G. Li, C. F. Alcala, S. J. Qin, and D. Zhou, “Generalized reconstruction-based contributions for output-relevant fault diagnosis with application to the Tennessee Eastman process,” IEEE Transactions on Control Systems Technology, vol. 19, no. 5, pp. 1114–1127, 2011.
[80] G. Li, S. J. Qin, and T. Chai, “Multi-directional reconstruction based contributions for root-cause diagnosis of dynamic processes,” in American Control Conference (ACC), 2014, pp. 3500–3505, IEEE, 2014.
[81] H. H. Yue and S. J. Qin, “Reconstruction-based fault identification using a combined index,” Industrial & Engineering Chemistry Research, vol. 40, no. 20, pp. 4403–4414, 2001.
[82] A. Negiz and A. Çinar, “Statistical monitoring of multivariable dynamic processes with state-space models,” AIChE Journal, vol. 43, no. 8, pp. 2002–2020, 1997.
[83] G. E. Box et al., “Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the one-way classification,” The Annals of Mathematical Statistics, vol. 25, no. 2, pp. 290–302, 1954.
[84] S. Yoon and J. F. MacGregor, “Fault diagnosis with multivariate statistical models part I: using steady state fault signatures,” Journal of Process Control, vol. 11, no. 4, pp. 387–400, 2001.
[85] Y. Dong and S. J. Qin, “Dynamic-inner partial least squares for dynamic data modeling,” IFAC-PapersOnLine, vol. 48, no. 8, pp. 117–122, 2015.
Abstract
Statistical process monitoring and fault diagnosis apply multivariate statistical analysis techniques to process and quality data to monitor and diagnose disturbances in industrial processes. Partial least squares (PLS) and canonical correlation analysis (CCA) are two popular supervised learning methods among them. The PLS model and PLS-based monitoring framework have been widely studied and used for quality-relevant monitoring. However, the discrepant objectives in the PLS inner and outer models lead to many problems, such as irrelevant components in the extracted latent variables and large variances in the process residual subspace.

CCA extracts the multi-dimensional correlation structure between process and quality variables, which enables it to maximize the quality prediction from process data. CCA can be used to overcome the drawbacks of PLS
Asset Metadata
Creator: Zhu, Qinqin (author)
Core Title: Concurrent monitoring and diagnosis of process and quality faults with canonical correlation analysis
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Chemical Engineering
Publication Date: 10/10/2019
Defense Date: 09/22/2017
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: concurrent canonical correlation analysis, dynamic processes, nonlinear processes, OAI-PMH Harvest, quality-relevant fault diagnosis, quality-relevant monitoring
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Qin, Joe (committee chair), Huang, Qiang (committee member), Wang, Pin (committee member)
Creator Email: qinqinzh@usc.edu, qinqinzhu2013@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c40-444213
Unique identifier: UC11265786
Identifier: etd-ZhuQinqin-5834.pdf (filename), usctheses-c40-444213 (legacy record id)
Legacy Identifier: etd-ZhuQinqin-5834.pdf
Dmrecord: 444213
Document Type: Dissertation
Rights: Zhu, Qinqin
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA