A novel hybrid probabilistic framework for model validation
A NOVEL HYBRID PROBABILISTIC FRAMEWORK FOR MODEL VALIDATION

by Subhayan De

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (CIVIL ENGINEERING)

May 2018

Copyright 2018 Subhayan De

Dedication

To my ma, papa and brother.

विद्वत्त्वं च नृपत्वं च नैव तुल्यं कदाचन।
स्वदेशे पूज्यते राजा विद्वान् सर्वत्र पूज्यते॥

Scholars and kings are never comparable. A king is worshipped in his country, but a scholar is worshipped everywhere.
– Chanakya (300 BC)

Acknowledgments

The dissertation work is based on my attempt to propose a new probabilistic machine learning framework for computational model validation. For this valuable educational venture, I would first and foremost like to express my gratitude to my advisor Prof. Erik A. Johnson here at USC and my co-advisor Prof. Steven F. Wojtkiewicz at Clarkson University. I would like to take this opportunity to formally thank them for their constant encouragement and valuable suggestions throughout the last five years. Special thanks also go to Dr. Patrick Brewick (now at the United States Naval Research Laboratory) for his help and mentoring during his two-year stay at USC. I would also like to thank: my dissertation committee members Prof. Roger Ghanem, Prof. Ketan Savla, and Prof. Iván Bermejo-Moreno; my past advisors at Indian Institute of Science, Prof. C. S. Manohar and Prof. Debraj Ghosh; and my friends in Los Angeles, Bangalore, and Kolkata. Finally, I want to express my deep gratitude to my parents, Munmun and Sushil Kumar, and my elder brother, Sankarsan, for their love, affection, support and encouragement.

The partial support of this work by the University of Southern California (through a Viterbi Ph.D. Fellowship and a Gammel Scholarship) and by the National Science Foundation (through grants 13-44937, 14-36018 and 16-63666) is also gratefully acknowledged.
Contents

Dedication
Acknowledgments
List of Tables
List of Figures
Abstract
1 Introduction
  1.1 Outline of the Dissertation
2 Model Falsification
  2.1 Introduction
  2.2 Terminology
  2.3 Background
    2.3.1 Error Domain Model Falsification
    2.3.2 Multiple Comparison Test Corrections
    2.3.3 Family-Wise Error Rate (FWER)
    2.3.4 False Discovery Rate (FDR)
  2.4 Model Falsification Methodology
    2.4.1 Error-Bound Model Falsification
    2.4.2 Likelihood-Bound Model Falsification
    2.4.3 Relationships between Error-Bound and Likelihood-Bound Methods
    2.4.4 Likelihood as Metric of Model Confidence
    2.4.5 Robust Estimation and Robust Prediction
    2.4.6 Model Confidence and Post-falsification Robust Prediction
  2.5 Numerical Examples
    2.5.1 Example I: Conceptual Example
    2.5.2 Example II: Four DOF System
    2.5.3 Example III: Complex Wind-Excited Building (1623 DOF)
  2.6 Conclusions
3 Bayesian Model Class Selection
  3.1 Introduction
  3.2 Methodology
    3.2.1 Bayesian Model Class Selection
    3.2.2 Evaluating the Evidence: Nested Sampling
    3.2.3 Efficient Analysis of Systems with Local Modifications
  3.3 Numerical Examples
    3.3.1 Example I: SDOF Superstructure on a Hysteretic Isolator
    3.3.2 Example II: 11-Story Base-Isolated Structural Model
    3.3.3 Example III: Complex Three-Dimensional Wind-Excited Structure
  3.4 Conclusions
4 Multilevel Estimation of Marginal Likelihood for Bayesian Model Selection
  4.1 Introduction
  4.2 Review of sampling methods
    4.2.1 Importance sampling
    4.2.2 Stratified sampling
    4.2.3 Markov chain Monte Carlo (MCMC)
  4.3 Proposed Methodology
    4.3.1 Use of probability integral transform
    4.3.2 Multilevel-Importance sampling (ML-IS)
    4.3.3 Multilevel-Stratified sampling (ML-SS)
    4.3.4 Multilevel-particle approximation (ML-PA)
  4.4 Discussion of the Proposed Approach
    4.4.1 Estimation of posterior moments
    4.4.2 Stopping criteria
    4.4.3 Accuracy
  4.5 Numerical Illustrations
    4.5.1 Example I: Conceptual example
    4.5.2 Example II: Flow past a cylinder
    4.5.3 Example III: 11 story base isolated building
  4.6 Conclusions
5 Model Validation Framework
  5.1 Introduction
  5.2 Hybrid Framework for Model Validation
    5.2.1 Intra-Model-Class Falsification: Framework’s Preprocessing Step
    5.2.2 Bayesian model class selection
    5.2.3 Inter-Model-Class Falsification: Framework’s Postprocessing Step
    5.2.4 Computational Advantage of the Synergistic Framework
  5.3 Numerical Illustrations
    5.3.1 Example I: 3DOF Model with Nonlinear Stiffnesses
    5.3.2 Example II: 4DOF model with Hysteretic Isolation layer
  5.4 Conclusion
6 Applications
  6.1 Introduction
  6.2 Computationally Efficient Design of Passive Control Devices
    6.2.1 Methodology
    6.2.2 Bridge Model and Optimization Objectives/Constraints
    6.2.3 Numerical examples
    6.2.4 Closure
  6.3 Efficient Forward Uncertainty Propagation
    6.3.1 Methodology
    6.3.2 Numerical Example
    6.3.3 Closure
  6.4 Reynolds Averaged Navier-Stokes (RANS) Models for Turbulence
    6.4.1 k−ε model class
    6.4.2 RNG k−ε model class
    6.4.3 k−ω model class
    6.4.4 Spalart-Allmaras model class
    6.4.5 NASA Two Dimensional Hump Flow
    6.4.6 Uncertainty Model
    6.4.7 Priors for the Parameters
    6.4.8 Implementation
    6.4.9 Results
    6.4.10 Closure
  6.5 Work in Progress: Validation of Four-Story Building Models
    6.5.1 Results
    6.5.2 Closure
  6.6 Conclusion
7 Conclusions and Future Directions
A Modified Metropolis-Hastings Algorithm
References

List of Tables

2.1 The number of outcomes among the $N_o$ hypothesis tests between measured data and the outputs of a particular candidate model.
2.2 Falsification methodologies employed herein.
2.3 Priors for model parameters as applicable to each model class.
2.4 Fraction of models that are unfalsified using different methods.
2.5 Fraction of models that are unfalsified using a prior distribution similar to the priors in Table 5.7 except the mean of $k_\text{post}$ is changed to 3.5 MN/m.
2.6 True Bouc-Wen model parameters and estimates of the Bouc-Wen and Bilinear model parameters. (ML denotes maximum likelihood parameter estimates; $\hat{\theta}$ are robust parameter estimates.)
2.7 Models for the TMD damping force used to define different model classes for this structure and the prior distributions for their model parameters ($u$ is a TMD displacement relative to the roof; $W_\text{tmd}$ is the weight of the corresponding TMD; S.D. is the standard deviation).
2.8 Fractions of models unfalsified within each model class.
2.9 Robustly estimated parameters of two model classes.
3.1 Prior distributions for model class parameters as applicable to each 2 DOF model.
3.2 Posterior model class probabilities with equal priors.
3.3 Priors $\mathcal{N}(m, \sigma^2)$ for model class parameters of the 100 DOF model.
3.4 Posterior model class probabilities with equal priors for the 100 DOF structure.
3.5 Computational gain achieved using the proposed approach for the three numerical examples.
3.6 Nonlinear damping model classes, where $u$ is a TMD displacement relative to its roof attachment point.
3.7 Prior distributions for model parameters for different damping model classes for the 1623 DOF building structure subjected to wind load.
3.8 Posterior TMD damping model class probabilities with equal priors for the 1623-DOF wind-excited building model.
4.1 Comparison of marginal likelihood or evidence values obtained using the three proposed algorithms with the exact value. The coefficients of variation (COV) are obtained from 10 independent simulation runs.
4.2 Comparison of marginal likelihood or evidence values using different algorithms for Example II.
4.3 Prior distribution of parameters for the 11-story base isolated building.
4.4 Comparison of marginal likelihood or evidence values using different algorithms for Example III.
4.5 Posterior mean and standard deviation of the parameters for the 11-story base isolated building.
4.6 Posterior model probabilities for the hysteretic isolation layer in the 11-DOF base-isolated building.
5.1 Means and standard deviations of Gaussian prior distributions for model class stiffness coefficients (units are MN/m$^{p_i}$).
5.2 Unfalsified models using proposed intra-model class falsification. (Bold means unfalsified model classes.)
5.3 Posterior model class probabilities after most of the model classes are rejected using a preprocessing step of intra-model-class falsification. (Relative log-evidence is with respect to the model class with the largest log-evidence.)
5.4 Posterior mean and standard deviation of model parameters for model class 1−3−2, and the true values.
5.5 Posterior model class probabilities of the two best model classes after some model classes are rejected using preprocessing intra-model-class falsification and an adjustment of prior parameter distribution for model selection.
5.6 Unfalsified models using proposed intra-model class falsification with target identification probability φ = 0.90.
5.7 Priors for model parameters as applicable to each model class.
5.8 Unfalsified models using proposed intra-model class falsification.
5.9 Posterior model class probabilities after some model classes are rejected using a preprocessing step of intra-model-class falsification.
5.10 Posterior model parameters and their true values for Bouc-Wen model class.
5.11 Unfalsified models using the proposed intra-model class falsification with φ = 0.90.
6.1 Performance metrics [1]
6.2 Feasible solutions of linear damper configuration
6.3 Feasible solutions of linear damper configuration with relaxed constraints, and the corresponding cost reductions relative to that with no damper
6.4 Feasible solutions of nonlinear damper configuration with relaxed constraints, and the corresponding cost reductions relative to that with no damper
6.5 Optimization of two pairs of power-law dampers
6.6 Computational gain achieved using the proposed method with fourth-order accurate quadrature scheme for passive damper design
6.7 Solutions of TMD parameters in location A with relaxed constraints.
6.8 Solutions of TMD parameters in location B with relaxed constraints.
6.9 Computational gain achieved using the proposed method for TMD design
6.10 Optimal combined passive dampers and TMD configuration with relaxed constraints.
6.11 Uncertain parameter description
6.12 Worst-case and average passive damper designs-under-uncertainty
6.13 Computational gain achieved using the proposed method for design-under-uncertainty
6.14 Standard values of the closure coefficients for k−ε model class.
6.15 Standard values of the closure coefficients for RNG k−ε model class.
6.16 Standard values of the closure coefficients for k−ω model class.
6.17 Standard values of the closure coefficients for Spalart-Allmaras model class.
6.18 Prior probability distributions of the different parameters of each model class.
6.19 Unfalsified models using proposed intra-model class falsification.
6.20 Posterior model class probabilities evaluated using the Bayesian model class selection.
6.21 Identified values of the parameters of the k−ε models.
6.22 Different model classes with their parameters, mean and standard deviation of their prior distribution, results from model falsification and selection.

List of Figures

2.1 Model falsification fundamentals.
2.2 Statistical power of a single comparison test with one measurement. (Note: if the density is Gaussian, as drawn, then the bounds are $\pm\sigma\Phi^{-1}(\alpha/2)$, where $\sigma$ is the model’s assumption on the residual error standard deviation and $\Phi(\cdot)$ is the standard unit normal cumulative distribution function.)
2.3 Comparison of residual errors’ areas/volumes within which a model is unfalsified by FDR/BH (solid-line area/volume, which is blue in the electronic version) and by FWER/Šidák (dashed-line square/cube, which is red in the electronic version). (Note: the axes are scaled by the ranges of the bounds; if the distributions of the $\epsilon_i$ were different and no scaling were applied, the square/cube would become a rectangle/cuboid.)
2.4 Error-bound model falsification procedure.
2.5 Typical single-measurement residual density.
2.6 Likelihood bound for the two-measurement case. (Note: while the joint densities are depicted for uncorrelated measurements, this need not be assumed.)
2.7 Relation between error- and likelihood-bound methods for FWER/Šidák and FDR/BH. Note: $L_1 \le L_2 \le L_3$.
2.8 Error residuals for $N_o$ measurements with larger residual on one sensor measurement ($\epsilon_3$).
2.9 The sets of Case I models θ that are falsified and unfalsified by the error-bound and likelihood-bound methods for different values of assumed residual standard deviation σ.
2.10 Case I ($\sigma = 0.25$, $\sigma_d = 0.25$) models’ likelihoods on a: (a) linear scale, with vertical-line ranges of the error-bound (EB) unfalsified models; (b) log scale, with vertical-line ranges of the likelihood-bound unfalsified models. (Bonferroni and Šidák bounds/ranges are nearly identical and overlap on the graphs.)
2.11 Statistical power (the probability of rejecting a model when it is invalid) as a function of the number of measurements for the FWER/Šidák and FDR/BH error-bound methods for parameters θ = 1.0, 1.5, and 2.0.
2.12 The sets of falsified and unfalsified Case II models (biased modeling error) via the error-bound (Err-bound) and likelihood-bound (LH-bound) methods for assumed residual standard deviation σ = 0.25.
2.13 The sets of falsified and unfalsified Case III models (correlated measurements with σ = 0.25) via the error-bound (Err-bound) and likelihood-bound (LH-bound) methods.
2.14 The sets of falsified and unfalsified Case IIIb models (alternating-sign-correlated measurements) via the error-bound (Err-bound) and likelihood-bound (LH-bound) methods with σ = 0.25.
2.15 (a) 4DOF testbed system: a building superstructure on a hysteretic isolation layer; (b) force-displacement loops and linear approximations for the various model classes
2.16 The fraction of Bouc-Wen models falsified by the FDR/BH error- and likelihood-bound methods as a function of the number of candidate models. $n_s = 2000$, used in Example II, is sufficient as the curves do not significantly change beyond 2000 models.
2.17 Mean of RMS errors of model residuals $\epsilon_i$ for Bouc-Wen models that are error-bound falsified by neither FWER/Šidák nor FDR/BH, falsified by FDR/BH only (not FWER/Šidák), and falsified by both. The error bars show one standard deviation of the mean RMS errors.
2.18 Likelihood values of the bilinear and Bouc-Wen models as a function of parameter $k_\text{post}$; the FDR/BH likelihood bound is also shown.
2.19 True absolute base accelerations of the 4DOF building subjected to the Kobe earthquake, and those robustly predicted from models likelihood-bound-unfalsified using the El Centro data.
2.20 Relative RMS error in prediction of the absolute base acceleration of the four DOF building subjected to 1995 Kobe earthquake excitation using the Bouc-Wen models with varying measurement noise level and for different initial model set for falsification. (CI = confidence interval, Std = standard deviation.)
2.21 1623 DOF model of a building with TMDs on its roof, subjected to wind load.
2.22 Robustly predicted and true absolute roof acceleration of the 1623 DOF building subjected to a different realization of the wind excitation.
2.23 Relative RMS error in prediction of the roof acceleration in the x direction of the 1623 DOF building using cubic polynomials for the TMD damping subjected to a different realization of wind excitation with varying measurement noise level and for different initial model set for falsification. (CI = confidence interval, Std = standard deviation.)
3.1 The nested sampling algorithm in the two-dimensional parameter space with iso-likelihood contours.
3.2 Implementation of the efficient dynamic response algorithm showing one time calculation and repeated calculation components.
3.3 Two DOF structural model with hysteretic damping.
3.4 Damping model classes: Bouc-Wen hysteresis, bilinear hysteresis, and the AASHTO “equivalent” linear model class.
3.5 Hysteresis loops for the true Bouc-Wen model class for the 2 DOF structure.
3.6 100 DOF base-isolated structural model.
3.7 Hysteresis loops, for the (a) Bouc-Wen and (b) bilinear model classes, for the 100 DOF isolated building structure.
3.8 Typical isolation-layer restoring force hysteresis loops, over the time duration [0.73, 2.16] s, for one linear and two nonlinear model classes for the 100 DOF structure.
3.9 Complex three-dimensional wind-excited structure.
4.1 Multilevel estimation of marginal likelihood.
4.2 Multilevel-importance sampling (ML-IS) method: the iso-likelihood contours are shown with $\lambda_1 < \lambda_2 < \cdots < \lambda_n$; importance densities are formed successively to generate samples from high likelihood region.
4.3 Multilevel-stratified sampling (ML-SS) method: the iso-likelihood contours are shown with $\lambda_1 < \lambda_2 < \cdots < \lambda_n$; more samples are generated from the strata with high likelihood values.
4.4 Multilevel-particle approximation (ML-PA) method: the iso-likelihood contours are shown with $\lambda_1 < \lambda_2 < \cdots < \lambda_n$; Markov chains are run to generate samples from high likelihood region.
4.5 The error in estimation of log evidence gets reduced with increasing number of likelihood function evaluations for the three proposed algorithms.
4.6 Schematic of pipe and cylinder.
4.7 Typical velocity and pressure distributions in Example II.
4.8 100 DOF base-isolated structural model
4.9 Models for hysteresis.
5.1 Proposed synergistic framework of model falsification and Bayesian model class selection.
5.2 A likelihood bound $L$ defined for a multidimensional non-Gaussian residual density.
5.3 3 DOF model with nonlinear stiffnesses.
5.4 4DOF models
5.5 Representative time history realization of $\ddot{x}_g(t)$.
5.6 Degradation parameters and hysteresis loops of the Bouc-Wen and Baber-Wen models using the true values of the parameters.
6.1 Finite element model of the bridge (dimensions in m); adapted from Dyke et al. [1].
6.2 Six candidate device locations denoted as I, II, III, IV, V and VI.
6.3 Accuracy of the proposed approach (Δt = 6.1 ms, a fourth-order quadrature) compared to Matlab’s ode45.
6.4 Two candidate TMD device locations denoted as A and B.
6.5 Finite element model (side view) of the bridge, with Example IV device locations.
6.6 The random space divided into $n_\text{st}$ disjoint elements.
6.7 100 DOF base-isolated structural model
6.8 Mean response with its variation.
6.9 An adaptive division of elements
6.10 Comparison of variance estimates over time
6.11 The experimental setup used to generate measurements for the two dimensional hump flow case.
6.12 The hump geometry shown in detail.
6.13 Plot of the velocity measurements from the experiment at different distances from the hump in the downhill direction.
6.14 The experimental set-up. (Picture taken by Prof. Erik Johnson.)
6.15 The finite element model (kindly provided by Mr. Tianhao Yu).

Abstract

Models are used to represent and characterize physical phenomena. When the number of plausible models for a particular phenomenon is large, computational tools such as model falsification or model selection can help with the choice of models by eliminating models that do not fit the data. A probabilistic framework is proposed in this dissertation for validating models by intertwining the concepts of model falsification and Bayesian model selection. Model falsification is used in this framework as pre- and postprocessing steps that try to eliminate models, as well as model classes, that cannot explain the measurements. A likelihood-bound model falsification based on control of a statistical error criterion, namely the false discovery rate, is proposed and used here in the falsification step. This likelihood-based falsification result determines the validity of the initial candidate model class set and helps in removing most of the incorrect model classes without much computational cost. Bayesian model selection, which assigns posterior model class probabilities based on Bayes’ theorem, is then applied to the remaining model classes, with computational savings from reusing the likelihood values evaluated in the preprocessing step. Finally, a postprocessing step based on likelihood-bound falsification is used to check the validity of the finally selected model class(es).
The proposed framework is applied to nonlinear structural examples with many model classes available for the systems. Further improvements are also proposed for efficient estimation of evidence for Bayesian model selection using a probability integral transform. Finally, optimal design and design-under-uncertainty of passive damping devices, an efficient forward uncertainty propagation, model validation of a four-story full-scale base isolated building, and validation of turbulence models are performed using the methodologies developed here.

Chapter 1
Introduction

We demand rigidly defined areas of doubt and uncertainty!
Douglas Adams, The Hitchhiker’s Guide to the Galaxy

Models and model classes, typically given by systems of mathematical equations, are built to help represent, understand, and further characterize physical phenomena. The choice of a model class or model for a particular phenomenon is made based on user judgment, evidence from measurement data, and/or the ease of its use. In the dynamic characterization of structural systems, different model classes and members of these classes are often available for response calculation; a choice of model(s) or model class(es) must be made after eliminating the incorrect model(s) or model class(es). In structural control, the design of a control strategy is only as effective as the model of the system it is applied to. In health monitoring, the modeling of the structure is important as damage detection is often performed by identifying changes in the model parameters. In structural dynamics, these models are used to design damping and control systems to mitigate the effects of catastrophic earthquakes and other natural hazards based on the outcome of different simulated scenarios [2].

Model selection refers to problems in which model(s) must be selected from a larger set, even if the “true” model is not included [3].
Occam’s razor suggests that models with lower complexity should be favored among models of comparable accuracy [4]. The use of Bayesian inference to evaluate the plausibility of different models is known as Bayesian model selection, which has been applied across diverse fields [5–7]. The Occam’s razor principle, which emphasizes selecting the simplest model class among plausible model classes, is also embedded in Bayesian model selection, as shown by MacKay [8–11]. This type of inference differs from the more widely applied Bayesian approaches that characterize the likelihood of the values of the models’ parameters and instead focuses on the plausibility of (the form of) the models themselves. A common form of Bayesian model selection uses the Bayes factor, which is the ratio of the marginal likelihoods, or evidences, of two models [12, 13]. Bayesian model class selection [14] is Bayesian inference applied to quantify the likelihood of entire families of models, i.e., model classes. In Bayesian model and model class selection, posterior probabilities of available models are found based on system response measurement data; these methods are applied here to dynamic responses [11, 15–18].

Model falsification, on the other hand, tries to eliminate models that cannot describe the behavior of the physical system, based on the philosophy that measurement data can only be used to falsify them [19, 20]. Statistical hypothesis testing methods can be employed in this setting. Recently, increased interest in this methodology has led to applications in diverse fields, including identification of structural systems [21–26], control theory [27–33], and biology [34].

The current approaches for model selection and falsification, used individually, have challenging limitations: the former can, without indication of the danger, leave as most plausible a model that is certainly wrong; the latter can fail to falsify any model class, leaving a large number of unfalsified models.
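For reference, the quantities named in the discussion of Bayesian model selection above can be written compactly. The following is a sketch in generic notation, not necessarily the notation used later in this dissertation: the symbols $\mathcal{M}_j$ (the $j$-th candidate model class), $\mathcal{D}$ (the measurement data), $\boldsymbol{\theta}$ (the parameters of a model class), and $B_{12}$ (the Bayes factor) are assumed here for illustration.

```latex
% Evidence (marginal likelihood) of model class M_j: the likelihood
% averaged over the prior of that class's parameters.
p(\mathcal{D} \mid \mathcal{M}_j)
  = \int p(\mathcal{D} \mid \boldsymbol{\theta}, \mathcal{M}_j)\,
         p(\boldsymbol{\theta} \mid \mathcal{M}_j)\, d\boldsymbol{\theta}

% Posterior model class probability from Bayes' theorem,
% normalized over all candidate model classes M_k.
P(\mathcal{M}_j \mid \mathcal{D})
  = \frac{p(\mathcal{D} \mid \mathcal{M}_j)\, P(\mathcal{M}_j)}
         {\sum_{k} p(\mathcal{D} \mid \mathcal{M}_k)\, P(\mathcal{M}_k)}

% Bayes factor comparing two model classes: the ratio of their evidences.
B_{12} = \frac{p(\mathcal{D} \mid \mathcal{M}_1)}{p(\mathcal{D} \mid \mathcal{M}_2)}
```

With equal prior probabilities $P(\mathcal{M}_1) = P(\mathcal{M}_2)$, the ratio of the two posterior probabilities reduces to $B_{12}$. Because the evidence averages the likelihood over the whole prior, a model class with many poorly constrained parameters is penalized automatically, which is one way to read the Occam's razor property noted above.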
Modelers need new tools for validating their models. To address these challenges, a falsification of models in a Bayesian framework, as well as a methodology that can use dynamic measurements (as opposed to modal parameters) from dynamical systems, both linear and nonlinear, are proposed in this work.

The framework proposed in this work is based on the philosophy put forward by Box as “Essentially, all models are wrong, but some are useful” [35, 20]. This philosophy is extended in the proposed framework to select one or more useful model(s) or model class(es) by integrating the principle of model falsification into Bayesian model selection to mitigate the weaknesses of these different identification schemes. By re-thinking the incorporation of dynamic measurements into model falsification, dynamic models and measurements of all types will become candidate inputs. Exploiting model falsification’s ability to greatly reduce the valid model class set will avoid numerous expensive computations required by the evaluation of the posterior parameter distribution and Bayesian model selection for every model class; these cost savings will grow as the number of measurements, degrees-of-freedom, or space/time resolution increases. The reduced set of models should also give modelers more confidence in subsequent results, since every remaining model has survived the falsification tests, although unfalsified models are not guaranteed to be valid.

1.1 Outline of the Dissertation

The second chapter discusses the model falsification part of the framework. A likelihood-bound model falsification based on the false discovery rate (FDR) and its use in robust response prediction are proposed in that chapter. The third chapter discusses Bayesian model selection with improved computational efficiency, for dynamical systems with isolated uncertainties and nonlinearities, by solving for their responses using a Volterra integral equation.
The fourth chapter proposes a multilevel approach to calculate the evidence using a probability integral transformation. The fifth chapter combines the falsification and selection approaches discussed in the previous chapters and gives a detailed outline of the proposed framework with numerical examples. The sixth and seventh chapters discuss applications of the developed methodologies in diverse fields, and conclude the dissertation, respectively.

Chapter 2
Model Falsification

When nothing is sure, everything is possible.
Margaret Drabble

2.1 Introduction

The notion of model falsification originated in the 1930s from the assertion of Popper [19] that scientific models cannot be fully validated by data, but can only be falsified. Box and Draper [20] stated that “Essentially, all models are wrong, but some are useful.” The idea behind model falsification is to find these useful models by falsifying and eliminating invalid models from the candidate model set. Based on these principles, model falsification has found several applications, including those in control theory, biology and structural modeling. Model falsification has been used to find a robust adaptive controller from measurement data to satisfy performance specifications of a closed-loop system in Safonov and Tsao [27] and Brugarolas and Safonov [28]. Model falsification using feasibility inequalities, derived based on robust control principles (e.g., $H_\infty$), has also been used for robust control design in Smith and Doyle [29], Poolla et al. [30], Smith et al. [31], Woodley et al. [32], Kosut [33], Sznaier and Mazzaro [36], and Bianchi and Sánchez-Peña [37], as well as for recursive state estimation in Schweppe [38]. Concurrent with the developments in the control area, the application of model falsification also proceeded in problems related to modeling bioprocesses for biotechnology [39] and modeling astronomical phenomena [40].
More recently, model falsification has seen increased interest [41] and has been applied in new application areas such as advanced modeling of biological networks [34].

The error domain model falsification approach [42] specifies bounds on prediction errors, based on combining uncertainties from different sources (such as sensor noise, error in analysis, and errors due to model simplification), and then uses these bounds to falsify models with larger prediction errors. This approach was developed and applied to a series of structural modeling and monitoring investigations in Raphael and Smith [21], Robert-Nicoud et al. [22], and Smith and Saitta [23]. Some applications quantified the uncertainty and, subsequently, identifiability of models for civil structures, such as bridges and pipe networks, based on various characteristics of the measurement and monitoring systems [24, 43]. Ambient vibration data has been incorporated into the method by applying model falsification to the identification of modal parameters, such as natural frequencies and mode shapes [25, 26]. Error domain model falsification has also been utilized for optimizing the performance of measurement systems through an iterative process that systematically reduces the number of measurements needed for effective identification, helping to control under- and over-instrumentation [44]. To demonstrate its strengths, this model falsification technique has been compared to other methods of model selection, such as residual minimization and Bayesian inference [45]. Beven and his colleagues have proposed a different approach to model falsification by using the likelihood values of the measurement data in hydrological examples while defining the bounds in a subjective manner [46–49], though they have been criticized for using likelihood functions that do not represent a probability density for the prediction errors and for applying Bayes’ theorem in an empirical sense [50].
This chapter compares model falsification approaches that evaluate models based on their predictions of the measured data, encompassing approaches from the previous paragraph as well as proposed modifications and new methods using likelihoods of the model prediction errors. Bounds are employed to directly define limits on prediction errors or to define limits on their likelihoods using their distribution function. The bounds on the prediction errors or their likelihoods can be specified using different multiple comparison test corrections. The Šidák correction, which controls the family-wise error rate (FWER), is generally employed for falsification using error bounds [45], though the simpler Bonferroni correction is sometimes used. Herein, false discovery rate (FDR) control is proposed for defining the error bounds, as FDR has the capacity to falsify more invalid models than FWER while performing comparably in retaining valid models [51]. Likelihood bounds are also proposed, chosen based on these three multiple comparison corrections as well as on a constant probability mass approach, to falsify models and evaluate the confidence in the remaining models. Compared to standard Bayesian approaches, likelihood-bound falsification and weights can reduce the computational cost of response prediction by falsifying many invalid models (and even some entire invalid model classes), thereby reducing the required number of model simulations in any subsequent analysis. Next, the likelihood-bound falsification results are used for a robust prediction of future response of the dynamic systems in an approximate Bayesian sense.

An elementary problem is used first to illustrate the application of the methods in a simple setting. The second example uses a four degree-of-freedom structural dynamical system with different linear and nonlinear models of a hysteretic base isolator. These two examples demonstrate that specifying limits on the prediction error helps to falsify many of the models, with FDR performing better than FWER in eliminating invalid models, whereas the use of likelihood values can be of great help by improving the modeler’s judgment on the validity of the remaining unfalsified models. A third example of a building subjected to wind load is also used with a specific focus on the robust prediction of future response.

Figure 2.1: Model falsification fundamentals. (Diagram of the sets of candidate models, valid models, and unfalsified models.)

2.2 Terminology

A variety of terms are used to describe the falsification of models, with slightly differing meaning from field to field; the following terms are defined to eliminate any ambiguity herein:

Falsified/Rejected model: a model falsified or rejected by an approach because the prediction from that model does not satisfy some criteria specified by that approach (e.g., the residual error between the measurement and the prediction is too large or too unlikely).
Unfalsified/Accepted model: a model with predictions that satisfy those specific criteria.
Valid model: a model that reasonably reproduces the system behavior.
Invalid model: a model that cannot reasonably reproduce the system behavior.
Positive test result: the event of falsifying/rejecting/discarding a model.
Negative test result: the event of unfalsifying/accepting/retaining a model.
False positive: the event of (incorrectly) falsifying/rejecting a valid model.
False negative: the event of (incorrectly) unfalsifying/accepting an invalid model.
Type I error: the error introduced by (incorrectly) rejecting valid model(s) (false positives).
Type II error: the error introduced by (incorrectly) accepting invalid model(s) (false negatives).
Significance level of test: the probability of incorrectly rejecting valid models (i.e., type I error) while performing a falsification test.
p-value: the probability of observing, according to some model, a residual error as extreme as, or more extreme than, the actual observed residual error.
Control of an error criterion: a correction method “controls” a particular error criterion (e.g., FWER, FDR) by keeping the error below a specified value for the accepted models.

2.3 Background

Model falsification tries to eliminate invalid models from a set of candidate models [52, 53]. However, the set of valid models is different from the unfalsified model set (Figure 2.1). Valid models that are falsified produce type I errors (false positives) whereas invalid models that are accepted produce type II errors (false negatives). Different model falsification methods try to minimize one or both of these two errors. This section first provides a brief background of error domain model falsification; then, the use of different multiple test correction methods, which trade off committing type I and type II errors, is discussed.

2.3.1 Error Domain Model Falsification

Error domain model falsification [42] evaluates the error between the predicted model response and the true value of a measurement. For a model, specified within some model class $\mathcal{M}$ by the value of an $n_\theta \times 1$ parameter vector $\theta$, let the vector function $h(\theta)$ denote the $N_o$ model outputs; the corresponding measurements are given by $d$. The difference between the $i$th predicted model output $h_i(\theta)$ and the associated modeling error $\epsilon_{h,i}$ is equal to the “true” system response $Q_i$, which is also equal to the difference between the measurement $d_i$ and its corresponding measurement error $\epsilon_{d,i}$ [45]. Thus,
\[ h_i(\theta) - \epsilon_{h,i} = Q_i = d_i - \epsilon_{d,i} \tag{2.1} \]
Neither $Q_i$ nor the error values $\epsilon_{h,i}$ and $\epsilon_{d,i}$ are ever known precisely; only the observed residuals $\epsilon_i = \epsilon_{h,i} - \epsilon_{d,i} = h_i(\theta) - d_i$ are known.
The residuals are modeled as (typically, continuous) random variables, characterized in Goulet and Smith [45] by marginal probability densities $p_{E_i}(e_i|\theta)$, where $E_i$ is the random variable denoting the difference between a model’s $i$th output prediction and the corresponding actual measured response, $e_i$ is a possible value of random variable $E_i$, and $\epsilon_i$ is the actual difference (i.e., the one realization of $E_i$ computed from measured response $d_i$); note that $\epsilon_i$ depends implicitly on the model and its parameter vector $\theta$. Clearly, the forms of the density functions $p_{E_i}(e_i|\theta)$ are an integral part of the model class $\mathcal{M}$ as well, with any density function parameters specified by the particular model through its specific parameter vector $\theta$.

The model defined by $\theta$ is falsified by the $i$th comparison point (e.g., measurement location or time) if the difference $\epsilon_i$ between the predicted and measured values falls outside the interval defined by threshold bounds [45], denoted $[\underline{\epsilon}_i, \bar{\epsilon}_i]$ herein. These bounds are chosen such that the resulting $N_o$-dimensional domain contains the hypothetical residuals $E_i$ with probability greater than or equal to a user-defined target identification probability $\phi \in (0, 1)$; i.e., $P\left(\bigcap_{i=1}^{N_o} [\underline{\epsilon}_i \le E_i \le \bar{\epsilon}_i]\right) \ge \phi$. The bounds are then defined by the narrowest interval satisfying
\[ \phi^{1/N_o} = \int_{\underline{\epsilon}_i}^{\bar{\epsilon}_i} p_{E_i}(e_i|\theta)\, de_i, \quad i = 1, \ldots, N_o \tag{2.2} \]
(for some distributions, additional constraints may be required for the bounds to be unique). When used to falsify models, this criterion essentially asserts that the probability of retaining (i.e., not falsifying) valid models will be greater than or equal to $\phi$ regardless of any relationships between the residuals $E_i$ [45], with equality when the $E_i$ are independent. Thus, models will be falsified when they fail to satisfy the inequalities
\[ \underline{\epsilon}_i \le \epsilon_i = h_i(\theta) - d_i \le \bar{\epsilon}_i, \quad \forall i \in \{1, \ldots, N_o\} \tag{2.3} \]
Models that are not falsified are retained, called unfalsified models, and assumed to be equally likely explanations of the observed behavior.

Error domain model falsification achieves the task of falsification in the presence of multiple uncertainties (e.g., material properties and loading) by aggregating them to determine their combined effects on the measurement. However, all resulting unfalsified models are of equal importance as this method does not provide any confidence level on an individual unfalsified model. (Methods such as bootstrap aggregating, a.k.a. bagging, can be used to judge the relative value of remaining models, though a thorough analysis of such methods is beyond the scope of this study.)

2.3.2 Multiple Comparison Test Corrections

Another approach, which can be equivalent to error domain model falsification using (2.2) and (2.3), is to convert the observed residual errors $\epsilon_i$ (model predictions minus the measured values) into corresponding p-values that must all be at least as large as some significance level for the model to be retained; the model is rejected (falsified) if any p-value is smaller than the significance level. Using the notation herein, the p-values for two-sided tests can be defined as [54]:
\[ p_i = 2 \min\{P(E_i \le \epsilon_i|\theta),\, P(E_i \ge \epsilon_i|\theta)\} = 2 \min\left\{\int_{-\infty}^{\epsilon_i} p_{E_i}(e_i|\theta)\, de_i,\ \int_{\epsilon_i}^{\infty} p_{E_i}(e_i|\theta)\, de_i\right\}, \quad i = 1, \ldots, N_o \]
(where the factor of 2 is used because a model is falsified if a residual is below some lower bound or above some upper bound); note that $0 \le p_i \le 1$. For zero-mean symmetric distributions, this definition implies $p_i = 2P(E_i \ge |\epsilon_i|)$. The model will then be falsified if it fails to satisfy any of the inequalities
\[ p_i \ge \bar{\alpha}_i \quad \forall i \in \{1, \ldots, N_o\} \tag{2.4} \]
where $\bar{\alpha}_i$ is the significance level of the test for the $i$th measurement.
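As a minimal sketch of the p-value test (2.4), assuming zero-mean Gaussian residuals with known standard deviations, the p-values and the per-measurement comparison could be computed as follows. The function names `two_sided_p_value` and `is_unfalsified`, and the numerical values in the usage lines, are illustrative assumptions and do not come from the original work.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def two_sided_p_value(residual, sigma):
    """p-value 2*P(E >= |residual|) for a zero-mean Gaussian residual model.

    Assumption: the model class prescribes E ~ N(0, sigma**2).
    """
    z = abs(residual) / sigma
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z))


def is_unfalsified(residuals, sigmas, alpha_bars):
    """True if every p-value is at least its significance level, as in (2.4)."""
    p_values = [two_sided_p_value(r, s) for r, s in zip(residuals, sigmas)]
    return all(p >= a for p, a in zip(p_values, alpha_bars))


# Hypothetical usage: two measurements, each tested at an assumed level of 0.025.
residuals = [-1.0, 0.5]  # observed residuals h_i(theta) - d_i
sigmas = [1.0, 1.0]      # assumed residual standard deviations
print([round(two_sided_p_value(r, s), 4) for r, s in zip(residuals, sigmas)])
print(is_unfalsified(residuals, sigmas, [0.025, 0.025]))
```

For other residual distributions, only `two_sided_p_value` would need to change; the comparison against the significance levels stays the same.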
2.3.3 Family-Wise Error Rate (FWER)

The significance levels can be chosen in multiple ways, each of which can have a different influence on the likelihood of errors in model falsification. One way is to use the family-wise error rate (FWER) for testing multiple hypotheses (e.g., measurements at multiple locations or at multiple times). The FWER is the probability of making one or more type I errors (i.e., incorrectly rejecting a valid model). There are two common approaches for controlling the FWER: the Šidák correction and the Bonferroni correction [55, 56].

To keep the FWER at $\alpha$ for $N_o$ simultaneous hypothesis tests, the Šidák correction (which is exact when the tests are independent) sets each test’s significance level at
\[ \bar{\alpha}_i = 1 - (1 - \alpha)^{1/N_o}, \quad i = 1, \ldots, N_o \tag{2.5} \]
The p-value test inequalities (2.4) are equivalent to the error domain model falsification ([44, 24, 45, 25]) inequalities (2.3) if the significance levels are Šidák-chosen using (2.5) with $\alpha = 1 - \phi$, with a resulting per-measurement target probability $\bar{\phi}_i = \phi^{1/N_o} = 1 - \bar{\alpha}_i$.

Alternately, the Bonferroni correction, which can be easily employed and yields bounds very similar to the Šidák correction, controls the FWER at $\alpha$ by setting the per-test significance level at
\[ \bar{\alpha}_i = \alpha/N_o, \quad i = 1, \ldots, N_o \tag{2.6} \]
and the target probability of each test becomes $\bar{\phi}_i = 1 - (1 - \phi)/N_o$. Thus, the Bonferroni correction is a first-order Taylor series expansion of the Šidák correction (2.5) for small $\alpha$. This correction is always more conservative (falsifies fewer models) than the Šidák correction since the Bonferroni correction requires no assumption of test independence.

2.3.4 False Discovery Rate (FDR)

The false discovery rate (FDR) was introduced by Benjamini and Hochberg [51]. Controlling the false discovery rate (FDR), as opposed to the FWER, is proposed here as an alternate criterion for selecting the significance levels or bounds in model falsification; this is the first such use of FDR (to the authors’ knowledge).
FDR attempts to control the fraction of measurement rejections that are incorrect (where each measurement rejection indicates that the candidate model is overall invalid). To define FDR, consider that: (a) a model with $N_o$ outputs (locations and/or times) may provide a valid or invalid prediction of each output (i.e., it is possible that the model accurately predicts some outputs but poorly predicts others); further, (b) for each of the $N_o$ measurements, a comparison test with the corresponding model output will indicate whether the model should be accepted or rejected (falsified) based on that measurement. This results in the matrix of outcomes shown in Table 2.1, composed of the following random variables (that must sum to $N_o$):

$N_{va}$ = # of outputs for which a model is valid and the test accepts the model
$N_{vr}$ = # of outputs for which a model is valid and the test rejects the model
$N_{ia}$ = # of outputs for which a model is invalid and the test accepts the model
$N_{ir}$ = # of outputs for which a model is invalid and the test rejects the model

Table 2.1: The number of outcomes among the $N_o$ hypothesis tests between measured data and the outputs of a particular candidate model.

# of outputs for which a model is: | accepted | rejected | Total
valid | $N_{va}$ | $N_{vr}$ | $N_v$
invalid | $N_{ia}$ | $N_{ir}$ | $N_o - N_v$
Total | $N_o - N_r$ | $N_r$ | $N_o$

$N_r = N_{vr} + N_{ir}$ is the total number of output residuals indicating that the model should be rejected. $N_v = N_{vr} + N_{va}$ is the number of model outputs for which the model has valid predictions, which is never known in any real application. Similarly, the separation of the $N_r$ rejections into $N_{vr}$ and $N_{ir}$, as well as the separation of the $N_o - N_r$ acceptances into $N_{va}$ and $N_{ia}$, are unknown.
Using these counts of possible outcomes, the FDR is defined as the expected fraction of measurement rejections that are incorrect, i.e., the expected value of the ratio of the number of times that a valid model prediction is incorrectly rejected ($N_{vr}$) to the total number of times the prediction is rejected ($N_r$), where this ratio ($N_{vr}/N_r$) is defined to be zero when $N_r = 0$; i.e.,
\[ \mathrm{FDR} = E\left[\left.\frac{N_{vr}}{N_r}\,\right|\, N_r > 0\right] P(N_r > 0) \tag{2.7} \]
Using this definition, the Benjamini-Hochberg (BH) procedure for controlling the FDR at $\alpha$ does indeed achieve $\mathrm{FDR} \le \alpha$ [51]. On the other hand, FWER control at $\alpha$ requires that the probability of incorrectly rejecting valid models at least one time among the $N_o$ comparisons be no greater than $\alpha$; i.e., $\mathrm{FWER} = P(N_{vr} \ge 1) \le \alpha$. In problems involving multiple comparison tests, FDR control ensures, on average, that the fraction of rejections that are false positives is below $\alpha$, and provides greater statistical power (detailed below) than FWER control while allowing some false positive results [57]. Thus, relative to FWER, FDR should reduce the likelihood of false negatives while performing as well in false positives; i.e., FDR should perform as well as FWER in retaining valid models but better in falsifying invalid models.

The statistical power is defined as the probability of rejecting a model when it is invalid, i.e.,
\[ \text{Statistical Power} = \frac{N_{ir}}{N_{ir} + N_{ia}} = \frac{N_{ir}}{N_o - N_v} \tag{2.8} \]
and it decreases as the probability of type II error increases. For a single measurement case, the statistical power is the probability of rejecting an invalid model. Figure 2.2 shows an example where the model $\theta = \theta_{\mathrm{true}}$ is the true model. Given $p_E(e|\theta)$, any residual falling outside $[\underline{\epsilon}, \bar{\epsilon}]$, on either tail, is rejected. Hence, an invalid model with $\theta = \theta_{\mathrm{invalid}}$ will be rejected with probability $1 - \beta$, which is equal to the (green) shaded area $\{e : e < \underline{\epsilon}\}$, and is the statistical power; $\beta$ is the probability of accepting the invalid model, which corresponds to the (red) shaded area $\{e : e > \underline{\epsilon}\}$.
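For the single-measurement Gaussian case, the power can be evaluated in closed form. The sketch below assumes that the model class prescribes zero-mean Gaussian residuals with standard deviation `sigma`, and that the residual of an invalid model has the same spread but a mean offset `delta_mu` (playing the role of $\Delta\mu_E$ in Figure 2.2). The function name `single_test_power` is hypothetical, and both rejection tails are counted, so the result is slightly larger than the single shaded tail described in the text.

```python
from statistics import NormalDist


def single_test_power(delta_mu, sigma, alpha):
    """Probability that a residual drawn from N(delta_mu, sigma^2) falls outside
    the two-sided acceptance interval built for N(0, sigma^2) at level alpha."""
    std_normal = NormalDist()
    upper = -sigma * std_normal.inv_cdf(alpha / 2.0)  # upper error bound (> 0)
    lower = -upper                                    # symmetric lower error bound
    shifted = NormalDist(mu=delta_mu, sigma=sigma)    # residual density of the invalid model
    beta = shifted.cdf(upper) - shifted.cdf(lower)    # probability of wrongly accepting
    return 1.0 - beta


# Hypothetical mean offsets, in units of sigma, at an assumed level alpha = 0.05.
for shift in (0.0, -1.0, -3.0):
    print(shift, round(single_test_power(shift, sigma=1.0, alpha=0.05), 4))
```

With a zero offset the function returns the significance level itself, which is the type I error rate rather than a power in the usual sense; the power grows as the offset moves the invalid model's residual density away from the acceptance interval.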
Figure 2.2: Statistical power of a single comparison test with one measurement. (The figure plots the probability densities $p_E(e|\theta_{\mathrm{true}})$ and $p_E(e|\theta_{\mathrm{invalid}})$ against $e$, marking the mean offset $\Delta\mu_E = \mu_{E|\theta_{\mathrm{invalid}}} - \mu_{E|\theta_{\mathrm{true}}}$, the tail areas $\alpha/2$, the statistical power $1 - \beta$, and $\beta$. Note: if the density is Gaussian, as drawn, then $[\underline{\epsilon}, \bar{\epsilon}] = \pm\sigma\Phi^{-1}(\alpha/2)$, where $\sigma$ is the model’s assumption on the residual error standard deviation and $\Phi(\cdot)$ is the standard unit normal cumulative distribution function.)

Benjamini-Hochberg Procedure for FDR Control

Benjamini and Hochberg [51] proposed an algorithm, known as the Benjamini-Hochberg (BH) procedure, for keeping the FDR at a pre-chosen level $\alpha$ by sequential adjustment of ascending-ordered p-values for each (independent) test (though it is also valid for controlling FDR with positively-correlated residuals). The BH procedure is based on the Simes [58] test procedure for multiple hypotheses testing. Let the measurement numbers be ordered, for evaluating a candidate model, such that the p-values $\{p_1, p_2, \ldots, p_{N_o}\}$ are sorted starting with the one corresponding to the most extreme residual; i.e., $0 \le p_1 \le p_2 \le \cdots \le p_{N_o} \le 1$. Then, with multiple measurements available, the BH procedure sets the per-measurement significance levels at
\[ \bar{\alpha}_i = \frac{i}{N_o}\,\alpha, \quad i = 1, \ldots, N_o \tag{2.9} \]
Then, for a model to be retained, criteria (2.4) must be satisfied. For example, if there are $N_o = 4$ outputs and the model class $\mathcal{M}$ assumes zero-mean Gaussian residuals, then the following demonstrates one realization of one model’s residual errors $\epsilon_i$ and their corresponding p-values $p_i = 2P(E_i \ge |\epsilon_i|)$, as well as the reordered p-values and the corresponding significance levels to which they are compared:
\[
\begin{aligned}
\epsilon_1 = -\sigma_1 \ &\Rightarrow\ p_1 = 2\left[1 - \Phi\left(\tfrac{1\sigma_1}{\sigma_1}\right)\right] = 2 - 2\Phi(1) = 0.3173 \\
\epsilon_2 = \tfrac{1}{2}\sigma_2 \ &\Rightarrow\ p_2 = 2\left[1 - \Phi\left(\tfrac{\sigma_2}{2\sigma_2}\right)\right] = 2 - 2\Phi\left(\tfrac{1}{2}\right) = 0.6171 \\
\epsilon_3 = -2\sigma_3 \ &\Rightarrow\ p_3 = 2\left[1 - \Phi\left(\tfrac{2\sigma_3}{\sigma_3}\right)\right] = 2 - 2\Phi(2) = 0.0455 \\
\epsilon_4 = \tfrac{3}{2}\sigma_4 \ &\Rightarrow\ p_4 = 2\left[1 - \Phi\left(\tfrac{3\sigma_4}{2\sigma_4}\right)\right] = 2 - 2\Phi\left(\tfrac{3}{2}\right) = 0.1336
\end{aligned}
\quad \overset{\text{reorder}}{\Longrightarrow} \quad
\begin{aligned}
0.0455 &\overset{?}{\ge} \bar{\alpha}_1 = \tfrac{1}{4}\alpha \\
0.1336 &\overset{?}{\ge} \bar{\alpha}_2 = \tfrac{2}{4}\alpha \\
0.3173 &\overset{?}{\ge} \bar{\alpha}_3 = \tfrac{3}{4}\alpha \\
0.6171 &\overset{?}{\ge} \bar{\alpha}_4 = \tfrac{4}{4}\alpha
\end{aligned}
\]
Thus, FDR/BH uses smaller significance levels for the measurements that are more extreme outliers, and larger significance levels for residuals that are not outliers, enabling problems with a large number of measurements since FDR/BH allows for one or a few outliers as long as the rest of the measurements are well within expected ranges.

Note: While not used herein, two other FDR algorithms are: (1) one proposed by Benjamini and Yekutieli [59] to control FDR for general dependent test statistics, given by
\[ \bar{\alpha}_i = \frac{i}{N_o} \left( \frac{1}{\sum_{j=1}^{N_o} 1/j} \right) \alpha, \quad i = 1, \ldots, N_o \tag{2.10} \]
(which is a looser and more conservative test that falsifies fewer models, thereby decreasing the incidence of false positives at the cost of more false negatives) and (2) the positive false discovery rate by Storey [60], which uses a slightly different error measure $E[(N_{vr}/N_r)\,|\,N_r > 0]$ for multiple hypothesis tests, with a control algorithm that fixes the rejection region with a p-value analogue. Additionally, FDR has been used [61] to define confidence intervals for model parameters; this is only similar to the methodology proposed herein when the system has simple (particularly, one-to-one and monotonic) mappings between model parameters and prediction errors, which is typically not the case for most engineering applications (including structural/mechanical ones).

Relationship between FWER and FDR

To examine the relationship between FWER and FDR, consider two cases adapted from Benjamini and Hochberg [51]. (1) If a model is valid (i.e., an adequate prediction for each and every measurement), then $N_v = N_o$, which means $N_{ir} = 0$ and $N_r = N_{vr}$, which leads to $\mathrm{FDR} = 1 \cdot P(N_r > 0) = P(N_{vr} \ge 1) = \mathrm{FWER}$. (2) If a model is not an adequate prediction for every measurement, then $N_v < N_o$, so $N_{vr}/N_r \le 1$ for $N_{vr} > 0$, which leads to $\mathrm{FDR} = E[(N_{vr}/N_r)\,|\,N_{vr} > 0] \cdot P(N_{vr} \ge 1) \le 1 \cdot P(N_{vr} \ge 1) = \mathrm{FWER}$. Thus, FWER control at $\alpha$ also ensures FDR control at or below $\alpha$.
However, as the number of measurements increases, FDR provides greater statistical power, rejecting more invalid models than does FWER control [51]. Hence, FDR control is beneficial for a significant gain in rejecting invalid models.

To better understand the difference between these two error control criteria, the authors introduce in Figure 2.3 visualizations of the error bounds specified by FDR control (computed with the BH-procedure) compared to those by the FWER control with the Šidák correction. Figure 2.3a shows the two-dimensional case: models with residual errors $\epsilon_1$ and $\epsilon_2$ inside the (red) dashed square region are accepted by FWER/Šidák, and those inside the (blue) plus-shaped region with the solid boundary are accepted by FDR/BH. Figure 2.3c shows the three-measurement residual error volumes within which a model will be unfalsified: within the (blue) solid-line volume utilizing the FDR/BH procedure and the (red) dashed-line cube utilizing FWER/Šidák. FDR/BH ignores the corners where all residual error components have high values, but adds the side strips where all but one of the residual error components have values close to their means; this provides significant benefit for problems with many measurements as it allows for one (or a few) residuals to be further outliers than allowed by FWER/Šidák as long as the rest of the residuals are much more likely.

Figure 2.3: Comparison of residual errors’ areas/volumes within which a model is unfalsified by FDR/BH (solid-line area/volume, which is blue in the electronic version) and by FWER/Šidák (dashed-line square/cube, which is red in the electronic version). Panels: (a) 2 measurements, axes $e_1$ and $e_2$; (b) 3 measurements, axes $e_1$, $e_2$ and $e_3$; (c) 3 measurements, FDR/BH only. (Note: the axes are scaled by the ranges of the bounds; if the distributions of the $\epsilon_i$ were different and no scaling were applied, the square/cube would become a rectangle/cuboid.)
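The three corrections discussed in this section differ only in the per-measurement significance levels that they assign. The following sketch, with hypothetical function names and an assumed $\alpha = 0.05$, computes the levels from (2.5), (2.6) and (2.9) and applies the retention rule as stated in the text (every ascending-ordered p-value must be at least its significance level) to the four p-values of the worked example above. It illustrates the criterion as described here and is not a general-purpose implementation of the BH step-up procedure.

```python
def sidak_levels(alpha, n_obs):
    """Per-test significance levels from the Sidak correction, as in (2.5)."""
    return [1.0 - (1.0 - alpha) ** (1.0 / n_obs)] * n_obs


def bonferroni_levels(alpha, n_obs):
    """Per-test significance levels from the Bonferroni correction, as in (2.6)."""
    return [alpha / n_obs] * n_obs


def bh_levels(alpha, n_obs):
    """Rank-dependent significance levels of the BH procedure, as in (2.9)."""
    return [i * alpha / n_obs for i in range(1, n_obs + 1)]


def is_retained(p_values, levels):
    """Retention rule as stated in the text: after sorting the p-values in
    ascending order, every p-value must be at least its significance level."""
    return all(p >= a for p, a in zip(sorted(p_values), levels))


# p-values from the four-measurement example; alpha = 0.05 is an assumed value.
p_values = [0.3173, 0.6171, 0.0455, 0.1336]
alpha = 0.05
for name, rule in [("Sidak", sidak_levels),
                   ("Bonferroni", bonferroni_levels),
                   ("BH", bh_levels)]:
    levels = rule(alpha, len(p_values))
    print(name, [round(a, 4) for a in levels], is_retained(p_values, levels))
```

Under these definitions, a hypothetical p-value set such as [0.02, 0.02, 0.5, 0.5] is retained with the Šidák levels but not with the BH levels (two moderately extreme residuals, i.e., a corner of the square or cube in Figure 2.3), whereas [0.0126, 0.9, 0.9, 0.9] is retained with the BH levels but not with the Šidák levels (one outlier with all other residuals near their means, i.e., a side strip).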
2.4 Model Falsification Methodology

Two model falsification approaches are investigated in this chapter: one based on error bounds and the other based on likelihood bounds. In each approach, the bounds are chosen using one of several procedures for multiple comparison test corrections. Error domain model falsification [42], discussed in the background section, is an error-bound approach using FWER/Šidák. The next two subsections outline model falsification methodologies based on error bounds, including the proposed use of FDR instead of FWER, and the proposed likelihood bounds, respectively. The falsification methodologies investigated herein are summarized in Table 2.2.

Table 2.2: Falsification methodologies employed herein.

Bounds | Correction | Controls | Comments
Error | Šidák | FWER | error domain model falsification [42]
Error | Bonferroni | FWER | similar to Šidák correction
Error | BH procedure | FDR | proposed here
Likelihood | Šidák | | proposed here
Likelihood | Bonferroni | | proposed here
Likelihood | BH procedure | | proposed here
Likelihood | Constant probability mass | | equivalent to ellipsoidal error bound

2.4.1 Error-Bound Model Falsification

Figure 2.4: Error-bound model falsification procedure. (Flowchart elements: choose the target probability $\phi = 1 - \alpha$; choose $N_s$ models $\theta_1, \theta_2, \ldots, \theta_{N_s}$ via expert judgement; probabilistic knowledge of the residuals $p_E(e|\theta, \mathcal{M})$; compute $\bar{\phi}_i$ or $\bar{\alpha}_i$ by the Šidák, Bonferroni, or BH procedure; guess or compute the bounds $\underline{\epsilon}_i$, $\bar{\epsilon}_i$; bounds-based model falsification of the data using FWER or FDR; some models falsified.)

Figure 2.4 shows a general flowchart of the steps for model falsification methods. The inputs to the methods are the target probability $\phi$, the distribution $p_E(e|\theta, \mathcal{M})$ of residual errors for a model class $\mathcal{M}$ (this distribution may be assumed based on prior knowledge or obtained from a sampling approach), and a set of $n_s$ models each specified by a value of parameter vector $\theta$. The target probability $\phi$, which helps decide the bounds, can be tuned to adjust hypothesis testing tradeoffs; in
standard statistics literature, $\phi = 0.95$ or $0.9$ is generally assumed. From the target probability $\phi$, the next step is to compute the per-measurement significance $\bar{\alpha}_i$ or target probability $\bar{\phi}_i = 1 - \bar{\alpha}_i$, which can be computed using various multiple-test corrections; three correction types are evaluated herein, namely, the Šidák and Bonferroni corrections controlling the family-wise error rate (FWER) and the BH-procedure to control the false discovery rate (FDR). Either the falsification is done using p-values, as in (2.4), or equivalently (and used herein) via the error bounds $[\underline{\epsilon}_i, \bar{\epsilon}_i]$, which are specified for a particular model by (2.2) or, more precisely, by
\[ \tfrac{1}{2}\bar{\alpha}_i = P(E_i \le \underline{\epsilon}_i|\theta) = P(E_i \ge \bar{\epsilon}_i|\theta) = \int_{-\infty}^{\underline{\epsilon}_i} p_{E_i}(e_i|\theta)\, de_i = \int_{\bar{\epsilon}_i}^{\infty} p_{E_i}(e_i|\theta)\, de_i, \quad i = 1, \ldots, N_o \tag{2.11} \]
where $\bar{\alpha}_i$ is given in (2.5) for FWER/Šidák, in (2.6) for FWER/Bonferroni and in (2.9) for FDR/BH.

2.4.2 Likelihood-Bound Model Falsification

The uncertainties in models are framed probabilistically in terms of the distribution of the residual errors between model-predicted and measured values. Goulet et al. [25] argued that, even after using various uncertainty propagation methods (e.g., interval analysis, probability bound analysis), sufficient information about the combined uncertainties may not be available. Hence, rather than a simple uniform distribution, they suggested using curvilinear distributions (e.g., extended uniform distributions). On the other hand, the principle of maximum entropy suggests that this residual error distribution should simply be assumed Gaussian [62–64]; further, the noise in sensors is generally estimated to be Gaussian distributed [65].

The likelihood of the measured data set $\mathcal{D}$ (i.e., the probability density of a specific model predicting the measurement) is given by $p(\mathcal{D}|\theta, \mathcal{M})$, which is often called the likelihood function and equivalently denoted by $L(\theta; \mathcal{D})$.
For a single-measurement case, where $\mathcal{D} = \{d\}$ contains just the one measurement, the likelihood function is defined for a model $\theta$ in model class $\mathcal{M}$ using the probability distribution of the residual error $\epsilon = h(\theta) - d$; i.e., $L(\theta; \mathcal{D}) = p(\mathcal{D}|\theta, \mathcal{M}) = p_E(\epsilon|\theta, \mathcal{M}) = p_E(h(\theta) - d|\theta)$. Assuming a Gaussian $p_E(e|\theta)$, the likelihood function is
\[ L(\theta; \mathcal{D}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{[h(\theta) - d]^2}{2\sigma^2} \right) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{\epsilon^2}{2\sigma^2} \right) \tag{2.12} \]
where $\sigma$ is the assumed (based on prior experience or expert judgment) standard deviation of the residual between measurement and prediction. Here, a model will be accepted if:
\[ L(\theta; \mathcal{D}) > \underline{L} \tag{2.13} \]
where $\underline{L}$ is a likelihood lower bound, defined in subsequent paragraphs.

When the measured data set $\mathcal{D} = \{d\}$ contains multiple measurements (i.e., $d$ is an $N_o \times 1$ vector) and if the residuals are Gaussian distributed, the likelihood function can be defined
\[ L(\theta; \mathcal{D}) = p_E(h(\theta) - d|\theta) = \frac{\exp\left( -\frac{1}{2} [h(\theta) - d]^T \Sigma^{-1} [h(\theta) - d] \right)}{(2\pi)^{N_o/2}\, |\Sigma|^{1/2}} = \frac{\exp\left( -\frac{1}{2} \epsilon^T \Sigma^{-1} \epsilon \right)}{(2\pi)^{N_o/2}\, |\Sigma|^{1/2}} \tag{2.14} \]
where $\Sigma$ is the assumed covariance (or chosen based on some uncertainty propagation analysis) of the residuals $\epsilon = h(\theta) - d$.

The choice of likelihood lower bound $\underline{L}$ can be made in several ways; two are studied herein:

1. Likelihood bound based on error bounds: Determine $\underline{L}$ from the residual error bounds $[\underline{\epsilon}_i, \bar{\epsilon}_i]$. For the single-measurement case, a lower bound for likelihood is determined as
\[ \underline{L} = \min_{\underline{\epsilon} \le e \le \bar{\epsilon}} p_E(e|\theta) \tag{2.15} \]
where $\underline{\epsilon}$ and $\bar{\epsilon}$ are calculated using (2.11) for $N_o = 1$. If $p_E(e|\theta)$ is unimodal, then $\underline{L} = \min\{p_E(\underline{\epsilon}|\theta),\, p_E(\bar{\epsilon}|\theta)\}$; if the density is also symmetric, then $\underline{L} = p_E(\underline{\epsilon}|\theta) = p_E(\bar{\epsilon}|\theta)$. With multiple measurements, the value of $\underline{L}$ may be similarly computed
\[ \underline{L} = \prod_{i=1}^{N_o} \min_{\underline{\epsilon}_i \le e_i \le \bar{\epsilon}_i} p_{E_i}(e_i|\theta) \tag{2.16} \]
(Note that the choice of (2.16) is not unique; its elements could be determined in alternate ways, particularly for asymmetric distributions; these other possibilities are beyond the scope of this study but will be investigated in the future.)
Figure 2.5: Typical single-measurement residual density. (Plot of the PDF $p_E(e \mid \theta)$ versus residual $e$; the shaded area between the error bounds is $\bar{\phi} = 1 - \bar{\alpha}$, each tail has area $\bar{\alpha}/2$, and the likelihood bound $\underline{L}$ is marked.)

Figure 2.6: Likelihood bound for the two-measurement case. (a) FWER/Šidák, where $\underline{\epsilon}_j$ and $\bar{\epsilon}_j$ correspond to significance $\bar{\alpha} = 1 - (1 - \alpha)^{1/2}$ and $\underline{L} = \underline{L}_1 \underline{L}_2$; (b) FDR/BH, where $\underline{\epsilon}_j$ and $\bar{\epsilon}_j$ correspond to significance $\bar{\alpha}_1 = \alpha/2$, $\underline{\epsilon}'_j$ and $\bar{\epsilon}'_j$ correspond to significance $\bar{\alpha}_2 = \alpha$, and $\underline{L} = \underline{L}'_1 \underline{L}_2 = \underline{L}_1 \underline{L}'_2$. (Note: while the joint densities are depicted for uncorrelated measurements, this need not be assumed.)

The relations between $[\underline{\epsilon}_i, \bar{\epsilon}_i]$ and $\underline{L}$ are shown graphically in Figures 2.5 and 2.6 for the single-measurement and two-measurement cases, respectively. The error bounds $[\underline{\epsilon}_i, \bar{\epsilon}_i]$, from which the likelihood bound $\underline{L}$ is chosen, can be computed in several ways, as previously discussed, to provide three variants of this approach for computing $\underline{L}$:

1a. chosen using FWER/Šidák
1a′. chosen from FWER/Bonferroni
1b. chosen from FDR/BH.

2. Likelihood bound chosen using a constant probability mass (CPM): Another approach to choose the likelihood bound fixes the probability mass inside the hypervolume of allowable residual errors. This likelihood bound is determined, for Gaussian residuals, by

\[
\underline{L} = \frac{1}{(2\pi)^{N_o/2}\, |\Sigma|^{1/2}} \exp\!\left( -\frac{R_\phi^2}{2} \right) \tag{2.17}
\]

where $R_\phi^2$ is obtained from $P(X \le R_\phi^2) = \phi$, where $X$ is $\chi^2$ (chi-square) distributed with $N_o$ degrees of freedom. This is equivalent to using the $N_o$-dimensional ellipsoidal error bound used in model calibration [66], which is given by

\[
d_M = \text{Mahalanobis [67] distance} = \sqrt{\boldsymbol{\epsilon}^T \Sigma^{-1} \boldsymbol{\epsilon}} < R_\phi \tag{2.18}
\]

Goulet and Smith [45] argued that incorrect knowledge of the residual error covariance structure will significantly increase type I errors.
However, the numerical examples presented herein will demonstrate that this method keeps the probability of type I errors constant and gives the fewest unfalsified models.

2.4.3 Relationships between Error-Bound and Likelihood-Bound Methods

If the target probability $\phi$ is the same for both error- and likelihood-bound methods, and the residuals are independent and identically distributed (i.i.d.), then the following relationships between the error-bound methods and their corresponding likelihood-bound methods can be observed:

(1) In Figure 2.7, all models with residuals falling inside the (red) dashed square will be accepted using an error-bound method with the FWER/Šidák correction. A likelihood-bound method designed to accept the corner points in the shaded square will accept all models inside the (red) dashed circumscribing circle $\underline{L}_1$, including some residuals outside the shaded square. Thus, the circumscribed likelihood bound will accept (unfalsify) more models than the corresponding error-bound method; the ratio between the two will become very large as the number of measurements $N_o$ grows, because the volume of the circumscribing hypersphere becomes very large compared to that of the hypercube formed by the error bounds.

(2) An inscribed circle with likelihood bound $\underline{L}_3$, unfalsifying fewer models than the corresponding error-bound method, could instead be considered. However, as the number of measurements becomes large, the volume of the inscribed hypersphere, relative to that of the FWER hypercube, tends to zero (a conceptual view can be found in Figure 6.4 of Zaki and Meira Jr. [68]), leaving very few unfalsified models and likely rejecting many valid models.

(3) A possible middle ground, between FWER's circumscribed and inscribed hyperspheres, is the likelihood-bound hypersphere, denoted $\underline{L}_2$ in Figure 2.7, that is circumscribed about the unfalsified region of FDR/BH.
(Note that a middle likelihood bound could possibly be conceived based on the FWER correction, but it would likely be based on some ad hoc metric.)

(4) Error- and likelihood-bound methods can also be combined by evaluating likelihood values for the models unfalsified by an error-bound method. The likelihoods are typically very small for the models unfalsified by a circumscribed-hypersphere likelihood-bound method but falsified by the corresponding error-bound hypercube; unless computational resources are of primary concern, the likelihood-bound approach may be just as useful by falsifying but also providing information on the relative confidence in the various unfalsified models.

Figure 2.7: Relation between error- and likelihood-bound methods for FWER/Šidák and FDR/BH, shown in the $(e_1, e_2)$ plane. Note: $\underline{L}_1 \le \underline{L}_2 \le \underline{L}_3$.

Figure 2.8: Error residuals for $N_o$ measurements with larger residual on one sensor measurement ($\epsilon_3$).

(5) Although the likelihood-bound methods with circumscribed hyperspheres accept more models than their error-bound counterparts, the distribution of the residual errors (e.g., Gaussian, curvilinear, etc.) provides each unfalsified model's likelihood, which identifies the models of greatest importance (a weighting procedure is discussed in the next section). Further, consider the case in which one or a few (of many) sensors are damaged, or some faults develop in the data acquisition system, producing one or a few incorrect measurements: as shown in Figure 2.8, error-bound methods will always falsify a model that has a single incorrect measurement (see the red dot on the third measurement in Figure 2.8) even though all other residuals are very small.
The likelihood-bound methods, however, provide a more flexible approach by retaining that model because the likelihood values for the other residual errors will be very high, so that $L(\theta; \mathcal{D}) > \underline{L}$ even though the one residual falls slightly outside the hypercube defined by the error bounds.

In conclusion, the likelihood-bound methods constructed in this chapter based on the Bonferroni, Šidák, and BH procedures result in more unfalsified models than their error-bound counterparts, but only a small number of those unfalsified models have significantly high likelihood (i.e., importance) values. Further, the use of (circumscribed) FDR/BH likelihood bounds provides a middle ground relative to circumscribed or inscribed FWER/Šidák bounds.

2.4.4 Likelihood as Metric of Model Confidence

Whether error or likelihood bounds are used to falsify models, the likelihood values, which are not demanding to calculate from the residuals, can be exploited: a model's likelihood is a measure of confidence in the model relative to other models; unfalsified models with low likelihood are still retained, but are likely less useful than models with greater likelihood. A maximum likelihood approach (or other uses of the likelihood values) can be employed to choose parameter values. (Clearly, given prior model probabilities, a standard Bayesian approach could also be used, potentially over all models and model classes, possibly using falsification to inform the priors. However, such an approach only provides a relative comparison between models and cannot indicate whether a model is suitable. For the sake of brevity, comparison with a standard Bayesian approach is not pursued herein, but will be studied in future investigations.)
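The two likelihood bounds of Section 2.4.2 can be illustrated with a minimal Python sketch. It is not the thesis' implementation and rests on stated assumptions: independent zero-mean Gaussian residuals with a common standard deviation (so $\Sigma = \sigma^2 I$), the Šidák per-measurement significance $1 - \phi^{1/N_o}$, and an even number of measurements so that the chi-square CDF has a closed form that the standard library can evaluate. All function names are hypothetical.

```python
import math
from statistics import NormalDist

LOG_2PI = math.log(2.0 * math.pi)

def log_likelihood(residuals, sigma):
    """Log of the Gaussian likelihood (2.14), assuming Sigma = sigma^2 * I."""
    n = len(residuals)
    quad = sum((e / sigma) ** 2 for e in residuals)
    return -0.5 * quad - n * math.log(sigma) - 0.5 * n * LOG_2PI

def sidak_half_width(n_obs, sigma, phi=0.95):
    """Symmetric error bound from (2.11) with the Sidak significance (assumed form)."""
    alpha_bar = 1.0 - phi ** (1.0 / n_obs)
    return sigma * NormalDist().inv_cdf(1.0 - 0.5 * alpha_bar)

def log_bound_from_error_bounds(n_obs, sigma, phi=0.95):
    """Log of (2.16): for a symmetric unimodal density, the likelihood at a corner."""
    half = sidak_half_width(n_obs, sigma, phi)
    return log_likelihood([half] * n_obs, sigma)

def chi2_cdf_even(x, dof):
    """Chi-square CDF; this closed form is valid for even dof only."""
    half = 0.5 * x
    term, total = 1.0, 1.0
    for k in range(1, dof // 2):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(phi, dof):
    """R_phi^2 with P(X <= R_phi^2) = phi, found by bisection (even dof only)."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, dof) < phi:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, dof) < phi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_bound_cpm(n_obs, sigma, phi=0.95):
    """Log of the constant-probability-mass bound (2.17), assuming Sigma = sigma^2 * I."""
    r2 = chi2_quantile_even(phi, n_obs)
    return -0.5 * r2 - n_obs * math.log(sigma) - 0.5 * n_obs * LOG_2PI

# One residual slightly outside its error bound, the others small (cf. Figure 2.8).
sigma, n_obs = 0.25, 4
half = sidak_half_width(n_obs, sigma)
residuals = [0.01, -0.02, 0.0, half + 0.01]
inside = all(abs(e) <= half for e in residuals)
loglik = log_likelihood(residuals, sigma)
bound = log_bound_from_error_bounds(n_obs, sigma)
print("inside error bounds:", inside)
print("above likelihood bound:", loglik > bound)
```

Working with log-likelihoods rather than likelihoods is a design choice of this sketch; it avoids floating-point underflow when $N_o$ is large.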
2.4.5 Robust Estimation and Robust Prediction

The posterior model probability $p(\theta \mid \mathcal{D}, \mathcal{M})$ is computed via Bayes' Theorem

\[
p(\theta \mid \mathcal{D}, \mathcal{M}) = \frac{L(\theta; \mathcal{D})\, p(\theta \mid \mathcal{M})}{\int L(\theta; \mathcal{D})\, p(\theta \mid \mathcal{M})\, d\theta} \tag{2.19}
\]

using the model likelihood $L(\theta; \mathcal{D}) = p(\mathcal{D} \mid \theta, \mathcal{M})$ and the prior model probability $p(\theta \mid \mathcal{M})$ (which is constant if, prior to data collection and analysis, all models are assumed equally likely, or non-constant based on the modeler's expert judgment or other knowledge of the distribution of $\theta$). Then, following Beck and Taflanidis [64], the model class' parameters and predictions of (future) responses can be estimated in a manner that is robust to the modeling uncertainty: by using the theorem of total probability, which is an average weighted by each model's posterior probability. The robust parameter estimation, then, is

\[
\hat{\theta} = E[\theta \mid \mathcal{D}, \mathcal{M}] = \int \theta\, p(\theta \mid \mathcal{D}, \mathcal{M})\, d\theta \tag{2.20}
\]

and the corresponding robust prediction of some quantity of interest $q(\theta \mid \mathcal{M})$ is

\[
\hat{q} = E[q(\theta \mid \mathcal{M}) \mid \mathcal{D}] = \int q(\theta \mid \mathcal{M})\, p(\theta \mid \mathcal{D}, \mathcal{M})\, d\theta \tag{2.21}
\]

2.4.6 Model Confidence and Post-falsification Robust Prediction

The falsification proposed here can be extended to quantify post-falsification model confidence, and to use unfalsified models to provide parameter and response estimates that are robust to the modeling uncertainty. In (2.19), the denominator $p(\mathcal{D} \mid \mathcal{M})$, known as the likelihood of, or the evidence for, model class $\mathcal{M}$, is the same normalization factor for all models $\theta$ in model class $\mathcal{M}$, so it need not be explicitly computed and the numerators can be used in a relative sense. Suitable post-falsification weights for an approximate Bayesian method are then

\[
W_i = \frac{\widehat{W}_i}{\sum_{j=1}^{n_s} \widehat{W}_j}, \qquad \widehat{W}_i = \begin{cases} L(\theta_i; \mathcal{D})\, p(\theta_i \mid \mathcal{M}), & \theta_i \text{ is unfalsified} \\ 0, & \theta_i \text{ is falsified} \end{cases} \tag{2.22}
\]

where the weights $W_i$ are normalized so their sum is unity, and only unfalsified models have nonzero weights.
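The weights of (2.22) can be sketched in a few lines of Python. This is only an illustration under assumptions not stated in the thesis: likelihoods are supplied as log-likelihoods, and the largest retained value is subtracted before exponentiating (a standard log-sum-exp shift) so that very small likelihoods do not underflow. The function name and the toy numbers are hypothetical.

```python
import math

def post_falsification_weights(log_likelihoods, priors, unfalsified):
    """Normalized weights W_i of (2.22); falsified models get zero weight.

    log_likelihoods: ln L(theta_i; D) for each candidate model
    priors:          prior values p(theta_i | M), assumed positive for kept models
    unfalsified:     booleans, True if model i survived falsification
    """
    log_raw = [ll + math.log(p) if keep else None
               for ll, p, keep in zip(log_likelihoods, priors, unfalsified)]
    kept = [v for v in log_raw if v is not None]
    if not kept:
        raise ValueError("all candidate models were falsified")
    shift = max(kept)  # assumption of this sketch: shift for numerical stability
    raw = [math.exp(v - shift) if v is not None else 0.0 for v in log_raw]
    total = sum(raw)
    return [w / total for w in raw]

# Toy example: three candidate models, the third one falsified.
thetas = [0.98, 1.05, 2.40]
weights = post_falsification_weights(
    log_likelihoods=[-800.0, -801.0, -10.0],
    priors=[0.25, 0.25, 0.25],
    unfalsified=[True, True, False],
)
theta_hat = sum(w * th for w, th in zip(weights, thetas))  # weighted parameter average
print(weights, theta_hat)
```

Note that the falsified third model receives zero weight even though its likelihood value is the largest in this toy input; the weights depend on the falsification outcome, not on the likelihood alone.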
It may be noted that (2.22) is approximate in that it uses only the finite number of unfalsified models, and avoids the computationally expensive evaluation of the denominator in the exact Bayesian approach (2.19). While robustness to model class uncertainty [64] could be evaluated by incorporating models from multiple model classes into (2.22), this aspect of the proposed approach is not evaluated herein. Using these weights, a parameter estimate that is robust to the uncertainty in modeling can be computed with

\[
\hat{\theta} \approx \sum_{i=1}^{n_s} W_i\, \theta_i \tag{2.23}
\]

and the corresponding robust response prediction is

\[
\hat{q} \approx \sum_{i} W_i\, q(\theta_i \mid \mathcal{M}) \tag{2.24}
\]

In most cases, the computational cost of this method is small compared to the standard Bayesian method that requires a full exploration of the posterior parameter distribution. (At worst, the computational cost of the proposed method may approach that of a standard Bayesian method if the high likelihood regions are concentrated where the prior density is very small.) The accuracies of the approximations in (2.23) and (2.24) depend, of course, on the number of unfalsified models, which should be sufficiently large (either through a sufficient number of initial candidate models so that the models are representative of the whole model class, or through a sufficiently large value of $\phi$) for accurate predictions. Alternately, an iterative strategy could be implemented in which the number of models evaluated within a model class is iteratively increased (e.g., doubled) until the fraction of models that are falsified converges (within some tolerance). Similarly, if it is found that only a very few models dominate (i.e., only a few $W_i$ are much larger than the rest), then a similar iterative process could be pursued (e.g., until the largest $W_i$ is below some threshold).

2.5 Numerical Examples

Clearly, the falsification results, whether based on error or likelihood bounds, depend on the target probability $\phi$.
Larger $\phi$ falsifies fewer models, retaining some that very poorly predict the system response (more type II errors), but also retains most valid models (fewer type I errors); smaller $\phi$ falsifies more models, including more that poorly predict system response (i.e., fewer type II errors), but at the expense of rejecting some valid models (more type I errors). The numerical examples herein use $\phi = 0.95$, a typical value in many applications of hypothesis testing, though alternate values a bit below or above 0.95 do not give significantly different falsification results.

2.5.1 Example I: Conceptual Example

Assume a model class with the simplest form of a model of multiple measurements: let

\[
h_i(\theta) = \theta + \text{bias}, \qquad i = 1, \dots, N_o \tag{2.25}
\]

i.e., each output of a model is just the model's scalar parameter $\theta$ and some (possibly non-zero) bias. Then, the data set $\mathcal{D}$ consists of measurements $\{\mathbf{d}\}$, where $\mathbf{d} = [d_1\ d_2\ \dots\ d_{N_o}]^T$, obtained from the true model that has parameter value $\theta_{\text{true}} = 1$ and no bias:

\[
d_i = 1 + \epsilon_{d,i}, \qquad i = 1, \dots, N_o \tag{2.26}
\]

where the measurement errors $\epsilon_{d,i}$ are i.i.d., each following the Gaussian distribution $\mathcal{N}(0, \sigma_d^2)$. The $n_s$ candidate models can be chosen in several ways, such as a grid-based approach or by randomly sampling $n_s$ values of $\theta$ from some distribution; in this example, they are sampled from the uniform distribution $U(-1, 3)$. The falsification is performed for $n_s = 10{,}000$ models to densely cover the parameter space, and $N_o = 100$ measurements to show the effects of multiple measurements. Three different cases, with different modeling error and sensor noise specifications, are considered.

Case I: Unbiased Modeling Error

In this scenario, model $\theta$ predicts each of the $N_o$ outputs as $h_i(\theta) = \theta$ for $i = 1, \dots, N_o$; i.e., an unbiased model. The measurement noises are generated by sampling independent zero-mean Gaussian random variables, each with standard deviation $\sigma_d = 0.25$.
Similarly, the residual error is assumed to be uncorrelated zero-mean Gaussian distributed with standard deviation $\sigma$, which represents both modeling and measurement uncertainties. Because of the simplicity of this numerical example and the i.i.d. nature of the measurement noises, the standard deviation could be estimated via $\sigma \approx \sqrt{\frac{1}{N_o - 1} \sum_{i=1}^{N_o} (d_i - \bar{d})^2}$, where $\bar{d} = \frac{1}{N_o} \sum_{i=1}^{N_o} d_i$; however, as this is not possible in real problems where it is unknown whether the sensor noises are i.i.d., the methods are instead evaluated herein using five residual standard deviations $\sigma \in \{0.15, 0.20, 0.25, 0.35, 0.50\}$.

Figure 2.9: The sets of Case I models $\theta$ that are falsified and unfalsified by the error-bound and likelihood-bound methods for different values of assumed residual standard deviation $\sigma$. (Panels: error-bound Bonferroni, Šidák, and FDR-BH; likelihood-bound Bonferroni, Šidák, FDR-BH, and CPM; one row each for $\sigma$ = 0.50, 0.35, 0.25, 0.20, and 0.15; horizontal axis $\theta$.)

The sets of unfalsified models (dark blue) and falsified models (light pink) are shown in Figure 2.9 for the methods based both on error bounds and likelihood bounds. Constant probability mass likelihood bounds unfalsify the smallest fraction of models, resulting in the tightest range, but may falsify some models that describe the measurements reasonably well; e.g., with $\sigma = 0.15$ or $0.20$, CPM rejects all models and, with $\sigma = 0.25$, unfalsifies only a fairly narrow range of models. The error-bound methods give tighter bounds than the corresponding likelihood-bound methods, which is expected and consistent with the previous discussion of their relationship. Different values of the assumed residual standard deviation $\sigma$ show that the choice of $\sigma$ strongly affects the fraction of models unfalsified, which increases with increasing $\sigma$ as the allowable bounds on the residual errors increase.
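The Case I error-bound falsification can be sketched as follows. This is a hedged illustration rather than the thesis' code: equations (2.5), (2.6), and (2.9) fall outside this section, so the sketch assumes the standard forms of the Šidák and Bonferroni significances and the standard Benjamini-Hochberg step-up rule, under which a model is falsified when any ordered p-value satisfies $p_{(k)} \le k\alpha/N_o$. It also uses 2,000 candidate models instead of 10,000 to keep the run short, and all names are hypothetical.

```python
import random
from statistics import NormalDist

PHI = 0.95       # target probability
N_OBS = 100      # number of measurements
SIGMA_D = 0.25   # measurement-noise standard deviation (Case I)
SIGMA = 0.25     # assumed residual standard deviation

def p_values(theta, data, sigma):
    """Two-sided Gaussian p-value of each residual e_i = h_i(theta) - d_i, with h_i = theta."""
    nd = NormalDist()
    return [2.0 * (1.0 - nd.cdf(abs(theta - d) / sigma)) for d in data]

def keep_fwer(p, phi, method):
    """True if the model is unfalsified under a FWER correction (assumed standard forms)."""
    n = len(p)
    if method == "sidak":
        alpha_bar = 1.0 - phi ** (1.0 / n)
    else:  # Bonferroni
        alpha_bar = (1.0 - phi) / n
    return min(p) > alpha_bar

def keep_bh(p, phi):
    """True if the Benjamini-Hochberg step-up rule rejects no measurement (assumed rule)."""
    n = len(p)
    alpha = 1.0 - phi
    return all(pk > k * alpha / n for k, pk in enumerate(sorted(p), start=1))

rng = random.Random(0)
data = [1.0 + rng.gauss(0.0, SIGMA_D) for _ in range(N_OBS)]   # measurements per (2.26)
thetas = [rng.uniform(-1.0, 3.0) for _ in range(2000)]         # candidate models from U(-1, 3)

results = []  # tuples (theta, kept by Bonferroni, kept by Sidak, kept by BH)
for th in thetas:
    p = p_values(th, data, SIGMA)
    results.append((th, keep_fwer(p, PHI, "bonferroni"), keep_fwer(p, PHI, "sidak"), keep_bh(p, PHI)))

for name, col in (("Bonferroni", 1), ("Sidak", 2), ("BH", 3)):
    kept = [r[0] for r in results if r[col]]
    if kept:
        print(f"{name}: {len(kept)} unfalsified, theta in [{min(kept):.3f}, {max(kept):.3f}]")
    else:
        print(f"{name}: all models falsified")
```

Under these assumed rules, any model retained by BH is also retained by Bonferroni, because the $k = 1$ condition of the step-up rule coincides with the Bonferroni threshold; the exact counts depend on the random seed.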
On the other hand, if the assumed residual standard deviation is too small, many models are falsified, including some that are close to the true model. For error-bound methods, FDR control retains (unfalsifies) 13–15% fewer (invalid) models than does FWER control and, for likelihood-bound methods, FDR retains 33–45% fewer (invalid) models.

One advantage of the likelihood-bound methods is the ability to utilize (without extra computation) each likelihood value as a measure of confidence in the corresponding unfalsified model. Figure 2.10 shows, in both linear (a) and log (b) scales, the models' likelihood values as a function of the model parameter $\theta$ for an assumed residual standard deviation $\sigma = 0.25$. Figure 2.10a also shows vertical lines at the lower and upper ends of the ranges of models unfalsified by the error-bound methods; Figure 2.10b also shows the vertical lines for the ranges of the models unfalsified by the likelihood-bound methods, as well as the corresponding horizontal lines at the log-likelihood bounds. As expected, the likelihood-bound methods give looser bounds than the corresponding error-bound methods (with the same $\phi$); however, the models with significant likelihoods are concentrated in a smaller region away from both sets of boundaries.

Figure 2.10: Case I ($\sigma = 0.25$, $\sigma_d = 0.25$) models' likelihoods versus model parameter $\theta$ on a: (a) linear scale, with vertical-line ranges of the error-bound (EB) unfalsified models; (b) log scale, with vertical-line ranges of the likelihood-bound unfalsified models and the log-likelihood bounds $\ln \underline{L}$ for CPM, FDR/BH, FWER/Šidák, and FWER/Bonferroni. (Bonferroni and Šidák bounds/ranges are nearly identical and overlap on the graphs.)
(Note: as $\sigma$ increases, the ranges of unfalsified models widen faster than the range of non-negligible likelihoods, indicating that bounds become less important, with larger $\sigma$, than the likelihoods.) Using the likelihood values, the maximum likelihood estimate of model parameter $\theta$ (i.e., the unfalsified model $\theta$ that gives the largest likelihood value) is $\theta \approx 0.9879$ for both $\sigma = 0.25$ and $\sigma = 0.5$.

The statistical power of the FWER/Šidák and FDR/BH error-bound methods is also evaluated for this case with $\sigma = 0.25$. Figure 2.11 shows that, as the number of measurements increases, the statistical power (the probability of rejecting a model when it is invalid) of FDR/BH is significantly higher, for $\theta = 1.5$ and $2.0$, than that of FWER/Šidák; hence, FDR/BH is more effective in rejecting invalid models as the number of measurements increases. For the true parameter value $\theta = 1$, the statistical power is almost zero and similar for both methods, indicating that both FDR/BH and FWER/Šidák error-bound methods perform equally well in retaining a valid model.

Figure 2.11: Statistical power (the probability of rejecting a model when it is invalid) as a function of the number of measurements $N_o$ for the FWER/Šidák and FDR/BH error-bound methods for parameters $\theta = 1.0$, $1.5$, and $2.0$.

Case II: Biased Modeling Error

In this scenario, the model predictions are assumed to be biased

\[
h_i(\theta) = \theta + 2.25, \qquad i = 1, \dots, N_o \tag{2.27}
\]

where a deliberate and significant level of modeling error $\epsilon_{h,i} = 2.25$ is introduced.

Figure 2.12: The sets of falsified and unfalsified Case II models (biased modeling error) via the error-bound (Err-bound) and likelihood-bound (LH-bound) methods for assumed residual standard deviation $\sigma = 0.25$. (Panels: Bonferroni, Šidák, FDR-BH, and CPM; horizontal axis $\theta$.)
In this case, the correct conclusion should be to reject all of the candidate models. As shown in Figure 2.12 for assumed residual standard deviation $\sigma = 0.25$ and measurement noise standard deviation $\sigma_d = 0.25$, all methods except the one based on constant probability mass retain some unfalsified models due to the measurement noise. (While not shown here, smaller $\sigma$ results in all methods falsifying all candidate models, as expected, and larger $\sigma$ results in modest increases in the number of unfalsified models.) FDR again unfalsifies fewer models than FWER: 53–57% fewer in this Case. The error-bound methods again result in ranges of unfalsified models that are tighter than those of the likelihood-bound methods; however, because of this Case's strong modeling error bias and since the models $\theta$ are only sampled from $U(-1, 3)$, high likelihoods are only assigned to models with $\theta$ near $-1$, which gives prediction $h_i(\theta)$ very close to 1.25.

Figure 2.13: The sets of falsified and unfalsified Case III models (correlated measurements with $\sigma = 0.25$) via the error-bound (Err-bound) and likelihood-bound (LH-bound) methods. (Panels: Bonferroni, Šidák, FDR-BH, and CPM; horizontal axis $\theta$.)

Case III: Unbiased Modeling Error but Correlated Measurement Noise

In this scenario, the models are assumed unbiased, i.e., $h_i(\theta) = \theta$, but the residual errors are, unknown to the method, correlated. This correlation is introduced by sampling the measurement noise vector $\boldsymbol{\epsilon}_d$ from $\mathcal{N}(\mathbf{0}, \Sigma)$, where $\Sigma$ has diagonal terms $\sigma_d^2 = 0.25^2$ and off-diagonal terms $0.9\sigma_d^2$, so that all noise terms are strongly positively correlated. However, while performing the falsification, the residual errors are assumed independent with standard deviation $\sigma = 0.25$; the results are shown in Figure 2.13. The fractions of models unfalsified using the error-bound and likelihood-bound methods are similar to those of Case I.
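The Case III noise covariance (diagonal terms $\sigma_d^2$, all off-diagonal terms $0.9\sigma_d^2$) is an equicorrelation structure, which can be sampled without a matrix factorization. The sketch below is one possible construction, not necessarily the one used in the thesis: each noise term is a mix of one shared standard normal variable and one independent standard normal variable, which gives unit-scaled variance $\rho + (1 - \rho) = 1$ and pairwise correlation $\rho$. Function names are hypothetical.

```python
import math
import random

def equicorrelated_noise(n_obs, sigma_d, rho, rng):
    """One sample of N(0, Sigma) with Sigma_ii = sigma_d^2 and Sigma_ij = rho * sigma_d^2."""
    shared = rng.gauss(0.0, 1.0)          # component common to all measurements
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    return [sigma_d * (a * shared + b * rng.gauss(0.0, 1.0)) for _ in range(n_obs)]

def sample_cov(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

# Empirical check on two of the noise terms over many draws.
rng = random.Random(1)
draws = [equicorrelated_noise(2, 0.25, 0.9, rng) for _ in range(20000)]
first = [d[0] for d in draws]
second = [d[1] for d in draws]
var_first = sample_cov(first, first)
corr = sample_cov(first, second) / math.sqrt(var_first * sample_cov(second, second))
print(var_first, corr)

# Case III data per (2.26), with the correlated noise replacing the i.i.d. noise.
data = [1.0 + e for e in equicorrelated_noise(100, 0.25, 0.9, rng)]
```

Because one shared term dominates every measurement, the whole data vector shifts together in any single realization, which is consistent with the asymmetric unfalsified ranges reported for this Case.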
FDR, both in error- and likelihood-bound methods, rejects 22–33% more invalid models than FWER. The ranges of unfalsified models are no longer symmetric with respect to the true parameter value. As a consequence, CPM rejects all models with parameter $\theta$ outside $[0.5688, 1.1035]$, even though models with parameter $\theta$ just above 1.1 may be considered reasonable given the noise in the measurements, which is consistent with similar findings in Goulet and Smith [45]. The maximum likelihood parameter estimate, using the models unfalsified by the FDR/BH likelihood-bound method, is $\theta \approx 0.8365$ (it would also be biased with other corrections and methods for this Case).

Figure 2.14: The sets of falsified and unfalsified Case IIIb models (alternating-sign-correlated measurements) via the error-bound (Err-bound) and likelihood-bound (LH-bound) methods with $\sigma = 0.25$. (Panels: Bonferroni, Šidák, FDR-BH, and CPM; horizontal axis $\theta$.)

Figure 2.14 shows the ranges of unfalsified models if the measurement noise correlations instead alternate in sign (so they are strongly positively or negatively correlated). In this case, CPM rejects all models. The FDR methods perform better than FWER by falsifying 23–36% more invalid models. Again, the likelihood-bound methods falsify fewer models than the error-bound methods; the FDR/BH likelihood-bound unfalsified models have a maximum likelihood for $\theta = 0.9971$, which is very close to the true value even though the correlation structure is incorrectly assumed.

Summary of Results from Example I

All three cases show that error-bound methods give tighter ranges of unfalsified models than the corresponding likelihood-bound methods. The likelihood-bound methods, on the other hand, provide unfalsified likelihoods that are relatively insensitive to incorrect assumptions of the noise covariance structure.
Even if the correlations are significantly different than assumed, the FDR-BH methods still provide a good balance between rejecting as many invalid models as possible (better than the FWER methods) and successfully retaining most of the valid ones.

2.5.2 Example II: Four DOF System

Figure 2.15: (a) 4DOF testbed system: a building superstructure (masses $m_1$, $m_2$, $m_3$ with story stiffnesses $k_i$ and damping $c_i$, each split equally between two columns) on a hysteretic isolation layer with base mass $m_b$, subjected to ground acceleration $\ddot{x}_g$; (b) force-displacement loops (restoring force $f_b - c_b \dot{x}_b$ versus base drift $x_b$) and linear approximations for the various model classes (Bouc-Wen, bilinear, AASHTO), with $Q_y$, $x_y$, $x_d$, $k_{\text{pre}}$, $k_{\text{post}}$, and $k_{\text{eq}}$ marked.

Consider the base-isolated building model shown in Figure 2.15a. The isolation system often exhibits hysteretic behavior, introducing nonlinearity into an otherwise linear dynamical system. To simulate the system response and design appropriate control strategies, the behavior of the hysteretic elements in the isolation must be accurately modeled. This four-degree-of-freedom (4DOF) system is subjected to base excitation; the isolation layer is modeled as an elastoplastic element (e.g., lead-rubber bearing) with modest linear viscous damping. The base excitation $\ddot{x}_g$ is the N-S El Centro, CA earthquake record (at the Imperial Valley Irrigation District substation) during the 18 May 1940 Imperial Valley earthquake, sampled at 50 Hz, with a peak acceleration of 3.42 m/s².
The equations of motion of the superstructure, if it were fixed-base, are given by

\[
\mathbf{M}_s \ddot{\mathbf{X}}_s + \mathbf{C}_s \dot{\mathbf{X}}_s + \mathbf{K}_s \mathbf{X}_s = -\mathbf{M}_s \mathbf{1}\, \ddot{x}_g \tag{2.28}
\]

where the mass matrix $\mathbf{M}_s$ and stiffness matrix $\mathbf{K}_s$ are

\[
\mathbf{M}_s = \begin{bmatrix} m_1 & 0 & 0 \\ 0 & m_2 & 0 \\ 0 & 0 & m_3 \end{bmatrix}, \qquad
\mathbf{K}_s = \begin{bmatrix} k_1 + k_2 & -k_2 & 0 \\ -k_2 & k_2 + k_3 & -k_3 \\ 0 & -k_3 & k_3 \end{bmatrix} \tag{2.29}
\]

where $m_1 = m_2 = m_3 = 300$ Mg and $k_1 = k_2 = k_3 = 40$ MN/m; $\mathbf{1}$ is a column vector of ones; and $\mathbf{X}_s = [x_1\ x_2\ x_3]^T$ is the vector of floor displacements relative to the ground. Rayleigh damping, i.e., $\mathbf{C}_s = \beta_1 \mathbf{M}_s + \beta_2 \mathbf{K}_s$, is assumed with 3% damping for the first two modes, where $\beta_1$ and $\beta_2$ are constants evaluated from the assumed modal damping ratios. Combining the isolation layer and the superstructure equations of motion, the full system can be described by

\[
\begin{aligned}
\mathbf{M}_s \ddot{\mathbf{X}}_s + \mathbf{C}_s \dot{\mathbf{X}}_s + \mathbf{K}_s \mathbf{X}_s &= -\mathbf{M}_s \mathbf{1}\, \ddot{x}_g + \mathbf{C}_s \mathbf{1}\, \dot{x}_b + \mathbf{K}_s \mathbf{1}\, x_b \\
m_b \ddot{x}_b + \mathbf{1}^T \mathbf{C}_s \mathbf{1}\, \dot{x}_b + \mathbf{1}^T \mathbf{K}_s \mathbf{1}\, x_b + f_b &= -m_b \ddot{x}_g + \mathbf{1}^T \mathbf{C}_s \dot{\mathbf{X}}_s + \mathbf{1}^T \mathbf{K}_s \mathbf{X}_s
\end{aligned} \tag{2.30}
\]

where $m_b = 500$ Mg is the base mass and $f_b$ is the sum of the isolation-layer damping and restoring forces. The total structure mass is $m = m_b + m_1 + m_2 + m_3 = 1400$ Mg.

Nonlinear Model Classes for Hysteretic Damping

Two nonlinear model classes are considered here to represent the isolation layer: a bilinear hysteresis model and a Bouc-Wen hysteresis model [69], which is smoother and more realistic [70]. In these nonlinear models, $k_{\text{pre}}$, $k_{\text{post}}$, and $Q_y$ are the pre-yield stiffness, post-yield stiffness, and yield force, respectively, as shown in Figure 2.15b. The force $f_b$ exerted by the isolation layer (both damping and restoring forces) is given by

\[
f_b = c_b \dot{x}_b + k_{\text{post}}\, u_b + q_y z \tag{2.31}
\]

The Bouc-Wen model uses $q_y z$ as the non-elastic force, where $q_y = Q_y (1 - r_k)$; $r_k = k_{\text{post}} / k_{\text{pre}}$ is the hardness ratio; and $z$ is an evolutionary variable given by

\[
\dot{z} = A \dot{x}_b - \beta\, \dot{x}_b\, |z|^{n_{\text{pow}}} - \gamma\, z\, |\dot{x}_b|\, |z|^{n_{\text{pow}} - 1} \tag{2.32}
\]

where selecting $A = 2\beta = 2\gamma = k_{\text{pre}} / Q_y$ forces $z$ to stay in $[-1, 1]$ and makes the loading and unloading stiffnesses the same [71].
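The boundedness of $z$ under $A = 2\beta = 2\gamma$ can be checked numerically with a short sketch of (2.32). Several choices here are assumptions made only for illustration and are not taken from the thesis: a forward-Euler integrator, a prescribed sinusoidal base drift of 0.1 m amplitude at 0.5 Hz, $g = 9.81$ m/s², and reading the yield force as 5% of the total weight $mg$ with $m = 1400$ Mg. The parameter values otherwise follow the true values quoted for this example.

```python
import math

def bouc_wen_z(velocity, a_coef, n_pow=1.0, dt=1e-3, t_end=4.0):
    """Forward-Euler integration of (2.32) with beta = gamma = A/2; returns the z history.

    velocity: function t -> base-drift velocity (m/s), prescribed for this sketch
    a_coef:   A = k_pre / Q_y (1/m)
    n_pow:    Bouc-Wen exponent; assumed >= 1 (large values need a smaller dt)
    """
    beta = gamma = 0.5 * a_coef
    z, history = 0.0, []
    for step in range(int(round(t_end / dt))):
        v = velocity(step * dt)
        z_dot = (a_coef * v
                 - beta * v * abs(z) ** n_pow
                 - gamma * z * abs(v) * abs(z) ** (n_pow - 1.0))
        z += dt * z_dot
        history.append(z)
    return history

G = 9.81                        # m/s^2, assumed
k_pre = 4.0e6 / 0.1667          # N/m, from k_post = 4.0 MN/m and r_k = 0.1667
q_yield = 0.05 * 1400e3 * G     # N, assuming Q_y = 5% of total weight
a_coef = k_pre / q_yield        # 1/m

omega = 2.0 * math.pi * 0.5     # rad/s, assumed 0.5 Hz drift cycle
z_hist = bouc_wen_z(lambda t: 0.1 * omega * math.cos(omega * t), a_coef)
z_peak = max(abs(z) for z in z_hist)
print(f"A = {a_coef:.2f} 1/m, peak |z| = {z_peak:.4f}")
```

With $n_{\text{pow}} = 1$ and $\dot{x}_b > 0$, $z \ge 0$, equation (2.32) reduces to $\dot{z} = A \dot{x}_b (1 - z)$, which makes the saturation at $z = 1$ explicit; the prescribed drift here is several times the yield displacement $1/A$, so the peak of $|z|$ approaches but does not exceed 1.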
Exponent $n_{\text{pow}} = 1$ is used for the Bouc-Wen model; while $n_{\text{pow}} \to \infty$ asymptotically approaches a bilinear hysteretic model, $n_{\text{pow}} = 100$ is sufficiently large and is used herein as the bilinear model.

Linear Model Classes for Hysteretic Damping

As linear models are generally easier to handle and are less computationally expensive, they are sometimes preferred over nonlinear models. A few linear model classes available to approximate the hysteretic isolation layer (i.e., with roughly equivalent per-cycle energy dissipation) are also included in this example. The linear models approximate the total isolator force with

\[
f_b = [c_b + c_{\text{eq}}]\, \dot{x}_b + k_{\text{eq}}\, u_b = \left[ c_b + 2 \zeta_{\text{eq}} \sqrt{k_{\text{eq}}\, m} \right] \dot{x}_b + k_{\text{eq}}\, u_b \tag{2.33}
\]

AASHTO (American Association of State Highway and Transportation Officials) and JPWRI (Japanese Public Works Research Institute) specified equivalent linear model classes [72, 73], where the equivalent damping ratio $\zeta_{\text{eq}}$ and equivalent stiffness $k_{\text{eq}}$ that approximate the energy dissipation of a hysteretic component are given by

\[
\zeta_{\text{eq}} = \frac{2 (1 - r_k)(1 - \rho^{-1})}{\pi [1 + r_k (\rho - 1)]}, \qquad k_{\text{eq}} = \frac{k_{\text{pre}}}{\rho} [1 + r_k (\rho - 1)] \tag{2.34}
\]

where $\rho = r_d$ for the AASHTO model and $\rho = 0.7 r_d$ for the JPWRI model, and $r_d = x_d / x_y$ is the shear ductility ratio, i.e., the ratio of the design displacement $x_d$ to the yield displacement $x_y = Q_y / k_{\text{pre}}$. Hwang and Chiou [73] proposed a modified AASHTO model using (2.34) and $\rho = r_d$ but with appended multiplicative correction factors of $r_d^{0.58} / (6 - 10 r_k)$ and $[1 - 0.737 (r_d - 1) / r_d^2]^{-2}$ on $\zeta_{\text{eq}}$ and $k_{\text{eq}}$, respectively; this modified AASHTO model is used herein as a third linear model.
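The AASHTO and JPWRI equivalent-linear parameters of (2.34), together with the equivalent damping coefficient implied by (2.33), can be evaluated with a short sketch. The function names are hypothetical and the sample inputs are chosen only for illustration (the true $r_k$, the prior mean of $r_d$, and a pre-yield stiffness derived from $k_{\text{post}} / r_k$); the modified AASHTO correction factors are not included here.

```python
import math

def equivalent_linear(k_pre, r_k, r_d, model="AASHTO"):
    """Equivalent damping ratio and stiffness per (2.34).

    model: "AASHTO" uses rho = r_d; "JPWRI" uses rho = 0.7 * r_d.
    """
    if model == "AASHTO":
        rho = r_d
    elif model == "JPWRI":
        rho = 0.7 * r_d
    else:
        raise ValueError("model must be 'AASHTO' or 'JPWRI'")
    bracket = 1.0 + r_k * (rho - 1.0)
    zeta_eq = 2.0 * (1.0 - r_k) * (1.0 - 1.0 / rho) / (math.pi * bracket)
    k_eq = k_pre * bracket / rho
    return zeta_eq, k_eq

def equivalent_damping(zeta_eq, k_eq, mass):
    """c_eq = 2 * zeta_eq * sqrt(k_eq * m), as in (2.33)."""
    return 2.0 * zeta_eq * math.sqrt(k_eq * mass)

k_pre = 4.0e6 / 0.1667   # N/m, illustrative value from k_post / r_k
mass = 1400e3            # kg, total structure mass
for model in ("AASHTO", "JPWRI"):
    zeta_eq, k_eq = equivalent_linear(k_pre, r_k=0.1667, r_d=2.5, model=model)
    c_eq = equivalent_damping(zeta_eq, k_eq, mass)
    print(f"{model}: zeta_eq = {zeta_eq:.4f}, k_eq = {k_eq / 1e6:.3f} MN/m, c_eq = {c_eq / 1e3:.1f} kN*s/m")
```

A quick sanity check on (2.34): at $\rho = 1$ (design displacement equal to the yield displacement) the formulas return $\zeta_{\text{eq}} = 0$ and $k_{\text{eq}} = k_{\text{pre}}$, as expected for a component that has not yielded.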
In the Caltrans (California Department of Transportation) model [73], a fourth linear model considered herein, the equivalent damping ratio and stiffness are given by

\[
\zeta_{\text{eq}} = 0.0587 (r_d - 1)^{0.371}, \qquad k_{\text{eq}} = k_{\text{pre}} \left[ 1 + \ln\{ 1 + 0.13 (r_d - 1)^{1.137} \} \right]^{-2} \tag{2.35}
\]

Falsification of Linear and Nonlinear Isolator Models

To assess the effectiveness of different falsification methods applied to a nonlinear system, the Bouc-Wen model is used to generate a set of nonlinear dynamic response data; the measurement is the absolute base acceleration $\ddot{x}_b^a$ at a sampling rate of 20 Hz, to which is added a Gaussian pulse process measurement noise with a standard deviation that is 10% of the root-mean-square (RMS) of the actual response; i.e., there are $N_o = 600$ measurements. The uncertainty in each linear model class arises from the equivalent linear parameters $k_{\text{eq}}$ and $c_{\text{eq}}$, which depend on $k_{\text{post}}$, $c_b$, $r_k$, and $r_d$ through the model-specific constitutive relations defined above. The true values of these parameters and their prior distribution specifications are listed in Table 2.3.

Table 2.3: Priors for model parameters as applicable to each model class.

Parameter      True value     Prior type    Prior mean     Prior std. dev.
-------------  -------------  ------------  -------------  ---------------
k_post         4.0 MN/m       Lognormal     4.5 MN/m       0.25 MN/m
c_b            20 kN·s/m      Lognormal     20 kN·s/m      4 kN·s/m
r_k            0.1667         Uniform       0.1600         0.0058
r_d            n/a †          Uniform       2.5            0.2887
Q_y (% mg)     5.00           Uniform       4.75           0.2887

† The Bouc-Wen model does not require $r_d$.

For each model class, falsification is performed on $n_s = 2000$ candidate models (Figure 2.16 shows that 2000 models is sufficient for a reasonably converged fraction of unfalsified Bouc-Wen models). The model falsification is applied with residuals $\epsilon_i$ assumed Gaussian $\mathcal{N}(0, \sigma^2)$; the residual standard deviation $\sigma$ is set at 0.15 times the standard deviation of the measurement data vector $\mathbf{d}$ (this standard deviation is slightly larger than that of the actual measurement noise).
For each model class, the resulting fractions of models unfalsified by the error- and likelihood-bound methods (with $\phi = 0.95$) are shown in Table 2.4. The error-bound methods falsify all model classes except the true Bouc-Wen model class. On the other hand, among the likelihood-bound methods, only the constant probability mass (CPM) falsifies all model classes except Bouc-Wen. However, the likelihood values of non-Bouc-Wen models in the other likelihood-bound methods are much lower than those of the Bouc-Wen models, demonstrating that the other model classes are of less importance. FDR/BH retains 7–10% fewer Bouc-Wen models in error- and likelihood-bound methods compared to FWER/Šidák, as shown in Table 2.4. With the likelihood-bound methods, FDR/BH falsifies around 95% of the bilinear models while FWER/Šidák falsifies only 19%.

Figure 2.16: The fraction of Bouc-Wen models falsified by the FDR/BH error-bound (EB) and likelihood-bound (LB) methods as a function of the number of candidate models. $n_s = 2000$, used in Example II, is sufficient as the curves do not significantly change beyond 2000 models. (Axes: number of models versus percent unfalsified.)

Figure 2.17: Mean of RMS errors of model residuals $\epsilon_i$ (in m/s²) for Bouc-Wen models that are error-bound falsified by neither FWER/Šidák nor FDR/BH, falsified by FDR/BH only (not FWER/Šidák), and falsified by both. The error bars show one standard deviation of the mean RMS errors.

Table 2.4: Fraction of models that are unfalsified using different methods (% unfalsified).

                  Error-bound                    Likelihood-bound
Model class       Bonferroni  Šidák   BH        Bonferroni  Šidák   BH      CPM
----------------  ----------  ------  ------    ----------  ------  ------  ------
AASHTO            0           0       0         100         100     0       0
JPWRI             0           0       0         98.4        98.3    0       0
Caltrans          0           0       0         68.5        68.0    0       0
mod. AASHTO       0           0       0         72.1        71.8    0       0
Bouc-Wen          32.5        32.2    29.8      100         100     90.2    16.6
Bilinear          0           0       0         81.7        81.4    5.1     0

As the fraction falsified or unfalsified only tells part of the story, Figure 2.17 compares the quality of models falsified or unfalsified by the FWER/Šidák and FDR/BH error-bound methods: the mean RMS of residual errors for models falsified only by FDR/BH is larger than that of models accepted by FDR/BH (i.e., as expected, the models FDR/BH rejects are more invalid than those accepted) and smaller than that of the set of models falsified by both FDR/BH and FWER/Šidák (i.e., FWER/Šidák retains some larger-residual models that FDR/BH properly rejects).

Further, the fraction of models unfalsified does not differentiate between the plausibility of different models or model classes. This is easily shown when the mean of the lognormal distribution of $k_{\text{post}}$ is changed from 4.5 MN/m to 3.5 MN/m; the resulting fraction of nonlinear models unfalsified by the various approaches is shown in Table 2.5.

Table 2.5: Fraction of models that are unfalsified (% unfalsified) using a prior distribution similar to the priors in Table 2.3 except the mean of $k_{\text{post}}$ is changed to 3.5 MN/m.

                  Error-bound                    Likelihood-bound
Model class       Bonferroni  Šidák   BH        Bonferroni  Šidák   BH      CPM
----------------  ----------  ------  ------    ----------  ------  ------  ------
Bouc-Wen          22.1        22.0    20.8      100         100     84.3    23.2
Bilinear          7.1         6.8     0.5       100         100     90.6    0

The error-bound methods with FWER corrections give some unfalsified models for both Bouc-Wen and bilinear model classes and do not differentiate between any unfalsified models. The FDR/BH error-bound method, however, falsifies nearly all bilinear models.
The table also shows that the FWER and FDR/BH likelihood-bound methods falsify bilinear models at a rate similar to, or slightly lower than, the Bouc-Wen models; however, the likelihood values of the models from these two classes, shown in Figure 2.18 as functions of $k_{\text{post}}$, are rather higher for the Bouc-Wen models, thus demonstrating the advantage that likelihood-bound methods provide a quality index for the unfalsified models.

Figure 2.18: Likelihood values of the bilinear and Bouc-Wen models as a function of parameter $k_{\text{post}}$; the FDR/BH likelihood bound is also shown. (Axes: $k_{\text{post}}$ [MN/m] versus log likelihood.)

Post-falsification Robust Prediction

The estimated parameters, using both maximum likelihood estimation and robust estimation (2.23) using the unfalsified Bouc-Wen models, are shown in the sixth and seventh columns of Table 5.7 and are very close to the corresponding true values.

The Bouc-Wen models unfalsified based on the El Centro response data are used to robustly predict the isolated structure response to the 1995 Kobe earthquake (N-S record at the Japanese Meteorological Agency in Kobe, Japan, during the 1995 Hyōgo-ken Nanbu earthquake, sampled at 50 Hz, with a peak acceleration of 8.18 m/s$^2$). Using the FDR/BH likelihood-bound falsification with weights assigned according to (2.22) and used in (2.24), the relative RMS error in predicting the absolute base acceleration is 0.8639%; the actual and predicted responses are shown in Figure 2.19. The corresponding error using only the robust parameter estimate $\hat{\theta}$ is a similar 0.8788%, and is 0.9752% using the maximum likelihood parameter estimate $\arg\max_i \{L(\theta_i)\}$. Hence, using either (2.23) or (2.24) with the weights $W_i$ of the unfalsified models provides very accurate response predictions; robust prediction (2.24) is slightly better than the direct use of the estimated parameters (2.23). On the other hand, prediction (2.24) using the unfalsified bilinear models gives a 10.1957% error; the parameter estimates from the bilinear models significantly underestimate $k_{\text{post}}$ and $Q_y$. (The bilinear models' maximum likelihood and robust estimates, tabulated together in the rightmost column of Table 2.6, are the same to four decimal digits because few bilinear models remain unfalsified and one of them, with a high likelihood, dominates the others.)

Figure 2.19: True absolute base accelerations of the 4DOF building subjected to the Kobe earthquake, and those robustly predicted from models likelihood-bound-unfalsified using the El Centro data. Panels: (a) Bouc-Wen model class; (b) bilinear model class. (Horizontal axis: time [s]; curves: predicted and true.)

Table 2.6: True Bouc-Wen model parameters and estimates of the Bouc-Wen and bilinear model parameters. (ML denotes maximum likelihood parameter estimates; $\hat{\theta}$ are robust parameter estimates.)

| Parameter | True value | Bouc-Wen ML | Bouc-Wen $\hat{\theta}$ | Bilinear ML, $\hat{\theta}$ |
|---|---|---|---|---|
| $k_{\text{post}}$ [MN/m] | 4.0 | 4.0733 | 4.0609 | 3.8701 |
| $c_b$ [kN·s/m] | 20 | 23.3901 | 22.5401 | 19.5726 |
| $r_k$ | 0.1667 | 0.1687 | 0.1681 | 0.1634 |
| $Q_y$ (%W) | 5.00 | 4.9315 | 4.9450 | 4.3468 |

The robustness of the proposed method is tested next by varying the measurement noise level and the initial model set for falsification. The falsification is performed using $n_s = 2000$ with the residual standard deviation assumed to be 20% of the standard deviation of the measured absolute base acceleration so that some unfalsified models will remain even in the presence of large measurement noise in the data. The result in Figure 2.20a shows that the mean relative RMS error is less than 1.5% when the measurement noise is as large as 50%. (The measurement noise is generated from $\mathcal{N}(0, \sigma_n^2)$, where $\sigma_n$ is assumed to be a certain percentage of the standard deviation of the absolute base acceleration and is varied from 0% to 50%.) Note that this error can be further reduced if the total number of models $n_s$ used for the model falsification is increased from 2000. Next, different randomly generated initial model sets are used with 20% measurement noise; again, the results (Figure 2.20b) show that the relative RMS error does not vary significantly for different initial model sets and that the mean error remains around 1%. This shows that the accurate response predictions are also robust to model uncertainties and measurement noise.

Figure 2.20: Relative RMS error in prediction of the absolute base acceleration of the four DOF building subjected to the 1995 Kobe earthquake excitation using the Bouc-Wen models with varying measurement noise level and for different initial model sets for falsification. Panels: (a) error vs. noise level; (b) error vs. initial model set with 20% measurement noise. (CI = confidence interval, Std = standard deviation.)

2.5.3 Example III: Complex Wind-Excited Building (1623 DOF)

A complex 20-story moment-resisting frame building model, adapted from Wojtkiewicz and Johnson [74], with a height of 80 m, is shown in Figure 2.21. Cross braces provide additional stiffness for lateral bending, torsion, and in-plane floor stiffness. The structural model without the passive control devices has 1620 DOFs, with its fundamental modes in the y-direction at 0.5718 Hz, x-direction at 0.5893 Hz and torsional at 0.9363 Hz. Two TMDs are placed in the y-direction (each 0.55% of building mass) and one TMD in the x-direction (1.1% of building mass).

Figure 2.21: 1623 DOF model of a building with TMDs on its roof, subjected to wind load. (Axes: x [m], y [m], z [m]; the wind is directed 30° from the x axis.)
The building is subjected to wind excitation (oriented toward the east-northeast, at a 30° angle from the x axis as shown in Figure 2.21), modeled as a narrowband filtered Gaussian white noise process (most of the excitation energy is in the range of 0.35–1.5 Hz, exciting primarily the fundamental mode in the east-west x-direction), vertically shaped proportional to the height to the 0.3 power [75] and, for simplicity, assumed to be fully correlated at all heights along the building. The three TMDs, in the true model, exert power-law damping forces

$$
\begin{aligned}
f_x^{\text{TMD}} &= 200\ \text{kN}\cdot(\text{s/m})^{0.8}\,|\dot{u}|^{0.8}\,\operatorname{sgn}\dot{u} + 30\ \text{kN}\cdot(\text{s/m})\,\dot{u} \\
f_y^{\text{TMD}} &= 100\ \text{kN}\cdot(\text{s/m})^{0.8}\,|\dot{u}|^{0.8}\,\operatorname{sgn}\dot{u} + 15\ \text{kN}\cdot(\text{s/m})\,\dot{u}
\end{aligned}
\tag{2.36}
$$

where $\dot{u}$ is the velocity of a TMD relative to its roof connection. The values in (2.36) are chosen so that the effects of the nonlinearities are pronounced in the structure's roof accelerations. The ($N_o = 1200$ element) measurement data vector contains sampled x- and y-direction roof-center acceleration time histories, $[\ddot{u}^{\text{roof}}_x(0\Delta t)\ \ \ddot{u}^{\text{roof}}_y(0\Delta t)\ \ \ddot{u}^{\text{roof}}_x(1\Delta t)\ \ \ddot{u}^{\text{roof}}_y(1\Delta t)\ \cdots]^{\mathsf{T}}$ with $\Delta t = 0.05$ s, plus additive Gaussian pulse-process sensor noise with a standard deviation that is 30% of that of the noise-free vector.

Table 2.7: Models for the TMD damping force used to define different model classes for this structure and the prior distributions for their model parameters ($u$ is a TMD displacement relative to the roof; $W_{\text{tmd}}$ is the weight of the corresponding TMD; S.D. is the standard deviation).

| Model | Damping force | Parameter | Distribution | x TMD mean | x TMD S.D. | y TMDs mean | y TMDs S.D. |
|---|---|---|---|---|---|---|---|
| Linear | $f_{\text{lin}} = c_1\dot{u}$ | $c_1$ [kN·s/m] | Normal | 250 | 75.0 | 150 | 45.0 |
| Cubic polynomial | $f_{\text{cub}} = c_3\dot{u}^3 + c_1\dot{u}$ | $c_3$ [kN·(s/m)$^3$] | Lognormal | 50.0 | 22.5 | 25.0 | 6.0 |
| | | $c_1$ [kN·s/m] | Normal | 250 | 75.0 | 150 | 45.0 |
| Bouc-Wen ($n_{\text{pow}} = 1$; $k_{\text{pre}}$ fixed) | $f_{\text{hyst}} = q_y z + k_{\text{post}} u$ § | $k_{\text{pre}}$ [$k_{\text{post}}$] | Normal | 0.1667 | 0.05 | 0.1667 | 0.0144 |
| | | $Q_y$ [%$W_{\text{tmd}}$] | Normal | 7.5 | 0.5 | 7.5 | 0.5 |

§ $z$ is defined in (6.26).
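As a small illustration of the true TMD damping law (2.36) and of the interleaved layout of the measurement vector, the sketch below evaluates the two forces and assembles a data vector. The function names are hypothetical, the acceleration samples are placeholders rather than simulated responses, and the 600 samples per channel is an inference from $N_o = 1200$ with two channels.

```python
import math


def tmd_force_kN(u_dot, c_pow, c_lin, exponent=0.8):
    """Power-law plus linear viscous damping force [kN] for a relative velocity u_dot [m/s]."""
    sign = math.copysign(1.0, u_dot)
    return c_pow * abs(u_dot) ** exponent * sign + c_lin * u_dot


def f_x_tmd(u_dot):
    """x-direction TMD force with the coefficients of (2.36)."""
    return tmd_force_kN(u_dot, c_pow=200.0, c_lin=30.0)


def f_y_tmd(u_dot):
    """y-direction TMD force with the coefficients of (2.36)."""
    return tmd_force_kN(u_dot, c_pow=100.0, c_lin=15.0)


def interleave(acc_x, acc_y):
    """Build [ax(0), ay(0), ax(1), ay(1), ...] from two equally long records."""
    data = []
    for ax, ay in zip(acc_x, acc_y):
        data.extend((ax, ay))
    return data


dt = 0.05                    # sampling interval [s], from the text
n_samples = 600              # assumed: 1200 elements / 2 channels
acc_x = [0.0] * n_samples    # placeholder records, not simulated responses
acc_y = [0.0] * n_samples
data_vector = interleave(acc_x, acc_y)

print(f_x_tmd(1.0), f_y_tmd(1.0), len(data_vector), n_samples * dt)
```

Under these assumptions the record spans 30 s, and both forces are odd functions of the relative velocity.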
Table 2.8: Fractions of models unfalsified within each model class.

| y TMDs \ x TMD | linear | cubic | Bouc-Wen |
|---|---|---|---|
| linear | 48.6% | 41.2% | 0.0% |
| cubic | 43.9% | 45.6% | 0.0% |
| Bouc-Wen | 0.0% | 0.0% | 0.0% |

Figure 2.22: Robustly predicted and true absolute roof acceleration of the 1623 DOF building subjected to a different realization of the wind excitation. Panels: (a) linear damping forces in both directions; (b) cubic damping forces in both directions. (Horizontal axis: time [s]; curves: predicted and true.)

The three models for TMD damping forces, and their corresponding parameter priors, are shown in Table 2.7; the 3 × 3 = 9 candidate model classes are formed by combining these TMD force models for the x- and y-direction TMDs. The true TMD damping force model is (intentionally) omitted from the candidate model classes, but two of these TMD damping force models, linear and cubic polynomial, cause TMD behaviors similar to the true one.

For each model class $\mathcal{M}_k$, $n_s = 2000$ models randomly generated from the prior distribution $p(\theta|\mathcal{M}_k)$ are used for falsification. Each residual is assumed normally distributed, $\mathcal{N}(0, \sigma^2)$, where the residual standard deviation $\sigma$ is assumed to be 0.08 m/s$^2$, which is about 15% of the standard deviation of the measured data. The fraction of models unfalsified within each candidate model class is shown in Table 2.8. All Bouc-Wen models are falsified because the more boxy shapes of their hysteresis loops, for the priors chosen here, are very different from the elliptical shapes of the other damping models. The parameters estimated using the unfalsified models of each of the two model classes with the most unfalsified models (linear in both directions and cubic polynomial in both directions) are shown in Table 2.9.

Table 2.9: Robustly estimated parameters of two model classes.

| Model class | $\hat{c}^x_1$ [kN·s/m] | $\hat{c}^y_1$ [kN·s/m] | $\hat{c}^x_3$ [kN·(s/m)$^3$] | $\hat{c}^y_3$ [kN·(s/m)$^3$] |
|---|---|---|---|---|
| x-linear, y-linear | 399.1 | 154.3 | n/a | n/a |
| x-cubic, y-cubic | 410.5 | 140.3 | 30.1 | 31.1 |
Then, robust prediction is performed, using these unfalsified models with weights assigned according to (2.22) and used in (2.24), for the response due to a different realization of the stochastic wind excitation; the results for both model classes are shown in Figure 2.22. The relative RMS errors in predicting the roof acceleration in the x-direction are 1.7995% and 1.8081% for robust prediction using the linear-in-both-directions and cubic-polynomial-in-both-directions models, respectively. Hence, the future prediction using the proposed method provides very good accuracy for this example.

The robustness of the prediction using the proposed method is shown next for the nonlinear cubic polynomial damping forces in the TMDs in both directions by varying the measurement noise level and the initial model set for falsification. The standard deviation of the residual error is assumed to be 20% of the standard deviation of the measured data to obtain some unfalsified models for all the cases evaluated. The relative RMS errors in predicting the roof acceleration in the x-direction are shown in Figure 2.23, indicating mean errors as small as 1.5% even when the measurement noise level is 50%. The initial model set is then changed and the robust prediction is performed in the presence of 20% measurement noise; the mean errors are again found to be around 1%. Hence, in this example, the accurate future prediction using the proposed method is also robust against the presence of noise and uncertainties in the models.

Figure 2.23: Relative RMS error in prediction of the roof acceleration in the x-direction of the 1623 DOF building, using cubic polynomials for the TMD damping, subjected to a different realization of wind excitation with varying measurement noise level and for different initial model sets for falsification. Panels: (a) error vs. noise level; (b) error vs. initial model set with 20% measurement noise. (CI = confidence interval, Std = standard deviation.)

2.6 Conclusions

This chapter compares different model falsification strategies using bounds on prediction errors and bounds on their likelihoods. False discovery rate (FDR) control is introduced herein to determine bounds for model falsification (the first such application of FDR) and compared to the family-wise error rate (FWER) control that has been more commonly used in falsification. The FDR control provides greater statistical power by falsifying more invalid models compared to the typically-used FWER control. The use of a likelihood bound chosen based on these error control criteria is also proposed herein. The corresponding likelihood-bound methods are less strict but provide, for each unfalsified model, a confidence level that can also be used to estimate model parameters and to predict responses with robustness. The relationships between error-bound and likelihood-bound methods are also discussed. A circumscribed FDR/BH likelihood bound is found to provide a useful middle ground between defining a likelihood based on circumscribing or inscribing the FWER bounds.

These methods are then applied to three numerical examples. The first example demonstrates the principles of each method by considering different scenarios of modeling error and correlation between the measurements, providing insights into how the error-bound and likelihood-bound methods perform, as the results can be easily interpreted for this fundamental example. The further usefulness of maximizing likelihood to estimate model parameters is also demonstrated.
The second example uses a structural system with a hysteretic base isolation layer modeled using different linear and nonlinear model classes with unknown parameters; base acceleration is measured as the structural system is subjected to a historical earthquake excitation. All error-bound and likelihood-bound methods are implemented to falsify models in different model classes. Parameter estimation is performed first using maximum likelihood. Next, the falsification results are used, as proposed, for parameter estimation and robust response prediction. The third example uses a 1623 degree-of-freedom building subjected to a wind excitation with three nonlinear TMDs attached to its roof. The likelihood-bound falsification using the FDR control is applied, and future prediction is then performed using the falsification result, showing that the proposed method can predict robust and accurate responses.

These three examples demonstrate: (1) the advantages of using FDR control with error-bound methods to falsify more invalid models than FWER, which can be useful for exploratory studies when a large number of models and model classes are available, and (2) that likelihood-bound methods also provide relative confidence metrics for the unfalsified models and are useful for future response predictions. For other case studies, similar results would also be expected (though results will of course vary when data are both noisy and of finite dimension, just as in all statistical testing methods including conventional model falsification and well-established Bayesian model selection).

Chapter 3
Bayesian Model Class Selection

Laplace equally aware of the power of Bayes theorem used it to help him decide which astronomical problems to work on ... something new to be found.
– E. T. Jaynes, Bayesian Methods: General Background

3.1 Introduction

Bayesian model class selection for structural examples can be performed using response measurement data to identify the best possible model(s) and model class(es) for further calculations [11, 15, 16, 76]. The application of this computational tool to aerospace models has been presented by Mthembu et al. [18]. In his book, Yuen [77] applied Bayesian model class selection to problems of air quality prediction, hydraulic jump and seismic attenuation relationships. Kerschen et al. [78] performed a model screening using Bayes' theorem for nonlinear system identification. Muto and Beck [17] and Worden and Hensman [79] implemented model selection for hysteretic systems in a Bayesian framework. An approximate Bayesian computation method has been suggested for model selection in Toni et al. [80]. Bayesian model class selection has been applied in structural health monitoring in Saito and Beck [81]. Beck and Taflanidis [64] used Bayesian model class selection for predictive analysis. In addition to these Bayesian approaches, a fuzzy arithmetic based approach has been used for selection of different material models for a brake pad in Haag et al. [82]. Smart et al. [83] performed a model class selection for a turbogenerator foundation. A review of machine learning approaches for model class selection of nonlinear systems is given in Hong et al. [84].

Structural engineering models often involve an otherwise linear structure with local nonlinearities, e.g., a linear shear building model with a hysteretic base isolator that introduces nonlinearities into an otherwise linear model. Additionally, a variety of model classes are available to represent each of the nonlinearities present in the structure.
To analyze these types of structures when they are embedded in model class selection problems, either approximate linearization techniques must be employed, or one must incur the computational costs associated with determining the full nonlinear structural response using nonlinear solvers and Monte Carlo sampling. The required computational effort, however, can be reduced by (1) utilizing sampling methods that are more intelligent than standard Monte Carlo sampling and/or (2) reducing the computational requirements of each simulation, such as by exploiting the localized nature of the nonlinear components.

To increase the efficiency of the dynamic response computation, the method proposed herein will use the algorithm previously developed by the senior authors [85] for determining the response of systems with localized nonlinearities or uncertainties. In this algorithm, the system equations of motion are transformed into a low-order nonlinear Volterra integral equation (NVIE) in the forces induced by the localized nonlinearities and/or uncertainties, which is then solved numerically. This approach, which exploits the localized nature of the modification from a nominal linear model, will be incorporated herein inside the nested sampling algorithm to significantly increase the overall computational efficiency of the Bayesian model class selection process.

The proposed method is demonstrated using three numerical examples with increasing complexity of the structural models. In the first two examples, the selection of a proper model of lead-rubber bearing (LRB) isolators from a class of linear and nonlinear candidate models is investigated. The isolators are designed to reduce the structural response of two building models subjected to earthquake excitation. The first example consists of the application of the proposed method to a simple single degree-of-freedom (DOF) superstructure model to demonstrate the applicability of the method.
Next, a more complex 11-story 2-bay 99-DOF superstructure on a hysteretic isolation layer is employed to compare the efficiency of the proposed method to that of a Bayesian model class selection utilizing a conventional ordinary differential equation solver (Matlab's ode45). The third example considers a three-dimensional building frame, subjected to wind loading, where different models for the damping of three tuned mass dampers (TMDs) attached to its roof are used for the selection problem. The computational-demand reductions provided by the efficient dynamic response algorithm are evaluated for each example, showing that the proposed approach achieves significant reductions in computation time.

Although the proposed method of increasing computational efficiency for Bayesian model class selection is illustrated through examples of structural or mechanical systems, its application is not limited to this field. The proposed approach is a general computational framework that can be applied to any locally nonlinear dynamic systems that may arise from different fields.

3.2 Methodology

This section first briefly outlines the Bayesian model class selection methodology. Subsequently, a modified derivation of nested sampling is discussed, along with the effective use of both nested sampling and the efficient response analysis of systems with local nonlinearities and/or uncertainties in the context of Bayesian model class selection.

3.2.1 Bayesian Model Class Selection

Let $\mathcal{M}_1, \mathcal{M}_2, \ldots, \mathcal{M}_{N_m}$ be the $N_m$ different candidate model classes for a particular model selection problem. The uncertain parameter vector for the $k$th model class is $\theta^{(k)}$. Given the measurement data set $\mathcal{D}$, the posterior probability is evaluated for each model class $\mathcal{M}_k$,

$$
P(\mathcal{M}_k|\mathcal{D}) = \frac{p(\mathcal{D}|\mathcal{M}_k)\,P(\mathcal{M}_k)}{p(\mathcal{D})}, \qquad k = 1, 2, \ldots, N_m,
\tag{3.1}
$$

where the denominator is given by the probability density

$$
p(\mathcal{D}) = \sum_{k=1}^{N_m} p(\mathcal{D}|\mathcal{M}_k)\,P(\mathcal{M}_k).
\tag{3.2}
$$

$P(\mathcal{M}_k)$ is an a priori measure of plausibility assigned by the user, normalized such that $\sum_{k=1}^{N_m} P(\mathcal{M}_k) = 1$; this specification of prior model class probability depends on the user's past experience and the problem at hand. The main computational challenge in Bayesian model class selection problems lies in evaluating the evidence or marginal likelihood $E^{(k)} = p(\mathcal{D}|\mathcal{M}_k)$. For a particular model class $\mathcal{M}_k$, this evidence or marginal likelihood can be written as

$$
E^{(k)} = p(\mathcal{D}|\mathcal{M}_k) = \int_{\Theta^{(k)}} p(\mathcal{D}|\theta^{(k)},\mathcal{M}_k)\,p(\theta^{(k)}|\mathcal{M}_k)\,d\theta^{(k)} = \int_{\Theta^{(k)}} L(\theta^{(k)},\mathcal{M}_k)\,p(\theta^{(k)}|\mathcal{M}_k)\,d\theta^{(k)}
\tag{3.3}
$$

where $L(\theta^{(k)},\mathcal{M}_k) = p(\mathcal{D}|\theta^{(k)},\mathcal{M}_k)$ is the likelihood function, and $p(\theta^{(k)}|\mathcal{M}_k)$ is the prior density of $\theta^{(k)}$ (the measure of prior plausibility of the parameters) for model class $\mathcal{M}_k$.

3.2.2 Evaluating the Evidence: Nested Sampling

The evidence integral, omitting (for the sake of brevity and clarity) the model class number $k$ and model class variable $\mathcal{M}_k$, can be rewritten using the following procedure. A derivation of nested sampling moderately different from that given in Skilling [86] is presented below [87–90]. First, the likelihood function $L(\theta)$ can be defined as the integral

$$
L(\theta) = \int_0^{L(\theta)} d\lambda,
\tag{3.4}
$$

which allows the evidence to be written as

$$
E = \int_{\Theta} L(\theta)\,p(\theta)\,d\theta = \int_{\Theta} \underbrace{\left[\int_0^{L(\theta)} d\lambda\right]}_{L(\theta)} p(\theta)\,d\theta.
\tag{3.5}
$$

The order of integration can then be swapped and the limits adjusted accordingly to yield

$$
E = \int_0^{\infty} \int_{L(\theta)>\lambda} p(\theta)\,d\theta\,d\lambda.
\tag{3.6}
$$

Next, define the inner integral to be the monotonically decreasing function $\chi(\lambda)$,

$$
\chi(\lambda) \equiv \int_{L(\theta)>\lambda} p(\theta)\,d\theta,
\tag{3.7}
$$

which is the probability mass enclosed in the subset of $\Theta$ in which $L(\theta)$ exceeds $\lambda$, and has boundary values of $\chi(0) = 1$ and $\chi(\infty) = 0$.
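The reordering in (3.6) and (3.7) can be checked numerically on a toy problem. The following minimal sketch assumes a uniform prior on $[0, 1]$ and the hypothetical likelihood $L(\theta) = e^{-\theta}$ (neither is from the examples in this chapter), for which the evidence is $1 - e^{-1}$; it compares the direct integral of $L(\theta)p(\theta)$ with the integral of the prior mass $\chi(\lambda)$ over $\lambda$.

```python
import bisect
import math


def likelihood(theta):
    """Hypothetical likelihood used only for this check."""
    return math.exp(-theta)


# Midpoint grid on [0, 1]: equal weights represent the uniform prior.
n_theta = 2000
thetas = [(i + 0.5) / n_theta for i in range(n_theta)]
L_vals = sorted(likelihood(t) for t in thetas)

# Direct evaluation of the evidence, E = integral of L(theta) p(theta) d(theta).
E_direct = sum(L_vals) / n_theta


def chi(lam):
    """Prior mass of the region where the likelihood exceeds lam, as in (3.7)."""
    return (n_theta - bisect.bisect_right(L_vals, lam)) / n_theta


# Evidence as the integral of chi over lambda; chi vanishes beyond max(L).
n_lam = 2000
lam_max = L_vals[-1]
d_lam = lam_max / n_lam
E_layer = sum(chi((j + 0.5) * d_lam) for j in range(n_lam)) * d_lam

print(E_direct, E_layer, 1.0 - math.exp(-1.0))
```

The two numerical values agree to within the resolution of the $\lambda$ grid, and `chi` reproduces the boundary values $\chi(0) = 1$ and $\chi(\lambda) = 0$ at the largest likelihood on the grid.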
The evidence can then be written as

$$
E = \int_0^{\infty} \underbrace{\left[\int_{L(\theta)>\lambda} p(\theta)\,d\theta\right]}_{\chi(\lambda)} d\lambda = \int_0^{\infty} \chi(\lambda)\,d\lambda.
\tag{3.8}
$$

After defining $\varphi(\chi)$ to be the inverse of function $\chi(\lambda)$, i.e., $\varphi(\chi(\lambda)) \equiv \lambda$, a change of variables can be used to write the evidence as

$$
E = \int_0^1 \varphi(\chi)\,d\chi.
\tag{3.9}
$$

Finally, this integral for the evidence can be approximated by discretizing over $\chi$ as

$$
E \approx \sum_i \varphi_i\,\Delta\chi_i.
\tag{3.10}
$$

Figure 3.1: The nested sampling algorithm in the two-dimensional parameter space with iso-likelihood contours.

The sampling procedure for a particular model class begins with $\chi_0 = 1$ and initial evidence estimate $E_0 = 0$, with $n_s$ parameter vectors $\theta_1, \ldots, \theta_{n_s}$, each sampled from the prior probability density $p(\theta)$, with $n_s$ corresponding data likelihoods $L(\theta_1), \ldots, L(\theta_{n_s})$. During the $i$th iteration, a new sample $\theta_{\text{new}}$ is drawn from the prior $p(\theta)$ subject to $L(\theta_{\text{new}}) > L(\theta_j)$, where $\theta_j$ is the sample with the smallest likelihood value in the current sample pool; i.e., $j = \arg\min_{k=1,\ldots,n_s} L(\theta_k)$. The new sample $\theta_{\text{new}}$ replaces the sample $\theta_j$ with the smallest likelihood. The prior volume contained inside the contour corresponding to the rejected sample with the smallest likelihood is $\chi_i = \tau_i\,\chi_{i-1}$. The random variable $\tau$ follows $p(\tau) = n_s\,\tau^{(n_s-1)}$, which is the probability density function for the largest of $n_s$ samples drawn from the standard uniform distribution $U(0, 1)$. The means of $\tau$ and $\ln\tau$ are $\mathrm{E}[\tau] = n_s/(n_s+1)$ and $\mathrm{E}[\ln\tau] = -1/n_s$, respectively, which suggests two deterministic approaches that may be used for simplicity while performing the integration: using $\chi_i = [n_s/(n_s+1)]^i$ or $\chi_i = \exp(-i/n_s)$ [91–93]. The rejected sample $\theta_j$ is used to update the evidence estimate $E_i = E_{i-1} + \Delta\chi_i\,L(\theta_j)$, where $\Delta\chi_i = \chi_{i-1} - \chi_i$. The trapezoidal rule for integration can also be implemented with $\Delta\chi_i = [\chi_{i-1} - \chi_{i+1}]/2$. A stopping criterion, such as $i > n_{\max}$ or a less than 1% change in evidence $E$, is typically used.
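The stated means of the shrinkage factor $\tau$ can be verified empirically. The sketch below, with an arbitrary pool size and number of trials chosen only for illustration, draws $\tau$ as the largest of $n_s$ standard uniform samples and compares the sample means of $\tau$ and $\ln\tau$ with $n_s/(n_s+1)$ and $-1/n_s$.

```python
import math
import random

random.seed(0)      # fixed seed so the check is repeatable

n_s = 20            # pool size (illustrative value)
n_trials = 20000    # number of simulated shrinkage factors

# tau is the largest of n_s standard uniform draws, so p(tau) = n_s * tau**(n_s - 1).
taus = [max(random.random() for _ in range(n_s)) for _ in range(n_trials)]

mean_tau = sum(taus) / n_trials
mean_log_tau = sum(math.log(t) for t in taus) / n_trials

print("E[tau]    sample:", mean_tau, " theory:", n_s / (n_s + 1.0))
print("E[ln tau] sample:", mean_log_tau, " theory:", -1.0 / n_s)

# The two deterministic shrinkage rules after i iterations are close for large n_s.
i = 50
print((n_s / (n_s + 1.0)) ** i, math.exp(-i / n_s))
```

The two deterministic rules differ because one matches the mean of $\tau$ and the other the mean of $\ln\tau$; the gap shrinks as $n_s$ grows.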
At the final iteration, an average of the likelihoods of the remaining samples is added to compensate for the remaining undiscovered parameter space. This results in an intelligent sampling algorithm that samples more from high likelihood regions as the width $\Delta\chi_i$ decreases with increasing likelihood. Pseudocode for this algorithm is shown in Algorithm 1. The selection of each sample and the progression of the integration are shown in Figure 3.1. The figure shows samples from a two-dimensional parameter space being drawn with increasing likelihood values. These samples with their likelihood values are then transformed into $(\chi, \varphi)$ coordinates according to (3.4)–(3.9). A quadrature rule is then employed to evaluate the shaded area.

Algorithm 1: Evidence calculation using nested sampling.
  Initialization: set $\chi_0 = 1$ and $E_0 = 0$.
  Generate $n_s$ samples $\theta_i$, $i = 1, \ldots, n_s$, from the prior $p(\theta)$ with corresponding likelihoods $L(\theta_i)$.
  Start the sample counter: $i = 1$.
  while the stopping criterion is not satisfied do
    Find $j = \arg\min_{k=1,\ldots,n_s} L(\theta_k)$.
    Assign $\chi_i = [n_s/(n_s+1)]^i$.
    Update the evidence estimate by $E_i = E_{i-1} + \Delta\chi_i\,L(\theta_j)$, where $\Delta\chi_i = \chi_{i-1} - \chi_i$.
    Replace $\theta_j$ with a new sample $\theta_{\text{new}}$ that satisfies $L(\theta_{\text{new}}) > L(\theta_j)$.
    Set $i = i + 1$.
  end
  $E_{\text{last}} = E_{i-1} + \frac{1}{n_s}\sum_{k=1}^{n_s} L(\theta_k)\,\chi_i$.
  Result: evidence $E = E_{\text{last}}$.

A standard Monte Carlo (MC) unbiased estimate for the evidence is given by $E = \frac{1}{n_s}\sum_{k=1}^{n_s} L(\theta_k; \mathcal{D}, \mathcal{M})$, where the samples $\theta_k$ are chosen randomly from $p(\theta)$. However, the high likelihood region is generally very different from the region where the prior probability density, $p(\theta)$, is large. Hence, a large number of samples is needed with the standard MC estimator [94]. On the other hand, nested sampling samples more from the high likelihood region; hence, an increased efficiency can be expected for this estimator compared to an MC estimator.
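The steps of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the examples: the toy problem (uniform prior on $[0, 1]$ and a Gaussian-shaped likelihood), the function names, the fixed iteration count used as the stopping criterion, and the plain rejection sampling of the prior for the constrained draw are all choices made here for simplicity.

```python
import math
import random


def nested_sampling_evidence(likelihood, sample_prior, n_s, n_iter, rng):
    """Estimate the evidence with deterministic shrinkage chi_i = (n_s/(n_s+1))**i."""
    live = [sample_prior(rng) for _ in range(n_s)]
    live_L = [likelihood(theta) for theta in live]
    evidence = 0.0
    chi_prev = 1.0
    for i in range(1, n_iter + 1):
        # Sample with the smallest likelihood in the current pool.
        j = min(range(n_s), key=live_L.__getitem__)
        L_min = live_L[j]
        chi = (n_s / (n_s + 1.0)) ** i
        evidence += (chi_prev - chi) * L_min
        # Constrained draw: reject prior samples until the likelihood exceeds L_min.
        while True:
            candidate = sample_prior(rng)
            L_candidate = likelihood(candidate)
            if L_candidate > L_min:
                break
        live[j], live_L[j] = candidate, L_candidate
        chi_prev = chi
    # Final correction: mean likelihood of the remaining pool times the remaining mass.
    evidence += chi_prev * sum(live_L) / n_s
    return evidence


SIGMA = 0.1  # width of the hypothetical likelihood


def toy_likelihood(theta):
    return math.exp(-0.5 * ((theta - 0.5) / SIGMA) ** 2)


def uniform_prior(rng):
    return rng.random()


rng = random.Random(1)
E_est = nested_sampling_evidence(toy_likelihood, uniform_prior, n_s=400, n_iter=2000, rng=rng)
# Closed-form evidence for this toy problem, for comparison.
E_true = SIGMA * math.sqrt(2.0 * math.pi) * math.erf(0.5 / (SIGMA * math.sqrt(2.0)))
print(E_est, E_true)
```

Rejection sampling from the prior becomes expensive as the constrained region shrinks, which is why the number of iterations is kept modest here; constrained MCMC or ellipsoidal sampling would replace the inner loop in a higher-dimensional problem.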
Skilling [86] suggested using the Markov Chain Monte Carlo (MCMC) algorithm for generating samples from the constrained prior subject to $L_{\text{new}} > L$. Mukherjee et al. [91], Shaw et al. [95], and Feroz and Hobson [92] sampled in the parameter space from ellipsoids that are formed using the covariance information of the remaining samples. Murray [96] and Aitken and Akman [97] used slice sampling [98] to generate samples from the truncated prior. Chatpatanasiri [99] provided a review of these approaches and proposed an approach by coupling different Markov chains following the idea in Geyer [100] for multi-peaked likelihood functions. Chopin and Robert [93] introduced importance sampling to provide efficient simulation of new samples. The authors here, however, suggest using straightforward sampling of $p(\theta)$ in the nested sampling algorithm as long as the number of accepted samples is above a certain percentage of the total simulated samples for the past few iterations; otherwise an MCMC algorithm should be employed.

3.2.3 Efficient Analysis of Systems with Local Modifications

To calculate the likelihood value for each sample, the response of the structural system with the parameter values for that sample is needed. The second part of the proposed method considers this computational aspect of the model class selection problem and proposes to use a computationally efficient dynamic response calculation algorithm for locally modified systems, described as follows [85].
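Let the modified model be governed by the state-space equation

$$
\begin{aligned}
\dot{\mathbf{X}}(t) &= \mathbf{A}\mathbf{X}(t) + \mathbf{B}\mathbf{u}(t) + \mathbf{L}\,\mathbf{g}(\bar{\mathbf{X}}(t);\theta), \qquad \mathbf{X}(0) = \mathbf{x}_0, \\
\mathbf{Y}(t) &= \mathbf{C}\mathbf{X}(t) + \mathbf{D}\mathbf{u}(t) + \mathbf{E}\,\mathbf{g}(\bar{\mathbf{X}}(t);\theta),
\end{aligned}
\tag{3.11}
$$

where $\mathbf{X}(t)$ is the $n \times 1$ state vector; $\mathbf{A}$ is the $n \times n$ state matrix; $\mathbf{u}$ is an $m \times 1$ external excitation; $\mathbf{B}$ is the $n \times m$ influence matrix; $\mathbf{g}(\cdot;\cdot)$ is an $n_g \times 1$ (possibly nonlinear) function of an $n_o \times 1$ subset, or linear combination, of states $\bar{\mathbf{X}}(t) = \mathbf{G}\mathbf{X}(t)$ and of the $n_\theta \times 1$ uncertain parameter vector $\theta$; $\mathbf{L}$ is an $n \times n_g$ influence matrix that maps the force vector $\mathbf{g}(\cdot;\cdot)$ to all the states; and $\mathbf{x}_0$ is the initial condition. The output $\mathbf{Y}(t)$ is an $n_y \times 1$ vector. The nominal linear system corresponding to the modified system in (3.11) is

$$
\begin{aligned}
\dot{\mathbf{x}}(t) &= \mathbf{A}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t), \qquad \mathbf{x}(0) = \mathbf{x}_0, \\
\mathbf{y}(t) &= \mathbf{C}\mathbf{x}(t) + \mathbf{D}\mathbf{u}(t).
\end{aligned}
\tag{3.12}
$$

The response of modified system (3.11) is calculated by the superposition of $\mathbf{x}(t)$, the solution of linear system (3.12), and $\mathbf{x}^{(\text{nl})}(t)$ due to the (possibly nonlinear) forcing function $\mathbf{g}(\bar{\mathbf{X}}(t);\theta)$. Hence,

$$
\begin{aligned}
\mathbf{x}(t) &= e^{\mathbf{A}t}\mathbf{x}_0 + \int_0^t \mathbf{H}_B(t-s)\,\mathbf{u}(s)\,ds, \\
\mathbf{x}^{(\text{nl})}(t) &= \int_0^t \mathbf{H}_L(t-s)\,\mathbf{g}(\bar{\mathbf{X}}(s);\theta)\,ds,
\end{aligned}
\tag{3.13}
$$

where $\mathbf{H}_B(t) = e^{\mathbf{A}t}\mathbf{B}$ and $\mathbf{H}_L(t) = e^{\mathbf{A}t}\mathbf{L}$ are impulse responses of the nominal system. Then, the modified state response is $\mathbf{X}(t) = \mathbf{x}(t) + \mathbf{x}^{(\text{nl})}(t)$. $\mathbf{x}^{(\text{nl})}$ can be efficiently determined as follows:

$$
\begin{aligned}
\mathbf{p}(t) &= \mathbf{g}(\bar{\mathbf{X}}(t);\theta), \\
\bar{\mathbf{X}}(t) &= \bar{\mathbf{x}}(t) + \int_0^t \bar{\mathbf{H}}_L(t-s)\,\mathbf{p}(s)\,ds,
\end{aligned}
\tag{3.14}
$$

where $\bar{\mathbf{x}}(t) = \mathbf{G}\mathbf{x}(t)$ and $\bar{\mathbf{H}}_L(t) = \mathbf{G}\mathbf{H}_L(t) = \mathbf{G}e^{\mathbf{A}t}\mathbf{L}$. The two equations in (3.14) can be combined into

$$
\mathbf{p}(t) - \mathbf{g}\!\left(\bar{\mathbf{x}}(t) + \int_0^t \bar{\mathbf{H}}_L(t-s)\,\mathbf{p}(s)\,ds;\ \theta\right) = \mathbf{0}.
\tag{3.15}
$$

The system of equations in (3.15) is a (generally) nonlinear Volterra integral equation (NVIE) of the second kind written in nonstandard form. Hence, the modified system (3.11) is exactly converted to a low-order NVIE.

Equation (3.15) can be solved by employing a Newton-Gregory integration scheme of various orders [101].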
For example, using a second-order-accurate integration scheme, which is employed in the numerical examples herein, the approximation for $\bar{\mathbf{X}}(t)$ at $t = k\Delta t$ becomes

$$
\bar{\mathbf{X}}(k\Delta t) = \overbrace{\bar{\mathbf{x}}(k\Delta t) + \tfrac{1}{2}\bar{\mathbf{H}}_L(k\Delta t)\,\mathbf{p}(0)\,\Delta t + \sum_{i=1}^{k-1} \bar{\mathbf{H}}_L([k-i]\Delta t)\,\mathbf{p}(i\Delta t)\,\Delta t}^{\boldsymbol{\alpha}_{k-1}} + \tfrac{1}{2}\bar{\mathbf{H}}_L(0)\,\mathbf{p}(k\Delta t)\,\Delta t
\tag{3.16}
$$

where $\boldsymbol{\alpha}_{k-1}$ depends only on $\mathbf{p}$ at earlier time steps, $\mathbf{p}(i\Delta t)$, $i = 0, \ldots, k-1$, and is known at $t = k\Delta t$. However, $\mathbf{p}(k\Delta t)$ in the last term is unknown and is given by

$$
\mathbf{p}(k\Delta t) - \mathbf{g}\!\left(\boldsymbol{\alpha}_{k-1} + \tfrac{1}{2}\bar{\mathbf{H}}_L(0)\,\mathbf{p}(k\Delta t)\,\Delta t;\ \theta\right) = \mathbf{0}.
\tag{3.17}
$$

Equation (3.17) is a (generally nonlinear) equation in $\mathbf{p}(k\Delta t)$; it can be solved using Newton's method with an initial guess $\mathbf{p}_0(k\Delta t) = \mathbf{p}([k-1]\Delta t)$ from the previous time step, as follows:

$$
\mathbf{p}_{j+1}(k\Delta t) = \mathbf{p}_j(k\Delta t) - \left[\mathbf{I} - \tfrac{1}{2}\frac{\partial \mathbf{g}}{\partial \bar{\mathbf{X}}}\,\bar{\mathbf{H}}_L(0)\,\Delta t\right]^{-1} \left\{\mathbf{p}_j(k\Delta t) - \mathbf{g}\!\left(\boldsymbol{\alpha}_{k-1} + \tfrac{1}{2}\bar{\mathbf{H}}_L(0)\,\mathbf{p}_j(k\Delta t)\,\Delta t;\ \theta\right)\right\}
\tag{3.18}
$$

where $\partial\mathbf{g}/\partial\bar{\mathbf{X}}$ is evaluated at time $k\Delta t$ using the prior estimate $\mathbf{p}_j(k\Delta t)$. The iteration continues until a certain level of accuracy ($10^{-12}$ relative accuracy is used herein) has been achieved. As in the algorithm proposed in Gaurav et al. [85], alongside the Newton-Gregory integration scheme, a recursive fast Fourier transform (FFT) is applied herein to compute the convolution with dramatically reduced computational cost. A block diagram of the flow of calculations for this approach is shown in Figure 3.2.

Figure 3.2: Implementation of the efficient dynamic response algorithm showing one-time calculation and repeated calculation components. (One-time calculations: the nominal response $\bar{\mathbf{x}}(t)$ from $\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}$ and the impulse response $\bar{\mathbf{H}}_L(t) = \mathbf{G}e^{\mathbf{A}t}\mathbf{L}$. Calculations repeated for each sample of $\theta$: solution of the NVIE for $\mathbf{p}(t)$, evaluation of $\mathbf{x}^{(\text{nl})}(t)$, and superposition to obtain $\mathbf{X}(t)$.)

These two approaches are combined into the proposed generalized computational framework for Bayesian model class selection with different application domains. The increase in computational efficiency is next illustrated through three structural examples but is in no way limited to only these types of applications.

3.3 Numerical Examples

The proposed method is applied to three different structural models of increasing complexity. A variety of linear and nonlinear model classes are used to characterize selected components of these structures. The accuracy of the proposed method is compared to Matlab's ode45 using the error metric

$$
\text{relative RMS difference} = \frac{\|(\cdot)_{\text{NVIE}} - (\cdot)_{\text{ode45}}\|_2}{\|(\cdot)_{\text{ode45}}\|_2}
\tag{3.19}
$$

where the two-norm is defined by $\|x(t)\|_2^2 = \frac{1}{t_f}\int_0^{t_f} x^2(t)\,dt$; $(\cdot)_{\text{NVIE}}$ is a quantity evaluated using the NVIE approach; and $(\cdot)_{\text{ode45}}$ is evaluated using ode45.
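The time-stepping scheme in (3.16)–(3.18) and the error metric (3.19) can be illustrated on a scalar toy problem. The sketch below is an assumption-laden illustration, not the solver of [85]: the system $\dot{X} = -X + u - \theta X^3$ with a unit-step input, the value of $\theta$, and all names are hypothetical; the convolution is evaluated by a direct sum rather than the recursive FFT; and a fixed-step fourth-order Runge-Kutta integration stands in for ode45 as the reference.

```python
import math

theta = 2.0      # hypothetical coefficient of the local cubic nonlinearity
dt = 0.01        # time step [s]
n_steps = 500    # 5 s of response


def g(x):
    """Local nonlinear force g(X; theta)."""
    return theta * x ** 3


def dg(x):
    """Derivative of g with respect to X, used in the Newton update (3.18)."""
    return 3.0 * theta * x ** 2


# Scalar system: A = -1, B = 1, L = -1, G = 1, so H_L(t) = -exp(-t).
H = [-math.exp(-k * dt) for k in range(n_steps + 1)]
# Nominal linear response to a unit step with zero initial state.
x_nom = [1.0 - math.exp(-k * dt) for k in range(n_steps + 1)]

p = [g(x_nom[0])]
X_nvie = [x_nom[0]]
for k in range(1, n_steps + 1):
    # Known part alpha_{k-1} of (3.16): nominal response plus past convolution terms.
    alpha = x_nom[k] + 0.5 * H[k] * p[0] * dt
    alpha += sum(H[k - i] * p[i] for i in range(1, k)) * dt
    p_k = p[k - 1]  # initial Newton guess from the previous time step
    for _ in range(50):
        X_k = alpha + 0.5 * H[0] * p_k * dt
        step = (p_k - g(X_k)) / (1.0 - 0.5 * dg(X_k) * H[0] * dt)
        p_k -= step
        if abs(step) <= 1e-12 * max(1.0, abs(p_k)):
            break
    p.append(p_k)
    X_nvie.append(alpha + 0.5 * H[0] * p_k * dt)


def rhs(x):
    """Right-hand side of the full nonlinear ODE for the reference solution."""
    return -x + 1.0 - theta * x ** 3


X_ref = [0.0]
x = 0.0
for _ in range(n_steps):
    k1 = rhs(x)
    k2 = rhs(x + 0.5 * dt * k1)
    k3 = rhs(x + 0.5 * dt * k2)
    k4 = rhs(x + dt * k3)
    x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    X_ref.append(x)

# Relative RMS difference (3.19); the 1/t_f factors cancel for uniform sampling.
num = math.sqrt(sum((a - b) ** 2 for a, b in zip(X_nvie, X_ref)))
den = math.sqrt(sum(b ** 2 for b in X_ref))
rel_rms_diff = num / den
print(rel_rms_diff)
```

In this sketch only the scalar unknown $p(k\Delta t)$ is iterated at each step, which reflects the point of the formulation: the size of the nonlinear solve is set by the number of local nonlinear forces, not by the number of states.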
All simulations were performed using a MacBook Pro with a 2.3 GHz Intel Core i7 processor and 16 GB of RAM running Matlab R2014b, where the cputime function was used to calculate the required computation time.

Figure 3.3: Two DOF structural model with hysteretic damping.

3.3.1 Example I: SDOF Superstructure on a Hysteretic Isolator

A two degree-of-freedom structure (Figure 3.3), composed of a mass-spring-damper superstructure model mounted on a lead rubber bearing isolator, is subjected to a 30-second record of the 1940 El Centro earthquake (N-S Imperial Valley Irrigation District substation record of the 1940 Imperial Valley earthquake; PGA 0.348g; sampled at 50 Hz) to demonstrate the method. The equations of motion of this structure are:

$$\begin{aligned} m_s\ddot{u}_s + c_s(\dot{u}_s - \dot{u}_b) + k_s(u_s - u_b) &= -m_s\ddot{u}_g \\ m_b\ddot{u}_b + c_s(\dot{u}_b - \dot{u}_s) + k_s(u_b - u_s) + f_b &= -m_b\ddot{u}_g \end{aligned} \tag{3.20}$$

where $u_s$ and $u_b$ are superstructure and base displacements, respectively, relative to the ground; $m_s$, $c_s$ and $k_s$ are the superstructure's mass, damping coefficient and stiffness, respectively; $\ddot{u}_g$ is the ground acceleration; $m_b$ is the base mass; and $f_b$ is the sum of the isolation-layer damping and restoring forces.

Nonlinear Model Classes for Hysteretic Damping

Two nonlinear model classes are considered here for the isolator restoring force: the Bouc-Wen hysteresis model class [69] and an approximate bilinear model class. An illustration of a single hysteresis loop for these nonlinear model classes is shown in Figure 3.4. In these model classes, $k_{\text{pre}}$ and $k_{\text{post}}$ are the isolator pre-yield and post-yield stiffnesses, respectively, and $Q_y$ is the isolator yield force. In both, the total force exerted by the isolation layer, the sum of the damping and restoring forces, is given by

$$f_b = c_b\dot{x}_b + k_{\text{post}}u_b + \alpha z \tag{3.21}$$

where $\dot{x}_b$ denotes the base velocity $\dot{u}_b$, and $\alpha = Q_y[1 - r_k]$, the peak of the non-elastic force, depends on the hardening ratio $r_k = k_{\text{post}}/k_{\text{pre}}$.
The evolution of the auxiliary variable $z$ is governed by

$$\dot{z} = A\dot{x}_b - \beta\dot{x}_b|z|^{n_{\text{pow}}} - \gamma z|\dot{x}_b||z|^{n_{\text{pow}}-1} \tag{3.22}$$

where the selection of $A = 2\beta = 2\gamma = k_{\text{pre}}/Q_y$ dictates that $z$ stays in $[-1, 1]$ and produces identical loading and unloading stiffnesses [71, 102]. For the Bouc-Wen model class, $n_{\text{pow}} = 1$ is assumed. As $n_{\text{pow}} \to \infty$, the model approaches the bilinear hysteresis model class; herein, $n_{\text{pow}} = 100$ is used to represent the bilinear model class. For these nonlinear model classes, the state vector is $\mathbf{X} = [u_s\ u_b\ \dot{u}_s\ \dot{x}_b\ z]^{\mathrm{T}}$; the state subset is $\bar{\mathbf{X}} = [u_b\ \dot{x}_b\ z]^{\mathrm{T}}$. For the impulse response $\mathbf{H}_L(t)$ to be a decaying function of time, the nominal structure must have nominal isolation-layer linear stiffness and linear viscous damping, denoted by $k_b^{\text{nom}}$ and $c_b^{\text{nom}}$, respectively. Thus, the nonlinear function $\mathbf{g}$ becomes a $2 \times 1$ vector: the first element is the modified system's total isolation-layer force relative to that of the nominal linear system; in the nominal system, the linear state equation $\dot{z} = 0$ is used, so the second element of $\mathbf{g}$ is the right-hand side of (3.22):

$$\mathbf{g}(\bar{\mathbf{X}}(t)) = \begin{bmatrix} (c_b - c_b^{\text{nom}})\dot{x}_b + (k_{\text{post}} - k_b^{\text{nom}})u_b + \alpha z \\ A\dot{x}_b - \beta\dot{x}_b|z|^{n_{\text{pow}}} - \gamma z|\dot{x}_b||z|^{n_{\text{pow}}-1} \end{bmatrix} \tag{3.23}$$

and $\mathbf{L} = [0\ \ 0\ \ 0\ \ {-1/m_b}\ \ 1]^{\mathrm{T}}$.

Figure 3.4: Damping model classes: Bouc-Wen hysteresis, bilinear hysteresis, and the AASHTO "equivalent" linear model class. (Axes: isolator force versus base drift $u_b$, with $Q_y$, $x_y$, $x_d$, $k_{\text{pre}}$, $k_{\text{post}}$ and $k_{\text{eq}}$ marked.)

Linear Model Classes for Hysteretic Damping

Several linear model classes are also considered to model the total isolator force, all of the form

$$f_b = [c_b + c_{\text{eq}}]\dot{x}_b + k_{\text{eq}}u_b = \left[c_b + 2\zeta_{\text{eq}}\sqrt{k_{\text{eq}}(m_b + m_s)}\right]\dot{x}_b + k_{\text{eq}}u_b. \tag{3.24}$$

These model classes are chosen from among those proposed in the literature to approximate the energy dissipation per cycle of a bilinear hysteretic model class.
AASHTO (American Association of State Highway and Transportation Officials) specifies an equivalent linear model class [103, 104], where the linear stiffness is the secant stiffness for a particular design displacement $x_d$, as shown in Figure 3.4, and the linear viscous damping is chosen so that the energy dissipated per cycle is the same as that of the bilinear model class when the structure (as a rigid mass) moves with a harmonic displacement at resonance; the resulting equivalent damping ratio $\zeta_{\text{eq}}$ and equivalent stiffness $k_{\text{eq}}$ are given by

$$\zeta_{\text{eq}} = \frac{2(1 - r_k)(1 - r_d^{-1})}{\pi[1 + r_k(r_d - 1)]}, \qquad k_{\text{eq}} = k_{\text{pre}} \cdot \frac{1 + r_k(r_d - 1)}{r_d} \tag{3.25}$$

where the shear ductility ratio is $r_d = x_d/x_y$, and $x_d$ and $x_y$ are the design and yielding displacements, respectively. The JPWRI (Japanese Public Works Research Institute) linear model class [72] is similar but replaces $r_d$ with $0.7r_d$ in (3.25) because the effective design displacement is assumed to be only 0.7 times the design displacement $x_d$. Additionally, Hwang and Chiou [73] proposed a correction to the AASHTO model class using correction factors for $\zeta_{\text{eq}}$ and $k_{\text{eq}}$:

$$\zeta_{\text{eq}} = \frac{2(1 - r_k)(1 - r_d^{-1})}{\pi[1 + r_k(r_d - 1)]} \cdot \frac{r_d^{0.58}}{6 - 10r_k}, \qquad k_{\text{eq}} = k_{\text{pre}} \cdot \frac{1 + r_k(r_d - 1)}{r_d} \cdot \frac{r_d^{4}}{[r_d^{2} - 0.737(r_d - 1)]^{2}} \tag{3.26}$$

In the Caltrans (California Department of Transportation) equivalent linear model class [105], the equivalent damping ratio and stiffness are given by

$$\zeta_{\text{eq}} = 0.0587(r_d - 1)^{0.371}, \qquad k_{\text{eq}} = k_{\text{pre}} \cdot \frac{1}{\left[1 + \ln\{1 + 0.13(r_d - 1)^{1.137}\}\right]^{2}} \tag{3.27}$$

In these linear model classes, the proposed method can also be employed with $\mathbf{X} = [u_s\ u_b\ \dot{u}_s\ \dot{x}_b]^{\mathrm{T}}$, $\bar{\mathbf{X}} = [u_b\ \dot{x}_b]^{\mathrm{T}}$, and

$$g(\bar{\mathbf{X}}(t)) = (c_b + c_{\text{eq}} - c_b^{\text{nom}})\dot{x}_b + (k_{\text{eq}} - k_b^{\text{nom}})u_b \tag{3.28}$$

Measurement Data and Problem Specification

The data for this first numerical example is generated using the following superstructure parameters: $m_s = 29485$ kg, $k_s = 11912$ kN/m, and $c_s = 23.71$ kN·s/m. The mass of the base layer is $m_b = 6800$ kg [71].
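The equivalent-linear formulas (3.25) to (3.27) can be evaluated with a few lines of code. The sketch below returns the equivalent damping ratio and the stiffness ratio $k_{\text{eq}}/k_{\text{pre}}$ for each linear model class; the function names are hypothetical, and the evaluation point ($r_k = 1/6$, $r_d = 2.5$) is chosen only for illustration.

```python
import math


def aashto(r_k, r_d):
    """Equation (3.25): returns (zeta_eq, k_eq / k_pre)."""
    zeta = 2.0 * (1.0 - r_k) * (1.0 - 1.0 / r_d) / (
        math.pi * (1.0 + r_k * (r_d - 1.0))
    )
    k_ratio = (1.0 + r_k * (r_d - 1.0)) / r_d
    return zeta, k_ratio


def jpwri(r_k, r_d):
    """JPWRI: same as (3.25) with r_d replaced by 0.7 * r_d."""
    return aashto(r_k, 0.7 * r_d)


def modified_aashto(r_k, r_d):
    """Equation (3.26): AASHTO values times the Hwang-Chiou corrections."""
    zeta, k_ratio = aashto(r_k, r_d)
    zeta *= r_d ** 0.58 / (6.0 - 10.0 * r_k)
    k_ratio *= r_d ** 4 / (r_d ** 2 - 0.737 * (r_d - 1.0)) ** 2
    return zeta, k_ratio


def caltrans(r_k, r_d):
    """Equation (3.27); r_k is unused but kept for a uniform signature."""
    zeta = 0.0587 * (r_d - 1.0) ** 0.371
    k_ratio = 1.0 / (1.0 + math.log(1.0 + 0.13 * (r_d - 1.0) ** 1.137)) ** 2
    return zeta, k_ratio


# Illustrative evaluation point (an assumption, not a thesis setting).
r_k, r_d = 1.0 / 6.0, 2.5
ratios = {
    "AASHTO": aashto(r_k, r_d)[1],
    "JPWRI": jpwri(r_k, r_d)[1],
    "modified AASHTO": modified_aashto(r_k, r_d)[1],
    "Caltrans": caltrans(r_k, r_d)[1],
}
```

At this evaluation point the stiffness ratios come out to roughly 0.50 (AASHTO), 0.64 (JPWRI), 0.71 (Caltrans) and 0.74 (modified AASHTO), so the first two are the softest of the four linearizations.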
The true model class of the isolation layer is the Bouc-Wen model class with parameters as shown in Table 3.1: the true model's post-yield stiffness $k_{\text{post}} = 232$ kN/m and linear isolation damping $c_b = 3.74$ kN·s/m are chosen so that the large-amplitude isolation period is 2.5 s with 2% damping; the hardening (post-yield to pre-yield stiffness) ratio $r_k = 1/6$ and yield force $Q_y = 0.05(m_s + m_b)g$ (i.e., 5% of the total weight of the structure), where $g$ is the acceleration of gravity, not to be confused with the modification function $\mathbf{g}(\cdot\,;\cdot)$, are typical values [106]; and the exponent is $n_{\text{pow}} = 1$. Measurements of the drift of the structure with respect to the base, i.e., $(u_s - u_b)$ as shown in Figure 3.3, are collected with a sampling rate of 20 Hz, where 20% Gaussian pulse sensor noise is added to the measurements.

The different linear and nonlinear model classes for the lead rubber bearings, described in the previous section, are evaluated using Bayesian inference. The parameters $k_{\text{post}}$, $c_b$, $r_k$ and $Q_y$ for the nonlinear model classes are considered uncertain; for the linear model classes, the uncertain parameters are $k_{\text{post}}$, $c_b$, $r_k$ and $r_d$. The assumed distributions of the priors are given in Table 3.1 along with their respective means and standard deviations. The stopping criterion used for the nested sampling is less than a 1% change in evidence $E$.

Table 3.1: Prior distributions for model class parameters as applicable to each 2 DOF model.

| Parameter | True value | Prior distribution | Prior mean | Prior std. dev. |
|---|---|---|---|---|
| $k_{\text{post}}$ | 232 kN/m | Lognormal | 250 kN/m | 7.5 kN/m |
| $c_b$ | 3.74 kN·s/m | Normal | 4.00 kN·s/m | 0.28 kN·s/m |
| $r_k$ | 0.1667 | Uniform | 0.1800 | 0.0115 |
| $r_d$ | n/a* | Normal | 2.5 | 0.2 |
| $Q_y$ (in %) | 5.0 | Normal | 4.5 | 0.25 |

* $r_d$ is only required for linear model classes.

Table 3.2: Posterior model class probabilities with equal priors.

| Model class $M_k$ | Log of evidence, $\ln p(D \mid M_k)$ | $P(M_k \mid D)$, all $M_k$ | $P(M_k \mid D)$, no $M_5$ | $P(M_k \mid D)$, no $M_5$, $M_6$ |
|---|---|---|---|---|
| $M_1$ AASHTO | 1897.4117 | ≈ 0 | ≈ 0 | 0.000045 |
| $M_2$ JPWRI | 1907.4214 | ≈ 0 | ≈ 0 | 0.999955 |
| $M_3$ modified AASHTO | 1302.4428 | ≈ 0 | ≈ 0 | ≈ 0 |
| $M_4$ Caltrans | 269.3359 | ≈ 0 | ≈ 0 | ≈ 0 |
| $M_5$ Bouc-Wen | 4290.3246 | 1.0000 | n/a | n/a |
| $M_6$ Bilinear | 4137.2200 | ≈ 0 | 1.0000 | n/a |

The proposed approach for response calculation is used with $\Delta t = 30\ \text{s}/(2^{12} - 1) = 7.33$ ms. So that the impulse responses of the nominal system are well-behaved, the nominal system has isolation-layer linear stiffness $k_b^{\text{nom}} = 250$ kN/m and damping coefficient $c_b^{\text{nom}} = 4$ kN·s/m, which are the mean values of $k_{\text{post}}$ and $c_b$, respectively, as shown in Table 3.1.

Figure 3.5: Hysteresis loops for the true Bouc-Wen model class for the 2 DOF structure. (Axes: auxiliary variable $z$ versus base drift [cm].)

Assuming equal prior probabilities for all six model classes, i.e., $P(M_k) = 1/6$, the model class selection is performed. The results, shown in Table 3.2 ("all $M_k$" column), find that the posterior probability of the Bouc-Wen model class is essentially unity, which clearly demonstrates the method's ability to identify the best model. (Note that the log-evidence values in Table 3.2 are unscaled by the denominator (3.1), so they must be interpreted in a relative sense, with larger values indicating model classes that better fit the data.) If the Bouc-Wen model class, which was used to generate the measured data, is not among the model classes considered (so $P(M_k) = 1/5$ for $k \neq 5$), then the bilinear model class is the best ("no $M_5$" column).
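The posterior probabilities in Table 3.2 follow from the log-evidence column by Bayes' rule over the model classes. The sketch below shows one way to do this arithmetic; the function name is hypothetical, and subtracting the largest log term before exponentiating is an implementation choice made here to avoid overflow, not something stated in the thesis.

```python
import math


def posterior_model_probabilities(log_evidence, prior=None):
    """Posterior P(M_k | D) from log evidences ln p(D | M_k).

    The largest log term is subtracted before exponentiating so that
    evidences of very different magnitudes can be compared safely.
    """
    names = list(log_evidence)
    if prior is None:  # equal prior probabilities by default
        prior = {name: 1.0 / len(names) for name in names}
    log_terms = {n: log_evidence[n] + math.log(prior[n]) for n in names}
    shift = max(log_terms.values())
    weights = {n: math.exp(v - shift) for n, v in log_terms.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


# Log-evidence values from Table 3.2.
LOG_EVIDENCE = {
    "AASHTO": 1897.4117,
    "JPWRI": 1907.4214,
    "modified AASHTO": 1302.4428,
    "Caltrans": 269.3359,
    "Bouc-Wen": 4290.3246,
    "Bilinear": 4137.2200,
}

all_classes = posterior_model_probabilities(LOG_EVIDENCE)
no_m5 = posterior_model_probabilities(
    {k: v for k, v in LOG_EVIDENCE.items() if k != "Bouc-Wen"}
)
linear_only = posterior_model_probabilities(
    {k: v for k, v in LOG_EVIDENCE.items() if k not in ("Bouc-Wen", "Bilinear")}
)
```

With only the four linear model classes retained, the 10.0 difference in log evidence between JPWRI and AASHTO corresponds to a posterior ratio of about $e^{-10}$, which reproduces the 0.000045 and 0.999955 entries of the last column.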
If neither the Bouc-Wen nor the bilinear model classes are considered (with $P(M_k) = 1/4$ for $k = 1, \ldots, 4$), then the JPWRI model class is by far the best, with the AASHTO model class a distant second best ("no $M_5$, $M_6$" column); this result is expected because the ranges of shear ductility ratio $r_d$ and hardening ratio $r_k$ for this example (specified in Table 3.1) make the AASHTO and JPWRI model classes softer than the other linearized model classes (e.g., using the means of the priors of $r_k$ and $r_d$, then $k_{\text{eq}}$ is $k_{\text{pre}}$ times 0.5 for AASHTO, 0.6429 for JPWRI, 0.7092 for Caltrans, and 0.7380 for modified AASHTO); therefore, AASHTO and JPWRI are better fits to this example's base drifts that go far beyond the yield displacements, as shown in the graph in Figure 3.5 of the hysteresis loops of the auxiliary variable $z$ (which is proportional to the yielding component of the isolator restoring force). (Note, however, that the main focus of this chapter is on evaluating the computational efficiency of the proposed method, not on the details of which model classes or their variants are included; in either case, inclusion or exclusion of the true model class has no impact on the efficiency of the method.)

Accuracy and Computational Efficiency Gain

With a time step of $\Delta t = 7.33$ ms, using the true parameters of a Bouc-Wen model class of the isolation hysteresis, the accuracy of the proposed method is assessed by (3.19), the relative RMS of the difference between the response computed by the proposed approach and that from Matlab's ode45. Using structural drift as the metric, and comparing to an extremely accurate ode45 with $10^{-10}$ relative and $10^{-16}$ absolute tolerances (which may serve as an "exact" response), the relative RMS difference is $3.1 \times 10^{-3}$. Thus, the computational cost of the proposed approach with $\Delta t = 7.33$ ms is evaluated against a comparably accurate ode45, one with the default tolerances of $10^{-3}$ relative and $10^{-6}$ absolute.
The proposed approach takes 0.020 s for one-time calculations and 0.539 s for each repeated calculation (average of 10 simulations), which is comparable to the computation time for Matlab's ode45. Although the proposed approach does not give any computational advantages for this example, the computational efficiency greatly increases for systems with many DOFs, as will be illustrated in the next two examples.

3.3.2 Example II: 11-Story Base-Isolated Structural Model

The second example utilizes a more complex superstructure: an 11-story, 2-bay, 99 DOF superstructure sitting on a hysteretic isolation layer that is rigid in-plane and can only move horizontally, resulting in the 100 DOF model shown in Figure 3.6. The column weights are neglected and consistent mass matrices are used for the beam elements in the model. Rayleigh damping, with assumed 3% damping ratios for the 1st and 10th modes, is used to construct the superstructure damping matrix. If considered as a fixed-base structure, the superstructure would have a fundamental period of 1.05 s with equations of motion written as

$$\mathbf{M}_s\ddot{\mathbf{u}}_s + \mathbf{C}_s\dot{\mathbf{u}}_s + \mathbf{K}_s\mathbf{u}_s = -\mathbf{M}_s\mathbf{r}_s\ddot{u}_g \tag{3.29}$$

where $\mathbf{M}_s$, $\mathbf{C}_s$ and $\mathbf{K}_s$ are the mass, damping and stiffness matrices of the superstructure, respectively; $\ddot{u}_g$ is the ground acceleration; and $\mathbf{u}_s$ is the generalized displacement vector relative to the ground, consisting of 3 DOF per node for each of the 33 nodes in the model. The influence vector of the ground motion $\ddot{u}_g$, consisting of a 1 in each element corresponding to a horizontal displacement in $\mathbf{u}_s$ and zeros elsewhere, is denoted by $\mathbf{r}_s$; i.e., $\mathbf{r}_s = [1\ 0\ 0\ \ldots\ 1\ 0\ 0]^{\mathrm{T}}$.

Figure 3.6: 100 DOF base-isolated structural model. (Superstructure DOFs $u_1, \ldots, u_{99}$ over floors 1 to 11, base displacement $u_b$, hysteretic isolation bearings between base and ground.)
When combined with the isolation layer, the equations of motion are

$$\begin{aligned} \mathbf{M}_s\ddot{\mathbf{u}}_s + \mathbf{C}_s\dot{\mathbf{u}}_s + \mathbf{K}_s\mathbf{u}_s &= -\mathbf{M}_s\mathbf{r}_s\ddot{u}_g + \mathbf{C}_s\mathbf{r}_s\dot{u}_b + \mathbf{K}_s\mathbf{r}_s u_b \\ m_b\ddot{u}_b + (c_b + \mathbf{r}_s^{\mathrm{T}}\mathbf{C}_s\mathbf{r}_s)\dot{u}_b + (k_b + \mathbf{r}_s^{\mathrm{T}}\mathbf{K}_s\mathbf{r}_s)u_b + f_b &= -m_b\ddot{u}_g + \mathbf{r}_s^{\mathrm{T}}\mathbf{C}_s\dot{\mathbf{u}}_s + \mathbf{r}_s^{\mathrm{T}}\mathbf{K}_s\mathbf{u}_s \end{aligned} \tag{3.30}$$

The true model class is again a Bouc-Wen model with: exponent $n_{\text{pow}} = 1$; typical yield force $Q_y = 5\%$ of building weight and hardening ratio $r_k = k_{\text{post}}/k_{\text{pre}} = 1/6$ [106]; and isolation parameters $k_{\text{post}}$ and $c_b$, shown in Table 3.3, chosen so that the building's large-amplitude isolation mode has a period of 2.22 s and a damping ratio of 3.53%. The ground excitation is again the 30 s record of the 1940 El Centro earthquake. The absolute acceleration measurements of the roof horizontal DOF, denoted by $u_{97}$ in Figure 3.6, are collected with a sampling rate of 20 Hz, to which 20% Gaussian pulse process sensor noise is added.

The nominal system has isolation-layer linear stiffness $k_b^{\text{nom}} = 780$ kN/m and damping coefficient $c_b^{\text{nom}} = 35$ kN·s/m (which are, as shown in Table 3.3, the means of the post-yield stiffness $k_{\text{post}}$ and isolation viscous damping $c_b$, respectively), both of which are removed by the modification as in (3.23) or (3.28). In the state-space formulation, the state vector is $\mathbf{X} = [\mathbf{u}^{\mathrm{T}}\ u_b\ \dot{\mathbf{u}}^{\mathrm{T}}\ \dot{x}_b]^{\mathrm{T}}$ for the linear model classes specified in Example I; for the nonlinear model classes, the state vector is $\mathbf{X} = [\mathbf{u}^{\mathrm{T}}\ u_b\ \dot{\mathbf{u}}^{\mathrm{T}}\ \dot{x}_b\ z]^{\mathrm{T}}$.

The uncertainty in the model is assumed to be contained solely in the isolation layer behavior. The proposed method is applied to the nonlinear and linear model classes. The prior distributions for the uncertain isolation layer parameters are taken as independently Gaussian, with means and standard deviations as given in Table 3.3. The nested sampling algorithm is again used with the same stopping criterion discussed previously.

Table 3.3: Priors $\mathcal{N}(m, \sigma^2)$ for model class parameters of the 100 DOF model.

| Parameter | True value | Gaussian prior mean | Gaussian prior std. dev. |
|---|---|---|---|
| $k_{\text{post}}$ | 750 kN/m | 780 kN/m | 15 kN/m |
| $c_b$ | 40.0 kN·s/m | 35.0 kN·s/m | 2.8 kN·s/m |
| $r_k$ | 0.1667 | 0.1800 | 0.0060 |
| $r_d$ | n/a* | 2.5 | 0.2 |
| $Q_y$ (in %) | 8.75 | 7.875 | 0.4375 |

* $r_d$ is only required for linear model classes.

The nonlinear shapes of Bouc-Wen hysteresis loops and bilinear hysteresis loops are captured with the exponents $n_{\text{pow}} = 1$ and 100 for the Bouc-Wen and bilinear hysteresis model classes, respectively. The hysteresis loops of the auxiliary variable $z$ (which is proportional to the yielding component of the isolator restoring force) with these $n_{\text{pow}}$ values are shown in Figure 3.7. The value $n_{\text{pow}} = 100$ provides a clear bilinear behavior with a sharp transition, and $n_{\text{pow}} = 1$ provides the smooth transition from pre- to post-yield behavior that is more typical of real materials in isolation bearings. A comparison of hysteretic restoring forces is shown in Figure 3.8 for the bilinear, Bouc-Wen and AASHTO model classes. The linear model class gives an elliptically shaped hysteresis loop due to its use of equivalent linear viscous damping (chosen so that the energy dissipated per cycle, which is also the area inside the loop, approximates the energy dissipated by the bilinear model class); in contrast, the bilinear hysteresis loops consist of sharp lines, and the Bouc-Wen loops are similar to the bilinear ones but with smooth transitions at yielding.

The model classes are assumed initially to be equally likely, resulting in all prior model class probabilities equal to 1/6. The result of the model class selection process is given in Table 3.4 ("all $M_k$" column); the Bouc-Wen model again correctly emerged as the single best model class.
Clearly, in real applications, one often does not have the true model class in the suite of possible model classes; thus, the results are also provided when the Bouc-Wen model class is omitted ("no $M_5$" column), resulting in the bilinear model class chosen as best, and when neither the Bouc-Wen nor bilinear model classes are included ("no $M_5$, $M_6$" column), resulting in the modified AASHTO model class being the clear best (the hysteresis loops, shown in Figure 3.7, only yield mildly compared to those of the first example; as a result, the secant stiffness of this example is better fit by the modified AASHTO model class, which is stiffer than the other linear model classes for the ranges of priors in Table 3.3 for this example).

Figure 3.7: Hysteresis loops, for the (a) Bouc-Wen ($n_{\text{pow}} = 1$) and (b) bilinear ($n_{\text{pow}} = 100$) model classes, for the 100 DOF isolated building structure. (Axes: auxiliary variable $z$ versus base drift [cm].)

Table 3.4: Posterior model class probabilities with equal priors for the 100 DOF structure.

| Model class $M_k$ | Log of evidence, $\ln p(D \mid M_k)$ | $P(M_k \mid D)$, all $M_k$ | $P(M_k \mid D)$, no $M_5$ | $P(M_k \mid D)$, no $M_5$, $M_6$ |
|---|---|---|---|---|
| $M_1$ AASHTO | −1532.4467 | ≈ 0 | ≈ 0 | ≈ 0 |
| $M_2$ JPWRI | −1040.8365 | ≈ 0 | ≈ 0 | ≈ 0 |
| $M_3$ modified AASHTO | −938.2874 | ≈ 0 | ≈ 0 | 1.0000 |
| $M_4$ Caltrans | −965.2360 | ≈ 0 | ≈ 0 | ≈ 0 |
| $M_5$ Bouc-Wen | 172.5988 | 1.0000 | n/a | n/a |
| $M_6$ Bilinear | −182.6921 | ≈ 0 | 1.0000 | n/a |

Figure 3.8: Typical isolation-layer restoring force hysteresis loops, over the time duration [0.73, 2.16] s, for one linear (AASHTO) and two nonlinear (Bouc-Wen, bilinear) model classes for the 100 DOF structure. (Axes: isolator force [kN] versus base drift [cm].)

Accuracy and Computational Efficiency Gain

The computational efficiency of the proposed method is again compared with a traditional nonlinear solver (Matlab's ode45) for the solution of the state-space equations.
(For Examples II and III, other similar computational tools, such as Matlab's Simulink with or without accelerator mode, provide computational efficiency of the same order as ode45, so ode45 is suitable as a benchmark against which the proposed approach is compared.) The proposed approach's relative RMS differences of base drift and horizontal roof accelerations, compared to an extremely accurate ode45 with $10^{-10}$ relative and $10^{-16}$ absolute tolerances, are $3.5 \times 10^{-3}$ and $3.9 \times 10^{-3}$, respectively, for the true parameter values and $2^{12}$ time steps of $\Delta t = 7.33$ ms duration each. Thus, the computation time is compared with that of the comparably accurate ode45 with the default $10^{-3}$ relative and $10^{-6}$ absolute tolerances.

Using the proposed approach for either nonlinear hysteresis model class, a gain in computational efficiency of about 70 is achieved compared to a traditional approach using ode45, with tolerances selected to obtain comparable accuracy, for the response calculation as shown in Table 3.5. The nested sampling typically requires ∼15,500 simulations for the Bouc-Wen model class and ∼13,500 simulations for the bilinear model class. Hence, for the Bouc-Wen model class, the evaluation of evidence would require 9.89 days with ode45 compared to 3.40 hours with the proposed method; for the bilinear model class, ode45's (projected) 8.62 days of computation reduces to 2.96 hours using the proposed method.

Table 3.5: Computational gain achieved using the proposed approach for the three numerical examples.

| Example | $\Delta t$ | CPU time†, proposed (one-time) | CPU time†, proposed (repeated) | CPU time†, ode45 | Gain, 1 simulation | Gain, many simulations |
|---|---|---|---|---|---|---|
| I (1 DOF) | 7.33 ms | 0.020 s | 0.539 s | 2.068 s | 3.70 | 3.84 |
| II (100 DOF) | 7.33 ms | 0.628 s | 0.789 s | 55.136 s | 38.91 | 69.88 |
| III (1623 DOF) | 1.83 ms | 562.830 s | 2.195 s | 43.04 mins. | 4.57 | 1176 |

† CPU time for 1 simulation, average of 10 runs.
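The projected run times quoted above follow from the per-simulation CPU times of Example II in Table 3.5. The short sketch below redoes that arithmetic; the helper name is hypothetical, and the simulation counts are the approximate values quoted in the text.

```python
def projected_cost_seconds(n_sims, per_sim_seconds, one_time_seconds=0.0):
    """Total CPU time for n_sims forward simulations plus any one-time setup."""
    return one_time_seconds + n_sims * per_sim_seconds


# Example II timings from Table 3.5 (seconds).
ODE45_PER_SIM = 55.136
PROPOSED_ONE_TIME = 0.628
PROPOSED_REPEATED = 0.789

# Approximate nested-sampling simulation counts quoted in the text.
counts = {"Bouc-Wen": 15500, "Bilinear": 13500}

projection = {}
for name, n in counts.items():
    ode45_days = projected_cost_seconds(n, ODE45_PER_SIM) / 86400.0
    proposed_hours = (
        projected_cost_seconds(n, PROPOSED_REPEATED, PROPOSED_ONE_TIME) / 3600.0
    )
    projection[name] = (ode45_days, proposed_hours)

# Gains as reported in Table 3.5: one simulation includes the one-time cost,
# while for many simulations the one-time cost becomes negligible.
gain_single = ODE45_PER_SIM / (PROPOSED_ONE_TIME + PROPOSED_REPEATED)
gain_many = ODE45_PER_SIM / PROPOSED_REPEATED
```

The one-time cost is paid once per model class, which is why the gain for a single simulation (about 39) is much lower than the asymptotic gain (about 70) that governs a nested sampling run with thousands of simulations.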
3.3.3 Example III: Complex Three-Dimensional Wind-Excited Structure

In the third example, a complex 20-story moment-resisting frame building model with three nonlinear TMDs, adapted from Wojtkiewicz and Johnson [74] and shown in Figure 3.9, is considered to evaluate the efficacy of the proposed method. The building has a height of 80 m, and has a 40 m × 30 m rectangular 5-bay × 3-bay plan in the first five stories, then 3-bay × 2-bay in stories 6–10, and 2-bay × 2-bay in the top half of the building. Cross braces carrying axial loads provide additional stiffness for lateral bending and torsion. Euler-Bernoulli beams are used to model columns and floor beams; the beam-column joints are assumed rigid. Additional in-plane stiffness of the floors is provided by additional cross elements on the floors.

Figure 3.9: Complex three-dimensional wind-excited structure. (Axes: $x$ [m], $y$ [m], $z$ [m]; wind direction at 30° from the $x$-axis.)

Without the TMDs, the structural model has 1,620 DOF with its first six modal frequencies at 0.5718 Hz (first $y$-direction mode), 0.5893 Hz (first $x$-direction mode), 0.9363 Hz (first torsional mode), 1.3632 Hz (second $y$-direction mode), 1.5346 Hz (second $x$-direction mode), and 2.0292 Hz (second torsional mode). Two TMDs are placed in the $y$-direction (each about 0.55% of the building mass) and one TMD in the $x$-direction (about 1.1% of the building mass), which splits the first $y$-mode into two modes with frequencies at 0.5062 and 0.6282 Hz, the first $x$-mode into two modes with frequencies at 0.5214 and 0.6475 Hz, and the first torsional mode into two modes with frequencies at 0.5615 and 0.9506 Hz.

The building is subjected to one-directional wind excitation (oriented toward the east-northeast, at a 30° angle from the $x$-axis as shown in Figure 3.9), modeled as a narrowband filtered Gaussian white noise process shaped vertically along the building height.
The filter is a 16th-order Butterworth band-pass filter with cutoff frequencies 1.2 times smaller and larger than the fundamental structural natural frequency, resulting in most of the excitation energy in the range of 0.35–1.5 Hz, exciting primarily the fundamental mode in the east-west (E-W) $x$-direction and, secondarily, the fundamental modes in the north-south (N-S) $y$-direction and in torsion. The vertical power-law shaping is proportional to the height to the 0.3 power [75]. The force is assumed for simplicity to be fully correlated at all heights along the building.

The true model, used to generate response measurements, has in each of the three TMDs a power-law damper that exerts a force

$$f_P = c_P|\dot{u}|^{\beta_P}\,\mathrm{sgn}\,\dot{u} + \bar{c}_P\dot{u} \tag{3.31}$$

where $\beta_P = 0.8$ and $\dot{u}$ is the velocity of the TMD relative to its roof connection. For the one TMD that moves in the $x$-direction, the true damper coefficients are $c_P^x = 200$ kN·(s/m)$^{\beta_P}$ and $\bar{c}_P^x = 30$ kN·s/m; for each of the two TMDs that move in the $y$-direction, the true coefficients are $c_P^y = 100$ kN·(s/m)$^{\beta_P}$ and $\bar{c}_P^y = 15$ kN·s/m. These values are chosen such that the effect of nonlinearity is pronounced in the structure's behavior. This true model is simulated with the wind excitation, producing output acceleration responses at the top-floor center-of-mass in the $x$ and $y$ directions and the torsional acceleration (which, in the field, could be approximated with two non-collocated accelerometers). These measurements are sampled at 20 Hz and corrupted with 20% Gaussian pulse process sensor noise.

Candidate Model Classes for Damping of the TMDs

One linear and six nonlinear model classes are postulated to represent the nonlinear damping of the TMDs; their details are given in Table 3.6. The linear viscous damping model class assumes that the damping force, exerted by each TMD, is linearly proportional to the relative velocity between the TMD and its roof attachment point.
The power-law viscous damping model class has the damping force proportional to the relative velocity raised to a power as well as a linear viscous damping term; this power-law damping model is the true model class used for generating measurements in this example (the analysis will study both the case in which this true model is included in the set of candidate models, and cases in which the set does not include this true model class). In the cubic polynomial damping model class, a cubic term, similar to the power-law damping model class with an exponent of 3, is added to the linear term. In the Bingham model, the damping force has two components: a dry friction component and a linear viscous component. In the Van der Pol nonlinear damping model, the damping force degrades with increasing displacement. Two nonlinear hysteretic models are also considered: Bouc-Wen and bilinear hysteresis model classes.

Table 3.6: Nonlinear damping model classes, where $u$ is a TMD displacement relative to its roof attachment point.

| Model class $M_k$ | Damping force | Parameters |
|---|---|---|
| $M_1$ Linear | $f_L = c_L\dot{u}$ | $c_L$ |
| $M_2$ Power law | $f_P = c_P\lvert\dot{u}\rvert^{\beta_P}\,\mathrm{sgn}\,\dot{u} + \bar{c}_P\dot{u}$ | $[c_P\ \ \beta_P\ \ \bar{c}_P]$ |
| $M_3$ Cubic polynomial | $f_C = c_C\dot{u}^3 + \bar{c}_C\dot{u}$ | $[c_C\ \ \bar{c}_C]$ |
| $M_4$ Bingham | $f_B = \gamma_B\,\mathrm{sgn}\,\dot{u} + c_B\dot{u}$ | $[c_B\ \ \gamma_B]$ |
| $M_5$ Van der Pol | $f_V = c_V\left(1 - \mu_V u^2\right)\dot{u}$ | $[c_V\ \ \mu_V]$ |
| Hysteretic ($M_6$, $M_7$) | $f_H = \alpha_H z + k_{\text{post}}u$ § | $[r_k\ \ Q_y]$ |
| $M_6$ Bouc-Wen | $n_{\text{pow}} = 1$; $k_{\text{pre}}$ fixed | |
| $M_7$ Bilinear | $n_{\text{pow}} = 100$; $k_{\text{pre}}$ fixed | |

§ $z$ is defined in (3.22).

The priors for the parameters of the different model classes are given in Table 3.7 (the prior for $Q_y$ is specified relative to the TMD weight $W_{\text{tmd}}$). With all model classes considered, the models are assumed a priori to be equally likely (i.e., $P(M_k) = 1/7$), resulting in the power law model class having a posterior probability near unity, as shown in Table 3.8 ("all $M_k$" column), and the linear and cubic polynomial damping model classes as the distant runners-up. Clearly, this result is
expected given that the power law model class is the true model used to generate the measured data; further, the power-law model exponent is $\beta_P = 0.8$, which is close to one, so both the linear and cubic polynomial damping model classes can roughly approximate the close-to-linear power law. If the true power-law model class $M_2$ is omitted, and the other six model classes are considered equally likely, then the linear and cubic polynomial damping model classes are both considered to be reasonable choices, with linear having a slight edge ("no $M_2$" column). If the linear damping model class is also omitted ("no $M_1$, $M_2$" column), then the cubic polynomial damping model class is the best, with the Van der Pol model class a distant second.

Computational Efficiency Gain

The proposed method is implemented with $2^{14}$ time steps of $\Delta t = 1.83$ ms duration each (smaller for this example due to the greater complexity of the structure). To evaluate the accuracy of the proposed method, the TMD velocity responses are compared to those computed with Matlab's default ode45 ($10^{-3}$ relative and $10^{-6}$ absolute tolerances); the relative RMS differences are no more than $3.9 \times 10^{-3}$. Since this ode45 has relative accuracy of $O(10^{-3})$, the two approaches are of similar accuracy; thus, it is meaningful to compare their computational costs.

For the power law damping model class, using the proposed approach, gains in computational efficiency over three orders of magnitude were achieved compared to a traditional approach using ode45, with tolerances selected to obtain comparable accuracy, for the response calculation as shown in the last row of Table 3.5. For example, to evaluate the evidence for the power-law damping model class, ode45's 20.92 days of computation time for the required ∼700 simulations is reduced to 34.99 mins.
with the proposed approach; for the Van der Pol damping model class with ∼400 simulations, ode45's 11.95 days of computation time is reduced to 23.57 mins.; similar reductions in computation time are achieved for the other damping model classes as well.

Table 3.7: Prior distributions for model parameters for different damping model classes for the 1623 DOF building structure subjected to wind load.

| Model class | Parameter | Distribution | $x$ TMD mean | $x$ TMD std. dev. | $y$ TMDs mean | $y$ TMDs std. dev. | Units |
|---|---|---|---|---|---|---|---|
| $M_1$ Linear | $c_L$ | Normal | 350 | 35.0 | 225 | 22.5 | kN·s/m |
| $M_2$ Power law | $c_P$ | Normal | 225 | 20.0 | 120 | 10.0 | kN·(s/m)$^{\beta_P}$ |
| | $\bar{c}_P$ | Logn. | 27.5 | 2.5 | 20.0 | 2.0 | kN·s/m |
| | $\beta_P$ | Logn. | 0.85 | 0.05 | 0.85 | 0.05 | unitless |
| $M_3$ Cubic polynomial | $c_C$ | Logn. | 75 | 7.5 | 20.0 | 2.0 | kN·(s/m)$^3$ |
| | $\bar{c}_C$ | Normal | 350 | 35.0 | 225 | 22.5 | kN·s/m |
| $M_4$ Bingham | $c_B$ | Normal | 350 | 35.0 | 225 | 22.5 | kN·s/m |
| | $\gamma_B$ | Uniform | 7.5 | 1.44 | 7.5 | 1.44 | kN |
| $M_5$ Van der Pol | $c_V$ | Normal | 350 | 35.0 | 225 | 22.5 | kN·s/m |
| | $\mu_V$ | Uniform | 200 | 14.43 | 200 | 14.43 | 1/m$^2$ |
| $M_6$, $M_7$ Hysteretic | $k_{\text{pre}}$ | Uniform | 0.15 | 0.0144 | 0.15 | 0.0144 | $k_{\text{post}}$ |
| | $Q_y$ | Normal | 4.0 | 0.5 | 4.0 | 0.5 | % $W_{\text{tmd}}$ |

Table 3.8: Posterior TMD damping model class probabilities with equal priors for the 1623-DOF wind-excited building model.

| Model class $M_k$ | Log of evidence, $\ln p(D \mid M_k)$ | $P(M_k \mid D)$, all $M_k$ | $P(M_k \mid D)$, no $M_2$ | $P(M_k \mid D)$, no $M_1$, $M_2$ |
|---|---|---|---|---|
| $M_1$ Linear | 4888.0157 | 0.000015 | 0.579751 | n/a |
| $M_2$ Power law | 4899.1005 | 0.999974 | n/a | n/a |
| $M_3$ Bingham | 4817.4932 | ≈ 0 | ≈ 0 | ≈ 0 |
| $M_4$ Cubic polynomial | 4887.6737 | 0.000011 | 0.411825 | 0.979954 |
| $M_5$ Van der Pol | 4883.7842 | ≈ 0 | 0.008424 | 0.020046 |
| $M_6$ Bouc-Wen | 2509.3258 | ≈ 0 | ≈ 0 | ≈ 0 |
| $M_7$ Bilinear | 3097.0425 | ≈ 0 | ≈ 0 | ≈ 0 |

3.4 Conclusions

The main obstacle in performing Bayesian model class selection efficiently for nonlinear dynamical systems is to estimate the evidence within reasonable computational cost limits. This issue is addressed herein by proposing a computational framework that combines an intelligent sampling algorithm (nested sampling) with an efficient response calculation for systems with local modifications.
The nested sampling samples more from the high likelihood region, thereby substantially improving the efficiency compared to standard Monte Carlo for the estimation of evidence. The simulation of the locally modified systems is performed using an efficient algorithm that exploits the localized nature of the changes, relative to a nominal linear model, by exactly converting the equations of motion into a low-order nonlinear Volterra integral equation, which is then solved numerically to achieve significant gains in computational efficiency. This proposed approach is illustrated using three examples with 2-, 100-, and 1623-degree-of-freedom systems. These examples show

• computational efficiency gains up to three orders of magnitude
• with comparable accuracy

compared to Matlab's ode45, clearly demonstrating the efficacy of the proposed method. This computational gain is due to the use of an efficient dynamic response algorithm and can be multiplied by the gain achieved due to use of an intelligent sampling algorithm alone.

Chapter 4
Multilevel Estimation of Marginal Likelihood for Bayesian Model Selection

truth – which is much too complicated to allow anything but approximations.
J. von Neumann, "The Mathematician", in The Works of the Mind

4.1 Introduction

The main computational effort in Bayesian model selection arises in the estimation of the evidence or the marginal likelihood for each model since it requires a large number of forward model simulations. A number of methods have been proposed in the past few decades to more intelligently select samples to estimate the marginal likelihood or evidence. For example, the posterior harmonic mean estimator [107] samples from the posterior distributions of the parameters. Kass and Raftery [12] applied importance sampling to improve the efficiency of standard Monte Carlo sampling. Annealed importance sampling [108] and the power posterior method [109] can also be used to estimate the evidence.
Neal's annealed importance sampling [108] and the power posterior method of Friel and Pettitt [109] can also be used to estimate the evidence. Ching et al. [94] introduced the Transitional Markov Chain Monte Carlo method, where samples are drawn from intermediate distributions that ultimately converge to a target distribution. Skilling [86] proposed another method, known as nested sampling, which samples more from the high-likelihood region.

In this chapter, a multilevel approach is proposed to intelligently sample from the high-likelihood region to efficiently estimate the marginal likelihood or evidence. In this proposed approach, a probability integral transformation is first used to convert the multidimensional integration of the marginal likelihood to a one-dimensional integration. The resulting 1D integral is evaluated using a quadrature rule. The quadrature points are calculated using a proposed multilevel approach in which samples with increasing levels of likelihood are generated sequentially. Three algorithms to efficiently generate these samples, using importance sampling, stratified sampling, and Markov chain Monte Carlo sampling, respectively, are proposed in this chapter. In the first algorithm, samples for the current likelihood level are generated from an importance distribution formed using the samples from previous levels. In the second algorithm, a subset of strata that contain samples from the previous level is used to generate samples for the current likelihood level. In the third algorithm, Markov chains are run starting from previous-level samples to generate samples for the current likelihood level. The proposed algorithms provide flexibility in choosing the likelihood levels, which can be used to focus on regions with high likelihood values, providing better computational efficiency than methods like nested sampling [110].

The proposed algorithms are illustrated using three examples.
First, an elementary problem with a Gaussian likelihood and a Gaussian prior is used, for which the true value of the evidence or marginal likelihood is known. In the second example, flow past a cylinder inside a pipe is used. The inlet velocity profile is assumed to be parabolic, with the maximum inlet velocity distributed as a truncated Gaussian. Velocities are measured at several downstream points. The marginal likelihood results from the proposed algorithms are compared with those from nested sampling. In the third example, an 11-story base-isolated building is used where the uncertainty is assumed to be in the nonlinear hysteretic isolation layer. Using the roof acceleration measurements, the evidence or marginal likelihood is estimated for two nonlinear model classes, namely, Bouc-Wen and bilinear hysteresis models, and for a linear approximation. Nested sampling is again used for comparison.

4.2 Review of sampling methods

4.2.1 Importance sampling

Importance sampling is used to estimate an expectation $\mu_f = E_p[f]$ when sampling from the density $p(x)$ is difficult. Importance sampling instead draws $N$ samples $\{x_i\}_{i=1}^{N}$ from a similar density function $q(x)$, called the importance density, and gives the unbiased estimate

$$\tilde{\mu}_f = \frac{1}{N}\sum_{i=1}^{N} f(x_i)\,w_i \tag{4.1}$$

where the importance weights $w_i = p(x_i)/q(x_i)$ are used to correct the bias introduced by sampling from $q(x)$. The variance of the estimator is given by

$$\mathrm{Var}_q[\tilde{\mu}_f] = \frac{1}{N}\left\{ E_p\!\left[ f^2(x)\,\frac{p(x)}{q(x)} \right] - \mu_f^2 \right\} \tag{4.2}$$

The reduction in variance obtained compared to a standard Monte Carlo estimator $\mu_f^{(MC)} = \frac{1}{N}\sum_{i=1}^{N} f(x_i)$ with $x_i \sim p(x)$ is given by

$$\mathrm{Var}_p\!\left[\mu_f^{(MC)}\right] - \mathrm{Var}_q[\tilde{\mu}_f] = \frac{1}{N}\,E_p\!\left[ f(x)^2 \left(1 - \frac{p(x)}{q(x)}\right) \right] \tag{4.3}$$

Hence, the use of importance sampling can produce variance reduction by choosing the importance density proportional to $|f(x)|\,p(x)$. However, $p(x)$ or $q(x)$ is often known only up to a normalizing constant.
In that case, normalized importance sampling (NIS) is used, which estimates the expectation as

$$\tilde{\mu}_f = \sum_{i=1}^{N} f(x_i)\,\tilde{w}_i \tag{4.4}$$

where the normalized weights $\tilde{w}_i$ are given by

$$\tilde{w}_i = \frac{w_i}{\sum_{j=1}^{N} w_j} = \frac{p(x_i)/q(x_i)}{\sum_{j=1}^{N} p(x_j)/q(x_j)} \tag{4.5}$$

This normalized importance sampling estimator is biased but consistent (i.e., asymptotically unbiased). Variance reduction can also be achieved, provided that $q(x) > 0$ whenever $p(x) > 0$. As a measure of the effectiveness of the importance density $q(x)$, the effective sample size (ESS) is used, given by

$$\mathrm{ESS} = \frac{\left(\sum_{i=1}^{N} w_i\right)^2}{\sum_{i=1}^{N} w_i^2} \tag{4.6}$$

4.2.2 Stratified sampling

In the second method for efficient estimation of marginal likelihood or evidence, a multilevel stratified sampling is used. Stratified sampling divides the sample space $\Omega$ into $n_{st}$ disjoint subspaces $\{\Omega_i\}_{i=1}^{n_{st}}$, where $\cup_{i=1}^{n_{st}} \Omega_i = \Omega$ and $\Omega_i \cap \Omega_j = \emptyset$ for $i \neq j$. The mean of the quantity $f$ is then estimated within each of these strata, denoted as $\tilde{\mu}_f^{(i)}$ for $i = 1, \ldots, n_{st}$. The strata means are then combined using the law of total probability to give

$$E_p[f] \approx \tilde{\mu}_f = \sum_{i=1}^{n_{st}} \tilde{\mu}_f^{(i)}\,p_i \tag{4.7}$$

where $p_i = P(\Omega_i)$. The variance reduction compared to the standard Monte Carlo method is given by McKay et al. [111] as

$$\mathrm{Var}_p\!\left[\mu_f^{(MC)}\right] - \mathrm{Var}_{p|\Omega}[\tilde{\mu}_f] = \frac{1}{N}\sum_{i=1}^{n_{st}} p_i \left(\tilde{\mu}_f^{(i)} - \tilde{\mu}_f\right)^2 \tag{4.8}$$

where the strata means $\tilde{\mu}_f^{(i)}$, $i = 1, \ldots, n_{st}$, are calculated using $N_i$ samples with $N_i = p_i N$.

4.2.3 Markov chain Monte Carlo (MCMC)

Markov chain Monte Carlo is used to sample from a distribution that is otherwise difficult to sample. For this purpose, a Markov chain is constructed to explore the sample space $\Omega$, with its stationary distribution $\pi(\cdot)$ being the one from which samples are sought [112].
The invariant distribution of the chain satisfies

$$\pi(y)\,dy = \int_{\Omega} K(x, dy)\,\pi(x)\,dx \tag{4.9}$$

where the transition kernel of the Markov chain is defined as [113, 114]

$$K(x, dy) = f(x,y)\,dy + \left[1 - \int_{\Omega} f(x,y)\,dy\right]\delta_x(dy) \tag{4.10}$$

for some transition function $f(x,y)$ with $f(x,x) = 0$, and $\delta_x(dy) = 1$ for $x \in dy$ and 0 otherwise. The probability of the chain staying at $x$ is $\left[1 - \int_{\Omega} f(x,y)\,dy\right]$. A popular algorithm for generating samples using MCMC is the Metropolis-Hastings (MH) algorithm, which assumes the transition from $x$ to $y$ for $x \neq y$ is of the form [114]

$$f_{MH}(x,y) = q(x,y)\,\alpha(x,y) \tag{4.11}$$

where $q(x,y)$ is an assumed proposal density and $\alpha(x,y)$ is the acceptance probability defined by

$$\alpha(x,y) = \begin{cases} \min\!\left[\dfrac{\pi(y)\,q(y,x)}{\pi(x)\,q(x,y)},\ 1\right], & \text{for } \pi(x)\,q(x,y) > 0 \\[2mm] 1, & \text{for } \pi(x)\,q(x,y) = 0 \end{cases} \tag{4.12}$$

However, with increasing dimension, the MH algorithm becomes inefficient. Au and Beck [115] proposed a modified algorithm that achieves a higher acceptance of generated candidate samples by using the MH algorithm componentwise.

4.3 Proposed Methodology

4.3.1 Use of probability integral transform

The evidence integral (3.3), omitting the model variable $M_k$, is rewritten in (4.14) by:
(i) expressing the likelihood as
$$L(\theta) = \int_0^{L(\theta)} d\lambda; \tag{4.13}$$
(ii) rearranging the order of integration;
(iii) defining the monotonically decreasing function $\chi(\lambda)$, the probability mass enclosed in the parameter space subset where likelihoods $L(\theta)$ exceed $\lambda$, and its inverse $\varphi(\chi)$ (i.e., $\varphi(\chi(\lambda)) \equiv \lambda$);
(iv) changing the variable of integration; and
(v) approximating the integral by discretizing over $\chi$.
Figure 4.1: Multilevel estimation of marginal likelihood.

$$E = \int_{\Theta} p(D|\theta)\,p(\theta)\,d\theta = \int_{\Theta} L(\theta)\,p(\theta)\,d\theta = \int_{\Theta} \underbrace{\left[\int_0^{L(\theta)} d\lambda\right]}_{L(\theta)} p(\theta)\,d\theta = \int_0^{\infty} \underbrace{\left[\int_{L(\theta)>\lambda} p(\theta)\,d\theta\right]}_{\chi(\lambda)} d\lambda = \int_0^{\infty} \chi(\lambda)\,d\lambda = \int_0^1 \varphi(\chi)\,d\chi \approx \sum_{i=1}^{i_{max}} \varphi_i\,\Delta\chi_i \tag{4.14}$$

Once the multidimensional integral is converted into a one-dimensional one, the problem becomes the estimation of $\chi_i$ for a corresponding $\varphi_i$. Different variance reduction methods can be implemented to estimate these quantities efficiently for the quadrature. Figure 4.1 shows that the samples from successive likelihood levels are converted to $(\varphi, \chi)$ coordinates to estimate the integral using a quadrature rule. In this chapter, three algorithms are proposed to evaluate the transformed one-dimensional integral; they are described in detail in the following subsections. Note that Section 3.2.2 and De et al. [87] show that the above transform is also the backbone of the nested sampling method [86].
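As a rough illustration of the transform in (4.14), the following Python sketch evaluates the evidence of a toy problem in both forms: by integrating $\chi(\lambda)$ over $\lambda$, and by a rectangular rule over $\chi$ using the sorted likelihoods of prior samples. The toy problem (a standard normal prior with likelihood $L(\theta) = \exp(-\theta^2/2)$), the function names, and the sample sizes are assumptions made here for illustration and are not taken from the examples in this chapter. For this choice, $\chi(\lambda) = \mathrm{erf}(\sqrt{-\ln\lambda})$ and the exact evidence is $1/\sqrt{2}$.

```python
import math
import random


def chi_exact(lam):
    """Prior mass with L(theta) > lam, for the assumed toy problem:
    prior N(0, 1) and likelihood L(theta) = exp(-theta**2 / 2)."""
    if lam >= 1.0:
        return 0.0
    if lam <= 0.0:
        return 1.0
    return math.erf(math.sqrt(-math.log(lam)))


def evidence_lambda_quadrature(n_grid=20000):
    """Midpoint rule for E = int_0^1 chi(lambda) d(lambda).
    The toy likelihood is bounded by 1, so chi vanishes beyond lambda = 1."""
    h = 1.0 / n_grid
    return sum(chi_exact((k + 0.5) * h) for k in range(n_grid)) * h


def evidence_chi_quadrature(n_samples=20000, seed=1):
    """Rectangular rule for E = int_0^1 phi(chi) d(chi), with phi taken
    from the sorted likelihoods of samples drawn from the prior."""
    rng = random.Random(seed)
    likes = sorted(
        (math.exp(-0.5 * rng.gauss(0.0, 1.0) ** 2) for _ in range(n_samples)),
        reverse=True,
    )
    d_chi = 1.0 / n_samples  # each sorted sample carries equal prior mass
    return sum(phi * d_chi for phi in likes)


EXACT = 1.0 / math.sqrt(2.0)  # closed-form evidence of the toy problem

print(evidence_lambda_quadrature(), evidence_chi_quadrature(), EXACT)
```

Both estimates should be close to $1/\sqrt{2}$. The second one is simply the standard Monte Carlo mean written in $(\varphi, \chi)$ form, which is the baseline that the multilevel algorithms aim to improve upon by concentrating samples at high likelihood levels.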
4.3.2 Multilevel-Importance sampling (ML-IS)

The first method presented here employs importance sampling for the estimation of $\chi_i$, which can be written as

$$\chi_i = P\!\left(\theta \in \tilde{\Theta}_i\right) = E\!\left[I_{\tilde{\Theta}_i}(\theta)\right] \tag{4.15}$$

where $\tilde{\Theta}_i = \{\theta \,|\, L(\theta) > \lambda_i\}$ for $i \in \mathbb{Z}^+$ and the indicator function $I_{(\cdot)}(\cdot)$ is defined by

$$I_{\tilde{\Theta}_i}(\theta) = \begin{cases} 1 & \text{if } \theta \in \tilde{\Theta}_i \\ 0 & \text{otherwise} \end{cases} \tag{4.16}$$

Hence, the quantity $\chi_i$ in (4.15) can be approximated with the estimator

$$\hat{\chi}_i^{IS} = \frac{1}{N}\sum_{j=1}^{N} I_{\tilde{\Theta}_i}(\theta_j)\,w(\theta_j) \tag{4.17}$$

where $w(\theta_j) = p(\theta_j)/q(\theta_j)$; $q(\cdot)$ is the importance sampling density with $q(\theta) > 0$ whenever $p(\theta) > 0$.

Figure 4.2: Multilevel-importance sampling (ML-IS) method: the iso-likelihood contours are shown with $\lambda_1 < \lambda_2 < \cdots < \lambda_n$; importance densities are formed successively to generate samples from the high-likelihood region.
The importance density $q(\theta)$ can be chosen as a normal distribution with its mean at the posterior mode $\hat{\theta}$ guessed from the previous set of samples and an arbitrarily large variance $\hat{\Sigma}$ [91]. A threshold $\gamma$ for the effective sample size is chosen beforehand. The choice of $\lambda_i$ is, for practicality, made after the simulation of $N$ samples. The main challenge in applying this algorithm lies in choosing the appropriate form for the importance densities. An algorithm for multilevel estimation of marginal likelihood or evidence using importance sampling is presented in Algorithm 2 and a schematic diagram is shown in Figure 4.2.

Algorithm 2: Multilevel marginal likelihood estimation using importance sampling (ML-IS).
1 Initialization: Set $\hat{\chi}_0^{IS} = 1$, $E_0 = 0$, and $q_0(\cdot) = p(\cdot)$; choose $\gamma$
2 Set $i = 0$
3 while Stopping Criterion = FALSE do
4   Assume a proper importance density $q_i(\cdot)$ using the samples from the last iteration (see footnote 1)
5   Draw samples $\theta_j \sim q_i(\theta)$ for $j = 1, \ldots, N$
6   Calculate likelihood values $L(\theta_j)$ for these samples
7   Evaluate the importance weights $w(\theta_j) = p(\theta_j)/q_i(\theta_j)$
8   if $\mathrm{ESS} = \left(\sum_{j=1}^{N} w_j\right)^2 / \sum_{j=1}^{N} w_j^2 > \gamma$ then
9     Estimate a suitable $\lambda_i$
10    $\hat{\chi}_i^{IS} \approx \frac{1}{N}\sum_{j=1}^{N} I_{\tilde{\Theta}_i}(\theta_j)\,w(\theta_j)$
11  else
12    Draw more samples from $q_i(\theta)$
13    Goto line 6
14  end
15  Update the marginal likelihood $E_i = E_{i-1} + \lambda_i\left(\hat{\chi}_{i-1}^{IS} - \hat{\chi}_i^{IS}\right)$ (using a rectangular rule for integration)
16  $i \leftarrow i + 1$
17 end
Result: The marginal likelihood $E_{last}$

4.3.3 Multilevel-Stratified sampling (ML-SS)

The second algorithm proposed here implements stratified sampling in the multilevel approach. Stratified sampling can also be used to estimate $P(\theta \in \tilde{\Theta}_i)$ by focusing on the strata with $L(\theta) > \lambda_i$. Essentially, the stratified sampling strategy can be used to estimate $\hat{\chi}_i$ using $N_j$ samples [111]

$$\hat{\chi}_i^{SS} = \sum_{j \in \mathcal{I}} w_j \sum_{l=1}^{N_j} I_{\tilde{\Theta}_i}\!\left(\theta_l^{(j)}\right) \tag{4.18}$$

where $w_j = p_j/N_j$; $p_j = P(\theta \in \Omega_j)$; and the index set $\mathcal{I}$ consists of the indices of the strata that have $L(\theta) > \lambda_{i-1}$ from the previous step.
The mean of this estimate is $E\!\left[\hat{\chi}_i^{SS}\right] = \chi_i$ and the variance is given by

$$\mathrm{Var}\!\left[\hat{\chi}_i^{SS}\right] = \frac{\sigma^2}{N} - \frac{1}{N}\sum_{j \in \mathcal{I}} p_j \left(\hat{\chi}_{i,j}^{SS} - \hat{\chi}_i^{SS}\right)^2 \tag{4.19}$$

where $\hat{\chi}_{i,j}^{SS} = \frac{1}{N_j}\sum_{l=1}^{N_j} I_{\tilde{\Theta}_i}(\theta_l^{(j)})$. An optimal number $N_j^*$ of samples for each stratum can be estimated as

$$N_j^* = \frac{w_j\,\sigma_j}{\sum_{i \in \mathcal{I}} w_i\,\sigma_i}\,N \tag{4.20}$$

where $\sigma_j$ is estimated from samples drawn at the previous step. However, implementing stratified sampling becomes difficult with increasing dimension of the parameter space. An algorithm for multilevel estimation of marginal likelihood or evidence using stratified sampling is presented in Algorithm 3 and a schematic diagram is shown in Figure 4.3.

Algorithm 3: Multilevel marginal likelihood estimation using stratified sampling (ML-SS).
1 Initialization: Set $\chi_0 = 1$, $E_0 = 0$, and $q_0(\cdot) = p(\cdot)$
2 Divide the sample space into $n_{st}$ strata $\{\Omega_k\}_{k=1}^{n_{st}}$
3 Set $i = 0$ and $\mathcal{I}^{(0)} = \{1, \ldots, n_{st}\}$
4 while Stopping Criterion = FALSE do
5   for $j \in \mathcal{I}^{(i)}$ do
6     Draw samples $\theta_l^{(j)}$ from the stratum $\Omega_j$ for $l = 1, \ldots, N_j^{(i)}$
7     Calculate likelihood values $L(\theta_l^{(j)})$ for these samples
8     Assign weights $w_j^{(i)} = p_j/N_j$ where $p_j = P(\theta \in \Omega_j)$ and $N_j = \sum_{k=0}^{i} N_j^{(k)}$
9   end
10  Estimate a suitable $\lambda_i$
11  Estimate $\hat{\chi}_i^{SS} = \sum_{j \in \mathcal{I}^{(i)}} w_j \sum_{l=1}^{N_j} I_{\tilde{\Theta}_i}(\theta_l^{(j)})$
12  Update the marginal likelihood $E_i = E_{i-1} + \lambda_i\left(\hat{\chi}_{i-1}^{SS} - \hat{\chi}_i^{SS}\right)$ (using a rectangular rule for integration)
13  Include the index $j$ of the stratum $\Omega_j$ in $\mathcal{I}^{(i+1)}$ if any of the $L(\theta_l^{(j)}) > \lambda_i$; then set $i \leftarrow i + 1$
14 end
Result: The marginal likelihood $E_{last}$

Figure 4.3: Multilevel-stratified sampling (ML-SS) method: the iso-likelihood contours are shown with $\lambda_1 < \lambda_2 < \cdots < \lambda_n$; more samples are generated from the strata with high likelihood values.

Footnote 1: In this chapter, a Gaussian distribution is assumed for $q_i(\cdot)$ with its mean equal to the mean of the samples with $L(\theta_j) > \lambda_i$ and a suitably large standard deviation.

4.3.4 Multilevel-particle approximation (ML-PA)

A particle approximation is combined with the Markov chain Monte Carlo method in the following proposed algorithm. As the likelihood levels $\lambda_i$ increase with iteration, $\tilde{\Theta}_k \subset \tilde{\Theta}_{k-1}$, where the $\tilde{\Theta}_k$ are as defined in Section 4.3.2. Hence, the Feynman-Kac representation [116, 117] can be used to define

$$\chi_i = P\!\left(\theta \in \tilde{\Theta}_i\right) = \prod_{k=1}^{i} P\!\left(\theta \in \tilde{\Theta}_k \,\middle|\, \theta \in \tilde{\Theta}_{k-1}\right) = \prod_{k=1}^{i} \frac{E\!\left[I_{\tilde{\Theta}_k}\!\left(\theta^{(k-1)}\right) \prod_{l=1}^{k} I_{\tilde{\Theta}_l}\!\left(\theta^{(l-1)}\right)\right]}{E\!\left[\prod_{l=1}^{k} I_{\tilde{\Theta}_l}\!\left(\theta^{(l-1)}\right)\right]} \tag{4.21}$$

where $\{\theta^{(l)}\}_{l>0}$ are obtained from a Markov chain with $\theta^{(0)} \sim p(\theta)$ and a transition kernel that can be assumed to be of the form $K(x, dy)$ given in (4.10). Note that, for a high-dimensional parameter space, a Markov chain with a higher acceptance rate is used herein, e.g., the modified Metropolis-Hastings algorithm (MMH) [115] (see Appendix A). An algorithm for multilevel estimation of marginal likelihood or evidence using Markov chains is presented in Algorithm 4 and a schematic diagram is shown in Figure 4.4.

Algorithm 4: Multilevel marginal likelihood estimation using particle approximation (ML-PA).
1 Initialization: Set $\chi_0 = 1$, $E_0 = 0$, and $p^{(0)}(\theta) = p(\theta)$
2 Set $i = 1$
3 while Stopping Criterion = FALSE do
4   Define $p^{(i)}(\theta) = p(\theta \,|\, \theta \in \tilde{\Theta}_i)$
5   Draw $N$ samples $\{\theta_l\}_{l=1}^{N}$ from the distribution $p^{(i)}(\theta)$ starting from $\{\theta^{(i-1)}\}$
6   Evaluate likelihood values $L(\theta_l)$ for these samples
7   Decide on a suitable $\lambda_i$
8   Get the samples $\{\theta^{(i)}\}$ satisfying $L(\theta_l) > \lambda_i$
9   Estimate $\hat{\chi}_i^{PA} = \frac{\hat{\chi}_{i-1}^{PA}}{N}\sum_{l=1}^{N} I_{\tilde{\Theta}_i}(\theta_l)$
10  Update the marginal likelihood $E_i = E_{i-1} + \lambda_i\left(\hat{\chi}_{i-1}^{PA} - \hat{\chi}_i^{PA}\right)$ (using a rectangular rule for integration)
11  Using the indices of the strata containing samples with $L(\theta) > \lambda_i$ construct the set $\mathcal{I}^{(i)}$
12  $i \leftarrow i + 1$
13 end
Result: The marginal likelihood $E_{last}$

Figure 4.4: Multilevel-particle approximation (ML-PA) method: the iso-likelihood contours are shown with $\lambda_1 < \lambda_2 < \cdots < \lambda_n$; Markov chains are run to generate samples from the high-likelihood region.

4.4 Discussion of the Proposed Approach

4.4.1 Estimation of posterior moments

The posterior statistics of the model parameters are often sought from a Bayesian analysis.
With the approach proposed in this chapter, the outputs from the above algorithms can also be used to evaluate the posterior moments of the model parameters $\theta$, using the rejected samples and the change in evidence value at each step, without any significant computational cost. For example, the mean and variance of $\theta$ can be evaluated using

$$E[\theta] = \frac{1}{E_{last}}\sum_{i=1}^{i_{max}} \Delta E_i\,\theta_i, \qquad \mathrm{Var}[\theta] = \frac{1}{E_{last}}\sum_{i=1}^{i_{max}} \Delta E_i\,\theta_i^2 - \left(E[\theta]\right)^2 \tag{4.22}$$

where $\Delta E_i = \hat{\varphi}_i\,(\chi_{i-1} - \chi_i)$ and $\theta_i$ is the sample corresponding to the current likelihood level $\lambda_i$.

4.4.2 Stopping criteria

Different stopping criteria, based on accuracy and/or computational cost, can be used in these algorithms, namely,
(i) the change in evidence is less than some threshold $\Delta E_{tol}$, often taken to be 1% or 0.1%;
(ii) the total number of iterations reaches a pre-chosen threshold $N_{it}^{max}$;
(iii) $\chi$ is less than some pre-specified tolerance $\tau$;
(iv) the likelihood level $\lambda_i$ of the current iteration is within some fraction of the theoretical maximum of the likelihood function.
A combination of all four of these criteria is implemented here.

4.4.3 Accuracy

The evidence in (4.14) can be written as [93]

$$E = \int_0^1 \varphi(\chi)\,d\chi = \sum_{i=0}^{i_{max}} \varphi_i\,\Delta\chi_i + \underbrace{\int_0^{\epsilon} \varphi(\chi)\,d\chi}_{\varepsilon_t} + \underbrace{\left[\int_{\epsilon}^{1} \varphi(\chi)\,d\chi - \sum_{i=0}^{i_{max}} \varphi_i\,\Delta\chi(\varphi_i)\right]}_{\varepsilon_n} + \underbrace{\sum_{i=0}^{i_{max}} \varphi_i\left[\Delta\chi(\varphi_i) - \Delta\chi_i\right]}_{\varepsilon_s} = \sum_{i=0}^{i_{max}} \varphi_i\,\Delta\chi_i + \varepsilon_t + \varepsilon_n + \varepsilon_s$$

where $\varepsilon_t$ is the truncation error, $\varepsilon_n$ is the numerical integration error, and $\varepsilon_s$ is the stochastic error. The truncation error $\varepsilon_t$ arises because the algorithms are stopped when $\chi = \epsilon$ rather than when $\chi$ actually becomes zero. However, it is very small if the algorithms are run long enough. The numerical integration error $\varepsilon_n$ in the proposed algorithms for a rectangular rule is $O(N^{-1})$, where $N$ is the number of samples, if $d\varphi/d\chi$ is bounded in $[\epsilon, 1]$. Finally, the stochastic error $\varepsilon_s$ is asymptotically unbiased with a convergence rate of $O(N^{-1/2})$ for the methods used here, as shown below.
Hence, the convergence of $\varepsilon_s$ dictates the convergence of the algorithms here.

Proposition 1. In each stochastic error component, $\Delta\chi_i$ converges to $\Delta\chi(\varphi_i)$ and its coefficient of variation (COV) is $O(N^{-1/2})$ for a large number of samples $N$.

Proof. For a large number $N$ of samples, using the strong law of large numbers, $\Delta\chi_i$ converges to $\Delta\chi(\varphi_i)$ almost surely. Hence, $\varepsilon_{s,i} = \varphi_i\left[\Delta\chi(\varphi_i) - \Delta\chi_i\right]$ is zero-mean for large $N$. Further, denote $I_{j,k}^{i} = I_{\tilde{\Theta}_j}(\theta^{(i)})\left[1 - I_{\tilde{\Theta}_k}(\theta^{(i)})\right]$, i.e., an indicator that sample $\theta^{(i)}$ is in $\tilde{\Theta}_j$ but outside $\tilde{\Theta}_k$, where $I$ is the indicator function. Hence,

$$\Delta\chi_i = \frac{1}{N}\sum_{l=1}^{N} I_{i-1,i}^{l} \tag{4.23}$$

and

$$E\!\left[\left(\Delta\chi_i - \Delta\chi(\varphi_i)\right)^2\right] = E\!\left[\left(\frac{1}{N}\sum_{l=1}^{N}\left[I_{i-1,i}^{l} - \Delta\chi(\varphi_i)\right]\right)^2\right] = \frac{1}{N^2}\sum_{k=1}^{N}\sum_{l=1}^{N} E\!\left\{\left[I_{i-1,i}^{k} - \Delta\chi(\varphi_i)\right]\left[I_{i-1,i}^{l} - \Delta\chi(\varphi_i)\right]\right\} \tag{4.24}$$

Note that the $I_{i-1,i}^{k}$ are Bernoulli random variables with success probability $\Delta\chi(\varphi_i)$. Also, for simplicity, assume the samples are uncorrelated, i.e.,

$$E\!\left\{\left[I_{i-1,i}^{k} - \Delta\chi(\varphi_i)\right]\left[I_{i-1,i}^{l} - \Delta\chi(\varphi_i)\right]\right\} = 0 \ \text{ for } k \neq l, \qquad E\!\left\{\left[I_{i-1,i}^{k} - \Delta\chi(\varphi_i)\right]^2\right\} = \mathrm{Var}(I_{i-1,i}^{k}) = \Delta\chi(\varphi_i)\left[1 - \Delta\chi(\varphi_i)\right] \tag{4.25}$$

Using the above simplification in (4.24),

$$\mathrm{Var}(\Delta\chi_i) = \frac{\Delta\chi(\varphi_i)\left[1 - \Delta\chi(\varphi_i)\right]}{N}, \qquad \mathrm{COV}(\Delta\chi_i) = \sqrt{\frac{1 - \Delta\chi(\varphi_i)}{N\,\Delta\chi(\varphi_i)}} \tag{4.26}$$

4.5 Numerical Illustrations

4.5.1 Example I: Conceptual example

Consider a likelihood function for a dataset $D = \{x_i\}_{i=1}^{n}$ defined by

$$p(D \,|\, \mu) = \frac{1}{(2\pi)^{n/2}\sigma^n}\exp\!\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2\right) \tag{4.27}$$

The prior for the parameter $\mu$ is assumed to be normally distributed with mean and variance given by $\mu_0$ and $\sigma_0^2$, respectively, i.e.,

$$p(\mu \,|\, \mu_0, \sigma_0^2) = \frac{1}{\sqrt{2\pi\sigma_0^2}}\exp\!\left(-\frac{(\mu - \mu_0)^2}{2\sigma_0^2}\right) \tag{4.28}$$

The evidence or marginal likelihood in this example can be calculated analytically as

$$p(D) = \frac{\sigma}{\left(\sqrt{2\pi\sigma^2}\right)^{n}\sqrt{n\sigma_0^2 + \sigma^2}}\exp\!\left(-\frac{\sum_{i=1}^{n} x_i^2}{2\sigma^2} - \frac{\mu_0^2}{2\sigma_0^2}\right) \times \exp\!\left(\frac{2n\mu_0\bar{x} + \dfrac{\sigma_0^2 n^2 \bar{x}^2}{\sigma^2} + \dfrac{\sigma^2\mu_0^2}{\sigma_0^2}}{2\left(n\sigma_0^2 + \sigma^2\right)}\right) \tag{4.29}$$

With $n = 100$ measurements generated from a normal distribution with $\mu = 1.5$ and $\sigma = 0.5$, the three proposed algorithms are implemented using $\mu_0 = 1$ and $\sigma_0 = 0.25$. ML-IS is implemented with a Gaussian importance density formed at each iteration using the mean of the remaining samples and a standard deviation of 0.125, with an initial sample size of 1000. The ML-SS algorithm is used with $\Omega_i = F_\mu^{-1}(x_i)$ where $x_i \in \left[\frac{i-1}{n_{st}}, \frac{i}{n_{st}}\right]$, $n_{st} = 5$, and $F_\mu$ is the prior cumulative distribution function of the parameter $\mu$. The ML-PA algorithm is implemented with an initial sample size of 1000 and, at each iteration, the 25 samples with the lowest likelihoods are rejected and 25 new samples with higher likelihoods are added to the sample pool. The stopping criteria discussed in Section 4.4.2 are used with evidence change threshold $\Delta E_{tol} = 0.01\%$, tolerance $\tau = 0.005$, maximum iteration count $N_{it}^{max} = 100$, and the number of likelihood evaluations limited to 20,000.

Table 4.1 shows a comparison of the marginal likelihood or evidence values obtained using the three proposed algorithms and the exact value computed using (4.29). The table shows that all three algorithms give an error of ≤ 0.4% with a very small coefficient of variation. Figure 4.5 shows how the errors are reduced with increasing computational effort using the algorithms. The figure also indicates that the ML-SS method performs better than the others. However, the ML-SS algorithm suffers from the curse of dimensionality as the dimension of the parameter space increases. The figure also shows that the COV of ML-PA is much higher than that of the other algorithms, because the samples from Markov chain Monte Carlo methods are generally correlated.
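The closed-form evidence (4.29) can be checked numerically. The Python sketch below evaluates (4.29) in log form and compares it with a brute-force trapezoidal integration of likelihood times prior over $\mu$. The function names, the random seed, and the integration grid are assumptions made for illustration, and the synthetic data are not the dataset behind Table 4.1, so the printed value will differ from the one reported there.

```python
import math
import random


def log_evidence_exact(x, sigma, mu0, sigma0):
    """Closed-form log p(D) following (4.29): Gaussian likelihood with
    known sigma and Gaussian prior N(mu0, sigma0**2) on the mean."""
    n = len(x)
    xbar = sum(x) / n
    s = n * sigma0**2 + sigma**2
    log_pref = (math.log(sigma)
                - 0.5 * n * math.log(2 * math.pi * sigma**2)
                - 0.5 * math.log(s))
    term1 = -sum(xi**2 for xi in x) / (2 * sigma**2) - mu0**2 / (2 * sigma0**2)
    term2 = (2 * n * mu0 * xbar
             + sigma0**2 * n**2 * xbar**2 / sigma**2
             + sigma**2 * mu0**2 / sigma0**2) / (2 * s)
    return log_pref + term1 + term2


def log_evidence_quadrature(x, sigma, mu0, sigma0, half_width=8.0, n_grid=4001):
    """Brute-force check: trapezoidal rule over mu, accumulated in log space."""
    n = len(x)
    lo = mu0 - half_width * sigma0
    hi = mu0 + half_width * sigma0
    h = (hi - lo) / (n_grid - 1)
    logs = []
    for k in range(n_grid):
        mu = lo + k * h
        log_like = (-0.5 * n * math.log(2 * math.pi * sigma**2)
                    - sum((xi - mu) ** 2 for xi in x) / (2 * sigma**2))
        log_prior = (-0.5 * math.log(2 * math.pi * sigma0**2)
                     - (mu - mu0) ** 2 / (2 * sigma0**2))
        weight = 0.5 if k in (0, n_grid - 1) else 1.0  # trapezoid end weights
        logs.append(log_like + log_prior + math.log(weight))
    m = max(logs)  # log-sum-exp to avoid underflow
    return m + math.log(sum(math.exp(v - m) for v in logs)) + math.log(h)


# Synthetic data with the settings quoted in the text (seed is arbitrary).
rng = random.Random(0)
data = [rng.gauss(1.5, 0.5) for _ in range(100)]
exact = log_evidence_exact(data, sigma=0.5, mu0=1.0, sigma0=0.25)
check = log_evidence_quadrature(data, sigma=0.5, mu0=1.0, sigma0=0.25)
print(exact, check)
```

Working in log space matters here because the evidence itself is of the order of $e^{-70}$ for this problem size; the two printed values should agree to many digits.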
[Figure 4.5 panels: (a) ML-IS, (b) ML-SS, (c) ML-PA; each panel shows the mean, 1 standard deviation, and 95% confidence interval against the number of sample evaluations.]

Figure 4.5: The error in the estimation of log evidence is reduced with an increasing number of likelihood function evaluations for the three proposed algorithms.

Table 4.1: Comparison of marginal likelihood or evidence values obtained using the three proposed algorithms with the exact value. The coefficients of variation (COV) are obtained from 10 independent simulation runs.

Method | log(Evidence) | Error (%) | COV (%)
Exact | −69.8354 | - | -
ML-IS | −70.0468 | 0.3028 | 0.0812
ML-SS | −69.8219 | 0.0193 | 0.0464
ML-PA | −69.7071 | 0.1836 | 0.1873

4.5.2 Example II: Flow past a cylinder

In this example, fluid flow past a cylinder inside a pipe is considered where the inflow velocities are assumed to be uncertain. The problem is implemented in the FEniCS software package [118], adapted from one of its examples. Figure 4.6 shows the cross-section of the pipe and the cylinder in it. The dimensions are as shown in the figure.

[Figure 4.6 labels: inlet, outlet; dimensions 0.2 m, 0.21 m, 1.2 m, 0.15 m, 0.1 m.]

Figure 4.6: Schematic of pipe and cylinder.
The Navier-Stokes equation for an incompressible fluid along with the mass conservation equation are

$$\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u} = -\frac{1}{\rho}\nabla p + \nu\nabla^2\mathbf{u}, \qquad \nabla\cdot\mathbf{u} = 0 \tag{4.30}$$

where $\mathbf{u}(\mathbf{x};t) = [u_1(\mathbf{x};t), u_2(\mathbf{x};t)]^T$ is the velocity vector at position $(x_1, x_2)$; $p$ is the pressure; $\rho$ is the fluid density; $\nu$ is the kinematic viscosity of the fluid; and the operator $\nabla = \left[\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}\right]^T$ is the gradient. Equations (4.30) are solved using the incremental pressure correction scheme in FEniCS.

Figure 4.7: Typical velocity and pressure distributions in Example II. (a) Velocity distribution. (b) Pressure distribution.

The inlet velocity profile is assumed parabolic

$$\mathbf{u}(0, x_2; t) = \left[\frac{4U_{max}\,x_2\,(0.41 - x_2)}{0.41^2},\ 0\right]^T \tag{4.31}$$

The velocity measurements are generated with $U_{max} = 1.2$ m/s, which results in a Reynolds number of 80. Ten points are randomly chosen downstream, where the horizontal velocities $u_1(\mathbf{x};t)$ are measured starting at 1 s after the start of the flow and stopping at 1.6 s with an interval of 0.05 s, resulting in a total of $n = 120$ measurements, to which 20% additive Gaussian noise is added. Herein, the multiplicative error model in the predicted velocity is used as in Cheung et al. [119], Oliver and Moser [120], and Edeling et al. [121, 122]. In this uncertainty model, the true velocity $u_1(\mathbf{x};t,\theta)$ is related to the model prediction by

$$u_1(\mathbf{x};t,\theta) = E_m(\mathbf{x})\,u_1^{pred}(\mathbf{x};t,\theta) \tag{4.32}$$

where $u_1^{pred}(\mathbf{x};t,\theta)$ is the velocity predicted by the model; the multiplicative modeling error $E_m(\mathbf{x})$ is used to define the modeling uncertainty; and $u_1(\mathbf{x};t,\theta)$ is the true velocity with $\theta = U_{max}$. The required covariances are then written as

$$k_{uu}(\mathbf{x}, \mathbf{x}'; t \,|\, \theta) = u_1(\mathbf{x};t,\theta)\,k_{E_m}(\mathbf{x}, \mathbf{x}'; t \,|\, \theta)\,u_1(\mathbf{x}';t,\theta) \tag{4.33}$$

The multiplicative error $E_m$ is assumed in this chapter to be a Gaussian process with unit mean and covariance function given by

$$k_{E_m}(\mathbf{x}, \mathbf{x}'; t \,|\, \theta) = \sigma^2\exp\!\left(-\sum_{i=1}^{2}\left(\frac{x_i - x_i'}{l_i}\right)^2\right) \tag{4.34}$$

where the hyperparameters are $[\sigma, l_1, l_2]^T$.
Next, the measurement error is introduced in the data as

$$\mathbf{d} = \mathbf{u}_1 + \mathbf{e} \tag{4.35}$$

where $\mathbf{u}_1$ consists of velocities obtained from a model; $\mathbf{e}$ is the measurement error distributed as $N(\mathbf{0}, \Sigma_e)$; and $\mathbf{d}$ is the measurement vector. The likelihood function is assumed to be Gaussian

$$p(D \,|\, \theta) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}\exp\!\left(-\frac{1}{2}\left[\mathbf{u}_1^{pred} - \mathbf{d}\right]^T \Sigma^{-1}\left[\mathbf{u}_1^{pred} - \mathbf{d}\right]\right) \tag{4.36}$$

where $\mathbf{u}_1^{pred}$ consists of the velocities predicted using parameter $\theta$, and $\Sigma = \Sigma_e + \Sigma_{E_m}$ with $\Sigma_{E_m}$ formed using the covariances in (4.33). The hyperparameters are assumed here as $\sigma = 0.2\sqrt{\mathrm{Var}(\mathbf{d})}$ and $l_1 = l_2 = 1$. The prior for $U_{max}$ is assumed to be a truncated Gaussian with mean 1.25 m/s and standard deviation 0.5 m/s, truncated at 1.0 and 1.5 m/s. The results are shown in Table 4.2, where they are compared with the log evidence obtained using nested sampling (NS). The relative difference in the marginal likelihood or evidence is calculated using

$$\text{Relative difference} = \frac{\log E_{ML} - \log E_{NS}}{\log E_{NS}} \tag{4.37}$$

The ML-IS algorithm is implemented with a truncated Gaussian importance density formed at each iteration using the mean of the current samples and a standard deviation of 0.25 m/s, truncated at 1.0 and 1.5 m/s. The ML-SS algorithm is implemented again with $\Omega_i = F_\mu^{-1}(x_i)$ where $x_i \in \left[\frac{i-1}{n_{st}}, \frac{i}{n_{st}}\right]$, $n_{st} = 5$, and $F_\mu$ is the prior cumulative distribution function of the parameter $U_{max}$. The ML-PA algorithm is implemented with an initial sample size of 100 and, at each iteration, the 10 samples with the lowest likelihoods are rejected and 10 new samples with higher likelihoods are added to the sample pool using the Metropolis-Hastings algorithm, with a truncated Gaussian proposal density having a standard deviation of 0.1 m/s and truncated at 1.0 and 1.5 m/s. Since the analytical value of the marginal likelihood is not available for this example, the accuracy of the proposed algorithms is compared to that of the established nested sampling algorithm, as shown in Table 4.2. The stopping criteria are the same as in Example I except that $\Delta E_{tol} = 1\%$.
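To make the likelihood construction in (4.33) to (4.36) concrete, the following Python sketch assembles $\Sigma = \Sigma_e + \Sigma_{E_m}$ and evaluates the Gaussian log-likelihood through a Cholesky factorization. It is a minimal sketch under stated assumptions rather than the implementation used for Table 4.2: the measurement noise is taken as independent with a single variance `noise_var`, only spatial correlation at one time instant is modeled, the predicted velocities stand in for $u_1$ in (4.33), and all function names and numerical values are hypothetical.

```python
import math


def kernel_em(x, xp, sigma, lengths):
    """Squared-exponential covariance of the multiplicative error, as in (4.34)."""
    s = sum(((a - b) / l) ** 2 for a, b, l in zip(x, xp, lengths))
    return sigma**2 * math.exp(-s)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def log_likelihood(points, u_pred, d, sigma, lengths, noise_var):
    """Gaussian log-likelihood in the form of (4.36), with
    Sigma[i][j] = u_i * k_Em(x_i, x_j) * u_j + noise_var * delta_ij
    (assumption: independent measurement noise of equal variance)."""
    n = len(d)
    cov = [[u_pred[i] * kernel_em(points[i], points[j], sigma, lengths) * u_pred[j]
            + (noise_var if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    # Forward substitution: low * y = r, so r^T Sigma^{-1} r = y^T y.
    r = [u_pred[i] - d[i] for i in range(n)]
    y = []
    for i in range(n):
        y.append((r[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2 * math.pi) + log_det + sum(v * v for v in y))


# Hypothetical measurement locations, predictions and data.
pts = [(0.6, 0.1), (0.8, 0.2), (1.0, 0.3)]
u_model = [1.0, 1.1, 0.9]
d_meas = [1.05, 1.0, 0.95]
print(log_likelihood(pts, u_model, d_meas, sigma=0.2, lengths=(1.0, 1.0), noise_var=0.04))
```

Using the Cholesky factor gives both the determinant and the quadratic form in one pass and avoids forming $\Sigma^{-1}$ explicitly, which is the usual choice when the covariance is dense.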
Among the three algorithms, the performance of ML-IS is affected by the assumed importance density at each iteration. ML-SS performs well because the dimension of the parameter space is low. The ML-PA algorithm again performs well, similar to Example I.

Table 4.2: Comparison of marginal likelihood or evidence values using different algorithms for Example II.

Method | log(Evidence) | Relative difference (%)
Nested sampling | 252.2854 | -
ML-IS | 252.5498 | 0.1048
ML-SS | 252.5067 | 0.0877
ML-PA | 252.5249 | 0.0949

4.5.3 Example III: 11-story base-isolated building

This example utilizes an 11-story, 2-bay, 99 DOF superstructure [123] sitting on a hysteretic isolation layer that is rigid in-plane and can only move horizontally, resulting in the 100 DOF model shown in Figure 4.8 and used in Section 3.3.2. Rayleigh damping, with assumed 3% damping ratios for the 1st and 10th modes, is used to construct the superstructure damping matrix. If considered as a fixed-base structure, the superstructure would have a fundamental period of 1.05 s, with equations of motion written as

$$\mathbf{M}_s\ddot{\mathbf{u}}_s + \mathbf{C}_s\dot{\mathbf{u}}_s + \mathbf{K}_s\mathbf{u}_s = -\mathbf{M}_s\mathbf{r}_s\ddot{u}_g \tag{4.38}$$

where $\mathbf{M}_s$, $\mathbf{C}_s$ and $\mathbf{K}_s$ are the mass, damping and stiffness matrices of the superstructure, respectively; $\ddot{u}_g$ is the ground acceleration; and $\mathbf{u}_s$ is the generalized displacement vector relative to the ground, consisting of 3 DOF per node for each of the 33 nodes in the superstructure. The influence vector of the ground motion $\ddot{u}_g$, consisting of a 1 in each element corresponding to a horizontal displacement in $\mathbf{u}_s$ and zeros elsewhere, is denoted by $\mathbf{r}_s$.

Figure 4.8: 100 DOF base-isolated structural model.
When combined with the isolation layer, the equations of motion are

$$\mathbf{M}_s\ddot{\mathbf{u}}_s + \mathbf{C}_s\dot{\mathbf{u}}_s + \mathbf{K}_s\mathbf{u}_s = -\mathbf{M}_s\mathbf{r}_s\ddot{u}_g + \mathbf{C}_s\mathbf{r}_s\dot{u}_b + \mathbf{K}_s\mathbf{r}_s u_b$$
$$m_b\ddot{u}_b + \left(c_b^{nom} + \mathbf{r}_s^T\mathbf{C}_s\mathbf{r}_s\right)\dot{u}_b + \mathbf{r}_s^T\mathbf{K}_s\mathbf{r}_s u_b + f_b = -m_b\ddot{u}_g + \mathbf{r}_s^T\mathbf{C}_s\dot{\mathbf{u}}_s + \mathbf{r}_s^T\mathbf{K}_s\mathbf{u}_s \tag{4.39}$$

The model for the isolator force is assumed to be a Bouc-Wen model [69]. The total force exerted by the isolation layer, the sum of the damping and restoring forces, is

$$f_b = k_{post}u_b + c_b\dot{u}_b + Q_y[1 - r_k]z \tag{4.40}$$

where $k_{pre}$ and $k_{post}$ are the isolator pre-yield and post-yield stiffnesses, respectively; $Q_y$ is the isolator yield force; and $Q_y[1 - r_k]$, the peak of the non-elastic force, depends on the hardening ratio $r_k = k_{post}/k_{pre}$.

Table 4.3: Prior distribution of parameters for the 11-story base-isolated building.

Parameter | Distribution | Lower bound | Upper bound | Mean | Std. dev.
k_post (kN/m) | Trunc. Gaussian | 700 | 800 | 780 | 20
Q_y (% of W†) | Uniform | 4.5 | 6.5 | 5.5 | 0.5774
r_k | Uniform | 0.16 | 0.20 | 0.18 | 0.0115
† W = total weight of the building.

The nominal isolation-layer linear damping coefficient is assumed to be $c_b = 40$ kN·s/m. Further, a yield force $Q_y^{n} = 5\%$ of the building weight and a hardening ratio $r_k = k_{post}/k_{pre} = 1/6$ are used to generate the measurements. These parameters give the first mode a typical period of $T_1^{i} = 2.76$ s and a damping ratio of 5.5% [106]. The evolution of the auxiliary variable $z$ is governed by

$$\dot{z} = A\dot{x}_b - \beta\dot{x}_b|z|^{n_{pow}} - \gamma z|\dot{x}_b||z|^{n_{pow}-1} \tag{4.41}$$

with $n_{pow} = 1$, where the selection of $A = 2\beta = 2\gamma = k_{pre}/Q_y$ dictates that $z$ stays in $[-1, 1]$ and produces identical loading and unloading stiffnesses [71, 102]. The ground excitation is the 1940 El Centro earthquake record of 30 s duration. The absolute (horizontal) acceleration of the roof, specified by DOF $u_{97}$ in Figure 4.8, with a sampling rate of 20 Hz, is used as the output $Y(t)$ of the model, to which 20% Gaussian white noise is added.
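The behavior of the auxiliary variable in (4.41) can be illustrated with a short Python sketch that integrates $z(t)$ for a prescribed sinusoidal isolator velocity using a classical fourth-order Runge-Kutta scheme and then evaluates the isolator force (4.40). This is a sketch under stated assumptions, not the solver used in this chapter: the yield force value `Q_Y`, the displacement amplitude `AMP`, the time step, and the function names are hypothetical, while $k_{pre} = 4500$ kN/m follows from $k_{post} = 750$ kN/m and $r_k = 1/6$.

```python
import math


def bouc_wen_rate(z, v, a, beta, gamma, n_pow=1):
    """Right-hand side of (4.41) for isolator velocity v."""
    return (a * v
            - beta * v * abs(z) ** n_pow
            - gamma * z * abs(v) * abs(z) ** (n_pow - 1))


def simulate_z(velocity, t_end, dt, k_pre, q_y, n_pow=1):
    """Integrate z(t) with classical RK4 for a prescribed velocity history,
    using A = 2*beta = 2*gamma = k_pre / Q_y and z(0) = 0."""
    a = k_pre / q_y
    beta = gamma = 0.5 * a
    z, t, history = 0.0, 0.0, [0.0]
    for _ in range(int(round(t_end / dt))):
        k1 = bouc_wen_rate(z, velocity(t), a, beta, gamma, n_pow)
        k2 = bouc_wen_rate(z + 0.5 * dt * k1, velocity(t + 0.5 * dt), a, beta, gamma, n_pow)
        k3 = bouc_wen_rate(z + 0.5 * dt * k2, velocity(t + 0.5 * dt), a, beta, gamma, n_pow)
        k4 = bouc_wen_rate(z + dt * k3, velocity(t + dt), a, beta, gamma, n_pow)
        z += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
        history.append(z)
    return history


def isolator_force(u_b, v_b, z, k_post, c_b, q_y, r_k):
    """Total isolation-layer force, as in (4.40)."""
    return k_post * u_b + c_b * v_b + q_y * (1.0 - r_k) * z


K_PRE = 4500.0                  # kN/m, from k_post = 750 kN/m and r_k = 1/6
Q_Y = 100.0                     # kN, hypothetical yield force
AMP = 0.1                       # m, hypothetical displacement amplitude
OMEGA = 2.0 * math.pi / 2.76    # rad/s, from the 2.76 s period quoted above

z_history = simulate_z(lambda t: AMP * OMEGA * math.cos(OMEGA * t),
                       t_end=5.52, dt=0.001, k_pre=K_PRE, q_y=Q_Y)
print(max(z_history), min(z_history))
```

With these settings the displacement excursion is several times the yield displacement $Q_y/k_{pre}$, so $z$ should saturate close to $+1$ and $-1$ in alternate half-cycles without leaving $[-1, 1]$, consistent with the property stated for $A = 2\beta = 2\gamma$.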
The ML-IS algorithm is implemented with a Gaussian importance density formed at each iteration using the mean and standard deviations of the current samples. The ML-SS algorithm is implemented again with $\Omega_i = F_\mu^{-1}(x_i)$ for each of the parameters, where $x_i \in \left[\frac{i-1}{n_{\mathrm{st}}}, \frac{i}{n_{\mathrm{st}}}\right]$, $n_{\mathrm{st}} = 5$, and $F_\mu^{-1}$ is the inverse probability distribution of the parameters $k_{\mathrm{post}}$, $Q_y$, and $r_k$. The ML-PA algorithm is implemented with an initial sample size of 100; at each iteration, the 10 samples with the lowest likelihoods are rejected and 10 new samples with higher likelihoods are added to the sample pool using the Metropolis-Hastings algorithm. The proposal density in the Metropolis-Hastings algorithm is assumed as Gaussian with standard deviations of 5 kN/m, 0.5%, and 0.005 for the parameters $k_{\mathrm{post}}$, $Q_y$, and $r_k$, respectively. The stopping criteria of Example I are used, except that the total number of function evaluations is limited to 5000 and $\Delta E_{\mathrm{tol}} = 1\%$ for computational purposes.

The accuracy of the proposed algorithms compared to nested sampling is again shown in Table 4.4. The estimated means and standard deviations of the parameters using the ML-PA algorithm are shown in Table 4.5. The proposed approach is again shown to estimate the marginal likelihood or evidence value accurately compared to nested sampling with stricter stopping criteria.

Table 4.4: Comparison of marginal likelihood or evidence values using different algorithms for Example III.

| Method | log(Evidence) | Relative difference (%) |
|---|---|---|
| Nested sampling | 45.1889 | n/a |
| ML-IS | 43.4989 | 3.74 |
| ML-SS | 45.7038 | 1.14 |
| ML-PA | 45.5899 | 0.89 |

Table 4.5: Posterior mean and standard deviation of the parameters for the 11-story base isolated building.

| Parameter | True value | Posterior mean | Posterior std. dev. |
|---|---|---|---|
| $k_{\mathrm{post}}$ (kN/m) | 750 | 766.84 | 17.12 |
| $Q_y$ (% of $W$) | 5 | 5.04 | 0.25 |
| $r_k$ | 0.1667 | 0.1677 | 0.0036 |

Next, a Bayesian model selection exercise is performed for this example.
The candidate models are Bouc-Wen, bilinear, and a linear approximation of the bilinear model according to the AASHTO (American Association of State Highway and Transportation Officials) guidelines. The bilinear model can be approximated with (6.26) as $n_{\mathrm{pow}} \to \infty$. In this chapter, $n_{\mathrm{pow}} = 100$ is used. The AASHTO model approximates the energy dissipation in each cycle of the bilinear model. In this model,

$$f_b = [c_b + c_{\mathrm{eq}}] \dot{x}_b + k_{\mathrm{eq}} u_b = \left[c_b + 2\zeta_{\mathrm{eq}} \sqrt{k_{\mathrm{eq}} (m_b + m_s)}\right] \dot{x}_b + k_{\mathrm{eq}} u_b \tag{4.42}$$

where $m_s$ is the mass of the superstructure. The equivalent damping ratio $\zeta_{\mathrm{eq}}$ and equivalent stiffness $k_{\mathrm{eq}}$ are given by

$$\zeta_{\mathrm{eq}} = \frac{2(1 - r_k)(1 - r_d^{-1})}{\pi [1 + r_k (r_d - 1)]}, \qquad k_{\mathrm{eq}} = \frac{k_{\mathrm{pre}}}{r_d} [1 + r_k (r_d - 1)] \tag{4.43}$$

where $r_d = x_d/x_y$ is the shear ductility ratio of the design displacement $x_d$ to the yield displacement $x_y$.

[Figure 4.9: Models for hysteresis.]

The marginal likelihoods are calculated using the ML-PA algorithm. With equal prior model probabilities, the posterior model probabilities are calculated using (3.1) and shown in Table 4.6. The model selection correctly assigns a posterior probability of 1.0 to the Bouc-Wen model. Note also that, if the Bouc-Wen model is absent from the candidate model set, the Bayesian model selection chooses the second-best bilinear model as the correct one.

Table 4.6: Posterior model probabilities for the hysteretic isolation layer in the 11-story base-isolated building.

| Model | log(Evidence) | $P(\mathcal{M}_k \,\vert\, \mathcal{D})$ |
|---|---|---|
| Bouc-Wen | 45.6893 | ≈ 1.0 |
| Bilinear | −46.1149 | ≈ 0.0 |
| AASHTO | −821.0503 | ≈ 0.0 |

4.6 Conclusions

In this chapter, a multilevel approach has been proposed to evaluate the multidimensional integral known as marginal likelihood or evidence required in the Bayesian model selection. Three algorithms using importance sampling, stratified sampling, and Markov chain Monte Carlo chains are used to implement this approach.
In the first proposed algorithm, importance sampling densities are formed adaptively at each iteration of the algorithm to generate samples with increased levels of likelihood. The performance of this algorithm depends, however, on the effectiveness of the assumed importance sampling algorithm. The second algorithm uses stratified sampling and focuses on generating more samples from the strata with higher likelihoods. This algorithm can suffer from the curse of dimensionality for high dimensional parameter spaces. The final algorithm uses Markov chain Monte Carlo algorithms to sample more from the high likelihood region. For high dimensional parameter spaces, MCMC algorithms with high acceptance rates, e.g., Modified Metropolis-Hastings [115], can be used.

The performance and accuracy of these proposed algorithms are compared in three numerical examples. The first example shows the accuracy of the proposed algorithms where the exact marginal likelihood or evidence is known. The second example evaluates the marginal likelihood or evidence for a model where the underlying physics requires the solution of the Navier-Stokes equations many times. The proposed algorithms are shown to give results that are within the desired accuracy level relative to the marginal likelihood value calculated using a well-known algorithm. The third example uses an 11-story base-isolated building model with uncertainties in the nonlinear hysteretic isolation layer subjected to a historic earthquake. A multidimensional parameter space is investigated in this example. These examples show the efficacy of the proposed algorithms in different settings and their application to Bayesian model selection.

Chapter 5
Model Validation Framework

As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.
Albert Einstein

5.1 Introduction

The merits of both Bayesian model selection and model falsification are obvious, yet each method alone has its own inherent weaknesses. When all model classes in an initial set are inadequate representations of the system (an entirely plausible scenario, particularly for systems with high complexity or unknown features), Bayesian model selection will always choose a model class but without clear warning or indication to the modeler of its inadequacy. Consequently, future predictions produced by the inadequate model class may be highly inaccurate and, therefore, assumptions or decisions based on these predictions may have catastrophic consequences. While model falsification possesses the ability to eliminate incorrect models and inform the modeler if none of the options are valid, further judgment on the usefulness of a particular model class and its parameter values is not possible in its current form [52, 124].
The framework proposed herein selects one or more model(s) or model class(es) by integrating the principle of model falsification into Bayesian model selection to mitigate the weaknesses of these different identification schemes. Exploiting model falsification's ability to significantly shrink the valid model class set will avoid numerous expensive computations required to evaluate the posterior parameter distribution for Bayesian model selection; these savings in computational cost will directly grow with increasing numbers of measurements, degrees of freedom (DOF), or space/time resolution. As Bayesian model selection already includes the effect of Occam's razor in the evidence, no further extraneous steps are needed to penalize model classes with more parameters. Hence, the proposed framework not only identifies the most plausible model class and parameter estimates, it also requires fewer model simulations than other validation methods and provides checks on the suitability of the resulting models and model classes.

5.2 Hybrid Framework for Model Validation

Since Bayesian model selection leaves as most plausible a model that may be wrong, and model falsification can fail to falsify any model class, leaving a large number of unfalsified models, and provides no relative confidence in them, this section proposes the falsification of models in a Bayesian framework. Further, this falsification approach [52, 124] is designed (unlike other falsification approaches) to accommodate many measurements such as those over multiple spatial and temporal dimensions from dynamical systems, both linear and nonlinear. The model validation framework proposed herein incorporates model falsification as both pre- and postprocessing steps before and after Bayesian model selection to combine the usefulness of both approaches while overcoming their individual shortcomings.
[Figure 5.1: Proposed synergistic framework of model falsification and Bayesian model class selection. The flowchart has three blocks: preprocessing (intra-model-class, likelihood-bound model falsification using FDR, yielding $\mathbb{M}_1$, with an optional prior adjustment), Bayesian model class selection (yielding $\mathbb{M}_2$), and postprocessing (inter-model-class, evidence-based model class falsification, yielding the validated model classes $\mathbb{M}_3$), all sharing a computational engine that evaluates $L(\boldsymbol{\theta}; \mathcal{D}, \mathcal{M}_k)$.]

A flowchart of the steps of the proposed validation procedure is given in Figure 5.1 [125, 126].
The flowchart shows that the preprocessing falsification step first eliminates the model classes that do not reproduce reasonably well the responses of the physical system, thereby shrinking the set of candidate model classes. The next step is to implement Bayesian model selection with the remaining model classes by re-using computational results from the preprocessing step. Finally, a postprocessing step checks the validity, on average, of the final model class(es).

Some other researchers have proposed different types of model validation frameworks. For example, a validation approach proposed by Babuška et al. [127] employs a model rejection step using a validation dataset, but only after fitting probability distributions to all model class parameters using a separate calibration dataset. Recently, Farrell et al. [128–130] proposed a framework for model validation that first selects a subset of candidate model classes and calculates the calibrated posterior model parameter distribution, the evidence values and, finally, the validated posterior parameter distribution (and probability distribution for a quantity of interest) using Monte Carlo approaches; however, that framework is unable to find model classes with higher evidence values but with more parameters because it starts with the model class subsets that have the fewest parameters, stopping when it finds a valid model class, and does not investigate further.

5.2.1 Intra-Model-Class Falsification: Framework's Preprocessing Step

A model from a model class $\mathcal{M}_k$, which is in the set $\mathbb{M}$ of all candidate model classes, is specified by an $n_\theta \times 1$ parameter vector $\boldsymbol{\theta} \in \Theta$. (Technically, $\boldsymbol{\theta}$ should be written $\boldsymbol{\theta}^{(k)}$ since its size may be different for different model classes; however, the superscript $(k)$ is omitted for notational simplicity.) Herein, $\boldsymbol{\theta}$ will be called a model as its value defines one model within the corresponding model class.
The difference between the $N_o$ outputs $\mathbf{h}(\boldsymbol{\theta})$ of the model and their corresponding measurements $\mathbf{d}$ is known as the residual error [52]. These residuals are modeled as continuous random variables, herein characterized by the probability density function $p_{\mathbf{E}}(\mathbf{e}\,|\,\boldsymbol{\theta})$, where $\mathbf{E}$ is a random vector and $\mathbf{e}$ is a possible value of the random vector $\mathbf{E}$, whereas $\boldsymbol{\epsilon}$ is the actual residual error. (The random vector $\mathbf{E}$ is henceforth omitted for brevity but is implied by context.)

Conventional error domain model falsification sets bounds on each error residual $\epsilon_i = h_i(\boldsymbol{\theta}) - d_i$; the bounds are chosen so that each residual has a given probability, defined by the model's assumptions on $p(\mathbf{e}\,|\,\boldsymbol{\theta})$, of remaining within the bounds. This approach ignores significant information that may be available about the residuals, their uncertainty distributions and their correlations. Instead, following Chapter 2 and De et al. [52], the proposed framework's first step is to falsify models based on the likelihood of their residual; in the case of a single measurement, the likelihood (omitting the model class $\mathcal{M}_k$) is

$$L(\boldsymbol{\theta}; \mathcal{D}) = p(\mathcal{D}\,|\,\boldsymbol{\theta}) = p(\epsilon\,|\,\boldsymbol{\theta}) = p\big([h(\boldsymbol{\theta}) - d]\,\big|\,\boldsymbol{\theta}\big) \tag{5.1}$$

The model would then be unfalsified (i.e., accepted) if the likelihood is larger than a likelihood threshold $\underline{L}$, and falsified if below:

$$L(\boldsymbol{\theta}; \mathcal{D}) < \underline{L} \;\Rightarrow\; \text{falsify model } \boldsymbol{\theta} \tag{5.2}$$

For an arbitrary number of measurements, including time history responses of dynamical systems, the predicted outputs of model class $\mathcal{M}_k$ are given again by $\mathbf{h}(\boldsymbol{\theta})$ but in a stacked form (i.e., for a predicted response time history vector $\mathbf{y}(t; \boldsymbol{\theta})$, sampled at times $\{t_0, t_1, \cdots\}$, let $\mathbf{h}(\boldsymbol{\theta}) = [\mathbf{y}^T(t_0; \boldsymbol{\theta}) \;\; \mathbf{y}^T(t_1; \boldsymbol{\theta}) \;\; \cdots]^T$ or a similar arrangement). The $N_o \times 1$ residual vector is $\boldsymbol{\epsilon} = \mathbf{h}(\boldsymbol{\theta}) - \mathbf{d}$.
If the resulting residuals are zero-mean Gaussian distributed with covariance matrix $\boldsymbol{\Sigma}$, then the likelihood can be written

$$L(\boldsymbol{\theta}; \mathcal{D}) = p\big(\mathbf{h}(\boldsymbol{\theta}) - \mathbf{d}\,\big|\,\boldsymbol{\theta}\big) = \frac{\exp\left(-\frac{1}{2} [\mathbf{h}(\boldsymbol{\theta}) - \mathbf{d}]^T \boldsymbol{\Sigma}^{-1} [\mathbf{h}(\boldsymbol{\theta}) - \mathbf{d}]\right)}{(2\pi)^{N_o/2} |\boldsymbol{\Sigma}|^{1/2}} = \frac{\exp\left(-\frac{1}{2} \boldsymbol{\epsilon}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\epsilon}\right)}{(2\pi)^{N_o/2} |\boldsymbol{\Sigma}|^{1/2}} \tag{5.3}$$

This approach directly frames model uncertainty in terms of the dynamic response measurements with two advantages: (i) avoiding the intermediate use of modal parameters helps limit error introduced via system identification methods and modal analysis; and (ii) the likelihood calculated for models in a class can be used to subsequently compute the model class evidence, an integral step in Bayesian model selection.

[Figure 5.2: A likelihood-bound $\underline{L}$ defined for a multidimensional non-Gaussian residual density.]

Figure 2.6 shows the typical lower and upper bounds ($\underline{\epsilon}_i$ and $\bar{\epsilon}_i$) used by model falsification for a two-data-point case; for a symmetric, unimodal residual density such as that shown, these bounds have the same likelihood $\underline{L}$, though this is not the case for more general residual densities, such as in Figure 5.2, where a likelihood threshold would shift the residual bounds to the area(s) of highest likelihood. Further, a likelihood threshold criterion can accommodate more general residual densities (skewed or multimodal), and allows the modeler to incorporate the correlation structure of multiple measurement residuals.

Methods to calculate bounds on likelihood values

De et al. [52] discussed various model falsification approaches to determine the likelihood lower bound $\underline{L}$ based on the false discovery rate (FDR) correction with the Benjamini-Hochberg (BH) procedure [51, 52]. In multiple comparison problems, FDR control ensures that, on average, the proportion of false positives among all rejections will be below a specified value, and provides better statistical power while allowing some false positive results [57].
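The Gaussian likelihood (5.3) and the falsification rule (5.2) can be sketched in a few lines of Python. The sketch below assumes, for brevity, the special case of a diagonal covariance matrix (independent residuals with standard deviations $\sigma_i$) and works in log space to avoid numerical underflow when $N_o$ is large; the function names and the numbers in the usage lines are hypothetical and are not taken from this study.

```python
import math

def gaussian_log_likelihood(h, d, sigma):
    """Log of (5.3) assuming Sigma = diag(sigma_i^2), i.e. independent residuals."""
    log_l = 0.0
    for h_i, d_i, s_i in zip(h, d, sigma):
        eps_i = h_i - d_i  # residual for one measurement
        log_l += (-0.5 * (eps_i / s_i) ** 2
                  - math.log(s_i)
                  - 0.5 * math.log(2.0 * math.pi))
    return log_l

def is_falsified(h, d, sigma, log_l_lower):
    """Rule (5.2) applied in log space: falsify when log L < log of the lower bound."""
    return gaussian_log_likelihood(h, d, sigma) < log_l_lower

# Hypothetical usage: three predicted outputs, three measurements, a common
# residual standard deviation, and an assumed log-likelihood lower bound.
predicted = [0.10, 0.12, 0.09]
measured = [0.11, 0.10, 0.10]
sigmas = [0.02, 0.02, 0.02]
falsified = is_falsified(predicted, measured, sigmas, log_l_lower=5.0)
```

A full covariance matrix would replace the per-residual terms with a Cholesky-based quadratic form and log-determinant, but the thresholding logic would be unchanged.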
(The statistical power, defined as the probability of rejecting a model when it is invalid, is related to the probability of making a type II error, which is the error introduced by incorrectly accepting invalid models.) Herein, the BH procedure is used, as described in Section 2.3.4, to compute the residual error bounds $[\underline{\epsilon}_i, \bar{\epsilon}_i]$ from the target identification probability $\phi$ (generally assumed as 0.95 or 0.90). These bounds are used next to compute $\underline{L}$ using

$$\underline{L} = \prod_{i=1}^{N_o} \min_{\underline{\epsilon}_i \le e_i \le \bar{\epsilon}_i} p(e_i\,|\,\boldsymbol{\theta}) \tag{5.4}$$

(As discussed in De et al. [52], there may be other ways to use the bounds $[\underline{\epsilon}_i, \bar{\epsilon}_i]$ or other information to decide the lower bound $\underline{L}$.)

When all models evaluated from a model class are falsified, the entire model class is considered falsified. $\mathbb{M}_1$, the set of model classes that passes this preprocessing step, is used as the possible model class set for the subsequent step of Bayesian model selection. (If $\mathbb{M}_1$ is empty, then the selection of the initial set of model classes must be reconsidered.) By shrinking the set of candidate model classes, this preprocessing step can achieve significant computational savings in the Bayesian model selection, which requires complete exploration of the high likelihood region, a computationally expensive task (discussed in the next subsection), whereas a satisfactory coverage of the high likelihood region is sufficient for model falsification.

5.2.2 Bayesian model class selection

With the remaining model classes in $\mathbb{M}_1$, the posterior model class probabilities (i.e., the probability of a model class conditioned on measurement data $\mathcal{D}$) are given by Bayes' theorem:

$$P(\mathcal{M}_k\,|\,\mathcal{D}) = \frac{p(\mathcal{D}\,|\,\mathcal{M}_k) P(\mathcal{M}_k)}{p(\mathcal{D})}, \qquad \mathcal{M}_k \in \mathbb{M}_1 \tag{5.5}$$

where the probability of an event is denoted by $P(\cdot)$; $P(\mathcal{M}_k)$ is an a priori measure of model plausibility assigned by the modeler based on past experience, normalized so $\sum_{\mathcal{M}_k \in \mathbb{M}_1} P(\mathcal{M}_k) = 1$; and the denominator is $p(\mathcal{D}) = \sum_{\mathcal{M}_k \in \mathbb{M}_1} p(\mathcal{D}\,|\,\mathcal{M}_k) P(\mathcal{M}_k)$ using the theorem of total probability.
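Because evidence values are usually reported as logarithms that differ by tens or hundreds of units, (5.5) is best evaluated in log space. The following Python sketch does so by subtracting the largest log term before exponentiating; the function name is an assumption made for illustration, and the usage line simply re-uses the log-evidence values reported in Table 4.6 of the preceding chapter with equal prior probabilities.

```python
import math

def posterior_model_probabilities(log_evidences, priors=None):
    """Evaluate (5.5) from log-evidences, using a max-shift to avoid overflow."""
    n = len(log_evidences)
    if priors is None:
        priors = [1.0 / n] * n  # equal prior model class probabilities
    log_terms = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    shift = max(log_terms)
    weights = [math.exp(t - shift) for t in log_terms]
    total = sum(weights)  # proportional to p(D) by total probability
    return [w / total for w in weights]

# Log-evidences of the Bouc-Wen, bilinear and AASHTO models from Table 4.6.
probabilities = posterior_model_probabilities([45.6893, -46.1149, -821.0503])
```

Exponentiating the raw log-evidences directly would overflow or underflow in double precision for values of this magnitude, which is why the shift is applied first.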
For a particular model class $\mathcal{M}_k$, the model evidence $\mathcal{E}^{(k)} = p(\mathcal{D}\,|\,\mathcal{M}_k)$ is

$$\mathcal{E}^{(k)} = \int_\Theta p(\mathcal{D}\,|\,\boldsymbol{\theta}, \mathcal{M}_k)\, p(\boldsymbol{\theta}\,|\,\mathcal{M}_k)\, d\boldsymbol{\theta} = \int_\Theta L(\boldsymbol{\theta}; \mathcal{D}, \mathcal{M}_k)\, p(\boldsymbol{\theta}\,|\,\mathcal{M}_k)\, d\boldsymbol{\theta} \tag{5.6}$$

where $p(\boldsymbol{\theta}\,|\,\mathcal{M}_k)$ is the prior probability, based on modeler judgment, of parameter vector $\boldsymbol{\theta}$ for model class $\mathcal{M}_k$; and $L(\boldsymbol{\theta}; \mathcal{D}, \mathcal{M}_k) = p(\mathcal{D}\,|\,\boldsymbol{\theta}, \mathcal{M}_k)$ is the likelihood function already computed in Section 5.2.1. These evidence values $\mathcal{E}^{(k)}$ are required to evaluate the posterior model class probabilities $P(\mathcal{M}_k\,|\,\mathcal{D})$, which can then be used to select a model class, or used as weights for multiple validated model classes. At the end of this step, based on the posterior model class probabilities, a smaller model class set $\mathbb{M}_2 \subseteq \mathbb{M}_1$, typically only the one model class with the highest $P(\mathcal{M}_k\,|\,\mathcal{D})$, is retained and will be subjected to the final postprocessing falsification step.

The assumption that the true model class is among the candidate model classes may not always be true. In such scenarios, the application of Bayesian model class selection alone may lead to erroneous conclusions; the framework's preprocessing falsification step avoids this error and ensures a meaningful result from the Bayesian model selection.

Note: The unfalsified models' parameters can be used to formulate informative priors for the model selection, though the question of using the data twice can be raised. In that case, the dataset is divided into two datasets $\mathcal{D}_I$ and $\mathcal{D}_{II}$, where the falsification is done with $\mathcal{D}_I$. The prior of the model parameters is then formed using the unfalsified parameters for the model selection, which is performed with $\mathcal{D}_{II}$.
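Returning to the evidence integral (5.6), its meaning as a prior-weighted average of the likelihood can be illustrated with the simplest possible estimator: a plain Monte Carlo average of the likelihood over samples drawn from the prior. This is not the nested sampling algorithm used in this study and is inefficient when the high likelihood region is small; it is shown only as a sketch, with hypothetical function names and a toy one-parameter Gaussian problem chosen because its evidence is known in closed form.

```python
import math
import random

def log_evidence_prior_mc(log_likelihood, sample_prior, n_samples, rng):
    """Estimate log of (5.6) as the log of the mean likelihood over prior samples."""
    log_ls = [log_likelihood(sample_prior(rng)) for _ in range(n_samples)]
    shift = max(log_ls)  # max-shift keeps the exponentials in range
    mean_shifted = sum(math.exp(v - shift) for v in log_ls) / n_samples
    return shift + math.log(mean_shifted)

# Toy problem (an assumption for illustration): one measurement d = 0.5 of theta
# with unit-variance Gaussian error, and a zero-mean Gaussian prior with std 2.
D_OBS, SIGMA, TAU = 0.5, 1.0, 2.0

def toy_log_likelihood(theta):
    return (-0.5 * ((D_OBS - theta) / SIGMA) ** 2
            - math.log(SIGMA) - 0.5 * math.log(2.0 * math.pi))

def toy_sample_prior(rng):
    return rng.gauss(0.0, TAU)

estimate = log_evidence_prior_mc(toy_log_likelihood, toy_sample_prior,
                                 20000, random.Random(0))
# Closed-form evidence for this toy problem: d ~ N(0, SIGMA^2 + TAU^2).
exact = (-0.5 * D_OBS ** 2 / (SIGMA ** 2 + TAU ** 2)
         - 0.5 * math.log(2.0 * math.pi * (SIGMA ** 2 + TAU ** 2)))
```

For this toy problem the estimate and the closed-form value should agree to within the Monte Carlo sampling error; in realistic problems the same likelihood evaluations would instead feed a more efficient estimator.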
Sampling Algorithms

For efficient evaluation of the evidence, a notable computational challenge for applications of Bayesian model selection, several methods have been proposed, including the posterior harmonic mean estimator [107], importance sampling [12], the nested sampling technique [86], annealed sampling [108], the Power Posterior method [109], the Transitional Markov Chain Monte Carlo method [94], the Monte Carlo splitting and subset methods [131–134], stochastic collocation [135], and polynomial chaos approaches [136]. For the numerical examples herein, a nested sampling algorithm is used to evaluate the evidence values. A brief description of this nested sampling algorithm is provided in Section 3.2.2. However, after several iterations of the nested sampling algorithm, sampling from the high likelihood region becomes difficult, especially if the high likelihood region is concentrated within a very small region. In such a case, Skilling [86] suggested using the Markov Chain Monte Carlo (MCMC) algorithm for generating samples from the prior constrained to the high likelihood region. For the examples herein, the nested sampling uses conventional random sampling with rejection but switches, when the average acceptance rate drops below 5%, to an MCMC augmented by the modified Metropolis algorithm, introduced in Au and Beck [115] (see Appendix A), which will provide a high acceptance rate to efficiently sample from the high likelihood region.

5.2.3 Inter-Model-Class Falsification: Framework's Postprocessing Step

A postprocessing falsification procedure, following the Bayesian model class selection, is proposed here to provide a robust framework that is capable of indicating whether the model class(es) in the current pool $\mathbb{M}_2$ are valid in terms of their evidence value(s).
The evidence $\mathcal{E}^{(k)}$ for model class $\mathcal{M}_k$, which is the likelihood (5.6) of the data given a model class, can be written as the expected likelihood of the residuals; i.e., $\mathcal{E}^{(k)} = p(\boldsymbol{\epsilon}\,|\,\mathcal{M}_k) = \mathrm{E}_\Theta[L(\boldsymbol{\theta}; \mathcal{D}, \mathcal{M}_k)]$. Similar to the likelihood threshold $\underline{L}$ for a given model in the preprocessing falsification, an entire model class can be falsified if the model class evidence $\mathcal{E}^{(k)}$ is below a threshold $\underline{\mathcal{E}}$:

$$\mathcal{E}^{(k)} < \underline{\mathcal{E}} \;\Rightarrow\; \text{falsify model class } \mathcal{M}_k \tag{5.7}$$

As the evidence $\mathcal{E}^{(k)}$ has already been computed in the Bayesian model class selection, this proposed postprocessing falsification can be performed with no extra calculation. While different choices can be made for the evidence lower bound $\underline{\mathcal{E}}$, it is taken herein to be equal to the likelihood threshold $\underline{L}$ used in the preprocessing step because the evidence $\mathcal{E}$ is nothing but the expected value of the likelihood function $L$; since the likelihood threshold $\underline{L}$ has already been computed, the choice $\underline{\mathcal{E}} = \underline{L}$ has the further convenience of not requiring additional computation. Another evidence threshold that could be used without further computation is $\underline{\mathcal{E}} = \gamma \max_k \max_i L(\boldsymbol{\theta}_i; \mathcal{D}, \mathcal{M}_k)$ with $\gamma \in (0, 1)$ depending on the modeler's expectation about the average likelihood value in a model class.

This postprocessing falsification is particularly beneficial in identifying the cases in which a few models remain from an entire model class after the preprocessing falsification but the model class is, on average, incorrect. The result, then, is the set of model classes in $\mathbb{M}_2$ that are not falsified by (5.7), $\mathbb{M}_3 = \{\mathcal{M}_k \in \mathbb{M}_2 : \mathcal{E}^{(k)} \ge \underline{\mathcal{E}}\}$, of final, validated model classes.

5.2.4 Computational Advantage of the Synergistic Framework

The proposed approach provides significant computational savings when some competing model classes are excluded using the preprocessing step, thereby eliminating the most computationally expensive step of calculating evidence values for those model classes, as this step generally requires many more model evaluations.
The second reduction in computational effort is obtained since the model likelihoods computed in the preprocessing falsification can be used as initial starting values for (nested) sampling. Third, the postprocessing model class falsification relies on evidence already computed in the Bayesian model class evaluation, so it does not incur additional cost. A metric $C$ for the computational savings can be introduced as

$$C = \frac{\sum_{k \notin \mathbb{M}_1} (N_{2,k} - N_{1,k})}{\sum_{k \in \mathbb{M}} N_{2,k}} \times 100\%$$

where $N_{1,k}$ and $N_{2,k}$ are the numbers of model evaluations in the preprocessing falsification and Bayesian model selection steps, respectively, for the $k$th model class. Generally $N_{2,k} \gg N_{1,k}$ because, in evidence estimation, a complete exploration of the high likelihood region must be performed (as shown in the two numerical illustrations given in the next section), leading to significant computational savings. Also, if some of the model classes are nonlinear and fail to pass the preprocessing step, additional savings in computation time can be expected, compared to falsification of some linear model classes, due to the additional computational complexity associated with the response calculation of a nonlinear model.

5.3 Numerical Illustrations

The proposed framework is illustrated using two numerical examples from nonlinear structural dynamics. In each of these examples, several model classes are used to describe the system's nonlinear behaviors. The proposed framework is shown to systematically eliminate the incorrect candidate model classes to choose and validate the model class that probabilistically best fits the measurement data.

5.3.1 Example I: 3DOF Model with Nonlinear Stiffnesses

Consider the three DOF model shown in Figure 5.3 subjected to base acceleration $\ddot{x}_g$.
The equations of motion of the structure are given by

$$\mathbf{M}\ddot{\mathbf{x}} + \mathbf{C}\dot{\mathbf{x}} + \mathbf{K}\mathbf{x} + \mathbf{L}\mathbf{g}(\mathbf{x}) = -\mathbf{M}\mathbf{1}_{3 \times 1} \ddot{x}_g \tag{5.8}$$

where the mass matrix is

$$\mathbf{M} = \begin{bmatrix} m_1 & 0 & 0 \\ 0 & m_2 & 0 \\ 0 & 0 & m_3 \end{bmatrix} \tag{5.9}$$

with $m_1 = m_2 = m_3 = 300$ Mg; the nominal stiffness matrix is

$$\mathbf{K} = \begin{bmatrix} \bar{k}_1 + \bar{k}_2 & -\bar{k}_2 & 0 \\ -\bar{k}_2 & \bar{k}_2 + \bar{k}_3 & -\bar{k}_3 \\ 0 & -\bar{k}_3 & \bar{k}_3 \end{bmatrix} \tag{5.10}$$

with $\bar{k}_1 = \bar{k}_2 = \bar{k}_3 = 8$ MN/m; $\mathbf{g}(\mathbf{x})$ is a (possibly nonlinear) restoring force function with its influence matrix

$$\mathbf{L} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \tag{5.11}$$

$\mathbf{1}_{3 \times 1}$ is a column vector of ones; and the mass displacement vector relative to the support is $\mathbf{x} = [x_1 \; x_2 \; x_3]^T$. Rayleigh damping, i.e., $\mathbf{C} = \beta_1 \mathbf{M} + \beta_2 \mathbf{K}$, is assumed with 3% damping for each of the first two modes, where $\beta_1$ and $\beta_2$ are constants evaluated from the damping ratios and the nominal natural frequencies.

[Figure 5.3: 3 DOF model with nonlinear stiffnesses. The sketch shows masses $m_1$, $m_2$, $m_3$ with displacements $x_1$, $x_2$, $x_3$, connected in series by nominal stiffnesses $\bar{k}_i$, nonlinear stiffnesses $k_i$ and dampers $c_i$, and excited by base acceleration $\ddot{x}_g$.]

Table 5.1: Means and standard deviations of Gaussian prior distributions for model class stiffness coefficients (units are MN/m$^{p_i}$).

| Story $i$ | Stiffness coeff. | Linear ($p_i = 1$) mean | Linear σ | Quadratic ($p_i = 2$) mean | Quadratic σ | Cubic ($p_i = 3$) mean | Cubic σ |
|---|---|---|---|---|---|---|---|
| 1 | $k_1$ | 25 | 2.5 | 250 | 25 | 2500 | 250 |
| 2 | $k_2$ | 0.18 | 0.025 | 1.8 | 0.25 | 18 | 2.5 |
| 3 | $k_3$ | 1.8 | 0.25 | 18 | 2.5 | 180 | 25 |

Candidate Model Classes

Three possibilities for each element of $\mathbf{g}(\mathbf{x})$ are assumed (linear, quadratic and cubic stiffnesses) for a total of $3 \times 3 \times 3 = 27$ different model classes: i.e., $g_i = k_i |x_i - x_{i-1}|^{p_i} \,\mathrm{sgn}(x_i - x_{i-1})$, where $x_0 \equiv 0$ and $p_i \in \{1, 2, 3\}$ for $i = 1, 2, 3$. These model classes are denoted by concatenating the exponent values $p_1$−$p_2$−$p_3$; e.g., combination 1−2−3 has the vector $\mathbf{g}(\mathbf{x}) = [k_1 x_1 \;\; k_2 |x_2 - x_1|^2 \,\mathrm{sgn}(x_2 - x_1) \;\; k_3 (x_3 - x_2)^3]^T$. The prior distributions for parameters $k_1$, $k_2$, and $k_3$ are assumed to be Gaussian with means and standard deviations as shown in Table 5.1.
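The candidate model classes and the restoring force definition above can be sketched in Python as follows. The function and variable names are assumptions made for illustration; the exponents, the labeling convention and the story-1 prior means are those given in the text and Table 5.1.

```python
import itertools
import math

def restoring_forces(x, k, p):
    """g_i = k_i |x_i - x_{i-1}|^{p_i} sgn(x_i - x_{i-1}), with x_0 = 0."""
    forces, x_prev = [], 0.0
    for x_i, k_i, p_i in zip(x, k, p):
        drift = x_i - x_prev  # interstory drift
        forces.append(k_i * abs(drift) ** p_i * math.copysign(1.0, drift))
        x_prev = x_i
    return forces

# All 3 x 3 x 3 = 27 exponent combinations, labeled as in the text.
model_classes = list(itertools.product((1, 2, 3), repeat=3))
labels = ["-".join(str(p_i) for p_i in p) for p in model_classes]

# Story-1 prior means from Table 5.1 (linear, quadratic, cubic).
K1_PRIOR_MEAN = {1: 25.0, 2: 250.0, 3: 2500.0}

# Force in story 1 at a 0.1 m drift for each exponent, using the prior means.
story1_forces = {p: restoring_forces([0.1], [K1_PRIOR_MEAN[p]], [p])[0]
                 for p in (1, 2, 3)}
```

At a drift of 0.1 m the three story-1 forces computed from the prior means coincide, which illustrates why the prior means of the coefficients differ by factors of ten between successive exponents.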
The prior parameters for the different exponents are chosen so that the force levels are similar with different exponents given the responses of this system (i.e., the interstory drifts are on the order of 0.1 m, so the coefficients in the linear, quadratic and cubic stiffness models have ratios of 1 : 10 : 100).

True Model Class and the Measurement Data

The measurement data $\mathcal{D}$, containing the time history $x_1(t)$ of the first floor displacement relative to the ground, is generated using (5.8) for model class 1−3−2 with $\mathbf{g}(\mathbf{x}) = [k_1 x_1, \; k_2 (x_2 - x_1)^3, \; k_3 |x_3 - x_2|^2 \,\mathrm{sgn}(x_3 - x_2)]^T$ (i.e., $p_1 = 1$, $p_2 = 3$ and $p_3 = 2$) with stiffness coefficients $k_1 = 22.5$ MN/m, $k_2 = 20.0$ MN/m$^3$ and $k_3 = 20.0$ MN/m$^2$. The measurements are sampled at 20 Hz and include 20% additive bandlimited Gaussian white noise. The base excitation $\ddot{x}_g(t)$ is the N-S component of the 18 May 1940 Imperial Valley earthquake recorded at the Imperial Valley Irrigation District substation in El Centro, California, sampled at 50 Hz, with peak acceleration 3.42 m/s$^2$.

Preprocessing: Intra-Model-Class Falsification

Model falsification is performed with each residual density $p(\epsilon)$ assumed to be independent zero-mean Gaussian with a standard deviation $\sigma$ that is 20% of the RMS of the measured first-mass displacement and with the target identification probability set at $\phi = 0.95$, resulting in an FDR-based likelihood bound of $\log \underline{L} = 266.4$, computed using the method described in Chapter 2, that will be used to falsify models. For each model class, 1000 models are randomly generated using the prior parameter Gaussian distribution statistics listed in Table 5.1. Out of the 27 possible combinations of different nonlinear stiffnesses of the three stories, only nine remain after the preprocessing step of intra-model class falsification: $\mathbb{M}_1 = \{1{-}i{-}j \mid i \in \{1, 2, 3\}, \, j \in \{1, 2, 3\}\}$, which means that the nonlinear models of the first spring are clearly unsuitable.
The fractions of models unfalsified in each of the 27 model classes are shown in Table 5.2. The result of this step also shows the limitation of using model falsification alone, as all model classes in $\mathbb{M}_1$ have almost the same fraction of unfalsified models.

Table 5.2: Unfalsified models using proposed intra-model class falsification. (Bold means unfalsified model classes.)

| Model class ($\mathcal{M}_k$) | Unfalsified (%) | Model class ($\mathcal{M}_k$) | Unfalsified (%) | Model class ($\mathcal{M}_k$) | Unfalsified (%) |
|---|---|---|---|---|---|
| **1−1−1** | 63.7 | 2−1−1 | 0.0 | 3−1−1 | 0.0 |
| **1−1−2** | 62.7 | 2−1−2 | 0.0 | 3−1−2 | 0.0 |
| **1−1−3** | 53.9 | 2−1−3 | 0.0 | 3−1−3 | 0.0 |
| **1−2−1** | 51.8 | 2−2−1 | 0.0 | 3−2−1 | 0.0 |
| **1−2−2** | 62.1 | 2−2−2 | 0.0 | 3−2−2 | 0.0 |
| **1−2−3** | 63.1 | 2−2−3 | 0.0 | 3−2−3 | 0.0 |
| **1−3−1** | 48.8 | 2−3−1 | 0.0 | 3−3−1 | 0.0 |
| **1−3−2** | 60.1 | 2−3−2 | 0.0 | 3−3−2 | 0.0 |
| **1−3−3** | 61.9 | 2−3−3 | 0.0 | 3−3−3 | 0.0 |

Bayesian Model Selection

Bayesian model selection is performed next for only the nine model classes that passed the preprocessing falsification step, with prior model class probabilities set as $P(\mathcal{M}_k) = 1/9$. The posterior model class probabilities are given in Table 5.3, which shows that the model class 1−3−2 has the maximum posterior model class probability of essentially 1.0; i.e., $\mathbb{M}_2 = \{1{-}3{-}2\}$ and $P(1{-}3{-}2\,|\,\mathcal{D}) \approx 1.0$. The means and the standard deviations of the posterior distribution of the stiffness parameters are obtained from the nested sampling algorithm and are shown in Table 5.4 along with their true values.

Table 5.3: Posterior model class probabilities after most of the model classes are rejected using a preprocessing step of intra-model-class falsification. (Relative log-evidence is with respect to the model class with the largest log-evidence.)

| Model class ($\mathcal{M}_k$) | log(Evidence) | rel. log(Evidence) | $P(\mathcal{M}_k \,\vert\, \mathcal{D})$ |
|---|---|---|---|
| 1−1−1 | 2290.9 | −71.4 | ≈ 0.0 |
| 1−1−2 | 2321.4 | −40.9 | ≈ 0.0 |
| 1−1−3 | 2310.5 | −51.8 | ≈ 0.0 |
| 1−2−1 | 2321.4 | −40.9 | ≈ 0.0 |
| 1−2−2 | 2351.0 | −11.3 | ≈ 0.0 |
| 1−2−3 | 2330.4 | −31.9 | ≈ 0.0 |
| 1−3−1 | 2340.3 | −22.0 | ≈ 0.0 |
| 1−3−2 | 2362.3 | 0.0 | 1.0 |
| 1−3−3 | 2331.0 | −31.3 | ≈ 0.0 |

Table 5.4: Posterior mean and standard deviation of model parameters for model class 1−3−2, and the true values.

| Stiffness coeff. | True value | Posterior mean | Posterior std. dev. |
|---|---|---|---|
| $k_1$ [MN/m] | 22.5 | 22.5110 | 0.0715 |
| $k_2$ [MN/m$^3$] | 20.0 | 20.7211 | 0.9466 |
| $k_3$ [MN/m$^2$] | 20.0 | 19.4666 | 0.7619 |

In this numerical example, due to the intra-model class falsification step, 18 out of 27 model classes are entirely falsified, requiring no evidence calculation for them. As $N_{2,k} \gg N_{1,k}$ for all model classes, the computational savings here is $C \approx 18/27 \times 100\% = 66.67\%$.

Postprocessing: Inter-Model-Class Falsification

Using the likelihood-bound falsification approach with the FDR/BH procedure, the lower limit for the model class evidence for validation is again $\log \underline{\mathcal{E}} = \log \underline{L} = 266.4$. For model class 1−3−2, $\log \mathcal{E}^{(k)} = 2362.3 > \log \underline{\mathcal{E}} = 266.4$, i.e., on average this model class also passes the postprocessing falsification step. Note that, with the current choice of $\underline{\mathcal{E}}$, all of the model classes that pass through the preprocessing step will also pass through the postprocessing step. However, if other model classes, e.g., 2−1−1 or 2−3−2, somehow pass through the preprocessing, then this choice of $\underline{\mathcal{E}}$ would stop them since $\log \mathcal{E}^{(2-1-1)} = -3716.69 < \log \underline{\mathcal{E}}$ and $\log \mathcal{E}^{(2-3-2)} = -3956.28 < \log \underline{\mathcal{E}}$.

Variations on the Method

One way to improve the validity of the model classes in $\mathbb{M}_1$ is to adjust their prior parameter distributions based on those retained after preprocessing falsification, using insights gained by applying the likelihood-bound falsification. This optional step, shown after the preprocessing block in Figure 5.1, is implemented here; the measurement data is divided into two sets: $\mathcal{D}_I$ consisting of every odd-numbered measurement of the response and $\mathcal{D}_{II}$ consisting of every even-numbered measurement of the response.
(Other simple divisions of the data in the time domain can be used but, for a response to a historical earthquake record that is nonstationary, this division of data is chosen to ensure that the nonlinear behavior of the system is pronounced in both datasets; a detailed discussion on dividing $\mathcal{D}$ into $\mathcal{D}_I$ and $\mathcal{D}_{II}$ is beyond the scope of this study.) The results from the preprocessing falsification step using $\mathcal{D}_I$ again give the same nine unfalsified model classes as before. For each of these nine model classes, the priors (which were originally Gaussian) are adjusted by fitting new Gaussian distributions to the unfalsified models' parameters (i.e., the new Gaussian priors' means and standard deviations are set equal to the sample mean and the sample standard deviation of the unfalsified models' parameters), and are then used as priors for the next step of Bayesian model selection. For example, for the model class 1-2-2, the prior for $k_1$ is changed to Gaussian with mean 23.65 MN/m and standard deviation 1.44 MN/m; i.e., the mean is closer to the true value and the standard deviation is about half that of the original prior.

The Bayesian model selection is performed with $\mathcal{D}_{II}$ for the nine model classes that pass the preprocessing step, with adjusted priors for the model parameters. The results for the two best model classes are shown in Table 5.5, which shows that the proposed framework is again able to find the correct model class. Table 5.5 also shows that using $\mathcal{D}_I$ to inform the priors makes model class 1-2-2 more competitive with the true model class 1-3-2, as evidenced by the difference in their log-evidence values reducing from 11.3 to only 3.9, though the posterior probability of 1-3-2 is still near unity. Hence, this modification of the proposed framework helps make candidate model classes competitive when they are very close to each other in their respective behavior.
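To see how a relative log-evidence of 3.9 translates into posterior model class probabilities, the following minimal Python sketch applies Bayes' rule with equal priors to the two largest log-evidence values reported in Table 5.5 (1170.1 for 1-2-2 and 1174.0 for 1-3-2). The function name `posterior_model_probabilities` is illustrative and not from the thesis, and the sketch assumes that the other seven model classes contribute negligibly to the normalization.

```python
import math

def posterior_model_probabilities(log_evidence, prior=None):
    """Posterior model class probabilities from log-evidence values (Bayes' rule)."""
    names = list(log_evidence)
    if prior is None:
        # Equal prior probability for every candidate model class
        prior = {name: 1.0 / len(names) for name in names}
    # Subtract the largest log-evidence before exponentiating to avoid overflow
    ref = max(log_evidence.values())
    weights = {n: prior[n] * math.exp(log_evidence[n] - ref) for n in names}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

# Log-evidence values of the two best model classes (Table 5.5)
probs = posterior_model_probabilities({"1-2-2": 1170.1, "1-3-2": 1174.0})
for name, p in probs.items():
    print(f"{name}: {p:.2f}")
```

Under these assumptions the sketch gives about 0.02 for 1-2-2 and 0.98 for 1-3-2, consistent with the tabulated posterior probabilities. Working with differences of log-evidence, rather than exponentiating values near 1170 directly, keeps the computation within floating-point range.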
Table 5.5: Posterior model class probabilities of the two best model classes after some model classes are rejected using preprocessing intra-model-class falsification and an adjustment of prior parameter distribution for model selection.

Model class (M_k)  log(Evidence)  rel. log(Evidence)  P(M_k|D)
1-2-2              1170.1         -3.9                0.02
1-3-2              1174.0          0.0                0.98

A second variation is to adjust the choice of the target identification probability $\phi$, changed here from 0.95 to 0.90, which gives fewer unfalsified models but still nine unfalsified model classes after the preprocessing intra-model-class falsification step, as shown in Table 5.6. Using these nine unfalsified model classes, the Bayesian model class selection and postprocessing again choose the 1-3-2 model class as the valid model class. This shows that adjusting $\phi$ within conventional ranges does not significantly affect the result (the effect of other values of $\phi$ is beyond the scope of this chapter).

Table 5.6: Unfalsified models using the proposed intra-model-class falsification with target identification probability $\phi = 0.90$.

Model class (M_k)  Unfalsified (%)
1-1-1              42.7
1-1-2              51.5
1-1-3              53.6
1-2-1              42.2
1-2-2              50.8
1-2-3              50.3
1-3-1              40.4
1-3-2              48.0
1-3-3              49.3

Figure 5.4: 4DOF models

5.3.2 Example II: 4DOF model with Hysteretic Isolation layer

Consider the base-isolated building or mechanical equipment models shown in Figure 5.4. The isolation often exhibits hysteretic behavior, introducing nonlinearity into an otherwise linear dynamical system. To simulate the system response to various classes of inputs, the behavior of the hysteretic elements in the isolation layer must be accurately modeled.
The true isolation model here is an elastoplastic element (e.g., a lead-rubber bearing in building isolation) with modest linear viscous damping. This four DOF system is subjected to base excitation $\ddot{x}_g$, a stationary filtered white noise generated using a Kanai-Tajimi filter [137] with spectral density

$$
S_{\ddot{x}_g \ddot{x}_g}(\omega) = S_0 \, \frac{4\zeta_g^2 \omega_g^2 \omega^2 + \omega_g^4}{\left(\omega^2 - \omega_g^2\right)^2 + 4\zeta_g^2 \omega_g^2 \omega^2}
\tag{5.12}
$$

where $\omega_g = 17$ rad/s and $\zeta_g = 0.3$ are assumed following Ramallo et al. [71], and the spectral intensity $S_0$ is calculated using

$$
S_0 = \sigma_w^2 \, \frac{0.03\,\zeta_g}{\pi \omega_g \left(4\zeta_g^2 + 1\right)} \, g^2
\tag{5.13}
$$

where $g$ is the gravitational acceleration. The constant $\sigma_w = 2$ is selected such that the nonlinearity in the system response is pronounced but not so large that the isolation layer is always beyond its yield point. Figure 5.5 shows a representative time history realization of $\ddot{x}_g(t)$. The equations of motion of the superstructure, if it were fixed base, are given by

$$
M_s \ddot{X}_s + C_s \dot{X}_s + K_s X_s = -M_s \mathbf{1} \ddot{x}_g
\tag{5.14}
$$

where $M_s$ is the $3\times3$ mass matrix as in (5.9) with $m_1 = m_2 = m_3 = 300$ Mg; $K_s$ is the $3\times3$ stiffness matrix similar to (5.10) with $\bar{k}_1 = \bar{k}_2 = \bar{k}_3 = 40$ MN/m; $\mathbf{1}_{3\times1}$ is a column vector of ones; and $X_s = [x_1 \; x_2 \; x_3]^T$ is the vector of floor displacements relative to the ground.

Figure 5.5: Representative time history realization of $\ddot{x}_g(t)$. (Axes: time (s); $\ddot{x}_g$ (m/s^2).)

Again, Rayleigh damping, i.e., $C_s = \beta_1 M_s + \beta_2 K_s$, is assumed with 3% damping for the first two modes, where $\beta_1$ and $\beta_2$ are constants evaluated from the damping ratios and the superstructure natural frequencies.
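The Rayleigh coefficients follow from requiring the modal damping ratio $\zeta_i = \beta_1/(2\omega_i) + \beta_2\omega_i/2$ to equal 3% at the first two natural frequencies. The sketch below is a minimal illustration under an assumption: it takes $K_s$ to have the standard shear-building form with equal story stiffnesses (the exact form of (5.10) is not reproduced here), for which the fixed-base natural frequencies have the closed form $\omega_j = 2\sqrt{k/m}\,\sin[(2j-1)\pi/(2(2n+1))]$. All function names are illustrative.

```python
import math

def uniform_chain_frequencies(k, m, n):
    """Natural frequencies (rad/s) of a fixed-free chain of n equal masses m
    joined by equal springs k (assumed uniform shear-building model)."""
    return [2.0 * math.sqrt(k / m) * math.sin((2 * j - 1) * math.pi / (2 * (2 * n + 1)))
            for j in range(1, n + 1)]

def rayleigh_coefficients(omega_a, omega_b, zeta):
    """beta1 (mass-proportional) and beta2 (stiffness-proportional) that give
    damping ratio zeta at both omega_a and omega_b."""
    beta1 = 2.0 * zeta * omega_a * omega_b / (omega_a + omega_b)
    beta2 = 2.0 * zeta / (omega_a + omega_b)
    return beta1, beta2

def modal_damping_ratio(omega, beta1, beta2):
    """Damping ratio implied by Rayleigh damping at frequency omega."""
    return beta1 / (2.0 * omega) + beta2 * omega / 2.0

m = 300e3  # floor mass: 300 Mg expressed in kg
k = 40e6   # story stiffness: 40 MN/m expressed in N/m
omegas = uniform_chain_frequencies(k, m, 3)
beta1, beta2 = rayleigh_coefficients(omegas[0], omegas[1], 0.03)
ratios = [modal_damping_ratio(w, beta1, beta2) for w in omegas]
print("frequencies (rad/s):", [round(w, 2) for w in omegas])
print("damping ratios:", [round(z, 4) for z in ratios])
```

By construction the first two modes recover the 3% target, while the third mode, which lies above the second anchor frequency, is damped somewhat more heavily. This is the usual behavior of Rayleigh damping outside the anchored frequency range.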
Combining the isolation layer and the superstructure equations of motion, the full system can be described by

$$
\begin{aligned}
M_s \ddot{X}_s + C_s \dot{X}_s + K_s X_s &= -M_s \mathbf{1} \ddot{x}_g + C_s \mathbf{1} \dot{x}_b + K_s \mathbf{1} x_b \\
m_b \ddot{x}_b + \mathbf{1}^T C_s \mathbf{1} \dot{x}_b + \mathbf{1}^T K_s \mathbf{1} x_b + f_b &= -m_b \ddot{x}_g + \mathbf{1}^T C_s \dot{X}_s + \mathbf{1}^T K_s X_s
\end{aligned}
\tag{5.15}
$$

where $m_b = 500$ Mg is the base mass; model classes for $f_b$, the sum of the isolation-layer damping and restoring forces, are discussed in the following paragraphs.

Candidate Model classes

A total of six model classes (four linear and two nonlinear, namely Bouc-Wen and bilinear) are considered, which have been described in Chapter 3. To assess the effectiveness of the proposed model validation framework for a nonlinear system, the Baber-Wen model with $\delta_\nu = 0.04$ and $\delta_\eta = 0.02$, to introduce degradation within 30 s [102], is used to generate a set of nonlinear dynamic response data for base acceleration $\ddot{x}_b$ at a sampling rate of 20 Hz for 30 s, giving $N_o = 601$, to which 20% white Gaussian noise is added.

Table 5.7: Priors for model parameters as applicable to each model class.

Parameter         True value   Distribution  Mean        Std. dev.
k_post            4.0 MN/m     Lognormal     4.5 MN/m    0.25 MN/m
c_b               20 kN·s/m    Lognormal     18 kN·s/m   4 kN·s/m
r_k               0.1667       Uniform       0.1600      0.0058
r_d               n/a †        Uniform       2.5         0.2887
Q_y (% of W) *    5.00         Uniform       4.75        0.2887

* W = weight of the structure = $(m_s + m_b)g$.
† Measurement data generated from the Baber-Wen model does not require this parameter.

Preprocessing: Intra-Model-Class Falsification

For each of the six model classes, 1000 models are drawn from independent distributions of the constitutive parameters using their prior distributions given in Table 5.7. The model falsification is applied with a zero-mean Gaussian likelihood function with standard deviation $\sigma$ assumed to be 25% of the noisy RMS base acceleration measurements; i.e., to 25% of the standard deviation of the data in $\mathcal{D}$. This assumed residual standard deviation is close to the actual noise present in the measurement data.
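Drawing the 1000 models per class requires converting the means and standard deviations in Table 5.7 into distribution parameters. The sketch below uses the standard moment relations for the uniform and lognormal distributions. The helper names and the random seed are illustrative choices rather than the thesis implementation, and only two of the parameters are sampled for brevity.

```python
import math
import random

def uniform_bounds(mean, std):
    """Bounds [a, b] of a uniform distribution with the given mean and std."""
    half_width = math.sqrt(3.0) * std  # std of U(a, b) is (b - a) / sqrt(12)
    return mean - half_width, mean + half_width

def lognormal_params(mean, std):
    """Mean and std of ln(X) for a lognormal X with the given mean and std."""
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    return math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2)

rng = random.Random(0)                   # fixed seed, chosen only for repeatability
a, b = uniform_bounds(4.75, 0.2887)      # Q_y prior, in % of W
mu, sigma = lognormal_params(4.5, 0.25)  # k_post prior, in MN/m

# One dictionary per sampled model; the two parameters are drawn independently
models = [{"Q_y": rng.uniform(a, b), "k_post": rng.lognormvariate(mu, sigma)}
          for _ in range(1000)]
print(round(a, 2), round(b, 2))
```

With these relations, the $Q_y$ prior corresponds to a uniform distribution on approximately [4.25, 5.25]% of $W$, which contains the true value of 5.00. The same conversion places the $r_k$ prior on approximately [0.15, 0.17].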
The target probability level is set at $\phi = 0.95$ for the falsification method. The results in Table 5.8 demonstrate that the FDR likelihood bounds successfully falsify all linear model classes. Further, the results indicate that both Bouc-Wen and bilinear models may be valid representations and are, therefore, deserving candidates for Bayesian model selection. For this four DOF example, the combined computational savings become $C \approx 66.67\%$. (The actual savings will be slightly smaller than this because of the ease of simulation of the linear systems compared to a nonlinear system, but this will not be a significant factor since the structure in this example has only four degrees of freedom.)

Table 5.8: Unfalsified models using the proposed intra-model-class falsification.

Model class (M_k)  Unfalsified (%)
AASHTO             0
JPWRI              0
CALTRANS           0
mod. AASHTO        0
Bouc-Wen           82.1
Bilinear           5.2

Bayesian Model Selection

By falsifying all linear model classes with FDR, the subsequent Bayesian model selection is significantly reduced in scale, from six plausible model classes to two, saving computation time by avoiding unnecessary evidence calculations. Assuming equal priors $P(\mathcal{M}_k) = 1/2$ for the nonlinear model classes, a subsequent Bayesian model class analysis assigns the Bouc-Wen model a posterior model class probability of essentially unity, leaving nearly zero probability for the bilinear model class (see Table 5.9).

Table 5.9: Posterior model class probabilities after some model classes are rejected using a preprocessing step of intra-model-class falsification.

Model class (M_k)  log(Evidence)  rel. log(Evidence)  P(M_k|D)
Bouc-Wen           490.3             0.0              1.0
Bilinear           27.2           -463.1              ≈ 0.0

Postprocessing: Inter-Model-Class Falsification

The Bouc-Wen model class also passes the postprocessing inter-model-class falsification step, as the log-evidence value for this model class is $\log E^{(\text{Bouc-Wen})} = 490.3 > \log E = -986.5 = \log L$.
(The evidence is the expected value of the likelihood of the measurement data for a model class; hence, a wide range of values is expected.) This result is expected since the Bouc-Wen model is very similar to the true Baber-Wen model class. For example, the evolution of degradation parameters and hysteresis loops of the Bouc-Wen and Baber-Wen models for this building model are shown in Figure 5.6. The current choice of $E = L$ accepts the bilinear model class as well; it would also have rejected all of the linear model classes if they had passed through the preprocessing. A stricter $E$ can be chosen, for example $E = 0.5 \max_k \max_i L(\theta_i; \mathcal{D}, \mathcal{M}_k)$, which will reject the bilinear model class even if it passes the preprocessing step and only accepts the Bouc-Wen model class. However, if the Baber-Wen model parameters were chosen so that it were to degrade faster, then the Bouc-Wen model might not remain a valid model class.

Figure 5.6: Degradation parameters and hysteresis loops of the Bouc-Wen and Baber-Wen models using the true values of the parameters. (a) Measure of response duration and severity $e$ and degradation shape functions $\eta(e)$ and $\nu(e)$, plotted against time (s). (b) Hysteresis loops of the Bouc-Wen and Baber-Wen models ($z$ against base displacement in cm).

This framework identifies the Bouc-Wen model class as the valid one, and does so efficiently, drawing from a smaller candidate pool in which invalid model classes were systematically removed. The model validation framework here eliminates the other model classes but validates the Bouc-Wen model class. Although the true Baber-Wen model class is not among the candidates, the hysteresis loops for both Bouc-Wen and Baber-Wen models with the same parameter values are very similar, which justifies the result that the framework validates the Bouc-Wen model class.
The posterior parameter means and standard deviations for the validated Bouc-Wen model class, shown in Table 5.10, are close to the true values (as expected).

Table 5.10: Posterior model parameters and their true values for the Bouc-Wen model class.

Parameter       True value   Mean             Std. dev.
k_post          4.0 MN/m     3.9082 MN/m      0.0764 MN/m
c_b             20 kN·s/m    18.0486 kN·s/m   1.1822 kN·s/m
r_k             0.1667       0.1628           0.0024
Q_y (% of W)    5.00         5.0017           0.0919

Variations on the Method

To assess the effect of the target probability $\phi$ on the framework's performance, the preprocessing intra-model-class falsification step is applied using another conventional value, $\phi = 0.90$, resulting in the fractions of remaining models listed in Table 5.11, which shows that two model classes, Bouc-Wen and bilinear, remain unfalsified. The Bayesian model selection and postprocessing inter-model-class falsification are applied next, resulting in validating the Bouc-Wen model class as before. This shows that the user-chosen target identification probability $\phi$, if altered modestly, does not change the number of unfalsified model classes after the preprocessing step.

Table 5.11: Unfalsified models using the proposed intra-model-class falsification with $\phi = 0.90$.

Model class (M_k)  Unfalsified (%)
AASHTO             0
JPWRI              0
CALTRANS           0
mod. AASHTO        0
Bouc-Wen           69.7
Bilinear           2.0

5.4 Conclusion

This chapter proposes a model validation framework that incorporates the model falsification from Chapter 2 as pre- and postprocessing steps for the Bayesian model selection discussed in Chapter 3. The first example shows how the proposed framework reduces the available model classes by 66.67%, giving large computational savings in the next step of Bayesian model selection, which requires the costly calculation of evidence.
The second example uses a candidate model class set that does not contain the true model class used to generate the measurements, but the validated model class is very close to the true one. Further, this framework provides computational savings of approximately 66.67% by eliminating most of the incorrect model classes. (Note that the elimination of two-thirds of the model classes in the preprocessing is particular to these examples, and it is coincidence that they eliminated the same fraction of model classes.)

Chapter 6
Applications

We tend to hear much more about the splendors returned than the ships that brought them or the shipwrights. It has always been that way.
Carl Sagan, Pale Blue Dot: A Vision of the Human Future in Space

6.1 Introduction

The methodology for the proposed model validation framework, along with modifications of each of its components, is established in the previous chapters. Here it is applied to examples from different fields. First, the computationally efficient response calculation algorithm discussed in Section 3.2.3 is used for design and design-under-uncertainty of passive control devices for a cable-stayed bridge. Next, an efficient forward uncertainty propagation method using the algorithm discussed in Section 3.2.3 is presented. Finally, a full-scale four-story building tested in Japan's E-Defense lab in 2013 and an example from computational fluid dynamics are used for the implementation of the proposed model validation framework.

6.2 Computationally Efficient Design of Passive Control Devices

Passive control of structures has been studied, in simulation and experiment, and implemented over the last few decades. In passive control strategies, the energy imparted by the external excitation is dissipated or redirected using energy dissipation devices [138–142] or isolation techniques [143–146]. Some such components include hysteretic or viscoelastic dampers, friction dampers, isolators, and tuned-mass or liquid dampers.
Although the superstructures are designed to remain elastic for most design level earthquakes, many of these passive control devices exhibit nonlinear behavior. The optimization of passive elements embedded in such structures, whether through a formal iterative optimization algorithm or a response surface generated from a parameter study, requires determining the system response multiple times, quickly becoming computationally expensive if not intractable for complex structures with nonlinear and uncertain components. While methods exist to reduce the order or complexity of structural models, the presence of nonlinear components makes it difficult (and often impossible) to know in advance the errors induced by such model reduction, particularly for design studies in which a wide range of component properties will be explored. Rather, to ensure accuracy, the full model must be retained.

Herein, the Volterra Integral Equation (VIE) approach discussed in Chapter 3 is utilized to determine optimal parameter values and placement locations of passive control devices (passive power-law dampers and nonlinear tuned-mass dampers) installed on a cable-stayed bridge. Localized uncertainty in the cable-stayed bridge is considered as well, forming a design-under-uncertainty problem.

The optimal parameter design of passive dampers has been studied extensively over the last thirty years. Constantinou et al. [147] designed a first-story damper by minimizing an estimate of maximum relative displacement of the top floor of the structure. Parameter studies on structures equipped with viscous and viscoelastic dampers added to a frame structure were performed by Fu and Kasai [148]. The dampers can also be designed using optimal control theory with the minimization of a quadratic performance index with viscous or hysteretic devices [149, 150].
In Lavan and Levy [151], the dampers are designed to minimize inter-story drift with assumed linear superstructure behavior, whereas Lavan [152] develops an iterative approach for damper design for nonlinear structures. Zare and Ahmadizadeh [153] use pole-placement, similar to that used for active control systems, to design optimal passive dampers and stiffness modification in a building. The optimal placement problem of dampers has been investigated by a variety of researchers [154–160]. Miguel et al. [161] use a genetic algorithm approach to choose friction damper locations and parameters to optimize the mean and variance of earthquake-induced response in the presence of uncertainties.

Frahm, in 1909, first suggested the use of tuned mass dampers [162]. Since then, theoretical expressions for optimal TMD parameters were derived by Warburton and Ayorinde [163–165] for harmonic and white noise excitations. Sadek et al. [166] suggested design formulas for TMDs under seismic excitations by providing large damping in the first two modes of vibration of a structure. Similarly, TMDs are designed by increasing modal damping in Miranda [167]. Lee et al. [168] gave an optimal design theory for TMDs by minimizing a performance metric based on the frequency domain response of the structural system. Hoang et al. [169] suggested an improvement over those expressions by considering the earthquake ground frequency content and the structural damping. The performance of nonlinear TMDs has been investigated in Gattulli et al. [170] and in Alexander and Schilder [171]. Application of multiple TMDs, as well as distributed mass dampers, to abate the superstructure's response over a wide frequency range has also been proposed [172–177].

In most of these prior studies, however, the passive dampers and TMDs were designed in a deterministic scenario; in reality, uncertainties are often present in material, loading or topological characteristics of a structure.
Robust design in the presence of these uncertainties has been investigated using different optimization methods [178]. Sandgren and Cameron embedded Monte Carlo simulation inside a genetic algorithm to account for the uncertainties present [179]. Calafiore and Dabbene performed average and worst-case design optimization in the presence of uncertainties with application to a truss structure [180]. Over the past few decades, reliability-based design optimization of these types of uncertain structures has been investigated using single- [181, 182] and multi-objective [183–185] optimization. Reliability-based optimization of system components using decision theory was performed in Enevoldsen and Sørensen [186]. A robust controller design method with parametric uncertainties using $H_2$- and reliability-based objectives has been proposed in Taflanidis et al. [187]. Neural networks have been used for reliability-based optimization of large-scale structures in Papadrakakis and Lagaros [188]. Randomized algorithms have been used by Tempo et al. for control design of such uncertain systems [189].

This section proposes a computationally efficient framework for the design of optimal passive control devices for systems that are mostly linear and deterministic but have localized nonlinear and uncertain elements. The use of the framework is illustrated using four numerical examples. These examples use a finite element model of the Bill Emerson Memorial Bridge across the Mississippi river between Cape Girardeau, Missouri, and East Cape Girardeau, Illinois [1]. The bridge has been studied extensively for different purposes, e.g., chaos analysis using a finite element model [190], damage detection by artificial neural network [191], aerodynamic vibration [192], model updating [193, 194], etc. The bridge is a benchmark for studying control problems as the FEM is freely available [1]. Dyke et al.
[1] designed a linear quadratic Gaussian (LQG) control for the bridge, whereas Caicedo et al. [195] also considered multisupport and transverse excitations and evaluated robustness of the strategies considering the addition of snow loads. He and Agrawal [196] studied passive dampers and hybrid passive/semiactive structural control strategies for this benchmark structure.

This section is organized as follows. First, a method for efficient response calculation, which exploits the localized nature of the design and uncertain components used in Section 3.2.3, is briefly mentioned. The optimization procedures, including both worst-case and average designs, are described. The numerical examples, using the benchmark cable-stayed bridge FEM [1], apply the proposed methodology for the design of linear/nonlinear passive damping devices and tuned mass dampers (TMDs). The cable-stayed bridge structure model is summarized; the device models are given; and the optimization objective and constraints are detailed. Then, four examples are used to demonstrate the efficacy of the proposed methodology: first, optimizing placement and parameters of passive dampers; second, optimizing the placement and parameters of TMDs; third, simultaneous optimization of both passive dampers and TMDs; and, finally, optimizing the passive dampers given TMDs with uncertain parameters.

6.2.1 Methodology

The proposed methodology for efficient optimal device design and design-under-uncertainty consists of two parts: efficient response calculation and the optimization approaches [197–199].

Efficient Response calculation

The response calculation algorithm described in Section 3.2.3 is used here.

Optimization procedure

The design problem is considered herein as a minimization of a cost function subject to some constraints.
This constrained optimization problem can be written as

$$
\min_{\theta \in \Theta} J(Y(t); \theta) \quad \text{subject to} \quad h(Y(t); \theta) \le 0
\tag{6.1}
$$

The specific forms of the cost function $J$ and constraint functions $h$ are discussed subsequently in the numerical examples. The optimal design value $\theta^\star$ is obtained with an iterative optimization algorithm. At each step $k$, the iterative technique tries to calculate a better solution giving a smaller $J$. In gradient-based methods, the gradients can be calculated using analytical expressions or approximated numerically using finite differences. The optimization herein is performed via Matlab's fmincon command implementing sequential quadratic programming (SQP). At step $k$, the nonlinear constrained optimization in (6.1) is approximated with a quadratic subproblem in SQP as [200, 201]

$$
\min_{d} \; [\nabla J(\theta_k)]^T d + \tfrac{1}{2} d^T \nabla^2 L(\theta_k)\, d \quad \text{subject to} \quad \nabla h(\theta_k)\, d + h(\theta_k) \le 0
\tag{6.2}
$$

where the search direction is $d = \theta - \theta_k$; the gradient is $\nabla = \partial/\partial\theta$; and the Lagrangian function is $L(\theta) = J(\theta) + \lambda^T h(\theta)$. The Hessian matrix of the Lagrangian is updated using a quasi-Newton method.

Design optimization under uncertainty

If some parts of the system model are also uncertain (e.g., characteristics of some of the structural control elements or other localized components), then Monte Carlo sampling can be used for uncertainty characterization that, in turn, further increases the computational requirements for optimization of such systems. The optimal design of passive nonlinear control devices requires repeated solutions of the nonlinear system. Multiplying these three costs (nonlinear simulation, Monte Carlo sampling, and design parameter iterations) often creates a significant computational burden, so there is a clear advantage in the design-under-uncertainty problem for a computationally efficient approach to solve for the response of locally nonlinear systems.
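The simulation count grows quickly when gradients are approximated by finite differences, as mentioned above for gradient-based methods: a central-difference gradient needs two cost evaluations per design parameter, and each evaluation stands for one nonlinear response simulation (or one per Monte Carlo sample under uncertainty). The sketch below is a minimal illustration only. The function `toy_cost` is a hypothetical stand-in for $J(Y(t);\theta)$, and the step-size rule is an assumption rather than the rule used by any particular optimizer.

```python
def finite_difference_gradient(cost, theta, rel_step=1e-6):
    """Central-difference approximation of the gradient of cost at theta.
    Uses 2 * len(theta) cost evaluations."""
    grad = []
    for j, value in enumerate(theta):
        step = rel_step * max(abs(value), 1.0)  # scale the step with the parameter
        up = list(theta)
        up[j] = value + step
        down = list(theta)
        down[j] = value - step
        grad.append((cost(up) - cost(down)) / (2.0 * step))
    return grad

evaluations = 0

def toy_cost(theta):
    """Hypothetical smooth cost; each call stands for one response simulation."""
    global evaluations
    evaluations += 1
    c, beta = theta
    return (c - 2.0) ** 2 + 3.0 * (beta - 0.5) ** 2 + c * beta

grad = finite_difference_gradient(toy_cost, [1.0, 1.0])
print(grad, evaluations)
```

For this toy cost the analytical gradient at $(1, 1)$ is $(-1, 4)$, which the central difference reproduces to within rounding error using four cost evaluations.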
In the presence of local uncertainties, the equation of motion for the design-under-uncertainty problem can be written as

$$
\begin{aligned}
\dot{X}(t) &= AX(t) + Bw(t) + L_d\, g_d(X_d(t); \theta) + L_u\, g_u(X_u(t); \delta) \\
&= AX(t) + Bw(t) + L\, g(\bar{X}(t); \theta, \delta); \qquad X(0) = x_0 \\
Y(t) &= CX(t) + Dw(t) + E_d\, g_d(X_d(t); \theta) + E_u\, g_u(X_u(t); \delta) \\
&= CX(t) + Dw(t) + E\, g(\bar{X}(t); \theta, \delta)
\end{aligned}
\tag{6.3}
$$

where $g_d(\cdot;\cdot) \in \mathbb{R}^{n_{g,d} \times 1}$ is a function of the state linear combination $X_d(t) = G_d X(t) \in \mathbb{R}^{n_{o,d} \times 1}$ and design parameters $\theta \in \mathbb{R}^{n_\theta \times 1}$; $g_u(\cdot;\cdot) \in \mathbb{R}^{n_{g,u} \times 1}$ is a function of the state linear combination $X_u(t) = G_u X(t) \in \mathbb{R}^{n_{o,u} \times 1}$ and uncertain parameters $\delta \in \mathbb{R}^{n_\delta \times 1}$; $L_d \in \mathbb{R}^{n \times n_{g,d}}$ is an influence matrix mapping to all states from the pseudoforce vector function $g_d$ arising due to design parameters; $L_u \in \mathbb{R}^{n \times n_{g,u}}$ is an influence matrix mapping to all states from the pseudoforce vector function $g_u$ arising due to uncertain parameters; $E_d \in \mathbb{R}^{n_y \times n_{g,d}}$ and $E_u \in \mathbb{R}^{n_y \times n_{g,u}}$ are influence matrices mapping $g_d$ and $g_u$ to $Y(t)$, respectively. The state linear combinations $X_d(t) \in \mathbb{R}^{n_{o,d} \times 1}$ and $X_u(t) \in \mathbb{R}^{n_{o,u} \times 1}$ are vectors with dimensions generally much less than $n$.

The pseudoforces from the uncertainties and nonlinearities can be combined to form $L = [L_d \;\; L_u] \in \mathbb{R}^{n \times n_g}$ (and, likewise, $E = [E_d \;\; E_u]$) and $g(\bar{X}(t); \theta, \delta) = [g_d^T(X_d(t); \theta) \;\; g_u^T(X_u(t); \delta)]^T \in \mathbb{R}^{n_g \times 1}$, where $n_g \le n_{g,d} + n_{g,u}$. Similarly, $\bar{X}(t) = [X_d^T(t) \;\; X_u^T(t)]^T = GX(t) \in \mathbb{R}^{n_o \times 1}$, where $n_o \le n_{o,d} + n_{o,u}$, can be formed with $G = [G_d^T \;\; G_u^T]^T$. The efficient response calculation algorithm described in Section 6.2.1 is effective for computing the controlled responses with local uncertainties in the structure. Of course, any redundant columns in $L$ (e.g., if two devices are collocated) can be eliminated and the dimension $n_g$ of $g(\bar{X}(t); \theta, \delta)$ can be reduced, simplifying the solution of VIE (3.15). Further, any redundant rows of $G$ can be eliminated and the dimension $n_o$ of $\bar{x}$ and $\bar{X}$ can be reduced.
The objectives for design-under-uncertainty must incorporate the effect of uncertainties on the response $X(t)$ and, subsequently, on the output vector $Y(t)$. Two different design scenarios are employed here: worst-case design and average design [180]. A brief description of these two formulations is given as follows.

Worst case design: In this design method, also called a minimax problem, the structure or control device is designed for the case when the cost function is maximized over the domain of uncertainty with constraints satisfied. This design optimization problem can be posed as

$$
\min_{\theta \in \Theta} \max_{\delta \in \Delta} J(Y(t); \theta, \delta) \quad \text{subject to} \quad h(Y(t); \theta, \delta_{\max}) \le 0 \;\; \text{a.s.}
\tag{6.4}
$$

where "a.s." denotes almost surely, i.e., the event happens with probability 1; $J(\cdot)$ and $h(\cdot)$ may be functionals of the entire trajectory of $Y$; the set of all possible values of design parameter $\theta$ is denoted by $\Theta$; $\Delta$ represents a probability space $\{\Omega, P, F\}$ with sample space $\Omega$, probability measure $P$, and $\sigma$-algebra $F$ corresponding to the uncertainty defined for the problem; $\delta_{\max}(\theta) \in \Delta$ is the element which, for a particular design $\theta$, maximizes $J(Y(t); \theta, \delta)$ subject to the satisfaction of the constraint $h$; and $J(\cdot)$ is assumed concave in $\delta$. The analytical solution of (6.4) is very unlikely to be available. However, with samples $\{\delta_i\}_{i=1}^{N_\delta}$ from $\Delta$, one may approximate the problem as

$$
\min_{\theta \in \Theta} \max_{i = 1, \dots, N_\delta} J(Y(t); \theta, \delta_i) \quad \text{subject to} \quad h(Y(t); \theta, \delta_{i,\max}(\theta)) \le 0
\tag{6.5}
$$

where $\delta_{i,\max}(\theta)$ corresponds to the sample that, for a particular design $\theta$, maximizes $J(Y(t); \theta, \delta)$ subject to the satisfaction of constraint $h$.

Average design: In the second method of design, the expected value of the cost is minimized while satisfying the constraints. This design problem can be posed as

$$
\min_{\theta \in \Theta} \mathrm{E}_\delta\left[J(Y(t); \theta, \delta)\right] \quad \text{subject to} \quad h(Y(t); \theta, \delta) \le 0 \;\; \text{a.s.}
\tag{6.6}
$$

The expected value of the objective function can be estimated using samples from the uncertainty distribution.
With samples $\{\delta_i\}_{i=1}^{N_\delta}$ from $\Delta$, the problem is approximated as

$$
\min_{\theta \in \Theta} \frac{1}{N_\delta} \sum_{i=1}^{N_\delta} J(Y(t); \theta, \delta_i) \quad \text{subject to} \quad h(Y(t); \theta, \delta_i) \le 0 \;\; \forall\, i = 1, \dots, N_\delta
\tag{6.7}
$$

6.2.2 Bridge Model and Optimization Objectives/Constraints

Cable-stayed bridge model

The cable-stayed bridge used herein is the Bill Emerson Memorial Bridge built in 2003 across the Mississippi river between Cape Girardeau, Missouri, and East Cape Girardeau, Illinois. A finite element model of the superstructure of this bridge was developed in 2003 by Dyke et al. [1], as shown in Fig. 6.1, consisting of 579 nodes, 128 cable elements, 162 beam elements, 420 rigid links and 134 nodal masses. The version of the model used herein has no connection between the bridge deck and the tower except through the cables so as to allow for energy dissipation devices to be placed between the deck and a tower. The initial 3474 degree-of-freedom (DOF) model, which describes linear motion about the static equilibrium, can be reduced to 909 DOFs when boundary conditions are imposed and slave DOFs removed. A static condensation is applied to eliminate DOFs with small contribution to the global response, resulting in the 419 DOF model that is described in the benchmark definition paper [1] and available online [https://nees.org/resources/3246], which is used in this example. The equation of motion of the bridge model is

$$
M_s \ddot{u}_s(t) + C_s \dot{u}_s(t) + K_s u_s(t) = -M_s r\, \ddot{u}_g(t)
\tag{6.8}
$$

where the mass, damping and stiffness matrices of the bridge superstructure for the active DOFs are $M_s$, $C_s$ and $K_s$, respectively; $r$ is the influence vector for the ground acceleration (i.e., it consists of ones corresponding to active horizontal displacement DOFs and zeros elsewhere); $u_s$ is the generalized displacement vector; and $\ddot{u}_g$ is the ground acceleration in the longitudinal direction. The reader is directed to the benchmark definition paper [1] for further details.
Figure 6.1: Finite element model of the bridge (dimensions in m); adapted from Dyke et al. [1].

Passive energy dissipation devices

To maintain the symmetric nature of the model, one or more pairs of passive energy dissipation devices are attached within the bridge symmetrically about the spine of the bridge deck. Two types of passive control devices are investigated in this study: (a) passive (viscous) dampers and (b) tuned mass dampers. The forces in the linear and nonlinear passive dampers are given by

$$
f_i =
\begin{cases}
c_i^{pd}\, \Delta\dot{u}_i, & \text{linear damper} \\
c_i^{pd}\, |\Delta\dot{u}_i|^{\beta_i^{pd}}\, \mathrm{sgn}(\Delta\dot{u}_i), & \text{power law damper}
\end{cases}
\qquad i = 1, \dots, 2n_{pd}
\tag{6.9}
$$

where $c_i^{pd}$ is the damping coefficient and $\Delta\dot{u}_i$ is the velocity across the $i$th damper. Different possible locations for the devices are also investigated in this study. The governing differential equations of the motion of the TMD masses are given by

$$
m_i^{tmd}\, \ddot{v}_i + c_i^{tmd}\, |\Delta\dot{v}_i|^{\beta_i^{tmd}}\, \mathrm{sgn}(\Delta\dot{v}_i) + k_i^{tmd}\, \Delta v_i = -m_i^{tmd}\, r_i^{tmd}\, \ddot{u}_g, \qquad i = 1, \dots, 2n_{tmd}
\tag{6.10}
$$

where $v_i$ is the displacement of the $i$th TMD relative to the ground, $\Delta v_i$ is the displacement of the $i$th TMD relative to its attachment point on the bridge, and $r_i^{tmd}$ is in $[-1, 1]$ depending on the $i$th TMD's orientation relative to the earthquake ground motion direction.

Objective function and constraints formulation

The location and parameter values of the passive control devices are optimized in this section to improve some performance metrics based on bridge response. Different performance metrics can be chosen; a set of metrics normalized with respect to the uncontrolled and connected deck-tower case, suggested by Dyke et al. [1], is used here.

Considering the 1940 El Centro earthquake excitation, nine performance metrics are used to form the objective function and constraints. The performance
The performance metrics (numbered consistently with the benchmark definition [1]) based on base shear at the two towers are
$$J_1 = \max_{i,t}\frac{|F_{bi}(t)|}{F^{\max}_{0b}}, \quad J_2 = \max_{i,t}\frac{|F_{di}(t)|}{F^{\max}_{0d}}, \quad J_7 = \max_{i}\frac{\|F_{bi}(t)\|}{\|F_{0b}(t)\|}, \quad J_8 = \max_{i}\frac{\|F_{di}(t)\|}{\|F_{0d}(t)\|} \tag{6.11}$$
where $F_{bi}(t)$ is the base shear at the $i$th tower at time $t$; $F^{\max}_{0b}$ is the maximum uncontrolled base shear at the two towers; $F_{di}(t)$ is the shear at the deck level of the $i$th tower at time $t$; $F^{\max}_{0d}$ is the maximum uncontrolled shear at the deck level at the two towers; $\|\cdot(t)\|$ denotes the root mean square over time; $\|F_{0b}(t)\|$ is the maximum of the normed value of uncontrolled base shear at the two towers (i.e., $\max_i \|F_{di}(t)\|$); and $\|F_{0d}(t)\|$ is the maximum of the normed value of uncontrolled shear at the deck level of the two towers [1]. The performance metrics based on overturning moments at the bases of the towers are
$$J_3 = \max_{i,t}\frac{|M_{bi}(t)|}{M^{\max}_{0b}}, \quad J_4 = \max_{i,t}\frac{|M_{di}(t)|}{M^{\max}_{0d}}, \quad J_9 = \max_{i}\frac{\|M_{bi}(t)\|}{\|M_{0b}(t)\|}, \quad J_{10} = \max_{i}\frac{\|M_{di}(t)\|}{\|M_{0d}(t)\|} \tag{6.12}$$
where $M_{bi}(t)$ is the overturning moment at the $i$th tower at time $t$; $M^{\max}_{0b}$ is the maximum uncontrolled overturning moment at the base of the two towers; $M_{di}(t)$ is the overturning moment at the deck level of the $i$th tower at time $t$; $M^{\max}_{0d}$ is the maximum uncontrolled overturning moment at the deck level of the two towers; $\|M_{0b}(t)\|$ is the maximum of the normed value of uncontrolled overturning moment at the base of the two towers; and $\|M_{0d}(t)\|$ is the maximum of the normed value of uncontrolled overturning moment at the deck level of the two towers [1].
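The two families of metrics in (6.11) and (6.12) differ only in whether a peak or a root mean square over time is taken before normalizing. A minimal sketch, assuming each tower's response is available as a uniformly sampled list; the function names and the short time histories are hypothetical and serve only to show the arithmetic:

```python
import math


def rms(series):
    """Root mean square over time of one sampled response history."""
    return math.sqrt(sum(v * v for v in series) / len(series))


def peak_metric(responses, peak_uncontrolled):
    """Peak metric, e.g. J_1: max over towers i and time t of |F_i(t)|, normalized."""
    return max(abs(v) for series in responses for v in series) / peak_uncontrolled


def normed_metric(responses, rms_uncontrolled):
    """Normed metric, e.g. J_7: max over towers i of the RMS response, normalized."""
    return max(rms(series) for series in responses) / rms_uncontrolled


# Hypothetical base-shear histories at the two towers (arbitrary units).
tower_1 = [0.0, 3.0, -4.0, 1.0]
tower_2 = [0.0, -2.0, 2.0, -2.0]
J_peak = peak_metric([tower_1, tower_2], peak_uncontrolled=8.0)
J_norm = normed_metric([tower_1, tower_2], rms_uncontrolled=5.0)
```

A value below one indicates that the controlled response is smaller than the uncontrolled reference used for normalization.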
Other performance metrics used are
$$J_6 = \max_{i,t}\frac{x_{bi}(t)}{x_{0b}}, \quad J_{12} = \max_{i,t}\frac{|f_i(t)|}{W}, \quad J_{13} = \max_{i,t}\frac{|y^d_i(t)|}{x^{\max}_0} \tag{6.13}$$
where $x_{bi}(t)$ is the displacement of the deck at Bent 1 or Pier 4 in the finite element model at time $t$; $x_{0b}$ is the maximum uncontrolled displacement of the deck; $f_i(t)$ is the amount of force exerted by the $i$th device; $W = 510$ MN is the weight of the bridge superstructure; $y^d_i(t)$ is the stroke of the $i$th device at time $t$; and $x^{\max}_0$ is the maximum displacement at the top of the two towers relative to the ground in the uncontrolled case [1]. Performance metrics $J_{12}$ and $J_{13}$ are used to monitor the viability of the optimized device configuration. While the benchmark definition [1] considers several ground motions, only the 1940 El Centro earthquake is used herein as the ground excitation in the x-direction; each simulation is run for 200 s.

Table 6.1: Performance metrics [1]
| Responses | Peak Metric | Normed Metric |
|---|---|---|
| Base shear | $J_1 = \max_{i,t} |F_{bi}(t)| / F^{\max}_{0b}$ | $J_7 = \max_i \|F_{bi}(t)\| / \|F_{0b}(t)\|$ |
| Shear at deck level | $J_2 = \max_{i,t} |F_{di}(t)| / F^{\max}_{0d}$ | $J_8 = \max_i \|F_{di}(t)\| / \|F_{0d}(t)\|$ |
| Overturning moment | $J_3 = \max_{i,t} |M_{bi}(t)| / M^{\max}_{0b}$ | $J_9 = \max_i \|M_{bi}(t)\| / \|M_{0b}(t)\|$ |
| Moment at deck level | $J_4 = \max_{i,t} |M_{di}(t)| / M^{\max}_{0d}$ | $J_{10} = \max_i \|M_{di}(t)\| / \|M_{0d}(t)\|$ |
| Deck displacement at abutment | $J_6 = \max_{i,t} x_{bi}(t) / x_{0b}$ | |

Using these performance metrics, the deterministic optimization problem is formulated as
$$\min_{\theta\in\Theta} J_1(\theta) \quad \text{subject to} \quad J_k(\theta)\le \alpha J_{k,0}, \quad k = 2, 3, 4, 6, 7, 8, 9, 10, \qquad \Theta = \{\theta : \theta^{lb}_j \le \theta_j \le \theta^{ub}_j,\ j = 1,\dots,n_\theta\} \tag{6.14}$$
where $n_\theta$ is the total number of device parameters for all devices and $J_{k,0}$ corresponds to the $k$th performance metric with no devices. The lower and upper limits of parameter $\theta_j$ are given by $\theta^{lb}_j$ and $\theta^{ub}_j$, respectively. Initially, $\alpha$ is set to unity, but it is subsequently increased to 1.25, as detailed below, to further decrease the cost $J_1$ while mildly relaxing the constraints.
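The effect of the relaxation factor $\alpha$ in (6.14) can be seen by writing each constraint in the form $h_k = J_k - \alpha J_{k,0} \le 0$. The sketch below is illustrative only: the metric values are invented, and the helper names are hypothetical rather than part of the thesis implementation.

```python
CONSTRAINED = (2, 3, 4, 6, 7, 8, 9, 10)  # metric indices k constrained in (6.14)


def constraint_values(J, J0, alpha):
    """Return h_k = J_k - alpha * J_k0 for each constrained metric; feasible if all <= 0."""
    return {k: J[k] - alpha * J0[k] for k in CONSTRAINED}


def is_feasible(J, J0, alpha):
    """True when every constrained metric stays within alpha times its no-device value."""
    return all(h <= 0.0 for h in constraint_values(J, J0, alpha).values())


# Hypothetical metric values: J_3 exceeds its no-device value by 10%.
J0 = {k: 1.0 for k in CONSTRAINED}
J = {k: 0.9 for k in CONSTRAINED}
J[3] = 1.1

feasible_strict = is_feasible(J, J0, alpha=1.0)    # candidate rejected
feasible_relaxed = is_feasible(J, J0, alpha=1.25)  # candidate admitted
```

Raising $\alpha$ from 1 to 1.25 enlarges the feasible set, which is why designs with a lower $J_1$ become admissible.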
The optimization is performed using the fmincon command of Matlab, with default tolerance values, implementing the sequential quadratic programming described in section 6.2.1. While fmincon can be provided analytical gradient information, numerical forward finite differences are used here to evaluate the gradients.

6.2.3 Numerical examples

The proposed approach demonstrates increased computational efficiency over traditional approaches when many simulations are required for optimization problems, as illustrated through four numerical examples in this section. The first example aims to find the best parameters and location of passive dampers to minimize a key structural response metric while constraining others for the deterministic problem; the second example is similar but uses TMDs instead of passive dampers. In the third example, the design optimization simultaneously chooses the parameter values of the passive dampers and TMDs. The final example considers the design optimization of locations and parameter values of passive dampers for a cable-stayed bridge with some local uncertain components.

Figure 6.2: Six candidate device locations denoted as I, II, III, IV, V and VI.

Example I: Optimal design of passive dampers for a cable-stayed bridge

The proposed method is implemented in this section for the optimal design of $2n_{pd}$ passive structural devices, both linear viscous and power law dampers, in a cable-stayed bridge, optimizing the device locations, damper coefficients and (for the power law dampers) the velocity exponents.

Formulation: The state space representation of the equations of motion of the bridge is in the form of (6.15) where
$$X(t) = \begin{bmatrix} u_s(t) \\ \dot{u}_s(t) \end{bmatrix}, \quad A = \begin{bmatrix} 0 & I \\ -M_s^{-1}K_s & -M_s^{-1}C_s \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ r \end{bmatrix}, \quad L = \begin{bmatrix} 0 \\ M_s^{-1}R^{pd} \end{bmatrix}$$
and $R^{pd}$ is the influence matrix for the dampers. Each of the $2n_{pd}$ columns of $R^{pd}$ transforms the force of one damper to a global force vector.
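The block structure of the state matrix $A$ above can be sketched for a small system. This is only an illustration under a simplifying assumption that is not made in the thesis: the mass matrix is taken to be diagonal (lumped) so that $M_s^{-1}$ reduces to a division. The function name and the single-DOF values are hypothetical.

```python
def state_matrix(M_diag, C, K):
    """Assemble A = [[0, I], [-M^-1 K, -M^-1 C]] for a diagonal (lumped) mass matrix.

    M_diag holds the diagonal of M; C and K are full n-by-n lists of lists.
    """
    n = len(M_diag)
    A = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        A[i][n + i] = 1.0  # upper-right identity block
        for j in range(n):
            A[n + i][j] = -K[i][j] / M_diag[i]      # lower-left block: -M^-1 K
            A[n + i][n + j] = -C[i][j] / M_diag[i]  # lower-right block: -M^-1 C
    return A


# Hypothetical single-DOF oscillator: m = 2, c = 0.4, k = 8.
A = state_matrix([2.0], [[0.4]], [[8.0]])
```

For a full (non-diagonal) mass matrix a linear solve would replace the division, but the block layout of $A$ is unchanged.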
To be consistent with the symmetric nature of the bridge model, the passive devices are placed in identical pairs symmetrically located about the bridge deck centerline; i.e., let $c^{pd}_{2i-1} = c^{pd}_{2i}$ and $\beta^{pd}_{2i-1} = \beta^{pd}_{2i}$ for $i = 1,\dots,n_{pd}$. Hence, in the power law damper case,
$$g\big(\bar{X}(t);\theta\big) = \begin{bmatrix} -c^{pd}_{2}\,|\Delta\dot{u}_1|^{\beta^{pd}_{2}}\operatorname{sgn}(\Delta\dot{u}_1) \\ -c^{pd}_{2}\,|\Delta\dot{u}_2|^{\beta^{pd}_{2}}\operatorname{sgn}(\Delta\dot{u}_2) \\ \vdots \\ -c^{pd}_{2n_{pd}}\,|\Delta\dot{u}_{2n_{pd}-1}|^{\beta^{pd}_{2n_{pd}}}\operatorname{sgn}(\Delta\dot{u}_{2n_{pd}-1}) \\ -c^{pd}_{2n_{pd}}\,|\Delta\dot{u}_{2n_{pd}}|^{\beta^{pd}_{2n_{pd}}}\operatorname{sgn}(\Delta\dot{u}_{2n_{pd}}) \end{bmatrix}$$
where $\bar{X}(t) = GX(t) = [\Delta\dot{u}_1 \ \ \Delta\dot{u}_2 \ \ \cdots \ \ \Delta\dot{u}_{2n_{pd}-1} \ \ \Delta\dot{u}_{2n_{pd}}]^T$; $\theta = [c^{pd}_2 \ \ \beta^{pd}_2 \ \ \cdots \ \ c^{pd}_{2n_{pd}} \ \ \beta^{pd}_{2n_{pd}}]^T$; and $G = [0 \ \ (R^{pd})^T]$. The range for the damping coefficients is taken as $c^{pd}_{2i} \in [0.5, 30]$ MN·(s/m)$^{\beta^{pd}_{2i}}$, where $\beta^{pd}_{2i} = 1$ for the linear viscous damper and is taken to be $0.2 \le \beta^{pd}_{2i} \le 1.8$ for power law dampers [202]. Hence, the parameters in this example are $\theta = c^{pd}_2$ for one pair of identical linear viscous dampers and $\theta = [c^{pd}_2 \ \ \beta^{pd}_2]^T$ for one pair of identical nonlinear power law dampers. The initial guess for the optimal point is $c^{pd}_{2i} = 10$ MN·(s/m), which lies within the allowable range for linear dampers; for the power law dampers, the optimal damping coefficients $c^{pd}_{2i}$ from the linear damper results are used as initial guesses with exponents $\beta^{pd}_{2i} = 1.0$.

Results: The passive damping devices are connected from the deck to either Bent 1 (B1) at the abutment, Pier 2 (P2), Pier 3 (P3) or Pier 4 (P4). For aesthetic reasons and to eliminate the effects of connection compliance, the lengths of the dampers are limited to 10 m in the horizontal and vertical directions, which determines the set of possible device configurations. Details of node numbers can be found in the freely available finite element model [1].
Table 6.2 tabulates the optimal damper values for the $\alpha = 1$ case (i.e., constrain all cost metrics $J_i$ to be no more than their values in the uncontrolled structure) for linear dampers for different positions of one pair of devices. The maximum reduction in $J_1$ obtained with a pair of linear passive dampers is 13.0676%. However, the reduction of $J_1$ is much more pronounced if the constraint metrics $\{J_2,\dots,J_4, J_6,\dots,J_{10}\}$ are allowed to modestly exceed their corresponding values in the uncontrolled structure. Thus, the constraints are relaxed by setting $\alpha = 1.25$ to expand the set of feasible solutions and allow for greater reduction in objective metric $J_1$. Table 6.3 tabulates the optimal damper values with this $\alpha = 1.25$ for different positions of one pair of linear dampers. Table 6.4 shows the optimal nonlinear damper parameters, with $\alpha = 1.25$, for different positions of one pair of devices. At the two best locations for the nonlinear dampers (devices attached to Pier 3, connecting nodes 319 to 184 and 324 to 117 in location IV and connecting nodes 319 to 186 and 324 to 119 in location V; see Figure 6.2), the metric $J_{12}$ (peak damper force) is 2.28% and 2.44% of the structure weight, respectively, and the metric $J_{13}$ (normalized peak damper stroke) is 0.8195 and 0.9099, respectively. By relaxing the constraints, the maximum percentage reduction in $J_1$ has roughly doubled, from about 13% with constraint $\alpha = 1$, as shown in Table 6.2, to about 27% with $\alpha = 1.25$, as shown in Tables 6.3 and 6.4. The results also show significant reduction in $J_1$ using linear and nonlinear dampers attached to Pier 3. However, with relaxed constraints, the optimal nonlinear damper does not give a better solution than the linear one. Hence, a pair of linear dampers with damping coefficients $c^{pd}_2 = 22.2268$ MN·(s/m) connecting nodes 319 to 184 and 324 to 117 is the optimal solution.
Table 6.2: Feasible solutions of linear damper configuration
| Region | Connecting nodes: left damper, right damper | Reduction of cost $J_1$ (%) | $c^{pd}_2$ [MN·s/m] | # fcn. evals. |
|---|---|---|---|---|
| II – P2 | 204-525, 209-459 | 8.0399 | 30.0000 † | 6 |
| II – P2 | 205-525, 210-459 | 10.8527 | 30.0000 † | 14 |
| II – P2 | 206-525, 211-459 | 10.9413 | 19.9059 | 12 |
| III – P2 | 204-527, 209-461 | 6.6224 | 30.0000 † | 6 |
| III – P2 | 205-527, 210-461 | 10.4540 | 28.8429 | 10 |
| III – P2 | 206-527, 211-461 | 10.4540 | 22.2830 | 10 |
| IV – P3 | 318-184, 323-117 | 13.0676 | 23.1692 | 12 |
| V – P3 | 318-186, 323-119 | 0.4127 | 16.1128 | 10 |
| I – B1 | ab-136, ab-69 | 3.1894 | 30.0000 † | 6 |
| I – B1 | ab-137, ab-70 | 2.3034 | 30.0000 † | 8 |
| I – B1 | ab-138, ab-70 | 7.3311 | 30.0000 † | 6 |
† 30 MN·(s/m) is used as the upper limit for $c^{pd}_2$ in the optimization.

Table 6.3: Feasible solutions of linear damper configuration with relaxed constraints, and the corresponding cost reductions relative to that with no damper
| Region | Connecting nodes: left damper, right damper | Reduction of cost $J_1$ (%) | $c^{pd}_2$ [MN·s/m] | # fcn. evals. |
|---|---|---|---|---|
| II – P2 | 204-525, 209-459 | 8.03987 | 30.0000 † | 6 |
| II – P2 | 205-525, 210-459 | 12.3588 | 29.3148 | 53 |
| II – P2 | 206-525, 211-459 | 13.4662 | 27.1434 | 57 |
| III – P2 | 204-527, 209-461 | 6.6224 | 30.0000 † | 6 |
| III – P2 | 205-527, 210-461 | 10.7641 | 30.0000 † | 6 |
| III – P2 | 206-527, 211-461 | 12.8018 | 30.0000 † | 6 |
| IV – P3 | 318-184, 323-117 | 15.5260 | 30.0000 † | 6 |
| IV – P3 | 319-184, 324-117 | 27.5969 | 22.2268 | 48 |
| V – P3 | 318-186, 323-119 | 14.9059 | 30.0000 † | 6 |
| V – P3 | 319-186, 324-119 | 26.6667 | 20.8560 | 75 |
| I – B1 | ab-136, ab-69 | 3.1894 | 30.0000 † | 6 |
| I – B1 | ab-137, ab-70 | 2.3034 | 30.0000 † | 8 |
| I – B1 | ab-138, ab-70 | 7.3311 | 30.0000 † | 6 |
† 30 MN·(s/m) is used as the upper limit for $c^{pd}_2$ in the optimization.

Table 6.4: Feasible solutions of nonlinear damper configuration with relaxed constraints, and the corresponding cost reductions relative to that with no damper
| Region | Connecting nodes: left damper, right damper | Reduction of cost $J_1$ (%) | $c^{pd}_2$ [MN·(s/m)$^{\beta^{pd}_2}$] | $\beta^{pd}_2$ | # fcn. evals. |
|---|---|---|---|---|---|
| II – P2 | 204-525, 209-459 | 9.2580 | 30.0000 † | 0.8727 | 62 |
| II – P2 | 205-525, 210-459 | 12.3588 | 29.3148 | 1.0000 | 15 |
| II – P2 | 206-525, 211-459 | 13.4662 | 27.1434 | 1.0000 | 29 |
| III – P2 | 204-527, 209-461 | 8.4607 | 30.0000 | 0.8011 | 63 |
| III – P2 | 205-527, 210-461 | 11.4729 | 30.0000 † | 0.9301 | 70 |
| III – P2 | 206-527, 211-461 | 13.9092 | 26.0253 | 0.7905 | 113 |
| IV – P3 | 318-184, 323-117 | 18.2724 | 30.0000 † | 0.7439 | 20 |
| IV – P3 | 319-184, 324-117 | 27.5969 | 22.2266 | 1.0000 | 30 |
| V – P3 | 318-186, 323-119 | 16.5891 | 30.0000 † | 0.8040 | 27 |
| V – P3 | 319-186, 324-119 | 26.8660 | 20.5440 | 0.9777 | 50 |
| I – B1 | ab-136, ab-69 | 14.7287 | 30.0000 † | 0.2000 † | 9 |
| I – B1 | ab-137, ab-70 | 5.0941 | 30.0000 † | 0.2000 † | 9 |
| I – B1 | ab-138, ab-70 | 16.1462 | 29.9791 | 0.2000 † | 47 |
† 30 MN·(s/m)$^{\beta^{pd}_2}$ is used as the upper limit for $c^{pd}_2$ and 0.2 is used as the lower limit for $\beta^{pd}_2$ in the optimization.

Next, the optimal design of two pairs of nonlinear dampers placed symmetrically about the bridge deck is performed for a few possible configurations. The parameter vector for this example becomes $\theta = [c^{pd}_2 \ \ \beta^{pd}_2 \ \ c^{pd}_4 \ \ \beta^{pd}_4]^T$. The ranges for the damping coefficients and the exponents are prescribed to be $c^{pd}_i \in [0.5, 30]$ MN·(s/m)$^{\beta^{pd}_i}$ and $\beta^{pd}_i \in [0.2, 1.8]$, respectively, for $i = 2, 4$. The initial guess for the parameter vector is obtained from the result of optimization for nonlinear passive dampers in Table 6.4. The optimal parameter values for two pairs of nonlinear passive dampers are shown in Table 6.5. The results show that the optimal nonlinear dampers at the two best locations do not give a better reduction of $J_1$ than the one-pair case. However, a further reduction of $J_1$ can be obtained when one pair of passive dampers is used in combination with TMDs, as demonstrated in Example III.

Table 6.5: Optimization of two pairs of power-law dampers (each configuration occupies two rows; the cost reduction and number of function evaluations apply to the configuration as a whole)
| Region | Connecting nodes: left damper, right damper | Reduction of cost $J_1$ (%) | $c^{pd}_i$ [MN·(s/m)$^{\beta^{pd}_i}$] | $\beta^{pd}_i$ | # fcn. evals. |
|---|---|---|---|---|---|
| IV – P3 | 319-184, 324-117 | 26.4230 | 0.5000 † | 1.7319 | 186 |
| V – P3 | 319-186, 324-119 | | 16.1602 | 0.8018 | |
| IV – P3 | 319-184, 324-117 | 15.0609 | 30.0000 † | 1.1970 | 241 |
| II – P2 | 206-525, 211-459 | | 26.1564 | 0.8019 | |
† $c^{pd} \in [0.5, 30.0]$ MN·(s/m)$^{\beta^{pd}}$ is used as the range of the damping coefficient in the optimization.

Computation time for the proposed method:

To evaluate the computational efficiency of the proposed method, it is compared with a traditional ordinary differential equation solver: Matlab's ode45 (a fourth-order Runge-Kutta method with adaptive time-steps) with relative accuracy of $10^{-3}$ and absolute accuracy of $10^{-6}$. The proposed method is implemented with second-order accurate trapezoidal and fourth-order accurate quadrature rules. With $2^{15}$ time steps of $\Delta t = 6.1$ ms each, for the linear optimal design configuration IV – P3 from Table 6.3, the proposed method with the fourth-order accurate quadrature rule gives a relative difference in the RMS cross-device displacement of about 0.23%, and 0.72% for velocity, compared to ode45. With $2^{16}$ time steps of $\Delta t = 3.1$ ms each, the relative errors are 0.18% and 0.23%, respectively. As the relative accuracy of ode45 is $O(10^{-3})$, the accuracy of the proposed method is at least $O(10^{-3})$ as well.

The computational cost of the proposed optimization method involves both one-time and repeated calculations. Table 6.6 tabulates these costs for a typical case. The total cost and computational gain achieved are compared to ode45 for design optimization using fmincon with 20 function evaluations. The computation times are computed using the Matlab command cputime on a computer with a 2.3 GHz Core i7-4850HQ processor, 16 GB RAM, Mac OS X, and running Matlab 2013a and 2014b.
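The split into one-time and repeated costs implies a simple cost model: total time = one-time cost + (number of function evaluations) × (repeated cost per evaluation). The sketch below applies this model to the CPU times reported in Table 6.6 for 20 function evaluations; the function names are hypothetical, and the grouping of the $x(t)$ and $H_L(t)$ columns as one-time and the $p(t)$ and $Y(t)$ columns as repeated is read from the table header.

```python
def total_cpu_seconds(one_time_s, repeated_s, n_evals):
    """Total cost: one-time computations plus n_evals repeated evaluations."""
    return one_time_s + n_evals * repeated_s


def gain(reference_s, proposed_s):
    """Computational gain: reference solver time divided by proposed-method time."""
    return reference_s / proposed_s


ODE45_TOTAL_S = 35.22 * 3600.0  # ode45 total for 20 evaluations, from Table 6.6

# Delta t = 3.1 ms row: one-time x(t), H_L(t); repeated p(t), Y(t).
total_3ms = total_cpu_seconds(65.67 + 44.46, 35.19 + 20.47, n_evals=20)
# Delta t = 6.1 ms row.
total_6ms = total_cpu_seconds(33.30 + 22.44, 14.19 + 10.15, n_evals=20)

gain_3ms = gain(ODE45_TOTAL_S, total_3ms)
gain_6ms = gain(ODE45_TOTAL_S, total_6ms)
```

Because the one-time part is paid only once, the gain grows with the number of function evaluations until the repeated cost dominates.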
The results in Table 6.6 demonstrate that a significant computational speedup can be achieved using the proposed method while keeping the relative RMS error at $O(10^{-3})$. A visual comparison of the accuracy of the response computed by the proposed method with that from Matlab's ode45 is shown in Figure 6.3.

Figure 6.3: Accuracy of the proposed approach ($\Delta t = 6.1$ ms, a fourth-order quadrature) compared to Matlab's ode45. [Two panels, displacement (mm) and velocity (mm/s) versus time (s) over 0 to 60 s, each comparing ode45 with the VIE approach at $\Delta t = 6.1$ ms.]

Table 6.6: Computational gain achieved using the proposed method with fourth-order accurate quadrature scheme for passive damper design
| $\Delta t$ | Proposed, 1 fcn. eval., one-time: $x(t)$ | Proposed, 1 fcn. eval., one-time: $H_L(t)$ | Proposed, 1 fcn. eval., repeated: $p(t)$ | Proposed, 1 fcn. eval., repeated: $Y(t)$ | Total, 20 fcn. evals.: Proposed | Total, 20 fcn. evals.: ode45 | Gain in comp. efficiency |
|---|---|---|---|---|---|---|---|
| 3.1 ms | 65.67 s | 44.46 s | 35.19 s | 20.47 s | 20.39 min. | 35.22 hrs. | 103.64 |
| 6.1 ms | 33.30 s | 22.44 s | 14.19 s | 10.15 s | 9.04 min. | 35.22 hrs. | 233.76 |

Example II: Tuned mass dampers

In the second example, $n_{tmd}$ pairs of linear or nonlinear tuned mass dampers (TMDs) are attached symmetrically to the cable-stayed bridge described in section 6.2.2. The TMDs are attached to the top of the tower (preliminary studies show that TMDs attached to the bridge deck do not reduce the objective $J_1$ by a significant amount; hence, deck locations are not considered herein). Both towers are taken as candidate locations for the TMDs (Figure 6.4).

Formulation: The state-space equation of motion of the structure with TMDs is in the form of (6.15) where $X(t)$, $A$, $B$, $L$ have forms similar to those in Example I but with $u_s(t)$, $M_s$, $K_s$, $C_s$, $r$, and $R^{pd}$ replaced by $u(t)$, $M$, $K$, $C$, $\bar{r}$, and $R^{tmd}$, respectively. $\bar{r} = [r^T \ \ r_1 \ \ r_2 \ \ \cdots \ \ r_{2n_{tmd}}]^T$ is the influence vector for the ground acceleration; and $R^{tmd}$ is the influence matrix for the TMDs.
The displacement vector consists of both superstructure DOFs $u_s(t)$ and TMD DOFs $v^{tmd}(t)$; i.e., $u(t) = [u_s^T(t) \ \ (v^{tmd})^T(t)]^T$. The mass, nominal stiffness and nominal damping matrices of the tuned mass dampers are denoted by $m^{tmd}$, $k^n$, and $c^n$, respectively. With the nominal TMDs, the mass, stiffness and damping matrices are given by
$$M = \begin{bmatrix} M_s & 0 \\ 0 & m^{tmd} \end{bmatrix}, \quad K = \begin{bmatrix} K_s + T^T k^n T & -T^T k^n \\ -k^n T & k^n \end{bmatrix} \quad \text{and} \quad C = \begin{bmatrix} C_s + T^T c^n T & -T^T c^n \\ -c^n T & c^n \end{bmatrix}$$
where $T$ is a transformation matrix that transforms the nominal stiffness and damping matrices from local TMD coordinates to global coordinates. Each of the $2n_{tmd}$ columns of $R^{tmd}$ transforms the force of one TMD to a global force vector. To be consistent with the symmetric nature of the bridge model, the passive TMDs are placed in identical pairs located symmetrically about the bridge deck centerline; i.e., let $c^{tmd}_{2i-1} = c^{tmd}_{2i}$, $\beta^{tmd}_{2i-1} = \beta^{tmd}_{2i}$, $m^{tmd}_{2i-1} = m^{tmd}_{2i}$ and $k^{tmd}_{2i-1} = k^{tmd}_{2i}$ for $i = 1,\dots,n_{tmd}$. For the optimal design of $n_{tmd}$ pairs of linear TMDs, the design parameter vector is taken as $\theta = [\Delta k^{tmd}_2 \ \ \Delta c^{tmd}_2 \ \ \cdots \ \ \Delta k^{tmd}_{2n_{tmd}} \ \ \Delta c^{tmd}_{2n_{tmd}}]^T$, which leads to
$$g(\bar{X}(t);\theta) = \begin{bmatrix} -\Delta k^{tmd}_2\,\Delta v_1 - \Delta c^{tmd}_2\,\Delta\dot{v}_1 \\ -\Delta k^{tmd}_2\,\Delta v_2 - \Delta c^{tmd}_2\,\Delta\dot{v}_2 \\ \vdots \\ -\Delta k^{tmd}_{2n_{tmd}}\,\Delta v_{2n_{tmd}-1} - \Delta c^{tmd}_{2n_{tmd}}\,\Delta\dot{v}_{2n_{tmd}-1} \\ -\Delta k^{tmd}_{2n_{tmd}}\,\Delta v_{2n_{tmd}} - \Delta c^{tmd}_{2n_{tmd}}\,\Delta\dot{v}_{2n_{tmd}} \end{bmatrix}$$
where $\bar{X}(t) = GX(t) = [\Delta v_1 \ \ \Delta v_2 \ \ \cdots \ \ \Delta v_{2n_{tmd}-1} \ \ \Delta v_{2n_{tmd}} \ \ \ \Delta\dot{v}_1 \ \ \Delta\dot{v}_2 \ \ \cdots \ \ \Delta\dot{v}_{2n_{tmd}-1} \ \ \Delta\dot{v}_{2n_{tmd}}]^T$ and
$$G = \begin{bmatrix} (R^{tmd})^T & \\ & (R^{tmd})^T \end{bmatrix}.$$
In the nonlinear TMD case,
$$g(\bar{X}(t);\theta) = \begin{bmatrix} -\Delta k^{tmd}_2\,\Delta v_1 - c^{tmd}_2\,|\Delta\dot{v}_1|^{\beta^{tmd}_2}\operatorname{sgn}(\Delta\dot{v}_1) + c^n_2\,\Delta\dot{v}_1 \\ -\Delta k^{tmd}_2\,\Delta v_2 - c^{tmd}_2\,|\Delta\dot{v}_2|^{\beta^{tmd}_2}\operatorname{sgn}(\Delta\dot{v}_2) + c^n_2\,\Delta\dot{v}_2 \\ \vdots \\ -\Delta k^{tmd}_{2n_{tmd}}\,\Delta v_{2n_{tmd}-1} - c^{tmd}_{2n_{tmd}}\,|\Delta\dot{v}_{2n_{tmd}-1}|^{\beta^{tmd}_{2n_{tmd}}}\operatorname{sgn}(\Delta\dot{v}_{2n_{tmd}-1}) + c^n_{2n_{tmd}}\,\Delta\dot{v}_{2n_{tmd}-1} \\ -\Delta k^{tmd}_{2n_{tmd}}\,\Delta v_{2n_{tmd}} - c^{tmd}_{2n_{tmd}}\,|\Delta\dot{v}_{2n_{tmd}}|^{\beta^{tmd}_{2n_{tmd}}}\operatorname{sgn}(\Delta\dot{v}_{2n_{tmd}}) + c^n_{2n_{tmd}}\,\Delta\dot{v}_{2n_{tmd}} \end{bmatrix}$$
where $\theta = [\Delta k^{tmd}_2 \ \ c^{tmd}_2 \ \ \beta^{tmd}_2 \ \ \cdots \ \ \Delta k^{tmd}_{2n_{tmd}} \ \ c^{tmd}_{2n_{tmd}} \ \ \beta^{tmd}_{2n_{tmd}}]^T$. The design optimization of the TMD parameter values is performed with the objective of minimizing $J_1$ subject to the constraints given in (6.14) with $\alpha = 1.25$.

Results: Three mass ratios $\mu$ (mass of the device/total mass of the structure) are considered for each device: 5%, 2% and 1%. One pair of TMDs is attached in each of the candidate locations as shown in Figure 6.4. Based on a preliminary study, the nominal damping coefficients were chosen to be 12 MN·(s/m) for location A and 2 MN·(s/m) for location B, whereas nominal stiffnesses were chosen to be 0.6 MN/m for both locations for the linear TMDs.

Figure 6.4: Two candidate TMD device locations denoted as A and B.

The initial guesses for $\Delta c^{tmd}_i$ and $\Delta k^{tmd}_i$ are assumed zero. For the linear TMDs, the upper and lower bounds are chosen to be $c^n_i + \Delta c^{tmd}_i \in [0, 16]$ MN·(s/m) and $k^n_i + \Delta k^{tmd}_i \in [0, 5]$ MN/m for $i = 2, 4,\dots, 2n_{tmd}$. The initial guess for nonlinear TMDs is taken as the optimal result from the linear TMD case, with upper and lower limits for parameters identical to the linear case. The initial guess for the additional parameter, the exponents $\beta^{tmd}_{2i}$ of the nonlinear TMDs, is taken as 1.0, with the feasible range [0.2, 1.8] as in the previous example. Table 6.7 and Table 6.8 summarize the results of this study. The maximum number of function evaluations, set at the default of fmincon, was the active stopping criterion for most of the results. However, the first order optimality values $\|\nabla_\theta L(\theta,\lambda)\|_\infty$ were small and all constraints were satisfied.
The results indicate that a pair of TMDs at location 'A' in Figure 6.4 gives a maximum 9.3023% reduction of cost $J_1$, whereas a pair of TMDs at location 'B' in Figure 6.4 only reduces $J_1$ by 3.7209%.

Table 6.7: Solutions of TMD parameters in location A with relaxed constraints.
Linear optimal TMD parameters
| $\mu$ | Reduction of cost $J_1$ (%) | $k^n_2 + \Delta k^{tmd}_2$ [MN/m] | $c^n_2 + \Delta c^{tmd}_2$ [MN·(s/m)] | # fcn. evals. |
|---|---|---|---|---|
| 5% | 5.2270 | 0.6018 | 12.1575 | 201 ‡ |
| 2% | 7.4862 | 0.6955 | 12.6039 | 201 ‡ |
| 1% | 8.6822 | 0.7161 | 12.2872 | 201 ‡ |
‡ Optimization halted at fmincon's default number of function evaluations.
Nonlinear optimal TMD parameters
| $\mu$ | Reduction of cost $J_1$ (%) | $k^n_2 + \Delta k^{tmd}_2$ [MN/m] | $c^{tmd}_2$ [MN·(s/m)$^{\beta^{tmd}_2}$] | $\beta^{tmd}_2$ | # fcn. evals. |
|---|---|---|---|---|---|
| 5% | 7.3754 | 1.1874 | 12.2736 | 0.3900 | 57 |
| 2% | 8.5936 | 1.2010 | 12.6839 | 0.4809 | 96 |
| 1% | 9.3023 | 1.2807 | 12.4217 | 0.4280 | 86 |

Table 6.8: Solutions of TMD parameters in location B with relaxed constraints.
Linear optimal TMD parameters
| $\mu$ | Reduction of cost $J_1$ (%) | $k^n_2 + \Delta k^{tmd}_2$ [MN/m] | $c^n_2 + \Delta c^{tmd}_2$ [MN·(s/m)] | # fcn. evals. |
|---|---|---|---|---|
| 5% | 3.2115 | 4.5862 | 2.5346 | 201 ‡ |
| 2% | 2.5914 | 0.9360 | 2.2971 | 201 ‡ |
| 1% | 2.2591 | 0.6000 | 2.1174 | 201 ‡ |
‡ Optimization halted at fmincon's default number of function evaluations.
Nonlinear optimal TMD parameters
| $\mu$ | Reduction of cost $J_1$ (%) | $k^n_2 + \Delta k^{tmd}_2$ [MN/m] | $c^{tmd}_2$ [MN·(s/m)$^{\beta^{tmd}_2}$] | $\beta^{tmd}_2$ | # fcn. evals. |
|---|---|---|---|---|---|
| 5% | 3.7209 | 1.9372 | 3.8149 | 1.7327 | 300 ‡ |
| 2% | 2.6135 | 0.8011 | 2.4183 | 1.1044 | 301 ‡ |
| 1% | 2.3256 | 0.6474 | 2.3776 | 1.0671 | 300 ‡ |
‡ Optimization halted at fmincon's default number of function evaluations.

Computational Efficiency:

To evaluate the computational efficiency of the proposed method, it is compared with Matlab's ode45. The VIE approach is used with $\Delta t = 6.1$ ms, which gives $O(10^{-3})$ relative RMS accuracy for displacements and velocities across the devices. The gain in computational efficiency achieved with the VIE approach is 331.54 for a typical 100 function evaluation optimization, as shown in Table 6.9.

Table 6.9: Computational gain achieved using the proposed method for TMD design
| $\Delta t$ | Proposed, 1 fcn. eval., one-time: $x(t)$ | Proposed, 1 fcn. eval., one-time: $H_L(t)$ | Proposed, 1 fcn. eval., repeated: $p(t)$ | Proposed, 1 fcn. eval., repeated: $Y(t)$ | Total, 100 fcn. evals.: Proposed | Total, 100 fcn. evals.: ode45 | Gain in comp. efficiency |
|---|---|---|---|---|---|---|---|
| 6.1 ms | 34.56 s | 19.49 s | 9.75 s | 9.54 s | 33.05 min. | 7.40 days | 331.54 |

Example III: Combination of passive dampers and TMDs

In this example, both the passive dampers and tuned mass dampers are designed simultaneously. The control devices are attached symmetrically with respect to the spine of the deck. The initial values for the design parameters are taken from the results of optimal parameter values of passive dampers in Example I and the TMDs attached to the respective locations in Example II. Hence, with $n_{pd}$ pairs of passive dampers and another $n_{tmd}$ pairs of TMDs,
$$\bar{X}(t) = GX(t) = [\Delta v_1 \ \ \Delta v_2 \ \ \cdots \ \ \Delta v_{2n_{tmd}-1} \ \ \Delta v_{2n_{tmd}} \ \ \ \Delta\dot{u}_1 \ \ \Delta\dot{u}_2 \ \ \cdots \ \ \Delta\dot{u}_{2n_{pd}-1} \ \ \Delta\dot{u}_{2n_{pd}} \ \ \ \Delta\dot{v}_1 \ \ \Delta\dot{v}_2 \ \ \cdots \ \ \Delta\dot{v}_{2n_{tmd}-1} \ \ \Delta\dot{v}_{2n_{tmd}}]^T,$$
$$\theta = [c^{pd}_2 \ \ \beta^{pd}_2 \ \ \cdots \ \ c^{pd}_{2n_{pd}} \ \ \beta^{pd}_{2n_{pd}} \ \ \ \Delta k^{tmd}_2 \ \ c^{tmd}_2 \ \ \beta^{tmd}_2 \ \ \cdots \ \ \Delta k^{tmd}_{2n_{tmd}} \ \ c^{tmd}_{2n_{tmd}} \ \ \beta^{tmd}_{2n_{tmd}}]^T,$$
and
$$g(\bar{X}(t);\theta) = \begin{bmatrix} -c^{pd}_2\,|\Delta\dot{u}_1|^{\beta^{pd}_2}\operatorname{sgn}(\Delta\dot{u}_1) \\ -c^{pd}_2\,|\Delta\dot{u}_2|^{\beta^{pd}_2}\operatorname{sgn}(\Delta\dot{u}_2) \\ \vdots \\ -c^{pd}_{2n_{pd}}\,|\Delta\dot{u}_{2n_{pd}-1}|^{\beta^{pd}_{2n_{pd}}}\operatorname{sgn}(\Delta\dot{u}_{2n_{pd}-1}) \\ -c^{pd}_{2n_{pd}}\,|\Delta\dot{u}_{2n_{pd}}|^{\beta^{pd}_{2n_{pd}}}\operatorname{sgn}(\Delta\dot{u}_{2n_{pd}}) \\ -\Delta k^{tmd}_2\,\Delta v_1 - c^{tmd}_2\,|\Delta\dot{v}_1|^{\beta^{tmd}_2}\operatorname{sgn}(\Delta\dot{v}_1) + c^n_2\,\Delta\dot{v}_1 \\ -\Delta k^{tmd}_2\,\Delta v_2 - c^{tmd}_2\,|\Delta\dot{v}_2|^{\beta^{tmd}_2}\operatorname{sgn}(\Delta\dot{v}_2) + c^n_2\,\Delta\dot{v}_2 \\ \vdots \\ -\Delta k^{tmd}_{2n_{tmd}}\,\Delta v_{2n_{tmd}-1} - c^{tmd}_{2n_{tmd}}\,|\Delta\dot{v}_{2n_{tmd}-1}|^{\beta^{tmd}_{2n_{tmd}}}\operatorname{sgn}(\Delta\dot{v}_{2n_{tmd}-1}) + c^n_{2n_{tmd}}\,\Delta\dot{v}_{2n_{tmd}-1} \\ -\Delta k^{tmd}_{2n_{tmd}}\,\Delta v_{2n_{tmd}} - c^{tmd}_{2n_{tmd}}\,|\Delta\dot{v}_{2n_{tmd}}|^{\beta^{tmd}_{2n_{tmd}}}\operatorname{sgn}(\Delta\dot{v}_{2n_{tmd}}) + c^n_{2n_{tmd}}\,\Delta\dot{v}_{2n_{tmd}} \end{bmatrix}$$
The design optimization problem is the constrained minimization of cost $J_1$ as defined in (6.14) with $\alpha = 1.25$. The mass ratio for each of the TMDs is taken as 2%.
Four candidate configurations involving the two best nonlinear damper solutions and two TMD locations 'A' and 'B' are investigated. Optimal parameter values and configuration are provided in Table 6.10. Compared with Examples I and II, this combined case of passive dampers and TMDs gives a somewhat greater reduction in the cost $J_1$ (30.0775%). The gain in computational efficiency for a typical 400 function evaluation optimization is 120.57.

Table 6.10: Optimal combined passive dampers and TMD configuration with relaxed constraints.
Optimal passive damper parameters
| Region | Connecting nodes: left damper, right damper | $c^{pd}_2$ [MN·(s/m)$^{\beta^{pd}_2}$] | $\beta^{pd}_2$ |
|---|---|---|---|
| IV – P3 | 319-184, 324-117 | 20.9992 | 0.8627 |
Optimal TMD parameters
| $\mu$ | Device location | $k^n_2 + \Delta k^{tmd}_2$ [MN/m] | $c^{tmd}_2$ [MN·(s/m)$^{\beta^{tmd}_2}$] | $\beta^{tmd}_2$ |
|---|---|---|---|---|
| 2% | B | 0.5895 | 2.1439 | 1.6259 |
Optimization result
| Reduction of cost $J_1$ (%) | # fcn. evals. |
|---|---|
| 30.0775 | 478 |

Example IV: Design under uncertainty

This example incorporates localized uncertainties into the model to pose a computationally efficient design-under-uncertainty framework for systems that are mostly linear and deterministic but that have localized nonlinear and uncertain elements. To be consistent with the symmetric nature of the bridge model, the passive devices are again placed in identical pairs symmetrically located about the bridge deck centerline as in Example III. Further, this example considers the parameters of each pair of nonlinear TMDs as uncertain, i.e., $\delta = [c^{tmd}_2 \ \ \beta^{tmd}_2 \ \ k^{tmd}_2 \ \ \ c^{tmd}_4 \ \ \beta^{tmd}_4 \ \ k^{tmd}_4 \ \ \cdots \ \ c^{tmd}_{2n_{tmd}} \ \ \beta^{tmd}_{2n_{tmd}} \ \ k^{tmd}_{2n_{tmd}}]^T$, and takes the damping coefficients and velocity exponents of each pair of nonlinear viscous dampers as the design parameters, i.e., $\theta = [c^{pd}_2 \ \ \beta^{pd}_2 \ \ \ c^{pd}_4 \ \ \beta^{pd}_4 \ \ \cdots \ \ c^{pd}_{2n_{pd}} \ \ \beta^{pd}_{2n_{pd}}]^T$.

Formulation: The equations of motion of the bridge with local uncertainties can be written in the form of (6.3) where the variables $X(t)$, $u(t)$, $A$, $B$, $M$, $K$, and $C$ are the same as in Example II.
$L^{pd}$ and $L^{tmd}$ are similar to the matrices $L$ in Examples I and II, respectively. $g^{tmd}(\bar{X}^{tmd}(t);\delta)$ is the same as in Example II with $\bar{X}^{tmd}(t) = G^{tmd}X(t) = [\Delta v_1 \ \ \Delta v_2 \ \ \cdots \ \ \Delta v_{2n_{tmd}-1} \ \ \Delta v_{2n_{tmd}} \ \ \ \Delta\dot{v}_1 \ \ \Delta\dot{v}_2 \ \ \cdots \ \ \Delta\dot{v}_{2n_{tmd}-1} \ \ \Delta\dot{v}_{2n_{tmd}}]^T$, where $G^{tmd}$ has the same block structure built from $(R^{tmd})^T$ as $G$ in Example II, and $g^{pd}(\bar{X}^{pd}(t);\theta)$ is the same as in Example I with $\bar{X}^{pd}(t) = G^{pd}X(t) = [\Delta\dot{u}_1 \ \ \Delta\dot{u}_2 \ \ \cdots \ \ \Delta\dot{u}_{2n_{pd}-1} \ \ \Delta\dot{u}_{2n_{pd}}]^T = [0 \ \ (R^{pd})^T]X(t)$. For simplicity, consider the case of a single pair of TMDs attached at location 'B' on top of the second tower as shown in Figure 6.5. The initial values $c^{pd}_2$ and $\beta^{pd}_2$ of the design parameters for the optimization problem are taken from the results in Table 6.4 where the nonlinear damping devices are used alone. The lower and upper bounds for damping coefficients and nonlinear exponents are taken as in Example I. The uncertain parameter vector corresponds to a single pair of TMDs (i.e., $2n_{tmd} = 2$) placed on top of the second tower (see Figure 6.5) in the finite element model with mass ratio 2%. The uncertain parameters of the TMDs follow the distributions given in Table 6.11, where the mean values were obtained from the Table 6.8 results of the TMD optimization in Example II.

Optimization Results: The optimization is performed using Matlab's fmincon with default values for tolerances but limited to 20 function evaluations, as the focus here is on the computational gain.

Figure 6.5: Finite element model (side view) of the bridge, with Example IV device locations.

$N_\delta = 100$ samples of the three-element uncertain parameter vector $\delta = [\Delta k^{tmd}_2 \ \ c^{tmd}_2 \ \ \beta^{tmd}_2]^T$ are generated using Latin Hypercube sampling [111, 203].
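The sampling step can be sketched as follows. This is a minimal, standard-library illustration of Latin Hypercube sampling with lognormal marginals, not the thesis implementation: the function names are hypothetical, the seed is arbitrary, the three parameters are assumed independent, and the means and coefficients of variation are those listed in Table 6.11. The lognormal parameters follow from the standard relations $\sigma^2 = \ln(1 + \mathrm{COV}^2)$ and $\mu = \ln(\text{mean}) - \sigma^2/2$.

```python
import math
import random
from statistics import NormalDist


def latin_hypercube(n, d, rng):
    """Return n points in [0, 1)^d with exactly one point per stratum in each dimension."""
    cols = []
    for _ in range(d):
        u = [(k + rng.random()) / n for k in range(n)]  # one draw inside each of n strata
        rng.shuffle(u)                                   # random pairing across dimensions
        cols.append(u)
    return [[cols[j][i] for j in range(d)] for i in range(n)]


def lognormal_from_mean_cov(u, mean, cov):
    """Map a uniform variate u to a lognormal variate with the given mean and COV."""
    sigma2 = math.log(1.0 + cov ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # guard the inverse CDF against exactly 0 or 1
    return math.exp(mu + math.sqrt(sigma2) * NormalDist().inv_cdf(u))


# (mean, COV) for delta = [Delta k, c, beta], taken from Table 6.11.
PARAMS = [(0.8011, 0.025), (2.4183, 0.05), (1.1044, 0.1)]

rng = random.Random(0)  # arbitrary seed, for repeatability of the sketch only
unit = latin_hypercube(100, len(PARAMS), rng)
samples = [[lognormal_from_mean_cov(row[j], *PARAMS[j]) for j in range(len(PARAMS))]
           for row in unit]
```

Each row of `samples` is one realization $\delta_i$ for which the structural response would be evaluated; the stratification gives better coverage of each marginal than plain Monte Carlo for the same $N_\delta$.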
The results of the design optimization are shown in Table 6.12: the worst-case design reduces the largest cost (the largest over all samples of the uncertain TMD parameters) by almost 26% relative to the largest cost without the pair of passive dampers; similarly, the average design reduces the average cost (averaged over all samples of the uncertain TMD parameters) by a bit over 26% relative to the average cost without the passive dampers. The optimal values for $c^{pd}_2$ and $\beta^{pd}_2$ are very similar for both the average and worst-case designs. The passive damper force metric $J_{12}$ for the worst-case design is very slightly (0.37%) higher than that for the average design, which is not surprising: very slightly larger force levels are used to handle the worst-case outliers. As a point of comparison, both the worst-case and average designs-under-uncertainty use force levels $J_{12}$ that are about 9% higher than that of a similar deterministic optimization when the TMD parameters are at the design-under-uncertainty's means given in Table 6.11 (the deterministic optimization results in optimal passive damper parameters $c^{pd}_{opt} = 1.5830$ MN·(s/m)$^{\beta^{pd}_{opt}}$ and $\beta^{pd}_{opt} = 0.6417$); again, this is expected as the optimal design-under-uncertainty passive dampers use slightly larger forces to accommodate the TMD parameter uncertainty. Hence, due to the assumed uncertainty in the structure, the robust optimization requires passive dampers capable of exerting greater force than a deterministic optimization.

Table 6.11: Uncertain parameter description
| Variable | Distribution | Mean | COV |
|---|---|---|---|
| Damping coeff. $c^{tmd}_2$ | Lognormal | 2.4183 MN·(s/m)$^{\beta^{tmd}_2}$ | 0.05 |
| Stiffness coeff. $\Delta k^{tmd}_2$ | Lognormal | 0.8011 MN/m | 0.025 |
| Exponent $\beta^{tmd}_2$ | Lognormal | 1.1044 | 0.1 |
Note: COV = coefficient of variation

Table 6.12: Worst-case and average passive damper designs-under-uncertainty
| Design | Cost | Cost Reduction | # fcn. evals. | Optimal $c^{pd}_2$ | Optimal $\beta^{pd}_2$ |
|---|---|---|---|---|---|
| Worst-case | $\max J_1$ | 25.8343% | 20 | 20.9222 MN·(s/m)$^{\beta^{pd}_2}$ | 0.8875 |
| Average | $E[J_1]$ | 26.2165% | 20 | 20.7751 MN·(s/m)$^{\beta^{pd}_2}$ | 0.8853 |

Computational Efficiency:

For $N_\delta = 100$, and for 20 function evaluations in the optimization procedure, $x^{(nl)}(t)$ must be evaluated a total of 2000 times. The proposed design-under-uncertainty framework provides a computational speedup of 321.77 relative to ode45 with comparable accuracy, as shown in Table 6.13.

Table 6.13: Computational gain achieved using the proposed method for design-under-uncertainty
| $\Delta t$ | Proposed: one-time + 2000 × repeated = total | ode45 total | Gain in comp. efficiency |
|---|---|---|---|
| 6.1 ms | 19.48 min. + 2000 × 19.29 s = 11.04 hrs. | 148.03 days | 321.77 |
Total CPU times are for 2000 simulations.

6.2.4 Closure

A computational approach for optimal design and design-under-uncertainty of passive control devices has been presented in this section. For each (realization of a) function evaluation, the proposed approach efficiently evaluates a performance metric of structural response by exploiting localized modification of the structure due to added passive control devices and/or local uncertainties present in the structure. This leads to a Volterra integral equation (VIE) in the pseudoforces induced by the devices and/or uncertainties; the VIE is solved numerically using a recursive fast Fourier transform (FFT) to achieve computational gain; the optimization is performed using sequential quadratic programming. The method is illustrated through a benchmark cable-stayed bridge subject to a historic earthquake record. Locations as well as the parameter values of passive dampers, TMDs and their combination are designed to achieve better performance of the bridge in the first three examples. The fourth example considers robust design of passive dampers under uncertainty in the TMDs to optimize the average or worst-case performance. The computational gain is measured relative to Matlab's ode45, a traditional nonlinear solver.
The examples show considerable computational gains of two orders of magnitude, ranging from about 100 to about 330. To further the applicability of the proposed framework, future work will incorporate reliability-based design optimization under stochastic loading, in which the proposed method can take advantage of the local nature of the uncertainties and design components.

6.3 Efficient Forward Uncertainty Propagation

To simplify various types of analyses, linear structural dynamical models are often used, though they are not always adequate to accurately compute structural responses. When the models instead involve some nonlinearity, the required computation for model evaluations in forward uncertainty propagation increases significantly. An important class of nonlinear problems consists of models that are mostly linear except for some spatially localized nonlinearities. For example, in a building with a base isolation layer, the superstructure is designed to behave essentially linearly in an earthquake excitation whereas only the isolation layer behaves nonlinearly [204]. Similarly, spacecraft are built from linear components which are sometimes connected by spatially localized nonlinear joints [205]. For these large locally nonlinear systems, forward uncertainty propagation using a simple Monte Carlo approach becomes very expensive. Perturbation methods using Taylor series expansions have often been used for forward uncertainty propagation in nonlinear structural problems [206]. First order expansions applied alone produce good results only when the fluctuations of the random elements about their means are small [207]. To address this forward uncertainty propagation problem for locally nonlinear models, this section proposes a method that first divides the random variable space into disjoint elements to limit the order of uncertainties in each element [208].
Inside each element, the proposed method employs a Taylor series expansion for the quantity of interest (QoI) that depends on the response of the system; the system responses, as well as the local sensitivities required in the approximation, are estimated using the exact reduction provided by the Volterra integral equations approach [74]. Using this reduction, most of the computational cost is converted to a one-time cost, resulting in rapid solution for a whole family of responses and sensitivities. The proposed method next uses these approximations inside each element to estimate the statistics of the QoI instead of using samples from each of these elements as in stratified sampling. Hence, the proposed methods improve the solution of the forward uncertainty propagation problem in two ways:

◦ Approximation error is localized and bounded by dividing the random space into disjoint elements and using a Taylor series approximation in each element.

◦ The system response and the Taylor series' local sensitivities are calculated efficiently by converting the high-order system of nonlinear equations into a low-order set of nonlinear Volterra integral equations, which are solved via a computationally efficient approach [74]. This approach leverages one-time computations of the nominal system impulse responses to make the solutions of the many locally-evaluated samples very efficient (compared to a standard nonlinear solver such as Matlab's ode45).

An adaptive procedure to subdivide the random variable space is also introduced here. The proposed methods are illustrated using a 100 degree-of-freedom building model with a nonlinear base isolation layer.
6.3.1 Methodology

Let the state space equations of a locally nonlinear dynamical system be given by [74, 209, 210]

$$\begin{aligned} \dot{X}(t;\xi) &= A X(t;\xi) + B w(t) + L\, g\big(\bar{X}(t;\xi)\big); \quad X(0) = x_0 \\ Y(t;\xi) &= C X(t;\xi) + D w(t) + E\, g\big(\bar{X}(t;\xi)\big) \end{aligned} \tag{6.15}$$

where $X(t;\xi) \in \mathbb{R}^{n\times 1}$ is the state vector; $Y(t;\xi) \in \mathbb{R}^{n_y\times 1}$ is the output vector; $A \in \mathbb{R}^{n\times n}$ is the state matrix; $C \in \mathbb{R}^{n_y\times n}$ is the state observation matrix; $w(t) \in \mathbb{R}^{m\times 1}$ is an external excitation; $B \in \mathbb{R}^{n\times m}$ and $D \in \mathbb{R}^{n_y\times m}$ are the influence matrices for the external excitation; $g(\cdot) \in \mathbb{R}^{n_g\times 1}$ is a function of a linear state combination $\bar{X}(t;\xi) = G X(t;\xi) \in \mathbb{R}^{n_o\times 1}$ and uncertain parameters $\xi \in \mathbb{R}^{n_\xi\times 1}$; $L \in \mathbb{R}^{n\times n_g}$ and $E \in \mathbb{R}^{n_y\times n_g}$ are influence matrices that map the vector function $g(\cdot)$ to all the states and to the output, respectively; and $x_0$ is the initial condition. Further, assume the QoI $f(Y(t;\xi))$ is evaluated using the output vector $Y(t;\xi)$ for a realization of the uncertain parameter $\xi$ from the probability density function $p_\Xi(\xi)$.

Use of Taylor series approximations

The first order Taylor series approximates the QoI as

$$f(Y(t;\xi)) \approx f(Y(t;\xi_0)) + (\xi - \xi_0)^T [\nabla f]_{\xi_0} \tag{6.16}$$

where the first derivatives of $f(\cdot)$ are assumed to exist; $[\nabla f] = [\partial f/\partial \xi_i]^T$ is the vector of first order local sensitivities, evaluated at $\xi = \xi_0$.

Division of random space

The approximation error with expansion (6.16) is localized by dividing the random space $\Xi$ into $n_{st}$ disjoint elements $\{\Xi_l\}_{l=1}^{n_{st}}$, as shown in Figure 6.6, where $\cup_{l=1}^{n_{st}} \Xi_l = \Xi$ and $\Xi_i \cap \Xi_j = \emptyset$ for $i \neq j$. The QoI $f(Y(t;\xi))$ is then expressed as

$$f(Y(t;\xi)) = \sum_{l=1}^{n_{st}} f(Y(t;\xi))\, \mathbb{1}(\xi \in \Xi_l) \tag{6.17}$$

where $\mathbb{1}(\cdot)$ is an indicator function. Then, using $f_l = f(Y(t;\xi))$ for $\xi \in \Xi_l$, the mean and variance can be written

$$\begin{aligned} \mathrm{E}[f(Y(t;\xi))] &= \mu_f = \sum_{l=1}^{n_{st}} p_l\, \mathrm{E}[f_l] \\ \mathrm{Var}[f(Y(t;\xi))] &= -\mu_f^2 + \sum_{l=1}^{n_{st}} p_l\, \mathrm{Var}[f_l] + \sum_{l=1}^{n_{st}} p_l\, \big(\mathrm{E}[f_l]\big)^2 \end{aligned} \tag{6.18}$$

where $p_l = P(\xi \in \Xi_l)$ and $p_{\Xi_l}(\xi) = p_\Xi(\xi)/P(\xi \in \Xi_l)$. As a result, the global error in the approximation of $f(Y(t;\xi)) \in L^2(\Omega, \mathcal{F}, P)$ is bounded from above.
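The combination rule in (6.18) can be checked numerically with a small sketch. The function name `combine_element_statistics` and its argument names are hypothetical and are not taken from the dissertation; the inputs are assumed to be the element probabilities $p_l$ and the per-element means and variances of the QoI.

```python
import numpy as np

def combine_element_statistics(p, mean_l, var_l):
    """Combine per-element statistics into a global mean and variance, as in (6.18).

    p      : element probabilities p_l (must sum to one)
    mean_l : per-element means E[f_l]
    var_l  : per-element variances Var[f_l]
    (All names here are illustrative assumptions.)
    """
    p = np.asarray(p, dtype=float)
    mean_l = np.asarray(mean_l, dtype=float)
    var_l = np.asarray(var_l, dtype=float)
    if not np.isclose(p.sum(), 1.0):
        raise ValueError("element probabilities must sum to one")
    mu_f = np.sum(p * mean_l)                                   # first line of (6.18)
    var_f = -mu_f**2 + np.sum(p * var_l) + np.sum(p * mean_l**2)  # second line of (6.18)
    return mu_f, var_f

# Example: f(xi) = xi with xi uniform on [-1, 1], split into two equal halves.
# Each half has mean -0.5 or +0.5 and variance 1/12.
mu, var = combine_element_statistics([0.5, 0.5], [-0.5, 0.5], [1 / 12, 1 / 12])
print(mu, var)
```

For this example the combination recovers the exact global mean of 0 and variance of 1/3 (up to round-off), because the within-element and between-element contributions are both accounted for.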
Figure 6.6: The random space divided into $n_{st}$ disjoint elements. [Illustration for $n_{st} = 4$: the space $\Xi$ is split into $\Xi_1, \ldots, \Xi_4$, each with its own mean and standard deviation of $f$.]

Algorithm 5: Pseudocode for the proposed method.
1 Divide parameter space $\Xi$ into $n_{st}$ elements: $\{\Xi_l\}_{l=1}^{n_{st}}$
2 Evaluate $P(\xi \in \Xi_l) = p_l$
3 for $l = 1, \ldots, n_{st}$ do
4   Estimate the mean $\mu_\xi^l$ in each element
5   Solve (6.15) for $Y(t;\xi)$ and calculate $[\nabla Y(t)]$ at $\xi = \mu_\xi^l$
6   Use (6.20) and (6.21) to evaluate $\mathrm{E}[f_l]$ and $\mathrm{Var}[f_l]$
7 end
Result: $\mathrm{E}[f(Y(t;\xi))] \approx \sum_{l=1}^{n_{st}} p_l\, \mathrm{E}[f_l]$, $\mathrm{Var}[f(Y(t;\xi))] \approx -\mu_f^2 + \sum_{l=1}^{n_{st}} p_l\, \mathrm{Var}[f_l] + \sum_{l=1}^{n_{st}} p_l\, (\mathrm{E}[f_l])^2$

Approximation in each element

In each space $\Xi_l$, the parameter means $\mu_\xi^l$ are calculated analytically as $\mu_\xi^l = \int_{\Xi_l} \xi\, p_{\Xi_l}(\xi)\, d\xi$ or using random sampling. The QoI $f(Y(t;\xi))$ can then be approximated with first order terms from (6.16) within each element as

$$f_l \approx f\big(Y(t;\mu_\xi^l)\big) + \big(\xi - \mu_\xi^l\big)^T \big[\nabla f(Y(t;\xi))\big]_{\mu_\xi^l} \tag{6.19}$$

The mean and variance of $f(Y(t;\xi))$ within each element can then be estimated with

$$\mathrm{E}[f_l] \approx f\big(Y(t;\mu_\xi^l)\big) \tag{6.20}$$

$$\mathrm{Var}[f_l] \approx \sum_{i=1}^{n_\xi} \sum_{j=1}^{n_\xi} \left(\frac{\partial f}{\partial \xi_i}\right)_{\mu_\xi^l} \left(\frac{\partial f}{\partial \xi_j}\right)_{\mu_\xi^l} \sigma_{ij}^l \tag{6.21}$$

where $\sigma_{ij}^l = \int_{\Xi_l} (\xi_i - \mu_{\xi_i}^l)(\xi_j - \mu_{\xi_j}^l)\, p_{\Xi_l}(\xi)\, d\xi$ can be calculated analytically with $p_{\Xi_l}(\xi) = p_\Xi(\xi)/P(\xi \in \Xi_l)$ for some standard probability distributions. Note that the forward model needs to be run only once per element in this method.

This proposed method is equivalent to fitting hypersurfaces for the QoI $f(Y(t;\xi))$ at every element, using information obtained at the means of the elements, of the form

$$f_l = \sum_{i=1}^{n_\xi} \theta_i^l \big(\xi_i^l - \mu_{\xi_i}^l\big) + \theta_0^l \quad \text{for } l = 1, \ldots, n_{st} \tag{6.22}$$

where $\theta_0^l = f(Y(t;\mu_\xi^l))$ and $\theta_i^l = \left.\dfrac{\partial f(Y(t;\xi))}{\partial \xi_i}\right|_{\mu_\xi^l}$. Similarly, a second order expansion would fit hyperquadratic surfaces. A generalization to multiple samples from each element could use an $n_s$th order surface fit with up to $n_T$th order sensitivity information and $n_T \leq n_s$.
When $n_T = n_s$, a single sample from each element is sufficient, and it can be taken as the mean of the element. However, with $n_T < n_s$, a few random samples are needed, and a least squares approach can be applied to estimate the coefficients $\theta$.

Efficient Analysis of Locally Nonlinear Systems

The efficient analysis of locally nonlinear systems described in Section 3.2.3 is used here.

Adaptive scheme

An adaptive procedure is introduced for subdividing the random variable space. Analogous to $h$ refinement in the finite element method, the elements are refined adaptively by monitoring the variance in each element. If $p_l (\Delta\sigma^l/\sigma^l)^{1/2} > h_{\max,1}$ or $p_l (\sigma^l/\sigma_{\max})^{1/2} > h_{\max,2}$ for specified values of the thresholds $h_{\max,1}$ and $h_{\max,2}$, the $l$th element is divided into three elements along the direction $i^* = \arg\max_i\, [\partial f/\partial \xi_i]^2 \sigma_{ii}^l$. Here, $\Delta\sigma^l$ is the change in the variance of $f_l$ in retrospect (i.e., the change in $\sigma^l$ from a previous level of subdivision). The element is then divided into three equal parts along $\xi_{i^*}$, as shown in Figure 6.6 for the $n_\xi = 2$ case. Subdividing the element into an odd number of subelements allows $f(Y(t;\xi))$ and its derivatives at the mean of the undivided element to be reused for one of the new subelements.

Figure 6.7: 100 DOF base-isolated structural model. [Labels: ground, base with hysteretic isolation bearings (displacement $u_b$), 1st through 11th floors, superstructure DOFs $u_1$ to $u_{99}$.]

The order of the Taylor series approximation can also be increased adaptively. However, higher order sensitivities may introduce instabilities in the approximation [207] and exponentially increase the computational requirement with higher order derivatives, so only a first order approximation is used herein.
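The element-wise procedure of Algorithm 5 can be sketched for a simple setting. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the parameters are assumed independent and uniform on a box, the elements form a regular (non-adaptive) grid, and the hypothetical callables `f` and `grad_f` stand in for the QoI and its first order sensitivities, which the text obtains from the Volterra-based solver.

```python
import itertools
import numpy as np

def propagate_first_order(f, grad_f, lower, upper, n_div):
    """Mean and variance of f(xi) for independent uniform xi on [lower, upper],
    using a first order Taylor expansion inside each element of a regular grid.
    All names are illustrative assumptions."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    n_dim = lower.size
    width = (upper - lower) / n_div     # element edge lengths
    p_l = 1.0 / n_div**n_dim            # equal element probabilities (uniform density)
    var_xi = width**2 / 12.0            # variance of each uniform parameter within an element
    mean_f = 0.0
    second_moment = 0.0
    within_var = 0.0
    for idx in itertools.product(range(n_div), repeat=n_dim):
        mu_l = lower + (np.asarray(idx) + 0.5) * width  # element mean = element center
        f_l = f(mu_l)                                   # E[f_l], as in (6.20)
        g_l = np.asarray(grad_f(mu_l), dtype=float)     # local sensitivities at the mean
        v_l = np.sum(g_l**2 * var_xi)                   # (6.21) with a diagonal sigma^l
        mean_f += p_l * f_l
        second_moment += p_l * f_l**2
        within_var += p_l * v_l
    var_f = -mean_f**2 + within_var + second_moment     # (6.18)
    return mean_f, var_f

# Example with a linear QoI, for which the first order expansion is exact:
# f = 2*xi_0 + 3*xi_1 on [-1, 1]^2 has mean 0 and variance (4 + 9)/3.
mean_f, var_f = propagate_first_order(
    lambda xi: 2.0 * xi[0] + 3.0 * xi[1],
    lambda xi: [2.0, 3.0],
    lower=[-1.0, -1.0], upper=[1.0, 1.0], n_div=10,
)
print(mean_f, var_f)
```

The forward model is called once per element, at the element mean, which mirrors the cost argument in the text. For a nonlinear QoI the result is approximate, and refining the grid (or refining adaptively, as described above) reduces the error.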
6.3.2 Numerical Example

This example uses an 11-story 2-bay 99 DOF superstructure on a 1-D hysteretic isolation layer, resulting in the 100 DOF model described in Section 3.3.2. The superstructure is Rayleigh damped with 3% damping ratios in the 1st and 10th modes. With a fixed base, the superstructure would have a fundamental period of 1.05 s, with equations of motion

$$M_s \ddot{u}_s + C_s \dot{u}_s + K_s u_s = -M_s r_s \ddot{u}_g \tag{6.23}$$

where $M_s$, $C_s$ and $K_s$ are the mass, damping and stiffness matrices, respectively; $\ddot{u}_g$ is the ground acceleration with influence vector $r_s$ composed of ones in horizontal DOFs and zeros elsewhere; $u_s$ is the generalized displacement vector (3 DOF per node) relative to the ground. The isolation has rubber bearings with stiffness $k_r$, and hysteretic lead-rubber bearings (LRBs) with nominal yield force $Q_y^n$ and pre- and post-yield stiffnesses $k_{pre}^n$ and $k_{post}^n$, respectively, with hardening ratio $r_k = k_{post}/k_{pre}$. Together, these bearings provide viscous damping with coefficient $c_b$. The resulting combined equations of motion are

$$\begin{aligned} M_s \ddot{u}_s + C_s \dot{u}_s + K_s u_s &= -M_s r_s \ddot{u}_g + C_s r_s \dot{u}_b + K_s r_s u_b \\ m_b \ddot{u}_b + (c_b + r_s^T C_s r_s)\, \dot{u}_b + (k_r + r_s^T K_s r_s)\, u_b + f_{lrb} &= -m_b \ddot{u}_g + r_s^T C_s \dot{u}_s + r_s^T K_s u_s \end{aligned} \tag{6.24}$$

The restoring force $f_{lrb}$ of the LRBs is assumed to follow a Bouc-Wen model [69]:

$$f_{lrb} = k_{post}^n (1 + \Delta_k \xi_k)\, u_b + Q_y^n (1 + \Delta_Q \xi_Q)\,[1 - r_k]\, z \tag{6.25}$$

where $k_{post} = k_{post}^n (1 + \Delta_k \xi_k)$ and $Q_y = Q_y^n (1 + \Delta_Q \xi_Q)$ are the actual post-yield stiffness and yield force, respectively; the uncertain parameters $\xi_k$ and $\xi_Q$ are independent and each uniformly distributed in $[-1, 1]$; and $\Delta_k = 0.67$ and $\Delta_Q = 0.2$ are chosen to produce significant variation in the output $\ddot{u}_{97}$. Herein, $k_r = 780$ kN/m, $c_b = 35$ kN·s/m and $k_{post}^n = 468$ kN/m.
The nominal yield force is $Q_y^n$ = 8.75% of the building weight, and $r_k = 1/6$ [106]. These parameters give the first mode a typical period of $T_1^i = 2.22$ s and a damping ratio of 3.53%. The evolution of the auxiliary variable $z$ is governed by

$$\dot{z} = A \dot{u}_b - \beta\, \dot{u}_b |z| - \gamma\, z\, |\dot{u}_b| \tag{6.26}$$

where the selection of $A = 2\beta = 2\gamma = k_{pre}/Q_y$ dictates that $z$ stays in $[-1, 1]$ and produces identical loading and unloading stiffnesses [71, 102]. The ground excitation is a 30 s segment of the 1940 El Centro earthquake record. The absolute (horizontal) roof acceleration $\ddot{u}_{97}$ (Figure 6.7), with a sampling rate of 20 Hz, is used as the output $Y(t;\xi)$ of the model.

Figure 6.8: Mean response with its variation: (a) using 100 elements; (b) using 400 elements. [Axes: time [s] versus roof acceleration $\ddot{u}_{97}$ [m/s²]. Curves: mean response and mean ± 1, 2 and 3 standard deviations.]

The proposed method is applied to estimate the mean and variance of $f = Y(t_i;\xi)$ at the sampling time instants. A first order Taylor series approximation is used over a 10×10 element division of the random variable space, with each element having equal probability. Figure 6.8 shows the mean response, along with its variability up to three standard deviations, obtained using the proposed method. Next, the adaptive method is applied with $h_{\max,1} = h_{\max,2} = 0.005$, and the resulting adapted grid is shown in Figure 6.9; this grid produces a mean response plot similar to Figure 6.8. The variance estimates are compared to a 200,000 realization Monte Carlo simulation (MCS) in Figure 6.10. The RMS differences over time in the mean response estimates and variance estimates are 0.0033 and 0.0249, respectively, for 100 equal-probability elements.
These RMS differences become 0.0016 and 0.0020, respectively, for the adapted grid in Figure 6.9. The proposed method agrees with the MCS results while incurring a fraction of its cost.

Computing the response statistics using the adapted grid in Figure 6.9 requires the computation of responses and two first-order sensitivities at 748 different values of $\xi$, in contrast with responses only for 200,000 realizations in the MCS. Further, the efficient Volterra-based response and sensitivity calculation provides a computational cost reduction by a factor of 73.9 compared to Matlab's ode45 with a central difference approach for the sensitivity calculations. Combined, these provide a computational speedup factor of 19,753 for the proposed method.

Figure 6.9: An adaptive division of elements. [Axes: $\xi_k$ and $\xi_Q$.]

Figure 6.10: Comparison of variance estimates over time: (a) using 100 elements; (b) using 400 elements. [Axes: time [s] versus roof acceleration variance $\mathrm{Var}(\ddot{u}_{97})$ [m²/s⁴]. Curves: fixed elements, adaptive grid, and 200,000 MC simulations.]

6.3.3 Closure

A method for forward propagation of uncertainty through nonlinear dynamical systems is proposed using a Taylor series approximation, an adaptive sampling strategy, and a computationally efficient response and sensitivity algorithm. The global error in the approximation is bounded by dividing the random space into multiple elements; an adaptive scheme is also proposed here. Further, an efficient dynamic analysis algorithm is implemented for the response and local sensitivity calculations required in the approximation.
The numerical example shows that the proposed method gives results very close to those obtained using 200,000 standard Monte Carlo simulations but is four orders of magnitude faster.

6.4 Reynolds Averaged Navier-Stokes (RANS) Models for Turbulence

Due to the presence of a large range of scales in turbulent flows with high Reynolds number, direct numerical simulation (DNS) is very costly. However, in many engineering applications, the mean flow characteristics are often the quantities of interest. To obtain them, the turbulent velocity $U(x,t)$ can be decomposed into

$$U(x,t) = \overline{U}(x,t) + u(x,t) \tag{6.27}$$

where $\overline{(\cdot)}$ denotes the mean and $u(x,t)$ is the zero-mean fluctuation. Using this decomposition, known as the Reynolds decomposition [211–213], in the governing equations for incompressible flow, the following equations are obtained using the Einstein summation convention

$$\begin{aligned} \frac{\partial \overline{U}_j}{\partial x_j} &= 0 \\ \frac{\partial \overline{U}_j}{\partial t} + \overline{U}_i \frac{\partial \overline{U}_j}{\partial x_i} &= \frac{\partial}{\partial x_i}\left[\nu\left(\frac{\partial \overline{U}_i}{\partial x_j} + \frac{\partial \overline{U}_j}{\partial x_i}\right) - \overline{u_i u_j}\right] - \frac{1}{\rho}\frac{\partial \overline{p}}{\partial x_j} \end{aligned} \tag{6.28}$$

where $\nu$ is the molecular kinematic viscosity and $\overline{p}$ is the mean pressure. This set of equations is known as the Reynolds averaged Navier-Stokes (RANS) equations. However, the averaging gives rise to new unknown quantities $\overline{u_i u_j}$, which are called the Reynolds stresses. To solve for these new unknowns, Boussinesq's viscosity hypothesis can be used, which gives for an incompressible flow

$$\overline{u_i u_j} = -\nu_T\left(\frac{\partial \overline{U}_i}{\partial x_j} + \frac{\partial \overline{U}_j}{\partial x_i}\right) \tag{6.29}$$

where $\nu_T$ is the eddy viscosity coefficient. Substituting (6.29) into the RANS momentum equation in (6.28) gives

$$\frac{\partial \overline{U}_j}{\partial t} + \overline{U}_i \frac{\partial \overline{U}_j}{\partial x_i} = \frac{\partial}{\partial x_i}\left[(\nu + \nu_T)\left(\frac{\partial \overline{U}_i}{\partial x_j} + \frac{\partial \overline{U}_j}{\partial x_i}\right)\right] - \frac{1}{\rho}\frac{\partial \overline{p}}{\partial x_j} \tag{6.30}$$

Next, different RANS models introduce different sets of equations for these new unknowns in terms of known flow properties to close the system and to solve for the flow properties.
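The origin of the Reynolds stresses in the decomposition (6.27) can be illustrated numerically. The sketch below uses synthetic samples at a single point, with sample averages standing in for the mean operator; the signal model and all variable names are assumptions made for illustration only and are unrelated to the hump flow data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Synthetic two-component "velocity" samples: a mean plus correlated fluctuations
U1 = 5.0 + rng.normal(size=n)
U2 = 1.0 + 0.5 * (U1 - 5.0) + 0.2 * rng.normal(size=n)

# Reynolds decomposition: total = mean + zero-mean fluctuation
U1_mean, U2_mean = U1.mean(), U2.mean()
u1, u2 = U1 - U1_mean, U2 - U2_mean

# The averaged product of fluctuations plays the role of a Reynolds stress component
reynolds_stress_12 = np.mean(u1 * u2)

# Averaging the nonlinear term U1*U2 leaves an extra term beyond the product of means
lhs = np.mean(U1 * U2)
rhs = U1_mean * U2_mean + reynolds_stress_12
print(lhs, rhs, reynolds_stress_12)
```

Because the fluctuations have zero sample mean, the cross terms vanish and the averaged product equals the product of the means plus the fluctuation correlation. This extra correlation is the quantity that a closure such as (6.29) has to model.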
Because each RANS model has a slightly different formulation in the literature, the brief descriptions of the RANS models provided in the following subsections correspond to the forms implemented in the open-source code OpenFOAM v4.1 (www.openfoam.org). To be consistent with previous terminology, each of these RANS models is referred to as a model class, and an instance of its parameters is called a model.

6.4.1 k−ε model class

In this two-equation RANS turbulence model class, model transport equations for the turbulent kinetic energy $k$ and the dissipation of turbulent kinetic energy $\varepsilon$ are specified along with the closure coefficients, all of which are used as model parameters subject to some constraints. The model transport equation for the turbulent kinetic energy $k = \frac{1}{2}\overline{u_i u_i}$ with a gradient-diffusion hypothesis is given by Launder and Spalding [214], Pope [215], and Wilcox [216] as

$$\frac{\partial k}{\partial t} + \overline{U}_i \frac{\partial k}{\partial x_i} = \frac{\partial}{\partial x_j}\left[\left(\nu + \frac{\nu_T}{\sigma_k}\right)\frac{\partial k}{\partial x_j}\right] + P - \varepsilon \tag{6.31}$$

where the turbulent Prandtl number for kinetic energy, $\sigma_k$, is one of the five model parameters; the quantity $P$ is called the production of turbulent kinetic energy and is defined as

$$P = -\overline{u_i u_j}\, \frac{\partial \overline{U}_i}{\partial x_j} \tag{6.32}$$

An empirical model transport equation for the dissipation of turbulent kinetic energy $\varepsilon$ is assumed as

$$\frac{\partial \varepsilon}{\partial t} + \overline{U}_i \frac{\partial \varepsilon}{\partial x_i} = \frac{\partial}{\partial x_j}\left[\left(\nu + \frac{\nu_T}{\sigma_\varepsilon}\right)\frac{\partial \varepsilon}{\partial x_j}\right] + C_{\varepsilon 1}\frac{P\varepsilon}{k} - C_{\varepsilon 2}\frac{\varepsilon^2}{k} \tag{6.33}$$

where $C_{\varepsilon 1}$, $C_{\varepsilon 2}$, and $\sigma_\varepsilon$ are model parameters [214–217]. In addition to the linear eddy viscosity model given in (6.29), the k−ε model also assumes the eddy viscosity to be given by

$$\nu_T = C_\mu \frac{k^2}{\varepsilon} \tag{6.34}$$

where $C_\mu$ is a model parameter. The standard values of these parameters are given in Table 6.14. Two constraints suggested in Pope [215] can be used to reduce the total number of model parameters to four. The first constraint corresponds to the high-Reynolds-number fully developed channel flow; the second constraint comes from the homogeneous shear flow [215].
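These two constraints are given by, respectively,

$$\kappa^2 = \sigma_\varepsilon C_\mu^{0.5}\,(C_{\varepsilon 2} - C_{\varepsilon 1}), \qquad \frac{P}{\varepsilon} = \frac{C_{\varepsilon 2} - 1}{C_{\varepsilon 1} - 1} \tag{6.35}$$

where $\kappa$ is the von Kármán constant. From experiments, $P/\varepsilon \approx 1.7$, but Edeling et al. [121] suggested the use of $P/\varepsilon \approx 2.09$, obtained using the standard values of $C_{\varepsilon 1}$ and $C_{\varepsilon 2}$, so that the posterior means for these parameters will be close to their standard values. Using these constraints, the uncertain model parameters are chosen as $\theta_p = [C_\mu, C_{\varepsilon 2}, \sigma_k, \kappa]^T$.

Table 6.14: Standard values of the closure coefficients for k−ε model class.

| Closure coeff. | Standard value |
|---|---|
| $C_\mu$ | 0.09 |
| $C_{\varepsilon 1}$ | 1.44 |
| $C_{\varepsilon 2}$ | 1.92 |
| $\sigma_k$ | 1.0 |
| $\sigma_\varepsilon$ | 1.3 |

6.4.2 RNG k−ε model class

This model class is a modification of the standard k−ε model class that tries to account for more than one scale of turbulence. Here, the equation for the dissipation of turbulent kinetic energy is modified by adding an extra term

$$R = \frac{C_\mu \eta^3 \left(1 - \dfrac{\eta}{\eta_0}\right)}{1 + \beta \eta^3}\, \frac{\varepsilon^2}{k} \tag{6.36}$$

to the left side of (6.33) [218], where $\eta = 1/\sqrt{C_\mu}$ and $\eta_0 = 4.28$. The standard values of these parameters are given in Table 6.15.

Table 6.15: Standard values of the closure coefficients for RNG k−ε model class.

| Closure coeff. | Standard value |
|---|---|
| $C_\mu$ | 0.0845 |
| $C_{\varepsilon 1}$ | 1.42 |
| $C_{\varepsilon 2}$ | 1.68 |
| $\sigma_k$ | 0.7194 |
| $\sigma_\varepsilon$ | 0.7194 |
| $\beta$ | 0.012 |

The constant $\beta$ can be estimated from the following two constraints

$$C_{\varepsilon 1}^* = C_{\varepsilon 1} - \frac{\eta\left(1 - \dfrac{\eta}{\eta_0}\right)}{1 + \beta \eta^3}, \qquad \kappa^2 = \sigma_\varepsilon C_\mu^{0.5}\,(C_{\varepsilon 2} - C_{\varepsilon 1}^*)$$

Using these constraints, the uncertain model parameters are chosen as $\theta_p = [C_\mu, C_{\varepsilon 2}, \sigma_k, \sigma_\varepsilon, \kappa]^T$.

6.4.3 k−ω model class

This two-equation RANS turbulence model class uses the following model transport equations for the turbulent kinetic energy $k$ and the turbulence frequency $\omega \equiv \varepsilon/k$, given by [216]

$$\begin{aligned} \frac{\partial k}{\partial t} + \overline{U}_i \frac{\partial k}{\partial x_i} &= \frac{\partial}{\partial x_j}\left[(\nu + \sigma_k \nu_T)\frac{\partial k}{\partial x_j}\right] + P - \beta^* k \omega \\ \frac{\partial \omega}{\partial t} + \overline{U}_i \frac{\partial \omega}{\partial x_i} &= \frac{\partial}{\partial x_j}\left[(\nu + \sigma_\omega \nu_T)\frac{\partial \omega}{\partial x_j}\right] + \alpha \frac{P\omega}{k} - \beta \omega^2 \end{aligned} \tag{6.37}$$

where the turbulent kinematic viscosity is given by $\nu_T = k/\omega$. The standard values for the closure coefficients are given in Table 6.16.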
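The auxiliary functions to close the system are given by

$$\begin{aligned} \beta &= \beta_0 f_\beta, \quad f_\beta = \frac{1 + 70\chi_\omega}{1 + 80\chi_\omega}, \quad \chi_\omega = \frac{\Omega_{ij}\Omega_{jk}S_{ki}}{(\beta_0^* \omega)^3}, \\ \beta^* &= \beta_0^* f_{\beta^*}, \quad f_{\beta^*} = \begin{cases} 1 & \text{for } \chi_k \leq 0 \\ \dfrac{1 + 680\chi_k^2}{1 + 400\chi_k^2} & \text{for } \chi_k > 0 \end{cases}, \quad \chi_k = \frac{1}{\omega^3}\frac{\partial k}{\partial x_j}\frac{\partial \omega}{\partial x_j} \end{aligned} \tag{6.38}$$

A constraint can be used to give an appropriate value for the von Kármán constant using these coefficients

$$\alpha = \frac{\beta_0}{\beta_0^*} - \frac{\sigma_\omega \kappa^2}{\sqrt{\beta_0^*}} \tag{6.39}$$

This constraint can be used to fix $\alpha$, giving uncertain model parameters $\theta_p = [\beta_0, \beta_0^*, \sigma_\omega, \sigma_k, \kappa]^T$.

Table 6.16: Standard values of the closure coefficients for k−ω model class.

| Closure coeff. | Standard value |
|---|---|
| $\beta_0$ | 0.072 |
| $\beta_0^*$ | 0.09 |
| $\alpha$ | 0.52 |
| $\sigma_k$ | 0.5 |
| $\sigma_\omega$ | 0.5 |

6.4.4 Spalart-Allmaras model class

The Spalart-Allmaras model is a one-equation turbulence model [219] where the governing equation for a viscosity-like state variable $\tilde{\nu}$ is given by

$$\frac{\partial \tilde{\nu}}{\partial t} + \overline{U}_j \frac{\partial \tilde{\nu}}{\partial x_j} = C_{b1}\tilde{S}\tilde{\nu} - C_{\omega 1} f_\omega \left(\frac{\tilde{\nu}}{d}\right)^2 + \frac{1}{\sigma}\frac{\partial}{\partial x_j}\left[(\nu + \tilde{\nu})\frac{\partial \tilde{\nu}}{\partial x_j}\right] + \frac{C_{b2}}{\sigma}\frac{\partial \tilde{\nu}}{\partial x_i}\frac{\partial \tilde{\nu}}{\partial x_i} \tag{6.40}$$

where $d$ is the distance to the nearest wall. The turbulent kinematic viscosity and the auxiliary functions are given by

$$\begin{aligned} \nu_T &= \tilde{\nu} f_{v1}, \quad f_{v1} = \frac{\chi^3}{\chi^3 + C_{v1}^3}, \quad \chi = \frac{\tilde{\nu}}{\nu}, \quad \tilde{S} = S + \frac{\tilde{\nu}}{\kappa^2 d^2} f_{v2}, \\ S &= \sqrt{2\Omega_{ij}\Omega_{ij}}, \quad \Omega_{ij} = \frac{1}{2}\left(\frac{\partial \overline{U}_i}{\partial x_j} - \frac{\partial \overline{U}_j}{\partial x_i}\right), \quad f_{v2} = 1 - \frac{\chi}{1 + \chi f_{v1}}, \\ f_\omega &= g\left[\frac{1 + C_{\omega 3}^6}{g^6 + C_{\omega 3}^6}\right]^{1/6}, \quad g = r + C_{\omega 2}(r^6 - r), \quad r = \frac{\tilde{\nu}}{\tilde{S}\kappa^2 d^2} \end{aligned} \tag{6.41}$$

The standard values for the closure coefficients are given in Table 6.17.

Table 6.17: Standard values of the closure coefficients for Spalart-Allmaras model class.

| Closure coeff. | Standard value |
|---|---|
| $C_{b1}$ | 0.1355 |
| $C_{b2}$ | 0.622 |
| $C_{\omega 1}$ | 2.7635 |
| $C_{\omega 2}$ | 0.3 |
| $C_{\omega 3}$ | 2.0 |
| $C_{v1}$ | 7.1 |
| $\sigma$ | 0.6667 |

Using the constraint function

$$C_{\omega 1} = \frac{C_{b1}}{\kappa} + \frac{1 + C_{b2}}{\sigma} \tag{6.42}$$

the model parameters are chosen as $\theta_p = [C_{b1}, C_{b2}, C_{\omega 2}, C_{\omega 3}, C_{v1}, \sigma, \kappa]^T$.

6.4.5 NASA Two Dimensional Hump Flow Measurement Data

The measurement data are obtained from the NASA (National Aeronautics and Space Administration) website (https://cfdval2004.larc.nasa.gov/case3.html) for the flow over a hump.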
The experimental setup is shown in Figure 6.11, with a magnified side view of the hump geometry shown in Figure 6.12. The free-stream velocity is $U_\infty = 34.6$ m/s. The ambient pressure and temperature are approximately 101.325 kPa and 298 K, respectively. The ambient air density and dynamic viscosity are 1.185 kg/m³ and 18.4×10⁻⁶ kg/(m·s), respectively. Of the two experimental cases described on the website, only the "no flow control case" data are used here. Details of this experiment can be found in Greenblatt et al. [220] and Naughton et al. [221]. From the available dataset, velocity and wall shear stress measurements are used in this section. The horizontal velocity measurement profiles $U_1(x_1, x_2)$ are shown as functions of $x_2$ for several $x_1$ values in Figure 6.13.

Figure 6.11: The experimental setup used to generate measurements for the two dimensional hump flow case. [Annotation: $U_\infty = 34.6$ m/s.]

Figure 6.12: The hump geometry shown in detail. [Axes: $x_1$ (mm) from 0 to 400; $x_2$ (mm) from 0 to 60.]

Figure 6.13: Plot of the velocity measurements from the experiment at different distances from the hump in the downhill direction.

6.4.6 Uncertainty Model

Gorlé and Iaccarino [222] and Emory et al. [223] used perturbations of the eigenvalue decomposition of the Reynolds stress tensor. Cheung et al. [119] used a multiplicative error model in the velocity prediction of the Spalart-Allmaras RANS model class. In this study, the multiplicative error model in the predicted velocity used in Cheung et al. [119], Oliver and Moser [120], and Edeling et al. [121, 122] will be used.
In this uncertainty model, the velocity in the $x_1$ direction predicted by the RANS models is given by

$$U_1(x, \theta) = E_m(x, \theta_u)\, U_1^{RANS}(x, \theta_p) \tag{6.43}$$

where $U_1^{RANS}(x, \theta_p)$ is the velocity in the $x_1$ direction predicted by the RANS model with RANS model parameter $\theta_p$; the multiplicative error is given by $E_m(x, \theta_u)$ with hyperparameter $\theta_u$ used to define the uncertainty; and $U_1(x, \theta)$ is the true velocity with $\theta = [\theta_p, \theta_u]^T$. The required covariances are then written as

$$k_{UU}(x, x' \,|\, \theta_u) = U_1^{RANS}(x, \theta_p)\, k_{E_m}(x, x' \,|\, \theta_u)\, U_1^{RANS}(x', \theta_p) \tag{6.44}$$

where the subscript $U$ corresponds to the velocity. The multiplicative error $E_m$ is assumed in this section to be a Gaussian process with unit mean and covariance function given by

$$k_{E_m}(x, x' \,|\, \theta_u) = \sigma^2 \exp\left[-\sum_{i=1}^{2}\left(\frac{x_i - x_i'}{\alpha l_i}\right)^2\right] \tag{6.45}$$

where the hyperparameter is given by $\theta_u = [\sigma, \alpha, l_1, l_2]^T$. Next, the measurement error is introduced as

$$d = d_{RANS} + e \tag{6.46}$$

where $d_{RANS}$ consists of velocities and wall shear stresses obtained from a RANS model; $e$ is the measurement error distributed as $\mathcal{N}(0, \Sigma_e)$; and $d$ is the measurement obtained from the NASA website mentioned before. This helps in writing the likelihood function (omitting the model class variable $\mathcal{M}_k$) as

$$p(D \,|\, \theta) = \frac{1}{(2\pi)^{N/2}\, |\Sigma|^{1/2}} \exp\left[-\frac{1}{2}(d - d_{RANS})^T \Sigma^{-1} (d - d_{RANS})\right] \tag{6.47}$$

where $\Sigma = \Sigma_e + \Sigma_{RANS}$ and $\Sigma_{RANS}$ consists of the covariance values obtained from (6.44). This likelihood function is then used in the proposed model validation framework.

6.4.7 Priors for the Parameters

The prior distribution for the model parameters $\theta_p$ should be chosen such that they do not vary by a large amount from their standard values while still describing the uncertainty in the measurements. Also, the range of these parameters should not give physically impossible model outputs and should stay inside the limits prescribed in the models.
To satisfy these criteria, the model parameters are assumed to have independent prior distributions, either truncated Gaussian or uniform, with the lower and upper bounds shown in Table 6.18. The bounds are chosen based on the recommendations in Edeling et al. [122] and Cheung et al. [119]. A truncated Gaussian distribution, with mean equal to the standard value, is chosen for the parameters whose posterior means show little variation from their standard values in the existing literature. A uniform distribution is chosen for the parameters for which no such information is available. The prior distribution for the hyperparameter $\theta_u$ is chosen so that the likelihood-based falsification yields a meaningful result in which some models are falsified. Details about the choice of these hyperparameters are given in Chapter 2 and De et al. [52].

Table 6.18: Prior probability distributions of the different parameters of each model class.

| Model class | Param. | Distribution | Lower bound | Upper bound | Mean | Standard deviation |
|---|---|---|---|---|---|---|
| k−ε ($\mathcal{M}_1$) | $C_\mu$ | Uniform | 0.055 | 0.135 | 0.095 | 0.0231 |
| | $C_{\varepsilon 2}$ | Trunc. Gauss. | 1.15 | 2.88 | 1.92 | 0.2 |
| | $\sigma_k$ | Uniform | 0.55 | 1.45 | 1.0 | 0.2598 |
| | $\kappa$ | Trunc. Gauss. | 0.205 | 0.615 | 0.41 | 0.0625 |
| RNG k−ε ($\mathcal{M}_2$) | $C_\mu$ | Uniform | 0.055 | 0.135 | 0.095 | 0.0231 |
| | $C_{\varepsilon 2}$ | Trunc. Gauss. | 1.15 | 2.88 | 1.92 | 0.2 |
| | $\sigma_k$ | Uniform | 0.55 | 1.45 | 1.0 | 0.2598 |
| | $\sigma_\varepsilon$ | Uniform | 0.55 | 1.45 | 1.0 | 0.2598 |
| | $\kappa$ | Trunc. Gauss. | 0.205 | 0.615 | 0.41 | 0.0625 |
| k−ω ($\mathcal{M}_3$) | $\beta_0$ | Uniform | 0.0576 | 0.0864 | 0.072 | 0.0083 |
| | $\beta_0^*$ | Uniform | 0.072 | 0.108 | 0.09 | 0.0104 |
| | $\sigma_k$ | Uniform | 0.4 | 0.6 | 0.5 | 0.0577 |
| | $\sigma_\omega$ | Uniform | 0.4 | 0.6 | 0.5 | 0.0577 |
| | $\kappa$ | Trunc. Gauss. | 0.205 | 0.615 | 0.41 | 0.0625 |
| SA ($\mathcal{M}_4$) | $C_{b1}$ | Uniform | 0.10162 | 0.16938 | 0.1355 | 0.0196 |
| | $C_{b2}$ | Uniform | 0.4665 | 0.7775 | 0.622 | 0.0898 |
| | $\kappa$ | Trunc. Gauss. | 0.205 | 0.615 | 0.41 | 0.0625 |
| | $C_{\omega 2}$ | Uniform | 0.15 | 0.45 | 0.3 | 0.0866 |
| | $C_{\omega 3}$ | Uniform | 1.0 | 3.0 | 2.0 | 0.5774 |
| | $C_{v1}$ | Uniform | 3.55 | 10.65 | 7.1 | 2.0496 |
| | $\sigma$ | Uniform | 0.6 | 1.0 | 0.8 | 0.1155 |

6.4.8 Implementation

This two dimensional hump flow is implemented in the open-source OpenFOAM 4.1 software package (www.openfoam.org) using its default library for the RANS model. For the solution of the model transport equations, the discretized equations are solved using a semi-implicit method for the pressure-linked equations (SIMPLE) algorithm [224]. Mesh convergence is observed for the default values of the parameters using the metrics

$$\begin{aligned} J_1(x_2) &= \frac{|U_f(x_2) - U_c(x_2)|}{U_\infty}, & J_2(x_2) &= \frac{|U_f(x_2) - U_c(x_2)|}{\max_{x_2} |U_f(x_2)|}, \\ J_3 &= \frac{\|U_f(x_2) - U_c(x_2)\|_2}{\|U_f(x_2)\|_2}, & J_4 &= \frac{1}{n}\frac{\|U_f(x_2) - U_c(x_2)\|_2}{\max_{x_2} |U_f(x_2)|} \end{aligned} \tag{6.48}$$

where $U_f(x_2)$ and $U_c(x_2)$ correspond to velocities in the fine mesh and the coarse mesh, respectively; $\|v\|_2$, with $\|v\|_2^2 = \sum_{i=1}^{n} v_i^2$, is the Euclidean two-norm of the vector $v$; and $n$ is the number of measurements. The mesh is refined until halving the mesh grid separation results in less than 1% change in all of the metrics in (6.48).

6.4.9 Results

The pre-processing falsification step is applied with $\phi = 0.90$, and the hyperparameters are chosen as $\sigma = 0.25$, $\alpha = -2$, and $l_1 = l_2 = 10$. The result is shown in Table 6.19. The three remaining model classes are then used for Bayesian model class selection, which gives a posterior model class probability of 1.0 for k−ε, as shown in Table 6.20. This can be expected because most of the velocity measurements used are far from the boundary layer, where the k−ε models represent the turbulent flow well. This use of velocity measurements also explains the weaker performance of the k−ω models, which are better suited to flow near the wall.
The RNG k−ε model outputs are sensitive to changes in its parameters; hence, on average, this model class does not perform well in the model class selection. The identified parameters of the k−ε models are shown in Table 6.21. The table shows that $C_\mu$, $C_{\varepsilon 1}$, and $C_{\varepsilon 2}$ are close to their standard values, but the parameters $\sigma_k$ and $\sigma_\varepsilon$ differ significantly from their standard values, suggesting that the eddy viscosity might not be modeled properly if the standard values of these parameters are used.

Table 6.19: Unfalsified models using proposed intra-model class falsification.

| Model class ($\mathcal{M}_k$) | Unfalsified (%) |
|---|---|
| k−ε | 8.75 |
| RNG k−ε | 0.70 |
| k−ω | 2.45 |
| SA | 0.00 |

Table 6.20: Posterior model class probabilities evaluated using the Bayesian model class selection.

| Model class, $\mathcal{M}_k$ | Posterior model class probability, $P(\mathcal{M}_k \,|\, D)$ |
|---|---|
| k−ε | 1.0 |
| RNG k−ε | 0.0 |
| k−ω | 0.0 |

Table 6.21: Identified values of the parameters of the k−ε models.

| Parameters | Standard value | Identified value |
|---|---|---|
| $C_\mu$ | 0.09 | 0.0967 |
| $C_{\varepsilon 1}$ | 1.44 | 1.3743 |
| $C_{\varepsilon 2}$ | 1.92 | 1.7822 |
| $\sigma_k$ | 1.00 | 1.2307 |
| $\sigma_\varepsilon$ | 1.30 | 1.7865 |

6.4.10 Closure

The proposed framework is shown here to be applicable to an important problem in computational fluid dynamics. Different RANS models for turbulence are evaluated for reproducing 2D flow over a hump. The measurement data are collected from experiments performed at NASA. The results show that the k−ε models, which perform well in the free stream, are the best model class, since most of the velocity measurements used are far from the boundary layer. In future studies, wall-modeled LES models will be used for the problem of turbulence modeling, and the model validation framework will be applied to select a model class.

6.5 Work in Progress: Validation of Four-Story Building Models

Figure 6.14: The experimental set-up. (Picture taken by Prof. Erik Johnson.)
A base-isolated test structure, mounted on the world's largest six degree-of-freedom shake table at Japan's E-Defense lab, was tested in March 2013 and again in August 2013 (see Figure 6.14) [225]. In this section, measurements from the test performed on 8 August 2013 are used. The structure consists of a four-story, asymmetric, moment frame with a setback and coupled transverse-torsional motion. The 690-ton superstructure is roughly 14 m × 10 m × 15 m. The isolation layer is composed of two rubber bearings, two elastic sliding bearings, and two pairs of passive U-shaped steel yielding dampers. The building was subjected to random excitation along different table axes, i.e., in the $x$, $y$ and $z$ directions, and to scaled versions of historical and synthetic earthquake ground motions. Accelerometers were placed on each floor at three different corner locations, each recording responses in the $x$, $y$, and $z$ directions, but only two sensors were used on the roof (because of the different floor plan that allowed for a fourth-story deck). A finite element model, shown in Figure 6.15, is developed according to the design drawings. The beams, columns, and shear walls are modeled by solid elements, with reinforced steel bars modeled by truss elements embedded in them. The floor slabs and the nonstructural walls (autoclaved lightweight concrete plates) are modeled using shell elements, and the isolation devices are modeled using spring elements.

Figure 6.15: The finite element model (kindly provided by Mr. Tianhao Yu).

6.5.1 Results

Five model classes are constructed in which the moduli of elasticity for beams and columns in different floors are used as parameters of these classes. The first model class $\mathcal{M}_1$ assumes that all beams and columns in the building have the same elastic modulus. The second model class $\mathcal{M}_2$ assumes different elastic moduli for beams and columns.
The remaining model classes similarly differentiate between elastic moduli of beams of different floors and in the $x$ and $y$ directions, as shown in Table 6.22. The prior distributions of these parameters are assumed Gaussian with the means and standard deviations shown in Table 6.22. The first six natural frequencies and mode shapes, estimated [226] from the acceleration data using a subspace state space system identification method (N4SID method) [227], are used as measurements and compared with predictions from the models for model falsification with 1000 models from each class. To compute the likelihood, the covariance matrix $\Sigma$ is assumed diagonal, with variances $0.025^2$ and $0.085^2$ for natural frequency residuals and MAC (modal assurance criterion¹) values, respectively. The model falsification results show that the first two model classes are rejected at this step (Table 6.22). The remaining three model classes are then used for Bayesian model class selection with $P(\mathcal{M}_k) = 1/3$ for $k = 3, 4, 5$. The estimated posterior model class probabilities show that model classes 4 and 5 are almost equally good for modeling the superstructure. Model class 5 has more parameters, but Occam's razor principle of selecting the simplest model that explains the behavior, which is embedded inside Bayesian model selection, assigns a higher posterior probability to model class 4, which has fewer parameters [228].

¹ A measure of similarity between the mode shapes obtained from the experiment and from the model.

Table 6.22: Different model classes with their parameters, mean and standard deviation of their prior distribution, results from model falsification and selection.
| Model class $M_k$ | $\theta$ | Mean | Standard deviation | Unfalsified (%) | log(Evidence) | $P(M_k \mid D)$ |
|---|---|---|---|---|---|---|
| $M_1$ | $E_{\mathrm{Beam,Col}}$ | 27 GPa | 2.5 GPa | 0 | – | – |
| $M_2$ | $E_{\mathrm{Beam}}$ | 27 GPa | 2.5 GPa | 0 | – | – |
| | $E_{\mathrm{Col}}$ | 23 GPa | 2.5 GPa | | | |
| $M_3$ | $E_{\mathrm{Beam},1}$ | 27 GPa | 2.5 GPa | 0.5 | −154.9558 | ≈ 0.0 |
| | $E_{\mathrm{Beam},2,3,4}$ | 25 GPa | 2.5 GPa | | | |
| | $E_{\mathrm{Col}}$ | 23 GPa | 2.5 GPa | | | |
| $M_4$ | $E_{\mathrm{Beam},1}$ | 27 GPa | 2.5 GPa | 2.9 | −147.2980 | 0.5983 |
| | $E^{x}_{\mathrm{Beam},2,3,4}$ | 27 GPa | 2.5 GPa | | | |
| | $E^{y}_{\mathrm{Beam},2,3,4}$ | 23 GPa | 2.5 GPa | | | |
| | $E_{\mathrm{Col}}$ | 23 GPa | 2.5 GPa | | | |
| $M_5$ | $E_{\mathrm{Beam},1}$ | 27 GPa | 2.5 GPa | 3.8 | −147.6963 | 0.4017 |
| | $E^{x}_{\mathrm{Beam},2,3,4}$ | 27 GPa | 2.5 GPa | | | |
| | $E^{y}_{\mathrm{Beam},2,3,4}$ | 23 GPa | 2.5 GPa | | | |
| | $E^{x}_{\mathrm{Col}}$ | 23 GPa | 2.5 GPa | | | |
| | $E^{y}_{\mathrm{Col}}$ | 24 GPa | 2.5 GPa | | | |

6.5.2 Closure

This study implements a model validation strategy for a four-story building tested on a shake table. The strategy first uses model falsification to reduce the number of candidate model classes from five to three before the computationally expensive model selection step. Results from the selection step demonstrate the Occam’s razor principle built into Bayesian model class selection, which prefers models with fewer parameters. Using the two validated model classes for the superstructure, a multi-model control strategy will be implemented in future work to protect the structure from large earthquake motions.

6.6 Conclusion

This chapter discusses applications of the methodologies developed in the previous chapters of this dissertation. The wide range of problems shown here extends the applicability of the proposed framework. In future work, different application domains, such as material modeling and control theory, will be explored.

Chapter 7
Conclusions and Future Directions

Don’t cross a river if it is four feet deep on average.
– Nassim Nicholas Taleb, The Black Swan

The notion of Bayesian model selection involves choosing model(s) from a candidate model pool using Bayes’ theorem; however, this process can lead to erroneous selection if the candidate pool does not contain any valid models.
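As a small worked illustration of the Bayes' theorem step in model class selection, the sketch below converts the log-evidence values reported in Table 6.22 for model classes 3, 4, and 5 into posterior model class probabilities under the equal priors $P(M_k) = 1/3$. The function name and the log-sum-exp style normalization are assumptions made for this illustration and are not taken from the thesis.

```python
import math


def posterior_model_probabilities(log_evidences, priors=None):
    """Bayes' theorem over model classes: P(M_k | D) is proportional to
    p(D | M_k) * P(M_k). Works in the log domain for numerical stability."""
    n = len(log_evidences)
    if priors is None:
        priors = [1.0 / n] * n  # equal prior probability for every class
    log_post = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    shift = max(log_post)  # subtract the maximum before exponentiating
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Log evidences as reported in Table 6.22 (model classes 1 and 2 were falsified).
log_ev = {"M3": -154.9558, "M4": -147.2980, "M5": -147.6963}
probs = dict(zip(log_ev, posterior_model_probabilities(list(log_ev.values()))))

for name, p in probs.items():
    print(f"P({name} | D) = {p:.4f}")
```

Under these assumptions the computed probabilities for model classes 4 and 5 come out near 0.60 and 0.40, in line with Table 6.22 to within about $10^{-3}$, while model class 3 receives a negligible share.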
Model falsification, on the other hand, tries to eliminate the invalid models from the candidate pool, leaving many unfalsified models for future use, though without any way to distinguish between them. The proposed model validation framework unites the philosophical ideas of model falsification and Bayesian model selection into a single integrated model verification methodology. This framework applies falsification in a likelihood domain with false discovery rate control as pre- and postprocessing steps. The unfalsified models and model classes after the preprocessing step are used for Bayesian model class selection, and the validity of the finally selected model(s) or model class(es) is tested in the postprocessing step. A modification of the prior parameter distributions of the model classes based on the falsification result can also be incorporated within this framework. Because each framework step requires some forward model runs, an efficient computational engine is also an integral part of this framework. This framework is shown not only to identify the correct models but also to overcome the shortcomings of each of these methods applied alone, by efficiently and systematically eliminating incorrect model classes.

Each of the components of this framework is also improved individually. The inclusion of a statistical error criterion, namely the false discovery rate (FDR), is proposed for falsification based on residual error bounds. In this error criterion, the expected proportion of incorrect rejections among all rejections is controlled at a pre-chosen level. The use of this criterion provides more statistical power, so that more invalid models are falsified. Further, FDR control becomes more efficient than other more common criteria, such as the familywise error rate (FWER), as the number of measurements increases, which is the case for falsification of models of a dynamic system.
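To make the contrast between FDR and FWER control concrete, the sketch below compares the standard Benjamini-Hochberg step-up procedure [51] with a Bonferroni correction on a set of p-values. The p-values are made up for illustration, the function names are hypothetical, and this is not necessarily the exact bound-setting procedure used in this dissertation; it only shows why an FDR criterion tends to reject more hypotheses at the same nominal level.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """FWER control: reject only p-values at or below alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """FDR control (Benjamini-Hochberg step-up): find the largest rank k
    with p_(k) <= k * alpha / m and reject the k smallest p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Hypothetical p-values, one per measurement residual (illustrative only).
p_values = [0.001, 0.008, 0.012, 0.020, 0.030, 0.200, 0.450, 0.700]

n_fwer = sum(bonferroni_reject(p_values))
n_fdr = sum(benjamini_hochberg_reject(p_values))
print(f"Bonferroni (FWER) rejections: {n_fwer}")
print(f"Benjamini-Hochberg (FDR) rejections: {n_fdr}")
```

With these illustrative values, the Bonferroni threshold of 0.05/8 admits a single rejection, whereas the step-up thresholds of the Benjamini-Hochberg procedure admit five; the gap generally widens as the number of tests grows.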
Likelihood-bound falsification is also proposed, utilizing bounds on the probability that a model could generate the observed residual errors given the measurement data. This method of falsification is also shown to be useful for model parameter estimation and robust future predictions.

In Bayesian model selection, a multilevel method for efficient estimation of evidence is proposed, where the multidimensional evidence integral is first converted into a one-dimensional integral using a probability integral transform and then evaluated using a quadrature. Adaptive sampling techniques are used to efficiently sample from the different likelihood levels required in the quadrature. Computational savings for the evidence calculation are achieved compared to other standard approaches by using this proposed intelligent sampling technique and by simulating the dynamic system with an efficient dynamic response algorithm that requires the solution of Volterra integral equations.

This efficient dynamic response algorithm is also used for passive control design of a cable-stayed bridge and to propose an efficient forward uncertainty propagation method. Passive control devices, e.g., viscous dampers and tuned mass dampers, are designed using this algorithm to mitigate the earthquake-induced motion of a cable-stayed bridge, namely, the Bill Emerson Memorial Bridge built in 2003 across the Mississippi River. Design-under-uncertainty is also explored for these passive control devices. The proposed method is compared to a standard solver for the governing differential equations of the bridge. The results show that, for the same level of accuracy, gains of many orders of magnitude in computational efficiency can be achieved in the design of passive control devices using this method.

In the uncertainty propagation algorithm, the random parameter space is divided into many strata.
Inside each stratum, a Taylor series expansion is used to approximate a quantity of interest (QoI) of the dynamic system. The required sensitivities in this approximation are evaluated using an efficient dynamic response algorithm by solving nonlinear Volterra integral equations. The mean and standard deviation of the QoI estimated using the proposed approach are compared to standard Monte Carlo sampling, and the results show that, for the same level of accuracy, the proposed method can produce large computational savings.

In this thesis, the proposed model validation methodology is implemented for a variety of problems from the structural and fluid dynamics fields. Different structures, varying from four degrees of freedom (DOF) to 1623 DOF and subjected to earthquake and wind excitation, are used to illustrate the framework. Models of a four-story building tested in Japan’s E-Defense lab in 2013 are also being validated using this hybrid framework. Fluid dynamics examples, such as flow past a cylinder and flow over a two-dimensional hump, are used to show the framework’s impact on fields other than structural dynamics. The numerical examples demonstrate the efficacy of this framework and its computational efficiency. For more complex models, with more measurements and a higher-dimensional parameter space Θ, much greater savings are expected when these models are validated using the proposed framework. For example, simulation of climate models or biological system models, where a variety of factors should be taken into account while developing them, is computationally expensive. This framework can be adapted to these models with greater savings.

In future work, the proposed method will be applied to cases where the behavior of the physical system changes over time. To handle time-varying dynamics, the framework must be implemented on-line with an added feedback loop, which opens up a new area of application that will be investigated in the future.
One such application will be to design a multimodel control strategy for a system behaving in both linear and nonlinear domains. Another important application that will be examined is the modeling of materials. For example, different finite element meshes can be used to approximate the deformation of a polycrystalline specimen, and the choice of a model must be based on evidence from the measurements and on the ease of model use. Recent developments in novel materials that satisfy specified performance requirements also require good models so that the materials can be efficiently simulated. This is an area of research where the proposed framework can be useful.

Appendix A
Modified Metropolis-Hastings Algorithm

The modified Metropolis-Hastings algorithm proposed in Au and Beck [115] is used here to generate $\theta_{\mathrm{new}}$ with a high acceptance rate. A brief description of the algorithm is given below. Let the prior distribution of the parameters be written as $p(\theta) = \prod_{i=1}^{n_\theta} p_i(\theta_i)$, at least using some approximate transformation, where $\theta \in \mathbb{R}^{n_\theta \times 1}$. At any iteration of the nested sampling, the chain starts from $\theta_k$, which is chosen from one of the remaining samples. The proposal density to generate candidates is assumed to be $q(\theta^c \mid \theta) = \prod_{i=1}^{n_\theta} q_i(\theta^c_i \mid \theta_i)$. A sequence of $\theta^{(l)}$ can then be generated such that $L(\theta^{(l)}) > L(\theta_j)$ using the steps shown in Algorithm 6.

Algorithm 6: Modified Metropolis-Hastings algorithm.
    Initialization: set $l = 0$ and $\theta^{(l)} = \theta_k$
    /* Select candidate $\theta^c$ */
    for $i = 1, \ldots, n_\theta$ do
        Generate $\phi \sim q_i(\phi \mid \theta^{(l)}_i)$
        Evaluate the acceptance probability $p_a = \min\left\{1, \dfrac{p_i(\phi)\, q_i(\theta^{(l)}_i \mid \phi)}{p_i(\theta^{(l)}_i)\, q_i(\phi \mid \theta^{(l)}_i)}\right\}$
        Generate $u \sim U(0, 1)$
        if $u < p_a$ then
            Set $\theta^c_i = \phi$
        else
            Set $\theta^c_i = \theta^{(l)}_i$
        end
    end
    /* Accept or reject $\theta^c$ */
    if $L(\theta^c) > L(\theta_j)$ then
        Set $\theta^{(l+1)} = \theta^c$
    else
        Set $\theta^{(l+1)} = \theta^{(l)}$
    end
    Set $l = l + 1$
    Repeat

References
[1] S. J. Dyke, J. M. Caicedo, G. Turan, L. A. Bergman, and S.
Hague, “Phase I benchmark control problem for seismic response of cable-stayed bridges,” ASCE Journal of Structural Engineering, vol. 129, no. 7, pp. 857–872, 2003.
[2] B. F. Spencer, Jr. and S. Nagarajaiah, “State of the art of structural control,” ASCE Journal of Structural Engineering, vol. 129, no. 7, pp. 845–856, 2003.
[3] C. R. Rao, Y. Wu, S. Konishi, and R. Mukerjee, “On model selection,” Lecture Notes-Monograph Series, vol. 38, pp. 1–64, 2001.
[4] A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth, “Occam’s razor,” Information Processing Letters, vol. 24, no. 6, pp. 377–380, 1987.
[5] H. Chipman, E. I. George, R. E. McCulloch, M. Clyde, D. P. Foster, and R. A. Stine, “The practical implementation of Bayesian model selection,” Lecture Notes-Monograph Series, vol. 38, pp. 65–134, 2001.
[6] K. M. Cremers, “Stock return predictability: A Bayesian model selection perspective,” Review of Financial Studies, vol. 15, no. 4, pp. 1223–1249, 2002.
[7] C. Andrieu, P. Djurić, and A. Doucet, “Model selection by MCMC computation,” Signal Processing, vol. 81, no. 1, pp. 19–37, 2001.
[8] S. F. Gull, “Bayesian inductive inference and maximum entropy,” in Maximum-entropy and Bayesian methods in science and engineering, pp. 53–74, Springer, 1988.
[9] D. J. C. MacKay, “Bayesian interpolation,” Neural Computation, vol. 4, no. 3, pp. 415–447, 1992.
[10] D. J. C. MacKay, Bayesian methods for adaptive models. PhD thesis, California Institute of Technology, 1992.
[11] J. L. Beck and K.-V. Yuen, “Model selection using response measurements: Bayesian probabilistic approach,” ASCE Journal of Engineering Mechanics, vol. 130, no. 2, pp. 192–203, 2004.
[12] R. E. Kass and A. E. Raftery, “Bayes factors,” Journal of the American Statistical Association, vol. 90, no. 430, pp. 773–795, 1995.
[13] S. N. Goodman, “Toward evidence-based medical statistics. 2: The Bayes factor,” Annals of Internal Medicine, vol. 130, no. 12, pp. 1005–1013, 1999.
[14] E. T.
Jaynes, Probability Theory: The Logic of Science. Cambridge, U.K.: Cambridge University Press, 2003.
[15] J. L. Beck, “Bayesian system identification based on probability logic,” Structural Control and Health Monitoring, vol. 17, no. 7, pp. 825–847, 2010.
[16] S. H. Cheung and J. L. Beck, “Calculation of posterior probabilities for Bayesian model class assessment and averaging from posterior samples based on dynamic system data,” Computer-Aided Civil and Infrastructure Engineering, vol. 25, no. 5, pp. 304–321, 2010.
[17] M. Muto and J. L. Beck, “Bayesian updating and model class selection for hysteretic structural models using stochastic simulation,” Journal of Vibration and Control, vol. 14, no. 1-2, pp. 7–34, 2008.
[18] L. Mthembu, T. Marwala, M. I. Friswell, and S. Adhikari, “Model selection in finite element model updating using the Bayesian evidence statistic,” Mechanical Systems and Signal Processing, vol. 25, no. 7, pp. 2399–2412, 2011.
[19] K. R. Popper, The Logic of Scientific Discovery. Routledge, New York, 2002. Translation of Logik der Forschung, first published in 1934 by Verlag von Julius Springer, Vienna, Austria.
[20] G. E. P. Box and N. R. Draper, Empirical Model-Building and Response Surfaces. John Wiley & Sons, Ltd., 1987.
[21] B. Raphael and I. F. C. Smith, “Finding the right model for bridge diagnosis,” Artificial Intelligence in Structural Engineering, Computer Science, Lecture Notes in Artificial Intelligence, vol. 1454, pp. 308–319, 1998.
[22] Y. Robert-Nicoud, B. Raphael, and I. F. C. Smith, “System identification through model composition and stochastic search,” ASCE Journal of Computing in Civil Engineering, vol. 19, pp. 239–247, 2005.
[23] I. F. C. Smith and S. Saitta, “Improving knowledge of structural system behavior through multiple models,” ASCE Journal of Structural Engineering, vol. 134, pp. 553–561, 2008.
[24] J. A. Goulet and I. F. C.
Smith, “Predicting the usefulness of monitoring for identifying the behavior of structures,” ASCE Journal of Structural Engineering, vol. 139, pp. 1716–1727, 2013.
[25] J. A. Goulet, C. Michel, and I. F. C. Smith, “Hybrid probabilities and error-domain structural identification using ambient vibration monitoring,” Mechanical Systems and Signal Processing, vol. 37, pp. 199–212, 2013.
[26] J. A. Goulet, M. Texier, C. Michel, I. F. C. Smith, and L. Chouinard, “Quantifying the effects of modeling simplifications for structural identification of bridges,” Journal of Bridge Engineering, vol. 19, pp. 59–71, 2014.
[27] M. G. Safonov and T. Tsao, “The unfalsified control concept and learning,” IEEE Transactions on Automatic Control, vol. 42, pp. 843–847, 1997.
[28] P. B. Brugarolas and M. G. Safonov, “Learning about dynamical systems via unfalsification of hypotheses,” International Journal of Robust and Nonlinear Control, vol. 14, no. 11, pp. 933–943, 2004.
[29] R. S. Smith and J. C. Doyle, “Model validation: a connection between robust control and identification,” IEEE Transactions on Automatic Control, vol. 37, no. 7, pp. 942–952, 1992.
[30] K. Poolla, P. Khargonekar, A. Tikku, J. Krause, and K. Nagpal, “A time-domain approach to model validation,” IEEE Transactions on Automatic Control, vol. 39, pp. 951–959, 1994.
[31] R. Smith, G. Dullerud, S. Rangan, and K. Poolla, “Model validation for dynamically uncertain systems,” Mathematical Modelling of Systems: Methods, Tools and Applications in Engineering and Related Sciences, vol. 3, pp. 43–58, 1997.
[32] B. R. Woodley, R. L. Kosut, and J. P. How, “Uncertainty model unfalsification with simulation,” in Proceedings of the American Control Conference, Philadelphia, PA, June, 1998.
[33] R. L. Kosut, “Uncertainty model unfalsification for robust adaptive control,” in Proceedings of the IFAC Workshop on Adaptive Systems in Control and Signal Processing, University of Strathclyde, Glasgow, Scotland, UK, 1998.
[34] J.
Hasenauer, S. Waldherr, K. Wagner, and F. Allgöwer, “Parameter identification, experimental design and model falsification for biological network models using semidefinite programming,” Systems Biology, IET, vol. 4, pp. 119–130, March 2010.
[35] G. E. Box, “Science and statistics,” Journal of the American Statistical Association, vol. 71, no. 356, pp. 791–799, 1976.
[36] M. Sznaier and M. C. Mazzaro, “An LMI approach to control-oriented identification and model (in)validation of LPV systems,” IEEE Transactions on Automatic Control, vol. 48, no. 9, pp. 1619–1624, 2003.
[37] F. D. Bianchi and R. S. Sánchez-Peña, “Robust identification/invalidation in an LPV framework,” International Journal of Robust and Nonlinear Control, vol. 20, pp. 301–312, 2010.
[38] F. C. Schweppe, “Recursive state estimation: unknown but bounded errors and system inputs,” IEEE Transactions on Automatic Control, vol. 13, no. 1, pp. 22–28, 1968.
[39] A. Moser, “‘Macroscopic pattern analysis’ based on formal analogies as a scientific methodology for complex systems,” Acta Biotechnologica, vol. 15, pp. 173–195, 1995.
[40] P. von Ballmoos, N. Guessoum, P. Jean, and J. Knödlseder, “Models for the positive latitude e-e+ annihilation feature,” Astronomy & Astrophysics, vol. 397, pp. 635–643, 2003.
[41] A. Tarantola, “Popper, Bayes and the inverse problem,” Nature Physics, vol. 2, pp. 492–494, 2006.
[42] J. A. Goulet, P. Kripakaran, and I. Smith, “Multimodel structural performance monitoring,” ASCE Journal of Structural Engineering, vol. 136, pp. 1309–1318, 2010.
[43] J. A. Goulet, S. Coutu, and I. F. C. Smith, “Model falsification diagnosis and sensor placement for leak detection in pressurized pipe networks,” Advanced Engineering Informatics, vol. 27, pp. 261–269, 2013.
[44] J. A. Goulet and I. F. C. Smith, “Performance-driven measurement system design for structural identification,” ASCE Journal of Computing in Civil Engineering, vol. 27, no. 4, pp. 427–436, 2013.
[45] J. A. Goulet and I.
F. C. Smith, “Structural identification with systematic errors and unknown uncertainty dependencies,” Computers and Structures, vol. 128, pp. 251–258, 2013.
[46] K. Beven and A. Binley, “The future of distributed models: Model calibration and uncertainty prediction,” Hydrological Processes, vol. 6, pp. 279–298, 1992.
[47] K. Beven, “Prophecy, reality and uncertainty in distributed hydrological modelling,” Advances in Water Resources, vol. 16, no. 1, pp. 41–51, 1993.
[48] K. Beven and J. Freer, “Equifinality, data assimilation, and uncertainty estimation in mechanistic modelling of complex environmental systems using the GLUE methodology,” Journal of Hydrology, vol. 249, no. 1, pp. 11–29, 2001.
[49] K. J. Beven, Rainfall-runoff modelling: the primer. John Wiley & Sons, 2011.
[50] P. Mantovan and E. Todini, “Hydrological forecasting uncertainty assessment: Incoherence of the GLUE methodology,” Journal of Hydrology, vol. 330, no. 1, pp. 368–381, 2006.
[51] Y. Benjamini and Y. Hochberg, “Controlling the false discovery rate: A practical and powerful approach to multiple testing,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 57, pp. 289–300, 1995.
[52] S. De, P. T. Brewick, E. A. Johnson, and S. F. Wojtkiewicz, “Investigation of model falsification using error and likelihood bounds with application to a structural system,” ASCE Journal of Engineering Mechanics, vol. (In press), 2017.
[53] S. De, P. T. Brewick, E. A. Johnson, and S. F. Wojtkiewicz, “Exploration of error rate criteria to decide bounds for model falsification,” in ASCE Engineering Mechanics Institute Conference, 2016.
[54] P. G. Hoel, S. C. Port, and C. J. Stone, Introduction to statistical theory. Boston: Houghton Mifflin, 1971.
[55] Z. Šidák, “Rectangular confidence regions for the means of multivariate normal distributions,” Journal of the American Statistical Association, vol. 62, no. 318, pp. 626–633, 1967.
[56] O. J.
Dunn, “Multiple comparisons among means,” Journal of the American Statistical Association, vol. 56, no. 293, pp. 52–64, 1961.
[57] M. Bouaziz, M. Jeanmougin, and M. Guedj, “Multiple testing in large-scale genetic studies,” in Data Production and Analysis in Population Genomics, pp. 213–233, Springer, 2012.
[58] R. J. Simes, “An improved Bonferroni procedure for multiple tests of significance,” Biometrika, vol. 73, no. 3, pp. 751–754, 1986.
[59] Y. Benjamini and D. Yekutieli, “The control of the false discovery rate in multiple testing under dependency,” Annals of Statistics, vol. 29, pp. 1165–1188, 2001.
[60] J. D. Storey, “The positive false discovery rate: A Bayesian interpretation and the q-value,” Annals of Statistics, vol. 31, pp. 2013–2035, 2003.
[61] Y. Benjamini, Y. Gavrilov, et al., “A simple forward selection procedure based on false discovery rate control,” The Annals of Applied Statistics, vol. 3, no. 1, pp. 179–198, 2009.
[62] E. T. Jaynes, “Information theory and statistical mechanics,” Physical Review, vol. 106, no. 4, pp. 620–630, 1957.
[63] A. Leon-Garcia, Probability and random processes for electrical engineering. Reading, Massachusetts: Addison-Wesley Publishing Company, Inc., 1994.
[64] J. L. Beck and A. A. Taflanidis, “Prior and posterior robust stochastic predictions for dynamical systems using probability logic,” International Journal for Uncertainty Quantification, vol. 3, no. 4, pp. 271–288, 2013.
[65] B. Øksendal, Stochastic differential equations: an introduction with applications. Springer Science & Business Media, 2013.
[66] J. McFarland and S. Mahadevan, “Multivariate significance testing and model calibration under uncertainty,” Computer Methods in Applied Mechanics and Engineering, vol. 197, no. 29, pp. 2467–2479, 2008.
[67] P. C. Mahalanobis, “On the generalized distance in statistics,” Proceedings of the National Institute of Sciences (Calcutta), vol. 2, pp. 49–55, 1936.
[68] M. J. Zaki and W.
Meira Jr., Data mining and analysis: fundamental concepts and algorithms. Cambridge University Press, 2014.
[69] Y.-K. Wen, “Method for random vibration of hysteretic systems,” ASCE Journal of Engineering Mechanics, vol. 102, no. 2, pp. 249–263, 1976.
[70] S. Nagarajaiah and X. Sun, “Response of base-isolated USC hospital building in Northridge earthquake,” ASCE Journal of Structural Engineering, vol. 126, no. 10, pp. 1177–1186, 2000.
[71] J. C. Ramallo, E. A. Johnson, and B. F. Spencer, Jr., “‘Smart’ base isolation systems,” ASCE Journal of Engineering Mechanics, vol. 128, no. 10, pp. 1088–1099, 2002.
[72] K. Kawashima, K. Hasegawa, and H. Nagashima, “Manual for Menshin design of highway bridges,” in 2nd US-Japan Workshop on Earthquake Protective Systems for Bridges, Public Works Research Institute (PWRI), Tsukuba City, Japan, 1992.
[73] J. S. Hwang and J. M. Chiou, “An equivalent linear model of lead-rubber seismic isolation bearings,” Engineering Structures, vol. 18, no. 7, pp. 528–536, 1996.
[74] S. F. Wojtkiewicz and E. A. Johnson, “Efficient sensitivity analysis of structures with local modifications — Part I: Time domain responses,” ASCE Journal of Engineering Mechanics, vol. 140, no. 9, p. 04014067, 2014.
[75] J. D. Holmes, “Along wind response of lattice towers—III. Effective load distributions,” Engineering Structures, vol. 18, no. 7, pp. 489–494, 1996.
[76] J. L. Beck, “Bayesian updating, model class selection and robust stochastic predictions of structural response,” in Proceedings of the 8th European Conference on Structural Dynamics, 4–6 July, Leuven, Belgium, 2011.
[77] K.-V. Yuen, Bayesian Methods for Structural Dynamics and Civil Engineering. John Wiley and Sons, 2010.
[78] G. Kerschen, J.-C. Golinval, and F. M. Hemez, “Bayesian model screening for the identification of nonlinear mechanical structures,” Journal of Vibration and Acoustics, vol. 125, no. 3, pp. 389–397, 2003.
[79] K. Worden and J. J.
Hensman, “Parameter estimation and model selection for a class of hysteretic systems using Bayesian inference,” Mechanical Systems and Signal Processing, vol. 32, pp. 153–169, 2012.
[80] T. Toni, D. Welch, N. Strelkowa, A. Ipsen, and M. P. Stumpf, “Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems,” Journal of the Royal Society Interface, vol. 6, no. 31, pp. 187–202, 2009.
[81] T. Saito and J. L. Beck, “Bayesian model selection for ARX models and its application to structural health monitoring,” Earthquake Engineering and Structural Dynamics, vol. 39, no. 15, pp. 1737–1759, 2010.
[82] T. Haag, S. C. González, and M. Hanss, “Model validation and selection based on inverse fuzzy arithmetic,” Mechanical Systems and Signal Processing, vol. 32, pp. 116–134, 2012.
[83] M. G. Smart, M. I. Friswell, and A. W. Lees, “Estimating turbogenerator foundation parameters: model selection and regularization,” in Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, vol. 456, pp. 1583–1607, The Royal Society, 2000.
[84] X. Hong, R. J. Mitchell, S. Chen, C. J. Harris, K. Li, and G. W. Irwin, “Model selection approaches for non-linear system identification: a review,” International Journal of Systems Science, vol. 39, no. 10, pp. 925–946, 2008.
[85] Gaurav, S. F. Wojtkiewicz, and E. A. Johnson, “Efficient uncertainty quantification of dynamical systems with local nonlinearities and uncertainties,” Probabilistic Engineering Mechanics, vol. 26, no. 4, pp. 561–569, 2011.
[86] J. Skilling, “Nested sampling for general Bayesian computation,” Bayesian Analysis, vol. 1, no. 4, pp. 833–859, 2006.
[87] S. De, E. A. Johnson, S. F. Wojtkiewicz, and P. T. Brewick, “Computationally efficient Bayesian model selection for locally nonlinear structural dynamical systems,” Journal of Engineering Mechanics, vol. 144, no. 5, p. 04018022, 2018.
[88] S. De, E. A. Johnson, and S. F.
Wojtkiewicz, “Fast Bayesian model selection with application to large locally-nonlinear dynamic systems,” in 6th International Conference on Advances in Experimental Structural Engineering, 11th International Workshop on Advanced Smart Materials and Smart Structures Technology, 2015.
[89] S. De, E. A. Johnson, S. F. Wojtkiewicz, and P. T. Brewick, “Efficient Bayesian model selection for locally nonlinear systems incorporating dynamic measurements,” in 10th International Workshop on Structural Health Monitoring (IWSHM), 2015.
[90] S. De, M. Kamalzare, E. A. Johnson, and S. F. Wojtkiewicz, “Computationally efficient Bayesian model selection for structural systems with local nonlinearities,” in ASCE Engineering Mechanics Institute Conference, 2014.
[91] P. Mukherjee, D. Parkinson, and A. R. Liddle, “A nested sampling algorithm for cosmological model selection,” The Astrophysical Journal Letters, vol. 638, no. 2, p. L51, 2006.
[92] F. Feroz and M. P. Hobson, “Multimodal nested sampling: an efficient and robust alternative to Markov Chain Monte Carlo methods for astronomical data analyses,” Monthly Notices of the Royal Astronomical Society, vol. 384, no. 2, pp. 449–463, 2008.
[93] N. Chopin and C. P. Robert, “Properties of nested sampling,” Biometrika, vol. 97, no. 3, pp. 741–755, 2010.
[94] J. Ching and Y.-C. Chen, “Transitional Markov chain Monte Carlo method for Bayesian model updating, model class selection, and model averaging,” ASCE Journal of Engineering Mechanics, vol. 133, no. 7, pp. 816–832, 2007.
[95] J. R. Shaw, M. Bridges, and M. P. Hobson, “Efficient Bayesian inference for multimodal problems in cosmology,” Monthly Notices of the Royal Astronomical Society, vol. 378, no. 4, pp. 1365–1370, 2007.
[96] I. Murray, Advances in Markov chain Monte Carlo methods. PhD thesis, University College London, 2007.
[97] S. Aitken and O. E. Akman, “Nested sampling for parameter inference in systems biology: application to an exemplar circadian model,” BioMed Central Systems Biology, vol. 7, no. 1, p. 72, 2013.
[98] R. M. Neal, “Slice sampling,” Annals of Statistics, pp. 705–741, 2003.
[99] R. Chatpatanasiri, “How to sample from a truncated distribution if you must,” Artificial Intelligence Review, vol. 31, no. 1-4, pp. 1–15, 2009.
[100] C. J. Geyer, “Markov chain Monte Carlo maximum likelihood,” in Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface (E. M. Keramidas and S. M. Kaufman, eds.), pp. 161–171, Interface Foundation of North America, Fairfax Station, VA, 1991.
[101] P. Linz, Analytical and Numerical Methods for Volterra Equations, vol. 7. SIAM, 1985.
[102] F. Ma, A. Bockstedte, G. C. Foliente, P. Paevere, and H. Zhang, “Parameter analysis of the differential model of hysteresis,” Journal of Applied Mechanics, vol. 71, no. 3, pp. 342–349, 2004.
[103] AASHTO, Guide Specifications for Seismic Isolation Design. American Association of State Highway and Transportation Officials, Washington D.C., 1991.
[104] J. S. Hwang and L. H. Sheng, “Equivalent elastic seismic analysis of base-isolated bridges with lead-rubber bearings,” Engineering Structures, vol. 16, no. 3, pp. 201–209, 1994.
[105] J. S. Hwang, L. H. Sheng, and J. H. Gates, “Practical analysis of bridges on isolation bearings with bi-linear hysteresis characteristics,” Earthquake Spectra, vol. 10, no. 4, pp. 705–727, 1994.
[106] R. I. Skinner, W. H. Robinson, and G. H. McVerry, An Introduction to Seismic Isolation. Chichester, England: Wiley, 1993.
[107] M. A. Newton and A. E. Raftery, “Approximate Bayesian inference with the weighted likelihood bootstrap,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 56, no. 1, pp. 3–48, 1994.
[108] R. M. Neal, “Annealed importance sampling,” Statistics and Computing, vol. 11, no. 2, pp. 125–139, 2001.
[109] N. Friel and A. N.
Pettitt, “Marginal likelihood estimation via power posteriors,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 70, no. 3, pp. 589–607, 2008.
[110] S. De, E. A. Johnson, and S. F. Wojtkiewicz, “Multilevel estimation of marginal likelihood for Bayesian model selection,” (in preparation).
[111] M. D. McKay, R. J. Beckman, and W. J. Conover, “Comparison of three methods for selecting values of input variables in the analysis of output from a computer code,” Technometrics, vol. 21, no. 2, pp. 239–245, 1979.
[112] C. Andrieu, N. De Freitas, A. Doucet, and M. I. Jordan, “An introduction to MCMC for machine learning,” Machine Learning, vol. 50, no. 1-2, pp. 5–43, 2003.
[113] L. Tierney, “Markov chains for exploring posterior distributions,” The Annals of Statistics, pp. 1701–1728, 1994.
[114] S. Chib and E. Greenberg, “Understanding the Metropolis-Hastings algorithm,” The American Statistician, vol. 49, no. 4, pp. 327–335, 1995.
[115] S.-K. Au and J. L. Beck, “Estimation of small failure probabilities in high dimensions by subset simulation,” Probabilistic Engineering Mechanics, vol. 16, no. 4, pp. 263–277, 2001.
[116] P. Del Moral, J. Garnier, et al., “Genealogical particle analysis of rare events,” The Annals of Applied Probability, vol. 15, no. 4, pp. 2496–2534, 2005.
[117] F. Cérou, P. Del Moral, T. Furon, and A. Guyader, “Sequential Monte Carlo for rare event estimation,” Statistics and Computing, vol. 22, no. 3, pp. 795–808, 2012.
[118] H. P. Langtangen and A. Logg, Solving PDEs in Python: The FEniCS Tutorial I. Springer, 2016.
[119] S. H. Cheung, T. A. Oliver, E. E. Prudencio, S. Prudhomme, and R. D. Moser, “Bayesian uncertainty analysis with applications to turbulence modeling,” Reliability Engineering & System Safety, vol. 96, no. 9, pp. 1137–1149, 2011.
[120] T. A. Oliver and R. D. Moser, “Bayesian uncertainty quantification applied to RANS turbulence models,” in Journal of Physics: Conference Series, vol. 318, p.
042032, IOP Publishing, 2011.
[121] W. Edeling, P. Cinnella, R. P. Dwight, and H. Bijl, “Bayesian estimates of parameter variability in the k–ε turbulence model,” Journal of Computational Physics, vol. 258, pp. 73–94, 2014.
[122] W. Edeling, P. Cinnella, and R. P. Dwight, “Predictive RANS simulations via Bayesian model-scenario averaging,” Journal of Computational Physics, vol. 275, pp. 65–91, 2014.
[123] M. Kamalzare, E. A. Johnson, and S. F. Wojtkiewicz, “Computationally efficient design of optimal output feedback strategies for controllable passive damping devices,” Smart Materials and Structures, vol. 23, no. 5, p. 055027, 2014.
[124] S. De, P. T. Brewick, E. A. Johnson, and S. F. Wojtkiewicz, “Robust prediction for structural systems using model falsification,” Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering, vol. (In review), 2017.
[125] S. De, P. T. Brewick, E. A. Johnson, and S. F. Wojtkiewicz, “A hybrid probabilistic framework for model validation with application to structural dynamics modeling,” Mechanical Systems and Signal Processing, vol. (In review), 2018.
[126] S. De, P. T. Brewick, E. A. Johnson, and S. F. Wojtkiewicz, “Model falsification in a Bayesian framework,” in ASCE Engineering Mechanics Institute Conference, 2017.
[127] I. Babuška, F. Nobile, and R. Tempone, “A systematic approach to model validation based on Bayesian updates and prediction related rejection criteria,” Computer Methods in Applied Mechanics and Engineering, vol. 197, no. 29, pp. 2517–2539, 2008.
[128] K. Farrell and J. T. Oden, “Calibration and validation of coarse-grained models of atomic systems: application to semiconductor manufacturing,” Computational Mechanics, vol. 54, no. 1, pp. 3–19, 2014.
[129] K. A. Farrell, Selection, calibration, and validation of coarse-grained models of atomistic systems. PhD thesis, University of Texas at Austin, 2015.
[130] K. Farrell, J. T. Oden, and D.
Faghihi, “A Bayesian framework for adaptive selection, calibration, and validation of coarse-grained models of atomistic systems,” Journal of Computational Physics, vol. 295, pp. 189–208, 2015. [131] Z. I. Botev and D. P. Kroese, “Efficient Monte Carlo simulation via the generalizedsplittingmethod,” Statistics and Computing, vol.22, no.1, pp.1– 16, 2012. [132] M. Chiachio, J. L. Beck, J. Chiachio, and G. Rus, “Approximate bayesian computation by subset simulation,” SIAM Journal on Scientific Computing, vol. 36, no. 3, pp. A1339–A1358, 2014. [133] F. A. DiazDelaO, A. Garbuno-Inigo, S. K. Au, and I. Yoshida, “Bayesian updating and model class selection with subset simulation,” Computer Meth- ods in Applied Mechanics and Engineering, vol. 317, pp. 1102 – 1121, 2017. [134] M.K.Vakilzadeh, Y.Huang, J.L.Beck, andT.Abrahamsson, “Approximate bayesian computation by subset simulation using hierarchical state-space models,” Mechanical Systems and Signal Processing, vol. 84, pp. 2 – 20, 2017. Recent advances in nonlinear system identification. [135] Y. M. Marzouk and D. Xiu, “A stochastic collocation approach to Bayesian inference in inverse problems,” Communications in Computational Physics, vol. 6, no. 1, pp. 826–847, 2009. [136] Y. M. Marzouk, H. N. Najm, and L. A. Rahn, “Stochastic spectral methods forefficientBayesiansolutionofinverseproblems,” Journal of Computational Physics, vol. 224, no. 2, pp. 560–586, 2007. [137] B. F. Spencer, S. J. Dyke, and H. S. Deoskar, “Benchmark problems in struc- tural control: part I-active mass driver system,” Earthquake Engineering and Structural Dynamics, vol. 27, no. 11, pp. 1127–1140, 1998. [138] G. W. Housner, L. A. Bergman, T. K. Caughey, A. G. Chassiakos, R. O. Claus, S. F. Masri, R. E. Skelton, T. Soong, B. Spencer, and J. T. Yao, “Structural control: past, present, and future,” Journal of Engineering Mechanics, vol. 123, no. 9, pp. 897–971, 1997. [139] T. T. Soong and G. F. 
Dargush, Passive energy dissipation systems in struc- tural engineering. Wiley, 1997. 225 [140] M. C. Constantinou, T. T. Soong, and G. F. Dargush, Passive energy dis- sipation systems for structural design and retrofit. Multidisciplinary Center for Earthquake Engineering Research A National Center of Excellence in Advanced Technology Application, 1998. [141] I. D. Aiken, D. Nims, and J. M. Kelly, “Comparative study of four passive energy dissipation systems,” Bulletin of the New Zealand National Society for Earthquake Engineering, vol. 25, no. 3, pp. 175–192, 1992. [142] I. G. Buckle, “Passive control of structures for seismic loads,” Bulletin of the New Zealand Society for Earthquake Engineering, vol. 33, no. 3, pp. 209–221, 2000. [143] J. M. Kelly, G. Leitmann, and A. G. Soldatos, “Robust control of base- isolated structures under earthquake excitation,” Journal of Optimization Theory and Applications, vol. 53, no. 2, pp. 159–180, 1987. [144] J. M. Kelly, “Base isolation: linear theory and design,” Earthquake Spectra, vol. 6, no. 2, pp. 223–244, 1990. [145] H. Ali and A. Abdel-Ghaffar, “Modeling the nonlinear seismic behavior of cable-stayed bridges with passive control bearings,” Computers & Structures, vol. 54, no. 3, pp. 461–492, 1995. [146] S. Nagarajaiah, A. M. Reinhorn, and M. C. Constantinou, “Nonlinear dynamic analysis of 3-D-base-isolated structures,” Journal of Structural Engineering, vol. 117, no. 7, pp. 2035–2054, 1991. [147] M. Constantinou and I. Tadjbakhsh, “Optimum design of a first story damp- ing system,” computers & Structures, vol. 17, no. 2, pp. 305–310, 1983. [148] Y. Fu and K. Kasai, “Comparative study of frames using viscoelastic and viscousdampers,” Journal of Structural Engineering, vol.124, no.5, pp.513– 522, 1998. [149] N. Gluck, A. Reinhorn, J. Gluck, and R. Levy, “Design of supplemen- tal dampers for control of structures,” Journal of Structural Engineering, vol. 122, no. 12, pp. 1394–1399, 1996. [150] Y. Ribakov and J. 
Gluck, “Optimal design of ADAS damped MDOF struc- tures,” Earthquake Spectra, vol. 15, no. 2, pp. 317–330, 1999. [151] O. Lavan and R. Levy, “Optimal design of supplemental viscous dampers for linear framed structures,” Earthquake Engineering & Structural Dynamics, vol. 35, no. 3, pp. 337–356, 2006. 226 [152] O. Lavan, “A methodology for the integrated seismic design of nonlinear buildings with supplemental damping,” Structural Control and Health Mon- itoring, vol. 22, no. 3, pp. 484–499, 2015. [153] A. Zare and M. Ahmadizadeh, “Design of viscous fluid passive structural control systems using pole assignment algorithm,” Structural Control and Health Monitoring, vol. 21, no. 7, pp. 1084–1099, 2014. [154] G.-S. Chen, R. J. Bruno, and M. Salama, “Optimal placement of active/passive members in truss structures using simulated annealing,” AIAA Journal, vol. 29, no. 8, pp. 1327–1334, 1991. [155] M. H. Milman and C.-C. Chu, “Optimization methods for passive damper placement and tuning,” Journal of Guidance, Control, and Dynamics, vol. 17, no. 4, pp. 848–856, 1994. [156] I. Takewaki, “Optimal damper placement for minimum transfer functions,” Earthquake Engineering & Structural Dynamics, vol. 26, no. 11, pp. 1113– 1124, 1997. [157] A. Agrawal and J. Yang, “Optimal placement of passive dampers on seismic and wind-excited buildings using combinatorial optimization,” Journal of Intelligent Material Systems and Structures, vol. 10, no. 12, pp. 997–1014, 1999. [158] D. L. Garcia, “A simple method for the design of optimal damper configu- rations in mdof structures,” Earthquake Spectra, vol. 17, no. 3, pp. 387–398, 2001. [159] M. P. Singh and L. M. Moreschi, “Optimal placement of dampers for passive response control,” Earthquake Engineering & Structural Dynamics, vol. 31, no. 4, pp. 955–976, 2002. [160] I. Takewaki, Building control with passive dampers: optimal performance- based design for earthquakes. John Wiley & Sons, 2011. [161] L. F. F. Miguel, L. F. F. Miguel, and R. H. 
Lopez, “Robust design optimiza- tion of friction dampers for structural response control,” Structural Control and Health Monitoring, vol. 21, no. 9, pp. 1240–1251, 2014. [162] J. Den Hartog, Mechanical Vibrations. McGraw-Hill, 4th edition, NY, 1956. [163] G. Warburton and E. Ayorinde, “Optimum absorber parameters for sim- ple systems,” Earthquake Engineering & Structural Dynamics, vol. 8, no. 3, pp. 197–217, 1980. 227 [164] E. Ayorinde and G. Warburton, “Minimizing structural vibrations with absorbers,” Earthquake Engineering & Structural Dynamics, vol. 8, no. 3, pp. 219–236, 1980. [165] G. Warburton, “Optimum absorber parameters for various combinations of response and excitation parameters,” Earthquake Engineering & Structural Dynamics, vol. 10, no. 3, pp. 381–401, 1982. [166] F. Sadek, B. Mohraz, A. W. Taylor, and R. M. Chung, “A method of esti- mating the parameters of tuned mass dampers for seismic applications,” Earthquake Engineering and Structural Dynamics, vol. 26, no. 6, pp. 617– 636, 1997. [167] J. C. Miranda, “On tuned mass dampers for reducing the seismic response of structures,” Earthquake Engineering & Structural Dynamics, vol. 34, no. 7, pp. 847–865, 2005. [168] C.-L. Lee, Y.-T. Chen, L.-L. Chung, and Y.-P. Wang, “Optimal design theories and applications of tuned mass dampers,” Engineering Structures, vol. 28, no. 1, pp. 43–53, 2006. [169] N. Hoang, Y. Fujino, and P. Warnitchai, “Optimal tuned mass damper for seismic applications and practical design formulas,” Engineering Structures, vol. 30, no. 3, pp. 707–715, 2008. [170] V. Gattulli, D. F. Fabio, and A. Luongo, “Nonlinear tuned mass damper for self-excited oscillations,” Wind and Structures, vol. 7, no. 4, pp. 251–264, 2004. [171] N. A. Alexander and F. Schilder, “Exploring the performance of a nonlin- ear tuned mass damper,” Journal of Sound and Vibration, vol. 319, no. 1, pp. 445–462, 2009. [172] H. Yamaguchi and N. 
Harnpornchai, “Fundamental characteristics of mul- tiple tuned mass dampers for suppressing harmonically forced oscillations,” Earthquake Engineering & Structural Dynamics, vol. 22, no. 1, pp. 51–62, 1993. [173] M. Abe and Y. Fujino, “Dynamic characterization of multiple tuned mass dampers and some design formulas,” Earthquake Engineering & Structural Dynamics, vol. 23, no. 8, pp. 813–835, 1994. [174] T. Igusa and K. Xu, “Vibration control using multiple tuned mass dampers,” Journal of Sound and Vibration, vol. 175, no. 4, pp. 491–503, 1994. 228 [175] K. Gurley, A. Kareem, L. A. Bergman, E. A. Johnson, and R. E. Klein, “Coupling tall buildings for control of response to wind,” in Structural Safety and Reliability: Proceedings of ICOSSAR ‘93, the 6th International Confer- ence on Structural Safety and Reliability (G. I. Schuëller, M. Shinozuka, and J. T. P. Yao, eds.), pp. 1553–1560, 1994. [176] A. Kareem and S. Kline, “Performance of multiple mass dampers under random loading,” ASCE Journal of Structural Engineering, vol. 121, no. 2, pp. 348–361, 1995. [177] T. S. Fu and E. A. Johnson, “Distributed mass damper system for integrat- ing structural and environmental controls in buildings,” ASCE Journal of Engineering Mechanics, vol. 137, no. 3, pp. 205–213, 2010. [178] C. Zang, M. I. Friswell, and J. E. Mottershead, “A review of robust optimal design and its application in dynamics,” Computers & Structures, vol. 83, no. 4, pp. 315–326, 2005. [179] E. Sandgren and T. M. Cameron, “Robust design optimization of structures through consideration of variation,” Computers & Structures, vol. 80, no. 20, pp. 1605–1613, 2002. [180] G.C.CalafioreandF.Dabbene, “Optimizationunderuncertaintywithappli- cations to design of truss structures,” Structural and Multidisciplinary Opti- mization, vol. 35, no. 3, pp. 189–200, 2008. [181] M. Gasser and G. I. Schuëller, “Reliability-based optimization of structural systems,” Mathematical Methods of Operations Research, vol. 46, no. 3, pp. 
287–307, 1997. [182] J. Tu, K. K. Choi, and Y. H. Park, “A new study on reliability-based design optimization,” ASME Journal of Mechanical Design, vol. 121, no. 4, pp. 557– 564, 1999. [183] D. M. Frangopol, “Multicriteria reliability-based structural optimization,” Structural Safety, vol. 3, no. 1, pp. 23–28, 1985. [184] J. L. Beck, C. Papadimitriou, E. Chan, and A. Irfanoglu, “Reliability-based optimal design decisions in the presence of seismic risk,” in Proceedings of 11th World Conference on Earthquake Engineering, Pergamon, 1996. [185] J. L. Beck, E. Chan, A. Irfanoglu, and C. Papadimitriou, “Multi-criteria optimal structural design under uncertainty,” Earthquake Engineering and Structural Dynamics, vol. 28, no. 7, pp. 741–762, 1999. 229 [186] I. Enevoldsen and J. D. Sørensen, “Reliability-based optimization in struc- tural engineering,” Structural Safety, vol. 15, no. 3, pp. 169–196, 1994. [187] A. A. Taflanidis, J. T. Scruggs, and J. L. Beck, “Robust stochastic design of linear controlled systems for performance optimization,” Journal of Dynamic Systems, Measurement, and Control, vol. 132, no. 5, p. 051008, 2010. [188] M. Papadrakakis and N. D. Lagaros, “Reliability-based structural optimiza- tion using neural networks and Monte Carlo simulation,” Computer Methods in Applied Mechanics and Engineering, vol. 191, no. 32, pp. 3491–3507, 2002. [189] R.Tempo, G.Calafiore, andF.Dabbene, Randomized algorithms for analysis and control of uncertain systems: with applications. Springer Science & Business Media, 2012. [190] S. Jin and R. A. Livingston, “Chaos theory analysis of a cable-stayed bridge: Part I. finite element model development,” in The 14th International Sympo- sium on: Smart Structures and Materials & Nondestructive Evaluation and Health Monitoring, vol. 6529, International Society for Optics and Photonics, 2007. [191] W. Wang, G. Chen, and B. A. 
Hartnagel, “Real-time condition assessment of the Bill Emerson cable-stayed bridge using artificial neural networks,” in The 14th International Symposium on: Smart Structures and Materials & Nondestructive Evaluation and Health Monitoring, vol. 6529, International Society for Optics and Photonics, 2007. [192] B. Vannemreddy, Aerodynamic vibration of stay cables of a cable-stayed bridge. PhD thesis, Northern Illinois University, 2010. [193] B. A. Zárate and J. M. Caicedo, “Finite element model updating: Multiple alternatives,” Engineering Structures, vol. 30, no. 12, pp. 3724–3730, 2008. [194] B. A. Zarate, J. M. Caicedo, and A. Dutta, “Model updating of cable-stayed bridges using MGA,” in International Modal Analysis Conference (IMAC), 2007. [195] J. M. Caicedo, S. J. Dyke, S. J. Moon, L. A. Bergman, G. Turan, and S.Hague, “PhaseIIbenchmarkcontrolproblemforseismicresponseofcable- stayed bridges,” Journal of Structural Control, vol. 10, no. 3-4, pp. 137–168, 2003. [196] W.-L. He and A. K. Agrawal, “Passive and hybrid control systems for seis- mic protection of a benchmark cable-stayed bridge,” Structural Control and Health Monitoring, vol. 14, no. 1, pp. 1–26, 2007. 230 [197] S. De, S. F. Wojtkiewicz, and E. A. Johnson, “Efficient optimal design and design-under-uncertainty of passive control devices with application to a cable-stayed bridge,” Structural Control and Health Monitoring, vol. 24, no. 2, 2017. [198] S.De, S.F.Wojtkiewicz, andE.A.Johnson, “Efficientoptimaldesign-under- uncertainty of passive structural control devices,” in Proceedings of the 12th International Conference on Applications of Statistics and Probability in Civil Engineering (ICASP12), 2015. [199] S. De, M. Kamalzare, E. A. Johnson, and S. F. Wojtkiewicz, “Efficient opti- mal design of passive structural control devices for complex structures,” in ASCE Engineering Mechanics Institute Conference, 2014. [200] P. T. Boggs and J. W. Tolle, “Sequential quadratic programming,” Acta Numerica, vol. 
4, pp. 1–51, 1995. [201] J. Nocedal and S. Wright, Numerical Optimization. Springer, New York, USA, 2006. [202] J. A. Main and N. P. Jones, “Free vibrations of taut cable with attached damper. II: Nonlinear damper,” ASCE Journal of Engineering Mechanics, vol. 128, no. 10, pp. 1072–1081, 2002. [203] R. L. Iman, Latin Hypercube Sampling. John Wiley & Sons, Ltd, 2008. [204] M. Kamalzare, E. A. Johnson, and S. F. Wojtkiewicz, “Efficient optimal design of passive structural control applied to isolator design,” Smart Struc- tures and Systems, vol. 15, no. 3, pp. 847–862, 2015. [205] M. Bowden and J. Dugundji, “Joint damping and nonlinearity in dynamics of space structures,” AIAA J., vol. 28, no. 4, pp. 740–749, 1990. [206] M. Shinozuka and G. Deodatis, “Response variability of stochastic finite element systems,” J. Engrg. Mech., vol. 114, no. 3, pp. 499–519, 1988. [207] R. G. Ghanem and P. D. Spanos, Stochastic finite elements: a spectral approach. Courier Corporation, 2003. [208] D. Xiu and G. E. Karniadakis, “The Wiener–Askey polynomial chaos for stochastic differential equations,” SIAM J. on Sci. Computing, vol. 24, no. 2, pp. 619–644, 2002. [209] S. De, E. A. Johnson, and S. F. Wojtkiewicz, “Efficient forward uncertainty propagation for locally nonlinear dynamic systems using Volterra integral equations,” (in preparation). 231 [210] S. De, E. A. Johnson, and S. F. Wojtkiewicz, “Efficient uncertainty quan- tification for locally nonlinear dynamical systems,” in ASCE Engineering Mechanics Institute Conference, 2017. [211] O. Reynolds, “On the dynamical theory of incompressible viscous fluids and the determination of the criterion,” Proceedings of the Royal Society of Lon- don, vol. 56, no. 336-339, pp. 40–45, 1894. [212] S. De, P. T. Brewick, E. A. Johnson, and S. F. Wojtkiewicz, “A probabilistic model validation framework for Reynolds averaged Navier-Stokes models for turbulence,” (in preparation). [213] S. De, P. T. Brewick, E. A. Johnson, S. F. Wojtkiewicz, and I. 
Bermejo- Moreno, “Error and likelihood bounds for falsification of dynamical models,” in Proceedings of the IMAC XXXV Conference, 2017. [214] B. E. Launder and D. Spalding, “The numerical computation of turbulent flows,” Computer methods in applied mechanics and engineering, vol.3, no.2, pp. 269–289, 1974. [215] S. B. Pope, Turbulent flows. IOP Publishing, 2001. [216] D. C. Wilcox, Turbulence Modeling for CFD. DCW industries Inc., La Cañada, California, 2nd ed., 1998. [217] S. El Tahry, “k− equation for compressible reciprocating engine flows,” Journal of Energy, vol. 7, no. 4, pp. 345–353, 1983. [218] V. Yakhot, S. Orszag, S. Thangam, T. Gatski, and C. Speziale, “Develop- ment of turbulence models for shear flows by a double expansion technique,” Physics of Fluids A: Fluid Dynamics, vol. 4, no. 7, pp. 1510–1520, 1992. [219] P. R. Spalart and S. R. Allmaras, “A one equation turbulence model for aerodinamic flows.,” AIAA journal, vol. 94, 1992. [220] D. Greenblatt, K. B. Paschal, C. S. Yao, J. Harris, N. W. Schaeffler, and A. E. Washburn, “Experimental investigation of separation control part 1: baseline and steady suction,” AIAA journal, vol. 44, no. 12, pp. 2820–2830, 2006. [221] J. W. Naughton, S. Viken, and D. Greenblatt, “Skin friction measurements on the NASA hump model,” AIAA journal, vol. 44, no. 6, pp. 1255–1265, 2006. 232 [222] C. Gorlé and G. Iaccarino, “A framework for epistemic uncertainty quantifi- cation of turbulent scalar flux models for Reynolds-averaged Navier-Stokes simulations,” Physics of Fluids (1994-present), vol.25, no.5, p.055105, 2013. [223] M. Emory, J. Larsson, and G. Iaccarino, “Modeling of structural uncertain- ties in Reynolds-averaged Navier-Stokes closures,” Physics of Fluids (1994- present), vol. 25, no. 11, p. 110822, 2013. [224] S. Patankar, Numerical heat transfer and fluid flow. CRC press, 1980. [225] E. Sato, T. Sasaki, K. Fukuyama, K. Tahara, and K. 
Kajiwara, “Develop- ment of innovative base-isolation system based on E-Defense full-scale shake table experiments, part I: Outline of project research,” in AIJ Annual Meet- ing, (Hokkaido, Japan), pp. 751–752, 2013. (In Japanese). [226] P. T. Brewick, E. A. Johnson, E. Sato, and T. Sasaki, “Constructing and evaluating generalized models for a base-isolated structure,” Structural Con- trol and Health Monitoring. in preparation. [227] P. V. Overschee and B. D. Moor, “N4SID: Subspace algorithms for the iden- tification of combined deterministic-stochastic systems,” Automatica, vol. 30, no. 1, pp. 75–93, 1994. [228] S. De, T. Yu, E. A. Johnson, and S. F. Wojtkiewicz, “Model validation of a four-story base isolated building using seismic shake-table experiments,” in 11th U.S. National Conference on Earthquake Engineering, 2018. 233
Abstract
Models are used to represent and characterize physical phenomena. When the number of plausible models for a particular phenomenon is large, computational tools such as model falsification or model selection can guide the choice by eliminating models that do not fit the data. This dissertation proposes a probabilistic framework for validating models that intertwines the concepts of model falsification and Bayesian model selection. Model falsification is used in this framework as pre- and post-processing steps that try to eliminate models, as well as model classes, that cannot explain the measurements. A likelihood-bound model falsification, based on control of a statistical error criterion, namely the false discovery rate, is proposed and used in the falsification step. The result of this likelihood-based falsification determines the validity of the initial set of candidate model classes and helps remove most of the incorrect model classes at little computational cost. Bayesian model selection, which assigns posterior model class probabilities based on Bayes' theorem, is then applied to the remaining model classes, with computational savings from reusing the likelihood values evaluated in the pre-processing step. Finally, a post-processing step based on likelihood-bound falsification checks the validity of the selected model class(es). The proposed framework is applied to nonlinear structural examples for which many model classes are available. Further improvements are also proposed for efficient estimation of the evidence for Bayesian model selection using a probability integral transform. Finally, the methodologies developed here are used for optimal design and design-under-uncertainty of passive damping devices, efficient forward uncertainty propagation, model validation of a four-story full-scale base-isolated building, and validation of turbulence models.
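The two statistical ingredients named in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract does not say how the likelihood bounds or test statistics are constructed, so the standard Benjamini-Hochberg procedure is used here as a stand-in for false discovery rate control, and the p-values, log-evidences, and function names (`benjamini_hochberg`, `posterior_model_probabilities`) are hypothetical. The second function applies Bayes' theorem to turn log-evidences into posterior model class probabilities.

```python
import math


def benjamini_hochberg(p_values, alpha=0.05):
    """Standard Benjamini-Hochberg step-up procedure (illustrative stand-in).

    Returns a list of booleans, True where the corresponding null
    hypothesis is rejected while controlling the FDR at level alpha.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


def posterior_model_probabilities(log_evidences, priors=None):
    """Bayes' theorem over model classes, computed in log space.

    log_evidences: log marginal likelihoods, one per model class
                   (hypothetical inputs; estimating them is the hard part).
    priors: prior model class probabilities (uniform if omitted).
    """
    n = len(log_evidences)
    if priors is None:
        priors = [1.0 / n] * n
    log_post = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    shift = max(log_post)  # subtract the maximum to avoid underflow
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical p-values for four tests and log-evidences for two classes.
    print(benjamini_hochberg([0.01, 0.02, 0.03, 0.5], alpha=0.05))
    print(posterior_model_probabilities([-1000.0, -1000.0 + math.log(3.0)]))
```

Working in log space matters in practice because evidences for dynamical models are often far too small to represent directly as floating-point numbers; only their ratios enter the posterior probabilities.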
Asset Metadata
Creator
De, Subhayan (author)
Core Title
A novel hybrid probabilistic framework for model validation
School
Andrew and Erna Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Civil Engineering
Publication Date
05/06/2018
Defense Date
03/21/2018
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
Bayesian model selection, false discovery rate (FDR), forward uncertainty propagation, marginal likelihood, model falsification, model validation, nonlinear Volterra integral equation, OAI-PMH Harvest, Reynolds averaged Navier-Stokes models for turbulence
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Johnson, Erik A. (committee chair), Wojtkiewicz, Steven F. (committee chair), Bermejo-Moreno, Iván (committee member), Ghanem, Roger (committee member), Savla, Ketan (committee member)
Creator Email
subhayad@usc.edu, subhayande@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-500819
Unique identifier
UC11266481
Identifier
etd-DeSubhayan-6309.pdf (filename), usctheses-c40-500819 (legacy record id)
Legacy Identifier
etd-DeSubhayan-6309.pdf
Dmrecord
500819
Document Type
Dissertation
Rights
De, Subhayan
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA