University of Southern California Dissertations and Theses
Subsurface Model Calibration for Complex Facies Models

by

Wei Ma

M.S., Computational Mathematics, School of Mathematical Science, Peking University, 2010
M.S., Petroleum Engineering, University of Southern California, 2013

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy (Petroleum Engineering)

University of Southern California
August 2018

Defense Exam Committee, Faculty of the USC Graduate School:
Dr. Behnam Jafarpour
Dr. Iraj Ershaghi
Dr. Katherine Shing

© Copyright by Wei Ma, May 2018. All Rights Reserved.

ACKNOWLEDGEMENT

First and foremost, I would like to thank my advisor Behnam Jafarpour, who introduced me to the field of subsurface model calibration. It has been an honor to work with him during my Ph.D. study. He has taught me, both consciously and unconsciously, how good academic research is done. I appreciate all his contributions of time, ideas, and funding to make my Ph.D. experience productive and stimulating. The joy and enthusiasm he has for his research were contagious and motivational for me, even during tough times in the Ph.D. pursuit. I am also thankful for his guidance as I learned three skills that shaped my career: how to conduct academic research, how to write an academic paper, and how to present my work effectively to others.

I would like to express my appreciation to my advisor Joe Qin for all his contributions of time and funding during the first few years of my Ph.D. study.

I would like to express my gratitude to Iraj Ershaghi, who introduced me to the field of petroleum engineering, especially well testing, reservoir characterization and geology. He taught me how geology is essential and critical in subsurface model calibration. I appreciate his guidance on deepening and expanding my knowledge and understanding of every aspect of petroleum engineering.
I am also thankful for the excellent example he has provided as a successful professor and engineer.

I would like to express my appreciation to my advisor Huazhong Tang, who introduced me to the field of computational mathematics. I started working with him after taking his course on computational fluid mechanics, and we have managed to advance this area in several directions since then. I am also thankful for his guidance during my graduate study at Peking University.

For this dissertation I would like to thank my committee members, Behnam Jafarpour, Iraj Ershaghi and Katherine Shing, for their time, interest, and helpful comments. I would also like to thank the other two members of my qualifying exam committee, Charles Sammis (Charlie) and Kristian Jessen, for their time and insightful questions. In particular, I would like to express my appreciation to Charlie and his fantastic course, "Rock Mechanics", where I met my wife for the first time.

I would like to thank all my teachers at the University of Southern California (USC) for offering advanced courses that have helped me throughout my career. In particular, I would like to thank Victor Ziegler, Fred Aminzadeh, Yushu Wu, Justin Haldar, Thomas Jordan, Roger Ghanem, George Chilingar and Konstantinos Psounis.

The members of the Jafarpour group and the Qin group have contributed immensely to my personal and professional time at USC. The groups have been a source of friendships as well as good advice and collaboration. I am especially grateful for the fun group of original Jafarpour group members who stuck it out in grad school with me: Mohammad-Reza Khaninezhad, Atefeh Jahandideh, Siavash Hakim-Elahi and Azarang Golmohammadi.
Other past and present group members that I have had the pleasure to work with or alongside are the grad students Shiva Navabi, Anyue Jiang, Fangnin Zheng, Fan Zou, Tao Yuan, Qinqin Zhu, Yining Dong and Zhaohui Zhang, and the postdocs Entao Liu and Gang Li; working with each one of them has been a tremendous learning experience.

I am deeply grateful to the Mork Family Department, the Viterbi School of Engineering, Idania Takimoto, Juli Legat, Xiaoyan Zhang, Nicole Kerns, Andy Chen, Karen Woo, Jennifer Gerson and Tracy Charles for supporting my graduate research during the past 8 years. I gratefully acknowledge Chevron, Foundation CMG and the Department of Energy for their financial support.

I have greatly benefited from summer internships at ConocoPhillips and Chevron ETC, Houston. I want to thank my supervisors Steve Rigatos, Mark Fissell and Kaveh Dehghani, my mentors Hector Kile and Satyajit Taware, and my colleagues at those companies for giving me the opportunity to work and network with researchers and engineers from all around the world.

Last but not least, I would like to thank my family for all their love and encouragement: my parents, who raised me with love and supported me in all my pursuits; my sister Qi, for her presence here at USC for several months and her support to the family; the new member of my family, my beloved daughter Hannah; and most of all my loving, supportive, encouraging, and patient wife Xin, whose faithful support during the final stages of this Ph.D. is so appreciated. My special thanks go to everyone in my family. They have supported me during this Ph.D. by means that words cannot explain.

ABSTRACT

My Ph.D. work focuses on subsurface model calibration for complex facies models. In this type of model, a Gaussian model (e.g., a variogram-based model) fails to reproduce the discrete and connectivity features of hydraulic properties.
To better represent discrete facies and their distributions, multiple-point statistics (MPS) has been developed as a flexible simulation approach for generating complex facies patterns that are not amenable to variogram-based modeling techniques. Instead of using a variogram model, MPS relies on a conceptual model of geologic connectivity patterns, known as a training image (TI), to represent the statistical information about the expected patterns in the model. As a grid-based simulation, MPS provides simple conditioning of the results on hard data (e.g., facies measurements at well locations) and soft data (e.g., 3-D seismic data). However, conditioning MPS simulations on flow data is not trivial. In my Ph.D. study, I developed several methods for constraining MPS complex facies simulations with nonlinear flow data by taking advantage of this convenient hard data and soft data conditioning. The first method is the adapted pilot points method, in which a subset of all grid cells is selected as pilot points and the facies values at these points are inferred from nonlinear flow data. MPS facies simulations are then conditioned on the facies values at the pilot points by treating them as part of the hard data. The second method is the facies probability conditioning method, in which flow data are integrated to estimate facies probability maps. These probability maps are then treated as soft data in MPS simulations to constrain the facies models. In both methods, MPS facies simulations are conditioned on information inferred from nonlinear flow data (pilot points or facies probability maps), instead of being conditioned directly on flow data. The third method combines the first two. This combination approach is developed based on observations of the behavior of MPS facies simulations.
One essential behavior of MPS facies simulations is that the facies connectivity patterns in the output are dominated by the cells simulated at the early stage of the simulation. Therefore, the combination approach can provide improved facies connectivity patterns while honoring historical flow data, by gaining more control over the early stage of MPS facies simulations. The numerical experiments show that the developed methods improve the preservation of connectivity features in complex facies models as well as the quality of model predictions.

CONTENTS

Acknowledgement
Abstract
Contents
List of Tables
List of Figures
Chapter 1 Introduction
  1.1 Subsurface Model Calibration
  1.2 Review of Model Calibration Methods
  1.3 Contributions
Chapter 2 Background: Complex Facies Models
  2.1 Facies Model Simulations
  2.2 Multiple Point Statistics
  2.3 Review of MPS-based Calibration Methods
Chapter 3 Adapted Pilot Points Method
  3.1 Introduction
  3.2 Methodology
    3.2.1 Strategic Placement of Pilot Points
    3.2.2 Facies Estimation at Pilot Points
    3.2.3 Implementation Steps
  3.3 Numerical Experiments
    3.3.1 Uniform Pilot Points Placement
    3.3.2 Strategic Placement of Pilot Points
    3.3.3 Incorporating Flow Data in Pilot Point Placement
    3.3.4 Example 2: Complex Facies Connectivity Patterns
  3.4 Summary
Chapter 4 Facies Probability Conditioning Method
  4.1 Introduction
  4.2 Methodology
    4.2.1 Estimation of Facies Probability Maps
    4.2.2 Soft Data Conditioning: Tau Model
    4.2.3 Implementation of the Algorithm
  4.3 Experiments and Discussion
    4.3.1 Probability Maps
    4.3.2 Four Scenarios
    4.3.3 Statistical Analysis
    4.3.4 Experiment 2: Example with Complex Connectivity Patterns
    4.3.5 Information in The Facies Probability Map
  4.4 Summary
Chapter 5 Properties of MPS Facies Simulations and The Combination of Pilot Points and Facies Probabilities
  5.1 Introduction
  5.2 MPS Facies Simulations
  5.3 Behavior of MPS Facies Simulations
  5.4 Combination of Pilot Points and Facies Probability Maps
    5.4.1 Numerical Experiments
  5.5 Summary
References

LIST OF TABLES

Table 3.1 The experiments and their descriptions for the first example.
Table 3.2 The relative reduction in the total RMSE of the model parameter (log k) estimation in the experiments.
Table 5.1 The relative reduction in the total RMSE of the model parameter (log k) estimation and data matching in the three experiments.

LIST OF FIGURES

Figure 2.1 A training image representing a braided river system. Black represents channels and white refers to the non-channel background.
Figure 3.1 Row 1: Well configuration and the reference log-permeability model; black refers to high-permeability channel and white shows low-permeability mudstone. There are five injectors on the left side of the domain (gray diamonds) and five producers on the right side (gray squares). The color bar shows the values of log10(k). Row 2 (from left to right): ensemble mean, variance and three samples of the initial facies realizations. Row 3: Results from Experiment A-1 (the standard ES method). Row 4: Results from Experiment A-2 (uniform placement of pilot points (dots)).
Figure 3.2 Score maps: (a) the score map calculated from the sensitivity matrix (shown in log10 scale); (b) its truncated version with the 75th-percentile threshold p75; (c) the score map obtained from the normalized variance map; (d) its truncated version using the threshold p75; (e) the combined score map obtained by taking the Schur product of (b) and (d) (shown in log10 scale).
Figure 3.3 Demonstration of the deterministic pilot point placement: the numbers refer to the order in which the pilot points are placed and the bold black circles refer to the region of influence of each pilot point.
Figure 3.4 Performance of the deterministic (a) and random (b) pilot point placement methods, assuming perfect knowledge of the facies values at the pilot points. For each case the mean, variance and three samples out of the 100 realizations after conditioning on hard data and pilot points using the SNESIM algorithm are shown. The top row in (b) shows three pilot point configurations generated using random placement; the corresponding conditional samples are shown in the bottom row. Pilot points are randomly placed to cover the entire truncated score map.
Figure 3.5 Pilot point conditioning results using ES (left column) and ES-MDA (right column) data assimilation: (a1) the mean of the standard ES update; (a2) the corresponding updated facies ensemble plots; (b1) the mean of the ES-MDA update; (b2) the corresponding updated facies ensemble plots.
Figure 3.6 The facies ensemble, including the mean, variance and three samples of the final ensemble, obtained after 2 (a) and 4 (b) full iterations of pilot point conditioning with the standard ES.
Figure 3.7 Flow data match results at the two sample producers: (a) shows the wetting-phase flow rates based on the initial ensemble and the updated ensembles after two and four iterations, while (b) displays the same results for the non-wetting-phase flow rates. The black circles refer to the observations and the red lines indicate the median of the ensemble prediction. The blue boxes show the interquartile range (IQR).
Figure 3.8 Top: Well configuration and reference log-permeability model; black refers to high-permeability channel and white shows low-permeability mudstone. There are five injection wells (gray diamonds) and five extraction wells (gray squares). The color bar shows the values of log10(k). Column 1: Ensemble mean, variance (top) and three samples (bottom) of the initial facies models. Column 2: Ensemble of the facies models calibrated by the iterative scheme of the proposed method (after 4 full iterations).
Figure 3.9 Flow data match results at the two producers: (a) shows the wetting-phase flow rates based on the initial ensemble and the updated ensemble after four iterations, while (b) displays the corresponding results for the non-wetting-phase flow rates. The black circles refer to the observations and the red lines indicate the median of the ensemble prediction. The blue boxes show the interquartile range (IQR).
Figure 4.1 Contours of P(A|B,C) vs. P(A|B) (x-axis) and P(A|C) (y-axis) for different τ values (columns from left to right: τ = 0.125, 0.25, 0.5, 1, 2, 4, 8) and marginal probabilities P(A) (rows from top to bottom: P(A) = 0.1, 0.5, 0.9). Nine contour levels (from 0.1 to 0.9 with an increment of 0.1) are plotted from bottom-left to top-right.
Figure 4.2 The relationship between τ and entropy.
Figure 4.3 Row 1: Well configuration and reference log-permeability model; black refers to high-permeability channel and white shows low-permeability mudstone. There are five injectors on the left side of the domain (gray diamonds) and five producers on the right side (gray squares). The color bar shows the values of log10(k). Row 2 (from left to right): ensemble mean, variance and three samples of the initial facies models. Row 3: Estimation results with the ES method, showing the ensemble mean, variance and three sample realizations.
Figure 4.4 After one ES update, the probability maps calculated from the ensemble by (a) PCM and (b) the proposed method. Row 2 (from left to right): the mean, variance and three samples from the ensemble conditioned on probability map (a). Row 4: the mean, variance and three samples from the ensemble conditioned on probability map (b).
Figure 4.5 After one ES update, the probability maps and their corresponding τ maps calculated from the updated ensemble by (a) PCM and (b) the proposed method. Row 2 (from left to right): the mean, variance and three samples from the ensemble conditioned on the probability map with its corresponding τ map (a). Row 4: the mean, variance and three samples from the ensemble conditioned on the probability map with its τ map (b).
Figure 4.6 From left to right: the relative total RMSE of parameter estimation of Scenario (iv) relative to (a) Scenario (i); (b) Scenario (ii); and (c) Scenario (iii).
Figure 4.7 From left to right: the relative total RMSE of data matching of Scenario (iv) relative to (a) Scenario (i); (b) Scenario (ii); and (c) Scenario (iii).
Figure 4.8 Top: Well configuration and reference log-permeability model; black refers to high-permeability channel and white shows low-permeability mudstone. There are five injection wells (gray diamonds) and five extraction wells (gray squares). The color bar shows the values of log10(k). Column 1: Three samples (top), ensemble mean and variance (bottom) of the initial facies models. Column 2: Three samples (top), ensemble mean and variance (bottom) of the facies models calibrated by ES. Column 3: Estimation results of PCM: three samples (top) and ensemble mean and variance (bottom) are shown.
Figure 4.9 The facies ensemble, including three samples, mean and variance of the final ensemble, obtained after 1 (a) and 4 (b) full iterations of facies probability map conditioning with the standard ES.
Figure 4.10 (a) The input ensemble for MPS simulation. (b) The mean, variance, two samples and their corresponding conditional probabilities P(A|C) and P(A|B,C) of the output ensemble of the MPS simulation conditioned on probability maps. (c) The same outputs of the MPS simulation with local conditioning.
Figure 5.1 The outputs of the experiment of MPS facies simulations conditioned on a facies probability map. (a) The facies probability map P(A|C) used in the experiment. (b) Two samples of facies models conditioned on the probability map. (c) The conditional probability P(A|B) estimated from the training image. (d) The joint conditional probability calculated from the corresponding facies probabilities P(A|B) and P(A|C) using the τ model.
Figure 5.2 The statistics of the conditional probability P(A|B) estimated from the training image for the two samples. (a) The histogram of P(A|B) values that are not 0 or 1. (b) The conditional probability P(A|B) for each simulated grid cell along the corresponding random path. (c) The number of simulated grid cells with P(A|B) not equal to 0 or 1 in every two hundred simulated grid cells along the random path. (d) The number of grid cells with P(A|B) not equal to 0 or 1 vs. the number of grid cells within the first two hundred grid cells along the random path.
Figure 5.3 The outputs of the MPS facies simulation conditioned on no data. (a) The sample realization. (b) The corresponding P(A|B). (c) The histogram of P(A|B) values not equal to 0 or 1. (d) The number of P(A|B) values not equal to 0 or 1 in every 200 simulated grid cells (left) and the cumulative number of P(A|B) values not equal to 0 or 1 in the first 200 grid cells (right).
Figure 5.4 The maps of the conditional probability P(A|B) belonging to (a) (0, 1) and (b) [0.01, 0.99] from the first 200 grid cells along the random path (left) and the remaining 12,061 grid cells (right).
Figure 5.5 Comparison of the outputs from the first (left) and third (right) experiments. (a) and (d): The hard data information at the first 200 simulated grid cells along the random path. Blue refers to channel and red refers to non-channel. (b) and (e): The conditional probability P(A|B) ∈ [0.01, 0.99] (top) and the resulting sample realization (bottom) from the first experiment, where the facies values are sampled from the joint conditional probability P(A|B,C). (c) and (f): The conditional probability P(A|B) ∈ [0.01, 0.99] (top) and the resulting sample realization (bottom) from the third experiment, where the facies values are sampled from the conditional probability P(A|B) only.
Figure 5.6 Top: Well configuration and reference log-permeability model; black refers to high-permeability channel and white shows low-permeability mudstone. There are five injection wells (gray diamonds) and five extraction wells (gray squares). The color bar shows the values of log10(k). Column 1: Three samples (top), ensemble mean and variance (bottom) of the updated models obtained with the adapted pilot points method. Column 2: Three samples (top), ensemble mean and variance (bottom) of the updated models obtained with the facies probability conditioning method with τ = 1. Column 3: Three samples (top), ensemble mean and variance (bottom) of the updated models obtained with the combination of the two methods.
Figure 5.7 Flow data match results at the one sample producer: (a) shows the wetting-phase flow rates based on the initial ensemble and the updated ensembles from the adapted pilot points method, the facies probability conditioning method and their combination, while (b) displays the same results for the non-wetting-phase flow rates. The black circles refer to the observations and the red lines indicate the median of the ensemble prediction. The blue boxes show the interquartile range (IQR).

CHAPTER 1 INTRODUCTION

1.1 Subsurface Model Calibration

Subsurface model calibration is a type of inverse problem that aims to use observed reservoir model responses to estimate the model variables that caused those responses. Most documented subsurface model calibration methods consider model parameters as the unknowns and try to identify the parameters that contribute most to the model uncertainty arising from the inherent heterogeneity of aquifer properties. This is important, since no reliable predictions can be obtained without a good characterization of the model parameters. Besides estimating model parameters, subsurface model calibration is also critical for assessing the uncertainty of model predictions. Subsurface model calibration implies that a reservoir can be modeled by a mathematical model with parameters that have some physical interpretation, such as permeability, porosity and hydraulic conductivity.
Subsurface model calibration aims to find admissible model parameters that minimize the squared norm of the data mismatch,

\[ J(\mathbf{m}) = \frac{1}{2}\,\lVert g(\mathbf{m}) - \mathbf{d}_{obs} \rVert_{D}^{2} \tag{1.1} \]

where \(\mathbf{m}\) refers to the mathematical model variables, \(g(\mathbf{m})\) is the output of the model (i.e., the simulation results) and \(\mathbf{d}_{obs}\) is the vector of observed data. The norm in the above equation is usually a weighted norm. A common choice for weighting the squared data mismatch in automatic subsurface model calibration is the inverse of the noise variance:

\[ J(\mathbf{m}) = \frac{1}{2}\,\big(g(\mathbf{m}) - \mathbf{d}_{obs}\big)^{T}\,\mathbf{C}_{D}^{-1}\,\big(g(\mathbf{m}) - \mathbf{d}_{obs}\big) \tag{1.2} \]

where \(\mathbf{C}_{D}\) is the covariance of the noise in the observation data. One goal of subsurface model calibration is to find values of the model variables such that the mathematical model reproduces the observed reservoir responses. Such models can then be used to predict the future behavior of the reservoir, which provides a means to perform numerical experiments for designing reservoir management and development plans.

One of the most important difficulties associated with subsurface model calibration is that most of these inverse problems are ill-posed and cannot be solved unless certain assumptions and constraints are provided. Ill-posedness may be caused by three problems: non-uniqueness, non-existence and instability of the solutions. Among these three, non-uniqueness, which is present when more than one set of parameters attains the minimum of the objective function, is the most common. Non-uniqueness is primarily caused by the fact that the number of model parameters to be estimated is larger than the number of available observations. Another reason is that the observations are sometimes not sensitive to the model parameters to be estimated; in other words, the information content of the observations that can contribute to the calibration of the model parameters is very limited.
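As a concrete illustration, the weighted data mismatch defined above can be evaluated in a few lines. The forward-model output, observations and noise level below are invented for the example; in practice \(g(\mathbf{m})\) would come from a reservoir simulator:

```python
import numpy as np

def data_mismatch(g_m, d_obs, C_D):
    """Weighted squared data mismatch: 1/2 * (g(m)-d)^T C_D^{-1} (g(m)-d)."""
    r = g_m - d_obs
    # solve C_D x = r instead of forming C_D^{-1} explicitly
    return 0.5 * r @ np.linalg.solve(C_D, r)

# toy example: 3 observations with independent noise of standard deviation 0.1
d_obs = np.array([1.0, 2.0, 3.0])   # observed responses (invented)
g_m = np.array([1.1, 1.9, 3.0])     # simulated responses (invented)
C_D = np.diag(np.full(3, 0.1**2))   # diagonal noise covariance
J = data_mismatch(g_m, d_obs, C_D)
```

With these numbers, the two 0.1 residuals each contribute one unit of normalized mismatch, so J evaluates to 1.0.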
To alleviate the ill-posedness, two common remedies have been proposed and studied: (1) balance the number of observations and unknowns, either by reducing the number of unknown parameters, e.g., using parameterization, or by collecting more observation data (de Marsily 1978, De Marsily et al. 1984, McLaughlin and Townley 1996, Lu and Horne 2000, Jafarpour and McLaughlin 2009); (2) specify certain constraints by considering prior information or other types of restrictions, or impose regularization terms to reduce fluctuations during the optimization iterations (Lee et al. 1986, Doherty 2003, Fienen et al. 2009, Jafarpour et al. 2009, Oliver and Chen 2011).

Because of the non-uniqueness, the computation of a single history-matched reservoir model is usually insufficient to address questions of risk and uncertainty in reservoir management. Instead of determining a single solution, a stochastic approach to solving the inverse problem is taken, generating a set of solutions distributed according to a probability distribution. Specifically, in a Bayesian framework, such solutions are sampled from the posterior density of the model parameters \(\mathbf{m}\) given the data \(\mathbf{d}_{obs}\),

\[ f(\mathbf{m} \mid \mathbf{d}_{obs}) = \frac{f(\mathbf{d}_{obs} \mid \mathbf{m})\, f(\mathbf{m})}{f(\mathbf{d}_{obs})} \tag{1.3} \]

where \(f(\mathbf{m})\) refers to the prior density of the model parameters \(\mathbf{m}\), which constrains the set of possible inverse solutions. In many subsurface model calibration problems, such dependency may refer to the spatial structure or connectivity of \(\mathbf{m}\), expressed for example through a covariance, a Boolean model or a training image. The likelihood density \(f(\mathbf{d}_{obs} \mid \mathbf{m})\) models the relationship between the observation data \(\mathbf{d}_{obs}\) and the model parameters \(\mathbf{m}\), and accounts for model and measurement errors. In the special case when such errors are not considered, the relationship between the data and the model parameters is described by a forward model \(\mathbf{d} = g(\mathbf{m})\).
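A posterior of this Bayesian form can, in principle, be sampled with Markov chain Monte Carlo. The sketch below applies a random-walk Metropolis sampler to a deliberately tiny toy problem; the linear forward model, noise level and Gaussian prior are invented for illustration, and realistic subsurface problems require far more efficient samplers or ensemble methods:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_posterior(m, g, d_obs, sigma_d, m_prior, sigma_m):
    """log f(m|d) up to a constant: Gaussian likelihood plus Gaussian prior."""
    r = g(m) - d_obs
    lp_like = -0.5 * np.sum(r**2) / sigma_d**2
    lp_prior = -0.5 * np.sum((m - m_prior)**2) / sigma_m**2
    return lp_like + lp_prior

def metropolis(m0, logp, n_steps=5000, step=0.2):
    """Random-walk Metropolis: draws samples approximately from exp(logp)."""
    m = np.array(m0, dtype=float)
    lp = logp(m)
    samples = []
    for _ in range(n_steps):
        m_new = m + step * rng.standard_normal(m.shape)
        lp_new = logp(m_new)
        if np.log(rng.random()) < lp_new - lp:  # accept/reject
            m, lp = m_new, lp_new
        samples.append(m.copy())
    return np.array(samples)

# toy linear forward model d = 2*m with a single observation d_obs = 4
g = lambda m: 2.0 * m
samples = metropolis(np.zeros(1),
                     lambda m: log_posterior(m, g, np.array([4.0]), 0.5,
                                             np.zeros(1), 10.0))
```

After discarding a burn-in period, the chain concentrates near the data-consistent value (here, m close to 2), with a spread that reflects the observation noise.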
The probability density function $f(\mathbf{d})$ therefore depends on the prior density $f(\mathbf{m})$ and the forward model $g$; however, evaluating $f(\mathbf{d})$ is often avoided when sampling the posterior density $f(\mathbf{m} \mid \mathbf{d})$. Such a stochastic approach leads to accepting a large set of models and using them all to characterize uncertainty in model predictions.

In Gaussian-type subsurface models, the distribution of model properties (e.g., the logarithm of permeability, porosity) is taken to be Gaussian, based on observation or assumption (Law 1944, Rollins et al. 1992, Gu and Oliver 2005, Oliver and Chen 2011). In geostatistics, two-point statistics, e.g., variogram models or covariance matrices, are typically used in direct geostatistical methods, such as kriging or sequential Gaussian simulation, to generate realizations of Gaussian models (Liu and Oliver 2004, Gu and Oliver 2005, Skjervheim et al. 2007). In many cases, however, a Gaussian model fails to reproduce the connectivity features of hydraulic properties, because it cannot generate large spatial continuity of extreme values (the probability of the extreme values is usually minimal). Thus, multi-Gaussian models cannot characterize some geological settings (for example, fluvial deposits such as river beds, braided channels, etc.) unless their geometry is simple or the information content of measurements of dependent state variables is adequate to identify connectivity patterns (Alcolea et al. 2006, Kerrou et al. 2008). To overcome these limitations, one needs direct stochastic methods that are capable of integrating such geometries explicitly.
These methods include the sequential indicator simulation method (Gómez-Hernández and Srivastava 1990), the truncated plurigaussian method (Le Loc'h and Galli 1997), transition probability methods (Carle and Fogg 1997), the Boolean method (Haldorsen and Lake 1982), and multiple-point statistics (MPS) (Guardiano and Srivastava 1993, Strebelle 2002, Hu and Chugunova 2008). In MPS, empirical conditional probabilities are obtained directly from training image(s). The property values at a simulation point are then drawn from their specific conditional probability instead of being derived by kriging. In this way, higher-order statistics from the training image(s) can be embedded. In the next section, I give a brief review of model calibration methods for these two different types of models: Gaussian models and complex facies models.

1.2 Review of Model Calibration Methods

The model inversion techniques that have been developed for calibrating Gaussian-type subsurface models vary widely. Both local gradient-based search methods (e.g., Gauss-Newton or quasi-Newton methods) and global optimization techniques (e.g., genetic algorithms) have been used to calibrate models (Liu and Nocedal 1989, Bissell et al. 1997, Lepine et al. 1999, Caers et al. 2002, Madsen 2003, Schulze-Riegert et al. 2003, Vrugt et al. 2003, Ballester and Carter 2007, Regis and Shoemaker 2007, Tolson and Shoemaker 2007, Zhang et al. 2008, Floudas and Gounaris 2009, Bhark et al. 2011, Arsenault et al. 2013, Elsheikh et al. 2013). A number of reviews of these inversion techniques can be found in the literature (Yeh 1986, Ewing et al. 1994, Watson et al. 1994, McLaughlin and Townley 1996, Oliver et al. 2001, Oliver and Chen 2011). Another approach is to partially characterize the states of the system using either point statistics or sample-based representations, for example through Monte Carlo simulation (Marshall et al. 2004, Wöhling and Vrugt 2011).
All the subsurface model calibration methods mentioned above are based on the minimization of an objective function similar to the one in Eq. (1.1) that measures the mismatch between the observed data and the simulated model outputs. However, alternative ways exist to achieve the same result without solving an optimization problem, namely by sampling a multivariate posterior probability distribution estimated using Bayes' theorem, as described in Eq. (1.3). The Markov chain Monte Carlo (MCMC) method (Carle and Fogg 1997, Oliver et al. 1997, Marshall et al. 2004, Wöhling and Vrugt 2011) is a common approach for drawing samples from the posterior conditional distribution $f(\mathbf{m} \mid \mathbf{d}_{\mathrm{obs}})$. An MCMC method usually first generates an initial realization $\mathbf{m}$ and then updates $\mathbf{m}$ according to the Metropolis-Hastings rule. Notice that MCMC is not an optimization algorithm; its purpose is to generate multiple independent realizations by drawing samples from the posterior parameter distribution conditioned on the observed data. The original MCMC is computationally demanding since: (1) a forward simulation that involves solving a partial differential equation over a large domain is needed for each proposed realization; and (2) the proposed realizations are not necessarily accepted, so a large number of realizations have to be generated and evaluated by forward simulations.

An alternative and popular class of sampling-based techniques is filtering methods, such as the ensemble Kalman filter (EnKF) (Evensen 1994, 2004), which offer several advantages, including sequential real-time data processing and gradient-free implementation. The EnKF is based on the Kalman filter, a data assimilation algorithm for systems in which the relationship between model parameters and outputs is linear and the model parameters are Gaussian-distributed random variables. The linearity makes it easy to propagate the covariance in time.
However, most subsurface fluid flow models are nonlinear, which prevents the direct computation of the covariance evolution in time. To overcome this, the EnKF estimates the covariance from an ensemble of realizations, which is one of the reasons the EnKF is computationally efficient. The EnKF procedure consists of two parts: (1) it generates an initial ensemble of realizations by drawing samples from a prior distribution $f(\mathbf{m})$, and each realization is subjected to a forward simulation to obtain the model predictions; this part is usually known as the forecast step; (2) after observation data are collected, the state vector (a realization of the model parameters) is updated by Kalman filtering; this part is known as the analysis step. These properties make the EnKF suitable for large-scale real-time monitoring data integration problems with complex state-space (black-box) models (Skjervheim et al. 2007, Arroyo et al. 2008, Jafarpour and McLaughlin 2008, Chen and Oliver 2010, Sakov et al. 2012). Notice that the EnKF is neither an optimization method nor a sampling one; it is a data assimilation filter based on the minimization of a posterior covariance. For this reason, the EnKF is an optimal inversion method when the model parameters follow a multi-Gaussian distribution and the system is linear. The application of the EnKF to subsurface model calibration was reviewed by Aanonsen et al. (2009).

In facies models in which discrete geologic objects (e.g., fluvial channels) are present, the characteristics of fluid flow patterns are often controlled by the connectivity and distribution of these objects. Thus, the model calibration problem for such models should primarily focus on preserving the spatial connectivity and distribution of these discrete features.
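The forecast/analysis cycle described above can be sketched with the standard ensemble update formula. The following is a minimal stochastic-EnKF analysis step on a toy linear problem; the ensemble size, observation operator H, and noise level are made-up illustration values, not the settings used in this work:

```python
import numpy as np

def enkf_analysis(M, d_obs, H, noise_std, seed=0):
    """Stochastic EnKF analysis step with perturbed observations.

    M         : (n_param, n_ens) forecast ensemble of parameter realizations
    d_obs     : (n_obs,) observed data vector
    H         : function mapping one realization to its predicted data
    noise_std : standard deviation of iid observation noise
    """
    rng = np.random.default_rng(seed)
    n_ens = M.shape[1]
    D = np.column_stack([H(M[:, j]) for j in range(n_ens)])  # predicted data
    A = M - M.mean(axis=1, keepdims=True)                    # parameter anomalies
    Y = D - D.mean(axis=1, keepdims=True)                    # data anomalies
    C_md = A @ Y.T / (n_ens - 1)                             # cross-covariance
    C_dd = Y @ Y.T / (n_ens - 1) + noise_std**2 * np.eye(len(d_obs))
    K = C_md @ np.linalg.inv(C_dd)                           # Kalman gain
    D_pert = d_obs[:, None] + noise_std * rng.standard_normal(D.shape)
    return M + K @ (D_pert - D)                              # analysis ensemble

# Toy example: two parameters observed directly, H(m) = m
M0 = np.random.default_rng(1).normal(0.0, 1.0, size=(2, 200))
Ma = enkf_analysis(M0, np.array([1.0, -1.0]), lambda m: m, noise_std=0.3)
```

Because the prior spread (about 1) is much larger than the observation noise (0.3), the analysis mean moves close to the observations and the ensemble spread shrinks, which is the covariance-based updating discussed in the text.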
Most of the grid-based methods mentioned in the previous section for calibrating Gaussian models may not preserve connectivity features in facies models while assimilating nonlinear flow data. In particular, the EnKF is not suitable for calibrating these facies models, since the model parameters cannot be properly modeled as multi-Gaussian. Work that attempts to circumvent the problem of non-Gaussianity has been done and is discussed later in this work.

One approach that has been studied in recent years is to estimate a parameterized representation of discrete geologic objects. Such methods include gradual deformation (Roggero and Hu 1998), truncated plurigaussian methods (Liu and Oliver 2004), the level set method (Dorn and Villegas 2008), principal component analysis (Sarma et al. 2008), the Fourier-based discrete cosine transform (Jafarpour and McLaughlin 2009), and the wavelet transform (Jafarpour 2011). These parameterization techniques allow for some improvements in preserving connectivity features; however, a more consistent approach for conditioning complex facies models on flow data is still needed.

Another approach has been developed to work in conjunction with multiple-point-based simulation methods. Caers applied a probability perturbation method (PPM) to permeability fields generated from MPS (Caers 2003). In the PPM, the probability perturbation is done by solving an optimization problem, which incurs a large computational cost. Alcolea and Renard proposed a block moving-window (BMW) algorithm to condition MPS simulations to piezometric head data and connectivity data (Alcolea and Renard 2010). Mariethoz et al. developed an iterative spatial resampling (ISR) method that operates on the MPS simulation in a Bayesian framework (Mariethoz et al. 2010).
Both BMW and ISR can assess uncertainty in flow predictions, but only inefficiently: they require a large amount of computational effort because they follow the Markov chain Monte Carlo (MCMC) framework. More recently, some variations of the EnKF have been proposed to overcome the limitations of covariance-based updating so that the connectivity of the permeability field is properly preserved. Sun et al. coupled the EnKF with a mixture of Gaussian models (a weighted linear combination of multiple Gaussian models) to improve the performance for fluvial models (Sun et al. 2009). Jafarpour and Khodabakhshi proposed a probability conditioning method (PCM) (Jafarpour and Khodabakhshi 2011), in which the ensemble mean of MPS-based permeability realizations is updated using the EnKF and updated models are then regenerated with MPS based on the updated ensemble mean. Both PCM and PPM may suffer from a probability map that is inconsistent with the training image(s). Zhou et al. proposed a pattern-search-based method to simulate MPS-based conductivity fields (Zhou et al. 2012). Li et al. extended this method to an ensemble-based MPS framework, the Ensemble PATtern (EnPAT) algorithm (Li et al. 2013, Li et al. 2014). In pattern-search-based methods, generating a conditioning pattern requires a sufficiently large number of observations; for example, 20 hard-data locations and 63 observation locations were used in a 2D model with 100 by 80 grid cells. In some cases, however, only responses from a few wells may exist, so such spatially dense data may be unavailable.

1.3 Contributions

During my PhD study, I have worked on developing consistent and elegant methods for conditioning complex facies models on nonlinear flow data while preserving connectivity features. Among the inversion techniques, ensemble-based methods are attractive since they provide easy access to uncertainty quantification.
Therefore, I started my work by applying the EnKF and the ensemble smoother (ES) to a Gaussian model calibration problem and investigating their feasibility for characterizing aquifer heterogeneity from different types of monitoring measurements. The results suggest that, with a proper design, the EnKF/ES can provide promising estimates of the aquifer heterogeneity. When applied to calibrating complex facies models, however, the EnKF/ES failed to reproduce discrete connectivity features. Nevertheless, the continuous updated ensembles from the EnKF/ES contain information from the nonlinear flow data, which can be used to constrain MPS facies simulations by means of hard data conditioning, soft data conditioning, or both. Motivated by this idea, I developed two methods for conditioning multiple-point statistics (MPS) facies simulations on flow data.

The first method is an adapted pilot points method, in which a subset of all grid cells is selected as pilot points and the values of the model parameters (e.g., facies) at these points are treated as the parameters to be determined in model calibration. To place the pilot points, I define a score map generated from three sources of information: (i) the uncertainty in the facies distribution, (ii) the model response sensitivity information, and (iii) the observed flow data. Once the pilot points are placed, the facies values at these points are inferred from production data and then used, along with the available hard data at well locations, to simulate a new set of conditional facies realizations using MPS. While facies estimation at the pilot points can be performed with different inversion algorithms, in this study the ensemble smoother (ES) is adopted to update permeability maps from production data, which are then used to statistically infer facies values at the pilot point locations.
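A score map of the kind described above can be illustrated as follows; the equal-weight averaging of the three normalized maps and the greedy selection with a minimum-spacing constraint are simplifying assumptions for illustration, not the exact scheme developed in Chapter 3:

```python
import numpy as np

def place_pilot_points(uncertainty, sensitivity, data_info, n_points, min_dist=3.0):
    """Pick pilot-point cells from a score map combining three 2-D maps.

    Each input map is normalized to [0, 1] and the maps are averaged with
    equal weights; cells are then chosen greedily by score, enforcing a
    minimum spacing between selected pilot points.
    """
    def norm(a):
        a = np.asarray(a, float)
        span = a.max() - a.min()
        return (a - a.min()) / span if span > 0 else np.zeros_like(a)

    score = (norm(uncertainty) + norm(sensitivity) + norm(data_info)) / 3.0
    picked = []
    for idx in np.argsort(score, axis=None)[::-1]:           # highest score first
        i, j = np.unravel_index(idx, score.shape)
        if all(np.hypot(i - pi, j - pj) >= min_dist for pi, pj in picked):
            picked.append((i, j))
        if len(picked) == n_points:
            break
    return picked

# Toy 10x10 maps: select 4 well-separated, high-score pilot points
maps = np.random.default_rng(0).random((3, 10, 10))
pts = place_pilot_points(maps[0], maps[1], maps[2], n_points=4)
```

The spacing constraint plays the role of avoiding over-parameterization from clustered pilot points, while the score map concentrates points where the data are most informative.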
The developed method combines the information in the flow data and the TI by using the former to infer facies values at select locations away from the wells and the latter to ensure consistent facies structure and connectivity away from measurement locations.

The second method is the facies probability conditioning method, in which a probabilistic spatial description of the facies distribution (i.e., a facies probability map) is first obtained by inverting the flow data, and the probability maps are then used to constrain MPS facies simulations. In this approach, samples of facies distributions are drawn based on two sources of information: (i) the conditional probabilities from the TI and (ii) the facies probability maps, which contain important information from the flow data. To combine them, a spatiotemporal confidence map (i.e., a tau map), which represents the relative confidence given to the facies probability maps over the conditional probabilities estimated from the TI, is also inferred from the flow data and provides the weights for these two sources of information.

The numerical results show that the two developed methods improve the preservation of connectivity features in complex facies models during model calibration. The developed methods have the following advantages. First, both can condition complex facies models on nonlinear flow data, which is achieved by inferring facies information from the flow data during data assimilation and then using it as constraints in MPS. Second, by using MPS facies simulations, both preserve the discrete connectivity features consistent with the prior conceptual geologic models (e.g., the training image). Third, by calibrating an ensemble of facies models, the developed methods provide a means for uncertainty quantification.
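Weighting two probability sources with a tau value follows the permanence-of-ratios (tau) model of Journel (2002). A minimal sketch for a single cell and a binary facies is given below; the specific probability values are made up for illustration:

```python
def tau_combine(p0, p1, p2, tau1=1.0, tau2=1.0, eps=1e-12):
    """Combine two conditional facies probabilities with the tau model.

    p0: marginal (prior) probability of the facies
    p1: conditional probability estimated from the training image
    p2: conditional probability from the flow-data-derived probability map
    tau2 is the confidence given to the data-derived probability; tau2 = 0
    ignores the probability map and recovers the TI probability alone.
    """
    def odds(p):                        # distance ratio x = (1 - p) / p
        p = min(max(p, eps), 1.0 - eps)
        return (1.0 - p) / p

    x = odds(p0) * (odds(p1) / odds(p0)) ** tau1 * (odds(p2) / odds(p0)) ** tau2
    return 1.0 / (1.0 + x)

# Two sources that both favor the facies reinforce each other beyond either alone
p = tau_combine(p0=0.3, p1=0.6, p2=0.8)   # about 0.93
```

A spatially and temporally varying tau map amounts to letting tau2 differ from cell to cell, giving the flow-data-derived probabilities more weight where they are judged more reliable.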
CHAPTER 2
BACKGROUND: COMPLEX FACIES MODELS

2.1 Facies Model Simulations

In traditional geostatistics, geological continuity is captured through a variogram, a two-point statistic measuring the correlation, or conversely the variability, between any two locations in space. However, a variogram cannot describe complex facies geometries, such as channels or fractures. To better represent discrete facies and their distribution, a number of geostatistical simulation techniques are available. Two traditional approaches for simulating lithofacies distributions that honor a prior statistical representation and various types of measured and interpreted data are pixel-based approaches, such as sequential indicator simulation (Journel 1983, Isaaks 1991, Goovaerts 1997, Chiles and Delfiner 2009), and object-based (Boolean) methods, e.g., marked point processes, which are capable of describing the continuity of geobodies with well-defined shapes (Haldorsen and Lake 1984, Deutsch and Wang 1996, Holden et al. 1998, Chiu et al. 2013). Object-based methods, however, lack the flexibility of grid-based simulation techniques, making data integration particularly difficult.

More recently, multiple-point statistics (MPS) (Guardiano and Srivastava 1993, Strebelle 2002, Caers and Zhang 2004, Hu and Chugunova 2008) has been developed as a flexible grid-based alternative to object-based methods for generating complex geologic patterns that are not amenable to variogram-based modeling techniques. It was motivated by extending the sequential indicator simulation method, which utilizes only two-point statistics, to modeling more complex geological formations without building a complete, explicit, and necessarily non-Gaussian random function model (Journel 2005, Hu and Chugunova 2008).
Instead of a covariance model, MPS relies on a conceptual model of geologic connectivity patterns, known as a training image (TI), to represent the statistical information about the expected patterns in the model. A particular implementation of MPS used for simulating discrete facies patterns is the Single Normal Equation SIMulation (SNESIM) algorithm (Strebelle 2002). Similar to traditional conditional simulation techniques, such as sequential Gaussian simulation (SGSIM), SNESIM generates the facies model sequentially, one grid cell after another, following a random path. The difference between SNESIM and SGSIM lies in the conditional probability distribution from which the sample at each grid cell is drawn: SNESIM estimates the conditional probability distribution from the training image, whereas in SGSIM these probabilities are derived by kriging using a variogram model. The grid-based implementation of the SNESIM algorithm allows simple conditioning of the results on hard data (e.g., facies measurements at well locations) and soft data (e.g., 3-D seismic). In particular, integration of hard data is conveniently accomplished by simply assigning the hard data to the corresponding cells and excluding them from the simulation random path (Journel 2002, Strebelle 2002, Remy et al. 2009). The integration of soft data is discussed in detail later.

2.2 Multiple Point Statistics

Consider the stochastic simulation of a random function $\mathbf{m}(\mathbf{x}) = [m(x_1), m(x_2), \ldots, m(x_{N_g})]$ defined on a grid system with $N_g$ cells, where $m(x_i)$ denotes the parameter value $m$ at location $x_i$.
In sequential simulation, instead of characterizing and sampling from the multivariate distribution of the random function, the multiplication rule of probability is invoked to sequentially visit each unsampled cell and compute and sample from the corresponding univariate conditional probability, i.e.,

$$F_c(z) = P\bigl[\, m(x_{n+1}) < z \mid m(x_1), m(x_2), \ldots, m(x_n) \,\bigr] \qquad (2.1)$$

where $m(x_1), m(x_2), \ldots, m(x_n)$ are conditioning (or previously sampled) data and $m(x_{n+1})$ is the unsampled value at the current simulation cell. In practice, the conditioning data $m(x_1), m(x_2), \ldots, m(x_n)$ are taken from a local neighborhood of $x_{n+1}$. When the random function $\mathbf{m}(\mathbf{x})$ is multi-Gaussian, all the conditional probabilities are Gaussian and can be computed by estimating the corresponding conditional mean and variance of $m(x_{n+1})$, e.g., using kriging with a variogram model. However, when the multi-Gaussian assumption does not hold, computing the conditional probability of $m(x_{n+1})$ becomes nontrivial. Two approaches are possible to overcome this difficulty.

Figure 2.1 A training image representing a braided river system. Black represents channels and white refers to the non-channel background.

In the first approach, a suitable random function model is built based on conceptual knowledge about the spatial architecture and the physical properties of interest, and this random function is then used to infer the conditional probability. However, in some cases, such as channel systems, it is not easy to build such a random function. Furthermore, even when the random function is available, it remains difficult to perform data conditioning and model parameter inference. In the second approach, a complete image representing the geometrical/geological connectivity features is built, and the conditional probability is inferred from this image.
Such an image is known as a training image (TI); for example, Figure 2.1 shows a training image that represents channel connectivity patterns in a braided river system. In this approach, rather than building a realistic random function model, a realistic TI that represents the connectivity features needs to be built. A TI does not have to honor any local data but should depict the spatial continuity features; it can come from a conceptual drawing, an outcrop description, aerial photography, etc. In general, these techniques produce 2D training images. A training image can also be constructed using an unconstrained 3D Boolean simulation or a process-based stochastic simulation method (Caers 2003, Hu and Chugunova 2008).

Unlike traditional geostatistical simulation techniques that are based on random function models, MPS simulation utilizes empirical multivariate distributions that are inferred from a training image (Guardiano and Srivastava 1993, Strebelle 2002). If $\mathbf{m}(\mathbf{x})$ represents a discrete random process (e.g., a facies distribution), the corresponding conditional probabilities are simple to describe and can be inferred from a TI that contains the multiple-point information (patterns) (Strebelle 2002, Hu and Chugunova 2008). Sequential simulation then generates a number of equiprobable realizations that reproduce the continuity patterns of the training image and honor local hard data (e.g., well logs) and soft data (e.g., seismic data).

2.3 Review of MPS-based Calibration Methods

Several inverse methods have been developed to work in conjunction with multiple-point-based simulation methods. Caers applied the probability perturbation method (PPM) to permeability fields generated from MPS (Caers 2003, Caers and Hoffman 2006). In the PPM, a simple Markov chain is proposed to iteratively modify the MPS simulation outputs (realizations) until a history match is achieved.
Instead of perturbing an initial realization, the conditional probability used to generate the realization is perturbed by introducing another probability model that depends on the production data. After the perturbation, the geostatistical realization generated with the perturbed conditional probability remains consistent with the geological continuity of the training image. However, the probability perturbation is done by solving an optimization problem, which incurs a large computational cost.

Alcolea and Renard proposed a block moving-window (BMW) algorithm to condition MPS simulations to piezometric head data and connectivity data (Alcolea and Renard 2010). BMW starts with an MPS simulation of lithofacies from geological/geophysical data and connectivity constraints, where only a portion of the domain (the blocking window, with a user-defined size) is simulated at every iteration. Next, the intrafacies hydraulic properties are populated, and the dependent state variables (e.g., piezometric heads) are obtained by forward simulation. The MPS simulation is then accepted or rejected depending on the quality of the fit to the observed data. However, the optimal size of the blocking window is not easy to find and is case dependent.

Mariethoz et al. developed an iterative spatial resampling (ISR) method that operates on the MPS simulation in the Markov chain Monte Carlo (MCMC) framework (Mariethoz et al. 2010). ISR starts with an initial MPS realization; at each iteration, a random subset of all the grid cells is generated, and a new facies realization is generated by conditioning the MPS simulation on the points in this subset. The new realization is then accepted or rejected depending on the misfit to the observed data. Both BMW and ISR can assess uncertainty in flow predictions, but only inefficiently: they require a large amount of computational effort because they follow the Markov chain Monte Carlo (MCMC) framework.
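The accept/reject mechanism underlying both BMW and ISR is the Metropolis rule. A minimal one-dimensional sketch follows, with a hypothetical Gaussian posterior standing in for the expensive flow-simulation-based likelihood (a full implementation would run a forward simulation for every proposed realization):

```python
import math
import random

def metropolis(log_posterior, m0, n_steps, step=0.5, seed=0):
    """Random-walk Metropolis sampler for a scalar parameter m.

    Each proposal plays the role of a new candidate realization; in BMW/ISR
    the log-posterior evaluation would involve a forward flow simulation.
    """
    rng = random.Random(seed)
    m, logp = m0, log_posterior(m0)
    chain = [m]
    for _ in range(n_steps):
        m_prop = m + rng.gauss(0.0, step)                 # symmetric proposal
        logp_prop = log_posterior(m_prop)                 # "forward simulation"
        if math.log(rng.random()) < logp_prop - logp:     # Metropolis rule
            m, logp = m_prop, logp_prop                   # accept proposal
        chain.append(m)                                   # reject -> keep state
    return chain

# Hypothetical standard-normal posterior; the chain explores around 0
chain = metropolis(lambda m: -0.5 * m * m, m0=3.0, n_steps=5000)
```

The computational burden noted in the text is visible here: every step costs one posterior evaluation, and rejected proposals are paid for but discarded.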
More recently, some variations of the EnKF have been proposed to overcome the limitation posed by covariance-based updating so that the connectivity of the permeability field is properly preserved. Sun et al. coupled the EnKF with mixture-of-Gaussians clustering techniques (a weighted linear combination of multiple Gaussian models) to improve the performance for estimating fluvial models (Sun et al. 2009). Jafarpour and Khodabakhshi proposed a probability conditioning method (PCM) (Jafarpour and Khodabakhshi 2011), in which the ensemble mean of MPS-based permeability realizations is updated using the EnKF and updated models are then regenerated with MPS based on the updated ensemble mean. Both PCM and PPM may suffer from a probability map that is inconsistent with the training image(s). Zhou et al. proposed a pattern-search-based method to simulate MPS-based conductivity fields (Zhou et al. 2012). Li et al. extended this method to an ensemble-based MPS framework, the Ensemble PATtern (EnPAT) algorithm (Li et al. 2013, Li et al. 2014). In pattern-search-based methods, generating a conditioning pattern requires a sufficiently large number of observations; for example, 20 hard-data locations and 63 observation locations were used in a 2D model with 100 by 80 grid cells. In some cases, however, only responses from a few wells may exist, so such spatially dense data may be unavailable.

I developed two MPS-based methods for constraining complex facies models to nonlinear flow data: the adapted pilot points method and the facies probability conditioning method. The developed methods combine the information in the flow data and the TI by using the former to infer facies information at locations away from the wells and the latter to ensure consistent facies structure and connectivity away from measurement locations.
In both methods, the ensemble smoother (ES), a variant of the EnKF, is applied to infer facies information from the flow data. Such information can be the facies types at the pilot points in the adapted pilot points method or the facies probabilities in the facies probability conditioning method. MPS simulation is then used to condition complex facies models on the inferred facies information. The two methods are discussed in detail in the next two chapters.

CHAPTER 3
ADAPTED PILOT POINTS METHOD

3.1 Introduction

Calibration of subsurface flow models against scattered nonlinear well response data typically leads to an inverse problem to infer the hydrological properties of the formation and to enable more accurate prediction of future flow and transport behavior. A difficulty in formulating and solving model calibration inverse problems is the large number of parameters that must be estimated from limited scattered measurements, making the problem ill-posed (Carrera and Neuman 1986, Kitanidis 1986, McLaughlin and Townley 1996, Carrera et al. 2005). To overcome the ill-posedness, additional assumptions about the solution are introduced into the problem formulation. One approach is to constrain the solution, through regularization functions, to impose a pre-specified spatial structure (e.g., smoothness) on the distributed formation properties (Yeh 1986, Carrera et al. 2005). An alternative approach is to reduce the number of model parameters while maintaining the most salient features of the original parameters, i.e., parameterization. Several parameterization techniques have been developed and applied to subsurface flow model calibration problems. A classical parameterization technique is zonation, in which the spatial distribution of parameters is approximated by a finite number of zones (representing lithofacies) with identical properties (Sun et al. 1998, Tsai et al. 2003, Berre et al. 2009).
Another class of parameterization methods involves subspace methods, in which an appropriate low-dimensional subspace is used to describe the main spatial variability of the parameters. The most notable example in this category is principal component analysis (Karhunen 1947, Gavalas et al. 1976, Loeve 1978, Reynolds et al. 1996, Li and Cirpka 2006, Ma et al. 2008, Jafarpour and McLaughlin 2009).

In some cases, in addition to reducing the number of parameters, parameterization serves to improve solution plausibility by constraining the model parameters to honor the expected geologic continuity. A specific example of these techniques is the pilot points method, originally proposed by de Marsily (de Marsily 1978). Similar methods, including variations and extended versions of the original pilot points method for parameterizing subsurface model calibration, have been studied in the literature (Certes and de Marsily 1991, LaVenue and Pickens 1992, RamaRao et al. 1995, Gómez-Hernández et al. 1997, Doherty 2003, Kowalsky et al. 2004, Alcolea et al. 2006, 2008, Finsterle and Kowalsky 2008, Riva et al. 2010, Rubin et al. 2010, Jiménez et al. 2013, Jiménez et al. 2016). In the pilot points method, a subset of grid cells is selected as pilot points, and the unknown property values in those cells are the parameters to be estimated from the available measurements. Geostatistical interpolation schemes (mainly kriging methods (Deutsch and Journel 1992)) are then used to populate the unknown parameters in the remaining cells of the model and generate the full map of the property distributions. In this approach, the pilot points reflect the information in the dynamic flow data, while the geostatistical methods incorporate a pre-specified prior geologic continuity model (e.g., a variogram or covariance function) through interpolation.
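The interpolation step can be sketched with simple kriging; the exponential covariance and its parameters below are stand-ins for whatever variogram model has been calibrated, and the coordinates and values are made up for the example:

```python
import numpy as np

def simple_kriging(pilot_xy, pilot_vals, grid_xy, mean=0.0, sill=1.0, length=10.0):
    """Populate grid cells from pilot-point values by simple kriging.

    Uses an isotropic exponential covariance C(h) = sill * exp(-h / length)
    and a known constant mean, as a stand-in for a calibrated variogram.
    """
    def cov(a, b):
        h = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
        return sill * np.exp(-h / length)

    P = np.asarray(pilot_xy, float)              # (n_pilot, 2) pilot coordinates
    G = np.asarray(grid_xy, float)               # (n_grid, 2) target coordinates
    w = np.linalg.solve(cov(P, P), cov(P, G))    # one weight column per grid cell
    return mean + w.T @ (np.asarray(pilot_vals, float) - mean)

# Two pilot points; simple kriging reproduces pilot values exactly at their cells
est = simple_kriging([(0, 0), (10, 0)], [2.0, -1.0], [(0, 0), (5, 0), (10, 0)])
```

In the pilot points method this interpolation is repeated at each calibration iteration: the pilot values are updated from the flow data while the covariance (continuity) model stays fixed.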
As a model calibration approach, the pilot points method has received broad attention within the subsurface modeling community, as it offers two main benefits while attempting to match flow-related measurements: (i) it generates model parameters that honor a pre-specified variogram model and are conditioned on hard data; and (ii) it alleviates the computational burden by reducing the number of parameters to be estimated.

Implementation of the pilot points method entails a number of issues that must be addressed. The first is the number of pilot points to use. Since pilot points reflect the information in the flow data, the number of pilot points represents the weight given to the flow data relative to the prior (variogram) model. Using too many closely spaced pilot points can lead to over-parameterization, which can result in computational overhead, non-unique parameter estimates, and an overly constrained interpolation step, unless proper measures are taken to address such issues. On the other hand, having very few pilot points can limit the effect of the flow data and reduce the efficiency of the model calibration process. Beyond these general insights, there is no fixed rule on the optimal number of pilot points for a given model or scenario.

In addition to the number of pilot points, the placement of pilot points in the model domain presents another challenge. Several placement approaches have been proposed in the literature. The simplest is to assume that all grid blocks have the same potential to be selected as pilot points. De Marsily located the pilot points "more-or-less uniformly" but also attempted to follow zones of large sensitivities and large hydraulic property contrasts (de Marsily 1984). For Gaussian-distributed fields, one recommendation is to place pilot points on "pseudo-regular grids".
For example, a few studies have reported good results when the spacing between pilot points is in the order of 1 to 3 points per correlation lengths (Gó mez-Herná nez et al. 1997, Wen et al. 2002, Kowalsky et al. 2004). Alcolea et al. (Alcolea et al. 2006) reported that increasing the number of such regularly distributed pilot points may lead to spurious numerical effects and instabilities. Various aspects of the inverse modeling problem, including numerical and computational considerations, optimization algorithms, and joint inversion using co-Kriging techniques, are discussed in (Franssen et al. 1999). While these observations provide useful rule of thumb, they cannot be applied universally. In general, the field configuration, including well locations, inter-well connectivity, and sensitivity information play important roles in placement of pilot points. Therefore, an alternative method to 22 uniform placement of pilot points is to place them strategically. For example, pilot points can be placed in regions with high sensitivity in order to increase their contribution to reducing the data mismatch objective function (LaVenue and Pickens 1992, LaVenue et al. 1995, RamaRao et al. 1995, Lavenue and Marsily 2001). Pilot point placement has also been carried out based on both sensitivity and covariance information (Bissell 1994, Bissell et al. 1997), where the leading singular vectors of the sensitivity and covariance matrix are used to identify the locations of pilot points. Tonkin and Doherty (Tonkin and Doherty 2005) suggested constructing a highly parameterized base model for calculating base parameter sensitivities, and then decomposing the base parameter normal matrix into eigenvectors that represent principal orthogonal directions in the parameter space. These eigenvectors define a parameter subspace in which calibration data are used to find the coordinates of the solution. Based on the idea of parameter space reduction via SVD, Yang et al. (Yang et al. 
2012) used multiple leading singular vectors of the sensitivity matrix (of model outputs with respect to the model parameters) to define intensity scores and then placed pilot points at locations with high intensity scores. However, in their framework, an initial static set of uniformly distributed candidates (potential pilot points) must be predefined, from which the pilot points are selected. Most recently, Jiménez et al. (Jiménez et al. 2016) proposed an approach based on learning not only the pilot point values, but also their number and suitable locations, during a reversible-jump Markov-chain Monte Carlo (RJ-MCMC) procedure, which requires a large number of iterations to converge. They defined an initial set of uniformly distributed pilot points and then perturbed the pilot point values, as well as their number and locations, from the initial state. The above studies suggest that strategic placement of pilot points can be more beneficial than simply placing them uniformly in space. However, different strategies or criteria can be used for placement of pilot points, and each strategy can lead to a different solution. Yang et al. (Yang et al. 2012) considered strategic locations to be those that could increase the information yield; that is, the information that can be extracted from the state variables for improving parameter estimation. Hence, they use sensitivity information in their study to identify strategic locations. In a probabilistic formulation, an important criterion for placing pilot points may be uncertainty reduction. In this case, to increase the confidence in the calibrated models, pilot point locations with higher uncertainty are preferable. In general, locations with high uncertainty and high sensitivity have greater potential for improving the quality of solutions. Another source of information that can be used in placing pilot points is flow data.
That is, flow data can be used both to help find promising locations for the pilot points and to estimate the parameter values at those points. While the pilot points method has been successfully applied to model calibration problems, it employs Kriging (or co-Kriging techniques for joint inversion) based on variogram/covariance functions, which are suitable for continuous and multi-Gaussian random fields. This assumption is in no way unique to the pilot points method and has been widely used in many other conventional model calibration techniques. The multi-Gaussian assumption offers simplicity and mathematical convenience by requiring only the first two statistical moments of the distribution to fully characterize its probability density function (PDF). However, in geologic formations that are characterized by sharp transitions between extreme (low and high) values of subsurface properties, typically due to changing facies types, the multi-Gaussian assumption fails to capture such behavior and to represent the expected variability and distribution of the underlying features. For example, in fluvial deposits (e.g., river beds, braided or meandering channels), multi-Gaussian models cannot characterize the distinct geometry and shape of the relevant geobodies, especially when complex geometric features are involved (Guardiano and Srivastava 1993, Carle et al. 1998, Gómez-Hernández and Wen 1998, Western et al. 2001, Strebelle 2002, Zinn and Harvey 2003, De Marsily et al. 2005, Alcolea et al. 2006, Kerrou et al. 2008). This limitation is primarily due to the use of a covariance model to represent the spatial relations, as it only represents point-to-point correlations without accounting for multiple-point data patterns.
To better represent discrete facies and their distribution, several geostatistical simulation techniques are available, including sequential indicator simulation (Journel 1983, Isaaks 1991, Goovaerts 1997, Chiles and Delfiner 2009), object-based (Boolean) methods, e.g., marked point processes (Haldorsen and Lake 1984, Deutsch and Wang 1996, Holden et al. 1998, Chiu et al. 2013), and multiple-point statistics (MPS) (Guardiano and Srivastava 1993, Strebelle 2002, Caers and Zhang 2004, Hu and Chugunova 2008). In MPS, integration of hard data is conveniently accomplished by simply assigning the hard data to the corresponding cells and excluding them from the simulation random path (Journel 2002, Strebelle 2002, Remy et al. 2009). Conditioning the output of MPS simulation on dynamic flow data is, however, nontrivial. This is primarily due to the nonlinear and indirect relation between hydraulic properties and flow data. A number of approaches have been studied for calibration of MPS-based models against flow data while preserving the complex connectivity in these models. These techniques are either based on continuous approximation of the discrete geologic facies, in which the complex connectivity features are updated by adjusting a few parameters that control their distribution (Jafarpour and McLaughlin 2008, 2009, Alcolea and Renard 2010, Jafarpour 2011, Zhou et al. 2011, Khaninezhad et al. 2012a,b), or they are based on manipulation of the simulation algorithm to reproduce the dynamic response data (Caers and Hoffman 2006, Mariethoz et al. 2010, Jafarpour and Khodabakhshi 2011, Khodabakhshi and Jafarpour 2013). The latter approach has the advantage of automatically preserving the exact connectivity patterns in the training image (TI), which is critical when complex patterns are involved, while the former is simpler and amenable to continuous gradient-based inversion techniques.
In this chapter, an adapted pilot points method that can be used for conditioning the simulations of complex facies patterns on nonlinear flow data is presented. The motivation is to combine the convenient hard data conditioning (i.e., pilot point values) in MPS simulation for complex models with the integration of flow data, by using the pilot points method with strategic placement. For strategically placing pilot points, three sources of information are used: sensitivity information, variance reduction, and flow data. To apply the pilot points to MPS-based facies simulation, the original implementation has been adapted in two major ways: first, the method is modified to include estimation of discrete values (facies types); second, the use of stochastic simulation necessitates sampling and updating multiple realizations of the facies maps and pilot points. To implement the modifications, first I use MPS simulation to generate multiple realizations of facies maps (and the corresponding permeability maps) that are conditioned on the hard data at wells. Next, I update the permeability values at the pilot points using the ensemble smoother (ES) (Van Leeuwen and Evensen 1996, Skjervheim and Evensen 2011). Based on the updated ensemble of permeability maps, facies types are inferred at the pilot points. The resulting updated facies values at the pilot points are in turn used as hard data in MPS simulation to generate an updated ensemble of facies maps. In our proposed method, the MPS simulation ensures that the simulated facies patterns are consistent with the TI, while the updated pilot points incorporate the information in the dynamic flow data. Hence, both high-order statistical information from the TI and nonlinear flow data are incorporated in the calibration process. In the next section I introduce the proposed pilot points method, followed by a series of synthetic examples and their discussion.
3.2 Methodology

3.2.1 Strategic Placement of Pilot Points

When placing pilot points, it is desirable to select locations that exhibit a significant contribution to explaining the parameter variability from observed data (e.g., heads, flow rates, etc.) and a high potential to reduce the uncertainty in the facies distribution. Potential candidates for such points are those that can be accurately estimated (i.e., flow responses show sensitivity to those locations) and that can maximally reduce the uncertainty in the facies distribution. Typically, these two criteria tend to compete, as high-sensitivity locations are near well locations, which often have low uncertainty. In this work, I seek to identify locations with both high sensitivity and high uncertainty.

Let us consider $\mathbf{M} = (\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_{N_{ens}}) \in \mathbb{R}^{N_g \times N_{ens}}$ to be the matrix that represents an initial ensemble of the model parameters, where $N_g$ is the number of cells in the model and $N_{ens}$ is the ensemble size. For each realization of the model parameters $\mathbf{m}_i$, let $\mathbf{J}_i \in \mathbb{R}^{N_g \times N_d}$ be the sensitivity matrix of the model output (flow simulation output, e.g., rates, heads, saturation, etc.) with respect to the model parameters (e.g., permeability, hydraulic conductivity, etc.), where $N_d$ is the dimension of the measurements. Since the sensitivity matrix $\mathbf{J}_i$ is different from realization to realization, a global sensitivity matrix is defined as $\mathbf{J}_{aug} = (\mathbf{J}_1, \mathbf{J}_2, \ldots, \mathbf{J}_{N_{ens}}) \in \mathbb{R}^{N_g \times N_a}$, where $N_a = N_d \times N_{ens}$. The first step for placing pilot points is to generate score maps, following an approach similar to that discussed in (Yang et al. 2012). For the matrix $\mathbf{Z} = \mathbf{J}_{aug}$, the singular value decomposition (SVD) is written as:

$$\mathbf{Z} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^T \qquad (3.1)$$

where $\mathbf{U} = (u_{ij}) \in \mathbb{R}^{N_g \times N_g}$, $\boldsymbol{\Sigma} = (s_{ij}) \in \mathbb{R}^{N_g \times N_a}$, and $\mathbf{V} = (v_{ij}) \in \mathbb{R}^{N_a \times N_a}$. Note that in $\boldsymbol{\Sigma}$, the off-diagonal elements are zeros and the diagonal element $s_{ii} = \lambda_i$ refers to the $i$-th largest singular value.
To identify the preferred locations for pilot points, the singular vectors corresponding to the first $N_s$ singular values (chosen such that $\sum_{k=1}^{N_s} \lambda_k^2$ captures at least 95% of the total $\sum_k \lambda_k^2$) are used. Using the $N_s$ singular vectors, the following score map is defined:

$$S_j = \sum_{k=1}^{N_s} \lambda_k^2 \, \frac{|u_{kj}|}{\sum_{l=1}^{N_c} |u_{kl}|}, \quad j = 1, 2, \ldots, N_g \qquad (3.2)$$

With this definition, regions with high score values in the calculated maps indicate locations for which the model responses show high sensitivity to the corresponding parameters (at those locations). In order to identify the high-sensitivity regions, an s-percentile $p_s$ is used as a threshold to truncate the original score map defined by Eq. (3.2), such that the scores below $p_s$ become zeros. Thus, the non-zero regions in the truncated map represent the high-score regions in the original map, and the pilot points will be placed, using a procedure described in the next section, in these non-zero regions only. To account for the uncertainty in the model parameters, the parameter sample variance map

$$\mathbf{C} = \mathrm{diag}\!\left(\frac{1}{N_{ens}-1}(\mathbf{M} - \bar{\mathbf{M}})(\mathbf{M} - \bar{\mathbf{M}})^T\right) \in \mathbb{R}^{N_g}$$

is used, where $\bar{\mathbf{M}} \in \mathbb{R}^{N_g \times N_{ens}}$ is a matrix that has the sample mean in all of its columns. The variance map is normalized onto the range [0, 1] by the linear transformation

$$S_j = \frac{C_j - C_{min}}{C_{max} - C_{min}}, \quad j = 1, 2, \ldots, N_g$$

where $C_{min}$ and $C_{max}$ are the minimum and maximum values of the elements in $\mathbf{C}$; the normalized variance map is the only information entering this score. Hence, areas that assume low values in the corresponding score map correspond to positions with lower uncertainty. Therefore, to reduce the uncertainty in the parameter space, pilot points are placed at locations with higher score values. For estimation of the parameters at the pilot points, it is desirable to place the pilot points at locations with both high sensitivity and high variance. Therefore, I define a new score map by combining the truncated score maps (after normalizing them into the [0, 1] range) based on the sensitivity and variance information.
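As a rough sketch (not the dissertation's own code) of how the sensitivity score in Eqs. (3.1)-(3.2) can be computed, the snippet below builds the score from an augmented sensitivity matrix. The function name is illustrative, and it interprets $|u_{kj}|$ as the magnitude of the $j$-th component of the $k$-th left singular vector, normalized by that vector's L1 norm, which is one reasonable reading of the indexing.

```python
import numpy as np

def sensitivity_score_map(J_aug, energy=0.95):
    """Sketch of the score map in Eqs. (3.1)-(3.2).

    J_aug  : (N_g, N_a) augmented sensitivity matrix (all members stacked).
    energy : keep the leading N_s singular values whose squared values
             capture at least this fraction of sum(lambda_k^2).
    Returns a length-N_g score vector.
    """
    U, s, _ = np.linalg.svd(J_aug, full_matrices=False)
    cum = np.cumsum(s**2) / np.sum(s**2)
    N_s = int(np.searchsorted(cum, energy) + 1)   # smallest N_s reaching `energy`
    absU = np.abs(U[:, :N_s])
    # weight each singular vector by lambda_k^2 / (its L1 norm)
    weights = s[:N_s]**2 / absU.sum(axis=0)
    return absU @ weights                          # S_j for j = 1..N_g
```

High entries of the returned vector mark grid blocks to which the predicted responses are most sensitive; the s-percentile truncation described above is then applied to this vector.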
The combined map is generated through the Schur (point-wise) product of the two truncated score maps. The Schur product of the two score maps represents a grid-block-based operation to find the intersection of the locations in the domain that have both high sensitivity and high variance. The regions with significant scores in the resulting map correspond to target locations for placing pilot points. These resulting regions are covered by (randomly) placing pilot points in them. Note that the use of the variance map in obtaining uncertainty score maps only captures the second-order statistics of the parameters and does not include higher-order statistics. While variance is not an optimal measure for parameters with higher-order variability, its simplicity and its inclusion in the update step of the filter motivate its use for quantifying the prior model parameter uncertainty in our formulation. Additionally, while computationally demanding optimization methods could be used with suitable performance metrics to find optimal locations for the pilot points, the simple approach used in this work has resulted in satisfactory performance. An additional source of information for finding promising locations for pilot points is the flow data; I incorporate this information through an iterative data assimilation scheme, as discussed in Section 3.3. To place the pilot points on the combined map, I consider a region of influence around each pilot point to establish a minimum distance criterion. When a pilot point is placed on the map, its region of influence is removed from consideration for additional pilot points. The shape of the region of influence can be decided based on prior knowledge about the continuity model or based on sensitivity analysis. In our study, no preference is given to any direction, and a symmetric region with radius $r$ from the center of a pilot point is used, where $r$ is chosen using sensitivity analysis.
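The percentile truncation and Schur-product combination described above can be sketched in a few lines of Python (the function names and the percentile default are illustrative, not from the dissertation):

```python
import numpy as np

def combined_score_map(sens_score, var_score, pct=75.0):
    """Truncate each score map at its `pct`-percentile, normalize to [0, 1],
    and take the Schur (element-wise) product (sketch of Section 3.2.1)."""
    def truncate_normalize(m):
        thr = np.percentile(m, pct)
        t = np.where(m >= thr, m, 0.0)        # zero out below-threshold cells
        rng = t.max() - t.min()
        return (t - t.min()) / rng if rng > 0 else t
    # non-zero only where BOTH sensitivity and variance scores are high
    return truncate_normalize(sens_score) * truncate_normalize(var_score)
```

The product is non-zero only where both truncated maps are non-zero, which is exactly the intersection of high-sensitivity and high-variance regions targeted for pilot point placement.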
Note that incorporation of reliable information about the correlation structures in the field is expected to improve the results. For example, one could use an ellipsoid to define the region of influence based on available prior knowledge. This region of influence, as well as the truncated map mentioned earlier, provides a means to avoid over-parameterization. In Section 3.3, two approaches for placing the pilot points are investigated. The first approach is deterministic, in which the ranking in the score map is used to place the pilot points. The second approach randomly places the pilot points in the non-zero regions of the score map without following any order. In both cases, no overlap is allowed between the regions of influence of different pilot points. The numerical experiments suggest that the random placement is more robust than the deterministic one. Therefore, in the proposed method, the random placement is used to distribute the pilot points on the score map. One advantage of the proposed approaches is that by defining a threshold for the score map and a radius of influence, the number and locations of the pilot points are determined systematically. Notice that although the proposed placement of pilot points is sub-optimal and can be further improved, it is easy to implement with little extra computational cost, whereas formal optimization of the pilot point placement is not trivial and is expected to be computationally demanding.

3.2.2 Facies Estimation at Pilot Points

After placing the pilot points, an inversion method is needed to estimate the parameters at the pilot points using the flow data. Note that in the implementation, the facies types are not updated directly by integrating the flow data. Instead, the permeability values at the pilot points are estimated first.
The estimated permeability values are then used to infer the conditional (to flow data) facies probabilities at the pilot points following the same approach (linear scaling) proposed by Jafarpour and Khodabakhshi (Jafarpour and Khodabakhshi 2011), where the facies probabilities at each pilot point are estimated by linearly scaling the ensemble mean of the permeability at the same point. Samples are then drawn from the facies probabilities to generate an ensemble of facies realizations at the pilot point locations, which are subsequently included, along with the hard data from existing well locations, in the SNESIM algorithm to regenerate an updated ensemble of facies models. For data assimilation, I use a variant of the ensemble Kalman filter (EnKF), known as the ensemble Kalman smoother (or simply ensemble smoother, ES). The EnKF is a Monte-Carlo-based approximation of the sequential Kalman filtering method that allows for nonlinear propagation of uncertainty from the model parameters (e.g., hydraulic conductivity, permeability) into the outputs (e.g., heads, flow rates) of a dynamical system to approximately represent (through samples) the prediction uncertainty (distribution). When measurements become available, they are used in a linear analysis scheme to update the parameters/states of the system (Evensen 1994, 2004). Following the notation in (Evensen 2004), the (augmented) state matrix containing the ensemble members $\boldsymbol{\psi}_i \in \mathbb{R}^{N_{aug}}$ is denoted as

$$\mathbf{A} = (\boldsymbol{\psi}_1, \boldsymbol{\psi}_2, \ldots, \boldsymbol{\psi}_N) \in \mathbb{R}^{N_{aug} \times N_{ens}} \qquad (3.3)$$

In this study, the ensemble members $\boldsymbol{\psi}_i$ are the permeability values at the pilot point locations. The ensemble mean is stored in each column of $\bar{\mathbf{A}} = \mathbf{A}\mathbf{1}_N$, where $\mathbf{1}_N \in \mathbb{R}^{N_{ens} \times N_{ens}}$ is a matrix in which all the elements are $1/N_{ens}$, and the perturbation matrix is defined as $\mathbf{A}' = \mathbf{A} - \bar{\mathbf{A}}$. Given a vector of measurements $\mathbf{d} \in \mathbb{R}^{N_m}$, with $N_m$ being the number of measurements, the perturbed observations $\mathbf{d}_j = \mathbf{d} + \boldsymbol{\epsilon}_j$ are stored in the columns of a matrix $\mathbf{D} = (\mathbf{d}_1, \mathbf{d}_2, \ldots, \mathbf{d}_{N_{ens}}) \in \mathbb{R}^{N_m \times N_{ens}}$.
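To make the linear-scaling facies inference concrete, here is a hedged sketch (my own illustration, not the code of Jafarpour and Khodabakhshi 2011) for the binary channel/mudstone case used later in this chapter: the ensemble mean of the updated log-permeability at each pilot point is mapped linearly between the two known facies values to yield a channel probability, from which facies samples are drawn. The function name and signature are assumptions.

```python
import numpy as np

def facies_from_permeability(logk_pp, logk_mud, logk_chan, n_ens, rng=None):
    """Infer binary facies samples at pilot points by linear scaling.

    logk_pp  : (N_pp, N_ens) updated log-permeability at the pilot points.
    logk_mud, logk_chan : known log-permeability of mudstone and channel.
    Returns an (N_pp, n_ens) array of facies draws (1 = channel, 0 = mud).
    """
    rng = np.random.default_rng(rng)
    mean = logk_pp.mean(axis=1)
    # linear scaling of the ensemble mean between the two facies values
    p_chan = np.clip((mean - logk_mud) / (logk_chan - logk_mud), 0.0, 1.0)
    # one facies draw per ensemble member, sampled from P(channel)
    return (rng.random((len(p_chan), n_ens)) < p_chan[:, None]).astype(int)
```

The sampled facies arrays are then treated as hard data (per realization) in the SNESIM resampling step.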
The ensemble of observation perturbations is defined as $\mathbf{E} = (\boldsymbol{\epsilon}_1, \boldsymbol{\epsilon}_2, \ldots, \boldsymbol{\epsilon}_{N_{ens}}) \in \mathbb{R}^{N_m \times N_{ens}}$. Define the sample state covariance and measurement error covariance matrices, respectively, as

$$\mathbf{P}_e = \frac{\mathbf{A}'(\mathbf{A}')^T}{N-1}, \qquad \mathbf{R}_e = \frac{\mathbf{E}\mathbf{E}^T}{N-1} \qquad (3.4)$$

The analysis step of the EnKF can be expressed as follows:

$$\mathbf{A}^a = \mathbf{A} + \mathbf{K}(\mathbf{D} - \mathbf{H}\mathbf{A}) = \mathbf{A} + \mathbf{P}_e\mathbf{H}^T(\mathbf{H}\mathbf{P}_e\mathbf{H}^T + \mathbf{R}_e)^{-1}(\mathbf{D} - \mathbf{H}\mathbf{A}) \qquad (3.5)$$

where $\mathbf{A}^a$ is the updated ensemble, $\mathbf{K}$ is the Kalman gain, and $\mathbf{H}$ is the measurement matrix. Letting $\mathbf{D}' = \mathbf{D} - \mathbf{H}\mathbf{A}$, $\mathbf{S} = \mathbf{H}\mathbf{A}' \in \mathbb{R}^{N_m \times N_{ens}}$, and $\mathbf{C}_e = \mathbf{S}\mathbf{S}^T + \mathbf{E}\mathbf{E}^T$, Eq. (3.5) can be rewritten as

$$\mathbf{A}^a = \mathbf{A}(\mathbf{I} + \mathbf{S}^T\mathbf{C}_e^{-1}\mathbf{D}') = \mathbf{A}\mathbf{X}, \quad \text{where } \mathbf{X} = \mathbf{I} + \mathbf{S}^T\mathbf{C}_e^{-1}\mathbf{D}' \in \mathbb{R}^{N_{ens} \times N_{ens}} \qquad (3.6)$$

Evensen proposed a robust algorithm to perform the analysis step by calculating $\mathbf{A}^a = \mathbf{A}\mathbf{X}$ in Eq. (3.6). To alleviate the rapid reduction in variance after the update, which is a well-known issue with the EnKF update, a damping factor is applied to the Kalman gain matrix of the EnKF update. The resulting update takes the form

$$\mathbf{A}^a_{damp} = \mathbf{A} + \alpha\mathbf{K}(\mathbf{D} - \mathbf{H}\mathbf{A}) = \alpha\mathbf{A}\mathbf{X} + (1-\alpha)\mathbf{A} = \alpha\mathbf{A}^a + (1-\alpha)\mathbf{A} \qquad (3.7)$$

where $\alpha \in (0, 1]$ is a damping factor. Note that a reduction in the Kalman gain can be viewed as inflating the observation error variance, and it leads to a more modest update. The choice of $\alpha$ as a tuning parameter is not immediately obvious and may require trial and error to specify. Different from the EnKF, the ensemble smoother (ES) (Van Leeuwen and Evensen 1996) is an alternative data assimilation method that avoids the sequential updating of the realizations and the associated restarts, which are performed after each update step to generate forecasts for the next data assimilation step. In ES, one global update in the space-time domain is performed instead of the recursive updates in time used by the EnKF. As a result, compared with the EnKF, ES provides a significant reduction in simulation time, which makes it more efficient and much simpler to implement and apply, especially in complex subsurface applications.
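As an aside, the damped analysis step of Eqs. (3.5)-(3.7) can be sketched in ensemble form as follows. This is a simplified illustration, not the author's implementation: the factor $1/(N-1)$ cancels between $\mathbf{P}_e$ and $\mathbf{R}_e$, the predicted measurements $\mathbf{H}\mathbf{A}$ are passed in directly, and a pseudo-inverse stands in for $\mathbf{C}_e^{-1}$.

```python
import numpy as np

def damped_es_update(A, D, HA, E, alpha=0.5):
    """Damped ensemble-smoother analysis (sketch of Eqs. 3.5-3.7).

    A  : (N_par, N_ens) prior ensemble (permeability at pilot points)
    D  : (N_m, N_ens) perturbed observations
    HA : (N_m, N_ens) predicted measurements for each member
    E  : (N_m, N_ens) observation perturbations
    """
    Ap = A - A.mean(axis=1, keepdims=True)     # A' (parameter anomalies)
    S = HA - HA.mean(axis=1, keepdims=True)    # S = H A' (predicted anomalies)
    Ce = S @ S.T + E @ E.T                     # C_e = S S^T + E E^T
    K = Ap @ S.T @ np.linalg.pinv(Ce)          # ensemble Kalman gain
    return A + alpha * K @ (D - HA)            # damped update, alpha in (0, 1]
```

With `alpha=1.0` this reduces to the standard analysis of Eq. (3.5); smaller values temper the update and slow the collapse of ensemble variance.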
However, implementation of ES requires multiple iterations, which introduce additional computational overhead. As mentioned earlier, both the EnKF and ES can be used in our proposed method; however, due to its lower computational cost, ES is used as the data assimilation method of choice in this work.

3.2.3 Implementation Steps

In this section, I summarize the implementation steps. Starting with the initial ensemble of the facies models (which are conditioned on hard data from wells), the proposed pilot points conditioning with MPS simulation is performed using the following steps:

1) Generate an initial ensemble using the SNESIM algorithm;
2) Construct the sample variance map from the initial ensemble;
3) Perform forward simulation to predict the states and measurements (model outputs), and the sensitivity information (using the adjoint method), for each of the ensemble members;
4) Generate the score maps based on the sensitivity and variance information, and combine them to obtain a combined truncated score map (as discussed in Section 3.2.1);
5) Using an appropriate radius of influence, determine the pilot point locations (based on the placement procedure discussed in Section 3.2.1);
6) Estimate the permeability for each realization at the pilot points using the ES data assimilation;
7) Infer the facies probabilities at the pilot points from the corresponding updated permeability, using the approach proposed in (Jafarpour and Khodabakhshi 2011) and discussed above, and use them to generate an ensemble of facies types at the pilot points by sampling;
8) Use the SNESIM algorithm to generate an ensemble of facies realizations that are conditioned on the hard data and the sampled facies types at the pilot points.

A major difference between the above implementation and the standard EnKF and ES approaches is that the smoother in our approach is used to update only the permeability values at the pilot point locations.
It is important to note that in the above scheme the pilot points are placed based on the prior information only, and no dynamic flow data are involved in their placement. An iterative scheme of the proposed method can be formed by repeating Steps 2-8. That is, in each iteration, the updated ensemble of facies types from the previous iteration is considered as the initial ensemble for the current iteration. Consequently, the score map, as well as the locations and number of pilot points, is updated in Steps 3-5. Thus, performing these iterations leads to integration of the flow data into the updated facies realizations, resulting in updated sensitivity and variance information (and, hence, an updated score map). Unlike the non-iterative scheme, the iterative form of the updates incorporates the information in the measurements (flow data) in identifying the pilot point locations. Hence, the iterative scheme provides a mechanism to include flow data, in addition to sensitivity and initial uncertainty, in placing the pilot points.

3.3 Numerical Experiments

I present two examples in this section. Example 1 includes a series of numerical experiments to discuss various aspects of the proposed approach. In Example 2, which involves a larger and more complex subsurface model, the application of the proposed approach is shown. Note that for Example 2 I have also performed experiments and comparisons similar to those in Example 1, with similar conclusions; however, for brevity, I only present the final results from applying the proposed method to Example 2.

3.3.1 Uniform Pilot Points Placement

Table 3.1 The experiments and their description for the first example.

In Example 1, I present a series of numerical experiments, as listed in Table 3.1, to evaluate the performance of the proposed workflow and discuss its properties. I use two-dimensional two-phase (wetting and non-wetting phase) flow in the subsurface to demonstrate the performance of our method.
Experiment: Description
A-1: The standard ensemble smoother (ES)
A-2: Uniform placement of pilot points + ES
A-3: Strategic placement of pilot points + ES
A-4: Strategic placement of pilot points + ES-MDA
A-5: Two full iterations: iterate "strategic placement of pilot points + ES" twice
A-6: Four full iterations

The model in our first experiment consists of a 65×65 grid system with five injection wells on the left and five extraction wells on the right. The reference facies distribution is shown in the top plot of Figure 3.1 and consists of channel facies (black) and background mudstone (white). It is assumed that the values of permeability and porosity for each facies type are known (e.g., from analysis of core samples and well logs) and that the only uncertainty is related to the distribution of facies in space (away from well locations). The pressure at the injection wells and the flow rates of both phases at the extraction wells are measured every month over a two-year period and used for data assimilation. The initial ensemble of facies realizations is conditioned on the hard data (the facies types) at 10 well locations. The second row in Figure 3.1 shows the mean, variance, and three (out of 100) sample realizations of the initial ensemble of the spatial distribution of the log-permeability, generated using the SNESIM algorithm. The initial samples share statistically similar connectivity patterns with the reference model. However, the exact shape and location of the features are different from those in the reference model. Since the initial ensemble is conditioned on the measurements at the well locations, the samples show less uncertainty about the facies distribution in the vicinity of the wells. This can be verified by examining the initial ensemble mean and variance maps in Figure 3.1 (second row). The main uncertainty is therefore related to the channel locations and connectivity in the middle of the domain.
In Experiment A-1, the standard ES is used to assimilate the measurements and update the distribution of the log-permeability field in the initial ensemble. The third row of Figure 3.1 plots the mean, variance, and three samples from the updated ensemble using the ES analysis. A comparison with the reference model reveals that the updated maps are closer to the reference model, but the exact shape and discrete nature of the channel features are lost. This is a general issue that is observed in the application of the EnKF or ES methods to the characterization of facies. Specifically, these methods apply second-order continuous updates that are not suitable for the estimation of discrete facies with higher-order statistical patterns, unless indirect approaches, such as continuous parameterization of facies, are used. To avoid such inconsistency, I propose to use the observed flow data to infer facies types at pilot points and use them to generate conditional samples. To infer the facies types, following the approach proposed in (Jafarpour and Khodabakhshi 2011), I first find the facies probabilities at the pilot point locations (as discussed in the description of the method, Section 3.2) and use them to generate realizations of facies types at the pilot point locations.

Figure 3.1 Row 1: Well configuration and the reference log-permeability model; black refers to high-permeability channel and white shows low-permeability mudstone. There are five injectors on the left side of the domain (gray diamonds) and five producers on the right side (gray squares). The color bar shows the values of $\log_{10} k$. Row 2: From left to right: ensemble mean, variance, and three samples of the initial facies realizations. Row 3: Results from Experiment A-1 (the standard ES method). Row 4: Results from Experiment A-2 (uniform placement of pilot points (dots)).
A comparison between the variance maps (second column) in the second and third rows of Figure 3.1 shows that the low-variance regions in the initial ensemble are updated minimally and remain largely unchanged after the update. This observation suggests that the locations with low variance in the initial ensemble are not good candidates for placing the pilot points. In Experiment A-2, the pilot points are uniformly distributed across the entire domain instead of applying a strategic placement. The facies types at the pilot points are inferred from the ensemble of updated permeability models in Experiment A-1. The last row in Figure 3.1 shows the updated results for the ensemble of log-permeability, generated after conditioning on the uniformly placed pilot points. The results demonstrate that uniform placement of pilot points is not a good strategy and may not capture the correct connectivity in the facies. When the pilot points are placed uniformly, some of them may be assigned to low-sensitivity regions; hence, they may not be updated correctly. Since the estimated facies types at the pilot points are treated as hard data during facies simulation, they play a significant role in describing the connectivity of the final facies maps. Hence, it is important to avoid arbitrary updates due to the placement of pilot points in low-sensitivity regions. This is the primary motivation of our work on strategic placement of pilot points, which is introduced in the next section.

3.3.2 Strategic Placement of Pilot Points

Figure 3.2 shows the score maps calculated following the approach discussed in Section 3.2. Figures 3.2a and 3.2b display the normalized score map obtained from the sensitivity matrix, and its truncated version using a threshold of the 75th percentile, i.e., $p_{75}$, respectively. Figures 3.2c and 3.2d depict similar score maps that are computed using the variance map.
The final score map is obtained by taking the Schur product of the two truncated maps, as discussed in Section 3.2. These score maps, and hence the resulting pilot point locations, are generated based on the prior information (the initial ensemble of facies) and the corresponding sensitivity information (based on predicted model responses). That is, dynamic flow measurements are not included in placing the pilot points.

Figure 3.2 Score maps: (a) the score map calculated from the sensitivity matrix (shown in log10-scale); (b) its truncated version with the threshold of the 75th percentile $p_{75}$; (c) the score map obtained from the normalized variance map; (d) its truncated version using the threshold $p_{75}$; (e) the combined score map obtained by taking the Schur product of (b) and (d) (shown in log10-scale).

Figure 3.3 Demonstration of the deterministic pilot point placement: the numbers refer to the order in which the pilot points are placed, and the bold black circles refer to the region of influence for each pilot point.

Figure 3.4 Performance of the deterministic (a) and random (b) pilot point placement methods, assuming perfect knowledge of the facies values at the pilot points. For each case, the mean, variance, and three samples out of the 100 realizations after conditioning on hard data and pilot points using the SNESIM algorithm are shown. The top row in (b) shows three pilot point configurations that are generated using random placement; the corresponding conditional samples are shown in the bottom row. Pilot points are randomly placed to cover the entire truncated score map.

Figure 3.3 uses a simple schematic to illustrate the deterministic placement of pilot points.
At the beginning, the first pilot point is placed at the location with the highest score. The next pilot point is then placed so as to satisfy two requirements: (i) its region of influence does not overlap with those of all previously placed pilot points; and (ii) it is at the location with the highest score among all possible locations in the remaining non-zero regions. This procedure is repeated until there are no positions in the truncated score map that can accommodate a pilot point without an overlap with previously placed pilot points. Note that the region of influence can take a shape other than a circle and may require sensitivity analysis to identify. In this example, the radius of the circular region of influence, $r$, is assumed to be 7 grid cells (in practice, sensitivity analysis and the field correlation length can be used to specify this radius). In the deterministic approach, all ensemble members share the same pilot point configuration. As discussed earlier in Section 3.2, an alternative approach for finding pilot point locations is to place them randomly on the score map (instead of using a descending order of the scores). The importance of random placement is better appreciated by noting that pilot points that intersect the channel facies provide more important conditioning information than those that do not intersect the channels. Hence, a random placement approach increases the chance of pilot points hitting the channel facies (if the dynamic data support it). In placing the pilot points in the random case, overlap between the regions of influence is not permitted. In our experiments, the ensemble of $N_{ens} = 100$ realizations is divided into $n_g = 10$ groups, and each group is randomly assigned a pilot point configuration. To highlight the effect of pilot point placement, in the next experiment it is assumed that the facies types at the pilot points are known perfectly.
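The greedy deterministic procedure described above can be sketched as follows (an illustrative implementation, assuming a 2-D score map on a regular grid and a circular region of influence; the function name is my own):

```python
import numpy as np

def place_pilot_points_greedy(score_map, r):
    """Greedy deterministic placement (cf. Figure 3.3): repeatedly pick the
    highest-score cell, then zero out a circular region of influence of
    radius r around it, until no non-zero cells remain."""
    s = score_map.copy()
    ny, nx = s.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    points = []
    while s.max() > 0:
        j = np.unravel_index(np.argmax(s), s.shape)   # current best location
        points.append(j)
        # remove the region of influence from further consideration
        s[(yy - j[0])**2 + (xx - j[1])**2 <= r**2] = 0.0
    return points
```

The random variant differs only in that candidate cells are drawn at random from the remaining non-zero region rather than taken in descending score order.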
That is, instead of using model calibration to infer the facies types, the exact facies types from the reference model are assigned to the pilot points. This is done to ensure that the pilot point placement analysis is not affected by errors in the data integration step. Figure 3.4a (top) shows the truncated score map and the corresponding pilot points, which are placed using the deterministic approach. The second row of Figure 3.4a shows the mean, variance, and three sample realizations of the resulting facies models. Figure 3.4b displays three pilot point configurations out of ten that are placed randomly. The second row in Figure 3.4b shows the ensemble mean and variance of the facies distribution, along with three sample realizations (each from a different group). Compared with the results in Figure 3.4a, the conditional facies represent the reference map more accurately. This and several other examples performed over the course of this work indicate that the random placement of pilot points is more robust than the deterministic approach. Therefore, I adopt the random placement approach in the proposed method.

3.3.3 Incorporating Flow Data in Pilot Point Placement

In Experiment A-3, the non-iterative scheme of the proposed method is used. In this experiment, the facies types at the pilot points are inferred (using the method described in Section 3.2) after estimating the log-permeability maps by applying the standard ES. The resulting facies types are then used to regenerate conditional realizations from the TI using the SNESIM algorithm. In Experiment A-4, I apply multiple updates to the permeability map in an iterative form, using the ensemble smoother with multiple data assimilation (ES-MDA) approach, before performing any resampling from the TI; that is, ES-MDA iterations are first completed to estimate the permeability values at the pilot points, and then the results are used to resample from the TI.
Experiment A-4 differs from A-3 in that it involves full iterations of ES-MDA prior to resampling from the TI (Emerick and Reynolds 2013). It is important to note that the ES-MDA iterations are applied only to improve the estimation of the permeability values at the pilot points. Therefore, the pilot point locations are still identified prior to data integration.

Figure 3.5 Pilot point conditioning results using ES (left column) and ES-MDA (right column) data assimilation: (a1) the mean of the standard ES update; (a2) the corresponding updated facies ensemble plots; (b1) the mean of the ES-MDA update; (b2) the corresponding updated facies ensemble plots.

Figures 3.5(a1) and 3.5(b1) show the ensemble mean of log-permeability obtained from the standard ES and ES-MDA, respectively. The corresponding mean, variance, and three samples from the final ensemble of facies maps after conditioning on the pilot points are shown in Figure 3.5(a2) for ES and in Figure 3.5(b2) for ES-MDA. Two additional points are important to stress: first, although the entire map of updated log-permeability is shown, only the values at the pilot point locations are used in our analysis; second, the facies types at the pilot points are determined by first identifying the facies probability at each pilot point and then using the resulting probabilities to draw samples for the facies types at those points, prior to conditional resampling with the SNESIM algorithm.
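The probability-to-hard-data step at the pilot points amounts to a Bernoulli draw per point. A minimal sketch (the function name and the 1 = channel / 0 = mudstone encoding are illustrative, not from the thesis):

```python
import numpy as np

def sample_pilot_facies(p_channel, seed=None):
    """Draw a facies type at each pilot point from its estimated channel
    probability: 1 = channel, 0 = non-channel (Bernoulli sampling)."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p_channel, dtype=float)
    # a uniform draw in [0, 1) falls below p with probability p
    return (rng.random(p.shape) < p).astype(int)
```

The sampled facies types are then passed to SNESIM as hard data, alongside the facies observations at the wells.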
To quantify the performance of the two approaches, the root-mean-squared error (RMSE) corresponding to a variable x is calculated for all grid cells:

\[ \mathrm{RMSE}(x_i) = \sqrt{\frac{1}{n_{ens}} \sum_{j=1}^{n_{ens}} \left(x_{i,j} - x_{i,ref}\right)^2}, \qquad i = 1, 2, 3, \dots, n_g \tag{3.8} \]

The total RMSE of an ensemble is defined as:

\[ \mathrm{RMSE}_{total}(x) = \sum_{i=1}^{n_g} \mathrm{RMSE}(x_i) \tag{3.9} \]

The relative reduction of the total RMSE of the updated ensemble x_update compared to that of the initial ensemble x_initial is:

\[ r_{RMSE}(x) = \frac{\mathrm{RMSE}_{total}(x_{update}) - \mathrm{RMSE}_{total}(x_{initial})}{\mathrm{RMSE}_{total}(x_{initial})} \tag{3.10} \]

The relative reduction of the total RMSE of the model parameters is 3.24% and 6.43% for Experiments A-3 and A-4, respectively (Table 3.2). Compared to the standard ES, the updated mean of the facies maps with ES-MDA is slightly improved. However, considering the computational overhead associated with the multiple iterations of the ES-MDA algorithm, there is a trade-off between the incremental improvement and the computational cost.

Table 3.2 The relative reduction in the total RMSE of the model parameter (log k) estimation in the experiments.

    Experiment    r_RMSE(log k)
    A-3            -3.24%
    A-4            -6.43%
    A-5            -8.46%
    A-6           -15.55%

In these two experiments, prior information was used for placing the pilot points on the score map. An additional and important source of information for identifying promising locations for pilot points is the dynamic measurements, e.g., flow and pressure head data. In the next set of experiments, A-5 and A-6, the iterative scheme of the proposed method is used. In this case, in each iteration the pilot points are first placed based on the current ensemble of facies/permeability maps (from the previous iteration). The ES algorithm is then used to update the log-permeability values at the pilot points. The facies type at each pilot point is inferred and used with the TI to regenerate (i.e., resample) the new ensemble of facies (and permeability) maps that is passed to the next iteration.
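The error metrics in Eqs. (3.8)–(3.10) can be computed directly from the ensembles. A sketch, assuming each ensemble is stored as an (n_ens × n_g) array:

```python
import numpy as np

def total_rmse(ensemble, x_ref):
    """Eqs. (3.8)-(3.9): per-cell RMSE over the ensemble, summed over cells."""
    ensemble = np.asarray(ensemble, dtype=float)
    x_ref = np.asarray(x_ref, dtype=float)
    return float(np.sqrt(((ensemble - x_ref) ** 2).mean(axis=0)).sum())

def r_rmse(x_initial, x_update, x_ref):
    """Eq. (3.10): relative change of the total RMSE (negative = reduction)."""
    r0 = total_rmse(x_initial, x_ref)
    return (total_rmse(x_update, x_ref) - r0) / r0
```

Note that r_rmse is negative when the update reduces the error, which is why the tabulated values carry a minus sign.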
Since the new ensemble of facies maps incorporates the dynamic data, this scheme ensures that the flow data are also used in determining the pilot point locations. Figures 3.6a and 3.6b show the final ensemble of facies maps after two and four full iterations, respectively. The results from these two experiments show noticeable improvements over those in Figure 3.5, highlighting the importance of flow data in placing the pilot points. In these two experiments, the relative reductions of the total RMSE of the model parameters are 8.46% and 15.55%, respectively (Table 3.2).

Figure 3.6 The facies ensemble, including the mean, variance, and three samples of the final ensemble, obtained after two (a) and four (b) full iterations of pilot point conditioning with the standard ES.

Figure 3.7 Flow data match results at the two sample producers: (a) the wetting-phase flow rates based on the initial ensemble and on the updated ensembles after two and four iterations; (b) the same results for the non-wetting-phase flow rates. The black circles mark the observations and the red lines indicate the median of the ensemble prediction. The blue boxes show the interquartile range (IQR).

Figures 3.7a and 3.7b display the predicted and observed flow rates for the wetting and non-wetting phases at the extraction wells for the initial and updated models from Experiments A-5 and A-6, respectively. The initial predictions are relatively far from the observed values and show a larger spread. The predictions after two and four full iterations show increasingly improved forecasts.
In Experiment A-6, compared to the initial ensemble, the relative reductions in the total RMSE for the heads at the injection wells and the flow rates of the non-wetting and wetting phases at the extraction wells are 26.0%, 30.1%, and 37.6%, respectively. Together, the results from the updated facies maps and the production forecasts demonstrate noticeable improvements when the iterative scheme is used to incorporate the flow data in both pilot point placement and facies identification.

3.3.4 Example 2: Complex Facies Connectivity Patterns

I now consider a larger example with more complex connectivity patterns. The example also involves two-phase flow in a two-dimensional model. The top panel in Figure 3.8 shows the reference model, which consists of a 201×61 grid system with five injection wells (diamonds) and five extraction wells (squares). The reference facies distribution consists of channel facies (black) and background mudstone (white). The pressures at the injection wells and the flow rates of both phases at the extraction wells are measured every month over a two-year period and used as conditioning data. The initial ensemble of facies realizations is conditioned on the hard data (the facies types) at the 10 well locations. The first column in Figure 3.8 shows three sample realizations (out of 100) of the initial ensemble of log-permeability distributions, along with the ensemble mean and variance maps. While the initial samples statistically share similar connectivity patterns with the reference model, they do not represent the exact shape and location of the features in the reference model. Since the initial ensemble is conditioned on the hard data at the well locations, the ensemble members show less uncertainty about the facies distribution near the wells.

Figure 3.8 Top: Well configuration and reference log-permeability model; black refers to high-permeability channel and white shows low-permeability mudstone.
There are five injection wells (gray diamonds) and five extraction wells (gray squares). The color bar shows the values of log10 k. Column 1: ensemble mean and variance (top) and three samples (bottom) of the initial facies models. Column 2: ensemble of the facies models calibrated by the iterative scheme of the proposed method (after four full iterations).

Figure 3.9 Flow data match results at the two producers: (a) the wetting-phase flow rates based on the initial ensemble and on the updated ensemble after four iterations; (b) the corresponding results for the non-wetting-phase flow rates. The black circles mark the observations and the red lines indicate the median of the ensemble prediction. The blue boxes show the interquartile range (IQR).

Using this example with complex connectivity patterns, I performed a series of experiments similar to those in Example 1. The results of all the experiments were consistent with those of Example 1 and show that the proposed method (the iterative scheme with four full iterations and random pilot point placement) outperforms the other approaches used in Example 1, both in estimating the facies distribution and in matching the flow data. For brevity, only the results of the proposed final approach are shown here. The second column in Figure 3.8 contains the results after the implementation of the proposed method. Qualitatively, the updated samples better represent the connectivity in the reference model. To quantify this performance, the corresponding relative reduction of the total RMSE in model parameter estimation is calculated to be 19.95%.
Figures 3.9a and 3.9b display the predicted and observed flow rates of the wetting and non-wetting phases at the extraction wells for the initial ensemble and the updated ensemble (after four full iterations), respectively. The initial predictions are relatively far from the observed values and show a larger spread. The predictions after four full iterations show noticeable improvements. With the updated models, the relative reductions in the total RMSE in matching the pressure heads at the injection wells and the flow rates of the non-wetting and wetting phases at the extraction wells in Experiment B-6 are 26.0%, 30.1%, and 37.6%, respectively.

Finally, unlike iterative inverse modeling formulations that minimize a data-mismatch cost function, the updates performed using different flavors of the EnKF or ES do not, by construction, guarantee an improvement in data match (unless such a mechanism is explicitly enforced). In addition to this inherent property, the stochastic nature of the SNESIM conditional simulation step, which is used to incorporate the facies values at the pilot points, introduces additional variability (randomness) in the facies distribution, which can affect the quality of the data match. One approach to improve the predictive performance of the updated ensemble is to introduce an acceptance/rejection sampling step in the last stage of the workflow, when the final ensemble of conditional facies is generated. In our experiments, good data matches were observed after the final updates, so an acceptance/rejection criterion was not implemented.

3.4 Summary

A novel approach is developed, based on the pilot points parameterization, for conditioning multiple-point statistical simulation of complex geologic facies on nonlinear flow data. The pilot points method was originally developed for geostatistical parameterization of model calibration problems using variogram-based kriging techniques.
Here, a pilot point scheme is developed for conditioning stochastic discrete facies simulation on flow data, using ensemble-based data assimilation techniques. The main motivation behind the proposed approach is to address the difficulty of parameterizing discrete facies for model calibration. At least two challenges are faced in calibrating complex geologic facies models: (i) honoring the discrete and non-Gaussian nature of the model parameters (facies distribution), and (ii) preserving their expected complex connectivity patterns. The developed approach takes advantage of the convenient hard-data conditioning in MPS simulation to incorporate the production data. This is achieved by first identifying the facies types at a set of pilot point locations from the flow data and then using them, along with the available hard data, for conditional facies simulation. By drawing conditional samples from a TI, the proposed method ensures that the discrete nature of the facies and the prior model connectivity in the TI are honored during model calibration.

To implement the pilot point conditioning approach, the first step is a strategic identification of pilot point locations. Placement of pilot points is carried out by combining information related to the sample variance, to maximize the reduction in uncertainty, with the sensitivity of the flow responses of the prior models, to increase the effectiveness of the updates. These criteria are combined to place pilot points at locations that have the potential to significantly improve the prior models, and to avoid locations where little variability is present. Once a score map is constructed, a predefined threshold is used to truncate regions with low values of sensitivity and variance and to retain only the promising regions for placing the pilot points.
I investigated two techniques for locating pilot points in the truncated score map: a deterministic placement method that covers the effective regions based on the ranking of the scores, and a random placement strategy in which, for each subgroup of the ensemble, the pilot points are randomly located within the truncated score map. In both cases, the pilot points introduced into the score map obey a minimum-distance constraint; that is, each covers an area with a radius of influence that may not overlap the region of influence of any other pilot point. This strategy results in automatic identification of the number of pilot points, but it also requires a tuning parameter, i.e., the radius of influence for each pilot point. The results of our numerical experiments show that the random placement of pilot points is more robust, as it covers more points on the score map and increases the chance of intersecting important facies types (e.g., channels).

The importance of dynamic data, both in determining good candidate locations for pilot points and in identifying the values at the pilot points via data assimilation, is also demonstrated. The dynamic data can be easily incorporated by repeating the workflow with the conditional facies maps generated in the previous steps. In this iterative workflow, the updated facies models, which reflect the flow data, are used for placing new pilot point locations. The facies estimation results are clearly improved when dynamic data are included in selecting the pilot point locations. The developed method presents an important extension of the original pilot points method to stochastic calibration of complex facies distributions against flow data.
A key advantage of the presented approach is that it automatically reproduces discrete facies models with the complex connectivity patterns specified in the prior model, properties that are difficult to achieve using conventional model calibration techniques.

CHAPTER 4
FACIES PROBABILITY CONDITIONING METHOD

4.1 Introduction

The adapted pilot point method discussed in the last chapter conditions multiple-point statistics (MPS) facies simulations on nonlinear flow data by taking advantage of the convenient hard-data conditioning feature of MPS. It first strategically places a number of pilot points based on information from three sources: sensitivity, variance, and flow data. The ensemble smoother (ES) is then applied to integrate the flow data, and the facies values at the pilot points are inferred from the flow data in a probabilistic manner. These facies values at the pilot points are treated as hard data and used in MPS simulations to constrain the facies models. The adapted pilot points method provides a means to condition MPS simulations on flow data while preserving connectivity features (e.g., channel structures). However, it uses the information inferred from the flow data (e.g., the updated ensemble realizations) at the pilot points only, while the information at the remaining grid cells is discarded. In this chapter, a facies probability conditioning method is presented that uses the information inferred from the flow data at all grid cells to condition MPS facies simulations.

A similar method, the probability conditioning method (PCM), was proposed to condition MPS facies simulations on nonlinear flow data (Jafarpour and Khodabakhshi 2011). In PCM, the ensemble Kalman filter (EnKF), a data assimilation technique proposed by Evensen (Evensen 1994), is used to integrate the nonlinear flow data and infer facies probability maps from the ensemble mean of the updated continuous realizations.
Then, the τ-model, a data-dependence model proposed by Journel (Journel 2002), is used to integrate the inferred facies probability map and generate new facies realizations. PCM can provide solutions that are consistent with the TI and preserve the higher-order statistics within the TI; however, it can be improved if (i) in addition to the ensemble mean of the updated realizations, higher-order statistics such as the variance are involved in estimating the facies probability map, and (ii) instead of applying a constant τ value everywhere, a pixel-based τ is used to represent the reliability (weight) of the estimated probability map at each cell when generating new facies models.

In this chapter, a facies probability conditioning method for conditioning simulations of complex facies patterns on nonlinear flow data is presented and discussed. The developed approach is general and can be implemented using various nonlinear inversion techniques. Similar to PCM, the developed method consists of two steps. In the first step, the ensemble smoother (ES) (Van Leeuwen and Evensen 1996) is used to integrate the flow data. Once the updated ensemble realizations of log-permeability are available, the facies probability maps are inferred from the mean and variance of the ensemble. An entropy map is then defined from the facies probability maps and used to generate a pixel-based τ map, which represents the reliability of the estimated probability maps at each grid cell. In the second step, the Single Normal Equation SIMulation (SNESIM) algorithm is used to generate new facies models conditioned on the hard data (if available) and on the probability maps along with the pixel-based τ map. In this way, the newly generated facies models are conditioned on both the nonlinear flow data and the given TI.
The chapter starts with a presentation of the developed method, followed by a discussion of its important properties. The results of applying the method to several two-dimensional synthetic examples with channel facies are demonstrated and discussed before presenting the conclusions.

4.2 Methodology

As discussed in Chapter 2, MPS simulations utilize empirical multivariate distributions F_c(z) in Eq. (2.1) that are inferred from a TI (Guardiano and Srivastava 1993, Strebelle 2002, Hu and Chugunova 2008). When an extra source of information, e.g., seismic data (usually known as soft data), becomes available in addition to the TI, combining the information from these diverse sources (TI and soft data) is challenging. SNESIM, a particular implementation of MPS used for the simulation of discrete facies patterns, uses the τ-model to simulate the random variable m(x_{n+1}) through its conditional probability given two data events from different sources, in this case the TI and the soft data. Denote a particular facies occurrence at a given grid cell as event A, the information from the TI as B, and the information from the soft data as C. Let P(A|B) and P(A|C) be the conditional probabilities of the facies conditioned on the TI (data B) and on the flow data (data C), respectively. For the stated problem with two sources of data, the joint conditional probability of event A is given by

\[ \frac{x}{b} = \frac{c}{a}, \tag{4.1} \]

\[ x = \frac{1-P(A|B,C)}{P(A|B,C)}, \quad a = \frac{1-P(A)}{P(A)}, \quad b = \frac{1-P(A|B)}{P(A|B)}, \quad c = \frac{1-P(A|C)}{P(A|C)} \tag{4.2} \]

In MPS facies simulations, the marginal probability P(A) is usually provided based on prior knowledge of the subsurface model and is treated as a constant in the simulations, and so is a. The underlying assumption in Eq.
(4.1) is a form of independence of the relative incremental contributions of data B and C (the incremental contribution of data event C to the occurrence of A is the same before and after B is known) (Journel 2002). Data dependence can be introduced through the τ-model, which generalizes the estimation of the posterior probability as follows (Journel 2002):

\[ P(A|B,C) = \frac{1}{1+x}, \quad \text{with} \quad \frac{x}{b} = \left(\frac{c}{a}\right)^{\tau(B,C)}, \quad \tau(B,C) \ge 0. \tag{4.3} \]

The value of τ ≥ 0 can be interpreted as the weight given to data C relative to data B. For example, when τ > 1, data C is given more weight than data B, while less weight is given to data C when 0 < τ < 1. In PCM, the τ value is set to a constant, τ = 1, which implies equal weights for the TI and the probability maps inferred from the nonlinear flow data. In this chapter, a τ-map is developed in which the value at each grid cell represents the relative weight given to the two data sources (i.e., TI and soft data) at the corresponding location; such a τ map may vary in space and time. The details are discussed in Section 4.2.3. Note that the standard implementation of SNESIM can only take a constant τ value throughout the grid, and I have made the necessary modifications to the SNESIM implementation to make it suitable for integrating pixel-based τ maps.

4.2.1 Estimation of Facies Probability Maps

PCM uses the ensemble Kalman filter (EnKF) as the flow data integration approach. Once the flow data are integrated, PCM calculates the ensemble mean of the permeability. The ensemble mean is then used to generate facies probability maps for conditioning the facies simulation in the SNESIM algorithm. Assume the subsurface model contains only two facies types, channel (denoted CH) and non-channel (denoted NC), whose permeability values are known as k_ch and k_nc, respectively.
If the ensemble mean of log-permeability at a grid cell x is k̄(x), then the conditional probability of the facies types at grid cell x can be estimated as:

\[ P(A(x)=CH\,|\,C) = \begin{cases} p_{max}, & \bar{k}(x) > k_{ch} \\ p_{min}, & \bar{k}(x) < k_{nc} \\ p_{min} + (p_{max}-p_{min})\,\dfrac{\bar{k}(x)-k_{nc}}{k_{ch}-k_{nc}}, & \text{otherwise} \end{cases} \]

\[ P(A(x)=NC\,|\,C) = 1 - P(A(x)=CH\,|\,C) \]

where [p_min, p_max] = [0.10, 0.90] are the bounds of the probability maps in PCM, allowing at most a 90% likelihood in the updated results. It is important to note that instead of updating the first- and second-order moments (mean and covariance), the EnKF updates individual realizations; these realizations can therefore be used to approximate any desired statistics. PCM uses only the first-order moment (the ensemble mean) to estimate the probability maps. In our method, an approach is developed that estimates the probability maps using both the first- and second-order moments (the ensemble mean and variance of the updated realizations). As before, I assume the subsurface model contains only two facies types, channel (CH) and non-channel (NC), with known permeability values k_ch and k_nc, respectively. Consider a location x, and denote the first and second moments estimated from the ensemble at x as μ1(x) and μ2(x), respectively. The probability of location x being channel (or non-channel) is denoted P(A(x)=CH|C) := p_ch(x) (or P(A(x)=NC|C) := p_nc(x)). These facies probabilities can then be approximated by solving the following optimization problem:

\[ \min_{p_{ch},\,p_{nc}} \; E_1(p_{ch},p_{nc}) + E_2(p_{ch},p_{nc}) \quad \text{s.t.} \quad p_{ch}+p_{nc}=1 \ \text{ and } \ p_{ch},p_{nc}\in[p_{min},p_{max}], \tag{4.4} \]

where

\[ E_1(p_{ch},p_{nc}) = \left\| k_{nc}\,p_{nc} + k_{ch}\,p_{ch} - \mu_1(x) \right\|_2^2, \qquad E_2(p_{ch},p_{nc}) = \left\| \sqrt{k_{nc}^2\,p_{nc} + k_{ch}^2\,p_{ch}} - \sqrt{\mu_2(x)} \right\|_2^2. \tag{4.5} \]

Eqs. (4.4) and (4.5) assume that the facies type at each grid cell follows a Bernoulli distribution. Moreover, Eqs.
(4.4) and (4.5) imply that, unlike PCM, which estimates the probability maps by matching the first-order moment equation only, the proposed facies probabilities at location x are constrained by both the first- and second-order moments. In this way, the facies probability maps estimated by solving Eqs. (4.4) and (4.5) are more consistent with the higher-order statistical information contained in the updated ensemble realizations. Consequently, these more accurate probability maps are expected to yield better performance than PCM. A set of numerical experiments and a detailed discussion are presented in Section 4.3; these experiments show that this change in the estimation of the probability maps improves the results in capturing connectivity features compared with PCM.

4.2.2 Soft Data Conditioning: Tau Model

Let us first take a detailed look at the τ model in Eq. (4.1). In the following discussion, without loss of generality, I assume that a, b, c ∈ (0, +∞), which is equivalent to assuming that P(A), P(A|B), P(A|C) ∈ (0, 1). The τ model has the following important properties:

\[ \lim_{\tau\to 0}\left(\frac{c}{a}\right)^{\tau} = 1 \;\Longrightarrow\; \lim_{\tau\to 0} x = b \tag{4.6} \]

\[ \text{if } \tau>1 \text{ and } \frac{c}{a}<1, \text{ then } x = b\left(\frac{c}{a}\right)^{\tau} < b \;\Longrightarrow\; P(A|B,C) > P(A|B); \quad \text{if } \frac{c}{a}<1, \text{ then } \lim_{\tau\to+\infty} x = \lim_{\tau\to+\infty} b\left(\frac{c}{a}\right)^{\tau} = 0 \;\Longrightarrow\; P(A|B,C) \to 1 \tag{4.7a} \]

\[ \text{if } \tau>1 \text{ and } \frac{c}{a}>1, \text{ then } x = b\left(\frac{c}{a}\right)^{\tau} > b \;\Longrightarrow\; P(A|B,C) < P(A|B); \quad \text{if } \frac{c}{a}>1, \text{ then } \lim_{\tau\to+\infty} x = \lim_{\tau\to+\infty} b\left(\frac{c}{a}\right)^{\tau} = +\infty \;\Longrightarrow\; P(A|B,C) \to 0 \tag{4.7b} \]

\[ \text{if } \tau>1 \text{ and } c=a, \text{ then } x = b \;\Longrightarrow\; P(A|B,C) = P(A|B) \tag{4.7c} \]

As mentioned earlier, in MPS facies simulations the marginal probability P(A) is usually a constant in the simulations, and so is a.
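The limiting properties in Eqs. (4.6)–(4.7) are easy to verify numerically from Eq. (4.3). A minimal sketch, assuming all probabilities lie strictly inside (0, 1):

```python
def tau_combine(p_a, p_ab, p_ac, tau):
    """Journel's tau-model, Eq. (4.3): combine P(A|B) and P(A|C) into
    P(A|B,C), given the marginal P(A) and the weight tau >= 0."""
    a = (1.0 - p_a) / p_a        # prior odds ratio, Eq. (4.2)
    b = (1.0 - p_ab) / p_ab      # odds from the TI (data B)
    c = (1.0 - p_ac) / p_ac      # odds from the flow data (data C)
    x = b * (c / a) ** tau
    return 1.0 / (1.0 + x)
```

With τ = 0 the result collapses to P(A|B), as in Eq. (4.6); with large τ it is driven toward 1 or 0 depending on whether P(A|C) exceeds or falls below P(A), as in Eqs. (4.7a)–(4.7b).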
Hence, the probability P(A|B,C) is determined by the τ value and two conditional probabilities: (i) P(A|B), which is estimated from the TI and the patterns of the pre-simulated grid cells in a neighborhood of the grid cell currently being simulated; and (ii) P(A|C), which is provided by the facies probability maps inferred from the flow data and can vary in space and time. Eq. (4.6) states that no matter what P(A|C) ∈ (0, 1) is, as τ approaches 0 the joint probability P(A|B,C) from the τ model becomes almost identical to P(A|B). That is, when τ is sufficiently small (τ < 1), the resulting probability from the τ model is dominated by the conditional probability estimated from the TI. Eq. (4.7) covers three cases for sufficiently large τ (τ > 1). When P(A|C) > P(A), or equivalently c < a, the resulting probability satisfies P(A|B,C) > P(A|B); that is, event A is more likely to occur (to be drawn in sampling) with the additional information inferred from the flow data (data C) than without it. When τ is extremely large, x = b(c/a)^τ → 0 and consequently P(A|B,C) → 1. This indicates that the increase in the probability that A will occur, resulting from the additional information contained in the flow data (i.e., P(A|C) > P(A)), is emphasized by a large τ value when estimating P(A|B,C) with the τ model. Similarly, when P(A|C) < P(A) and τ is large enough, P(A|B,C) → 0; the increased probability that A will not occur, also inferred from the flow data, is likewise emphasized by a large τ value. In summary, Eq. (4.7) implies that when τ is sufficiently large, the joint conditional probability P(A|B,C) is dominated by the conditional probability P(A|C) inferred from the flow data (data C).

Figure 4.1 Contours of P(A|B,C) vs.
P(A|B) (x-axis) and P(A|C) (y-axis) for different τ values (columns, from left to right: τ = 0.125, 0.25, 0.5, 1, 2, 4, 8) and marginal probabilities P(A) (rows, from top to bottom: P(A) = 0.1, 0.5, 0.9). Nine contour levels (from 0.1 to 0.9 in increments of 0.1) are plotted from bottom-left to top-right.

Figure 4.1 shows contour maps of P(A|B,C) vs. P(A|B) (x-axis) and P(A|C) (y-axis) for a given set of P(A) and τ values. In each plot, nine levels of P(A|B,C) (from 0.1 to 0.9 in increments of 0.1) are shown from dark blue (bottom-left) to light yellow (top-right). The first row (Figure 4.1a) shows the contour maps for P(A) = 0.1 and τ = 0.125, 0.25, 0.5, 1, 2, 4, 8 (from the left-most column to the right-most). It is clear that when τ = 0.125, P(A|B,C) is more sensitive to P(A|B) than to P(A|C), while when τ = 8, P(A|B,C) is more sensitive to P(A|C) than to P(A|B). Moreover, when τ = 0.125, the contours of P(A|B,C) are nearly parallel to the y-axis, and P(A|B,C) is almost identical to P(A|B), which is consistent with the property in Eq. (4.6) of the τ model. As τ increases, the contours progressively flatten and collapse. When τ = 8, all contours become virtually flat and collapse at P(A|C) = P(A). It is expected that when τ is large enough, the τ model behaves like a thresholding method: when P(A|C) > P(A) + ε, with 1 ≫ ε > 0 (or P(A|C) < P(A) − ε), then P(A|B,C) → 1 (or P(A|B,C) → 0).

In PCM, a constant τ = 1 is used to give the same weight to the conditional probabilities from the TI and the flow data. Consider the behavior of the τ model when this constant τ value is used. In Figure 4.1a, the fourth plot from the left shows the contours of P(A|B,C) for τ = 1.
The left part of this plot shows that when P(A|B) is close to 0 while P(A|C) is close to 1, the joint conditional probability P(A|B,C) can still be small. This happens when P(A|C) and P(A|B) are inconsistent, e.g., when one is close to 0 while the other is close to 1. When such an inconsistency occurs and one wants the estimate of P(A|B,C) to rely more heavily on the information from the flow data, this is clearly difficult to achieve with the constant τ = 1. Given that one may be more confident about the probabilistic facies update results closer to the observation points, the τ values at these locations should be larger than those far from the observation points. On the other hand, as more data are integrated, the inferred probability maps are likely to become more accurate, so larger τ values should be expected at later data assimilation updates (or iterations). One way to model the spatiotemporal distribution of the τ values in the proposed method is to relate the distribution to the entropy, which measures the degree to which the facies probability of the system (e.g., at a pixel) is spread out over the possible outcomes (e.g., facies types):

\[ S(x) = -\sum_{i=1}^{N_{facies}} p_i \log_2(p_i). \tag{4.8} \]

The entropy is a measure of uncertainty. For example, in a two-facies (e.g., channel and non-channel) subsurface model, the facies type of pixel x has only two possible outcomes: channel or non-channel. If the probability of the facies type at pixel x being channel (or non-channel) equals 1, the entropy attains its minimum value S(x) = 0. If no preference is given to either facies type, the entropy reaches its maximum value S(x) = 1. In the proposed method, once the facies probability maps have been approximated by solving Eqs. (4.4) and (4.5), the entropy at each grid cell can be calculated from the corresponding estimated probabilities.
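Eq. (4.8) for a single cell with facies probabilities p_i can be sketched as follows, using the usual 0·log(0) = 0 convention:

```python
import numpy as np

def entropy(p):
    """Eq. (4.8): base-2 Shannon entropy of a facies probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # convention: 0 * log2(0) = 0
    return float(-(p * np.log2(p)).sum())
```

As the text states, for two facies the entropy is 0 when one facies probability equals 1 and reaches its maximum of 1 at probabilities (0.5, 0.5). Note also that the entropy of the PCM probability bounds (0.9, 0.1) works out to about 0.4690, which appears to coincide with one of the Figure 4.2 breakpoints quoted below.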
With a pre-specified relationship between the entropy value and the τ value, the entropy map can then be used to generate a τ map. There are various ways to define the relationship between the entropy and the τ value, but all of them should share the same rule: a larger τ value should correspond to a lower entropy. Lower entropy implies less uncertainty about the facies type at a grid cell, which is equivalent to saying that one is more certain about the facies type at that cell (one facies type is much more likely to be sampled than the others); thus, a higher τ value should be assigned to that grid cell in the SNESIM algorithm. Based on a series of experiments and analyses, I define a relationship between the τ value and the entropy value as in Figure 4.2. It is important to note that this relationship is neither optimal nor unique. The defined relationship is a piecewise linear function; the entropy values at the six end points of the five line segments are 0, 0.2864, 0.4690, 0.9341, 0.9710 and 1, respectively. This relationship is used in all the experiments shown in the remainder of this chapter. Since the entropy is computed from the facies probability maps, which vary in space and time, the τ map defined by this relationship is also a function of space and time. Note that I have modified the original implementation of the SNESIM algorithm in order to implement the proposed approach with a spatially varying τ map.

Figure 4.2 The relationship between τ and entropy.
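The piecewise-linear mapping from entropy to τ can be implemented with simple linear interpolation. The entropy breakpoints below are the six end points quoted in the text; the corresponding τ values are read off Figure 4.2 in the original and are not stated in the text, so the `TAU_KNOTS` values here are hypothetical placeholders chosen only to respect the stated rule that a larger τ corresponds to a lower entropy.

```python
import numpy as np

# Entropy breakpoints are the six end points given in the text; the
# tau values at those points are NOT given in the text, so the numbers
# below are hypothetical placeholders (monotonically decreasing, as
# required by the rule "larger tau <-> lower entropy").
ENTROPY_KNOTS = [0.0, 0.2864, 0.4690, 0.9341, 0.9710, 1.0]
TAU_KNOTS = [8.0, 4.0, 2.0, 1.0, 0.5, 0.125]  # hypothetical

def tau_from_entropy(entropy_map):
    """Map a per-cell entropy map to a per-cell tau map by piecewise-linear
    interpolation between the knots above."""
    e = np.asarray(entropy_map, dtype=float)
    # np.interp requires increasing x values; ENTROPY_KNOTS is sorted.
    return np.interp(e, ENTROPY_KNOTS, TAU_KNOTS)
```

Because the interpolant is monotonically decreasing, cells with low entropy (high certainty) receive large τ values, and cells with entropy near 1 receive small τ values.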
4.2.3 Implementation of the Algorithm

The implementation steps of the developed facies probability conditioning method are as follows:

1) Generate an initial ensemble of facies models (conditioned on hard data if available) using the SNESIM algorithm, then assign a permeability value to each grid cell based on its facies type;
2) Integrate flow data to update the ensemble of permeabilities using the ES (as discussed in Section 3.2);
3) Infer facies probability maps by solving the optimization problem Eq. (4.4) and (4.5);
4) Calculate the entropy for each grid cell from the inferred facies probability maps by Eq. (4.8), then generate a pixel-based τ map from the entropy map based on the relationship shown in Figure 4.2;
5) Use the SNESIM algorithm to generate an ensemble of facies realizations conditioned on the hard data as well as the facies probability maps along with the pixel-based τ map, then assign a permeability value to each grid cell based on its updated facies type;
6) Repeat Steps 2 to 5 until a stopping criterion is met.

Note that in our implementation, the ES is not used to update the facies types directly. Instead, the permeability values are estimated first. The estimated permeability values are then used to infer the conditional facies probability maps (by solving Eq. (4.4) and (4.5)), which are subsequently included, along with hard data from existing well locations, in the SNESIM algorithm to regenerate an updated ensemble of facies models.

4.3 Experiments and Discussion

A two-dimensional, two-phase (wetting and non-wetting phase) flow model is used to demonstrate the performance of our method. The model in our first experiment consists of a 65×65 grid system with five injection wells on the left and five extraction wells on the right. The reference facies distribution is shown in the top plot of Figure 4.3 and consists of channel facies (black) and background mudstone (white).
It is assumed that the permeability and porosity values for each facies type are known from the analysis of core samples and well logs, and that the only uncertainty is related to the distribution of facies in space (away from well locations). The pressure at the injection wells and the flow rates of both phases at the extraction wells are measured every month over a two-year period and used as conditioning data.

Figure 4.3 Row 1: Well configuration and reference log-permeability model; black refers to high-permeability channel and white shows low-permeability mudstone. There are five injectors on the left side of the domain (gray diamonds) and five producers on the right side (gray squares). The color bar shows the values of log10 k. Row 2 (from left to right): ensemble mean, variance and three samples of the initial facies models. Row 3: Estimation results with the ES method, showing the ensemble mean, variance and three sample realizations.

The initial ensemble of facies realizations is conditioned on the hard data (the facies types) at 10 well locations. The second row in Figure 4.3 displays the mean, variance and three (out of 100) sample realizations of the initial ensemble of log-permeability distributions. The initial samples share statistically similar connectivity patterns with the reference model. However, the exact shape and location of the features are different from those in the reference model. Since the initial ensemble is conditioned on the hard data at the well locations, the samples show less uncertainty about the facies distribution in the vicinity of the wells. This can be verified by examining the initial ensemble mean and variance maps in Figure 4.3 (second row). The main uncertainty is therefore related to the channel locations and connectivity in the middle of the domain. The standard ES is used to assimilate the flow measurements into the initial ensemble of facies realizations.
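For reference, a generic stochastic ensemble-smoother analysis step of the kind used here can be sketched as follows. This is a textbook ES update with perturbed observations, not the exact implementation of Section 3.2; the variable names are illustrative.

```python
import numpy as np

def es_update(M, D, d_obs, Ce):
    """One generic ensemble-smoother analysis step (perturbed observations).

    M     : (n_m, n_e) ensemble of model parameters (e.g., log-permeability)
    D     : (n_d, n_e) simulated data for each ensemble member
    d_obs : (n_d,) observed data
    Ce    : (n_d, n_d) observation-error covariance
    Returns the updated (n_m, n_e) ensemble.
    """
    n_e = M.shape[1]
    Mc = M - M.mean(axis=1, keepdims=True)   # parameter anomalies
    Dc = D - D.mean(axis=1, keepdims=True)   # data anomalies
    Cmd = Mc @ Dc.T / (n_e - 1)              # parameter-data cross-covariance
    Cdd = Dc @ Dc.T / (n_e - 1)              # data covariance
    # Perturb the observations for each member (stochastic analysis)
    rng = np.random.default_rng(0)
    Dobs = d_obs[:, None] + rng.multivariate_normal(
        np.zeros(len(d_obs)), Ce, size=n_e).T
    K = Cmd @ np.linalg.inv(Cdd + Ce)        # Kalman-type gain
    return M + K @ (Dobs - D)
```

With an identity forward model and small observation noise, the updated ensemble mean moves from the prior mean toward the observations, as expected of the analysis step.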
The third row of Figure 4.3 plots the mean, variance and three samples from the updated ensemble using the ES analysis. A comparison with the reference model reveals that the updated maps are closer to the reference model, but the exact shape and discrete nature of the channel features are lost. In the rest of this section, a series of numerical experiments is presented to evaluate the performance of the developed workflow.

4.3.1 Probability Maps

After the standard ES is applied, the updated ensemble has lost the discrete nature and connectivity features of the channels. To preserve the connectivity features, PCM uses the mean of the updated ensemble realizations to generate facies probability maps. Figure 4.4a shows the probability map (of channel) that is generated by PCM. Notice that this probability map is quite similar to the ES-updated ensemble mean shown in Row 3 of Figure 4.3, since PCM assumes a simple linear relationship between the facies probabilities and the ensemble mean of permeability only. Hence, the generated probability map should match this first-order moment well, but it may not match the higher-order moments of the updated ensemble, e.g., the variance map. Let us consider the ensemble variance map to be a vector v_ens = [v_ens,1, v_ens,2, …, v_ens,n_g]^T, where n_g is the number of grid cells in the model.
Then I can calculate the misfit between the ensemble mean μ_ens and variance v_ens maps (Row 3 in Figure 4.3) from the ES update and the mean μ_est and variance v_est maps calculated from the probability map (Figure 4.4a):

$$Err = \| \boldsymbol{\mu}_{ens} - \boldsymbol{\mu}_{est} \|_2 + \| \boldsymbol{\sigma}_{ens} - \boldsymbol{\sigma}_{est} \|_2 \qquad (4.9)$$

where

$$\boldsymbol{\mu}_{est} = [\mu_{est,1}, \mu_{est,2}, \ldots, \mu_{est,n_g}]^T, \quad \mu_{est,i} = k_{nc}\,p_{nc,i} + k_{ch}\,p_{ch,i}$$

$$\boldsymbol{v}_{est} = [v_{est,1}, v_{est,2}, \ldots, v_{est,n_g}]^T, \quad \sigma_{est,i}^2 = v_{est,i}, \quad v_{est,i} = (k_{nc} - \mu_{est,i})^2\,p_{nc,i} + (k_{ch} - \mu_{est,i})^2\,p_{ch,i} \qquad (4.10)$$

and p_nc,i and p_ch,i are the values at the i-th grid cell in the facies (non-channel and channel) probability maps, respectively. For PCM, Err_PCM = 62.0276.

The second row of Figure 4.4 shows the mean, variance and three samples of the ensemble conditioned on the probability map estimated from PCM. Although the channel connectivity and discrete features are preserved, as shown in the samples, they are not close to the reference model.

Figure 4.4 After one ES update, the probability maps are calculated from the ensemble by (a) PCM and (b) the proposed method. Row 2 (from left to right): the mean, variance and three samples from the ensemble conditioned on the probability map (a). Row 4: the mean, variance and three samples from the ensemble conditioned on the probability map (b).

Figure 4.4b shows the probability map (of channel) that is generated by solving the optimization problem Eq. (4.4) and (4.5); for our developed method, Err_FPMC = 53.6455. Clearly, compared to PCM, the probability map generated by the proposed method is more accurate in terms of matching both the first- and second-order moments. The fourth row of Figure 4.4 shows the mean, variance and three samples of the ensemble conditioned on the corresponding probability map.
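The moment-matching misfit of Eq. (4.9)-(4.10) for a two-facies model reduces to a few lines of code. The sketch below assumes the ‖·‖₂ terms compare standard-deviation maps, following the σ notation in Eq. (4.9).

```python
import numpy as np

def moment_misfit(mu_ens, var_ens, p_ch, k_ch, k_nc):
    """Misfit of Eq. (4.9)-(4.10) between ensemble moments and the moments
    implied by a two-facies probability map.

    mu_ens, var_ens : per-cell ensemble mean and variance maps (1-D arrays)
    p_ch            : per-cell channel probability map
    k_ch, k_nc      : (log-)permeability of channel and non-channel facies
    """
    p_nc = 1.0 - p_ch
    mu_est = k_nc * p_nc + k_ch * p_ch                          # Eq. (4.10)
    var_est = (k_nc - mu_est) ** 2 * p_nc + (k_ch - mu_est) ** 2 * p_ch
    return (np.linalg.norm(mu_ens - mu_est)
            + np.linalg.norm(np.sqrt(var_ens) - np.sqrt(var_est)))
```

As a sanity check, a probability map whose implied mixture moments equal the ensemble moments exactly (e.g., p_ch = 0.5 with mean 1 and variance 1 for k_ch = 2, k_nc = 0) yields a misfit of zero.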
From the samples it can be observed that the channel connectivity is better preserved and the channel features are closer to those in the reference model compared with the results from PCM.

4.3.2 Four Scenarios

In this section, a set of experiments using Example 1 is shown and discussed. The set contains four scenarios, each of which uses flow data to constrain the facies simulations with one of four different approaches: (i) PCM; (ii) the same approach as PCM but with the facies probability maps estimated by Eq. (4.4) and (4.5); (iii) the same approach as PCM but with the spatiotemporal τ map defined in Section 4.2 rather than the constant τ = 1; (iv) the developed method described in Section 4.2. The results of Scenarios (i) and (ii) are already shown in Figures 4.4a and 4.4b, respectively.

Figure 4.5a shows the results from Scenario (iii). The left figure in the top row shows the channel facies probability map estimated from PCM Eq. (4.3), while the right figure shows the corresponding τ map generated based on the relationship shown in Figure 4.2. The second row shows the mean, variance and three samples of the MPS simulation results conditioned on the facies probability map using the τ map. The simulation results are improved compared with those from Scenario (i) (Figure 4.4a), which indicates that, when conditioning on the same facies probability map, the spatially varying τ map leads to better results than the constant τ = 1.

Figure 4.5b shows the results from Scenario (iv). The left figure in the top row shows the channel facies probability map estimated from Eq. (4.4) and (4.5), and the right figure shows the corresponding τ map generated based on the relationship shown in Figure 4.2. The second row shows the mean, variance and three samples of the conditioned MPS simulation results. It is clear that the results from Scenario (iv) are improved compared with the other three scenarios.
The total RMSE (root mean square error) of parameter estimation after s iterations can be defined as:

$$RMSE_{total}\big(\log(k^{(s)})\big) = \sum_{i=1}^{n_g} \sqrt{\frac{1}{N_{ens}} \sum_{j=1}^{N_{ens}} \Big(\log\big(k_{j,i}^{(s)}\big) - \log\big(k_i^{(r)}\big)\Big)^2} \qquad (4.11)$$

where k_{j,i}^{(s)} and k_i^{(r)} are the permeability values at the i-th grid cell of the estimated j-th ensemble realization at the s-th iteration and of the reference model, respectively. Each iteration contains two steps: an ES update and an MPS simulation. s = 0 refers to the initial ensemble (as shown in Figure 4.3), while s = 1 refers to the estimated ensemble after one iteration (as shown in Figures 4.4 and 4.5). N_ens is the ensemble size, which is 100 in this work, and n_g is the number of grid cells. The relative total RMSE reduction of parameter estimation after one iteration, relative to the initial ensemble, can then be defined as:

$$rRMSE_{total}\big(\log(k^{(1)})\big) = \frac{RMSE_{total}\big(\log(k^{(1)})\big) - RMSE_{total}\big(\log(k^{(0)})\big)}{RMSE_{total}\big(\log(k^{(0)})\big)} \qquad (4.12)$$

The relative total RMSE reductions after one iteration for the four scenarios are -10.89%, -18.15%, -16.17% and -35.43%, respectively. The results from Figures 4.4 and 4.5, as well as the relative total RMSE reduction of parameter estimation, show that in this case study the developed approach outperforms the other three scenarios.

Figure 4.5 After one ES update, the probability maps and their corresponding τ maps are calculated from the updated ensemble by (a) PCM and (b) the proposed method. Row 2 (from left to right): the mean, variance and three samples from the ensemble conditioned on the probability map with its corresponding τ map (a). Row 4: the mean, variance and three samples from the ensemble conditioned on the probability map with its τ map (b).
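Eq. (4.11) and (4.12) translate directly into code; the sketch below sums the per-cell ensemble RMSE over all grid cells and then forms the relative reduction (negative values mean improvement).

```python
import numpy as np

def rmse_total(logk_ens, logk_ref):
    """Total RMSE of Eq. (4.11): per-cell ensemble RMSE summed over cells.

    logk_ens : (n_g, N_ens) estimated log-permeability ensemble
    logk_ref : (n_g,) reference log-permeability
    """
    per_cell = np.sqrt(np.mean((logk_ens - logk_ref[:, None]) ** 2, axis=1))
    return per_cell.sum()

def relative_reduction(rmse_1, rmse_0):
    """Relative total RMSE reduction of Eq. (4.12); negative = improvement."""
    return (rmse_1 - rmse_0) / rmse_0
```

For instance, an ensemble that matches the reference everywhere has a total RMSE of zero, and a drop from 1.0 to 0.8 corresponds to a relative reduction of -20%.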
Figure 4.6 From left to right: the relative total RMSE of parameter estimation of Scenario (iv) relative to (a) Scenario (i); (b) Scenario (ii); and (c) Scenario (iii).

Figure 4.7 From left to right: the relative total RMSE of data matching of Scenario (iv) relative to (a) Scenario (i); (b) Scenario (ii); and (c) Scenario (iii).

4.3.3 Statistical Analysis

In the previous section, the case study has shown that our developed method of conditioning MPS simulations on flow data provides better results than the other three scenarios in terms of estimating model parameters. In this section, the four scenarios are applied to 100 case studies and their performance on parameter estimation is investigated, which provides a statistical analysis. In these 100 case studies, all the reference models have the same well configuration. However, the reference facies (permeability) models are different. They are conditioned on the same TI but on different hard data configurations at the wells. Consequently, their initial ensembles are also different. For each case study, the total RMSE of parameter estimation is calculated, and Figures 4.6 and 4.7 show the relative total RMSE reductions of parameter estimation and data matching, i.e., the reductions of the total RMSE of parameter estimation and data matching of Scenario (iv) relative to the other three scenarios, respectively. In all case studies, compared with the other three scenarios, the results from Scenario (iv) have the smallest total RMSE of parameter estimation and data matching. Based on the statistical analysis performed in this subsection, our developed approach (Scenario (iv)) outperforms the other scenarios in terms of estimating model parameters and matching the observed data.
Therefore, in the following sections, I will only show the results from the developed approach.

Figure 4.8 Top: Well configuration and reference log-permeability model; black refers to high-permeability channel and white shows low-permeability mudstone. There are five injection wells (gray diamonds) and five extraction wells (gray squares). The color bar shows the values of log10 k. Column 1: Three samples (top), ensemble mean and variance (bottom) of the initial facies models. Column 2: Three samples (top), ensemble mean and variance (bottom) of the facies models calibrated by ES. Column 3: Estimation results of PCM: three samples (top) and ensemble mean and variance (bottom) are shown.

Figure 4.9 The facies ensemble, including three samples, mean and variance of the final ensemble, obtained after (a) 1 and (b) 4 full iterations of facies probability map conditioning with the standard ES.

4.3.4 Experiment 2: Example with Complex Connectivity Patterns

I now consider a larger example with more complex connectivity patterns. The example also involves two-phase flow in a two-dimensional model. The top panel in Figure 4.8 shows the reference model, which consists of a 201×61 grid system with five injection wells (gray diamonds) and five extraction wells (gray squares). The reference facies distribution consists of channel facies (black) and background mudstone (white). The pressure at the injection wells and the flow rates of both phases at the extraction wells are measured every month over a two-year period and used as conditioning data. The initial ensemble of facies realizations is conditioned on the hard data (the facies types) at 10 well locations.
The first column in Figure 4.8 displays three sample (out of 100) realizations of the initial ensemble of log-permeability distributions along with the ensemble mean and variance maps. While the initial samples share similar connectivity patterns (statistically) with the reference model, they do not represent the exact shape and location of the features in the reference model. Since the initial ensemble is conditioned on the hard data at the well locations, the initial ensemble members show less uncertainty about the facies distribution in the vicinity of the wells. The second and third columns in Figure 4.8 show the sample plots after data assimilation with the standard ES and using PCM, respectively. Similar to the first example, the updated maps after applying the standard ES are closer to the reference model, but the exact shape and discrete nature of the channel features are lost (second column). The final ensemble of log-permeabilities after PCM shows the connectivity features; however, the samples are not close to the reference model.

In the next experiments, I implement our developed approach: first, the facies probability maps are inferred from the dynamic flow data by using the ES update and Eq. (4.4) and (4.5); then a τ map is generated based on the probability maps and the relationship shown in Figure 4.2; finally, the MPS facies simulations are conditioned on the probability maps using the τ map as the weights. The developed approach can also be performed iteratively by repeating the above steps. Figures 4.9a and 4.9b show the three samples, mean and variance from the final ensemble of facies after one and four full iterations of the workflow, respectively. The results from these experiments show noticeable improvements compared with those from PCM or ES in Figure 4.8.
4.3.5 Information in the Facies Probability Map

Notice that when estimating the conditional probability P(A|C) from flow data at a grid cell, neither PCM nor the proposed method considers spatial correlations between the simulation cell and the grid cells in its neighborhood. Both methods infer probability maps from flow data by using statistical information at the simulating cell only. Let us reconsider the conditional probabilities that are used in PCM and the proposed method. Denote by x the location of the current simulating cell. P(A|C), the conditional probability of facies types at location x inferred from flow data by either PCM or our proposed method, can be denoted as P(A(x)|C̃(x)), where C̃(x) refers to the updated model parameters (permeability) at location x without considering any information from the neighborhood of x. P(A|B), the conditional probability inferred from a TI in SNESIM, can be denoted as P(A(x)|(B̂, F(N(x)))), where N(x) refers to the pre-simulated grid cells in a neighborhood of x, F(N(x)) refers to the facies pattern (a vector with discrete values) at these grid cells, and (B̂, F(N(x))) refers to the combination of the TI and F(N(x)). Specifically, in SNESIM a template identical to N(x) is used to scan the whole TI and record all possible outcomes of A(x) wherever F(N(x)) is matched during the scanning. These outcomes are then used to estimate the conditional probability P(A(x)|(B̂, F(N(x)))). With the more specific notation defined above, it is obvious that in both PCM and our proposed method the two estimated conditional probabilities P(A|B) and P(A|C) are not consistent: P(A|C) is estimated by considering the current simulation cell only, while P(A|B) is estimated by considering the cell as well as its neighborhood.
Furthermore, estimating P(A(x)|C̃(x)) from an updated ensemble of model parameters without considering information from a neighborhood may cause a loss of the spatial correlation information in the ensemble, which could be useful for improving the conditioned models. In the same way that SNESIM estimates P(A(x)|(B̂, F(N(x)))), a more consistent way of conditioning facies simulations on flow data should estimate the conditional probability P(A|C) by considering both the simulating cell and its neighborhood. In this section, the goal is not to completely address this issue. Instead, I want to develop a more consistent way, which may not be optimal, of applying this idea, i.e., estimating P(A(x)|(Ĉ, F(N(x)))), where Ĉ refers to an ensemble updated by the ES through integrating flow data. First, let r be an ensemble realization in Ĉ and obtain F_r(N(x)), the facies pattern of r at the pre-simulated grid cells within the neighborhood N(x). Next, select the ensemble realizations for which F_r(N(x)) is close enough to F(N(x)); the resulting subset of Ĉ is denoted as

$$\hat{C}_{F(N(x)),\varepsilon} = \{ r \in \hat{C} \,:\, \| F_r(N(x)) - F(N(x)) \|_2 < \varepsilon \}.$$

Then, from all ensemble realizations in Ĉ_{F(N(x)),ε}, the outcomes of A(x) can be collected and used to estimate the conditional probability P(A(x)|(Ĉ, F(N(x)))). Notice that there are cases when Ĉ_{F(N(x)),ε} = ∅, which means that no realization in Ĉ has an F_r(N(x)) close enough to the target pattern F(N(x)). In such cases, I may approximate P(A(x)|(Ĉ, F(N(x)))) by assigning to it the estimated value from the proposed method in Section 4.2, i.e., P(A(x)|(Ĉ, F(N(x)))) = P(A(x)|C̃). I call the above approach local conditioning; it differs from the approach discussed in Section 4.2 by estimating the conditional probability P(A|C) based on information from a local neighborhood N(x).
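The local conditioning estimate described above can be sketched as follows; facies code 1 plays the role of the event A (e.g., channel), and the fallback value stands in for the single-cell estimate P(A(x)|C̃) from Section 4.2.

```python
import numpy as np

def local_conditional_prob(ensemble, neigh_idx, f_target, cell_idx,
                           eps, fallback_prob):
    """Sketch of the local-conditioning estimate of P(A(x) | C^, F(N(x))).

    ensemble      : (n_real, n_g) array of discrete facies realizations
    neigh_idx     : indices of the pre-simulated neighborhood cells N(x)
    f_target      : facies pattern F(N(x)) at those cells
    cell_idx      : index of the current simulation cell x
    eps           : pattern-matching tolerance (epsilon)
    fallback_prob : single-cell estimate used when the matched subset is empty
    Returns the estimated probability that facies 1 occurs at x.
    """
    diffs = ensemble[:, neigh_idx] - np.asarray(f_target)
    match = np.linalg.norm(diffs, axis=1) < eps    # realizations in C^_{F,eps}
    if not match.any():                            # C^_{F,eps} is empty
        return fallback_prob
    return float(np.mean(ensemble[match, cell_idx] == 1))
```

For example, with three realizations whose neighborhood patterns are [1,1], [1,1] and [0,0], a target pattern [1,1] selects the first two realizations, and the probability at the simulation cell is the fraction of those two that show facies 1 there.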
Figure 4.10a shows the mean, variance and three samples of the input ensemble used in the MPS simulation. The input ensemble is generated by applying a threshold to the continuous ensemble from the ES updates. Clearly, the connectivity features (channels) are lost in the samples. Figure 4.10b shows the mean, variance and two samples of the output ensemble from the MPS simulation conditioned on the facies probability maps, as well as the corresponding conditional probabilities for the two samples. Notice that in this case P(A|C) is estimated by matching the mean and variance of the input ensemble; therefore the same P(A|C) is used in generating all output ensemble realizations. In the top-left corner of the variance map of the output ensemble (circled by a red dashed ellipse), it is clear that the variance is not zero, which means channels are present in this area in some of the output ensemble realizations. This can also be observed in the second sample of the output ensemble. On the other hand, in the center area (circled by a red dashed ellipse) of the first sample, the channel structure does not exist. This is inconsistent with the high probability values in the same area of the facies probability map P(A|C). Figure 4.10c shows the mean, variance and two samples of the output ensemble from the MPS simulation with local conditioning, as well as the corresponding conditional probabilities for the two samples. In this case, the conditional facies probability P(A|C) is estimated by scanning all the samples of the input ensemble; therefore, the P(A|C) maps used in generating different output ensemble realizations differ from one another, although they are very similar.

Figure 4.10 (a) The input ensemble for MPS simulation. (b) The mean, variance, two samples and their corresponding conditional probabilities P(A|C) and P(A|B,C) of the output ensemble of MPS simulation conditioned on probability maps.
(c) The same outputs of the MPS simulation by local conditioning.

Although some improvement can be observed in the samples of the output ensemble from the MPS simulation with local conditioning, it is difficult to find strong evidence in these figures to support this argument. The total RMSE of model parameter estimation from local conditioning is only reduced by 3.2% compared with that of the ensemble conditioned on the facies probability maps. Instead, the output ensembles are used as model parameters to generate predictions of the model output (e.g., pressure and oil/water rates), and these predictions are then compared with the synthetic observations generated from the reference model. The total RMSEs of data matching for pressure, oil and water rates for the output ensemble from local conditioning are reduced by 16.7%, 6.0% and 11.2%, respectively, compared with those of the output ensemble conditioned on the facies probability maps.

4.4 Summary

Two general challenges in calibrating complex channel facies models against flow data are: (1) honoring the discrete and non-Gaussian nature of the model parameters (facies distribution) and their connectivity through existing parameterization techniques, and (2) the lack of a robust discrete model calibration method that can be used for the estimation of complex facies patterns. A natural method for addressing these challenges is to directly condition the facies construction on flow data, instead of generating unconditional realizations and calibrating them. I examine the problem of conditioning multiple-point statistical facies simulation on flow data with facies probability maps.
I propose an improved approach for conditioning multiple-point statistical simulation of complex facies on nonlinear flow data, using soft-data conditioning. The improvements are introduced in two steps: first, by using a statistically more rigorous approach to estimate the facies probability maps from an ensemble of calibrated continuous proxy parameters (i.e., log-permeability); and second, by introducing a spatially variable (pixel-based) τ map to provide grid-cell-level weights for the conditional probabilities from the TI and the facies probabilities at each cell. The proposed method has three advantages: (i) by drawing conditional samples from a TI, it ensures that the discrete nature of the facies and the prior model connectivity in the TI are honored during model calibration; (ii) by inferring facies probability maps from flow data, the updated facies realizations generated by MPS simulations are conditioned on the flow data; and (iii) the entropy-related τ map provides proper weights for the facies probability maps in MPS simulations, which leads to improved facies models from the simulations.

I also examined and discussed important properties of conditional facies simulation with probability maps using the SNESIM algorithm. One of them is that the facies probability map provided by soft data (in this case the updated ensemble of permeability) does not include the spatial patterns that are contained in the updated ensemble. I proposed a method to incorporate the local conditioning data from the updated ensemble, in a way similar to how the TI is scanned with local conditioning data. This approach ensures that the local conditioning data used during simulation are also considered when computing the facies probabilities from the flow-calibrated ensemble of permeability.
CHAPTER 5 PROPERTIES OF MPS FACIES SIMULATIONS AND THE COMBINATION OF PILOT POINTS AND FACIES PROBABILITIES

5.1 Introduction

As mentioned in the previous chapters, in multiple-point statistical (MPS) facies simulation, a random path is generated first to visit all grid cells, and the conditional facies probabilities at each grid cell are estimated from a conceptual model of geologic connectivity, known as a training image (TI). The conditional facies probabilities are then used to draw samples of facies distributions that are consistent with the TI. A number of MPS-based methods have been developed to constrain facies models with flow data, including the two MPS-based methods that I have developed, which take advantage of the convenient hard data and soft data conditioning of MPS facies simulations. For example, the adapted pilot points method discussed in Chapter 3 considers the estimated facies values at pilot points as hard data and then uses MPS facies simulations to generate facies models that are conditioned on these pilot points (Ma and Jafarpour 2017, 2018b). In this way, the facies connectivity patterns in the newly generated facies models are consistent with those in the TI and also honor the flow data. However, it has two disadvantages: (1) it only uses a tiny portion of the information from the updated ensemble, i.e., it only uses the information at a few pilot points from the updated ensemble models; (2) its sensitivity/gradient calculation requires more computational resources. Furthermore, unlike other parameterization methods, it requires sensitivity/variance information at all grid cells to place the pilot points, which does not actually reduce the dimension of the inverse problem. In order to overcome some of the difficulties that the adapted pilot points method suffers from, I have developed the facies probability conditioning method, which is discussed in Chapter 4.
The facies probability conditioning method uses the information at all grid cells (facies probability maps) from the updated ensemble rather than only considering a few cells (pilot points). As expected, the facies probability conditioning method can reproduce facies models having facies connectivity patterns that are consistent with the TI and also improve the model predictions. However, the facies probability map conditioning method may suffer from: (1) a lack of spatial correlation information from the updated ensemble, since it only considers the mean/variance at each grid cell alone; even though this can be improved by estimating the facies probability at a grid cell from the pre-simulated facies pattern within its neighborhood, as discussed in Chapter 4, further study may still be needed to improve the facies probability estimation; (2) the lack of an improved and more systematic way of designing the τ map, beyond the relationship defined in Figure 4.2, which is still an open problem; (3) more importantly, the facies connectivity patterns in the updated facies models that are conditioned on facies probability maps may not be consistent with those included in the facies probability maps (Ma and Jafarpour 2018a). A number of approaches have been proposed to alleviate this issue. For example, Le et al. proposed to find and discard the realizations that do not agree with the probability map by using an algorithm from computer vision called Connected Component Labeling, which detects and labels connected regions in a binary image (Le et al. 2015). I proposed an approach of using a spatiotemporal τ map, which can assign higher weights to the facies probabilities at the grid cells where a facies type is more likely to be present (Ma and Jafarpour 2018a). As a result, the facies realizations conditioned on the facies probability map have a much higher chance of being consistent with the probability maps.
However, the design of an optimal τ map is still very challenging and requires further study. Both approaches attempt to alleviate the issue with specific techniques and provide means to improve the consistency of the conditional (on facies probability maps) facies realizations. In order to solve the issue, it is critical to reveal the reason behind it. We recently uncovered that one essential cause of the issue is that the facies patterns in the output of an MPS facies simulation are primarily controlled by the first few grid cells along the random path, which constitute only a small portion (2-3%) of all grid cells. In this chapter, I first perform a reassessment of MPS facies simulations and facies probability conditioning approaches through a series of experiments. The results from these experiments reveal the essential behaviors of MPS facies simulations: (1) the facies patterns in the output of an MPS facies simulation are primarily controlled by the first few grid cells along the corresponding random path; (2) the facies simulations at the rest of the grid cells are then dominated by the conditional probabilities from the TI only, rather than by the soft data, which in facies probability conditioning approaches consists of the facies probability maps inferred from the flow data. In other words, the facies probability maps have almost no impact on determining the facies patterns at the late stage of MPS facies simulations. These essential behaviors of MPS facies simulations explain why facies probability conditioning approaches generate inconsistent facies connectivity patterns in facies models while conditioning on flow data. I then propose a new approach for conditioning MPS facies simulations on flow data by combining the adapted pilot points method and the facies probability conditioning method that I have developed.
This new approach provides improved conditional facies simulations at the early stage of MPS facies simulations by conditioning the facies simulations at the first few grid cells on either the strategically placed pilot points or the conditional probability from the facies probability maps, which leads to improved facies models with more consistent facies connectivity patterns compared to applying each of the two methods alone.

5.2 MPS Facies Simulations

Although MPS facies simulation has been discussed in the previous chapters, in this section a brief introduction to MPS facies simulation as well as soft data conditioning of MPS is provided for the reader's convenience. MPS is a sequential simulation algorithm, where a random path is first generated to visit all grid cells and the model property values are sampled from a conditional probability at each simulated grid cell. Let's consider the stochastic simulation of a random function m(x) = [m(x_1), m(x_2), ..., m(x_{N_g})] defined on a grid system with N_g cells, where the notation m(x_i) refers to the parameter value, e.g., facies type, at the location x_i. In sequential simulation, instead of characterizing and sampling from the multivariate distribution of the random function, the multiplication rule of probability is invoked to sequentially visit each un-sampled cell and sample from the corresponding univariate conditional probability, i.e.,

F_c(z) = P[m(x_{n+1}) < z | m(x_1), m(x_2), ..., m(x_n)]    (5.1)

where m(x_{n+1}) is the un-sampled random variable at the current simulation grid cell x_{n+1} and m(x_1), m(x_2), ..., m(x_n) are conditioning (or previously sampled) data within a local neighborhood of x_{n+1}. Notice that the order 1, 2, ..., n follows the corresponding random path, and each MPS simulation should have a different random path, determined by a random seed in the SNESIM algorithm.
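The sequential simulation loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the SNESIM implementation: `cond_prob` stands in for whatever supplies the univariate conditional probability at a cell, and the constant channel proportion used below is an assumption for demonstration only.

```python
import numpy as np

def sequential_simulation(n_cells, cond_prob, seed=0):
    """Visit every grid cell along a random path and draw a binary
    facies value (0 or 1) from the univariate conditional probability
    at that cell, as in Eq. (5.1)."""
    rng = np.random.default_rng(seed)   # the random seed fixes the path
    path = rng.permutation(n_cells)     # random visiting order
    facies = np.full(n_cells, -1)       # -1 marks un-sampled cells
    for cell in path:
        p = cond_prob(cell, facies)     # P[channel at x_{n+1} | sampled neighbors]
        facies[cell] = int(rng.random() < p)
    return facies

# Stand-in conditional probability: a constant channel proportion of 0.3.
# A real MPS code (e.g. SNESIM) would instead scan the training image for
# the data event formed by the already-sampled neighbors of the cell.
realization = sequential_simulation(100, lambda cell, m: 0.3)
```

Hard data conditioning fits naturally into this loop: if a visited cell carries hard data, its value is assigned directly instead of being drawn.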
For each un-sampled grid cell x_{n+1} in a sequential simulation, if it is a hard data location, then the corresponding hard data is assigned to m(x_{n+1}), which is known as hard data conditioning. Otherwise, a random sample is drawn from the conditional probability described in Eq. (5.1). When the random function m(x) is multi-Gaussian, all the conditional probabilities are Gaussian and can be computed by estimating the corresponding conditional mean and variance of m(x_{n+1}), e.g., using Kriging. However, when the multi-Gaussian assumption does not hold, computing the conditional probability for m(x_{n+1}) becomes nontrivial. Generally, there are two sources of information that can be used to estimate the conditional probability of m(x_{n+1}): a training image and soft data (e.g., seismic data or facies probabilities inferred from flow data). Instead of using random function models, MPS simulation utilizes empirical multivariate distributions that are inferred from a training image (Guardiano and Srivastava 1993, Strebelle 2002, Hu and Chugunova 2008). For example, if m(x) represents a discrete random process (e.g., facies distribution), the corresponding conditional probabilities described in Eq. (5.1) can be inferred from a training image that contains the geostatistical information (patterns) (Strebelle 2002, Hu and Chugunova 2008). The information from the training image is by default always used in computing the conditional probability of m(x_{n+1}). When soft data becomes available in addition to a training image, combining the information from these two sources is challenging. Single Normal Equation SIMulation (SNESIM), an implementation of MPS facies simulation, uses the τ-model to compute the conditional probability of the random variable m(x_{n+1}) when soft data is available.
Let's denote the occurrence of a facies type at the current simulated grid cell (e.g., x_{n+1}) as event A, information from the training image as event B, and information from soft data as event C. Let P(A|B) and P(A|C) be the facies probabilities conditioned on the training image (data event B) and the soft data (data event C), respectively. Then the joint conditional probability of event A, which is also the conditional probability of m(x_{n+1}), is given by

P(A|B,C) = 1 / (1 + x),  with  x/b = (c/a)^τ(B,C),  τ(B,C) ≥ 0    (5.2)

x = (1 - P(A|B,C)) / P(A|B,C),  a = (1 - P(A)) / P(A),  b = (1 - P(A|B)) / P(A|B),  c = (1 - P(A|C)) / P(A|C)    (5.3)

In MPS facies simulations, the marginal probability P(A) is usually provided based on prior knowledge of a subsurface model and is considered constant in the simulations, and so is a. The value of τ ≥ 0 can be interpreted as the weight given to data C relative to data B. For example, when τ > 1, data C is given more weight than data B and, as a result, P(A|C) dominates the computation of P(A|B,C); less weight is given to data C when 1 > τ > 0. The probability conditioning method (PCM) was proposed to estimate the conditional probability P(A|C), called a facies probability map, from the mean of the updated ensemble of permeability, and then to condition MPS facies simulations on it (Jafarpour and Khodabakhshi 2011). However, it was found that the facies continuity patterns in the conditioned facies models are not necessarily consistent with those in the facies probability map, even in regions where a very high facies probability (90%) is present. Similar observations were also found in later applications of PCM (Le et al. 2015).
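The τ-model of Eqs. (5.2)-(5.3) translates directly into code. The sketch below is a plain re-implementation of the formula for a single binary-facies cell; the early return for a deterministic P(A|B) reflects the behavior examined in the experiments of this chapter.

```python
def tau_model(p_a, p_ab, p_ac, tau=1.0):
    """Tau-model of Eqs. (5.2)-(5.3): combine the TI probability P(A|B)
    and the soft-data probability P(A|C) into P(A|B,C)."""
    if p_ab in (0.0, 1.0):
        # A TI probability of exactly 0 or 1 makes the cell deterministic;
        # the soft data cannot change it, whatever the value of tau.
        return p_ab
    a = (1.0 - p_a) / p_a      # prior odds against event A
    b = (1.0 - p_ab) / p_ab    # odds against A given the training image (B)
    c = (1.0 - p_ac) / p_ac    # odds against A given the soft data (C)
    x = b * (c / a) ** tau     # from x/b = (c/a)^tau
    return 1.0 / (1.0 + x)     # P(A|B,C)
```

With τ = 1 and a non-informative marginal P(A) = 0.5, two agreeing sources of 0.8 reinforce each other to `tau_model(0.5, 0.8, 0.8) ≈ 0.941`, while τ = 0 ignores the soft data and simply returns P(A|B).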
More recently, an improved facies probability conditioning method has been developed (Ma and Jafarpour 2018a), which proposes an improved facies probability estimation approach as well as a spatiotemporal τ map reflecting the confidence in the corresponding facies probability map. The inconsistency issue of the PCM can be alleviated by applying a spatiotemporal τ map, and the corresponding updated facies models become more consistent; however, the issue is not completely solved. In this chapter, it will be shown that facies probability maps only have an impact on computing the conditional probability P(A|B,C) at the first few grid cells along the random path {x_1, x_2, ..., x_{N_g}}. In the next section, a series of experiments is used to reveal this behavior of MPS facies simulation, followed by a detailed discussion.

5.3 Behavior of MPS Facies Simulations

Figure 5.1 The outputs of the experiment of MPS facies simulations conditioned on a facies probability map. (a) The facies probability map P(A|C) used in the experiment. (b) Two samples of facies models conditioned on the probability map. (c) The conditional probability P(A|B) estimated from the training image. (d) The joint conditional probability calculated from the corresponding facies probabilities P(A|B) and P(A|C) by using the τ-model.

In the first experiment, a number of facies models are generated from MPS facies simulation conditioned on both the training image shown in Figure 2.1 and a facies probability map P(A|C) shown in Figure 5.1(a). Two sample realizations of the MPS facies simulation outputs as well as their conditional facies probabilities are shown in Figure 5.1(b)-5.1(d).
It can be observed that one of the two sample realizations has geologically consistent channel patterns that are similar to those in the facies probability map, while the other does not (the region within the red ellipse). This is a common observation in the outputs from applications of the probability conditioning method. Another observation from Figure 5.1 is that the joint conditional probability P(A|B,C), calculated from the two conditional probabilities P(A|B) and P(A|C) by using the τ-model, is identical to the conditional probability P(A|B) estimated from the training image almost everywhere in the model. In other words, in the facies simulation at most grid cells, the joint conditional probability P(A|B,C) is dominated by P(A|B) even though the probability P(A|C) from the facies probability map is also integrated; in this case, P(A|C) has little or no impact on the facies simulations at these grid cells. The third observation is that most grid cells have P(A|B) equal to 0 or 1, and consequently these grid cells have P(A|B,C) equal to 0 or 1, which explains why P(A|B,C) is dominated by P(A|B) at most grid cells. The statistics of the conditional probability P(A|B) estimated from the training image for the two sample realizations from the first experiment are shown in Figure 5.2. Figure 5.2a shows the histogram of the conditional probability P(A|B) that is not 0 or 1. The number of grid cells that have P(A|B) not equal to 0 or 1 for the two sample realizations is 392 and 384 (out of 12261 grid cells), respectively. The histogram shows that only around 3% of the grid cells have the conditional probability P(A|B) estimated from the training image not equal to 0 or 1, while the MPS simulations at the remaining grid cells become deterministic (with probability 0 or 1) rather than probabilistic (drawing samples from a probability not equal to 0 or 1).
Figure 5.2b shows the conditional probability P(A|B) at all the grid cells along the corresponding random path. Notice that the random paths are different in the MPS facies simulations of these two sample realizations. It can be observed that most grid cells have the conditional probability P(A|B) equal to 0 or 1, while only a small portion of the grid cells have P(A|B) not equal to 0 or 1. The majority of these grid cells are located in the early part of the corresponding random path, which can be observed more clearly in Figure 5.2c, showing the number of grid cells that have P(A|B) not equal to 0 or 1 within every 200 grid cells along the random path. For both sample realizations, this number is around 140 within the first 200 grid cells; it then drops dramatically to fewer than 30 grid cells in every 200-grid-cell interval after the first one. Figure 5.2d shows the details of the first 200 simulated grid cells along the corresponding random path, in which the red line is a straight line with slope equal to 1, while the blue circles represent the cumulative number of grid cells that have P(A|B) not equal to 0 or 1. Within the first 50 simulated grid cells, almost all the grid cells have P(A|B) not equal to 0 or 1, which results in sampling facies values probabilistically from P(A|B,C). Based on the properties of the τ-model discussed in Chapter 4, the grid cells that have P(A|B) not equal to 0 or 1 are the only grid cells that can be affected by the probability from P(A|C). Hence, P(A|C) has an impact on the MPS facies simulations at almost all of the first 50 grid cells in this experiment. After these first 50 grid cells, the slope of the blue line flattens, indicating that fewer grid cells will be affected by the probability P(A|C) from the facies probability map.
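The windowed counts of the kind plotted in Figure 5.2c can be computed with a short script. The probability vector below is synthetic, built only to mimic the reported decay along the random path; it is not the thesis data.

```python
import numpy as np

def informative_cells_per_window(p_ab_along_path, window=200, eps=0.01):
    """Count, in each block of `window` cells along the random path, the
    cells whose P(A|B) lies in (eps, 1 - eps), i.e. the cells whose
    simulation is still probabilistic and can be influenced by soft data."""
    p = np.asarray(p_ab_along_path, dtype=float)
    informative = (p > eps) & (p < 1.0 - eps)
    n_win = int(np.ceil(p.size / window))
    return [int(informative[i * window:(i + 1) * window].sum())
            for i in range(n_win)]

# Synthetic illustration: the chance that a cell is still probabilistic
# decays from ~0.7 at the start of the path to ~0.02 at the end, mimicking
# the drop observed in Figure 5.2c (around 140 cells in the first window).
rng = np.random.default_rng(1)
still_probabilistic = rng.random(2000) < np.linspace(0.7, 0.02, 2000)
p = np.where(still_probabilistic,
             rng.uniform(0.1, 0.9, 2000),                 # probabilistic cells
             rng.integers(0, 2, 2000).astype(float))      # deterministic 0/1
counts = informative_cells_per_window(p)
```

On such a decaying profile, the first window dominates the total count, just as the first 200 path cells dominate in the experiments.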
Figure 5.2 shows that the probability P(A|C) provided by facies probability maps can only have an impact on the simulations at a few grid cells, which are a very small portion of all grid cells, and the majority of the grid cells that can be affected by P(A|C) are those visited at the early stage of the corresponding random path.

Figure 5.2 The statistics of the conditional probability P(A|B) estimated from the training image for the two samples. (a) The histogram of P(A|B) that is not 0 or 1. (b) The conditional probability P(A|B) for each simulated grid cell along the corresponding random path. (c) The number of simulated grid cells that have P(A|B) not equal to 0 or 1 in every two hundred simulated grid cells along the random path. (d) The cumulative number of grid cells that have P(A|B) not equal to 0 or 1 vs. the number of grid cells within the first two hundred grid cells along the random path.

Figure 5.3 The outputs of the MPS facies simulation that is conditioned on no data. (a) The sample realization. (b) The corresponding P(A|B). (c) The histogram of P(A|B) not equal to 0 or 1. (d) The number of P(A|B) not equal to 0 or 1 in every 200 simulated grid cells (left) and the cumulative number of P(A|B) not equal to 0 or 1 in the first 200 grid cells (right).

The observations from the first experiment can also be found in the outputs of an MPS facies simulation that is conditioned on no data, which means the facies patterns are sampled from the conditional probability estimated from the training image only. Figure 5.3 shows the outputs of this MPS simulation, including a sample realization (Figure 5.3a), its corresponding P(A|B) map and the statistics of P(A|B) that is not equal to 0 or 1.
The similar observations from the MPS facies simulations with and without soft data conditioning indicate that it is a common behavior of MPS simulations that the facies connectivity patterns are primarily determined by a few grid cells (around 3-4% of all the grid cells), while facies values are deterministically assigned at most grid cells (around 96-97%).

Figure 5.4 The maps of the conditional probability P(A|B) that belongs to (a) (0, 1) and (b) [0.01, 0.99] from the first 200 grid cells along the random path (left) and the remaining 12061 grid cells (right).

Figure 5.5 The comparison of the outputs from the first (left) and third (right) experiments. (a) and (d): The hard data information at the first 200 simulated grid cells along the random path; blue refers to channel and red refers to non-channel. (b) and (e): The conditional probability P(A|B) ∈ [0.01, 0.99] (top) and the resulting sample realization (bottom) from the first experiment, where the facies values are sampled from the joint conditional probability P(A|B,C). (c) and (f): The conditional probability P(A|B) ∈ [0.01, 0.99] (top) and the resulting sample realization (bottom) from the third experiment, where the facies values are sampled from the conditional probability P(A|B) only.

Next, a number of plots are generated to show where these critical grid cells, whose facies simulations can determine the facies patterns in the whole model, are located.
Figure 5.4 shows the conditional probabilities corresponding to Sample 1 in the first experiment shown in Figure 5.1. It shows the maps of P(A|B) that belong to the intervals (0, 1) and [0.01, 0.99] from the first 200 grid cells along the random path (left column) and from the remaining 12061 grid cells (right column). There are 138 grid cells with P(A|B) ∈ (0, 1) and 124 grid cells with P(A|B) ∈ [0.01, 0.99] among the first 200 grid cells, while there are 254 grid cells with P(A|B) ∈ (0, 1) and 228 grid cells with P(A|B) ∈ [0.01, 0.99] among the remaining 12061 grid cells. Clearly, more grid cells whose facies simulations can be affected by the facies probability map are found among the remaining 12061 grid cells. However, these grid cells are mostly located close to the boundaries between the two facies, i.e., channel and non-channel. These observations show that the first 200 grid cells along the random path contain the most critical grid cells, which can essentially determine the facies patterns in the output of an MPS facies simulation. A third experiment is then performed to confirm the findings above: (i) the facies patterns in the output of an MPS facies simulation are primarily determined by the simulations at the first few grid cells along the random path; (ii) a facies probability map only has an impact on determining the facies patterns by affecting the joint conditional probability P(A|B,C) at these grid cells; (iii) after a certain number of grid cells along the random path, the facies simulations become almost deterministic, since the values of the conditional probabilities P(A|B) estimated from the training image are almost all 0s or 1s, which cannot be changed by the probabilities from the facies probability map. In the third experiment, the first 200 grid cells along the random path for simulating Sample 1 (and Sample 2) in the first experiment (Figure 5.1) are selected and considered as hard data.
Their facies values are assigned the same as those in Sample 1 (and Sample 2), and a new sample realization is generated by conditioning the MPS facies simulation on the hard data at these grid cells and using the same random path as in the first experiment. Therefore, the only difference between the first and third experiments is that the facies values at the remaining 12061 grid cells are sampled from the joint conditional probability P(A|B,C) in the first experiment, whereas they are sampled from the conditional probability P(A|B) alone in the third experiment. In other words, the facies probability map (Figure 5.1a) is not used in conditioning the facies simulations in the third experiment. If the findings mentioned above are true, then the newly generated sample realizations should have facies patterns close to those in Sample 1 (and Sample 2) from the first experiment (Figure 5.1b). Figure 5.5 shows the comparison of the outputs from the first and third experiments, including the hard data information at the first 200 grid cells along the random path and the conditional probability P(A|B) ∈ [0.01, 0.99]. As expected, the facies realizations (of Sample 1 and Sample 2) from the third experiment are close to the corresponding facies realizations from the first experiment, which confirms the findings.

5.4 Combination of Pilot Points and Facies Probability Maps

From the previous section, it is observed that the facies patterns in the output of an MPS facies simulation are primarily determined by the simulations at the first few grid cells along the random path. Hence, a facies probability map can only influence the facies values at these first few cells, which results in different facies patterns in the output of the MPS facies simulation conditioned on the facies probability map. Some of these patterns are similar to those in the facies probability maps, while some are not.
As a result, in the methods that condition the MPS facies simulation on facies probability maps (Jafarpour and Khodabakhshi 2011, Le et al. 2015, Ma and Jafarpour 2018a), the consistency between the facies patterns in the generated facies models and those in the facies probability maps actually depends on the locations of the first few grid cells along the random path. However, these methods have no control over which grid cells the random path visits at the early stage; therefore, facies patterns that are inconsistent with the facies probability maps are observed in the results from these methods. In this section, I propose a method in which a combination of the two methods that I have developed, i.e., the adapted pilot points method discussed in Chapter 3 and the facies probability conditioning method discussed in Chapter 4, is applied to condition MPS facies simulations on flow data. There are several advantages to combining the two methods. First of all, with strategically placed pilot points, when simulating facies values at the first few grid cells along the random path in an MPS facies simulation, some pilot points may exist in the neighborhood of these grid cells, and the simulated facies values at these grid cells will then be affected by these pilot points. In this way, by introducing the pilot points and treating them as hard data, we gain some control over determining facies values at the first few grid cells along random paths. With this better control, we should expect the proposed approach to perform better than facies probability conditioning alone in terms of preserving consistent facies connectivity patterns in the updated facies models.
Second, pilot points provide only limited information from the updated ensemble realizations to condition MPS facies simulations: because of the limited spatial coverage they span, they may not constrain the conditional facies probability at the early stage of an MPS facies simulation. In the cases where no pilot points are present, the facies probability maps may provide the only constraining power from the flow data. As a result, we should expect better performance in reproducing consistent facies models from the combination of the two methods compared with using pilot points only. Since the procedure for combining the two methods is straightforward, I omit the details. However, it is important to note that the constant τ = 1 is used in this combination approach because: (1) an optimal design of the τ map still needs further study, so it is more objective to assign equal weights to both sources of conditional probabilities, i.e., the TI and the facies probability maps; (2) a comparison study is needed to investigate the performance of the combination approach, and by setting τ = 1 the influence of a subjective spatiotemporal τ map can be avoided. In the next section, a number of numerical experiments are presented to show the performance of the combination approach.

5.4.1 Numerical Experiments

We now consider an example with complex connectivity patterns, which involves two-phase flow in a two-dimensional model. The top panel in Figure 5.6 shows the reference model, which consists of a 201 × 61 grid system with five injection wells (gray diamonds) and five extraction wells (gray squares). The reference facies distribution consists of channel facies (black) and background mudstone (white). The pressure at the injection wells and the flow rates of both phases at the extraction wells are measured every month over a two-year period and used as conditioning data.
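The per-cell conditioning rule of the combination approach described in Section 5.4 can be sketched as follows. This is a minimal sketch under stated assumptions: a binary facies model, a non-informative marginal P(A) = 0.5, and a hypothetical `pilot_points` dictionary mapping cell indices to hard facies values.

```python
def combined_facies_prob(cell, pilot_points, p_ab, p_ac, tau=1.0):
    """Sketch of the combined conditioning rule: a pilot point acts as
    hard data and fixes the facies value at its cell; everywhere else
    the TI probability P(A|B) and the facies probability map P(A|C)
    are blended with the tau-model (tau = 1 gives equal weight)."""
    if cell in pilot_points:               # hard data conditioning
        return float(pilot_points[cell])   # probability collapses to 0 or 1
    if p_ab in (0.0, 1.0):                 # deterministic TI probability:
        return p_ab                        # the probability map cannot alter it
    a = 1.0                                # odds for an assumed marginal P(A) = 0.5
    b = (1.0 - p_ab) / p_ab
    c = (1.0 - p_ac) / p_ac
    x = b * (c / a) ** tau
    return 1.0 / (1.0 + x)

# A pilot point at cell 7 forces channel facies there; at other cells the
# TI and the probability map are combined with equal weight.
pilots = {7: 1}
p_at_pilot = combined_facies_prob(7, pilots, p_ab=0.3, p_ac=0.9)   # 1.0
p_elsewhere = combined_facies_prob(3, pilots, p_ab=0.5, p_ac=0.9)  # ≈ 0.9
```

The design intent is visible in the two branches: pilot points give direct control at the first few path cells, while the probability map still constrains the cells they do not reach.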
The initial ensemble of facies realizations is conditioned on the hard data (the facies types) at 10 well locations. The details of the initial ensemble of facies models can be found in Figure 3.8. The first column in Figure 5.6 displays three sample (out of 100) realizations of the updated ensemble of log-permeability distributions from the adapted pilot points method, along with the ensemble mean and variance maps. The second column displays three sample (out of 100) realizations of the updated ensemble of log-permeability distributions from the facies probability conditioning method (with τ = 1), along with the ensemble mean and variance maps. The third column shows three sample (out of 100) realizations, and the mean and variance, of the updated ensemble of log-permeability distributions from the combination of the two methods. It is clear that the combination approach provides better results in terms of preserving consistent connectivity patterns, and the channel connectivity patterns in the updated facies models from the combination approach are closest to the reference model.

Table 5.1 The relative reduction in the total RMSE of the model parameter (log k) estimation and data matching in the three experiments.

Next, the performance of the three approaches on matching the historical flow data is shown. Figure 5.7 displays the predicted and observed flow rates for the wetting and non-wetting phases at one extraction well for the initial and updated models from the three numerical experiments applying the adapted pilot points method, the facies probability conditioning method, and their combination, respectively. The initial predictions are relatively far from the observed values and show a larger spread. It is clear that the ensemble predictions from the combination approach have the smallest spread (uncertainty) and the closest match to the historical flow data.
This can also be confirmed by the relative reductions in the total RMSE for model parameter estimation and data matching, reported in Table 5.1.

Experiment                        r_RMSE(log k)   r_RMSE(d_obs)
Adapted Pilot Points              -18.63%         -21.25%
Facies Probability Conditioning   -39.22%         -40.46%
Combination Approach              -48.42%         -50.74%

Figure 5.6 Top: Well configuration and reference log-permeability model; black refers to high-permeability channel and white shows low-permeability mudstone. There are five injection wells (gray diamonds) and five extraction wells (gray squares). The color bar shows the values of log10 k. Column 1: Three samples (top), ensemble mean and variance (bottom) of the updated models by using the adapted pilot points method. Column 2: Three samples (top), ensemble mean and variance (bottom) of the updated models by using the facies probability conditioning method with τ = 1. Column 3: Three samples (top), ensemble mean and variance (bottom) of the updated models by using the combination of the two methods.

Figure 5.7 Flow data match results at one sample producer: (a) shows the wetting phase flow rates based on the initial ensemble and the updated ensembles from the adapted pilot points method, the facies probability conditioning method, and their combination, while (b) displays the same results for the non-wetting phase flow rates. The black circles refer to the observations and the red lines indicate the median of the ensemble prediction. The blue boxes show the Inter Quartile Range (IQR).

5.5 Summary

One common problem shared by facies probability conditioning approaches is that they can generate facies connectivity patterns that are inconsistent with the facies probability maps.
In this chapter, a series of experiments was performed to provide a detailed reassessment of facies probability conditioning approaches and MPS facies simulations. The results from the experiments reveal the essential behaviors of MPS facies simulations that are responsible for this common problem in facies probability conditioning approaches. These behaviors are: (1) the facies patterns in the output of an MPS facies simulation are primarily controlled by the first few grid cells along the corresponding random path, and (2) the facies simulations at the rest of the grid cells are then dominated by the conditional probabilities from the TI alone rather than by soft data, which in facies probability conditioning approaches consists of facies probability maps inferred from the flow data. As a result, the quality of the facies connectivity patterns in the facies models conditioned on facies probability maps highly depends on the information at the first few grid cells along the random paths. Based on these behaviors of MPS facies simulations, I propose a new approach for conditioning MPS facies simulations on flow data by combining the adapted pilot points method and the facies probability conditioning method. In this approach, instead of relying on the information from the early stage of a random path, over which users have no control, pilot points along with facies probability maps provide constraining power at the first few grid cells. In this way, users gain better control over simulating facies patterns at the early stage of MPS facies simulations.
A number of numerical experiments are then used to show that this new approach provides improved facies simulations at the early stage of MPS facies simulations by conditioning the facies simulations at the first few grid cells on either the pilot points or the facies probability maps, which leads to improved facies models with more consistent facies connectivity patterns compared to applying the adapted pilot points method or the facies probability conditioning approach alone.

REFERENCES

Aanonsen, Sigurd I, Geir Nævdal, Dean S Oliver et al. 2009. The Ensemble Kalman Filter in Reservoir Engineering--a Review. SPE Journal 14 (03): 393-412.
Alcolea, Andrés, Jesús Carrera, Agustín Medina. 2006. Pilot points method incorporating prior information for solving the groundwater flow inverse problem. Advances in Water Resources 29 (11): 1678-1689. http://www.sciencedirect.com/science/article/pii/S0309170805002976.
Alcolea, Andrés, Jesús Carrera, Agustín Medina. 2008. Regularized pilot points method for reproducing the effect of small scale variability: Application to simulations of contaminant transport. Journal of Hydrology 355 (1): 76-90.
Alcolea, Andrés, Philippe Renard. 2010. Blocking Moving Window algorithm: Conditioning multiple-point simulations to hydrogeological data. Water Resources Research 46 (8).
Arroyo, Elkin, Deepak Devegowda, Akhil Datta-Gupta et al. 2008. Streamline-assisted ensemble Kalman filter for rapid and continuous reservoir model updating. SPE Reservoir Evaluation & Engineering 11 (06): 1046-1060.
Arsenault, Richard, Annie Poulin, Pascal Côté et al. 2013. A comparison of stochastic optimization algorithms in hydrological model calibration. Journal of Hydrologic Engineering.
Ballester, Pedro J, Jonathan N Carter. 2007. A parallel real-coded genetic algorithm for history matching and its application to a real petroleum reservoir. Journal of Petroleum Science and Engineering 59 (3): 157-168.
Berre, Inga, Martha Lien, Trond Mannseth.
2009. Multi-level parameter structure identification for two-phase porous-media flow problems using flexible representations. Advances in Water Resources 32 (12): 1777-1788.
Bhark, Eric, Behnam Jafarpour, Akhil Datta-Gupta. 2011. An adaptively scaled frequency-domain parameterization for history matching. Journal of Petroleum Science and Engineering 75 (3): 289-303.
Bissell, RC, O Dubrule, P Lamy et al. Combining Geostatistical Modelling With Gradient Information for History Matching: The Pilot Point Method. Society of Petroleum Engineers.
Bissell, Robert. Calculating optimal parameters for history matching.
Caers, Jef. 2003. History matching under training-image-based geological model constraints. SPE Journal 8 (03): 218-226.
Caers, Jef, Todd Hoffman. 2006. The probability perturbation method: A new look at Bayesian inverse modeling. Mathematical Geology 38 (1): 81-100.
Caers, Jef, Sunderrajan Krishnan, Yuandong Wang et al. 2002. A geostatistical approach to streamline-based history matching. SPE Journal 7 (03): 250-266.
Caers, Jef, Tuanfeng Zhang. 2004. Multiple-point geostatistics: a quantitative vehicle for integrating geologic analogs into multiple reservoir models.
Carle, Steven F, Graham E Fogg. 1997. Modeling spatial variability with one and multidimensional continuous-lag Markov chains. Mathematical Geology 29 (7): 891-918.
Carle, Steven F, Eric M Labolle, Gary S Weissmann et al. 1998. Conditional simulation of hydrofacies architecture: a transition probability/Markov approach. Hydrogeologic models of sedimentary aquifers, concepts in hydrogeology and environmental geology 1: 147-170.
Carrera, Jesús, Andrés Alcolea, Agustín Medina et al. 2005. Inverse problem in hydrogeology (in English). Hydrogeology Journal 13 (1): 206-222. http://dx.doi.org/10.1007/s10040-004-0404-7.
Carrera, Jesús, Shlomo P Neuman. 1986. Estimation of aquifer parameters under transient and steady state conditions: 1. Maximum likelihood method incorporating prior information.
Water Resources Research 22 (2): 199-210. 107 Certes, Catherine, Ghislain de Marsily. 1991. Application of the pilot point method to the identification of aquifer transmissivities. Advances in Water Resources 14 (5): 284-300. Chen, Yan, Dean S Oliver. 2010. Ensemble-based closed-loop optimization applied to Brugge field. SPE Reservoir Evaluation & Engineering 13 (01): 56-71. Chiles, Jean-Paul, Pierre Delfiner. 2009. Geostatistics: modeling spatial uncertainty, Vol. 497, John Wiley & Sons (Reprint). Chiu, Sung Nok, Dietrich Stoyan, Wilfrid S Kendall et al. 2013. Stochastic geometry and its applications, John Wiley & Sons (Reprint). de Marsily, G. 1978. De l’identification des syst mes hydrologiques [On the calibration of hydrologic systems], Doctoral thesis, University Paris VI. De Marsily, G. 1984. Spatial variability of properties in porous media: a stochastic approach. In Fundamentals of Transport Phenomena in Porous Media, 719-769. Springer. De Marsily, G, G Lavedan, M Boucher et al. 1984. Interpretation of interference tests in a well field using geostatistical techniques to fit the permeability distribution in a reservoir model. Geostatistics for natural resources characterization, Part 2: 831-849. De Marsily, Gh, Frédéric Delay, Julio Gonçalvès et al. 2005. Dealing with spatial heterogeneity. Hydrogeology Journal 13 (1): 161-183. Deutsch, Clayton V, Andre G Journel. 1992. Geostatistical software library and user’s guide. New York 119: 147. Deutsch, CV, Libing Wang. Hierarchical object-based geostatistical modeling of fluvial reservoirs. Society of Petroleum Engineers. Doherty, John. 2003. Ground Water Model Calibration Using Pilot Points and Regularization. Ground Water 41 (2): 170-177. http://dx.doi.org/10.1111/j.1745-6584.2003.tb02580.x. Dorn, Oliver, Rossmary Villegas. 2008. History matching of petroleum reservoirs using a level set technique. Inverse Problems 24 (3): 035015. Elsheikh, Ahmed H, Mary F Wheeler, Ibrahim Hoteit. 2013. 
Sparse calibration of subsurface flow models using nonlinear orthogonal matching pursuit and an iterative stochastic ensemble method. Advances in Water Resources 56: 14-26. Emerick, A. A., A. C. Reynolds. 2013. Ensemble smoother with multiple data assimilation. Computers & Geosciences 55: 3-15. Evensen, Geir. 1994. Sequential data assimilation with a nonlinear quasi‐geostrophic model using Monte Carlo methods to forecast error statistics. Journal of Geophysical Research: Oceans (1978–2012) 99 (C5): 10143-10162. Evensen, Geir. 2004. Sampling strategies and square root analysis schemes for the EnKF. Ocean dynamics 54 (6): 539-560. Ewing, Richard E., Michael S. Pilant, J. Gordon Wade et al. 1994. Estimating Parameters in Scientific Computation: A Survey of Experience from Oil and Groundwater Modeling. IEEE Comput. Sci. Eng. 1 (3): 19-31. Fienen, Michael N, Christopher T Muffels, Randall J Hunt. 2009. On constraining pilot point calibration with regularization in PEST. Groundwater 47 (6): 835-844. Finsterle, Stefan, Michael B Kowalsky. 2008. Joint hydrological–geophysical inversion for soil structure identification. Vadose Zone Journal 7 (1): 287-293. Floudas, CA, CE Gounaris. 2009. A review of recent advances in global optimization. Journal of Global Optimization 45 (1): 3-38. Franssen, Harrie-Jan Hendricks, J Jaime Gómez-Hernández, Jose E Capilla et al. 1999. Joint simulation of transmissivity and storativity fields conditional to steady-state and transient hydraulic head data. Advances in water resources 23 (1): 1-13. 108 Gavalas, GR, PC Shah, John H Seinfeld. 1976. Reservoir history matching by Bayesian estimation. Society of Petroleum Engineers Journal 16 (06): 337-350. Gómez-Hernández, J Jaime, R Mohan Srivastava. 1990. ISIM3D: An ANSI-C three-dimensional multiple indicator conditional simulation program. Computers & Geosciences 16 (4): 395-440. Gómez-Hernández, J Jaime, Xian-Huan Wen. 1998. To be or not to be multi-Gaussian? 
A reflection on stochastic hydrogeology. Advances in Water Resources 21 (1): 47-61. Gómez-Hernánez, J Jaime, Andrés Sahuquillo, José E Capilla. 1997. Stochastic simulation of transmissivity fields conditional to both transmissivity and piezometric data--1. Theory. Journal of Hydrology(Amsterdam) 203 (1): 167-174. Goovaerts, Pierre. 1997. Geostatistics for natural resources evaluation, Oxford University Press on Demand (Reprint). Gu, Yaqing, Dean S Oliver. 2005. History matching of the PUNQ-S3 reservoir model using the ensemble Kalman filter. SPE journal 10 (02): 217-224. Guardiano, Felipe B, R Mohan Srivastava. 1993. Multivariate geostatistics: beyond bivariate moments. In Geostatistics Troia’92, 133-144. Springer. Haldorsen, Helge H, Larry W Lake. 1984. A new approach to shale management in field-scale models. Society of Petroleum Engineers Journal 24 (04): 447-457. Haldorsen, HH, LW Lake. 1982. A new approach to shale management in field scale simulation models, Univ. of Texas. Holden, Lars, Ragnar Hauge, Ø ivind Skare et al. 1998. Modeling of fluvial reservoirs with object models. Mathematical Geology 30 (5): 473-496. Hu, LY, T Chugunova. 2008. Multiple‐point geostatistics for modeling subsurface heterogeneity: A comprehensive review. Water Resources Research 44 (11). Isaaks, Edward H. 1991. The application of Monte Carlo methods to the analysis of spatially correlated data, Stanford University Dissertation. Jafarpour, B., V. K. Goyal, D. B. McLaughlin et al. 2009. Transform-domain sparsity regularization for inverse problems in geosciences (in English). Geophysics 74 (5): R69-R83. <Go to ISI>://WOS:000270381400020. Jafarpour, Behnam. 2011. Wavelet reconstruction of geologic facies from nonlinear dynamic flow measurements. IEEE Transactions on Geoscience and Remote Sensing 49 (5): 1520-1535. Jafarpour, Behnam, Morteza Khodabakhshi. 2011. A probability conditioning method (PCM) for nonlinear flow data integration into multipoint statistical facies simulation. 
Mathematical Geosciences 43 (2): 133-164. Jafarpour, Behnam, Dennis B McLaughlin. 2008. History matching with an ensemble Kalman filter and discrete cosine parameterization. Computational Geosciences 12 (2): 227-244. Jafarpour, Behnam, Dennis B McLaughlin. 2009. Reservoir characterization with the discrete cosine transform. SPE Journal 14 (01): 182-201. Jiménez, S, R Brauchler, P Bayer. 2013. A new sequential procedure for hydraulic tomographic inversion. Advances in Water Resources 62: 59-70. Jiménez, S, G Mariethoz, R Brauchler et al. 2016. Smart pilot points using reversible‐jump Markov‐chain Monte Carlo. Water Resources Research 52 (5): 3966-3983. Journel, AG. 2002. Combining knowledge from diverse sources: An alternative to traditional data independence hypotheses. Mathematical geology 34 (5): 573-596. Journel, André G. 1983. Nonparametric estimation of spatial distributions. Mathematical Geology 15 (3): 445-468. Journel, AndreG. 2005. Beyond Covariance: The Advent of Multiple-Point Geostatistics. In Geostatistics Banff 2004, ed. Oy Leuangthong and ClaytonV Deutsch, Chap. 23, 225-233. Quantitative Geology and Geostatistics, Springer Netherlands. 109 Karhunen, Kari. 1947. Ü ber lineare Methoden in der Wahrscheinlichkeitsrechnung, Vol. 37, Universitat Helsinki (Reprint). Kerrou, Jaouher, Philippe Renard, Harrie-Jan Hendricks Franssen et al. 2008. Issues in characterizing heterogeneity and connectivity in non-multiGaussian media. Advances in water resources 31 (1): 147-159. Khaninezhad, Mohammadreza Mohammad, Behnam Jafarpour, Lianlin Li. 2012a. Sparse geologic dictionaries for subsurface flow model calibration: Part I. Inversion formulation. Advances in Water Resources 39: 106-121. Khaninezhad, Mohammadreza Mohammad, Behnam Jafarpour, Lianlin Li. 2012b. Sparse geologic dictionaries for subsurface flow model calibration: Part II. Robustness to uncertainty. Advances in Water Resources 39: 122-136. Khodabakhshi, Morteza, Behnam Jafarpour. 2013. 
A Bayesian mixture‐modeling approach for flow‐ conditioned multiple‐point statistical facies simulation from uncertain training images. Water Resources Research 49 (1): 328-342. Kitanidis, Peter K. 1986. Parameter uncertainty in estimation of spatial functions: Bayesian analysis. Water resources research 22 (4): 499-507. Kowalsky, Michael B, Stefan Finsterle, Yoram Rubin. 2004. Estimating flow parameter distributions using ground-penetrating radar and hydrological measurements during transient flow in the vadose zone. Advances in Water Resources 27 (6): 583-599. LaVenue, A. Marsh, John F. Pickens. 1992. Application of a coupled adjoint sensitivity and kriging approach to calibrate a groundwater flow model. Water Resources Research 28 (6): 1543-1569. http://dx.doi.org/10.1029/92WR00208. LaVenue, A. Marsh, Banda S. RamaRao, Ghislain De Marsily et al. 1995. Pilot Point Methodology for Automated Calibration of an Ensemble of Conditionally Simulated Transmissivity Fields: 2. Application. Water Resources Research 31 (3): 495-516. http://dx.doi.org/10.1029/94WR02259. Lavenue, Marsh, Ghislain Marsily. 2001. Three‐dimensional interference test interpretation in a fractured aquifer using the Pilot Point Inverse Method. Water Resources Research 37 (11): 2659- 2675. Law, Jan. 1944. A statistical approach to the interstitial heterogeneity of sand reservoirs. Transactions of the AIME 155 (01): 202-222. Le, Duc H, Rami Younis, Albert C Reynolds. A history matching procedure for non-Gaussian facies based on ES-MDA. Society of Petroleum Engineers. Le Loc’h, G, A Galli. 1997. Truncated plurigaussian method: theoretical and practical points of view. Geostatistics wollongong 96 (1): 211-222. Lee, TY, Costas Kravaris, John Seinfeld. 1986. History matching by spline approximation and regularization in single-phase areal reservoirs. Lepine, OJ, RC Bissell, SI Aanonsen et al. 1999. Uncertainty analysis in predictive reservoir simulation using gradient information. 
SPE Journal 4 (03): 251-259. Li, Liangping, Sanjay Srinivasan, Haiyan Zhou et al. 2014. Simultaneous Estimation of Geologic and Reservoir State Variables Within an Ensemble-Based Multiple-Point Statistic Framework. Mathematical Geosciences 46 (5): 597-623. Li, Liangping, Sanjay Srinivasan, Haiyan Zhou et al. 2013. A pilot point guided pattern matching approach to integrate dynamic data into geological modeling. Advances in Water Resources 62, Part A (0): 125-138. http://www.sciencedirect.com/science/article/pii/S0309170813001954. Li, Wei, Olaf A Cirpka. 2006. Efficient geostatistical inverse methods for structured and unstructured grids. Water resources research 42 (6). Liu, Dong C, Jorge Nocedal. 1989. On the limited memory BFGS method for large scale optimization. Mathematical programming 45 (1-3): 503-528. 110 Liu, Ning, Dean S Oliver. 2004. Automatic history matching of geologic facies. SPE Journal 9 (04): 429- 436. Loeve, Michel. 1978. Probability theory, vol. II. Graduate texts in mathematics 46: 0-387. Lu, Pengbo, Roland N Horne. A multiresolution approach to reservoir parameter estimation using wavelet analysis. Ma, W., B. Jafarpour. 2018a. An Improved Probability Conditioning Method for Constraining Multiple- Point Statistical Facies Simulation on Nonlinear Flow Data. Proc., SPE Western Regional Meeting, Garden Grove, California, USA. Ma, Wei, Behnam Jafarpour. 2017. Conditioning Multiple-Point Geostatistical Facies Simulation on Nonlinear Flow Data Using Pilot Points Method. Proc., SPE Western Regional Meeting, Bakersfield, California. Ma, Wei, Behnam Jafarpour. 2018b. Pilot points method for conditioning multiple-point statistical facies simulation on flow data. Advances in Water Resources 115: 219-233. Ma, Xianlin, Mishal Al-Harbi, Akhil Datta-Gupta et al. 2008. An efficient two-stage sampling method for uncertainty quantification in history matching geological models. SPE Journal 13 (01): 77-87. Madsen, Henrik. 2003. 
Parameter estimation in distributed hydrological catchment modelling using automatic calibration with multiple objectives. Advances in Water Resources 26 (2): 205-216. http://www.sciencedirect.com/science/article/pii/S0309170802000921. Mariethoz, Grégoire, Philippe Renard, Jef Caers. 2010. Bayesian inverse problem and optimization with iterative spatial resampling. Water Resources Research 46 (11). Marshall, Lucy, David Nott, Ashish Sharma. 2004. A comparative study of Markov chain Monte Carlo methods for conceptual rainfall‐runoff modeling. Water Resources Research 40 (2). McLaughlin, Dennis, Lloyd R Townley. 1996. A reassessment of the groundwater inverse problem. Water Resources Research 32 (5): 1131-1161. Oliver, D. S., Y. Chen. 2011. Recent progress on reservoir history matching: a review (in English). Computational Geosciences 15 (1): 185-221. <Go to ISI>://WOS:000288223900013. Oliver, Dean S, Luciane B Cunha, Albert C Reynolds. 1997. Markov chain Monte Carlo methods for conditioning a permeability field to pressure data. Mathematical Geology 29 (1): 61-91. Oliver, Dean S, Albert C Reynolds, Zhuoxin Bi et al. 2001. Integration of production data into reservoir models. Petroleum Geoscience 7 (S): S65-S73. RamaRao, Banda S., A. Marsh LaVenue, Ghislain De Marsily et al. 1995. Pilot Point Methodology for Automated Calibration of an Ensemble of conditionally Simulated Transmissivity Fields: 1. Theory and Computational Experiments. Water Resources Research 31 (3): 475-493. http://dx.doi.org/10.1029/94WR02258. Regis, R. G., C. A. Shoemaker. 2007. A stochastic radial basis function method for the global optimization of expensive functions (in English). Informs Journal on Computing 19 (4): 497-509. <Go to ISI>://WOS:000251037100003. Remy, Nicolas, Alexandre Boucher, Jianbing Wu. 2009. Applied geostatistics with SGeMS: A user's guide, Cambridge University Press (Reprint). Reynolds, Albert C, Nanqun He, Lifu Chu et al. 1996. 
Reparameterization techniques for generating reservoir descriptions conditioned to variograms and well-test pressure data. SPE Journal 1 (04): 413-426. Riva, Monica, Alberto Guadagnini, Francesca de Gaspari et al. 2010. Exact sensitivity matrix and influence of the number of pilot points in the geostatistical inversion of moment equations of groundwater flow. Water Resources Research 46. Roggero, F, LY Hu. Gradual deformation of continuous geostatistical models for history matching. Society of Petroleum Engineers. 111 Rollins, JB, SA Holditch, WJ Lee. 1992. Characterizing, Average Permeability in Oil and Gas Formations (includes associated papers 25286 and 25293). SPE formation evaluation 7 (01): 99-105. Rubin, Yoram, Xingyuan Chen, Haruko Murakami et al. 2010. A Bayesian approach for inverse modeling, data assimilation, and conditional simulation of spatial random fields. Water Resources Research 46 (10). Sakov, Pavel, Dean S Oliver, Laurent Bertino. 2012. An Iterative EnKF for Strongly Nonlinear Systems. Monthly Weather Review 140 (6). Sarma, Pallav, Louis J. Durlofsky, Khalid Aziz. 2008. Kernel Principal Component Analysis for Efficient, Differentiable Parameterization of Multipoint Geostatistics. Mathematical Geosciences 40 (1): 3- 32. http://dx.doi.org/10.1007/s11004-007-9131-7. Schulze-Riegert, RW, O Haase, A Nekrassov. Combined global and local optimization techniques applied to history matching. Society of Petroleum Engineers. Skjervheim, J, Geir Evensen, Sigurd Ivar Aanonsen et al. 2007. Incorporating 4D seismic data in reservoir simulation models using ensemble Kalman filter. SPE JOURNAL-RICHARDSON- 12 (3): 282. Skjervheim, Jan-Arild, Geir Evensen. An ensemble smoother for assisted history matching. Society of Petroleum Engineers. Strebelle, Sebastien. 2002. Conditional simulation of complex geological structures using multiple-point statistics. Mathematical Geology 34 (1): 1-21. Sun, Alexander Y, Alan P Morris, Sitakanta Mohanty. 2009. 
Sequential updating of multimodal hydrogeologic parameter fields using localization and clustering techniques. Water resources research 45 (7). Sun, Ne-Zheng, Shu-li Yang, William W. G. Yeh. 1998. A proposed stepwise regression method for model structure identification. Water Resources Research 34 (10): 2561-2572. http://dx.doi.org/10.1029/98WR01860. Tolson, Bryan A, Christine A Shoemaker. 2007. Dynamically dimensioned search algorithm for computationally efficient watershed model calibration. Water Resources Research 43 (1). Tonkin, MJ, J Doherty. 2005. A hybrid regularized inversion methodology for highly parameterized environmental models. Water Resour. Res 41: W10412. Tsai, Frank T‐C, Ne‐Zheng Sun, William W‐G Yen. 2003. A combinatorial optimization scheme for parameter structure identification in ground water modeling. Ground Water 41 (2): 156-169. Van Leeuwen, Peter Jan, Geir Evensen. 1996. Data assimilation and inverse methods in terms of a probabilistic formulation. Monthly Weather Review 124 (12): 2898-2913. Vrugt, Jasper A., Hoshin V. Gupta, Luis A. Bastidas et al. 2003. Effective and efficient algorithm for multiobjective optimization of hydrologic models. Water Resources Research 39 (8): 1214. http://dx.doi.org/10.1029/2002WR001746. Watson, AT, JG Wade, RE Ewing. Parameter and system identification for fluid flow in underground reservoirs. 81-108: Springer. Wen, X-H, CV Deutsch, AS Cullick. 2002. Construction of geostatistical aquifer models integrating dynamic flow and tracer data using inverse technique. Journal of Hydrology 255 (1): 151-168. Western, Andrew W, Günter Blöschl, Rodger B Grayson. 2001. Toward capturing hydrologically significant connectivity in spatial patterns. Water Resources Research 37 (1): 83-97. Wöhling, Thomas, Jasper A Vrugt. 2011. Multiresponse multilayer vadose zone model calibration using Markov chain Monte Carlo simulation and field water retention data. Water Resources Research 47 (4). 
Yang, Yarong, Matthew Over, Yoram Rubin. 2012. Strategic placement of localization devices (such as pilot points and anchors) in inverse modeling schemes. Water Resources Research 48 (8). Yeh, William W‐G. 1986. Review of parameter identification procedures in groundwater hydrology: The inverse problem. Water Resources Research 22 (2): 95-108. 112 Zhang, Xuesong, Raghavan Srinivasan, Kaiguang Zhao et al. 2008. Evaluation of global optimization algorithms for parameter calibration of a computationally intensive hydrologic model. Zhou, Haiyan, J Jaime Gómez-Hernández, Harrie-Jan Hendricks Franssen et al. 2011. An approach to handling non-Gaussianity of parameters and state variables in ensemble Kalman filtering. Advances in Water Resources 34 (7): 844-864. Zhou, Haiyan, J. Jaime Gómez-Hernández, Liangping Li. 2012. A pattern-search-based inverse method. Water Resources Research 48 (3): W03505. http://dx.doi.org/10.1029/2011WR011195. Zinn, Brendan, Charles F Harvey. 2003. When good statistical models of aquifer heterogeneity go bad: A comparison of flow, dispersion, and mass transfer in connected and multivariate Gaussian hydraulic conductivity fields. Water Resources Research 39 (3). 
Asset Metadata
Creator Ma, Wei (author) 
Core Title Subsurface model calibration for complex facies models 
Contributor Electronically uploaded by the author (provenance) 
School Andrew and Erna Viterbi School of Engineering 
Degree Doctor of Philosophy 
Degree Program Petroleum Engineering 
Publication Date 08/09/2018 
Defense Date 11/22/2017 
Publisher University of Southern California (original), University of Southern California. Libraries (digital) 
Tag facies models,facies probability maps,geostatistics,history matching,model calibration,multiple-point statistics,OAI-PMH Harvest,pilot points 
Format application/pdf (imt) 
Language English
Advisor Jafarpour, Behnam (committee chair), Ershaghi, Iraj (committee member), Shing, Katherine (committee member) 
Creator Email mawei.polaris@gmail.com,mshwyl@126.com 
Permanent Link (DOI) https://doi.org/10.25549/usctheses-c89-59849 
Unique identifier UC11670733 
Identifier etd-MaWei-6701.pdf (filename),usctheses-c89-59849 (legacy record id) 
Legacy Identifier etd-MaWei-6701.pdf 
Dmrecord 59849 
Document Type Dissertation 
Rights Ma, Wei 
Type texts
Source University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection) 
Access Conditions The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law.  Electronic access is being provided by the USC Libraries in agreement with the a... 
Repository Name University of Southern California Digital Library
Repository Location USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Abstract My PhD work mainly focuses on subsurface model calibration for complex facies models. In this type of model, a Gaussian model (e.g., a variogram model) fails to reproduce the discrete and connected features of hydraulic properties. To better represent discrete facies and their distributions, multiple-point statistics (MPS) has been developed as a flexible simulation approach for generating complex facies patterns that are not amenable to variogram-based modeling techniques. Instead of using a variogram model, MPS relies on a conceptual model of geologic connectivity patterns, known as a training image (TI), to represent the statistical information about the expected patterns in the model. As a grid-based simulation, MPS provides simple conditioning of the results to hard data (e.g., facies measurements at well locations) and soft data (e.g., 3-D seismic data). However, conditioning MPS simulations on flow data is not trivial. In my PhD study, I developed several methods for constraining MPS complex facies simulations on nonlinear flow data by taking advantage of this convenient hard-data and soft-data conditioning.

The first method is the adapted pilot points method, in which a subset of all grid cells is selected as pilot points and the facies values at these points are inferred from nonlinear flow data. MPS facies simulations are then conditioned on the facies values at the pilot points by treating them as part of the hard data. The second method is the facies probability conditioning method, in which flow data are integrated to estimate facies probability maps; these maps are then treated as soft data in MPS simulations to constrain the facies models. In both methods, MPS facies simulations are conditioned on information inferred from nonlinear flow data, i.e., pilot points or facies probability maps, rather than directly on the flow data. The third method combines the first two.
This combined approach is motivated by an observed behavior of MPS facies simulation: the facies connectivity patterns in the final output are dominated by the cells simulated at the early stage of the simulation. Gaining more control over that early stage therefore allows the combined approach to provide improved facies connectivity patterns while honoring historical flow data. Numerical experiments have shown that the developed methods improve both the preservation of connectivity features in complex facies models and the resulting model predictions.
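The hard-data and soft-data conditioning described above can be illustrated with a deliberately minimal toy simulation: pilot-point facies values are fixed exactly (hard data), while the remaining cells are drawn from a local facies probability map (soft data). This is a schematic sketch only, not the SNESIM algorithm or any method from the dissertation; the function and variable names are hypothetical, and a real MPS engine would also scan a training image for multiple-point patterns.

```python
import numpy as np

def simulate_facies(nx, ny, prob_map, pilot_points, rng):
    """Toy sequential facies simulation on an nx-by-ny grid.

    Cells listed in `pilot_points` (a dict {(i, j): facies_code}) are
    fixed as hard data; every other cell draws a binary facies code from
    the soft probability map. This mimics how MPS honors hard and soft
    conditioning data, but omits the training-image pattern search.
    """
    grid = np.full((nx, ny), -1, dtype=int)
    # Hard data: pilot-point facies values are honored exactly.
    for (i, j), facies in pilot_points.items():
        grid[i, j] = facies
    # Random visiting path over the remaining (unsimulated) cells.
    path = [(i, j) for i in range(nx) for j in range(ny) if grid[i, j] < 0]
    rng.shuffle(path)
    for i, j in path:
        # Soft data: draw facies 1 with the local map probability.
        grid[i, j] = int(rng.random() < prob_map[i, j])
    return grid

rng = np.random.default_rng(0)
prob = np.full((5, 5), 0.3)        # prior probability of channel facies
pilots = {(2, 2): 1, (0, 0): 0}    # facies values inferred at two pilot points
model = simulate_facies(5, 5, prob, pilots, rng)
assert model[2, 2] == 1 and model[0, 0] == 0   # hard data honored
```

The key design point carried over from the dissertation's framing is that flow data never enter the simulation directly: they are first translated into pilot-point facies values or probability maps, which MPS already knows how to condition on.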