Sparse Feature Learning for Dynamic Subsurface Imaging

by

Mohammad-Reza Khaninezhad

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Electrical Engineering)

Committee:
Dr. Behnam Jafarpour (Chair), Electrical Engineering
Prof. Mahta Moghaddam, Electrical Engineering
Prof. Iraj Ershaghi, Petroleum Engineering

Date of submission: March 2018
Date of degree conferral: August 2018

Copyright 2018 Mohammad-Reza Khaninezhad

Abstract

This work focuses mainly on solutions of nonlinear inverse (more generally, model calibration) problems with limited (sparse) data. The application focus of this work is the calibration of subsurface energy resource models, but, without loss of generality, the methodologies developed can be used for any linear or nonlinear inverse modeling and image reconstruction problem. In the first part of this work, we investigate the feasibility of characterizing subsurface models using sparse signal representation techniques. We provide a framework for learning the main patterns in a prior set of model realizations, utilizing some of the well-known dictionary learning algorithms recently introduced in the machine learning and computer vision communities. The proposed algorithm provides a novel framework for learning geologic dictionaries, resulting in geologically consistent patterns when used for inverse modeling problems related to subsurface flow and transport processes. In this regard, this dissertation explores the model selection properties of the proposed sparse recovery algorithms as a tool for reducing prior uncertainty in model realizations of the geology. For the case of large uncertainty in the prior models, this work provides several novel frameworks for identifying and correcting those models in an automated scheme.

Acknowledgments

Firstly, I would like to express my sincere gratitude to my advisor, Dr. Behnam Jafarpour, for the continuous support of my Ph.D. study and related research, and for his patience, motivation, and immense knowledge. His guidance was crucial throughout the research and writing of this thesis. I would also like to thank the rest of my thesis committee, Prof. Iraj Ershaghi and Prof. Mahta Moghaddam, as well as Prof. Jerry Mendel, Prof. Keith Jenkins, and Dr. Justin Haldar, for their insightful comments and encouragement, and for the hard questions that incented me to widen my research from various perspectives. I thank my fellow lab-mates for the stimulating discussions, for the sleepless nights working together before deadlines, and for all the fun we have had over the last few years. Last but not least, I would like to thank my wife and daughter for supporting me with their love throughout this journey and in my life.

Dedication

I dedicate my dissertation work to my best friend and beautiful wife, Zahra, for being there for me throughout the entire doctorate program, and to my daughter, Nora. Both of you have been my best cheerleaders. Also, a special feeling of gratitude to my loving parents, whose words of encouragement and push for tenacity ring in my ears. They have never left my side.

Table of Contents

Abstract
Acknowledgments
Dedication
List of Tables
List of Figures

Chapter I. INTRODUCTION
  Linear inverse problems
  Nonlinear inverse problems

Chapter II. SPARSE REGULARIZATION FOR DYNAMIC SUBSURFACE IMAGING
  Inverse modeling with sparse regularization: Formulation
    Linear and Non-Linear Approximation
    Geologically Learned Sparsifying Transforms
    Singular Value Decomposition
    K-SVD Algorithm
      Step I: Sparse Coding
      Step II: Dictionary Update
  Estimation Method
    Regularized Least Squares
    Sparsity Promoting Regularization
  Application to Flow Data Integration
    Alternative Reservoir Configurations
    Description of Experiments
  Results and Discussion
    Example 1
    Example 2
    Example 3
  Compressed K-SVD
    Effect of Measurement Noise
  Conclusions
  Appendix A: Structural Similarity Error
Chapter III. PRIOR MODEL ERROR IDENTIFICATION
  Iteratively Reweighted Least Squares (IRLS)
  Sparsity Promoting Prior Model Identification
    Proposed Method for Model Error Identification
  Numerical Experiments and Discussion
    Example 1: Synthetic 2D Model
    Example 2: SPE10 (Benchmark Model): Top Layer
  Conclusion

Chapter IV. SPARSE-RANDOMIZED MAXIMUM LIKELIHOOD FOR UNCERTAINTY ASSESSMENT
  Bayesian Formulation
    RML with Multi-Gaussian Priors
    Sparse RML with Laplace Priors
  Numerical Experiments
    Example 1: Two-Dimensional Multi-Gaussian Model
    Example 2: Two-Dimensional Non-Gaussian Model (Meandering Channel)
    Example 3: Three-Dimensional PUNQ-S3 Case
  Conclusions

Chapter V. INVERSE MODELING IN COMPLEX NON-GAUSSIAN GEOLOGIC ENVIRONMENTS
  Methodology
    i) Connectivity-Preserving Constraint
    ii) Discrete Regularization
  Numerical Experiments
    Example 1: Tomographic Inversion
    Example 2: Integration of Dynamic and Static Data
    Example 3: Large-Scale Pumping Test with Complex Geology
  Conclusions
Chapter VI. PATTERN BASED NON-LINEAR DISCRETE IMAGING FOR COMPLEX GEOLOGIC ENVIRONMENTS
  Methodology
    Data Mismatch Term
    Connectivity Preserving Constraints
    Training Image Constraints
    Obtaining the Feasible Solution from a Training Image
  Numerical Results
    Example 1 (Motivating Example): Tomographic Inversion
    Example 2: 2D Pumping Test with 3 Discrete Facies
    Example 3: Large Scale Complex Heterogeneity
  Conclusion

A NOTE ON PRACTICAL APPLICATIONS
BIBLIOGRAPHY

List of Tables

Table 1. Important flow simulation and data integration parameters.
Table 2. Implementation steps of the proposed MEI algorithm.
Table 3. Geostatistical simulation parameters for the reference model in Example 1.
Table 4. Cluster contributions.
Table 5. Geostatistical simulation parameters for the two-dimensional reservoir in Example 1.
Table 6. Geostatistical simulation parameters for the PUNQ-S3 reservoir (Example 2).

List of Figures

Figure 2.1: K-SVD dictionary and SVD basis.
Figure 2.2: Compression quality comparison between K-SVD and SVD atoms.
Figure 2.3: Behavior of two different sparsity-promoting penalty functions.
Figure 2.4: Experimental field setup and well configurations.
Figure 2.5: Log-permeability reconstruction results for Experiment A.
Figure 2.6: Reconstruction errors and sample observation matches for Experiment A.
Figure 2.7: Log-permeability reconstruction results for Experiment 2.
Figure 2.8: Reconstruction errors for Experiment 2.
Figure 2.10: Reconstruction errors for Experiment 3.
Figure 2.11: Log-permeability reconstruction results for Experiments 2 and 3 with different K values.
Figure 2.12: Log-permeability reconstruction results for Experiment C after adding noise to observations.
Figure 2.13: Reconstruction errors for Experiment C with 20% observation noise.
Figure 3.1: Sample log-permeability realizations from five different clusters.
Figure 3.2: Log-permeability reconstruction result for the sparsity-promoting PMI algorithm.
Figure 3.3: Cluster contributions.
Figure 3.4: Cluster contribution weights.
Figure 3.5: Evolution of the cluster contribution histograms.
Figure 3.6: Production forecasts for the PMI method.
Figure 3.7: Reconstruction results for Example 2.
Figure 3.8: Realizations and cluster contributions for Example 2.
Figure 3.9: Calibrated data for Example 3.
Figure 4.1: The Q-Q plot of the sparse K-SVD coefficients against the standard Gaussian distribution.
Figure 4.2: Model setup and inverse modeling results for Example 1.
Figure 4.3: Dictionary elements and the model's sparse representation.
Figure 4.4: Data match and production forecasts for all 50 realizations.
Figure 4.5: Realizations and dictionary atoms for Example 2 (meandering structure).
Figure 4.6: Field setup and calibrated models for Example 2.
Figure 4.7: Data match and production forecasts for all 50 realizations.
Figure 4.8: Reference log-permeability map for the PUNQ-S3 model.
Figure 4.9: Conditional realizations for Example 3.
Figure 4.10: Sample K-SVD atoms and model representation for Example 3.
Figure 4.11: Data match and inference results for Example 3.
Figure 5.1: Visual representation of linear expansion.
Figure 5.2: Sample discrete regularization functions.
Figure 5.3: The behavior of the discrete penalty function.
Figure 5.4: Travel-time tomography example.
Figure 5.5: Travel-time tomography reconstruction results with different discrete regularization function forms.
Figure 5.6: Well configuration, realizations, and transform functions for Example 2.
Figure 5.7: Calibration results for Example 2.
Figure 5.8: Data match and forecasts for Example 2.
Figure 5.9: Well configuration for Example 3.
Figure 5.10: Low-rank representation error measures.
Figure 5.11: Calibrated hydraulic-conductivity models for Example 3.
Figure 6.1: Schematic of two-step inversion.
Figure 6.2: Schematic of a template-matching algorithm.
Figure 6.3: Transform-domain vs. spatial template matching schemes.
Figure 6.4: Schematic of the aggregation step.
Figure 6.5: Example 1 (motivating example).
Figure 6.6: Tri-facies model example.
Figure 6.7: Example 2 model realizations and basis functions.
Figure 6.8: Example 2 reconstruction results.
Figure 6.9: Example 2 model updates.
Figure 6.10: Example 3 well configuration.
Figure 6.11: Example 3 realizations and basis functions.
Figure 6.12: Example 3 reconstructed models comparison.
Figure 6.13: Example 3 model updates.
Figure 7.1: Process for practical use of the proposed inverse modeling algorithms.

Chapter I. INTRODUCTION

The area of inverse problems is an important topic of research in engineering and science. The input-output (cause-effect) view of physical systems makes it possible not only to better understand the fundamental behavior of a system and its elements, but also to tackle practical issues related to understanding these systems. In this thesis, I address inverse problems and their application to subsurface parameter estimation via (mainly) sparse reconstruction techniques. This study discusses the general formulation of inverse problems in subsurface environments and addresses traditional and recently developed methods of parameter estimation in this area of research.

Theoretical analyses of physical systems usually yield a set of governing equations that can be viewed as a relationship between the static or dynamic state parameters in space and time and a set of properties that arise intrinsically from the physical system or from interactions among its elements. Examples of such physical systems and their governing equations are found widely across science and engineering. Fluid flow and transport processes, electromagnetic systems, the movement of planets in the solar system, and the human neural mechanism are examples of such physical systems, each with its own governing principles. Understanding the behavior of these systems requires precise knowledge of the laws governing them, the existing physical properties and parameters, and the relationships between them. In many applications, it is rarely possible to have access to all the parameters involved in the laws governing the corresponding physical system.
The process of inferring the parameters of a system of equations from limited sources of measurements is usually referred to as inverse modeling or an inverse problem [1].

Before moving on to the mathematical specifics of inverse problems, let us examine some key related terms. The situation in which the parameters or sources are directly accessible by measurement is termed "direct measurement." In many cases, however, the information sought is not directly measurable; instead, it is physically connected to other measurable values. We call this situation indirect measurement [2]. Although these measurable values evidently depend on the physics of the phenomenon under study, they also depend on the measurement instruments employed. The laws of physics that link observable information to the values sought are generally mathematically complex, involving, for example, integral equations or partial derivatives. The solution to a problem that calculates observable effects from unknown values, or a "direct problem," is often simpler and more easily mastered than that of an inverse problem, which calculates unknown values from observable effects. In other words, an inverse problem often arises from the confrontation of experimental data with the results of numerical simulations; the problem consists of seeking to minimize the distance between the measurement and the calculation.

One common characteristic of inverse problems is being ill-posed, or incorrectly posed, in the sense of Hadamard: the measured data do not guarantee the existence of a solution to the problem, the solution is not unique, or the solution is not stable under disturbances in the data [3], [4]. When the number of parameters in a model is smaller than the number of data points from the measurements, the problem is called "overdetermined." In this case, it may be possible to add a criterion that diminishes or eliminates the effects of aberrant data by "regularizing" the solution structure. On the other hand, if the problem consists of determining continuous parameters, which are thus sampled from a very large number of values, and if the number of experimental results is insufficient, the problem is called "underdetermined." It is then necessary to use a priori information to reduce the number of possible solutions or, in the best case, to arrive at only one. Since there are often several possible solutions for an underdetermined problem, it is necessary to specify the confidence level that can be given to each solution. In such cases, a Bayesian approach can be used for the inverse modeling problem [1].

In this thesis, the focus is on flow and transport processes in subsurface environments, and a general framework for the estimation of subsurface parameters is discussed in the following sections. Below are a few mathematical definitions of general linear/nonlinear inverse problems and the corresponding (equivalent) least squares formulation of the same problem.

Definition (general inverse problem) [5], [6]: Consider a subset $\mathbf{U}$ of a Banach space, another Banach space $\mathbf{Y}$, and a mapping $\mathbf{g} : \mathbf{U} \to \mathbf{Y}$. The inverse problem consists of solving the equation

$$\mathbf{g}(\mathbf{u}) = \mathbf{d}, \qquad \mathbf{u} \in \mathbf{U}, \ \mathbf{d} \in \mathbf{Y}. \tag{1.1}$$

(Least squares formulation) [1]: The general inverse problem in (1.1) is equivalent to minimizing the functional

$$\mathbf{J}(\mathbf{u}) = \|\mathbf{g}(\mathbf{u}) - \mathbf{d}\|_2^2, \qquad \mathbf{u} \in \mathbf{U}. \tag{1.2}$$
When the Banach space $\mathbf{Y}$ is an $L_2$ space, this is a classical least squares problem. (A Banach space is a normed vector space that is a complete metric space with respect to the metric derived from its norm.) The setting of equation (1.2) is very general and applies to almost all inverse problems. Even in this general setting, we can formulate properties and solution procedures such as iterative methods. With more specific properties and settings, it is possible to formulate better and more specialized solutions. In the following chapters, we study inverse problems on several levels, usually categorized into linear and nonlinear configurations.

Linear inverse problems:

The simplest and most interpretable form of inverse problem is the case in which observations and parameters are related by a linear system of equations [1], i.e., $\mathbf{d} = \mathbf{G}\mathbf{u} + \boldsymbol{\epsilon}$. Here, $\mathbf{u}$ is the parameter of interest, $\mathbf{G}$ is the sensing (or linear model) matrix that relates the parameter space to the observation space, and $\boldsymbol{\epsilon}$ is the observation noise, usually considered to be independent of the parameter $\mathbf{u}$. A well-known example of a linear inverse problem is tomographic inversion [7], which has different forms and applications in diverse areas of engineering; geophysical tomography [8] and medical X-ray tomography [9] are two examples. The fundamental question in linear inverse problems is whether $\mathbf{u}$ can be retrieved, even approximately, given the observation vector $\mathbf{d}$, the matrix $\mathbf{G}$, and the characteristics of the observation noise.

In one view, a probabilistic approach seeks the posterior distribution of the parameters by using the observations $\mathbf{d}$ and the probability distribution of the noise vector $\boldsymbol{\epsilon}$. Here, the posterior distribution of the parameter $\mathbf{u}$ is

$$f_{\mathbf{u}|\mathbf{d}}(\mathbf{u}|\mathbf{d}) \propto f_{\mathbf{d}|\mathbf{u}}(\mathbf{d}|\mathbf{u}) f_{\mathbf{u}}(\mathbf{u}) = f_{\boldsymbol{\epsilon}}(\mathbf{d} - \mathbf{G}\mathbf{u}) f_{\mathbf{u}}(\mathbf{u}), \tag{1.3}$$

where $f_{\mathbf{u}}(\mathbf{u})$ and $f_{\boldsymbol{\epsilon}}(\boldsymbol{\epsilon})$ are the prior distribution of the parameter $\mathbf{u}$ and the probability distribution of the observation noise, respectively. In this case, a probability distribution is obtained for the parameter $\mathbf{u}$, and either multiple samples are drawn from this distribution [10] or a probabilistic measure is used to extract a single estimate of $\mathbf{u}$ from the posterior [11]. For example, minimizing the mean squared error between $\mathbf{u}$ and its estimate yields the mean of the above posterior distribution. In this case, the optimization problem for estimating the parameter $\mathbf{u}$ has the form

$$\hat{\mathbf{u}} = \arg\min_{\hat{\mathbf{u}}} \ E_{\mathbf{u}|\mathbf{d}}\{(\mathbf{u} - \hat{\mathbf{u}})^2\}, \tag{1.4}$$

whose solution is $\hat{\mathbf{u}} = E_{\mathbf{u}|\mathbf{d}}\{\mathbf{u}\}$.

Another approach to the linear inverse problem minimizes a cost function, usually called a regularization term, and treats the relationship between observations and parameters as the constraint of the corresponding optimization problem. In this case, generally referred to as a deterministic approach, the general form of the optimization problem is

$$\min_{\mathbf{u}} \ J(\mathbf{u}) \quad \text{s.t.} \quad \|\mathbf{d} - \mathbf{G}\mathbf{u}\|_2^2 \le \sigma^2. \tag{1.5}$$

Here, $J(\mathbf{u})$ is a function that restricts (regularizes) the parameter $\mathbf{u}$ to have a specific structure, hence the name regularization, and $\sigma^2$ is a tolerance bound on the prediction error. For example, if $\mathbf{u}_0$ is a prior belief about the parameter $\mathbf{u}$, minimizing $J(\mathbf{u}) = \|\mathbf{u} - \mathbf{u}_0\|_2^2$ results in a solution close to $\mathbf{u}_0$ [1]. A special case of this example occurs when $\mathbf{u}_0 = \mathbf{0}$, in which case the result of the optimization problem is known as the minimum norm solution.
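For readers who prefer a computational view, the following minimal sketch solves a small synthetic instance of the penalized form of (1.5) with $J(\mathbf{u}) = \|\mathbf{u} - \mathbf{u}_0\|_2^2$; the forward matrix, the prior, and the regularization weight below are illustrative assumptions, not quantities taken from the experiments in this thesis.

```python
import numpy as np

# A minimal sketch of the deterministic formulation for a linear inverse
# problem, using the penalized form  min_u ||d - G u||^2 + lam * ||u - u0||^2.
# G, u_true, u0, and lam are illustrative assumptions for the demo only.

rng = np.random.default_rng(0)

n_obs, n_par = 20, 50                     # underdetermined: fewer data than parameters
G = rng.standard_normal((n_obs, n_par))   # sensing (linear model) matrix
u_true = rng.standard_normal(n_par)
d = G @ u_true + 0.01 * rng.standard_normal(n_obs)   # noisy observations

u0 = np.zeros(n_par)   # prior belief; u0 = 0 yields the minimum-norm-type solution
lam = 0.1              # regularization weight (set by e.g. cross-validation or the L-curve)

# Closed-form solution of the normal equations:
#   (G^T G + lam I) u = G^T d + lam u0
u_hat = np.linalg.solve(G.T @ G + lam * np.eye(n_par), G.T @ d + lam * u0)

print("data misfit:", np.linalg.norm(d - G @ u_hat))
print("distance to prior:", np.linalg.norm(u_hat - u0))
```

Note that the closed form exists here only because both the forward model and the penalty are quadratic; the nonlinear case discussed next requires iteration.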
Although the deterministic and probabilistic approaches are two different views of providing a solution for an inverse problem, many works discuss the relationship between the two [12]. For example, several works arrive at a deterministic formulation by defining a prior distribution for the parameter $\mathbf{u}$ and maximizing the posterior distribution to obtain the maximum a posteriori solution. The focus of this thesis is mainly on deterministic approaches to ill-posed inverse problems.

Nonlinear inverse problems

In nonlinear inverse problems, the relationship between observed data and parameters is nonlinear, i.e., $\mathbf{d} = \mathbf{g}(\mathbf{u}) + \boldsymbol{\epsilon}$ [13], [14]. The main point of interest in nonlinear inverse problems is very similar to the linear case: the observations, the system of equations $\mathbf{g}(\cdot)$, and the noise characteristics are analyzed to estimate the parameter $\mathbf{u}$. In nonlinear inverse problems, the deterministic approach to estimating $\mathbf{u}$ has the general formulation

$$\min_{\mathbf{u}} \ J(\mathbf{u}) \quad \text{s.t.} \quad \|\mathbf{d} - \mathbf{g}(\mathbf{u})\|_2^2 \le \sigma^2. \tag{1.7}$$

Although the only difference between the general deterministic formulations for linear and nonlinear inverse problems is the nonlinear system dynamics in the constraint $\|\mathbf{d} - \mathbf{g}(\mathbf{u})\|_2^2 \le \sigma^2$, this single difference changes the theoretical discussion of the behavior of the two problems. For example, if $J(\mathbf{u})$ is assumed to be well-behaved and convex, convex optimization methods [15] can be used to find the unique solution of the optimization problem in the linear case; the nonlinear term in nonlinear inverse problems, however, may not result in a convex optimization problem. Hence, in the nonlinear case, a solution approach may not be trivial or well understood, and it can be a bottleneck in many cases.

To obtain the solution of the minimization problem stated in (1.7), using the penalty method [15], the constraint is added to the objective function as a regularization term that penalizes solutions inconsistent with the observed data. In this case, the optimization problem becomes

$$\min_{\mathbf{u}} \ J(\mathbf{u}) + \lambda \|\mathbf{d} - \mathbf{g}(\mathbf{u})\|_2^2.$$

Here, $\lambda$ is a regularization parameter, usually set by cross-validation [16] or L-curve [17] methods. To solve the above-stated optimization problem, an iterative approach is usually used. In this case, taking $\mathbf{u}^{(k)}$ to be the solution at iteration $k$, an updated solution can be found by expanding the nonlinear function $\mathbf{g}(\mathbf{u})$ around the previous iteration's solution using a first-order Taylor expansion and solving the following optimization problem:

$$\min_{\mathbf{u}} \ J(\mathbf{u}) + \lambda \left\| \mathbf{d} - \left( \mathbf{g}(\mathbf{u}^{(k)}) + \mathbf{G}\left(\mathbf{u} - \mathbf{u}^{(k)}\right) \right) \right\|_2^2.$$

In this approximation, a first-order Taylor expansion of $\mathbf{g}(\mathbf{u})$ around $\mathbf{u}^{(k)}$ is used under the assumption of small perturbations in the updated parameters, i.e., $\|\Delta\mathbf{u}\| = \|\mathbf{u} - \mathbf{u}^{(k)}\| \ll 1$, and $\mathbf{G}$ is the Jacobian matrix that contains the first-order derivatives of the multivariate vector function $\mathbf{g}(\mathbf{u})$ with respect to the entries of $\mathbf{u}$, evaluated at $\mathbf{u} = \mathbf{u}^{(k)}$.
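The linearized iteration above can be prototyped in a few lines. The sketch below applies it to a toy nonlinear forward model with $J(\mathbf{u}) = \|\mathbf{u}\|_2^2$; the quadratic model, its finite-difference Jacobian, and the value of $\lambda$ are assumptions made only for illustration, whereas in the subsurface applications considered in this thesis $\mathbf{g}(\cdot)$ is a numerical flow simulator whose Jacobian comes from sensitivity computations.

```python
import numpy as np

# A minimal sketch of the penalized Gauss-Newton iteration
#   min_u J(u) + lam * ||d - (g(u_k) + G (u - u_k))||^2 ,  with J(u) = ||u||^2.
# The toy forward model g, its finite-difference Jacobian, and lam are
# illustrative assumptions; they stand in for a flow simulator and its
# sensitivities.

def g(u):
    # toy nonlinear forward model
    return u**2 + 0.5 * u

def jacobian(u, eps=1e-6):
    # finite-difference Jacobian of g at u (diagonal here, but kept general)
    n = u.size
    J = np.zeros((n, n))
    f0 = g(u)
    for j in range(n):
        up = u.copy()
        up[j] += eps
        J[:, j] = (g(up) - f0) / eps
    return J

rng = np.random.default_rng(1)
u_true = rng.uniform(0.5, 1.5, size=5)
d = g(u_true)                      # noiseless synthetic data
lam, u_k = 10.0, np.ones(5)        # regularization weight and initial guess

for it in range(20):
    G = jacobian(u_k)
    r = d - g(u_k)                 # current data residual
    # normal equations of the linearized, penalized objective:
    #   (I + lam G^T G) u = lam G^T (r + G u_k)
    u_new = np.linalg.solve(np.eye(5) + lam * G.T @ G,
                            lam * G.T @ (r + G @ u_k))
    if np.linalg.norm(u_new - u_k) < 1e-8:
        break
    u_k = u_new

print("final misfit:", np.linalg.norm(d - g(u_k)))
```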
Chapter II. SPARSE REGULARIZATION FOR DYNAMIC SUBSURFACE IMAGING

Identification of heterogeneous hydraulic rock properties presents a challenging problem in the modeling and prediction of fluid flow displacement in subsurface environments. Inference of these properties from limited dynamic flow measurements typically leads to a severely underdetermined nonlinear inverse problem that can have many non-unique solutions. The problem is usually regularized by incorporating implicit or explicit prior information in the search for an acceptable solution. Reducing the number of unknown parameters and/or constraining the solution space by imposing an appropriate structural assumption on the solution are common approaches for stabilizing the solution of ill-posed inverse problems. In geoscience inverse problems, a valid structural assumption must be supported by the physics of the problem and by available information about the geology of the formation, as well as by outcrop and other measurements such as well logs and seismic data.

This chapter presents a geologically motivated regularization approach for the representation and estimation of subsurface hydraulic properties. First, the proposed approach constructs a geologically relevant dictionary of model elements from an ensemble of prior models using a recently developed learning algorithm known as K-SVD. The main property of the constructed basis is its sparsity; i.e., only a small subset of the geologic dictionary components is needed to accurately approximate each prior model in the ensemble. Then, this study presents the proposed formulation of the inverse modeling solution approach, which searches for the relevant subset of model elements in the dictionary that best reconstructs the final solution according to the dynamic flow data. To identify the relevant subset of model components in the geologic dictionary and estimate their contributions (weights), this chapter provides the implementation of an iteratively reweighted reconstruction algorithm that minimizes a data misfit function augmented with a multiplicative sparsity regularization term. Finally, I illustrate the effectiveness of this new approach and compare it with alternative prior-based methods using a series of waterflooding experiments in hydrocarbon reservoirs.

In geo-environmental applications, inverse modeling is often used to provide a complete picture/map of subsurface physical properties from an incomplete set of often diverse measurements. Examples of these applications include identification of hydrocarbon reservoir basins [18]–[20], characterization of aquifers for accurate fluid flow and contaminant transport predictions [21], [22], geophysical inverse problems [23]–[25], and characterization of geothermal reservoirs [26]–[28]. In particular, the reconstruction of subsurface hydraulic property maps is a difficult inverse problem to solve, mainly because of the heterogeneity of flow properties in subsurface formations and the scarcity of observations available to resolve detailed information about the spatial distribution of rock flow properties. Consequently, several non-unique solutions can be found that reproduce the observed quantities but fail to capture the correct spatial continuity or to predict the flow and displacement patterns.

Several techniques exist for regularizing the solution of ill-posed inverse problems. The application of these methods to subsurface characterization inverse problems has been extensively discussed by several authors (see [29]–[33] and references therein). The formulation of a subsurface inverse modeling algorithm is directly influenced by the choice of parameters.
Parameterization defines the set of unknown parameters that "effectively" describe the behavior of the underlying model. In spatial inverse problems, choosing an appropriate domain that affords an effective description of the parameters is not trivial, mainly because the exact properties of the solution are not known. In general, however, a proper choice of parameter domain may be obtained from knowledge about the physics of the problem, which in the case of subsurface inverse problems should account for the type of geologic continuity in the specific formation under consideration. Moreover, the solution algorithm must account for, and take advantage of, the properties of the expected solution in the specified domain. For example, when a uniform grid-based description of subsurface petrophysical properties in the spatial domain is used, a large number of correlated parameters result, leading to a redundant description. This redundancy is introduced because many geological features exhibit piecewise smooth properties with correlation structures over large scales. Non-uniqueness and instability are the two major by-products of such redundant descriptions [21], [29], [34], [35].

A classical approach to addressing the non-uniqueness and instability issues is to incorporate, implicitly or explicitly, prior knowledge about the solution [21], [29], [34], [35]. For instance, Tikhonov regularization methods [36] impose spatial structural assumptions (e.g., smoothness or flatness) onto the solution to acknowledge the spatial correlations in the model. Spatial-domain parameter representation and regularization is only one approach to posing and solving the inverse problem. Moreover, spatial regularization methods such as Tikhonov regularization introduce global structural constraints in space, which can be rigid and lead to errors when locally different structures or discontinuities are present.

Several techniques have been proposed in the literature to address the parameterization of subsurface inverse problems. A simple approach is to use up-gridding or regionalization, which aggregates grid cells with similar properties into one large cell with a single constant property [37], [38]. A more refined version of this approach is the family of adaptive multi-grid techniques, which attempt to optimally refine or coarsen model grids, based on prior data and/or measurement sensitivities, to balance model and data resolutions by adaptively adjusting the local resolution of the unknown spatial parameters to a level that is justified by the data [39]–[43]. A few other parameterization techniques, including the subspace method [44]–[46] and pilot-point techniques [47]–[49], have also been developed and applied to the characterization of groundwater aquifers and hydrocarbon reservoirs and to similar geophysical inverse problems.

While grid-based spatial data fusion approaches to subsurface characterization are flexible, they can have difficulty capturing the distinctive complexity and irregularity of structural features in subsurface formations from limited measurements. A more consistent and natural choice of parameters for solving the subsurface characterization inverse problem can be obtained by adopting a feature-based estimation framework. Since the physical properties of subsurface formations tend to form piecewise continuous features, they can often be sparsely approximated with a relatively small number of parameters in an appropriate mathematical transform domain.
The sparse transform-domain representations of subsurface property maps lend themselves to feature-based estimation techniques that can better preserve geological continuity and realism. Transform-domain representations with compression bases [50] provide an attractive option for accomplishing this goal. There are two major elements in formulating the inverse problem in this framework: i) the choice of the compression (or sparsifying) basis, and ii) effective reconstruction algorithms that can use the available measurements to identify the relevant basis elements and estimate their contributions (weights) to generate the solution. The choice of an appropriate basis to sparsely represent subsurface property images depends on the available prior knowledge and on the important compression-related characteristics of the underlying property distribution maps, such as correlation structure and spectral characteristics. Pre-constructed image compression bases, such as the discrete cosine transform (DCT) [51]–[53] or the discrete wavelet transform (DWT) [54], [55], achieve excellent compression performance in representing smooth and piecewise smooth images, which dominate many natural images, including geological formations. The DCT and DWT are used in the JPEG [56], [57] and JPEG2000 [58], [59] still image compression standards, respectively. For most natural images, only a fraction of the transformed coefficients contribute significantly to the reconstruction of the original image, which implies that most natural images have sparse representations or approximations in the DCT or DWT domains. In subsurface modeling, where prior information is often available, these pre-constructed bases can be replaced with specialized sparse bases learned from a prior library of uncertain but likely geologic models, which are usually constructed using geostatistical simulation techniques [60]. In this thesis, we describe and use a learned sparse dictionary known as K-SVD [61], [62].

Once the sparse basis for approximation is selected, the remaining question is which components of the basis should be included in the approximation and how these components are identified. This question is at the heart of approximation theory, and the answer depends on the available information about the solution, as discussed below [63], [64]. In many cases, it may be reasonable to apply a truncation to eliminate unimportant basis components and thereby reduce the number of unknown parameters. Typically, the truncation is applied in advance based on prior information, the properties of the parameterization method, and the expected form of the model solution in its parameterized description [20], [65], without incorporating the dynamic data. In approximation theory, this a priori parameter selection leads to a linear approximation (discussed in later sections). Alternatively, nonlinear approximation algorithms have been developed to adaptively identify, based on available information, the essential components (significant basis elements) of the parametric description of the final solution [66]–[70]. My objective in this chapter is to present a feature estimation approach for the solution of nonlinear dynamic inverse problems within a nonlinear approximation framework.
This thesis presents a geologically motivated data integration framework that effectively combines prior information with dynamic flow measurements to reconstruct a solution that is consistent with the existing measurements and respects prior geological continuity. In groundwater and reservoir characterization inverse problems, one often has an uncertain (probabilistic) description of the prior model that should be combined with the available flow measurements to reconstruct a consistent solution. The proposed inversion approach is, therefore, based on the generation of a large dictionary of model components learned from a prior library of likely geologic models. This dictionary of geologically inspired model components is used as a specialized (trained) search space. A main characteristic of the generated geologic dictionary is that only a small number of its components are needed to approximate models containing features similar to those in the prior model library. Therefore, the inverse problem is reduced to finding a small subset of model components from the dictionary that can be combined, using the available measurements, to reconstruct the solution. Just as a dictionary is used to compose a sentence that conveys an intended message, the elements of a prior-based geologic dictionary are combined to form a solution that explains the measurements. However, finding the relevant subset of model components from the dictionary and assigning weights to them is not a trivial task. This study approaches the problem by incorporating the knowledge that only a small subset of the dictionary elements is needed; i.e., the required elements form a sparse subset of all possible components. As explained in the following section, by taking advantage of this sparsity property, we minimize an objective function that contains a data misfit term and a sparsity-promoting multiplicative regularization term to identify the relevant components of the geologic dictionary and combine them to form the solution.

Inverse modeling with sparse regularization: Formulation

This section describes the proposed methodology for solving nonlinear spatial inverse problems using flow data. Our approach motivates a sparse representation of the unknown spatial parameters of interest (here, the spatial permeability distribution) in a geologically relevant transform domain, which in this chapter is learned from a prior library of likely geologic models. We begin with a general discussion of transform-domain approximation methods to distinguish between linear and nonlinear approximation, followed by a description of the construction of the geologically relevant dictionary. The section concludes with a brief presentation of the estimation method used to infer the significant parameters (i.e., dictionary elements) from dynamic flow measurements.

Linear and Non-Linear Approximation

In approximating an N-dimensional target function $\mathbf{u}$ using a pre-specified basis $\Phi = [\boldsymbol{\phi}_1 \ \boldsymbol{\phi}_2 \ \cdots \ \boldsymbol{\phi}_N]$, where $\boldsymbol{\phi}_j$, $j = 1, \dots, N$, are the basis components, an S-term approximation can be expressed as

$$\tilde{\mathbf{u}} = \sum_{s \in \Lambda_S} v_s \boldsymbol{\phi}_s, \tag{2.1}$$

where $\tilde{\mathbf{u}}$ is the approximant and $\Lambda_S \subset \{1, \dots, N\}$ is a set of indices with $|\Lambda_S| = S \ll N$.
If the S basis elements of the above series are fixed independently of the target function $\mathbf{u}$, the result is a linear approximation, in the sense that for two functions $\mathbf{u}_1$ and $\mathbf{u}_2$ the approximant of their linear combination $a_1\mathbf{u}_1 + a_2\mathbf{u}_2$ can be written as the linear combination of their individual approximants, $a_1\tilde{\mathbf{u}}_1 + a_2\tilde{\mathbf{u}}_2$. Since a linear approximation is obtained independently of the target function, it is applied mostly to problems in which the target function is unknown. In general, depending on the expected properties of the unknown target function and the available prior information, an appropriate basis that yields a high compression rate may be found. For example, most smooth functions have an effective linear approximation in terms of low-frequency harmonic basis functions [65]. An important example is the low-frequency cosine harmonics (i.e., the DCT), which have been found effective in approximating predominantly smooth images [66].

On the other hand, the best S-term approximation of $\mathbf{u}$ in $L_2$ is obtained by selecting the subset of S basis elements that yield the largest orthogonal projection coefficients, i.e., $v_s = \langle \mathbf{u}, \boldsymbol{\phi}_s \rangle$, where $\langle \cdot, \cdot \rangle$ denotes the inner product of $\mathbf{u}$ and the basis vector $\boldsymbol{\phi}_s$. Note that in this case the choice of the S significant basis elements depends on the target function $\mathbf{u}$, as opposed to being fixed (independent of $\mathbf{u}$) as in linear approximation. Therefore, the resulting best S-term approximation is nonlinear: the S-term approximant of $a_1\mathbf{u}_1 + a_2\mathbf{u}_2$ cannot be written as $a_1\tilde{\mathbf{u}}_1 + a_2\tilde{\mathbf{u}}_2$, because the selected S basis elements are, in general, different for $\mathbf{u}_1$ and $\mathbf{u}_2$ ($a_1\tilde{\mathbf{u}}_1 + a_2\tilde{\mathbf{u}}_2$ is in general a 2S-term approximation of $a_1\mathbf{u}_1 + a_2\mathbf{u}_2$). In nonlinear approximation, since the target function is known, the subspace defined by the basis vectors with the largest projection coefficients can be selected; hence, the approximation result is superior to linear approximation. A common application of nonlinear approximation is image compression, where a low-rank representation of a known image in a compressive basis such as the DCT or DWT is obtained.

In many important inverse problems, the target function (i.e., the solution) is only partially known. Partial knowledge of the solution may be available in the form of direct or indirect measurements and/or a statistical description of the model (for example, an empirically learned covariance or variogram model). Under these circumstances, a common approach is to use a linear approximation, where a fixed basis is selected by taking into account the prior information about the solution. A traditional example of this approach is principal component analysis (PCA), which derives the S basis elements from the leading spectral modes of the prior covariance of the unknown function [20], [65], [71]. Alternatively, it may be possible in some cases to search for the best S-term (nonlinear) approximation of the solution in an appropriate basis when sufficient information about the solution is available. Recent advances in signal processing, formalized under the theory of compressed sensing [69], provide promising theoretical guarantees for perfect reconstruction of sparse signals from linear measurements (see [69] for theoretical details and [72], [73] for a thorough review). Sparse reconstruction techniques can facilitate the nonlinear approximation of an unknown function and may lead to estimation performance superior to the alternative linear approximation.
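The gap between the two schemes is easy to reproduce numerically. The sketch below compares, for a piecewise-smooth test signal, a linear S-term DCT approximation (a fixed set of low-frequency indices) against the best S-term nonlinear approximation (the S largest-magnitude coefficients); the test signal and the value of S are arbitrary illustrative assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

# A minimal sketch contrasting linear and nonlinear S-term approximation
# in an orthonormal DCT basis. The test signal (smooth trend plus a sharp
# jump) and the value of S are arbitrary illustrative assumptions.

N, S = 256, 16
t = np.linspace(0, 1, N)
u = np.sin(2 * np.pi * t) + (t > 0.6)          # piecewise-smooth target function

c = dct(u, norm='ortho')                        # transform coefficients

# Linear approximation: keep a FIXED index set (the S lowest frequencies),
# chosen independently of the target function.
c_lin = np.zeros(N)
c_lin[:S] = c[:S]

# Nonlinear (best S-term) approximation: keep the S largest-magnitude
# coefficients, an index set that DEPENDS on the target function.
idx = np.argsort(np.abs(c))[-S:]
c_nl = np.zeros(N)
c_nl[idx] = c[idx]

for name, ck in [("linear", c_lin), ("nonlinear", c_nl)]:
    err = np.linalg.norm(u - idct(ck, norm='ortho')) / np.linalg.norm(u)
    print(f"{name} S-term relative error: {err:.4f}")
```

For an orthonormal basis, keeping the largest-magnitude coefficients is provably the best S-term choice, so the nonlinear error here is never larger than the linear one; the advantage grows when the signal has localized features that low frequencies alone cannot capture.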
Motivated by recent progress in the nonlinear approximation of sparse signals, this study formulates a geologically relevant nonlinear approximation technique that can be used to solve nonlinear subsurface flow inverse problems. In what follows, a sparse vector loosely refers to a vector that has many zero or negligible entries; in this context, a sparsity-promoting regularization encourages sparse solutions by assigning a penalty to non-zero components.

Geologically Learned Sparsifying Transforms

While pre-constructed bases provide good compression power in the absence of any prior information, they are not as effective as specialized bases that can be learned from reliable prior information or structural assumptions. As an alternative to generic compression bases, a geologically aware sparse basis can be derived from a prior training database. That is, using a training database as prior knowledge, one can construct a matrix $\Phi$ such that the projection of the features in the database onto $\Phi$ leads to a sparse set of coefficients. In general, a learned basis is more effective and consistent in reconstructing a model similar to those in the prior library. This approach is widely used in computer vision and object recognition, where a training database of a specific object is available (e.g., face or fingerprint features). In many subsurface characterization applications, such training databases can be readily constructed by integrating various sources of data through geostatistical simulation techniques [60]. Therefore, geologically relevant sparse bases can be generated from the available (uncertain) prior models using learning algorithms that compress the contents of the prior models into a set of basis elements that are very effective in approximating models not far from those in the prior library. In the following sections, we present a singular-value-based, geologically learned sparse basis, known as K-SVD, and discuss its approximation properties and its application to subsurface data integration problems. We begin with a brief review of the regular singular value decomposition and the basis derived from the singular vectors (i.e., the SVD basis).

Singular Value Decomposition

A common approach to the approximation of the member images in a training library (database) is through their projection onto the left singular vectors of the matrix that has the vectorized members of the training dataset as its columns. Let us denote each $N \times 1$-dimensional member of such a library by $\mathbf{u}_i$ and the matrix containing the L samples in the library by $\mathbf{U}_{N \times L} = [\mathbf{u}_1 \ \cdots \ \mathbf{u}_i \ \cdots \ \mathbf{u}_L]$. It is relatively straightforward to show that among all S-term linear approximations of $\mathbf{U}$, the expansion using its S leading left singular vectors (denoted by $\Phi_{N \times S}$) leads to the smallest root-mean-square error (RMSE), averaged over the entire library. In fact, the singular vectors can be derived as the solution of the following minimization problem for an S-term approximation of the library:

$$\hat{\Phi} = \arg\min_{\Phi} \ \|\mathbf{U} - \Phi\mathbf{V}\|_F^2, \tag{2.2}$$

where $\mathbf{V}_{S \times L} = [\mathbf{v}_1 \ \cdots \ \mathbf{v}_i \ \cdots \ \mathbf{v}_L]$ is an $S \times L$ matrix whose columns are the S approximation coefficients corresponding to each library member. Note that while the left singular vectors of $\mathbf{U}$ are the solution to (2.2), the best (in the RMSE sense) S-term approximation of each individual member of the library is not generally achieved by its projection onto the leading S vectors of $\Phi$. Specifically, $\Phi$ gives the linear approximation basis that is optimal in RMSE averaged over the entire library. While this linear approximation is convenient and in most cases adequate, it tends to provide a reasonable approximation rate for typical members of the library and might not be efficient for approximating members that deviate slightly from the common features in the library.
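As a concrete, if simplified, illustration of (2.2), the sketch below assembles a library matrix U from smooth correlated random vectors (a stand-in for geostatistical realizations, not the libraries used in this chapter), extracts its S leading left singular vectors with numpy, and reports the library-averaged RMSE of the resulting S-term linear approximation.

```python
import numpy as np

# A minimal sketch of the S-term SVD basis of (2.2): stack L vectorized
# training models as the columns of U, take the S leading left singular
# vectors, and project. The smooth random library below is an illustrative
# stand-in for geostatistical realizations.

rng = np.random.default_rng(2)
N, L, S = 400, 300, 20

# Correlated (smooth) library members via a simple moving-average filter.
white = rng.standard_normal((N + 20, L))
kernel = np.ones(21) / 21.0
U = np.stack([np.convolve(white[:, i], kernel, mode='valid')
              for i in range(L)], axis=1)          # N x L library matrix

Phi, sing, _ = np.linalg.svd(U, full_matrices=False)
Phi_S = Phi[:, :S]              # N x S basis: the S leading left singular vectors

V = Phi_S.T @ U                 # S x L projection coefficients for every member
rmse = np.sqrt(np.mean((U - Phi_S @ V) ** 2))
print(f"library-averaged RMSE of the S={S}-term SVD approximation: {rmse:.4f}")
```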
Figures 2.1(a1) and 2.1(b1) show samples from two different two-dimensional libraries, each with 3000 petrophysical property map (permeability) members. The permeability maps in Figure 2.1(a1) have channel facies with a predominantly left-to-right orientation and are generated using multipoint geostatistical simulation [74], [75].

[Figure 2.1: K-SVD dictionary and SVD basis. Sample log-permeability maps with K-SVD and SVD bases of different sizes; (a1) prior log-permeability samples for channel log-permeability maps; (a2)-(a7) the first 20 K-SVD basis elements for different K values (the dimension of the nonlinear problem), with the number of nonzero coefficients fixed at 20 in all cases; (a8) the first 20 leading elements of the SVD basis; (b) the same panels as in (a) for the example with heterogeneous permeability models.]

Figure 2.1(b1) shows more heterogeneous permeability maps, generated using the sequential Gaussian simulation (SGSIM) algorithm [60]. The first 20 leading left singular vectors of the matrix representing each library are shown in Figures 2.1(a8) and 2.1(b8), respectively. Note that the first few leading SVD basis images tend to contain large-scale features, whereas the subsequent basis images tend to contain smaller-scale features (even within the first 20 leading eigenvectors). Figure 2.2(a) also shows sample permeability models; these samples do not belong to the library. The linear approximations of these models using the fixed 20 leading basis elements are shown in Figures 2.2(f1) and 2.2(f2). As discussed in the previous section, the best S-term approximation for each individual member of the prior ensemble of models is in general different from that given by the leading singular vectors and is uniquely determined by the basis elements that give the largest projection coefficients. In fact, when a nonlinear approximation is performed, a different set of 20 basis elements may be selected for the reconstruction of each image. We discuss the K-SVD as an alternative sparsifying transform for nonlinear approximation in the following section.
[Figure 2.2: Compression quality comparison between K-SVD and SVD atoms. Panels: (a) true log-perms; (b1)-(b2) K-SVD coefficients and approximation (K=2025); (c1)-(c2) K=400; (d1)-(d2) K=200; (e1)-(e2) K=100; (f1)-(f2) SVD coefficients and approximation (K=20), with RMSE and SSIM errors reported for each reconstruction.]

K-SVD Algorithm

A more effective approach for learning a sparse basis from a training database $\mathbf{U}_{N \times L}$ is to solve either of the following optimization problems [62], [76]:

$$\min_{\Phi, \{\mathbf{v}_i\}} \ \sum_{i=1}^{L} \|\mathbf{v}_i\|_0 \quad \text{s.t.} \quad \|\mathbf{u}_i - \Phi\mathbf{v}_i\|_2 \le \epsilon, \tag{2.3a}$$

$$\min_{\Phi, \{\mathbf{v}_i\}} \ \sum_{i=1}^{L} \|\mathbf{u}_i - \Phi\mathbf{v}_i\|_2^2 \quad \text{s.t.} \quad \|\mathbf{v}_i\|_0 = S, \tag{2.3b}$$

where S is the specified number of non-zero entries in $\mathbf{v}_i$. Although there is no known practical algorithm for the exact solution of the above optimizations, heuristic methods such as the method of optimal directions (MOD) and K-SVD, which approximately solve (2.3), have been shown to perform reasonably well in finding empirically learned bases [77]–[80]. The K-SVD algorithm derives its name from its similarity to the K-means clustering algorithm [81]. While in the K-means clustering algorithm K computations of means are performed, in the K-SVD approach an updated basis is obtained through K SVD computations, each determining one column of the basis; hence the name K-SVD. Note that, to avoid notational confusion, we use S to denote sparsity and K to refer to the number of computations in the K-SVD algorithm (the total number of basis elements). This section briefly describes the K-SVD algorithm and leaves further details to the original references [62], [76].

The K-SVD algorithm was recently developed by Aharon et al. [62] to design a dictionary $\Phi$ of size $N \times K$ from L samples $\mathbf{u}_i$ while ensuring the sparsity (described by the $\ell_0$-norm) of each $\mathbf{u}_i$ over $\Phi$. This problem can be formalized as seeking a solution of

$$\{\hat{\Phi}, \hat{\mathbf{V}}\} = \arg\min_{\Phi, \mathbf{V}} \ \|\mathbf{U} - \Phi\mathbf{V}\|_F^2 \quad \text{s.t.} \quad \|\mathbf{v}_i\|_0 \le S, \quad i = 1, 2, \dots, L. \tag{2.4}$$

To find $\Phi$ and $\mathbf{V}$ from $\mathbf{U}$, the K-SVD algorithm solves the problem in (2.4) iteratively. Each iteration of the K-SVD algorithm consists of two steps.
The first step (known as sparse coding) finds $V$ for a fixed $\Phi$ and $U$; the second step updates the dictionary by finding a new $\Phi$ after fixing $V$ and $U$. The K-SVD algorithm owes its efficiency to fast sparse coding methods [62], [76] and rank-one SVD computations. The two basic steps of the K-SVD algorithm are summarized below.

Step I: Sparse Coding

In this step, the sparse transform coefficients $v_i$ are found for a fixed dictionary $\Phi$ and each library member $u_i$ while imposing the sparsity constraint $\|v_i\|_0 \le S$. This formulation is equivalent to solving L sparse reconstruction subproblems of the form

$$\hat{v}_i = \arg\min_{v_i} \|u_i - \Phi v_i\|_2^2 \quad \text{s.t.} \quad \|v_i\|_0 \le S. \qquad (2.5)$$

The subproblems in (2.5) can be solved efficiently with algorithms from the sparse reconstruction literature, such as orthogonal matching pursuit (OMP) [77], [82], [83] and iterative hard thresholding (IHT) [84]. Given a dictionary $\Phi = [\phi_1\ \phi_2\ \cdots\ \phi_K]$, OMP first selects the basis element with the largest projection coefficient for the library member, computes the orthogonal projection onto it, and forms the residual. The next selected basis element is the one with the largest inner product with the residual (i.e., the maximum contribution to the information not captured by the previously selected elements). When the sparsity is specified to be S, the algorithm proceeds until it has found the S basis elements that provide the minimum residual in the orthogonal projections.

Step II: Dictionary Update

In the second step of the K-SVD algorithm, the dictionary $\Phi$ is updated after fixing the sparse coefficients $V$ from the first step. This update is achieved by solving the subproblem

$$\hat{\Phi} = \arg\min_{\Phi} \|U - \Phi V\|_F^2 = \arg\min_{\Phi} \Big\|U - \sum_{i=1}^{K} \phi_i v_i^T\Big\|_F^2, \qquad (2.6)$$

where $\phi_i$ and $v_i^T$ denote the i-th column of $\Phi$ and the i-th row of $V$, respectively. To solve (2.6), the K-SVD algorithm sequentially updates each column of $\Phi$ by minimizing

$$\hat{\phi}_i = \arg\min_{\phi_i} \Big\|\Big(U - \sum_{j \ne i} \phi_j v_j^T\Big) - \phi_i v_i^T\Big\|_F^2, \quad i = 1, 2, \ldots, K. \qquad (2.7)$$

The reader should note two important observations about the K-SVD algorithm: (i) $\phi_i v_i^T$ is a rank-one approximation of the residual $U - \sum_{j \ne i} \phi_j v_j^T$; consequently, K rank-one SVD computations are carried out in each iteration; and (ii) to preserve the sparsity of $v_i$ from Step I, (2.7) is solved only over the library members that actually use $\phi_i$, that is,

$$\hat{\phi}_i = \arg\min_{\phi_i} \Big\|\Big(U_{\omega_i} - \sum_{j \ne i} \phi_j v_{j,\omega_i}^T\Big) - \phi_i v_{i,\omega_i}^T\Big\|_F^2, \quad i = 1, 2, \ldots, K, \qquad (2.8)$$

where $\omega_i = \{j : v_i(j) \ne 0\}$. While no formal convergence proof has been given for the K-SVD algorithm, numerical experiments [85], [86] have shown that the algorithm is quite robust. In this chapter we use the implementation presented in [87]. At the k-th step of the dictionary update, the algorithm finds the library members that use the basis element $\phi_k$ in their representation; this element is then replaced with the leading left singular vector of the residual $R_k = U - \sum_{j \ne k} \phi_j v_j^T$, which yields the closest rank-one approximation to $R_k$. Figures 1-a2 and 1-b2 show 20 samples of the K=2025 K-SVD basis elements generated from the channel facies and heterogeneous libraries of Figures 1-a1 and 1-b1, respectively. Samples from each library of log-permeabilities are shown in the top-left panels for each case. Note that, unlike the SVD basis, the K-SVD basis elements are not ordered according to their significance and contribution in reconstructing the members of the library.
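The following minimal NumPy sketch illustrates one K-SVD iteration as described above, with a simple OMP helper for the sparse coding step; it is a simplified illustration under assumed array shapes, not the reference implementation [87].

```python
import numpy as np

def omp(Phi, u, S):
    """Greedy orthogonal matching pursuit: select S atoms of Phi for u."""
    residual, support, coef = u.copy(), [], np.zeros(0)
    for _ in range(S):
        j = int(np.argmax(np.abs(Phi.T @ residual)))   # best-matching atom
        if j not in support:
            support.append(j)
        # Re-fit all active atoms and update the residual.
        coef, *_ = np.linalg.lstsq(Phi[:, support], u, rcond=None)
        residual = u - Phi[:, support] @ coef
    v = np.zeros(Phi.shape[1])
    v[support] = coef
    return v

def ksvd_iteration(U, Phi, S):
    """One K-SVD iteration: sparse coding + K rank-one dictionary updates."""
    N, L = U.shape
    K = Phi.shape[1]
    # Step I: sparse coding for the fixed dictionary (Eq. 2.5).
    V = np.column_stack([omp(Phi, U[:, i], S) for i in range(L)])
    # Step II: update each atom from the members that actually use it.
    for k in range(K):
        users = np.nonzero(V[k, :])[0]
        if users.size == 0:
            continue
        # Residual without atom k, restricted to its users (Eq. 2.8).
        Rk = U[:, users] - Phi @ V[:, users] + np.outer(Phi[:, k], V[k, users])
        # Closest rank-one approximation via the leading singular pair.
        Uk, s, VkT = np.linalg.svd(Rk, full_matrices=False)
        Phi[:, k] = Uk[:, 0]
        V[k, users] = s[0] * VkT[0, :]
    return Phi, V
```

In practice the two steps are repeated until the Frobenius misfit in (2.4) stops decreasing; the rank-one SVD update keeps the atoms normalized and preserves the sparsity pattern found in Step I.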
Different combinations of the basis elements, irrespective of their location (index) in the basis, can be used to provide a nonlinear S=20-term approximation for each individual replicate in the library. It is also important to note that finding the S=20-term approximation coefficients in this case is not as straightforward as in the linear SVD case. Here, the locations (indices) of the S=20 significant basis elements are also unknown. As a result, a search must be conducted to find both the indices and the corresponding coefficients (weights) of the S=20-dimensional expansion (i.e., a nonlinear approximation). In general, the exact solution of this discrete optimization problem requires an exhaustive search over all S=20-dimensional subspaces, which is combinatorial and hence impractical. We use efficient sparse reconstruction techniques to approximate the solution of this discrete (integer) optimization problem. Unlike the regular SVD case, in which a fixed S-dimensional approximation subspace is used, the K-SVD with K basis components (and a sparsity of S=20) does not reduce the dimension of the inverse problem to S<<K. Instead, a search in a K-dimensional subspace is conducted to find the relevant components and the corresponding significant coefficients that provide a good fit to the observations. While a larger-dimensional search space in a diverse basis may provide substantial flexibility in reconstructing the solution, this gain in flexibility is accompanied by a higher computational cost that can limit the application of the approach to larger-scale problems. One way to reduce the computational complexity and improve the stability of the optimization problem solved during data integration is to construct a smaller-dimensional sparse approximation subspace. Figures 1-a2 through 1-a7 (for the channel permeabilities) and 1-b2 through 1-b7 (for the heterogeneous permeabilities) show sample basis elements for K=1000, 400, 200, 100, and 20 subspace dimensions, respectively. A key observation from these figures is that as the number of basis elements (K) decreases, the basis images become smoother and hence may not be able to resolve finer details in a heterogeneous property map. This results in the usual trade-off between smoothness, computational complexity, and identifiability. The prior information, the amount and type of observations, and the goal of the data integration problem are helpful in resolving this trade-off. For example, when seeking to resolve large-scale channel structures from limited production data, the smoother basis elements (at low K values) are more relevant. In general, since the resolution of the data in many subsurface flow inverse problems does not allow for the identification of small-scale components, including details that cannot be resolved from the data can lead to non-unique and physically implausible solutions. Hence, removing such details from the basis can, in fact, have a positive effect on the stability and realism of the solution. A main reason that the K-SVD algorithm preserves its approximation power at lower K values is the redundancy of the prior features in the library. As shown in the following section, the approximation quality with lower-dimensional bases is not too far from that with K=2025. A number of important properties of the K-SVD basis are also worth discussing.
First, to be able to use nonlinear approximation with sparse reconstruction methods and a fixed number of measurements, it is necessary to maintain a low (S/K) ratio. The main advantage of the K-SVD over the linear SVD approximation is related to the flexibility and better continuity offered by its basis, which can be effectively exploited through a nonlinear approximation algorithm. However, the same flexibility in the basis can lead to non-unique solutions, which in this case are consistent with the prior models and observations (assuming an acceptable data match).

Second, the constructed basis is not unique. Several alternative sparse bases with similar approximation properties can be generated from the same library of prior models. For instance, a library with L=3000 models could be used to generate a K=100 dimensional basis with a sparsity of S=20. That is, the search subspace is K=100 dimensional while only S=20 of the basis elements are used to approximate each library member (note that L, K, and S are specified for basis construction). Since the K-SVD implementation is initialized in our examples by taking the first K members of the prior library, the final basis components are similar to the first K library models (as can be verified from Figure 1). This does not imply that the constructed basis can only sparsify the first K members of the library; in this case, the K-SVD algorithm produces a K-dimensional representation for the entire prior library (L=3000) with a sparsity of S=20. It does imply, however, that the final basis is non-unique and depends on which K prior samples are used to initialize the K-SVD algorithm.

Another important property of the K-SVD is that the basis is not orthogonal. Consequently, while the transformation from the sparse representation to the spatial domain is obtained through the series expansion in (2.1), the inverse transformation of finding the sparse representation of a spatial permeability model requires a search for the S=20 basis elements with the largest orthogonal projection coefficients. This projection is achieved through a simple orthogonal matching pursuit (OMP) algorithm [77] (a short sketch of this projection is given at the end of this subsection). Note, however, that the inverse transformation is only used in the construction of the basis. That is, during data integration we solve directly for the sparse representation in the K-SVD transform domain. Once a solution is found, the series expansion in (2.1) is used to obtain the spatial permeability model, which is then used to solve the multiphase flow equations.

Figure 2 summarizes the approximation property of the K-SVD basis for three different examples with S=20 and K=2025, 400, 200, and 100. The linear SVD approximation with the 20 leading basis elements is also included for each example (Figures 2-f1 and 2-f2). The approximations in Figure 2 are nonlinear and assume full knowledge of the true permeabilities. In addition, while the sample permeabilities in Figure 2 have similarities with the prior models in the library, they were not included in the construction of the sparse K-SVD basis. The results in Figure 2 suggest that the K-SVD basis provides an effective nonlinear sparse approximation of these permeabilities with only 20 basis elements. It can be seen that in each example, as K varies, the K-SVD basis changes and so do the S=20 significant basis elements used in approximating the permeabilities. This behavior is different from the linear SVD case, where the basis elements are fixed and ordered.
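To make the forward and inverse transformations concrete, the short sketch below (reusing the hypothetical `omp` helper from the K-SVD sketch above) projects a new spatial map onto the non-orthogonal dictionary and maps it back through the series expansion; it is illustrative only, not the dissertation's exact routine.

```python
# Forward transform (sparse -> spatial) is the series expansion u = Phi @ v;
# the inverse transform for a non-orthogonal dictionary requires a search,
# sketched here with the `omp` helper defined earlier.

def sparse_code(Phi, u_spatial, S=20):
    """Find the S most significant atoms and weights for a spatial map."""
    v = omp(Phi, u_spatial, S)   # indices and weights via greedy projections
    u_approx = Phi @ v           # expansion back to the spatial domain
    return v, u_approx
```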
In the algorithms for constructing the SVD and K-SVD bases, the minimized objective function is a grid-based 2-norm error measure across the library, $\|U - \Phi V\|_F^2$. While this metric is convenient to compute, it does not provide an appropriate error measure for spatial parameters in an estimation problem, as it is less sensitive to structural connectivity than to exact grid property values. Appendix A describes a structural similarity (SSIM) error metric that is commonly used in quantifying image distortion and measuring image compression quality [88]. The motivation for introducing the SSIM metric is to augment the RMSE, as a measure of grid-based approximation quality, with a structural similarity metric that accounts for the spatial similarity between property maps. In Figure 2, both RMSE and SSIM errors are reported for the approximations. It is evident from the results in Figure 2 that as K decreases, only a marginal increase in RMSE and SSIM error results for the channel example compared to the more heterogeneous Gaussian permeability case (third column). As discussed above, reducing K leads to smoother approximations, which may be desirable if continuous large-scale features (such as channels) are to be represented. While reducing K can lead to over-smoothing for more heterogeneous permeability fields, it is also important to be mindful of what level of heterogeneity is realistically resolvable from the type and amount of measurements available. In dynamic subsurface flow inverse problems, where scarce and spatially averaged flow measurements are used to infer hydraulic properties, estimation of fine local details may not be possible.

Estimation Method

This section considers the inverse modeling problem in two-dimensional, two-phase (oil/water), immiscible and incompressible flow systems. The governing equations are discretized in space and solved using the finite element method to obtain the spatial and temporal distribution of saturation and pressure in the model [89], [90]. Details of the forward model used in this chapter are given in [91]. The components of the forward modeling solution are used to compute the required sensitivities of the observed quantities to the unknown parameters in the inverse modeling minimization problem. The next section presents the inverse problem; interested readers are referred to [91] for details on the forward modeling used in this chapter.

Regularized Least Squares

In the least-squares formulation of the inverse problem, we find an estimate of the unknown parameters (permeability model) $u \in \mathbb{R}^n$ from nonlinear dynamic measurements of the state variables (pressure and saturations) at scattered locations in the field. To this end, a deterministic least-squares problem is obtained by minimizing a nonlinear objective function comprising the square of the l2-norm of the misfit between the observed measurement vector $y$ and the simulated one, $y_{sim} = g(u)$:

$$J(u) = \|y - g(u)\|_2^2. \qquad (2.9)$$

Unless otherwise specified, our definition of the lp norm of a general vector $x$ follows the standard p-norm, $\|x\|_p = \big(\sum_{i=1}^{n} |x_i|^p\big)^{1/p}$ for $p > 0$. Note that for $0 < p < 1$ the above definition does not constitute a norm; however, this dissertation loosely refers to it as the lp norm. When the dimension of $u$ is larger than the number of measurements, the problem is underdetermined, and minimization of (2.9) can become unstable and lead to non-unique solutions.
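As a concrete reference for the notation above, the following small sketch evaluates the lp "norm" and the least-squares objective of (2.9); `forward` is a hypothetical placeholder for the flow simulator g(·), not a routine from this work.

```python
import numpy as np

def lp_norm(x, p):
    """(sum_i |x_i|^p)^(1/p) for p > 0; for 0 < p < 1 this is not a true
    norm, but the text refers to it loosely as the lp norm."""
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

def data_misfit(u, y_obs, forward):
    """Least-squares objective J(u) = ||y - g(u)||_2^2 of Eq. (2.9)."""
    r = y_obs - forward(u)
    return float(r @ r)
```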
To remedy this issue, one approach is to reduce the number of parameters using a better parameterization of the problem [92], [93], [21]. An alternative approach is to incorporate prior knowledge about the solution to regularize the problem [92], [93], [21]. A common form of regularization imposes a structural assumption (often derived from the physics of the problem or the intrinsic nature of the model) on the solution, in an attempt to avoid physically or structurally implausible, or unnecessarily complex, solutions [92], [93], [21]. A general structural regularization approach minimizes the objective function in (2.9) after augmenting it with an additional term that enforces the desired structural attribute (e.g., smoothness):

$$J(u) = \|y - g(u)\|_2^2 + \lambda^2 \|Wu\|_2^2, \qquad (2.10)$$

where $\lambda^2$ is a non-negative regularization parameter that balances the data misfit and the regularization term, and W is a linear operator specified according to the expected properties of the solution. The classical examples of regularization are the cases in which W denotes the zeroth-, first-, or second-order spatial derivative, known as Tikhonov regularization [36]. Minimization of the zeroth-, first-, and second-order spatial derivatives of the model encourages minimum-length, flat, and smooth solutions, respectively.
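A minimal numerical sketch of the additive Tikhonov formulation in (2.10) for a linear forward operator is given below (a Gauss-Newton step would apply the same solve to the model Jacobian in the nonlinear case); the function names are illustrative assumptions.

```python
import numpy as np

def tikhonov_solve(G, y, W, lam):
    """Minimize ||y - G u||_2^2 + lam^2 ||W u||_2^2 for a *linear* forward
    operator G, via the stacked least-squares system [G; lam*W] u = [y; 0]."""
    A = np.vstack([G, lam * W])
    b = np.concatenate([y, np.zeros(W.shape[0])])
    u, *_ = np.linalg.lstsq(A, b, rcond=None)
    return u

def first_derivative_operator(n):
    """First-order finite-difference W (encourages flat solutions)."""
    W = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    W[idx, idx], W[idx, idx + 1] = -1.0, 1.0
    return W
```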
Sparsity Promoting Regularization

In this chapter, by taking advantage of prior knowledge about the sparsity of the expected solution in the geologically learned K-SVD basis, a regularized least-squares solution is implemented in the sparse basis. This regularization approach amounts to simultaneous selection of the appropriate basis elements and estimation of their weights according to the flow measurements. In general, basis selection leads to an intractable combinatorial optimization problem; this chapter therefore considers an efficient implementation of the sparsity regularization approach that avoids the combinatorial optimization. While for linear problems the compressed sensing theory [69], [72] outlines theoretical guarantees for finding the solution when certain sparsity and measurement requirements are met, it is difficult to establish similar theoretical conditions for nonlinear inverse problems; nonetheless, the underlying concepts in the linear case provide important insights that can be applied in formulating solution procedures for nonlinear problems with dynamic measurements [70], [94]. The search for a sparse solution $v$ in the geologically derived sparse transform basis that matches the nonlinear flow measurements can be formalized by solving [69], [72]

$$\min_{v} \|v\|_0 \quad \text{s.t.} \quad y = g(\Phi v) \quad \text{or} \quad \|y - g(\Phi v)\|_2^2 \le \epsilon, \qquad (2.11)$$

in which $\epsilon \ge 0$ reflects the measurement noise variance and the l0 quasi-norm minimization seeks solutions with minimum support (fewest nonzero entries). After finding the solution $\hat{v}$, the spatial-domain representation is readily obtained from $\hat{u} = \Phi \hat{v}$. Solution of this basis selection problem requires an exhaustive (combinatorial) search over all possible sparse subsets of the basis elements, which is impractical. However, in recent years a rapidly growing literature has developed on finding approximate and efficient solutions to (2.11) when the measurement operator is linear, i.e., $g(u) = Gu$.

The main practical algorithms for solving the linear form of (2.11) fall into two classes: the greedy algorithms (GA), also known as matching pursuit (MP) [95]–[98], and the convex relaxation techniques [68], [99]–[101]. A greedy strategy avoids the exhaustive combinatorial search by taking locally optimal steps [97]. Starting with an initially empty active set and $v_0 = 0$, an S-term approximation $v_S$ is constructed iteratively by maintaining a set of active columns and adding one column at each step, chosen to maximally reduce the residual l2 error in approximating $u$ with the currently active columns [97]. This procedure continues until a stopping criterion, usually an error threshold, is reached. The computational complexity of this greedy approach is significantly better than that of an exhaustive search; however, variants of the method suffer mainly from robustness issues and a lack of guaranteed convergence to a sparse solution [97], [102].

On the other hand, convex relaxation methods [68], [99]–[101] make the problem more tractable by replacing the highly discontinuous and nonconvex l0 pseudo-norm penalty with a more continuous (and possibly convex) penalty function, such as the l1-norm or compactness penalties of the form $\sum_j v_j^2/(v_j^2 + \beta^2)$ or $\sum_j |v_j|/(|v_j| + \beta)$ (for very small values of β). Figure 3 shows the behavior of the lp-norm penalty for $p \in (0, 1]$ and of the compactness penalty $\sum_j v_j^2/(v_j^2 + \beta^2)$ for different values of p and β. It is clear that for $p \in (0, 1)$ the lp-norm is non-convex. Also, as $\beta \to 0$, the behavior of the compactness penalty approaches that of the l0 pseudo-norm. The special case $p = 1$ leads to a convex optimization problem [103], [104]. In linear inverse problems with sufficiently sparse solutions, and using Gaussian or Bernoulli random matrices as measurement bases, the inequality $M \ge \gamma\, S \log(K/S)$, where $\gamma$ is a constant, provides a lower bound on the number of measurements needed for perfect reconstruction with very high probability [69], [105].

Figure 2.3: Behavior of two different sparsity-promoting penalty functions. (a) $|x|^p$ for different values of p; as $p \to 0$, larger penalties are given to small values of x (the function is convex for $p \ge 1$); (b) behavior of $x^2/(x^2 + \beta^2)$ for different values of β, as $\beta \to 0$.

When the lp-norm with $p \in (0, 1]$ replaces the l0 pseudo-norm in (2.11), a popular approximate solution technique is the iteratively reweighted least squares (IRLS) algorithm [69], [105], [106]. In this chapter, we use a variant of the IRLS method to solve the regularized nonlinear subsurface inverse problem. In the remainder of this chapter, we focus on the l1-norm convex relaxation of the l0 pseudo-norm in (2.11). In this case, (2.11) is replaced with

$$\min_{v} J(v) = \|y - g(\Phi v)\|_2^2 + \lambda^2 \|v\|_1. \qquad (2.12)$$

To solve this problem with the IRLS approach, we rewrite the regularization term in quadratic form:

$$\min_{v} J(v) = \|y - g(\Phi v)\|_2^2 + \lambda^2 \|v\|_W^2, \qquad (2.13)$$

where the regularization term is $\|v\|_W^2 = v^T W v$, with W a diagonal weighting matrix whose entries $w_{i,i}$ are approximately proportional to $|v_i^{(k)}|^{-1}$ at each iteration (k) of the solution. To avoid the singularity caused by very small values of $v_i^{(k)}$, $w_{i,i}$ is modified to take the form $w_{i,i}^{(k)} = \big((v_i^{(k)})^2 + \epsilon_k\big)^{-1/2}$, where we specify $\epsilon_k = \|y - g(\Phi v^{(k)})\|_2^2$ to adaptively control the weighting matrix during the iterations. With this choice, larger weight is given to the observation misfit term at early iterations; as the iterations proceed and the data misfit decreases, more emphasis is placed on the sparsity term.
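The sketch below illustrates this IRLS scheme for a linear forward operator, including the adaptive damping described above; it is a simplified stand-in for the nonlinear implementation used in this chapter, which re-linearizes g(Φv) at each iteration.

```python
import numpy as np

def irls_l1(G, y, lam=1.0, n_iter=30):
    """IRLS sketch for the linear analogue of Eqs. (2.12)-(2.13):
    min_v ||y - G v||_2^2 + lam^2 v^T W v, with diagonal weights
    W_ii = (v_i^2 + eps_k)^(-1/2) so that v^T W v approximates ||v||_1.
    The damping eps_k is set to the current data misfit, as in the text,
    so early iterations emphasize the misfit and later ones the sparsity."""
    v = np.zeros(G.shape[1])
    for _ in range(n_iter):
        eps_k = np.sum((y - G @ v) ** 2)          # adaptive damping
        w = 1.0 / np.sqrt(v ** 2 + eps_k)         # diagonal of W
        # Normal equations of the weighted quadratic subproblem.
        v = np.linalg.solve(G.T @ G + lam ** 2 * np.diag(w), G.T @ y)
    return v
```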
One of the major difficulties in solving the above problem is the determination of the regularization parameter $\lambda^2$ that balances the weight given to the regularization term $\|v\|_W^2$ and the data misfit term $\|y - g(\Phi v)\|_2^2$. In general, this parameter is not easy to specify without considerable numerical experimentation and/or a priori information about the solution. Standard procedures, such as the L-curve method and generalized cross validation (GCV) [106], have been applied to determine a reasonable value of $\lambda^2$ in linear inverse problems where the discrete Picard condition is satisfied [106]. However, the overhead computation needed to determine $\lambda^2$ can render these approaches impractical for large-scale nonlinear inverse problems. One way to eliminate the regularization parameter is to incorporate the regularization term as a multiplicative constraint [94]. The resulting multiplicatively regularized objective function can be written as

$$J(v) = \|y - g(\Phi v)\|_2^2 \; SP(v), \qquad (2.14)$$

where $SP(v) = \sum_{i} \big(W_{i,i}\, v_i^2 + \epsilon\big)$ is the regularization term that promotes the sparsity of $v$. The diagonal weighting matrix W is similar to that of the additive regularization case above. Experimental studies with the DCT basis in [94], together with the examples in this chapter, suggest that although the multiplicative regularization approach elevates the nonlinearity of the objective function, it performs well in estimating the underlying permeability fields from relatively limited noisy data. A more detailed discussion of the implementation of this multiplicative approach can be found in [94].
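A minimal sketch of evaluating the multiplicatively regularized objective in (2.14) is given below; the exact constant inside SP(v) is reconstructed from context, and `forward` is a hypothetical placeholder for the flow simulator, so this should be read as an illustration rather than the chapter's implementation.

```python
import numpy as np

def multiplicative_objective(v, Phi, y, forward, eps_k):
    """Evaluate J(v) = ||y - g(Phi v)||_2^2 * SP(v) of Eq. (2.14);
    eps_k is the adaptive damping term set from the current data misfit."""
    misfit = np.sum((y - forward(Phi @ v)) ** 2)
    w = 1.0 / np.sqrt(v ** 2 + eps_k)      # diagonal weights W_ii
    sp = np.sum(w * v ** 2 + eps_k)        # sparsity-promoting factor SP(v)
    return misfit * sp
```

The appeal of this form is that no explicit regularization parameter must be tuned: as the misfit factor shrinks, the relative influence of SP(v) grows automatically.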
Application to Flow Data Integration

The following section considers the application of the proposed geologically learned sparse reconstruction method to the estimation of two-dimensional intrinsic permeability fields from two-phase (oil/water) flow measurements.

Alternative Reservoir Configurations

This chapter presents numerical waterflooding experiments for three sets of synthetic oil reservoirs to evaluate the performance of the proposed geologically learned sparse basis in reconstructing permeability maps from production data by estimating their sparse transform-domain representations. In each example, a 225m×225m×5m synthetic reservoir model is discretized into a two-dimensional 45×45×1 uniform grid-block system. The waterflooding experiments are designed to inject one pore volume of water into the reservoir during 12.5 months of simulation time. Biweekly observations of pressure and saturation at the sink and source locations are used to infer the complete spatial permeability map. Experiments with alternative injection/production configurations and structurally different permeability models are conducted to study the performance of the proposed method under various observation assumptions and levels of permeability model complexity. Figure 4 shows the alternative field setups, consisting of a line-drive flooding (Figure 4a) and a conventional 9-spot pattern-drive flooding (Figure 4b).

Reservoir A (Figure 4a) uses a line-drive injection scenario with a horizontal injection well with 45 ports that uniformly injects one pore volume of water from the left side of the domain during the simulation. A similar horizontal producer with 45 ports is placed at the right end of the domain to produce the oil and water that move toward the production well. The production ports operate under a total production rate constraint (a total of 1 PV equally distributed among the producers) to preserve mass balance. Reservoir B (Figure 4b) has a standard 9-spot configuration with one injector in the center and eight producers located symmetrically on the edges of the domain. The initial and boundary conditions are assumed to be known perfectly and are listed, along with other important input parameters, in Table 2.1. For all numerical experiments in this chapter, the initial solution for the permeability is a homogeneous model.

Figure 2.4: Experimental field setup and well configurations. (a) Field setup for Reservoir A: 45 injector ports (I-01 to I-45) on the left-hand side of the reservoir injecting a total of 1 PV over the entire simulation time and 45 production ports (P-01 to P-45) on the right-hand side producing 1 PV of liquid throughout the simulation; (b) field setup for Reservoir B: a 9-spot well configuration with one injector (W5) in the middle and eight symmetric producers (W1-W4, W6-W9) around the reservoir.

Table 2.1: Important flow simulation and data integration parameters.

Parameter              | Horizontal flooding   | 9-spot flooding
Flow system            | Two-phase (oil/water) | Two-phase (oil/water)
Simulation time        | 375 days              | 375 days
Observation interval   | 15 days               | 15 days
Domain size            | 45 by 45 by 1         | 45 by 45 by 1
Cell dimensions        | 5 by 5 by 1           | 5 by 5 by 1
Rock porosity          | 0.2                   | 0.2
Initial oil saturation | 1.0                   | 1.0
Injection volume       | 1 PV                  | 1 PV
Number of injectors    | 45 (horizontal well)  | 1
Number of producers    | 45 (horizontal well)  | 8
Number of iterations   | 30                    | 30

Description of Experiments

For each reservoir configuration, this section first implements the proposed regularized inversion algorithm with the K-SVD sparse basis. For comparison, two alternative solution methods are then applied: 1) a solution that uses a truncated linear SVD parameterization, in which the 20 leading singular vectors provide a parameterized description of the permeability field; this represents a commonly used parameterization approach, where a 20-dimensional non-sparse coefficient vector is estimated during the inversion (hence, no regularization term is needed in the objective function); and 2) a solution that applies exactly the same regularized least-squares problem as in (2.14), except that the entire 2025-element linear SVD basis is used during the inversion. The hypothesis in the second case is that the linear SVD also provides a representation of the true permeability map that can be approximately sparse, so sparsity regularization may also be applied to the linear SVD case. This hypothesis is evaluated through the reconstruction of two-dimensional permeability models in the examples of the following section.

Results and Discussion

This section presents and discusses the results of numerical experiments in which the K-SVD basis is used with the IRLS reconstruction algorithm to estimate two-dimensional facies models and property maps, using examples from waterflooding in oil reservoirs.
The synthetic true permeability models presented in this section were not used in generating the learned basis in Figure 1. We begin with a simple example to illustrate the approach and gradually increase the complexity of the true model and the inversion problem as the chapter proceeds. The chapter concludes with a discussion of the computational complexity of the approach, some numerical experiments with a reduced-order K-SVD approach, and a study of the effect of observation noise.

Example 1

Experiment 1 considers the estimation of fluvial channels in a horizontal waterflooding experiment using the field setup of Reservoir A. The training dataset and samples from a K=2025 K-SVD basis are shown in Figure 1a. Figures 5 and 6 summarize the reconstruction results for the log-permeability and the observation matches, respectively. The true log-permeability for this experiment is shown in Figure 5a (right). Figures 5b-5e show selected log-permeability solution iterations. The initial permeability is a homogeneous model with a constant log-permeability value set at 20 mD (the left panel in Figure 5a). Our experiments with 50 iterations indicated that in all cases negligible improvements were obtained after 30 iterations.

Figure 2.5: Log-permeability reconstruction results for Experiment 1. (a) Initial log-permeability (left) and true log-permeability (right) with the horizontal well configuration; (b) reconstructed log-permeability at selected iterations of history matching with the 2025-dimensional K-SVD basis using the iteratively reweighted least squares (IRLS) algorithm; (c) reconstruction with the first 20 leading elements of the SVD basis (unregularized least-squares inverse problem); (d) reconstructed log-permeability for the complete SVD basis (2025 coefficients) using the IRLS algorithm; (e) log-permeability reconstruction using the SVD basis with 2025 coefficients and a simple LS solution method.

Figure 2.6: Reconstruction errors and sample observation matches for Experiment 1. (a) RMSE (left) and SSIM error (right) of the log-permeability reconstruction versus iteration number for all methods (SVD(20), KSVD(2025), SVD(2025-IRLS), SVD(2025-LLS)); (b) saturation (left column) and pressure reduction (right column) mismatch for two sample production wells (Wells 15 and 30).

It is important to note that in Experiment 1, all observations are collected from the left and right ends of the domain and no measurements are available from the mid-section of the reservoir.
Therefore, only the dynamic pressure and flow measurements from the two ends of the domain can be used to provide indirect and averaged measurements of the permeability distribution away from the injection and production locations. The dynamic information in the observations is combined with the prior permeability model (as reflected in the learned basis) to reconstruct the solution permeability that best describes the observations. During the iterations, the algorithm tends to select a limited number of basis elements (out of the 2025 available) through l1-norm minimization and estimates their corresponding coefficients in the series expansion (2.1) to match the measurements. Note that the limited available observations can, in general, be matched by combining different subsets of the basis elements. This property results mainly from the underdetermined nature of the problem, the diversity of the K-SVD basis, and the fact that the sparsity level S is unspecified in the inversion algorithm.

Figure 5c shows the permeability iterations when the first 20 leading SVD basis elements are used to parameterize the permeability field. In this case, a linear approximation is used with the SVD basis and no sparsity regularization is needed. It is evident that mainly large-scale patterns are used to identify the solution; however, the final permeability solution does not capture the correct connectivity of the true model. A disadvantage of linear approximation with the SVD approach is that the basis elements are fixed and not selected adaptively through a search based on the flow data; hence, the approximation is restricted to a fixed set of basis elements. On the other hand, the optimization problem has a lower dimension (20 in this case), and hence more favorable computational properties. To make a better comparison between the K-SVD and SVD bases, an additional experiment was performed in which the full-rank SVD basis (K=2025) was used in a nonlinear approximation. In this case, sparsity regularization was included in the inversion algorithm with the full SVD basis. As Figure 5d shows, although the results are not as good as in the K-SVD case, for this example the full SVD basis with sparsity regularization leads to better facies connectivity than the linear approximation with the 20 leading basis elements. It may be argued, however, that applying sparsity regularization to the SVD basis is not well justified, as the SVD representation is not perfectly sparse. An additional experiment was therefore conducted to evaluate the performance of the full SVD basis without the sparsity regularization term. The results are shown in Figure 5e. In this case, several small-scale artifacts appear in the solution. This can be explained by the insensitivity of the observations to the detailed basis components and the lack of a mechanism (such as sparsity regularization) to eliminate them.
A linear approximation with more than 20 SVD elements would be expected to provide an improved solution; however, the number of elements to include in the approximation is not easy to determine. Figure 6a shows the RMSE (left) and SSIM (right) errors of the log-permeability during the iterations. For the K-SVD, the RMSE and SSIM errors decrease consistently, which implies that the iterations move steadily toward the true model both in structural similarity and in grid-based log-permeability. When the full SVD basis is used with sparsity regularization, the RMSE and SSIM errors increase initially (over the first ten iterations) and decrease in later iterations. We observed similar behavior in some other examples in which the full SVD or K-SVD bases were used. This behavior is explained by the choice of $\epsilon_k = \|y - g(\Phi v^{(k)})\|_2^2$, which tends to mainly minimize the data misfit and ignore solution sparsity during the initial iterations. Consequently, the initial iterates tend to be non-sparse and contain many irrelevant small-scale artifacts (see Figure 5d). This behavior is not observed when $\epsilon_k$ is chosen to be a small constant value; however, in that case, convergence to the solution is slower (not shown). Figure 6b shows the observed data match for the saturation and pressure reduction profiles at sample production locations. The data matches for all cases are good, except for the linear approximation with the 20 leading SVD basis elements, which shows a deviation from the observations at one production location.

Example 2

The results in Experiment 1 were obtained for a permeability field with simple straight channels of left-to-right orientation. The channel features in the true model were similar to the most frequently encountered patterns in the prior log-permeability library (i.e., straight left-to-right channels) used to generate the bases. In addition, a relatively large number of observations from the two ends of the reservoir were used to find the solution. In Experiment 2, a 9-spot reservoir configuration with fewer observations is used. Moreover, the true permeability model contains channel features that are observed less frequently in the prior library. The true permeability model is shown in Figure 7a (right). The log-permeability iterations for the 2025-dimensional nonlinear K-SVD, the 20-dimensional linear SVD, and the 2025-dimensional nonlinear SVD approximations (with sparsity regularization) are shown in Figures 7b through 7d, respectively. In this example, the truncated SVD (Figure 7c) and its full-rank version with the sparsity constraint (Figure 7d) were clearly inferior to the K-SVD estimates with sparsity regularization. Since the true permeability model deviates slightly from the main features in the prior library, the linear SVD does not reproduce the log-permeability field accurately. Examination of the permeability iterates in this case shows that the identified features are mostly horizontal, which explains the better performance of this approach in Experiment 1. On the other hand, by taking advantage of the flexibility offered by the full-size basis, the K-SVD approach combines the most relevant basis elements to match the flow data (Figure 7b). As shown in Figure 7d, the performance of the full-rank SVD basis is slightly better than the linear 20-dimensional SVD; however, the estimated final log-permeability is not nearly as good as that obtained with the K-SVD.
It is worth noting that the continuity of the true model is not preserved in the final estimate obtained with the full-rank SVD basis. This lack of continuity can be explained by comparing the basis elements of the SVD and K-SVD algorithms (Figure 1). The K-SVD basis elements have better continuity than the components of the SVD basis. In fact, the lower-ranked SVD basis elements contain small-scale features with insignificant spatial correlation. In contrast, all K-SVD basis elements inherit the large-scale continuity of the prior library, which becomes important when capturing large-scale continuous features. Later sections of this dissertation discuss modifications of the K-SVD basis to better preserve continuity.

Figure 2.7: Log-permeability reconstruction results for Experiment 2. (a) Initial log-permeability (left) and true log-permeability (right) with the 9-spot well configuration; (b) reconstructed log-permeability at selected iterations of the K-SVD approach using 2025 coefficients and the IRLS algorithm; (c) reconstruction of the log-permeability using the SVD basis with the first 20 leading basis elements and LS minimization for history matching; (d) reconstructed log-permeability for the SVD case with 2025 sparse coefficients, using the IRLS algorithm.

Figure 8 summarizes the estimation performance of all three methods. Figure 8a plots the RMSE (left) and SSIM (right) errors in the estimation of the log-permeability for the 30 iterations of the IRLS solution algorithm. The errors are clearly distinguished and consistent with the estimated log-permeabilities in Figure 7. Figure 8b shows the match to the pressure reduction and saturation observations for the three methods. In all cases, very good matches to the observed quantities are obtained. This outcome may appear surprising given the distinct structural differences between the log-permeability solutions. However, it can be explained by the underdetermined nature of the problem, which can lead to many distinct solutions with similar fits to the limited number of observed quantities. The degree of ill-posedness in spatial inverse problems is better appreciated by noting that, as shown in Figures 7 and 8, several non-unique solutions may be obtained even for a simple problem in which the prior model in all three methods is reasonable.
Figure 2.8: Reconstruction errors for Experiment 2. (a) RMSE (left) and SSIM error (right) of the log-permeability reconstruction versus iteration number; (b) saturation (left) and pressure reduction (right) mismatches for two sample wells (Wells 2 and 8).

Example 3

This section presents an example with a more heterogeneous log-permeability model that resembles a multi-Gaussian random field. In this case, the sample permeability models for the library (Figure 1b) are generated using the variogram-based sequential Gaussian simulation (SGSIM) method [107]. The true log-permeability and the 9-spot reservoir configuration are shown in Figure 9a. The initial log-permeability model is a homogeneous field with a permeability value of 1.5 mD. The log-permeability solution iterations are shown in Figures 9b-9d for the K-SVD solution (Figure 9b), the linear 20-dimensional SVD (Figure 9c), and the full-rank nonlinear SVD with sparsity regularization (Figure 9d). The RMSE and SSIM error metrics for the iterations of all methods are shown in Figure 10a. As the results indicate, the K-SVD solution outperforms the alternative methods based on the SVD bases. The linear SVD solution is somewhat too smooth and overestimates the diagonal high-permeability streak in the top-right portion of the domain. The full-rank SVD basis with sparsity regularization performs poorly in this case. This can be attributed, at least in part, to the more heterogeneous nature of the permeabilities in the library, where the small-scale SVD basis elements play an important part. While the same argument applies to the K-SVD basis, the main difference between the two is that the K-SVD basis, while heterogeneous, preserves the large-scale continuity of the field. The data mismatch, shown in Figure 10b, indicates good matches to the observed quantities (for the K-SVD solution), consistent with the estimated log-permeability fields.

Figure 2.9: Log-permeability reconstruction results for Experiment 3. (a) Initial and true log-permeability; (b) estimated log-permeability at selected iterations (KSVD-IRLS); (c) estimated log-permeability at selected iterations (SVD20-LLS); (d) estimated log-permeability at selected iterations (SVD2025-IRLS).

Figure 2.10: Reconstruction errors for Experiment 3. (a) RMSE (left) and SSIM error (right) of the log-permeability reconstruction versus iteration number; (b) saturation (left) and pressure reduction (right) mismatch for two sample wells.
Compressed K-SVD

The three examples in the previous subsections clearly demonstrate the effectiveness of the K-SVD basis and sparsity-promoting regularization with the IRLS algorithm for subsurface flow data integration. While that approach results in accurate and consistent solutions, its implementation with a K=2025 basis requires solving a 2025-dimensional optimization problem, which may not be efficient (especially for realistic large-scale problems). In addition, the samples in the prior model used to generate the basis have similar spatial features (redundancies) that can be reduced by constructing a lower-dimensional basis. This section discusses one approach for efficient implementation of the K-SVD method through dimension reduction. Figures 1-a3 through 1-a7 show the bases generated with K=1000, 400, 200, 100, and 20; in all these cases, the sparsity S is 20, so the corresponding (S/K) ratios are 2%, 5%, 10%, 20%, and 100%, respectively. Similar statements can be made about Figure 1b. The main trade-off in decreasing K is between basis diversity (and hence reconstruction accuracy), the computational complexity of the resulting optimization problem, and the sparsity in the K-SVD transform domain. How sparse the transform-domain representation should be is, in general, related to the available measurements, the complexity of the prior model, the desired approximation quality, and the basis used. In addition to computational considerations for basis reduction, an examination of Figure 1 shows that decreasing K results in a smoother basis with better large-scale connectivity. In generating the reduced basis with the K-SVD approach, smaller-scale variabilities are eliminated first, as they introduce less error in the reconstruction step. Therefore, if the data resolution does not warrant including smaller-scale heterogeneity, or if it is appropriate to promote solution continuity (e.g., in fluvial channels), it may be advantageous to remove the small-scale heterogeneities from the basis by reducing K. Figures 11a and 11b show the final reconstruction results for the channel and the more heterogeneous permeability examples, respectively, for decreasing K values. It is clear that as K decreases, the estimated permeabilities become smoother and more continuous. The smoothness of the solutions at low K values is a consequence of the smoother basis elements obtained for decreasing K in Figure 1. The estimation results are generally acceptable even for K=100. These results imply that the prior model library has significant redundancy, which can also be confirmed by examining the features in the model library (Figure 1). While smoothness may not be desired for problems with heterogeneous permeability, the low-resolution and spatially averaged nature of the flow data may not offer the resolving power required to infer detailed heterogeneity. Hence, all contributing factors must be considered before choosing the number of basis elements (K) to include in the K-SVD basis.
Figure 2.11: Log-permeability reconstruction results for Experiments 2 and 3 with different K values. (a) Final log-permeability solutions for Experiment 2 with decreasing basis sizes (true log-permeability, KSVD2025, KSVD400, KSVD200, KSVD100); (b) the same results for Experiment 3.

Effect of Measurement Noise

The previous experiments did not add observation noise to the data. This section investigates the robustness of the K-SVD basis and the IRLS solution algorithm when the data are corrupted by noise. We consider Experiment 3 and add noise to the observed data. The noise is added as a fraction of the range of variability in the observed quantities. For instance, if the observed pressure varies over a 400-psi range, 10% noise refers to uncorrelated Gaussian noise with zero mean and a standard deviation of 40 psi. The observations are normalized by their corresponding noise variances to remove the effect of data scale and to avoid overfitting.
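A minimal sketch of the noise model described above (Gaussian noise with a standard deviation set to a fraction of the observed data range, returned together with the scale used for normalization) is given below; the function name is an illustrative assumption.

```python
import numpy as np

def add_fractional_noise(y_obs, fraction=0.10, rng=None):
    """Corrupt observations with uncorrelated zero-mean Gaussian noise whose
    standard deviation is a fraction of the observed data range (e.g., 10%
    of a 400-psi pressure range gives sigma = 40 psi). The noise scale is
    returned so the observations can be normalized by it in the misfit."""
    rng = rng or np.random.default_rng()
    sigma = fraction * (np.max(y_obs) - np.min(y_obs))
    return y_obs + rng.normal(0.0, sigma, size=np.shape(y_obs)), sigma
```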
Figure 12 summarizes the final log-permeability solutions for 5% (b), 10% (c), and 20% (d) noise and for different K values (the columns). In general, adding noise to the data is expected to have a stronger impact on the K-SVD solutions with larger K values, as the presence of highly heterogeneous basis elements (for large K) can introduce sensitivity to noise through the fine-detail basis elements, and hence noise fitting. From Figure 12, we observe that at low K values (rightmost column) the solution is least affected by increasing the noise in the measurements. Overall, the main large-scale features in the reconstructed models are preserved as the amount of noise increases. Finally, Figure 13a summarizes the log-permeability reconstruction errors (left and middle) and the total data misfit (right) against the number of iterations for 20% measurement noise. Figure 13b plots the corresponding observation matches for sample wells, which also indicate the robustness of the algorithm to noise, even at relatively low signal-to-noise ratios.

This chapter closes by pointing out that the approach presented in this dissertation assumes that the prior library of models is reliable. The samples in each library were constructed using a known training image (in the channel examples) and variogram parameters (in the heterogeneous case). While these spatial variability models are often estimated from available data before the construction of static models, their estimation and specification may not be error-free. Hence, it would be interesting to evaluate the performance of the approach under structural uncertainty, when the initial library of permeability models is structurally diverse. The uncertainty in the initial library is the topic of the next chapter.

Figure 2.12: Log-permeability reconstruction results for Experiment 3 after adding noise to the observations. (a) The true log-permeability map; (b)-(d) reconstructed log-permeability maps using different K values (columns: KSVD2025, KSVD400, KSVD200, KSVD100) after adding 5% (b), 10% (c), and 20% (d) noise.

Figure 2.13: Reconstruction errors for Experiment 3 with 20% observation noise. (a) The RMSE (left) and SSIM error (middle) of the log-permeability reconstruction, and the total data misfit (right), versus iteration number for KSVD(2025), KSVD(400), KSVD(200), and KSVD(100); (b) saturation (left) and pressure reduction (right) mismatches for two sample wells (Wells 2 and 8).

Conclusions

This chapter demonstrated the application of geologically learned sparse bases to subsurface characterization inverse problems using an iteratively reweighted least squares approach with multiplicative sparsity regularization.
Efficient implementation and important properties of the K-SVD algorithm for generating a sparse prior library (or ensemble) of models were discussed. The K-SVD algorithm offers an effective approach for generating geologically consistent sparse bases for subsurface flow data integration problems from a library of prior models. In many subsurface modeling applications, the lack of a complete property description leads to uncertainty in the representation of the rock property distribution. A common approach to acknowledging and reflecting the uncertainty in the prior model is to generate an ensemble of plausible models, often through geostatistical simulation techniques, that are consistent with the existing data and geologic information. While these models can be diverse in their spatial distribution, the frequently strong spatial correlation intrinsic to geologic formations can result in redundant descriptions in space. A more concise, or sparse, approximation of these models can be obtained by removing the redundancy in the spatial description. The advantage of a sparse description is that it lends itself to very effective and efficient estimation algorithms specially designed for sparse parameters. The K-SVD algorithm provides an effective approach for the sparse description of a given library of models. Therefore, if the prior model is reliable, the solution of the inverse problem is expected to have a sparse representation in the constructed K-SVD basis, and effective reconstruction algorithms can be developed to exploit the sparsity of the solution. When the K-SVD algorithm is combined with sparsity-promoting regularization techniques, a nonlinear approximation problem is obtained in which the solution is found by identifying and combining the relevant basis elements that best explain the dynamic flow data. This nonlinear approximation approach is generally more accurate than linear approximation alternatives such as the truncated SVD, in which the leading singular vectors of the prior model library are linearly combined to form a solution. We have demonstrated the superiority of the K-SVD basis over the classical SVD basis for use with sparsity-promoting estimation algorithms in several flow data integration inverse problems. This chapter also presented some insights into the computational complexity of the K-SVD algorithm and provided reduced-order implementations that can be used to remove the redundancy in the prior library and enforce better connectivity in the expected solution (when warranted). In addition, this study presented results illustrating the robustness of the K-SVD-based flow data integration algorithm with noisy data. The results of the examples with noisy measurements suggest that the solution is less affected by noise when a reduced-dimensional K-SVD basis is used, mainly because of the absence of small-scale basis elements, which show stronger sensitivity to the noise in the measurements.
Overall, the discussions and numerical experiments presented in this chapter indicate that by taking advantage of the solution sparsity ensured through the K-SVD algorithm, sparse reconstruction techniques can be effectively applied to nonlinear subsurface data integration problems. This framework allows for a nonlinear approximation that is more flexible, accurate, and effective than linear approximation methods such as the commonly used truncated (parameterized) SVD. The solution of subsurface inverse problems can be substantially facilitated by taking advantage of the spatial correlation in geologic formations, which manifests itself as sparsity in an appropriate transform domain. In geoscience applications, where prior data are available for generating multiple initial models, methods such as K-SVD that generate a geologically learned basis are more consistent and effective than generic data compression techniques such as Fourier or wavelet domain methods. While the regular SVD approach also exploits the prior geologic models in the basis construction, it does not necessarily yield a perfectly sparse representation and fails to adequately represent the continuity in the prior models with a small number of parameters. In addition, when a fixed number of the leading singular vectors of the SVD approach is used, a less effective linear approximation is obtained, which may fail to capture features in the true model that slightly deviate from the dominant modes of the prior library. Overall, the K-SVD approach provides a promising data integration approach for geoscience applications, as illustrated in this chapter.

A major challenge that limits the applicability of the proposed framework in practical cases is the uncertainty in the prior model. Since the K-SVD basis is constructed from prior models, it may have difficulty estimating features that significantly deviate from those in the prior ensemble of models. It is therefore important to ensure that the prior model is sufficiently diverse to encompass a realistic range of variability. Accounting for prior uncertainty is one of the challenging aspects of any data integration method, including the approach presented in this chapter. In addition to the incorporation of prior uncertainty, it is important to characterize the uncertainty in the solution, which is usually achieved through stochastic or ensemble-based data integration methods. The study presented in the next chapter is directed toward addressing the issues related to prior uncertainty and toward ensemble methods that can exploit the intrinsic sparsity in subsurface property distributions.

Appendix A: Structural Similarity Error

Experimental studies with synthetic true models provide an opportunity to assess the quality of the estimated solution in representing the true parameters. In the estimation of spatial parameters, a typical metric for quantifying estimation performance is the root-mean-squared error (RMSE) between the estimated and true parameters. While RMSE provides an intuitive and convenient error metric, it does not account for the structural continuity of spatial parameters, which is very important in subsurface characterization and modeling. In this work, we have used an additional performance metric for assessing the quality of the reconstructed spatial parameters, known as the Structural SIMilarity (SSIM) error.
This metric is borrowed from the image processing literature [54], where it is used to quantify perceived image quality under the assumption that human visual perception is highly adapted to extracting structural information from a scene. Since fluid displacement in the subsurface is strongly affected by the structural continuity of the physical properties of the rock, and in particular permeability, we have used the SSIM error metric to evaluate the structural similarity between the estimated and true permeabilities. For two images X and Y, the general SSIM index is obtained by calculating three sub-indices:

Illumination: $I = \dfrac{2\mu_X \mu_Y + c_1}{\mu_X^2 + \mu_Y^2 + c_1}$ (A1)

Contrast: $C = \dfrac{2\sigma_X \sigma_Y + c_2}{\sigma_X^2 + \sigma_Y^2 + c_2}$ (A2)

Structure: $S = \dfrac{\sigma_{XY} + c_3}{\sigma_X \sigma_Y + c_3}$ (A3)

where $\mu_X$, $\mu_Y$ and $\sigma_X$, $\sigma_Y$ represent the spatial means and standard deviations, respectively, and $\sigma_{XY}$ is the covariance of the two images. The parameters $c_1$, $c_2$, and $c_3$ are small constants. The illumination index compares the regional intensity of the two images (i.e., the spatial average of the permeability values), and the contrast index compares the variations in each defined region. As the name suggests, the structure index is a metric related to the structural variability between the two images. With the above sub-indices, the overall SSIM metric is defined as

$\mathrm{SSIM} = I^{\alpha_I}\, C^{\alpha_C}\, S^{\alpha_S}$ (A4)

where the choice of $\alpha_I$, $\alpha_C$, and $\alpha_S$ determines the importance of each sub-index in defining the similarity between the two images. In this work, we have used $\alpha_I = \alpha_C = 0$ and $\alpha_S = 1$, and we have defined the SSIM error as $\mathrm{SSIM}_{\mathrm{Error}} = (1 - \mathrm{SSIM})/2$.
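To make the appendix concrete, the sketch below computes the structure-only SSIM error used in this work ($\alpha_I = \alpha_C = 0$, $\alpha_S = 1$). It is a minimal single-window (global) implementation; the stabilizing constant `c3` is an assumed small value, and the windowed averaging of [54] is omitted.

```python
import numpy as np

def ssim_error(x, y, c3=1e-6):
    """Structure-only SSIM error between two spatial parameter fields.

    With alpha_I = alpha_C = 0 and alpha_S = 1, only the structure index
    S = (sigma_XY + c3) / (sigma_X * sigma_Y + c3) of Eq. (A3) is active,
    and the error is defined as (1 - SSIM) / 2.
    """
    x, y = np.asarray(x, float).ravel(), np.asarray(y, float).ravel()
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))   # sigma_XY
    s = (cov_xy + c3) / (x.std() * y.std() + c3)        # structure sub-index
    return (1.0 - s) / 2.0                              # SSIM error in [0, 1]

# Identical maps give zero error; sign-flipped maps give the maximum error.
true_map = np.random.rand(50, 50)
print(ssim_error(true_map, true_map))    # ~0.0
print(ssim_error(true_map, -true_map))   # ~1.0
```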
Chapter III. PRIOR MODEL ERROR IDENTIFICATION

In conventional subsurface data integration, as in the previous chapter [37], [38], [20], the main focus has been placed on estimating reservoir property distributions from the nonlinear observed data under a "known" geologic continuity model (e.g., a covariance model). Hence, the dynamic data are not used to infer the continuity model itself. In principle, however, the nonlinear flow data can be expected to carry important information about the global connectivity in the field, and can be used to constrain the variability in the geologic continuity [108]. This chapter aims to achieve that goal by exploiting the feature selection property of sparse signal processing techniques, recently formalized under the compressed sensing paradigm [66], [69], [109]. To this end, this study has formulated an inversion approach that uses the dynamic production data not only to estimate the spatial distribution of the flow properties (permeability in this chapter), but also to identify the continuity model by correcting errors in specifying the structures of the prior continuity model.

In a number of previous publications [70], [85], [110], [111], an effective transform-domain regularization approach has been presented to solve subsurface characterization inverse problems by minimizing a typical least-squares objective function augmented with sparsity-promoting penalty functions. In this context, sparsity implies that out of the many basis elements (possible features) that can be combined to represent a property map, only a small fraction, those with relevant connectivity features, is needed to represent the property distribution. The main objective of these techniques is, therefore, to first generate a collection of geologic features that can sparsely represent the reservoir property map, and then to exploit the feature selection property of sparse reconstruction algorithms (see, e.g., [66], [100], [109]) to identify the features that significantly contribute to reproducing the observed data. Hence, the key steps of a sparse reconstruction approach are: (i) adopting or designing an appropriate sparse representation method using a diverse set of plausible connectivity features, and (ii) formulating and solving an inversion algorithm to recover the unknown property map by selecting and combining the (few) relevant features from the proposed initial features. The first component, sparse representation, can be obtained through generic compressive transforms such as the DCT ([112], [113], [114]), DWT ([115], [116]), and GCT ([117]), or through more specialized sparse dictionaries learned from prior training data, such as the K-SVD dictionary introduced in the previous chapter. Once the basis elements (features) of a sparse representation are adopted, the inverse problem is reduced to the selection of the significant features and the estimation of their contributions in recovering the property map (based on available linear or nonlinear data). The feature selection step of this framework leads to a non-trivial optimization problem that has been extensively studied in the past. The compressed sensing paradigm presented by [69] constitutes a significant breakthrough in sparse signal processing and the solution of sparse reconstruction problems.

A powerful aspect of sparse reconstruction algorithms, which the prior model identification method in this chapter exploits, is their feature selection property. Here, this study includes a mixture of diverse features, derived from different prior models, and develops an adaptive sparse reconstruction algorithm to identify which model(s) are most relevant to the data. The proposed approach first builds a diverse geologic dictionary (a collection of geologic features) with an equal number of model realizations generated from each of several variogram models. In this study, the dictionary elements are permeability realizations generated with the Sequential Gaussian Simulation (SGS) algorithm ([107]), as in Chapter II, using different variogram (covariance) models. Each variogram model represents a cluster of model realizations that can be incorporated, as plausible features, into the diverse geologic dictionary. During the nonlinear data integration iterations, the proposed framework monitors the contribution of each cluster in reconstructing the solution and uses that contribution to proportionally adjust the number of elements representing the cluster in the next iteration. As a result, while the total number of elements in the dictionary remains the same, the contribution of each variogram progressively changes during the iterations, with an increasing (decreasing) number of elements from the consistent (inconsistent) variograms. This leads to the elimination of irrelevant continuity models and the identification of the model(s) most relevant to the data.
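To make this dictionary construction concrete, the sketch below assembles such a mixture dictionary from cluster-wise realizations. The function `generate_sgs_realization` is a hypothetical stand-in for an SGS simulator, since the text does not tie the construction to a specific implementation; columns are normalized to unit l2 norm, as described later in this chapter.

```python
import numpy as np

def build_mixture_dictionary(clusters, n_cells, k_total, generate_sgs_realization, rng):
    """Assemble a diverse dictionary Phi = [Phi_1 | Phi_2 | ... | Phi_Nc].

    clusters : list of dicts; each maps a variogram parameter name
               ('theta', 'a_max', 'a_min') to its (low, high) interval
    generate_sgs_realization : hypothetical callable returning one
               unconditional realization (length n_cells) for given parameters
    """
    per_cluster = k_total // len(clusters)        # equal initial allocation
    columns, labels = [], []
    for j, interval in enumerate(clusters):
        for _ in range(per_cluster):
            # Draw variogram parameters uniformly within the cluster's ranges
            params = {name: rng.uniform(lo, hi) for name, (lo, hi) in interval.items()}
            u = generate_sgs_realization(params, n_cells)
            columns.append(u / np.linalg.norm(u))  # unit l2-norm column
            labels.append(j)                       # record cluster membership
    return np.column_stack(columns), np.array(labels)

# Example cluster definition: five intervals of the continuity direction theta.
clusters = [{'theta': (lo, lo + 36), 'a_max': (50, 110), 'a_min': (20, 50)}
            for lo in range(-90, 90, 36)]
```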
The remainder of Chapter III presents the adaptive prior model identification approach and demonstrates its effectiveness using a few numerical waterflooding experiments.

Sparse reconstruction algorithms such as LASSO, BP, and FOCUSS can be interpreted as feature selection methods [118], in which the outputs of a system (observations) are used to identify the model components (features) responsible for generating them. This view provides the motivation behind the approach proposed in this chapter. The working hypothesis here is that very few dominant features can be selected (out of a large collection of plausible features), which can then be properly combined to produce the observed response of the system (reservoir). Interestingly, the selection property of the formulation implies that the irrelevant features (i.e., inactive basis components) included in the selection problem do not degrade the solution, since they will be assigned zero weights. This selection property introduces an important process through which it is possible to begin with a collection of diverse model components as possible features and use the observed data to discriminate against models with little or no contribution to reproducing the observed system response. As such, sparse reconstruction methods present an opportunity to incorporate a significant level of uncertainty into the initial geologic models and use the reservoir production response to identify the consistent prior model.

This study uses the sparse reconstruction formulation provided in the previous chapter to implement an adaptive model selection algorithm that identifies "geologically relevant" model components in a diverse mixture of prior model elements. Here, adaptive implementation refers to the capability to update the mixture of prior model components during model calibration based on the contribution of each prior model in reconstructing the solution at the current iteration. To illustrate this concept, consider a two-dimensional heterogeneous permeability distribution in a reservoir, and use a diverse mixture of geologic continuity features from variograms with random parameters to describe the uncertainty in the knowledge about the variogram. The three main parameters of a two-dimensional variogram model are the direction of continuity $\theta$, the maximum continuity length $a_{max}$, and the minimum continuity length $a_{min}$. Assuming that a variogram-based description is appropriate for the reservoir under consideration, the proper specification of these three parameters is critical for retrieving the correct spatial continuity in the true model. Here, this chapter assumes that the true variogram parameters are unknown and uses the prior model identification (PMI) approach to identify these parameters based on the dynamic response of the subsurface model. For completeness, the sparse reconstruction formulation proposed in the previous chapter is briefly reviewed below, followed by a detailed presentation of the prior model identification method that is inspired by it.

Iteratively Reweighted Least Squares (IRLS)

This thesis uses the IRLS algorithm to minimize the lp-norm-regularized least-squares objective function. The IRLS algorithm approximates the lp norm with a weighted l2 norm in which the weights are updated at each iteration, based on the solution from the previous iteration, to progressively promote sparsity.
The sparsity-regularized objective function in the IRLS algorithm is expressed as

$J(\mathbf{v}) = \left(\mathbf{g}(\mathbf{v}) - \mathbf{d}_{obs}\right)^{T} \mathbf{C}_w \left(\mathbf{g}(\mathbf{v}) - \mathbf{d}_{obs}\right) + \lambda\, SP(\mathbf{v})$ (3.1)

where $\mathbf{C}_w$ is the data weighting (inverse data covariance) matrix and the regularization term $SP(\mathbf{v})$ is given, at iteration (it+1), by

$SP(\mathbf{v}^{it+1}) = \left\| \mathbf{W}^{it}\, \mathbf{v}^{it+1} \right\|_2^2 = \sum_{i} w_{i,i}^{it} \left(v_i^{it+1}\right)^2 \approx \sum_{i} \left| v_i^{it+1} \right|^p$ (3.2)

The small constant $\epsilon$ below is introduced to ensure numerical stability at the origin. Note that at any current iteration (it+1), the diagonal weighting matrix $\mathbf{W}^{it}$ lags one iteration (it is calculated from the solution at the previous iteration, it), i.e.,

$w_{i,i}^{it} = \left( \left(v_i^{it}\right)^2 + \epsilon \right)^{\frac{p-2}{2}}$ (3.3)

With this choice of the weighting matrix, the regularization function $SP(\mathbf{v})$ effectively becomes an lp norm at convergence. One remaining issue, as in any regularized minimization problem, is the choice of the regularization parameter $\lambda$. Existing methods for specifying the regularization parameter in the linear case, such as the L-curve and cross-validation [119], [120], [121], [122], are only computationally practical for fast models and lose their effectiveness in nonlinear subsurface flow data integration problems, where many time-consuming flow simulations must be performed to compute the objective function. [85] introduced a multiplicative objective function to circumvent the need for specifying the regularization parameter $\lambda$. They showed that with a special choice of the sparsity regularization term, the multiplicative regularization effectively becomes an additive regularization with parameters that are more predictable than $\lambda$ in the additive case. This study uses the following choices for $\epsilon$ and $\lambda$:

$\epsilon^{it+1} = \phi_d(\mathbf{v}^{it}) = \left\| \mathbf{C}_w^{1/2} \left( \mathbf{g}(\mathbf{v}^{it}) - \mathbf{d}_{obs} \right) \right\|_2^2$ (3.4)

$\lambda^{it+1} = \frac{\phi_d(\mathbf{v}^{it})}{SP(\mathbf{v}^{it})} = \frac{\left\| \mathbf{C}_w^{1/2} \left( \mathbf{g}(\mathbf{v}^{it}) - \mathbf{d}_{obs} \right) \right\|_2^2}{SP(\mathbf{v}^{it})}$ (3.5)

Finally, a gradient-based minimization of the resulting regularized objective function requires access to the Jacobian matrix $\mathbf{G}_v = \partial \mathbf{g}(\mathbf{v}) / \partial \mathbf{v}$, containing the gradients of the simulated measurements $\mathbf{g}(\mathbf{v})$ with respect to the coefficient vector. Efficient adjoint-based methods are becoming increasingly available for calculating the gradients of the nonlinear measurement data (well response data) with respect to distributed reservoir properties (e.g., permeability and porosity). The unknown parameter of the inverse model in this case is the logarithm of the desired property, i.e., $\mathbf{u} = \log(\mathbf{k})$. The adjoint-calculated gradient information is typically provided with respect to the permeability itself, i.e., $\mathbf{G}_k = \partial \mathbf{g}(\mathbf{k}) / \partial \mathbf{k}$. Given the relations $\mathbf{u} = \mathbf{\Phi}_n \mathbf{v}$ and $\mathbf{u} = \log(\mathbf{k})$, the chain rule of differentiation can be invoked to convert the simulated Jacobian $\mathbf{G}_k$ to the desired Jacobian matrix $\mathbf{G}_v$, as follows:

$\mathbf{G}_v = \dfrac{\partial \mathbf{g}}{\partial \mathbf{k}} \dfrac{\partial \mathbf{k}}{\partial \mathbf{u}} \dfrac{\partial \mathbf{u}}{\partial \mathbf{v}} = \mathbf{G}_k\, \mathbf{K}\, \mathbf{\Phi}_n$ (3.6)

where $\mathbf{K} = \partial \mathbf{k} / \partial \mathbf{u}$ is a diagonal matrix whose diagonal equals the entries of $\mathbf{k}$, and $\partial \mathbf{u} / \partial \mathbf{v} = \mathbf{\Phi}_n$. With these details, the (it+1) iteration of Newton's method can be derived as

$\mathbf{D}(\mathbf{v}^{it})\, \mathbf{v}^{it+1} = \mathbf{d}^{it}$ (3.7)

with

$\mathbf{D}(\mathbf{v}^{it}) = \mathbf{G}_v^T \mathbf{C}_w \mathbf{G}_v + \lambda\, \mathbf{W}(\mathbf{v}^{it})$ (3.8)

$\mathbf{d}^{it} = \mathbf{G}_v^T \mathbf{C}_w \left( \mathbf{d}_{obs} - \mathbf{g}(\mathbf{v}^{it}) + \mathbf{G}_v \mathbf{v}^{it} \right)$ (3.9)

where the entire Jacobian matrix $\mathbf{G}_v$ is calculated at $\mathbf{v}^{it}$. Note that in deriving the Newton iteration, the first-order Taylor expansion $\mathbf{g}(\mathbf{v}^{it+1}) \approx \mathbf{g}(\mathbf{v}^{it}) + \mathbf{G}_v (\mathbf{v}^{it+1} - \mathbf{v}^{it})$ has been introduced, and the Jacobian matrix $\mathbf{G}_v$ can be readily calculated from $\mathbf{G}_v = \mathbf{G}_k \mathbf{K} \mathbf{\Phi}_n$.
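To make the iteration concrete, the sketch below implements one IRLS/Gauss-Newton update of Eqs. (3.1)-(3.9) for a generic linearized problem. It is a schematic under simplifying assumptions (identity data weighting $\mathbf{C}_w = \mathbf{I}$, Jacobian supplied by the caller, fixed regularization parameters), not the adjoint-based reservoir-simulator implementation used in the dissertation.

```python
import numpy as np

def irls_update(v, G, g_of_v, d_obs, lam=0.1, p=1.0, eps=1e-8):
    """One IRLS / Gauss-Newton update for the lp-regularized problem.

    v      : coefficient vector at the current iteration
    G      : Jacobian of g with respect to v at the current iterate (Eq. 3.6)
    g_of_v : simulated data g(v) at the current iterate
    d_obs  : observed data
    Note: the dissertation selects lam and eps adaptively through the
    multiplicative rules of Eqs. (3.4)-(3.5); fixed values are used here.
    """
    # Lagged IRLS weights (Eq. 3.3): w_i = (v_i^2 + eps)^((p - 2) / 2)
    w = (v**2 + eps) ** ((p - 2.0) / 2.0)
    # Newton system (Eqs. 3.7-3.9) with identity data weighting C_w = I
    D = G.T @ G + lam * np.diag(w)
    rhs = G.T @ (d_obs - g_of_v + G @ v)
    return np.linalg.solve(D, rhs)

# Toy usage with a linear forward model g(v) = G v and a 3-sparse truth:
rng = np.random.default_rng(0)
G = rng.normal(size=(30, 100))
v_true = np.zeros(100); v_true[[3, 40, 77]] = [2.0, -1.5, 1.0]
d_obs = G @ v_true
v = 0.01 * (G.T @ d_obs)                 # rough starting guess
for _ in range(25):
    v = irls_update(v, G, G @ v, d_obs)
print(np.flatnonzero(np.abs(v) > 0.5))   # expected to recover indices 3, 40, 77
```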
Sparsity-Promoting Prior Model Identification

As discussed above, the feature selection property of sparse reconstruction algorithms makes it possible to begin with a collection of diverse model components as candidate features and to use the observed data to discriminate against models with little or no contribution to reproducing the observed system response; irrelevant features are assigned zero weights and do not degrade the solution. This property is exploited here, through the sparse reconstruction formulation reviewed above, to implement an adaptive model selection algorithm that identifies "geologically relevant" model components in a diverse mixture of prior model elements. The true variogram parameters ($\theta$, $a_{max}$, $a_{min}$) are treated as unknown and are identified from the dynamic response of the reservoir using the model error identification (MEI) approach.

Proposed Method for Model Error Identification

The proposed model selection algorithm begins by dividing the possible range of each variogram parameter into L intervals (clusters). Within each interval, the variogram parameters are described with a uniform distribution over the range specified for that interval. In each cluster, random samples of the variogram parameters are then drawn and used to generate unconditional realizations of the desired property map (permeability in this case). Initially, the same number of realizations is assigned to each cluster. The clusters of permeability realizations are then mixed to construct a redundant dictionary $\mathbf{\Phi} = \left[ \mathbf{\Phi}_1 \,|\, \mathbf{\Phi}_2 \,|\, \cdots \,|\, \mathbf{\Phi}_{N_c} \right]$ as a mixture prior model, where each cluster $\mathbf{\Phi}_i$ contains $K/N_c$ prior samples.
The permeability realizations are vectorized, normalized to have unit l2 norm, and assembled as the columns of $\mathbf{\Phi}_i$. The resulting diverse dictionary is used in conjunction with the dynamic (production) data in the IRLS algorithm to identify the relevant prior model. Since the included clusters exhibit very different continuity features, very few prior elements in the mixture are expected to be relevant; hence, the solution is expected to have a sparse representation in the mixture dictionary. A main feature of the proposed adaptive implementation is that, during the model calibration (minimization) iterations, the mixture dictionary $\mathbf{\Phi}$ is updated by replacing the irrelevant elements of the dictionary (those with negligible coefficients) with new elements from clusters that contribute significantly to reproducing the output measurements. In this process, the number of samples in each cluster changes while the total number of features in the dictionary remains the same. The idea behind this adaptive implementation is that a greater contribution (number of active elements) from one cluster indicates the consistency of the features in that cluster with the observed data; therefore, additional elements from that cluster are likely to improve the solution. The number of elements newly added to each cluster is proportional to the number of active realizations in that cluster. The steps of the MEI algorithm are summarized in Table 2.

Table 2. Implementation steps of the proposed MEI algorithm

If the support of v is smaller than N0 (threshold) and iteration > 1:
1) Remove the (n - s) smallest-magnitude entries from v, and the corresponding elements from $\mathbf{\Phi} = \left[ \mathbf{\Phi}_1 \,|\, \mathbf{\Phi}_2 \,|\, \cdots \,|\, \mathbf{\Phi}_{N_c} \right]$.
2) Compute the histogram of cluster weights using $C_j = (\text{number of active elements in } \mathbf{\Phi}_j) / n$.
3) Generate (n - s) new realizations according to the cluster weights defined in (2), using the corresponding variogram parameters.
4) Reset the IRLS weight matrix $\mathbf{W} = \mathrm{diag}([1, 1, \ldots, 1]^T)$.
5) Go to Step (6).
Otherwise:
6) Solve the sparse reconstruction problem.
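A minimal sketch of the adaptive update in Table 2 is shown below. It assumes cluster-tagged dictionary columns and a hypothetical per-cluster generator `draw_from_cluster` (e.g., wrapping the SGS sampler sketched earlier); the sparse reconstruction solve of Step 6 is omitted.

```python
import numpy as np

def mei_dictionary_update(v, Phi, labels, n_clusters, s, draw_from_cluster, rng):
    """Adaptive prior-mixture update (Steps 1-4 of Table 2).

    v      : current coefficient vector, one entry per dictionary column
    labels : cluster index of each column (integers in [0, n_clusters))
    s      : number of columns to keep; the (n - s) weakest are replaced
    """
    n = v.size
    keep = np.argsort(np.abs(v))[n - s:]               # Step 1: keep s strongest
    # Step 2: cluster weights C_j from the surviving (active) columns
    counts = np.bincount(labels[keep], minlength=n_clusters)
    weights = counts / counts.sum()
    # Step 3: redraw the (n - s) replaced columns in proportion to the weights
    new_labels = rng.choice(n_clusters, size=n - s, p=weights)
    new_cols = np.column_stack([draw_from_cluster(j) for j in new_labels])
    Phi_new = np.column_stack([Phi[:, keep], new_cols])
    labels_new = np.concatenate([labels[keep], new_labels])
    v_new = np.concatenate([v[keep], np.zeros(n - s)])  # new columns start inactive
    W = np.eye(n)                                       # Step 4: reset IRLS weights
    return v_new, Phi_new, labels_new, W
```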
Numerical Experiments and Discussion

In this section, we examine the performance of the MEI approach using two numerical experiments. The first example is based on a synthetic subsurface Gaussian heterogeneous model for which the reference model and its corresponding variogram are known, while in the second example the permeability model is taken from the top layer of the SPE10 benchmark model, for which the prior connectivity model is more complex and unknown.

Example 1: Synthetic 2D Model

As a motivating example, we use a 50x50 numerical experiment in a two-dimensional two-phase (oil-water) system to study the importance of accounting for variogram uncertainty in model calibration. The reservoir is under a 9-spot pattern-drive waterflooding with one injector in the center and 8 producers symmetrically placed along the edges of the subsurface model. Nonlinear observations were collected approximately every 90 days by simulating the fluid-flow PDEs with the reference model. Only the data from the first four years were integrated, and the remaining two years of data were used to test the predictive quality of the estimated model. In this example, the major source of uncertainty was assumed to be the direction of major (structural) continuity in the variogram model. The entire range of continuity directions was divided into five clusters (intervals), and 200 samples were initially used to represent each cluster. This led to a linear expansion with 1000 terms to approximate the true model.

Note that since the (log-permeability) models are Gaussian, we use a linear combination of the sample models (log-permeabilities) to approximate the solution. Principal Component Analysis (PCA) could be used as an alternative, with the linear expansion written over the leading principal components of each cluster. The sparsity assumption is justified by the contrast in the direction of continuity across the clusters, which suggests that the majority of the samples have continuity directions inconsistent with the reference model. Figure 3.1 shows 25 samples for each of the 5 clusters of log-permeability models, which together cover the entire range of continuity directions, i.e., $\theta \in [-90°, +90°]$. The dynamic production data were used to estimate the heterogeneous reference model, which in this case was consistent with the dominant direction of continuity in Cluster 4 (see Figures 3.1 and 3.2a). The variogram parameters used to generate the reference model are shown in Table 3.

Table 3. Geostatistical simulation parameters for the reference model in Example 1
  Property: 50x50 permeability model
  Kriging type: Ordinary kriging
  Maximum conditioning data: 12
  Variogram type: Spherical
  Ranges (a_max, a_med): 50, 10
  Angles (azimuth, dip, rake): 30, 0, 0
  Mean value (log-perm): 4
  Standard deviation (log-perm): 1

The model calibration iterations began with an initial (guessed) model with an incorrect continuity direction (Figure 3.2b). The maximum number of iterations was set to 60; however, after 40 iterations no significant changes were observed in the model or in the cost function. Figure 3.2 shows the calibration results for the reference model in Figure 3.2a. Figure 3.2b shows the initial model (top) and the first 25 significant elements of the starting mixture prior used for the calibration process (bottom). Figure 3.2c depicts the inversion solution (top) and the first 25 (out of 1000) significant expansion functions selected from the mixture prior samples (bottom) using the proposed approach. The continuity direction of the selected samples in Figure 3.2c (bottom) is consistent with the true model shown in Figure 3.2a, indicating that the algorithm could successfully identify the consistent direction of continuity. For comparison, Figure 3.2d shows the estimated model obtained using a consistent prior continuity direction (not a mixture). The solution obtained from the proposed method is similar to that from the (oracle) correct prior model. In both cases, however, the low-permeability trend in the lower-left region of the reference model is not captured. This can be partly attributed to the limitations of the observations and to the fact that static measurements at the wells were not integrated into the data used. Based on the continuity directions assigned to each cluster, the correct direction of continuity is identified to be in the interval $\theta \in [18°, 54°]$.

Figure 3.1: Sample log-permeability realizations from five different clusters (Clusters 1-5).
Figure 3.2: Log-permeability reconstruction results for the sparsity-promoting PMI algorithm. (a) Reference log-permeability model and well setup; (b) the initial log-permeability model (top) and the corresponding significant components from the mixture prior models (bottom); (c) estimated log-permeability (top) and its corresponding significant expansion functions from the adapted mixture library (bottom); (d) estimated log-permeability (top) and its corresponding significant expansion functions from the "correct" prior model (bottom). In each case, 10% observation noise is added to the measurements.

The cluster contributions to the solutions for this example are shown in Figure 3.3. Figure 3.3a displays the contribution of each cluster to the initial (incorrect) log-permeability model at iteration 0; the initial model in this case was taken from Cluster 2. Figure 3.3b shows the cluster contributions at iteration 40, where Cluster 4 becomes dominant. The contribution of Cluster j is denoted as $C_j$ and calculated as

$C_j = \dfrac{\sum_{i \in \mathbf{\Phi}_j} |v_i|}{\sum_{i=1}^{n_v} |v_i|}$ (3.22)

During the model calibration iterations, cluster contributions were adaptively adjusted through weighted resampling from the five prior models. This adaptive adjustment shifted the dominant continuity direction from Cluster 2 (initially) to Cluster 4 (at the solution). The expansion coefficients corresponding to each cluster at the initial and final iterations are shown in Figure 3.4, which presents the same result in a different way.

Figure 3.3: Cluster contributions for (a) the initial model (Clusters 1-5: 11.06%, 71.17%, 9.51%, 3.89%, 4.37%) and (b) the calibrated model (Clusters 1-5: 0.00%, 0.00%, 0.23%, 99.77%, 0.00%).

Note that in Figure 3.4b most of the samples are drawn from Cluster 4. Figure 3.5 shows the evolution of the histogram of the cluster contributions at the iterations in which the prior mixture model is updated. The initial samples are uniformly distributed across the five clusters. However, with each prior model update, the histogram shifts closer to the consistent cluster (Cluster 4), i.e., more samples from Cluster 4 are included in the prior model. Note that throughout the iterations the number of terms in the linear expansion remains constant (1000), and only the contribution of each cluster changes. The matched and predicted flow time-series resulting from the MEI approach are shown in Figure 3.6. The water breakthrough time (i.e., the time at which the water phase is first observed at a producing well) in Producer 3 occurs during the prediction stage (after four years of data integration). Nevertheless, the MEI approach correctly predicts the breakthrough time and the water-cut (i.e., the ratio of produced water to total produced liquid) trends in Producer 3.

Figure 3.4: Cluster contribution weights. The initial (a) and calibrated (b) coefficient vector v, showing the cluster contributions in each case.

Figure 3.5: Evolution of the cluster contribution histograms for the 5 clusters used in Example 1, at the iterations in which the clusters are updated.
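Before moving to Example 2, note that the contribution metric of Eq. (3.22) is straightforward to compute from the coefficient vector and its cluster labels; the snippet below illustrates this with made-up numbers.

```python
import numpy as np

def cluster_contributions(v, labels, n_clusters):
    """Eq. (3.22): C_j = sum of |v_i| over cluster j divided by the sum of all |v_i|."""
    abs_v = np.abs(v)
    per_cluster = np.bincount(labels, weights=abs_v, minlength=n_clusters)
    return per_cluster / abs_v.sum()

# Hypothetical 1000-term mixture with five 200-element clusters:
rng = np.random.default_rng(3)
labels = np.repeat(np.arange(5), 200)
v = np.zeros(1000)
v[np.flatnonzero(labels == 3)[:20]] = rng.normal(size=20)  # Cluster 4 (index 3) active
v[10] = 0.05                                               # one stray Cluster 1 element
print(np.round(100 * cluster_contributions(v, labels, 5), 2))  # contributions in percent
```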
Example 2: SPE10 (Benchmark Model): Top Layer

The chapter's second numerical experiment considers the top layer of the SPE10 benchmark reservoir model with 60x220 = 13200 grid blocks. Figure 3.7a shows the reference model and well configurations. In this experiment, we used two adjacent 9-spot patterns with three water injection wells in the center and 12 production wells located on the edges of the subsurface model. Two years of model calibration followed by four years of model response prediction are considered to check the quality of the final estimated model. For this example, variogram uncertainty involves not only the direction of continuity, but also the maximum and minimum ranges. The uncertainty intervals and the corresponding clusters are tabulated in Table 4.

Figure 3.6: Production forecasts for the PMI method for the true, initial, and history-matched permeability models: (a) well water-cut, (b) total oil production, and (c) oil production rates for Producers 3 and 5.

There are five intervals for the direction of maximum continuity $\theta$ and three intervals for each of the maximum and minimum variogram ranges $a_{max}$ and $a_{min}$, resulting in a total of 5x3x3 = 45 clusters. In this example, the variogram parameters for the reference model are not known. The calibration process starts with a uniform and homogeneous model, shown in Figure 3.7b. Figure 3.7c shows the updated model obtained with the PMI algorithm for p = 1. As reported in [70], replacing the original l0 norm with the l1 norm can lead to underestimation of the active coefficients. This effect can be seen in the underestimated high-permeability areas (corresponding to large positive expansion coefficients) and overestimated low-permeability regions (large negative expansion coefficients). As shown in Figure 3.7d, this issue can be addressed by using a smaller p value. In [123], I introduced an implementation in which p varied with the iterations, starting with p = 1 at the first iteration and ending with p = 0.01 at the final iteration (using a simple functional relation described in [123]). Figure 3.7d shows that assigning smaller values of p at later iterations can mitigate the property value underestimation observed in Figure 3.7c.
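The iteration-dependent norm exponent described above can be sketched as a simple schedule. The geometric form below is an assumed illustration; the text only states that p decreases from 1 at the first iteration to 0.01 at the last through a simple functional relation given in [123].

```python
import numpy as np

def p_schedule(n_iters, p_start=1.0, p_end=0.01):
    """Assumed geometric decay of the norm exponent p across IRLS iterations.

    Starting near p = 1 keeps the early problem close to convex; lowering p
    toward 0 in later iterations sharpens sparsity and counteracts the
    l1 norm's tendency to underestimate large active coefficients.
    """
    return p_start * (p_end / p_start) ** (np.arange(n_iters) / (n_iters - 1))

print(np.round(p_schedule(10), 3))
# [1.    0.599 0.359 0.215 0.129 0.077 0.046 0.028 0.017 0.01 ]
```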
Table 4. Cluster contributions (in percent) for the final log-permeability model obtained using the MEI algorithm with lp-norm minimization in Example 2. Within each $a_{max}$ group, the three values correspond to $a_{min}$ in [20,30), [30,40), and [40,50].

  theta interval     a_max [50,70)      a_max [70,90)      a_max [90,110]
  1: [-90, -54)      0.13 / 0 / 0       4.3  / 0 / 0       9.8  / 0 / 0
  2: [-54, -18)      15.8 / 0.02 / 0    0    / 0 / 0       21.2 / 0 / 0
  3: [-18, +18)      0    / 0 / 0       2.2  / 0 / 0       0.06 / 0 / 0
  4: [+18, +54)      0    / 0 / 0       0    / 0 / 0       9.04 / 0 / 0
  5: [+54, +90]      6.1  / 0 / 0       31.3 / 0 / 0       0    / 0 / 0

Cluster contributions for solutions with varying levels of measurement noise are shown in Figure 3.8. Figure 3.8a displays sample models from each cluster (a total of 45). Figures 3.8b through 3.8d display the contribution of each cluster to reconstructing the calibrated model, using the same scale as in Figure 3.8a. For the case of lp-norm minimization, the cluster with $\theta \in [54°, 90°]$, $a_{max} \in [70, 90)$, and $a_{min} \in [20, 30)$ has the highest contribution to the final log-permeability estimate. The normalized contributions of each cluster, computed with Eq. (3.22), are summarized in Table 4.

Figure 3.7: Reconstruction results for Example 2. Reference (a), initial (b), l1-regularized (c), and lp-regularized (with p decreasing from 1 to 0.01) (d) log-permeability models. The reconstruction in each case is carried out after adding 5% noise to the data.

Figure 3.8: Realizations and cluster contributions for Example 2. Mixture log-permeability realizations (a), and cluster contributions for reconstruction with (b) no noise, (c) 5% noise, and (d) 10% noise. The cluster specifications and contributions are provided in Table 4.

Figure 3.9 shows the production forecasts at the producers and the pressure predictions at the injectors after two years of nonlinear data integration. Despite the relatively high noise level in the assimilated pressure data, the prediction results using the PMI algorithm are acceptable. The time-series forecasts of well oil and water production rates are shown for three sample wells. The calibrated model is able to correctly predict future trends (after integrating two years of production data). A comparison between the forecasts obtained from minimizing the l1 and lp norms suggests a slight improvement for the lp-based solution, which was expected based on the studies in the previous chapter.

Figure 3.9: Calibrated and predicted data for Example 2. Production data (2 years of calibration and 4 years of prediction) for well oil production rate (WOPR) and well water production rate (WWPR) for (a) Producer 2, (b) Producer 10, and (c) Producer 8. The breakthrough in Producer 10 occurs after the calibration period and is correctly captured by the estimated permeability maps.

Conclusion

This chapter, inspired by recent advances in sparse signal processing and compressed sensing, developed an adaptive sparsity-promoting inversion algorithm for prior model identification and correction. The approach takes advantage of the selection property of sparse reconstruction techniques to identify, from a diverse set of uncertain prior continuity models, the model(s) best supported by the observed data. The proposed PMI method heavily penalizes inconsistent prior model elements and promotes models with significant contributions to reproducing the production data. This study provided an implementation that adaptively monitors the solution obtained at each model calibration iteration: during the calibration iterations, prior models that contribute only marginally to the solution are replaced by a new set of prior model elements that are structurally more consistent with those that contribute significantly to the solution (at that iteration). Given the significant uncertainty involved in describing many prior physical models, the dynamic production data may prove useful in constraining the global prior continuity model. Hence, a main motivation behind this work is to develop algorithms that discriminate among different prior models based on the observed response of the system. Sparsity-promoting model calibration methods that are useful for feature selection are also promising candidates for prior model identification. This study implemented a simple adaptive scheme to achieve this goal and applied the formulation to two reservoir models to illustrate its effectiveness for identification and correction of the prior continuity models. The practical implications of this outcome can be quite significant, as the proposed method can be used to reduce the subjectivity (bias) in specifying the prior model in the dynamic data integration workflows currently adopted in the industry.
Chapter IV. SPARSE-RANDOMIZED MAXIMUM LIKELIHOOD FOR UNCERTAINTY ASSESSMENT

A particularly challenging aspect of describing subsurface model properties is a proper representation of uncertainty. Underestimation of uncertainty in solving general nonlinear inverse problems can result in biased solutions and future predictions that are difficult to correct later. On the other hand, overestimating the uncertainty can lead to under-constrained solutions in which many distinct but plausible solutions can be found that honor the existing data yet provide very different forecasts, rendering the inverse modeling exercise inconclusive. In general, however, overestimation of uncertainty is preferable to its underestimation, since the latter can misguide future development and result in irreversible losses, whereas the former often merely underscores the need for additional constraining data.

In general, the main sources of uncertainty in solving an inverse problem are: (i) uncertainty in the knowledge about the unknown parameters (i.e., prior uncertainty), (ii) uncertainty in the model that is used to relate the input parameters to the observed data (a.k.a. the modeling error), and (iii) uncertainty in measuring and representing the observed outputs (i.e., observation errors). In the Bayesian formulation of an inverse problem, measurement errors are typically represented through a likelihood function. A topic less explored in the literature is the representation of uncertainty in the modeling assumptions, or the modeling error, partly because characterization of the modeling error is not trivial. In many cases, the errors introduced when applying the conservation principles and deriving the nonlinear flow equations are considered to be far smaller than the uncertainty in specifying subsurface rock properties. Hence, in many cases, the inversion is presented as the integration of a prior model PDF, which reflects the uncertainty in rock properties (model parameters), with the likelihood function, which represents the uncertainties related to the measurements and their predictions. In general, due to the limited access to the subsurface itself for data acquisition and the complex heterogeneity of spatial properties, the uncertainty in the prior information about subsurface model properties tends to be dominant. Typically, dynamic flow data carry aggregate information about rock properties and provide only a coarse-scale description of the underlying parameters, which limits the resolution at which the model parameters can be estimated. The limitations in the amount and resolution of the dynamic well data imply that a large portion of the uncertainty in the prior model remains unresolved even after dynamic data integration. Hence, there is usually a significant level of uncertainty in predicting the future performance of the nonlinear subsurface model, even after integrating all sources of information.

The level and nature of the uncertainty in describing the model can also affect the type and extent of the reparameterization performed for low-order approximation of the unknown spatial properties. For example, if reliable prior information is available, it would be preferable to strongly constrain the parameterization, and hence the solution of the inverse problem, by the existing prior information.
Examples of such parameterizations are those based on Principal Component Analysis (PCA) or the Karhunen-Loeve transform (KLT) [124], [125], which constrain the parameterization to honor the second-order statistical moment of the parameter field. Furthermore, the PCA and KLT methods provide an effective approach for parameter reduction, which is quite useful, and sometimes even necessary, for model calibration [124]. However, placing too much emphasis on the prior information introduces a significant risk of biasing the solution. On the other hand, when prior information is unreliable or unavailable, it becomes necessary to depend on the available data to determine the main spatial connectivity in the field. In general, even qualitative prior knowledge or assumptions based on the nature (physics) of the problem may be useful for constraining the inverse modeling solution. For instance, Tikhonov-type regularization techniques may be used to impart global spatial smoothness on the parameters, and low-frequency Fourier-type subspaces may be suitable for parameterization of correlated and smooth spatial structures. A major limitation of such regularization measures is that they are not universally applicable and can introduce bias toward overly smooth solutions. In general, qualitative knowledge about the formation type and the form/shape of the expected (geologic) patterns in that environment can be quite helpful in selecting or designing an effective parameterization. An additional factor that controls the degree of parameterization for inverse modeling is data resolution. As a rule, updating the subsurface image at a resolution beyond the data resolution is unjustified and can result in unrealistic artifacts if no mechanism is in place to suppress them. For example, by requiring the spatial distribution of the formation hydraulic property to follow a covariance model, solutions that violate the continuity implied by the covariance model, and that potentially introduce inconsistent continuity features, will be penalized.

While the Bayesian approach [1] provides an elegant probabilistic framework for conditioning uncertain prior model parameters on available data, it is only under very simple distributional forms (such as jointly Gaussian data and model parameter distributions) that it can provide a complete closed-form characterization of the posterior distribution. In practice, however, real parameter fields rarely follow such simple distributions, and an approximate solution method must be sought. Two classes of solutions that are commonly used are the estimation of point statistics, such as the first- and second-order moments of the posterior distribution, and ensemble methods that use the prior and likelihood distributions to generate sample realizations from the posterior distribution. By generating multiple conditional model realizations, ensemble approaches provide a systematic Monte-Carlo simulation-based method for quantifying fluid flow prediction uncertainties (e.g., [126]-[128]). The Ensemble Kalman Filter (EnKF) [126], [129], [130], Markov-Chain Monte-Carlo (MCMC) [1], and Randomized Maximum Likelihood (RML) [128], [131] are examples of ensemble methods that have been applied to subsurface inverse modeling problems. This chapter uses the sparse geologic dictionaries introduced in Chapter II, learned from prior training datasets, to sparsely represent prior geologic models.
This study then formulates the sparsity-promoting inversion approach in a Bayesian framework in which a Laplace distribution is used to describe the sparse representation of the prior models. The resulting Bayesian inverse problem is solved using a Sparse RML (SpRML) approach for uncertainty quantification. The minimization involved in the SpRML approach is carried out using the iteratively reweighted least-squares (IRLS) method [80], [111], [132]. To examine its performance, this study applies the framework to two-dimensional and three-dimensional nonlinear subsurface inverse problems for uncertainty quantification under sparsity assumptions.

Bayesian Formulation

RML with Multi-Gaussian Priors

The general Bayesian formulation of the inverse problems arising in multiphase flow in porous media is expressed as [1], [128]

$p(\mathbf{u} \,|\, \mathbf{d}_{obs}) = \dfrac{p(\mathbf{d}_{obs} \,|\, \mathbf{u})\, p(\mathbf{u})}{p(\mathbf{d}_{obs})}$ (4.1)

where $\mathbf{u} \in \mathbb{R}^N$ is the unknown rock property (e.g., log-permeability) and $\mathbf{d}_{obs} \in \mathbb{R}^M$ is the vector of measured data (e.g., well pressures and fluid flow rates). In this chapter, we only consider the integration of dynamic data; however, it is worth noting that in many cases direct measurements of the desired property (log-permeability) at the well locations are available and can be included in the data integration process. The probability density functions (PDFs) $p(\mathbf{u})$, $p(\mathbf{d}_{obs})$, $p(\mathbf{d}_{obs} | \mathbf{u})$, and $p(\mathbf{u} | \mathbf{d}_{obs})$ represent the prior, observation, likelihood, and conditional (posterior) distributions, respectively. In the case of jointly Gaussian model and data distributions, the posterior PDF can be characterized as $p(\mathbf{u} | \mathbf{d}_{obs}) \propto \exp(-J(\mathbf{u}))$, where

$J(\mathbf{u}) = \left( \mathbf{g}(\mathbf{u}) - \mathbf{d}_{obs} \right)^T \mathbf{C}_d^{-1} \left( \mathbf{g}(\mathbf{u}) - \mathbf{d}_{obs} \right) + \left( \mathbf{u} - \mathbf{u}_p \right)^T \mathbf{C}_u^{-1} \left( \mathbf{u} - \mathbf{u}_p \right)$ (4.2)

in which $\mathbf{g}(\mathbf{u}) \in \mathbb{R}^M$ is the vector of simulated measurements, $\mathbf{C}_d$ is the data covariance, and $\mathbf{u}_p$ and $\mathbf{C}_u$ are the mean and covariance, respectively, of the prior distribution of u. The simulated data are usually nonlinearly related to the model parameters u; hence, $J(\mathbf{u})$ is a nonlinear function of the unknown parameters. For linear models, i.e., $\mathbf{g}(\mathbf{u}) = \mathbf{G}\mathbf{u}$, the function $J(\mathbf{u})$ takes a quadratic form and the posterior PDF $p(\mathbf{u} | \mathbf{d}_{obs})$ becomes multi-Gaussian. In that case, the analytical form of the posterior PDF is readily calculated and can therefore be fully characterized ([1], [128]).

The complete characterization of the posterior PDF in the case of a nonlinear model is not trivial. Additionally, even under linear models, if the multi-Gaussianity assumption about the prior model is not justified, analytical characterization of the posterior PDF may not be possible. Alternative methods for approximate characterization of the posterior PDF in each of the above cases have been used in the past. The MCMC method is a computationally demanding but rigorous approach for sampling from the posterior distribution ([1], [128]). The EnKF is a Monte-Carlo implementation of the Kalman filter that has been introduced as a computationally efficient and practical data integration approach for nonlinear models. The original Kalman filter is an optimal state estimator for linear state-space dynamical systems with jointly Gaussian statistics. For linear models with jointly Gaussian state and measurement PDFs, the EnKF can be shown to asymptotically converge (for large sample sizes) to the Kalman filter.
Despite its theoretical limitations in handling nonlinear, non-Gaussian problems, application of the EnKF to several large-scale nonlinear data assimilation problems has produced promising results [126], [129]. The application of the EnKF to subsurface flow model calibration has been extensively studied in the past decade, and important observations have been reported [126], [133]-[136]. The RML is another method that provides an approximate sampling approach for nonlinear and non-Gaussian problems [137], [138]. For nonlinear models $\mathbf{g}(\mathbf{u})$ with jointly Gaussian distributions, $n_{ens}$ samples from the posterior distribution are obtained in the RML approach by solving the following minimization problem [137]-[139]:

$\mathbf{u}^i = \arg\min_{\mathbf{u}} \left[ \left( \mathbf{g}(\mathbf{u}) - \mathbf{d}_{obs}^i \right)^T \mathbf{C}_d^{-1} \left( \mathbf{g}(\mathbf{u}) - \mathbf{d}_{obs}^i \right) + \left( \mathbf{u} - \mathbf{u}_{uc}^i \right)^T \mathbf{C}_u^{-1} \left( \mathbf{u} - \mathbf{u}_{uc}^i \right) \right], \quad i = 1, 2, \ldots, n_{ens}$ (4.3)

where $\mathbf{u}_{uc}^i$ is an unconditional realization from the prior distribution of u, with $\mathbf{u}_{uc}^i \sim N(\mathbf{u}_p, \mathbf{C}_u)$, and $\mathbf{d}_{obs}^i$ is a random realization of the observation vector taken from $\mathbf{d}_{obs}^i \sim N(\mathbf{d}_{obs}, \mathbf{C}_d)$. In this context, $N(\mathbf{u}_p, \mathbf{C}_u)$ represents a multivariate normal (Gaussian) distribution with mean $\mathbf{u}_p$ and covariance matrix $\mathbf{C}_u$. For linear models, i.e., $\mathbf{g}(\mathbf{u}) = \mathbf{G}\mathbf{u}$, and jointly Gaussian PDFs, the closed-form solution of the above minimization problem can be expressed as [139]

$\mathbf{u}^i = \mathbf{u}_{uc}^i + \mathbf{C}_u \mathbf{G}^T \left( \mathbf{G} \mathbf{C}_u \mathbf{G}^T + \mathbf{C}_d \right)^{-1} \left( \mathbf{d}_{obs}^i - \mathbf{G} \mathbf{u}_{uc}^i \right)$ (4.4)

However, in the linear case, when the full conditional multi-Gaussian PDF is available, it can also be sampled using other direct sampling methods. The RML minimization can be interpreted as an approximate conditional sampling in which unconditional realizations are slightly perturbed to reproduce the observed data within the measurement error given by $\mathbf{C}_d$. Linearized error analysis can be used to show that the conditional samples generated with the RML approach reproduce the conditional mean and covariance matrices as long as the distance between the conditional mean and the simulated conditional samples is small enough that the same linearization applies to both [137], [138]. When this condition does not apply, the generated samples can only be considered approximations and are not necessarily samples from the theoretical conditional distribution. In general, an importance sampling criterion, such as the Metropolis-Hastings approach, could be used to accept or reject the simulated samples. Despite the approximations involved, application of the RML method to nonlinear subsurface inverse problems has been successful in the past [140], [138], [139]. For nonlinear models, the above minimization problem must be solved as many times as the number of conditional realizations needed. However, since minimizing the above objective function involves several time-consuming forward flow simulation runs, the number of samples generated with this method is limited in practice. The RML formulation presented above is general and can be used with other types of prior distributions. The following section considers the RML formulation under sparse priors.
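For the linear-Gaussian case, Eq. (4.4) can be implemented directly. The sketch below draws RML samples for a toy linear problem; all dimensions, covariances, and the forward operator are illustrative assumptions.

```python
import numpy as np

def rml_linear_samples(G, d_obs, u_p, C_u, C_d, n_ens, rng):
    """Conditional sampling via Eq. (4.4): each unconditional realization
    u_uc ~ N(u_p, C_u) is corrected toward a perturbed data realization
    d_obs^i ~ N(d_obs, C_d)."""
    K = C_u @ G.T @ np.linalg.inv(G @ C_u @ G.T + C_d)  # gain matrix
    samples = []
    for _ in range(n_ens):
        u_uc = rng.multivariate_normal(u_p, C_u)        # prior realization
        d_i = rng.multivariate_normal(d_obs, C_d)       # perturbed observations
        samples.append(u_uc + K @ (d_i - G @ u_uc))     # Eq. (4.4)
    return np.array(samples)

# Toy usage: 20 unknowns observed through 5 noisy linear measurements.
rng = np.random.default_rng(1)
G = rng.normal(size=(5, 20))
C_u, C_d = np.eye(20), 0.01 * np.eye(5)
u_true = rng.normal(size=20)
ens = rml_linear_samples(G, G @ u_true, np.zeros(20), C_u, C_d, n_ens=50, rng=rng)
print(ens.shape, ens.std(axis=0).mean())  # ensemble spread reflects posterior width
```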
Sparse RML with Laplace Priors

Given a set of expansion functions $\{\boldsymbol{\phi}_i\}_{i=1,\ldots,n}$, as in the previous chapters, a property of interest u (e.g., grid-block log-permeability values) is expressed as a linear expansion of the form

$\mathbf{u} = \sum_{i=1}^{n} v_i\, \boldsymbol{\phi}_i$ (4.5)

where $\mathbf{v} = [v_1, v_2, v_3, \ldots, v_n]^T$ are the expansion coefficients that represent u in $\{\boldsymbol{\phi}_i\}_{i=1,\ldots,n}$; the matrix $\mathbf{\Phi}$ collecting these elements is the linear transformation into the sparse domain, called the dictionary in the sparse signal reconstruction literature. The vector v is said to be s-sparse if only s out of its n elements are non-zero. In that case, Eq. (4.5) can be reduced to an s-term expansion

$\mathbf{u} = \sum_{i=1}^{s} v_i\, \boldsymbol{\phi}_i$ (4.6)

The sparse reconstruction problem involves the estimation of the parameter u from its incomplete observations $\mathbf{d}_{obs} = \mathbf{g}(\mathbf{u})$. Since u is linearly related to v (i.e., $\mathbf{u} = \mathbf{\Phi}\mathbf{v}$), one can write the nonlinear and linear measurement equations in terms of v as $\mathbf{d}_{obs} = \mathbf{g}(\mathbf{v})$ and $\mathbf{d}_{obs} = \mathbf{G}\mathbf{v}$, respectively. The sparse reconstruction literature focuses on recovering the sparse vector v from its incomplete observations $\mathbf{d}_{obs} = \mathbf{G}\mathbf{v}$ by minimizing the support (number of non-zero coefficients) of v (i.e., $\|\mathbf{v}\|_0$) subject to measurement constraints, i.e.,

$(P_0): \quad \min \|\mathbf{v}\|_0 \quad \text{s.t.} \quad \mathbf{G}\mathbf{v} = \mathbf{d}_{obs}$ (4.7)

Recent developments in statistical signal processing, which have resulted in the compressed sensing paradigm, specify that under certain, relatively mild, conditions one can find the sparse solution to $(P_0)$ by solving the following $(P_1)$ problem [62], [69]:

$(P_1): \quad \min \|\mathbf{v}\|_1 \quad \text{s.t.} \quad \mathbf{G}\mathbf{v} = \mathbf{d}_{obs}$ (4.8)

Reformulation of the $(P_1)$ problem to allow for observation errors leads to [69]

$(P_1^e): \quad \min_{\mathbf{v}} J(\mathbf{v}) = \left\| \mathbf{G}\mathbf{v} - \mathbf{d}_{obs} \right\|_2^2 + \lambda \left\| \mathbf{v} \right\|_1$ (4.9)

where $\lambda$ is the regularization parameter. The norms used in the above equations are defined, for $t > 0$, as $\|\mathbf{x}\|_t = \left( \sum_i |x_i|^t \right)^{1/t}$.

The problem $(P_1^e)$ provides the inspiration for formulating the Bayesian form of the sparse reconstruction problem. This can be achieved by writing the Bayesian formulation for the sparse parameters v using the Laplace (double exponential) distribution [141]. That is, we consider the prior PDF of v to take the form

$p(\mathbf{v}) \propto \exp\left( -\lambda \left\| \mathbf{v} - \mathbf{v}_p \right\|_1 \right)$ (4.10)

where $\mathbf{v}_p$ is the prior mean of v and $\lambda$ is the scale parameter. This formulation assumes that each component of the sparse vector v is uncorrelated with the other elements of the vector. In addition, it is assumed that the Laplace scale parameter is identical and constant for all elements. With a Gaussian choice for the likelihood function, the posterior PDF from Bayes' rule reduces to $p(\mathbf{v} | \mathbf{d}_{obs}) \propto \exp(-J(\mathbf{v}))$ with

$J(\mathbf{v}) = \left( \mathbf{g}(\mathbf{v}) - \mathbf{d}_{obs} \right)^T \mathbf{C}_d^{-1} \left( \mathbf{g}(\mathbf{v}) - \mathbf{d}_{obs} \right) + \lambda \left\| \mathbf{v} - \mathbf{v}_p \right\|_1$ (4.11)

For a traditional sparse prior, we have $\mathbf{v}_p = \mathbf{0}$ (the expected value of the coefficients is zero, as implied by sparsity), and the resulting exponential argument in the posterior PDF simplifies to

$J(\mathbf{v}) = \left( \mathbf{g}(\mathbf{v}) - \mathbf{d}_{obs} \right)^T \mathbf{C}_d^{-1} \left( \mathbf{g}(\mathbf{v}) - \mathbf{d}_{obs} \right) + \lambda \left\| \mathbf{v} \right\|_1$ (4.12)

The l1 norm in the above equations is defined as $\|\mathbf{v}\|_1 = \sum_{j=1}^{n} |v_j|$. The posterior PDF in this case has a complex form, and closed-form solutions cannot be derived. Therefore, we use the RML sampling approach to find realizations from the posterior PDF. This is achieved by solving the following minimization

$J^i(\mathbf{v}) = \left( \mathbf{g}(\mathbf{v}) - \mathbf{d}_{obs}^i \right)^T \mathbf{C}_d^{-1} \left( \mathbf{g}(\mathbf{v}) - \mathbf{d}_{obs}^i \right) + \lambda \left\| \mathbf{v} - \mathbf{v}_{uc}^i \right\|_1$ (4.13)

to generate realization i of the posterior sample.
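A direct transcription of the SpRML objective in Eq. (4.13) is sketched below for a generic forward model `g`; the dissertation minimizes this objective with the IRLS scheme reviewed next, whereas here it is only evaluated.

```python
import numpy as np

def sprml_objective(v, g, d_obs_i, v_uc_i, C_d_inv, lam):
    """SpRML objective of Eq. (4.13) for realization i.

    v       : candidate coefficient vector
    g       : forward model mapping coefficients to simulated data
    d_obs_i : perturbed observation realization, d_obs^i ~ N(d_obs, C_d)
    v_uc_i  : sparse coefficients of the unconditional prior realization i
    """
    r = g(v) - d_obs_i
    data_misfit = r @ C_d_inv @ r                 # Gaussian likelihood term
    sparsity = lam * np.sum(np.abs(v - v_uc_i))   # Laplace prior (l1) term
    return data_misfit + sparsity

# Toy check with a linear forward model g(v) = G v:
rng = np.random.default_rng(2)
G = rng.normal(size=(10, 40))
g = lambda v: G @ v
v_uc = np.zeros(40); v_uc[[1, 7]] = [1.0, -0.5]
print(sprml_objective(v_uc, g, g(v_uc), v_uc, np.eye(10), lam=0.1))  # 0.0 at the prior
```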
While the parameter $\lambda$ is difficult to specify exactly for nonlinear problems, experience with sparse reconstruction inverse problems suggests that over a relatively wide range of values the solution shows little sensitivity to this parameter [141]. In general, a small number of trial-and-error attempts can be used to identify a reasonable value for $\lambda$. This chapter also follows the iteratively reweighted least-squares (IRLS) method proposed in the previous chapters and in [86], [142] to minimize the above objective function. The IRLS objective function is written as

$J^i(\mathbf{v}) = \left( \mathbf{g}(\mathbf{v}) - \mathbf{d}_{obs}^i \right)^T \mathbf{C}_d^{-1} \left( \mathbf{g}(\mathbf{v}) - \mathbf{d}_{obs}^i \right) + \lambda\, SP(\mathbf{v} - \mathbf{v}_{uc}^i)$ (4.14)

where at iteration (k+1) of the minimization we have

$SP(\mathbf{v}^{k+1} - \mathbf{v}_{uc}^i) = \left\| \mathbf{W}^{k} \left( \mathbf{v}^{k+1} - \mathbf{v}_{uc}^i \right) \right\|_2^2 = \sum_{j=1}^{n} W_{k,j}^i \left( v_{k+1,j}^i - v_{uc,j}^i \right)^2$

with the weights defined as

$W_{k,j}^i = \left( \left( v_{k,j}^i - v_{uc,j}^i \right)^2 + \epsilon \right)^{-\frac{1}{2}}$

Note that at iteration (k+1) the weight of the sparsity term is written in terms of the solution at the previous iteration (k). The parameter $\epsilon$ is generally a small positive constant included to ensure numerical stability of the solution, since the derivative of the l1 norm is not defined at the origin. In this study, however, following the approach in [86], these parameters are set to

$\epsilon^{k+1} = \left\| \mathbf{C}_d^{-1/2} \left( \mathbf{g}(\mathbf{v}^k) - \mathbf{d}_{obs} \right) \right\|_2^2 \quad \text{and} \quad \lambda^{k+1} = \dfrac{\left\| \mathbf{C}_d^{-1/2} \left( \mathbf{g}(\mathbf{v}^k) - \mathbf{d}_{obs} \right) \right\|_2^2}{SP(\mathbf{v}^k)}$

The formulations presented in Eqs. (4.13) and (4.14) tend to promote sparse updates to the unconditional realizations, i.e., they minimize $\|\mathbf{v} - \mathbf{v}_{uc}^i\|_1$. However, since the unconditional realizations are themselves sparse, the solution is also expected to be sparse. The above objective function is minimized with the gradient-based IRLS approach described in Chapter III.

Finally, to implement the RML algorithm, the basis functions $\{\boldsymbol{\phi}_i\}_{i=1,\ldots,n}$ must be defined. While generic compression transforms, such as the discrete cosine transform (DCT) [112], [117] and the discrete wavelet transform (DWT) [54], [116], or the classical PCA/KLT parameterization, could be used as sparse basis functions, this chapter uses the geologically trained sparse K-SVD dictionary introduced in Chapter II [62], [86]. The K-SVD algorithm and its important properties for subsurface flow inverse problems have been discussed in our recent publications [86]. The K-SVD dictionary elements $\{\boldsymbol{\phi}_i\}_{i=1,\ldots,n}$ are derived from a prior training dataset $\mathbf{U} = [\mathbf{u}_i]_{i=1,\ldots,L} \in \mathbb{R}^{n \times L}$ and have the property that they can approximate any model similar to those in the prior library with an s << n term expansion. Construction of a sparse geologic dictionary is performed by providing L prior model realizations as the columns of the matrix $\mathbf{U}$, and by specifying a dictionary size K and a sparsity level S (note that S and K are specified by the user based on the complexity of the model and the desired compression level). The algorithm compresses the input models to generate an S-sparse dictionary $\mathbf{\Phi} \in \mathbb{R}^{n \times K}$ and the corresponding sparse representation of the prior models $\mathbf{V} = [\mathbf{v}_i]_{i=1,\ldots,L} \in \mathbb{R}^{K \times L}$, such that only S (different) elements are needed to provide an approximate representation of each prior model realization. The following optimization problem can be solved to generate the dictionary elements:

$\left( \hat{\mathbf{\Phi}}, \hat{\mathbf{V}} \right) = \arg\min_{\mathbf{\Phi}, \mathbf{V}} \left\| \mathbf{U} - \mathbf{\Phi} \mathbf{V} \right\|_F^2 \quad \text{s.t.} \quad \left\| \mathbf{v}_i \right\|_0 \le S, \quad i = 1, 2, \ldots, L$ (4.15)
Finally, to implement the RML algorithm, the basis functions $\boldsymbol{\phi}_i,\ i = 1, 2, \dots, n$ need to be defined. While generic compression transforms, such as the discrete cosine transform (DCT) [112], [117] or the discrete wavelet transform (DWT) [54], [116], or the classical PCA/KLT parameterization, can be used as sparse basis functions, this chapter uses the geologically-trained sparse K-SVD dictionary introduced earlier [62], [86]. The K-SVD algorithm and its important properties for subsurface flow inverse problems have been discussed in our recent publications [86]. The K-SVD dictionary elements $\boldsymbol{\phi}_i,\ i = 1, 2, \dots, n$ are derived from a prior training dataset $\mathbf{U} = [\mathbf{u}_i]_{i=1,2,\dots,L} \in \mathbb{R}^{n \times L}$ and have the property that they can approximate any model similar to those in the prior library with an $s \ll n$ term expansion. Construction of a sparse geologic dictionary is performed by providing the $L$ prior model realizations as the columns of the matrix $\mathbf{U}$, and specifying a dictionary size $K$ (number of elements) and a sparsity level $S$ (note that $S$ and $K$ are specified by the user based on the complexity of the model and the desired compression level). The algorithm compresses the input models to generate an $S$-sparse dictionary $\boldsymbol{\Phi} \in \mathbb{R}^{n \times K}$ and the corresponding sparse representation of the prior models $\mathbf{V} = [\mathbf{v}_i]_{i=1,2,\dots,L} \in \mathbb{R}^{K \times L}$, such that only $S$ (different) elements are needed to provide an approximate representation of each prior model realization. The following optimization problem is solved to generate the dictionary elements:

$$\hat{\boldsymbol{\Phi}}, \hat{\mathbf{V}} = \operatorname*{argmin}_{\boldsymbol{\Phi}, \mathbf{V}} \|\mathbf{U} - \boldsymbol{\Phi}\mathbf{V}\|_F^2 \quad \text{s.t.} \quad \|\mathbf{v}_i\|_0 \le s,\ \ i = 1, 2, \dots, L. \qquad (4.15)$$

As described in Chapter II, the above optimization problem can be efficiently solved using the two-step iterative K-SVD algorithm [62]. Each iteration of the K-SVD algorithm consists of two steps: the first step (known as sparse coding) finds sparse representations $\mathbf{V}$ for the current dictionary $\boldsymbol{\Phi}$ and prior models $\mathbf{U}$ by solving sparse reconstruction subproblems of the form

$$\hat{\mathbf{v}}_i = \operatorname*{argmin}_{\mathbf{v}_i} \|\mathbf{u}_i - \boldsymbol{\Phi}\mathbf{v}_i\|_F^2 \quad \text{s.t.} \quad \|\mathbf{v}_i\|_0 \le s. \qquad (4.16)$$

The second step updates the dictionary by fixing $\mathbf{V}$ and $\mathbf{U}$ and solving the subproblem

$$\hat{\boldsymbol{\Phi}} = \operatorname*{argmin}_{\boldsymbol{\Phi}} \|\mathbf{U} - \boldsymbol{\Phi}\mathbf{V}\|_F^2 = \operatorname*{argmin}_{\boldsymbol{\Phi}} \left\|\mathbf{U} - \sum_{i=1}^{K} \boldsymbol{\phi}_i \mathbf{v}_i^T\right\|_F^2, \qquad (4.17)$$

where $\boldsymbol{\phi}_i$ and $\mathbf{v}_i^T$ denote the $i$-th column of $\boldsymbol{\Phi}$ and the $i$-th row of $\mathbf{V}$, respectively. In this chapter, K-SVD is used as the sparsifying expansion; interested readers are referred to [86] for additional details about the K-SVD algorithm and its properties. The "K" in K-SVD refers to the size of the dictionary and reflects the K rank-one SVD operations implemented in each iteration of the algorithm.

To test the validity of the Laplace prior on the K-SVD coefficients, a Q-Q plot test was performed with a one-dimensional discrete (non-Gaussian) facies model. To this end, 40,000 one-dimensional realizations were generated with the SNESIM algorithm, and a dictionary size of K=100 and a sparsity level of S=20 were specified. The K-SVD sparse coefficients were generated using the OMP algorithm with the same sparsity level. The Q-Q plots of four of the coefficients against the Normal and Laplace distributions are shown in Figures 4.1a and 4.1b. The linear trend in Figure 4.1b suggests that the Laplace prior is a better assumption than the multi-Gaussian prior.

Figure 4.1. Q-Q plots of the sparse K-SVD coefficients against the standard Gaussian distribution (a) and the Laplace distribution (b) for a one-dimensional example with discrete facies maps (x-axes: theoretical quantiles; y-axes: coefficient quantiles). The plots show that the Laplace distribution is more appropriate than the Gaussian distribution for describing the sparse K-SVD coefficients.
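To make the dictionary-learning step concrete, the following is a minimal numpy sketch of the two-step K-SVD iteration of Eqs. (4.15)-(4.17), paired with a simple orthogonal matching pursuit coder; the function names and initialization are choices made here for illustration, not the production implementation used in this work.

```python
import numpy as np

def omp(Phi, u, s):
    """Greedy orthogonal matching pursuit: an s-sparse coding of u in Phi."""
    r, idx = u.copy(), []
    for _ in range(s):
        idx.append(int(np.argmax(np.abs(Phi.T @ r))))     # most correlated atom
        coef, *_ = np.linalg.lstsq(Phi[:, idx], u, rcond=None)
        r = u - Phi[:, idx] @ coef                        # orthogonalized residual
    v = np.zeros(Phi.shape[1])
    v[idx] = coef
    return v

def ksvd(U, K, s, n_iter=10, seed=0):
    """Minimal K-SVD (Eqs. 4.15-4.17): alternate s-sparse coding of the training
    models (columns of U) and K rank-one SVD updates of the dictionary atoms."""
    rng = np.random.default_rng(seed)
    Phi = rng.standard_normal((U.shape[0], K))
    Phi /= np.linalg.norm(Phi, axis=0)                    # unit-norm atoms
    for _ in range(n_iter):
        V = np.column_stack([omp(Phi, U[:, i], s) for i in range(U.shape[1])])
        for k in range(K):
            users = np.flatnonzero(V[k, :])               # models that use atom k
            if users.size == 0:
                continue
            # restricted error with atom k's contribution put back
            E = U[:, users] - Phi @ V[:, users] + np.outer(Phi[:, k], V[k, users])
            Uk, sk, Vk = np.linalg.svd(E, full_matrices=False)
            Phi[:, k] = Uk[:, 0]                          # rank-one update
            V[k, users] = sk[0] * Vk[0, :]
    return Phi, V
```

The per-atom rank-one SVD of the restricted error matrix is what gives the algorithm its name: K such updates are carried out in every iteration.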
Numerical Experiments

To evaluate the performance of the SpRML algorithm, it is applied to three numerical experiments involving two two-dimensional synthetic models and the three-dimensional PUNQ-S3 reservoir. In each case, the starting point is nens = 50 prior model realizations, which are conditioned using the RML algorithm. No importance sampling is applied to accept or reject the solutions obtained from the RML algorithm. Note, however, that larger sample sizes are necessary for approximating the statistics of the resulting conditional distribution. The first example resembles a multi-Gaussian-type distribution for the variability in the spatial permeability map, which is amenable to classical variogram-based geostatistical description methods. The second example represents a meandering channel connectivity that is not amenable to variogram-based representations. These two examples are used to compare conventional model calibration methods with the sparsity-promoting algorithm using the K-SVD dictionary. Finally, the third example demonstrates the application of the method to the more realistic PUNQ-S3 3D model.

The immiscible two-phase fluid flow equations (e.g., oil and water) in porous media can be derived from the mass conservation principle and Darcy's law. They can be compactly represented as

$$\nabla \cdot \left[\frac{\mathbf{k}\,\lambda_o}{B_o}\left(\nabla P_o - \gamma_o \nabla Z\right)\right] = \frac{\partial}{\partial t}\left(\frac{\phi S_o}{B_o}\right) + q_o, \qquad (4.18)$$

$$\nabla \cdot \left[\frac{\mathbf{k}\,\lambda_w}{B_w}\left(\nabla P_w - \gamma_w \nabla Z\right)\right] = \frac{\partial}{\partial t}\left(\frac{\phi S_w}{B_w}\right) + q_w, \qquad (4.19)$$

where $\mathbf{k}$ and $\phi$ are the rock permeability tensor and porosity, respectively; $B$, $\lambda$, and $\gamma$ are the formation volume factor, phase mobility, and density, respectively; and $Z$ is depth. The pressure and saturation are denoted by $P$ and $S$, while $q$ represents the source and sink terms. The subscripts $o$ and $w$ stand for oil and water. Assuming a diagonal permeability tensor, in two-dimensional Cartesian coordinates the above equations reduce to

$$\frac{\partial}{\partial x}\left(\frac{k_{xx}\lambda_o}{B_o}\frac{\partial P_o}{\partial x}\right) + \frac{\partial}{\partial y}\left(\frac{k_{yy}\lambda_o}{B_o}\frac{\partial P_o}{\partial y}\right) = \frac{\partial}{\partial t}\left(\frac{\phi S_o}{B_o}\right) + q_o, \qquad (4.20)$$

$$\frac{\partial}{\partial x}\left(\frac{k_{xx}\lambda_w}{B_w}\frac{\partial P_w}{\partial x}\right) + \frac{\partial}{\partial y}\left(\frac{k_{yy}\lambda_w}{B_w}\frac{\partial P_w}{\partial y}\right) = \frac{\partial}{\partial t}\left(\frac{\phi S_w}{B_w}\right) + q_w. \qquad (4.21)$$

These two equations are combined with two constitutive equations, namely the capillary pressure relation and the physical saturation constraint,

$$P_c(S_w) = P_o - P_w, \qquad S_w + S_o = 1, \qquad (4.22)$$

to close the system of equations, where $P_c$ denotes the capillary pressure. The resulting equations can be solved numerically by discretizing the domain and specifying the initial and boundary conditions. In this chapter, a uniform (constant) initial pressure and oil saturation is assumed, with no-flow boundary conditions at all boundary grid blocks; the wells are represented as source/sink terms in the grid blocks that contain injection and production wells.

Example 1: Two-Dimensional Multi-Gaussian Model

The two-dimensional model in this study is a 50×50 reservoir with heterogeneous permeability under a conventional 9-spot waterflooding scenario. Figure 4.2a shows the well configuration (left) and the reference log-permeability model (right). The total simulation time was six years, of which the simulated data from the first four years were used for inverse modeling while the data from the last two years were used to evaluate the predictive performance of the estimated models. The reference log-permeability model and the unconditional samples were assumed to have the same variogram (covariance) model. The geostatistical parameters used to generate the samples are shown in Table 5.

Figure 4.2. Model setup and inverse modeling results for Example 1: (a) field setup (left) and reference log-permeability (right); (b) four unconditional log-permeability realizations (Columns 1-4), unconditional sample mean (Column 5) and variance (Column 6); (c) conditional log-permeability realizations (Columns 1-4) corresponding to (b), and the conditional sample mean (Column 5) and variance (Column 6). The sample mean and variance are calculated from 50 realizations.

Table 5. Geostatistical simulation parameters for the two-dimensional reservoir in Example 1.

  Property Name                      Consistent Library
  Kriging type                       Ordinary kriging
  Maximum conditioning data          12
  Variogram type                     Spherical
  Ranges (max, med, min)             50, 10, 5
  Angles (azimuth, dip, rake)        30, 0, 0
  Mean value (log-perm)              4
  Standard deviation (log-perm)      1

The first four columns of Figure 4.2b show four sample (out of 50) unconditional realizations of the model. The fifth and sixth columns of Figure 4.2b display the unconditional sample mean and variance, respectively. While the unconditional realizations have identical variograms, they generally exhibit different variability and connectivity in different regions of the subsurface model.
As a result, the unconditional sample mean does not show any particular trend in the spatial structure of the model. Figure 4.2c displays the corresponding plots after conditioning with SpRML. The conditional realizations show substantially reduced ensemble variability, and the identified local connectivity patterns in the conditional samples (and sample mean) are consistent with the reference model. The main uncertainty in the variance map corresponds to the regions far from the wells where the observations are located. Note that the mean and variance are calculated from 50 realizations and are expected to contain sampling errors; this is especially true for the variance map.

Figure 4.3a shows 25 samples from the K=2500 sparse K-SVD dictionary elements. In this case, the sparsity level of the dictionary is S=20. The linear expansion of the four unconditional realizations over the K-SVD dictionary (containing the expansion functions) results in the sparse representations shown in Figure 4.3b. Figure 4.3c depicts the updates applied to the sets of sparse coefficients corresponding to the unconditional realizations to generate the conditional realizations in the sparse domain (the spatial representations of the unconditional and conditional realizations are shown in Figures 4.2b and 4.2c, respectively). These figures verify that the updates applied to the unconditional realizations are sparse, as expected from the SpRML formulation (Eq. 4.13).

Figure 4.3. Dictionary elements and the models' sparse representations: (a) 25 sample K-SVD dictionary elements (out of K=2500) with sparsity level S=20; (b) sparse coefficients representing the four unconditional realizations in Figure 4.2; (c) updates applied to the unconditional samples to generate the conditional realizations.

Figure 4.4 summarizes the data match (for the first four years) and future predictions (for the last two years) for the water-cut, oil production rate, and total oil production in Producer 2 (top row) and Producer 3 (bottom row). The figures clearly demonstrate the improvements obtained after calibration with the SpRML. It is worth noting that in Producer 3 (second row of Figure 4.4) the water breakthrough occurs after four years (i.e., after the history matching period); nonetheless, the calibrated models successfully forecast the water breakthrough time and the shape of the breakthrough function.

Figure 4.4. Data match and production forecasts for all 50 realizations at Producer #2 (first row) and Producer #3 (second row); Columns (a)-(c) show the well water-cut, oil production rate (STB/D), and total oil produced (STB), respectively.

Example 2: Two-Dimensional Non-Gaussian Model (Meandering Channel)

The second example is also a 2D subsurface model, with a more complex heterogeneity representing a meandering fluvial channel formation. Unlike in Example 1, the underlying geologic connectivity in this case is more complex and cannot be represented with traditional variogram-based description methods. Furthermore, preserving such complex connectivity structures during the data integration process poses a challenging inverse modeling problem. This example evaluates the performance of the K-SVD dictionary and the proposed sparse RML formulation for detecting such complex geologic connectivity by inverting dynamic nonlinear flow data. In this example, the subsurface domain is discretized into a 100-by-100 grid system. The dictionary size is K=1000 and the sparsity level is S=50. Figure 4.5a shows 25 sample realizations. The corresponding 25 K-SVD dictionary elements and the leading 25 truncated SVD basis functions are shown in Figures 4.5b and 4.5c, respectively.
A total of S=90 leading left singular vectors were used for the inversion with the SVD method. A clear distinction between the K-SVD and truncated SVD elements in Figures 4.5b and 4.5c is the appearance of the meandering channels in the K-SVD elements and the loss of the meandering features in the SVD basis. Considering that the training dataset for the K-SVD and SVD elements is identical, the two methods clearly show contrasting behavior. The K-SVD dictionary elements maintain a strong resemblance to the features of the patterns in the prior library; during the sparsity-promoting inversion, the most relevant features in the dictionary are chosen and combined to reconstruct the solution. In the case of truncated SVD, the basis elements are generated by spectral decomposition of the features in the training data and lose the overall shape of the training patterns. Hence, during model calibration, the leading elements of the basis are combined to form the patterns in the solution. However, since the problem is ill-posed, the data required to retrieve the complex meandering channel are not available, and the solutions obtained fail to represent the complex channel structure. Further details about the properties of the K-SVD and SVD parameterization methods can be found in [86], [123].

Figure 4.5. Realizations and dictionary atoms for Example 2 (meandering structure): (a) 25 meandering fluvial channel realizations used as prior unconditional models; (b) 25 K-SVD dictionary elements (with K=1000 and S=50); (c) 25 truncated SVD basis elements for S=90.

Figure 4.6a shows the well configuration for this example. A 13-spot well setup with four injectors and nine producers is used to simulate a waterflooding scenario. The reservoir is initially filled with oil, and water is injected to push the oil toward the producers and to maintain high subsurface pressure. The reference model is shown on the left side of Figure 4.6a. Figure 4.6b displays four sample initial models (Columns 1-4) with the corresponding mean and variance maps (Columns 5-6). In the same arrangement as Figure 4.6b, Figures 4.6c and 4.6d display the calibrated permeability maps obtained with the K-SVD and SVD parameterization methods, respectively. The last two columns of Figures 4.6c and 4.6d show the mean and variance maps of the calibrated realizations. It is evident from these results that the sparse K-SVD parameterization is better able to preserve the connectivity of the meandering channel, whereas the SVD parameterization (despite using S=90 basis elements, compared with S=50 for K-SVD) fails to reconstruct the meandering shape of the channel. This can be attributed to the second-order (covariance-based) nature of the SVD parameterization, and to the flexibility of the sparsity-promoting algorithm in selecting basis elements. Note that in the case of K-SVD, the calibration process uses all K=1000 elements within a sparsity-promoting inversion with S=50, whereas calibration with the truncated SVD selects the leading S=90 singular vectors and includes them in a reduced-order parametric description.
Figure 4.6. Field setup and calibrated models for Example 2: (a) field setup (left) and reference log-permeability (right); (b) four unconditional log-permeability realizations (Columns 1-4), unconditional sample mean (Column 5) and variance (Column 6); (c) conditional log-permeability realizations from the SpRML method (Columns 1-4) corresponding to (b), with the conditional sample mean (Column 5) and variance (Column 6); (d) conditional log-permeability realizations from the SVD-RML with S=90 (Columns 1-4) corresponding to (b), along with the conditional sample mean (Column 5) and variance (Column 6). The sample mean and variance are calculated from 50 realizations.

Finally, Figure 4.7 shows the well water-cut (WWCT) data match and time-series prediction performance of the sparse RML (Figure 4.7a) and the SVD-based RML (Figure 4.7b) methods for three different wells. In each case, the first four years of data have been used for inverse modeling while the data from the last two years have been used for assessing the forecasts. A particularly important part of these plots is the water breakthrough time. In Producers 2 and 3, the water breakthrough occurs during the last two years (prediction period), while in Producer 1 the water breakthrough happens during the first four years (model calibration period). Both the SVD- and K-SVD-based methods can reproduce the early water breakthrough time in Producer 1. However, in Producers 2 and 3 (second and third columns of Figure 4.7), where the water breakthrough happens later, during the prediction period, the predictions from the K-SVD-based method are evidently superior to those from the SVD-based method. Overall, the results clearly indicate better model calibration and prediction performance for the sparse RML approach.

Figure 4.7. Data match and production forecasts for all 50 realizations at Producer #1 (first column), Producer #2 (second column), and Producer #3 (third column); (a) and (b) show the well water-cut for the SpRML and SVD-RML methods, respectively.

Example 3: Three-Dimensional PUNQ-S3 Case

Let us now consider the PUNQ-S3 3D benchmark model. The PUNQ-S3 case study was developed for uncertainty quantification in a realistic field setting. It is a small industrial model with 19×28×5 grid blocks, of which 1761 are active. This example contains geologic features that are more challenging than those of the previous examples: channel-like spatial features in layers 1, 3, and 5, and low permeability in layers 2 and 4, which hinders vertical flow. The original model undergoes a water drive supported by a fairly strong aquifer to the north and west of the field. In this study, the structure and the reference permeability and porosity maps are borrowed from the original PUNQ-S3 model, but a waterflooding experiment with four injection and four production wells is studied instead, as shown in Figure 4.8a. The reservoir has five layers, each with a distinct heterogeneous spatial (permeability) distribution. The injection wells are completed in the bottom two layers, while the producers penetrate the top three layers. The injectors are under rate control and the producers are under BHP (bottom-hole pressure) control. The injection pressures and production oil and water rates are collected and used as conditioning data over the first four years.
The data collected from these wells during the fifth and sixth years are used to validate the prediction performance of the conditional realizations. This example estimates the horizontal log-permeability model, $\mathbf{u}_{XY} = \log_{10}(\mathbf{k}_{XY})$, and derives the porosity and vertical permeability values from the relations

$$\mathbf{u}_{XY} = 9.02\,\boldsymbol{\phi} + 0.77 \qquad \text{and} \qquad \mathbf{u}_{Z} = 0.306 \times 10^{\,\mathbf{u}_{XY}} + 3.124,$$

respectively. These relations are borrowed from the original PUNQ-S3 model.
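As a small worked example, the sketch below evaluates these two relations on a vector of log-permeability values; the function name is illustrative, and the exact placement of the logarithm in the relations is reconstructed here from the text, so the coefficients should be checked against the PUNQ-S3 documentation before reuse.

```python
import numpy as np

def punq_derived_properties(u_xy):
    """Porosity and vertical permeability from horizontal log10-permeability
    u_xy, per the PUNQ-S3 relations as reconstructed here:
    u_xy = 9.02 * phi + 0.77  and  u_z = 0.306 * 10**u_xy + 3.124."""
    phi = (u_xy - 0.77) / 9.02            # invert the porosity relation
    u_z = 0.306 * 10.0 ** u_xy + 3.124    # vertical permeability
    return phi, u_z

phi, u_z = punq_derived_properties(np.array([1.5, 2.0, 2.5]))
```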
Figure 4.8. Reference log-permeability map for the PUNQ-S3 model (a); (b1)-(b4) four sample unconditional log-permeability realizations; unconditional sample mean (c) and variance (d), calculated from 50 realizations. In all rows, Columns 1-5 show the five layers of the model.

Figures 4.8(b1) through 4.8(b4) show four unconditional realizations generated from the prior model. Table 6 summarizes the geostatistical parameters used to generate the unconditional realizations. Figures 4.8c and 4.8d display the sample mean and variance, respectively, generated from 50 unconditional realizations. The unconditional realizations and the corresponding sample mean do not contain the distinctive channel-like patterns observed in Layers 1, 3, and 5 of the reference model.

Table 6. Geostatistical simulation parameters for the PUNQ-S3 reservoir (Example 3).

  Property Name                      Layer 1          Layer 2       Layer 3        Layer 4       Layer 5
  Grid dimensions Dx, Dy, Dz (ft)    180, 180, 1      180, 180, 1   180, 180, 1    180, 180, 1   180, 180, 1
  Kriging type                       Ordinary kriging in all layers
  Maximum conditioning data          12 in all layers
  Variogram type                     Spherical in all layers
  Ranges (max, med, min)             1000, 285.7, 1   750, 750, 1   1500, 375, 1   750, 375, 1   1250, 416.6, 1
  Angles (azimuth, dip, rake)        30, 0, 0         30, 0, 0      45, 0, 0       150, 0, 0     60, 0, 0
  Mean value (log-perm)              0.14             0.08          0.14           0.10          0.14
  Standard deviation (log-perm)      0.11             0.04          0.11           0.06          0.11

Figure 4.9 contains the conditional plots corresponding to Figure 4.8. The plots in Figure 4.9 show that conditioning with the SpRML algorithm results in a set of estimated models that capture the main trends in the reference model. The main discrepancy in the calibrated models is seen in the lower region of Layer 1. A closer examination reveals that there are no observations in this region and that the unconditional samples all show high values there. Hence, this discrepancy can be attributed to bias in the unconditional realizations (prior model) and a lack of observations to correct it. Overall, the calibrated models capture the major trends in the high- and low-permeability regions of the reference model fairly accurately.

Figure 4.9. Conditional realizations for Example 3: (a) reference log-permeability map for the PUNQ-S3 model (repeated from Figure 4.8); (b1)-(b4) four sample conditional log-permeability realizations corresponding to the unconditional samples in Figure 4.8; conditional sample mean (c) and variance (d), calculated from 50 realizations. In all rows, Columns 1-5 show the five layers of the model.

Figure 4.10 contains the calibration results with the sparse (K-SVD) representation. Figure 4.10a shows 15 (out of K=2660) K-SVD dictionary elements (shown as 3D permeability models). The sparse representations of the four unconditional samples in Figures 4.8(b1)-(b4) using the K-SVD dictionary elements are shown in Figure 4.10b. The sparsity level in this example is set to S=50. This implies that only 50 non-zero coefficients (out of 2660) are needed to compactly approximate each unconditional sample in Figure 4.10b (details about sparse representations with the K-SVD dictionary can be found in [86]). Figure 4.10c contains the updates applied to the unconditional realizations in Figure 4.10b, during calibration with SpRML, to generate the conditional realizations. The sparse nature of these updates is consistent with the SpRML formulation (Eq. 4.13).

Figure 4.10. Sample K-SVD atoms and model representations for Example 3: (a) 15 sample K-SVD dictionary elements (out of K=2660) with sparsity level S=50; (b) sparse coefficients representing the four unconditional realizations of Figure 4.8 in the K-SVD domain; (c) updates applied to the (sparse) unconditional realizations to generate the conditional samples.

Finally, Figure 4.11 depicts the production data match and forecasts with the unconditional and conditional log-permeability realizations. The plots of water-cut, oil production rate, and total oil production for Producer 1 (first row) and Producer 3 (second row) are shown in the left, middle, and right columns, respectively. It is clear from these figures that conditioning with the SpRML has led to substantial improvement in the data match and in future production predictions.

Figure 4.11. Data match and inference results for Example 3: data match (first four years) and production forecasts (last two years) for all 50 realizations at Producer #1 (first row) and Producer #3 (second row); Columns (a)-(c) show the well water-cut, oil production rate (STB/D), and total oil produced (STB), respectively.

Table 4.3. Computation times (in minutes) associated with each example, on a machine with an Intel Core i7-2600 CPU (3.40 GHz clock) and 16 GB RAM running a 64-bit operating system.

  Example                       Total K-SVD time   One adjoint-based forward simulation   Total history matching time per realization
  Example 1 (multi-Gaussian)    568.45             0.91                                   18.32
  Example 2 (non-Gaussian)      828.45             1.89                                   29.1
  Example 3 (PUNQ-S3)           189.85             0.99                                   30.3

Conclusions

This chapter proposed an RML implementation for sparse prior models that are not amenable to description with multi-Gaussian PDFs. The SpRML provides a practical approach to formulating and solving sparse reconstruction inverse problems and to quantifying the uncertainty in the solutions and future predictions. To this end, the models were first represented in a sparse domain; similar to previous chapters, geologically-learned K-SVD dictionaries were used for this purpose. The sparse representations of the models in the K-SVD domain were then described with Laplace prior PDFs. Combining the Laplace prior model with a Gaussian likelihood function for the observations leads to a posterior distribution that, unlike the case with multi-Gaussian priors and linear models, is not amenable to full analytical characterization. For nonlinear problems with multi-Gaussian priors, the RML has been proposed to approximately sample from the posterior distribution. The proposed framework follows the same approach and introduces the SpRML formulation as an approximate method for generating an ensemble of calibrated models for subsurface fluid flow prediction and uncertainty quantification.
While the RML sampling approach only approximately samples the conditional realizations from the posterior distribution, it presents a practical model calibration and uncertainty quantification method that has shown promising performance. To evaluate the performance of the SpRML method, this study applied it to three numerical experiments: (1) a two-dimensional model with a variogram-based multi-Gaussian permeability distribution; (2) a two-dimensional model with a complex meandering fluvial channel connectivity generated using multiple-point geostatistical simulation, i.e., not amenable to variogram-based representations; and (3) an adapted version of the three-dimensional PUNQ-S3 subsurface flow model. The results from these experiments suggest that SpRML is a promising inverse modeling and uncertainty quantification method that can take advantage of effective sparse representations as a general attribute of spatially correlated reservoir properties. The SpRML algorithm is more advantageous than deterministic sparse regularization techniques, as it offers a systematic mechanism for including the uncertainty in the prior model and measurements, and for quantifying the resulting uncertainty in the estimated parameters and future predictions. Solving spatial inverse problems in geologically-inspired sparse domains transforms model calibration into a low-order feature estimation problem, which is more consistent with the ill-posed nature of the problem. Sparse (low-rank) feature-based representation and estimation methods that take advantage of prior geologic connectivity information are better able to respect the expected geologic continuity during model calibration. They provide an effective framework for simplifying the description and estimation of complex (non-Gaussian) connectivity patterns by first applying sparse learning to encode such patterns and then recovering the solution by selecting and combining the relevant sparse features during the inversion. This dissertation has introduced the SpRML as a practical uncertainty quantification approach for sparsity-promoting model calibration and demonstrated its superior performance, through comparison with the truncated SVD parameterization method, in a series of numerical experiments with two-phase subsurface flow model calibration problems.

Chapter V. INVERSE MODELING IN COMPLEX NON-GAUSSIAN GEOLOGIC ENVIRONMENTS

As shown in previous chapters, the predictive accuracy of groundwater flow models hinges on how closely the input physical properties, such as hydraulic conductivity, represent the actual formation properties [143], [21], [29]. Flow model calibration is used to integrate available prior information about the model and its parameters with dynamic measurements of the aquifer response to introduced perturbations [131], [144], [145]. Constructing and updating subsurface flow models is an involved and iterative process that often includes inputs from geologists, geophysicists, and flow simulation experts. Prior descriptions of aquifer property distributions can be obtained by integrating available qualitative and quantitative data with an appropriate geologic continuity model using geostatistical techniques (e.g., variants of kriging or conditional simulation) [75], [107], [146]–[148]. Dynamic flow and transport measurements, including hydraulic heads, flow rates, and concentration data, provide integral, low-resolution information about the hydraulic properties between the wells.
On the other hand, realistic subsurface flow models typically involve a large number of uncertain parameters used to describe the spatial variability in rock properties. During model calibration, these parameters are tuned, subject to rules and constraints specified by modelers, to find solutions whose model responses closely match the observed dynamic data. The ultimate goal of the model calibration process, however, is to generate accurate predictive power that can be used to optimize future development plans. Traditional geostatistical methods typically make use of two-point statistical measures (i.e., variogram or covariance functions) to interpolate between limited scattered data. Second-order statistical models are appropriate for model parameters whose variability can be described using multi-Gaussian random functions. While in some cases field data may support Gaussian behavior [143], [149], univariate Gaussian distributions generally do not automatically warrant a multi-Gaussian assumption [150]–[152]. A characteristic feature of multi-Gaussian models is that transitions between extreme low and high values are gradual and involve continuous variations. Field observations from certain geologic deposits suggest that, in some cases, nearby locations show sharp contrasts in rock type and properties, features that cannot be captured using multi-Gaussian models. Examples of geologic environments that do not lend themselves to multi-Gaussian assumptions are those involving well-defined geo-bodies (e.g., braided channels, river beds, and shale lenses) [144], [150]–[152]. When the contrast in hydraulic properties across facies types is significant, correct identification of facies boundaries can be consequential in predicting the flow and transport behavior. While within-facies variability generally has an impact on fluid flow and transport behavior, it is of secondary importance when rock facies exhibit sharp contrasts in their hydraulic properties. In addition, the flow response of subsurface models tends to show greater sensitivity to large-scale changes, such as facies locations, shapes, and boundaries. For these reasons, estimating facies distributions and their spatial connectivity from dynamic data is of primary interest when calibrating flow models in certain geologic formations.

The discrete and non-Gaussian nature of facies distributions poses a major difficulty in formulating and solving facies model calibration problems. The majority of subsurface model calibration methods are designed for continuous variables; i.e., they apply continuous updates to model parameters to calibrate them. This implies that even when the starting model parameters are discrete in nature, the updated models become continuous.

Two general approaches have been used to estimate discrete-valued parameters. The first approach uses parameterization techniques to approximately represent discrete-valued models with continuous-valued models that have similar connectivity patterns. This approach allows the use of traditional continuous model calibration techniques. If necessary, a post-processing step may be applied at the end of the model calibration to convert the continuous model to a corresponding discrete-valued model. However, post-processing must be performed while ensuring that the quality of the data match is not compromised.
Examples of reduced-order parameter domains that have been used in the past include principal component analysis [18], [19], truncated singular value decomposition [153], the discrete cosine transform [70], [154], the discrete wavelet transform [116], [155], the grid-connectivity transform [117], and K-SVD sparse geologic dictionaries [86]. In previous chapters of this dissertation, sparsity-promoting regularization was combined with learned sparse representations (e.g., through the K-SVD sparse dictionary) to reconstruct complex geologic connectivity patterns. A major advantage of the K-SVD-based sparse formulation is the inherent basis (feature) selection property of the algorithm, which was exploited in Chapters III and IV to identify relevant basis functions based on observed nonlinear hydrological response data [156], [157]. The above parameterization methods have all been used for approximating discrete facies maps with continuous models. One drawback of these methods is that they cannot honor the discrete nature of the desired models. To obtain discrete facies solutions, hard thresholding can be used, in which each grid cell is assigned a facies type depending on the proximity of its estimated (continuous) parameter to the value of the same parameter in each facies type. This simple approach is convenient to implement and can enforce discreteness on estimated continuous model parameters. However, it can result in unexpected spatial discontinuity and lead to deterioration of the data match. To enhance continuity and preserve spatial geologic connectivity, more sophisticated post-processing methods have been developed [158]–[160].

The second approach for estimating discrete parameters involves a mapping that converts discrete facies to continuous tuning parameters, which are then used in model calibration to adjust the discrete facies distribution. These methods ensure that the solution remains discrete throughout the model calibration steps while it is tuned through continuous parameters. A good example of these techniques is the level-set method, which has been applied to modeling multi-facies groundwater models [31], [161], [127], [162]. In this method, a continuous sign function is used to describe facies boundaries. During model calibration, the sign function is tuned to adjust the facies boundaries until an acceptable data match is obtained. The continuous sign function is also amenable to calibration with gradient-free model calibration methods, including those that exhibit good performance when applied to parameters with near-multi-Gaussian distributions, such as the Ensemble Kalman Filter (EnKF) [126], [129], [163]. On the other hand, the use of a simple sign function implies that the corresponding discrete maps cannot be too complex (e.g., braided fluvial channels). The EnKF has also been used with several continuous parameterization and transformation techniques, including the level-set method, where continuous sign functions are updated to calibrate discrete facies distributions [127], [164]. Recently, the probability conditioning method (PCM) was used to condition discrete multiple-point statistical facies
This framework introduces a complex mapping between continuous and discrete domains, which complicates the correspondence between continuous parameters and the discrete facies distribution. This complexity can introduce difficulties in gradient computation for local search methods and may result in inadequate data match for ensemble-based data assimilation method such as EnKF. This chapter proposes a novel framework for calibration of facies models against dynamic data. To this end, let us first introduce a general framework for nonlinear inverse modeling problems with multiple constraints. To develop the discrete facies estimation approach that yields plausible solutions, this study proposes incorporating three main requirements in the formulation as follows: i) satisfactory data match; ii) preservation of a pre-specified spatial connectivity patterns, and iii) solution discreteness (i.e. certain quantization levels are accepted for the desired image model). To implement these requirements, a regularized least-squares objective function is constructed where the mismatch between observed and predicted flow data is minimized along with two regularization terms, one for promoting solution discreteness, and another for reproducing geologic connectivity patterns. Solution discreteness is imposed through well-potential penalty functions while the expected facies connectivity is preserved through low-rank parameterization methods that are derived from prior model realizations. An efficient alternating directions algorithm is introduced to minimize the resulting multi-regularized 131 least-squares objective function by decomposing it into simpler optimization sub-problems that are solved sequentially. Several numerical experiments are presented to evaluate the performance of the developed discrete regularization method for calibrating geologic facies models. Methodology This section formulates a regularized least-squares approach for discrete valued model calibration. The forward and inverse problems are briefly introduced first, followed by a discussion on preserving the expected geologic connectivity and solution discreteness. Throughout, the 𝑙 𝑝 -norm of a length-𝑛 vector 𝐮 = [𝑢 1 𝑢 2 … 𝑢 𝑛 ] 𝑇 (for 𝑝 > 0) is: ‖𝐮 ‖ 𝑝 = (∑ |𝑢 𝑖 | 𝑛 𝑖 =1 𝑝 ) 1 𝑝 (5.1) where the operator |. | computes the absolute value of its argument. To introduce the notations used in this paper and their interpretations within the context of subsurface flow model parameters, Figure 1 illustrates a general subsurface property distribution and its transform-domain representation. The notations 𝐮 , 𝐯 , and 𝐝 represent the unknown model parameter vector in the space domain, the coefficients corresponding to approximation of 𝐮 in a linear expansion form 𝐮 ≈ 𝐮̃ = 𝚽𝐯 , and the observation vector, respectively. Note that for facies models, the parameter 𝐮 consists of discrete variables while 𝐯 involves continuous coefficients. In that case the notation, 𝐮̃ = 𝚽𝐯 represents the best (in least- squares sense) continuous approximation of the discrete facies map, i.e., 𝐮 in a given basis 𝚽 . The function 𝐠 (. ) represents a deterministic nonlinear mapping from the parameter 132 domain to the observation domain, i.e., 𝐠 (𝐮 ) = 𝐝 . In practice, this mapping involves observation errors, and is expressed as 𝐠 (𝐮 ) = 𝐝 + 𝛆 (5.2) This chapter, aims to identify discrete geologic facies distribution from integration of dynamic data and explicit prior model parameter constraints, in the form of a training dataset. 
To introduce the notation used in this chapter and its interpretation within the context of subsurface flow model parameters, Figure 5.1 illustrates a general subsurface property distribution and its transform-domain representation. The notations $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{d}$ represent the unknown model parameter vector in the space domain, the coefficients corresponding to the approximation of $\mathbf{u}$ in a linear expansion of the form $\mathbf{u} \approx \tilde{\mathbf{u}} = \boldsymbol{\Phi}\mathbf{v}$, and the observation vector, respectively. Note that for facies models the parameter $\mathbf{u}$ consists of discrete variables while $\mathbf{v}$ involves continuous coefficients; in that case, $\tilde{\mathbf{u}} = \boldsymbol{\Phi}\mathbf{v}$ represents the best (in the least-squares sense) continuous approximation of the discrete facies map $\mathbf{u}$ in a given basis $\boldsymbol{\Phi}$. The function $\mathbf{g}(\cdot)$ represents a deterministic nonlinear mapping from the parameter domain to the observation domain, i.e., $\mathbf{g}(\mathbf{u}) = \mathbf{d}$. In practice, this mapping involves observation errors and is expressed as

$$\mathbf{g}(\mathbf{u}) = \mathbf{d} + \boldsymbol{\varepsilon}. \qquad (5.2)$$

This chapter aims to identify discrete geologic facies distributions from the integration of dynamic data and explicit prior model parameter constraints, in the form of a training dataset. To this end, two sets of parameters are defined: $\mathbf{u}$, an $n$-dimensional parameter vector that represents the discrete facies type assigned to each grid block of the numerical model; and $\mathbf{v}$, the coefficients of the approximation of $\mathbf{u}$ in a linear transform domain, i.e., $\mathbf{u} \approx \boldsymbol{\Phi}\mathbf{v}$. Because $\mathbf{u}$ is discrete and $\mathbf{v}$ is continuous, this study formulates a two-step optimization procedure that updates $\mathbf{u}$ and $\mathbf{v}$ sequentially. In the first step, $\mathbf{v}$ is updated (in the transform domain) by minimizing the data mismatch while promoting the expected spatial connectivity (through SVD or $k$-SVD parameterization). In the second step, a discrete regularization function (in the space domain) is used to find a discrete solution $\mathbf{u}$ that is close to the continuous solution obtained in the previous step (i.e., after updating $\mathbf{v}$). These two steps are repeated until stopping criteria are met. The remainder of this section describes the components of the developed method in more detail.

Figure 5.1. Visual representation of the linear expansion: the spatial distribution of a subsurface flow model (top) and its sparse representation using linear expansion functions/images (bottom).

i) Connectivity-Preserving Constraint

When prior information on the expected connectivity patterns in the solution is available, it is important to constrain the solution to preserve such patterns. While simple mathematical expressions, such as Tikhonov regularization forms, exist to promote generic attributes (e.g., smoothness) in the solution, specialized constraints that can be learned from reliable prior information are more effective. A classical approach to learning compact representations of spatial connectivity patterns from prior models is low-rank Singular Value Decomposition (SVD). In Chapter II and [123], training data were used to learn sparse representations that incorporate and preserve connectivity patterns in solving inverse problems. Specifically, the $k$-SVD algorithm [62] was used to learn sparse representations of complex spatial patterns in rock property distributions from prior training data; the advantages of using the $k$-SVD algorithm to learn sparse representations are reported in [123]. Similarly, in this chapter, the SVD and $k$-SVD expansion functions are constructed from a set of $L$ training model realizations that provide spatial connectivity constraints for the parameters. These models can be constructed by combining static measurements of rock properties with a conceptual model of spatial continuity, using geostatistical simulation techniques. In the case of SVD, a reduced subspace is predefined by the $S$ leading left singular vectors of the training data matrix; the coordinates of this subspace compactly describe high-dimensional spatial patterns in the training data. In the $k$-SVD algorithm, on the other hand, the linear expansion functions are learned (i.e., derived) from the parameter training dataset to provide sparse expansion coefficients, without any order assigned to the expansion functions. Thus, one of the main advantages of the sparsity-promoting nonlinear model calibration framework introduced in Chapter II is that the solution subspace is not predetermined and has to be found during model calibration. Sparse reconstruction algorithms have been developed to search a relatively large number of $k$-SVD expansion functions and identify a subspace (i.e., a small subset of expansion functions) that best represents the solution.
In this chapter's formulation, both the SVD and $k$-SVD methods are used to preserve geologic connectivity (Appendix A provides a brief overview of the $k$-SVD algorithm). When the $k$-SVD representation is used, a sparse reconstruction algorithm (introduced in Chapter II) is applied to find the solution, whereas for the SVD parameterization no regularization term is needed, as the expansion functions are predetermined. The least-squares objective function that combines the data mismatch and the parameterization is expressed as

$$\hat{\mathbf{u}}, \hat{\mathbf{v}} = \operatorname*{argmin}_{\mathbf{u},\mathbf{v}}\ \|\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}) - \mathbf{d}_{\mathrm{obs}}\|_2^2 + \lambda_R^2\, R(\mathbf{v}) + \lambda_{\Phi}^2\, \|\boldsymbol{\Phi}\mathbf{v} - \mathbf{u}\|_2^2. \qquad (5.3)$$

Here, $R(\mathbf{v})$ is a general regularization constraint on the transformed coefficients $\mathbf{v}$, and $\boldsymbol{\Phi}$ is the matrix containing the expansion functions; $\mathbf{u}$ represents the discrete parameters, and $\boldsymbol{\Phi}\mathbf{v}$ denotes their continuous approximation. The regularization constraint $R(\mathbf{v})$ depends on the characteristics of the basis functions in $\boldsymbol{\Phi}$. For $k$-SVD basis functions, the $\ell_1$-norm can be used to promote sparsity, i.e., $R(\mathbf{v}) = \|\mathbf{v}\|_1$; this regularization retains only the significant $k$-SVD expansion functions during model calibration (see Appendix A and [111] for details). When the SVD parameterization is used, $R(\mathbf{v})$ is not needed, as the significant SVD basis functions are predetermined as the $S$ leading left singular vectors of the matrix containing the prior parameter realizations in its columns. It is important to note that in each case the expansion functions $\boldsymbol{\Phi}$ are derived (learned) from parameter training data and contain the expected spatial connectivity patterns, while the expansion coefficients $\mathbf{v}$ are the corresponding weights given to each element. The penalty term $\lambda_{\Phi}^2 \|\boldsymbol{\Phi}\mathbf{v} - \mathbf{u}\|_2^2$ is introduced to ensure that the approximation with the linear expansion remains close to the discrete solution. The regularization parameters $\lambda_R^2$ and $\lambda_{\Phi}^2$ are positive weights that control the importance of their corresponding regularization terms.

ii) Discrete Regularization

The objective function in Eq. (5.3) needs an additional regularization term to promote solution discreteness. When the discrete values that the solution can take (the number of facies and their corresponding property values) are known a priori, one can introduce an integer constraint requiring each component of the solution to belong to the set $\mathbf{C} = \{l_1, \dots, l_c\}$, that is, $u_i \in \mathbf{C}$ [168], [169]. A hard integer constraint can be shown to lead to an NP-hard problem [168]. However, a soft version of the constraint can be introduced through discrete regularization. A simple regularization function, denoted $D(\mathbf{u})$, can be considered, whose value is zero when the model parameters $\mathbf{u}$ assume discrete (facies indicator) values and positive otherwise. A typical discreteness regularization consists of the sum of non-negative penalties over the individual grid blocks of the model, i.e., over each $u_i$ in $\mathbf{u}$ [170]:

$$D(\mathbf{u}) = \sum_{i=1}^{n} D_p(u_i), \qquad (5.4)$$

where $n$ is the total number of grid blocks, and $D_p(u_i)$ is a single-variable, locally convex (well-potential) function with its minima at the (discrete) facies values $l_1, \dots, l_c$. Such a function can be generally defined as follows (the subscript $p$ refers to the type of function, i.e., a well potential in this case):

$$D_p(u_i) = 0 \ \text{if } u_i \in \mathbf{C}; \quad D_p(u_i) > 0 \ \text{if } u_i \notin \mathbf{C}; \quad \exists\, 0 < \epsilon \ll 1:\ D_p \text{ is convex on } [l_j - \epsilon,\ l_j + \epsilon]\ \forall\, l_j \in \mathbf{C}. \qquad (5.5)$$

After adding the discrete regularization function as a constraint, the objective function in Eq. (5.3) can be expressed as

$$\hat{\mathbf{u}}, \hat{\mathbf{v}} = \operatorname*{argmin}_{\mathbf{u},\mathbf{v}}\ \|\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}) - \mathbf{d}_{\mathrm{obs}}\|_2^2 + \lambda_R^2\, R(\mathbf{v}) + \lambda_D^2\, D(\mathbf{u}) + \lambda_{\Phi}^2\, \|\boldsymbol{\Phi}\mathbf{v} - \mathbf{u}\|_2^2. \qquad (5.6)$$

Figure 5.2a depicts an example of $D_p(u)$ for $u \in [l_{j-1}, l_j]$, and Figures 5.2b and 5.2c show the general behavior of $D_p(u)$ for three and four discrete levels, respectively. The proposed framework considers fourth-order well-potential functions of the form

$$D_p(u) = h_j\, (u - l_{j-1})^2 (u - l_j)^2 \ \text{for } u \in [l_{j-1}, l_j],\ j \in \{2, \dots, c\}; \quad \text{undefined otherwise}, \qquad (5.7)$$

where $h_j$ denotes the weight given to each piece of the function to control the penalty for deviation from the different discrete levels. For two-facies models with $l_1 = 0$ and $l_2 = 1$, the regularization function takes the form

$$D_p(u) = u^2 (u - 1)^2 \ \text{for } u \in [0, 1]; \quad \text{undefined otherwise}. \qquad (5.8a)$$

When three-facies models are considered, with $l_1 = 0$, $l_2 = 2$, and $l_3 = 3$, the function is expressed as

$$D_p(u) = h_1\, u^2 (u - 2)^2 \ \text{for } u \in [0, 2]; \quad D_p(u) = h_2\, (u - 2)^2 (u - 3)^2 \ \text{for } u \in [2, 3]; \quad \text{undefined otherwise}. \qquad (5.8b)$$

Note that the regularization functions in Eqs. (5.8a) and (5.8b) are zero at the discrete values and are convex and positive in closed neighborhoods of the discrete values. Furthermore, the size of the penalty can be controlled by the parameters $h_1$ and $h_2$. Geologically, adding the regularization term $D(\mathbf{u})$ to the objective function forces the continuous values assigned to the rock property distribution $\mathbf{u}$ in Eq. (5.6) to converge to one of the discrete facies levels. Figure 5.2d depicts a few alternative functions that have been introduced in the literature to promote solution discreteness in inverse problems.

Figure 5.2. Sample discrete regularization functions: (a) for two-valued facies; (b) for three-valued and (c) four-valued facies; (d) the behavior of alternative discrete regularization functions in the neighborhood of the $i$-th discrete level.
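For illustration, the sketch below is a minimal, vectorized numpy implementation of this piecewise well potential (Eqs. 5.7-5.8); the function name and the choice to extend the nearest interval's branch outside $[l_1, l_c]$ (which the formulation leaves undefined) are assumptions made here.

```python
import numpy as np

def well_potential(u, levels, h=None):
    """Piecewise fourth-order well potential of Eqs. (5.7)-(5.8): zero at each
    facies level in `levels` (sorted), positive and locally convex in between.
    Values outside [l_1, l_c] reuse the nearest interval's branch, a choice
    made here since the formulation leaves them undefined."""
    levels = np.asarray(levels, float)
    h = np.ones(len(levels) - 1) if h is None else np.asarray(h, float)
    u = np.asarray(u, float)
    j = np.clip(np.searchsorted(levels, u) - 1, 0, len(levels) - 2)  # interval
    return h[j] * (u - levels[j]) ** 2 * (u - levels[j + 1]) ** 2

# D(u) of Eq. (5.4) is then simply well_potential(u, [0, 2, 3], [1.0, 1.0]).sum()
```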
To solve the optimization problem in Eq. (5.6), this study uses a two-step alternating directions algorithm that iteratively solves the following optimization sub-problems to update $\mathbf{v}$ and $\mathbf{u}$:

Step (1): $\quad \mathbf{v}^{k+1} = \operatorname*{argmin}_{\mathbf{v}}\ \|\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}) - \mathbf{d}_{\mathrm{obs}}\|_2^2 + \lambda_R^2\, R(\mathbf{v}) + \lambda_{\Phi}^2\, \|\boldsymbol{\Phi}\mathbf{v} - \mathbf{u}^k\|_2^2$

Step (2): $\quad \mathbf{u}^{k+1} = \operatorname*{argmin}_{\mathbf{u}}\ \lambda_D^2\, D(\mathbf{u}) + \lambda_{\Phi}^2\, \|\boldsymbol{\Phi}\mathbf{v}^{k+1} - \mathbf{u}\|_2^2 \qquad (5.9)$

Note that, for compactness, the constant terms in each of the above objective functions that do not affect the optimization solution have been eliminated. It is also important to note that the term $\lambda_{\Phi}^2 \|\boldsymbol{\Phi}\mathbf{v} - \mathbf{u}\|_2^2$ couples the two optimization steps and ensures that the updates are applied gradually, to avoid drifting too far from the solution of the previous update. The solutions of the two optimization sub-problems depend on the regularization parameters $\lambda_R^2$, $\lambda_D^2$, and $\lambda_{\Phi}^2$. As in other nonlinear regularized least-squares problems, reasonable choices for these regularization parameters can be identified through sensitivity analysis and cross-validation. The results of this study suggest that, within a reasonably wide range, the solution does not show much sensitivity to these parameters (see the next section for discussion).
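To make the structure of the iteration explicit, the skeleton below alternates the two updates of Eq. (5.9) until the discrete model stops changing; `update_v` and `project_u` are placeholders for the problem-specific Step (1) and Step (2) solvers, and the stopping rule shown is an illustrative choice.

```python
import numpy as np

def alternating_calibration(update_v, project_u, u0, n_outer=30, tol=1e-6):
    """Outer loop of Eq. (5.9): alternate Step (1), a continuous transform-domain
    update of v given the current discrete model u, and Step (2), a discrete
    space-domain projection of Phi v. `update_v(u)` and `project_u(v)` are
    placeholders for the problem-specific sub-problem solvers."""
    u = np.asarray(u0, float).copy()
    for _ in range(n_outer):
        v = update_v(u)                 # Step (1): data match + connectivity
        u_new = project_u(v)            # Step (2): discreteness
        if np.linalg.norm(u_new - u) <= tol * max(np.linalg.norm(u), 1.0):
            return v, u_new             # discrete model stopped changing
        u = u_new
    return v, u
```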
The main advantage of the alternating directions algorithm is that it divides complex objective functions into simpler functions that are easier to solve. For instance, the second step of the optimization in Eq. (5.9) does not involve the data mismatch term, which is complex and computationally demanding to compute (because it requires forward simulation). In some cases, the optimization sub-problems may even have closed-form solutions. Furthermore, since the optimization problems are solved multiple times (and sequentially), it is not necessary to find the exact solution at each iteration; a few iterations of each optimization sub-problem may be sufficient to provide an acceptable solution for the next iteration. However, it is recommended that this property be verified for any given problem before its adoption. It is also worth mentioning that the proposed algorithm bears some similarity to the well-known Alternating Directions Method of Multipliers (ADMM) [171].

For model calibration, Step (1) of the above algorithm can be solved using a standard gradient-based method, such as the Gauss-Newton, conjugate gradient, or BFGS algorithm. In the following examples, Step (1) is solved with the Gauss-Newton algorithm using only two iterations, to obtain a "close-enough" approximate solution. Step (2) of the above algorithm can be separated into $n$ grid-cell-level optimization problems, because the regularization term $D(\mathbf{u}) = \sum_{i=1}^{n} D_p(u_i)$ operates on the individual pixels of $\mathbf{u}$. Therefore, defining $\boldsymbol{\theta}^{k+1} = [\theta_1^{k+1}\ \theta_2^{k+1}\ \dots\ \theta_n^{k+1}]^T = \boldsymbol{\Phi}\mathbf{v}^{k+1}$, Step (2) can be divided into $n$ sub-problems:

$$\forall i \in \{1, 2, \dots, n\}: \quad u_i^{k+1} = \operatorname*{argmin}_{u_i}\ \lambda_D^2\, D_p(u_i) + \lambda_{\Phi}^2\, \left(\theta_i^{k+1} - u_i\right)^2. \qquad (5.10)$$

Defining $\beta^2 = \lambda_D^2 / \lambda_{\Phi}^2$, the optimization problems in Eq. (5.10) further simplify to

$$\forall i \in \{1, 2, \dots, n\}: \quad u_i^{k+1} = \operatorname*{argmin}_{u_i}\ \left(\theta_i^{k+1} - u_i\right)^2 + \beta^2\, D_p(u_i), \qquad (5.11)$$

which can be easily solved by the following rules:

(i) $u_i^{k+1} = \min\{l_1, \dots, l_c\}$ if $\theta_i^{k+1} \le \min\{l_1, \dots, l_c\}$;
(ii) $u_i^{k+1} = \max\{l_1, \dots, l_c\}$ if $\theta_i^{k+1} \ge \max\{l_1, \dots, l_c\}$; and
(iii) $u_i^{k+1} = \operatorname*{argmin}_{u_i}\ \left(\theta_i^{k+1} - u_i\right)^2 + \beta^2\, D_p(u_i, l_j, l_{j+1})$ if $l_j \le \theta_i^{k+1} \le l_{j+1}$. $\qquad (5.12)$

Here, $D_p(u_i, l_j, l_{j+1})$ is defined as

$$D_p(u_i, l_j, l_{j+1}) = D_p(u_i) \ \text{for } u_i \in [l_j, l_{j+1}]; \quad \text{undefined otherwise}. \qquad (5.13)$$

Figure 5.3 depicts the objective function $\left(\theta_i^{k+1} - u_i\right)^2 + \beta^2 u_i^2 (u_i - 1)^2$ for different values of $\theta_i^{k+1} \in [0, 1]$ and $\beta^2 \in \{100, 0.1, 0.01, 0.001\}$, for the discrete levels 0 and 1. As shown in Figure 5.3, the solution of the optimization problem approaches the discrete levels 0 and 1 when $\beta$ is large. For smaller values of $\beta$, the solution depends on the value of $\theta_i$ (it converges to the discrete levels only when $\theta_i$ is already close to a discrete value).

Figure 5.3. Behavior of the discrete penalty function $(u_i - \theta_i)^2 + \beta^2 u_i^2 (u_i - 1)^2$, $u_i \in [0, 1]$, for different values of $\beta^2$ and of the input argument $\theta_i \in \{0.1, 0.45, 0.65, 0.9\}$. In each plot, the x and y axes show the $u_i$ values and the value of the objective function, respectively. As the value of $\beta$ increases, the minimum of this function moves closer to the discrete value 0 or 1, depending on the value of the continuous variable $\theta_i$.
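A concrete implementation of rules (i)-(iii) for a single grid block is sketched below: the interior case (iii) is solved by examining the real roots of the cubic derivative of the quartic objective; the function name and the root-finding route are choices made here for illustration, not prescribed by the formulation.

```python
import numpy as np

def project_pixel(theta, levels, beta2, h=None):
    """Per-pixel solve of Eq. (5.11) via rules (i)-(iii): clip theta to the facies
    range, otherwise minimize (theta-u)^2 + beta2*h_j*(u-l_j)^2*(u-l_{j+1})^2 on
    the bracketing interval using the real roots of the cubic derivative."""
    levels = np.sort(np.asarray(levels, float))
    h = np.ones(len(levels) - 1) if h is None else np.asarray(h, float)
    if theta <= levels[0]:
        return levels[0]                      # rule (i)
    if theta >= levels[-1]:
        return levels[-1]                     # rule (ii)
    j = np.searchsorted(levels, theta) - 1    # bracketing interval [l_j, l_j+1]
    lj, lj1, hj = levels[j], levels[j + 1], h[j]
    f = lambda u: (theta - u) ** 2 + beta2 * hj * (u - lj) ** 2 * (u - lj1) ** 2
    # stationarity: (u - theta) + beta2*hj*(u-lj)(u-lj1)(2u-lj-lj1) = 0, a cubic
    a, b = lj + lj1, lj * lj1
    coeffs = [2.0 * beta2 * hj,
              -3.0 * a * beta2 * hj,
              beta2 * hj * (a * a + 2.0 * b) + 1.0,
              -beta2 * hj * a * b - theta]
    cand = [r.real for r in np.roots(coeffs)
            if abs(r.imag) < 1e-9 and lj <= r.real <= lj1]
    cand += [lj, lj1]                         # interval endpoints (D_p = 0 there)
    return min(cand, key=f)                   # rule (iii)
```

Applying this function independently to every component of $\boldsymbol{\theta}^{k+1}$ completes Step (2) of Eq. (5.9).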
Numerical Experiments

Three sets of numerical experiments are discussed to examine the effectiveness of the proposed discrete regularization for the identification of litho-facies. The first experiment is a linear inverse problem involving straight-ray travel-time tomography. The remaining examples involve a groundwater pumping test with constant drawdown, in which the horizontal hydraulic conductivity of the aquifer is estimated from pressure and hydraulic conductivity observations [172]–[175]. For the pumping tests, three separate examples with alluvial/fluvial channel connectivity patterns in a 2D layer of an aquifer are discussed. In these experiments, in addition to discrete regularization, either reduced-order parameterization in a truncated SVD basis or sparsity-promoting regularization with sparse $k$-SVD dictionaries is used, both serving to preserve the facies connectivity learned from prior models. It is important to note that without discrete regularization, the low-rank SVD and $k$-SVD approximations provide continuous solutions. In these experiments, zero-mean Gaussian observation noise with standard deviation equal to 5% of the observed data is synthetically added to the simulated data, which are generated using the reference cases.

Example 1: Tomographic Inversion

The performance of the proposed discrete regularization framework is first examined in a linear tomographic reconstruction problem [176], [177]. In travel-time tomography, several sources transmit acoustic waves through a medium, and the arrival times of these waves are recorded by receivers at known distances. The governing equation relating the recorded arrival times to the slowness (the unknown parameter) of the medium can be written as

$$t - \int s(\vec{x})\, d\vec{x} = 0, \qquad s(\vec{x}) = \frac{1}{v(\vec{x})}, \qquad (5.14)$$

where $t$ denotes the arrival time of each wave, $s(\vec{x})$ represents the slowness (reciprocal velocity) of the rock at the spatial locations specified by the coordinate vector $\vec{x}$, and the integral is taken along the ray path. Under simplified conditions, a discretized version of Eq. (5.14) forms a linear system of equations between the recorded arrival times (i.e., observations) and the discretized rock slowness map (i.e., parameters), i.e., $\mathbf{d} = \mathbf{G}\mathbf{u}$. Reconstruction of the slowness map (or velocity model) from limited arrival-time information presents an underdetermined linear inverse problem, which is considered here to evaluate the performance of the proposed discrete regularization formulation.

In the first example, the array of sources and receivers for the tomographic problem is shown in Figure 5.4(a), with straight ray paths from the source array (top) to the receiver array (bottom). Figure 5.4(b) shows the reference slowness map used to generate the synthetic tomographic observations. The reference map has dimensions 6400 × 6400 × 10 ft³, discretized into a 64 × 64 × 1 grid. The objective of this example is to reconstruct the reference slowness map from the 6 × 6 = 36 noisy travel-time measurements at the receivers, using the proposed regularization functions. The realizations of the prior model are generated using the sequential indicator simulation technique [60].
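For reference, the sketch below assembles an approximate straight-ray sensitivity matrix $\mathbf{G}$ for the discretized relation $\mathbf{d} = \mathbf{G}\mathbf{u}$ by densely sampling points along each source-receiver segment; this sampling approximation and the function name are choices made here (an exact cell-intersection algorithm such as Siddon's would be used in a production code).

```python
import numpy as np

def straight_ray_matrix(sources, receivers, nx, ny, cell, n_samples=400):
    """Approximate sensitivity matrix G for the discretized Eq. (5.14), d = G u:
    entry (ray, cell) ~ ray path length inside that grid cell, estimated by
    dense point sampling along each straight source-receiver segment."""
    G = np.zeros((len(sources) * len(receivers), nx * ny))
    t = (np.arange(n_samples) + 0.5) / n_samples          # sample fractions
    for i, s in enumerate(sources):
        s = np.asarray(s, float)
        for j, r in enumerate(receivers):
            r = np.asarray(r, float)
            pts = np.outer(1.0 - t, s) + np.outer(t, r)   # points along the ray
            seg = np.linalg.norm(r - s) / n_samples       # length per sample
            ix = np.clip((pts[:, 0] // cell).astype(int), 0, nx - 1)
            iy = np.clip((pts[:, 1] // cell).astype(int), 0, ny - 1)
            np.add.at(G[i * len(receivers) + j], iy * nx + ix, seg)
    return G

# e.g. 6 sources on the top edge, 6 receivers on the bottom of a 64 x 64 grid
src = [(x, 6400.0) for x in np.linspace(500, 5900, 6)]
rcv = [(x, 0.0) for x in np.linspace(500, 5900, 6)]
G = straight_ray_matrix(src, rcv, 64, 64, 100.0)          # travel times d = G @ u
```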
The initial model realizations (the first 20 out of 2000) and the corresponding $k$-SVD elements (for $k = 1000$, $S = 30$) are shown in Figures 5.4(c) and 5.4(d), respectively. This chapter primarily focuses on the discrete regularization technique; additional details about the above parameterization methods can be found in Chapter II.

Figure 5.4. Travel-time tomography example: (a) transmitter/receiver setup; (b) reference three-facies model; (c) initial model realizations generated using sequential indicator simulation; and (d) the corresponding $k$-SVD dictionary atoms learned from the realizations (with $k = 1000$, $S = 30$).

The reconstructed models for the above tomography setup are shown in Figure 5.5 under three main weighting assumptions. In this example, facies types {0, blue}, {2, yellow}, and {3, red} represent the low-, medium-, and high-conductivity regions, respectively. The resulting regularization function from Eq. (5.7) takes the form

$$D_p(u) = h_1\, u^2 (u - 2)^2 \ \text{for } u \in [0, 2]; \quad D_p(u) = h_2\, (u - 2)^2 (u - 3)^2 \ \text{for } u \in [2, 3]; \quad \text{undefined otherwise}. \qquad (5.15)$$

For $(h_1 = 100, h_2 = 1)$, $(h_1 = 1, h_2 = 100)$, and $(h_1 = 1, h_2 = 1)$, the resulting regularization functions are shown in Figures 5.5(a)-5.5(c), respectively; in each case the corresponding solutions are shown next to the regularization function. In Figure 5.5(a), the regularization function is designed to give more weight to the correct identification of facies type {0} than to differentiating between facies {2} and {3}; this can be verified by noting that the boundaries between the yellow and red facies are smeared. Similarly, in Figure 5.5(b), where $h_1 = 1$ and $h_2 = 100$, the rock type with the highest property value (e.g., the sand facies) is reconstructed more accurately: the red facies has sharp boundaries, while the boundary between the yellow and blue facies is blurred. Figure 5.5(c) shows the results for the balanced case with $h_1 = h_2 = 1$. To compare the performance of different discrete regularization functions, Figure 5.5(d) shows the reconstruction results for a locally shifted Tukey's bi-weight function, which assigns the same penalty (the flat region of the function) to deviations beyond a certain distance from the discrete values. A comprehensive study with various discrete regularization function forms [178] showed that the reconstruction outcomes with these regularization functions are quite similar. In fact, the use of the other discrete regularization functions shown in Figure 5.2d resulted in similar outcomes (not shown), suggesting that the results are not very sensitive to the exact shape of the discrete regularization function. The reader should note, however, that this observation cannot be generalized and may depend on the weights given to the regularization parameters.

Figure 5.5. Travel-time tomography reconstruction results with different discrete regularization function forms: (a)-(c) show examples in which more weight is given to the identification of the shale, the sand, and both the shale and sand facies types, respectively, as discrete values; the results in (d) are for the locally shifted Tukey's bi-weight function and are quite similar to those in (c), suggesting a similar effect for the two regularization forms.

Example 2: Integration of Dynamic and Static Data

As a second example, consider groundwater pumping tests in which hydraulic conductivity and pressure data at observation wells are used to estimate the full map of the hydraulic conductivity distribution in an aquifer.
The governing equations of the pumping test problem are derived from the classical equations of single-phase flow in a saturated porous medium, i.e., conservation of mass and Darcy's law. Assuming that mass fluxes due to dispersion and diffusion are negligible, the underlying equations can be stated as:

\[
\begin{cases}
\dfrac{\partial(\phi\rho)}{\partial t} = -\nabla\cdot(\rho\,\mathbf{v}) + q & \text{conservation of mass}\\[4pt]
\mathbf{v} = -\dfrac{1}{\mu}\,\mathbf{K}\,(\nabla p - \rho g \nabla z) & \text{Darcy's law}
\end{cases}
\tag{5.16}
\]

where φ, ρ, v, p and q represent porosity, density, velocity, pressure and source/sink terms (injection/extraction rates), respectively. The notations K and μ refer to permeability and viscosity, respectively. Hydraulic conductivity is defined as K_h = Kρg/μ and is used more commonly in hydrogeology. For pumping test problems, the forward model d = g(u = K_h) relates the full map of hydraulic conductivity, u = K_h, to the observed pressure heads at well locations, d. The nonlinear mapping presents an implicit relation that can be established by (numerically) solving the system of equations in Eq. (5.16).

The experimental setup in this example is depicted in Figure 6(a). A no-flow boundary condition is applied to the top and bottom boundaries of the domain, while constant pressure heads are applied to the right and left boundaries to establish a pressure gradient (background velocity field) from left to right. A single well in the middle of the domain pumps water with a constant draw-down head of 14 m under steady-state flow conditions. The spatial distribution of log-hydraulic conductivity (unit: log(m/s)) is assumed to be the only unknown parameter to estimate. The log-hydraulic conductivity values for the two facies types are −4 and −2.9, respectively. Pressure heads at monitoring wells surrounding the pumping well and the water extraction rate at the pumping well are measured. The reference hydraulic conductivity model and its corresponding pressure head distribution are depicted in Figures 6(b) and 6(c). In this example, the aquifer is of size 2107 × 2107 × 10 m³ and is uniformly discretized into 64 × 64 × 1 cells. The pumping well is in cell (32, 32).

A set of 2000 facies model realizations with representative discrete patterns is used as prior (learning) data. Figure 6(d1) shows twenty sample prior facies models that are generated using multiple-point geostatistics in SGeMS [179] without conditioning on hard data. Shown in Figure 6(e1) are twenty samples that are conditioned on hard data at the pumping and monitoring wells. The corresponding SVD basis functions and k-SVD dictionary atoms for each case are shown in Figures 6(d2)-(d3) and 6(e2)-(e3), respectively. To conduct these experiments, extensive sensitivity analysis was performed to identify regularization parameters that provided stable solutions (i.e., several trials were used to bracket regularization parameters that provided similar solutions while minimizing the terms in the objective function).

Figure 5.6. Well configuration, realizations and transform functions for Example 2. Well configuration (a), reference facies model (b) and its corresponding pressure head map (c) for Example 2; (d) and (e) show sample prior (training) models (left) used for constructing SVD basis (middle) and k-SVD (right) sparse dictionaries, without and with conditioning on hard data, respectively.
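As a rough illustration of the forward operator g(·) implied by Eq. (5.16), the following sketch solves the steady-state, single-phase head equation on a 2D grid with harmonic-mean face conductivities. It assumes unit fluid properties, absorbs grid geometric factors into the units, and omits the pumping well for brevity; the function name and toy field are illustrative assumptions, not the simulator used in this work.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def pressure_head(logK, nx, ny, p_left, p_right):
    """Steady-state head for div(Kh grad p) = 0 on an nx-by-ny grid:
    harmonic-mean conductivities on interior faces, no-flow on the top
    and bottom boundaries, fixed heads on the left/right boundaries."""
    K = 10.0 ** np.asarray(logK, dtype=float).reshape(ny, nx)
    idx = lambda i, j: j * nx + i
    A = sp.lil_matrix((nx * ny, nx * ny))
    b = np.zeros(nx * ny)
    for j in range(ny):
        for i in range(nx):
            k = idx(i, j)
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < nx and 0 <= jj < ny:
                    # harmonic mean of the two adjacent cell conductivities
                    t = 2.0 / (1.0 / K[j, i] + 1.0 / K[jj, ii])
                    A[k, k] += t
                    A[k, idx(ii, jj)] -= t
            if i == 0:                      # Dirichlet head on the left edge
                A[k, k] += 2.0 * K[j, i]; b[k] += 2.0 * K[j, i] * p_left
            if i == nx - 1:                 # Dirichlet head on the right edge
                A[k, k] += 2.0 * K[j, i]; b[k] += 2.0 * K[j, i] * p_right
            # top/bottom edges: no-flow, so nothing is added
    return spla.spsolve(A.tocsr(), b)

# toy run: 64 x 64 two-facies log-conductivity field (values from this example)
nx = ny = 64
logK = np.full(nx * ny, -4.0)
logK[1000:1300] = -2.9                      # an arbitrary high-conductivity streak
p = pressure_head(logK, nx, ny, p_left=50.0, p_right=10.0)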
Figure 7 summarizes the reconstruction results using only the dynamic pressure head data at the monitoring wells and the water extraction rate at the pumping well. The calibrated models shown in the first row of Figure 7 correspond, from left to right, to the standard SVD parameterization (Figure 7(a1)), the continuous solution from the SVD parameterization with discrete regularization (Figure 7(b1)), the k-SVD sparse reconstruction (Figure 7(c1)), and the continuous solution from the k-SVD sparse reconstruction with discrete regularization (Figure 7(d1)). Note that these solutions are continuous and do not represent the final desired discrete map. The results indicate that promoting discreteness in the solution enhances the structural connectivity in the continuous (non-discrete) estimated models. The final discrete solution for each case is shown under the corresponding continuous solution in Figures 7(a1)-(d1); these include simple post-processing of the continuous SVD and k-SVD solutions (Figures 7(a1) and 7(c1)), and the results obtained from the proposed formulations (Figures 7(b1) and 7(d1)). Comparing the results from these four cases indicates that promoting discreteness improves the facies representation and enhances the structural connectivity in the estimated model. It is also evident that the results obtained by using the k-SVD representation are superior to those generated by the SVD parameterization, an outcome that is consistent with the reported results in [123].

Figure 5.7. Calibration results for Example 2, without (a1)-(d1) and with (a2)-(d2) hard data conditioning. Columns (1)-(4) contain the continuous and discrete solutions using SVD, SVD + discrete regularization, k-SVD, and k-SVD + discrete regularization, respectively.

The proposed discrete imaging framework simultaneously constrains the solution to preserve structural connectivity and discreteness. The standard SVD and k-SVD methods, on the other hand, result in continuous solutions that are converted to discrete maps through thresholding. This post-processing step, however, can perturb the data match obtained in the model calibration step and therefore degrade the predictive accuracy of the model. Unlike post-processing, the proposed approach incorporates prior information about connectivity and discreteness in the reconstruction process without affecting the flow data match. Figure 8(a), from left to right, displays the estimated pressure heads and water rate forecasts for monitoring wells #1 (north-west corner) and #2 (south-west corner) and the pumping well for the different methods. For the SVD and k-SVD solutions, the post-processed discrete results are used to generate the forecasts. Visual and quantitative (RMSE) comparison of the forecast plots indicates that data match violations due to post-processing can be significant. Since each iteration of the formulation involves a data match step followed by a discrete regularization step that keeps the solution close to the continuous solution, the discrete regularization constraint improves the results in two ways: first by providing a discrete solution, and second by consistently improving the structural connectivity of the SVD and k-SVD continuous parameterization outcomes (i.e., through enforcing discreteness to obtain binary connectivity features).

This chapter also considers the same pumping test scenario in which noisy hydraulic conductivity measurements are included in model calibration. Figures 7(a2)-(d2) show the final results for calibration with conditioning on hard data. As expected, these figures show improved results in all cases due to the incorporation of hard data.
The solution with k-SVD and discrete regularization is noticeably superior to those of the other methods. These results also indicate that, for both SVD and k-SVD, promoting discreteness in identifying the facies distribution enhances the structural connectivity in the estimated models. Figure 8(b) shows the data match for the different methods in Example 2. For the traditional SVD and k-SVD solutions, simple thresholding results are used to generate the production forecasts. As can be seen from the solution with the SVD parameterization, thresholding can impact the data match quality.

Figure 5.8. Data match and forecasts for Example 2 for pressure head at two sample monitoring wells (left and middle), as well as water extraction rate at the pumping well (right), without (a) and with (b) hard data conditioning.

Example 3: Large-Scale Pumping Test with Complex Geology

This example involves a larger and more complex facies model. The aquifer size in this example is 6583 × 6583 × 10 m³, uniformly discretized into 200 × 200 × 1 cells. Figure 9(a) shows the well configuration, consisting of 5 pumping wells under constant draw-down with a pressure head of 42.2 m, and 20 monitoring wells distributed throughout the aquifer. A no-flow boundary condition is applied to the top and bottom boundaries of the domain, while constant pressure heads are specified for the right and left boundaries to establish a pressure gradient (background flow) from left to right. The hydraulic conductivity measurements from the reference model are shown in Figure 9(b). These hard data are used to generate 3000 prior model realizations with the SNESIM algorithm, using the same training image that was used to generate the reference map. The mean map, showing the average hydraulic conductivity for each cell across all 3000 realizations, is shown in Figure 9(c).

Figure 5.9. Well configuration for Example 3: (a) locations of extraction (pumping) and monitoring wells; (b) reference hydraulic conductivity model; (c) mean map of prior (training) hydraulic conductivity models conditioned on hard data at well locations (both extraction and monitoring wells).

For large-scale problems, learning algorithms such as k-SVD can become computationally demanding. Moreover, in many cases the spatial representation of the training data does not provide a suitable domain for learning the spatial connectivity. [180] proposed dictionary learning in a low-dimensional subspace (such as DCT or SVD) that preserves the main connectivity in the spatial domain. The goal here is to eliminate the redundancy in the description of the training data; this step compactly represents the training data prior to learning. For this example, a low-rank representation of the training data is used to speed up the learning computation. To this end, the training data were projected onto the subspace defined by the leading left singular vectors of the training dataset. To identify the dimension of the projection subspace, the approach presented in [180] is followed by considering the trade-off between the subspace dimension and projection (approximation) quality, in terms of RMSE. Figure 10(a) shows the expected total RMSE between the low-rank approximated prior models and their full-rank representations. Figure 10(b) displays the approximation performance considering both the computational cost and the total RMSE associated with the low-rank representation. Based on this figure, all the training realizations are projected onto the subspace defined by the 500 leading singular vectors of the training data, and the k-SVD learning is applied to the resulting 500-dimensional dataset.
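A minimal sketch of this projection step follows, assuming the training realizations are stored as columns of a matrix; the toy sizes stand in for the 200 × 200 models, 3000 realizations and T = 500 used here, and the k-SVD learning itself would then operate on the reduced coordinates.

import numpy as np

def project_training_data(U_prior, T):
    # economy-size SVD of the n x N training matrix
    B, s, Vt = np.linalg.svd(U_prior, full_matrices=False)
    basis = B[:, :T]                     # T leading left singular vectors
    coords = basis.T @ U_prior           # T x N reduced training dataset
    # RMSE between the rank-T approximation and the full-rank models
    rmse = np.sqrt(np.mean((U_prior - basis @ coords) ** 2))
    return basis, coords, rmse

# toy stand-in: 64*64 models, 300 realizations, reduced to T = 50
U_prior = np.random.rand(64 * 64, 300)
basis, coords, rmse = project_training_data(U_prior, T=50)
# dictionary learning would now run on `coords`; a learned atom a
# (length T) maps back to the spatial domain as basis @ a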
Figure 5.10. Low-rank representation error measures: (a) RMSE; (b) normalized RMSE and computation time; (c) prior (training) hydraulic conductivity models used for constructing sparse dictionaries with hard-data conditioning for Example 3; SVD elements with ranks 1-9 (d) and 492-500 (e); (f)-(k) depict example k-SVD dictionary atoms obtained using rank-T prior model representations followed by k-SVD dictionary learning for different T, k, and S values: (f) T = 500, k = 500, S = 40; (g) T = 500, k = 500, S = 60; (h) T = 500, k = 500, S = 100; (i) T = 500, k = 1000, S = 40; (j) T = 500, k = 1000, S = 60; (k) T = 500, k = 1000, S = 100.

Figure 10(c) shows 9 sample prior model realizations for this experiment. Sample SVD basis functions #1-#9 and #492-#500 are shown in Figures 10(d) and 10(e), respectively. Figures 10(f)-10(k) display k-SVD dictionary elements for different values of k (i.e., the total number of dictionary elements) and S (i.e., the sparsity level). From analysis of the computation time and estimation quality of the training models, a rank-500 SVD representation was selected for inversion. The k-SVD sparse dictionary learning was performed with k = 1000 and S = 60.

The model calibration results with discrete regularization for the SVD parameterization (T = 60) and k-SVD (with T = 500, k = 1000, S = 60) are shown in Figures 11(b2) and 11(d2), respectively. The corresponding continuous models for the SVD and k-SVD methods are shown in Figures 11(b1) and 11(d1), respectively. Comparing Figure 11(a1) with 11(c1), and the corresponding discrete solutions in Figure 11(b2) with 11(d2), indicates a superior performance for k-SVD in capturing the correct connectivity patterns. The same observation can be made by comparing the corresponding cell-level mismatch rates for the two models (reported underneath each plot). Figures 11(a3)-11(d3) show the solution errors (differences between the reference and calibrated models) for each case. The solutions from the discrete imaging formulations show superior performance to those from the continuous parameterization methods.

Figure 5.11. Calibrated hydraulic-conductivity models for Example 3 (with hard-data conditioning) using: (a) truncated SVD parameterization; (b) truncated SVD parameterization with discrete regularization; (c) k-SVD based sparse reconstruction followed by thresholding; (d) k-SVD based sparse reconstruction with discrete regularization. (a1)-(d1) show the continuous solutions; (a2)-(d2) display the corresponding discrete solutions ((a2) and (c2) show discrete solutions after thresholding); (a3)-(d3) depict the estimation error (difference between the reference model and the solution obtained in each case).

Conclusions

This chapter proposed a new framework for the reconstruction of discrete-valued images (geological facies distributions) from limited linear/nonlinear measurements. The formulation uses parameterized prior models to retain the connectivity structures in the facies distribution, and discrete regularization functions to honour the categorical nature of facies. A regularized least-squares formulation of the problem leads to multiple regularization terms. The resulting complex objective function is decomposed and solved using variable splitting and an alternating directions optimization method.
The optimization algorithm involves sequential minimization of two simpler objective functions: one involving a data mismatch function with a constraint on parameter spatial connectivity, and the other consisting of a discrete regularization function with a constraint on the solution obtained from the first problem. The proposed discrete regularization and alternating directions solution framework for reconstruction of geologic facies is a novel approach for dynamic facies characterization from flow data. While other methods for facies calibration, such as the level set method and pluri-Gaussian techniques, have been developed in the literature, these methods are not amenable to efficient gradient-based model calibration due to the complex relation between the facies models and the continuous parameters (e.g., sign functions) used to tune them. Furthermore, representing complex geological patterns such as curvilinear fluvial shapes is difficult to accomplish with simple sign functions or Gaussian distributions. Alternatively, the use of continuous methods combined with post-processing steps (such as thresholding) to convert continuous parameters to discrete facies can deteriorate the quality of the data match and model predictions, which is not desirable. The proposed method extends the classical regularized least-squares formulation to calibration of geologic facies by promoting solution discreteness.

Overall, the results indicate that the proposed approach can simultaneously preserve structural connectivity and promote facies discreteness while integrating nonlinear flow data. The proposed method is flexible and can be applied to reconstruct the distribution of multiple facies types from well head and flow-rate data. Several examples were used to show the superiority of the proposed approach to traditional post-processing methods. The results provided in this work show that discrete regularization offers an intuitive mechanism for promoting facies reconstruction from dynamic data. In its simple form, a discrete regularization function operates at the cell level and forces the solution to assume discrete values with minimal impact on the resulting data match. In addition to generating discrete solutions, including discrete regularization in the objective function can improve the estimation of connectivity patterns in the continuous solution; moreover, variable splitting with the alternating directions method of optimization divides a complex objective function into sub-problems with simpler objective functions, which are easier to solve.

In this chapter, the method was applied to two-dimensional synthetic experiments with complex geologic connectivity patterns, and promising results were obtained. Application of the discrete regularization technique and the alternating solution algorithm to large-scale 3D problems is needed to fully explore its performance, including convergence behaviour and computational complexity. An important advantage of the alternating directions algorithm with variable splitting is its scaling property for application to large problems. The method decomposes large and complex objective functions, and diverse variable types, into simpler subproblems that are amenable to solution with efficient algorithms. A specific example discussed in this chapter is the discrete regularization function, which can be treated efficiently (using existing discrete imaging algorithms) by separating it from the nonlinear data mismatch function.
A key point to emphasize is the iterative enforcement of the discrete regularization and the incorporation of its outcome into the solution of the continuous problem, which is different from applying a discrete post-processing step after finding a continuous solution. This study's experience shows that iterative application of the discrete regularization tends to improve the connectivity patterns identified in the continuous solution. As a final observation, note that discrete regularization promotes solution discreteness at the cell level. An important improvement can be achieved by including additional information from neighbouring cells in the discrete regularization. Combining the discreteness of each cell with information about the expected local patterns around that cell is expected to improve the performance of the proposed method. One way to incorporate such mechanisms is by introducing higher-order statistical information, e.g., in the form of conditional probabilities, into the regularization function. The conditional probabilities can be quantified using prior datasets or a training image. The next chapter discusses one promising direction for moving from a cell-level discrete penalty to a spatially informed discrete penalty that incorporates, from available prior information, the expected statistical patterns of the neighbouring cells.

Chapter VI. PATTERN BASED NON-LINEAR DISCRETE IMAGING FOR COMPLEX GEOLOGIC ENVIRONMENTS

The previous chapter described how traditional two-point statistical methods that rely on variogram models to describe the spatial variability in rock properties are not appropriate for representing the spatial continuity patterns associated with complex geologic objects (e.g., fluvial or turbidite systems). Modern geostatistical modeling techniques can incorporate multiple-point statistics (MPS) to generate higher-order spatial statistics from a training image (TI), a conceptual model that encodes complex geologic patterns. A major difficulty in using MPS methods is the integration of nonlinear dynamic production data into the resulting models. This chapter develops an inverse modeling framework for history matching reservoirs with complex MPS-based facies models.

With the advancements in solving inverse problems over the past decades, the emphasis in these problems has turned to not only reproducing the observed data, but also conveying some level of (geological) realism and the associated uncertainties. However, with a limited number of observations and low-resolution data, achieving these goals is not trivial. In practice, additional sources of information, e.g., prior spatial connectivity models or a geologist's expertise, are adopted to facilitate this process [151], [20], [1]. As noted in Chapters I and II, the sparsity of the space/time measurements makes it nontrivial to capture high-resolution model parameters. Therefore, ill-posedness is perhaps the most challenging aspect of solving subsurface inverse problems. Traditionally, this issue is resolved either by compromising the model resolution, e.g., zonation methods [37], or by imposing additional constraints on the solution structure. Parameterization and regularization techniques are two important approaches that are extensively discussed in the literature of subsurface inverse modeling [181], [139], [182]. Parameterization reduces the search space of the solution by constraining it to a low-dimensional subspace.
Principal Component Analysis (PCA) [160], [183] and its kernel-based extensions (kernel-PCA) [184], Singular Value Decomposition (SVD) [139] (see also ho-SVD [185]), k-SVD, a recently developed sparsity-promoting dictionary learning algorithm [82], [86], and the Grid Connectivity Transform (GCT) [117] are examples that incorporate model characteristics to generate parameterization spaces. More complex and nonlinear parameterization techniques, mostly developed based on neural network structures, have been introduced recently [186], [187]. Furthermore, generic parameterization techniques, such as the Fourier transform and its extensions (e.g., the Discrete Cosine Transform (DCT) [154]) and wavelet transforms [54], [116], [157], have demonstrated great potential in geological applications.

Regularization methods, on the other hand, restrict the solution by enforcing additional structural constraints. For example, Tikhonov regularization [188] searches for a solution with minimum first/second-order derivatives. As a result, the solution obtained from this method is spatially smooth, with minimal abrupt changes between nearby spatial locations. However, generic regularization techniques, e.g., Tikhonov, are often too simplistic, which makes them non-universal. Consequently, in the context of ill-posed inverse problems (with limited data), these regularization forms are not strong priors and cannot be used to reconstruct complex spatial (geological) connectivity patterns (e.g., the meandering fluvial channels shown in previous chapters). Recently developed sparsity-promoting regularization functions, e.g., the l1-norm [105], with important selection properties [83], [157], [189], have been shown to be successful in the context of geological model selection. Specifically, they have been applied to generate low-dimensional subspaces with selection properties.

As was shown in Chapters IV and V, variogram-based modeling techniques [147], [190] are traditionally used to constrain the solution of history matching problems. However, it has been shown that these methods are not able to sufficiently preserve the correct forms of connectivity in more complex heterogeneous environments, such as channelized ones [191]. Object-based simulation techniques [192], [193] and modern training-image-based methods, a.k.a. multiple-point statistics (MPS) [75], [194], [195], are two alternatives for conveying geological patterns in subsurface inverse problems. While the latter learns possible connectivity patterns and their frequencies from a training image, the former is quite cumbersome for data conditioning, primarily due to the lack of flexibility in morphing existing objects [194], [196]–[198]. In MPS methods, the main complexity is related to conditioning the training-image-based solutions on nonlinear data [126]. To generate flow-conditioned facies model realizations from training images, the Probability Perturbation Method (PPM) [199], [200] and the Probability Conditioning Method (PCM) [201] are two recently introduced direct simulation methods. Both methods have been shown to have extensive applications in solving subsurface-related problems. Nonlinear variants of PCA, including kernel-PCA [160], [184] and O-PCA [160], are examples of techniques introduced to preserve higher-order statistics in such inverse problems. In [160], a convex quadratic function is incorporated to force the solution toward two-level discrete facies.
Other forms of regularization and parameterization techniques have been introduced for the purpose of discrete geological parameter estimation [31], [202], [203]. While many of these methods can produce discrete solutions, they do not have a systematic mechanism for preserving the multiple-point statistics offered by the prior models, e.g., a training image.

This study mainly focuses on developing a deterministic inverse modeling formulation that is able to learn the spatial connectivity patterns from a training image (or sets of training model realizations) and retain them in the inversion process. The optimization problem introduced in this chapter contains three main components, similar to Chapter V: (i) a least-squares mismatch term, (ii) connectivity-preserving constraints, and (iii) feasibility constraints. As in the previous chapter, the k-SVD parameterization along with sparsity-promoting regularization terms are applied to constrain the solution to specific patterns that are learned to represent the parameters in a low-dimensional subspace. Additionally, the feasibility constraints are adopted to force the inversion procedure to retain the exact type of connectivity information offered by a training image. For this purpose, the optimization problem includes two versions of the parameter of interest: (i) the solution on the parameterizing basis functions, and (ii) the solution obtained from the feasible set. Furthermore, a least-squares constraint is used to force these two solutions not to depart from each other at the optimum convergence point. The original optimization problem is then divided into two sub-problems that update the solutions iteratively. The first sub-problem searches for the solution in the parameterization space, constrained by the parameterizing basis functions and the neighborhood defined by the most recent feasible solution. Next, a mapping is applied to retrieve the feasible solution from the space of samples defined by the training image. In the mapping step, the sample from the training image that is most similar to the parameterized solution is obtained to update the feasible solution. The mapping itself is not trivial, and this study develops a two-step algorithm for this purpose. First, it replaces local connectivity patterns in the parameterized solution with similar ones offered by the training image. Then, an aggregation over the overlapping patterns constructs an empirical probability distribution at each grid-block. Finally, a maximum-likelihood approach assigns the discrete value with the maximum frequency to each cell. The template matching and aggregation steps are the two main novelties introduced in this study.

Methodology

As in previous chapters, the regularized least-squares problem for the proposed pattern-based (geostatistical) subsurface inverse modeling is presented in this section. The proposed formulation denotes the q-norm of a length-n vector u = [u_1, u_2, ..., u_n] (for q > 0) as:

\[
\|\mathbf{u}\|_q = \left( \sum_{i=1}^{n} |u_i|^q \right)^{1/q} \tag{6.1}
\]

where the operator |·| computes the absolute value of its argument. Throughout this chapter, the notations u (n × 1) and d (m × 1) represent the unknown model parameter vector in the spatial domain and the observation vector, respectively. The function g(·) represents a nonlinear mapping from the parameter domain to the observation domain, i.e., g(u) + n = d.
This function often represents a set of complex partial differential equations that describe the state-space model; examples include the flow and transport equations in hydrology or oil and gas reservoirs. Furthermore, n (m × 1) is the vector of observation noise. The objective function of the proposed formulation consists of three main components: (i) a weighted least-squares data mismatch term, (ii) a spatial connectivity-preserving regularization, and (iii) a (geostatistical) pattern-based constraint to obtain pattern structures from a training image. Each of these terms is briefly presented below.

Data Mismatch Term

Using the above notation and the assumption of additive Gaussian observation noise, a simple least-squares data mismatch term is adopted as:

\[
\min_{\mathbf{u}} \; \|\mathbf{g}(\mathbf{u}) - \mathbf{d}\|^2_{\mathbf{C}_\mathbf{n}^{-1}} = (\mathbf{g}(\mathbf{u}) - \mathbf{d})^{\mathsf{T}} \mathbf{C}_\mathbf{n}^{-1} (\mathbf{g}(\mathbf{u}) - \mathbf{d}) \tag{6.2}
\]

where g(·) is the nonlinear (forward) mapping function described earlier, and C_n stands for the covariance matrix of the observation noise. For gradient-based minimization of the above misfit term, the derivative of g(·) with respect to the parameter vector u, i.e., ∂g_i/∂u_j for all i, j, needs to be computed, which is typically done either through efficient adjoint formulations or through approximate methods, i.e., perturbations. The objective function in Eq. (6.2) yields solutions with minimum departure between the observed data and the model response. However, the obtained solution, i.e., u, may not be feasible, since Eq. (6.2) does not contain any connectivity-promoting constraints. The next two sections address this matter.

Connectivity Preserving Constraints

When specific geologic connectivity patterns, typically provided through prior models, are expected to exist, it is important to properly preserve such patterns in the solution of the inverse problem. While simple mathematical expressions, such as Tikhonov regularization, as well as generic transform-domain parameterization techniques, e.g., DCT or wavelets, have been applied to such problems in the past, they are not geologically expressive. In fact, these methods constrain the solution of the subsurface inverse problem based on generic prior information, e.g., smoothness. Therefore, more reliable learning methods are needed to incorporate geological patterns into the inversion process. Recently, specialized connectivity-preserving approaches, such as k-SVD sparse reconstruction (introduced and discussed in Chapters II and III) or higher-order singular value decomposition (ho-SVD), have been shown to be more effective in learning geologic connectivity patterns from a prior set of reliable model realizations.

In Chapters II-IV and [86], it was shown that the k-SVD sparse dictionary learning algorithm is a powerful method for learning and preserving geologic structures in an inversion process. In the presence of a training image, the k-SVD algorithm uses a set of N training model realizations (here denoted by u_i and generated using geostatistical methods, e.g., SNESIM [197]) to produce k basis functions for parameterization. In the k-SVD parameterization, the parameter is represented through a linear expansion of these k dictionary basis elements (here denoted by φ_i) as:

\[
\mathbf{u} = \sum_{i=1}^{k} \boldsymbol{\phi}_i v_i = \boldsymbol{\Phi}\mathbf{v}. \tag{6.3}
\]

In this equation, v_i is the weight (coefficient) for the i-th basis function, reflecting the importance and relevance of the corresponding φ_i in representing the parameter of interest, u. Furthermore, Φ is a matrix that contains the basis functions in its columns.
Here, the low-rank representation of the model parameters (k ≪ n) decreases the ill-posedness of the inverse problem, simply by reducing the search space from estimating n unknown variables (in u) to the k parameterization coefficients, i.e., v_i, i = 1, 2, ..., k. Therefore, using the extra constraint in Eq. (6.3), Eq. (6.2) can be replaced by the following optimization problem:

\[
\min_{\mathbf{v}} \; \|\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}) - \mathbf{d}\|^2_{\mathbf{C}_\mathbf{n}^{-1}} = (\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}) - \mathbf{d})^{\mathsf{T}} \mathbf{C}_\mathbf{n}^{-1} (\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}) - \mathbf{d}). \tag{6.4}
\]

One important attribute of the k-SVD parameterization is its sparsity-promoting property, which enables describing the parameter of interest with a sparse set of basis functions. In fact, through the dynamic (nonlinear) data integration steps, this low-rank subspace is learned by automatically identifying the relevant atoms from the dictionary that can significantly improve the dynamic data mismatch. Mathematically, this process is implemented through a novel sparsity-promoting inverse modeling formulation introduced in [123]. The resulting weighted least-squares formulation, combining the data mismatch and connectivity-preserving terms, can be expressed as:

\[
\min_{\mathbf{v}} \; \|\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}) - \mathbf{d}\|^2_{\mathbf{C}_\mathbf{n}^{-1}} + \lambda^2 J(\mathbf{v}) \tag{6.5}
\]

where J(v) is a regularization constraint applied to the k-SVD transform (expansion) coefficients to promote sparsity in the expansion domain, i.e., v = [v_1, v_2, ..., v_k]. It has been shown that the l1-norm, i.e., J(v) = ‖v‖_1 = Σ_{i=1}^{k} |v_i|, is able to promote sparsity in v, and as in previous chapters, this study applies this penalty function for this purpose throughout the chapter. Note that this term does not exist in truncated parameterization schemes such as T-SVD, ho-SVD, etc. In sparse reconstruction with k-SVD, the l1-norm minimization is used to select the significant elements while minimizing the data mismatch term. It is also important to note that in each case the expansion functions contain the spatial connectivity patterns in the prior training data, while the expansion coefficients (v) are the corresponding weights given to each element (i.e., the geologic features). Further details about the k-SVD algorithm can be found in Chapter II and [86].

Training Image Constraints

A recent study [178], also Chapter V, introduced discreteness as an extra constraint on training-image-based inverse modeling problems. In the previous chapter, it was demonstrated that cell-level discrete regularization terms, when combined with the k-SVD parameterization, can further improve the predictive quality of the estimated discrete models, especially when compared with a "thresholding" post-processing step. However, in geoscience applications, the desired training-image-based models are not only discrete, but also contain certain spatial connectivity structures, which are (statistically) described by the training image. To pursue an effective formulation in this direction, let the set of all valid samples corresponding to a training image be denoted by Ω_p, where p stands for "prior". It is therefore a valid assumption that u (in Eq. (6.2)) belongs to Ω_p, i.e., u ∈ Ω_p. This assumption, combined with the connectivity-preserving constraints summarized in Eqs. (6.3)-(6.5), leads to a new formulation:

\[
\min_{\mathbf{v},\mathbf{u}} \; \|\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}) - \mathbf{d}\|^2_{\mathbf{C}_\mathbf{n}^{-1}} + \lambda^2 J(\mathbf{v})
\quad \text{s.t.} \quad \|\boldsymbol{\Phi}\mathbf{v} - \mathbf{u}\|_2^2 \le \epsilon^2 \;\; \text{and} \;\; \mathbf{u} \in \boldsymbol{\Omega}_p. \tag{6.6}
\]
In this formulation, the constraint ‖Φv − u‖₂² ≤ ε² forces the feasible solution u and its parameterized version Φv to remain as close as possible to each other at the convergence point. Eq. (6.6) can be further simplified by adding the constraint ‖Φv − u‖₂² ≤ ε² to the objective function through least-squares penalization, resulting in:

\[
\min_{\mathbf{v},\mathbf{u}} \; \|\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}) - \mathbf{d}\|^2_{\mathbf{C}_\mathbf{n}^{-1}} + \lambda^2 J(\mathbf{v}) + \alpha^2 \|\boldsymbol{\Phi}\mathbf{v} - \mathbf{u}\|_2^2
\quad \text{s.t.} \quad \mathbf{u} \in \boldsymbol{\Omega}_p. \tag{6.7}
\]

Here, λ and α are the regularization parameters for the sparsity-promoting term (i.e., J(v)) and the constraint ‖Φv − u‖₂² ≤ ε², respectively. The optimization problem in Eq. (6.7) needs to be solved for two versions of the same parameter: (i) v and (ii) u. Consequently, v results in an approximate solution with continuous values (i.e., each entry of ũ = Φv takes continuous values), while u is the feasible solution, which is discrete. This chapter next discusses how to solve the optimization problem in Eq. (6.7).

This chapter uses an alternating directions algorithm to approach the optimization problem in Eq. (6.7). The proposed algorithm updates the solution, i.e., [v; u], by first keeping u constant and solving Eq. (6.7) in the v domain. Once v is updated, the optimization problem is solved only for u, with v held constant. Mathematically, these two steps are:

\[
\mathbf{v}^{(n)} = \arg\min_{\mathbf{v}} \; \|\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}) - \mathbf{d}\|^2_{\mathbf{C}_\mathbf{n}^{-1}} + \lambda^2 J(\mathbf{v}) + \alpha^2 \|\boldsymbol{\Phi}\mathbf{v} - \mathbf{u}^{(n-1)}\|_2^2 \quad \text{s.t.} \quad \mathbf{u}^{(n-1)} \in \boldsymbol{\Omega}_p \tag{6.8a}
\]
\[
\mathbf{u}^{(n)} = \arg\min_{\mathbf{u}} \; \|\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}^{(n)}) - \mathbf{d}\|^2_{\mathbf{C}_\mathbf{n}^{-1}} + \lambda^2 J(\mathbf{v}^{(n)}) + \alpha^2 \|\boldsymbol{\Phi}\mathbf{v}^{(n)} - \mathbf{u}\|_2^2 \quad \text{s.t.} \quad \mathbf{u} \in \boldsymbol{\Omega}_p \tag{6.8b}
\]

which can be simplified to:

\[
\mathbf{v}^{(n)} = \arg\min_{\mathbf{v}} \; \|\mathbf{g}(\boldsymbol{\Phi}\mathbf{v}) - \mathbf{d}\|^2_{\mathbf{C}_\mathbf{n}^{-1}} + \lambda^2 J(\mathbf{v}) + \alpha^2 \|\boldsymbol{\Phi}\mathbf{v} - \mathbf{u}^{(n-1)}\|_2^2 \tag{6.9a}
\]
\[
\mathbf{u}^{(n)} = \arg\min_{\mathbf{u}} \; \|\boldsymbol{\Phi}\mathbf{v}^{(n)} - \mathbf{u}\|_2^2 \quad \text{s.t.} \quad \mathbf{u} \in \boldsymbol{\Omega}_p. \tag{6.9b}
\]

The optimization problem in Eq. (6.9a) can be solved by linearizing g(Φv) (usually using a first-order Taylor expansion) and setting the gradient of the resulting objective function to zero. However, the solution of the second sub-problem, i.e., Eq. (6.9b), is not trivial. Intuitively, the solution of Eq. (6.9b) is the sample from the feasible set (i.e., Ω_p obtained from a training image) that is closest to Φv^(n). The main difficulty in solving Eq. (6.9b) is the inaccessibility of the entire domain of samples corresponding to a given training image. In practice, only a limited number of representative realizations (corresponding to a training image) are typically available, and searching the entire sample space of Ω_p is neither possible nor practical. It is important to note that, in the above optimization problem, the two main constraints, i.e., (i) preserving connectivity via parameterization (regularization), and (ii) existence of a discrete solution in the feasible set, cannot be satisfied simultaneously. For this reason, one has to relax Φv = u using the proposed alternating minimization framework by penalizing the first constraint through α² in Eq. (6.9a). The next section addresses the solution of Eq. (6.9b).
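To fix ideas, the following is a minimal sketch of the alternating iteration in Eq. (6.9), with the v-subproblem handled by ISTA-style proximal-gradient steps on the linearized mismatch and the u-subproblem delegated to an abstract projection callable (the training-image mapping developed in the next section); the function names, step sizes, and the Jacobian callable jac are illustrative assumptions, not the exact solver used in this work.

import numpy as np

def soft_threshold(x, t):
    # proximal operator of t * ||x||_1 (promotes sparsity in v)
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def alternating_inversion(g, jac, Phi, d, Cn_inv, u0, project_to_feasible,
                          lam=1.0, alpha=1.0, step=1e-3,
                          n_outer=15, n_inner=50):
    """Sketch of Eq. (6.9): gradient/proximal updates for the v-subproblem
    (6.9a), with the u-subproblem (6.9b) delegated to the callable
    `project_to_feasible`, i.e., the pattern-based mapping onto the
    feasible set defined by the training image."""
    u = u0.copy()
    v = np.zeros(Phi.shape[1])
    for _ in range(n_outer):
        for _ in range(n_inner):                  # v-subproblem, Eq. (6.9a)
            r = g(Phi @ v) - d                    # data residual
            grad = Phi.T @ (jac(Phi @ v).T @ (Cn_inv @ r)) \
                 + alpha ** 2 * (Phi.T @ (Phi @ v - u))
            v = soft_threshold(v - step * grad, step * lam ** 2)
        u = project_to_feasible(Phi @ v)          # u-subproblem, Eq. (6.9b)
    return Phi @ v, u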
Obtaining the Feasible Solution from a Training Image

The optimization framework in Eq. (6.9) requires mapping the solution obtained in the parameterization space, i.e., Φv^(n), onto the feasible set. Accordingly, the feasible solution, u^(n), is updated based on the characteristics of Φv^(n) and the information (geologic patterns) specified by the training image. It is important to note that the optimization problem in Eq. (6.9b) resembles a general image segmentation problem with a specific set of constraints on the spatial structure of the objects in the underlying image [204]–[210]. This resemblance suggests several possible solutions for Eq. (6.9b), from the simple image thresholding segmentation introduced in Chapter V to segmentation via shape priors [204], [211], [212] and graphical methods [213]–[215], as long as the constraints imposed by the feasible set can be satisfied by the selected segmentation framework. This study introduces a novel template matching formulation that provides a robust and fast framework for such problems. Investigation of other frameworks, especially graphical models, can be pursued as future work.

Figure 1 depicts a simple example of mapping from the parameterization domain onto the feasible set corresponding to a training image. In this figure, the inversion is based on a set of travel-time tomographic data [176], in which the travel-time observations (d) are obtained from waves transmitted from the left side of the domain to the right (see the Numerical Results section for further details). As demonstrated in this figure, the first step of the inversion results in a solution in the parameterization space, and the second stage suggests a sample from the training image that is most similar to the continuous solution (retrieved in Eq. (6.9a)).

Obtaining the realization most similar to Φv^(n) from the training image is not trivial, and this study develops a two-stage algorithm for this purpose. The algorithm first adopts a template-matching technique to replace the local patterns in Φv^(n) with the most similar ones offered by the training image. In the second stage, an aggregation over overlapping templates assigns discrete values to each element of the feasible solution, i.e., u_i in u^(n). In the aggregation step, an empirical probability value is calculated for each discrete level based on the information obtained from the overlapping templates at the location of the corresponding cell. Finally, the discrete level with the highest empirical probability is assigned to that cell/pixel.

Figure 6.1. Schematic of the two-step inversion. (Top left) Configuration of the travel-time tomography; (Top right) inversion result from step (i), which is the solution in the parameterization space; (Bottom right) training image that defines the feasible set; (Bottom left) the result of mapping onto the training image.

This section first explains the replacement of local patterns in Φv^(n) in the presence of a training image. Figure 2 demonstrates the replacement process pictorially. As depicted there, a predefined template is first centered on each cell, and the grid-block values inside the template (denoted by x) are extracted. The same template is then used to search the training image for the k_temp most similar patterns {x_1, x_2, ..., x_{k_temp}}. Finally, these selected patterns are stored at the location of the template placed around the centered cell. A distance measure, e.g., ‖x − x_i‖₂², is used to identify the most similar patterns. The selected distance measure affects both the quality and the speed of the reconstructed solution. For example, instead of a least-squares distance measure, normalized cross-correlation (NCC) can be computed faster (in the Fourier domain) and is also illumination/brightness invariant [216], [217]. Throughout this study, NCC is used as the distance measure.
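As an illustration of this NCC search, the following sketch uses scikit-image's match_template (an assumed dependency; any FFT-based NCC would serve) to pull the k_temp best-matching patterns out of a training image; the toy arrays are placeholders, and duplicate or heavily overlapping matches are not filtered here.

import numpy as np
from skimage.feature import match_template   # normalized cross-correlation

def k_most_similar(training_image, template, k_temp=5):
    # NCC score of the template against every position in the training image
    ncc = match_template(training_image, template)
    top = np.argsort(ncc.ravel())[::-1][:k_temp]      # k_temp highest scores
    th, tw = template.shape
    rows, cols = np.unravel_index(top, ncc.shape)
    return [training_image[r:r + th, c:c + tw] for r, c in zip(rows, cols)]

# toy usage: an 11 x 11 patch from a continuous solution vs. a binary TI
ti = (np.random.rand(250, 250) > 0.7).astype(float)   # stand-in training image
u_cont = np.random.rand(64, 64)                       # stand-in continuous solution
matches = k_most_similar(ti, u_cont[10:21, 10:21], k_temp=5)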
Several algorithms can be used for the template-matching step. Perhaps the simplest way to find the k_temp most similar templates is to build a pattern dataset {x_1, ..., x_K} from the training image and compare the patterns obtained from Φv^(n) (i.e., x) with those in {x_1, ..., x_K}. However, the computational cost can be very high depending on the size of K. Moreover, if a pattern is frequently replicated (more than k_temp times) in the training image, this may result in the selection of k_temp identical patterns for some local template-matching cases. One approach to overcoming this issue is to refine {x_1, ..., x_K} by eliminating similar or replicated patterns, which can be achieved by computing a pattern-tree data structure from the set of training images provided for the inversion. A fast tree-search algorithm can be utilized to boost the computational performance once this data structure is available. This also decreases the number of comparisons that need to be made between each x and the elements of the training dataset, reducing the computational cost drastically.

Another technique for reducing the computational burden of this method is to compare the local patterns in a transform domain. For this purpose, the templates in the original dataset {x_1, ..., x_K} are transformed into a low-dimensional space using well-known transformation methods, such as the Fourier transform or the Discrete Cosine Transform (DCT). Consequently, the comparison between each x and {x_1, ..., x_K} is carried out in a low-dimensional space, and the pattern-matching process requires less computation. Figure 3 depicts the template-matching process in the original and in a low-dimensional space. Locally Linear Embedding (LLE) [218] is an alternative transformation method that has been applied for measuring similarity in low-dimensional spaces.

Figure 6.2. Schematic of the template-matching algorithm. A predefined template is located on the image with continuous values, and the obtained template is compared with the local patterns existing in the training image. Finally, k similar patterns are selected and stored at the location of the original template.

In the template-matching algorithm discussed above, the proposed framework does not directly replace the selected patterns after obtaining the most similar templates. Instead, the selected patterns from the training image are stored without replacing the original template, at a higher memory cost. SIMPAT [219] is a related method that searches for only the single most similar pattern from a training image and replaces the original template with the selected pattern sequentially. As a result, the selected patterns for subsequent cells are conditioned on the previously replaced templates, and the final result obtained through SIMPAT is sensitive to the random path taken for the local template-matchings.

Figure 6.3. Transform-domain vs. spatial template-matching schemes. Template-matching based on (Left) space-domain matching and (Right) transform-domain matching. In (Left), the original patterns from the continuous solution are directly compared with the spatial-domain patterns extracted from the training image. In (Right), the patterns (shown in Left) are projected onto a transform domain, and matching templates correspond to neighboring instances in the transform domain. Note that the results from Left and Right are not necessarily the same.
For this specific application (i.e., replacing continuous patterns with discrete ones, in effect an image segmentation), the proposed method is less sensitive to the random path, since the patterns are aggregated at the end to assign the discrete value of each grid-block. The aggregation step is described next.

After the local template-matching stage, an aggregation (or voting) step is used to combine all the facies assignments for each cell/pixel and to decide on the discrete facies value of each grid-block. Figure 4 depicts the aggregation approach, where for a given cell (marked with a cross) the information from scanning different regions is combined to generate the conditional facies probability. In this study, the facies with the highest frequency is assigned to each grid-block.

Figure 6.4. Schematic of the aggregation step. Focusing on a cell (marked by a cross), a set of overlapping templates is specified, and the corresponding cell values are extracted from the overlapping templates. From every spatial template, k values are therefore obtained for the cell. All the obtained values are then used to calculate the empirical distribution over the discrete levels. Finally, the discrete level with the maximum empirical frequency (probability) is assigned to the center cell (cross).

The aggregation step ensures that the facies assignment for each grid-block takes into account the spatial statistics (and connectivity patterns) of an extended neighborhood (approximately twice the size of the template in each direction). Moreover, aggregation provides far more samples (than k_temp) for each grid cell, which substantially increases the accuracy and spatial consistency of the method. In addition, aggregation substantially decreases the length of the random path required for the local template-matchings: it allows overlapping templates to provide an empirical probability distribution at each cell, even if that cell is not located on the random path selected for the local template-matching. Aggregation improves the connectivity by exploiting additional spatial information from the extended local neighborhood. Furthermore, when aggregation is used, less sensitivity is observed with respect to k_temp, as aggregation provides many more samples than can be achieved by a modest increase in k_temp (many overlapping templates cover the same cell in the model).
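A minimal sketch of this voting step follows, assuming the matched discrete patches have already been collected together with the window positions they were matched at; cells covered by no patch default to the first facies level here, and all names are illustrative.

import numpy as np

def aggregate_votes(shape, placements, facies_levels):
    """Majority vote over overlapping matched patches: `placements` holds
    (row, col, patch) entries, where `patch` is a discrete template taken
    from the training image and counted at the window anchored at
    (row, col). Every cell receives the facies value with the highest
    empirical frequency over all patches covering it."""
    votes = np.zeros(shape + (len(facies_levels),))
    for r, c, patch in placements:
        for i, f in enumerate(facies_levels):
            votes[r:r + patch.shape[0], c:c + patch.shape[1], i] += (patch == f)
    return np.asarray(facies_levels)[np.argmax(votes, axis=-1)]

# toy usage: two overlapping 3 x 3 patches voting on a 5 x 5 grid
p1, p2 = np.zeros((3, 3)), np.ones((3, 3))
u_disc = aggregate_votes((5, 5), [(0, 0, p1), (1, 1, p2)], facies_levels=(0, 1))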
Numerical Results

This chapter demonstrates the performance of the training-image-based subsurface inverse modeling formulation developed in the previous sections. First, a motivating example based on linear tomographic inversion is designed to discuss the effect and importance of pattern-based learning in the inversion process. Next, two experiments are conducted that involve groundwater constant-drawdown pumping tests. In these two experiments, the steady-state pressure head data and the hydraulic conductivity values at the monitoring well locations are used to reconstruct the spatial distribution of the lithofacies that define the hydraulic conductivity maps.

In travel-time tomography, with the configuration shown in Figure 1 (top-left), acoustic waves are transmitted through a medium, and the travel times of these waves are recorded by receivers at known distances. The governing equation relating the recorded arrival times (t) to the slowness (s(x⃗), the inverse of velocity) of the medium can be written as:

\[
t - \int s(\vec{x})\, d\vec{x} = 0, \qquad s(\vec{x}) = \frac{1}{v(\vec{x})} \tag{6.10}
\]

where x⃗ represents a spatial location in (x, y, z) coordinates. A discretized version of Eq. (6.10) can be used to form a linear system of equations between the recorded arrival times (i.e., observations) and the discretized rock slowness map (i.e., parameters), i.e., d = Gu. The governing equations for the pumping tests in the two remaining examples are derived from the single-phase flow equations, combining mass conservation and Darcy's law in a saturated porous medium, and can be expressed as:

\[
\begin{cases}
\dfrac{\partial(\phi\rho)}{\partial t} = -\nabla\cdot(\rho\,\mathbf{v}) + q & \text{conservation of mass}\\[4pt]
\mathbf{v} = -\dfrac{1}{\mu}\,\mathbf{K}\,(\nabla p - \rho g \nabla z) & \text{Darcy's law}
\end{cases}
\tag{6.11}
\]

where φ, ρ, v, p and q represent porosity, density, velocity, pressure and pumping rates, respectively, and μ represents the fluid viscosity. Here, K is the intrinsic rock permeability, which is assumed to be the same in the x, y and z directions. The hydraulic conductivity, defined as K_h = Kρg/μ, as well as the observed pressure head values at the monitoring wells, constitute the observations d. The solution of Eq. (6.11) provides the operator g(·) that maps the parameters to the data space, i.e., d = g(u = K_h).

Example 1 (Motivating Example): Tomographic Inversion

In this example, 5 acoustic wave transmitters and 5 receivers are placed on the left and right sides of the domain, respectively (see Figure 1 (Top)). Therefore, 25 observations are measured to reconstruct 64 × 64 uniformly divided grid-block values representing the slowness of the domain. The field itself is 2000 × 2000 m². Figure 5 (Top left) shows the reference slowness map for this domain.

Figure 6.5. Example 1 (motivating example). (Top) Reference slowness map with minimum 2.3 mm/s and maximum 6.4 mm/s, best achievable approximation using the k-SVD basis functions (k = 1000, S = 30), and initial model (u^(0)); (Bottom left) result of the parameterized solution after one iteration of Eq. (6.9a); (Bottom right) result of the feasible solution based on (top row) thresholding using the three values 3.94, 4.35, 4.76 mm/s and (bottom row) the pattern-based algorithm using square windows of sizes 11 × 11, 21 × 21, 31 × 31.

2000 unconditionally sampled model realizations are generated from the training image depicted in Figure 1 (Bottom right) using the SNESIM algorithm. Then, the k-SVD algorithm with parameters k = 1000 and S = 30 is applied to generate the sparsity-promoting parameterization basis functions. Figure 5 (Top middle) depicts the reference slowness map projected onto these basis functions; theoretically, this is the best achievable solution in this parameterization space.

This study applies one iteration of Eq. (6.9) to demonstrate the importance of the proposed pattern-based template-matching algorithm in the inversion process. Taking u^(0) to be the map shown in Figure 5 (Top right), the inversion result from Eq. (6.9a) is obtained and depicted in Figure 5 (Bottom left) (Φv^(1)). Note that a sensitivity analysis was applied to adjust the regularization parameters in Eq. (6.9a). As is clear from this figure, the connectivity patterns are lost, and the parameterization space is not able to capture the structures embedded in the basis functions.
To obtain the feasible solution (u^(1)) in Eq. (6.9b), the mapping results are obtained based on (i) thresholding at the discrete values, i.e., 2.3 mm/s and 6.4 mm/s, and (ii) pattern-based template-matching. For the thresholding case, three different threshold values are considered: 3.94, 4.35 and 4.76 mm/s. Three different cases, which differ in window size, are considered for the template-matching (window sizes: 11 × 11, 21 × 21 and 31 × 31). The parameter k_temp is set to 5, and 10% of the grid-block values are included in the random path for the local template-matching. Figure 5 clearly demonstrates that thresholding is a naïve method for projecting onto the feasible set: the connectivity structures are lost after the mapping step. The template-matching algorithm, however, retrieves far more acceptable projection results. Consequently, the obtained feasible solution is a more valid candidate for the constraint α²‖Φv − u^(1)‖₂² in Eq. (6.9a). As a result, the next parameterized solution, Φv^(2), will remain close to u^(1) and will extract further information from the available data to correct the updated results.

Example 2: 2D Pumping Test with 3 Discrete Facies

In this example, whose configuration is shown in Figure 6(a), a pumping well located in the middle of the domain produces water at a constant drawdown pressure (14 m). The domain, of size 2107 × 2107 × 10 m³, is divided into 64 × 64 × 1 uniform cells. The top and bottom of the field are subject to no-flow conditions, and the left and right sides of the domain have constant pressures of 10 m and 50 m, respectively. Eight monitoring wells are located uniformly on a circle surrounding the pumping well, and they record the steady-state pressure values.

Figure 6.6. Tri-facies model example. (a) Configuration of the pumping test in Example 2; (b) reference hydraulic conductivity field; (c) optimum solution that can be obtained with the k-SVD basis functions shown in Figure 7(c).

Furthermore, the hydraulic conductivity values are observed at the locations of the monitoring wells. Consequently, the steady-state pressure and hydraulic conductivity values at these monitoring well locations are the sources of data (d). Figure 6(b) shows the reference (log-)hydraulic conductivity map with three discrete values: facies #1 with value −4 log(m/s), facies #2 with value −0.5 log(m/s), and facies #3 with value 2.5 log(m/s). Figure 7(a) depicts the training image with three different facies values. 2000 unconditional prior model realizations are generated from this training image, and 25 of them are shown in Figure 7(b). This study applies the k-SVD dictionary learning algorithm with k = 1000 and S = 30 to these realizations, producing a sparsity-promoting parameterization space for representing the hydraulic conductivity map. Figures 7(c) and 6(c) show 25 of these basis functions and the best achievable solution using the k-SVD parameterization space, respectively.

Figure 6.7. Example 2 model realizations and basis functions. (a) The training image with three facies values; (b) sample unconditional prior model realizations; and (c) the k-SVD dictionary obtained for parameterization (k = 1000 and S = 30).
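For contrast with the pattern-based mapping, the nearest-level projection used in the thresholding baseline above (and in Cases 1 and 2 of the comparison that follows) can be sketched in a few lines; the example values below are the three log-conductivity levels of this example, and the function name is illustrative.

import numpy as np

def threshold_to_levels(u_cont, levels):
    # nearest-level projection: map every cell of the continuous
    # solution to the closest discrete facies value
    levels = np.asarray(levels, dtype=float)
    d = np.abs(u_cont.reshape(-1, 1) - levels.reshape(1, -1))
    return levels[np.argmin(d, axis=1)].reshape(u_cont.shape)

# e.g. the three log-conductivity levels of Example 2
u_disc = threshold_to_levels(np.random.rand(64, 64) * 6.5 - 4.0, (-4.0, -0.5, 2.5))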
The results of parameter reconstruction are depicted in Figure 8 for three different cases: (Case 1) no feasibility constraint is assumed, the result is obtained with the k-SVD parameterization regularized with J(v) = ‖v‖₁, and the continuous solution is projected onto the closest discrete values at the final iteration; (Case 2) discreteness is the only property inherited from the feasible set, i.e., each entry of the continuous solution is projected onto the closest discrete value at each iteration of Eq. (6.9b); and (Case 3) the continuous solution is projected onto the feasible set using the pattern-based algorithm discussed in the previous sections. As is clear from this figure, the pattern-based method is more effective in retrieving the facies values. For Case 3, a 25 × 25 square template is applied to replace the continuous patterns with discrete ones selected from the training image. In addition, k_temp is set to 10, and 10% of the cells are randomly selected as the path for the local pattern replacement step. Figure 9 shows the evolution of the continuous and discrete solutions for the pattern-based algorithm during the first 15 iterations of Eq. (6.9). This figure shows that the algorithm first captures the global structures in the hydraulic conductivity field and then refines the local connectivity by learning the patterns from the training image during the later iterations.

Figure 6.8. Example 2 reconstruction results. Parameter reconstruction for (column 1) no feasibility constraint, where the result is obtained with the k-SVD parameterization regularized with J(v) = ‖v‖₁ and the parameterized solution is projected onto the closest discrete values at the final iteration; (column 2) discreteness as the only property inherited from the feasible set, i.e., each entry of the parameterized solution is projected onto the closest discrete value at each iteration of Eq. (6.9b); and (column 3) the parameterized solution projected onto the feasible set using the pattern-based algorithm discussed in the previous sections.

Figure 6.9. Example 2 model updates. Reconstruction results based on the pattern-based algorithm and Eq. (6.9): (Left) first 15 iterations of the parameterized solution; (Right) first 15 iterations of the feasible solution.

Example 3: Large Scale Complex Heterogeneity

Similar to the previous example, a pumping well is located in the middle of the domain (shown in Figure 10(a)), and it produces water at a constant drawdown pressure of 42.2 m. The 6583 × 6583 × 10 m³ domain is divided into 200 × 200 × 1 uniform cells of approximate size 33 × 33 × 10 m³. The left and right sides of the field are held at constant pressures of 20 m and 70 m, respectively, while no-flow conditions are applied to the top and bottom sides. 20 monitoring wells, which record the steady-state pressure values, are located uniformly around the pumping well, as depicted in Figure 10(a). As a result, the hydraulic conductivity and steady-state pressure values at the locations of these monitoring wells constitute the measurement space. Figure 10(b) shows the reference (log-)hydraulic conductivity map with two discrete values: facies #1 with value −6.2 log(m/s), and facies #2 with value 0.9 log(m/s). The training image and 9 conditional (on hard data) hydraulic conductivity realizations (out of 2000) are depicted in Figures 11(a) and 11(b), respectively.
Example 3: Large Scale Complex Heterogeneity

Similar to the previous example, a pumping well located in the middle of the domain (Figure 6.10(a)) produces water at a constant drawdown pressure of 42.2 m. The 6583 × 6583 × 10 m³ domain is divided into 200 × 200 × 1 uniform cells of approximate size 33 × 33 × 10 m³. The left and right sides of the field are held at constant pressures of 20 m and 70 m, respectively, and no-flow conditions are applied to the top and bottom sides. Twenty monitoring wells, which record the steady-state pressure values, are located uniformly around the pumping well, as depicted in Figure 6.10(a). The hydraulic conductivity and steady-state pressure values at the locations of these monitoring wells constitute the measurement space. Figure 6.10(b) shows the reference (log)-hydraulic conductivity map with two discrete values: facies #1 with value -6.2 log(m/s) and facies #2 with value 0.9 log(m/s). The training image and 9 (out of 2000) hydraulic conductivity realizations conditioned on hard data are depicted in Figures 6.11(a) and 6.11(b), respectively. The k-SVD dictionary with k = 1000 and S = 60 is generated from the 2000 realizations sampled from the training image; 25 of its basis functions are shown in Figure 6.11(c). In addition, the best solution that can be represented by the full k-SVD parameterization space is shown in Figure 6.10(c).

Figure 6.10. Example 3 well configuration: (a) configuration of the pumping test in Example 3, (b) reference hydraulic conductivity field, (c) optimum solution that can be obtained with the k-SVD basis functions shown in Figure 6.11(c).

For the pattern-based mapping, an 80 × 80 square template is adopted to replace the continuous patterns with discrete ones selected from the training image. The value of k_temp is set to 3 and, as in Example 2, the path for local pattern replacement includes 10% of the domain grid blocks. The results of the parameter reconstruction are depicted in Figure 6.12 for Cases 1, 2, and 3, and they show that the pattern-based algorithm retrieves the correct form of the connectivity patterns better than Cases 1 and 2. Figure 6.13 presents the continuous and discrete solutions obtained in the first 10 iterations of Eq. (6.9) for Case 3 (pattern-based). The updated continuous maps remain close to the most recent discrete solution because Eq. (6.9a) is constrained by $\alpha_2\|\mathbf{\Phi}\mathbf{v} - \mathbf{u}^{(n-1)}\|_2^2$; this constraint stabilizes the iterations of Eq. (6.9) and helps them converge steadily to the optimum solution.

Figure 6.11. Example 3 realizations and basis functions: (a) the training image with intersecting channel patterns and two facies values, (b) sample conditional prior model realizations, and (c) the k-SVD dictionary obtained for parameterization (k = 1000 and S = 60).

Figure 6.12. Example 3 reconstructed models comparison: parameter reconstruction when (column 1) no feasibility constraint is imposed, the result is obtained with the k-SVD parameterization regularized with $J(\mathbf{v}) = \|\mathbf{v}\|_1$, and the parameterized solution is projected onto the closest discrete values at the final iteration; (column 2) discreteness is the only property inherited from the feasible set, i.e., each entry of the parameterized solution is projected onto the closest discrete value at each iteration of Eq. (6.9b); and (column 3) the parameterized solution is projected onto the feasible set with the pattern-based algorithm discussed in the previous sections.

Figure 6.13. Example 3 model updates: reconstruction with the pattern-based algorithm and Eq. (6.9); (top) first 10 iterations of the parameterized solution, (bottom) first 10 iterations of the feasible solution.
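In code form, one outer pass of this scheme can be sketched as follows for a linearized forward operator G: a proximal-gradient (ISTA-type) solve of the sparsity-regularized subproblem of Eq. (6.9a), including the proximity term toward the latest feasible solution, followed by the projection of Eq. (6.9b). The operator G, the weights alpha1 and alpha2, the step-size rule, and the project_feasible routine are placeholders introduced for illustration; the smooth terms below carry conventional 1/2 factors, so the objective matches Eq. (6.9a) only up to constant scalings.

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def alternating_update(G, Phi, d, u0, project_feasible,
                       alpha1=0.1, alpha2=1.0, n_outer=15, n_ista=200):
    """Alternate between the parameterized solve (Eq. 6.9a) and the
    feasibility projection (Eq. 6.9b). Illustrative sketch only."""
    A = G @ Phi                                  # data as a function of v
    # ISTA step size from a Lipschitz bound on the smooth part.
    L = np.linalg.norm(A, 2) ** 2 + alpha2 * np.linalg.norm(Phi, 2) ** 2
    step = 1.0 / L
    v = np.zeros(Phi.shape[1])
    u = u0.copy()
    for _ in range(n_outer):
        for _ in range(n_ista):                  # solve (6.9a) for fixed u
            grad = A.T @ (A @ v - d) + alpha2 * Phi.T @ (Phi @ v - u)
            v = soft_threshold(v - step * grad, step * alpha1)
        u = project_feasible(Phi @ v)            # Eq. (6.9b): map to feasible set
    return Phi @ v, u
```

A typical call would pass the naive project_nearest from the earlier sketch (with appropriate reshaping between flat vectors and 2-D grids) to mimic Case 2, or a pattern-based projection, sketched after the Conclusion below, for Case 3.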
Conclusion

Inverse modeling of geologic environments with complex heterogeneity requires additional constraints to keep the estimated parameters feasible. Traditionally, variogram-based modeling approaches have shown great potential and wide application in this direction. However, it is well known that variogram-based techniques cannot preserve connectivity information in more complex environments, such as those with channelized features. Training image-based techniques are an alternative approach for overcoming this issue: they learn connectivity information from one (or a few) pre-specified training image(s) with the purpose of conveying that information to a data assimilation scheme. However, no straightforward inversion approach can easily constrain the parameter estimation process to the complex connectivity patterns suggested by the training image(s).

This chapter developed a deterministic inverse modeling formulation based on the idea of constraining the solution to a "feasible set". In this study's notation, the feasible set contains all the realizations that can be sampled from a training image; in practice, only a few representative samples are available to summarize this set and provide its (empirical) statistics. Besides the feasibility constraint, this study also parameterized the solution space with the sparsity-promoting k-SVD dictionary. In the proposed formulation, the optimization problem involves two versions of the same parameter: (i) the solution expressed in the parameterizing basis functions, and (ii) the solution obtained from the feasible set. The original optimization problem is split into two subproblems that iteratively retrieve the parameterized and feasible versions of the solution (akin to image segmentation with a prior). In the first subproblem, the search space is limited by the parameterization and by the neighborhood defined by the most recent feasible solution. Once the solution is obtained in the parameterizing basis, a mapping projects it onto the feasible set to obtain a new solution drawn from the training image. The mapping developed in this chapter first applies a template-matching algorithm that replaces the local connectivity patterns in the continuous (parameterized) solution with similar patterns offered by the training image. An aggregation over the overlapping connectivity templates is then applied to generate an empirical probability distribution at each grid block, and finally the discrete value with the maximum frequency is assigned to each cell.

The formulation proposed in this chapter is based on deterministic optimization and does not account for uncertainty quantification in the parameter space. An interesting extension of the methodology introduced here would be a probabilistic data assimilation version, in which both the parameterized and the feasible spaces are used for sampling.
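To make the mapping step above concrete, the sketch below outlines one plausible implementation of the pattern-based projection: each visited window of the continuous solution is matched against training-image patches, the k_temp best matches vote on every cell they cover, and each cell finally receives the most frequent discrete value. The L2 patch distance, the odd window size, the exhaustive patch search, and the fallback for unvisited cells are simplifying assumptions made for illustration; the actual matching criterion and bookkeeping of the chapter's method may differ.

```python
import numpy as np

def project_pattern_based(u_cont, TI, win=25, k_temp=10, frac=0.10, seed=0):
    """Project a continuous 2-D field onto training-image (TI) patterns.
    Each visited win x win window is compared with all TI patches; the
    k_temp closest discrete patches vote on the cells they cover, and each
    cell receives the discrete value with the highest vote count.
    win is assumed odd so that the window centers on the visited cell."""
    rng = np.random.default_rng(seed)
    h = win // 2
    vals = np.unique(TI)
    ny, nx = u_cont.shape
    votes = np.zeros((ny, nx, vals.size))
    # Exhaustive patch extraction; fine for small TIs, indexed search otherwise.
    py, px = TI.shape
    patches = np.array([TI[i:i + win, j:j + win]
                        for i in range(py - win + 1)
                        for j in range(px - win + 1)])
    # Random path over a fraction of the interior cells.
    cells = [(i, j) for i in range(h, ny - h) for j in range(h, nx - h)]
    for c in rng.permutation(len(cells))[: int(frac * len(cells))]:
        i, j = cells[c]
        w = u_cont[i - h:i + h + 1, j - h:j + h + 1]
        d2 = ((patches - w) ** 2).sum(axis=(1, 2))   # similarity by L2 distance
        for p in patches[np.argsort(d2)[:k_temp]]:   # k_temp best matches vote
            for v, val in enumerate(vals):
                votes[i - h:i + h + 1, j - h:j + h + 1, v] += (p == val)
    # Majority vote where templates overlapped; nearest-value fallback elsewhere.
    fallback = vals[np.argmin(np.abs(u_cont[..., None] - vals), axis=-1)]
    winner = vals[np.argmax(votes, axis=-1)]
    return np.where(votes.sum(axis=-1) == 0, fallback, winner)
```

On the 64 × 64 tri-facies example, a call such as project_pattern_based(u, TI, win=25, k_temp=10) would play the role of Eq. (6.9b) inside the alternating loop sketched earlier.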
A NOTE ON PRACTICAL APPLICATIONS

During the last few decades, a great deal of research has been devoted to exploring the applicability and suitability of the inverse approach for identifying the parameters of subsurface models. Most of these studies have focused either on laboratory experiments, using small core samples with well-defined boundary conditions, or on synthetic simulation studies, with the aim of fully understanding the prospects and limitations of the method and of facilitating benchmarking against subsurface physical properties derived from static or steady-state methods on similar samples. During the past decade, considerable interest has also been shown in testing the various proposed formulations and frameworks in real-world (subsurface) inverse modeling applications [198], [220]–[225]. In this thesis, several new approaches are proposed for addressing some of the important problems in subsurface inverse modeling applications. While some of the proposed methods have been successfully tested in practical applications [226], more comprehensive studies are needed to understand the practical limitations and benefits of the proposed discrete imaging formulations.

As an example, many of the parameters in subsurface flow and transport models cannot be estimated directly at the scale of interest; they can only be derived by combining several inversion algorithms at different scales. It is important to note that the majority of the proposed algorithms can, in practice, only provide a better understanding of the desired (image) model. Such insight can then be used in an iterative process to adjust the initial assumptions about the models. For example, close inspection of the model updates obtained after several iterations of the pattern-based discrete formulation proposed in Chapter VI can lead to a better understanding of the scale of the connectivity paths and the flow preferences in the subsurface. Updates made to regions of the subsurface model that were not well understood from the static data prior to dynamic data integration can provide hints on where to launch new exploration projects and collect new static data. Such hints then help the geological modeling teams make the necessary adjustments to the initial model realizations and update the transform functions used for the parameterization accordingly. Figure 7.1 depicts the proposed practical process. The key component of this process is the feedback loop, which represents the adaptation of the initial model realizations (the geologic model) using the updates and insights provided by the inverse modeling of the dynamic data.

Figure 7.1. Process for practical use of the proposed inverse modeling algorithms: the proposed process for combining the formulations of this thesis with practical modeling workflows.

BIBLIOGRAPHY

[1] A. Tarantola, Inverse Problem Theory and Methods for Model Parameter Estimation. Society for Industrial and Applied Mathematics, 2005.
[2] C. R. Vogel and J. G. Wade, "Iterative SVD-based methods for ill-posed problems," SIAM J. Sci. Comput., vol. 15, no. 3, pp. 736–754, 1994.
[3] P. C. Hansen, "The discrete Picard condition for discrete ill-posed problems," BIT, vol. 30, no. 4, pp. 658–672, 1990.
[4] K. Miller, "Least Squares Methods for Ill-Posed Problems with a Prescribed Bound," SIAM J. Math. Anal., vol. 1, no. 1, pp. 52–74, Feb. 1970.
[5] E. Resmerita, "Regularization of ill-posed problems in Banach spaces: convergence rates," Inverse Probl., vol. 21, no. 4, pp. 1303–1314, Aug. 2005.
[6] K. Hoffman, Banach Spaces of Analytic Functions. Prentice-Hall, 1962.
[7] S. Arridge and J. Schotland, "Optical tomography: forward and inverse problems," Inverse Probl., vol. 25, p. 123010, 2009.
[8] M. H. Worthington, "An introduction to geophysical tomography," First Break, vol. 2, no. 11, pp. 20–26, Nov. 1984.
[9] S. Siltanen et al., "Statistical inversion for medical x-ray tomography with few radiographs: I. General theory," Phys. Med. Biol., vol. 48, no. 10, pp. 1437–1463, May 2003.
[10] D. Gamerman, "Sampling from the posterior distribution in generalized linear mixed models," Stat. Comput., vol. 7, no. 1, pp. 57–68, 1997.
[11] C. M. Theobald, "Generalizations of Mean Square Error Applied to Ridge Regression," J. R. Stat. Soc. Ser. B (Methodol.), vol. 36, pp. 103–106, 1974.
[12] K. Ganchev, J. Graça, J. Gillenwater, and B. Taskar, "Posterior Regularization for Structured Latent Variable Models," J. Mach. Learn. Res., vol. 11, pp. 2001–2049, 2010.
[13] A. Tarantola and B. Valette, "Generalized nonlinear inverse problems solved using the least squares criterion," Rev. Geophys., vol. 20, no. 2, p. 219, May 1982.
[14] E. Haber, U. M. Ascher, and D. Oldenburg, "On optimization techniques for solving nonlinear inverse problems," Inverse Probl., vol. 16, no. 5, pp. 1263–1280, Oct. 2000.
[15] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[16] G. H. Golub and C. Reinsch, "Singular value decomposition and least squares solutions," Numer. Math., vol. 14, no. 5, pp. 403–420, Apr. 1970.
[17] P. C. Hansen, "Analysis of Discrete Ill-Posed Problems by Means of the L-Curve," SIAM Rev., vol. 34, no. 4, pp. 561–580, Dec. 1992.
[18] W. H. Chen, G. R. Gavalas, J. H. Seinfeld, and M. L. Wasserman, "A New Algorithm for Automatic History Matching," Soc. Pet. Eng. J., vol. 14, no. 6, pp. 593–608, Dec. 1974.
[19] G. R. Gavalas, "A new method of parameter estimation in linear systems," AIChE J., vol. 19, no. 2, pp. 214–222, Mar. 1973.
[20] G. R. Gavalas, P. C. Shah, and J. H. Seinfeld, "Reservoir History Matching by Bayesian Estimation," Soc. Pet. Eng. J., vol. 16, no. 6, pp. 337–350, Dec. 1976.
[21] H. J. Hendricks Franssen et al., "A comparison of seven methods for the inverse modelling of groundwater flow. Application to the characterisation of well catchments," Adv. Water Resour., vol. 32, no. 6, pp. 851–872, Jun. 2009.
[22] J. Carrera and S. P. Neuman, "Estimation of Aquifer Parameters Under Transient and Steady State Conditions: 1. Maximum Likelihood Method Incorporating Prior Information," Water Resour. Res., vol. 22, no. 2, pp. 199–210, 1986.
[23] W. Menke, Geophysical Data Analysis: Discrete Inverse Theory. 2012.
[24] M. S. Zhdanov, "Geophysical Inverse Theory and Regularization Problems," Methods Geochem. Geophys., vol. 36, pp. 467–529, 2002.
[25] J. A. Scales, M. L. Smith, and S. Treitel, "Introductory geophysical inverse theory," lecture notes, pp. 1–208, 2001.
[26] S. Ganguly and M. Kumar, "Geothermal reservoirs—A brief review," J. Geol. Soc. India, vol. 79, pp. 589–602, 2012.
[27] K. Pruess, "Modeling of geothermal reservoirs: Fundamental processes, computer simulation and field applications," Geothermics, vol. 19, no. 1, pp. 3–15, 1990.
[28] S. A. Shapiro, E. Huenges, and G. Borm, "Estimating the crust permeability from fluid-injection-induced seismic emission at the KTB site," Geophys. J. Int., vol. 131, no. 2, pp. F15–F18, 1997.
[29] J. Carrera, A. Alcolea, A. Medina, J. Hidalgo, and L. J. Slooten, "Inverse problem in hydrogeology," Hydrogeol. J., vol. 13, no. 1, pp. 206–222, Mar. 2005.
[30] C. W. Fetter, Applied Hydrogeology. 2001.
[31] M. Cardiff and P. K. Kitanidis, "Efficient solution of nonlinear, underdetermined inverse problems with a generalized PDE model," Comput. Geosci., vol. 34, no. 11, pp. 1480–1491, 2008.
[32] H. Zhou, J. J. Gómez-Hernández, and L. Li, "Inverse methods in hydrogeology: Evolution and recent trends," Adv. Water Resour., vol. 63, pp. 22–37, Jan. 2014.
[33] S. I. Kabanikhin, "Definitions and examples of inverse and ill-posed problems," J. Inverse Ill-Posed Probl., vol. 16, no. 4, pp. 317–357, 2008.
[34] G. de Marsily, J. P. Delhomme, A. Coudrain-Ribstein, and A. M. Lavenue, "Four decades of inverse problems in hydrogeology," in Special Paper 348: Theory, Modeling, and Field Investigation in Hydrogeology: A Special Volume in Honor of Shlomo P. Neuman's 60th Birthday, Geological Society of America, 2000, pp. 1–17.
[35] D. S. Oliver and Y. Chen, "Recent progress on reservoir history matching: a review," Comput. Geosci., vol. 15, no. 1, pp. 185–221, Jan. 2011.
[36] A. N. Tikhonov and V. Y. Arsenin, Solutions of Ill-Posed Problems. Winston, 1977.
[37] P. Jacquard, "Permeability Distribution From Field Pressure Data," Soc. Pet. Eng. J., vol. 5, no. 4, pp. 281–294, Dec. 1965.
[38] H. O. Jahns, "A Rapid Method for Obtaining a Two-Dimensional Reservoir Description From Well Pressure Response Data," Soc. Pet. Eng. J., vol. 6, no. 4, pp. 315–327, Dec. 1966.
[39] R. C. Bissel, Y. Sharma, and J. E. Killough, "History Matching Using the Method of Gradients: Two Case Studies," in Annual Technical Conference and Exhibition, 1994, no. 28590.
[40] M. M. F. Z. Hassane and P. Ackerer, "Groundwater flow parameter estimation using refinement and coarsening indicators for adaptive downscaling parameterization," Adv. Water Resour., vol. 100, pp. 139–152, 2017.
[41] A.-A. Grimstad and T. Mannseth, "Nonlinearity, Scale, and Sensitivity for Parameter Estimation Problems," SIAM J. Sci. Comput., vol. 21, no. 6, pp. 2096–2113, 2000.
[42] A. Golmohammadi and B. Jafarpour, "Simultaneous geologic scenario identification and flow model calibration with group-sparsity formulations," Adv. Water Resour., vol. 92, pp. 208–227, Jun. 2016.
[43] G. Nolet and R. Montelli, "Optimal parametrization of tomographic models," Geophys. J. Int., vol. 161, no. 2, pp. 365–372, 2005.
[44] D. W. Oldenburg and L. Yaoguo, "Subspace linear inverse method," Inverse Probl., vol. 10, no. 4, p. 915, 1994.
[45] D. W. Oldenburg, P. R. McGillivray, and R. G. Ellis, "Generalized Subspace Methods For Large-Scale Inverse Problems," Geophys. J. Int., vol. 114, no. 1, pp. 12–20, 1993.
[46] B. L. N. Kennett and P. R. Williamson, "Subspace methods for large-scale nonlinear inversion," in Mathematical Geophysics, 1988, pp. 139–154.
[47] M. Le Ravalec-Dupin and L. Y. Hu, "Combining the pilot point and gradual deformation methods for calibrating permeability models to dynamic data," Oil Gas Sci. Technol., vol. 62, no. 2 (special issue), pp. 169–180, 2007.
[48] L. Li, S. Srinivasan, H. Zhou, and J. J. Gómez-Hernández, "A pilot point guided pattern matching approach to integrate dynamic data into geological modeling," Adv. Water Resour., vol. 62, part A, pp. 152–128, 2013.
[49] B. S. RamaRao, A. M. LaVenue, G. De Marsily, and M. G. Marietta, "Pilot Point Methodology for Automated Calibration of an Ensemble of Conditionally Simulated Transmissivity Fields: 1. Theory and Computational Experiments," Water Resour. Res., vol. 31, no. 3, pp. 475–493, 1995.
[50] R. C. Gonzalez and R. E. Woods, Digital Image Processing (3rd Edition). 2007.
[51] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete Cosine Transform," IEEE Trans. Comput., vol. C-23, no. 1, pp. 90–93, Jan. 1974.
[52] K. Cabeen and P. Gent, "Image compression and the discrete cosine transform," pp. 1–11, 1999.
[53] J. Zhou, "On discrete cosine transform," preprint, 2011.
[54] S. G. Mallat, "A Theory for Multiresolution Signal Decomposition: The Wavelet Representation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 11, no. 7, pp. 674–693, 1989.
[55] C. Vonesch, T. Blu, and M. Unser, "Generalized Daubechies wavelet families," IEEE Trans. Signal Process., vol. 55, no. 9, pp. 4415–4429, 2007.
[56] G. K. Wallace, "The JPEG still picture compression standard," IEEE Trans. Consum. Electron., vol. 38, no. 1, pp. xviii–xxxiv, 1992.
[57] W. Luo, J. Huang, and G. Qiu, "JPEG error analysis and its applications to digital image forensics," IEEE Trans. Inf. Forensics Secur., vol. 5, no. 3, pp. 480–491, 2010.
[58] M. W. Marcellin, M. J. Gormish, A. Bilgin, and M. P. Boliek, "An overview of JPEG-2000," in Proc. DCC 2000 Data Compression Conference, 2000, pp. 523–541.
[59] M. D. Adams, "The JPEG-2000 Still Image Compression Standard," 2001.
[60] A. Journel and C. Deutsch, GSLIB: Geostatistical Software Library and User's Guide. 1995.
[61] M. Aharon, M. Elad, and A. M. Bruckstein, "K-SVD: Design of Dictionaries for Sparse Representation," in SPARS'05, 2005, pp. 9–12.
[62] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation," IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311–4322, Nov. 2006.
[63] R. A. DeVore, "Nonlinear approximation," Acta Numer., vol. 7, pp. 51–150, 1998.
[64] R. A. DeVore, "Nonlinear approximation and its applications," in Multiscale, Nonlinear and Adaptive Approximation: Dedicated to Wolfgang Dahmen on the Occasion of his 60th Birthday, 2009, pp. 169–201.
[65] B. Jafarpour and D. B. McLaughlin, "Reservoir Characterization With the Discrete Cosine Transform," SPE J., vol. 14, no. 1, pp. 182–201, Mar. 2009.
[66] A. M. Bruckstein, D. L. Donoho, and M. Elad, "From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images," SIAM Rev., vol. 51, no. 1, pp. 34–81, 2009.
[67] E. J. Candes and T. Tao, "Near-optimal signal recovery from random projections: Universal encoding strategies?," IEEE Trans. Inf. Theory, vol. 52, no. 12, pp. 5406–5425, 2006.
[68] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic Decomposition by Basis Pursuit," SIAM J. Sci. Comput., vol. 20, no. 1, pp. 33–61, Jan. 1998.
[69] D. L. Donoho, "Compressed sensing," IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
[70] B. Jafarpour, V. K. Goyal, D. B. McLaughlin, and W. T. Freeman, "Compressed History Matching: Exploiting Transform-Domain Sparsity for Regularization of Nonlinear Dynamic Data Integration Problems," Math. Geosci., vol. 42, no. 1, pp. 1–27, Jan. 2010.
[71] M. Khaninezhad and B. Jafarpour, "Hybrid Parameterization of Reservoir Properties for Robust History Matching Under Geologic Uncertainty," in SPE Reservoir Characterisation and Simulation Conference and Exhibition, Abu Dhabi, UAE, 2011.
[72] M. A. Davenport, M. F. Duarte, Y. C. Eldar, and G. Kutyniok, "Introduction to compressed sensing," preprint, pp. 1–68, 2011.
[73] G. Kutyniok, "Theory and Applications of Compressed Sensing," preprint, 2012.
[74] J. Wu, A. Boucher, and T. Zhang, "A SGeMS code for pattern simulation of continuous and categorical variables: FILTERSIM," Comput. Geosci., vol. 34, no. 12, pp. 1863–1876, Dec. 2008.
[75] S. Strebelle, "Conditional Simulation of Complex Geological Structures Using Multiple-Point Statistics," Math. Geol., vol. 34, no. 1, pp. 1–21, 2002.
[76] M. Lebrun and A. Leclaire, "An Implementation and Detailed Analysis of the K-SVD Image Denoising Algorithm," Image Process. On Line, vol. 2, pp. 96–133, May 2012.
[77] J. Tropp and A. Gilbert, "Signal recovery from partial information via orthogonal matching pursuit," IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, 2007.
[78] D. Wipf and S. Nagarajan, "Iterative reweighted ℓ1 and ℓ2 methods for finding sparse solutions," IEEE J. Sel. Top. Signal Process., vol. 4, no. 2, pp. 317–329, 2010.
[79] J. Sigl, "Nonlinear residual minimization by iteratively reweighted least squares," Comput. Optim. Appl., vol. 64, no. 3, pp. 755–792, 2016.
[80] D. Wipf and S. Nagarajan, "Iterative reweighted ℓ1 and ℓ2 methods for finding sparse solutions," IEEE J. Sel. Top. Signal Process., vol. 4, no. 2, pp. 317–329, 2010.
[81] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proc. 5th Berkeley Symp. Math. Stat. Probab., 1967, vol. 1, pp. 281–297.
[82] F. Sana, K. Katterbauer, T. Al-Naffouri, and I. Hoteit, "Orthogonal Matching Pursuit for Enhanced Recovery of Sparse Geological Structures With the Ensemble Kalman Filter," IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., vol. 9, no. 4, pp. 1710–1724, 2016.
[83] A. H. Elsheikh, M. F. Wheeler, and I. Hoteit, "Sparse calibration of subsurface flow models using nonlinear orthogonal matching pursuit and an iterative stochastic ensemble method," Adv. Water Resour., vol. 56, pp. 14–26, Jun. 2013.
[84] T. Blumensath and M. E. Davies, "Iterative hard thresholding for compressed sensing," Appl. Comput. Harmon. Anal., vol. 27, no. 3, pp. 265–274, Nov. 2009.
[85] M. M. Khaninezhad, B. Jafarpour, and L. Li, "Sparse geologic dictionaries for subsurface flow model calibration: Part II. Robustness to uncertainty," Adv. Water Resour., vol. 39, pp. 122–136, 2012.
[86] M. R. Mohammad-Khaninezhad, B. Jafarpour, and L. Li, "Sparse geologic dictionaries for subsurface flow model calibration: Part II. Robustness to uncertainty," Adv. Water Resour., vol. 39, pp. 122–136, 2012.
[87] M. M. Khaninezhad, B. Jafarpour, and L. Li, "Sparse geologic dictionaries for subsurface flow model calibration: Part I. Inversion formulation," Adv. Water Resour., vol. 39, pp. 106–121, 2012.
[88] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, 2004.
[89] A. Settari, "Reservoir Compaction," J. Pet. Technol., vol. 54, no. 8, 2002.
[90] J. Bear, "Hydraulics of groundwater," Adv. Water Resour., vol. 3, no. 4, p. 567, 1979.
[91] J. Aarnes, T. Gimse, and K.-A. Lie, "An Introduction to the Numerics of Flow in Porous Media using Matlab," in Geom. Model. Numer. Simul. Optim., 2007, pp. 265–306.
[92] J. Carrera and S. P. Neuman, "Estimation of Aquifer Parameters Under Transient and Steady State Conditions: 2. Uniqueness, Stability, and Solution Algorithms," Water Resour. Res., vol. 22, no. 2, pp. 211–227, 1986.
[93] J. Doherty, "Ground water model calibration using pilot points and regularization," Ground Water, vol. 41, no. 2, pp. 170–177, 2003.
[94] L. Li and B. Jafarpour, "Effective solution of nonlinear subsurface flow inverse problems in sparse bases," Inverse Probl., vol. 26, no. 10, p. 105016, 2010.
[95] P. T. Boufounos, "Greedy sparse signal reconstruction from sign measurements," in Conference Record - Asilomar Conference on Signals, Systems and Computers, 2009, pp. 1305–1309.
[96] J. A. Tropp, A. C. Gilbert, and M. J. Strauss, "Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit," Signal Processing, vol. 86, no. 3, pp. 572–588, 2006.
[97] T. T. Cai and L. Wang, "Orthogonal matching pursuit for sparse signal recovery with noise," IEEE Trans. Inf. Theory, vol. 57, no. 7, pp. 4680–4688, 2011.
[98] D. Needell, J. Tropp, and R. Vershynin, "Greedy signal recovery review," in Conference Record - Asilomar Conference on Signals, Systems and Computers, 2008, pp. 1048–1050.
[99] N. Meinshausen, "Relaxed Lasso," Comput. Stat. Data Anal., vol. 52, no. 1, pp. 374–393, 2007.
[100] R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight, "Sparsity and smoothness via the fused lasso," J. R. Stat. Soc. Ser. B Stat. Methodol., vol. 67, no. 1, pp. 91–108, 2005.
[101] P. Zhao and B. Yu, "Boosted lasso," 2004.
[102] J. A. Tropp and A. C. Gilbert, "Signal Recovery From Random Measurements Via Orthogonal Matching Pursuit," IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, Dec. 2007.
[103] M. Schmidt, "Least Squares Optimization with L1-Norm Regularization," pp. 230–238, 2005.
[104] C. A. Micchelli, J. M. Morales, and M. Pontil, "A Family of Penalty Functions for Structured Sparsity," in NIPS'10, 2010, pp. 1612–1623.
[105] E. J. Candes and M. B. Wakin, "An Introduction To Compressive Sampling," IEEE Signal Process. Mag., vol. 25, no. 2, pp. 21–30, Mar. 2008.
[106] G. H. Golub, M. Heath, and G. Wahba, "Generalized cross-validation as a method for choosing a good ridge parameter," Technometrics, vol. 21, no. 2, pp. 215–223, 1979.
[107] C. V. Deutsch and A. G. Journel, "GSLIB: Geostatistical Software Library and User's Guide," Technometrics, vol. 37, no. 1, pp. 126–126, Feb. 1995.
[108] B. Jafarpour and M. Tarrahi, "Assessing the performance of the ensemble Kalman filter for subsurface flow data integration under variogram uncertainty," Water Resour. Res., vol. 47, no. 5, p. W05537, May 2011.
[109] M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer Publishing Company, 2010.
[110] M. M. Khaninezhad and B. Jafarpour, "Prior Model Identification with Sparsity-Promoting History Matching," in SPE Reservoir Simulation Symposium, 2013.
[111] B. Jafarpour, L. Li, and M. R. Khaninezhad, "An Iteratively Reweighted Algorithm for History Matching of Oil Reservoirs in Sparse Domains," in 12th European Conference on the Mathematics of Oil Recovery, 2010.
[112] V. Britanak, P. C. Yip, and K. R. Rao, Discrete Cosine and Sine Transforms: General Properties, Fast Algorithms and Integer Approximations. Elsevier Science, 2010.
[113] B. Jafarpour, V. K. Goyal, D. B. McLaughlin, and W. T. Freeman, "Transform-domain sparsity regularization for inverse problems in geosciences," Geophysics, vol. 74, no. 5, p. R69, 2009.
[114] M. M. Khaninezhad and B. Jafarpour, "Hybrid Parameterization for Robust History Matching," SPE J., vol. 19, no. 3, pp. 487–499, Jun. 2014.
[115] S. Mallat, A Wavelet Tour of Signal Processing, Third Edition: The Sparse Way. Academic Press, 2008.
[116] B. Jafarpour, "Wavelet Reconstruction of Geologic Facies From Nonlinear Dynamic Flow Measurements," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 5, pp. 1520–1535, May 2011.
[117] E. W. Bhark, B. Jafarpour, and A. Datta-Gupta, "A generalized grid connectivity-based parameterization for subsurface flow model calibration," Water Resour. Res., vol. 47, no. 6, p. W06517, Jun. 2011.
[118] M. M. Khaninezhad and B. Jafarpour, "Prior model identification during subsurface flow data integration with adaptive sparse representation techniques," Comput. Geosci., vol. 18, no. 1, pp. 3–16, 2014.
[119] P. C. Hansen and D. P. O'Leary, "The Use of the L-Curve in the Regularization of Discrete Ill-Posed Problems," SIAM J. Sci. Comput., vol. 14, no. 6, pp. 1487–1503, 1993.
[120] P. C. Hansen, "The L-Curve and its Use in the Numerical Treatment of Inverse Problems," in Computational Inverse Problems in Electrocardiology (P. Johnston, ed.), Adv. Comput. Bioeng., vol. 4, pp. 119–142, 2000.
[121] M. W. Browne, "Cross-Validation Methods," J. Math. Psychol., vol. 44, no. 1, pp. 108–132, 2000.
[122] R. Kohavi, "A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection," in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 1995, pp. 1–7.
[123] M. M. Khaninezhad, B. Jafarpour, and L. Li, "Sparse geologic dictionaries for subsurface flow model calibration: Part I. Inversion formulation," Adv. Water Resour., vol. 39, pp. 106–121, Apr. 2012.
[124] E. Laloy, B. Rogiers, J. A. Vrugt, D. Mallants, and D. Jacques, "Efficient posterior exploration of a high-dimensional groundwater model from two-stage Markov chain Monte Carlo simulation and polynomial chaos expansion," Water Resour. Res., vol. 49, no. 5, pp. 2664–2682, 2013.
[125] B. Jafarpour and D. McLaughlin, "Estimating Channelized-Reservoir Permeabilities With the Ensemble Kalman Filter: The Importance of Ensemble Design," SPE J., vol. 14, no. 2, 2009.
[126] S. I. Aanonsen, G. Nævdal, D. S. Oliver, A. C. Reynolds, and B. Vallès, "The Ensemble Kalman Filter in Reservoir Engineering—a Review," SPE J., vol. 14, no. 3, pp. 393–412, Sep. 2009.
[127] D. L. Moreno and S. I. Aanonsen, "Continuous Facies Updating Using the Ensemble Kalman Filter and the Level Set Method," Math. Geosci., vol. 43, no. 8, pp. 951–970, Nov. 2011.
[128] Y. Chen, D. Oliver, and D. Zhang, "Efficient ensemble-based closed-loop production optimization," SPE J., Dec. 2009.
[129] G. Evensen, Data Assimilation: The Ensemble Kalman Filter. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009.
[130] A. H. Elsheikh, V. Demyanov, R. Tavakoli, M. A. Christie, and M. F. Wheeler, "Calibration of channelized subsurface flow models using nested sampling and soft probabilities," Adv. Water Resour., Oct. 2014.
[131] P. K. Kitanidis, "Generalized priors in Bayesian inversion problems," Adv. Water Resour., vol. 36, pp. 3–10, Feb. 2012.
[132] E. Candès, "Compressive sampling," in Proc. Int. Congr. Mathematicians, 2006.
[133] Y. Chen and D. Zhang, "Data assimilation for transient flow in geologic formations via ensemble Kalman filter," Adv. Water Resour., vol. 29, no. 8, pp. 1107–1122, 2006.
[134] J. L. Anderson, "An Ensemble Adjustment Kalman Filter for Data Assimilation," Mon. Weather Rev., vol. 129, no. 12, pp. 2884–2903, 2001.
[135] P. L. Houtekamer and H. L. Mitchell, "Data Assimilation Using an Ensemble Kalman Filter Technique," Mon. Weather Rev., vol. 126, no. 3, pp. 796–811, 1998.
[136] G. Evensen, "The Ensemble Kalman Filter: Theoretical formulation and practical implementation," Ocean Dyn., vol. 53, no. 4, pp. 343–367, 2003.
[137] P. K. Kitanidis, "Quasi-Linear Geostatistical Theory for Inversing," Water Resour. Res., vol. 31, no. 10, pp. 2411–2419, Oct. 1995.
[138] D. S. Oliver, N. He, and A. C. Reynolds, "Conditioning Permeability Fields to Pressure Data," in Eur. Conf. Math. Oil Recovery V, 1996, pp. 1–11.
[139] D. S. Oliver, A. C. Reynolds, and N. Liu, Inverse Theory for Petroleum Reservoir Characterization and History Matching. Cambridge: Cambridge University Press, 2008.
[140] G. Gao, M. Zafari, and A. Reynolds, "Quantifying Uncertainty for the PUNQ-S3 Problem in a Bayesian Setting With RML and EnKF," SPE J., vol. 11, no. 4, pp. 506–515, Dec. 2006.
[141] M. M. Khaninezhad and B. Jafarpour, "Prior model identification during subsurface flow data integration with adaptive sparse representation techniques," Comput. Geosci., vol. 18, no. 1, pp. 3–16, Feb. 2014.
[142] M. M. Khaninezhad and B. Jafarpour, "Sparse Randomized Maximum Likelihood (SpRML) for subsurface flow model calibration and uncertainty quantification," Adv. Water Resour., vol. 69, pp. 23–37, 2014.
[143] H. Zhou, J. J. Gómez-Hernández, and L. Li, "Inverse methods in hydrogeology: Evolution and recent trends," Adv. Water Resour., vol. 63, pp. 22–37, Jan. 2014.
[144] P. K. Kitanidis and E. G. Vomvoris, "A geostatistical approach to the inverse problem in groundwater modeling (steady state) and one-dimensional simulations," Water Resour. Res., vol. 19, no. 3, pp. 677–690, Jun. 1983.
[145] M. Cardiff, W. Barrash, and P. K. Kitanidis, "Hydraulic conductivity imaging from 3-D transient hydraulic tomography at several pumping/observation densities," Water Resour. Res., vol. 49, no. 11, pp. 7311–7326, Nov. 2013.
[146] N. Cressie, "The origins of kriging," Math. Geol., vol. 22, no. 3, pp. 239–252, Apr. 1990.
[147] J.-P. Chilès and P. Delfiner, Geostatistics: Modeling Spatial Uncertainty. Hoboken, NJ, USA: John Wiley & Sons, 1999.
[148] N. Remy, A. Boucher, and J. Wu, Applied Geostatistics with SGeMS. Cambridge: Cambridge University Press, 2009.
[149] J. J. Gómez-Hernández and X.-H. Wen, "To be or not to be multi-Gaussian? A reflection on stochastic hydrogeology," Adv. Water Resour., vol. 21, no. 1, pp. 47–61, 1998.
[150] P. M. Meier, A. Medina, and J. Carrera, "Geostatistical Inversion of Cross-Hole Pumping Tests for Identifying Preferential Flow Channels Within a Shear Zone," Ground Water, vol. 39, no. 1, pp. 10–17, 2001.
[151] A. Alcolea, J. Carrera, and A. Medina, "Pilot points method incorporating prior information for solving the groundwater flow inverse problem," Adv. Water Resour., vol. 29, no. 11, pp. 1678–1689, Nov. 2006.
[152] J. Kerrou, P. Renard, H. J. Hendricks Franssen, and I. Lunati, "Issues in characterizing heterogeneity and connectivity in non-multiGaussian media," Adv. Water Resour., vol. 31, no. 1, pp. 147–159, 2008.
[153] R. Tavakoli and A. C. Reynolds, "History Matching With Parametrization Based on the SVD of a Dimensionless Sensitivity Matrix," SPE J., vol. 15, no. 2, pp. 495–508, Jun. 2010.
[154] B. Jafarpour and D. B. McLaughlin, "Reservoir Characterization With the Discrete Cosine Transform," SPE J., vol. 14, no. 1, pp. 182–201, Mar. 2009.
[155] I. Sahni and R. N. Horne, "Generating Multiple History-Matched Reservoir Model Realizations Using Wavelets," in SPE Annual Technical Conference and Exhibition, 2004.
[156] M. M. Khaninezhad, B. Jafarpour, and L. Li, "Sparse geologic dictionaries for subsurface flow model calibration: Part II. Robustness to uncertainty," Adv. Water Resour., vol. 39, pp. 122–136, Apr. 2012.
[157] A. Golmohammadi, M. M. Khaninezhad, and B. Jafarpour, "Group-sparsity regularization for ill-posed subsurface flow inverse problems," Water Resour. Res., vol. 51, no. 10, pp. 8607–8626, 2015.
[158] Y. Zhang, D. S. Oliver, Y. Chen, and H. J. Skaug, "Data Assimilation by Use of the Iterative Ensemble Smoother for 2D Facies Models," SPE J., vol. 20, no. 1, pp. 169–185, Feb. 2014.
[159] B. Sebacher, A. S. Stordal, and R. Hanea, "Bridging multipoint statistics and truncated Gaussian fields for improved estimation of channelized reservoirs with ensemble methods," Comput. Geosci., vol. 19, no. 2, pp. 341–369, Apr. 2015.
[160] H. X. Vo and L. J. Durlofsky, "A New Differentiable Parameterization Based on Principal Component Analysis for the Low-Dimensional Representation of Complex Geological Models," Math. Geosci., vol. 46, no. 7, pp. 775–813, Oct. 2014.
[161] S.-Y. Chang and S. Latif, "Ensemble Kalman Filter to Improve the Accuracy of a Three Dimensional Flow and Transport Model with a Continuous Pollutant Source," in World Environmental and Water Resources Congress 2011, pp. 1109–1117.
[162] R. J. Lorentzen, G. Nævdal, and A. Shafieirad, "Estimating Facies Fields by Use of the Ensemble Kalman Filter and Distance Functions—Applied to Shallow-Marine Environments," SPE J., vol. 3, no. 1, pp. 146–158, Feb. 2013.
[163] G. Evensen, "Inverse methods and data assimilation in nonlinear ocean models," Phys. D Nonlinear Phenom., vol. 77, no. 1–3, pp. 108–129, Oct. 1994.
[164] R. J. Lorentzen, G. Naevdal, B. Valles, A. Berg, and A.-A. Grimstad, "Analysis of the Ensemble Kalman Filter for Estimation of Permeability and Porosity in Reservoir Models," in SPE Annual Technical Conference and Exhibition, 2005, SPE 96375.
[165] M. Khodabakhshi and B. Jafarpour, "Adaptive Conditioning of Multiple-Point Statistical Facies Simulation to Flow Data with Probability Maps," Math. Geosci., vol. 46, no. 5, pp. 573–595, Jul. 2014.
[166] N. Liu and D. S. Oliver, "Automatic History Matching of Geologic Facies," SPE J., vol. 9, no. 4, pp. 429–436, Dec. 2004.
[167] B. Sebacher, R. Hanea, and A. Heemink, "A probabilistic parametrization for geological uncertainty estimation using the ensemble Kalman filter (EnKF)," Comput. Geosci., vol. 17, no. 5, pp. 813–832, Oct. 2013.
[168] R. Hemmecke, M. Köppe, J. Lee, and R. Weismantel, "Nonlinear integer programming," in 50 Years of Integer Programming 1958-2008: From the Early Years to the State-of-the-Art, 2010, pp. 561–618.
[169] H. P. Williams, "Integer programming," in Constraints, vol. 11 Suppl 1, no. 1, 1980, pp. 272–319.
[170] T. Lukić, "Discrete Tomography Reconstruction Based on the Multi-well Potential," in Combinatorial Image Analysis, 2011, pp. 335–345.
[171] S. Boyd, "Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers," Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1–122, 2010.
[172] J. Zhu and T. C. J. Yeh, "Characterization of aquifer heterogeneity using transient hydraulic tomography," Water Resour. Res., vol. 41, no. 7, pp. 1–10, 2005.
[173] M. L. Calvache, J. P. Sánchez-Úbeda, C. Duque, M. López-Chicano, and B. de la Torre, "Evaluation of Analytical Methods to Study Aquifer Properties with Pumping Tests in Coastal Aquifers with Numerical Modelling (Motril-Salobreña Aquifer)," Water Resources Management, pp. 1–17, 2015.
[174] A. Zech, S. Arnold, C. Schneider, and S. Attinger, "Estimating parameters of aquifer heterogeneity using pumping tests - implications for field applications," Adv. Water Resour., vol. 83, pp. 137–147, 2015.
[175] H.-D. Yeh and C. T. Wang, "Analysis of well residual drawdown after a constant-head test," J. Hydrol., vol. 373, no. 3–4, pp. 436–441, 2009.
[176] N. D. Bregman, "Crosshole seismic tomography," Geophysics, vol. 54, no. 2, pp. 200–215, 1989.
[177] G. Backus and F. Gilbert, "Uniqueness in the Inversion of Inaccurate Gross Earth Data," Philos. Trans. R. Soc. A Math. Phys. Eng. Sci., vol. 266, pp. 123–192, 1970.
[178] M.-R. Khaninezhad, A. Golmohammadi, and B. Jafarpour, "Discrete Regularization for Calibration of Geologic Facies Against Dynamic Flow Data," Water Resour. Res., Jan. 2018.
[179] Y. Liu, "Using the Snesim program for multiple-point statistical simulation," Comput. Geosci., vol. 32, no. 10, pp. 1544–1563, 2006.
[180] E. Liu and B. Jafarpour, "Learning sparse geologic dictionaries from low-rank representations of facies connectivity for flow model calibration," Water Resour. Res., vol. 49, no. 10, pp. 7088–7101, Oct. 2013.
[181] R. Bissell, "Calculating Optimal Parameters for History Matching," in ECMOR IV - 4th European Conference on the Mathematics of Oil Recovery, Topic E: History Match and Recovery Optimization, 1994, pp. 1–15.
[182] H. Zhou, J. J. Gómez-Hernández, H.-J. Hendricks Franssen, and L. Li, "An approach to handling non-Gaussianity of parameters and state variables in ensemble Kalman filtering," Adv. Water Resour., vol. 34, no. 7, pp. 844–864, Jul. 2011.
[183] J. Honorio and C. Chen, "Integration of PCA with a Novel Machine Learning Method for Reparameterization and Assisted History Matching Geologically Complex Reservoirs," 2015.
[184] P. Sarma, L. J. Durlofsky, and K. Aziz, "Kernel principal component analysis for efficient, differentiable parameterization of multipoint geostatistics," Math. Geosci., vol. 40, no. 1, pp. 3–32, 2008.
[185] S. Afra and E. Gildin, "Tensor based geology preserving reservoir parameterization with Higher Order Singular Value Decomposition (HOSVD)," Comput. Geosci., vol. 94, pp. 110–120, 2016.
[186] E. Laloy, R. Hérault, D. Jacques, and N. Linde, "Training-Image Based Geostatistical Inversion Using a Spatial Generative Adversarial Neural Network," Water Resources Research, 2018.
[187] S. Chan and A. H. Elsheikh, "Parametrization and Generation of Geological Models with Generative Adversarial Networks," Aug. 2017.
[188] A. N. Tikhonov and V. Y. Arsenin, "Solutions of Ill-Posed Problems," Math. Comput., vol. 32, no. 144, p. 1320, Oct. 1978.
[189] M. M. Khaninezhad and B. Jafarpour, "Hybrid Parameterization for Robust History Matching," SPE J., 2013.
[190] M. A. Oliver and R. Webster, "A tutorial guide to geostatistics: Computing and modelling variograms and kriging," Catena, vol. 113, pp. 56–69, 2014.
[191] C. Chen, G. Gao, B. A. Ramirez, J. C. Vink, and A. M. Girardi, "Assisted History Matching of Channelized Models Using Pluri-Principal Component Analysis," in SPE Reservoir Simulation Symposium, 2015.
[192] L. Hu and S. Jenni, "History Matching of Object-Based Stochastic Reservoir Models," SPE J., vol. 10, no. 3, pp. 312–323, 2005.
[193] C. V. Deutsch and L. Wang, "Hierarchical object-based stochastic modeling of fluvial reservoirs," Math. Geol., vol. 28, no. 7, pp. 857–880, 1996.
[194] G. Mariethoz and J. Caers, Multiple-Point Geostatistics: Stochastic Modeling with Training Images. Wiley, 2014.
[195] P. Tahmasebi and M. Sahimi, "Enhancing multiple-point geostatistical modeling: 1. Graph theory and pattern adjustment," Water Resour. Res., vol. 52, no. 3, pp. 2074–2098, Mar. 2016.
[196] A. G. Journel, "Combining knowledge from diverse sources: An alternative to traditional data independence hypotheses," Math. Geol., vol. 34, no. 5, pp. 573–596, 2002.
[197] J. Caers and T. Zhang, "Multiple-point geostatistics: a quantitative vehicle for integrating geologic analogs into multiple reservoir models," pp. 383–394, 2004.
[198] L. Y. Hu and T. Chugunova, "Multiple-point geostatistics for modeling subsurface heterogeneity: A comprehensive review," Water Resour. Res., vol. 44, no. 11, Nov. 2008.
[199] J. Caers and T. Hoffman, "The probability perturbation method: A new look at Bayesian inverse modeling," Math. Geol., vol. 38, no. 1, pp. 81–100, 2006.
[200] K. Johansen, J. Caers, and S. Suzuki, "Hybridization of the probability perturbation method with gradient information," Comput. Geosci., vol. 11, no. 4, pp. 319–331, 2007.
[201] B. Jafarpour and M. Khodabakhshi, "A Probability Conditioning Method (PCM) for Nonlinear Flow Data Integration into Multipoint Statistical Facies Simulation," Math. Geosci., vol. 43, no. 2, pp. 133–164, Feb. 2011.
[202] S. Hakim-Elahi and B. Jafarpour, "A distance transform for continuous parameterization of discrete geologic facies for subsurface flow model calibration," Water Resour. Res., vol. 53, no. 10, pp. 8226–8249, 2017.
[203] A. Astrakova and D. S. Oliver, "Conditioning Truncated Pluri-Gaussian Models to Facies Observations in Ensemble-Kalman-Based Data Assimilation," Math. Geosci., vol. 47, no. 3, pp. 345–367, 2015.
[204] A. Shekhovtsov, P. Kohli, and C. Rother, "Curvature Prior for MRF-based Segmentation and Shape Inpainting," Sep. 2011.
[205] Z. Yan, S. Zhang, and X. Liu, "Accurate segmentation of brain images into 34 structures combining a non-stationary adaptive statistical atlas and a multi-atlas with applications to Alzheimer's," Biomed. Imaging …, no. 518, pp. 1–4, 2013.
[206] T. Goldstein, X. Bresson, and S. Osher, "Geometric Applications of the Split Bregman Method: Segmentation and Surface Reconstruction," J. Sci. Comput., vol. 45, no. 1–3, pp. 272–293, Nov. 2009.
[207] T. Riklin-Raviv, "Prior based Image Segmentation," 2007.
[208] A. El-Baz and G. Gimel'farb, "Robust image segmentation using learned priors," in 2009 IEEE 12th International Conference on Computer Vision, 2009, pp. 857–864.
[209] A. Sanfeliu, R. Alquézar, J. Andrade, J. Climent, F. Serratosa, and J. Vergés, "Graph-based representations and techniques for image processing and image analysis," Pattern Recognit., vol. 35, no. 3, pp. 639–650, Mar. 2002.
[210] A. H. Khalighi et al., "A Comprehensive Framework for the Characterization of the Complete Mitral Valve Geometry for the Development of a Population-Averaged Model," Springer, Cham, 2015, pp. 164–171.
[211] T. Shen, S. Zhang, and J. Huang, "Integrating shape and texture in 3d deformable models: from metamorphs to active volume models," Multi Modality State-of- …, pp. 1–32, 2011.
[212] S. Zhang, Y. Zhan, M. Dewan, J. Huang, D. N. Metaxas, and X. S. Zhou, "Towards robust and effective shape modeling: sparse shape composition," Med. Image Anal., vol. 16, no. 1, pp. 265–277, Jan. 2012.
[213] A. Tuysuzoglu, W. C. Karl, I. Stojanovic, D. Castañón, and M. S. Ünlü, "Graph Cut Based Discrete Valued Image Reconstruction," IEEE Trans. Image Process., vol. 24, no. 5, pp. 1614–1627, 2015.
[214] N. Vu and B. S. Manjunath, "Shape prior segmentation of multiple objects with graph cuts," in 26th IEEE Conf. Comput. Vis. Pattern Recognition (CVPR), 2008.
[215] D. Hernando, P. Kellman, J. P. Haldar, and Z. P. Liang, "Robust water/fat separation in the presence of large field inhomogeneities using a graph cut algorithm," Magn. Reson. Med., vol. 63, no. 1, pp. 79–90, 2010.
[216] K. Briechle and U. D. Hanebeck, "Template matching using fast normalized cross correlation," Proc. SPIE, vol. 4387, pp. 95–102, 2001.
[217] J. N. Sarvaiya, S. Patnaik, and S. Bombaywala, "Image Registration by Template Matching Using Normalized Cross-Correlation," in 2009 International Conference on Advances in Computing, Control, and Telecommunication Technologies, 2009, pp. 819–822.
[218] S. T. Roweis and L. K. Saul, "Nonlinear dimensionality reduction by locally linear embedding," Science, vol. 290, no. 5500, pp. 2323–2326, 2000.
[219] G. B. Arpat and J. Caers, "SIMPAT: Stochastic Simulation with Patterns," pp. 1–30, 2004.
[220] N. Linde, P. Renard, T. Mukerji, and J. Caers, "Geological realism in hydrogeological and geophysical inverse modeling: A review," Adv. Water Resour., vol. 86, pp. 86–101, 2015.
[221] J. A. Vrugt, P. H. Stauffer, T. Wöhling, B. A. Robinson, and V. V. Vesselinov, "Inverse Modeling of Subsurface Flow and Transport Properties: A Review with New Developments," Vadose Zone J., vol. 7, no. 2, pp. 843–864, 2008.
[222] G. F. Pinder and M. A. Celia, Subsurface Hydrology. 2006.
[223] J. A. Vrugt, P. H. Stauffer, T. Wöhling, B. A. Robinson, and V. V. Vesselinov, "Inverse Modeling of Subsurface Flow and Transport Properties: A Review with New Developments," Vadose Zone J., vol. 7, no. 2, pp. 843–864, 2008.
[224] R. Strasser and S. Selberherr, "Practical inverse modeling with SIESTA," IEICE Trans. Electron., vol. E83-C, no. 8, pp. 1303–1310, 2000.
[225] J. Lee, H. Yoon, P. K. Kitanidis, C. J. Werth, and A. J. Valocchi, "Scalable subsurface inverse modeling of huge data sets with an application to tracer concentration breakthrough data from magnetic resonance imaging," Water Resour. Res., vol. 52, no. 7, pp. 5213–5231, 2016.
[226] M.-R. Khaninezhad and B. Jafarpour, "Sparse Geologic Dictionaries for Field-Scale History Matching Application," in SPE Reservoir Simulation Symposium, 2015.