Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Efficient connectivity assessment of heterogeneous porous media using graph theory
(USC Thesis Other)
Efficient connectivity assessment of heterogeneous porous media using graph theory
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
Efficient Connectivity Assessment of Heterogeneous Porous Media using Graph Theory by Calogero B. Rizzo A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (CIVIL ENGINEERING) August 2020 Copyright 2020 Calogero B. Rizzo Acknowledgements The completion of this dissertation would not have been possible without the help and continuous support of my PhD advisor, Professor Felipe P.J. de Barros. I am extremely grateful for all the discussions we had during these years, which helped me growing both as a person and researcher. Thanks to his tips and feedback, I was able to overcome many road blocks, and achieve goals that I may have labeled as impossible just five years ago. I would like to also thank Dr. Xingyuan Chen for being a great supervisor during my internships at the Pacific Northwest National Lab. Professors Patrick Lynett and Aiichiro Nakano for being part of the defense committee. Professor Alberto Guadagnini, for sharing his expertise and supporting me during the transition to the doctoral program. IwouldliketoexpressmygratitudetoalltheprofessorsandresearchersIhaveencountered during my doctoral program, which provided me with invaluable insights into the topic. My greatest appreciation goes to my office mates and my fellow PhD students. All those discussions, coffees, chats and laughs were critical for the completion of this disserta- tion. Particularly, I would like to thank all the students of my research group, Alessandra Bonazzi, Jinwoo Im, Arianna Libera, Maria Morvillo, and Mahsa Moslehi. Finally, my deepest thanks to my family. Special thanks to my parents Carmela and Lillo, for their unconditional love. To Viviana, for her support and patience. And to my sister Giusy, for being exceptionally strong during this period. ii Contents Acknowledgements ii List of Figures vii Abstract xiii 1 Motivations 1 1.1 Publications and Outline of the Dissertation . . . . . . . . . . . . . . . . . 5 2 Groundwater Flow and Transport Simulations 7 2.1 Subsurface Physical Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.1.1 Hydraulic Conductivity . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.1.2 Physics of Flow and Solute Transport . . . . . . . . . . . . . . . . . 11 2.1.3 Random Walk Particle Tracking (RWPT) . . . . . . . . . . . . . . 12 2.2 PAR 2 : a GPU-accelerated Particle Tracking Solver . . . . . . . . . . . . . 14 2.2.1 Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.2.2 Pseudo-Random Number Generator . . . . . . . . . . . . . . . . . . 17 2.2.3 Computing the Drift Vector and the Displacement Matrix . . . . . 17 2.2.4 Input/Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.2.5 Verification and Speed-up Analysis . . . . . . . . . . . . . . . . . . 21 iii 2.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3 A Link Between Porous Media and Graph Theory 30 3.1 Theory and Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 3.1.1 General Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . 32 3.1.2 Graph Approximation . . . . . . . . . . . . . . . . . . . . . . . . . 33 3.1.3 Computation of the Minimum Hydraulic Resistance . . . . . . . . . 37 3.1.4 Algorithm Output and Graph Refinement . . . . . . . . . . . . . . 39 3.2 The Lazy Mole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 4 Connectivity and Solute Transport 45 4.1 Physical Formulation and Numerical Models . . . . . . . . . . . . . . . . . 45 4.1.1 Physics of Flow and Transport . . . . . . . . . . . . . . . . . . . . . 45 4.1.2 Particle Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 4.2 Relation with Solute Transport . . . . . . . . . . . . . . . . . . . . . . . . 48 4.2.1 Leading Front of the Plume . . . . . . . . . . . . . . . . . . . . . . 49 4.2.2 Early Time Arrivals . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 4.3 Statistical Properties of Minimum Hydraulic Resistance of MG Fields . . . 54 4.3.1 Single Realization Analysis . . . . . . . . . . . . . . . . . . . . . . . 55 4.3.2 Expected Minimum Hydraulic Resistance . . . . . . . . . . . . . . . 57 4.3.3 Scaling Behavior of First Time Arrivals . . . . . . . . . . . . . . . . 64 4.4 Expected Minimum Hydraulic Resistance of non-MG Fields . . . . . . . . 67 4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 iv 5 UncertaintyAnalysisandaConnectivity-BasedIterativeSamplingStrat- egy 73 5.1 Computing the Minimum Hydraulic Resistance and Least Resistance Path 75 5.2 Hydraulic Conductivity RSF Model and the Minimum Hydraulic Resis- tance Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 5.2.1 Spatially Random Hydraulic Conductivity Fields . . . . . . . . . . 79 5.2.2 Statistical Analysis of Minimum Hydraulic Resistance . . . . . . . . 80 5.2.3 Box Plot Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 5.2.4 PDF Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 5.3 Identifying the Least Resistance Path of a 3D Random Hydraulic Conduc- tivity Field and Implications on First Arrival Time Uncertainty . . . . . . 86 5.3.1 Iterative Sampling Strategy based on the Least Resistance Path . . 88 5.3.2 Comparison with a Regular Sampling Strategy . . . . . . . . . . . . 93 5.3.3 Uncertainty Reduction of First Arrival Times . . . . . . . . . . . . 94 5.3.4 Start and Target Control Volumes Selection . . . . . . . . . . . . . 102 5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 6 Connectivity of Multi-Indicator Models 106 6.1 Minimum Hydraulic Resistance in Multi-Indicator Models . . . . . . . . . 107 6.2 Monte Carlo Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 6.2.1 Importance of Dimensionality (2D vs 3D) . . . . . . . . . . . . . . 111 6.2.2 Comparison with Multi-Gaussian fields . . . . . . . . . . . . . . . . 113 7 Conclusions 116 v Bibliography 120 A Lower Bound Estimation of the Early Arrival Times 137 B Analytical derivation of the dispersion tensor D 140 vi List of Figures 1.1 Example of randomly generated hydraulic conductivity fields using differ- ent random space functions. Red indicates high conductivity, while blue indicates low conductivity. It is clear that each field has a different connec- tivity structure that as an impact on solute transport. . . . . . . . . . . . . 2 2.1 MPS simulation of the fluvio-aolian deposits located in Descalvado, Brazil [8]. The size of the field is 42m 42m 5:8m. Each color represents a hydroface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.2 Raw particles position using RW3D (red particles) and PAR 2 (green par- ticles) after 7.5 days (top) and 30 days (bottom). The number of particles isN p = 20; 000 in both cases. Blue lines indicate the initial position of the particles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 2.3 NormalizedcumulativeBreak-ThroughCurve(BTC)computedattwocon- trol planes (CP) orthogonal to the x-axis located at 5m (CP-1) and 25m (CP-2) from the injection line-source using RW3D and PAR 2 . . . . . . . . 27 2.4 Computational time required to run the full benchmark simulation us- ing RW3D and PAR 2 with different number of particles N p . Computa- tional times for PAR 2 are reported using both single-precision and double- precision float numbers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 vii 3.1 Scheme used to construct the hydraulic resistance graph R(V;E). The circles represent the nodes of the graph, one for each cell. Each node is connected to the neighbors through edges. For instance, the red node is connected to the neighbors through the green edges. . . . . . . . . . . . . . 34 3.2 Zoom of two nodes connected by one edge j i . The weight of the edge is computed according to the cell conductivities K i and K j as in Equation (3.4). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 3.3 Illustration of the method in a 3D spatially heterogeneous multi-Gaussian hydraulic conductivity field. The natural logarithm of the hydraulic con- ductivity (left) and the corresponding minimum hydraulic resistance map obtained with the graph theory algorithm (right) are displayed. The colors on the minimum hydraulic resistance map (right) indicate the resistance between the center of the domain and all remaining points in the 3D het- erogeneous porous medium. The 3D domain contains 201 201 201 cells. 40 3.4 Example of graph refinement with = 1; 3 and 5. . . . . . . . . . . . . . . 41 4.1 Snapshots of a particle tracking simulation for 9 randomly generated fields ( 2 Y = 4). The least resistance path (red line) is compared with the trajec- tory of the fastest particle (blue line). On the right side of each panel, the normalized minimum hydraulic resistance from the left boundary to each point of the right boundaryR + (y) =R (y)K G =I Y . . . . . . . . . . . . . . 49 viii 4.2 Box plots of the normalized difference between the end of the least resis- tance path y and the final location of the fastest particle y t on the right boundary. The location y t is computed by simulating flow and transport. In both cases ( 2 Y = 1 and 4), 100 realizations of the hydraulic conductivity field have been used. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 4.3 Comparison between early time arrivals (t 1% ) and minimum hydraulic re- sistance between left and right boundaries (R). The coefficient of determi- nation of the regression line is R 2 = 0:815 for 2 Y = 1 and R 2 = 0:893 for 2 Y = 4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 4.4 Minimum hydraulic resistance from the center (bottom) obtained using three different K-fields (top). . . . . . . . . . . . . . . . . . . . . . . . . . 56 4.5 Expected value of the minimum hydraulic resistance between points with distance r (top) and its derivative with respect to r (bottom) for different value of 2 Y . The numerical results (?) are fitted using Equation (4.4). . . . 58 4.6 Illustration of the method used to split the least resistance path to derive the large distance approximation. . . . . . . . . . . . . . . . . . . . . . . . 62 4.7 Comparison between the time arrival increase rate m from [55] and the slope of expected minimum hydraulic resistance e 0:8 Y defined in (4.13). . 66 4.8 Expected value of the normalized minimum hydraulic resistance versus nor- malized distance for non-Gaussian log-conductivity fields. Results depicted for (a) well-connected fields and (b) disconnected fields. Insets correspond to the spatial derivative of the expected minimum hydraulic resistance. . . 69 ix 5.1 Schematic representation of the graph generated from a 3 3 3 log- conductivity field. The yellow spheres are the nodes/vertices of the graph collocated at the center of each cell, and the black lines represent the edges. 76 5.2 Box-plotsof log 10 R(log-resistancebetweentwopointsatdistance 4a)using different types of 2D and 3D RSF models. . . . . . . . . . . . . . . . . . . 81 5.3 Comparison between the empirical PDF ofR and existing PDF models: Power Log-Normal (orange), Log-Normal (green), Exponentiated Weibull (red), Beta Prime (purple). The Kolmogorov-Smirnov test was adopted to test the goodness-of-fit and the corresponding p-values are reported inside each sub-figure. Results are displayed for Multi-Gaussian (MG), non-MG connected, non-MG disconnected and binary fields in 2D (top row) and 3D (bottom row). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 5.4 Three-dimensional spatially heterogeneous log-conductivity field represent- ing the synthetic “truth” ( Y = 0, 2 Y = 4:5). This reference field is denoted as Y. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 5.5 Map of the x 3 -integrated probability to find the LRP of the field shown in Figure 5.4 in all the six iterations of the iterative sampling strategy. Black circles are the previous sampling locations, while red squares are the sampling location chosen at the current iteration. In I6, the red line represents the real LRP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 5.6 LRPs ensemble obtained by conditioning the field in different locations us- ing the regular strategy (top) and the iterative strategy (bottom). The paths are colored according to the logK. The spheres represent the loca- tions where the conductivity is sampled. . . . . . . . . . . . . . . . . . . . 92 x 5.7 Solute plume after 500 steps (t = 25) of the open-source GPU acceler- ated RWPT code PAR 2 [100] using the synthetic true field Y (see Figure 5.4). On the bottom projection, we compute the the x 3 -integrated log- concentration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 5.8 Results for the cumulative BTC at the control plane and its confidence intervals using the iterative sampling (blue) and regular sampling (green) for the benchmark case. The red line is the cumulative BTC obtained using Y. The blue/green lines are the median. The darker shaded areas of blue and green represent the 50% confidence intervals whereas the lighter shadedareasofblueandgreenrepresentthe90%confidenceintervals. Time is normalized with the t 1% of the synthetic true field. . . . . . . . . . . . . 96 5.9 Comparison of first arrival time t 1% using the iterative sampling (blue) and regular sampling (green). The red line represents the t 1% using the synthetic true field Y. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 5.10 Scatter-plotoftheMHRbetweenleft/rightboundaryandfirstarrivaltimes t 1% . Grey circles are obtained using unconditional simulations while blue rhombus are obtained using the iterative sampling conditional simulations. 99 5.11 Map of thex 3 -integrated probability to find the LRP between the injection and production wells in the field shown in Figure 5.4 after six iterations of the iterative sampling strategy. Black circles are the final 42 horizontal sampling locations. The red line represents the real LRP between the two wells. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 xi 5.12 Results for the cumulative BTC at the production well and its confidence intervals using the iterative sampling (blue) and regular sampling (green) for the well-to-well case. The red line is the cumulative BTC obtained using Y. The blue/green lines are the median. The darker shaded areas of blue and green represent the 50% confidence intervals whereas the lighter shadedareasofblueandgreenrepresentthe90%confidenceintervals. Time is normalized with the t 1% of the synthetic true field. . . . . . . . . . . . . 101 6.1 Multi-Indicator Model (MIM). The red lines represent a possible path from (0, 0) to (4, -2). Note that we are neglecting the part that moves along the y-direction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 6.2 PDF of the MHR at layers 2, 6, and 10 for 2D (top) and 3D (bottom). The histograms are computed using N MC = 10; 000 realizations for each . . . . 111 6.3 Ratio between 2D and 3D expected values ofR i as a function of the layer i. The 2D expected MHR is always higher than the 3D counterpart. . . . . 113 6.4 ScalingfactoroftheexpectedMHR! definedinequation(6.8)asafunction of the heterogeneity level . The green line is based on the results for the multi-Gaussian fields shown in Rizzo and de Barros [98]. The 2D and 3D values for the MIM fields are estimated using Monte Carlo simulations. . . 114 xii Abstract Spatial heterogeneity of the hydrogeological properties (e.g., hydraulic conductivity and porosity) has a key role in controlling groundwater flow and solute transport processes. The multi-scale variability of the hydraulic conductivity (K) causes complex flow patterns thatcontrolsolutemassfluxesandarrivaltimes. Anomaloussolutetransportbehaviorhas been observed in highly heterogeneous K-fields, due to the presence of fast flow conduits (i.e., preferential flow paths) and low conductive layers controlling connectivity properties. In this dissertation, the minimum hydraulic resistance (MHR) and the corresponding least resistance path (LRP) are used to study the statistical properties of heterogeneous K- fields connectivity. The MHR and the LRP are static quantities since they do not depend on flow and transport equations. A fast graph theory-based methodology to compute the MHR and LRP is developed. One of the main advantages of using static quantities such as the MHR is that important features of the hydrogeological system (e.g., first time arrival and high conductivity channels) can be extracted without resorting to flow and transport simulations. Finally, an iterative data acquisition strategy is developed, which can be utilized to identify the LRP (which is linked to preferential flow channels) in real sites. A synthetic benchmark test is presented, showing the advantages of the proposed sampling strategy when compared to a regular sampling strategy. Using the iterative data sampling strategy, the first time arrival uncertainty is reduced by more than 40% (when compared xiii to the regular sampling strategy), while maintaining site characterization efforts constant. xiv Chapter 1 Motivations Spatial heterogeneity of the hydrogeological properties (e.g., hydraulic conductivity and porosity) has a key role in controlling groundwater flow and solute transport processes. The multi-scale variability of the hydraulic conductivity K causes complex flow patterns which control solute mass fluxes and arrival times. This has been observed in both field [e.g., 120, 16] and modeling [e.g., 48, 46, 54, 123] studies. The link between the spatial structure of the flow field and large scale solute transport behavior is well established for aquifers displaying weak to mild heterogeneity in the K- field [e.g.. 26, 102]. However, establishing this link for highly heterogeneous aquifers is not trivial. The Macrodispersion Experiment (MADE) site located in Columbus, Mississippi (USA) [19] is one of the examples where high heterogeneity in K can be encountered. Highly heterogeneous K-fields are characterized by the presence of fast flow conduits (i.e., preferential flow paths) and low conductive layers [e.g., 49, 107, 78, 42]. The effect of well-connected paths on solute transport has been extensively studied in the hydrogeological community [e.g., 106, 50, 118, 69, 114, 45, 42, 115, 116, 98]. Impli- cations of aquifer connectivity in the probabilistic assessment of increased lifetime cancer risk due to the exposure of chlorinated solvents is also reported in the literature [62]. 1 Figure 1.1: Example of randomly generated hydraulic conductivity fields using different random space functions. Red indicates high conductivity, while blue indicates low conductivity. It is clear that each field has a dif- ferent connectivity structure that as an impact on solute transport. Poeter and Townsend [90] discussed how well-connected K regions could improve aquifer remediation [see also 72]. In particular, the connectivity of the K-field is one of the main factors leading to non-Fickian behaviour commonly observed at the field scale [123, 18]. In the literature, different metrics have been proposed to quantify connectivity [e.g., 106, 69, 75, 42, 93]. These metrics can be classified as static (i.e., metrics solely based on the K field) or dynamic (i.e., metrics based on quantities describing key features of the flow field and transport). For additional details on different connectivity metrics, we refer the reader to Knudby and Carrera [69] and Renard and Allard [93]. We focus on a static quantity denoted as the minimum hydraulic resistance (MHR) and the corresponding least resistance path (LRP) [116, 98]. The MHR and the LRP are solely based on the K field. The MHR is related to the minimum resistance encountered by a solute particle traveling between two control volumes, i.e. higher values of MHR implies larger solute 2 residence times [115, 116]. The LRP corresponds to the curve connecting the two con- trol volumes that minimizes the resistance. One of the main advantages of using static quantities such as the MHR is that important features of the hydrogeological system can be extracted without resorting to flow and transport simulations. They are particularly suited for applications where time resources are limited and fast preliminary screening analysis is required. Moreover, they can potentially indicate how the physical hetero- geneity of the aquifer affects solute transport. Another example of static metric is the Geological Entropy introduced by Bianchi and Pedretti [18, 17]. We point out that not all dynamic quantities can be correlated to static quantities, since the latter are missing crucial information about flow (e.g. velocity gradients) and transport. With the goal of estimating the hydraulic resistance in heterogeneous porous media, Tyukhova et al. [115] introduced an algorithm based on erosion and dilation of a moving polygon. The authors used this algorithm to estimate highK channels in two-dimensional (2D) porous media [115, 116]. The extension of this method to three-dimensional (3D) fields can be cumbersome and computationally expensive. To overcome the computa- tional burden associated with the estimation of the MHR, we propose to transform the continuous K-field into a graph, proving that finding the MHR is equivalent to the fa- mous shortest path problem. This allows us to use a variation of the computationally efficient Dijkstra’s algorithm, being able to compute the MHR and LRP in both 2D and 3D conductivity fields. We also showed how the MHR, computed through the Dijkstra’s algorithm, is related to first arrival time and how the LRP can be used to estimate the trajectoriesoffastestparticles(i.e., leadingfrontoftheplume). Furthermore, weanalyzed results for Gaussian and non-GaussianK-fields and developed a semi-analytic expressions relating the expected value of the MHR to key geostatistical properties characterizing the 3 log-K field (such as its RSF parameters). Knudby and Carrera [70] provided a similar application of the Dijkstra’s algorithm for 2D fields. The authors showed how both so- lute plume and drawdown signal follow the LRP. Moreover, the works of Viswanathan et al. [117] and Hyman et al. [64] show the advantages of using graph theory to estimate flow and transport quantities in discrete fracture networks, where connectivity and flow channeling are of key importance [e.g., 79]. The research objectives for this dissertation are the following: 1. Understand the relation between minimum hydraulic resistance and least resistance path, both static quantities only depending on the K-field, with solute transport in heterogeneous porous media. These analysis allow the usage of static quantities as approximationofdynamicquantities, usuallymuchmorechallengingtocompute. In this way, it is possible to speedup common algorithm known to be computationally expensive (e.g., Monte Carlo) and to ease their application in real case scenarios, where time and computational resources are usually scarce. 2. Determine how the chosen random space functions for the hydraulic conductivity statisticalpropertiesoftheconnectivity. Forexample, wecanquantifythedifference between 2D and 3D models in term of connectivity and its uncertainty, but also the difference linked to the different choice of the K-field geological conceptualizations. 3. Formulate a convenient iterative sampling strategy based on connectivity metrics to reduce the uncertainty on solute transport properties while keeping the charac- terization effort constant. In this case, we exploit the flexibility of the graph theory approach to estimate connectivity properties to build a methodology that could be applied in real scenario. 4 4. Provide a simple way to use the minimum hydraulic resistance and least resistance path concepts in future researches and applications by providing optimized open- source software for their computation. 1.1 Publications and Outline of the Dissertation The research performed during the Ph.D. program has been published in the following journal papers: 1. Rizzo, C. B., & de Barros, F. P. J. (2017). Minimum hydraulic resistance and least resistance path in heterogeneous porous media. Water Resources Research, 53(10), 8596-8613. 2. Rizzo, C. B., de Barros, F. P. J., Perotto, S., Oldani, L., & Guadagnini, A. (2018). Adaptive POD model reduction for solute transport in heterogeneous porous media. Computational Geosciences, 22(1), 297-308. 3. Rizzo, C. B., Nakano, A., & de Barros, F. P. J. (2019). PAR 2 : Parallel Random Walk Particle Tracking Method for solute transport in porous media. Computer Physics Communications, 239, 265-271. 4. Rizzo, C. B., & de Barros, F. P. J. (2019). Minimum Hydraulic Resistance Uncer- tainty and the Development of a Connectivity-Based Iterative Sampling Strategy. Water Resources Research, 55(7), 5593-5611. 5. Rizzo, C. B., Song X., de Barros, F. P. J., & Chen X. (2020). Temporal flow variations interact with spatial physical heterogeneity to impact solute transport in managed river corridors. Journal of Contaminant Hydrology (under review). 5 The chapters of this dissertation are based on the papers Rizzo and de Barros, WRR, 2017 [98], Rizzo et al., CPC, 2019 [99] and Rizzo and de Barros, WRR, 2019 [100]. In Chapter 2, we define the equations and physical models governing subsurface flow and transport, and introduce PAR 2 , an open-source software implementing a parallelized version of the particle tracking method for solute transport simulations. In Chapter 3, we define the link between porous media and graph theory, allowing a fast estimation of preferential flows in heterogeneous porous media. In Chapter 4, we use a Monte Carlo framework to estimate statistical properties of the minimum hydraulic resistance and least resistance path for different type of hydraulic conductivity stochastic models and dimensionality. In Chapter 5, an iterative sampling strategy based on the least resistance pathconceptisintroduced, anddifferencesbetween2Dand3Drandomfieldsareanalyzed. In Chapter 6, we provide a semi-analytical framework based on a multi-indicator model for the K-field to compute the minimum hydraulic resistance. Finally, a summary and conclusions are presented in Chapter 7. 6 Chapter 2 Groundwater Flow and Transport Simulations Groundwaterflowsareplaguedbyspatio-temporalfluctuationsatmultiplescales. Subsur- face properties such as the hydraulic conductivity field and the porosity vary many orders of magnitude which leads to variability in the flow field [102]. These velocity fluctuations have a strong impact on the overall dispersive behavior of a solute plume. Together with local scale dispersive mass transfer mechanisms, the spreading dynamics induced by the variability of the groundwater flow field steepens concentration gradients which in turn augments solute mixing [33]. The interplay between flow variability and mass transfer mechanisms is crucial to understand contaminant exposure as well as bio-chemical ac- tivities, both critical to risk analysis and aquifer remediation. Furthermore, due to high costs of data acquisition, groundwater models are subject to uncertainty. Estimating the uncertainty due to data scarcity in solute transport predications can be achieved by adopting probabilistic tools in combination with geostatistical techniques [2]. Within the stochastic framework, several flow and transport simulations on equiprobable realizations, i.e. Monte Carlo simulations, of the subsurface environment are carried out (e.g., [6]). As 7 an outcome of the above mentioned factors, capturing the spatio-temporal dynamics of a solute plume and the uncertainty associated with model predictions requires intensive computational effort [84, 83]. Many softwares and numerical methods have been developed in order to simulate solute transport in groundwater systems. Numerical methods are usually classified as Eulerian, Lagrangian, or mixed Eulerian-Lagrangian, based on the type of discretization employed [9]. Within the groundwater community, the most comprehensive software for solute transport is MT3DMS [121, 122]. MT3DMS provides several numerical methods to solve solute transport (reactive and non-reactive). For example, it is possible to use the standard finite-difference method, mixed Eulerian-Lagrangian approach, and the total- variation-diminishing (TVD) method. At the same time, a fully Lagrangian solver using the Random Walk Particle Tracking (RWPT) method is provided by the RW3D code de- veloped by Fernàndez-Garcia et al. [40, 41] (for recent developments, see [61]) and RWHet developed by LaBolle [71]. Comparison between Eulerian and Lagrangian approaches can be found in Salamon et al. [105] and Boso et al. [21]. Eulerian methods can introduce a non-physical numerical diffusion to the system as a result of the discretization scheme. Therefore, the numerical grid must be refined where high concentration gradients are expected. This process can be cumbersome in cases where the dimension of the solute plume is small compared to the cell size (e.g., instantaneous point or line injection), when the transport is highly advective (i.e., high Péclet number) or when the system is highly heterogeneous. On the contrary, Lagrangian approaches do not suffer from numerical diffusion. For this reason they are more convenient when advection is the key transport mechanism or when the subsurface environment displays strong heterogeneity in its hy- draulic properties. For such reasons, the RWPT method is widely used to model solute 8 transport in heterogeneous groundwater systems [119]. One of the key drawbacks of particle tracking techniques is that it may require a very high number of particles to accurately represent the solute plume and avoid artifi- cial concentration fluctuations [60, 105, 39, 21, 37, 61]. This may lead to a prohibitive computational time. However, the high number of particles problem can be remarkably attenuated by parallelizing the RWPT method using modern parallel computing tech- niques. In fact, each particle trajectory can be computed independently and, for such reasons, the RWPT can be efficiently parallelized. To verify the benefits due to the parallelization, we developed PAR 2 , an open-source software implementing a parallelized version of the RWPT method. Using the General- Purpose Computing on Graphics Processing Units (GPGPU) methodology, we develop an efficient GPU parallelized particle tracking code able to run on NVIDIA GPUs. The code is written using C++/CUDA programming language and it can be downloaded from the CPC library entry [100] or Git repository [97] (https://github.com/GerryR/par2). The results show the possibility to simulate solute transport on a common desktop setup using a large number of particles in a relatively short time. Moreover, given the increasing number of High Performance Computing (HPC) clusters equipped with NVIDIA GPUs, it is possible to include the software in a massively parallel Monte-Carlo framework, where each GPU simulates one realization. In addition, we propose an analytical methodology to compute the anisotropic displacement matrix needed for the GPU parallelization. 9 2.1 Subsurface Physical Model 2.1.1 Hydraulic Conductivity The hydraulic conductivityK tensor is an indicator of the ease with which the water can flow through a porous media. Hydraulic conductivity depends on properties of both the porous media and the water following the relation: K(x) = k(x)g (2.1) wherek is the the porous media permeability tensor,g is the gravity acceleration, is the density of the water and is the dynamic viscosity of the water. In general, the hydraulic conductivityKvariesinspacex = (x;y;z). However, inrealcasescenariositisimpossible to determine the value of K in any point in space. Instead, only few measurements of K at selected observation points are available. Therefore, in order to model the uncertainty caused by this unavoidable lack of knowledge, we treat the hydraulic conductivity K as a Random Space Function (RSF). The choice of the RSF depends on the type of aquifer considered. The most common RSF model for the log-conductivity Y = logK is the multi-Gaussian (MG) field. A sta- tionary MG field can be uniquely defined by its mean Y (and geometric meanK G =e Y ), variance 2 Y andspatialcovariancemodelC Y . Giventhestatisticalparametersdefiningthe RSF, it is possible to generate an equiprobable multi-Gaussian realization using Sequen- tial Gaussian simulation method. The SGSIM software in the GSLIB [35] can be utilized for the generation of multiple multi-Gaussian fields realization. Alternatively, in case of unconditional 2D fields, a faster and more efficient approach provided by HYDROGEN 10 [11] is used. 2.1.2 Physics of Flow and Solute Transport In this work, we consider a steady-state flow through a saturated 3D porous formation. The flow field is governed by the continuity equation: r K(x) (x) r(x) = 0; (2.2) where is the hydraulic head and is the effective porosity. With the solution for , the spatially variable velocity field u is obtained through the use of Darcy’s law: u(x) = K(x) (x) r(x); (2.3) Given the elliptic nature of the flow equations, Eulerian numerical methods, such as Finite Difference (FD) or Finite Volume (FV), can be effectively used to compute the spatially heterogeneous velocity field u. MODFLOW [59] provides several numerical methods for the solution of the flow equations. An alternative solution is provided by PFLOTRAN [58], a parallel solver to simulate subsurface flows. Next, we consider transport of an inert solute to be governed by advection and local- scale dispersion mechanisms. The concentration of the solute is represented by c and satisfies the Advection-Dispersion equation [73]: @ ((x)c(x;t)) @t +u(x)r ((x)c(x;t)) = r [(x)D(x)rc(x;t)]; (2.4) 11 with given boundary and initial conditions. The local dispersion tensor is denoted by D and models both the sub-Darcy scale heterogeneity and molecular diffusion. A possible choice for the dispersion tensor is (see Chapter 7 of Bear and Cheng [9]): D(x) = ( T ju(x)j +D m )I + L T ju(x)j u(x)u(x) T ; (2.5) where L and T are the longitudinal and transverse dispersivities, D m is the molecular diffusivity and I is the identity matrix. Other possible choices of the dispersion tensor can be found in the literature [22, 77]. Note that the implementation method we propose can be adapted to different local-scale dispersion models. 2.1.3 Random Walk Particle Tracking (RWPT) The RWPT method is a Lagrangian framework that aims to find a solution of (2.4). The solute plume is represented by N p particles, each one carrying a fraction of the solute mass M p =M 0 =N p , where M 0 is the mass of solute injected. The particles are moved by three physical mechanisms: advection, diffusion and local-scale dispersion. Advection is simulatedbymovingparticlesaccordingtothevelocityfieldobtainedthroughthesolution of equations (2.2) and (2.3). In order to simulate the diffusion and local-scale dispersion, a random displacement is added to the particle. Molecular diffusion does not depend on the current velocity direction, thus the displacement is radially symmetric (i.e., isotropic). On the other hand, local-scale dispersion is anisotropic and depends on the direction of the particle velocity. 12 The trajectory of a particlei can be computed using the Itô-Taylor integration scheme [94]: X i (t + t) =X i (t) +A(X i (t))t +B(X i (t))(t) p t; (2.6) where is a normal-Gaussian white noise vector, t is the time step, and the drift vector A and displacement matrix B are defined as follows: A(x) =u(x) +rD(x) + 1 (x) D(x)r(x); (2.7) 2D(x) =B(x)B(x) T : (2.8) Note that the trajectory of each particle is independent from the others. Moreover, the new position of the particle can be found only using the current position, therefore not requiring the knowledge of the full particle history. Since the governing equation (2.4) can be transformed into an equivalent of the Fokker-Planck equation [94], it can be proved that the particles density converges to the solution of (2.4) when increasing the number of particles and reducing the time step, therefore: c(x;t) = 1 (x) lim !0 1 jV x; j Np X i=1 M p 1 V x; (X i (t)); (2.9) whereV x; is a control volume centered inx with characteristic dimension (e.g., a sphere with centerx and radius),jV x; j is the volume ofV x; , and1 V x; (x) is the indicator func- tion (i.e, 1 ifx2V x; , 0 otherwise). Equation (2.9) can be used to effectively compute an approximation of the concentration c(x;t) by fixing with an appropriate small number. Moreover, the average concentration over an arbitrary control volume can be computed 13 in a similar fashion. Note that the smaller is the control volume, the more particles are needed for the convergence. Several studies focused on improving concentration es- timation in situations where the number of particles is low. These techniques allow to effectively reduce the number of particles needed for the convergence of the concentration field. For example, Fernàndez-Garcia and Sanchez-Vila [39] proposed a Kernel Density Estimator (KDE) method to reconstruct the concentration (we refer to [88] for a practical application of the KDE method). A comprehensive review about the RWPT method can be found in [105]. Compari- son with other numerical methods, e.g., Eulerian, mixed Eulerian-Lagrangian and TVD methods, indicates that Particle Tracking is particularly suited for advection-dominated problems. Since the particles position is not linked to the underlying grid, it is possible to use a coarser grid for the resolution of the flow field. Moreover, the choice of the numerical grid is not affect by the initial contaminant distribution. The main disadvantage of the Particle Tracking method is the high number of particles required for the convergence. However, since each particle trajectory derived in (2.6) is independent, this technique can be efficiently parallelized. 2.2 PAR 2 : a GPU-accelerated Particle Tracking Solver With the particle tracking method, the solute is represented by a set of particles each one carrying a small portion of the total mass. Each particle follows equation (2.6). Since the trajectoryofeachparticleisindependent, theparticlepositionscanbeupdatedinparallel. The random walk particle tracking method is particularly suited for GPU parallelization. The hardware architecture of a GPU is different from a CPU. A CPU is composed of few 14 powerful cores, able to run multiple processes independently. On the contrary, a GPU is composed of thousands of small cores, therefore suited to take a huge array of data (e.g., particle positions) and apply the same simple transformation on each element in parallel (e.g., update position following equation (2.6)). The implementation of the particle tracking method follows these steps. First, the velocity field is imported and copied to the GPU and the particle positions are initialized ontheGPU.Then, foreachparticlethecurrentvelocityisevaluatedusingtheappropriate interpolation method. Since the entire velocity field is inside the GPU, there is no transfer ofdatabetweentheGPUandthehost. Oncethevelocityiscomputed,wecanuseequation (2.6)toupdatetheparticlepositions. Again, thisstepisexecutedinparallelontheCUDA device. Finally, we can proceed to the next step. Given the particle positions, it is possible to compute global quantities of interest, e.g., the center of mass of the particles or the number of particles inside a given control volume. These quantities can be efficiently computed inside the GPU, avoiding again the need to transfer all the particle positions between GPU and host. PAR 2 is a C++/CUDA Random Walk Particle Tracking simulator implementing the strategy discussed above. In order to take advantage of Object Oriented programming features, we employ the Thrust library [10]. The Thrust template library provides a STL interface of the CUDA language with some common algorithm already implemented and optimized. The source code of PAR 2 can be found on GitHub (https://github.com/ GerryR/par2). Within the next sections we analyze the main features and the design principles used in PAR 2 . 15 2.2.1 Data Structures Three arrays in the device global memory are used to store the position of the particles at the current time step. Each array is storing the coordinate of all the particles in a given direction (i.e., x, y and z coordinates). The dimension of these arrays is equal to the number of particles N p . To exploit the benefits of the Thrust library, the algorithm is rewritten in terms of two basic operations: transform and reduce. For example, the particles positions arrays are transformed following equation (2.6) and a single CUDA kernel is called for each time step. This ensures a coalesced access to the coordinate arrays improving the overall computational speed-up. Next, the velocity field must be stored on the device. The velocity field is obtained by numerically solving the flow equation (2.2) on a Cartesian grid with dimension n x n y n z . The final discrete velocity is defined by the normal velocity in each face of the grid. Thus, three arrays in the device global memory are used to store the x, y and z velocities at the interface with dimensions (n x + 1)n y n z , n x (n y + 1)n z , and n x n y (n z + 1) respectively. This data will be used for the linear interpolation of the velocity on each particle position. At the same time, for the trilinear interpolation of the velocity (and its derivative), the velocity is evaluated at the corners of each cell. These velocities are precomputed and loaded into the device global memory in three arrays of size (n x + 1) (n y + 1) (n z + 1). Finally, we point out that the memory required for the simulation scales linearly with the number of particles N p and linearly with the number of cells n x n y n z . 16 2.2.2 Pseudo-Random Number Generator One of the most important component of the particle tracking method is the random displacement. This allows to simulate diffusion and dispersion. A reliable pseudo-random number generator must be used in order to compute the vector in (2.6). Massively par- allel generation of independent pseudo-random numbers can be cumbersome and it has been subject of several studies (see [89] for more information on the topic). An inappropri- ate choice of a parallel pseudo-random number generator can lead to time and/or space cross-correlations between particles, compromising the final results. For this task, the well-established CURAND library is used, allowing the generation of the pseudo-random numbers on the GPU device in parallel. 2.2.3 Computing the Drift Vector and the Displacement Matrix The drift vector defined in (2.7) is composed of the advection component u, a correction term due to spatially variable dispersionrD, and a correction term due to spatially variable porosity 1 Dr. The correction term due to spatially variable porosity is not yet implemented and will be added to a future release of PAR 2 . Following the approach described in Salamon et al. [105], we use a hybrid interpolation scheme, using linear interpolation (i.e., using velocities at the interfaces) to compute the advection component and trilinear interpolation (i.e., using velocities at the corners) for the correction term due to spatially variable dispersion. Then, the displacement matrix defined in (2.8) is evaluated. Including a non-constant dispersion requires to compute the eigenvalues and eigenfunctions of D for each particle and for each time step. Using classical iterative methods to compute eigenvalues of many 17 small matrices in CUDA threads would dramatically impact the GPU efficiency. Thus, the eigenvalues and eigenvectors of D are computed analytically. The velocity at the particle position is evaluated using a trilinear interpolation scheme. Thus, given the Darcy-scale velocityu = [u 1 ;u 2 ;u 3 ] T at the particle position, the displace- ment matrixB is computed according to (2.8). First, we note that the dispersion matrix (2.5) can be rewritten as: D =aI +bW; (2.10) wherea = T juj +D m ,b = ( L T )=juj andW =uu T . Letv i be an eigenvector ofW and i the corresponding eigenvalue, then: Dv i =av i +bWv i = (a +b i )v i = i v i : (2.11) The result above implies that v i is also an eigenvector of D with eigenvalue i =a +b i . Since 2D =BB T (see Equation (2.8)), given all the eigenvalues and the corresponding orthogonal eigenvectors of W, the matrix B can be computed as: B = 3 X i=1 p 2 i v i v T i v T i v i : (2.12) More details about the derivation of equation (2.12) can be found in Appendix B. If u6= 0, one of the eigenvalues of W is the square of the modulus of the velocityjuj 2 with corresponding eigenvector u (i.e. Wu = (uu T )u =juj 2 u). Moreover, if u 1 6= 0, we can analytically compute the other two pairs such that: 1 =juj 2 and v 1 =u; (2.13) 18 2 = 0 and v 2 = [u 2 ;u 1 ; 0] T ; (2.14) 3 = 0 and v 3 = [u 3 u 1 ;u 3 u 2 ;u 2 1 +u 2 2 ] T : (2.15) Note thatv 1 ;v 2 ;v 3 form an orthogonal base. This decomposition is valid if all the orthog- onal eigenvectors in (2.13)-(2.15) are non-zero. A sufficient condition for this to happen is u 1 6= 0. In the case where u 1 = 0, a small number " is added to u 1 such that the numerical error introduced is masked by the molecular diffusion (i.e., "D m = L ). The displacement matrix can be computed analytically for each particle in parallel in the GPU using equations (2.13)-(2.15) together with (2.12). 2.2.4 Input/Output In order to use PAR 2 , a configuration file must be prepared for each simulation. We use the YAML markup language for this file, providing an easy and convenient interface to the code. Therefore, knowledge of C++/CUDA programming language is not required to use PAR 2 . Moreover, given the popularity of the YAML format, most of the programming languages already have a library able to read and write a YAML file, thus PAR 2 can be easily collocated in a broader simulation framework. An example of configuration file is: 1 # Grid parameters 2 grid: 3 dimension: [<int>, <int>, <int>] 4 cell size: [<float>, <float>, <float>] 5 6 # Physical parameters 7 physics: 8 porosity: <float> 19 9 molecular diffusion: <float> 10 longitudinal dispersivity: <float> 11 transverse dispersivity: <float> 12 velocity: 13 type: modflow 14 file: </path/to/modflow-output.ftl> 15 16 # Particle tracking simulation parameters 17 simulation: 18 particles: 19 N: <int> 20 start: 21 p1: [<float>, <float>, <float>] 22 p2: [<float>, <float>, <float>] 23 24 dt: <float> 25 steps: <int> 26 27 # Output parameters 28 output: 29 csv: 30 file: </path/to/output.csv> 31 skip: <int> 32 # List of values to compute 33 items: 34 - label: <output1> 35 type: <tag> 36 # options (different for each tag) 37 - label: <output2> 20 38 type: <tag> 39 # options 40 ... 41 42 snapshot: 43 file: </path/to/snapshot-*.csv> 44 skip: <int> The parameters are divided into four blocks: 1. grid: dimensions and parameters of the simulation grid. 2. physics: physical parameters of the site. 3. simulation: parameters required by the particle tracking method. 4. output: define quantities to output during the simulation. Note that the particle starting position in the simulation block is defined by two points p 1 and p 2 . These two points define a volume (i.e., all x such that p 1;i < x i < p 2;i for i = 1; 2; 3) in which the particles are randomly and uniformly placed. Finally, the link to MODFLOW is made using the LMT package. Adding this package to the MODFLOW simulation will generate the Flow-Transport Link (FTL) file containing the volumetric flow rates at cell interfaces. The path to this file must be defined in the physics block of the YAML parameter file. 2.2.5 Verification and Speed-up Analysis In this Section we perform a simulation using data from a realistic aquifer. We consider the fluvio-aolian deposits located in Southeast Brazil close to the town of Descalvado in 21 Figure2.1: MPSsimulationofthefluvio-aoliandepositslocatedinDescal- vado, Brazil [8]. The size of the field is 42m 42m 5:8m. Each color represents a hydroface. 22 Table 2.1: Lithofacies and hydrofacies distribution of the Descalvado site. Lithofacies Description K[m=s] % SGt Trough-cross-bedded sand 3:0 10 4 29.12 and gravel 9:4 10 5 1.72 Sp Planar-cross-bedded aeolian sand 1:6 10 4 34.97 Sh/Sp Horizontally laminated to planar 1:4 10 3 8.84 cross-stratified sand 7:8 10 5 1.33 St Trough-cross-bedded sand 6:0 10 5 5.31 2:5 10 5 9.52 6:2 10 6 9.16 Fm Massive clay intraclasts 7:8 10 8 0.03 the state of São Paulo. Geological and laboratory analysis of the site have been provided by Bayer et al. [8, 7]. The geological formation is characterized by 5 lithofacies and 9 hydrofacies (see Table 2.1). Using multiple-point statistics (MPS) simulations, Bayer et al. [8, 7] randomly generated different realizations of the hydraulic conductivity spatial distribution. For this work, we use one of these realizations, depicted in Figure 2.1. The aquifer’s dimension is L x L y L z , where L x = L y = 42m and L z = 5:8m. The grid resolution is 420 420 58 cells (i.e., the cell size is 0:1m 0:1m 0:1m), with a total of 10; 231; 200 cells. For simplicity, the effective porosity is assumed to be constant and equal to the average porosity of the entire field ( = 0:285). The flow field is computed using MODFLOW [59]. The hydraulic head on the boundaries orthogonal to the x- axis is set such that the difference ^ satisfies ^ =L x = 0:01 (i.e. Dirichlet boundary conditions). A no-flow condition is used for the other boundaries (i.e., Neumann type boundary conditions). The flow field generated by MODFLOW is used for the simulation of the solute trans- port. For the transport simulation we use the GPU-accelerated PAR 2 . To verify the 23 Figure 2.2: Raw particles position using RW3D (red particles) and PAR 2 (green particles) after 7.5 days (top) and 30 days (bottom). The number of particles is N p = 20; 000 in both cases. Blue lines indicate the initial position of the particles. correctness and the performance of the software, we compare the results with the RW3D [40, 41, 61], a robust software implementing the RWPT method without parallelization. For the purpose of illustration, the dispersivities are assumed to be L = 0:01m and T = 0:001m. The molecular diffusion coefficient is D m = 10 9 m 2 =s. We consider a line source located between the points s 1 = (5; 10; 2:9) ands 2 = (5; 32; 2:9). The particles are uniformly distributed along the line source. The total simulation time is 30 days with a time step t = 1h = 3600s for a total of 720 steps. Two snapshots of the particle positions are displayed in Figure 2.2. All the simulations are carried out using a desktop computer equipped with a CPU 24 Intel Core i7-6700 CPU @ 3.40GHz x 8 (for RW3D) and a GPU NVIDIA GeForce GTX 745/PCIe/SSE2 (for PAR 2 ) equipped with 384 CUDA Cores and 4GB of internal RAM memory. First, wesetthenumberofparticlesN p = 20; 000. Twosnapshotsattimet = 7:5 days and t = 30 days of the particle positions are shown in Figure 2.2. The particles distribution is in agreement in both methods. In Figure 2.3 we compare the normalized cumulative Break-Through Curve (BTC) at two control planes (CP) orthogonal to the x-axis located at 5m (CP-1) and 25m (CP-2) from the injection line-source. The results reported in Figure 2.3 show an excellent agreement between RW3D and PAR 2 . Finally, we report the computational run time for the whole simulation using different number of particles in both RW3D and PAR 2 . As shown in Figure 2.4, PAR 2 is able to complete the entire simulation within few seconds. When utilizing 100,000 particles, the simulation carried out with PAR 2 is roughly 98 times faster than the corresponding non- parallelized simulation. PAR 2 can be compiled using single-precision or double-precision float numbers. As shown in Figure 2.4, using single-precision usually leads to faster simulations and we have not seen any remarkable difference on the results due to the float-precision choice. Note that the speed-up difference due to the float precision depends on the GPU device used. It is clear that the proposed GPU parallelization provides a manner to mitigate the high number of particles constrain that typically characterizes the particle tracking methods. For example, using 1 million particles for the same simulation would require a total computation time of 43.8 seconds (using our desktop computer and single-precision float numbers). Simulation time can be further reduced by using GPU cards dedicated to scientific computing, usually more powerful than desktop GPUs. To understand the amount of memory used in the device, we run a simulation with 200 million particles. In this case, the simulation was completed in less than 2 hours. It is 25 interesting to note that PAR 2 used only 2.6GB of the GPU internal memory, showing the possibility to perform relatively large simulations despite GPU memory limitations. Being able to perform simulations with millions of particles allows to compute the solute concentration over very small control volume (e.g., an observation well). Moreover, it is possible to obtain a better convergence for the high-order spatial-moments of the plume (e.g, skewness and kurtosis) or the time-derivative of the spatial-moments (e.g., effective dispersion). 2.3 Summary We developed PAR 2 , an open source, simple-to-use, GPU-accelerated Random Walk Par- ticle Tracking C++/CUDA code to simulate solute transport in groundwater flows. The RWPT method is particularly suited for parallelization, since the trajectory of each parti- cleisindependentfromtheothers. AllthesimulationiscarriedoutinsidetheGPUdevice, avoiding massive transfer of data between the host and the device. To achieve this goal, we provide an alternative method to compute the displacement matrix associated with the local-scale dispersion tensor. Furthermore, simulation parameters are defined in a YAML configuration file, where it is also included a convenient interface with MODFLOW, a commonly used groundwater flow simulator. The usage of PAR 2 does not require any particular knowledge about C++/CUDA programming techniques. We provide a benchmark simulation, where solute transport in a realistic aquifer lo- cated in Brazil is simulated. In addition, we provide a comparison between PAR 2 and another available particle tracking code in the literature. Our results indicate a significant reduction of computational time, being able to complete the full benchmark simulation 26 0 5 10 15 20 25 30 Days 0.0 0.2 0.4 0.6 0.8 1.0 Cumulative BTC RW3D CP-1 RW3D CP-2 PAR 2 CP-1 PAR 2 CP-2 Figure 2.3: Normalized cumulative Break-Through Curve (BTC) com- puted at two control planes (CP) orthogonal to the x-axis located at 5m (CP-1) and 25m (CP-2) from the injection line-source using RW3D and PAR 2 . 27 0 200 400 600 800 1000 1200 1400 Computational Time (s) 20k 40k 60k 80k 100k N p 1m 32.44s 3m 57.34s 6m 49.26s 12m 16.87s 19m 16.75s 11.20s 12.68s 14.12s 15.66s 17.10s 09.45s 10.06s 10.65s 11.34s 11.80s RW3D PAR 2 Double PAR 2 Single Figure2.4: Computational time required to run the full benchmark simu- lation using RW3D and PAR 2 with different number of particlesN p . Com- putational times for PAR 2 are reported using both single-precision and double-precision float numbers. 28 in less than a minute. Due to its efficiency, the software can be adopted within a Monte Carlo framework on desktops equipped with NVIDIA GPU cards or GPU-accelerated HPC clusters, by accelerating the transport simulation in each realization. RWPT simu- lationsusingmillionsofparticlesarepossiblethankstotheGPUparallelization, providing a way to ameliorate the computational cost of large numbers of particles that might limit the range of applications. As shown in the benchmark test, we were able to complete the simulation in a grid with 10 million cells using 200 million particles. These numbers can be easily increased using modern and more capable GPUs. PAR 2 alleviates the computational costs associated with stochastic simulations of hy- drogeological systems. However, the particle tracking software can also be employed in other physical systems that are governed by highly advective (or convective) processes. As for possible future work, the software needs to be expanded to account for more complex flow scenarios such as partially saturated porous media flows, non-constant porosity, tran- sient conditions, unstructured grid (e.g., see [86]), and discrete fracture networks (e.g., see [80]). Probably the most challenging problem, from an implementation point of view, is the inclusion of complex biogeochemical reactions (see [21]). Many efficient approaches to include reactions in RWPT have been proposed recently (e.g, see [61, 91, 14, 108]). However, in this case a highly scalable approach may be more indicated than the most efficient one. 29 Chapter 3 A Link Between Porous Media and Graph Theory Methods for simulating transport in heterogeneous aquifers can be divided into two cat- egories: analytical and numerical methods [see discussion in Chapters 8-10 of 102]. In general, analytical solutions are limited to low levels of heterogeneity [e.g., 12, 43, 29]. Many of these analytical solutions are based on perturbation theory. Semi-analytical methods have also been proposed to tackle flow and transport in highly heterogeneous formations [e.g., 25, 45, 44]. On the other hand, numerical solutions are used to quantify transport in more generic K-fields [e.g., 95, 32]. These methods permit to extend many results to high heterogeneity and better understand its impact on anomalous transport [e.g., 74]. The main limitation of numerical methods is the high computational costs associated with simulations of large stochastic systems [e.g., 76, 84]. The aforementioned challenges and constraints encountered in analytical and numeri- calmethodsmotivateustodevelopalternativeapproachesabletoquantifyrelevantfactors associated with solute transport in aquifers without resorting to the solution of the gov- erning equations for flow and transport. Amongst these factors, we highlight the early 30 time arrivals of a contaminant plume at an environmentally sensitive location. The pres- ence of highK channels is a key factor governing early time arrivals [e.g., 34, 87, 114, 24, 42, 62, 116]. Therefore, determining the highly conductive paths between a contaminant source and a receptor is fundamental for risk analysis [e.g., 82, 112, 31, 111, 28] due to its correlation with early time arrivals [54, 116]. These high K channels lead to the definition of connectivity. In general, the connec- tivity of a field indicates the likelihood that two points are connected by a preferential channel. However, there is not a unique rigorous definition of connectivity and several metrics have been explored [106, 52, 38, 75, 47, 93, 42, 116]. Connectivity measures are labeled as static and dynamic [see 69]. Static connectivity measures are computed using only the K-field and is the subject of interest of the present work. Dynamic connectivity measures use information stemming from the flow field and/or solute transport. We develop a computationally efficient framework that enables to identify the con- nectivity between a source zone and a receptor (which can be represented by distinct combinations of points, lines, areas and volumes). Our goal is to infer dynamic transport quantities (e.g., first arrival times) from static connectivity measures. We aim to find the path that minimizes the hydraulic resistance between the two locations. The minimum hydraulic resistance has been successfully used as a static connectivity measure [see 116] and is equivalent to a spatial distance between two given points. In general, as shown in Tyukhova et al. [115], the higher the minimum hydraulic resistance is, the more time the solute particle needs to travel between the two points. The curve connecting the two points along which a solute particle encounters less resistance is the least resistance curve. The challenge associated to finding the least resistance (shortest or fastest) path is common with other engineering fields, such as traffic or network optimization (e.g., 31 water distribution networks [101]). This class of problems is of key importance in graph theory [20]. Graph theory has also been employed to parametrize spatial hydraulic prop- erty characterization [15]. By recasting the hydrogeological problem into a graph theory problem, it is possible to find the least resistance path and the corresponding minimum hydraulic resistance using the Dijkstra’s algorithm [36]. The computational time needed to generate the minimum hydraulic resistance map using this approach is comparable with the time needed to generate a single realization of theK-field using common random field generators. Due to the computational efficiency of the method, we are able characterize the minimum hydraulic resistance stochastically. 3.1 Theory and Algorithm 3.1.1 General Formulation Given a source points2R n (withn = 2 or 3 denoting the spatial dimensionality) and an arrival point x2R n , we define the set of all the possible curves connecting s tox asP x s . Consider a locally isotropic and heterogeneous hydraulic conductivity field K :R n !R. For each 2P x s we define the corresponding hydraulic resistance as a line integral: R = Z 1 K d : (3.1) The hydraulic resistanceR has units of time and it is an indicator of the resistance found by a solute particle traveling along . A curve passing through a high conductivity zone will likely have a low hydraulic resistance, meaning that the time needed to reach the end point will be reasonably low. For this reason, it is of interest to find the curve that 32 minimizes the hydraulic resistance withinP x s . Formally, we define the minimum hydraulic resistance from a point s to x as: R s (x) = min 2P x s R : (3.2) Consequentially, the curve ^ 2P x s that satisfiesR ^ =R s (x) is called the least resistance curve. Note that it may be possible to have multiple least resistance curves for a pair of points although in practice, the probability of this event is very low. Alternatively, we consider a source control volumeAR n and a target control volume BR n . Using (3.2), the minimum hydraulic resistance from A to B is defined as: R A (B) = min a2A min b2B R a (b) : (3.3) We point out that the definitions above allow to define the minimum hydraulic resistance between two generic geometries (e.g., a pair of points, an injection plane to a point, a point to a control plane or an injection plane to a control plane). 3.1.2 Graph Approximation It is possible to find an analytical solution of equation (3.2) only for simple scenarios such as homogeneous media (i.e., constant K). In general, finding the minimum resistance (and the corresponding least resistance curve) in a continuum framework is a difficult task since it requires to explore all the possible curves connecting two nodes. For this reason we propose the following graph-based procedure to develop an approximation for equations (3.2) and (3.3). 33 Figure 3.1: Scheme used to construct the hydraulic resistance graph R(V;E). The circles represent the nodes of the graph, one for each cell. Each node is connected to the neighbors through edges. For instance, the red node is connected to the neighbors through the green edges. Using the standard graph theory notation [20], we define a graph G(V;E) as a pair of a set V of vertices and a set E of edges. Each edge e2 E connects two vertices of V and it can have a weight w e (e.g., distance, time or cost). In this paper we will use only undirected graphs, meaning that the edges have no orientation (i.e., if a vertex v i is connected to v j then v j is connected to v i ). Finally, a path is a sequence of vertices such that adjacent vertices are connected by edges. In order to have a graph representation of the hydraulic conductivity fields, we par- tition the domain into cells as shown in Figure 3.1. The hydraulic conductivity field is discretized such that each cell i is characterized by a hydraulic conductivity value K i . 34 K i K j dx dy Γ j i Figure 3.2: Zoom of two nodes connected by one edge j i . The weight of the edge is computed according to the cell conductivities K i and K j as in Equation (3.4). Note that this domain partition is the same used in classic Finite Volume or Finite Differ- ence methods. The coordinates of the center of each cell represents a vertex of the graph. Two cells are considered neighbors if they share a common face or a common corner. Each vertex is connected to a neighbor vertex through an edge as shown in Figure 3.2. The weight w e associated to an edge e connecting two neighboring vertices v i and v j is: w e = jr ij j K i + jr ji j K j ; (3.4) wherejr ij j is the length of the segment connecting the coordinates of v i with either the center of the face between the cells i and j or the corner coordinates between i and j. This procedure generates the hydraulic resistance graph R(V;E) where V contains as many vertices as the number of cells and E contains the weighted edges as defined in (3.4). In case of structured grids, the vertex corresponding to an internal cell has exactly 8 neighbors in 2D and 26 neighbors in 3D domains. Equation (3.4) can be extended to 35 the more generic case of anisotropic conductivity tensor usingw e =jK 1 i r ij j+jK 1 j r ji j. We can use the hydraulic resistance graph to find an approximation of the minimum resistance defined in (3.2). Assuming that the source point s and the arrival point x are the center coordinates of two cells with corresponding vertices s andx, we can define the set of paths P x s from s to x on the graph R(V;E). Since for each path we can build a piecewise linear curve starting from the center of the cell corresponding to s to the center of the cell corresponding tox, the set P x s represents a subset of all the possible curvesP x s . Therefore the approximated minimum hydraulic resistance is given by: R s (x) = min 2 P x s R ; (3.5) whereR is defined in (3.1). Using the discrete isotropic hydraulic conductivity field and the weights (3.4), we obtain a simplified expression of the hydraulic resistance: R = Z 1 K d = X e2 w e ; (3.6) wheree2 are all the edges in 2 P x s . Therefore, the approximated minimum hydraulic resistance becomes: R s (x) = min 2 P x s X e2 w e : (3.7) Note that in this formulation there exists at least one path ^ 2 P b a such thatR ^ = R b a since the number of possible paths is finite. The quality of this approximation can be controlledbytherefinementlevelofthegrid. Decreasingthecellsize increasesthenumber of possible paths between cells. Similarly to (3.3), we can extend (3.7) for the case of multiple source vertices and 36 multiple target vertices. Given two sets of vertices AV and BV, we add two virtual vertices a and b to the vertices set V. The vertex a is connected to all the vertices in A through edges with 0 resistance. By the same token, the vertex b is connected to all the vertices in B. Therefore, the approximated minimum hydraulic resistance is: R A ( B) = min 2 P b a R : (3.8) 3.1.3 Computation of the Minimum Hydraulic Resistance The value R s (x) defined in (3.7) is the solution of a minimization problem in a graph theory framework. In other words, we need to find a path going from a vertex s to x that minimizes the sum of the weight of its edges. This problem can be solved using the Dijkstra’s algorithm [36]. The pseudo code to find the minimum hydraulic resistance using the Dijkstra’s algorithm is: 37 1: function Dijkstra(R(V;E), A) 2: W = set of all the vertices V; 3: for all vertices v2W do 4: if v2A then 5: res(v) = 0 6: pre(v) = undefined 7: else 8: res(v) =1 9: pre(v) = undefined 10: while W is not empty do 11: c = vertex in W with the smallest resistance; 12: remove c from W; 13: for all neighbor n of c do 14: r =res(c) +res(c;n) 15: if r<res(n) then 16: res(n) =r 17: pre(n) =c Details about the implementation can be found in [20] and [113]. Despite its simplicity, the Dijkstra’s algorithm has several features motivating the extensive usage in different fields, such as transport networks and Artificial Intelligence development. The first and probably most important feature of this algorithm is that it is computationally efficient. The optimal worst-case performance is O(jEj +jVj logjVj), meaning that the algorithm scales well with the number of vertices and edges. Note that the worst-case performance 38 of the original algorithm formulated by Dijkstra is O(jVj 2 ). In order to achieve a better performance, weneedtouseaspecialdatastructuredenotedasmin-priorityqueuesothat the operation complexity of selecting the vertex with minimum resistance is O(1) [51]. The second feature is that the algorithm implicitly computes the minimum resistance and the associated least resistance path for all the vertices of the graph. In all our tests, the computational time needed to compute the minimum hydraulic resistance map is less than the time needed to generate a randomK-field using sequential Gaussian simulators, with a peak of 1.86 seconds for a 2D field with 1 million cells (using a CPU with 3.40GHz). Therefore, the time required for the computation of the minimum hydraulic resistance map is negligible compared to the full numerical solution of the flow and/or transport equations. Finally, the algorithm can be applied to any K-field using different source and target geometries. 3.1.4 Algorithm Output and Graph Refinement Using the algorithm presented in Section 3.1.3, we are able to compute the minimum hydraulic resistance R A (x) defined in (3.8) from a subset of vertices A to each vertex x. Since each vertex of the graph is linked to a cell of the grid, we can map the minimum hydraulic resistance to the physical domain obtaining a discrete approximation ofR A (x) defined in (3.3) from a subset of the domainA to each pointx. Moreover, for each vertex of the graph we are able to retrieve the neighbor vertex where the least resistance path passes through. Thus we can recursively build the least resistance path to each vertex of the graph and therefore map each path to a curve of the physical domain. The algorithm presented in this paper can compute the minimum hydraulic resistance 39 Figure 3.3: Illustration of the method in a 3D spatially heterogeneous multi-Gaussian hydraulic conductivity field. The natural logarithm of the hydraulic conductivity (left) and the corresponding minimum hydraulic re- sistance map obtained with the graph theory algorithm (right) are dis- played. The colors on the minimum hydraulic resistance map (right) in- dicate the resistance between the center of the domain and all remaining points in the 3D heterogeneous porous medium. The 3D domain contains 201 201 201 cells. 40 Θ = 1 Θ = 3 Θ = 5 Figure 3.4: Example of graph refinement with = 1; 3 and 5. in two-dimensional (2D) and three-dimensional (3D) K-fields. With the purpose of il- lustration, the minimum hydraulic resistance of a 3D heterogeneous K-field is shown in Figure 3.3. Figure 3.3 displays the K-field (left plot of Figure 3.3) used to compute the minimum hydraulic resistance (right plot of Figure 3.3) between the center point of the 3D formation and all the points of the domain. In the upcoming illustrations, we will focus on 2D examples. The approximation quality of the minimum hydraulic resistance can be controlled by refining the grid. A finer grid provides more discrete paths connecting any two vertices of the graph. However, once the size of the cells is chosen, it is possible to refine the graph while keeping the same grid resolution. An example can be found in Figure 3.4. The graph refinement level represents the number of vertices per cell in each of the principal directions. It is clear that increasing will increase the approximation quality by adding possible paths to the conductivity graph. 41 3.2 The Lazy Mole As seen in the previous sections, the graph-theory based method used to compute the minimum hydraulic resistance is computationally efficient. This makes the usage of this connectivity metric appealing for a multitude of applications. However, the implemen- tation of the method, despite its relative simplicity, may be an obstacle for the usage in practical scenarios. For this reason, the software LazyMole has been developed as part of this dissertation, and the source code can be downloaded from the GitHub repository [96] (https://github.com/GerryR/lazymole). The name of the software is linked to the fact that we are trying to find the least resistance path. Assuming that the resistance encountered by a “mole” digging a tunnel is proportional to the hydraulic resistance, fol- lowing the least resistance path would allow the mole to arrive to the chosen destination by minimizing the energy consumed by the digging activity. The software is developed using plain C++, and it can be compiled for any major Operating System due to the CMake compilation process. The variation of the Dijk- stra’s algorithm to compute the minimum hydraulic resistance and least resistance path described in the previous sections is implemented. The computation of the minimum hydraulic resistance from some source points to any other point of the domain can be achieved in few seconds and, in general, in less than the time required for the generation of one realization of theK-field using a classical sequential Gaussian simulation approach. A user friendly interface based on a YAML file is included with the software. An example of configuration file is: 1 # Grid parameters 2 grid: 3 dimensions: 42 4 nx: 200 # Number of cells in x direction 5 ny: 100 # Number of cells in y direction 6 nz: 1 # Number of cells in z direction 7 cell size: 8 dx: 1.0 # Cell size in x direction 9 dy: 1.0 # Cell size in y direction 10 dz: 1.0 # Cell size in z direction 11 refinement: 12 refx: 1 # Refinement in x direction 13 refy: 1 # Refinement in y direction 14 refz: 1 # Refinement in z direction 15 16 # Input parameters 17 input: 18 field: 19 file: field.dat # File name relative to root directory with K values 20 skip: 0 # Number of lines to skip 21 log: true # True if file contains the logK values 22 source: 23 file: source.dat # File name relative to root directory with 24 # source ids 25 target: 26 file: target.dat # File name relative to root directory with 27 # target ids 28 29 # Output parameters 30 output: 31 resistance: 32 file: hres.dat # Output name relative to root directory where 43 33 # resistance map is saved 34 path: 35 file: path.dat # Output name relative to root directory where 36 # least resistance path is saved The parameters are divided into three blocks: 1. grid: dimensions and parameters of the simulation grid. 2. input: location of the input files. 3. output: location of the output files. All the minimum hydraulic resistances and least resistance paths presented in this dissertation have been generated using the LazyMole software. 44 Chapter 4 Connectivity and Solute Transport 4.1 Physical Formulation and Numerical Models In this section we present some basic concepts needed for the analysis of the minimum hydraulic resistance and least resistance path. 4.1.1 Physics of Flow and Transport For this analysis, we assume the hydraulic conductivity to be a spatial random function (RSF) as described in Section 2.1.1. Moreover, we assume the RSF to be isotropic (there- fore it is a scalar function), and we will refer to it as K. The choice of the RSF depends on the type of aquifer considered. We assume that the log-conductivity Y = logK is a multi-Gaussian (MG) field. A stationary MG field can be uniquely defined by its mean Y (and geometric mean K G = e Y ), variance 2 Y and spatial covariance modelC Y . In this chapter, we will adopt an isotropic exponential covariance model such that: C Y (r) = 2 Y exp(r=I Y ); (4.1) 45 where r is the lag distance between two points and I Y is the isotropic integral scale. Different realization of the hydraulic conductivity field can be generated as discussed in Section 2.1.1. Connectivity of the hydraulic conductivity field strongly depends on the RSF model chosen. In order to understand the impact of the model and its high order moments, we also consider the non multi-Gaussian (non-MG) fields suggested by Zinn and Harvey [123]. This type of fields can display strong connected structures of high (connected fields) or low (disconnected fields) conductivity. However, similarly to the MG fields, it is possible to select the mean Y , variance 2 Y and covariance modelC Y . Moreover, the PDF distribution of the log-K in each point is normal. Therefore it is possible to generate MG fields and non-MG fields that only differ in the high order statistical moments (i.e., the joint PDF distribution). We consider a steady-state flow through a saturated porous formation. The flow field can be computed using the continuity equation (2.2). Once the hydraulic head is determined, the spatially variable velocity field u can be obtained through the use of Darcy’s law (2.3). In order to link the connectivity features of theK-field to solute transport, we consider an inert solute that is instantaneously released into the porous formation. The solute transport is governed by advection and local-scale dispersion mechanisms as described by the advection-dispersion equation (ADE) (2.4) with appropriate boundary and initial conditions. The dispersion tensor defined in (2.5) is denoted by D and models both the sub-Darcy scale heterogeneity and molecular diffusion. 46 4.1.2 Particle Tracking Particle tracking techniques aim to find a solution of (2.4) by following a Lagrangian framework. The solute plume is represented by a set of particles carrying some solute mass that are transported by three physical mechanisms: advection, diffusion and local- scale dispersion. Advection is simulated by moving particles according to the velocity field obtained through the solution of Equations (2.2) and (2.3). In order to simulate the diffusion and local-scale dispersion, a random displacement is added to the particle. Molecular diffusion does not depend on the current velocity direction, thus the displace- ment is radially symmetric. On the other hand, local-scale dispersion depends on the direction and it is aligned with the particle velocity. An in-depth introduction of the par- ticle tracking method can be found in Section 2.1.3, while a comprehensive review can be found in [105]. Comparison with other numerical methods, e.g., Eulerian, mixed Eulerian- Lagrangian and TVD methods, indicates that Particle Tracking is particularly suited for advection-dominated problems. Since the particles position is not linked to the underlying grid, it is possible to use a coarser grid for the resolution of the flow field. Moreover, the choice of the numerical grid is not affect by the initial contaminant distribution. The main disadvantage of the Particle Tracking method is the high number of particles required for the convergence. However, since each particle trajectory derived in (2.6) is independent, this technique can be efficiently parallelized. PAR 2 (Parallel Particle Tracking), a fast GPU-accelerated particle tracking software, has been developed as part of this dissertation. Details about the implementation and speedup analysis can be found in the Section 2.2. 47 4.2 Relation with Solute Transport In this section, we explore the relation between the minimum hydraulic resistance and early time arrivals of a contaminant plume. With the purpose of illustration, we consider a 2D fully saturated porous medium (n = 2). We will compare the results obtained using the algorithm described in Chapter 3 with flow and transport simulations. Forthisanalysis, weconsiderasteady-stateflowinarectangulardomainofdimensions L x and L y . The following boundary conditions are employed: (x = 0;y) = L ; (x =L x ;y) = R ; @ @y y=0 = 0; @ @y y=Ly = 0: (4.2) where L and R are prescribed hydraulic heads. The boundary conditions defined in (4.2) ensure a longitudinal uniform-in-the-mean flow along the x-direction. The velocity field u is obtained through Darcy’s law (2.3). For the transport problem, we consider an inert tracer released from a line source on the left boundary and aligned with the y-axis. The dispersion tensorD =DI is assumed to be constant and isotropic. The flow equation (2.2) is solved with the Finite Volume method using the Python library FiPy [57]. The transport equation (2.4) is solved using PAR 2 (see Section 2.2). Within this section, the domain size is L x L y = 20I Y 10I Y and 100 hydraulic conduc- tivity fields are generated using 2 Y = 1 and 100 fields with 2 Y = 4. A line source (100,000 particles) is placed along all the left boundary (i.e., the size of the vertical line source is equal toL y ) and the particles are evenly distributed along its length. The Péclet number is Pe = 1; 000 where Pe = UI Y =D and U = K G ( L R )=L x is the mean longitudinal 48 0 5 10 15 20 0 2 4 6 8 10 y/I Y 2 15 R + (y) 0 5 10 15 20 0 2 4 6 8 10 2 15 R + (y) 0 5 10 15 20 0 2 4 6 8 10 2 15 R + (y) 0 5 10 15 20 0 2 4 6 8 10 y/I Y 2 15 R + (y) 0 5 10 15 20 0 2 4 6 8 10 2 15 R + (y) 0 5 10 15 20 0 2 4 6 8 10 2 15 R + (y) 0 5 10 15 20 x/I Y 0 2 4 6 8 10 y/I Y 2 15 R + (y) 0 5 10 15 20 x/I Y 0 2 4 6 8 10 2 15 R + (y) 0 5 10 15 20 x/I Y 0 2 4 6 8 10 2 15 R + (y) Figure 4.1: Snapshots of a particle tracking simulation for 9 randomly generated fields ( 2 Y = 4). The least resistance path (red line) is compared with the trajectory of the fastest particle (blue line). On the right side of each panel, the normalized minimum hydraulic resistance from the left boundary to each point of the right boundaryR + (y) =R (y)K G =I Y . velocity. The parameters are chosen such that the results are in dimensionless units (i.e., K G =1, L R =L x and U = 1). 4.2.1 Leading Front of the Plume Although the minimum hydraulic resistance is computed using only the K-field, it can provide useful information related to flow and transport. The front of a plume moving through a porous media is strongly affected by the conductivity distribution. The trajec- tory that offers less hydraulic resistance will facilitate the passage of the solute. In other words, there is a relation between the least resistance path computed in Chapter 3 and the actual leading front of the plume. 49 In Figure 4.1 we compare the numerical solution of the heterogeneous flow and trans- portmodelwiththeleastresistancepathbetweenleftandrightboundariesfor9randomly generated fields. The least resistance paths (red lines) are computed starting from the minimum hydraulic resistanceR (y) =R L (L x ;y) between the left boundary of the do- main L and all points within the right boundary of the domain. Subsequently, the coordinate (L x ;y) with minimum resistance is chosen to be the ending point of the least resistance path. Finally, we backtrack the least resistance path using the output of the algorithm described in Section 3. Figure 4.1 shows that the trajectory of the fastest par- ticle computed with the particle tracking code (blue lines) hits the right boundary in the location predicted by the least resistance path in most of the cases. As we can see from the center and center-left panels of Figure 4.1, for the cases where multiple leading plume edges reach the right boundary of the domain (i.e., the presence of multiple channels with similar low resistance), the least resistance path will point towards one of them, but not necessarily to the final location of fastest particle. Figure 4.2 statistically analyzes the normalized difference between the end of the least resistancepathy andthefinallocationofthefastestparticley t (obtainedwiththeparticle trackingcode)ontherightboundaryofthedomainfor 2 Y =1and4overmultiplerandom realizations of the K-fields. In most cases, the difference is approximately zero thus indicating that the least resistance path can be potentially a good estimator of the fastest particle’sfinallocation. However, theboxplotsshowninFigure4.2displaythepresenceof outliers. Aspreviouslymentioned,theseoutliersareanoutcomeofthepresenceofmultiple low resistance channels. It is interesting to note that the box plot for 2 Y = 4 is narrower than the box plot for 2 Y = 1 (Figure 4.2). The reason is that when heterogeneity is higher (i.e., larger values of 2 Y ), there is an increased probability in the occurrence of a single 50 σ 2 Y = 1 σ 2 Y = 4 − 10 − 5 0 5 10 y Γ − y t I Y Figure 4.2: Box plots of the normalized difference between the end of the least resistance path y and the final location of the fastest particle y t on the right boundary. The location y t is computed by simulating flow and transport. In both cases ( 2 Y = 1 and 4), 100 realizations of the hydraulic conductivity field have been used. 51 dominant low resistance channel within the K-field. The relationship between the actual trajectory (obtained via flow and transport simulation) and the least resistance path is subject to multiple dynamic quantities that are not embedded in the minimum hydraulic resistance formulation (which is a static quantity by definition) such as dispersivities, boundaryconditionsandinitialcondition(e.g., sourcezoneandreleasehistory). However, the least resistance path provides a good approximation of the fastest particle trajectory, showing that heterogeneity of theK-field is the driving factor governing the leading front of the plume in the cases taken into consideration. This simple example helps to understand the relation between the leading edge of the plume and the hydraulic resistance, a quantity that can be efficiently computed. Fastest particles tend to flow toward areas with small resistance and motivates the usage for predicting early time arrivals. As we will see in the next subsection, locations with small minimum hydraulic resistance are correlated with first time arrivals. 4.2.2 Early Time Arrivals The visual correlation depicted in Figure 4.1 between the least resistance path and the leading front of the plume indicates that there is a relation between hydraulic resistance and first time arrival. Thus, we compute the time for which 1% of the particles have crossed the right boundary for each of the random field generated. The first arrival time is therefore denoted by t 1% . Figure 4.3 shows the scatter plot between t 1% and the minimum hydraulic resistance between left and right boundaries of the domain. Even if the model is affected by the pres- ence of boundary conditions, the minimum hydraulic resistance shows a good correlation with early time arrivals for both levels of heterogeneity. We remark that the computation 52 Figure 4.3: Comparison between early time arrivals (t 1% ) and minimum hydraulic resistance between left and right boundaries (R). The coefficient of determination of the regression line is R 2 = 0:815 for 2 Y = 1 and R 2 = 0:893 for 2 Y = 4. 53 of the minimum hydraulic resistance does not require to solve the flow and transport equations. Similar results have been found by Tyukhova and Willmann [116] for different kind of fields and using a procedure based on erosion and dilation of a spatial set [115]. From a risk assessment point of view, the minimum hydraulic resistance has the potential to predict, with some level of uncertainty, the early time arrivals of a given contaminant to a environmentally sensitive area. In Appendix A, we mathematically demonstrate that, under certain conditions (i.e., when the distance between the two control volumes is very large and for a conservative tracer), the minimum hydraulic resistance can provide a lower bound estimate on the first arrival times. 4.3 StatisticalPropertiesofMinimumHydraulicResis- tance of MG Fields Given that an extensive site characterization of the subsurface is not feasible, the hy- draulic conductivity field is subject to uncertainty. As a consequence, the minimum hydraulic resistance must be treated as a random variable. Using the geostatistical struc- tural parameters defining the spatial variability of the K-field and taking advantage of the computational efficiency associated with Dijkstra’s algorithm, it is possible to com- pute the statistical properties of the minimum hydraulic resistance using a Monte Carlo framework. Within this Section, the minimum hydraulic resistance is considered to be a stochastic quantity at each location x. For a given starting point s, we can compute the hydraulic resistance between s and all the other pointsx of the domain using equation (3.2). If the ensemble ofK is uncondi- tioned on data, the expected minimum hydraulic resistance only depends on the distance 54 r =jxsj. Note that radial symmetry only holds for the minimum hydraulic resistance as a stochastic quantity (i.e.,R s (x 1 ) andR s (x 2 ) have the same pdf ifx 1 andx 2 have the same distance from s). Several hydraulic conductivity fields were generated following the procedure described in the previous section with the covariance model defined in (4.1) and different values of 2 Y . In the upcoming analysis, the domain size is L x L y = 30I Y 30I Y with 10 cells per integral scale I Y in each direction. The source point s is located at the center of the domain. For each realization of K and for each point of the domain, we compute the minimum hydraulic resistance from the center point s. 4.3.1 Single Realization Analysis Before computing the expected value of the minimum hydraulic resistance, we compare K-fields generated with different 2 Y . Our goal here is to understand how the connectivity structure (e.g., high connectivity channels) affects the minimum hydraulic resistance. Figure 4.4 illustrates the minimum hydraulic resistance maps (Figures 4.4.d, 4.4.e and 4.4.f) generated by different K-field realizations with variance 2 Y = 0:5 and 4:5 (Figures 4.4.a, 4.4.b and 4.4.c). It is clear that the minimum hydraulic resistance map tends to be radiallysymmetricinlowheterogeneousfields(e.g., compareFigures4.4.dand4.4.e). The presence of high conductivity areas in strongly heterogeneous fields leads to the formation of preferential channels in which the hydraulic resistance is much lower (see Figures 4.4.e and 4.4.f). Since the random generated fields are unconditional, the center point of the domain might fall in either a low or high conductivity area as shown in Figures 4.4.b and 4.4.c for 2 Y = 4:5. In the first case (Figure 4.4.e), the least resistance path to a given point 55 Figure 4.4: Minimum hydraulic resistance from the center (bottom) ob- tained using three different K-fields (top). 56 of the domain must travel through this low conductivity zone until it reaches a high conductivity area. In the second case (Figure 4.4.f), the least resistance path starts from a high conductivity zone. In general, if the center point is placed in a high conductivity area, the overall resistance is low since we can find a path from the center to a given point that passes only through high conductivity channels. Therefore, the minimum hydraulic resistance maps shown in 4.4.e and 4.4.f are very different even if they are generated using thesameK-fieldstatisticalproperties. Tobetterunderstandthisbehavior, itisconvenient to look at the equivalent graph theory problem defined in Equation (3.7). The shortest path problem can be viewed as a constrained minimization problem; these constraints are the initial and end points. Therefore, the result of the minimization algorithm is highly sensitive to areas surrounding the start and end points of interest. The results in Figure 4.4 highlights the important of site characterization of the source zone and are well aligned with theoretical and experimental transport studies that showed the significance of the source zone hydraulics in long distance transport behavior [30, 56]. 4.3.2 Expected Minimum Hydraulic Resistance Next, we examine the expected value of the minimum hydraulic resistance. Given a random K field, equation (3.2) can be rewritten as: R s (r) =R s (x s;r ) = min 2P xs;r s R = min 2P xs;r s Z e Y d ; (4.3) whereY = lnK andx s;r is a point distantr froms. Note thatR s (r) is a random variable for each value ofr. The expected value of the minimum hydraulic resistanceE [R s (r)] can now be computed numerically using a Monte Carlo approach. The number of realizations 57 0 2 4 6 8 10 r/I Y 0 2 4 6 8 10 E[R s (r)]K G /I Y 0 2 4 6 8 10 r/I Y 10 − 1 10 0 10 1 d(E[R s (r)]K G /I Y )/dr σ 2 Y = 0.0 σ 2 Y = 0.5 σ 2 Y = 1.0 σ 2 Y = 1.5 σ 2 Y = 2.0 σ 2 Y = 2.5 σ 2 Y = 3.0 σ 2 Y = 3.5 σ 2 Y = 4.0 σ 2 Y = 4.5 Figure 4.5: Expected value of the minimum hydraulic resistance between points with distance r (top) and its derivative with respect to r (bottom) for different value of 2 Y . The numerical results (?) are fitted using Equation (4.4). 58 needed for the convergence of the mean drastically changes depending on 2 Y We generate several random fields with Y = 0 and different 2 Y using the same procedure described in Section 4.2. We use 50,000 K-fields for 2 Y 3:0, 100,000 otherwise. To improve the convergence, we compute the average using the values of the four cells with same distance from the center (top, bottom, left and right) across all the K-fields. Moreover, we use a graph refinement level = 3 as defined in Section 3.1.4. Figure 4.5 illustrates the expected value of the minimum hydraulic resistance as a function of r for different 2 Y . Based on the numerical results in Figure 4.5 (represented by the symbol ?), we observe three distinct regimes corresponding to different behaviors: Regime 1: if rI Y , the slope ofE [R s (r)] increases with 2 Y . For an unconditional Gaussian K-field, the probability of having high or low K in one point is equal. However, since the hydraulic resistance is proportional to 1=K, low K values have a stronger impact on the integral in (3.2) than high K values. The probability of having lower values of K increases for fields characterized by larger 2 Y . This leads to higher average hydraulic resistance within this regime for large 2 Y . Regime 2: if r I Y , the slope of E [R s (r)] changes with r. In this transition regime, the least resistance path of a single conductivity realization is able to escape from a low conductivity area in the vicinity of the source point. Thus, the least resistance path will start to pass through highK areas, which is more likely to occur for high values of 2 Y . Finally we can observe an inversion of the slope trend. Regime 3: if r I Y , the slope of E [R s (r)] decreases with 2 Y . In other words, if r consists of several integral scales, the least resistance path is able to pass through high K 59 channels for most of its length. Larger values of 2 Y implies larger likelihood of encountering high K zones thus lower hydraulic resistance. The numerical values of E [R s (r)] displayed in Figure 4.5 (denoted by the symbol ?) have been fitted using a non-linear least squares method with the following equation: E [R s (r)] = I Y K G c 1 r I Y + e 2 Y =2 c 1 f r I Y ;c 2 ;c 3 ; (4.4) wherec 1 ,c 2 andc 3 arefittingparameters. Thefunctionf , whichrepresentsthetransition regime previously mentioned (see Regime 2), is defined as follows: f (x;a;n) =x 2 F 1 1; 1 n ; 1 + 1 n ;ax n ; (4.5) where 2 F 1 is the hypergeometric function. Using the hypergeometric function properties, the derivative with respect to x of the function (4.5) is: d dx f (x;a;n) = 1 1 +ax n : (4.6) Thus, for every pair of positive parameters a and n, the derivative of f is 1 for x = 0 and tends to zero for x!1. The above properties are the motivations for the choice of (4.5) to model the transition regime. As shown in Figure 4.5, equation (4.4) provides a good approximation for the expected value of the minimum hydraulic resistance. The choice of (4.4) reflects the numerical results and satisfies the heuristic analytical approximations described in the following paragraphs. 60 First, we derive an approximate solution for the expected minimum hydraulic resis- tance within the first regime (i.e.,rI Y ). If the distance between two points is less than an integral scale, the conductivity values in these points are very similar. Therefore, the least resistance path between the two points will likely find similar hydraulic conductivity values along its length (i.e., conduit of similar K values). Thus, we can approximate the least resistance path with a straight line connecting the two points. If rI Y we obtain: R s (r) = min 2P xs;r s Z e Y d ' Z r s e Y d (4.7) where r s is a straight line starting from s with length r. Taking the expected value of (4.7) we get: E [R s (r)]'E Z r s e Y d = Z r s E e Y d = Z r 0 e 2 Y =2 K G dx =r e 2 Y =2 K G : (4.8) The result is a line with slope e 2 Y =2 =K G . The slope increases exponentially with 2 Y . In Figure 4.5 we compare the solution obtained with the algorithm shown in Section 3 with the analytical solution (4.8) represented by dashed lines for different values of 2 Y . As expected, the approximation is valid only within the first regime. Moreover, it does not depend on the spatial covariance model of the conductivity field. The approximation (4.8) suggests that the solution is linear ifr tends to zero. This leads to the choice of (4.4) and (4.5) to modelE [R s (r)] given their mathematical functional form and properties. Next, we analyze two points separated by a distance r I Y (i.e., the third regime). Given two points s and x separated by a distance r, we can divide the least resistance 61 Ω r s x B s B x δ δ W = Ω\B s \B x ˆ Γ Figure 4.6: Illustration of the method used to split the least resistance path to derive the large distance approximation. path into three sections as shown in Figure 4.6. First, the path needs to cross the possible low conductivity area around the first point. We assume this area to have a radius (B s in Figure 4.6). Once the path reaches an high conductivity area, it will follow preferential channels to reach the second point (W in Figure 4.6). Finally, the path may need to enter a low conductivity area surrounding the second point (B x in Figure 4.6). The first and last sections do not depend on the distance r whereas the second section does depend on r. Therefore, we split the expected value of the minimum hydraulic resistance as follows: E [R s (r)] =E Z ^ \Bs e Y d + E Z ^ \W e Y d +E Z ^ \Bx e Y d ; (4.9) where ^ is the least resistance path between s and x. Note that ^ is a stochastic path and it is the same for all the integrals in (4.9). Since B s and B x are independent of r, the first and third terms of (4.9) are not a function of r. As for the second term, the 62 least resistance path passes mainly through high conductivity areas as a result of the minimization problem. Thus, it is natural to expect that the Y values along this path are largerthanthemeanvalue. Therefore, weapproximatethelog-conductivityinthissection as Y Y + Y , where > 0 is an enhancement factor due to the high conductivity channels. Inserting this approximation into the second term of (4.9) yields E [R s (r)](Y ) + e Y (r 2) K G ; (4.10) where (Y ) =E Z ^ \Bs e Y d +E Z ^ \Bx e Y d ; (4.11) E Z ^ \W d =r 2: (4.12) The parameter(Y ) contains the contribution, in the average sense, originating from the regions close to the source and target points. We point out that (Y ) is a function of the statistical properties of Y (such as 2 Y ) and does not depend on the distance r. We observe that the slope of E [R s (r)] in (4.10) decreases exponentially with Y , explaining the behavior in the third regime observed in Figure 4.5. By comparing the linear terms in Equations (4.4) and (4.10), we fit the slope of (4.10) with the parameter c 1 of (4.4) using different values of 2 Y ( 2 Y = 0:5; 1:0; ; 4:5) in order to compute the optimal enhancement factor (). For the Gaussian field characterized with a spatial exponential covariance model for Y, we obtain an enhancement factor MG = 0:804 0:8 = 4=5 where the subscript MG is used to denote a multi-Gaussian K-field. Based on our results, we point out that the fitted parameter does not depend on 2 Y . Furthermore, equation(4.10)suggeststhatE [R s (r)] tendstobelinearwithincreasingr, thusthefitting 63 function (4.4) has been chosen to reflect this behavior (i.e., the derivative of the function (4.4) tends to be constant for high values of r). As a result, by comparing the analytical expression (4.10) with the parameters in the function (4.4), we find that the expected minimum hydraulic resistance between two points at a distance r=I Y 1 scales linearly with the distance r and exponentially with Y , thus: E [R s (r)]/re 4 5 Y : (4.13) Equation (4.13) is valid for the multi-Gaussian field taken into consideration. As shown in Section 4.4, similar results are obtained for fields different than multi-Gaussian, extending the validity of the function (4.4) and the scaling behavior defined in (4.13). Based on our analysis, the enhancement factor is an intrinsic (or static) property of the random field. High values of imply that the expected resistance decreases rapidly with the standard deviation of the random field Y (see Figure 4.5). 4.3.3 Scaling Behavior of First Time Arrivals The results presented in Section 4.2 showed a clear relationship between the minimum hydraulic resistance and first time arrivals (see Figure 4.3). Here we examine if the scaling behavior of the expected minimum hydraulic resistance (see Equation (4.13)) also holds for expected first time arrivals. For this analysis, we will estimate the first time arrivals based on the numerical results reported in [55]. Gotovac et al. [55] presented a detailed numerical study to compute the travel time probability density function (PDF) f (t) from purely advective transport simulations in a 2D multi-Gaussian Y-field for different levels of heterogeneity, namely 64 2 Y = 0.25, 1, 2, 4, 6 and 8. The numerical results used for the travel times are extracted from figure 4 of Gotovac et al. [55]. As noted in Gotovac et al. [55], there are substantial deviations in early and late arrival times from the log-normal approximation for the travel time PDF which holds for 2 Y < 1 [e.g., 13]. The numerical results of Gotovac et al. [55] show that these deviations increase with 2 Y thus indicating a departure from Fickianity. In order verify the scaling behavior (4.13), we need to define a first time arrival indi- cator based on the travel time PDF f (t) obtained from numerical simulations [see circle markers in Fig. 4 of 55]. Let t M be the time where f (t M ) = max t f (t) and let us define t to indicate the time corresponding to f (t ) =f (t M ) wheret <t M (i.e., t is on the left side of the travel time PDFf (t)) and 0<< 1. Finally, we define the dimensionless quantity denoted here as the rate of increase of the first time arrivals between two control planes situated at distance x 1 = 10I Y and x 2 = 20I Y as: m =U t x 2 t x 1 x 2 x 1 ; (4.14) whereU is the longitudinal mean velocity. The choice of the control plane locations (i.e., x 1 = 10I Y andx 2 = 20I Y ) and normalization of (4.14) are based on the numerical results displayed in Figure 4 of Gotovac et al. [55]. We point out that the quantity m is an estimated one since it was extracted from the empirical PDF f (t) represented by the circle markers in Fig. 4 of Gotovac et al. [55]. Figure4.7showsacomparisonbetweenm andthederivativeoftheexpectedminimum hydraulicresistancegivenby(4.13)obtainedbyusingtheparametersdescribedinGotovac et al. [55]. As we decrease (thus t approaches the first time arrivals), the values of m tend to converge to e 0:8 Y . This also occurs for higher levels of heterogeneity as shown 65 0 1 2 3 4 5 6 7 8 σ 2 Y 0.1 0.2 0.3 0.4 0.5 0.6 0.7 m θ e − 0.8σ Y m 0.02 m 0.2 m 0.5 Figure 4.7: Comparison between the time arrival increase rate m from [55] and the slope of expected minimum hydraulic resistancee 0:8 Y defined in (4.13). 66 for cases of 2 Y = 6 and 8. Note that these values of 2 Y are higher than the values used to determine expression (4.13). The results depicted in Figure 4.7 provide an indication that this scaling behavior may be extended to other values of 2 Y . Furthermore, the results in Figure 4.7 show that: (1) the first time arrivals scales according to (4.13) and (2) the potential of using the minimum hydraulic resistance as a computationally low cost metric to estimate first time arrival stochastically. 4.4 Expected Minimum Hydraulic Resistance of non- MG Fields Up to now, the minimum hydraulic resistance was computed only for multi-Gaussian log- conductivity fields. Here we show how the proposed graph theory algorithm presented in Chapter 3 can be utilized to evaluate the minimum hydraulic resistance in logK-fields displaying non-MG features. Subsequently, we test the performance of the semi-empirical relationship (4.4) for the expected value of the minimum hydraulic resistance in non- MG fields. As shown in Gómez-Hernández and Wen [54], the statistical properties of the geological formation influence its connectivity spatial structure as well as the num- ber of extreme values of the permeability field. The spatial statistical properties of the logK-field can potentially have an impact on the magnitude of risk due to groundwater contamination. However, we note that the relative importance of higher-order moments of logK associated with non-MG fields can also depend on the prediction of interest and dimensionality of the computational model. For example, Jankovic et al. [65] recently performed a comparative analysis between Gaussian and non-Gaussian log-conductivity fields on transport simulations. These authors showed that higher-order moments of logK 67 did not have a significant impact on the bulk of the breakthrough curve at a given control plane for ergodic plumes in 3D aquifers. For the purpose of illustration, we consider the non-MG model presented in Zinn and Harvey [123] for 2D fields. This model enables to constructK-fields that are characterized by connected and disconnected structures. We will adopt this model to examine the expected minimum hydraulic resistance for a different ranges of connectivity patterns. The Zinn and Harvey [123] model has also been adopted by Tyukhova and Willmann [116] to study the path of smallest resistance computed with a different methodology. In this section, both connected and disconnected fields were generated by the procedure described in Zinn and Harvey [123]. These fields are characterized by the same univariate log-conductivity distributions detailed in the previous section where multi-Gaussian fields were analyzed. The main difference from the MG fields lies in the connectivity between low- and high-K regions (i.e., they differ in higher order moments). For connected fields, high K regions are well-connected whereas low K regions are poorly connected. The opposite occurs for disconnected fields. Additional details regarding the procedure used to generate the K field can be found in [123]. Figure 4.8 displays the expected value of the minimum hydraulic resistance as a func- tion of dimensionless distance. Results are reported for different 2 Y and are shown for both connected (Figure 4.8.a) and disconnected (Figure 4.8.b) fields. Similar to Figure 4.5 (for MG fields), the results originating from the numerical simulations are represented by the star symbol while the smooth curves correspond to the fitted semi-empirical re- lationship (4.4). The results shown in Figure 4.8 are remarkable in the sense that the semi-empirical relationship (4.4) also holds for the non-MG fields investigated. When compared to the results displayed in Figure 4.5, the only differences lies in the values of 68 0 2 4 6 8 10 r/I Y 0 2 4 6 8 10 E[R s (r)]K G /I Y a 0 2 4 6 8 10 r/IY 10 − 1 10 0 10 1 d(E[Rs(r)]KG/IY)/dr 0 2 4 6 8 10 r/I Y 0 2 4 6 8 10 E[R s (r)]K G /I Y b 0 2 4 6 8 10 r/IY 10 − 1 10 0 10 1 d(E[Rs(r)]KG/IY)/dr Figure 4.8: Expected value of the normalized minimum hydraulic resis- tance versus normalized distance for non-Gaussian log-conductivity fields. Results depicted for (a) well-connected fields and (b) disconnected fields. Insets correspond to the spatial derivative of the expected minimum hy- draulic resistance. 69 the fitted parameters. Similarly to the MG case, we can compute the enhancement factors . We obtain C = 0:928 for connected fields and D = 0:664 for disconnected fields. Here we use the subscript C and D for connected and disconnected fields respectively. As expected, we find the following relation for the enhancement factor : D < MG < C : (4.15) Thus, for the same level of heterogeneity 2 Y , the expected resistance between two points separated by a distance r I Y will change more rapidly in disconnected fields than connected fields (see Figure 4.8). At the same time, by comparing the enhancement factors, the multi-Gaussian case will display an intermediate behavior. It is also interesting to note the robustness of small distance approximation for this specific logK non-MG model (see Equation (4.8)). The small distance approximation holds since the Zinn and Harvey models (both connected and disconnected) are univariate Gaussian (at a point). Thus, the derivative of the expected resistance in zero is the same in all the three cases (compare Figures 4.5 and 4.8). As the distance increases, the effects of the higher-order moments becomes more evident. Juxtaposition between Figures 4.8.a with 4.8.b indicates that magnitude of the expected least resistance is higher for the disconnected field, but after the transition regime (i.e., r I Y ) there is an inversion of the slopes trend. Despite the different connectivity patterns analyzed (i.e., MG field in Figure 4.5 and non-MG fields in Figure 4.8), the results depicted in this work indicate that the functional form of the semi-empirical expression (4.4) is quite robust. Nevertheless, additional anal- ysis needs to be carried out to test the range of validity of Equation (4.4) under different 70 conditions such as 3D structures and other statistical models for the log-conductivity field. 4.5 Summary Delineating the connected structures of an aquifer is challenging since hydrogeological properties vary in space over multiple scales. Identifying the least resistance path be- tween source and receptor within heterogeneous porous formation can provide crucial information for contaminant site managers. We compared the minimum hydraulic resistance with solute transport simulations in a 2D aquifer, confirming the strong correlation between minimum hydraulic resistance and first time arrival reported in Tyukhova and Willmann [116]. Moreover, we showed that our methodology can be used to predict the region where the leading edge of a plume will hit a control plane for the first time. Even if the plume is influenced by boundary effects, a good correlation between the two quantities was computed. Note that the minimum hydraulic resistance is a static quantity and therefore it is not influenced by hydraulic head gradients. For such reasons, further research should be carried out to investigate the relationship between the minimum hydraulic resistance and early arrival times under different boundary and initial conditions. Our results also indicate the importance of low conductivity zones in the calculation of the minimum hydraulic resistance path. If the starting point is located on the low conduc- tive layer, these low conductivity values will have a strong impact on the expected value of the minimum hydraulic resistance. By making use of the Monte Carlo framework, we were able to provide an equation that describes the expected value of the minimum hydraulic resistance between two points. Analytical approximations considering the two extreme 71 cases of small and large distances support the choice of the semi-empirical equation (4.4) and was used to verify the results from the graph theory based algorithm. It is important to note that the connectivity structure will depend on the statisti- cal properties of the hydraulic conductivity field. Nevertheless, we have shown that the proposed approach can be used to investigate the statistical properties of the minimum resistance path in multi-Gaussian (MG) and non-MG log-K fields. Our numerical com- putations also show that the semi-analytical expression (4.4) holds for the type of fields investigated, i.e., 2D statistically isotropic MG log-K fields and connected/disconnected non-MG log-K fields proposed by Zinn and Harvey [123]. The algorithm proposed can be utilized for different types of geological conceptualizations [e.g., 49, 16, 109] and com- putes the minimum hydraulic resistance in 2D and 3D aquifers (e.g., see Figure 3.3). We also illustrate how the expected minimum hydraulic resistance scales with the distance r and the variance 2 Y following Equation (4.13). Moreover, we have seen that the same scaling equation can be applied to first time arrival, by comparing (4.13) with numerical simulations presented in the literature. Finally, the expected minimum hydraulic resis- tance has the potential to efficiently estimate first time arrival in contaminant transport applications which is of importance in risk assessment. 72 Chapter 5 Uncertainty Analysis and a Connectivity-Based Iterative Sampling Strategy The spatio-temporal dynamics of a solute body is strongly affected by the variability of the hydraulic properties of the subsurface environment. Fluctuations in the hydraulic properties of the porous medium, such as the hydraulic conductivity (K), lead to well connected paths which are key in controlling first arrival times at an environmentally sensitive target [54]. Identifying well-connected paths in the subsurface environment is critical for risk analysis in many hydrological, environmental and energy applications [e.g., 3, 27, 53]. Since a complete characterization of the subsurface is not feasible, the identifi- cation of well-connected paths is a challenging task. Instead, few field measurements are collected and therefore the K-field is subject to uncertainty. For such reasons, stochastic methods are employed to construct a Random Space Function (RSF) of the K-field [26, 102]. As a consequence of the uncertainty in theK-field, the connectivity structure is also a random entity. Consequentially, first arrival times need to be characterized statistically 73 through their moments and probability density function. In this Chapter, we expand the ideas of [98] to provide a framework based on the MHR and LRP to estimate the connectivity uncertainty of a given random K-field and to illustrate how the concept of the LRP can be employed to improve site characteriza- tion efforts. These quantities can be efficiently computed using graph theory, and the method can be easily extended to a large variety of grids (e.g., 3D, non-structured grid, etc.). In fact, in many cases it is possible to transform a grid or lattice into a graph. We recall that a graph is defined as a set of vertices that are connected by weighted edges (in this case, the hydraulic resistance between two vertices). Given the flexibility of the method, we explore its usage in practical applications where connectivity plays an impor- tant role. The objectives of this work are twofold. The first is to quantify the uncertainty of the MHR and its controlling factors. Mainly, we illustrate how the dimensionality (2D vs. 3D) and geological conceptualization (e.g. multi-Gaussian and non multi-Gaussian log-conductivity fields) impact the uncertainty of the MHR. Rizzo and de Barros [98] only explored the impact of aquifer structure on the expected value of the MHR. The present contribution further develops the work of Rizzo and de Barros [98] to quantify the empirical distribution of the MHR as a function of the above mentioned factors (i.e., dimensionality and geological conceptualization). Second, we demonstrate how the con- cepts introduced in [98] can be employed to reduce the uncertainty associated with the identification of the LRP of a given K-field. To achieve this, we develop an iterative framework to best allocate data assimilation efforts with the goal of reducing the LRP uncertainty. Given the strong relation between MHR and first arrival times, our proposed framework results in a reduction of the uncertainty of the latter. This strategy is closely 74 related to inverse modeling and optimal design [e.g., 68, 1, 85, 103], where sampling loca- tions are chosen by minimizing the expected prediction variance (e.g., first arrival times variance). In the case of optimal design and inverse modeling, many stochastic flow and transport simulations are needed to estimate such variance (see Herrera and Pinder [63] for a practical application). On the contrary, the iterative sampling strategy uses the MHR and LRP uncertainty as a proxy for the first arrival time uncertainty, therefore no stochastic flow and transport simulation is needed to estimate the new sampling locations. In section 5.1, we present the concepts of MHR and LRP, and briefly review the algo- rithm utilized to compute these quantities presented in [98]. In section 5.2, we investigate, for the first time, how different geological conceptualization, such as different RSF models and dimensionality (i.e., 2D vs 3D), impact the uncertainty of the MHR. This type of analysis can help engineers and scientists in choosing the appropriate conceptual RSF model based on the expected geological connectivity. Then, in section 5.3, we propose a novel iterative method for site characterization which is based on the concepts of the MHR and the LRP. In each step of the method, the sampling locations (where the hydraulic conductivity is sampled) are chosen to maximize the probability to find the LRP. This leads to a drastic reduction of the first arrival time uncertainty when compared to regular sampling strategies. Finally, a summary is presented in section 5.4. 5.1 ComputingtheMinimumHydraulicResistanceand Least Resistance Path We start by providing a brief review of the concepts and methods developed in Rizzo and de Barros [98] to estimate the connectivity of an heterogeneous conductivity field using 75 Figure 5.1: Schematic representation of the graph generated from a 3 3 3 log-conductivity field. The yellow spheres are the nodes/vertices of the graph collocated at the center of each cell, and the black lines represent the edges. 76 graph theory. Consider a n-dimensional aquifer characterized by a spatially variable log-conductivity field Y = logK. The Cartesian coordinate system is given by x = (x 1 ; ;x n ) where n is the dimensionality of the porous formation. In order to measure connectivity ofK fields, we utilize the concept of Minimum Hydraulic Resistance (MHR). Let S and T be two set of points (e.g., a point, surface, or volume) representing the starting points and target points respectively. We define the MHR between S and T as in equation (3.2). For example, a possible setup is to choose S and T as two opposite boundaries. The path ^ that minimizes the hydraulic resistance between S and T (i.e., the hydraulic resistance along ^ R ^ (1=K)d is equal to the MHRR S (T )) is called the Least Resistance Path (LRP). In other words, any path connecting S and T will have a hydraulic resistance greater or equal the hydraulic resistance of the LRP. The MHR has been found to have a close relation with the first arrival time of a solute plume (e.g., Fig. 7 of Rizzo and de Barros [98] and Fig. 3 of Tyukhova and Willmann [116]) and, at the same time, the LRP is correlated to the trajectory of the fastest particle (e.g., Figure 5 in [98]). This relation indicates that connectivity structures of the conductivity field is often the driving factor governing the front of the plume. One of the main advantages of using the MHR and LRP is that these quantities can be efficiently computed. Equation (3.2) can be re-formulated in a graph theory framework where it is equivalent to the shortest path problem, that can be solved using a variation of the Dijkstra’s algorithm. A schematic representation of how to transform the K-field into a graph is showed in Figure 5.1. More details about the algorithm are presented in [98] and briefly summarized in Appendix A. Using this algorithm, the MHR and LRP can be used to quickly estimate critical connectivity properties in real applications while limiting the computational burden. 77 5.2 Hydraulic Conductivity RSF Model and the Mini- mum Hydraulic Resistance Uncertainty We employ the graph theory approach described in Section 5.1 to quantify the impact of the RSF model used to conceptualize the K-field on the MHR uncertainty. The analysis is carried out for different dimensionalities of the physical domain (i.e., n = 2; 3). Our goal is to verify whether or not the statistical properties of the MHR change significantly according to the structure of theK field. To achieve this task, we will adopt four different RSF to describe the log-conductivity field (Y = logK). With the goal of providing a fair comparison, each field has same log-mean Y = 0, same log-variance 2 Y = 5 and same Gaussian semi-variogram model: (r) = 2 Y 1 exp(h 2 ) ; (5.1) where r is the n-dimensional lag-distance and h is defined as: h = v u u t 3 n X i=1 r i a i 2 ; (5.2) with a i denoting the practical ranges in each direction. For all the upcoming analysis within this section, we consider the K-field to be isotropic, i.e. all conductivity fields have the same effective range a i =a for every i. The four types of fields analyzed in this work are: 1. Multi-Gaussian fields (MG) 78 2. Non Multi-Gaussian field where high conductivity values are well connected (non- MG con) 3. Non Multi-Gaussian field where low conductivity values are well connected (non- MG dis) 4. Binary field characterized by two homogeneous facies (binary) For each field, both the 2D and 3D versions are taken into account. Therefore, we will analyze the MHR for a total of eight types of unconditional fields. Details pertaining to the K-fields and the corresponding methods employed to generate multiple random realizations are described in the upcoming subsection. 5.2.1 Spatially Random Hydraulic Conductivity Fields Multi-Gaussian Fields: one of the most common choice to model a random log-K field is the MG model [26, 67, 102]. For any set of points, the joint-probability density function (PDF) is assumed to be multi-Gaussian. This assumption permits to efficiently generate random realizations using, for example, the Sequential Gaussian Simulation (SGSIM) technique [see Ch. 3 of 102] using SGEMS [92]. More information about MG fields can be found in Rubin [102]. Non Multi-Gaussian Fields: inadditiontotheMGfield, wealsoinvestigatetheimpact of augmenting connected and disconnected features in the K field using the approach described in Zinn and Harvey [123]. Other approaches are also available [e.g., 104]. The fields reported in Zinn and Harvey [123] are “artificially” generated using a non-linear transformation of the MG field. The connected fields (denoted in the plots as non-MG con) are characterized by the presence of well-connected high-K areas. On the contrary, 79 disconnected fields (denoted in the plots as non-MG dis) are characterized by the presence of well-connected low-K areas. For each point of this field, the PDF of Y is Gaussian. However, the joint-PDF of a set of points is not multi-Gaussian. For this reason, we refer to these types of fields as non-MG fields. Note that the block size is increased after the fields are transformed to ensure that all the fields (i.e., MG and non-MG) have the same practical range [123]. These fields have been employed in multiple studies to investigate the role of connectivity in transport [e.g., 110, 116, 65]. Binary Fields: finally, we consider a discrete binary field where the log-K can assume only two valuesy 1 andy 2 . The field realizations are generated using Sequential Indicator Simulation (SISIM) in SGEMS [92]. Note that in this case, the univariate PDF is not Gaussian. To satisfy the constrains on the mean and variance, the two log-K values are y 1 = Y andy 2 = Y , and the probability of encountering each of these value is exactly 0.5. Therefore, this is a two-facies type of field, where the conductivity is homogeneous in each face. More complex multi-facies fields can be obtained using multiple-points geostatistical simulations [81] or transition probability-based indicator simulations [23]. 5.2.2 Statistical Analysis of Minimum Hydraulic Resistance 5.2.3 Box Plot Analysis To understand the impact of the RSF model on the MHR, we consider the MHR between twopointsseparatedbyadistance 4aandcomparetheresultsfordifferentlog-conductivity RSF models. Using equation (3.2), we defineR :=R s (t), where s and t are two points at distance 4a between each other. The simulation domain is partitioned in 201 201 cells in 2D, and 201 201 201 cells in 3D. The practical range is a = 20 cells. We 80 MG non-MG con non-MG dis binary MG non-MG con non-MG dis binary − 1.5 − 1.0 − 0.5 0.0 0.5 1.0 1.5 2.0 2.5 log 10 R 2D 3D Figure 5.2: Box-plots of log 10 R (log-resistance between two points at distance 4a) using different types of 2D and 3D RSF models. Table 5.1: Summary of statistical properties (mean, standard deviation, skewness, and kurtosis) of the log 10 R Field Mean Standard Dev Skewness Kurtosis MG 2D 0.433 0.367 -0.029 0.092 non-MG con 2D -0.019 0.556 0.717 0.192 non-MG dis 2D 0.765 0.345 -0.502 1.139 binary 2D 0.343 0.371 0.108 -0.993 MG 3D 0.022 0.543 0.954 0.731 non-MG con 3D -0.145 0.755 0.299 -0.466 non-MG dis 3D 0.359 0.397 0.143 1.134 binary 3D -0.101 0.190 0.518 -0.469 81 consider four types of RSF (MG, non-MG connected, non-MG disconnected, and binary) previously described. For each field, we analyze both the 2D and 3D cases. Since the K-field is spatially random, the MHR is treated as a random quantity. To quantify the uncertainty in the MHR, a Monte-Carlo simulation is performed for each of the fields (eight cases in total). For each case, an ensemble consisting of 100 realizations of the K-field is generated. For each realization, the MHR between two points at distance 4a is computed through the algorithm described in Section 5.1 and Appendix A [see 98, for additional details]. Finally, the key statistical properties ofR can be estimated. Figure 5.2 shows the box-plots of the log 10 R for each of the RSF considered. Results are displayed for both 2D and 3D fields. Statistical properties of the log 10 R are reported in table 5.1 for each type of field. Comparison of the log 10 R box-plots provides two important information: it is possible to see how the MHR varies, in average, among differentcases. Second, itispossibletoseetheimpactoftheRSFmodelonthe uncertainty (i.e., the height of the box-plot) of the MHR. The latter indicates the possibility of having realizations with extremely different connectivity configurations (e.g., two points laying in the same high conductivity channel or two points separated by a low conductivity barrier). Close inspection of Figure 5.2 reveals that the MHR is consistently lower, in the average sense, for the 3D fields when compared to the corresponding 2D fields. This is in agreement with previous studies showing a higher connectivity when considering the third dimension, such as consistently lower percolation threshold for 3D fields when compared to 2D fields [93], or a significantly higher probability to find preferential channels in 3D fields [47]. In the 3D cases, there are more possible paths connecting the start and end points, therefore minimization of the hydraulic resistance (among all the possible paths) leads to a lower MHR. Note that the difference of the average MHR between 2D and 3D 82 can be more than one order of magnitude. Also the uncertainty of the log 10 R (i.e., the height of the box-plot) is affected by the dimensionality in a non-trivial way. For example, our results show that for non-MG connected fields, the uncertainty in 3D is higher than the 2D case, while for binary fields the uncertainty in 3D is lower than the 2D case. Next, we compare the results displayed in Figure 5.2 obtained for MG fields and the non-MG fields (connected and disconnected). As expected, the MHR of the connected field is (on average) lower than the MG field. At the same time, we observe that the mean value MHR of the disconnected non-MG field is higher than the average of the MG field. Since these differences can be relatively high (by order of magnitudes), it is clear that the MHR is strongly affected by high order moments characterizing the RSF, since both MG and non-MG fields share the same mean, variance, semi-variogram and uni-variate PDF. Moreover, we shall note that the transformation used to build the non-MG fields [i.e., 123] strongly affect the uncertainty of the MHR. The connected field shows a remarkably higher uncertainty in the MHR when compared to the MG case, while the disconnected field MHR uncertainty is lower than the MG case. In the following we compare the MG field with the binary field (see Figure 5.2). For this RSF model, the average MHR is similar in both the 2D and 3D case. However, our results show that the uncertainty of the MHR in the binary field is smaller than the uncertainty in the MG field, especially in the 3D case. This means that the binary field is not able to reproduce those extreme connectivity configurations that can be found in the MG case. We remark that connectivity, as expressed by MHR, depends on the position of the start and end points (or volumes) and the type of RSF used. However, comparisons of the MHR as shown in the box-plot analysis (Figure 5.2) can be used to quickly compare 83 connectivity properties of different conceptual models. Comparison can be made for both the average MHR (is one geological model more connected than the other in average?) and the uncertainty of the MHR (is one model including more extremes than the other?). This is particularly important since the MHR displays a strong correlation with first arrival time [115, 98]. Moreover, the MHR can be used as a lower bound for first arrival time [98]. Therefore the statistical analysis presented here can inform risk managers on the importance of the choice of the RSF model adopted in models when estimating early arrival times of a contaminant plume. 5.2.4 PDF Analysis In the previous section, we have seen the impact on the average resistance and its un- certainty. Now, we will compare the empirical PDF ofR (obtained directly from the ensemble) to well-known, parameterized, PDF models. For this analysis, we have tested numerous PDF models. For each known PDF model, we first estimate its parameters using a proper Maximum Likelihood Estimator (MLE), by setting the minimumR (zero for MG and non-MG fields, and 4 exp( Y ) for the binary field). Then, we perform the Kolmogorov-Smirnov goodness-of-fit test to compare the known PDF and the empirical PDF ofR for each of the eight fields previously described. For each test, we report the corresponding p-value that indicates the probability thatR is distributed according to the given PDF. Here, we report the tests performed utilizing the following distributions: Power Log-Normal, Log-Normal, Exponentiated Weibull, and Beta Prime. Results for each type of RSF are shown in Figure 5.3. We start by noting that none of these distribution models is able to fully mimic the statistical behavior in the binary 3D case (see Figure 5.3h). The reason is that theR 84 10 − 3 10 − 2 10 − 1 10 0 10 1 PDF 2D a MG p=0.91 p=0.88 p=0.90 p=0.89 b non-MG con p=0.60 p=0.21 p=0.56 p=0.75 c non-MG dis p=0.75 p=0.67 p=0.77 p=0.82 d binary p=0.75 p=0.52 p=0.82 p=0.60 10 0 10 1 R 10 − 3 10 − 2 10 − 1 10 0 10 1 PDF 3D e p=0.17 p=0.04 p=0.25 p=0.71 10 0 10 1 R f p=0.34 p=0.37 p=0.48 p=0.10 10 0 10 1 R g p=0.10 p=0.09 p=0.12 p=0.18 10 0 10 1 R h p=0.04 p=0.00 p=0.07 p=0.07 Power Log-Normal Log-Normal Exponentiated Weibull Beta Prime Figure 5.3: Comparison between the empirical PDF ofR and existing PDF models: Power Log-Normal (orange), Log-Normal (green), Exponen- tiated Weibull (red), Beta Prime (purple). The Kolmogorov-Smirnov test was adopted to test the goodness-of-fit and the corresponding p-values are reported inside each sub-figure. Results are displayed for Multi-Gaussian (MG), non-MG connected, non-MG disconnected and binary fields in 2D (top row) and 3D (bottom row). 85 empirical PDF is bounded, as a result of the bimodal K distribution, while the known PDF taken in consideration have a semi-infinite domain (e.g., random values range from 0 to infinity). Next, we see that all distributions taken in consideration have a similar performance. It is interesting to note that many of these PDF are used to statistically model extreme events. This is consistent with the class of problem we are investigating in this Chapter, since the MHR can be seen as an extreme. For example, given a set of independent and identically distributed random variables with a finite minimum, the distribution of the minimum will converge to a Weibull distribution [e.g., Ch. 4 of 4]. However, the MHR is the result of the minimization problem (3.2) where the hydraulic resistances of all the paths are not independent. Still, results indicate that the PDF may be related to the Exponentiated Weibull distribution, which is an extension of the Weibull distribution. 5.3 Identifying the Least Resistance Path of a 3D Ran- domHydraulicConductivityFieldandImplications on First Arrival Time Uncertainty In this section, we demonstrate how the proposed MHR concept can be employed to improve data acquisition campaigns and reduce uncertainties associated with the LRP and, consequently, with first arrival times of a solute body. Westartbyconsideringa3DunconditionalY-fieldcharacterizedbymean Y , variance 2 Y andGaussiansemi-variogrammodelasdescribedin(5.1). Theconsideredfieldcontains 1819131cells, eachcellbeingaregular 111cube. Thepracticalrangesinhorizontal 86 Figure 5.4: Three-dimensional spatially heterogeneous log-conductivity field representing the synthetic “truth” ( Y = 0, 2 Y = 4:5). This reference field is denoted as Y. directions are a := a 1 = a 2 = 30 cells, while the vertical range is a 3 = a=2 = 15 cells. In this study, we consider a dimensionless coordinate system x i = ^ x i =a, where ^ x i are the coordinates in dimensional form. For our analysis, a random unconditional realization (with Y = 0 and 2 Y = 4:5) is generated as shown in Figure 5.4 using the Sequential Gaussian Simulation (SGS) method and SGEMS [92]. We refer to this field as Y and it will represent our synthetic true Y- field from which we will sample conductivity values at different locations. In the following subsection, we will explain the details of the LRP-based sampling strategy. 87 5.3.1 Iterative Sampling Strategy based on the Least Resistance Path In the following we propose a methodology to identify the LRP in an unknown field, and reduce the uncertainty associated with it. We start by considering the Y field shown in Figure 5.4, and we are interested in the LRP between the left boundary (x 1 = 0) and right boundary (x 1 = 6). As previously mentioned, this field will represent the synthetic true field from which we will extract log-conductivity data at specific locations. We recall that the hydraulic resistance along the LRP is, by definition, less or equal to the hydraulic resistance of any other path connecting the left and right boundaries (see section 5.1). The goal is to estimate the LRP by sampling the conductivity field only in few locations, emulating a real site-characterization process. Instead of using a regular sampling grid, we select the sampling locations through an iterative approach, using current information on the LRP and its uncertainty to select the locations for the new samples. The iterative sampling method is based on the following steps: I Select the initial x (j) = (x (j) 1 ;x (j) 2 ;x (j) 3 ) sampling locations where j = 1;:::;N 0 and N 0 is the initial number of measurements. II In each new location x (j) , sample the hydraulic log-conductivity, and create a new measurement data vector: = Y (x (1) ); Y (x (2) ); Y (x (3) );::: III Usingthenewmeasurementdata togetherwithpreviouslysampleddata,generate N f conditional realizations of the K-field. 88 IV For each realization, employ the graph theory algorithm to compute the LRP. From all these realizations, compute the probability map of the LRP. V Select N m new x (j) locations using the information on the uncertainty of the LRP, and restart from Step II. The results originating from Steps I-V are illustrated in Figure 5.5. In Figure 5.5 we show the probability of finding the LRP for each iteration of the data sampling campaign. For this type of analysis and for the purpose of illustration, it is convenient to use a x 3 -integrated probability map in order to better identify the sampling locations for the next iterations. Moreover, the probability is computed in cells containing 5 5 blocks of the original numerical grid. The probability that the LRP crosses a given cell can be calculated by counting the number of realizations where the LRP crosses the cell. The initial sampling locations are located on a horizontal 4 3 regular grid (see black circles in Figure 5.5.I1), for a total of 12 horizontal sampling locations. For each horizontal sampling location, conductivity values are extracted in layers 0, 5, 10, 15, 20, 25, 30, so that the initial number of measurements is N 0 = 4 3 7 = 84. Using this information, we generate N f = 100 conditional K-fields and compute the LRP for each realization following the approach described in section 5.1. After the probability map of finding the LRP is computed, we select the six new horizontal sampling locations (see red squares in Figure 5.5.I1). Therefore the number of measurements added for each iteration is N m = 6 7 = 42. Since the objective is to find the LRP, the new sampling locations should be chosen based on the current LRP probability map. Therefore, locations with high probability of finding the LRP should be preferred. The sampling strategy consists in allocating most of the sampling locations in areas where the probability to find the LRP 89 0 1 2 3 x 2 I1 I2 0 1 2 3 x 2 I3 I4 0 1 2 3 4 5 6 x 1 0 1 2 3 x 2 I5 0 1 2 3 4 5 6 x 1 I6 Figure 5.5: Map of the x 3 -integrated probability to find the LRP of the field shown in Figure 5.4 in all the six iterations of the iterative sampling strategy. Blackcirclesaretheprevioussamplinglocations, whileredsquares are the sampling location chosen at the current iteration. In I6, the red line represents the real LRP. 90 is high, and the remaining sampling locations where the probability is low. Collocating few sampling locations in low probability areas reduces the risk of focusing on sub-optimal paths (or from an optimization point of view, converging to a local minimum), therefore improving the quality of the iterative procedure. For this benchmark case, we place four sampling locations on areas where there is a high probability of finding the LRP, and two elsewhere. It is clear from Figure 5.5 that after only five iterations, we are able to detect the LRP with a relatively small level of uncertainty (refer to the red line in Figure 5.5.I6 for the real LRP). Moreover, as a natural outcome of the iterative procedure, most of the sampling locations are placed within the vicinity of the LRP. The resulting sampling layoutisalsoconsistentwiththeresultsoriginatingfromtheBayesiangeostatisticaldesign framework of Nowak et al. [85]. Nowak et al. [85] used inverse modeling to minimize the uncertainty of solute arrival times in a 2D spatially variable flow field and showed how most of the K sampling were located between the source and endpoint. In real applications, the geostatistical model and statistical properties of the hydraulic conductivity field are, in general, unknown. However, in case the site has been already analyzedandhydraulicconductivitystatisticalpropertiesarealreadyestimated, theinitial sampling locations of the iterative method can be chosen to be the K-values sampled duringtheprevioussitecharacterizationprocess. Ifthesiteisyettobeanalyzed(unknown geostatistical parameters), the initial number of sampling locations have to be larger, since statistical properties of the field must be estimated. It is important to note that the geostatistical model (and corresponding statistical properties) can be updated at each iterationusingthenewlyavailabledataandcouldbefurthercastinaBayesianframework. 91 Figure 5.6: LRPs ensemble obtained by conditioning the field in different locations using the regular strategy (top) and the iterative strategy (bot- tom). The paths are colored according to the logK. The spheres represent the locations where the conductivity is sampled. 92 5.3.2 Comparison with a Regular Sampling Strategy Next, we compare the proposed sampling strategy based on the LRP probability (see measurement network depicted in Figure 5.5) to a regular sampling strategy. For this regular strategy, we select the horizontal sampling locations along a regular 7 6 grid, for a total of 42 horizontal sampling locations. This is the same number of locations used at the end of the iterative strategy (i.e., number of black circles in Figure 5.5.I6). Therefore, bothstrategieswouldrequireaidenticalsamplingeffort, withtheiterativestrategyhaving a negligible time overhead due to the fact that we compute the LRP probability map for each iteration. Figure 5.6 shows the ensemble of 100 LRPs for the fields conditioned to K extracted with the regular strategy (Figure 5.6, top) and the iterative strategy (Figure 5.6, bottom). It is clear that the uncertainty on the LRP is higher for the regular strategy, with the LRPs covering a larger area and having different shapes from the expected one. On the contrary, the higher sampling frequency along the LRP obtained by using the iterative strategy reduces its uncertainty, as shown by the more homogeneous behavior of the LRPs across different conditional realizations (Figure 5.6, bottom). The comparison with the regular sampling strategy shows the advantages of using the iterative method to choose the sampling locations. In fact, it is possible to reduce the uncertainty on the LRP by rearranging the sampling locations, maintaining the character- ization costs and effort constant. Since the graph theory based algorithm used to compute the LRP is relatively efficient [98], the computation overhead added can be justified by the gain in uncertainty reduction. 93 Figure5.7: Solute plume after 500 steps (t = 25) of the open-source GPU accelerated RWPT code PAR 2 [100] using the synthetic true field Y (see Figure 5.4). On the bottom projection, we compute the the x 3 -integrated log-concentration. 5.3.3 Uncertainty Reduction of First Arrival Times We have seen how choosing the K sampling locations along the LRP decreases the un- certainty associated with its estimation. Here, we show that the uncertainty on the first arrival time decreases when using the iterative method to selectK sampling locations. To show this, we simulate flow and transport for the both conditional K ensembles obtained through the iterative sampling and the regular sampling strategies. The governing equations of groundwater flow are solved using MODFLOW [59] and the Python interface FloPy [5], while solute transport is simulated using the open-source GPU-accelerated random walk particle tracking (RWPT) code PAR 2 [100]. Details about 94 the physical-mathematical model adopted in this work can be found in Appendix B. Note that all the quantities are presented in dimensionless form and the dimensionless groups are listed in Appendix B. The solute plume, represented by a set of particles, is initially placed in a x 2 -x 3 plane along the left boundary (x 1 = 0). For this simulation, we use 10 million particles, a dimensionless time stepdt = 0:05, molecular diffusionD m = 10 9 , and longitudinal and transverse dispersivities L = 0:01 and T = 0:001 respectively. Figure 5.7 shows the shape of the plume after 500 time steps (i.e., dimensionless time t = 25) using the synthetic true field Y depicted in Figure 5.4. Since the heterogeneity level of the field is relatively high, the solute plume is strongly affected by high conductivity channels, and the front of the plume moves along the LRP. We run two Monte-Carlo simulations using conductivity fields conditioned to the sam- pling locations determined in the iterative strategy (Figure 5.6, bottom) and sampling locations chosen using a regular strategy (Figure 5.6, top). For both ensembles, 100 conditional realizations are generated using SGSIM. For each K realization, flow and transport are simulated and the statistics of the cumulative Break-Through Curve (BTC) at a control plane located on the right boundary (x 1 = 6) is computed. Figure 5.8 shows a comparison on the cumulative BTC at the right boundary between the two sampling strategies. Since the MHR and LRP are linked to first arrival time [see Figures 5 and 7 of 98], we focus on the early solute breakthrough. Figure 5.8 shows that the cumulative BTC obtained from the synthetic true field Y (red line) is within the uncertainty envelope in both the iterative sampling method (blue) and regular sampling method (green). However, the confidence interval (i.e., uncertainty) is smaller when using the iterative sampling method. For example, let us consider t 1% (i.e., the time when 1% of the particles has crossed the right boundary) as the quantity of interest. If we 95 0.00 0.25 0.50 0.75 1.00 1.25 1.50 t/t 1% 0.00 0.02 0.04 0.06 0.08 0.10 Cumulative BTC Iterative Sampling Synthetic Truth 0.00 0.25 0.50 0.75 1.00 1.25 1.50 t/t 1% Regular Sampling Synthetic Truth Figure 5.8: Results for the cumulative BTC at the control plane and its confidence intervals using the iterative sampling (blue) and regular sam- pling (green) for the benchmark case. The red line is the cumulative BTC obtained using Y. The blue/green lines are the median. The darker shaded areas of blue and green represent the 50% confidence intervals whereas the lighter shaded areas of blue and green represent the 90% confidence inter- vals. Time is normalized with the t 1% of the synthetic true field. 96 Iterative Sampling Regular Sampling 15 20 25 30 35 40 45 First Arrival Time (t 1% ) Figure 5.9: Comparison of first arrival time t 1% using the iterative sam- pling (blue) and regular sampling (green). The red line represents the t 1% using the synthetic true field Y. 97 measure the uncertainty through the standard deviation oft 1 %, we find that the proposed iterative method reduces the uncertainty by 47% when compared to the regular sampling method. Figure5.9showstheestimationoft 1% forboththeiterative andregular sampling methods. We point out again that both regular and iterative sampling methods require the same number ofK-samples (in this case, 42 horizontal sampling locations). Similarly to LRP, it is possible to decrease the uncertainty of first arrival time by rearranging the sampling locations. The direct computation of first arrival times would require many flow and transport simulations for each iteration. Instead, our work proposes to use the iterative strategy to minimize the LRP uncertainty, which is more efficient and requires minimal computational effort. Nonetheless, we can achieve a remarkable reduction in the first arrival time uncertainty due to the link between MHR/LRP and first arrival times. The 47% uncertainty reduction from a regular sampling strategy is similar to what was obtained by Nowak et al. [85] for t 50% using a task-driven inverse modeling technique. However, the iterative strategy does not need any flow and transport simulation, being more efficient when the aquifer domain is large. Moreover, the choice of new sampling locations is based on all the samples collected during previous iterations, thus the final sampling locations are optimized for the site under examination. Finally, we compare the computed values of t 1% with the corresponding MHR for each realization of the hydraulic conductivity ensemble. This analysis is carried out for an unconditional ensemble and the conditional ensemble obtained through the iterative sampling strategy described in Section 5.3.1. Figure 5.10 shows a scatter plot of the MHR andt 1% for 100 fields generated using the conditional field, and 100 fields generated using the unconditional field. As expected, there is a clear correlation between the MHR and first arrival times. Note that the quality of this regression is of key importance for the 98 0 5 10 15 20 25 30 35 40 MHR 0 20 40 60 80 100 First Arrival Time (t 1% ) Figure 5.10: Scatter-plot of the MHR between left/right boundary and first arrival times t 1% . Grey circles are obtained using unconditional simu- lations while blue rhombus are obtained using the iterative sampling con- ditional simulations. 99 0 1 2 3 4 5 6 x 1 0 1 2 3 x 2 I6 Figure 5.11: Map of the x 3 -integrated probability to find the LRP be- tween the injection and production wells in the field shown in Figure 5.4 after six iterations of the iterative sampling strategy. Black circles are the final 42 horizontal sampling locations. The red line represents the real LRP between the two wells. success of the iterative sampling strategy. The regression variability is representative of all the factors that are not incorporated in the MHR, such as boundary condition effect and mass conservation. Recall that the MHR is a static quantity and flow and transport simulations are not used for its computation. The unconditional field realizations (grey circles)spanalargearea,asaresultofthehighvariabilityduetothelackofmeasurements. On the contrary, the conditional field realizations (blue rhombus) variability is much lower, being close to the point, depicted as a red cross, indicating the MHR andt 1% of the synthetic true field Y. It is clear that reducing the uncertainty of the MHR (for example by sampling along the LRP) results in a decrease of the uncertainty of the first arrival time due to the fact that the two quantities are correlated. 100 0.00 0.25 0.50 0.75 1.00 1.25 1.50 t/t 1% 0.00 0.02 0.04 0.06 0.08 0.10 Cumulative BTC Iterative Sampling Synthetic Truth 0.00 0.25 0.50 0.75 1.00 1.25 1.50 t/t 1% Cumulative BTC Regular Sampling Synthetic Truth Figure 5.12: Results for the cumulative BTC at the production well and its confidence intervals using the iterative sampling (blue) and regular sam- pling (green) for the well-to-well case. The red line is the cumulative BTC obtained using Y. The blue/green lines are the median. The darker shaded areas of blue and green represent the 50% confidence intervals whereas the lighter shaded areas of blue and green represent the 90% confidence inter- vals. Time is normalized with the t 1% of the synthetic true field. 101 5.3.4 Start and Target Control Volumes Selection Thebenchmarkscenarioconsideredintheprevioussectionsassumesthatthesoluteplume is transported along the average flow direction in the x 1 direction. The iterative strategy proposed can be extended to different scenarios. One key factor for a proper application of this strategy is the selection of the start and target control volumes. In the benchmark case (see section 5.3.3), the start and target control volumes were chosen to be the left and right boundaries of the domain. This was chosen since the solute is initially injected through the left boundary and we are interested in the first arrival times at the right boundary. However, other configurations might be of interest. A practical scenario which is of interest to hydrogeologists is the case where the solute isinjectedinawellandextractedfromanotherwell(similartoatracertest). Followingthe procedure described in the previous sections, Figure 5.11 reports the sampling locations selected after six iterations as well as the corresponding LRP probability map. In this case, we are seeking the LRP from the injection point located in the lower-left area, i.e., (x 1 ;x 2 ;x 3 ) = (1; 0:6; 0:5), to the production point located in the upper-right area, i.e., (x 1 ;x 2 ;x 3 ) = (5; 2:4; 0:5). The synthetic true field is the same as before (i.e., Y as shown in Figure 5.4). Again, the results displayed in Figure 5.11 show that the methodology is capable of identifying the trajectory of the LRP after six iterations and 42 horizontal sampling locations. As done in the benchmark scenario, see section 5.3.3), we perform a Monte-Carlo analysis of the flow an transport simulations using a RSF conditioned to the conductivity extracted in the sampling locations given by the iterative strategy and the sampling locations located along a regular grid. Both cases have the same number of conditioning points (i.e., 42 horizontal sampling locations). For this simulation, we enforce closed 102 boundary conditions and that the injecting and producing wells operated at the same constant flux-rate. The particles are instantaneously released in the injection point. All the other parameters are the same as in the benchmark case. As shown in Figure 5.12, the massBTCuncertaintyestimatedthroughtheiterativesamplingissmallerwhencompared to the simulations conditioned on the regular sampling grid (compare left and right graphs presented in Figure 5.12). In this case, we observed a 42% reduction of the t 1% standard deviation, when compared to the regular sampling strategy. 5.4 Summary Inthiswork,weemployagraphtheoryframeworktocomputetheMinimumHydraulicRe- sistance (MHR) and Least Resistance Path (LRP) in the subsurface environment. These quantities can be used to provide information regarding the hydraulic connectivity of a heterogeneous porous formation which is related to solute first arrival times. Due to the computational efficiency of the framework, we perform a systematic study to investigate the impact of the random heterogeneous structure of the porous medium on the statistical properties of the MHR. We show how the conceptualization of the random space function (RSF) model impacts the statistical distribution and spread of the MHR. Amongstseveralmodelsusedintheliterature, wefocusedonmulti-Gaussianaswellasnon multi-Gaussian log-conductivity fields. Within the non multi-Gaussian field category, we consider two-facies (binary) fields, and fields displaying well-connected and disconnected features. Results show that the RSF model has a remarkable impact on the average MHR and the MHR uncertainty. Furthermore, we illustrated how the dimension of the physical domain (i.e., 2D vs. 3D) has a key role in controlling the statistics of the MHR and the 103 connectivity structure of the porous formation. We show how the choice of both the RSF model and dimensionality impacted the extremes of the MHR statistics. As expected, our analysis show that 3D fields display consistently lower MHR (in the average) when compared to the 2D counterparts. The graph theory based framework presented in this analysis can be used by practitioners to efficiently evaluate the impact of assumptions associated with a specific RSF model and dimensionality on connectivity. Lastly, we propose an iterative strategy for data acquisition to improve the delineation of preferential channels in aquifers. Instead of evenly distributing the sampling locations, our method allows for the sampling campaign to be divided into multiple iterations. In each iteration, new sampling locations are chosen with the goal of minimizing the uncertainty of the LRP. Since the LRP can be computed efficiently for both 2D and 3D fields, the computational overhead introduced by the intermediate steps is negligible. We evaluate this strategy using a synthetic field, and perform a comparison with a simpler (i.e., regular) strategy, where sampling locations are chosen on a regularly spaced grid. The iterative strategy leads to a reduction of the LRP uncertainty when compared to the regular strategy, since most of the sampling locations are placed along the true LRP. Moreover, since the LRP is related to the trajectory of the front of the solute plume, the uncertainty on first arrival time is also decreased. Our computational analysis shows a 47% decrease in first arrival time uncertainty as a result of the new sampling strategy.In our benchmark model, we used a Multi-Gaussian random field to represent the hydraulic conductivity uncertainty. However, due to the flexibility of the algorithm to compute the MHR and LRP, the iterative sampling strategy can be used with any RSF that supports hard-conditioning (i.e., set the K values in the sampling locations). Casting the iterative sampling methodology ideas within an optimal design framework for the identification of 104 the LRP is topic of future research. 105 Chapter 6 Connectivity of Multi-Indicator Models Hydraulic properties (e.g., hydraulic conductivity) characterizing natural porous forma- tions often display variability over multiple length scales. The spatially variable hydraulic conductivity structure leads to the presence of preferential flow paths which is one of the main causes for anomalous transport normally observed in the field. Capturing the con- nectivity features of the subsurface environment is important to predict first time arrival times which has implication in risk analysis. The minimum hydraulic resistance (MHR) has been proved to be a convenient and intuitive quantity to estimate the connectivity properties of spatially random conductivity field. In [98] we showed how to calculate the MHR for a given continuous random con- ductivity field (such as multi-Gaussian fields and the non multi-Gaussian fields Zinn and Harvey [123]). Thus, it is possible to compare the statistical behavior of the MHR across different fields, understanding how the connectivity is affected by the given geostatistical conceptualization. In this chapter, we extend the definition of the MHR for an hydraulic conductivity field represented by multi-indicator models (MIM) [66]. In this type of model, the conductivity field is represented by a set of interconnected blocks, where each block has a given hydraulic conductivity value. Contrary to the multi-Gaussian models, 106 the hydraulic conductivity within each block is independent from the other blocks. As a result, it is possible to approximate analytical results for flow and transport for highly heterogeneous fields [44]. Despite the simplified conceptualization given by the MIM methodology, several studies have tested its predictive power, for example by predicting the plume behavior in the MADE site [46]. The MIM K-fields allow to estimate the statistical properties of the MHR under different scenarios using either a Monte Carlo approach or an analytical approach. In this chapter, we analyze what is the impact of the dimensionality of the field (i.e., 2D vs 3D). It is known that 2D fields are in general less connected than 3D fields [47], however using the MHR with MIM K-fields we are able to quantify this difference and understand how this difference changes with heterogeneity and distance between source and target. At the same time, we compare the expected resistance of the MHR in MIM K-fields with multi-GaussianK-fields results computed in [98], showing at the same time how the MHR can be employed to compare connectivity properties of different models. 6.1 Minimum Hydraulic Resistance in Multi-Indicator Models To analyze the connectivity of a random conductivity field, we use a Multi-Indicator Model (MIM) [66]. A MIM field consists of a series of blocks with size , where each block is characterized by a random homogeneous hydraulic conductivity value. The blocks are grouped in a series of N x layers along the x-direction, and each layer i contains N y;i N z;i blocks. Thus, each block is characterized by the indicesi;j;k, with 0iN x , bN y;i =2cjbN y;i =2c, andbN z;i =2ckbN z;i =2c. A schematic representation of 107 Figure 6.1: Multi-Indicator Model (MIM). The red lines represent a pos- sible path from (0, 0) to (4, -2). Note that we are neglecting the part that moves along the y-direction. 108 the MIM structure is given in Figure 6.1. Note that each block in layer i is connected to 9 blocks in layeri+1 if theK-field is 3D, or 3 blocks if theK-field is 2D. We assume that the layer i = 0 always contains exactly one block, therefore the transverse dimensions in the y-direction is N y;i = 2i + 1, while in the z-direction is N z;i = 2i + 1 for 3D fields or N z;i = 1 for 2D fields. Assuming the block size larger than the K-field integral scale I Y , we set the con- ductivity in each block K i;j;k to be independent and log-normal distributed [66]: K i;j;k logN(;) (6.1) where is the log-mean and is the log-standard-deviation. Followingtheproceduredescribedin[98],wedefinetheMinimumHydraulicResistance (MHR) in aK-field conceptualized through the MIM. We start by defining the hydraulic resistance in each block. For a given block i;j;k, consider a straight lineL crossing the block along the x-direction. Using equation (6.1), the block hydraulic resistance is computed as: R i;j;k = Z L 1 K d = K i;j;k logN(log ;): (6.2) Note that the block hydraulic resistance is also log-normal distributed. Next, we define the set of all the possible pathsP i;j;k (i.e., all the sequences of connected blocks) from the starting block (i = j = k = 0) to the block i;j;k. therefore the MHR between the starting block and the block i;j;k is: R i;j;k = min 2P i;j;k X i 0 ;j 0 ;k 0 2 R i 0 ;j 0 ;k 0: (6.3) 109 The path that minimizes the resistance is called least resistance path (LRP). Using equation (6.3), we compute the MHR from the starting block to the entire layer i as: R i = min j;k R i;j;k : (6.4) If backward connections between blocks are not present, then the structure of the MIM is an acyclic directed graph. Given the MHR in all the blocks of layer i 1, it is possible to compute the MHR in all the blocks of layer i using the following induction step: R i;j;k =R i;j;k + min j 0 ;k 0 2N i;j;k R i1;j 0 ;k 0; (6.5) where N i;j;k contains all the pair j 0 ;k 0 such that i 1;j 0 ;k 0 is connected to i;j;k (i.e., all the neighbor blocks in the previous layer). 6.2 Monte Carlo Analysis The statistical distribution of the MHR can be efficiently estimated using a Monte-Carlo (MC) framework. For each MC realization of the MIM field, we start from layer 0, where a random resistancer 0;0;0 is generated from the resistance distribution defined in equation (6.2), and assigned to r 0;0;0 . Then, the induction step (6.5) can be rewritten for a given realization: r i;j;k =r i;j;k + min j 0 ;k 0 2N i;j;k r i1;j 0 ;k 0: (6.6) The hydraulic resistance within each block is evaluated only once for each MC realization, avoiding the computation of the hydraulic resistance for every possible path. Finally, the 110 0 1 2 3 4 5 f R ∗ σ = 4.0 σ = 2.0 σ = 1.0 σ = 0.5 2 4 6 R ∗ 2 0 1 2 3 4 5 f R ∗ 2 4 6 R ∗ 6 2 4 6 R ∗ 10 Figure 6.2: PDF of the MHR at layers 2, 6, and 10 for 2D (top) and 3D (bottom). The histograms are computed using N MC = 10; 000 realizations for each . CDF of the MHR in each block can be estimated from the r i;j;k values. Using equations (6.4) and (6.6), it is possible to compute the MHR to a layer i for each realization as: r i = min j;k r i;j;k (6.7) 6.2.1 Importance of Dimensionality (2D vs 3D) The dimensionality of the domain (2D vs 3D) affects the number of possible paths con- necting the starting block to a layer. In the 2D case, each block is connected to 3 blocks of the following layer, while in the 3D case there are 9 connections. Since we minimize 111 the resistance among all the possible paths, the 3D case will show a smaller MHR. As shown in [98], the hydraulic conductivity in the source zone (i.e., the starting block) is of key importance, since all the possible paths need to intersect this area/block in both the 2D and 3D cases. Since we are interested in the connectivity differences due to the different dimensionality, we remove the contribution of the conductivity in the starting block by setting r 0;0;0 = 1 in all the MC simulations. Figure 6.2 shows the PDF of the MHR between the starting block and layeri (i.e.,R i as in equation (6.4)) computed using the MC framework presented in section 6.1. The MHR in the 3D case is smaller than the 2D case for all the considered. The MHR difference due to the dimensionality increases with . For example, if we consider = 4 andthelayeri = 10, wehavethattheaverageMHRis1.08forthe3Dcaseand5.05forthe 2D case, that is roughly 5 times higher. Instead, the average MHR for = 0:5 is 5.49 for the 3D case and 7.47 for the 2D case, thus only 1.36 times higher. As a consequence, the 2D assumption may lead to an underestimation of the connectivity (i.e., higher resistance) in the presence of high heterogeneity. Next, we analyze how the expected value of the MHR changes with the control plane distance. Figure 6.3 shows the expected value ofR i as a function of i for different . Since the starting block is the same for every path and we are interested in the behavior for large i, we remove its contribution to the MHR by removingR 1 . In all the cases considered, the expected value of the MHR grows linearly when i is relatively large (see Figure 6.3). However, the increase rates are different between 2D and 3D fields. In fact, for every , the increase rate of the MHR expected value is larger in 2D when compared to the 3D case. Thus, our results indicate the connectivity difference between 2D and 3D MIM K-fields increases indefinitely with the control plane distance. 112 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 i 1 2 3 4 5 E h R ∗ ,2D i i /E h R ∗ ,3D i i σ = 4.0 σ = 2.0 σ = 1.0 σ = 0.5 Figure 6.3: Ratio between 2D and 3D expected values ofR i as a func- tion of the layer i. The 2D expected MHR is always higher than the 3D counterpart. 6.2.2 Comparison with Multi-Gaussian fields In general, the MHR increases with the distance between the source and the target. For a multi-Gaussian field, given two points with a distance greater than one integral scale, the expected MHR is proportional to the distance [98]. Moreover, the coefficient depends on the heterogeneity level . In general, the expected MHR for two points at distance dI Y is: E [R ]'!()d: (6.8) For a unconditional 2D multi-Gaussian field with exponential variogram model, the pro- portionality factor is !() = exp0:8 [98]. To compare the connectivity properties of the MIM models to the multi-Gaussian results, we estimate the expected MHR using the 113 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 σ 10 − 1 logω(σ ) 2D 3D exp(− 0.8σ ) Figure 6.4: Scaling factor of the expected MHR ! defined in equation (6.8) as a function of the heterogeneity level . The green line is based on the results for the multi-Gaussian fields shown in Rizzo and de Barros [98]. The 2D and 3D values for the MIM fields are estimated using Monte Carlo simulations. 114 Monte Carlo framework introduced in section 6.1. Using the same notation, the MHR between the starting block and the block in the center of the i layer isR i;0;0 , while the distance is d = i. For each layer i, we compute the expected MHR E R i;0;0 . Finally, the proportionality factor !() is computed using a linear interpolation of the expected MHR between blocks separated by 3 i 11 layers. Figure 6.4 shows the behavior of !() for 2D and 3D MIM models, and for the 2D multi-Gaussian fields (green line). For low heterogeneity, the MIM fields and multi-Gaussian fields display a similar trend for the expected MHR. However, as the heterogeneity increases, differences of the MHR increase as a result of the missing correlation between blocks of a MIM field. Therefore, particles may encounter a higher resistance in a 2D MIM field when compared to the 2D multi-Gaussian field. At the same time, the 3D MIM model has a lower average resistance when compared to both the 2D MIM model and 2D multi-Gaussian fields. As expected, the increased number of possible paths in the 3D case leads to a lower overall resistance. In all the cases, the expected MHR slope decreases with the heterogeneity , indicating that high leads to an increased overall connectivity of the field. 115 Chapter 7 Conclusions Connectivity of heterogeneous porous media has been proved to be one of the key fac- tors controlling anomalous solute transport. In this dissertation, the minimum hydraulic resistance and least resistance path metrics are employed to measure connectivity prop- erties and statistics of hydraulic conductivity fields. In Chapter 3, we developed a flexible graph theory based algorithm to compute the minimum hydraulic resistance and least resistance path. By transferring the porous media formulation to a graph theory context, we were able to prove that finding the minimum hydraulic resistance is equivalent to the shortest path problem. Therefore, it is possible to compute the minimum hydraulic resis- tance using a variation of the Dijkstra’s algorithm. Due to the computational efficiency of the latter, it is possible to treat connectivity as a stochastic quantity, incorporating the uncertainty due to the lack of information about the hydraulic conductivity field. In Chapter 4, we use Monte Carlo simulations to estimate the expected hydraulic re- sistance for different random space function representing the hydraulic conductivity field. Moreover, we show that the minimum hydraulic resistance is correlated to the first time arrival of a solute plume, while the least resistance path is correlated to the trajectory of 116 the leading edge (i.e., front) of the plume. This means that it is possible to utilize mini- mum hydraulic resistance and least resistance path to quickly estimate crucial information about the solute movements, without relying to flow and transport numerical simulation, that are in general much more computationally expensive. Following these findings, in Chapter 5 we provide a practical example of how the least resistance path concept can be efficiently employed to improve characterization of an aquifer’s physical properties. Dur- ing site characterization campaigns, the hydraulic conductivity is sampled only in a few locations, and these results are then used to build a geostatistical model of the hydraulic conductivity field, incorporating the randomness due to unknown spatial distribution of the conductivity. The idea behind the proposed iterative sampling strategy is to maxi- mize the probability to sample around the least resistance path, that is related to the path followed by the front of the plume (i.e., high conductivity channel or preferential path). It follows a reduction of the hydraulic conductivity uncertainty in the high conductivity channels. Results show that it is possible to obtain more than 40% uncertainty reduction in the first arrival time uncertainty when compared to a regular sampling protocol with the same number of sampling locations. Since we are using the least resistance path and the graph theory algorithm, therefore no flow and transport simulations are required, the time burden introduced by the iterative strategy is marginal when compared to the actual time needed for physically sampling the conductivity in a given location. Another application of the minimum hydraulic resistance is shown in Chapter 6. In fact, the minimum hydraulic resistance can be employed to efficiently estimate connec- tivity differences across different stochastic representations of the conductivity field. We analyzed the connectivity properties of the multi-indicator models (MIM) for a hydraulic 117 conductivity field, by studying the impact of the 2D assumption often made in differ- ent studies. As expected, we found that the 3D fields have a lower hydraulic resistance than 2D fields, since there are more possible paths for a solute particle to flow through porous media. Moreover, we were able to quantify the level of connectivity difference as a function of heterogeneity level and distance form source to target. As part of this dissertation, two open-source software have been developed. The first one is PAR 2 (see section 2.2), a GPU-accelerated particle tracking solver for simulating solute transport in porous media. The particle tracking method is particularly suited for parallelization, since the trajectory of each particle carrying a fraction of the solute mass can be computed independently. PAR 2 allows a fast simulation of solute transport, allowing the practical usage of computationally intense algorithm such as Monte Carlo simulations. Moreover, it will be possible to use PAR 2 as a starting point for a fast parallel reactive transport solver, reusing the advection and dispersion parts already implemented. The second software is LazyMole (see section 3.2), that implements an optimized version ofthegraphtheoryalgorithmusedtocomputetheminimumhydraulicresistanceandleast resistance path. An intuitive interface is provided for both software, and prior knowledge of programming languages is not necessary. All the results and algorithms presented in thisdissertationcanbereproducedusingPAR 2 (https://github.com/GerryR/par2)and LazyMole (https://github.com/GerryR/lazymole), therefore integration of the con- cepts discussed through this dissertation into new research will be facilitated. Since connectivity plays an important role in solute transport, geostatistical models should incorporate the connectivity structure due to the heterogeneity of the aquifer un- der consideration. The minimum hydraulic resistance and least resistance path offer an efficient way to estimate connectivity of random space functions and they can be used 118 for an informed selection of the geostatistical model. These connectivity metrics could be employed in practical scenarios, as shown by the iterative sampling strategy, reducing the solute transport uncertainty due to the lack of knowledge of the conductivity field. Therefore, future research should focus on how to bring these concepts into methodolo- gies pipelines currently used by practitioners, with the intention to reduce the spatial uncertainty, and consequentially improve risk assessments estimations. 119 Bibliography [1] Andrés Alcolea, Jesús Carrera, and Agustín Medina. “Pilot points method incor- porating prior information for solving the groundwater flow inverse problem”. In: Advances in water resources 29.11 (2006), pp. 1678–1689. [2] DenisAllard.J.-P. Chilès, P. Delfiner: Geostatistics: Modeling Spatial Uncertainty. 2013. [3] Roko Andričević and Vladimir Cvetković. “Evaluation of risk from contaminants migrating by groundwater”. In: Water Resources Research 32.3 (1996), pp. 611– 621. [4] Alfredo Hua-Sing Ang, Wilson H Tang, et al. Probability concepts in engineering: emphasis on applications in civil & environmental engineering. Vol. 1. Wiley New York, 2007. [5] Mark Bakker, V. Post, Christian D Langevin, Joseph D Hughes, J T White, J J Starn, and Michael N Fienen. “Scripting MODFLOW model development using Python and FloPy”. In: Groundwater 54.5 (2016), pp. 733–739. [6] Francesco Ballio and Alberto Guadagnini. “Convergence assessment of numerical Monte Carlo simulations in groundwater hydrology”. In: Water resources research 40.4 (2004). 120 [7] P Bayer, A Comunian, D Höyng, and G Mariethoz. “Physicochemical proper- ties and 3D geostatistical simulations of the Herten and the Descalvado aquifer analogs”. In: (2015). [8] Peter Bayer, Alessandro Comunian, Dominik Höyng, and Gregoire Mariethoz. “Highresolutionmulti-faciesrealizationsofsedimentaryreservoirandaquiferanalogs”. In: Scientific data 2 (2015), p. 150033. [9] Jacob Bear and Alexander H-D Cheng. Modeling groundwater flow and contami- nant transport. Vol. 23. Springer Science & Business Media, 2010. [10] Nathan Bell and Jared Hoberock. “Thrust: A productivity-oriented library for CUDA”. In: GPU computing gems Jade edition. Elsevier, 2011, pp. 359–371. [11] A. Bellin and Y. Rubin. “HYDRO_GEN: A spatially distributed random field generator for correlated properties”. In: Stochastic Hydrology and Hydraulics 10.4 (1996), pp. 253–278. [12] A. Bellin, Y. Rubin, and A. Rinaldo. “Eulerian-Lagrangian approach for modeling of flow and transport in heterogeneous geological formations”. In: Water Resources Research 30.11 (1994), pp. 2913–2924. [13] A. Bellin, P. Salandin, and A. Rinaldo. “Simulation of dispersion in heterogeneous porous formations: Statistics, first-order theories, convergence of computations”. In: Water Resources Research 28.9 (1992), pp. 2211–2227. [14] David A Benson and Diogo Bolster. “Arbitrarily complex chemical reactions on particles”. In: Water Resources Research 52.11 (2016), pp. 9190–9200. 121 [15] E. W. Bhark, B. Jafarpour, and A. Datta-Gupta. “A generalized grid connectivity– based parameterization for subsurface flow model calibration”. In: Water Resources Research 47.6 (2011). [16] M. Bianchi, C. Zheng, C. Wilson, G. R. Tick, G. Liu, and S. M. Gorelick. “Spatial connectivity in a highly heterogeneous aquifer: From cores to preferential flow paths”. In: Water Resources Research 47.5 (2011). [17] Marco Bianchi and Daniele Pedretti. “An entrogram-based approach to describe spatial heterogeneity with applications to solute transport in porous media”. In: Water Resources Research (2018). [18] Marco Bianchi and Daniele Pedretti. “Geological entropy and solute transport in heterogeneous porous media”. In: Water Resources Research 53.6 (2017), pp. 4691– 4708. [19] J. M. Boggs. Hydrogeologic characterization of the MADE site. Electric Power Research Institute, 1990. [20] J. A. Bondy and U. S. R. Murty. Graph theory with applications. Vol. 290. Citeseer, 1976. [21] F. Boso, A. Bellin, and M. Dumbser. “Numerical simulations of solute transport in highly heterogeneous formations: A comparison of alternative numerical schemes”. In: Advances in water resources 52 (2013), pp. 178–189. [22] RD Burnett and EO Frind. “Simulation of contaminant transport in three di- mensions: 1. The alternating direction Galerkin technique”. In: Water Resources Research 23.4 (1987), pp. 683–694. 122 [23] Steven F Carle and Graham E Fogg. “Transition probability-based indicator geo- statistics”. In: Mathematical geology 28.4 (1996), pp. 453–476. [24] V. Cvetkovic, C. Carstens, J.-O. Selroos, and G. Destouni. “Water and solute transport along hydrological pathways”. In: Water Resources Research 48.6 (2012). [25] V. Cvetkovic, A. Fiori, and G. Dagan. “Solute transport in aquifers of arbitrary variability: A time-domain random walk formulation”. In: Water Resources Re- search 50.7 (2014), pp. 5759–5773. [26] G. Dagan. Flow and transport in porous formations. Springer Science & Business Media, 1989. [27] F. P. J. de Barros. “Evaluating the combined effects of source zone mass release rates and aquifer heterogeneity on solute discharge uncertainty”. In: Advances in Water Resources 117 (2018), pp. 140–150. [28] F. P. J. de Barros, A. Bellin, V. Cvetkovic, G. Dagan, and A. Fiori. “Aquifer heterogeneity controls on adverse human health effects and the concept of the hazard attenuation factor”. In: Water Resources Research 52.8 (2016), pp. 5911– 5922. [29] F. P. J. de Barros, A. Fiori, F. Boso, and A. Bellin. “A theoretical framework for modeling dilution enhancement of non-reactive solutes in heterogeneous porous media”. In: Journal of Contaminant Hydrology 175 (2015), pp. 72–83. [30] F. P. J. de Barros and W. Nowak. “On the link between contaminant source re- lease conditions and plume prediction uncertainty”. In: Journal of Contaminant Hydrology 116.1 (2010), pp. 24–34. 123 [31] F. P. J. de Barros and Y. Rubin. “A risk-driven approach for subsurface site char- acterization”. In: Water Resources Research 44.1 (2008). [32] J-R. de Dreuzy, A. Beaudoin, and J. Erhel. “Asymptotic dispersion in 2D hetero- geneous porous media determined by parallel numerical simulations”. In: Water Resources Research 43.10 (2007). [33] M. Dentz and F. P. J. de Barros. “Mixing-scale dependent dispersion for transport in heterogeneous flows”. In: Journal of Fluid Mechanics 777 (2015), pp. 178–195. [34] C. V. Deutsch. “Fortran programs for calculating connectivity of three-dimensional numerical models and for ranking multiple realizations”. In: Computers & Geo- sciences 24.1 (1998), pp. 69–76. [35] C. V. Deutsch and A. G. Journel. GSLIB, Geostatistical Software Library and User’s Guide. 1998. [36] E.W.Dijkstra.“Anoteontwoproblemsinconnexionwithgraphs”.In: Numerische mathematik 1.1 (1959), pp. 269–271. [37] D. Ding, D. A. Benson, A. Paster, and D. Bolster. “Modeling bimolecular reac- tions and transport in porous media via particle tracking”. In: Advances in Water Resources 53 (2013), pp. 56–65. [38] D. Fernàndez-Garcia, P. Trinchero, and X. Sanchez-Vila. “Conditional stochastic mapping of transport connectivity”. In: Water Resources Research 46.10 (2010). [39] D Fernàndez-Garcia and X Sanchez-Vila. “Optimal reconstruction of concentra- tions, gradients and reaction rates from particle distributions”. In: Journal of con- taminant hydrology 120 (2011), pp. 99–114. 124 [40] Daniel Fernàndez-Garcia, Tissa H Illangasekare, and Harihar Rajaram. “Differ- ences in the scale dependence of dispersivity and retardation factors estimated from forced-gradient and uniform flow tracer tests in three-dimensional physically and chemically heterogeneous porous media”. In: Water resources research 41.3 (2005). [41] Daniel Fernàndez-Garcia, Tissa H Illangasekare, and Harihar Rajaram. “Differ- ences in the scale-dependence of dispersivity estimated from temporal and spatial moments in chemically and physically heterogeneous porous media”. In: Advances in water resources 28.7 (2005), pp. 745–759. [42] A. Fiori. “Channeling, channel density and mass recovery in aquifer transport, with application to the MADE experiment”. In: Water Resources Research 50.12 (2014), pp. 9148–9161. [43] A. Fiori. “The Lagrangian concentration approach for determining dilution in aquifer transport: Theoretical analysis and comparison with field experiments”. In: Water Resources Research 37.12 (2001), pp. 3105–3114. [44] A. Fiori, A. Bellin, V. Cvetkovic, F. P. J. de Barros, and G. Dagan. “Stochastic modeling of solute transport in aquifers: From heterogeneity characterization to risk analysis”. In: Water Resources Research 51.8 (2015), pp. 6622–6648. [45] A. Fiori, F. Boso, F. P. J. de Barros, S. De Bartolo, A. Frampton, G. Severino, S. Suweis, and G. Dagan. “An indirect assessment on the impact of connectivity of conductivity classes upon longitudinal asymptotic macrodispersivity”. In: Water resources research 46.8 (2010). 125 [46] A. Fiori, G. Dagan, I. Jankovic, and A. Zarlenga. “The plume spreading in the MADE transport experiment: Could it be predicted by stochastic models?” In: Water Resources Research 49.5 (2013), pp. 2497–2507. [47] A. Fiori and I. Jankovic. “On preferential flow, channeling and connectivity in het- erogeneous porous formations”. In: Mathematical Geosciences 44.2 (2012), pp. 133– 145. [48] A. Fiori, I. Jankovic, and G. Dagan. “The impact of local diffusion upon mass arrival of a passive solute in transport through three-dimensional highly heteroge- neous aquifers”. In: Advances in Water Resources 34.12 (2011), pp. 1563–1573. [49] G. E. Fogg. “Groundwater flow and sand body interconnectedness in a thick, multiple-aquifer system”. In: Water Resources Research 22.5 (1986), pp. 679–694. [50] Graham E Fogg, Steven F Carle, and Christopher Green. “Connected-network paradigm for the alluvial aquifer system”. In: Special Papers-Geological Society of America (2000), pp. 25–42. [51] M. L. Fredman and R. E. Tarjan. “Fibonacci heaps and their uses in improved network optimization algorithms”. In: Journal of the ACM (JACM) 34.3 (1987), pp. 596–615. [52] C. C. Frippiat, T. H. Illangasekare, and G. A. Zyvoloski. “Anisotropic effective medium solutions of head and velocity variance to quantify flow connectivity”. In: Advances in Water Resources 32.2 (2009), pp. 239–249. [53] Olga Fuks, Fayadhoi Ibrahima, Pavel Tomin, and Hamdi A Tchelepi. “Analysis of travel time distributions for uncertainty propagation in channelized porous sys- tems”. In: Transport in Porous Media (2019), pp. 1–23. 126 [54] J. J. Gómez-Hernández and X-H. Wen. “To be or not to be multi-Gaussian? A re- flection on stochastic hydrogeology”. In: Advances in Water Resources 21.1 (1998), pp. 47–61. [55] H. Gotovac, V. Cvetkovic, and R. Andricevic. “Flow and travel time statistics in highly heterogeneous porous media”. In: Water Resources Research 45.7 (2009). [56] N. Gueting and A. Englert. “Hydraulic conditions at the source zone and their impact on plume behavior”. In: Hydrogeology Journal 21.4 (2013), pp. 829–844. [57] J. E. Guyer, D. Wheeler, and J. A. Warren. “FiPy: partial differential equations with python”. In: Computing in Science & Engineering 11.3 (2009), pp. 6–15. [58] Glenn E Hammond, Peter C Lichtner, and RT Mills. “Evaluating the performance of parallel subsurface simulators: An illustrative example with PFLOTRAN”. In: Water resources research 50.1 (2014), pp. 208–228. [59] Arlen W Harbaugh. MODFLOW-2005, the US Geological Survey modular ground- water model: the ground-water flow process. US Department of the Interior, US Geological Survey Reston, 2005. [60] Ahmed E Hassan and Mohamed M Mohamed. “On using particle tracking methods to simulate transport in single-continuum and dual continua porous media”. In: Journal of Hydrology 275.3-4 (2003), pp. 242–260. [61] C. V. Henri and D. Fernàndez-Garcia. “Toward efficiency in heterogeneous mul- tispecies reactive transport modeling: A particle-tracking solution for first-order network reactions”. In: Water Resources Research 50.9 (2014), pp. 7206–7230. 127 [62] C. V. Henri, D. Fernàndez-Garcia, and F. P. J. de Barros. “Probabilistic human health risk assessment of degradation-related chemical mixtures in heterogeneous aquifers: Risk statistics, hot spots, and preferential channels”. In: Water Resources Research 51.6 (2015), pp. 4086–4108. [63] Graciela S Herrera and George F Pinder. “Space-time optimization of groundwater quality sampling networks”. In: Water Resources Research 41.12 (2005). [64] JeffreyDHyman,AricHagberg,DaveOsthus,ShriramSrinivasan,HariViswanathan, and Gowri Srinivasan. “Identifying Backbones in Three-Dimensional Discrete Frac- ture Networks: A Bipartite Graph-Based Approach”. In: Multiscale Modeling & Simulation 16.4 (2018), pp. 1948–1968. [65] I. Jankovic, M. Maghrebi, A. Fiori, and G. Dagan. “When good statistical models of aquifer heterogeneity go right: The impact of aquifer permeability structures on 3D flow and transport”. In: Advances in Water Resources 100 (2017), pp. 199–211. [66] Igor Jankovic, Aldo Fiori, and Gedeon Dagan. “Effective conductivity of isotropic highly heterogeneous formations: Numerical and theoretical issues”. In: Water Re- sources Research 49.2 (2013), pp. 1178–1183. [67] Peter K Kitanidis. Introduction to geostatistics: applications in hydrogeology. Cam- bridge University Press, 1997. [68] Peter K Kitanidis. “On the geostatistical approach to the inverse problem”. In: Advances in Water Resources 19.6 (1996), pp. 333–342. [69] C. Knudby and J. Carrera. “On the relationship between indicators of geostatisti- cal, flow and transport connectivity”. In: Advances in Water Resources 28.4 (2005), pp. 405–421. 128 [70] Christen Knudby and Jesús Carrera. “On the use of apparent hydraulic diffusivity as an indicator of connectivity”. In: Journal of Hydrology 329.3-4 (2006), pp. 377– 389. [71] EM LaBolle. “RWHet: Random walk particle model for simulating transport in heterogeneous permeable media”. In: User’s manual and program documentation, Univ. of Calif., Davis (2006). [72] Eric M LaBolle and Graham E Fogg. “Role of molecular diffusion in contaminant migration and recovery in an alluvial aquifer system”. In: Dispersion in Heteroge- neous Geological Formations. Springer, 2001, pp. 155–179. [73] EricMLaBolle,GrahamEFogg,andAndrewFBTompson.“Random-walksimula- tion of transport in heterogeneous porous media: Local mass-conservation problem and implementation methods”. In: Water Resources Research 32.3 (1996), pp. 583– 593. [74] T.LeBorgne,M.Dentz,D.Bolster,J.Carrera,J-R.deDreuzy,andP.Davy.“Non- Fickian mixing: Temporal evolution of the scalar dissipation rate in heterogeneous porous media”. In: Advances in Water Resources 33.12 (2010), pp. 1468–1475. [75] R. Le Goc, J-R de Dreuzy, and P. Davy. “Statistical characteristics of flow as in- dicators of channeling in heterogeneous porous and fractured media”. In: Advances in Water Resources 33.3 (2010), pp. 257–269. [76] P. C. Leube, F. P. J. de Barros, W. Nowak, and R. Rajagopal. “Towards optimal allocation of computer resources: Trade-offs between uncertainty quantification, discretization and model reduction”. In: Environmental Modelling & Software 50 (2013), pp. 97–107. 129 [77] Peter C Lichtner, Sharad Kelkar, and Bruce Robinson. “New form of dispersion tensor for axisymmetric porous media with implementation in particle tracking”. In: Water Resources Research 38.8 (2002). [78] G. Liu, C. Zheng, and S. M. Gorelick. “Evaluation of the applicability of the dual-domain mass transfer model in porous media containing connected high- conductivity channels”. In: Water Resources Research 43.12 (2007). [79] Julien Maillot, Philippe Davy, Romain Le Goc, Caroline Darcel, and Jean-Raynald De Dreuzy. “Connectivity, permeability, and channeling in randomly distributed and kinematically defined discrete fracture network models”. In: Water Resources Research 52.11 (2016), pp. 8526–8545. [80] Nataliia Makedonska, Scott L Painter, Quan M Bui, Carl W Gable, and Satish Karra. “Particle tracking approach for transport in three-dimensional discrete frac- ture networks”. In: Computational Geosciences 19.5 (2015), pp. 1123–1137. [81] Gregoire Mariethoz, Philippe Renard, and Julien Straubhaar. “The direct sampling method to perform multiple-point geostatistical simulations”. In: Water Resources Research 46.11 (2010). [82] R. M. Maxwell, W. E. Kastenberg, and Y. Rubin. “A methodology to integrate site characterization information into groundwater-driven health risk assessment”. In: Water Resources Research 35.9 (1999), pp. 2841–2855. [83] M. Moslehi, F. P. J. de Barros, F. Ebrahimi, and M. Sahimi. “Upscaling of solute transport in disordered porous media by wavelet transformations”. In: Advances in Water Resources 96 (2016), pp. 180–189. 130 [84] M. Moslehi, R. Rajagopal, and F. P. J. de Barros. “Optimal allocation of compu- tational resources in hydrogeological models under uncertainty”. In: Advances in Water Resources 83 (2015), pp. 299–309. [85] W Nowak, FPJ De Barros, and Y Rubin. “Bayesian geostatistical design: Task- driven optimal site investigation when the geostatistical model is uncertain”. In: Water Resources Research 46.3 (2010). [86] SL Painter, CW Gable, and S Kelkar. “Pathline tracing on fully unstructured control-volume grids”. In: Computational Geosciences 16.4 (2012), pp. 1125–1134. [87] E. Pardo-Igúzquiza and P. A. Dowd. “CONNEC3D: a computer program for con- nectivity analysis of 3D random set models”. In: Computers & Geosciences 29.6 (2003), pp. 775–785. [88] D Pedretti, D Fernàndez-Garcia, D Bolster, and X Sanchez-Vila. “On the forma- tion of breakthrough curves tailing during convergent flow tracer tests in three- dimensional heterogeneous aquifers”. In: Water Resources Research 49.7 (2013), pp. 4157–4173. [89] Carolyn L Phillips, Joshua A Anderson, and Sharon C Glotzer. “Pseudo-random number generation for Brownian Dynamics and Dissipative Particle Dynamics sim- ulations on GPU devices”. In: Journal of Computational Physics 230.19 (2011), pp. 7191–7201. [90] Eileen Poeter and Peter Townsend. “Assessment of critical flow path for improved remediation management”. In: Groundwater 32.3 (1994), pp. 439–447. 131 [91] Maryam Rahbaralam, Daniel Fernàndez-Garcia, and Xavier Sanchez-Vila. “Do we really need a large number of particles to simulate bimolecular reactive transport with random walk methods? A kernel density estimation approach”. In: Journal of Computational Physics 303 (2015), pp. 95–104. [92] Nicolas Remy, Alexandre Boucher, and Jianbing Wu. Applied geostatistics with SGeMS: a user’s guide. Cambridge University Press, 2009. [93] P. Renard and D. Allard. “Connectivity metrics for subsurface flow and transport”. In: Advances in Water Resources 51 (2013), pp. 168–196. [94] Hannes Risken. Fokker-Planck Equation. Springer, 1996. [95] M. Riva, L. Guadagnini, A. Guadagnini, T. Ptak, and E. Martac. “Probabilistic study of well capture zones distribution at the Lauswiesen field site”. In: Journal of Contaminant Hydrology 88.1 (2006), pp. 92–118. [96] C. B. R. Rizzo. lazymole. https://github.com/GerryR/lazymole. 2019. [97] C. B. R. Rizzo. PAR 2 . https://github.com/GerryR/par2. 2018. [98] C. B. Rizzo and F. P. J. de Barros. “Minimum hydraulic resistance and least resistance path in heterogeneous porous media”. In: Water Resources Research 53.10 (2017), pp. 8596–8613. [99] Calogero B Rizzo and Felipe PJ de Barros. “Minimum Hydraulic Resistance Uncer- tainty and the Development of a Connectivity-Based Iterative Sampling Strategy”. In: Water Resources Research 55.7 (2019), pp. 5593–5611. 132 [100] Calogero B Rizzo, Aiichiro Nakano, and Felipe P J de Barros. “PAR2: Parallel Random Walk Particle Tracking Method for solute transport in porous media”. In: Computer Physics Communications (2019). [101] Y. Rubin. “A hierarchical method for the design of water allocation and water distribution networks based on graph-theory”. In: Irrigation and Water Allocation (Proceedings of the Vancouver Symposium, August 1987) 169 (1987), pp. 207–220. [102] Y. Rubin. Applied stochastic hydrogeology. Oxford University Press, 2003. [103] Yoram Rubin, XingyuanChen, Haruko Murakami, andMelanie Hahn. “A Bayesian approach for inverse modeling, data assimilation, and conditional simulation of spatial random fields”. In: Water Resources Research 46.10 (2010). [104] Yoram Rubin and Andre G Journel. “Simulation of non-Gaussian space random functions for modeling transport in groundwater”. In: Water Resources Research 27.7 (1991), pp. 1711–1721. [105] P. Salamon, D. Fernàndez-Garcia, and J. J. Gómez-Hernández. “A review and numerical assessment of the random walk particle tracking method”. In: Journal of contaminant hydrology 87.3 (2006), pp. 277–305. [106] X. Sánchez-Vila, J. Carrera, and J. P. Girardi. “Scale effects in transmissivity”. In: Journal of Hydrology 183.1 (1996), pp. 1–22. [107] S.E.SillimanandA.L.Wright.“Stochasticanalysisofpathsofhighhydrauliccon- ductivity in porous media”. In: Water Resources Research 24.11 (1988), pp. 1901– 1910. 133 [108] Guillem Sole-Mari, Daniel Fernàndez-Garcia, Paula Rodríguez-Escales, and Xavier Sanchez-Vila. “A KDE-Based Random Walk Method for Modeling Reactive Trans- port With Complex Kinetics in Porous Media”. In: Water Resources Research 53.11 (2017), pp. 9019–9039. [109] M. R. Soltanian, R. W. Ritzi, C. C. Huang, and Z. Dai. “Relating reactive solute transport to hierarchical and multiscale sedimentary architecture in a Lagrangian- based transport model: 1. Time-dependent effective retardation factor”. In: Water Resources Research 51.3 (2015), pp. 1586–1600. [110] Veljko Srzic, Vladimir Cvetkovic, Roko Andricevic, and Hrvoje Gotovac. “Impact ofaquiferheterogeneitystructureandlocal-scaledispersiononsoluteconcentration uncertainty”. In: Water Resources Research 49.6 (2013), pp. 3712–3728. [111] D. M. Tartakovsky. “Assessment and management of risk in subsurface hydrology: A review and perspective”. In: Advances in Water Resources 51 (2013), pp. 247– 260. [112] D. M. Tartakovsky. “Probabilistic risk analysis in subsurface hydrology”. In: Geo- physical Research Letters 34.5 (2007). [113] H. C. Thomas, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algo- rithms. Vol. 6. MIT press Cambridge, 2001. [114] P. Trinchero, X. Sánchez-Vila, and D. Fernàndez-Garcia. “Point-to-point connec- tivity, an abstract concept or a key issue for risk assessment studies?” In: Advances in Water Resources 31.12 (2008), pp. 1742–1753. 134 [115] A. R. Tyukhova, W. Kinzelbach, and M. Willmann. “Delineation of connectivity structures in 2-D heterogeneous hydraulic conductivity fields”. In: Water Resources Research 51.7 (2015), pp. 5846–5854. [116] A. R. Tyukhova and M. Willmann. “Connectivity metrics based on the path of smallest resistance”. In: Advances in Water Resources 88 (2016), pp. 14–20. [117] Hari S Viswanathan, J D Hyman, Satish Karra, Daniel O’Malley, Shriram Srini- vasan, A Hagberg, and Gowri Srinivasan. “Advancing Graph-Based Algorithms for Predicting Flow and Transport in Fractured Rock”. In: Water Resources Research 54.9 (2018), pp. 6085–6099. [118] Andrew W Western, Günter Blöschl, and Rodger B Grayson. “Toward capturing hydrologically significant connectivity in spatial patterns”. In: Water Resources Research 37.1 (2001), pp. 83–97. [119] C. Zheng and G. D. Bennett. Applied contaminant transport modeling. Vol. 2. Wiley-Interscience New York, 2002. [120] C. Zheng, M. Bianchi, and S. M. Gorelick. “Lessons learned from 25 years of re- search at the MADE site”. In: Ground Water 49.5 (2011), pp. 649–662. [121] Chunmiao Zheng. “MT3D, A modular three-dimensional transport model”. In: (1990). [122] Chunmiao Zheng and P Patrick Wang. MT3DMS: a modular three-dimensional multispecies transport model for simulation of advection, dispersion, and chemical reactions of contaminants in groundwater systems; documentation and user’s guide. Tech. rep. Alabama Univ University, 1999. 135 [123] B. Zinn and C. F. Harvey. “When good statistical models of aquifer heterogeneity go bad: A comparison of flow, dispersion, and mass transfer in connected and multivariate Gaussian hydraulic conductivity fields”. In: Water Resources Research 39.3 (2003). 136 Appendix A Lower Bound Estimation of the Early Arrival Times To better illustrate how the minimum hydraulic resistance can be used to estimate early arrivals, consider the case of pure advective flows (i.e., Pe ! 1). In this case, the minimum hydraulic resistance is a lower bound of the early time arrivals. We demonstrate thisbyconsideringtwocontrolplanesAandB situatedatadistancer apart. Theseplanes are orthogonal to the average flowhv(x)i where the angled brackets denote the average operator. Let be the trajectory of the fastest particle (conservative) connecting A and B. This implies that a particle starting from a2 A traveling along will arrive to b2 B before any other particle released from A at the same time. Therefore, can be parametrized using the Lagrangian formulation _ x =v(x(t)) with x(0) =a. Let t be the time that a particle needs to travel from A to B through (i.e., x(t ) = b). Since is one of the possible paths connecting A and B, the minimum hydraulic resistance between A and B, as defined in Equation (3.3), is smaller than the hydraulic resistance 137 on . Mathematically, this can be expressed as: R A (B) Z 1 K d : (A.1) By rewriting the line integral in (A.1) in terms of the Lagrangian trajectory x(t) we obtain: Z 1 K d = Z t 0 1 K(x(t)) d(x(t)): (A.2) Using Darcy’s law (2.3) (and setting n e = 1 for simplicity), the differential of the velocity along the trajectory becomes: d(x(t)) =K(x(t))jr'(x(t))jdt: (A.3) Substituting (A.3) into (A.2) and multiplying and dividing the result by t we obtain: Z t 0 1 K(x(t)) d(x(t)) =t 1 t Z t 0 jr'(x(t))jdt : (A.4) Finally, using (A.1) and (A.4) we obtain the following inequality: R A (B)t ; (A.5) where: 1 t Z t 0 jr'(x(t))jdt: (A.6) The value represents the temporal average of the head gradient that a particle traveling along the streamline experiences. From (A.6), we know that max x jr'(x)j. Thus, 138 the minimum hydraulic resistance between the two control planes is effectively a lower bound of the early time arrival t according to (A.5). Understanding the behavior of over different scenarios would potentially allow to use the minimum hydraulic resistance as a computationally efficient lower bound estimation of the early time arrivals without the need of transport simulations. 139 Appendix B Analytical derivation of the dispersion tensor D In this section, we prove that choosing B as in equation (2.12) satisfies the relation 2D =BB T (equation (2.8)). If i andv i are the eigenvalues and orthogonal eigenvectors of D, then: BB T = 3 X i=1 p 2 i v i v T i v T i v i ! 3 X i=1 p 2 i v i v T i v T i v i ! T = 3 X i=1 p 2 i v i v T i v T i v i ! 3 X i=1 p 2 i (v i v T i ) T v T i v i ! : Since v i v T i is symmetric, then: = 3 X i=1 p 2 i v i v T i v T i v i ! 3 X i=1 p 2 i v i v T i v T i v i ! = 3 X i=1 3 X j=1 p 2 i p 2 j v i v T i v j v T j v T i v i v T j v j : 140 Since v i are orthogonal, v T i v j = ij v T i v i , therefore: = 3 X i=1 2 i v i v T i v T i v i : This final form is equal to 2D, in fact for every eigenvector v j (i.e., Dv j = j v j ): BB T v j = 3 X i=1 2 i v i v T i v T i v i v j = 2 j v j = 2Dv j : 141
Abstract (if available)
Abstract
Spatial heterogeneity of the hydrogeological properties (e.g., hydraulic conductivity and porosity) has a key role in controlling groundwater flow and solute transport processes. The multi-scale variability of the hydraulic conductivity (K) causes complex flow patterns that control solute mass fluxes and arrival times. Anomalous solute transport behavior has been observed in highly heterogeneous K-fields, due to the presence of fast flow conduits (i.e., preferential flow paths) and low conductive layers controlling connectivity properties. In this dissertation, the minimum hydraulic resistance (MHR) and the corresponding least resistance path (LRP) are used to study the statistical properties of heterogeneous K-fields connectivity. The MHR and the LRP are static quantities since they do not depend on flow and transport equations. A fast graph theory-based methodology to compute the MHR and LRP is developed. One of the main advantages of using static quantities such as the MHR is that important features of the hydrogeological system (e.g., first time arrival and high conductivity channels) can be extracted without resorting to flow and transport simulations. Finally, an iterative data acquisition strategy is developed, which can be utilized to identify the LRP (which is linked to preferential flow channels) in real sites. A synthetic benchmark test is presented, showing the advantages of the proposed sampling strategy when compared to a regular sampling strategy. Using the iterative data sampling strategy, the first time arrival uncertainty is reduced by more than 40% (when compared to the regular sampling strategy), while maintaining site characterization efforts constant.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
On the transport dynamics of miscible fluids in porous media under different sources of disorder
PDF
Multiscale and multiresolution approach to characterization and modeling of porous media: From pore to field scale
PDF
Machine-learning approaches for modeling of complex materials and media
PDF
Efficient stochastic simulations of hydrogeological systems: from model complexity to data assimilation
PDF
Effective flow and transport properties of deforming porous media and materials: theoretical modeling and comparison with experimental data
PDF
Efficient simulation of flow and transport in complex images of porous materials and media using curvelet transformation
PDF
Flow and thermal transport at porous interfaces
PDF
Continuum modeling of reservoir permeability enhancement and rock degradation during pressurized injection
PDF
Hydraulic fracturing and the environment: risk assessment for groundwater contamination from well casing failure
PDF
Chemical and mechanical deformation of porous media and materials during adsorption and fluid flow
PDF
Novel queueing frameworks for performance analysis of urban traffic systems
Asset Metadata
Creator
Rizzo, Calogero Benedetto
(author)
Core Title
Efficient connectivity assessment of heterogeneous porous media using graph theory
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Civil Engineering
Publication Date
08/05/2020
Defense Date
03/06/2020
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
connectivity,graph theory,heterogeneous porous media,OAI-PMH Harvest,solute transport
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
de Barros, Felipe P. J. (
committee chair
), Lynett, Patrick J. (
committee member
), Nakano, Aiichiro (
committee member
)
Creator Email
calogerr@usc.edu,gerry.rizzo89@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-359088
Unique identifier
UC11666187
Identifier
etd-RizzoCalog-8885.pdf (filename),usctheses-c89-359088 (legacy record id)
Legacy Identifier
etd-RizzoCalog-8885.pdf
Dmrecord
359088
Document Type
Dissertation
Rights
Rizzo, Calogero Benedetto
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Tags
connectivity
graph theory
heterogeneous porous media
solute transport