Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Kinetic Monte Carlo simulations for electron transport in redox proteins: from single cytochromes to redox networks
(USC Thesis Other)
Kinetic Monte Carlo simulations for electron transport in redox proteins: from single cytochromes to redox networks
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
KINETIC MONTE CARLO SIMULATIONS FOR ELECTRON TRANSPORT IN REDOX PROTEINS: FROM SINGLE CYTOCHROMES TO REDOX NETWORKS by Hye Suk Byun A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (PHYSICS) August 2017 Copyright 2017 Hye Suk Byun To my dear family ii Contents List of Tables vi List of Figures vii Abstract xix 1 Introduction 1 1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2.1 Scanning tunneling microscopy . . . . . . . . . . . . . . . . 6 1.2.2 Kinetic Monte Carlo algorithm . . . . . . . . . . . . . . . . 7 1.2.3 Parallel kMC algorithm . . . . . . . . . . . . . . . . . . . . 9 1.3 Chapters overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2 Single Molecule Conductance Measurements and Stochastic Sim- ulations of the Bacterial Decaheme Cytochrome MtrF 17 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.3 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . 21 2.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 2.5 Appendix I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 2.6 Appendix II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 3 A Framework for Stochastic Simulations and Visualization of the Mtr-Omc Cytochrome Complex 48 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 3.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 3.2.1 Structural modeling, MD simulations, docking and binding- free energy computations . . . . . . . . . . . . . . . . . . . . 53 3.2.2 Divide-conquer-recombine kinetic Monte Carlo simulations of electron transfer . . . . . . . . . . . . . . . . . . . . . . . 59 3.2.3 Visualization of electron transfer . . . . . . . . . . . . . . . 60 iii 3.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.3.1 Binding free energy . . . . . . . . . . . . . . . . . . . . . . . 62 3.3.2 Nonequilibrium phase transitions . . . . . . . . . . . . . . . 63 3.3.3 Animation of electron-transfer dynamics . . . . . . . . . . . 67 3.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 3.5 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 3.5.1 Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 3.5.2 VMD tcl script . . . . . . . . . . . . . . . . . . . . . . . . . 70 4 AScalableDivide-and-ConquerAlgorithmforKineticMonteCarlo Simulations of Long-Time Dynamics 72 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 4.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 4.2.1 Kinetic Monte Carlo simulation . . . . . . . . . . . . . . . . 76 4.2.2 Divide-and-conquer KMC algorithm . . . . . . . . . . . . . . 77 4.2.3 Dual cell linked-list algorithm . . . . . . . . . . . . . . . . . 80 4.2.4 Parallelization . . . . . . . . . . . . . . . . . . . . . . . . . . 81 4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 4.3.1 Scalability tests . . . . . . . . . . . . . . . . . . . . . . . . . 84 4.3.2 Application example . . . . . . . . . . . . . . . . . . . . . . 92 4.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 5 Redox Network Simulations: Examining Multiple Scenarios of Cytochrome Density, Orientation, and Disorder 95 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 5.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 5.2.1 Redox network scenario . . . . . . . . . . . . . . . . . . . . 99 5.2.2 Coarse-grained molecular dynamics . . . . . . . . . . . . . . 102 5.2.3 Percolation analysis . . . . . . . . . . . . . . . . . . . . . . . 103 5.2.4 Divide-conquer-recombine kinetic Monte Carlo simulations . 106 5.2.5 ET rate parameters for scenario model and visualization . . 108 5.2.6 Simulation workflow . . . . . . . . . . . . . . . . . . . . . . 109 5.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 110 5.3.1 Comparison electron flux with percolation result . . . . . . . 110 5.3.2 Effect of cytochrome order, density and length of nanowires on electron flux . . . . . . . . . . . . . . . . . . . . . . . . . 111 5.3.3 Direct inter-protein tunneling vs. Diffusion-linking mechanism113 5.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 5.5 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 Conclusions 137 iv Reference List 141 v List of Tables 3.1 Binding Free Energy Computation from MM/PBSA . . . . . . . . . 62 4.1 Parallel DCKMC algorithm . . . . . . . . . . . . . . . . . . . . . . 85 5.1 Percolation algorithm for finding percolating heme cluster . . . . . 105 vi List of Figures 1.1 Redoxtowershowingalistofelectrondonor-accepterpairsandtheir associated redox potentials. The red arrows indicates that electron flows from a redox molecule with low redox potential (low affinity to electrons) to a redox molecule with high redox potential (high affinity to electrons) along the electron transport chain. Microor- ganisms can be categorized according to these electron donor and acceptor pair. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 Possible extracellular electron transfer mechanisms from the cell to the minerals or electrode. Direct electron transfer via (i) bacterial outer surface components such as c-type cytochromes or (ii) conduc- tive cellular appendages. (iii) Indirect electron transfer via a shuttle released by the microbial catalyst. . . . . . . . . . . . . . . . . . . . 3 vii 1.3 Mtr extracellular electron transfer pathway of S. oneidensis MR- 1. CymA is a tetraheme c-type cytochrome, which is anchored in the inner membrane where it oxidizes quinol (QH 2 ) to quinone (Q) and transfers electrons to MtrA. MtrA is a decahaem c-Cyt that is thought to be embedded in MtrB, a trans-outer membrane and porin-like protein. MtrC and OmcA are the outer membrane deca- heme c-Cyts that transfer electrons to Fe(III) minerals. The yellow arrow indicates the direction of electron flow. . . . . . . . . . . . . . 4 1.4 Crystal structure of MtrF integrated into MtrB in a possible orien- tation. This orientation suggests heme 10 to accept electrons from MtrD/MtrA and heme 5 to be the solvent-exposed terminus for electron output to solid substrates, soluble substrates, or electron shuttles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.5 Schematic of a scanning tunneling microscopy (STM) approach to measuring the current-voltage (I-V) relationship in individual MtrF molecules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.6 Flowchart of sequential KMC algorithm. . . . . . . . . . . . . . . . 8 1.7 A schematic of parallel kinetic Monte Carlo algorithm in nodes. Each subdomain follows sequential KMC with extra step of adding null events to allow globally synchronous time evolution. This par- allel algorithm uses master-worker approach in which master dis- tributes partitions of data to the worker and collect the results to process it for next step and worker sends the results to the master. 12 viii 2.1 (a) Crystal structure of MtrF after Clarke et al.[16] The schematic shows the heme numbering scheme and the proposed integration of MtrF into the Shewanella outer membrane (OM). Electrons trans- ferred from periplasmic cytochromes are injected to heme 10, and hop along the multiheme assembly to heme 5, which serves as the solventexposedterminusforelectronejectiontosolidorsolubleelec- tron acceptors. (b) A snapshot illustrating the kinetic Monte Carlo approach to calculate the heme occupation density and net electron flux through MtrF. Each heme site can be occupied by at most one electron, and the next snapshot is stochastically chosen depending on the rates associated with each possible ET step (detailed in text). 18 2.2 (a) Phase diagram of the time-averaged electron occupation density for all hemes as a function of the incoming (α) and outgoing (β) ET rates to heme 10 and from heme 5, respectively. (b) The cor- responding phase diagram of the net electron flux J in the α−β plane reveals a low- and maximum- current phase. (c) The time- averaged electron occupation profile of each heme at three chosen α−β locations corresponding to points 1-3 from (a), illustrating the low-density phase, high-density phase, and maximum-current phase. The blue to red scale in (c) indicates increasing probability of the heme site being occupied by an electron during the simula- tion. Calculation parameters: α 0 =β 0 = 0 and heme-to-heme rates (k ij ) obtained based on MD and QM/MM calculations of λ,4G, and H ab [17] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 ix 2.3 (a) Schematic of a scanning tunneling microscopy (STM) approach to measuring the current-voltage (I-V) relationship in individual MtrF molecules. (b) Kinetic Monte Carlo simulations of the I- V curves for asymmetric (red and blue traces where r = 0.5 and 2, respectively) and symmetric (black trace, r = 1) contact con- ditions. Calculation parameters: C elec = 10 5 s −1 , λ n = 0.4eV, δ = δ 10 +δ 5 = 0.3, heme-to-heme rates (k ij ) obtained based on MD and QM/MM calculations, [17]4G modified to include the added voltage bias as described in the text. (c) Phase diagram of the time-averaged electron occupation density for all hemes as a function of the interfacial voltage drop fractions δ 10 and δ 5 at the electrode-heme contacts with increasing voltageV. The color bar is the same as in Figure 2.2a. . . . . . . . . . . . . . . . . . . . . . . 26 x 2.4 (a) Representative scanning tunneling microscopy (STM) image of MtrF monolayers on atomically flat Au(111) terraces. Individual MtrF molecules can be resolved. STM imaging conditions: 1 V bias voltage and 2 nA current setpoint. (b) Tunneling spectroscopy current-voltage (I-V) spectra for both the high and low humid- ity conditions (blue and red traces, left at ambient humidity for < 30 minutes or > 4 hours, respectively. Averages and uncertainties obtained from 300 individual I-V spectra). Grey lines are obtained by KMC simulations using the phenomenological form of the non- adiabatic ET rate equation (7) to calculate the heme-to-heme ET rates assuming = 0.8 eV or 0.55 eV. Additional calculation param- eters: = 1 Å −1 , λ n = 1/2, C elec = 10 10 s −1 , δ = 0.3, R from the crystal structure, and4G accounted for the bias V as described in the text. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 3.1 (a) Atomic force microscope (AFM) image of a Shewanella oneiden- sis MR-1 cell and attached bacterial nanowires. The scale bar is 1 μm. (b) Structural model of an outer-membrane Mtr-Omc complex, where each yellow dot represents a heme group. Numerals show the numbering of the 10 hemes for both MtrC/F and OmcA. The inset shows that, in an oxidation reaction, ejection of an electron (e − ) converts the iron atom in a heme group from Fe 2+ to Fe 3+ , whereas, in a reduction reaction, injection ofe − converts it from Fe 3+ to Fe 2+ . (c) Hypothetical model of a bacterial nanowire, in which a lattice of Mtr-Omc complexes mediates long-distance electron transfer. The figure shows a central slice of the nanowire (IM: inner membrane, PP: periplasm, OM:outer membrane). . . . . . . . . . . . . . . . . 49 xi 3.2 (a) Workflow of the VizBET framework for biological ET visualiza- tion. Major new components developed for VizBET are represented by squares with thick lines. C-RANK consists of the further screen- ing of ZDOCK results, re-solvation MD (molecular dynamics) and MM/PBSA (molecular mechanics/Poisson-Boltzmann surface area) (b) Detailed workflow of the DCR-KMC (divide-conquer-recombine kinetic Monte Carlo) simulation component. QM/MM: quantum mechanics/molecular mechanics; DCDFT: divide-and-conquer den- sity functional theory. . . . . . . . . . . . . . . . . . . . . . . . . . 53 3.3 Thermodynamic cycle to calculate the binding free energy of two proteins, A and B . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 3.4 TheTop-rankedMtrF-OmcAconfigurationaccordingtoourETand orientation criteria, showing the 20 heme arrangement within the complex. Hemes 5 of both proteins define the inter-cytochrome contact with a 5.58 Å edge-to-edge distance. . . . . . . . . . . . . . 62 3.5 (a) Phase diagram of the time-averaged electron occupation density n for all 20 hemes as a function of the incoming (α) and outgoing (β) ET rates. The white dotted lines delineate the low-density (LD), high-density (HD) and maximum current (MC) phases. The white dashed circle corresponds to experimentally estimated respiration rates. (b) The corresponding phase diagram of the net electron flux j. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 xii 3.6 (a) Visualizing the KMC simulation of ET dynamics in the MtrF- OmcA complex. The Fe atoms are represented as spheres, and are overlaid with the NewCartoon representation of the entire protein complex, where MtrF and OmcA are represented in red and blue shades, respectively. Fe 2+ and Fe 3+ are represented by red and blue spheres, respectively. (a) A single snapshot from the final 500 KMC steps of the simulation. (b) A snapshot from the same simulation as (a), where an ET event is represented by a directed edge. (c) Time-averaged electron occupation, of each Fe atom (as defined by Eq. (13)). Injection/ejection parameters: α = β = 10 5 s −1 . . . . . . 66 4.1 Heme groups in a dimer of decaheme cytochromes, where the Fe atom within each heme is represented by a yellow sphere. Each of the two cytochromes (colored magenta and cyan) contains 10 hemes. 74 4.2 Two-dimensionalschematicofspatialdecompositionintoanarrayof 32 domains in the x and y directions. Each cytochrome in a lattice of cytochrome-dimers is represented by its 10 Fe positions (colored yellow). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 xiii 4.3 (a) Two-dimensional schematic of the blocked domain decomposi- tion in DCKMC. Each block of domains is delineated by solid lines, and is subdivided into 2× 2 quadrants labeled (or colored) 0, 1, 2 or 3. (b) Caching of heme occupancy in parallel DCKMC with 3 × 3 spatial subsystems. The central subsystem is a block of 2× 2 domains as delineated by thick lines. Each small square is a cell slightly larger than q cut , used to compute electron- hopping rates between neighboring hemes. A heme in a dark gray cell only inter- acts with those in the nearest neighbor cells colored in light gray. Accordingly, thecentralsubsystemonlyneedstobeaugmentedwith one layer of cells with the heme information in them cached from the neighbor subsystems. . . . . . . . . . . . . . . . . . . . . . . . 79 4.4 (a) Wall-clock time per KMC simulation step of the DCKMC code, with scaled workloads, N/P = 127,776 (black), 1,168,032 (blue), and 4,116,000 (red), as a function of the number of cores P (P = 16, ..., 1,024). (b) Weak-scaling parallel efficiency as a function of P for N/P = 4,116,000 (red), compared with the ideal efficiency (black dash-dotted line). . . . . . . . . . . . . . . . . . . . . . . . . 88 4.5 (a) Wall-clock time per KMC simulation step (red) and speedup (blue) of the DCKMC code with strong scaling - 3,145,728-heme system on P cores of Intel Xeon. The blue dash-dotted line shows the ideal speedup. (b) Wall-clock time per KMC simulation step (red) and simulated-time speedup (blue) of the DCKMC code with strong scaling - 3,145,728-heme system on P cores of Intel Xeon. The blue dash-dotted line shows linear speedup. . . . . . . . . . . 91 xiv 4.6 (a) A simulation snapshot of the electron transport process along a 1.12 μm long bacterial nanowire containing 1,280 MtrC deca- heme cytochromes (12,800 hemes). Electrons are injected from the nanowire’s leftmost heme sites with injection rate of 10 7 s −1 and ejected at the rightmost heme sites with ejection rate of 10 7 s −1 . Occupation number of each heme site is color-coded. (b) Time evo- lution of the electron flux along the nanowire. The electron flux is defined as the number of electrons injected/ejected per unit time through an area perpendicular to the flux direction. . . . . . . . . 92 5.1 Electron cryotomography of Shewanella nanowires. (a) One tomo- graphic slice resolving the lipid bilayer and the connection between two vesicular features (bilayer thickness 41 Å). (b) Isosurface rep- resentation of a 3-dimensional reconstruction (from multiple slices) of a longer nanowire. These methods are currently being used to measure the spacing and nanoscale localization of individual mem- brane cytochromes [1]. . . . . . . . . . . . . . . . . . . . . . . . . . 96 5.2 (a) Workflow of the redox network simulation framework for bio- logical ET scenarios on spacing and nanoscale localization of indi- vidual cytochrome. DCR-KMC: divide-conquer- recombine kinetic Monte Carlo (b) Flowchart of calculating ET parameters based on the choice of QM/MM and mechanism of interaction between neighboring redox partners QM/MM: quantum mechanics/molecu- lar mechanics; DCDFT: divide-and-conquer density functional theory. 98 xv 5.3 Detailed description of the multiple scenarios with a set of param- eters and corresponding snapshot of the simulated cytochrome dis- tribution from coarse-grained molecular dynamics. Red solid line represents the irregular shape of MtrC approximated by∼50 parti- cles that outline the molecular structure. . . . . . . . . . . . . . . . 99 5.4 Density distribution curves of heme-to-heme distance for time T = [T μs ,T total ], where T total is the total time in the simulation. Each density distribution curve is calculated from cytochrome distribu- tion changes over time (trajectories). Based on the curve, the level of disorder of MtrC molecule is quantified by calculating the prob- ability of finding heme-to-heme distance less than 10 Å (area under the curve up to 10 Å. At each time point from T 1 to T 3 , the cal- culated probability is 0.91, 0.87, and 0.74, respectively. (i.e. 0.91 represents 91% of inter-protein distances are within 10 Å). Note that heme pairs within distance of 20 Å are collected to take into account interaction within the cut-off radius. After T 3 , the distri- bution converges to the probability of 0.74. . . . . . . . . . . . . . . 100 5.5 A slice from a percolation simulation on a heme network in two dimensional space to describe percolation on a cylindrical surface of nanowire. Each cluster is represented as different color. Percolating cluster can be found if same heme molecule is found in different image while searching neighbor hemes. Red circle represent same heme molecule in two different images. . . . . . . . . . . . . . . . . 103 xvi 5.6 Percolation characteristics for given multiheme cytochrome MtrC configurations specified by MtrC packing density, cytochrome orien- tation order, and whether direct inter-protein tunneling or diffusion- linking mechanism is dominant in electron transfer. Elapsed time to percolation is obtained as density and orientation order changes for each mechanism, diffusion-linking and direct inter-protein tunneling only. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 5.7 A snapshot of showing cluster size by colors during the dynamic percolation transition at time t = 1 ms. (a) high density with ordered orientation. (b) high density with disordered orientation. (c) low density with disordered orientation. From (a) to (c), Cluster size changes as heme connectivity changes from (a) to (c). . . . . . 106 5.8 A crystal structure of MtrC and a simulation snapshot of the elec- tron transport process along a 1 μm-long bacterial nanowire con- taining 1280 MtrC decaheme cytochromes (total 12800 hemes in the system). Electrons flow from left to right through hopping heme sites. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 5.9 A schematic of spatial decomposition in simulation procedure. Ini- tial system consists of four domains, and it extends layer by layer after the existing system reaches the steady states. Empty proces- sors are assigned at one end of the current system to avoid false electron injection or ejection due to periodic boundary condition (refer to the chapter 4 about periodic boundary condition). . . . . . 110 xvii 5.10 (a) Steady-state electron flux dependence on orientation order of cytochromes at different length of bacterial nanowire. The level of cytochrome disorder is quantified by the probability of finding an inter-protein heme pair within a distance less than 10 Å. (refer to Figure5.4). (b)Steady-stateelectronfluxdependenceoncytochrome density at different length of bacterial nanowire. All the electron flux data is computed based on diffusion-linking mechanism with a diffusion coefficient of 10 μm 2 /s. . . . . . . . . . . . . . . . . . . . 111 5.11 (a) Comparison of direct inter-protein tunneling electron flux with diffusion-linked electron flux at D = 1 μm 2 /s. For high order/- density, it shows same order 10 4 s −1 - 10 5 s −1 of electron flux in both direct tunneling and diffusion-linking mechanism, indicating that direct tunneling is a dominant electron transfer mechanism. Whereas for lower order/density, the restriction to only direct tun- neling rapidly decreased the overall flux to zero. Allowing for diffu- sion between proteins sustains an electron flux of 10 3 s −1 - 10 4 s −1 even at lower order/density. Electron flux is indicated by the color bar. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 5.12 Effect of diffusion coefficient D on electron flow for 60 nm-long nanowire heme network where D = 1000 μm 2 /s (top left), 100 μm 2 /s (top right), 10 μm 2 /s (bottom left), and 1 μm 2 /s (bottom right). Calculation parameters: β = 1 Å, λ = 0.94 eV. Each color represents the resulting steady-state electron flux from DCKMC simulation on different order/density cytochrome distributions. . . . 114 xviii Abstract Dissimilatory metal reducing bacteria, such as Shewanella oneidensis MR-1, can gain energy by coupling the intracellular oxidation of electron donors to the reduction of external surfaces, such as natural minerals or even synthetic elec- trodes. This respiratory strategy, known as extracellular electron transfer (EET), has significant environmental and biotechnological implications. Extensive studies of Shewanella have revealed multiple proposed routes for EET between cells and surfaces, all of which rely on an a network of periplasmic and outer membrane mul- tiheme cytochromes to traffic electrons to the cell surface where they may directly reduce minerals or electrodes. The multiheme cytochromes (MtrC and OmcA) are also present on long (micrometer-scale) membrane filaments, known as bacterial nanowires, that are proposed to perform EET to more distance surfaces. While a detailed characterization of long-distance electron conduction in such redox net- works remains challenging, especially under native conditions and in solution, this thesis complements the emerging experimental work with computational insight into redox-based conduction in individual cytochromes, cytochrome complexes, and even whole bacterial nanowires. This thesis is divided into two parts. The first part of the thesis (chapter 2 and 3) focuses on electron transport through cytochromes and cytochrome complexes xix thatbridgecellstomineralsandanodes. Chapter2reportselectrontransportmea- surements through the single outer-membrane multi-heme cytochrome MtrF from Shewanella oneidensis MR-1 using scanning tunneling microscopy, and analyzes this transport using Kinetic Monte Carlo (KMC) simulations. The simulation results demonstrate reasonable agreement between a multistep electron hopping model through MtrF’s heme chain and experimentally measured currents. We then extend the computational approach by including a structural prediction of an Mtr-Omc complex that explores binding of these proteins under plausible physio- logical conditions while allowing efficient inter-protein electron transfer. We apply KMC simulations to compute electron flux through the predicted dimer and then visualize the simulation results to examine electron transfer dynamics. The second part of the thesis (chapters 4-6) focuses on understanding long- distance electron transport through micrometer-scale bacterial nanowires. Recent in vivo observations of the Shewanella nanowires’ formation and composition high- lighted the importance of the outer-membrane cytochromes MtrC and OmcA, thereby motivating stochastic simulations of electron transport in a large-scale redox network of cytochromes lining nanowires. Chapter 4 details how to over- come the previously implemented non-scalable conventional KMC algorithm, in which the time scale is inversely proportional to the simulated system size, making it computationally prohibitive to treat whole nanowires or biofilms. We devise a divide-and-conquer KMC (DCKMC) algorithm with decent strong-scaling parallel efficiency for simulating biological electron transfer dynamics in a 4.2 billion-heme system. The parallel DCKMC algorithm is used to simulate a micrometer-scale redox network of cytochrome arrays on a bacterial nanowire. Chapter 5 reports simulation results for large-scale cytochrome networks using our DCKMC algo- rithm. Our first test model is a 1.12μm (idealized) nanowire treated as an array of xx 1280 MtrC molecules arranged hexagonally on a membrane tube. The 1280 MtrC molecules are decomposed into 64 domains simulated in parallel. The simulation results from the model shows net electron transport rate of 10 5 s −1 , remarkably consistent with the measured respiration rate of Shewanella. This initial finding leadsustodesignbiologicallymoreplausiblecytochromenetworkmodels. Chapter 6 first proposes percolation analysis as a screening method prior to DCKMC sim- ulations. DCKMC then simulates electron dynamics on the percolation-screened redox network models with various configurations that examine the 1) relative arrangement of heme chains in the cytochrome network; 2) overall cytochrome density and 3) level of disorder in cytochrome orientation. We discover that direct inter-protein electron tunneling through the network can allow for biologically- relevant transport rates through densely-connected scenarios. However, as order and packing decrease, an additional inter-protein interaction (e.g. diffusive shut- tling via small electron carriers or lateral diffusion of membrane cytochromes) is likely necessary to boost the efficiency of long-distance electron transfer. Our redox network scenarios provide new insights into extracellular electron transportmechanismsinlarge-scaleconductivesystems,suchasbacterialnanowires or bacterial biofilms. As detailed experimental information emerges on the density and orientation of redox proteins along nanowires, and with recent demonstrations of redox-based conduction in whole biofilms, the simulations reported here will be critical for both analyzing recent measurements and to guide future develop- ments. xxi Chapter 1 Introduction 1.1 Background Respiration is the process of harvesting energy by transferring electrons from an electron donor to an electron acceptor. Typically, this electron transfer occurs in a respiratory chain embedded in the cell membrane. A series of electron transfer generates an electrochemical potential across the membrane, which then drives the production of energy in a form of ATP. For continuous supply of ATP in order for life to survive, a terminal electron acceptor, such as oxygen, must exist to receive the electrons [2, 3]. Theredoxreactionscatalyzedbymembrane-boundcompoundstakesadvantage of the energy difference between electron donor and acceptor (i.e. electrons are transferred from a low potential electron donor to an acceptor with more positive redox potential by redox reactions). The greater the difference in redox potential between the electron donor and acceptor, the more energy is available for ATP production by the organism per organic carbon molecule. Figure 1.1 shows the tendency of electron transfer between the redox pairs measured as redox potential. Oxygen has the most positive redox potential. Therefore, most microorganism use oxygen as the terminal electron acceptors. However, some microorganism have evolved mechanisms to cope with living in anaerobic environments, allowing them to use abundant minerals outside the cell as respiratory electron acceptors, instead of oxygen or other soluble oxidants that 1 Figure 1.1: Redox tower showing a list of electron donor-accepter pairs and their associated redox potentials. The red arrows indicates that electron flows from a redox molecule with low redox potential (low affinity to electrons) to a redox molecule with high redox potential (high affinity to electrons) along the electron transport chain. Microorganisms can be categorized according to these electron donor and acceptor pair. would normally diffuse inside cells [4]. This kind of bacteria, called dissimilatory metal-reducing bacteria can extract free energy to power life by coupling the intra- cellular oxidation of electron donors (food) to the reduction of a variety of terminal electron acceptors (oxidants), including extracellular iron and manganese minerals [5, 6]. Unlike the well-studied soluble electron acceptors of aerobic (oxygen) or anaerobic (e.g. nitrate or sulfate) respiration, these insoluble minerals present a significant electron transport challenge: how can the electrons generated from the oxidation of organic carbon be transported from the inner bacterial membrane, across the periplasm, to the outer membrane, eventually reaching the external mineral surface? Understanding this extracellular electron transfer (EET) strat- egy will provide new biophysical and physiological insights into the fundamentals of long-range electron transport and microbial respiration. 2 Figure 1.2: Possible extracellular electron transfer mechanisms from the cell to the minerals or electrode. Direct electron transfer via (i) bacterial outer surface components such as c-type cytochromes or (ii) conductive cellular appendages. (iii) Indirect electron transfer via a shuttle released by the microbial catalyst. Beyond the fundamental significance, EET also has significant environmental and technological implications. Because of the abundance of iron and manganese in anoxic zones, the microbial reduction of these minerals drives major elemental cycles at a global scale, including the carbon cycle, with significant consequences for climate change [7]. By using electrodes instead of minerals, the natural abil- ity of microbes to perform reduction-oxidation (redox) reactions at solid surfaces has also been exploited for electricity generation in microbial fuel cells [8], and for the production of high-value electrofuels in the reverse process of microbial electrosynthesis [9]. Shewanella can perform EET by multiple mechanisms (Figure 1.2) that fall into two broad categories: indirect and direct mechanisms [10]. Indirect mecha- nisms are facilitated by small molecules, such as flavins [11, 12], that either dif- fusively shuttle electrons from cells to solid surfaces or chelate metals for delivery to intracellular oxidoreductases. Direct mechanisms take advantage of contact between cells and solids by routing electron transport through multiheme c-type 3 Figure 1.3: Mtr extracellular electron transfer pathway of S. oneidensis MR-1. CymA is a tetraheme c-type cytochrome, which is anchored in the inner membrane where it oxidizes quinol (QH 2 ) to quinone (Q) and transfers electrons to MtrA. MtrA is a decahaem c-Cyt that is thought to be embedded in MtrB, a trans- outer membrane and porin-like protein. MtrC and OmcA are the outer membrane decaheme c-Cyts that transfer electrons to Fe(III) minerals. The yellow arrow indicates the direction of electron flow. cytochromes located on the cellular outer membrane [13–15], or via micrometer- long conductive proteinaceous filaments known as bacterial nanowires [16–18]. It is importanttonotethatthesemechanismsarenotmutuallyexclusive. Shewanella is thought to be capable of both indirect (i.e. redox shuttles) and direct mechanisms, depending on growth conditions, electrode potentials, and other environmental factors [19, 20]. Furthermore, Shewanella’s metal reduction (Mtr) pathway, which includes periplasmic and outer membrane decaheme cytochromes, is capable of direct electron transfer [15], but is also essential for reducing secreted flavin redox shuttles [21]. As part of the Mtr pathway (Figure 1.3), the MtrCAB complex has been impli- cated as the primary route for electron transfer across the cell envelope to soluble 4 Figure 1.4: Crystal structure of MtrF integrated into MtrB in a possible orienta- tion. This orientation suggests heme 10 to accept electrons from MtrD/MtrA and heme 5 to be the solvent-exposed terminus for electron output to solid substrates, soluble substrates, or electron shuttles. redoxshuttles, minerals, andtheanodesofmicrobialfuelcells[22]. Whiletheexact structure of the whole complex is not yet known, the most accepted hypothesis points to MtrCAB as a porin-cytochrome complex that performs EET by arrang- ing 20 heme co-factorsinto a∼20 nm ‘wire 0 . Specifically, the periplasmic decaheme cytochromeMtrAisthoughttotransferelectronstotheoutermembranedecaheme cytochrome MtrC through the transmembrane porin MtrB. Electrons are conse- quently transferred from MtrC, and its partnering outer membrane cytochrome OmcA, to terminal electron acceptors such as minerals or electrodes [14, 15]. She- wanella can also express the MtrFDE conduit, where MtrF, MtrD, and MtrE are homologous to MtrC, MtrA, and MtrB, respectively. It was previously noted that MtrF can act as a potent reductase of minerals and microbial fuel cell anodes in the absence of MtrC, and that hybrid complexes can form between the MtrCAB and MtrFDE components [23]. 5 The crystal structure of MtrF was first revealed among Shewanella’s outer membrane c-Cyts. MtrF consists of four distinct domains that are folded sequen- tially through the amino acid sequence from the N terminus to the C terminus. Domain III and I contain antiparallel β-strands and domain II and IV each binds five heme c-cofactors. Each heme is covalently bound to the protein via two cys- teine linkages in a CXXCH motif and coordinated by two histidines. The heme cofactors are arranged in a "staggered-cross" formation of the 10 hemes, with a cen- tral lateral tetraheme chain (hemes 2, 1, 6, and 7), and an orthogonal octaheme chain linking heme 10 to heme 5. The possible integration orientation of MtrF was that heme 10 receives electrons from the periplasmic proteins and heme 5 outputs electrons to external solid-phase electron acceptors. More buried heme 2 and 7 on the side of the lateral tetraheme chain are proposed as possible sites for electron exchange with soluble electron acceptors or electron shuttles when the heme 5 is occupied with the interaction with a solid substrate [24]. 1.2 Methods 1.2.1 Scanning tunneling microscopy For molecular conductance measurements, the scanning tunneling microscopy (STM) and tunneling spectroscopy (TS) techniques are used [25]. The basic design of STM (Figure 1.5) includes an ultra-find metallic tip wired to the piezoelec- tric drive held at a given bias and a conductive or semi-conductive sample to be scanned. The principle of the STM is as follows. When the tip is brought close to the surface of the sample, an electrical tunneling current is induced and electrons flow from tip to sample (or vice versa depending on the direction of the voltage bias V). The tunneling current decreases exponentially as the separation of the tip 6 Figure 1.5: Schematic of a scanning tunneling microscopy (STM) approach to measuring the current-voltage (I-V) relationship in individual MtrF molecules. from the surface is linearly increased. The STM operates in either the constant current mode or the constant height mode. In constant current mode for atomic- scale imaging of surfaces, the tunneling current between the tip and the surface is fixed, while the tip can trace the topography of the surface in the z direction. The tunneling spectroscopy (TS) in constant height mode technique enables to probe the electronic structure of the sample, which is the primary technique we use to confirms MtrF’s ability to mediate electron transport between an STM tip and an underlying Au(111) surface and compare the experimentally measured I-V curves to KMC simulations of the net electron flux as a function of voltage across MtrF. 1.2.2 Kinetic Monte Carlo algorithm The kinetic Monte Carlo (KMC) method is a Monte Carlo method intended to simulate the time evolution of some processes occurring in nature that typically occur with known transition rates among states [26]. There are two well-known 7 Figure 1.6: Flowchart of sequential KMC algorithm. basic strategies for KMC simulation which refer to as rejection (rKMC) and rejec- tion free algorithms (rfKMC). Both algorithms contain a list of events, each with an associated probability, choose a single event to perform, and advance time by the correct amount of time. rKMC can run faster computations for each attempted step, but it can contain null events, which can lead total computations extremely inefficient. Generally, the inefficiency occurs when there is one or more small rates. Rates distribution in heme network peaks at the rate between adjacent redox sites and decay exponentially between far away neighbors. Therefore, we employ rejection-free algorithms to simulate electron transfer dynamics in heme network. Figure 1.6 describes the computational flow in KMC simulation. We start a simulation by emptying all heme sites and resetting the time to 0. At each KMC step, one of the following events occurs: (i) an electron is injected with rate k red if the entrance heme is unoccupied; (ii) an electron is ejected with rate k ox if the exit heme is occupied; or (iii) an electron hops from heme i to one 8 of its nearest-neighbour hemes j with rate k ji if heme i is occupied and heme j is unoccupied. The method for calculating the rates for the ET dynamics can be found in Ref. [28]. KMC simulation consists of a time-stepping loop. Let l be one of the possible events listed above, with k l being its rate, m be the total number of possible events, and k total = M X l=1 k l (1.1) be the sum of all possible events. At each KMC step, the time is incremented by t =− ln(ξ 1 )/k total (1.2) whereξ 1 is a uniform random number in the range [0,1]. The probability of choos- ing a particular event is proportional to its rate, and specific event m is chosen such that m =min l { l X l=1 k l >k total ξ 2 }, (1.3) where ξ 2 is another random number. 1.2.3 Parallel kMC algorithm Traditionally, program operates in a serial manner by executing instructions sequentially one after another. However, for some problems, as the problem size becomes larger, computational time increases extremely, which raised the need for parallel work flow. Parallel computing [27] is using multiple processors in parallel to solve problems more quickly than with a single processor. It speeds up algorithms by applying single operation to discrete parts of data simultane- ously. As an example, parallel approach can be applied to matrix multiplication. The multiplication of A and B matrices (n× n) can be reduced by block matrix 9 decomposition algorithm where Processors are arranged in a logical √ p∗ √ p 2D topology. Each processor gets a block of (n/ √ p)*(n/ √ p) block of A, B, and C. It is responsible for computing the entries of C that it has been assigned to. For communication between processes, there are two parallel computing models on parallel processors: shared memory and message passing. In shared memory scheme, multiprocessors have a single large memory space directly accessible to all processors. Communication between processors is carried through the shared memory space. An advantage of this model is that there is no need to specify explicitly the communication of data between tasks, so program development can often be simplified. However, performance becomes more difficult to understand and manage data locality. In contrast, distributed memory multiprocessors can only access to its own local memory space, Changes it makes to its local memory have no effect on the memory of other processors. Thus, they must use message passing communication to access memory and exchange data on other machines in which other processes are executing. The message passing interface(MPI) standard interfaces provides two-sided, send and receive communication operations between processes. Synchronous message passing allows its operation happens within the flow of control of the calling process. Users are allowed to use resources specified by the operation after it completes. On the other hand, asynchronous message passing initiates an operation that happens logically outside the flow of control of the calling process. The operation return before the operation completes. The reuse of the resource can be controlled by MPI_Wait() operation for completion of an asynchronous send or receive operation before proceeding. In KMC algorithm for simulating the time evolution of electron transfer of heme network, a ET event occurs stochastically with probability proportional to rates r. The simulated time is updated at each KMC step by4t which is inversely 10 proportional to the sum of the rates of all possible ET events. Therefore, the time evolved at each step is much smaller in large-scale system. To be able to perform large-scale time evolving KMC simulation, we need to break large domain into discrete "chunks" of domain that can be distributed to multiple processes. KMC simulation exploits spatio-temporal locality in which likelihood of accessing next event is higher if the event is near it or happens sometime in the near future. Thus, spatio-temporal locality enable spatial domain decomposition of large-scale heme network in which we partition lattice of heme network into many smaller equal size of heme network. Events occurs near the boundary can be resolved by heme caching where the number of boundary hemes and occupancy status are copied from all neighbor domain. This communication between multiple processes for the exchanges boundary heme occupancy among neighbor domain is done by asyn- chronous message passing as mentioned above. Scalability test is also performed to measure the parallel performance on how efficient the algorithm is with increas- ing numbers of processes. Naïve application of KMC for simulating the electron transfer behavior in heme network demands huge computing time, thus paral- lelization of the simulation algorithm is necessary. KMC considers the whole set of electron hopping events with rate (i=1,...,N) for hemes involved, thus the size of this set grows proportionally to the number of cytochromes involved. The number of cytochromes involved in our system can be at least on the order of 10 4 (assum- ing 1μm of nanowire). Conventional KMC requires to compute k tot = P N i=1 k i for every step, and each step increments time inversely proportional to k tot . In our heme network system where k tot is increased by the order of 10 4 , compared to a single-protein system, time progresses too slowly to make simulation practical. A parallel version of kinetic Monte Carlo proposed by Martinez et al [28, 29]. can improve the performance of the simulation by dividing entire simulation 11 Figure 1.7: A schematic of parallel kinetic Monte Carlo algorithm in nodes. Each subdomain follows sequential KMC with extra step of adding null events to allow globally synchronous time evolution. This parallel algorithm uses master-worker approach in which master distributes partitions of data to the worker and collect the results to process it for next step and worker sends the results to the master. domain into isolated subdomains Ω g and run simulation in these subdomains in parallel. In applying the parallel KMC algorithm to our heme network system, a set of the electron hopping events are sub-divided into multiple subdomains Ω g (g = 1,...,G), where G is the number of groups to run in parallel, and each Ω g contains k i,g for its events. In each step of the simulation, we compute the sum of possible event rates for Ω g , k tot,g = Ng X i=1 k i,g (1.4) And the maximum rate, k max is defined as follows, k max = max g=1,...,G {k tot,g } (1.5) 12 Therefore, k max stands for the sum of event rates of a subdomain Ω max where events are most frequently occurring. Here, we let the entire system progress with "global time progress" based on Ω max for all subdomains in the system, and event rates of each subdomain Ω g are adjusted to reflect it by introducing null events k 0,g as follow, k 0,g =k max −k tot,g (1.6) Conventional KMC is carried out in each subdomain with rates including the null event rate in parallel. If a null event is selected in a subdomain, the configuration of the subdomain remains the same. Subdomains need to communicate with each other when electrons are transferred from one subdomain to another. More details about heme cashing and occupation number copying Îľ described in chapter 5. Finally, we increment the time by τ =− ln(ξ)k −1 max , where ξ is another random number. The key idea of this algorithm is to make the entire system progress with the least time increment based on the most active subdomain Ω max , and normalize other less active subdomain’s event rates by k tot,g /k max with null events for no electron transfer during that time duration. 1.3 Chapters overview In chapter 2, we report the first kinetic Monte Carlo simulations and single- molecule scanning tunneling microscopy (STM) measurements of the Shewanella oneidensis MR-1 outer membrane decaheme cytochrome MtrF, which can perform the final electron transfer step from cells to minerals and microbial fuel cell anodes. We find that the calculated electron transport rate through MtrF is consistent with previously reported in vitro measurements of the Shewanella Mtr complex, as well 13 as in vivo respiration rates on electrode surfaces assuming a reasonable (experi- mentally verified) coverage of cytochromes on the cell surface. The simulations also reveal a rich phase diagram in the overall electron occupation density of the hemesasafunctionofelectroninjectionandejectionrates. Singlemoleculetunnel- ing spectroscopy confirms MtrF’s ability to mediate electron transport between an STMtipandanunderlyingAu(111)surface, butatrateshigherthanexpectedfrom previously calculated heme-to-heme electron transfer rates for solvated molecules. Two decaheme cytochromes, MtrC and OmcA are cell surface-exposed lipopro- teins, which are known to be important for reducing solid metal oxide at the final electron transfer step. In chapter 3, we present a computational framework named VizBET to simulate and visualize ET dynamics in an outer-membrane Mtr-Omc cytochrome complex. Starting from X-ray crystal structures of the constituent cytochromes, molecular dynamics simulations are combined with homology mod- eling, protein docking, and binding free energy computations to sample the con- figuration of the complex as well as the change of the free energy associated with ET. This information, along with quantum- mechanical calculations of the elec- tronic coupling, provides inputs to kinetic Monte Carlo (KMC) simulations of ET dynamics in a network of heme groups within the complex. Visualization of the KMC simulation results has been implemented as a plugin to the Visual Molecular Dynamics (VMD) software. VizBET has been used to reveal the nature of ET dynamics associated with novel nonequilibrium phase transitions in a candidate configuration of the Mtr-Omc complex due to electron-electron interactions. For short distances less than 2 nm, localization of cytochromes in the cellu- lar outer membrane take advantage of direct contact between cells and solids by routing electron transport through multiheme c-type cytochromes located on the cellular outer membrane. Recently, a few other strategies have been reported to 14 mediate dramatically long-distance electron transport, ranging fromnanometers to micrometers and even centimeters. Small molecules, such as flavin, was discovered to act as electron shuttles to enhance reduction of insoluble substrates and chelate metalsfordeliverytoanintracellularmetaloxidoreductase. Electricallyconductive appendages known as nanowire was also shown to conduct electrons over microm- eter distance. Recent in vivo observations of the formation and respiratory impact of nanowires in Shewanella oneidensis MR-1 motivates us to extend the model to large-scale redox network of cytochromes arrays to describe long-distance electron transport along the nanowires. However, unfortunately, the conventional KMC algorithm is not scalable, since its time scale is inversely proportional to the simu- lated system size. A promising approach to resolving this issue is the synchronous parallel KMC (SPKMC) algorithm, in which the time scale becomes size indepen- dent. In chapter 5, based on synchronous parallel kinetic Monte Carlo algorithm, we provide a divide-and-conquer KMC (DCKMC) algorithm. This parallel imple- mentation achieves an excellent weak-scaling parallel efficiency of 0.935 on 1,024 Intel Xeon processors for simulating biological electron transfer dynamics in a 4.2 billion-heme system, as well as decent strong-scaling parallel efficiency. Also, to validate the DCKMC algorithm, we present a test model of a lattice of cytochrome complexes on a bacterial-membrane nanowire to simulate electron transfer. Following the success on simulating the test model of 1.12 μm nanowire, the effect of cytochrome density, orientation, and system size is examined, which pro- vides key insight into understanding and controlling these mechanisms. In chapter 5, we report simulations of the electron transfer dynamics along the redox-network of nanowires using multiple scenarios for cytochrome density, order, orientation andinter-proteininteractions. Toprovidepossiblelocalizationofcytochromeheme 15 network topologies on membrane as input to the KMC simulation, we run coarse- grained molecular dynamics (MD) simulation. Before running KMC simulation, the input topology from MD simulation is pre-tested by network analysis to deter- mine the execution time of a simulation until we achieve steady-state electron transport. The analysis is based on heme cluster connectivity on each topology, which can provide us quantitative predictions for dynamic percolation transition as a function of ET time scale. The overall electron transport over long distances is found to strongly depend on the cytochrome density, topology, and orientation on membrane. 16 Chapter 2 Single Molecule Conductance Measurements and Stochastic Simulations of the Bacterial Decaheme Cytochrome MtrF 2.1 Introduction While the crystal structure of MtrC is not yet available, the structure of its homolog MtrF was recently published [24]. The structure contains a ’staggered- cross’ arrangement of 10 hemes, with a central lateral tetraheme chain, and an orthogonal octaheme chain linking heme 10, which receives electrons from the periplasmic proteins, to heme 5, which serves as the electron output to exter- nal solid-phase electron acceptors (Figure 2.1a). Knowledge of the MtrF struc- ture enabled molecular dynamics and quantum mechanics/molecular mechanics (QM/MM) simulations of key thermodynamic and electron transfer (ET) param- eters: the redox potential of each of the 10 hemes (and therefore the free energy driving force4G for each ET step between neighbouring hemes), reorganization free energies (λ), and heme-to-heme electronic coupling matrix elements (H ab ) [30– 32]. These parameters were then used to calculate the rate of each heme-to-heme 17 Figure 2.1: (a) Crystal structure of MtrF after Clarke et al.[16] The schematic shows the heme numbering scheme and the proposed integration of MtrF into the Shewanella outer membrane (OM). Electrons transferred from periplasmic cytochromes are injected to heme 10, and hop along the multiheme assembly to heme 5, which serves as the solvent exposed terminus for electron ejection to solid or soluble electron acceptors. (b) A snapshot illustrating the kinetic Monte Carlo approach to calculate the heme occupation density and net electron flux through MtrF. Each heme site can be occupied by at most one electron, and the next snap- shot is stochastically chosen depending on the rates associated with each possible ET step (detailed in text). ET step (k ij , between neighbouring hemes i and j) by using the non-adiabatic rate equation [3]: k et = 2π ~ |H DA | 2 √ 4πλk B T exp − (ΔG +λ) 2 4λk B T (2.1) Here, we perform first-principles kinetic Monte Carlo (KMC) simulations [26] to study the electron transport dynamics in the decaheme cytochrome MtrF. Our simulations build on the previous work by Breuer et al. who calculated the steady- state electron flux through solvated MtrF by solving a master equation [32]. The KMC approach detailed here solves the same master equation while offering advan- tages in terms of computational speed, ease of extracting the transient dynamics, 18 andscalabilitytolargersystems. Thescalability, whichisfurtherenabledbyrecent developments in synchronous parallel KMC algorithms [29], will become increas- ingly important as more structural information emerge for entire multi-protein EET complexes and micrometer-long bacterial nanowires. The simulations reveal phase transitions in the average electron occupation density of the 10 hemes, as a function of the molecular reduction and oxidation rates, i.e. electron injection into heme 10 and ejection out of heme 5. Furthermore, we report the first scan- ning tunneling microscopy (STM) and tunneling spectroscopy measurements of MtrF conductance, and compare the experimentally measured I-V curves to KMC simulations of the net electron flux as a function of voltage across MtrF. 2.2 Methods The KMC simulation treats electron hopping events on the MtrF heme network with N (= 10) sites, where a heme site is labeled by indexi∈{1,...,N}. The sys- tem is characterized by electron hopping ratesk ij between a pair (i, j) of adjacent heme sites, intracellular donor-heme hopping rate on the entrance site (i = 10), and heme-acceptor hopping rate on the exit site (i = 5). We employ k ij values obtained from QM/MM simulations based on the non-adiabatic rate expression (Equation 1.1) [32]. Each heme site can be occupied by at most one electron, or it can be empty. We start the KMC simulation by emptying all sites and resetting the time to 0. At each KMC step, one of the following electron hopping events occurs (Figure 2.1b): (i) an electron is injected from the donor to heme 10 with rate if heme 10 is unoccupied; (ii) an electron is ejected from heme 5 to the accep- tor with rate if heme 5 is occupied; (iii) an electron hops from site i to one of its nearest-neighbour heme sites j with rate k ij if site i is occupied and site j is 19 unoccupied. The key idea is that the probability of choosing a particular electron hopping event is proportional to its specific hopping rate. This is implemented by choosing an event stochastically as follows: Let M be the total number of possible events andk l (l = 1,...,M) be the rate of the l-th event; specific event m is chosen such that m−1 X l=1 k l <ξ 1 k total < m X l=1 k l (2.2) where ξ 1 is a random number uniformly distributed in the range (0,1) and k total = M X l=1 k l (2.3) is a cumulative function of all the possible event rates. We displace the electron involved in the chosen event and increment the time by τ =− ln(ξ 2 )k −1 total , where ξ 2 is another random number. KMC steps are repeated for T (∼ 10 6 ) times to describe the time evolution of the system. The time-averaged electron occupation density at heme site i is calculated as hn i i = P T s=1 n i (s)τ(s) P T s=1 τ(s) (2.4) where τ(s) is the time increment at the s-th KMC step, and n i (s) = 1 or 0 when site i is occupied by an electron or unoccupied, respectively, at the s-th KMC step. The overall electron occupation density is given by an average over all heme sites. The steady-state current J (s −1 ) is obtained by the net number of electrons transferred from the donor to the acceptor during T KMC steps divided by the total elapsed time P T s=1 τ(s). 20 2.3 Results and discussion While in vitro and in vivo studies of Mtr proteins can provide a wide redox potential range for the whole heme chain and the net ET rates to external miner- als [15, 24, 33, 34], the experimental techniques cannot reveal the intramolecular details associated with mediating ET from cells to external electron acceptors sim- ply by measuring the interfacial rates. By varying the incoming (α, associated with the reduction of heme 10) and outgoing (β, associated with the oxidation of heme 5) ET rates, our simulation reveals a rich phase diagram for MtrF in the plane (Figure 2.2a). Consistent with the biological function of directional electron flow in dissimilatory metal-reducing bacteria, and the strong driving force from low (donor) to high (acceptor) redox potentials, the reverse ET rates at hemes 10 and 5 are set to zero (α 0 =β 0 = 0) in the simulation leading to Figure 2.2. Three dis- tinct phases are observed in the time-averaged occupation density. A ’low-density’ phase, where the hemes are mostly oxidized, is observed when transport is limited by ET from donor molecules (α < β). A ’high-density’ phase, where the hemes are mostly reduced, is observed when transport is limited by ET to external elec- tron acceptors (α > β). Finally, a ’maximum-current’ phase is observed when both α and β exceed the smallest heme-to-heme ET rate within MtrF, which is ∼ 10 4 s −1 (heme 1 to heme 3) along the octaheme path [32]. The contrast between the high electron flux character of the maximum-current phase and the low flux regime corresponding to the low-density and high-density phases is easily identified in Figure 2.2b, which presents the phase behaviour of the electron flux J in the plane. Some aspects of the phase behaviour are not surprising; for example, one expects a high ejection-to-injection ratio to correspond to mostly oxidized hemes 21 Figure 2.2: (a) Phase diagram of the time-averaged electron occupation density for all hemes as a function of the incoming (α) and outgoing (β) ET rates to heme 10 and from heme 5, respectively. (b) The corresponding phase diagram of the net electron flux J in the α−β plane reveals a low- and maximum- current phase. (c) The time-averaged electron occupation profile of each heme at three chosen α−β locations corresponding to points 1-3 from (a), illustrating the low-density phase, high-density phase, and maximum-current phase. The blue to red scale in (c) indicates increasing probability of the heme site being occupied by an electron during the simulation. Calculation parameters: α 0 = β 0 = 0 and heme-to-heme rates (k ij ) obtained based on MD and QM/MM calculations ofλ,4G, andH ab [17] and vice versa. However, comparison to experimental measurements performed in vitro, and an inspection of the occupation density corresponding to in vivo ET rates, provide new insight into MtrF’s structure-function relationship. The highest flux in the maximum-current phase is 1.3× 10 4 s −1 (Figure 2.2b), which is enough to support the previously measured rapid ET to solid phase Fe(III) oxides from the 22 MtrCAB complex assembled into proteoliposomes (up to 8, 500± 916s −1 per Mtr- CAB toγ-FeOOH) [15]. The latter study used methyl viologen as internal electron donor with a very negative redox potential (-0.45 vs. SHE) to drive ET rapidly across the liposome membrane toγ-FeOOH (-0.1 V vs. SHE). Since electron input from the strongly reducing methyl viologen is unlikely to be the limiting rate, and the MtrC homolog MtrF can support 1.3× 10 4 s −1 , it appears that the liposome experiment was limited by output ET from heme 5 to the mineral. A similar con- clusion was reached by Breuer et al. who calculated the steady-state electron flux through MtrF using a master equation approach [32]. It therefore appears that the Mtr proteins are well suited for the unique challenge of transporting the intra- cellular electrons to extracellular electron acceptors. In addition to the liposome experiments [15], which specifically targeted the MtrCAB membrane complex, it should be noted that general cytochrome-based interfacial ET has been studied in vivo using a photochemical approach in Shewanella Ioihica PV-4 [35]. The latter study revealed significantly lower ET rates than the liposome experiments, pos- sibly limited by metabolic reduction of c-type cytochromes (e.g. ET from quinol derivatives to inner membrane cytochromes or from the periplasmic cytochromes to the outer membrane cytochromes) rather than the electron flux through the proteins themselves. It is also interesting to examine the electron flux through MtrF in light of in vivo measurements of Shewanella respiration. We previously measured the average per cell ET rate in aerobic S. oneidensis MR-1 chemostat cultures to be 2.6× 10 6 s −1 [18]. This value is comparable to more recent biofilm and single-cell anaerobic measurements of per cell EET (1.3× 10 6 s −1 ) to graphite and indium tin oxide anodes, respectively [19, 36]. Previous estimates of the Shewanella MtrC 23 and OmcA content are in the range of 10 3 − 10 4 decaheme cytochromes per cell [34, 37, 38]. The cellular respiration measurements therefore translate to ET rates up to 10 3 s −1 per cytochrome, which again can be supported by our calculations of electron flux through MtrF (Figure 2.2b). Intriguingly, the latter rate is positioned near the critical point, at the intersection of the different occupation density and current phases revealed in Figure 2.2. Bacteria can experience constant fluctuations in environmental parameters, including changes in the availability of food/oxidant (electron donors/acceptors) or different redox potentials of the various minerals. Such fluctuations in the envi- ronmental conditions may therefore trigger a non-equilibrium phase transition in MtrF between the different phases corresponding to donor-limiting ET, acceptor- limiting ET, and maximum electron flux (Figure 2.2). Indeed, an examination of the occupation densities near the critical boundary is instructive, since it gives further insight into the possible roles of particular hemes. For example, due to its high redox potential [32], the occupation density of heme 7 is relatively high, even in the low-density phase (position 1 in Figure 2.2c). In contrast to the sol- vent exposed heme 5, which is capable of interacting with solid substrates, heme 7 is more buried and has therefore been proposed as a possible site for electron exchange with soluble substrates or electron shuttles [24]. Thus in the absence of a fast electron acceptor at heme 5 (small β), which is the only electron ejection site in our simulation of solid-phase EET, there may be another outlet for electron exchange between heme 7 and soluble acceptors (e.g. flavins). Heme 7 is also close to a putative flavin-binding site, and the interaction of MtrC with bound flavin has been shown to result in a significant rate enhancement, compared to interacting with soluble free flavins [39]. Another examination of the occupation density in 24 the maximum-current phase (position 3 in Figure 2.2c) reveals that roughly half the hemes are likely to be reduced and the other half likely oxidized; this profile is consistent with the previously predicted condition for maximum current flow in concentration-gradient driven ET within microbial redox assemblies [40]. While our simulations focused on single injection and ejection scenarios (i.e. exclusively at hemes 10 and 5), we also performed a preliminary KMC calculation to examine the effect of an additional ejection scenario form heme 7, with assumed rates of 10 2 s −1 or 10 5 s −1 . Unlike Figure 2.2a, no distinct phases were detected in theα−β plane as a result of this additional outlet that serves to further oxidize the hemes in MtrF. The KMC simulation can also be applied to the experimental arrangement schematized in Figure 2.3(a), where an MtrF molecule is placed between two elec- trodes. An STM tip acts as the top electrode and a conductive substrate as the bottom electrode; these two electrodes now play the roles of donor or acceptor depending on the direction of the voltage bias V. The electron flux calculation is performed as outlined above, but with two modifications. First, the interfacial injection and ejection rates now follow the interfacial form of the non-adiabatic ET rate equation. The reductive and oxidative electrode-heme rates k red,η and k ox,η , representing ET to and from heme η where η =10 or 5, are given by [41, 42] k red,η =C electrode Z ∞ −∞ exp − x− λ+e(E−E 0 ) k B T 2 k B T 4λ 1 + exp(x) dx (2.5) k ox,η =C electrode Z ∞ −∞ exp − x− λ+e(E−E 0 ) k B T 2 k B T 4λ 1 + exp(x) dx (2.6) 25 Figure 2.3: (a) Schematic of a scanning tunneling microscopy (STM) approach to measuring the current-voltage (I-V) relationship in individual MtrF molecules. (b) Kinetic Monte Carlo simulations of the I-V curves for asymmetric (red and blue traces where r = 0.5 and 2, respectively) and symmetric (black trace, r = 1) contact conditions. Calculation parameters: C elec = 10 5 s −1 , λ n = 0.4eV, δ =δ 10 +δ 5 = 0.3, heme-to-heme rates (k ij ) obtained based on MD and QM/MM calculations, [17]4G modified to include the added voltage bias as described in the text. (c) Phase diagram of the time-averaged electron occupation density for all hemes as a function of the interfacial voltage drop fractions δ 10 and δ 5 at the electrode-heme contacts with increasing voltage V. The color bar is the same as in Figure 2.2a. where λ n is the interfacial reorganization energy, C electrod denotes the coupling between electrode and heme,k B T is the thermal energy, andδ n describes the local voltage drop at heme η as a fraction of the overall applied voltage V between the two electrodes. Second, the driving force for ET between hemes i and j is now4G ij = 4G intrinsic + eV (1− δ)l ij l −1 0 to include the effect of applying V on top of the intrinsic4G intrinsic landscape from the QM/MM simulations [32]. Here, l ij is the total vertical distance between hemes i and j (along substrate- tip axis), l 0 is the total distance between hemes 10 and 5 [24], and δ = δ 10 + δ 5 describes the total interfacial voltage drop. We chose λ n = 0.4 eV, consistent with electrochemically determined values of cytochrome c at electrode surfaces, generally resulting in halving the total reorganization energy compared to fully solvated conditions [41, 43]. Furthermore we assumed an electrode-heme coupling 26 C elec of 10 5 s −1 , matching typical k ij values within the protein [32]; this choice is made in order to avoid having the overall flux be limited by interfacial ET, which would obscure the transport characteristics of MtrF itself. In addition, we used δ = 0.3, from our previous measurements of the interfacial voltage drop in bacterial nanowires [17, 41]. The I-V simulation results are presented in Figure 2.3b, under asymmetric (r = δ 5 /δ 10 = 0.5 and 2, for red solid line and blue asterisk line, respectively) and symmetric (r = 1 for black solid line) contact conditions. The asymmetric contacts result in relatively low conductance, as expected for being in the low-density or high-density phases, and therefore in low-current regime described above. The I-V curve for symmetric contact conditions (i.e. equal injection and ejection rates) shows the higher conductance expected for being in the maximum-current phase. The effect of applying a voltage bias on the phase diagram is evident in Figure 2.3c, which shows the electron occupation density for all hemes as a function of δ 5 ,δ 10 , and increasingV. Significantly, the maximum-current regime grows with increasing V, at the expense of the low- and high- density phases. This effect results from the interfacial ET rates growing more rapidly than the heme-to-heme rates with increasing voltage, leading to a larger portion of the phase diagram where ET is protein limited. We now report the first scanning tunneling microscopy (STM) and tunnel- ing spectroscopy (TS) I-V measurements of the decaheme cytochrome MtrF. Our experimental approach follows previously described procedures for STM/TS mea- surements of the decaheme cytochromes MtrC and OmcA adsorbed on Au(111) substrates [44, 45]. Effective immobilization of MtrF on the surface was achieved via a recombinant tetra-cysteine tag at the C-terminus of the protein, allowing the 27 formation of covalent thiol bonds with Au, and tunneling contact is established by penetrating through a β-D-glucopyranoside (OGP) detergent overlayer that par- tially preserves the protein’s microenvironment (e.g. hydration) [45]. Figure 2.4a demonstrates the uniformity of MtrF monolayers achieved using this procedure, where individual proteins can be resolved within monolayers covering large atom- ically flat Au(111) terraces. From the STM images, the lateral dimensions of indi- vidual cytochromes were measured as 6-9 nm, and atomic force microscopy of less dense preparations (data not shown) revealed a height of 5 nm. These dimensions are consistent with the expected size of MtrF but suggest that the proteins may be tilted at an angle, relative to the orientation schematized in Figure 2.3a, since the measured vertical molecular height is less than the expected distance between hemes 10 and 5 ( 7 nm) [24]. This tilt is not surprising, given the determined location of the C-terminus (and therefore the tetra-cysteine tag) closer to the cen- ter of MtrF than heme 10 [24], as well as the likely flexibility of the C-terminal region and tag. While the specific immobilization and tunneling contact procedure is designed with the stability of the protein in mind, as previously demonstrated for MtrC and OmcA [44, 45], it is difficult to also rule out partial denaturation or conformational change of the protein on the surface (further discussed below). Differences between the simulation and experimental orientation on the surface may also contribute to a different electron exchange pathway between the surface and the protein’s hemes. While the simulation assumes direct electron injection from Au(111) to heme 10 (mimicking the biological conditions), it is possible that a different heme, or multiple hemes, exchange electrons with the surface depending on the distance and coupling strength. Future pathways calculations [46, 47] may reveal some insight into these multiple charge injection scenarios. These differences 28 between the simulation and experimental conditions should be kept in mind while interpreting the following experimental results. The STM/TS measurements were performed in air under ambient humidity conditions. This approach has significant advantages, since it allows conventional conductance (I-V) measurements not confounded by Faradaic currents in solution [44]. However, in order to assess the effect of various hydration conditions on the protein monolayers, we performed the STM/TS measurements at different time points after the solution was removed from the surface following protein deposition on the Au substrates. For protein films left to equilibrate at ambient humidity for more than four hours (referred to as the low humidity condition), an ’activation’ procedure was required to resolve the cytochromes, as previously described for MtrC and OmcA [45]. The activation procedure involves applying a one-time large positive bias (∼3 V) to the tip, in order to disrupt the OGP detergent layer blanketing the cytochrome film [44, 45]. This OGP is residual from the protein solution, where it was required to solubilize the membrane proteins in water by saturating the exposed hydrophobic regions of the protein surface. It should be noted, however, that the activation voltage is only applied once during scanning, and the cytochrome films can be continuously resolved after this step, even if the bias voltage returns to smaller values. Significantly, no activation step was needed to resolve the cytochromes in films left to equilibrate for less than 30 minutes at ambient humidity (referred to as the high humidity condition). This is consistent with higher hydration, where the outward-facing OGP’s hydrophilic head groups interact with the retained water, allowing more direct access of the tip to the MtrF monolayer. The high humidity condition resulted in decreased molecular conductance (Figure 2.4b, blue curve); this is the expected outcome, since the 29 Figure 2.4: (a) Representative scanning tunneling microscopy (STM) image of MtrF monolayers on atomically flat Au(111) terraces. Individual MtrF molecules can be resolved. STM imaging conditions: 1 V bias voltage and 2 nA current setpoint. (b) Tunneling spectroscopy current-voltage (I-V) spectra for both the high and low humidity conditions (blue and red traces, left at ambient humidity for < 30 minutes or > 4 hours, respectively. Averages and uncertainties obtained from 300 individual I-V spectra). Grey lines are obtained by KMC simulations using the phenomenological form of the non-adiabatic ET rate equation (7) to calculate the heme-to-heme ET rates assuming = 0.8 eV or 0.55 eV. Additional calculation parameters: = 1 Å −1 , λ n = 1/2, C elec = 10 10 s −1 , δ = 0.3, R from the crystal structure, and4G accounted for the bias V as described in the text. 30 higher dielectric constant of water results in a higher reorganization energy for ET. A comparison between the measured TS I-V curves of single molecules (average of 300 individual point spectra, Figure 2.4b) and the KMC simulation I-V results (Figure 2.3b) reveals a large discrepancy; the measured current is 3 orders of magnitude higher than the current simulated by KMC, primarily as a result of the low heme-to-heme electronic couplings generated by the QM/MM simulations (adjusting the reorganization energies and driving forces did not compensate for this discrepancy). In the following, we discuss possible reasons for this discrepancy before proceeding to use phenomenological ET rate expressions (not based on the MtrF-specific QM/MM calculated rates) to compare general trends with the experimental data. First, we note it is unlikely that an experimental artifact causes this large disparity with theory, since similar (hundreds of pA) currents were also observed in previous STM/TS studies of the closely related MtrC [44, 45], and giventhattheuniform5nmthickMtrFmonolayercoverage(Figure2.4a)precludes direct access to the underlying Au substrate or single step tip-substrate tunneling through theprotein dielectric. Atthe same time, althoughthe MtrF heme-to-heme QM/MMcalculatedcouplingsareupto2ordersofmagnitudebelowempiricalrules typically used for ET between redox centers, it seems unlikely this results from inaccuracies in the QM/MM calculations as the same methodology gave accurate predictions in other systems [32]. The large discrepancy in measured and calculated ET rates was anticipated by Breuer et al. [32], who proposed that high experimental currents may be mediated by higher-lying non-occupied electronic states, rather than the expected multistep hopping along the heme redox chain. However, we note that the 10 heme redox 31 potentials of MtrF were reported to lie in the range -0.26 - 0 V (vs. SHE) for fully solvated MtrF and -0.312 - -0.044 V (vs. SHE) using protein film voltammetry [24]. This range translates to roughly 4.1 âĂŞ 4.4 V below vacuum on an absolute scale. As previously noted for MtrC on Au(111), a comparison of this potential window with the electrode work functions and applied voltage (± 1V) reveals that this range is completely sampled in the bias interval separating the Fermi levels of the Au substrates and Pt/Ir STM tips used in the MtrC [44] and MtrF measure- ments. We therefore speculate that the most likely explanation for the discrepancy between theory and experiment stems from the difference in environmental condi- tions and conformational state of the protein. The calculations were performed for fully-solvatedMtrFinbulkwater[32], whiletheexperimentswereperformedunder ambient humidity conditions for proteins attached to Au(111) surfaces. While our experimental arrangement retains a stable conformation (after repeated scanning) and a hydrated microenvironment (adsorbed water) under the protective detergent coating [44, 45], it is unlikely that the cytochromes are in their fully-solvated con- formation on the Au surfaces during STM/TS measurements. The conformation of surface-bound molecules, such as Mtr proteins, in an STM configuration can depend on a number of factors discussed elsewhere, including electric field induced effects as a result of the tip bias [44], and denaturing or structural re-arrangements upon adsorption to interacting surfaces [48]. Without knowing the precise effects of humidity, surface adsorption, and tip bias on the protein’s ET rates, we therefore make use of a simplified phenomeno- logical form of the non-adiabatic rate equation, instead of the QM/MM informed 32 equation (1), to provide the heme-to-heme rates as input to the KMC simulation [2]: k hop (s −1 ) = 10 13 exp −βR− (ΔG +λ) 2 4λk B T (2.7) where β is a tunneling decay factor , (∼ 1Å −1 ) and R is the effective tunneling distance (difference between edge-to-edge distance for two neighbouring hemes and the distance at van der Waals contact). Note that this is one possible simplified coupling law with a particular pre-factor, decay coefficient, and using the edge- to-edge metric, while other empirical rules may also be useful. Such a simplified treatment is not meant for detailed comparison to theory, but rather is a heuristic approach intended to look at trends and to gain intuition about the parameters required to meet the experimentally observed currents (Figure 2.4b). Indeed, we find that the measured I-V curves are consistent with KMC simulations (grey curves) based on heme-to-heme ET rates given by equation (7), assuming λ = 0.8 eV and 0.55 eV for the high (blue trace) and low humidity (red trace) conditions, respectively. For both conditions, λ n =1/2λ ,4G was only contributed to by the bias V as described above, δ = 0.3, and C elec = 10 10 s −1 . While the high inter- facial and heme-to-heme electronic couplings were tuned to be consistent with experimental results (up to 4-5 orders of magnitude higher than the inter-heme rates derived from the ab initio calculations), the reorganization energy values are reasonable for MtrF [32] and show the correct trend of increasing with increas- ing hydration, resulting in lower conductance. These initial experimental results motivate us to consider new experimental platforms to assess the conductance as a function of controlled humidity conditions, and to pursue additional electro- chemical techniques for fully solvated molecules. The experiments also motivate us to explore additional mechanisms, beyond incoherent hopping, which may be 33 relevant to EET. In addition to the high ET rates observed in the MtrC [45] and MtrF (this study), nA currents were also measured in bacterial nanowires from Shewanella [17, 18], and a new report demonstrated the localization of the multiheme cytochromes MtrC and OmcA along these micrometer-long nanowires [49]. Given the large difference between the ab initio and empirical inter-heme ET rates (based on a multistep hopping mechanism), it can be argued that resonant enhancement may underlie the observed high ET rates, at least under the exper- imental measurement conditions (ambient humidity, fixed nanowires, etc.). The recently proposed transient or flickering resonance regime is of particular interest, since it may support coherent (ballistic) charge transfer by enhanced charge delo- calization among hemes with close redox potentials [50]. Future studies should be designed to assess the applicability of these exciting theoretical developments to EET. 2.4 Conclusions In summary, we presented a kinetic Monte Carlo approach to study electron transport and the electron occupation density in the heme chain of the She- wanella oneidensis MR-1 outer-membrane cytochrome MtrF. The calculated elec- trontransportratethroughMtrFissufficienttosupportpreviousinvitromeasure- ments of transport through the Shewanella Mtr complex, as well as in vivo respira- tion rates assuming a reasonable coverage of outer membrane cytochromes (∼ 10 3 per cell) that act as terminal reductases of insoluble electron acceptors. Finally, we presented scanning tunneling microscopy of MtrF monolayers chemically immobi- lized on Au(111) substrates, and tunneling spectroscopy current-voltage measure- ments of single MtrF molecules. The molecular conductance was found to depend 34 strongly on the hydration state of the protein, consistent with changes in the reorganization free energies. Our work informs and motivates further experimen- tal and theoretical investigations to understand the long-range electron transport mechanisms underlying the extracellular respiration of minerals and solid-state electrodes. 2.5 Appendix I Recombinant MtrF contained a 4×Cys/V5/6×His tag (DDDDKAACCPGC- CKGKPIPQPLLGLDSTRTGHHHHHH) at its C-terminus for purification and covalent binding to Au surfaces. The construct with a gene encoding this tetra- cysteine-tagged MtrF was prepared similarly to that for tetra-cysteine-tagged MtrC [51]. The resulting construct (pLS289) was introduced into S. oneidensis MR-1 to create LS622 [52]. After overexpression, the tetra-cysteine-tagged MtrF was purified from LS622 by following an established protocol [24]. The purified MtrF (1.7 mg/ml) was distributed into 100 μl aliquots and stored at -20 ◦ C in 20 mM HEPES-buffered solution containing protease inhibitor, 10% glycerol, 250 mM NaCl, 1% OGP, and 5 mM -mercaptoethanol at pH 7.6. As described pre- viously,[14b] this solution was exchanged twice with fresh pH 7.6 buffer solution containing 20 mM OGP and 50 mM HEPES using Amicon Ultra-0.5 mL centrifu- galfilters(10kDamolecularweightcut-off)at14,000rcfand10 ◦ Cbeforeuse. The resulting protein solution was diluted 4-fold and deposited on Au(111) substrates (Agilent annealed N9805B), followed by a 15 hour incubation at 4 ◦ C. Before STM measurements, the majority of the solution was removed from the surface using a pipette applied to the corner, followed by wicking to minimize the remaining solution on the surface. The substrate and surface-adsorbed MtrF layer was then 35 left at room temperature and ambient humidity for either < 30 minutes or > 4 hours, depending on the target hydration condition. STM/TS measurements were performed on a Bruker Innova AFM/STM using 80/20 Pt/Ir STM tips (Bruker CLST-PTBO). Bias was applied to the STM tip while keeping the substrate at ground potential. After scanning, TS I-V spectra were collected at hundreds of locations on MtrF monolayers covering atomically flat Au(111) terraces, with the scan settings of 0.4 V tip bias and 0.4 nA tunneling current defining the initial tip-sample distance. The scanning feedback loop was automatically turned off to maintain the tip a fixed distance above each location while the I-V spectra were actively recorded. 2.6 Appendix II // kmc.c // to compile: gcc kmc.c -o kmc // to run : ./kmc injection_rate ejection_rate // input file (rhop.txt) is required #include <stdio.h> #include <stdlib.h> #include <errno.h> #include <math.h> #include <time.h> #define N 20 /* # of molecules */ #define Max_ngb 19 /* Max. # of neighbor molecules */ 36 #define Max_step 1000000 /* Max. # of KMC steps */ #define stepPrint 100000 /* step to print */ int lsngb[N][Max_ngb+1]; /* Neighbor list */ int occ[N]; /* Molecular occupation */ double rate[N][Max_ngb+2]; double rate_occ[N][Max_ngb+2]; /* Occupation-modified rates */ int ngb_mat[N][N]; double v_mat[N][N]; // Check the neighbors of each heme site void print_lsngb() { printf("lsngb:\n"); int i=0; int j=0; for(i=0; i < N; ++i) { printf("%d : {", i+1); for(j=0; j <= lsngb[i][0]; ++j) { printf("%d,", lsngb[i][j]+1); } printf("}\n"); } } // Check if rates are correctly entered void print_rate() { printf("rate:\n"); 37 int i=0; int j=0; int idx = 0; for(i=0; i < N; ++i) { printf("{"); printf("(%lf,%lf)",rate[i][0], rate[i][1]); idx = 0; for(j=2; j < Max_ngb+2; ++j) { if (i == j-2) idx = idx + 2; else idx = idx + 1; printf("%d:%lf,",(idx-1),rate[i][j]); } printf("}\n"); } } // Initialize lsngb and rate void init_lsngb() { int idx = 1; int i=0; int j=0; int sum = 0; for (i=0; i < N; ++i) { sum = 0; for (j = 0; j < N; ++j){ sum += ngb_mat[i][j]; } 38 lsngb[i][0] = sum; idx = 1; for(j=0; j<N; ++j){ if(ngb_mat[i][j] == 1){ lsngb[i][idx] = j; rate[i][idx+1] = v_mat[i][j]; ++idx; } } } } void init_rate() { int i,j; for(i=0; i < N; ++i) { for(j=0; j < Max_ngb+2; ++j) { rate[i][j] = 0; } } for(i = 0; i < N; ++i){ for(j = 0; j < N; ++j){ ngb_mat[i][j] = 0; } } } int init_data() { 39 int i,j; double v, a0,a1,b0,b1,c0,c1,d0,d1; init_rate(); FILE* fp = NULL; fp = fopen("rhop.txt", "r"); if(fp == NULL) perror("file is not open."); while(fscanf(fp, "%d\t%d\t%lf", &i,&j,&v) == 3) { --i; --j; ngb_mat[i][j] = 1; v_mat[i][j] = v; } // print_rate(); init_lsngb(); return 0; } int main(int argc,char** argv) { int nred = 0, nox = 0, nox_4 = 0, nred_4 = 0, nox_9 = 0, nred_9 = 0; /* # of reduction & oxidation rates */ double t = 0.0; double r,racc,rth; int step,i,k; int p = 0.0; int n[N]; 40 int sum_occ[N]; double avg_occ[N]; double current_red, current_ox; double delt; double time_sum_occ[N]; double time_avg_occ[N]; double time_sum_unocc[N]; double a0, b0, a1, b1 ; double sum_occ_step= 0.0, sum_occ_time= 0.0; double sum_occ_step_r = 0, sum_occ_time_r = 0; for (i = 0; i<N; ++i){ n[i] = 0.0; occ[i] = 0.0; sum_occ[i] = 0; time_sum_occ[i] = 0.0; } init_data(); sscanf(argv[1], "%lf", &a0); sscanf(argv[2], "%lf", &b1); b0 = 0.0; a1 = 0.0; rate[9][0] = a0; rate[9][1] = b0; rate[19][0] = a1; rate[19][1] = b1; 41 srand((unsigned)time(0)); /* Initialize the rondom-number sequence */ int selected = 0; for (step=1; step<=Max_step; step++) { r = 0.0; for (i=0; i<N; i++) { rate_occ[i][0] = rate[i][0]*(1.0-occ[i]); /* Injection rate */ rate_occ[i][1] = rate[i][1]*occ[i]; /* Ejection rate */ for (k=1; k<=lsngb[i][0]; k++) rate_occ[i][k+1] = rate[i][k+1]*(1.0-occ[lsngb[i][k]])*occ[i]; /* Hopping rate */ for (k=0; k<=lsngb[i][0]+1; k++) r += rate_occ[i][k]; } delt = -log(rand()/(double)RAND_MAX)/r; t += delt; /* Increament the time following Poisson distribution */ rth = r*rand()/(double)RAND_MAX; /* Threashold rate to select an event */ /* sum all the occupied time at each site*/ racc = 0; for (i=0; i<N; i++) { if (rth < (racc += rate_occ[i][0])) { occ[i] = 1.0; /* Reduction occurred on molecule i */ ++nred; selected = 1; 42 break; } else if (rth < (racc += rate_occ[i][1])) { occ[i] = 0.0; /* Reduction occurred on molecule i */ ++nox; //fprintf(freox,"%le %d %d\n",t,nred,nox); selected = 1; break; } else { for (k=1; k<=lsngb[i][0];k++) { if (rth < (racc += rate_occ[i][k+1])) { occ[i] = 0.0; occ[lsngb[i][k]] = 1.0; /* Hopping from i to lsngb[i][k] occurred */ selected = 1; break; } } } if (rth < racc) break; } /*Sum all the occupied time at each site*/ for (i = 0; i<N; i++) { time_sum_occ[i] += occ[i]*delt; } /*Sum all the occupied number at each site*/ for (i = 0; i<N; i++) { 43 sum_occ[i] += occ[i]; } /*Print every step%stepPrint step*/ //if (step%stepPrint == 0){ /* if (step > 9999500){ for (i = 0; i<N; i++){ avg_occ[i] = (double)((double)sum_occ[i])/((double)step); time_avg_occ[i] = (double)((double)time_sum_occ[i])/((double)t); printf("%2d %le %.2f %.2f %d\n",i+1, t, time_avg_occ[i], avg_occ[i], occ[i]); } }*/ } nox_9 = nox - nox_4; nred_9 = nred - nred_4; //current_ox = (nox/t)*1.602*1e-19; //current_red = (nred/t)*1.602*1e-19; current_red = nred/t; current_ox = nox/t; //printf("%lf\t%d\t%d\t%le\t%le\n",t,nred,nox,current_red,current_ox); printf("==============================================\n"); printf("Heme_ID Step_avg_occ Time_avg_occ\n"); printf("----------------------------------------------\n"); for(i = 0; i < N; i++){ avg_occ[i] = (double)((double)sum_occ[i])/((double)Max_step); time_avg_occ[i] = (double)((double)time_sum_occ[i])/((double)t); 44 printf("%d\t%lf\t%lf\n",i+1,avg_occ[i],time_avg_occ[i]); sum_occ_step += avg_occ[i]; sum_occ_time += time_avg_occ[i]; } printf("----------------------------------------------\n"); printf("Elapsed time : %le\n", t); printf("Number of electron passing: %d\n", nred); printf("Electron flux (s-1) : %le\n",current_red); printf("==============================================\n"); //printf("%le\t%d\t%d\t%le\t%le\n",t,nred,nox,current_red,current_ox); //printf("%lf\t%lf\t%lf\t%d\t%d\n",sum_occ_step/N, sum_occ_time/N,t, nred,nox); return 0; } Sample Input rhop.txt 1 2 23243.95 1 3 21328.39 1 4 0.56 1 5 0.00 1 6 30637.90 1 7 0.00 1 8 0.02 1 9 0.00 1 10 0.00 1 11 0.00 1 12 0.00 45 1 13 0.00 1 14 0.00 1 15 0.00 1 16 0.00 1 17 0.00 1 18 0.00 1 19 0.00 1 20 0.00 2 1 50364.39 2 3 136.18 2 4 0.00 2 5 0.00 2 6 0.00 2 7 0.00 ... ... ... Sample Output $ ./kmc 1e6 1e6 Heme_ID Step_avg_occ Time_avg_occ ---------------------------------------------- 1 0.973223 0.968416 2 0.941478 0.944147 3 0.331204 0.091298 4 0.233128 0.360527 5 0.892731 0.731726 46 6 0.997741 0.996606 7 0.999977 0.999569 8 0.923933 0.841058 9 0.349415 0.487381 10 0.990533 0.975571 11 0.908239 0.883373 12 0.808990 0.793289 13 0.193038 0.053947 14 0.194907 0.380895 15 0.846266 0.616760 16 0.801179 0.772284 17 0.996367 0.990607 18 0.029144 0.036367 19 0.003020 0.003073 20 0.001094 0.000669 ---------------------------------------------- Elapsed time : 1.548276e-01 Number of electron passing: 164 Electron flux (s-1) : 1.059242e+03 47 Chapter 3 A Framework for Stochastic Simulations and Visualization of the Mtr-Omc Cytochrome Complex 3.1 Introduction Reduction and oxidation reactions govern a variety of biological energy- conversion processes, including respiration. Electron transfer (ET) within and across biological molecules is the key process that essentially dictates these redox reactions[3]. SuchbiologicalETreactionshavebeenstudiedextensivelyinthepast [46, 53–55]. A remarkable example is the rapid ET from metal reducing bacteria such as Shewanella oneidensis MR-1 to extracellular metal oxides that serve as the terminal electron acceptors for anaerobic respiration [56–59]. Under certain con- ditions, S. oneidensis MR-1 produces electrically conductive bacterial nanowires that may mediate long-distance ET to extracellular oxidants [16, 18, 60]. A recent study has revealed that these Shewanella nanowires are membrane extensions decorated with the multiheme cytochromes MtrC and OmcA (Figure 3.1a) [49]. These molecules had been previously identified as outer-membrane cytochromes 48 Figure 3.1: (a) Atomic force microscope (AFM) image of a Shewanella oneidensis MR-1 cell and attached bacterial nanowires. The scale bar is 1μm. (b) Structural model of an outer-membrane Mtr-Omc complex, where each yellow dot represents a heme group. Numerals show the numbering of the 10 hemes for both MtrC/F and OmcA. The inset shows that, in an oxidation reaction, ejection of an electron (e − ) converts the iron atom in a heme group from Fe 2+ to Fe 3+ , whereas, in a reduction reaction, injection ofe − converts it from Fe 3+ to Fe 2+ . (c) Hypothetical model of a bacterial nanowire, in which a lattice of Mtr-Omc complexes mediates long-distance electron transfer. The figure shows a central slice of the nanowire (IM: inner membrane, PP: periplasm, OM:outer membrane). and implicated as the terminal bacterial reductases of extracellular electron accep- tors [13, 22, 61, 62]. MtrC can associate with OmcA, in addition to forming a complexwiththeperiplasmicdecahemecytochromeMtrAthroughtransmembrane porin MtrB; this proposed arrangement has been described as the MtrCAB porin- cytochrome conduit, allowing ET across the cell envelope [22, 51]. Shewanella can also express the MtrFDE conduit, which is homologous to MtrCAB, and MtrF (homolog of MtrC) has been shown to act as a terminal reductase in the absence of MtrC [23]. The localization of MtrC and OmcA along bacterial nanowires suggests that an outer-membranelatticeofcytochromecomplexesmaymediateETovermicrometer length scales, as schematized in Figure 3.1 b and c [49]. Here, each complex is comprised of two decaheme cytochromes, MtrC and OmcA, each of which contains 10 hemes (Figure 3.1b). The iron (Fe) atom in each heme can exist in either of 49 the two valence states, Fe 2+ or Fe 3+ . Conversion of the irons between Fe 2+ and Fe 3+ allows for the hopping of electrons between the hemes as shown in the inset of Figure 3.1b. Despite this plausible hypothesis, however, the microscopic nature of ET dynamics along these bacterial nanowires remains elusive. Visualizing ET dynamics could provide key insight into understanding these fundamental processes and possibly controlling them for a wide range of appli- cations including renewable energy and wastewater treatment [8]. Balabin et al. developed a plugin to the Visual Molecular Dynamics (VMD) software [63] that visualizes ET pathways in biomolecules based on a network model called Pathways [47]. Byun et al. [64] simulated the net electron flux through MtrF using, as input, the sequential heme-to-heme ET rates computed by Breuer et al. [32]. The latter werecalculatedfromsimulationsofthermodynamicandelectroniccouplingparam- etersusingmoleculardynamics, fragment-orbitaldensityfunctionaltheory(similar to divide-and-conquer density functional theory, DCDFT [26]), and the quantum mechanics/molecular mechanics (QM/MM) method [65, 66]. The sequential heme- to-heme ET rates were recombined in kinetic Monte Carlo (KMC) simulations [67–70] to synthesize the global ET dynamics of MtrF [64]. These divide-conquer- recombine KMC (DCR-KMC) simulation results were visualized using VMD, but the visualization was limited to static snapshots. To provide better insight into the dynamics of biological ET processes, we have developed a computational framework named VizBET. The framework consists of an entire workflow of the KMC simulation, and it animates the resulting ET dynamics using a new plugin to VMD. This paper presents key features and imple- mentation details of VizBET, organized as follows. Section 2 describes the overall computational framework. Simulation and visualization results are presented in 50 section 3, and section 4 contains a summary. In this work, we focus on demon- strating the visualization aspect of VizBET by using KMC simulation results with one possible protein-protein docking configuration. In future publications, we will present a more detailed procedure to screen for the most plausible docking com- plex structures, compute the binding free energy of these complexes with molec- ular mechanics/Poisson-Boltzmann surface area (MM/PBSA) [71–74], as well as intra/inter protein ET rates using QM/MM [65] simulations to parameterize the KMC simulation. 3.2 Methods Figure 3.2a summarizes the overall workflow of the VizBET framework, using an outer-membrane MtrF-OmcA cytochrome complex in Shewanella oneidensis MR-1 as an example (the crystal structure of MtrC is not yet available, so the homologous MtrF was used instead). The structures of the MtrF and OmcA molecules determined by X-ray diffraction are downloaded from the protein data bank (PDB) [75]. We first pre-process the PDB files to account for residues that were not resolved in the crystal structure with the aid of the homology-modeling Web server, I-TASSER [20]. The structural outputs from this pre-processing step are used as inputs to molecular-dynamics (MD) simulations that follow the tra- jectories of all atoms by numerically integrating Newton’s equations of motion. Two MD simulations are performed for MtrF and OmcA molecules, respectively, in water. Individual MtrF and OmcA configurations taken from the MD simula- tions are used as inputs to a protein-docking program, ZDOCK [76], to predict the structure of the MtrF-OmcA complex. ZDOCK typically returns 2000 top config- urations according to simple electrostatic and shape criteria. We have developed a 51 C-RANK program to screen these configurations into a small subset of biologically plausible configurations, according to structural and ET criteria detailed below. Moreover, C-RANK serves as the starting point to further screen complex candi- dates by combining MD simulations, which re-solvate and relax the rigidly docked MtrF-OmcA complex structure, and the efficient binding free energy estimation with the method of MM/PBSA, which post-processes an ensemble of configura- tions from the MD trajectory with a combination of a force field and continuum solvent model [71]. The next step in VizBET is to perform DCR-KMC simula- tions for studying the ET dynamics in the selected MtrF-OmcA configurations. The DCR-KMC simulation component (Figure 3.2b) within VizBET combines (1) either DCDFT or empirical approaches to compute the ET rates in a DC fashion, which are then used in (2) KMC simulations of global ET dynamics in the entire complex. We have also developed a plugin to the VMD software named ETViz to animate ET dynamics in the DCR-KMC simulations. Currently, VizBET is implemented using the bash scripting language [77] to be used with the portable batch system (PBS) for job scheduling on a Linux cluster [78]. On high-end parallel supercomputers, we can alternatively use the Swift/T parallel scripting language for efficient distributed-memory workflow processing [79]. On a Grid of distributed parallel computers, the VizBET workflow can be converted to a directed acyclic graph to be executed using a scientific workflow management system such as Pegasus [80]. 52 Figure 3.2: (a) Workflow of the VizBET framework for biological ET visu- alization. Major new components developed for VizBET are represented by squares with thick lines. C-RANK consists of the further screening of ZDOCK results, re-solvation MD (molecular dynamics) and MM/PBSA (molec- ular mechanics/Poisson-Boltzmann surface area) (b) Detailed workflow of the DCR-KMC (divide-conquer-recombine kinetic Monte Carlo) simulation compo- nent. QM/MM: quantum mechanics/molecular mechanics; DCDFT: divide-and- conquer density functional theory. 3.2.1 Structural modeling, MD simulations, docking and binding-free energy computations The first step is to establish homology models for both proteins (MtrF and OmcA). We initially obtain X-ray structures of MtrF and OmcA as determined from the PDB server. The PDB codes for MtrF and OmcA are 3PMQ [24] and 4LMH [81], respectively. It should be noted that hydrogen atoms are not included in these X-ray structures. In addition, certain residues located near the N- and C-termini were not resolved in the crystal structures, likely due to the flexibility of those regions. The unresolved C-terminus residues from the X-ray studies were not included in our homology models, since they result from the purification tag of the recombinant protein. The signal peptide sequence preceding the cysteine residue of the N-terminal’s LXXC cleavage and lipidation motif was also omitted from 53 the homology model, since our goal is to study the mature proteins after cleavage and lipidation of the cysteine, which allows the anchoring of the lipoprotein to the membrane. As a result, our homology models target the amino-acid sequences of the membrane-bound forms of both proteins, starting with the cysteines, as listed in Appendix A. This leaves 25 and 16 N-terminus unresolved residues for MtrF and OmcA, respectively (italicized in Appendix A). In order to build a protein homology model by preserving the original crystal structure and the heme groups’ positions while adding the missing residues, we adopt the following strategy. First, we predict the homology models of MtrF and OmcA by using the I-TASSER server [20] with each protein’s sequence (Appendix A) and its original crystal structure as a reference. Here, the predicted homology models do not include heme groups and the coordinated calcium ions (Ca 2+ ). Second, the predicted structure of the homology model is superimposed on the original crystal structure so as to minimizethemeansquaredisplacementofallnon-hydrogenatomsbetweenthetwo structures using least-square fitting [82]. Third, the new residues from the optimal superposition are transferred and bonded onto the original crystal structure for each individual protein, in such a way that the heme groups and the coordinated Ca 2+ ions are present in the original PDB files (one Ca 2+ ion in 3PMQ; two Ca 2+ ions in 4LMH) can be preserved at their original positions. The second step is to establish both proteins’ solvation structures. With the established homology models of the mature proteins, hydrogen atoms are explicitly added to both proteins. All of the amino acids are protonated (histidine (His) is treated neutral) with the exception of glutamic acid (Glu) and aspartic acid (Asp), which are taken as deprotonated. The N-terminus (NH 3+ ) and the C- terminus (COO − ) are assigned with charges of +e and -e respectively. Those 54 Figure 3.3: Thermodynamic cycle to calculate the binding free energy of two pro- teins, A and B assignments result in a net charge of -37e and -31e, respectively, for MtrF and OmcA including heme groups at pH 7. We perform MD simulations of individual MtrF and OmcA using the Gromacs software package (version 4.6.5) [83] and the Charmm 27 force-field parameters [84]. In both proteins, each Fe ion in a heme group is hexa-coordinated, i.e., two additional axial bonds are formed below and above the molecular plane of porphyrin macrocycle. We use the default harmonic bonding and dihedral potential in Charmm 27 for the Fe-N coordination bonding. In addition, to maintain the coordination geometry of Fe atom as the heme center, the angular potential of N-Fe-O (where O is the water oxygen) in Charmm 27 is adopted to represent the angular interaction of N-Fe-N and the equilibrium angle is set at 180 degree. The calcium ions, which coordinate with proteins’ amino residues, are included in the simulations and interact with proteins via van der Waals and electrostatic potentials. Initially, we relax the newly added residues by MD simulationsin vacuumwith a stepwiseheating protocol from 60K to 298.15 K. To remove the ’bad contacts’ between the added residues and the rest, which lead to huge repulsion, only the added residues are relaxed while all the others are fixed at a low temperature. Subsequently, we obtain the solvated protein structure by MD simulations in an aqueous environment with the TIPS3P water model [84]. To 55 neutralize the system, Na + counter ions are added in both vacuum and solvation simulations. The third step is to obtain the predicted structure of the MtrF-OmcA complex using a protein-docking program, ZDOCK (version 3.0.2) [76]. As an input to the ZDOCK program, we use two PDB files for MtrF and OmcA, respectively, which are sampled from the MD simulations described above. ZDOCK predicts the structure of a protein-protein complex by treating each protein as a rigid body anddockingthem. Thecomplexstructuresarescoredusingacombinationofshape complementarity, electrostatics, and statistical potential terms. ZDOCK returns N top-ranked configurations according to the scoring as N PDB files labeled by the ranking (N is typically 2000). As a fourth step, we use our C-RANK program to sub-select M biologically plausible complex candidates from the N ZDOCK outputs, whereM«N,usingtwocriteria: (1)theminimuminter-cytochromeheme- heme distance (edge-to-edge metric) should be less than 10 Å, to support rapid ET, and (2) the orientation of both proteins in the complex must be compatible with their lipophilic nature, allowing both N-terminus lipid binding sites to face a common plane that serves as proxy for the outer membrane. In the ZDOCK rigid docking, the protein hydrogen atoms and the coordinated calcium ions have been ignored. Those hydrogen atoms are finally re-added and calcium ions are re-positioned at the original sites of both proteins according to their coordinated amino residues. Then the whole MtrF-OmcA complex is re- solvated with added counter ions following the stepwise heating protocol from 60 K to 298.18 K. After the heating, the complex is relaxed in the aqueous environment for 20 ns. The top candidates are now further screened by ranking the binding affinitiesbasedonthebindingfreeenergycalculatedusingtheMM/PBSAmethod. 56 For the MM/PBSA calculation, we have modified the recently developed suite of Bash/Perl scripts, GMXPBSA 2.1 [74], which combines the Gromacs software [83] and the Adaptive Poisson-Boltzmann Solver (APBS) program [85]. In MM/PBSA, theprotein-proteinbindingfreeenergy(4G binding )ofA+B→AB isdefinedasthe free energy difference between the complex (AB) and the sum of the free energies of the individual proteins (A and B) in aqueous environment as 4G binding =G AB aqu −G A aqu −G B aqu (3.1) 4G binding is calculated with the thermodynamic cycle shown in Figure 3.3 to include the complex solvation effect in the binding while minimizing the com- putational cost. The corresponding binding free energy is expressed as 4G binding =4G gas −4G A sol −4G B sol +4 AB sol (3.2) 4G gas = (4hE intra i +4hE LJ i +4hE coul i)−Th4S MM i. (3.3) 4G i sol =4G i polar +4G i nonpolar ,i =A,BorAB (3.4) 4G i nonpolar =γ i hSASA i i,i =A,BorAB (3.5) 4G polar =4G AB polar −4G A polar −4G B polar (3.6) 4G nonpolar =4G AB nonpolar −4G A nonpolar −4G B nonpolar (3.7) The symbolhi represents an average over configurations sampled from MD sim- ulation. 4G gas in equation3.3 is the binding free energy for the AB com- plex in the gas phase and consists of contributions from the averaged changes in intra-molecular interactions (E intra ), inter-molecular interactions that include 57 Lennard-Jones (E LJ ) and Coulombic (E coul ) potentials, and the entropy contribu- tion (S MM ), all computed with molecular mechanics. Due to its inaccuracy and debated contribution to the free energy, we follow the previous research [73, 74, 86] and ignore the entropy contribution. To reduce the computational time, we per- form the MM/PBSA calculation with configurations from a single solvation MD simulation for the complex, rather than three separated solvation MD simulations foreachindividualcomponent(AandB)andthecomplex. Accordingly, thechange in the intra-molecular potential in the gas phase is ignored. The total solvation free energy (4G i sol ) for component i (i = A, B, or AB) consists of both polar (4G i polar ) and nonpolar (4G i nonpolar ) contributions (equation (3.4)). (4G i polar ) is the protein-solvent electrostatic energy difference between the gas phase ( = 1 and the solution phase ( = 80), computed using the Poisson-Boltzmann method in a continuum implicit solvent medium and with grid spacing 0.5 Å. The dielectric constant inside proteins is set at 2.0. The nonpolar contribution (4G i nonpolar ) in equation (3.6) is estimated from the solvent accessible surface area (SASA) using the surface tension (γ = 2.77kJmol −1 nm −2 [51]. A probe with the radius of 0.14 nm is used to identify the dielectric boundary in the SASA calculation. (4G i polar ) and(4G i nonpolar )asshowninequation(3.6)and(3.7)representtheitemizedenergy terms for the electrostatic and nonpolar potentials respectively between the com- plex and two individual proteins. In our MM/PBSA calculation, each component consists of a protein, 10 heme groups, and its coordinated calcium ion(s). 58 3.2.2 Divide-conquer-recombine kinetic Monte Carlo sim- ulations of electron transfer We perform KMC simulations [67–70] to study ET dynamics in the top-ranked MtrF-OmcA configurations selected. The KMC simulation [64] treats electron- hopping events in the MtrF-OmcA heme network with N h (= 20) sites, where a heme site is labelled by index i∈{1,...,N h }. The i-th heme is either occupied by an electron (n i = 1, corresponding to Fe 2+ ) or unoccupied (n i = 0, corresponding to Fe 3+ ), wheren i is the electron occupation number of thei-th heme. The system is characterized by electron hopping ratesk ij between a pair (i,j) of adjacent heme sites, electron injection rate α into selected entrance heme, and electron-ejection rate β from a selected exit heme. With the electron hopping rates as an input, KMC algorithm is implemented as described in the method section of chapter 2. VizBET supports two options in computing the ET rates k ij . In both options, a thermodynamic integration protocol [87] is needed to estimate the change of the Gibbs free energy4G ij and the reorganization energy λ associated with the ET from heme sites i to j [30–32]. The two options differ in the way the electronic coupling H ij for the ET reaction is computed. In the first-principles QM/MM option [88, 89], a divide-conquer-recombine (DCR) algorithmic framework [26] is employed. Here, the divide-and-conquer (DC) phase computes H ij for each heme pair using the QM/MM method, where each QM calculation employs DC density functional theory (DFT). The calculated ET rates are used in the recombina- tion phase to synthesize global ET dynamics using KMC simulations as described 59 above. According to the non-adiabatic rate equation, the rate of ET, k ij , from the i-th heme to the j-th heme is expressed as [90] k et = 2π ~ |H DA | 2 √ 4πλk B T exp − (ΔG +λ) 2 4λk B T (3.8) , where T is the temperature, and and k B are the Planck and Boltzmann con- stants, respectively. To calculate inter-heme ET rates within a MtrF molecule, for example, Breuer et al. [32] estimated the electronic coupling by using the QM/MM method and a variant of DCDFT called fragment-orbital density functional theory. The second option to calculate k ij employs a phenomenological form of the non-adiabatic rate-equation[3], k hop (s −1 ) = 10 13 exp −βR− (ΔG +λ) 2 4λk B T (3.9) whereR ij is the edge-to-edge distance between hemesi andj andβ is a tunneling decay factor. Despite its simplicity, the equation 3.9 correctly reflects the expo- nential decay of the tunneling probability and predicts ET rates in broad classes of biomolecules [54]. 3.2.3 Visualization of electron transfer To animate the ET dynamics in DCR-KMC simulations, we have developed a plugin to the VMD software [63]. VMD is a molecular visualization program for large biomolecular systems using 3D graphics and built-in scripting. Our plugin is implemented using the Tcl scripting language [91]. We have used VMD version 1.9.1. 60 Multiple time frames from the KMC simulation are saved as a multi-frame PDB file, in which the n i value of each Fe atom is written in the temperature- factor (BETA) field in its ATOM record. The Tcl script copies the PDB-BETA values to the USER fields in the TRAJECTORY data category in VMD, so that the time variation of the n i values can be animated as color changes according to one of the built-in color scales in VMD. The animation can be shown in a display window or can be saved as a sequence of image files to create a movie file. The Fe atoms are represented by spheres with color-coded electron occupation n i . The Fe charge dynamics are overlaid with the NewCartoon representation in VMD of the protein complex. This animation is used to examine the hopping of individual electrons within the heme network in the protein complex. Alternatively, VizBET can visualize the time averaged electron occupation, hn i i t = P T s=1 n i (s)τ(s) P T s=1 τ(s) (3.10) In this way, we can animate how the time-averaged electron distribution converges to a steady state. Appendix B shows the Tcl script that reads a multi-frame PDB file named FeOcc.pdb and copies the BETA field of each atom into the USER field. 3.3 Results To illustrate the use of VizBET, we study lateral ET parallel to the outer mem- braneacrossthetoprankedconfigurationoftheMtrF-OmcAcomplex,accordingto the ET and orientation criteria described above, shown in Figure 3.4. Remarkably, this MtrF-OmcA configuration is arranged similarly to the OmcA dimer crystal- lized by Edwards et al. [81], which is the only reported structure of two interacting 61 Figure 3.4: The Top-ranked MtrF-OmcA configuration according to our ET and orientation criteria, showing the 20 heme arrangement within the complex. Hemes 5 of both proteins define the inter-cytochrome contact with a 5.58 Å edge-to-edge distance. decaheme cytochromes. This similarity further suggests that our screening proce- dure is capable of predicting biologically plausible structures of the complex. In this top ranked MtrF-OmcA configuration, heme 5 of each cytochrome serves as the site of cytochrome-cytochrome interaction, linking the two proteins’ long axes from heme 10 of one to heme 10 of the other. 3.3.1 Binding free energy For this example configuration (Figure 3.4), MtrF-OmcA binding free energy is estimated with high throughput MM/PBSA computation by post-processing the configurations derived from the final 4.5 ns of MD simulation, at the stage 4hE LJ i (kJ/mol) 4hE coul i (kJ/mol) 4hE polar i (kJ/mol) 4hE nonpolar i (kJ/mol) 4hE binding i (kJ/mol) −177.260± 8.412 18810.5± 12.203 −17945.519± 12.843 −21.237± 0.805 666.485± 17.835 Table 3.1: Binding Free Energy Computation from MM/PBSA 62 of relaxing the complex in water after the rigid-docking. Table 1 lists the total binding free energy as well as the itemized energy terms. To be consistent with MD simulation, negligible salt concentration is used in MM/PBSA computation. In comparison, the polar contribution (4E coul and4G polar ) is dominant in the overall binding free energy. It should be noted that our interest here is to inves- tigate the binding affinity, which is dependent on a particular relative orientation and position of both docking proteins. Consequently, there is no need to compute the exact docking free energy change, which includes the entropy contribution of the rotational motions of a ligand and a receptor. MM/PBSA can achieve high efficiency in computing the binding free energy without sampling orientational space in the docking, while still taking into account the thermal averaging and the important dehydration effect for two objects in contact in aqueous environment [92, 93]. As shown in Table 1, the protein-solvent electrostatic potential,4G polar , as a dominant term, is comparable with protein-protein electrostatic potential, 4E coul . Therefore, after the initial screening (based on the criteria proteins’ shape complementarity, neighboring heme groups’ distance, and the proteins’ lipid-site orientation), the orientation-dependent candidates can be further selected accord- ing to the binding free energy, which incorporates the dehydration effects. More detailed discussion about this screening will be provided in future work. The small fluctuations in these energy terms listed in Table 1 also demonstrate the structural stability of the complex and the convergence of the computation. 3.3.2 Nonequilibrium phase transitions A multistep ET process has been hypothesized to allow long-distance electron transport through multiple complexes along a bacterial nanowire [49]. To examine 63 Figure 3.5: (a) Phase diagram of the time-averaged electron occupation density n for all 20 hemes as a function of the incoming (α) and outgoing (β) ET rates. The white dotted lines delineate the low-density (LD), high-density (HD) and maxi- mum current (MC) phases. The white dashed circle corresponds to experimentally estimated respiration rates. (b) The corresponding phase diagram of the net elec- tron flux j. this hypothesis along our top ranked configuration, the ET rates within the MtrF- OmcA complex are obtained by parameterizing and generalizing the H ij ,G ij , and λ calculated in Ref. [32] for MtrF. The H ij parameters throughout the complex were fit to a single exponentialh|H ij | 2 i 1/2 (r) =Aexp(−β(r−r 0 )/2), where r is the edge-to-edge distance between i and j, r 0 = 3.6 Å, β = 1.65 Å −1 and A = 3.77 meV, as presented in [32] for MtrF. Similarly, the G ij , and λ values computed for specifici−j pairs in MtrF are generalized throughout the whole combined MtrF- OmcA complex in our example. Figure 3.5 shows the KMC simulation results. Here, we chose the injection site as heme 10 in MtrF, and the ejection site as heme 10 in OmcA. By varying the incoming (α) and outgoing (β) ET rates, our simula- tion reveals a rich phase diagram for the MtrF-OmcA complex in the α−β plane (Figure 3.5a). Three distinct phases are observed in the time-averaged occupation density of the icosaheme complex, similar to those recently reported for the deca- heme MtrF [64]. A low-density (LD) phase, where the hemes are mostly oxidized, is observed when transport is limited by ET from intracellular donor molecules 64 (α < β). A high-density (HD) phase, where the hemes are mostly reduced, is observed when transport is limited by ET to extracellular electron acceptors (α > β). Finally, a ’maximum-current’ (MC) phase is observed when both and exceed the smallest heme-to-heme ET rate within the complex, which is∼ 10 4 s −1 along the octaheme path [32]. The high electron flux character of the MC phase, com- pared to the LD and HD phases, is easily identified in Figure 3.5b, which presents the phase behavior of the electron flux in theα-β plane. Intriguingly, cellular respi- ration measurements and estimates of the cellular cytochrome content translate to ET rates up to 10 3 s −1 per outer-membrane cytochrome [64]. This rate is located near the triple junction between the three phases; it appears that life operates where a small change in the electrochemical environment triggers large bioelec- tronic responses. It should be noted that these nonequilibrium phase transitions are a direct consequence of electron-electron interactions [94–96], which necessitate the use of many-body theory or simulations such as our KMC simulation. 65 Figure 3.6: (a) Visualizing the KMC simulation of ET dynamics in the MtrF- OmcA complex. The Fe atoms are represented as spheres, and are overlaid with the NewCartoon representation of the entire protein complex, where MtrF and OmcA are represented in red and blue shades, respectively. Fe 2+ and Fe 3+ are represented by red and blue spheres, respectively. (a) A single snapshot from the final 500 KMC steps of the simulation. (b) A snapshot from the same simulation as (a), where an ET event is represented by a directed edge. (c) Time-averaged electron occupation, of each Fe atom (as defined by Eq. (13)). Injection/ejection parameters: α = β = 10 5 s −1 . 66 3.3.3 Animation of electron-transfer dynamics Figure 3.6a shows a snapshot of the animation of the above KMC simulations, where α = 10 5 s −1 and β = 10 5 s −1 . Such a static representation does not neces- sarily capture the nature of the many-electron ET dynamics, which is essential for understanding the microscopic mechanisms underlying the nonequilibrium phase transitions. The animation capability of VizBET is expected to bring about such dynamic insight. The three supplementary movie files located in Appendix C ani- mate the ET dynamics corresponding to the LD (α = 10 2 s −1 and β = 10 5 s −1 ), HD (α = 10 5 s −1 and β = 10 2 s −1 ), and MC (α = 10 5 s −1 and β = 10 5 s −1 ) phases, respectively. Each movie depicts the final 500 KMC steps from a total of one mil- lion steps simulated. These movies show representative ET dynamics in a steady state in each phase after the initial transient well dies out. In the animation of the LD phase, a majority of the hemes are unoccupied, corresponding to the color blue. In the animation of the HD phase, a majority of the hemes are occupied, corresponding to the color red. Finally, in the MC phase, we observe extensive ET events, represented by more frequent and distributed color changes of the indi- vidual hemes. These movies help give insight into the possible roles and impact of individual hemes in the complex, for instance by identifying sites that act as carrier traps, or always reduced hemes which may act as electron donors to soluble acceptors capable of diffusing inside the complex [64]. To highlight ET events, we augment the above animations by representing each eventasadirectededge. Namely, anETeventbetweentwoFeatomsisrepresented by a cylinder that connects the Fe atoms. Here, one end of the cylinder attached to the source of ET is colored blue, while the other end connected to the destination of ET is colored red. Supplementary movie S4.mov in Appendix C animates ET 67 events in the MC (α = 10 5 s −1 and β = 10 5 s −1 ) phase. Figure 3.6b shows a snapshot of this animation. Asanalternativetothemovies, VizBEToutputsthecolor-codedtime-averaged electron occupation of each Fe atom as defined by Eq. (13). Figure 3.6c shows such a representation of the time-averaged occupation animation. This representation allows the user to see how the distribution of time-averaged occupation approaches a steady state distribution as time progresses. Supplementary movie S4.mov in Appendix C animates ET events in the MC (α = 10 5 s −1 and β = 10 5 s −1 ) phase for 10,000 KMC steps. 3.4 Conclusions We have described a computational framework named VizBET to visualize bio- logical ET dynamics. This paper has presented VizBET using an outer-membrane MtrF-OmcA cytochrome complex in Shewanella oneidensis MR-1 as an exam- ple. Starting from X-ray structures of individual cytochromes, MD simulations are combined with homology modeling, protein docking, and binding free energy calculations to sample the configurations of the complex as well as the change of free energy associated with ET. Based on this information, combined with the QM calculation of electronic couplings, KMC simulations are performed to study global ET dynamics in the multiheme network of the complex. Visualization of the KMC simulation has been implemented as a plugin to the VMD software. Such spa- tiotemporal data visualizations and analyses [97] are expected to provide valuable insight into the microscopic mechanisms of ET processes in not only biological but also other novel nanostructures such as dislocation-mediated metallic nanowires in 68 ceramics [98, 99]. Currently, we are applying VizBET to a lattice of protein com- plexes and protein-solid interfaces. The visual simulation results will be published elsewhere. 3.5 Appendix 3.5.1 Sequences Sequences for building the MtrF and OmcA homology models in the FASTA format are provided below. The residues within brackets were unresolved in the original crystal structures of MtrF (PDB file 3PMQ) and OmcA (PDB file 4LMH). MtrF Genome Sequence [CGGSDGDDGSPGEPGKPPAMTISSL]NISVDKVAISDGIAQVDYQVSNQENQAVVGIP SATFIAAQLLPQGATGAGNSSEWQHFTSETCAASCPGTFVDHKNGHYSYRFSATFNGMN GVTFLSDATQRLVIKIGGDALADGTVLPITNQHYDWQSSGNMLAYTRNLVSIDTCNSCH SNLAHGGRYNQVETCVTCHNSKKVSNAADIFPQMIHSKHLTGFPQSISNCQTCHADNPD LADRQNWYRVPTMEACGACHTQINFPAGQGHPAQTDNSNCVACHNADWTANVHSNAAQT SALAQFNASISSASMDANGTITVAVSLTNPTTGTAYADSADKLKFISDLRIYANWGTSF DYSSRSARSIRLPESTPIAGSNGTYSYNISGLTVPAGTESDRGGLAIQGRVCAKDSVLV DCSTELAEVLVIKSSHSYFNMSALTTTGRREVISNAKCASCHGDQQLNIHGARNDLAGQ CQLCHNPNMLADATATNPSMTSFDFKQLIHGLHSSQFAGFEDLNYPGNIGNCAQCHIND STGISTVALPLNAAVQPLALNNGTFTSPIAAVCSNCHSSDATQNHMRQQGAVFAGTKAD ATAGTETCAFCHGQGTVADVLKVHPIN OmcA Genome Sequence [CGGSDGKDGEDGKPGV]VGVNINSTSTLKAKFTNATVDAGKVTVNFTLENANGVAVLG LTKDHDLRFGIAQLTPVKEKVGETEADRGYQWQAYINAKKEPGTVPSGVDNLNPSTQFQ 69 ANVESANKCDTCLVDHGDGSYSYTYQVNVANVTEPVKVTYSADATQRATMELELPQLAA NAHFDWQPSTGKTEGIQTRNVVSIQACYTCHQPESLALHGGRRIDIENCASCHTATSGD PESGNSIEFTYMIHAIHKGGERHTFDATGAQVPAPYKIIGYGGKVIDYGKVHYPQKPAA DCAACHVEGAGAPANADLFKADLSNQACIGCHTEKPSAHHSSTDCMACHNATKPYGGTG SAAKRHGDVMKAYNDSLGYKAKFSNIGIKNNALTFDVQILDNKDQPIGKEFISDPSAYT KSSIYFSWGIDKDYPAYTAGSRYSDRGFALSNSKVSTYNEATKTFTIDSTNSNLKLPAD LTGMNVELYAGVATCFNKGGYGVEDVVATPCSTDTRYAYIQDQPFRFKWNGTDTNSAAE KRRAIIDTAKCSGCHNKEIVHYDNGVNCQACHTPDKGLKTDNTYPGTKVPTSFAWKAHE SEGHYLKYAGVQSGTVLKTDCATCHTADKSNVVTGIALGRSPERAWLYGDIKNNGAVIW VSSDAGACLSCHQKYLSDAAKSHIETNGGILNGTSAADVQTRASESCATCHTPSQLMEA HGN 3.5.2 VMD tcl script mol new FeOcc.pdb waitfor all set all [atomselect top all] set frame 0 set in [open FeOcc.pdb r] set beta {} while { [gets $in line] != -1 } { switch -- [string range $line 0 3] { END { $all frame $frame $all set user $beta set beta {} incr frame 70 } HETA - ATOM { lappend beta [expr [string range $line 60 65]] } } } 71 Chapter 4 A Scalable Divide-and-Conquer Algorithm for Kinetic Monte Carlo Simulations of Long-Time Dynamics 4.1 Introduction Ever-increasing processing power of parallel computers [100] is continuously extending the spatiotemporal scales of particle simulations [101], where each par- ticle could represent an atom in molecular dynamics (MD) simulations or a human in agent-based social simulations. Essential to extending the spatial scale is linear- scaling algorithms based on spatial locality principles, in which the computational complexity scales linearly with the number of particles N [26, 102–104]. A harder problem is to increase the time scale of processes that can be simulated, due to the inherently sequential nature of time as a result of causality [105]. In some cases, temporal locality principles alleviate the sequential-time bottleneck in MD simulations [26, 103, 106]. Namely, a many-particle system tends to stay near a local minimum-energy configuration over a long duration of time, which is bounded by a rare transition over short time period to another minimum. In 72 such a case, the transition state theory (TST) [107, 108] uses a local equilibrium assumption to reformulate the sequential long-time dynamics as computationally more efficient parallel search for low-activation transition events [109–112], where the rates of the events are computed from their activation barriers. The most widely used simulation method based on TST is kinetic Monte Carlo (KMC) [67– 70, 113, 114]. In KMC simulations, an event to occur is stochastically selected from a database of events. The simulated time progresses according to Poisson statistics, where the time increment at each KMC step t is inversely proportional to the sum of the rates of all possible events. Since the summed rate grows as at least O(N), KMC simulations progress much more slowly for larger systems: t =O(1/N). With anO(N) implementation of the computation for a single KMC step, therefore, the computational complexity to simulate a time duration of τ is thusO(N×τ/t) =O(N×τN) =O(N 2 ). The conventional KMC algorithm is thus not scalable for largeN. More efficient long-time simulation is possible because of the spatiotemporal locality of activated events. Namely, they are usually localized not only temporally but also spatially [115]. This leads to computationally more efficient simulation methods that concurrently sample multiple, spatially localized events [28, 29, 116–118]. In such cases, event sampling may be enhanced by an ensemble-mean field approach. For example, a time-dependent Hartree (TDH) approximation has been employed to sample the dynamics of a small key subsys- tem within a large molecule [119]. Here, an ensemble of MD simulations for the subsystem is embedded in an MD simulation of the entire molecule, where the subsystem and the rest of the molecule interact via an ensemble-mean field [119]. In the light of the localized nature of atomistic events [115], we here introduce a 73 Figure 4.1: Heme groups in a dimer of decaheme cytochromes, where the Fe atom within each heme is represented by a yellow sphere. Each of the two cytochromes (colored magenta and cyan) contains 10 hemes. local transition-state (LTS) approximation, in conjunction with the TDH approx- imation, into the KMC simulation method. This leads to a divide-and-conquer strategy that simulates multiple events concurrently based on spatial decomposi- tion. The resulting simulation method is equivalent to the synchronous parallel KMC (SPKMC) method [28, 29, 117, 118]. Here, we additionally introduce a dual linked-list cell (DL 2 C) method to further reduce the computational cost. This paper presents the resulting divide-and-conquer KMC (DCKMC) algorithm, and isorganizedasfollows. ThenextsectiondescribestheDCKMCalgorithm. Numer- ical results are presented in Section 3, and Section 4 contains conclusions. 4.2 Methods In this section, we present the divide-and-conquer kinetic Monte Carlo (DCKMC) algorithm and its implementation on parallel computers using a dual 74 linked-list cell (DL 2 C) method. As a concrete example, we use electron-transfer (ET) dynamics between heme groups in a large network of cytochromes [64, 120]. (Heme is an organic compound called porphyrin that contains an iron (Fe) atom at its center.) Figure 1 shows a dimer of decaheme cytochromes and the heme groups in it. The Fe atom in each heme can exist in either of the two valence states, Fe 2+ or Fe 3+ . Conversion of irons between Fe 2+ and Fe 3+ allows for the hopping of electrons between adjacent hemes (Figure 4.1). KMC simulation treats electron-hopping events in a network withN hemes, where a heme site at position q i is labeled by indexi∈ 1,...,N. Thei-th heme is either occupied by an electron (n i = 1 or reduced, corresponding to Fe 2+ ) or unoccupied (n i = 0 or oxidized, cor- responding to Fe 3+ ), wheren i is the electron occupation number of thei-th heme. The system dynamics are characterized by (i) electron hopping ratesW ji from the i-th heme to thej-th heme for adjacent (i,j) pairs, (ii) electron injection rateW red into selected entrance hemei ent , and (iii) electron-ejection rateW ox from a selected exit hemei exit . In our work, W ji is computed from the positions of hemes, q i and q j , and their precomputed free energies, G i and G j [28]. In addition, W ji depends on the occupation numbers, n i and n j , since an electron can hop from i to j only whenn i = 1 andn j = 0. Due to the exponential decay of the electron-hopping rate with respect to the heme-pair distance,W ji = 0 when|q i −q j | >q cut , whereq cut ∼ 1nm is a cutoff distance. The following discussion also applies to other applications such as photoexcitation dynamics in solar cells [20] and computational synthesis of new materials based on chemical vapor decomposition and other techniques [19], as long as an event is spatially localized within a cutoff distance qcut. 75 4.2.1 Kinetic Monte Carlo simulation To introduce a notation necessary for the derivation of the DCKMC algorithm, Appendix A describes the conventional KMC simulation method [67–70, 113, 114]. We start a simulation by emptying all heme sites and resetting the time to 0. At each KMC step, one of the following events occurs: (i) an electron is injected with rate W red if the entrance heme is unoccupied; (ii) an electron is ejected with rate W ox if the exit heme is occupied; or (iii) an electron hops from heme i to one of its nearest-neighbour hemes j with rate W ji if heme i is occupied and heme j is unoccupied. The method for calculating the rates for the ET dynamics can be found in Ref. [28]. KMC simulation consists of a time-stepping loop. Lete be one of the possible events listed above, with W e being its rate, e be the total number of possible events, and W = E X e=1 W e (4.1) be the sum of all possible events. At each KMC step, the time is incremented by t =−ln(ξ 1 )/W (4.2) whereξ 1 is a uniform random number in the range [0,1]. The probability of choos- ing a particular event is proportional to its rate, and specific event e∗ is chosen such that e∗ =min e { e X e=1 W e >Wξ 2 }, (4.3) where ξ 2 is another random number. 76 Figure 4.2: Two-dimensional schematic of spatial decomposition into an array of 32 domains in thex andy directions. Each cytochrome in a lattice of cytochrome- dimers is represented by its 10 Fe positions (colored yellow). 4.2.2 Divide-and-conquer KMC algorithm The standard KMC method in Sec. 4.2.1 is not scalable to larger system sizes. The cumulative event rate W grows as O(N), and accordingly the time scale of the simulation determined by its inverse becomes progressively smaller in larger systems, as is seen in Eq. (4.2). To overcome this scaling problem, we parallelize KMC simulations in a divide-and-conquer (DC) fashion, using a synchronous for- mulation and graph coloring to avoid conflicting events [28, 29, 117, 118]. To do so, we first introduce a local transition-state (LTS) approximation, in which events outside a cutoff distance are assumed to be statistically independent. We thenintroduceatime-dependentHartree(TDH)approximation, i.e., thesimulated system is subdivided into spatially localized domains and local events in a domain are sampled independently of those in the other domains. 77 Domain decomposition: The DCKMC algorithm partitions the 3-dimensional spaceR into spatially localized domains that are mutually exclusive, R 3 = [ d R 3 d ; R 3 d \ R 3 d 0 =∅ (4.4) For simplicity, we consider a simple mesh decomposition, in which the total rect- angular system is subdivided into subsystems of equal volume. The domains are arrangedinacubiclattice. Figure4.2showsatwo-dimensionalschematicofspatial decomposition. In DCKMC, the cumulative rate W d is computed for each non-overlapping domainR 3 d as in Eq.(4.1) but summing only over the local events within the d-th domainR 3 d . Below, we denote the rate of thee-th event in domaind asW (d) e . Then, the global maximum rate is computed as W max =max d {W (d) }. DCKMC simula- tion is performed concurrently among all the domains similarly to the sequential KMC algorithm in Sec. 4.2.1, except that a null event (where nothing occurs) is generated with the rate ofW (d) 0 =W max −W (d) . This allows globally synchronous time evolution, where the time is incremented byt =−ln(ξ 1 )/W max at each KMC step, where ξ 1 is a uniform random number in [0,1] which is common to all the domains. By keeping the size of domains constant and increasing the number of domains, W max should remain nearly constant and thus t = O(1) instead of O(1/N) in the conventional KMC method. Each domain d then generates a local random number ξ (d) 2 (∈ [0, 1]) independently of each other, and chooses an event to occur, e (d) , according to e (d) = min e { e X c=1 W (d) e >W max ξ d 2 } (4.5) 78 Figure4.3: (a)Two-dimensionalschematicoftheblockeddomaindecompositionin DCKMC. Each block of domains is delineated by solid lines, and is subdivided into 2× 2 quadrants labeled (or colored) 0, 1, 2 or 3. (b) Caching of heme occupancy in parallel DCKMC with 3× 3 spatial subsystems. The central subsystem is a block of 2× 2 domains as delineated by thick lines. Each small square is a cell slightly larger than q cut , used to compute electron- hopping rates between neighboring hemes. A heme in a dark gray cell only interacts with those in the nearest neighbor cells colored in light gray. Accordingly, the central subsystem only needs to be augmented with one layer of cells with the heme information in them cached from the neighbor subsystems. To avoid conflicting events within q cut to occur between neighboring domains, the domains are colored so that no domains of the same color are adjacent to each other. In a cubic lattice in three dimensions, this is achieved by arranging adjacent domains into a block of 2× 2× 2 domains. Here, we assume an even number of domains in each Cartesian direction. We require that the side length of each domain to be larger than 2q cut , so that no heme can be a destination of more than one electron hopping event. Within each block, the eight domains are indexed with an octant index (or color), c∈ [0,..., 7]. Figure 4.3 shows a two-dimensional schematic of the blocked domain decomposition. In each DCKMC simulation step, a color is chosen randomly, and events occur only in the domains of the chosen color. 79 4.2.3 Dual cell linked-list algorithm A naive double-loop implementation to compute electron hopping rates W ji between heme pairs (i,j) would scale as O(N 2 ). In fact, with a finite cut- off length q cut , each heme interacts with only a limited number of other hemes, (4πN/3V )q 3 cut ∼ O(1), on average (V is the volume of the system). This makes the computational cost to process all pairs of neighbor hemes to be O(N). The linked-list cell algorithm computes the entire pair interaction with O(N) opera- tions [30, 31]. This algorithm first divides the system into small cells of equal volume, where the edge length of each cell is made at least q cut . A heme in a cell interacts with only other hemes in the same cell and its 26 neighbor cells. The hemes belonging to a cell is organized as a linked list. In an array implementation, head[c] holds the index of the first heme in thec-th cell, orhead[c]= NULL if there is no heme in the cell; lscl[i] holds the heme index to which the i-th heme points. All hemes in cellc are traversed by following links inlscl[], starting with the heme specified by head[c]. The O(N) algorithm loops over cells, and for each cell, it loops over the 27 neighbor cells (including itself). For each pair of neighbor cells, the corresponding linked lists are used to traverse all pairs of hemes residing in the cell pair. For further improvement of performance, we have designed a DL 2 C algo- rithm, which utilizes two types of linked-list cells: (1) small cells for construct- ing neighbor-heme lists for managing nearest-neighbor hopping events; (2) larger cells for domain-block coloring. In our original implementation, cell linked lists, head[] and lscl[], were used in two-fold purposes. First, they were used in func- tionmake_list() to construct neighbor-heme listslsngb[][], before the main KMC simulation loop is entered. For the i-th heme, lsngb[i][0] stores the number of 80 other hemes within q cut and lsngb[i][j] stores the ID of the j-th neighbor heme. Construction of lsngb[][] amounts to doubly nested for loops over cells (each with a while loop to follow the link of hemes associated with the cell), which is com- putationally intensive. Second, the same cell linked lists were used at each KMC step for traversing the hemes in a given colored domain. While the computation in the first usage scales as O(N), the prefactor grows quadratically with the average number of hemes in a cell, or equivalently sixth power of the cell size, r cell . To reduce the computation in the first usage, DL 2 C uses the minimal cell sizer cell ∼q cut ∼ 1 nm, for constructinglsngb[][] in function make_list(). Subsequently, the algorithm dynamically creates another set of cell linked lists using a larger r cell , which is dictated by parallel-computing considera- tions discussed in the next subsection. Typical values used in sample applications in the next section are r cell = 8 nm or larger. 4.2.4 Parallelization Spatial decomposition: The physical system to be simulated is partitioned into subsystems of equal volume. Processors in a parallel computer are logically arranged according to the topology of physical subsystems. Particles that are located in a particular subsystem are assigned to the corresponding processor. For simplicity, weuseasimple3Dmesh(ortorusbecauseoftheperiodicboundarycon- ditions) decomposition. Subsystems are (and accordingly processors are logically) arranged in a 3D array of dimensions P x ×P y ×P z . Each subsytem is a paral- lelpiped of sizeL x ×L y ×L z . Each processor is given a unique processor ID,p, the range of which is [0, P− 1], where P =P x P y P z is the total number of processors. We also define a vector processor ID − → p = (P x ,P y ,P z ), where p x = 0,...,P x − 1; 81 p y = 0,...,P y − 1; andp z = 0,...,P z − 1. The sequential and vector ID’s are related through the relation,p =p x P y P z +p y P z +p z . Using periodic boundary conditions, every subsystem has 26 neighbor subsystems that share either a corner, an edge or a face with it. To map the DCKMC algorithm onto the spatial decomposition, we here adopt a simple scheme, in which each spatial subsystem is a 2×2×2 block of domains for graph coloring as explained in Sec. 4.2.2. This leads to coarse-grained parallelism, resulting in high parallel efficiency according to the analysis in the next section. Also,itshouldbenotedthattheTDHapproximationunderlyingDCKMCbecomes exact in the large granularity limit. In DCKMC, each process computes the elec- tron injection rates into the hemes within its subsystem (i.e., resident hemes) as well as the electron ejection and hopping rates from its resident hemes. Each pro- cess then randomly selects one event within its subsystem and updates the heme occupation numbers accordingly. In order to compute electron hopping rates W ji between heme pairs within cutoff distance, heme positions and free energies in the 26 neighbor subsystems (or corresponding processes), which are located within qcut from the subsystem periphery, are copied from the corresponding processes to ‘this 0 process in func- tionheme_copy() (see line 1 in Table 4.1). Through message forwarding, message passing is completed only with the six face-sharing neighbor subsystems, thus min- imizing the total message latency [121]. These cached hemes are appended after the resident hemes in the corresponding arrays. Note that and are static quanti- ties, and thus heme_copy() is called only once at the beginning of the program before the main KMC loop begins. Here, we adopt a single program multiple data 82 (SPMD) convention, so that the pseudocode of the DCKMC program in Table 4.1 is executed concurrently in all processes. Next, each process allocates cell linked lists, head[] andlscl[], and builds them using the positions of both resident and cached hemes, using the linked-list cell size that is slightly larger than q cut . The linked lists are then used to construct adjacent-heme lists,lsngb[][], for the resident hemes. The adjacency lists,lsngb[][], will be used in the main KMC loop for computing electron hopping rates. Once lsngb[][] are constructed, head[] and lscl[] are deallocated and are rebuilt such that each cell (or domain) is now an octant of the spatial subsystem (or domain block). These lists will be used in the main KMC loop for performing colored synchronous parallel KMC updates. The above computations are carried out in function make_list() before the main KMC loop is entered (line 2 in Table 4.1). In our implementation, make_list() also computes a part of the electron hopping rates that is independent of heme occupancy. After initializing heme occupation numbers and resetting the time to zero, the algorithm enters the main KMC loop (line 4 in Table 1) to iterate KMC simulation steps. At each KMC step, functionocc_copy() is called (line 5 in Table 4.1) first to cache boundary hemes’ occupancies within cut-off distanceq cut from the periphery of the nearest neighbor processes (or spatial subsystems), so that the electron hopping rates for the resident hemes can be calculated locally independent of the other processes, reflecting the latest heme occupancies (line 6 in Table 4.1). Figure 4.3(b) shows a schematic of this heme-occupancy caching operation. Subsequently, the master process, p = 0, randomly selects an octant (or color) c and broadcasts it to all processes. Each process p then computes the sum of all rates within the selectedoctant,W(p),anditsglobalmaximumW max overallprocessesiscomputed 83 using a global reduction operation (lines 8 and 9 in Table 4.1). Using W max , each process randomly selects an event and updates heme occupancies accordingly (line 10 in Table 4.1). Subsequently, the master process increments the time according to Poisson statistics (line 11 in Tables). After the heme occupancies are updated, electron hopping from some resident hemes may have changed the occupancy of a cached heme out of the subspace boundary. Each process sends the change of occupancies of cached hemes to their resident processes in function occ_move() (line 12 in Table 4.1). Table 4.1 shows the DCKMC algorithm. The parallel DCKMC program is written in the C language, where interprocess communications are implemented using the message passing interface (MPI) library [122]. 4.3 Results The scalability of the DCKMC algorithm has been tested on a cluster of mul- ticore computers. This section first presents the scalability test results, followed by an example of its application to ET dynamics in a lattice of cytochromes on a bacterial-membrane nanowire. 4.3.1 Scalability tests We test the scalability of the DCKMC code on a parallel computer using ET dynamics in a lattice of decaheme cytochromes as an example. The scalability tests have been performed on a Linux cluster at the Center for High-Performance Computing of the University of Southern California. Each computing node com- prises of dual octocore Intel Xeon central processing units (CPUs) with 64 GB 84 Table 4.1: Parallel DCKMC algorithm 1. heme_copy(): cache boundary-heme information within cutoff distance q cut from the periphery of the nearest neighbor subsystems (or processes) 2. make_list() a. Allocate and build cell linked lists, head[] and lscl[], including both resident and cached atoms, with the linked-list cell size slightly larger than q cut b. Make adjacent-heme lists, lsngb[][] for the residents hemes using the cell linked lists c. Deallocate head[] and lscl[] d. Allocate and build cell linked lists, head[] and lscl[], including both resident and cached hemes for the block of 8 domains 3. Initialize the electron occupation numbers; reset the time t←0 4. for step = 1 to Maxstep // Main KMC loop 5. occ_copy(): cache boundary hemes’ occupancies from neighbor processes 6. Compute electron-hopping rates for resident hemes 7. Master process, p = 0, randomly selects an octant (or color) c and broadcast it 8. Compute the sum W(p) of the rates of events in octant c (p is the ID of ‘this 0 process) 9. Global reduction: W max =max p W (p) 10. forall domains p 11. Pick an event e(p) according to Eq. (4.5) and update the heme occupation numbers accordingly 12. Master domain increment the time t←t− ln(ξ 2 )/W max 13. occ_move(): Send the change of occupancies of cached hemes to their resident domain DDR3 memory, operating at a clock cycle of 2.66 GHz. Each node thus has 16 CPU cores. These nodes are interconnected via 56.6 Gbit/s Infiniband network. Weak scaling: We first perform isogranular-scaling tests for the DCKMC code, in which the number of hemes per core N/P is kept constant. We compare three granularities,N/P = 127,776, 1,168,032, and 4,116,000, while varying the number of cores P from 16 to 1,024 for each granularity. Here, we use all 16 cores per computing node, and the number of nodes is varied from 1 to 256. The cut-off length of ET isr c = 10 Å, and the domain size per core is a cube of side lengthL = 880 Å, 1,840 Å, and 2,800 Å forN/P = 127,776, 1,168,032, and 4,116,000, respec- tively. Figure 4.4(a) shows the wall-clock time per KMC simulation step with 85 scaled workloads. By increasing the number of atoms linearly with the number of cores, the wall-clock time remains almost constant, indicating excellent scal- ability. While slight increase of the wall-clock time for larger P is observed for the smallest granularity (i.e., N/P = 127,776), such overhead is hardly visible for the largest granularity (i.e.,N/P = 4,116,000). To quantify the parallel efficiency, we first define the speed of the DCKMC code as a product of the total number of hemes and the number of KMC steps executed per second. The isogranular speedup is given by the ratio between the speed of P cores and that of 16 cores (i.e., 1 computing node) as a reference system. The weak-scaling parallel efficiency is the isogranular speedup divided by P/16. Figure 4.4(b) shows the measured weak-scaling parallel efficiency as a function of P for N/P = 4,116,000. With the granularity of 4,116,000 hemes per core, the parallel efficiency is 0.935 on P = 1,024 for a 4,214,784,000-heme system. This demonstrates a very high scalability of the DCKMC code for multibillion-heme systems. The better parallel efficiency for the larger granularity observed above can be understood by a scalability analysis. Using the spatial decomposition and the O(N) linked-list cell method, the parallel DCKMC simulation of N hemes executes independently on P processors, and the computation time T comp (N,P) = aN/P, where a is a constant. Here, we have assumed that hemes are on average dis- tributed uniformly, so that the average number of heme per processor is N/P. The dominant overhead of the parallel DCKMC is heme caching, in which hemes near the subsystem boundary within a cutoff distance, q cut , are copied from the nearest neighbor processors(line 5 in Table 4.1). Since this nearest-neighbor communication scales as the surface area of each spatial subsystem, its time is 86 T )comm(N,P) = b(N/P ) 2/3 , where b is a constant. Another major communica- tion cost is for the global maximum computation to determine W max (line 9 in table 4.1) using the MPI A llreduce(), which incurs T global (P ) =c logP, where c is another constant. The total execution time of the parallel MD program can thus be modeled as T (N,P) =T comp (N,P) +T comp (N,P) +T global (P ) =aN/P +b(N/P ) 2/3 +c logP (4.6) For isogranular scaling, the number of atoms per processor, N/P =n, is constant, and the isogranular parallel efficiency is E P = T (n, 1) T (nP,P ) = an an +bn 2/3 +c logP = 1 1 + b a n −1/3 + c an logP (4.7) For a given number of processors, the efficiency EP is larger for larger granularity n. For a given granularity, E P is a weakly decreasing function of P, due to the very weak logP dependence. Strong scaling: We also perform a strong-scaling test by simulating a cytochrome lattice containing a total of 3,145,728 hemes. In this test, the number of cores ranges from P = 1 to 16 on 1 computing node, while keeping the total problem size constant. In Figure 4.5(a), the red circles show the wall-clock time per KMC simulation step as a function of P. The time-to-solution is reduced by a factor of 6.0 on 16 cores compared with the 1-core run. As in the weak-scaling test, the speed of the DCKMC code is defined as a product of the total number of hemes and the number of KMC steps executed per second. The fixed problem-size (or strong-scaling) speedup is given by the ratio between the speed of P cores and that of 1 core. In Figure 4.5(a), the blue squares show the speedup as a function 87 Figure 4.4: (a) Wall-clock time per KMC simulation step of the DCKMC code, with scaled workloads, N/P = 127,776 (black), 1,168,032 (blue), and 4,116,000 (red), as a function of the number of coresP (P = 16, ..., 1,024). (b) Weak-scaling parallel efficiency as a function of P for N/P = 4,116,000 (red), compared with the ideal efficiency (black dash-dotted line). 88 of P, whereas the blue dash-dotted line is the ideal speedup. The strong-scaling speedup is 6.0, on 16 cores. For fixed problem-size scaling, the global number of hemes, N, is fixed, and the speedup is given by S P = T (N, 1) T (N,P) = aN aN/P +b(N/P ) 2/3 +c logP = P 1 + b a P N 1/3 + c a P logP N (4.8) and the parallel efficiency is E P = S P P = 1 1 + b a P N 1/3 + c a P logP N (4.9) Fromthismodel,wecanseethattheefficiencyisadecreasingfunctionofP through both theP 1/3 andP logP dependences. ThisP dependence is much stronger than Eq. (7) for weak scaling. Consequently, it is more difficult to achieve high strong- scaling parallel efficiency compared with weak-scaling parallel efficiency. This is due to decreasing granularity, and accordingly increasing communication/compu- tation ratio for larger number of processors, in the former. Though the strong- scaling speedup defined above is less than the ideal value in Figure 4.5(a), using a larger number of computing cores (hence a larger number of spatial domains) increases the time simulated by DCKMC simulation. Namely, the average time increment per KMC step is inversely proportional to the sum of rates of all events within a domain,W max , which in turn is proportional to the number of hemes per domain. To quantify the actual speed of KMC simulation, we define another speed of the DCKMC code as a product of the total number of hemes and the simulated time by KMC per running time of the program. The simulated-time speedup is given by the ratio between the simulated-time speed ofP cores and that of 1 core. 89 Namely, the simulated-time speedup measures how fast the actual KMC simula- tion progresses on a parallel computer. In Figure 4.5(b), the blue squares show the simulated-time speedup as a function of P. On 16 cores, DCKMC simula- tion progresses 95.9 times faster than on 1 core. This demonstrates a significant computational efficiency afforded by the divide-and-conquer approach. 90 Figure 4.5: (a) Wall-clock time per KMC simulation step (red) and speedup (blue) of the DCKMC code with strong scaling - 3,145,728-heme system on P cores of Intel Xeon. The blue dash-dotted line shows the ideal speedup. (b) Wall-clock time per KMC simulation step (red) and simulated-time speedup (blue) of the DCKMC code with strong scaling - 3,145,728-heme system on P cores of Intel Xeon. The blue dash-dotted line shows linear speedup. 91 Figure 4.6: (a) A simulation snapshot of the electron transport process along a 1.12 μm long bacterial nanowire containing 1,280 MtrC decaheme cytochromes (12,800 hemes). Electrons are injected from the nanowire’s leftmost heme sites with injection rate of 10 7 s −1 and ejected at the rightmost heme sites with ejection rate of 10 7 s −1 . Occupation number of each heme site is color-coded. (b) Time evolutionoftheelectronfluxalongthenanowire. Theelectronfluxisdefinedasthe number of electrons injected/ejected per unit time through an area perpendicular to the flux direction. 4.3.2 Application example Dissimilatory metal-reducing bacteria have evolved mechanisms to cope with living in anaerobic environments, allowing them to use abundant minerals out- side the cell as respiratory electron acceptors, instead of oxygen or other soluble oxidantsthatwouldnormallydiffuseinsidecells. Thisprocessisknownasextracel- lular electron transfer (EET). Shewanella oneidensis MR-1 accomplishes EET by deploying multiheme cytochrome complexes that form 20-30 nm conduits through the periplasm and across the outer-membrane [22]. More recently, we learned that such cytochrome networks extend along micrometer-long membrane extensions called bacterial nanowires [18, 49], and may even allow conduction over entire bac- terial biofilms [123, 124]. Going beyond our previous conventional KMC work on studying ET dynamics through a single MtrF cytochrome or a single Mtr-Omc cytochrome complex [64, 120], parallel KMC simulation will enable us to examine ET within the larger context of whole cellular structures ranging from nanowires 92 to biofilms [123, 124]. The study of such large-scale EET encompassing long length scales is beyond the scope of conventional KMC algorithms [64, 120], and hence necessitates scalable parallel KMC simulations. The new parallel DCKMC simu- lation program has enabled us to perform KMC simulations with unprecedented spatiotemporal scales, for studying over a 1.12μm long bacterial nanowire contain- ing more than a thousand MtrC decaheme cytochromes assumed to be hexagonally and uniformly packed on the nanowire’s membrane surface as a test case. Each MtrC molecule contains 10 hemes, thus a 12,800 heme network is formed in 1.12 μm long nanowire. The nanowire’s heme network is shown in Figure 4.6(a). Electrons are injected into selected entrance hemes at one end of heme the network with a rate of 10 7 s −1 , and are ejected with a rate of 10 7 s −1 from selected exit hemes at the other end of the network. The net electron flux through the whole nanowire is calculated as the slope of the number of electrons injected/ejected as a function of the total elapsed time after the system reaches a steady state (Figure 4.6(b)). As a result of the simulation, we estimate 10 4 -10 5 s −1 of electron flux for the heme nanowire network. Remarkably, this rate matches previously measured ET rates to solid phase Fe(III) oxides from the MtrCAB complex assembled into proteoliposomes [15], and is enough to support a single cell’s respiration rate [125]. With this basic demonstration of DCKMC to simulate a micrometer-long bacterial nanowire, future studies will allow us to perform additional mechanistic simulations that will reveal the effect of cytochrome density and orientation on nanowire ET rates, and even study larger systems such as the redox network of whole biofilms. 93 4.4 Conclusions Wehavedesignedadivide-and-conqueralgorithmtosimulatelong-timedynam- ics of many-particle systems based on spatiotemporal locality principles. We provided a formal derivation of the known SPKMC algorithm based on local transition-state and time-dependent Hartree approximations. We then introduced a dual linked-list cell method to further reduce the computational cost. The result- ing DCKMC algorithm has achieved an excellent weak-scaling parallel efficiency andgoodstrong-scalingparallelefficiency. TheparallelDCKMCprogramhasbeen used to simulate a lattice of cytochromes on a bacterial-membrane nanowire. Such long-time dynamics simulations are expected to shed light on many areas such as glass dynamics and biological self-assembly. Furthermore, the DCKMC algorithm is expected to play a major role in the computational synthesis of new materials based on chemical vapor decomposition and other techniques. 94 Chapter 5 Redox Network Simulations: Examining Multiple Scenarios of Cytochrome Density, Orientation, and Disorder 5.1 Introduction Our previous kinetic Monte Carlo studies (Chapter 2,3, [64, 120]) simulate multistep hopping in the decaheme chain of MtrF using either empirical inter- heme ET rates or ET rates derived from QM/MM calculations [30–32]. Chapter 3 extended this calculation to an Mtr-Omc icosaheme dimer, which was previously proposed to exist as a stable functional complex implicated as the terminal bacte- rial reductase of extracellular electron acceptors [51]. With the development of the scalable divide-and-conquer algorithm introduced in Chapter 4 [126], this chapter examines putative electron transport along the micrometer-scale outer membrane extensions proposed to function as bacterial nanowires in Shewanella oneidnesis MR-1 [16, 49]. Previous reports demonstrated that the EET components MtrC and OmcA had significantly increased expression in response to electron accep- tor limitation and nanowire production [49]. In addition, MtrC and OmcA have 95 Figure 5.1: Electron cryotomography of Shewanella nanowires. (a) One tomo- graphic slice resolving the lipid bilayer and the connection between two vesicular features (bilayer thickness 41 Å). (b) Isosurface representation of a 3-dimensional reconstruction (from multiple slices) of a longer nanowire. These methods are cur- rently being used to measure the spacing and nanoscale localization of individual membrane cytochromes [1]. been directly observed along the nanowires via immunofluorescence with MtrC and OmcA-specific antibodies [49], and outer-membrane densities consistent with these cytochromes have been observed on nanowires using electron cryo tomography [1]. Long distance electron transport through a cytochrome network, like the one along Shewanella nanowires, will depend strongly on the structure, density, and orientation of these cytochromes, as well as the precise mechanism of interaction between neighboring redox partners (e.g direct tunneling or diffusive intermedi- ates). This motivates us to extend our simulation framework from monomer and dimer of 10-20 heme chains (Shewanella decaheme cytochrome MtrF and icosa- heme Mtr-Omc complex) to the entire redox network of whole nanowires, and pos- sibly to the cell-surface cytochromes lining up the cell surfaces in whole biofilms. To model the bacterial nanowires and describe dynamics of EET, we create heme 96 networks in an (idealized) cylindrical form with a diameter of 20 nm and a length of 1μm. Since little is known about the overall cytochrome density and how they are integrated onto the membranous walls of nanowires, we investigate electron transport in multiple possible scenarios, which are determined by cytochrome ori- entation, density, disorder, and mechanism of inter-cytochrome electron transfer (e.g. direct tunneling between neighboring terminal hemes or via a diffusive event). We perform coarse-grained molecular dynamics (MD) to output cytochrome con- figurations for a given scenario (i.e. a specific orientation, density, and order). As a first step, a percolation analysis [127, 128] is used to search for percolating clusters based on the connectivity of the whole heme network. This is then followed by the divide-and-conquer kinetic Monte Carlo simulation (DCKMC) which is imple- mented based on synchronous parallel KMC (SPKMC) method (Chapter 4). We then present the results of these multiple scenarios to characterize the different limits of long-distance electron transport in the redox networks of nanowires. 5.2 Methods The workflow of redox network simulation includes a stepwise procedure from the generation of redox network topology to the connectivity test of the net- workandthedivide-conquer-recombinekineticMonteCarlosimulationsofelectron transfer, datainterpretation, andfinally, analyzingandevaluatingofscenariomod- els based on these data. First, heme cofactors’ coordinates of the recently resolved crystal structure of an outer-membrane decaheme c-type cytochrome MtrC from Shewanella oneidensis (PDB:4LM8) was prepared to be used as inputs to the 2D coarse-grained molecular dynamics(MD). MD simulation was carried out to gen- erate a particular redox network topology characterized by certain level of order 97 Figure 5.2: (a) Workflow of the redox network simulation framework for biologi- cal ET scenarios on spacing and nanoscale localization of individual cytochrome. DCR-KMC: divide-conquer- recombine kinetic Monte Carlo (b) Flowchart of cal- culating ET parameters based on the choice of QM/MM and mechanism of interac- tion between neighboring redox partners QM/MM: quantum mechanics/molecular mechanics; DCDFT: divide-and-conquer density functional theory. and density of cytochromes. Then, this 2D network topology is converted into a 3D cylindrical topology. The electronic connectivity of this topology is pre-tested by a percolation analysis implemented in breadth-first-search (BFS) [129] manner to determine whether there is a percolation path found in a given time to link the electron injection to ejection sites. This estimates a reasonable simulation time until the steady-state electron transport is achieved. Then, we perform a divide-conquer-recombine kinetic Monte Carlo (DCR-KMC) simulation to obtain steady-state current across the redox network in the nanowire as in chapter 4 [126]. Multiple time frames of electron occupation of each redox site at every 10,000 steps fromatotaloftenmillionstepssimulatedaresavedandanimatedusingOpenGLin C++. The animated ET dynamics in DCR-KMC simulations are used to examine the electron hopping of individual electrons within the redox network and validate the resulting electron flux. 98 Figure 5.3: Detailed description of the multiple scenarios with a set of parameters and corresponding snapshot of the simulated cytochrome distribution from coarse- grained molecular dynamics. Red solid line represents the irregular shape of MtrC approximated by∼50 particles that outline the molecular structure. 5.2.1 Redox network scenario In this section, we describe a set of redox network ‘scenarios 0 including their topologyandredoxmoleculepackingparametersdescribedinchapter3, weclassify theorientationofneighboringcytochromesintoeitherocta-octa(O-O)hemechains or tetra-tetra (T-T) heme chains according to the relative arrangement of the MtrC multiheme chains (i.e. consider the organization of the 10 hemes of each cytochrome as a ‘staggered cross 0 of two multiheme chains, where a octaheme chain (hemes 10, 9, 8, 6, 1, 3, 4, 5) bisects a tetraheme chain (hemes 2, 1, 6, 7). In either orientation, MtrC molecules are hexagonally arranged on a membrane tube to achieve close-to-maximal packing. Note that MtrC is depicted as an irregularly shaped object that outlines the molecular structure in this simulation. Second, we treat the level of disorder of MtrC molecules lining the bilayer membrane as an important simulation parameter; each level of disorder leads to a particular simulation scenario. Specifically, to mimic the flexible orientation 99 Figure 5.4: Density distribution curves of heme-to-heme distance for time T = [T μs ,T total ], where T total is the total time in the simulation. Each density distribu- tion curve is calculated from cytochrome distribution changes over time (trajecto- ries). Based on the curve, the level of disorder of MtrC molecule is quantified by calculating the probability of finding heme-to-heme distance less than 10 Å (area under the curve up to 10 Å. At each time point from T 1 to T 3 , the calculated probability is 0.91, 0.87, and 0.74, respectively. (i.e. 0.91 represents 91% of inter- protein distances are within 10 Å). Note that heme pairs within distance of 20 Å are collected to take into account interaction within the cut-off radius. After T 3 , the distribution converges to the probability of 0.74. of membrane embedded proteins along the nanowire, we consider rotational and translational diffusion of proteins in membrane. The disorder (i.e. departure from the initial hexagonal lattice) arises from these rotational and translational degrees of freedom. To keep our model simple, we assume that rotation of proteins occurs only around the membrane normal. The effect of diffusion is quantified in terms of the characteristics of distance distribution as shown in Figure 5.4. At t = 0 in the time trajectory which corresponds to the initial ordered state, three narrow sharp peaks are shown at 6, 9 and 18 Å which corresponds to closest lateral, closest left 100 diagonal and closest right diagonal nearest neighbor distance, respectively. The distance distribution curve converges to a curve with one peak at 7 Å as diffusion progresses. We extract the coordinates from different diffusion state to describe some level of disorder. Finally, we consider cytochrome density as another scenario parameter. The density of cytochromes is described as a function of the number of cytochromes in the fixed simulation region. Maximum density (maximum area coverage) is calculated as 1280 cytochromes in the membrane area of nanowire with a diam- eter of 20 nm and a length of 1 μm, assuming hexagonal packing of cytochrome molecules. Based on recent experimental measurement of cryo-electron microscopy (cryo-EM) and electron cryotomography of the nanowire samples grown in in vivo microfluidic platforms, resolving the lipid bilayer of the nanowire and the spac- ing between outer-membrane densities consistent with MtrC [1], we estimate this packing parameter. In the observed isosurface representation of a 3- dimensional reconstruction (from multiple slices) of a longer nanowire, we located membrane segments where MtrC is densely localized. We derive the area density of MtrC on membrane by averaging those segment area densities. As the observation of MtrC localization is limited to the cross-sectional data, we set the width of segment as the size of MtrC while we find length of each segment as wide as the range to capture localized cytochromes. As a result, the average occupation density for 8 observed cytochrome clusters (shown in Fig. 5.1) is estimated to be 85.6%. Based on this experimental evidence and other considerations (e.g. direct tunneling gen- erally limited below 20 Å, fast diffusion up to a few nm separating proteins), we explored a cytochrome packing density between 70% to 100% in our redox network 101 scenarios. As an example of redox network scenarios, visual results of comparative redox network for each packing parameters are presented in Fig. 5.3. 5.2.2 Coarse-grained molecular dynamics Coarse-grained molecular dynamics enables simple but efficient simulation of the diffusion dynamics in a protein system. As we described in the previous sec- tion, multiple network scenarios are designed to explore the combination of the following parameters: (1) MtrC orientation on the bilayer (i.e. whether neighbors connect through the octaheme or tetraheme chain of the structure); (2) density of cytochromes in the simulation region (i.e. going below 100 %); (3) level of disor- der (e.g. not a perfect hexagonal lattice). The first two parameters are provided as inputs to the MD simulation. Third parameter, level of disorder is achieved when sampling coordinates over the time. In coarse-grained molecular dynamics, Fe atoms in heme cofactors and 50 atoms outlining the cytochrome MtrC crys- tal structure are used to represent the irregular shape of MtrC. This simplified approach is highly limited in predictive ability for the accurate description of the system, but it is simply to obtain reasonable redox networks for the purpose of our interest. A group of atoms that belongs to the same cytochrome is harmon- ically restrained with a force constant k. Varying the stiffness of the restraint determines the force to pull the atoms along so that it can be used to describe flexibility of cytochrome. The atoms interact via Lennard-Jones pair potential. The output coordinates is converted into 3-dim cylindrical coordinates using a geometric equation for a circle. The resulting cylindrical coordinates are used as an input to percolation or DCKMC simulation in next steps. 102 Figure 5.5: A slice from a percolation simulation on a heme network in two dimen- sional space to describe percolation on a cylindrical surface of nanowire. Each cluster is represented as different color. Percolating cluster can be found if same heme molecule is found in different image while searching neighbor hemes. Red circle represent same heme molecule in two different images. 5.2.3 Percolation analysis The phenomenon of percolation is well known and extensively used in physics and engineering sciences [127, 128]. In fact, our redox network model for electron transfer process can easily be mapped into a bond percolation problem, in which the heme molecules are the vertexes (or sites) and the edge (or bonds) between two hemeneighborsrepresentsapath, whichisopenifitselectrontransferratebetween two heme molecules is greater than threshold rate R th or is blocked if not. In per- colation analysis, we observe whether a cluster is percolating or not, i.e., whether the infinitely repeated periodic images are connected in one or more directions or they are all disconnected. This determines the connectivity of redox networks which can be formed in many different ways under various biological conditions. 103 We observe how the network connectivity changes depending on threshold rate. The threshold rate is set to inverse of the elapsed time T, i.e. R th =1/T=W/N where W is sum of all possible events and N is number of steps. As time evolves, R th decreases, resulting in higher connectivity. For percolation analysis, we use the breadth first search algorithm [129] which visits all of the heme molecules (ver- tices) in a cluster without visiting the same vertex twice. The cluster distribution and whether the cluster percolates or not can be determined during a single pass through the vertexes. The pseudo code for the explained percolation algorithm is described in table 5.1. Figure 5.6: Percolation characteristics for given multiheme cytochrome MtrC con- figurations specified by MtrC packing density, cytochrome orientation order, and whether direct inter-protein tunneling or diffusion-linking mechanism is dominant in electron transfer. Elapsed time to percolation is obtained as density and orien- tation order changes for each mechanism, diffusion-linking and direct inter-protein tunneling only. 104 Table 5.1: Percolation algorithm for finding percolating heme cluster 1. Read the input data for heme coordinates and ET parameters 2. Create a graph g with nodes of heme molecules and edges allowed to hop forward 3. for all heme molecules i do if i is not visited then call g.bfs(i, clusterSize, Percolates, PercolX, PercolY, hemesInCluster) end if end for procedure bfs(i, clusterSize, Percolates, PercolX, PercolY, hemesInCluster) i is the starting heme clusterSize is used to return the size of the cluster Percolates returns a boolean for whether the cluster is percolating, i.e., whether the infinitely repeated periodic images are connected in one or more directions or they are all disconnected PercolX returns a boolean whether a cluster is percolating in x-direction PercolY returns a boolean whether a cluster is percolating in y-direction hemesInCluster stores a list of the heme molecules in the cluster Image index is calculated by edge e(i,j) direction based on nodes i,j coordinates Set i as visited and enqueue it while not (queue is empty) thisHeme = dequeue clusterSize = clusterSize + 1 hemesInCluster[clusterSize] = thisHeme for all all adjacent neighbors of heme i do Calculate image index new_index by the edge direction if neighbor is not visited before then Set neighbor as visited and enqueue neighbor Store new_index as neighbor’s image index else if neighbor image stored is not same as new_index then Percolation is found Compare image index values and set PercolX = 1 or PercolY = 1 else Cycle has been made end if end if end for end while 105 Figure 5.7: A snapshot of showing cluster size by colors during the dynamic per- colation transition at time t = 1 ms. (a) high density with ordered orientation. (b) high density with disordered orientation. (c) low density with disordered ori- entation. From (a) to (c), Cluster size changes as heme connectivity changes from (a) to (c). 5.2.4 Divide-conquer-recombine kinetic Monte Carlo sim- ulations We execute divide-conquer-recombine kinetic Monte Carlo simulations (DCKMC) as constructed and implemented in chapter 4 [126]. The heme net- work configuration is provided as input to DCKMC simulation, which includes cytochrome ID numbers, heme ID numbers, and coordinates of a set of heme molecules. Based on the coordinates and additional thermodynamic and elec- tronic coupling parameters [30–32], ET rates between heme sites are calculated on-the-fly in the initialization phase. DCKMC simulation supports two options in computing the ET rates k ij between a pair (i, j) of adjacent heme sites. In both options, a thermodynamic integration protocol [87] is needed to estimate the 106 change of the Gibbs free energy4G ij and the reorganization energy λ associated with the ET from heme sites i to j [30–32]. The two options differ in the way the electronic couplingH ij for the ET reaction is computed. In the first-principles QM/MM option [88, 89], H ij is computed for each heme pair using the QM/MM method, where each QM calculation employs DC density functional theory (DFT) or a variant of DCDFT called fragment-orbital density functional theory. The electronic coupling within a MtrF molecule, for example, was estimated by using the QM/MM method and a variant of DCDFT. According to the non-adiabatic rate equation, ET rate,k ij , from thei-th heme to thej-th heme is expressed as [3] k et = 2π ~ |H DA | 2 √ 4πλk B T exp − (ΔG +λ) 2 4λk B T (5.1) , where T is the temperature, and~ and k B are the Planck and Boltzmann con- stants, respectively. Thesecondoptiontocalculatek ij employsaphenomenological form of the non-adiabatic rate-equation [90], k hop (s −1 ) = 10 13 exp −βR− (ΔG +λ) 2 4λk B T (5.2) whereR ij is the edge-to-edge distance between hemesi andj andβ is a tunneling decay factor. Despite its simplicity, the equation 5.2 correctly reflects the expo- nential decay of the tunneling probability and predicts ET rates in broad classes of biomolecules [54]. In addition to direct tunneling (electron hopping) rate described by the non- adiabatic rate equation, diffusive rate can be added with consideration of indirect pathway of electron transfer, such as electron shuttling by flavins. Considering that solvent exposed hemes are the most likely to be involved in making contacts 107 Figure 5.8: A crystal structure of MtrC and a simulation snapshot of the electron transport process along a 1 μm-long bacterial nanowire containing 1280 MtrC decaheme cytochromes (total 12800 hemes in the system). Electrons flow from left to right through hopping heme sites. with shuttling molecules, we include a diffusive rate only for a pair (i,j) of heme sites where i, j∈{2, 5, 7, 10}. Therefore, ET rates including diffusion reactions is expressed as k ij =k ij,hop + 6D R 2 ij (5.3) where D = 1.5× 10 −5 cm 2 s −1 if i, j∈{2, 5, 7, 10} or zero otherwise, represent- ing as diffusion coefficient of small molecules in water. R is a distance between neighboring heme sites. 5.2.5 ET rate parameters for scenario model and visual- ization A heme network in each scenario model consists of hexagonally arranged mul- tiheme cytochrome MtrC on a membrane tube (total heme site = total number of cytochrome×10hemes). Electronsareinjectedfromthenanowire’sleftmostheme 108 sites with injection rate of 10 7 s −1 and ejected at the rightmost heme sites with ejection rate of 10 7 s −1 . Heme-to-heme electron hopping rates k ij between a pair (i,j) of adjacent heme sites are obtained based on the non-adiabatic expression in equation 5.2, where β= 1 Å −1 , λ = 0.94 eV,4G =4G intrinsic +4G additional . The change of the Gibbs free energy4 G intrinsic and the reorganization energy λ was obtained from thermodynamic integration and molecular dynamics calculation [32]. The4G additional dependsonourchoicewhethertoapplyanadditionalpoten- tial difference. Each heme site in the nanowire can be occupied by at most one electron, thus it is either occupied by an electron (n i =1, red/Fe 2+ ) or unoccupied (n i =0, blue/Fe 3+ ), where n i is the electron occupation number of the ith heme. Simulation acquires a stream of electron occupation of each heme site during mul- tiple time frames by every 10,000 steps from a total of ten million steps. The stored occupation data is used to animate ET dynamics using OpenGL program. The electron flux is defined as the number of electrons injected/ejected per unit time through an area perpendicular to the flux direction. Steady-state electron fluxJ is estimated when incoming electron flux and outgoing flux stay within 0.5% range. The number of hemes reduced determines the number of electrons in the system at time t. The electron flux are used for a comparison and analysis of multiple network scenarios. 5.2.6 Simulation workflow DCKMC simulation is performed on longitudinal transport from the entrance site on the left to the exit site on the right. Simulation starts with four subsystems of 2 by 2, whose each dimension is 350.12 Å by 326.19 Å, and once existing system reaches the steady state, it is extended by one layer of subsystems lengthwise 109 Figure 5.9: A schematic of spatial decomposition in simulation procedure. Initial system consists of four domains, and it extends layer by layer after the existing system reaches the steady states. Empty processors are assigned at one end of the currentsystemtoavoidfalseelectroninjectionorejectionduetoperiodicboundary condition (refer to the chapter 4 about periodic boundary condition). at a time. We refer to the steady state as the state when there is less than 0.5 % difference in the number of injected electrons and ejected electrons with respect to time. This serial procedure is performed automatically using a bash script in the high-performance computing environment at USC for changing the number of processors required for the assigned domains. Related code is attached in Appendices. 5.3 Results and Discussion 5.3.1 Comparison electron flux with percolation result In the percolation analysis, it is estimated whether there is a percolating cluster in a given heme network corresponding to each scenario case (i.e. a particular set of density and disorder parameters and an assumed inter-protein electron transfer mechanisms). When we restrict inter-protein electron transfer to occur only by direct tunneling, a percolating cluster is found in the MtrC network only under the following scenarios: 1) ordered network with 100% packing density, 2) ordered network with 90% packing density, 3) disordered network with 100% packing den- sity. This percolation data is consistent with the electron flux of DCKMC results, 110 where steady-state current is only achieved for this same set of scenarios, and no current observed otherwise. Figure 5.10: (a) Steady-state electron flux dependence on orientation order of cytochromes at different length of bacterial nanowire. The level of cytochrome disorderisquantifiedbytheprobabilityoffindinganinter-proteinhemepairwithin a distance less than 10 Å. (refer to Figure 5.4). (b) Steady-state electron flux dependence on cytochrome density at different length of bacterial nanowire. All the electron flux data is computed based on diffusion-linking mechanism with a diffusion coefficient of 10 μm 2 /s. 5.3.2 Effect of cytochrome order, density and length of nanowires on electron flux Fig. 5.10 shows the steady-state electron flux corresponding to different order and density parameters. Electron flux drops as the level of cytochrome order or 111 Figure 5.11: (a) Comparison of direct inter-protein tunneling electron flux with diffusion-linked electron flux at D = 1 μm 2 /s. For high order/density, it shows same order 10 4 s −1 - 10 5 s −1 of electron flux in both direct tunneling and diffusion- linking mechanism, indicating that direct tunneling is a dominant electron transfer mechanism. Whereas for lower order/density, the restriction to only direct tun- neling rapidly decreased the overall flux to zero. Allowing for diffusion between proteins sustains an electron flux of 10 3 s −1 - 10 4 s −1 even at lower order/density. Electron flux is indicated by the color bar. cytochrome density decreases. In other words, heme networks in disordered/less dense regime contain no path from one end to the other end of the nanowire as opposed to the ordered and densely packed network. This trend is expected as the overall (end-to-end) electron transport rate is determined by the distribution of neighboring cytochrome distances. For the maximum density-ordered case, elec- tron flux was calculated at around 10 4 s −1 under the direct inter-protein tunneling mechanism, whereas the electron flux goes up to 10 5 - 10 6 s −1 when a diffusive even is allowed to link neighboring proteins at a diffusion coefficient of 10 μm 2 /s. It is important to note the diffusion coefficient used in the simulation is approximately 112 two orders of magnitude smaller than typical small molecule diffusion coefficients of 1000 μm 2 /s [130]. Therefore, although the staggered-cross heme structure in multi-heme cytochrome limits rapid long-range electron transport significantly in disordered heme networks by lowering the probability of finding neighboring pro- teins’ terminal heme pairs within a distance less than 10-15 Å, a modest diffusive mechanism can boost the current up to 10 4 - 10 5 s −1 in the relatively disordered and less dense heme network. Fig. 5.10 also shows length dependences of elec- tron flux at different orientation order and cytochrome density. The electron flux attenuates with increasing distance from the source of electrons, i.e. electron injec- tion point at the end of nanowire. For both cases of varying orientation order or cytochrome density, the attenuation degree with respect to a nanowire length is larger as the level of orientation order or cytochrome density decreases. It is also observed that heme network connected through octaheme chains instead of tetra- heme chains cannot produce steady-state electron flux in a direct tunneling regime due to a huge barrier of β-berral in Mtr cytochrome structure. 5.3.3 Direct inter-protein tunneling vs. Diffusion-linking mechanism In Fig. 5.11, the overall electron flux drops less significantly with decreasing density/order when a diffusive mechanism allows ET between neighboring redox proteins as opposed to requiring direct ET. This suggests that, depending on the density of cytochromes along nanowires and biofilms, a diffusive mechanism may benecessaryforrapidelectrontransportintheselarge-scaleredoxnetworks. While our simulation treats the role of possible diffusing shuttles (e.g. flavins are known to bind and interact with Shewanella cytochromes, [11]), it is important to keep 113 Figure 5.12: Effect of diffusion coefficient D on electron flow for 60 nm-long nanowire heme network where D = 1000 μm 2 /s (top left), 100 μm 2 /s (top right), 10 μm 2 /s (bottom left), and 1 μm 2 /s (bottom right). Calculation parameters: β = 1 Å, λ = 0.94 eV. Each color represents the resulting steady-state electron flux from DCKMC simulation on different order/density cytochrome distributions. in mind that the lateral diffusion of proteins in the membrane may also contribute significantly inter-protein ET. In order to compare the small-molecule diffusion regime with a direct tunneling mechanism, we vary the value of diffusion coefficient fromD = 1μm 2 /s to D = 1000μm 2 /s, the latter being a typical small molecular diffusion coefficient in water (Fig. 5.12), and examine how this affects the overall electron flux. It turns out that only a diffusion coefficient of D = 1 μm 2 /s is required to boost current up to 10 3 s −1 - 10 4 s −1 in disordered and less dense regime. In other words, even though electron hopping paths are limited by the large gap between heme neighbors, additional diffusive factor appears to contribute significantly to the output current boosting it up to 10 4 s −1 - 10 5 s −1 . Given that 114 the experimentally measured cellular respiration rates is on the order of 10 5 s −1 [125], our findings suggest that even a weak diffusive effect is sufficient to support such rates by improving the connectivity of realistic cytochrome network scenarios and density. 5.4 Conclusions We have designed and simulated multiple redox network scenarios to inves- tigate the net electron flux along membrane nanowires. We first offer multiple scenarios with a set of variable parameters (orientation, level of disorder, den- sity) to describe how cytochromes are distributed on membrane. Then, based on the scenario parameters, molecular dynamics is performed to sample the con- figurations of cytochrome network. This redox network configuration is used as an input to a percolation analysis and a DCKMC simulation to find percolating cluster and compute the net electron flux, respectively. In DCKMC simulation, we calculate net electron flux through the wires for two possible inter-protein ET modes: direct tunneling and shuttling (i.e. diffusing molecules linking neighbor- ing cytochromes). The result shows that in the high order/density regime, direct inter-protein ET is sufficient to support biologically reasonable respiration rates. For lower order and density regimes, a diffusive rate becomes necessary to link neighboring proteins and achieve significant electron flux. Our redox network sce- narios provide valuable insight into the cytochrome localization on membrane and microscopic mechanisms of electron transport to understand extracellular electron transfer mechanisms in large-scale conductive system, such as bacterial nanowires or whole bacterial biofilms rich in redox proteins. 115 5.5 Appendix The following is percolation program to finding percolation path in heme network. This heme_percolation.cpp is a C++ version of percolation program for a heme network. Using heme coordinates from files (attached sample input file), program lists up all the heme nodes and creates an adjacency list for each heme node. Then builds a rate_map for heme-to-heme edge as a key and the corresponding rate as value (cutoff is applied to the rate). This rate_map is used to construct a graph that is used to run breadth-first search algorithm. BFS determines whether a cluster percolates through the network or not by finding the same heme node in different images, which are generated with periodic boundary condition. Input file contains heme ID, cartesian coordinates, redox potential, injection and ejection rate. Output determines whether a heme network is percolated given a period of time. Command to compile: g++ heme_percolation.cpp -o heme_percolation Command to run: ./heme_percolation lambda steps diffusion_coefficient hemefile //heme_percolation.cpp #include <iostream> #include <iomanip> #include <fstream> #include <math.h> #include <map> #include <list> #include <queue> 116 #define DIR_INTERNAL 1 #define DIR_SOUTH 2 #define DIR_NORTH 3 #define DIR_WEST 4 #define DIR_EAST 5 using namespace std; struct Node { int id; // node ID int pid; // Protein ID int hid; // Heme ID double color; // color labeled by size of nodes participated //in percolation int image; // node image in periodic boudary condition double x, y, z; // Cartesian coordinates in angstrom bool visited; }; struct Edge { int i; int j; Edge(int i, int j) { this->i = i; this->j = j; } 117 // edges need to be comparable in map/multimap // e1 < e2 --> false // e2 < e1 --> false --> e1 == e2. bool operator<(const Edge& other) const { if (i == other.i) { return j < other.j; } return i < other.i; } }; struct Rate { int dir; double value; Rate(int dir, double value) { this->dir = dir; this->value = value; } }; // Global variables to represent the graph from input file. // Box size in angstrom. double LX, LY, LZ; // Number of nodes (hemes). Initialized from the input file. 118 int NUM_HEME; // Heme-node array. Allocated when file is read. struct Node* HEME; // Map to allocate Rate to the corresponding Edge multimap<Edge, Rate> rate_map; // Rate greater than threshold RTH creates an Edge double RTH; class Graph { int V; // Number of nodes list<int>* adj; // pointer to an adjacency list arry public: Graph(int V) // constructor { this->V = V; adj = new list<int>[V]; } void addEdge(int i, int j) { adj[i].push_back(j); // add j to i’s list } void printGraph() { list<int>::iterator it; int i; for (i = 0; i < V; i++) { cout << "\n"; 119 cout << "Node " << i << ":"; for (it = adj[i].begin(); it != adj[i].end(); ++it) { cout << " " << *it; } } } // function to calculate image index according to next edge direction void calImage(int d, int* mx, int* my) { int m; if (d == DIR_INTERNAL) { // do nothing } else if (d == DIR_SOUTH) { *my = *my - 1; } else if (d == DIR_NORTH) { *my = *my + 1; } else if (d == DIR_WEST) { *mx = *mx - 1; } else if (d == DIR_EAST) { *mx = *mx + 1; } } void bfs(int s, int* clusterSize, int* clusterPercolates, int* clusterPercolX, int* clusterPercolY, 120 int hemesInCluster[]); }; // breadth first search algorithm (pseudo code provided in the text) void Graph::bfs(int s, int* clusterSize, int* clusterPercolates, int* clusterPercolX, int* clusterPercolY, int hemesInCluster[]) { // Initialize image index variables int mx, my, image_idx; int prev_mx, prev_my; mx = 0; my = 0; image_idx = (mx + 2) * 5 + (my + 2) + 1; HEME[s].image = image_idx; // Initialize cluster variables int size = 0; int percolates = 0; int percolX = 0; int percolY = 0; // Set all the vertices as not visited bool* visited = new bool[V]; for (int i = 0; i < V; i++) { visited[i] = false; 121 } // Create a queue for breadth first search (BFS) list<int> queue; // Set the current node as visited and enqueue it visited[s] = true; queue.push_back(s); while (!queue.empty()) { // Dequeue a node from queue and print it int thisHeme = queue.front(); queue.pop_front(); size += 1; hemesInCluster[size] = thisHeme; HEME[thisHeme].visited = 1; // Get all adjacent neighbors of the dequeued node s // If an adjacent has not been visited, then set it visited // and enqueue it for (list<int>::iterator i = adj[thisHeme].begin(); i != adj[thisHeme].end(); ++i) { // if unvisited, calculate its first image // Image index is calculated by the edge direction in calImage function int neighbor = *i; bool found = false; Edge e(thisHeme, neighbor); auto range = rate_map.equal_range(e); for (auto it = range.first; it != range.second; ++it) { mx = ((HEME[thisHeme].image - 1) / 5) - 2; 122 my = (HEME[thisHeme].image - 1) % 5 - 2; if (found) { cout << "multiple edges are found!" << endl; exit(-1); } else { found = true; } calImage(it->second.dir, &mx, &my); image_idx = (mx + 2) * 5 + (my + 2) + 1; if (visited[neighbor]) { prev_mx = ((HEME[neighbor].image - 1) / 5) - 2; prev_my = (HEME[neighbor].image - 1) % 5 - 2; if (HEME[neighbor].image == image_idx) { // Found a cycle in the same image continue; } else { // cout << "percolation found\n"; // Found a path across two images that can be repeated to // create a percolation percolates = 1; if (prev_mx != mx) { percolX = 1; } else if (prev_my != my) { percolY = 1; } } } else { 123 // Not visited visited[neighbor] = true; queue.push_back(neighbor); HEME[neighbor].image = image_idx; } } // if visited, compare its image with previously stored // if yes, percolation happens // if no, nothing happens, just cycles is made } } *clusterSize = size; *clusterPercolates = percolates; *clusterPercolX = percolX; *clusterPercolY = percolY; return; } // Read input coordinates file and create Nodes. void read_heme_file(string filename) { double redox_potential, injection_rate, ejection_rate, occupancy; ifstream fin(filename); fin >> NUM_HEME; fin >> LX >> LY >> LZ; HEME = (Node*)malloc(sizeof(Node) * NUM_HEME); for (int i = 0; i < NUM_HEME; i++) { 124 fin >> HEME[i].pid >> HEME[i].hid >> HEME[i].x >> HEME[i].y >> HEME[i].z >> redox_potential >> injection_rate >> ejection_rate >> occupancy; HEME[i].id = HEME[i].pid * 10 + HEME[i].hid; HEME[i].image = 0; HEME[i].color = 0.0; HEME[i].visited = false; } fin.close(); } // Insert rate_value to the rate map if rate_value > rCut // and return rate_value if rate_Value > rCut. Otherwise, return 0.0 double insert_to_rate_map(Edge e, double rate_value, int dir, double rCut) { if (rate_value > rCut) { Rate r(dir, rate_value); rate_map.insert(make_pair<Edge, Rate>(e, r)); return rate_value; } return 0.0; } int find_index(int a[], int num_elements, int value) { int i; for (i = 0; i < num_elements; i++) { if (a[i] == value) { 125 return 1; /* it was found */ } } return -1; /* if it was not found */ } // Create a rate_map (edge, rate) // Rates are calculated by Marcus expresions for non-adiabatic ET rates void make_map(double lambda, double steps, double diff) { // ET parameters double K0 = 1.0e13; double TEMP = 300; double BETA = 1.0; double R0 = 4.0; double KB = 8.621738e-5; double DCUT = 100.0; /* Distance cutoff*/ double tkb = TEMP * KB; double fmarcus = 0.25 / (lambda * tkb); double dx, dy, dz, dys, dyn, dxw, dxe, r1, r1s, r1n, r1w, r1e; double rSum = 0.0; int count = 0; pair<int, double> rVal; /* Rate cutoff*/ double rCut = K0 * exp(-BETA * (DCUT - R0) - fmarcus * pow(lambda, 2)); for (int i = 0; i < NUM_HEME; i++) { 126 for (int j = 0; j < NUM_HEME; j++) { if (i == j) continue; // if (i == j || HEME[i].pid == HEME[j].pid) continue; // Pair vector dr = r[i]-r[j] dx = HEME[i].x - HEME[j].x; dy = HEME[i].y - HEME[j].y; dz = HEME[i].z - HEME[j].z; dys = HEME[i].y - (HEME[j].y - LY); // neighbor south dyn = HEME[i].y - (HEME[j].y + LY); // neighbor north dxw = HEME[i].x - (HEME[j].x - LX); // neighbor west dxe = HEME[i].x - (HEME[j].x + LX); // neighbor east r1 = sqrt(dx * dx + dy * dy + dz * dz); r1s = sqrt(dx * dx + dys * dys + dz * dz); r1n = sqrt(dx * dx + dyn * dyn + dz * dz); r1w = sqrt(dxw * dxw + dy * dy + dz * dz); r1e = sqrt(dxe * dxe + dy * dy + dz * dz); // if (r1i < DCUT) cout << " r1j < DCUT" << endl; // multiple values to the same key i and j is allowed in pbc picture Edge e(i, j); double rate_value = K0 * exp(-BETA * (r1 - R0) - fmarcus * pow(lambda, 2)); int outermost_hemes[4] = {1, 4, 6, 9}; int index_i, index_j; // Add diffusion rate only to the outermost hemes 1,4,6,9 (-1 from // 2,5,7,10) 127 index_i = find_index(outermost_hemes, 4, HEME[i].hid); index_j = find_index(outermost_hemes, 4, HEME[j].hid); if (HEME[i].pid != HEME[j].pid && index_i == 1 && index_j == 1) { rate_value += diff / (r1 * r1); } // rSum += insert_to_rate_map(e, rate_value, DIR_INTERNAL, rCut); if (rate_value > rCut) { // cout << rate_value << endl; Rate r(DIR_INTERNAL, rate_value); rSum += rate_value; count++; rate_map.insert(make_pair<Edge, Rate>(e, r)); } // outer rates where j is in south if (dy < 0.0) { double rate_value = K0 * exp(-BETA * (r1s - R0) - fmarcus * pow(lambda, 2)); rSum += insert_to_rate_map(e, rate_value, DIR_SOUTH, rCut); count++; } else { // 3: outer rates where j is in north double rate_value = K0 * exp(-BETA * (r1n - R0) - fmarcus * pow(lambda, 2)); rSum += insert_to_rate_map(e, rate_value, DIR_NORTH, rCut); count++; } // outer rates where j is in west 128 if (dx < 0.0) { double rate_value = K0 * exp(-BETA * (r1w - R0) - fmarcus * pow(lambda, 2)); rSum += insert_to_rate_map(e, rate_value, DIR_WEST, rCut); count++; } else { // outer rates where j is in east double rate_value = K0 * exp(-BETA * (r1e - R0) - fmarcus * pow(lambda, 2)); rSum += insert_to_rate_map(e, rate_value, DIR_EAST, rCut); count++; } } // end j } // end i RTH = rSum / steps; } // end make_map void print_list() { cout << "multimap rate contains:\n"; for (auto itr = rate_map.begin(); itr != rate_map.end(); ++itr) { cout << "(" << itr->first.i << "," << itr->first.j << ") :" << "(" << itr->second.dir << "," << itr->second.value << ")" << endl; } } 129 int main(int argc, char* argv[]) { int clusterSize; int clusterPercolates; int clusterPercolX; int clusterPercolY; double lambda = atof(argv[1]); double steps = atof(argv[2]); double diff = atof(argv[3]); string filename = argv[4]; int ispercolated; double r1, r2; // NUM_HEME is initialized by this function. read_heme_file(filename); // Maximum size of hemesInCluster is NUM_HEME int* hemesInCluster = (int*)malloc(sizeof(int) * NUM_HEME); make_map(lambda, steps, diff); // print_list(); Graph g(NUM_HEME); for (auto itr = rate_map.begin(); itr != rate_map.end(); ++itr) { if (itr->second.value > RTH) { g.addEdge(itr->first.i, itr->first.j); } } // g.printGraph(); 130 int bfs_count = 0; for (int i = 0; i < NUM_HEME; i++) { // Avoid calculation for nodes whose cluster is already assigned if (HEME[i].visited == false) { g.bfs(i, &clusterSize, &clusterPercolates, &clusterPercolX, &clusterPercolY, hemesInCluster); ++bfs_count; } if (clusterPercolates) ispercolated = 1; if (clusterSize >= 0) { for (int j = 0; j < clusterSize; j++) { // Set each hemes’ color in cluster to the size of cluster // it belongs. Color value is used to visualize clusters HEME[hemesInCluster[j]].color = clusterSize; } } } // Write the cluster properties cout << "=========================================================\n"; cout << "Number of Hemes = " << NUM_HEME << "\n"; cout << "Box size in angstrom = " << LX << " " << LY << " " << LZ << "\n"; cout << "Number of clusters = " << bfs_count << "\n"; cout << "Total percolation = " << ispercolated << "\n"; cout << "Time to percolate = " << 1 / RTH << "(s)" << "\n"; 131 cout << "=========================================================\n"; } Sample Input hemefile 12800 11204.06 652.39 396.732004 0 0 14.93 48.65 0.08 -0.05 0.00 0.00 1.0 0 1 13.24 61.61 0.08 -0.07 0.00 0.00 1.0 0 2 4.61 49.38 0.08 -0.17 0.00 0.00 1.0 0 3 11203.12 42.40 0.08 -0.27 0.00 0.00 0.0 0 4 11194.36 42.84 0.08 -0.04 0.00 0.00 0.0 0 5 21.12 36.08 0.08 -0.06 0.00 0.00 1.0 0 6 21.13 23.83 0.08 0.07 0.00 0.00 1.0 0 7 32.49 34.14 0.08 -0.16 0.00 0.00 1.0 0 8 38.62 39.88 0.08 -0.28 0.00 0.00 1.0 0 9 47.96 37.52 0.08 -0.09 0.00 0.00 1.0 1 0 49.95 130.20 0.08 -0.05 0.00 0.00 1.0 1 1 48.27 143.16 0.08 -0.07 0.00 0.00 1.0 1 2 39.63 130.93 0.08 -0.17 0.00 0.00 1.0 1 3 34.07 123.95 0.08 -0.27 0.00 0.00 1.0 ... ... ... Sample Output Number of Hemes = 12800 Box size in angstrom = 11204.1 652.39 396.732 132 Number of clusters = 8 Total percolation = 1 Time to percolate = 5.40969e-06(s) The below is bash scripts for serial procedure to perform DCKMC simulation for lengthwise extended nanowire heme network adapted for HPC environment at USC. For continuous submission of a job to HPC cluster, submit hpc.sh bash script to one of head nodes on the HPC. It will submit hpcrun.sh every 10 minutes to check if current job is finished or not. %filename: hpc.sh id=$1 while : do ./hpcrun.sh $id sleep 600 done If file "finished + number of nodes" is found, then do qsub on next job for one layer extended system with the updated PBS script by adding two more nodes. If file is not found, then do qsub on the same PBS script. %filename: hpcrun.sh #!/bin/bash typeset -i i END a b let END=66 i=6 echo $i file="finished" 133 id=$1 while ((i<=END)) do if [ -f "$file$i" ]; then echo "file found - complete $i" else line=‘myqueue | grep dckmc_$id$i.pbs‘ if [ -n "$line" ]; then echo "still running...$i" else echo "start running...$i " jstart=8 for ((j=jstart; j>=1; j--)) do if [ $((i%j)) -eq 0 ] then a=$((i / j)) b=$((i / a)) break fi done sed -e "s/_NODENUM/$a/g" -e "s/_PPNNUM/$b/g" dckmc.pbs.template > dckmc_$id$i.pbs qsub dckmc_$id$i.pbs fi exit 134 fi let i=i+2 done This sample PBS script specifies the number of nodes and processors desired for this job. In this example, one node with six processors is being requested to run a pbsrun bash script. %filename: dckmc.pbs #!/bin/bash #PBS -l nodes=1:ppn=6 #PBS -l walltime=23:59:59 #PBS -o dckmcfinal.out #PBS -j oe #PBS -A lc_me source /usr/usc/openmpi/1.6.4/setup.sh cd "$PBS_O_WORKDIR" ./pbsrun.sh This pbsrun.sh is finished by if-END steps or file "finished" was found, if file "fin- ished" is found, then do qsub with different PBS node file according to hpcrun.sh. If file "finished" is not found, then do qsub with the same PBS node file continuing previous run. %filename: pbsrun.sh #!/bin/bash np=$1 typeset -i i END 135 let END=500 i=1 echo $i file="finished" while ((i<=END)) do if [ -f "$file$np" ]; then echo "file found" exit else echo "continue...$i " today=‘date ’+%m_%d__%H_%M_%S’‘; mpirun -np $np -machinefile $PBS_NODEFILE ./dckmc $np > dckmc$np.$i.$today.out cp dckmc$np.$i.$today.out dckmc.out egrep "^[0-9]+ " dckmc.out > tmp.out python dcal.py tmp.out $np #save occ file in every run ./newhmfile occ_cont cp nheme0 heme0 cp occ_cont occ_cont$np.$i.$today cp occ.xyz occ$today.xyz fi let i++ done 136 Conclusions Shewanella oneidensis MR-1 has gained substantial interest as a model sys- tem of extracellular electron transfer to insoluble electron acceptors ranging from minerals to solid-state electrodes [8, 57]. Many applications have been developed to harness this extracellular redox activity such as 1) microbial fuel cells, where microbial respiratory electron flow is converted to at an electrode surface, 2) waste water treatment where microorganisms are used to break down organics, and 3) the reverse process wheremicroorganisms gain electrons from the renewable energy sources such as solar energy and reduce soluble chemicals to desirable end prod- ucts e.g. fuels. Several strategies have been proposed to link bacterial electron transport chains to electrodes: 1) direct contact via outer membrane multiheme c-type cytochromes, 2) conductive extracellular filaments referred to as bacterial nanowire, and 3) mediated electron transfer via soluble shuttles, including endoge- nously produced flavins in the case of MR-1. However, the detailed mechanistic basis of these strategies remains unclear and, in many cases, difficult to probe experimentally. This thesis elucidates the underlying redox conduction mecha- nisms using computational modeling and simulation. 137 Chapter 2 focused on the direct contact mechanism by computing electron transport capability through a single redox molecule, specifically the outer mem- brane decaheme cytochrome MtrF of Shewanella oneidensis MR-1 using a con- ventional kinetic Monte Carlo method. The calculated electron transport rate through MtrF is 10 4 /s, in agreement with recent in vitro measurements of trans- port through the Shewanella Mtr complex [15]. Importantly, this rate meets in vivo respiration demands with 1-10% of the experimentally determined coverage of outer membrane cytochromes (∼ 10 3 per cell) that act as terminal reductases of insoluble electron acceptors [125]. In addition, the computed electron occupa- tion profile supports the experimentally and computationally proposed roles for each redox site in MtrF. Chapter 3 proposes a computational framework including structural modeling, molecular dynamics, and kinetic Monte Carlo simulation to predict a complex structure of Shewanella’s Mtr-Omc proteins. Remarkably, the predicted Mtr-Omc is arranged similarly to the OmcA dimer crystallized experi- mentally, which is the only reported dimer structure of two interacting decaheme cytochromes [81]. This similarity suggests that our computational framework is capable of predicting biologically plausible structures of complex proteins. Chap- ter 4 designs a size-independent scalable kinetic Monte Carlo algorithm to resolve the scalability issue of the conventional algorithm for simulating electron trans- fer dynamics in single and complex proteins. This newly designed algorithm is based on a divide-and-conquer strategy based on spatial decomposition. Our work provides a formal derivation of our algorithm using synchronized parallel computa- tion based on local transition-state and time-dependent Hartree approximations. We then introduced a dual cell linked-list method to reduce the computational cost. The resulting divide-and-conquer kinetic Monte Carlo algorithm (DCKMC) 138 has achieved an excellent weak-scaling parallel efficiency and decent strong-scaling parallel efficiency. With the aid of this parallel computation algorithm, Chap- ter 5 reports large-scale redox network simulations inspired by the cytochrome- containing membrane tube structure of Shewanella nanowires. While the con- ductivity of the Shewanella nanowires has been previously demonstrated under non-physiological conditions [17, 18], and this conductivity hinges on the presence of MtrC and OmcA, in vivo measurements of redox condution in nanowires and biofilms have proved difficult using standard approaches until recently [124, 131]. Our computational approaches are therefore critical for shedding light on transport through redox networks, and are designed to guide recent experimental develop- ments and motivate future measurements. We designed and simulated multiple redox network scenarios to examine plausible redox protein configurations on a nanowire, i.e. how redox proteins are lining up on a membrane to allow for effi- cientelectrontransport. Wefirstoffermultiplescenarios,eachwithasetofvariable parameters describing 1) orientation, 2) level of disorder, and 3) density. Coarse- grainedMoleculardynamicsisperformedtosampletheredoxnetwork. Percolation analysis is performed as a screening method, prior to DCKMC simulations, to test the connectivity of redox network. The percolation analysis filters out sparsely- connected redox network that are not biologicall relevant (i.e. unlikely to meet the respiratory demand of microbes). DCKMC simulates electron dynamics on the percolation-screened redox network models with various configurations. We discover that direct inter-protein electron tunneling through the network can allow for biologically-relevant transport rates through densely-connected scenarios i.e. highly ordered and packed cytochromes. However, as order and packing decrease, an additional inter-protein interaction (e.g. diffusive shuttling via small electron 139 carriers or lateral diffusion of membrane cytochromes) is likely necessary to boost the efficiency of long-distance electron transfer. To examine this further, we intro- duced a diffusive rate consistent with flavins, which are known to interact with and dock to Mtr and Omc, to account for electron shuttling between the terminal hemes of these multiheme cytochromes along the network. Our redox network scenarios provide new insights into extracellular electron transport mechanisms in large-scale conductive systems, such as bacterial nanowires or bacterial biofilms, and may be used to interpret ongoing and future electrochemical measurements of transport in solution. Finally, our study motivates future work on measuring the cytochrome density and topology on cellular membranes, as well as accounting for inter-protein ET resulting from lateral diffusion and collisions of cytochromes. 140 Reference List [1] Poorna Subramanian, Sahand Pirbadian, Mohamed Y El-Naggar, and Grant J Jensen. The ultrastructure of shewanella oneidensis mr-1 nanowires revealed by electron cryo-tomography. bioRxiv, 2017. doi: 10.1101/103242. URL http://biorxiv.org/content/early/2017/01/28/103242. [2] Harry B Gray and Jay R Winkler. Electron flow through metallopro- teins. Biochimica et Biophysica Acta (BBA)-Bioenergetics, 1797(9):1563– 1572, 2010. [3] R Al Marcus and Norman Sutin. Electron transfers in chemistry and biology. Biochimica et Biophysica Acta (BBA)-Reviews on Bioenergetics, 811(3):265– 322, 1985. [4] Karrie A Weber, Laurie A Achenbach, and John D Coates. Microorganisms pumping iron: anaerobic microbial iron oxidation and reduction. Nature Reviews Microbiology, 4(10):752–764, 2006. [5] Kenneth H Nealson and Daad Saffarini. Iron and manganese in anaerobic respiration: environmental significance, physiology, and regulation. Annual Reviews in Microbiology, 48(1):311–343, 1994. [6] Derek R Lovley. Dissimilatory metal reduction. Annual Reviews in Microbi- ology, 47(1):263–290, 1993. [7] Kenneth H Nealson and B Lea Cox. Microbial metal-ion reduction and mars: extraterrestrial expectations? Current opinion in microbiology, 5(3): 296–300, 2002. [8] Bruce E Logan. Exoelectrogenic bacteria that power microbial fuel cells. Nature Reviews Microbiology, 7(5):375–381, 2009. [9] Korneel Rabaey and René A Rozendal. Microbial electrosynthesisâĂŤrevis- iting the electrical route for microbial production. Nature Reviews Microbi- ology, 8(10):706–716, 2010. 141 [10] Jeffrey A Gralnick and Dianne K Newman. Extracellular respiration. Molec- ular Microbiology, 65(1):1–11, 2007. [11] Enrico Marsili, Daniel B Baron, Indraneel D Shikhare, Dan Coursolle, Jef- frey A Gralnick, and Daniel R Bond. Shewanella secretes flavins that medi- ate extracellular electron transfer. Proceedings of the National Academy of Sciences, 105(10):3968–3973, 2008. [12] Harald Von Canstein, Jun Ogawa, Sakayu Shimizu, and Jonathan R Lloyd. Secretion of flavins by Shewanella species and their role in extracellular elec- tron transfer. Applied and environmental microbiology, 74(3):615–623, 2008. [13] Charles R Myers and Judith M Myers. Localization of cytochromes to the outermembraneofanaerobicallygrownShewanella putrefaciens MR-1. Jour- nal of Bacteriology, 174(11):3429–3438, 1992. [14] Robert S Hartshorne, Catherine L Reardon, Daniel Ross, Jochen Nuester, ThomasAClarke, AndrewJGates, PaulCMills, JimKFredrickson, JohnM Zachara, Liang Shi, et al. Characterization of an electron conduit between bacteria and the extracellular environment. Proceedings of the National Academy of Sciences, 106(52):22169–22174, 2009. [15] Gaye F White, Zhi Shi, Liang Shi, Zheming Wang, Alice C Dohnalkova, Matthew J Marshall, James K Fredrickson, John M Zachara, Julea N Butt, David J Richardson, et al. Rapid electron exchange between surface-exposed bacterial cytochromes and Fe (III) minerals. Proceedings of the National Academy of Sciences, 110(16):6346–6351, 2013. [16] Yuri A Gorby, Svetlana Yanina, Jeffrey S McLean, Kevin M Rosso, Dianne Moyles, Alice Dohnalkova, Terry J Beveridge, In Seop Chang, Byung Hong Kim, Kyung Shik Kim, et al. Electrically conductive bacterial nanowires produced by Shewanella oneidensis strain MR-1 and other microorganisms. Proceedings of the National Academy of Sciences, 103(30):11358–11363, 2006. [17] Mohamed Y El-Naggar, Yuri A Gorby, Wei Xia, and Kenneth H Nealson. The molecular density of states in bacterial nanowires. Biophysical Journal, 95(1):L10–L12, 2008. [18] Mohamed Y El-Naggar, Greg Wanger, Kar Man Leung, Thomas D Yuzvin- sky, Gordon Southam, Jun Yang, Woon Ming Lau, Kenneth H Nealson, and Yuri A Gorby. Electrical transport along bacterial nanowires from She- wanella oneidensis MR-1. Proceedings of the National Academy of Sciences, 107(42):18127–18131, 2010. 142 [19] H Liu, GJ Newton, R Nakamura, K Hashimoto, and S Nakanishi. Elec- trochemical characterization of a single electricity-producing bacterial cell of Shewanella by using optical tweezers. Angewandte Chemie (International ed. in English), 49(37):6596–6599, 2010. [20] AmbrishRoy, AlperKucukural, andYangZhang. I-tasser: aunifiedplatform for automated protein structure and function prediction. Nature protocols, 5 (4):725–738, 2010. [21] Dan Coursolle, Daniel B Baron, Daniel R Bond, and Jeffrey A Gralnick. The Mtr respiratory pathway is essential for reducing flavins and electrodes in Shewanella oneidensis. Journal of bacteriology, 192(2):467–474, 2010. [22] David J Richardson, Julea N Butt, Jim K Fredrickson, John M Zachara, Liang Shi, Marcus J Edwards, Gaye White, Nanakow Baiden, Andrew J Gates, Sophie J Marritt, et al. The ’porin–cytochrome’ model for microbe- to-mineral electron transfer. Molecular Microbiology, 85(2):201–212, 2012. [23] Clemens Bücking, Felix Popp, Sven Kerzenmacher, and Johannes Gescher. Involvement and specificity of shewanella oneidensis outer membrane cytochromes in the reduction of soluble and solid-phase terminal electron acceptors. FEMS microbiology letters, 306(2):144–151, 2010. [24] Thomas A Clarke, Marcus J Edwards, Andrew J Gates, Andrea Hall, Gaye F White, JustinBradley, CatherineLReardon, LiangShi, AlexanderSBeliaev, Matthew J Marshall, et al. Structure of a bacterial cell surface decaheme electron conduit. Proceedings of the National Academy of Sciences, 108(23): 9384–9389, 2011. [25] Gerd Binnig and Heinrich Rohrer. Scanning tunneling microscopy. IBM Journal of research and development, 44(1-2):279–293, 2000. [26] Fuyuki Shimojo, Shinnosuke Hattori, Rajiv K Kalia, Manaschai Kunaseth, Weiwei Mou, Aiichiro Nakano, Ken-ichi Nomura, Satoshi Ohmura, Pankaj Rajak, Kohei Shimamura, et al. A divide-conquer-recombine algorithmic paradigm for large spatiotemporal quantum molecular dynamics simulations. The Journal of chemical physics, 140(18):18A529, 2014. [27] Wesley P Petersen and Peter Arbenz. Introduction to parallel computing. Number 9. Oxford University Press, 2004. [28] Enrique Martínez, Jaime Marian, Malvin H Kalos, and José Manuel Perlado. Synchronous parallel kinetic monte carlo for continuum diffusion-reaction systems. Journal of Computational Physics, 227(8):3804–3823, 2008. 143 [29] Enrique Martínez, Paul R Monasterio, and Jaime Marian. Billion-atom syn- chronous parallel kinetic monte carlo simulations of critical 3d ising systems. Journal of Computational Physics, 230(4):1359–1369, 2011. [30] Marian Breuer, Piotr Zarzycki, Liang Shi, ThomasA Clarke, MarcusJ Edwards, JuleaN Butt, DavidJ Richardson, JamesK Fredrickson, JohnM Zachara, Jochen Blumberger, et al. Molecular structure and free energy landscape for electron transport in the decahaem cytochrome MtrF. Bio- chemical Society Transactions, 40(6):1198, 2012. [31] Marian Breuer, Piotr Zarzycki, Jochen Blumberger, and Kevin M Rosso. Thermodynamics of electron flow in the bacterial deca-heme cytochrome MtrF. Journal of the American Chemical Society, 134(24):9868–9871, 2012. [32] Marian Breuer, Kevin M Rosso, and Jochen Blumberger. Electron flow in multiheme bacterial cytochromes is a balancing act between heme electronic interaction and redox potentials. Proceedings of the National Academy of Sciences, 111(2):611–616, 2014. [33] Catherine L Reardon, AC Dohnalkova, Ponnusamy Nachimuthu, David W Kennedy, DA Saffarini, Bruce W Arey, Liang Shi, Zheming Wang, D Moore, Jeffrey S Mclean, et al. Role of outer-membrane cytochromes MtrC and OmcA in the biomineralization of ferrihydrite by Shewanella oneidensis MR- 1. Geobiology, 8(1):56–68, 2010. [34] Daniel E Ross, Susan L Brantley, and Ming Tien. Kinetic characterization of omca and mtrc, terminal reductases involved in respiratory electron transfer for dissimilatory iron reduction in shewanella oneidensis mr-1. Applied and environmental microbiology, 75(16):5218–5226, 2009. [35] Toshihiko Shibanuma, Ryuhei Nakamura, Yuichiro Hirakawa, Kazuhito Hashimoto, and Kazuyuki Ishii. Observation of in vivo cytochrome-based electron-transport dynamics using time-resolved evanescent wave electroab- sorption spectroscopy. Angewandte Chemie International Edition, 50(39): 9137–9140, 2011. [36] Jeffrey S McLean, Greg Wanger, Yuri A Gorby, Martin Wainstein, Jeff McQuaid, Shun’ichi Ishii, Orianna Bretschger, Haluk Beyenal, and Ken- neth H Nealson. Quantification of electron transfer rates to a solid phase electron acceptor through the stages of biofilm formation from single cells to multicellular communities. Environmental Science & Technology, 44(7): 2721–2727, 2010. 144 [37] Jimmy Borloo, Bjorn Vergauwen, Lina De Smet, Ann Brige, Bart Motte, Bart Devreese, and Jozef Van Beeumen. A kinetic approach to the depen- dence of dissimilatory metal reduction by shewanella oneidensis mr-1 on the outer membrane cytochromes c omca and omcb. Febs Journal, 274(14): 3728–3738, 2007. [38] Brian H Lower, Liang Shi, Ruchirej Yongsunthon, Timothy C Droubay, David E McCready, and Steven K Lower. Specific bonds between an iron oxide surface and outer membrane cytochromes mtrc and omca from she- wanella oneidensis mr-1. Journal of bacteriology, 189(13):4944–4952, 2007. [39] Akihiro Okamoto, Kazuhito Hashimoto, Kenneth H Nealson, and Ryuhei Nakamura. Rate enhancement of bacterial extracellular electron transport involves bound flavin semiquinones. Proceedings of the National Academy of Sciences, 110(19):7856–7861, 2013. [40] Sarah M Strycharz-Glaven, Rachel M Snider, Anthony Guiseppi-Elie, and Leonard M Tender. On the electrical conductivity of microbial nanowires and biofilms. Energy & Environmental Science, 4(11):4366–4379, 2011. [41] Sahand Pirbadian and Mohamed Y El-Naggar. Multistep hopping and extra- cellular charge transfer in microbial redox chains. Physical Chemistry Chem- ical Physics, 14(40):13802–13808, 2012. [42] Nicholas F Polizzi, Spiros S Skourtis, and David N Beratan. Physical con- straints on charge transport through bacterial nanowires. Faraday discus- sions, 155:43–61, 2012. [43] Carlo Augusto Bortolotti, Magdalena E Siwko, Elena Castellini, Antonio Ranieri, Marco Sola, and Stefano Corni. The reorganization energy in cytochrome c is controlled by the accessibility of the heme to the solvent. The Journal of Physical Chemistry Letters, 2(14):1761–1765, 2011. [44] Nicholas S Wigginton, Kevin M Rosso, Brian H Lower, Liang Shi, and Michael F Hochella. Electron tunneling properties of outer-membrane deca- heme cytochromes from Shewanella oneidensis. Geochimica et cosmochimica acta, 71(3):543–555, 2007. [45] Nicholas S Wigginton, Kevin M Rosso, and Michael F Hochella. Mechanisms of electron transfer in two decaheme cytochromes from a metal-reducing bacterium. The Journal of Physical Chemistry B,111(44):12857–12864,2007. [46] DN Beratan, JN Betts, and JN Onuchic. Protein electron transfer rates set by the bridging secondary and tertiary structure. Science, 252(5010): 1285–1288, 1991. 145 [47] Ilya A Balabin, Xiangqian Hu, and David N Beratan. Exploring biologi- cal electron transfer pathway dynamics with the pathways plugin for vmd. Journal of computational chemistry, 33(8):906–910, 2012. [48] Jeffrey J Gray. The interaction of proteins with solid surfaces. Current opinion in structural biology, 14(1):110–115, 2004. [49] Sahand Pirbadian, Sarah E Barchinger, Kar Man Leung, Hye Suk Byun, YaminiJangir, RachidaABouhenni, SamanthaBReed, MargaretFRomine, Daad A Saffarini, Liang Shi, et al. Shewanella oneidensis MR-1 nanowires are outer membrane and periplasmic extensions of the extracellular electron transport components. Proceedings of the National Academy of Sciences, 111 (35):12883–12888, 2014. [50] Y Zhang, C Liu, A Balaeff, SS Skourtis, and DN Beratan. A flickering resonance mechanism for biological charge transfer. Proc. Natl Acad. Sci. USA, 111:10049–10054, 2014. [51] Liang Shi, Baowei Chen, Zheming Wang, Dwayne A Elias, M Uljana Mayer, Yuri A Gorby, Shuison Ni, Brian H Lower, David W Kennedy, David S Wun- schel, et al. Isolation of a high-affinity functional protein complex between OmcA and MtrC: two outer membrane decaheme c-type cytochromes of She- wanella oneidensis MR-1. Journal of Bacteriology, 188(13):4705–4714, 2006. [52] Liang Shi, Jiann-Trzwo Lin, Lye M Markillie, Thomas C Squier, and Brian S Hooker. Overexpression of multi-heme c-type cytochromes. Biotechniques, 38:297–299, 2005. [53] Christopher C Moser, Jonathan M Keske, Kurt Warncke, Ramy S Farid, and P Leslie Dutton. Nature of biological electron transfer. Nature, 355(6363): 796–802, 1992. [54] Harry B Gray and Jay R Winkler. Electron tunneling through proteins. Quarterly Reviews of Biophysics, 36(03):341–372, 2003. [55] Jianping Lin, Ilya A Balabin, and David N Beratan. The nature of aqueous tunneling pathways between electron-transfer proteins. Science, 310(5752): 1311–1313, 2005. [56] Charles R Myers and Kenneth H Nealson. Bacterial manganese reduction and growth with manganese oxide as the sole electron acceptor. Science, 240, 1988. [57] James K Fredrickson, Margaret F Romine, Alexander S Beliaev, Jennifer M Auchtung, Michael E Driscoll, Timothy S Gardner, Kenneth H Nealson, 146 Andrei L Osterman, Grigoriy Pinchuk, Jennifer L Reed, et al. Towards environmental systems biology of Shewanella. Nature Reviews Microbiology, 6(8):592–603, 2008. [58] Mohamed Y El-Naggar and Steven E Finkel. Live wires. The Scientist, 27 (5):38–43, 2013. [59] Marian Breuer, Kevin M Rosso, Jochen Blumberger, and Julea N Butt. Multi-haem cytochromes in Shewanella oneidensis MR-1: structures, func- tions and opportunities. Journal of The Royal Society Interface, 12(102): 20141117, 2015. [60] Kar Man Leung, Greg Wanger, Mohamed Y El-Naggar, Yuri Gorby, Gordon Southam, Woon Ming Lau, and Jun Yang. Shewanella oneidensis MR-1 bac- terial nanowires exhibit p-type, tunable electronic behavior. Nano Letters, 13(6):2407–2411, 2013. [61] CR Myers and JM Myers. Cell surface exposure of the outer membrane cytochromes of Shewanella oneidensis MR-1. Letters in Applied Microbiol- ogy, 37(3):254–258, 2003. [62] Brian H Lower, Ruchirej Yongsunthon, Liang Shi, Linda Wildling, Her- mann J Gruber, Nicholas S Wigginton, Catherine L Reardon, Grigoriy E Pinchuk, Timothy C Droubay, Jean-François Boily, et al. Antibody recog- nition force microscopy shows that outer membrane cytochromes omca and mtrc are expressed on the exterior surface of shewanella oneidensis mr-1. Applied and environmental microbiology, 75(9):2931–2935, 2009. [63] William Humphrey, Andrew Dalke, and Klaus Schulten. Vmd: visual molec- ular dynamics. Journal of molecular graphics, 14(1):33–38, 1996. [64] Hye Suk Byun, Sahand Pirbadian, Aiichiro Nakano, Liang Shi, and Mohamed Y El-Naggar. Kinetic Monte Carlo simulations and molecular con- ductance measurements of the bacterial decaheme cytochrome MtrF. Chem- ElectroChem, 1(11):1932–1939, 2014. [65] Hai Lin and Donald G Truhlar. Qm/mm: what have we learned, where are we, and where do we go from here? Theoretical Chemistry Accounts, 117(2): 185–199, 2007. [66] Anna I Krylov and Peter MW Gill. Q-chem: an engine for innovation. Wiley Interdisciplinary Reviews: Computational Molecular Science, 3(3):317–326, 2013. 147 [67] Alfred B Bortz, Malvin H Kalos, and Joel L Lebowitz. A new algorithm for monte carlo simulation of ising spin systems. Journal of Computational Physics, 17(1):10–18, 1975. [68] Daniel T Gillespie. A general method for numerically simulating the stochas- tic time evolution of coupled chemical reactions. Journal of computational physics, 22(4):403–434, 1976. [69] KristenAFichthornandWHhWeinberg. Theoreticalfoundationsofdynam- ical monte carlo simulations. The Journal of chemical physics, 95(2):1090– 1096, 1991. [70] ArthurFVoter. Introductiontothekineticmontecarlomethod. InRadiation Effects in Solids, pages 1–23. Springer, 2007. [71] Jayashree Srinivasan, Thomas E Cheatham, Piotr Cieplak, Peter A Kollman, and David A Case. Continuum solvent studies of the stability of dna, rna, andphosphoramidate-dnahelices. Journal of the American Chemical Society, 120(37):9401–9409, 1998. [72] Irina Massova and Peter A Kollman. Computational alanine scanning to probe protein-protein interactions: a novel approach to evaluate binding free energies. Journal of the American Chemical Society, 121(36):8133–8143, 1999. [73] Cristina Paissoni, Dimitrios Spiliotopoulos, Giovanna Musco, and Andrea Spitaleri. Gmxpbsa 2.0: A gromacs tool to perform mm/pbsa and com- putational alanine scanning. Computer Physics Communications, 185(11): 2920–2929, 2014. [74] Cristina Paissoni, Dimitrios Spiliotopoulos, Giovanna Musco, and Andrea Spitaleri. Gmxpbsa 2.1: A gromacs tool to perform mm/pbsa and compu- tational alanine scanning. Computer Physics Communications, 186:105–107, 2015. [75] Helen M Berman, John Westbrook, Zukang Feng, Gary Gilliland, Tala- pady N Bhat, Helge Weissig, Ilya N Shindyalov, and Philip E Bourne. The protein data bank. Nucleic acids research, 28(1):235–242, 2000. [76] Brian G Pierce, Kevin Wiehe, Howook Hwang, Bong-Hyun Kim, Thom Vreven, and Zhiping Weng. Zdock server: interactive docking prediction of protein–protein complexes and symmetric multimers. Bioinformatics, 30 (12):1771–1773, 2014. 148 [77] Cameron Newham and Bill Rosenblatt. Learning the bash shell: Unix shell programming. " O’Reilly Media, Inc.", 2005. [78] Robert L Henderson. Job scheduling under the portable batch system. In Workshop on Job Scheduling Strategies for Parallel Processing, pages 279– 294. Springer, 1995. [79] Justin M Wozniak, Timothy G Armstrong, Michael Wilde, Daniel S Katz, Ewing Lusk, and Ian T Foster. Swift/t: Scalable data flow programming for many-task applications. In ACM SIGPLAN Notices, volume 48, pages 309–310. ACM, 2013. [80] Ewa Deelman, James Blythe, Yolanda Gil, Carl Kesselman, Gaurang Mehta, Sonal Patil, Mei-Hui Su, Karan Vahi, and Miron Livny. Pegasus: Mapping scientific workflows onto the grid. In Grid Computing, pages 11–20. Springer, 2004. [81] Marcus J Edwards, Nanakow A Baiden, Alexander Johs, Stephen J Toman- icek,LiyuanLiang,LiangShi,JimKFredrickson,JohnMZachara,AndrewJ Gates, Julea N Butt, et al. The x-ray crystal structure of shewanella oneiden- sis omca reveals new insight at the microbe–mineral interface. FEBS letters, 588(10):1886–1890, 2014. [82] Vladimir N Maiorov and Gordon M Crippen. Size-independent comparison of protein three-dimensional structures. Proteins: Structure, Function, and Bioinformatics, 22(3):273–283, 1995. [83] Berk Hess, Carsten Kutzner, David Van Der Spoel, and Erik Lindahl. Gro- macs 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. Journal of chemical theory and computation, 4(3):435–447, 2008. [84] Alex D MacKerell Jr, Donald Bashford, MLDR Bellott, Roland Leslie Dun- brack Jr, Jeffrey D Evanseck, Martin J Field, Stefan Fischer, Jiali Gao, H Guo, Sookhee Ha, et al. All-atom empirical potential for molecular mod- eling and dynamics studies of proteinsâĂă. The journal of physical chemistry B, 102(18):3586–3616, 1998. [85] Nathan A Baker, David Sept, Simpson Joseph, Michael J Holst, and J Andrew McCammon. Electrostatics of nanosystems: application to micro- tubules and the ribosome. Proceedings of the National Academy of Sciences, 98(18):10037–10041, 2001. [86] Tingjun Hou, Junmei Wang, Youyong Li, and Wei Wang. Assessing the performance of the mm/pbsa and mm/gbsa methods. 1. the accuracy of 149 binding free energy calculations based on molecular dynamics simulations. Journal of chemical information and modeling, 51(1):69–82, 2010. [87] Van A Ngo. Parallel-pulling protocol for free-energy evaluation. Physical Review E, 85(3):036702, 2012. [88] A Warshel and M Karplus. Calculation of ground and excited state potential surfacesofconjugatedmolecules.i.formulationandparametrization. Journal of the American Chemical Society, 94(16):5612–5625, 1972. [89] Arieh Warshel and Michael Levitt. Theoretical studies of enzymic reactions: dielectric, electrostatic and steric stabilization of the carbonium ion in the reaction of lysozyme. Journal of molecular biology, 103(2):227–249, 1976. [90] Rudolph A Marcus. On the theory of oxidation-reduction reactions involving electron transfer. I. The Journal of Chemical Physics, 24(5):966–978, 1956. [91] John K Ousterhout and Ken Jones. Tcl and the Tk toolkit. Pearson Educa- tion, 2009. [92] Tao Wei, Marcelo A Carignano, and Igal Szleifer. Lysozyme adsorption on polyethylene surfaces: why are long simulations needed? Langmuir, 27(19): 12074–12081, 2011. [93] Tao Wei, Marcelo A Carignano, and Igal Szleifer. Molecular dynamics sim- ulation of lysozyme adsorption/desorption on hydrophobic surfaces. The Journal of Physical Chemistry B, 116(34):10189–10194, 2012. [94] Izaak Neri, Norbert Kern, and Andrea Parmeggiani. Totally asymmetric simple exclusion process on networks. Physical review letters, 107(6):068702, 2011. [95] T Chou, K Mallick, and RKP Zia. Non-equilibrium statistical mechanics: from a paradigmatic model to biological transport. Reports on progress in physics, 74(11):116601, 2011. [96] Mieke Gorissen, Alexandre Lazarescu, Kirone Mallick, and Carlo Van- derzande. Exactcurrentstatisticsoftheasymmetricsimpleexclusionprocess with open boundaries. Physical review letters, 109(17):170601, 2012. [97] Dipesh Bhattarai and Bijaya B Karki. Atomistic visualization: space–time multiresolution integration of data analysis and rendering. Journal of Molec- ular Graphics and Modelling, 27(8):951–968, 2009. 150 [98] Atsutomo Nakamura, Katsuyuki Matsunaga, Jun Tohma, Takahisa Yamamoto, and Yuichi Ikuhara. Conducting nanowires in insulating ceram- ics. Nature Materials, 2(7):453–456, 2003. [99] Kenji Tsuruta, Eita Tochigi, Yuki Kezuka, Kazuaki Takata, Naoya Shibata, Atsutomo Nakamura, and Yuichi Ikuhara. Core structure and dissociation energetics of basal edge dislocation in α-al 2 o 3: A combined atomistic simulation and transmission electron microscopy analysis. Acta Materialia, 65:76–84, 2014. [100] Daniel A Reed and Jack Dongarra. Exascale computing and big data. Com- munications of the ACM, 58(7):56–68, 2015. [101] Nichols A Romero, Aiichiro Nakano, Katherine M Riley, Fuyuki Shimojo, Rajiv K Kalia, Priya Vashishta, and Paul C Messina. Quantum molecular dynamics in the post-petaflops era. Computer, 48(11):33–41, 2015. [102] Leslie Greengard and Vladimir Rokhlin. A fast algorithm for particle simu- lations. Journal of computational physics, 73(2):325–348, 1987. [103] Ken-ichi Nomura, Richard Seymour, Weiqiang Wang, Hikmet Dursun, RajivKKalia, AiichiroNakano, PriyaVashishta, FuyukiShimojo, andLinH Yang. A metascalable computing framework for large spatiotemporal-scale atomistic simulations. In Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on, pages 1–10. IEEE, 2009. [104] DR Bowler and T Miyazaki. methods in electronic structure calculations. Reports on Progress in Physics, 75(3):036503, 2012. [105] David E Shaw, Ron O Dror, John K Salmon, JP Grossman, Kenneth M Mackenzie, Joseph A Bank, Cliff Young, Martin M Deneroff, Brannon Bat- son, Kevin J Bowers, et al. Millisecond-scale molecular dynamics simulations on anton. In Proceedings of the conference on high performance computing networking, storage and analysis, pages 1–11. IEEE, 2009. [106] DannyPerez, BlasPUberuaga, YunsicShim, JacquesGAmar, andArthurF Voter. Accelerated molecular dynamics methods: introduction and recent developments. Annual Reports in computational chemistry, 5:79–98, 2009. [107] Donald G Truhlar, Bruce C Garrett, and Stephen J Klippenstein. Current status of transition-state theory. The Journal of physical chemistry, 100(31): 12771–12800, 1996. [108] Peter Hänggi, Peter Talkner, and Michal Borkovec. Reaction-rate theory: fifty years after kramers. Reviews of modern physics, 62(2):251, 1990. 151 [109] Arthur F Voter. Parallel replica method for dynamics of infrequent events. Physical Review B, 57(22):R13985, 1998. [110] AiichiroNakano. Pathfinder: Aparallelsearchalgorithmforconcertedatom- istic events. Computer physics communications, 176(4):292–299, 2007. [111] Aiichiro Nakano. A space–time-ensemble parallel nudged elastic band algo- rithm for molecular kinetics simulation. Computer Physics Communications, 178(4):280–289, 2008. [112] Kai J Kohlhoff, Diwakar Shukla, Morgan Lawrenz, Gregory R Bowman, David E Konerding, Dan Belov, Russ B Altman, and Vijay S Pande. Cloud- based simulations on google exacycle reveal ligand modulation of gpcr acti- vation pathways. Nature chemistry, 6(1):15–21, 2014. [113] Antonius Petrus Johannes Jansen. An introduction to kinetic Monte Carlo simulations of surface reactions, volume 856. Springer, 2012. [114] Weiwei Mou, Shinnosuke Hattori, Pankaj Rajak, Fuyuki Shimojo, and Aiichiro Nakano. Nanoscopic mechanisms of singlet fission in amorphous molecular solid. Applied Physics Letters, 102(17):173301, 2013. [115] GT Barkema and Normand Mousseau. Identification of relaxation and dif- fusion mechanisms in amorphous silicon. Physical review letters, 81(9):1865, 1998. [116] G Korniss, MA Novotny, H Guclu, Zoltán Toroczkai, and Per Arne Rikvold. Suppressing roughness of virtual times in parallel discrete-event simulations. Science, 299(5607):677–679, 2003. [117] Yunsic Shim and Jacques G Amar. Rigorous synchronous relaxation algo- rithmforparallelkineticmontecarlosimulationsofthinfilmgrowth. Physical Review B, 71(11):115436, 2005. [118] Yunsic Shim and Jacques G Amar. Semirigorous synchronous sublattice algorithm for parallel kinetic monte carlo simulations of thin film growth. Physical Review B, 71(12):125432, 2005. [119] Ron Elber and Martin Karplus. Enhanced sampling in molecular dynamics: use of the time-dependent hartree approximation for a simulation of carbon monoxide diffusion through myoglobin. Journal of the American Chemical Society, 112(25):9161–9175, 1990. 152 [120] C Masato Nakano, Hye Suk Byun, Heng Ma, Tao Wei, and Mohamed Y El-Naggar. A framework for stochastic simulations and visualization of bio- logical electron-transfer dynamics. Computer Physics Communications, 193: 1–9, 2015. [121] Aiichiro Nakano. Multiresolution load balancing in curved space: The wavelet representation. Concurrency - Practice and Experience, 11(7):343– 353, 1999. [122] William Gropp, Torsten Hoefler, Rajeev Thakur, and Ewing Lusk. Using advanced MPI: Modern features of the message-passing interface. MIT Press, 2014. [123] Rachel M Snider, Sarah M Strycharz-Glaven, Stanislav D Tsoi, Jeffrey S Erickson, andLeonardMTender. Long-rangeelectrontransportinGeobacter sulfurreducens biofilms is redox gradient-driven. Proceedings of the National Academy of Sciences, 109(38):15467–15472, 2012. [124] Matthew D Yates, Joel P Golden, Jared Roy, Sarah M Strycharz-Glaven, Stanislav Tsoi, Jeffrey S Erickson, Mohamed Y El-Naggar, Scott Calabrese Barton, and Leonard M Tender. Thermally activated long range electron transport in living biofilms. Physical Chemistry Chemical Physics, 17(48): 32564–32570, 2015. [125] Benjamin J Gross and Mohamed Y El-Naggar. A combined electrochemical and optical trapping platform for measuring single cell respiration rates at electrode interfaces. Review of Scientific Instruments, 86(6):064301, 2015. [126] Hye Suk Byun, Mohamed Y El-Naggar, Rajiv K Kalia, Aiichiro Nakano, and Priya Vashishta. A derivation and scalable implementation of the synchronous parallel kinetic monte carlo method for simulating long-time dynamics. Computer Physics Communications, 2017. [127] DStaufferandAAharony. Introductiontopercolationtheory, tailor&francis, london. United Kingdom Google Scholar, 1985. [128] Bruce B Lowekamp and John C Schug. An efficient algorithm for nucle- ation studies of particles using the octree data structure. Computer physics communications, 76(3):281–293, 1993. [129] Wikipedia. Breadth-first search — Wikipedia, the free encyclope- dia. http://en.wikipedia.org/w/index.php?title=Breadth-first% 20search&oldid=783367329, 2017. [Online; accessed 18-June-2017]. 153 [130] H Lodish, A Berk, SL Zipursky, P Matsudaira, D Baltimore, and J Dar- nell. Section 15.1, diffusion of small molecules across phospholipid bilayers. Molecular Cell Biology, 2000. [131] Matthew D Yates, Sarah M Strycharz-Glaven, Joel P Golden, Jared Roy, Stanislav Tsoi, Jeffrey S Erickson, Mohamed Y El-Naggar, Scott Calabrese Barton, and Leonard M Tender. Measuring conductivity of living geobacter sulfurreducens biofilms. Nature nanotechnology, 11(11):910–913, 2016. 154
Abstract (if available)
Abstract
Dissimilatory metal reducing bacteria, such as Shewanella oneidensis MR-1, can gain energy by coupling the intracellular oxidation of electron donors to the reduction of external surfaces, such as natural minerals or even synthetic electrodes. This respiratory strategy, known as extracellular electron transfer (EET), has significant environmental and biotechnological implications. Extensive studies of Shewanella have revealed multiple proposed routes for EET between cells and surfaces, all of which rely on an a network of periplasmic and outer membrane multiheme cytochromes to traffic electrons to the cell surface where they may directly reduce minerals or electrodes. The multiheme cytochromes (MtrC and OmcA) are also present on long (micrometer-scale) membrane filaments, known as bacterial nanowires, that are proposed to perform EET to more distance surfaces. While a detailed characterization of long-distance electron conduction in such redox net- works remains challenging, especially under native conditions and in solution, this thesis complements the emerging experimental work with computational insight into redox-based conduction in individual cytochromes, cytochrome complexes, and even whole bacterial nanowires. ❧ This thesis is divided into two parts. The first part of the thesis (chapter 2 and 3) focuses on electron transport through cytochromes and cytochrome complexes that bridge cells to minerals and anodes. Chapter 2 reports electron transport measurements through the single outer-membrane multi-heme cytochrome MtrF from Shewanella oneidensis MR-1 using scanning tunneling microscopy, and analyzes this transport using Kinetic Monte Carlo (KMC) simulations. The simulation results demonstrate reasonable agreement between a multistep electron hopping model through MtrF’s heme chain and experimentally measured currents. We then extend the computational approach by including a structural prediction of an Mtr-Omc complex that explores binding of these proteins under plausible physiological conditions while allowing efficient inter-protein electron transfer. We apply KMC simulations to compute electron flux through the predicted dimer and then visualize the simulation results to examine electron transfer dynamics. ❧ The second part of the thesis (chapters 4-6) focuses on understanding long- distance electron transport through micrometer-scale bacterial nanowires. Recent in vivo observations of the Shewanella nanowires’ formation and composition high- lighted the importance of the outer-membrane cytochromes MtrC and OmcA, thereby motivating stochastic simulations of electron transport in a large-scale redox network of cytochromes lining nanowires. Chapter 4 details how to over- come the previously implemented non-scalable conventional KMC algorithm, in which the time scale is inversely proportional to the simulated system size, making it computationally prohibitive to treat whole nanowires or biofilms. We devise a divide-and-conquer KMC (DCKMC) algorithm with decent strong-scaling parallel efficiency for simulating biological electron transfer dynamics in a 4.2 billion-heme system. The parallel DCKMC algorithm is used to simulate a micrometer-scale redox network of cytochrome arrays on a bacterial nanowire. Chapter 5 reports simulation results for large-scale cytochrome networks using our DCKMC algorithm. Our first test model is a 1.12 μm (idealized) nanowire treated as an array of 1280 MtrC molecules arranged hexagonally on a membrane tube. The 1280 MtrC molecules are decomposed into 64 domains simulated in parallel. The simulation results from the model shows net electron transport rate of 105 s⁻¹, remarkably consistent with the measured respiration rate of Shewanella. This initial finding leads us to design biologically more plausible cytochrome network models. Chapter 6 first proposes percolation analysis as a screening method prior to DCKMC simulations. DCKMC then simulates electron dynamics on the percolation-screened redox network models with various configurations that examine the 1) relative arrangement of heme chains in the cytochrome network
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
From single molecules to bacterial nanowires: functional and dynamic imaging of the extracellular electron transfer network in Shewanella oneidensis MR-1
PDF
Bacterial nanowires of Shewanella oneidensis MR-1: electron transport mechanism, composition, and role of multiheme cytochromes
PDF
Electrochemical studies of subsurface microorganisms
PDF
Electrochemical studies of outward and inward extracellular electron transfer by microorganisms from diverse environments
PDF
From fuel cells to single cells: electrochemical measurements of direct electron transfer at microbial-electrode interfaces
PDF
From cables to biofilms: electronic and electrochemical characterization of electroactive microbial systems
PDF
Microbial interaction networks: from single cells to collective behavior
PDF
Electrochemical investigations and imaging tools for understanding extracellular electron transfer in phylogenetically diverse bacteria
PDF
The role of the environment around the chromophore in controlling the photophysics of fluorescent proteins
Asset Metadata
Creator
Byun, Hye Suk
(author)
Core Title
Kinetic Monte Carlo simulations for electron transport in redox proteins: from single cytochromes to redox networks
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Physics
Publication Date
07/07/2017
Defense Date
07/07/2017
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
cytochrome, nanowire,electron transport,kinetic Monte Carlo,OAI-PMH Harvest
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
El-Naggar, Moh (
committee chair
), Amend, Jan (
committee member
), Boedicker, James (
committee member
), Dappen, Werner (
committee member
), Nakano, Aiichiro (
committee member
)
Creator Email
hyesuk.byun@gmail.com,hyesukby@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-396680
Unique identifier
UC11263905
Identifier
etd-ByunHyeSuk-5485.pdf (filename),usctheses-c40-396680 (legacy record id)
Legacy Identifier
etd-ByunHyeSuk-5485.pdf
Dmrecord
396680
Document Type
Dissertation
Rights
Byun, Hye Suk
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Tags
cytochrome, nanowire
electron transport
kinetic Monte Carlo