Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Advancing the state of the art in quantum many-body physics simulations: Permutation Matrix Representation Quantum Monte Carlo and its Applications
(USC Thesis Other)
Advancing the state of the art in quantum many-body physics simulations: Permutation Matrix Representation Quantum Monte Carlo and its Applications
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
ADVANCING THE STATE OF THE ART IN QUANTUM MANY-BODY PHYSICS SIMULATIONS: PERMUTATION MATRIX REPRESENTATION QUANTUM MONTE CARLO AND ITS APPLICATIONS by Lalit Gupta A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulllment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (PHYSICS) May 2022 Copyright 2022 Lalit Gupta Acknowledgements I would like to thank my advisor Prof. Itay Hen for his continuous support and motiva- tion throughout my Ph.D. He involved me in innovative projects which helped enhance my research skills. I am grateful for his immense support and encouragement to pursue my doctoral studies. It was great collaborating with Prof. Tameem Albash, Dr. Lev Barash, Dr. Thomas Halverson, Prof. Moshe Goldstein, Prof. Daniel Lidar, and Dr. Victor Kasatkin. I am thankful for their support and providing insights during my research. I would also like to thank all the professors who taught me various courses at USC. The knowledge gained helped me in carrying out my research in several ways. A special thanks to my parents, family and friends for their ongoing love, unconditional support, and motivation for years. ii Contents Acknowledgements ii List of Tables v List of Figures vi Abstract viii 1 Introduction 1 2 Permutation Matrix Representation Quantum Monte Carlo 4 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.2 The permutation matrix representation . . . . . . . . . . . . . . . . . . . . . 8 2.2.1 Example I: A single spin-1=2 particle . . . . . . . . . . . . . . . . . . 9 2.2.2 Example II: Two-local spin-1=2 models . . . . . . . . . . . . . . . . . 9 2.2.3 Example III: Spin-one particles (qutrits) . . . . . . . . . . . . . . . . 10 2.2.4 Example IV: The Bose-Hubbard model . . . . . . . . . . . . . . . . . 10 2.3 O-diagonal partition function expansion . . . . . . . . . . . . . . . . . . . . 11 2.4 Non-stoquasticity and emergence of the sign problem . . . . . . . . . . . . . 16 2.5 The QMC algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 2.5.1 QMC congurations and GBW calculation . . . . . . . . . . . . . . . 18 2.5.2 Initial state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.5.3 Updates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.5.4 Measurements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.6 Advanced measurement schemes for PMR-QMC . . . . . . . . . . . . . . . . 26 2.6.1 Hamiltonian expectation values and functions thereof . . . . . . . . . 26 2.6.2 Measurements of arbitrary static operators . . . . . . . . . . . . . . . 28 2.6.3 Imaginary-time correlations and integrated susceptibilities . . . . . . 31 2.7 A worm update for PMR-QMC . . . . . . . . . . . . . . . . . . . . . . . . . 33 2.8 Numerical simulations: the toric code . . . . . . . . . . . . . . . . . . . . . . 34 2.9 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 2.9.1 PMR vs SSE: Transverse-eld Ising model simulations . . . . . . . . 37 2.9.2 Ising model with random XX interactions . . . . . . . . . . . . . . . 38 2.10 Summary and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 iii 3 Calculating the divided dierences of the exponential function by addition and removal of inputs 45 3.1 Introduction: Divided dierences of the exponential function . . . . . . . . . 45 3.2 Calculating DDEFs by addition and removal of inputs . . . . . . . . . . . . . 50 3.2.1 Zivcovich's algorithm for evaluating DDEFs . . . . . . . . . . . . . . 50 3.2.2 Adding an item to the input stack . . . . . . . . . . . . . . . . . . . . 52 3.2.3 Removing an item from the input stack . . . . . . . . . . . . . . . . . 54 3.2.4 Direct computation of the modied DDEFs with higher precision . . 54 3.2.5 Accurate computation of DDEFS for large n and s . . . . . . . . . . 56 3.3 Numerical testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 3.3.1 Calculation of DDEFs with extended precision versus double precision oating point data structures . . . . . . . . . . . . . . . . . . . . . . 57 3.3.2 Runtimes of the addition and removal routines . . . . . . . . . . . . . 58 3.4 Applicability to o-diagonal expansion quantum Monte Carlo . . . . . . . . 59 3.5 Summary and outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 4 Elucidating the Interplay between NonStoquasticity and the Sign Problem 65 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 4.1.1 Basic denitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.1.2 Determining the severity of the sign problem . . . . . . . . . . . . . . 67 4.1.3 Curing the sign problem . . . . . . . . . . . . . . . . . . . . . . . . . 70 4.2 O-diagonal series expansion of the quantum partition function . . . . . . . 71 4.2.1 ODE: an overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.2.2 The sign problem within ODE . . . . . . . . . . . . . . . . . . . . . . 74 4.3 Non-stoquasticity and the emergence of the sign problem . . . . . . . . . . . 74 4.4 Illustrative examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 4.4.1 A single qubit cannot have a sign problem . . . . . . . . . . . . . . . 77 4.4.2 A spin-one particle can have a sign problem . . . . . . . . . . . . . . 78 4.4.3 The sign problem and positivity of ground-state amplitudes . . . . . 82 4.5 Conclusions and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 5 Ecient simulation of so-called non-stoquastic superconducting ux cir- cuits 86 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 5.2 Eciently simulating superconducting ux circuits . . . . . . . . . . . . . . 88 5.3 Simulating non-stoquastic ux qubit Hamiltonians . . . . . . . . . . . . . . . 90 5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 5.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 References 97 Appendix 105 Finite dimensional permutation matrix representations . . . . . . . . . . . . . . . 105 Hermiticity of H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 Proof of convolution theorem for divided dierences . . . . . . . . . . . . . . . . . 109 iv List of Tables 2.1 Cycle completion moves for the transverse eld Ising model with two-bodyX interactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 4.1 Operator sequences in the lowest expansion orders, the associated weights and signs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 5.1 Fixed circuit parameters for the coupled rf-SQUID system . . . . . . . . . . 91 5.2 Coecients for the two qubit rf-SQUID circuit normal mode Hamiltonian. . 96 v List of Figures 2.1 Diagrammatic representation of a generalized Boltzmann weight . . . . . . . 14 2.2 Basic update moves of the QMC algorithm . . . . . . . . . . . . . . . . . . . 20 2.3 An example for a worm algorithm sequence . . . . . . . . . . . . . . . . . . . 35 2.4 The toric code Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 2.5 PMR QMC measurements for 16 spin toric code Hamiltonian . . . . . . . . . 36 2.6 Thermal average of the o-diagonal Hamiltonian as obtained by parallel tem- pering simulations using PMR and SSE for = 0:4 as a function of simulation time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 2.7 Thermal average of the o-diagonal Hamiltonian as obtained by parallel tem- pering simulations using PMR and SSE for = 0:1 as a function of simulation time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 2.8 Variance of H, denoted 2 H , as a function of the tree-width of the underlying graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 2.9 Diagonal energyhH p i as a function of s with and without the XX catalyst 42 3.1 Left: DDEF calculation times as a function of n, with s = 1 and s = 2. . . . 59 3.2 Linear runtime of the addition routine as a function of input size ands. Linear runtime of the removal routine as a function of input size n . . . . . . . . . . 60 3.3 Average number of removals from the top of the stack . . . . . . . . . . . . . 64 4.1 Diagrammatic representation of a generalized Boltzmann weight . . . . . . . 73 vi 4.2 Action ofD 1 P 1 (orange arrows) andD 2 P 2 (blue arrows) on the three classical basis statesjg 1 i,jg 2 i andjei . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 4.3 hsgni as a function of for various values of andhsgni as a function of for various values of . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 4.4 The severity of the sign problemhsgni as a function of for the Hamiltonian H =(P +P 3 ) +P 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 5.1 Circuit schematic for the non-stoquastic two qubit rf-SQUID . . . . . . . . . 91 5.2 Schematic overview of the change in the potential energy of the two qubit circuit Hamiltonian during the annealing process . . . . . . . . . . . . . . . . 92 5.3 Magnitude of the relative error of the persistent current as a function of the convergence parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 5.4 Persistent current calculated for both exact diagonalization and ODE QMC at various x points throughout the annealing process . . . . . . . . . . . . . 95 vii Abstract This thesis is focused on the development and application of a novel Quantum Monte Carlo (QMC) algorithm to advance the state-of-the-art of quantum many body physics simula- tions. We showed a novel way to decompose the partition function of a Hamiltonian using permutation matrix representation (PMR) formalism which is free from Trotter error. We proposed updates for our QMC method that can be implemented without the prior knowl- edge of Hamiltonian and does not require us to set the expansion order parameter in advance. We described how to compute the thermal expectation of any complex Hermitian operator in our QMC algorithm. We also benchmarked our algorithm against the state-of-the-art Stochastic Series Expansion (SSE) QMC algorithm and found four or more orders of mag- nitude advantage. To compute the ratio of weights in our QMC algorithm, we needed an ecient way to calculate the ratio of divided dierences of the exponential function (DDEF) which diers by one or two inputs. Computing the divided dierences using standard recur- sion denition gives inaccurate results due to its numerical instability. Methods found in the literature to accurately compute the DDEF have high time complexity as they focus on com- puting DDEF and not the ratio of DDEF. We devised an ecient algorithm to compute the ratio of DDEF using the addition and removal of inputs. Sign problem in QMC using PMR formalism and its connection with non-stoquasticity is studied. We demonstrated using the Qutrit model how it is possible to have no sign problem for a Hamiltonian with a complex coecient in its o-diagonal terms. Stoquasticity implies that ground state amplitudes of a Hamiltonian all have the same sign. We showed using a simple Hamiltonian example that the reverse is not true. We showed a stoquastic representation of a ux circuit Hamiltonian viii which is believed to be non-stoquastic. We applied our QMC algorithm to show the ecient simulation of the superconducting ux circuit. ix Chapter 1 Introduction Quantum many-body systems exhibit an extremely diverse range of quantum phenomena such as super uidity, quantum Hall eect, superconductivity [76, 11]. One of the earliest systems that physicists studied showing interesting quantum phenomena was liquid He 4 [35]. It undergoes a phase transition at lambda temperature of 2.2 K and remains liquid until 0 K. To study the thermodynamic properties of this quantum system one must calculate the partition function. However, due to the dimensionality curse of the Hilbert space, this problem is intractable using standard techniques such as diagonalizing the Hamiltonian. Feynman in 1953 paper [35] was able to qualitatively explain the lambda transition in He 4 by using the path integral formalism to describe the partition function but such a qualitative analysis is extremely dicult in general. Since our nature is inherently quantum, the study of quantum many body physics is crucial to the understanding of nature. This means we need some scalable and reliable techniques to study quantum systems. Interestingly in the same year, 1953, Metropolis et al. published a paper [87] that describe an ecient way to simulate many body systems using computing machine. This paper [87] by Metropolis et al. formed the basis of the Monte Carlo algorithm. In condensed matter physics, it became one of the essential tools to study properties of complex systems such as spin glasses [7]. Later in the eighties, the technique of world lines used to explain the lambda transition in He 4 by Feynman lead to the birth of famous path integral Quantum Monte Carlo [21]. Quantum Monte Carlo (QMC) algorithms [76, 11] turned out to be extremely useful for studying equilibrium properties of large quantum many-body systems, with applications ranging from superconductivity and novel quantum materials [9, 69, 78] through the physics of neutron stars [81] and quantum chromodynamics [24, 70]. The algorithmic development of QMC remains an active area of research, with the dual goal of extending the scope of 1 QMC applicability and improving convergence rates of existing algorithms to facilitate the discovery of new phenomena [99, 101, 94]. The main focus of this thesis is a novel quantum Monte Carlo algorithm for the simulation of general quantum and classical many-body models within a single unifying framework. We call this method Permutation Matrix Representation (PMR) Quantum Monte Carlo as we use the permutation matrices to describe the Hamiltonian. The algorithm builds on a power series expansion of the quantum partition function in its o-diagonal terms and is both parameter-free and Trotter error-free. In our approach, the quantum dimension consists of products of elements of a permutation group. As such, it allows for the study of a very wide variety of models on an equal footing. To demonstrate the utility of our technique, we use it to clarify the emergence of the sign problem in the simulations of non-stoquastic physical models. We showcase the exibility of our algorithm and the advantages it oers over existing state-of-the-art by simulating transverse-eld Ising model Hamiltonians and comparing the performance of our technique against that of the stochastic series expansion algorithm. We also study a transverse-eld Ising model augmented with randomly chosen two-body transverse-eld interactions. The QMC algorithm that we developed needs to compute generalized Boltzmann weights which are exponential functions of divided dierence. However, computing the generalized Boltzmann weight using the standard recurrence scheme requires increasing precision as the size of the imaginary time dimension grows. Thus double-precision oating-point data types become insucient for the calculations of QMC weights. In this case, multiple-precision oating-point data types are needed, which can slow down the algorithm accordingly. To make our QMC method be able to eciently compute weights we introduce a method for calculating the divided dierences of the exponential function using addition and removal of items from the input list to the function. We show that upon adding an item to or removing an item from the input list of an already evaluated exponential, the re-evaluation of the divided dierences can be done with only O(sn) oating-point operations and O(sn) bytes of memory, where n is the number of inputs to the divided dierence function and s 2 is proportional to the variance of the input values. We demonstrate our algorithm's ability to deal with input lists that are orders-of-magnitude longer than the maximal capacities of the current state-of-the-art. Unfortunately, QMC algorithms despite being one of the powerful tools to study the equilibrium properties of many-body quantum systems have an Achilles heel. When some of the weights in the partition function decomposition of a Hamiltonian become negative, QMC algorithms suer from an exponential slow down. This is known as an infamous sign problem. The sign problem is a key challenge in computational physics, encapsulating our inability to properly understand many important quantum many-body phenomena in physics, chemistry and the material sciences. Despite its centrality, the circumstances under which the problem arises or can be resolved as well as its interplay with the related notion of `non-stoquasticity' are often not very well understood. In this thesis, we made an attempt to elucidate the circumstances under which the sign problem emerges and to clear up some of the confusion surrounding this intricate computational phenomenon. To that aim, we make use of the recently introduced o-diagonal series expansion quantum Monte Carlo scheme with which we analyze in detail a number of examples that capture the essence of our results. There is a tremendous interest in fabricating superconducting ux circuits that are nonstoquastic|i.e., have positive o-diagonal matrix elements|in their qubit representa- tion, as these circuits are thought to be unsimulable by classical approaches and thus could play a key role in the demonstration of speedups in quantum annealing protocols. We show however that the ecient simulation of these systems is possible by the direct simulation of the ux circuits. Our approach not only obviates the reduction to a qubit representation but also produces results that are more in the spirit of the experimental setup. We discuss the implications of our work. Specically, we argue that our results cast doubt on the concep- tion that superconducting ux circuits represent the correct avenue for universal adiabatic quantum computers. 3 Chapter 2 Permutation Matrix Representation Quantum Monte Carlo This chapter is based on the work in the paper [41, 43]. We present a quantum Monte Carlo algorithm for the simulation of general quantum and classical many-body models within a single unifying framework. The algorithm builds on a power series expansion of the quantum partition function in its o-diagonal terms and is both parameter-free and Trotter error-free. In our approach, the quantum dimension consists of products of elements of a permutation group. As such, it allows for the study of a very wide variety of models on an equal footing. To demonstrate the utility of our technique, we use it to clarify the emergence of the sign problem in the simulations of non-stoquastic physical models. We showcase the exibility of our algorithm and the advantages it oers over existing state-of-the-art by simulating transverse-eld Ising model Hamiltonians and comparing the performance of our technique against that of the stochastic series expansion algorithm. We also study a transverse-eld Ising model augmented with randomly chosen two-body transverse-eld interactions. 2.1 Introduction The main idea behind any Quantum Monte Carlo (QMC) technique is to decompose partition function into sum of positive weight. Z =Tre H X c p c (2.1) 4 This decomposition allows us to interpret these positive weights as a probability. After decomposing the partition function into positive weights we need to apply a suitable sam- pling technique to sample the conguration according to these probabilities. To do so we choose Markov Chain Monte Carlo(MCMC) to sample in high dimensional conguration. Metropolis algorithm is one of the famous algorithms that is used in implementing this MCMC technique. If we can sample congurations according to their weight then we can nd the expectation value dened by hOi = Tr(Oe H ) e H (2.2) In a classical system, the decomposition of partition function into positive weight comes free of cost as the classical Hamiltonian has only diagonal terms. For instance, if we consider the classical Ising model of the form H =J X hi;ji Z i Z j B X i Z i (2.3) where J is the coupling strength between two neighboring spins and B is the external mag- netic eld. Then we can write the partition function as Z =Tre H = X s 1 =1 X s 2 =1 ::: X s N =1 e J P hi;ji s i s j +B P i s i (2.4) However in the quantum system, one needs to employ a mapping of the quantum partition function to an eective classical statistical problem before implementing the Metropolis algorithm. There are two popular approaches to achieve such mapping. One approach is 5 called Trotter-Suzuki decomposition. In this one split the Hamiltonian into two commuting terms i.e H =H 1 +H 2 and decompose the partition function as Z = Tr[e H ] =Tr[(e H ) m ] =Tr[(e H 1 e H 2 ) m ] +O( 2 ) (2.5) = X fi 1 ;:::i 2m g hi 1 je H 1 ji 2 i:::hi 2m1 je H 1 ji 2m ihi 2m je H 2 ji 1 i = X w (w) (2.6) Another approach is two to split the Hamiltonian as a sum of local bond operator H = P N b b=1 H b and Taylor expand the partition function. Z = Tr[e H ] = X z 1 X n=0 () n n! hzjH n jzi (2.7) = X fzg X fH b g 1 X n=0 () n n! hz 1 jH i 1 jz 2 ihz 2 jH i 2 jz 3 i:::hz n1 jH in jz n i (2.8) This approach is pioneered by Sandvik and is famously called Stochastic Series Expansion (SSE). However, these two have some drawbacks. Discrete-time quantum Monte Carlo which uses the rst approach has Trotter type errors that occur from an insucient discretization of the imaginary time dimension. In the SSE approach, one needs to truncate the Taylor series which could also lead to error if the expansion order is not suciently large. Since expansion order depends on beta one need to change the truncation order for dierent beta values. Moreover using existing QMC algorithms it is harder to eciently simulate systems that exhibit the full range of behavior from being 'fully-classical' to 'fully-quantum'. Also while QMC algorithms have been adapted to the simulation of a wide variety of physical systems, dierent models typically require the development of distinct model-specic update rules and measurement schemes. A notable recent example is the transverse-eld Ising model (TFIM), which traditionally includes only single-body X terms (the transverse eld), supplemented with two-bodyX terms. While the updates associated with single-body X terms can be implemented by local (in space) updates, the inherently non-local nature of the two-body X terms requires novel cluster updates [84]. Thus, a proper treatment of 6 the Hamiltonian with single-body and two-bodyX terms requires updates that are dierent than if the Hamiltonian included only single-body or only two-body X terms. In this chapter, we provide a QMC scheme that has the exibility to simulate a broad range of quantum many-body models. The technique we propose here builds on a power- series expansion of the canonical quantum partition function about the classical partition function rst introduced in Refs. [5, 54]). While strongly inspired by earlier methods that expand the partition function in powers of the inverse-temperature [49, 50, 102, 99], our expansion is more accurately described as an expansion in o-diagonal operators of the Hamiltonian. Our QMC algorithm is free from Trotter type error and does not require us to set the expansion order parameter in advance as opposed to the SSE algorithm where one needs to set the Taylor expansion order before starting the simulation. Another feature of our QMC algorithm that distinguishes it from SSE is that it bundles innitely many SSE weights into a single weight. We demonstrated that this translates to orders-of-magnitude runtime advantages in most cases. Our formalism enables a very general treatment of Hamiltonians, allowing us to develop a QMC scheme that is applicable to a wide variety of models, ranging from highly interacting models with multi-body terms to non-interacting ones and from strongly quantum models to purely classical ones, using the same updating formalism. In order to demonstrate the advantage of the new method, we study the TFIM with random two-body longitudinal inter- actions, comparing the performance of our technique against that of stochastic series expan- sion QMC [100]. In addition, we study in detail a variant of the model with added random XX interactions. This model poses a challenge for traditional QMC approaches [12, 84] due to the random connectivity of the Ising couplings andXX interactions, which vary between instances. While we focus most of our attention in what follows on nite-dimensional Hamil- tonians, the technique we present here should apply with equal rigor to innite-dimensional systems. 7 2.2 The permutation matrix representation We consider many-body systems whose Hamiltonians we cast as the sum H = M X j=0 ~ P j = M X j=0 D j P j =H c + M X j=1 D j P j ; (2.9) wheref ~ P j g is a set of M + 1 distinct generalized permutation matrices [66], i.e., matrices with precisely one nonzero element in each row and each column (this condition can be relaxed to allow for rows and columns with only zero elements). Each operator ~ P j can be written, without loss of generality, as ~ P j =D j P j whereD j is a diagonal matrix 1 andP j is a permutation matrix with no xed points (equivalently, no nonzero diagonal elements) except for the identity matrix P 0 = 1. We will refer to the basis in which the operatorsfD j g are diagonal as the computational basis and denote its states byfjzig. We will call the diagonal matrixD 0 the `classical Hamiltonian' and will sometimes denote it byH c . The permutation matrices appearing in H will be treated as a subset of a permutation group, wherein P 0 is the identity element. ThefD j P j g o-diagonal operators (in the computational basis) give the system its `quan- tum dimension'. Each term D j P j obeys D j P j jzi = d j (z 0 )jz 0 i where d j (z 0 ) is a possibly complex-valued coecient andjz 0 i 6= jzi is a basis state. While the above formulation may appear restrictive, we show in Appendix 5.5 that any nite-dimensional matrix can be written in the form of Eq. (2.9). We also note that H = P j D j P j is hermitian if and only if for every index j there is an associated index j 0 such that P j =P 1 j 0 and D j =D j 0 where the indices j and j 0 can be the same (see Appendix 5.5). This in turn implies that any Hamiltonian H can be written as H = X j R j e i j P j + e i j P 1 j ; (2.10) 1 The diagonal matrix D j will be invertible, i.e., will not contain zero elements along the diagonal, if ~ P j is a bonade generalized permutation matrix. 8 where R j ; j are real-valued diagonal matrices. In the case where a permutation matrix P j is its own inverse, the corresponding j will necessarily be the zero matrix. To further elucidate PMR, we now provide several examples. 2.2.1 Example I: A single spin-1=2 particle The Hamiltonian of a single spin-1=2 particle can most generally be written as H = 0 1 + 1 X + 2 Y + 3 Z ; (2.11) where X;Y and Z are the matrix representations of the usual Pauli operators in the basis that diagonalizes the Pauli-Z operator. In PMR, the Hamiltonian becomes H =D 0 P 0 +D 1 P 1 (2.12) with P 0 = 1, P 1 =X, D 0 =H c = 0 1 + 3 Z and D 1 = 1 1i 2 Z. 2.2.2 Example II: Two-local spin-1=2 models A general two-local n-particle spin-1=2 Hamiltonian has similarly the following form H = X i<j X K i 2f1 i ;X i ;Y i ;Z i g K j 2f1 j ;X j ;Y j ;Z j g ij;K i ;K j K i K j : (2.13) Here, the basis states are tensor products of the single spin states. We can cast the Hamilto- nian in the form of Eq. (2.9) by grouping together elements that change a given basis statejzi to the same basis statejz 0 i. For example, the termsX i ,Y i ,X i Z j ,Y i Z j are grouped together as the action of the combined term V i = ij10 X i + ij20 Y i + P j ( ij13 X i Z j + ij23 Y i Z j ) can be written as D i X i , where D i is a (generally complex-valued) diagonal matrix. 9 This approach can be straightforwardly generalized to three- and higher-local Hamilto- nians. The permutation matrices P i for general spin-1/2 Hamiltonians are by extension f1;X i ;:::;X i X j ;:::;X i X j X k ;:::g. 2.2.3 Example III: Spin-one particles (qutrits) The permutation matrix representation generalizes straightforwardly to higher dimensional systems. The Hamiltonian for a single qutrit can be written as H = D 0 P 0 +D 1 P 1 +D 2 P 2 where P 0 = 1 = 2 6 6 6 4 1 0 0 0 1 0 0 0 1 3 7 7 7 5 ;P 1 = 2 6 6 6 4 0 0 1 1 0 0 0 1 0 3 7 7 7 5 ;P 2 = 2 6 6 6 4 0 1 0 0 0 1 1 0 0 3 7 7 7 5 ; and D 2 =D 1 , a condition imposed by the hermiticity of the Hamiltonian. 2.2.4 Example IV: The Bose-Hubbard model Another model that can just as easily be represented in permutation matrix form is the Bose- Hubbard model. This discretely innite dimensional model captures the physics of interact- ing spinless bosons on a lattice [79] and is commonly used to describe super uid-insulator transitions [38], bosonic atoms in an optical lattice [65] and certain magnetic insulators [40]. The Bose-Hubbard Hamiltonian is given by H =t X hi;ji ^ b y i ^ b j + U 2 X i ^ n i (^ n i 1) X i ^ n i : (2.14) Here,hi;ji denotes summation over all neighboring lattice sites i and j, while ^ b y i and ^ b i are regular bosonic creation and annihilation operators such that ^ n i = ^ b y i ^ b i gives the number of particles at the i-th site. The model is parametrized by the hopping amplitude t and the on-site interaction U. In the bosonic number basis where states are described by the number of bosons in each sitejn 1 i:::jn L i (here L is the number of lattice sites) we identify the diagonal part to 10 be D 0 = U 2 P i ^ n i (^ n i 1) P i ^ n i and the o-diagonal (innite dimensional) permutation operators as P hi;ji = ^ b y i ^ b j . The diagonal operators associated with P hi;ji are D hi;ji whose entries aret for states whose n j is positive (the j-th boson can be annihilated) and zero otherwise. 2.3 O-diagonal partition function expansion We are now in a position to discuss the o-diagonal series expansion of the partition function Z = Tr e H as it applies to Hamiltonians cast in the form given in Eq. (2.9). We begin by replacing the trace operation Tr[] with the explicit sum P z hzjjzi and then expanding the exponent in the partition function in a Taylor series in : Z = X z 1 X n=0 n n! hzj(H) n jzi (2.15) = X z 1 X n=0 n n! hzj H c X j=1 D j P j ! n jzi = X z 1 X n=0 X fS in g n n! hzjS in jzi: In the last step we have expressed (H) n in terms of all sequences of length n composed of products of basic operators H c and D j P j , which we have denoted by the setfS in g. Here i n = (i 1 ;i 2 ;:::;i n ) is a set of indices, each of which runs from 0 toM, that denotes which of the M + 1 operators in H appear in S in . We proceed by stripping away all the diagonal Hamiltonian terms from the sequence hzjS in jzi. We do so by evaluating the action of these terms on the relevant basis states, leaving only the o-diagonal operators unevaluated inside the sequence (see Refs. [5, 54] for a more detailed derivation). 11 The partition function may then be written as Z = X z 1 X q=0 X fSqg q Y j=1 d (i j ) z j ! hzjS iq jzi 1 X n=q n (1) n n! X P k i =nq (E z 0 ) k 0 ::: (E zq ) kq 1 A ; (2.16) where E z i =hz i jH c jz i i andfS iq g denotes the set of all products of length q of `bare' o- diagonal operators P j . Also d (i j ) z j =hz j jD i j jz j i; (2.17) which can be considered as the `hopping strength' of P i j with respect tojz j i. Note that while the partition function is positive and real-valued, the d (i j ) z j elements do not necessarily have to be so. The term in parentheses in Eq. (2.16) sums over the diagonal contribution of allhzjS in jzi terms that correspond to the samehzjS iq jzi term. The variousfjz i ig states are the states obtained from the action of the ordered P j operators in the product S iq onjz 0 i, then on jz 1 i, and so forth. For example, for S iq = P iq :::P i 2 P i 1 , we obtainjz 0 i =jzi;P i 1 jz 0 i = jz 1 i;P i 2 jz 1 i =jz 2 i, etc. The proper indexing of the statesjz j i along the path isjz (i 1 ;i 2 ;:::;i j ) i to indicate that the state at thej-th step depends on allP i 1 :::P i j . We will use the shorthand jz j i. The sequence of basis statesfjz i ig may be viewed as a `walk' on the hypercube of basis states [5, 54, 55] (see Fig. 2.1). After a change of variables, n!n +q, we arrive at: Z = X z 1 X q=0 X fSqg hzjS iq jzi () q q Y j=1 d (i j ) z j ! 1 X n=0 () n (n +q)! X P k i =n (E z 0 ) k 0 (E zq ) kq 1 A : (2.18) 12 Noting that the variousfE z i g are the classical energies of the statesjz i i states created by the operator product S iq , the partition function is now given by: Z = X z 1 X q=0 q Y j=1 d (i j ) z j ! X fSqg hzjS iq jzi (2.19) 0 @ (1;:::;1) X fk i g=(0;:::;0) () q (q + P k i )! q Y j=0 (E z j ) k j 1 A : A feature of the above innite sum is that the term in parentheses can be further simplied to give the exponent of divided dierences of theE z i 's (a short description of divided dierences and an accompanying proof of the above assertion can be found in Ref. [5]); it can therefore be succinctly rewritten as: X fk i g () q (q + P k i )! q Y j=0 (E z j ) k j =e [Ez 0 ;:::;Ezq ] ; (2.20) where [E z 0 ;:::;E zq ] is a multiset of energies and where a functionF [] of a multiset of input values is dened by F [E z 0 ;:::;E zq ] q X j=0 F (E z j ) Q k6=j (E z j E z k ) (2.21) and is called the divided dierences [109, 30] of the function F [] with respect to the list of real-valued input variables [E z 0 ;:::;E zq ]. In our case, F [] is the function F [E z 0 ;:::;E zq ] =e [Ez 0 ;:::;Ezq ] : (2.22) Therefore, the innite sum over energies in Eq. (2.19) may be simplied to Z = X z 1 X q=0 X fSqg hzjS iq jziD (z;S iq ) e [Ez 0 ;:::;Ezq ] ; (2.23) 13 where we have denoted D (z;S iq ) = q Y j=1 d (i j ) z j : (2.24) We stress that the partition function expanded as such is not an expansion in . Similar to the techniques introduced by Handcomb in the 1960s [49, 50] and further developed in the stoquastic series expansion (SSE) scheme pioneered by Sandvik [102, 99], the o-diagonal series expansion begins with a Taylor series expansion of the exponential function in the inverse temperature , but the regrouping of terms into the exponent of divided-dierences means that it is no longer a high-temperature expansion. SSE writes the trace of products of the Hamiltonian as a sum of products of matrix elements written in a suitably chosen basis which are then sampled (thereby overcoming the limitation of Handscomb's scheme which required the evaluation of traces of products of the Hamiltonian). This is made possible by breaking up the Hamiltonian into a sum of local bonds. In PMR, this is not the case | all terms may remain non-local but are grouped according to their action on basis states. ⟩ | $ = ( ) ($) ( , (-) ( . (/) 12[4 5 6 ,4 5 ) ,4 5 , ,4 5 6 ] ⟩ | 9 ⟩ | - $ - / Figure 2.1: Diagrammatic representation of a generalized Boltzmann weight, or a GBW, calculated from the classical energiesE z j of the classical statesjz j i, which form a closed walk on the hypercube of basis states. The walk is determined by the action of the permutation operators of the conguration, represented by S iq = P 3 P 2 P 1 , on the initial basis statejz 0 i. The walk closes if and only if the sequence of permutation operators evaluates to the identity operation. 14 Specially, the diagonal portion of the Hamiltonian remains `intact' which then allows us to regroup a large portion of all terms. A single divided-dierence term thus corresponds to a sum of an innite number of SSE terms, meaning that a single PMR conguration represents very many standard SSE congurations and a single PMR weight sums up very many standard SSE weights. This can be immediately seen in the derivation above, par- ticularly Eq. (2.16), which relates the standard SSE weight, which involves sequences of diagonal as well as o-diagonal bonds, to the weights of the current approach that only involve o-diagonal bonds. It is worth noting that the cost of the massive grouping of the o-diagonal series expansion is manifested in the computational cost associated with calculating PMR terms (or ratios thereof). As we discuss in Sec. 2.5 in more detail, these can be calculated (or more precisely, updated) using O(q) basic arithmetic operations (where q is the number of operators in a sequence or equivalently the size of the imaginary-time dimension), as opposed to SSE's O(1). We refer the reader to Ref. [5] for a more detailed comparison between the o- diagonal expansion and SSE. In Sec. 2.9.1 we provide a runtime comparison of PMR vs SSE for simulations of TFIM instances. Aside from SSE, it is also interesting to observe that the exponent of divided dierences also has close relations to continuous-time QMC (e.g., Ref. [94]), via the Hermite-Genocchi formula [30]: e [E 0 ;:::;Eq ] = Z dt 0 dt q e (E 0 t 0 +E 1 t 1 +:::+Eqtq ) ; (2.25) where t i 0 and the area of integration is bounded by t 0 +t 1 +::: +t q from above. Having derived the expansion Eq. (2.23) for any Hamiltonian cast in the form Eq. (2.9), we are now in a position to interpret the partition function expansion as a sum of weights, 15 i.e., Z = P fCg W C , where the set of congurationsfCg is all the distinct pairsfjzi;S iq g. Because of the form of W C , W C =D (z;S iq ) e [Ez 0 ;:::;Ezq ] ; (2.26) we refer to it as a `generalized Boltzmann weight' (or, a GBW). It can be shown [5] that (1) q e [Ez 0 ;:::;Ezq ] is strictly positive. Another feature of divided dierences is that they are invariant under rearrangement of the input values. We note that as written, the weights W C are complex-valued, despite the partition func- tion being real (and positive). Since for every congurationC =fjzi;S iq g there is a conjugate conguration C =fjzi;S y iq g 2 that produces the conjugate weight W C = W C , the imaginary contributions cancel out. Expressed dierently, the imaginary portions of complex-valued weights do not contribute to the partition function and may be disregarded altogether. We may therefore redene D (z;S iq ) = Re h Q q j=1 d (i j ) z j i , obtaining strictly real-valued weights. Before we move on, we note thathzjS iq jzi evaluates either to 1 or to zero. Moreover, since the permutation matrices with the exception of P 0 have no xed points, the condition hzjS iq jzi = 1 implies S iq = 1, i.e., S iq must evaluate to the identity element P 0 (note that the identity element does not appear in the sequencesS iq ). The expansion can thus be more succinctly rewritten as Z = X z X S iq =1 D (z;S iq ) e [Ez 0 ;:::;Ezq ] : (2.27) 2.4 Non-stoquasticity and emergence of the sign prob- lem An attractive property of the formalism introduced above is that it allows us to identify the emergence of the sign problem in QMC via inspection of the weights W C , thereby making more apparent the connection between the notion of non-stoquasticity | the existence of 2 For S iq =P iq :::P i2 P i1 , the conjugate sequence is simply S y iq =P 1 i1 P 1 i2 :::P 1 iq . 16 positive or complex-valued o-diagonal Hamiltonian matrix entries | which has garnered increasing attention with the advent of quantum computers in recent years [16, 17, 83] and the onset of the sign problem. To interpret the real-valued weight terms W C as actual weights (equivalently, un- normalized probabilities), they must be nonnegative. The occurrence of negative weights marks the onset of the infamous sign problem. A weight is positive i (1) q D (z;S iq ) = Re " q Y j=1 (d (i j ) z j ) # is positive, that is, a QMC algorithm will encounter a sign problem, equivalently a negative weight, during a simulation if and only if there exists a closed walk on the hypercube of basis states along which Re h Q q j=1 (d (i j ) z j ) i < 0. It is thus clear that it is not mere non- stoquasticity (equivalently, the sign of o-diagonal entries) that creates the sign problem, but rather the sign of closed walks on the hypercube of basis states that determines its occurrence. A special class of models where the sign problem does not emerge, i.e., where Re h Q q j=1 (d (i j ) z j ) i 0 for all congurations, is that of `stoquastic' Hamiltonians [16, 17] for which all d (i j ) z j are negative, which is equivalent to having only nonpositive o-diagonal elements in the matrix representation of the Hamiltonian. In this case, all products trivially yield positive-valued walks. The existence of positive o-diagonal terms does not however immediately imply a sign problem for QMC (the reader is referred to Ref. [44] for a more detailed discussion). Another example of a sign-problem-free family of models is one where alld (i j ) z j elements are positive but closed walks are all of even length. One such model is the transverse-eld Ising Hamiltonian H = X i;j J ij Z i Z j + X j h j Z j + X j X j : (2.28) 17 for > 0. A slightly less trivial example is the two-body model H = X i;j J ij Z i Z j + X hi;ji X i X j ; (2.29) provided that the underlying connectivityhi;ji of the two-bodyX terms is bi-partite (allow- ing only even cycles). It is also interesting to note that any single-qubit Hamiltonian is necessarily also sign- problem-free. In this case, the Hamiltonian is H =D 0 P 0 +D 1 P 1 as described in Sec. 2.2.1. Since here the S iq are sequences consisting of only one type of non-identity permutation matrices, namely P 1 = X, the expansion order q must be even for S iq to evaluate to the identity element. This in turn results in h Q q j=1 (d (i j ) z j ) i = ( 2 1 + 2 2 ) q=2 being strictly non- negative. The same is however not true for a single qutrit in which case a sign problem may arise. 2.5 The QMC algorithm Having derived the series expansion of the partition function for permutation-represented Hamiltonians, we are now in a position to discuss a QMC algorithm that can be associated with the above expansion. 2.5.1 QMC congurations and GBW calculation As was discussed above, a congurationC =fjzi;S iq g is a pair of a classical state and a product S iq of permutation operators that must evaluate to the identity element P 0 = 1. To take full advantage of our partition function decomposition above, we treat the o- diagonal permutation termsfP j g in our QMC algorithm as elements in a permutation group G (with matrix product as the group operation). Since the elementsfP j g appearing in the Hamiltonian may not form a complete group, we shall treat any additional element 18 P j 0 required to complete the set to form a group as appearing in the Hamiltonian with an associated diagonal matrix D j 0 = 0 [see Eq. (2.9)]. The pairC induces a list of statesfjz 0 i =jzi;jz 1 i;:::;jz q i =jzig, which in turn also generates a corresponding multiset of diagonal energies E C =fE z 0 ;E z 1 ;:::;E zq g of not- necessarily-distinct values (recall thatE z i =hz i jH c jz i i). For systems with discrete energy val- ues, the multiset can be stored eciently in a `multiplicity table'M C =fm 0 ;m 1 ;:::;m j ;:::g, where m j is the multiplicity of the energy E z j in the multiset. Given the multisetE C , the evaluation of the GBWW C follows from its denition as a func- tion of divided dierences (the reader is referred to Ref. [5] for a more detailed description). The calculation of a GBW consisting ofq permutation operators requires the evaluation of a divided-dierences exponential with (q + 1) energies. This calculation can be accomplished with at most O(q) operations [110, 42]. 2.5.2 Initial state At this point we can consider a QMC algorithm based on the partition function expansion generating the weights W C , Eq. (2.26). The Markov process would start with the initial congurationC 0 =fjzi;S 0 = 1g wherejzi is a randomly generated initial classical state. The weight of this initial conguration is W C 0 = e [Ez ] = e Ez ; (2.30) i.e., the classical Boltzmann weight of the initial random statejzi. 2.5.3 Updates We next describe the basic update moves for the algorithm. These are also succinctly summarized in Fig. 2.2. 19 ! ! ! " ! # … | ⟩ ⟨| ! ! ! " ! # … | ⟩ ′ ⟨′| ! ! ! # … | ⟩ ⟨| … ! $%# ! ! ! # … | ⟩ ⟨| … ! $ (a) (b) (c) (d) ! ! ! $%# ! # … | ⟩ ⟨| … ! $ ! $ ! # ! $%# … | ⟩ ′ ⟨′| … ! ! ′′′ ′ ! ! ! $%# ! # … | ⟩ ⟨| … ! $ ’ ! ! ! # … | ⟩ ⟨| … " ! $ ! $%# ! $%" ’’ ′′′′ ! $%" ! $ Figure 2.2: Basic update moves of the QMC algorithm. (a) Classical moves (e.g., a single bit ip), whereby only the initial state z is changed to z 0 leaving S iq unchanged. (b) Cyclic rotation, whereby two adjacent sequences of group elements (in this caseP i k andP i k+1 P i k+2 ) whose product is the identity operation are interchanged, changing their internal classical states. (c) Block swap, whereby two partitions of the sequence S iq are interchanged. This also changes the initial state fromz toz 0 as well as the ordering ofS iq . (d) Cycle completion, whereby a sub-sequence of operators is replaced by an equivalent one (in this case, P i k P i k+1 is replaced by ~ P i k . This is the only update where the number of group element (equivalently, the expansion order of the conguration) may change. Classical moves Classical moves are any moves that involve a manipulation of the classical statejzi while leaving S iq unchanged [see Fig. 2.2(a)]. In a single bit- ip classical move, a spin from the classical bit-string statejzi ofC is picked randomly and is ipped, generating a statejz 0 i and hence a new congurationC 0 . Calculating the weight ofC 0 requires recalculating the energies associated with the productS iq leading to a new energy multisetE C 0 and can become computationally intensive if q is large. Classical moves should therefore be attempted with low probabilities ifq is large. Simply enough, the acceptance probability for a classical move is p = min 1; W C 0 W C = min 1; e [E C 0] e [E C ] ; (2.31) where e [E C ] is a shorthand for e [Ez 0 ;Ez 1 ;:::;Ezq ] of congurationC and likewise forC 0 . 20 In the absence of a quantum part to the Hamiltonian (D j = 0 for allj > 0), not only are classical moves the only moves necessary, they are also the only moves that have nonzero acceptance probabilities. Since the initial conguration of the QMC algorithm is a random classical congurationjzi and an empty operator sequence S 0 = 1, for a purely classical Hamiltonian, the algorithm automatically reduces to a classical thermal algorithm keeping the size of the imaginary-time dimension at zero (q = 0) for the duration of the simulation. Cyclic rotations The `cyclic rotation' move, [Fig. 2.2(b)], consists of identifying short sub-sequences, or cycles, of consecutive operators in the sequenceS iq , whose product is the identity element, i.e., sub- sequences that obey P i j P i j+C = 1: (2.32) Depending on the nature of the operators, preparing a lookup table of short cycles that evaluate to the identity may prove useful. Once a cycle is identied, a random cycle rotation is attempted. Here, a random internal insertion point within the sub-sequence is picked and a rotation is attempted: P i j P i k P i k+1 P i j+C !P i k+1 P i j+C P i j P i k : (2.33) The rotated sequence also evaluates to the identity. Since the internal classical states between the elements in the cycle may change by the rotation, the rotation involves adding new energiesfE z 0:::g and removing old onesfE z 0:::gfE z :::g from the energy multiset. Short cycles should therefore be preferred. The acceptance probability for the move is as in Eq. (2.31) with E C 0 =E C +fE z 0:::gfE z :::g. Block-swap A block swap [Fig. 2.2(c)] is an update that involves a change of the classical state z. Here, a random position k in the product S iq is picked such that the product is split into two 21 (non-empty) sub-sequences, S iq = S 2 S 1 , with S 1 = P i k P i 1 and S 2 = P iq P i k+1 . The classical statejz 0 i at position k in the product is given by jz 0 i =S 1 jzi =P i k P i 1 jzi; (2.34) wherejzi is the classical state of the current conguration. The statejz 0 i has energy E z 0, and the statejzi has energy E z . The new block-swapped conguration isC 0 =fjz 0 i;S 1 S 2 g. The multiplicity table of this conguration diers from that of the current conguration by having one fewerE z state and one additionalE z 0 state. The weight of the new conguration is then proportional toe [E C 0] where the multisetE C 0 =E C +fE z 0gfE z g. The acceptance probability is as in Eq. (2.31) with the aforementioned E C 0. Cycle completion The moves presented so far have left the number of group elements in the sequence, or expansion order, namelyq, unchanged. The cycle completion move has the eect of changing the value of q. A lookup table of short cycles obeying P i j P i j+C = 1 (2.35) will be helpful in this case. The cycle completion move identies a sub-cycle in the sequence S iq , e.g., P i j P i k and replaces it with its complement P i k+1 P i j+C 1 =P 1 i j+C P 1 i k+1 : (2.36) Note that the inverses of permutation matrices are also permutation matrices and are there- fore also present in G. For concreteness, let us consider the case of subsequences of length two. We randomly pick a pointk2 [0;q] in the sequence. With probability 1=4, the subsequence is taken to be 22 P 0 P 0 , P 0 P i k , P i k1 P 0 , P i k1 P i k . 3 The identied subsequence is replaced by its complement, resulting in a new congurationC 0 . Because we can interpret P 1 0 = P 0 = P j P 1 j (for any arbitrary index j) and so on, the cycle completion move can grow and shrink the sequence. The acceptance probability is as in Eq. (2.31) with the new conguration. 2.5.4 Measurements Having reviewed the various update moves we next turn to discuss measurements within the algorithm. Diagonal measurements A diagonal operator obeys jzi =(z)jzi where(z) is a number that depends both on the operator and the state it acts on. SincehzjS iq jzi =(z)hzjS iq jzi, for any given conguration C = (jzi;S iq ), there is a contribution =(z) to the diagonal operator thermal averagehi. To improve statistics, one may also consider rotations in (the periodic) imaginary time. To do that, we may consider `virtual' block-swap moves (see Sec. 2.5.3) that rotate S iq and as a result also change the classical conguration fromjzi tojz i i. The contribution to the expectation value of a diagonal operator thus becomes: = 1 Z q1 X i=0 (z i )e [E C i ] : (2.37) where E C i is the energy multiset associated with congurationC i whose multiset is E C i = E C +fE z i gfE z g (recall that z 0 z, so E C 0 =E C ). The normalization factorZ above is the sum Z = q1 X j=0 e [E C j ] = X j m j e [E C j ] (2.38) 3 Note that if k = 0 or q, namely, the edges of the sequence, then two of the choices correspond to non- starters as there is an operator only to one side of the insertion point and so either P i k1 or P i k are not dened. 23 over all nonzero multiplicities m j . In the case where =H c the above expression simplies to: = 1 Z q1 X i=0 E z i e [E C i ] = 1 Z X j m j E z j e [E C j ] : (2.39) O-diagonal measurements We next consider the case of measuring the expectation value of an o-diagonal operatorP k , namely,hP k i. To do this, we interpret the instantaneous conguration as follows W C = D (z;S iq ) e [E C ] hzjS iq jzi = d iq e [E C ] e [E C 0] ! h D (z;S i q1 ) e [E C 0] hzjS i q1 P iq jzi i ; (2.40) whereC 0 is the conguration associated with the multiset E C 0 = E C fE z g. In the above form, we can reinterpret the weight W C as contributing p k = k;iq e [E C 0] d (iq ) zq e [E C ] ; (2.41) tohP k i where d (iq ) zq is the 'hopping strength' of P k Eq. (2.17). As in the case of the diagonal measurements, one can take advantage of the periodicity in the imaginary time direction to improve statistics by rotating the sequence such that any of the elements of S iq becomes the last element of the sequence (see Sec. 2.5.3), weighted accordingly by the block-swap probability. By doing so, P k becomes p k = X j k;i j d (i j ) z j e E C j P q1 j 0 =0 e [E C j 0 ] e [E C 0] e [E C j ] = 1 Z e [E C 0] X j k;i j d (i j ) z j ; (2.42) where E C i =E C +fE z i gfE z g, the sum P j is over all rotated congurationsC 0 . 24 Products of o-diagonal measurements The sampling of expectation values of the formhP k 1 P k 2 i proceeds very similarly to the single operator case except that now both operators must appear at the end of the sequence. The argument proceeds similarly to the single o-diagonal measurement, and we have that the contribution to the expectation value ofhP k 1 P k 2 i is p k 1 ;k 2 = k 1 ;iq k 2 ;i q1 e [E C 0] d (iq ) zq d (i q1 ) z q1 e [E C ] (2.43) with E C 0 = E C fE z ;E z q1 g. As in the single o-diagonal operator case, we can use the block-swap move to alter the elements at the end of the sequence, and for each pair of adjacent operators in the sequence obtain an improved contribution. By doing so,hP k 1 P k 2 i becomes P k 1 ;k 2 = X j k 1 ;i j k 2 ;i j1 d (i j ) z j d (i j1 ) z j1 e [E C j ] P q1 j 0 =0 e [E C j 0 ] e [E C 0 j ] e [E C j ] = 1 Z X j k 1 ;i j k 2 ;i j1 d (i j ) z j d (i j1 ) z j1 e [E C 0 j ] ; (2.44) where E C k =E C +fE z k gfE z g, E C 0 i =E C fE z ;E z q1 g withjz 00 i =P k 2 jz 0 i andjz 0 i is the classical state after the block swap. Similar to the single o-diagonal operator case, the sum P j is over all rotated congurationsC 0 whose S iq ends with P k 1 P k 2 . Measurements of thermal averages of products of more than two o-diagonal operators can also be derived in a straightforward manner. Improved measurements As will often happen, certain physical operators will have more than one representation as group element. E.g., if P 3 =P 1 P 2 , one could measure both the single operatorhP 3 i and the operator producthP 1 P 2 i and combine the results. 25 2.6 Advanced measurement schemes for PMR-QMC In this section, we derive measurements schemes for several classes of static, dynamic and integrated operators to demonstrate the ease which which these can be measured under PMR-QMC. For all of those, the basic idea would be to write for any given operator A its thermal average as hAi = Tr [Ae H ] Tr e H ] = P (z;i) A (z;i) W (z;i) P (z;i) W (z;i) ; (2.45) where we have used the shorthand notation P (z;i) in lieu of the triple sum P z P 1 q=0 P iq for convenience. The quantity A (z;i) is therefore the instantaneous quantity that is gathered statistics will be during the simulation. 2.6.1 Hamiltonian expectation values and functions thereof Similar to the partition function form derived in Eq. 2.23, the trace of any Taylor-expandable function of H, namely f(H), in the basisB can be written as Tr [f(H)] = X z 1 X q=0 X iq D (z;S iq ) f[E z 0 ;:::;E zq ]hzjS iq jzi; (2.46) wherefS iq g is the set of all (unevaluated) products P iq :::P i 2 P i 1 of size q, i q = (i 1 ;:::;i q ) is a multiple index where each individual index i j (with j = 1:::q) ranges from 1 to M, and the term f[E z 0 ;:::;E zq ] is the divided dierences of f over the multiset of classical energies [E z 0 ;:::E zq ] [109, 30]. The energiesfE z i =hz i jD 0 jz i ig are the classical energies of the statesjz 0 i;:::;jz q i obtained from the action of the orderedP j operators in the sequence S i onjz 0 i, then onjz 1 i, and so forth. Explicitly,jz 0 i =jzi;P i 1 jz 0 i =jz 1 i;P i 2 jz 1 i =jz 2 i, etc. The sequence of basis statesfjz i ig may be viewed as a `walk' in the hypercube of basis states [5, 54, 55]. Note thatjz j i =P i j :::P i 2 P i 1 jzi should in principle have been denoted jz (i 1 ;:::;i j ) i however we are using a simplied notation so as not to overburden the notation. To derive quantities for measurements of the formhg(H)i whereg(H) is some function of the Hamiltonian, let us consider the the Tr [f(H)], Eq. 2.46, for the productf(x) =g(x)e x . 26 For the divided dierences of products of functions, we will use the Leibniz rule, which states that for any two functions f 1 () and f 2 () and f() =f 1 ()f 2 (), we have f[E z ;:::;E zq ] = (f 1 f 2 )[E z ;:::;E zq ] = q X j=0 f 1 [E z ;:::;E z j ]f 2 [E z j ;:::;E zq ]: (2.47) Now consider the thermal average of the physical observable g(H) hg(H)i = Tr [g(H)e H ] Tr [e H ] : (2.48) We nd that hg(H)i = P (z;i) W (z;i) P q j=0 g[E z 0 ;:::;E z j ] e [Ez j ;:::;Ezq ] e [Ez 0 ;:::;Ezq ] P (z;i) W (z;i) ; (2.49) and therefore identify P q j=0 g[E z 0 ;:::;E z j ] e [Ez j ;:::;Ezq ] e [Ez 0 ;:::;Ezq ] as the quantity that is to be evaluated and collected during the simulation. In the special case where g(H) =H, we get hHi = Tr [He H ] Tr [e H ] = P (z;i) W (z;i) E z + e [Ez 1 ;:::;Ezq ] e [Ez;:::;Ezq ] P (z;i) W (z;i) : (2.50) Note that E z in the parentheses above is the diagonal part of H and the rest is the o- diagonal part. By the same token, we get hH k i = P (z;i) W (z;i) P maxfk;qg j=0 [E z 0 ;:::;E z j ] k e [Ez j ;:::;Ezq ] e [Ez 0 ;:::;Ezq ] P (z;i) W (z;i) (2.51) which follows from the fact that [E z 0 ;:::;E z j ] k evaluates to 0 forj >k, forj =k it evaluates to 1, and in general for kj: [E z 0 ;:::;E z j ] k = X a2f0;:::;ng kj with a 1 a 2 a kj Y m2a E zm : (2.52) 27 Specically: hH 2 i = P (z;i) W (z;i) E 2 z 0 + (Ez 0 +Ez 1 )e [Ez 1 ;:::;Ezq ] e [Ez 0 ;:::;Ezq ] + e [Ez 2 ;:::;Ezq ] e [Ez 0 ;:::;Ezq ] P (z;i) W (z;i) : (2.53) and hH 3 i = P (z;i) W (z;i) E 3 z 0 + (E 2 z 0 +Ez 0 Ez 1 +E 2 z 1 )e [Ez 1 ;:::;Ezq ] e [Ez 0 ;:::;Ezq ] P (z;i) W (z;i) + (2.54) P (z;i) W (z;i) E 3 z 0 + (Ez 0 +Ez 1 +Ez 2 )e [Ez 2 ;:::;Ezq ] e [Ez 0 ;:::;Ezq ] + e [Ez 3 ;:::;Ezq ] e [Ez 0 ;:::;Ezq ] P (z;i) W (z;i) : 2.6.2 Measurements of arbitrary static operators We next consider the measurement of a general static operator. We proceed by casting it in permutation matrix form, i.e., as A = P i A i P i where eachA i is diagonal in the basisB and theP i 's are permutation operators. We note that in the most general case, the P i operators in the representation of A may not all appear in the the Hamiltonian PMR decomposition. Nonetheless, we can always write hAi = Tr [Ae H ] Tr [e H ] = X i Tr [A i P i e H ] Tr [e H ] ; (2.55) and we may focus on one A i P i term at a time. We rst consider the case where the permu- tation operator P i appears in the Hamiltonian. In the section that follows we will consider the example where it is not the case. 28 `Simple' permutations Let us then calculate Tr [ ~ A ~ P e H ] where ~ A is a diagonal matrix and ~ P is a permutation operator that appears in the PMR of the Hamiltonian H. Carrying out the o-diagonal expansion we end up with: Tr [ ~ A ~ P e H ] = X z ~ A(z) 1 X q=0 X i D (z;S i ) e [Ez 0 ;:::;Ezq ] hzj ~ PS i jzi: (2.56) Note that there is a one-to-one correspondence between non-vanishing termshzj ~ PS i jzi and non-vanishing termshzjS i jzi which appear in the denominator. Let us them rewrite the above as: Tr [ ~ A ~ P e H ] = X z ~ A(z) 1 X q=0 X i D (z; ~ S i ) e [ ~ Ez 0 ;:::; ~ Ezq ] D (z;S i ) e [Ez 0 ;:::;Ezq ] D (z; ~ S i ) e [ ~ Ez 0 ;:::; ~ Ezq ] hzj ~ S i jzi: (2.57) where D (z; ~ S i ) e [Ez 0 ;:::;Ezq ] is the weight of conguration (z; ~ S i ) with ~ S i = ~ PS i . This gives: Tr [ ~ A ~ P e H ] = X z 1 X q=0 X i D (z; ~ S i ) e [ ~ Ez 0 ;:::; ~ Ezq ] M ~ A ~ P (z; i)hzj ~ S i jzi: (2.58) where M ~ A ~ P (z; i) = ~ P ~ A(z) 1 ~ d z e [Ez 1 ;:::;Ezq ] e [Ez;:::;Ezq ] ; (2.59) where ~ P = 1 is the leftmost operator of ~ S i is ~ P and is zero otherwise. Also ~ d z =hzj ~ Djzi where ~ D is the diagonal attached to ~ P in the Hamiltonian PMR. Thus we can writehAi as hAi = P (z;i) W (z;i) ( P q i=0 M A i P i (z; i)) P (z;i) W (z;i) (2.60) One could use the above formula to calculatehHi by choosing ~ A =H to recover the result forhHi from the last section. 29 The more general case We next consider the case where the operator to be measured has the formA = ~ A ~ P where ~ A is diagonal and ~ P does not appear in the Hamiltonian. Rather it can be written as a product of permutation operators that appear in the Hamiltonian, namely, ~ P =P i 1 P i 2 P i k . In this case we can write Tr [ ~ A ~ P e H ] = X z ~ A(z) 1 X q=0 X i D (z;S i ) e [Ez 0 ;:::;Ezq ] hzj(P i 1 P i 2 P i k )S i jzi: (2.61) where D (z;S i ) e [Ez 0 ;:::;Ezq ] is the weight of conguration (z;S i ). We now modify Eq 2.61 so that (z; ~ S i ) with ~ S i = ~ PS i is seen as a conguration instead of (z;S i ). Tr [ ~ A ~ P e H ] = X z ~ A(z) 1 X q=0 X i D (z;S i ) e [Ez 0 ;:::;Ezq ] D (z; ~ S i ) e [Ez 0 ;:::;Ez q+k ] D (z; ~ S i ) e [Ez 0 ;:::;Ez q+k ] (2.62) hzj(P i 1 P i 2 P i k )S i jzi: Tr [ ~ A ~ P e H ] = X z ~ A(z) 1 X q=0 X i e [Ez 0 ;:::;Ezq ] D (z; ~ P ) e [Ez 0 ;:::;Ez q+k ] D (z; ~ S i ) e [Ez 0 ;:::;E zq+k ] (2.63) hzj ~ PS i jzi: Because of the one-to-one correspondence between non-vanishing termshzj ~ PS i jzi and non-vanishing termshzjS i jzi we can rewrite it as Tr [ ~ A ~ P e H ] = X z ~ A(z) 1 X q=0 X i ~ P e [Ez k ;:::;Ezq ] D (z; ~ P ) e [Ez 0 ;:::;Ezq ] D (z; ~ S i ) e [Ez 0 ;:::;Ezq ] hzj ~ S i jzi: (2.64) 30 where ~ S i = ~ PS i and D (z; ~ P ) is the product of scalarhz i jD i l jz i i corresponding to diagonal D i l attached to P i l appearing in the product of ~ P . This giveshAi as hAi = P (z;i) W (z;i) M A i P i (z; i) P (z;i) W (z;i) (2.65) where M ~ A ~ P (z; i) = ~ P ~ A(z) 1 D (z; ~ P ) e [Ez k ;:::;Ezq ] e [Ez 0 ;:::;Ezq ] : (2.66) 2.6.3 Imaginary-time correlations and integrated susceptibilities Another important class of physical quantities are imaginary-time correlations. These are used to measure response functions and susceptibilities. Imaginary-time correlations We rst consider operators of the formhA(0)B()i where B() = e H Be H . Explicitly: e ()H jzi = X q X iq D (iq;z) e ()[Ez 0 ;Ez 1 ;:::;Ezq ] S iq jzi (2.67) Denotingjz iq i =S iq jzi, let us calculate e H jz iq i = X q 0 X i 0 q D (i 0 q 0;z iq ) e [E z 0 0 ;E z 0 1 ;:::;E z 0 q 0 ] S i 0 q 0 jz iq i; (2.68) where E z 0 0 =E z iq . Plugging in the above expressions, we end up with: hA(0)B()i = 1 Z X z A(z) X iq :S iq =1 D (iq;z) q X j=0 B(z i j )I j () ! (2.69) where I j () = e [Ez;Ez 1 ;:::;Ez i j ] e ()[Ez i j ;Ez i j+1 ;:::;Ez iq ] (2.70) 31 As a sanity check, we note that in the special case where case where B =1, we get: hA(0)i = 1 Z X z A(z) X iq :S iq =1 D (iq;z) q X j=0 e [Ez;Ez 1 ;:::;Ez i j ] e ()[Ez i j ;Ez i j+1 ;:::;Ez iq ] ! : (2.71) The expression in parentheses above can be evaluated via the Liebniz rule, Eq. (2.47) above with g 1 (x) = e x and g 2 (x) = e ()x , in which case g 1 (x)g 2 (x) = e x , we get: hA(0)i = 1 Z X z A(z) X iq :S iq =1 D (iq;z) e [Ez;Ez 1 ;:::;Ez iq ] =hAi; (2.72) as it should. Integrated correlations Let us now consider the integrated observable: R 0 dhA(0)B()i. For the integral, we use the `convolution theorem for divided dierences' which we prove in the Appendix 5.5. Z 0 e t[x 0 ;;x j ] e (t)[x j+1 ;;xq ] dt = e [x 0 ;;xq ] : (2.73) or equivalently Z 0 e t[x 0 ;;x j ] e (t)[x j+1 ;;xq ] dt =e [x 0 ;;xq ] : (2.74) For two diagonal operators A and B, Let us compute R 0 dhA(0)B()i: Z 0 dhA(0)B()i = 1 Z Z 0 d X z hzjAe H Be ()H jzi ! = 1 Z X z A(z) Z 0 dhzje H Be ()H jzi: (2.75) 32 Plugging in the expression forhA(0)B()i derived in the preceding section, Eq. (2.69), we end up getting: Z 0 dhA(0)B()i = 1 Z X z A(z) X iq q X j=0 B(z i j ) Z 0 dD (iq;z) I j () ! (2.76) which simplies to Z 0 dhA(0)B()i = 1 Z X z A(z) X iq q X j=0 B(z i j )D (iq;z) e [Ez;Ez 1 ;:::;Ez i j ;Ez i j ;:::;Ez iq ] ! ; (2.77) where one should note that E z i j appears twice in the divided-dierence input. The above can be rewritten as: Z 0 dhA(0)B()i = 1 Z X z X iq D (iq;z) e [Ez;Ez 1 ;:::;Ez iq ] (2.78) A(z) q X j=0 B(z i j ) e [Ez;Ez 1 ;:::;Ez i j ;Ez i j ;:::;Ez iq ] e [Ez;Ez 1 ;:::;Ez iq ] ! : Thus we can write Z 0 dhA(0)B()i = P (z;i) W (z;i) A(z) P q j=0 B(z i j ) e [Ez;Ez 1 ;:::;Ez i j ;Ez i j ;:::;Ez iq ] e [Ez;Ez 1 ;:::;Ez iq ] P (z;i) W (z;i) : (2.79) Similar expressions can be derived just as easily for cases where A and B are not diagonal but are rather more complicated. 2.7 A worm update for PMR-QMC In this section, we discuss a worm-type move for PMR-QMC. In Sec. 2.5.3 we discussed a number of local updates for PMR-QMC. While for many physical models the above suite of local moves are sucient for the proper exploration of the entire conguration space, 33 maintain ergodicity, for other models, local updates alone are insucient. This is especially true when a model requires a global change in conguration to explore certain parts of conguration space. A good example is topological models for which the congurations belonging to one topological sector are globally distant from the congurations of other topological sectors (we discuss an explicit example in the next section). In these cases, sequences of local moves can never generate the full exploration of conguration space and global updates are essential. To that aim, we devise a worm-type global update for PMR-QMC. The update is based on introducing a `disturbance' (or a `worm head') into the sequence of operators S iq by either appending S iq with a single operator or removing one from it (we will call this addition or removal of an operator `single operator moves'). An insertion or removal of a single permutation operator causes the sequence to evaluate to a non-identity permutation and hence corresponds to a zero-weight conguration. As a result, the disturbed sequence must be `healed' back to an identity-forming sequence. The healing process proceed by introducing further moves: either employing standard local updates such as local swap, cycle completion, and cycle rotation (see Sec. 2.5.3) or additional single operator moves. These single operator moves have the power to heal the sequence. After every such move, the instantaneous sequence can be checked to see whether it evaluates to the identity operator. If it does, the worm update will have ended and if it does not, additional moves are required. To make sure that detailed balance is conserved that further the acceptance rates of the intermediate worm moves are high, we assign non-identity intermediate congurations their `natural' weightW C (rather,jW C j) as per Eq. (2.26), and intermediate moves can be accepted or rejected with probabilities obeying detailed balance. 2.8 Numerical simulations: the toric code To showcase the eectiveness of the worm update devised in the previous section, we simulate a topological quantum model, namely, the toric code [37]. 34 (1) (2) (3) (4) (5) (6) (7) (8) ! " ! # " # ! " ! # " # ! ! " ! # " # ! " ! " ! # " ! # " ! " ! # " ! ! ! " ! # ! " ! ! " ! " " ! ! " ! " " ! # Initial configuration, evaluates to identity A `disturbance’ ( ! ) is introduced Another `disturbance’ is introduced A “local swap” operation A contraction of " # to ! A “local swap” operation A contraction of " ! to # A " `disturbance’ healing the sequence Figure 2.3: An example for a worm algorithm sequence. In this particular example the Hamiltonian generates three permutation operatorsfP 1 ;P 2 ;P 3 g. Here theP i 's are their own inverses and P 3 = P 1 P 2 . The initial QMC conguration has the operator sequence in (1) which evaluates to the identity. The toric code (see Fig. 2.4) is dened on a periodic two-dimensional lattice (a torus), usually chosen to be the square lattice, with a spin-1/2 particle located on each edge. The Hamiltonian of the toric code is given by [37]: H toric =J X v A v + X p B p ! ; (2.80) whereJ > 0 andA v = Q i2v x i andB p = Q i2p z i withi2v denoting the edges touching the vertex v i2p denoting the edges surrounding the plaquette p. In PMR, the Hamiltonian is rewritten as H toric =D 0 + X v D v P v (2.81) where D 0 = P p B p , P v = A v and D v = J1. As was discussed in the previous section, the PMR-QMC congurations are pairs (zi;S iq ). Starting from the initial conguration (jz rand i;1), wherejz rand i is a randomly chosen spin conguration and 1 denoting the empty operator sequence, the QMC simulation proceeds with local updates which includes (i) clas- sical moves such as single spin ips, (ii) cycle rotation moves such as the swapping of adjacent pairs of operators in the operator sequence P v P u ! P u P v , (iii) block swaps as described in 35 the previous section and (iv) cycle completions such as annihilation of P v P v pairs from the sequence and their creation. All the above local moves are however not ergodic. This can be seen by noticing that the annihlation/creation local updates implies that plaquette termsP v appear in even numbers. On the other hand, the product of all plaquette terms P Q v P v also evaluates to the identity and so seuqences where all plaquette terms are odd-numbered should also be accounted for. There are in fact two topolgoivcsl sectors an even one and an odd one. There must be a move that connects these two sectors. This move must be global in nature as it must change the parity of all plaquette operators. The worm update introduce in the previous sector accomplishes that. Fig.2.8 show two non trivial measurement for the 16 spin toric code and how the acceptance probability of the two sector changes with the beta values. B p A v Figure 2.4: The toric code Hamiltonian consists of four-body X terms and four-body Z terms on a periodic two-dimensional lattice (a torus). Spin 1/2 particles sit at the edges. ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ��� ��� ��� ��� ��� ��� � �� ��� ��� ��� ��� β 〈 � � 〉 (a) ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ��� ��� ��� ��� ��� ��� � � � � �� β ∫ � τ 〈∑ � � ( �)〉 〈∑ � � (τ) 〉 (b) ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ Even Sector ○ Odd Sector ��� ��� ��� ��� ��� ��� ��� ��� ��� ��� ��� ��� β 〈 ���������� ����������� 〉 (c) Figure 2.5: PMR QMC measurements for 16 spin toric code Hamiltonian 36 2.9 Results In this section we present some results that highlight some of the advantages that PMR has to oer over existing methods. Specically, we compare the performance of PMR over SSE on random 3-regular MAX2SAT instances augmented with a transverse eld. This class of instances corresponds to a particular choice of the Ising Hamiltonian given in Eq. (2.28) whereby each spin is coupled antiferromagnetically (with strengthJ ij = 1) with exactly three other spins picked at random. This class of instances is known to exhibit a quantum spin- glass phase transition and is notoriously dicult to simulate by standard QMC techniques, making it suitable to illustrate the strengths of the PMR algorithm (see Ref. [33] for more details). In a subsequent section, we demonstrate the versatility of our formalism by considering the performance of PMR on a transverse-eld Ising model augmented with random two- bodyXX connections, which to our knowledge cannot be readily implemented using existing methods [84]. 2.9.1 PMR vs SSE: Transverse-eld Ising model simulations To demonstrate the advantages of PMR over existing state-of-the-art, we study random 3-regular MAX2SAT instances augmented with a transverse eld. We study the thermal properties by utilizing a parallel tempering scheme [64, 82, 56, 33] for both PMR and SSE with 11 replicas at inverse temperatures,2f0:1; 0:2; 0:5; 1; 2; 5; 10; 20; 30; 40; 50g. We carry out simulations for 50 random MAX2SAT instances at sizes N = 96; 128 and at transverse eld strengths = 0:1; 0:4. To quantify the performance of the algorithms, we x the total number of measurements of our observables, and we vary the total number of updates between measurements. Thus, we are able to measure the dependence of the estimate of the thermal expectation value with the total time spent de-correlating measurement samples. The performance comparisons of the PMR algorithm against SSE are summarized in Figs. 2.6 and 2.7. Both gures depict the 37 thermal average of x-magnetization as a function of simulation runtime (other observables exhibit similar behavior). As is evident from the gures, PMR is four or more orders of magnitude faster than SSE, converging on average after only a few seconds in all cases. On the other hand, for a large fraction of the instances, the SSE simulations did not nish running over the 24 hour window allocated for each run. As expected, the dierence in performance is even more pronounced in the more `classical' = 0:1 case (Fig. 2.7). 10 1 10 3 10 5 0.1 0.2 0.3 10 1 10 3 10 5 Total time (sec) 10 1 10 3 10 5 Figure 2.6: Thermal average of the o-diagonal Hamiltonian as obtained by par- allel tempering simulations using PMR (blue) and SSE (red) for = 0:4 as a function of simulation time. A subset of the results are shown for = 2; 10 and 50. Each data point is the mean value over 50 random instances, with each simulation perform- ing 500 measurements of the x-magnetization-per-spin in the reported runtime (horizontal axis). The error bars correspond to the 2 condence interval generated by 10 3 bootstraps performed over the instances. 2.9.2 Ising model with random XX interactions We next illustrate the utility of our technique by studying a model that likely requires highly non-trivial implementations if studied by other QMC algorithms. We consider a transverse- eld Ising model with random XX interactions whose Hamiltonian is given by H = s X hiji Z i Z j (1s) X i X i bs(1s) X hiji X i X j : (2.82) 38 10 1 10 3 10 5 0.01 0.05 0.1 10 1 10 3 10 5 Total time (sec) 10 1 10 3 10 5 Figure 2.7: Thermal average of the o-diagonal Hamiltonian as obtained by par- allel tempering simulations using PMR (blue) and SSE (red) for = 0:1 as a function of simulation time. A subset of the results are shown for = 2; 10 and 50. Each data point is the mean value over 50 random instances, with each simulation perform- ing 500 measurements of the x-magnetization-per-spin in the reported runtime (horizontal axis). The error bars correspond to the 2 condence interval generated by 10 3 bootstraps performed over the instances. Here, s is a parameter in the range (0; 1), and b2f0; 1g determines whether the two-body X terms are absent (b = 0) or present (b = 1). We examine underlying connectivity graphs that are Erd} os{R enyi random, meaning we randomly pick a pair of spins to connect. The total number of edges for each instance is taken to be nm=2, where m2f3; 4; 5g is the average degree of the graph (we focus only on single component graphs for simplicity). Hamiltonians of the above general form appear widely in the context of quantum anneal- ing processes [67], where the system is evolved according to the above Hamiltonian by varying the parameter s slowly in time from s = 0 to s = 1. The goal in quantum annealing is for the system to reach a state at the end of the anneal that has considerable overlap with the ground state manifold of the Z-dependent `problem' Hamiltonian, which in this case is a MaxCut instance (or a random antiferromagnetic) [33]. While in standard quantum annealing the two-body X terms are normally absent (i.e., b = 0), one is often interested in understanding the eects of augmenting the Hamiltonian with a `catalyst' | an extra term that is hoped to reduce the amount of time required for the annealing process to take place (see, e.g., Refs. [31, 3, 28, 62, 90]). Setting b = 1 can be viewed as an example of such a situation. 39 For H in Eq. (2.82), we have D 0 =H c =s P hiji Z i Z j as well as one-body and two-body (in theb = 1 case) o-diagonalP j operators:fX i g[fX i X j g hiji . TheD j operators are all of the form D j =d j 1 where for the one body operators d j =(1s) and for the two-body operators d j =s(1s). That d j 0 implies that the model is stoquastic and hence sign- problem-free. Given a congurationfjzi;S q =P i 1 P i 2 :::P iq g, the QMC algorithm generated new conguration by either changing basis statejzi or subsequence of o-diagonal operator S q . For our updates, we restrict to subsequences of length two which is enough to ensure ergodicity. The possible completion moves are summarized in Table 2.1. For illustration, say P i 1 = X i X k then new congurationfjzi;S 0 q = P 0 i 0 P 0 i 1 P i 2 :::P iq g where P 0 i 0 = X i X j , P 0 i 1 = X j X k can be generated by replacing X i X k ! X i X j ;X j X k and accepting the move with a probability satisfying detailed balance condition. By inspecting the results of QMC simulations of the above Hamiltonian we are able to answer a number of questions that are relevant to quantum annealing. We rst examine the variance of H (denoted 2 H ), which when close to 0 indicates that the thermal state is close to being purely in the ground state of the system (technically, any energy eigenstate of the system will give 2 H = 0). For suciently low temperatures, this will always be the case, but the energy gap and the density of states determine how low the temperature needs to be. We therefore study the dependence of 2 H on the instances' tree-widths. This is shown in Fig. 2.8. We see that while instances with dierentm values may have the same tree-width, in the presence of XX interactions there can be a signicant dierence in their 2 H values. We make several observations. First, we nd that the Hamiltonian with XX interactions requires more sweeps in order to de-correlate, i.e. thermalize. Second, the dierences for dierent m are more substantial with XX interactions than without them, indicating that the XX interaction makes the spectrum much more susceptible to m. Third, larger m values with XX interactions tend to correspond to lower 2 H values. This suggests that in the presence ofXX interactions, largerm values are eectively `colder'. Finally, we nd that the tree-width makes little dierence to the 2 H values, with or without XX interactions. 40 Move Change in q i) 1 $ (X i ;X i ) 2 ii) 1 $ (X i X j ;X i X j ) 2 iii) X i X j $ (X i ;X j ) 1 iv) (X i ;X j ) $ (X j ;X i ) no change v) (X i X j ;X j X k ) $ X i X k 1 Table 2.1: Cycle completion moves for the transverse eld Ising model with two-body X interactions. The moves include insertions or removals of pairs of identical one-body and two-body X terms [i) and and ii)], the breaking up of a two-body term to its one- body constituents and the inverse operation [iii)], swapping [iv)] and the contraction of two operators into one [v)]. 3 4 5 6 7 Tree width 0 0.2 0.4 0.6 0.8 1 1.2 Figure 2.8: Variance of H, denoted 2 H , as a function of the tree-width of the underlying graph (here, n = 16, s = 0:5 and = 2). For m = 3; 4; 5, we have 68, 89, and 99 instances. The tree-width of each instance is identied, and each bar corresponds to the median value of H over the instances of a xed m and tree-width after 10 7 QMC sweeps. Error bars correspond to 95% condence interval calculated using a bootstrap over the instances of a xed m and tree-width. 41 Figure 2.9 shows the average diagonal energy as a function of the annealing parameter s for the b = 0 (no XX) and the b = 1 case and for two dierent values of average graph degree, namely, m = 3 and m = 5. We nd that the presence of the XX catalyst has two important consequences: it minimizes the eect of the graph degree and signicantly raises the average value. The former eect suggests that the presence of an XX catalyst will minimize performance dierences in solving random MaxCut problems with dierent graph degrees. The latter eect is not surprising since the presence of both X and XX in the Hamiltonian means that we can expect the eigenstates to remain disordered for a larger region of s. 0.3 0.4 0.5 0.6 0.7 -20 -15 -10 -5 0 Figure 2.9: Diagonal energyhH p i as a function of s with and without the XX catalyst for m = 3 and m = 5 (here, n = 16; = 2). Note that without the XX catalyst the more connected graphs (m = 5) have signicantly lower diagonal energy than the m = 3 case. 42 2.10 Summary and discussion We presented a parameter-free, Trotter-error free, universal quantum Monte Carlo scheme for the simulation of a broad range of physical models under a single unifying framework. Our technique enables the study of essentially any model on an equal footing. In our approach, the quantum dimension consists of products of elements of permutation groups, allowing us to formulate update rules and measurement schemes independently of the model being studied. We used our approach to clarify the emergence of the sign problem in the simulation of non-stoquastic physical models. In addition, studying the thermal properties of transverse- eld Ising models, we illustrated the advantages of our technique over existing state-of-the- art, specically the stochastic series expansion (SSE) algorithm. We showed that one of the features of the permutation matrix representation QMC that distinguishes it from SSE is that it bundles innitely many SSE weights into a single weight. We demonstrated that this translates to orders-of-magnitude runtime advantages in most cases. We derived measurement schemes for essentially every conceivable physical observable for the recently introduced Permutation Matrix Representation quantum Monte Carlo algo- rithm. We provided a general framework for collecting statistics for static, dynamic and integrated measurements. In addition, we derived a general universal worm-type global update move which allows for the simulation of topological quantum models that necessitate moves of the global type to `jump' between topological sectors. We further illustrated the exibility of our method by studying models with variable localities of interactions and underlying connectivities, namely, a transverse-eld Ising model augmented with randomly placed two-bodyX interactions, which to the authors' knowledge cannot be readily implemented using existing methods. For these, we found that the pres- ence of the XX interactions can mitigate dierences associated with connectivity graphs of dierent degree. We expect that our algorithm can be further improved by implementing updates that utilize more `global' moves that, e.g., update longer cycles of operators. The 43 extent to which this can translate to further runtime advantages remains an open question that we leave for future work. We believe that the generality and exibility of our algorithm will make it a useful tool in the study of physical models that have so far been inaccessible, cumbersome or too large to implement with existing techniques. 44 Chapter 3 Calculating the divided dierences of the exponential function by addition and removal of inputs This chapter is based on the work in the paper [42]. We introduce a method for calculating the divided dierences of the exponential function by means of addition and removal of items from the input list to the function. Our technique exploits a new identity related to divided dierences recently derived by F. Zivcovich [Dolomites Research Notes on Approximation 12, 28-42 (2019)]. We show that upon adding an item to or removing an item from the input list of an already evaluated exponential, the re-evaluation of the divided dierences can be done with only O(sn) oating point operations and O(sn) bytes of memory, where [z 0 ;:::;z n ] are the inputs and s/ max i;j jz i z j j. We demonstrate our algorithm's ability to deal with input lists that are orders-of-magnitude longer than the maximal capacities of the current state-of-the-art. We discuss in detail one practical application of our method: the ecient calculation of weights in the o-diagonal series expansion quantum Monte Carlo algorithm. 3.1 Introduction: Divided dierences of the exponen- tial function In the previous chapter, we saw that generalized Boltzmann Eq. (2.26) weight in our PMR QMC depends on the divided dierence of exponential function (DDEF). However, comput- ing the DDEF accurately is an expensive operation due to its numerical instability. Over 45 the years several methods were proposed to compute this quantity fastly and accurately. However, all the algorithm in the literature describes computing DDEF from the ground up. In our PMR QMC, we need to compute the ratio of weights where the inputs of the DDEF dier by one or two inputs. So we needed a method that could take advantage of the fact that we have already computed DDEF and use that information to compute the new DDEF which diers by a few inputs. In this chapter, we described an algorithm that can calculate the DDEF by addition and removal input to the already calculated DDEF. The divided dierences [88, 109, 30] of a function f() with respect to real- (or complex- )valued inputs [z 0 ;:::;z n ] is dened as f[z 0 ;:::;z n ] n X j=0 f(z j ) Q k6=j (z j z k ) : (3.1) The above expression is ill-dened if two (or more) of the inputs are repeated, in which case f[z 0 ;:::;z n ] can be properly evaluated using limits 1 . For instance, in the case where z 0 =z 1 =::: =z n =x, the denition of divided dierences reduces to: f[x;x;:::;x] = lim z i !x f[z 0 ;:::;z n ] = f (n) (x) n! ; (3.2) where f (n) () stands for the n-th derivative of f(). Divided dierences have historically been used for computing tables of logarithms and trigonometric functions, for calculating the coecients in the interpolation polynomial in the Newton form and more. A central question related to divided dierences has to do with the computational cost of accurately evaluating the divided dierence of the exponential function (DDEF) f(z) = e z as a function of number of inputs [85, 2, 110], namely, the evaluation of exp[z 0 ;:::;z n ] with increasing n. 1 As will be discussed later on, other denitions of divided dierences exist, which do not necessitate the use of limits in the case of repeated indices [92, 30]. 46 DDEFs can be calculated in a straightforward manner via the recurrence relation f[z i ;:::;z i+j ] = f[z i+1 ;:::;z i+j ]f[z i ;:::;z i+j1 ] z i+j z i ; i = 0;:::;nj; j = 1;:::;n; (3.3) with the initial conditions f[z i ] = f(z i ), i = 0;:::;q (assuming that the points are dis- tinct). In practice however, such a straightforward approach is known to produce impre- cise and numerically unstable results if simple double-precision oating point arithmetic is used [85, 58, 2, 19, 110]. To overcome these precision issues, several methods for the accurate evaluation of DDEFs have been devised over the years. These have employed an identity known now as Opitz's formula [92, 86] which gives various divided dierences in terms of functions of matrices, explicitly: f(Z n ) = 0 B B B B B B B @ f[z 0 ] f[z 0 ;z 1 ] ::: f[z 0 ;z 1 ;:::;z n ] f[z 1 ] ::: f[z 1 ;:::;z n ] . . . . . . f[z n ] 1 C C C C C C C A ; (3.4) where Z n = 0 B B B B B B B @ z 0 1 z 1 1 . . . . . . z n 1 1 C C C C C C C A (3.5) The (1;n+1)-th entry of the output matrix on the right-hand side is the desired quantity. A hybrid algorithm for the accurate computation of DDEFs that uses the above formula was suggested by McCurdy et al. [85] which prescribes the standard recurrence relations when it is safe to do so, and otherwise employs a Taylor series expansion of the exponential in conjunction with Opitz's theorem and repeated matrix scaling-and-squaring [59]. 47 Observing that only a single entry of the matrix Eq. (3.4) is actually needed to obtain the DDEF, Opitz's formula was later used to estimate DDEFs via the calculation of the action of the matrix exponential on a given vector [2, 58, 60, 45] employing O(n) matrix- vector products, orO(n 3 ) oating-point operations in the general case. However, while these general-purpose routines for the computation of the matrix exponential are accurate in terms of the matrix norm, they do not ensure high accuracy for individual matrix elements (which is needed for the accurate evaluation of divided dierences). Improving upon the above, in a recent work by F. Zivcovich [110], a new identity related to divided dierences was derived which allows for a faster and more accurate computation of the vector of DDEFs f = (exp[z 0 ]; exp[z 0 ;z 1 ];:::; exp[z 0 ;:::;z n ]) T ; (3.6) requiring O(sn 2 ) oating point operations and O(n 2 ) bytes of memory, where s = d 1 3:5 max i;j jz i z j je. The divided dierence computation technique worked out by Zivcovich is based on change of basis of a polynomial interpolating a function on some data points. Given a function f taking values f(z 0 );f(z 1 );:::;f(z n ) at z 0 ;z 1 ;:::;z n respectively, there exists a unique polynomial p(z) that interpolates these points. In monomial basis the polynomial p(z) have the form p(z) = n X k=0 a k z k (3.7) while in the Newton basis the polynomial p(z) have the form p(z) = n X k=0 f[z 0 ;z 1 ;:::;z k ]n k (z) (3.8) where Newton polynomial basis is dened as n k (z) = Q k1 j=0 (zz j ) and n 0 (z) = 1. In Newton basis the coecient are the divided dierences. Let a(n) =fa 0 ;a 1 ;:::;a n g and d(n) =f[z 0 ];f[z 0 ;z 1 ];:::;f[z 0 ;:::;z n ]. There is a matrix H(z 0 ;z 1 ;:::;z n ) which transform coecients a(n) to d(n) and matrix E(z 0 ;z 1 ;:::;z n ) which transform coecients d(n) to 48 a(n) [110]. The Zivcovich paper [110] discuss in detail about the properties of these matrices. We also can extend this to function of matrices. Given an analytic function f and matrix A, we can write f(A) using Taylor series expansion as f(A) = 1 X k=0 f (k) (0) k! A k (3.9) If f is dened on the eigenvalues z 0 ;z 1 ;:::;z n of A, then we can write f(A) using Newton series expansion [86] as f(A) = n X k=0 f[z 0 ;z 1 ;:::;z k ]n k (A) (3.10) where n k (A) = Q k1 j=0 (Az j I) and n 0 (A) =I. Because of the Opitz formula [92], it turned out matrixZ n is the perfect choice of matrix A. MatrixZ n allows us to eciently apply scaling technique which is essential when dealing with large values ofz 0 ;z 1 ;:::;z n . Since the eigenvalues ofZ n arez 0 ;z 1 ;:::;z n , we can write f(Z) in Newton series representation Eq.3.10. We will focus on case whenf is an exponential function. In Taylor expansion we can write exp(Z n ) as exp(Z n ) = 1 X k=0 1 k! Z k n (3.11) Since there are n+1 terms in the Newton series expansion of f(Z n ) while we have innite terms in the Taylor series expansion. Truncating the Taylor series to n+1 terms and using H(z 0 ;z 1 ;:::;z n ) matrix to transform the Taylor coecient to Newton coecient will not give accurate results. To cope with the precision issue Zivcovich algorithm add additional points z n+1 ;:::;z N to recover the rst n divided dierences accurately. In this chapter, we proposed an algorithm, based on Zivcovich's identity, for a fast and accurate calculation of DDEFs by means of sequential addition or removal of items from the input list to the function, which allows us to update the DDEF with changes to the input list. With each addition or removal, the recalculation of the DDEF is performed using only O(sn) oating point operations and O(sn) bytes of memory. As we show, our method 49 enables the calculation of DDEFs of essentially unlimited input sizes without the need to resort to resource-demanding and time-consuming arbitrary-precision arithmetic. Our primary motivation for devising the algorithm is the need for such a routine in the PMR QMC [41] algorithm. The algorithm we developed helped us in computing the ratio of generalized Boltzmann weight in PMR QMC in an ecient manner. 3.2 Calculating DDEFs by addition and removal of inputs 3.2.1 Zivcovich's algorithm for evaluating DDEFs We begin our discussion describing Zivcovich's algorithm [110] for computing the vector f, Eq. (3.6), introducing along the way several modications which allow us (among other things) to obtain Opitz's matrix of divided dierences column by column, rather than row by row as in the original algorithm. The main building block of Zivcovich's algorithm is obtaining the vector h (j) = exp h z j s i ; exp h z j s ; z j1 s i s 1 ;:::; exp h z j s ;:::; z jN s i s N T (3.12) from the vector h (j1) = exp h z j1 s i ; exp h z j1 s ; z j2 s i s 1 ;:::; exp h z j1 s ;:::; z j1N s i s N T (3.13) using only O(n) operations, with the initial condition h (1) = 1; (1!s) 1 ; (2!s 2 ) 1 ;:::; (N!s N ) 1 T . Here, the vectors h (j) have length N n + 30 (z k = 0 for every k < 0). Zivcovich observed that the matrix operation that takes h (j1) to h (j) can be written as H 1 (z j z j1 )H 2 (z j z j2 ):::H N (z j z jN ); (3.14) 50 whereH i (z) is the (N +1)(N +1) identity matrix, with the valuez added at the (i;i+1)-th position, i.e., H i (z) = 0 B B B B B B B @ I (i 1) 1 z 1 I (Ni) 1 C C C C C C C A ; with I(i) denoting the ii identity matrix. Each operator H i (z) can be applied with only O(1) operations. Applying the transformation Eq. (3.14) successively forj = 0; 1;:::;n, one obtains the matrix G =RF (z 0 =s;z 1 =s;:::;z n =s)R 1 , where F (z 0 ;z 1 ;:::;z n ) = 0 B B B B B B B @ exp[z 0 ] exp[z 0 ;z 1 ] ::: exp[z 0 ;z 1 ;:::;z n ] exp[z 1 ] ::: exp[z 1 ;:::;z n ] . . . . . . exp[z n ] 1 C C C C C C C A (3.15) is the divided dierences matrix and R = diagf1;s;:::;s n g. The above procedure gives the columns of G successively from left to right. The matrix F is obtained from G via the relation F (z 0 ;z 1 ;:::;z n ) =RF (z 0 =s;z 1 =s;:::;z n =s) s R 1 =G (z 0 =s;z 1 =s;:::;z n =s) s ; (3.16) which follows from Opitz's formula [85] and gives the rst row ofF (z 0 ;z 1 ;:::;z n ) as f T = g (s) , where g (1) is the rst row of the matrix G, and g (i+1) = g (i) G for i = 1; 2:::;s 1. The conditionsNn + 30 andsd 1 3:5 max i jz i je ensure that the rstn + 1 elements of the vector Eq. (3.12) are calculated to a machine-level accuracy, which is in turn a sucient condition for the accurate calculation of the matrixG. The above conditions follow from the fact that a double precision accuracy in the calculation of e z is attainable by truncating its Taylor series at order n = 30 provided thatjzjz = 3:5 [110, 2, 20] (in passing, we note 51 that choices other than (n = 30;z = 3:5) may exist which may even provide potentially more rapidly converging routines [58, 2]). In addition, that exp[z 0 ;z 1 ;:::;z n ] can be written as e exp[z 0 ;z 1 ;:::;z n ], allows us to choose any scaling parameter s greater than s min (;n) =d 1 3:5 max i jz i je as long as is subtracted from all inputs and the end result is multiplied by e . The shifting by may translate to a substantial reduction of the scaling parameter s whenevers min (;n)< s min (0;n). Choosing to be the mean z = 1 n+1 P n i=0 z i ensures that s =d 1 3:5 max i;j jz i z j je is a suitable choice for the scaling parameter because it is larger than s min (;n) [110]. Having reviewed Zivcovich's algorithm, we are now in a position to discuss our algorithm for computing exp [z 0 ;:::;z n1 ;z n ] given that the DDEF with [z 0 ;:::;z n1 ] has already been calcu- lated. In what follows, we shall treat the input list to the algorithm as a stack and discuss the re-evaluation of DDEF under addition of items to the stack and removal of items from it. We will further demonstrate how our algorithm solves in an ecient manner one limitation shared by all existing algorithms to date, which has to do with the precision of the calculation at large values of n and s. Our main result is that the re-evaluation associated with the addition to and removal from the input stack can be done with only O(sn) oating point operations and O(sn) bytes of memory for arbitrarily long input lists. 3.2.2 Adding an item to the input stack We rst consider the case where the DDEF vector (exp[z 0 ]; exp[z 0 ;z 1 ];:::; exp[z 0 ;:::;z n1 ]) T has already been calculated for some n > 0, and that the vector h (n1) and the rows g (i) , i = 1;:::;s, as dened above, are stored in memory. The addition of a new input item z n (assuming for the time being that the addition does not require increasing the scaling parameter s s min ) is carried out as follows. The upper triangular matrix G(z 0 =s;:::;z n =s) should contain the additional last column w (n) = (s n exp[z 0 =s;z 1 =s;:::;z n =s];s n1 exp[z 1 =s;:::;z n =s];:::;s 0 exp[z n =s]) T when compared to the already stored G(z 0 =s;:::;z n1 =s) (other additional elements are zeros at positions 52 (n + 1; 1);:::; (n + 1;n) ). All elements of the new column are contained in the vector h (n) dened in Eq. (3.12) withj =n. Applying the transformation Eq. (3.14) withj =n on the already-stored h (n1) , requires O(n) operations: h (n) =H 1 (z n z n1 )H 2 (z n z n2 ):::H N (z n z nN )h (n1) : (3.17) The row g (1) , which coincides with the rst row of the matrixG(z 0 =s;:::;z n =s), contains an additional item exp[z 0 =s;:::;z n =s]=s n as compared to the rst row ofG(z 0 =s;:::;z n1 =s). This item too can be found in h (n) . Each of the rows g (i) , i = 2;:::;s, should be appended an element g (i1) w (n) . In particular, the required last element of f T = g (s) is obtained as g (s1) w (n) . Storing the vector h (n) and the rows g (i) , i = 1;:::;s, requires only O(sn) bytes of memory. Thus, our addition routine requires only O(sn) operations and O(sn) bytes of memory. In the initial case ofn = 0, i.e., when the rst input is added to the empty list, the vector h (0) is obtained from h (1) employing Eq. (3.17), and each of the rows g (i) consists of a single element g (i) 0 = h (0) 0 i , i = 1;:::;s, where h (0) 0 = e z 0 =s as per Eq. (3.12). In addition to the above procedure, all input values can oset by a xed amount (where the nal DDEF value is multiplied by e ), in order to obtain a smaller scaling parameter s. Both values and s should be set at the initial stage and cannot be changed during the subsequent additions. Nonetheless, the scaling parameter s should not exceeds min (;n) after every addition. To ensure that, we choose initially = z 0 if z 0 is the only known value at rst. If several values z 0 ;z 1 ;:::;z k are known initially, then the optimal choice of is the mean 1 k+1 P k i=0 z i in which case a proper choice for the scaling parameter s is s =d 1 3:5 max i;j jz i z j je, which is larger than s min . Whenever the addition of a new item z n requires a larger value ofs, which happens whens min (;n) exceedss, the algorithm restarts and the matrix G is recalculated from scratch by sequentially adding all of z 0 ;:::;z n again with the updated values of and s. Importantly, we nd that in order to ensure the high accuracy of the resulting values, the valueN should remain unchanged during the updates to 53 the vector of divided dierences. Since havingNn + 30 is also important for maintaining accuracy, each timen exceedsN 30 we doubleN, reallocate memory for the vector h and for the rows g (i) , i = 1;:::;s and recalculate the vector of divided dierences. 3.2.3 Removing an item from the input stack Removing an item from the top of the input stack, an operation that is of importance in applications where DDEF inputs need to be replaced or updated, can be done with relative ease. Item removal requires applying the inverse of the transformation Eq. (3.17), which produces h (n1) from h (n) , explicitly: h (n1) =H N (z nN z n )H N1 (z nN+1 z n ):::H 1 (z n1 z n )h (n) : (3.18) Item removal also necessitates the removal of the last element from each of the rows g (i) , i = 1;:::;s including the last item of the DDEF vector f = g (s) T . We thus nd that DDEF re-evaluation following the removal of the last item z n requires only O(n) oating point operations. 3.2.4 Direct computation of the modied DDEFs with higher pre- cision A natural limitation of the above algorithms stems from the fact that DDEFs decrease rapidly with n. This follows directly from the bounds [34] e z n! exp[z 0 ;z 1 ;:::;z n ]e z ; (3.19) where z = 1 n+1 P n i=0 z i and e z = 1 n+1 P n i=0 e z i . Since exp[z 0 ;z 1 ;:::;z n ] = e z exp[z 0 z;z 1 z;:::;z n z], one may assume without loss of generality thatz = 0, in which case the value of exp[z 0 ;z 1 ;:::;z n ] scales as 1=n!. This implies that computing DDEFs for large n using standard double-precision arithmetic is not feasible. 54 We overcome the above obstacle by choosing to compute, rather than f given in Eq. (3.6), the modied vector ~ f = (exp[z 0 ]; 1! exp[z 0 ;z 1 ];:::;n! exp[z 0 ;:::;z n ]) T : (3.20) The range of the elements in ~ f is substantially narrower than that of f; The elements of the modied vector ~ f are not smaller than 1 when z = 0, and as such can be computed more precisely than elements of f. To obtain the vector ~ f, the following modications to the algorithm described in Sec. 3.2.1 are in order. Rather than transforming the vectors h (j) one must now transform the modied vectors ~ h (j) dened as: ~ h (j) i = 8 < : h (j) i j!=(ji)! for i = 0; 1;:::;j h (j) i i! for i =j + 1;:::;N: (3.21) In addition, the vector ~ h (1) should be initialized to ~ h (1) = (1;s 1 ;:::;s N ) T . Furthermore, rather than constructing the matrixG above, we construct the matrix ~ G which is related to G via ~ G ij =G ij j!=i!. The rows ~ g (i) are dened as ~ g (i) j =g (i) j j!, where i = 1; 2;:::;s and j = 0; 1;:::;n. The above rescaling also necessitates the use of the modied transformation ~ H (j) 1 (z j z j1 ) ~ H (j) 2 (z j z j2 )::: ~ H (j) N (z j z jN ) in lieu of Eq. (3.14), where ~ H (j) i (z) = 0 B B B B B B B @ I (i 1) j=(ji + 1) z=(ji + 1) 1 I (Ni) 1 C C C C C C C A ; 1ij; ~ H (j) i (z) = 0 B B B B B B B @ I (i 1) 1 z=i 1 I (Ni) 1 C C C C C C C A ; i>j: 55 Similarly, ~ H (n) i are to be used instead of H i in Eqs. (3.17) and (3.18). We note that in the special case of s = 1, only the rst row of the matrix ~ G is needed, so the values ~ h (j) i for i = 0; 1;:::;j 1 and j = 0; 1;:::;n, which may require the use of extended precision data types, are not used in the modied DDEF calculation. In this case, we nd that the input list sizes for which modied DDEFs can be accurately calculated using the modied transformations is practically unlimited. Numerical tests conrm that employing the modied relations allows us, in the s = 1 case, to obtain accurate values of the vector ~ f using standard double-precision oating point arithmetic at least up ton 10 5 . This is to be contrasted with the original formulation of Sec. 3.2.1, in which case the DDEF values create an under ow already at n 100. We demonstrate the above in Sec. 3.3. 3.2.5 Accurate computation of DDEFS for large n and s An additional limiting factor of the algorithm (shared by all other algorithms as well) is that the usage of double-precision oating point data types to evaluate DDEFs is a source of inaccuracies when n or s are too large. This stems from the (relatively) limited range of the double-precision oating point data type [10 308 ; 10 308 ]. The above limitation can however be overcome if oating-point data structures with extended-precision are used instead. Arbitrary precision data types are however expected to be too costly and will slow down the algorithm considerably, as the memory requirements as well as the runtimes of basic arithmetic operations for these grow with n ands [39]. This extra cost can nonetheless be avoided if one observes that the additional precision is not required for the mantissa of the oating point but rather only for its exponent. The range of the mantissa can be shown to be sucient by considering the relative error" of each oating point operation. After N f oating point operations, the overall accumulated error may be approximated at" p N f . For a double precision oating point we have" 10 17 and so after N f = 10 22 operations the overall error is only on the order of 10 6 . Taking the above into account, we devise a new oating-point data structure, that we refer to as ExExFloat (an abbreviation of extended exponent oating point) using which, 56 calculations with values in the extended range [10 646456992 ; 10 646456992 ] are possible. The ExExFloat consists of a mantissa, which in itself is a double-precision oating point, and an additional 32-bit integer for its exponent. Arithmetic operations such as addition, multi- plication and division with this new data type are performed using only several elementary operations on double-precision oats. 3.3 Numerical testing We next present the results of several benchmarking tests carried out to verify the compu- tational complexity of the routines devised above. We calculate execution runtimes of the addition and removal routines described in Secs. 3.2.2 and 3.2.3 on randomly generated input lists of dierent sizesn and dierent scaling parameters s. Our algorithms are coded in c++ and compiled with a g++ compiler with a -O3 optimization. Benchmarking was done on an Intel Core i5-8257U CPU running at 1:4GHz. 3.3.1 Calculation of DDEFs with extended precision versus double precision oating point data structures Here, we examine the behavior of calculation times with n and s, comparing the perfor- mance of our algorithm against implementations that do not use the improved formulation introduced in Sec. 3.2.4 or the specially devised extended precision data type discussed in Sec. 3.2.5. Figure 3.1(left) shows the scaling of the average DDEF calculation time as a function of input size for input sequences drawn from a random normal distribution with mean 0 and a standard deviation = 0:1. For these input lists s = 1, so the scaling step of the algorithm is not executed. Comparing the performance of the improved algorithm against that of the original formulation reveals that while both produce a linear scaling of calculation time with input size as expected, the original algorithm ceases to produce accurate results at around 57 n 100 due to an under ow, whereas the improved implementation allows for accurate calculations at much larger sizes. We also benchmark the performance of the algorithm on input lists whose scaling param- eters is strictly larger than 1 by using inputs sampled from a normal distribution with = 1. Here, the scaling step of the algorithm is necessary and a double-precision implementation of the algorithm becomes insucient beyond certain n and s values. Comparing the per- formance of an implementation of the addition routine with ExExFloat against the double- precision implementation, we make two observations. First, we nd that the ExExFloat implementation is only about 2:7 times slower than its double-precision counterpart (to be compared with the expected cost of arbitrary-precision arithmetic which would have resulted in orders of magnitude slowdown [39]). This is illustrated in the inset of Fig. 3.1(right), which shows the ratio of the two computation times for dierent input sizes. Second, as is evi- dent from the main panel of Fig. 3.1(right), which depicts the scaling of DDEF calculation time with input size, the improved algorithm with double-precision oats breaks down at n 1700. Without rescaling, the breakdown for s = 2 is observed at n 100. Employing ExExFloat on the other hand allows calculating DDEFs of practically unlimited input sizes, going well beyond the capabilities of the current state-of-the-art. 3.3.2 Runtimes of the addition and removal routines Next, we present the results of several benchmarking tests set out to validate our analysis of the computational complexity of the addition and removal algorithms devised in Sec. 3.2. Figure 3.2 demonstrates the linear runtime scaling of the addition routine with respect to bothn (left) ands (middle) conrming the analysis carried out in Sec. 3.2.2. In Fig. 3.2(right) we verify that the runtime of removing an item from top of the stack scales as O(n) and has a marginal dependence on s, in accordance with the analysis of Sec. 3.2.3. 58 ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ ○ double-precision △ double-precision, improved 10 10 2 10 3 10 4 10 2 10 3 10 4 10 5 10 6 SequenceLength CPUTime [ns] ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ ○ double-precision □ double-precision, improved △ ExExFloat, improved ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ 100 4000 8000 2.4 2.7 3.0 SequenceLength R 10 10 2 10 3 10 4 10 2 10 3 10 4 10 5 10 6 SequenceLength CPUTime [ns] Figure 3.1: Left: DDEF calculation times with double-precision oating points as a function of n, for randomly generated sequences with s = 1. While the standard formulation breaks down at n 100 (circles), the improved formulation allows for an accurate calculation of modied DDEFs of much larger inputs (triangles). Right: DDEF calculation times with ExExFloat (triangles) vs double-precision oats (squares) as a function ofn fors = 2. While double-precision oats cannot produce values beyondn 1700, calculations with ExExFloat are not hampered. Also shown (circles) is the breakdown at n 100 of the double-precision calculation of the implementation that does not leverage the improved algorithm formulation discussed in Sec. 3.2.4. In the inset: The ratioR of DDEF calculation time with ExExFloat to calculation time using double-precision oating point as a function of input size n. The average ratio levels o at 2:67 (dashed line). 3.4 Applicability to o-diagonal expansion quantum Monte Carlo The fast and accurate evaluation of DDEFs via addition and removal of inputs has one imme- diate application in Monte Carlo simulations of quantum many-body systems, specically in the calculation of conguration weights in o-diagonal expansion (ODE) quantum Monte Carlo [5, 41]. We discuss this next. In statistical physics, a system in thermal equilibrium is completely described by its partition function. From it, all thermal properties of the system may be extracted [95]. For quantum systems, the partition function (denoted Z here) is given byZ = Tr e H where H, the Hamiltonian of the system, is a hermitian matrix, Tr [] denotes the trace operation and is a real-valued positive parameter often referred to as the inverse-temperature. For physical systems containing a large number of interacting particles (normally above two dozen 59 ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ○ σ=1 △ σ=2 ◇ σ=3 □ σ=4 ▽ σ=5 100 2000 4000 6000 8000 0 2×10 5 4×10 5 6×10 5 8×10 5 1×10 6 SequenceLength CPUTime [ns] ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ 0 10 20 30 40 50 0 1×10 5 2×10 5 3×10 5 4×10 5 5×10 5 Scaling Parameter s CPUTime [ns] ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ △ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ○ σ=1 △ σ=2 ◇ σ=3 □ σ=4 ▽ σ=5 100 2000 4000 6000 8000 0 2×10 4 4×10 4 6×10 4 8×10 4 1×10 5 SequenceLength CPUTime [ns] Figure 3.2: Left: Linear runtime of the addition routine as a function of input size for several dierent values of s. Middle: Linear runtime of the addition routine as a function of s for a xed sequence length (n = 1000). Right: Linear runtime of the removal routine as a function of input size n for several dierent values of s. The dependence on n is linear whereas the s dependence is marginal. or so), the dimension ofH, which grows exponentially with the number of particles, prohibits the analytical or even numerical evaluation of Z, as the calculation requires exponentiating H. In such cases, one often resorts to approximation schemes, usually stochastic methods, within which Z is not evaluated exactly but is rather randomly sampled. This class of techniques is known as quantum Monte Carlo [11, 76]. Sampling the partition function of a quantum many-body system involves breaking it up to a sum of positive-valued terms Z = Tr e H = X C W C ; (3.22) where each summand W C is called the `weight' of congurationC [44]. To sample Z, QMC prescribes a Markov process in which a Markov chain of congurations is formed. Starting with a random initial conguration, subsequent congurations are visited with probabilities that are proportional to their weights. For Z to be evaluated properly, the decision of whether to hop from one congurationC A to anotherC B involves calculating the ratio of their weights, namely, W C B =W C A . Within the o-diagonal expansion quantum Monte Carlo scheme (ODE for short) [5, 41], a conguration weight is proportional to DDEF exp[z 0 ;:::;z n ] whose inputs z 0 ;:::;z n can be viewed as drawn from a probability distribution over a subset of the diagonal elements of the matrixH. The composition and size of the subset as well as the probability associated 60 with each element are determined by the properties of the Hamiltonian matrix H as well as by the inverse-temperature and so the typical input sizen and the scaling parameters that determine the computational cost of the DDEF (re-)calculation are therefore also strongly dependent on these. In general, the size of a typical input list is known to scale linearly with the norm of H and the inverse temperature (for more details, the reader is referred to Refs. [5, 41]). Consecutive congurations in the Monte Carlo Markov chain generically have input stacks that dier by O(1) elements so the input stack of any visited conguration can be obtained from that of its predecessor by the addition or removal of a nite number of elements (usually one or two). Since at any given stage of the Markov process the DDEF of the current conguration is already known, a need arises for the fast calculation of the DDEF of the new conguration given that the DDEF of the current one has already been obtained. Since the input stacks of the current and new congurations are very similar, the addition and removal of elements from the input list, followed by the re-evaluation of the DDEF, are a natural way to compute the new conguration weight. While the addition of an element can be done by simply using the addition routine, the removal of an item is slightly more involved. This is because the removal of an item from the bulk of the stack (rather than from the top) may be called for in the general case. Removal from the bulk is achieved by removing items one by one from the top of the stack and adding all but the target item back. In the context of QMC simulations, a pertinent question is therefore how many elements are typically removed and then added back to the stack when a removal of item from the bulk is called for. We address this question next. We consider the following scenario, which is typical to QMC. Let there be a probability distribution P over a discrete set of variablesS =fw 1 ;:::;w M g such that each w k has probability p(w k ) (and P M k=1 p(w k ) = 1). Next consider a stack containing n + 1 elements z 0 ;:::;z n drawn from P (i.e., z j 2S). We now pick an additional element w j from P and ask what is the average number of removals required to remove w j form the stack. 61 The probability that the rst occurrence w j is removed after r sequential removals from the top of the stack is P (r) j = p(w j )(1p(w j )) r1 . Hence, the average number of removals for w j is hri w j = X r rP (r) j = 1 p(w j ) : (3.23) Averaging over all possible items w j gives hri = X j p(w j )hri w j = X j p(w j ) p(w j ) =M; (3.24) the number of participating variables. For a nite list of lengthn+1, Eq. (3.23) is an approx- imate relation, because the summation should be performed only up tor =n+1. Moreover, only those itemsw j that are likely to be found in the list, i.e., those obeyinghri w j n+1, or p(w j ) 1=(n + 1), should be taken into account in Eq. (3.24). Assuming that P is approx- imately a normal distribution with standard deviation , the number of values satisfying this condition is 4(w) 1 log 1=2 (n + 1)=( p 2) , where w is the interspacing between adjacent values. For as long as n= is not exponentially large, the number of participating values M is proportional to the standard deviation , which in turn is proportional to s. In order to remove an element, one needs to perform r removals from the stack and r 1 additions to the stack, so the computational cost is O(rsn). Hence, average computa- tional cost is P r O(rsn)P (r) j =O(sn)=p(w j ) =O(hri w j sn). Similar to Eq. (3.24), this gives O(snM) = O(s 2 n) after averaging over all items. We thus conclude that removal from the bulk requires O(s 2 n) operations. We next present some results obtained from QMC simulations carried out to ascertain the average number of removals for a particular physical model known as the transverse-eld Ising model, which describes a system of particles (or spins) interacting magnetically on a graph (in our simulations we focus on one avor of this model, namely, 3-regular random antiferromagnets [33]). The model is of importance in condensed matter physics as well as 62 in the area of quantum information [103, 33]. For a system of N s spins, its Hamiltonian is given by H =A X hiji Z i Z j (1A) Ns X i=1 X i : (3.25) Here, A2 [0; 1] is a real parameter andhiji denotes summation over a subset of pairs of spin indices which constitutes the `interaction graph' of the model [5, 41]. The matrices X i and Z i are a standard shorthand notation for the tensor product of N s 2 2 matrices X i =1 x i 1 and Z i =1 z i 1. The Pauli matrices x i and z i are given, along with the identity, by 1 = 2 4 1 0 0 1 3 5 ; x i = 2 4 0 1 1 0 3 5 ; z i = 2 4 1 0 0 1 3 5 : (3.26) Simulating the above model with ODE QMC, we calculate the average number of removals hri from the top of the stack required within one Markov chain update, as a function of the various parameters of the model. Figure 3.3 showshri as a function of inverse-temperature for A = 0:2 (left) and A = 0:5 (right) for dierent system sizes. Interestingly, as the gure indicates, the dependence ofhri on the inverse temperature is not trivial but is nonetheless bounded across a variety of parameter changes. This behavior may be attributed to the fact that for small values of inverse-temperature the size of the stack is also small (as/n) and so is the average number of removals. At the other extreme, where is large, the size of the stack is likewise large. However in this case, the number M of distinct inputs one may nd in the stack tends to zero, owing to the fact that DDEFs with only smallz values are exponentially more likely to appear than DDEFs containing larger z inputs as the DDEFs serve as conguration weights. 63 ○ ○ ○ ○ ○ ○ ○ ○ △ △ △ △ △ △ △ △ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ □ □ □ □ □ □ □ □ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ○ N s = 8 △ N s = 16 ◇ N s = 32 □ N s = 64 ▽ N s = 128 0.01 0.02 0.05 0.1 0.2 0.5 1 2 1.5 2.0 2.5 3.0 Inverse Temperature β 〈r 〉 ○ ○ ○ ○ ○ ○ ○ ○ △ △ △ △ △ △ △ △ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ □ □ □ □ □ □ □ □ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ○ N s =8 △ N s =16 ◇ N s =32 □ N s =64 ▽ N s =128 0.01 0.02 0.05 0.1 0.2 0.5 1 2 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 Inverse Temperature β 〈r 〉 Figure 3.3: Average number of removals from the top of the stackhri in a single Monte Carlo step as a function of inverse temperature in a QMC simulation of the transverse- eld Ising model whose Hamiltonian is given in Eq. (3.25), for various system sizesN s . Left: A = 0:2. Right: A = 0:5. The average stack size n grows linearly with both N s and . Nonetheless, the average number of removals appears to be bounded. 3.5 Summary and outlook We devised an algorithm for calculating the divided dierences of the exponential function by means of addition and removal of items from the input list to the function. The addition and removal routines allows the updating of the calculation when the input list undergoes changes. The re-evaluation of the divided dierences following the addition or removal of an item from an already evaluated input list requiresO(sn) oating point operations andO(sn) bytes of memory, wheren is the input size ands is the scaling parameterd 1 3:5 max i;j jz i z j je. Taking advantage of known bounds for divided dierences along with a specially devised data structure, our algorithm is able to evaluate the divided dierences of the exponential function for input sizes that are orders of magnitude larger than the known capabilities of existing algorithms. We also discussed an immediate application of our algorithm in the context of the Monte Carlo simulations of quantum many-body systems, specically, in the o-diagonal expansion (ODE) QMC algorithm [5, 41] within which the devised technique can be used to considerably speed up the calculation and extend the range of conguration weights thereby enabling the study of physical models that have so far been inaccessible to ODE. 64 Chapter 4 Elucidating the Interplay between NonStoquasticity and the Sign Problem This chapter is based on the work in the paper [44]. The sign problem is a key challenge in computational physics, encapsulating our inability to properly understand many important quantum many-body phenomena in physics, chemistry and the material sciences. Despite its centrality, the circumstances under which the problem arises or can be resolved as well as its interplay with the related notion of `non-stoquasticity' are often not very well under- stood. In this study, we make an attempt to elucidate the circumstances under which the sign problem emerges and to clear up some of the confusion surrounding this intricate com- putational phenomenon. To that aim, we make use of the recently introduced o-diagonal series expansion quantum Monte Carlo scheme with which we analyze in detail a number of examples that capture the essence of our results. 4.1 Introduction In the last chapter, we overcame the challenge of computing the divided dierence of exponen- tial function fastly and accurately. However, the biggest challenge that no QMC algorithm including our PMR QMC algorithm has been able to overcome is the infamous 'negative sign problem' in QMC. The `negative sign problem,' or simply the `sign problem,' is the single most important unresolved challenge in quantum many-body simulations, preventing physicists, chemists and material scientists alike from a true understanding of many of the 65 most profound macroscopic quantum physical phenomena of Nature|in areas as diverse as material design and high temperature superconductivity through the physics of neutron stars to lattice quantum chromodynamics and more.[105, 83, 72, 57] The sign problem slows down quantum Monte Carlo (QMC) algorithms,[6, 61, 91] which are in many cases the only tool available to us for studying large quantum many-body systems, to the point where these schemes become practically useless. QMC algorithms allow us to evaluate thermal averages of physical observables by sampling the conguration space of the model in question via the decomposition of the partition function into a sum of eciently computable terms, which are in turn interpreted as probabilities in a Markov process.[76, 11] The sign problem emerges whenever a decomposition of the quantum partition function into a sum of non-negative terms is not known, in which case the speedups that normally accompany importance sampling are lost or considerably diminished. Despite the centrality of the sign problem to the understanding of many important phys- ical phenomena, certain aspects of the sign problem, such as the circumstances under which the problem emerges, the practical meaning of resolving the problem and its interplay with the related notion of `non-stoquasticity' that has gained traction within the physics and computer sciences communities in its own right in recent years [16, 17, 83], are often not very well understood. In an eort to resolve some of the misconceptions surrounding this important computa- tional phenomenon, in this chapter, we examined in detail the emergence of the sign problem in QMC and clarify certain aspects of it. To that aim, we make use of the recently introduced o-diagonal series expansion quantum Monte Carlo scheme [5, 54, 41], a method that builds on a power series expansion of the quantum partition function in its o-diagonal terms and is both parameter-free and Trotter error-free. We illustrate our key results by analyzing several toy examples that capture the essence of these. 66 4.1.1 Basic denitions We begin our discussion by providing a practical denition to the sign problem in the context of QMC (thereby also acknowledging the existence of other denitions [105]). This denition will then allow us to discuss in more detail the circumstances under which the problem emerges and also the practical meaning of curing it. Denition 1. An N-particle quantum many-body model given by a Hamiltonian H N will be said to possess a sign problem if there is no known decomposition of its partition function Z = Tr[e H N ] into a sum of eciently computable (in and in N) positive-valued terms. It is important to note that our denition of the sign problem has to do with the existence of a decomposition of the partition functionZ = Tr[e H ] into a sum of eciently computable positive weightsZ = P C W C (where each weightW C corresponds to a congurationC), rather than with the computational eort required to sample these terms or to accurately evaluate thermal averages of physical observables. The above denition also implies that `curing' or resolving the sign problem for a model H, i.e., nding a decomposition of its partition function into eciently computable positive weights, implies nothing about the computational eciency with which the QMC algorithm samples the conguration space. Expressed dierently, the absence of a sign problem does not guarantee ecient QMC convergence. An example that illustrates this fact best is that of classical spin glasses (or Ising models). These are classical systems. Their partition functions can be written as sums of strictly positive Boltzmann weights and as such they trivially have no sign problem (by the same token, one may also consider quantum spin glasses by augmenting the Ising model with a small transverse eld). Nonetheless these systems are known to equilibrate in the worst case in exponential time.[10] 4.1.2 Determining the severity of the sign problem The reason the sign problem is indeed a serious impediment to QMC algorithms is that the computational eciency of QMC hinges on the algorithm's ability to eciently sample the 67 congurationsC as obtained from the partition function decomposition according to their importance, or relative weight, p C = W C =Z. A necessary but not sucient condition for proper importance-based sampling (or importance sampling for short) is that all weights W C are positive (or nonnegative), i.e., that p C can be interpreted as a bona de probability distribution over congurations. When this happens, the thermal averagehAi of any physical observable A can be written as hAi = Tr Ae H Z = P C A C W C P C W C = X C A C p C : Since an explicit summation over all terms in the sum above is in general prohibitive due to the generally exponential number of terms in the sum, QMC importance sampling estimates this sum using a Monte Carlo estimator dened as h ~ Ai p 1 N s Ns X i=1 A C i ; (4.1) where the congurationsC i (of which there are N s ) are randomly sampled in proportion to their probability p C i .[8] If the number of `important congurations' is relatively small, then estimation via importance sampling will converge much faster than with straightforward unbiased sampling of the various terms (or an actual summation over all terms) and often times exponentially more so. Importance sampling is however not always possible. Whenever a subset of congurations is assigned negative weights, one cannot draw samples according to p C (as p C is no longer a probability distribution). A common workaround for the appearance of negative weights (hence the terminology `sign problem') is to draw samples from a dierent distribution, ~ p C , that is nonnegative everywhere.[98, 105] Then, the estimator of A becomes h ~ Ai ~ p 1 N s Ns X i=1 A C i p C i ~ p C i (4.2) 68 A common choice for ~ p C is ~ p C = jW C j P C 0 jW C 0j ; (4.3) i.e., weights that are proportional to the absolute values of the original weights. In the next section, we will see that the sum P C 0 jW C 0j may be viewed as the decomposition of the partition function of a related but not equivalent sign-problem-free model. With the above choice, thermal averages can be written as hAi = P C A C sgn(W C )jW C j P C sgn(W C )jW C j = hA sgn(W )i jWj hsgn(W )i jWj : The subscriptjWj is added as a reminder that the weights used in the Markov process are the absolute values of the original `true' weights. For models that do not have a sign problem, all weights are positive andhsgn(W )i jWj = 1. For sign-problematic systems, negative weights are equally dominant and we have hsgn(W )i jWj 0. In this case thermal averages of physical observables will uctuate rapidly, resulting in extremely large error bars and will require an exponentially large number of measurements.[105, 55] An appropriate gure of merit for how adverse the sign problem is for a given QMC algorithm can therefore be given by the `weighted sign'hsgn(W )i jWj (orhsgni for short) which can be explicitly written as hsgni = P C W C P C jW C j : (4.4) 69 4.1.3 Curing the sign problem Often times, even though a positive-valued decomposition is not known for the partition function of H, it may be known for that of a `rotated' model ~ H = UHU y where U is a (usually local) unitary transformation. Since ~ Z = Tr h e ~ H i = Tr h e UHU y i = Tr Ue H U y = Tr e H =Z; (4.5) the two models ~ H and H are physically equivalent via a similarity transformation. The existence of the sign problem therefore depends on the basis in which the hamiltonian H is represented in the QMC algorithm (but not only). An extreme example is the classical Ising Hamiltonian H = P i<j J ij Z i Z j with J ij 2f1; 1g which is diagonal in the computa- tional basis and thus has no sign problem (here Z i and X i denote the Pauli-z and Pauli-x operators acting on the i-th spin, respectively). In the rotated basis in which Z i $ X i the same Hamiltonian will be written asH = P i<j J ij X i X j and will possess a sign problem.[83] Thus, often times curing the sign problem boils down to nding a transformation U to the Hamiltonian such that the decomposition of the partition function of the rotated Hamilto- nian ~ H to positive-valued terms is known.[83, 73, 72, 51] It should be noted nonetheless that since in QMC some physical observables are more easily accessible than others depending on whether or not they are diagonal, curing the sign problem via rotation is not always a practical resolution as observables in the rotated frame may be inaccessible as compared to those in the original frame. Rotating the Hamiltonian is not the only way to resolve the sign problem. Another method that is sometimes applicable, is using re-summation techniques,[108, 23] wherein one groups together positive and negative QMC weights into strictly positive `grouped weights'. Here, no rotation takes place. In Ref. [55] for instance, it was shown that the sign prob- lem associated with a model of an antiferromagnetically connected triplet of spins can be analytically cured by summing weights in a judicious manner. 70 As was already noted and is worth emphasizing again, nding a decomposition of the partition function into positive-valued eciently computable terms cures the sign problem but promises nothing about the convergence time of the QMC simulation of the cured model. Expressed dierently, freeing a model from its sign problem does not guarantee the ecient equilibration of the QMC Markov process, which depends on the details of the model in question and those of the QMC algorithm itself. 4.2 O-diagonal series expansion of the quantum par- tition function As a vehicle for illustrating the main observations pertaining to the sign problem that we analyze in this work, we will be using for concreteness one particular avor of QMC, namely, the o-diagonal expansion (ODE) QMC, rst introduced in Refs. [5, 54]. As we shall illus- trate, the fact that ODE is both Trotter-error-free and parameter-free will allow us to distill certain aspects of the sign problem that are arguably less apparent when examined with other QMC schemes. We proceed with an overview of the technique. 4.2.1 ODE: an overview Within the o-diagonal expansion one considers many-body systems whose Hamiltonians we cast as the sum H = M X j=0 ~ P j = M X j=0 D j P j ; (4.6) wheref ~ P j g is a set of M + 1 distinct generalized permutation matrices [66], i.e., matrices with precisely one nonzero element in each row and each column (this condition can be relaxed to allow for zero rows and columns). Each operator ~ P j can be written, without loss of generality, as ~ P j = D j P j where D j is a diagonal matrix and P j is a permutation matrix with no xed points (equivalently, no nonzero diagonal elements) except for the identity matrix P 0 = 1. We will refer to the basis in which the operatorsfD j g are diagonal as the 71 computational basis and denote its states byfjzig. We will call the diagonal matrix D 0 the `classical Hamiltonian' and will sometimes denote it by H c . The reader is referred to Ref. [41] for additional details. ThefD j P j g o-diagonal operators (in the computational basis) give the system its `quan- tum dimension'. Each term D j P j obeys D j P j jzi = d j z 0 jz 0 i where d j z 0 is a possibly complex- valued coecient andjz 0 i6=jzi is a basis state. While the above formulation may appear restrictive, any nite-dimensional matrix can be written in the form of Eq. (4.6). Given the above formulation for the Hamiltonian, the o-diagonal series expansion of the partition function Z = Tr e H proceeds as follows. Expanding the exponential in a Taylor series in the inverse-temperature , Z can be written as a triple sum over all basis statesjzi, the expansion orderq which ranges from 0 to innity and the (unevaluated) productsS iq =P iq :::P i 2 P i 1 ofq o-diagonal operators. Here we have used the multiple index i q = (i 1 ;:::;i q ) where each individual index i j ranges from 1 to M. In this notation, the empty sequenceS i 0 corresponds to the identity operation. After some algebra, the partition function attains the form Z = X fzg 1 X q=0 X fS iq g D (z;S iq ) hzjS iq jzie [Ez 0 ;:::;Ezq ] ; wherefS iq g is the set of all (unevaluated) products P iq :::P i 2 P i 1 of size q and the term e [Ez 0 ;:::;Ezq ] is the exponent of divided dierences (see the Appendix) over the multiset of classical energies [E z 0 ;:::E zq ].[109, 30] The energiesfE z i =hz i jH c jz i ig are the classical energies of the statesjz 0 i;:::;jz q i obtained from the action of the orderedP j operators in the sequenceS iq onjz 0 i, then onjz 1 i, and so forth. Explicitly,jz 0 i =jzi;P i 1 jz 0 i =jz 1 i;P i 2 jz 1 i = jz 2 i, etc. The sequence of basis statesfjz i ig may be viewed as a `path' in the hypercube of basis states See Figure 4.1 for an illustration. (Note thatjz j i =P i j :::P i 2 P i 1 jzi should in principle have been denotedjz (i 1 ;:::;i j ) i. We are using a simplied notation so as not to overburden the notation.) 72 ⟩ | $ = ( ) * + ( + * , ( , * - /0[2 3 ) ,2 3 + ,2 3 , ,2 3 ) ] ⟩ | 6 ⟩ | 7 * + * , * - Figure 4.1: Diagrammatic representation of a generalized Boltzmann weight, or a GBW, calculated from the classical energies E z j of the classical statesjz j i, which form a closed path, or a cycle, in the hypercube of basis states. The path is determined by the action of the permutation operators of the conguration, represented byS iq =P i 3 P i 2 P i 1 , on the initial basis statejz 0 i. Cycles close if and only if the sequence of permutation operators evaluates to the identity operation. Additionally, we have denoted D (z;S iq ) = q Y j=1 d (i j ) z j ; (4.7) where d (i j ) z j =hz j jD i j jz j i; (4.8) can be considered as the `hopping strength' of P i j with respect tojz j i. Note that while the partition function is positive and real-valued, the d (i j ) z j elements do not necessarily have to be so. Having derived the expansion Eq. (4.7) for any Hamiltonian cast in the form Eq. (4.6), we are now in a position to interpret the partition function expansion as a sum of weights, i.e.,Z = P C W C , where the set of congurationsfCg is all the distinct pairsfjzi;S iq g. Since hzjS iq jzi is either zero or one, we can write W C as W C =D (z;S iq ) e [Ez 0 ;:::;Ezq ] ; (4.9) 73 we refer to it as a `generalized Boltzmann weight' (or, a GBW). It can be shown [5] that (1) q e [Ez 0 ;:::;Ezq ] is strictly positive. 4.2.2 The sign problem within ODE As mentioned above, the weightsW C will in general be complex-valued, despite the partition function being real (and positive). Since however for every congurationC =fjzi;S iq g there is a conjugate conguration C =fjzi;S y iq g 1 that produces the conjugate weight W C = W C , the imaginary contributions cancel out. Expressed dierently, the imaginary portions of complex-valued weights do not contribute to the partition function and may be disregarded altogether. We may therefore redene D (z;S iq ) = Re h Q q j=1 d (i j ) z j i , obtaining strictly real- valued weights. Before we move on, we note that sincehzjS iq jzi evaluates either to 1 or to zero, the expansion can be more succinctly rewritten as Z = X hzjS iq jzi=1 D (z;S iq ) e [Ez 0 ;:::;Ezq ] ; (4.10) i.e., as a sum over all closed paths on the hypercube of basis states.[5, 54, 55] Moreover, since the permutation matrices with the exception of P 0 have no xed points, the condition hzjS iq jzi = 1 implies S iq = 1, i.e., S iq must evaluate to the identity element P 0 (note that the identity element itself does not appear explicitly in the sequences S iq ). 4.3 Non-stoquasticity and the emergence of the sign problem An attractive property of the formalism introduced above is that it allows us to identify the emergence of the sign problem in QMC via inspection of the weightsW C , thereby making the 1 For S iq =P iq :::P i2 P i1 , the conjugate sequence is simply S y iq =P 1 i1 P 1 i2 :::P 1 iq . 74 connection between the emergence of the sign problem and the notion of non-stoquasticity, which has garnered increasing attention with the advent of quantum computers in recent years, more apparent.[16, 17, 83, 72] As already mentioned, to interpret the real-valued weight terms W C as actual weights (equivalently, un-normalized probabilities), these must be nonnegative as it is the presence of negative weights that marks the onset of the infamous sign problem. In ODE, a weight is positive i (1) q D (z;S iq ) = Re " q Y j=1 (d (i j ) z j ) # (4.11) is positive, that is, a QMC algorithm will encounter a sign problem, equivalently a negative weight, during a simulation if and only if there exists a closed path on the hypercube of basis states along which Re h Q q j=1 (d (i j ) z j ) i < 0. It should thus be clear that it is not the sign of o-diagonal entries that creates the sign problem, but rather the cumulative phase of closed paths in the hypercube of basis states that determines its occurrence. We can use the above observation to identify several sign-problem-free classes of mod- els. A special class of models where the sign problem does not emerge, i.e., where Re h Q q j=1 (d (i j ) z j ) i 0 for all congurations, is that of `stoquastic' Hamiltonians [16, 17, 83], which we dene as follows. Denition 2. A Hamiltonian will be called (explicitly) 2 stoquastic with respect to a basis B if the Hamiltonian matrix representation in that basis has only non-positive o-diagonal elements. For stoquastic Hamiltonians alld (i j ) z j are negative (or zero). In this case, all paths trivially yield positive-valued weights. 2 Other denitions of stoquatcity account for the fact that some Hamiltonians may be easily cast in stoquastic form by applying a suitable local unitary transformation ot the Hamiltonian.[16, 17, 83] 75 The existence of positive o-diagonal terms does not however imply a sign problem for QMC. An example of a sign-problem-free family of models is the transverse-eld Ising Hamil- tonian H = X i;j J ij Z i Z j + X j h j Z j + X j X j : (4.12) for > 0. A slightly less trivial example is the two-body model H = X i;j J ij Z i Z j + X hi;ji X i X j ; (4.13) provided that the underlying connectivity of the two-body X terms is bi-partite. In this case, all nonzero weights come from paths on the hypercube of basis states consisting of only even number of steps (or, even q) leading to only positive-valued weights despite the existence of o-diagonal elements with the wrong sign. We note though that both models above can be cast in explicitly stoquastic form by applying simple local rotations. The transverse-eld Ising model, Eq. (4.12) can be made stoquastic by the transformation X i !X i . For the model of Eq. (4.13) a similar rotation performed only on one of the two partitions of the lattice will ip the sign of the o-diagonal terms 3 . As discussed before, a way to deal with the existence of negative-valued weights is to sample congurations according to their absolute values. It is interesting to note that the absolute-valued weightsjW C j are precisely those belonging to the related but not-physically- equivalent `stoquasticized' modelH stoq whose diagonal operatorsD j forj > 0 are related to those of the sign-problematic one by the relation D j !jD j j: (4.14) 3 Less trivial examples for non-stoquastic yet sign-problem-free models do exist however they are somewhat less informative. In fact it can be shown that models with a sign problem that is curable by local rotations are easy to manufacture but are nonetheless generally hard to cure.[83] 76 Using the above notation, one can see thathsgni can be written as: hsgni = Tr e H Tr [e Hstoq ] = Tr Se Hstoq Tr [e Hstoq ] =hSi jWj ; where we have dened S = e H e Hstoq and the severity of the sign problem is the thermal average of S with respect to the stoquasticized model H stoq whose weights arejW C j. 4.4 Illustrative examples Having dened the sign problem in the context of ODE, we are now in a position to derive several observations that illustrate some of the properties of the sign problem. The goal here is to settle certain ambiguities that have been attached to its nature over the years. 4.4.1 A single qubit cannot have a sign problem We begin by showing that a single spin-1=2 Hamiltonian cannot possess a sign problem regardless of the basis in which it is represented. Since any two dimensional hermitian matrix can be written as a linear combination of the Pauli matrices and the identity, the Hamiltonian of a single spin-1=2 particle can most generally be written as H = 0 1 + 1 X + 2 Y + 3 Z ; (4.15) whereX;Y andZ are the matrix representations of the usual Pauli operators in the basis that diagonalizes the Pauli-z operator and 0 ;:::; 3 are real parameters. In permutation-matrix representation, the Hamiltonian becomes H =D 0 P 0 +D 1 P 1 (4.16) with P 0 = 1, P 1 =X, D 0 =H c = 0 1 + 3 Z and D 1 = 1 1i 2 Z. 77 Since in this example the S iq are sequences consisting of only one type of non-identity permutation matrices, namelyP 1 =X, the expansion orderq must be even forS iq to evaluate to the identity element. This in turn results in " q Y j=1 (d (i j ) z j ) # = ( 2 1 + 2 2 ) q=2 being strictly nonnegative. We thus nd that any single-qubit Hamiltonian is necessarily also sign-problem-free. The same is however not true for a single qutrit in which case a sign problem may arise. We discuss the qutrit case next. 4.4.2 A spin-one particle can have a sign problem We next show that unlike a spin 1=2 particle, a single qutrit, or a spin-one particle, can possess a sign problem (in a given basis). In fact, it is the lowest-dimensional model that is capable of exhibiting the problem. The most general Hamiltonian for a single qutrit can be written as H =D 0 1 +D 1 P 1 + D 2 P 2 where 1 = 2 6 6 6 4 1 0 0 0 1 0 0 0 1 3 7 7 7 5 ;P 1 = 2 6 6 6 4 0 0 1 1 0 0 0 1 0 3 7 7 7 5 ;P 2 = 2 6 6 6 4 0 1 0 0 0 1 1 0 0 3 7 7 7 5 ; and D 2 =D 1 , a condition required by hermiticity of the Hamiltonian. The operators obey P 1 P 2 = 1 and P y 1 =P 2 1 =P 2 . Here, we will consider the particular case H = e i P 1 + e i P 2 +Jdiagf0; 1; 0g: (4.17) The diagonal part of the Hamiltonian isD 0 =Jdiagf0; 1; 0g and has two zero-energy ground statesjg 1 i =f1; 0; 0g andjg 2 i =f0; 0; 1g and a single excited statejei =f0; 1; 0g with classical energy J (we will be assuming J > 0). The o-diagonal operators D 1=2 = e i 1 78 are pure phases. The action of D 1 P 1 and D 2 P 2 on the three classical states is depicted in Figure 4.2. Needless to say, for such a small system, the Hamiltonian is easily diagonalizable and may be readily represented in the diagonilizing basis for arbitrary values of andJ. In the diagonilizing basis there is of course no sign problem. ⟩ | $ &'( ⟩ | ) ⟩ | &'( &'( '( '( '( Figure 4.2: Action of D 1 P 1 (orange arrows) and D 2 P 2 (blue arrows) on the three classical basis statesjg 1 i,jg 2 i andjei. The phases accompanying the permutations are also indicated. As noted earlier, an ODE congurationC consists of a basis statejzi and a product S iq =V i 1 V i 2 V iq of o-diagonal operators. In the qutrit modeljzi2fjg 1 i;jg 2 i;jeig and V i j 2fP 1 ;P 2 g. Afjzi;S iq g pair induces a sequence of classical statesjz i i generated by the action of the o-diagonal operators on the basis statejzi. The sequence of basis states fjz i ig may be viewed as a `path' in the hypercube of basis states. For a weight to have a nonzero value, the path must be a closed one, namely,jzi =jz 0 i =jz q i. We present in Table 4.1 the shortest sequences of operators that generate nonzero weights for the qutrit model, equivalently, sequences of P 1 and P 2 that multiply to the identity. It would be very instructive to analyze the onset of the sign problem in this simple model. Let n 1 (n 2 ) be the number of P 1 (P 2 ) operators in a given sequence of operators S iq . The sequence evaluates to P n 1 +2n 2 1 as P 2 = P 2 1 . Since the product S iq must evaluate to the identity element for the weight to be nonzero and P 3 1 = 1 we nd that the condition 79 expansion order q S iq = 1 phase sign of weight 0 1 0 1 1 { { { 2 P 1 P 2 ;P 2 P 1 0 1 3 P 1 P 1 P 1 ;P 2 P 2 P 2 3 sgn(cos 3) 4 P 1 P 2 P 1 P 2 and permutations thereof 0 1 5 P 1 P 1 P 1 P 1 P 2 ;P 2 P 2 P 2 P 2 P 1 and perms. thereof 3 sgn(cos 3) Table 4.1: Operator sequences in the lowest expansion orders, the associated weights and signs. translates to (n 1 + 2n 2 ) mod 3 = 0, or n 1 + 2n 2 = 3m for some integer m. Moreover, the sign of the weight associated with W C is sgn(W C ) = sgn (ReD (z;S iq ) ) sgn(e [Ez 0 ;:::;Ezq ] ) = sgn Ree i(n 1 n 2 ) sgn(1) n 1 +n 2 = sgn Ree i((n 1 n 2 ) +(n 1 +n 2 )) : (4.18) For the model to be sign-problem free, the phase of W C must be a multiple of 2, which yields [(n 1 n 2 ) + (n 1 +n 2 )] mod 2 = 0: (4.19) Plugging n 1 = 3m 2n 2 into Eq. (4.19) gives [(1 + 3 )(mn 2 )] mod 2 = 0: (4.20) For Eq. (4.20) to hold for any m andn 2 , we must require that 1 + 3(=) is a multiple of 2. That is, 1 + 3(=) = 2k for some integer k. This in turn gives = (2k + 1)=3, meaning that in the range2 [0; 2) there are precisely three values for our qutrit model to have no sign problem. These are 2f=3;; 5=3g. We corroborate our analytical derivation by numerical simulations. In Figure 4.3(top) we examine the behavior ofhsgni as a function of phase for xed values of. As is evident from 80 the gure, the severity of the sign problem oscillates periodically between the `worst case' values atf0; 2=3; 4=3g to the sign-problem-free values in between. Figure 4.3(bottom) shows that for a xed value of phase the sign problem becomes exponentially more severe with inverse-temperature. ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● β=0.1 ● β=0.4 ● β=0.6 ● β=0.8 ● β=1.0 0 π 3 2 π 3 π 4 π 3 5 π 3 2 π 0.70 0.75 0.80 0.85 0.90 0.95 1.00 ϕ 〈sgn〉 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ϕ= π/8 ϕ=3π/8 ϕ=3π/4 ϕ=0 ϕ= π/7 ϕ= π 0.0 0.2 0.4 0.6 0.8 1.0 0.88 0.90 0.92 0.94 0.96 0.98 1.00 β 〈sgn〉 Figure 4.3: Top: The average signhsgni as a function of for various values of . The severity of the sign depends crucially on the phase . For three values of (marked by vertical lines) the model is sign-problem free. Bottom: hsgni as a function of for various values of , illustrating the fact that the problem becomes more severe with increasing . The qutrit example analyzed above demonstrates that not only is it possible for a single particle model to have sign problem, but that it is also possible to control the severity of the sign problem by tuning the phase of o-diagonal terms while keeping their magnitudes xed. The fact that the model is sign-problem-free for ? = =3; and 5=3 very clearly illustrates that non-stoquasticity, i.e., the complex-valuedness of o-diagonal Hamiltonian matrix entries do not necessarily induce a sign problem. Rather, it is the phase acquired along paths that induces it. The phase-dependent sign problem of the single qutrit model above can readily be gen- eralized to the more interesting many-body case, e.g., the Hamiltonian H = X j D j 0 + X hiji e i P i 1 P j 2 + e i P j 1 P i 2 ; 81 wherehiji denotes summation over the edges of an arbitrary interaction graph with qutrits sitting on its vertices. Using the same arguments as above, it can be shown that Hamiltonian Eq. (4.21) possesses a sign problem for certain values of the angle but not for others. 4.4.3 The sign problem and positivity of ground-state amplitudes In previous sections we have shown that stoquastic models, i.e., models whose Hamiltonians have only nonpositive o-diagonal entries, are sign-problem-free, while the converse is not necessarily true, that is, non-stoquastic Hamiltonians may or may not possess a sign problem. Another property that may be derived from stoquasticity, due to the Perron-Frobenious theorem [13], is that their ground state amplitudes all have the same sign. This raises the natural question of whether models whose ground-state amplitudes are all positive can even exhibit a sign problem (or a severe sign problem). In what follows, we provide a simple example that demonstrates that this is indeed the case, namely, that a model may have a severe sign problem despite having a ground-state whose amplitudes are all positive. To that aim, we consider the simple Hamiltonian H =(P +P 3 ) +P 2 ; (4.21) whereP is permutation matrix satisfyingP 4 = 1 butP 2 6= 1. (To simplify matters we have chosen a Hamiltonian with a vanishing diagonal term however such a term may be easily added on without modifying the end results.) The smallest matrix that satises the above relations is four-dimensional, e.g., P = 2 6 6 6 6 6 6 6 4 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 0 3 7 7 7 7 7 7 7 5 : (4.22) 82 The above Hamiltonian, Eq. (4.21), has three o-diagonal terms fP;P 2 ;P 3 g whose respective diagonal operators arefD 1 =1;D 2 =1;D 3 =1g. It has a sign problem for all > 0 (in which regime the model is non-stoquastic). This is due to sequences of odd orders (e.g., theq = 3 sequenceS iq =PP 2 P ) having negative weights, owing to the posi- tivity of. Nonetheless, inspecting the spectrum of the model we nd that it has eigenvalues (2 +;;; 2 +) with the respective eigenvectorsf1; 1; 1; 1g;f0;1; 0; 1g;f1; 0; 1; 0g andf1; 1;1; 1g which in turn implies that for < 1, the ground state isf1; 1; 1; 1g, i.e., it the has all-positive amplitudes. Thus, in the region 0<< 1, the model is sign problematic and has all-positive ground- state amplitudes. In facthsgni can be calculated analytically in this case to be hsgni = Tr e H Tr [e Hstoq ] = 2e 2(+1) +e 4 + 1 e 2 +e 2(+2) + 2e 2 : (4.23) As is depicted in Figure 4.4 and can be seen from the above expression, the sign problem becomes exponentially more severe as the inverse-temperature increases (regardless of the ground-state sign). ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● β=.5 β=.1 β=.8 β=.3 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 0.4 0.5 0.6 0.7 0.8 0.9 1.0 ϵ 〈sgn〉 Figure 4.4: The severity of the sign problemhsgni as a function of for the Hamiltonian H =(P +P 3 ) +P 2 . The model is sign-problematic for all > 0. In the region 0<< 1 all ground state amplitudes have the same sign. 83 4.5 Conclusions and discussion In this chapter, we made an attempt to elucidate the QMC sign problem, one of the most fundamental bottlenecks of many-body physics simulations [105, 83, 72, 57], and alleviate the confusion surrounding certain aspects of the problem. Along the way, we derived a formalism that allowed us to characterize the conditions under which the sign problem emerges and to analyze several core examples that illustrate a number of important observations. First and foremost, we demonstrated that non-stoquasticity | the non-positivity or complex-valuedness of the o-diagonal entries of the matrix representation of Hamiltonians | does not imply the existence of a sign problem. We have shown rather that the emergence of a sign problem is tightly connected to the cumulative phase of products of o-diagonal terms along closed paths in the hypercube of basis states. We have also shown that a single spin-1=2 particle cannot possess a sign problem but that a single spin-one particle can. We have additionally provided an example illustrating that a physical model may be non-stoquastic and have a severe sign problem despite having all-positive ground-state amplitudes, a property that is usually linked with stoquasticity.[13] Developing a true understanding of the nature of the QMC sign problem is important in physics, chemistry, the material sciences and well beyond those, and is crucial to the potential resolution of the problem. We therefore hope that our work will provide a useful framework for studying the nature of the sign problem in models of physical relevance beyond those analyzed here and will allow addressing the sign problem in more general settings. Another area in which developing a true understanding of the problem is important is the eld of quantum computing. This is because quantum simulations, [25, 96] or simulations of quantum many-body systems on quantum computers, as originally envisioned by Richard Feynman [36] is one of the most promising applications of near-term quantum devices (as well as the more distant fault-tolerant universal quantum computers). The existence of quantum simulation speedups hinges on the premise that simulating quantum systems is an intractable task for standard computers, and relies (at least in part) on the intractability of the sign problem. In this context, it is worth noting in particular the common misconception that 84 the resolution of the sign problem would imply in general a polynomial-time equilibration of QMC simulations or that it would somehow provide as a result a resolution to the famous P versus NP problem of computer science.[75] Notwithstanding, a general resolution to the sign problem would imply the existence of ecient classical simulations to quantum algorithms.[22, 63, 53] 85 Chapter 5 Ecient simulation of so-called non-stoquastic superconducting ux circuits This chapter is based on the work in the paper [47]. There is a tremendous interest in fabri- cating superconducting ux circuits that are nonstoquastic|i.e., have positive o-diagonal matrix elements|in their qubit representation, as these circuits are thought to be unsimula- ble by classical approaches and thus could play a key role in the demonstration of speedups in quantum annealing protocols. We show however that the ecient simulation of these sys- tems is possible by the direct simulation of the ux circuits. Our approach not only obviates the reduction to a qubit representation but also produces results that are more in the spirit of the experimental setup. We discuss the implications of our work. Specically we argue that our results cast doubt on the conception that superconducting ux circuits represent the correct avenue for universal adiabatic quantum computers. 5.1 Introduction In the previous chapter, we saw that non-stoquastic (NStoq) Hamiltonians are not known to be eciently simulated with classical computational methods such as Quantum Monte Carlo (QMC), owing to the relationship between non-stoquasticity and the QMC sign prob- lem [44, 83, 105]. NStoq terms are, therefore, seen as one of the missing ingredients for the demonstration of quantum speedups [28, 90, 106, 62, 3, 104, 31]. One popular quantum com- puting scheme that take advantage of this fact is Adiabatic Quantum Computing (AQC) [4]. 86 AQC is a paradigm in which computation is accomplished by evolving a quantum Hamil- tonian slowly enough so as to ensure the system remains in its instantaneous ground state. AQC is considered a leading candidate for the future of scalable robust quantum devices. It has long been known that for AQC to be universal [14, 1], the Hamiltonians of interest must include non-stoquastic (NStoq) terms [15, 83, 73, 72]|i.e., terms whose matrix representa- tions include positive o-diagonal elements that cannot be easily transformed away [44, 29]. Currently, several groups are utilizing radio frequency superconducting quantum inter- ference devices (rf-SQUIDs) to engineer NStoq systems, [71], and preliminary success has already been reported [93]. These types of devices represent some of the current cutting edge in quantum computing technology. However, as we will discuss here, much of this eort may be in vain. While there is no argument that truly NStoq experimental qubit systems would constitute a huge step towards AQC universality and perhaps even demonstrate some form of quantum supremacy, it is important to reckon with the fact that ux qubit systems are approximations to superconducting circuits, which are not genuinely two-level systems. In fact, we show here that the current paradigm of ux qubit systems that are considered to be NStoq (and hence unsimulable), are Stoq, or at least can be made Stoq, when the circuit is considered. This observation in turn allows us to eciently simulate these systems, calling into question the entire approach of fabricating NStoq quantum annealers with rf-SQUID technology. In this chapter, we demonstrated that ux-circuit Hamiltonians are eciently simulable (with a few stated caveats). This is possible because the formalism of superconducting cir- cuits is not discrete spin states, but instead conjugate variable pairs in a continuous Hilbert space. These coordinate pairs are usually expressed as ux and charge variables|which rep- resent position- and momentum-like coordinates, respectively. This observation was initially pointed out by Bravyi et al. [16] who asserted that if we treat the quantum mechanics of Josephson-junction qubit systems of the \ ux'-type as a collection of distinguishable (rather than bosonic or fermionic) particles the kinetic energy is a Laplacian in the continuous limit (see also Ref. [18]). Devising a combination of rotation of the charge and discretization of 87 ux, we explicitly show that a sign-problem free QMC technique may be applied which avoids the all the computational issues that arise in the NStoq qubit approximation picture 1 . 5.2 Eciently simulating superconducting ux circuits In what follows, we provide a methodology for simulating general superconducting ux cir- cuits whose Hamiltonians are of the form, ^ H(^ q 1 ;::; ^ q n ; ^ 1 ;:::; ^ n ) = n X k=1 k ^ q 2 k + ^ V ( ^ 1 ;:::; ^ n ); (5.1) where ^ q and ^ satisfy ^ q j ; ^ k =i~ jk . While most modern qubit devices are not necessarily given in this form [107], we provide a prescription involving simple, scalable coordinate transformations, that makes them so. Once cast in this form, we show that they can be eciently simulated via sign-problem-free QMC [76, 89], where by \ecient" we mean the computation will result in the thermally averaged observables that require sampling the distribution a poly(n) number of times 2 . We start by discretizing the ux variables, ^ , onto an equally spaced grid of unit (in dimensionless ux units). This yields a diagonal potential energy matrix of the form, X IJ hIj ^ VjJijIihJj = N X I V (i 1 ;:::;i n )jIihIj; (5.2) 1 An independent study making similar assertions that has very recently been posted on the arXiv [26]. However, the focus and methods by which the resolution of the o-diagonal matrix elements in the kinetic energy matrix is achieved are quite dierent. 2 Of course, not much can be said about the scaling of the QMC equilibration time; even the most carefully chosen basis may still yield painfully long computation times due to the system being dicult to equilibrate, e.g., it has a highly degenerate ground state. This is true regardless of system size, and is not unique to QMC|i.e., quantum annealers may similarly take exponential times to equilibrate. 88 wherejIi = Q n k=1 ji k i =ji 1 i 2 :::i n i is a direct product of the one-dimensional Dirac-like func- tions [27, 80]. This yields a diagonal potential energy matrix, regardless of how complicated the functional form of V ( 1 ;::: n ). To address the kinetic energy operator we note thathIj^ qjJi!i~ d d when ! 0. This allows us to express the derivative for a nite using rst order nite dierence (FD), X IJ hIj^ qjJijIihJj = N X I i~ 2 jI + 1ihIjjI 1ihIj : (5.3) While this may not seem the natural choice at rst, we note that using a more traditional basis, e.g. direct products of orthogonal polynomials, will result in positive o-diagonal elements. This is caused, in part, by the non-locality of orthogonal polynomials [46, 48]. Furthermore, utilizing a higher-order derivative expansion is also problematic, since it will also result in NStoq terms in the kinetic energy matrix [27]. Obviously, Eq. (5.3) does not immediately solve our problem, since it does not guaran- tee only non-positive o-diagonal matrix elements, as superconducting circuits may involve couplings between charge operators of dierent qubits|i.e., cross terms [52]. To resolve this issue, we transform the coordinate system in an eort to ensure that the kinetic energy only contains a sum of one-body quadratic terms. By rotating the coordinate basis, eectively diagonalizing the capacitance matrix under the restriction that the phase space volume is preserved, we are left with the desired form. This is akin to a normal mode transformation. We emphasize that this normal-mode transformation is not a scaling bottleneck, since it hinges on diagonalizing (in the worst case) a matrix of dimension nn with n being the number of charge operators. This procedure allows for a generic way to simulate the bulk of circuits used throughout AQC [74] by mapping the circuit Hamiltonian for the device to the discretized Hamiltonian given by, ^ H fd = X I n X k=1 k ~ 2 2 jI (+) k ihIj 2jIihIj +jI () k ihIj + X I V (i 1 ;:::;i n )jIihIj; (5.4) 89 wherejI () k i =ji 1 :::i k1 i ji k 1i ji k+1 :::i n i, obviating the qubit approximation entirely. 5.3 Simulating non-stoquastic ux qubit Hamiltonians For a macroscopic system, such as a superconducting ux circuit, to be a viable choice as a quantum information device, one must be able to isolate and address a subspace generated by two states per qubit. Because of this, rf-SQUIDs have become the cornerstone of modern quantum computing [32]. They can be modeled as an anharmoic oscillator, where the two lowest energy states are spectroscopically accessible through some control process. Moreover, these two states can be mapped to an analogous two-level spin system where traditional quantum computing operations may be carried out. This is all while being easily fabricated through well understood processes [32]. One such rf-SQUID device, the schematic for which is shown in Fig. 5.1, and whose Hamiltonian is given by, ^ H c = ^ Q 2 1 2 ~ C 1 + ^ Q 2 2 2 ~ C 2 + C 12 ^ Q 1 ^ Q 2 C 1 C 2 +C 12 (C 1 +C 2 ) + ( ^ 1 z 1 ) 2 2L 1 + ( ^ 2 z 2 ) 2 2L 2 + M 12 ( ^ 1 z 1 )( ^ 2 z 2 ) L 1 L 2 0 I 1 2 cos h x 1 0 i cos h 2 0 ^ 1 i 0 I 1 2 cos h x 2 0 i cos h 2 0 ^ 2 i : (5.5) has been specically designed to be NStoq in its qubit representation, owing to the novel couplings between the the two qubits (note that 0 =~=e with~ =e = 1 throughout) [93]. We present here a detailed study of how the method outlined in the previous section can be used to simulate this circuit eciently. The values of the xed parameters in the above Hamiltonian are given in Table 5.1. The parameters not listed in Table 5.1 are adjustable device parameters used for initializing and performing the annealing process. To initialize the system, the coaxial uxes, z 1 and z 2 , 90 Figure 5.1: Circuit schematic for the non-stoquastic two qubit rf-SQUID [93]. The two superconducting Josephson junctions are coupled both capacitively and inductively. Parameter Qubit 1 Qubit 2 L (pH) 231.9 239.1 C (fF) 119.5 116.4 I (A) 3.227 3.157 C 12 (fF) 132.0 Table 5.1: Fixed circuit parameters for the coupled rf-SQUID system. See Ref. [93] for more details. are set to the values associated with the problem Hamiltonian. These ux values dene the nal state to be measured at the end of the anneal. The transverse uxes, x 1 = x 2 = x , are then varied slowly, transforming the system from a single, anharmonic well at x = 0 to a set of four wells at x = [77]. These four wells dene the four possible 2-qubit states, the lowest of which being the ground state of the nal system and the solution to the problem Hamiltonian. This process is shown schematically in Fig. 5.2. As was outlined in the previous section, we cannot work directly with Eq. (5.5), owing to the rst order charge coupling ( ^ Q 1 ^ Q 2 ). Instead we transform to new coordinates, ^ q 1(2) = 1 8 p ! 1 ! 2 ! 1=2 p ! 1 ^ u 1 p ! 2 ^ u 2 (5.6) and, ^ 1(2) = 2 p ! 1 ! 2 ! 1=2 p ! 2 ^ v 1 p ! 1 ^ v 2 ; (5.7) 91 Figure 5.2: Schematic overview of the change in the potential energy of the two qubit circuit Hamiltonian during the annealing process. (a) Single contour of the potential energy at x = 0. The dot represents the majority of the probability density of the ground state wavefunction. (b) Single contour of the potential energy at x = 3=4. The solid ring represents the majority of the probability density of the ground state wavefunction, which is delocalized due to the presence of a phase transition. (c) Single contour of the potential energy at x =. The dot represents the majority of the probability density of the ground state wavefunction, which is in the well associated with thej00i qubit state. (d)-(f) Same as (a), (b), and (c) but after the normal mode transformation. The dimensionless operators, ^ u and ^ v, are dened as ^ Q i = p ~ ~ C i ! i ^ u i and ^ i = q ~ ~ C i ! i ^ v i + z i , with ! 2 i = 1=L i ~ C i . This yields kinetic and potential operators of the form, ^ T (^ q 1 ; ^ q 2 ) =C + ^ q 2 1 +C ^ q 2 2 ; (5.8) and, ^ V ( ^ 1 ; ^ 2 ) =L ^ 2 1 +L + ^ 2 2 +L 12 ^ 1 ^ 2 +E 1 ( x ) cos h 2 0 1 ( ^ 1 + ^ 2 ) + z 1 i +E 2 ( x ) cos h 2 0 2 ( ^ 1 ^ 2 ) + z 2 i ; (5.9) the coecients of which are presented in Table 5.2. 92 5.4 Results We simulate the circuit outlined in the previous section, Eq. (5.4), by employing o-diagonal expansion QMC (ODE QMC; the reader is referred to the Supplemental Information for technical details) [5, 54, 41]. Using this, we can compute the thermal average of observable quantities byh ^ Ai = Tr[ ^ Ae ^ H ]=Tr[e ^ H ], where = 1=kT (here we chose T = 12mK to re ect actual device temperatures). The quantity of interest for superconducting ux qubits and AQC is the persistent current, ^ I z 1(2) , the expectation value of which can be expressed as, h ^ I z 1(2) i = * @ ^ H @ z 1(2) + = 1 L 1(2) h ^ 1(2) i z 1(2) + M 12 L 1 L 1 h ^ 2(1) i z 2(1) : (5.10) The above constitutes the \qubit" in superconducting ux qubit device, since the quantized current (or more appropriately, the direction of the quantized current) is what maps to the spin up (or down) in the qubit model, analogous to the eigenstates of the Pauli-z operator [71]. In other words, this is the quantity that is measured by the hardware, and a clockwise (or counterclockwise) current yields a positive (or negative) current measurement, which is interpreted as a qubit state ofj0i (orj1i). Our algorithm has only one extraneous parameter, the grid spacing . It is important to make sure that in the limit ! 0, the discretized model approaches the continuous Hamiltonian. This is presented in Fig. 5.3 which shows the relative error forh ^ I z 1 i as a function of computed using exact diagonalization, conrming our expectations that our simulations recover the results of the continuous model for small enough . To demonstrate that we can, in fact, simulate the circuit, we computed four separate anneals for four sample problem Hamiltonians, each corresponding to the four possible 2- qubit readouts|j00i;j01i;j10i, andj11i. We dictate which of the nal states we desire by setting the coaxial uxes, z 1 and z 2 , to either positive or negative values depending on the desired state. The magnitude of these chosen values would come from the problem 93 Figure 5.3: Magnitude of the relative error of the persistent current as a function of the convergence parameter . The current was computed using z 1 = 0:1m 0 and z 2 = 0:9m 0 at three dierent points in the anneal. Hamiltonian one would be attempting to solve using AQC. For the purposes here, they were chosen arbitrarily to showcase the versatility of our technique. This is illustrated in Fig. 5.4. We can relate the change in ^ I z shown in Fig. 5.4 back to the schematic shown in Fig. 5.2 by noting again that each of the four qubits states correspond to a well in one of the four Cartesian quadrants of the ( ^ 1 ; ^ 2 ) plane. These qubit states also correspond to a set of measurements of the sign of I z 1 and I z 2 . For example, setting both z 1 and z 2 to positive values results in the well in the (+; +) quadrant being the lowest in energy, as well asI z 1 and I z 2 both have a positive sign, as in Fig. 5.4(a). Expressed dierently, Fig. 5.4 illustrates that the sign of the current readout corresponds to the quadrant in which the lowest energy well is located. It is interesting to note that the qubit representations of the circuits simulated above are virtually unsimulable by QMC due to the severe sign problem caused by the NStoq terms that arise from the reduction of the circuit [44, 93]. 5.5 Conclusions In this chapter, we devised a prescription to make seemingly NStoq superconducting ux circuits amendable to scalable, ecient QMC techniques. The importance of our work lies 94 〈 � 〉 ( μ �) □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ � � � � - � � + � (c) □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ - � � + � (a) □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ □ 〈I 1 〉QMC ○ 〈I 1 〉Exact ▽ 〈I 2 〉QMC ◇ 〈I 2 〉Exact � � � � (d) □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ▽ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ (b) ���������� ���� ( ϕ � ) Figure 5.4: Persistent current calculated for both exact diagonalization and ODE QMC at various x points throughout the annealing process. (a) Anneal for thej00i qubit output. z 1 = 0:1 m 0 and z 2 = 0:9 m 0 were used as the problem Hamiltonian values. (b)j01i, z 1 = 0:5 and z 2 =0:5 (c)j10i, z 1 =0:4 and z 2 = 0:25 (d)j11i, z 1 =0:25 and z 2 =0:25. mainly in its implications for approximating superconducting circuits as qubit models, and what information about the circuit can be gleaned from simulations of the model. Our results stand in stark contrast with recent and ongoing eorts to construct circuits of this type [93]. If indeed superconducting ux circuits|even those circuits whose qubit repre- sentations are NStoq|are eciently simulable, then the implications are that ux circuits cannot generate genuine non-stoquastic dynamics, which is an important ingredient for both AQC universality and quantum supremacy. While the focus of our work was one specic superconducting ux qubit device, we note that our analysis equally applies for any \traditional" qubit circuit and that we expect to maintain QMC scaling, poly(n), irrespective of system size. Our conclusions are predicated on the grid spacing being constant as a function of system size, which is expected since the grid spacing is proportional to the scale of the distance between wells in the ux hyperspace, which are relatively constant with respect to system size. The method laid out here is predicated on the limitation that the charge part of the Hamiltonian can be made Cartesian via some sort of transformation. Therefore, an impor- tant question that arises from our work is: What modications to the Hamiltonians of super- conducting ux circuits can be made for these systems to truly exhibit non-stoquasticity at 95 Coecient Expression ~ C 1(2) C 1(2) + C 12 C 2(1) C 2(1) +C 12 C 2~ p ! 1 ! 2 1 C 12 p ~ C 1 ~ C 2 C 1 C 2 +C 12 (C 1 +C 2 ) L ~ 16 p ! 1 ! 2 ! 2 1 +! 2 2 2M 12 L l L 2 ~ C 1 ~ C 2 L 12 ~ 4 p ! 1 ! 2 ! 2 1 ! 2 2 E i ( x ) 0 2 I i cos 0 x i ~ 8 ~ C i p ! 1 ! 2 1=2 Table 5.2: Coecients for the two qubit rf-SQUID circuit normal mode Hamiltonian. the circuit level? One obvious choice is to introduce an external vector potential; however, this will only results in an oset charge operator, which can, most likely, be gauged away via a coordinate transformation. Currently, the most likely candidate to produce a truly NStoq circuit is a Hamiltonian that couples the charge and ux operators. This is exactly what is being explored by the bi uxon qubit system [68]. We conclude that AQC would benet greatly from more eort being put into the study of systems of this type in order to achieve this long sought after goal|mainly, a truly unsimulable circuit Hamiltonian. An interesting conjecture that follows from our results is that every NStoq (and thus non- simulable) spin model is a low-energy approximation of a higher-dimensional Stoq model. If this is the case and an ecient construction of such higher-dimensional models exists, that might be a path towards resolving the infamous QMC sign problem [83, 44]. 96 References [1] Dorit Aharonov, Wim van Dam, Julia Kempe, Zeph Landau, Seth Lloyd, and Oded Regev. Adiabatic quantum computation is equivalent to standard quantum computa- tion. SIAM J. Comput., 37(1):166{194, January 2007. [2] Awad H Al-Mohy and Nicholas J Higham. Computing the action of the matrix expo- nential, with an application to exponential integrators. SIAM journal on scientic computing, 33(2):488{511, 2011. [3] Tameem Albash. Role of nonstoquastic catalysts in quantum adiabatic optimization. Phys. Rev. A, 99:042334, Apr 2019. [4] Tameem Albash and Daniel A. Lidar. Adiabatic quantum computation. Rev. Mod. Phys., 90:015002, Jan 2018. [5] Tameem Albash, Gene Wagenbreth, and Itay Hen. O-diagonal expansion quantum monte carlo. Phys. Rev. E, 96:063309, Dec 2017. [6] Fabien Alet, Kedar Damle, and Sumiran Pujari. Sign-problem-free monte carlo sim- ulation of certain frustrated quantum magnets. Phys. Rev. Lett., 117:197203, Nov 2016. [7] J V Alvarez and Felix Ritort. Quantum monte carlo study of the innite-range ising spin glass in a transverse eld. Journal of Physics A: Mathematical and General, 29(23):7355{7366, dec 1996. [8] Vinay Ambegaokar and Matthias Troyer. Estimating errors reliably in Monte Carlo simulations of the Ehrenfest model. American Journal of Physics, 78(2):150{157, Feb 2010. [9] Al an Aspuru-Guzik, Anthony D. Dutoi, Peter J. Love, and Martin Head-Gordon. Sim- ulated quantum computation of molecular energies. Science, 309(5741):1704{1707, 2005. [10] F Barahona. On the computational complexity of Ising spin glass models. J. Phys. A: Math. Gen, 15(10):3241, 1982. [11] M.E.J. Newman & G.T. Barkema. Monte Carlo Methods in Statistical Physics. Oxford Uinversity Press, 1999. 97 [12] J. A. Barker. A quantumstatistical monte carlo method; path integrals with boundary conditions. The Journal of Chemical Physics, 70(6):2914{2918, 1979. [13] A. Berman and R. Plemmons. Nonnegative Matrices in the Mathematical Sciences. Society for Industrial and Applied Mathematics, 1994. [14] Jacob D. Biamonte and Peter J. Love. Realizable hamiltonians for universal adiabatic quantum computers. Phys. Rev. A, 78:012352, Jul 2008. [15] S. Bravyi and B. Terhal. Complexity of stoquastic frustration-free hamiltonians. SIAM Journal on Computing, 39(4):1462{1485, 2015/05/22 2009. [16] Sergey Bravyi, David P. Divincenzo, Roberto Oliveira, and Barbara M. Terhal. The complexity of stoquastic local hamiltonian problems. Quantum Info. Comput., 8(5):361385, May 2008. [17] Sergey Bravyi and Matthew Hastings. On complexity of the quantum ising model. arXiv:1410.0703, 2014. [18] Guido Burkard, Roger H. Koch, and David P. DiVincenzo. Multilevel quantum descrip- tion of decoherence in superconducting qubits. Phys. Rev. B, 69:064503, Feb 2004. [19] M. Caliari. Accurate evaluation of divided dierences for polynomial interpolation of exponential propagators. Computing, 80(2):189{201, Jun 2007. [20] Marco Caliari, Peter Kandolf, and Franco Zivcovich. Backward error analysis of polyno- mial approximations for computing the action of the matrix exponential. BIT Numer- ical Mathematics, 58(4):907{935, Dec 2018. [21] D. M. Ceperley and E. L. Pollock. Path-integral computation of the low-temperature properties of liquid 4 He. Phys. Rev. Lett., 56:351{354, Jan 1986. [22] N. J. Cerf and S. E. Koonin. Monte carlo simulation of quantum computation, 1997. [23] Shailesh Chandrasekharan. Fermion bag approach to lattice eld theories. Phys. Rev. D, 82:025007, Jul 2010. [24] Shailesh Chandrasekharan and Uwe-Jens Wiese. Meron-cluster solution of fermion sign problems. Phys. Rev. Lett., 83:3116{3119, Oct 1999. [25] Andrew M. Childs, Dmitri Maslov, Yunseong Nam, Neil J. Ross, and Yuan Su. Toward the rst quantum simulation with quantum speedup. Proceedings of the National Academy of Sciences, 115(38):9456{9461, 2018. [26] Alessandro Ciani and Barbara M. Terhal. Stoquasticity in circuit-QED. arXiv e-prints, page arXiv:2011.01109, November 2020. [27] Daniel T Colbert and William H Miller. A novel discrete variable representation for quantum mechanical reactive scattering via the s-matrix kohn method. The Journal of chemical physics, 96(3):1982{1991, 1992. 98 [28] E. Crosson, E. Farhi, C. Yen-Yu Lin, H.-H. Lin, and P. Shor. Dierent Strategies for Optimization Using the Quantum Adiabatic Algorithm. ArXiv e-prints, January 2014. [29] Elizabeth Crosson, Tameem Albash, Itay Hen, and A. P. Young. De-signing hamilto- nians for quantum adiabatic optimization, 2020. [30] Carl de Boor. Divided dierences. Surveys in Approximation Theory, 1:46{69, 2005. [31] Gabriel A. Durkin. Quantum speedup at zero temperature via coherent catalysis. Phys. Rev. A, 99:032315, Mar 2019. [32] RL Fagaly. Superconducting quantum interference device instruments and applica- tions. Review of scientic instruments, 77(10):101101, 2006. [33] Edward Farhi, David Gosset, Itay Hen, A. W. Sandvik, Peter Shor, A. P. Young, and Francesco Zamponi. Performance of the quantum adiabatic algorithm on ran- dom instances of two optimization problems on regular hypergraphs. Phys. Rev. A, 86:052334, Nov 2012. [34] Reinhard Farwig and D Zwick. Some divided dierence inequalities for n-convex func- tions. Journal of Mathematical Analysis and Applications, 108(2):430{437, 1985. [35] R. P. Feynman. Atomic theory of the transition in helium. Phys. Rev., 91:1291{1301, Sep 1953. [36] Richard P. Feynman. Simulating physics with computers. 21(6-7):467{488, 1982. [37] Bernard Field and Tapio Simula. Introduction to topological quantum computation with non-abelian anyons. Quantum Science and Technology, 3(4):045004, jul 2018. [38] Matthew P. A. Fisher, Peter B. Weichman, G. Grinstein, and Daniel S. Fisher. Boson localization and the super uid-insulator transition. Phys. Rev. B, 40:546{570, Jul 1989. [39] Laurent Fousse, Guillaume Hanrot, Vincent Lef evre, Patrick P elissier, and Paul Zim- mermann. Mpfr: A multiple-precision binary oating-point library with correct round- ing. ACM Trans. Math. Softw., 33(2), June 2007. [40] Thierry Giamarchi, Christian R uegg, and Oleg Tchernyshyov. Bose-Einstein conden- sation in magnetic insulators. Nature Physics, 4(3):198{204, Mar 2008. [41] Lalit Gupta, Tameem Albash, and Itay Hen. Permutation matrix representation quantum monte carlo. Journal of Statistical Mechanics: Theory and Experiment, 2020(7):073105, jul 2020. [42] Lalit Gupta, Lev Barash, and Itay Hen. Calculating the divided dierences of the exponential function by addition and removal of inputs. Computer Physics Communi- cations, 254, 9 2020. [43] Lalit Gupta and Itay Hen. Permutation matrix representation quantum monte carlo: a universal worm update and advanced measurement techniques (inpreparation). 99 [44] Lalit Gupta and Itay Hen. Elucidating the interplay between non-stoquasticity and the sign problem. Advanced Quantum Technologies, 3(1):1900108, 2020. [45] Stefan G uttel. Rational krylov approximation of matrix functions: Numerical methods and optimal pole selection. GAMM-Mitteilungen, 36(1):8{31, 2013. [46] Thomas Halverson and Bill Poirier. Large scale exact quantum dynamics calculations: Ten thousand quantum states of acetonitrile. Chemical Physics Letters, 624:37{42, 2015. [47] Tom Halverson, Lalit Gupta, Moshe Goldstein, and Itay Hen. Ecient simulation of so-called non-stoquastic superconducting ux circuits, 2020. [48] Tom Halverson, Dmitri Iouchtchenko, and Pierre-Nicholas Roy. Quantifying entan- glement of rotor chains using basis truncation: Application to dipolar endofullerene peapods. The Journal of chemical physics, 148(7):074112, 2018. [49] D. C. Handscomb. Proc. Camb. Phil. Soc., 58:594598, 1962. [50] D. C. Handscomb. Proc. Camb. Phil. Soc., 60:115122, 1964. [51] Dominik Hangleiter, Ingo Roth, Daniel Nagaj, and Jens Eisert. Easing the Monte Carlo sign problem. arXiv e-prints, page arXiv:1906.02309, Jun 2019. [52] R Harris, J Johansson, AJ Berkley, MW Johnson, T Lanting, Siyuan Han, P Bunyk, E Ladizinsky, T Oh, I Perminov, et al. Experimental demonstration of a robust and scalable ux qubit. Physical Review B, 81(13):134510, 2010. [53] M. B. Hastings and M. H. Freedman. Obstructions to classically simulating the quan- tum adiabatic algorithm, 2013. [54] Itay Hen. O-diagonal series expansion for quantum partition functions. Journal of Statistical Mechanics: Theory and Experiment, 2018(5):053102, 2018. [55] Itay Hen. Resolution of the sign problem for a frustrated triplet of spins. Phys. Rev. E, 99:033306, Mar 2019. [56] Itay Hen and A. P. Young. Exponential complexity of the quantum adiabatic algorithm for certain satisability problems. Phys. Rev. E, 84:061152, Dec 2011. [57] Patrik Henelius and Anders W. Sandvik. Sign problem in monte carlo simulations of frustrated quantum spin systems. Phys. Rev. B, 62:1102{1113, Jul 2000. [58] Nicholas J Higham. Accuracy and stability of numerical algorithms. Siam, 2002. [59] Nicholas J. Higham. The scaling and squaring method for the matrix exponential revisited. SIAM Journal on Matrix Analysis and Applications, 26(4):1179{1193, 2005. [60] Nicholas J Higham. Functions of matrices: theory and computation. Siam, 2008. 100 [61] A. Honecker, S. Wessel, R. Kerkdyk, T. Pruschke, F. Mila, and B. Normand. Ther- modynamic properties of highly frustrated quantum spin ladders: In uence of many- particle bound states. Phys. Rev. B, 93:054408, Feb 2016. [62] Layla Hormozi, Ethan W. Brown, Giuseppe Carleo, and Matthias Troyer. Nonsto- quastic hamiltonians and quantum annealing of an ising spin glass. Phys. Rev. B, 95:184416, May 2017. [63] Mark Howard and Earl Campbell. Application of a resource theory for magic states to fault-tolerant quantum computing. Physical Review Letters, 118(9), Mar 2017. [64] K. Hukushima and K. Nemoto. Exchange monte carlo method and application to spin glass simulations. J. Phys. Soc. Japan, 65:1604, 1996. [65] D. Jaksch and P. Zoller. The cold atom hubbard toolbox. Annals of Physics, 315(1):52 { 79, 2005. Special Issue. [66] David Joyner. Adventures in group theory. Rubik's cube, Merlin's machine, and other mathematical toys. Baltimore, MD: Johns Hopkins University Press, 2008. [67] Tadashi Kadowaki and Hidetoshi Nishimori. Quantum annealing in the transverse Ising model. Phys. Rev. E, 58(5):5355, November 1998. [68] Konstantin Kalashnikov, Wen Ting Hsieh, Wenyuan Zhang, Wen-Sen Lu, Plamen Kamenov, Agustin Di Paolo, Alexandre Blais, Michael E. Gershenson, and Matthew Bell. Bi uxon: Fluxon-parity-protected superconducting qubit. PRX Quantum, 1:010307, Sep 2020. [69] Ivan Kassal, Stephen P. Jordan, Peter J. Love, Masoud Mohseni, and Al an Aspuru- Guzik. Polynomial-time quantum algorithm for the simulation of chemical dynamics. Proceedings of the National Academy of Sciences, 105(48):18681{18686, 2008. [70] T. D. Kieu and C. J. Grin. Monte carlo simulations with indenite and complex- valued measures. Phys. Rev. E, 49:3855{3859, May 1994. [71] Morten Kjaergaard, Mollie E Schwartz, Jochen Braum uller, Philip Krantz, Joel I-J Wang, Simon Gustavsson, and William D Oliver. Superconducting qubits: Current state of play. Annual Review of Condensed Matter Physics, 11, 2019. [72] Joel Klassen, Milad Marvian, Stephen Piddock, Marios Ioannou, Itay Hen, and Bar- bara Terhal. Hardness and Ease of Curing the Sign Problem for Two-Local Qubit Hamiltonians. arXiv e-prints, page arXiv:1906.08800, Jun 2019. [73] Joel Klassen and Barbara M. Terhal. Two-local qubit Hamiltonians: when are they stoquastic? Quantum, 3:139, May 2019. [74] Philip Krantz, Morten Kjaergaard, Fei Yan, Terry P Orlando, Simon Gustavsson, and William D Oliver. A quantum engineer's guide to superconducting qubits. Applied Physics Reviews, 6(2):021318, 2019. 101 [75] Richard E. Ladner. On the structure of polynomial time reducibility. J. ACM, 22(1):155{171, January 1975. [76] David Landau and Kurt Binder. A Guide to Monte Carlo Simulations in Statistical Physics. Cambridge University Press, New York, NY, USA, 2005. [77] T. Lanting, A. J. Przybysz, A. Yu. Smirnov, F. M. Spedalieri, M. H. Amin, A. J. Berkley, R. Harris, F. Altomare, S. Boixo, P. Bunyk, N. Dickson, C. Enderud, J. P. Hilton, E. Hoskinson, M. W. Johnson, E. Ladizinsky, N. Ladizinsky, R. Neufeld, T. Oh, I. Perminov, C. Rich, M. C. Thom, E. Tolkacheva, S. Uchaikin, A. B. Wil- son, and G. Rose. Entanglement in a quantum annealing processor. Physical Review X, 4(2):021041{, 05 2014. [78] B. P. Lanyon, J. D. Whiteld, G. G. Gillett, M. E. Goggin, M. P. Almeida, I. Kassal, J. D. Biamonte, M. Mohseni, B. J. Powell, M. Barbieri, A. Aspuru-Guzik, and A. G. White. Towards quantum chemistry on a quantum computer. Nature Chemistry, 2:106 EP {, 01 2010. [79] M. Lewenstein, A. Sanpera, and V. Ahunger. Ultracold Atoms in Optical Lattices: Simulating quantum many-body systems. OUP Oxford, 2012. [80] Robert G Littlejohn, Matthew Cargo, Tucker Carrington Jr, Kevin A Mitchell, and Bill Poirier. A general framework for discrete variable representation basis sets. The Journal of chemical physics, 116(20):8691{8703, 2002. [81] D Lonardoni, F Pederiva, and S Gandol. From hypernuclei to the inner core of neutron stars: A quantum monte carlo study. Journal of Physics: Conference Series, 529(1):012012, 2014. [82] E. Marinari, G. Parisi, and J. J. Ruiz-Lorenzo. Numerical simulations of spin glass systems. In A. P. Young, editor, Spin Glasses and Random Fields, page 59. World Scientic, Singapore, Singapore, 1998. [83] Milad Marvian, Daniel A. Lidar, and Itay Hen. On the computational complexity of curing non-stoquastic hamiltonians. Nature Communications, 10(1):1571, 2019. [84] Guglielmo Mazzola and Matthias Troyer. Quantum monte carlo annealing with multi-spin dynamics. Journal of Statistical Mechanics: Theory and Experiment, 2017(5):053105, 2017. [85] A. McCurdy, K. C. Ng, and B. N. Parlett. Accurate computation of divided dierences of the exponential function. Mathematics of computation, 43(168):501{528, 1984. [86] Allan Charles Mccurdy. Accurate Computation of Divided Dierences. PhD thesis, University of California, Berkeley, 1980. AAI8029490. [87] Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller. Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21(6):1087{1092, 1953. 102 [88] L.M. Milne-Thomson. The Calculus of Finite Dierences. Macmillan and Company, limited, 1933. [89] M. E. J. Newman and G. T. Barkema. Monte Carlo Methods in Statistical Physics. Oxford University Press Inc., New York, USA, 1999. [90] Hidetoshi Nishimori and Kabuki Takada. Exponential enhancement of the eciency of quantum annealing by non-stoquastic hamiltonians. Frontiers in ICT, 4:2, 2017. [91] Kouichi Okunishi and Kenji Harada. Symmetry-protected topological order and negative-sign problem for SO(n) bilinear-biquadratic chains. Phys. Rev. B, 89:134422, Apr 2014. [92] G. Opitz. Steigungsmatrizen. Zeitschrift fr Angewandte Mathematik und Mechanik (ZAMM), 44, 01 1964. [93] I. Ozdan, C. Deng, A.Y. Smirnov, T. Lanting, R. Harris, L. Swenson, J. Whittaker, F. Altomare, M. Babcock, C. Baron, A.J. Berkley, K. Boothby, H. Christiani, P. Bunyk, C. Enderud, B. Evert, M. Hager, A. Hajda, J. Hilton, S. Huang, E. Hoskinson, M.W. Johnson, K. Jooya, E. Ladizinsky, N. Ladizinsky, R. Li, A. MacDonald, D. Marsden, G. Marsden, T. Medina, R. Molavi, R. Neufeld, M. Nissen, M. Norouzpour, T. Oh, I. Pavlov, I. Perminov, G. Poulin-Lamarre, M. Reis, T. Prescott, C. Rich, Y. Sato, G. Sterling, N. Tsai, M. Volkmann, W. Wilkinson, J. Yao, and M.H. Amin. Demon- stration of a nonstoquastic hamiltonian in coupled superconducting ux qubits. Phys. Rev. Applied, 13:034037, Mar 2020. [94] N. V. Prokof'ev, B. V. Svistunov, and I. S. Tupitsyn. Exact, complete, and universal continuous-time worldline monte carlo approach to the statistics of discrete quantum systems. Journal of Experimental and Theoretical Physics, 87(2):310{321, 1998. [95] Linda E. Reichl. A modern course in Statistical Physics. Wiley & Sons, 1998. [96] Markus Reiher, Nathan Wiebe, Krysta M. Svore, Dave Wecker, and Matthias Troyer. Elucidating reaction mechanisms on quantum computers. Proceedings of the National Academy of Sciences, 2017. [97] Rotman. An Introduction to the Theory of Groups. New York: Springer-Verlag, 1995. [98] Reuven Y. Rubinstein. Simulation and the Monte Carlo Method. John Wiley & Sons, Inc., New York, NY, USA, 1st edition, 1981. [99] A. W. Sandvik. A generalization of Handscomb's quantum Monte Carlo scheme | Application to the 1-D Hubbard model. J. Phys, A, 25:3667, 1992. [100] A. W. Sandvik. Stochastic series expansion method for quantum Ising models with arbitrary interactions. Phys. Rev. E, 68:056701, 2003. [101] Anders W. Sandvik. Stochastic series expansion method with operator-loop update. Phys. Rev. B, 59:R14157{R14160, Jun 1999. 103 [102] Anders W. Sandvik and Juhani Kurkij arvi. Quantum monte carlo simulation method for spin systems. Phys. Rev. B, 43:5950{5961, Mar 1991. [103] R B Stinchcombe. Ising model in a transverse eld. i. basic theory. Journal of Physics C: Solid State Physics, 6(15):2459{2483, aug 1973. [104] Yuki Susa, Johann F. Jadebeck, and Hidetoshi Nishimori. Relation between quantum uctuations and the performance enhancement of quantum annealing in a nonstoquas- tic hamiltonian. Phys. Rev. A, 95:042321, Apr 2017. [105] Matthias Troyer and Uwe-Jens Wiese. Computational complexity and fundamental limitations to fermionic quantum monte carlo simulations. Phys. Rev. Lett., 94:170201, May 2005. [106] Walter Vinci and Daniel A. Lidar. Non-stoquastic hamiltonians in quantum annealing via geometric phases. npj Quantum Information, 3(1):38, 2017. [107] Uri Vool and Michel Devoret. Introduction to quantum electromagnetic circuits. Inter- national Journal of Circuit Theory and Applications, 45(7):897{934, 2017. [108] Philipp Werner, Armin Comanac, Luca de' Medici, Matthias Troyer, and Andrew J. Millis. Continuous-time solver for quantum impurity models. Phys. Rev. Lett., 97:076405, Aug 2006. [109] E. T. Whittaker and G. Robinson. Divided dierences. In The Calculus of Observa- tions: A Treatise on Numerical Mathematics. New York: Dover, New York, 1967. [110] Franco Zivcovich. Fast and accurate computation of divided dierences for analytic functions, with an application to the exponential function. Dolomites Research Notes on Approximation, 12(1):28{42, 2019. 104 Appendix Finite dimensional permutation matrix representations To show that any nite-dimensional matrix can be written in the form of Eq. (2.9), we will make use of `cycle notation' [97] | a compact representation of permutations | to represent the permutation matrices P j . We start with some terminology. A cycle is a string of integers that represents an element of the symmetric permutation group S n , which cyclically permutes these integers and xes all other integers. For example cycle (a 1 ;a 2 ;:::;a m ) is the permutation that sends a i to a i+1 , 1 i m 1 and sends a m to a 1 . The cycle given in the above example is an m-cycle. In general, any element 2 S n can be written as a product of k cycles as (a 1 a 2 :::a m 1 )(a m 1 +1 a m 1 +2 :::a m 2 )::: (a m k1 +1 a m k1 +2 :::a m k ). The order of a permutation is dened as the smallest positive integer p such that p is the identity element. In this notation, the action of on any number from 1 to n can be determined as follows. If a appears at the right end of one of the k cycles, then (a) is the integer at the start of the cycle to which a belongs. If an integer a does not appear at the right end of one of the k cycles, then (a) is the integer to the right of a in the cycle to which a belongs. 105 For concreteness, let us write the 3 3 permutation matrices used in Sec. ?? in cycle notation. P 1 =P = 2 6 6 6 4 0 0 1 1 0 0 0 1 0 3 7 7 7 5 (1; 2; 3); (5.11) P 2 =P 2 = 2 6 6 6 4 0 1 0 0 0 1 1 0 0 3 7 7 7 5 (1; 3; 2): (5.12) The identity operation can thus be written as P 0 =P 3 =1 = 2 6 6 6 4 1 0 0 0 1 0 0 0 1 3 7 7 7 5 (1)(2)(3): (5.13) We illustrate the evaluation of a product of two cycles by computing P 2 in cycle notation. Since we dened P = (1; 2; 3) we have P 2 =P 2 = (1; 2; 3)(1; 2; 3). By the above denition P 2 (1) = (1; 2; 3)(1; 2; 3)(1) = (1; 2; 3)(2) = (3), P 2 (2) = (1; 2; 3)(1; 2; 3)(2) = (1; 2; 3)(3) = (1), P 2 (3) = (1; 2; 3)(1; 2; 3)(3) = (1; 2; 3)(1) = (2). Therefore P 2 = (1; 3; 2). If we enumerate the basis states as 1j1i 2 6 6 6 4 1 0 0 3 7 7 7 5 ; 2j2i 2 6 6 6 4 0 1 0 3 7 7 7 5 ; 3j3i 2 6 6 6 4 0 0 1 3 7 7 7 5 ; (5.14) then with the action described above one can see thatjki =P matrix notation i jji corresponds to k =P cycle notation i (j). 106 In the above notation, the set of nn permutation matrices can be seen as groups generated by ann-cycle. Any givenn-cycle = (a 0 ;a 1 ;a 2 ;a 3 ;:::;a n1 )2S n , whereS n is the symmetric permutation group, has order n. To see why this is so, observe that k (a 0 ) =a k for 0 < k < n so its order cannot be less than n and n is the identity as n (a i ) = a i for i2f0;n 1g. Therefore the group generated by hasn elements. More generally we have: k (a j ) =a (j+k)modn . This implies k 1 (a j )6= k 2 (a j ) for k 1 6=k 2 . LetP be the permutation matrix corresponding to. Then,P k 1 ja j i6=P k 2 ja j i fork 1 6=k 2 and basis vectorja j i. Since any row (or column) of a permutation matrix has value 0 at all positions but one, where it has the value 1, this implies that no two permutation matrices generated fromP have the same row otherwise that would meanP k 1 ja j i =P k 2 ja j i for some k 1 ;k 2 ;a j . Next, we will show that for any given matrix entry (i;j) there is at least one permutation matrixP k having value 1 at (i;j), that isP k i;j = 1 for somek. Letjii denote the basis vector which has value 1 at the i-th index and 0 at all others. Since these permutation matrices form a group there must be some matrix P k such that P k jji =jii. This however means that P k has value 1 at (i;j). Combining the two statements above, we nd that for any entry (i;j) there is exactly one permutation matrix generated from P that has value 1 at that entry. The diagonal matricesD j may be used to convert these 1's to any desired value. We have thus shown that by choosing our P j permutation matrices as n-cycles, one can construct arbitrary Hamiltonians H = P j D j P j . This proof also provides a prescription as to how to explicitly choose the permutations P j . Hermiticity of H Here we show that H = P D j P j is hermitian if and only if for every index j there is an associated index j 0 such that P j =P 1 j 0 and D j =D j 0 (in general, j and j 0 may correspond to the same index). 107 We rst prove the if direction. Let H above be a hermitian matrix. We show that this implies that for every index j there is an associated index j 0 such that P j =P 1 j 0 . Let P j be the permutation sending a basis vector p to another basis vector q. Thus in `cycle notation' (see Appendix ??) P j =::: (:::p;q:::):::. Let us assume that D j is not the zero matrix (otherwise D j P j is trivially zero). Case I: Let D i(q;q) 6= 0. Since H is hermitian thus there existP j 0 =::: (:::q;p:::)::: in the decomposition ofH. But thenP j P j 0 =::: (:::q;q:::)::: which has to be equal to1 (as otherwise this would imply the existence of a permutation matrix that has a xed point contradictory to our initial setup). Case II: Let D j (q;q) = 0. SinceD j is not identically zero, there exists an elements such thatD j (s;s) 6= 0. Let us denote byr the element that is sent tos inP j , that isP j =::: (:::p;q:::r;s:::):::. Again sinceH is hermitian there must exist P j 0 =::: (:::s;r:::)::: in the decomposition of H. But then P j P j 0 = ::: (:::s;s:::)::: which has to be equal to 1. Thus P j has an inverse P j 0 in the decomposition of H. Next, we prove that there can be no two permutation matrices inH with the same nonzero element. In cycle notation, this assertion translates to the assertion that there can be no two distinct permutations that send a basis state to the same basis state. We prove this by contradiction. LetP j andP k be two distinct permutations both of which send a basis vector p to q. In 'cycle notation' this mean P j = ::: (:::p;q:::)::: and P k = ::: (:::p;q:::)::: with P j 6= P k . We showed above that P j has an inverse P j 0, that is P j P j 0 = 1. But P j 0P k =::: (:::p;p:::)::: has xed point and cannot be the identity due to the uniqueness of the inverse. Thus we have reached a contradiction. We now prove the full if part. By denition, that H is hermitian implies P D j P j = P D j P j 1 . LetP j 0 be the inverse ofP j which as we proved should exist in the decomposition with a nonzero weightD j . Equating the right-hand side and the left-hand side of the equality gives D j =D j 0. Proving the other direction is simpler. We assume that in the summation H = P D j P j there is for every index j an associated index j 0 such that P j =P j 0 1 and D j =D j 0 . Thus H y = P D j 0 P j 0 1 = P D j P j =H. 108 Proof of convolution theorem for divided dierences We show that the following identity is true. Z 0 e t[x 0 ;;x j ] e (t)[x j+1 ;;xq ] dt = e [x 0 ;;xq ] : (5.15) We rst write the divided dierence of exponential function e [x 0 ;;xq ] in the Taylor expansion form as follows. e [x 0 ;:::;xq ] = 1 X n=0 n [x 0 ;:::;x q ] n n! = 1 X n=q n [x 0 ;:::;x q ] n n! = 1 X m=0 q+m [x 0 ;:::;x q ] q+m (q +m)! = 1 X m=0 q+m (q +m)! X P k j =m q Y j=0 (x j ) k j ; (5.16) where we have used the following identity to go from the third to fourth line in Eq. (5.16): [x 0 ;:::;x q ] q+m = ( m< 0 0 m = 0 1 m> 0 P P k j =m Q q j=0 x k j j : We next plug the divided dierence form in Eq. (5.16) in the left hand side of Eq. (5.15). Z 0 e t[x 0 ;;x j ] e (t)[x j+1 ;;xq ] dt = Z 0 0 @ 1 X m=0 (t) j+m (j +m)! X P k i =m j Y i=0 (x i ) k i 1 A 0 @ 1 X n=0 (t) qj1+n (qj 1 +n)! X P k i =n q Y i=j+1 (x i ) k i 1 A dt = 1 X m=0 1 X n=0 Z 0 j+m (j +m)! (t) qj1+n (qj 1 +n)! dt X P k i =m j Y i=0 (x i ) k i X P k i =n q Y i=j+1 (x i ) k i : (5.17) 109 Integrating the above right hand side, we get 1 X m=0 1 X n=0 (j +m + 1)(qj +n) q+m+n (j +m)!qj 1 +n)! (q +m +n + 1) X P k i =m j Y i=0 (x i ) k i X P k i =n q Y i=j+1 (x i ) k i ; which can be further simplied to 1 X m=0 1 X n=0 q+m+n (q +m +n)! X P k i =m j Y i=0 (x i ) k i X P k i =n q Y i=j+1 (x i ) k i = 1 X m=0 q+m (q +m)! X P k i =m q Y i=0 (x i ) k i = e [x 0 ;;xq ] ; (5.18) as desired. 110
Abstract (if available)
Abstract
This thesis is focused on the development and application of a novel Quantum Monte Carlo (QMC) algorithm to advance the state-of-the-art of quantum many body physics simulations. We showed a novel way to decompose the partition function of a Hamiltonian using permutation matrix representation (PMR) formalism which is free from Trotter error. We proposed updates for our QMC method that can be implemented without the prior knowledge of Hamiltonian and does not require us to set the expansion order parameter in advance. We described how to compute the thermal expectation of any complex Hermitian operator in our QMC algorithm. We also benchmarked our algorithm against the state-of-the-art Stochastic Series Expansion (SSE) QMC algorithm and found four or more orders of magnitude advantage. To compute the ratio of weights in our QMC algorithm, we needed an efficient way to calculate the ratio of divided differences of the exponential function (DDEF) which differs by one or two inputs. Computing the divided differences using standard recursion definition gives inaccurate results due to its numerical instability. Methods found in the literature to accurately compute the DDEF have high time complexity as they focus on computing DDEF and not the ratio of DDEF. We devised an efficient algorithm to compute the ratio of DDEF using the addition and removal of inputs. Sign problem in QMC using PMR formalism and its connection with non-stoquasticity is studied. We demonstrated using the Qutrit model how it is possible to have no sign problem for a Hamiltonian with a complex coefficient in its off-diagonal terms. Stoquasticity implies that ground state amplitudes of a Hamiltonian all have the same sign. We showed using a simple Hamiltonian example that the reverse is not true. We showed a stoquastic representation of a flux circuit Hamiltonian which is believed to be non-stoquastic. We applied our QMC algorithm to show the efficient simulation of the superconducting flux circuit.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Tunneling, cascades, and semiclassical methods in analog quantum optimization
PDF
Trainability, dynamics, and applications of quantum neural networks
PDF
The theory and practice of benchmarking quantum annealers
PDF
Demonstration of error suppression and algorithmic quantum speedup on noisy-intermediate scale quantum computers
PDF
Open-system modeling of quantum annealing: theory and applications
PDF
Imposing classical symmetries on quantum operators with applications to optimization
PDF
Error correction and quantumness testing of quantum annealing devices
PDF
Error suppression in quantum annealing
PDF
Quantum information-theoretic aspects of chaos, localization, and scrambling
PDF
Theory and simulation of Hamiltonian open quantum systems
PDF
Topics in quantum information -- Continuous quantum measurements and quantum walks
PDF
Applications of quantum error-correcting codes to quantum information processing
PDF
Coherence generation, incompatibility, and symmetry in quantum processes
PDF
Topics in quantum cryptography, quantum error correction, and channel simulation
PDF
Topics in quantum information and the theory of open quantum systems
PDF
Applications and error correction for adiabatic quantum optimization
PDF
Protecting Hamiltonian-based quantum computation using error suppression and error correction
PDF
Quantum information techniques in condensed matter: quantum equilibration, entanglement typicality, detection of topological order
PDF
Dissipation as a resource for quantum information processing: harnessing the power of open quantum systems
PDF
Exploiting structure in the Boolean weighted constraint satisfaction problem: a constraint composite graph-based approach
Asset Metadata
Creator
Gupta, Lalit
(author)
Core Title
Advancing the state of the art in quantum many-body physics simulations: Permutation Matrix Representation Quantum Monte Carlo and its Applications
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Physics
Degree Conferral Date
2022-05
Publication Date
03/06/2022
Defense Date
02/28/2022
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
OAI-PMH Harvest,Permutation Matrix Representation Quantum Monte Carlo
Format
application/pdf
(imt)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Haas, Stephan (
committee chair
), Hen, Itay (
committee chair
), Brun, Todd (
committee member
), Spedalieri, Federico (
committee member
), Zanardi, Paolo (
committee member
)
Creator Email
lalitg@usc.edu,lalitg327@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC110768077
Unique identifier
UC110768077
Legacy Identifier
etd-GuptaLalit-10425
Document Type
Dissertation
Format
application/pdf (imt)
Rights
Gupta, Lalit
Type
texts
Source
20220308-usctheses-batch-915
(batch),
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu
Tags
Permutation Matrix Representation Quantum Monte Carlo