Lower overhead fault-tolerant building blocks for noisy quantum computers
(USC Thesis Other)
LOWER OVERHEAD FAULT-TOLERANT BUILDING BLOCKS FOR NOISY QUANTUM COMPUTERS
by Prithviraj Prabhu

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

May 2024

Copyright 2024 Prithviraj Prabhu

Acknowledgements

First and foremost, I would like to express my deepest gratitude to my supervisor, Dr. Benjamin Reichardt, for their invaluable encouragement, guidance and support throughout my PhD journey. Their expertise in the field of quantum error correction and fault tolerance has been crucial in shaping my thesis. Additionally, their influence over the past six years has been instrumental in shaping me into the researcher I am today.

I would also like to thank the members of my qualifying exam and thesis defense committees, Dr. Todd Brun, Dr. Daniel Lidar, Dr. Eli Levenson-Falk and Dr. Keith Chugg. Their constructive criticism at crucial junctures of my PhD has been greatly appreciated. My deepest thanks to Dr. Brun, who served as my secondary advisor. Their constant support and supply of ideas kept me motivated and exploring new possibilities.

I am grateful to my colleagues in the Electrical Engineering department at USC for their stimulating discussions and collaborative spirit. Special thanks to Dr. Rui Chao, Dr. Sourav Kundu, Yuanjia Wang and Anirudh Lanka for their insightful conversations and constructive criticisms. I would also like to thank the quantum computing team at Intel and Dr. Christopher Chamberland, previously at Amazon, for providing me with the opportunity to pursue internships. My forays into industry have certainly increased the value of this thesis.

Finally, I want to express my deepest appreciation to my family and friends for their unwavering love and support during this challenging but rewarding experience.
I am grateful for the support of my parents, Govindaraj Prabhu and Sharmila Devi Prabhu, and my sister Prasidha.

Table of Contents

Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
1.1 What is a fault-tolerant quantum computer?
1.2 Redesigns of fault-tolerant building blocks
1.3 Current thrusts and future outlook
Chapter 2: Syndrome measurement
2.1 Flag sequences
2.2 Distance-three stabilizer measurement
2.2.1 Fast reset
2.2.2 Slow reset
2.2.3 Space-time cost
2.3 Distance-five/seven stabilizer measurement
Chapter 3: Cat state preparation
3.1 Using flags to tolerate faults in a non-fault-tolerant circuit
3.2 State preparation by measurement
3.2.1 Non-local measurements
3.2.2 Local measurements on a 1-D chain
Chapter 4: Encoded state preparation
4.1 Deterministic fault-tolerant preparation of CSS ancilla states
4.2 Steane code
4.2.1 Simulations
4.3 Golay code
4.3.1 Simulations
Chapter 5: Distance-four quantum codes
5.1 Codes
5.2 Fault-tolerant error correction
5.2.1 Stabilizer measurement circuits
5.2.2 Stabilizer measurement sequences
5.3 Simulation results
5.3.1 Comparing memory against unencoded qubits
5.4 Potential future work
5.5 Surface code error correction with the union-find decoder
5.6 A subsystem code with four useful logical qubits
5.6.1 Fault-tolerant error correction and detection
5.6.2 Logical gates
Chapter 6: Temporally encoded lattice surgery
6.1 Review of Pauli-based computation and lattice surgery
6.1.1 Pauli-based computation and lattice surgery
6.1.2 Temporally encoded lattice surgery
6.2 New TELS encoding protocol
6.2.1 Improvements arising from repeated temporally encoded measurements
6.2.2 Improvements from correcting classical errors
6.2.3 Protocols for PP sets of size up to one hundred
Chapter 7: TELS for improving magic state fidelity
7.1 Magic state distillation
7.1.1 Distillation in the Clifford frame
7.1.2 Challenges of extending TELS protocols to Pauli frames
7.2 Precise design of distillation tiles
7.2.1 15-to-1 distillation
7.2.2 Scheduling distillation tiles in a factory
Bibliography
Appendices
.1 Post-selective distance-three fault-tolerant cat state preparation
.2 Low-depth fault-tolerant cat state preparation
.3 Corrections and rejections for weight-eight stabilizer measurements
.4 Malignant set counting
.4.1 Monte-Carlo sampling
.4.2 Modelling malignancy with the Bernoulli distribution
.4.3 MacWilliams identity
.5 Construction of classical codes
.5.1 Cyclic codes defined using polynomials
.5.2 Reed-Muller and Polar codes
.6 Speedups offered by different codes
.7 Clifford frames of distilled magic states
.8 Choice of codewords for the Golay code for TELS of a 15-to-1 distillation protocol
.9 Procedure for determining code distances of distillation tiles
.10 Constants used to determine spacetime costs of distillation tiles
.11 Additional distillation layouts
.11.1 125-to-3 distillation
.11.2 116-to-12 distillation
.11.3 114-to-14 distillation

List of Tables

1.1 Overhead estimates for different fault-tolerant algorithm implementations.

1.2 References for the error correction experiments in Fig. 1.2.

2.1 Space and time costs for measuring a weight-w stabilizer using different distance-three fault-tolerant stabilizer measurement circuits. In the following, all the logarithms are base 2. The flag method requires the fewest ancillas and has low depth, allowing for the smallest cost when computing #ancillas × depth.

3.1 Cat state size for distance-3 preparation methods that use m ancilla qubit measurements.

3.2 Possible data errors and associated corrections for the different observed flag patterns in Fig. 3.2.

3.3 Sequences of parity measurements needed to prepare cat states of size n fault-tolerantly to distance d. We assume non-local gates are possible, permitting parity measurement of distant data qubits. The sequences for the distance-five and -seven cases were generated at random, while the corrections were calculated and fault tolerance was verified using Mathematica programs. $\{1 \ldots n\} := (1, 2), (2, 3), \ldots, (n-1, n), (n, 1)$.

5.1 Postselected error correction for 6 logical qubits using [[16, k, 4]] codes on the 25-qubit planar layout of Fig. 5.2. The probability of logical error, acceptance, and expected time to complete are shown for 300 time steps, with noise rate $p = 5 \times 10^{-4}$.
The k = 6 code achieves logical error rate close to the distance-5 surface code using only 10% of the qubits. In comparison, for 6 physical qubits at memory error rate p/10, the probability of error is about 6 × 300 × p/10 = 0.09.

5.2 Distance-four codes with postselection lead to $O(p^3)$ logical errors, much like distance-five codes. Even-distance codes require restarts, however, unlike odd-distance codes.

5.3 Error correction for 1 logical qubit at p = 0.001. The probability of logical error and acceptance are shown for 80 and 200 time steps. Each code uses one patch of qubits. The distance-four surface code has the lowest logical error probability for short computations.

5.4 Error correction for 2 logical qubits at p = 0.0005. The probability of logical error and acceptance are shown for 300 and 750 time steps. The surface codes require more than one patch of physical qubits. Among the new codes, the k = 2 color code has few large stabilizers and a fast sequence. These advantages help it achieve the lowest logical error probability at the highest acceptance rates.

5.5 Error correction for 6 logical qubits at p = 0.00025. The probability of logical error and acceptance are shown for 700 and 1500 time steps. The k = 6 code requires one-tenth the physical qubits of the distance-5 surface code, while nearly matching the logical error probability.

5.6 Error correction for 12 logical qubits at p = 0.0001. The probability of logical error and acceptance are shown for 1800 and 4500 time steps. The k = 4 code is well-balanced, achieving competitive logical error rates with low qubit overhead.
5.7 Sequences of weight-four operators needed to measure different logical operators fault-tolerantly to distance four. Majority voting decides the measurement outcome. If the measured results are split with equal probability, the measurement result is rejected.

6.1 Comparison between the performance of a TELS protocol implemented using pure error detection versus protocols implemented using combined error detection and correction with the [127, 92, 11] BCH code. The last line of the table shows the performance of unencoded lattice surgery. Here, $d_m$ is the measurement used when measuring multi-qubit Pauli operators using lattice surgery, and c is the maximum weight of errors that can be corrected by the classical code used in the TELS protocol. The objective is to minimize the average time taken per Pauli measurement (last column) while ensuring the logical error rate is less than $10^{-15}$ per Pauli. The logical error rate per Pauli $p_L$ is calculated using Eq. (6.12), where the routing space area is A = 100. The results in the first row are for the TELS protocol implemented using pure error detection. By correcting weight-one, -two and -three errors, the average measurement time is reduced to 2.76 syndrome measurement rounds, as opposed to 4.36 when using TELS with pure error detection, or 12 without TELS. Note that a pure error detection scheme with $d_m \geq 3$ results in a larger runtime than those obtained in the first four rows of this table since the total number of syndrome measurement rounds will be at least 127 × 4 = 508.

6.2 Classical codes (and their associated distances) used in the TELS protocols considered in this work. In different noise regimes, some codes perform better than others, as we show in Fig. 6.4, Fig. 12 and Fig. 13. We provide explicit constructions of the best performing codes in Appendix .5.
7.1 Space-time costs of different distillation protocols on a biased-noise planar surface code. $\delta^{(M)}$ is the target logical error rate per output magic state. TELS protocols are labeled “Cliff-xxx”, with “Par” implying that measurements are performed two at a time (i.e., with lattice surgery measurements which can access the two X logical boundaries of surface code patches simultaneously). The number of physical qubits is two times the space cost, since the space cost counts only the number of data qubits of the surface code. The probability that a distillation algorithm rejects due to an error in an injected magic state is $p_D^{(M)} = 1 - (1 - \epsilon_L)^n$, where $\epsilon_L$ is given by Eq. (7.2). For the 15-to-1 distillation protocol, the space-time cost of a protocol using TELS is approximately 30% smaller ($1.17 \times 10^5$) than a protocol that does not use TELS ($1.5 \times 10^5$). For the 125-to-3 distillation protocol, the space-time cost is decreased by approximately 20% with TELS. The label NS refers to the number of syndrome measurement rounds required for the entire distillation protocol.

2 BCH codes and associated generator polynomials. The codeword generators are cyclic shifts of the generator polynomial.

3 Zetterberg codes with associated generator polynomials.

4 The best lattice surgery speedup for $k \in \{1, 2, \ldots, 100\}$, and associated classical code achieving it, in different noise regimes p, and for different target logical error rates δ. Table continues on subsequent pages.

5 Constants associated with layouts of distillation tiles described in Section 7.2. These constants are used to determine minimum spacelike and timelike distances using the procedure in Appendix .9.

List of Figures

1.1 Qubit and gate counts (T) needed for academic and commercial quantum use [SWM+24]. Quantum algorithms decomposed into a sequence of Clifford and T gates assume only the latter is computationally expensive. To ensure reliable results for large and long computations, qubits are protected by error-correcting codes. When using the surface code, we show the minimum code distance needed to achieve an algorithmic accuracy of 1%, with a physical gate error rate of $10^{-3}$. We make modest assumptions about the routing space and ignore magic state distillation factories.

1.2 Progress of experimental demonstrations of quantum error correction on superconducting, ion trap, and neutral atom systems. Currently the objective is to benchmark error correction primitives with small codes. In the next few years, devices will become larger while maintaining low noise levels. Neutral atom systems can control many qubits and maintain coherence, but lack repeated real-time error correction. Superconducting and ion trap qubits have demonstrated all the primitives for fault tolerance, and now face the engineering challenge of scaling up system sizes.

1.3 Layered architecture of a fault-tolerant quantum computer [JVMF+12]. The functioning of a quantum computer can be decomposed into a five-layer vertical stack, each handling a different level of abstracted instructions. From hardware to software, they are the physical pulses, error mitigation strategies, quantum error correction procedures, substrates for quantum logic and the programs of quantum algorithms. Given a quantum code, the fundamental fault-tolerant operations that underpin the error correction layer are state preparation, error correction and logical gates, with stabilizer measurement as a foundational subroutine. Faults affect each building block differently. Flag-based fault tolerance curbs error spread due to physical two-qubit gates.
Higher up the stack, faults affect measurement results, requiring fault tolerance tactics against corrupted measurements.

1.4 Three different models of fault tolerance against faults that corrupt measurement results. The results of stabilizer measurement in state preparation and error correction are used to apply Pauli corrections. Similarly, results of multi-qubit Pauli measurements are used to apply Pauli or Clifford corrections. We show fault-tolerant circuits that are deterministic (no postselection). These models are distance-d fault-tolerant, for d = 2t + 1. A model is fault-tolerant if for k ≤ t faults, the weight of the output error after corrections is at most k. State preparation is the hardest fault tolerance problem as we must ensure tolerance against t input errors and also up to t faults in the execution of the measurements. Error correction is simpler as the total number of input errors and internal faults is constrained to be at most t. Finally, measurement faults in multi-qubit Pauli measurements can be corrected by protecting results with a classical error-correcting code.

1.5 Overview of the thesis. New results are indicated by a ⋆. (Top left) The flag scheme needs $\sim \log_2 w$ ancilla qubits to deterministically measure a stabilizer while tolerating one fault, exponentially fewer than Shor’s probabilistic scheme. (Top right) First overhead estimates on deterministic state preparation. One-shot fault-tolerant stabilizer state preparation is optimized with either flags or tolerance to corrupted measurements. (Bottom left) Qubit-efficient distance-four codes protect information as well as the distance-five surface code, assisted by postselection. The qubit layout is an augmented surface code layout. Flags prevent error spread in stabilizer measurement, and a clever selection of stabilizers can tolerate corrupted measurements.
(Bottom right) Parallelizable Pauli measurements run faster by pre-encoding measurement results into a classical code (Temporal Encoding of Lattice Surgery). This can reduce magic state distillation costs.

2.1 (a) Function of a flag scheme. Errors in a non-fault-tolerant circuit can be made to spread into flag qubits. On measurement, the flag qubits yield a pattern of 1s and 0s, based on which the data is corrected. (b) Circuit to measure the stabilizer $X^{\otimes 10}$, using three flag qubits, in color, to protect against one X fault (distance three).

2.2 Historical progression of stabilizer measurement circuits, illustrated by a weight-10 X stabilizer measurement. The black CNOTs have targets on the 10 data qubits, collectively represented by a black wire. In (b-d), fault-tolerance is only guaranteed to distance three and Pauli corrections, or frame updates, are applied to the data based on the Z basis measurements. (a) Shor’s method uses w + 1 ancillas and requires a fault-tolerantly prepared cat state. (b,c) These methods use unverified cat states with subsequent error decoding, giving a deterministic circuit. (d) Our flag method prepares and unprepares an ancilla cat state while collecting the stabilizer. Exponentially more flag patterns can thus be accessed for fault diagnosis.

2.3 Flag sequences for distance-three fault-tolerant syndrome measurement, using $a$ flag qubits, each measured once (the slow reset model). These sequences are walks through the $a$-dimensional hypercube, from $10^{a-1}$ to $0^{a-1}1$, passing through each vertex at most once and no other weight-one vertices. Flag patterns are stacked vertically and ordered initially left to right, with solid and empty squares representing 1 and 0, respectively, e.g., represents 10, 11, 01.

2.4 (a) Circuit to measure an $X^{\otimes 6}$ stabilizer, CSS fault-tolerant to distance three.
(b) Circuit to prepare a six-qubit cat state, fault-tolerant to distance three.

2.5 Distance-three fault-tolerant syndrome bit measurement only needs three flag qubits. The highlighted region can be repeated to fit the weight of the stabilizer being measured.

2.6 Distance-three error correction is not possible with one flag qubit. Either (top) the control wire is unprotected at some point ⋆, from which an X fault can propagate to an error of weight at least two; or (bottom) faults at a, b, c, d, causing respective errors $I$, $X_1$, $X_1 X_2$, $X_w$, have no consistent correction.

2.7 Simulation of the noisy measurement of an $X^{\otimes 10}$ and $X^{\otimes 22}$ stabilizer at physical error rate $p \in \{10^{-3}, 10^{-2}\}$ using different distance-three fault-tolerant circuits: Shor-style, compressed DiVincenzo-Aliferis, and the flag method of Sec. 2.2.2. In the first and second column of graphs, we show the rate of weight-one and weight-two data errors due to these circuits, with 99% error bars. In the third column, we show the rate at which the measured syndrome bit is wrong.

2.8 Distance-five CSS stabilizer measurement with slow qubit reset for w ∈ {6, 7, 8}. Red wires indicate syndrome and flag qubits.

2.9 Distance-five syndrome measurement with slow qubit reset for a weight-w X stabilizer. The thick black wire indicates a register of w qubits. An opaque red wire implies the flag is currently inactive and not catching faults. The gates in the blue section can be repeated to construct stabilizer measurement circuits for arbitrary stabilizer weight w. At any instant, only five flags are active. Hence this circuit can be performed with fast qubit reset using only five flag qubits.

2.10 Distance-seven syndrome measurement with slow qubit reset for a weight-17 X stabilizer.
At any instant, only seven flags are active. Hence this circuit can be performed with fast qubit reset using only seven flag qubits.

3.1 Distance-three fault-tolerant cat state preparation circuits. Note that, with fast reset, only one ancilla qubit is required.

3.2 Circuit to prepare a 15-qubit cat state by adaptive error correction, fault-tolerant to distance three. Labels on the thick black wire indicate which data qubit in the block is being addressed as the control or target of the CNOT. If a fault occurs while preparing the cat state on the |+⟩ qubit, it is partially localized by the red flag ancilla. The measurement result of this flag then determines a set of parity checks to completely localize a possible fault. After all the ancilla qubits have been measured, corrections are applied based on Table 3.2.

3.3 If the red ancilla flag in Fig. 3.2 is not triggered, these circuits are used to find and correct a possible error. The flag sequences (from Fig. 2.3) and corresponding corrections are listed at the bottom. Note that these sequences are nonadaptive, and can be used either with $a$ ancilla qubits in a slow reset model, or with just one ancilla qubit in a fast reset model, since all the CNOT gates commute.

3.4 Reduced number of ZZ parity measurements required to prepare a cat state fault-tolerantly. We first consider non-local connectivity, proving that $n + \lceil \log_2(n/3) \rceil$ measurements are sufficient to tolerate one fault. By random search, we found sequences of parity checks that can tolerate two or three faults too. In the second graph, qubits are laid on a 1-D chain, and CNOT gates are local. The number of parity measurements is generally larger than with non-local connectivity. The n = 8, d = 7 solution for local measurements is conjectured but not proved.
3.5 Rates of residual errors of weight w ∈ {1, 2, 3, 4} after size-eight cat state preparation, for physical error rates $p \in [5 \times 10^{-3}, 2.5 \times 10^{-2}]$. The methods used consist of sequences from Table 3.3, the measurement of the $X^{\otimes n}$ stabilizer fault-tolerantly to distance seven (MXd7) and finally the encoding of an $X^{\otimes n}$ operator fault-tolerantly to distance seven (SPMd7). The distance-seven cases are fully fault-tolerant, as opposed to the distance-five case where three faults can result in a weight-four error. The lowest residual error rates are observed with the distance-seven fault-tolerant stabilizer measurement sequence in Table 3.3.

4.1 Fault-tolerant circuit for encoding the operator XXXX, given the first data qubit does not start in a Z eigenstate. Data qubits are shown in black and ancilla qubits in red. Ancilla qubits flag correlated errors and apply appropriate corrections on to the data qubits. Note that this circuit is derived from the flag-fault-tolerant stabilizer measurement circuits in Chap. 2.

4.2 Fault-tolerant preparation of a [[7, 1, 3]] $|0\rangle_L$ state. (a) Circuit to ideally prepare the $|0_L\rangle$ state of the Steane code, needing nine CNOT gates over three rounds. (b) Condensed circuit to fault-tolerantly and deterministically prepare the $|0_L\rangle$ state of the Steane code. This circuit uses the same number of CNOTs as in (c), but has circuit depth seven, as opposed to 21 in (c). (c) Circuit to fault-tolerantly prepare the $|0_L\rangle$ state of the Steane code, using the weight-4 operator encoding circuit of Fig. 4.1. Two ancillas are needed if qubits can be measured and reset quickly.

4.3 Rates of residual weight-1 and logical errors after preparing $|0\rangle_L$ of the Steane code using postselection-based fault-tolerant circuits. We also plot the postselection yield. The circuits considered here are found in Fig. 1 of Ref. [Got16].
The method labeled ‘(d)’ replaces the $X_0 X_5 X_6$ from (c) with $X_2 X_4 X_5$. This improves the logical error rate at the cost of a higher weight-one error rate.

4.4 Rates of residual weight-1 and logical errors after preparing $|0\rangle_L$ of the Steane code using deterministic fault-tolerant circuits. We consider a method suggested by Aliferis and Reichardt in Ref. [Rei06], the flag-based circuits in Fig. 4.2, and a circuit to prepare the state by measuring X-type stabilizers. For physical error rates below $2 \times 10^{-2}$, the logical error rate of the best deterministic protocol is about five times worse than the postselection-based protocol.

4.5 Flag-based circuits to encode high-weight X-type operators fault-tolerantly to distance five and seven. Corrections are computed from the flag measurement outcomes using the techniques in Sec. 2.3. (a) A circuit to encode a weight-eight operator fault-tolerantly to distance seven. (b) A circuit to encode a weight-eight operator fault-tolerantly to distance five. (c) A circuit to encode a weight-seven operator fault-tolerantly to distance five.

4.6 Comparison of the rates of residual errors between the 69-qubit postselection-based protocol of Ref. [PR12] and a circuit that uses flags to fault-tolerantly encode the stabilizer operators of the state. The flag-based corrections are deterministic, allowing us to demonstrate the first state preparation circuit for the Golay code that is deterministic. The postselection method is very robust and works well even at high physical error rates. The deterministic method is useful only at physical error rates below $10^{-3}$.

5.1 Codes considered in this paper, with associated distance-four fault-tolerant Z or X stabilizer measurement sequences.
(The last three codes are self-dual CSS.) Time steps of parallel measurements are separated by “|”. (∗) For the surface code, fault-tolerant X and Z error correction is carried out using a rolling window of four syndromes, each measured in two time steps.

5.2 Planar layout of 16 data and 9 ancilla qubits, in black and red respectively. CNOT gates are allowed along the edges. Grey edges are required for the surface code, and green edges between ancillas are required for the new codes in this paper.

5.3 Summary of results. For short computations, the probability of a logical error in the distance-4 rejection-based surface code is approximately 25 times lower than that of the distance-5 variant. Further, for 6 logical qubits, the k = 6 code on one patch of 25 qubits can match 6 patches of the distance-5 surface code.

5.4 (a) A distance-4 stabilizer measurement circuit contains ancilla preparation, CNOTs, measurement and a recovery. (b) Rules for fault tolerance. One fault should be corrected to an error of X/Z weight at most one; this is sufficient for distance 3. Two faults should either be rejected (denoted by the red R) or result in an error of weight two.

5.5 (a) Circuit to measure a weight-four X stabilizer fault-tolerantly to distance four, satisfying the locality constraints in (b). The ±Z measurements are used to flag mid-circuit faults. Gates bunched together can be performed in parallel. (b) Two layouts for measuring stabilizers in the sequences of Fig. 5.1.

5.6 (a) Circuit to measure a weight-eight X stabilizer fault-tolerantly to distance four, satisfying the locality constraints in (b). One fault is corrected to at most a weight-one error, but two or more faults may either be corrected, or detected resulting in rejection.
The resulting flag outcomes for corrections and rejection are tabulated in Appendix .3. (b) Two layouts for measuring stabilizers in the sequences of Fig. 5.1.
5.7 Fault-tolerant error correction with the three-bit repetition code {000, 111} (adapted from Fig. 1 of Ref. [DR20]). It is not fault tolerant to correct errors based on the two parity measurements 1 ⊕ 2 and 1 ⊕ 3. An internal fault on bit 1 can be mistaken for an input error on bit 3, as they yield the same syndrome. Errors can be corrected fault-tolerantly by adding another parity check, 2 ⊕ 3. Now for up to one fault at any of the circled locations, an input error is corrected, and an internal fault leaves an output error of weight 0 or 1.
5.8 (a) A stabilizer measurement sequence (SMS) consists of multiple time steps of parallel stabilizer measurement circuits, where the end of a time step denotes the simultaneous measurement of all the ancilla qubits. (b) Rules for distance-4 fault tolerance (the first two are sufficient for distance 3): (i) An input 1-qubit error must be corrected. (ii) 1 internal fault must be corrected to an error of weight at most 1. (iii) A 2-qubit input error is rejected. (iv) 1 input error and 1 internal fault should be corrected to an error of weight at most 1 or rejected. (v) 2 internal faults must be rejected or propagate to an error of weight at most 2.
5.9 $O(p^3)$ scaling of X logical error rate and $O(p^2)$ scaling of rejection rate, with error bars, for the distance-four codes. The distance-three and distance-five surface codes are shown for comparison. The new codes have logical error rate per time step as low as 1/10th that of the distance-five surface code. The distance-four surface code is as low as 1/100.
5.10 Probability of X logical error (solid) and acceptance (dotted) for t time steps of error correction on six codes, as a function of physical error rate (row) and desired logical qubits (column). The three colored curves correspond to the k = 2, k = 4 and k = 6 codes and the three gray curves are the surface codes. The graphs for few time steps look like step functions because the code patches are checked for logical errors only after blocks of error correction, not time steps. The top row compares the number of physical qubits required to achieve the desired number of logical qubits.
5.11 CNOT depth at which each code has accumulated 1% probability of X logical error. In black, the depth is plotted for one unencoded qubit, with a rest error rate one-tenth of the CNOT error rate. Plots are shown for the k = 2 code in blue, k = 4 code in purple, k = 6 code in orange and the d = 4 surface code in grey. We assume the depth of ancilla qubit measurement is ten times the depth of a CNOT gate. The CNOT depth shown for the surface code is for 0.01% probability of X logical error. All data points shown for the postselection-equipped codes have acceptance > 5%.
5.12 A degree-four layout for flag-fault-tolerant error correction of the k = 2 code, using 43 of the 53 qubits on the Google Sycamore lattice. The stabilizer generators of the code are overlaid. Note that qubits have degree 4 only in the ancillas measuring the weight-8 stabilizer, but elsewhere the maximum qubit degree is 3. It may be possible for the k = 2 subsystem code with only weight-4 stabilizers and gauges to fit on a layout of maximum degree 3.
5.13 Syndrome graphs passed to the union-find decoder. All the vertices labeled b are identified as the same boundary vertex. (a) Graph of vertices representing X stabilizer measurement outcomes.
The numbers indicate the index of the data qubit Z correction for each edge of the graph. (b) Syndrome graph for Z stabilizers. (c) (2 + 1)-D syndrome graph, used for fault-tolerant decoding with circuit-level errors. We show four layers of syndromes in alternating colors red and black. Edges in the same syndrome layer are shown in black, with diagonal edges between layers shown in grey. For clarity, we do not show the vertical edges between syndrome vertices of the same stabilizer. Corrections are only applied for edges which have at least one vertex in the bottom two layers.
5.14 (a) Stabilizers of the $[[16, 6, 4]]$ code. The $[[16, 4, 2, 4]]$ code is derived by assigning two of the logical qubits as gauge qubits. (b) The two logical qubits chosen as gauges are the horizontal and vertical weight-four operators. By stabilizer equivalence, every column (row) of four qubits is a representation of the vertical (horizontal) gauge. These groups of data qubits are denoted by the letters a-h. (c) X and Z operators specifying the four logical qubits.
5.15 Implementation of quantum error correction with a $[[16, 4, 2, 4]]$ code on a square planar lattice of qubits. (a) Layout of qubits in Procedure 5. (b,c) Associated gauge operator measurement circuits.
5.16 Rejection rates (top two) and logical error rates (bottom three) per time step, when performing error correction with the k = 4 subsystem code. The noise model is the same as that in Fig. 5.9. Here, the probability estimates are compared with the distance-four and distance-five surface codes, with the distance-five surface code exhibiting a larger error probability than distance four.
5.17 Fault-tolerant versions of physical CZ and SWAP gates.
(a) Performing a SWAP between qubits A and B is not fault tolerant, even when broken into a sequence of three CNOTs. (c) It can be made fault tolerant by swapping through an ancilla qubit, C. (b) Similarly, a CZ is not fault tolerant, but can be made fault tolerant using an ancilla, as in (d).
6.1 General model of Pauli-based computation. A quantum algorithm can be written as a sequence of multi-qubit Pauli measurements which perform both Clifford and non-Clifford operations (here we show the implementation of non-Clifford gates). In general the multi-qubit Pauli operations can be ordered into sets of commuting Pauli operators, where Clifford corrections can be conjugated to the end of each set. Such sets are called parallelizable Pauli (PP) sets. A logical T gate (which is non-Clifford and forms a universal gate set when combined with Clifford operations) can be implemented via a multi-qubit Pauli measurement acting on a set of data qubits and an ancillary magic state $|T\rangle = (|0\rangle + e^{i\pi/4}|1\rangle)/\sqrt{2}$.
6.2 (a) Old protocol for temporally encoded lattice surgery of a PP set of size 2, where $P_1P_2$ is a redundant measurement which is used to detect failures in the measurements of $P_1$ or $P_2$. If the measurement results of the three multi-qubit Pauli operators are inconsistent, the original multi-qubit Pauli operators $P_1$ and $P_2$ are measured again. Blue boxes correspond to multi-qubit Pauli measurements, and blue triangles correspond to logical single-qubit measurements. Orange boxes correspond to Clifford corrections. (b) New protocol for temporally encoded lattice surgery of a PP set of size 2. The operators $P_1$, $P_2$ and $P_1P_2$ are repeatedly measured until no logical timelike failures are detected. In Section 6.2.1 we show that such a scheme results in smaller average runtimes for the implementation of a PP set.
Orange boxes denote Clifford corrections that result from applying non-Clifford gates.
6.3 Lattice surgery implementation of an X ⊗ X measurement between two logical qubits encoded in $d_x = 3$, $d_z = 5$ surface code patches. Note that X (Z) stabilizers are represented by red (blue) plaquettes. Prior to measuring X ⊗ X, yellow data qubits in the routing region are prepared in the |0⟩ state. The X ⊗ X measurement outcome is then obtained by measuring the X-stabilizers (shown with white ancillas) in the routing space. The stabilizers of the merged surface code patch are measured for $d_m$ syndrome measurement rounds in order to correct timelike failures which can occur in the first round of the merge resulting in the wrong parity of X ⊗ X. In the first syndrome measurement round of the merged patch, the individual measurement outcomes of X stabilizers in the routing space region are random, but their product gives the result of the X ⊗ X measurement outcome. At the end of the $d_m$ syndrome measurement rounds, the data qubits in the routing space are measured in the Z basis. Since measurement and reset of qubits typically takes a much longer time than the implementation of the physical CNOT gates used to measure the stabilizers, in this work we assume that the qubits in the routing space used to measure X ⊗ X are only available one syndrome measurement round after the split. Hence the merge/split operation takes a total of $d_m + 1$ syndrome measurement rounds.
6.4 The best average runtime per Pauli (in units of syndrome measurement rounds) for all classical codes considered in this work when using the TELS protocol of Section 6.2 for PP sets of size $k \in \{2, 3, \ldots, 100\}$ at $p = 10^{-3}$, $\delta = 10^{-10}$ and with maximum routing space $A = 100$. For example, for PP sets of size k = 20, a distance-5 BCH code achieves the lowest average runtime per Pauli among all the classical codes considered. To calculate $p_L$, we used Eq.
(6.12), with $p_m$ given by Eq. (6.1) and $d_m$ chosen to minimize the runtime while keeping $p_L < \delta$. We compare the results of TELS with un-encoded lattice surgery, which is shown here to take 14 syndrome measurement rounds per Pauli. The legend labels correspond to codes from Table 6.2. Low distance codes perform better for small values of k, whereas for larger values of k, the high rate of larger-distance codes enables smaller measurement distances.
7.1 Circuit used in a 15-to-1 magic state distillation protocol expressed as a sequence of multi-qubit non-Clifford gates. The circuit above is the Hadamard-transformed version of Fig. 15 of Ref. [Lit19a], which produces the state H|T⟩. This state can be used in the same way as the regular magic state, with only a change of Pauli basis while measuring.
7.2 (a) Circuit for performing a π/8 multi-qubit Pauli measurement. The circuit requires a $|T_X\rangle = H|T\rangle$ resource state, and a Clifford correction may be required depending on the P ⊗ X measurement outcome. (b) Circuit for performing a Clifford gate using an ancilla prepared in |0⟩. Both circuits are adapted from Ref. [Lit19a].
7.3 Circuit gadget for an auto-corrected non-Clifford gate. The circuit does not require the application of conditional Clifford gates to the logical data qubits. However, an extra ancilla prepared in |0⟩ is required.
7.4 Layouts of logical qubits for TELS-assisted 15-to-1 state distillation. Data qubits are placed in blue cells. Magic states are in pink cells, where cells with a radial shading are extra cells used to prepare new magic states in parallel with the Pauli measurements. |0⟩ ancillas for autocorrected gadgets are placed in the brown cells adjacent to the yellow squares used for twists.
Green cells are used to store distilled magic states for use by the core while the next round of distillation occurs. Additional green cells may be required if a distillation tile produces magic states faster than the core consumes them (alternatively, the magic states can be transported to additional tiles surrounding the core). Routing regions between cells are split into grey and blue to show that the relevant lattice surgery operations will not clash. (a) Layout for un-encoded lattice surgery using autocorrected non-Clifford gate gadgets of Fig. 7.3. The grey routing region handles the X ⊗ Y measurements and the blue routing region performs X-boundary measurements between different logical qubits. (b) Layout for 15-to-1 distillation with TELS, using the [12, 11, 2] Single Error Detect code. Note that we only need one radial pink cell. However, given the geometry of the entire tile, we use the remaining space for another pink radial tile. (c) Layout using the [15, 11, 3] BCH code, and (d) using the [23, 12, 7] Golay code.
7.5 Layouts of logical qubits for parallelized TELS-based 15-to-1 state distillation protocols. The meaning of each color is described in the caption of Fig. 7.4. (a) Layout for un-encoded lattice surgery, with two routing regions, each accessing one X boundary of the logical qubits. Each routing region has access to a separate magic state and a |0⟩ ancilla used in the circuit of Fig. 7.3. (b) Layout for distillation with TELS, using the [12, 11, 2] Single Error Detect code. Three magic state tiles are held in memory for each pair of parallel Pauli measurements. Then two are discarded and two prepared magic states on other pink cells are used in the following round. (c) Layout and routing region used for the final multi-qubit Pauli measurements in the Clifford frame distillation protocol. (d) Layout for parallelized distillation with TELS, using the [15, 11, 3] BCH code.
(e) Layout for parallelized distillation with TELS using the [23, 12, 7] Golay code. For (d) and (e), the layouts used to perform the final multi-qubit Pauli operations required by the Clifford frame can be found in a way analogous to going from (b) to (c).
7.6 (a) On the layout of Fig. 7.5b, we show how two separate routing spaces can be used to perform parallel lattice surgery measurements. The logical measurements are $X_1 \otimes X_2 \otimes X_3 \otimes X_{T_X,1}$ (in the equatorial routing space) and $X_3 \otimes X_4 \otimes X_{T_X,1} \otimes X_{T_X,2}$ (in the circumferential routing space). These are the first and second measurements respectively when performing TELS-assisted distillation using the [12, 11, 2] SED code (see Eq. (17) of Appendix .5 for the codeword generator matrix). Alternatively, they correspond to the first measurement and the product of the first and second measurements from Fig. 7.1. The logical patches have code distances $d_x = 3$, $d_z = 5$. X stabilizers are in red, and Z stabilizers are in blue. The product of the X stabilizers indicated by white vertices gives the parity for the multi-qubit Pauli measurement outcomes. (b) On the same layout, we show how to perform Pauli measurements which are tensor products of X, Y, or Z on the data qubits. These measurements are performed after the non-Clifford gates of the distillation protocol. The example in the figure measures X ⊗ Y ⊗ X ⊗ X ⊗ Y on the five data qubits. The yellow stabilizers are twist defects that are used to access Y boundaries of logical qubits that are originally defined with only X and Z boundaries, using the techniques shown in Ref. [CC22a]. Note that the size of the routing space area separating the top and bottom rows of data qubits is taken to be large enough to allow for Y measurements requiring twists.
7.7 Time dynamics of a TELS-assisted distillation factory.
Here we consider a 15-to-1 distillation protocol with the [12, 11, 2] code protecting temporally encoded lattice surgery of the 11 non-Clifford gates, followed by the 4 Pauli measurements protected with the [5, 4, 2] code. (a) The first part of the circuit consists of the Pauli measurements corresponding to the non-Clifford gates. We assume that the non-Clifford measurement results yield 110000000011 and the eleven $|T_X\rangle$ state measurements yield 10000000001. The Clifford corrections, derived from Fig. 7.2a, are then conjugated through the final Pauli measurements. (b) Sequence of lattice surgery measurements when both TELS-assisted protocols are combined.
7.8 Round robin scheduling of deterministic-time distillation tiles in a factory. This scheduling method allows minimizing core wait time if a distillation tile rejects. The horizontal axis is time and the labels above are timestamps. Timestamp $T_j^{(i)}$ indicates the end of the jth distillation tile while accumulating magic states for the ith PP set executed in the core. A * indicates the start of the distillation tile for the first time. In this example, there are D = 3 distillation tiles, each producing l = 2 distilled magic states in time $T_m$, for a core that executes PP sets of size k = 8.
9 Two-error-detecting fault-tolerant circuit for the preparation of a weight-12 cat state. The state is only accepted when all flag qubits are measured as 0. Note that with fast reset, only one ancilla qubit is required.
10 (a) Logarithmic-depth preparation of an eight-qubit cat state shows there are six possible locations for X faults that create errors of weight at least two. Parity checks need to be chosen to find corrections that leave the cat state with error of weight less than two. (b) The circuit on the left can be represented as a graph, where a CNOT gate is represented by the splitting of an edge.
11 (a) Distance-4 fault-tolerant circuit for measuring a weight-8 stabilizer on a square lattice layout, as arranged in (b).
12 We show the classical codes achieving the lowest average runtime per Pauli for $k \in \{2, 3, \ldots, 100\}$ at $p = 10^{-3}$ and for (a) $\delta = 10^{-15}$, (b) $\delta = 10^{-20}$ and (c) $\delta = 10^{-25}$. We set the routing space area $A = 100$.
13 We show the classical codes achieving the lowest average runtime per Pauli for $k \in \{2, 3, \ldots, 100\}$ at $p = 10^{-4}$ and for (a) $\delta = 10^{-15}$ and (b) $\delta = 10^{-20}$. We set the routing space area $A = 100$.
14 Layouts of logical qubits for lattice-surgery-based 125-to-3 magic state distillation. Cell color legend in caption of Fig. 7.4. (a) Layout for a distillation tile without temporally encoded lattice surgery measurements, using an auto-corrected non-Clifford gate gadget. The blue routing space allows Pauli X-type measurements as there is access to the X boundaries of all the cells. The gray routing region allows access to a |0⟩ ancilla with Y boundary access. This region contains hardware to execute a non-Clifford gate gadget. (b) Layout for a distillation tile that performs Pauli measurements two at a time, without temporal encoding. The long blue routing space performs one set of measurements with the auto-corrected non-Clifford gadget hardware at the left, and the large grey routing space uses the gadget hardware on the right. (c) Layout for a distillation tile performing temporally encoded lattice surgery with the [127, 106, 7] BCH code. Non-Clifford gates are performed two at a time, using separate routing spaces shown in gray and blue. (d) Layout for a TELS-assisted distillation tile using the [127, 99, 9] BCH code, performing non-Clifford gates two at a time.
15 Layouts of logical qubits for lattice-surgery-based 116-to-12 magic state distillation. Cell color legend in caption of Fig. 7.4. (a) Layout for a distillation tile without temporally encoded lattice surgery measurements, using an auto-corrected non-Clifford gate gadget. (b) Layout for a distillation tile that performs Pauli measurements two at a time, without temporal encoding. (c) Layout for a distillation tile performing temporally encoded lattice surgery with the [129, 114, 6] Zetterberg code. Non-Clifford gates are performed two at a time, using separate routing spaces shown in gray and blue. (d) Layout for a TELS-assisted distillation tile using the [127, 99, 9] BCH code, performing non-Clifford gates two at a time.
16 Layouts of logical qubits for lattice-surgery-based 114-to-14 magic state distillation with a $[[114, 14, 3]]$ quantum code. Cell color legend in caption of Fig. 7.4. (a) Layout for a distillation tile without temporally encoded lattice surgery measurements, using an auto-corrected non-Clifford gate gadget. (b) Layout for a distillation tile that performs Pauli measurements two at a time, without temporal encoding. (c) Layout for a distillation tile performing temporally encoded lattice surgery with the [129, 114, 6] Zetterberg code. Non-Clifford gates are performed two at a time, using separate routing spaces shown in gray and blue. (d) Layout for a TELS-assisted distillation tile using the [127, 106, 7] BCH code, performing non-Clifford gates two at a time.

Abstract

Quantum computation holds the promise of solving certain complex problems exponentially faster than classical computers. However, the high levels of noise prevalent in current quantum devices impede the accurate execution of even basic quantum algorithms.
This can be remedied by protecting quantum information with a quantum error-correcting code, in which the logical information of an algorithmic qubit is spread across multiple physical qubits. Individual quantum errors are then located and corrected by the fault-tolerant measurement of multi-qubit stabilizer operators (parity checks). Unfortunately, error correction and fault tolerance both impose large demands on the qubit overhead: hundreds to thousands of physical qubits per logical qubit. In this thesis, we reduce the qubit and time cost of fault tolerance by redesigning key building blocks of an error-corrected quantum computer. First, we develop a combinatorial proof with flag fault tolerance that exponentially reduces the number of qubits needed to measure a stabilizer of any size, while tolerating one fault. We then leverage the combinatorial proofs to develop fault-tolerant circuits to prepare cat states deterministically with only one ancillary qubit. These results then enable the construction of few-qubit fault-tolerant circuits for the preparation of complex encoded states with 100% yield. Next, we optimize the overhead of error correction on a planar 25-qubit layout. We show with extensive simulations that a distance-four code encoding six logical qubits protects information as well as the distance-five surface code, using one-tenth as many physical qubits. Finally, we optimize the time overhead of logical gates in surface code quantum computers. For computations executed via lattice surgery measurements of multi-qubit Pauli operators, we show that protecting measurement results with a classical code cuts computation time by a factor of two to six. Our hardware-agnostic optimizations of the space and time costs of fault tolerance thus suggest new routes to advance the timeline of error-free quantum computing.

Chapter 1
Introduction

1.1 What is a fault-tolerant quantum computer?
The study of the complexity of classical algorithms informs us that some problems cannot be solved by computers in reasonable timeframes. Quantum computers can solve some of these problems quickly, and the construction of these devices will bring profound technological changes in the upcoming decades. Some practical applications for which quantum algorithms currently exist include molecular simulation for drug discovery and material design [BGM+19, LBG+21, SBW+21, RBK+23], algorithms that break public-key cryptography [GE21] and simulations of quantum physics [BGB+18, CMN+18]. In Fig. 1.1, we show the number of qubits and computation depth needed for some of these algorithms. Quite surprisingly, public-key cryptography can be rendered obsolete with a 3000-qubit quantum computer [Lit23]. Experiments with current quantum devices show that there are hints of quantum advantage around the corner [AAB+19, KEA+23]. However, the execution of the algorithms in Fig. 1.1 requires the ability to run very deep computations. This is not possible with current quantum devices. Minor control inaccuracies and the inevitable interaction between a quantum system and its environment corrupt quantum states, rendering computations inaccurate. For example, an algorithm requiring $N$ gates executed with an output accuracy of 1% requires gate error rates below $\frac{1}{100N}$. For $N = 10^6$, we must engineer gate accuracy to be five orders of magnitude better than what is available today [AAA+23]. The current state of development of quantum computers is reminiscent of the age when classical computers were built with vacuum tubes and mechanical relays. These devices suffered from high failure rates and needed technicians to constantly replace components.
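The gate-error bound of 1/(100N) quoted above can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes independent gate failures and a union bound, and it takes a present-day gate error rate of 10^-3 as an assumption (the value used in the caption of Fig. 1.1), not as a measured figure. The function and variable names are hypothetical.

```python
import math


def max_gate_error(num_gates, target_failure=0.01):
    """Largest per-gate error rate p for which the union bound N * p
    stays within the target failure probability."""
    return target_failure / num_gates


def failure_probability(num_gates, gate_error):
    """Probability that at least one of N independent gates fails."""
    return 1.0 - (1.0 - gate_error) ** num_gates


N = 10**6
p_required = max_gate_error(N)  # 1 / (100 N)

# Assumption: today's physical gate error rate is about 1e-3.
ASSUMED_CURRENT_ERROR = 1e-3
orders_of_magnitude = round(math.log10(ASSUMED_CURRENT_ERROR / p_required))

print(p_required)
print(failure_probability(N, p_required))
print(orders_of_magnitude)
```

Under these assumptions the required gate error rate is 10^-8, which sits five orders of magnitude below the assumed 10^-3, and the exact failure probability stays just under the 1% target.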
With the onset of the transistor, failures decreased. The quantum equivalent of the transistor has not yet been discovered. This is the case despite decades of research developing quantum devices with different platforms, such as superconductors, photons, ion traps, neutral atoms, semiconductor quantum dots, nuclear magnetic resonance qubits and nitrogen-vacancy centers in diamonds. Useful applications providing quantum advantage may only be possible with more robust qubits.

[Figure 1.1 (image omitted). Panels: Chemistry simulations, Cryptography, Condensed matter physics. Axes: T-count, logical qubits, and surface code distance d (physical qubits = $2d^2$). Labeled algorithms: Li-ion, FeMoco, fertilizer, Fusion, RSA-2048, RSA-3072, ECC-256, ECC-384, Fermi-Hubbard, spin systems.]
Figure 1.1: Qubit and gate counts (T) needed for academic and commercial quantum use [SWM+24]. Quantum algorithms decomposed into a sequence of Clifford and T gates assume only the latter is computationally expensive. To ensure reliable results for large and long computations, qubits are protected by error-correcting codes. When using the surface code, we show the minimum code distance needed to achieve an algorithmic accuracy of 1%, with a physical gate error rate of $10^{-3}$. We make modest assumptions about the routing space and ignore magic state distillation factories.

Quantum error correction paves a route to robustness [Sho95, Got97, Ter15]. Here, the quantum information taking part in a logical algorithm is encoded into a quantum error-correcting code on a large number of physical qubits. Any accumulated noise is removed using carefully designed error diagnosis and correction protocols. Using this technique, we can achieve arbitrary levels of accuracy. Protocols for error correction can locate and remove errors. However, these protocols are executed using faulty components and control sequences.
Errors occurring during the protocol thus corrupt ideal execution, potentially introducing more errors into the system. This can be problematic for very long computations. Hence, to run deep quantum algorithms on hundreds of thousands of qubits, we will require fault-tolerant implementations of the fundamental processes of a quantum computer.

Theoretical models for fault-tolerant quantum computing

Shor and Steane suggested the first techniques for fault-tolerant error correction. Shor’s scheme [Sho96], which is applicable to a large class of codes called stabilizer codes, relies on a slow error diagnosis process where subsets of qubits are checked sequentially. A highly parallelized version of this process applied on topological codes yields very high fidelities. We provide more information on these codes later. Steane’s scheme [Ste97], applicable to a subset of stabilizer codes called CSS codes [Ste96b], determines error locations by copying errors onto an analogous state in one step. The catch is that the fault-tolerant preparation of this resource state is quite complex [PR12]. Clever choices of early quantum error-correcting codes permitted some elementary logical gates to be transversal¹ [Ste96a]. Further, magic state protocols enable quantum universality [BK05]. For arbitrarily accurate computation, quantum codes can be concatenated. This is a process wherein encoded qubits are further encoded using another quantum error-correcting code to polynomially improve the protection. Early proposals for planar devices using concatenation for arbitrary accuracy required unattainably-high-fidelity physical operations [Got00, STD05, HGFW06, SDT07].

¹ An encoded gate performed by executing the same physical operation on all qubits in parallel.

Table 1.1: Overhead estimates for different fault-tolerant algorithm implementations.
Reference | Overheads
[GE21] | 20 million qubits in 8 hours to break RSA-2048
[GS21a] | 13436 qubits in 177 days to break RSA-2048
[Lit23] | 6000 resource state generators at 580 MHz with logarithmic non-local connections in 58 seconds to break ECC-256
[GRLR+23] | 126133 cat qubits in 9 hours to break 256-bit ECC
[MGM20] | 8 GB QRAM query in 2 ms with $10^{15}$ qubits or 2.4 years with 300000 qubits

The above results represent the first era in fault-tolerant quantum error correction. The second era began with the connection of ideas in topological quantum field theories to quantum error correction [Kit97]. The surface code [FMMC12], first explored in the early 2000s, has been comprehensively investigated over the last two decades due to its potential for large-scale quantum computation. This attention has unveiled numerous advantages and opportunities, along with some pitfalls and challenges. A compelling advantage is its high fault tolerance threshold, permitting arbitrarily accurate computation at higher error rates than concatenated protocols. Within these families of codes, choosing larger codes guarantees arbitrarily high accuracy below a fault tolerance threshold. The surface code exhibits some weaknesses from the perspective of quantum coding theory. The associated qubit overhead for information storage and performing quantum logic is prohibitively large [GE21, Lit19a]. In Table 1.1, we show some estimates for the physical resources required to implement different algorithms in Fig. 1.1. Additionally, the classical computing power needed to keep up with the error decoding problem can be substantial, even in moderate-sized implementations. Recent developments simplify the error decoding problem by partitioning the decoder and decoding multiple chunks in parallel [TZC+22, SBB+22a]. However, with large devices, bottlenecks will be encountered in the global control and heat dissipation of multiple fast parallel decoders.
Different codes and fault tolerance techniques may therefore need to be studied.

Experiments on quantum error correction

Despite the theoretical advancements outlined in the previous subsection, experimental quantum error correction is still nascent. Drastic improvements to device manufacturing and control optimization have permitted device error rates that are low enough to see improvements from fault-tolerant error correction protocols. In Fig. 1.2 and Table 1.2, we highlight some of the most recent experiments on quantum error correction conducted on superconducting, ion trap and neutral atom quantum computers. Superconducting and ion trap technologies have matured from industrial funding, while neutral atom computers are still gaining traction within academia. Currently, small-scale fault-tolerant implementations of the fundamental building blocks of a computation have been executed on all three platforms. What remains to be seen is whether these systems are able to retain low gate error rates as they are scaled up in size. As larger systems become more prevalent, more complex error correction protocols will be executed. Potent classical computers will be needed to keep up with decoding for error correction, and developments will need to be made to operate these circuits in cryogenic environments.

[Figure 1.2 (image omitted): timeline from 2022 to 2024 of error correction experiments on superconducting, ion trap and neutral atom platforms, covering surface, bosonic, Bacon-Shor, Steane, [[5, 1, 3]] and [[8, 3, 2]] codes.]
Figure 1.2: Progress of experimental demonstrations of quantum error correction on superconducting, ion trap, and neutral atom systems. Currently the objective is to benchmark error correction primitives with small codes. In the next few years, devices will become larger while maintaining low noise levels. Neutral atom systems can control many qubits and maintain coherence, but lack repeated real-time error correction. Superconducting and ion trap qubits have demonstrated all the primitives for fault tolerance, and now face the engineering challenge of scaling up system sizes.

Table 1.2: References for the error correction experiments in Fig. 1.2.
Platform | Published date | Reference
Superconducting qubits | Dec 16 2021 | [MVM+22]
Superconducting qubits | May 25 2022 | [KLR+22]
Superconducting qubits | Feb 22 2023 | [AAA+23]
Superconducting qubits | March 22 2023 | [SER+23]
Ion trap | Oct 04 2021 | [EDN+21]
Ion trap | May 25 2022 | [PHP+22]
Ion trap | Aug 3 2022 | [RABA+22]
Neutral atoms | April 20 2022 | [BLS+22]
Neutral atoms | Dec 06 2023 | [BEG+24]

1.2 Redesigns of fault-tolerant building blocks

Ideally, a quantum computation proceeds by preparing a quantum state, operating on it, and finally measuring it to observe an outcome. When this model is generalized to an error-corrected fault-tolerant quantum computation, the building blocks of the computation are slightly altered. We show a layered architecture to construct large-scale fault-tolerant quantum machines in Fig. 1.3 [JVMF+12].
This hierarchy abstracts the instructions for a quantum algorithm into five levels, ranging from high-level algorithmic operations to low-level physical gates on the device. Crucially, the gap between the algorithmic and physical layers is bridged by error correction, acting as a vital buffer against the ubiquitous noise. With an appropriate choice of quantum error-correcting code, the building blocks of a quantum computation must be executed with fault-tolerant physical circuits. These building blocks include state preparation, error correction, and logical gates, with stabilizer measurement serving as a foundational subroutine. Note that we do not include measurement as a building block. We include fault-tolerant measurements of logical qubits under the bracket of logical gates, as in many models of quantum computation, they are used to perform logical operations on quantum states [CC22b, BBD+09]. In Fig. 1.5, we show a summary of the important results in this thesis. For stabilizer measurement, state preparation, and error correction, we show improved qubit overheads. Alternatively, we show an improvement in the execution time for fault-tolerant error correction protocols and logical gate sequences.

Tactics for fault tolerance

Errors affecting quantum computers are typically modeled as continuous processes [ADL01]. However, continuous noise is challenging to analyze from a fault tolerance perspective. Discrete noise models are simpler. Here, faults are modeled as the probabilistic injection of discrete errors at specific circuit locations. This is a valid approach since stabilizer measurements during error correction discretize continuous errors on individual qubits. With this model of faults, we can analyze the effects of introduced errors. A fault occurring midway through an error correction protocol corrupts subsequent stabilizer measurements, misdiagnosing existing errors, which eventually leads to the application of corrections on the wrong qubits.
The resulting effect is an increase in the number of errors. Additionally, errors can spread when qubits interact. Hence, the main objective of fault tolerance is to curb the spread of errors. Faults affect the operation of the building blocks in different ways, hence we must employ different models to solve each fault tolerance problem. For stabilizer measurement and state preparation by operator encoding, the concern is that faults occurring partway through a circuit may eventually spread to errors on many qubits. In these cases we employ flag fault tolerance to curb error spread.

Figure 1.3: Layered architecture of a fault-tolerant quantum computer [JVMF+12]. The functioning of a quantum computer can be decomposed into a five-layer vertical stack, each handling a different level of abstracted instructions. From hardware to software, they are the physical pulses, error mitigation strategies, quantum error correction procedures, substrates for quantum logic and the programs of quantum algorithms. Given a quantum code, the fundamental fault-tolerant operations that underpin the error correction layer are state preparation, error correction and logical gates, with stabilizer measurement as a foundational subroutine. Faults affect each building block differently. Flag-based fault tolerance curbs error spread due to physical two-qubit gates. Higher up the stack, faults affect measurement results, requiring fault tolerance tactics against corrupted measurements.
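The error spread that flags are meant to catch can be illustrated with a small sketch. It assumes a bare, non-fault-tolerant measurement of a weight-w X stabilizer in which a single |+⟩ ancilla is the control of w sequential CNOTs onto the data; the function names are hypothetical and chosen only for this illustration. It uses the standard rule that an X error on a CNOT control copies onto the target, and it reduces the resulting data error modulo the measured stabilizer.

```python
def spread_of_ancilla_x_fault(w, j):
    """Data qubits hit when an X fault strikes the ancilla after j of w CNOTs.

    Assumption: one |+> ancilla is the control of w sequential CNOTs onto data
    qubits 0..w-1. An X on a CNOT control copies onto the target, so every
    data qubit touched by a later CNOT picks up an X error.
    """
    return list(range(j, w))


def effective_weight(w, j):
    """Weight of the spread error after multiplying by the stabilizer X^(tensor w).

    An X error on s data qubits is equivalent to an X error on the other
    w - s qubits, so the smaller of the two counts is what matters.
    """
    s = len(spread_of_ancilla_x_fault(w, j))
    return min(s, w - s)


w = 10
for j in range(w + 1):
    print(j, spread_of_ancilla_x_fault(w, j), effective_weight(w, j))
```

Under these assumptions a single fault midway through a weight-10 measurement leaves an error of effective weight five on the data, which is why an unprotected circuit of this shape is not fault-tolerant.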
[Figure 1.4 (image not reproduced): three panels titled "State preparation by stabilizer measurement", "Error correction by stabilizer measurement" and "Logical gates by multi-qubit Pauli measurements", under the header "Distance-d models to tolerate faults that corrupt measurement results".]

Figure 1.4: Three different models of fault tolerance against faults that corrupt measurement results. The results of stabilizer measurement in state preparation and error correction are used to apply Pauli corrections. Similarly, results of multi-qubit Pauli measurements are used to apply Pauli or Clifford corrections. We show fault-tolerant circuits that are deterministic (no postselection). These models are distance-d fault-tolerant, for d = 2t + 1. A model is fault-tolerant if for k ≤ t faults, the weight of the output error after corrections is at most k. State preparation is the hardest fault tolerance problem as we must ensure tolerance against t input errors and also up to t faults in the execution of the measurements. Error correction is simpler as the total number of input errors and internal faults is constrained to be at most t. Finally, measurement faults in multi-qubit Pauli measurements can be corrected by protecting results with a classical error-correcting code.

Here, circuits are engineered such that errors on code qubits spread onto a set of flag qubits, which are then measured to determine if a fault occurred. Based on the flag measurement outcomes, corrections can be applied to reduce error spread. Early theoretical and current experimental results on flags rely on postselection-style fault tolerance [RABA+22, CR18a]. Recently, there has been a push to make these protocols deterministic to permit repeated circuit execution for extended periods of time [DA07, Ste14, YK17, CR18b]. In this thesis, we consider deterministic flag protocols in Chap. 2, Chap. 3 and Chap. 4.
Additionally, we consider some circuits that combine postselection with correction in Chap. 5. We highlight the building blocks using flags for fault tolerance in Fig. 1.3.

Flags prevent the spread of errors due to qubit interactions, but these errors can still corrupt measurement results. A noisy error correction procedure may erroneously apply corrections at wrong locations. This can be prevented by ensuring the measured data is protected from errors. Naively, it may seem like encoding the measurement results in a classical error-correcting code may offer some protection. In fact, this is the solution for a model where faults only affect measurement outcomes. In Fig. 1.4, we outline three models of fault tolerance against corrupted measurements. The model for fault-tolerant error correction with stabilizer measurements is historically important as threshold results are derived using it [AGP06]. The model for state preparation is more difficult to find solutions for as more errors must be tolerated. For these models, faults on qubits leave residual errors while also corrupting the measurement result. In the model for logical gates, the qubits are assumed to be robust, hence we only consider faults affecting measurements.

Consider the model for error correction. The protocol is deemed fault-tolerant to t errors if for t1 input errors and t2 internal faults (faults during the protocol), the output has at most t2 errors when t1 + t2 ≤ t [DR20]. In the absence of faults, the protocol must correct errors up to the capacity of the code. If there are no input errors and only faults occur, this model ensures that "errors don't spread". We use this model in Chap. 5 to make fault-tolerant error correction protocols for distance-four quantum codes.

The model for state preparation requires tolerance to more errors. When preparing states by stabilizer measurement, projection into the desired state is not guaranteed.
Instead, the results of the initial set of measurements are random (due to anticommuting operators), leading to a projection that leaves residual errors on the qubits. This is why we require tolerance to at least t input errors. In the absence of faults, the stabilizer measurement results are decoded to diagnose the exact error space, prompting a correction into the desired state. When faults occur, erroneous corrections can project the state into a different error space. As long as the number of residual errors after the completion of the state preparation protocol scales with the number of faults, and fault spread is curbed in stabilizer measurement, this protocol is fault-tolerant. This model is used to prepare cat states and stabilizer states in Chap. 3 and Chap. 4.

Finally, we consider a model where the qubits are assumed to be noise free, and faults only affect measurement results. This is the case when considering qubits that are arbitrarily well-protected by error correction, but still suffer from noisy quantum logic. In this model, the measurements are affected by bit flip errors on the results. Since there are no residual quantum errors on the data, we use a classical error-correcting code to protect the measurement results. This approach is only applicable when the set of measurements to be performed commute. We use this model to speed up quantum logic operations in Chap. 6 and Chap. 7.

[Figure 1.5 (image not reproduced): four panels labeled Stabilizer measurement, State preparation, Error correction and Logical gates. Recoverable annotations: Shor's non-deterministic scheme uses w + 1 ancillas, while the deterministic flag scheme uses at most ⌈log2 w⌉ + 1 ancillas with slow reset and at most 4 with fast reset; stabilizer state preparation costs (qubits, gates) are Steane 8, 11 with postselection and 9, 21 deterministic, Golay 69, 297 with postselection and 104, 419 deterministic; for short computations the logical error probability of the k = 6 code is approximately that of the d = 5 surface code, and that of the k = 2 code is approximately half of it; distillation space-time costs are 150K without TELS and 117K with TELS for 15-to-1, and 1050K without TELS and 823K with TELS for 116-to-12 (with [12, 11, 2] for TELS).]

Figure 1.5: Overview of the thesis. New results are indicated by a ⋆. (Top left) The flag scheme needs ∼ log2 w ancilla qubits to deterministically measure a stabilizer while tolerating one fault, exponentially fewer than Shor’s probabilistic scheme. (Top right) First overhead estimates on deterministic state preparation. One-shot fault-tolerant stabilizer state preparation is optimized with either flags or tolerance to corrupted measurements. (Bottom left) Qubit-efficient distance-four codes protect information as well as the distance-five surface code, assisted by postselection. The qubit layout is an augmented surface code layout. Flags prevent error spread in stabilizer measurement, and a clever selection of stabilizers can tolerate corrupted measurements. (Bottom right) Parallelizable Pauli measurements run faster by pre-encoding measurement results into a classical code (Temporal Encoding of Lattice Surgery). This can reduce magic state distillation costs.

Distance-three stabilizer measurement

Stabilizer measurement is the most fundamental and repeated operation in a fault-tolerant quantum computer. Not only is it the backbone of Shor-type error correction schemes [Sho96], it is also used exclusively for operations with the surface code [FMMC12], and has a similar circuit design to operator encoding circuits, allowing fault tolerance ideas to be shared between the two.
Steane-style error correction [Ste97] does not involve the measurement of individual stabilizers, but the circuits used to prepare the complex ancilla states for Steane error correction require either operator encoding circuits or stabilizer measurement itself, as we show in Chap. 4.

Due to its prevalence, optimizing the overhead of fault-tolerant stabilizer measurement is paramount. Previous methods exhibit overheads that scale linearly in the size of the measured stabilizer [DA07, Ste14, YK17, CR18b]. We show logarithmic overhead to measure a stabilizer while tolerating one fault. In Chap. 2, we demonstrate that distance-three fault-tolerant measurement of a weight-w stabilizer needs at most ⌈log2 w⌉ + 1 ancilla qubits, with non-local connectivity. If qubits reset quickly, four ancillas suffice. These improvements arise from mapping flag qubit measurement outcomes to the vertices of a hypercube. With a flag qubits, previous methods use O(a) flag patterns to identify faults. We use the same flag qubits more efficiently by constructing maximal-length paths through the a-dimensional hypercube to use nearly all 2^a possible flag patterns. Finally, we extend this technique to distance-five and -seven stabilizer measurement, where we demonstrate overhead that scales as ∼ w/2 and ∼ w.

Deterministic state preparation

The combinatorial approach of constructing paths in a hypercube can also be extended to fault-tolerant cat state preparation. An even more efficient method is discussed in Chap. 3, where w-qubit cat states are prepared with one ancilla qubit measured O(log2 w) times. Asymptotic upper bounds on the qubit overhead in different scenarios are tabulated in Table 3.1. While these methods primarily focus on checking a cat state after it has been prepared noisily, we also explored an approach that prepares cat states by measuring stabilizers. This method also requires one ancilla, but performs w + O(log2 w) measurements.
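The hypercube picture behind the logarithmic ancilla count for stabilizer measurement can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: it uses the reflected Gray code as a stand-in for a maximal-length path through the a-dimensional hypercube (the actual path construction of Chap. 2 is not reproduced here), and all function names are hypothetical. It also evaluates the ⌈log2 w⌉ + 1 bound quoted above.

```python
def ancilla_bound(w):
    """Upper bound ceil(log2 w) + 1 on ancillas for a weight-w stabilizer (w >= 1)."""
    # (w - 1).bit_length() equals ceil(log2 w) without floating-point rounding.
    return (w - 1).bit_length() + 1


def gray_path(a):
    """Reflected Gray code: visits all 2**a hypercube vertices, flipping one bit per step.

    Assumption: used here only as a familiar example of a long hypercube path.
    """
    return [format(i ^ (i >> 1), "0{}b".format(a)) for i in range(2 ** a)]


def is_hypercube_path(patterns):
    """True if consecutive flag patterns differ in exactly one position."""
    return all(
        sum(x != y for x, y in zip(p, q)) == 1
        for p, q in zip(patterns, patterns[1:])
    )


def flipped_flags(patterns):
    """1-indexed flag position that changes at each step of the path."""
    return [
        next(i + 1 for i, (x, y) in enumerate(zip(p, q)) if x != y)
        for p, q in zip(patterns, patterns[1:])
    ]


print(ancilla_bound(10))
print(gray_path(3), is_hypercube_path(gray_path(3)))
```

With a flag qubits, a path of this kind passes through on the order of 2^a distinct patterns rather than O(a), which is the intuition for needing only about log2 w flags to distinguish the fault locations of a weight-w measurement.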
This technique was then extended to tolerate an arbitrary number of faults during cat state preparation when placing qubits on a 1-D chain.

Both techniques developed in Chap. 3 can be applied in a straightforward manner to the preparation of more complex ancilla states, as detailed in Chap. 4. Here, we compare the fidelity and overhead of using different deterministic circuits for the preparation of the encoded |0⟩ states of the Steane and Golay quantum codes. Previous results on the fault-tolerant preparation of these states leveraged postselection-based fault-tolerance, where a protocol is rejected when a non-trivial outcome is observed. Here, we decode measurement results and assign compatible corrections to ensure states are never thrown away.

The high-fidelity preparation of stabilizer states is motivated by the use of these states for Steane-style error correction. In this protocol, qubits of a quantum code are engineered to copy errors over to a stabilizer state. Measuring the state and decoding the results reveals the location of errors in the quantum code. For some codes, the Steane-style protocol exhibits higher thresholds than Shor-type schemes [Ste97, EAG24]. This has also been verified experimentally [PBP+23, HBC23, BEG+24].

Distance-four error correction

Finding the perfect quantum error-correcting code is an elusive goal. For near-term implementations of quantum memory, we desire a code with high-fidelity error correction, high encoding rate, and circuits that can be implemented on near-term hardware. Topological codes like the surface code are well-suited to near-term hardware and exhibit very high fidelity. Unfortunately, these codes have low encoding rate and require many physical qubits per logical qubit. Quantum LDPC codes, on the other hand, have high rate but need circuits that are difficult to implement on current hardware [PK22a, BCG+23]. In contrast, Chap. 
5 of this thesis considers distance-four, efficient encodings of multiple qubits into a modified planar patch of the 16-qubit surface code. These codes satisfy all the desired properties for near-term error correction. We use postselection techniques to achieve high-fidelity error correction. These codes can encode up to six logical qubits with distance-four protection using the same number of physical qubits as the distance-four surface code. Finally, the circuits for error correction are designed for planar hardware.

Choosing distance-four codes allows us to correct single faults and detect double faults. Thus logical errors occur due to at least three faults, which is rare at current noise rates. We perform extensive simulations of error correction and compare the logical error rate of our codes against the distance-four and -five surface codes [TS14]. These codes serve as the perfect benchmarks as they satisfy most of the desired properties for a quantum memory while only suffering from low code rate. A spectrum of trade-offs was revealed: at one end, a code encoding six logical qubits can achieve error protection comparable to that of the distance-five surface code using one-tenth as many qubits. At the other end of the spectrum, a two-logical-qubit code can achieve a logical error rate half that of the surface code, using one third of the physical qubits; an improvement on both fronts! Hence distance-four codes, using postselection and in a planar geometry, are qubit-efficient candidates for fault-tolerant, moderate-depth computations.

The circuits in this chapter are amenable to architectures that are constrained to be planar with local connectivity, such as superconducting systems and semiconductor spin systems. Currently IBM develops superconducting systems where qubits interact with at most three neighbors [SYK+23], and with Google’s devices, qubits can interact with up to four neighbors [AAA+23].
Constructing a device with a connectivity of eight neighbors naively permits the use of our protocols for error correction. But low-cost modifications can decrease the connectivity requirements.

Logical gates by temporally encoded lattice surgery

From the perspective of experimental quantum computing, the leading candidate quantum error-correcting code is the surface code. Stabilizer measurements for error correction are simple, and can be parallelized when placing the qubits on a square planar grid. The only caveat of these codes is that they encode very little logical information, and hence require many physical qubits. This caveat may indeed be a boon, as performing logical operations on surface code qubits is simpler than with codes encoding multiple qubits. Further, we consider biased-noise implementations of surface codes, as they are backed by experiment and theory [PSJG+20, CBP23]. A biased noise model has reduced noise for one type of error, thereby reducing the overhead of qubits needed to protect against it. In the context of biased noise, different variants of the surface code possess different desirable properties. Our focus is to reduce the number of physical resources per logical qubit for a CPU-style architecture of logical qubits [CC22b]. Recent research on biased noise surface codes has shown that the total fault tolerance overhead is lowest when rectangular surface codes are used with biased noise models [HBK+23]. When equipped with magic state distillation factories, the lowest-overhead logical gates that permit universal computation are lattice surgery measurements of multi-logical-qubit Pauli operators [Lit19a].

Surface code logical qubits are protected from errors by the spacelike distance of the code. However, when performing lattice surgery measurements, the accuracy of the result depends on the timelike distance, which is the number of rounds of syndrome measurements needed to protect against strings of timelike failures.
As such, a larger timelike distance requirement will result in the slowdown of an algorithm’s runtime. Temporal encoding of lattice surgery (TELS) is a technique which can be used to reduce the number of syndrome measurement rounds that are required during a lattice surgery protocol [CC22a]. In a regular lattice surgery protocol, multi-qubit Pauli operators are sequentially measured. With TELS, only commuting subsequences of Pauli operators can be measured. In this technique, the classical measurement results of the commuting subsequence are encoded into a classical error-correcting code. This results in the measurement of a larger set of mutually commuting multi-qubit Pauli operators that are derived from the initial sequence. Up to the distance of the classical code, the multi-qubit Pauli measurement results are protected against timelike lattice surgery failures.

We evaluate the speedups that could be achieved by TELS at different desired accuracies (10^-10, 10^-15, 10^-20, 10^-25) and observe an algorithmic speedup by a factor of two to six respectively. In these calculations, we considered codes of size up to 128, and distance up to 11. The application of TELS is not restricted to the surface code. In fact, it can be applied for any topological or non-topological code that executes logical computations by lattice surgery.

We also apply TELS to magic state distillation protocols with biased noise in Chap. 7. Magic state distillation is a short algorithm executed on encoded qubits [BK05], and thus served as the perfect benchmark to test the improvements of TELS and perform a full cost estimation. In order to make TELS compatible with magic state distillation protocols, we designed a new distillation technique, where multi-qubit measurements are performed in the Clifford frame (see footnote 2). Previous techniques distilled in the Pauli frame [Lit19b].
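The idea of encoding commuting measurement results into a classical code can be sketched with the simplest case, a single-parity-check code. This is a minimal illustration under assumptions, not the protocol of Chap. 6: outcomes are written as bits (0 for +1, 1 for -1), the outcome of a product of commuting Paulis is taken to be the XOR of the individual outcomes, and the function names are hypothetical.

```python
def single_parity_check_generator(k):
    """Generator matrix of the [k + 1, k, 2] code: identity plus an all-ones column.

    Column j lists which of the k original Paulis are multiplied together in
    the j-th measurement; the last column is the product of all k of them.
    """
    return [[int(i == j) for j in range(k)] + [1] for i in range(k)]


def encode(outcomes, generator):
    """Ideal outcomes of the encoded measurements: c = m G over GF(2)."""
    k, n = len(generator), len(generator[0])
    return [sum(outcomes[i] * generator[i][j] for i in range(k)) % 2 for j in range(n)]


def detects_failure(received):
    """A distance-2 code flags any single flipped outcome through the overall parity."""
    return sum(received) % 2 == 1


G = single_parity_check_generator(2)   # measure P1, P2 and P1*P2
ideal = encode([1, 0], G)              # [1, 0, 1]
corrupted = ideal.copy()
corrupted[1] ^= 1                      # one timelike failure flips m2
print(ideal, detects_failure(ideal), detects_failure(corrupted))
```

A code of distance 2 can only detect a single corrupted result; codes of larger distance, such as those of size up to 128 and distance up to 11 mentioned above, would tolerate more corrupted results at the cost of extra measurements.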
Using TELS and optimally-designed layouts, we demonstrate a reduction in the space-time cost of magic state factories by as much as 22%.

Footnote 2: Computations performed in a certain frame contain errors of that frame’s type, but the locations of the errors are known and they are tracked in software until they can be corrected.

1.3 Current thrusts and future outlook

Current thrusts

Research at the forefront of improvements to quantum error correction can be mapped to different levels of abstraction in the layered architecture of Fig. 1.3. Here we discuss improvements to hardware architecture designs, developments on new quantum error-correcting codes, and advancements in fault-tolerant quantum algorithms.

For the first three years of the 2020s, quantum computers built using superconducting and ion trap technologies were touted as game-changing. Superconducting quantum circuits are fast, and gate fidelities are improving; however, scaling and cryogenics are formidable challenges [HWFZ20]. On the other hand, ion trap quantum computers can demonstrate impressive gate fidelities and all-to-all connectivity. But they suffer from scaling issues and slow circuit operations [BCMS19].

Neutral atom systems have recently emerged as another potential candidate to build moderately sized fault-tolerant quantum computers. Non-local gates are performed by transporting qubits across a 2-D plane. Although qubit transport is slow, it can still be performed much faster than the timescale of decoherence [BLS+22]. Additionally, many qubits can be simultaneously loaded, providing a simple route to scale to larger devices. Most importantly, however, this architecture permits the execution of highly optimal circuits for encoded quantum logic. Steane error correction, shown to exhibit higher thresholds than some Shor schemes [EAG24], can form the backbone of error correction with neutral atoms. In Chap. 
4, we show low-overhead fault-tolerant circuits to prepare code states for Steane error correction.

As highlighted in Chap. 2, the measurement of stabilizers of high weight can require many extra qubits. The surface code measures stabilizers of size at most four. But recent work has highlighted new codes with measurements of at most two or three qubits. These operators are trivial to measure fault-tolerantly. In most cases, these operators are not stabilizers of the code, but are representations of ignored logical qubits, colloquially known as gauge qubits [HH21, GNFB21, HB21a]. Similar to the surface code, these codes are topological in nature and encode a constant number of logical qubits. The low encoding rates of these codes may prohibit wide-scale use for logical computation but these codes may serve as a primary level of protection in a concatenated setting.

The field of quantum error-correcting codes has recently been inundated with the study of high-rate codes constructed by quantum low-density parity check techniques (QLDPC). This ranges from the discovery of new optimal codes [PK22a, PK22b, LP23], codes on planar layouts [BCG+23], and analyses of concatenations of high-rate QLDPC codes with other high-fidelity codes [RGL+24, GNBJ23]. Additionally, different encoding schemes are being constructed based on the physics of quantum devices [GFP+20, GKP00]. These advancements offer flexibility in code choice throughout the architecture stack.

Converting all noise to erasure is an exciting new paradigm. Erasures are errors at known locations, hence they are easier to correct than errors whose position is unknown. There have been recent experimental proposals to convert two-qubit gate errors in neutral atom systems and amplitude damping noise in superconducting systems to erasure noise [WKPT22, MLP+23, KHV+23].
This has been complemented with an experimental implementation of a dual-rail cavity qubit in both systems [CSM+23, SST+23, LHH+23] and in ion trap systems [KCB23]. New codes are also being developed for systems with biased erasure errors [SJC+23]. Other types of noise have also been suppressed in recent experiments. For example, leakage errors have been reduced [BVT21, VBT+20], and high-energy low frequency errors from cosmic rays have also been studied [MFA+22].

Quantum devices entered the noisy intermediate-scale quantum (NISQ) era with the emergence of devices boasting around fifty qubits capable of running circuits with up to a hundred gates [BCLK+22]. This prompted the discovery of a new class of algorithms for these devices, termed NISQ algorithms [CAB+21]. The hope was that one such NISQ algorithm could be found that shows a definite quantum advantage. However, in the half decade or so that has passed, no such results have been shown. With larger devices and more robust error correction primitives being shown experimentally, research on quantum algorithms has now shifted focus to fault tolerance. General fault-tolerant algorithms for an arbitrarily accurate quantum computer are converted to permit execution on near-term devices, while incorporating limited error correction benefits. These results indicate the dawn of early fault-tolerant quantum computing (EFTQC) [Cam21, ZWJ22, LT22, WSJ22, KKJ22, WBC22, WFRJ23, DL23, AMO+23].

Future outlook

Current early architectures for fault-tolerant quantum computing involve using just one quantum error-correcting code. In the future, a quantum computer will require different encodings throughout the architecture stack. The earliest models concatenated small codes with themselves [Kni05b]; however, we believe that codes with different properties should be concatenated together. Consider the following thought experiment to construct a fault-tolerant computer.
A GKP qubit is a continuous-variable logical qubit with a strong ability to suppress errors. Concatenating GKP qubits with the surface code can then produce low overhead logical qubits with very good protection against noise [NC20, NCBa22]. Considering qubits are fairly well protected, a QLDPC code may now be added on top to further increase protection while keeping the encoding rate fairly high [GNBJ23, RGL+24]. In the process of error correction or logical gates, classical codes may be used to protect measurement results, as explored in Chap. 6. This simple thought experiment uses four levels of codes; however, future quantum computers may require more.

Additionally, one must consider the decoding problem when using multiple codes. Physical systems that require cryogenics have stringent constraints on the heat output of the quantum computer, hence low-energy techniques must be devised for complex decoding algorithms with classical computers. Even with a simple decoder such as a lookup table, there can be complications. Consider as an example the decoding of a 28-bit string of classical results for the Golay code in Chap. 4. At least ten gigabytes are required to store a lookup table of all the corrections. On the other hand, slow complex fault-tolerant decoders can potentially stall computations for long periods of time [CGS+22]. To combat this, efforts have been made to improve the speed and scalability of decoders for surface code quantum computers [BBB+23, LWDZ23, TZC+23]. A recent summary on real-time decoding is shown in Ref. [BCJ+23].

Conventionally, universal quantum computing is achieved using magic state distillation factories. However, code-switching techniques may be a useful alternative [BKS21]. A quantum computer then can be partitioned into multiple units such as a memory unit with high rate codes and a computational unit with a plethora of codes permitting different types of logic.
Accumulating a library of Clifford and non-Clifford operations can speed up the execution of complex algorithms while also permitting the execution of a wide range of algorithms.

Chapter 2

Fault-tolerant syndrome measurement

A critical component of quantum error correction is syndrome measurement: a set of circuits that are used to pinpoint which qubits have errors. This process of error identification is itself susceptible to noise and may fail. To make this robust, extra (ancilla) qubits can be used to identify damaging mid-circuit faults and mitigate the spread of errors. The objective of this chapter is to reduce the overhead of ancilla qubits used in imparting this fault tolerance. In particular, we focus on optimizing the flag technique for distance-three fault-tolerant stabilizer measurement.

We strive for low qubit overhead since quantum computers with limited qubits count resources preciously, and even minor improvements can free up extra qubits for other tasks. In topological codes where stabilizers are localized in space and are of low weight, only a few flag qubits close to each stabilizer suffice to impart fault tolerance [YK17, CKYZ20, CZY+20]. It has also been shown that with adaptive control and quickly resetting qubits, only four ancillas are required for the universal fault-tolerant operation of some distance-three codes [CR18b, CR18a]. In this chapter, we present a general fault-tolerant protocol that works for a stabilizer of any size. If qubits are connected well enough, we show that only logarithmic overhead is required for fault-tolerant stabilizer measurement, an exponential space improvement over the previous linear overhead.

The general model of flag-based fault tolerance is displayed in Fig. 2.1a. Here a set of flag ancilla qubits monitor operations in a non-fault-tolerant circuit and when measured at the end, produce flag patterns that identify mid-circuit faults.
Based on the observed flag pattern, a correction is applied to the data to minimize the spread of errors. As an example, Fig. 2.1b measures a stabilizer on 10 data qubits while tolerating one fault. The three colored qubits are the flags and the measured flag patterns each imply different corrections.

Figure 2.1: (a) Function of a flag scheme. Errors in a non-fault-tolerant circuit can be made to spread into flag qubits. On measurement, the flag qubits yield a pattern of 1s and 0s, based on which the data is corrected. (b) Circuit to measure the stabilizer $X^{\otimes 10}$, using three flag qubits, in color, to protect against one X fault (distance three).

Also note that the sequence of flag patterns
100, 110, 111, 011, 001 is a path on the hypercube and corresponds to the order of the flag CNOTs, e.g., between 100 and 110 a CNOT targets flag qubit 2.

In this chapter, we restrict discussion to the measurement of individual stabilizers of a quantum code, as in Shor-style fault-tolerant stabilizer measurement [Sho96]. We do not consider measuring multiple stabilizers in parallel, as in [Ste97, Kni05a, HB21b]. Fig. 2.2 displays improvements made to Shor’s method. Note that Shor’s method can tolerate any number of faults by increasing the fault tolerance of the cat state preparation. The subsequent schemes forgo this property and are only fault-tolerant to distance three. DiVincenzo and Aliferis first make the circuit deterministic by removing the need for cat state verification [DA07]. This ensures that a circuit designer need not wait for a fault-tolerantly prepared cat state before measuring the stabilizer. Subsequent improvements were made in [Ste14, YK17, CR18a] to reduce the ancilla count by coupling each ancilla qubit to two data qubits instead of one.

With our flag method, the ancilla cat state is prepared and unprepared while collecting the stabilizer. As in Fig. 2.1b, an X fault occurring anywhere on the |+⟩ qubit may spread into the data, but will also leave its imprint on the flags. This is then measured out as a flag pattern. Due to the particular arrangement of the flag CNOTs, any fault that can spread to a data error of weight more than one triggers one of the five flag patterns shown. Each flag pattern then triggers a unique correction that ensures that at most one data qubit is in error. This satisfies the condition for fault tolerance, which states that k faults in a circuit should cause no more than k qubits to have errors.

Figure 2.2: Historical progression of stabilizer measurement circuits, illustrated by a weight-10 X stabilizer measurement. Panels: (a) [Sho96], (b) [DA07], (c) [Ste14, YK17, CR18a], (d) the flag method. The black CNOTs have targets on the 10 data qubits, collectively represented by a black wire. In (b-d), fault tolerance is only guaranteed to distance three and Pauli corrections, or frame updates, are applied to the data based on the Z basis measurements. (a) Shor’s method uses w + 1 ancillas and requires a fault-tolerantly prepared cat state. (b,c) These methods use unverified cat states with subsequent error decoding, giving a deterministic circuit. (d) Our flag method prepares and unprepares an ancilla cat state while collecting the stabilizer. Exponentially more flag patterns can thus be accessed for fault diagnosis.

For the distance-three fault-tolerant measurement of a weight-w stabilizer, we propose two methods based on the speed of qubit reset. With fast qubit reset (Theorem 1), only three flag ancillas are required in total, but a flag must be measured and reset once per four data qubits. If more flags are used in parallel, the number of accessible flag patterns grows exponentially and the number of measurements per ancilla converges to one. This is the regime of slow qubit reset (Theorem 2), which uses at most $\lceil \log_2 w \rceil$ flag ancillas, measured only at the end.

The rest of the chapter is divided into three sections. Sec. 2.1 details the construction of the two paths on the hypercube that we use as flag sequences. Sec. 2.2 describes how to use these sequences for distance-three fault-tolerant stabilizer measurement. In Sec. 2.3, we detail the distance-five and distance-seven cases.

2.1 Flag sequences

A flag pattern is a string of 1s and 0s that arises from measuring flag qubits. A flag pattern with a flags is a vertex of the a-dimensional hypercube $\{0,1\}^a$. We show how to construct two maximal-length paths through the hypercube. Between sequential flag patterns only one bit changes, which in the fault-tolerant circuit constructions below will correspond to a CNOT from the syndrome qubit to that flag qubit.
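To make the correspondence between a flag sequence and the order of flag CNOTs concrete, here is a minimal Python sketch. The function name `flag_cnot_targets` and the string encoding of flag patterns are illustrative choices, not part of the original work. Given a list of flag patterns, it reports which flag qubit (1-indexed) is targeted between each pair of consecutive patterns, and rejects sequences that change more than one bit at a time.

```python
def flag_cnot_targets(patterns):
    """For each consecutive pair of flag patterns, return the 1-indexed
    flag qubit whose bit flips, i.e. the target of the flag CNOT."""
    targets = []
    for prev, curr in zip(patterns, patterns[1:]):
        if len(prev) != len(curr):
            raise ValueError(f"{prev} and {curr} have different lengths")
        flipped = [i + 1 for i, (x, y) in enumerate(zip(prev, curr)) if x != y]
        if len(flipped) != 1:
            raise ValueError(f"{prev} -> {curr} is not a single-bit step")
        targets.append(flipped[0])
    return targets


# Flag sequence of the weight-10 example with three flags (Fig. 2.1b).
sequence = ["100", "110", "111", "011", "001"]
print(flag_cnot_targets(sequence))  # [2, 3, 1, 2]
```

The first entry reproduces the statement that between 100 and 110 a CNOT targets flag qubit 2. The sketch only covers the CNOTs between the listed patterns, not any gates before the first or after the last pattern.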
The first type of flag sequence just requires a maximal-length traversal of the a-dimensional hypercube. A simple choice for this is the Gray code [Gra53, Gar86].

Lemma 1. For a ≥ 1, the Gray code creates a length-$2^a$ Hamiltonian path in the a-dimensional hypercube $\{0,1\}^a$.

Proof. We construct the sequence inductively. For a = 1, use 0, 1. For a > 1, first run the sequence for a − 1 with 0s appended, then run it backwards with 1s appended. For a = 2, e.g., the sequence is 00, 10, 11, 01. For a = 3, the sequence is 000, 100, 110, 010, 011, 111, 101, 001.

The second type of sequence is related to the degree of fault tolerance of the circuit. By definition, fault tolerance to distance d implies that for all $k \le t = \lfloor (d-1)/2 \rfloor$, correlated errors of weight k occur with k-th order probability. For distance-three Calderbank-Shor-Steane (CSS) fault-tolerant syndrome measurement, any single fault should result in a data error with X and Z components having weight zero or one. To ensure that the circuit is distance-three fault-tolerant, a measurement fault on any one ancilla qubit must not trigger corrections of weight greater than one. Hence the second maximal-length sequence requires that there are no weight-one strings except at the start and end. As shown in Fig. 2.1b, we may assign weight-one corrections to these two patterns, but for all the others, multi-qubit corrections are required.

Lemma 2. For a ≥ 2, in the a-dimensional hypercube $\{0,1\}^a$ there exists a path $v_1 = 10^{a-1}, \ldots, v_n = 0^{a-1}1$, with length $n = 2^a - 2a + 3$, such that all $v_2, \ldots, v_{n-1}$ have weight at least two, and none repeat.

Proof. Let $1_S \in \{0,1\}^a$ denote the vertex that is 1 exactly for indices in S. Fig. 2.3 illustrates the inductive construction. For a = 2, the sequence is that of Lemma 1 without the initial 00, i.e., 10, 11, 01. The base case of our inductive proof is a = 3, where the sequence is 100, 110, 111, 011, 001.
For a > 3, first run the previous sequence for b = a − 1 with 0s added to the bottom, up to the second-to-last element (which for b ≥ 3 is $1_{\{2,b\}}$). Then run the sequence backward, except with 1s added to the bottom, and swapping coordinates 2 and a − 1 (the red and blue rows in the figure). Finally, finish the sequence from $1_{\{1,a\}}$ by walking through $1_{\{3,a\}}, 1_{\{4,a\}}, \ldots, 1_{\{a-2,a\}}, 1_{\{2,a\}}$, with the appropriate weight-three strings $1_{\{1,3,a\}}, 1_{\{3,4,a\}}, \ldots, 1_{\{a-3,a-2,a\}}, 1_{\{a-2,2,a\}}$ (shown in gray) interposed.

To ensure that no vertex is visited more than once, one need only check that the last 2a − 5 strings are distinct from those that came before. For this, one can track by induction the 2a − 3 hypercube vertices that are not visited by each walk: $0^a$, the a − 2 weight-one strings $1_{\{2\}}, \ldots, 1_{\{a-1\}}$, and the a − 2 weight-two strings $1_{\{1,3\}}, 1_{\{3,4\}}, 1_{\{4,5\}}, \ldots, 1_{\{a-1,a\}}$. Thus, the sequence has total length $2^a - (2a - 3)$.

The length $2^a - 2a + 3$ is maximal. This follows since there are $2^{a-1} - a$ vertices with odd weight more than one, and vertices must alternate odd and even weights.

2.2 Distance-3 fault-tolerant stabilizer measurement

In this section, we outline two protocols for distance-three CSS fault-tolerant stabilizer measurement. They differ based on the speed of qubit measurement and reset.

Figure 2.3: Flag sequences for distance-three fault-tolerant syndrome measurement, using a flag qubits, each measured once (the slow reset model). The figure shows the flag sequences for a = 2 through a = 7, together with the unused configurations. These sequences are walks through the a-dimensional hypercube, from $10^{a-1}$ to $0^{a-1}1$, passing through each vertex at most once and through no other weight-one vertices. Flag patterns are stacked vertically and ordered initially left to right, with solid and empty squares representing 1 and 0, respectively; e.g., a glyph of three stacked columns represents 10, 11, 01.
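As a sanity check on Lemmas 1 and 2, the following Python sketch builds the Gray code by the inductive rule in the proof of Lemma 1 and, for small a, searches exhaustively for the longest path that satisfies the conditions of Lemma 2. The function names are illustrative and not from the original work. The search is a brute-force check, not the inductive construction used in the proof, and is only practical for small a.

```python
def gray_code(a):
    """Lemma 1: run the (a-1)-bit sequence with 0s appended,
    then run it backwards with 1s appended."""
    if a == 1:
        return ["0", "1"]
    prev = gray_code(a - 1)
    return [s + "0" for s in prev] + [s + "1" for s in reversed(prev)]


def satisfies_lemma2(path):
    """Check the conditions of Lemma 2 on a list of bit strings."""
    a = len(path[0])
    single_bit_steps = all(
        sum(x != y for x, y in zip(u, v)) == 1 for u, v in zip(path, path[1:])
    )
    return (
        path[0] == "1" + "0" * (a - 1)
        and path[-1] == "0" * (a - 1) + "1"
        and single_bit_steps
        and len(set(path)) == len(path)
        and all(v.count("1") >= 2 for v in path[1:-1])
    )


def longest_lemma2_path(a):
    """Exhaustive depth-first search for a longest valid path (small a only)."""
    start = "1" + "0" * (a - 1)
    end = "0" * (a - 1) + "1"
    best = []

    def extend(path, seen):
        nonlocal best
        last = path[-1]
        for i in range(a):
            flipped = "1" if last[i] == "0" else "0"
            nxt = last[:i] + flipped + last[i + 1:]
            if nxt == end:
                # The end vertex has weight one, so it can only come last.
                if len(path) + 1 > len(best):
                    best = path + [end]
            elif nxt not in seen and nxt.count("1") >= 2:
                seen.add(nxt)
                extend(path + [nxt], seen)
                seen.remove(nxt)

    extend([start], {start})
    return best


print(gray_code(3))
for a in (2, 3, 4):
    path = longest_lemma2_path(a)
    print(a, len(path), 2 ** a - 2 * a + 3, satisfies_lemma2(path))
```

For a = 2, 3, 4 the length found by the search can be compared directly with the formula $2^a - 2a + 3$ printed next to it.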
Figure 2.4: (a) Circuit to measure an $X^{\otimes 6}$ stabilizer, CSS fault-tolerant to distance three. (b) Circuit to prepare a six-qubit cat state, fault-tolerant to distance three.

For w ∈ {4, 5, 6}, flag-fault-tolerant circuits are constructed the same way regardless of qubit reset speed. We show in Fig. 2.4a that for w = 6, only two flag qubits are required. Lower-weight stabilizers can be measured by removing data CNOTs and making appropriate changes to the Pauli corrections. For 7 ≤ w ≤ 10, the different constructions yield the same circuits. It is only for w > 10 that the effects of qubit reset speed are pronounced.

2.2.1 Fast reset

Theorem 1.
If qubits can be measured and reset quickly, then for any w, four ancilla qubits are sufficient to measure the syndrome of $X^{\otimes w}$, CSS fault-tolerantly to distance three. Moreover, the number of measurements needed is $\lceil (w+2)/4 \rceil + 1$.

Proof. For w ∈ {4, 5, 6}, the circuit using two flag ancillas is shown in Fig. 2.4a. It runs through a sequence of three flag patterns, and a multi-qubit correction is only applied for the flag pattern 11. For w > 6, the general construction is shown in Fig. 2.5. Each repetition of the highlighted region adds the X parity of four more data qubits, while measuring and quickly reinitializing one flag qubit. In terms of the number of measurements m, the construction achieves up to w = 4(m − 1) − 2. It is fault-tolerant because X faults on the control wire cause flag patterns of alternating weights two and three, which localize the fault to three possible consecutive locations along the control wire: before, between or after two CNOT gates. The appropriate correction, ensuring distance-three fault tolerance, is the one for a fault between the CNOT gates.

Figure 2.5: Distance-three fault-tolerant syndrome bit measurement only needs three flag qubits. The highlighted region can be repeated to fit the weight of the stabilizer being measured.

Figure 2.6: Distance-three error correction is not possible with one flag qubit. Either (top) the control wire is unprotected at some point ⋆, from which an X fault can propagate to an error of weight at least two; or (bottom) faults at a, b, c, d, causing respective errors $I, X_1, X_1 X_2, X_w$, have no consistent correction.

Theorem 1 may be optimal; it does not appear to be possible to use fewer than three flag qubits. With just one flag qubit, one can detect that an error has occurred, but not where. As illustrated in Fig. 2.6, either the control wire is unprotected at some point or for w ≥ 4 there is no consistent correction rule. By a similar argument, two flag qubits are not enough. Any correction based on a single flag can have weight at most one, since the flag measurement itself could be faulty. However, if at some point in the middle the control wire is protected by just a single flag, a weight-one correction will not suffice. On the other hand, if both flags are used to protect the control wire across the entire sequence of CNOT gates, we are unable to locate faults well enough to correct them.

We remark that this construction can also be used to prepare a w-qubit cat state fault-tolerantly to distance three. The conversion follows three steps:
1. Remove one data qubit.
2. Initialize the data qubits as |0⟩.
3. Remove the syndrome ancilla measurement, so as to retain it in the support of the stabilizer.
An example of this conversion is shown for w = 6 in Fig. 2.4b. In Chap. 3, we suggest different protocols that use just one ancilla qubit.

2.2.2 Slow reset

Theorem 2. The syndrome of $X^{\otimes w}$ can be measured CSS fault-tolerantly to distance three using m ≥ 3 measurements, provided that $w \le 2\,(2^{m-1} - 2(m-1) + 3)$.

Proof. Two examples are shown in Fig. 2.4a, for w = 6, and Fig. 2.1b, for w = 10. As in these figures, in general we collect the syndrome two qubits at a time into a syndrome qubit that is initialized as |+⟩. Between each of these pairs of CNOT gates, a CNOT is applied from the syndrome qubit into one of m − 1 flag qubits. This leads to a sequence of flag patterns, e.g., 100, 110, 111, 011, 001 for the w = 10 example. Based on the observed flag pattern, a correction is applied as if an X fault had occurred between the corresponding pair of flag CNOT gates. Observe that the flag sequence changes one bit at a time; it can be thought of as a path on the hypercube. It begins and ends with weight-one patterns, but otherwise the patterns all have weight at least two.
This is important for distance-three fault tolerance because a fault could affect the flags, and only the first and last data corrections have weight one. Also, the flag patterns along the sequence are distinct, so each is associated with only one correction. The theorem then follows from the flag sequence construction in Lemma 2.

Note that the approach of Theorem 2, with slow reset, differs from the fast reset case of Theorem 1 in that a flag qubit is active and able to detect faults in more than one region of the circuit.

2.2.3 Space-time cost

Here, we count the circuit depth and number of ancillas used in our distance-three fault-tolerant stabilizer measurement circuits. Parallelization can substantially reduce circuit depth. Table 2.1 compares our flag method for measuring a weight-w stabilizer to the earlier methods in Fig. 2.2. Also considered is a parallelized Shor method, in which the initial cat state is prepared in logarithmic depth, with w/4 extra ancilla qubits for postselection checks. The Shor methods must pass the postselection checks, and so they are non-deterministic protocols. Table 2.1 shows the best case, where all the checks pass. Note that the flag and parallelized Shor methods both have space × depth cost scaling as O(w log w), with the leading coefficient in favor of the flag method.

Table 2.1: Space and time costs for measuring a weight-w stabilizer using different distance-three fault-tolerant stabilizer measurement circuits. In the following, all the logarithms are base 2. The flag method requires the fewest ancillas and has low depth, allowing for the smallest cost when computing #ancillas × depth.

Protocol           | Ancillas  | Depth       | Ancillas × Depth
Shor               | w + 1     | w/2 + 3     | O(0.5 w^2)
Shor-Par           | 5w/4      | 3 log w − 1 | O(3.75 w log w)
DA                 | w         | 2w − 1      | O(2 w^2)
Compressed DA      | w/2       | 3w/2 − 2    | O(0.75 w^2)
Flag               | log w + 1 | 3w/2 + O(1) | O(1.5 w log w)
Not fault-tolerant | 1         | w           | O(w)
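The counting formulas of Theorems 1 and 2 and the entries of Table 2.1 can be evaluated for concrete stabilizer weights. The Python sketch below is an illustration only: the function names are hypothetical, and the flag depth drops the unspecified O(1) term of Table 2.1, so the numbers it prints are leading-order estimates rather than exact circuit costs.

```python
from math import log2


def fast_reset_measurements(w):
    """Theorem 1: ceil((w + 2) / 4) + 1 measurements."""
    return -(-(w + 2) // 4) + 1


def slow_reset_max_weight(m):
    """Theorem 2: largest weight covered by m >= 3 measurements."""
    return 2 * (2 ** (m - 1) - 2 * (m - 1) + 3)


def slow_reset_measurements(w):
    """Smallest m >= 3 that satisfies the bound of Theorem 2."""
    m = 3
    while slow_reset_max_weight(m) < w:
        m += 1
    return m


def table_2_1_costs(w):
    """(ancillas, depth) for each protocol in Table 2.1, logs base 2.
    Assumption: the O(1) term in the flag depth is taken to be zero."""
    lg = log2(w)
    return {
        "Shor": (w + 1, w / 2 + 3),
        "Shor-Par": (5 * w / 4, 3 * lg - 1),
        "DA": (w, 2 * w - 1),
        "Compressed DA": (w / 2, 3 * w / 2 - 2),
        "Flag": (lg + 1, 3 * w / 2),
        "Not fault-tolerant": (1, w),
    }


for w in (6, 10, 22):
    print(w, fast_reset_measurements(w), slow_reset_measurements(w))

for name, (ancillas, depth) in table_2_1_costs(16).items():
    print(f"{name:20s} ancillas={ancillas:g} depth={depth:g} "
          f"product={ancillas * depth:g}")
```

For w = 6 and w = 10 both theorems give the same number of measurements, consistent with the remark that the two constructions coincide up to w = 10; for w = 22 the slow-reset bound is met with m = 5.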
Using a standard depolarizing noise model, we simulate noisy versions of the different circuits to determine statistics of the weight-one and weight-two errors. Specifically:
• With probability p, the preparation of |0⟩ is replaced by |1⟩ and vice versa; similarly for |+⟩ and |−⟩.
• With probability p, an X or Z measurement has its outcome flipped.
• With probability p, a one-qubit gate is followed by a Pauli error drawn uniformly at random from {X, Y, Z}.
• With probability p, the two-qubit CNOT gate is followed by a two-qubit Pauli error drawn uniformly at random from $\{I, X, Y, Z\}^{\otimes 2} \setminus \{I \otimes I\}$.
There are no errors on idle qubits.

Fig. 2.7 shows the rates of weight-one errors and weight-two errors for different input physical error rates p. The rate of weight-one errors is lowest in the non-fault-tolerant circuit, since it contains the fewest locations for faults. The Shor method has a lower weight-one error rate than the flag method, but among the deterministic fault-tolerant methods, the flag method performs the best. For larger stabilizers (w = 22), the curves for the Shor method and the flag method are closer, i.e., the gap between their weight-one error rates shrinks.

Figure 2.7: Simulation of the noisy measurement of an $X^{\otimes 10}$ and $X^{\otimes 22}$ stabilizer at physical error rate $p \in \{10^{-3}, 10^{-2}\}$ using different distance-three fault-tolerant circuits: Shor-style, compressed DiVincenzo-Aliferis, and the flag method of Sec. 2.2.2. In the first and second column of graphs, we show the rate of weight-one and weight-two data errors due to these circuits, with 99% error bars. In the third column, we show the rate at which the measured syndrome bit is wrong. (Rows: w = 10 and w = 22; horizontal axis: physical error rate; curves: Not FT, Shor, Comp DA, Flag.)
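The four fault types of the depolarizing noise model described above can be sampled with a few lines of Python. This is a minimal sketch of the sampling step only, not the simulator used for Fig. 2.7; the function names and the string encoding of Pauli errors are assumptions made for illustration.

```python
import random
from itertools import product

# The 15 non-identity two-qubit Paulis: {I, X, Y, Z}^{x2} without I x I.
TWO_QUBIT_ERRORS = [
    a + b for a, b in product("IXYZ", repeat=2) if a + b != "II"
]


def sample_flip(p, rng):
    """Preparation or measurement fault: True means the prepared state
    or the measurement outcome is flipped."""
    return rng.random() < p


def sample_one_qubit_gate_error(p, rng):
    """With probability p, a uniformly random Pauli from {X, Y, Z}."""
    return rng.choice("XYZ") if rng.random() < p else "I"


def sample_cnot_error(p, rng):
    """With probability p, a uniformly random non-identity two-qubit Pauli."""
    return rng.choice(TWO_QUBIT_ERRORS) if rng.random() < p else "II"


rng = random.Random(0)  # fixed seed for reproducibility
p = 0.01
faults = [sample_cnot_error(p, rng) for _ in range(10_000)]
print(sum(f != "II" for f in faults) / len(faults))  # empirical rate, near p
```

Idle qubits receive no error in this model, so no sampler is needed for them.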
As expected, the rate of weight-two errors of the three fault-tolerant protocols scales quadratically with p, allowing for a lower probability of weight-two errors below a pseudothreshold (the physical error rate below which a fault-tolerant method achieves a lower weight-two error rate than the non-fault-tolerant method). Notice that the flag method has the highest pseudothreshold. Moreover, as the stabilizer weight is increased, the pseudothreshold of the flag method decreases more slowly than those of the other fault-tolerant methods. Asymptotically, the flag method admits the highest pseudothreshold for weight-two errors, but incurs more weight-one errors than the probabilistic Shor method.

Additionally, we compute the rate of errors on the syndrome bit, as this determines how much fault tolerance will be needed to correct faulty syndrome information [DRS22]. The rate of faulty syndrome bits is lowest when using the flag method for fault tolerance.

2.3 Stabilizer measurement tolerating more than 1 fault

Distance-five fault tolerance is interesting for stabilizers of weight w ≥ 6. For w ∈ {6, 7, 8}, the circuits in Fig. 2.8 with seven ancilla qubits are distance-five fault-tolerant. We present a general method to construct stabilizer measurement circuits for arbitrary w in Fig. 2.9. By computer simulation, we verify the fault tolerance of this construction for w up to 90.

The general construction proceeds as follows. First, five flag qubits are activated. For each additional flag that is needed, the gates in the shaded blue region are applied. These gates deactivate an existing flag and activate a new flag. Finally, when no additional flags are needed, the flags are deactivated in the order {2, 4, 1, 5, 3}. Here 1 denotes the flag that has been active for the longest time and 5 the flag that has been active for the shortest time.
To ensure faults are correctly flagged, there must be asymmetry between the order in which flags are activated and the order in which they are deactivated. This is in contrast to the distance-three DiVincenzo-Aliferis method in Fig. 2.2c, where the two orders are symmetric. In Fig. 2.9, the thick black line indicates the w-qubit register of data qubits that are in the support of the stabilizer. Data CNOT gates (in black) are applied to qubits {w, w − 1, . . . , 1} after every flag CNOT (in red). The last data CNOT must be placed either before the third-last or second-last flag CNOT. The addition of another data CNOT gate before the last flag CNOT results in uncorrectable errors.

If there are a ancilla qubits, one can measure a weight-(2a − 5) or weight-(2a − 4) stabilizer. Hence a weight-w CSS stabilizer may be fault-tolerantly measured to distance-five, for w ≤ 2a − 4. Note that at most five flag qubits are active at any instant. Hence with fast qubit reset, one only requires five flag ancillas and one syndrome ancilla to measure an arbitrary weight stabilizer fault-tolerantly to distance-five.

For distance-seven fault-tolerance, there are differences in the spacing between data CNOT gates and the order in which flag ancillas are activated and deactivated. We show how to construct circuits for stabilizers of arbitrary weight w by first discussing a circuit for a weight-17 stabilizer, shown in Fig. 2.10. We chose w = 17 since the circuit is non-trivial and its construction encompasses all the tricks needed to construct circuits for arbitrary weight. In general, compared to Fig. 2.9, the number of ancilla CNOT gates between data CNOT gates is doubled, except in the center of the circuit, where it is tripled for the length of four data CNOT gates. For odd w, the number of ancilla CNOTs between the w − 1 subsequent pairs of data CNOT gates is the sequence {(⌈(w − 6)/2⌉ 2's), 3, 3, 3, 3, (⌊(w − 6)/2⌋ 2's), 1}, as shown in Fig. 2.10.
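The spacing rule just stated can be written out as a short sketch. The function name is hypothetical and the guard w ≥ 6 is an assumption added only so that both run lengths are non-negative; the same ceiling/floor expression also covers even w, where the two runs of 2's have equal length.

```python
def ancilla_cnot_spacing(w):
    """Ancilla-CNOT counts between the w - 1 consecutive pairs of data CNOTs.

    Follows the pattern {ceil((w-6)/2) 2's, 3, 3, 3, 3, floor((w-6)/2) 2's, 1}.
    The lower limit on w is an assumption of this sketch.
    """
    if w < 6:
        raise ValueError("sketch assumes w >= 6")
    left = (w - 5) // 2   # integer form of ceil((w - 6) / 2)
    right = (w - 6) // 2  # floor((w - 6) / 2)
    return [2] * left + [3, 3, 3, 3] + [2] * right + [1]


if __name__ == "__main__":
    seq = ancilla_cnot_spacing(17)
    print(seq, len(seq))
```

For the weight-17 example the list has 16 entries, one for each gap between consecutive data CNOT gates.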
For even w, the sequence is {((w − 6)/2 2's), 3, 3, 3, 3, ((w − 6)/2 2's), 1}. Note that, as shown in Fig. 2.10, one additional ancilla CNOT gate is required at the start. Next we comment on the order in which ancilla qubits are deactivated as flags. Similar to the distance-five case, after initially activating seven flags, a flag is deactivated to activate a new flag qubit. An active group of seven flags is closed in the order {2, 4, 6, 1, 3, 5, 7}. As these seven flags are closed, seven new flags are simultaneously opened. The process repeats until there are exactly seven remaining flags to close. These last seven flags are also closed in the same order {2, 4, 6, 1, 3, 5, 7}. In Fig. 2.10, flag ancillas are shown in alternating colors to highlight the order in which flags are activated and deactivated. Distance-seven fault-tolerance was verified by computer simulation for stabilizer weight up to 32. The number of flag ancillas needed to measure a weight-w stabilizer is w + 1. The techniques described in this section may also be used to develop resource-efficient circuits that are fault-tolerant to higher distance.

[Figure 2.8 circuit diagrams: (a) w = 6, a = 7; (b) w = 7, a = 7; (c) w = 8, a = 7.]
Figure 2.8: Distance-five CSS stabilizer measurement with slow qubit reset for w ∈ {6, 7, 8}. Red wires indicate syndrome and flag qubits.

[Figure 2.9 circuit diagram.]
Figure 2.9: Distance-five syndrome measurement with slow qubit reset for a weight-w X stabilizer. The thick black wire indicates a register of w qubits. An opaque red wire implies the flag is currently inactive and not catching faults. The gates in the blue section can be repeated to construct stabilizer measurement circuits for arbitrary stabilizer weight w. At any instant, only five flags are active.
Hence this circuit can be performed with fast qubit reset using only five flag qubits.

[Figure 2.10 circuit diagram.]
Figure 2.10: Distance-seven syndrome measurement with slow qubit reset for a weight-17 X stabilizer. At any instant, only seven flags are active. Hence this circuit can be performed with fast qubit reset using only seven flag qubits.

Chapter 3
Fault-tolerant cat state preparation

In this chapter we show protocols for distance-three fault-tolerant cat state preparation with overhead that is logarithmic in the size of the cat state. Alternatively with fast reset, only one ancilla is needed, and it is used a logarithmic number of times. Cat states [GHZ89] have applications in many areas of quantum computing, including communication [GT07], information processing [PCL+12], and error correction [Sho96]. Besides practical applications, our results on cat state preparation are theoretically interesting since: i) we introduce the study of asymptotic estimates of qubit overhead for the fault-tolerant preparation of cat states of arbitrary size, and, ii) ideas developed for cat state preparation may provide clues for the fault-tolerant preparation of logical states of more complex codes. We explore these ideas further in Chap. 4.

We briefly summarize the results of this chapter. We first consider making cat state preparation deterministic by using flag techniques from Chap. 2. Flag sequences developed in Sec. 2.1 are utilized to tolerate a single fault in a non-fault-tolerant preparation circuit. This leads to protocols requiring just one ancilla qubit, measured m times, where m is logarithmic in the size of the cat state. Flag circuits of higher distance, such as in Sec. 2.3 or Ref. [CR20], can be used for arbitrary distance fault tolerance.
We also consider preparing cat states by solely performing parity measurements on single-qubit states. Note that a stabilizer state may be ideally prepared by measuring a minimal set of stabilizer generators of the state. However, fault tolerance usually requires the repetition of measurements to facilitate majority voting. Here we consider cat states prepared by performing ZZ measurements, and show that in addition to the n − 1 generators that must be measured ideally, only ⌈log2(n/3)⌉ + 1 extra measurements are needed to tolerate a single fault.

Table 3.1 contains bounds on the ancilla overhead for preparing weight-w cat states fault-tolerantly with flags. If the flag qubits can reset quickly, Theorem 3 states that only one flag qubit is required and it needs to be reset and measured m times. Since the flag qubits operate independently, it is also possible to use m flag qubits, with each one being measured once. We further show how to use an adaptive circuit in Theorem 4 to marginally increase the number of flag patterns in use.

Table 3.1: Cat state size for distance-3 preparation methods that use m ancilla qubit measurements.
Method                                        Cat state size w                 Depth
Deterministic Error Correction (Theorem 3)    w ≤ 3(2^m − 2m + 2)              (w − 1) + 2^{m−2}
Adaptive Error Correction (Theorem 4)         w ≤ 3(2^m − 2m + 3)
Error Detection (Theorem 6)                   w ≤ 3 · 2^{m−1}
Parallelized Error Correction (Theorem 7)     w = 2m = 2 · 2^j, j ∈ ℕ          2 + log2 w

When preparing the cat states by stabilizer measurement, we consider two scenarios of qubit connectivity. First, we determine overhead bounds for state preparation with non-local gates. This facilitates parity measurements between data qubits that are far apart. Next, we consider a layout where qubits are connected as a 1-D chain. Along the chain, data and ancilla qubits alternate, and two-qubit gates are local.
Under this model, the logarithmic overhead from the non-local scenario is lost; however, a scalable construction for arbitrary distance fault tolerance was uncovered. We find an upper bound scaling as O(nd) for n-qubit cat state preparation with distance-d fault tolerance. We explicitly show the parity measurements needed for small cat state sizes, as this may be interesting from an experimental standpoint.

In the appendix, Sec. .1 and Sec. .2 contain two additional protocols for distance-three weight-w cat state preparation. In Theorem 6, we show how to use postselection to prepare cat states while tolerating two faults. Finally, Theorem 7 details how to create low-depth circuits for distance-three fault-tolerant cat state preparation, which may be useful in technologies with many qubits or long two-qubit gate times.

3.1 Using flags to tolerate faults in a non-fault-tolerant circuit

We start by outlining the general procedure used to construct the fault-tolerant circuits in this section. An n-qubit cat state is a simple ancilla state defined with the stabilizer generators {X^{⊗n}, Z_i Z_{i+1} (∀ i < n)}.

Procedure 1. Flags to tolerate faults for cat state preparation:
1. Construct a non-fault-tolerant cat state preparation circuit by applying CNOT gates from a control |+⟩ qubit on to a set of n − 1 |0⟩ target qubits.
2. Using flag sequences from Sec. 2.1, perform parity measurements on targeted qubits to detect if a fault has caused an X error of high weight.
3. Apply corrections based on the syndromes.

For preparing a two- or three-qubit cat state, any preparation circuit is automatically fault-tolerant, because every error has weight zero or one. For example, on three qubits XXI ∼ IIX, since XXX is a stabilizer. Fault tolerance becomes interesting for preparing cat states on w ≥ 4 qubits.

Theorem 3. For m ≥ 2, one ancilla qubit, measured m times, is sufficient to prepare a cat state on w qubits fault-tolerantly to distance three, for w ≤ 3(2^m − 2m + 2).
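To get a feel for the bound in Theorem 3, the following sketch tabulates the largest cat state covered by m measurements and, conversely, the smallest m that covers a given size w. The function names are hypothetical; the linear search is chosen only for clarity and relies on the bound being increasing for m ≥ 2.

```python
def max_cat_size_theorem3(m):
    """Largest w allowed by Theorem 3: w <= 3(2^m - 2m + 2), for m >= 2."""
    if m < 2:
        raise ValueError("Theorem 3 is stated for m >= 2")
    return 3 * (2**m - 2 * m + 2)


def measurements_needed(w):
    """Smallest m >= 2 whose Theorem 3 bound covers a w-qubit cat state."""
    m = 2
    while max_cat_size_theorem3(m) < w:
        m += 1
    return m


if __name__ == "__main__":
    for m in range(2, 8):
        print(m, max_cat_size_theorem3(m))
```

For m = 3 and m = 4 the bound evaluates to 12 and 30 qubits, matching the sizes of the two circuits in Fig. 3.1.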
Let [m] = {1, 2, . . . , m} and X_S = ∏_{j∈S} X_j.

Proof of Theorem 3. Fig. 3.1 illustrates our construction for the cases m = 3 and m = 4. In general, we prepare a w-qubit cat state using CNOT gates from the first qubit, so that the possible X errors from a single fault are 1, X_1, X_[2], X_[3], . . . . We then compute parities of subsets of the qubits into the ancillas, following the flag sequence from Lemma 2 and Fig. 2.3. Although for clarity Fig. 3.1 shows the m parity checks being made in parallel, they can also be made sequentially with just one ancilla qubit. With the given correction rules, errors due to single faults are corrected up to possibly a weight-one remainder. (For example, in Fig. 3.1a, errors X_[5], X_[6] and X_[7] all result in the parity checks 111, for which the correction X_[6] is applied.)

[Figure 3.1 circuit diagrams: (a) m = 3, with data qubit groups 1, 2–4, 5–7, 8–10, 11–12 and corrections X_[3], X_[6], X_[9]; (b) m = 4, with data qubit groups 1, 2–4, 5–7, . . . , 26–28, 29–30 and corrections X_[3], X_[6], . . . , X_[27]. Each panel lists the active flags and the correction for each group.]
Figure 3.1: Distance-three fault-tolerant cat state preparation circuits. Note that, with fast reset, only one ancilla qubit is required.

[Figure 3.2 circuit diagram.]
Figure 3.2: Circuit to prepare a 15-qubit cat state by adaptive error correction, fault-tolerant to distance three. Labels on the thick black wire indicate which data qubit in the block is being addressed as the control or target of the CNOT. If a fault occurs while preparing the cat state on the |+⟩ qubit, it is partially localized by the red flag ancilla. The measurement result of this flag then determines a set of parity checks to completely localize a possible fault. After all the ancilla qubits have been measured, corrections are applied based on Table 3.2.
The circuit also tolerates faults within the parity-check sub-circuit, because a single fault here can flip at most one parity, and no correction is applied for the weight-one patterns. By this method, the cat state is prepared in depth w − 1. The depth of the parity check circuit increases exponentially as 2^{m−2} for m ≥ 3 if we consider slow reset (a = m). This is evident from the flag sequences in Fig. 2.3 as the maximum number of times any flag bit is switched. The total depth of the circuit is then (w − 1) + 2^{m−2}.

Note that the construction from Theorem 3 does not help for syndrome measurement, because the parity checks would in general become entangled with the data. However, the ideas of Theorems 1 and 2 can also be applied to cat state preparation. For example, just as in Fig. 2.4b a circuit for measuring X^{⊗6} with three ancilla qubits corresponds to a circuit to prepare a six-qubit cat state with two ancillas, similarly adapting the construction of Theorem 2 allows preparing a 2(2^a − 2a + 3)-qubit cat state using a ancilla qubits each measured once. Theorem 3 gives a protocol needing just one ancilla qubit, and is therefore the better choice for cat state preparation. The technique adapted from Theorem 2 can however be very useful for the preparation of more complex ancilla states, as in general we cannot perform parity measurements after preparation as we do with cat states.

We can do slightly better than Theorem 3 if we allow an adaptive circuit, in which the parity checks are chosen based on the outcome of a flag qubit measurement. For example, Fig. 3.2 gives a circuit to prepare a 15-qubit cat state using m = 3 measurements. Here, the result of measuring the red ancilla determines how the other two ancillas are used.

Theorem 4. Using an adaptive circuit, for m ≥ 2, one ancilla qubit, measured m times, can be used to prepare a cat state on w qubits fault-tolerantly to distance three, for w ≤ 3(2^m − 2m + 3).

Proof. Our construction will follow the same basic structure as the circuit in Fig.
3.2. Prepare the w data qubits as |+ 0^{w−1}⟩, then apply CNOT_{1,w}, CNOT_{1,w−1}, . . . , CNOT_{1,2} to get a cat state. Let k = 3(2^{m−1}) − 2. Just before CNOT_{1,k+1} and just after CNOT_{1,2}, apply CNOTs into the first ancilla qubit, the red qubit in Fig. 3.2, and measure it. The remainder of the circuit depends on the measurement result.

If it is 1, then a fault has been detected. The error on the cat state can be one of 1, X_1, X_[2], X_[3], X_[4], X_[5], . . . , X_[k−1], X_[k], X_[k+1]. The correction procedure needs to determine in which of the above 1 + (k − 1)/3 groups-of-three the error lies; then for any error in {X_[3j], X_[3j+1], X_[3j+2]} the correction X_[3j+1] works. Perhaps the easiest way to locate the error is by binary search using the Gray code in Lemma 1, e.g., by computing parities between qubits 3j for j ∈ {1, 2, . . . , 1 + (k − 1)/3}. Since the measurement of the red ancilla could have been incorrect, it is important that the all-0s outcome of the binary search correspond to the 1, X_1, X_[2] error triple, as in Table 3.2. Using m − 1 measurements, we can search 2^{m−1} possibilities, which indeed is 1 + (k − 1)/3. (The search circuit can also be made nonadaptive, as in Fig. 3.2.)

Next consider the case that the first measurement result is 0, so no fault has been detected. The error on the cat state can be one of X_[k+1], X_[k+2], . . . , X_[w] ∼ 1. We again use the remaining m − 1 ancilla qubits to measure parities of subsets of cat state qubits. Since there is no guarantee of a fault having occurred yet, we use flag sequences from Lemma 2, where the length of the weight-at-least-two flag sequence is J = 2^{m−1} − 2(m − 1) + 1. The parity checks are now done between qubits {k, k + 1 + 3j, k + 2 + 3J} for j ∈ {0, 1, . . . , J}, as shown in Fig. 3.3 and Table 3.2. We do not assign corrections to weight-one flag patterns, since they may also be triggered by a measurement fault on any one of the data qubits involved in the parity check.
Consolidating, we are allowed up to 3J + 1 CNOTs before the red ancilla is initialized, and up to k CNOTs in the monitored region of the red ancilla. In total we can create a cat state on up to w ≤ 3J + k + 2 = 3(2^m − 2m + 3) qubits, with m total measurements.

[Figure 3.3 circuit diagrams: (a) w = 15, a = 3, with k + 1 = 11, a − 1 = 2 ancilla qubits and correction X_[12]; (b) w = 33, a = 4, with k + 1 = 23, a − 1 = 3 ancilla qubits and corrections X_[24], X_[27], X_[30]; (c) w = 75, a = 5, with k + 1 = 47, a − 1 = 4 ancilla qubits and corrections X_[48], X_[51], . . . , X_[72].]
Figure 3.3: If the red ancilla flag in Fig. 3.2 is not triggered, these circuits are used to find and correct a possible error. The flag sequences (from Fig. 2.3) and corresponding corrections are listed at the bottom. Note that these sequences are nonadaptive, and can be used either with a ancilla qubits in a slow reset model, or with just one ancilla qubit in a fast reset model, since all the CNOT gates commute.

Table 3.2: Possible data errors and associated corrections for the different observed flag patterns in Fig. 3.2.
Red flag   Parity checks          Possible errors              Correction
1          3 ⊕ 9     6 ⊕ 12
           0         0            1, X_1, X_[2]                X_1
           1         0            X_[3], X_[4], X_[5]          X_[4]
           1         1            X_[6], X_[7], X_[8]          X_[7]
           0         1            X_[9], X_[10], X_[11]        X_[10]
0          11 ⊕ 15   10 ⊕ 14
           0         0            1                            None
           1         0            X_11, X_15, X_[14]           None
           1         1            X_[11], X_[12], X_[13]       X_[12]
           0         1            X_10, X_14                   None

We also tested protocols where multiple flags are used for the initial partial localization of a fault (in place of the red flag qubit). We found no improvement to our bounds on ancilla overhead. It appears that ancillas are better used in the parity checks than for partial fault localization.

In this section we show multiple cat state preparation circuits that have been made tolerant to a single fault by using flags.
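The counting in the proof of Theorem 4 can be checked numerically. The sketch below recomputes k and J from their definitions and confirms that 3J + k + 2 agrees with 3(2^m − 2m + 3); the function names are hypothetical and introduced only for this illustration.

```python
def adaptive_parameters(m):
    """Return (k, J) as defined in the proof of Theorem 4, for m >= 2."""
    if m < 2:
        raise ValueError("Theorem 4 is stated for m >= 2")
    k = 3 * 2 ** (m - 1) - 2           # CNOTs monitored by the red ancilla
    J = 2 ** (m - 1) - 2 * (m - 1) + 1  # length of the weight->=2 flag sequence
    return k, J


def max_cat_size_theorem4(m):
    """Largest cat state size from the count w <= 3J + k + 2."""
    k, J = adaptive_parameters(m)
    return 3 * J + k + 2


if __name__ == "__main__":
    for m in range(2, 7):
        k, J = adaptive_parameters(m)
        print(m, k, J, max_cat_size_theorem4(m))
```

For m = 3, 4, 5 this gives k + 1 = 11, 23, 47 and w = 15, 33, 75, the values shown in the panels of Fig. 3.3, and the growth of w with m illustrates why the number of measurements is logarithmic in the cat state size.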
Moreover, given the non-local connectivity, the number of ancillas needed is shown to be logarithmic in the cat state size. We derive these upper bounds via combinatorial proof techniques, and hope our results inform the theoretical analysis of the asymptotic resource requirements of general state preparation. In the next section, we consider the preparation of cat states without two-qubit gates between data qubits. Instead, cat states are prepared solely by measuring stabilizers.

3.2 State preparation by measurement

As before, we outline the general procedure we use in this section to prepare n-qubit cat states by stabilizer measurements.

Procedure 2. Overcomplete sequences to tolerate measurement faults:
1. Prepare all the qubits in the X-basis, ensuring that we can then project into the +1-eigenspace of the X^{⊗n} operator.
2. Measure Z-type stabilizers to project into one of the unique X error spaces. For fault tolerance, an overcomplete sequence of stabilizers is chosen according to the techniques developed in Refs. [DR20, DRS22].
3. Based on the observed syndrome, apply an X correction to return to the desired codespace.

Careful gate scheduling and the use of extra ancillas allow measurements to be parallelized, reducing circuit depth. In Fig. 3.4, we show the reduced number of ZZ parity measurements needed for our deterministic circuits, compared to a Shor-type postselective protocol. The first graph shows the number of parity measurements needed with all-to-all connectivity, and the subsequent graph considers cat state preparation on a 1-D chain.

An alternative to Procedure 2 is to initialize all the qubits in the Z basis and measure the X^{⊗n} operator fault-tolerantly. Only one operator needs to be measured; however, performing the measurement fault-tolerantly can be quite expensive. Flag techniques that are fault-tolerant to arbitrary distance can require many extra qubits [CR20, AM22].
To cut costs, we consider measuring the Z-type stabilizers after initializing all the qubits in the X-basis. This requires only one quickly resetting ancilla qubit if the measurements are performed sequentially, which also reduces the circuit complexity.

Note that the fault tolerance setting considered here is not entirely the same as that in Refs. [DR20, DRS22]. In those references, an overcomplete sequence of measurements was determined for error correction, where it is clear that logical errors must be suppressed. With cat state preparation, the objective is simply to reduce the residual error weight at the end of the state preparation circuit. Moreover, from a technical standpoint, a distance-d fault-tolerant protocol from Refs. [DR20, DRS22] tolerates up to t = ⌊(d − 1)/2⌋ combined input errors and internal faults. In our state preparation circuits, we must tolerate any number of input errors, while also tolerating up to t internal faults.

3.2.1 Non-local measurements

Qubit layouts with non-local (especially all-to-all) connectivity are great tools for analytic resource estimates, but are difficult to construct experimentally. In this subsection, we first attempt to understand the limitations of non-local connectivity by deriving tight upper bounds for the number of parity measurements needed to tolerate one fault. We then apply these techniques to optimize the overhead when considering local connectivity. In doing so, we observe a scalable method of choosing parity measurements that may be fault-tolerant to arbitrary distance.

Theorem 5. Using Procedure 2, length-n cat state preparation tolerating a single fault needs at most n + ⌈log2(n/3)⌉ parity measurements.

Proof. We partition the proof into first determining the corrections for all the even-weight syndromes, followed by a rigorous analysis of the correction for odd-weight syndromes. Prepare all the qubits as |+⟩.
The first n measurements are of the n − 1 stabilizer generators of the code (Z_i Z_{i+1} (∀ i < n)) followed by Z_1 Z_n. This ensures every qubit is checked twice. Every X error space (in the absence of any faults) is now characterized by a unique even-weight syndrome. The corrections for these syndromes are the X errors that cause them.

We now prove that a single fault during these measurements causes an odd-weight syndrome. Note that a measurement fault on any of the n syndrome qubits causes a weight-one syndrome. At this stage, there are also exactly n possible locations for X faults (on each qubit, after its first check), which result in a weight-one error and a weight-one syndrome. Since we already consider the syndrome of every X error space, the effect of an X fault is the same as that of a measurement fault in a different error space. This leaves us with only n unique faults, affecting all the measurement error spaces. A fault at any of these n locations converts an even-weight syndrome to odd weight. Each odd-weight syndrome thus occurs as a result of a projection onto an X error space and one of the n different measurement faults.

Looking closer at the error spaces signaled by an odd-weight syndrome, they consist of a set of errors {e_i + cyc(f, i) | f ∈ F}, for an X error space e_i and F = {000 . . . 00, 100 . . . 00, 110 . . . 00, . . . , 111 . . . 00, 111 . . . 10}, where cyc(f, i) is a uniform right cyclic shift of the bitstring f by i indices, and the value of i depends on the error space e_i. Note that i does not index the error spaces. The resulting n distinct error spaces can then be partitioned into 2^{⌈log2(n/3)⌉} groups, using ⌈log2(n/3)⌉ parity measurements. Using techniques from 3, and with an appropriate Gray code, we can choose the qubits taking part in the parity checks. Note that each syndrome due to these extra parity checks must signal one of up to three overlapping errors, allowing us to choose the middle of the three errors as a correction. Table 3.3 shows a choice for these parity checks up to n = 24. For every odd-weight syndrome in the first n parity measurements, we have found an assignment of corrections. Ignore all syndromes that do not trigger any of the first n measurements. This completes the proof as all possible syndromes have assigned corrections.

[Figure 3.4 plots: number of parity measurements against cat state size, n = 3 to n = 8. First panel legend: Shor (minimum), d=3 (nonlocal), d=5 (nonlocal), d=7 (nonlocal). Second panel legend: Shor (minimum), d=3 (1-D), d=5 (1-D), d=7 (1-D).]
Figure 3.4: Reduced number of ZZ parity measurements required to prepare a cat state fault-tolerantly. We first consider non-local connectivity, proving that n + ⌈log2(n/3)⌉ measurements are sufficient to tolerate one fault. By random search, we found sequences of parity checks that can tolerate two or three faults too. In the second graph, qubits are laid on a 1-D chain, and CNOT gates are local. The number of parity measurements is generally larger than with non-local connectivity. The n = 8, d = 7 solution for local measurements is conjectured but not proved.

By performing measurements in parallel, all of the parity measurements can be implemented in three rounds, with n/2 extra qubits instead of one. Through exhaustive numerical search, we verified that for n ∈ {4, 5, 6}, the minimum required number of non-local two-qubit parity measurements matches our upper bound. We conjecture that the lower bound to tolerate a fault is at least n measurements.

d = 3
  n = 4    {1 . . . n}, (2, 3)
  n = 5    {1 . . . n}, (2, 4)
  n = 6    {1 . . . n}, (2, 5)
  n = 7    {1 . . . n}, (2, 4), (3, 6)
  n = 8    {1 . . . n}, (2, 5), (3, 7)
  n = 12   {1 . . . n}, (2, 8), (5, 11)
  n = 13   {1 . . . n}, (2, 8), (4, 10), (6, 12)
  n = 18   {1 . . . n}, (2, 11), (5, 14), (8, 17)
  n = 19   {1 . . . n}, (2, 6, 11, 16), (4, 14), (8, 18)
  n = 24   {1 . . . n}, (2, 8, 14, 20), (5, 17), (11, 23)
d = 5
  n = 6    {1 . . . n}, (2, 5), (1, 5), (1, 6), (3, 5)
  n = 7    {1 . . . n}, (2, 6), (3, 5), (1, 3), (2, 4), (3, 6)
  n = 8    {1 . . . n}, (2, 5), (3, 8), (2, 8), (4, 8), (2, 7)
  n = 9    {1 . . . n}, (2, 6), (2, 8), (5, 9), (4, 8), (7, 9)
  n = 10   {1 . . . n}, (3, 8), (5, 10), (3, 10), (1, 6), (1, 3)
d = 7
  n = 8    {1 . . . n}, (2, 5), (3, 8), (2, 8), (4, 8), (2, 7), (3, 6), (1, 8), (3, 5)
Table 3.3: Sequences of parity measurements needed to prepare cat states of size n fault-tolerantly to distance d. We assume non-local gates are possible, permitting parity measurement of distant data qubits. The sequences for the distance-five and -seven cases were generated at random, while the corrections were calculated and fault tolerance was verified using Mathematica programs. {1 . . . n} := (1, 2), (2, 3), . . . , (n − 1, n), (n, 1).

Simulations

To verify the correctness of our state preparation circuits and compare against previous work, we perform Monte Carlo simulations of our circuits using the Gottesman-Knill framework under an independent circuit-level noise model as described below.
• With probability p, the preparation of |0⟩ is replaced by |1⟩ and vice versa; similarly for |+⟩ and |−⟩.
• With probability p, a ±X or ±Z measurement on any qubit has its outcome flipped.
• With probability p, the two-qubit CNOT gate is followed by a random two-qubit Pauli error drawn uniformly from {I, X, Y, Z}^{⊗2} \ {I ⊗ I}.

We perform simulations of a weight-8 cat state preparation circuit, and target fault tolerance to distance seven. We are interested in the probability of a residual weight-k error, for k < n, using the different fault-tolerance techniques we have at hand. In Fig. 3.5, we compare the residual error probabilities using five different circuits, three of which are fault-tolerant to distance-seven. These include the n = 8, d = 7 sequence from Table 3.3, the measurement of the X^{⊗n} stabilizer fault-tolerantly to distance-seven and finally the encoding of an X^{⊗n} operator. The last two techniques are adopted from techniques in 2.
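As an aside on the parity-measurement sequences of Table 3.3, the shorthand {1 . . . n} can be expanded and the fault-free syndrome of an X error computed with a few lines of code. This is a minimal sketch with hypothetical function names; it does not model measurement faults or corrections, and only illustrates why the first n checks always give an even-weight syndrome (each qubit appears in exactly two of them).

```python
import itertools
import math


def ring_checks(n):
    """Expand {1 ... n} := (1,2), (2,3), ..., (n-1,n), (n,1)."""
    return [(i, i % n + 1) for i in range(1, n + 1)]


def syndrome(checks, x_error):
    """Fault-free outcome of each parity check for X errors on `x_error`."""
    return [sum(q in x_error for q in check) % 2 for check in checks]


if __name__ == "__main__":
    n = 4
    checks = ring_checks(n) + [(2, 3)]  # the d = 3, n = 4 row of Table 3.3
    # Measurement count matches n + ceil(log2(n/3)) from Theorem 5.
    print(len(checks), n + math.ceil(math.log2(n / 3)))
    for size in range(n + 1):
        for err in itertools.combinations(range(1, n + 1), size):
            s = syndrome(checks, set(err))
            print(err, s[:n], "extra:", s[n:])
```

In the absence of faults, every printed ring syndrome has even weight, so an odd-weight pattern in the first n outcomes signals that a fault occurred during the measurements.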
Additionally, we consider a d = 3 and a d = 5 sequence of parity measurements for completeness. The distance-seven techniques do suppress errors fault-tolerantly, with the stabilizer measurement method showing the lowest residual error rates for high-weight errors. Due to the reduced number of operations in the d = 3 and d = 5 sequences, low-weight errors are suppressed better than with d = 7.

[Figure 3.5 plots: weight-1, weight-2, weight-3 and weight-4 error rate against physical error rate. Legend: MZ, MZd5, MZd7, SPMd7, MXd7.]
Figure 3.5: Rates of residual errors of weight w ∈ {1, 2, 3, 4} after size-eight cat state preparation, for physical error rates p ∈ [5 × 10^{−3}, 2.5 × 10^{−2}]. The methods used consist of sequences from Table 3.3, the measurement of the X^{⊗n} stabilizer fault-tolerantly to distance-seven (MXd7) and finally the encoding of an X^{⊗n} operator fault-tolerantly to distance-seven (SPMd7). The distance-seven cases are fully fault-tolerant, as opposed to the distance-five case where three faults can result in a weight-four error. The lowest residual error rates are observed with the distance-seven fault-tolerant stabilizer measurement sequence in Table 3.3.

3.2.2 Local measurements on a 1-D chain

We finally consider the problem of preparing cat states on a line, using only local measurements. We show solutions that need fewer measurements than Shor-type schemes to achieve fault tolerance. This translates to fewer operations on the qubits and hence a higher fidelity output state.

Claim 1. With 1-D connectivity and for n ≥ 6, deterministic length-n cat state preparation fault-tolerant to distance-d needs at most d(n − 4) + 3 measurements. Length-4 and 5 cat states need 5 and 7 measurements respectively, and are fault-tolerant to distance-three.

First, we choose the number of times each parity is measured, and then define the parity checks in each layer. Measure the parity i (on qubits (i, i + 1)) the number of times specified by the entry of index i in this length-(n − 1) string: {1, (d + 1)/2, d, d, . . . , d, d, (d + 1)/2, 1}. The parity measurements are executed in at most 2d layers by executing them in parallel. This is done by splitting the n − 1 unique parities into two groups, where a qubit appears no more than once in each group. Measurements in each layer are drawn from the two groups in an alternating fashion. Over the first layer (and second layer for even n), the measurements at the ends of the line will be completed and need not be repeated. Proceed until all the d(n − 4) + 3 measurements are performed.

With this construction, we have computationally verified the preparation of cat states on a line up to size 12 for distance-three fault tolerance, and size 8 for distance-five fault tolerance. Due to CPU limitations we could not numerically verify sequences for larger cat states or higher distance. We leave the proof of Claim 1 to future work.

Chapter 4
Encoded state preparation

Building on improvements from the previous chapter, we now turn our attention to the preparation of more complex states. With cat states, fault tolerance is only needed to curb X error spread. The ZZ stabilizers provide tolerance to Z errors for free.
In this chapter, we consider general CSS stabilizer states for which both X and Z fault tolerance is required. We attempt to construct the preparation circuits using two models. In one, flags are used to encode stabilizer operators fault-tolerantly. In the other, states are prepared solely by performing fault-tolerant stabilizer measurements. With both methods, we may choose to postselect. However, we make these protocols deterministic. This is a first in stabilizer state preparation, and can greatly simplify the construction and scheduling of quantum state distillation factories.

We first describe what it means to prepare an ancilla state fault-tolerantly.

Condition 1. Fault-tolerant state preparation: To fault-tolerantly prepare an ancilla state for an [[n, k, d]] quantum code, any k faults in the circuit should propagate to a residual error of weight at most k, for k ≤ (d + 1)/2.

For fault tolerance to distance d′, the final condition is relaxed to k ≤ (d′ + 1)/2.

Quantum error-correcting codes [Sho95, Ste96a, CS96, Gai13, LB13] used fault-tolerantly can perform arbitrarily accurate computations with noisy components, provided the noise stays below a threshold as the system size is scaled up [Kni05b, ABO97, AGP06, Rei06]. It is natural, therefore, to push the threshold up as high as possible. In recent years, topological codes such as the surface code [FMMC12] and Floquet codes [HH21, GNFB21] gained prominence due to their high thresholds and simple hardware requirements. However, Shor-style schemes were used to derive these thresholds [Sho96, FMMC12].

Steane-style error correction with low-error CSS stabilizer states may permit higher thresholds of operation [Ste97, EAG24]. This has also been explored experimentally on ion trap and neutral atom systems [PBP+23, HBC23, BEG+24]. While the Shor scheme measures stabilizers individually, the Steane method extracts all the information for fault-tolerant error correction in a single round of transversal CNOT gates.
This implies there are fewer potential locations for faults, leading to a very high-fidelity error correction routine. Additionally, it was shown that allied codestates could facilitate single-shot error correction with Knill’s method [Kni05a], and Clifford computation in O(1) time [ZLBK20]. Universality can then be achieved by transversal non-Clifford gates and code switching [BKS21, PR13].

The study of the preparation of CSS stabilizer states is fairly extensive. Much attention has been given to the preparation of logical states of the quantum Steane and Golay codes [Ste03, Got16, PR12]. However, recent work has demonstrated a process to construct states of arbitrary CSS stabilizer codes fault-tolerantly [BZH+15, LZB17, ZLB18]. For the codes we consider in this chapter, we benefit from a more granular analysis of fault propagation. Refs. [HB21b, HJOY23] perform a hybrid Shor-Steane scheme of error correction. Our results may aid in the development of stabilizer states for these protocols too.

Previous protocols for stabilizer state preparation relied on postselection for fault tolerance. If a non-trivial syndrome is observed, the erroneous state is discarded. The major advantage of postselection is the simplicity of decoding; however, it raises many technical issues when actually constructing a fault-tolerant quantum computer. We consider two scenarios based on the availability of real-time control of quantum operations. Since state preparation is one of the earliest protocols in a quantum computation, without real-time control a lot of time is wasted running computations without knowing whether they should be rejected. On the other hand, acting on the measurements early and stopping the computation can save time. The drawback is that real-time scheduling of quantum operations in a large quantum computer can become computationally expensive in itself.
Due to these shortcomings, an interesting use case for deterministic protocols is in the first round of state preparation in a state distillation factory. Distillation is generally expensive and non-deterministic, which is remedied with our low-overhead deterministic protocol. In addition, as opposed to simple state injection techniques that encode information with error rate O(p), the protocols we describe in this chapter inject into the code with a failure rate O(p^{(d+1)/2}).

4.1 Deterministic fault-tolerant preparation of CSS ancilla states

In this chapter we show how to prepare code states of quantum codes using two techniques for fault tolerance, similar to Chap. 3. The first is to use flags to tolerate faults in ideal state preparation circuits and the second involves preparing the state by repeatedly measuring stabilizers. This method of projecting into a codestate is the same as Procedure 2, with the exception that the stabilizers are larger and thus will need flag techniques to make their measurement fault-tolerant. In either case, we may choose to let the protocol be postselective (if a non-trivial syndrome arises, reject); however, we attempt deterministic implementations. The side effect of 100% yield is that the logical error rate of the prepared states is higher than with postselection techniques. To make the protocol deterministic, we determine, for every possible observed syndrome, what correction permits projection into the code state while satisfying Condition 1.

Procedure 3. Flags to tolerate faults for stabilizer state preparation:
1. Construct a non-fault-tolerant ancilla state preparation circuit using |0⟩, |1⟩, |+⟩, |−⟩ states and CNOT gates.
2. Every qubit that acts as a control spreads X errors. For a qubit that controls (w − 1) CNOT gates, use the flag circuit for weight-w fault-tolerant cat state preparation from Fig. 2.4b to prevent malignant X error spread.
3. Similarly, target qubits spread Z errors.
This is mitigated with the dual ("Hadamarding") of the flag circuits for controlling X error spread.

Circuits that are fault-tolerant to larger distance can be constructed using the techniques developed in Sec. 2.3 and Ref. [CR20]. The lowest overheads were observed when the total number of qubits acting as control or target was lowest. Thus, non-fault-tolerant circuits constructed using the Latin rectangle method [Ste03] may be preferred over the overlap method of Ref. [PR12].

4.2 Steane code

Figure 4.1: Fault-tolerant circuit for encoding the operator XXXX, given the first data qubit does not start in a Z eigenstate. Data qubits are shown in black and ancilla qubits in red. Ancilla qubits flag correlated errors and apply appropriate corrections on to the data qubits. Note that this circuit is derived from the flag-fault-tolerant stabilizer measurement circuits in Chap. 2.

Figure 4.2: Fault-tolerant preparation of a [[7, 1, 3]] |0⟩_L state. (a) Circuit to ideally prepare the |0⟩_L state of the Steane code, needing nine CNOT gates over three rounds. (b) Condensed circuit to fault-tolerantly and deterministically prepare the |0⟩_L state of the Steane code. This circuit uses the same number of CNOTs as in (c), but has circuit depth seven, as opposed to 21 in (c). (c) Circuit to fault-tolerantly prepare the |0⟩_L state of the Steane code, using the weight-4 operator encoding circuit of Fig. 4.1. Two ancillas are needed if qubits can be measured and reset quickly.

Circuits for the preparation of encoded states of the Steane code go through periodic updates every 10 years [Ste97, Rei06, Got16]. The update of this decade is a circuit to prepare |0⟩_L fault-tolerantly and deterministically, as shown in Fig. 4.2.
The circuit uses 21 CNOT gates and either nine or ten qubits in total, depending on the qubit measurement speed. It is constructed by using flags to make the circuit in Fig. 4.2a fault-tolerant. Note that qubits 1, 2 and 4 in Fig. 4.2a contain the controls for 3 CNOT gates each, similar to the circuit for preparing a weight-4 cat state. This permits the use of flag techniques from Chap. 2 and Chap. 3. Using the fault-tolerant X^⊗4 encoding circuit in Fig. 4.1, we can construct the fault-tolerant state preparation circuit in Fig. 4.2c.

Notice that for each pair of flags, only the weight-two flag configuration is used. No corrections are applied for any of the weight-one flag configurations. This provides a clue for further optimization. Consider that with three total flag qubits, there is access to three distinct weight-two flag configurations, and even one of weight three. We show in Fig. 4.2b that the flag circuits for the three stabilizer encodings can be merged to use a common pool of ancilla qubits. This has the effect of minimizing the space overhead and circuit depth.

This technique of sharing flags has not been explored in great depth in this thesis, but it shows very high potential for reducing overhead. For example, similar to encoding multiple stabilizers in parallel, stabilizers may also be measured in parallel with a common pool of flag qubits. Note that the number of flag configurations increases exponentially, but the number of faults to be tolerated only increases linearly in the number of stabilizers measured.

To prepare the |0⟩_L state of the Steane code fault-tolerantly, we must only ensure that X errors of weight > 2 do not occur due to a single fault. Since the Z_L operator is included as a stabilizer, any Z error is equivalent to a Z error of weight at most one. This is because the Steane code is a perfect CSS code, which greatly simplifies the fault-tolerant circuit when preparing by stabilizer measurements.
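The statement that every Z error on |0⟩_L reduces to weight at most one can be checked by brute force. The sketch below is illustrative only: it assumes the three weight-four Z generators are labeled by the rows of the usual [7, 4, 3] Hamming parity-check matrix and takes Z_L to be the all-ones string, a labeling that is not taken from the circuits in this chapter.

```python
# Z-type stabilizers of the Steane |0>_L state, written as 7-bit support masks.
# Assumed labeling (not from the thesis): rows of the Hamming parity-check
# matrix, plus the transversal logical Z operator (all ones).
Z_GENERATORS = [0b0001111, 0b0110011, 0b1010101, 0b1111111]


def span(generators):
    """Return every GF(2) combination of the generator masks."""
    group = {0}
    for g in generators:
        group |= {element ^ g for element in group}
    return group


def reduced_weight(error, group):
    """Smallest weight of error * s over all stabilizers s in the group."""
    return min(bin(error ^ s).count("1") for s in group)


STABILIZERS = span(Z_GENERATORS)
# Worst case over all 2^7 possible Z error patterns.
WORST_CASE = max(reduced_weight(e, STABILIZERS) for e in range(2 ** 7))

if __name__ == "__main__":
    print(len(STABILIZERS), WORST_CASE)
```

Under these assumptions the stabilizer group is the set of Hamming codewords, so the worst-case reduced weight being one is just the statement that the Hamming code is perfect.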
The |0⟩_L state can be prepared by initializing the seven physical qubits in the |0⟩ state, and subsequently measuring three of the X-type stabilizer generators. Note that it is fault-tolerant to just measure the stabilizer generators once since regardless of whether a fault occurs, any Z error will be equivalent to a Z error of weight at most one. If there is no fault, the projected Z error space has been identified and a correction can return the state to the code space. If there was a fault, then it is fault-tolerant to project into any of the weight-1 error spaces.

Alternatively, the state may be prepared by initializing the qubits in the |+⟩ state and measuring the Z-type operators. This approach can have even lower overhead. Instead of measuring weight-four operators with flag qubits for fault tolerance, weight-three Z operators can be measured fault-tolerantly using just one qubit. The circuit will require 8 qubits overall, as in Ref. [Got16], if the ancilla can reset quickly. The caveat is that potentially more than four operators will need to be measured for distance-three X-type fault tolerance.

We can also prepare logical magic states of the Steane code fault-tolerantly using a transversal injection technique similar to that in Ref. [GHS+23]. First, initialize all qubits in the |H⟩ state to ensure the logical state is projected into the +1-eigenspace of the H_L operator. To complete the logical state preparation, the state is projected into the code space of the code using the Shor-type measurements of individual stabilizers. In this scenario, Z errors of weight two are possible, hence more than three X-type stabilizers will need to be measured for fault tolerance. We find that four weight-four stabilizers are sufficient for the Steane code for fault tolerance by postselection. It is unclear if this protocol can be made deterministic.
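As a small sanity check of the initialization step, the snippet below verifies numerically that |H⟩ is a +1 eigenstate of the single-qubit Hadamard gate. It assumes the common convention |H⟩ = cos(π/8)|0⟩ + sin(π/8)|1⟩, which the text above does not spell out; the function names are illustrative.

```python
import math

# Assumed convention (not stated in the text): |H> = cos(pi/8)|0> + sin(pi/8)|1>.
H_STATE = (math.cos(math.pi / 8), math.sin(math.pi / 8))


def apply_hadamard(state):
    """Apply the single-qubit Hadamard gate to a pair of real amplitudes."""
    a, b = state
    s = 1 / math.sqrt(2)
    return (s * (a + b), s * (a - b))


def is_plus_one_eigenstate(state, tol=1e-12):
    """True if the Hadamard gate maps the state to itself (eigenvalue +1)."""
    out = apply_hadamard(state)
    return all(abs(x - y) < tol for x, y in zip(out, state))


if __name__ == "__main__":
    print(is_plus_one_eigenstate(H_STATE))
```

Because the Steane code admits a transversal logical Hadamard, a product of seven such states is stabilized by H_L before the stabilizer measurements project it into the code space.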
This method of encoded magic state preparation is interesting because it possesses properties of both magic state injection and magic state distillation. In the former, individual magic states are injected from a physical qubit into a quantum error-correcting code with encoded error probability scaling as the error rate of physical magic state preparation, O(p). In the latter, many magic states with error rate p are processed through a Clifford circuit to produce one magic state with error rate O(p^k), physical gate fidelities permitting. Our preparation technique captures the better properties of both these methods. Magic states are injected from physical qubits into a quantum error-correcting code, while simultaneously improving the error probability to O(p^2).

4.2.1 Simulations

We simulate the circuits for state preparation using a stabilizer circuit simulator to collect statistics on the residual weight-one and weight-two errors. These simulations are performed according to the model in Sec. 3.2.1, for p ∈ [10^−2.5, 10^−1.6]. In doing so, we also compare the postselection protocols with the new deterministic ones. We plot the probability of weight-one and weight-two X errors, and the yield of the protocol.

In Fig. 4.3, we first compare results for different heralded methods as described in Fig. 1 of Ref. [Got16]. The fourth method, labeled 'Goto (d)', replaces the measurement of X0X5X6 in (c) with X2X4X5. With this measurement, the logical error rate is decreased with a small increase in the weight-one error rate. The yields of these postselection-based protocols are very high, considering very few gates need to be performed.

We compare the Goto (c) method with deterministic methods in Fig. 4.4. We consider four circuits. The first, from Ref. [Rei06], prepares two Steane code states with different preparation circuits followed by a Steane-style error correction procedure to identify the locations of residual X errors.
These residual errors can eventually be corrected to ensure Condition 1 is satisfied. The two flag methods used are the circuits shown in Fig. 4.2. Finally, the measurement-based (MB) method measures the three weight-four X stabilizers to prepare the state. The best performing deterministic method is the Steane-style error correction based method of Ref. [Rei06].

4.3 Golay code

The Golay code has enjoyed a similar amount of attention as the Steane code, with work on state preparation circuits established from the early days of quantum fault tolerance [Ste97, Rei06, PR12, ZLB18]. The protocol we demonstrate is the first, to our knowledge, to show that the |0⟩_L state of the Golay code can be prepared fault-tolerantly with 100% yield.

First, we consider state preparation by using flags to tolerate faults in operator encoding circuits. Similar to the distance-three Steane code, the distance-seven Golay code is also a perfect CSS code. Since we prepare |0⟩_L, the logical Z operator is also included in the stabilizer group. This ensures that any Z error of weight greater than three is equivalent to an error of weight at most three, implying only distance-five fault tolerance is required for Z-type errors. This fact was also observed in Refs. [Rei06, PR12].

We choose the Latin rectangle circuit from Ref. [PR12] with 77 CNOT gates as the base non-fault-tolerant circuit for the state preparation. The encoding of the stabilizer operators in this circuit is then made fault-tolerant using flags. We require that the eleven weight-eight X stabilizers are encoded fault-tolerantly to distance-seven and the twelve Z stabilizers are encoded fault-tolerantly to distance-five. This can be done using the flag circuits shown in Fig. 4.5. We perform all the X-type stabilizer encodings sequentially, allowing reuse of the nine flag qubits for each X-stabilizer.
As a consequence, the flag qubits for the Z-type operator encodings will need to be kept active for the entire duration of the state preparation. In total, the number of qubits needed is 23 + 9 + 12 × 6 = 104, with a total of 419 CNOT gates. In contrast, the best performing postselection circuit, from Ref. [PR12], uses 69 qubits simultaneously, with 297 CNOT gates.

Figure 4.3: Rates of residual weight-1 and logical errors after preparing |0⟩_L of the Steane code using postselection-based fault-tolerant circuits. We also plot the postselection yield. The upper panel shows the weight-1 (wt-1) and logical (log) error rates of the methods Goto (a) to Goto (d), and the lower panel their yield, both against the physical error rate. The circuits considered here are found in Fig. 1 of Ref. [Got16]. The method labeled '(d)' replaces the X0X5X6 from (c) with X2X4X5. This improves the logical error rate at the cost of a higher weight-one error rate.

Figure 4.4: Rates of residual weight-1 and logical errors after preparing |0⟩_L of the Steane code using deterministic fault-tolerant circuits. The upper panel shows the weight-1 (wt-1) and logical (log) error rates of the methods Goto (c), AliRei, Flag, Flag2 and MB, and the lower panel their yield, both against the physical error rate. We consider a method suggested by Aliferis and Reichardt in Ref. [Rei06], the flag-based circuits in Fig. 4.2, and a circuit to prepare the state by measuring X-type stabilizers. For physical error rates below 2 × 10^−2, the logical error rate of the best deterministic protocol is about five times worse than the postselection-based protocol.

Figure 4.5: Flag-based circuits to encode high-weight X-type operators fault-tolerantly to distance five and seven. Corrections are computed from the flag measurement outcomes using the techniques in Sec. 2.3. (a) A circuit to encode a weight-eight operator fault-tolerantly to distance seven. (b) A circuit to encode a weight-eight operator fault-tolerantly to distance five. (c) A circuit to encode a weight-seven operator fault-tolerantly to distance five.

We finally consider the preparation of the |0⟩_L state of the Golay code by measuring stabilizer operators. We first consider preparation by initializing all the physical qubits in the |0⟩ state, followed by X-type stabilizer measurements. This is favored since we are only required to find a sequence of X-type stabilizers to measure that is distance-five fault-tolerant to faults corrupting measurements (any Z error is at most weight-three). Even with this simplification, a fast Mathematica script struggled to verify the fault tolerance of sequences longer than 28 stabilizers. Unfortunately, an appropriate distance-five sequence was not found. However, distance-three fault-tolerant sequences were determined at length 19.

Checking random sequences can be very time-consuming, as there are many options of stabilizers to consider measuring. We also tried an alternative method, described in Sec. IX C of Ref. [DR20], to find a distance-five sequence. Stabilizers in the sequence are chosen as cyclic shifts of the following weight-eight stabilizer generator: 10000000000111110010010. We verified that a sequence of 20 stabilizers is sufficient for distance-three fault tolerance. Our computer could not completely verify 29 stabilizers for distance-five fault tolerance. We conjecture that a sequence of 29 to 31 stabilizers will be required.

4.3.1 Simulations

Using a similar approach as in Sec.
4.2, we simulate the state preparation circuits under a depolarizing noise model to collect error rate statistics. The rates of residual X errors of weight k are determined, for k ≤ 4. We compare the performance of the postselection-based protocol of Ref. [PR12] and our circuit based on the flag-fault-tolerant encoding of operators. The residual error rates with Paetznick’s postselection protocol match those of the deterministic protocol, albeit at a physical error rate that is a factor of ten higher. At similar physical error rates, the deterministic protocol performs on par with some of the higher overhead postselection-based protocols of Ref. [ZLB18]. These protocols are much more useful for the distillation of codestates of much larger codes. Quantum codes as small as the [[23, 1, 7]] Golay code benefit from a more granular analysis of fault propagation.

Figure 4.6: Comparison of the rates of residual errors between the 69-qubit postselection-based protocol of Ref. [PR12] and a circuit that uses flags to fault-tolerantly encode the stabilizer operators of the state. The upper panel shows the weight-1, weight-2, weight-3 and logical error rates of the two methods (Paetz and Flag), and the lower panel their yield, both against the physical error rate. The flag-based corrections are deterministic, allowing us to demonstrate the first state preparation circuit for the Golay code that is deterministic. The postselection method is very robust and works well even at high physical error rates. The deterministic method is only really useful at a physical error rate below 10^−3.

Chapter 5 Fault-tolerant error correction for distance-four quantum codes with postselection

Error correction and postselection. Noisy Intermediate-Scale Quantum (NISQ) algorithms for eigensolvers [PMS+14, WHT15] and machine learning [BWP+17] are growing as popular applications for state-of-the-art few-qubit quantum systems.
Unfortunately, these devices are still prone to large amounts of noise [MWHH21, AAB+19, PFM+21]. Although error correction can decrease error rates [Sho95, Got97, Ter15], current experiments encode only one logical qubit that is still fairly noisy [RABL+21, CSA+21, EDN+21, Woo20]. In this paper we simulate storing multiple logical qubits in a lattice, as a first step toward modeling few-qubit computations. We repeatedly correct and remove single-qubit errors. On detecting a more dangerous (and less common) two-qubit error, we reject and restart. This "postselection" technique allows distance-four codes to achieve similar logical error rates to distance-five codes. For example, as shown in Table 5.1, a distance-four code can correct errors on six logical qubits with a similar failure rate as the distance-five surface code, using only 10% as many physical qubits. The table also shows that acceptance rates are fairly high, so occasional restarts should not be a major issue for low-depth NISQ algorithms [CAB+21, BCLK+22].

Table 5.1: Postselected error correction for 6 logical qubits using [[16, k, 4]] codes on the 25-qubit planar layout of Fig. 5.2. The probability of logical error, acceptance, and expected time to complete are shown for 300 time steps, with noise rate p = 5 × 10^−4. The k = 6 code achieves logical error rate close to the distance-5 surface code using only 10% of the qubits. In comparison, for 6 physical qubits at memory error rate p/10, the probability of error is about 6 × 300 × p/10 = 0.09.

Code             Postselection   Qubits         P(Logical error)   Acceptance   E[time]
k = 2            yes             25 × 3 = 75    .032               11%          1070
k = 4            yes             50             .028               21%          725
k = 6            yes             25             .017               36%          530
d = 4 surface    yes             150            .009               3%           2820
d = 3 surface    no              78             .143               100%         300
d = 5 surface    no              246            .016               100%         300

Postselection is a versatile tool in the quantum toolkit. In experiments, it has been used to decode the [[4, 2, 2]] error-detecting code [LGL+17, TCC+17] and the [[4, 1, 2]] surface code [ARL+20, CSA+21]. In theoretical research, it has been used to reduce the logical error probability of state preparation [PR12, ZLB18] and magic state distillation [BK05, MEK13]. Recently, postselection has been used to improve quantum key distribution [JAR20, SM21] and learning quantum states [Aar18, MJR+20].

Knill previously combined postselection with error correction, on concatenated distance-two codes, to show an impressive 3% fault tolerance threshold [Kni05b]. We also combine postselection and error correction, but with distance-four codes. As Table 5.2 indicates, distance-two codes can detect single errors and distance-three codes can correct them, meaning logical errors are due to second-order faults. Distance-three codes may alternatively be used to detect one or two errors, but then they lose the ability to correct and computations are very short-lived. We choose to use distance-four codes since they can simultaneously correct an error and detect two errors. Correcting some errors ensures restarts are less frequent, so longer computations can be run. Since logical errors are caused only by third-order faults, logical error rates are very low.

Table 5.2: Distance-four codes with postselection lead to O(p^3) logical errors, much like distance-five codes. Even-distance codes require restarts, however, unlike odd-distance codes.

Distance   Error weight 1     Error weight 2     Error weight 3
2          Detect, restart    Logical error
3          Correctable        Logical error
4          Correctable        Detect, restart    Logical error
5          Correctable        Correctable        Logical error

Physical layout. In practice, it is difficult to build a quantum computer with native (fast, reliable) two-qubit gates between every pair of qubits. Instead, qubits are placed on a one or two-dimensional lattice and two-qubit gates are mediated by local interactions, as in superconducting architectures and solid state systems.
Current ion trap systems use long-range gates [NMM+14] and transport mechanisms [RABL+21] to connect all the qubits, but some degree of locality is required for larger systems.

Figure 5.1: Codes considered in this paper, with associated distance-four fault-tolerant Z or X stabilizer measurement sequences. For each of the [[16, 6, 4]] (k = 6 code), [[16, 4, 4]] (k = 4 twisted color code), [[16, 2, 4]] (k = 2 color code) and [[16, 1, 4]] (d = 4 surface code) codes, the figure lists the weight-2 and weight-4 stabilizer generators, the logical qubits, and the distance-4 stabilizer measurement sequence. (The last three codes are self-dual CSS.) Time steps of parallel measurements are separated by "|". (∗) For the surface code, fault-tolerant X and Z error correction is carried out using a rolling window of four syndromes, each measured in two time steps.

Figure 5.2: Planar layout of 16 data and 9 ancilla qubits, in black and red respectively. CNOT gates are allowed along the edges. Grey edges are required for the surface code, and green edges between ancillas are required for the new codes in this paper.

In light of these connectivity constraints, it may be wiser to choose quantum codes that can be laid out on a lattice such that error correction requires the fewest number of local native gates. The popular surface code has the attractive feature that it requires only nearest-neighbor interactions on a 2D square lattice [FMMC12, CTV17]. Similarly, error correction for topological codes has been investigated on sparser degree-three lattices [CZY+20, CKYZ20, GNFB21]. But there is insufficient research on the performance gains of denser connected layouts. We suggest using 16-qubit codes on the 25-qubit rotated square lattice of Fig. 5.2, where ancilla qubits additionally interact with neighboring ancillas. This allows the use of flag qubits for fault tolerance [CB18, CR18b, CR20, PR23], in turn allowing measurement of large stabilizers. As shown in Fig.
5.1, we choose distance-four codes whose stabilizer generators are fairly local, with short Shor-style stabilizer measurement sequences that do not require any SWAP gates. We consider two block codes and a color code that encode multiple qubits [DR20], and the rotated surface code [BMD07] as a benchmark for postselection. In contrast, before the advent of topological codes, block codes were used for the simulation of 2D local error correction [SDT07, SR09, LPSB14]. These proposals performed Steane error correction on small distance-two and -three codes, and required many swaps.

Results. We compare our 16-qubit codes with the 25-qubit, distance-five surface code. We show below in Fig. 5.9 that, with rejection, the normalized logical error rate of the proposed codes is less than that of the distance-five surface code by as much as one order of magnitude. The distance-four surface code actually achieves two orders of magnitude separation. However, the logical error rate per time step does not capture the drawback of restarts. Instead, a better metric is the cumulative probability of logical error. Figure 5.3 compares this metric between the different codes for short computations that do not restart too often (more information in Fig. 5.10). For one logical qubit, the distance-four surface code vastly outperforms its distance-five counterpart, and the k = 2 and k = 4 codes achieve a good balance of low qubit overhead and low logical error rate. We also show that just 50–75 physical qubits are sufficient for good protection of twelve logical qubits. Overall, we obtain lower logical error rates with higher encoding rates, using postselection and multi-qubit codes.

[Figure 5.3 compares physical qubit counts and logical error probabilities for short computations in three cases: d = 4 versus d = 5 surface code for 1 logical qubit (physical qubits 1 : 1.64, logical error probability ≈ 1/25×); k = 2 code versus d = 5 surface code for 2 logical qubits (physical qubits 1 : 3.28, logical error probability ≈ 1/2×); k = 6 code versus d = 5 surface code for 6 logical qubits (physical qubits 1 : 9.84, logical error probability approximately equal).]

Figure 5.3: Summary of results.
For short computations, the probability of a logical error in the distance-4 rejection-based surface code is approximately 25 times lower than that of the distance-5 variant. Further, for 6 logical qubits, the k = 6 code on one patch of 25 qubits can match 6 patches of the distance-5 surface code.

In Appendix 5.3.1, we compare the storage error rate of unencoded qubits with the encodings in Fig. 5.1. As expected, at error rates up to 10^−3, fault-tolerant error correction is more robust than leaving qubits idle.

Future work. In order to verify these results on current quantum systems, some work is required. Dense qubit connectivity in ion trap systems may allow for simple measurement of high-weight stabilizers, but superconducting devices generally prefer low qubit degree due to high crosstalk errors. It may be possible to modify the circuits in this work to allow maximum qubit degree at most five or six, such as in the IBM Tokyo device [TC21]. Consequently, in Fig. 5.12 below, we show that error correction of the k = 2 code is possible with degree-four connectivity, but requires many extra qubits.

We only show how to do fault-tolerant error correction, but the ultimate goal is to perform quantum computation. Selective logical measurements could induce computation within a patch, and transversal gates between vertically stacked code patches could facilitate non-Clifford gates. If these operations introduce a low amount of error, it may be possible to execute relatively high-depth circuits. These tools can then be used to execute short NISQ and magic state distillation algorithms. As an example, our results show that just 50 physical qubits may be sufficient to demonstrate 10-to-2 MEK distillation experimentally with O(p^3) logical errors [MEK13].

Organization. In Sec. 5.1 we provide more details about distance-four codes and the examples we choose in this paper. Sec. 5.2 details the methods used for fault tolerance.
In particular, stabilizer measurement circuits are dealt with in Sec. 5.2.1 and sequences of stabilizer measurements are handled in Sec. 5.2.2. The noise model and results of simulations are contained in Sec. 5.3. Sec. 5.4 concludes with a discussion of future work and open questions.

5.1 Codes

We compare the error correction performance of six [[n, k, d]] stabilizer quantum codes, where n is the number of physical data qubits and k is the number of logical qubits. A distance-d quantum code should correct all errors of weight j ≤ t = ⌊(d−1)/2⌋, occurring at rate O(p^j), for error rate p. At low error rates, an unlikely error of weight (d − j) may be misidentified as the more likely weight-j error, inducing a logical flip on recovery. In even-distance codes, errors of weight d/2 can be detected, but applying a correction may induce a logical flip. In this paper, we stop the computation instead of attempting to correct, ensuring logical flips only occur at rate O(p^{t+2}) and not O(p^{t+1}) as before. For the distance-four codes shown in Fig. 5.1, we show that the logical error rate scales as O(p^3) like a distance-five code.

As a benchmark we first consider the rotated distance-four surface code [TS14] on the layout of Fig. 5.2. For a fair comparison of both the resource requirements and logical error rate, we consider additional benchmarks: the distance-three and -five surface codes. As in Ref. [TS14], each distance-d surface code uses d^2 data qubits and (d − 1)^2 ancilla qubits.

The next three codes are the central focus of this work. These self-dual CSS (Calderbank-Shor-Steane) codes were first considered in Ref. [DR20], to show examples of codes that can be constructed to have single-shot sequences of stabilizer measurements. By fixing some of the logical operators, the k = 6 code can be transformed into the k = 4 and k = 2 codes. Alternatively, puncturing the k = 6 code yields the well-known [[15, 7, 3]] Hamming code.

Improvements.
Although these codes encode more logical qubits, they suffer from the difficult task of having to measure weight-eight stabilizers. It is possible to construct a [[16, 2, 4]] subsystem code with only weight-four stabilizers and gauge operators. Using the layout of Fig. 5.2, we compared this code with the k = 2 subspace code in this paper but found no significant improvements. This code is still useful, however, as we show in Sec. 5.4.

Many other codes can also be constructed with 16 qubits. For a biased-noise system, a CSS code with two logical qubits can be constructed with Z-distance six and X-distance four. For more logical qubits, a non-CSS [[16, 7, 4]] code can be used [Gra07]. Although its stabilizer generators are larger, flag-based measurement may still offer a low-overhead route to fault-tolerance.

5.2 Fault-tolerant error correction

A stabilizer measurement circuit is made fault tolerant to quantum errors by using extra physical qubits. These ancillas are used to catch faults that may spread to high-weight errors. In contrast, the bad faults in a syndrome extraction sequence flip syndrome bits. Additional stabilizers are measured, essentially encoding the syndrome into a classical code.

A circuit is fault-tolerant to distance d if j ≤ t = ⌊(d−1)/2⌋ mid-circuit faults cause an output error of weight at most j. Additionally for even distance fault tolerance, sets of d/2 faults spreading to weight > d/2 errors should be detected so the computation can be restarted. When these faults yield an error of weight d/2, the computation is restarted if the faults can be detected, else it is rejected in the next round of error correction.

5.2.1 Stabilizer measurement circuits

Quantum error correction involves the measurement of a set of operators called stabilizers, to diagnose the location of errors.
For fault-tolerant error correction, these stabilizers may be measured individually, as in Shor's scheme [Sho96], or together, using Steane- or Knill-type syndrome extraction [Ste97, Kni05a]. The flag method is a popular spin-off of Shor's scheme [CB18, CR18b, CR20, PR23]. By connecting multiple data qubits to each flag qubit, large stabilizers can be measured with relatively low overhead. In addition, flag circuits can be made fault-tolerant only up to a desired degree. For example, Shor-style measurement of a weight-w stabilizer needs w + 1 ancillas and is fault-tolerant to distance w, but we show a weight-eight stabilizer measurement circuit with six ancillas that is fault-tolerant to distance four.

For distance-three fault tolerance, we show in Fig. 5.4 that one fault in the circuit should result in an error of X and Z weight at most one. For distance four, if two faults occur and can be detected, the computation must be rejected and restarted. If this detection is not possible, the circuit must be designed to ensure errors cannot spread to weight greater than two. Note that a fault may alter the value of the measured syndrome bit; syndrome bit errors are dealt with in Sec. 5.2.2.

Figure 5.4: (a) A distance-4 stabilizer measurement circuit contains ancilla preparation, CNOTs, measurement and a recovery. (b) Rules for fault tolerance. One fault should be corrected to an error of X/Z weight at most one; this is sufficient for distance 3. Two faults should either be rejected (denoted by the red R) or result in an error of weight two.

We develop flag-based stabilizer measurement circuits. For this, we use a randomized search algorithm constrained by the above fault tolerance rules and the geometric locality of Fig. 5.2. With all six codes in this work, the stabilizers that are measured are of weight two, four and eight. At the circuit level, the measurement of a weight-two stabilizer is automatically fault-tolerant (one fault causes an error of weight at most one). A weight-four stabilizer measured fault-tolerantly to distance three (i.e., one fault results in an error of weight at most one) is automatically fault-tolerant to distance four, as two faults occurring in the circuit cannot create data errors with X and Z weight greater than two. In Fig. 5.5a, the weight-four stabilizer measurement circuit applies a correction only for the 01 ancilla measurement.

Figure 5.5: (a) Circuit to measure a weight-four X stabilizer fault-tolerantly to distance four, satisfying the locality constraints in (b). The ±Z measurements are used to flag mid-circuit faults. Gates bunched together can be performed in parallel. (b) Two layouts for measuring stabilizers in the sequences of Fig. 5.1.

Figure 5.6: (a) Circuit to measure a weight-eight X stabilizer fault-tolerantly to distance four, satisfying the locality constraints in (b). One fault is corrected to at most a weight-one error, but two or more faults may either be corrected, or detected, resulting in rejection. The resulting flag outcomes for corrections and rejection are tabulated in Appendix .3. (b) Two layouts for measuring stabilizers in the sequences of Fig. 5.1.

Figure 5.6a shows a novel circuit to measure weight-eight stabilizers fault-tolerantly to distance four. This circuit uses different patterns of flag-qubit measurements to either correct an error or reject, the latter detecting an O(p^2) fault event.
The flag patterns associated with corrections or with rejection have been tabulated in Appendix .3. Figures 5.5b and 5.6b show different ways of arranging the qubits to measure weight-four and weight-eight operators. Improvements. The benefit of measuring stabilizers individually is that error decoding is relatively simple. When stabilizers with overlapping support are measured in parallel, as in the surface code, more complicated decoding algorithms like minimum-weight perfect matching are required. However, we can still make small improvements for additional parallelism. In the k = 2 and k = 4 codes, only two of the corner weight-four stabilizers can be measured simultaneously, as each stabilizer requires three ancilla qubits. We conjecture that by sharing one ancilla qubit among all the corner stabilizers, it may be possible to fault-tolerantly measure all four of them using just nine ancilla qubits, like in Ref. [Rei20]. Alternatively, in Steane-style syndrome extraction, subsets of stabilizers are measured in parallel using n-qubit resource states. Nine ancilla qubits are not sufficient for Steane’s method, but Ref. [HB21b] shows that any subset of stabilizers can be jointly measured with specific resource states. If the fault-tolerant preparation of those resource states is possible on the nine-qubit ancilla sub-lattice of Fig. 5.2, it will be possible to develop faster and more efficient stabilizer measurement circuits. The circuit of Fig. 5.6a uses six ancilla qubits for distance-four fault tolerance. We found a distance-three fault-tolerant circuit using only four ancillas (qubits 9, 10, 11, 13 in Fig. 5.6b), which also requires fewer rounds of parallel gates. On the layout of Fig. 5.2, this may free up enough ancillas to measure two weight-eight stabilizers in each time step. The result is that more stabilizers can be measured faster and data qubits in an error correction block experience less idle noise. 
Since these 71 Syndrome Correction 110 1 101 2 correct 011 3 X correct bit 3 X X Not fault tolerant Fault tolerant Figure 5.7: Fault-tolerant error correction with the three bit repetition code {000, 111} (adapted from Fig. 1 of Ref. [DR20]). It is not fault tolerant to correct errors based on the two parity measurements 1 ⊕ 2 and 1 ⊕ 3. An internal fault on bit 1 can be mistaken for an input error on bit 3, as they yield the same syndrome. Errors can be corrected fault-tolerantly by adding another parity check, 2 ⊕ 3. Now for up to one fault at any of the circled locations, an input error is corrected, and an internal fault leaves an output error of weight 0 or 1. circuits are only fault-tolerant to distance three, a future avenue of research could use techniques in Ref. [CR18b] to look at their performance in adaptive distance-four error correction. 5.2.2 Stabilizer measurement sequences The correction of errors in a quantum code requires a syndrome built from the measurement results of a sequence of stabilizers. Since syndrome extraction is noisy, it is generally not sufficient to measure just a set of stabilizer generators, as shown in Fig. 5.7. Even one erroneous collected syndrome bit can result in an incorrect recovery, pushing the code into a state of logical error. Instead, more stabilizers are redundantly measured to protect from quantum faults that cause syndrome bit flips. The distance-d surface code does this by measuring the stabilizer generators sequentially ⌈ d 2 ⌉ times, in a syndrome repetition code. We may port this technique to the k = 2, 4 and 6 codes, but recent research has shown that these codes have very small stabilizer measurement sequences [DR20]. The depth of a quantum circuit is generally calculated as the number of rounds of parallel two-qubit gates, since single-qubit gates are trivially short. However, in current systems, the time needed for measurement dominates over the length of a CNOT [CSA+21, EDN+21, RABL+21, RGD+20]. 
Hence the focus shifts from minimizing CNOT depth to reducing the rounds of measurements needed for error correction. We therefore denote by "time step" the time needed to measure a set of stabilizers in parallel, as shown in Fig. 5.8a. In addition to finding short fault-tolerant sequences of stabilizers, we carefully parallelize their measurement circuits to further speed up error correction.

Figure 5.8: (a) A stabilizer measurement sequence (SMS) consists of multiple time steps of parallel stabilizer measurement circuits, where the end of a time step denotes the simultaneous measurement of all the ancilla qubits. (b) Rules for distance-4 fault tolerance; the first two are sufficient for distance 3: (i) An input 1-qubit error must be corrected. (ii) 1 internal fault must be corrected to an error of weight at most 1. (iii) A 2-qubit input error is rejected. (iv) 1 input error and 1 internal fault should be corrected to an error of weight at most 1 or rejected. (v) 2 internal faults must be rejected or propagate to an error of weight at most 2.

Fault tolerance rules. We follow the 'exRec' formalism of Ref. [AGP06] to determine rules for fault-tolerant error correction, as shown in Fig. 5.8b. For distance-three fault tolerance, only two rules are needed. If the input to an error correction block has a weight-one error and there are no internal faults, the syndrome must be sufficient to correct back to the codespace. This is actually the basic rule for an ideal error correction block. If there are no input errors and one internal fault occurs, the weight of the output error after recovery should be at most one. For distance-four fault tolerance, we must consider the effect of up to two input errors or internal faults. If the input error has weight two and there are no internal faults, then the stabilizer measurement sequence must detect the error and restart the computation.
If there is a weight-one input error and an internal fault, either the computation is restarted, or the output of the error correction block must have error of X and Z weight at most one. Finally, if two internal faults occur with no input error, either the computation is restarted, or the output error must have weight at most two. (In the last four rules, the resulting syndrome must never be equivalent to a weight-one input error on a different qubit. This ensures that every weight-one input error can be reliably corrected.)

Solutions. To perform distance-four fault-tolerant error correction on the layout of Fig. 5.2, we consider measuring stabilizers only of the form given in Sec. 5.2.1. The goal is then to devise short parallel stabilizer measurement sequences occupying the fewest time steps while satisfying the fault tolerance rules above. For each of the newly proposed codes, the sequences in Fig. 5.1 were found using randomized search and subsequent minor alterations. The k = 2 code measures ten X (or Z) stabilizers over five time steps, hence recovery occurs every ten time steps. On the other hand, the k = 6 code contains no stabilizers of weight less than eight, so parallelism is difficult. Here, seven X (or Z) stabilizers are measured over seven time steps, for a total of 14 time steps between recoveries.

The distance-four surface code measures all its stabilizer generators in two time steps. The first is used to measure the nine weight-four stabilizer generators using the nine ancilla qubits, and the second time step is used to measure the boundary weight-two stabilizers. Recovery occurs after two fresh syndrome layers are measured, at a frequency of four time steps.

5.3 Simulation results

Noise model. For simulation, we consider independent circuit-level noise, as described below:
• With probability p, the preparation of |0⟩ is replaced by |1⟩ and vice versa; similarly for |+⟩ and |−⟩.
• With probability p, ±X or ±Z measurement on any qubit has its outcome flipped.
• With probability p, a one-qubit gate is followed by a random Pauli error drawn uniformly from {X, Y, Z}.
• With probability p, the two-qubit CNOT gate is followed by a random two-qubit Pauli error drawn uniformly from {I, X, Y, Z}^(⊗2) \ {I ⊗ I}.
• After each time step, with probability p(1 + m/10), each data qubit is acted upon by a random one-qubit Pauli error drawn uniformly from {X, Y, Z}. (A time step denotes one round of parallel stabilizer measurements of maximum CNOT depth m, as in Sec. 5.2.2.)

The rest error rate models the observed performance of current-day quantum systems, where the time taken to measure an ancilla qubit is long compared to the CNOT gate time. We model the rest error rate during measurement as p, and during CNOT gates as p/10. Even with dynamical decoupling [VKL99], the error incurred by the idle data qubits can be quite high.

Normalized logical error rate. The logical error rate of fault-tolerant storage can be estimated by checking for a logical error after each block of error correction. However, different codes correct errors at different frequencies: once every four time steps for the surface code, but fourteen for the k = 6 code. To compare the codes on a similar time scale, we normalize the logical error rates with respect to time step. We plot the logical error rate per time step in Fig. 5.9, where we show that a distance-four surface code has a storage error rate of O(10^−9), for a CNOT gate error rate of just 10^−4. Even with the infidelity of current-day CNOTs, ∼10^−3, we show logical error rates approaching 10^−6. These results demonstrate the benefits of postselection.

Cumulative logical error probability. The mean rejection rates in Fig.
5.9 provide a good comparison of how often the different codes reject, but do not accurately describe behavior for bounded-length computation. Here, a more useful metric is the probability of acceptance, P_a(t), which is how often a t-time-step computation completes. This quantity can be estimated empirically by simulating the application of noisy error correction to an initial state for bounded time, which we denote as a simulation 'run'. If R is the total number of executed runs and R_a(t) is the number of runs that have not rejected until time step t,

\[ P_a(t) = \frac{R_a(t)}{R}. \tag{5.1} \]

Figure 5.9: O(p^3) scaling of X logical error rate and O(p^2) scaling of rejection rate, with error bars, for the distance-four codes. The distance-three and distance-five surface codes are shown for comparison. The new codes have logical error rate per time step as low as 1/10th the distance-five surface code. The distance-four surface code is as low as 1/100. (Axes: physical error rate p versus logical error rate per time step. Curves: k = 6, k = 4, k = 2, distance-4 surface with rejection, distance-5 surface, distance-3 surface, mean rejection rates, and reference lines c p^2 and c′ p^3.)

Similar to the rejection rate, the logical error rate per time step is indicative of the frequency of logical errors, but does not help to understand the drawbacks of postselection. We again refer to a cumulative metric, the probability of a logical error after t time steps of error correction, empirically given by

\[ P_L(t) = \frac{R_L(t)}{R}, \tag{5.2} \]

where R_L(t) is the number of runs in a state of logical error at time step t. For even-distance codes, one must instead look at the probability of logical error conditioned on acceptance, which is calculated as

\[ P_{L|a}(t) = \frac{P_L(t)}{P_a(t)} = \frac{R_L(t)}{R_a(t)} = \frac{R_L(t)}{R \, P_a(t)}. \tag{5.3} \]

For even-distance codes with postselection, P_a(t) < 1 and so P_{L|a}(t) > P_L(t). For odd-distance codes that do not reject, P_{L|a}(t) = P_L(t). The above formula holds only for a single code patch.
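The single-patch estimators of Eqs. (5.1)–(5.3) can be sketched in a few lines of Python. This is a minimal illustration only, not the simulation code used for the results: the `Run` record, its field names and the helper functions are hypothetical, and the sketch assumes that a logical error persists once it appears and that R_L(t) counts only runs still accepted at time t.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Run:
    """Hypothetical record of one simulation run (illustrative format only)."""
    rejected_at: Optional[int] = None       # time step at which the run rejected, if any
    logical_error_at: Optional[int] = None  # time step at which a logical error appeared, if any


def accepted(run: Run, t: int) -> bool:
    """True if the run has not rejected by time step t."""
    return run.rejected_at is None or run.rejected_at > t


def p_accept(runs: Sequence[Run], t: int) -> float:
    """Eq. (5.1): P_a(t) = R_a(t) / R."""
    return sum(accepted(r, t) for r in runs) / len(runs)


def p_logical(runs: Sequence[Run], t: int) -> float:
    """Eq. (5.2): P_L(t) = R_L(t) / R.

    Assumption: a logical error persists once it occurs, and only runs that
    are still accepted at time t are counted in R_L(t).
    """
    r_l = sum(
        accepted(r, t) and r.logical_error_at is not None and r.logical_error_at <= t
        for r in runs
    )
    return r_l / len(runs)


def p_logical_given_accept(runs: Sequence[Run], t: int) -> float:
    """Eq. (5.3): P_{L|a}(t) = P_L(t) / P_a(t)."""
    pa = p_accept(runs, t)
    if pa == 0:
        raise ValueError("no accepted runs at this time step")
    return p_logical(runs, t) / pa
```

With four toy runs, one rejecting at step 50 and one entering a logical error at step 30, the conditional estimate at t = 60 is one logically incorrect run out of three accepted runs.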
The probability of logical error while using multiple patches can be upper bounded from the data for a single patch as

\[ P_{L|a}(t, c) \le \frac{c \, R_L(t)}{R \, P_a^{c}(t)}, \tag{5.4} \]

where c is the number of code patches used. Note that the number of logically incorrect runs grows linearly with the number of patches, but the probability of acceptance of multiple patches is the probability that every patch has accepted.

Discussion. We simulated fault-tolerant error correction of the codes in Fig. 5.1 for up to 12000 time steps at error rate p ∈ {0.001, 0.0005, 0.00025, 0.0001}. Using the empirical formulae above, we then plotted in Fig. 5.10 the probability of X logical error conditioned on acceptance and the probability of acceptance for one, two, six or twelve logical qubits. Note that some plots look discontinuous. This is because we only check for logical errors and reject at the end of an error correction block. Above the graphs of Fig. 5.10, we compare the number of physical qubits required for each code.

Table 5.3: Error correction for 1 logical qubit at p = 0.001. The probability of logical error and acceptance are shown for 80 and 200 time steps. Each code uses one patch of qubits. The distance-four surface code has the lowest logical error probability for short computations.

                   t = 80               t = 200
Code    Qubits   P_L|a      P_a       P_L|a       P_a
k = 2   25       .0037(2)   53.9%     .0099(6)    19.6%
k = 4   25       .0114(4)   50.5%     .0283(11)   17.2%
k = 6   25       .0275(7)   44.1%     .0687(21)   11.1%
d = 4   25       .0001(1)   59.4%     .0003(1)    25.9%
d = 3   13       .0242(6)   100%      .0581(8)    100%
d = 5   41       .0062(3)   100%      .0135(4)    100%

There are many things to be learned from Fig. 5.10. To start, the first column of graphs shows how a single patch of each code fares against the others, for different error rates. The d = 4 surface code with rejection boasts the lowest logical error probability overall and has the highest acceptance rates among all the even-distance codes.
The logical error probability of the k = 2 code actually matches the distance-five surface code, even though it encodes twice as much information. This is also apparent from Table 5.3, where we show the probability of acceptance and logical error for one logical qubit at p = 10^−3.

For two logical qubits (second column of graphs), the surface codes need two patches of qubits, hence the probability of logical error doubles and the acceptance is squared. The distance-four surface code now has the lowest acceptance probability among the distance-four codes. We keep the range of time steps consistent between the first and second columns to show that the curves for the multi-qubit codes are unchanged. As shown in Table 5.4, for two logical qubits, the k = 2 code halves the logical error probability of the d = 5 surface code, using fewer than one-third as many physical qubits.

Going further, we have analyzed error correction for six and twelve encoded qubits, as shown in the last two columns.
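The multi-patch scaling used here (logical error growing linearly with the number of patches, acceptance raised to the power of the number of patches) follows from Eq. (5.4) and can be sketched as below. This is an illustrative helper only; the function name and argument names are hypothetical, and the inputs in the usage note are made-up values, not simulation data.

```python
from typing import Tuple


def multi_patch(p_l_given_a: float, p_a: float, patches: int) -> Tuple[float, float]:
    """Return (joint acceptance, upper bound on P_{L|a}) for `patches` independent patches.

    Inputs are single-patch values at a fixed time step t:
      p_l_given_a : P_{L|a}(t), logical error probability conditioned on acceptance
      p_a         : P_a(t), acceptance probability
    """
    if not 0.0 < p_a <= 1.0:
        raise ValueError("acceptance probability must lie in (0, 1]")
    if patches < 1:
        raise ValueError("need at least one patch")
    p_l = p_l_given_a * p_a          # Eq. (5.3) rearranged: P_L = P_{L|a} * P_a
    p_a_all = p_a ** patches         # every patch must accept
    bound = patches * p_l / p_a_all  # Eq. (5.4): c * R_L / (R * P_a^c)
    return p_a_all, bound
```

For example, with a hypothetical single-patch P_{L|a} = 0.01 and P_a = 0.5, two patches give a joint acceptance of 0.25 and a bound of 0.04; with one patch the bound reduces to the single-patch value.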
With only one-tenth of the physical overhead, a single k = 6 code patch rivals the performance of six patches of the d = 5 surface code. The single patch of the k = 6 code even outperforms the k = 2 and k = 4 codes, but this is precisely because only one code patch is used. When multiple code patches are used for many logical qubits, the acceptance rate of the distance-four codes drops exponentially. This is also observed in Table 5.5 and Table 5.6, as the acceptance probability of the distance-four surface code quickly approaches zero.

Figure 5.10: Probability of X logical error (solid) and acceptance (dotted) for t time steps of error correction on six codes, as a function of physical error rate (row) and desired logical qubits (column). The three colored curves correspond to the k = 2, k = 4 and k = 6 codes and the three gray curves are the surface codes. The graphs for few time steps look like step functions because the code patches are checked for logical errors only after blocks of error correction, not time steps. The top row compares the number of physical qubits required to achieve the desired number of logical qubits. (Columns: 1, 2, 6 and 12 logical qubits. Rows: p = 0.001, 0.0005, 0.00025 and 0.0001. Axes: time steps t versus acceptance probability P_a(t) and logical error probability P_L|a(t).)

Table 5.4: Error correction for 2 logical qubits at p = 0.0005. The probability of logical error and acceptance are shown for 300 and 750 time steps. The surface codes require more than one patch of physical qubits. Among the new codes, the k = 2 color code has few large stabilizers and a fast sequence. These advantages help it achieve the lowest logical error probability at the highest acceptance rates.

                   t = 300              t = 750
Code    Qubits   P_L|a      P_a       P_L|a       P_a
k = 2   25       .0025(2)   48.5%     .0060(5)    16%
k = 4   25       .0066(3)   45.4%     .0178(9)    13.1%
k = 6   25       .0172(6)   35.7%     .0441(18)   7.6%
d = 4   50       .0003(1)   31.9%     .0021(4)    5.6%
d = 3   26       .0477(5)   100%      .1145(8)    100%
d = 5   82       .0053(2)   100%      .0125(3)    100%

In the last column of graphs, we compare statistics for twelve logical qubits. Although current NISQ systems only protect one logical qubit, our results show that just 50–75 physical qubits are sufficient for twelve logical qubits. In this regime, the k = 4 code achieves lower logical error probability than the k = 6 code with only 50% more overhead. Unfortunately, at longer time scales, postselection sharply increases the logical error probability, rendering the distance-four codes much less useful.

All simulations in this paper, developed in Python, were executed on the USC Center for Advanced Research Computing (CARC) high-performance computing cluster. The simulations used over one million minutes of CPU core time on Intel Xeon processors operating at 2.4 GHz.

Take-home message. Postselection can play a crucial role in reducing logical error rates. However, when logical information is stored for too long, it is likely to be wiped and reset. This is acceptable for some algorithms: applications with low depth, like variational algorithms [CAB+21, BCLK+22], or those that are designed with rejection, like magic state distillation [BK05]. If only one or two qubits
If only one or two qubits 80 Table 5.5: Error correction for 6 logical qubits at p = 0.00025. The probability of logical error and acceptance are shown for 700 and 1500 time steps. The k = 6 code requires one-tenth the physical qubits as the distance-5 surface code, while nearly matching the logical error probability. 1 2 3 2 1 1 3 2 3 surface t = 700 t = 1500 Code Qubits PL|a Pa PL|a Pa k = 2 75 .0061(3) 24.5% .0412(17) 4.8% k = 4 50 .0078(3) 34.7% .0306(10) 10.4% k = 6 25 .0060(2) 50.1% .0129(5) 22.3% d = 4 150 .0001(1) 11.5% .0137(12) 1% d = 3 78 .0847(5) 100% .1796(8) 100% d = 5 246 .0043(1) 100% .0092(2) 100% Table 5.6: Error correction for 12 logical qubits at p = 0.0001. The probability of logical error and acceptance are shown for 1800 and 4500 time steps. The k = 4 code is well-balanced, achieving competitive logical error rates with low qubit overhead. surface 1 2 3 1 2 3 1 2 3 t = 1800 t = 4500 Code Qubits PL|a Pa PL|a Pa k = 2 150 .0025(1) 29.3% .0285(9) 4.6% k = 4 75 .0022(1) 50% .0101(3) 17.5% k = 6 50 .0032(1) 53.4% .0127(3) 20.8% d = 4 300 .0003(1) 15.8% .0094(7) 1% d = 3 156 .0708(2) 100% .1761(4) 100% d = 5 492 .0013(1) 100% .0036(1) 100% are required, the distance-four surface code and the k = 2 code offer very low probability of logical error. For more qubits, we advise using the k = 4 or k = 6 codes, as they use far fewer physical resources to achieve competitively low logical error. We show that 50–75 good physical qubits are sufficient to correct errors on twelve logical qubits. Even at a CNOT error rate as high as 5 × 10−4 , error correction up to 100 time steps can be run with error probability as low as 1%. 5.3.1 Comparing memory against unencoded qubits In this section, we determine whether information protected by fault-tolerant error correction is more reliable than being stored in an unprotected qubit. For the postselection codes, Fig. 5.11 plots the CNOT depth at which qubits have accumulated 1% probability of logical error. 
The unprotected 81 2 × 10−4 4 × 10−4 6 × 10−4 10−3 CNOT error rate 100 103 104 105 CNOT depth Figure 5.11: CNOT depth at which each code has accumulated 1% probability of X logical error. In black, the depth is plotted for one unencoded qubit, at rest error rate one-tenth the CNOT error rate. Plots are shown for the k = 2 code in blue, k = 4 code in purple, k = 6 code in orange and the d = 4 surface code in grey. We assume the depth of ancilla qubit measurement is ten times the depth of a CNOT gate. The CNOT depth shown for the surface code is for 0.01% probability of X logical error. All data points shown for the postselection-equipped codes have acceptance > 5%. qubit is modeled to accumulate errors only through rest noise, whereas logical errors in the encoded qubits are due to circuits for fault-tolerant error correction. For the error rates we consider (≤ 10−3 ), it is clear that the encoded qubits are better preserved for much longer than an unprotected qubit. 5.4 Potential future work In this paper, we show how to perform fault-tolerant storage with 16-qubit codes. There are two immediate roadblocks en route to universal fault-tolerant quantum computation. Currently, no devices exist with the layout of Fig. 5.2, so until they are fabricated, we turn to other layout improvements. The middle ancilla qubit in Fig. 5.2 is connected to eight neighboring qubits. However, careful analysis and modification of the stabilizer measurement routines may yield solutions that only require maximum qubit degree five or six. This may not be interesting for densely-connected 82 ion trap quantum computers, but is necessary in superconducting architectures to maintain low cross-talk. Alternatively, if we are allowed extra ancilla qubits, we show that maximum degree four is possible, as in the Google Sycamore lattice of Fig. 5.12. 
The stabilizer measurement circuits are all fault-tolerant to distance four, but since all the stabilizer generators are measured simultaneously, error decoding will require new strategies. The weight-four stabilizers can be measured using the circuit in Fig. 5.5a, but the weight-eight stabilizer requires a new circuit, as we detail in Appendix .3. In Fig. 5.12, the only qubits with degree-four connectivity are the ancillas used for measuring the weight-eight stabilizer. For systems with high crosstalk, qubits of degree three may be sufficient to correct errors on the k = 2 subsystem code, since errors can be corrected by measuring only weight-four operators.

Conversely, we may consider subsystem codes to simplify the measurements performed for error correction. In Sec. 5.6, we show a [[16, 4, 2, 4]] code with gauge operators of weight four. We show a process for fault-tolerant error correction that only requires fast single-shot measurement of the gauge operators on a planar square lattice. We observe improved logical error rates, which can be attributed to the speed of error correction and low overhead.

On the theoretical front, we must develop encoding circuits and a universal logical gate set. States may be prepared by either using flags for fault tolerance, or by combining patches of distance-two code states into a distance-four state. For fault-tolerant universal computation, one possible route is teleportation and logical measurements with distilled magic states. In fact, logical measurements can be performed along with error correction [DR20]. Another route to universality is to use transversal multi-qubit gates between vertically stacked code patches. It may be possible for gates like the CCZ to induce magic [PR13], as the required [[15, 7, 3]] code can be obtained by puncturing the [[16, 6, 4]] code. If the error introduced by logical operations is kept low, many logical gates can be applied every time step, allowing high-depth logical circuits.
Near the threshold of the odd-distance surface codes, postselection on the distance-two and distance-four variants shows reduced logical error rates. With higher distance, the compounding effect is larger, meaning distance-eight or distance-ten surface codes may be sufficient for very precise computations. At higher distance, rejections also become exceedingly rare, increasing the possible duration of computations. For larger patches on lattices like Fig. 5.2, more qubits can be encoded at high distance. The biggest difficulty will then be in performing operations on or between different logical qubits in the same patch. Another avenue to pursue is concatenation. This technique can combine the low logical error rates of the surface code with the high encoding rates of block codes.

Figure 5.12: A degree-four layout for flag-fault-tolerant error correction of the k = 2 code, using 43 of the 53 qubits on the Google Sycamore lattice. The stabilizer generators of the code are overlaid. Note that qubits have degree 4 only in the ancillas measuring the weight-8 stabilizer, but elsewhere the maximum qubit degree is 3. It may be possible for the k = 2 subsystem code with only weight-4 stabilizers and gauges to fit on a layout of maximum degree 3.

5.5 Surface code error correction with the union-find decoder

When performing error correction for a finite time interval [DKLP02], the classical error decoding algorithm is performed on one of two types of inputs. Either the entire syndrome history is used for one-shot error decoding, or the process can be repeatedly applied on ∼d rounds of syndromes, as and when they are collected. With the distance-four surface code, we consider repeated decoding with four rounds of syndromes to best model an experimental implementation. With the goal of generalizing to larger surface codes, we implemented a version of the union-find decoding algorithm [DN21].
Here, corrections are only applied to faults identified in the older layers of syndromes, as in the "overlap recovery" method of Ref. [DKLP02]. It is also important to ensure that the input syndrome graph contains horizontal boundary vertices for each layer (as shown in Fig. 5.13a) and a temporal boundary vertex above the most recently collected layer of syndromes (as shown in Fig. 5.13c). All the boundary vertices may be identified together as one vertex, to make the syndrome graph simpler for the decoder. Finally, we reject or apply corrections based on the rules for distance-four fault tolerance in Fig. 5.8b. The union-find decoding algorithm presents a set of edges E in the syndrome graph of Fig. 5.13c, denoting the presence of faults that caused Z-type errors.

Procedure 4. Rejection decoding with the surface code:
1. If |E| ≥ 3, reject and restart.
2. If |E| = 2 and the two faults are data errors that are sufficiently separated in time, correct the fault that occurred first.
3. If |E| = 2 and E contains vertical edges, remove them from E and apply a correction to resolve any remaining edges.
4. If |E| = 1, apply a correction if the edge has a syndrome vertex in either of the older two layers.
5. If the removed/corrected edge has a syndrome vertex in the third layer, ensure syndrome vertices are flipped before the next round of decoding.

Note that there is a trade-off in choosing which cases to reject and which cases to attempt to correct. We choose a separation that rejects as rarely as possible while maintaining the O(p^3) logical error rate. It is also possible to reduce the edge count at the start, by projecting measurement faults away before applying any of the rules. Evidently, there is a lot of scope for further improvements to the postselection rules. Research also needs to be done on how to generalize these rules to higher-distance surface codes.

5.6 A subsystem code with four useful logical qubits

The new codes presented in Fig.
5.1 all suffer from one main caveat, the requirement that at least one weight-eight stabilizer must be measured. High-degree connectivity and extra ancillas are needed to measure these stabilizers fault-tolerantly to distance four. However, weight-four stabilizers are measured fault-tolerantly to distance three and only need three ancillas, with sparse connectivity.

Figure 5.13: Syndrome graphs passed to the union-find decoder. All the vertices labeled b are identified as the same boundary vertex. (a) Graph of vertices representing X stabilizer measurement outcomes. The numbers indicate the index of the data qubit Z correction for each edge of the graph. (b) Syndrome graph for Z stabilizers. (c) (2 + 1)-D syndrome graph, used for fault-tolerant decoding with circuit-level errors. We show four layers of syndromes in alternating colors red and black. Edges in the same syndrome layer are shown in black, with diagonal edges between layers shown in grey. For clarity, we do not show the vertical edges between syndrome vertices of the same stabilizer. Corrections are only applied for edges which have at least one vertex in the bottom two layers.

Figure 5.14: (a) Stabilizers of the [[16, 6, 4]] code. The [[16, 4, 2, 4]] code is derived by assigning two of the logical qubits as gauge qubits. (b) The two logical qubits chosen as gauges are the horizontal and vertical weight-four operators. By stabilizer equivalence, every column (row) of four qubits is a representation of the vertical (horizontal) gauge. These groups of data qubits are denoted by the letters a-h. (c) X and Z operators specifying the four logical qubits.

In Fig. 5.14, we consider a [[16, 4, 2, 4]] code that is derived from the [[16, 6, 4]] code of Fig. 5.1. We assign two of the logical qubits as gauge qubits to simplify error correction.
By measuring different weight-four representations of the gauge operators, we can construct syndromes that permit distance-four fault-tolerant error correction. Furthermore, performing only weight-four measurements simplifies the required connectivity and permits low-overhead implementation on a planar square lattice. Here, we show an implementation of error correction on a degree-four lattice requiring only 44 qubits, with logical fidelity on par with the distance-five surface code needing 41 or 49 qubits. In addition to low overhead and high-fidelity error correction, this code has further attractive properties. The ⟦16, 6, 4⟧ code is a highly symmetric code, allowing for many types of simple fault-tolerant logical operations. These operations naturally pass down to the ⟦16, 4, 2, 4⟧ code too. In Sec. 5.6.2, we show how to perform some Clifford operations.

5.6.1 Fault-tolerant error correction and detection

Fault-tolerant error correction is performed by measuring the four weight-four representations of each gauge operator. A deterministic routine takes four rounds of measurements. In the first round, measure all the X-type row operators. In the second round, measure the X-type column operators. In the third and fourth rounds, measure the Z-type operators. Errors are detected or corrected according to the following rules. Consider X error correction, for example:
1. If either the row or column syndrome is of weight two, reject.
2. For a weight-one (or equivalently, a weight-three) syndrome among the rows and among the columns, apply a weight-one X correction on the qubit indexed by the row and column with a 1 (or 0 in case the syndrome is weight-three).
3. Ignore all other syndromes.

Procedure 5. Distance-4 fault-tolerant error correction with the ⟦16, 4, 2, 4⟧ code.
1. Measure $X_a$, $X_b$, $X_c$ and $X_d$ on the qubit layout of Fig. 5.15a using the stabilizer measurement circuits of Fig. 5.15b and Fig. 5.15c.
2. Measure $X_e$, $X_f$, $X_g$ and $X_h$ with Fig. 5.15b and Fig. 5.15c.
This completes the circuits for Z error correction.
3. For X error correction, repeat the process with Z-type operators.

It may be possible to further reduce the number of rounds of measurements needed for error correction. For example, one could find a way to measure all eight operators of one type in one round. The difficulty is in the four intersecting regions of the ancilla circuits for the middle column and row gauge operator measurements. This could be made fault-tolerant using a shared flag ancilla scheme or a scheme that prepares ancilla states for Steane-style error correction [HB21b]. Alternatively, an adaptive routine may reduce the number of measurements needed.

Numerical simulations

We perform simulations of error correction with Procedure 5 and compare the results with the distance-five and distance-four surface codes in Fig. 5.16. Note that while the logical error rate per time step for the k = 4 subsystem code is larger than that of the k = 4 stabilizer code discussed in Fig. 5.1, the subsystem code only requires four time steps per round of error correction whereas the stabilizer code requires twelve. Hence the logical error rate per round of error correction will be much lower with the subsystem code. In practice, the subsystem code will be far more favorable.

Figure 5.15: Implementation of quantum error correction with a ⟦16, 4, 2, 4⟧ code on a square planar lattice of qubits. (a) Layout of qubits in Procedure 5. (b,c) Associated gauge operator measurement circuits.

Figure 5.16 (horizontal axis: physical error rate p; vertical axis: event probability per time step): Rejection rates (top two) and logical error rates (bottom three) per time step, when performing error correction with the k = 4 subsystem code. The noise model is the same as that in Fig. 5.9.
Here, the probability estimates are compared with the distance-four and distance-five surface codes, with the distance-five surface code exhibiting a larger error probability than distance four.

Figure 5.17: Fault-tolerant versions of physical CZ and SWAP gates. (a) Performing a SWAP between qubits A and B is not fault-tolerant, even when broken into a sequence of three CNOTs. (c) It can be made fault-tolerant by swapping through an ancilla qubit, C. (b) Similarly, a CZ is not fault-tolerant, but can be made fault-tolerant using an ancilla, as in (d).

5.6.2 Logical gates

There exists a wide variety of logical Clifford operations that are relatively easy to implement with the ⟦16, 4, 2, 4⟧ code. These include transversal gates, permutation automorphisms, multi-qubit logical measurements, and CZ automorphisms. For the logical qubits of the k = 6 code, we show the logical Cliffords that result from different physical operations. We show simple examples for each case, leaving a full characterization of all the possible Cliffords to future work.

Transversal gates
• $\mathrm{SWAP}_{12}\,\mathrm{SWAP}_{34}\,\mathrm{SWAP}_{56}\,H^{\otimes 6} := H^{\otimes 16}$.
• $\mathrm{CZ}_{12}\,\mathrm{CZ}_{34}\,\mathrm{CZ}_{56} := S^{\otimes 16}$.

Permutation automorphisms

Permutations of the physical qubits may leave the codespace invariant while modifying the logically encoded information. These operations are called permutation automorphisms. On the layout of Fig. 5.15a, permutations can be performed via SWAP gates. However, physical SWAPs between two data qubits of the code are not fault-tolerant. They must be mediated by an ancilla, as shown in Fig. 5.17. Below, we note the logical effects of some permutation automorphisms.
• $\mathrm{CNOT}_{3,5}\,\mathrm{CNOT}_{6,4} := (1, 5)(2, 6)(3, 7)(4, 8)$
• $\mathrm{SWAP}_{3,6}\,\mathrm{SWAP}_{4,5} := (2, 3)(6, 7)(10, 11)(14, 15)$
• $\mathrm{CNOT}_{2,4}\,\mathrm{CNOT}_{2,5}\,\mathrm{CNOT}_{3,1}\,\mathrm{CNOT}_{6,1} := (1, 5)(4, 8)(9, 13)(12, 16)$
• $\mathrm{CNOT}_{2,4}\,\mathrm{CNOT}_{2,5}\,\mathrm{CNOT}_{3,1}\,\mathrm{CNOT}_{3,5}\,\mathrm{CNOT}_{6,1}\,\mathrm{CNOT}_{6,4} := (1, 5)(4, 8)(10, 14)(11, 15)$

Multi-qubit logical measurements

In Table 5.7, we list the sequences of operators needed to measure the logical operators fault-tolerantly to distance four. These types of measurements allow targeted logical Cliffords, such as CNOTs or Hadamards. When assisted by magic states, they can also be used to perform non-Clifford gates (see, for example, Chap. 7).

CZ automorphisms

Similar to permutation automorphisms, by applying CZ gates between specific pairs of physical qubits, we can perform logical CZ transformations while leaving the codespace invariant. Up to a logical $\mathrm{CZ}_{1,2}\,\mathrm{CZ}_{3,4}\,\mathrm{CZ}_{5,6}$, which can be undone by the transversal gate $S^{\otimes 16}$, the following logical CZ transformations can be achieved.
• $\mathrm{CZ}_{1,4}\,\mathrm{CZ}_{4,5} := \mathrm{CZ}_{1,2}\,\mathrm{CZ}_{5,6}\,\mathrm{CZ}_{3,7}\,\mathrm{CZ}_{4,8}\,\mathrm{CZ}_{9,10}\,\mathrm{CZ}_{13,14}\,\mathrm{CZ}_{11,15}\,\mathrm{CZ}_{12,16}$
• $\mathrm{CZ}_{3,5}\,\mathrm{CZ}_{1,6} := \mathrm{CZ}_{1,5}\,\mathrm{CZ}_{2,3}\,\mathrm{CZ}_{4,8}\,\mathrm{CZ}_{6,10}\,\mathrm{CZ}_{7,11}\,\mathrm{CZ}_{9,13}\,\mathrm{CZ}_{14,15}\,\mathrm{CZ}_{12,16}$
• $\mathrm{CZ}_{1,4}\,\mathrm{CZ}_{2,4}\,\mathrm{CZ}_{4,5}\,\mathrm{CZ}_{4,6} := \mathrm{CZ}_{1,2}\,\mathrm{CZ}_{5,6}\,\mathrm{CZ}_{3,7}\,\mathrm{CZ}_{4,8}\,\mathrm{CZ}_{9,13}\,\mathrm{CZ}_{10,14}\,\mathrm{CZ}_{11,12}\,\mathrm{CZ}_{15,16}$

Table 5.7: Sequences of weight-four operators needed to measure different logical operators fault-tolerantly to distance four. Majority voting decides the measurement outcome. If the measured results are evenly split, the measurement result is rejected.
Operator | Sequence for fault-tolerant measurement
X3 X4 X5 X6 X1X3 X1X3X5 X3X4X5X6 X1X3X4X5X6

Chapter 6
New magic state distillation factories optimized by temporally encoded lattice surgery

Universal fault-tolerant quantum computers will be required in order to implement large scale quantum algorithms. However, both the space and time costs due to the implementation of fault-tolerant quantum error correction protocols can be quite high [FMMC12, BH12, Jon13, YTC16, CJOL17, CJO17, BKS21, CC19, Lit19a, Lit19b, CN20, CNAA+22, CC22b, SC22].
For fault-tolerant quantum computers where qubits are encoded in topological quantum error-correcting codes, lattice surgery paired with magic state distillation protocols provides an efficient way to implement universal gate sets while being compatible with the locality requirements of two-dimensional planar hardware architectures [BK05, BH12, MEK13, BMD06, CH18, CC19, CN20, FG18, LO18, Lit19a, Lit19b, CC22b, CC22a, BDM+21]. In addition to the extra space costs associated with lattice surgery protocols, there are also additional time costs arising from the required protection against timelike failures (which can result in logical parity measurement failures) [CC22b, CC22a, Gid22]. The timelike distance of a lattice surgery protocol is given by the number of syndrome measurement rounds which need to be performed during the measurement of a multi-qubit Pauli operator. In Ref. [CC22b], a protocol called temporal encoding of lattice surgery (TELS) was introduced in order to reduce the required timelike distance of lattice surgery protocols, thus reducing algorithm runtimes. The first step in a TELS protocol is to divide all multi-qubit Pauli measurements $\{P_1, P_2, \cdots, P_\mu\}$ required to run a quantum algorithm into sequences of parallelizable Pauli (PP) sets. For a PP set of size k, the multi-qubit Pauli operators in a PP set $P_{[t+1,t+k]} = \{P_{t+1}, P_{t+2}, \cdots, P_{t+k}\}$ commute, and any necessary Clifford corrections can be conjugated to the end of the sub-sequence. This general model of Pauli-based computation is shown in Fig. 6.1. In this work we consider a universal gate set generated by ⟨T, H, S, CNOT⟩ where $T = \mathrm{diag}(1, e^{i\pi/4})$ and H and S are the Hadamard and phase gates. For such a universal gate set, a T gate can be implemented using multi-qubit Pauli measurements and the resource magic state $|T\rangle = (|0\rangle + e^{i\pi/4}|1\rangle)/\sqrt{2}$.

Figure 6.1: General model of Pauli-based computation. A quantum algorithm can be written as a sequence of multi-qubit Pauli measurements which perform both Clifford and non-Clifford operations (here we show the implementation of non-Clifford gates). In general the multi-qubit Pauli operations can be ordered into sets of commuting Pauli operators, where Clifford corrections can be conjugated to the end of each set. Such sets are called parallelizable Pauli (PP) sets. A logical T gate (which is non-Clifford and forms a universal gate set when combined with Clifford operations) can be implemented via a multi-qubit Pauli measurement acting on a set of data qubits and an ancillary magic state $|T\rangle = (|0\rangle + e^{i\pi/4}|1\rangle)/\sqrt{2}$.

Figure 6.2: (a) Old protocol for temporally encoded lattice surgery of a PP set of size 2, where $P_1P_2$ is a redundant measurement which is used to detect failures in the measurements of $P_1$ or $P_2$. If the measurement results of the three multi-qubit Pauli operators are inconsistent, the original multi-qubit Pauli operators $P_1$ and $P_2$ are measured again. Blue boxes correspond to multi-qubit Pauli measurements, and blue triangles correspond to logical single-qubit measurements. Orange boxes correspond to Clifford corrections. (b) New protocol for temporally encoded lattice surgery of a PP set of size 2. The operators $P_1$, $P_2$ and $P_1P_2$ are repeatedly measured until no logical timelike failures are detected. In Section 6.2.1 we show that such a scheme results in smaller average runtimes for the implementation of a PP set. Orange boxes denote Clifford corrections that result from applying non-Clifford gates.

The main idea behind TELS is that for a given PP set, one can measure a larger over-complete set of multi-qubit Pauli operators, where each multi-qubit Pauli operator in the over-complete set is a product of multi-qubit Pauli operators from the original PP set. As was shown in Ref. [CC22b], each multi-qubit Pauli operator in the new over-complete set is associated with a codeword of a classical [n, k, d] error-correcting code. The measurement results denote the n bits of the classical code. Applying the parity check matrix of the code to these measurement outcomes enables the detection of logical timelike failures. This in turn allows fewer rounds of syndrome measurements for each multi-qubit Pauli, due to the extra protection offered by the overlaying classical code.

In this work, we introduce new TELS protocols that further reduce the timelike distance required for lattice surgery protocols. In previous TELS protocols, if a lattice surgery failure was detected while measuring the over-complete set, the multi-qubit Pauli operators from the original PP set would be remeasured. We show that better speedups can be obtained if, during the remeasure step, the operators from the over-complete set are repeatedly measured until no logical timelike failures are detected. We also show that in some cases, even larger speedups can be achieved when using the classical error-correcting codes to correct a subset of errors of smaller weight and detect all possible errors of higher weight. We also consider a large number of classical error-correcting codes for various sizes of PP sets.
Such considerations enable more efficient TELS protocols to be applied to a wider range of quantum algorithms, where the average PP set sizes depend on the particular algorithm being implemented. We then focus on implementing TELS protocols in the context of magic state distillation using Clifford frames. In doing so, we consider a biased circuit-level noise model, where the logical qubits are encoded using asymmetric surface codes. We consider asymmetric surface codes since such codes require fewer qubits to achieve a desired logical failure rate for physical error rates $p = 10^{-3}$ and $p = 10^{-4}$ compared to other topological codes such as XZZX and XY codes [HBK+23]. By developing new layouts for magic state distillation tiles which are adapted to TELS protocols, we show that the space-time costs of such distillation protocols can be reduced compared to magic state distillation tiles that do not use TELS.

The manuscript is structured as follows. In Section 6.1.1 we review the notion of Pauli-based computation implemented via lattice surgery and in Section 6.1.2 we review the implementation of previously proposed TELS protocols. Next, in Section 6.2.1 we show how repeated encoded multi-qubit Pauli measurements using TELS can result in smaller algorithm runtimes. We then show in Section 6.2.2 how the simultaneous correction and detection of errors with the classical codes used in a TELS encoding can lead to reduced runtimes compared to a pure error detection scheme. In Section 6.2.3, we present the best average runtimes of TELS protocols using a wide range of classical error-correcting codes. In Appendix .4, we provide different methods to count the number of malignant fault sets for a given classical code. This is useful in analyzing the performance of TELS protocols with various classical codes. In Appendix .5, we show how to construct the different classical codes used in Section 6.2.3.
Appendix .6 contains details about the performance of TELS in additional noise regimes and target logical failure rates. Next in Chapter 7, we show how to apply TELS protocols to magic state distillation factories. Specifically, Section 7.1.1 contains a new circuit for executing lattice-surgery-based magic state distillation. In this protocol, magic states are distilled up to a known Clifford correction. In Appendix .7, we discuss in detail how Clifford frames are incorporated in our TELS protocols when applied to magic state distillation schemes. In Section 7.2, we analyze the space-time costs of various distillation protocols that use TELS with the distillation circuit of Section 7.1.1. In particular, for physical error rate $p = 10^{-4}$, we use 15-to-1 and 116-to-12 distillation protocols to distill magic states with logical failure probabilities $10^{-10}$ and $10^{-15}$, where p is the noise parameter of a biased circuit-level noise model described in Section 6.1.1. For $p = 10^{-3}$, we consider 114-to-14 and 125-to-3 distillation protocols to produce magic states with logical failure rates $10^{-10}$ and $10^{-15}$ respectively. For each distillation protocol, we compute the minimum distances and space-time costs using different hardware layouts that are dependent on the chosen lattice surgery mechanism. Appendix .8 shows a specific choice of codewords derived from the classical Golay code for use in a 15-to-1 distillation protocol with a TELS implementation. This classical code allows for very low average runtime per Pauli. In Appendix .9, we show the general procedure for determining the minimum space-like and time-like distances for different distillation protocols and layouts. Appendix .10 contains additional information regarding each of the layouts proposed in this work, which is used in conjunction with Appendix .9 to determine the minimum distances and space-time costs.
We conclude with a note on how to schedule the operation of a distillation factory which contains many distillation tiles. Using a round-robin scheduling algorithm, we optimize the number of tiles that are required to output enough distilled magic states for seamless operation of a quantum core.

6.1 Review of Pauli-based computation and lattice surgery

6.1.1 Pauli-based computation and lattice surgery

There exist many models of computing to execute an algorithm on a quantum computer. The most well-studied is the circuit model of quantum computation [Deu89, BBC+95]. Examples of some other models are adiabatic quantum computation [FGG+01], measurement-based models [GC99, AL04, BBD+09], and fusion-based computation [BBB+21], which may be more suitable for specific hardware architectures. For a universal quantum computing architecture which uses two-dimensional planar topological codes (such as the surface code), the most natural way to implement a quantum algorithm is to use the Pauli-based model of quantum computation [BSS16]. In a Pauli-based computation (PBC), all logical gates can be applied by measuring multi-qubit Pauli operators (potentially using additional ancillas), as shown in Fig. 6.1. Quantum algorithms are then executed using a pool of purified magic states and a sequence of multi-qubit Pauli measurements. In the interest of speeding up algorithms, we first note that the sequence of multi-qubit measurements can be grouped into subsequences of mutually commuting measurements. In Fig. 6.1, the set $\{P_1, P_2, \ldots, P_k\}$ is such a subsequence, and is called a PP set. An algorithm executing µ T-gates with T-depth γ can in general be broken down into γ PP sets of average size k = µ/γ. In some algorithms of interest like quantum chemistry simulations [KLP+22], the size of a PP set is between 9 and 14.
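As a rough illustration of grouping a measurement sequence into mutually commuting subsequences, the following sketch greedily cuts a list of Pauli strings into consecutive PP sets. It is a simplified toy, not the procedure used in this work: it ignores Clifford-frame tracking and magic-state bookkeeping, and the function names `commute` and `greedy_pp_sets`, as well as the example Pauli strings, are hypothetical.

```python
from typing import List


def commute(p: str, q: str) -> bool:
    """Two Pauli strings commute iff the number of positions where both
    are non-identity and different is even."""
    anti = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return anti % 2 == 0


def greedy_pp_sets(paulis: List[str]) -> List[List[str]]:
    """Cut the sequence into consecutive sets of mutually commuting Paulis.

    A new set is started whenever the next Pauli anticommutes with any
    member of the current set (a greedy choice, assumed for illustration).
    """
    sets: List[List[str]] = []
    current: List[str] = []
    for p in paulis:
        if all(commute(p, q) for q in current):
            current.append(p)
        else:
            sets.append(current)
            current = [p]
    if current:
        sets.append(current)
    return sets


if __name__ == "__main__":
    sequence = ["ZZI", "IZZ", "XII", "IIX"]  # hypothetical measurement sequence
    pp_sets = greedy_pp_sets(sequence)
    # Average PP set size k = mu / gamma for this toy sequence.
    print(pp_sets, len(sequence) / len(pp_sets))
```

In this toy sequence the first two operators commute and the third anticommutes with the first, so the greedy cut yields two sets of size two.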
We also note that for magic state distillation protocols that are expressed as sequences of multi-qubit Pauli measurements, all the non-Clifford gates commute and thus form PP sets. For instance, the 15-to-1 distillation protocol has a PP set of size 11, and for the 125-to-3 distillation protocol, the PP set is of size 99. In topological codes such as the surface code and color code, lattice surgery is the dominant mechanism used to perform multi-qubit Pauli measurements [LRA14, HFDM12]. In Fig. 6.3, we show how to perform the X ⊗ X lattice surgery measurement between two logical qubits encoded in the surface code. Prior to measuring X ⊗ X, the data qubits in the routing space are initialized in the |0⟩ state. Subsequently, the X ⊗ X measurement outcome is obtained by measuring X operators in the routing space region shown in Fig. 6.3 (note that a minimum-weight representative of the logical X operator of the surface code is given by the product of X operators along the vertical boundaries of the patch). The measurement outcomes of the X stabilizers in the routing space are random (such stabilizers are illustrated with white ancillas in Fig. 6.3). However, the product of all such stabilizers gives the parity of the X ⊗ X measurement outcome. The stabilizers of the merged surface code patch are measured for $d_m$ rounds, after which the surface code patches are split by measuring the qubits located in the routing space in the Z basis. A logical timelike failure occurs during a lattice surgery protocol when the wrong parity of the multi-qubit Pauli measurement is obtained. Note that the logical timelike failure rate is exponentially suppressed with the number of syndrome measurement rounds $d_m$ (see Ref. [CC22b]).

Figure 6.3: Lattice surgery implementation of an X ⊗ X measurement between two logical qubits encoded in $d_x = 3$, $d_z = 5$ surface code patches. Note that X (Z) stabilizers are represented by red (blue) plaquettes. Prior to measuring X ⊗ X, yellow data qubits in the routing region are prepared in the |0⟩ state. The X ⊗ X measurement outcome is then obtained by measuring the X-stabilizers (shown with white ancillas) in the routing space. The stabilizers of the merged surface code patch are measured for $d_m$ syndrome measurement rounds in order to correct timelike failures which can occur in the first round of the merge, resulting in the wrong parity of X ⊗ X. In the first syndrome measurement round of the merged patch, the individual measurement outcomes of X stabilizers in the routing space region are random, but their product gives the result of the X ⊗ X measurement outcome. At the end of the $d_m$ syndrome measurement rounds, the data qubits in the routing space are measured in the Z basis. Since measurement and reset of qubits typically takes a much longer time than the implementation of the physical CNOT gates used to measure the stabilizers, in this work we assume that the qubits in the routing space used to measure X ⊗ X are only available one syndrome measurement round after the split. Hence the merge/split operation takes a total of $d_m + 1$ syndrome measurement rounds.

All numerical results in this work are obtained using the following biased circuit-level noise model:
1. Each single-qubit gate location is followed by a Pauli Z error with probability $p/3$ and Pauli X and Y errors each with probability $p/(3\eta)$.
2. Each two-qubit gate is followed by a {Z ⊗ I, I ⊗ Z, Z ⊗ Z} error with probability $p/15$ each, and a {X ⊗ I, I ⊗ X, X ⊗ X, Z ⊗ X, Y ⊗ I, Y ⊗ X, I ⊗ Y, Y ⊗ Z, X ⊗ Z, Z ⊗ Y, X ⊗ Y, Y ⊗ Y} error each with probability $p/(15\eta)$.
3. With probability $2p/(3\eta)$, the preparation of the |0⟩ state is replaced by |1⟩ = X|0⟩. Similarly, with probability $2p/3$, the preparation of the |+⟩ state is replaced by |−⟩ = Z|+⟩.
4. With probability $2p/(3\eta)$, a single-qubit Z-basis measurement outcome is flipped. With probability $2p/3$, a single-qubit X-basis measurement outcome is flipped.
5. Lastly, each idle gate location is followed by a Pauli Z error with probability $p/3$, and a {X, Y} error each with probability $p/(3\eta)$.

We also set η = 100. Using the above noise model and a minimum-weight perfect matching decoder, in Ref. [CC22b] it was shown that the timelike logical failure rate of an X ⊗ X multi-qubit Pauli measurement with $d_m$ syndrome measurement rounds is given by

$$p_m(d_m) = 0.01634\,A\,(21.93\,p)^{(d_m+1)/2}, \tag{6.1}$$

when p is below threshold. In Eq. (6.1), A corresponds to the area of the routing space connecting the various surface code patches that take part in a multi-qubit Pauli measurement. In Section 6.2, we set A = 100 in order to directly compare our results to those in Ref. [CC22b]. Further, we also use Eq. (6.1) when considering multi-qubit Pauli measurements containing Z and Y terms. Depending on the accuracy requirements of the algorithm, the target logical error rate per Pauli δ sets an upper bound on the maximum tolerable noise. This condition is $p_m < \delta$. To achieve low $p_m$, the measurement distance $d_m$ must be increased accordingly.

Surface codes with X and Z boundaries can perform X ⊗ X, Z ⊗ Z and Z ⊗ X measurements similar to Fig. 6.3. However, to access the Y boundary, twist defects are required. They have been studied extensively in surface codes [HFDM12, Lit19a, CC22a] and in color codes [GS21b]. We use the methods of Ref. [CC22a] to implement Y-type measurements. As shown in Fig. 6.3, lattice surgery requires routing space between different logical qubit patches. In the interest of reducing routing space for lattice surgery, Ref. [HPDN19] considered extremely thin data busses that were only one qubit wide. The caveat of this method is that the measurements need to be performed for $d_m^2$ syndrome measurement rounds, instead of $d_m$ in the regular case.
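To make the condition $p_m < \delta$ concrete, the following sketch evaluates Eq. (6.1) and searches for the smallest measurement distance that meets a target δ. The function names, the default A = 100, and the linear search over all integers $d_m$ are illustrative assumptions; the fit of Eq. (6.1) is only stated to hold below threshold, so the sketch simply refuses inputs with 21.93p ≥ 1.

```python
def p_m(d_m: int, p: float, area: float = 100.0) -> float:
    """Timelike logical failure rate of Eq. (6.1) for measurement distance d_m."""
    return 0.01634 * area * (21.93 * p) ** ((d_m + 1) / 2)


def min_measurement_distance(delta: float, p: float, area: float = 100.0,
                             d_max: int = 1000) -> int:
    """Smallest d_m with p_m(d_m) < delta (hypothetical helper, linear search)."""
    if 21.93 * p >= 1.0:
        raise ValueError("Eq. (6.1) is not suppressing errors for this p")
    for d_m in range(1, d_max + 1):
        if p_m(d_m, p, area) < delta:
            return d_m
    raise ValueError("no measurement distance up to d_max meets the target")


if __name__ == "__main__":
    # Example inputs: p = 1e-3 and a target error rate per Pauli of 1e-10.
    d = min_measurement_distance(delta=1e-10, p=1e-3)
    print(d, p_m(d, 1e-3))
```

Because $p_m$ decreases monotonically in $d_m$ below threshold, the first distance that satisfies the bound is also the minimum.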
In this manuscript we consider routing space of dimensions which are functions of $d_x$ and $d_z$, which are the X and Z distances of the code respectively. For instance, the core of a quantum processor would have a layout given by Fig. 14 (e) in Ref. [CC22b].

6.1.2 Temporally encoded lattice surgery

It has been shown that encoding the measurement results of lattice surgery in an error-detecting code allows one to effectively reduce the measurement distance $d_m$ of each multi-qubit Pauli measurement [CC22b]. In particular, the Pauli operators of a given PP set $P = \{P_{t+1}, P_{t+2}, \cdots, P_{t+k}\}$ can be replaced by a new set $S = \{Q[x^1], Q[x^2], \cdots, Q[x^n]\}$ where

$$Q[x] = \prod_{j=1}^{k} P_{t+j}^{x_j}, \tag{6.2}$$

and x is a binary vector of length k. Such a replacement is allowed since ⟨S⟩ = ⟨P⟩. In this encoding, the vectors $\{x^1, x^2, \cdots, x^n\}$ form the columns of the generator matrix G of some classical code C, where the rows of G are codewords that generate C. Note that the encoding of a TELS protocol takes place entirely in the time domain, since additional multi-qubit Pauli measurements are performed without requiring additional qubits. By multiplying the measurement outcomes of all operators in S by the parity check matrix of C, timelike lattice surgery failures will be detected if the result is equal to 1 instead of 0. In Fig. 6.2a, we show a TELS protocol used in Ref. [CC22b] for a PP set given by $P = \{P_1, P_2\}$, and with $S = \{P_1, P_2, P_1P_2\}$ (so that k = 2 and n = 3). The generator matrix for the set S is given by

$$G = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \tag{6.3}$$

with the rows of G generating codewords of the classical [3, 2, 2] code. Note that in a general-purpose quantum computer, the algorithms that are executed may have different sizes of PP sets. Depending on the size of the PP set, a suitable classical code would be chosen (the classical code for a PP set of size 2 in Fig. 6.2a was chosen due to its simplicity).
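The detection step for the [3, 2, 2] example can be sketched as follows. Measurement outcomes are represented as bits (0 for +1, 1 for −1), which is an assumed convention; the parity check matrix H = (1 1 1) is the standard one for this code, and the helper names `encode`, `syndrome` and `failure_detected` are hypothetical.

```python
from typing import List, Sequence

# Generator matrix of Eq. (6.3): column j is the binary vector x^j defining Q[x^j].
G = [[1, 0, 1],
     [0, 1, 1]]

# Parity check matrix of the [3, 2, 2] code; every row of G has even overlap with it.
H = [[1, 1, 1]]


def encode(outcomes: Sequence[int]) -> List[int]:
    """Ideal outcome bits of the n operators in S, given the k outcome bits of P."""
    k, n = len(G), len(G[0])
    return [sum(outcomes[j] * G[j][i] for j in range(k)) % 2 for i in range(n)]


def syndrome(measured: Sequence[int]) -> List[int]:
    """Multiply the measured bits by the parity check matrix (mod 2)."""
    return [sum(h * m for h, m in zip(row, measured)) % 2 for row in H]


def failure_detected(measured: Sequence[int]) -> bool:
    """A nonzero syndrome flags a logical timelike failure."""
    return any(syndrome(measured))


if __name__ == "__main__":
    ideal = encode([1, 0])                 # outcomes of P1, P2, P1P2
    print(ideal, failure_detected(ideal))
    one_flip = [ideal[0], ideal[1] ^ 1, ideal[2]]
    print(one_flip, failure_detected(one_flip))
```

A single flipped outcome produces a nonzero syndrome, whereas two flipped outcomes map one codeword to another and go undetected, in line with the code distance of 2.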
When measuring the operators in S, the multi-qubit Pauli operators are measured using $d'_m < d_m$ syndrome measurement rounds during the merge step of the lattice surgery protocol (since the ability to detect logical timelike failures allows for noisier lattice surgery operations). In the example of Fig. 6.2a, if the measurement outcome of $P_1P_2$ is inconsistent with the measurement outcomes of $P_1$ and $P_2$, a logical timelike failure has been detected. If timelike logical failures are not detected, Clifford corrections are applied based on the measurement results of the k original Paulis, which are obtained by decoding the n measurement bits according to a classical code (here, the [3, 2, 2] code). Note however that if two logical timelike failures occur, the protocol in Fig. 6.2a will be unable to detect them (this can also be seen by noting that the distance of the classical code C is 2). If a logical timelike failure is detected during the TELS protocol, which occurs with probability $p_D$, the original k Paulis are measured with measurement distance $q d_m$, where q is a constant that can be optimized to reduce the overall measurement distance. This offsets the fact that the remeasure round only occurs with probability $p_D$. Note that if the Paulis in the original set P are measured (due to the detection of a logical timelike failure), timelike failures cannot be detected since only the original k Paulis are measured. The circuits used in the TELS protocol described in this section are performed in the Clifford frame, hence any Clifford corrections that may be required can be conjugated through to the end of the circuit. To perform a TELS protocol involving Clifford or non-Clifford gates that use resource states, hardware must be allocated to hold the resource states in memory for the entire TELS protocol. They cannot, in general, be measured out after each Pauli measurement. For example, in Fig.
6.2a the |T⟩ states used to measure $P_1$ and $P_2$ need to be held in memory until the end of the protocol. Note that a common approach for designing the architecture for a quantum computer is to use a central processing unit (also referred to as a “core”) and a set of distillation factories. If a quantum computer is built according to this model, the use of TELS results in a small additive factor to the total number of logical qubits needed instead of a multiplicative factor. This does not result in a substantial increase to the space-time cost of implementing algorithms on such an architecture, as in general the algorithm runtime is reduced by a multiplicative factor of between 2× and 5×, as we show in this manuscript. In contrast, in Ref. [Lit19a], Litinski showed that a runtime reduction of approximately 2× required a 6× increase in qubit overhead cost.

We now calculate the total time taken by a TELS protocol to execute a PP set of size k, using an [n, k, d] classical code C. To preface, in a regular lattice surgery protocol, each measurement takes time $(d_m + 1)$, for a total time of $k(d_m + 1)$. Following the TELS protocol described in this section, the total time taken to measure all Paulis in a PP set using TELS is, on average,

$$T_{\mathrm{old}} = n(d'_m + 1) + p_D\,k\,(\lceil q d_m \rceil + 1), \tag{6.4}$$

where the second term is due to the contribution from measuring the Paulis of P if timelike logical failures are detected. We use the subscript ‘old’ to refer to the TELS protocol described in this section, since in Section 6.2, the remeasure part of the TELS protocol has a different time cost.
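As a small numerical sketch of Eq. (6.4), the snippet below compares the average TELS runtime with the $k(d_m + 1)$ rounds of regular lattice surgery. All parameter values in the example (a [7, 4, 3] code, $d'_m = 5$, $d_m = 11$, $p_D = 0.05$, q = 1) are hypothetical and chosen only to show the arithmetic; they are not results from this work.

```python
import math


def t_regular(k: int, d_m: int) -> int:
    """Rounds for k unencoded lattice surgery measurements: k (d_m + 1)."""
    return k * (d_m + 1)


def t_old(n: int, k: int, d_m_prime: int, d_m: int,
          p_detect: float, q: float = 1.0) -> float:
    """Average rounds of the original TELS protocol, Eq. (6.4)."""
    encoded = n * (d_m_prime + 1)
    remeasure = p_detect * k * (math.ceil(q * d_m) + 1)
    return encoded + remeasure


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    print(t_regular(k=4, d_m=11))
    print(t_old(n=7, k=4, d_m_prime=5, d_m=11, p_detect=0.05))
```

With these assumed inputs the encoded protocol uses about 44.4 rounds on average, against 48 for the unencoded measurements.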
Similarly for this protocol, the logical error rate per Pauli is the sum of the logical error rate contributions of the temporally encoded set and the remeasure set,

$$p_L = \frac{1 - p_D}{k} \sum_{i=d}^{n} l_i\,(p_m(d'_m))^i\,(1 - p_m(d'_m))^{n-i} + p_D\,p_m(\lceil q d_m \rceil)$$
$$\approx \frac{1 - p_D}{k}\,l_d\,(p_m(d'_m))^d\,(1 - p_m(d'_m))^{n-d} + p_D\,p_m(\lceil q d_m \rceil), \tag{6.5}$$

where $l_i$ is the number of weight-i timelike logical failures that cause trivial syndromes when multiplying the lattice surgery measurement outcomes by the parity check matrix of the classical code C. The variable $p_m(d'_m)$ is the probability of a logical timelike failure of a single lattice surgery measurement with measurement distance $d'_m$ (obtained from Eq. (6.1)), and $p_m(\lceil q d_m \rceil)$ is the probability of a logical timelike failure in the remeasure round, where the measurement distance is $\lceil q d_m \rceil$. Note that the timelike logical error rate is due to wrong Clifford corrections being applied after the non-Clifford gates. Hence if TELS fails without detecting a timelike error, the probability that there is a logical error is scaled by $1 - p_D$. There are various ways that the $l_i$ term in Eq. (6.5) can be calculated. In Appendices .4.1 and .4.2, we show how $l_i$ is calculated using sampling methods, where the computational complexity of sampling grows with d. In Appendix .4.3, we show how the $l_i$ coefficients can be computed deterministically using MacWilliams identities. The advantage of using the MacWilliams identities is that the computational complexity only grows exponentially with k as opposed to d. Finally, the probability that an error is detected during the first stage is

$$p_D \geq \sum_{i=1}^{d-1} \binom{n}{i}\,(p_m(d'_m))^i\,(1 - p_m(d'_m))^{n-i}. \tag{6.6}$$

Note that in Eq. (6.6), we use the ≥ sign since some sets of ≥ d logical timelike failures will also be detected.

6.2 New TELS encoding protocol

6.2.1 Improvements arising from repeated temporally encoded measurements

In this section we describe an improved TELS protocol compared to the one described in Section 6.1.2.
An example of the application of the new protocol is given in Fig. 6.2b. The main difference is that if a logical timelike failure is detected when measuring the Paulis in the set S, operators in S (instead of P) are measured anew. In particular, operators in S are repeatedly measured until no logical timelike failures are detected. Only at this point can we determine the Clifford corrections that must be applied.

Although it is clear that the protocol described above is different from the protocol described in Section 6.1.2, it is not clear that the new protocol takes fewer syndrome measurement rounds. First, let us determine the time taken by the protocol described above, which is given by

$$T_{\mathrm{new}} = n(d'_m + 1) + p_D T_{\mathrm{new}} = \frac{n(d'_m + 1)}{1 - p_D}, \tag{6.7}$$

since the lattice surgery implementation of each multi-qubit Pauli measurement in S is performed using $d'_m$ syndrome measurement rounds during the merge step. To compare Eq. (6.7) to the protocol described in Section 6.1.2, we can rewrite Eq. (6.4) as

$$T_{\mathrm{old}} = n(d'_m + 1) + p_D\, u\, n(d'_m + 1) = n(d'_m + 1)(1 + p_D u), \tag{6.8}$$

where $u = \frac{k(\lceil q d_m \rceil + 1)}{n(d'_m + 1)}$. If $\frac{1}{1 - p_D} < 1 + p_D u$, the revised protocol is more efficient than the one described in Section 6.1.2. Simplifying this, we get a constraint on $u$,

$$u > \frac{1}{1 - p_D}. \tag{6.9}$$

We computed the value of $u$ for all the classical codes considered in this work that were used in a TELS protocol. If the condition in Eq. (6.9) was satisfied (which was true for all classical codes considered in the work except the distance-2 Single Error Detect code (see Appendix .5.1)), we used the new protocol for TELS. Notably, the time difference between the old protocol and the new one grows as $p_D$ becomes larger.

The probability of a logical error in this new protocol is the probability that an undetectable set of measurement errors occurs during the last (or successful) iteration of temporally encoded lattice surgery.
Let

$$p'_L \equiv \frac{1}{k} \sum_{i=d}^{n} l_i\, (p_m(d'_m))^{i} (1 - p_m(d'_m))^{n-i},$$

which corresponds to the probability (per Pauli) that a series of timelike logical failures during the execution of the measurements in S results in a trivial syndrome when multiplied by the parity check matrix of the classical code $C$. The TELS protocol failure probability is then given by

$$\begin{aligned} p_L &= (1 - p_D)\,p'_L + p_D(1 - p_D)\,p'_L + p_D^2(1 - p_D)\,p'_L + \cdots \\ &= (1 - p_D)\,p'_L\,(1 + p_D + p_D^2 + \cdots) \\ &= p'_L = \frac{1}{k} \sum_{i=d}^{n} l_i\, (p_m(d'_m))^{i} (1 - p_m(d'_m))^{n-i}. \end{aligned} \tag{6.10}$$

6.2.2 Improvements from correcting classical errors

In Ref. [CC22b], the classical codes used in a TELS protocol were for a pure error detection scheme (as described in Section 6.1.2). Further, it was argued that performing error correction using TELS would result in worse speedups compared to a pure error detection scheme. In this section we show that performing a hybrid scheme using TELS, where errors of low weight are corrected and errors of higher weight are detected, can result in further performance improvements compared to a pure error detection scheme. The overall effect is to reduce the average time per Pauli measurement, while staying under the logical error rate threshold set by δ. This effect is more pronounced when using classical codes with high $k$ and $d$, and at larger tolerable logical error rates δ. At larger δ, it is possible to correct classical errors of higher weight than for smaller δ, since there is more of an error budget and $p_L$ can be increased to match δ. In addition, $p_D$ tends to be higher when $d_m$ is small, and $d_m$ is smallest at large δ. As we show in Table 6.1, the benefit of using error correction is that $p_D$ can be made smaller, which in turn allows smaller average runtimes per Pauli.

When using a distance-$d$ classical code in an error detection scheme, an error is detected with probability $p_D \approx O(p_m(d'_m))$.
When using classical codes of high distance, the target logical timelike failure rate of a lattice surgery protocol may be achieved using measurement distances $d_m$ close to 1. However, with such small measurement distances, $p_m(d'_m)$ is inevitably large, and so is $p_D$. If we instead use the classical code to correct all errors up to some weight $c$, a detection event is only triggered by errors of weight at least $c + 1$. In other words, $p_D \approx O\big((p_m(d'_m))^{c+1}\big)$. A lower value of $p_D$ thus requires fewer repeated measurements of the Paulis in the set S.

Although time is now saved by performing fewer remeasure rounds, the logical error rate increases relative to a pure error detection scheme. In a pure error detection scheme, the logical error rate scales as $O\big((p_m(d'_m))^{d}\big)$. However, some errors of weight $d - c$ will have the same syndrome as errors of weight $c$, since they may differ by a logical operator. The most probable correction for this syndrome is the weight-$c$ error, but applying this correction yields a logical error with probability $O\big((p_m(d'_m))^{d-c}\big)$. Consequently, using the classical code in a TELS protocol to correct low-weight errors leads to a tradeoff between the logical error rate of the encoded measurements and the time saved due to fewer remeasure rounds. Such tradeoffs have also been considered for magic state distillation schemes (see for instance Ref. [HH18]).

By incorporating the ability to correct errors up to weight $c$, the probability of observing an uncorrectable but detectable error becomes

$$p_D \ge \sum_{i=c+1}^{d-c-1} \binom{n}{i} (p_m(d'_m))^{i} (1 - p_m(d'_m))^{n-i}, \tag{6.11}$$

which is significantly smaller than in Eq. (6.6). The probability of a logical error per Pauli due to the TELS protocol described in this section is

$$\begin{aligned} p_L &= \frac{1}{k} \left( \sum_{i=d-c}^{d-1} l_i\, (p_m(d'_m))^{i} (1 - p_m(d'_m))^{n-i} + \sum_{i=d}^{n} l_i\, (p_m(d'_m))^{i} (1 - p_m(d'_m))^{n-i} \right) \\ &\approx \frac{1}{k}\, l_{d-c}\, (p_m(d'_m))^{d-c} (1 - p_m(d'_m))^{n-d+c}. \end{aligned} \tag{6.12}$$

Note that in Eq.
(6.12), $l_i$ in the second term of the sum includes contributions from both undetectable errors and errors of weight greater than $d$ which have the same syndrome as correctable errors. We only include the leading order term since higher order terms have very small contributions. Note however that for classical codes with very large values of $d$, a larger value of $p_m(d'_m)$ may be tolerated. In such cases, including higher order terms in Eq. (6.12) may be required.

To show the benefits of including error correction in a TELS protocol, we consider TELS with a classical [127, 92, 11] BCH code (see Appendix .5.1) at a physical error rate of $10^{-4}$ and a target logical error rate of $\delta = 10^{-15}$ per Pauli. In Table 6.1, we show that by correcting errors up to weight three, it is possible to reduce the average time per Pauli measurement by 36%.

d_m    c    p_L             p_D       T         T/k
1      0    1 × 10^-24      0.37      400.71    4.36
1      1    2.5 × 10^-21    0.076     275.08    3
1      2    1 × 10^-18      0.011     256.83    2.79
1      3    2.5 × 10^-16    0.0012    254.3     2.76
11     -    1.8 × 10^-16    -         1104      12

Table 6.1: Comparison between the performance of a TELS protocol implemented using pure error detection versus protocols implemented using combined error detection and correction with the [127, 92, 11] BCH code. The last line of the table shows the performance of unencoded lattice surgery. Here, $d_m$ is the measurement distance used when measuring multi-qubit Pauli operators using lattice surgery, and $c$ is the maximum weight of errors that can be corrected by the classical code used in the TELS protocol. The objective is to minimize the average time taken per Pauli measurement (last column) while ensuring the logical error rate is less than $10^{-15}$ per Pauli. The logical error rate per Pauli $p_L$ is calculated using Eq. (6.12), where the routing space area is $A = 100$. The results in the first row are for the TELS protocol implemented using pure error detection. By correcting weight-one, -two and -three errors, the average measurement time is reduced to 2.76 syndrome measurement rounds, as opposed to 4.36 when using TELS with pure error detection, or 12 without TELS. Note that a pure error detection scheme with $d_m \ge 3$ results in a larger runtime than those obtained in the first four rows of this table, since the total number of syndrome measurement rounds will be at least 127 × 4 = 508.

6.2.3 Protocols for PP sets of size up to one hundred

When performing temporally encoded lattice surgery, it is unclear which classical $[n, k, d]$ code will give the best speedup. Eq. (6.7) shows that the total time to measure all the multi-qubit Pauli operators using a TELS protocol is proportional to $n$. The time also decreases with the distance of the classical code used, since a higher value of $d$ results in a lower value of $d'_m$. However, the contribution arising from $p_D$ needs to be determined, as well as the tradeoffs between a pure error detection scheme and a correction + detection scheme.

Ref. [CC22b] evaluated the speedups arising from using TELS protocols at a physical error rate of $p = 10^{-3}$, $\delta = 10^{-15}$ and $A = 100$ (see Eq. (6.1)). Three classical codes were considered, and the speedups arising from each code were computed. In the analysis, the distance-4 Extended Hamming code resulted in the best performance improvements. For the TELS protocols considered in this work, it was unclear at the outset which classical codes gave the best performance improvements. As such, we collected results for an expanded list of classical codes, which are listed in Table 6.2.
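As a sanity check on Eq. (6.7) and the runtimes reported in Table 6.1, the sketch below recomputes $T_{\mathrm{new}}/k$ for the [127, 92, 11] BCH code from the detection probabilities printed in the table. The function name is an assumption, and because the tabulated $p_D$ values are rounded, the recomputed numbers only approximately match the table (the first row differs in the second decimal place).

```python
def tels_time_new(n, d_m_enc, p_detect):
    """Average time of the repeat-until-clean protocol, Eq. (6.7)."""
    return n * (d_m_enc + 1) / (1.0 - p_detect)


n, k, d_m_enc = 127, 92, 1

# (c, p_D) pairs as printed (rounded) in the first four rows of Table 6.1.
rows = [(0, 0.37), (1, 0.076), (2, 0.011), (3, 0.0012)]

per_pauli = {c: tels_time_new(n, d_m_enc, p_d) / k for c, p_d in rows}
for c, t in per_pauli.items():
    print(f"c = {c}: T/k is roughly {t:.2f} rounds")

# Relative saving of c = 3 over pure detection (c = 0).
saving = 1.0 - per_pauli[3] / per_pauli[0]
print(f"relative saving: {saving:.1%}")
```

The saving comes entirely from the $1/(1 - p_D)$ factor: the encoded block always costs $127 \times 2 = 254$ rounds, and correcting more errors lowers how often that block has to be repeated.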
Code / Code family                     Distances
Single Error Detect (SED)              2
Hamming (Hamm)                         3
Concatenated SED (CSED)                4
Extended Hamming (EHamm)               4
Golay (Gola)                           7
Extended Golay (EGol)                  8
Doubly Concatenated SED (DCSED)        8
Reed-Muller (RM)                       8
Polar (Pol)                            4, 8
Zetterberg (Zett)                      5, 6
Bose–Chaudhuri–Hocquenghem (BCH)       3, 5, 7, 9, 11

Table 6.2: Classical codes (and their associated distances) used in the TELS protocols considered in this work. In different noise regimes, some codes perform better than others, as we show in Fig. 6.4, Fig. 12 and Fig. 13. We provide explicit constructions of the best performing codes in Appendix .5.

Note that in our analysis, we only consider odd values of $d'_m$, as Eq. (6.1) only applies to odd measurement distances. Since the calculation of the number of malignant fault sets is computationally expensive, we could not consider classical codes of distances higher than those shown in Table 6.2. The code constructions for the codes in Table 6.2 are provided in Appendix .5.

For $k \in \{2, 3, \ldots, 100\}$, the lowest achievable average times per Pauli measurement for the TELS protocols described in this section, using the codes of Table 6.2, are shown in Fig. 6.4. The average runtime per Pauli is $T_{\mathrm{new}}/k$, where $T_{\mathrm{new}}$ is given by Eq. (6.7). Such a calculation allows us to compare the performance of a TELS protocol relative to the time taken to measure multi-qubit Pauli operators in the original PP set P without using TELS. Our results are obtained using the parameters $p = 10^{-3}$ and $\delta = 10^{-10}$. Other regimes are considered in Appendix .6. In particular, results are obtained for the parameters $\delta \in \{10^{-15}, 10^{-20}, 10^{-25}\}$ with $p = 10^{-3}$ and $\delta \in \{10^{-15}, 10^{-20}\}$ with $p = 10^{-4}$. We also show the average runtimes per Pauli for unencoded lattice surgery protocols.

With a classical code that is defined to encode $k$ logical bits, we can implement TELS protocols for PP sets of any size up to $k$. This allows us to use good classical codes for many different sizes of PP sets.
If the PP set is of size $k - j$, the Paulis associated with the remaining $j$ logical bits of the code are simply set to the identity. When decoding the temporally encoded measurement results, information corresponding to the extra logical bits is thrown away.

[Plot: average runtime per Pauli versus $k$]

Figure 6.4: The best average runtime per Pauli (in units of syndrome measurement rounds) for all classical codes considered in this work when using the TELS protocol of Section 6.2 for PP sets of size $k \in \{2, 3, \ldots, 100\}$ at $p = 10^{-3}$, $\delta = 10^{-10}$ and with maximum routing space $A = 100$. For example, for PP sets of size $k = 20$, a distance-5 BCH code achieves the lowest average runtime per Pauli among all the classical codes considered. To calculate $p_L$, we used Eq. (6.12), with $p_m$ given by Eq. (6.1) and $d_m$ chosen to minimize the runtime while keeping $p_L < \delta$. We compare the results of TELS with unencoded lattice surgery, which is shown here to take 14 syndrome measurement rounds per Pauli. The legend labels correspond to codes from Table 6.2. Low distance codes perform better for small values of $k$, whereas for larger values of $k$, the high rate of larger-distance codes enables smaller measurement distances.

The biggest insight from our findings is that small codes with low distances perform well at small values of $k$, whereas at larger values of $k$, the larger and higher distance codes perform better. Moreover, we notice that for values of $k$ higher than 30, codes from the BCH family generally give the largest speedups. Of course, the list of codes considered in Table 6.2 is not all-encompassing. There may be many codes that perform better than the codes considered here. To find codes that work well for a specific value of $k$, the primary target should be to ensure that the rate of the code ($k/n$) is not too low. In our observations, codes with rate less than 1/2 generally did not give the best speedups.
The second biggest consideration is the distance of the classical code. High-distance classical codes admit very low values of $d'_m$ at the cost of much higher probabilities of detecting a logical timelike failure. Error correction for smaller weight errors can be used to reduce the detection rate as long as the logical failure rate per Pauli is below the target rate δ.

Chapter 7

TELS for improving magic state fidelity

Quantum gates can be classified into those that can be efficiently simulated by classical computers (such as Clifford gates), and those that cannot [AG04]. As discussed in the introduction to Chapter 6, a universal gate set can be achieved by combining Clifford gates with at least one non-Clifford gate. However, the implementation of logical non-Clifford gates with topological codes is not as straightforward as implementing gates from the Clifford group. Examples of non-Clifford gates include rotations about one of the Bloch sphere axes by angles of $j\pi/8$ where $j \in \{1, 1/2, 1/4, \ldots\}$, or multiply-controlled Z gates $C^jZ$ where $j \ge 2$.

One common method used to implement non-Clifford gates is to use magic states as resource states along with stabilizer operations to perform gate teleportation. A logical magic state can be created by first preparing a physical magic state, followed by a gauge fixing step which encodes the state into a logical qubit patch such as the surface code [VLC+19]. However, such operations are not fault-tolerant and lead to encoded magic states with unacceptably high physical noise rates. To get high fidelity magic states, the prepared magic states are then injected into a magic state distillation protocol where a quantum error-correcting code uses stabilizer operations to detect logical failures present on the injected magic states. Such protocols can be concatenated to achieve any desired target logical failure rate.
Magic state injection

Here we briefly describe various magic state injection methods used to prepare magic states prior to performing a distillation protocol. Ref. [LC22] considers a state injection protocol on the rotated surface code afflicted by a standard depolarizing noise model. On the other hand, Ref. [SDBP22] considers an injection protocol under a biased depolarizing noise model with $\eta = 10^3$ and $\eta = 10^4$. Further, the magic states are prepared in the XZZX code rather than the rotated surface code. In the implementation of the protocol, the magic states are initialized into an effective two-qubit repetition code using low-error ZZ rotations, and the stabilizers of this code are measured twice. Afterwards, the error detecting code is merged into the final XZZX code, where $d_m$ syndrome measurement rounds are performed. The prepared magic states are then shown to have failure probability $O(p^2)$.

In this work, we conjecture that a similar injection protocol can be devised using rectangular surface codes in the presence of biased noise. However, we consider the entire protocol to require only two syndrome measurement rounds, since the $d_m$ syndrome measurement rounds used after merging the error detecting code with the final code can be part of the lattice surgery operations used in the magic state distillation schemes described in Section 7.1. Recall that the noise model used in this work has bias $\eta = 10^2$, as opposed to $\eta = 10^3$ or $\eta = 10^4$ used in Ref. [SDBP22]. Further, in Fig. 3 of Ref. [SDBP22], it can be seen that for values of the physical noise rate parameter $p \le 10^{-3}$, the injected magic states have logical error rate less than $p$ (with a quadratic scaling as a function of $p$). Since we use a rectangular surface code with bias $\eta = 10^2$, we take the injected magic states to have a logical error rate

$$\epsilon_{L,X} = \epsilon_{L,Y} = \frac{p}{3\eta}, \qquad \epsilon_{L,Z} = \frac{p}{3}, \tag{7.1}$$

where $\epsilon_{L,P}$ is the probability of a logical Pauli $P$ error when injecting the prepared magic state.
Note that the previous expression may be optimistic depending on the hardware implementation of the ZZ(θ) rotation used in Ref. [SDBP22] and given that our noise bias $\eta = 100$ is lower than what was considered in Ref. [SDBP22]. On the other hand, if the hardware architecture allows for the implementation of the ZZ(θ) rotation as in Ref. [SDBP22] and values of $p$ in the range $10^{-4} \le p \le 10^{-3}$ are below threshold at $\eta = 100$, then the expressions for $\epsilon_{L,P}$ in Eq. (7.1) can be quite pessimistic. Note that if the underlying hardware architecture enables the implementation of the methods described in Refs. [CN20, SC22], Eq. (7.1) may be improved even further (see footnote 1).

[Circuit diagram]

Figure 7.1: Circuit used in a 15-to-1 magic state distillation protocol expressed as a sequence of multi-qubit non-Clifford gates. The circuit above is the Hadamard-transformed version of Fig. 15 of Ref. [Lit19a], which produces the state H|T⟩. This state can be used in the same way as the regular magic state, with only a change of Pauli basis while measuring.

In what follows, we define

$$\epsilon_L = \epsilon_{L,X} + \epsilon_{L,Y} + \epsilon_{L,Z} = \frac{p}{3} + \frac{2p}{3\eta}. \tag{7.2}$$

Using the results of Fig. 3 of Ref. [SDBP22], we approximate the success probability of the state injection protocol to be greater than 99% for $p \le 10^{-3}$. Hence, we take the time taken to prepare a magic state to be approximately $T_{\mathrm{inj}} = 2/0.99$ syndrome measurement rounds for $p = 10^{-3}$ and $T_{\mathrm{inj}} = 2/0.995$ syndrome measurement rounds for $p = 10^{-4}$. It is especially important to consider the time taken for injection since in TELS protocols, the time for each multi-qubit Pauli measurement can be as low as two syndrome measurement rounds (if we include the time required for ancilla reset). Note that the precise success probabilities require a more careful analysis; however, we do not expect such analysis to have a significant effect on $T_{\mathrm{inj}}$.
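The injection model of Eqs. (7.1) and (7.2) and the estimate of $T_{\mathrm{inj}}$ reduce to a few lines of arithmetic. The sketch below is only an illustration: the function names are assumptions, and the expression for the injection time assumes the repeat-until-success reading suggested by $T_{\mathrm{inj}} = 2/0.99$, that is, the number of rounds per attempt divided by the success probability.

```python
def injected_error_rates(p, eta):
    """Per-Pauli logical error rates of an injected magic state, Eq. (7.1)."""
    eps_x = p / (3 * eta)
    eps_y = p / (3 * eta)
    eps_z = p / 3
    return eps_x, eps_y, eps_z


def total_injected_error(p, eta):
    """Total logical error rate of an injected magic state, Eq. (7.2)."""
    return p / 3 + 2 * p / (3 * eta)


def injection_time(rounds_per_attempt=2, success_prob=0.99):
    """Average number of syndrome measurement rounds per accepted state."""
    return rounds_per_attempt / success_prob


p, eta = 1e-3, 100
eps_total = total_injected_error(p, eta)
t_inj = injection_time()
print(f"eps_L = {eps_total:.3e}, T_inj = {t_inj:.3f} rounds")
```

At bias $\eta = 100$ the X and Y contributions together add only about 2% to the Z contribution, so $\epsilon_L$ is dominated by $p/3$.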
Footnote 1: Implementing the schemes described in Refs. [CN20, SC22] would result in a higher space-time cost compared to the other injection protocols described in this section, since color code patches would be used, and $O(d)$ rounds of error detection would be required. However the injected magic states would have failure probabilities that scale as $O(p^{(d+1)/2})$ instead of $O(p)$ or $O(p^2)$. Such low failure rates could then result in much smaller magic state distillation factories (see for instance Ref. [CNAA+22]).

7.1 Magic state distillation

A magic state distillation protocol takes several noisy magic states as input (which are encoded in some code such as the surface code), and by performing stabilizer operations which are part of an error detection protocol, yields fewer magic states of much higher fidelity. The output yield and magic state error probability depend entirely on the quantum code used for the error detection protocol. Bravyi and Haah suggested a class of distance-2 quantum codes for distilling $k$ magic states from $3k + 8$ input magic states [BH12]. These codes offer minimal protection, but have good yield. One of the protocols we consider in this work is the popular 15-to-1 distillation protocol [BK05], which can be transformed into a series of 11 commuting multi-qubit Pauli measurements [Lit19a]. These Pauli measurements form a PP set of size 11, and thus a TELS protocol can be used to reduce the runtime required to measure each multi-qubit Pauli operator.

In search of magic state distillation protocols with good output-to-input ratio, we found infinite families of protocols with near constant rate, where the focus is not on concatenating protocols but instead on measuring the stabilizers of an outer code using operations that are transversal for an inner code [HHPW17]. Other techniques construct more complex codes for distillation by generalizing triorthogonal codes, or by puncturing Reed-Muller codes [HH18].
In addition to the ⟦15, 1, 3⟧ code, in this work we also consider several triorthogonal quantum codes that are constructed by puncturing a [128, 29, 32] classical Reed-Muller code, as described in Ref. [HH18]. In particular, the distillation protocols we consider are derived from ⟦114, 14, 3⟧, ⟦116, 12, 4⟧ and ⟦125, 3, 5⟧ quantum codes. It is beneficial to consider quantum codes of different distances as this allows distilling magic states across a range of target logical error probabilities.

Since our focus in applying TELS to magic state distillation is to reduce space-time costs, we comment on some previous work by Litinski [Lit19a, Lit19b]. First it was shown that the time costs of distillation algorithms can be reduced; however, this leads to a disproportionate space increase, leading to overall larger space-time costs [Lit19a]. In further work, it was shown how to reduce space-time costs by performing the distillation using surface code patches of reduced distance, and using faulty T measurements instead of T state injection [Lit19b]. Improvements due to TELS protocols may also be applied to the work in Ref. [Lit19b], allowing even smaller space-time costs than shown in Table 7.1 of Section 7.2 (where we apply TELS to various distillation protocols and compute the space-time costs). The main objective of the results given in Table 7.1 is to show that space-time costs can be reduced when using TELS as opposed to when there is no temporal encoding. In addition, we make a careful assessment of the time and space required for magic state injection, which makes the results in Table 7.1 look more pessimistic when compared to results such as in Refs. [Lit19a, Lit19b].

Traditional distillation algorithms, such as those in Ref. [Lit19a], perform only pure multi-qubit Pauli Z measurements. In the remainder of this work, we apply Hadamard transformations to the distillation circuits to produce circuits consisting of pure multi-qubit X measurements.
Such a transformation reduces the space-time costs of the distillation factories, as only the shorter logical X boundary of the asymmetric surface code patch will need to be accessed. The Hadamard-transformed version of the 15-to-1 distillation circuit of Ref. [Lit19a] is shown in Fig. 7.1.

7.1.1 Distillation in the Clifford frame

To obtain the space-time costs of various magic state distillation protocols implemented with TELS, we first design an appropriate distillation protocol which outputs the desired magic state up to a Clifford correction. For simplicity, the general protocol can be separated into two steps. First the non-Clifford gates are applied using temporally encoded lattice surgery. If a non-trivial lattice surgery measurement failure is detected, all the physical qubits in the distillation tile are reset and the protocol restarts. We allow for the TELS protocol to also use the developments of Section 6.2.2, where classical errors of low weight may be corrected before signaling a lattice surgery measurement failure. Alternatively, if we follow the TELS protocol of Section 6.2.1, more magic states would need to be simultaneously held in memory, and hence the hardware requirements would be larger. If TELS was successful, we are left with a distilled magic state up to a Clifford frame, prior to performing the single-qubit measurements.

In the second part of the protocol, the Clifford frame is conjugated through the final single-qubit measurements. This changes the single-qubit measurements into multi-qubit π/2 Pauli measurements implemented via lattice surgery. Note that these measurements may now be tensor products of arbitrary Paulis and not just $\mathbb{1}$ and $X$. These multi-qubit measurements may also be sped up using TELS, since they all commute. In particular, we use the TELS protocol of Section 6.2.1 for these final multi-qubit Pauli measurements.
After the Clifford frame is conjugated through the single-qubit measurements, the output distilled magic states are correct up to a Clifford correction. In fact, for the example in Fig. 7.1, the resulting state is exactly one of $|T_X\rangle$, $X_{\pi/4}|T_X\rangle$, $X_{\pi/2}|T_X\rangle$, or $X_{3\pi/4}|T_X\rangle$, as we prove in Appendix .7. When using the distilled state in an algorithm, the final magic state measurement axis is modified depending on the Clifford frame. If magic states were prepared in the Pauli frame (see Fig. 7.3) rather than Clifford frames, such magic states would be measured in the Z basis using transversal single-qubit measurements (see Fig. 7.2a). However, the measurement basis may now be $-Y$, $-Z$ or $Y$, given that

$$X_{\pi/4}\, Z\, X_{\pi/4}^{\dagger} = -Y, \qquad X_{\pi/2}\, Z\, X_{\pi/2}^{\dagger} = -Z, \qquad X_{3\pi/4}\, Z\, X_{3\pi/4}^{\dagger} = Y. \tag{7.3}$$

[Circuit diagrams, panels (a) and (b)]

Figure 7.2: (a) Circuit for performing a π/8 multi-qubit Pauli measurement. The circuit requires a $|T_X\rangle = H|T\rangle$ resource state, and a Clifford correction may be required depending on the $P \otimes X$ measurement outcome. (b) Circuit for performing a Clifford gate using an ancilla prepared in |0⟩. Both circuits are adapted from Ref. [Lit19a].

For protocols that distill multiple magic states, the Clifford frame of the distilled states may contain multi-qubit operators. After the distilled states are used by non-Clifford gates in a core of a quantum computer, the Clifford frame must be further conjugated through the remaining single-qubit Z measurements. This may result in further multi-qubit Pauli measurements and additional routing space area in order to access the Y and Z logical boundary of the distilled states. A caveat of using the Clifford frame is that Y basis measurements may require an extra ancilla (depending on the chosen hardware implementation). As such, the design of magic state distillation factories may require additional routing space to store the ancillas needed for Y measurements.
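The three identities in Eq. (7.3) can be checked numerically with 2×2 matrices. The sketch below assumes the rotation convention $X_\theta = \exp(-i\theta X) = \cos\theta\,\mathbb{1} - i\sin\theta\,X$; this convention is an assumption on our part (it is not stated in this section), chosen because it reproduces the signs in Eq. (7.3). All helper names are illustrative, and only the Python standard library is used.

```python
import math

# Pauli matrices as nested lists.
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def scale(s, a):
    """Multiply a 2x2 matrix by a scalar."""
    return [[s * a[i][j] for j in range(2)] for i in range(2)]


def x_rotation(theta):
    """X_theta = exp(-i theta X) (assumed convention)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -1j * s], [-1j * s, c]]


def conjugate_z(theta):
    """Return X_theta Z X_theta^dagger."""
    u = x_rotation(theta)
    return matmul(matmul(u, Z), dagger(u))


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))


checks = {
    "pi/4 -> -Y": close(conjugate_z(math.pi / 4), scale(-1, Y)),
    "pi/2 -> -Z": close(conjugate_z(math.pi / 2), scale(-1, Z)),
    "3pi/4 -> Y": close(conjugate_z(3 * math.pi / 4), Y),
}
print(checks)
```

Under this convention the general relation is $X_\theta Z X_\theta^\dagger = \cos(2\theta)\,Z - \sin(2\theta)\,Y$, of which the three cases above are special instances.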
We now address the additive space cost of TELS and how it may be minimized in a distillation protocol. Distillation tiles are essential building blocks of fault-tolerant universal quantum computers, so it is worth finding the smallest qubit layouts for them. Consider the distillation circuit in Fig. 7.1. Each of the 11 non-Clifford gates requires an input $|T_X\rangle$ state, as shown in the non-Clifford circuit gadget of Fig. 7.2a. If we use TELS to perform the 11 measurements, 11 magic states will need to be held in memory, as indicated in Fig. 6.2b. In contrast, when performing lattice surgery without temporal encoding, only one cell is assigned to repeatedly prepare magic states for non-Clifford gates [Lit19a, Lit19b].

Upon close inspection of the multi-qubit Pauli measurements being performed in a TELS protocol, we notice that a magic state is stored only for as long as the Pauli it was associated with from the original PP set P appears in the sequence of new measurements S. Consequently, magic states do not need to be stored for the entire protocol. Consider performing TELS for a PP set of size 3, where each Pauli measurement consumes a magic state. We use the [4, 3, 2] error-detect code with the cyclic codeword matrix

$$G = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}. \tag{7.4}$$

Here, the choice of a cyclic representation is what allows us to reduce space requirements. We may read off the new multi-qubit Pauli measurements in S as $\{P_1, P_1P_2, P_2P_3, P_3\}$. Notice that after the measurement $P_1P_2$, the magic state associated with $P_1$ does not need to be accessed again. At this point the hardware holding the magic state used for the $P_1$ multi-qubit Pauli measurement can be reset to prepare the magic state required for the $P_3$ measurements. Since $P_1$ does not appear in any of the measurements after the first occurrence of $P_3$, the magic states associated with $P_1$ and $P_3$ are never simultaneously accessed.
Continuing with this argument, it can be seen that space for only two magic states is required to perform all the measurements given by Eq. (7.4). Hence for every code, a particular choice and ordering of codewords can result in a smaller quantity of $|T_X\rangle$ states that need to be stored.

For classical codes that are cyclic, a cyclic description of the codeword generator matrix allows for a reduced number of magic states needed to be held in memory. To determine exactly how many, note that for any column in this matrix (corresponding to a Pauli measurement from S), the number of rows between the first 1 and the last 1 denotes the number of magic states that need to be held in memory for that Pauli measurement. The maximum number of $|T_X\rangle$ states required for any of the columns of the codeword generator matrix is the maximum for the entire PP set.

For codes that do not have a natural cyclic set of codeword generators, the codewords must be chosen and ordered very carefully. In general, finding a sequence that minimizes the space requirements of magic states is an NP problem as there are exponentially many orderings of codewords. For instance, in the 15-to-1 distillation protocol of Fig. 7.1, one of the TELS protocols we considered (see Section 7.2) uses the classical Golay code. For this code, there is a natural cyclic representation of codewords as we show in Appendix .5.1. According to this construction, 12 magic states will need to be held in memory. However in Appendix .8, we show a specific choice of codewords that can reduce the space requirements to 10 magic states.

To execute the non-Clifford and Clifford gates in Fig. 7.1, we use the gate gadgets in Fig. 7.2. Since we conjugate the Clifford frame through to the final single-qubit measurements, we do not actually need to perform any π/4 Clifford gates, since such gates are converted to π/2 multi-qubit Pauli measurements [Lit19a].
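The magic-state storage count discussed above for Eq. (7.4) can be sketched in a few lines. The function below is an illustrative interpretation rather than the procedure used in this work: it assumes rows of the generator matrix are the original Paulis, columns are the encoded measurements in the order they are performed, and that each magic state must be held from the first to the last column in which its row contains a 1. The second matrix is a hypothetical non-cyclic generator of the same [4, 3, 2] code, included only for contrast.

```python
def magic_states_in_memory(generator):
    """Peak number of magic states held simultaneously.

    generator: list of rows (one per original Pauli), each a list of 0/1
    entries indexed by the encoded measurement (column) order.
    """
    n_cols = len(generator[0])

    # Lifetime of each magic state: first and last column using its Pauli.
    lifetimes = []
    for row in generator:
        support = [j for j, bit in enumerate(row) if bit]
        if support:
            lifetimes.append((support[0], support[-1]))

    # For every measurement, count the magic states that are still live.
    return max(
        sum(first <= j <= last for first, last in lifetimes)
        for j in range(n_cols)
    )


# Cyclic generator of Eq. (7.4): measurements P1, P1P2, P2P3, P3.
cyclic = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]

# Hypothetical alternative generator: measurements P1, P2, P3, P1P2P3.
spread = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]

print(magic_states_in_memory(cyclic), magic_states_in_memory(spread))
```

For the cyclic form the peak is two, in agreement with the argument above, whereas the alternative ordering keeps all three magic states alive until the final measurement.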
Time-cost analysis

The time taken to successfully complete one round of distillation is calculated as follows. Let $T_1$ be the time taken to implement TELS on the non-Clifford gates and $T_2$ be the variable time associated with the final multi-qubit π/2 Pauli measurements. $T_2 = 0$ if there are no updates to the Clifford frame. If a lattice surgery logical timelike failure is detected during TELS with probability $p_D$,

$$\begin{aligned} T_1 &= T_{\mathrm{inj}} + n(d'_m + 1) + p_D\big(T_{\mathrm{inj}} + n(d'_m + 1)\big) + p_D^2\big(T_{\mathrm{inj}} + n(d'_m + 1)\big) + p_D^3 \ldots \\ &= \big(T_{\mathrm{inj}} + n(d'_m + 1)\big)(1 + p_D + p_D^2 + \cdots) = \frac{T_{\mathrm{inj}} + n(d'_m + 1)}{1 - p_D}. \end{aligned} \tag{7.5}$$

Here, $T_{\mathrm{inj}}$ is the time taken to inject a magic state into a cell, calculated using the analysis in Chapter 7. Note that in the above equation, we assume that all the Pauli measurements after the first one do not wait any extra time for newly injected magic states. This assumption is validated by the fact that, for $p = 10^{-3}$, magic state injection requires two syndrome measurement rounds (with probability 99%), or four (with probability 99.99%). However, for all the distillation protocols we consider, TELS requires at least four syndrome measurement rounds per Pauli measurement.

The final multi-qubit π/2 Pauli measurements are non-deterministic, implying we may need to perform $k'$ Pauli measurements, where $0 \le k' \le \kappa$. In the previous inequality, κ is the number of single-qubit measurements on the input $|T_X\rangle$ states of the distillation protocol (not the ones used for the π/8 measurements) before the Clifford frame is conjugated through (for instance, in Fig. 7.1, κ = 4). To execute these Pauli measurements, we perform TELS according to the method of Section 6.2.

[Circuit diagram]

Figure 7.3: Circuit gadget for an auto-corrected non-Clifford gate. The circuit does not require the application of conditional Clifford gates to the logical data qubits. However, an extra ancilla prepared in |0⟩ is required.
If there are k′ measurements to perform, this takes time T2 = k′(dm + 1) without TELS. If we use TELS with measurement distance d″m and an [n′, k′, d′] code with detection probability p′D, then

$$T_2 = n'(d''_m + 1) + p'_D\big(n'(d''_m + 1)\big) + (p'_D)^2\big(n'(d''_m + 1)\big) + \cdots = \frac{n'(d''_m + 1)}{1 - p'_D}. \tag{7.6}$$

Note that when measuring the final multi-qubit Paulis, if a detection event is observed, the entire protocol does not need to be restarted. Instead, it is sufficient to just redo the Pauli measurements associated with the TELS protocol, as is done in Fig. 6.2b.

The time to successfully distill the magic state also depends on whether the distillation protocol itself detected an error in any of the input magic states. This is modeled by the probability that the magic state protocol detects an error on an input magic state, which we denote $p_D^{(M)}$. Thus the total time required to successfully distill a magic state is

$$T = T_1 + T_2 + p_D^{(M)}\, T = \frac{T_1 + T_2}{1 - p_D^{(M)}}. \tag{7.7}$$

7.1.2 Challenges of extending TELS protocols to Pauli frames

The multi-qubit Paulis associated with the encoded TELS measurements in a Clifford-frame distillation circuit are performed using the circuit shown in Fig. 7.2a. However, this results in a Clifford frame which eventually must be implemented using the Clifford gate gadgets in Fig. 7.2b. Keeping track of Clifford frames can be avoided by using the auto-corrected T gadgets shown in Fig. 7.3. In such an implementation, the time associated with the Clifford correction can be traded for the extra space used by the additional |0⟩ ancilla.

However, using auto-corrected T gadgets in a TELS protocol leads to additional challenges. When the k P ⊗ X measurements are performed using TELS, there will be an additional space cost associated with holding some magic state cells in memory. In order to benefit from the time speedups provided by TELS, a TELS protocol may need to also be performed on the |0⟩ ancilla states.
Such considerations would result in the space cost being tripled (relative to the Clifford frame scheme). To see this, note that the P ⊗ X and X ⊗ Y measurements occur simultaneously, since they have the same measurement distance and both X boundaries of the |TX⟩ state can be accessed simultaneously. In such a protocol, the numbers of |0⟩ and |TX⟩ states are identical. Furthermore, each |0⟩ state requires an additional cell in order to access its Y boundary.

If instead we do not perform TELS on the X ⊗ Y measurements, there are two options. The first is to perform the lattice surgery operations (with large measurement distance) sequentially, which results in a speed mismatch between the X ⊗ Y measurements and the P ⊗ X measurements, leading to a backlog of |TX⟩ states and |0⟩ states that will need to be held in memory (and so there would be no time improvement due to TELS). The second is to perform the slow lattice surgery operations in parallel, but this also incurs an additional space cost to hold all the |0⟩ cells.

Given the above considerations, and the challenges associated with the design of distillation tiles for the inclusion of |0⟩ ancillas, we leave the analysis of TELS protocols applied to magic state distillation protocols in the Pauli frame to future work.

7.2 Precise design of distillation tiles

In this section we analyze space-time costs of various distillation protocols in different noise regimes.
At a physical error rate of $p = 10^{-4}$, one round of 15-to-1 distillation with robust lattice surgery operations is sufficient to distill magic states with final error probability $\delta^{(M)} \le 10^{-10}$. For $\delta^{(M)} = 10^{-15}$ (which is relevant for larger algorithms), or for distillation protocols with $p = 10^{-3}$, we considered 100+ qubit quantum codes, as suggested in Ref. [HH18]. To the best of our knowledge, our work is the first to analyze space-time costs of distillation protocols using these larger codes.

| p | δ^(M) | Distillation code | Circuit type | dx | dz | Space | Time (NS) | Space-time cost (# qubits × NS) |
|---|---|---|---|---|---|---|---|---|
| 10^−4 | 10^−10 | ⟦15, 1, 3⟧ | No encoding | 7 | 9 | 1360 | 110.06 | 1.5 × 10^5 |
| | | | No enc., Par | 7 | 9 | 3300 | 60.03 | 1.98 × 10^5 |
| | | | Cliff-SED | 7 | 9 | 1120 | 104.05 | 1.17 × 10^5 |
| | | | Cliff-SED, Par | 7 | 9 | 2352 | 68.03 | 1.6 × 10^5 |
| | | | Cliff-BCH | 7 | 9 | 1344 | 92.08 | 1.24 × 10^5 |
| | | | Cliff-BCH, Par | 7 | 9 | 2688 | 64.06 | 1.72 × 10^5 |
| | | | Cliff-Golay | 7 | 9 | 1792 | 124.06 | 2.23 × 10^5 |
| | | | Cliff-Golay, Par | 7 | 9 | 3024 | 80.04 | 2.42 × 10^5 |
| 10^−4 | 10^−15 | ⟦116, 12, 4⟧ | No encoding | 9 | 15 | 9440 | 1391.48 | 1.31 × 10^7 |
| | | | No enc., Par | 9 | 15 | 14934 | 702.77 | 1.05 × 10^7 |
| | | | Cliff-BCH9, Par | 9 | 15 | 22800 | 431.82 | 9.85 × 10^6 |
| | | | Cliff-Zett5, Par | 9 | 15 | 18600 | 442.41 | 8.23 × 10^6 |
| 10^−3 | 10^−10 | ⟦114, 14, 3⟧ | No encoding | 9 | 19 | 11750 | 1646.61 | 1.93 × 10^7 |
| | | | No enc., Par | 9 | 17 | 16836 | 831.62 | 1.4 × 10^7 |
| | | | Cliff-BCH7, Par | 9 | 17 | 22400 | 595.31 | 1.33 × 10^7 |
| | | | Cliff-Zett5, Par | 9 | 17 | 20480 | 623.27 | 1.28 × 10^7 |
| 10^−3 | 10^−15 | ⟦125, 3, 5⟧ | No encoding | 13 | 25 | 17514 | 2479.17 | 4.34 × 10^7 |
| | | | No enc., Par | 13 | 25 | 29548 | 1252.11 | 3.7 × 10^7 |
| | | | Cliff-BCH7, Par | 13 | 25 | 38640 | 859.8 | 3.32 × 10^7 |
| | | | Cliff-BCH9, Par | 13 | 25 | 42504 | 729.66 | 3.1 × 10^7 |

Table 7.1: Space-time costs of different distillation protocols on a biased-noise planar surface code. $\delta^{(M)}$ is the target logical error rate per output magic state. TELS protocols are labeled “Cliff-xxx”, with “Par” implying that measurements are performed two at a time (i.e., with lattice surgery measurements which can access the two X logical boundaries of surface code patches simultaneously). The number of physical qubits is two times the space cost, since the space cost counts only the number of data qubits of the surface code. The probability that a distillation algorithm rejects due to an error in an injected magic state is $p_D^{(M)} = 1 - (1 - \epsilon_L)^n$, where $\epsilon_L$ is given by Eq. (7.2). For the 15-to-1 distillation protocol, the space-time cost of a protocol using TELS ($1.17 \times 10^5$) is approximately 22% smaller than that of a protocol that does not use TELS ($1.5 \times 10^5$). For the 125-to-3 distillation protocol, the space-time cost is decreased by approximately 16% with TELS ($3.1 \times 10^7$ versus $3.7 \times 10^7$ for the best protocol without TELS). The label NS refers to the number of syndrome measurement rounds required for the entire distillation protocol.

For the noise rate regimes $p = 10^{-4}$ and $p = 10^{-3}$, we estimate the space-time costs of the various distillation protocols using different implementations of lattice surgery. We first consider protocols that do not use TELS. These protocols execute non-Clifford gates using the auto-corrected non-Clifford gates of Fig. 7.3. Subsequently, we consider distillation protocols that perform TELS, using the methods developed in Section 7.1.1.

In contrast to Ref. [Lit19a], the distillation tiles developed in this paper are all rectangular, minimizing wasted space when tiled on a 2D grid of qubits. Note also that in Ref. [Lit19b], the logical qubits are designed to have different space-like distances dx and dz even without a biased noise model. This improvement is permitted due to the specific function of each qubit in the distillation protocol. When applying these improvements to the distillation protocols and layouts in this work, the space-time costs may be further reduced.

The time-like distance of lattice surgery dm and the space-like distances dx and dz of the logical qubits and routing regions are computed using a procedure detailed in Appendix .9. Essentially, a set of distances {dx, dz, dm} must be determined that minimizes the overall space-time cost, while ensuring the output magic states have logical errors with probability at most $\delta^{(M)}$.
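The entries of Table 7.1 can be cross-checked with a few lines of arithmetic. The sketch below assumes that the tabulated space-time cost is the product of the Space and Time columns (which reproduces the rounded entries for the 15-to-1 rows), and computes the relative saving of one protocol over another. The dictionary layout and function names are illustrative choices, not part of the original analysis.

```python
# (space, time in syndrome rounds) for some 15-to-1 rows of Table 7.1
TABLE_15_TO_1 = {
    "No encoding": (1360, 110.06),
    "No enc., Par": (3300, 60.03),
    "Cliff-SED": (1120, 104.05),
    "Cliff-SED, Par": (2352, 68.03),
    "Cliff-BCH": (1344, 92.08),
    "Cliff-BCH, Par": (2688, 64.06),
}


def space_time_cost(space: float, rounds: float) -> float:
    """Assumed definition: space cost multiplied by the number of rounds."""
    return space * rounds


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` is smaller than `baseline`."""
    return 1.0 - improved / baseline


if __name__ == "__main__":
    costs = {name: space_time_cost(*row) for name, row in TABLE_15_TO_1.items()}
    for name, cost in costs.items():
        print(f"{name:16s} {cost:.3e}")
    saving = relative_reduction(costs["No encoding"], costs["Cliff-SED"])
    print(f"Cliff-SED vs. No encoding: {saving:.1%}")
```

Under this assumption, the Cliff-SED layout comes out roughly 22% cheaper in space-time cost than the unencoded sequential layout.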
In solving Eq. (30), we must first determine certain constants related to each hardware layout. These include the area of the distillation tile used in each protocol (which we denote as the space cost), the number of logical qubits used in a tile N, the worst-case routing space area A and the maximum area used during any lattice surgery measurement. We develop 20 different layouts in this work, and the associated constants for each of them are tabulated in Appendix .10.

In Table 7.1, we display the space-time costs of the various distillation protocols considered in this work. Using TELS protocols, it is possible to achieve lower space-time costs than using protocols without any temporal encoding. Although there are only minor improvements to the space-time cost of TELS-assisted distillation tiles, many of these tiles will be needed in each distillation factory. As a result, the improvements add up and the quantum computer as a whole will have a lower space-time volume. Moreover, with reduced time costs, fewer distillation tiles may be required altogether, as we show in Section 7.2.2. This in turn further reduces the space-time cost of distillation factories.

Interestingly, for the 15-to-1 distillation protocol, circuits that perform TELS do not produce tiles that have smaller time costs. Instead, TELS-assisted tiles require fewer qubits, and this can be attributed to the use of non-Clifford gate gadgets as shown in Fig. 7.2a. Below, we show layouts for the 15-to-1 distillation routine. In Appendix .11, we show layouts for the 100+ qubit codes.

7.2.1 15-to-1 distillation

The 15-to-1 magic state distillation protocol is one of the most widely known protocols for distilling |T⟩ states (see for instance Refs. [BK05, BKS21, FG18, Lit19a, Lit19b]). The protocol originates from a ⟦15, 1, 3⟧ triorthogonal CSS quantum code. The code has the property that the application of T gates on all of the physical qubits of the code implements a logical T† gate.
Since we perform distillation with |TX⟩ states, we define this code to contain 1 logical qubit, 10 X-type stabilizers and 4 Z-type stabilizers. A distilled magic state is produced by encoding a logical |0⟩ state, applying the transversal T gates, decoding and then performing measurements. By propagating the Clifford gates past the transversal T gates and removing the redundant parts of the circuit, we are left with the circuit in Fig. 7.1 (see Ref. [Lit19a] for a more detailed derivation). The circuit contains 11 commuting Pauli measurements on 5 logical qubits (four of which are logical |TX⟩ states). Since the 11 Pauli measurements commute, they form a size-11 PP set.

There exist many choices of classical codes to be used in TELS protocols with size-11 PP sets. In this paper we will focus on using a Single Error Detect code of distance 2 ([12, 11, 2]), a BCH code of distance 3 ([15, 11, 3]) and the Golay code of distance 7 ([23, 12, 7]).

We will consider using this distillation protocol in a regime where the physical error rate is $p = 10^{-4}$ and the target logical error rate per magic state is $\delta^{(M)} = 10^{-10}$. Using the noise model described in Chapter 7 for the injection of magic states, we apply the analysis in Ref. [Lit19b] to determine the logical failure probability per output magic state for one round of a 15-to-1 distillation scheme, which is given by

$$p_L^{(M)} = 35(\epsilon_{L,Z})^3 + \tfrac{1}{2}\,6(\epsilon_{L,Z})^2\epsilon_{L,X} + \tfrac{1}{4}\,12\,\epsilon_{L,Z}(\epsilon_{L,X})^2 + \tfrac{1}{8}\,8(\epsilon_{L,X})^3 = \frac{35(1+\eta)^3}{27\eta^3}\,p^3. \tag{7.8}$$

Figure 7.4: Layouts of logical qubits for TELS-assisted 15-to-1 state distillation. Data qubits are placed in blue cells. Magic states are in pink cells, where cells with a radial shading are extra cells used to prepare new magic states in parallel with the Pauli measurements. |0⟩ ancillas for autocorrected gadgets are placed in the brown cells adjacent to the yellow squares used for twists.
Green cells are used to store distilled magic states for use by the core while the next round of distillation occurs. Additional green cells may be required if a distillation tile produces magic states faster than the core consumes them (alternatively, the magic states can be transported to additional tiles surrounding the core). Routing regions between cells are split into grey and blue to show that the relevant lattice surgery operations will not clash. (a) Layout for un-encoded lattice surgery using autocorrected non-Clifford gate gadgets of Fig. 7.3. The grey routing region handles the X ⊗ Y measurements and the blue routing regions perform X-boundary measurements between different logical qubits. (b) Layout for 15-to-1 distillation with TELS, using the [12, 11, 2] Single Error Detect code. Note that we only need one radial pink cell. However, given the geometry of the entire tile, we use the remaining space for another pink radial tile. (c) Layout using the [15, 11, 3] BCH code, and (d) using the [23, 12, 7] Golay code.

For $p = 10^{-4}$ and η = 100, the probability that the distillation succeeds is $1 - p_D^{(M)} = (1 - \epsilon_L)^{15} = 0.999$ and $p_L^{(M)} = 1.33 \times 10^{-12}$. As this is sufficiently below $\delta^{(M)}$, the lattice surgery measurements used to execute the distillation protocol must be modeled with measurement distance large enough to allow for distilled magic states of logical error rate at most $\delta^{(M)}$. Using the procedure in Appendix .9, we determined that the minimum spacelike distances required are dx = 7 and dz = 9.

In the subsequent subsections, we detail the specifics of the hardware layouts that are used for the various distillation protocols, both with and without TELS. Arranging the logical qubits according to these layouts minimizes the space requirements of distillation blocks. In addition, TELS is used to minimize the time costs. Overall we observe that protocols that use TELS can achieve lower space-time costs than those that do not.
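The closed form on the right-hand side of Eq. (7.8) can be evaluated directly to reproduce the quoted output error rate. The sketch below is a minimal check under the assumption that the closed form reads 35(1 + η)³p³/(27η³), which is the reading consistent with the value 1.33 × 10⁻¹² quoted for p = 10⁻⁴ and η = 100. The function name is hypothetical.

```python
def output_error_15_to_1(p: float, eta: float) -> float:
    """Closed form of Eq. (7.8): 35 (1 + eta)^3 p^3 / (27 eta^3)."""
    return 35.0 * (1.0 + eta) ** 3 / (27.0 * eta ** 3) * p ** 3


if __name__ == "__main__":
    p_out = output_error_15_to_1(p=1e-4, eta=100.0)
    target = 1e-10  # target logical error rate per magic state
    print(f"p_L^(M) = {p_out:.3e}, below target: {p_out < target}")
```

The result sits about two orders of magnitude below the target of 10⁻¹⁰, which leaves room in the error budget for the lattice surgery measurements.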
Figure 7.5: Layouts of logical qubits for parallelized TELS-based 15-to-1 state distillation protocols. The meaning of each color is described in the caption of Fig. 7.4. (a) Layout for un-encoded lattice surgery, with two routing regions, each accessing one X boundary of the logical qubits. Each routing region has access to a separate magic state and a |0⟩ ancilla used in the circuit of Fig. 7.3. (b) Layout for distillation with TELS, using the [12, 11, 2] Single Error Detect code. Three magic state tiles are held in memory for each pair of parallel Pauli measurements. Then two are discarded and two prepared magic states on other pink cells are used in the following round. (c) Layout and routing region used for the final multi-qubit Pauli measurements in the Clifford frame distillation protocol. (d) Layout for parallelized distillation with TELS, using the [15, 11, 3] BCH code. (e) Layout for parallelized distillation with TELS using the [23, 12, 7] Golay code. For (d) and (e), the layouts used to perform the final multi-qubit Pauli operations required by the Clifford frame can be obtained in the same way as (c) is obtained from (b).

Figure 7.6: (a) On the layout of Fig. 7.5b, we show how two separate routing spaces can be used to perform parallel lattice surgery measurements. The logical measurements are $X_1 \otimes X_2 \otimes X_3 \otimes X_{T_X,1}$ (in the equatorial routing space) and $X_3 \otimes X_4 \otimes X_{T_X,1} \otimes X_{T_X,2}$ (in the circumferential routing space). These are the first and second measurements respectively when performing TELS-assisted distillation using the [12, 11, 2] SED code (see Eq. (17) of Appendix .5 for the codeword generator matrix). Alternatively, they correspond to the first measurement and the product of the first and second measurements from Fig. 7.1. The logical patches have code distances dx = 3, dz = 5. X stabilizers are in red, and Z stabilizers are in blue.
The product of the X stabilizers indicated by white vertices gives the parity for the multi-qubit Pauli measurement outcomes. (b) On the same layout, we show how to perform Pauli measurements which are tensor products of X, Y, or Z on the data qubits. These measurements are performed after the non-Clifford gates of the distillation protocol. The example in the figure measures X ⊗ Y ⊗ X ⊗ X ⊗ Y on the five data qubits. The yellow stabilizers are twist defects that are used to access Y boundaries of logical qubits that are originally defined with only X and Z boundaries, using the techniques shown in Ref. [CC22a]. Note that the size of the routing space area separating the top and bottom rows of data qubits is taken to be large enough to allow for Y measurements requiring twists.

No temporal encoding

First, we calculate the space-time cost of the distillation circuit of Fig. 7.1 without temporally encoded lattice surgery. For this, we consider a modified version of the layout used by Litinski (see Fig. 18 of Ref. [Lit19a]), as shown in Fig. 7.4a. In this figure, the five blue cells at the bottom correspond to the data qubits of Fig. 7.1; the pink cells are used to store magic states for performing the π/8 multi-qubit Pauli measurements; radial pink cells are used to inject new magic states for subsequent Pauli measurements (thus preventing time delays due to state injection); brown cells with an adjacent yellow square for twists are |0⟩ ancillas used in the autocorrected non-Clifford gate gadget of Fig. 7.3; and green cells are used to store the distilled magic state from the previous round of distillation, so that it may be accessed by the core of the quantum computer. There are two cells assigned for magic states.
Without TELS, only one magic state is used per non-Clifford gate (note that one X boundary has access to the data qubits, and the other boundary has access to the Y boundary of the |0⟩ ancilla, hence these lattice surgery measurements may be performed in parallel). However, at any given time, one magic state cell will take part in a non-Clifford gate, and the other will be used to prepare a noisy magic state for the subsequent non-Clifford gate. This is required as it takes a non-trivial amount of time to prepare a noisy magic state (roughly 2 syndrome measurement rounds as described in Chapter 7). In Figs. 7.4, 7.5 and 14 to 16, we use the radial-shaded pink cell to denote the extra cells needed for this simultaneous magic state preparation.

Since we assume classical processing is instantaneous, only one qubit cell is assigned for the |0⟩ ancilla used in the autocorrected gadget. Note however that in Ref. [Lit19a] and Ref. [SBB+22a], it was shown that for finite decoding times, additional |0⟩ ancilla qubits can be used to offset the extra time cost associated with decoding all syndrome measurement rounds associated with the previous lattice surgery operations. However, with classical parallelization and pre-decoders [CGS+22, SBB22b, SBB+22a, TZC+22], such additional ancillas may be unnecessary.

Note that we do not need to shuttle the distilled magic state from a blue cell to a green cell. We design the distillation tile such that the output magic state is in the rightmost blue cell. In the next round of distillation, the layout is mirrored about the equator and the magic state cell now becomes a green cell with core access. In this way, the distillation protocol may be restarted without any shuttling delays. Using the procedure detailed in Appendix .9, we determined that the lattice surgery measurement distance must be dm = 9 to obtain magic states with logical failure rate at most $10^{-10}$.

For every data qubit cell, there are two accessible X boundaries.
We can almost trivially speed up the protocol by a factor of two by assigning new routing space and ancilla cells that access the second X boundary of the data qubits. This new hardware layout allows us to perform two multi-qubit Pauli measurements in parallel. This idea was originally proposed in Ref. [Lit19a]. We show a layout that performs lattice surgery measurements two at a time without temporal encoding in Fig. 7.5a. Note that with this layout, a distilled magic state present on a blue data cell must be shuttled to a green storage cell. There is an extra time cost associated with the shuttling operation. We do not include the time cost of shuttling in Table 7.1 as it may still be possible to eliminate shuttling using a more clever layout. In any case, the layout without parallel measurements yields a smaller space-time cost. TELS-Single Error Detect [12, 11, 2] Next, we consider a distillation protocol that uses TELS to execute the non-Clifford gates, with the protocol described in Section 7.1.1. We first determine the space-time cost using the Single Error Detect [12, 11, 2] code. The codeword generator matrix for this code is given in Eq. (17) of Appendix .5. If the measurements are performed sequentially, as in the layout of Fig. 7.4b, one routing space with access to the X boundaries of all the qubits will suffice. For a faster distillation tile that performs measurements two at a time, the measurements may be performed using extra routing space as shown in Fig. 7.5b. Note that we now use the non-Clifford gadget of Fig. 7.2a to perform the π/8 rotations since we perform distillation in the Clifford frame. This frees up space, as we do not need to allocate qubits for a |0⟩ ancilla with Y boundary access. This can reduce the routing space and the number of logical qubit cells needed. On the other hand, TELS incurs a larger space cost as more magic states need to be held in memory. In Fig. 
7.5b, we separate the routing space into grey and light blue regions to show non-intersecting routing areas for the two parallel multi-qubit Pauli measurements. Each of these routing spaces has access to the X boundaries of all the data and magic state cells involved in the distillation. In Fig. 7.6a, we show the routing space regions that are used to perform the lattice surgery measurements corresponding to the first two parallelizable Paulis when the [12, 11, 2] code is used for TELS.

Figure 7.7: Time dynamics of a TELS-assisted distillation factory. Here we consider a 15-to-1 distillation protocol with the [12, 11, 2] code protecting temporally encoded lattice surgery of the 11 non-Clifford gates, followed by the 4 Pauli measurements protected with the [5, 4, 2] code. (a) The first part of the circuit consists of the Pauli measurements corresponding to the non-Clifford gates. We assume that the non-Clifford measurement results yield 110000000011 and the eleven |TX⟩ state measurements yield 10000000001. The Clifford corrections, derived from Fig. 7.2a, are then conjugated through the final Pauli measurements. (b) Sequence of lattice surgery measurements when both TELS-assisted protocols are combined.
In the first syndrome measurement round of the lattice surgery measurement, stabilizers with white vertices yield random outcomes due to the gauge fixing step [VLC+19]. The lattice surgery measurement outcomes are then the error-corrected measurement values corresponding to the X stabilizers in the respective routing regions. Using Appendix .9, we determined that dm = 5 is sufficient for $\delta^{(M)} = 10^{-10}$ when using the [12, 11, 2] code for TELS.

Access to only the X boundaries of the data cells is sufficient for the first stage of the protocol, which is the temporally encoded measurements for the non-Clifford gates. In the second stage, we must perform multi-qubit Pauli measurements which are tensor products of X, Y and Z. In Fig. 7.5c, we show how to perform these measurements on the same layout without shuffling surface code patches around. On the distilled magic state (bottom left blue cell), the multi-qubit measurements only need X boundary access. The remaining data qubits will need at least one accessible Z and Y boundary. For the 15-to-1 distillation protocol, there are at most 4 multi-qubit π/2 Pauli measurements. Since these measurements may require access to different types of boundaries on each data cell, they cannot in general be performed in parallel. Hence the entire routing space (in light blue) is used to perform these measurements. As shown in Section 7.1.1, this set of measurements corresponds to the second PP set. Since there are four single-qubit measurements at the end of Fig. 7.1, there are at most 4 measurements in this second PP set. We perform these measurements using TELS with a [5, 4, 2] Single Error Detect code and with measurement distance dm = 5.

When using the [12, 11, 2] code for the lattice surgery measurements of the non-Clifford gates, we choose the generator matrix G in cyclic form, so that at most two magic states need to be accessed simultaneously for each Pauli measurement.
After each measurement, a magic state cell can be reset and reused for a future non-Clifford gate. If Pauli measurements are performed two at a time, three magic states will need to be concurrently held in memory. These are the solid pink tiles of Fig. 7.5b. In addition, for each subsequent pair of measurements, at least two injected magic states are required, which is why there is not just one additional magic state tile associated for injection but three. One of them may be removed, but then distillation tiles will either not be rectangular or will contain wasted physical qubits. In any case, the extra cell for injection can ensure there is always a magic state injected and ready for a non-Clifford gate. Note however that since keeping track of Clifford frames requires occasionally performing Y measurements when using the distilled magic states in the core, the extra pink radial cell could also be used to store the ancilla needed to perform the Y measurement (see footnote 2).

Other TELS protocols

In addition to the Single Error Detect code, we considered distillation with a distance-3 BCH code and the distance-7 classical Golay code. In Fig. 7.4c, we show a layout for a distillation tile that performs TELS with a classical [15, 11, 3] BCH code. The time cost can be decreased by performing the lattice surgery measurements corresponding to the non-Clifford gates two at a time. A layout with sufficient routing space for this is shown in Fig. 7.5d. Note that since there are two disjoint routing spaces (in blue and grey), the set of n = 15 Pauli measurements can be performed in the time required for 8 sequential measurements. To obtain the time cost shown in Table 7.1, we used dm = 3. Similarly, in Figs. 7.4d and 7.5e, we show layouts for 15-to-1 distillation tiles that perform TELS with a classical [23, 12, 7] Golay code.
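The effect of two disjoint routing spaces on the number of sequential measurement steps can be illustrated with a short calculation. This sketch assumes that measurements are simply grouped into batches of at most one per routing region, and that each batch takes dm + 1 syndrome rounds, as in the n(d′m + 1) term of Eq. (7.5). It ignores injection time, the final Pauli measurements and retries, so it does not reproduce the time column of Table 7.1. The function names are hypothetical.

```python
import math


def sequential_slots(n_measurements: int, routing_regions: int = 2) -> int:
    """Number of sequential steps when `routing_regions` Pauli measurements
    can be performed at the same time."""
    return math.ceil(n_measurements / routing_regions)


def measurement_rounds(n_measurements: int, d_m: int,
                       routing_regions: int = 1) -> int:
    """Syndrome rounds spent on the measurements alone, at d_m + 1 per step."""
    return sequential_slots(n_measurements, routing_regions) * (d_m + 1)


if __name__ == "__main__":
    for name, n in [("SED [12,11,2]", 12), ("BCH [15,11,3]", 15), ("Golay [23,12,7]", 23)]:
        print(name, sequential_slots(n))
```

For the [15, 11, 3] BCH code this gives 8 steps for the 15 measurements, matching the count stated above.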
Here, we used the parameters dm = 3 and c = 2 (where c is the maximum weight of classical errors that are corrected in a TELS protocol) to obtain the time costs shown in Table 7.1. For the 15-to-1 distillation protocol, implementing TELS using the classical Golay code does not allow for smaller space or time costs. However, for a small enough physical error rate, it is sufficient to consider a measurement distance dm = 1, which allows for a smaller time cost than any other lattice surgery protocol.

The space requirements of all the above layouts are described as functions of dx and dz in Appendix .10. Additional constants in Appendix .10 can be used with the procedure of Appendix .9 to determine all the minimum distances (spacelike and timelike) for the distillation protocols.

Footnote 2: Performing a Y measurement on a surface code patch can be done in various ways. For instance, one can perform a logical phase gate, followed by measuring all the data qubits in the X basis. However, performing a logical phase gate on a two-dimensional planar architecture with the surface code requires additional routing space and measurements involving twists (see for instance Fig. 23 of Ref. [BDM+21]). Alternatively, one could use an ancilla prepared in the logical |0⟩ state, and perform a Y ⊗ Z measurement to get the parity of the Y measurement outcome.

7.2.2 Scheduling distillation tiles in a factory

Quantum computer architectures that perform Pauli-based computation generally contain two parts: a core and a magic state distillation factory. The core contains the data qubits taking part in the logical computation. It is known that TELS can speed up the runtime of PP sets executed in the core [CC22b]. In this paper, we also applied TELS to distillation circuits and observed reduced space-time costs. However, distillation tiles are just modules that are used to construct a complete distillation factory, where many distillation tiles must be arranged with sufficient routing space to access the core. When merging a factory with a core, we are faced with additional scheduling and layout challenges. In this paper, we tackle the problem of scheduling, applying our speedups from TELS. We leave the design of the layouts of distillation factories for future work.

Figure 7.8: Round robin scheduling of deterministic-time distillation tiles in a factory. This scheduling method allows minimizing core wait time if a distillation tile rejects. The horizontal axis is time and the labels above are timestamps. Timestamp $T_j^{(i)}$ indicates the end of the jth distillation tile while accumulating magic states for the ith PP set executed in the core. A * indicates the start of the distillation tile for the first time. In this example, there are D = 3 distillation tiles, each producing l = 2 distilled magic states in time Tm, for a core that executes PP sets of size k = 8.

When executing an algorithm in the core using TELS on PP sets of size k, we denote the time to execute the algorithmic PP set as TPBC. The distillation factory will simultaneously be working to distill at least k new magic states for the next algorithmic PP set. It does this in time Tmagic. General quantum algorithms operate on timescales much longer than Tmagic, and so it is important to ensure the core is never idle and waiting for magic states. This situation, where the core is idling, is called a magic-state bottleneck. To avoid this bottleneck, we would like to ensure

$$T_{\mathrm{magic}} \le T_{\mathrm{PBC}}. \tag{7.9}$$

The best case scenario is when Tmagic ≈ TPBC. As discussed in Ref. [CC22b], this yields the smallest space-time cost for running algorithms.
We now wish to determine how many distillation tiles are required to satisfy the above condition. This can vary depending on the size of the PP set executed in the core and its relative speed-up. The number of tiles required also depends on the scheduling algorithm used in the factory, especially since some distillation tiles may detect errors and have to restart before producing magic states that can be used by the core. We first show how a simple algorithm can lead to large time costs when a distillation tile fails. Next we show how to use round robin scheduling to minimize additional time costs due to distillation tile failures. Note that the round robin scheduling algorithm can also be used when distilling lower level magic states in a concatenated distillation protocol. Consider a situation where Eq. (7.9) is satisfied and k is greater than or equal to the number of magic state storage cells (green cells in Fig. 7.4) in the factory. This implies each distillation tile takes less time to produce magic states than TPBC. Hence a simple factory schedule would be to start distillation on all tiles when a new PP set is beginning to be executed in the core. However, if the core is only marginally slower than the factory and a distillation tile rejects, the core will pause and wait for new magic states to be produced. Such a situation is undesirable since the core would now need to wait by a time Tm, where Tm is the worst-case time needed to produce distilled magic states by a distillation tile. In an attempt to reduce the core waiting time due to the rejection of a distillation tile (either due to TELS or the distillation algorithm), we suggest a round-robin approach. We assume that we have a distillation tile where, in the case where no errors are detected during the magic state distillation protocol, the tile produces l magic states in time Tm using a deterministic algorithm (TELS distillation is adaptive, but we consider the worst case time). 
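The average distillation time under round robin scheduling, derived as Eq. (7.10) in this section, and the resulting tile count of Eq. (7.11) can be evaluated numerically. The sketch below assumes that l divides k, uses the series with general term j·C(k/l + j − 1, j)·pD^j·(1 − pD)^(k/l − 1)·Tm/D, and rounds the tile count up to an integer. The function names and the sample values of Tm, pD and the target time are hypothetical.

```python
import math


def mean_distill_time(t_m: float, k: int, l: int, d_tiles: int, p_d: float) -> float:
    """Closed form of Eq. (7.10): T_m k (1 - p + p^2) / (l D (1 - p)^2)."""
    return t_m * k * (1 - p_d + p_d ** 2) / (l * d_tiles * (1 - p_d) ** 2)


def mean_distill_time_series(t_m: float, k: int, l: int, d_tiles: int,
                             p_d: float, terms: int = 200) -> float:
    """Truncated series form of Eq. (7.10); assumes l divides k."""
    if k % l != 0:
        raise ValueError("this sketch assumes l divides k")
    m = k // l
    base = t_m * k / (l * d_tiles)
    extra = sum(
        j * math.comb(m + j - 1, j) * p_d ** j * (1 - p_d) ** (m - 1) * t_m / d_tiles
        for j in range(1, terms)
    )
    return base + extra


def tiles_needed(t_m: float, k: int, l: int, p_d: float, t_target: float) -> int:
    """Eq. (7.11) rounded up: smallest D with mean distillation time <= t_target."""
    return math.ceil(t_m * k * (1 - p_d + p_d ** 2) / (l * t_target * (1 - p_d) ** 2))


if __name__ == "__main__":
    # Fig. 7.8 example: D = 3 tiles, l = 2 states per tile, PP sets of size k = 8.
    print(mean_distill_time(100.0, 8, 2, 3, 0.01))
    print(mean_distill_time_series(100.0, 8, 2, 3, 0.01))
    print(tiles_needed(100.0, 8, 2, 0.01, t_target=150.0))
```

With pD = 0 the closed form reduces to Tm·k/(l·D), the failure-free time of the round robin schedule.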
The probability that a tile detects an error on an input magic state, or that the TELS protocol detects a timelike failure during lattice surgery, is $p_D$. If $D$ distillation tiles are used with round robin scheduling, the average time to distill $k$ magic states is

$$
\begin{aligned}
T_{\mathrm{magic}} &= \frac{T_m k}{lD} + \binom{k/l}{1} p_D (1-p_D)^{k/l-1} \frac{T_m}{D} + \binom{k/l+1}{2} p_D^2 (1-p_D)^{k/l-1} \frac{2T_m}{D} + \dots \\
&= \frac{T_m k}{lD} + p_D (1-p_D)^{k/l-1} \frac{T_m}{D} \times \sum_{j=1}^{\infty} j \binom{k/l+j-1}{j} p_D^{j-1} \\
&= \frac{T_m k}{lD} + p_D (1-p_D)^{k/l-1} \frac{T_m}{D} \, \frac{k}{l} (1-p_D)^{-k/l-1} \\
&= \frac{T_m k (1 - p_D + p_D^2)}{lD (1-p_D)^2}.
\end{aligned} \tag{7.10}
$$

In Fig. 7.8, we show an example of the round robin scheduling algorithm with 3 distillation tiles that each produce 2 distilled magic states in time $T_m$. If the core executes PP sets with size 8, the 8 required magic states are distilled and prepared in time $T_m k/(lD)$, if no errors are detected. From Eq. (7.10) we may solve for the number of distillation tiles $D$ required when $T_{\mathrm{magic}} < T_{\mathrm{PBC}}$:

$$D = \frac{T_m k (1 - p_D + p_D^2)}{l \, T_{\mathrm{magic}} (1-p_D)^2}. \tag{7.11}$$

The shortcoming of this calculation is that it only applies to constant-time distillation tiles. If the tile takes adaptive time, such as in the magic state distillation protocol of Section 7.1.1, where there are a non-trivial number of extra measurements, it is unclear what the most efficient scheduling algorithm is. In this case, we may still upper bound the total time $T_m$ of a magic state distillation tile, thus making the round robin scheduling algorithm applicable. However, adapting a scheduling algorithm to the type of distillation tile used could allow for a more precise calculation of the time cost of the factory, which could possibly reduce the required number of distillation tiles.

Bibliography
[AAA+23] R. Acharya, I. Aleiner, R. Allen, et al. Suppressing quantum errors by scaling a surface code logical qubit. Nature 614(7949):676–681 (2023).
[AAB+19] F. Arute, K. Arya, R. Babbush, et al. Quantum supremacy using a programmable superconducting processor.
Nature 574(7779):505 (2019), arXiv:1910.11333. [Aar18] S. Aaronson. Shadow tomography of quantum states. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, page 325, New York, NY, USA, 2018. Association for Computing Machinery. [ABO97] D. Aharonov and M. Ben-Or. Fault-tolerant quantum computation with constant error. In Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, STOC ’97, page 176–188, New York, NY, USA, 1997. Association for Computing Machinery, arXiv:quant-ph/9611025. [ADL01] C. Ahn, A. C. Doherty, and A. J. Landahl. Continuous quantum error correction via quantum feedback control, arXiv:quant-ph/0110111. URL http://xxx.lanl.gov/abs/ quant-ph/0110111. [AG04] S. Aaronson and D. Gottesman. Improved simulation of stabilizer circuits. Phys. Rev. A 70:052328 (2004). [AGP06] P. Aliferis, D. Gottesman, and J. Preskill. Quantum accuracy threshold for concatenated distance-3 codes. Quant. Inf. Comput. 6:97 (2006), arXiv:quant-ph/0504218. [AL04] P. Aliferis and D. W. Leung. Computation by measurements: A unifying picture. Phys. Rev. A 70:062314 (2004). [AM22] B. Anker and M. Marvian. Flag gadgets based on classical codes, arXiv:2212.10738. [AMO+23] Y. Akahoshi, K. Maruyama, H. Oshima, et al. Partially fault-tolerant quantum computing architecture with error-corrected clifford gates and space-time efficient analog rotations, arXiv:2303.13181. [ARL+20] C. K. Andersen, A. Remm, S. Lazar, et al. Repeated quantum error detection in a surface code. Nature Physics 16(8):875 (2020), arXiv:1912.09410. [BBB+21] S. Bartolucci, P. Birchall, H. Bombin, et al. Fusion-based quantum computation. arXiv preprint (2021). [BBB+23] B. Barber, K. M. Barnes, T. Bialas, et al. A real-time, scalable, fast and highly resource efficient decoder for a quantum computer, arXiv:2309.05558. 135 [BBC+95] A. Barenco, C. H. Bennett, R. Cleve, et al. Elementary gates for quantum computation. Phys. Rev. A 52:3457–3467 (1995). [BBD+09] H. 
J. Briegel, D. E. Browne, W. Dür, et al. Measurement-based quantum computation. Nature Physics 5(1):19–26 (2009). [BCG+23] S. Bravyi, A. W. Cross, J. M. Gambetta, et al. High-threshold and low-overhead fault-tolerant quantum memory, arXiv:2308.07915. [BCJ+23] F. Battistel, C. Chamberland, K. Johar, et al. Real-time decoding for fault-tolerant quantum computing: progress, challenges and outlook. Nano Futures 7(3):032003 (2023). [BCLK+22] K. Bharti, A. Cervera-Lierta, T. H. Kyaw, et al. Noisy intermediate-scale quantum algorithms. Rev. Mod. Phys. 94:015004 (2022). [BCMS19] C. D. Bruzewicz, J. Chiaverini, R. McConnell, and J. M. Sage. Trappedion quantum computing: Progress and challenges. Applied Physics Reviews 6(2):021314 (2019), arXiv:https://pubs.aip.org/aip/apr/articlepdf/doi/10.1063/1.5088164/19742554/021314_1_online.pdf. [BDM+21] H. Bombin, C. Dawson, R. V. Mishmash, et al. Logical blocks for fault-tolerant topological quantum computation. arXiv preprint :arXiv:2112.12160 (2021). [BEG+24] D. Bluvstein, S. J. Evered, A. A. Geim, et al. Logical quantum processor based on reconfigurable atom arrays. Nature 626(7997):58–65 (2024). [BGB+18] R. Babbush, C. Gidney, D. W. Berry, et al. Encoding electronic spectra in quantum circuits with linear t complexity. Phys. Rev. X 8:041015 (2018). [BGM+19] D. W. Berry, C. Gidney, M. Motta, et al. Qubitization of Arbitrary Basis Quantum Chemistry Leveraging Sparsity and Low Rank Factorization. Quantum 3:208 (2019). [BH12] S. Bravyi and J. Haah. Magic-state distillation with low overhead. Phys. Rev. A 86:052329 (2012). [BK05] S. Bravyi and A. Kitaev. Universal quantum computation with ideal Clifford gates and noisy ancillas. Phys. Rev. A 71:022316 (2005). [BKS21] M. E. Beverland, A. Kubica, and K. M. Svore. Cost of universality: A comparative study of the overhead of state distillation and code switching with color codes. PRX Quantum 2:020341 (2021). [BLS+22] D. Bluvstein, H. Levine, G. Semeghini, et al. 
A quantum processor based on coherent transport of entangled atom arrays. Nature 604(7906):451–456 (2022). [BMD06] H. Bombin and M. A. Martin-Delgado. Topological quantum distillation. Phys. Rev. Lett. 97:180501 (2006). [BMD07] H. Bombín and M. A. Martin-Delgado. Optimal resources for topological twodimensional stabilizer codes: Comparative study. Phys. Rev. A 76:012305 (2007), arXiv:quant-ph/0703272. 136 [BSS16] S. Bravyi, G. Smith, and J. A. Smolin. Trading classical and quantum computational resources. Phys. Rev. X 6:021043 (2016). [BVT21] F. Battistel, B. Varbanov, and B. Terhal. Hardware-efficient leakage-reduction scheme for quantum error correction with superconducting transmon qubits. PRX Quantum 2:030314 (2021). [BWP+17] J. Biamonte, P. Wittek, N. Pancotti, et al. Quantum machine learning. Nature 549(7671):195 (2017), arXiv:1611.09347. [BZH+15] T. A. Brun, Y.-C. Zheng, K.-C. Hsu, et al. Teleportation-based fault-tolerant quantum computation in multi-qubit large block codes, arXiv:1504.03913. [CAB+21] M. Cerezo, A. Arrasmith, R. Babbush, et al. Variational quantum algorithms. Nature Reviews Physics 3(9):625 (2021), arXiv:2012.09265. [Cam21] E. T. Campbell. Early fault-tolerant simulations of the hubbard model. Quantum Science and Technology 7(1):015007 (2021). [CB18] C. Chamberland and M. E. Beverland. Flag fault-tolerant error correction with arbitrary distance codes. Quantum 2:53 (2018), arXiv:1708.02246. [CBP23] J. Claes, J. E. Bourassa, and S. Puri. Tailored cluster states with high threshold under biased noise. npj Quantum Information 9(1):9 (2023). [CC19] C. Chamberland and A. W. Cross. Fault-tolerant magic state preparation with flag qubits. Quantum 3:143 (2019). [CC22a] C. Chamberland and E. T. Campbell. Circuit-level protocol and analysis for twist-based lattice surgery. Phys. Rev. Research 4:023090 (2022). [CC22b] C. Chamberland and E. T. Campbell. Universal quantum computing with twist-free and temporally encoded lattice surgery. 
PRX Quantum 3:010331 (2022). [CGS+22] C. Chamberland, L. Goncalves, P. Sivarajah, et al. Techniques for combining fast local decoders with global decoders under circuit-level noise. arXiv e-prints :arXiv:2208.01178 (2022). [CH18] E. T. Campbell and M. Howard. Magic state parity-checker with pre-distilled components. Quantum 2:56 (2018). [CJO17] C. Chamberland and T. Jochym-O’Connor. Error suppression via complementary gauge choices in Reed-Muller codes. Quantum Science and Technology 2(3):035008 (2017). [CJOL17] C. Chamberland, T. Jochym-O’Connor, and R. Laflamme. Overhead analysis of universal concatenated quantum codes. Phys. Rev. A 95:022313 (2017). [CKYZ20] C. Chamberland, A. Kubica, T. J. Yoder, and G. Zhu. Triangular color codes on trivalent graphs with flag qubits. New Journal of Physics 22(2):023019 (2020), arXiv:1911.00355. [CMN+18] A. M. Childs, D. Maslov, Y. Nam, et al. Toward the first quantum simulation with quantum speedup. Proceedings of the National Academy of Sciences 115(38):9456–9461 (2018), arXiv:https://www.pnas.org/doi/pdf/10.1073/pnas.1801723115. 137 [CN20] C. Chamberland and K. Noh. Very low overhead fault-tolerant magic state preparation using redundant ancilla encoding and flag qubits. npj Quantum Information 6:91 (2020). [CNAA+22] C. Chamberland, K. Noh, P. Arrangoiz-Arriola, et al. Building a fault-tolerant quantum computer using concatenated cat codes. PRX Quantum 3:010329 (2022). [CR18a] R. Chao and B. W. Reichardt. Fault-tolerant quantum computation with few qubits. npj Quantum Information 4(1):42 (2018), arXiv:1705.05365. [CR18b] R. Chao and B. W. Reichardt. Quantum error correction with only two extra qubits. Phys. Rev. Lett. 121:050502 (2018), arXiv:1705.02329. [CR20] R. Chao and B. W. Reichardt. Flag fault-tolerant error correction for any stabilizer code. PRX Quantum 1:010302 (2020), arXiv:1912.09549. [CS96] A. R. Calderbank and P. W. Shor. Good quantum error-correcting codes exist. Phys. Rev. 
A 54:1098–1106 (1996), arXiv:arXiv:quant-ph/9512032. [CSA+21] Z. Chen, K. J. Satzinger, J. Atalaya, et al. Exponential suppression of bit or phase errors with cyclic error correction. Nature 595(7867):383 (2021), arXiv:2102.06132. [CSM+23] K. S. Chou, T. Shemma, H. McCarrick, et al. Demonstrating a superconducting dual-rail cavity qubit with erasure-detected logical measurements, arXiv:2307.03169. [CTV17] E. T. Campbell, B. M. Terhal, and C. Vuillot. Roads towards fault-tolerant universal quantum computation. Nature 549(7671):172 (2017), arXiv:1612.07330. [CZY+20] C. Chamberland, G. Zhu, T. J. Yoder, et al. Topological and subsystem codes on low-degree graphs with flag qubits. Phys. Rev. X 10:011022 (2020), arXiv:1907.09528. [DA07] D. P. DiVincenzo and P. Aliferis. Effective fault-tolerant quantum computation with slow measurements. Phys. Rev. Lett. 98:020501 (2007), arXiv:quant-ph/0607047. [Deu89] D. Deutsch. Quantum computational networks. Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences 425(1868):73–90 (1989). [DKLP02] E. Dennis, A. Kitaev, A. Landahl, and J. Preskill. Topological quantum memory. J. Math. Phys. 43:4452–4505 (2002), arXiv:arXiv:quant-ph/0110143. [DL23] Z. Ding and L. Lin. Even shorter quantum circuit for phase estimation on early faulttolerant quantum computers with applications to ground-state energy estimation. PRX Quantum 4:020331 (2023). [DN21] N. Delfosse and N. H. Nickerson. Almost-linear time decoding algorithm for topological codes. Quantum 5:595 (2021), arXiv:1709.06218. [DR20] N. Delfosse and B. W. Reichardt. Short Shor-style syndrome sequences, arXiv:2008.05051. [DRS22] N. Delfosse, B. W. Reichardt, and K. M. Svore. Beyond single-shot fault-tolerant quantum error correction. IEEE Transactions on Information Theory 68(1):287–301 (2022), arXiv:2002.05180. 138 [EAG24] G. Escobar-Arrieta and M. Gutiérrez. 
Improved performance of the bacon-shor code with steane’s syndrome extraction method, arXiv:2403.01659. [EDN+21] L. Egan, D. M. Debroy, C. Noel, et al. Fault-tolerant control of an error-corrected qubit. Nature 598(7880):281 (2021), arXiv:2009.11482. [FG18] A. G. Fowler and C. Gidney. Low overhead quantum computation using lattice surgery. arXiv preprint arXiv:1808.06709 (2018). [FGG+01] E. Farhi, J. Goldstone, S. Gutmann, et al. A quantum adiabatic evolution algorithm applied to random instances of an NP-Complete problem. Science 292(5516):472–475 (2001). [FMMC12] A. G. Fowler, M. Mariantoni, J. M. Martinis, and A. N. Cleland. Surface codes: Towards practical large-scale quantum computation. Phys. Rev. A 86:032324 (2012), arXiv:1208.0928. [Gai13] F. Gaitan. Quantum Error Correction and Fault Tolerant Quantum Computing: . CRC Press, Boca Raton, 2013. [Gar86] M. Gardner. The Binary Gray Code. In Knotted Doughnuts and other Mathematical Entertainments, pages 22–39. W. H. Freeman and Company, New York, 1986. [GC99] D. Gottesman and I. L. Chuang. Demonstrating the viability of universal quantum computation using teleportation and single-qubit operations. 402(6760):390–393 (1999). [GE21] C. Gidney and M. Ekerå. How to factor 2048 bit RSA integers in 8 hours using 20 million noisy qubits. Quantum 5:433 (2021), arXiv:1905.09749. [GFP+20] A. Grimm, N. E. Frattini, S. Puri, et al. Stabilization and operation of a kerr-cat qubit. Nature 584(7820):205–209 (2020). [GHS+23] J. Gavriel, D. Herr, A. Shaw, et al. Transversal injection for direct encoding of ancilla states for non-clifford gates using stabilizer codes. Phys. Rev. Res. 5:033019 (2023). [GHZ89] D. M. Greenberger, M. A. Horne, and A. Zeilinger. Bell’s Theorem, Quantum Theory and Conceptions of the Universe: Going Beyond Bell’s Theorem, pages 69–72. Springer Netherlands, Dordrecht, 1989, arXiv:0712.0921. [Gid22] C. Gidney. Stability experiments: The overlooked dual of memory experiments. Quantum 6:786 (2022). 
[GKP00] D. Gottesman, A. Kitaev, and J. Preskill. Encoding a qubit in an oscillator. (2000), arXiv:arXiv:quant-ph/0008040. [GNBJ23] C. Gidney, M. Newman, P. Brooks, and C. Jones. Yoked surface codes, arXiv:2312.04522. [GNFB21] C. Gidney, M. Newman, A. Fowler, and M. Broughton. A Fault-Tolerant Honeycomb Memory. Quantum 5:605 (2021), arXiv:2108.10457. [Got97] D. Gottesman. Stabilizer codes and quantum error correction. PhD thesis, California Institute of Technology, 1997, arXiv:quant-ph/9705052. 139 [Got00] D. Gottesman. Fault-tolerant quantum computation with local gates. J. Mod. Opt. 47:333–345 (2000), arXiv:arXiv:quant-ph/9903099. [Got16] H. Goto. Minimizing resource overheads for fault-tolerant preparation of encoded states of the steane code. Scientific Reports 6(1):19578 (2016). [Gra53] F. Gray. Pulse code communication. U.S. Patent 2632058A. Issued Mar. 17, 1953. [Gra07] M. Grassl. Bounds on the minimum distance of linear codes and quantum codes, online available at http://www.codetables.de. [GRLR+23] E. Gouzien, D. Ruiz, F.-M. Le Régent, et al. Performance analysis of a repetition cat code architecture: Computing 256-bit elliptic curve logarithm in 9 hours with 126 133 cat qubits. Phys. Rev. Lett. 131:040602 (2023). [GS21a] E. Gouzien and N. Sangouard. Factoring 2048-bit rsa integers in 177 days with 13 436 qubits and a multimode memory. Phys. Rev. Lett. 127:140503 (2021). [GS21b] M. G. Gowda and P. K. Sarvepalli. Color codes with twists: Construction and universalgate-set implementation. Phys. Rev. A 104:012603 (2021). [GT07] N. Gisin and R. Thew. Quantum communication. Nature Photonics 1(3):165 (2007), arXiv:quant-ph/0703255. [HB21a] O. Higgott and N. P. Breuckmann. Subsystem codes with high thresholds by gauge fixing and reduced qubit overhead. Phys. Rev. X 11:031039 (2021). [HB21b] S. Huang and K. R. Brown. Between Shor and Steane: A unifying construction for measuring error syndromes. Phys. Rev. Lett. 127:090505 (2021), arXiv:2012.15403. [HBC23] S. 
Huang, K. R. Brown, and M. Cetina. Comparing shor and steane error correction using the bacon-shor code, arXiv:2312.10851. [HBK+23] O. Higgott, T. C. Bohdanowicz, A. Kubica, et al. Improved decoding of circuit noise and fragile boundaries of tailored surface codes. Phys. Rev. X 13:031007 (2023). [HFDM12] C. Horsman, A. G. Fowler, S. Devitt, and R. V. Meter. Surface code quantum computing by lattice surgery. New Journal of Physics 14(12):123011 (2012). [HGFW06] L. C. L. Hollenberg, A. D. Greentree, A. G. Fowler, and C. J. Wellard. Two-dimensional architectures for donor-based quantum computing. Phys. Rev. B 74:045311 (2006). [HH18] J. Haah and M. B. Hastings. Codes and protocols for distilling T, controlled-S, and Toffoli gates. Quantum 2:71 (2018), arXiv:1709.02832. [HH21] M. B. Hastings and J. Haah. Dynamically Generated Logical Qubits. Quantum 5:564 (2021). [HHPW17] J. Haah, M. B. Hastings, D. Poulin, and D. Wecker. Magic state distillation with low space overhead and optimal asymptotic input count. Quantum 1:31 (2017). [HJOY23] S. Huang, T. Jochym-O’Connor, and T. J. Yoder. Homomorphic logical measurements. PRX Quantum 4:030301 (2023). 140 [HPDN19] D. Herr, A. Paler, S. J. Devitt, and F. Nori. Time versus hardware: Reducing qubit counts with a (surface code) data bus. arXiv preprint (2019), arXiv:arXiv:1902.08117. [HWFZ20] H.-L. Huang, D. Wu, D. Fan, and X. Zhu. Superconducting quantum computing: a review. Science China Information Sciences 63(8):180501 (2020). [JAR20] Y. Jing, D. Alsina, and M. Razavi. Quantum key distribution over quantum repeaters with encoding: Using error detection as an effective postselection tool. Phys. Rev. Applied 14:064037 (2020), arXiv:2007.06376. [JCL+10] M.-H. Jing, Y. Chang, C.-D. Lee, et al. A result on Zetterberg codes. IEEE Communications Letters 14(7):662–663 (2010). [Jon13] C. Jones. Low-overhead constructions for the fault-tolerant Toffoli gate. Phys. Rev. A 87:022328 (2013). [JVMF+12] N. C. Jones, R. Van Meter, A. G. 
Fowler, et al. Layered architecture for quantum computing. Phys. Rev. X 2:031007 (2012). [KCB23] M. Kang, W. C. Campbell, and K. R. Brown. Quantum error correction with metastable states of trapped ions using erasure conversion. PRX Quantum 4:020358 (2023). [KEA+23] Y. Kim, A. Eddins, S. Anand, et al. Evidence for the utility of quantum computing before fault tolerance. Nature 618(7965):500–505 (2023). [KHV+23] A. Kubica, A. Haim, Y. Vaknin, et al. Erasure qubits: Overcoming the T1 limit in superconducting circuits. Phys. Rev. X 13:041022 (2023). [Kit97] A. Y. Kitaev. Quantum Communication, Computing, and Measurement: Quantum Error Correction with Imperfect Gates, pages 181–188. Springer US, Boston, MA, 1997. [KKJ22] R. Kshirsagar, A. Katabarwa, and P. D. Johnson. On proving the robustness of algorithms for early fault-tolerant quantum computers, arXiv:2209.11322. [KLP+22] I. H. Kim, Y.-H. Liu, S. Pallister, et al. Fault-tolerant resource estimate for quantum chemical simulations: Case study on Li-ion battery electrolyte molecules. Phys. Rev. Research 4:023019 (2022). [KLR+22] S. Krinner, N. Lacroix, A. Remm, et al. Realizing repeated quantum error correction in a distance-three surface code. Nature 605(7911):669–674 (2022). [Kni05a] E. Knill. Scalable quantum computing in the presence of large detected-error rates. Phys. Rev. A 71:042322 (2005), arXiv:quant-ph/0312190. [Kni05b] E. Knill. Quantum computing with realistically noisy devices. Nature 434:39 (2005), arXiv:quant-ph/0410199. [LB13] D. A. Lidar and T. A. Brun. Quantum Error Correction: . Cambridge University Press, Cambridge, 2013. [LBG+21] J. Lee, D. W. Berry, C. Gidney, et al. Even more efficient quantum computations of chemistry through tensor hypercontraction. PRX Quantum 2:030305 (2021), arXiv:2011.03494. 141 [LC22] L. Lao and B. Criger. Magic state injection on the rotated surface code. 
In Proceedings of the 19th ACM International Conference on Computing Frontiers, CF ’22, page 113–120, New York, NY, USA, 2022. Association for Computing Machinery. [LGL+17] N. M. Linke, M. Gutierrez, K. A. Landsman, et al. Fault-tolerant quantum error detection. Science Advances 3(10):1701074 (2017), arXiv:1611.06946. [LHH+23] H. Levine, A. Haim, J. S. C. Hung, et al. Demonstrating a long-coherence dual-rail erasure qubit using tunable transmons, arXiv:2307.08737. [Lit19a] D. Litinski. A game of surface codes: Large-scale quantum computing with lattice surgery. Quantum 3:128 (2019), arXiv:1808.02892. [Lit19b] D. Litinski. Magic state distillation: Not as costly as you think. Quantum 3:205 (2019). [Lit23] D. Litinski. How to compute a 256-bit elliptic curve private key with only 50 million toffoli gates, arXiv:2306.08585. [LO18] D. Litinski and F. v. Oppen. Lattice surgery with a twist: Simplifying Clifford gates of surface codes. Quantum 2:62 (2018). [LP23] H.-K. Lin and L. P. Pryadko. Quantum two-block group algebra codes, arXiv:2306.16400. [LPSB14] C.-Y. Lai, G. Paz, M. Suchara, and T. A. Brun. Performance and error analysis of Knill’s postselection scheme in a two-dimensional architecture. Quantum Info. Comput. 14(9 & 10):807 (2014), arXiv:1305.5657. [LRA14] A. J. Landahl and C. Ryan-Anderson. Quantum computing by color-code lattice surgery. arXiv preprint (2014), arXiv:arXiv:1407.5103. [LT22] L. Lin and Y. Tong. Heisenberg-limited ground-state energy estimation for early fault-tolerant quantum computers. PRX Quantum 3:010318 (2022). [LWDZ23] N. Liyanage, Y. Wu, A. Deters, and L. Zhong. Scalable quantum error correction for surface codes using fpga, arXiv:2301.08419. [LZB17] C.-Y. Lai, Y.-C. Zheng, and T. A. Brun. Fault-tolerant preparation of stabilizer states for quantum Calderbank-Shor-Steane codes by classical error-correcting codes. Phys. Rev. A 95:032339 (2017), arXiv:1605.05647. [MEK13] A. M. Meier, B. Eastin, and E. Knill. 
Magic-state distillation with the four-qubit code. Quantum Info. Comput. 13:195 (2013), arXiv:1204.4221. [MFA+22] M. McEwen, L. Faoro, K. Arya, et al. Resolving catastrophic error bursts from cosmic rays in large arrays of superconducting qubits. Nature Physics 18(1):107–111 (2022). [MGM20] O. D. Matteo, V. Gheorghiu, and M. Mosca. Fault-tolerant resource estimation of quantum random-access memories. IEEE Transactions on Quantum Engineering 1:1–13 (2020). [MJR+20] J. R. McClean, Z. Jiang, N. C. Rubin, et al. Decoding quantum errors with subspace expansions. Nature Communications 11(1):636 (2020), arXiv:1903.05786. 142 [MLP+23] S. Ma, G. Liu, P. Peng, et al. High-fidelity gates and mid-circuit erasure conversion in an atomic qubit. Nature 622(7982):279–284 (2023). [MVM+22] J. F. Marques, B. M. Varbanov, M. S. Moreira, et al. Logical-qubit operations in an error-detecting surface code. Nature Physics 18(1):80–86 (2022). [MWHH21] G. J. Mooney, G. A. L. White, C. D. Hill, and L. C. L. Hollenberg. Whole-device entanglement in a 65-qubit superconducting quantum computer. Advanced Quantum Technologies 4(10):2100061 (2021), arXiv:2102.11521. [NC20] K. Noh and C. Chamberland. Fault-tolerant bosonic quantum error correction with the surface–gottesman-kitaev-preskill code. Phys. Rev. A 101:012316 (2020). [NCBa22] K. Noh, C. Chamberland, and F. G. Brandão. Low-overhead fault-tolerant quantum error correction with the surface-gkp code. PRX Quantum 3:010315 (2022). [NMM+14] D. Nigg, M. Müller, E. A. Martinez, et al. Quantum computations on a topologically encoded qubit. Science 345(6194):302 (2014), arXiv:1403.5426. [PBP+23] L. Postler, F. Butt, I. Pogorelov, et al. Demonstration of fault-tolerant Steane quantum error correction, arXiv:2312.09745. [PCL+12] J.-W. Pan, Z.-B. Chen, C.-Y. Lu, et al. Multiphoton entanglement and interferometry. Rev. Mod. Phys. 84:777 (2012), arXiv:0805.2853. [PFM+21] I. Pogorelov, T. Feldker, C. D. Marciniak, et al. 
Compact ion-trap quantum computing demonstrator. PRX Quantum 2:020343 (2021), arXiv:2101.11390. [PHP+22] L. Postler, S. Heuβen, I. Pogorelov, et al. Demonstration of fault-tolerant universal quantum gate operations. Nature 605(7911):675–680 (2022). [PK22a] P. Panteleev and G. Kalachev. Asymptotically good quantum and locally testable classical ldpc codes, arXiv:2111.03654. [PK22b] P. Panteleev and G. Kalachev. Quantum ldpc codes with almost linear minimum distance. IEEE Transactions on Information Theory 68(1):213–229 (2022). [PMS+14] A. Peruzzo, J. McClean, P. Shadbolt, et al. A variational eigenvalue solver on a photonic quantum processor. Nature Communications 5(1):4213 (2014), arXiv:1304.3061. [PR12] A. Paetznick and B. Reichardt. Fault-tolerant ancilla preparation and noise threshold lower bounds for the 23-qubit Golay code. Quantum Inf. Comput. 12:1034 (2012), arXiv:1106.2190. [PR13] A. Paetznick and B. W. Reichardt. Universal fault-tolerant quantum computation with only transversal gates and error correction. Phys. Rev. Lett. 111:090505 (2013), arXiv:1304.3709. [PR23] P. Prabhu and B. W. Reichardt. Fault-tolerant syndrome extraction and cat state preparation with fewer qubits. Quantum 7:1154 (2023). [PSJG+20] S. Puri, L. St-Jean, J. A. Gross, et al. Bias-preserving gates with stabilized cat qubits. Science Advances 6(34):eaay5901 (2020), arXiv:https://www.science.org/doi/pdf/10.1126/sciadv.aay5901. 143 [RABA+22] C. Ryan-Anderson, N. C. Brown, M. S. Allman, et al. Implementing fault-tolerant entangling gates on the five-qubit code and the color code, arXiv:2208.01863. [RABL+21] C. Ryan-Anderson, J. G. Bohnet, K. Lee, et al. Realization of real-time fault-tolerant quantum error correction. Phys. Rev. X 11:041058 (2021), arXiv:2107.07505. [RBK+23] N. C. Rubin, D. W. Berry, A. Kononov, et al. Quantum computation of stopping power for inertial fusion target design, arXiv:2308.12352. [Rei06] B. W. Reichardt. 
Fault-tolerance threshold for a distance-three quantum code. In M. Bugliesi, B. Preneel, V. Sassone, and I. Wegener, editors, Automata, Languages and Programming, pages 50–61, Berlin, Heidelberg, 2006. Springer Berlin Heidelberg, arXiv:quant-ph/0509203. [Rei20] B. W. Reichardt. Fault-tolerant quantum error correction for steane’s seven-qubit color code with few or no extra qubits. Quantum Science and Technology 6(1):015007 (2020), arXiv:1804.06995. [RGD+20] D. Ristè, L. C. G. Govia, B. Donovan, et al. Real-time processing of stabilizer measurements in a bit-flip code. npj Quantum Information 6(1):71 (2020), arXiv:1911.12280. [RGL+24] D. Ruiz, J. Guillaud, A. Leverrier, et al. LDPC-cat codes for low-overhead quantum computing in 2D, arXiv:2401.09541. [SBB+22a] L. Skoric, D. E. Browne, K. M. Barnes, et al. Parallel window decoding enables scalable fault tolerant quantum computation. arXiv e-prints :arXiv:2209.08552 (2022). [SBB22b] S. C. Smith, B. J. Brown, and S. D. Bartlett. A local pre-decoder to reduce the bandwidth and latency of quantum error correction. arXiv e-prints :arXiv:2208.04660 (2022). [SBW+21] Y. Su, D. W. Berry, N. Wiebe, et al. Fault-tolerant quantum simulations of chemistry in first quantization. PRX Quantum 2:040332 (2021). [SC22] N. Shutty and C. Chamberland. Decoding merged color-surface codes and finding fault-tolerant Clifford circuits using solvers for satisfiability modulo theories. Phys. Rev. Applied 18:014072 (2022). [SDBP22] S. Singh, A. S. Darmawan, B. J. Brown, and S. Puri. High-fidelity magic-state preparation with a biased-noise architecture. Phys. Rev. A 105:052410 (2022). [SDT07] K. M. Svore, D. P. DiVincenzo, and B. M. Terhal. Noise threshold for a fault-tolerant twodimensional lattice architecture. Quantum Inf. Comput. 7(4):297 (2007), arXiv:quantph/0604090. [SER+23] V. V. Sivak, A. Eickbusch, B. Royer, et al. Real-time quantum error correction beyond break-even. Nature 616(7955):50–55 (2023). [Sho95] P. W. Shor. 
Scheme for reducing decoherence in quantum computer memory. Phys. Rev. A 52:R2493 (1995). [Sho96] P. W. Shor. Fault-tolerant quantum computation. In Proceedings of the 37th Annual Symposium on Foundations of Computer Science, FOCS ’96, page 56, USA, 1996. IEEE Computer Society, arXiv:quant-ph/9605011. 144 [SJC+23] K. Sahay, J. Jin, J. Claes, et al. High-threshold codes for neutral-atom qubits with biased erasure errors. Phys. Rev. X 13:041013 (2023). [SM21] C. Sekga and M. Mafu. Security of quantum-key-distribution protocol by using the post-selection technique. Physics Open 7:100075 (2021). [SR09] F. M. Spedalieri and V. P. Roychowdhury. Latency in local, two-dimensional, faulttolerant quantum computing. Quantum Info. Comput. 9(7):666 (2009), arXiv:0805.4213. [SST+23] P. Scholl, A. L. Shaw, R. B.-S. Tsai, et al. Erasure conversion in a high-fidelity rydberg quantum simulator. Nature 622(7982):273–278 (2023). [STD05] K. M. Svore, B. M. Terhal, and D. P. DiVincenzo. Local fault-tolerant quantum computation. Phys. Rev. A 72:022317 (2005), arXiv:quant-ph/0410047. [Ste96a] A. M. Steane. Error correcting codes in quantum theory. Phys. Rev. Lett. 77:793–797 (1996). [Ste96b] A. M. Steane. Simple quantum error-correcting codes. Phys. Rev. A 54:4741 (1996), arXiv:arXiv:quant-ph/9605021. [Ste97] A. M. Steane. Active stabilization, quantum computation, and quantum state synthesis. Phys. Rev. Lett. 78(11):2252 (1997), arXiv:quant-ph/9611027. [Ste03] A. M. Steane. Overhead and noise threshold of fault-tolerant quantum error correction. Phys. Rev. A 68:042322 (2003), arXiv:arXiv:quant-ph/0207119. [Ste14] A. M. Stephens. Efficient fault-tolerant decoding of topological color codes, arXiv:1402.3037. [SWM+24] T. L. Scholten, C. J. Williams, D. Moody, et al. Assessing the benefits and risks of quantum computers, arXiv:2401.16317. [SYK+23] N. Sundaresan, T. J. Yoder, Y. Kim, et al. 
Demonstrating multi-round subsystem quantum error correction using matching and maximum likelihood decoders. Nature Communications 14(1):2852 (2023). [TC21] B. Tan and J. Cong. Optimality study of existing quantum computing layout synthesis tools. IEEE Transactions on Computers 70(9):1363 (2021), arXiv:2002.09783. [TCC+17] M. Takita, A. W. Cross, A. D. Córcoles, et al. Experimental demonstration of faulttolerant state preparation with superconducting qubits. Phys. Rev. Lett. 119:180501 (2017), arXiv:1705.09259. [Ter15] B. M. Terhal. Quantum error correction for quantum memories. Rev. Mod. Phys. 87:307 (2015). [TS14] Y. Tomita and K. M. Svore. Low-distance surface codes under realistic quantum noise. Phys. Rev. A 90:062320 (2014), arXiv:1404.3747. [TZC+22] X. Tan, F. Zhang, R. Chao, et al. Scalable surface code decoders with parallelization in time. arXiv e-prints :arXiv:2209.09219 (2022). 145 [TZC+23] X. Tan, F. Zhang, R. Chao, et al. Scalable surface-code decoders with parallelization in time. PRX Quantum 4:040344 (2023). [VBT+20] B. M. Varbanov, F. Battistel, B. M. Tarasinski, et al. Leakage detection for a transmonbased surface code. npj Quantum Information 6(1):102 (2020). [VKL99] L. Viola, E. Knill, and S. Lloyd. Dynamical decoupling of open quantum systems. Phys. Rev. Lett. 82:2417 (1999), arXiv:quant-ph/9809071. [VLC+19] C. Vuillot, L. Lao, B. Criger, et al. Code deformation and lattice surgery are gauge fixing. New Journal of Physics 21(3):033028 (2019). [WBC22] K. Wan, M. Berta, and E. T. Campbell. Randomized quantum algorithm for statistical phase estimation. Phys. Rev. Lett. 129:030503 (2022). [WFRJ23] G. Wang, D. S. França, G. Rendon, and P. D. Johnson. Faster ground state energy estimation on early fault-tolerant quantum computers via rejection sampling, arXiv:2304.09827. [WHT15] D. Wecker, M. B. Hastings, and M. Troyer. Progress towards practical quantum variational algorithms. Phys. Rev. A 92:042303 (2015), arXiv:1507.08969. [WKPT22] Y. Wu, S. 
Kolkowitz, S. Puri, and J. D. Thompson. Erasure conversion for fault-tolerant quantum computing in alkaline earth rydberg atom arrays. Nature Communications 13(1):4657 (2022). [Woo20] J. R. Wootton. Benchmarking near-term devices with quantum error correction. Quantum Science and Technology 5(4):044004 (2020), arXiv:2004.11037. [WSJ22] G. Wang, S. Sim, and P. D. Johnson. State Preparation Boosters for Early Fault-Tolerant Quantum Computation. Quantum 6:829 (2022). [YK17] T. J. Yoder and I. H. Kim. The surface code with a twist. Quantum 1:2 (2017), arXiv:1612.04795. [YTC16] T. J. Yoder, R. Takagi, and I. L. Chuang. Universal fault-tolerant gates on concatenated stabilizer codes. Phys. Rev. X 6:031039 (2016). [ZLB18] Y.-C. Zheng, C.-Y. Lai, and T. A. Brun. Efficient preparation of large-block-code ancilla states for fault-tolerant quantum computation. Physical Review A 97(3):032331 (2018), arXiv:1710.00389. [ZLBK20] Y.-C. Zheng, C.-Y. Lai, T. A. Brun, and L.-C. Kwek. Constant depth fault-tolerant Clifford circuits for multi-qubit large block codes. Quantum Science and Technology 5(4):045007 (2020). [ZWJ22] R. Zhang, G. Wang, and P. Johnson. Computing Ground State Properties with Early Fault-Tolerant Quantum Computers. Quantum 6:761 (2022).

Appendices

.1 Post-selective distance-three fault-tolerant cat state preparation

Shor’s method for fault-tolerant stabilizer measurement relies on the fault-tolerant preparation of a cat state by postselection. In Fig. 2.2a, the cat state is prepared fault-tolerantly to distance two; it detects one fault. For postselected distance-three fault tolerance, any one or two faults in the circuit must result in an error of weight at most one or two respectively, else the state must be rejected. In Fig. 9 we show how to prepare a weight-12 cat state fault-tolerantly to distance three, detecting up to two faults.

Theorem 6.
One ancilla qubit, measured $m \geq 2$ times, can be used to prepare a cat state on $w$ qubits fault-tolerantly to distance three, detecting up to two faults, for
$$w \leq 3 \cdot 2^{m-1}. \tag{12}$$

Proof. We explain the proof using the circuit in Fig. 9. The circuit passes with acceptable weight-one or weight-two errors when all the flag qubits are measured as 0. If one $X$ fault occurs on the $|+\rangle$ qubit during the preparation of the cat state, it may spread to a data error of weight $> 1$. However, the red flag qubit is triggered and the fault is detected. If two $X$ faults occur on the $|+\rangle$ qubit, the red flag qubit may not catch it, yet a data error of weight $> 2$ can exist on the cat state. Since this scenario only arises from two faults, it suffices to check the parities between every third qubit of the cat state, as an error on two consecutive qubits is acceptable. Higher-weight errors, such as the weight-seven error $X_2 X_3 \cdots X_8$ in Fig. 9, may not be detected by parity checks that have an even number of erroneous qubits. However, these errors are always caught by other parity checks.

To check for errors of weight greater than two, we perform parity checks similar to that in Theorem 3. Instead of the flag sequence from Lemma 2, the Gray code from Lemma 1 is used. Now the parities are computed between qubits $3j - 1$ for $j \in \{1, 2, \dots, 2^{m-1}\}$. The first and the last qubits are not checked for errors and so with $m$ flags, the maximum cat state weight achieved is $3 \cdot 2^{m-1}$.

Figure 9: Two-error-detecting fault-tolerant circuit for the preparation of a weight-12 cat state. The state is only accepted when all flag qubits are measured as 0. Note that with fast reset, only one ancilla qubit is required.

Figure 10: (a) Logarithmic-depth preparation of an eight-qubit cat state shows there are six possible locations for X faults that create errors of weight at least two. Parity checks need to be chosen to find corrections that leave the cat state with error of weight less than two. (b) The circuit on the left can be represented as a graph, where a CNOT gate is represented by the splitting of an edge.

.2 Low-depth fault-tolerant cat state preparation

So far, we have focused on fault-tolerant preparation circuits with depth linear in the cat state weight. In this section, we detail how to prepare cat states fault-tolerantly in logarithmic depth.

In Fig. 10a an eight-qubit cat state is prepared in three rounds of CNOT gates. There are six locations (marked in red) where an $X$ fault may cause an error of weight at least two. These faults result in data errors with a different structure from the linear-depth protocols of Sec. 3.1, hence different parity checks are required. It is simpler to determine these parity checks if the circuit is viewed as a binary tree, as in Fig. 10b. Here time flows down and every CNOT onto a fresh $|0\rangle$ qubit is denoted by the splitting of an edge. An $X$ fault at a marked location results in an $X$ error on all the leaf nodes directly under the location. Note that a fault at the root cannot cause a bad error.

We use only two-qubit parity checks; however, larger parity checks may be used at the expense of increased depth. If a parity check checks qubit $x$, it provides information on whether a fault occurred anywhere in the lineage: $l(x) = \{x, \mathrm{parent}(x), \mathrm{parent}(\mathrm{parent}(x)), \dots, \mathrm{root}\}$. Therefore, if a parity check $(x, y)$ is triggered, a fault at one of the locations $l(x) \cup l(y) \cup \{\mathrm{SPAM}\}$ has occurred, where $\{\mathrm{SPAM}\}$ is the set of faults during state preparation or measurement of the parity-check qubit.
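As an illustration of the lineage argument, the following sketch lists the fault locations that a triggered two-qubit parity check points to. It assumes a heap-style labeling of the binary tree (the root is node 1, node v has children 2v and 2v + 1, and cat state qubit q sits at leaf w + q - 1). This labeling and the function names are hypothetical conveniences, not notation taken from the thesis.

```python
def lineage(node):
    """Return [node, parent(node), ..., root] for a heap-labeled binary tree."""
    path = []
    while node >= 1:
        path.append(node)
        node //= 2  # parent of node v is floor(v / 2); the root is node 1
    return path


def candidate_faults(x, y, w):
    """Locations consistent with a triggered parity check on cat qubits (x, y).

    Qubits are 1-indexed; qubit q is assumed to sit at leaf w + q - 1.
    The string "SPAM" stands for faults on the parity-check qubit itself.
    """
    leaf_x, leaf_y = w + x - 1, w + y - 1
    return set(lineage(leaf_x)) | set(lineage(leaf_y)) | {"SPAM"}


print(sorted(n for n in candidate_faults(1, 5, 8) if n != "SPAM"))
```

Under this labeling, a check on qubits 1 and 5 of an eight-qubit cat state implicates the two leaf-to-root paths plus the SPAM location, matching the union $l(x) \cup l(y) \cup \{\mathrm{SPAM}\}$.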
Using the parity checks $(1, 5), (2, 7), (3, 6), (4, 8)$, it is possible to separate the five distinct errors of weight at least two (since the error due to $a_1$ and $a_2$ is the same up to the cat state's $X^{\otimes w}$ stabilizer) into distinct triggered flag patterns:

Fault | (1, 5) | (2, 7) | (3, 6) | (4, 8)
--- | --- | --- | --- | ---
a1, a2 | • | • | • | •
b1 | • | • | ◦ | ◦
b2 | ◦ | ◦ | • | •
b3 | • | ◦ | • | ◦
b4 | ◦ | • | ◦ | •

Note that a fault at any of the above locations requires a multi-qubit data correction. We ensure that each of them is detected by at least two parity checks, as one faulty parity check must not induce corrections of weight greater than one.

Figure 11: (a) Distance-4 fault-tolerant circuit for measuring a weight-8 stabilizer on a square lattice layout, as arranged in (b).

Theorem 7. Using parallelized circuits, a $w$-qubit cat state can be prepared fault-tolerantly to distance three using $w/2$ parity checks, where $w/2 = 2^j$, $j \in \mathbb{N}$. The depth of the circuit is $2 + \log_2 w$.

Proof. For parity check $i \in \{1, 2, \dots, w/4\}$, the cat state qubits checked are $(i, w/2 + 2i - 1)$. For the remaining parity checks $i \in \{w/4 + 1, w/4 + 2, \dots, w/2\}$, the qubits checked are $(i, 2i)$. As in Fig. 10b, faults at the $a$ level (depth-one) locations trigger all the parity checks, since each parity check is executed on one cat state qubit from the first half, and one from the second. The correction $X^{\otimes w/2}$ on either half of the qubits works for both faults as $(X^{\otimes w/2} \otimes \mathbb{1}^{\otimes w/2})(\mathbb{1}^{\otimes w/2} \otimes X^{\otimes w/2}) = X^{\otimes w}$ is a stabilizer of the cat state. Faults at the $b$ level (depth-two) trigger distinct sets of $w/2^2$ parity checks, where the correction is on all the leaf nodes under the uniquely identified fault. The same holds for faults at depth-$k$, which trigger distinct sets of $w/2^k$ parity checks.
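The parity-check assignment used in this proof can be checked numerically for small $w$. The sketch below is an illustration rather than the thesis's own tooling: it assumes the leaves of the tree are numbered 1 to $w$ from left to right, so that a fault at depth $k$ flips an aligned block of $w/2^k$ consecutive qubits, and all function names are hypothetical.

```python
def parity_checks(w):
    """Two-qubit checks (1-indexed qubit pairs) as assigned in the proof of Theorem 7."""
    if w < 4 or w & (w - 1):
        raise ValueError("w must be a power of two, at least 4")
    first = [(i, w // 2 + 2 * i - 1) for i in range(1, w // 4 + 1)]
    second = [(i, 2 * i) for i in range(w // 4 + 1, w // 2 + 1)]
    return first + second


def fault_supports(w):
    """X-error supports of weight >= 2 caused by a single fault below the root.

    Assumes leaves are numbered 1..w left to right, so a fault at depth k
    flips an aligned block of w / 2**k consecutive qubits.
    """
    supports = []
    size = w // 2
    while size >= 2:
        supports += [frozenset(range(s, s + size)) for s in range(1, w + 1, size)]
        size //= 2
    return supports


def flag_pattern(support, checks):
    """A check (x, y) fires when exactly one of its two qubits carries an X error."""
    return tuple(int((x in support) != (y in support)) for x, y in checks)


checks = parity_checks(8)
for support in fault_supports(8):
    print(sorted(support), flag_pattern(support, checks))
```

For $w = 8$ the printed patterns match the table of triggered flags given earlier, with the two depth-one faults sharing the all-ones pattern. Every pattern has weight at least two, so it cannot be confused with a single faulty parity check.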
One faulty parity check leads to a weight-one flag pattern, for which we do not apply corrections, as the error is restricted to at most one cat state qubit.

.3 Corrections and rejections for weight-eight stabilizer measurements

The corrections and rejections for the fault-tolerant weight-eight stabilizer measurement circuit in Fig. 5.6a are shown below.

Raised flags | Correction
--- | ---
{13} | {6}
{12, 13} | {6}
{9, 12} | {1, 5}
{12, 13, 14} | {6, 7, 8}
{9, 11, 12, 14} | {1, 4, 5}

Rejections: {9, 11}, {9, 13}, {9, 14}, {11, 12}, {12, 14}, {13, 14}, {9, 11, 12}, {9, 11, 14}, {9, 12, 13}, {9, 12, 14}, {9, 13, 14}, {11, 12, 13}, {11, 12, 14}, {11, 13, 14}, {9, 11, 12, 13}, {9, 11, 13, 14}, {9, 12, 13, 14}, {11, 12, 13, 14}, {9, 11, 12, 13, 14}

For the layout described in Fig. 5.12, the weight-eight stabilizer is measured fault-tolerantly with the circuit in Fig. 11, with associated corrections and rejections tabulated below.

Raised flags | Correction
--- | ---
{10} | {1}
{11} | {2}
{12} | {6}
{14} | {8}
{15} | {7}
{16} | {8}
{10, 11} | {1, 2}
{10, 12} | {1, 6}
{15, 16} | {7, 8}
{10, 11, 15, 16} | {3}

Rejections: {9, 11}, {9, 12}, {9, 15}, {9, 16}, {9, 17}, {10, 14}, {10, 15}, {10, 16}, {10, 17}, {11, 12}, {11, 15}, {11, 16}, {11, 17}, {12, 15}, {12, 16}, {12, 17}, {14, 15}, {15, 17}, {9, 10, 11}, {9, 10, 12}, {9, 10, 15}, {9, 10, 16}, {9, 10, 17}, {9, 14, 16}, {9, 15, 16}, {10, 11, 12}, {10, 11, 14}, {10, 11, 15}, {10, 11, 16}, {10, 11, 17}, {10, 12, 14}, {10, 12, 15}, {10, 12, 16}, {10, 12, 17}, {10, 14, 16}, {10, 15, 16}, {10, 16, 17}, {11, 12, 15}, {11, 12, 16}, {11, 14, 16}, {11, 15, 16}, {12, 14, 16}, {12, 15, 16}, {14, 15, 16}, {14, 16, 17}, {15, 16, 17}, {9, 10, 14, 16}, {9, 10, 15, 16}, {9, 11, 15, 16}, {10, 11, 12, 14}, {10, 11, 14, 15}, {10, 11, 14, 16}, {10, 11, 15, 17}, {10, 11, 16, 17}, {10, 12, 14, 15}, {10, 12, 14, 16}, {10, 12, 15, 16}, {11, 12, 14, 16}, {12, 14, 15, 16}, {9, 10, 11, 15, 16}, {9, 11, 12, 15, 16}, {10, 11, 12, 14, 15}, {10, 11, 12, 14, 16}, {10, 11, 12, 15, 16}, {10, 11, 14, 15, 16}, {10, 11, 15, 16, 17}, {11, 12, 14, 15, 16}, {11, 12, 15, 16, 17}, {9, 10, 11, 12, 15, 16}, {10, 11, 12, 14, 15, 16}

.4 Malignant set counting

An $[n, k, d]$ binary classical error-correcting code encodes $k$ logical bits of information into $n \geq k$ physical bits, with distance $d$. During error detection/correction, all errors of weight less than $d$ are detected. However, some of the weight-$d$ errors are not detected. These errors are called malignant sets as they can cause erroneous flips of the logical bits.

The task of computing how many of the $\binom{n}{d}$ weight-$d$ bit strings are malignant is computationally hard. The deterministic method is to evaluate the weights of all $\binom{n}{d}$ bit strings. But this takes time that is exponential in the problem size. For larger codes, we searched for faster methods to estimate the number of malignant fault sets. The first was a Monte Carlo simulation. The second method modelled the malignancy of weight-$d$ errors using a Bernoulli random variable. Finally, a third method used the MacWilliams identity.

.4.1 Monte-Carlo sampling

For physical bit error rate $p$, the logical bit error rate of an $[n, k, d]$ code is
$$p_L = \sum_{j=d}^{n-d} l_j \, p^j (1-p)^{n-j},$$
where $l_j$ is the number of malignant sets of weight $j$. At sufficiently low $p$, $p_L$ is approximately the first term of the polynomial, $l_d \, p^d (1-p)^{n-d}$. We can estimate $l_d$ using Monte Carlo simulations in two steps:

1. For different, small values of $p$, compute $p_L$ by sampling errors and evaluating the fraction of them that are malignant. An $n$-bit error sample $e$ is obtained by sampling each bit from a Bernoulli random variable with probability $p$. The error $e$ is malignant if $He = 0$, where $H$ is the parity check matrix of the code.
2. Perform a least squares fit of the obtained values with the polynomial $p^d (1-p)^{n-d}$. The coefficient of the fit is the Monte Carlo approximation of $l_d$.

At sufficiently low $p$, many of the error samples will be trivial.
Hence a lot of time is wasted evaluating these samples. For large $d$, this problem becomes worse. The probability of observing a weight-$d$ error scales as $p^d$, implying that the errors that actually may be malignant are rarely ever observed.

.4.2 Modelling malignancy with the Bernoulli distribution

Since we only care to check whether a weight-$d$ error is malignant or not, it is faster to sample only from the set of weight-$d$ errors. Let an error be sampled by choosing $d$ out of $n$ locations at random without replacement. We can now model the malignancy of a weight-$d$ error using a Bernoulli random variable: a weight-$d$ error sample is malignant with probability $p$. We can estimate $p$ with high confidence by checking for malignancy on many samples. Finally, $l_d = p \binom{n}{d}$.

.4.3 MacWilliams identity

For a code $C$ with $k$ codeword generators, there exist $n - k$ vectors spanning the nullspace (kernel), $C^\perp$. If the weights of the $|C^\perp| = 2^{n-k}$ bit strings can be enumerated, then the weights of the codewords of $C$ can be evaluated using the MacWilliams identity:
$$W^{C}_j = \frac{1}{|C^\perp|} \sum_{i=0}^{n} W^{C^\perp}_i K_j(i, n), \tag{13}$$
for $j = 0, 1, \dots, n$. Here $W^{C}_i$ is the number of codewords of $C$ of weight $i$ and $K_j(i, n)$ is the Krawtchouk polynomial
$$K_j(i, n) = \sum_{l=0}^{j} (-1)^l \binom{i}{l} \binom{n-i}{j-l}. \tag{14}$$
Then $W^{C}_d$ is the number of malignant fault sets of weight $d$. A short Mathematica script can enumerate the weights of $2^{36} = 68{,}719{,}476{,}736$ codewords in just under 24 hours.

.5 Construction of classical codes

.5.1 Cyclic codes defined using polynomials

Binary cyclic codes can be constructed using cyclic shifts of polynomials defined over finite fields. To understand why, first note the isomorphic map between the field $\mathbb{F}_2^n$ and polynomials of degree $< n$ with coefficients in $\mathbb{F}_2$. For example, in $\mathbb{F}_2^4$, the polynomial $x^3 + 0x^2 + x + 1$ corresponds to the bit string 1011, where the bit at position $i \in \{0, 1, \dots, n - 1\}$ (right-to-left) is the coefficient of the term $x^i$.
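To make the polynomial and bit-string correspondence concrete, here is a minimal sketch that builds codeword generators from cyclic shifts of a generator polynomial and then enumerates the weight distribution by brute force. The function names are illustrative, and the exhaustive enumeration is only practical for small codes. The number of minimum-weight codewords it reports is the count of undetected weight-$d$ errors (those with $He = 0$), i.e. the malignant sets of Appendix .4.

```python
from itertools import product


def poly_to_bits(exponents, n):
    """Length-n bit string whose bit at position i (right-to-left) is the x^i coefficient."""
    bits = ["0"] * n
    for e in exponents:
        bits[n - 1 - e] = "1"
    return "".join(bits)


def generator_rows(g, k):
    """Rows g(x) * x^i for i = 0..k-1, realized as left cyclic shifts of the bit string g."""
    return [g[i:] + g[:i] for i in range(k)]


def weight_distribution(rows):
    """counts[j] = number of codewords of weight j in the span of the rows (brute force)."""
    n = len(rows[0])
    ints = [int(r, 2) for r in rows]
    counts = [0] * (n + 1)
    for coeffs in product((0, 1), repeat=len(rows)):
        word = 0
        for c, r in zip(coeffs, ints):
            if c:
                word ^= r  # addition over F_2
        counts[bin(word).count("1")] += 1
    return counts


g = poly_to_bits([3, 1, 0], 7)   # x^3 + x + 1 embedded in length 7
rows = generator_rows(g, 4)      # generators of a [7, 4, 3] cyclic code
print(rows, weight_distribution(rows))
```

For the $[7, 4, 3]$ code generated by $x^3 + x + 1$, the enumeration finds seven codewords of weight three. The same routine applied to $g(x) = x + 1$ with $n = 4$ and $k = 3$ returns the rows 0011, 0110, 1100.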
An $[n, k, d]$ code is cyclic if the $k$ codeword generators can be obtained by cyclic shifts of a generator polynomial, $g(x)$. Cyclic shifts of $g(x)$ are obtained by multiplying $g(x)$ with $\{1, x, x^2, \dots, x^{k-1}\}$.

Single Error Detect code

The Single Error Detect code with parameters $[\alpha + 1, \alpha, 2]$ can be generated by taking cyclic shifts of the generating polynomial $g(x) = x + 1$ over the field $\mathbb{F}_2^{\alpha+1}$. For example, the codewords of the $[4, 3, 2]$ code are the rows of $G$ below.
$$G = \begin{pmatrix} 0011 \\ 0110 \\ 1100 \end{pmatrix}. \tag{15}$$
The parity check matrix $H$ is the nullspace of $G$, i.e. the span of all vectors in $\mathbb{F}_2^n$ that are orthogonal to elements of $G$. For the above code,
$$H = \begin{pmatrix} 1111 \end{pmatrix}. \tag{16}$$
We also display the codeword generator matrix for the $[12, 11, 2]$ code which we use in Section 7.2.1 for magic state distillation.
$$G = \begin{pmatrix} 000000000011 \\ 000000000110 \\ 000000001100 \\ 000000011000 \\ 000000110000 \\ 000001100000 \\ 000011000000 \\ 000110000000 \\ 001100000000 \\ 011000000000 \\ 110000000000 \end{pmatrix}. \tag{17}$$

Table 2: BCH codes and associated generator polynomials. The codeword generators are cyclic shifts of the generator polynomial.

Code | Generator polynomial
--- | ---
[7, 4, 3] | 1011 = $x^3 + x + 1$
[15, 11, 3] | 10011
[31, 26, 3] | 100101
[43, 36, 3] | 10101011
[49, 43, 3] | 1000011
[63, 57, 3] | 1000011
[85, 77, 3] | 100011101
[127, 120, 3] | 10000011
[15, 7, 5] | 111010001
[31, 21, 5] | 11101101001
[43, 29, 5] | 100111110100011
[49, 37, 5] | 1010100111001
[63, 51, 5] | 1010100111001
[85, 69, 5] | 10110111101100011
[127, 113, 5] | 101010001111101
[15, 5, 7] | 10100110111
[31, 16, 7] | 1000111110101111
[43, 22, 7] | 1010010100110010100001
[49, 31, 7] | 1111000001011001111
[63, 45, 7] | 1111000001011001111
[85, 61, 7] | 1101110111010000110110101
[127, 106, 7] | 1010010011000000011011
[49, 25, 9] | 1110110110010011101110111
[63, 39, 9] | 1110110110010011101110111
[85, 53, 9] | 111101110010110110100001011111101
[127, 99, 9] | 11000101001010111100100111111
[43, 15, 10] | 11111110001001100100000101011
[31, 11, 11] | 101100010011011010101
[49, 22, 11] | 1000011011101000000100010011
[63, 36, 11] | 1000011011101000000100010011
[85, 45, 11] | 10011001101111101110100111010110100010001
[127, 92, 11] | 111000010001110010101001101101010111

Table 3: Zetterberg codes with associated generator polynomials.

u | Code | Generator polynomial
--- | --- | ---
3 | [9, 2, 6] | 10111101 = $x^7 + x^5 + x^4 + x^3 + x^2 + 1$
4 | [17, 9, 5] | 100111001
5 | [33, 22, 6] | 101001100101
6 | [65, 53, 5] | 1000111110001
7 | [129, 114, 6] | 1001010000101001

Golay code

The $[23, 12, 7]$ Golay code is a cyclic code generated by the polynomial $x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1$ over $\mathbb{F}_2^{23}$.
$$G = \begin{pmatrix} 00000000000101011100011 \\ 00000000001010111000110 \\ 00000000010101110001100 \\ 00000000101011100011000 \\ 00000001010111000110000 \\ 00000010101110001100000 \\ 00000101011100011000000 \\ 00001010111000110000000 \\ 00010101110001100000000 \\ 00101011100011000000000 \\ 01010111000110000000000 \\ 10101110001100000000000 \end{pmatrix}. \tag{18}$$

BCH codes

Bose-Chaudhuri-Hocquenghem (BCH) codes are a well-studied family of classical cyclic codes constructed using polynomials over finite fields. Due to the flexible nature of the construction of these codes, codes of different distances can be defined for the same code size. In Table 2, we show the generating polynomials for the BCH codes that were considered in this paper.

Zetterberg codes

Zetterberg codes are binary cyclic codes defined as $[2^u + 1, 2^u + 1 - 2u, 5 \leq d \leq 6]$ codes for even $u$. For odd $u$, we obtain the parameters $[2^u + 1, 2^u - 2u, 6]$. These codes are quasi-perfect: the distance between two codewords is $5 \leq d \leq 6$. In this paper we consider Zetterberg codes for $u \in \{3, 4, 5, 6, 7\}$. The codewords are defined by taking cyclic shifts of polynomials shown in Table 3. Note that all the polynomials are palindromic. For a chosen $u$, other codes with the same parameters may be defined using different palindromic polynomials of the same degree. For a more detailed description of the construction, see [JCL+10].

.5.2 Reed-Muller and Polar codes

Binary Reed-Muller codes are $[2^m, k, 2^{m-r}]$ codes for $r \leq m$ where
$$k = 2^m - \sum_{i=0}^{m-r-1} \binom{m}{i} = \sum_{i=0}^{r} \binom{m}{i}. \tag{19}$$
To determine the codewords of the $(r, m)$-Reed-Muller code, start with the $m$-fold tensor product of the generator matrix $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. Remove the $\sum_{i=0}^{m-r-1} \binom{m}{i}$ rows with fewer than $d$ 1's. The $k$ remaining rows denote the codewords. In this paper, we look at the family of $m = r + 1$ Single Error Detect codes, $m = r + 2$ distance-4 Extended Hamming codes, and $m = r + 3$ distance-8 codes.

Polar codes are $2^m$-bit codes that were initially developed for communication systems to tackle analog noise. The binary codes constructed using this formalism can be used against discrete noise too. The method of construction is the same as that of the Reed-Muller codes, but allows for codes with fewer encoded bits. After removing low-weight codewords to create a Reed-Muller code, remove extra codewords (lowest weight first) until there are exactly as many encoded bits as required.

.6 Speedups offered by different codes

In Fig. 12, we show the lowest average runtime per Pauli for temporally encoded lattice surgery of $k \in \{2, 3, \dots, 100\}$ measurements for $p = 10^{-3}$. We also indicate which code achieves the lowest average runtime per Pauli for each $k$. Note that this paper only considered a limited number of classical codes for TELS. It may be possible for other codes to perform better than the ones outlined here. Fig. 13 shows the best classical codes for $p = 10^{-4}$ and $\delta = 10^{-15}$ and $\delta = 10^{-20}$ respectively.

In Table 4, we show the best average speedup due to a TELS code for $k \in \{2, 3, \dots, 100\}$. These speedups are computed with respect to performing the $k$ measurements sequentially at the regular measurement distance $d_m$. These speedups are computed for various regimes: $p = 10^{-3}$, $\delta \in \{10^{-10}, 10^{-15}, 10^{-20}, 10^{-25}\}$ and $p = 10^{-4}$, $\delta \in \{10^{-15}, 10^{-20}\}$.

.7 Clifford frames of distilled magic states

In circuits like Fig. 7.1, where non-Clifford gates are implemented using Pauli measurements, we prove that the Clifford frame of the distilled magic states are powers of $X_{\pi/4}$.
After the temporally encoded non-Clifford gates, the Clifford frame consists of a sequence of Clifford rotations which are tensor products of $X$ and $\mathbb{1}$. We first observe the effect of an $(X \otimes X)_{\pi/4}$ gate on an input $|T_X\rangle \otimes |\psi\rangle$ state, where $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ is some arbitrary state. If we can determine the effect of the $(X \otimes X)_{\pi/4}$ rotation on the subsystem of the distilled magic state, we can determine the final Clifford frame of the distilled magic state.
$$(X \otimes X)_{\pi/4} \, |T_X\rangle \otimes |\psi\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} & 0 & 0 & \frac{-i}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{-i}{\sqrt{2}} & 0 \\ 0 & \frac{-i}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ \frac{-i}{\sqrt{2}} & 0 & 0 & \frac{1}{\sqrt{2}} \end{pmatrix} \cdot \frac{1}{\sqrt{2}} \begin{pmatrix} (1 + e^{i\pi/4})\alpha \\ (1 + e^{i\pi/4})\beta \\ (1 - e^{i\pi/4})\alpha \\ (1 - e^{i\pi/4})\beta \end{pmatrix}. \tag{20}$$

Figure 12 (panels (a), (b), (c); horizontal axis $k$, vertical axis average runtime per Pauli): We show the classical codes achieving the lowest average runtime per Pauli for $k \in \{2, 3, \dots, 100\}$ at $p = 10^{-3}$ and for (a) $\delta = 10^{-15}$, (b) $\delta = 10^{-20}$ and (c) $\delta = 10^{-25}$. We set the routing space area $A = 100$.

Figure 13 (panels (a), (b); horizontal axis $k$, vertical axis average runtime per Pauli): We show the classical codes achieving the lowest average runtime per Pauli for $k \in \{2, 3, \dots, 100\}$ at $p = 10^{-4}$ and for (a) $\delta = 10^{-15}$ and (b) $\delta = 10^{-20}$. We set the routing space area $A = 100$.

Table 4: The best lattice surgery speedup for $k \in \{1, 2, \dots, 100\}$, and associated classical code achieving it, in different noise regimes $p$, and for different target logical error rates $\delta$. Table continues on subsequent pages.
k | p = 10^-3, δ = 10^-10 | p = 10^-3, δ = 10^-15 | p = 10^-3, δ = 10^-20 | p = 10^-3, δ = 10^-25 | p = 10^-4, δ = 10^-15 | p = 10^-4, δ = 10^-20
--- | --- | --- | --- | --- | --- | ---
2 | SED2, 1.167 | SED2, 1.333 | SED2, 1.238 | SED2, 1.333 | SED2, 1.333 | SED2, 1.333
3 | SED2, 1.312 | SED2, 1.5 | SED2, 1.393 | SED2, 1.5 | SED2, 1.5 | SED2, 1.5
4 | EHam4, 1.739 | EHam4, 1.666 | BCH7, 1.713 | EHam4, 2 | Ham3, 1.714 | CSED4, 1.778
5 | SED2, 1.458 | SED2, 1.667 | BCH7, 2.141 | BCH7, 1.778 | BCH7, 1.895 | SED2, 1.667
6 | SED2, 1.5 | BCH5, 1.977 | BCH5, 1.733 | SED2, 1.714 | Gola7, 1.441 | SED2, 1.714
7 | BCH5, 1.633 | BCH5, 2.306 | BCH5, 2.022 | BCH5, 1.867 | Gola7, 1.681 | BCH5, 1.866
8 | CSED4, 1.728 | SED2, 1.778 | Gola7, 2.22 | BCH11, 2.065 | Gola7, 1.922 | BCH11, 2.053
9 | CSED4, 1.944 | Gola7, 1.956 | Gola7, 2.498 | BCH11, 2.323 | Gola7, 2.162 | BCH11, 2.31
10 | EHam4, 2.16 | Gola7, 2.174 | Gola7, 2.775 | BCH11, 2.581 | Gola7, 2.402 | BCH11, 2.566
11 | EHam4, 2.376 | Gola7, 2.391 | Gola7, 3.053 | BCH11, 2.839 | Gola7, 2.642 | BCH11, 2.823
12 | Gola7, 1.826 | Gola7, 2.608 | Gola7, 3.331 | Gola7, 2.783 | Gola7, 2.882 | BCH9, 2.209
13 | CSED4, 1.785 | BCH7, 2.096 | BCH7, 2.66 | BCH9, 2.417 | BCH7, 2.251 | BCH9, 2.393
14 | CSED4, 1.922 | BCH7, 2.257 | BCH7, 2.865 | BCH9, 2.603 | BCH7, 2.424 | BCH9, 2.577
15 | CSED4, 2.059 | BCH7, 2.419 | BCH7, 3.069 | BCH9, 2.789 | BCH7, 2.597 | BCH9, 2.761
16 | CSED4, 2.196 | BCH7, 2.58 | BCH7, 3.274 | BCH7, 2.753 | BCH7, 2.771 | BCH11, 2.577
17 | BCH5, 1.919 | Zett5, 2.51 | BCH7, 2.484 | BCH11, 2.775 | BCH11, 2.082 | BCH11, 2.738
18 | BCH5, 2.032 | Zett5, 2.657 | BCH7, 2.63 | BCH11, 2.939 | BCH11, 2.204 | BCH11, 2.899
19 | BCH5, 2.145 | Zett5, 2.805 | BCH7, 2.777 | BCH11, 3.102 | BCH11, 2.326 | BCH11, 3.06
20 | BCH5, 2.257 | Zett5, 2.953 | BCH7, 2.923 | BCH11, 3.265 | BCH11, 2.449 | BCH11, 3.221
21 | BCH5, 2.37 | Zett5, 3.1 | BCH7, 3.069 | BCH11, 3.429 | BCH11, 2.571 | BCH11, 3.382
22 | EHam4, 2.346 | Zett5, 3.248 | BCH7, 3.215 | BCH11, 3.592 | BCH11, 2.694 | BCH11, 3.543
23 | Pol4, 2.453 | BCH5, 2.586 | BCH9, 3.049 | BCH9, 3.613 | BCH9, 2.814 | BCH9, 3.149
24 | Pol4, 2.56 | BCH5, 2.698 | BCH9, 3.181 | BCH9, 3.77 | BCH9, 2.937 | BCH9, 3.286
25 | Pol4, 2.666 | BCH5, 2.81 | BCH9, 3.314 | BCH9, 3.927 | BCH9, 3.059 | BCH9, 3.423
26 | EHam4, 2.773 | BCH5, 2.923 | BCH7, 3.319 | BCH11, 3.298 | BCH7, 2.67 | BCH11, 3.23
27 | BCH5, 2.196 | BCH5, 3.035 | BCH7, 3.446 | BCH11, 3.425 | BCH7, 2.773 | BCH11, 3.354
28 | BCH5, 2.278 | BCH5, 3.148 | BCH7, 3.574 | BCH11, 3.551 | BCH7, 2.876 | BCH11, 3.478
29 | BCH5, 2.359 | BCH5, 3.26 | BCH7, 3.702 | BCH11, 3.678 | BCH7, 2.978 | BCH11, 3.603
30 | BCH7, 2.143 | BCH7, 3.059 | BCH7, 3.829 | BCH11, 3.805 | BCH7, 3.081 | BCH11, 3.727
31 | BCH7, 2.214 | BCH7, 3.161 | BCH7, 3.957 | BCH11, 3.932 | BCH7, 3.184 | BCH11, 3.851
32 | BCH5, 2.284 | BCH11, 2.54 | BCH11, 3.302 | BCH11, 4.059 | BCH11, 3.047 | BCH11, 3.975
33 | BCH5, 2.355 | BCH11, 2.619 | BCH11, 3.405 | BCH11, 4.186 | BCH11, 3.143 | BCH11, 4.1
34 | BCH5, 2.427 | BCH11, 2.698 | BCH11, 3.508 | BCH11, 4.312 | BCH11, 3.238 | BCH11, 4.224
35 | BCH5, 2.498 | BCH11, 2.778 | BCH11, 3.611 | BCH11, 4.439 | BCH11, 3.333 | BCH11, 4.348
36 | BCH5, 2.57 | BCH11, 2.857 | BCH11, 3.714 | BCH11, 4.566 | BCH11, 3.428 | BCH11, 4.472
37 | BCH5, 2.641 | BCH9, 2.937 | BCH9, 3.813 | BCH9, 4.471 | BCH9, 3.447 | BCH9, 3.747
38 | BCH9, 2.111 | BCH9, 3.016 | BCH9, 3.916 | BCH9, 4.592 | BCH9, 3.541 | BCH9, 3.849
39 | BCH9, 2.167 | BCH9, 3.095 | BCH9, 4.019 | BCH9, 4.713 | BCH9, 3.634 | BCH9, 3.95
40 | BCH7, 2.222 | BCH7, 3.171 | RMul8, 3.863 | BCH11, 3.765 | BCH7, 3.038 | BCH11, 3.623
41 | BCH7, 2.278 | BCH7, 3.25 | RMul8, 3.96 | BCH11, 3.859 | BCH7, 3.114 | BCH11, 3.713
42 | BCH7, 2.333 | BCH7, 3.329 | RMul8, 4.059 | BCH11, 3.953 | BCH7, 3.19 | BCH11, 3.804
43 | BCH7, 2.389 | BCH7, 3.409 | BCH11, 3.288 | BCH11, 4.047 | BCH7, 3.266 | BCH11, 3.895
44 | BCH7, 2.444 | BCH7, 3.488 | BCH11, 3.365 | BCH11, 4.141 | BCH7, 3.342 | BCH11, 3.985
45 | BCH7, 2.5 | BCH7, 3.567 | BCH11, 3.441 | BCH11, 4.235 | BCH7, 3.418 | BCH11, 4.076
46 | BCH5, 2.553 | BCH9, 2.706 | BCH9, 3.517 | BCH9, 4.05 | BCH9, 3.235 | BCH9, 3.191
47 | BCH5, 2.608 | BCH9, 2.765 | BCH9, 3.594 | BCH9, 4.138 | BCH9, 3.306 | BCH9, 3.26
48 | BCH5, 2.664 | BCH9, 2.824 | BCH9, 3.67 | BCH9, 4.226 | BCH9, 3.376 | BCH9, 3.33
49 | BCH5, 2.719 | BCH9, 2.882 | BCH9, 3.747 | BCH9, 4.314 | BCH9, 3.446 | BCH9, 3.399
50 | BCH5, 2.775 | BCH9, 2.941 | BCH9, 3.823 | BCH9, 4.402 | BCH9, 3.517 | BCH9, 3.468
51 | BCH5, 2.83 | BCH9, 3 | BCH9, 3.9 | BCH9, 4.49 | BCH9, 3.587 | BCH9, 3.538
52 | Zett5, 2.797 | BCH9, 3.059 | BCH9, 3.976 | BCH9, 4.578 | BCH9, 3.657 | BCH9, 3.607
53 | Zett5, 2.85 | BCH9, 3.118 | BCH9, 4.053 | BCH9, 4.666 | BCH9, 3.728 | BCH9, 3.676
54 | EHam4, 2.808 | BCH7, 3.17 | BCH7, 3.862 | BCH7, 3.388 | BCH7, 2.809 | BCH11, 3.141
55 | EHam4, 2.86 | BCH7, 3.228 | BCH7, 3.934 | BCH7, 3.451 | BCH7, 2.861 | BCH11, 3.199
56 | EHam4, 2.912 | BCH7, 3.287 | BCH7, 4.006 | BCH7, 3.514 | BCH7, 2.913 | BCH11, 3.257
57 | EHam4, 2.964 | BCH7, 3.346 | BCH7, 4.077 | BCH7, 3.576 | BCH7, 2.965 | BCH11, 3.315
58 | BCH7, 2.388 | BCH7, 3.405 | BCH7, 4.149 | BCH7, 3.639 | BCH7, 3.017 | BCH11, 3.374
59 | BCH7, 2.429 | BCH7, 3.463 | BCH7, 4.22 | BCH7, 3.702 | BCH7, 3.069 | BCH11, 3.432
60 | BCH7, 2.47 | BCH7, 3.522 | BCH7, 4.292 | BCH7, 3.765 | BCH7, 3.122 | BCH11, 3.49
61 | BCH7, 2.512 | BCH7, 3.581 | BCH7, 4.363 | BCH7, 3.827 | BCH7, 3.174 | BCH11, 3.548
62 | BCH5, 2.548 | CSED4, 2.548 | BCH11, 3.173 | BCH11, 3.887 | BCH11, 2.926 | BCH11, 3.606
63 | BCH5, 2.589 | CSED4, 2.589 | BCH11, 3.224 | BCH11, 3.95 | BCH11, 2.973 | BCH11, 3.664
64 | BCH5, 2.63 | CSED4, 2.63 | BCH11, 3.276 | BCH11, 4.013 | BCH11, 3.02 | BCH11, 3.723
65 | BCH5, 2.671 | BCH9, 2.559 | BCH11, 3.327 | BCH11, 4.076 | BCH11, 3.067 | BCH11, 3.781
66 | BCH5, 2.712 | BCH9, 2.598 | BCH11, 3.378 | BCH11, 4.138 | BCH11, 3.114 | BCH11, 3.839
67 | BCH5, 2.753 | BCH9, 2.638 | BCH11, 3.429 | BCH11, 4.201 | BCH11, 3.162 | BCH11, 3.897
68 | BCH5, 2.794 | BCH9, 2.677 | BCH11, 3.48 | BCH11, 4.264 | BCH11, 3.209 | BCH11, 3.955
69 | BCH5, 2.835 | BCH9, 2.717 | BCH11, 3.531 | BCH11, 4.326 | BCH11, 3.256 | BCH11, 4.013
70 | CSED4, 2.265 | BCH9, 2.756 | BCH11, 3.583 | BCH11, 4.389 | BCH11, 3.303 | BCH11, 4.072
71 | CSED4, 2.297 | BCH9, 2.795 | BCH11, 3.634 | BCH11, 4.452 | BCH11, 3.35 | BCH11, 4.13
72 | CSED4, 2.329 | BCH9, 2.835 | BCH11, 3.685 | BCH11, 4.514 | BCH11, 3.397 | BCH11, 4.188
73 | CSED4, 2.362 | BCH9, 2.874 | BCH11, 3.736 | BCH11, 4.577 | BCH11, 3.445 | BCH11, 4.246
74 | CSED4, 2.394 | BCH9, 2.913 | BCH11, 3.787 | BCH11, 4.64 | BCH11, 3.492 | BCH11, 4.304
75 | CSED4, 2.427 | BCH9, 2.953 | BCH11, 3.839 | BCH11, 4.703 | BCH11, 3.539 | BCH11, 4.362
76 | CSED4, 2.459 | BCH9, 2.992 | BCH11, 3.89 | BCH11, 4.765 | BCH11, 3.586 | BCH11, 4.421
77 | CSED4, 2.491 | BCH9, 3.031 | BCH11, 3.941 | BCH11, 4.828 | BCH11, 3.633 | BCH11, 4.479
78 | CSED4, 2.524 | BCH9, 3.071 | BCH11, 3.992 | BCH11, 4.891 | BCH11, 3.681 | BCH11, 4.537
79 | CSED4, 2.556 | BCH9, 3.11 | BCH11, 4.043 | BCH11, 4.953 | BCH11, 3.728 | BCH11, 4.595
80 | CSED4, 2.588 | BCH9, 3.15 | BCH11, 4.094 | BCH11, 5.016 | BCH11, 3.775 | BCH11, 4.653
81 | CSED4, 2.621 | BCH9, 3.189 | BCH11, 4.146 | BCH11, 5.079 | BCH11, 3.822 | BCH11, 4.711
82 | BCH11, 2.26 | BCH9, 3.228 | BCH11, 4.197 | BCH11, 5.141 | BCH11, 3.869 | BCH11, 4.77
83 | BCH11, 2.287 | BCH9, 3.268 | BCH11, 4.248 | BCH11, 5.204 | BCH11, 3.917 | BCH11, 4.828
84 | BCH11, 2.315 | BCH9, 3.307 | BCH11, 4.299 | BCH11, 5.267 | BCH11, 3.964 | BCH11, 4.886
85 | BCH11, 2.343 | BCH9, 3.346 | BCH11, 4.35 | BCH11, 5.33 | BCH11, 4.011 | BCH11, 4.944
86 | BCH11, 2.37 | BCH9, 3.386 | BCH11, 4.402 | BCH11, 5.392 | BCH11, 4.058 | BCH11, 5.002
87 | BCH11, 2.398 | BCH9, 3.425 | BCH11, 4.453 | BCH11, 5.455 | BCH11, 4.105 | BCH11, 5.06
88 | BCH11, 2.425 | BCH9, 3.465 | BCH11, 4.504 | BCH11, 5.518 | BCH11, 4.152 | BCH11, 5.119
89 | BCH11, 2.453 | BCH9, 3.504 | BCH11, 4.555 | BCH11, 5.58 | BCH11, 4.2 | BCH11, 5.177
90 | BCH11, 2.48 | BCH9, 3.543 | BCH11, 4.606 | BCH11, 5.643 | BCH11, 4.247 | BCH11, 5.235
91 | BCH11, 2.508 | BCH9, 3.583 | BCH11, 4.657 | BCH11, 5.706 | BCH11, 4.294 | BCH11, 5.293
92 | BCH11, 2.535 | BCH9, 3.622 | BCH11, 4.709 | BCH11, 5.768 | BCH11, 4.341 | BCH11, 5.351
93 | BCH9, 2.563 | BCH9, 3.661 | BCH9, 4.738 | BCH9, 5.302 | BCH9, 4.057 | BCH9, 2.929
94 | BCH9, 2.591 | BCH9, 3.701 | BCH9, 4.789 | BCH9, 5.359 | BCH9, 4.101 | BCH9, 2.961
95 | BCH9, 2.618 | BCH9, 3.74 | BCH9, 4.84 | BCH9, 5.416 | BCH9, 4.144 | BCH9, 2.992
96 | BCH9, 2.646 | BCH9, 3.78 | BCH9, 4.891 | BCH9, 5.473 | BCH9, 4.188 | BCH9, 3.024
97 | BCH9, 2.673 | BCH9, 3.819 | BCH9, 4.942 | BCH9, 5.53 | BCH9, 4.232 | BCH9, 3.055
98 | BCH9, 2.701 | BCH9, 3.858 | BCH9, 4.993 | BCH9, 5.587 | BCH9, 4.275 | BCH9, 3.087
99 | BCH9, 2.728 | BCH9, 3.898 | BCH9, 5.043 | BCH9, 5.644 | BCH9, 4.319 | BCH9, 3.118
100 | BCH7, 2.755 | BCH7, 3.919 | BCH7, 3.412 | BCH7, 4.199 | CSED4, 2.477 | BCH7, 3.15

When we trace out the subsystem that started as $|\psi\rangle$, we obtain the following state on the subsystem of the magic state (up to a global phase)
$$\mathrm{tr}_{|\psi\rangle}\big((X \otimes X)_{\pi/4} \, |T_X\rangle \otimes |\psi\rangle\big) = \begin{pmatrix} 1 + e^{i\pi/4} - i(1 - e^{i\pi/4}) \\ 1 - e^{i\pi/4} - i(1 + e^{i\pi/4}) \end{pmatrix}. \tag{21}$$
This is equivalent to $X_{\pi/4}|T_X\rangle$, which is
$$X_{\pi/4}|T_X\rangle = \begin{pmatrix} 1 + e^{i\pi/4} - i(1 - e^{i\pi/4}) \\ 1 - e^{i\pi/4} - i(1 + e^{i\pi/4}) \end{pmatrix}. \tag{22}$$
This shows that if there is one Clifford operator in the Clifford frame with $X$ support on the magic state qubit, it essentially results in an $X_{\pi/4}$ Clifford frame update on the magic state. However, with more than one Clifford correction in the Clifford frame, the total rotation accumulated on the magic state qubit is the product of $X_{\pi/4}$ rotations for all the Cliffords in the Clifford frame that contain an $X$ operator on the support of the magic state qubit.

.8 Choice of codewords for the Golay code for TELS of a 15-to-1 distillation protocol

In the 15-to-1 distillation protocol of Section 7.2.1, we implemented a TELS protocol for the non-Clifford measurements using the $[23, 12, 7]$ Golay code. We remove one codeword, since we only need to perform 11 Pauli measurements in the PP set, and permute some columns (reordering the resulting Pauli measurements) to get the following codeword generator matrix
$$G = \begin{pmatrix} 11111110000000000000000 \\ 00011011111000000000000 \\ 01101000011110000000000 \\ 00010000110111100000000 \\ 00101100000101110000000 \\ 00010110010001011000000 \\ 00001110001100001100000 \\ 00000010011011000110000 \\ 00000100001110100011000 \\ 00000010000011110001100 \\ 00000000000001101101111 \end{pmatrix}. \tag{23}$$
Now, after the seventh measurement, the cell holding the magic state associated with the first row of $G$ can be reset and used to inject a new magic state that will only be required for the 14th Pauli measurement (first column (left-to-right) with a 1 in the last row). Note that both the columns and rows of $G$ may be permuted; permuting columns reorders the new sequence of Pauli measurements; permuting rows merely swaps codeword generators. It may be possible to find an algorithm that iteratively applies a permutation rule convention to find a codeword matrix that allows magic states to occupy the fewest number of cells in a distillation tile.
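The reuse argument above can be checked directly from the rows of the generator matrix in Eq. (23). The following sketch is an illustration under a simplifying assumption that is not stated in the thesis: each magic state occupies its cell from the first to the last Pauli measurement in which its row has a 1. The helper names are hypothetical.

```python
# Rows of the permuted Golay generator matrix of Eq. (23); one row per magic state,
# one column per Pauli measurement in the temporally encoded sequence.
G = [
    "11111110000000000000000",
    "00011011111000000000000",
    "01101000011110000000000",
    "00010000110111100000000",
    "00101100000101110000000",
    "00010110010001011000000",
    "00001110001100001100000",
    "00000010011011000110000",
    "00000100001110100011000",
    "00000010000011110001100",
    "00000000000001101101111",
]


def lifetimes(rows):
    """(first, last) 1-indexed measurement in which each row participates."""
    return [(r.index("1") + 1, r.rindex("1") + 1) for r in rows]


def peak_occupancy(rows):
    """Largest number of rows that are simultaneously 'live' (assumed occupancy model)."""
    spans = lifetimes(rows)
    steps = range(1, len(rows[0]) + 1)
    return max(sum(first <= t <= last for first, last in spans) for t in steps)


for row_index, span in enumerate(lifetimes(G), start=1):
    print(row_index, span)
print("peak occupancy:", peak_occupancy(G))
```

The first row is live only for measurements 1 through 7 and the last row only from measurement 14 onward, so under this model the two states never need a cell at the same time. A search over column permutations could use `peak_occupancy` as its objective.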
.9 Procedure for determining code distances of distillation tiles

Here we describe the procedure used to determine the spacelike distances $d_x$ and $d_z$ and timelike distances $d_m$ for lattice surgery in the distillation tiles of Section 7.2. First, note that when performing distillation in the Clifford frame using TELS-assisted lattice surgery, we will need to execute two PP sets. The first PP set performs the non-Clifford gates using $|T_X\rangle$ resource states (using a classical $[n_1, k_1, d_1]$ code), and the second PP set performs a set of Pauli measurements associated with conjugating Clifford corrections through single-qubit logical measurements (using a classical $[n_2, k_2, d_2]$ code).

We first set $\delta^{(M)}$ as the error budget per magic state that is output from a distillation protocol. Logical errors may accumulate on the distilled magic states by different mechanisms. Even with noiseless lattice surgery, errors on the input magic states may cause the final magic state to be logically wrong. This depends entirely on the choice of quantum code, here $[[n_{\mathrm{dist}}, k_{\mathrm{dist}}, d_{\mathrm{dist}}]]$. The logical error rate (per output magic state) of a distillation protocol with noiseless gates is
$$p_L^{(M)} = \frac{l_{\mathrm{dist}}}{k_{\mathrm{dist}}} \left( \frac{p(1 + \eta)}{3\eta} \right)^{d_{\mathrm{dist}}}, \tag{24}$$
according to the analysis in Sec. 1 of Ref. [Lit19b], using the biased circuit-level noise model of Section 6.1.1. Here, $l_{\mathrm{dist}}$ is the number of weight-$d_{\mathrm{dist}}$ fault sets that can cause a logical error. Given the error budget $\delta^{(M)}$ and the logical error rate with noiseless gates $p_L^{(M)}$, we may now upper bound the logical error rate due to noisy lattice surgery measurements,
$$\delta = (\delta^{(M)} - p_L^{(M)}) \, k_{\mathrm{dist}}. \tag{25}$$
Note that we multiply by $k_{\mathrm{dist}}$ since $\delta^{(M)}$ is the error budget per output magic state, but $\delta$ is the error budget of lattice surgery for the entire distillation protocol.

Logical errors due to lattice surgery may occur due to spacelike or timelike errors.
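As a worked illustration of Eqs. (24) and (25), the sketch below computes the noiseless-gate logical error rate and the remaining lattice surgery budget. The inputs are assumptions chosen for illustration: $l_{\mathrm{dist}} = 31$, $k_{\mathrm{dist}} = 3$ and $d_{\mathrm{dist}} = 5$ are read off from the closed form quoted for the 125-to-3 protocol in Appendix .11.1, with $p = 10^{-3}$, $\eta = 100$ and $\delta^{(M)} = 10^{-15}$ as used there. The function names are hypothetical.

```python
def distillation_error_per_state(l_dist, k_dist, d_dist, p, eta):
    """Eq. (24): logical error rate per output magic state with noiseless gates."""
    return (l_dist / k_dist) * (p * (1 + eta) / (3 * eta)) ** d_dist


def lattice_surgery_budget(delta_M, p_L_M, k_dist):
    """Eq. (25): error budget left for lattice surgery over the whole protocol."""
    budget = (delta_M - p_L_M) * k_dist
    if budget <= 0:
        raise ValueError("distillation error alone exceeds the target budget")
    return budget


# Illustrative inputs (assumed, see the lead-in): a [[125, 3, 5]] protocol.
p_L_M = distillation_error_per_state(l_dist=31, k_dist=3, d_dist=5, p=1e-3, eta=100)
delta = lattice_surgery_budget(delta_M=1e-15, p_L_M=p_L_M, k_dist=3)
print(f"p_L^(M) = {p_L_M:.3e}, delta = {delta:.3e}")
```

With these inputs the first function reproduces the value $p_L^{(M)} \approx 4.47 \times 10^{-17}$ quoted for the 125-to-3 protocol, leaving most of the per-state budget for lattice surgery errors.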
For each PP set, the logical error rate due to lattice surgery is pPP1(dm, n) = pL,X(dm, n) + pL,Z(dm, n) + pL(pm(dm, n)), (26) where pL,X and pL,Z are the spacelike contributions and pL is the timelike failure rate of TELS described in Section 6.2. Here dm and n refer to the lattice surgery measurement distance and the number of Pauli measurements respectively. Although we express pPP1 as a function of two variables dm and n, the spacelike distance dx and dz are also input variables. The important point is that dm and n are the only two variables that are different for each PP set. In Ref. [CC22b], Eqs. 3−6 denote the logical error rates of an X ⊗ X lattice surgery measurement. 166 We obtain equations for pL,X, pL,Z, and pm by modifying the above equations as shown below pm(dm, n) =0.01634nA(21.93p) dm+1 2 , (27) pL,Z(dm, n) =0.03148T N dx(28.91p) dz+1 2 , (28) pL,X(dm, n) =0.0148T F dx (0.762p) dx+1 2 , (29) where A is the area of the routing space (in units of dx and dz), T is the average time taken to execute the parallelizable Pauli set (from Section 6.2), N is the maximum number of logical qubits that are concurrently used during any lattice surgery measurement in a TELS protocol, and F is a pessimistic estimate of the maximum area used during a lattice surgery measurement (routing space + logical qubits associated in the measurement). In Appendix .10, we show equations for F, A, N and Space in terms of dx and dz for each of the different distillation layouts suggested in Section 7.2. The time to complete the entire distillation protocol is given in Section 7.1.1. If we use measurement distance d ′ m for the first PP set and measurement distance d ′′ m for the second PP set, then the objective is to find a set of parameters {dx, dz, d′ m, d′′ m} that minimizes the space-time cost of a distillation factory, while ensuring the following equation is satisfied, pPP1(d ′ m, n1) + pPP2(d ′′ m, n2) < δ. (30) Note that Eq. 
(30) ensures that the probability of a single logical failure event is less than $\delta$. Two independent logical failure events occur with probability $\sim \delta^2$, so we omit these higher-order events from Eq. (30).

.10 Constants used to determine spacetime costs of distillation tiles

In this section and Table 5 we list the constants that are used to determine the minimum spacelike and timelike distances for the distillation layouts of Section 7.2, using the procedure of Appendix .9. Distillation protocols that do not use TELS perform auto-corrected non-Clifford gate gadgets for the entire protocol, and hence contain one value each for the number of logical qubits, $N$, routing space area, $A$, and full area of lattice surgery, $F$. For protocols that perform TELS using the Clifford frame distillation circuit of Section 7.1.1, there are two PP sets. We display two sets of values for $N$, $A$, and $F$ to account for the changes between the execution of the first and second PP sets.

.11 Additional distillation layouts

.11.1 125-to-3 distillation

The 125-to-3 magic state distillation protocol is obtained from a triorthogonal CSS quantum $[[125, 3, 5]]$ code. This code is constructed by puncturing the $[128, 29, 32]$ Reed-Muller code at any three locations [HH18]. As a result, the quantum code will contain 3 logical qubits, 96 X-type stabilizers and 26 Z-type stabilizers. After applying the circuit transformation from a gate-based model to the PBC model [Lit19a], we are left with a sequence of 99 commuting Pauli measurements on 29 logical qubits, three of which will finally become the distilled magic states. These 99 measurements form a size-99 PP set. In this paper, we will consider using this protocol in a regime where the physical error rate is $p = 10^{-3}$ and our target is to distill magic states with logical error probability at most $\delta^{(M)} = 10^{-15}$. Using the noise model described in Chapter 7 for the injection of magic states, we apply the analysis in Ref.
[Lit19b] to determine the logical failure probability per output magic state for one round of a 125-to-3 distillation scheme,

$$p_L^{(M)} = \frac{1}{3}\left[ 31(\epsilon_{L,Z})^5 + \frac{1}{2}\,10(\epsilon_{L,Z})^4 \epsilon_{L,X} + \frac{1}{4}\,40(\epsilon_{L,Z})^3 (\epsilon_{L,X})^2 + \frac{1}{8}\,80(\epsilon_{L,Z})^2 (\epsilon_{L,X})^3 + \frac{1}{16}\,80\,\epsilon_{L,Z}(\epsilon_{L,X})^4 + \frac{1}{32}\,32(\epsilon_{L,X})^5 \right] = \frac{31(1+\eta)^5}{729\,\eta^5}\, p^5. \tag{31}$$

For $p = 10^{-3}$ and $\eta = 100$, the probability that the distillation succeeds is $1 - p_D^{(M)} = (1 - \epsilon_L)^{125} = 0.9584$ and $p_L^{(M)} = 4.47 \times 10^{-17}$. As this is sufficiently below $\delta^{(M)}$, the lattice surgery measurements used to execute the distillation protocol must be modeled with measurement distance large enough to allow for distilled magic states of logical error rate at most $\delta^{(M)}$. Using the procedure in Appendix .9, we determined that the minimum spacelike distances are $d_x = 13$ and $d_z = 25$.

Table 5: Constants associated with layouts of distillation tiles described in Section 7.2. These constants are used to determine minimum spacelike and timelike distances using the procedure in Appendix .9.

15-to-1 distillation - No temporal encoding
$\text{Space} = (2d_z + 2d_x + 2)(5(d_x + 1))$; $N = 7$; $A = (d_x + 1)(d_z + 4(d_x + 2))$; $F = \text{Space} - 2d_z(d_x + 1)$

15-to-1 distillation - No temporal encoding, parallelized
$\text{Space} = (2d_z + 4d_x + 4)(5(d_x + 1))$; $N = 9$; $A = (d_x + 1)(d_z + 4(d_x + 2))$; $F = \text{Space} - 3d_z(d_x + 1)$

15-to-1 distillation - SED2
$\text{Space} = (2d_z + d_x + 3)(5(d_x + 1))$; $N = 7$; $A = 5(d_x + 1)(d_x + 3)$; $F = \text{Space} - 3d_z(d_x + 1)$
$N_2 = 5$; $A_2 = A + 4d_z(d_x + 1)$; $F_2 = \text{Space} - d_z(d_x + 1)$

15-to-1 distillation - SED2, parallelized
$\text{Space} = (2d_z + 3d_x + 3)(7(d_x + 1))$; $N = 8$; $A = (d_x + 1)(2d_z + 14d_x + 3)$; $F = \text{Space} - 4d_z(d_x + 1)$
$N_2 = 5$; $A_2 = A + 4d_z(d_x + 1)$; $F_2 = \text{Space} - d_z(d_x + 1)$

15-to-1 distillation - BCH3
$\text{Space} = (2d_z + d_x + 3)(6(d_x + 1))$; $N = 10$; $A = 6(d_x + 1)(d_x + 3)$; $F = \text{Space} - 2d_z(d_x + 1)$
$N_2 = 5$; $A_2 = A + 4d_z(d_x + 1)$; $F_2 = \text{Space} - d_z(d_x + 1)$

15-to-1 distillation - BCH3, parallelized
$\text{Space} = (2d_z + 3d_x + 3)(8(d_x + 1))$; $N = 11$; $A = (d_x + 1)(2d_z + 16d_x + 3)$; $F = \text{Space} - 3d_z(d_x + 1)$
$N_2 = 5$; $A_2 = A + 4d_z(d_x + 1)$; $F_2 = \text{Space} - d_z(d_x + 1)$

15-to-1 distillation - Golay
$\text{Space} = (2d_z + d_x + 3)(8(d_x + 1))$; $N = 15$; $A = 8(d_x + 1)(d_x + 3)$; $F = \text{Space} - d_z(d_x + 1)$
$N_2 = 5$; $A_2 = A + 4d_z(d_x + 1)$; $F_2 = \text{Space} - d_z(d_x + 1)$

15-to-1 distillation - Golay, parallelized
$\text{Space} = (2d_z + 3d_x + 3)(9(d_x + 1))$; $N = 15$; $A = (d_x + 1)(2d_z + 18d_x + 3)$; $F = \text{Space} - d_z(d_x + 1)$
$N_2 = 5$; $A_2 = A + 4d_z(d_x + 1)$; $F_2 = \text{Space} - d_z(d_x + 1)$

116-to-12 distillation - No temporal encoding
$\text{Space} = \max(2(d_z + d_x), 4(d_x + 1))\,(22d_x + d_z + 23)$; $N = 31$; $A = (d_x + 1)(25d_x + 4)$; $F = \text{Space} - 13d_z(d_x + 1)$

116-to-12 distillation - No temporal encoding, parallelized
$\text{Space} = \max(2(d_z + 3d_x), 4(d_x + 1))\,(23d_x + 2d_z + 25)$; $N = 33$; $A = (d_x + 1)(30d_x + \max(2(d_z + 3d_x), 4(d_x + 1)))$; $F = \text{Space} - 14d_z(d_x + 1)$

116-to-12 distillation - Zett5, parallelized
$\text{Space} = (2d_z + 3d_x + 3)(31(d_x + 1))$; $N = 46$; $A = (d_x + 1)(2d_z + 51d_x + 3)$; $F = \text{Space} - 14d_z(d_x + 1)$
$N_2 = 29$; $A_2 = A + 19d_z(d_x + 1)$; $F_2 = \text{Space} - 12d_z(d_x + 1)$

116-to-12 distillation - BCH9, parallelized
$\text{Space} = (2d_z + 3d_x + 3)(38(d_x + 1))$; $N = 58$; $A = (d_x + 1)(2d_z + 65d_x + 3)$; $F = \text{Space} - 15d_z(d_x + 1)$
$N_2 = 29$; $A_2 = A + 32d_z(d_x + 1)$; $F_2 = \text{Space} - 12d_z(d_x + 1)$

114-to-14 distillation - No temporal encoding
$\text{Space} = \max(2(d_z + d_x), 4(d_x + 1))\,(23d_x + d_z + 24)$; $N = 31$; $A = (d_x + 1)(26d_x + 4)$; $F = \text{Space} - 15d_z(d_x + 1)$

114-to-14 distillation - No temporal encoding, parallelized
$\text{Space} = \max(2(d_z + 3d_x), 4(d_x + 1))\,(24d_x + 2d_z + 26)$; $N = 33$; $A = (d_x + 1)(30d_x + \max(2(d_z + 3d_x), 4(d_x + 1)))$; $F = \text{Space} - 16d_z(d_x + 1)$

114-to-14 distillation - Zett5, parallelized
$\text{Space} = (2d_z + 3d_x + 3)(32(d_x + 1))$; $N = 46$; $A = (d_x + 1)(2d_z + 51d_x + 3)$; $F = \text{Space} - 16d_z(d_x + 1)$
$N_2 = 29$; $A_2 = A + 19d_z(d_x + 1)$; $F_2 = \text{Space} - 14d_z(d_x + 1)$

114-to-14 distillation - BCH7, parallelized
$\text{Space} = (2d_z + 3d_x + 3)(35(d_x + 1))$; $N = 52$; $A = (d_x + 1)(2d_z + 57d_x + 3)$; $F = \text{Space} - 16d_z(d_x + 1)$
$N_2 = 29$; $A_2 = A + 25d_z(d_x + 1)$; $F_2 = \text{Space} - 14d_z(d_x + 1)$

125-to-3 distillation - No temporal encoding
$\text{Space} = \max(2(d_z + d_x), 4(d_x + 1))\,(18d_x + d_z + 19)$; $N = 31$; $A = (d_x + 1)(20d_x + 4)$; $F = \text{Space} - 4d_z(d_x + 1)$

125-to-3 distillation - No temporal encoding, parallelized
$\text{Space} = \max(2(d_z + 3d_x), 4(d_x + 1))\,(20d_x + 2d_z + 22)$; $N = 33$; $A = (d_x + 1)(29d_x + \max(2(d_z + 3d_x), 4(d_x + 1)))$; $F = \text{Space} - 5d_z(d_x + 1)$

125-to-3 distillation - BCH7, parallelized
$\text{Space} = (2d_z + 3d_x + 3)(30(d_x + 1))$; $N = 52$; $A = (d_x + 1)(2d_z + 58d_x + 3)$; $F = \text{Space} - 6d_z(d_x + 1)$
$N_2 = 29$; $A_2 = A + 26d_z(d_x + 1)$; $F_2 = \text{Space} - 3d_z(d_x + 1)$

125-to-3 distillation - BCH9, parallelized
$\text{Space} = (2d_z + 3d_x + 3)(33(d_x + 1))$; $N = 56$; $A = (d_x + 1)(2d_z + 64d_x + 3)$; $F = \text{Space} - 5d_z(d_x + 1)$
$N_2 = 29$; $A_2 = A + 32d_z(d_x + 1)$; $F_2 = \text{Space} - 3d_z(d_x + 1)$

We show two layouts for distillation tiles that do not use TELS in Fig. 14a and Fig. 14b. These layouts perform distillation in the Pauli frame as described in Ref. [Lit19a] using auto-corrected non-Clifford gadgets (Fig. 7.3). On the layout of Fig. 14a, 99 Pauli measurements are performed, each with measurement distance $d_m = 23$ (also derived using Appendix .9). On the layout of Fig. 14b, Pauli measurements can be performed two at a time. Hence the time required is only the time for 50 sequential lattice surgery measurements with measurement distance $d_m = 23$.

In Fig. 14c, we show a layout for a distillation tile that performs TELS with a classical $[127, 106, 7]$ BCH code. Note that since there are two disjoint routing spaces (in blue and gray), the set of $n = 127$ Pauli measurements can be performed in the time required for 64 sequential measurements. This way, the time cost is nearly halved, with only a minor increase to the height of the distillation tile (two rows of routing space of height $d_x$). To obtain the time cost shown in Table 7.1, we used $d_m = 7$ and $c = 2$, where $c$ is the maximum weight of classical errors that are corrected in a TELS protocol. Only classical errors of weight greater than or equal to three triggered detection events. Similarly, in Fig. 14d, we show a layout for a 125-to-3 distillation tile that performs TELS with a classical $[127, 99, 9]$ BCH code. Here, we used the parameters $d_m = 5$ and $c = 1$ to obtain the time cost shown in Table 7.1.
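To make the Table 5 entries concrete, the sketch below evaluates the constants of the 125-to-3 layout without temporal encoding at the spacelike distances quoted above ($d_x = 13$, $d_z = 25$), and counts sequential measurement rounds when a layout offers more than one routing space. The function names are hypothetical and the dictionary layout is only an illustrative choice; the formulas are copied from Table 5.

```python
def layout_125_to_3_no_tels(dx, dz):
    """Table 5 constants for 125-to-3 distillation without temporal encoding."""
    space = max(2 * (dz + dx), 4 * (dx + 1)) * (18 * dx + dz + 19)
    routing_area = (dx + 1) * (20 * dx + 4)      # A: area of the routing space
    full_area = space - 4 * dz * (dx + 1)        # F: full lattice surgery area
    return {"Space": space, "N": 31, "A": routing_area, "F": full_area}


def sequential_rounds(n_measurements, routing_spaces=1):
    """Rounds needed when `routing_spaces` measurements run concurrently."""
    return -(-n_measurements // routing_spaces)  # ceiling division


constants = layout_125_to_3_no_tels(dx=13, dz=25)
print(constants)
print(sequential_rounds(99, 2), sequential_rounds(127, 2))
```

With two routing spaces, the 99 measurements of the unencoded PP set and the 127 measurements of the $[127, 106, 7]$ BCH-encoded set take 50 and 64 sequential rounds, respectively, matching the counts given in the text.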
For both of the layouts that use TELS, we have only discussed the TELS code used to execute the non-Clifford gates of Section 7.1.1. For the 125-to-3 distillation protocol, there may be at most 26 additional Pauli measurements to perform due to the conditional Clifford corrections. The results of these measurements are used to detect if there are errors in the final distilled magic states. These Pauli measurements form a PP set of size at most 26. For the worst case where the PP set is of size 26, the TELS protocol uses the $[31, 26, 3]$ BCH code with lattice surgery measurement distance $d_m = 9$. The space requirements of all the above layouts are described as functions of $d_x$ and $d_z$ in Appendix .10. Additional constants in Appendix .10 can be used with the procedure of Appendix .9 to determine all the minimum distances (spacelike and timelike) for the distillation protocols.

Figure 14: Layouts of logical qubits for lattice-surgery-based 125-to-3 magic state distillation. Cell color legend in caption of Fig. 7.4. (a) Layout for a distillation tile without temporally encoded lattice surgery measurements, using an auto-corrected non-Clifford gate gadget. The blue routing space allows Pauli X-type measurements as there is access to the X boundaries of all the cells. The gray routing region allows access to a $|0\rangle$ ancilla with Y boundary access. This region contains hardware to execute a non-Clifford gate gadget. (b) Layout for a distillation tile that performs Pauli measurements two at a time, without temporal encoding. The long blue routing space performs one set of measurements with the auto-corrected non-Clifford gadget hardware at the left, and the large gray routing space uses the gadget hardware on the right. (c) Layout for a distillation tile performing temporally encoded lattice surgery with the $[127, 106, 7]$ BCH code. Non-Clifford gates are performed two at a time, using separate routing spaces shown in gray and blue. (d) Layout for a TELS-assisted distillation tile using the $[127, 99, 9]$ BCH code, performing non-Clifford gates two at a time.

.11.2 116-to-12 distillation

The 116-to-12 magic state distillation protocol is obtained from a triorthogonal CSS quantum $[[116, 12, 4]]$ code. This code is constructed by puncturing the $[128, 29, 32]$ Reed-Muller code at a specific set of 12 locations as shown in Ref. [HH18]. As a result, the quantum code will contain 12 logical qubits, 87 X-type stabilizers and 17 Z-type stabilizers. After applying the circuit transformation from a gate-based model to the PBC model [Lit19a], we are left with a sequence of 99 commuting Pauli measurements on 29 logical qubits, twelve of which will finally become the distilled magic states. These 99 measurements form a size-99 PP set. In this paper, we will consider using this protocol in a regime where the physical error rate is $p = 10^{-4}$ and our target is to distill magic states with logical error probability at most $\delta^{(M)} = 10^{-15}$. Using the noise model described in Chapter 7 for the injection of magic states, we apply the analysis in Ref. [Lit19b] to determine the logical failure probability per output magic state for one round of a 116-to-12 distillation scheme,

$$p_L^{(M)} = \frac{1}{12}\left[ 495(\epsilon_{L,Z})^4 + \frac{1}{2}\,8(\epsilon_{L,Z})^3 \epsilon_{L,X} + \frac{1}{4}\,24(\epsilon_{L,Z})^2 (\epsilon_{L,X})^2 + \frac{1}{8}\,32\,\epsilon_{L,Z}(\epsilon_{L,X})^3 + \frac{1}{16}\,16(\epsilon_{L,X})^4 \right] = \frac{495(1+\eta)^4}{972\,\eta^4}\, p^4. \tag{32}$$

For $p = 10^{-4}$ and $\eta = 100$, the probability that the distillation succeeds is $1 - p_D^{(M)} = (1 - \epsilon_L)^{116} = 0.9961$ and $p_L^{(M)} = 5.3 \times 10^{-17}$. As this is sufficiently below $\delta^{(M)}$, the lattice surgery measurements used to execute the distillation protocol must be modeled with measurement distance large enough to allow for distilled magic states of logical error rate at most $\delta^{(M)}$. Using the procedure in Appendix .9, we determined that the minimum spacelike distances are $d_x = 9$ and $d_z = 15$. We show two layouts for distillation tiles that do not use TELS in Fig. 15a and Fig. 15b.
These layouts perform distillation in the Pauli frame as described in Ref. [Lit19a] using auto-corrected non-Clifford gadgets (Fig. 7.3). On the layout of Fig. 15a, 99 Pauli measurements are performed, each with measurement distance $d_m = 13$. On the layout of Fig. 15b, Pauli measurements can be performed two at a time. Hence the time required is only the time for 50 sequential lattice surgery measurements with measurement distance $d_m = 13$.

Figure 15: Layouts of logical qubits for lattice-surgery-based 116-to-12 magic state distillation. Cell color legend in caption of Fig. 7.4. (a) Layout for a distillation tile without temporally encoded lattice surgery measurements, using an auto-corrected non-Clifford gate gadget. (b) Layout for a distillation tile that performs Pauli measurements two at a time, without temporal encoding. (c) Layout for a distillation tile performing temporally encoded lattice surgery with the $[129, 114, 6]$ Zetterberg code. Non-Clifford gates are performed two at a time, using separate routing spaces shown in gray and blue. (d) Layout for a TELS-assisted distillation tile using the $[127, 99, 9]$ BCH code, performing non-Clifford gates two at a time.

In Fig. 15c, we show a layout for a distillation tile that performs TELS with a classical $[129, 114, 6]$ Zetterberg code. Note that since there are two disjoint routing spaces (in blue and gray), the set of $n = 129$ Pauli measurements can be performed in the time required for 65 sequential measurements. To obtain the time cost shown in Table 7.1, we used $d_m = 3$ and $c = 0$. Similarly, in Fig. 15d, we show a layout for a 116-to-12 distillation tile that performs TELS with a classical $[127, 99, 9]$ BCH code. Here, we used the parameters $d_m = 3$ and $c = 2$ to obtain the time cost shown in Table 7.1. As discussed in Section 7.1.1, we must also execute a second PP set, now of maximum size 17, due to the conditional Clifford corrections.
The results of these measurements are used to detect if there are errors in the final distilled magic states. These Pauli measurements form a PP set of maximum size 17, and in the worst case, the TELS protocol uses a $[43, 17, 7]$ BCH code with lattice surgery measurement distance $d_m = 3$ and $c = 2$. The space requirements of all the above layouts are described as functions of $d_x$ and $d_z$ in Appendix .10. Additional constants in Appendix .10 can be used with the procedure of Appendix .9 to determine all the minimum distances (spacelike and timelike) for the distillation protocols.

.11.3 114-to-14 distillation

The 114-to-14 magic state distillation protocol is obtained from a triorthogonal CSS quantum $[[114, 14, 3]]$ code. This code is constructed by puncturing the $[128, 29, 32]$ Reed-Muller code at a specific set of 14 locations as shown in Ref. [HH18]. As a result, the quantum code will contain 14 logical qubits, 85 X-type stabilizers and 15 Z-type stabilizers. After applying the circuit transformation from a gate-based model to the PBC model [Lit19a], we are left with a sequence of 99 commuting Pauli measurements on 29 logical qubits, fourteen of which will finally become the distilled magic states. These 99 measurements form a size-99 PP set. In this paper, we will consider using this protocol in a regime where the physical error rate is $p = 10^{-3}$ and our target is to distill magic states with logical error probability at most $\delta^{(M)} = 10^{-10}$. Using the noise model described in Chapter 7 for the injection of magic states, we apply the analysis in Ref. [Lit19b] to determine the logical failure probability per output magic state for one round of a 114-to-14 distillation scheme,

$$p_L^{(M)} = \frac{1}{14}\left[ 30(\epsilon_{L,Z})^3 + \frac{1}{2}\,6(\epsilon_{L,Z})^2 \epsilon_{L,X} + \frac{1}{4}\,12\,\epsilon_{L,Z}(\epsilon_{L,X})^2 + \frac{1}{8}\,8(\epsilon_{L,X})^3 \right] = \frac{30(1+\eta)^3}{378\,\eta^3}\, p^3. \tag{33}$$

For $p = 10^{-3}$ and $\eta = 100$, the probability that the distillation succeeds is $1 - p_D^{(M)} = (1 - \epsilon_L)^{114} = 0.962$ and $p_L^{(M)} = 8.18 \times 10^{-11}$.
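The closed forms on the right-hand sides of Eqs. (31)-(33) can be cross-checked against the quoted numbers with a few lines of Python. The sketch below is only a consistency check: it uses exact rational arithmetic for the prefactors $l_{\text{dist}} / (k_{\text{dist}}\, 3^{d_{\text{dist}}})$ and then evaluates each expression at the stated $p$ and $\eta = 100$. The dictionary and function names are hypothetical.

```python
from fractions import Fraction

ETA = 100  # noise bias used in the text

# (l_dist, k_dist, d_dist), physical error rate, and target per-state error.
PROTOCOLS = {
    "125-to-3": {"l": 31, "k": 3, "d": 5, "p": 1e-3, "target": 1e-15},
    "116-to-12": {"l": 495, "k": 12, "d": 4, "p": 1e-4, "target": 1e-15},
    "114-to-14": {"l": 30, "k": 14, "d": 3, "p": 1e-3, "target": 1e-10},
}


def prefactor(l, k, d):
    """Exact coefficient multiplying ((1 + eta) / eta)^d * p^d."""
    return Fraction(l, k * 3 ** d)


def closed_form(l, k, d, p, eta=ETA):
    """Right-hand side of Eqs. (31)-(33) for the given code parameters."""
    return float(prefactor(l, k, d)) * ((1 + eta) / eta) ** d * p ** d


rates = {}
for name, c in PROTOCOLS.items():
    rates[name] = closed_form(c["l"], c["k"], c["d"], c["p"])
    below = rates[name] < c["target"]
    print(f"{name}: prefactor {prefactor(c['l'], c['k'], c['d'])}, "
          f"p_L^(M) = {rates[name]:.3g}, below target: {below}")
```

Note that `Fraction` prints prefactors in lowest terms, so 495/972 and 30/378 appear as 55/108 and 5/63.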
Now, the lattice surgery measurements used to execute the distillation protocol must be modeled with measurement distance large enough to allow for distilled magic states of logical error rate at most $\delta^{(M)}$. We show two layouts for distillation tiles that do not use TELS in Fig. 16a and Fig. 16b. These layouts perform distillation in the Pauli frame as described in Ref. [Lit19a] using auto-corrected non-Clifford gadgets (Fig. 7.3). On the layout of Fig. 16a, 99 Pauli measurements are performed, each with measurement distance $d_m = 15$. Using the procedure in Appendix .9, we determined that the minimum spacelike distances for this layout are $d_x = 9$ and $d_z = 19$. On the layout of Fig. 16b, Pauli measurements can be performed two at a time. Hence the time required is only the time for 50 sequential lattice surgery measurements with measurement distance $d_m = 15$. However, in this case, when we calculated the minimum spacelike distances, the Z-distance could be dropped by two. This can be attributed to the fact that the distillation protocol finished in nearly half the time, and the probability of a logical Z-type error (see Appendix .9) scales linearly with time. Hence $d_x = 9$ and $d_z = 17$.

Figure 16: Layouts of logical qubits for lattice-surgery-based 114-to-14 magic state distillation with a $[[114, 14, 3]]$ quantum code. Cell color legend in caption of Fig. 7.4. (a) Layout for a distillation tile without temporally encoded lattice surgery measurements, using an auto-corrected non-Clifford gate gadget. (b) Layout for a distillation tile that performs Pauli measurements two at a time, without temporal encoding. (c) Layout for a distillation tile performing temporally encoded lattice surgery with the $[129, 114, 6]$ Zetterberg code. Non-Clifford gates are performed two at a time, using separate routing spaces shown in gray and blue. (d) Layout for a TELS-assisted distillation tile using the $[127, 106, 7]$ BCH code, performing non-Clifford gates two at a time.

In Fig. 16c, we show a layout for a distillation tile that performs TELS with a classical $[129, 114, 6]$ Zetterberg code. Note that since there are two disjoint routing spaces (in blue and gray), the set of $n = 129$ Pauli measurements can be performed in the time required for 65 sequential measurements. To obtain the time cost shown in Table 7.1, we used $d_m = 5$ and $c = 0$. Similarly, in Fig. 16d, we show a layout for a 114-to-14 distillation tile that performs TELS with a classical $[127, 106, 7]$ BCH code. Here, we used the parameters $d_m = 5$ and $c = 1$ to obtain the time cost shown in Table 7.1. As discussed in Section 7.1.1, we also execute a second PP set, now of maximum size 15, due to the conditional Clifford corrections. The results of these measurements are used to detect if there are errors in the final distilled magic states. These Pauli measurements form a PP set of maximum size 15, and in the worst case, the TELS protocol uses a $[31, 16, 7]$ BCH code with lattice surgery measurement distance $d_m = 5$ and $c = 2$. The space requirements of all the above layouts are described as functions of $d_x$ and $d_z$ in Appendix .10. Additional constants in Appendix .10 can be used with the procedure of Appendix .9 to determine all the minimum distances (spacelike and timelike) for the distillation protocols.
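The procedure of Appendix .9 that is referenced throughout this appendix can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used to produce the quoted distances: the timelike TELS failure map $p_L(\cdot)$ of Section 6.2 is not reproduced here, so an identity placeholder stands in for it; the time $T$ is taken as a plain input; measurement distances are assumed to be odd; and all function names are hypothetical.

```python
def p_m(dm, n, area, p):
    """Eq. (27): measurement (timelike) failure probability."""
    return 0.01634 * n * area * (21.93 * p) ** ((dm + 1) / 2)


def p_lz(dx, dz, time, n_qubits, p):
    """Eq. (28): spacelike logical Z contribution."""
    return 0.03148 * time * n_qubits * dx * (28.91 * p) ** ((dz + 1) / 2)


def p_lx(dx, time, full_area, p):
    """Eq. (29): spacelike logical X contribution."""
    return 0.0148 * time * full_area * dx * (0.762 * p) ** ((dx + 1) / 2)


def identity(pm):
    """ASSUMPTION: placeholder for the TELS failure rate p_L(p_m) of Section 6.2."""
    return pm


def pp_set_error(dx, dz, dm, n, area, full_area, n_qubits, time, p,
                 timelike=identity):
    """Eq. (26): total lattice surgery error of one PP set."""
    return (p_lx(dx, time, full_area, p)
            + p_lz(dx, dz, time, n_qubits, p)
            + timelike(p_m(dm, n, area, p)))


def within_budget(p_pp1, p_pp2, delta):
    """Eq. (30): both PP sets together must stay below the budget delta."""
    return p_pp1 + p_pp2 < delta


def smallest_odd_dm(budget, n, area, p, timelike=identity, dm_max=101):
    """Smallest odd d_m whose timelike term alone fits under `budget`."""
    for dm in range(3, dm_max + 1, 2):
        if timelike(p_m(dm, n, area, p)) < budget:
            return dm
    raise ValueError("no measurement distance up to dm_max meets the budget")


# Illustrative inputs only: n = 99 measurements, routing area A = 3696
# (the Table 5 value for 125-to-3 without temporal encoding at d_x = 13),
# p = 1e-3, and an assumed timelike budget of 1e-15.
dm = smallest_odd_dm(budget=1e-15, n=99, area=3696, p=1e-3)
print("smallest odd d_m under the illustrative budget:", dm)
```

A full search would loop over candidate $\{d_x, d_z, d'_m, d''_m\}$, keep those for which `within_budget` holds, and pick the one with the lowest space-time cost; the cost function depends on the layout formulas of Appendix .10 and the timing model of Section 7.1.1, so it is left out of this sketch.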
Abstract
Quantum computation holds the promise of solving certain complex problems exponentially faster than classical computers. However, the high level of noise prevalent in current quantum devices impedes the accurate execution of even basic quantum algorithms. This can be remedied by protecting quantum information with a quantum error-correcting code, in which the logical information of an algorithmic qubit is spread across multiple physical qubits. Individual quantum errors are then located and corrected by the fault-tolerant measurement of multi-qubit stabilizer operators (parity checks). Unfortunately, error correction and fault tolerance both impose large demands on the qubit overhead: hundreds to thousands of physical qubits per logical qubit.
In this thesis, we reduce the qubit and time cost of fault tolerance by redesigning key building blocks of an error-corrected quantum computer. First, we develop a combinatorial proof with flag fault tolerance that exponentially reduces the number of qubits needed to measure a stabilizer of any size, while tolerating one fault. We then leverage the combinatorial proofs to develop fault-tolerant circuits to prepare cat states deterministically with only one ancillary qubit. These results then enable the construction of few-qubit fault-tolerant circuits for the preparation of complex encoded states with 100% yield. Next, we optimize the overhead of error correction on a planar 25-qubit layout. We show with extensive simulations that a distance-four code encoding six logical qubits protects information as well as the distance-five surface code, using one-tenth as many physical qubits. Finally, we optimize the time overhead of logical gates in surface code quantum computers. For computations executed via lattice surgery measurements of multi-qubit Pauli operators, we show that protecting measurement results with a classical code cuts computation time by a factor of two to six. Our hardware-agnostic optimizations of the space and time costs of fault tolerance thus suggest new routes to advance the timeline of error-free quantum computing.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Towards efficient fault-tolerant quantum computation
Quantum error correction and fault-tolerant quantum computation
Flag the faults for reliable quantum computing
Quantum steganography and quantum error-correction
Applications of quantum error-correcting codes to quantum information processing
Applications and error correction for adiabatic quantum optimization
Quantum coding with entanglement
Error correction and cryptography using Majorana zero modes
Trainability, dynamics, and applications of quantum neural networks
Dynamical error suppression for quantum information
Destructive decomposition of quantum measurements and continuous error detection and suppression using two-body local interactions
Demonstration of error suppression and algorithmic quantum speedup on noisy-intermediate scale quantum computers
Quantum computation and optimized error correction
Topics in quantum information and the theory of open quantum systems
Error suppression in quantum annealing
Quantum and classical steganography in optical systems
Protecting Hamiltonian-based quantum computation using error suppression and error correction
Towards optimized dynamical error control and algorithms for quantum information processing
Error correction and quantumness testing of quantum annealing devices
Entanglement-assisted coding theory
Asset Metadata
Creator
Prabhu, Prithviraj (author)
Core Title
Lower overhead fault-tolerant building blocks for noisy quantum computers
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Degree Conferral Date
2024-05
Publication Date
05/17/2024
Defense Date
03/27/2024
Publisher
Los Angeles, California (original), University of Southern California (original), University of Southern California. Libraries (digital)
Tag
combinatorics,fault tolerance,fault-tolerant quantum computing,OAI-PMH Harvest,quantum computing,quantum error correction
Format
theses (aat)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Reichardt, Benjamin W. (committee chair), Brun, Todd A. (committee member), Chugg, Keith Michael (committee member), Levenson-Falk, Eli (committee member), Lidar, Daniel A. (committee member)
Creator Email
pprabhu@usc.edu,prithvirajprab@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113939699
Unique identifier
UC113939699
Identifier
etd-PrabhuPrit-12915.pdf (filename)
Legacy Identifier
etd-PrabhuPrit-12915
Document Type
Thesis
Rights
Prabhu, Prithviraj
Internet Media Type
application/pdf
Type
texts
Source
20240517-usctheses-batch-1151 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu