TOPICS IN QUANTUM CRYPTOGRAPHY, QUANTUM ERROR CORRECTION, AND CHANNEL SIMULATION

by

Zhicheng Luo

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (PHYSICS)

May 2009

Copyright 2009 Zhicheng Luo

Dedication

To my lovely wife

Acknowledgements

I am deeply grateful to my advisor, Igor Devetak, for being an excellent mentor. Besides Igor's persistent encouragement and support, I especially would like to thank him for his careful guidance in the early stage of my research, which helped me through the transition from experimental physics to theoretical information theory. His lucid way of thinking and his passion for getting things done in the "right way" had a major impact on my research.

I would like to thank Todd Brun for his advice and helpful discussions on many topics and on the project I did with him. When Igor was not around, Todd's door was always open for me, and his timely help is highly appreciated. Thanks also go to Stephan Haas for his advice and help with my graduate study at USC. I would like to extend my sincere appreciation to the other dissertation committee members, Daniel Lidar and Werner Dappen, for their experience and insight.

Special thanks go to my wonderful colleagues, Min-Hsiu Hsieh, Mark Wilde, Martin Varbanov, Ognyan Oreshkov, Shesha Raghunathan, Bilal Shaw, and Hari Krovi, for their warm support and invaluable discussions throughout my research life.

Finally and most importantly, I would like to thank my family for their love and support. This thesis is dedicated to my wife Hao Chen; it is she who makes my life much happier and my endeavor more meaningful.

Table of Contents

Dedication
Acknowledgements
List of Figures
Abstract

Chapter 1: Introduction
1.1 Quantum key distribution
1.2 Quantum error correction
1.3 Private communication over quantum channels
1.4 Channel simulation and rate-distortion theory
1.5 Outline

Chapter 2: Efficiently implementable codes for quantum key expansion
2.1 Notation
2.2 Reduction from entanglement-assisted entanglement distillation to quantum key expansion
2.3 Discussion

Chapter 3: Quantum error-correcting codes based on privacy amplification
3.1 Quantum codes based on affine two-universal hash functions
3.1.1 Private classical codes by two-universal hashing
3.1.2 The P-CSS code construction
3.1.3 Examples
3.2 Entanglement transmission over Pauli channels
3.3 Code performance on memoryless qubit Pauli channels
3.4 Discussion

Chapter 4: Secret Key Assisted Private Classical Capacity over Quantum Channels
4.1 Introduction
4.2 Notation
4.3 Main result
4.3.1 Classical-quantum channels
4.3.2 Generic quantum channels
4.4 Private Father Protocol
4.5 Conclusion

Chapter 5: Simulation with quantum side information
5.1 Notation
5.2 Channel simulation with quantum side information
5.3 Applications
5.3.1 Common randomness distillation
5.3.2 Rate-distortion trade-off with quantum side information
5.4 Bounds on quantum state redistribution
5.5 Discussion

Chapter 6: A unified approach for multi-terminal source coding problem
6.1 Connection between classical channel simulation and rate-distortion theory
6.2 Achievability of successive refinement, multiple descriptions, and multi-terminal source coding
6.2.1 Successive refinement
6.2.2 Multiple descriptions
6.2.3 Multi-terminal source coding
6.2.4 Revisit successive refinement

Chapter 7: Conclusion

Bibliography

Appendix

List of Figures

Figure 1: The Vernam cipher. Alice encrypts her message "JESSICA" by adding the random key bits, say "RJOMDCS", to her original message to get the encrypted message "ANGELES". After receiving the encrypted message through the public channel, where the eavesdropper can hear the message but cannot change it, Bob can decrypt the received message "ANGELES" to the true message "JESSICA" by subtracting the shared secret key "RJOMDCS".

Figure 2: The Bennett and Brassard 1984 (BB84) protocol.
Alice randomly chooses $|0\rangle$ or $|+\rangle$ to encode 0 and $|1\rangle$ or $|-\rangle$ to encode 1, and then they use the authenticated classical channel to determine which bits they used the same basis for. After throwing away the bits Bob measured in the wrong basis, Alice and Bob use some of the remaining qubits to estimate the eavesdropping level. If the level is below some predetermined threshold, they can further perform information reconciliation and privacy amplification procedures to obtain the secret key.

Figure 3: Binary symmetric channel.

Figure 4: A Venn diagram showing the relationship between entropies and mutual information. The mutual information $I(X;Y)$ corresponds to the intersection of the information in $X$ and $Y$, and the joint entropy $H(X,Y)$ corresponds to the union of the information in $X$ and $Y$.

Figure 5: A communication system for $n$ uses of memoryless discrete channels.

Figure 6: A schematic view of the anatomy of the private communication code: all the encoded classical sequences (represented as black dots) form an HSW code, which is a good transmission code for Bob; the encoded classical sequences are also divided into many privacy amplification sets, each of which covers Eve's typical subspace uniformly, so that Eve's local state is almost the same for each message and she hence knows nothing about the secret information.

Figure 7: Shannon's lossless source coding theorem in terms of noiseless channel simulation. To communicate the source sequence $X^n$ to Bob, Alice compresses it at a rate of the Shannon entropy by the encoding $E^n$ and sends the compressed sequence to Bob through noiseless communication channels. After receiving the data, Bob decompresses it to the original sequence $X^n$ by the decoding $D^n$ without loss.

Figure 8: The reverse Shannon theorem.
We can use common randomness $[cc]$ and a classical noiseless channel $[c \to c]$ to simulate a classical noisy channel $W^{X_A \to Y_B}$ with feedback $\Delta^{Y_B \to Y_A Y_B}$.

Figure 9: The classical-quantum Slepian-Wolf problem. We can use Bob's quantum side information to reduce the communication cost for replicating the source at Bob's side. The "typical" source sequences can be classified into many HSW codes defined by $\{\mathcal{C}_m, m \in \{0,1\}^{nR}\}$, with $\mathcal{C}_m = \{x^n(ms), s \in \{0,1\}^{nS}\}$. For each HSW code $\mathcal{C}_m$, the source sequences $\{x^n(ms)\}$ can be mapped into nearly orthogonal subspaces of Bob's "typical" subspace through the classical-quantum correlation of $XB$, and are thus distinguishable with high probability.

Figure 10: The relation of our results to prior work.

Figure 11: The multiple descriptions problem with three receivers, two of which receive individual descriptions and the third of which has access to both descriptions.

Figure 12: Successive refinement problem.

Figure 13: Multi-terminal source coding problem.

Figure 14: The construction of the full rank matrices $N_1$ and $N_2$.

Figure 15: The error parameter $\eta$ vs the quantum code rate $R_Q$ for the Gallager code with $n = 19839$, $k = 9839$, and column weight $t = 3$.

Figure 16: Private classical communication capacity region of the $\{c \to qq\}$ channel when assisted by pre-shared secret keys.

Figure 17: Private classical communication protocol assisted by pre-shared secret keys.

Figure 18: Achievable region of rate pairs for a classical-quantum system $XB$.

Figure 19: The covering lemma.

Figure 20: The EGC theorem is the multiple descriptions problem with two encoders and three decoders.
Here the sender sends two descriptions $\hat{X}^n_0$, $\hat{X}^n_1$ of the source $X^n$ to receiver 0 and receiver 1 at rates $R_0$ and $R_1$, respectively. Receiver 2 has access to both descriptions of receiver 0 and receiver 1 at rate $R_0 + R_1$, and he can reconstruct a better description $\hat{X}^n_2$.

Abstract

In this thesis, we mainly investigate four different topics: efficiently implementable codes for quantum key expansion [51], quantum error-correcting codes based on privacy amplification [48], private classical capacity of quantum channels [44], and classical channel simulation with quantum side information [49, 50].

For the first topic, we propose an efficiently implementable quantum key expansion protocol, capable of increasing the size of a pre-shared secret key by a constant factor. Previously, the Shor-Preskill proof [64] of the security of the Bennett-Brassard 1984 (BB84) [6] quantum key distribution protocol relied on the theoretical existence of good classical error-correcting codes with the "dual-containing" property. But no explicit and efficiently decodable construction of such codes is known. We show that the dual-containing constraint can be lifted by employing non-dual-containing codes with excellent performance and efficient decoding algorithms.

For the second topic, we propose a construction of Calderbank-Shor-Steane (CSS) [19, 68] quantum error-correcting codes, which are originally based on pairs of mutually dual-containing classical codes, by combining a classical code with a two-universal hash function. We show, using the results of Renner and Koenig [57], that the communication rates of such codes approach the hashing bound on tensor powers of Pauli channels in the limit of large block length.

For the third topic, we prove a regularized formula for the secret key assisted capacity region of a quantum channel for transmitting private classical information.
This result parallels the work of Devetak on entanglement-assisted quantum communication capacity. The formula yields a new family protocol, the private father protocol, within the resource inequality framework, which includes private classical communication without assisting secret keys as a child protocol.

For the fourth topic, we study and solve the problem of classical channel simulation with quantum side information at the receiver. Our main theorem has two important corollaries: rate-distortion theory with quantum side information and common randomness distillation. Simple proofs of achievability for classical multi-terminal source coding problems can be given via a unified approach using the channel simulation theorem as a building block. The fully quantum generalization of the problem is also conjectured, with outer and inner bounds on the achievable rate pairs.

Chapter 1: Introduction

In this chapter, I will introduce the necessary background for each of the following four topics: efficiently implementable codes for quantum key expansion, quantum error-correcting codes based on privacy amplification, secret key assisted private classical communication capacity over quantum channels, and classical channel simulation with quantum side information together with its application to multi-terminal source coding problems.

The first three topics are tightly related through the relation between entanglement and secret classical correlation. We can transform an (entanglement-assisted) entanglement purification protocol into a key distribution (expansion) protocol by some well-designed decoherification procedures. Meanwhile, given a quantum channel, we can also coherify a set of classical privacy amplification codes into a subclass of Calderbank-Shor-Steane (CSS) quantum error-correcting codes by performing all the steps coherently, replacing probabilistic mixtures by quantum superpositions.
Secret key assisted private classical communication is indeed a classical analogue of entanglement-assisted quantum communication, and we study the capacity region of private classical communication over quantum channels assisted by pre-shared secret keys. For the last topic, we systematically study the problem of simulating a classical channel using noiseless classical communication, common randomness, and quantum side information. The results can be further applied to simplify the achievability proofs of the multi-terminal source coding problems, as well as classical rate-distortion theory with quantum side information and common randomness distillation.

1.1 Quantum key distribution

In this section I will try to answer the following questions: what is quantum key distribution (QKD), and why should we care about it? How can we prove a QKD protocol is secure? Is there any connection between a secure QKD protocol and a quantum error-correcting code? I will start the discussion by briefly going through the history of cryptography.

Classical cryptography consists of two quite different methods: private key cryptography and public key cryptography. Originating thousands of years ago, private key cryptography requires the communicating parties, say Alice and Bob, to share a pair of secret keys. If Alice wants to send a secret message to Bob, she must encrypt her message with an encoding key, and Bob must have a corresponding decoding key to decrypt the encrypted message. A famous private key cryptosystem, shown in Figure 1, is the Vernam cipher, also known as the one-time pad. The Vernam cipher is provably secure if the shared secret key bits are as long as the message bits and are used only once. However, practical applications of this cryptosystem require a large amount of secret key shared by the communicating parties, and how to distribute these key bits privately is a big challenge.
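The letterwise Vernam cipher described above can be sketched in a few lines of Python. This is a toy illustration of my own (the helper names are not from the thesis), assuming uppercase A-Z messages and keys and addition/subtraction mod 26:

```python
def shift(msg, key, sign):
    """Shift each letter of msg by the matching key letter, mod 26.

    sign=+1 encrypts (bitwise-analogue of key addition);
    sign=-1 decrypts (key subtraction). Assumes uppercase A-Z.
    """
    return "".join(
        chr((ord(m) - 65 + sign * (ord(k) - 65)) % 26 + 65)
        for m, k in zip(msg, key))

def encrypt(msg, key):
    return shift(msg, key, +1)

def decrypt(ciphertext, key):
    return shift(ciphertext, key, -1)

key = "RJOMDCS"                 # the shared one-time secret key
ct = encrypt("JESSICA", key)
print(ct)                       # ANGELES
print(decrypt(ct, key))         # JESSICA
```

The security argument requires that `key` be uniformly random, as long as the message, and never reused; reusing a pad leaks the mod-26 difference of the two messages.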
Figure 1: The Vernam cipher. Alice encrypts her message "JESSICA" by adding the random key bits, say "RJOMDCS", to her original message (letterwise addition mod 26) to get the encrypted message "ANGELES". After receiving the encrypted message through the public channel, where the eavesdropper can hear the message but cannot change it, Bob can decrypt the received message "ANGELES" to the true message "JESSICA" by subtracting the shared secret key "RJOMDCS" (letterwise subtraction mod 26).

Without facing the private key distribution problem, public key cryptography has developed quickly since it was invented in the 1970s, and it is nowadays widely used for all sorts of security applications. Unlike private key cryptography, which requires the same secret keys for both encryption and decryption, public key cryptography is an asymmetric cryptography in which the key used to encrypt a message differs from the key used to decrypt it. Specifically, Bob has a pair of keys: a public key, which can be published to the world for information encryption, and a private key, which is kept secret for decryption. If Alice wants to send a secret message to Bob, she can use Bob's published public key to encrypt her original message, which only Bob can decrypt with his private key. The great feature of public key cryptography is that people without pre-shared secret keys can communicate privately. All they have to do is publish their public keys, and the need for large-scale secret key distribution is fully eliminated. What a beautiful idea! Unfortunately, this cryptosystem is not perfectly secure. The public key and the private key are mathematically related, and the security of public key cryptography relies on the difficulty of inverting certain one-way functions.
Although it is computationally infeasible (for classical computers) to infer the private key from the public one, theoretically it is possible to break the cryptosystem using quantum computers.

To find a provably secure cryptosystem, we still have to go back to private key cryptography and face the problem of secret key distribution. At this point, quantum key distribution comes to our attention. Quantum key distribution offers many different ways to enable secure distribution of private information, guaranteed by the principles of quantum mechanics. Unlike in classical communication, the two communicating parties of quantum key distribution can detect the presence of an eavesdropper, based on the fact that the eavesdropper cannot measure an arbitrary quantum system without disturbing it. If the eavesdropping level is lower than some predetermined threshold, a shared secure key can be generated with nothing leaking to the eavesdropper; otherwise, the eavesdropper can gain certain information about the key and the protocol shall be aborted. Two important classical techniques of private key cryptography are used in quantum cryptography: information reconciliation (IR), which is used to detect and remove errors in the shared key, and privacy amplification (PA), which is used to eliminate correlations between the shared key and the information gathered by the eavesdropper.

More specifically, quantum key distribution (QKD) allows two distant parties, Alice and Bob, to establish a secret key using one-way quantum communication and public classical communication. This key is provably secure from an all-powerful eavesdropper Eve, who is allowed to intercept the quantum communication, perform block processing of quantum data, and listen to the public discussion.
QKD owes its security to two facts: 1) Alice and Bob, by performing tomography on their (quantum) data, automatically obtain information about Eve's (quantum) data; 2) with this knowledge, Alice and Bob can perform IR and PA to distill a key which is common to both (by IR), and about which Eve knows next to nothing (by PA). In Chapter 2 we solve the practical question of constructing efficiently implementable codes for IR and PA.

The best known QKD protocol, BB84, was proposed by Bennett and Brassard in 1984 [6] (as shown in Figure 2). BB84 is a simple "prepare-and-measure" protocol which can be implemented without a quantum computer or quantum memory, as follows:

(1) Alice encodes a random bit either in the $Z$ basis $\{|0\rangle, |1\rangle\}$ or the $X$ basis $\{|+\rangle, |-\rangle\}$ (with $|\pm\rangle = \frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle)$) of a qubit system.

(2) Alice sends the qubit to Bob.

(3) Bob performs a measurement in one of the two bases, chosen at random.

(4) After repeating this many times, Alice and Bob determine by public discussion which bits they chose the same basis for, thus establishing a raw key.
Shor and Preskill [64] gave the first simple proof of the security of standard BB84, by relating the IR and PA steps to Calderbank-Shor-Steane (CSS) quantum error- correcting codes [19, 68]. Their proof is based on the following two-step reductions: 1. Reduction from an entanglement distillation protocol to a CSS code protocol. 2. Reduction from a CSS code protocol to a secure BB84 QKD protocol. The basic idea behind the proof is that we can transform an entanglement distillation protocolintoaquantumkeydistributionprotocolbyproperlydesigneddecoherification procedures. The privacy of the generated common key is guaranteed by the fact that the distilled entanglement is a pure state, and therefore decoupled from the rest of the world, so that Eve can have no information about the common key, which is formed by Z basis measurements on Alice and Bob’s qubits. 5 Before we get into the discussion of protocol reduction, we shall first briefly in- troduce the Calderbank-Shor-Steane codes while the formal introduction of general quantum error-correcting codes is postponed to the next section. A. Introduction to CSS codes A CSS code can be constructed from two dual-containing classical linear codes. We therefore start by defining classical linear codes. An [n,k] classical linear code C, encoding k logical bits into n physical bits, is a k-dimensional linear subspace of Z n 2 . It can be given either as the column space of an n×k generator matrix G, so that C = {Gy : y ∈ Z k 2 }, where all vectors are assumed to be column vectors, or the null space of an (n−k)×n parity check matrix H. The row space of H is the dual code C ⊥ ofC, i.e. the dual ofC consists of all the codewordsy such thaty is orthogonal to all the codewords in C. The interpretation is that Alice encodes her k-bit message y into then-bit codewordx=Gy. This is sent down a noisyn-bit channel to Bob, who then tries to decode the original message y. 
A quantum code is called an $[[n,k]]$ stabilizer code if it is the simultaneous $+1$ eigenspace of a set of $n-k$ independent $n$-qubit Pauli operators (called the "stabilizer generators"). A CSS code is a stabilizer code each stabilizer generator of which is composed of tensor products of $Z$ and $I$ operators, or of $X$ and $I$ operators, where the Pauli operators $\{I, X, Y, Z\}$ are defined by

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

A CSS code is defined by two dual-containing classical linear codes $[n,k_1]$ and $[n,k_2]$, called $C_1$ and $C_2$, with parity check matrices $H_1$ and $H_2$, respectively, such that $C_2^\perp \subseteq C_1$. The set of stabilizer generators of the CSS code is given by $\{Z^{H_1}, X^{H_2}\}$, where $Z^H = \{Z^h : h \text{ is a row of } H\}$, etc., and $Z^h = Z^{h_1} \otimes \cdots \otimes Z^{h_n}$ for $h = (h_1, \ldots, h_n)$. Bit flip errors that anti-commute with $Z^{H_1}$ are detected by measuring the $Z$ stabilizers and corrected by applying a corresponding recovery unitary. Similarly, phase flip errors can be corrected via the $X$ stabilizers. Please refer to the next section for details.

A CSS code $\mathcal{C}$ can alternatively be given in terms of the codewords

$$|x + C_2^\perp\rangle = \frac{1}{\sqrt{|C_2^\perp|}} \sum_{y \in C_2^\perp} |x + y\rangle, \qquad (1)$$

where $x$ runs over the elements of $C_1$ and $+$ is bitwise addition modulo 2. It is easy to check that the codewords $|x + C_2^\perp\rangle$ are the $+1$ eigenstates of the stabilizers $\{Z^{H_1}, X^{H_2}\}$. It is also easy to see that $|x + C_2^\perp\rangle$ depends only on the coset of $C_1/C_2^\perp$ to which $x$ belongs, so we can let $x$ run only over suitably chosen coset representatives $\{x_s\}$. The number of cosets of $C_2^\perp$ in $C_1$ is $|C_1|/|C_2^\perp| = 2^m$ with $m = k_1 + k_2 - n$, and therefore $\mathcal{C}$ is an $[[n,m]]$ quantum code, encoding $m$ logical qubits into $n$ physical qubits. Here double square brackets are used to distinguish quantum codes from classical codes.

Furthermore, the codes $\mathcal{C}_{b,p}$ defined by the codewords

$$|x + C_2^\perp\rangle_{b,p} = \frac{1}{\sqrt{|C_2^\perp|}} \sum_{y \in C_2^\perp} (-1)^{y \cdot p} |x + y + b\rangle$$

and parameterized by $(b,p)$ are different eigenspaces of the stabilizers $\{Z^{H_1}, X^{H_2}\}$.
Therefore, they are equivalent to the CSS code $\mathcal{C}$ defined by (1) in the sense that they have the same error-correcting properties, a fact that will be used later for the protocol reduction.

B. The security proof of BB84 by CSS codes

In this subsection, I will show how to use CSS codes to do entanglement distillation over quantum channels, and then I will show how to properly decoherify an entanglement distillation protocol into a secure BB84 quantum key distribution protocol.

As shown in [54], there exist $2^m$ distinct $x_s$, $2^{n-k_1}$ distinct $b$, and $2^{n-k_2}$ distinct $p$ such that the states $\{|x_s + C_2^\perp\rangle_{b,p}\}$ form an orthonormal basis for the $2^n$-dimensional Hilbert space. Therefore, we can express $n$ perfect Einstein-Podolsky-Rosen (EPR) pairs $|\Phi\rangle^{\otimes n}$ shared between Alice and Bob as

$$|\Phi\rangle^{\otimes n}_{AB} = \frac{1}{\sqrt{2^n}} \sum_{x_s} \sum_{b,p} |x_s + C_2^\perp\rangle^A_{b,p}\, |x_s + C_2^\perp\rangle^B_{b,p}, \qquad (2)$$

where an EPR pair shared between Alice and Bob is described by the pure state

$$|\Phi\rangle_{AB} = \frac{1}{\sqrt{2}} \left( |0\rangle_A |0\rangle_B + |1\rangle_A |1\rangle_B \right).$$

As shown in [39], without loss of generality we can assume that Eve effects a Pauli channel, i.e. a quantum channel that applies elements of the Pauli group $\{X^{b_1} Z^{p_1} \otimes \cdots \otimes X^{b_n} Z^{p_n}\}$, chosen with particular probabilities $\{p_{b,p}\}$, where $b = b_1 \cdots b_n$ and $p = p_1 \cdots p_n$. Suppose Alice initially prepares $n$ perfect EPR pairs and sends half of each through the Pauli channel; then with probability $p_{b_e,p_e}$ Alice and Bob will end up with the state

$$(I^A \otimes X^{b_e} Z^{p_e}) |\Phi\rangle^{\otimes n}, \qquad (3)$$

where $b_e$ and $p_e$ are the corresponding bit flip and phase flip error vectors. By measuring the stabilizer operators $Z^{H_1}$ and $X^{H_2}$ on both sides, they can obtain the syndromes $(b,p)$ and $(b + b_e, p + p_e)$ and project the state (3) into

$$\frac{1}{\sqrt{2^m}} \sum_{x_s} |x_s + C_2^\perp\rangle^A_{b,p}\, |x_s + C_2^\perp\rangle^B_{b+b_e,\, p+p_e}, \qquad (4)$$

which is an encoded version of $|\Phi\rangle^{\otimes m}$ in the CSS code $\mathcal{C}_{b,p}$. After Alice sends her error syndrome $(b,p)$ to Bob, Bob can correct the errors $(b_e, p_e)$, and they can apply a local unitary to transform their state into $m$ perfect EPR pairs.
Since $|\Phi\rangle^{\otimes m}$ is a pure state decoupled from the rest of the world, the common key formed by $Z$ basis measurement will also be decoupled and hence secure.

For simplicity, we assume that the channel estimation has been done. The reduction from the above entanglement distillation protocol to a CSS code protocol is based on the following observations:

1) Measuring the stabilizers $\{Z^{H_1}, X^{H_2}\}$ on (3) gives Alice a random syndrome $(b,p)$ and projects the whole bipartite state into (4). This is equivalent to Alice choosing $(b,p)$ randomly at the beginning and sending $\frac{1}{\sqrt{2^m}} \sum_{x_s} |x_s + C_2^\perp\rangle_{b,p} |x_s + C_2^\perp\rangle_{b,p}$ through the channel instead of $|\Phi\rangle^{\otimes n}$.

2) Alice's $Z$ basis measurement on the distilled EPR pairs gives her a random coset number, which is equivalent to her choosing a random $x_s$ at the beginning and sending only $|x_s + C_2^\perp\rangle_{b,p}$ to Bob.

The reduction from a CSS code protocol to a secure BB84 protocol is based on the following observations:

1) We do not care about phase errors, since they will not change the key bit values. So we can drop the phase error correction, decode classically by measuring the bits in the $Z$ basis, and find the key by calculating the coset number.

2) Since Alice does not need to reveal $p$, the state she effectively sends is

$$\frac{1}{2^n} \sum_{p \in \{0,1\}^n} |x_s + C_2^\perp\rangle_{b,p}\, \langle x_s + C_2^\perp|_{b,p} = \frac{1}{|C_2^\perp|} \sum_{y \in C_2^\perp} |x_s + y + b\rangle \langle x_s + y + b| ,$$

which can be encoded classically. By further subtle arguments, omitted here for simplicity, one can show that this can be fully transformed into a secure BB84 protocol using only single-qubit operations, without requiring quantum memory or quantum computation. In fact, as shown later in Chapter 2, we can make a direct reduction from an entanglement distillation protocol to a secure BB84 protocol entirely in the stabilizer formalism, without invoking the explicit codeword construction of CSS codes.
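As a concrete numerical instance of the dual-containing CSS construction (my own check, not an example from the thesis): taking $C_1 = C_2$ to be the $[7,4]$ Hamming code, the condition $C_2^\perp \subseteq C_1$ is equivalent to $H H^T = 0 \pmod 2$, and the construction yields the $[[7,1]]$ Steane code:

```python
import numpy as np
from itertools import product

# Parity-check matrix of the [7,4] Hamming code.
H = np.array([[1,0,1,0,1,0,1],
              [0,1,1,0,0,1,1],
              [0,0,0,1,1,1,1]])

# C = null space of H over GF(2): all x with Hx = 0 (mod 2).
C = [np.array(x) for x in product([0, 1], repeat=7)
     if not (H @ np.array(x) % 2).any()]
assert len(C) == 2**4                    # [7,4]: 16 codewords

# C_perp = row space of H: 2^3 = 8 dual codewords.
C_perp = [(np.array(c) @ H) % 2 for c in product([0, 1], repeat=3)]

# Dual-containing check, done two ways: explicitly (every dual
# codeword lies in C) and via the shortcut H H^T = 0 (mod 2).
assert all(any((d == c).all() for c in C) for d in C_perp)
assert not (H @ H.T % 2).any()

# CSS parameters with C1 = C2 = the Hamming code:
n, k1, k2 = 7, 4, 4
m = k1 + k2 - n                          # logical qubits
print(f"[[{n},{m}]] quantum code")       # [[7,1]]: the Steane code
```

The same script, applied to the parity-check matrix of a candidate modern code, is exactly the test that fails for typical LDPC or turbo codes; lifting that requirement is the point of Chapter 2.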
Note that the Shor-Preskill proof of the security of the BB84 quantum key distribution protocol relies on the theoretical existence of good classical error-correcting codes with the "dual-containing" property. In order to securely implement the BB84 protocol, we need to find good mutually dual-containing codes of large block length $n$. These are known to exist in principle, by the Gilbert-Varshamov bound for CSS codes [19, 67]. Unfortunately, no explicit constructions are known, let alone ones that would be simple to decode. The main result of Chapter 2 is to show that the dual-containing condition can be lifted. This permits us to employ excellent, efficiently decodable modern classical codes such as LDPC codes [37] and turbo codes [15]. The price we have to pay is that our protocol becomes an efficiently implementable quantum key expansion (QKE) protocol, performing expansion of a pre-shared key, rather than a key distribution (QKD) protocol, creating one from scratch. This is not much of a drawback, as existing QKD protocols require a logarithmic amount of pre-shared key to authenticate the public discussion. Still, we choose to make this distinction, as in our case the pre-shared key is linear in the quantum communication cost. The security of a QKE protocol is based on an entanglement-assisted entanglement distillation protocol, which is closely related to the entanglement-assisted quantum codes of Brun, Devetak and Hsieh [18, 17].

1.2 Quantum error correction

In this section I will first briefly go through some basics of quantum mechanics, and then I will give a short introduction to the stabilizer formalism of quantum error correction. Some examples of quantum error-correcting codes, together with their error-correcting properties, will be discussed. Finally, I will focus on how to design quantum error-correcting codes from a good classical error-correcting code and a privacy amplification protocol.
I would like to begin the discussion with a simple classical error-correcting code, the repetition code. Consider a binary symmetric channel, as shown in Figure 3, with $\{0,1\}$ being both the input and output alphabet, and suppose we want to send a bit of information through the channel. With probability $p$ the channel will flip the bit from 0 to 1 or from 1 to 0, while with probability $1-p$ the bit is transmitted without error. In general, to protect the information bit against the noise, we can encode the bit by adding some redundant information, in such a way that even if part of the encoded message, or codeword, is corrupted by noise, we are still able to recover the original message from the partially corrupted codeword.

For a $[3,1]$ repetition code, we can encode $0 \to 000$ and $1 \to 111$ by repeating the information bit three times; i.e., we encode one logical bit into three physical bits. Given that $p$ is small (say $p = 0.1$) and the channel is memoryless (i.e., the channel is identical for every channel use), if we send the codeword 000 through the channel, with probability $(1-p)^3 + 3p(1-p)^2 \approx 0.97$ the output message will be one of $\{000, 001, 010, 100\}$. So if we observe 001 as the channel output, it is highly likely that the last bit was flipped by the channel and that 0 was the information bit. By using the repetition code, the probability of error has been decreased from 0.1 to less than 0.03.

Figure 3: Binary symmetric channel.

Based on the same idea, we can develop quantum error-correcting codes to protect quantum states against noise by adding some redundant information. But before I formally state the stabilizer formalism of quantum error correction, I shall give a short introduction to some basics of quantum mechanics.

A. Basics of quantum mechanics

A closed physical system can be described by a pure state $|\psi\rangle$, which is a unit complex vector in a Hilbert space.
Given an orthonormal basis {|i⟩} of the Hilbert space H, the state of the system can be written as |ψ⟩ = Σ_i ψ_i |i⟩ with Σ_i |ψ_i|^2 = 1, and the evolution of the system can be described by a unitary transformation as

|ψ(t_2)⟩ = U(t_1, t_2) |ψ(t_1)⟩ .

If the exact state of a system is unknown but belongs to a pure-state ensemble {p_i, |ψ_i⟩}, i.e. the system could be in one of the states |ψ_i⟩ with probability p_i, then we can describe the system by a density operator or density matrix ρ = Σ_i p_i |ψ_i⟩⟨ψ_i|. The definition of density operators can be generalized to a mixed-state ensemble {p_i, ρ_i} by ρ = Σ_i p_i ρ_i. The density operator ρ, characterized by the two conditions that Tr ρ = 1 (i.e. the sum of the diagonal elements of ρ equals one) and that ρ is a positive operator (i.e. ⟨ψ|ρ|ψ⟩ ≥ 0 for any |ψ⟩), is the more general concept. The time evolution of a system can then be described as

ρ(t_2) = U(t_1, t_2) ρ(t_1) U(t_1, t_2)† .

A quantum measurement can be described by a set of measurement operators {M_i} acting on the Hilbert space of the system and satisfying the completeness relation Σ_i M_i† M_i = I, where M_i† is the Hermitian conjugate of M_i and the index i denotes the measurement outcome. Given a system in state ρ, the measurement outcome i occurs with probability p(i) = Tr(M_i ρ M_i†), and the system is left in the state ρ_i = M_i ρ M_i† / Tr(M_i ρ M_i†). Define Λ_i = M_i† M_i; then it is easy to see that Λ_i is a positive operator such that Σ_i Λ_i = I and p(i) = Tr(ρ Λ_i). The set of positive operators {Λ_i} is called a positive operator-valued measure (POVM). An important example of a POVM is a projective measurement described by a set of projectors {P_i}, where P_i† = P_i and P_i P_j = δ_ij P_i for any i, j, with δ_ij equal to 1 for i = j and 0 otherwise. The completeness relation becomes Σ_i P_i = I.

The state space H_AB of a composite physical system AB is the tensor product H_A ⊗ H_B of the state spaces of the component physical systems.
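The measurement rules above are easy to check numerically. The following sketch (my own illustration, not the thesis's) measures a qubit in state ρ = |+⟩⟨+| with the computational-basis projectors and verifies the completeness relation and p(i) = Tr(ρ Λ_i).

```python
import numpy as np

# A qubit density matrix: the pure state |+> = (|0> + |1>)/sqrt(2).
plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = np.outer(plus, plus.conj())

# Projective measurement in the computational basis {|0><0|, |1><1|}.
P0 = np.diag([1.0, 0.0])
P1 = np.diag([0.0, 1.0])

# Completeness relation: sum_i P_i = I.
assert np.allclose(P0 + P1, np.eye(2))

# Outcome probabilities p(i) = Tr(rho P_i); each outcome occurs with probability 1/2.
p = [np.trace(rho @ Pi).real for Pi in (P0, P1)]

# Post-measurement state for outcome 0: P0 rho P0 / p(0), which is |0><0|.
rho0 = P0 @ rho @ P0 / p[0]
```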
The subsystem A of a composite system AB in state ρ_AB is described by the reduced density operator ρ_A ≡ Tr_B(ρ_AB), where Tr_B is an operation known as the partial trace over system B, defined by

Tr_B(|a_1⟩⟨a_2|_A ⊗ |b_1⟩⟨b_2|_B) ≡ |a_1⟩⟨a_2|_A Tr(|b_1⟩⟨b_2|_B) = ⟨b_2|b_1⟩ |a_1⟩⟨a_2|_A .

B. Stabilizer formalism of quantum error correction

In this subsection, I will introduce the stabilizer formalism of quantum error correction and the corresponding error-correction conditions. I will start by discussing the mathematical formalism of quantum operations, a powerful tool for describing noisy quantum channels. We are interested in the following two equivalent representations: the system-environment representation and the operator-sum (or Kraus) representation. In the system-environment approach, a quantum noise operation N : H_Q → H_Q′ is realized through an isometry U_N : H_Q → H_Q′E followed by tracing out the environment E:

N(ρ) = Tr_E(U_N ρ U_N†) .

Since the isometry U_N captures the information accessible to the environment, I will use this representation to describe a Pauli channel later in Chapter 3.

In the Kraus representation, a trace-preserving noisy quantum channel N can be (non-uniquely) described by a set of operation elements {E_i} as

N(ρ) = Σ_i E_i ρ E_i† ,

with Σ_i E_i† E_i = I. As discussed in [54, 18], each E_i may be expanded as a linear combination of tensor products of single-qubit Pauli operators, so we can discretize a noisy quantum operation N by an error set E such that if every Pauli error in E is correctable then the noise operation N is correctable. This is a very powerful simplification! Now it is time to introduce the stabilizer formalism of quantum error correction.
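As a sketch of the Kraus representation (my own illustration), the bit-flip channel with flip probability p has operation elements E_0 = √(1−p) I and E_1 = √p X; the code below checks the completeness relation and applies the channel to |0⟩⟨0|.

```python
import numpy as np

p = 0.1
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

# Kraus operators of the bit-flip channel: identity with prob 1-p, X with prob p.
E0 = np.sqrt(1 - p) * I2
E1 = np.sqrt(p) * X

# Trace preservation: sum_i E_i^dagger E_i = I.
assert np.allclose(E0.conj().T @ E0 + E1.conj().T @ E1, I2)

def channel(rho):
    """Operator-sum form N(rho) = sum_i E_i rho E_i^dagger."""
    return E0 @ rho @ E0.conj().T + E1 @ rho @ E1.conj().T

rho_in = np.diag([1.0, 0.0])      # |0><0|
rho_out = channel(rho_in)         # diag(1-p, p): the bit is flipped with probability p
```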
The set of n-fold tensor products of single-qubit Pauli operators {I, X, Y, Z}^⊗n, together with the multiplicative factors ±1, ±i, forms the n-fold Pauli group G_n, whose elements can be identified with the possible error sets mentioned above. Suppose S is an abelian subgroup of G_n; its corresponding stabilizer code C(S) is defined as

C(S) = {|ψ⟩ : g|ψ⟩ = |ψ⟩, ∀g ∈ S} ,

i.e. the code C(S) is the simultaneous +1 eigenspace of all elements of S. Since C(S) is the subspace stabilized by S, S is called the stabilizer of the code. A nontrivial subspace C(S) requires that the elements of S commute and that −I^⊗n is not in S. For an [[n,k]] stabilizer code, C(S) encodes k logical qubits into n physical qubits, with dim(C(S)) = 2^k, and S has 2^{n−k} distinct elements [54]. Therefore, given the fact that a group can be specified by a set of independent generators, the abelian subgroup S above can be succinctly described by n−k independent generators {g_i}. The great feature of this compact representation is that, to see whether a state is stabilized by the group S, it suffices to check only its generators {g_i}, instead of checking all the elements of S.

Now suppose a quantum noise operation is discretized by an error set E = {E_a} ⊂ G_n and C(S) is an [[n,k]] stabilizer code with n−k independent generators {g_i}. How can we tell whether E is a correctable error set for C(S)? Informally, if the error E_a moves the code space to an orthogonal subspace, it can be detected by measuring {g_i}. In the stabilizer language, if E_a anti-commutes with some generator g_i, then for |ψ⟩ ∈ C(S)

g_i E_a |ψ⟩ = −E_a g_i |ψ⟩ = −E_a |ψ⟩ ,

which means that E_a|ψ⟩ is a −1 eigenstate of g_i and hence is orthogonal to the code space C(S). For the error E_a, we can measure all the generators {g_i} to obtain an error syndrome and apply some unitary operation to correct it. Therefore, we can correct any error E_a that anti-commutes with at least one of the stabilizer generators {g_i}.
What about errors that commute with the stabilizer generators? If E_a ∈ S, the error E_a does not corrupt C(S) at all. But if E_a commutes with all the elements of S while not itself being in S, we cannot detect the presence of E_a by measuring the generators, and therefore it is not correctable. Define Z(S) to be the centralizer of S, the set of elements in G_n that commute with S. Then for E_a ∈ Z(S)−S and |ψ⟩ ∈ C(S), E_a|ψ⟩ ≠ |ψ⟩ but

g_i E_a |ψ⟩ = E_a g_i |ψ⟩ = E_a |ψ⟩ .

So the error set Z(S)−S is not correctable. All of this discussion can be summarized in the following theorem.

Theorem 1 (Error-correction conditions) A stabilizer code C(S) can correct a set of errors E if and only if E_a† E_b ∈ S ∪ (G_n − Z(S)) for all E_a, E_b ∈ E.

Although the error-correction conditions of stabilizer codes tell you how to judge whether an error set E is correctable for C(S), they do not tell you how to find a stabilizer code. Building a "good" quantum error-correcting code is a sophisticated problem.

C. Examples of quantum error-correcting codes

Having built up the theoretical framework for quantum error correction, I will go through a few simple examples of well-known quantum codes: the three-qubit bit flip and phase flip codes, the nine-qubit Shor code, and the seven-qubit Steane code.

1) The three-qubit bit flip and phase flip codes

The codewords for the [[3,1]] bit flip code are

|0_L⟩ = |000⟩ ,  |1_L⟩ = |111⟩ ,

where |0_L⟩ and |1_L⟩ are the logical |0⟩ and logical |1⟩ states, not the physical zero and one states. The set {I, X_1, X_2, X_3} forms a correctable error set E for the [[3,1]] bit flip code with stabilizer generators {Z_1 Z_2, Z_2 Z_3}, where X_i / Z_i denotes an X / Z error occurring on the ith qubit. This code can correct any single-qubit bit flip error. The codewords for the [[3,1]] phase flip code are

|0_L⟩ = |+++⟩ ,  |1_L⟩ = |−−−⟩ ,

with |±⟩ ≡ (|0⟩ ± |1⟩)/√2 being the ±1 eigenstates of the X operator.
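For the bit flip code, measuring the generators Z_1Z_2 and Z_2Z_3 on a computational-basis state amounts classically to computing the parities of qubit pairs (1,2) and (2,3). The following sketch (my own, not from the thesis) extracts this syndrome and corrects any single bit flip.

```python
# Syndrome table for the [[3,1]] bit flip code: a syndrome bit is 1 exactly when
# the corresponding generator (Z1Z2 or Z2Z3) anti-commutes with the error, which
# classically is just the parity of that pair of bits.
SYNDROME_TO_FLIP = {
    (0, 0): None,  # no error (or an undetectable one)
    (1, 0): 0,     # X on qubit 1 flips only the (1,2) parity
    (1, 1): 1,     # X on qubit 2 flips both parities
    (0, 1): 2,     # X on qubit 3 flips only the (2,3) parity
}

def syndrome(bits):
    """Parities checked by the generators Z1Z2 and Z2Z3."""
    return (bits[0] ^ bits[1], bits[1] ^ bits[2])

def correct(bits):
    """Apply the correction dictated by the syndrome."""
    fixed = list(bits)
    pos = SYNDROME_TO_FLIP[syndrome(bits)]
    if pos is not None:
        fixed[pos] ^= 1
    return fixed

# Any single bit flip on the codeword 000 is corrected back to 000.
assert all(correct([1 if i == j else 0 for i in range(3)]) == [0, 0, 0]
           for j in range(3))
```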
The set {I, Z_1, Z_2, Z_3} forms a correctable set of errors for the phase flip code, with stabilizer generators {X_1 X_2, X_2 X_3}. This code can correct any single-qubit phase flip error.

2) The nine-qubit Shor code

The [[9,1]] Shor code is a combination of the three-qubit phase flip and bit flip codes. The Shor code is constructed by a method called concatenation: (1) encode the logical qubit with the [[3,1]] phase flip code: |0⟩ → |+++⟩, |1⟩ → |−−−⟩; (2) encode each qubit of the phase flip code with the [[3,1]] bit flip code: |±⟩ → (|000⟩ ± |111⟩)/√2. Thus the codewords of the nine-qubit Shor code are

|0_L⟩ = (1/√8)(|000⟩+|111⟩)(|000⟩+|111⟩)(|000⟩+|111⟩) ,
|1_L⟩ = (1/√8)(|000⟩−|111⟩)(|000⟩−|111⟩)(|000⟩−|111⟩) .

The [[9,1]] Shor code has stabilizer generators

Z_1 Z_2, Z_2 Z_3, Z_4 Z_5, Z_5 Z_6, Z_7 Z_8, Z_8 Z_9, X_1 X_2 X_3 X_4 X_5 X_6, X_4 X_5 X_6 X_7 X_8 X_9 ,

with correctable error set E = {I, X_i, Y_i, Z_i, i = 1, 2, ..., 9}. This code can correct any single-qubit error. As the code block length n becomes larger and the code construction becomes more complicated, it becomes ever more difficult to write out the codewords explicitly, and expressing quantum codes compactly in the stabilizer formalism becomes ever more attractive.

3) The seven-qubit Steane code

Calderbank-Shor-Steane (CSS) quantum error-correcting codes exhibit a good way of constructing quantum codes using only good classical error-correcting codes with the "dual-containing" property. A simple example of a CSS code is the [[7,1]] Steane code, which, like the [[9,1]] Shor code, can correct any single-qubit error but requires fewer physical qubits. If we choose both classical linear codes C_1 and C_2 to be the [7,4] Hamming code with generator matrix G and parity check matrix H,

G^T = | 1 0 0 0 0 1 1 |
      | 0 1 0 0 1 0 1 |
      | 0 0 1 0 1 1 0 |
      | 0 0 0 1 1 1 1 | ,

H = | 0 0 0 1 1 1 1 |
    | 0 1 1 0 0 1 1 |
    | 1 0 1 0 1 0 1 | ,

then C_2⊥ = {H^T y : y ∈ Z_2^3} and C_1 = C_2⊥ ∪ {1111111 + C_2⊥}.
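Since C_1 = C_2 here, the dual-containing property C_2⊥ ⊆ C_1 is equivalent to H H^T = 0 (mod 2), i.e. every row of H is itself a Hamming codeword. A quick numerical check (my own sketch):

```python
import numpy as np

# Parity check matrix of the [7,4] Hamming code, as given above.
H = np.array([
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
])

# Self-orthogonality: every pair of rows has even overlap, so H @ H.T vanishes
# mod 2 and the rows of H (which generate C2_perp) are codewords of C1 itself.
assert ((H @ H.T) % 2 == 0).all()

# The all-ones word 1111111 also has zero syndrome, consistent with
# C1 = C2_perp ∪ (1111111 + C2_perp).
ones = np.ones(7, dtype=int)
assert ((H @ ones) % 2 == 0).all()
```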
As shown in the last section, the [[7,1]] Steane code can be constructed with codewords

|0_L⟩ = (1/√8)[|0000000⟩+|1010101⟩+|0110011⟩+|1100110⟩+|0001111⟩+|1011010⟩+|0111100⟩+|1101001⟩] ,
|1_L⟩ = (1/√8)[|1111111⟩+|0101010⟩+|1001100⟩+|0011001⟩+|1110000⟩+|0100101⟩+|1000011⟩+|0010110⟩] .

The stabilizers of the [[7,1]] Steane code are

{Z^H, X^H} = { Z_4 Z_5 Z_6 Z_7, Z_2 Z_3 Z_6 Z_7, Z_1 Z_3 Z_5 Z_7, X_4 X_5 X_6 X_7, X_2 X_3 X_6 X_7, X_1 X_3 X_5 X_7 } ,

which can also be written in the more compact form {Z^{H_1}, X^{H_2}}, with correctable error set E = {I, X_i, Y_i, Z_i, i = 1, 2, ..., 7}, where H_1 = H_2 = H is the parity check matrix of the [7,4] Hamming code, Z^H = {Z^h : h is a row of H}, etc., and Z^h = Z^{h_1} ⊗ ··· ⊗ Z^{h_n} for h = (h_1, ..., h_n). The benefit of the CSS code construction is that we can directly construct good quantum codes from good dual-containing classical codes. However, how to find a good classical code with large block length and a self-orthogonal parity check matrix remains mysterious. Furthermore, efficiently decodable modern codes such as low-density parity check (LDPC) codes are generally not dual-containing, and a way to apply the excellent classical coding algorithms to the quantum setting is highly desirable. In Chapter 3, I will discuss a method of building "good" quantum error-correcting codes using these efficiently decodable modern codes.

D. Quantum codes based on privacy amplification

By revisiting the construction of CSS codes from the perspective of quantum key distribution, I will present the basic ideas of building "good" quantum error-correcting codes by combining a good classical linear code with a privacy amplification protocol. Recall that a CSS code can be constructed from two classical linear codes C_1 and C_2 with C_2⊥ ⊆ C_1, such that (for x running over C_1) the codewords take the form

|x + C_2⊥⟩ = (1/√|C_2⊥|) Σ_{y ∈ C_2⊥} |x + y⟩ .
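The superposition |0_L⟩ above is exactly the uniform superposition over the coset 0 + C_2⊥, i.e. over the GF(2) span of the rows of the Hamming parity check matrix H. The following sketch (my own) enumerates that span and recovers the eight strings appearing in |0_L⟩.

```python
from itertools import product

# Rows of the [7,4] Hamming parity check matrix H; their span over GF(2) is C2_perp.
rows = [(0, 0, 0, 1, 1, 1, 1),
        (0, 1, 1, 0, 0, 1, 1),
        (1, 0, 1, 0, 1, 0, 1)]

def gf2_span(generators):
    """All GF(2) linear combinations of the generator rows."""
    words = set()
    for coeffs in product([0, 1], repeat=len(generators)):
        word = [0] * 7
        for c, g in zip(coeffs, generators):
            if c:
                word = [a ^ b for a, b in zip(word, g)]
        words.add(''.join(map(str, word)))
    return words

C2_perp = gf2_span(rows)

# The eight kets appearing in |0_L> of the Steane code.
expected = {'0000000', '1010101', '0110011', '1100110',
            '0001111', '1011010', '0111100', '1101001'}
assert C2_perp == expected
```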
The construction of a CSS code can be decomposed into two parts: a regular classical code C_1 works as an information reconciliation code to correct bit flip errors, and the dual code C_2⊥ works as a privacy amplification code [64] to correct phase flip errors. If we can relax the need for a dual code to a less restrictive privacy amplification protocol, we may construct a quantum code from a non-dual-containing classical code such as an LDPC code.

As we can see from the last section, secret classical information and quantum information are intimately related [20, 61]. One can convert a maximally entangled state shared by two distant parties into a secret classical key by local bilateral measurements in the standard basis, and thus one converts an entanglement distillation protocol based on Calderbank-Shor-Steane (CSS) quantum error-correcting codes [19, 68] into a secure quantum key distribution protocol [64]. The correspondence often also works in the other direction. Given a quantum channel or noisy bipartite quantum state, a large class of secret communication protocols can be made into quantum communication protocols by performing all the steps "coherently," i.e. replacing probabilistic mixtures by quantum superpositions [23, 33, 32]. This result was derived in the idealized asymptotic context of quantum Shannon theory, without giving explicit, let alone efficient, code constructions. Not coincidentally, these asymptotic codes had a structure reminiscent of CSS codes. In Chapter 3, we bridge the gap between these asymptotic constructions and CSS codes. We obtain a subclass of CSS codes, called P-CSS codes, by making coherent a class of private codes studied by Renner and Koenig [57]. P-CSS codes involve a combination of a classical error-correcting code and a privacy amplification protocol. Consequently, the dual-containing property does not come into play, which greatly simplifies the construction. On the other hand, P-CSS codes are not finite distance codes.
This is not necessarily a problem: modern coding theory (e.g. LDPC [37] and turbo codes [15]) focuses not on distance but on performance over simple i.i.d. (independent, identically distributed) channels. The main result of Chapter 3 is to show that P-CSS codes have excellent asymptotic behavior, attaining the hashing bound on i.i.d. Pauli channels.

1.3 Private communication over quantum channels

In this section, we will first briefly introduce some basics of classical and quantum information theory without proof: the definitions and properties of classical and quantum entropic quantities, Shannon's channel coding theorem, and the transmission of classical information over quantum channels. Then I will discuss the general ideas of sending private classical information over quantum channels, beyond the Vernam cipher introduced in the first section.

Before introducing the many classical and quantum information-theoretic terms, I would like to ask the following two questions: 1) Suppose you want to send a message to someone but you want to save some space; how much can you compress the message such that it is still readable by the receiver? 2) If during transmission part of the message will be corrupted, how much redundancy do you need to add to your original message such that the receiver can still recover the information contained in the partially corrupted message? These are the famous source coding and channel coding problems, and the purpose of introducing the following definitions and properties of information-theoretic quantities is to formulate and answer these two questions precisely.

A. Classical and quantum entropies

First let us go over the definitions and properties of classical entropic quantities. The Shannon entropy H(X) of a discrete random variable X with alphabet 𝒳 and probability distribution p on 𝒳 is defined by

H(X) = H(p) = −Σ_{x∈𝒳} p(x) log_2 p(x) ,

with the convention 0 log_2 0 = 0.
Here we take logarithms to base 2, so all entropies are measured in bits. H(X) is a measure of the average uncertainty in the random variable X, and it is also the average number of bits required to describe X. The entropy has the following properties:
1) H(X) ≥ 0, with equality iff X is a constant.
2) H(X) ≤ log|𝒳|, where |𝒳| denotes the number of elements in the range of X, with equality iff X has a uniform distribution over 𝒳.

The joint entropy H(X,Y) of a pair of discrete random variables (X,Y) with joint distribution p(x,y) on 𝒳 × 𝒴 is defined as

H(X,Y) = −Σ_{x∈𝒳} Σ_{y∈𝒴} p(x,y) log_2 p(x,y) ,

which is the total uncertainty of (X,Y). The conditional entropy H(Y|X), the entropy of random variable Y given another random variable X, is defined as

H(Y|X) = Σ_{x∈𝒳} p(x) H(Y|X = x) , where H(Y|X = x) = −Σ_{y∈𝒴} Q(y|x) log_2 Q(y|x) ,

with Q(y|x) = Pr{Y = y | X = x} the conditional probability of Y given X = x, and p(x,y) = p(x)Q(y|x). The conditional entropy has the following properties:
1) H(Y|X) ≥ 0, with equality iff Y is a deterministic function of X.
2) H(Y|X) = H(X,Y) − H(X).
3) H(Y|X) ≤ H(Y) (conditioning reduces entropy), with equality iff X and Y are independent.

The mutual information I(X;Y) between random variables X and Y is defined by

I(X;Y) = Σ_{x∈𝒳} Σ_{y∈𝒴} p(x,y) log_2 [ p(x,y) / (p(x) q(y)) ] ,

which is a measure of the correlation between X and Y. The mutual information has the following properties:
1) I(X;Y) = H(X) − H(X|Y), i.e. the mutual information I(X;Y) is the reduction in the uncertainty of X due to the knowledge of Y.
2) I(X;Y) = I(Y;X) = H(X) + H(Y) − H(X,Y), i.e. the mutual information is symmetric in X and Y.
3) I(X;Y) ≥ 0, with equality iff X and Y are independent.
4) I(X;X) = H(X); this is the reason that entropy is also called self-information.

Figure 4: A Venn diagram showing the relationship between the entropies H(X), H(Y), H(X,Y), the conditional entropies H(X|Y), H(Y|X), and the mutual information I(X;Y).
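These identities are easy to verify numerically. A small sketch (my own, with an arbitrary joint distribution) checks the chain rule and the non-negativity and symmetry properties of the mutual information:

```python
from math import log2

# An arbitrary joint distribution p(x, y) on {0,1} x {0,1}.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def H(dist):
    """Shannon entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(q * log2(q) for q in dist if q > 0)

# Marginal distributions of X and Y.
px = [sum(v for (x, _), v in p.items() if x == a) for a in (0, 1)]
py = [sum(v for (_, y), v in p.items() if y == b) for b in (0, 1)]

Hxy = H(p.values())
Hx, Hy = H(px), H(py)
Hy_given_x = Hxy - Hx                  # chain rule: H(Y|X) = H(X,Y) - H(X)
Ixy = Hx + Hy - Hxy                    # mutual information, property 2

assert Ixy >= 0                        # property 3
assert Hy_given_x <= Hy + 1e-12        # conditioning reduces entropy
assert abs(Ixy - (Hy - Hy_given_x)) < 1e-12   # I(X;Y) = H(Y) - H(Y|X)
```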
The mutual information I(X;Y) corresponds to the intersection of the information in X and Y, and the joint entropy H(X,Y) corresponds to the union of the information in X and Y.

Now we can generalize the classical entropic quantities defined above to their quantum mechanical counterparts. New properties emerge with the generalization. The von Neumann entropy of a quantum system A in state ρ_A is defined as

H(A)_ρ = H(ρ) = −Tr(ρ_A log_2 ρ_A) ,

where the subscript ρ of H(A)_ρ is often omitted. Given ρ in its diagonal form ρ = Σ_i λ_i |i⟩⟨i|, the von Neumann entropy can be re-expressed as H(A)_ρ = −Σ_i λ_i log_2 λ_i, similar to the Shannon entropy. The von Neumann entropy has the following properties:
1) H(A) ≥ 0, with equality iff ρ_A is a pure state.
2) H(A) ≤ log_2 d, where d denotes the dimension of the state space, with equality iff ρ_A is the maximally mixed state.

For a bipartite quantum system AB in some joint state σ_AB, the joint von Neumann entropy is defined by

H(AB) = −Tr(σ_AB log_2 σ_AB) ,

which has the following properties:
1) If the system AB is in a pure state, then H(A) = H(B).
2) If the system AB is in a product state, i.e. σ_AB = σ_A ⊗ σ_B, then H(AB) = H(A) + H(B).
3) Subadditivity and the triangle inequality: |H(A) − H(B)| ≤ H(AB) ≤ H(A) + H(B).

For a bipartite system AB, the conditional von Neumann entropy is defined by

H(B|A) = H(AB) − H(A) ,

which has the following properties:
1) H(B|A) is no longer non-negative; in fact H(B|A) ≤ 0 if the system AB is in a pure state.
2) H(B|A) ≤ H(B) (conditioning reduces entropy), with equality iff the system AB is in a product state.

For a tripartite quantum system ABC in some state σ_ABC, define the quantum mutual information and quantum conditional mutual information as

I(A;B) = H(A) + H(B) − H(AB) = H(B) − H(B|A) ,
I(A;B|C) = I(A;BC) − I(A;C) .

Both of them are non-negative.

B. Shannon's channel coding theorems

Now let us turn to the message transmission problem mentioned at the beginning of this section.
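The fact that H(B|A) can be negative, unlike its classical counterpart, is worth seeing concretely. The sketch below (my own) computes it for the maximally entangled Bell state, for which H(AB) = 0 but H(A) = 1 bit, giving H(B|A) = −1.

```python
import numpy as np

def von_neumann(rho):
    """Von Neumann entropy -Tr(rho log2 rho) via the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    return -sum(l * np.log2(l) for l in evals if l > 1e-12)

# Bell state (|00> + |11>)/sqrt(2) on the joint system AB.
bell = np.zeros(4)
bell[0] = bell[3] = 1 / np.sqrt(2)
rho_AB = np.outer(bell, bell)

# Partial trace over B: reshape to indices (a, b, a', b') and trace out b = b'.
rho_A = np.trace(rho_AB.reshape(2, 2, 2, 2), axis1=1, axis2=3)

H_AB = von_neumann(rho_AB)        # 0: the joint state is pure
H_A = von_neumann(rho_A)          # 1: the marginal is maximally mixed
H_B_given_A = H_AB - H_A          # -1: negative conditional entropy
```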
Suppose you want to send a message to someone, but during the transmission process there is a certain probability that some letters of your message will be corrupted. How should you encode your message so that you can pack the maximum amount of information into a fixed-length encoded message, while the receiver can still read out the information from the corrupted encoded message with negligible probability of error?

To formulate this problem mathematically, we first abstract a mathematical model for this communication scenario. We can separate the problem into three steps: encoding, message transmission, and decoding. Encoding means that we encode our original message in a format suitable for transmission. Decoding means that we translate the received message back into the original format, readable by the receiver. For message transmission, I would like to introduce the concept of a channel, which describes exactly how a message is corrupted during transmission. I will start with the definition of a classical discrete channel [21].

A classical discrete channel is a physical system consisting of an input alphabet 𝒳, an output alphabet 𝒴, and a probability transition matrix Q(y|x) that gives the probability of observing the output symbol y given that the symbol x is sent. We are especially interested in memoryless channels, for which the probability distribution of the output depends only on the channel input at that time and is conditionally independent of previous channel inputs or outputs. Specifically, a memoryless channel is a sequence of independent and identically distributed (i.i.d.) conditional random variables Y^n|X^n = Y_1|X_1, ..., Y_n|X_n, with probability distribution P_{Y^n|X^n} = Q^n and Q^n(y^n|x^n) = Q(y_1|x_1) ··· Q(y_n|x_n).
Then for a discrete memoryless channel Q(y|x), the whole communication system (shown in Figure 5) can be described by an encoder E_n : {0,1}^{nR} → 𝒳^n, n copies of the discrete memoryless channel, Q^n(y^n|x^n), and a decoder D_n : 𝒴^n → {0,1}^{nR}. That is, the sender encodes a message W into X^n = E_n(W) and sends it through Q^n(y^n|x^n); the receiver then observes Y^n as the channel output and decodes it as W′ = D_n(Y^n).

Figure 5: A communication system for n uses of a memoryless discrete channel: a message W is mapped by the channel encoder E_n to X^n, sent through Q^n(Y^n|X^n), and decoded by the channel decoder D_n into the message estimate W′.

Now the message transmission problem becomes: for a sufficiently large number of channel uses n, what is the maximal information transmission rate R (0 < R < 1) such that Pr{W′ ≠ W} is negligible? The whole problem can be summarized as the following channel coding problem. An (n, R, ε) channel code consists of
• an encoding map E_n : {0,1}^{nR} → 𝒳^n;
• a decoding map D_n : 𝒴^n → {0,1}^{nR};
such that the average error probability, defined by p_err = 2^{−nR} Σ_{w∈{0,1}^{nR}} p_err(w), satisfies p_err ≤ ε, where for a particular w ∈ {0,1}^{nR},

p_err(w) = Pr{D_n(Y^n) ≠ w | X^n = E_n(w)} .

A transmission rate R is called achievable if for all ε > 0, δ > 0 and sufficiently large n there exists an (n, R−δ, ε) channel code. The capacity of the channel is defined as

C(Q) = sup{R : R is achievable} .

Theorem 2 (Shannon's channel coding theorem) If Q is a stochastic map identified with P_{Y|X}, then C(Q) = max_X I(X;Y).

The channel capacity C(Q) is the maximal communication rate at which we can send information over the channel Q and recover it at the output with an arbitrarily low probability of error. Shannon's channel coding theorem has two important aspects: 1) achievability, all rates below capacity are achievable; 2) optimality, any (n, R, ε) channel code with ε → 0 as n → ∞ must have R ≤ C.
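For the binary symmetric channel of Figure 3, the maximization in Theorem 2 is achieved by the uniform input distribution, giving the well-known capacity C = 1 − H_2(p), where H_2 is the binary entropy. A quick numerical sketch (my own):

```python
from math import log2

def h2(p):
    """Binary entropy H2(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity of the binary symmetric channel with flip probability p."""
    return 1 - h2(p)

# A noiseless channel carries 1 bit per use; a completely random one carries 0.
assert bsc_capacity(0.0) == 1.0
assert abs(bsc_capacity(0.5)) < 1e-12
# For p = 0.1 (the repetition-code example), C ≈ 0.531 bits per channel use.
```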
C. Holevo-Schumacher-Westmoreland (HSW) theorem

Like the Shannon entropy, Shannon's channel coding theorem can also be generalized to quantum channels. We can send pure quantum information, pure classical information, or even both classical and quantum information over quantum channels. A special and important case is sending classical information over quantum channels, i.e. encoding classical information into quantum states, sending the quantum states through quantum channels, and decoding the classical message. This problem was first studied and solved by Holevo [42], and independently by Schumacher and Westmoreland [60], so the result is also called the Holevo-Schumacher-Westmoreland (HSW) theorem.

Formally, a memoryless quantum channel is a tensor power N^⊗n of a trace-preserving quantum operation N, where N is defined by N : B(H_P) → B(H_Q). Here B(H_P) denotes the space of bounded linear operators on H_P, the Hilbert space of the quantum system P. The corresponding input and output quantum alphabets for the quantum channel N are B(H_P) and B(H_Q), respectively. An (n, R, ε) channel code consists of
• an encoding map E_n : {0,1}^{nR} → B(H_P^⊗n);
• a decoding map D_n : B(H_Q^⊗n) → {0,1}^{nR};
such that the average error probability, defined by p_err = 2^{−nR} Σ_{w∈{0,1}^{nR}} p_err(w), satisfies p_err ≤ ε, where for a particular message w ∈ {0,1}^{nR},

p_err(w) = Pr{D_n(E_n(w)) ≠ w} .

A transmission rate R is called achievable if for all ε > 0, δ > 0 and sufficiently large n there exists an (n, R−δ, ε) channel code. If we consider only product input states, we can prepend a classical-quantum channel P|X to the noisy quantum channel N, so that the effective channel becomes a memoryless classical-quantum channel with a two-stage encoding E_n = E_n^(2) ∘ E_n^(1),

E_n^(1) : {0,1}^{nR} → 𝒳^n ,  E_n^(2) : 𝒳^n → B(H_P^⊗n) .

So for W ∈ {0,1}^{nR}, we have E_n^(1)(W) = X^n(W) and E_n^(2)(X^n(W)) = ρ_{X^n(W)}, where X^n = X_1 ··· X_n is a sequence of i.i.d.
random variables, each generated according to the distribution p, and ρ_{X^n} = ρ_{X_1} ⊗ ··· ⊗ ρ_{X_n}. Therefore, the above encoding operation generates an input ensemble {p(x^n), ρ_{x^n}} for the quantum channel N. The product-state classical communication capacity of N is defined as

C^(1)(N) = sup{R^(1) : R^(1) is achievable} ,

where R^(1) is the transmission rate of the channel code for product input states.

Theorem 3 (Holevo-Schumacher-Westmoreland (HSW) theorem) For a noisy quantum channel N, the product-state classical communication capacity is

C^(1)(N) = max_{p_x, ρ_x} [ H(N(Σ_x p_x ρ_x)) − Σ_x p_x H(N(ρ_x)) ] ,

where {p_x, ρ_x} is the input ensemble.

Note that the HSW theorem has many different forms. Here it is written in terms of the product-state classical communication capacity of a quantum channel; later, in Chapters 4 and 5, it is written in terms of the maximal classical information transmission rate of a classical-quantum ensemble. The two forms are essentially the same, and the only difference is this: here we use quantum channels to build up classical-quantum correlations between the sender and receiver, and we can choose different input ensembles to maximize the achievable rate; later we are given the classical-quantum correlation, and we want to use the quantum side information at the receiver to infer information about the classical source. An (n, R^(1), ε) channel code is also called an HSW code.

D. Private communication over quantum channels

Now we are ready to discuss private classical information transmission over quantum channels. In private classical communication, the legitimate receiver learns everything about the classical message, while the eavesdropper, Eve, learns almost nothing about it by wiretapping the transmitted quantum states and performing measurements on them. Note that by eavesdropping, Eve may build up a certain classical-quantum correlation between the encoded classical information and her local quantum system.
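The quantity maximized in Theorem 3 is the Holevo quantity χ = H(Σ_x p_x ρ_x) − Σ_x p_x H(ρ_x) of the channel-output ensemble. As a sketch (my own example, taking a noiseless channel so that N(ρ_x) = ρ_x), the following computes χ for the equal-probability pure-state ensemble {|0⟩, |+⟩}; for pure states the second term vanishes and χ = H(ρ̄).

```python
import numpy as np

def von_neumann(rho):
    """Von Neumann entropy in bits from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    return -sum(l * np.log2(l) for l in evals if l > 1e-12)

# Equal-probability ensemble of the pure states |0> and |+>.
ket0 = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
states = [np.outer(k, k) for k in (ket0, plus)]
probs = [0.5, 0.5]

rho_bar = sum(p * rho for p, rho in zip(probs, states))

# Holevo quantity chi = H(rho_bar) - sum_x p_x H(rho_x); pure states have H = 0.
chi = von_neumann(rho_bar) - sum(p * von_neumann(r) for p, r in zip(probs, states))
```

The eigenvalues of ρ̄ are (1 ± 1/√2)/2, giving χ ≈ 0.6009 bits, strictly less than the 1 bit that would be available if the two states were perfectly distinguishable.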
Under these circumstances, how can we be sure that she learns almost nothing about the message? The basic idea is to encode our classical messages in such a way that Eve's local quantum state is almost the same for each message, so that she cannot distinguish between different classical messages by measuring her quantum side information alone.

Considering the effective classical-quantum-quantum correlation between the sender (Alice), the receiver (Bob) and the eavesdropper (Eve), we can define our private classical communication protocol for a classical-quantum-quantum channel from Alice to Bob and Eve. The channel is defined by the map W : x → σ_x^BE, with x ∈ 𝒳, where the state σ_x^BE is defined on a bipartite quantum system BE; Bob has access to subsystem B and Eve has access to subsystem E. For a large number n of uses of the channel W, the inputs to the channel W^⊗n are classical sequences x^n ∈ 𝒳^n with probability p^n(x^n). The outputs of W^⊗n are density operators σ_{x^n}^BE = σ_{x_1}^BE ⊗ ··· ⊗ σ_{x_n}^BE living on some Hilbert space H_{B^n E^n}. We can design a private communication code {x^n(lm), l ∈ {0,1}^{nR}, m ∈ {0,1}^{nS}} by choosing R = I(X;B) − I(X;E) − 2δ and S = I(X;E) + δ for some δ > 0, such that with high probability Bob can identify {σ_{x^n(lm)}^B}_{l,m} by a POVM and figure out both l and m correctly, while Eve has access only to σ_l^E = 2^{−nS} Σ_m σ_{x^n(lm)}^E ≈ E σ_{x^n(lm)}^E, which is almost independent of the secret message l. Here each {x^n(lm)}_m is called a privacy amplification (PA) set.
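The achievable private rate R = I(X;B) − I(X;E) is easiest to evaluate in the special case where both Bob's and Eve's outputs are effectively classical. For a uniform binary input with Bob seeing a binary symmetric channel with flip probability p_B and Eve a noisier one with p_E, the difference reduces to the classical wiretap expression H_2(p_E) − H_2(p_B). A sketch (my own, with purely illustrative parameters):

```python
from math import log2

def h2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def private_rate(p_bob, p_eve):
    """I(X;B) - I(X;E) for a uniform input over two binary symmetric channels:
    (1 - h2(p_bob)) - (1 - h2(p_eve)) = h2(p_eve) - h2(p_bob)."""
    return h2(p_eve) - h2(p_bob)

# Bob's channel is less noisy than Eve's, so a positive private rate survives.
R = private_rate(0.05, 0.2)   # ≈ 0.435 bits per channel use
assert R > 0

# If Eve's channel is exactly as good as Bob's, no private rate is left.
assert private_rate(0.1, 0.1) == 0.0
```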
Figure 6: Anatomy of the private communication code. All the encoded classical sequences (represented as black dots) form an HSW code, i.e. a good transmission code packed in Bob's typical subspace; the encoded sequences are also divided into many privacy amplification (PA) sets, each of which covers Eve's typical subspace uniformly, so that Eve's local state is almost the same for each message and she therefore learns nothing about the secret information.

Now, if Alice and Bob are given some private strings (secret keys), picked uniformly at random from some set {0,1}^{nR_s} before the above protocol begins, can they send more private information over the quantum channel? The answer is yes, and in Chapter 4 we prove the private capacity region of a classical-quantum-quantum channel assisted by pre-shared secret keys. The idea of the proof is as follows: instead of sacrificing nI(X;E) bits of classical information to randomize Eve's knowledge of the state, Alice and Bob use the pre-shared secret key to do so.

In Chapter 4, we study the private classical communication capacity over a quantum channel assisted by a secret key. We show that secret keys are a useful nonlocal resource that can increase the private classical communication capacity over quantum channels; however, unlimited secret keys do not help. The trade-off between the rate of secret key consumption and the rate of increased private classical communication is presented quantitatively. Within the resource inequality framework [26, 25], our protocol can be understood as a "private father protocol," due to its similarity to the original father protocol. Furthermore, the unassisted private classical communication capacity [23] can be seen as a child protocol.
1.4 Channel simulation and rate-distortion theory

Finally, we come to the last topic: channel simulation with quantum side information and its applications in source coding. This topic involves many important theorems in quantum information theory, such as Shannon's source coding theorem, the reverse Shannon theorem, and the classical-quantum Slepian-Wolf problem, so I need to cover these carefully before formally introducing the problem.

A. Shannon's source coding theorems

Again, let us turn to the message transmission problem. If you want to send a message to someone but want to make your message as short as possible, how much can you compress the message such that it is still readable by the receiver? To formulate this problem mathematically, we first need to consider how to describe a message of letters. A natural representation of a symbol is a discrete random variable X with alphabet 𝒳 and probability distribution p; a message X^n is then just a sequence of symbols, which can be thought of as being generated by an information source. A memoryless source consists of a large number n of i.i.d. random variables X, each generated according to a probability distribution p. Shannon's lossless source coding theorem [62] says that a memoryless source can be compressed without loss at a rate of H(X) bits per symbol. An (n, R, ε) source code consists of
• an encoding map E_n : 𝒳^n → {0,1}^{nR};
• a decoding map D_n : {0,1}^{nR} → 𝒳^n;
such that

p_err = Pr{D_n ∘ E_n(X^n) ≠ X^n} ≤ ε .    (5)

A compression rate R is called achievable if for all ε > 0, δ > 0 and for all sufficiently large n, there exists an (n, R+δ, ε) source code.

Theorem 4 (Shannon's lossless source coding theorem) For a memoryless information source X, the infimum data compression rate is inf{R : R is achievable} = H(X).

The Shannon entropy H(X) is thus the minimum rate at which you can compress a source without loss of information.
Shannon’s lossless source coding theorem has two important aspects: 1) achievability, all rates above H(X) are achievable; 2) optimality, any (n,R,ǫ) source code with ǫ→0 as n→∞ must has R≥H(X). But in reality compression to Shannon entropy is not always good enough. For example,supposeyouwanttotransmitalargeamountofdatathroughabusynetwork and after you compress all the data at a rate of its Shannon entropy, it is still huge. Now either you wait a long time to transmit all the losslessly compressed data or you reduce the data size by compressing beyond the Shannon entropy and require less transmission time. Obviously the latter choice is much cheaper, especially when the quality of data compression is not a critical issue and some sort of distortion is acceptable. A popular application of this “distortion” idea is MPEG-1 Audio Layer 3, 32 alsoknownasMP3,whichisadigitalaudioencodingformatusingaformoflossy data compression. MP3 uses a lossy compression algorithm to greatly reduce the amount of data required to represent the audio recording and make it still sound like a faithful reproduction of the original uncompressed audio. That makes an MP3 file about one tenthofthesizeoftheCDfilecreatedfromtheoriginalaudiosource! Thatisbrilliant. So a memoryless source can be compressed beyond the Shannon entropy such that the reproduction of the source (after compression and decompression) suffers a certain amount of distortion compared to the original. To describe this kind of data distortion quantitatively, we need to introduce a distortion measure. Formally, a distortion measure is a mapping d : X × ˆ X → R + from the set of source-reproduction alphabet pairs into the set of non-negative real numbers. In most cases, the source and reproduction alphabets are the same, i.e. X = ˆ X. And examples of common distortion measures are: 1) Hamming distortion, with d(x,ˆ x)=0 if x= ˆ x and 1 otherwise. A special property of this measure is that Ed(X, ˆ X)=Pr(X 6= ˆ X). 
2) Squared-error distortion, with d(x, x̂) = (x − x̂)². This measure is very popular for its simplicity and its relationship to least-squares prediction.

The above distortion measures are defined on a symbol-by-symbol basis; they can be extended to sequences by letting
d(x^n, x̂^n) = (1/n) Σ_{i=1}^{n} d(x_i, x̂_i).

Hence the lossy data compression problem can now be stated as: given a memoryless source and a predetermined distortion measure, what is the minimum compression rate you can achieve such that the distortion associated with the algorithm is upper-bounded by a certain distortion level? The answer to this problem is Shannon's lossy source coding theorem, or rate-distortion theory, whose goal is to minimize a suitably defined distortion measure for a given desired compression rate. The formal treatment is as follows.

An (n, R, d) rate-distortion code is given by
• an encoding map E_n : 𝒳^n → {0,1}^{nR};
• a decoding map D_n : {0,1}^{nR} → 𝒳̂^n;
such that the distortion associated with the code, defined by
d(E_n, D_n) := E d(X^n, X̂^n) = Σ_{x^n} p^n(x^n) d(x^n, D_n(E_n(x^n))),
satisfies d(E_n, D_n) ≤ d.

A rate-distortion pair (R, d) is achievable if there exists an (n, R + δ, d) code for any δ > 0 and sufficiently large n. The rate-distortion function R(d) is the infimum of rates R such that (R, d) is achievable for a given distortion d.

Theorem 5 (Shannon's lossy source coding theorem) For a source X with distortion measure d(x, x̂), the rate-distortion function is given by the information rate-distortion function
R(d) = min_{p(x̂|x) : Σ_{x,x̂} p(x) p(x̂|x) d(x,x̂) ≤ d} I(X; X̂).

B. Reverse Shannon theorem

As we already know, a memoryless source consisting of a large number n of symbols generated according to a probability distribution p can be compressed without loss at a rate of H(p) bits per symbol, where H(p) is the Shannon entropy of p. This result can be rephrased as a communication problem (as shown in Figure 7).
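For a Bernoulli(p) source under Hamming distortion, the minimization in Theorem 5 has the well-known closed form R(D) = H(p) − H(D) for 0 ≤ D ≤ min(p, 1 − p), and R(D) = 0 beyond that (a standard result, not derived in this thesis). A sketch:

```python
import math

def h2(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def rate_distortion_binary(p, D):
    """R(D) = H(p) - H(D) for a Bernoulli(p) source under Hamming distortion."""
    if D >= min(p, 1 - p):
        return 0.0
    return h2(p) - h2(D)

# Tolerating 5% bit errors on a fair-coin source saves H(0.05) ≈ 0.286 bits/symbol.
print(rate_distortion_binary(0.5, 0.0))   # 1.0 (lossless)
print(rate_distortion_binary(0.5, 0.05))  # ≈ 0.714
```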
The sender Alice wants to communicate her source to the receiver Bob. Equivalently, she wants to simulate a noiseless bit channel (which we denote by id) from her to Bob with respect to the input p. She can accomplish this task by using up a rate H(p) of perfect bit channels (which we denote by [c → c]) from her to Bob. The protocol consists of Alice sending the compressed source and Bob performing decompression upon receipt. The existence of such a protocol may be succinctly expressed as a resource inequality [25, 26, 31]
H(p) [c → c] ≥ ⟨id : p⟩.
The non-local resource on the left-hand side can be composed with local pre- and post-processing to simulate the non-local resource on the right-hand side.

Figure 7: Shannon's lossless source coding theorem in terms of noiseless channel simulation. To communicate the source sequence X^n to Bob, Alice compresses it at a rate of the Shannon entropy by the encoding E_n and sends the compressed sequence to Bob through nH(X) uses of the noiseless channel [c → c]. After receiving the data, Bob decompresses it to the original sequence X^n by the decoding D_n, without loss.

With this viewpoint in mind, Shannon's result was generalized some 50 years later to simulating noisy channels. The latter result was dubbed the reverse Shannon theorem [11, 72], referring to Shannon's noisy channel coding theorem [62]. Specifically, in terms of channel simulation, Shannon's noisy channel coding theorem says that for sufficiently large n, we can use n copies of a classical noisy channel W(y|x) to simulate nI(X;Y) copies of a classical noiseless channel. The reverse Shannon theorem considers this problem in the opposite direction: how many copies of a classical noiseless channel are necessary to simulate n copies of the classical noisy channel?
The naive way is for Alice to simulate the conditional distribution W^n(y^n|x^n) locally, generating y^n for any input x^n, and then use nH(Y) copies of the classical noiseless channel to communicate the channel output Y^n to Bob. But it turns out that part of the classical communication [c → c] can be replaced by shared coins, or "common randomness" (denoted by [cc]). More specifically, we can use nI(X;Y) copies of the classical noiseless channel and nH(Y|X) copies of common randomness to simulate n copies of the classical noisy channel W(y|x). And since Alice can also keep a copy of the channel output, we actually simulate a classical noisy channel with feedback (as shown in Figure 8).

Figure 8: The reverse Shannon theorem. We can use common randomness [cc] and the classical noiseless channel [c → c] to simulate a classical noisy channel W : X_A → Y_B with feedback Δ : Y_B → Y_A Y_B.

One may well ask why one should deliberately simulate imperfect channels. Ideally we would in fact like noiseless ones, but perhaps our resources are limited, and an imperfect surrogate can be simulated at a discount: part of the classical communication [c → c] can be replaced by common randomness [cc]. Common randomness is a strictly weaker resource than classical communication, because Alice can flip her coin locally and send the outcome to Bob. We can see that the mapping p(x̂|x) of lossy data compression behaves like a classical noisy channel. Indeed, the reverse Shannon theorem is intimately related [72] to lossy compression, or rate-distortion theory [12], where the communication rate is traded off against a suitably defined distortion level of the data. More generally, the reverse Shannon theorem is a useful tool for effecting trade-offs between different resources [10, 40].

C. Classical-quantum Slepian-Wolf problem

Another generalization of Shannon's result, introduced by Slepian and Wolf [66], is to give Bob side information about the source.
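The two rates in the reverse Shannon theorem are easy to evaluate from the joint statistics. A sketch (my own illustration, not from the thesis): for a binary symmetric channel with crossover probability q and uniform input, the communication rate is I(X;Y) = 1 − H(q) and the common-randomness rate is H(Y|X) = H(q).

```python
import math

def h(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def reverse_shannon_rates(p_x, W):
    """Given input distribution p_x and channel W[x][y] = W(y|x), return
    (I(X;Y), H(Y|X)): the bits of communication and of common randomness
    per channel use sufficient to simulate the channel (with feedback)."""
    joint = [[p_x[x] * W[x][y] for y in range(len(W[0]))] for x in range(len(p_x))]
    p_y = [sum(joint[x][y] for x in range(len(p_x))) for y in range(len(W[0]))]
    H_Y_given_X = sum(p_x[x] * h(W[x]) for x in range(len(p_x)))
    return h(p_y) - H_Y_given_X, H_Y_given_X

q = 0.1
I, cr = reverse_shannon_rates([0.5, 0.5], [[1 - q, q], [q, 1 - q]])
print(I, cr)  # ≈ 0.531, ≈ 0.469
```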
The case of quantum side information was considered in [30]. The classical-quantum Slepian-Wolf problem considers classical data compression when the receiver has quantum side information at his disposal. Suppose there exists a classical-quantum correlation between Alice's source X and Bob's quantum side information B, such that they share the joint state
ρ^{XB} = Σ_x p(x) |x⟩⟨x|^X ⊗ ρ_x^B.
Given a large number n of copies of the classical-quantum system XB, Alice possesses knowledge of the classical sequence x^n = x_1 x_2 ... x_n, but not the quantum subsystem described by ρ_{x^n} = ρ_{x_1} ⊗ ρ_{x_2} ⊗ ··· ⊗ ρ_{x_n}; Bob has the quantum subsystem at his disposal but not the classical sequence. As we know from Shannon's lossless source coding theorem, without the quantum side information Alice would need to send about nH(X) classical bits to communicate the source to Bob. Now the question is: can they reduce the communication cost by making use of Bob's quantum side information? The answer is yes, and the way to achieve it is to use HSW codes (as shown in Figure 9). We can design a source code {x^n(ms), m ∈ {0,1}^{nR}, s ∈ {0,1}^{nS}} by choosing R = H(X|B) + 2δ and S = I(X;B) − δ for some δ > 0, such that each C_m = {x^n(ms)}, m ∈ {0,1}^{nR}, is an HSW code and Bob can identify x^n correctly with high probability. By effectively using the quantum side information, we can substantially reduce the communication cost.

Figure 9: The classical-quantum Slepian-Wolf problem. We can use Bob's quantum side information to reduce the communication cost for replicating the source at Bob's side.

D. Channel simulation and rate-distortion theory

In Chapter 5 we combine the two ideas of making the channel noisy and allowing the receiver quantum side information, and simulate a classical noisy channel with quantum side information at the receiver. We also analyze several consequences for trade-offs.
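The classical-quantum Slepian-Wolf rate H(X|B) above is a conditional von Neumann entropy, H(X|B) = H(XB) − H(B). A small numerical illustration (my own toy cq state, not from the thesis) of the savings over the unassisted rate H(X):

```python
import numpy as np

def vn_entropy(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

# Toy cq state: X uniform on {0,1}; Bob holds |0><0| when x=0, |+><+| when x=1.
ket0 = np.array([1.0, 0.0])
ketp = np.array([1.0, 1.0]) / np.sqrt(2)
rho = {0: np.outer(ket0, ket0), 1: np.outer(ketp, ketp)}
p = {0: 0.5, 1: 0.5}

H_X = 1.0                                          # Shannon entropy of the source
H_B = vn_entropy(sum(p[x] * rho[x] for x in p))    # entropy of Bob's marginal
# For a cq state, H(XB) = H(X) + sum_x p(x) H(rho_x); here each rho_x is pure.
H_XB = H_X + sum(p[x] * vn_entropy(rho[x]) for x in p)
H_X_given_B = H_XB - H_B

print(H_X_given_B)  # ≈ 0.399 < H(X) = 1: side information cuts the rate
```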
As illustrated in Figure 9, the "typical" source sequences can be classified into many HSW codes defined by {C_m, m ∈ {0,1}^{nR}}, with C_m = {x^n(ms), s ∈ {0,1}^{nS}}. For each HSW code C_m, the source sequences {x^n(ms)} are mapped into nearly orthogonal subspaces of Bob's "typical" subspace through the classical-quantum correlation of XB, and are thus distinguishable with high probability.

The first of these consequences is rate-distortion theory with quantum side information, paralleling the classical work of Wyner and Ziv [77] on rate-distortion theory with classical side information. The second is an alternative derivation of a result from [31] concerning distillation of common randomness from a bipartite quantum state with the assistance of one-way classical communication. The various implications of our result are shown in Figure 10.

Figure 10: The relation of our results to prior work. [Diagram relating the present result to common randomness distillation, the reverse Shannon theorem, classical-quantum Wyner-Ziv, the classical-quantum Slepian-Wolf problem, and classical rate-distortion theory, via specializations labeled "ignore source", "no side info", "distortion measure", and "no noise".]

The connection between classical channel simulation (with classical side information) and rate-distortion theory is further discussed in Chapter 6. The channel simulation theorems turn out to be particularly useful for analyzing the trade-offs in several source coding problems: multiple descriptions, successive refinement, and multi-terminal source coding, which are introduced as follows.

The multiple descriptions problem, studied by Gersho, Witsenhausen, Wolf, Wyner, Ziv, Ozarow, El Gamal, Cover, Zhang, Berger, and Ahlswede [2, 38, 55, 74, 75, 78], concerns one sender describing the same sequence of random variables X^n = X_1 ... X_n to more than one receiver. The i-th receiver receives the description E_i(X^n) ∈ {1, 2, ..., 2^{nR_i}} and produces an estimate X̂_i^n of the original message X^n with distortion D_i, where E_i is the encoding operation for the description sent to the i-th receiver.
Figure 11: The multiple descriptions problem with three receivers, two of which receive individual descriptions and the third of which has access to both descriptions.

An important case of multiple descriptions is shown in Figure 11. Information about the source is transmitted to receivers 0 and 1 at rates R_0 and R_1, respectively, and the two receivers individually generate the estimates X̂_0^n and X̂_1^n with distortions D_0 and D_1, respectively. Receiver 2 has access to both descriptions and generates the estimate X̂_2^n with distortion D_2 (with D_2 ≤ D_0, D_2 ≤ D_1). The rate-distortion region is the set of achievable quintuples (R_0, R_1, D_0, D_1, D_2).

Later, in 1991, Equitz and Cover [35] proposed the successive refinement problem. In this problem a sender successively refines a sequence of random variables X^n = X_1 ... X_n to a receiver, using a two-stage description that is optimal at each stage, as shown in Figure 12. The sender first describes X^n at rate R_1 with distortion D_1, and then they wish to reduce the distortion from D_1 to D_2 by sending extra bits at rate R_2 − R_1. If R_1 = R(D_1) and R_2 = R(D_2), then they have successively refined the sequence X^n. Equitz and Cover gave a proof of the successive refinement result by observing that it is a special case of the multiple descriptions problem in which there is no constraint on the distortion D_0 and in which R_1 = R(D_1) and R_0 + R_1 = R(D_2) (no excess rate). They showed that successive refinement from a coarse description X̂_1^n with distortion D_1 to a finer description X̂_2^n with distortion D_2 can be achieved if and only if X^n, X̂_2^n, X̂_1^n form the Markov chain X^n → X̂_2^n → X̂_1^n.

Figure 12: The successive refinement problem.

Figure 13: The multi-terminal source coding problem.
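The Markov condition for successive refinement can be checked mechanically. A small sketch (my own toy distribution, not from the thesis): building the joint distribution from a fine test channel p(x̂₂|x) followed by a degrading channel p(x̂₁|x̂₂) guarantees the required chain X → X̂₂ → X̂₁.

```python
from itertools import product

# Toy binary source and test channels (illustrative values only).
p_x = [0.5, 0.5]
p_x2_given_x = [[0.9, 0.1], [0.1, 0.9]]   # fine description of X
p_x1_given_x2 = [[0.8, 0.2], [0.2, 0.8]]  # coarse description degraded from the fine one

# Joint distribution p(x, x2, x1) = p(x) p(x2|x) p(x1|x2).
joint = {(x, x2, x1): p_x[x] * p_x2_given_x[x][x2] * p_x1_given_x2[x2][x1]
         for x, x2, x1 in product(range(2), repeat=3)}

def cond_x1(x, x2):
    """p(x1 | x, x2); the Markov property X -> X2 -> X1 means this is independent of x."""
    marg = sum(joint[(x, x2, x1)] for x1 in range(2))
    return [joint[(x, x2, x1)] / marg for x1 in range(2)]

for x2 in range(2):
    assert all(abs(a - b) < 1e-12 for a, b in zip(cond_x1(0, x2), cond_x1(1, x2)))
print("Markov chain X -> X2 -> X1 holds")
```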
In 1978, Berger [13] and Tung [69] studied the multi-terminal source coding problem (shown in Figure 13), which is a natural rate-distortion extension of the Slepian-Wolf problem, and they gave an inner and an outer bound for the rate-distortion region. Consider two correlated sources X_1 and X_2, drawn i.i.d. according to the joint probability distribution p(x_1, x_2). These sources are to be encoded by two separate encoders and decoded by a joint decoder. The goal is to determine the region of all achievable rate-distortion quadruples (R_1, R_2, D_1, D_2). Define two auxiliary variables W_1 and W_2, such that there exist X̂_1 = X̂_1(W_1, W_2) and X̂_2 = X̂_2(W_1, W_2) with E d(X_1, X̂_1) ≤ D_1 and E d(X_2, X̂_2) ≤ D_2. If p(x_1, x_2, w_1, w_2) satisfies a Markov chain of the form W_1 → X_1 → X_2 → W_2, then all achievable rates (R_1, R_2) define an inner bound on the rate region; but if p(x_1, x_2, w_1, w_2) only satisfies the two Markov chains W_1 → X_1 → X_2 and X_1 → X_2 → W_2, then we obtain an outer bound on the rate region.

The direct coding theorems of all the above problems can be easily proved via a unified approach, by using the reverse Shannon theorem and the channel simulation theorem successively. This method provides a systematic way to attack other, more complicated problems in rate-distortion theory.

1.5 Outline

The rest of the thesis is organized as follows. In Chapter 2, we generalize the secure BB84 quantum key distribution protocol to an efficiently implementable key expansion protocol, capable of increasing the size of a pre-shared key by a constant factor. Without referring to the code construction of the entanglement-assisted quantum error-correcting codes, we use only stabilizers to give a direct reduction from an entanglement-assisted entanglement distillation protocol to a key expansion protocol, which is mathematically clearer than the presentation of Shor and Preskill.
In Chapter 3, we propose a construction of CSS codes which combines a classical code with a two-universal hash function. We show, using the results of Renner and Koenig [57], that the communication rates of such codes approach the hashing bound on tensor powers of Pauli channels in the limit of large block length. While the bit-flip errors can be decoded as efficiently as the classical code used, the problem of efficiently decoding the phase-flip errors remains open.

In Chapter 4, we prove a regularized formula for the secret-key-assisted capacity region of a quantum channel for transmitting private classical information, a classical analogue of the entanglement-assisted quantum communication capacity. This formula provides a new family protocol, the private father protocol, within the resource inequality framework, which includes private classical communication without secret key assistance as a child protocol.

In Chapter 5, we introduce the problem of channel simulation with quantum side information. We use the channel simulation result to give simple direct proofs of rate-distortion theory with quantum side information and of one-way common randomness distillation. We also give inner and outer bounds on the achievable quantum communication and entanglement generation rate pairs of the fully quantum generalization of this problem.

In Chapter 6, we systematically study the connection between channel simulation (with classical side information) and rate-distortion theory. Simple proofs of achievability for multi-terminal source coding problems are given by a unified approach, using the reverse Shannon theorem and channel simulation with classical side information as building blocks.

Chapter 2: Efficiently implementable codes for quantum key expansion

Quantum key distribution (QKD) allows two distant parties, Alice and Bob, to establish a secret key using one-way quantum communication and public classical communication.
This key is provably secure from an all-powerful eavesdropper Eve, who is allowed to intercept the quantum communication, perform block processing of quantum data, and listen to the public discussion. In contrast, key distribution by public communication alone is impossible. QKD owes its security to two facts: 1) Alice and Bob, by performing tomography on their (quantum) data, automatically obtain information about Eve's (quantum) data; 2) with this knowledge Alice and Bob can perform information reconciliation (IR) and privacy amplification (PA) to distill a key which is common to both (by IR), and about which Eve knows next to nothing (by PA). In this chapter we solve the practical question of constructing efficiently implementable codes for IR and PA.

The best-known QKD protocol, BB84, was proposed by Bennett and Brassard in [6]. BB84 is a simple "prepare-and-measure" protocol which can be implemented without a quantum computer or quantum memory. Alice encodes a random bit either in the Z basis {|0⟩, |1⟩} or the X basis {|+⟩, |−⟩} [here |±⟩ = (|0⟩ ± |1⟩)/√2] of a qubit system, and sends it to Bob. Bob performs a measurement in one of the two bases, chosen at random. After repeating this many times, they determine by public discussion which bits they chose the same basis for, thus establishing a raw key. They perform channel estimation on a small fraction of the raw key bits. If the channel is too noisy, they abort the protocol. Otherwise they perform IR and PA on the remaining raw key bits to obtain the final secret key.

Shor and Preskill [64] gave the first simple proof of the security of standard BB84, by relating the IR and PA steps to Calderbank-Shor-Steane (CSS) quantum error-correcting codes. A CSS code protects m qubits from errors by "rotating" them into a 2^m-dimensional subspace of an n-qubit system. This subspace is the simultaneous eigenspace of "stabilizer" operators of the form
Z_h = Z^{h(1)} ⊗ ··· ⊗ Z^{h(n)},  X_g = X^{g(1)} ⊗ ··· ⊗ X^{g(n)}.
Here Z and X are Pauli matrices, and h = h(1)...h(n) and g = g(1)...g(n) are binary vectors of length n. The vectors h and g are chosen to be rows of the classical "parity check" matrices H_1 and H_2, respectively. To ensure that the stabilizer operators commute, H_1 and H_2 must be mutually orthogonal: H_1 H_2^T = 0. This condition is equivalent to saying that the codes corresponding to H_1 and H_2 contain each other's duals. Let the (n − k_i) × n parity check matrix H_i correspond to an [n, k_i, d] classical error-correcting code which encodes k_i bits into n bits and corrects errors on any t = (d − 1)/2 bits. Then m = k_1 + k_2 − n, and the CSS code corrects quantum errors on any t qubits.

In order to securely implement the BB84 protocol we need to find good mutually dual-containing codes of large block length n. These are known to exist in principle, by the Gilbert-Varshamov bound for CSS codes [19, 67]. Unfortunately, no explicit constructions are known, let alone ones that would be simple to decode. The main result of this chapter is that the dual-containing condition may be lifted. This permits us to employ excellent, efficiently decodable modern classical codes such as LDPC [37] and turbo codes [15]. The price we have to pay is that our protocol performs expansion of a pre-shared key rather than creating one from scratch. This is not much of a drawback, as existing QKD protocols already require a logarithmic amount of pre-shared key to authenticate the public discussion. Still, we choose to make this distinction, as in our case the pre-shared key is linear in the quantum communication cost. Our construction is closely related to the entanglement-assisted quantum codes of Brun, Devetak and Hsieh [18, 17], which generalize stabilizer codes to the communication setting where the sender and receiver have access to pre-shared entanglement.

2.1 Notation

First we consider an idealized setting in which the eavesdropper is known to have introduced errors on no more than a fixed fraction of the qubits.
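As a concrete check (my own example, not from the text), the classical [7,4,3] Hamming code contains its dual, so its parity check matrix satisfies H H^T = 0 (mod 2); taking H_1 = H_2 = H gives a CSS code with m = k_1 + k_2 − n = 1 protected qubit, the well-known Steane code:

```python
import numpy as np

# Parity check matrix of the [7,4,3] Hamming code (columns = binary 1..7).
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

# CSS commutation condition H1 H2^T = 0 (mod 2), here with H1 = H2 = H:
assert not (H @ H.T % 2).any(), "codes are not mutually dual-containing"

n, k1, k2 = 7, 4, 4
m = k1 + k2 - n  # number of protected qubits
print(f"[[{n},{m}]] CSS code")  # [[7,1]] CSS code
```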
In other words, the channel estimation is assumed to have been performed successfully. We show how to construct an [n, m − c, d; c] quantum key expansion (QKE) protocol, which expands the key from c to m bits if at most t = (d − 1)/2 out of n qubits have become corrupted. Then we invoke standard results [47, 64, 39] to incorporate the channel estimation phase.

Let H_i (i = 1, 2) be the parity check matrix for a classical [n, k_i, d] code C_i ⊂ Z_2^n, so that the rows of H_i form a basis for C_i^⊥. Consider the (n − k_1) × (n − k_2) matrix M = H_1 H_2^T. In general M ≠ 0, and it can be row- and column-reduced (i.e. multiplied from the left and right by non-singular transformation matrices T_1 and T_2^T) to a matrix of the form
T_1 M T_2^T = ( 0_{ℓ_1 × ℓ_2}  0_{ℓ_1 × c} ; 0_{c × ℓ_2}  I_c ) = J_1 J_2^T,
where n − k_1 = c + ℓ_1, n − k_2 = c + ℓ_2, and J_i = ( 0_{ℓ_i × c} ; I_c ). This is the well-known Gaussian elimination procedure. Letting H̃_i = T_i H_i be an equivalent parity check matrix for the code C_i, we have H̃_1 H̃_2^T = J_1 J_2^T. Hence H'_1 H'_2^T = 0, where the (n − k_i) × (n + c) "augmented" parity check matrices H'_i (cf. [18, 17]) are given by H'_i = ( H̃_i  J_i ).

Note that H'_i can be viewed as the parity check matrix of a classical code C'_i by defining the row space of H'_i to be C'_i^⊥. Then C'_2^⊥ ⊆ C'_1 and C'_1^⊥ ⊆ C'_2. The set of errors H'_i can correct is the same as the set of errors H_i can correct, assuming that the last c bits are error-free. Thus H'_i can correct the error set S(n, c, d), defined as the set of binary row vectors of dimension n + c with at most (d − 1)/2 ones among the first n bits, and zeros elsewhere.

Figure 14: The construction of the full-rank matrices N_1 and N_2.

To relate the error-correcting properties of these classical codes to those of the corresponding CSS codes, we find it useful to extend the parity check matrices H'_1 and H'_2 to full-rank matrices, as shown in Figure 14.
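The parameter c produced by this row and column reduction is just the GF(2) rank of M = H_1 H_2^T. A sketch of the computation (the small matrices H_1, H_2 below are hypothetical, for illustration only):

```python
import numpy as np

def gf2_rank(M):
    """Rank of a binary matrix over GF(2), by Gaussian elimination."""
    M = M.copy() % 2
    rank = 0
    for col in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]  # swap pivot row into place
        for r in range(M.shape[0]):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]              # eliminate the column elsewhere
        rank += 1
    return rank

# Hypothetical parity check matrices of two small classical codes:
H1 = np.array([[1, 1, 0, 0], [0, 1, 1, 0]])
H2 = np.array([[1, 0, 1, 1], [0, 0, 1, 1]])
c = gf2_rank(H1 @ H2.T % 2)  # bits of pre-shared key / ebits needed
print("c =", c)
```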
Starting with H'_1, whose rows are a basis for C'_1^⊥, add m = k_1 + k_2 + c − n independent row vectors (comprising the matrix E_1) such that the rows of H'_1 and E_1 together form a basis for C'_2 ⊇ C'_1^⊥. Collect the remaining n − k_2 independent vectors in the matrix F_1. The (n + c) × (n + c) matrix
N_1 = ( H'_1 ; E_1 ; F_1 )
has full rank, and hence so does N_1 N_1^T. By Gaussian elimination, there exists a matrix T such that N_1 N_1^T T^T = I. Decompose N_2 = T N_1 into three segments just like N_1:
N_2 = ( F_2 ; E_2 ; Ĥ_2 ).
The condition N_1 N_2^T = I is now written as
H'_1 F_2^T = I,  H'_1 E_2^T = 0,  H'_1 Ĥ_2^T = 0,
E_1 F_2^T = 0,  E_1 E_2^T = I,  E_1 Ĥ_2^T = 0,   (6)
F_1 F_2^T = 0,  F_1 E_2^T = 0,  F_1 Ĥ_2^T = I.
Hence the rows of Ĥ_2 form a basis for C'_2^⊥. With an appropriate redefinition of F_1 we can identify Ĥ_2 with H'_2. H'_2 and E_2 together form a basis for C'_1 ⊇ C'_2^⊥.

An error set E_1 ⊂ Z_2^{n+c} correctable by the code H'_1 is of the form E_1 = {bF_2 + β(b)E_2 + β'(b)H'_2 : b ∈ Z_2^{n−k_1}}, where β : Z_2^{n−k_1} → Z_2^m and β' : Z_2^{n−k_1} → Z_2^{n−k_2} are known functions. Thus β(b) is a row vector of dimension m, and β(b)E_2 is an element of the row space of E_2. The vector b is called the error syndrome, since it uniquely specifies the error. For an error u ∈ E_1, by (6), the error syndrome is calculated as b = H'_1 u^T. Since H_1 is the parity check matrix of an [n, k_1, d] code, S(n, c, d) ⊆ E_1.

Similarly, an error set E_2 ⊂ Z_2^{n+c} correctable by the code H'_2 is of the form E_2 = {pF_1 + α(p)E_1 + α'(p)H'_1 : p ∈ Z_2^{n−k_2}}, where α : Z_2^{n−k_2} → Z_2^m and α' : Z_2^{n−k_2} → Z_2^{n−k_1} are known functions. Since H_2 is the parity check matrix of an [n, k_2, d] code, S(n, c, d) ⊆ E_2.

2.2 Reduction from entanglement-assisted entanglement distillation to quantum key expansion

In the Shor-Preskill proof [64] a QKD protocol was obtained by modifying an entanglement distillation protocol. Our starting point is an EAED protocol. Alice and Bob initially share the state |Φ⟩^{⊗c}, where |Φ⟩^{AB} = (|0⟩^A|0⟩^B + |1⟩^A|1⟩^B)/√2 is the ebit state.
Their goal is to distill a total of m ebits. The resources at their disposal are classical communication and a noisy n-qubit channel, which introduces errors on at most t out of n qubits. At the beginning of the protocol, Alice creates another n ebits locally and sends the B parts of them through the noisy channel to Bob. An operator written as U ⊗ V means that U acts on "subsystem A", the n + c qubits that stay with Alice, and V acts on "subsystem B", the n + c qubits which end up in Bob's possession. We will describe the noise more generally as acting on the latter n + c qubits (even though only the first n of these are affected): let Q(n, c, d) be the set of error operators of the form I ⊗ (X_{h_1} Z_{h_2}), where h_1, h_2 ∈ S(n, c, d). The X-type errors are called bit errors and the Z-type errors are called phase errors. Because S(n, c, d) ⊆ E_1 ∩ E_2, every element of Q(n, c, d) is of the form
I ⊗ (Z_{pF_1} X_{bF_2} Z_{α(p)E_1} Z_{α'(p)H'_1} X_{β(b)E_2} X_{β'(b)H'_2}),   (7)
for some p ∈ Z_2^{n−k_2}, b ∈ Z_2^{n−k_1}. Our EAED protocol comprises the following steps:

(1). The initial state is |Φ⟩^{⊗(n+c)}. The first n ebits are held entirely by Alice, and the last c are shared between Alice and Bob. Denote by 0 the [(n+c)-dimensional] all-zero vector. The state |Φ⟩^{⊗(n+c)} is the simultaneous (−1)^{(0,0)} eigenstate of
{Z_{I_{n+c}} ⊗ Z_{I_{n+c}}, X_{I_{n+c}} ⊗ X_{I_{n+c}}}.
By this we mean that it is the (−1)^0 = 1 eigenstate of Z_e ⊗ Z_e for each row e of the (n+c) × (n+c) identity matrix I_{n+c}. As the matrices N_1 and N_2 are full rank, this state is equivalently described as the simultaneous (−1)^{(0,0;0,0;0,0)} eigenstate of the operators
{Z_{H'_1} ⊗ Z_{H'_1}, X_{F_2} ⊗ X_{F_2}; Z_{E_1} ⊗ Z_{E_1}, X_{E_2} ⊗ X_{E_2}; Z_{F_1} ⊗ Z_{F_1}, X_{H'_2} ⊗ X_{H'_2}}.   (8)
In other words, it is the (−1)^0 eigenstate of Z_{h_1} ⊗ Z_{h_1} for each row h_1 of H'_1, etc.

(2). An error in Q(n, c, d) of the form (7) occurs. The new state is the simultaneous (−1)^{(b, α'(p); β(b), α(p); β'(b), p)} eigenstate of the operators in (8).
This is easily seen from the relations (6) and the fact that acting with (I ⊗ X_g) on an eigenstate of (Z_h ⊗ Z_h) with eigenvalue (−1)^a changes the eigenvalue to (−1)^{a + g h^T} [and similarly with X and Z interchanged].

(3). In order to find out the error syndromes b and p, Alice and Bob should measure the commuting operators {Z_{H'_1} ⊗ Z_{H'_1}, X_{H'_2} ⊗ X_{H'_2}}. However, this would require a non-local measurement. Since Z_h ⊗ Z_h = (Z_h ⊗ I)(I ⊗ Z_h), Alice and Bob can effectively measure Z_h ⊗ Z_h by Alice measuring Z_h ⊗ I, Bob measuring I ⊗ Z_h, and multiplying the measurement outcomes. Thus, Alice measures Z_{H'_1} ⊗ I and X_{H'_2} ⊗ I, obtaining b' and p', and Bob measures I ⊗ Z_{H'_1} and I ⊗ X_{H'_2}, obtaining b'' and p''. Alice sends Bob her measurement outcomes, and Bob computes p = p' + p'' and b = b' + b''.

(4). Bob performs the correction operation I ⊗ (Z_{α(p)E_1} X_{β(b)E_2}). Alice and Bob are left with the simultaneous (−1)^{(0,0)} eigenstate of {Z_{E_1} ⊗ Z_{E_1}, X_{E_2} ⊗ X_{E_2}}. They can transform this state by local unitaries into |Φ⟩^{⊗m}.

This EAED protocol is readily made into an entanglement-assisted secret key distillation protocol. Starting with the distilled state |Φ⟩^{⊗m}, Alice measures Z_{I_m} ⊗ I and Bob measures I ⊗ Z_{I_m} to obtain a common key k ∈ Z_2^m. The key is decoupled from the rest of the world, and hence from Eve, because |Φ⟩^{⊗m} is a pure state.

We proceed to simplify this key distillation protocol. Instead of transforming into |Φ⟩^{⊗m} in step 4 and measuring {Z_{I_m} ⊗ I, I ⊗ Z_{I_m}}, it suffices to measure Z_{E_1} ⊗ I and I ⊗ Z_{E_1} to obtain k directly. In step 4, Bob need not perform the phase error part I ⊗ Z_{α(p)E_1} of the correction operation; this commutes with the Z_{E_1} ⊗ I and I ⊗ Z_{E_1} operators, and hence does not affect the measured key value k. Thus measuring X_{H'_2} ⊗ I and I ⊗ X_{H'_2} in step 3 is also unnecessary. Bob performing the bit error correction I ⊗ X_{β(b)E_2}, followed by measuring I ⊗ Z_{E_1} to get k, is equivalent to just measuring I ⊗ Z_{E_1} to get k' and computing k = k' + β(b).
The new key distillation protocol consists of steps 1 and 2, followed by:

3. Alice measures Z_{H'_1} ⊗ I, obtaining b', and Bob measures I ⊗ Z_{H'_1}, obtaining b''. Alice sends b' to Bob.

4. Alice measures Z_{E_1} ⊗ I, obtaining k, and Bob measures I ⊗ Z_{E_1}, obtaining k'. Bob computes k = k' + β(b' + b'').

The above protocol requires multi-qubit operations and pre-shared entanglement. We will now reduce it to single-qubit operations and replace the entanglement by a pre-shared secret key.

As they commute with all the other steps, Alice may perform her Z_{H'_1} ⊗ I and Z_{E_1} ⊗ I measurements before step 2. For the same reason she can measure Z_{F_1} ⊗ I at the same time. Together, these three measurements are equivalent to Alice measuring the Z operator of each individual qubit, obtaining a string u ∈ Z_2^{n+c}. Then b' = H'_1 u^T and k = E_1 u^T.

Bob can measure I ⊗ Z_{F_1} at the end of the protocol, as it commutes with I ⊗ Z_{H'_1} and I ⊗ Z_{E_1}. Together, these three measurements are equivalent to Bob measuring the Z operator of each individual qubit. The measurement result v ∈ Z_2^{n+c} is used to compute b'' = H'_1 v^T and k' = E_1 v^T. Because Bob has the last c qubits from the beginning (equivalently, the noise acts as the identity on them), he can measure them at the same time as Alice. Alice and Bob performing local Z measurements on the last c ebits is as if they shared a secret key κ of length c bits to start with. Alice measuring half of an ebit |Φ⟩ in the Z basis and sending the other half through the channel is equivalent to her preparing |0⟩ or |1⟩ at random and sending it through the channel. This leaves us with the [n, k_1 + k_2 − n, d; c] QKE protocol below.

(1). Alice and Bob share a secret key κ of length c bits. Alice generates a random n-bit string r. Together they form the (n+c)-bit string u = (r, κ). Alice computes b' = H'_1 u^T and k = E_1 u^T.

(2). Alice prepares the product state |r⟩ in her lab and sends it over the noisy channel.

(3). Bob receives the corrupted n-qubit state.
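The classical post-processing of this QKE protocol (the computations in steps (1), (3) and (4)) can be sketched end to end on a toy code. Everything below is my own illustration: a [3,1,3] repetition code with c = 0, E_1 chosen to complete the basis, and β realized as a syndrome lookup table for single-bit errors.

```python
import numpy as np

# Toy [3,1,3] repetition code, c = 0: H1 plays the role of H1',
# E1 extends it to a full-rank basis; the key bit is k = E1 u^T.
H1 = np.array([[1, 1, 0], [1, 0, 1]])
E1 = np.array([[1, 0, 0]])

def beta(b):
    """Syndrome decoder: map b = H1 e^T to E1 e^T for the single-bit error e."""
    table = {(0, 0): 0, (1, 1): 1, (1, 0): 0, (0, 1): 0}  # e = 000, 100, 010, 001
    return table[tuple(b % 2)]

rng = np.random.default_rng(7)
u = rng.integers(0, 2, 3)                  # Alice's random string (step 1)
k_alice = int((E1 @ u % 2)[0])             # Alice's key bit
b_alice = H1 @ u % 2                       # Alice's announced syndrome b'

e = np.array([0, 1, 0])                    # channel flips one qubit
v = (u + e) % 2                            # Bob's measured string (step 3)
# Step 4: k = k' + beta(b' + b'')
k_bob = (int((E1 @ v % 2)[0]) + beta((b_alice + H1 @ v) % 2)) % 2

assert k_bob == k_alice
print("shared key bit:", k_alice)
```

Because b' + b'' = H_1(u + v)^T = H_1 e^T depends only on the error, Bob recovers Alice's key bit for every correctable error pattern.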
He measures each qubit in the Z basis, obtaining a string r', which together with Bob's copy of the initial secret key κ forms the (n+c)-bit string v = (r', κ). Bob computes b'' = H'_1 v^T and k' = E_1 v^T.

(4). Alice sends b' to Bob. Bob computes k = k' + β(b' + b'').

The above QKE protocol deals with the unrealistic situation in which Eve is known to have introduced no more than t errors. To deal with the most general eavesdropping attack, the so-called coherent attack, Alice and Bob need to be able to estimate the effective channel introduced by Eve. It was shown in [39] that there is no loss of generality in assuming that Eve effects a Pauli channel, i.e. one that applies elements of the Pauli group chosen with particular probabilities. However, preparing and measuring only in the Z basis is insufficient to estimate the channel; for instance, phase errors pass undetected. The BB84 protocol circumvents this problem by preparing and measuring in both the Z and X bases. Alice will have to send a total of (2 + 3δ)n qubits: the factor of 2 comes from the number of different bases used, and a small fraction δn is reserved for channel estimation. The details of the protocol follow:

1. Alice creates (2 + 3δ)n random bits.
2. Alice chooses a random (2 + 3δ)n-bit string a, which determines whether each random bit is to be prepared in the Z basis (if the corresponding bit of a is 0) or the X basis (if the bit of a is 1).
3. Alice sends the qubits to Bob.
4. Bob receives the (2 + 3δ)n qubits and measures each in the Z or X basis at random.
5. Alice announces a.
6. Bob discards any results where he measured in a different basis than Alice prepared in. With high probability, there are at least (1 + δ)n bits left (if not, they abort the protocol). Alice randomly chooses a set of n bits to be her string r, and Bob's corresponding bits comprise r'. The remaining nδ bits are used for channel estimation.
7. Alice and Bob publicly announce the values of their channel estimation bits.
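The basis-sifting part of the protocol (steps 2 to 6) is purely classical and easy to simulate. A minimal sketch (my own, assuming a noiseless channel so that matched-basis measurements reproduce Alice's bits exactly):

```python
import random

random.seed(42)
N = 10000  # stands in for (2 + 3*delta)*n
bits       = [random.randint(0, 1) for _ in range(N)]  # step 1: Alice's random bits
alice_base = [random.randint(0, 1) for _ in range(N)]  # step 2: 0 = Z basis, 1 = X basis
bob_base   = [random.randint(0, 1) for _ in range(N)]  # step 4: Bob's random bases

# Step 6: keep only the positions where the bases match; on a noiseless
# channel Bob's outcome at these positions equals Alice's bit.
sifted = [bits[i] for i in range(N) if alice_base[i] == bob_base[i]]

print(f"kept {len(sifted)}/{N} bits (about 1/2 expected)")
```

On average half the positions survive sifting, which is why Alice must send roughly twice as many qubits as the n raw key bits she ultimately needs.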
If the estimated channel introduces more than t errors, they abort the protocol.

8. Alice computes b' = H'_1 u^T and k = E_1 u^T, where u = (r, κ). Alice announces b'.

9. Bob computes b'' = H'_1 v^T and k' = E_1 v^T, where v = (r', κ). Bob's estimate of the key k is k' + β(b' + b'').

Observe that if the protocol fails at any point, the pre-shared key κ remains uncompromised. Since the protocol was obtained from an entanglement distillation protocol, it is also universally composable [56, 33].

2.3 Discussion

As in [64, 39], we can go beyond fixed-distance codes, and instead use codes which merely perform well on i.i.d. (independent, identically distributed) channels. This is achieved by Alice performing a random permutation on her bits and announcing it to Bob in step 5, thus symmetrizing the noisy channel induced by Eve's actions. If Alice and Bob estimate a rate q of X and Z errors, it suffices for C_1 and C_2 to perform well on a binary symmetric channel (BSC) with error parameter slightly above q [39]. Modern classical codes such as turbo codes [15] and LDPC codes [37] can essentially achieve the Shannon capacity 1 − H(q) on a BSC. Moreover, these codes are (suboptimally) decodable in polynomial time. This gives a key rate of (m − c)/n ≈ 2(1 − H(q)) − 1 = 1 − 2H(q), which hits 0 for q ≈ 0.11. Thus, with a QKE protocol based on modern codes we can tolerate the Shor-Preskill bound q = 0.11 in practice.

At present there is a large gap between abstract security proofs of QKD, which rely on the theoretical existence of certain codes, and experimental implementations, which use PA and IR codes chosen ad hoc and are thus not proven to be secure. Our result bridges this gap: it makes accessible the example of modern turbo and LDPC codes, which are readily available and easy to encode and decode, yet provide a basis for unconditionally secure key distribution protocols. The performance of specific modern codes is currently under investigation.
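The key rate formula above is easy to evaluate; a minimal sketch (my own) locating the Shor-Preskill threshold, the error rate at which 1 − 2H(q) vanishes, by bisection:

```python
import math

def h2(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def key_rate(q):
    """Asymptotic QKE key rate (m - c)/n = 1 - 2 H(q) with capacity-achieving codes."""
    return 1 - 2 * h2(q)

# Bisection for the zero of the (decreasing) rate on [0.01, 0.25].
lo, hi = 0.01, 0.25
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if key_rate(mid) > 0 else (lo, mid)
print(f"rate vanishes at q ≈ {lo:.4f}")  # ≈ 0.1100
```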
Chapter 3: Quantum error-correcting codes based on privacy amplification

Calderbank-Shor-Steane (CSS) quantum error-correcting codes are based on pairs of classical codes which are mutually dual-containing. Explicit constructions of such codes for large block lengths and with good error-correcting properties are not easy to find. In this chapter we propose a construction of CSS codes which combines a classical code with a two-universal hash function. We show, using the results of Renner and Koenig [57], that the communication rates of such codes approach the hashing bound on tensor powers of Pauli channels in the limit of large block length. While the bit-flip errors can be decoded as efficiently as the classical code used, the problem of efficiently decoding the phase-flip errors remains open.

Secret classical information and quantum information are intimately related [20, 61]. One can convert a maximally entangled state shared by two distant parties into a secret classical key by local bilateral measurements in the standard basis. Since it is a pure state, it is decoupled from the environment, and so is the information about the measurement outcome. An important application of this simple observation is the Shor-Preskill [64] proof of the security of the Bennett-Brassard 1984 (BB84) [6] quantum key distribution (QKD) protocol. They convert an entanglement distillation protocol based on Calderbank-Shor-Steane (CSS) quantum error-correcting codes [19, 68] into a key distribution protocol. The correspondence often also works in the other direction. Given a quantum channel or noisy bipartite quantum state, a large class of secret communication protocols can be made into quantum communication protocols by performing all the steps "coherently", i.e. replacing probabilistic mixtures by quantum superpositions [23, 33, 32]. This result was derived in the idealized asymptotic context of quantum Shannon theory, without giving explicit, let alone efficient, code constructions.
Not coincidentally, these asymptotic codes had a structure reminiscent of CSS codes. In this chapter we bridge the gap between these asymptotic constructions and CSS codes. We obtain a subclass of CSS codes, called P-CSS codes, by making coherent a class of private codes studied by Renner and Koenig [57]. The original CSS [19, 68] construction requires two classical linear finite-distance codes which contain each other's duals. One is used for correcting bit-flip errors, while the other corrects phase-flip errors and was identified in [64] as being responsible for privacy amplification. The resulting quantum code is also of finite distance.

P-CSS codes involve a combination of a classical error-correcting code and a privacy amplification protocol. Consequently, the dual-containing property does not come into play, which greatly simplifies the construction. On the other hand, P-CSS codes are not finite-distance codes. This is not necessarily a problem: modern coding theory (e.g. LDPC [37] and turbo codes [15]) focuses not on distance but on performance on simple i.i.d. (independent, identically distributed) channels. The main result of this chapter is to show that P-CSS codes have excellent asymptotic behaviour, attaining the hashing bound on i.i.d. Pauli channels.

The chapter is organized as follows. In section 3.1, we recall the definition of two-universal hash functions and how they are used in private codes. We then explain how they can be turned into quantum codes, and give examples. In section 3.2, we state and prove the main theorem, which quantifies the performance of our code in terms of the parameters of the underlying classical error-correcting code and privacy amplification protocol. The asymptotic rates for entanglement transmission are calculated in section 3.3 for memoryless qubit Pauli channels. We conclude in section 3.4 with open problems.
3.1 Quantum codes based on affine two-universal hash functions

3.1.1 Private classical codes by two-universal hashing

Assume a communication scenario in which a sender Alice is connected to a receiver Bob and an eavesdropper Eve through a noisy classical "wiretap" channel with one input and two outputs [76]. Alice wants to send messages to Bob about which Eve is supposed to find out as little as possible. There are two obstacles Alice and Bob need to overcome: i) Bob receives only a noisy copy of Alice's input, and ii) (partial) information about her input leaks to Eve. In order to reduce Bob's noise they must apply error correction, and in order to increase Eve's noise they must perform privacy amplification. Together, the two comprise a private code.

An [n,k] error-correcting code C is a k-dimensional linear subspace of Z_2^n. It can be given either as the column space of an n×k generator matrix G, so that C = {Gy : y ∈ Z_2^k} (all vectors are assumed to be column vectors), or as the null space of an (n−k)×n parity check matrix H. The row space of H is the dual code C^⊥ of C. The interpretation is that Alice encodes her k-bit message y into the n-bit codeword x = Gy. This is sent down a noisy n-bit channel to Bob, who then tries to decode the original message y.

Privacy amplification (PA) protocols [8, 45, 7, 57] are defined in terms of random functions. A random function from X to Y is a random variable taking values from the set of functions with domain X and range Y. A random function f from X to Y is called two-universal if

Pr[f(x) = f(x′)] ≤ 1/|Y|

for any distinct x, x′ ∈ X. Suppose Alice has a k-bit wiretap channel which is noiseless to Bob but partially noisy to Eve. Still, Eve receives some information about the identity of the bits, and the k bits are thus not secure. However, Alice will settle for transmitting a smaller number of bits m, if it ensures that they will now be secret from Eve. The PA protocol is characterized by a two-universal random function f : Z_2^k → Z_2^m. Alice starts by drawing a particular realization of f.
To convey the secret message s ∈ Z_2^m she encodes it as an equiprobably chosen random element y of the set Z_s(f) = {y : f(y) = s}, and sends that down the channel. She then publicly announces the realization of f. Now Bob, knowing y and f, also knows s. On the other hand, it can be shown [8, 45, 7] that Eve's correlation with s is exponentially small in k−m.

A private code is a combination of an error-correcting code C and a privacy amplification protocol f. Formally it is a set of sets {C_s(f) : s ∈ Z_2^m}, where C_s(f) = {x : x = Gy, f(y) = s}. It is used for transmitting m bits of approximately secret information reliably over an n-bit wiretap channel which is noisy to Bob. As in the noiseless case, Alice draws a particular realization of f, and encodes the secret message s ∈ Z_2^m as an equiprobably chosen random element of the set C_s(f). After transmission she publicly announces the realization of f to enable Bob to decode the message.

3.1.2 The P-CSS code construction

Following [23], we can construct quantum codes by coherifying the private code {C_s(f) : s ∈ Z_2^m}. We will work in the computational qubit orthonormal basis {|0⟩, |1⟩} and the corresponding n-qubit basis {|x⟩ : x ∈ Z_2^n}, where |(x_1, x_2, ..., x_n)⟩ = |x_1⟩|x_2⟩...|x_n⟩. An [[n,k]] quantum code C is a 2^k-dimensional subspace of the space of n qubits (see e.g. [54] and references therein). It is used to encode k qubits into n, by unitarily transforming the standard k-qubit basis vectors into a particular basis of C. The basis vectors of C are referred to as the quantum codewords of C. Given the private code {C_s(f) : s ∈ Z_2^m}, we define its quantum counterpart C by quantum codewords {|ϕ_s(f)⟩ : s ∈ Z_2^m}, with

|ϕ_s(f)⟩ = (1/√(2^{k−m})) Σ_{x ∈ C_s(f)} |x⟩.   (9)

More precisely, this defines a random quantum code, as for each realization of f we have a potentially different C. We may occasionally be sloppy about this distinction.
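The noiseless privacy-amplification encoding described above (Alice encodes s as a uniformly random element of Z_s(f), Bob recovers s = f(y)) can be sketched as follows. The particular hash f here, which keeps the first m bits, is a toy stand-in of our own choosing for a draw from a two-universal family:

```python
# Sketch of the noiseless PA round trip: Alice draws a uniformly random
# preimage y in Z_s(f) = {y : f(y) = s}; Bob, knowing y and the announced
# f, recovers s = f(y).
import random

k, m = 8, 2

def f(y: int) -> int:
    """Toy affine hash Z_2^k -> Z_2^m: keep the first m bits (our choice)."""
    return y & ((1 << m) - 1)

def encode(s: int) -> int:
    """Draw a uniformly random element of Z_s(f) by rejection sampling."""
    while True:
        y = random.randrange(1 << k)
        if f(y) == s:
            return y

random.seed(0)
for s in range(1 << m):
    y = encode(s)
    assert f(y) == s          # Bob recovers s from y and f
print("round trip ok for all", 1 << m, "messages")
```

In the real protocol the security comes from f being drawn from a two-universal family; this sketch only illustrates the encode/decode bookkeeping.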
Recall that a quantum code is called a stabilizer code if it is the simultaneous +1 eigenspace of a set of k independent n-qubit Pauli operators (called the "stabilizers"). A CSS code is a stabilizer code, each stabilizer of which is composed of purely Z or purely X operators. It is defined by two classical error-correcting codes C_1 and C_2, with parity check matrices H_1 and H_2, respectively, such that C_2^⊥ ⊆ C_1. The set of stabilizers of the CSS code is given by {Z^{H_1}, X^{H_2}}, where Z^H = {Z^h : h is a row of H}, etc., and Z^h = Z^{h_1} ⊗ ... ⊗ Z^{h_n}, h = (h_1, ..., h_n). Bit-flip errors are corrected by the Z stabilizers, and phase-flip errors are corrected by the X stabilizers. CSS codes can alternatively be given in terms of codewords:

|x + C_2^⊥⟩ = (1/√|C_2^⊥|) Σ_{y ∈ C_2^⊥} |x + y⟩,   (10)

where x runs over the elements of C_1. It is easy to see that |x + C_2^⊥⟩ only depends on the coset of C_1/C_2^⊥ to which x belongs, so we can let x run only over suitably chosen coset representatives x_s.

The expression (10) very much resembles (9). We next show that if the random function f is affine, or rather that each of its realizations is affine, then each realization of C is indeed a CSS code. Assume f is of the form

s = f(y) = Ay + s_0,   (11)

for some s_0 ∈ Z_2^m and full-rank m×k matrix A. Denoting the null space of A by C′′, we see that y lies in the coset of Z_2^k/C′′ determined by s − s_0. Let F be a k×(k−m) matrix whose column space equals C′′ (if we think of A as a sort of parity check matrix, then F would be the corresponding generator matrix). Then the set Z_s(f) = {y : f(y) = s} can be written as

Z_s(f) = {Ft + y_s : t ∈ Z_2^{k−m}},

where the y_s are the coset representatives of Z_2^k/C′′. Hence C_s(f) = {x : x = Gy, y ∈ Z_s(f)} can be written as

C_s(f) = {GFt + x_s : t ∈ Z_2^{k−m}},

where x_s = Gy_s are the coset representatives of C/C′, and C′ is the column space of the n×(k−m) matrix GF.
Thus the quantum code C with codewords (9) is the CSS code composed from the classical codes C and (C′)^⊥, with parity check matrices H and (GF)^T, respectively. Its stabilizers are thus {Z^H, X^{(GF)^T}}. Observe that the quantum code C is independent of the constant term s_0 from the definition (11) of f; its only effect is to permute the quantum codewords.

To recapitulate, H and G are the parity check matrix and generator matrix of the classical code C, respectively. F comes from the affine structure of the two-universal hash function f. Since HG = 0, we also have H(GF) = 0, so the bit-flip and phase-flip codes are automatically contained in each other's duals. We dub the CSS codes thus obtained P-CSS codes.

3.1.3 Examples

We will illustrate our construction by using a particular class of affine two-universal hash functions taken from [8, 70]. There is a natural bijection between Z_2^k and polynomials in ζ ∈ GF(2^k) (ζ^{2^k−1} = 1) of degree k−1 over GF(2):

y = (y_1, ..., y_k) ⇔ y(ζ) = Σ_{i=1}^{k} y_i ζ^{i−1}.

Therefore, we can define the linear maps α(y(ζ)) = y and α^{−1}(y) = y(ζ). For a, b ∈ GF(2^k) (a ≠ 0), the degree-one polynomial

q_{a,b}(y(ζ)) = a y(ζ) + b

defines a permutation of GF(2^k), which induces a permutation π_{a,b} : Z_2^k → Z_2^k by

π_{a,b} = α ∘ q_{a,b} ∘ α^{−1}.

For any fixed m ≤ k, define the map τ : Z_2^k → Z_2^m which projects onto the first m bits: τ((y_1, ..., y_k)) = (y_1, ..., y_m). Define the function h_{a,b} : Z_2^k → Z_2^m as the composition h_{a,b} = τ ∘ π_{a,b}, and the set

P = {h_{a,b} | a, b ∈ GF(2^k), a ≠ 0}.

The random function f which is uniformly distributed on P is known to be two-universal. Moreover, each h_{a,b} is a composition of affine functions, and hence affine. As observed earlier, the resulting CSS code will be independent of the value of b.

Our first set of examples are [[7,1]] CSS codes. We take n = 7, k = 4 and m = 1.
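A minimal sketch of this hash family for k = 4: we realize GF(2^4) with the irreducible modulus x^4 + x + 1 (our choice; any irreducible degree-4 polynomial would do), identify "the first m bits" with the low-order bits of the integer encoding (also our convention), and check two-universality by brute force over all pairs of distinct inputs:

```python
# Sketch of the affine two-universal family P = {h_{a,b}} for k = 4, m = 1.
# GF(2^4) is realized modulo x^4 + x + 1 (an assumed, but valid, choice).
import itertools

K, M = 4, 1
MOD = 0b10011          # x^4 + x + 1

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiplication in GF(2^K) modulo MOD."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & (1 << K):
            a ^= MOD
        b >>= 1
    return r

def h(a: int, b: int, y: int) -> int:
    """h_{a,b} = tau . pi_{a,b}: permute by y -> a*y + b, keep M bits."""
    return (gf_mul(a, y) ^ b) & ((1 << M) - 1)

# Empirical two-universality check: for every pair of distinct inputs, the
# collision probability over uniform (a, b) with a != 0 is <= 2^-M.
funcs = [(a, b) for a in range(1, 1 << K) for b in range(1 << K)]
worst = 0.0
for x, xp in itertools.combinations(range(1 << K), 2):
    coll = sum(h(a, b, x) == h(a, b, xp) for a, b in funcs) / len(funcs)
    worst = max(worst, coll)

print(f"worst-case collision probability: {worst:.4f} <= {2**-M}")
```

For this family the collision probability is in fact (2^{k−m} − 1)/(2^k − 1) = 7/15 for every pair, strictly below the two-universal bound 2^{−m} = 1/2.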
For C we take the [7,4,3] Hamming code given by

G^T = [ 1 0 0 0 0 1 1
        0 1 0 0 1 0 1
        0 0 1 0 1 1 0
        0 0 0 1 1 1 1 ],

H = [ 0 0 0 1 1 1 1
      0 1 1 0 0 1 1
      1 0 1 0 1 0 1 ].

Different values of a ∈ GF(16) will in general give rise to different CSS codes. If we choose a = ζ^{−2}, then

F = [ 0 1 0
      0 1 1
      0 0 1
      1 0 0 ],

(GF)^T = [ 0 0 0 1 1 1 1
           1 1 0 0 1 1 0
           0 1 1 0 0 1 1 ].

The constructed seven-qubit quantum code is none other than the [[7,1,3]] Steane code, with stabilizers

{Z^H, X^{(GF)^T}} = { Z_4 Z_5 Z_6 Z_7, Z_2 Z_3 Z_6 Z_7, Z_1 Z_2 Z_5 Z_6, X_4 X_5 X_6 X_7, X_2 X_3 X_6 X_7, X_1 X_2 X_5 X_6 }.

On the other hand, if we choose a = ζ, we have

F = [ 1 0 0
      0 1 0
      0 0 1
      0 0 0 ],

(GF)^T = [ 1 0 0 0 0 1 1
           0 1 0 0 1 0 1
           0 0 1 0 1 1 0 ].

This code has stabilizers

Z_4 Z_5 Z_6 Z_7, Z_2 Z_3 Z_6 Z_7, Z_1 Z_2 Z_5 Z_6, X_1 X_6 X_7, X_2 X_5 X_7, X_3 X_5 X_6,

which cannot correct the Z_4 error, and hence does not have distance 3.

3.2 Entanglement transmission over Pauli channels

From the examples of the previous section we saw that, unlike the original CSS construction, the P-CSS construction does not tell us anything about the distance of the obtained quantum code. So in what sense is this construction useful? What can we say about the error correction properties of P-CSS codes?

What can be quantified is the performance of P-CSS codes on Pauli channels. In this section we will define the communication scenario of interest, and prove a theorem about the quality of entanglement that can be transmitted through such channels, given the properties of the classical code C and the parameters of the hash function f which comprise the P-CSS code. The central tool is a theorem by Renner and Koenig [57] regarding quantum privacy amplification (see Appendix).

An n-qubit Pauli channel N_n, which applies Pauli errors X^u Z^v with probability p_{u,v}, can be described as

N_n(ρ) = Σ_{u,v ∈ Z_2^n} p_{u,v} X^u Z^v ρ Z^v X^u,

where u = (u_1, ..., u_n) and v = (v_1, ..., v_n) are binary vectors of length n.
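The matrices in the first example can be checked mechanically. The sketch below recomputes (GF)^T over Z_2 for the a = ζ^{−2} case and verifies that HG = 0 and H(GF) = 0 hold automatically; note that the stabilizer generators printed may differ from those listed in the text by products of generators (row operations on H), which leaves the stabilizer group unchanged:

```python
# Sketch: verify the [[7,1]] P-CSS example from the given G, H and F
# (a = zeta^-2), entirely over Z_2 with plain lists.

def matmul2(A, B):
    """Matrix product over Z_2 (lists of lists)."""
    return [[sum(a * b for a, b in zip(row, col)) % 2
             for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

GT = [[1,0,0,0,0,1,1],
      [0,1,0,0,1,0,1],
      [0,0,1,0,1,1,0],
      [0,0,0,1,1,1,1]]
H  = [[0,0,0,1,1,1,1],
      [0,1,1,0,0,1,1],
      [1,0,1,0,1,0,1]]
F  = [[0,1,0],
      [0,1,1],
      [0,0,1],
      [1,0,0]]

G   = transpose(GT)
GF  = matmul2(G, F)
GFT = transpose(GF)

assert all(v == 0 for row in matmul2(H, G) for v in row)    # HG = 0
assert all(v == 0 for row in matmul2(H, GF) for v in row)   # H(GF) = 0

def stabilizers(rows, letter):
    """Render each parity-check row as a Pauli stabilizer string."""
    return [" ".join(f"{letter}{i+1}" for i, v in enumerate(r) if v)
            for r in rows]

print(stabilizers(H, "Z"))    # Z-type stabilizers from H
print(stabilizers(GFT, "X"))  # X-type stabilizers from (GF)^T
```

Running this reproduces (GF)^T = [(0001111), (1100110), (0110011)], i.e. the X-type generators X_4X_5X_6X_7, X_1X_2X_5X_6, X_2X_3X_6X_7 of the Steane code.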
We can equivalently represent the action of N_n as N_n(ρ) = Tr_E U_{N_n}^{A→BE}(ρ), where the isometry U_{N_n} is defined by its action on the computational basis states |x⟩^A, x ∈ Z_2^n, as

U_{N_n}|x⟩^A = |φ_x⟩^{BE} = Σ_{u,v} √(p_{u,v}) (−1)^{x·v} |x+u⟩^B |u,v⟩^E.   (12)

The interpretation is that, in being sent through the channel, Alice's pure input state |x⟩^A undergoes an isometry which splits it up between Bob's output system B and the unobservable environment E. This is analogous to the wiretap channel, with the environment playing the role of the eavesdropper Eve. Both Eve and Bob receive the state only partially.

We are concerned with the task of entanglement transmission. Alice is handed the A′ part of the m-ebit state (|Φ⟩^{⊗m})^{RA′}, where

|Φ⟩ = (1/√2)(|0⟩|0⟩ + |1⟩|1⟩),

the other half of which belongs to a reference system R which Alice cannot access. The objective is to use the channel N_n to transfer the purification of R to Bob, so that Bob ends up sharing an m-ebit state (|Φ⟩^{⊗m})^{RB′} with R. In most cases this can be done only approximately. Formally, an [[n,m]] entanglement generation code consists of an encoding isometry U_C^{A′→A} (which takes the standard m-qubit basis to the codewords of C) and a decoding operation D^{B→B′}. It is said to be η-good if

‖ D ∘ N_n ∘ U_C (Φ^{⊗m})^{RA′} − (Φ^{⊗m})^{RB′} ‖_1 ≤ η.

The P-CSS code C which we will use for entanglement transmission consists of an [n,k] code C with parity check matrix H, and an affine random two-universal hash function f : Z_2^k → Z_2^m. The performance of the quantum code will be given in terms of the performance of the classical code C, which involves the bit-flip errors u. To decode C one measures the error syndrome e = Hu. While the syndrome is uniquely determined by the error, the reverse is usually not the case. One needs to define the "inverse syndrome" function û : Z_2^{n−k} → Z_2^n, for which H û(e) = e. Now G, H and û fully specify the classical encoding and decoding operations.
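As a toy illustration of this syndrome decoding, and of the error parameters ε(e) and ε defined next, consider the [3,1] repetition code with minimum-weight (coset leader) decoding on a binary symmetric channel with flip probability q. The example is ours, not from the text:

```python
# Sketch: the classical error parameters p_e, eps(e), eps for the [3,1]
# repetition code on a BSC(q), with u_hat(e) the minimum-weight coset leader.
from itertools import product

H = [(1, 1, 0), (0, 1, 1)]     # parity check matrix of the repetition code
q = 0.1                        # bit-flip probability (our example value)

def syndrome(u):
    return tuple(sum(h * x for h, x in zip(row, u)) % 2 for row in H)

def p_error(u, q):
    w = sum(u)
    return q**w * (1 - q)**(len(u) - w)

# Group errors by syndrome; u_hat picks the most probable error per coset.
cosets = {}
for u in product((0, 1), repeat=3):
    cosets.setdefault(syndrome(u), []).append(u)

eps = 0.0
for e, errors in cosets.items():
    p_e   = sum(p_error(u, q) for u in errors)    # probability of syndrome e
    u_hat = min(errors, key=sum)                  # most probable error
    eps_e = 1 - p_error(u_hat, q) / p_e           # conditional failure prob.
    eps  += p_e * eps_e                           # average failure prob.

print(f"average decoding error eps = {eps:.4f}")  # = 3q^2(1-q) + q^3
```

For the repetition code this reduces to the familiar probability of two or more flips, 3q^2(1−q) + q^3.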
Define

p_u = Σ_v p_{u,v},   (13)
p_e = Σ_{u: Hu=e} p_u,   (14)
p_{u,v|e} = I(Hu=e) p_{u,v} / p_e,   (15)
p_{u|e} = I(Hu=e) p_u / p_e,   (16)
ε(e) = 1 − p_{û(e)|e},   (17)
ε = Σ_e p_e ε(e).   (18)

Here p_u is the marginal probability of the bit-flip error u occurring, p_e is the probability of the bit-flip error syndrome being e, p_{u|e} is the probability of the bit-flip error being u conditional on observing the syndrome e, and ε(e) and ε are the probabilities of incorrectly identifying the bit-flip error with and without conditioning on the syndrome e, respectively. We can now formulate our main theorem.

Theorem 6 Given an n-qubit Pauli channel N_n, a classical linear [n,k] code C with average error probability ε, and an affine random two-universal hash function f : Z_2^k → Z_2^m, there exists a random [[n,m]] P-CSS entanglement generation code which is on average η-good on N_n, with

η = 2√(2ε′ + 4√(2ε)) + 2√(2ε),
ε′ = 2^{−(1/2)(H_2(XE)_ω − H_0(E)_ω + k − n − m)},

and

ω^{XBE} = (1/2^n) Σ_{x ∈ Z_2^n} |x⟩⟨x|^X ⊗ |φ_x⟩⟨φ_x|^{BE}.   (19)

H_2 and H_0 denote Rényi entropies (see Appendix for details), and the state ω^{XBE} is the result of sending a randomly chosen computational basis element through the Pauli channel N_n.

Proof We will break up the proof into a number of steps.

1. Imagine Alice sends the state |x⟩, x ∈ C, through the channel, resulting in the state |φ_x⟩^{BE} defined in (12). Measuring the stabilizers {Z^H}, with probability p_e Bob will get the syndrome e and state

|φ_x(e)⟩^{BE} = Σ_{u: Hu=e, v} √(p_{u,v|e}) (−1)^{x·v} |x+u⟩^B |u,v⟩^E,

where p_e and p_{u,v|e} are defined in (14) and (15). Bob applies X^{û(e)} to correct the bit-flip errors, resulting in

|φ′_x(e)⟩^{BE} = √(1−ε(e)) |x⟩^B |ψ^good_x(e)⟩^E + √(ε(e)) |ψ^bad_x(e)⟩^{BE},

where

|ψ^good_x(e)⟩^E = |û(e)⟩^{E_1} Σ_v √(p_{v|û(e)}) (−1)^{x·v} |v⟩^{E_2},

and |ψ^bad_x(e)⟩^{BE} corresponds to {u ≠ û(e) : Hu = e}, i.e. the errors that fail to get corrected.
Since |ψ^bad_x(e)⟩^{BE} is orthogonal to |x⟩^B |ψ^good_x(e)⟩^E, we have (see Appendix for details about the fidelity F)

F( |φ′_x(e)⟩^{BE}, |x⟩^B |ψ^good_x(e)⟩^E ) = 1 − ε(e).   (20)

2. If Alice sends the quantum codeword

|ϕ_s⟩ = (1/√(2^{k−m})) Σ_{z ∈ C′} |x_s + z⟩,

and Bob performs the same steps as above, the resulting state is

(1/√(2^{k−m})) Σ_{z ∈ C′} |φ′_{x_s+z}(e)⟩^{BE},   (21)

which has fidelity 1 − ε(e) with

(1/√(2^{k−m})) Σ_{z ∈ C′} |x_s + z⟩^B |ψ^good_{x_s+z}(e)⟩^E.   (22)

3. Let {E_i = v_i + C′^⊥ : i ∈ Z_2^{k−m}} denote the cosets of C′^⊥ in F_2^n, with C′ defined as in Section 3.1.2. Recall also that {x_s : s ∈ Z_2^m} were defined as the coset representatives of C/C′. Since z·v = z·v_i for all v ∈ E_i and z ∈ C′, we can rewrite the state (22) as

Σ_{i ∈ Z_2^{k−m}} |ϕ_{s,i}⟩^B |θ_{s,i}(e)⟩^E,   (23)

with

|ϕ_{s,i}⟩^B = (1/√(2^{k−m})) Σ_{z ∈ C′} (−1)^{z·v_i} |z + x_s⟩^B,
|θ_{s,i}(e)⟩^E = |û(e)⟩^{E_1} ⊗ Σ_{v ∈ E_i} √(p_{v|û(e)}) (−1)^{x_s·v} |v⟩^{E_2}.   (24)

Observe that {|ϕ_{s,i}⟩^B : s ∈ Z_2^m, i ∈ Z_2^{k−m}} form a basis for the system B, and there exists a Clifford unitary U : |ϕ_{s,i}⟩^B ↦ |s⟩^{B_1}|i⟩^{B_2}. Bob applies U, resulting in a state |Υ_s(e)⟩^{B_1B_2E} which has fidelity 1 − ε(e) with

|Υ̃_s(e)⟩^{B_1B_2E} = |s⟩^{B_1} Σ_i |i⟩^{B_2} |θ_{s,i}(e)⟩^E.   (25)

4. Now we need to ensure that there is no s dependence of the state of the environment E. As we will see, this has to do with the performance of the private code on which our P-CSS code is based. By Lemma 7 below,

(1/2^m) Σ_s E_f ‖ σ^E_s − σ^E_0 ‖_1 ≤ 2ε′,   (26)

where σ^E_s = (1/2^{k−m}) Σ_{z ∈ C′} φ^E_{z+x_s}, φ^E_x = Tr_B |φ_x⟩⟨φ_x|^{BE}, and ε′ = 2^{−(1/2)(H_2(XE)_ω − H_0(E)_ω + k − n − m)}. It is easy to see that

Υ̃_s(e)^E = (1/2^{k−m}) Σ_{z ∈ C′} ψ^good_{x_s+z}(e)^E.

On the other hand, since φ^E_{x_s+z} = Σ_e p_e φ_{x_s+z}(e)^E, it follows that

σ^E_s = Σ_e p_e (1/2^{k−m}) Σ_{z ∈ C′} φ_{x_s+z}(e)^E.

From the concavity (144) and monotonicity (142) of fidelity, and (20), we have

F( σ^E_s, Σ_e p_e Υ̃_s(e)^E ) ≥ ( Σ_{e,z} (p_e/2^{k−m}) √( F( φ_{x_s+z}(e)^E, ψ^good_{x_s+z}(e)^E ) ) )^2 ≥ ( Σ_e p_e (1−ε(e)) )^2 ≥ 1 − 2ε.
Then by (141), for all s we have

‖ σ^E_s − Σ_e p_e Υ̃_s(e)^E ‖_1 ≤ 2√(2ε).   (27)

By the triangle inequality for trace distance, we can combine (26) and (27) to get

(1/2^m) Σ_s E_f ‖ Σ_e p_e Υ̃_s(e)^E − Σ_e p_e Υ̃_0(e)^E ‖_1 ≤ 2ε′ + 4√(2ε).   (28)

Since Υ̃_s(e)^E = Σ_i |θ_{s,i}(e)⟩⟨θ_{s,i}(e)|^E and {θ_{s,i}(e)}_{i,e} is an orthogonal set, we can express the privacy condition (28) as

(1/2^m) Σ_{s,e,i} p_e E_f ‖ θ_{s,i}(e)^E − θ_{0,i}(e)^E ‖_1 ≤ 2ε′ + 4√(2ε).

Set |θ_{s,i}(e)⟩^E = √(q_i) |θ̂_{s,i}(e)⟩^E, where |θ̂_{s,i}(e)⟩^E is normalized and Σ_i q_i = 1. Thus by (141) we have

(1/2^m) Σ_{s,e,i} p_e q_i ( 1 − E_f |⟨θ̂_{s,i}(e)|θ̂_{0,i}(e)⟩| ) ≤ ε′ + 2√(2ε).   (29)

5. Steps 2-4 considered Alice sending a single codeword through the channel. Now we are ready for the actual task of entanglement transmission. Alice is handed the A′ part of the state

(Φ^{⊗m})^{RA′} = (1/√(2^m)) Σ_{s ∈ Z_2^m} |s⟩^R |s⟩^{A′}.

She performs the encoding U_C : |s⟩^{A′} ↦ |ϕ_s⟩ and sends the output down the channel. Bob performs all the operations in steps 2-4, and the resulting state (conditional on syndrome e) is

|Υ(e)⟩ = (1/√(2^m)) Σ_s |s⟩^R |Υ_s(e)⟩^{B_1B_2E}.

Define also

|Υ̃(e)⟩ = (1/√(2^m)) Σ_s |s⟩^R |Υ̃_s(e)⟩^{B_1B_2E}

and

|Υ̂(e)⟩ = (1/√(2^m)) Σ_{s,i} |s⟩^R |s⟩^{B_1} (−1)^{b(s,i)} |i⟩^{B_2} |θ_{0,i}(e)⟩^E,

with

(−1)^{b(s,i)} = ⟨θ̂_{s,i}(e)|θ̂_{0,i}(e)⟩ / |⟨θ̂_{s,i}(e)|θ̂_{0,i}(e)⟩|.

Thus

⟨Υ̃(e)|Υ̂(e)⟩ = (1/2^m) Σ_{s,i} q_i (−1)^{b(s,i)} ⟨θ̂_{s,i}(e)|θ̂_{0,i}(e)⟩ = (1/2^m) Σ_{s,i} q_i |⟨θ̂_{s,i}(e)|θ̂_{0,i}(e)⟩|.

Averaging over all possible syndromes, we can define

Υ = Σ_e p_e Υ(e)^{RB_1B_2E},

and similarly Υ̃ and Υ̂. Then by the concavity (144) of fidelity, the convexity property E[X^2] ≥ (E[X])^2, and the privacy condition (29), we have

E_f F( Υ̂^{RB_1B_2E}, Υ̃^{RB_1B_2E} ) ≥ E_f ( Σ_e p_e √( F( |Υ̂(e)⟩, |Υ̃(e)⟩ ) ) )^2 ≥ ( (1/2^m) Σ_{s,i,e} p_e q_i E_f |⟨θ̂_{s,i}(e)|θ̂_{0,i}(e)⟩| )^2 ≥ 1 − 2ε′ − 4√(2ε).
It is not hard to see that

F( Υ^{RB_1B_2E}, Υ̃^{RB_1B_2E} ) ≥ 1 − 2ε,

since by the concavity (144) of fidelity, the fact that trace-preserving quantum operations never reduce fidelity, and condition (20), we have

F( Υ, Υ̃ ) ≥ ( Σ_e p_e √( F( Υ(e), Υ̃(e) ) ) )^2 ≥ ( Σ_e p_e (1−ε(e)) )^2 ≥ 1 − 2ε.

Combining the results above and using (141), we have

E_f ‖Υ − Υ̂‖_1 ≤ E_f ‖Υ − Υ̃‖_1 + E_f ‖Υ̃ − Υ̂‖_1 ≤ η,

with η = 2√(2ε′ + 4√(2ε)) + 2√(2ε).

Finally, Bob performs the decoupling unitary

V^{B_1B_2} = Σ_{s,i} (−1)^{b(s,i)} |s⟩⟨s|^{B_1} ⊗ |i⟩⟨i|^{B_2},

and throws away the B_2 system (and also, implicitly, E, which he never had access to anyway). This combined operation takes Υ̂ to the desired state

(1/√(2^m)) Σ_{s ∈ Z_2^m} |s⟩^R |s⟩^{B_1}.   (30)

By the monotonicity of trace distance (143), it takes the actual state Υ to a state which, on average over f, is η-close in trace distance to (30). Hence the average performance of the code (averaging over f) is η-good, as claimed. This averaging can be treated in two ways. Alice and Bob could start with pre-shared randomness, based on which they choose f. More simply, if the codes are good on average, then at least one does at least as well as the average.

It is worth summing up Bob's decoding operation. He first measures the stabilizers {Z^H}, obtaining the bit-flip error syndrome e, and applies X^{û(e)} to correct the bit-flip errors. To deal with the phase errors, he performs U^{B→B_1B_2} followed by V^{B_1B_2}, and discards B_2. This phase-correcting combined operation can be implemented in a simpler, less coherent way. Instead of performing U, he measures the stabilizers {X^{(GF)^T}}, thus uniquely determining i. Based on i he performs the unitary

V^B_i = Σ_s (−1)^{b(s,i)} |ϕ_{s,i}⟩⟨ϕ_{s,i}|,   (31)

which corrects the phase error. Finally, he un-encodes with U_C^{−1}. It is important to note that V^B_i is not efficiently implementable, which is a realistic problem when the code length becomes large (see Discussion).
□

Lemma 7 In the notation of the proof of Theorem 6,

(1/2^m) Σ_s E_f ‖ σ^E_s(f) − σ^E_0(f) ‖_1 ≤ 2ε′,   (32)

where σ^E_s(f) = (1/2^{k−m}) Σ_{z ∈ C′} φ^E_{z+x_s} and where ε′ = 2^{−(1/2)(H_2(XE)_ω − H_0(E)_ω + k − n − m)}.

Proof Consider the state

σ^{YE} = (1/2^k) Σ_{y ∈ Z_2^k} |y⟩⟨y|^Y ⊗ φ^E_{Gy}.   (33)

By Lemma 35, we have the privacy condition

E_f ‖ σ^{SE}(f) − τ^S ⊗ σ^E ‖_1 ≤ 2^{−(1/2)(H_2(YE)_σ − H_0(E)_σ − m)},   (34)

with

σ^{SE}(f) = (1/2^m) Σ_{s ∈ Z_2^m} |s⟩⟨s|^S ⊗ σ^E_s(f),

where

σ^E_s(f) = (1/2^{k−m}) Σ_{y: f(y)=s} φ^E_{Gy} = (1/2^{k−m}) Σ_{z ∈ C′} φ^E_{x_s+z},

and τ^S = (1/2^m) Σ_s |s⟩⟨s|^S is the maximally mixed state. By the relations (see Appendix E for details) H_2(YE)_σ = H_2(XE)_ω − (n−k) and H_0(E)_σ ≥ H_0(E)_ω, with ω defined in (19), the condition (34) implies

E_f ‖ σ^{SE}(f) − τ^S ⊗ σ^E ‖_1 ≤ 2^{−(1/2)(H_2(XE)_ω − H_0(E)_ω + k − n − m)}.

This can be rewritten as

(1/2^m) Σ_s E_f ‖ σ^E_s(f) − σ^E ‖_1 ≤ ε′,

where ε′ = 2^{−(1/2)(H_2(XE)_ω − H_0(E)_ω + k − n − m)}. Without loss of generality, we can assume that E_f ‖σ^E_0(f) − σ^E‖_1 ≤ ε′, and therefore, by the triangle inequality, we have

(1/2^m) Σ_s E_f ‖ σ^E_s(f) − σ^E_0(f) ‖_1 ≤ 2ε′.   (35)

□

3.3 Code performance on memoryless qubit Pauli channels

We now consider the important case of i.i.d. (independent and identically distributed) or tensor power channels N_n = N̂^{⊗n}, where N̂ is a single-qubit Pauli channel:

N̂(ρ) = Σ_{u,v ∈ Z_2} p_{u,v} X^u Z^v ρ Z^v X^u.

First, we need to characterize the error parameter η from Theorem 6 using smooth Rényi entropy. Recall that

ε′ = 2^{−(1/2)(H_2(XE)_ω − H_0(E)_ω + k − n − m)}

is the error parameter introduced to describe the privacy amplification condition (26). Then by Corollary 36 (see Appendix), we can upper-bound ε′ by

ε_1 = 2^{−(1/2)(H_2^δ(XE)_ω − H_0^δ(E)_ω + k − n − m)} + 2δ.   (36)

For our i.i.d. case we have ω = ω̂^{⊗n} with

ω̂^{XBE} = (1/2) Σ_{x ∈ Z_2} |x⟩⟨x|^X ⊗ U_N̂ |x⟩⟨x| U_N̂^†.

Then as n → ∞, by Lemma 34 we have

(1/n) H_2^δ(XE)_{ω̂^{⊗n}} → H(XE)_ω̂ + o(δ),
(1/n) H_0^δ(E)_{ω̂^{⊗n}} → H(E)_ω̂ + o(δ).

Observing H(X)_ω̂ = 1, then as n → ∞ we have

(1/n) H_2^δ(XE)_ω − (1/n) H_0^δ(E)_ω − 1 → −I(X;E)_ω̂ + o(δ).
Notice that the state ω^{XBE} represents the classical-quantum correlations for Alice sending classical strings over a Pauli channel. A memoryless Pauli channel acts as a classical binary symmetric channel for classical information. So there exist good classical [n,k] codes, such as LDPC codes, with code rates

k/n = I(X;B)_ω̂ − Δ

approaching the classical Shannon capacity

I(X;B)_ω̂ = 1 − H({p_00 + p_01, p_10 + p_11}),

since we have

ω̂^{XB} = (1/2) Σ_{x ∈ Z_2} |x⟩⟨x|^X ⊗ ( (p_00 + p_01)|x⟩⟨x|^B + (p_10 + p_11)|x+1⟩⟨x+1|^B ),

and the error probability ε → 0 as n → ∞; here the effective bit-flip probability is p_10 + p_11, and Δ is a constant which can be made quite small [37, 52, 53, 58, 65]. Hence, as n → ∞, the error parameter ε_1 defined by (36) approaches a limit, i.e.

ε_1 → ε_2 = 2^{−(1/2){n[C − Δ + o(δ)] − m}} + 2δ,

where

C = I(X;B)_ω̂ − I(X;E)_ω̂ = 1 − H({p_00, p_01, p_10, p_11})

is the hashing bound of the Pauli channel [9]. Therefore, the rate of our entanglement generation codes can go up to

m/n = C − Δ,

such that ε_2 → 0 as n → ∞ and δ → 0. Therefore, if we employ LDPC codes as our classical codes, we obtain a family of η-good entanglement generation codes with code rate approaching the hashing bound of the memoryless Pauli channel, and error parameter η → 0 as n → ∞.

Here we present an example of P-CSS codes based on an LDPC code from David MacKay's classical paper [52]. The classical code is a Gallager code [37] with n = 19839, k = 9839, and each column of its parity check matrix has weight t = 3. For practical purposes, we shall extend the error parameter ε defined by (18) to include both detected and undetected errors.† For bit-flip error probability equal to 0.076, the paper gave an estimate of the block error probability of 2.62×10^{−5}, not specified as detected or undetected errors. So 2.62×10^{−5} is an optimistic estimate of the total block error probability.
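The quantities above are easy to evaluate numerically. The following sketch computes I(X;B), the hashing bound C, and I(X;E) = I(X;B) − C for the depolarizing channel with p = 0.114 used in the example that follows (the channel parameters are taken from that example; the script itself is ours):

```python
# Sketch: hashing bound C = 1 - H({p_00, p_01, p_10, p_11}) for a
# depolarizing channel with p = 0.114, whose bit-flip marginal is 2p/3.
import math

def H(probs):
    """Shannon entropy of a probability vector, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

p = 0.114
pauli = [1 - p, p / 3, p / 3, p / 3]      # (p_00, p_01, p_10, p_11)
q_flip = 2 * p / 3                        # marginal bit-flip probability

IXB = 1 - H([q_flip, 1 - q_flip])         # classical BSC capacity
C   = 1 - H(pauli)                        # hashing bound
IXE = IXB - C

print(f"I(X;B) = {IXB:.4f}, C = {C:.4f}, I(X;E) = {IXE:.4f}")
```

The output reproduces the values I(X;E) ≈ 0.3046 and C ≈ 0.3074 quoted in the example.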
† Detected errors occur if the decoder identifies the block to be in error but its algorithm runs for the maximum number of iterations without finding a valid decoding. The original definition ε = 1 − Σ_e p_{û(e)} covers undetected errors, which occur if the decoder finds a valid decoding that is not the correct one.

Figure 15: The error parameter η vs the quantum code rate R_Q for the Gallager code with n = 19839, k = 9839 and column weight t = 3 (p = 0.114, C = 0.3074).

Then we consider the code performance on an i.i.d. depolarizing channel

N̂(ρ) = (1−p)ρ + (p/3)(XρX + YρY + ZρZ).

Since n is large enough, we can estimate ε′ by ε′ = 2^{−(1/2)(−n I(X;E)_ω̂ + k − m)}, and for p = 0.076 × 3/2 = 0.114 we have I(X;E)_ω̂ = 0.3046 and C = 0.3074. Setting R_Q = m/n, we plot η vs R_Q in Fig. 15. As the figure shows, η stabilizes at 0.3548 once R_Q < 0.19. Note that η is a rough upper bound on the real error, which is believed to be much smaller than 0.3548. So Fig. 15 is for illustration purposes only, and should not be taken as the real code performance.

3.4 Discussion

We have studied a subclass of CSS codes called P-CSS codes, which are based on a classical error-correcting code and a two-universal hash function. P-CSS codes are very flexible and easy to construct, and have excellent asymptotic performance. However, there are several drawbacks, relative to traditional finite-distance CSS codes, that deserve further study. The phase-flip-correcting unitary operation V_i from (31) suffers from two kinds of inefficiencies. First, it is not a tensor power of single-qubit operations, but a generic n-qubit operation. Contrast this to the usual way of correcting phase-flip errors: given the error syndrome i ∈ Z_2^{k−m}, one associates to it the most probable error v̂(i) ∈ Z_2^n (cf. the function û(e) for bit-flip errors), and performs Z^{v̂(i)}. In particular, there is no s-dependence as in b(s,i) from (31).
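The η-vs-R_Q curve of Fig. 15 can be re-derived from the quoted parameters. This is our reconstruction under the stated large-n approximation ε′ ≈ 2^{−(1/2)(−n I(X;E) + k − m)}, using the Gallager-code parameters n = 19839, k = 9839, block error probability ε = 2.62×10^{−5}, and I(X;E) = 0.3046:

```python
# Sketch: reconstruct the eta(R_Q) estimate behind Figure 15 from the
# quoted Gallager-code and depolarizing-channel parameters.
import math

n, k = 19839, 9839
eps  = 2.62e-5        # total block error probability (optimistic estimate)
IXE  = 0.3046         # I(X;E) for the depolarizing channel with p = 0.114

def eta(RQ):
    """eta = 2*sqrt(2*eps' + 4*sqrt(2*eps)) + 2*sqrt(2*eps), Theorem 6."""
    m = RQ * n
    eps_prime = 2.0 ** (-0.5 * (-n * IXE + k - m))
    return (2 * math.sqrt(2 * eps_prime + 4 * math.sqrt(2 * eps))
            + 2 * math.sqrt(2 * eps))

for RQ in (0.15, 0.18, 0.19, 0.195):
    print(f"R_Q = {RQ:5.3f}  eta = {eta(RQ):.4f}")
```

The curve plateaus at η ≈ 0.3548 for R_Q below roughly 0.19 (where ε′ becomes negligible) and blows up just above it, matching the behaviour described for Fig. 15.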
Second, even if the phase-flip-correcting unitary is of the form Z^{v̂(i)}, the function v̂(i) may not be efficiently computable. This issue is not present for û(e) if we use an efficiently decodable LDPC code for C. We believe that at least the first issue can be overcome by modifying the proof technique of Theorem 6. Finally, the error parameter η is a rough upper bound on the real error, and a tighter bound would be highly desirable for applications.

Chapter 4: Secret Key Assisted Private Classical Capacity over Quantum Channels

We prove a regularized formula for the secret-key-assisted capacity region of a quantum channel for transmitting private classical information. This result parallels the work of Devetak on entanglement-assisted quantum communication capacity [25]. This formula provides a new family protocol, the private father protocol, under the resource inequality framework, which includes private classical communication without assisting secret keys as a child protocol.

4.1 Introduction

Secret keys, by definition, refer to common randomness available to a sender and receiver at distant locations, while any third party has absolutely no information about it. Generating secret keys is thus concerned with maintaining secrecy against a third party [3]. An information-theoretic model in the classical setting is the famous "wiretap channel" [76], where the sender wants to communicate with one legitimate receiver while keeping the eavesdropper completely ignorant of the message sent. Private communication can then be achieved via encryption once secret keys are generated. In this sense, secret keys are a valuable resource that can be used to achieve information transmission tasks. The above scenario has a quantum analogue, where secret keys are generated over a quantum channel.
Secret key generating protocols have been proposed by several authors [46, 59, 32], and in [23] it was shown that the capacity of a quantum channel for transmitting private classical information is the same as the capacity of the same channel for generating secret keys. Furthermore, neither capacity is enhanced by forward public classical communication. This raises the interesting question of how different resources interconvert in a quantum information protocol, which was partially answered in [26, 25, 1].

The formal treatment of quantitative interconversions between non-local information processing resources is studied in [25], wherein such an asymptotically faithful conversion is expressed as a resource inequality (RI). These resource inequalities are extremely powerful, and sometimes lead to new quantum protocols [26]. For example, they allow us to relate the family protocols to several well-known quantum protocols by direct application of teleportation or superdense coding, etc.

In this chapter, we study the private classical communication capacity of a quantum channel when assisted by secret keys. We show that secret keys are a useful non-local resource that can increase the private classical communication capacity of quantum channels; however, unlimited secret keys will not help. The trade-off between the rate of secret keys consumed and the rate of private classical communication gained is presented quantitatively. Under the RI framework, our protocol can be understood as a "private father protocol" due to its similarity to the original father protocol. Furthermore, the unassisted private classical communication capacity [23] can be seen as a child protocol of ours.

This chapter is organized as follows. Section 4.2 contains definitions of notation and relevant background material. Section 4.3 contains the statement and proof of our main result. In a subsequent section, we rewrite our result under the RI framework, and show how to recover the unassisted private classical capacity from ours.
We conclude in Section 4.5.

4.2 Notation

Consider a classical-quantum system, say XQ, in the state described by an ensemble {p(x), ρ_x}, with p(x) defined on X and the ρ_x being density operators on the Hilbert space H_Q of system Q. Such a state ρ^{XQ} of the systems XQ can be represented in the "enlarged Hilbert space" (EHS) representation

ρ^{XQ} = Σ_x p(x) |x⟩⟨x|^X ⊗ ρ_x^Q,

where X is a dummy quantum system and {|x⟩ : x ∈ X} is an orthonormal basis for the Hilbert space H_X of system X. The reduced density operators of systems X and Q are ρ^X = Tr_Q ρ^{XQ} = Σ_x p(x)|x⟩⟨x| and ρ^Q = Tr_X ρ^{XQ}, respectively.

The von Neumann entropy of a quantum state ρ^Q is defined as H(Q)_ρ = −Tr(ρ log ρ). We omit the subscript ρ when the state is clear from the context. Notice that the von Neumann entropy of the dummy quantum system X equals the Shannon entropy of the random variable X with probability distribution p(x). The conditional entropy is defined as

H(Q|X) = H(QX) − H(X). (37)

It should be noted that conditioning on classical variables (systems) amounts to averaging, so (37) is also equal to

H(Q|X) = Σ_x p(x) H(Q)_{ρ_x}. (38)

The mutual information is defined as

I(X;Q) = H(X) + H(Q) − H(QX).

Next, we briefly introduce definitions and properties of typical sequences and subspaces [54]. Let T^n_{X,δ} denote the set of typical sequences associated with a random variable X with probability distribution p defined on the set X:

T^n_{X,δ} = { x^n : ∀x ∈ X, |N(x|x^n)/n − p(x)| ≤ δ },

where N(x|x^n) is the number of occurrences of x in the sequence x^n := x_1 ··· x_n of length n. Assume the density operator ρ^Q of system Q has the spectral decomposition ρ^Q = Σ_y p(y)|y⟩⟨y|. Then we can define the typical projector as

Π^n_{Q,δ} = Σ_{y^n ∈ T^n_{Y,δ}} |y^n⟩⟨y^n|
(39)

For a collection of states {ρ_x : x ∈ X}, the conditional typical projector is defined as

Π^n_{Q|X,δ}(x^n) = ⊗_x Π^{I_x}_{Q|x,δ}, (40)

where the indicator set I_x = {i : x_i = x} and Π^{I_x}_{Q|x,δ} denotes the typical projector of the density operator ρ^Q_x in the positions given by the set I_x. Fixing δ > 0, we will need the following properties of typical subspaces and conditionally typical subspaces:

Tr σ^Q_{x^n} Π^n_{Q|X,δ}(x^n) ≥ 1 − ǫ, (41)
Tr σ^Q_{x^n} Π^n_{Q,δ(|X|+1)} ≥ 1 − ǫ, (42)
Tr Π^n_{Q,δ(|X|+1)} ≤ α, (43)
Π^n_{Q|X,δ}(x^n) σ^Q_{x^n} Π^n_{Q|X,δ}(x^n) ≤ β^{−1} Π^n_{Q|X,δ}(x^n), (44)

where α = 2^{n[H(Q)+cδ]} and β = 2^{n[H(Q|X)−cδ]}, for ǫ = 2^{−nc′δ²} and some constants c and c′.

Finally, we need some facts about the trace distance (taken from [54]). The trace distance between two density operators ρ and σ is defined as

||ρ − σ||_1 = Tr|ρ − σ|,

where |A| ≡ √(A†A) is the positive square root of A†A. The monotonicity property of the trace distance is

||ρ^{RB} − σ^{RB}||_1 ≥ ||ρ^B − σ^B||_1. (45)

4.3 Main result

4.3.1 Classical-quantum channels

We begin by defining our private classical communication protocol for a {c → qq} channel from the sender Alice to the receiver Bob and the eavesdropper Eve. The channel is defined by the map W : x → σ^{BE}_x, with x ∈ X and the state σ^{BE}_x defined on a bipartite quantum system BE; Bob has access to subsystem B and Eve has access to subsystem E. Alice's task is to transmit, by a large number n of uses of the channel W, one of 2^{nR} equiprobable messages to Bob so that he can identify the message with high probability, while at the same time Eve receives almost no information about the message. In addition, before the protocol begins, Alice and Bob are given a private string (secret key) picked uniformly at random from the set {0,1}^{nR_s}. The inputs to the channel W^{⊗n} are classical sequences x^n ∈ X^n with probability p^n(x^n). The outputs of W^{⊗n} are density operators σ^{BE}_{x^n} = σ^{BE}_{x_1} ⊗ ··· ⊗ σ^{BE}_{x_n}, living on the Hilbert space H_{B^n E^n}.
An (n,R,R s ,ǫ) secret keys assisted private channel code consists of 83 • An encryption map f : {0,1} nR ×{0,1} nRs → {0,1} nR , i.e. f generates an indexrandom variableK uniformly distributed in{0,1} nR basedon theclassical messageembodiedintherandomvariableM andthesharedsecretkeysembodied in the random variable S. Furthermore, f(m,s 1 ) 6= f(m,s 2 ) for s 1 6= s 2 and f(m 1 ,s)6=f(m 2 ,s) for m 1 6=m 2 . • An encoding map E : {0,1} nR → X n . Alice encodes the index k as E(k) and sends it through the channel W ⊗n , generating the state Υ AsBsBE = 1 2 nRs X s∈{0,1} nRs |sihs| As ⊗|sihs| Bs ⊗ 1 2 nR X m∈{0,1} nR σ BE E(f(m,s)) (46) • A decoding POVM{Λ k ′} k ′ ∈{0,1} nR, where Λ k ′ is a positive operator acting onB and taking on values k ′ . Bob need to infer the index k through the POVM; • A decryption map g :{0,1} nR ×{0,1} nRs →{0,1} nR , where g(f(m,s),s) =m, ∀s,m. This allows Bob to recover Alice’s message as m ′ = g(k ′ ,s) based on k ′ and s; such that k e Υ BE −τ B ⊗σ E k 1 ≤ǫ, (47) where e Υ BE is the state of the subsystem BE after Bob’s decoding operation, and τ B = 1 2 nR X m |mihm| B contains the private classical information that is decoupled from Eve’s state σ E . A rate pair (R,R s ) is called achievable if for any ǫ,δ > 0 and sufficiently large n there exists an (n,R−δ,R s +δ,ǫ) private channel code. The private capacity region 84 C PF (W) is a two-dimensional region in the (R,R s ) plane with all possible achievable rate pairs (R,R s ). We now state our main theorem. Theorem 8 The private channel capacity region C PF (W) is given by C PF (W)= ∞ [ n=1 1 n e C (1) PF (W ⊗n ), (48) where the notationZ means the closure of a setZ and e C (1) PF (W) is the set of allR s ≥0, R≥0 such that R ≤ I(X;B) σ −I(X;E) σ +R s (49) R ≤ I(X;B) σ , (50) where BE|X is given by W and σ is of the form σ XBE = X x p(x)|xihx| X ⊗σ BE x . 
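To make the region of Theorem 8 concrete, the two corner points can be evaluated for a small example. The following Python sketch is illustrative only: the binary channel, the specific output states, and the helper functions are assumptions made here, not part of the thesis. It computes the Holevo quantities I(X;B) and I(X;E) for a toy {c → qq} wiretap channel and forms the corner points P and Q of the trade-off region.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits, from the eigenvalues of rho."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log2(w)).sum())

def holevo(p, states):
    """Holevo information I(X;Q) of the ensemble {p(x), rho_x}."""
    avg = sum(px * r for px, r in zip(p, states))
    return entropy(avg) - sum(px * entropy(r) for px, r in zip(p, states))

proj = lambda v: np.outer(v, v.conj())

# Toy binary {c -> qq} wiretap channel (made-up states): Bob's outputs are
# well separated (overlap 1/sqrt(2)); Eve's are nearly identical (overlap 0.9).
p = [0.5, 0.5]
bob = [proj(np.array([1.0, 0.0])), proj(np.array([1.0, 1.0]) / np.sqrt(2))]
eve = [proj(np.array([1.0, 0.0])), proj(np.array([0.9, np.sqrt(0.19)]))]

I_XB = holevo(p, bob)   # ~ 0.601 bits
I_XE = holevo(p, eve)   # ~ 0.286 bits

# Corner points of the region in Theorem 8:
P_corner = (I_XB - I_XE, 0.0)   # no secret keys: rate I(X;B) - I(X;E)
Q_corner = (I_XB, I_XE)         # keys at rate I(X;E) unlock rate I(X;B)
```

Because Eve's states nearly coincide, the unassisted private rate I(X;B) − I(X;E) is strictly positive, and consuming keys at rate I(X;E) raises the private rate to the full HSW rate I(X;B).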
Proving that the right-hand side of (48) is achievable is called the direct coding theorem, whereas showing that it is an upper bound is called the converse. For the direct coding part, we will need the following lemma from [28], a quantum generalization of the covering lemma in [49].

Lemma 9 (Covering Lemma) We are given an ensemble {p(x), σ_x}_{x∈X} with average density operator σ = Σ_{x∈X} p(x)σ_x. Assume the existence of projectors Π and (Π_x)_{x∈X} with the following properties (∀x ∈ X):

Tr σ_x Π_x ≥ 1 − ǫ,
Tr σ_x Π ≥ 1 − ǫ,
Tr Π ≤ α,
Π_x σ_x Π_x ≤ β^{−1} Π_x.

In addition, we require Π_x and σ_x to commute for all x. The obfuscation error of a set S ⊆ X is defined as

oe(S) = || (1/|S|) Σ_{x∈S} σ_x − σ ||_1,

and is an upper bound on the probability of distinguishing the fake average from the real one. Define the set C = (X_s)_{s∈[S]}, where each X_s is a random variable chosen independently according to the distribution p on X, and S = ⌈γ^{−1} α/β⌉ for some 0 < γ < 1. Then

Pr{ oe(C) ≥ 2ǫ + 19√ǫ } ≤ 2α exp(−κ_0 ǫ³/γ). (51)

Corollary 10 Consider an ensemble {p^n(x^n), σ^E_{x^n}}_{x^n∈X^n} with average density operator σ^E = Σ_{x^n} p^n(x^n) σ^E_{x^n}, let the random variables X_1, X_2, ..., X_S all be independently distributed according to p^n, and let C = (X_s)_{s∈[S]}. Then for all ǫ, δ > 0 and sufficiently large n,

Pr{ oe(C) ≥ 2ǫ + 19√ǫ } ≤ 2α exp(−κ_0 S ǫ³ β/α), (52)

where α = 2^{n[H(E)_σ + cδ]}, β = 2^{n[H(E|X)_σ − cδ]}, S = 2^{n[I(X;E)_σ + 3cδ]}, and

oe(C) = || (1/S) Σ_{s∈[S]} σ^E_{X_s} − σ^E ||_1.

Proof We relate to Lemma 9 through the identifications X → X^n, σ_x → σ_{x^n}, p → p^n, σ → σ^E, Π → Π^n_{E,δ(|X|+1)}, and Π_x → Π̂^n_{E|X,δ}(x^n), with

Π̂^n_{E|X,δ}(x^n) = Π^n_{E|X,δ}(x^n) if x^n ∈ T^n_{X,δ}, and 0 otherwise.

The four conditions now read (for all x^n ∈ X^n):

Tr σ^E_{x^n} Π̂^n_{E|X,δ}(x^n) ≥ 1 − ǫ, (53)
Tr σ^E_{x^n} Π^n_{E,δ(|X|+1)} ≥ 1 − ǫ, (54)
Tr Π^n_{E,δ(|X|+1)} ≤ α, (55)
Π̂^n_{E|X,δ}(x^n) σ^E_{x^n} Π̂^n_{E|X,δ}(x^n) ≤ β^{−1} Π̂^n_{E|X,δ}(x^n). (56)

These follow from the properties of typical subspaces and conditionally typical subspaces mentioned before.
∎

We will also need the Holevo-Schumacher-Westmoreland (HSW) theorem [42, 60].

Proposition 11 (HSW theorem) Given an ensemble

σ^{XB} = Σ_{x∈X} p(x)|x⟩⟨x|^X ⊗ σ^B_x,

and an integer n, consider the encoding map E : K → X^n given by E(k) = X_k, where K takes values in the set [K] := {1, 2, ..., K} and the {X_k} are random variables chosen according to the i.i.d. distribution p^n. For any ǫ, δ > 0 and sufficiently large n, there exists a decoding POVM (Λ_k)_{k∈[K]} on B^n for the encoding map E, with K = 2^{n[I(X;B)−2(c+c′δ)δ]} for some constant c, such that for all k,

E Σ_{k′} |π(k′|k) − δ(k,k′)| ≤ ǫ.

Here π(k′|k) is the probability of decoding k′ conditioned on k having been encoded,

π(k′|k) = Tr(Λ_{k′} σ_{E(k)}), (57)

δ(k,k′) is the delta function, and the expectation is taken over the random encoding.

Now we are ready to prove the direct coding theorem.

Proof [direct coding theorem] The capacity region is shown in Fig. 16.

[Figure 16: Private classical communication capacity region of a {c → qq} channel when assisted by pre-shared secret keys; the axes are R and R_s, with corner points P = (I(X;B) − I(X;E), 0) and Q = (I(X;B), I(X;E)).]

This trade-off region includes the two extreme points P and Q. When R_s = 0 (point P), the private classical capacity of W equals I(X;B) − I(X;E); this is the well-known private classical communication capacity proved in [23]. In our case it suffices to prove that point Q is optimal, that is, the achievability of the rate pair (R, R_s) = (I(X;B), I(X;E)). The idea of the proof is as follows: instead of sacrificing nI(X;E) bits of classical message to randomize Eve's knowledge of the state, Alice and Bob use pre-shared secret keys to do so. For all ǫ, δ > 0 and sufficiently large n, we show below that a private information transmission rate of I(X;B) is achievable if Alice and Bob consume pre-shared secret keys at rate I(X;E).

Fix ǫ, δ > 0 and a sufficiently large n. Consider the ensemble {p^n(x^n), σ^{BE}_{x^n}} of the channel output W^{⊗n}.
There exists an encoding map E : K → X K for Alice on the encryption output K =f(M,S) where X K is i.i.d. according to p n , M represents the classical message taken values from {0,1} nR , and S represents the pre-shared secret keys taken values from {0,1} nRs . Here {X K } serves as a HSW code. In the following, we will explicitly use f(m,s) instead of its index k. For each m∈{0,1} nR , define C m = (X f(m,s) ) s∈[2 nRs ] . C m works as a covering code as define in Corollary 10. Choose R s = I(X;E)+3(c+c ′ δ)δ. For any m ∈ {0,1} nR , define the logic statement ℓ m by oe(C m )≤2ǫ+19 √ ǫ, where oe(C m )= 1 2 nRs X s σ E X f(m,s) −σ E 1 , σ E = P x np n (x n )σ E x n and σ E x n =Tr B σ BE x n . By Corollary 10, Pr{not ℓ m }≤2αexp(−κ 0 2 nRs ǫ 3 β/α) , ∀m. (58) The probability of (58) can be made ≤ǫ2 −nR when n is sufficient large. We now invoke Proposition 11. Choose R = I(X;B)−2(c+c ′ δ)δ, there exists a POVM {Λ k ′} k ′ ∈{0,1} nR acting on B such that for all k, E X k ′ |π(k ′ |k)−δ(k,k ′ )|≤ǫ . (59) 89 After Bob performs the POVM, the state (46) becomes ˆ Υ= 1 2 nRs X s |sihs| As ⊗|sihs| Bs ⊗ 1 2 nR X m,k ′ π(k ′ |f(m,s)) |k ′ ihk ′ | B ⊗σ E X f(m,s) , which is close to ˆ Υ 0 = 1 2 nRs X s |sihs| As ⊗|sihs| Bs ⊗ 1 2 nR X m |f(m,s)ihf(m,s)| B ⊗σ E X f(m,s) in the sense that Ek ˆ Υ− ˆ Υ 0 k 1 ≤ǫ by the condition (59). Bob applies the decryption map g, resulting in a state e Υ AsBsBE . By the mono- tonicity of trace distance (45), we have Ek e Υ BE − e Υ BE 0 k 1 ≤ǫ , (60) where e Υ BE 0 = 1 2 nR X m |mihm| B ⊗ 1 2 nRs X s σ E X f(m,s) . By the Markov inequality, Pr{not ℓ 0 }≤ √ ǫ, where ℓ 0 is the logic statement k e Υ BE − e Υ BE 0 k 1 ≤ √ ǫ. By the union bound, Pr{not (ℓ 0 &ℓ 1 &···&ℓ m )}≤ 2 nR X i=0 Pr{not ℓ i }≤ǫ+ √ ǫ. 90 Encryption f Encoding E Channel Decryption g Decoding k Λ K K’ M M’ S n X n ⊗ W n n n E B x σ Figure17: Privateclassicalcommunicationprotocolassistedbypre-sharedsecretkeys. 
Hence there exists a specific choice of {X f(m,s) }, say {x f(m,s) }, for which all these conditions are satisfied. Consequently, k e Υ BE −τ B ⊗σ E k 1 ≤k e Υ BE − e Υ BE 0 k 1 +k e Υ BE 0 −τ B ⊗σ E k 1 ≤2ǫ+20 √ ǫ . as claimed. 2 Proof [converse] We shall prove that, for any δ,ǫ > 0 and sufficiently large n, if an (n,R,R s ,ǫ) secret keys assisted private channel code has rate R then (49) and (50) hold. The private classical communication protocol is shown in Fig. 17 nR = H(K) = I(K;K ′ )+H(K|K ′ ) ≤ I(K;K ′ )+1+nǫlog|X|, where the last inequality follows from Fano’s inequality: H(K|K ′ )≤1+Pr{K 6=K ′ }nR, 91 and Pr{K 6=K ′ }≤ǫ is guaranteed by the HSW theorem. Hence, I(K;K ′ ) ≤ I(K;B n ) (61) ≤ I(X n ;B n ), (62) where the first inequality follow from the data processing inequality while the second inequality comes from the Markov condition K →X n →B n E n . We then have R−δ≤ 1 n I(X n ;B n ), (63) where without loss of generality ǫ≤ δ 6log|X| and n≥ 2 δ . This proves (50). On the other hand, I(M;M ′ )≤I(M;SB n ) (64) =I(M;B n |S)+I(M;S) (65) =I(MS;B n )+I(M;S)−I(S;B n ) (66) ≤I(MS;B n ) (67) =I(K;B n ) (68) where (64) follows from data processing inequality, while (67) follows from the facts that I(M;S)=0, and I(S;B n )≥0. Furthermore, (60) guarantees that ǫ≥I(M;E n |S)=I(MS;E n )−I(S;E n ) (69) ≥I(K;E n )−H(S). (70) 92 Combining (68) and (70) gives I(M;M ′ )≤I(K;B n )−I(K;E n )+H(S)+ǫ (71) ≤I(X n ;B n )−I(X n ;E n )+H(S)+ǫ (72) Hence nR =H(M) (73) =I(M;M ′ )+H(M|M ′ ) (74) ≤I(M;M ′ )+1+nǫlog|X| (75) where (75) follows from the Fano’s inequality. Finally, (72) and (75) gives (49) with proper choices of ǫ≤ δ 6log|X| and n≥ 2 δ R−δ≤ 1 n [I(X n ;B n )−I(X n ;E n )+H(S)] (76) = 1 n [I(X n ;B n )−I(X n ;E n )]+R s (77) where R s = H(S) n . 2 4.3.2 Generic quantum channels Consider now Alice and Bob are connected by a noisy quantum channelN :B(H ′ A )→ B(H B ), where B(H) denotes the space of bounded linear operators on H. 
Let U_N : B(H_{A′}) → B(H_{BE}) be an isometric extension of N that includes the unobserved environment E, which is completely under the control of the eavesdropper Eve. Theorem 8 can then be rewritten as follows.

Theorem 12 The private channel capacity region C_PF(N) is given by

C_PF(N) = ⋃_{n=1}^∞ (1/n) C̃^{(1)}_PF(N^{⊗n}), (78)

where the closure of the union on the right-hand side is taken, and C̃^{(1)}_PF(N) is the set of all R_s ≥ 0, R ≥ 0 such that

R ≤ I(A;B)_σ − I(A;E)_σ + R_s, (79)
R ≤ I(A;B)_σ, (80)

where σ is of the form σ^{ABE} = U^{A′→BE}_N(|ψ⟩^{AA′}) for some pure input state |ψ⟩^{AA′} with reduced density operator ρ^{A′} = Σ_x p(x)ρ_x, and U_N : A′ → BE is an isometric extension of N.

With the spectral decomposition of the input state ρ^{A′} = Σ_x p(x)ρ_x, each U_N induces a corresponding {c → qq} channel. Therefore, the results of the previous section can be applied directly.

4.4 Private Father Protocol

In this section, we phrase our result using the theory of resource inequalities developed in [26]. The channel N : A → B, assisted by some rate R_s of secret keys shared between Alice and Bob, is used to enable a rate R of secret communication between Alice and Bob. This is written as

⟨N⟩ + R_s [cc]* ≥ R [c→c]*. (81)

This resource inequality holds if and only if (R_s, R) ∈ C_PF(N), with C_PF(N) given in Theorem 12. The "if" direction, i.e. the direct coding theorem, follows from the "corner point"

⟨N⟩ + I(A;E) [cc]* ≥ I(A;B) [c→c]*. (82)

The resource inequality (82) is called the private father protocol because of its similarity to the father protocol of [26]. We can recover the unassisted private channel capacity result of [23]:

⟨N⟩ ≥ I_c(A⟩B) [c→c]*. (83)

This resource inequality can be obtained by appending the following noiseless RI

[c→c]* ≥ [cc]* (84)

to the output of (82).

4.5 Conclusion

In this chapter, we have found a regularized expression for the secret-key assisted capacity region C_PF(N) of a quantum channel N for transmitting private classical information.
Our result shows that secret keys are a valuable non-local resource for transmitting private information. One interesting problem is to investigate how secret keys can be applied in other quantum protocols. For example, one might hope that the entanglement generation protocol could be boosted by secret keys. However, the outlook seems pessimistic: it is impossible to construct a secret-key assisted entanglement generation protocol by simply "coherifying" the protocol proposed in this chapter. Another open problem is to obtain a single-letterized formula.

Chapter 5: Channel Simulation with Quantum Side Information

In this chapter, we study and solve the problem of classical channel simulation with quantum side information at the receiver: simulating a noisy classical channel using noiseless classical bit channels and shared common randomness, together with the receiver's quantum side information. This is a generalization of both the classical reverse Shannon theorem [11, 72], which simulates a noisy classical channel by noiseless bit channels and shared common randomness, and the classical-quantum Slepian-Wolf problem [30], which simulates a noiseless channel with the receiver's quantum side information. The optimal noiseless communication rate is found to be the mutual information between the channel input and output, reduced by the Holevo information between the channel output and the quantum side information.

Our main theorem has two important corollaries. The first is a quantum generalization of the Wyner-Ziv problem [77]: rate-distortion theory with quantum side information. The second is an alternative proof of the trade-off between classical communication and common randomness distilled from a quantum state [31].

The fully quantum generalization of the problem considered here is quantum state redistribution: the sender and receiver share a mixed quantum state, and the sender wants to transfer part of her state to the receiver using entanglement and quantum communication.
We present outer and inner bounds on the achievable rate pairs.

This chapter is organized as follows. In Section 5.1 we introduce the notation and give some background. Section 5.2 contains our main result, Theorem 13, together with its proof. Section 5.3 discusses consequences of Theorem 13. In Section 5.4 we find outer and inner bounds for a fully quantum version of our problem. Section 5.5 concludes with a discussion and proposed future work.

5.1 Notation

Let us introduce some useful notation for bipartite classical-quantum systems. The state of a classical-quantum system XB can be described by an ensemble E = {ρ^B_x, p(x)}, with p(x) defined on X and the ρ^B_x being density operators on the Hilbert space H_B of B. Thus, with probability p(x) the classical index and quantum state take on the values x and ρ^B_x, respectively. A useful representation of classical-quantum systems is obtained by embedding the random variable X in some quantum system, also labelled X. Our ensemble {ρ^B_x, p(x)} then corresponds to the density operator

ρ^{XB} = Σ_x p(x)|x⟩⟨x|^X ⊗ ρ^B_x, (85)

where {|x⟩ : x ∈ X} is an orthonormal basis for the Hilbert space H_X of X. A classical-quantum system may, therefore, be viewed as a special case of a quantum one.

The von Neumann entropy of a quantum system A with density operator σ^A is defined as H(A)_σ = −Tr σ^A log σ^A; the subscript is often omitted. For a tripartite quantum system ABC in some state σ^{ABC}, define the conditional von Neumann entropy

H(B|A) = H(AB) − H(A),

the quantum mutual information

I(A;B) = H(A) + H(B) − H(AB) = H(B) − H(B|A),

and the quantum conditional mutual information

I(A;B|C) = I(A;BC) − I(A;C).

For classical-quantum correlations (85), the von Neumann entropy H(X)_ρ is just the Shannon entropy H(X) = −Σ_x p(x) log p(x) of the random variable X. The conditional entropy H(B|X) equals Σ_x p(x)H(ρ^B_x). The mutual information I(X;B) is the Holevo quantity [41] of the ensemble E:

χ(E) = H(Σ_x p(x)ρ_x) − Σ_x p(x)H(ρ_x).
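As a numerical sanity check of the identification I(X;B) = χ(E), the following sketch computes the Holevo quantity of a made-up two-state qubit ensemble (the specific values are illustrative assumptions, not from the thesis) both directly and as H(X) + H(B) − H(XB) on the embedded state (85), which is block diagonal.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log2(w)).sum())

# Toy ensemble E = {rho_x, p(x)} on a qubit (illustrative values only).
p = np.array([0.3, 0.7])
rhos = [np.diag([1.0, 0.0]), np.diag([0.9, 0.1])]

# Holevo quantity chi(E) = H(sum_x p(x) rho_x) - sum_x p(x) H(rho_x).
avg = sum(px * r for px, r in zip(p, rhos))
chi = entropy(avg) - sum(px * entropy(r) for px, r in zip(p, rhos))

# Same quantity as I(X;B) = H(X) + H(B) - H(XB) on the embedded state
# rho^{XB} = sum_x p(x)|x><x| (x) rho_x, which is block diagonal.
rho_XB = np.zeros((4, 4))
rho_XB[:2, :2] = p[0] * rhos[0]
rho_XB[2:, 2:] = p[1] * rhos[1]
H_X = float(-(p * np.log2(p)).sum())
I_XB = H_X + entropy(avg) - entropy(rho_XB)
```

The two computations agree to machine precision, reflecting the fact that for classical-quantum correlations the quantum mutual information reduces to the Holevo quantity.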
Finally we need to introduce a classical-quantum analogue of a Markov chain. We may define a classical-quantum Markov chainY→X→B associated with an ensemble {ρ B xy ,p(x,y)} for which ρ B xy =ρ B x is independent of y. Such an object typically comes aboutbyaugmentingthesystemXB bytherandomvariableY (classically)correlated with X via a conditional distribution W(y|x) = Pr{Y =y|X =x}. This corresponds to the state ρ XYB = X x p(x) X y W(y|x)|yihy| Y ⊗|xihx| X ⊗ρ B x . (86) HereW(y|x)isthenoisychannelandX andY areinputandoutputrandomvariables. Therefore the classical-quantum system YB can be expressed as ρ YB = X y q(y)|yihy| Y ⊗ρ B y (87) 98 with q(y)= P x p(x)W(y|x) and ρ B y = P x P(x|y)ρ B x . 5.2 Channel simulation with quantum side information Consider a classical-quantum system XB in the state (85) such that the sender Alice possesses the classical index X and the receiver Bob has the quantum system B. Consider a classical channel from Alice to Bob given by the conditional probability distributionW. Applying this channel to theX part ofρ XB results in the stateρ XYB given by (86). Ideally, we are interested in simulating the channel W using noiseless communication and common randomness, in the sense that the simulation produces the state ρ XYB . For reasons we will discuss later, we want Alice to also get a copy Y of the output, so that the final state produced is ρ XYYB = X x p(x) X y W(y|x)|yihy| Y ⊗|yihy| Y ⊗|xihx| X ⊗ρ B x . (88) The systems X and Y are in Alice’s possession, while Bob has B and Y. As usual in information theory, this task is amenable to analysis when we go to the approximate, asymptotic i.i.d. (independent, identically distributed) setting. This means that Alice and Bob share n copies of the classical-quantum system XB, given by the state ρ X n B n = X x n p n (x n )|x n ihx n | X n ⊗ρ B n x n, (89) where x n = x 1 ...x n is a sequence in X n , p n (x n ) = p(x 1 )...p(x n ), and ρ x n = ρ x 1 ⊗ ρ x 2 ···⊗ρ xn . 
They want to simulate the channel W^n(y^n|x^n) = W(y_1|x_1)...W(y_n|x_n) approximately, with error approaching zero as n → ∞. They have access to common randomness at a rate of C bits per copy, meaning that they share a string l picked uniformly at random from the set {0,1}^{nC}. In addition, they are allowed classical communication at a rate of R bits per copy, so that Alice may send an arbitrary string m from the set {0,1}^{nR} to Bob. An (n, R, C, ǫ) simulation code consists of

• An encoding stochastic map E_n : X^n × {0,1}^{nC} → {0,1}^{nR} × {0,1}^{nS}. If the value of the common randomness is l ∈ {0,1}^{nC}, Alice encodes her classical message x^n as the index ms, m ∈ {0,1}^{nR}, s ∈ {0,1}^{nS}, with probability E_l(m,s|x^n) := E_n(m,s|x^n,l), and sends only m to Bob;

• A set {Λ^{(lm)}}_{lm∈{0,1}^{n(C+R)}}, where each Λ^{(lm)} = {Λ^{(lm)}_{s′}}_{s′∈{0,1}^{nS}} is a POVM acting on B^n and taking on values s′. Bob is not sent the true value of s and needs to infer it from the POVM;

• A deterministic decoding map D_n : {0,1}^{nC} × {0,1}^{nR} × {0,1}^{nS} → Y^n; this allows Alice and Bob to produce their respective simulated outputs ỹ^n = D_l(m,s) := D_n(l,m,s) and ŷ^n = D_l(m,s′), based on l, m and s (in Bob's case, s′);

such that

||(ρ^{XYYB})^{⊗n} − σ^{X^n Ŷ^n Ỹ^n B̂^n}||_1 ≤ ǫ. (90)

Here the state σ^{X^n Ŷ^n Ỹ^n B̂^n} denotes the result of the simulation, which includes Alice's original X^n, the post-measurement system B̂^n, Alice's simulation output random variable Ỹ^n, and Bob's simulation output random variable Ŷ^n (based on s′). A rate pair (R, C) is called achievable if for all ǫ > 0, δ > 0 and sufficiently large n there exists an (n, R+δ, C+δ, ǫ) code. We now state our main theorem.

Theorem 13 The region of achievable (R, C) pairs is given by

R ≥ I(X;Y) − I(Y;B),
C + R ≥ H(Y|B).

[Figure 18: Achievable region of rate pairs for a classical-quantum system XB; the axes are R and C, with boundary values I(X;Y) − I(Y;B), H(Y|X) and H(Y|B).]

The theorem contains a direct coding part (achievability) and a converse part (optimality).
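The quantities in Theorem 13 can be computed for a toy instance. In this sketch (uniform X, a binary symmetric channel W with crossover probability 0.1, and qubit side information states, all chosen here purely for illustration), the corner point (R, C) = (I(X;Y) − I(Y;B), H(Y|X)) satisfies C + R = H(Y|B) with equality, since I(X;Y) + H(Y|X) = H(Y) and H(Y|B) = H(Y) − I(Y;B).

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log2(w)).sum())

def shannon(dist):
    d = dist[dist > 1e-12]
    return float(-(d * np.log2(d)).sum())

proj = lambda v: np.outer(v, v.conj())

# Toy source and channel (illustrative choices): uniform X, W = BSC(0.1),
# and Bob's side information B holding |0> or |+> depending on x.
p = np.array([0.5, 0.5])
W = np.array([[0.9, 0.1],
              [0.1, 0.9]])          # W[x, y] = W(y|x)
rho_x = [proj(np.array([1.0, 0.0])), proj(np.array([1.0, 1.0]) / np.sqrt(2))]

q = p @ W                           # q(y) = sum_x p(x) W(y|x)
H_Y_given_X = sum(px * shannon(W[x]) for x, px in enumerate(p))
I_XY = shannon(q) - H_Y_given_X

# rho_y = sum_x P(x|y) rho_x, with P(x|y) = p(x) W(y|x) / q(y) by Bayes.
rho_y = [sum(p[x] * W[x, y] / q[y] * rho_x[x] for x in range(2))
         for y in range(2)]
I_YB = entropy(sum(qy * r for qy, r in zip(q, rho_y))) \
       - sum(qy * entropy(r) for qy, r in zip(q, rho_y))
H_Y_given_B = shannon(q) - I_YB

# Corner point of the region in Theorem 13:
R, C = I_XY - I_YB, H_Y_given_X
```

Bob's side information saves I(Y;B) bits of communication relative to the reverse Shannon theorem, and the corner point sits exactly on the sum-rate boundary C + R = H(Y|B).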
For the direct coding theorem it suffices to prove the achievability of the rate pair (R, C) = (I(X;Y) − I(Y;B), H(Y|X)). The full region given by Theorem 13 (see Figure 18) follows by observing that a bit of common randomness may be generated from a bit of communication.

A naive simulation would be for Alice to actually perform the channel W locally and send a compressed instance of the output to Bob. This would require a communication rate of H(Y) bits per copy. The first idea is to split this information into an intrinsic and an extrinsic part [73]. The extrinsic part has rate H(Y|X) and is provided by the common randomness; only the intrinsic part I(X;Y) = H(Y) − H(Y|X) requires classical communication. This protocol would amount to sending the strings m and s above. However, a further saving of I(Y;B) is accomplished by Bob deducing the index s from his quantum state. Thus Alice need only send m, which requires a rate of I(X;Y) − I(Y;B).

For the direct coding part we will need several lemmas. The first is the Chernoff bound (cf. [4]).

Lemma 14 (Chernoff bound) Let Z_1, ..., Z_n be i.i.d. random variables with mean μ, and define Z̄_n = (1/n) Σ_{j=1}^n Z_j. If the Z_j take values in the interval [0, b], then for η ≤ 1/2 and some constant κ_0,

Pr{ |Z̄_n − μ| ≥ μη } ≤ 2 exp(−κ_0 n μ η² / b). (91)

The second lemma concerns deterministically "diluting" a uniformly distributed random variable to a non-uniform one on a larger set. We will need it to create y^n from l, m and s.

Lemma 15 (Randomness dilution) We are given a probability distribution q(y) defined on Y and a set T ⊆ Y such that

q(T) := Σ_{y∈T} q(y) ≥ 1 − ǫ, (92)
q(y) ≥ α, ∀y ∈ T, (93)

for some positive numbers α and ǫ. Let W be the random variable uniformly distributed on {1, ..., M}. For random variables Y_1, Y_2, ..., Y_M, all distributed according to q, define the map G : {1, ..., M} → Y by G(i) = Y_i. Then, letting q̃ be the distribution of G(W),

Pr{ ||q − q̃||_1 ≥ η + ǫ } ≤ 2|T| exp(−κ_0 M α η²)

for some constant κ_0.

Proof Consider the indicator function I(G(i) = y), taking values in {0,1}.
Observe that I(G(i) =y) for i∈{1,...,M} are i.i.d. random variables with expectation value 102 EI(G(i) = y) = q(y). The distribution e q(y) of G(W) is 1 M P M i=1 I(G(i) = y). By the Chernoff bound (14), for each y∈T, for η≤ 1 2 , and some constant κ 0 , Pr ( 1 M M X i=1 I(G(i)=y)−q(y) ≥q(y)η ) ≤2exp(−κ 0 Mαη 2 ). (94) By the union bound, Pr{not ι}≤2|T|exp(−κ 0 Mαη 2 ), where the logic statement ι is given by ι={e q∈[ˆ q(1−η),ˆ q(1+η)]} and ˆ q(y) =q(y)I(y ∈T). It remains to relate ι to a statement about ke q−qk 1 . First observe that kˆ q−qk 1 = X y |ˆ q(y)−q(y)| = X y6∈T q(y)≤ǫ. (95) Second, observe that ι implies ke q− ˆ qk 1 ≤η. The two give, via the triangle inequality kq−e qk 1 ≤η+ǫ. The statement of the lemma follows. 2 Corollary 16 Consider a random variable Y with distribution q(y), and let W be the random variable uniformly distributed on{1,...,M}. For random variablesY 1 ,...,Y M 103 all distributed according to q n , define the map G :{1,...,M}→Y n by G(i) =Y i . Let e q be the distribution of G(W). Then, for all ǫ,δ>0 and sufficiently large n, Pr{kq n −e qk 1 ≥2ǫ}≤2γexp(−κ 0 Mǫ 2 /γ), where γ =2 n[H(Y)+cδ] and c is some positive constant. Proof We will assume familiarity with the properties of typicality and conditional typicality, collected in the Appendix. We can relate to Lemma 15 through the identi- fications: Y →Y n , q(y)→q n (y n ), and T →T n Y,δ . The two conditions now read q n (T n Y,δ )≥1−ǫ, (96) q n (y n )≥γ −1 , ∀y n ∈T n Y,δ . (97) These follow from properties 1 and 2 of Theorem 43 (relabeling X to Y and p to q). 2 E D M ( ) P | x y y ⊆ Figure 19: The covering lemma. Our next lemma contains the crucial ingredient of the direct coding theorem and is based on [73]. It will tell us how to define the encoding and decoding operations for a particular value of the common randomness. 104 Lemma 17 (Covering lemma) We are given a probability distribution q(y) and a conditional probability distribution P(x|y), with x ∈ X and y ∈ Y. 
Assume the exis- tence of sets T ⊆X and (T y ) y∈Y ⊆X with the following properties for all y∈Y: X y∈Y q(y)P(T y |y) ≥ 1−ǫ, (98) X y∈Y q(y)P(T|y) ≥ 1−ǫ, (99) |T| ≤ K, (100) P(x|y) ≤ k −1 , ∀x∈T y . (101) Define M = η −1 K/k for some 0<η < 1. Given random variables Y 1 ,Y 2 ,...,Y M all distributed according to q, define the map D : {1,2,...,M} → Y by D(i) = Y i . Then there exists a conditional probability distribution E(i|x) defined for i ∈ {1,2,...,M} such that Pr{k ˆ Pu−Epk 1 ≥5ǫ}≤2Kexp(−κ 0 ǫ 3 /η), (102) where ˆ P(x|i) =P(x|D(i)), u is the uniform distribution on {1,2,...,M} and p is the marginal distribution defined by p(x)= P y∈Y P(x|y)q(y). Remark The meaning of the covering lemma is illustrated in Figure 3. A uniform distribution on the set {1,2,...,M} is diluted via the map D to the set Y, and then stochasticallymappedtothesetX viaP(x|y). Condition(102)saysthattheverysame distribution on {1,2,...,M}×X can be obtained by starting with the marginal p(x) and stochastically “concentrating” it to the set {1,2,...,M}. For this to be possible, the conditional outputs of the channel P(x|y) (for particular values of y) should be sufficiently spread out to cover the support of p(x). Each conditional output random variableissupportedonT y (98)ofcardinalityroughly≥k (101),andp(x)issupported 105 on T (99) of cardinality ≤ K (100). Thus roughly M ≈ K/k conditional random variables ˆ P(x|i) should suffice for the covering. Proof The idea is to use the Chernoff bound, as in the proof of the randomness dilution lemma. First we trim our conditional distributions to make them fit the conditions of the Chernoff bound; the resulting bound is then related to the condition (102). Define w(x)= X y∈Y q(y)P(x|y)I(x∈A y ), with A y =T y T T and A= S y∈Y A y . By properties (98) and (99), w(A)= X y∈Y q(y)P(A y |y)≥1−2ǫ . Further define B y =A y T {x:w(x)≥ǫ/K} and B = S y∈Y B y . Then define e P(x|y)=P(x|y)I(x∈B y ), e w(x)= X y∈Y q(y) e P(x|y)=w(x)I(w(x)≥ǫ/K). 
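The M ≈ K/k intuition above can be illustrated with a purely classical Monte Carlo sketch. All distributions below are arbitrary toy choices introduced here for illustration: drawing M "covering codewords" Y_i ~ q and averaging the conditionals P(·|Y_i) reproduces the marginal p up to small L1 error, exactly the concentration that the covering lemma quantifies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy classical instance of the covering picture (all numbers made up):
# q(y) on |Y| = 4 letters and a spread-out conditional P(x|y) on |X| = 8.
q = np.array([0.4, 0.3, 0.2, 0.1])
P = rng.dirichlet(np.ones(8), size=4)   # P[y] is the distribution P(.|y)

p = q @ P                               # marginal p(x) = sum_y q(y) P(x|y)

# Draw M covering codewords Y_i ~ q and average their conditionals;
# the mixture (1/M) sum_i P(.|Y_i) concentrates around p.
M = 20000
ys = rng.choice(4, size=M, p=q)
mixture = P[ys].mean(axis=0)

l1 = float(np.abs(mixture - p).sum())   # small once M is large enough
```

Here the deviation shrinks like 1/sqrt(M) per letter; in the lemma the analogous statement is made exponentially sharp via the Chernoff bound, with M ≈ K/k codewords sufficing.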
By (100), the cardinality ofA is upper-bounded byK, thosex∈A withw(x) smaller than ǫ/K contribute at most ǫ to w(A). Thus e w(B)≥w(A)−ǫ≥1−3ǫ. (103) Observe E e P(x|D(i)) = e w(x) ≥ ǫ/K. By (101), 0 ≤ e P(x|D(i)) ≤ k −1 . We can now apply the Chernoff bound (14) to the i.i.d. random variables e P(x|D(i)) (for fixed x∈X) Pr ( 1 M M X i=1 e P(x|D(i)) / ∈[(1−ǫ)e w(x),(1+ǫ)e w(x)] ) ≤2exp(−κ 0 ǫ 3 /η). (104) 106 Hence Pr{not ι}≤2Kexp(−κ 0 ǫ 3 /η), (105) where the logic statement ι is defined as ι= ( 1 M M X i=1 e P(·|D(i))∈[(1−ǫ)e w,(1+ǫ)e w] ) . Assume that ι holds. Then we can define our conditional distribution E as E(i|x)= 1 (1+ǫ)M e P(x|D(i)) p(x) . By ι and the definition of e w, we can check E(i|x) is a subnormalized conditional distribution, M X i=1 E(i|x)= M X i=1 1 (1+ǫ)M e P(x|D(i)) p(x) ≤ e w(x) p(x) ≤1. Finally, we estimate k ˆ Pu−Epk 1 . It is sufficient to do this for the constructed subnormalized conditional distribution, because we can distribute the rest weight to fill up to 1 arbitrarily. The joint distribution of ˆ Pu is { 1 M P(x|D(i))}, thus k ˆ Pu−Epk 1 = M X i=1 X x∈B D(i) 1 M 1− 1 1+ǫ P(x|D(i))+ M X i=1 X x/ ∈B D(i) 1 M P(x|D(i)). (106) Since P(B D(i) |D(i))≤1, we can bound the first term by ǫ. By assumption, X x∈B 1 M M X i=1 e P(x|D(i))≥(1−ǫ)e w(B)≥1−4ǫ , sinceB D(i) ⊆B, the second term in (106) is bounded by 4ǫ. We have now shown that if ι holds true then k ˆ Pu−Epk 1 ≤5ǫ. 107 Combining with (105) proves the theorem. 2 Corollary 18 Consider the joint random variable XY that is distributed according to q(y)P(x|y). Given random variables Y 1 ,Y 2 ,...,Y M all distributed according to q n , define the map D : {1,2,...,M} → Y n by D(i) = Y i . 
Then, for all ǫ,δ > 0 and sufficiently large n, there exists a conditional probability distribution E(i|x n ) defined for i∈{1,2,...,M} such that Pr{k ˆ Pu−Ep n k 1 ≥5ǫ}≤2αexp(−κ 0 Mǫ 3 β/α), (107) where ˆ P(x n |i) = P n (x n |D(i)), u is the uniform distribution on {1,2,...,M}, p is the marginal distribution defined by p(x) = P y∈Y P(x|y)q(y), α = 2 n[H(X)+cδ] , β = 2 n[H(X|Y)−cδ] . Proof We can relate to Lemma 17 through the identifications (see Appendix): X → X n , Y →Y n , q(y)→q n (y n ), P(x|y)→P n (x n |y n ), T →T n X,3δ , and T y → ˆ T n X|Y,δ (y n ), with ˆ T n X|Y,δ (y n )= T n X|Y,δ (y n ) y n ∈T n Y,δ ∅ otherwise. The four conditions now read (for all y n ∈Y n ), X y n ∈Y n q n (y n )P n ( ˆ T n X|Y,δ (y n )|y n )≥1−2ǫ, (108) X y n ∈Y n q n (y n )P n (T n X,3δ |y n )≥1−2ǫ, (109) |T n X,3δ |≤α, (110) P n (x n |y n )≤β −1 , ∀x n ∈ ˆ T n X|Y,δ (y n ). (111) These follow from Theorem 44, switching the roles of X and Y and setting δ =δ ′ . 2 We will also need the Holevo-Schumacher-Westmoreland (HSW) theorem [42, 60]. 108 Proposition 19 (HSW Theorem) Given an ensemble σ YB = X y∈Y q(y)|yihy| Y ⊗ρ B y , and integern, consider the encoding mapF :{0,1} nS →Y n given byF(s)=Y s , where the {Y s } are random variables chosen according to the i.i.d. distribution q n . For any ǫ,δ > 0 and sufficiently large n, there exists a decoding POVM {Λ s } s∈{0,1} nS on B n for the encoding map F with S =I(Y;B) σ −δ, such that for all s, E X s ′ |π(s ′ |s)−δ(s,s ′ )|≤ǫ. Here π(s ′ |s) is the probability of decoding s ′ conditioned on s having been encoded: π(s ′ |s)=Tr(Λ s ′ρ F(s) ), (112) δ(s,s ′ ) is the delta function and the expectation is taken over the random encoding. Now we are ready to prove the direct coding theorem: Proof of Theorem 13 (direct coding) In a sense, the simulation of W(y|x) on p(x) looks more like simulating the reverse map P(x|y) on q(y). 
For each realization of the common randomness (which has uniform distribution u′) we have an encoding map E from Corollary 18; the latter acts on the distribution p^n and outputs a uniform distribution u. On one hand, the combination of u and u′ dilutes to q^n (Corollary 16), and on the other hand the joint distribution Ep^n is as if instead the map P^n acted on the q^n diluted from u and u′ (Corollary 18). Finally, Bob needs to be sent only partial information about the random variable with distribution u; the rest he can deduce from his side information with the help of the HSW theorem.

Fix ǫ, δ > 0 and a sufficiently large n (cf. Corollaries 16, 18 and Proposition 19). Consider the random variables Y_{lms}, l ∈ {0,1}^{nC}, m ∈ {0,1}^{nR}, s ∈ {0,1}^{nS} (for some C, R and S to be specified later), independently distributed according to q^n, where q(y) = Σ_x p(x)W(y|x). The Y_{lms} are going to serve simultaneously as a "randomness dilution code" G(l,m,s) = Y_{lms} (cf. the Y_1, ..., Y_M in Corollary 16, M here being 2^{n(C+R+S)}); as 2^{nC} independent "covering codes" D_l(m,s) = Y_{lms} (cf. the Y_1, ..., Y_M in Corollary 18, M here being 2^{n(R+S)}); and as 2^{n(C+R)} independent HSW codes F_{lm}(s) = Y_{lms} (cf. Proposition 19). We will conclude the proof by "derandomizing" the code, i.e. showing that a particular realization of the random Y_{lms} exists with suitable properties.

Define, as in the two corollaries, α = 2^{n[H(X)+cδ]}, β = 2^{n[H(X|Y)−cδ]}, and γ = 2^{n[H(Y)+cδ]}. Define two independent uniform distributions u′(l) and u(ms) on the sets {0,1}^{nC} and {0,1}^{nR} × {0,1}^{nS}, respectively. The stochastic map D̃(y^n|l,m,s) is defined as D̃(y^n|l,m,s) = I(y^n = D_l(m,s)). Corollary 18 defines corresponding encoding stochastic maps {E_l(m,s|x^n)}. For any l ∈ {0,1}^{nC}, define the logic statement ι_l by ξ_l ≤ 5ǫ, where

ξ_l = Σ_{m,s} Σ_{x^n} | Σ_{y^n} P^n(x^n|y^n) D̃(y^n|l,m,s) u(ms) − E_l(m,s|x^n) p^n(x^n) |.

By Corollary 18, for all l,

Pr{not ι_l} ≤ 2α exp(−2^{n(R+S)} κ_0 ǫ^3 β/α).
(113)

Define the logic statement ι′ by ξ′ ≤ 2ǫ, where

ξ′ = Σ_{y^n} | Σ_{l,m,s} D̃(y^n|l,m,s) u′(l) u(ms) − q^n(y^n) |.

By Corollary 16,

Pr{not ι′} ≤ 2γ exp(−2^{n(C+R+S)} κ_0 ǫ^2 / γ). (114)

Once we fix the randomness we shall be using

W̃(y^n|x^n) = Σ_{l,m,s} D̃(y^n|l,m,s) E_l(m,s|x^n) u′(l) (115)

to simulate the channel W^n(y^n|x^n). Observe that

Σ_{x^n, y^n} | p^n(x^n) ( W^n(y^n|x^n) − W̃(y^n|x^n) ) | (116)
= Σ_{x^n, y^n} | Σ_{l,m,s} D̃(y^n|l,m,s) E_l(m,s|x^n) u′(l) p^n(x^n) − W^n(y^n|x^n) p^n(x^n) |
≤ Σ_{x^n, y^n} Σ_{l,m,s} D̃(y^n|l,m,s) u′(l) | E_l(m,s|x^n) p^n(x^n) − Σ_{ŷ^n} P^n(x^n|ŷ^n) D̃(ŷ^n|l,m,s) u(ms) |
  + Σ_{x^n, y^n} P^n(x^n|y^n) | Σ_{l,m,s} D̃(y^n|l,m,s) u′(l) u(ms) − q^n(y^n) |
≤ max_l ξ_l + ξ′. (117)

To obtain the first inequality we have used D̃(y^n|l,m,s) D̃(ŷ^n|l,m,s) = D̃(y^n|l,m,s) δ(y^n, ŷ^n) and the triangle inequality.

We shall now invoke Proposition 19. Define q(y)ρ_y = Σ_x p(x) W(y|x) ρ_x. Setting F_{lm}(s) = Y_{lms} and S = I(Y;B) − cδ, there exists a set {Λ^{(lm)}}_{lm ∈ {0,1}^{n(C+R)}}, where each Λ^{(lm)} = {Λ^{(lm)}_{s′}}_{s′ ∈ {0,1}^{nS}} is a POVM acting on B^n, such that

E Σ_{s′} | π_{lm}(s′|s) − δ(s,s′) | ≤ ǫ (118)

for all l, m and s. Here π_{lm}(s′|s) describes the noise experienced in conveying s to Bob if the channel W^n(y^n|x^n) were implemented exactly. However, Alice only has the simulation W̃(y^n|x^n), which corresponds to the ensemble q̃(y^n) ρ̃_{y^n} := Σ_{x^n} p^n(x^n) W̃(y^n|x^n) ρ_{x^n}. Observe that (116) is another way of expressing

|| (ρ^{XYB})^{⊗n} − σ^{X^n Ỹ^n B^n} ||_1 = || (ρ^{XY})^{⊗n} − σ^{X^n Ỹ^n} ||_1.

Applying monotonicity of trace distance to (117), we have

|| (ρ^{YB})^{⊗n} − σ^{Ỹ^n B^n} ||_1 = Σ_{y^n} || q^n(y^n) ρ_{y^n} − q̃(y^n) ρ̃_{y^n} ||_1 ≤ max_l ξ_l + ξ′,

and hence by the triangle inequality and monotonicity of trace distance

E || ρ_{F(s)} − ρ̃_{F(s)} ||_1 ≤ Σ_{y^n} || q^n(y^n) ρ_{y^n} − q̃(y^n) ρ̃_{y^n} ||_1 + Σ_{y^n} | q̃(y^n) − q^n(y^n) | ≤ 2( max_l ξ_l + ξ′ ).
Thus the actual noise experienced in conveying s to Bob, denoted by π̃_{lm}(s′|s), obeys

E Σ_{s′} | π_{lm}(s′|s) − π̃_{lm}(s′|s) | ≤ 2( max_l ξ_l + ξ′ ).

Combining the above with (118) gives

E Σ_{s′} | π̃_{lm}(s′|s) − δ(s,s′) | ≤ 2( max_l ξ_l + ξ′ ) + ǫ.

Let us focus on the effect this imperfection in the HSW decoding will have on the simulation. By monotonicity,

E Σ_{x^n, ỹ^n, y^n} | Σ_{l,m,s,s′} D̃(y^n|lms) D̃(ỹ^n|lms′) E_l(ms|x^n) u′(l) p^n(x^n) ( π̃_{lm}(s′|s) − δ(s,s′) ) | ≤ 2( max_l ξ_l + ξ′ ) + ǫ.

By the Markov inequality, Pr{not ι″} ≤ 1/2, where ι″ is the logic statement

Σ_{x^n, ỹ^n, y^n} | Σ_{l,m,s,s′} D̃(y^n|lms) D̃(ỹ^n|lms′) E_l(ms|x^n) u′(l) p^n(x^n) ( π̃_{lm}(s′|s) − δ(s,s′) ) | ≤ 4( max_l ξ_l + ξ′ ) + 2ǫ.

Now for the derandomization step. Pick C = H(Y|X) − cδ and R = I(X;Y) − I(Y;B) + 4cδ. By the union bound, ι_l for all l, ι′, and ι″ hold true simultaneously with probability > 0. Hence there exists a specific choice of {Y_{lms}} for which all these conditions are satisfied. Consequently,

Σ_{x^n, ỹ^n, y^n} | Σ_{l,m,s,s′} D̃(y^n|lms) D̃(ỹ^n|lms′) E_l(ms|x^n) u′(l) p^n(x^n) ( π̃_{lm}(s′|s) − δ(s,s′) ) | ≤ 30ǫ,

i.e. || σ^{X^n Ỹ^n_o Ỹ^n} − σ^{X^n Ŷ^n Ỹ^n} ||_1 ≤ 30ǫ, where Ỹ^n_o = Ỹ^n is Bob's simulation output random variable if his decoding measurement is perfect. Combining with (117) (which gives || (ρ^{XYY})^{⊗n} − σ^{X^n Ỹ^n_o Ỹ^n} ||_1 ≤ 7ǫ) yields

|| (ρ^{XYY})^{⊗n} − σ^{X^n Ŷ^n Ỹ^n} ||_1 ≤ 37ǫ.

This is almost what we need. The statement of the theorem also insists that the state of the B^n system is not much perturbed by the measurement. The crucial ingredient ensuring this, as in [30], is the gentle measurement lemma [71]. To improve readability, we omit the details of its application here. □

Before proving the converse, recall Fannes' inequality [36]:

Lemma 20 (Fannes' inequality) Let P and Q be probability distributions on a set with finite cardinality d, such that ||P − Q||_1 ≤ ǫ. Then |H(P) − H(Q)| ≤ ǫ log d + τ(ǫ), with τ(ǫ) = −ǫ log ǫ if ǫ ≤ 1/4, and τ(ǫ) = 1/2 otherwise.
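Fannes' inequality is easy to check numerically. The pair of distributions below is illustrative (logs are base 2, matching the bit units used throughout):

```python
from math import log2

def H(P):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in P if p > 0)

def fannes_bound(eps, d):
    """eps*log(d) + tau(eps), with tau as in Lemma 20."""
    tau = -eps * log2(eps) if eps <= 0.25 else 0.5
    return eps * log2(d) + tau

P = [0.50, 0.30, 0.20]
Q = [0.45, 0.33, 0.22]
eps = sum(abs(a - b) for a, b in zip(P, Q))   # variational distance ||P - Q||_1 = 0.10

# |H(P) - H(Q)| is far below the bound here; the inequality is not tight.
assert abs(H(P) - H(Q)) <= fannes_bound(eps, d=3)
```

The continuity it expresses is exactly what the converse proof below needs: ǫ-close distributions have entropies differing by at most ǫ log d + τ(ǫ), which vanishes as ǫ → 0.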
Note that τ is a monotone and concave function and τ(ǫ) → 0 as ǫ → 0. □

Proof of Theorem 13 (converse) Consider an (n,R,C,ǫ) code. Define the uniform random variable U on the set {0,1}^{nC} to denote the common randomness, and W on the set {0,1}^{nR} to denote the encoded message sent to Bob. We have the following Markov chain: X^n → B^n W U → B̂^n Ŷ^n. The following chain of inequalities holds:

nR ≥ H(W|U)
= H(W|U) + I(X^n;B^n|U) − I(X^n;B^n)
≥ I(X^n;B^n W|U) − I(X^n;B^n)
= I(X^n;B^n W U) − I(X^n;B^n)
≥ I(X^n;B̂^n Ŷ^n) − I(X^n;B^n)
≥ n( I(X;BY) − I(X;B) − f(n,ǫ) )
= n( I(X;Y) − I(Y;B) − f(n,ǫ) ),

with f(n,ǫ) → 0 as n → ∞ and ǫ → 0. The second line follows from I(X^n;B^n|U) = I(X^n;B^n), and the fourth from I(X^n;U) = 0. The fifth line is the data processing inequality based on the Markov chain above. The sixth is a consequence of Fannes' inequality, and the last line is based on the Markov chain Y → X → B.

Based on the Markov chain Ỹ^n → B^n W U → Ŷ^n, we have another chain of inequalities:

nR + nC ≥ H(W) + H(U)
≥ H(WU)
= I(Ỹ^n;B^n W U) + I(WU;B^n) + H(WU|Ỹ^n B^n) − I(Ỹ^n;B^n)
≥ I(Ỹ^n;B^n W U) − I(Ỹ^n;B^n)
≥ I(Ỹ^n;Ŷ^n) − I(Ỹ^n;B^n)
≥ n( H(Y) − I(Y;B) − f′(n,ǫ) ),

with f′(n,ǫ) → 0 as n → ∞ and ǫ → 0. The last two inequalities follow from the data processing inequality and Fannes' inequality. Thus any achievable rate pair (R,C) must obey the conditions of Theorem 13. □

We can use the theory of resource inequalities [25] to succinctly express our main result. In this case we need to introduce an additional protagonist, the Source, which starts the protocol by distributing the state

ρ^{X_S S} = Σ_x p(x) |x⟩⟨x|^{X_S} ⊗ ρ^S_x

between Alice and Bob. Alice gets X_S through the classical identity channel id^{X_S → X_A} and Bob gets S through the quantum identity channel id^{S → B}.
The goal is for Alice and Bob to end up sharing the state

σ^{X_A Y_A Y_B B} = Σ_x p(x) Σ_y W(y|x) |y⟩⟨y|^{Y_A} ⊗ |y⟩⟨y|^{Y_B} ⊗ |x⟩⟨x|^{X_A} ⊗ ρ^B_x, (119)

as if ρ^{X_S S} was sent through the channel W^{X_S → Y_A Y_B} ⊗ id^{S → B} (the former is a feedback version of W). Our direct coding theorem is equivalent to the resource inequality

⟨id^{X_S → X_A} ⊗ id^{S → B} : ρ^{X_S S}⟩ + ( I(X_A;Y_B)_σ − I(Y_B;B)_σ ) [c → c] + H(Y_B|X_A)_σ [cc] ≥^s ⟨W^{X_S → Y_A Y_B} ⊗ id^{S → B} : ρ^{X_S S}⟩. (120)

The superscript s stands for "source" and is a technical subtlety [25]. Without the superscript we would have an ordinary resource inequality, which permits a sublinear amount of the Source's state ρ^{X_S S} to be destroyed. This is not something we would allow in the context of channel simulation. Thus a superscripted inequality is strictly stronger than a regular, unsuperscripted one (see Lemma 4.11 of [25]).

5.3 Applications

In this section, common randomness distillation and rate-distortion coding with side information will be seen as simple corollaries of our main result.

5.3.1 Common randomness distillation

Alice and Bob share n copies of a bipartite classical-quantum state

ρ^{X_A B} = Σ_x p(x) |x⟩⟨x|^{X_A} ⊗ ρ^B_x,

and Alice is allowed a rate of R bits of classical communication to Bob. Their goal is to distill a rate C of common randomness (CR). In terms of resource inequalities, a CR-rate pair (C,R) is said to be achievable iff

⟨ρ^{X_A B}⟩ + R [c → c] ≥ C [cc].

Define the CR-rate function C(R) to be

C(R) = sup{ C : (C,R) is achievable },

and the distillable CR function as D(R) = C(R) − R. The following theorem was proved in [31].

Theorem 21 Given the classical-quantum system XB,

D(R) = max_{Y|X} { I(Y;B) | I(X;Y) − I(Y;B) ≤ R },

so that C(R) = R + D(R). The maximum is over all conditional probability distributions W(y|x) with |Y| ≤ |X| + 1.

We give below a concise proof of the direct coding part of this theorem, relying on our main result (120) and the resource calculus [25].
Proof We need to prove hρ X A B i+(I(X A ;Y B ) σ −I(Y B ;B) σ )[c→c]≥I(X A ;Y B ) σ [cc], (121) 117 with σ X A Y A Y B B given by (119). Observe the following string of resource inequalities: hid X S →X A ⊗id S→B :ρ X S S i+(I(X A ;Y B ) σ −I(Y B ;B) σ )[c→c]+H(Y B |X A ) σ [cc] ≥hW X S →Y A Y B ⊗id S→B :ρ X S S i ≥hW X S →Y A Y B :ρ X S i ≥hW X S →Y A Y B (ρ X S )i ≥H(Y B ) σ [cc]. The first inequality is by (120) and Lemma 4.11 of [25] which allows us to drop the s superscript (see discussion at the end of section 3). The second is by part 5 of Lemma 4.1 of [25] which says that we can basically ignore the S system. The third is by part 2 of the same lemma, which says that if we have the ability to apply a channel with respect to an input state, then we can actually apply it to get the output state. The last inequality is common randomness concentration [25], which states that hσ Y A Y B i ≥ H(Y B ) σ [cc]. Note here the need for Bob to also get a copy of Y (via W X S →Y A Y B ), otherwise he and Alice would have no shared stateσ Y A Y B to concentrate common randomness from. By Lemma 4.10 of [25], hid X S →X A ⊗id S→B :ρ X S S i can be replaced by hρ X A B i=hid X S →X A ⊗id S→B (ρ X S S )i. (122) This lemma says that if the right hand side of a resource inequality does not involve the Source then on the left hand side we can assume that the channel has already been applied to the Source’s input state. 118 Thus by (122) and Lemma 4.6 of [25], which says that we can perform cancellation of [cc] up to a sublinear term o[cc], we have hρ X A B i+(I(X A ;Y B ) σ −I(Y B ;B) σ )[c→c]+o[cc]≥I(X A ;Y B ) σ [cc]. Since [c→c]≥ [cc], by Lemma 4.5 of [25] the o term can be dropped. The reason is that the sublinear amount of [cc] can be created from a sublinear amount of [c→c], which can in turn be absorbed in the (I(X A ;Y B ) σ −I(Y B ;B) σ )[c → c] term. Thus (121) is proved. 
It may seem counterintuitive that channel simulation, which to a certain extent dilutes common randomness, can be used to prove a result about randomness concentration. The trick is that the initial common randomness is used as a catalyst to create a larger amount of diluted randomness (the state (σ^{Y_A Y_B})^{⊗n}), which is subsequently concentrated, and the catalyst returned. □

5.3.2 Rate-distortion trade-off with quantum side information

Rate-distortion theory, or lossy source coding, is a major subfield of classical information theory [12]. When insufficient storage space is available, one has to compress a source beyond the Shannon entropy. By the converse to Shannon's compression theorem, this means that the reproduction of the source (after compression and decompression) suffers a certain amount of distortion compared to the original. The goal of rate-distortion theory is to minimize a suitably defined distortion measure for a given desired compression rate. Formally, a distortion measure is a mapping d : X × X → R_+ from the set of source-reproduction alphabet pairs into the set of non-negative real numbers. This function can be extended to sequences X^n × X^n by letting

d(x^n, x̂^n) = (1/n) Σ_{i=1}^n d(x_i, x̂_i).

We consider here a quantum generalization of the classical Wyner-Ziv [77] problem. The encoder Alice and decoder Bob share n copies of the classical-quantum system XB in the state (89). Alice sends Bob a classical message at rate R, based on which, and with the help of his side information B^n, Bob needs to reproduce x^n with the lowest possible distortion. An (n,R,d) rate-distortion code is given by an encoding map E_n : X^n → {0,1}^{nR} and a decoding map D_n which takes E_n(x^n) and the state ρ_{x^n} as inputs and outputs a string x̂^n ∈ X^n. D_n is implemented by performing an E_n(x^n)-dependent measurement, followed by a function mapping E_n(x^n) and the measurement outcome to x̂^n.
The condition on the reproduction quality is

d(E_n, D_n) := E d(X^n, X̂^n) = Σ_{x^n} p^n(x^n) d(x^n, D_n(E_n(x^n), ρ_{x^n})) ≤ d.

A pair (R,d) is achievable if there exists an (n, R+δ, d) code for any δ > 0 and sufficiently large n. Define R_B(d) to be the infimum of rates R for which (R,d) is achievable.

Theorem 22 Given n copies of a classical-quantum system XB in the state ρ^{X^n B^n},

R_B(d) = lim_{n→∞} R^{(n)}_B(d),   R^{(n)}_B(d) = (1/n) min_{Y|X^n} min_{D : Y B^n → X̂^n} ( I(X^n;Y) − I(Y;B^n) ),

where the minimization is over all conditional probability distributions W(y|x^n) and decoding maps D : Y B^n → X̂^n such that

E d(X^n, D(Y,B^n)) = Σ_{x^n, y} p^n(x^n) W(y|x^n) d(x^n, D(y, ρ^{B^n}_{x^n})) ≤ d.

Note that (m+n) R^{(m+n)}_B(d) ≤ m R^{(m)}_B(d) + n R^{(n)}_B(d). By arguments similar to those for the channel capacity (see e.g. [5], Appendix A), the limit R_B(d) exists. However, the formula for R^{(n)}_B(d) is a "regularized" form, so R_B(d) cannot be effectively computed. We omit the easy proof of the converse theorem. The direct coding theorem is an immediate consequence of Theorem 13 (cf. [72]):

Proof of Theorem 22 (direct coding) It suffices to prove the achievability of R^{(1)}_B(d), for a fixed channel W(y|x) and decoding map D : YB → X̂. Consider an (n,R,C,ǫ) simulation code for the channel W(y|x). The simulated state σ^{X^n Ŷ^n B̂^n} can be written as a convex combination of simulations corresponding to particular values of the common randomness l:

σ^{X^n Ŷ^n B̂^n} = Σ_l u′(l) σ^{X^n Ŷ^n B̂^n}_l.

In other words, σ^{X^n Ỹ^n B̂^n}_l is obtained from the encoding E_l(m,s|x^n), the POVM set {Λ^{(lm)}}_{m ∈ {0,1}^{nC}}, and the decoding D_l(m,s). From the condition for successful simulation (90) and monotonicity of trace distance it follows that

|| Σ_l u′(l) D^{⊗n}(σ^{Ŷ^n B̂^n}_l) − D^{⊗n}((ρ^{YB})^{⊗n}) ||_1 ≤ ǫ. (123)

For each l define a rate-distortion encoding E^l_n by E_l(m,s|x^n), and a decoding D^l_n by the POVM set {Λ^{(lm)}}_{m ∈ {0,1}^{nC}} followed by D_l(m,s′) (s′ is the POVM outcome) and D^{⊗n}.
Invoking (123), E d(X, D(Y,B)) ≤ d and the linearity of the distortion measure gives

Σ_l u′(l) d(E^l_n, D^l_n) ≤ d + c_0 ǫ,

for some constant c_0. Hence there exists a particular l for which d(E^l_n, D^l_n) ≤ d + c_0 ǫ. The direct coding theorem now follows from the achievable rates given by Theorem 13. □

The classical Wyner-Ziv problem is recovered by making B into a classical system Z, i.e. by setting ρ_x = Σ_z p(z|x)|z⟩⟨z| with Σ_z p(z|x) = 1 and associating the joint distribution p(x)p(z|x) with the random variable XZ. In this case a single-letter formula is obtained:

R_Z(d) = R^{(1)}_Z(d) = min_{Y|X} min_{D : YZ → X̂} ( I(X;Y) − I(Y;Z) ).

It is an open question whether a single-letter formula exists for R_B(d). Following the standard converse proof of [21, 77] we are able to produce a single-letter lower bound on R_B(d) given by

R*_B(d) = min_{W : X → C} min_{D : CB → X̂} ( I(X;C) − I(C;B) ),

where C is now a quantum system (replacing Y) and W : X → C is a classical-quantum channel (replacing W). Unfortunately, R*_B(d) appears not to be achievable without entanglement. For instance, in the d = 0 and B = null case, simulating the channel X → C with a rate of I(X;C) bits of communication generally requires H(C) ebits [10]. Since entanglement cannot be "derandomized" like common randomness, a coding theorem paralleling that of Theorem 22 seems unlikely.

5.4 Bounds on quantum state redistribution

Our channel simulation with side information result, Theorem 13, is only partly quantum. To formulate a fully quantum version of it, we (i) replace the classical channel W by a quantum feedback channel [24] U^{A → B̂Â}, which is an isometry from Alice's system A to the system B̂Â shared by Alice and Bob; (ii) replace the classical-quantum state ρ^{XB} by a pure state |ϕ⟩^{RAB} shared among the reference system, Alice and Bob. Sending the A part of |ϕ⟩^{RAB} through the channel U results in the state

|ψ⟩^{R Â B̂ B} = U |ϕ⟩^{RAB},

where Â is held by Alice and B̂B is held by Bob.
Because U is an isometry, the state |ϕ⟩^{RAB} is equivalent to |ψ⟩^{R Â B̂ B} with ÂB̂ in Alice's possession. Thus simulating the channel U on |ϕ⟩^{RAB} is equivalent to quantum state redistribution: Alice transferring the B̂ part of her system ÂB̂ to Bob. We can now ask about the trade-off between qubit channels [q → q] and ebits [qq] needed to effect quantum state redistribution. In terms of resource inequalities, we are interested in the rate pairs (Q,E) such that

⟨U_1^{S → AB} : ρ^S⟩ + Q [q → q] + E [qq] ≥^s ⟨U_2^{S → Â B̂ B} : ρ^S⟩. (124)

Here U_1 is an isometry such that |ϕ⟩^{RAB} = U_1 |φ⟩^{RS}, |φ⟩^{RS} is a purification of ρ^S, and U_2 = U ∘ U_1.

We can find two rather trivial inner bounds (i.e. achievable rate pairs) based on previous results. First let us focus on making use of Bob's side information B. The feedback channel simulation will be performed naively: Alice will implement U^{A → ÂB̂} locally and then "merge" her system B̂ with Bob's system B, treating Â as part of the reference system R. This gives an achievable rate pair of (Q_1, E_1) = ( (1/2) I(B̂;RÂ), −(1/2) I(B;B̂) ) by the fully quantum Slepian-Wolf (FQSW) protocol [1, 24], a generalization of [43]. The negative value of E means that entanglement is generated, rather than consumed.

Now let us ignore the side information and focus on performing the channel simulation non-trivially. This is the domain of the fully quantum reverse Shannon (FQRS) theorem [1, 24, 27]. Treating B as part of the reference system R, the FQRS theorem implies an achievable rate pair of (Q_2, E_2) = ( (1/2) I(B̂;RB), (1/2) I(B̂;Â) ).

An outer bound is given by the following proposition.

Proposition 23 The region in the (Q,E) plane defined by

Q ≥ (1/2) I(B̂;R|Â),   Q + E ≥ H(B̂|B)

contains the achievable rate region for quantum state redistribution.

Proof Assume that Alice holds ÂB̂ and Bob holds B. Alice wants to transfer her system ÂB̂ to Bob. By the converse to FQSW (cf.
[1]), transferring ÂB̂ requires a rate pair (Q″,E″) such that

Q″ ≥ (1/2) I(B̂Â;R),   Q″ + E″ ≥ H(ÂB̂|B). (125)

Now let us perform the redistribution successively: first transfer B̂ and then Â. Let the cost of transferring B̂ be (Q,E), which we are trying to bound. By FQSW, the cost of transferring the remaining Â once Bob has B̂ can be achieved with a rate pair (Q′,E′) such that

Q′ = (1/2) I(Â;R),   Q′ + E′ = H(Â|BB̂).

If Q < (1/2) I(B̂;R|Â), then Q + Q′ < (1/2) I(B̂Â;R), which contradicts (125). Hence Q ≥ (1/2) I(B̂;R|Â) must hold. Similarly, we can prove that Q + E ≥ H(B̂|B). □

The bound Q + E ≥ H(B̂|B) is the analogue of the classical bound R + C ≥ H(Y|B) from Theorem 13. When Â = null (the simulated channel is the identity) the outer bound is achieved by the FQSW-based scheme, and when B = null (no side information) it is achieved by the FQRS-based scheme.

5.5 Discussion

We have shown here a generalization of both the classical reverse Shannon theorem and the classical-quantum Slepian-Wolf (CQSW) problem. Our main result is a new resource inequality (120) for quantum Shannon theory. Unfortunately we were not able to obtain it by naively combining the reverse Shannon and CQSW resource inequalities via the resource calculus of [25]. Instead we proved it from first principles. An alternative proof involves modifying the reverse Shannon protocol to "piggy-back" independent classical information at a rate of I(Y;B) (cf. [29]). In [25] certain general principles were proved, such as the "coherification rules", which gave conditions for when classical communication could be replaced by coherent communication. It would be desirable to formulate a "piggy-backing rule" in a similar fashion.

An immediate corollary of our result is channel simulation with classical side information.
Remarkably, this purely classical protocol is the basic primitive which generates virtually all known classical multi-terminal source coding theorems, not just the Wyner-Ziv result [50]. 125 Regarding the state redistribution problem of Section 5, our results have inspired Devetak and Yard [34] to prove the tightness of the outer bound given by Proposition 23, thus providing the first operational interpretation of quantum conditional mutual information. 126 Chapter 6: A unified approach for multi-terminal source coding problem As we know from Chapter 5, the problem of classical channel simulation with side information at the receiver has important applications such as rate-distortion theory with quantum side information and common randomness distillation from bipartite quantum states. In this chapter, we continue the discussion of rate-distortion the- ory started in Chapter 5 and systematically discuss the connection between classical channel simulation (with classical side information) and rate-distortion theory. The main result of Chapter 6 is that any multistep channel simulation code of classical communication-common randomness rate pair (R,C) can be transformed to a rate-distortion code with the same communication rate R. Based on this idea, we can apply our channel simulation theorem to some interesting classical source cod- ing problems: successive refinement, multiple descriptions, and multi-terminal source coding problems. Simple proofs of achievability of these problems can be made by a unified approach using the reverse Shannon theorem and the channel simulation with classical side information as building blocks. This chapter is organized as follows. In Section we discuss the connection between classicalchannelsimulation(withclassicalsideinformation)andrate-distortiontheory indetail. Section showshowtoapplyTheorem25andTheorem27toprovethedirect coding theorems of the mentioned source coding problems. 
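The composability bound of Theorem 25 used throughout this chapter has a simple numerical sanity check: if the simulated source is ǫ-close in variational distance to the true one, pushing both through the same channel keeps the resulting joint distributions within 2ǫ. The distributions and channel below are illustrative, not from the thesis:

```python
def l1(P, Q):
    """Variational (L1) distance between two distributions on the same support."""
    return sum(abs(P[k] - Q[k]) for k in P)

# True source P_XZ and an eps-close simulation of it, both on {0,1} x {0,1}.
P_xz  = {(0, 0): 0.40, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.30}
Ps_xz = {(0, 0): 0.38, (0, 1): 0.22, (1, 0): 0.12, (1, 1): 0.28}
eps = l1(P_xz, Ps_xz)

W = {0: [0.8, 0.2], 1: [0.3, 0.7]}   # channel W(y|x)

def push(Pxz):
    """Joint distribution P(x,y,z) = P(x,z) * W(y|x)."""
    return {(x, y, z): Pxz[(x, z)] * W[x][y]
            for (x, z) in Pxz for y in (0, 1)}

# Applying the SAME channel cannot increase variational distance, so the gap
# stays within Theorem 25's 2*eps budget (which also covers simulation error in W).
assert l1(push(P_xz), push(Ps_xz)) <= 2 * eps + 1e-12
```

With an exact channel the distance is preserved, so the slack in the 2ǫ bound is precisely the allowance for the channel itself being only ǫ-simulated.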
6.1 Connection between classical channel simulation and rate-distortion theory

First, we shall introduce our channel simulation theorem with classical side information. Consider two classical random variables X and Z with joint distribution P_XZ held by the sender and the receiver respectively. Consider a classical channel from the sender to the receiver given by the conditional probability distribution W(y|x) = Pr{Y = y | X = x}. Applying this channel to the source X results in the joint distribution P_XYZ with Markov chain Z → X → Y. Ideally, we are interested in simulating the channel using noiseless communication and common randomness (or shared coins), in the sense that the simulation produces the joint distribution P_XYZ. For reasons discussed later, we want the sender to also get a copy Y of the output, so that the final state produced is the joint distribution P_XYYZ. The systems X and Y are in the sender's possession and the receiver has Y and Z.

As usual in information theory, this task is amenable to analysis when we go to the approximate, asymptotic i.i.d. (independent, identically distributed) setting. This means that the sender and the receiver share n copies of the random variable XZ, given by the joint distribution p^n(x^n, z^n), where x^n = x_1 ... x_n and z^n = z_1 ... z_n are sequences in X^n and Z^n, and p^n(x^n, z^n) = p(x_1, z_1) ... p(x_n, z_n). They want to simulate the channel W^n(y^n|x^n) = W(y_1|x_1) ... W(y_n|x_n) approximately, with error approaching zero as n → ∞. They have access to a rate of C bits/copy of common randomness, which means that they each have the same string l picked uniformly at random from the set {0,1}^{nC}. In addition they are allowed a rate of R bits/copy of classical communication, so that the sender may send an arbitrary string m from the set {0,1}^{nR} to the receiver. An (n,R,C,ǫ) simulation code consists of:

• An encoding stochastic map E_n : X^n × {0,1}^{nC} → {0,1}^{nR} × {0,1}^{nS}.
If the value of the common randomness is l ∈ {0,1}^{nC}, the sender encodes his classical message x^n as the index ms, m ∈ {0,1}^{nR}, s ∈ {0,1}^{nS}, with probability E_{nl}(m,s|x^n) := E_n(m,s|x^n,l), and only sends m to the receiver;

• A function F_n : {0,1}^{nC} × {0,1}^{nR} × Z^n → {0,1}^{nS}. The receiver does not get sent the true value of s and needs to infer it by s′ = F_{nl}(m,z^n) := F_n(l,m,z^n);

• A deterministic decoding map D_n : {0,1}^{nC} × {0,1}^{nR} × {0,1}^{nS} → Y^n; this allows the sender and the receiver to produce their respective simulated outputs ỹ^n = D_{nl}(m,s) := D_n(l,m,s) and ŷ^n = D_{nl}(m,s′), based on l, m and s (in the receiver's case, s′); such that

|| (P_XYYZ)^{⊗n} − P_{X^n Ŷ^n Ỹ^n Z^n} ||_1 ≤ ǫ,

where ||P − P′||_1 = Σ_{x∈X} |P(x) − P′(x)| is the variational distance between two probability distributions. The distribution P_{X^n Ŷ^n Ỹ^n Z^n} is the result of the simulation, which includes the sender's original X^n and simulation output random variable Ỹ^n, and the receiver's side information Z^n and simulation output random variable Ŷ^n (based on s′).

A rate pair (R,C) is called achievable if for all ǫ > 0, δ > 0 and sufficiently large n, there exists an (n, R+δ, C+δ, ǫ) code. Our channel simulation theorem [49] is:

Theorem 24 The region of achievable (R,C) pairs is given by

R ≥ I(X;Y) − I(Y;Z),   C + R ≥ H(Y|Z).

Note The full region given above is "generated" by the rate pair (R,C) = (I(X;Y) − I(Y;Z), H(Y|X)) and the fact that a bit of common randomness may be generated from a bit of communication.

Recall that a distortion measure is a mapping d : X × X̂ → R_+ from the set of source alphabet-reproduction alphabet pairs into the set of non-negative real numbers. This function can be extended to sequences X^n × X̂^n by letting

d(x^n, x̂^n) = (1/n) Σ_{i=1}^n d(x_i, x̂_i).
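The per-letter extension of a distortion measure is straightforward; a minimal sketch with Hamming distortion (the sequences are illustrative):

```python
def hamming(x, xhat):
    """Single-letter Hamming distortion: 0 if the symbols agree, 1 otherwise."""
    return 0 if x == xhat else 1

def block_distortion(xs, xhats, d=hamming):
    """d(x^n, xhat^n) = (1/n) * sum_i d(x_i, xhat_i)."""
    assert len(xs) == len(xhats)
    return sum(d(a, b) for a, b in zip(xs, xhats)) / len(xs)

# Two mismatches out of five positions -> average distortion 0.4.
assert block_distortion("10110", "10011") == 0.4
```

Any single-letter measure d extends the same way; the averaging is what makes the distortion of a block code a per-copy quantity comparable to the rates R and C.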
Shannon’s rate-distortion theorem [63] is about the following problem: construct an (n,R,D) code (E n ,D n ) with encoding E n : X n → {0,1} nR and decoding D n : {0,1} nR → ˆ X n , such that for a given distortion D, d(E n ,D n ):=Ed(X n , ˆ X n )= X x n p n (x n )d(x n ,D n (E n (x n )))≤D. A pair (R,D) is achievable if there exists an (n,R +δ,D) code for any δ > 0 and sufficiently large n. The rate distortion function R(D) is the infimum of rates R for which (R,D) is achievable. To relate the channel simulation theorem to rate-distortion theory, we need two theorems. The first is the composability theorem, saying that multistep channel sim- ulations are composable. Theorem 25 (Composability) Suppose the sender and the receiver hold a simula- tion of n copies of random variable XZ, say ˆ X n and ˆ Z n respectively. They want to simulate a channel W n (y n |x n ) from X n to Y n by using the simulated source ˆ X n and side information ˆ Z n . If the simulation of the joint distribution (P XZ ) ⊗n isǫ-good, i.e. ||(P XZ ) ⊗n −P ˆ X n ˆ Z n || 1 ≤ǫ, then the simulation of the joint distribution (P XYZ ) ⊗n is 2ǫ-good. 130 Proof Set P ˆ X nˆ Y n ˆ Z n (x n ,y n ,z n )= ˆ p(x n ) ˆ W(y n |x n ) ˆ Q(z n |x n ) , (P XYZ ) ⊗n (x n ,y n ,z n )=p n (x n )W n (y n |x n )Q n (z n |x n ) . By the triangle inequality, we have ||(P XYZ ) ⊗n −P ˆ X nˆ Y n ˆ Z n || 1 ≤||p n W n Q n −p n ˆ WQ n || 1 +||p n ˆ WQ n − ˆ p ˆ W ˆ Q|| 1 =||p n W n Q n −p n ˆ WQ n || 1 +||p n Q n − ˆ p ˆ Q|| 1 ≤2ǫ as claimed. 2 Before presenting the derandomization theorem, we shall introduce one lemma. Lemma 26 Given i.i.d. source X and its reproduction Y with Ed(X,Y) ≤ d, if the simulation P X nˆ Y n of the distribution (P XY ) ⊗n is ǫ-good, then Ed(X n , ˆ Y n )≤d+c 0 ǫ for some constant c 0 . Proof Ed(X n , ˆ Y n )= X x n ,y n ˆ p(x n ,y n )d(x n ,y n ) = X x n ,y n p n (x n ,y n )d(x n ,y n )+ X x n ,y n (ˆ p(x n ,y n )−p n (x n ,y n ))d(x n ,y n ) ≤d+ǫd max as claimed. 
□

Now we come to the derandomization theorem, which shows that any suitable (multistep) channel simulation codes can be derandomized to rate-distortion codes.

Theorem 27 (Derandomization) Given l i.i.d. correlated sources X^l = X_1 ··· X_l and k users, who initially share a joint state Y_0 = Y_{10}(X^l) ··· Y_{k0}(X^l). We want to distribute the sources to different users within certain distortion, and it can be done by multistep channel simulations as follows:

• Assume that at each time t, there is only one user s_t working as the sender and all the users can have access to common randomness W_t, which is independent of W_s for s ≠ t.

• At time t, user s_t encodes the source Y_{s_t, t−1} and sends various users nR_t bits to realize the joint state Y_t.

• Repeat the above procedure at time t+1.

• At time t = T, the users finish the channel simulations with state Y_T. Discarding all the redundant or intermediate auxiliary random variables and only keeping the desired sources and reproductions, the users finally share the state Ỹ_1 Ỹ_2 ··· Ỹ_k such that

|| (P_Z)^{⊗n} − P_{Ỹ_1 Ỹ_2 ··· Ỹ_k} ||_1 ≤ ǫ, (126)

where Z = Z_{11} ... Z_{1l} ... Z_{k1} ··· Z_{kl} and Z_{ji} is X_i or a reproduction of X_i at user j.

Then there exists a protocol of rate-distortion codes using rates {R_t} such that user j has a simulation of Z_{ji} with distortion d_{ij} = d_i(X_i, Z_{ji}).

Proof Given (126), by the monotonicity of trace distance, for each Z_{ji} we have

|| (P_{Z_{ji}})^{⊗n} − P_{Ẑ^n_{ji}} ||_1 ≤ ǫ,

where Ẑ^n_{ji} is the simulation of Z^n_{ji}. Then by Lemma 26, the distortion of the multistep channel simulation codes is

E d_i(X^n_i, Ẑ^n_{ji}) ≤ d_{ij} + c_0 ǫ.

Now we will derandomize a channel simulation code by a double-blocking protocol.
Given an input sequence x^{nN} = (x^n)_1 ··· (x^n)_N of length nN, the sender encodes each block of length n by the same channel simulation code {E_{nl}, F_{nl}, D_{nl}}_{l ∈ {0,1}^{nC}}, so that besides the nNR bits of communication, it will only cost nC bits of common randomness to achieve a simulation Ŷ^{nN} of Y^{nN} with total distortion E d(X^{nN}, Ŷ^{nN}) = E d(X^n, Ŷ^n). The common randomness consumption rate is δ = C/N, which goes to zero as N → ∞. A rate-distortion code is constructed with rate R+δ and distortion E d(X^n, Ŷ^n) by using δ bits/copy of communication from the sender to the receiver instead of δ bits/copy of preshared common randomness.

Therefore, by applying the above double-blocking protocol, we can transform any multistep channel simulation codes of rates {(R_t, C_t)} to rate-distortion codes of rates {R_t + δ_t} (δ_t = C_t/N_t) such that user j has Ẑ^n_{ji} with distortion E d_i(X^n_i, Ẑ^n_{ji}) ≤ d_{ij} + c_0 ǫ. □

6.2 Achievability of successive refinement, multiple descriptions, and multi-terminal source coding

We will prove the achievability of successive refinement, multiple descriptions, and multi-terminal source coding by channel simulation followed by derandomization. By the composability and derandomization theorems, suitable channel simulation codes can be derandomized to rate-distortion codes with asymptotically the same communication rates. For simplicity we will hereafter concentrate on constructing channel simulation codes to achieve the communication rates, and omit the common randomness descriptions and the details of derandomization to rate-distortion codes.
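All of the rate expressions below are mutual informations of a single joint distribution, and the achievability arguments repeatedly use chain-rule identities such as I(X;X̂_1) + I(X,X̂_1;X̂_2) − I(X̂_2;X̂_1) = I(X;X̂_1,X̂_2). These are easy to check numerically; the joint distribution below is a randomly chosen illustration, not from the thesis:

```python
from math import log2
from itertools import product
import random

random.seed(1)

def marginal(P, keep):
    """Marginalize a joint distribution P[(x0,...,xk)] onto the index tuple `keep`."""
    M = {}
    for outcome, pr in P.items():
        key = tuple(outcome[i] for i in keep)
        M[key] = M.get(key, 0.0) + pr
    return M

def H(P):
    """Shannon entropy in bits of a distribution given as a dict of probabilities."""
    return -sum(p * log2(p) for p in P.values() if p > 0)

def I(P, A, B):
    """Mutual information I(A;B) between index groups A and B of the joint P."""
    return H(marginal(P, A)) + H(marginal(P, B)) - H(marginal(P, A + B))

# Random joint distribution of (X, Xhat1, Xhat2), each binary.
outcomes = list(product((0, 1), repeat=3))
w = [random.random() for _ in outcomes]
P = {o: wi / sum(w) for o, wi in zip(outcomes, w)}

X, X1, X2 = [0], [1], [2]
lhs = I(P, X, X1) + I(P, X + X1, X2) - I(P, X2, X1)
rhs = I(P, X, X1 + X2)
# Chain-rule identity behind the two-stage rate in the successive refinement proof.
assert abs(lhs - rhs) < 1e-9
```

The identity holds for every joint distribution, which is exactly why the two-stage simulation rate R_0 + R_1 in the successive refinement argument collapses to I(X;X̂_1,X̂_2).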
6.2.1 Successive refinement

When X̂^n_1 and X̂^n_2 are two descriptions of the source X^n with distortions D_1 and D_2 (D_1 ≥ D_2), the successive refinement problem says that sending a coarse description X̂^n_1 with distortion D_1 and then sending more bits to achieve a finer description X̂^n_2 costs the same as sending X̂^n_2 directly if and only if there exists a Markov chain X^n → X̂^n_2 → X̂^n_1.

Theorem 28 (Equitz, Cover [35]) Successive refinement with distortions D_1 and D_2 (D_1 ≥ D_2) can be achieved if and only if there exists a conditional distribution p(x̂_1, x̂_2 | x) with E d(X, X̂_m) ≤ D_m, m = 1,2, such that I(X;X̂_m) = R(D_m), m = 1,2, and p(x̂_1, x̂_2 | x) = p(x̂_2|x) p(x̂_1|x̂_2).

Proof of achievability of Theorem 28 To simulate a joint distribution (P_{X X̂_2})^{⊗n} between him and the receiver, the sender can do it in two ways: simulate the channel X̂^n_2 | X^n directly, or simulate it successively: simulate X̂^n_1 | X^n first, then simulate X̂^n_2 | X^n X̂^n_1 by using the side information X̂^n_1, where X̂_1 is an auxiliary random variable.

By the RST, simulating the channel X̂^n_2 | X^n directly can be achieved by R_2 = I(X;X̂_2). In the two-stage method, by the RST the first stage can be done by R_1 = I(X;X̂_1). Now the state of (sender, receiver) is (X^n X̂^n_1, X̂^n_1). Observing the Markov chain X̂^n_1 → X^n X̂^n_1 → X̂^n_2 and by Theorem 1, the second stage can be achieved by R_0 = I(X,X̂_1;X̂_2) − I(X̂_2;X̂_1). The total rate of the two-stage channel simulation is R_0 + R_1 = I(X;X̂_1,X̂_2), which is equal to R_2 if and only if there is a Markov chain X^n → X̂^n_2 → X̂^n_1, as claimed. □

Figure 20: The EGC theorem is the multiple descriptions problem with two encoders and three decoders.
Here the sender sends two descriptions $\hat{X}_0^n$, $\hat{X}_1^n$ of the source $X^n$ to receiver 0 and receiver 1 with rates $R_0$ and $R_1$, respectively. Receiver 2 has access to both descriptions, at rate $R_0 + R_1$, and can reconstruct a better description $\hat{X}_2^n$.

6.2.2 Multiple descriptions

Furthermore, we can prove the achievability of the multiple descriptions problem. We first treat the EGC theorem.

Theorem 29 (El Gamal, Cover [38]) A rate-distortion quintuple is achievable if there exists a probability distribution $p(x) p(\hat{x}_0, \hat{x}_1, \hat{x}_2 | x)$ with $Ed(X, \hat{X}_m) \le D_m$, $m = 0, 1, 2$, such that

R_0 > I(X; \hat{X}_0),  (127)
R_1 > I(X; \hat{X}_1),  (128)
R_0 + R_1 > I(X; \hat{X}_0, \hat{X}_1, \hat{X}_2) + I(\hat{X}_0; \hat{X}_1).  (129)

Proof of achievability of Theorem 6. To prove the achievable region of rate pairs $(R_0, R_1)$ defined by (127), (128) and (129), we need to show that the two rate pairs

(∗) $(I(X; \hat{X}_0),\ I(X, \hat{X}_0; \hat{X}_1) + I(X; \hat{X}_2 | \hat{X}_0, \hat{X}_1))$,
(∗∗) $(I(X, \hat{X}_1; \hat{X}_0) + I(X; \hat{X}_2 | \hat{X}_0, \hat{X}_1),\ I(X; \hat{X}_1))$

are achievable. We can achieve the rate pair (∗) in three steps:

• Starting with the state of (sender, receiver 0, receiver 1, receiver 2) as $(X^n, 0, 0, 0)$, simulate the channel $\hat{X}_0^n | X^n$ with rate $I(X; \hat{X}_0)$ from sender to receiver 0.
• Now with state $(X^n \hat{X}_0^n, \hat{X}_0^n, 0, \hat{X}_0^n)$, simulate the channel $\hat{X}_1^n | X^n \hat{X}_0^n$ with rate $I(X, \hat{X}_0; \hat{X}_1)$ from sender to receiver 1.
• Finally, with state $(X^n \hat{X}_0^n \hat{X}_1^n, \hat{X}_0^n, \hat{X}_1^n, \hat{X}_0^n \hat{X}_1^n)$, observing the Markov chain $\hat{X}_0^n \hat{X}_1^n \to X^n \hat{X}_0^n \hat{X}_1^n \to \hat{X}_2^n$, simulate the channel $\hat{X}_2^n | X^n \hat{X}_0^n \hat{X}_1^n$ with side information $\hat{X}_0^n \hat{X}_1^n$ from sender to receiver 2 via receiver 1 at rate $I(X; \hat{X}_2 | \hat{X}_0, \hat{X}_1)$.

So we can realize $(X^n \hat{X}_0^n \hat{X}_1^n \hat{X}_2^n, \hat{X}_0^n, \hat{X}_1^n, \hat{X}_0^n \hat{X}_1^n \hat{X}_2^n)$ with $R_0 = I(X; \hat{X}_0)$, $R_1 = I(X, \hat{X}_0; \hat{X}_1) + I(X; \hat{X}_2 | \hat{X}_0, \hat{X}_1)$.
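By the chain rule, the corner point (∗) meets the sum-rate bound (129) with equality, and this identity holds for an arbitrary joint distribution. A numerical sketch (the random toy pmf and helper functions are illustrative, not from the text):

```python
import numpy as np

def _H(p):
    # Shannon entropy in bits of a flattened pmf
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def cmi(p, A, B, C=()):
    # I(A;B|C) for a joint pmf array p; A, B, C are disjoint tuples of axes
    def Hm(keep):
        drop = tuple(a for a in range(p.ndim) if a not in keep)
        return _H(p.sum(axis=drop).ravel())
    return Hm(A + C) + Hm(B + C) - Hm(C) - Hm(A + B + C)

rng = np.random.default_rng(0)
p = rng.random((2, 2, 2, 2))   # axes: (X, X0hat, X1hat, X2hat), arbitrary pmf
p /= p.sum()

X, X0, X1, X2 = (0,), (1,), (2,), (3,)
R0 = cmi(p, X, X0)
R1 = cmi(p, X + X0, X1) + cmi(p, X, X2, X0 + X1)
sum_rate_bound = cmi(p, X, X0 + X1 + X2) + cmi(p, X0, X1)
print(abs((R0 + R1) - sum_rate_bound) < 1e-10)  # the corner meets (129) exactly
```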
Similarly, we can show that the rate pair (∗∗) is achievable. Then by time sharing, the whole region defined by (127), (128) and (129) is achievable. □

Following the same philosophy, we can prove the positive coding theorem for a special case of the 3-diversity problem in [78], as well as the following theorem:

Theorem 30 (Zhang, Berger) Any quintuple $(R_0, R_1, D_0, D_1, D_2)$ is achievable if there exist random variables $\hat{X}_0$, $\hat{X}_1$, and $\hat{X}_2$, jointly distributed with a generic source variable $X$, such that

R_0 \ge I(X; \hat{X}_0, \hat{X}_2),  (130)
R_1 \ge I(X; \hat{X}_1, \hat{X}_2),  (131)
R_0 + R_1 \ge 2 I(X; \hat{X}_2) + I(\hat{X}_0; \hat{X}_1 | \hat{X}_2) + I(X; \hat{X}_0, \hat{X}_1 | \hat{X}_2),  (132)

and there exist $\phi_0$, $\phi_1$ and $\phi_2$ which satisfy

Ed(X, \phi_i(\hat{X}_i, \hat{X}_2)) \le D_i, \quad i = 0, 1,
Ed(X, \phi_2(\hat{X}_0, \hat{X}_1, \hat{X}_2)) \le D_2.

Proof of achievability of Theorem 7. To prove the achievable region of rate pairs $(R_0, R_1)$ defined by (130), (131) and (132), we need to show that the two rate pairs

(•) $(I(X; \hat{X}_0, \hat{X}_2),\ I(X; \hat{X}_2) + I(X, \hat{X}_0; \hat{X}_1 | \hat{X}_2))$,
(••) $(I(X; \hat{X}_2) + I(X, \hat{X}_1; \hat{X}_0 | \hat{X}_2),\ I(X; \hat{X}_1, \hat{X}_2))$

are achievable. We can achieve the rate pair (•) in three steps:

• Starting with the state of (sender, receiver 0, receiver 1, receiver 2) as $(X^n, 0, 0, 0)$, simulate the channel $\hat{X}_2^n | X^n$ with rate $I(X; \hat{X}_2)$ from sender to all receivers.
• Now with state $(X^n \hat{X}_2^n, \hat{X}_2^n, \hat{X}_2^n, \hat{X}_2^n)$, observe the Markov chain $\hat{X}_2^n \to X^n \hat{X}_2^n \to \hat{X}_0^n$ and simulate the channel $\hat{X}_0^n | X^n \hat{X}_2^n$ with side information $\hat{X}_2^n$ from sender to receiver 0 at rate $I(X; \hat{X}_0 | \hat{X}_2)$.
• Finally, with state $(X^n \hat{X}_0^n \hat{X}_2^n, \hat{X}_0^n \hat{X}_2^n, \hat{X}_2^n, \hat{X}_0^n \hat{X}_2^n)$, observing the Markov chain $\hat{X}_2^n \to X^n \hat{X}_0^n \hat{X}_2^n \to \hat{X}_1^n$, simulate the channel $\hat{X}_1^n | X^n \hat{X}_0^n \hat{X}_2^n$ with side information $\hat{X}_2^n$ from sender to receiver 1 at rate $I(X, \hat{X}_0; \hat{X}_1 | \hat{X}_2)$.
So we can realize $(X^n \hat{X}_0^n \hat{X}_1^n \hat{X}_2^n, \hat{X}_0^n \hat{X}_2^n, \hat{X}_1^n \hat{X}_2^n, \hat{X}_0^n \hat{X}_1^n \hat{X}_2^n)$ with $R_0 = I(X; \hat{X}_0, \hat{X}_2)$, $R_1 = I(X; \hat{X}_2) + I(X, \hat{X}_0; \hat{X}_1 | \hat{X}_2)$. Similarly, we can show that the rate pair (••) is achievable. Then by time sharing, the whole region defined by (130), (131) and (132) is achievable. □

6.2.3 Multi-terminal source coding

We can prove the achievability of the Berger-Tung inner bound for the rate-distortion region of the multi-terminal source coding problem.

Theorem 31 (Berger-Tung inner bound) Let $X_1$ and $X_2$ be two correlated sources drawn i.i.d. according to $p(x_1, x_2)$. If there exist two auxiliary variables $W_1$ and $W_2$ with $Ed_i(X_i, \hat{X}_i(W_1, W_2)) \le D_i$, $i = 1, 2$, and if $p(x_1, x_2, w_1, w_2)$ forms a Markov chain $W_1 \to X_1 \to X_2 \to W_2$, then for a given distortion pair $(D_1, D_2)$ the region of achievable rate pairs $(R_1, R_2)$ is given by

R_1 \ge I(X_1; W_1 | W_2),  (133)
R_2 \ge I(X_2; W_2 | W_1),  (134)
R_1 + R_2 \ge I(X_1, X_2; W_1, W_2).  (135)

Proof of achievability of Theorem 8. To prove the achievable region of rate pairs $(R_1, R_2)$ defined by (133), (134) and (135), we need to show that under the Markov condition $W_1 \to X_1 \to X_2 \to W_2$, the two rate pairs

(⋆) $(I(X_1; W_1 | W_2),\ I(X_2; W_2))$,
(⋆⋆) $(I(X_1; W_1),\ I(X_2; W_2 | W_1))$

are achievable. We can achieve the rate pair (⋆) in two steps:

• Starting with the state of (encoder 1, encoder 2, decoder) as $(X_1^n, X_2^n, 0)$, simulate the channel $W_2^n | X_2^n$ with rate $I(X_2; W_2)$ from encoder 2 to the decoder.
• With state $(X_1^n, X_2^n W_2^n, W_2^n)$, observe the Markov chain $W_2^n \to X_1^n \to W_1^n$ and simulate the channel $W_1^n | X_1^n$ with side information $W_2^n$ at rate $I(X_1; W_1 | W_2)$ from encoder 1 to the decoder.

So we can realize $(X_1^n W_1^n, X_2^n W_2^n, W_1^n W_2^n)$ with $R_1 = I(X_1; W_1 | W_2)$, $R_2 = I(X_2; W_2)$. Similarly, we can show that the rate pair (⋆⋆) is achievable. Then by time sharing, the whole region defined by (133), (134) and (135) is achievable.
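The two-step protocol for (⋆) achieves the sum-rate bound (135) with equality: under the long Markov chain $W_1 \to X_1 \to X_2 \to W_2$, $I(X_2; W_2) + I(X_1; W_1 | W_2) = I(X_1, X_2; W_1, W_2)$. A numerical sketch (the toy source and channels are illustrative, not from the text):

```python
import numpy as np

def _H(p):
    # Shannon entropy in bits of a flattened pmf
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def cmi(p, A, B, C=()):
    # I(A;B|C) for a joint pmf array p; A, B, C are disjoint tuples of axes
    def Hm(keep):
        drop = tuple(a for a in range(p.ndim) if a not in keep)
        return _H(p.sum(axis=drop).ravel())
    return Hm(A + C) + Hm(B + C) - Hm(C) - Hm(A + B + C)

# Correlated pair (X1, X2); W1 observes X1 and W2 observes X2 through noisy
# channels, so p(x1, x2, w1, w2) = p(x1, x2) a(w1|x1) b(w2|x2) and the long
# Markov chain W1 -> X1 -> X2 -> W2 holds.
p12 = np.array([[0.4, 0.1],
                [0.1, 0.4]])
a = np.array([[0.9, 0.1],
              [0.2, 0.8]])   # a[x1, w1]
b = np.array([[0.7, 0.3],
              [0.1, 0.9]])   # b[x2, w2]
p = p12[:, :, None, None] * a[:, None, :, None] * b[None, :, None, :]

X1, X2, W1, W2 = (0,), (1,), (2,), (3,)
corner = cmi(p, X2, W2) + cmi(p, X1, W1, W2)   # the two steps of the corner
total = cmi(p, X1 + X2, W1 + W2)               # sum-rate bound (135)
print(abs(corner - total) < 1e-10)  # equality under the long Markov chain
```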
□

In [14] Berger and Yeung give a complete characterization of the rate-distortion region of one important special case ($D_2 = 0$) of the multi-terminal source coding problem.

Theorem 32 (Berger, Yeung) Let $X_1$ and $X_2$ be two correlated sources drawn i.i.d. according to $p(x_1, x_2)$. For a given distortion pair $(D_1, 0)$, the rate pair $(R_1, R_2)$ is achievable if and only if

R_1 \ge I(X_1; W | X_2),  (136)
R_2 \ge H(X_2 | W),  (137)
R_1 + R_2 \ge H(X_2) + I(X_1; W | X_2),  (138)

where $W$ is an auxiliary random variable which obeys the following conditions: (1) $W \to X_1 \to X_2$ forms a Markov chain; (2) there exists $\hat{X}_1(W, X_2)$ such that $Ed(X_1, \hat{X}_1) \le D_1$; (3) $|\mathcal{W}| \le |\mathcal{X}_1| + 2$.

Proof of achievability of Theorem 9. To prove the achievable region of rate pairs $(R_1, R_2)$ defined by (136), (137) and (138), we need to show that under the Markov condition $W \to X_1 \to X_2$, the two rate pairs

(†) $(I(X_1; W | X_2),\ H(X_2))$,
(††) $(I(X_1; W),\ H(X_2 | W))$

are achievable, which is immediate. Then by time sharing, the whole region defined by (136), (137) and (138) is achievable. □

6.2.4 Successive refinement revisited

Define two simulations to be equivalent if and only if their regions of achievable rates $(R, C)$ are the same. For successive refinement, we can make the following statement, which is stronger than Theorem 5.

Theorem 33 The direct channel simulation and the successive channel simulation are equivalent if and only if $p^n(x^n, \hat{x}_1^n, \hat{x}_2^n)$ satisfies the Markov chain $X^n \to \hat{X}_2^n \to \hat{X}_1^n$.

Proof. By the RST, the direct simulation can be achieved with

R_2 = I(X; \hat{X}_2), \quad C_2 = H(\hat{X}_2 | X).  (139)

In the two-stage method, by the RST, the costs of the first stage can be $R_1 = I(X; \hat{X}_1)$, $C_1 = H(\hat{X}_1 | X)$. Now the state of (sender, receiver) is $(X^n \hat{X}_1^n, \hat{X}_1^n)$. Observing the Markov chain $\hat{X}_1^n \to X^n \hat{X}_1^n \to \hat{X}_2^n$ and applying Theorem 1, the costs of the second stage can be $R_0 = I(X, \hat{X}_1; \hat{X}_2) - I(\hat{X}_2; \hat{X}_1)$, $C_0 = H(\hat{X}_2 | X, \hat{X}_1)$.
The total costs of the two-stage channel simulation are

R_0 + R_1 = I(X; \hat{X}_1, \hat{X}_2), \quad C_0 + C_1 = H(\hat{X}_1, \hat{X}_2 | X).  (140)

Comparing (139) and (140), we can achieve $R_0 + R_1 = R_2$ if and only if $X^n$, $\hat{X}_2^n$, $\hat{X}_1^n$ form a Markov chain $X^n \to \hat{X}_2^n \to \hat{X}_1^n$. However, even under the Markov condition, we still have

H(\hat{X}_1, \hat{X}_2 | X) - H(\hat{X}_2 | X) = H(\hat{X}_1 | \hat{X}_2) \ge 0,

i.e., the common randomness consumption rates $C_0 + C_1$ and $C_2$ do not match.

This can be fixed by observing that in the two-stage case the sender and the receiver end up with $X^n \hat{X}_1^n \hat{X}_2^n$ and $\hat{X}_1^n \hat{X}_2^n$ respectively, whereas in the one-shot case they end up with $X^n \hat{X}_2^n$ and $\hat{X}_2^n$ respectively. We show in the Appendix that since the sender and the receiver share $\hat{X}_1^n \hat{X}_2^n$, they can each apply a function $f : \hat{\mathcal{X}}_1^n \to \mathcal{V}$, where $V$ is uniformly random on $nH(\hat{X}_1 | \hat{X}_2)$ bits. In this way they discard $\hat{X}_1^n$ and convert it into common randomness, "returning" the extra common randomness cost at the end, so that the net costs of common randomness also match exactly. □

Chapter 7: Conclusion

In quantum information theory, hybrid classical-quantum communication scenarios are of great interest. We can use quantum channels to build up secret classical key distribution and to transmit secret classical information. We can also construct a class of quantum error-correcting codes based on classical privacy amplification. Making some of the resources, such as side information, quantum generalizes classical information theory to more interesting problems.

Concerning quantum cryptography, we exploit the relation between quantum privacy and quantum coherence. First, we generalize the BB84 security proof by Shor and Preskill and show that we can build a quantum key expansion protocol, capable of increasing the size of preshared secret keys by a constant factor.
The main part of the generalization is to use entanglement-assisted quantum error-correcting codes rather than CSS codes. We can then employ modern non-dual-containing codes with excellent performance and decoding algorithms to construct codes for universally composable and efficiently decodable key expansion protocols. Second, instead of using the secret keys accumulated through quantum key distribution or expansion protocols for secret classical communication, we can design a general private code construction and prove the private communication capacity over quantum channels.

We can reverse the above reduction from entanglement distillation protocols to secret key distillation protocols and coherify the secret key generation codes into entanglement generation codes. We construct a subclass of CSS codes, called P-CSS codes, by Renner's privacy amplification procedure; the asymptotic rate of P-CSS codes achieves the hashing bound of the memoryless qubit Pauli channels. It would be interesting to find a good quantum code example which uses good classical codes such as LDPC and turbo codes. Since the construction of P-CSS codes is based on a class of affine two-universal hash functions, it would also be interesting to try nonlinear two-universal hash functions and study the properties of the resulting non-additive codes.

In the last scenario, we simulate a classical noisy channel $W(y|x)$ between the sender and the receiver, named Alice and Bob respectively, with Bob holding quantum side information $\rho_X^B$ correlated with Alice's source $X$. In the i.i.d. situation, we can approximate the channel $W^n(y^n|x^n)$ by using $n(I(X;Y) - I(Y;B))$ copies of a noiseless channel and $nH(Y|X)$ bits of common randomness. As a generalization of both the classical reverse Shannon theorem and the classical-quantum Slepian-Wolf problem, the (classical) channel simulation theorem is a useful tool for effecting trade-offs between resources.
By the connection between classical channel simulation (with side information) and rate-distortion theory, we can form a unified approach to prove the direct coding theorems of source coding problems such as source coding with quantum side information, successive refinement, the multiple descriptions problem, and the multi-terminal source coding problem. It greatly simplifies the coding protocols and decomposes the source coding problems into steps of channel simulation (with side information). The fully quantum generalization of this problem, called quantum state redistribution, considers the communication scenario of redistributing a subsystem $C$ of the tripartite state $\rho^{ABC}$ from Alice to Bob. We state this fully quantum conjecture with inner and outer bounds on the (quantum communication, entanglement generation) rate pairs; the formal proof was given later by Yard and Devetak.

We are also interested in finding codes for the quantum-communication-assisted entanglement distillation protocol (the mother protocol), a protocol paralleling the entanglement-assisted quantum communication protocol (the father protocol). Observe that the coherent version of the mother protocol includes state merging, and running the coherent mother protocol backwards gives the reverse Shannon theorem. Finding the mother code and the code for the reverse Shannon theorem may help us find the code for quantum state redistribution.

Bibliography

[1] A. Abeyesinghe, I. Devetak, P. Hayden, and A. Winter. The mother of all protocols: Restructuring quantum information's family tree, 2006. e-print quant-ph/0606225.
[2] R. Ahlswede. The rate-distortion region for multiple descriptions without excess rate. IEEE Trans. Inf. Theory, 31(6):721–726, 1985.
[3] R. Ahlswede and I. Csiszár. Common randomness in information theory and cryptography – Part I: Secret sharing. IEEE Trans. Inf. Theory, 39:1121–1132, 1993.
[4] R. Ahlswede and A. Winter. Strong converse for identification via quantum channels. IEEE Trans. Inf.
Theory, 48:569–579, 2002.
[5] H. Barnum, M. A. Nielsen, and B. Schumacher. Information transmission through a noisy quantum channel. Phys. Rev. A, 57:4153, 1998.
[6] C. H. Bennett and G. Brassard. Quantum cryptography: Public key distribution and coin tossing. In Proceedings of IEEE International Conference on Computers, Systems, and Signal Processing, pages 175–179. IEEE, 1984.
[7] C. H. Bennett, G. Brassard, C. Crépeau, and U. Maurer. Generalized privacy amplification. IEEE Trans. Inf. Theory, 41:1915–1923, 1995.
[8] C. H. Bennett, G. Brassard, and J. M. Robert. Privacy amplification by public discussion. SIAM J. Comput., 17(2):210–229, 1988.
[9] C. H. Bennett, D. P. DiVincenzo, J. A. Smolin, and W. K. Wootters. Mixed state entanglement and quantum error correction. Phys. Rev. A, 52:3824–3851, 1996. e-print quant-ph/9604024.
[10] C. H. Bennett, P. Hayden, D. W. Leung, P. W. Shor, and A. J. Winter. Remote preparation of quantum states. IEEE Trans. Inf. Theory, 51(1):56–74, 2005. e-print quant-ph/0307100.
[11] C. H. Bennett, P. W. Shor, J. A. Smolin, and A. Thapliyal. Entanglement-assisted capacity of a quantum channel and the reverse Shannon theorem. IEEE Trans. Inf. Theory, 48, 2002. e-print quant-ph/0106052.
[12] T. Berger. Rate-Distortion Theory: A Mathematical Basis for Data Compression. Prentice Hall, Englewood Cliffs, N.J., 1971.
[13] T. Berger. The Information Theory Approach to Communications. Springer-Verlag, 1978.
[14] T. Berger and R. W. Yeung. Multiterminal source encoding with one distortion criterion. IEEE Trans. Inf. Theory, 35(2):228–236, 1989.
[15] C. Berrou, A. Glavieux, and P. Thitimajshima. Near Shannon limit error-correcting coding and decoding: turbo codes. In Proc. IEEE International Conference on Communications (ICC '93), volume 2, pages 1064–1070, Geneva, Switzerland, 1993.
[16] R. Bhatia. Matrix Analysis. Number 169 in Graduate Texts in Mathematics. Springer-Verlag, New York, 1997.
[17] T. Brun, I. Devetak, and M.-H. Hsieh.
Correcting quantum errors with entanglement. Science, 314:436–439, 2006. e-print quant-ph/0610092.
[18] T. Brun, I. Devetak, and M.-H. Hsieh. Catalytic quantum error correction. e-print quant-ph/0608027.
[19] A. R. Calderbank and P. W. Shor. Good quantum error-correcting codes exist. Phys. Rev. A, 54:1098–1105, 1996. e-print quant-ph/9512032.
[20] D. Collins and S. Popescu. Classical analogue of entanglement. Phys. Rev. A, 65:032321, 2002. e-print quant-ph/0107082.
[21] T. M. Cover and J. A. Thomas. Elements of Information Theory. Series in Telecommunication. John Wiley and Sons, New York, 1991.
[22] I. Csiszár and J. Körner. Information Theory: Coding Theorems for Discrete Memoryless Systems. Academic Press, New York–San Francisco–London, 1981.
[23] I. Devetak. The private classical capacity and quantum capacity of a quantum channel. IEEE Trans. Inf. Theory, 51(1):44–55, 2005. e-print quant-ph/0304127.
[24] I. Devetak. Triangle of dualities between quantum communication protocols. Phys. Rev. Lett., 97, 2006. e-print quant-ph/0505138.
[25] I. Devetak, A. W. Harrow, and A. Winter. A resource framework for quantum Shannon theory, 2005. e-print quant-ph/0512015.
[26] I. Devetak, A. W. Harrow, and A. J. Winter. A family of quantum protocols. Phys. Rev. Lett., 93, 2004. e-print quant-ph/0308044.
[27] I. Devetak, P. Hayden, D. W. Leung, and P. Shor. Triple trade-offs in quantum Shannon theory, 2006. In preparation.
[28] I. Devetak, P. Hayden, and A. Winter. Principles of Quantum Information Theory. 2006. In preparation.
[29] I. Devetak and P. W. Shor. The capacity of a quantum channel for simultaneous transmission of classical and quantum information, 2003. e-print quant-ph/0311131.
[30] I. Devetak and A. Winter. Classical data compression with quantum side information. Phys. Rev. A, 68:042301, 2003. e-print quant-ph/0209029.
[31] I. Devetak and A. Winter. Distilling common randomness from bipartite quantum states. IEEE Trans. Inf. Theory, 50:3138–3151, 2003.
e-print quant-ph/0304196.
[32] I. Devetak and A. Winter. Relating quantum privacy and quantum coherence: an operational approach. Phys. Rev. Lett., 93, 2004. e-print quant-ph/0307053.
[33] I. Devetak and A. Winter. Distillation of secret key and entanglement from quantum states. Proc. R. Soc. Lond. A, 461:207–235, 2005. e-print quant-ph/0306078.
[34] I. Devetak and J. Yard. Redistributing quantum information, 2006. In preparation.
[35] W. H. R. Equitz and T. M. Cover. Successive refinement of information. IEEE Trans. Inf. Theory, 37(2), 1991.
[36] M. Fannes. A continuity property of the entropy density for spin lattices. Commun. Math. Phys., 31:291, 1973.
[37] R. G. Gallager. Low-Density Parity-Check Codes. MIT Press, Cambridge, MA, 1963.
[38] A. E. Gamal and T. M. Cover. Achievable rates for multiple descriptions. IEEE Trans. Inf. Theory, 28(6), 1982.
[39] D. Gottesman and H.-K. Lo. Proof of security of quantum key distribution with two-way classical communications. IEEE Trans. Inf. Theory, 49(2), 2003. e-print quant-ph/0105121.
[40] P. Hayden, R. Jozsa, and A. Winter. Trading quantum for classical resources in quantum data compression. J. Math. Phys., 43(9):4404–4444, 2002. e-print quant-ph/0204038.
[41] A. S. Holevo. Bounds for the quantity of information transmitted by a quantum communication channel. Problems of Information Transmission, 9:177–183, 1973.
[42] A. S. Holevo. The capacity of the quantum channel with general signal states. IEEE Trans. Inf. Theory, 44, 1998. e-print quant-ph/9611023.
[43] M. Horodecki, J. Oppenheim, and A. Winter. Partial quantum information. Nature, 436:673–676, 2005. e-print quant-ph/0505062.
[44] M.-H. Hsieh, Z. Luo, and T. Brun. Secret keys assisted private classical communication capacity over quantum channels. 2008. e-print arXiv:0806.3525.
[45] R. Impagliazzo, L. A. Levin, and M. Luby. Pseudo-random generation from one-way functions (extended abstract).
In Proceedings of the Twenty-First Annual ACM Symposium on Theory of Computing, pages 12–24, 1989.
[46] S. Lloyd. Capacity of the noisy quantum channel. Phys. Rev. A, 55, 1996. e-print quant-ph/9604015.
[47] H.-K. Lo and H. F. Chau. Unconditional security of quantum key distribution over arbitrarily long distances. Science, pages 2050–2056, 1999. e-print quant-ph/9803006.
[48] Z. Luo. Quantum error correcting codes based on privacy amplification, 2008. In preparation.
[49] Z. Luo and I. Devetak. Channel simulation with quantum side information. 2006. e-print quant-ph/0611008.
[50] Z. Luo and I. Devetak. Unified approach for multi-terminal source coding problem, 2006. In preparation.
[51] Z. Luo and I. Devetak. Efficiently implementable codes for quantum key expansion. Phys. Rev. A, 2007. e-print quant-ph/0608029.
[52] D. J. C. MacKay. Good error-correcting codes based on very sparse matrices. IEEE Trans. Inf. Theory, 45:399–431, 1999.
[53] D. J. C. MacKay and R. M. Neal. Near Shannon limit performance of low density parity check codes. Electronics Letters, 32, 1996.
[54] M. A. Nielsen and I. L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, New York, 2000.
[55] L. Ozarow. On a source-coding problem with two channels and three receivers. Bell Syst. Tech. J., 59(10), 1980.
[56] R. Renner. Security of quantum key distribution, 2005. e-print quant-ph/0512258.
[57] R. Renner and R. König. Universally composable privacy amplification against quantum adversaries. Proc. of TCC 2005, 3378, 2005. e-print quant-ph/0403133.
[58] T. J. Richardson, A. Shokrollahi, and R. Urbanke. Design of capacity-approaching low-density parity-check codes. IEEE Trans. Inf. Theory, 47:619–637, 2001.
[59] B. Schumacher and M. A. Nielsen. Quantum data processing and error correction. Phys. Rev. A, 54:2629–2635, 1996. e-print quant-ph/9604022.
[60] B. Schumacher and M. D. Westmoreland. Sending classical information via noisy quantum channels. Phys. Rev. A, 56, 1997.
[61] B. Schumacher and M.
D. Westmoreland. Quantum privacy and quantum coherence. Phys. Rev. Lett., 80:5695–5697, 1998.
[62] C. E. Shannon. A mathematical theory of communication. Bell System Tech. Jnl., 27:379–423, 623–656, 1948.
[63] C. E. Shannon. Coding theorems for a discrete source with a fidelity criterion. In International Convention Record, volume 7, pages 142–163. Institute of Radio Engineers, 1959.
[64] P. W. Shor and J. Preskill. Simple proof of security of the BB84 quantum key distribution protocol. Phys. Rev. Lett., 85:441–444, 2000.
[65] M. Sipser and D. A. Spielman. Expander codes. IEEE Trans. Inf. Theory, 42:1710–1722, 1996.
[66] D. Slepian and J. K. Wolf. Noiseless coding of correlated information sources. IEEE Trans. Inf. Theory, 19, 1973.
[67] A. M. Steane. Error-correcting codes in quantum theory. Phys. Rev. Lett., 77:793–797, 1996.
[68] A. M. Steane. Multiple particle interference and quantum error correction. Proc. Roy. Soc. Lond. A, 452:2551–2576, 1996.
[69] S. Y. Tung. Multiterminal Source Coding. PhD thesis, Cornell University, 1978.
[70] M. N. Wegman and J. L. Carter. New hash functions and their use in authentication and set equality. J. Comput. System Sci., 22:265–279, 1981.
[71] A. Winter. Coding theorem and strong converse for quantum channels. IEEE Trans. Inf. Theory, 45(7):2481–2485, 1999.
[72] A. Winter. Compression of sources of probability distributions and density operators, 2002. e-print quant-ph/0208131.
[73] A. Winter. "Extrinsic" and "intrinsic" data in quantum measurements: asymptotic convex decomposition of positive operator valued measures. Comm. Math. Phys., 244(1):157–185, 2004. e-print quant-ph/0109050.
[74] H. S. Witsenhausen and A. D. Wyner. Source coding for multiple descriptions II: A binary source. Bell Syst. Tech. J., 60(10):2281–2292, 1981.
[75] J. K. Wolf, A. D. Wyner, and J. Ziv. Source coding for multiple descriptions. Bell Syst. Tech. J., 59(8):1417–1426, 1980.
[76] A. Wyner. The wire-tap channel. Bell. Sys. Tech. J., 54:1355–1387, 1975.
[77] A.
Wyner and J. Ziv. The rate-distortion function for source coding with side information at the decoder. IEEE Trans. Inf. Theory, 22(1):1–10, 1976.
[78] Z. Zhang and T. Berger. New results in binary multiple descriptions. IEEE Trans. Inf. Theory, 33(4):502–521, 1987.

Appendix

A.1 Fidelity and trace distance

It is necessary to recall some facts about trace distances, fidelities, and purifications (mostly taken from [54]). The trace distance between two density operators $\rho$ and $\sigma$ is defined as $\|\rho - \sigma\|_1 = \mathrm{Tr}|\rho - \sigma|$, where $|A| \equiv \sqrt{A^\dagger A}$ is the positive square root of $A^\dagger A$. The fidelity of two density operators with respect to each other is defined as $F(\rho, \sigma) = \|\sqrt{\rho}\sqrt{\sigma}\|_1^2$. For two pure states $|\chi\rangle$, $|\zeta\rangle$ this amounts to $F(|\chi\rangle, |\zeta\rangle) = |\langle \chi | \zeta \rangle|^2$. The following relation between fidelity and trace distance will be needed:

1 - \sqrt{F(\rho, \sigma)} \le \frac{1}{2}\|\rho - \sigma\|_1 \le \sqrt{1 - F(\rho, \sigma)},  (141)

the second inequality becoming an equality for pure states.

A purification $|\Phi_\rho\rangle^{RB}$ of a density operator $\rho^B$ is a pure state living in an augmented quantum system $RB$ such that $\mathrm{Tr}_R(|\Phi_\rho\rangle\langle\Phi_\rho|^{RB}) = \rho^B$. Any two purifications $|\Phi_\rho\rangle^{RB}$ and $|\Phi'_\rho\rangle^{RB}$ of $\rho^B$ are related by some local unitary $U$ on the reference system $R$:

|\Phi'_\rho\rangle^{RB} = (U^R \otimes I^B)|\Phi_\rho\rangle^{RB}.

A theorem by Uhlmann states that, for a fixed purification $\Phi_\sigma$ of $\sigma$,

F(\rho, \sigma) = \max_{\Phi_\rho} F(|\Phi_\rho\rangle, |\Phi_\sigma\rangle).

A corollary of this theorem is the monotonicity property of fidelity

F(\rho^{RB}, \sigma^{RB}) \le F(\rho^B, \sigma^B),  (142)

where $\rho^B = \mathrm{Tr}_R \rho^{RB}$ and $\sigma^B = \mathrm{Tr}_R \sigma^{RB}$. The corresponding monotonicity property of trace distance is

\|\rho^{RB} - \sigma^{RB}\|_1 \ge \|\rho^B - \sigma^B\|_1.  (143)

Another important property of fidelity is concavity:

F\left(\sum_i p_i \rho_i, \sum_i q_i \sigma_i\right) \ge \left(\sum_i \sqrt{p_i q_i} \sqrt{F(\rho_i, \sigma_i)}\right)^2,  (144)

where $p_i \ge 0$, $q_i \ge 0$, and $\sum_i p_i = \sum_i q_i = 1$.

A.2 Rényi entropy

The following definitions and properties of Rényi entropy are mostly taken from [57].
For $\alpha \in [0, \infty]$ and a density operator $\rho$, the Rényi entropy of order $\alpha$ of $\rho$ is defined by

H_\alpha(\rho) := \frac{1}{1-\alpha} \log_2\left(\mathrm{Tr}(\rho^\alpha)\right),

with the convention $H_\alpha(\rho) := \lim_{\beta \to \alpha} H_\beta(\rho)$ for $\alpha \in \{0, 1, \infty\}$. In particular, for $\alpha = 0$, $H_0(\rho) = \log_2(\mathrm{rank}(\rho))$; for $\alpha = 1$, $H_1(\rho)$ is the von Neumann entropy $H(\rho)$; for $\alpha = \infty$, $H_\infty(\rho) = -\log_2(\lambda_{\max}(\rho))$, where $\lambda_{\max}(\rho)$ denotes the maximum eigenvalue of $\rho$. Furthermore, for $\alpha, \beta \in [0, \infty]$,

\alpha \le \beta \iff H_\alpha(\rho) \ge H_\beta(\rho).

The definition of Rényi entropy for density operators can be generalized to the notion of smooth Rényi entropy. For $\alpha \in [0, \infty]$, $\epsilon \ge 0$ and a density operator $\rho$, the $\epsilon$-smooth Rényi entropy of order $\alpha$ of $\rho$ is defined by $H_\alpha^\epsilon(\rho) := \inf_{\sigma \in B^\epsilon(\rho)} H_\alpha(\sigma)$ for $0 \le \alpha < 1$ and $H_\alpha^\epsilon(\rho) := \sup_{\sigma \in B^\epsilon(\rho)} H_\alpha(\sigma)$ for $1 < \alpha \le \infty$, where $B^\epsilon(\rho) = \{\sigma : \|\rho - \sigma\|_1 \le \epsilon\}$ and $H_1^\epsilon(\rho) = H(\rho)$. In the independent and identically distributed (i.i.d.) case, the smooth Rényi entropy rate $\frac{1}{n} H_\alpha^\epsilon(\rho^{\otimes n})$ of the state $\rho^{\otimes n}$ approaches the von Neumann entropy $H(\rho)$ as $n$ goes to infinity.

Lemma 34 For a density operator $\rho$ and any $\alpha \in [0, \infty]$,

\lim_{\epsilon \to 0} \lim_{n \to \infty} \frac{H_\alpha^\epsilon(\rho^{\otimes n})}{n} = H(\rho).  (145)

A.3 Universally composable privacy amplification

Now we are ready to introduce an important lemma by Renner and König [57].

Lemma 35 For a classical-quantum system $YE$ with state $\sigma^{YE} = \sum_{y \in \mathcal{Y}} p(y) |y\rangle\langle y|^Y \otimes \sigma_y^E$, let $f$ be a two-universal function on $\mathcal{Y}$ with range $\mathbb{Z}_2^m$, chosen independently of $YE$. Then

\mathbb{E}_f \left\| \sum_s q(s|f) |s\rangle\langle s|^S \otimes \sigma_s^E(f) - \tau^S \otimes \sigma^E \right\|_1 \le \epsilon,

where $\epsilon = 2^{-\frac{1}{2}(H_2(YE)_\sigma - H_0(E)_\sigma - m)}$, $S = f(Y)$ with probability distribution $q$, $\sigma_s^E(f) = \frac{1}{q(s|f)} \sum_{y \in f^{-1}(s)} p(y) \sigma_y^E$ with $f^{-1}(s) = \{y \,|\, f(y) = s\}$, and $\tau^S = \frac{1}{2^m} \sum_s |s\rangle\langle s|^S$ is the maximally mixed state.

The result of Lemma 35 can be generalized to smooth Rényi entropy.

Corollary 36 For a classical-quantum system $YE$ with state $\sigma^{YE} = \sum_{y \in \mathcal{Y}} p(y) |y\rangle\langle y|^Y \otimes \sigma_y^E$, let $f$ be a two-universal function on $\mathcal{Y}$ with range $\mathbb{Z}_2^m$, chosen independently of $YE$.
Then for $\epsilon \ge 0$,

\mathbb{E}_f \left\| \sum_s q(s|f) |s\rangle\langle s|^S \otimes \sigma_s^E(f) - \tau^S \otimes \sigma^E \right\|_1 \le \epsilon',

where $\epsilon' = 2^{-\frac{1}{2}(H_2^\epsilon(YE)_\sigma - H_0^\epsilon(E)_\sigma - m)} + 2\epsilon$, $S = f(Y)$ with probability distribution $q$, $\sigma_s^E(f) = \frac{1}{q(s|f)} \sum_{y \in f^{-1}(s)} p(y) \sigma_y^E$ with $f^{-1}(s) = \{y \,|\, f(y) = s\}$, and $\tau^S = \frac{1}{2^m} \sum_s |s\rangle\langle s|^S$ is the maximally mixed state.

A.4 Evaluating Rényi entropy for Pauli channels

Given the expression (12) of $|\phi_x\rangle^{BE}$, we have

\phi_x^E = \sum_u p_u |u\rangle\langle u|^{E_1} \otimes |\phi_{x,u}\rangle\langle\phi_{x,u}|^{E_2},

with $|\phi_{x,u}\rangle^{E_2} = \sum_v \sqrt{p_{v|u}} (-1)^{v \cdot x} |v\rangle^{E_2}$. Observe that $\mathrm{Tr}(\phi_x^E)^2 = \sum_u p_u^2$ is independent of $x$. Then for $\omega^{XBE}$ defined by (19) and $\sigma^{YE}$ defined by (33), we have

H_2(YE)_\sigma = k - \log_2\left(\sum_u p_u^2\right), \quad H_2(XE)_\omega = n - \log_2\left(\sum_u p_u^2\right).

So $H_2(XE)_\omega = n - k + H_2(YE)_\sigma$.

To show $H_0(E)_\omega \ge H_0(E)_\sigma$, we need Weyl's monotonicity theorem [16]. For a Hermitian operator $A$, define $\lambda^\downarrow(A) = (\lambda_1^\downarrow(A), \ldots, \lambda_n^\downarrow(A))$, where the eigenvalues $\lambda_j^\downarrow(A)$ are arranged in decreasing order.

Theorem 37 (Weyl's monotonicity theorem) If $A$ is Hermitian and $B$ is positive, then for all $j$, $\lambda_j^\downarrow(A + B) \ge \lambda_j^\downarrow(A)$.

Hence, if both $A$ and $B$ are density operators, the number of positive eigenvalues of $A + B$ is no less than that of $A$, i.e., $\mathrm{rank}(A + B) \ge \mathrm{rank}(A)$. Note that the definition of $\sigma^{XE}$ can be generalized to $\sigma_\ell^{XE}$ by changing the classical code $C$ to the $\ell$th coset of $C$ in $\mathbb{Z}_2^n$. Given $\omega^E = \frac{1}{2^{n-k}} \sum_\ell \sigma_\ell^E$, we have $\mathrm{rank}(\omega^E) \ge \mathrm{rank}(\sigma_\ell^E)$, i.e., $H_0(E)_\omega \ge H_0(E)_{\sigma_\ell}$ for all $\ell$.

A.5 Technical results of information theory

Now we introduce some useful tools from classical information theory.

Theorem 38 (Chain rule for entropy) Let $X_1, \ldots, X_n$ be jointly distributed random variables. Then

H(X_1, X_2, \ldots, X_n) = \sum_{i=1}^n H(X_i | X_{i-1}, \ldots, X_1).

Before we introduce the chain rule for information, we define conditional mutual information. The conditional mutual information of random variables $X$ and $Y$ given $Z$ is

I(X; Y | Z) = H(X|Z) - H(X|Y,Z),

which is also non-negative. The chain rule for information is as follows.
Theorem 39 (Chain rule for information) Let $X_1, X_2, \ldots, X_n, Y$ be jointly distributed random variables. Then

I(X_1, X_2, \ldots, X_n; Y) = \sum_{i=1}^n I(X_i; Y | X_{i-1}, \ldots, X_1).

Random variables $X$, $Y$, $Z$ are said to form a Markov chain, denoted $X \to Y \to Z$, iff $X$ and $Z$ are conditionally independent given $Y$, i.e., the joint probability distribution $p(x, z | y)$ can be written as $p(x, z|y) = p(x|y) p(z|y)$.

Now we are ready to introduce the data processing inequality, which says that no processing of $Y$ can increase the information it carries about $X$.

Theorem 40 (Data processing inequality) If $X \to Y \to Z$, then $I(X; Y) \ge I(X; Z)$.

The following two theorems are useful in quantum information theory.

Theorem 41 (Schmidt decomposition) Given a pure state $|\psi\rangle^{AB}$ of a composite system $AB$, there exist an orthonormal basis $\{|\alpha_i\rangle^A\}$ for system $A$ and an orthonormal basis $\{|\beta_i\rangle^B\}$ for system $B$ such that

|\psi\rangle^{AB} = \sum_i \lambda_i |\alpha_i\rangle^A |\beta_i\rangle^B,

where the $\lambda_i$ are non-negative real numbers satisfying $\sum_i \lambda_i^2 = 1$.

By the Schmidt decomposition, $\rho^A = \sum_i \lambda_i^2 |\alpha_i\rangle\langle\alpha_i|$ and $\rho^B = \sum_i \lambda_i^2 |\beta_i\rangle\langle\beta_i|$, so $\rho^A$ and $\rho^B$ have the same set of eigenvalues $\{\lambda_i^2\}$.

Theorem 42 (Purification) Any density operator $\rho^A$ can be purified by a reference system $R$ such that the joint state $|\psi\rangle^{RA}$, called a purification of $\rho^A$, satisfies $\mathrm{Tr}_R(|\psi\rangle\langle\psi|^{RA}) = \rho^A$.

To purify a mixed state $\rho^A = \sum_i p_i |\alpha_i\rangle\langle\alpha_i|^A$, take a reference system $R$ whose state space has the same dimension as system $A$, with orthonormal basis $\{|\beta_i\rangle^R\}$; the joint state with Schmidt decomposition

|\psi\rangle^{RA} = \sum_i \sqrt{p_i}\, |\alpha_i\rangle^A |\beta_i\rangle^R

is then a purification of $\rho^A$.

A.6 Typicality and conditional typicality

We follow the standard presentation of [22]. The probability distribution $P_{x^n}$ defined by $P_{x^n}(x) = \frac{N(x|x^n)}{n}$ is called the empirical distribution or type of the sequence $x^n$, where $N(x|x^n)$ counts the number of occurrences of $x$ in the word $x^n = x_1 x_2 \ldots x_n$.
A sequence $x^n \in \mathcal{X}^n$ is called $\delta$-typical with respect to a probability distribution $p$ defined on $\mathcal{X}$ if

|P_{x^n}(x) - p(x)| \le p(x)\,\delta, \quad \forall x \in \mathcal{X}.  (146)

The latter condition may be rewritten as $P_{x^n} \in [p(1-\delta), p(1+\delta)]$. The set $T_{p,\delta}^n \subseteq \mathcal{X}^n$ consisting of all $\delta$-typical sequences is called the $\delta$-typical set. When the distribution $p$ is associated with some random variable $X$, we may use the notation $T_{X,\delta}^n$. The trace distance measure $\|p - q\|_1$ between two probability distributions $p$ and $q$ defined over the same alphabet $\mathcal{X}$ is given by

\|p - q\|_1 = \sum_x |p(x) - q(x)|.  (147)

Observe that Eq. (146) implies $\|p - P_{x^n}\|_1 \le \delta$.

The name "typical" comes from the law of large numbers: if the letters $x_i$ of $x^n$ are independent and identically distributed according to the probability distribution $p$, then as $n \to \infty$ the empirical distribution of $x^n$ approaches $p$. Thus, for large $n$, a randomly selected sequence is likely to be typical. The properties of typical sets are given by the following theorem.

Theorem 43 For all $\epsilon > 0$, $\delta > 0$ and sufficiently large $n$:
(1) $2^{-n[H(p) + c\delta]} \le p^n(x^n) \le 2^{-n[H(p) - c\delta]}$ for $x^n \in T_{p,\delta}^n$;
(2) $p^n(T_{p,\delta}^n) = \Pr\{X^n \in T_{p,\delta}^n\} \ge 1 - \epsilon$;
(3) $(1-\epsilon)\, 2^{n[H(p) - c\delta]} \le |T_{p,\delta}^n| \le 2^{n[H(p) + c\delta]}$;
for some constant $c$ depending only on $p$. Above, the distribution $p^n$ is naturally defined on $\mathcal{X}^n$ by $p^n(x^n) = p(x_1) \cdots p(x_n)$.

Given a pair of sequences $(x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n$, the probability distribution $P_{y^n|x^n}$ defined by

P_{y^n|x^n}(y|x) = \frac{N(xy|x^n y^n)}{N(x|x^n)} = \frac{P_{x^n y^n}(x,y)}{P_{x^n}(x)}

is called the conditional empirical distribution or conditional type of the sequence $y^n$ relative to the sequence $x^n$. A sequence $y^n = y_1 \ldots y_n \in \mathcal{Y}^n$ is called $\delta$-conditionally typical with respect to the conditional probability distribution $Q$ and a sequence $x^n = x_1 \ldots x_n \in \mathcal{X}^n$ if

P_{y^n|x^n}(y|x) \in [(1-\delta)\, Q(y|x), (1+\delta)\, Q(y|x)], \quad \forall x \in \mathcal{X}, \forall y \in \mathcal{Y}.

The set of such sequences is denoted by $T_{Q,\delta}^n(x^n) \subseteq \mathcal{Y}^n$. When $Q$ is associated with some conditional random variable $Y|X$, we may use the notation $T_{Y|X,\delta}^n(x^n)$. Define $q(y) = \sum_x Q(y|x)\, p(x)$.
Theorem 44 For all $\epsilon > 0$, $\delta > 0$, $\delta' > 0$, and sufficiently large $n$, for all $x^n \in T^n_{p,\delta'}$,
(1) $2^{-n[H(Y|X)+c\delta+c'\delta']} \le Q^n(y^n|x^n) \le 2^{-n[H(Y|X)-c\delta-c'\delta']}$ for $y^n \in T^n_{Q,\delta}(x^n)$;
(2) $Q^n(T^n_{Q,\delta}(x^n)\,|\,x^n) = \Pr\{Y^n \in T^n_{Q,\delta}(x^n) \mid X^n = x^n\} \ge 1 - \epsilon$;
(3) $(1-\epsilon)\, 2^{n[H(Y|X)-c\delta-c'\delta']} \le |T^n_{Q,\delta}(x^n)| \le 2^{n[H(Y|X)+c\delta+c'\delta']}$;
(4) if $y^n \in T^n_{Q,\delta}(x^n)$, then $(x^n, y^n) \in T^n_{pQ,\,\delta+\delta'+\delta\delta'}$, and hence $y^n \in T^n_{q,\,\delta+\delta'+\delta\delta'}$;
(5) $Q^n(T^n_{q,\,\delta+\delta'+\delta\delta'}\,|\,x^n) \ge 1 - \epsilon$;
for some constants $c, c'$ depending only on $p$ and $Q$.
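Conditional typicality and property (4) can be illustrated numerically in the same spirit, with a binary symmetric channel standing in for $Q$ (the channel, input distribution, and parameters below are illustrative assumptions, not taken from the thesis; for simplicity the check uses $\delta' = \delta$):

```python
import random
from collections import Counter

eps = 0.1                      # crossover probability of the channel Q(y|x)
Q = {(x, y): (1 - eps if x == y else eps) for x in (0, 1) for y in (0, 1)}
p = {0: 0.5, 1: 0.5}           # input distribution
n, delta = 100_000, 0.1
random.seed(2)

xn = random.choices((0, 1), weights=(p[0], p[1]), k=n)
yn = [x if random.random() >= eps else 1 - x for x in xn]  # x^n through Q

# Conditional type: P_{y^n|x^n}(y|x) = N(xy|x^n y^n) / N(x|x^n).
nx, nxy = Counter(xn), Counter(zip(xn, yn))
P_cond = {(x, y): nxy[x, y] / nx[x] for x in (0, 1) for y in (0, 1)}

# delta-conditional typicality of y^n with respect to Q and x^n.
assert all((1 - delta) * Q[x, y] <= P_cond[x, y] <= (1 + delta) * Q[x, y]
           for x in (0, 1) for y in (0, 1))

# Property (4): with x^n delta-typical and y^n delta-conditionally typical,
# the joint type lies within a relative (2*delta + delta**2) of p(x)Q(y|x).
tol = 2 * delta + delta**2
assert all(abs(nxy[x, y] / n - p[x] * Q[x, y]) <= p[x] * Q[x, y] * tol
           for x in (0, 1) for y in (0, 1))
```

The joint-type bound is exactly the $(\delta + \delta' + \delta\delta')$ of property (4) with $\delta' = \delta$, since the joint type is the product of the input type and the conditional type.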
Abstract
In this thesis, we mainly investigate four different topics: efficiently implementable codes for quantum key expansion [51], quantum error-correcting codes based on privacy amplification [48], private classical capacity of quantum channels [44], and classical channel simulation with quantum side information [49, 50].
Asset Metadata
Creator: Luo, Zhicheng (author)
Core Title: Topics in quantum cryptography, quantum error correction, and channel simulation
School: College of Letters, Arts and Sciences
Degree: Doctor of Philosophy
Degree Program: Physics
Publication Date: 05/04/2009
Defense Date: 09/02/2008
Publisher: University of Southern California (original); University of Southern California. Libraries (digital)
Tags: channel simulation, OAI-PMH Harvest, quantum cryptography, quantum error correction, secret communication
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Devetak, Igor (committee chair); Brun, Todd A. (committee member); Dappen, Werner (committee member); Haas, Stephan (committee member); Lidar, Daniel (committee member)
Creator Email: zhicheng.luo@gmail.com, zluo@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-m2117
Unique Identifier: UC199514
Identifier: etd-Luo-2710 (filename); usctheses-m40 (legacy collection record id); usctheses-c127-227033 (legacy record id); usctheses-m2117 (legacy record id)
Legacy Identifier: etd-Luo-2710.pdf
Dmrecord: 227033
Document Type: Dissertation
Rights: Luo, Zhicheng
Type: texts
Source: University of Southern California (contributing entity); University of Southern California Dissertations and Theses (collection)
Repository Name: Libraries, University of Southern California
Repository Location: Los Angeles, California
Repository Email: cisadmin@lib.usc.edu