Design and analysis of large scale antenna systems
(USC Thesis Other)
DESIGN AND ANALYSIS OF LARGE SCALE ANTENNA SYSTEMS
by Ansuman Adhikary

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

July 2014
Copyright 2014 Ansuman Adhikary

Dedication

This dissertation is dedicated to Mom, Dad and Probir.

Acknowledgements

First and foremost, I would like to express my sincere gratitude to my advisor, Prof. Giuseppe Caire, for his guidance and support throughout the past five years. He has been my role model of a true scholar and mentor and I am extremely honored to be his student. This research work would not have been possible without his continuous encouragement and inspiration.

I extend my sincere thanks to Prof. Andy Molisch and Prof. Susan Montgomery, who reviewed this dissertation. Their insightful comments and invaluable advice made this work technically sound and meaningful.

I am deeply indebted to Junyoung Nam, Dr. Haralabos Papadopoulos, Dr. Alexei Ashikhmin, Dr. Tom Marzetta, and Prof. Ted Rappaport, who are the coauthors of the papers which led to the results of this dissertation. I was fortunate to work together with each of them and have learned a lot from their integrity and erudition. In addition, I am grateful to my colleagues Song Nam, Arash, Hassan, Seun, Ozgun, Hoon, Mingyue, Dilip and Avik for their friendly help during my USC years.

Finally, I want to say thank you to my Mom and Dad for their endless support, sacrifice, and love. And a special thank you to my friend Probir for being a wonderful companion throughout.

Abstract

A Large Scale Antenna System (LSAS) entails a large number (tens or hundreds) of base station antennas serving a much smaller number of terminals, with large gains in spectral efficiency and energy efficiency compared with conventional multiuser MIMO technology.
However, enabling multiuser MIMO requires very accurate channel state information at the transmitter (CSIT), which can be acquired via uplink pilots in Time Division Duplexing (TDD) systems and via downlink pilots and uplink feedback in Frequency Division Duplexing (FDD) systems. In conventional cellular technology, where FDD is employed, acquiring CSIT becomes prohibitive when the number of antennas is large. In this work, we propose Joint Spatial Division and Multiplexing (JSDM) and show that it achieves significant savings both in the downlink training and in the CSIT uplink feedback, thus making the use of large antenna arrays at the base station potentially suitable also for FDD systems. JSDM is a two-stage beamforming scheme that relies on serving groups of users with approximately similar channel covariances. We prove a simple condition under which JSDM incurs no loss of optimality with respect to the full CSIT case, and show that this condition is approached in the limit of a large number of antennas for uniformly spaced linear arrays. We extend these ideas to the case of a two-dimensional base station antenna array, with 3-dimensional beamforming, including multiple beams in the elevation angle direction. We provide guidelines for optimization and calculate the system spectral efficiency under proportional fairness and max-min fairness criteria, showing extremely attractive performance. We also show that JSDM with simple opportunistic user selection is able to achieve the same scaling law of the system capacity as with full channel state information, and propose a simple scheme for grouping users in a realistic setting. We propose a low-overhead probabilistic scheduling algorithm that selects the users at random with certain probabilities. As a result, only the pre-selected users are required to feed back their channel state information, realizing important savings in the CSIT feedback.
We study the applicability of JSDM to mm-Wave channels and analyze its performance in some realistic propagation channels. Evaluations in propagation channels obtained from ray tracing, as well as in measured outdoor channels, show that JSDM performs surprisingly well in mm-Wave channels. Finally, we study the performance of JSDM in a heterogeneous network consisting of a large number of small cells deployed under a macro-cellular “umbrella”. We propose efficient inter-tier interference management schemes using JSDM as a sort of “spatial blanking”, which is significantly more efficient than the isotropic slot blanking (enhanced Inter-Cell Interference Coordination, eICIC) currently proposed in LTE standardization. Our numerical results are obtained via asymptotic random matrix theory, avoiding lengthy Monte Carlo simulations.

Table of Contents

Dedication
Acknowledgements
Abstract
List of Tables
List of Figures
Chapter 1 Introduction
  1.1 MU-MIMO Downlink
  1.2 Joint Spatial Division and Multiplexing
  1.3 Organization
Chapter 2 Joint Spatial Division and Multiplexing – The Large-Scale Array Regime
  2.1 Channel Model
  2.2 The idea of JSDM
  2.3 JSDM with Eigen-Beamforming
    2.3.1 Achieving capacity with reduced CSIT
    2.3.2 Block diagonalization
  2.4 Performance analysis
    2.4.1 JSDM with joint group processing
    2.4.2 JSDM with per-group processing
    2.4.3 Validation of the asymptotic analysis
  2.5 Downlink training and noisy CSIT
    2.5.1 Results with downlink channel estimation
  2.6 Uniform linear arrays: eigenvalues and eigenvectors
    2.6.1 Approximating the channel eigenspace
    2.6.2 DFT pre-beamforming
  2.7 JSDM with 3D pre-beamforming
    2.7.1 Results with 3D pre-beamforming
Chapter 3 Joint Spatial Division and Multiplexing: Opportunistic Beamforming and User Grouping
  3.1 Sum capacity scaling in the large user regime
    3.1.1 Achievability
  3.2 User Grouping
    3.2.1 Algorithm 1: K-means Clustering
    3.2.2 Algorithm 2: fixed quantization
    3.2.3 Numerical results
  3.3 Large System Limit
    3.3.1 DFT-based user grouping in the large system limit
    3.3.2 Probabilistic user selection
    3.3.3 Analysis
    3.3.4 Results
Chapter 4 Joint Spatial Division and Multiplexing for mm-Wave channels
  4.1 Spatial Channel Models
  4.2 Multiple scattering clusters
  4.3 Application of JSDM to Highly Directional Channels
    4.3.1 Channel eigenvalue spectrum and angular occupancy
    4.3.2 Optimization Problem 1
    4.3.3 Optimization Problem 2
    4.3.4 Application of JSDM after selection
  4.4 Numerical Results
    4.4.1 Ray tracing channels
    4.4.2 Measured channels
    4.4.3 JSDM with spatial multiplexing
    4.4.4 Covariance-based JSDM
Chapter 5 Massive MIMO and Inter-tier Interference Coordination
  5.1 System model
    5.1.1 Macrocell subsystem
    5.1.2 Small cell subsystem
  5.2 System performance: Reverse TDD
    5.2.1 Tier-1 DL
    5.2.2 Tier-2 UL
  5.3 System performance: Co-channel TDD
    5.3.1 Tier-1 UL
    5.3.2 Practical considerations for tier-1 UL
    5.3.3 Tier-2 UL
  5.4 Results
    5.4.1 Design of the pre-beamforming matrices
    5.4.2 Simulations
Chapter 6 Conclusion
Appendix A
  A.1 Deterministic equivalents for the SINR with PGP and noisy CSIT
    A.1.1 Regularized Zero Forcing Precoding
    A.1.2 Zero Forcing Precoding
  A.2 General formula for $S(\xi)$
  A.3 Converse of Theorem 4
  A.4 Limit of the growth function

List of Tables

2.1 Sum spectral efficiency (bit/s/Hz) under PFS and max-min fairness scheduling for PGP and approximate BD/DFT.
4.1 Ray-tracing simulation parameters of USC campus
4.2 28 GHz Channel Sounder Specifications

List of Figures

2.1 A UT at AoA $\theta$ with a scattering ring of radius $r$ generating a two-sided AS $\Delta$ with respect to the BS at the origin.
2.2 Comparison of sum spectral efficiency (bit/s/Hz) vs. SNR (dB) for JSDM with their corresponding deterministic equivalents. “JGP” denotes JSDM with joint group processing and “PGP” denotes JSDM with per-group processing.
2.3 Sum spectral efficiency (bit/s/Hz) vs. SNR (dB) for JSDM (computed via deterministic equivalents) with $r^\star = 11$, for $S' = 4$ and $S' = 8$. The coherence block length is $T = 40$. The “green” and “cyan” curves denote the results for imperfect CSIT with optimized choice of $b'$. “JGP” denotes JSDM with joint group processing and “PGP” denotes JSDM with per-group processing.
2.4 Sum spectral efficiency (bit/s/Hz) vs. $b'$ for JSDM with $r^\star = 11$, for $S' = 8$ (computed via deterministic equivalents). The coherence block length is $T = 40$. The “dashed” curves denote the results for PGP with perfect CSIT, and the “solid” lines denote the same for imperfect CSIT.
2.5 Ratio $S'/b'$ (slope) for the optimized $S'$ and $b'$ versus the channel SNR for different precoders.
2.6 $M = 400$, $\theta = \pi/6$, $D = 1$, $\Delta = \pi/10$. Exact empirical eigenvalue cdf of $\mathbf{R}$ (red), its approximation (2.60) based on the circulant matrix $\mathbf{C}$ (dashed blue) and its approximation from the samples of $S(\xi)$ (dashed green).
2.7 Eigenvalue spectra for a ULA with $M = 400$, $G = 3$, $\theta_1 = -\pi/4$, $\theta_2 = 0$, $\theta_3 = \pi/4$, $D = 1/2$ and $\Delta = 15$ deg.
2.8 Sum spectral efficiency (bit/s/Hz) vs. SNR (dB) for JSDM (computed via deterministic equivalents) for DFT pre-beamforming and PGP, for the configuration with spectra shown in Fig. 2.7, choosing $b_g = r_g$ for all groups $g = 1, 2, 3$.
2.9 The layout of one pattern for JSDM with 3D pre-beamforming. The concentric regions are separated by the vertical pre-beamforming. The circles indicate user groups. Same-color groups are served simultaneously using JSDM.
2.10 Sum spectral efficiency $\bar{R}_l$ for different annular regions $l = 1, \ldots, 8$ with Regularized ZF and ZF for JSDM with 3D pre-beamforming and ideal CSIT. “BD” denotes PGP with approximate block diagonalization and “DFT” stands for PGP with DFT pre-beamforming. Equal power is allocated to all served users and the number of users (streams) in each group is optimized in order to maximize the overall spectral efficiency.
3.1 Comparison of sum spectral efficiency (bit/s/Hz) vs. number of users for JSDM with DFT-based fixed quantization user grouping and different user selection algorithms.
3.2 Comparison of sum spectral efficiency (bit/s/Hz) vs. number of users for JSDM-ZFBF-SUS and JSDM-GBF-MAX with different user grouping algorithms for $M = 16$.
3.3 Illustration of user grouping in the large system limit. ‘black’ denotes $g = 1$, ‘purple’ denotes $g = 2$ and ‘red’ and ‘blue’ denote $g = 3$ and $4$.
3.4 Partition of the $\theta$-$\Delta$ plane into different patterns. Within each pattern, there are different groups.
3.5 Optimization of user subgroups fractions for proportional fairness scheduling in the large system limit, for Pattern 1. $G = 4$, $b_1 = b_2 = b_3 = b_4 = 2$, $\delta\gamma = 0.01$ and $P = 10$ dB.
3.6 Optimization of user subgroups fractions for sum rate maximization in the large system limit, for Pattern 1. $G = 4$, $b_1 = b_2 = b_3 = b_4 = 2$, $\delta\gamma = 0.01$ and $P = 10$ dB.
3.7 Comparison of sum spectral efficiency (bit/s/Hz) vs. SNR for JSDM with $M = 64$ and varying $N$ for simplified user grouping and probabilistic user scheduling with different fairness functions.
3.8 CDFs of the normalized subgroup rates for JSDM with $M = 64$, SNR = 10 dB and varying $N$ for simplified user grouping and probabilistic user scheduling with different fairness functions.
4.1 Two user groups with local one-cluster scattering and a common scatterer that couples them.
4.2 Sum spectral efficiency (in bits/s/Hz) versus SNR for a scenario with two groups and a common scatterer.
4.3 Ray-tracing simulation environment
4.4 28 GHz cellular measurement locations in Manhattan near the NYU campus. Three base station locations (yellow stars on the one-story rooftop of Coles Recreational Center and five-story balcony of the Kaufman building of Stern Business School) were used to transmit to each of the 25 RX locations within 31 to 423 m. Green dots represent visible RX locations, and purple squares represent RX sites that are blocked by buildings in this image.
4.5 Comparison of sum spectral efficiency versus SNR with $G = 2$ and $G = 5$ user groups. Each user group has multiple scattering clusters, of which some are common to more than one group.
4.6 Comparison of Spatial Multiplexing versus $\log \epsilon$ with $G = 2$ and $G = 5$ user groups. Each user group has multiple scattering clusters, of which some are common to more than one group.
4.7 Comparison of sum spectral efficiency versus SNR and Spatial Multiplexing versus $\log \epsilon$ with $G = 5$ user groups and no spatial multiplexing. Each user has multiple scattering clusters.
4.8 Comparison of sum spectral efficiency versus SNR and Spatial Multiplexing versus $\log \epsilon$ with $K = 20$ users. Each user has multiple scattering clusters.
4.9 Comparison of sum spectral efficiency versus transmit power with varying $K$ when the channel is modeled as a double directional impulse response.
4.10 Comparison of Spatial Multiplexing versus Number of users when the channel is modeled as a double directional impulse response.
4.11 Comparison of sum spectral efficiency versus Transmit Power for different BS locations obtained from measured data.
5.1 Frame structure for the two-tier network
5.2 Sample layout showing the two tier network.
5.3 Reverse TDD vs Co TDD (Isotropic Scattering)
5.4 Reverse TDD vs Co TDD (Directional Scattering)

Chapter 1 Introduction

Multiuser MIMO (MU-MIMO) is one of the core technologies expected to be adopted by the next generation of wireless communication systems. Its basic information theoretic principles are grounded in the well-developed theory of vector Gaussian broadcast channels [CS03,WSS06], representative of a downlink scenario where a base station (BS) equipped with multiple antennas serves a number of wireless users. A considerable amount of effort has been dedicated to studying the performance of such schemes in the presence of various practical constraints, including non-ideal channel state information at the transmitter (CSIT) [CJKR10], the overhead incurred by downlink channel probing and CSIT feedback [KJC11], the extension to joint precoding over clusters of multiple BSs [HTC12,HCPR12], and the regime of a very large number of antennas at the BS [Mar10,HtBD13,YM13,JAMV11].

Large Scale Antenna Systems (or “Massive MIMO” (multiple-input multiple-output)) are equipped with a large number (dozens or hundreds) of antenna elements at the base station (BS) [Mar10,RPL+13]. They are intended to be employed in a multi-user MIMO (MU-MIMO) setting, such that the number of BS antenna elements is much larger than the number of users. Such an arrangement leads not only to very high spectral efficiency, but also to an important simplification of the signal processing: in the idealized regime of independent and isotropically distributed channel vectors, in the limit of an infinite number of BS antennas, single-user beamforming, specifically conjugate beamforming (i.e., maximum ratio combining in the receive mode, and maximum ratio transmission in the transmit mode), eliminates inter-user interference. Furthermore, the transmit power can be drastically reduced, leading to less interference and a lower energy consumption of the BS.
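The statement that conjugate beamforming removes inter-user interference as the number of BS antennas grows can be checked numerically. The following is a minimal Python sketch, assuming i.i.d. CN(0, 1) channel entries (one instance of the idealized isotropic regime); the function names are illustrative and do not come from the dissertation. It compares the normalized signal gain $|\mathbf{h}_1^H \mathbf{h}_1|/M$ with the normalized interference $|\mathbf{h}_1^H \mathbf{h}_2|/M$ for two independent users.

```python
import random

def cn_vector(m, rng):
    """Draw m i.i.d. CN(0, 1) entries (real and imaginary parts have variance 1/2)."""
    s = 0.5 ** 0.5
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(m)]

def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def normalized_gains(m, seed=0):
    """Return (|h1^H h1| / m, |h1^H h2| / m) for two independent users."""
    rng = random.Random(seed)
    h1, h2 = cn_vector(m, rng), cn_vector(m, rng)
    return abs(inner(h1, h1)) / m, abs(inner(h1, h2)) / m

# The signal term stays close to 1 while the interference term shrinks
# roughly like 1/sqrt(M) under the i.i.d. assumption above.
for m in (8, 64, 512, 4096):
    signal, leak = normalized_gains(m)
    print(f"M={m:5d}  signal={signal:.3f}  interference={leak:.3f}")
```

Under this assumption the interference term decays with the array size while the useful term concentrates around one, which is the behaviour the paragraph above describes.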
For all these reasons, massive MIMO has received tremendous attention in recent years [CSAL+12,NLM13,HtBD13,MCD+13,HCPR12,LETM14].

1.1 MU-MIMO Downlink

Consider a Multiuser MIMO (MU-MIMO) downlink formed by a base station (BS) with $M$ antennas and $K$ single-antenna user terminals (UTs). The $K \times M$ channel matrix is assumed to be constant over time-frequency slots of fixed size $T$ channel uses (see footnote 1). The sum rate in the high-SNR regime behaves at best as $M^\star (1 - M^\star/T) \log \mathrm{SNR} + O(1)$, where $M^\star = \min\{M, K, T/2\}$ and where the pre-log factor $M^\star (1 - M^\star/T)$ is the best possible multiplexing gain of the system. This upper bound is obtained by letting all UTs cooperate and using the result of [ZT02] on the high-SNR capacity of the non-coherent block-fading MIMO point-to-point channel. In Frequency Division Duplexing (FDD) systems, where the fading channel reciprocity cannot be exploited, an optimistic lower bound which gives the same sum rate scaling is obtained by devoting $M^\star$ dimensions per block to downlink training, in order to allow the UTs to estimate their $M^\star$-dimensional channel vectors, and assuming that Channel State Information at the Transmitter (CSIT) can be provided by delay-free and error-free feedback. Results in [CJKR10,KJC11] show that, if the feedback is properly designed, the effect of feedback errors is negligible with respect to the downlink training estimation error. In this case, the pre-log factor (i.e., the system multiplexing gain) $M^\star (1 - M^\star/T)$ is achievable by assuming that the feedback can be implemented in the same coherence block. If, more realistically, instantaneous feedback in the same fading coherence block is not possible, a prediction error further decreases the system multiplexing gain by the factor $\max\{1 - 2 B_d T_s, 0\}$, where $B_d = v f_0 / c$ is the Doppler bandwidth (Hz), with $v$ denoting the UT speed (m/s), $f_0$ the carrier frequency (Hz), $c$ the speed of light (m/s), and $T_s$ the slot duration (s) [CJKR10,KJC11].

Footnote 1: A channel use corresponds to an independent complex signal-space dimension in the time-frequency domain.

It is evident that, even without taking into account the cost of CSIT feedback (which may impact the uplink system capacity), the MU-MIMO multiplexing gain for an FDD system based on downlink training, channel estimation (and possibly prediction) at the UTs, and CSIT feedback is significantly reduced when $M^\star$ is not much smaller than $T$ and/or $2 B_d T_s$ is not much smaller than 1. In particular, for large $M$ and $K$, the downlink training represents a significant bottleneck (as quantified by the analysis in [HTC12]) and the corresponding CSIT feedback yields an unacceptably high overhead for the uplink.

Alternatives that do not require CSIT [GWJ10] or require only outdated CSIT [MAT10] (without requiring a strict one-slot prediction constraint) have been proposed. Although these schemes may achieve better multiplexing gain than the basic training and feedback scheme in certain conditions (see for example the comparison in [KC12]), they do not scale well with the number of BS antennas and UTs, since they require a precoding block length (in time slots) that grows very rapidly with the number of system antennas (see footnote 2). Hence, these schemes are not suited for “large” MIMO systems with many BS antennas serving many UTs.

In contrast, Time Division Duplexing (TDD) systems can exploit channel reciprocity for estimating the downlink channels from uplink training. In this case, the system multiplexing gain is still upper bounded by $M^\star (1 - M^\star/T)$, but training in the same coherence block is possible (hence, no extra degradation due to prediction) and the training dimension is determined by the total number of UT antennas, while the number of BS antennas can be made as large as desired.
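To make the pre-log expressions of this section concrete, the short Python sketch below evaluates $M^\star (1 - M^\star/T)$ with $M^\star = \min\{M, K, T/2\}$ and the prediction penalty $\max\{1 - 2 B_d T_s, 0\}$ with $B_d = v f_0 / c$. The function names and the numerical values (block length, speed, carrier frequency, slot duration) are illustrative assumptions, not parameters taken from the dissertation.

```python
def best_prelog(m, k, t):
    """Pre-log factor M*(1 - M*/T) with M* = min(M, K, T/2)."""
    m_star = min(m, k, t / 2)
    return m_star * (1 - m_star / t)

def doppler_factor(speed_mps, carrier_hz, slot_s, light_speed=3.0e8):
    """Penalty max(1 - 2 B_d T_s, 0), where B_d = v f_0 / c is the Doppler bandwidth."""
    b_d = speed_mps * carrier_hz / light_speed
    return max(1 - 2 * b_d * slot_s, 0.0)

T = 40  # assumed coherence block length in channel uses
for m in (4, 8, 20, 100):
    # With K = M, the pre-log saturates once M reaches T/2.
    print(f"M=K={m:3d}  pre-log={best_prelog(m, m, T):.2f}")

# Assumed example: 30 m/s UT speed, 2 GHz carrier, 1 ms slot.
penalty = doppler_factor(30.0, 2.0e9, 1.0e-3)
print(f"prediction penalty={penalty:.2f}  "
      f"FDD pre-log with delayed feedback={best_prelog(8, 8, T) * penalty:.2f}")
```

With $T = 40$, taking $M = K = 100$ gives $M^\star = 20$ and a pre-log of 10, no better than $M = K = 20$: when training scales with the number of BS antennas, adding antennas beyond $T/2$ does not increase the multiplexing gain.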
Therefore, Large Scale Antenna Systems using $M \gg K$ antennas at the BS with TDD, as proposed in [Mar10] (see the more refined performance analysis and system optimization in [HCPR12,CWD09]), are very attractive both in terms of achieved throughput and in terms of simplified downlink scheduling and signal processing at the BS. A recent practical testbed implementation of a 64-antenna massive MIMO system, achieving transmitter clock stability and self-calibration in order to effectively exploit TDD reciprocity, has been demonstrated in [CSAL+12].

Footnote 2: For example, [MAT10] requires precoding over $M! \sum_{j=1}^{M} \frac{1}{j}$ time slots in order to serve $M$ UTs with $M$ BS antennas.

1.2 Joint Spatial Division and Multiplexing

In this work, we consider a Joint Spatial Division and Multiplexing (JSDM) approach to potentially achieve massive MIMO-like throughput gains and simplified system operations also for FDD systems, which still represent the vast majority of currently deployed cellular networks. We observe that, for a typical cellular configuration, the channel from the $M$ BS antennas to any UT antenna is a correlated random vector with a covariance matrix that depends on the scattering geometry. Assuming a macro-cellular tower-mounted BS with no significant local scattering, the propagation between the BS antennas and any given UT antenna is characterized by the local scattering around the UT, resulting in the well-known one-ring model [SFGK00]. The main idea of JSDM consists of partitioning the user population into groups with approximately the same channel covariance eigenspace, and splitting the downlink beamforming into two stages: a pre-beamforming matrix that depends only on the channel covariances, and a MU-MIMO precoding matrix for the “effective” channel, inclusive of pre-beamforming.
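The two-stage structure can be illustrated with a toy example. The Python sketch below is not the dissertation's algorithm: it assumes two groups whose covariance eigenspaces are spanned by disjoint columns of a unitary DFT matrix (so they are exactly orthogonal), two users per group, and zero forcing as the second-stage precoder. All names and sizes are hypothetical. The pre-beamformer for each group is built from the eigenspace alone, and the second stage only sees the 2-dimensional effective channel instead of the full 16-dimensional one.

```python
import cmath
import random

M, R, G = 16, 2, 2  # BS antennas, eigenspace rank per group, number of groups (toy values)

def dft_column(k):
    """k-th column of the M-point unitary DFT matrix."""
    return [cmath.exp(-2j * cmath.pi * k * n / M) / M ** 0.5 for n in range(M)]

def herm_inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def combine(cols, weights):
    """Linear combination sum_i weights[i] * cols[i]."""
    return [sum(w * c[n] for w, c in zip(weights, cols)) for n in range(M)]

def inv2(a):
    """Inverse of a 2x2 complex matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

rng = random.Random(1)

def draw():
    """Arbitrary complex Gaussian weight."""
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))

# Model assumption: group g has a covariance eigenspace spanned by the columns U[g].
U = [[dft_column(1), dft_column(2)], [dft_column(8), dft_column(9)]]
# Two users per group; each channel vector lies in its group's eigenspace.
H = [[combine(U[g], [draw(), draw()]) for _ in range(R)] for g in range(G)]

# Stage 1: the pre-beamformer B_g depends only on the covariance eigenspace.
B = U

# Stage 2: per-group zero forcing on the R x R effective channel.
V = []  # columns of the overall beamformer V = B P, one per user
for g in range(G):
    # A[k][i] = h_{g,k}^H b_{g,i}: effective channel seen after pre-beamforming.
    A = [[herm_inner(h, b) for b in B[g]] for h in H[g]]
    P = inv2(A)  # chosen so that A P = I within the group
    for k in range(R):
        V.append(combine(B[g], [P[0][k], P[1][k]]))

# End-to-end gains h_j^H v_k: ideally the identity (no intra- or inter-group interference).
users = [h for group in H for h in group]
gains = [[herm_inner(h, v) for v in V] for h in users]
for row in gains:
    print(" ".join(f"{abs(x):.3f}" for x in row))
```

In this idealized setting each user only needs to report the R = 2 coefficients of its effective channel rather than all M = 16, and inter-group interference vanishes because the eigenspaces are orthogonal by construction.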
The pre-beamforming matrix is chosen in order to minimize the inter-group interference for any instantaneous channel realization, by exploiting the linear independence of the dominant eigenmodes of the channel covariance matrices of the different groups. Pre-beamforming can be considered as a generalization of sectorization, widely used in current cellular technology (see footnote 3).

Footnote 3: An approach that exploits the same directional structure of the channel covariance matrix used by JSDM, in order to eliminate pilot contamination in a multi-cell massive MIMO setting, was proposed concurrently and independently in [YGFL13]. The idea of such two-stage “hybrid” beamforming has been considered by various authors (see [ARAS+13] and references therein) and is motivated by the fact that, while implementing a large number of antennas in a reasonable form factor may be feasible, having a correspondingly large number of RF chains may be too expensive or power consuming. In [LL13], the pre-beamforming idea has been used to cope with inter-cell interference in a multi-cell scenario.

The MU-MIMO precoding stage requires estimation and feedback of the instantaneous (effective) channel realization. As we shall see, this may have significantly reduced dimension with respect to the full $K \times M$ channel matrix. Therefore, both downlink training and uplink feedback overheads are greatly reduced, making this scheme attractive for FDD systems. The pre-beamforming stage requires only channel covariance information, which can be tracked with small protocol overhead (see footnote 4).

1.3 Organization

The remainder of this dissertation is organized as follows.
In Chapter 2, we outline the basic idea of Joint Spatial Division and Multiplexing (JSDM) and show that it achieves significant savings both in the downlink training and in the CSIT uplink feedback, thus making the use of large antenna arrays at the base station potentially suitable also for Frequency Division Duplexing (FDD) systems, for which uplink/downlink channel reciprocity cannot be exploited. We prove a simple condition under which JSDM incurs no loss of optimality with respect to the full CSIT case. For uniformly spaced linear arrays, we show that this condition is approached in the limit of a large number of antennas. For this case, we use Szegő’s asymptotic theory of Toeplitz matrices to show that a DFT-based pre-beamforming matrix is near-optimal, requiring only coarse information about the users’ angles of arrival and angular spreads. Finally, we extend these ideas to the case of a two-dimensional base station antenna array, with 3-dimensional beamforming, including multiple beams in the elevation angle direction. We provide guidelines for the pre-beamforming optimization and calculate the system spectral efficiency under proportional fairness and max-min fairness criteria, showing extremely attractive performance. Our numerical results are obtained via asymptotic random matrix theory, avoiding lengthy Monte Carlo simulations and providing accurate results for a realistic (finite) number of antennas and users.

Footnote 4: In practice, the channel covariance changes over time at a much slower time scale than the system slot rate; therefore, we assume that it is locally stationary and can be estimated and tracked using some standard subspace tracking technique [VLM12], [MTS11], [HM01], [Mes08] (see also Remark 2).

In Chapter 3, we extend our work on JSDM and address some important practical issues.
First, we focus on the regime of a finite number of antennas and a large number of users and show that JSDM with simple opportunistic user selection is able to achieve the same scaling law of the system capacity as with full channel state information. Next, we consider the large-system regime (both antennas and users growing large) and propose a simple scheme for user grouping in a realistic setting where users have different angles of arrival and angular spreads. Finally, we propose a low-overhead probabilistic scheduling algorithm that selects the users at random with probabilities derived from large-system random matrix analysis. Since only the pre-selected users are required to feed back their channel state information, the proposed scheme realizes important savings in the CSIT feedback.

In Chapter 4, we focus on the applicability of JSDM to mm-Wave channels. Massive MIMO systems are well-suited for mm-Wave communications, as large arrays can be built with reasonable form factors, and the high array gains enable reasonable coverage even for outdoor communications. In this chapter, we analyze the performance of JSDM in some realistic propagation channels that take into account the partial overlap of the angular spectra from different users, as well as the sparsity of mm-Wave channels. We formulate the problem of user grouping for two different objectives, namely maximizing spatial multiplexing and maximizing total received power, in a graph-theoretic framework. As the resulting problems are numerically difficult, we propose (suboptimum) greedy algorithms as efficient solution methods. Numerical examples show that the different algorithms may be superior in different settings. We furthermore develop a new, “degenerate” version of JSDM that only requires average CSI at the transmitter, and thus greatly reduces the computational burden.
Evaluations in propagation channels obtained from ray tracing, as well as in measured outdoor channels, show that this low-complexity version performs surprisingly well in mm-Wave channels.

In Chapter 5, we study the performance of JSDM in a two-tier system where a large number of small cells are deployed under a macro-cellular “umbrella”. The macro-cellular tier provides coverage and handles mobile users, while the small cell tier provides high rate locally to nomadic users. While the standard approach consists of operating the two tiers in different frequency bands, for various reasons (e.g., lack of licensed spectrum) it may be useful to operate both tiers in the whole available spectrum. Hence, we consider schemes for inter-tier interference coordination that do not assume any explicit data or channel state information sharing between tiers. In particular, we consider co-channel TDD and reverse TDD schemes, when the macro (tier-1) base station has a very large number of antennas and the tier-2 base stations have a moderately large number of antennas. We show that by exploiting the spatial directionality of the channel vectors, very efficient inter-tier interference management can be obtained with relatively low complexity. Our approach consists of using JSDM to do a sort of “spatial blanking” of certain angles of departure of the tier-1 base station at given scheduled time-frequency slots, in order to create transmission opportunities for the corresponding tier-2 small cells. In particular, such “spatial blanking” is significantly more efficient than the isotropic slot blanking (enhanced Inter-Cell Interference Coordination, eICIC) currently proposed in LTE standardization.

In Chapter 6, we conclude the dissertation with a brief summary of future directions for this work.

Notation: We use boldface capital letters ($\mathbf{X}$) for matrices, boldface small letters ($\mathbf{x}$) for vectors, small letters ($x$) for scalars and calligraphic letters ($\mathcal{X}$) for sets.
The union, intersection and difference between two sets $\mathcal{X}$ and $\mathcal{Y}$ are respectively denoted by $\mathcal{X} \cup \mathcal{Y}$, $\mathcal{X} \cap \mathcal{Y}$ and $\mathcal{X} \setminus \mathcal{Y}$. The Lebesgue measure of a Borel set $\mathcal{X}$ is indicated by $|\mathcal{X}|$. If $\mathcal{N}$ is a discrete set, $|\mathcal{N}|$ indicates its cardinality. The indicator function of a set $\mathcal{B}$ is denoted by $1\{\mathcal{B}\}$. $\mathbf{X}^T$ and $\mathbf{X}^H$ denote the transpose and the Hermitian transpose of $\mathbf{X}$, $\|\mathbf{x}\|$ denotes the vector 2-norm of $\mathbf{x}$, and $\mathrm{tr}(\mathbf{X})$ and $|\mathbf{X}|$ denote the trace and the determinant of the square matrix $\mathbf{X}$. The identity matrix is denoted by $\mathbf{I}$ (when the dimension is clear from the context) or by $\mathbf{I}_n$ (when pointing out its dimension $n \times n$ improves clarity of exposition). $\mathbf{X} \otimes \mathbf{Y}$ denotes the Kronecker product of two matrices $\mathbf{X}$, $\mathbf{Y}$. $\|\mathbf{X}\|_F^2 = \mathrm{tr}(\mathbf{X}^H \mathbf{X})$ indicates the squared Frobenius norm of a matrix $\mathbf{X}$. We also use $\mathrm{Span}(\mathbf{X})$ to denote the linear subspace generated by the columns of $\mathbf{X}$ and $\mathrm{Span}^\perp(\mathbf{X})$ for the orthogonal complement of $\mathrm{Span}(\mathbf{X})$. $\mathbf{x} \sim \mathcal{CN}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ indicates that $\mathbf{x}$ is a complex circularly-symmetric Gaussian vector with mean $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$.

Chapter 2
Joint Spatial Division and Multiplexing – The Large-Scale Array Regime

This chapter focuses on the concept of JSDM, its performance analysis and system design guidelines, i.e., how to choose the parameters of JSDM for a given set of user groups that we wish to serve simultaneously, on the same time-frequency slot. We show that, under some conditions on the eigenvectors of the channel covariance matrices, JSDM incurs no loss of optimality with respect to the full CSIT case. When these conditions cannot be met, we examine the design of the pre-beamforming matrix and the performance of regularized zero forcing (linear) MU-MIMO precoding for the resulting effective channel.
Then, we specialize our system design to the case of Uniform Linear Arrays (ULAs) and use Szegő's asymptotic results on Toeplitz matrices [GS84] to show that the optimality conditions can be met by ULAs when $M$ is large, as long as the user groups have non-overlapping supports of their Angle of Arrival (AoA) distributions. Using the Toeplitz eigen-subspace approximation result of [GS84], we argue that the pre-beamforming matrix for large ULAs can be obtained by selecting blocks of columns of a unitary Discrete Fourier Transform (DFT) matrix. DFT pre-beamforming achieves very good performance and effective channel dimensionality reduction and requires only a coarse knowledge of the support of the AoA distribution for each user group, without requiring an accurate estimation of the actual channel covariance matrix. We also extend our approach to the case of 2-dimensional ULAs (rectangular antenna arrays) and three-dimensional (3D) beamforming, where we create fixed beams also in the elevation angle direction, in addition to the azimuth angle (planar) direction. The resulting beamforming matrix takes on the appealing form of a Kronecker product. In this way, we can simultaneously serve angularly separated groups of users in different annular regions in a sector, at different distances from the BS. We demonstrate the performance of such a system in a realistic layout assuming a rectangular antenna array mounted on the face of a tall building.

Since we focus on the large system regime, we can leverage asymptotic random matrix theory results and in particular a recently developed analytical tool referred to as “deterministic equivalent approximation” (see [CD11] and references therein), which is able to handle a quite general class of structured random matrices arising in the JSDM context.
Thanks to this analytical tool, all numerical results presented here are obtained in a semi-analytic way, by iteratively solving a provably convergent system of fixed-point equations, without the need for heavy Monte Carlo simulation. For completeness, in Section 2.4 we provide the analysis of the basic JSDM schemes without including channel estimation errors, and in Appendix A.1 the case including downlink estimation and noisy CSIT.

2.1 Channel Model

A channel use of the downlink of a single-cell FDD system with a BS with $M$ antennas serving $K$ single-antenna UTs is expressed by
$$\mathbf{y} = \begin{bmatrix} \mathbf{h}_1^H \\ \vdots \\ \mathbf{h}_K^H \end{bmatrix} \mathbf{x} + \mathbf{z}, \tag{2.1}$$
where $\mathbf{x} \in \mathbb{C}^M$ is the signal vector transmitted by the BS $M$-antenna array, $\mathbf{y} \in \mathbb{C}^K$ is the received signal vector at the $K$ UT antennas, $\mathbf{z} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_K)$ is the corresponding additive white Gaussian noise vector, and $\mathbf{h}_k \in \mathbb{C}^M$ denotes the channel vector for user $k$. The BS makes use of a linear precoding (beamforming) matrix $\mathbf{V} \in \mathbb{C}^{M \times S}$, such that the transmitted signal vector is $\mathbf{x} = \mathbf{V}\mathbf{d}$, where $\mathbf{d} \in \mathbb{C}^S$ denotes the vector containing the data symbols to be sent to the users. In general, $S \leq \min\{M, K\}$ denotes the number of downlink streams transmitted on each channel use, which coincides in this case with the number of simultaneously served UTs on the given time-frequency slot.

Assuming no line-of-sight propagation, we have $\mathbf{h}_k \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_k)$, for some symmetric positive semi-definite channel covariance matrix $\mathbf{R}_k$. Using the Karhunen-Loeve representation, we can write the channel vectors $\{\mathbf{h}_k\}$ in the form
$$\mathbf{h}_k = \mathbf{U}_k \boldsymbol{\Lambda}_k^{1/2} \mathbf{w}_k, \tag{2.2}$$
where $\mathbf{w}_k \in \mathbb{C}^{r \times 1} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$, $\boldsymbol{\Lambda}_k$ is an $r \times r$ diagonal matrix whose elements are the non-zero eigenvalues of $\mathbf{R}_k$, and $\mathbf{U}_k \in \mathbb{C}^{M \times r}$ is the tall unitary matrix of the eigenvectors of $\mathbf{R}_k$ corresponding to the non-zero eigenvalues.
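As an illustration of the Karhunen-Loeve representation (2.2), the following minimal sketch draws one channel realization from a given covariance matrix. The function name `sample_channel`, the eigenvalue threshold `tol`, and the toy rank-2 covariance are assumptions made for this example only; they do not come from the text.

```python
import numpy as np

def sample_channel(R, rng, tol=1e-10):
    """Draw h ~ CN(0, R) as h = U Lambda^{1/2} w, following (2.2)."""
    eigvals, eigvecs = np.linalg.eigh(R)       # R is Hermitian positive semi-definite
    keep = eigvals > tol * eigvals.max()       # keep the (numerically) non-zero eigenvalues
    U, lam = eigvecs[:, keep], eigvals[keep]   # U is M x r (tall unitary), lam has length r
    r = lam.size
    # w ~ CN(0, I_r): real and imaginary parts each have variance 1/2
    w = (rng.standard_normal(r) + 1j * rng.standard_normal(r)) / np.sqrt(2)
    return U @ (np.sqrt(lam) * w), U, lam

# Toy rank-2 covariance for M = 4 antennas (hypothetical values)
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
R = A @ A.conj().T
h, U, lam = sample_channel(R, rng)
```

Only $r$ complex Gaussian coefficients are drawn, so every realization lies in the $r$-dimensional span of $\mathbf{U}_k$; this is the structure that JSDM later exploits.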
Furthermore, assuming that the users are spatially separated by a few wavelengths and that their local scattering is rich enough (close to isotropic), it is reasonable to assume that the channel vectors $\{\mathbf{h}_k\}$ are mutually independent. As far as the transmit antenna correlation is concerned, for analytical simplicity in this paper we consider the one-ring model of Fig. 2.1, where a UT located at azimuth angle $\theta$ and distance $s$ is surrounded by a ring of scatterers of radius $r$ such that the AS is $\Delta \approx \arctan(r/s)$. Assuming a uniform distribution (see footnote 1) of the received power from planar waves impinging on the BS antennas, the correlation between the channel coefficients of antennas $1 \leq m, p \leq M$ is given by (see [SFGK00] and references therein)
$$[\mathbf{R}]_{m,p} = \frac{1}{2\Delta} \int_{-\Delta}^{\Delta} e^{j \mathbf{k}^T(\alpha + \theta)(\mathbf{u}_m - \mathbf{u}_p)} \, d\alpha, \tag{2.3}$$
where $\mathbf{k}(\alpha) = -\frac{2\pi}{\lambda} (\cos(\alpha), \sin(\alpha))^T$ is the wave vector for a planar wave impinging with AoA $\alpha$, $\lambda$ is the carrier wavelength, and $\mathbf{u}_m, \mathbf{u}_p \in \mathbb{R}^2$ are the vectors indicating the position of BS antennas $m, p$ in the two-dimensional coordinate system (see Fig. 2.1).

Footnote 1: The uniform distribution is assumed here only for analytical convenience. It is easy to show that similar performances and asymptotic behaviors are achieved by any AoA distribution (measurable non-negative function integrating to 1) with limited support in $[\theta - \Delta, \theta + \Delta]$.

Figure 2.1: A UT at AoA $\theta$ with a scattering ring of radius $r$ generating a two-sided AS $\Delta$ with respect to the BS at the origin. (Diagram labels: $\theta$, $\Delta$, $s$, $r$, scattering ring, region containing the BS antennas.)

2.2 The idea of JSDM

JSDM exploits the fact that, after appropriate partitioning of the UTs such that users in the same group are nearly co-located and different groups are sufficiently well separated in the AoA domain, the structure of the channel covariance matrices can be leveraged in order to reduce the dimensionality of the effective channels and therefore achieve large multiplexing gains with reduced dimension channel training and CSIT feedback.
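Returning briefly to the one-ring model of Section 2.1, the correlation integral (2.3) is easy to evaluate numerically for any planar array geometry. The sketch below uses a trapezoidal rule over the angular support; the function name `one_ring_covariance`, the grid size, and the half-wavelength ULA placed along the y-axis are illustrative assumptions, with antenna positions expressed in units of the wavelength (so $\lambda = 1$).

```python
import numpy as np

def one_ring_covariance(pos, theta, delta, n_grid=2001):
    """Numerically evaluate (2.3). pos: (M, 2) antenna positions in wavelengths."""
    alpha = np.linspace(-delta, delta, n_grid)
    ang = alpha + theta
    # Wave vector k(alpha + theta) with lambda = 1, one column per grid angle: shape (2, n_grid)
    k = -2 * np.pi * np.stack([np.cos(ang), np.sin(ang)])
    a = np.exp(1j * (pos @ k))                 # a[m, n] = exp(j k_n^T u_m)
    # Trapezoidal weights approximating (1 / (2 delta)) * integral d(alpha); they sum to 1
    w = np.ones(n_grid)
    w[0] = w[-1] = 0.5
    w *= (alpha[1] - alpha[0]) / (2 * delta)
    # [R]_{m,p} = sum_n w_n exp(j k_n^T (u_m - u_p))
    return (a * w) @ a.conj().T

# Hypothetical example: 16-element ULA with half-wavelength spacing along the y-axis
M = 16
pos = np.column_stack([np.zeros(M), 0.5 * np.arange(M)])
R = one_ring_covariance(pos, theta=np.deg2rad(30), delta=np.deg2rad(10))
eigvals = np.linalg.eigvalsh(R)
```

Inspecting `eigvals` for a narrow angular spread shows how many eigenmodes carry non-negligible power, which is the quantity that determines how far the effective channel dimension can be reduced.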
Suppose that $K$ UTs are selected to form $G$ groups based on the similarity of their channel covariance matrices. We let $K_g$ denote the number of UTs in group $g$, such that $K = \sum_{g=1}^{G} K_g$, and define the index $g_k = \sum_{g'=1}^{g-1} K_{g'} + k$, for $k = 1, \ldots, K_g$, to denote UT $k$ in group $g$. Similarly, we let $S_g$ denote the number of independent data streams sent to users in group $g$, such that $S = \sum_{g=1}^{G} S_g$. We assume for simplicity that all UTs in the same group $g$ have identical covariance matrix $\mathbf{R}_g = \mathbf{U}_g \boldsymbol{\Lambda}_g \mathbf{U}_g^H$, with rank $r_g$ and $r_g^\star \leq r_g$ dominant eigenvalues. In practice, this condition is not verified exactly, but we can select groups such that this condition is closely approximated. Also, the notion of “dominant eigenvalues” is intentionally left fuzzy, since $r_g^\star$ is a design parameter that depends on how much signal power outside the subspace spanned by the corresponding eigenvectors can be tolerated. For future reference, we denote by $\mathbf{U}_g^\star$ the $M \times r_g^\star$ matrix collecting the dominant eigenvectors, and let $\mathbf{U}_g = [\mathbf{U}_g^\star, \mathbf{U}_g']$, with $\mathbf{U}_g'$ of dimension $M \times (r_g - r_g^\star)$, containing the eigenvectors corresponding to the weakest eigenvalues. Notice that, by construction, we have that $0 \leq S_g \leq \min\{K_g, r_g^\star\}$, since we cannot deliver more independent symbol streams than the multiplexing gain $\min\{K_g, r_g^\star\}$ of each group $g$.

The channel vector of user $g_k$ is given by $\mathbf{h}_{g_k} = \mathbf{U}_g \boldsymbol{\Lambda}_g^{1/2} \mathbf{w}_{g_k}$. We let $\mathbf{H}_g = [\mathbf{h}_{g_1}, \cdots, \mathbf{h}_{g_{K_g}}]$ and $\mathbf{H} = [\mathbf{H}_1, \cdots, \mathbf{H}_G]$ denote the group $g$ channel matrix and the overall system channel matrix, respectively. As anticipated in Section 1.2, JSDM is based on two-stage precoding. Namely, we let $\mathbf{V} = \mathbf{B}\mathbf{P}$, where $\mathbf{B} \in \mathbb{C}^{M \times b}$ is a pre-beamforming matrix, $\mathbf{P} \in \mathbb{C}^{b \times S}$ is a MU-MIMO precoding matrix, and where $b \geq S$ is an integer design parameter, to be optimized.
The pre-beamforming matrix $\mathbf{B}$ is a function of the channel second-order statistics, i.e., it depends on the set $\{\mathbf{U}_g, \boldsymbol{\Lambda}_g\}$, or on some directional information extracted from the channel covariance matrices (AoA and AS of the different groups). In any case, $\mathbf{B}$ is independent of the instantaneous realization of the channel matrix $\mathbf{H}$. The MU-MIMO precoding matrix $\mathbf{P}$ is allowed to depend on the instantaneous realization of the reduced dimensional effective channel $\mathsf{H} \triangleq \mathbf{B}^H \mathbf{H}$. We let $b = \sum_{g=1}^{G} b_g$ such that $b_g \geq S_g$, and let $\mathbf{B}_g$ be the $M \times b_g$ pre-beamforming matrix of group $g$. The received signal (2.1) can be rewritten as
$$\mathbf{y} = \mathsf{H}^H \mathbf{P} \mathbf{d} + \mathbf{z}, \tag{2.4}$$
where
$$\mathsf{H}^H = \begin{bmatrix} \mathbf{H}_1^H \mathbf{B}_1 & \mathbf{H}_1^H \mathbf{B}_2 & \cdots & \mathbf{H}_1^H \mathbf{B}_G \\ \mathbf{H}_2^H \mathbf{B}_1 & \mathbf{H}_2^H \mathbf{B}_2 & \cdots & \mathbf{H}_2^H \mathbf{B}_G \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{H}_G^H \mathbf{B}_1 & \mathbf{H}_G^H \mathbf{B}_2 & \cdots & \mathbf{H}_G^H \mathbf{B}_G \end{bmatrix},$$
and where $\mathbf{H}_g^H \mathbf{B}_{g'}$ is the $K_g \times b_{g'}$ effective channel matrix connecting the users of group $g$ with the effective channel inputs of group $g'$.

If the estimation and feedback of the whole effective channel $\mathsf{H}$ can be afforded, the precoding matrix $\mathbf{P}$ is determined as a function of $\mathsf{H}$. We refer to this approach as Joint Group Processing (JGP). However, this may still be too costly in terms of transmission resource. In this case, a more attractive approach consists of estimating and feeding back only the $G$ diagonal blocks $\mathsf{H}_g = \mathbf{B}_g^H \mathbf{H}_g$, of dimension $b_g \times K_g$, and treating each group separately. We refer to this approach as Per-Group Processing (PGP). In this case, the precoding matrix takes on the block-diagonal form $\mathbf{P} = \mathrm{diag}(\mathbf{P}_1, \cdots, \mathbf{P}_G)$, where $\mathbf{P}_g \in \mathbb{C}^{b_g \times S_g}$, resulting in the vector broadcast plus interference Gaussian channel
$$\mathbf{y}_g = \mathsf{H}_g^H \mathbf{P}_g \mathbf{d}_g + \sum_{g' \neq g} \mathbf{H}_g^H \mathbf{B}_{g'} \mathbf{P}_{g'} \mathbf{d}_{g'} + \mathbf{z}_g, \quad \text{for } g = 1, \ldots, G. \tag{2.5}$$
With PGP, it is interesting to choose the groups and design the pre-beamforming matrix such that, with high probability,
$$\mathbf{H}_g^H \mathbf{B}_{g'} \approx \mathbf{0}, \quad \text{for all } g' \neq g. \tag{2.6}$$
Exact Block Diagonalization (BD) is possible if $\mathrm{Span}(\mathbf{U}_g) \not\subseteq \mathrm{Span}(\{\mathbf{U}_{g'} : g' \neq g\})$ for all $g = 1, \ldots, G$. In particular, multiplexing gain $S_g$ (i.e., the number of interference-free data streams) can be achieved for group $g$ if and only if
$$\dim\left( \mathrm{Span}(\mathbf{U}_g) \cap \mathrm{Span}^\perp(\{\mathbf{U}_{g'} : g' \neq g\}) \right) \geq S_g. \tag{2.7}$$
Approximate BD can be achieved by selecting $r_g^\star$ dominant eigenmodes for each group $g$, such that $\mathrm{Span}(\mathbf{U}_g^\star) \not\subseteq \mathrm{Span}(\{\mathbf{U}_{g'}^\star : g' \neq g\})$ for all $g = 1, \ldots, G$. In this case, in order to deliver $S_g$ streams to group $g$ we require
$$\dim\left( \mathrm{Span}(\mathbf{U}_g^\star) \cap \mathrm{Span}^\perp(\{\mathbf{U}_{g'}^\star : g' \neq g\}) \right) \geq S_g. \tag{2.8}$$
However, these streams will be affected by some residual interference due to the weak eigenmodes not included in the matrices $\{\mathbf{U}_g^\star : g = 1, \ldots, G\}$.

Remark 1 Notice that the PGP pre-beamforming creates virtual sectors, i.e., a generalization of spatial sectorization commonly used in current cellular technology. Each group corresponds to a virtual sector, and it is independently precoded under a total sum power constraint, possibly incurring some residual inter-group interference in the case of approximate BD. ♦

Remark 2 It is reasonable to assume that the channel covariance matrix $\mathbf{R}_g$ for each user group changes slowly with respect to the coherence time of the instantaneous channel matrix $\mathbf{H}_g$. The dominant eigenmodes $\mathbf{U}_g^\star$ can be tracked by using a suitable subspace estimation and tracking algorithm [ESS94], exploiting the downlink training phase. Furthermore, the BS antenna array can be designed in order to allow the estimation of the downlink channel covariance matrix from uplink training symbols, even though in FDD the downlink and the uplink take place at different carrier frequencies (see for example [HM01]).
The estimation and tracking of the (slowly time-varying) channel statistics is a topic of great interest in the context of JSDM, and is left for future work. Here, we assume that the channel covariance matrix for each user group is known. ♦

2.3 JSDM with Eigen-Beamforming

2.3.1 Achieving capacity with reduced CSIT

Let $r = \sum_{g=1}^{G} r_g$ and suppose that the channel covariances of the $G$ groups are such that $\mathbf{U} = [\mathbf{U}_1, \cdots, \mathbf{U}_G]$ is $M \times r$ tall unitary (i.e., $r \leq M$ and $\mathbf{U}^H \mathbf{U} = \mathbf{I}_r$). In order to obtain exact BD it is sufficient to let $b_g = r_g$ and $\mathbf{B}_g = \mathbf{U}_g$. This choice for the pre-beamforming matrix is referred to in the following as eigen-beamforming. In this case, the decoupled MU-MIMO channel (2.5) takes on the form
$$\mathbf{y}_g = \mathsf{H}_g^H \mathbf{P}_g \mathbf{d}_g + \mathbf{z}_g = \mathbf{W}_g^H \boldsymbol{\Lambda}_g^{1/2} \mathbf{P}_g \mathbf{d}_g + \mathbf{z}_g, \quad \text{for } g = 1, \ldots, G, \tag{2.9}$$
where $\mathsf{H}_g = \mathbf{B}_g^H \mathbf{H}_g$ is the effective channel of group $g$ and $\mathbf{W}_g$ is an $r_g \times K_g$ i.i.d. matrix with elements $\sim \mathcal{CN}(0, 1)$. In this case we have [NAAC12]:

Theorem 1 For $\mathbf{U}$ tall unitary, JSDM with PGP achieves the same sum capacity of the corresponding MU-MIMO downlink channel (2.1) with full CSIT.

Proof Let $C^{\rm sum}(\mathbf{H}; P)$ denote the sum capacity of (2.1) with sum power constraint $P$ and fixed channel matrix $\mathbf{H}$, perfectly known to transmitter and receivers. By the MAC-BC duality [VJG03], we have
$$C^{\rm sum}(\mathbf{H}; P) = \max_{\mathbf{S}_g \succeq 0 : \sum_g \mathrm{tr}(\mathbf{S}_g) \leq P} \log \left| \mathbf{I}_M + \sum_{g=1}^{G} \mathbf{U}_g \boldsymbol{\Lambda}_g^{1/2} \mathbf{W}_g \mathbf{S}_g \mathbf{W}_g^H \boldsymbol{\Lambda}_g^{1/2} \mathbf{U}_g^H \right|, \tag{2.10}$$
where $\mathbf{S}_g$ denotes the diagonal $K_g \times K_g$ input covariance matrix for group $g$ in the dual MAC channel. For any fixed set $\{\mathbf{S}_g\}$ of feasible input covariance matrices, define for notation simplicity $\mathbf{A}_g = \boldsymbol{\Lambda}_g^{1/2} \mathbf{W}_g \mathbf{S}_g \mathbf{W}_g^H \boldsymbol{\Lambda}_g^{1/2}$. Notice that $\mathbf{A}_g$ has dimension $r_g \times r_g$ and is invertible with probability 1 over the random channel realization. The theorem is proved by showing the determinant identity
$$\left| \mathbf{I}_M + \sum_{g=1}^{G} \mathbf{U}_g \mathbf{A}_g \mathbf{U}_g^H \right| = \prod_{g=1}^{G} \left| \mathbf{I}_M + \mathbf{U}_g \mathbf{A}_g \mathbf{U}_g^H \right|. \tag{2.11}$$
This can be proved by induction, noticing the following step: for any $1 \leq g' \leq G$,
$$\left| \mathbf{I}_M + \sum_{g=g'}^{G} \mathbf{U}_g \mathbf{A}_g \mathbf{U}_g^H \right| = \left| \mathbf{I}_M + \mathbf{U}_{g'} \mathbf{A}_{g'} \mathbf{U}_{g'}^H \right| \cdot \left| \mathbf{I}_M + \left( \mathbf{I}_M + \mathbf{U}_{g'} \mathbf{A}_{g'} \mathbf{U}_{g'}^H \right)^{-1} \sum_{g=g'+1}^{G} \mathbf{U}_g \mathbf{A}_g \mathbf{U}_g^H \right|$$
$$= \left| \mathbf{I}_M + \mathbf{U}_{g'} \mathbf{A}_{g'} \mathbf{U}_{g'}^H \right| \cdot \left| \mathbf{I}_M + \left( \mathbf{I}_M - \mathbf{U}_{g'} \left( \mathbf{A}_{g'}^{-1} + \mathbf{I}_{r_{g'}} \right)^{-1} \mathbf{U}_{g'}^H \right) \sum_{g=g'+1}^{G} \mathbf{U}_g \mathbf{A}_g \mathbf{U}_g^H \right| \tag{2.12}$$
$$= \left| \mathbf{I}_M + \mathbf{U}_{g'} \mathbf{A}_{g'} \mathbf{U}_{g'}^H \right| \cdot \left| \mathbf{I}_M + \sum_{g=g'+1}^{G} \mathbf{U}_g \mathbf{A}_g \mathbf{U}_g^H \right|, \tag{2.13}$$
where (2.12) follows from the matrix inversion lemma and (2.13) follows from the fact that, by assumption, $\mathbf{U}_{g'}^H \mathbf{U}_g = \mathbf{0}$ for all $g' \neq g$. Using (2.11) in (2.10) we obtain
$$C^{\rm sum}(\mathbf{H}; P) = \max_{\mathbf{S}_g \succeq 0 : \sum_g \mathrm{tr}(\mathbf{S}_g) \leq P} \sum_{g=1}^{G} \log \left| \mathbf{I}_{r_g} + \boldsymbol{\Lambda}_g^{1/2} \mathbf{W}_g \mathbf{S}_g \mathbf{W}_g^H \boldsymbol{\Lambda}_g^{1/2} \right|, \tag{2.14}$$
which is immediately recognized to be the capacity of the dual MAC (with sum power constraint) for the set of decoupled MU-MIMO downlink channels (2.9).

Remark 3 In a similar manner it is possible to show that under the orthogonality condition of Theorem 1 JSDM achieves the whole capacity region [WSS06], and not only the sum capacity. In order to see this, for any user subset $\mathcal{K} \subseteq \{1, \ldots, K\}$ define $\mathbf{H}_g(\mathcal{K})$ as the submatrix of $\mathbf{H}_g$ obtained by selecting the columns $g_k \in \mathcal{K}$, and let $\mathbf{S}_g(\mathcal{K})$ denote the submatrix of $\mathbf{S}_g$ obtained by retaining the rows and columns corresponding to users $g_k \in \mathcal{K}$. Then, the capacity region of the dual MAC of (2.1) subject to the sum power constraint can be written as
$$\mathcal{C}(\mathbf{H}; P) = \bigcup_{\mathbf{S}_g \succeq 0 : \sum_{g=1}^{G} \mathrm{tr}(\mathbf{S}_g) \leq P} \left\{ \mathbf{r} \in \mathbb{R}_+^K : \sum_{g_k \in \mathcal{K}} r_{g_k} \leq \log \left| \mathbf{I}_M + \sum_{g=1}^{G} \mathbf{H}_g(\mathcal{K}) \mathbf{S}_g(\mathcal{K}) \mathbf{H}_g^H(\mathcal{K}) \right|, \ \forall \mathcal{K} \subseteq \{1, \ldots, K\} \right\}. \tag{2.15}$$
The determinant identity (2.11) can be applied to the partial sum-rate bounds for each user subset $\mathcal{K}$, noticing that the tall unitary condition of the singular vectors is retained by the new system matrix $\mathbf{H}(\mathcal{K}) = [\mathbf{H}_1(\mathcal{K}), \ldots, \mathbf{H}_G(\mathcal{K})]$.
♦

Remark 4 Theorem 1 has an important practical implication: in a situation where a large number of UTs, each of which has its own AoA and AS, must be served by the downlink, a good scheduling strategy consists of the following. First, partition the users into groups with (approximately) identical eigenspaces. Then, partition the collection of groups into disjoint and mutually exclusive sets, such that the groups in each set satisfy the tall unitary condition of Theorem 1, and such that the number of sets is minimal, over all possible partitions. Finally, schedule the groups in each set to be served simultaneously, on the same time-frequency slot, using JSDM, and use time-frequency sharing across the sets. Notice that this does not mean that, in general, JSDM is optimal. In fact, in order to meet the tall unitary condition we may be obliged to reduce the number $G$ of simultaneously served groups in each set. As already noticed for the problem of clustering users into groups, the problem of finding optimal partitions of the user groups under JSDM with PGP is also far from trivial, and is partially addressed in Chapter 3. ♦

When achieving the tall unitary condition is too restrictive in terms of multiplexing gain, the pre-beamforming matrix $\mathbf{B}$ can be chosen as a function of the whole $\mathbf{U}$ in order to achieve exact or approximated BD. This approach is presented in the next section.

2.3.2 Block diagonalization

Recall that $\mathbf{B} = [\mathbf{B}_1, \ldots, \mathbf{B}_G]$ is an $M \times b$ matrix consisting of $G$ blocks of dimension $M \times b_g$, each corresponding to a particular group $g$. For given target numbers of streams per group $\{S_g\}$ and dimensions $\{b_g\}$ satisfying $S_g \leq b_g \leq r_g$, our goal is to design the blocks $\mathbf{B}_g$ such that BD is achieved, i.e., $\mathbf{U}_{g'}^H \mathbf{B}_g = \mathbf{0}$ for all $g' \neq g$ and $\mathrm{rank}(\mathbf{U}_g^H \mathbf{B}_g) \geq S_g$. A necessary condition for exact zero-forcing of the off-diagonal blocks is $\mathrm{Span}(\mathbf{B}_g) \subseteq \mathrm{Span}^\perp(\{\mathbf{U}_{g'} : g' \neq g\})$.
When $\mathrm{Span}^\perp(\{\mathbf{U}_{g'} : g' \neq g\})$ has dimension smaller than $S_g$, the rank condition on the diagonal blocks cannot be satisfied. In this case, $S_g$ should be reduced or, as an alternative, approximated BD based on selecting $r_g^\star < r_g$ dominant eigenmodes for each group $g$ can be implemented. This consists of replacing $\mathbf{U}_g$ with $\mathbf{U}_g^\star$ in the above conditions. When $\mathrm{Span}(\{\mathbf{U}_{g'} : g' \neq g\})$ has dimension $M$, then exact BD cannot be achieved even for $S_g = 1$, and therefore approximated BD should be considered in any case.

Without loss of generality, we formulate the design of $\{\mathbf{B}_g\}$ for approximated BD with some feasible choice of the parameters $\{r_g^\star\}$, $\{b_g\}$ and $\{S_g\}$. It should be noticed that these are design parameters that should be optimized for a given system configuration, in order to maximize the overall spectral efficiency. This optimization is far from trivial. For the time being, we consider an arbitrary feasible choice and postpone the discussion on the tradeoff that governs the design of these parameters to Sections 2.4.3 (see Remark 5) and 2.5.1 (see Remark 8). Following the approach of [SSH04], we define
$$\boldsymbol{\Xi}_g = \left[ \mathbf{U}_1^\star, \ldots, \mathbf{U}_{g-1}^\star, \mathbf{U}_{g+1}^\star, \ldots, \mathbf{U}_G^\star \right], \tag{2.16}$$
of dimensions $M \times \sum_{g' \neq g} r_{g'}^\star$ and rank $\sum_{g' \neq g} r_{g'}^\star$, and let $[\mathbf{E}_g^{(1)}, \mathbf{E}_g^{(0)}]$ denote a system of left singular vectors of $\boldsymbol{\Xi}_g$ (e.g., obtained by Singular Value Decomposition (SVD)), such that $\mathbf{E}_g^{(0)}$ is $M \times \left( M - \sum_{g' \neq g} r_{g'}^\star \right)$ and forms a unitary basis for the orthogonal complement of $\mathrm{Span}(\boldsymbol{\Xi}_g)$, i.e., such that $\mathrm{Span}(\mathbf{E}_g^{(0)}) = \mathrm{Span}^\perp(\{\mathbf{U}_{g'}^\star : g' \neq g\})$.

We obtain $\mathbf{B}_g$ by concatenating the projection onto $\mathrm{Span}(\mathbf{E}_g^{(0)})$ with eigen-beamforming along the dominant eigenmodes of the covariance matrix of the resulting projected channels of group $g$, i.e., of the columns of $(\mathbf{E}_g^{(0)})^H \mathbf{H}_g$.
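The two-step construction just described (project onto the orthogonal complement of the other groups' dominant eigenspaces, then eigen-beamform on the projected covariance) can be sketched as follows. The function name `bd_prebeamformer`, its argument layout, and the toy dimensions are assumptions for illustration; the sketch uses an eigendecomposition of the projected covariance, which for a Hermitian positive semi-definite matrix yields the same dominant eigenmodes as its SVD.

```python
import numpy as np

def bd_prebeamformer(U_star, U_g, lam_g, g, b_g):
    """Approximate-BD pre-beamformer for group g.

    U_star : list of M x r*_{g'} matrices of dominant eigenvectors, one per group
    U_g, lam_g : eigenvectors (M x r_g) and eigenvalues (r_g,) of R_g
    """
    # Stack the dominant eigenvectors of all the other groups, as in (2.16)
    Xi = np.hstack([U_star[j] for j in range(len(U_star)) if j != g])
    left, _, _ = np.linalg.svd(Xi, full_matrices=True)
    rank = np.linalg.matrix_rank(Xi)
    E0 = left[:, rank:]                          # unitary basis of Span-perp(Xi)
    # Covariance of the projected channels E0^H h
    R_proj = E0.conj().T @ (U_g * lam_g) @ U_g.conj().T @ E0
    vals, vecs = np.linalg.eigh(R_proj)
    G1 = vecs[:, np.argsort(vals)[::-1][:b_g]]   # b_g dominant eigenmodes
    return E0 @ G1

# Toy example (hypothetical): M = 8, two groups of rank 3, r* = 2 dominant modes each
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8)))
U_list = [Q[:, :3], Q[:, 3:6]]
lam_list = [np.array([3.0, 2.0, 0.1]), np.array([3.0, 2.0, 0.1])]
U_star = [U[:, :2] for U in U_list]
B0 = bd_prebeamformer(U_star, U_list[0], lam_list[0], g=0, b_g=2)
```

By construction the returned block is orthogonal to the dominant eigenmodes of the other groups and has orthonormal columns; the weak eigenmodes left out of `U_star` are what cause residual inter-group interference in less idealized geometries.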
Recalling the Karhunen-Loeve decomposition (2.2), we have that the covariance matrix of $\widehat{\mathbf{h}}_{g_k} = (\mathbf{E}_g^{(0)})^H \mathbf{U}_g \boldsymbol{\Lambda}_g^{1/2} \mathbf{w}_{g_k}$ is given by
$$\widehat{\mathbf{R}}_g = (\mathbf{E}_g^{(0)})^H \mathbf{U}_g \boldsymbol{\Lambda}_g \mathbf{U}_g^H \mathbf{E}_g^{(0)} = \mathbf{G}_g \boldsymbol{\Phi}_g \mathbf{G}_g^H, \tag{2.17}$$
where the expression on the right of (2.17) is the SVD of $\widehat{\mathbf{R}}_g$. Letting $\mathbf{G}_g = [\mathbf{G}_g^{(1)}, \mathbf{G}_g^{(0)}]$, where $\mathbf{G}_g^{(1)}$ contains the dominant $b_g$ eigenmodes of $\widehat{\mathbf{R}}_g$, we eventually obtain
$$\mathbf{B}_g = \mathbf{E}_g^{(0)} \mathbf{G}_g^{(1)}. \tag{2.18}$$
The pre-beamforming matrix $\mathbf{B}_g$ can be interpreted as being orthogonal to the dominant $r_{g'}^\star$ eigenmodes of groups $g' \neq g$, and matched to the $b_g$ dominant eigenmodes of the covariance matrix of the projected channels $(\mathbf{E}_g^{(0)})^H \mathbf{H}_g$ of group $g$. By construction, we have that $b_g$ is less than or equal to the rank of $\widehat{\mathbf{R}}_g$, given by $\min\left\{ r_g, M - \sum_{g' \neq g} r_{g'}^\star \right\}$. In particular, if $r = \sum_g r_g \leq M$, we can choose $b_g = r_g^\star = r_g$ and obtain exact BD.

2.4 Performance analysis

In this section we provide expressions for the performance analysis of JSDM with JGP and PGP and linear precoding, using the techniques of deterministic equivalents [CWD09]. For simplicity of exposition we consider a symmetric scenario with the same number $K_g = K'$ of users per group, the same number $S_g = S'$ of streams per group, and the same dimension $b_g = b'$ of the pre-beamforming matrix per group. However, the analysis can be immediately extended to the general case considered before. This technique can be applied as long as the users to be served in each group are selected independently of their instantaneous channel realization. Hence, we assume that for each group a subset of $S'$ out of the possible $K'$ users is pre-selected and scheduled for transmission over the current downlink time-frequency slot. This simplified scheduling requires only the instantaneous CSIT feedback from the pre-scheduled users (see footnote 2) and it is in line with the massive MIMO concept, where hardware augmentation at the BS allows significant simplification in the system operations.

Footnote 2: This is unlike channel-based opportunistic user selection [VDL02, YG06, ANSH09, SH05], which requires collecting CSIT from many users and then selecting a subset of users with quasi-orthogonal channel vectors.

Under these assumptions, the transformed channel matrix $\mathsf{H} = \mathbf{B}^H \mathbf{H}$ has dimension $b \times S$, with blocks $\mathsf{H}_g$ of dimension $b' \times S'$. Also for the sake of simplicity and in line with massive MIMO system simplification (see for example [Mar10, HCPR12]) we allocate to all users the same fraction of the total transmit power $P$, such that the data vector covariance matrix is given by $\mathbb{E}[\mathbf{d}\mathbf{d}^H] = \frac{P}{S} \mathbf{I}_S$. In the following, we present the deterministic equivalent fixed-point equations for determining the Signal-to-Interference plus Noise Ratio (SINR) at the UT receivers for the case of JSDM with JGP and PGP with linear regularized zero forcing precoding. Along the same lines, Appendix A.1 presents the case of regularized and non-regularized linear zero forcing precoding for PGP in the case of noisy CSIT obtained from downlink training (see Section 2.5). It is well-known that a discrete-time complex additive noise plus interference channel with SINR equal to $\gamma$ has capacity at least as large as $\log(1 + \gamma)$ bit/symbol [MKLSS94]. Hence, in order to obtain an asymptotically convergent approximation of the achievable spectral efficiency (in bit/symbol) per served user, we compute $\gamma$ via the deterministic equivalent method, and plug the result into the $\log(1 + \gamma)$ rate formula.

2.4.1 JSDM with joint group processing

For fixed pre-beamforming matrix $\mathbf{B}$ and JGP, the regularized zero forcing precoding matrix is given by
$$\mathbf{P}_{\rm rzf} = \zeta \mathbf{K} \mathsf{H}, \tag{2.19}$$
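where $\mathbf{K} = \left[ \mathsf{H} \mathsf{H}^H + b \alpha \mathbf{I}_b \right]^{-1}$ (with $\mathsf{H} = \mathbf{B}^H \mathbf{H}$ the effective channel), $\alpha$ is a regularization factor, and $\zeta$ is a normalization factor chosen to satisfy the power constraint and is given by
$$\zeta^2 = \frac{S}{\mathrm{tr}\left( \mathbf{P}_{\rm rzf}^H \mathbf{B}^H \mathbf{B} \mathbf{P}_{\rm rzf} \right)}. \tag{2.20}$$
The covariance matrix of the transformed channel of group $g$ is given by
$$\widetilde{\mathbf{R}}_g = \begin{bmatrix} \mathbf{B}_1^H \mathbf{R}_g \mathbf{B}_1 & \mathbf{B}_1^H \mathbf{R}_g \mathbf{B}_2 & \cdots & \mathbf{B}_1^H \mathbf{R}_g \mathbf{B}_G \\ \mathbf{B}_2^H \mathbf{R}_g \mathbf{B}_1 & \mathbf{B}_2^H \mathbf{R}_g \mathbf{B}_2 & \cdots & \mathbf{B}_2^H \mathbf{R}_g \mathbf{B}_G \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{B}_G^H \mathbf{R}_g \mathbf{B}_1 & \mathbf{B}_G^H \mathbf{R}_g \mathbf{B}_2 & \cdots & \mathbf{B}_G^H \mathbf{R}_g \mathbf{B}_G \end{bmatrix}. \tag{2.21}$$
The SINR for user $g_k$ is given by
$$\gamma_{g_k, {\rm jgp, rzf}} = \frac{\frac{P}{S} \zeta^2 \left| \mathbf{h}_{g_k}^H \mathbf{B} \mathbf{K} \mathbf{B}^H \mathbf{h}_{g_k} \right|^2}{\frac{P}{S} \sum_{j \neq g_k} \zeta^2 \left| \mathbf{h}_{g_k}^H \mathbf{B} \mathbf{K} \mathbf{B}^H \mathbf{h}_j \right|^2 + 1}, \tag{2.22}$$
where the subscript “jgp” stands for joint group processing. Following the approach of [CWD09], assuming that as $M \to \infty$ the other system dimensions $r$, $S$ and $b$ also go to infinity linearly with $M$, we have
$$\gamma_{g_k, {\rm jgp, rzf}} - \gamma_{g_k, {\rm jgp, rzf}}^{o} \xrightarrow{M \to \infty} 0 \quad \text{with probability 1}, \tag{2.23}$$
where, for all users $g_k$, $\gamma_{g_k, {\rm jgp, rzf}}^{o}$ is a deterministic quantity that can be computed for any finite $M$ as
$$\gamma_{g_k, {\rm jgp, rzf}}^{o} = \frac{\frac{P}{S} \zeta^2 (m_g^o)^2}{\zeta^2 \Upsilon_g^o + (1 + m_g^o)^2}, \tag{2.24}$$
where $\zeta^2 = \frac{P}{\Gamma^o}$ and the quantities $m_g^o$, $\Upsilon_g^o$ and $\Gamma^o$ are obtained by solving the system of fixed-point equations
$$m_g^o = \frac{1}{b} \mathrm{tr}\left( \widetilde{\mathbf{R}}_g \mathbf{T} \right) \tag{2.25}$$
$$\mathbf{T} = \left( \frac{S'}{b} \sum_{g=1}^{G} \frac{\widetilde{\mathbf{R}}_g}{1 + m_g^o} + \alpha \mathbf{I}_b \right)^{-1} \tag{2.26}$$
$$\Gamma^o = \frac{1}{b} \frac{P}{G} \sum_{g=1}^{G} \frac{n_g}{(1 + m_g^o)^2} \tag{2.27}$$
$$\Upsilon_g^o = \frac{1}{b} \frac{P}{G} \left( \sum_{g'=1, g' \neq g}^{G} \frac{n_{g',g}}{(1 + m_{g'}^o)^2} + \frac{S' - 1}{S'} \frac{n_{g,g}}{(1 + m_g^o)^2} \right), \tag{2.28}$$
with $\mathbf{n} = [n_1, n_2, \ldots, n_G]^T$ and $\mathbf{n}_g = [n_{1,g}, n_{2,g}, \ldots, n_{G,g}]^T$ defined by
$$\mathbf{n} = (\mathbf{I}_G - \mathbf{J})^{-1} \mathbf{v} \tag{2.29}$$
$$\mathbf{n}_g = (\mathbf{I}_G - \mathbf{J})^{-1} \mathbf{v}_g, \tag{2.30}$$
where $\mathbf{J}$, $\mathbf{v}$ and $\mathbf{v}_g$ are given as
$$[\mathbf{J}]_{g,g'} = \frac{\frac{S'}{b} \mathrm{tr}\left( \widetilde{\mathbf{R}}_g \mathbf{T} \widetilde{\mathbf{R}}_{g'} \mathbf{T} \right)}{b (1 + m_{g'}^o)^2} \tag{2.31}$$
$$\mathbf{v} = \frac{1}{b} \left[ \mathrm{tr}\left( \widetilde{\mathbf{R}}_1 \mathbf{T} \mathbf{B}^H \mathbf{B} \mathbf{T} \right), \ldots, \mathrm{tr}\left( \widetilde{\mathbf{R}}_G \mathbf{T} \mathbf{B}^H \mathbf{B} \mathbf{T} \right) \right]^T \tag{2.32}$$
$$\mathbf{v}_g = \frac{1}{b} \left[ \mathrm{tr}\left( \widetilde{\mathbf{R}}_1 \mathbf{T} \widetilde{\mathbf{R}}_g \mathbf{T} \right), \ldots, \mathrm{tr}\left( \widetilde{\mathbf{R}}_G \mathbf{T} \widetilde{\mathbf{R}}_g \mathbf{T} \right) \right]^T. \tag{2.33}$$

2.4.2 JSDM with per-group processing

The effective channel covariance matrix for a user $g_k$ is given by $\bar{\mathbf{R}}_g = \mathbf{B}_g^H \mathbf{R}_g \mathbf{B}_g$. Focusing only on the users in group $g$, the regularized zero forcing precoding matrix is given by
$$\mathbf{P}_{g, {\rm rzf}} = \bar{\zeta}_g \bar{\mathbf{K}}_g \mathsf{H}_g, \tag{2.34}$$
where $\bar{\mathbf{K}}_g = \left[ \mathsf{H}_g \mathsf{H}_g^H + b' \alpha \mathbf{I}_{b'} \right]^{-1}$ with $\mathsf{H}_g = \mathbf{B}_g^H \mathbf{H}_g$, $\alpha$ is a regularization factor, and $\bar{\zeta}_g$ is the power normalization factor given by
$$\bar{\zeta}_g^2 = \frac{S'}{\mathrm{tr}\left( \mathbf{P}_{g, {\rm rzf}}^H \mathbf{B}_g^H \mathbf{B}_g \mathbf{P}_{g, {\rm rzf}} \right)}. \tag{2.35}$$
When $\mathbf{B}_g$ is given by (2.18), it is the product of two tall unitary matrices, so that $\mathbf{B}_g^H \mathbf{B}_g = \mathbf{I}_{b'}$. However, we use (2.35) for the sake of generality. The SINR of user $g_k$ is given by
$$\gamma_{g_k, {\rm pgp}} = \frac{\frac{P}{S} \bar{\zeta}_g^2 \left| \mathbf{h}_{g_k}^H \mathbf{B}_g \bar{\mathbf{K}}_g \mathbf{B}_g^H \mathbf{h}_{g_k} \right|^2}{\frac{P}{S} \sum_{j \neq k} \bar{\zeta}_g^2 \left| \mathbf{h}_{g_k}^H \mathbf{B}_g \bar{\mathbf{K}}_g \mathbf{B}_g^H \mathbf{h}_{g_j} \right|^2 + \frac{P}{S} \sum_{g' \neq g} \sum_{j} \bar{\zeta}_{g'}^2 \left| \mathbf{h}_{g_k}^H \mathbf{B}_{g'} \bar{\mathbf{K}}_{g'} \mathbf{B}_{g'}^H \mathbf{h}_{g'_j} \right|^2 + 1}, \tag{2.36}$$
where the subscript “pgp” stands for per-group processing.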
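To make (2.34)–(2.36) concrete, the sketch below evaluates the instantaneous per-user SINR of regularized zero forcing within a single group. It assumes exact BD, so the inter-group interference term of (2.36) vanishes; the function name `pgp_rzf_sinr`, the i.i.d. toy effective channel, and all numerical values are illustrative assumptions rather than the setup used in the text.

```python
import numpy as np

def pgp_rzf_sinr(H_eff, BhB, P, S_total, alpha):
    """Per-user SINR of RZF for one group, ignoring inter-group interference.

    H_eff : b' x S' effective channel B_g^H H_g of the scheduled users
    BhB   : b' x b' matrix B_g^H B_g (identity for the BD design)
    """
    b, S_g = H_eff.shape
    K = np.linalg.inv(H_eff @ H_eff.conj().T + b * alpha * np.eye(b))
    P_un = K @ H_eff                                      # unnormalized precoder K H
    # Power normalization in the spirit of (2.35), applied to the unnormalized precoder
    zeta2 = S_g / np.real(np.trace(P_un.conj().T @ BhB @ P_un))
    gains = np.abs(H_eff.conj().T @ P_un) ** 2            # gains[k, j] = |h_k^H K h_j|^2
    signal = np.diag(gains)
    interference = gains.sum(axis=1) - signal             # intra-group terms, j != k
    scale = P / S_total * zeta2
    return scale * signal / (scale * interference + 1.0)

# Toy example (hypothetical): b' = 10, S' = 5, a single group, P = 10
rng = np.random.default_rng(2)
b_p, S_p, P = 10, 5, 10.0
H_eff = (rng.standard_normal((b_p, S_p)) + 1j * rng.standard_normal((b_p, S_p))) / np.sqrt(2)
sinr = pgp_rzf_sinr(H_eff, np.eye(b_p), P, S_total=S_p, alpha=S_p / (b_p * P))
rate = np.log2(1 + sinr).sum()    # sum spectral efficiency via the log(1 + SINR) formula
```

Averaging `rate` over many channel draws gives the Monte Carlo counterpart of a deterministic equivalent for the same single-group setting.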
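Proceeding similarly as before and applying the method developed in [CWD09], and assuming that as $M \to \infty$ the other system dimensions $r$, $S$ and $b$ also go to infinity linearly with $M$, we have
$$\gamma_{g_k, {\rm pgp, rzf}} - \gamma_{g_k, {\rm pgp, rzf}}^{o} \xrightarrow{M \to \infty} 0 \quad \text{with probability 1}, \tag{2.37}$$
where, for all users $g_k$, $\gamma_{g_k, {\rm pgp, rzf}}^{o}$ is a deterministic quantity that can be computed for any finite $M$ as
$$\gamma_{g_k, {\rm pgp, rzf}}^{o} = \frac{\frac{P}{S} \bar{\zeta}_g^2 (\bar{m}_g^o)^2}{\bar{\zeta}_g^2 \bar{\Upsilon}_{g,g}^o + \left( 1 + \sum_{g' \neq g} \bar{\zeta}_{g'}^2 \bar{\Upsilon}_{g,g'}^o \right) (1 + \bar{m}_g^o)^2}, \tag{2.38}$$
where $\bar{\zeta}_g^2 = \frac{P/G}{\bar{\Gamma}_g^o}$ and the quantities $\bar{m}_g^o$, $\bar{\Upsilon}_{g,g}^o$, $\bar{\Upsilon}_{g,g'}^o$ and $\bar{\Gamma}_g^o$ are given by
$$\bar{m}_g^o = \frac{1}{b'} \mathrm{tr}\left( \bar{\mathbf{R}}_g \bar{\mathbf{T}}_g \right) \tag{2.39}$$
$$\bar{\mathbf{T}}_g = \left( \frac{S'}{b'} \frac{\bar{\mathbf{R}}_g}{1 + \bar{m}_g^o} + \alpha \mathbf{I}_{b'} \right)^{-1} \tag{2.40}$$
$$\bar{\Gamma}_g^o = \frac{1}{b'} \frac{P}{G} \frac{\bar{n}_g}{(1 + \bar{m}_g^o)^2} \tag{2.41}$$
$$\bar{\Upsilon}_{g,g}^o = \frac{1}{b'} \frac{S' - 1}{S'} \frac{P}{G} \frac{\bar{n}_{g,g}}{(1 + \bar{m}_g^o)^2} \tag{2.42}$$
$$\bar{\Upsilon}_{g,g'}^o = \frac{1}{b'} \frac{P}{G} \frac{\bar{n}_{g',g}}{(1 + \bar{m}_{g'}^o)^2} \tag{2.43}$$
$$\bar{n}_g = \frac{\frac{1}{b'} \mathrm{tr}\left( \bar{\mathbf{R}}_g \bar{\mathbf{T}}_g \mathbf{B}_g^H \mathbf{B}_g \bar{\mathbf{T}}_g \right)}{1 - \frac{\frac{S'}{b'} \mathrm{tr}\left( \bar{\mathbf{R}}_g \bar{\mathbf{T}}_g \bar{\mathbf{R}}_g \bar{\mathbf{T}}_g \right)}{b' (1 + \bar{m}_g^o)^2}} \tag{2.44}$$
$$\bar{n}_{g,g} = \frac{\frac{1}{b'} \mathrm{tr}\left( \bar{\mathbf{R}}_g \bar{\mathbf{T}}_g \bar{\mathbf{R}}_g \bar{\mathbf{T}}_g \right)}{1 - \frac{\frac{S'}{b'} \mathrm{tr}\left( \bar{\mathbf{R}}_g \bar{\mathbf{T}}_g \bar{\mathbf{R}}_g \bar{\mathbf{T}}_g \right)}{b' (1 + \bar{m}_g^o)^2}} \tag{2.45}$$
$$\bar{n}_{g',g} = \frac{\frac{1}{b'} \mathrm{tr}\left( \bar{\mathbf{R}}_{g'} \bar{\mathbf{T}}_{g'} \mathbf{B}_{g'}^H \mathbf{R}_g \mathbf{B}_{g'} \bar{\mathbf{T}}_{g'} \right)}{1 - \frac{\frac{S'}{b'} \mathrm{tr}\left( \bar{\mathbf{R}}_{g'} \bar{\mathbf{T}}_{g'} \bar{\mathbf{R}}_{g'} \bar{\mathbf{T}}_{g'} \right)}{b' (1 + \bar{m}_{g'}^o)^2}} \tag{2.46}$$

2.4.3 Validation of the asymptotic analysis

In this section we present some numerical examples focusing on the case when the tall unitary condition is not satisfied, and we discuss the choice of the effective rank parameter $r^\star$ in the approximated BD for PGP (more generally, the parameters $\{r_g^\star\}$, for an asymmetric case). We also compare the results obtained via the method of deterministic equivalents with finite-dimensional Monte Carlo simulations, in order to give an idea of the accuracy of the method.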
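The semi-analytic evaluation rests on solving the coupled equations (2.39)–(2.40) of Section 2.4.2 for $\bar{m}_g^o$ and $\bar{\mathbf{T}}_g$. A minimal sketch of one way to do this, by plain fixed-point iteration starting from $\bar{m}_g^o = 0$, is given below; the function name `pgp_fixed_point`, the stopping rule, and the identity-covariance test case are assumptions made for illustration, and the iteration scheme actually used for the reported results is not specified here.

```python
import numpy as np

def pgp_fixed_point(R_bar, S_prime, alpha, tol=1e-12, max_iter=10_000):
    """Iterate (2.39)-(2.40): T = (S'/b' * R_bar / (1 + m) + alpha I)^{-1}, m = tr(R_bar T) / b'."""
    b = R_bar.shape[0]
    m = 0.0
    for _ in range(max_iter):
        T = np.linalg.inv(S_prime / b * R_bar / (1 + m) + alpha * np.eye(b))
        m_new = np.real(np.trace(R_bar @ T)) / b
        if abs(m_new - m) < tol:
            return m_new, T
        m = m_new
    raise RuntimeError("fixed-point iteration did not converge")

# Sanity check on a case with a closed form (hypothetical values):
# for R_bar = I, b' = 10, S' = 5, alpha = 0.1 the equations reduce to
# m = 1 / (0.5 / (1 + m) + 0.1), i.e. m^2 - 4 m - 10 = 0, so m = 2 + sqrt(14).
m, T = pgp_fixed_point(np.eye(10), S_prime=5, alpha=0.1)
```

Once $\bar{m}_g^o$ and $\bar{\mathbf{T}}_g$ are available, the remaining quantities (2.41)–(2.46) are direct trace evaluations and require no further iteration.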
In the following examples, the BS is equipped with a uniform circular array with $M = 100$ isotropic antenna elements equally spaced on a circle of radius $\lambda D$, for $D = \frac{0.5}{\sqrt{(1 - \cos(2\pi/M))^2 + \sin^2(2\pi/M)}}$, resulting in the minimum distance between antenna elements equal to $\frac{\lambda}{2}$. Users form $G = 6$ symmetric groups, with AS $\Delta = 15^\circ$ and azimuth AoA $\theta_g = -\pi + \Delta + (g - 1)\frac{2\pi}{G}$ for $g = 1, \ldots, G$. The user channel correlation is obtained according to (2.3). For the system geometry defined above, the transmit covariance matrix for each group has rank $r = 21$. However, half of the non-zero eigenvalues are extremely small, yielding an effective rank $r^\star = 11$. Somewhat arbitrarily, we chose to serve $S' = 5$ data streams per group, so that the total number of users being served is $S = S' G = 30$, and chose $b' = 10$.

Figure 2.2: Comparison of sum spectral efficiency (bit/s/Hz) vs. SNR (dB) for JSDM with their corresponding deterministic equivalents. “JGP” denotes JSDM with joint group processing and “PGP” denotes JSDM with per-group processing. (Panels: (a) $r^\star = 6$; (b) $r^\star = 11$. Horizontal axis: SNR in dB, from 0 to 30; vertical axis: sum rate, from 0 to 350. Curves: Capacity; ZFBF, JGP; RZFBF, JGP; ZFBF, PGP; RZFBF, PGP.)

Figs. 2.2(a) and 2.2(b) show the performance of the JSDM schemes when the pre-beamforming matrix is designed according to the approximate BD method described in Section 2.3.2, choosing $r^\star = 6$ and $r^\star = 12$, respectively. Given the noise unit variance normalization, we have that $\mathrm{SNR} = P$. The solid “squares” are obtained through simulations and the dotted “x” are obtained using the corresponding deterministic equivalent approximations (precise statements on the order of convergence with respect to $M$ of the actual finite dimensional SINRs to their deterministic equivalents are given in [CWD09]). The regularization parameter is fixed to $\alpha = \frac{S}{bP}$ for both JGP and PGP.
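The geometry of this example is easy to reproduce. The short sketch below builds the uniform circular array (positions in units of the wavelength) and collects the group and precoding parameters quoted above; variable names are illustrative, and the value `P = 10.0` is an arbitrary sample point on the SNR axis rather than a value taken from the text.

```python
import numpy as np

M = 100
phi = 2 * np.pi / M
# Radius (in wavelengths) that makes adjacent elements half a wavelength apart
D = 0.5 / np.sqrt((1 - np.cos(phi)) ** 2 + np.sin(phi) ** 2)
angles = phi * np.arange(M)
pos = D * np.column_stack([np.cos(angles), np.sin(angles)])       # (M, 2) positions
adjacent = np.linalg.norm(pos - np.roll(pos, -1, axis=0), axis=1)  # neighbor distances

G, delta = 6, np.deg2rad(15)
theta = -np.pi + delta + np.arange(G) * 2 * np.pi / G   # azimuth AoA of groups g = 1..G

S_prime, b_prime = 5, 10
S, b = S_prime * G, b_prime * G      # total streams and total pre-beamforming dimension
P = 10.0                             # hypothetical SNR value (linear scale)
alpha = S / (b * P)                  # regularization parameter alpha = S / (b P)
```

The array `pos` can be combined with a numerical evaluation of the one-ring integral (2.3) to obtain the group covariance matrices and inspect their eigenvalue profiles.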
The performance of JSDM with JGP in Figs. 2.2(a) and 2.2(b) is identical, owing to the fact that we use eigen-beamforming with $\mathbf{B}_g = \mathbf{U}_g$, independent of $r^\star$. For the sake of comparison, the sum capacity of the MIMO BC channel with full CSIT (see (2.1)) is also shown (solid “circles” in green), obtained by the iterative waterfilling approach of [Yu06].

Remark 5 By choosing $r^\star$ too small, such that significant eigenmodes are not taken into account by the approximate BD pre-beamforming matrix, the resulting inter-group interference is large and the performance of PGP is severely interference limited (e.g., Fig. 2.2(a)). Instead, by choosing $r^\star$ large enough, in order to include all significant eigenmodes, the performance of PGP does not show a noticeable interference limited behavior over a wide range of SNR. This is the case of Fig. 2.2(b), where we chose $r^\star = 12$ and the channel covariance matrix has rank $r = 21$, but only 11 significant eigenvalues. As a matter of fact, the PGP rate curves of Fig. 2.2(b) will eventually flatten, but this happens at extremely large SNR, irrelevant for practical applications. This example shows that $r^\star$ should always be chosen in order to include all the strongest eigenmodes. However, making $r^\star = r$ is generally not a good choice since many eigenmodes may be very close to zero (as in this example) and therefore including them in the count of $r^\star$ yields a dimensionality bottleneck without any real benefit in terms of inter-group interference (recall that $r^\star G \leq M$, therefore if $r^\star$ is large we may have to decrease $G$, i.e., serve fewer groups in parallel). We conclude that the choice of the effective rank $r^\star$ should be carefully optimized, depending on the specific channel covariance eigenvalue distribution. ♦

Remark 6 The large gap between PGP and JGP even in the case of $r^\star = 12$ (Fig. 2.2(b)) may seem to be in contrast to what is stated in Theorem 1 about the optimality of JSDM with PGP.
This can be explained by noticing that, for a general geometry of the antennas and of the user groups, the eigenvectors are far from forming a tall unitary matrix $\mathbf{U}$. Therefore, for a fixed geometry and user group assignment, PGP can be arbitrarily worse than JGP. It is however important to notice that when the user groups can be selected opportunistically, this gap can be made small for suitable channel statistics. The opportunistic scheduling of the user groups, and of the users inside each group, is addressed in Chapter 3. ♦

2.5 Downlink training and noisy CSIT

In this section, we evaluate the impact of noisy CSIT by including the fact that the effective channels are estimated by the UTs from the downlink training phase. In the vast literature dedicated to CSIT feedback (see for example [CJKR10] and references therein), methods for which the estimated channel Mean-Square Error (MSE) decreases as $O(1/P^{\beta})$ for some $\beta \ge 1$, even in the presence of channel feedback noise and errors, are well known. In contrast, the MSE due to estimation from the downlink training phase decreases at best as $O(1/P)$. In fact, this is given by the high-SNR behavior of the MMSE for a Gaussian signal (the channel vectors) in Gaussian noise. If the CSIT feedback scheme is designed to achieve exponent $\beta > 1$ and the channel SNR is sufficiently large, the feedback error is negligible with respect to the downlink estimation error [CJKR10]. Hence, for simplicity, we consider the optimistic situation of ideal and delay-free CSIT feedback, and focus only on the effect of the downlink channel estimation error and of the dimensionality penalty factor of the training phase (a similar approach is followed in [HTC12]).

For brevity, we focus only on the case of PGP (footnote 4). From Section 2.4.2, the covariance matrix of the effective channel of a user $g_k$ is given by $\bar{\mathbf{R}}_g = \mathbf{B}_g^H \mathbf{R}_g \mathbf{B}_g$.
In order to estimate the effective channel vector $\mathsf{h}_{g_k}$, i.e., the column of the effective channel matrix $\mathsf{H}_g$ corresponding to user $g_k$, the BS sends unitary training sequences of length $b'$, in parallel over the $b'$ virtual inputs of the pre-beamforming of each group $g$. Hence, the training phase with PGP spans $b'$ symbols. The UTs in each group make use of linear MMSE estimation, which is the optimal estimator for minimizing the MSE since the observation at each user and the channel vector are conditionally jointly Gaussian given the training sequences. The MMSE channel estimates are fed back to the BS and are used to compute the linear precoders $\{\mathbf{P}_g\}$. Assuming that in each coherence block of $T$ symbols the training phase makes use of $b'$ symbols, and the remaining $T - b'$ symbols are available for downlink data transmission, it follows that the spectral efficiency must be scaled by the dimensionality penalty factor $\max\{1 - b'/T, 0\}$.

Footnote 4: Analogous results can be obtained for the case of JGP, but these are practically less interesting since JGP typically requires too large a training and feedback overhead in FDD systems.

We consider a scheme where a scaled unitary training matrix $\mathbf{X}_{\rm tr}$ of dimension $b' \times b'$ is sent, simultaneously, to all groups in the common downlink training phase. Here $\mathsf{H}_g$ and $\mathsf{h}_{g_k}$ (sans-serif) denote effective channels, while $\mathbf{H}_g$ and $\mathbf{h}_{g_k}$ (bold) denote the physical channels. The corresponding received signal at the group $g$ receivers is given by

$$\mathbf{Y}_g = \mathsf{H}_g^H \mathbf{X}_{\rm tr} + \sum_{g' \neq g} \mathbf{H}_g^H \mathbf{B}_{g'} \mathbf{X}_{\rm tr} + \mathbf{Z}_g. \tag{2.47}$$

Multiplying from the right by $\mathbf{X}_{\rm tr}^H$ and using the fact that, by design, $\mathbf{X}_{\rm tr}\mathbf{X}_{\rm tr}^H = \rho_{\rm tr}\mathbf{I}_{b'}$, where $\rho_{\rm tr}$ is the power allocated to training, we obtain

$$\mathbf{Y}_g \mathbf{X}_{\rm tr}^H = \rho_{\rm tr}\mathsf{H}_g^H + \rho_{\rm tr}\sum_{g' \neq g} \mathbf{H}_g^H \mathbf{B}_{g'} + \mathbf{Z}_g \mathbf{X}_{\rm tr}^H. \tag{2.48}$$

Extracting the $g_k$-th row, dividing by $\sqrt{\rho_{\rm tr}}$, using the fact that $\mathbf{Z}_g \mathbf{X}_{\rm tr}^H$ has i.i.d. entries $\sim \mathcal{CN}(0,\rho_{\rm tr})$ and taking the Hermitian transpose of everything, we obtain the noisy observation for estimating the $g_k$-th effective channel vector in the form

$$\tilde{\mathsf{h}}_{g_k} = \sqrt{\rho_{\rm tr}}\,\mathsf{h}_{g_k} + \sqrt{\rho_{\rm tr}}\sum_{g' \neq g} \mathbf{B}_{g'}^H \mathbf{h}_{g_k} + \tilde{\mathbf{z}}_{g_k}, \tag{2.49}$$

where $\tilde{\mathbf{z}}_{g_k} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{b'})$. The MMSE estimator for $\mathsf{h}_{g_k}$ based on (2.49) is given by

$$\begin{aligned}
\hat{\mathsf{h}}_{g_k} &= \mathbb{E}\left[\mathsf{h}_{g_k}\tilde{\mathsf{h}}_{g_k}^H\right] \mathbb{E}\left[\tilde{\mathsf{h}}_{g_k}\tilde{\mathsf{h}}_{g_k}^H\right]^{-1} \tilde{\mathsf{h}}_{g_k} \\
&= \sqrt{\rho_{\rm tr}}\,\mathbf{B}_g^H \mathbf{R}_g \left(\sum_{g'=1}^{G} \mathbf{B}_{g'}\right) \left(\rho_{\rm tr}\sum_{g',g''=1}^{G} \mathbf{B}_{g'}^H \mathbf{R}_g \mathbf{B}_{g''} + \mathbf{I}_{b'}\right)^{-1} \tilde{\mathsf{h}}_{g_k} \\
&= \frac{1}{\sqrt{\rho_{\rm tr}}}\,\mathbf{M}_g \tilde{\mathbf{R}}_g \mathbf{O}^T \left(\mathbf{O}\tilde{\mathbf{R}}_g \mathbf{O}^T + \frac{1}{\rho_{\rm tr}}\mathbf{I}_{b'}\right)^{-1} \tilde{\mathsf{h}}_{g_k},
\end{aligned} \tag{2.50}$$

where we used the fact that $\mathsf{h}_{g_k} = \mathbf{B}_g^H \mathbf{h}_{g_k}$, where $\tilde{\mathbf{R}}_g$ is defined in (2.21), and where we introduced the $b' \times b$ block matrices

$$\mathbf{M}_g = [\mathbf{0},\ldots,\mathbf{0},\underbrace{\mathbf{I}_{b'}}_{\text{block } g},\mathbf{0},\ldots,\mathbf{0}], \qquad \mathbf{O} = [\mathbf{I}_{b'},\mathbf{I}_{b'},\ldots,\mathbf{I}_{b'}].$$

Notice that in the case of perfect BD we have $\mathbf{R}_g \mathbf{B}_{g'} = \mathbf{0}$ for $g' \neq g$. Therefore, (2.49) and (2.50) reduce to

$$\tilde{\mathsf{h}}_{g_k} = \sqrt{\rho_{\rm tr}}\,\mathsf{h}_{g_k} + \tilde{\mathbf{z}}_{g_k}, \tag{2.51}$$

and

$$\hat{\mathsf{h}}_{g_k} = \frac{1}{\sqrt{\rho_{\rm tr}}}\,\bar{\mathbf{R}}_g \left(\bar{\mathbf{R}}_g + \frac{1}{\rho_{\rm tr}}\mathbf{I}_{b'}\right)^{-1} \tilde{\mathsf{h}}_{g_k}, \tag{2.52}$$

respectively, where we recall the definition $\bar{\mathbf{R}}_g = \mathbf{B}_g^H \mathbf{R}_g \mathbf{B}_g$.

For this channel estimation scheme, the deterministic equivalent approximation of the SINR terms for RZFBF and ZFBF precoding can be obtained following the approach of [CWD09, HtBD13], which can be directly applied to our case, and using the well-known MMSE decomposition

$$\mathsf{h}_{g_k} = \hat{\mathsf{h}}_{g_k} + \hat{\mathbf{e}}_{g_k}, \tag{2.53}$$

with $\mathbb{E}[\hat{\mathsf{h}}_{g_k}\hat{\mathsf{h}}_{g_k}^H] = \hat{\bar{\mathbf{R}}}_g$ and MMSE covariance matrix $\mathbb{E}[\hat{\mathbf{e}}_{g_k}\hat{\mathbf{e}}_{g_k}^H] = \bar{\mathbf{R}}_g - \hat{\bar{\mathbf{R}}}_g$. For completeness, the fixed-point equations leading to the deterministic equivalent SINR approximation for PGP with noisy CSIT are given in Appendix A.1.
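The perfect-BD estimator (2.52) and the covariance split in (2.53) can be checked numerically. The following is a minimal sketch, assuming NumPy; the function name `mmse_estimator`, the randomly drawn covariance `R_bar`, and the values of `b_prime` and `rho_tr` are illustrative choices, not quantities taken from the text.

```python
import numpy as np

def mmse_estimator(R_bar, rho_tr):
    """Linear MMSE estimator for the model (2.51): h_tilde = sqrt(rho_tr) h + z.

    Returns (A, R_hat, R_err) where
      A     : matrix with h_hat = A @ h_tilde, as in (2.52)
      R_hat : covariance of the estimate, E[h_hat h_hat^H]
      R_err : covariance of the estimation error, R_bar - R_hat
    """
    b = R_bar.shape[0]
    inv = np.linalg.inv(R_bar + np.eye(b) / rho_tr)
    A = (R_bar @ inv) / np.sqrt(rho_tr)
    R_hat = R_bar @ inv @ R_bar
    return A, R_hat, R_bar - R_hat

# Hypothetical effective-channel covariance (any Hermitian PSD matrix works).
rng = np.random.default_rng(0)
b_prime, rho_tr = 4, 10.0
G = rng.standard_normal((b_prime, b_prime)) + 1j * rng.standard_normal((b_prime, b_prime))
R_bar = G @ G.conj().T / b_prime

A, R_hat, R_err = mmse_estimator(R_bar, rho_tr)

# Covariance of the observation in (2.51): rho_tr * R_bar + I.
R_obs = rho_tr * R_bar + np.eye(b_prime)

# The estimate covariance computed from A must agree with R_hat,
# and the error covariance must be positive semidefinite.
print("estimate covariance consistent:", np.allclose(A @ R_obs @ A.conj().T, R_hat))
print("smallest error eigenvalue:", np.linalg.eigvalsh(R_err).min())
```

In the scalar-like case $\bar{\mathbf{R}}_g = \mathbf{I}$ the error covariance reduces to $\frac{1}{1+\rho_{\rm tr}}\mathbf{I}$, which makes the $O(1/P)$ decay of the downlink estimation MSE mentioned above explicit.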
Eventually, the achievable rate of user $g_k$ is approximated by

$$R_{g_k,{\rm pgp,csit}} = \max\left\{1 - \frac{b'}{T},\, 0\right\} \times \log\left(1 + \hat{\gamma}^{o}_{g_k,{\rm pgp,csit}}\right), \tag{2.54}$$

where $\hat{\gamma}^{o}_{g_k,{\rm pgp,csit}}$ indicates either $\hat{\gamma}^{o}_{g_k,{\rm pgp,rzf,csit}}$ or $\hat{\gamma}^{o}_{g_k,{\rm pgp,zf,csit}}$, as detailed in Appendix A.1.

Remark 7 Assuming that, as $M \to \infty$, the other system dimensions $r^\star$, $S$ and $b$ also go to infinity linearly with $M$, the achievable rate approximation error converges to zero almost surely. However, the dimensionality factor $\max\{1 - b'/T, 0\}$ is equal to zero for $b' \ge T$. Hence, in order to obtain mathematically meaningful results, we assume that the coherence block length $T$ also grows linearly with $M$, and we define the factor $\tau = b'/T$ as the dimensionality crowding factor of the channel. In practice, this means that the method is valid in the regime of $b'$ large, but still significantly smaller than $T$. ♦

2.5.1 Results with downlink channel estimation

We demonstrate the effect of noisy CSIT on the performance of RZFBF and ZFBF in Fig. 2.3, for the same antenna configuration of Section 2.4.3 with $r^\star = 11$, for $S' = 4$ and $S' = 8$ streams per group. For the sake of comparison, the solid “red” (“blue”) curve denotes the sum spectral efficiency achieved by RZFBF (ZFBF) with full noiseless CSIT, i.e., by computing the precoding matrix in one step, directly from the instantaneous channel matrix $\mathbf{H}$. The dotted lines represent the performance of JSDM for JGP with eigen-beamforming and noiseless CSIT (i.e., perfect knowledge of the effective channel $\mathsf{H}$). The “purple” (“black”) curves denote the sum spectral efficiency for JSDM with PGP and approximate BD, also in the case of noiseless CSIT.
Finally, the “green” (“cyan”) curves denote the achievable sum spectral efficiency for JSDM with PGP and noisy CSIT, obtained by downlink training and MMSE estimation as explained above. These curves are obtained by optimizing the parameter $b'$, for given $S'$, $r^\star$ and SNR. Since a set of training sequences is sent simultaneously to all groups, the training power is given by $\rho_{\rm tr} = \frac{P}{G}$, such that the total sum power constraint is preserved also during the training phase.

Figure 2.3: Sum spectral efficiency (bit/s/Hz) vs. SNR (dB) for JSDM (computed via deterministic equivalents) with $r^\star = 11$, for (a) $S' = 4$ and (b) $S' = 8$. The coherence block length is $T = 40$. The “green” and “cyan” curves denote the results for imperfect CSIT with optimized choice of $b'$. “JGP” denotes JSDM with joint group processing and “PGP” denotes JSDM with per-group processing. Curves: Full CSI, RZFBF; Full CSI, ZFBF; JGP, RZFBF; JGP, ZFBF; PGP, RZFBF; PGP, ZFBF; PGP ICSI, RZFBF; PGP ICSI, ZFBF.

Remark 8 We now examine the optimization of the parameter $b'$ for a fixed target $S'$, in the case of downlink training and noisy CSIT. Having fixed $r^\star$ as discussed in Remark 5, and assuming $0 \le S' \le b' \le M - r^\star(G-1)$, for each value of SNR and given JSDM precoding scheme there is an optimal choice of $b'$. For example, Fig. 2.4 shows the dependency of the sum spectral efficiency of JSDM with PGP on $b'$ for $S' = 8$ and SNR = 10 and 30 dB. We notice that the sum spectral efficiency including channel estimation is not monotonically increasing with $b'$. In fact, a large $b'$ yields better conditioned effective channel matrices, but incurs a larger dimensionality cost of the downlink training phase.
The tension between these two issues yields a non-trivial choice for the optimal value of $b'$ maximizing the system spectral efficiency. Similar trends can be observed for different values of $S'$ and different values of SNR. ♦

Figure 2.4: Sum spectral efficiency (bit/s/Hz) vs. $b'$ for JSDM with $r^\star = 11$, for $S' = 8$ (computed via deterministic equivalents), at (a) SNR = 10 dB and (b) SNR = 30 dB. The coherence block length is $T = 40$. The “dashed” curves denote the results for PGP with perfect CSIT, and the “solid” lines denote the same for imperfect CSIT. Curves: RZFBF, PGP, ICSI; ZFBF, PGP, ICSI; RZFBF, PGP; ZFBF, PGP.

Remark 9 Having chosen $b'$, we now focus on the choice of the optimal $S'$. This depends heavily on the precoding scheme and on the operating SNR. For a given operating SNR, there is approximately a linear dependence between the optimal $S'$ and $b'$ for both the RZFBF and the ZFBF precoders considered here. This linear dependence can be characterized by a single parameter, namely, the slope of the line relating the optimal $S'$ and $b'$. In Fig. 2.5 we have plotted this slope versus SNR. It can be seen that for RZFBF, at low values of SNR, the choice $S' = b'$ (slope equal to 1) is optimal. In contrast, for ZFBF it is better to serve some number $S' < b'$ of users. As the SNR increases, the ZFBF slope increases and approaches that of RZFBF at high SNR. ♦

Figure 2.5: Ratio $S'/b'$ (slope) for the optimized $S'$ and $b'$ versus the channel SNR for different precoders (RZFBF, ZFBF). Plot title: M = 500, Circular Array.
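The role of the dimensionality penalty in (2.54) can be illustrated with a few lines of code. This is only a sketch: the function name `rate_with_training` is hypothetical, the logarithm is assumed to be base 2 (so that rates are in bit/s/Hz), and the SINR values passed in are arbitrary placeholders rather than deterministic-equivalent SINRs from Appendix A.1.

```python
import math

def rate_with_training(sinr, b_prime, T):
    """Per-user rate approximation of (2.54), assuming log base 2.

    sinr    : SINR value (linear scale); a placeholder here, not computed
              from the fixed-point equations of the deterministic equivalents
    b_prime : number of training symbols per coherence block
    T       : coherence block length in symbols
    """
    penalty = max(1.0 - b_prime / T, 0.0)
    return penalty * math.log2(1.0 + sinr)

# With T = 40 (as in Figs. 2.3 and 2.4), a longer training phase only pays off
# if the SINR improvement outweighs the shrinking pre-log factor.
T = 40
for b_prime, sinr in [(8, 3.0), (10, 3.0), (16, 3.0), (40, 3.0)]:
    print(b_prime, round(rate_with_training(sinr, b_prime, T), 3))
```

Holding the SINR fixed, as done here, isolates the pre-log loss; in the actual system the SINR itself depends on $b'$, which is what produces the interior optimum discussed in Remark 8.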
2.6 Uniform linear arrays: eigenvalues and eigenvectors

In this section we consider the antenna correlation model (2.3) for the special but important case of a Uniform Linear Array (ULA) of large dimension ($M \gg 1$), and obtain important insight on the behavior of the normalized asymptotic rank $\rho = \lim_{M\to\infty} \frac{r}{M}$ and of the eigenvectors $\mathbf{U}$ of the covariance matrix $\mathbf{R}$. We consider a 120 deg sector, obtained by using directional radiating elements, and assume that the sector is centered around the x-axis ($\alpha = 0$ azimuth angle), and that no energy is received for angles $\alpha \notin [-\pi/3, \pi/3]$. A ULA formed by $M$ such directional radiating elements is placed at the origin along the y-axis. Denoting by $\lambda D$ the spacing of the antenna elements, the covariance matrix of the channel for a user at AoA $\theta$ and AS $\Delta$ according to the model of Section 2.1 is given by the Toeplitz form

$$[\mathbf{R}]_{m,p} = \frac{1}{2\Delta}\int_{-\Delta+\theta}^{\Delta+\theta} e^{-j2\pi D(m-p)\sin(\alpha)}\, d\alpha \tag{2.55}$$

for $m, p \in \{0, 1, \ldots, M-1\}$. In order to characterize the eigenvalues and eigenvectors of $\mathbf{R}$ with respect to $D, \Delta, \theta$, for large $M$, we resort to the well-known results of [Gra06, GS84].

From [GS84], we recall the following fundamental result. Let $S(\xi)$ be a uniformly bounded, absolutely integrable function over $\xi \in [-1/2, 1/2]$, i.e.,

$$\int_{-1/2}^{1/2} |S(\xi)|\, d\xi < \infty, \qquad \kappa_1 \le S(\xi) \le \kappa_2,$$

where the bounds hold for all $\xi \in [-1/2, 1/2]$ up to a set of measure zero. Assume that we can write the sequence $r_m = [\mathbf{R}]_{\ell,\ell-m}$ as the inverse discrete-time Fourier transform of $S(\xi)$, i.e.,

$$r_m = \int_{-1/2}^{1/2} S(\xi)\, e^{j2\pi\xi m}\, d\xi. \tag{2.56}$$

Then, the Toeplitz matrix $\mathbf{R}$ can be approximated by the circulant matrix $\mathbf{C}$ defined by its first column with $m$-th element

$$c_m = \begin{cases} r_m + r_{m-M} & \text{for } m = 1,\ldots,M-1 \\ r_0 & \text{for } m = 0, \end{cases} \tag{2.57}$$

where the approximation holds in the following sense:

Fact 1 The sets of eigenvalues $\{\lambda_m(\mathbf{R})\}$, $\{\lambda_m(\mathbf{C})\}$ and the set of uniformly spaced samples $\{S(m/M) : m = 0,\ldots,M-1\}$ are asymptotically equally distributed, i.e., for any continuous function $f(x)$ defined over $[\kappa_1, \kappa_2]$, we have

$$\lim_{M\to\infty}\frac{1}{M}\sum_{m=0}^{M-1} f(\lambda_m(\mathbf{R})) = \lim_{M\to\infty}\frac{1}{M}\sum_{m=0}^{M-1} f(\lambda_m(\mathbf{C})) = \int_{-1/2}^{1/2} f(S(\xi))\, d\xi. \tag{2.58}$$

Fact 2 The eigenvectors of $\mathbf{R}$ are approximated by the eigenvectors of $\mathbf{C}$ in the following eigenspace approximation sense. Define the asymptotic eigenvalue cumulative distribution function (CDF) of the eigenvalues of $\mathbf{R}$ to be the right-continuous non-decreasing function $F(\lambda)$ such that $F(\lambda) = \int_{S(\xi)\le\lambda} d\xi$ for any point of continuity $\kappa_1 \le \lambda \le \kappa_2$. Let $\lambda_0(\mathbf{R}) \le \cdots \le \lambda_{M-1}(\mathbf{R})$ and $\lambda_0(\mathbf{C}) \le \cdots \le \lambda_{M-1}(\mathbf{C})$ denote the sets of ordered eigenvalues of $\mathbf{R}$ and $\mathbf{C}$, and let $\mathbf{U} = [\mathbf{u}_0,\ldots,\mathbf{u}_{M-1}]$ and $\mathbf{F} = [\mathbf{f}_0,\ldots,\mathbf{f}_{M-1}]$ denote the corresponding eigenvectors (footnote 5). For any interval $[a,b] \subseteq [\kappa_1,\kappa_2]$ such that $F(\lambda)$ is continuous on $[a,b]$, consider the eigenvalue index sets $\mathcal{I}_{[a,b]} = \{m : \lambda_m(\mathbf{R}) \in [a,b]\}$ and $\mathcal{J}_{[a,b]} = \{m : \lambda_m(\mathbf{C}) \in [a,b]\}$, and define $\mathbf{U}_{[a,b]} = (\mathbf{u}_m : m \in \mathcal{I}_{[a,b]})$ and $\mathbf{F}_{[a,b]} = (\mathbf{f}_m : m \in \mathcal{J}_{[a,b]})$ to be the submatrices of $\mathbf{U}$ and $\mathbf{F}$ formed by the columns whose indices belong to the sets $\mathcal{I}_{[a,b]}$ and $\mathcal{J}_{[a,b]}$, respectively. Then, the eigenvectors of $\mathbf{C}$ approximate the eigenvectors of $\mathbf{R}$ in the sense that

$$\lim_{M\to\infty}\frac{1}{M}\left\|\mathbf{U}_{[a,b]}\mathbf{U}_{[a,b]}^H - \mathbf{F}_{[a,b]}\mathbf{F}_{[a,b]}^H\right\|_F^2 = 0. \tag{2.59}$$

A well-known property of circulant matrices [Gra06] is that their eigenvectors form a unitary DFT matrix, i.e., the matrix whose $(\ell,m)$-th element is given by $[\mathbf{F}]_{\ell,m} = \frac{e^{-j2\pi\ell m/M}}{\sqrt{M}}$.
This has an important consequence for JSDM with large ULAs: in the regime of large $M$ where the Toeplitz channel correlation matrix $\mathbf{R}$ is well approximated by its circulant version $\mathbf{C}$, we can approximate $\mathbf{U}$, the tall unitary matrix of the channel covariance eigenvectors, with a submatrix of $\mathbf{F}$, formed by a selection of columns of $\mathbf{F}$. Hence, we can design the pre-beamforming stage of JSDM by replacing $\mathbf{U}$ with its DFT approximation, avoiding the need for a precise estimation of the actual channel covariance matrix. In order to understand how to select the columns of $\mathbf{F}$, we need to gain more insight into the asymptotic behavior of the eigenvalues of $\mathbf{R}$.

Footnote 5: Notice that in the channel model defined in Section 2.1 we defined $\mathbf{U}$ of dimensions $M \times r$ to be the matrix of eigenvectors corresponding to the non-zero eigenvalues of $\mathbf{R}$. In the statement of this result, instead, $\mathbf{U}$ denotes the whole $M \times M$ matrix of eigenvectors, including the non-unique eigenvectors forming a unitary basis for the nullspace of $\mathbf{R}$, in the case $r < M$.

For $r_m = [\mathbf{R}]_{\ell,\ell-m}$ with $[\mathbf{R}]_{m,p}$ given by (2.55), and $\mathbf{C}$ defined as in (2.57), the eigenvalues $\{\lambda_k(\mathbf{C})\}$ can be given explicitly for any finite $M$ as follows:

$$\begin{aligned}
\lambda_k(\mathbf{C}) &= \sum_{m=0}^{M-1} c_m e^{-j\frac{2\pi}{M}mk} \\
&= r_0 + \sum_{m=1}^{M-1} [r_m + r_{m-M}]\, e^{-j\frac{2\pi}{M}mk} \\
&= r_0 + \sum_{m=1}^{M-1} r_m e^{-j\frac{2\pi}{M}mk} + \sum_{m=1}^{M-1} r_m^* e^{j\frac{2\pi}{M}mk} \\
&= \frac{1}{2\Delta}\int_{-\Delta+\theta}^{\Delta+\theta}\left[1 + 2\,\mathrm{Re}\left\{\sum_{m=0}^{M-1} e^{-j2\pi m\,\omega_k(D,\alpha)} - 1\right\}\right] d\alpha \\
&= -1 + \frac{1}{\Delta}\int_{-\Delta+\theta}^{\Delta+\theta} \frac{\cos(\pi\omega_k(D,\alpha)(M-1))\,\sin(\pi\omega_k(D,\alpha)M)}{\sin(\pi\omega_k(D,\alpha))}\, d\alpha,
\end{aligned} \tag{2.60}$$

where we define the quantity $\omega_k(D,\alpha) = D\sin(\alpha) + k/M$.

In order to obtain the limiting CDF of the eigenvalues of $\mathbf{R}$ and find a simple formula for the asymptotic rank $\rho$, we obtain an explicit expression of $S(\xi)$ for the autocorrelation function $r_m = [\mathbf{R}]_{\ell,\ell-m}$.
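Using (2.55) and invoking the Lebesgue dominated convergence theorem, we have

$$\begin{aligned}
S(\xi) &= \sum_{m=-\infty}^{\infty} r_m e^{-j2\pi\xi m} \\
&= \frac{1}{2\Delta}\int_{-\Delta+\theta}^{\Delta+\theta}\left[\sum_{m=-\infty}^{\infty} e^{-j2\pi m(D\sin(\alpha)+\xi)}\right] d\alpha \\
&\overset{(a)}{=} \frac{1}{2\Delta}\int_{-\Delta+\theta}^{\Delta+\theta}\left[\sum_{m=-\infty}^{\infty} \delta(D\sin(\alpha)+\xi-m)\right] d\alpha \\
&\overset{(b)}{=} \frac{1}{2\Delta}\int_{D\sin(-\Delta+\theta)}^{D\sin(\Delta+\theta)}\left[\sum_{m=-\infty}^{\infty} \delta(z+\xi-m)\right] \frac{dz}{\sqrt{D^2-z^2}},
\end{aligned} \tag{2.61}$$

where in (a) we used the Poisson sum formula (also known as the “picket fence miracle” [Lap09]) and in (b) we made the change of variable $z = D\sin(\alpha)$. The expression (2.61) is valid for $-\frac{\pi}{2} \le \theta-\Delta < \theta+\Delta \le \frac{\pi}{2}$. A more general formula, able to recover the classical Bessel $J_0$ autocorrelation function [Bel63] in the case of uniform isotropic scattering, is provided in Appendix A.2. Owing to the property of the Dirac delta function, we arrive at

$$S(\xi) = \frac{1}{2\Delta}\sum_{m \in [D\sin(-\Delta+\theta)+\xi,\; D\sin(\Delta+\theta)+\xi]} \frac{1}{\sqrt{D^2-(m-\xi)^2}}, \tag{2.62}$$

where the sum runs over the integers $m$ in the indicated interval. We have:

Lemma 1 The function $S(\xi)$ is non-constant over its support and uniformly bounded, provided that $D \in [0, 1/2]$ and $-\phi < \theta-\Delta < \theta+\Delta < \phi$ for some constant angle $\phi \in [0, \pi/2)$.

Proof $S(\xi)$ is periodic and it is sufficient to restrict $\xi$ to the interval $[-1/2, 1/2]$. As observed before, if $-\frac{\pi}{2} \le \theta-\Delta < \theta+\Delta \le \frac{\pi}{2}$, the general expression of $S(\xi)$ given in Appendix A.2 coincides with (2.62), and we have $-D < -D\sin(\phi) \le D\sin(-\Delta+\theta) < D\sin(\Delta+\theta) \le D\sin(\phi) < D$. Since $-1/2 \le \xi \le 1/2$ and $D \in [0, 1/2]$, the following inequalities hold:

$$\begin{aligned}
D\sin(-\Delta+\theta)+\xi &\ge D\sin(-\Delta+\theta) - 1/2 > -D - 1/2 \ge -1, \\
D\sin(\Delta+\theta)+\xi &\le D\sin(\Delta+\theta) + 1/2 < D + 1/2 \le 1.
\end{aligned}$$

Since $-1 < D\sin(-\Delta+\theta)+\xi < D\sin(\Delta+\theta)+\xi < 1$, the only integer in the interval $[D\sin(-\Delta+\theta)+\xi,\, D\sin(\Delta+\theta)+\xi]$ is 0. Thus,

$$S(\xi) = \frac{1}{2\Delta}\sum_{0 \in [D\sin(-\Delta+\theta)+\xi,\; D\sin(\Delta+\theta)+\xi]} \frac{1}{\sqrt{D^2-\xi^2}}. \tag{2.63}$$

The support $\mathcal{S}$ of $S(\xi)$ is the set of values $\xi \in [-1/2, 1/2]$ for which the interval $[D\sin(-\Delta+\theta)+\xi,\, D\sin(\Delta+\theta)+\xi]$ contains the point 0, i.e., $\mathcal{S} = [-D\sin(\Delta+\theta),\, -D\sin(-\Delta+\theta)]$. It is clear by inspection that $S(\xi)$ is not constant over $\mathcal{S}$ (it is sufficient to observe that $S(\xi)$ is differentiable, and its derivative is not identically zero over a set of non-zero measure).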
To prove that $S(\xi)$ is uniformly bounded, it is sufficient to notice that the term $\frac{1}{\sqrt{D^2-\xi^2}}$ in (2.63) is real, continuous and finite for all $\xi \in (-D, D) \supset [-D\sin(\phi), D\sin(\phi)] \supseteq \mathcal{S}$. Hence, it attains its minimum $\kappa'$ and maximum $\kappa_2$ on $\mathcal{S}$, and these are uniformly bounded as (footnote 6)

$$\frac{1}{D} \le \kappa' < \kappa_2 \le \frac{1}{D\cos(\phi)} < \infty.$$

Notice that the assumptions of Lemma 1 are satisfied for antenna spacing not larger than $\lambda/2$ and under the assumption, made here, that the ULA receives/transmits energy only in a 120 deg sector (i.e., for AoAs in $[-\pi/3, \pi/3]$). As a corollary of (2.62), we obtain the asymptotic rank in closed form:

Theorem 2 The asymptotic normalized rank of the channel covariance matrix $\mathbf{R}$ with elements defined in (2.55), with antenna separation $\lambda D$, AoA $\theta$ and AS $\Delta$, is given by

$$\rho = \min\{1, B(D,\theta,\Delta)\}, \tag{2.64}$$

where

$$B(D,\theta,\Delta) = |D\sin(-\Delta+\theta) - D\sin(\Delta+\theta)|. \tag{2.65}$$

Proof Notice that $B(D,\theta,\Delta)$ is the size of the interval for $m$ in the summation appearing in (2.62). If $B(D,\theta,\Delta) \ge 1$, for any $\xi \in [-1/2, 1/2]$ the sum in (2.62) is non-empty. It follows that $S(\xi) > 0$ for all $\xi$ and the asymptotic normalized rank is $\rho = 1$. In contrast, if $B(D,\theta,\Delta) < 1$, there exists a set $\mathcal{S}^c \subseteq [-1/2, 1/2]$ of measure $1 - B(D,\theta,\Delta)$ such that, if $\xi \in \mathcal{S}^c$, the sum in (2.62) is empty. Therefore, in this case we have $\rho = B(D,\theta,\Delta)$.

A good approximation of the actual rank $r$ for large but finite $M$ is given by $r \approx \rho M$, where $\rho$ is given by Theorem 2. Hence, we can accurately predict the rank of the channel covariance from the system geometric parameters $(D, \theta, \Delta)$.

The empirical CDF of the eigenvalues of $\mathbf{R}$ is defined by

$$F^{(M)}_{\mathbf{R}}(\lambda) = \frac{1}{M}\sum_{m=1}^{M} 1\{\lambda_m(\mathbf{R}) \le \lambda\}. \tag{2.66}$$

Footnote 6: We use $\kappa'$ instead of $\kappa_1$ to denote the minimum of $S(\xi)$ on its support since the minimum eigenvalue, denoted previously by $\kappa_1$, is generally equal to 0 whenever $\mathcal{S}$ is strictly included in $[-1/2, 1/2]$.

[Figure 2.6 plot: CDF versus eigenvalues. Curves: Toeplitz; Circulant, M finite; Circulant, M → ∞.]

Figure 2.6: $M = 400$, $\theta = \pi/6$, $D = 1$, $\Delta = \pi/10$.
Exact empirical eigenvalue CDF of $\mathbf{R}$ (red), its approximation (2.60) based on the circulant matrix $\mathbf{C}$ (dashed blue) and its approximation from the samples of $S(\xi)$ (dashed green).

For large $M$, $F^{(M)}_{\mathbf{R}}(\lambda)$ can be approximated either using (2.60) or the collection of samples $\{S([m/M]) : m = 0,\ldots,M-1\}$, where $[x]$ indicates $x$ modulo the interval $[-1/2, 1/2]$. In both cases, using the resulting collection of $M$ values in (2.66), we obtain a convergent approximation $\tilde{F}^{(M)}_{\mathbf{R}}(\lambda)$ of the empirical CDF (2.66) such that [GS84]

$$\lim_{M\to\infty}\tilde{F}^{(M)}_{\mathbf{R}}(\lambda) = \lim_{M\to\infty} F^{(M)}_{\mathbf{R}}(\lambda) = F(\lambda).$$

As an example, Fig. 2.6 shows the exact empirical CDF of $\mathbf{R}$, its circulant approximation obtained by (2.60) and the asymptotic approximation obtained from the set $\{S([m/M]) : m = 0,\ldots,M-1\}$, for a specific choice of the system parameters. It is apparent that, in this regime, both approximations are very accurate.

2.6.1 Approximating the channel eigenspace

Going back to the problem of approximating the eigenvectors of $\mathbf{R}$ with a set of DFT columns, we notice the following properties of $S(\xi)$ in (2.62):

1. $S(\xi)$ has support on an interval $\mathcal{S} \subseteq [-1/2, 1/2]$, of length $\rho$ (see proof of Theorem 2).
2. $S(\xi)$ is non-constant and bounded over its support (see Lemma 1).

It follows that $F(\lambda)$ has a single discontinuity at $\lambda = 0$, with a jump of height $1-\rho$, corresponding to the mass-point of the zero eigenvalues of $\mathbf{R}$. For $\rho < 1$, $F(\lambda)$ is continuous over $(0, \kappa_2]$, where $\kappa_2 = \max S(\xi) < \infty$ by Lemma 1. Hence, any interval $[a,b]$ with $0 < a < b \le \kappa_2$ is a continuity interval of $F(\lambda)$, and the eigenspace approximation property of Fact 2 holds.
In particular, we have established the following:

Corollary 1 Let $\mathcal{S}$ denote the support of $S(\xi)$, let $\mathcal{J}_{\mathcal{S}} = \{m : [m/M] \in \mathcal{S},\ m = 0,\ldots,M-1\}$ be the set of indices for which the corresponding “angular frequency” $\xi_m = [m/M]$ belongs to $\mathcal{S}$, let $\mathbf{f}_m$ denote the $m$-th column of the unitary DFT matrix $\mathbf{F}$, and let $\mathbf{F}_{\mathcal{S}} = (\mathbf{f}_m : m \in \mathcal{J}_{\mathcal{S}})$ be the DFT submatrix containing the columns with indices in $\mathcal{J}_{\mathcal{S}}$. Then,

$$\lim_{M\to\infty}\frac{1}{M}\left\|\mathbf{U}\mathbf{U}^H - \mathbf{F}_{\mathcal{S}}\mathbf{F}_{\mathcal{S}}^H\right\|_F^2 = 0, \tag{2.67}$$

where $\mathbf{U}$ is the $M \times r$ “tall unitary” matrix of the eigenvectors of $\mathbf{R}$ corresponding to its non-zero eigenvalues.

Proof Since $S(\xi)$ is uniformly bounded and strictly positive over $\mathcal{S}$, we have $0 < \min_{\xi\in\mathcal{S}} S(\xi) = \kappa' < \max S(\xi) = \kappa_2$. Hence, letting $a = \kappa'$ and $b = \kappa_2$, and using the eigenspace approximation property of Fact 2 yields the result.

Consider now a JSDM configuration with a ULA serving $G$ groups with AoAs within a 120 deg sector. For each group $g$, we can approximate the eigenmodes $\mathbf{U}_g$ by the DFT submatrix $\mathbf{F}_{\mathcal{S}_g}$, where $\mathcal{S}_g$ denotes the support of $S_g(\xi)$, given by (2.62) for AoA $\theta_g$ and AS $\Delta$ (for simplicity we assume that the AS is common to all groups, although this can be easily generalized). Since distinct columns of $\mathbf{F}$ are orthogonal, if $\mathcal{S}_g \cap \mathcal{S}_{g'} = \emptyset$ (disjoint angular frequency support), then $\mathbf{F}_{\mathcal{S}_g}^H \mathbf{F}_{\mathcal{S}_{g'}} = \mathbf{0}$. It follows that if the $G$ groups are chosen to have spectra with disjoint support, then $[\mathbf{F}_{\mathcal{S}_1},\ldots,\mathbf{F}_{\mathcal{S}_G}]$ is exactly tall unitary and, because of Corollary 1, $\mathbf{U} = [\mathbf{U}_1,\ldots,\mathbf{U}_G]$ is approximately tall unitary, for large $M$. The following result provides such a condition expressed directly in terms of the AoA intervals.

Theorem 3 Groups $g$ and $g'$ with angles of arrival $\theta_g$ and $\theta_{g'}$ and common angular spread $\Delta$ have spectra with disjoint support if their AoA intervals $[\theta_g-\Delta, \theta_g+\Delta]$ and $[\theta_{g'}-\Delta, \theta_{g'}+\Delta]$ are disjoint.

Proof Define:

$$\begin{aligned}
A_g &= \max(D\sin(\theta_g+\Delta),\, D\sin(\theta_g-\Delta)), \\
B_g &= \min(D\sin(\theta_g+\Delta),\, D\sin(\theta_g-\Delta)), \\
A_{g'} &= \max(D\sin(\theta_{g'}+\Delta),\, D\sin(\theta_{g'}-\Delta)), \\
B_{g'} &= \min(D\sin(\theta_{g'}+\Delta),\, D\sin(\theta_{g'}-\Delta)).
\end{aligned}$$
From (2.62) we notice that $S_g(\xi)$ and $S_{g'}(\xi)$ have disjoint supports if $A_g \le B_{g'}$ or $A_{g'} \le B_g$. Since the mapping $x \mapsto \sin(x)$ is one-to-one in the interval $[-\pi/3, \pi/3]$, this condition corresponds to $[\theta_g-\Delta, \theta_g+\Delta] \cap [\theta_{g'}-\Delta, \theta_{g'}+\Delta] = \emptyset$.

2.6.2 DFT pre-beamforming

Owing to the asymptotic eigenspace approximation and mutual orthogonality of the previous section, an efficient approach to JSDM design when the BS is equipped with a large ULA per sector consists of selecting groups of users with (almost) identical AoA intervals, and finding $G$ groups of such users with non-overlapping AoA intervals. Then, we let $\mathbf{B}_g = \mathbf{F}_{\mathcal{S}_g}$, for $g = 1,\ldots,G$, with $\mathbf{F}_{\mathcal{S}_g}$ defined as in Corollary 1. It follows that $\mathbf{F}_{\mathcal{S}_g}^H \mathbf{U}_{g'} \approx \mathbf{0}$ for all $g \neq g'$, such that the sum spectral efficiency achieved by JSDM with PGP is close to the sum spectral efficiency of the corresponding MU-MIMO downlink channel with full CSIT (see Theorem 1). Notice that this approach is particularly attractive since only a coarse parametric knowledge (the AoA interval) of each user is required, rather than an accurate estimate of its channel covariance matrix.

Figure 2.7: Eigenvalue spectra for a ULA with $M = 400$, $G = 3$, $\theta_1 = -\frac{\pi}{4}$, $\theta_2 = 0$, $\theta_3 = \frac{\pi}{4}$, $D = 1/2$ and $\Delta = 15$ deg. Axes: eigenvalues versus $\xi \in [-0.5, 0.5]$; curves: $\theta = -45$, $\theta = 0$, $\theta = 45$.

Figure 2.8: Sum spectral efficiency (bit/s/Hz) vs. SNR (dB) for JSDM (computed via deterministic equivalents) for DFT pre-beamforming and PGP, for the configuration with spectra shown in Fig. 2.7, choosing $b_g = r_g$ for all groups $g = 1,2,3$. Curves: RZFBF, Full; ZFBF, Full; RZFBF, DFT; ZFBF, DFT.

Fig. 2.7 shows the spectra $S_g(\xi)$ for $g = 1,2,3$, $M = 400$, and $\theta_1 = -\frac{\pi}{4}$, $\theta_2 = 0$, $\theta_3 = \frac{\pi}{4}$, with $D = 1/2$ and $\Delta = 15$ deg. The performance of JSDM with PGP and DFT pre-beamforming is shown in Fig. 2.8, indicating that up to 20 dB of SNR, DFT pre-beamforming performs close to schemes with full CSIT.
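The rank prediction of Theorem 2 and the DFT eigenspace approximation of Corollary 1 can be explored numerically for the configuration of Fig. 2.7. The sketch below assumes NumPy and makes several illustrative choices that are not in the text: the integral (2.55) is evaluated with a trapezoid rule on a fixed grid, the numerical rank is counted with a heuristic threshold of 1 on the eigenvalues, and all function names are hypothetical. One assumption deserves care: the DFT columns are matched to the array response $e^{-j2\pi D\ell\sin\alpha}$ appearing in (2.55), i.e., columns with $[m/M] \in [D\sin(\theta-\Delta), D\sin(\theta+\Delta)]$ are selected; with a different sign convention for $\xi$ or for the DFT matrix, the selected interval is mirrored.

```python
import numpy as np

def ula_covariance(M, D, theta, delta, n_grid=4001):
    """Toeplitz covariance (2.55); integral approximated by a trapezoid rule."""
    alpha = np.linspace(theta - delta, theta + delta, n_grid)
    ell = np.arange(M)[:, None]
    # Array response a(alpha)[l] = exp(-j 2 pi D l sin(alpha)).
    A = np.exp(-2j * np.pi * D * ell * np.sin(alpha)[None, :])
    w = np.ones(n_grid)
    w[[0, -1]] = 0.5
    w /= w.sum()                      # weights approximate (1/(2 delta)) d(alpha)
    return (A * w) @ A.conj().T

def predicted_rank_fraction(D, theta, delta):
    """Asymptotic normalized rank rho of Theorem 2, (2.64)-(2.65)."""
    return min(1.0, abs(D * np.sin(theta - delta) - D * np.sin(theta + delta)))

def dft_columns(M, D, theta, delta):
    """DFT columns whose wrapped frequency [m/M] matches the AoA interval."""
    xi = np.arange(M) / M
    xi = np.where(xi >= 0.5, xi - 1.0, xi)          # wrap to [-1/2, 1/2)
    lo, hi = sorted((D * np.sin(theta - delta), D * np.sin(theta + delta)))
    idx = np.flatnonzero((xi >= lo) & (xi <= hi))
    ell = np.arange(M)[:, None]
    return np.exp(-2j * np.pi * ell * idx[None, :] / M) / np.sqrt(M)

M, D, delta = 400, 0.5, np.deg2rad(15)
thetas = [-np.pi / 4, 0.0, np.pi / 4]

ranks, predictions, gaps, F_blocks = [], [], [], []
for theta in thetas:
    R = ula_covariance(M, D, theta, delta)
    eigval, eigvec = np.linalg.eigh(R)              # ascending eigenvalues
    r = int(np.sum(eigval > 1.0))                   # heuristic rank threshold
    U = eigvec[:, -r:]
    F_S = dft_columns(M, D, theta, delta)
    # Normalized projector distance, the finite-M version of (2.67).
    gap = np.linalg.norm(U @ U.conj().T - F_S @ F_S.conj().T, "fro") ** 2 / M
    ranks.append(r)
    predictions.append(predicted_rank_fraction(D, theta, delta) * M)
    gaps.append(gap)
    F_blocks.append(F_S)
    print(f"theta={theta:+.3f}: rank={r}, rho*M={predictions[-1]:.1f}, gap={gap:.4f}")

# Disjoint AoA intervals give disjoint index sets, hence orthogonal DFT blocks.
cross = F_blocks[0].conj().T @ F_blocks[2]
print("max cross-group inner product:", np.abs(cross).max())
```

The projector distance is expected to be small but not zero at finite $M$, since the few DFT columns near the edges of the support are only partially captured by the dominant eigenspace.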
2.7 JSDM with 3D pre-beamforming

So far we considered a planar geometry where each user group $g$ is identified by its AoA interval $[\theta_g-\Delta, \theta_g+\Delta]$. For the sake of simplicity, we allocated equal power to all $S$ downlink data streams. This is a near-optimal power allocation in the high SNR (high spectral efficiency) regime and in the case where the pathloss from the BS to all the UTs is approximately equal. In practice, however, users with the same (or very similar) AoA interval may be located at different distances from the BS. In this case, a simple alternative to the complicated and generally non-convex power allocation optimization across different users (footnote 7) consists of dividing the cell into concentric annular regions, and serving simultaneously groups in the same region, such that the pathloss is nearly equal for all jointly processed groups. Groups in different annular regions can be scheduled over the time-frequency slots. In this section, we consider an extension of this approach where we assume that the BS is elevated with respect to ground. For example, antenna elements could be placed on the window frames of a tall building, forming a rectangular array with $M$ antennas in each row (each row is a ULA) and a total of $N$ rows in the vertical dimension. By exploiting the vertical dimension, different annular regions can be served simultaneously in the spatial domain.

Footnote 7: While for MU-MIMO with full CSIT and optimal capacity-achieving coding the power allocation is a convex optimization problem that can be efficiently solved [WSS06], for JSDM with either JGP or PGP the problem is non-convex and the optimization is not amenable to a computationally efficient solution.

Assuming a rectangular $N \times M$ array, we consider using a separable 3D pre-beamforming scheme: beamforming in the elevation angle dimension is used to form beams that “look down” at different angles, i.e., they illuminate concentric annular regions within the cell sector.
For each such region, precoding in the azimuth angle dimension is obtained by a JSDM scheme with $M$ antennas, as done before. Thanks to separability, we can optimize the JSDM schemes independently, one for each annular region.

The groups served simultaneously by JSDM in the same region are now identified by two indices, $(l,g)$, where $l = 1,\ldots,L$ indicates the annular region and $g = 1,\ldots,G_l$ the group in the $l$-th region. A set of groups served simultaneously, on the same time-frequency dimensions, is referred to as a “pattern”. A pattern does not necessarily cover the whole sector. In fact, it is usually better to allow for “holes” in the pattern, i.e., the group footprints can be separated by gaps, in order to guarantee near orthogonality between the dominant eigenmodes of the groups in the same pattern, thus limiting inter-group interference with PGP. In order to provide coverage to the whole sector, different intertwined patterns can be multiplexed over the time-frequency dimension, similarly to the intertwined cooperative pattern idea proposed in [RCP09], [RCP10], [CRP10]. The fraction of the time-frequency dimensions allocated to each pattern can be further optimized in order to maximize a network utility function, reflecting some desired notion of fairness (see for example [HCPR12]).

For the time being, we focus on a single pattern comprising $L$ regions in the elevation angle dimension, and $G_l$ groups in the azimuth angle dimension for each region $l = 1,\ldots,L$. We let $K_{l,g}$ denote the number of users in group $(l,g)$. At the BS, an $N \times M$ rectangular antenna array with $N$ rows and $M$ columns is used. For each region $l$, we denote by $\mathbf{R}_{V,l} \in \mathbb{C}^{N\times N}$ the vertical channel covariance matrix (footnote 8) and, for each group $(l,g)$, we let $\mathbf{R}_{H,l,g} \in \mathbb{C}^{M\times M}$ denote the horizontal channel covariance matrix.

Footnote 8: We assume that the vertical correlation does not depend on $g$, but just on $l$.
$\mathbf{R}_{V,l}$ and $\mathbf{R}_{H,l,g}$ are modeled according to (2.3), with the eigen-decompositions

$$\mathbf{R}_{V,l} = \mathbf{U}_{V,l}\boldsymbol{\Lambda}_{V,l}\mathbf{U}_{V,l}^H, \quad \text{and} \quad \mathbf{R}_{H,l,g} = \mathbf{U}_{H,l,g}\boldsymbol{\Lambda}_{H,l,g}\mathbf{U}_{H,l,g}^H. \tag{2.68}$$

Letting $\mathbf{h}_{l,g_k}$ denote the $MN \times 1$ vectorized channel from the $N \times M$ BS array to the $k$-th user in group $(l,g)$, we have

$$\mathbb{E}[\mathbf{h}_{l,g_k}\mathbf{h}_{l,g_k}^H] = \mathbf{R}_{l,g} = \mathbf{R}_{H,l,g} \otimes \mathbf{R}_{V,l} = (\mathbf{U}_{H,l,g} \otimes \mathbf{U}_{V,l})(\boldsymbol{\Lambda}_{H,l,g} \otimes \boldsymbol{\Lambda}_{V,l})(\mathbf{U}_{H,l,g}^H \otimes \mathbf{U}_{V,l}^H). \tag{2.69}$$

This covariance matrix is common (by assumption) to all users $g_k$ in group $(l,g)$. Denoting the ranks of $\mathbf{R}_{H,l,g}$ and $\mathbf{R}_{V,l}$ by $r_{H,l,g}$ and $r_{V,l}$, respectively, we write $\mathbf{h}_{l,g_k}$ as

$$\mathbf{h}_{l,g_k} = (\mathbf{U}_{H,l,g} \otimes \mathbf{U}_{V,l})(\boldsymbol{\Lambda}_{H,l,g}^{1/2} \otimes \boldsymbol{\Lambda}_{V,l}^{1/2})\,\mathbf{w}_{l,g_k},$$

where $\mathbf{U}_{H,l,g}$ is $M \times r_{H,l,g}$, $\mathbf{U}_{V,l}$ is $N \times r_{V,l}$, $\boldsymbol{\Lambda}_{H,l,g}$ is $r_{H,l,g} \times r_{H,l,g}$ and $\boldsymbol{\Lambda}_{V,l}$ is $r_{V,l} \times r_{V,l}$. The vector $\mathbf{w}_{l,g_k}$, of dimension $r_{H,l,g}\,r_{V,l} \times 1$, has i.i.d. entries $\sim \mathcal{CN}(0,1)$.

In JSDM with 3D pre-beamforming, the transmitted signal is given by

$$\mathbf{x} = \sum_{l=1}^{L} (\mathbf{B}_l \mathbf{P}_l \mathbf{d}_l) \otimes \mathbf{q}_l, \tag{2.70}$$

where $\mathbf{q}_l$ denotes the $N \times 1$ pre-beamforming vector for region $l$ in the elevation angle dimension, $\mathbf{B}_l$ is the $M \times b_l$ pre-beamforming matrix of the form $\mathbf{B}_l = [\mathbf{B}_{l,1},\ldots,\mathbf{B}_{l,G_l}]$, where $\mathbf{B}_{l,g}$ denotes the pre-beamforming matrix of size $M \times b_{l,g}$ for group $(l,g)$, and $\mathbf{P}_l$ is the linear precoding matrix for the groups of region $l$, which depends on the instantaneous effective channels as given in Section 2.2. Notice that we allocate (by design) a single dimension per region in the elevation angle direction (this is why $\mathbf{q}_l$ has dimensions $N \times 1$) since, because of the relatively small angle under which the BS sees the different regions, it is realistic to expect that $\mathbf{R}_{V,l}$ has a single dominant eigenmode. Generalizations considering higher dimensional vertical pre-beamforming for each region are conceptually straightforward, although not very useful in typical practical scenarios.
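The Kronecker structure of (2.69) and (2.70) is easy to verify numerically on a toy example. The sketch below assumes NumPy; the array sizes, ranks, randomly generated eigenvectors and the pre-beamforming matrix `B` are all hypothetical and chosen only to keep the example small. It checks the factorization in (2.69) and shows that, when $\mathbf{q}_l$ is aligned with the dominant vertical eigenvector, $\mathbf{U}_{V,l}^H\mathbf{q}_l$ becomes the first canonical vector.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_eigen(n, rank, rng):
    """Hypothetical orthonormal eigenvectors and sorted positive eigenvalues."""
    Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Q, _ = np.linalg.qr(Z)
    lam = np.sort(rng.uniform(0.5, 2.0, rank))[::-1]   # descending order
    return Q[:, :rank], lam

M, N = 6, 4                       # toy horizontal and vertical dimensions
U_H, lam_H = random_eigen(M, 3, rng)
U_V, lam_V = random_eigen(N, 2, rng)
R_H = (U_H * lam_H) @ U_H.conj().T
R_V = (U_V * lam_V) @ U_V.conj().T

# Factorization of the covariance, as in (2.69).
R = np.kron(R_H, R_V)
U_kron = np.kron(U_H, U_V)
R_factored = U_kron @ np.diag(np.kron(lam_H, lam_V)) @ U_kron.conj().T
print("(2.69) holds:", np.allclose(R, R_factored))

# Vertical beamformer aligned with the dominant vertical eigenvector.
q = U_V[:, [0]]                                        # N x 1
B = rng.standard_normal((M, 2)) + 0j                   # hypothetical M x b pre-beamformer

# Mixed-product rule: (A kron B)(C kron D) = (AC) kron (BD).
lhs = U_kron.conj().T @ np.kron(B, q)
rhs = np.kron(U_H.conj().T @ B, U_V.conj().T @ q)
print("mixed-product rule holds:", np.allclose(lhs, rhs))
print("U_V^H q =", np.round(U_V.conj().T @ q, 12).ravel())
```

Because the vertical factor collapses to a single coordinate, the horizontal factor $\mathbf{U}_{H,l,g}^H\mathbf{B}_l$ is all that remains to be designed, which is what allows the planar JSDM machinery to be reused per region.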
Using repeatedly the Kronecker product rule $(\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = (\mathbf{A}\mathbf{C}) \otimes (\mathbf{B}\mathbf{D})$, the received signal for user $g_k$ in group $(l,g)$ can be written as

$$\begin{aligned}
y_{l,g_k} &= \mathbf{w}_{l,g_k}^H (\boldsymbol{\Lambda}_{H,l,g}^{1/2} \otimes \boldsymbol{\Lambda}_{V,l}^{1/2})(\mathbf{U}_{H,l,g}^H \otimes \mathbf{U}_{V,l}^H)\,\mathbf{x} + z_{l,g_k} \\
&= \mathbf{w}_{l,g_k}^H (\boldsymbol{\Lambda}_{H,l,g}^{1/2} \otimes \boldsymbol{\Lambda}_{V,l}^{1/2})(\mathbf{U}_{H,l,g}^H \otimes \mathbf{U}_{V,l}^H)\left[\sum_{m=1}^{L} (\mathbf{B}_m \mathbf{P}_m \mathbf{d}_m) \otimes \mathbf{q}_m\right] + z_{l,g_k} \\
&= \mathbf{w}_{l,g_k}^H (\boldsymbol{\Lambda}_{H,l,g}^{1/2} \otimes \boldsymbol{\Lambda}_{V,l}^{1/2}) \sum_{m=1}^{L} \left[(\mathbf{U}_{H,l,g}^H \mathbf{B}_m \mathbf{P}_m \mathbf{d}_m) \otimes (\mathbf{U}_{V,l}^H \mathbf{q}_m)\right] + z_{l,g_k} \\
&= \mathbf{w}_{l,g_k}^H (\boldsymbol{\Lambda}_{H,l,g}^{1/2} \otimes \boldsymbol{\Lambda}_{V,l}^{1/2}) \sum_{m=1}^{L} \left[(\mathbf{U}_{H,l,g}^H \mathbf{B}_m) \otimes (\mathbf{U}_{V,l}^H \mathbf{q}_m)\right] \mathbf{P}_m \mathbf{d}_m + z_{l,g_k}.
\end{aligned} \tag{2.71}$$

If $\mathbf{q}_m$ is chosen to be orthogonal to $\mathrm{Span}(\{\mathbf{U}_{V,l} : l \neq m\})$, (2.71) reduces to

$$y_{l,g_k} = \mathbf{w}_{l,g_k}^H (\boldsymbol{\Lambda}_{H,l,g}^{1/2} \otimes \boldsymbol{\Lambda}_{V,l}^{1/2})\left[(\mathbf{U}_{H,l,g}^H \mathbf{B}_l) \otimes (\mathbf{U}_{V,l}^H \mathbf{q}_l)\right]\mathbf{P}_l \mathbf{d}_l + z_{l,g_k}. \tag{2.72}$$

Stacking the signals $y_{l,g_k}$ for all users $g_k$ in group $(l,g)$ into a $K_{l,g} \times 1$ vector $\mathbf{y}_{l,g}$, we obtain

$$\mathbf{y}_{l,g} = \mathbf{W}_{l,g}^H (\boldsymbol{\Lambda}_{H,l,g}^{1/2} \otimes \boldsymbol{\Lambda}_{V,l}^{1/2})\left[(\mathbf{U}_{H,l,g}^H \mathbf{B}_l) \otimes (\mathbf{U}_{V,l}^H \mathbf{q}_l)\right]\mathbf{P}_l \mathbf{d}_l + \mathbf{z}_{l,g}, \tag{2.73}$$

where we let $\mathbf{W}_{l,g} = [\mathbf{w}_{l,g_1},\ldots,\mathbf{w}_{l,g_{K_{l,g}}}]$ and $\mathbf{z}_{l,g} = [z_{l,g_1},\ldots,z_{l,g_{K_{l,g}}}]^T$. If the regions are sufficiently separated in the elevation angle dimension, it is possible to align $\mathbf{q}_l$ with the dominant eigenmode of $\mathbf{U}_{V,l}$, while maintaining the orthogonality condition $\mathbf{U}_{V,m}^H \mathbf{q}_l = \mathbf{0}$ for $m \neq l$. In this case, we have $\mathbf{U}_{V,l}^H \mathbf{q}_l = (1,0,\ldots,0)^T$ and (2.73) reduces to the same form treated previously for the planar geometry, with an additional region-specific coefficient $\sqrt{\lambda_{V,l,1}}$, corresponding to the largest eigenvalue of the matrix $\boldsymbol{\Lambda}_{V,l}$:

$$\mathbf{y}_{l,g} = \sqrt{\lambda_{V,l,1}}\,\mathbf{W}_{l,g}^H \boldsymbol{\Lambda}_{H,l,g}^{1/2} \mathbf{U}_{H,l,g}^H \mathbf{B}_l \mathbf{P}_l \mathbf{d}_l + \mathbf{z}_{l,g}. \tag{2.74}$$

Stacking the vectors $\mathbf{y}_{l,g}$ for all $g = 1,\ldots,G_l$, we obtain

$$\mathbf{y}_l = \sqrt{\lambda_{V,l,1}} \begin{bmatrix} \mathbf{W}_{l,1}^H \boldsymbol{\Lambda}_{H,l,1}^{1/2} \mathbf{U}_{H,l,1}^H \\ \mathbf{W}_{l,2}^H \boldsymbol{\Lambda}_{H,l,2}^{1/2} \mathbf{U}_{H,l,2}^H \\ \vdots \\ \mathbf{W}_{l,G_l}^H \boldsymbol{\Lambda}_{H,l,G_l}^{1/2} \mathbf{U}_{H,l,G_l}^H \end{bmatrix} \mathbf{B}_l \mathbf{P}_l \mathbf{d}_l + \mathbf{z}_l, \tag{2.75}$$

which is of the same form as (2.4). At this point, the pre-beamforming matrix $\mathbf{B}_l$ and the precoding matrix $\mathbf{P}_l$ can be optimized independently for each region $l$, as described before for the planar geometry. The coefficients $\lambda_{V,l,1}$ incorporate the effect of the different geometry of the annular regions in the elevation angle dimension, including the path loss due to the different distances of the regions from the BS. The allocation of the total transmit power over the regions can be further optimized.

2.7.1 Results with 3D pre-beamforming

We present some results for JSDM with 3D pre-beamforming and PGP, with either BD or DFT pre-beamforming in each region. The system layout is shown in Fig. 2.9. We consider one sector of a hexagonal cell of radius 600 m. The scattering rings in the channel correlation model have radius $r = 30$ m. The BS is located at the center of the cell with the antennas at an elevation of $h = 50$ m, and is equipped with a rectangular array with $M = 200$ and $N = 300$. We partition the sector into 8 concentric regions at distance $60l$ m, $l \in \{1,\ldots,8\}$. Each annular region is divided into small scattering rings, each defining a group. The pathloss between the BS and a point at distance $x$ m is given by

$$g(x) = \frac{1}{1 + \left(\frac{x}{d_0}\right)^{\delta}}, \tag{2.76}$$

with $\delta = 3.8$ and $d_0 = 30$ m. In these results we assume ideal CSIT for computing the JSDM precoder. The horizontal covariance matrix for all groups $(l,g)$ is given by (2.3) with $\Delta_{H,l} = \arctan\left(\frac{r}{60l}\right) = \arctan\left(\frac{1}{2l}\right)$ and $\theta_{H,l,g} \in [-\pi/3, \pi/3]$ such that for any two groups $(l,g_1)$ and $(l,g_2)$, we have $|\theta_{H,l,g_1} - \theta_{H,l,g_2}| > 2\Delta_{H,l}$.

Figure 2.9: The layout of one pattern for JSDM with 3D pre-beamforming (120 degree sector, cell radius 600 m, BS elevation 50 m). The concentric regions are separated by the vertical pre-beamforming. The circles indicate user groups. Same-color groups are served simultaneously using JSDM.

It is easy to see from Fig.
2.9 that as the distance of the ring from the BS increases, more and more user groups can be accommodated in the annular region, since Δ H,l decreases. The vertical covariance matrix is again given by (2.3) with Δ V,l = 1 2 (arctan( 60l+r h )−arctan( 60l−r h )) and θ V,l = 1 2 (arctan( 60l+r h )+arctan( 60l−r h )). Sincethetotalangleunderwhichthesectorisseenfrom the elevation viewpoint is narrow, a large number of antennas in the vertical direction is required in order to achieve orthogonality between all annular regions eigenmodes. For finite N, in order to guarantee a desired angular separation between annular regions and therefore have near-orthogonality in the elevation angle dimension, it is con- venient to partition the annular regions into maximally separated subsets (patterns) and apply BD in the vertical pre-beamforming. Different patterns can be scheduled in differ- ent time-frequency slots. We denote by A = {A 1 ,A 2 ,...} the set of patterns. Finding the best possible pattern partition is computationally hard, so for the sake of simplic- ity, we consider a simple partitioning as shown in Fig. 2.9, where annular regions with 56 1 2 3 4 5 6 7 8 400 500 600 700 800 900 1000 Annular Region Index l Sum rate of annular regions BD, RZFBF BD, ZFBF DFT, RZFBF DFT, ZFBF Figure 2.10: Sum spectral efficiency ¯ R l for different annular regions l = 1,...,8 with Regularized ZF and ZF for JSDM with 3D pre-beamforming and ideal CSIT. “BD” denotes PGP with approximate block diagonalization and “DFT” stands for PGP with DFT pre-beamforming. Equal power is allocated to all served users and the number of users (streams) in each group is optimized in order to maximize the overall spectral efficiency. 57 the same color belong to the same pattern. In our example (see Fig. 2.9), numbering the annular regions in ascending order based on their proximity to the BS, we have A ={{1,5},{2,6},{3,7},{4,8}}. 
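As a worked illustration of the geometry above, the following sketch evaluates the pathloss (2.76), the horizontal angular spread $\Delta_{H,l}$, and the vertical parameters $(\theta_{V,l}, \Delta_{V,l})$ for the eight regions, using the stated values $r = 30$ m, $h = 50$ m, $d_0 = 30$ m and $\delta = 3.8$. The function names are hypothetical, and the stride rule in `patterns` is only an assumption that happens to reproduce the example partition $\{\{1,5\},\{2,6\},\{3,7\},\{4,8\}\}$; it is not claimed to be the optimal partition.

```python
import math

R_RING = 30.0        # scattering ring radius r (m)
H_BS = 50.0          # BS antenna elevation h (m)
D0 = 30.0            # pathloss reference distance d_0 (m)
PL_EXPONENT = 3.8    # pathloss exponent delta


def pathloss(x):
    """Pathloss g(x) of (2.76) at distance x (m)."""
    return 1.0 / (1.0 + (x / D0) ** PL_EXPONENT)


def horizontal_spread(l):
    """Delta_{H,l} = arctan(r / (60 l)) in radians."""
    return math.atan(R_RING / (60.0 * l))


def vertical_params(l):
    """Return (theta_{V,l}, Delta_{V,l}) in radians for region l."""
    hi = math.atan((60.0 * l + R_RING) / H_BS)
    lo = math.atan((60.0 * l - R_RING) / H_BS)
    return 0.5 * (hi + lo), 0.5 * (hi - lo)


def patterns(num_regions, num_patterns):
    """Assumed stride rule: pattern q holds regions q, q + Q, q + 2Q, ..."""
    return [list(range(q, num_regions + 1, num_patterns))
            for q in range(1, num_patterns + 1)]


for l in range(1, 9):
    theta_v, delta_v = vertical_params(l)
    print(l,
          round(math.degrees(horizontal_spread(l)), 2),
          round(math.degrees(theta_v), 2),
          round(math.degrees(delta_v), 2))

print(patterns(8, 4))
```

Running the loop shows both $\Delta_{H,l}$ and $\Delta_{V,l}$ shrinking as $l$ grows, which is the reason more groups fit into the outer regions and also the reason the outer regions become hard to separate in elevation.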
In this way, (2.75) is replaced by

$$\mathbf{y}_l = \left|\boldsymbol{\Lambda}_{V,l}^{1/2} \mathbf{U}_{V,l}^{\mathsf{H}} \mathbf{q}_l\right| \begin{bmatrix} \mathbf{W}_{l,1}^{\mathsf{H}} \boldsymbol{\Lambda}_{H,l,1}^{1/2} \mathbf{U}_{H,l,1}^{\mathsf{H}} \\ \mathbf{W}_{l,2}^{\mathsf{H}} \boldsymbol{\Lambda}_{H,l,2}^{1/2} \mathbf{U}_{H,l,2}^{\mathsf{H}} \\ \vdots \\ \mathbf{W}_{l,G_l}^{\mathsf{H}} \boldsymbol{\Lambda}_{H,l,G_l}^{1/2} \mathbf{U}_{H,l,G_l}^{\mathsf{H}} \end{bmatrix} \mathbf{B}_l \mathbf{P}_l \mathbf{d}_l + \mathbf{z}_l. \tag{2.77}$$

Notice that due to BD in the vertical direction, the inter-region interference is exactly zero since $\mathbf{U}_{V,m}^{\mathsf{H}} \mathbf{q}_l = \mathbf{0}$ for $m \neq l$. Within each annular region, we use JSDM with PGP. The pre-beamforming matrices $\mathbf{B}_l$ for region $l$ are obtained using approximate BD or the DFT method, as discussed in previous sections. The dominant rank $r^\star_{l,g}$ for each group $(l,g)$ is given by

$$r^\star_{l,g} = MD\left(\sin(\theta_{H,l,g} + \Delta_{H,l}) - \sin(\theta_{H,l,g} - \Delta_{H,l})\right), \tag{2.78}$$

which is a good approximation for large $M$, motivated by Theorem 2. For simplicity, we do not consider noisy CSIT and assume that the BS has full knowledge of the effective channels. Hence, we let $b_{l,g} = r^\star_{l,g}$. In contrast, in the case of noisy CSIT the parameter $b_{l,g}$ should be optimized for the given channel coherence block length $T$, as discussed in Remark 8. Denoting by $R_q$ the sum spectral efficiency of pattern $\mathcal{A}_q$, and letting $Q$ denote the number of patterns, the network utility maximization problem is given by

$$\begin{aligned}
\text{maximize} \quad & g(R_1, \ldots, R_Q) \\
\text{subject to} \quad & R_q \leq \nu_q R^*_q, \quad \text{for } q = 1, \ldots, Q, \\
& \sum_{q=1}^{Q} \nu_q = 1,
\end{aligned} \tag{2.79}$$

where $g(\cdot)$ is a concave component-wise non-decreasing network utility function capturing some desired notion of fairness, and the optimization variables $\{\nu_q\}$ are the fractions of time-frequency dimensions allocated to each pattern. We define

$$R^*_q = \sum_{l \in \mathcal{A}_q} \sum_{g=1}^{G_l} \sum_{k=1}^{S_{l,g}} R_{l,g,k}$$

to be the spectral efficiency of each individual pattern, where $S_{l,g}$ is the number of downlink streams to group $(l,g)$ and $R_{l,g,k}$ is the rate of the $k$-th stream of group $(l,g)$. We have considered two cases of fairness: proportional fairness (PFS), and max-min fairness. In both cases, the optimal dimension allocation fractions $\{\nu_q\}$ can be found in closed form.
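The two closed-form allocations for (2.79), namely $\nu_q = 1/Q$ under PFS and $\nu_q$ proportional to $1/R^*_q$ under max-min fairness, can be checked with a few lines of code. This is a minimal sketch: the function names and the per-pattern spectral efficiencies in `rates` are hypothetical values chosen for illustration, not results from the dissertation.

```python
def pfs_fractions(rates):
    """Proportional fairness: every pattern gets the same share 1/Q."""
    q = len(rates)
    return [1.0 / q] * q


def maxmin_fractions(rates):
    """Max-min fairness: share inversely proportional to R*_q."""
    inverse = [1.0 / r for r in rates]
    total = sum(inverse)
    return [v / total for v in inverse]


# Hypothetical per-pattern spectral efficiencies R*_q (bit/s/Hz).
rates = [10.0, 20.0, 40.0, 40.0]

nu_pfs = pfs_fractions(rates)
nu_maxmin = maxmin_fractions(rates)

# Resulting long-term rates R_q = nu_q * R*_q under each policy.
served_pfs = [n * r for n, r in zip(nu_pfs, rates)]
served_maxmin = [n * r for n, r in zip(nu_maxmin, rates)]

print(nu_pfs, served_pfs)
print(nu_maxmin, served_maxmin)
```

Under max-min fairness the products $\nu_q R^*_q$ come out equal for all patterns, which is what equalizing the minimum rate requires; under PFS the shares are equal and the rates simply scale with $R^*_q$.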
For PFS, we have $g(R_1, \ldots, R_Q) = \sum_{q=1}^{Q} \log(R_q)$, yielding the solution $\nu_q = \frac{1}{Q}$ for all $q$. For max-min fairness, we have $g(R_1, \ldots, R_Q) = \min_q R_q$, yielding the solution

$$\nu_q = \frac{1/R^*_q}{\sum_{q'=1}^{Q} 1/R^*_{q'}}.$$

The spectral efficiency $R^*_q$ can be optimized independently for each pattern $\mathcal{A}_q$. For a given JSDM precoding scheme, we need to search over the number of downlink streams in each group. This is a multi-dimensional integer search over the parameters $\{S_{l,g}\}$ for all groups $(l,g) \in \mathcal{A}_q$. In addition, we should optimize with respect to the power allocation to the downlink data streams, as mentioned before. In order to obtain a tractable problem, we resort to good heuristics. Following the design guideline given in Remark 9, we know that the ratio $S_{l,g}/b_{l,g}$ should be approximately the same for the optimal $S_{l,g}$ for all groups $(l,g)$ with similar geometry, i.e., belonging to the same region. Hence, we fix this ratio to be the same for all groups in the same region, and indicate it as $\alpha_l$. In addition, as done before, we restrict to equal power allocation to all the downlink streams. Indicating this common per-stream power value by $\bar{P}$, and letting $R_{l,g,k}(\bar{P})$ denote the rate of the $k$-th stream of group $(l,g)$ as a function of $\bar{P}$, calculated according to the methods given in Section 2.4 and Appendix A.1 for the given MU-MIMO precoding scheme, the optimization with respect to $\{\alpha_l\}$ is expressed by

$$\begin{aligned}
\text{maximize} \quad & \sum_{l \in \mathcal{A}_q} \sum_{g=1}^{G_l} \sum_{k=1}^{S_{l,g}} R_{l,g,k}(\bar{P}) \\
\text{subject to} \quad & S_{l,g} = \lfloor \alpha_l b_{l,g} \rfloor, \\
& \bar{P} = \frac{P}{\sum_{l \in \mathcal{A}_q} \sum_{g=1}^{G_l} S_{l,g}}.
\end{aligned} \tag{2.80}$$

Notice that for a pattern with $|\mathcal{A}_q|$ regions, (2.80) consists of a $|\mathcal{A}_q|$-dimensional search over the real parameters $\alpha_l \in [0,1]$, which is tractable when $|\mathcal{A}_q|$ is small (in our case, $|\mathcal{A}_q| = 2$). Fig. 2.10 shows the sum spectral efficiency $\sum_{g=1}^{G_l} \sum_{k=1}^{S_{l,g}} R_{l,g,k}(\bar{P})$ for each annular region $l = 1, \ldots, 8$ in the setup of Fig. 2.9 with the system parameters given at the beginning of this section, resulting from the above optimization for both DFT pre-beamforming and approximate BD using PGP with RZFBF and ZFBF precoding. The corresponding sum spectral efficiencies under PFS and max-min fairness scheduling are reported in Table 2.1.

Table 2.1: Sum spectral efficiency (bit/s/Hz) under PFS and max-min fairness scheduling for PGP and approximate BD/DFT.

| Scheme | Approximate BD | DFT based |
| PFS, RZFBF | 1304.4611 | 1067.9604 |
| PFS, ZFBF | 1298.7944 | 1064.2678 |
| MAXMIN, RZFBF | 1273.7203 | 1042.1833 |
| MAXMIN, ZFBF | 1267.2368 | 1037.2915 |

Chapter 3

Joint Spatial Division and Multiplexing: Opportunistic Beamforming and User Grouping

In this chapter, we expand our investigation of the JSDM scheme by focusing on the following topics, relevant for practical implementation.

In Section 3.1, we focus on the non-asymptotic regime of a finite number of base station antennas $M$ and a large number of users in each group. For this regime, we consider the performance of the well-known opportunistic beamforming scheme that serves on each downlink beam the user achieving the maximum Signal to Interference plus Noise Ratio (SINR) on that beam, reminiscent of current Ev-Do and HSDPA "high-data rate" schemes [HT07, AGL+06]. In the regime of a large number of users and a fixed number of antennas, the scaling law of the sum capacity with opportunistic user selection has been widely investigated for uncorrelated channels under random beamforming [SH05] and zero-forcing beamforming [YG06]. The case of correlated channels with random beamforming was also investigated under the assumption that all users have the same channel covariance matrix [ANSH09]. We improve upon these earlier works since we consider different groups of users with different channel covariance matrices. In particular, we show that the optimal sum rate scaling law can be achieved by JSDM with opportunistic user selection for a particular choice of "approximate block diagonalization" pre-beamforming matrix. Under certain special conditions on the correlation matrices of different groups, the sum rate gain with respect to the uncorrelated channel counterpart can be as large as $M \log M$.
Then, in Section 3.2, differently from Chapter 2, we consider the more realistic case of users randomly distributed in the cell region, such that each user is characterized by its own individual AoA and AS. In order to apply the JSDM approach it is necessary to cluster the users into groups. We consider two approaches for the user grouping problem. The first is an extension of the classical K-means algorithm to the Grassmannian manifold. The second consists of a simple quantization of the Grassmannian manifold according to the minimum chordal distance criterion, for a fixed quantizer determined by geometric considerations. We compare these two methods under the opportunistic user selection scheme of Section 3.1, and notice that the simplest fixed quantization method, using blocks of columns of a DFT matrix as quantizer center points, performs best.

Finally, motivated by the work of [HTC12, HCPR12], in Section 3.3 we focus on the regime where the number of users is proportional to the number of antennas. In this regime, the DFT-based fixed quantization user grouping scheme reduces to a simple quantization in the 2-dimensional AoA-AS plane, under the assumption that the BS has a uniform linear antenna array. Since in this large-system regime user selection becomes ineffective due to channel hardening [CBTF09], we propose a probabilistic scheduling algorithm where users within each group are pre-selected at random, based on probabilities derived from the large-system asymptotic analysis. With the proposed scheme, CSIT feedback is reduced since only the pre-selected (scheduled) users need to feed back their effective channels. We show the effectiveness of the proposed simplified downlink scheduling scheme by comparing with the case where all users feed back their CSIT.
3.1 Sum capacity scaling in the large user regime

In this section, we focus on the regime of finite $M$ and large $K$, and obtain an asymptotic expression for the sum capacity when all the users within a group have the same channel covariance. The case $G = 1$ is treated in [ANSH09]. Here, we consider the non-trivial extension to the case $G > 1$. Without loss of fundamental generality, we assume that all groups contain the same number of users $K_g = K' = K/G$, for all $g$. We have:

Theorem 4 The sum capacity of a MU-MIMO downlink system with $M$ antennas at the BS, total transmit power constraint of $P$, and $K$ users divided into $G$ groups of equal size $K' = K/G$, where users have mutually statistically independent channel vectors and users in group $g$ have a common covariance matrix $\mathbf{R}_g$ of rank $r_g$, behaves, for $K' \to \infty$, as

$$R_{\rm sum} = \beta \log\log(K') + \beta \log\frac{P}{\beta} + O(1), \tag{3.1}$$

where $\beta = \min\{M, \sum_{g=1}^{G} r_g\}$ and where $O(1)$ denotes a constant, independent of $K'$.

Theorem 4 is proved by developing upper and lower bounds to the sum capacity. The upper bound analyzes directly the sum capacity of the underlying vector broadcast channel and is given in Appendix A.3. For the lower bound, we consider an explicit achievability strategy based on JSDM with a particular choice of the pre-beamforming matrix and opportunistic user selection in each group. This strategy generalizes the scheme of [SH05] (random beamforming and user selection) to the case where the users are clustered in groups, each of which has a different channel covariance matrix.

Remark 10 It is worthwhile to remark that since the pre-beamforming matrices depend only on the channel second-order statistics, the feedback required from each user is just the SINR achieved on each beam (or, equivalently, the maximum SINR and the index of the beam achieving this maximum [SH05]).
Hence, the achievability scheme has some practical interest since it is reminiscent of current technology based on opportunistic beamforming with Channel Quality Indicator (CQI) (see for example [LLL+10, RJ08]).

From the proof of the upper bound (see Appendix A.3), it is immediate to obtain the following side result:

Corollary 2 The sum capacity of a MU-MIMO downlink system with $M$ antennas, total transmit power constraint of $P$ and $K'$ users per group with common covariance $\mathbf{R}_g$ of rank $r_g$, when the tall unitary condition in Section 2.3.1 is satisfied, is given by

$$R_{\rm sum} = \sum_{g=1}^{G} r_g \left( \log\log K' + \log\frac{P}{\sum_{g'=1}^{G} r_{g'}} \right) + \sum_{g=1}^{G} \log\det\boldsymbol{\Lambda}_g + o(1), \tag{3.2}$$

where the $o(1)$ term goes to zero when $K' \to \infty$.

It is also interesting to notice that, in certain conditions, the sum rate (A.49) can be significantly larger than the sum rate of the corresponding spatially isotropic case [SH05]. For example, in the case where the channel covariance matrices are extremely directional (i.e., $r_g = 1$) but all together form a tall unitary matrix with $G = M$ groups, it is shown in [NA13] that the sum rate gain of the correlated versus the isotropic case is $M \log M + o(1)$. This regime can be approached in the case where users are seen from the BS under a very narrow AS, but their AoAs are sufficiently separated such that they can be resolved by the BS antenna array. Such a result is somewhat surprising since transmit antenna correlation has been traditionally regarded as detrimental, based on the result of [ANSH09], which considered the case of user channels with the same correlation (case $G = 1$).

3.1.1 Achievability

We consider a specific JSDM strategy with PGP (see Section 2.2) by letting the number of downlink data streams per group be given by $b_g = S_g = r^*_g$, where $r^*_g$ denotes the effective rank of $\mathbf{R}_g$, and the MU-MIMO precoding matrix in each group $g$ be simply the identity matrix, i.e., $\mathbf{P}_g = \mathbf{I}_{r^*_g}$ $\forall g$.
In order to allocate the downlink data streams to the users, we select $r^*_g$ out of $K'$ users in each group $g$ according to a max SINR criterion to be specified later. Notice that since the achieved SINR for each user and pre-beamforming beam is a function of the channel matrix realization, this scheme serves $r^*_g$ out of $K'$ users "opportunistically", depending on the channel matrix realization. The pre-beamforming matrices $\mathbf{B}_g$ are designed according to the approximate Block Diagonalization scheme mentioned before (see Section 2.3.2). For any pair of groups $g, g'$, since the $m$-th column of $\mathbf{B}_{g'}$, denoted by $\mathbf{b}_{g'_m}$, is in the null space of the first $r^*_g$ eigenvectors of $\mathbf{R}_g$, we have that

$$\mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g'_m} = \begin{bmatrix} \mathbf{0}_{r^*_g \times 1} \\ \mathbf{x}_{g,g',m} \end{bmatrix}, \tag{3.3}$$

where $\mathbf{x}_{g,g',m}$ is some not necessarily zero vector of dimension $r_g - r^*_g$. Notice that when exact block diagonalization is possible and we choose $r^*_g = r_g$, then $\mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g'_m} = \mathbf{0}_{r_g \times 1}$.

The users $g_k$ measure the SINRs with respect to the beamforming vectors in their corresponding group $g$, given by

$$\mathrm{SINR}_{g_k,m} = \frac{|\mathbf{h}_{g_k}^{\mathsf{H}} \mathbf{b}_{g_m}|^2}{\frac{1}{\rho} + \sum_{n \neq m} |\mathbf{h}_{g_k}^{\mathsf{H}} \mathbf{b}_{g_n}|^2 + \sum_{g' \neq g} \|\mathbf{h}_{g_k}^{\mathsf{H}} \mathbf{B}_{g'}\|^2}, \tag{3.4}$$

for $m = 1, \ldots, r^*_g$, where we let $\rho = \frac{P}{\sum_{g=1}^{G} r^*_g}$, assuming that the total transmit power is distributed evenly over all downlink beams.¹ Each user feeds back these SINR values to the BS. On the basis of these CQI values, the BS allocates the user with the maximum SINR on each $m$-th beam of each group $g$. We further assume that users can decode multiple downlink streams in the case they achieve the maximum on more than one beam, and therefore are allocated multiple downlink beams.² With this type of user selection, the achievable sum rate of group $g$ is given by

$$R_g = \sum_{m=1}^{r^*_g} \mathbb{E}\left[ \log\left( 1 + \max_{1 \leq k \leq K'} \mathrm{SINR}_{g_k,m} \right) \right]. \tag{3.5}$$

¹ Notice that such SINR is easily and accurately measured by including downlink pilot symbols in the downlink streams passing through the pre-beamforming matrix, as currently done in opportunistic beamforming schemes [VDL02], [HT07]. Nevertheless, in this work we assume that this information is perfectly known at the users and can be ideally fed back to the BS. This corresponds to the ideal channel state information assumption, widely used in most information theoretic analyses of the Gaussian vector broadcast channel [CS03, WSS06, VJG03, VDL02].

² Following [SH05], it is well known that if each user feeds back just its maximum SINR and the index of the corresponding beam, the achievable sum rate in the limit of large $K$ is unchanged. Hence, instead of $r^*_g$ real numbers, the CQI feedback can be reduced to one real number and one integer beam index.

In order to find the scaling of the sum rate expression (3.5) for large $K'$, we consider the extremal statistics of $\mathrm{SINR}_{g_k,m}$, i.e., we study the distribution of the random variable $\max_{1 \leq k \leq K'} \mathrm{SINR}_{g_k,m}$. For this purpose, we find the distribution of a single term $\mathrm{SINR}_{g_k,m}$, whose CDF is given by $F(x) = 1 - P(\mathrm{SINR}_{g_k,m} > x)$. Define the quantity

$$\begin{aligned}
Z_{g_k,m} &= |\mathbf{w}_{g_k}^{\mathsf{H}} \boldsymbol{\Lambda}_g^{1/2} \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_m}|^2 - x \left( \frac{1}{\rho} + \sum_{n \neq m} |\mathbf{w}_{g_k}^{\mathsf{H}} \boldsymbol{\Lambda}_g^{1/2} \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_n}|^2 + \sum_{g' \neq g} \|\mathbf{w}_{g_k}^{\mathsf{H}} \boldsymbol{\Lambda}_g^{1/2} \mathbf{U}_g^{\mathsf{H}} \mathbf{B}_{g'}\|^2 \right) \\
&= \mathbf{w}_{g_k}^{\mathsf{H}} \mathbf{A}'_{g,m} \mathbf{w}_{g_k} - x\, \mathbf{w}_{g_k}^{\mathsf{H}} \mathbf{A}''_{g,m} \mathbf{w}_{g_k} - \frac{x}{\rho},
\end{aligned} \tag{3.6}$$

where in (3.6) we define the matrices

$$\mathbf{A}'_{g,m} = \boldsymbol{\Lambda}_g^{1/2} \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_m} \mathbf{b}_{g_m}^{\mathsf{H}} \mathbf{U}_g \boldsymbol{\Lambda}_g^{1/2}, \tag{3.7}$$

$$\mathbf{A}''_{g,m} = \sum_{n \neq m} \boldsymbol{\Lambda}_g^{1/2} \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_n} \mathbf{b}_{g_n}^{\mathsf{H}} \mathbf{U}_g \boldsymbol{\Lambda}_g^{1/2} + \sum_{g' \neq g} \boldsymbol{\Lambda}_g^{1/2} \mathbf{U}_g^{\mathsf{H}} \mathbf{B}_{g'} \mathbf{B}_{g'}^{\mathsf{H}} \mathbf{U}_g \boldsymbol{\Lambda}_g^{1/2}. \tag{3.8}$$

Since the event $\mathrm{SINR}_{g_k,m} > x$ coincides with the event $Z_{g_k,m} > 0$, we can write the SINR CDF as

$$F(x) = 1 - P(Z_{g_k,m} > 0). \tag{3.9}$$

Following the analysis of [SH05], we get

$$P(Z_{g_k,m} > 0) = \frac{1}{2\pi j} \int_{-\infty}^{\infty} \frac{e^{-(j\omega + c)x/\rho}}{j\omega + c} \, \frac{1}{\prod_{i=1}^{r_g} \left( 1 - (j\omega + c)\mu_{g,m,i}(x) \right)} \, d\omega, \tag{3.10}$$

where $\{\mu_{g,m,i}(x) : i = 1, \ldots, r_g\}$ are the eigenvalues of $\mathbf{A}_{g,m}(x) = \mathbf{A}'_{g,m} - x\mathbf{A}''_{g,m}$. In order to proceed further, we need to consider these eigenvalues.
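The per-beam SINR of (3.4) and the max-SINR allocation used in this subsection can be sketched as follows. This is a minimal illustration only: the function names, the toy channel vectors and the toy beams are hypothetical, vectors are stored as plain lists of complex numbers, and the inter-group term $\|\mathbf{h}^{\mathsf{H}} \mathbf{B}_{g'}\|^2$ is computed as a sum over the individual columns of the other groups' pre-beamforming matrices.

```python
def inner(h, b):
    """Inner product h^H b for vectors stored as lists of numbers."""
    return sum(x.conjugate() * y for x, y in zip(h, b))


def sinr(h, own_beams, m, other_beams, rho):
    """SINR of (3.4) for a user with channel h on beam m of its own group."""
    signal = abs(inner(h, own_beams[m])) ** 2
    intra = sum(abs(inner(h, b)) ** 2
                for n, b in enumerate(own_beams) if n != m)
    inter = sum(abs(inner(h, b)) ** 2 for b in other_beams)
    return signal / (1.0 / rho + intra + inter)


def allocate(channels, own_beams, other_beams, rho):
    """For each beam, return the index of the user with the largest SINR."""
    users = range(len(channels))
    return [max(users,
                key=lambda k: sinr(channels[k], own_beams, m, other_beams, rho))
            for m in range(len(own_beams))]


# Toy example (hypothetical numbers): 3 antennas, 2 beams in the own group,
# 1 beam belonging to another group, 3 candidate users.
own_beams = [[1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j]]
other_beams = [[0j, 0j, 1 + 0j]]
channels = [
    [1 + 0j, 0.1 + 0j, 0j],   # strong on beam 0
    [0.1 + 0j, 1 + 0j, 0j],   # strong on beam 1
    [1 + 0j, 0j, 1 + 0j],     # strong on beam 0 but hit by inter-group leakage
]
rho = 10.0

selected = allocate(channels, own_beams, other_beams, rho)
print(selected)
```

In this toy case the third user has the same signal power on beam 0 as the first user, yet it is not selected because of the leakage from the other group's beam, which is exactly the effect the inter-group term in the denominator of (3.4) captures.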
Without loss of generality, we assume the ordering $\mu_{g,m,1}(x) \geq \ldots \geq \mu_{g,m,r_g}(x)$. We have the following results:

Lemma 2 The maximum eigenvalue $\mu_{g,m,1}(x)$ of $\mathbf{A}_{g,m}(x)$ is strictly positive $\forall\, x \geq 0$.

Proof Notice that $\mathbf{A}'_{g,m}$ in (3.7) has rank equal to 1 and $\mathbf{A}''_{g,m}$ in (3.8) has rank at most $r_g - 1$. This is because $\mathbf{A}''_{g,m}$ is the sum of $r^*_g - 1$ rank-1 matrices and the matrix $\sum_{g' \neq g} \boldsymbol{\Lambda}_g^{1/2} \mathbf{U}_g^{\mathsf{H}} \mathbf{B}_{g'} \mathbf{B}_{g'}^{\mathsf{H}} \mathbf{U}_g \boldsymbol{\Lambda}_g^{1/2}$, which has rank at most $r_g - r^*_g$ because of (3.3). Since $\mathbf{A}''_{g,m}$ is of dimension $r_g \times r_g$ and has rank at most $r_g - 1$, it has a non-trivial nullspace of dimension at least 1, meaning we can find a vector $\mathbf{q}$ such that $\mathbf{A}''_{g,m}\mathbf{q} = \mathbf{0}$. In order to prove the lemma, we first prove that

$$\mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_m} \notin \mathrm{Span}\left\{ \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_n} : n \neq m,\ \ \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g'_n} : g' \neq g,\ n = 1, \ldots, r^*_{g'} \right\}. \tag{3.11}$$

In order to see this, we write $\mathbf{U}_g = [\mathbf{U}^*_g, \mathbf{U}'_g]$, where $\mathbf{U}^*_g$ is of rank $r^*_g$ and $\mathbf{U}'_g$ is of rank $r_g - r^*_g$. Let $\mathbf{B}_g = [\mathbf{b}_{g_1}\ \mathbf{b}_{g_2}\ \ldots\ \mathbf{b}_{g_{r^*_g}}]$, of rank $r^*_g$ by construction. Then, we have

$$\mathbf{U}_g^{\mathsf{H}} \mathbf{B}_g = \begin{bmatrix} \mathbf{U}_g^{*\mathsf{H}} \mathbf{B}_g \\ \mathbf{U}_g^{\prime\mathsf{H}} \mathbf{B}_g \end{bmatrix},$$

where the upper part $\mathbf{U}_g^{*\mathsf{H}} \mathbf{B}_g$ has rank $r^*_g$. Reasoning by contradiction, let us assume that (3.11) is false. Then, there exist coefficients $\{\alpha_{g_n} : n \neq m\}$ and $\{\beta_{g'_n} : g' \neq g,\ n = 1, \ldots, r^*_{g'}\}$ such that

$$\mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_m} = \sum_{n \neq m} \alpha_{g_n} \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_n} + \sum_{g' \neq g} \sum_{n=1}^{r^*_{g'}} \beta_{g'_n} \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g'_n}.$$

Recalling (3.3), we have that the second term in the right-hand side of the above equality takes on the form

$$\sum_{g' \neq g} \sum_{n=1}^{r^*_{g'}} \beta_{g'_n} \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g'_n} = \begin{bmatrix} \mathbf{0}_{r^*_g \times 1} \\ \mathbf{z} \end{bmatrix},$$

where $\mathbf{z}$ is some non-zero vector of dimension $r_g - r^*_g$. Since the upper part of $\mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_m}$, formed by the first $r^*_g$ components $\mathbf{U}_g^{*\mathsf{H}} \mathbf{b}_{g_m}$, is non-zero, then it must be

$$\mathbf{U}_g^{*\mathsf{H}} \mathbf{b}_{g_m} = \sum_{n \neq m} \alpha_{g_n} \mathbf{U}_g^{*\mathsf{H}} \mathbf{b}_{g_n}.$$
However, this is not possible since it contradicts the fact that $\mathbf{U}_g^{*\mathsf{H}} \mathbf{B}_g$ has rank $r^*_g$. Therefore, we conclude that (3.11) holds.

Now, choosing $\boldsymbol{\Lambda}_g^{1/2}\mathbf{q}$ to be a unit vector in the orthogonal complement of $\mathrm{Span}\{\mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_n} : n \neq m,\ \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g'_n} : g' \neq g,\ n = 1, \ldots, r^*_{g'}\}$ and such that $\mathbf{q}^{\mathsf{H}} \boldsymbol{\Lambda}_g^{1/2} \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_m} \neq 0$, we have that

$$0 < \mathbf{q}^{\mathsf{H}} \boldsymbol{\Lambda}_g^{1/2} \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_m} \mathbf{b}_{g_m}^{\mathsf{H}} \mathbf{U}_g \boldsymbol{\Lambda}_g^{1/2} \mathbf{q} \leq \max_{\boldsymbol{\Lambda}_g^{1/2}\mathbf{q}} \mathbf{q}^{\mathsf{H}} \boldsymbol{\Lambda}_g^{1/2} \mathbf{U}_g^{\mathsf{H}} \mathbf{b}_{g_m} \mathbf{b}_{g_m}^{\mathsf{H}} \mathbf{U}_g \boldsymbol{\Lambda}_g^{1/2} \mathbf{q} \triangleq \mu_{g,m,1}(x),$$

implying $\mu_{g,m,1}(x) > 0$ for all $x \geq 0$.

Lemma 3 The eigenvalues $\mu_{g,m,2}(x), \ldots, \mu_{g,m,r_g}(x)$ are non-positive $\forall\, x \geq 0$.

Proof Denoting by $\lambda_i(\mathbf{A}'_{g,m})$ and $\lambda_i(\mathbf{A}''_{g,m})$ the $i$-th largest eigenvalues of $\mathbf{A}'_{g,m}$ and $\mathbf{A}''_{g,m}$, Weyl's inequality [Fra12] yields

$$\mu_{g,m,i}(x) \leq \lambda_i(\mathbf{A}'_{g,m}) - x\lambda_{r_g}(\mathbf{A}''_{g,m}) \leq 0 - x\lambda_{r_g}(\mathbf{A}''_{g,m}) \leq 0, \tag{3.12}$$

where we used the fact that $\mathbf{A}'_{g,m}$ has rank 1, therefore all its eigenvalues for $i = 2, \ldots, r_g$ are zero. This implies that $\mu_{g,m,i}(x) \leq 0$, $\forall\, i > 1$.

Now, we use Cauchy's residue theorem in order to calculate the contour integral in (3.10). Since the eigenvalues $\mu_{g,m,2}, \ldots, \mu_{g,m,r_g}$ are non-positive, there is a single pole in the right half of the complex plane, namely the one associated with $\mu_{g,m,1}(x)$, which is the only one contributing to the integral. After some straightforward algebra, using (3.10) in (3.9) we obtain the SINR CDF as

$$F(x) = 1 - \frac{e^{-\frac{x}{\rho\mu_{g,m,1}(x)}}}{\prod_{i=2}^{r_g} \left( 1 - \frac{\mu_{g,m,i}(x)}{\mu_{g,m,1}(x)} \right)}. \tag{3.13}$$

From well known facts in extreme value theory [SH05, ANSH09], we have that $\max_{1 \leq k \leq K'} \mathrm{SINR}_{g_k,m}$ behaves as $g_\infty \log K' + O(\log\log K')$ as $K' \to \infty$, where

$$g_\infty = \lim_{x \to \infty} g(x), \tag{3.14}$$

and $g(x) = \frac{1 - F(x)}{f(x)}$ is the growth function of the CDF $F(x)$ with corresponding PDF $f(x)$ (see [SH05, Appendix A]). In Appendix A.4, we show that $g_\infty = \rho\mu^\infty_{g,m,1}$, where $\mu^\infty_{g,m,1} = \lim_{x \to \infty} \mu_{g,m,1}(x)$ is a bounded positive constant.
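To build intuition for the $g_\infty \log K'$ growth, consider the simplest special case, assumed here purely for illustration: the product term in (3.13) is absent and $\mu_{g,m,1}$ does not depend on $x$, so that $F(x) = 1 - e^{-x/s}$ with a constant scale $s$ standing in for $\rho\mu^\infty_{g,m,1}$. Then the growth function is $g(x) = s$ for all $x$, and the mean of the maximum of $K'$ i.i.d. draws equals $s H_{K'}$, where $H_{K'}$ is the $K'$-th harmonic number, which grows as $\log K'$ plus Euler's constant. The sketch below compares the exact value with the leading term; the function name and the value of `scale` are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def expected_max_exponential(scale, num_users):
    """Exact mean of the maximum of num_users i.i.d. exponentials of mean scale.

    Uses the identity E[max] = scale * H_K, with H_K the K-th harmonic number.
    """
    return scale * sum(1.0 / k for k in range(1, num_users + 1))


scale = 2.0  # hypothetical value standing in for rho * mu_inf

for num_users in (10, 100, 1000, 10000):
    exact = expected_max_exponential(scale, num_users)
    leading = scale * math.log(num_users)
    print(num_users, round(exact, 3), round(leading, 3), round(exact / leading, 4))
```

The ratio in the last column approaches 1 slowly, which matches the statement that the maximum SINR grows as $g_\infty \log K'$ up to lower-order terms. The general CDF (3.13) is not exponential, so this sketch only illustrates the limiting behavior, not the exact finite-$K'$ statistics.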
As a result, the sum rate for a group $g$ behaves as

$$R_g = \sum_{m=1}^{r^*_g} \log\left( \rho\mu^\infty_{g,m,1} \log(K') \right) + o(1) = r^*_g \log\rho + r^*_g \log\log K' + \sum_{m=1}^{r^*_g} \log\mu^\infty_{g,m,1} + o(1)$$

as $K' \to \infty$. Summing over $g$, we arrive at the asymptotic formula for the achievable sum rate

$$R_{\rm sum} = \sum_{g=1}^{G} r^*_g \log\rho + \sum_{g=1}^{G} r^*_g \log\log K' + O(1). \tag{3.15}$$

When $\sum_{g=1}^{G} r_g \leq M$, it is possible to choose $r^*_g = r_g$ such that the above achievable sum rate matches (in the leading terms) the upper bound (A.49). If $\sum_{g=1}^{G} r_g > M$, we can choose $r^*_g$ such that $\sum_{g=1}^{G} r^*_g = M$, such that again the achievable rate matches, in the leading terms, the upper bound (A.51). Hence, in all cases, we have $\sum_{g=1}^{G} r^*_g = \min\{M, \sum_{g=1}^{G} r_g\} = \beta$, such that Theorem 4 is proved.

3.2 User Grouping

In reality, users do not come naturally partitioned into groups with exactly the same covariance matrix. In order to exploit the JSDM approach effectively, the user population must be partitioned into groups according to the following qualitative principles: 1) users in the same group have channel covariance eigenspaces spanning (approximately) a given common subspace, which characterizes the group; 2) the subspaces of groups served on the same time-frequency slot (transmission resource) by JSDM must be (approximately) mutually orthogonal, or at least have trivial intersection.

We specifically consider the case of a BS equipped with a uniform linear array. According to the single scattering ring model (see Section 2.1), the channel covariance matrix for a user located at AoA $\theta$ with AS $\Delta$ has $(m,p)$-th elements

$$[\mathbf{R}]_{m,p} = \frac{1}{2\Delta} \int_{-\Delta + \theta}^{\Delta + \theta} e^{-j2\pi D(m-p)\sin(\alpha)} \, d\alpha, \tag{3.16}$$

where $\lambda D$ denotes the minimum distance between the BS antenna elements. Here we assume each user $k$ is characterized by the pair $(\theta_k, \Delta_k)$ of its AoA and AS. As before, we assume that the BS has perfect knowledge of the user channel covariance, which can be accurately learned and tracked since it is constant in time (see Remark 2).
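The covariance entries in (3.16) have no elementary closed form in general, but they are easy to evaluate numerically. The sketch below builds the $M \times M$ matrix with a midpoint-rule approximation of the integral. The function name, the number of integration steps, and the default spacing $D = 0.5$ (half-wavelength elements) are assumptions made for illustration only.

```python
import cmath
import math


def one_ring_covariance(num_antennas, theta, delta, spacing=0.5, steps=2000):
    """Approximate the one-ring covariance (3.16) by the midpoint rule.

    theta and delta are the AoA and AS in radians; spacing is D, the element
    spacing in wavelengths (0.5 is an assumed default, not a source value).
    """
    width = 2.0 * delta / steps
    alphas = [theta - delta + (i + 0.5) * width for i in range(steps)]
    # The (m, p) entry depends only on the lag m - p, so compute lags 0..M-1.
    lag_values = []
    for lag in range(num_antennas):
        acc = sum(cmath.exp(-2j * math.pi * spacing * lag * math.sin(a))
                  for a in alphas)
        lag_values.append(acc * width / (2.0 * delta))
    # Negative lags follow from Hermitian symmetry.
    return [[lag_values[m - p] if m >= p else lag_values[p - m].conjugate()
             for p in range(num_antennas)]
            for m in range(num_antennas)]


R = one_ring_covariance(4, math.radians(30.0), math.radians(10.0))
for row in R:
    print([complex(round(v.real, 3), round(v.imag, 3)) for v in row])
```

The resulting matrix is Hermitian Toeplitz with unit diagonal, as expected from (3.16). For large arrays an eigen-decomposition routine from a numerical library would then give the dominant eigenspace $\mathbf{U}^*_k$ used by the grouping algorithms.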
3.2.1 Algorithm 1: K-means Clustering

K-means clustering is a standard iterative algorithm which aims at partitioning $K$ observations into $G$ clusters such that each observation belongs to the cluster with the nearest mean [YC74, Llo82]. This results in a partition of the observation space into Voronoi cells. In our problem, the $K$ user covariance dominant eigenspaces, i.e., $\{\mathbf{U}^*_k : k = 1, \ldots, K\}$, form the observation space. Hence, in order to apply the K-means principle, we consider the chordal distance between the covariance eigenspaces. Given two matrices $\mathbf{X} \in \mathbb{C}^{M \times p}$ and $\mathbf{Y} \in \mathbb{C}^{M \times q}$, the chordal distance denoted by $d_C(\mathbf{X}, \mathbf{Y})$ is defined by

$$d_C(\mathbf{X}, \mathbf{Y}) = \left\| \mathbf{X}\mathbf{X}^{\mathsf{H}} - \mathbf{Y}\mathbf{Y}^{\mathsf{H}} \right\|_F^2. \tag{3.17}$$

In a similar fashion, we need to define a notion of the mean of (tall) unitary matrices. Given $N$ unitary matrices $\{\mathbf{U}^*_1, \mathbf{U}^*_2, \ldots, \mathbf{U}^*_N\}$, the mean $\bar{\mathbf{U}}^* \in \mathbb{C}^{M \times p}$ is given as [BN02]

$$\bar{\mathbf{U}}^* = \mathrm{eig}_p\left[ \frac{1}{N} \sum_{n=1}^{N} \mathbf{U}^*_n \mathbf{U}^{*\mathsf{H}}_n \right], \tag{3.18}$$

where $\mathrm{eig}_p(\mathbf{X})$ denotes the unitary matrix formed by the $p$ dominant eigenvectors of $\mathbf{X}$. Based on the set of user channel covariance dominant eigenspaces, defined by $\{\mathbf{U}^*_k : k = 1, \ldots, K\}$, and using the distance and mean definitions in (3.17) and in (3.18), respectively, the K-means algorithm can be formulated according to [YC74, Llo82]. The output of the algorithm is a partition of the users into $G$ disjoint groups, and a set of unitary matrices $\{\mathbf{V}_g \in \mathbb{C}^{M \times r^*_g} : g = 1, \ldots, G\}$ for suitable integers $r^*_g$ such that $\sum_{g=1}^{G} r^*_g \leq M$, obtained as the centers (means) of the groups, i.e., defining the subspaces spanning (approximately, according to the chordal distance criterion) the users in each group. For completeness, the K-means algorithm is given below. We denote by $\mathbf{V}^{*(n)}_g$ the group $g$ "mean" obtained by the algorithm at iteration $n$, and by $\mathcal{S}^{(n)}_g$ the set of users belonging to group $g$ at iteration $n$.

• Step 1: Set $n = 0$ and $\mathcal{S}^{(0)}_g = \emptyset$ for $g = 1, \ldots, G$. For $n = 1, 2, 3, \ldots$, randomly choose $G$ different indices from the set $\{1, \ldots, K\}$ and let
$$\mathbf{V}^{*(n)}_g = \mathbf{U}^*_{\pi(g)}, \quad \text{for } g = 1, 2, \ldots, G, \tag{3.19}$$
where $\pi(g)$ returns a random number from the set $\{1, 2, \ldots, K\} \setminus \{\pi(1), \ldots, \pi(g-1)\}$.

• Step 2: For $k = 1, \ldots, K$, compute
$$d_C(\mathbf{U}^*_k, \mathbf{V}^{*(n)}_g) = \left\| \mathbf{U}^*_k \mathbf{U}^{*\mathsf{H}}_k - \mathbf{V}^{*(n)}_g \mathbf{V}^{*(n)\mathsf{H}}_g \right\|_F^2. \tag{3.20}$$

• Step 3: Assign user $k$ to group $g$ such that
$$g = \arg\min_{g'} d_C(\mathbf{U}^*_k, \mathbf{V}^{*(n)}_{g'}), \qquad \mathcal{S}^{(n+1)}_g = \mathcal{S}^{(n)}_g \cup \{k\}. \tag{3.21}$$

• Step 4: For $g = 1, \ldots, G$ and $\forall\, k \in \mathcal{S}_g$ compute
$$\mathbf{V}^{*(n+1)}_g = \mathrm{eig}\left[ \frac{1}{|\mathcal{S}^{(n+1)}_g|} \sum_{k \in \mathcal{S}^{(n+1)}_g} \mathbf{U}^*_k \mathbf{U}^{*\mathsf{H}}_k \right]. \tag{3.22}$$

• Step 5: Compute the total distance at the $n$-th and $(n+1)$-th iteration
$$d^{(n)}_{{\rm tot},C} = \sum_{g=1}^{G} \sum_{k \in \mathcal{S}^{(n)}_g} d_C(\mathbf{U}^*_k, \mathbf{V}^{*(n)}_g). \tag{3.23}$$

• Step 6: If $|d^{(n)}_{{\rm tot},C} - d^{(n+1)}_{{\rm tot},C}| < \epsilon\, d^{(n)}_{{\rm tot},C}$, go to Step 7. Else, increment $n$ by 1 and go to Step 2.³

• Step 7: For $g = 1, \ldots, G$, assign $\mathbf{V}^*_g = \mathbf{V}^{*(n)}_g$, $\mathcal{S}_g = \mathcal{S}^{(n)}_g$.

³ $\epsilon$ is a threshold for stopping the algorithm when the relative difference between the total distances at the previous and current iterations is sufficiently small.

3.2.2 Algorithm 2: fixed quantization

In this case, the group subspaces $\{\mathbf{V}^*_g \in \mathbb{C}^{M \times r^*_g} : g = 1, \ldots, G\}$ are fixed a priori, based on the geometry of the groups and their channel scattering. Such group subspaces act as the representative points of a minimum distance quantizer in the Grassmannian manifold [BN02], where distance in this case is the chordal distance defined in (3.17). Explicitly, we have:

• Step 1: For $g = 1, \ldots, G$ set $\mathcal{S}_g = \emptyset$.

• Step 2: For $k = 1, \ldots, K$, compute the distances
$$d_C(\mathbf{U}^*_k, \mathbf{V}^*_g) = \left\| \mathbf{U}^*_k \mathbf{U}^{*\mathsf{H}}_k - \mathbf{V}^*_g \mathbf{V}^{*\mathsf{H}}_g \right\|_F^2, \tag{3.24}$$
find the minimum distance group index $g = \arg\min_{g'} d_C(\mathbf{U}^*_k, \mathbf{V}^*_{g'})$, and add user $k$ to group $g$, i.e., let $\mathcal{S}_g := \mathcal{S}_g \cup \{k\}$.

It is clear that the performance of JSDM resulting from fixed quantization depends critically on how we choose the group subspaces. We have considered two approaches.
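The two steps of Algorithm 2, together with the chordal distance (3.17), can be sketched in a few lines. This is a minimal illustration under stated assumptions: matrices are stored as lists of rows (one row per antenna), the function names are hypothetical, and the toy subspaces are rank-1 and real-valued only to keep the arithmetic easy to follow.

```python
def projector(X):
    """Return X X^H for an M x p matrix stored as a list of M rows."""
    return [[sum(a * b.conjugate() for a, b in zip(row_i, row_j))
             for row_j in X] for row_i in X]


def chordal_distance(X, Y):
    """Chordal distance (3.17): squared Frobenius norm of X X^H - Y Y^H."""
    PX, PY = projector(X), projector(Y)
    size = len(PX)
    return sum(abs(PX[i][j] - PY[i][j]) ** 2
               for i in range(size) for j in range(size))


def fixed_quantization(user_subspaces, group_subspaces):
    """Assign each user to the group subspace at minimum chordal distance."""
    groups = [[] for _ in group_subspaces]          # Step 1: empty groups
    for k, U in enumerate(user_subspaces):          # Step 2: nearest subspace
        g = min(range(len(group_subspaces)),
                key=lambda gp: chordal_distance(U, group_subspaces[gp]))
        groups[g].append(k)
    return groups


# Toy example with M = 3 antennas and rank-1 subspaces (hypothetical values).
V = [
    [[1.0], [0.0], [0.0]],   # group 0 subspace
    [[0.0], [1.0], [0.0]],   # group 1 subspace
]
U = [
    [[0.8], [0.6], [0.0]],   # user 0, closer to group 0
    [[0.6], [0.8], [0.0]],   # user 1, closer to group 1
    [[1.0], [0.0], [0.0]],   # user 2, exactly in group 0
]

groups = fixed_quantization(U, V)
print(groups)
print(round(chordal_distance(U[0], V[0]), 4))
```

For unit-norm rank-1 subspaces the distance reduces to $2 - 2|\mathbf{u}^{\mathsf{H}}\mathbf{v}|^2$, so user 0 is at distance $2 - 2(0.8)^2 = 0.72$ from group 0 and $2 - 2(0.6)^2 = 1.28$ from group 1. The same `chordal_distance` function is the building block of Steps 2 and 3 of the K-means algorithm.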
The first relies on the fact that, for large $M$, the channel eigenspaces are nearly mutually orthogonal when the channel AoA supports are disjoint (see Theorem 3, Section 2.6). Hence, we choose $G$ AoAs $\theta_g$ and a fixed AS $\Delta$ such that the resulting $G$ intervals $[\theta_g - \Delta, \theta_g + \Delta]$ are disjoint, and compute the eigenspaces corresponding to these artificially constructed covariance matrices using the one-ring scattering model (3.16). This method essentially consists of forming pre-defined "narrow sectors" and associating users to sectors according to minimum chordal distance quantization.

Example 1 Suppose $G = 3$. Choosing $\theta_1 = -45^\circ$, $\theta_2 = 0^\circ$, $\theta_3 = 45^\circ$ and $\Delta = 15^\circ$, we note that the angular supports are disjoint. Letting $\mathbf{R}_1(\theta_1, \Delta)$, $\mathbf{R}_2(\theta_2, \Delta)$ and $\mathbf{R}_3(\theta_3, \Delta)$ denote the covariance matrices obtained by (3.16) for the given AoA and AS, we let $\mathbf{V}^*_g = \mathbf{U}^*_g$ for $g = 1, 2, 3$, where $\mathbf{U}^*_g$ is the $M \times r^*_g$ tall unitary matrix of the $r^*_g$ dominant eigenvectors of $\mathbf{R}_g(\theta_g, \Delta)$ and the effective ranks $r^*_g$ are chosen such that $r^*_1 + r^*_2 + r^*_3 = M$.

A second method to choose $\mathbf{V}^*_g$ consists of maximizing the minimum distance between the group subspaces. Defining

$$d\left( \{\mathbf{V}^*_g : g \in \{1, 2, \ldots, G\}\} \right) = \min_{g \neq g'} d_C(\mathbf{V}^*_g, \mathbf{V}^*_{g'}) \tag{3.25}$$

as the minimum chordal distance of the set of group subspaces $\{\mathbf{V}^*_1, \mathbf{V}^*_2, \ldots, \mathbf{V}^*_G\}$, we wish to find the set for which $d(\{\mathbf{V}^*_g : g \in \{1, 2, \ldots, G\}\})$ is maximized. It is easy to see that, if $\sum_{g=1}^{G} r^*_g = M$, then we can choose $\{\mathbf{V}_g\}$ as disjoint subsets of the columns of a unitary matrix of dimensions $M \times M$ such that all group subspaces are mutually orthogonal and $d(\{\mathbf{V}^*_g : g \in \{1, 2, \ldots, G\}\})$ is maximized. Using the fact that, for large $M$, the eigenvectors of covariance matrices of the type (3.16) are well approximated by the columns of a DFT matrix (see Fact 2, Section 2.6 for details), here we propose to use disjoint blocks of adjacent columns of the $M \times M$ unitary DFT matrix as group subspaces.

Example 2 Suppose again $G = 3$.
Assign $r^*_g = \lfloor \frac{M}{3} \rfloor = r$ and let $\mathbf{F}$ denote the unitary $M \times M$ DFT matrix. Then, we have $\mathbf{V}_g = \mathbf{F}(:, (g-1)r + (1:r))$, i.e., $\mathbf{V}_g$ is formed by taking columns $(g-1)r + 1$ to $(g-1)r + r$ of $\mathbf{F}$.

3.2.3 Numerical results

We present some simulation results to show the performance of the different user grouping algorithms under the pre-beamforming and opportunistic user selection JSDM scheme used in the achievability proof of Theorem 4 (see Section 3.1.1). Having chosen the group eigenspaces $\{\mathbf{V}^*_g\}$ and the set of users $\mathcal{S}_g$ for each group $g$, we obtain the pre-beamforming matrices $\{\mathbf{B}_g\}$ by block diagonalization (see Section 2.3.2 for details), such that $(\mathbf{V}^*_{g'})^{\mathsf{H}} \mathbf{B}_g = \mathbf{0}$ for all $g' \neq g$. Notice that this does not mean that the system has no inter-group interference, since the actual user channel eigenspaces do not coincide exactly with the group subspaces. Within each group, users are allocated to the group beams according to the user selection algorithm used in the achievability of Theorem 4, denoted here as JSDM-GBF-ALL (group beamforming with CQI feedback from all beams). For the sake of comparison, we also considered two alternatives: 1) feeding back only the maximum SINR and the corresponding beam index, as in [SH05] (denoted by JSDM-GBF-MAX); 2) zero-forcing beamforming (ZFBF) for each group with semi-orthogonal user selection (SUS), as in [YG06] (denoted by JSDM-ZFBF-SUS). Notice that the latter requires feeding back the effective channel vectors $\{\mathbf{B}_g^{\mathsf{H}} \mathbf{h}_k : k \in \mathcal{S}_g\}$, and therefore incurs a much larger feedback overhead.

We consider two examples with $M = 8$ and $M = 16$ BS antennas. We fix the total transmit power $P = 10$ dB, allocate equal power per downlink data stream and normalize the noise variance to 1, such that SNR $= P$ in the plots. We set $G = 8$. The angles of arrival for the users are generated randomly between $-60^\circ$ and $60^\circ$ and the angular spreads are generated randomly between $5^\circ$ and $15^\circ$.
For the K-means clustering algorithm, the entire set of user covariances is clustered into $G = 8$ groups. For the fixed quantization algorithm, we choose $\theta \in \{-57.5^\circ, -41.5^\circ, -23^\circ, -7.5^\circ, 7.5^\circ, 23.5^\circ, 41.5^\circ, 57.5^\circ\}$ and $\Delta = 12^\circ$ for choosing our group subspaces, as shown in Example 1. For the DFT-based fixed quantization scheme as in Example 2, we choose $\mathbf{V}^*_g = \mathbf{F}(:, \mathrm{Mod}_M[(g-1)r + (1:2r)])$, where the $\mathrm{Mod}_M$ operation ensures that the column indices are in the set $\{1,2,\ldots,M\}$, and we use $r = 1$ for $M = 8$ and $r = 2$ for $M = 16$. Once the clustering is done, we further separate the groups into two disjoint subsets, referred to as “patterns”, each containing $G/2 = 4$ groups. Groups in the same pattern are served on the same time-frequency slot, while users in different patterns are served in different slots. This partitioning into patterns is needed in order to keep the inter-group interference under control.

[Figure 3.1: Comparison of sum spectral efficiency (bit/s/Hz) vs. number of users for JSDM with DFT-based fixed quantization user grouping and different user selection algorithms. Panels: (a) M = 8, (b) M = 16. Axes: number of users (horizontal), sum rate (vertical). Curves: ZF-DPC-GUS, ZFBF-SUS, ZF-RANDOM-US, JSDM-ZFBF-SUS, JSDM-GBF-ALL, JSDM-GBF-MAX.]

For the K-means clustering algorithm, the partitioning of the group eigenspaces into two disjoint patterns is done such that the sum of the minimum distances of the two patterns is maximized. For the fixed quantization algorithm, the patterns are obtained by considering the geometry of angular separation. In particular, we have the two patterns $\{\mathbf{V}^*_1, \mathbf{V}^*_3, \mathbf{V}^*_5, \mathbf{V}^*_7\}$ and $\{\mathbf{V}^*_2, \mathbf{V}^*_4, \mathbf{V}^*_6, \mathbf{V}^*_8\}$.
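The DFT-based group subspaces and the two patterns described above can be checked numerically. The following is a minimal sketch, not the simulation code behind the figures: the function names are hypothetical, the DFT sign convention is one common choice, and the distance between subspaces is taken to be the squared Frobenius distance between the corresponding projectors, which is an assumption about the exact form of the chordal distance $d_C$.

```python
import numpy as np

def unitary_dft(M):
    """Unitary M x M DFT matrix (orthonormal columns); sign convention is an assumption."""
    n = np.arange(M)
    return np.exp(-2j * np.pi * np.outer(n, n) / M) / np.sqrt(M)

def dft_group_subspace(F, g, r, width):
    """Columns (g-1)r + (1:width) of F, 1-based as in the text, wrapped modulo M."""
    M = F.shape[1]
    cols = ((g - 1) * r + np.arange(width)) % M  # 0-based column indices
    return F[:, cols]

def chordal_distance(U, V):
    """Squared Frobenius distance between the projectors onto span(U) and span(V)."""
    return np.linalg.norm(U @ U.conj().T - V @ V.conj().T, "fro") ** 2

# Setting of the M = 8 example: G = 8 groups, r = 1, each subspace spans 2r columns.
M, G, r = 8, 8, 1
F = unitary_dft(M)
V = {g: dft_group_subspace(F, g, r, 2 * r) for g in range(1, G + 1)}

# Adjacent groups share one DFT column, groups two apart share none.
pattern_1, pattern_2 = [1, 3, 5, 7], [2, 4, 6, 8]
for pattern in (pattern_1, pattern_2):
    for g in pattern:
        for h in pattern:
            if g != h:
                assert np.allclose(V[g].conj().T @ V[h], 0)
```

In this sketch, groups within the same pattern are mutually orthogonal, whereas adjacent groups (for instance 1 and 2) overlap in one column, which is consistent with the need to serve the two patterns on different slots.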
Within each group, a specific user selection algorithm (JSDM-ZFBF-SUS, JSDM-GBF-MAX or JSDM-GBF-ALL) is applied for selecting the users which are served at any given time-frequency slot. Figures 3.1(a) and 3.1(b) show the sum spectral efficiency (in bit/s/Hz) versus the number of users in the system, averaged over the two patterns, with different user selection algorithms for $M = 8$ and $M = 16$ respectively, when DFT-based user grouping is applied. For the sake of comparison, we also show the performance of ZFBF with random user selection (denoted by ZFBF-RANDOM-US), dirty paper coding with greedy user selection [DS05] (denoted by ZF-DPC-GUS) and ZFBF with semi-orthogonal user selection [YG06] (denoted by ZFBF-SUS), where, instead of restricting to JSDM with per-group processing, the selection is performed across all users without grouping, on the basis of the full channel state information (i.e., without multiplication by the pre-beamforming matrices). These performances are shown here to compare how JSDM performs with respect to classical linear beamforming schemes without the structure constraint of fixed pre-beamforming. Notice that these user selection schemes require full channel state feedback from all users, and are therefore typically too costly in terms of feedback to be practical. However, it is interesting to note that JSDM outperforms ZFBF with a “simple-minded” random user selection.

Figure 3.2(a) shows the sum spectral efficiency versus the number of users for different user grouping algorithms with $M = 16$ and JSDM-ZFBF-SUS. Figure 3.2(b) shows analogous results for JSDM-GBF-MAX.

[Figure 3.2: Comparison of sum spectral efficiency (bit/s/Hz) vs. number of users for JSDM-ZFBF-SUS and JSDM-GBF-MAX with different user grouping algorithms for M = 16. Panels: (a) ZFBF-SUS, (b) GBF-MAX. Axes: number of users (horizontal), sum rate (vertical). Curves: FQ-AoA/AS, FQ-DFT, K-means.]
These results indicate that user grouping by DFT-based fixed quantization (indicated in the plots by “FQ”) generally performs better than the other user grouping algorithms considered in this work. Hence, also because of its simplicity, this appears to be the preferred method for practical user grouping.

3.3 Large System Limit

In this section, we focus on the large system limit, i.e., when the number of antennas and the number of users go to infinity with a fixed ratio. Specifically, we modify the system model of Section 2.2 and introduce a scale parameter $N$, such that the BS has $MN$ antennas and serves $KN$ single-antenna users. Then, we consider the system performance for $N \to \infty$. In this limit, we shall see in Section 3.3.1 that the user grouping scheme with DFT-based fixed quantization takes on a very simple form, corresponding to a quantization of the AoA/AS plane. This requires only the knowledge of the AoAs and ASs of the users, instead of the whole covariance matrix. As far as user selection is concerned, we notice from [CBTF09] that in the limit of $N \to \infty$ user selection schemes (max SINR or SUS) become less and less effective because of “channel hardening”, and require CQI feedback or effective channel state feedback from a large number of users, such that the benefit of “opportunistic” user selection is quickly offset by the extra cost of the feedback. Hence, following the idea of [HTC12], we consider a probabilistic user selection scheme where users are selected with a certain probability distribution, which is optimized in order to maximize the system ergodic sum rate. In this way, only the users actually scheduled for transmission have to feed back their channel state information, in line with the observations made in [RJ08].
3.3.1 DFT-based user grouping in the large system limit

When the number of antennas $MN$ is large, and the BS is equipped with a uniform linear array, the eigenvectors of the channel covariance can be approximated by a subset of the columns of the DFT matrix (see Fact 2, Section 2.6). Thus, $\mathbf{U}_k$, i.e., the matrix of eigenvectors of the channel covariance of a user $k$ with angle of arrival $\theta_k$ and angular spread $\Delta_k$, is well approximated by a matrix $\mathbf{F}^{u}_k$, formed by a subset of the columns of the DFT matrix $\mathbf{F}$, where the subscript denotes the user index $k$ and the superscript $u$ indicates that this matrix is specific to a user. In a similar fashion, $\mathbf{F}^{\rm grp}_g$ denotes the DFT-based group eigenspace $\mathbf{V}^*_g$ for a particular group $g$. Letting
\[ \mathbf{F} = \left[ \mathbf{f}_{-\frac{MN}{2}+1} \; \mathbf{f}_{-\frac{MN}{2}+2} \; \ldots \; \mathbf{f}_{-1} \; \mathbf{f}_{0} \; \mathbf{f}_{1} \; \ldots \; \mathbf{f}_{\frac{MN}{2}} \right] \]
denote the $MN \times MN$ DFT matrix, the vector $\mathbf{f}_i$ denotes the Fourier vector corresponding to frequency $\frac{i}{MN}$. The matrices $\mathbf{F}^{u}_k$ and $\mathbf{F}^{\rm grp}_g$ are formed by blocks of DFT columns corresponding to adjacent frequencies, i.e., they take on the form $[\mathbf{f}_l \; \mathbf{f}_{l+1} \; \ldots \; \mathbf{f}_u]$, for some interval of DFT frequencies corresponding to the integer indices in $[l,u]$. From the analysis in Section 2.6, we know that $\mathbf{F}^{u}_k$ for a certain $(\theta_k,\Delta_k)$ contains the DFT frequencies with indices in $\mathcal{F}^{u}_k = [l_k, u_k]$, with $l_k = \lfloor -MND\sin(\theta_k+\Delta_k) \rfloor$ and $u_k = \lceil -MND\sin(\theta_k-\Delta_k) \rceil$. For the group eigenspace, we denote the frequency index interval forming $\mathbf{F}^{\rm grp}_g$ by $\mathcal{F}^{\rm grp}_g = \{L_g, L_g+1, \ldots, U_g\}$, for some $L_g, U_g$ suitably defined (see later). Let $\bar{r}^*_k$ and $r^*_g$ denote the number of columns (rank) of $\mathbf{F}^{u}_k$ and $\mathbf{F}^{\rm grp}_g$, respectively. For the sake of analysis, we assume that $r^*_g = r$ for all $g$.
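As a small worked illustration of the index interval $[l_k, u_k]$ defined above, the sketch below evaluates $l_k$ and $u_k$ for given AoA and AS. The function name is hypothetical, and the value $D = 0.5$ (half-wavelength normalized antenna spacing) is an assumption made only for the example; the text does not fix $D$ in this section.

```python
import math

def dft_index_interval(theta_deg, delta_deg, MN, D=0.5):
    """Return (l_k, u_k) with l_k = floor(-MN*D*sin(theta+delta)) and
    u_k = ceil(-MN*D*sin(theta-delta)); angles are given in degrees.
    D = 0.5 is an assumed normalized antenna spacing."""
    l_k = math.floor(-MN * D * math.sin(math.radians(theta_deg + delta_deg)))
    u_k = math.ceil(-MN * D * math.sin(math.radians(theta_deg - delta_deg)))
    return l_k, u_k

# A user at broadside with 10 degrees of angular spread, MN = 64 antennas.
l_k, u_k = dft_index_interval(0.0, 10.0, MN=64)
num_columns = u_k - l_k + 1  # number of DFT columns in F^u_k
```

For a user at broadside the interval is symmetric around the index 0, and moving the AoA away from broadside shifts the whole block of indices, as the formulas for $l_k$ and $u_k$ suggest.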
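Thus, the fixed quantization grouping scheme assigns user $k$ to group $g$ if
\[
\begin{aligned}
g &= \arg\min_{g'} \left\| \mathbf{U}^*_k \mathbf{U}^{*\mathsf{H}}_k - \mathbf{V}^*_{g'} \mathbf{V}^{*\mathsf{H}}_{g'} \right\|_F^2 \\
  &= \arg\min_{g'} \left\| \mathbf{F}^{u}_k \mathbf{F}^{u\mathsf{H}}_k - \mathbf{F}^{\rm grp}_{g'} \mathbf{F}^{{\rm grp}\,\mathsf{H}}_{g'} \right\|_F^2 \\
  &= \arg\min_{g'} \; \bar{r}^*_k + r^*_{g'} - 2 \left\| \mathbf{F}^{u\mathsf{H}}_k \mathbf{F}^{\rm grp}_{g'} \right\|_F^2 \\
  &= \arg\max_{g'} \left\| \mathbf{F}^{u\mathsf{H}}_k \mathbf{F}^{\rm grp}_{g'} \right\|_F^2
\end{aligned}
\tag{3.26}
\]
Letting $N \to \infty$ and normalizing by $1/N$, we obtain
\[
\begin{aligned}
\lim_{N\to\infty} \frac{1}{N} \left\| \mathbf{F}^{u\mathsf{H}}_k \mathbf{F}^{\rm grp}_{g'} \right\|_F^2
 &= \lim_{N\to\infty} \frac{1}{N} \sum_{m \in \mathcal{F}^{u}_k} \sum_{n \in \mathcal{F}^{\rm grp}_{g'}} \left| \frac{1}{MN} \sum_{k=-\frac{MN}{2}+1}^{\frac{MN}{2}} e^{j\frac{2\pi}{MN}(n-m)k} \right|^2 \\
 &= \lim_{N\to\infty} \frac{1}{N} \sum_{m \in \mathcal{F}^{u}_k} \sum_{n \in \mathcal{F}^{\rm grp}_{g'}} \delta_{m,n}
\end{aligned}
\tag{3.27}
\]
where $\delta_{m,n}$ is the Kronecker delta function. For finite $N$, $\sum_{m \in \mathcal{F}^{u}_k} \sum_{n \in \mathcal{F}^{\rm grp}_{g'}} \delta_{m,n}$ gives the number of identical columns in $\mathbf{F}^{\rm grp}_{g'}$ and $\mathbf{F}^{u}_k$. In the limit $N \to \infty$, the term $\frac{1}{N} \sum_{m \in \mathcal{F}^{u}_k} \sum_{n \in \mathcal{F}^{\rm grp}_{g'}} \delta_{m,n}$ reduces to $M \Phi^k_{g'}$, where $\Phi^k_{g'}$ denotes the measure of overlap between the intervals $\left(\frac{l_k}{MN}, \frac{u_k}{MN}\right)$ and $\left(\frac{L_{g'}}{MN}, \frac{U_{g'}}{MN}\right)$. Hence, (3.26) reduces to
\[ g = \arg\max_{g'} \Phi^k_{g'} \tag{3.28} \]
We can further simplify (3.28), leading to a much simpler expression for the user grouping algorithm. For notational convenience, we make the following change of notation: $a_k = \frac{l_k+u_k}{2MN}$, $b_k = \frac{u_k-l_k}{2MN}$, $A_g = \frac{L_g+U_g}{2MN}$, $B_g = \frac{U_g-L_g}{2MN}$. The overlap $\Phi^k_g$ between the intervals $(a_k-b_k, a_k+b_k)$ and $(A_g-B_g, A_g+B_g)$ is given by
\[
\Phi^k_g =
\begin{cases}
\min(a_k+b_k, A_g+B_g) - \max(a_k-b_k, A_g-B_g) & \text{if } \min(a_k+b_k, A_g+B_g) - \max(a_k-b_k, A_g-B_g) > 0 \\
0 & \text{if } \min(a_k+b_k, A_g+B_g) - \max(a_k-b_k, A_g-B_g) \leq 0
\end{cases}
\tag{3.29}
\]
Denote $\Delta_{\rm int} = |b_k - B_g|$. We consider two cases:

• Case 1: $b_k - B_g = \Delta_{\rm int} > 0$. Considering only the case of $\min(a_k+b_k, A_g+B_g) - \max(a_k-b_k, A_g-B_g) > 0$, we have
\[
\begin{aligned}
\Phi^k_g &= \min(a_k+b_k, A_g+B_g) - \max(a_k-b_k, A_g-B_g) \\
 &= \min(a_k+B_g+\Delta_{\rm int}, A_g+B_g) - \max(a_k-B_g-\Delta_{\rm int}, A_g-B_g) \\
 &= 2B_g + \min(a_k+\Delta_{\rm int}, A_g) - \max(a_k-\Delta_{\rm int}, A_g) \\
 &= \begin{cases}
 2B_g + A_g - a_k + \Delta_{\rm int} & A_g < a_k - \Delta_{\rm int} \\
 2B_g & a_k - \Delta_{\rm int} \leq A_g \leq a_k + \Delta_{\rm int} \\
 2B_g + a_k - A_g + \Delta_{\rm int} & A_g > a_k + \Delta_{\rm int}
 \end{cases}
\end{aligned}
\tag{3.30}
\]

• Case 2: $B_g - b_k = \Delta_{\rm int} > 0$.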
Proceeding similarly as in Case 1, we get
\[
\begin{aligned}
\Phi^k_g &= 2b_k + \min(a_k, A_g+\Delta_{\rm int}) - \max(a_k, A_g-\Delta_{\rm int}) \\
 &= \begin{cases}
 2b_k + A_g - a_k + \Delta_{\rm int} & a_k > A_g + \Delta_{\rm int} \\
 2b_k & A_g - \Delta_{\rm int} \leq a_k \leq A_g + \Delta_{\rm int} \\
 2b_k + a_k - A_g + \Delta_{\rm int} & a_k < A_g - \Delta_{\rm int}
 \end{cases}
\end{aligned}
\tag{3.31}
\]
From (3.30) and (3.31), it is easy to see that, for a given $k$, the group that maximizes the overlap is the one whose center is closest to $a_k$:
\[ \arg\max_g \Phi^k_g = \arg\min_g |A_g - a_k| \tag{3.32} \]
Using the expressions for $l_k$ and $u_k$ as $N \to \infty$, we have
\[ a_k = \lim_{N\to\infty} \frac{l_k+u_k}{2MN} = \frac{-D\sin(\theta_k+\Delta_k) + (-D\sin(\theta_k-\Delta_k))}{2} = -D\sin(\theta_k)\cos(\Delta_k), \tag{3.33} \]
reducing (3.32) to
\[ \arg\max_g \Phi^k_g = \arg\min_g |A_g - (-D\sin(\theta_k)\cos(\Delta_k))|. \tag{3.34} \]
Notice that the grouping rule (3.32) with $a_k$ given in (3.33) defines a quantization (partitioning) of the AoA-AS plane with coordinates $\theta$ and $\Delta$, where each user is indicated by a point $(\theta_k,\Delta_k)$.

Example 3 Let us assume $\theta_k \in (-60^\circ, 60^\circ)$, $\Delta_k \in (5^\circ, 15^\circ)$ and $G = 4$. We set $L_g = \lfloor MN\frac{g-1}{G} - \frac{MN}{2} \rfloor$ and $U_g = \lceil MN\frac{g}{G} - \frac{MN}{2} \rceil$. Thus, we have $A_g = \frac{g-\frac{1}{2}}{G} - \frac{1}{2}$. Figure 3.3 shows the partition of the $(\theta_k,\Delta_k)$ space into $G = 4$ groups using (3.34). The different colors indicate the different groups.

3.3.2 Probabilistic user selection

We assume that users are arranged into $K$ co-located “user subgroups” with $N$ users in each location. Notice that this assumption is made for analytical convenience, and corresponds to the quantization of the user spatial distribution into a number of discrete points in the coverage area. In this case, if the user locations correspond, on average, to an area $A = \frac{\text{total coverage area}}{K}$ (m$^2$), then $N/A$ is the average user density (users/m$^2$). For a fixed coverage area, both the number of BS antennas and the user density grow to infinity, such that the total number of users per BS antenna is fixed and equal to $K/M$.

[Figure 3.3: Illustration of user grouping in the large system limit. ‘black’ denotes g = 1, ‘purple’ denotes g = 2 and ‘red’ and ‘blue’ denote g = 3 and 4. Axes: Δ_k (horizontal), Θ_k (vertical).]
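The partition of the AoA-AS plane illustrated in Figure 3.3 follows from the nearest-center rule (3.34) of Section 3.3.1 with the group centers $A_g$ of Example 3. A minimal sketch of that rule is given below; the function names are hypothetical, and $D = 0.5$ is an assumed value of the normalized antenna spacing, used only so that the example produces concrete numbers.

```python
import math

def group_centers(G):
    """Centers A_g = (g - 1/2)/G - 1/2 for g = 1..G, as in Example 3."""
    return [(g - 0.5) / G - 0.5 for g in range(1, G + 1)]

def assign_group(theta_deg, delta_deg, G, D=0.5):
    """Nearest-center rule: pick g minimizing |A_g - a_k|,
    with a_k = -D*sin(theta_k)*cos(Delta_k). D = 0.5 is an assumption."""
    a_k = -D * math.sin(math.radians(theta_deg)) * math.cos(math.radians(delta_deg))
    centers = group_centers(G)
    best = min(range(G), key=lambda i: abs(centers[i] - a_k))
    return best + 1  # groups are numbered from 1 in the text

# Three users with different (AoA, AS) pairs, G = 4 groups.
users = [(50.0, 10.0), (-50.0, 10.0), (10.0, 5.0)]
groups = [assign_group(theta, delta, G=4) for theta, delta in users]
```

Only the pair $(\theta_k, \Delta_k)$ of each user enters the rule, which is the point made in the text: the full covariance matrix is not needed for grouping in this limit.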
Users in each subgroup $k$ are statistically equivalent, with common covariance matrix $\mathbf{R}_k(\theta_k,\Delta_k)$ that depends only on the location (AoA) and local scattering (AS). We define a group as a collection of subgroups, obtained by the application of the user grouping algorithm given in the previous section. The number of subgroups forming a group $g$ is indicated by $K_g$, such that $K = \sum_g K_g$. Defining the $MN \times N$ channel matrix of a user subgroup $g_k$, i.e., the $k$-th subgroup of the $g$-th group, as $\bar{\mathbf{H}}_{g_k}$, the effective channel matrix of group $g$ is $\mathbf{H}_g = [\bar{\mathbf{H}}_{g_1} \ldots \bar{\mathbf{H}}_{g_{K_g}}]$. As a result, the received signal vector for users in group $g$ served by JSDM with PGP can be written as
\[
\mathbf{y}_g =
\begin{bmatrix} \bar{\mathbf{H}}^{\mathsf{H}}_{g_1} \mathbf{B}_g \\ \vdots \\ \bar{\mathbf{H}}^{\mathsf{H}}_{g_{K_g}} \mathbf{B}_g \end{bmatrix}
\mathbf{P}_g \mathbf{d}_g
+ \sum_{g'=1, g' \neq g}^{G}
\begin{bmatrix} \bar{\mathbf{H}}^{\mathsf{H}}_{g_1} \mathbf{B}_{g'} \\ \vdots \\ \bar{\mathbf{H}}^{\mathsf{H}}_{g_{K_g}} \mathbf{B}_{g'} \end{bmatrix}
\mathbf{P}_{g'} \mathbf{d}_{g'}
+ \mathbf{z}_g
\tag{3.35}
\]
Note that $\mathbf{B}_g$ is $MN \times b_g N$, $\mathbf{H}_g$ is $MN \times K_g N$ and $\mathbf{P}_g$ is $b_g N \times S_g N$, i.e., the dimensions of all matrices scale linearly with $N$.

The BS now needs to perform downlink scheduling, i.e., selecting a subset of $S_g N \leq \min(b_g N, K_g N)$ out of $K_g N$ users in order to serve them in a particular time-frequency slot. The scheduling problem is formulated as the maximization of a strictly increasing and concave Network Utility Function $G(\cdot)$, representing some notion of fairness (for example, the class of $\alpha$-fairness functions in [MW00], which yield proportional fairness for $\alpha = 1$, sum rate maximization for $\alpha = 0$, and max-min fairness for $\alpha \to \infty$) over the set of achievable ergodic rates, or “throughputs”. Since users within a subgroup are statistically equivalent, they all have the same throughput.
Denoting the normalized⁴ throughput of a subgroup $k$ as $\bar{R}_k$, i.e., the sum rate of the users in subgroup $k$ divided by $N$, the scheduling problem reduces to solving the following optimization:
\[
\begin{aligned}
\text{maximize} \quad & G(\bar{R}_1, \ldots, \bar{R}_K) \\
\text{subject to} \quad & \bar{R}_k \in \bar{\mathcal{R}} \;\; \forall\, k \in \{1,2,\ldots,K\}
\end{aligned}
\tag{3.36}
\]
where $\bar{\mathcal{R}}$ is the system throughput region. For the sake of simplicity, we consider only ZFBF MU-MIMO precoding with equal power allocated to each downlink data stream, such that $\bar{\mathcal{R}}$ denotes the throughput region obtained using such a scheme.

⁴ Since the number of downlink streams scales linearly with $N$, the sum rate per subgroup also increases linearly in $N$. Then, in order to obtain meaningful limits and a meaningful network utility maximization problem, we divide the sum rate per subgroup by $N$.

With zero forcing precoding, the BS can serve a maximum of $\min\{K_g N, b_g N\}$ users in a given time-frequency slot in group $g$. In realistic scenarios of interest, $K_g > b_g$ for all groups $g$. This condition can always be satisfied by assuming that the system has a sufficiently large number of users, such that the number of co-located subgroups in each group is large enough. Therefore, the BS should select a subset of users not larger than $\sum_{g=1}^{G} b_g N$ to be served at each time slot. In order to find the optimal subset, an exhaustive search over all possible subsets of size less than or equal to $\sum_{g=1}^{G} b_g N$ out of $KN$ users is required. This search is combinatorial and would be infeasible for a large number of users. Furthermore, it requires all users to feed back their effective channel vectors, which may be very costly when the number of users is much larger than the number of possible downlink streams. Hence, we propose a scheme where the BS selects $\gamma_{g_k} N$ users in each $k$-th subgroup of group $g$ at random, independently of the channel matrix realization, where $\gamma_{g_k} \in [0,1]$ is referred to as the “fraction” of active users.
Then, only the selected users feed back their effective channels and the BS serves them using JSDM-ZFBF (i.e., ZFBF in each group, irrespective of the inter-group interference). Notice that, while $\gamma_{g_k}$ is a constant that depends only on the system statistics, the set of active users changes randomly at each slot and is uniformly distributed over all $\binom{N}{\gamma_{g_k} N}$ active user subsets. This implies that all users in the same subgroup have an equal chance of being served, achieving fairness across the users in the same subgroup. Denoting by $P_u$ the per-stream downlink power (same for all streams) and noticing that the number of downlink streams per group is $S_g N = \sum_{k=1}^{K_g} \gamma_{g_k} N$, we have the constraints
\[ S_g = \sum_{k=1}^{K_g} \gamma_{g_k} \leq b_g \quad \forall\, g = 1,\ldots,G \tag{3.37} \]
\[ N P_u \sum_{g=1}^{G} S_g \leq P, \tag{3.38} \]
where $P$ is the total power of the BS. The channel matrices $\bar{\mathbf{H}}_{g_k}$ are now functions of $\gamma_{g_k}$, since their dimension is $MN \times \gamma_{g_k} N$. In order to find an expression for the instantaneous rate of a generic $n$-th active user in subgroup $k$ of group $g$, we need to find a convenient asymptotic expression for its SINR, given by
\[
{\rm SINR}^{n}_{g_k} = \frac{P_u \left| \bar{\mathbf{h}}^{\mathsf{H}}_{g_k,n} \mathbf{B}_g \mathbf{K}_g \mathbf{B}^{\mathsf{H}}_g \bar{\mathbf{h}}_{g_k,n} \right|^2}{1 + \sum_{g'=1, g' \neq g}^{G} \sum_{l=1}^{K_{g'}} \sum_{m=1}^{N} P_u \left| \bar{\mathbf{h}}^{\mathsf{H}}_{g_k,n} \mathbf{B}_{g'} \mathbf{K}_{g'} \mathbf{B}^{\mathsf{H}}_{g'} \bar{\mathbf{h}}_{g'_l,m} \right|^2}
\tag{3.39}
\]
where $\bar{\mathbf{h}}_{g_k,m}$ denotes the $m$-th column of $\bar{\mathbf{H}}_{g_k}$, the ZFBF precoding matrix for group $g$ is given by
\[ \mathbf{P}_g = \zeta_g \mathbf{B}^{\mathsf{H}}_g \mathbf{H}_g \left( \mathbf{H}^{\mathsf{H}}_g \mathbf{B}_g \mathbf{B}^{\mathsf{H}}_g \mathbf{H}_g \right)^{-1} \]
where $\zeta_g$ is a power normalization factor, given by
\[ \zeta_g^2 = \frac{N S_g}{{\rm tr}\left( \left( \mathbf{H}^{\mathsf{H}}_g \mathbf{B}_g \mathbf{B}^{\mathsf{H}}_g \mathbf{H}_g \right)^{-1} \right)}, \tag{3.40} \]
where we used the fact that $\mathbf{B}_g$ is tall unitary (e.g., with the choice $\mathbf{B}_g = \mathbf{F}^{\rm grp}_g$ discussed before), and where we let
\[ \mathbf{K}_g = \zeta_g \mathbf{B}^{\mathsf{H}}_g \mathbf{H}_g \left( \mathbf{H}^{\mathsf{H}}_g \mathbf{B}_g \mathbf{B}^{\mathsf{H}}_g \mathbf{H}_g \right)^{-2} \mathbf{H}^{\mathsf{H}}_g \mathbf{B}_g. \]
(3.41)

It is worthwhile noticing that in the denominator of the SINR expression in (3.39) we find only the contribution of the inter-group interference, since the intra-group interference is completely eliminated by ZFBF precoding. In the limit of $N \to \infty$, the terms ${\rm SINR}^{n}_{g_k}$ for all users in the same subgroup converge to the same deterministic quantity ${\rm SINR}^{o}_{g_k}$, which depends only on the subgroup index $g_k$ [WCDS12, HTC12]. Hence, the achievable normalized throughput for a subgroup $k$ of group $g$ is given by $\bar{R}_{g_k} = \gamma_{g_k} \log(1 + {\rm SINR}^{o}_{g_k})$, reducing the optimization problem (3.36) to
\[
\begin{aligned}
\text{maximize} \quad & G\left( \gamma_{g_k} \log(1 + {\rm SINR}^{o}_{g_k}) \right) \\
\text{subject to} \quad & S_g = \sum_{k=1}^{K_g} \gamma_{g_k} \leq b_g \\
 & 0 \leq \gamma_{g_k} \leq 1 \quad \forall\, g = 1,\ldots,G, \; k = 1,\ldots,K_g
\end{aligned}
\]
with respect to the optimization variables $\{\gamma_{g_k}\}$.

In order to calculate the limiting values $\{{\rm SINR}^{o}_{g_k}\}$ we use the technique of deterministic equivalent approximation pioneered in [WCDS12], and applied to the JSDM setting in Section 2.4. The details of the resulting fixed-point equations that yield the deterministic equivalent approximation are given in Section 3.3.3. The fractions $\gamma_{g_k}$ should be obtained as the solution of the non-convex optimization problem (3.42). Since this problem does not admit a tractable exact solution, we follow in the footsteps of [HTC12] and use a greedy componentwise ascent maximization approach which mimics the well-known greedy user selection for the combinatorial optimization of the finite-dimensional sum rate (see for example [DS05]).

3.3.3 Analysis

Following the analysis technique developed in [WCDS12], the sought SINR limit ${\rm SINR}^{o}_{g_k}$ is given by
\[
{\rm SINR}^{o}_{g_k} = \frac{(\zeta^{o}_g)^2 \, P/S}{1 + \sum_{g'=1, g' \neq g}^{G} (\zeta^{o}_{g'})^2 \, \Upsilon^{o}_{g',g_k} \, P/S},
\tag{3.42}
\]
where $P$ indicates the total transmit power and we define $S = \sum_{g=1}^{G} S_g$ to be the total number of downlink data streams across all groups, normalized by $N$.
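The expressions of $\Upsilon^{o}_{g',g_k}$ and $(\zeta^{o}_g)^2$ are obtained by a sequence of converging approximations for increasing (finite) $N$, given as follows:
\[
\begin{aligned}
\Upsilon_{g',g_k} &= \frac{1}{b_g} \sum_{l=1}^{K_{g'}} \gamma_{g'_l} \frac{n^{o}_{g_k,g'_l}}{(m^{o}_{g'_l})^2} \\
\mathbf{n}'_{g_k} &= \left( \mathbf{I}_{K_{g'}} - \mathbf{J}_{g'} \right)^{-1} \mathbf{v}'_{g_k} \\
\{\mathbf{J}_g\}_{i,j} &= \frac{\gamma_{g_i}}{b_g} \, \frac{{\rm tr}\left( \bar{\mathbf{R}}_{g_i} \mathbf{T}_g \bar{\mathbf{R}}_{g_j} \mathbf{T}_g \right)}{N b_g (m^{o}_{g_j})^2} \\
\{\mathbf{v}'_{g_k}\}_i &= \frac{1}{N b_g} {\rm tr}\left( \bar{\mathbf{R}}_{g'_i} \mathbf{T}_{g'} \mathbf{B}^{\mathsf{H}}_{g'} \mathbf{R}_{g_k} \mathbf{B}_{g'} \mathbf{T}_{g'} \right) \\
(\zeta^{o}_g)^2 &= \frac{S_g}{\Gamma^{o}_g} \\
\Gamma^{o}_g &= \frac{1}{b_g} \sum_{k=1}^{K_g} \gamma_{g_k} \frac{q_{g,k}}{(m_{g_k})^2} \\
\mathbf{q}_g &= \left( \mathbf{I}_{K_g} - \mathbf{J}_g \right)^{-1} \mathbf{v}_g \\
\{\mathbf{v}_g\}_i &= \frac{1}{N b_g} {\rm tr}\left( \bar{\mathbf{R}}_{g_i} \mathbf{T}_g \mathbf{B}^{\mathsf{H}}_g \mathbf{B}_g \mathbf{T}_g \right)
\end{aligned}
\tag{3.43}
\]
where $m^{o}_{g_k}$ is given by the solution of a set of fixed point equations
\[
\begin{aligned}
m^{o}_{g_k} &= \frac{1}{N b_g} {\rm tr}\left( \bar{\mathbf{R}}_{g_k} \mathbf{T}_g \right) \\
\mathbf{T}_g &= \left( \mathbf{I}_{b_g N} + \frac{1}{b_g} \sum_{l=1}^{K_g} \frac{\gamma_{g_l}}{m^{o}_{g_l}} \bar{\mathbf{R}}_{g_l} \right)^{-1},
\end{aligned}
\tag{3.44}
\]
and $\bar{\mathbf{R}}_{g_k} = \mathbf{B}^{\mathsf{H}}_g \mathbf{R}_{g_k} \mathbf{B}_g$ for all $g = 1,\ldots,G$, $k = 1,\ldots,K_g$, $\mathbf{n}'_{g_k} = [n^{o}_{g_k,g'_1}, \ldots, n^{o}_{g_k,g'_{K_{g'}}}]^{\mathsf{T}}$, $\mathbf{q}_g = [q_{g,1}, \ldots, q_{g,K_g}]^{\mathsf{T}}$.

As $N \to \infty$, the covariance matrix $\mathbf{R}_{g_k}$ for the users in the $k$-th subgroup of the $g$-th group can be approximated as
\[ \mathbf{R}_{g_k} = \mathbf{F}_{g_k} \bar{\Lambda}_{g_k} \mathbf{F}^{\mathsf{H}}_{g_k} \tag{3.45} \]
where $\mathbf{F}_{g_k}$ is composed of a subset of columns of the DFT matrix $\mathbf{F}$, and $\bar{\Lambda}_{g_k}$ is a diagonal matrix containing the eigenvalues obtained by using the Toeplitz-Circulant approximation (see Section 2.6 for details). Denoting the AoA and AS for this specific subgroup as $\theta_{g_k}$ and $\Delta_{g_k}$, respectively, we have
\[ \mathbf{F}_{g_k} = \left[ \mathbf{f}_{l_{g_k}} \; \mathbf{f}_{l_{g_k}+1} \; \ldots \; \mathbf{f}_{u_{g_k}} \right] \tag{3.46} \]
with
\[ l_{g_k} = \lfloor -MND\sin(\theta_{g_k}+\Delta_{g_k}) \rfloor, \qquad u_{g_k} = \lceil -MND\sin(\theta_{g_k}-\Delta_{g_k}) \rceil. \]
Also, $\bar{\Lambda}_{g_k}$ is given as (see (2.61))
\[
\bar{\Lambda}_{g_k} = \frac{1}{2\Delta_{g_k}} {\rm diag}\left( \frac{1}{\sqrt{D^2 - (l_{g_k}/MN)^2}}, \frac{1}{\sqrt{D^2 - ((l_{g_k}+1)/MN)^2}}, \ldots, \frac{1}{\sqrt{D^2 - (u_{g_k}/MN)^2}} \right)
\tag{3.47}
\]
Since we choose the group subspaces to be the span of blocks of mutually orthogonal columns of the DFT matrix $\mathbf{F}$, there is no need for BD and we just consider DFT pre-beamforming $\mathbf{B}_{g'} = \mathbf{F}^{\rm grp}_{g'}$.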
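Hence, for large $N$ we have
\[
\begin{aligned}
\mathbf{B}^{\mathsf{H}}_{g'} \mathbf{R}_{g_k} \mathbf{B}_{g'} &= \mathbf{F}^{{\rm grp}\,\mathsf{H}}_{g'} \mathbf{F}_{g_k} \bar{\Lambda}_{g_k} \mathbf{F}^{\mathsf{H}}_{g_k} \mathbf{F}^{\rm grp}_{g'} \\
 &= \frac{1}{2\Delta_{g_k}} {\rm diag}\Bigg( \underbrace{0,\ldots,0}_{L_{g'} - \max(L_{g'},l_{g_k}) - 1}, \frac{1}{\sqrt{D^2 - (\max(L_{g'},l_{g_k})/MN)^2}}, \frac{1}{\sqrt{D^2 - ((\max(L_{g'},l_{g_k})+1)/MN)^2}}, \ldots, \\
 &\qquad\qquad \frac{1}{\sqrt{D^2 - (\min(U_{g'},u_{g_k})/MN)^2}}, \underbrace{0,\ldots,0}_{U_{g'} - \min(U_{g'},u_{g_k}) - 1} \Bigg)
\end{aligned}
\tag{3.48}
\]
In the limit of large $N$, the set $\left\{ \frac{\max(L_{g'},l_{g_k})}{MN}, \frac{\max(L_{g'},l_{g_k})+1}{MN}, \ldots, \frac{\min(U_{g'},u_{g_k})}{MN} \right\}$ corresponds to an interval $(a^{g'}_{g_k}, b^{g'}_{g_k}) \subset (-\frac{1}{2}, \frac{1}{2})$ on the DFT “frequency” axis, with $a^{g'}_{g_k} = \lim_{N\to\infty} \frac{\max(L_{g'},l_{g_k})}{MN}$ and $b^{g'}_{g_k} = \lim_{N\to\infty} \frac{\min(U_{g'},u_{g_k})}{MN}$. Define a function $f(g_k,g',x)$ for $x \in (-\frac{1}{2}, \frac{1}{2})$ corresponding to the terms of the form $\mathbf{B}^{\mathsf{H}}_{g'} \mathbf{R}_{g_k} \mathbf{B}_{g'}$. We have
\[
f(g_k,g',x) =
\begin{cases}
\dfrac{1}{2\Delta_{g_k}} \dfrac{1}{\sqrt{D^2 - x^2}} & x \in (a^{g'}_{g_k}, b^{g'}_{g_k}) \\
0 & \text{elsewhere}
\end{cases}
\tag{3.49}
\]
Based on (3.49), expressions involving the trace of $\mathbf{B}^{\mathsf{H}}_{g'} \mathbf{R}_{g_k} \mathbf{B}_{g'}$ or functions of $\mathbf{B}^{\mathsf{H}}_{g'} \mathbf{R}_{g_k} \mathbf{B}_{g'}$ reduce to one-dimensional integrals over the interval $(-\frac{1}{2}, \frac{1}{2})$. For the sake of clarity, consider the following example:
\[ \lim_{N\to\infty} \frac{1}{N b_g} {\rm tr}(\bar{\mathbf{R}}_{g_k}) = \int_{-\frac{1}{2}}^{\frac{1}{2}} f(g_k,g,x)\,dx. \tag{3.50} \]
Following this observation, we arrive at a much simplified set of equations to calculate ${\rm SINR}^{o}_{g_k}$ in (3.42), given directly in terms of the limit for $N \to \infty$, and not just as a sequence of convergent approximations for increasing $N$ as before.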
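The one-dimensional integrals built from $f(g_k,g',x)$ in (3.49) are easy to evaluate numerically. The sketch below, with hypothetical function names, integrates $f$ by a midpoint rule and compares the result with the antiderivative $\arcsin(x/D)$ of $1/\sqrt{D^2-x^2}$. It assumes $D = 0.5$ and that $\Delta_{g_k}$ in the factor $1/(2\Delta_{g_k})$ is expressed in radians; neither assumption is stated in this section.

```python
import numpy as np

def f_limit(x, delta_rad, a, b, D=0.5):
    """f(g_k, g', x): density 1/(2*Delta*sqrt(D^2 - x^2)) on (a, b), 0 elsewhere.
    D = 0.5 and Delta in radians are assumptions of this sketch."""
    x = np.asarray(x, dtype=float)
    inside = (x > a) & (x < b)
    safe = np.maximum(D**2 - x**2, np.finfo(float).tiny)  # avoid division by zero
    return np.where(inside, 1.0 / (2.0 * delta_rad * np.sqrt(safe)), 0.0)

def integral_midpoint(delta_rad, a, b, D=0.5, n=200_000):
    """Midpoint-rule approximation of the integral of f over (-1/2, 1/2)."""
    x = -0.5 + (np.arange(n) + 0.5) / n
    return f_limit(x, delta_rad, a, b, D).sum() / n

def integral_closed_form(delta_rad, a, b, D=0.5):
    """Same integral using the antiderivative arcsin(x/D)."""
    return (np.arcsin(b / D) - np.arcsin(a / D)) / (2.0 * delta_rad)

# Subgroup with AoA 20 degrees and AS 10 degrees; the interval (a, b) is taken
# equal to the whole support of the subgroup, i.e. the group block covers it.
D = 0.5
theta, delta = np.radians(20.0), np.radians(10.0)
a_full = -D * np.sin(theta + delta)
b_full = -D * np.sin(theta - delta)
full_support_value = integral_closed_form(delta, a_full, b_full, D)
```

Under these assumptions, the integral over the full support equals 1, since $\arcsin(b/D) - \arcsin(a/D) = 2\Delta_{g_k}$, while a group interval that covers only part of the support gives a value below 1.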
In this case, $\Upsilon^{o}_{g',g_k}$ and $(\zeta^{o}_g)^2$ in (3.42) are given by
\[
\begin{aligned}
\Upsilon_{g',g_k} &= \frac{1}{b_g} \sum_{l=1}^{K_{g'}} \gamma_{g'_l} \frac{n^{o}_{g_k,g'_l}}{(m^{o}_{g'_l})^2} \\
\mathbf{n}'_{g_k} &= \left( \mathbf{I}_{K_{g'}} - \mathbf{J}_{g'} \right)^{-1} \mathbf{v}'_{g_k} \\
\{\mathbf{J}_g\}_{i,j} &= \frac{\gamma_{g_i}}{b_g} \, \frac{\int_{-\frac{1}{2}}^{\frac{1}{2}} \frac{f(g_i,g,x) f(g_j,g,x)}{h(g,x)^2}\,dx}{(m^{o}_{g_j})^2} \\
\{\mathbf{v}'_{g_k}\}_i &= \int_{-\frac{1}{2}}^{\frac{1}{2}} \frac{f(g'_i,g',x) f(g_k,g',x)}{h(g',x)^2}\,dx \\
(\zeta^{o}_g)^2 &= \frac{S_g}{\Gamma^{o}_g} \\
\Gamma^{o}_g &= \frac{1}{b_g} \sum_{k=1}^{K_g} \frac{\gamma_{g_k}}{(m_{g_k})^2} q_{g,k} \\
\mathbf{q}_g &= \left( \mathbf{I}_{K_g} - \mathbf{J}_g \right)^{-1} \mathbf{v}_g \\
\{\mathbf{v}_g\}_i &= \int_{-\frac{1}{2}}^{\frac{1}{2}} \frac{f(g_i,g,x)}{h(g,x)^2}\,dx
\end{aligned}
\tag{3.51}
\]
and $m^{o}_{g_k}$ is given by the solution of a set of fixed point integral equations
\[
\begin{aligned}
m^{o}_{g_k} &= \int_{-\frac{1}{2}}^{\frac{1}{2}} \frac{f(g_k,g,x)}{h(g,x)}\,dx \\
h(g,x) &= 1 + \frac{1}{b_g} \sum_{l=1}^{K_g} \frac{\gamma_{g_l}}{m^{o}_{g_l}} f(g_l,g,x).
\end{aligned}
\tag{3.52}
\]
Having an efficient method for calculating the users' SINR for given total power $P$ and user group “fractions” $\{\gamma_{g_k}\}$, we next outline the greedy approach to find a good heuristic solution to the non-convex optimization problem (3.42), where optimization is with respect to the variables $\{\gamma_{g_k}\}$.

Greedy Algorithm for optimizing the user fractions $\gamma_{g_k}$: The greedy algorithm considers incrementing the user fractions in small steps $\delta\gamma$, until the objective function cannot be increased further. We start with $\gamma_{g_k} = 0$ for all subgroups within every group and find $g_k$ such that incrementing the user fraction $\gamma_{g_k}$ by $\delta\gamma$ yields the largest possible increase in the objective function. This procedure is repeated until the objective function cannot be increased further. We denote an iteration by $i$ and, with some abuse of notation, the objective function as $G(\boldsymbol{\gamma})$, where $\boldsymbol{\gamma}$ is the vector of all optimization variables $\{\gamma_{g_k}\}$.

• Step 1: Initialize $i = 0$, $\gamma^{(i)}_{g_k} = 0$ for all $g = 1,\ldots,G$, $k = 1,\ldots,K_g$ and $G(\boldsymbol{\gamma}^{(i)}) = G(\mathbf{0})$.

• Step 2: For $\delta\gamma \ll 1$, set $\boldsymbol{\gamma}_{g_k} = \boldsymbol{\gamma}^{(i)} + \delta\gamma\,\mathbf{e}_{g_k}$, where $\mathbf{e}_{g_k}$ is a vector containing all zeros but a 1 in the position corresponding to the pair $(g,k)$, for all pairs $(g,k)$, $g = 1,\ldots,G$, $k = 1,\ldots,K_g$, such that $\gamma_{g_k} \leq 1$ and $\sum_{k=1}^{K_g} \gamma_{g_k} \leq b_g$ for all $g$. For the pairs $(g,k)$ for which the conditions are not satisfied, set $G(\boldsymbol{\gamma}_{g_k}) = G(\boldsymbol{\gamma}^{(i)})$.
If no such pair can be found, then set $\boldsymbol{\gamma} = \boldsymbol{\gamma}^{(i)}$ and exit the algorithm.

• Step 3: Compute $(\hat{g},\hat{k}) = \arg\max_{g=1,\ldots,G,\; k=1,\ldots,K_g} G(\boldsymbol{\gamma}_{g_k})$ and set $G(\boldsymbol{\gamma}^{(i+1)}) = G(\boldsymbol{\gamma}_{\hat{g}_{\hat{k}}})$ and $\boldsymbol{\gamma}^{(i+1)} = \boldsymbol{\gamma}_{\hat{g}_{\hat{k}}}$.

• Step 4: If $G(\boldsymbol{\gamma}^{(i+1)}) > G(\boldsymbol{\gamma}^{(i)})$, increment $i$ by 1 and go to Step 2, else set $\boldsymbol{\gamma} = \boldsymbol{\gamma}^{(i)}$ and exit the algorithm.

3.3.4 Results

We present some numerical results demonstrating the performance of the simplified user grouping algorithm based on quantization of the AoA-AS plane in conjunction with the proposed probabilistic user selection, for user fractions obtained by greedy optimization as seen before, for different network utility functions $G(\cdot)$. Specifically, we focus on two cases: 1) proportional fairness scheduling (PFS), corresponding to the choice $G(\bar{R}_1,\ldots,\bar{R}_K) = \sum_k \log \bar{R}_k$; 2) sum rate maximization, corresponding to the choice $G(\bar{R}_1,\ldots,\bar{R}_K) = \sum_k \bar{R}_k$. We assume a uniform distribution for the users' angle of arrival $\theta_k \in (-60^\circ, 60^\circ)$ and angular spread $\Delta_k \in (5^\circ, 15^\circ)$, set the number of groups equal to 8 and divide these user groups into two overlapping patterns containing $G = 4$ groups each. Pattern 1 contains the groups 1, 3, 5 and 7, and pattern 2 contains the groups 2, 4, 6 and 8. For pattern 1, we have $A_g = \frac{g-\frac{1}{2}}{G} - \frac{1}{2}$ where $g \in \{1,2,3,4\}$. For pattern 2, we have $A_g = \frac{g}{G} - \frac{1}{2}$ and $g \in \{1,2,3,4\}$. We partition the user population based on their angles of arrival and angular spreads using the simplified user grouping algorithm described in Section 3.3.1. Figures 3.4(a) and 3.4(b) show the quantization regions in the AoA-AS plane.

[Figure 3.4: Partition of the θ-Δ plane into different patterns. Within each pattern, there are different groups. Panels: (a) Pattern 1, (b) Pattern 2. Axes: Δ_k (horizontal), Θ_k (vertical).]

After solving (off-line) for the optimal user fractions, we apply the probabilistic user selection scheme in order to schedule the users within each pattern.
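The off-line computation of the user fractions by the greedy procedure (Steps 1 to 4) can be sketched as follows. This is only an illustration of the control flow: the function names are hypothetical, and the objective used in the demonstration is a toy concave function chosen for readability, not the network utility evaluated through the SINR limit (3.42), which would require solving the fixed point equations (3.52).

```python
import numpy as np

def greedy_user_fractions(objective, K, b, step=0.01, tol=1e-9):
    """Greedy componentwise ascent over the fractions gamma[g][k].
    objective: callable on a list of per-group arrays (hypothetical interface).
    K[g]: number of subgroups in group g; b[g]: stream budget b_g."""
    gamma = [np.zeros(Kg) for Kg in K]             # Step 1: start from all zeros
    current = objective(gamma)
    while True:
        best_val, best_pair = -np.inf, None
        for g, Kg in enumerate(K):                 # Step 2: try every feasible increment
            for k in range(Kg):
                if gamma[g][k] + step > 1 + tol:
                    continue                       # would violate gamma <= 1
                if gamma[g].sum() + step > b[g] + tol:
                    continue                       # would violate sum_k gamma <= b_g
                trial = [v.copy() for v in gamma]
                trial[g][k] += step
                val = objective(trial)
                if val > best_val:                 # Step 3: keep the best candidate
                    best_val, best_pair = val, (g, k)
        # Step 4: stop when no feasible pair exists or the objective does not increase
        if best_pair is None or best_val <= current:
            return gamma, current
        g, k = best_pair
        gamma[g][k] += step
        current = best_val

def toy_objective(gamma, w=((1.0, 0.5), (0.8, 0.2))):
    """Toy stand-in for G(gamma): linear reward minus a quadratic congestion term."""
    total = sum(v.sum() for v in gamma)
    reward = sum(wi * gi for wg, vg in zip(w, gamma) for wi, gi in zip(wg, vg))
    return reward - 0.25 * total**2

gamma_opt, value = greedy_user_fractions(toy_objective, K=[2, 2], b=[1, 1])
```

Note that an objective such as $\sum_k \log \bar{R}_k$ equals $-\infty$ at the all-zero starting point, so a practical implementation would need some care at initialization; the sketch does not address this.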
The two patterns are served in orthogonal time-frequency slots, with equal sharing of the transmission resource.

Figures 3.5(a) and 3.6(a) show the network utility objective function versus $S = \sum_{g=1}^{G} \sum_{k=1}^{K_g} \gamma_{g_k}$ for proportional fairness and sum rate maximization for Pattern 1, respectively. In this example we have $M = 8$, $G = 4$, $b_1 = b_2 = b_3 = b_4 = 2$, $\delta\gamma = 0.01$ and $P = 10$ dB. The optimization is performed by applying the greedy heuristic algorithm while omitting Step 4, in order to find the value of the objective function for increasing $S$ even beyond its maximum, for the sake of illustration. In this case, we terminate the algorithm when no pair $(g,k)$ can be found such that $\gamma_{g_k} \leq 1$ and $\sum_{k=1}^{K_g} \gamma_{g_k} \leq b_g$ for all $g$. Figures 3.5(b) and 3.6(b) show the distribution of the rates in different subgroups under the two considered network utility functions. In these figures, we plot the normalized rates corresponding to a subgroup versus the subgroup index for pattern 1.

[Figure 3.5: Optimization of user subgroups fractions for proportional fairness scheduling in the large system limit, for Pattern 1. G = 4, b_1 = b_2 = b_3 = b_4 = 2, δγ = 0.01 and P = 10 dB. Panels: (a) Objective vs. S, with axes S (horizontal) and g(R) (vertical); (b) Rate distribution, with axes subgroups (horizontal) and distribution of rates (vertical).]

[Figure 3.6: Optimization of user subgroups fractions for sum rate maximization in the large system limit, for Pattern 1. G = 4, b_1 = b_2 = b_3 = b_4 = 2, δγ = 0.01 and P = 10 dB. Panels: (a) Objective vs. S, with axes S (horizontal) and g(R) (vertical); (b) Rate distribution, with axes subgroups (horizontal) and distribution of rates (vertical).]

We notice

[Figure 3.7 panels: (a) PFS, (b) Sum Rate Maximization. Axes: SNR in dB (horizontal), sum rate (vertical). Curves: N = 1, N = 2, N = ∞.]
Figure 3.7: Comparison of sum spectral efficiency (bit/s/Hz) vs.
SNR for JSDM with $M = 64$ and varying $N$, for simplified user grouping and probabilistic user scheduling with different fairness functions.

that the user rate distribution is fair under PFS, whereas for sum rate maximization only a few subgroups have positive rates, leaving many other users completely starving.

In a practical finite-dimensional system, for given user fractions $\{\gamma_{g_k}\}$, the users to be scheduled are selected randomly in the following manner: the BS can transmit a maximum of $b_g N$ independent data streams in each group $g$. At each slot, within each group $g$, the BS generates $b_g N$ i.i.d. random variables $X_1, \ldots, X_{b_g N}$ taking on values from the set of integers $\{0,1,\ldots,K_g\}$ such that $P(X_m = k) = \frac{\gamma_{g_k}}{b_g}$ for all $k \neq 0$ and $P(X_m = 0) = 1 - \sum_{k=1}^{K_g} \frac{\gamma_{g_k}}{b_g}$. A user in the $k$-th subgroup of group $g$ is then served by the $m$-th downlink stream on the current time-frequency slot if $X_m = k$. The next few results demonstrate the effectiveness of the simplified user grouping, the greedy heuristic for optimization of user fractions and the corresponding probabilistic user selection.

[Figure 3.8: CDFs of the normalized subgroup rates for JSDM with M = 64, SNR = 10 dB and varying N for simplified user grouping and probabilistic user scheduling with different fairness functions. Panels: (a) PFS, (b) Sum Rate Maximization. Axes: normalized subgroup rate (horizontal), CDF of rates (vertical). Curves: N = 1, N = 2, N = ∞.]

Figures 3.7(a) and 3.7(b) show the sum rate obtained for PFS and sum rate maximization, when the simplified user grouping algorithm is applied and the optimal user fractions are obtained using the greedy heuristic algorithm of Section 3.3.2. The “sum rate” refers to the normalized sum rate averaged over the patterns. We fix $M = 64$ and compare the finite dimensional simulations (obtained for $N = 1$ and $N = 2$ and denoted
by the “red” and “blue” curves) with the large system approximations (shown by the “black” curve). The finite dimensional simulations differ from those obtained using the large system results because of the inter-group interference, which does not vanish for finite $N$. With increasing $N$, the finite dimensional results will ultimately coincide with the large system limit. Figures 3.8(a) and 3.8(b) show the cumulative distribution of the normalized subgroup rates for $M = 64$ and a fixed SNR $= 10$ dB, with varying $N$. It is apparent that, as $N$ increases, the distribution of the normalized rates for the subgroups approaches that obtained from the large system analysis. We observe that, as expected, the group rates in the case of PFS are all positive, indicating that groups are served with some fairness. Instead, the group rate CDF for the case of sum rate maximization shows a “jump” at zero, indicating the fraction of groups that are given zero rate. In this case, the users in these groups are not served at all, and the system is unfair in favor of a higher total throughput.

Also, as already noticed before, we wish to stress the fact that the proposed probabilistic user selection scheme involves reduced channel state information feedback with respect to the standard greedy user selection, which needs all users to feed back their effective channels. For example, the user selection based methods proposed in Section 3.2.3 require feedback of the order of the total number of users in the system, whereas the proposed scheme requires feedback only from a subset of users (the size of this subset is always less than the number of spatial dimensions available for multiplexing), which are pre-selected based on the user fractions computed using approximations in the large system limit.

Chapter 4

Joint Spatial Division and Multiplexing for mm-Wave channels

Massive MIMO is especially promising for systems operating at millimeter (mm-) Wave frequencies.
Due to the short wavelength, very large arrays can be created with a reasonable form factor: a 100-element linear array is only about 50 cm long at a carrier frequency of 30 GHz. In light of the extremely large bandwidths that are available for commercial use (up to 7 GHz bandwidth in the 60 GHz band, and around 1 GHz at 28 and 38 GHz carrier frequency), massive MIMO systems in the mm-Wave range are ideally suited for high-capacity transmission and thus anticipated to form an important part of 5G systems. While the first commercial mm-Wave products are intended for in-home, short-range communications (e.g., for transmission of uncompressed video) [PG11], the potential of mm-Waves for cellular outdoor has recently been investigated [RSM+13, SWA+13, AWW+13]. Experiments have shown a coverage range of more than 200 m even in non line of sight (NLOS) situations [AWW+13]. Such long-range transmissions require high-gain adaptive antennas, something that massive MIMO implicitly provides.

Joint Spatial Division and Multiplexing (JSDM) can achieve massive-MIMO like gains for FDD systems (or, more generally, for systems that do not make explicit use of channel reciprocity), with the added advantage of a reduced requirement for CSIT. In addition, JSDM lends itself to a hybrid beamforming implementation, where pre-beamforming (which changes slowly in time) may be implemented in the analog RF domain, while the MU-MIMO precoding stage is implemented by standard baseband processing. This approach allows the use of a very large number of antennas with a limited number of baseband-to-RF chains; the latter depends on the number of independent data streams that we wish to send simultaneously to the users.

A major challenge for massive MIMO in the mm-Wave region is the fact that the Doppler shift scales linearly with frequency, and thus the coherence time is an order of magnitude lower than that of comparable microwave systems.
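Both quantitative remarks above (the physical size of the array and the linear scaling of the Doppler shift with carrier frequency) can be verified with a few lines of arithmetic. The sketch below uses hypothetical function names, assumes half-wavelength element spacing for the array (the text does not state the spacing), and uses the standard maximum Doppler shift $f_D = v f_c / c$.

```python
C = 299_792_458.0  # speed of light in m/s

def array_length_m(num_elements, fc_hz, spacing_wavelengths=0.5):
    """Length of a uniform linear array; half-wavelength spacing is an assumption."""
    wavelength = C / fc_hz
    return (num_elements - 1) * spacing_wavelengths * wavelength

def max_doppler_hz(speed_mps, fc_hz):
    """Maximum Doppler shift f_D = v * f_c / c."""
    return speed_mps * fc_hz / C

length_30ghz = array_length_m(100, 30e9)       # 100 elements at 30 GHz
doppler_mmwave = max_doppler_hz(1.0, 30e9)     # pedestrian speed at 30 GHz
doppler_microwave = max_doppler_hz(10.0, 3e9)  # vehicular speed at 3 GHz
```

With these assumptions the array length comes out at roughly half a meter, and a pedestrian user at 30 GHz sees the same maximum Doppler shift as a user moving ten times faster at 3 GHz.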
Thus, massive MIMO systems at mm-Wave frequencies need to be restricted to low-mobility scenarios. For comparable speeds of motion, e.g., at pedestrian speeds (1 m/s), coherence times are of the order of a few ms at mm-Wave frequencies. Since (outdoor) coherence bandwidths of mm-Wave channels are similar to those of microwave channels [RSM+13], the overall challenges of CSI feedback overhead are then comparable to those of higher-mobility (vehicular) microwave massive-MIMO systems. For example, a 30 GHz channel for a user moving at 1 m/s has the same coherence time and bandwidth as a 3 GHz channel for a user moving at 10 m/s. In this work, we explicitly assume the availability of perfect channel state information for simplicity (wherever required). In reality, devoting a certain amount of resource to the training phase would discount the achievable throughput by a certain factor (see Section 2.5).

The performance of JSDM depends on the type of channel statistics. Previous analysis was based on the one-cluster (local scattering) model, which means that the BS “sees” the incoming multi-path components (MPCs) under a very constrained angular range. This allows for an easy division of the users into sets, whose associated MPCs are disjoint in the angular domain, and can thus be separated by the pre-beamformers. However, this model does not represent many important scenarios. For example, in urban environments, high-rise buildings or street canyons can act as important “common clusters” that create spatially correlated MPCs for many users [FMB98], [AGM+06], [TLK+02]. Another important effect, which becomes particularly relevant at mm-Wave frequencies, is channel sparsity: the number of significant MPCs is much lower than that for a microwave system operating in a similar environment.
The low number of MPCs enables a further reduction of the CSIT that has to be fed back, and enables a new “degenerate” variant of JSDM, proposed in this paper and referred to as Covariance-based JSDM, that depends on the channel covariance information only. In fact, it is well known that, as long as the scattering geometry relative to a given user remains unchanged, the fading channel statistics are wide-sense stationary (WSS). In particular, this means that the channel covariance matrix is time-invariant. In a typical scattering scenario, even if a user changes its position by several meters, the channel second order statistics remain unchanged [Rap02, Chapter 4]. Hence, for a user moving at walking speed (1 m/s), the channel fading process is “locally” WSS over a time horizon of several seconds, spanning a very large number of symbol time slots (for example, a 20 MHz OFDM channel has a symbol duration of 4 μs, corresponding to $10^6$ symbols over an interval of 4 s, corresponding to a user position displacement of 4 m). We conclude that it is effectively possible to learn very accurately the channel covariance matrix at the transmitter side, even without requiring very fast CSIT feedback. This makes our scheme particularly interesting for mm-Waves.

In this chapter, we apply the JSDM approach to realistic propagation channels inspired, inter alia, by the recent experimental observations of mm-Wave channels in an urban outdoor environment [AWW+13]. Specifically, our contributions are:

• We identify a new optimization problem related to the application of JSDM to user groups that are coupled by the presence of common scatterers. In this case, nulling the common MPCs by pre-beamforming creates linearly independent user groups which can be served simultaneously, on the same transmission resource (Spatial Multiplexing approach). In contrast, allocating the user groups on orthogonal transmission resources allows all the MPCs to be used to convey signal energy to the users (Orthogonalization approach).
The ranking of these two approaches in terms of total system throughput depends on the operating SNR.

• We generalize the common scatterer problem to the case of many users (or user groups) with partial overlapping of their channel angular spectra (rigorously defined as the Fourier transform of the antenna correlation function, see Section 4.3.1). For this case, we develop two new algorithms for user grouping and pre-beamforming design. The first algorithm (Section 4.3.2) chooses users that fill many angular directions (i.e., it tends to serve fewer users with higher beamforming gain). The second algorithm (Section 4.3.3) maximizes the number of users with at least one mutually non-overlapping set of directions (i.e., it tends to serve more users with lower beamforming gain).

• We propose a new degenerate version of JSDM (Covariance-based JSDM) that provides orthogonalization of the users based only on the channel second-order statistics, and thus does not need feedback of the instantaneous CSIT. We discuss for which type of channels such a reduced complexity scheme would perform well with respect to full JSDM, and show through numerical experiments that, as intuition suggests, covariance-based JSDM works well when the number of users is small with respect to the number of BS antennas and the channels are formed by a few MPCs with small angular spread. Remarkably, this is the case expected in a 5G small-cell system operating at mm-Wave frequencies.

• We illustrate the performance of the proposed user selection and JSDM schemes through various numerical examples, based on multiple clusters of MPCs, and discrete isolated MPCs, obtained from ray tracing in an outdoor campus environment.

• We also show sample performance results in measured propagation channels, from a 28 GHz measurement campaign recently carried out in New York City [AWW+13].
Overall, JSDM with appropriate user selection and, in some relevant cases, also the simple covariance-based JSDM, appears to be a very attractive approach for the implementation of multiuser MIMO downlink schemes in outdoor, small to medium range (10 to 200 m) mm-Wave channels.

The remainder of the chapter is organized as follows: Section 4.1 discusses the models for propagation channels as relevant for our analysis; Section 4.2 considers the application of JSDM in multi-cluster channels. Section 4.3 investigates the novel algorithms for user grouping and selection when the angular spectra of the users are partially overlapping. Section 4.4 provides simulation results for multi-cluster, ray-tracing-based, and measured propagation channels.

4.1 Spatial Channel Models

As we are dealing with a MU-MIMO system, a model for a multiuser, multiantenna channel has to be defined. Generally, MIMO channel models fall into two categories: (i) physical models, and (ii) analytical models [ABB+07]. Physical models describe the physical propagation between transmit array and receive array through the “double-directional impulse response” $h(t,\tau,\theta,\psi)$, where $t$ is the time at which the channel is excited, $\tau$ is the considered delay, and $(\theta,\psi)$ are the angles of departure and arrival, respectively [SMB01]. It is common to assume that the double-directional impulse response arises as the sum of the contributions from discrete MPCs, such that
\[
h(t,\tau,\theta,\psi) = \sum_{p=1}^{\bar{N}(t)} \rho_p e^{j\phi_p}\, \delta(\tau-\tau_p)\, \delta(\theta-\theta_p)\, \delta(\psi-\psi_p), \tag{4.1}
\]
where the number of MPCs $\bar{N}(t)$ may itself be time-varying. Note that the above description neglects the effect of polarization and can be generalized to include diffuse radiation by considering intervals of angles and/or delays for which we have a continuum of components, each carrying infinitesimal scattered energy (for a more detailed discussion see, e.g., [WAE+04]).
Double-directional models are the preferred method for MIMO channel modeling because they are independent of the actual antenna structures, and efficient methods for incorporating realistic large-scale channel variations are available. However, for theoretical analysis of transmission schemes, analytical models are often preferred. These models describe the channel transfer function matrix, i.e., a matrix whose $(i,j)$-th entry is the transfer function from the $j$-th transmit to the $i$-th receive antenna element. The transfer function matrix subsumes the antenna arrays and the actual propagation channel; it is thus a description including all effects, for example, antenna coupling from transmit antenna connector to receive antenna connector. Fortunately, analytical models can be easily derived from double-directional models (though not vice versa).

Specializing to the case of interest in this paper, where the MS has an omni-directional antenna, and the BS is equipped with a uniform linear array, the double directional channel transfer function between a BS antenna element $m$ and the antenna of a user terminal $k$ is given as
\[
h_{mk}(f) = \sum_{p} \rho_{kp}\, e^{j\phi_{kp}}\, e^{-j2\pi f \tau_{kp}}\, e^{-j2\pi D m \sin\theta_{kp}}, \tag{4.2}
\]
where $f$ denotes the subcarrier frequency, and $D \in (0, \tfrac{1}{2}]$ is the spacing between two antenna elements normalized by the carrier wavelength. We focus on the frequency-domain representation of the channel matrix because we assume the use of OFDM [Mol10], which is the modulation of choice of modern cellular and WLAN standards [Gas06]. Furthermore, with respect to (4.1), in (4.2) we have dropped the dependence on $t$ since we make the usual assumption of block fading, for which the channel is locally time-invariant over slots comprising several OFDM symbols. Therefore, the number of MPCs, denoted by $\bar{N}_k$, may depend on the user index $k$ but not explicitly on $t$.
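As a concrete reading of (4.2), the following minimal sketch evaluates $h_{mk}(f)$ for a hand-picked list of MPCs. The tuple layout `(rho, phi, tau, theta)`, the function name and the numerical values are assumptions made for illustration; only the formula itself comes from the text.

```python
import cmath
import math


def channel_coefficient(m, f_hz, paths, spacing=0.5):
    """Evaluate h_mk(f) as in (4.2).

    paths   : list of (rho, phi, tau, theta) tuples, one per MPC
              (amplitude, phase [rad], delay [s], angle [rad]);
              this container layout is a choice made for the sketch.
    spacing : D, the element spacing normalized by the wavelength.
    """
    total = 0j
    for rho, phi, tau, theta in paths:
        total += (rho
                  * cmath.exp(1j * phi)
                  * cmath.exp(-2j * math.pi * f_hz * tau)
                  * cmath.exp(-2j * math.pi * spacing * m * math.sin(theta)))
    return total


# A single MPC arriving from 30 degrees with D = 1/2 (illustrative values).
single_path = [(0.8, 0.3, 10e-9, math.radians(30.0))]
h0 = channel_coefficient(0, 30e9, single_path)
h1 = channel_coefficient(1, 30e9, single_path)
```

For a single path, adjacent antennas differ only by the phase factor $e^{-j2\pi D\sin\theta}$, which for these values is a rotation by $-\pi/2$; the magnitude stays equal to the path amplitude.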
Note that block fading is implicitly assumed in virtually all existing cellular and WLAN standards, based on pilot-aided channel estimation and coherent detection. In addition, small cells operating at mm-Wave frequencies are mainly dedicated to high-throughput nomadic users, for which the channel time variations are typically very slow. For this reason, we shall assume that the channel coefficients $h_{mk}(f)$ are known to the user receiver $k$.$^1$ In contrast, we shall discuss in great detail the required channel state information at the transmitter (CSIT) for the MU-MIMO downlink schemes proposed in this paper.

Footnote 1: The knowledge of CSI at the receiver is commonly achieved in any wireless standard implemented today, and it will also be implemented in mm-Wave standards (e.g., 802.11ad). This is necessary for coherent detection, which is enabled by dedicated pilots that go through the downlink beamforming matrix.

The phase $\phi_{kp}$ depends on the number of wavelengths traveled along the $p$-th path, and even small fluctuations in the transmitter and receiver positions can produce large variations of such phase, especially at mm-Wave frequencies. Here, we adopt the common assumption [Rap02] that the phases $\{\phi_{kp} : p = 1,\ldots,\bar{N}_k\}$ are uniformly distributed on $[0,2\pi]$ and mutually independent. This implies uncorrelated scattering [Bel63], which is a widely accepted assumption in channel modeling. In this case, the space-frequency covariance between $h_{mk}(f_1)$ and $h_{nk}(f_2)$, i.e., the covariance between the channel of antenna element $m$ at frequency $f_1$ and that of antenna element $n$ at frequency $f_2$, is given by
\[
\begin{aligned}
E[h_{mk}(f_1)\, h^{*}_{nk}(f_2)]
&= E\Big[ \sum_{p}\sum_{l} \rho_{kp}\rho^{*}_{kl}\, e^{j(\phi_{kp}-\phi_{kl})}\, e^{-j2\pi(f_1\tau_{kp}-f_2\tau_{kl})}\, e^{-j2\pi D(m\sin\theta_{kp}-n\sin\theta_{kl})} \Big] \\
&= \sum_{p}\sum_{l} \rho_{kp}\rho^{*}_{kl}\, E\big[ e^{j(\phi_{kp}-\phi_{kl})} \big]\, e^{-j2\pi(f_1\tau_{kp}-f_2\tau_{kl})}\, e^{-j2\pi D(m\sin\theta_{kp}-n\sin\theta_{kl})} \\
&= \sum_{p} |\rho_{kp}|^{2}\, e^{-j2\pi D(m-n)\sin\theta_{kp}}\, e^{-j2\pi(f_1-f_2)\tau_{kp}}.
\end{aligned}
\tag{4.3}
\]
In particular, we have the well-known result (common to all uncorrelated scattering channel models) that the channel is wide-sense stationary with respect to frequency, i.e., that the channel spatial covariance is independent of the subcarrier $f$, and the covariance for different subcarriers $f_1$ and $f_2$ depends only on the subcarrier difference $f_1 - f_2$. Furthermore, for uniform linear arrays, we also have that the channel spatial covariance depends only on the spatial difference $D(m-n)$ between the antennas. In particular, letting $M$ denote the number of BS antennas, the $M \times M$ channel spatial covariance of the user channel vector $\mathbf{h}_k(f) = (h_{1k}(f),\ldots,h_{Mk}(f))^{T}$ is given by
\[
\mathbf{R}_k = E[\mathbf{h}_k(f)\,\mathbf{h}_k^{H}(f)] = \sum_{p} |\rho_{kp}|^{2}\, \mathbf{a}(\theta_{kp})\, \mathbf{a}^{H}(\theta_{kp}), \tag{4.4}
\]
where we define the linear array response for angle of arrival $\theta$ as
\[
\mathbf{a}(\theta) =
\begin{pmatrix}
1 \\
e^{-j2\pi D \sin\theta} \\
e^{-j2\pi D\, 2 \sin\theta} \\
\vdots \\
e^{-j2\pi D (M-1) \sin\theta}
\end{pmatrix}. \tag{4.5}
\]

After these general modeling considerations, we now turn to the specific double-directional models occurring most often in practical situations. It is well-established that the MPCs tend to occur in clusters in the delay/angle plane, corresponding to interaction with physical clusters of scatterers$^2$ in the real world. The first, simplest, and still most widely used of such clustered models is the “one-ring” model [Lee73], in which the scatterers are located on a circle around the MS (see Section 2.1). However, measurements have shown that this simple model is mostly applicable in (flat) rural and suburban areas. In metropolitan areas as well as hilly terrains, additional “far” scatterer clusters such as high-rise buildings can occur. While the local clusters “belong” to a particular user (see Section 2.1), the far clusters can contribute to the MPCs of many different users (see Section 4.2), since they are “visible” to all of them [AGM+06].
Further clustering can occur in scenarios where wave guiding through street canyons is dominant; this is especially important if the BS antenna is below rooftop [TLK+02].

An important feature of propagation at mm-Wave frequencies is a pronounced sparsity of the double-directional impulse response [RSM+13]. This arises from two major effects: (i) the specular reflection coefficient at (inevitably) rough house surfaces decreases, while more power is shifted into diffuse components. Consequently, only MPCs that undergo one or two reflections carry significant power (as opposed to microwaves, which often can have significant power even after 5 or more reflections); (ii) diffraction becomes less prominent, so that MPCs that propagate “around a corner” are suppressed. Thus, while at microwave frequencies the number of relevant MPCs can easily reach 40 (for each user position), that number is often less than 10 at millimeter waves.

Footnote 2: Strictly speaking, the scatterers should be called “interacting objects (IOs)”, since the interaction of the MPCs with the objects might not only be diffuse scattering but also specular reflection or diffraction. However, the name “scatterers” for such IOs is widely used in the literature, so that we follow this convention.

Consider again the channel model in (4.2) and assume that all paths correspond approximately to the same delay (i.e., $\tau_{kp} = \tau_k\ \forall p$) and that the $\bar{N}_k$ paths are divided into $N'_k$ groups of $N \gg 1$ paths each, such that the paths in the $i$-th cluster have approximately the same angle of arrival $\theta_{kp} = \alpha_{ki}$. Hence, we can write
\[
h_{mk} = \sum_{i=1}^{N'_k} \sum_{p=(i-1)N}^{iN-1} \rho_{kp}\, e^{j\phi_{kp}}\, e^{-j2\pi D m \sin\alpha_{ki}}. \tag{4.6}
\]
Since $N$ is large, by the Central Limit Theorem [GS92] we can assume that $\sum_{p=(i-1)N}^{iN-1} \rho_{kp} e^{j\phi_{kp}}$ is complex Gaussian circularly symmetric. It follows that $\mathbf{h}_k$ is a zero-mean complex Gaussian vector with given covariance matrix $\mathbf{R}_k$.
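The structure of the covariance matrix in (4.4), built from the array response (4.5), can be checked numerically. The sketch below uses plain nested lists instead of a linear-algebra library, and the two-path example values are arbitrary; both are choices made here for illustration only.

```python
import cmath
import math


def steering_vector(theta, num_antennas, spacing=0.5):
    """Linear array response a(theta) of (4.5), as a list of length M."""
    return [cmath.exp(-2j * math.pi * spacing * m * math.sin(theta))
            for m in range(num_antennas)]


def spatial_covariance(paths, num_antennas, spacing=0.5):
    """R_k of (4.4): sum over MPCs of |rho|^2 * a(theta) a(theta)^H.

    paths: list of (rho, theta) pairs; this layout is an assumption
    of the sketch, not notation from the text.
    """
    cov = [[0j] * num_antennas for _ in range(num_antennas)]
    for rho, theta in paths:
        a = steering_vector(theta, num_antennas, spacing)
        power = abs(rho) ** 2
        for m in range(num_antennas):
            for n in range(num_antennas):
                cov[m][n] += power * a[m] * a[n].conjugate()
    return cov


# Two discrete MPCs (illustrative amplitudes and angles), M = 4 antennas.
example_paths = [(1.0, 0.0), (0.5, math.radians(30.0))]
R = spatial_covariance(example_paths, 4)
```

With these inputs every diagonal entry equals the total path power $\sum_p |\rho_{kp}|^2$, and the entry $(m,n)$ depends only on $m-n$, i.e., the matrix is Hermitian Toeplitz, as stated after (4.3).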
Going to a diffuse scattering limit, where we assume $N'_k \to \infty$, with uniform scattering energy $O(1/N'_k)$ and angles $\alpha_{ki}$ spanning the interval $[\theta_k - \Delta_k, \theta_k + \Delta_k]$, we arrive at the one-cluster scattering model [Lee73] with $(m,n)$ channel covariance elements
\[
[\mathbf{R}_k]_{m,n} = \frac{1}{2\Delta_k} \int_{\theta_k-\Delta_k}^{\theta_k+\Delta_k} e^{-j2\pi D (m-n) \sin\alpha}\, d\alpha. \tag{4.7}
\]
We focus on JSDM with PGP (Per Group Processing$^3$) and use the approximate BD approach to design the pre-beamforming matrix (see Section 2.3.2). In order to harness the spatial multiplexing in each group, we use for each group the classical zero-forcing MU-MIMO precoding.

Footnote 3: This is not the only option for JSDM, but it is the most attractive one since it requires significantly reduced instantaneous CSIT with respect to other techniques.

Figure 4.1: Two user groups with local one-cluster scattering and a common scatterer that couples them.

4.2 Multiple scattering clusters

JSDM was originally proposed in Chapter 2 for a system where users can be partitioned in groups with (approximately) the same covariance subspaces. Efficient user grouping algorithms for JSDM are proposed in Chapter 3. In any case, the underlying assumption is that the channel vectors in different groups have dominant covariance subspaces that almost do not overlap, such that BD or approximate BD can efficiently separate the groups on the basis of the channel second-order statistics only. In this section, we go one step beyond the one-cluster model and consider the application of JSDM to a more general channel model where each user group is characterized by multiple scattering clusters, and where these clusters may significantly overlap (common scatterers). We formalize the problem and present algorithms for selecting users and allocating spatial dimensions in Section 4.3.

Figure 4.1 shows the case of two user groups, each of which has its own cluster of local scatterers, which share a common remote scattering cluster. Generalizing this idea,
we consider a model where each user $k$ is characterized by multiple disjoint clusters of scatterers, spanning angles of arrival in a union of intervals. For simplicity, we still assume a uniform power distribution over the planar waves impinging on the BS antenna. This gives rise to a covariance matrix $\mathbf{R}_k$ with elements
\[
[\mathbf{R}_k]_{m,n} = \frac{1}{N^{\rm cl}_k} \sum_{c=1}^{N^{\rm cl}_k} \frac{1}{2\Delta_{kc}} \int_{\theta_{kc}-\Delta_{kc}}^{\theta_{kc}+\Delta_{kc}} e^{-j2\pi D (m-n) \sin\alpha}\, d\alpha, \tag{4.8}
\]
where $N^{\rm cl}_k$ is the number of scattering clusters associated to user $k$, and $\theta_{kc}$ and $\Delta_{kc}$ denote the respective azimuth angle and angular spread of cluster $c$ of user $k$. One can incorporate different power levels to the scattering clusters by using a weighted sum of the terms in (4.8).

In order to motivate the general problem of selecting users with multiple scattering clusters and gain insight into the design of suitable algorithms for this purpose, we first consider the example of Figure 4.1, which shows the effect of a single common scattering cluster. Because of the presence of the common scatterers, in order to simultaneously serve users in different groups we need to project the transmit signal in the orthogonal subspace of the eigendirections corresponding to the common scatterer. In this way, the pre-beamforming projection is able to decouple the two groups, such that MU-MIMO precoding in each group is able to achieve some per-group spatial multiplexing. However, in doing so we preclude the possibility of using the paths going through the common scatterer to convey signal energy to the MSs. Hence, an alternative approach consists of serving the two groups on different time-frequency slots (orthogonal transmission resources), while maximizing the signal energy transfer to each of the groups by exploiting all the available MPC combining. Summarizing, we have two possible approaches:

• Multiplexing: we employ BD to orthogonalize the groups in the spatial domain via the pre-beamforming matrix.
In this way we eliminate inter-group interference, and we are able to serve the two groups on the same transmission resource.

• Orthogonalization: we serve the user groups in different channel transmission resources, and use the pre-beamforming matrix to transmit over all the channel eigenmodes (including the common scatterers) to each group separately.

Figure 4.2: Sum spectral efficiency (in bits/s/Hz) versus SNR (in dB) for a scenario with two groups and a common scatterer. Curves: Orthogonal, PGP; Multiplex, PGP; ZFBF, Full CSIT.

As an example, we set the number of user groups $G = 2$, the total number of users $K = 100$ and the number of BS antennas $M = 400$. We set the number of users in each group to be equal, i.e., user group 1 contains $K_1 = 50$ and user group 2 contains $K_2 = 50$ users. Each of the user groups has two clusters of scatterers, giving $N^{\rm cl}_1 = N^{\rm cl}_2 = 2$ with one cluster common to both of them (see Figure 4.1). The azimuth angles of the scattering clusters for user group 1 are $\{-45^{\circ}, 0^{\circ}\}$ and those for user group 2 are $\{60^{\circ}, 0^{\circ}\}$. The angular spreads for all the clusters are taken to be $\Delta = 15^{\circ}$. Channel covariances are generated according to (4.8). The BS power is $P$ and the noise is normalized to 1, giving SNR $= P$. Figure 4.2 shows the sum spectral efficiency versus SNR for the two approaches mentioned above. The “red” curve corresponds to Orthogonalization and the “blue” curve corresponds to Multiplexing. For comparison purposes, we also plot the performance obtained using linear zero forcing beamforming with full channel state information, denoted by the “black” curve. It should be noted that for this example, acquiring full CSIT would require $M = 400$ training dimensions (since we are considering an FDD system, and downlink training requirements scale with the number of antennas $M$) in each coherence block.
On the other hand, our JSDM scheme requires only 100 training dimensions (a reduction by a factor of 4). This may still be too large for practical scenarios; hence, in the subsequent sections, we propose a degenerate version of JSDM that does not require any instantaneous CSIT.

We observe that, at low SNR, Orthogonalization performs better than Multiplexing due to an increased received power obtained from the MPCs arising from the common scatterer. However, at high SNR, Multiplexing performs much better. This is because even though the received power is less for both groups after the removal of the common scatterer, more users can be served simultaneously, thereby giving a higher spatial multiplexing gain, which is a factor of 2 compared to Orthogonalization (this is reflected by the slope of the spectral efficiency curves at high SNR).

4.3 Application of JSDM to Highly Directional Channels

In this section, we apply the JSDM approach to highly directional channels such as those observed at mm-Wave frequencies. In particular, we consider the case of channels with multiple scattering clusters, each of which has a different angle of departure and a narrow angular spread (as in (4.8)). In the limit, this reduces to channels formed by discrete and isolated MPCs, as in the model (4.4). In general, each user (or group of co-located users) has a channel covariance whose dominant eigenspace “occupies” a certain subset of the possible angular directions separable by the BS antenna array (the resolution of which depends on $M$ and on the normalized antenna spacing $D$). Such subsets are formed by unions of disjoint intervals in the angular domain (e.g., see (4.8)). Notice here that by assuming intervals, we implicitly consider “diffuse scattering”, i.e., a continuum of scatterers. Subsets of different users overlap in some intervals, and are disjoint in other
intervals. In fact, this setting is a non-trivial generalization of the common scatterer problem described in Section 4.2, where in the example we have only two user groups and three intervals, such that the groups are disjoint on two intervals and overlap on the third, corresponding to the common scatterer. Thus the general problem that we wish to solve consists of allocating users on the BS spatial dimensions in order to obtain a good tradeoff between the spatial multiplexing (number of groups separable by pre-beamforming), and power gain (which depends on the number of MPCs that are combined to convey signal energy to the receivers). This problem is combinatorial and can be formulated as an integer program. In order to obtain an efficient and easily computable solution, we present two integer programming problem formulations and the corresponding greedy user selection algorithms. As we shall see, each algorithm is suited to a specific scenario, which will be illustrated through numerical examples in Section 4.4.

4.3.1 Channel eigenvalue spectrum and angular occupancy

Using the theory developed in Section 2.6, based on Szegő's theory of large Toeplitz matrices, the eigenvalue spectrum of $\mathbf{R}_k$ in the limit of a large number of antennas $M$ converges to the discrete-time Fourier transform of the antenna correlation function, given by $r_k[m-\ell] = [\mathbf{R}_k]_{m,\ell}$. Being a discrete-time Fourier transform of an autocorrelation function, the eigenvalue spectrum is a function $\xi_k(f) : [-\tfrac{1}{2}, \tfrac{1}{2}] \to \mathbb{R}_+$. For the multiple scattering clusters channel model, replicating the derivation in Section 2.6 for the one-cluster model, it is immediate to find the eigenvalue spectrum in the form:
\[
\xi_k(f) =
\begin{cases}
\dfrac{1}{2 N^{\rm cl}_k \Delta_{kc}}\, \dfrac{1}{\sqrt{D^2 - f^2}}, & f \in \mathcal{I}_{kc} \\[2mm]
0, & f \notin \mathcal{I}_{kc}
\end{cases}
\qquad c \in \{1,2,\ldots,N^{\rm cl}_k\}, \tag{4.9}
\]
where $\mathcal{I}_{kc} = (-D\sin(\theta_{kc}+\Delta_{kc}),\, -D\sin(\theta_{kc}-\Delta_{kc}))$.
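A quick numerical sanity check of (4.9) is that the spectrum integrates to the diagonal entry $[\mathbf{R}_k]_{m,m} = 1$ of (4.8). The sketch below evaluates $\xi_k(f)$ for the clusters of user group 1 in the example of Section 4.2 ($\theta \in \{-45^{\circ}, 0^{\circ}\}$, $\Delta = 15^{\circ}$) and integrates it with a midpoint rule. It assumes $D = 1/2$, disjoint clusters, and $|\theta_{kc}| + \Delta_{kc} < 90^{\circ}$ so that the mapping $f = -D\sin\alpha$ is monotone; the function names are illustrative.

```python
import math


def cluster_interval(theta, delta, spacing=0.5):
    """I_kc = (-D sin(theta + delta), -D sin(theta - delta)) from (4.9)."""
    return (-spacing * math.sin(theta + delta),
            -spacing * math.sin(theta - delta))


def eigenvalue_spectrum(f, clusters, spacing=0.5):
    """xi_k(f) of (4.9) for clusters given as (theta, delta) in radians.

    Assumes the cluster intervals are disjoint and |theta| + delta < pi/2.
    """
    num_clusters = len(clusters)
    for theta, delta in clusters:
        low, high = cluster_interval(theta, delta, spacing)
        if low < f < high:
            return 1.0 / (2.0 * num_clusters * delta
                          * math.sqrt(spacing ** 2 - f ** 2))
    return 0.0


def spectrum_mass(clusters, spacing=0.5, steps=2000):
    """Midpoint-rule integral of xi_k(f) over the union of the I_kc."""
    total = 0.0
    for theta, delta in clusters:
        low, high = cluster_interval(theta, delta, spacing)
        width = (high - low) / steps
        total += width * sum(
            eigenvalue_spectrum(low + (i + 0.5) * width, clusters, spacing)
            for i in range(steps))
    return total


# Clusters of user group 1 in the example of Section 4.2.
group1 = [(math.radians(-45.0), math.radians(15.0)),
          (math.radians(0.0), math.radians(15.0))]
mass = spectrum_mass(group1)
```

Each cluster contributes mass $1/N^{\rm cl}_k$, since $\int_{\mathcal{I}_{kc}} df/\sqrt{D^2-f^2} = 2\Delta_{kc}$, so the total is 1 up to the integration error; between the two intervals (e.g., at $f = 0.2$) the spectrum is zero.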
In order to handle channels formed by a discrete set of MPCs, we quantize the interval $[-1/2, 1/2]$ into $M$ disjoint intervals (“angular bins”) of size $\tfrac{1}{M}$, where bin $\mathcal{B}_i$ is centered at $\tfrac{i}{M} - \tfrac{1}{2}$ with $i \in \{0,1,\ldots,M-1\}$ and it is wrapped around the interval $[-1/2, 1/2]$ by the periodicity of the discrete-time Fourier transform. We say that a user $k$ “occupies” bin $\mathcal{B}_i$ if $-D\sin\theta_{kp} \in \mathcal{B}_i$. In addition, we let $\pi(p)$ denote the index of the bin occupied by the $p$-th MPC. Then, with a slight abuse of notation, we define $\xi_k(f)$ for the discrete MPC model as the piecewise constant function
\[
\xi_k(f) = \sum_{p=1}^{\bar{N}_k} |\rho_{kp}|^{2} \cdot 1\{ f \in \mathcal{B}_{\pi(p)} \}. \tag{4.10}
\]
In both cases, we let $\mathcal{W}_k$ denote the support of $\xi_k(f)$, and define the set function $f_k : \sigma([-\tfrac{1}{2}, \tfrac{1}{2}]) \to \mathbb{R}_+$ given by
\[
f_k(\mathcal{X}) = \int_{\mathcal{X}} \xi_k(f)\, df, \tag{4.11}
\]
where $\mathcal{X}$ is an element of the Borel field $\sigma([-\tfrac{1}{2}, \tfrac{1}{2}])$, i.e., in particular, it can be any set formed by countable unions of intervals in $[-\tfrac{1}{2}, \tfrac{1}{2}]$.

In order to formulate the user selection problem$^4$, we take a graph theoretic approach and we associate the users to the nodes of a graph, such that node $k$ (corresponding to user $k$) has node weight $\mathcal{W}_k$. An edge $(k,\ell)$ exists in the graph if $\mathcal{W}_k \cap \mathcal{W}_\ell \neq \emptyset$. For such an edge, the associated edge weight is $\mathcal{E}_{k\ell} = \mathcal{W}_k \cap \mathcal{W}_\ell$.

Footnote 4: The advantage of using linear arrays is the relatively simple mapping between the user angles of departure to the interval $[-1,1]$ (see Section 2.6 for details), which gives an elegant mathematical formulation to the user selection problem and enables us to design suitable algorithms. Going beyond a linear array would change the mapping, and the problem needs to be formulated in a different manner.

4.3.2 Optimization Problem 1

In this case, we aim at maximizing the total “area” of the combined eigenvalue spectrum of the selected users while removing any subspace overlap between them. The proposed optimization problem takes on the form:
\[
\begin{aligned}
\text{maximize}\quad & \sum_{k} f_k\Big( x_k \mathcal{W}_k \setminus \bigcup_{\ell \in \mathcal{N}_k} x_\ell\, \mathcal{E}_{k\ell} \Big) \\
\text{subject to}\quad & x_k \in \{0,1\}
\end{aligned}
\tag{4.12}
\]
with the following notation: for $x \in \{0,1\}$ and $\mathcal{W} \in \sigma([-\tfrac{1}{2}, \tfrac{1}{2}])$ we let $x\mathcal{W} = \mathcal{W}$ if $x = 1$ and $x\mathcal{W} = \emptyset$ if $x = 0$; $\mathcal{N}_k$ denotes the neighborhood of node $k$ in the graph, i.e., all the nodes $\ell$ such that an edge $(k,\ell)$ exists.

Note that (4.12) is an integer optimization problem, whose solution may be computationally complex for real-time implementation, especially for systems with a large number of users and a large number of angular bins per user channel. In order to obtain an easily computable feasible user selection, we resort to a (generally suboptimal) greedy selection algorithm presented below. For notational simplicity, we denote the objective function of problem (4.12) by $Q_1(\mathbf{x})$, where $\mathbf{x} = (x_1,\ldots,x_K) \in \{0,1\}^K$.

Greedy Algorithm 1

• Step 1: Initialize $\mathbf{x}^{(0)} = \mathbf{0}$, the all-zero vector, $Q_1(\mathbf{x}^{(0)}) = 0$, $\mathcal{S}_1 = \emptyset$ and $\mathcal{K} = \{1,2,\ldots,K\}$.

• Step 2: For iteration $n$, find an index $k^*$ such that
\[
k^* = \arg\max_{k \in \mathcal{K} \setminus \mathcal{S}_1} Q_1(\mathbf{x}^{(n)}_k),
\]
where $\mathbf{x}^{(n)}_k = \mathbf{x}^{(n)} + \mathbf{e}_k$, and $\mathbf{e}_k$ denotes a vector of all zeros except a 1 in the $k$-th position.

• Step 3: If $Q_1(\mathbf{x}^{(n)}_{k^*}) > Q_1(\mathbf{x}^{(n)})$, set $\mathcal{S}_1 = \mathcal{S}_1 \cup \{k^*\}$, $\mathbf{x}^{(n+1)} = \mathbf{x}^{(n)}_{k^*}$, $n = n+1$, and go to Step 2. Else, output $\mathcal{S}_1$ as the set of selected users.

The greedy algorithm starts by selecting a user that occupies the maximum area in terms of eigenvalue spectrum and continues to add more users until the objective cannot be increased further. From a qualitative perspective, the algorithm implements a form of Orthogonalization, by giving preference to users which occupy a larger area in the eigenvalue spectrum and by penalizing users having a spectral overlap with the already selected users.

4.3.3 Optimization Problem 2

In this case, we wish to maximize the number of served users, provided that they have at least one non-overlapped spectral interval.
The proposed optimization problem takes on the form:
\[
\begin{aligned}
\text{maximize}\quad & \sum_{k} x_k \\
\text{subject to}\quad & x_k \in \{0,1\}, \\
& \Big( x_k \mathcal{W}_k \setminus \bigcup_{\ell \in \mathcal{N}_k} x_\ell\, \mathcal{E}_{k\ell} \Big) \cup \Big( \bigcup_{\ell \in \mathcal{N}_k} (1-x_k)\, \mathcal{E}_{k\ell} \Big) \neq \emptyset \quad \forall\, k,
\end{aligned}
\tag{4.13}
\]
and $\mathcal{N}_k$ denotes all the nodes connected to node $k$. The constraint guarantees that the scheduled user nodes always have one non-overlapping interval, which is non-empty. For a non-scheduled user node, the constraint reduces to a union of edge weights corresponding to its neighbors, which is trivially non-empty (assuming that the graph is connected).

Qualitatively, the optimization problem (4.13) aims at maximizing the Spatial Multiplexing, while removing any region of overlap in the angular spectrum of the users. The solution corresponds to the maximum number of users that can be simultaneously served without any common region of overlap. Again, since (4.13) is an integer program, we resort to a (suboptimal) low complexity greedy selection method that keeps adding users as long as the feasibility conditions in (4.13) remain satisfied.

Greedy Algorithm 2

• Step 1: Initialize $\mathcal{S}_2 = \emptyset$, $\mathcal{K} = \{1,2,\ldots,K\}$ and fix $\epsilon > 0$.

• Step 2: Construct a set $\mathcal{F}$ containing all nodes in $\mathcal{K} \setminus \mathcal{S}_2$ that satisfy the feasibility condition when all nodes in $\mathcal{S}_2$ are active, i.e.,
\[
\begin{aligned}
\mathcal{F} &= \{ k : k \in \mathcal{K} \setminus \mathcal{S}_2,\ |\mathcal{J}_m| \geq \epsilon,\ \forall\, m \in \mathcal{S}_2 \cup \{k\} \}, \\
\mathcal{J}_m &= \mathcal{W}_m \setminus \bigcup_{\substack{\ell \in \mathcal{N}_m \\ \ell \in \mathcal{S}_2 \cup \{k\}}} x_\ell\, \mathcal{E}_{m\ell}.
\end{aligned}
\tag{4.14}
\]
If $\mathcal{F} = \emptyset$, go to Step 5, else go to Step 3.

• Step 3: Find an index $k^* \in \mathcal{F}$ such that
\[
k^* = \arg\min_{k \in \mathcal{F}} |\mathcal{N}_k|. \tag{4.15}
\]

• Step 4: $\mathcal{S}_2 = \mathcal{S}_2 \cup \{k^*\}$. Go to Step 2.

• Step 5: Output $\mathcal{S}_2$ as the set of selected users.

The selection of $k^*$ in (4.15) is driven by the heuristic of choosing a feasible node with the minimum number of edges. One can use different heuristics yielding possibly different results. Finally, $\epsilon$ is a tuning parameter that is used to limit the maximum number of users that can be multiplexed together. The role of $\epsilon$ is to prevent users from being selected in case they have large overlap regions with other users.
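The two greedy procedures can be sketched compactly in the discrete-bin setting of (4.10). The code below is a simplified illustration, not the implementation used for the numerical results: each user is represented by the set of bin indices it occupies, the set function is taken as $f(\mathcal{X}) = |\mathcal{X}|$ (number of bins times an assumed bin width of 0.05), ties are broken by user index, and all names are hypothetical. The two-user check uses $\mathcal{W}_1 = (-0.1,0.1) \cup (0.2,0.25)$ and $\mathcal{W}_2 = (-0.1,0.1) \cup (-0.4,-0.3)$, where bin $i$ stands for the interval $(0.05\,i,\ 0.05\,(i+1))$.

```python
BIN_WIDTH = 0.05  # assumed bin size for this sketch


def private_bins(k, active, bins):
    """W_k minus the overlaps E_kl with every other active user l."""
    shared = set()
    for other in active:
        if other != k:
            shared |= bins[k] & bins[other]
    return bins[k] - shared


def size(bin_set):
    """f(X) = |X|: total width of a set of bins."""
    return len(bin_set) * BIN_WIDTH


def q1(active, bins):
    """Objective of (4.12) restricted to the active users."""
    return sum(size(private_bins(k, active, bins)) for k in active)


def greedy_1(bins):
    """Greedy Algorithm 1: add the best user while Q1 strictly increases."""
    selected, best = [], 0.0
    remaining = set(bins)
    while remaining:
        k_star = max(sorted(remaining),
                     key=lambda k: q1(selected + [k], bins))
        value = q1(selected + [k_star], bins)
        if value <= best:
            break
        selected.append(k_star)
        remaining.discard(k_star)
        best = value
    return selected


def degree(k, bins):
    """|N_k|: number of users whose bins overlap with those of user k."""
    return sum(1 for other in bins if other != k and bins[k] & bins[other])


def greedy_2(bins, eps):
    """Greedy Algorithm 2: add feasible users, fewest neighbors first."""
    selected = []
    while True:
        feasible = []
        for k in sorted(set(bins) - set(selected)):
            active = selected + [k]
            if all(size(private_bins(m, active, bins)) >= eps
                   for m in active):
                feasible.append(k)
        if not feasible:
            return selected
        selected.append(min(feasible, key=lambda k: degree(k, bins)))


# Two users sharing the bins that cover (-0.1, 0.1).
two_users = {1: {-2, -1, 0, 1, 4}, 2: {-2, -1, 0, 1, -8, -7}}
selection_1 = greedy_1(two_users)
selection_2 = greedy_2(two_users, eps=0.01)
```

In this toy case the first procedure keeps only user 2 (serving user 2 alone gives area 0.3, while serving both leaves 0.15 after removing the overlap), whereas the second keeps both users because each retains a private interval. Raising `eps` above the 0.05 of private width left to user 1 makes the second procedure stop after one user, which illustrates the role of the tuning parameter.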
Note that the complexity of an optimal exhaustive search user selection algorithm for both (4.12) and (4.13) is exponential in the number of users $K$, i.e., $O(2^K)$, whereas the greedy user selection algorithms terminate after at most $K$ selection steps, i.e., they have a linear complexity, $O(K)$, in the number of steps. A simple example demonstrating the purpose of Optimization Problems 1 and 2 and the corresponding greedy algorithms is given next. Consider $K = 2$, with $\mathcal{W}_1 = (-0.1, 0.1) \cup (0.2, 0.25)$ and $\mathcal{W}_2 = (-0.1, 0.1) \cup (-0.4, -0.3)$. Also, assume the function $f(\mathcal{X})$ for an interval $\mathcal{X}$ is given as $f(\mathcal{X}) = |\mathcal{X}|$, the size of the interval. Solving (4.12) gives the solution $[0\ 1]$ and solving (4.13) gives $[1\ 1]$ as the solution. This means that with Algorithm 1, only user 2 is selected, while with Algorithm 2 both users are selected.

An important point to note here is that when the channels are highly directional, the eigenvalue spectrum reduces to the form (4.10), and a user can be viewed as occupying a set of bins corresponding to the angles of arrival of the MPCs. In such a scenario, if the users are located randomly in the network, Greedy Algorithm 2 basically tries to schedule users which have at least one non-overlapping bin, thereby providing a large spatial multiplexing gain.

4.3.4 Application of JSDM after selection

In this subsection, we briefly summarize the application of Joint Spatial Division and Multiplexing after user selection. We consider the following two different cases.

1. JSDM with spatial multiplexing: In this scenario, users come in groups, either by nature or by the application of user grouping algorithms. The selection algorithms described earlier provide a set of user groups that can be served simultaneously, in the same transmission resource. We use approximate BD based on the channel covariances of the selected user groups in order to obtain the JSDM pre-beamformers (see Section 2.3.2). In this way, pre-beamforming spatially separates the groups.
Then, within each group, multiple users are served by spatial multiplexing using a zero-forcing MU-MIMO precoder.

2. Covariance-based JSDM: In this scheme, irrespective of the number of users in a group, we do not perform spatial multiplexing, i.e., only one user per group is served. Mathematically, this means that the pre-beamforming matrices $\mathbf{B}_g$ for all groups $g \in \{1, 2, \ldots, G\}$ have horizontal dimension $b_g = 1$, i.e., the pre-beamformer reduces to a single column. This approach can be regarded as a degenerate version of JSDM where the multiplexing inside each group is trivial. Covariance-based JSDM is attractive from the system simplification viewpoint, since it does not require instantaneous CSIT to compute the MU-MIMO precoders $\{\mathbf{P}_g\}$. On the other hand, when a non-trivial spatial multiplexing per group ($K_g > 1$) is possible, the rate achieved by covariance-based JSDM may be significantly less than what could be achieved by full JSDM. It is important to remark, though, that in some relevant scenarios the throughput achieved by covariance-based JSDM may be comparable to that of full JSDM. For example, in a small cell system operating at mm-Wave frequencies, such that the number of users $K$ is not very large, and each user channel is formed by discrete MPCs that overlap only on a few common scattering angles, it can be expected that, after the selection algorithm, each “group” is indeed formed by just a single user. Therefore, there is no need for further spatial multiplexing inside each group. This will be evident in some numerical experiments presented in Section 4.4.

Remark 11 From (4.3), we have that the channel covariance matrix of a user $k$ at any given frequency $f$ is independent of the delays $\{\tau_{pk}\}$ of the multi-path components, and is constant with respect to the frequency $f$ (see (4.4)). Hence, making a narrowband assumption (e.g., focusing on a single subcarrier of an OFDM system), we can treat the channel covariance as a constant with respect to frequency.
Since our algorithms depend only on the channel covariance matrices, they apply identically whether the channel is frequency selective or frequency flat. Of course, the part of the beamforming scheme that depends on the instantaneous effective channel requires CSIT for every coherence band in frequency. In an extreme case of frequency selectivity, this must be estimated over each OFDM subcarrier, while in a normal case (e.g., channels used in LTE) an estimate per channel resource block (12 adjacent subcarriers) would be sufficient.

4.4 Numerical Results

We present some numerical experiments demonstrating the performance of the algorithms described in Section 4.3. We run the algorithms for different scenarios in order to point out interesting insights on the effect of highly directional channels with common scatterers. We present results for the above discussed multi-cluster model, as well as for even more realistic scenarios generated by ray tracing and measurements. Before presenting the numerical results, in Section 4.4.1, we describe the ray tracing setup and in Section 4.4.2, we provide details on the measurement setup.

4.4.1 Ray tracing channels

In order to get channel models even more realistic than the multi-cluster model described above, we simulate the double directional impulse responses described in Section 4.1 with the aid of a commercial ray-tracing tool, Wireless InSite [Rem13]. This ray tracer provides efficient and accurate predictions of propagation and communication channel characteristics over 50 MHz to 100 GHz in complex environments. Specifically, Wireless InSite performs ray launching, emitting rays (representing plane waves) from the transmit location into all directions, and following each ray as it interacts (reflection, diffraction, transmission) with the objects in the environment; this continues until either the strength of the ray falls below a specified threshold or it has left the area of interest (see footnote 5).
The input to the program is a digital map of the environment (including footprint and height of the buildings and the electromagnetic characteristics of the building materials). Meanwhile, the effects of trees are non-negligible in mm-Wave systems and are thus modeled by the Foliage Feature in Wireless InSite. The output is a list of parameters for the MPCs that is similar to the result of a double directional channel. Each MPC is associated with a path vector that contains the time-averaged path power $P_p = \rho_p^2$, the propagation delay $\tau_p$, and the azimuth angles of departure $\theta_p$ and arrival $\psi_p$. Like all ray tracers, the accuracy of the program is determined by the accuracy of the environmental database, the number of rays launched, and the maximum number of interactions taken into account. Simulation results have been compared to measurements in a variety of settings and shown to provide good agreement [Rem13].

Footnote 5: This commercial ray tracer does not consider the effects of diffuse MPCs, while there are more advanced ray-tracing tools with the addition of models of diffuse MPCs [DEKVV11], [MQO12].

The simulation has been conducted based on the model of the University of Southern California (USC) main campus, as shown in Figure 4.3. The green dot is the BS located above the rooftop in the middle of the map, while the simulated MSs are placed along the red routes covering all possible streets of the campus. Gray objects represent the buildings, and their building surfaces are modeled with a uniform material for simplicity. The light/dark green 3D polygons denote foliage features with different tree density. In mm-Wave channels, the diffracted MPC will be greatly attenuated; therefore, restricting the ray tracer to consider up to one diffraction is a valid simplification and speeds up the simulation. The detailed simulation configurations are listed in Table 4.1.
Figure 4.3: Ray-tracing simulation environment

Variable                Value
Carrier Frequency       28 GHz
Antenna Pattern         Isotropic
Antenna Polarization    Vertical
Tx power                30 dBm
BS height               45 m
MS height               2 m
Maximal Diffraction     1
Maximal Reflection      10

Table 4.1: Ray-tracing simulation parameters of USC campus

4.4.2 Measured channels

28 GHz wideband propagation measurements of channel impulse responses and received power were made throughout downtown New York City in the summer of 2012. Three different transmitter (BS) locations were selected on NYU buildings, two being on the rooftop of the Coles Sports Center (7 m above ground) and a third on the fifth-floor balcony of the Kaufman Center (17 m above ground). Each transmitter location shared 25 receiver locations with transmitter-receiver separation distances ranging from 31 m to 423 m, for a total of 75 distinct TX-RX location pairs, although only 25 locations with TX-RX separations less than 200 m were able to receive sufficient power for broadband signal capture. Fig. 4.4 shows a 3D map of the Manhattan environment where the measurements were performed, and shows the three transmitters (yellow stars) and receiver locations (green dots and purple squares, with green dots representing visible RX locations and purple squares representing RX sites that are blocked by buildings). Typical measurements included:

• Line-of-Sight Boresight (LOS-B): both the TX and RX antennas are pointed directly toward each other (i.e., on boresight) and aligned in both azimuth and elevation angles with a true LOS, i.e., no obstructions between the antennas.

• Line-of-Sight Non-Boresight (LOS-NB): both the TX and RX have no obstructions between the antennas, but the antennas are not pointed directly towards each other in azimuth or elevation angles.

• Non-Line-of-Sight (NLOS): the TX and RX have physical obstructions between the antennas.
A NLOS environment with moderate obstructions includes trees between TX and RX, or the case when the RX is slightly behind a building corner. A NLOS environment with heavy obstruction includes the RX completely behind buildings.

The measurements were performed using an 800 MHz first zero-crossing RF bandwidth sliding correlator channel sounder with rotational highly directional horn antennas (each with 24.5 dBi gain, or 10° half beamwidth) [RSM+13, AWW+13, ZMS+13]. The maximum transmitter output power used was 30 dBm, and two highly directional horn antennas of 24.5 dBi (10.9° and 8.6° half-power beamwidths (HPBW) in the azimuth and elevation planes, respectively) were used at the TX and RX, allowing for a total of 178 dB of measurable path loss. The measurement parameters are summarized in Table 4.2; for further details see [RSM+13] and [AWW+13].

Figure 4.4: 28 GHz cellular measurement locations in Manhattan near the NYU campus. Three base station locations (yellow stars on the one-story rooftop of Coles Recreational Center and five-story balcony of the Kaufman building of Stern Business School) were used to transmit to each of the 25 RX locations within 31 to 423 m. Green dots represent visible RX locations, and purple squares represent RX sites that are blocked by buildings in this image.

Description                         Value
Sequence                            11th order PN Code (Length = 2047)
Transmitted Chip Rate               400 MHz
Receiver Chip Rate                  399.95 MHz
Slide Factor                        8000
Carrier Frequency                   28 GHz
NI Digitizer Sampling Rate          2 MSamples/s
System measurement range            178 dB
Maximum TX Power                    30 dBm
TX/RX Antenna Gain                  24.5 dBi
TX/RX Azimuth and Elevation HPBW    10.9°/8.6°
TX-RX Synchronization               Unsupported

Table 4.2: 28 GHz Channel Sounder Specifications

Angle of arrival (AOA) and angle of departure (AOD) measurements were made for every TX-RX location, as described in [RSM+13]. For our simulations, we use the measurements to produce AOD received power values reflecting measurable signal propagation for all RX locations (see footnote 6).
AOD measurements consisted of rotating the TX antenna in 10° increments in the azimuth plane at a fixed −10° elevation downtilt while the RX antenna remained stationary at fixed elevation and azimuth angles; this fixed direction of the RX antenna was chosen to approximately maximize the received power.

Footnote 6: It was observed from mm-Wave field measurements that the power levels of diffuse multipath components in NLOS environments are considerably weaker than those arising from specular reflections. As a result, evaluating our algorithm based on the most significant multipath components does not significantly impact the presented results.

4.4.3 JSDM with spatial multiplexing

As stated in Section 4.3.4, here we assume that users come in groups, and each group has multiple scattering clusters, with covariances computed from (4.8). In order to generate such a scenario, we form a set of non-overlapping scattering clusters and divide them into two sets. Each cluster of the first set is assigned uniquely to one group, while the clusters of the second set are assigned randomly to the groups, such that a cluster in the second set may be common to multiple groups. Hence, each user group has its own scatterer, different from all the other user groups, in addition to some scatterers that are possibly common to other groups. In our simulations, we generate 10 scattering clusters at random, and vary the number of user groups $G$ from 2 to 5. The maximum number of scattering clusters for each user group is fixed to 5. Within each user group, a finite number of users equal to the rank of the local scattering cluster is assumed. These users are then spatially multiplexed by ZFBF on the resulting channel obtained after pre-beamforming, which is determined by approximated BD on the dominant eigenspaces of the selected user groups. We set $M = 400$, and the noise power is normalized to 1, so $\mathrm{SNR} = P$, where $P$ is the total BS transmission power.
Figures 4.5(a) and 4.5(b) show a comparison of the total achievable throughput for the different algorithms as a function of SNR. “Algo 1” refers to Greedy Algorithm 1, “Algo 2” refers to Greedy Algorithm 2 and “ES” refers to Exhaustive Search. We see that both algorithms give similar performance, with Algorithm 1 giving better performance than Algorithm 2 when the number of user groups is 5. The average number of users simultaneously served, i.e., the spatial multiplexing, per time-frequency resource is plotted in Figures 4.6(a) and 4.6(b). Even though Algorithm 2 gives higher spatial multiplexing compared to Algorithm 1, the presence of more groups reduces the beamforming gain and also creates additional inter-group interference (a result of non-perfect block diagonalization); therefore, the gains due to spatial multiplexing are not fully realized. It is also noteworthy to observe the effect of $\epsilon$ as a tuning parameter. A lower value of $\epsilon$ favors the selection of more groups (multiplexing) but in this case yields lower throughput because of the smaller beamforming gain and higher inter-group interference. Instead, a higher value of $\epsilon$ sacrifices some spatial dimensions but yields higher throughput in this case. It is also worth pointing out that both the greedy user selection algorithms give good performance when compared with their exhaustive search counterparts, as evidenced by Figures 4.5(b) and 4.6(b), for $G = 5$ (see footnote 7).

Figure 4.5: Comparison of sum spectral efficiency versus SNR with G = 2 and G = 5 user groups. Each user group has multiple scattering clusters, of which some are common to more than one group. Panels: (a) G = 2, (b) G = 5. Axes: SNR (in dB) versus sum rate (in bits/sec/Hz). Curves: Algo 1 and Algo 2 for several values of ε, plus exhaustive search (ES) in panel (b).

Figure 4.6: Comparison of Spatial Multiplexing versus log ε with G = 2 and G = 5 user groups. Each user group has multiple scattering clusters, of which some are common to more than one group. Panels: (a) G = 2, (b) G = 5. Axes: log ε versus spatial multiplexing. Curves: Algo 1 and Algo 2, plus exhaustive search (ES) in panel (b).

4.4.4 Covariance-based JSDM

We apply the covariance-based JSDM scheme outlined in Section 4.3.4 to different scenarios, and shall see that this scheme is particularly suited to directional channel models having a small number of discrete MPCs.

User groups with multiple scattering clusters: We consider the same setup as in Section 4.4.3. As already remarked, covariance-based JSDM serves only one user per group and does not require instantaneous CSIT of the effective channels after pre-beamforming. Therefore, the precoder can be computed only from the second order statistics, eliminating the need for explicit downlink training and simplifying the precoder design. However, a price is paid in terms of achievable throughput, which is reduced considerably with respect to the full JSDM case. Figure 4.7(a) shows the sum spectral efficiency as a function of SNR for the different user selection algorithms and Figure 4.7(b) shows the corresponding spatial multiplexing, when there are $G = 5$ groups. Compared to Figures 4.5(b) and 4.6(b), there is a huge reduction in the achievable data rates and in the spatial multiplexing.

Isolated Users with Multiple Scattering Clusters: Here, we consider multiple scattering clusters associated to each user, similar to Section 4.4.3. We fix the number of users in the system to be $K = 20$, and associate an arbitrary number of disjoint scattering clusters to each user. The maximum number of scattering clusters that a user can have is limited to 5.
Footnote 7: The fact that the spatial multiplexing of Algorithm 1 using exhaustive search may be less than what is obtained by the greedy algorithm (as in Fig. 4.6(b)) can be expected, since Algorithm 1 does not maximize the multiplexing gain.

Figure 4.7: Comparison of sum spectral efficiency versus SNR and Spatial Multiplexing versus log ε with G = 5 user groups and no spatial multiplexing. Each user has multiple scattering clusters. Panels: (a) Sum Rate (in bits/sec/Hz) versus SNR (in dB), (b) Spatial Multiplexing versus log ε.

Figure 4.8: Comparison of sum spectral efficiency versus SNR and Spatial Multiplexing versus log ε with K = 20 users. Each user has multiple scattering clusters. Panels: (a) Sum Rate (in bits/sec/Hz) versus SNR (in dB), (b) Spatial Multiplexing versus log ε.

We set $M = 400$ and obtain a set of scheduled users by running the algorithms of Section 4.3. Figure 4.8(a) shows the sum spectral efficiency with varying SNR for this setup and Figure 4.8(b) shows the variation of spatial multiplexing with the tuning parameter $\epsilon$. We observe a behavior similar to what was observed for the model used in Section 4.4.3, and the achievable throughput is reduced significantly due to no spatial multiplexing. Also interesting is the fact that even though there are a total of $K = 20$ users, only an average of 7 users are served simultaneously, implying that the presence of more users leads to more common scattering clusters, thereby limiting the total spatial multiplexing. This result might give the wrong intuition that having a larger number of users does not necessarily increase the total system throughput.
However, this effect is due to the limitation of the covariance-based JSDM: if full JSDM is used, users spanning the same set of dimensions can be grouped together and served using MU-MIMO spatial multiplexing based on the instantaneous CSIT. Interestingly, we shall see next that covariance-based JSDM is indeed able to achieve high spatial multiplexing (that increases with the number of users, in the range $K \ll M$) in the presence of highly directional channels with a small number of MPCs.

Figure 4.9: Comparison of sum spectral efficiency versus transmit power with varying K when the channel is modeled as a double directional impulse response. Panels: (a) K = 5, 10, 25 (including exhaustive search for K = 5), (b) K = 45, 60. Axes: Transmit Power (in dBm) versus Sum Rate (in bits/sec/Hz).

Figure 4.10: Comparison of Spatial Multiplexing versus Number of users when the channel is modeled as a double directional impulse response. Axes: K versus Spatial Multiplexing. Curves: Algo 1, Algo 2.

Ray-tracing Based Channels: We next generate the channels according to (4.2) by using parameters obtained from the ray-tracing simulation setup. The phases are generated as $\phi_{kp} \sim \mathrm{Unif}[0, 2\pi]$. Since in this case the channel angular support is formed by a collection of disjoint “angular frequency bins” of the same size (see Section 4.3.1), different user channels either do not overlap or overlap entirely on an integer number of bins. Therefore, in Algorithm 2 we can set $\epsilon = 0$. After obtaining the scheduled user set, BD is performed to obtain the pre-beamformers. Figure 4.9 shows the sum spectral efficiency versus transmit power (in dBm) for various numbers of users with different algorithms.
We vary the transmit power from 10 dBm (10 mW) to 50 dBm (100 W). The noise power is set to −100 dBm, corresponding to a 20 MHz bandwidth. Here, we clearly see a tradeoff between Orthogonalization at low SNR and Multiplexing at high SNR. Also interesting is the fact that spatial multiplexing performs better with a small number of users than with a large number of users. This is because there is a non-trivial tradeoff between Orthogonalization and Multiplexing. With a lower complexity, greedy user selection performs well when compared with exhaustive search, as is clear from Figure 4.9(a) for $K = 5$. Contrary to what was observed in the case of multiple scattering clusters in Section 4.4.4, Figure 4.10 shows that we are able to recover the spatial multiplexing even with just covariance-based JSDM when channels are highly directional and have a few MPCs, which is characteristic of the channels obtained from ray tracing.

Figure 4.11: Comparison of sum spectral efficiency versus Transmit Power for different BS locations obtained from measured data. Axes: Transmit Power (in dBm) versus Sum Rate (in bits/sec/Hz). Curves: Algo 1 and Algo 2 for BS 1, BS 2 and BS 3.

Measured Propagation Channels: Figure 4.11 shows the sum throughput versus transmit power after running the user selection algorithms on the data obtained from measured propagation channels described in Section 4.4.2. There are a total of 3 BSs, and each BS has a set of 8 user locations, so we fix the number of users $K = 8$. We see that the algorithms perform differently depending on the scenario. For example, with BS 2, we achieve the same spatial multiplexing using both algorithms, while for BS 3, Algorithm 2 outperforms Algorithm 1 owing to huge spatial multiplexing. Overall, we observe that covariance-based JSDM along with proper user selection achieves very high throughput in actual propagation channels.
However, one should also consider that the high spectral efficiencies are due to a single cell scenario and the use of achievable rate expressions assuming Gaussian inputs. In reality, the input signal would be modulated by a finite dimensional constellation such as QAM, which would put a limit on the maximum achievable rate. Also, the noise floor was taken to be −100 dBm in our results, which is typical for a system operating at a bandwidth of 20 MHz at room temperature. Since inter-cell interference would create additional noise, this would reduce our received SNR too. Even taking into account all these imperfections, we would like to point out that in mm-Wave scenarios the distances are short, leading to smaller path losses, and, since we have a large antenna array at the BS, it is indeed possible to achieve high SNR with simple covariance-based schemes, leading to high data rates.

Remark 12 Note that the proposed user selection algorithms are, in fact, independent of the channel model and use only the second order statistics of the user channels. However, these algorithms work well in certain kinds of channel environments such as those considered in the paper, and may not work well in other propagation environments. For example, if we have a few users with isotropic scattering, for which the energy is not concentrated in a particular angular direction but is distributed uniformly over the whole angular space, our selection algorithm will treat each of these users as a group on its own, and would either schedule one of these users alone, or multiple users with compatible directional channels. In terms of spatial multiplexing as well as reduced CSIT, our proposed algorithms become meaningful when most users in the network have channels with energy concentrated in a few directions.
However, if we are in a propagation environment where most users have “nearly” isotropic channel directions, the scheme reduces to serving one user at a time, or a group of users based on instantaneous CSIT, as is the case in standard massive MIMO schemes.

Chapter 5
Massive MIMO and Inter-tier Interference Coordination

Dense spatial spectrum reuse has been widely recognized as the single most valuable resource for overcoming the wireless “spectrum crunch” [CAG08]. A possible solution consists of deploying a large number of small cells that operate under a common macrocell umbrella. Tier-2 cells (small cells) have attracted a lot of attention both in cellular standards bodies such as 3GPP, as well as in academic research. A large body of literature has focused on mitigating the inter-tier interference, using schemes such as eICIC [LPGR+11], [GMR+12] (and references therein), which involve orthogonalizing the time-frequency resources allocated to both tiers.

A “cognitive” small cell approach was proposed in [ANC11], where tier-2 base stations (BSs) have the ability to decode the tier-1 (macrocell) BS control channel and schedule their transmissions to deal with the inter-tier interference. It was demonstrated that with a simple power control approach and a moderate number of antennas at the tier-2 BSs, very high area spectral efficiency (bit/s/Hz/km$^2$) could be achieved in both tiers.

In this chapter, we focus on a scenario where the tier-1 BS is equipped with a very large number of antennas (massive MIMO). Since the tier-1 BS is typically located on an elevated position (e.g., tower-mounted, or deployed on a building roof), it “sees” both its own users and the tier-2 cells under a relatively narrow angular spread. This gives rise to highly directional channel vectors, which can be modeled as Gaussian random vectors with a small number of dominant eigenmodes (eigenvectors of their covariance matrix).
As a result, the tier-1 BS can make use of directional beamforming, similar to the JSDM approach proposed in Chapter 2, in order to simultaneously achieve spatial multiplexing to its own tier-1 users as well as mitigate the inter-tier interference to the tier-2 cells. Inter-tier interference can be mitigated by nulling certain spatial directions, i.e., by transmitting in the orthogonal complement of the dominant eigenmodes of the channel vectors from the tier-1 BS to a subset of selected tier-2 cells. The selection of such directions can be allocated in the time-frequency resource such that each tier-2 cell has a fair share of transmission opportunities free from inter-tier interference. In analogy with the frame blanking approach of eICIC, we refer to this approach as “spatial blanking”. As a result, the tier-2 throughput can be increased without significantly sacrificing the tier-1 throughput, as opposed to eICIC, which can only operate on the convex combination region of the individual throughput capacities of the tier-1 and tier-2 systems.

A similar scenario has been considered in [HHBD13], but under different beamforming and power control strategies. That work confirms the effectiveness of “spatial blanking”, but with higher complexity algorithms, since it does not take into account the channel directionality properties, which enable an efficient JSDM decomposition and dimensionality reduction. In addition, it was shown there that reverse-TDD is not competitive; however, we find that reverse-TDD can yield significant advantages in some cases with respect to co-channel TDD.

For the remainder of this chapter, we present the basic system model in Section 5.1. We analyze the system performance in Sections 5.2 and 5.3, followed by some numerical results in Section 5.4.

5.1 System model

We consider a macrocell (tier-1 system) comprising a single BS having $M$ antennas and containing $F$ tier-2 small cells, each equipped with $L$ antennas.
Figure 5.1: Frame structure for the two-tier network. Labels in the figure: Tier-1 BS (DL); Tier-1 users (UL); Tier-2 BS/Tier-2 users listening; Tier-2 BS/Tier-2 users transmitting; Preamble; FCH; DL-MAP; UL-MAP.

The system operates in TDD (Time Division Duplexing) mode, where both the downlink and uplink bands are accessed using OFDM/TDMA. For simplicity, we consider here a frequency flat channel corresponding to a set of adjacent subcarriers in the same channel coherence bandwidth. The TDD macrocell/femtocell frame structure is borrowed from [ANC11], and is shown in Fig. 5.1. The tier-1 BS frame includes the control channel and the tier-1 uplink (UL) and downlink (DL) subframes, with a small guard interval in between. The tier-2 cells have two subframes. During the first one, overlapped with the tier-1 BS control channel, all the tier-2 nodes (both BS and users) listen to the tier-1 BS control channel. After a guard interval, during which the tier-2 nodes decode the control information in the tier-1 BS control channel and acquire the allocation of the tier-1 users on the tier-1 DL/UL subframes, all the tier-2 cells are active and transmit using TDD, both in the UL and in the DL (depending on the scenario). We specifically investigate two schemes: reverse-TDD (R-TDD), as proposed originally in [ANC11], where the tier-1 DL is aligned with the tier-2 UL (and vice-versa), and cochannel-TDD (co-TDD), as examined in [HHBD13], where tier-1 UL is aligned with the tier-2 UL (and vice-versa).

It is worthwhile to point out that all the tier-2 nodes can decode the tier-1 BS control signal and use it as common information for coordination. In particular, we assume that the power, rate and positioning (location) information of the active tier-1 users scheduled on the current frame are known to all tier-2 nodes in each frame. We examine the two-tier system performance in terms of the achievable throughput tradeoff region between the tier-1 throughput (sum rate) and the tier-2 throughput (sum rate).
5.1.1 Macrocell subsystem

In each slot, the tier-1 BS serves tier-1 users. These are divided into $G$ groups. Tier-1 users are grouped according to their position in the cell. Since co-located users have the same scattering environment (see the discussion on the validity of this statement in Section 4.1, as well as efficient user grouping algorithms presented in Section 3.2) but are separated by at least a few tens of wavelengths, their channel vectors are mutually independent but have the same second-order statistics, implying that the channel covariance matrices between tier-1 users and the tier-1 BS are identical within the same group. In contrast, provided that the different groups are widely separated in their scattering components, the dominant eigenspaces of the corresponding channel covariance matrices are linearly independent.

The instantaneous channel between a user $k$ in group $g$ (indicated as $g_k$, with some abuse of notation) and the tier-1 BS, at each time-frequency scheduling slot, is an $M \times 1$ Gaussian vector $\mathbf{h}_{g_k}$. Using the Karhunen-Loeve representation, we can write

\[
\mathbf{h}_{g_k} = \mathbf{U}_g \Lambda_g^{1/2} \mathbf{w}_{g_k},
\tag{5.1}
\]

where $\mathbf{R}_g = \mathbf{U}_g \Lambda_g \mathbf{U}_g^{\mathsf{H}}$ is the channel covariance matrix, common to all users in group $g$, $\mathbf{U}_g$ is the tall unitary matrix of eigenmodes, of dimension $M \times r_g$, and $\Lambda_g$ is the $r_g \times r_g$ diagonal positive definite matrix of covariance eigenvalues (Karhunen-Loeve coefficients). Notice that these quantities are common to all users in group $g$. In contrast, the $r_g \times 1$ random vector $\mathbf{w}_{g_k} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{r_g})$ is independent for different users and corresponds to the randomness due to the small-scale multipath fading components. The typical duration over which the channel covariances change is several orders of magnitude larger than the dynamics of small-scale fading. Therefore, for mathematical convenience, we assume $\mathbf{R}_g$ to be fixed in time and consider average rates (i.e., ergodic rates) with respect to the small-scale fading components.
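As an illustration of the Karhunen-Loeve representation (5.1), the following NumPy sketch draws independent channel vectors for the users of one group from a given covariance matrix. It is a minimal example under stated assumptions: the covariance used here is an arbitrary rank-3 matrix built from random vectors (a toy stand-in, not the covariance model of this chapter), and the helper names `kl_factor` and `sample_channels` are hypothetical.

```python
import numpy as np


def kl_factor(R, r):
    """Return U (M x r) and the r dominant eigenvalues of a Hermitian PSD matrix R."""
    eigvals, eigvecs = np.linalg.eigh(R)          # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:r]           # indices of the r largest
    return eigvecs[:, idx], np.clip(eigvals[idx], 0.0, None)


def sample_channels(U, lam, num_users, rng):
    """Draw h = U diag(lam)^(1/2) w with w ~ CN(0, I), one column per user."""
    r = lam.shape[0]
    w = (rng.standard_normal((r, num_users))
         + 1j * rng.standard_normal((r, num_users))) / np.sqrt(2.0)
    return U @ (np.sqrt(lam)[:, None] * w)


rng = np.random.default_rng(0)
M, r = 16, 3                                      # toy dimensions (assumed)

# Toy rank-r covariance, used only to exercise the code.
A = (rng.standard_normal((M, r)) + 1j * rng.standard_normal((M, r))) / np.sqrt(2.0)
R = A @ A.conj().T

U, lam = kl_factor(R, r)
H = sample_channels(U, lam, 20000, rng)           # M x 20000 channel realizations

# The sample covariance of the draws should approach R.
R_hat = H @ H.conj().T / H.shape[1]
rel_err = np.linalg.norm(R_hat - R) / np.linalg.norm(R)
```

All users of the group share `U` and `lam`, while each column of `w` is drawn independently, which mirrors the distinction made above between the slowly varying second-order statistics and the small-scale fading.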
Notice that under the classical Wide-Sense Stationary Uncorrelated Scattering channel model [Mol10], the channel process is wide-sense stationary and therefore its second-order statistics are constant in time, as we assume here. This assumption is valid “locally” when observing the system on the time-scale of a few tens of seconds. In practice, the channel covariance matrices must be adaptively learned and tracked in order to follow the non-stationary time-varying effects in the network (e.g., due to user mobility).

Following Section 2.1, we consider the one-ring scattering model in order to determine $\mathbf{R}_g$. Namely, for a user group located at an angle of arrival $\theta_g$ and having angular spread $\Delta_g$, we have $\mathbf{R}_g = \mathbf{R}(\theta_g, \Delta_g)$ where, assuming a uniform linear array at the tier-1 BS, the element $(m,n)$ of $\mathbf{R}(\theta_g, \Delta_g)$ is given by

\[
[\mathbf{R}(\theta_g, \Delta_g)]_{m,n} = \frac{1}{2\Delta_g} \int_{\theta_g - \Delta_g}^{\theta_g + \Delta_g} e^{-j\pi(m-n)\sin(\alpha)} \, d\alpha.
\tag{5.2}
\]

The total tier-1 BS transmit power is denoted by $P_0$. For analytical simplicity, we assume that the aggregate sum power of all tier-1 users transmitting on the UL is also equal to $P_0$, such that the total radiated power of the tier-1 cell is the same in both UL and DL slots. Also for simplicity we consider equal power allocation, such that each tier-1 BS DL data stream is allocated power $P_0/S$, where $S$ is the total number of DL streams.

5.1.2 Small cell subsystem

Tier-2 cells operate in TDD with no constraint of aligning their DL and UL with the tier-1 DL/UL slots. We assume that tier-2 cells use intra-cell orthogonal access, such that only one tier-2 user per small cell is active over any time-frequency slot. Hence, as far as the tier-2 throughput is concerned, it is sufficient to consider a single user per tier-2 cell.

Using the information obtained from the control channel, the femtocells implement a simple power control strategy in order to mitigate the cross-tier interference.
In our proposed power control strategy, the tier-2 cells (for both UL and DL) adjust their transmission power levels such that the average interference power at the macrocell receiver is not larger than some threshold $\kappa$, which we call the “interference temperature” of tier-2 on tier-1 [ANC11]. Furthermore, a peak transmit power $P_1$ is imposed on all tier-2 cells. In the tier-1 DL slot, only the tier-2 cells close to the active tier-1 users need to lower their transmission power below $P_1$ in order to satisfy the interference temperature constraint. Since the set of active tier-1 users changes randomly from slot to slot, depending on the DL scheduling of the tier-1 BS, the set of tier-2 cells that have to transmit at very low power also changes with time, thus obtaining a sort of statistical multiplexing in the spatial cell area. However, in the tier-1 UL, the tier-2 cells close to the tier-1 BS are required to significantly lower their transmission power on all slots, since the tier-1 BS does not change in time. Hence, the tier-2 cells close to the BS are permanently at a disadvantage.

It is worthwhile to note that during tier-1 DL slots, when $G = 1$, only a small number of tier-2 cells around the (single) active tier-1 user group transmit at low power, but as $G$ increases, more and more tier-2 cells need to lower their transmit power, resulting in a degradation of the tier-2 throughput. The same problem occurs in tier-1 UL slots, due to the presence of more user groups, which cause a lot of interference to their nearby tier-2 cells. Hence, we expect to see a tradeoff between the tier-1 and the tier-2 throughputs. In the following, we investigate this achievable throughput tradeoff region under a certain class of efficient MIMO precoding schemes that exploit the channel directionality properties discussed above.
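The power control rule described above (and formalized later in (5.16)) can be sketched in a few lines. The sketch assumes the distance-dependent pathloss $a = 1/(1+(d/d_0)^{3.5})$ with $d_0 = 50$ m and the peak power $P_1 = 20$ dB from Section 5.4, and ignores wall absorption. The node positions, the value of $\kappa$ and the function names are hypothetical and serve only to make the example runnable.

```python
import numpy as np

def path_gain(d, d0=50.0, exponent=3.5):
    """Distance-dependent path gain a = 1 / (1 + (d/d0)^exponent)."""
    return 1.0 / (1.0 + (d / d0) ** exponent)

def tier2_power(a, kappa, p_peak):
    """a[g, f] is the path gain between protected receiver g and tier-2 transmitter f.
    Each transmitter uses the largest power, capped at p_peak, that keeps its
    average interference at every protected receiver at or below kappa."""
    a_max = a.max(axis=0)                      # strongest victim of each transmitter
    return np.minimum(p_peak, kappa / a_max)

rng = np.random.default_rng(1)
victims = rng.uniform(-500.0, 500.0, size=(3, 2))   # hypothetical protected receivers (m)
cells = rng.uniform(-500.0, 500.0, size=(8, 2))     # hypothetical tier-2 transmitters (m)
dist = np.linalg.norm(victims[:, None, :] - cells[None, :, :], axis=-1)
a = path_gain(dist)

P1 = 10.0 ** (20.0 / 10.0)    # 20 dB peak power, linear scale
kappa = 1.0                   # hypothetical interference temperature (0 dB)
P = tier2_power(a, kappa, P1)
print(np.round(10.0 * np.log10(P), 1))   # per-cell transmit power in dB
```

Transmitters far from every protected receiver simply transmit at the peak power, while nearby ones are throttled, which is the mechanism behind the tradeoff discussed in the text.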
5.2 System performance: Reverse TDD

We denote the set of user groups by $\mathcal{G}$ and the set of tier-2 cells by $\mathcal{C}$, of cardinality $|\mathcal{G}| = G$ and $|\mathcal{C}| = F$, respectively. For R-TDD we focus on the case of tier-1 DL and tier-2 UL only. In fact, for the other direction (tier-1 UL and tier-2 DL), we exploit uplink-downlink duality and achieve exactly the same rates with the same sum power (see [ANC11] and references therein for details).

5.2.1 Tier-1 DL

Since the rank of the channel covariance matrix $\mathbf{R}_g$ is $r_g$, the number of simultaneously active users that can be served using multiuser MIMO (MU-MIMO) (i.e., the number of spatial multiplexing data streams in the tier-1 DL to users in group $g$) is $S_g \le \min\{K_g, r_g\}$, where $K_g$ is the number of users in group $g$. The covariance rank $r_g$ is related to the angle of arrival $\theta_g$ and angular spread $\Delta_g$ for that particular group, and can be sharply characterized in closed form in the limit of large $M$ (see Section 2.6). The received signal at user $k$ of group $g$ is given by

$$y_{g_k} = \sqrt{a_{g,0}} \, \mathbf{h}_{g_k}^{H} \mathbf{V} \mathbf{d} + \sum_{f \in \mathcal{C}} \sqrt{a_{g,f}} \, h_{g_k,f} \, x_f + z_{g_k}, \tag{5.3}$$

where $\mathbf{V}$ is the $M \times S$ beamforming matrix of the tier-1 BS, with $S = \sum_{g=1}^{G} S_g$, $\mathbf{d}$ is the $S \times 1$ vector of data symbols, $h_{g_k,f} \sim \mathcal{CN}(0,1)$ is a complex coefficient representing the scalar channel between the tier-1 user $g_k$ and the tier-2 user located in cell $f \in \mathcal{C}$, and $z_{g_k} \sim \mathcal{CN}(0,1)$ is the AWGN at the tier-1 user receiver. The path gain coefficients $a_{g,f}$ include log-normal shadowing, distance-dependent pathloss, and other possible geometric effects such as wall absorption (e.g., in the case where the tier-2 cell is an indoor femtocell and the tier-1 user is outdoor), between users in group $g$ and BS $f$. We use index 0 to indicate the tier-1 BS.
Depending on $\mathbf{R}_g$, we consider two different beamforming schemes as outlined below.

Isotropic scattering (i.e., $\mathbf{R}_g = \mathbf{I}_M$, $r_g = M$): In this case, the tier-1 BS uses zero-forcing beamforming, such that $\mathbf{V}$ is the column-normalized Moore-Penrose pseudo-inverse of the channel matrix $\mathbf{H} = [\mathbf{H}_1 \, \mathbf{H}_2 \, \ldots \, \mathbf{H}_G]$, where $\mathbf{H}_g$ denotes the $M \times S_g$ channel matrix formed by the channel vectors of the active users in group $g$. As a result, (5.3) becomes

$$y_{g_k} = \sqrt{a_{g,0}} \, \mathbf{h}_{g_k}^{H} \mathbf{v}_{g_k} d_{g_k} + \sum_{f \in \mathcal{C}} \sqrt{a_{g,f}} \, h_{g_k,f} \, x_f + z_{g_k}, \tag{5.4}$$

where the intra-cell tier-1 multiuser interference disappears as a result of zero-forcing beamforming.

Directional scattering (i.e., $\mathbf{R}_g = \mathbf{U}_g \boldsymbol{\Lambda}_g \mathbf{U}_g^{H}$, with $\boldsymbol{\Lambda}_g$ strongly skewed such that only $r_g < M$ eigenmodes collect significant energy): In this case, the tier-1 BS employs JSDM with per-group processing (Section 2.2) with $\mathbf{V} = [\mathbf{B}_1 \, \ldots \, \mathbf{B}_G] \, \mathrm{diag}(\mathbf{P}_1, \ldots, \mathbf{P}_G)$, where $\mathbf{B}_g$ is the $M \times b_g$ pre-beamforming matrix associated with group $g$ and $\mathbf{P}_g$ is the $b_g \times S_g$ precoding matrix obtained by zero forcing on the effective channel $\mathbf{B}_g^{H} \mathbf{H}_g$ of group $g$ after pre-beamforming. In addition, the pre-beamforming matrix $\mathbf{B}_g$ depends only on the channel second-order statistics $\mathbf{R}_g$, which can be learned accurately over a relatively long time interval, and the pre-beamforming transformation can be implemented in the RF analog domain (hybrid analog-digital beamforming), thus allowing the tier-1 BS to have a very large number of antennas $M$ while having a moderate number $b = \sum_{g} b_g$ of RF chains. The advantages of the JSDM architecture are discussed in previous chapters.
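The two-stage structure of JSDM with per-group processing can be illustrated for a single group. The sketch below is a simplification under stated assumptions: the group covariance is a synthetic rank-$r$ matrix rather than a one-ring covariance, the pre-beamformer is taken as $\mathbf{B}_g = \mathbf{U}_g$ (so $b_g = r_g$), and inter-group terms are left out. All dimensions and helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def crandn(*shape):
    """i.i.d. CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

M, r, S_g = 32, 6, 4       # antennas, covariance rank, streams (hypothetical values)

# Synthetic rank-r covariance R_g = U diag(lam) U^H (stand-in for the one-ring model).
U, _ = np.linalg.qr(crandn(M, r))        # M x r matrix with orthonormal columns
lam = rng.uniform(0.5, 2.0, size=r)

# Channels of the S_g active users via the Karhunen-Loeve form (5.1).
H_g = U @ (np.sqrt(lam)[:, None] * crandn(r, S_g))      # M x S_g

# Stage 1: pre-beamforming that depends only on second-order statistics.
B_g = U
# Stage 2: zero forcing on the reduced-dimension effective channel B_g^H H_g.
H_eff = B_g.conj().T @ H_g                               # b_g x S_g
P_g = H_eff @ np.linalg.inv(H_eff.conj().T @ H_eff)      # pseudo-inverse based precoder
V_g = B_g @ P_g
V_g = V_g / np.linalg.norm(V_g, axis=0)                  # unit-norm beamforming vectors

gains = H_g.conj().T @ V_g          # entry (k, j) is h_k^H v_j
off_diag = gains - np.diag(np.diag(gains))
print("largest intra-group leakage:", np.max(np.abs(off_diag)))
```

The instantaneous channel knowledge needed by the second stage has dimension $b_g \times S_g$ rather than $M \times S_g$, which is the dimensionality reduction the text refers to.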
The received signal (5.3) thus takes on the form

\begin{align}
y_{g_k} &= \sqrt{a_{g,0}} \, \mathbf{h}_{g_k}^{H} \mathbf{B}_g \mathbf{p}_{g_k} d_{g_k} + \sum_{g' \in \mathcal{G} : g' \neq g} \; \sum_{m=1}^{S_{g'}} \sqrt{a_{g,0}} \, \mathbf{h}_{g_k}^{H} \mathbf{B}_{g'} \mathbf{p}_{g'_m} d_{g'_m} \tag{5.5} \\
&\quad + \sum_{f \in \mathcal{C}} \sqrt{a_{g,f}} \, h_{g_k,f} \, x_f + z_{g_k}, \tag{5.6}
\end{align}

where the intra-group interference disappears due to zero-forcing beamforming, but we have an inter-group interference term in (5.5) in addition to the inter-tier interference in (5.6) due to the tier-2 cells.

Achievable rate with isotropic scattering: From (5.4), the received SINR at a user $g_k$ is given as

$$\mathrm{SINR}_{g_k} = \frac{a_{g,0} \, |\mathbf{h}_{g_k}^{H} \mathbf{v}_{g_k}|^2 \, P_0/S}{1 + \sum_{f \in \mathcal{C}} a_{g,f} \, |h_{g_k,f}|^2 \, P_f}, \tag{5.7}$$

where $P_f$ is the transmit power of the tier-2 user in cell $f$. Using Jensen’s inequality, the achievable ergodic rate for user $g_k$ can be lower-bounded by

$$E[\log_2(1 + \mathrm{SINR}_{g_k})] \ge E\left[ \log_2\left( 1 + \frac{a_{g,0} \, |\mathbf{h}_{g_k}^{H} \mathbf{v}_{g_k}|^2 \, P_0/S}{1 + \sum_{f \in \mathcal{C}} a_{g,f} P_f} \right) \right] \triangleq R^{\mathrm{DL-1}}_{g_k,\mathrm{IS}}. \tag{5.8}$$

Using the method of deterministic equivalents (see [WCDS12]), a simple approximation to (5.8) for $M \to \infty$ is given by

$$R^{\mathrm{DL-1}}_{g_k,\mathrm{IS-DA}} = \log_2\left( 1 + \frac{(M - S + 1) \, a_{g,0} \frac{P_0}{S}}{1 + \sum_{f \in \mathcal{C}} a_{g,f} P_f} \right), \tag{5.9}$$

where it can be shown that $R^{\mathrm{DL-1}}_{g_k,\mathrm{IS}} - R^{\mathrm{DL-1}}_{g_k,\mathrm{IS-DA}} \to 0$ as $M \to \infty$.

Achievable rate with directional scattering: Proceeding in a similar fashion, we have that the achievable ergodic rate for user $g_k$ is lower bounded by

$$R^{\mathrm{DL-1}}_{g_k,\mathrm{DS}} = E\left[ \log_2\left( 1 + \frac{a_{g,0} \, |\mathbf{h}_{g_k}^{H} \mathbf{B}_g \mathbf{p}_{g_k}|^2 \, P_0/S}{1 + \sum_{g' \neq g} a_{g,0} \, |\mathbf{h}_{g_k}^{H} \mathbf{B}_{g'} \mathbf{p}_{g'_k}|^2 \, P_0/S + \sum_{f \in \mathcal{C}} a_{g,f} P_f} \right) \right]. \tag{5.10}$$

Following again the approach in [WCDS12], an easy-to-compute approximation of $R^{\mathrm{DL-1}}_{g_k,\mathrm{DS}}$ for $M \to \infty$ is given by (see also Section 2.4 for details)

$$R^{\mathrm{DL-1}}_{g_k,\mathrm{DS-DA}} = \log_2\left( 1 + \frac{b_g m_g \, a_{g,0} \frac{P_0}{S}}{1 + \sum_{g' \neq g} I_{g,g'} \, a_{g,0} \frac{P_0}{S} + \sum_{f \in \mathcal{C}} a_{g,f} P_f} \right), \tag{5.11}$$

where $m_g$ is obtained as the solution of the fixed-point equation

$$\begin{aligned}
m_g &= \frac{1}{b_g} \mathrm{tr}\left( \mathbf{B}_g^{H} \mathbf{R}_g \mathbf{B}_g \mathbf{T}_g^{-1} \right), \\
\mathbf{T}_g &= \mathbf{I}_{b_g} + \frac{S_g}{b_g} \frac{\mathbf{B}_g^{H} \mathbf{R}_g \mathbf{B}_g}{m_g},
\end{aligned} \tag{5.12}$$

and $\{I_{g,g'} : \forall \, g, g' \in \mathcal{G}\}$ is the solution of the system of coupled fixed-point equations

$$\begin{aligned}
I_{g,g'} &= \frac{S_{g'} \, n_{g,g'}}{m_{g'}}, \\
n_{g,g'} &= \frac{\frac{1}{b_{g'}} \mathrm{tr}\left( \mathbf{B}_{g'}^{H} \mathbf{R}_{g'} \mathbf{B}_{g'} \mathbf{T}_{g'}^{-1} \mathbf{B}_{g'}^{H} \mathbf{R}_{g} \mathbf{B}_{g'} \mathbf{T}_{g'}^{-1} \right)}{1 - J_{g'}}, \\
J_{g'} &= \frac{\frac{1}{b_{g'}} \mathrm{tr}\left( \mathbf{B}_{g'}^{H} \mathbf{R}_{g'} \mathbf{B}_{g'} \mathbf{T}_{g'}^{-1} \mathbf{B}_{g'}^{H} \mathbf{R}_{g'} \mathbf{B}_{g'} \mathbf{T}_{g'}^{-1} \right)}{b_{g'} \, m_{g'}^2}.
\end{aligned} \tag{5.13}$$

5.2.2 Tier-2 UL

Tier-2 cells schedule their respective UL slots when the tier-1 cell is in the DL slot. As a result, a tier-2 BS receiving its desired user signal suffers from both intra-tier interference from the tier-2 users in neighboring cells and inter-tier interference from the tier-1 BS. The received signal at the receiver of tier-2 BS $f$ is given by

$$\mathbf{y}_f = \mathbf{h}_{f,f} \, x_f + \sum_{f' \neq f} \sqrt{a_{f,f'}} \, \mathbf{h}_{f,f'} \, x_{f'} + \sqrt{a_{f,0}} \, \mathbf{H}_{f,0}^{H} \mathbf{V} \mathbf{d} + \mathbf{z}_f, \tag{5.14}$$

where $\mathbf{h}_{f,f'}$ denotes the $L \times 1$ channel between the tier-2 BS $f$ receiving antenna array and a tier-2 user in cell $f'$, $x_f$ denotes the scalar symbol transmitted by a tier-2 user in cell $f$, $\mathbf{H}_{f,0}$ is the $M \times L$ channel matrix between the tier-1 BS and the tier-2 BS $f$, and $\mathbf{z}_f$ is an AWGN vector. The channel matrix $\mathbf{H}_{f,0}$ depends on the type of scattering in the propagation from the tier-1 BS to the tier-2 BS. Instead, the channel between the tier-2 users and the tier-2 BS is assumed to have isotropic scattering, since the tier-2 BS is surrounded by scattering elements with uniform angles of arrival.¹ Hence, we let $\mathbf{h}_{f,f'} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_L)$. Finally, without loss of generality, we normalize the channel gain coefficient inside each tier-2 cell by $a_{f,f} = 1$, because the small size of tier-2 cells means that all users inside such cells have roughly the same path loss.
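Returning to the tier-1 DL analysis, the scalar fixed point in (5.12) lends itself to plain successive substitution. The sketch below assumes that $\mathbf{T}_g$ is read as $\mathbf{I}_{b_g} + \frac{S_g}{b_g} \mathbf{B}_g^{H} \mathbf{R}_g \mathbf{B}_g / m_g$, and that $S_g < b_g$. Under this reading, an identity effective covariance gives $m_g = 1 - S_g/b_g$, so that $b_g m_g = b_g - S_g$ agrees to leading order with the factor $M - S + 1$ in (5.9). The function name, starting point and stopping rule are hypothetical.

```python
import numpy as np

def solve_m(A, S_g, n_iter=500, tol=1e-12):
    """Successive substitution for m = (1/b) tr(A T^{-1}), T = I + (S_g/b) A / m,
    where A = B_g^H R_g B_g is a b x b Hermitian positive semidefinite matrix."""
    b = A.shape[0]
    lam = np.linalg.eigvalsh(A)        # A and T share eigenvectors, so work with eigenvalues
    c = S_g / b
    m = 1.0                            # hypothetical starting point
    for _ in range(n_iter):
        m_new = np.mean(lam / (1.0 + c * lam / m))
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m

# Sanity check with an identity effective covariance: expect m = 1 - S_g / b_g.
b_g, S_g = 8, 4
m_identity = solve_m(np.eye(b_g), S_g)
print(m_identity, 1.0 - S_g / b_g)
```

For a general covariance the same loop applies unchanged, since only the eigenvalues of $\mathbf{B}_g^{H} \mathbf{R}_g \mathbf{B}_g$ enter the trace.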
¹ Think of an indoor femtocell, or a small BS at low elevation in an urban square.

The tier-2 BSs use linear MMSE in order to detect their intended user. The resulting SINR is given by

$$\mathrm{SINR}_f = \mathbf{h}_{f,f}^{H} \left( \mathbf{I}_L + \sum_{f' \neq f} a_{f,f'} \, \mathbf{h}_{f,f'} \mathbf{h}_{f,f'}^{H} P_{f'} + a_{f,0} \, \mathbf{H}_{f,0}^{H} \mathbf{V} \mathbf{V}^{H} \mathbf{H}_{f,0} \frac{P_0}{S} \right)^{-1} \mathbf{h}_{f,f} \, P_f. \tag{5.15}$$

Recall that the transmit power $P_f$ is controlled as a function of the interference temperature $\kappa$. In particular, we let

$$P_f = \min\left\{ P_1, \frac{\kappa}{a_f^{\max}} \right\}, \qquad a_f^{\max} = \max\{ a_{g,f} : g \in \mathcal{G} \}. \tag{5.16}$$

Obviously, $a_{g,f} P_f \le \frac{a_{g,f}}{a_f^{\max}} \kappa \le \kappa$, such that the average inter-tier interference caused by a tier-2 user $f$ to an active tier-1 user in group $g$ is at most $\kappa$. The achievable ergodic rate for tier-2 cell $f$ is given by $R^{\mathrm{UL-2}}_f = E[\log_2(1 + \mathrm{SINR}_f)]$, where the SINR is given by (5.15). The statistics of $\mathbf{H}_{f,0}$ (i.e., the channel matrix between the tier-1 BS and a tier-2 BS) depend on the type of scattering landscape. As for the case of tier-1 users, we consider both isotropic and directional scattering.

Isotropic Scattering: In this case, $\mathbf{H}_{f,0}$ has i.i.d. elements $\sim \mathcal{CN}(0,1)$. As said before, the tier-1 BS beamforming matrix is given by $\mathbf{V} = \mathbf{H} (\mathbf{H}^{H} \mathbf{H})^{-1} \boldsymbol{\Xi}$, where $\boldsymbol{\Xi}$ is a diagonal matrix that makes the columns of $\mathbf{H} (\mathbf{H}^{H} \mathbf{H})^{-1}$ have unit norm. Letting $\mathbf{H} = \mathbf{U}_0 \boldsymbol{\Sigma}_0^{1/2} \mathbf{V}_0^{H}$ denote the SVD of $\mathbf{H}$, we have

$$\begin{aligned}
\mathbf{H}_{f,0}^{H} \mathbf{V} \mathbf{V}^{H} \mathbf{H}_{f,0} &= \mathbf{H}_{f,0}^{H} \mathbf{H} (\mathbf{H}^{H} \mathbf{H})^{-1} \boldsymbol{\Xi}^2 (\mathbf{H}^{H} \mathbf{H})^{-1} \mathbf{H}^{H} \mathbf{H}_{f,0} \\
&= \mathbf{H}_{f,0}^{H} \mathbf{U}_0 \boldsymbol{\Sigma}_0^{-1/2} \mathbf{V}_0^{H} \boldsymbol{\Xi}^2 \mathbf{V}_0 \boldsymbol{\Sigma}_0^{-1/2} \mathbf{U}_0^{H} \mathbf{H}_{f,0} \\
&= \tilde{\mathbf{H}}_{f,0} \boldsymbol{\Sigma}_0^{-1/2} \mathbf{V}_0^{H} \boldsymbol{\Xi}^2 \mathbf{V}_0 \boldsymbol{\Sigma}_0^{-1/2} \tilde{\mathbf{H}}_{f,0}^{H},
\end{aligned} \tag{5.17}$$

where $\tilde{\mathbf{H}}_{f,0} = \mathbf{H}_{f,0}^{H} \mathbf{U}_0$ is also Gaussian i.i.d., by the well-known unitary invariance property of Gaussian i.i.d. matrices.
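Hence, we can write the tier-2 UL achievable rate in the form

$$R^{\mathrm{UL-2}}_f = E\left[ \log_2\left( 1 + \mathbf{h}_{f,f}^{H} \left( \mathbf{I}_L + \sum_{f' \neq f} a_{f,f'} \, \mathbf{h}_{f,f'} \mathbf{h}_{f,f'}^{H} P_{f'} + a_{f,0} \, \tilde{\mathbf{H}}_{f,0} \boldsymbol{\Sigma}_0^{-1/2} \mathbf{V}_0^{H} \boldsymbol{\Xi}^2 \mathbf{V}_0 \boldsymbol{\Sigma}_0^{-1/2} \tilde{\mathbf{H}}_{f,0}^{H} \frac{P_0}{S} \right)^{-1} \mathbf{h}_{f,f} \, P_f \right) \right]. \tag{5.18}$$

Using the general results of [CDS11], we can obtain a convergent approximation $R^{\mathrm{UL-2}}_{f,\mathrm{IS,DA}}$ such that $R^{\mathrm{UL-2}}_f - R^{\mathrm{UL-2}}_{f,\mathrm{IS,DA}} \to 0$ as $M \to \infty$, with

$$R^{\mathrm{UL-2}}_{f,\mathrm{IS,DA}} = \log_2(1 + \zeta_f P_f), \tag{5.19}$$

where $\zeta_f$ is the solution of the following fixed-point equation

$$\zeta_f = \left[ 1 + \sum_{f' \neq f} \frac{a_{f,f'} P_{f'}}{1 + L \, a_{f,f'} P_{f'} \zeta_f} + \frac{S}{L \zeta_f} \left( 1 - \eta_{(\bar{\mathbf{H}}^{H} \bar{\mathbf{H}})^{-1}}\left( \frac{L (M - S + 1)}{S^2} a_{f,0} P_0 \zeta_f \right) \right) \right]^{-1}, \tag{5.20}$$

where $\eta_{(\bar{\mathbf{H}}^{H} \bar{\mathbf{H}})^{-1}}(\gamma)$ denotes the $\eta$-transform (see [TV04]) of the asymptotic eigenvalue distribution of $(\bar{\mathbf{H}}^{H} \bar{\mathbf{H}})^{-1}$, with $\bar{\mathbf{H}} = \frac{1}{\sqrt{S}} \mathbf{H}$.

Directional Scattering: In this case, we resort to an accurate and easy-to-compute lower bound on the achievable rate, which significantly simplifies calculations. First, we need the following auxiliary lemma.

Lemma 4. Let $\mathbf{A}$ take on values in the cone of positive definite Hermitian symmetric matrices, and let $\mathbf{x}$ be a complex vector. Then, the function $f(\mathbf{A}) = \log(1 + \mathbf{x}^{H} \mathbf{A}^{-1} \mathbf{x})$ is convex in $\mathbf{A}$.

Proof. The proof follows by showing that, for any positive definite matrices $\mathbf{A}_1$ and $\mathbf{A}_2$, vector $\mathbf{x}$ and $0 \le \lambda \le 1$, we have

$$\log\left( 1 + \mathbf{x}^{H} (\lambda \mathbf{A}_1 + (1 - \lambda) \mathbf{A}_2)^{-1} \mathbf{x} \right) \le \lambda \log\left( 1 + \mathbf{x}^{H} \mathbf{A}_1^{-1} \mathbf{x} \right) + (1 - \lambda) \log\left( 1 + \mathbf{x}^{H} \mathbf{A}_2^{-1} \mathbf{x} \right). \tag{5.21}$$

Since $\mathbf{A}_1$ and $\mathbf{A}_2$ are positive definite, we have from [Mui74]

$$\mathbf{x}^{H} (\lambda \mathbf{A}_1 + (1 - \lambda) \mathbf{A}_2)^{-1} \mathbf{x} \le \left( \mathbf{x}^{H} \mathbf{A}_1^{-1} \mathbf{x} \right)^{\lambda} \left( \mathbf{x}^{H} \mathbf{A}_2^{-1} \mathbf{x} \right)^{1 - \lambda}. \tag{5.22}$$

Denoting $a = \mathbf{x}^{H} \mathbf{A}_1^{-1} \mathbf{x}$ and $b = \mathbf{x}^{H} \mathbf{A}_2^{-1} \mathbf{x}$, we have

$$\begin{aligned}
\log\left( 1 + \mathbf{x}^{H} (\lambda \mathbf{A}_1 + (1 - \lambda) \mathbf{A}_2)^{-1} \mathbf{x} \right) &\le \log\left( 1 + a^{\lambda} b^{1 - \lambda} \right) \\
&\overset{(a)}{\le} \log\left( (1 + a)^{\lambda} (1 + b)^{1 - \lambda} \right) \\
&= \lambda \log(1 + a) + (1 - \lambda) \log(1 + b) \\
&= \lambda \log\left( 1 + \mathbf{x}^{H} \mathbf{A}_1^{-1} \mathbf{x} \right) + (1 - \lambda) \log\left( 1 + \mathbf{x}^{H} \mathbf{A}_2^{-1} \mathbf{x} \right),
\end{aligned}$$

where (a) is due to the generalized Hölder inequality.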
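A quick numerical sanity check of Lemma 4 can be run on random instances. This is only an empirical illustration of the inequality (5.21), not a substitute for the proof. The matrix size, the number of trials and the way positive definite matrices are drawn are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_pd(n):
    """Random Hermitian positive definite matrix (X X^H plus a small ridge)."""
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X @ X.conj().T + 0.1 * np.eye(n)

def f(A, x):
    """f(A) = log(1 + x^H A^{-1} x) from Lemma 4."""
    return np.log1p(np.real(x.conj() @ np.linalg.solve(A, x)))

n = 5
worst_gap = np.inf
for _ in range(200):
    A1, A2 = random_pd(n), random_pd(n)
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    lam = rng.uniform()
    lhs = f(lam * A1 + (1.0 - lam) * A2, x)
    rhs = lam * f(A1, x) + (1.0 - lam) * f(A2, x)
    worst_gap = min(worst_gap, rhs - lhs)     # convexity requires rhs - lhs >= 0

print("smallest observed gap:", worst_gap)
```

A non-negative smallest gap (up to rounding) is what the lemma predicts for every draw.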
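Applying Jensen’s inequality to the expectation over $\mathbf{V}$, which is justified by the convexity established in Lemma 4, we can write

$$R^{\mathrm{UL-2}}_f \ge E\left[ \log_2\left( 1 + \mathbf{h}_{f,f}^{H} \left( \mathbf{I}_L + \sum_{f' \neq f} a_{f,f'} \, \mathbf{h}_{f,f'} \mathbf{h}_{f,f'}^{H} P_{f'} + a_{f,0} \, \mathbf{H}_{f,0}^{H} E[\mathbf{V} \mathbf{V}^{H}] \mathbf{H}_{f,0} \frac{P_0}{S} \right)^{-1} \mathbf{h}_{f,f} \, P_f \right) \right] \triangleq R^{\mathrm{UL-2}}_{f,\mathrm{LB}}. \tag{5.23}$$

Recall that in this case the tier-1 BS beamforming matrix $\mathbf{V}$ takes on the form $\mathbf{V} = [\mathbf{B}_1 \, \ldots \, \mathbf{B}_G] \, \mathrm{diag}(\mathbf{P}_1, \ldots, \mathbf{P}_G)$, with $\mathbf{P}_g$ given by the normalized Moore-Penrose pseudo-inverse of $\mathbf{B}_g^{H} \mathbf{H}_g$. Also, the channel matrix $\mathbf{H}_{f,0}$ is given by

$$\mathbf{H}_{f,0} = \mathbf{R}_{f,0}^{1/2} \mathbf{W}_{f,0}, \tag{5.24}$$

where $\mathbf{R}_{f,0} = \mathbf{U}_{f,0} \boldsymbol{\Lambda}_{f,0} \mathbf{U}_{f,0}^{H}$ is the $M \times M$ covariance matrix of the channels between the tier-1 BS array and each antenna of the tier-2 BS array, and $\mathbf{W}_{f,0}$ is an $M \times L$ matrix with i.i.d. $\mathcal{CN}(0,1)$ elements. From a geometric viewpoint, the tier-2 BS antennas are analogous to $L$ co-located users, such that the columns of $\mathbf{H}_{f,0}$ are mutually independent, assuming that the antenna elements are sufficiently separated.

In order to evaluate (5.23), we first need to compute $E[\mathbf{V} \mathbf{V}^{H}]$. Note that

$$\begin{aligned}
E[\mathbf{V} \mathbf{V}^{H}] &= \sum_{g=1}^{G} E\left[ \mathbf{B}_g \mathbf{P}_g \mathbf{P}_g^{H} \mathbf{B}_g^{H} \right] \\
&= \sum_{g=1}^{G} E\left[ \mathbf{B}_g \mathbf{B}_g^{H} \mathbf{H}_g \left( \mathbf{H}_g^{H} \mathbf{B}_g \mathbf{B}_g^{H} \mathbf{H}_g \right)^{-2} \mathbf{H}_g^{H} \mathbf{B}_g \mathbf{B}_g^{H} \right] \\
&= \sum_{g=1}^{G} E\left[ \mathbf{B}_g \tilde{\mathbf{U}}_g^{H} \tilde{\boldsymbol{\Lambda}}_g^{1/2} \tilde{\mathbf{W}}_g^{H} \left( \tilde{\mathbf{W}}_g \tilde{\boldsymbol{\Lambda}}_g \tilde{\mathbf{W}}_g^{H} \right)^{-2} \tilde{\mathbf{W}}_g \tilde{\boldsymbol{\Lambda}}_g^{1/2} \tilde{\mathbf{U}}_g \mathbf{B}_g^{H} \right] \\
&= \sum_{g=1}^{G} \mathbf{B}_g \tilde{\mathbf{U}}_g^{H} \, E\left[ \tilde{\boldsymbol{\Lambda}}_g^{1/2} \tilde{\mathbf{W}}_g^{H} \left( \tilde{\mathbf{W}}_g \tilde{\boldsymbol{\Lambda}}_g \tilde{\mathbf{W}}_g^{H} \right)^{-2} \tilde{\mathbf{W}}_g \tilde{\boldsymbol{\Lambda}}_g^{1/2} \right] \tilde{\mathbf{U}}_g \mathbf{B}_g^{H},
\end{aligned} \tag{5.25}$$

where $\mathbf{B}_g^{H} \mathbf{H}_g = \tilde{\mathbf{U}}_g^{H} \tilde{\boldsymbol{\Lambda}}_g^{1/2} \tilde{\mathbf{W}}_g^{H}$, with $\tilde{\boldsymbol{\Lambda}}_g$ diagonal of dimension $b_g \times b_g$ and $\tilde{\mathbf{W}}_g$ i.i.d. Gaussian of dimension $K_g \times b_g$, so that $\tilde{\mathbf{W}}_g^{H}$ is $b_g \times K_g$.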
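Letting $\mathbf{K}_g = E\left[ \tilde{\boldsymbol{\Lambda}}_g^{1/2} \tilde{\mathbf{W}}_g^{H} \left( \tilde{\mathbf{W}}_g \tilde{\boldsymbol{\Lambda}}_g \tilde{\mathbf{W}}_g^{H} \right)^{-2} \tilde{\mathbf{W}}_g \tilde{\boldsymbol{\Lambda}}_g^{1/2} \right]$, we can write the $(i,j)$-th element as

$$\begin{aligned}
[\mathbf{K}_g]_{i,j} &= E\left[ \sqrt{\tilde{\lambda}_{g,i} \tilde{\lambda}_{g,j}} \; \tilde{\mathbf{w}}_{g,i}^{H} \left( \tilde{\mathbf{W}}_g \tilde{\boldsymbol{\Lambda}}_g \tilde{\mathbf{W}}_g^{H} \right)^{-2} \tilde{\mathbf{w}}_{g,j} \right] \\
&= E\left[ \sqrt{\tilde{\lambda}_{g,i} \tilde{\lambda}_{g,j}} \; \tilde{\mathbf{w}}_{g,i}^{H} \left( \sum_{k=1}^{b_g} \tilde{\lambda}_{g,k} \tilde{\mathbf{w}}_{g,k} \tilde{\mathbf{w}}_{g,k}^{H} \right)^{-2} \tilde{\mathbf{w}}_{g,j} \right] \\
&\overset{(a)}{=} E\left[ \frac{\sqrt{\tilde{\lambda}_{g,i}} \; \tilde{\mathbf{w}}_{g,i}^{H} \left( \sum_{k=1, k \neq i}^{b_g} \tilde{\lambda}_{g,k} \tilde{\mathbf{w}}_{g,k} \tilde{\mathbf{w}}_{g,k}^{H} \right)^{-1}}{1 + \tilde{\lambda}_{g,i} \tilde{\mathbf{w}}_{g,i}^{H} \left( \sum_{k=1, k \neq i}^{b_g} \tilde{\lambda}_{g,k} \tilde{\mathbf{w}}_{g,k} \tilde{\mathbf{w}}_{g,k}^{H} \right)^{-1} \tilde{\mathbf{w}}_{g,i}} \times \frac{\left( \sum_{k=1, k \neq j}^{b_g} \tilde{\lambda}_{g,k} \tilde{\mathbf{w}}_{g,k} \tilde{\mathbf{w}}_{g,k}^{H} \right)^{-1} \tilde{\mathbf{w}}_{g,j} \sqrt{\tilde{\lambda}_{g,j}}}{1 + \tilde{\lambda}_{g,j} \tilde{\mathbf{w}}_{g,j}^{H} \left( \sum_{k=1, k \neq j}^{b_g} \tilde{\lambda}_{g,k} \tilde{\mathbf{w}}_{g,k} \tilde{\mathbf{w}}_{g,k}^{H} \right)^{-1} \tilde{\mathbf{w}}_{g,j}} \right],
\end{aligned}$$

where $\tilde{\mathbf{w}}_{g,k}$ denotes the $k$-th column of $\tilde{\mathbf{W}}_g$, $\tilde{\lambda}_{g,k}$ the $(k,k)$-th entry of $\tilde{\boldsymbol{\Lambda}}_g$, and (a) follows by the matrix inversion lemma. We will evaluate $[\mathbf{K}_g]_{i,j}$ in the large-system limit when $S_g, b_g \to \infty$ with a fixed ratio. After some rather heavy algebra, it can be shown that the off-diagonal entries of $\mathbf{K}_g$ go to zero, and that the diagonal entries are given by

$$[\mathbf{K}_g]_{i,i} = E\left[ \frac{\tilde{\lambda}_{g,i} \; \tilde{\mathbf{w}}_{g,i}^{H} \left( \sum_{k=1, k \neq i}^{b_g} \tilde{\lambda}_{g,k} \tilde{\mathbf{w}}_{g,k} \tilde{\mathbf{w}}_{g,k}^{H} \right)^{-2} \tilde{\mathbf{w}}_{g,i}}{\left( 1 + \tilde{\lambda}_{g,i} \tilde{\mathbf{w}}_{g,i}^{H} \left( \sum_{k=1, k \neq i}^{b_g} \tilde{\lambda}_{g,k} \tilde{\mathbf{w}}_{g,k} \tilde{\mathbf{w}}_{g,k}^{H} \right)^{-1} \tilde{\mathbf{w}}_{g,i} \right)^{2}} \right]. \tag{5.26}$$

Using the method of [WCDS12], a convergent approximation $[\mathbf{K}^{o}_g]_{i,i}$ such that $[\mathbf{K}_g]_{i,i} - [\mathbf{K}^{o}_g]_{i,i} \to 0$ as $b_g, S_g \to \infty$ takes on the form

$$\begin{aligned}
[\mathbf{K}^{o}_g]_{i,i} &= \frac{\tilde{\lambda}_{g,i} \, S}{1 + \tilde{\lambda}_{g,i} f_g} \cdot \frac{1}{\sum_{k=1}^{b_g} \frac{\tilde{\lambda}_{g,k}}{1 + \tilde{\lambda}_{g,k} f_g}}, \\
f_g &= \frac{S}{\sum_{k=1}^{b_g} \frac{\tilde{\lambda}_{g,k}}{1 + \tilde{\lambda}_{g,k} f_g}}.
\end{aligned} \tag{5.27}$$

Using (5.27) in (5.23), and using the Karhunen-Loeve expansion for the columns of $\mathbf{H}_{f,0}$, we obtain

$$R^{\mathrm{UL-2}}_{f,\mathrm{LB}} = E\left[ \log_2\left( 1 + \mathbf{h}_{f,f}^{H} \left( \mathbf{I}_L + \sum_{f' \neq f} a_{f,f'} \, \mathbf{h}_{f,f'} \mathbf{h}_{f,f'}^{H} P_{f'} + a_{f,0} \, \tilde{\mathbf{W}}_{f,0}^{H} \tilde{\boldsymbol{\Lambda}}_{f,0} \tilde{\mathbf{W}}_{f,0} \frac{P_0}{S} \right)^{-1} \mathbf{h}_{f,f} \, P_f \right) \right], \tag{5.28}$$

where we used the eigenvalue decomposition

$$\boldsymbol{\Lambda}_{f,0}^{1/2} \mathbf{U}_{f,0}^{H} \left( \sum_{g=1}^{G} \mathbf{B}_g \tilde{\mathbf{U}}_g^{H} \mathbf{K}^{o}_g \tilde{\mathbf{U}}_g \mathbf{B}_g^{H} \right) \mathbf{U}_{f,0} \boldsymbol{\Lambda}_{f,0}^{1/2} = \tilde{\mathbf{U}}_{f,0} \tilde{\boldsymbol{\Lambda}}_{f,0} \tilde{\mathbf{U}}_{f,0}^{H}$$

and we let $\tilde{\mathbf{W}}_{f,0} = \tilde{\mathbf{U}}_{f,0}^{H} \mathbf{W}_{f,0}$, whose elements are also i.i.d. $\sim \mathcal{CN}(0,1)$. As $M \to \infty$, a convergent approximation to $R^{\mathrm{UL-2}}_{f,\mathrm{LB}}$ in (5.28) is given by

$$R^{\mathrm{UL-2}}_{f,\mathrm{DS,DA}} = \log_2(1 + \xi_f P_f), \tag{5.29}$$

where $\xi_f$ is the solution of the fixed-point equation

$$\xi_f = \left[ 1 + \sum_{f' \neq f} \frac{a_{f,f'} P_{f'}}{1 + L \, a_{f,f'} P_{f'} \xi_f} + \sum_{k} \frac{\tilde{\lambda}_{f,0,k} \, a_{f,0} \frac{P_0}{S}}{1 + L \, \tilde{\lambda}_{f,0,k} \, a_{f,0} \frac{P_0}{S} \xi_f} \right]^{-1}. \tag{5.30}$$

5.3 System performance: Co-channel TDD

In co-TDD, as the name suggests, the tier-2 cells align their UL slot with the UL slot of the tier-1 cell (and vice versa). Again, we evaluate the performance in one direction only, since the results hold for the other direction via uplink-downlink duality. The direction chosen for the analysis is tier-1 UL/tier-2 UL. Since the tier-1 cell operates in UL mode, the tier-2 cells close to the tier-1 BS must lower their transmit power in order to meet the interference temperature requirement. Unlike in R-TDD, these tier-2 cells are permanently at a disadvantage, since they have to transmit with small power on all slots. Nevertheless, thanks to the very large number of antennas, interference can be handled effectively in both tiers.

5.3.1 Tier-1 UL

In this case, the received signal vector at the tier-1 BS is given by the superposition of the signals transmitted by both the tier-1 and the tier-2 users, such that we have

$$\mathbf{y} = \sum_{g=1}^{G} \sum_{k=1}^{S_g} \sqrt{a_{g,0}} \, \mathbf{h}_{g_k} d_{g_k} + \sum_{f \in \mathcal{C}} \sqrt{a_{f,0}} \, \mathbf{h}_{f,0} \, x_f + \mathbf{z}. \tag{5.31}$$

The tier-1 BS makes use of a linear MMSE receiver in order to detect the symbols of the tier-1 users and eventually decode their codewords.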
As a result, the ergodic achievable rate for tier-1 user $g_k$ takes on the form

$$\begin{aligned}
\bar{R}^{\mathrm{UL-1}}_{g_k} &= E[\log_2(1 + \mathrm{SINR}_{g_k})] \\
&= E\left[ \log_2\left( 1 + a_{g,0} \, \mathbf{h}_{g_k}^{H} \left( \mathbf{I}_M + \sum_{(g',m) \neq (g,k)} a_{g',0} \, \mathbf{h}_{g'_m} \mathbf{h}_{g'_m}^{H} \frac{P_0}{S} + \sum_{f \in \mathcal{C}} a_{f,0} \, \mathbf{h}_{f,0} \mathbf{h}_{f,0}^{H} P_f \right)^{-1} \mathbf{h}_{g_k} \frac{P_0}{S} \right) \right].
\end{aligned} \tag{5.32}$$

In the case of directional scattering, as $M \to \infty$ we obtain the convergent approximation

$$\bar{R}^{\mathrm{UL-1}}_{g_k,\mathrm{DA}} = \log_2(1 + \eta_g), \tag{5.33}$$

where $\eta_g$ is given by the solution of the fixed-point equations

$$\eta_g = a_{g,0} \frac{P_0}{S} \, \mathrm{tr}\left( \mathbf{R}_g \mathbf{T}^{-1} \right), \tag{5.34}$$

$$\eta_f = a_{f,0} P_f \, \mathrm{tr}\left( \mathbf{R}_{f,0} \mathbf{T}^{-1} \right), \tag{5.35}$$

$$\mathbf{T} = \mathbf{I}_M + \sum_{g'=1}^{G} \frac{S_{g'} \mathbf{R}_{g'}}{1 + \eta_{g'}} a_{g',0} \frac{P_0}{S} + \sum_{f \in \mathcal{C}} \frac{\mathbf{R}_{f,0} \, a_{f,0} P_f}{1 + \eta_f}. \tag{5.36}$$

For isotropic scattering, i.e., when $\mathbf{R}_g = \mathbf{R}_{f,0} = \mathbf{I}_M$, we can simplify further to get

$$\bar{R}^{\mathrm{UL-1}}_{g_k,\mathrm{IS,DA}} = \log_2\left( 1 + a_{g,0} \frac{P_0}{S} \eta \right), \tag{5.37}$$

where $\eta$ is given by the solution of the fixed-point equation

$$\eta = \left[ 1 + \sum_{g'=1}^{G} S_{g'} \frac{a_{g',0} \frac{P_0}{S}}{1 + M a_{g',0} \frac{P_0}{S} \eta} + \sum_{f \in \mathcal{C}} \frac{a_{f,0} P_f}{1 + M a_{f,0} P_f \eta} \right]^{-1}. \tag{5.38}$$

5.3.2 Practical considerations for tier-1 UL

The calculation of the linear MMSE receiver requires the computation of the inverse of the $M \times M$ covariance matrix of the received signal vector. In practice, the tier-1 BS computes the sample covariance matrix of the received signal and uses this estimate in lieu of the true covariance matrix, which is obviously not known a priori in general. For $M \gg 1$, an accurate estimate of the covariance would require a large number of data samples, which may not be possible to obtain when the data frame length is limited. In addition, calculating the inverse of an $M \times M$ matrix at each data frame may be computationally too demanding.
Finally, conventional “tracking” techniques such as RLS, which update directly the inverse covariance matrix, cannot be applied in this context since the received signal covariance matrix changes abruptly from frame to frame, depending on which tier-1 users are scheduled in the UL. Hence, in order to make a realistic comparison, we need to look at alternative receivers that do not suffer from these drawbacks.

We propose to use a JSDM-type receiver for the tier-1 UL, which is outlined next. The linear receiver for decoding user $g_k$ is formed by two stages. First, we multiply the received signal (5.31) by $\mathbf{B}_g^{H}$, where $\mathbf{B}_g$ depends only on the channel second-order statistics, in order to mitigate the inter-group interference from tier-1 users of other groups $g' \neq g$. Then, we use a second linear transformation (zero-forcing) $\mathbf{P}_g$ in order to remove the intra-group interference from the users belonging to group $g$. $\mathbf{P}_g$ is given by the pseudo-inverse of the resulting effective channel $\mathbf{B}_g^{H} \mathbf{H}_g$. Note that this approach is exactly the “dual” of what we have seen before for the tier-1 DL. The major disadvantage of this approach is that we have no control over mitigating the cross-tier interference coming from the tier-2 users. However, if the tier-2 cells have non-overlapping angular spectra (directions of arrival) with the eigenmodes of the user group under consideration, the first stage inherently performs some form of inter-tier interference suppression, especially when the number of antennas $M$ is large (Section 2.6). Using this approach, the received signal for user $g_k$ at the tier-1 BS is given by

$$\begin{aligned}
\hat{y}_{g_k} &= \mathbf{p}_{g_k}^{H} \mathbf{B}_g^{H} \mathbf{y} \\
&= \sqrt{a_{g,0}} \, \mathbf{p}_{g_k}^{H} \mathbf{B}_g^{H} \mathbf{h}_{g_k} d_{g_k} + \sum_{g'=1, g' \neq g}^{G} \; \sum_{m=1}^{S_{g'}} \sqrt{a_{g',0}} \, \mathbf{p}_{g_k}^{H} \mathbf{B}_g^{H} \mathbf{h}_{g'_m} d_{g'_m} \\
&\quad + \sum_{f \in \mathcal{C}} \sqrt{a_{f,0}} \, \mathbf{p}_{g_k}^{H} \mathbf{B}_g^{H} \mathbf{h}_{f,0} \, x_f + \mathbf{p}_{g_k}^{H} \mathbf{B}_g^{H} \mathbf{z},
\end{aligned} \tag{5.39}$$

where $\mathbf{p}_{g_k}$ denotes the column of $\mathbf{P}_g$ corresponding to user $g_k$ and where the intra-group interference disappears due to zero forcing.
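The resulting achievable ergodic rate for user $g_k$ can be accurately calculated in the limit of large $M$ by the convergent approximation

$$\bar{R}^{\mathrm{UL-1}}_{g_k,\mathrm{DS,JSDM,DA}} = \log_2\left( 1 + \frac{a_{g,0} \frac{P_0}{S}}{\frac{1}{b_g \bar{m}_g} + \sum_{g' \neq g} \bar{I}^{(1)}_{g,g'} \, a_{g',0} \frac{P_0}{S} + \sum_{f \in \mathcal{C}} \bar{I}^{(2)}_{g,f} \, a_{f,0} P_f} \right), \tag{5.40}$$

where $\bar{m}_g$ is obtained as the solution of the fixed-point equation

$$\begin{aligned}
\bar{m}_g &= \frac{1}{b_g} \mathrm{tr}\left( \mathbf{B}_g^{H} \mathbf{R}_g \mathbf{B}_g \mathbf{T}_g^{-1} \right), \\
\mathbf{T}_g &= \mathbf{I}_{b_g} + \frac{S_g}{b_g} \frac{\mathbf{B}_g^{H} \mathbf{R}_g \mathbf{B}_g}{\bar{m}_g},
\end{aligned} \tag{5.41}$$

and where $\{\bar{I}^{(1)}_{g,g'} : \forall \, g, g' \in \mathcal{G}\}$ and $\{\bar{I}^{(2)}_{g,f} : \forall \, g \in \mathcal{G}, f \in \mathcal{C}\}$ are the solution of the system of coupled fixed-point equations

$$\begin{aligned}
\bar{I}^{(1)}_{g,g'} &= \frac{S_{g'}}{b_g} \frac{\bar{n}^{(1)}_{g,g'}}{\bar{m}_g^2}, &
\bar{n}^{(1)}_{g,g'} &= \frac{\frac{1}{b_g} \mathrm{tr}\left( \mathbf{B}_g^{H} \mathbf{R}_g \mathbf{B}_g \mathbf{T}_g^{-1} \mathbf{B}_g^{H} \mathbf{R}_{g'} \mathbf{B}_g \mathbf{T}_g^{-1} \right)}{1 - J_g}, \\
\bar{I}^{(2)}_{g,f} &= \frac{1}{b_g} \frac{\bar{n}^{(2)}_{g,f}}{\bar{m}_g^2}, &
\bar{n}^{(2)}_{g,f} &= \frac{\frac{1}{b_g} \mathrm{tr}\left( \mathbf{B}_g^{H} \mathbf{R}_g \mathbf{B}_g \mathbf{T}_g^{-1} \mathbf{B}_g^{H} \mathbf{R}_{f,0} \mathbf{B}_g \mathbf{T}_g^{-1} \right)}{1 - J_g}, \\
J_g &= \frac{\frac{1}{b_g} \mathrm{tr}\left( \mathbf{B}_g^{H} \mathbf{R}_g \mathbf{B}_g \mathbf{T}_g^{-1} \mathbf{B}_g^{H} \mathbf{R}_g \mathbf{B}_g \mathbf{T}_g^{-1} \right)}{b_g \, \bar{m}_g^2}. & &
\end{aligned} \tag{5.42}$$

5.3.3 Tier-2 UL

With co-TDD, the $L \times 1$ received signal vector at a tier-2 BS $f$ is given by

$$\mathbf{y}_f = \mathbf{h}_{f,f} \, x_f + \sum_{f' \neq f} \sqrt{a_{f,f'}} \, \mathbf{h}_{f,f'} \, x_{f'} + \sum_{g=1}^{G} \sum_{k=1}^{S_g} \sqrt{a_{f,g}} \, \mathbf{h}_{f,g_k} d_{g_k} + \mathbf{z}_f. \tag{5.43}$$

Assuming a linear MMSE receiver, it is immediate to write the corresponding ergodic achievable rate as

$$\bar{R}^{\mathrm{UL-2}}_f = E\left[ \log_2\left( 1 + \mathbf{h}_{f,f}^{H} \left( \mathbf{I}_L + \sum_{f' \neq f} a_{f,f'} \, \mathbf{h}_{f,f'} \mathbf{h}_{f,f'}^{H} P_{f'} + \sum_{g=1}^{G} \sum_{k=1}^{S_g} a_{f,g} \, \mathbf{h}_{f,g_k} \mathbf{h}_{f,g_k}^{H} \frac{P_0}{S} \right)^{-1} \mathbf{h}_{f,f} \, P_f \right) \right]. \tag{5.44}$$

Because of the interference temperature constraint, and since with co-TDD the tier-2 UL interferes with the tier-1 UL, the transmit power in tier-2 cell $f$ is given by

$$P_f = \min\left\{ P_1, \frac{\kappa}{a_{f,0}} \right\}. \tag{5.45}$$

The channel vectors between tier-2 users and tier-2 BSs, as well as between tier-1 users and tier-2 BSs, are i.i.d. Gaussian (i.e., $\mathbf{h}_{f,f'} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_L)$, $\mathbf{h}_{f,g_k} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_L)$).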
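With all the ingredients of (5.44) and (5.45) defined, the ergodic rate can also be estimated by direct Monte Carlo averaging, which gives a reference against which a deterministic approximation can be compared. The sketch below assumes i.i.d. $\mathcal{CN}(\mathbf{0}, \mathbf{I}_L)$ channels as stated above. The path gains, powers, number of trials and helper names are hypothetical values chosen only to make the example run.

```python
import numpy as np

rng = np.random.default_rng(4)

def crandn(*shape):
    """i.i.d. CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def mmse_sinr(h, P_f, interferers, powers):
    """SINR of the linear MMSE receiver: P_f h^H (I + sum_i p_i v_i v_i^H)^{-1} h.
    The columns of `interferers` are the vectors v_i = sqrt(a_i) h_i."""
    L = h.size
    A = np.eye(L) + (interferers * powers) @ interferers.conj().T
    return P_f * np.real(h.conj() @ np.linalg.solve(A, h))

def ergodic_rate(L, P_f, gains, powers, n_trials=2000):
    """Monte Carlo estimate of E[log2(1 + SINR)] as in (5.44)."""
    rates = np.empty(n_trials)
    for t in range(n_trials):
        h = crandn(L)
        V = np.sqrt(gains) * crandn(L, gains.size)
        rates[t] = np.log2(1.0 + mmse_sinr(h, P_f, V, powers))
    return rates.mean()

L = 4
P1, kappa, a_f0 = 100.0, 1.0, 0.05          # hypothetical peak power, temperature, gain to tier-1 BS
P_f = min(P1, kappa / a_f0)                 # power control rule (5.45)
gains = np.array([0.01, 0.002, 0.03, 0.03, 0.01])      # two tier-2 and three tier-1 interferers
powers = np.array([100.0, 100.0, 50.0, 50.0, 50.0])    # P_f' for tier-2 users, P_0/S for tier-1 users
rate = ergodic_rate(L, P_f, gains, powers, n_trials=500)
print("estimated tier-2 UL rate (bits/s/Hz):", rate)
```

Sweeping `kappa` in such a simulation traces out the tier-2 side of the throughput tradeoff for a given layout.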
As $M \to \infty$, a convergent approximation to (5.44) is given by

$$\bar{R}^{\mathrm{UL-2}}_{f,\mathrm{DA}} = \log_2(1 + \bar{\zeta}_f P_f), \tag{5.46}$$

where $\bar{\zeta}_f$ is the solution of the fixed-point equation

$$\bar{\zeta}_f = \left[ 1 + \sum_{f' \neq f} \frac{a_{f,f'} P_{f'}}{1 + L \, a_{f,f'} P_{f'} \bar{\zeta}_f} + \sum_{g=1}^{G} S_g \frac{a_{f,g} \frac{P_0}{S}}{1 + L \, a_{f,g} \frac{P_0}{S} \bar{\zeta}_f} \right]^{-1}. \tag{5.47}$$

5.4 Results

We present some numerical results in order to compare the various schemes treated in Sections 5.2 and 5.3. We consider a square area of side 1 km, one tier-1 BS in the center, 80 indoor tier-2 cells with radius 40 m, wall absorption of 5 dB, located on a regular square grid,² no log-normal shadowing, and distance-dependent pathloss coefficient $a = 1/(1 + (d/d_0)^{3.5})$, where the 3 dB loss distance $d_0$ is 50 m. The tier-1 total transmit power is $P_0 = 43$ dB, and the tier-2 peak transmit power is $P_1 = 20$ dB (relative to a noise floor of 0 dB). Tier-1 users are assumed to be outdoor.³

² We consider an exclusion zone near the tier-1 BS, such that there are no tier-2 cells within a radius of 50 m.

³ Equivalently, we may think of an “open access” policy for the tier-2 cells, for which if a user enters a tier-2 cell, then it is offloaded automatically and is “swallowed” by the tier-2 cell.

5.4.1 Design of the pre-beamforming matrices

The design of pre-beamformers is relevant both in tier-1 UL and DL, when directional scattering is involved. We adopt an approximate block diagonalization approach as in Section 2.3.2. In the following, we consider two sets of results depending on the ability of the tier-1 BS to mitigate the inter-tier interference to the tier-2 cells.

Tier-2 unaware pre-beamforming: Here, the tier-1 BS calculates the pre-beamformers in order to minimize the inter-group interference, both in the DL and in the UL. Denoting by $\mathbf{U}_g$ the eigenvectors of the channel covariance matrix of group $g$, we need $\mathbf{B}_g^{H} \mathbf{H}_{g'} \approx \mathbf{0}$ for all $g' \neq g$.
This can be achieved by restricting $\mathbf{B}_g$ to the orthogonal complement of the span of $\{\mathbf{U}^{*}_{g'} : g' \neq g\}$, where $\mathbf{U}^{*}_g \subset \mathbf{U}_g$ comprises the eigenvectors corresponding to the dominant eigenvalues of $\mathbf{R}_g$.

Tier-2 aware pre-beamforming (spatial blanking): Here, the tier-1 BS constructs the pre-beamformers in order to minimize also the inter-tier interference to the tier-2 cells. In the UL, the tier-1 BS does not cause any interference to the tier-2 cells and the tier-2 cells use power control to keep the interference caused to tier-1 users (in R-TDD) or to the tier-1 BS (in co-TDD) below the target interference temperature. In the tier-1 DL, the idea is to construct pre-beamformers such that they are approximately orthogonal to the tier-2 cells closest to the tier-1 BS, since these are the ones that suffer the most inter-tier interference. Hence, we require that

$$\mathbf{B}_g \subset \mathrm{Span}^{\perp}\left[ \{\mathbf{U}^{*}_{g'} : g' \neq g\} \cup \{\mathbf{U}^{*}_{f,0} : f \in \mathcal{C}_{\mathrm{BS-1}}\} \right],$$

where $\mathcal{C}_{\mathrm{BS-1}}$ denotes the set of femtocells closest to the macro BS, and $\mathbf{U}^{*}_{f,0}$ contains the eigenvectors associated with the dominant eigenvalues of $\mathbf{R}_{f,0}$. Notice that this approach consists of creating transmission opportunities for the tier-2 cells by leaving “blank” slots in the spatial domain (in contrast to the eICIC strategy, based on blank time-frequency slots). For this reason, we refer to this approach as “spatial blanking”.

5.4.2 Simulations

Figs. 5.2(a) and 5.2(b) show the two tier network consisting of a single tier-1 BS (“black” square) serving groups of users (a group of macro users is denoted by a “red” star) and several tier-2 cells (“blue” circles).
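The orthogonal-complement construction of Section 5.4.1 can be sketched as follows. This is one possible realization under stated assumptions, not necessarily the approximate block diagonalization of Section 2.3.2: the dominant eigenvector sets are replaced by random subspaces, the group's own eigenvectors are projected onto the complement, and the $b_g$ dominant projected directions are kept. The function names and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

def random_subspace(M, r):
    """Orthonormal basis of a random r-dimensional subspace (stand-in for U*_g)."""
    X = rng.standard_normal((M, r)) + 1j * rng.standard_normal((M, r))
    Q, _ = np.linalg.qr(X)
    return Q

def prebeamformer(U_own, U_block, b, tol=1e-10):
    """Return an M x b matrix with orthonormal columns lying in the orthogonal
    complement of the span of the matrices in U_block, aligned with span(U_own)."""
    E = np.hstack(U_block)                        # directions to be blanked
    Q, s, _ = np.linalg.svd(E, full_matrices=True)
    rank = int(np.sum(s > tol * s[0]))
    E_perp = Q[:, rank:]                          # basis of the orthogonal complement
    F, _, _ = np.linalg.svd(E_perp.conj().T @ U_own, full_matrices=False)
    return E_perp @ F[:, :b]                      # dominant projected directions

M, r, b = 32, 4, 4
U = [random_subspace(M, r) for _ in range(3)]     # dominant eigenvectors of three groups
U_f = random_subspace(M, 2)                       # dominant eigenvectors of a nearby tier-2 BS

B_unaware = prebeamformer(U[0], [U[1], U[2]], b)          # tier-2 unaware design
B_aware = prebeamformer(U[0], [U[1], U[2], U_f], b)       # spatial blanking design
print(np.linalg.norm(B_unaware.conj().T @ U_f), np.linalg.norm(B_aware.conj().T @ U_f))
```

The only difference between the two designs is the extra block passed for the protected tier-2 cell. The price is a smaller complement to choose from, which is why blanking directions that overlap with the served group would be costly.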
With directional scattering, the tier-1 BS only serves groups of users with disjoint angular support, motivated by the result of Section 2.6, which states that the channel covariances of groups with disjoint angular support become orthogonal as the number of antennas $M$ becomes very large. The tier-2 cells denoted as “cyan” circles are the ones that the tier-1 BS protects by spatial blanking.

[Figure 5.2: Sample layout showing the two tier network. Panels: (a) Layout (Isotropic); (b) Layout (Directional). Both axes range from −500 to 500.]

[Figure 5.3: Reverse TDD vs Co TDD (Isotropic Scattering). Panels: (a) M = 10, K = 6, L = 4, with curves for Co TDD and Rev TDD with 1 and 3 groups; (b) M = 100, K = 24, L = 4, with curves for Co TDD and Rev TDD with 1, 4 and 8 groups. Axes: tier-1 throughput and aggregate tier-2 throughput, both in bits/sec/Hz.]

[Figure 5.4: Reverse TDD vs Co TDD (Directional Scattering), M = 100, K = 24, L = 4. Panels: (a) Femtocell Unaware, with curves for Co TDD JSDM, Rev TDD and Co TDD with 1, 2 and 6 groups; (b) Femtocell Aware, with curves for Rev TDD A and Rev TDD UA with 1, 2 and 6 groups. Axes: tier-1 throughput and aggregate tier-2 throughput, both in bits/sec/Hz.]
Note that these cells also have a disjoint angular support with the groups of served tier-1 users, since trying to zero-force inter-tier interference with the same or strongly overlapping angular support would yield too large an SNR penalty for the tier-1 user rates.

Fig. 5.3 shows a performance comparison between co-TDD and R-TDD for isotropic scattering. These schemes are compared in terms of the tradeoff curve between tier-1 and tier-2 aggregate throughput. This is obtained by varying the interference temperature level $\kappa$. In Fig. 5.3(a), we set the number of tier-1 BS antennas $M = 10$, and the number of tier-1 users $K = 6$. In Fig. 5.3(b), we set $M = 100$ and $K = 24$. In both these cases, the number of tier-2 BS antennas is $L = 4$. We obtain different curves by varying the number of user groups $G$ and letting $K_g = K/G$ for all $g \in \mathcal{G}$. The “dashed” curves refer to R-TDD, and the “solid” curves to co-TDD. In the regime when all the tier-2 cells are transmitting with their peak power, we observe a higher tier-2 throughput for R-TDD because only the tier-2 cells near the tier-1 BS suffer from inter-tier interference, and only these cells experience a rate degradation. However, in co-TDD, as the number of tier-1 user groups increases, more and more tier-2 cells are affected by the inter-tier interference, and this leads to lower tier-2 throughput. Co-TDD exhibits a higher tier-1 throughput in this regime because the tier-1 BS can effectively eliminate interference in the spatial domain. With R-TDD, the tier-1 users suffer from significant inter-tier interference since all the tier-2 cells are transmitting at their peak power, and this leads to performance degradation. In the other extreme, when the tier-2 cells are transmitting with low power, the tier-1 throughput in both cases remains almost the same while the tier-2 throughput is close to negligible.

Fig. 5.4(a) shows the same type of comparison for directional scattering.
While the throughput tradeoff behavior may seem similar, there is one major difference with respect to the case of isotropic scattering. With directional scattering, the number of users that can be spatially multiplexed in a group $g$ is limited by the rank $r_g$ of the group. Therefore, as the number of groups increases, we are able to serve more and more macro users, leading to increased macrocell throughput. Fig. 5.4(b) shows the performance comparison between tier-2 aware (spatial blanking) and tier-2 unaware pre-beamforming. The tier-2 unaware scheme is denoted by Rev TDD UA while the tier-2 aware scheme is denoted by Rev TDD A, where the tier-1 BS first chooses the user groups to serve and then decides to cancel inter-tier interference to (some) tier-2 cells with non-overlapping directions.

Chapter 6 Conclusion

In this dissertation, we proposed Joint Spatial Division and Multiplexing (JSDM) for Large Scale Antenna Systems. JSDM is a novel approach to MU-MIMO downlink that requires reduced channel estimation downlink training overhead and CSIT feedback and therefore is potentially suited to FDD systems, despite a large number of BS antennas. JSDM exploits the fact that for large BSs, mounted on the top of a building or on a dedicated tower, channel vectors are far from isotropically distributed. Instead, their dominant eigenspace has dimension much smaller than the number of BS antennas. Different groups of users are selected, such that the users in each group share (approximately) the same dominant eigenspace, and the eigenspaces of different groups are nearly orthogonal. JSDM serves simultaneously such groups of users, and multiple users in each group. The separation of the groups in the spatial domain (space-division) is obtained through a pre-beamforming matrix that depends only on the channel covariance matrices, while the multiplexing of multiple users in each group is obtained via linear MU-MIMO precoding based on the instantaneous “effective” channel, including the pre-beamforming.
It turns out that the effective channel has reduced dimensionality with respect to the original multi-antenna multiuser channel, especially with a “per-group processing” (PGP) approach, i.e., where each group is individually precoded, disregarding the inter-group interference. JSDM with PGP can be regarded as a generalization of sectorization, where each group acts as a directional sector, and in each sector we apply MU-MIMO spatial multiplexing, disregarding inter-sector interference. We showed that when the collection of the channel covariance eigenvectors of the groups forms a tall unitary matrix, then JSDM with PGP is optimal, in the sense that it can achieve the capacity of the underlying MU-MIMO channel with full instantaneous CSIT.

Then, using Szegő's asymptotic theory of large Toeplitz matrices, we showed that when the BS is equipped with a large uniform linear array, this tall unitary condition is closely approached, and the pre-beamforming matrix can be obtained by selecting an appropriate subset of columns of a unitary DFT matrix. In fact, under these assumptions the accurate estimation of the channel covariance matrix is not needed, and just a coarse estimate of the AoA range for each group is sufficient, as long as the AoA ranges of different groups do not overlap in the azimuth angle domain. Finally, we extended our approach to the case of 3D beamforming, considering rectangular arrays and pre-beamforming in the elevation angle (vertical) direction. In this case, the proposed JSDM scheme partitions the cell into concentric annular regions, and serves groups of users with different azimuth angles in each region. We demonstrated the effectiveness of the proposed scheme in the case of a typical cell size, typical propagation pathloss, and a large rectangular antenna array mounted on the face of a tall building.
In our case, under ideal CSIT, unprecedented spectral efficiencies of the order of 1000 bit/s/Hz per sector are achieved under various fairness criteria and pre-beamforming techniques. We also considered the problem of downlink channel estimation and provided formulas for the asymptotic “deterministic equivalent” approximation of the achievable receiver SINR, which allows efficient calculation of the system performance without resorting to lengthy Monte Carlo simulation. For a realistic SNR range around 20 dB, the effect of noisy CSIT can be quantified as a loss of approximately 30% with respect to the ideal CSIT case. Hence, spectral efficiencies of approximately 700 bit/s/Hz can be expected for the massive 3D JSDM system scenario.

The design of a JSDM system involves many choices: the effective rank $r_g^\star$ of the channel covariance matrix for each group, the pre-beamforming dimension $b_g$, and the number of users (downlink streams) $S_g$ for each group, for a given pre-beamforming design, operating SNR, and MU-MIMO precoding scheme. In the case of 3D beamforming, this optimization is significantly more complicated since it has to be repeated for groups of annular regions served simultaneously by the vertical beamforming. One of the main merits of this work is to provide simple and solid design criteria for such a system, based on the insight gained by the asymptotic analysis. In fact, a brute-force search over the whole parameter space quickly becomes infeasible for practical system scenarios.

Next, we extended these results for JSDM in the context of opportunistic beamforming. For the case of a finite number of BS antennas and a large number of users, we obtained the scaling laws of the system sum capacity and showed that the sum capacity scales as $\log\log K$, where $K$ is the number of users in the system, with a coefficient that depends on the sum of the ranks of the user group covariance matrices.
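As a rough numerical illustration of the DFT pre-beamforming idea recalled earlier in this chapter, the following Python sketch selects the columns of a unitary DFT matrix whose angular frequency falls inside a group's AoA range, and measures how much of the channel covariance energy they capture for a uniform linear array. It is only a sketch under stated assumptions: the function names, the array size $M = 64$, the normalized spacing $D = 1/2$, the uniform AoA distribution and the AoA ranges are illustrative choices, not values taken from this work, and the AoA range is assumed to lie within $[-\pi/2, \pi/2]$.

```python
import cmath
import math


def steering_vector(M, D, alpha):
    """ULA response with entries exp(-j*2*pi*D*m*sin(alpha)), m = 0..M-1."""
    return [cmath.exp(-2j * math.pi * D * m * math.sin(alpha)) for m in range(M)]


def dft_column(M, k):
    """k-th column of a unitary DFT matrix (same sign convention as the steering vector)."""
    return [cmath.exp(-2j * math.pi * m * k / M) / math.sqrt(M) for m in range(M)]


def select_columns(M, D, theta, delta):
    """Indices k whose wrapped frequency k/M lies in [D sin(theta-delta), D sin(theta+delta)]."""
    lo = D * math.sin(theta - delta)
    hi = D * math.sin(theta + delta)
    selected = []
    for k in range(M):
        f = k / M if k < M / 2 else k / M - 1.0  # wrap to [-1/2, 1/2)
        if lo <= f <= hi:
            selected.append(k)
    return selected


def captured_energy_fraction(M, D, theta, delta, num_angles=200):
    """Fraction of tr(R) captured by the selected columns, averaging over uniform AoAs."""
    cols = [dft_column(M, k) for k in select_columns(M, D, theta, delta)]
    captured = 0.0
    for n in range(num_angles):
        # midpoint sampling of the AoA interval [theta - delta, theta + delta]
        alpha = theta - delta + (n + 0.5) * (2.0 * delta / num_angles)
        a = steering_vector(M, D, alpha)
        for f in cols:
            proj = sum(fc.conjugate() * ac for fc, ac in zip(f, a))
            captured += abs(proj) ** 2
    return captured / (num_angles * M)  # tr(R) = M because every entry of a has unit modulus


if __name__ == "__main__":
    group1 = select_columns(64, 0.5, 0.0, math.radians(20))
    group2 = select_columns(64, 0.5, math.radians(45), math.radians(10))
    print(len(group1), len(group2), set(group1).isdisjoint(group2))
    print(captured_energy_fraction(64, 0.5, 0.0, math.radians(20)))
```

Two groups with non-overlapping AoA ranges select disjoint sets of DFT columns, which are exactly orthogonal; the energy that leaks outside the selected columns comes mainly from angles near the edges of the AoA range.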
We also investigated the general problem of clustering the users into groups (user grouping) when, realistically, each user has its own individual channel covariance matrix (i.e., no a priori groups with the same covariance matrix are assumed). We proposed a simplified algorithm requiring only the knowledge of the users' AoA and AS (i.e., the angular support of the scattering from which the BS transmit power is received at the user antenna). The proposed simplified grouping corresponds to the quantization of the AoA-AS plane and works well when the number of BS antennas is large. Finally, we considered the performance analysis in the large system limit (both large number of users and large number of BS antennas), obtained appealing closed-form fixed-point equations that make it possible to calculate the SINR for each user, and based on these expressions, we proposed a method to optimize the number of downlink streams to be served by JSDM for each (discretized) point in the AoA-AS plane. This can be optimized depending on a desired network utility function of the user rates, which can be chosen to implement a desired notion of fairness. Based on this optimization, we also proposed a probabilistic user selection that implicitly allocates the number of streams to the users according to the optimal downlink stream distribution. Finite dimensional simulations show the effectiveness of the proposed method.

In addition, we have also considered the application of the JSDM approach to highly directional channels formed by a few discrete multi-path components (MPCs), or clusters of MPCs, typically arising in outdoor mm-Wave communications. In particular, when the user channels have partially overlapping eigenspaces, due to common scattering clusters or MPCs with similar angles of departure, allocating users onto the BS array angular dimensions becomes a difficult optimization problem.
We formulated this problem in terms of a conflict graph, where each user is identified by the set of angular frequencies occupied by its channel covariance spectrum, and users with overlapping angular frequencies are connected in the graph. The user selection and angular dimension allocation can be formulated as integer programming problems, whose objective function depends on what we wish to optimize. We have proposed two such problems, driven by the physical insights gained by considering common scattering clusters. For the proposed integer programming problems, we have provided solutions via low-complexity greedy selection algorithms. Then, we have demonstrated the performance achieved by JSDM with the proposed algorithms in some relevant scenarios, including channels generated by ray tracing in an outdoor campus environment and channels obtained by an actual measurement campaign in an urban environment.

Our studies show that JSDM with good user selection is an attractive technique for the implementation of the multiuser MIMO downlink in massive MIMO systems. The scheme can take advantage of highly directional channel statistics, such as those arising at mm-Wave frequencies. In particular, in a typical small-cell scenario where the number of users is significantly less than the number of base station antennas, and the user channels are formed by a small number of discrete multi-path components, we have proposed a simple “covariance-based” JSDM scheme that achieves remarkable spatial multiplexing while requiring only the knowledge of the channel's second-order statistics. This scheme is particularly attractive since it does not require instantaneous CSIT feedback, and the channel covariances can be accurately learned and tracked since they depend on the scattering environment, and are very slowly varying for the nomadic users typical of small cell networks.
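The conflict-graph view described above can be illustrated with a minimal Python sketch. It is not one of the two integer programs proposed in this work: it assumes a generic greedy rule (repeatedly pick the user with the fewest remaining conflicts) and a hypothetical input format in which each user is represented by the set of angular-frequency bins occupied by its covariance spectrum. The function names and the example supports are made up for illustration.

```python
def build_conflict_graph(supports):
    """supports: dict mapping user id -> set of occupied angular-frequency bins."""
    users = list(supports)
    graph = {u: set() for u in users}
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            if supports[u] & supports[v]:  # overlapping angular frequencies => conflict edge
                graph[u].add(v)
                graph[v].add(u)
    return graph


def greedy_select(supports):
    """Greedily pick mutually non-conflicting users, fewest remaining conflicts first."""
    graph = build_conflict_graph(supports)
    remaining = set(graph)
    selected = []
    while remaining:
        # ties are broken by user id so that the result is deterministic
        u = min(remaining, key=lambda x: (len(graph[x] & remaining), x))
        selected.append(u)
        remaining -= graph[u] | {u}  # drop the chosen user and all of its neighbours
    return selected


if __name__ == "__main__":
    example = {"u1": {0, 1}, "u2": {1, 2}, "u3": {3}, "u4": {2, 3}}
    print(greedy_select(example))
```

The selected users occupy pairwise disjoint sets of angular-frequency bins, so they can be separated by pre-beamforming alone. A rate-aware objective would replace the degree-based key with a utility-based one.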
Finally, we have studied the performance of a two-tier cellular network consisting of a single tier-1 BS and several tier-2 BSs sharing the same frequency channel. We considered the regime of a large number of antennas at the tier-1 BS and a moderately large number of antennas at the tier-2 BSs. We analyzed the system performance under two duplexing strategies: co-TDD and R-TDD. We derived closed-form expressions and lower bounds for the achievable ergodic rates under different channel models using tools from random matrix theory, and used these results to perform numerical comparisons in a scenario representative of a suburban area with indoor tier-2 cells. Our study shows that with directional channels, and insisting on using a practical JSDM-like two-stage beamforming scheme for both R-TDD and co-TDD, R-TDD yields better results when tier-2 cells make use of power control to minimize the inter-tier interference. Overall, under realistic directional scattering channel models, a combination of JSDM-like two-stage beamforming, interference temperature power control and R-TDD yields a tier-1/tier-2 throughput tradeoff far superior to eICIC, which can only operate on the “time-sharing line” (i.e., on the convex combination) of the individual tier-1 and tier-2 throughputs.

We conclude this work by pointing out two interesting related topics, which are left for future work: 1) inter-cell interference coordination; 2) estimation of the dominant eigenspace of the channel covariance matrix. Inter-cell interference coordination deals with scheduling algorithms that, given a network comprising different cells, each containing users characterized by their channel covariance dominant eigenspace, form compatible groups of users that can be served simultaneously using JSDM over the network, such that the overall system spectral efficiency is maximized. In order to enable JSDM, the dominant eigenspace of each user must be estimated from noisy samples of the received signal.
Here, the problem is that for a large number of BS antennas the channel covariance matrix is high-dimensional, and the dimension is typically comparable with the number of samples. Hence, the common wisdom on “sample covariance” estimation does not apply, and more sophisticated techniques must be used (e.g., [CD11, Ch. 17], [Mes08, MTS11]).

Appendix A

A.1 Deterministic equivalents for the SINR with PGP and noisy CSIT

We provide the fixed-point equations for the calculation of the deterministic equivalent approximations of the SINR for JSDM with PGP, noisy CSIT and the two types of linear precoding considered in this paper, namely, RZFBF and ZFBF. Notice that these expressions hold for arbitrary pre-beamforming matrices, as long as they are fixed constants independent of the instantaneous channel matrix realizations. In particular, they hold for (approximated) BD and DFT pre-beamforming. We consider the general case of group parameters $\{S_g\}$, $\{b_g\}$, with equal power per stream, $P_{g_k} = \frac{P}{S}$ for all $g_k$. The formulas below are a direct application of the results in [CWD09]. Their derivation is lengthy but fairly straightforward once one realizes that all the assumptions in [CWD09] apply to our case. In the spirit of striking a good balance between usefulness, conciseness and completeness, we report the formulas without the details of their derivation.

A.1.1 Regularized Zero Forcing Precoding

For users in group $g$, the regularized zero forcing precoding matrix is given by

$$\mathbf{P}_{g,\mathrm{rzf}} = \bar{\zeta}_g \left( \hat{\mathbf{H}}_g \hat{\mathbf{H}}_g^{H} + b_g \alpha \mathbf{I}_{b_g} \right)^{-1} \hat{\mathbf{H}}_g, \tag{A.1}$$

where $\hat{\mathbf{H}}_g$ is the matrix formed by the channel estimates $\hat{\mathbf{h}}_{g_k}$ obtained as in (2.50).
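The power normalization factor $\bar{\zeta}_g$ is given by

$$\bar{\zeta}_g^2 = \frac{S_g}{\mathrm{tr}\left( \mathbf{P}_{g,\mathrm{rzf}}^{H} \mathbf{B}_g^{H} \mathbf{B}_g \mathbf{P}_{g,\mathrm{rzf}} \right)} \tag{A.2}$$

Letting $\hat{\bar{\mathbf{K}}}_g = ( \hat{\mathbf{H}}_g \hat{\mathbf{H}}_g^{H} + b_g \alpha \mathbf{I}_{b_g} )^{-1}$, the SINR of user $g_k$ is given by

$$\hat{\gamma}_{g_k,\mathrm{pgp,csi}} = \frac{\frac{P}{S} \bar{\zeta}_g^2 \left| \hat{\mathbf{h}}_{g_k}^{H} \hat{\bar{\mathbf{K}}}_g \hat{\mathbf{h}}_{g_k} \right|^2}{\frac{P}{S} \bar{\zeta}_g^2 \left| \hat{\mathbf{e}}_{g_k}^{H} \hat{\bar{\mathbf{K}}}_g \hat{\mathbf{h}}_{g_k} \right|^2 + \sum_{j \neq k} \frac{P}{S} \bar{\zeta}_g^2 \left| \mathbf{h}_{g_k}^{H} \mathbf{B}_g \hat{\bar{\mathbf{K}}}_g \hat{\mathbf{h}}_{g_j} \right|^2 + \sum_{g' \neq g,\, j} \frac{P}{S} \bar{\zeta}_{g'}^2 \left| \mathbf{h}_{g_k}^{H} \mathbf{B}_{g'} \hat{\bar{\mathbf{K}}}_{g'} \hat{\mathbf{h}}_{g'_j} \right|^2 + 1} \tag{A.3}$$

where “csi” denotes noisy CSIT. The deterministic equivalent of the SINR in this case is given by

$$\hat{\gamma}_{g_k,\mathrm{pgp,rzf,csi}} - \hat{\gamma}^{o}_{g_k,\mathrm{pgp,rzf,csi}} \xrightarrow{M \to \infty} 0 \tag{A.4}$$

with

$$\hat{\gamma}^{o}_{g_k,\mathrm{pgp,rzf,csi}} = \frac{\frac{P}{S} \hat{\bar{\zeta}}_g^2 (\hat{\bar{m}}^{o}_g)^2}{\frac{P}{S} \hat{\bar{\zeta}}_g^2 \hat{\bar{E}}^{o}_g + \frac{P}{S} \hat{\bar{\zeta}}_g^2 \hat{\bar{\Upsilon}}^{o}_{g,g} + \left( 1 + \sum_{g' \neq g} \frac{P}{S} \hat{\bar{\zeta}}_{g'}^2 \hat{\bar{\Upsilon}}^{o}_{g,g'} \right) \left( 1 + \hat{\bar{m}}^{o}_g \right)^2} \tag{A.5}$$

where $\hat{\bar{\zeta}}_g^2 = \frac{1}{\hat{\bar{\Gamma}}^{o}_g}$ and the quantities $\hat{\bar{m}}^{o}_g$, $\hat{\bar{\Upsilon}}^{o}_{g,g}$, $\hat{\bar{\Upsilon}}^{o}_{g,g'}$ and $\hat{\bar{\Gamma}}^{o}_g$ are given by

$$\hat{\bar{m}}^{o}_g = \frac{1}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right) \tag{A.6}$$

$$\hat{\bar{\mathbf{T}}}_g = \left( \frac{S_g}{b_g} \frac{\hat{\bar{\mathbf{R}}}_g}{1 + \hat{\bar{m}}^{o}_g} + \alpha \mathbf{I}_{b_g} \right)^{-1} \tag{A.7}$$

$$\hat{\bar{\Gamma}}^{o}_g = \frac{1}{b_g} \frac{\hat{\bar{n}}_g}{(1 + \hat{\bar{m}}^{o}_g)^2} \tag{A.8}$$

$$\hat{\bar{n}}_g = \frac{\frac{1}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \mathbf{B}_g^{H} \mathbf{B}_g \hat{\bar{\mathbf{T}}}_g \right)}{1 - \frac{\frac{S_g}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right)}{b_g (1 + \hat{\bar{m}}^{o}_g)^2}} \tag{A.9}$$

$$\hat{\bar{E}}^{o}_g = \frac{1}{b_g} \cdot \frac{\frac{1}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g ( \bar{\mathbf{R}}_g - \hat{\bar{\mathbf{R}}}_g ) \hat{\bar{\mathbf{T}}}_g \right)}{1 - \frac{\frac{S_g}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right)}{b_g (1 + \hat{\bar{m}}^{o}_g)^2}} \tag{A.10}$$

$$\hat{\bar{\Upsilon}}^{o}_{g,g} = (1 + \hat{\bar{m}}^{o}_g)^2 A_1 - \left( 2 \hat{\bar{m}}^{o}_g (1 + \hat{\bar{m}}^{o}_g) - (\hat{\bar{m}}^{o}_g)^2 \right) A_2 \tag{A.11}$$

$$A_1 = \frac{1}{b_g} (S_g - 1) \frac{\hat{\bar{n}}_{g,g,1}}{(1 + \hat{\bar{m}}^{o}_g)^2} \tag{A.12}$$

$$A_2 = \frac{1}{b_g} (S_g - 1) \frac{\hat{\bar{n}}_{g,g,2}}{(1 + \hat{\bar{m}}^{o}_g)^2} \tag{A.13}$$

$$\hat{\bar{n}}_{g,g,1} = \frac{\frac{1}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \bar{\mathbf{R}}_g \hat{\bar{\mathbf{T}}}_g \right)}{1 - \frac{\frac{S_g}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right)}{b_g (1 + \hat{\bar{m}}^{o}_g)^2}} \tag{A.14}$$

$$\hat{\bar{n}}_{g,g,2} = \frac{\frac{1}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right)}{1 - \frac{\frac{S_g}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right)}{b_g (1 + \hat{\bar{m}}^{o}_g)^2}} \tag{A.15}$$

$$\hat{\bar{\Upsilon}}^{o}_{g,g'} = \frac{S_{g'}}{b_{g'}} \frac{\hat{\bar{n}}_{g',g}}{(1 + \hat{\bar{m}}^{o}_{g'})^2} \tag{A.16}$$

$$\hat{\bar{n}}_{g',g} = \frac{\frac{1}{b_{g'}} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_{g'} \hat{\bar{\mathbf{T}}}_{g'} \mathbf{B}_{g'}^{H} \mathbf{R}_g \mathbf{B}_{g'} \hat{\bar{\mathbf{T}}}_{g'} \right)}{1 - \frac{\frac{S_{g'}}{b_{g'}} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_{g'} \hat{\bar{\mathbf{T}}}_{g'} \hat{\bar{\mathbf{R}}}_{g'} \hat{\bar{\mathbf{T}}}_{g'} \right)}{b_{g'} (1 + \hat{\bar{m}}^{o}_{g'})^2}} \tag{A.17}$$

A.1.2 Zero Forcing Precoding

For $\alpha = 0$, the precoding matrix in (A.1) reduces to the zero forcing precoding matrix given by

$$\mathbf{P}_{g,\mathrm{zf}} = \bar{\zeta}_g \hat{\mathbf{H}}_g \left( \hat{\mathbf{H}}_g^{H} \hat{\mathbf{H}}_g \right)^{-1} \tag{A.18}$$

where $\bar{\zeta}_g$ is the power normalization factor given by

$$\bar{\zeta}_g^2 = \frac{S_g}{\mathrm{tr}\left( \mathbf{P}_{g,\mathrm{zf}}^{H} \mathbf{B}_g \mathbf{B}_g^{H} \mathbf{P}_{g,\mathrm{zf}} \right)} \tag{A.19}$$

Letting $\hat{\bar{\mathbf{K}}}_g = \hat{\mathbf{H}}_g ( \hat{\mathbf{H}}_g^{H} \hat{\mathbf{H}}_g )^{-2} \hat{\mathbf{H}}_g^{H}$, the SINR of user $g_k$ is given by

$$\hat{\gamma}_{g_k,\mathrm{pgp,zf,csi}} = \frac{\frac{P}{S} \hat{\bar{\zeta}}_g^2 \left| \hat{\mathbf{h}}_{g_k}^{H} \hat{\bar{\mathbf{K}}}_g \hat{\mathbf{h}}_{g_k} \right|^2}{\frac{P}{S} \hat{\bar{\zeta}}_g^2 \left| \hat{\mathbf{e}}_{g_k}^{H} \hat{\bar{\mathbf{K}}}_g \hat{\mathbf{h}}_{g_k} \right|^2 + \sum_{j \neq k} \frac{P}{S} \hat{\bar{\zeta}}_g^2 \left| \mathbf{h}_{g_k}^{H} \mathbf{B}_g \hat{\bar{\mathbf{K}}}_g \hat{\mathbf{h}}_{g_j} \right|^2 + \sum_{g' \neq g,\, j} \frac{P}{S} \hat{\bar{\zeta}}_{g'}^2 \left| \mathbf{h}_{g_k}^{H} \mathbf{B}_{g'} \hat{\bar{\mathbf{K}}}_{g'} \hat{\mathbf{h}}_{g'_j} \right|^2 + 1} \tag{A.20}$$

The deterministic equivalent of the SINR is given as

$$\hat{\gamma}_{g_k,\mathrm{pgp,zf,csi}} - \hat{\gamma}^{o}_{g_k,\mathrm{pgp,zf,csi}} \xrightarrow{M \to \infty} 0 \tag{A.21}$$

where $\hat{\gamma}^{o}_{g_k,\mathrm{pgp,zf,csi}}$ is given by

\begin{align}
\hat{\gamma}^{o}_{g_k,\mathrm{pgp,zf,csi}} &= \frac{\frac{P}{S} \hat{\bar{\zeta}}_g^2}{1 + \frac{P}{S} \hat{\bar{\zeta}}_g^2 \hat{\bar{E}}^{o}_g (\hat{\bar{m}}^{o}_g)^2 + \frac{P}{S} \hat{\bar{\zeta}}_g^2 \hat{\bar{\Upsilon}}^{o}_{g,g} + \sum_{g' \neq g} \frac{P}{S} \hat{\bar{\zeta}}_{g'}^2 \hat{\bar{\Upsilon}}^{o}_{g,g'}} \notag \\
&= \frac{\frac{P}{S} \hat{\bar{\zeta}}_g^2}{1 + \frac{P}{S} \hat{\bar{\zeta}}_g^2 S_g \hat{\bar{E}}^{o}_g (\hat{\bar{m}}^{o}_g)^2 + \sum_{g' \neq g} \frac{P}{S} \hat{\bar{\zeta}}_{g'}^2 \hat{\bar{\Upsilon}}^{o}_{g,g'}} \tag{A.22}
\end{align}

with $\hat{\bar{\zeta}}_g^2 = \frac{1}{\hat{\bar{\Gamma}}^{o}_g}$ and the quantities $\hat{\bar{\Gamma}}^{o}_g$, $\hat{\bar{\Upsilon}}^{o}_{g,g'}$, and $\hat{\bar{m}}^{o}_g$ are given by¹

$$\hat{\bar{m}}^{o}_g = \frac{1}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right) \tag{A.23}$$

$$\hat{\bar{\mathbf{T}}}_g = \left( \frac{S_g}{b_g} \frac{\hat{\bar{\mathbf{R}}}_g}{\hat{\bar{m}}^{o}_g} + \mathbf{I}_{b_g} \right)^{-1} \tag{A.24}$$

$$\hat{\bar{\Gamma}}^{o}_g = \frac{1}{b_g} \frac{\hat{\bar{n}}_g}{(\hat{\bar{m}}^{o}_g)^2} \tag{A.25}$$

$$\hat{\bar{\Upsilon}}^{o}_{g,g'} = \frac{S_{g'}}{b_{g'}} \frac{\hat{\bar{n}}_{g',g}}{(\hat{\bar{m}}^{o}_{g'})^2} \tag{A.26}$$

$$\hat{\bar{n}}_g = \frac{\frac{1}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \mathbf{B}_g^{H} \mathbf{B}_g \hat{\bar{\mathbf{T}}}_g \right)}{1 - \frac{\frac{S_g}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right)}{b_g (\hat{\bar{m}}^{o}_g)^2}} \tag{A.27}$$

$$\hat{\bar{n}}_{g',g} = \frac{\frac{1}{b_{g'}} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_{g'} \hat{\bar{\mathbf{T}}}_{g'} \mathbf{B}_{g'}^{H} \mathbf{R}_g \mathbf{B}_{g'} \hat{\bar{\mathbf{T}}}_{g'} \right)}{1 - \frac{\frac{S_{g'}}{b_{g'}} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_{g'} \hat{\bar{\mathbf{T}}}_{g'} \hat{\bar{\mathbf{R}}}_{g'} \hat{\bar{\mathbf{T}}}_{g'} \right)}{b_{g'} (\hat{\bar{m}}^{o}_{g'})^2}} \tag{A.28}$$

$$\hat{\bar{\Upsilon}}^{o}_{g,g} = A_1 - A_2 \tag{A.29}$$

$$A_1 = \frac{1}{b_g} (S_g - 1) \frac{\hat{\bar{n}}_{g,g,1}}{(\hat{\bar{m}}^{o}_g)^2} \tag{A.30}$$

$$A_2 = \frac{1}{b_g} (S_g - 1) \frac{\hat{\bar{n}}_{g,g,2}}{(\hat{\bar{m}}^{o}_g)^2} \tag{A.31}$$

$$\hat{\bar{n}}_{g,g,1} = \frac{\frac{1}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \bar{\mathbf{R}}_g \hat{\bar{\mathbf{T}}}_g \right)}{1 - \frac{\frac{S_g}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right)}{b_g (\hat{\bar{m}}^{o}_g)^2}} \tag{A.32}$$

$$\hat{\bar{n}}_{g,g,2} = \frac{\frac{1}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right)}{1 - \frac{\frac{S_g}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right)}{b_g (\hat{\bar{m}}^{o}_g)^2}} \tag{A.33}$$

$$\hat{\bar{E}}^{o}_g = \frac{1}{b_g} \cdot \frac{\frac{1}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g ( \bar{\mathbf{R}}_g - \hat{\bar{\mathbf{R}}}_g ) \hat{\bar{\mathbf{T}}}_g \right)}{1 - \frac{\frac{S_g}{b_g} \mathrm{tr}\left( \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \hat{\bar{\mathbf{R}}}_g \hat{\bar{\mathbf{T}}}_g \right)}{b_g (\hat{\bar{m}}^{o}_g)^2}} \tag{A.34}$$

¹ It is easy to see that when $\mathbf{B}_g^{H} \mathbf{B}_g = \mathbf{I}_{b_g}$, $\hat{\bar{n}}_g = \hat{\bar{m}}_g$.

In order to obtain the desired expression (A.22), we notice that

$$\hat{\bar{E}}^{o}_g (\hat{\bar{m}}^{o}_g)^2 + \hat{\bar{\Upsilon}}^{o}_{g,g} = S_g \hat{\bar{E}}^{o}_g (\hat{\bar{m}}^{o}_g)^2 \tag{A.35}$$

A.2 General formula for S(ξ)

We find the general expression for $S(\xi)$ defined in (2.56) for $r_m = [\mathbf{R}]_{\ell,\ell-m}$ with $[\mathbf{R}]_{m,p}$ given by (2.55), without any restriction on the AoA range. We have:

\begin{align}
S(\xi) &= \sum_{m=-\infty}^{\infty} \left[ \frac{1}{2\Delta} \int_{-\Delta+\theta}^{\Delta+\theta} e^{-j 2\pi D m \sin(\alpha)} \, d\alpha \right] e^{-j 2\pi \xi m} \notag \\
&= \frac{1}{2\Delta} \int_{-\Delta+\theta}^{\Delta+\theta} \left[ \sum_{m=-\infty}^{\infty} e^{-j 2\pi m (D \sin(\alpha) + \xi)} \right] d\alpha \notag \\
&= \frac{1}{2\Delta} \int_{-\Delta+\theta}^{\Delta+\theta} \left[ \sum_{m=-\infty}^{\infty} \delta(D \sin(\alpha) + \xi - m) \right] d\alpha \notag \\
&= \frac{1}{2\Delta} \int \left[ \sum_{m=-\infty}^{\infty} \delta(z + \xi - m) \right] \frac{dz}{\sqrt{D^2 - z^2}}, \tag{A.36}
\end{align}

The limits in (A.36) depend on the range of $[\theta - \Delta, \theta + \Delta]$. We distinguish the following cases:

1. For $\theta + \Delta < -\frac{\pi}{2}$, $\theta - \Delta > \frac{\pi}{2}$ and $-\frac{\pi}{2} \leq \theta - \Delta < \theta + \Delta \leq \frac{\pi}{2}$, (A.36) becomes

$$\sum_{m=-\infty}^{\infty} \left[ \int_{\min(D \sin(\theta-\Delta),\, D \sin(\theta+\Delta))}^{\max(D \sin(\theta-\Delta),\, D \sin(\theta+\Delta))} \delta(z + \xi - m) \frac{dz}{\sqrt{D^2 - z^2}} \right] \tag{A.37}$$

2. For $\theta - \Delta < -\frac{\pi}{2}$, $\theta + \Delta > \frac{\pi}{2}$, (A.36) becomes

\begin{align}
\sum_{m=-\infty}^{\infty} \Bigg[ & \int_{-D}^{D \sin(\theta-\Delta)} \delta(z + \xi - m) \frac{dz}{\sqrt{D^2 - z^2}} + \int_{-D}^{D} \delta(z + \xi - m) \frac{dz}{\sqrt{D^2 - z^2}} \tag{A.38} \\
& + \int_{D \sin(\theta+\Delta)}^{D} \delta(z + \xi - m) \frac{dz}{\sqrt{D^2 - z^2}} \Bigg] \tag{A.39}
\end{align}

3. For $\theta - \Delta < -\frac{\pi}{2}$, $-\frac{\pi}{2} \leq \theta + \Delta \leq \frac{\pi}{2}$, (A.36) becomes

$$\sum_{m=-\infty}^{\infty} \left[ \int_{-D}^{D \sin(\theta-\Delta)} \delta(z + \xi - m) \frac{dz}{\sqrt{D^2 - z^2}} + \int_{-D}^{D \sin(\theta+\Delta)} \delta(z + \xi - m) \frac{dz}{\sqrt{D^2 - z^2}} \right] \tag{A.40}$$

4. For $-\frac{\pi}{2} \leq \theta - \Delta \leq \frac{\pi}{2}$, $\theta + \Delta > \frac{\pi}{2}$, (A.36) becomes

$$\sum_{m=-\infty}^{\infty} \left[ \int_{D \sin(\theta-\Delta)}^{D} \delta(z + \xi - m) \frac{dz}{\sqrt{D^2 - z^2}} + \int_{D \sin(\theta+\Delta)}^{D} \delta(z + \xi - m) \frac{dz}{\sqrt{D^2 - z^2}} \right] \tag{A.41}$$

Now, owing to the property of the Dirac delta function, we have

$$\sum_{m=-\infty}^{\infty} \int_{A}^{B} \delta(z + \xi - m) \frac{dz}{\sqrt{D^2 - z^2}} = \sum_{m \in [A+\xi,\, B+\xi]} \frac{1}{\sqrt{D^2 - (m - \xi)^2}} \tag{A.42}$$

as a result of which we can write $S(\xi)$ for the cases identified above as

1. Case $\theta + \Delta < -\frac{\pi}{2}$, $\theta - \Delta > \frac{\pi}{2}$ and $-\frac{\pi}{2} \leq \theta - \Delta < \theta + \Delta \leq \frac{\pi}{2}$

$$S(\xi) = \frac{1}{2\Delta} \sum_{m \in [\min(D \sin(-\Delta+\theta),\, D \sin(\Delta+\theta)) + \xi,\; \max(D \sin(-\Delta+\theta),\, D \sin(\Delta+\theta)) + \xi]} \frac{1}{\sqrt{D^2 - (m - \xi)^2}} \tag{A.43}$$

2. Case $\theta - \Delta < -\frac{\pi}{2}$, $\theta + \Delta > \frac{\pi}{2}$

$$S(\xi) = \frac{1}{2\Delta} \sum_{m \in [-D+\xi,\; D \sin(-\Delta+\theta)+\xi]} \frac{1}{\sqrt{D^2 - (m - \xi)^2}} + \frac{1}{2\Delta} \sum_{m \in (-D+\xi,\; D+\xi)} \frac{1}{\sqrt{D^2 - (m - \xi)^2}} + \frac{1}{2\Delta} \sum_{m \in [D \sin(\Delta+\theta)+\xi,\; D+\xi]} \frac{1}{\sqrt{D^2 - (m - \xi)^2}} \tag{A.44}$$

3. Case $\theta - \Delta < -\frac{\pi}{2}$, $-\frac{\pi}{2} \leq \theta + \Delta \leq \frac{\pi}{2}$

$$S(\xi) = \frac{1}{2\Delta} \sum_{m \in [-D+\xi,\; D \sin(-\Delta+\theta)+\xi]} \frac{1}{\sqrt{D^2 - (m - \xi)^2}} + \frac{1}{2\Delta} \sum_{m \in (-D+\xi,\; D \sin(\Delta+\theta)+\xi]} \frac{1}{\sqrt{D^2 - (m - \xi)^2}} \tag{A.45}$$

4. Case $-\frac{\pi}{2} \leq \theta - \Delta \leq \frac{\pi}{2}$, $\theta + \Delta > \frac{\pi}{2}$

$$S(\xi) = \frac{1}{2\Delta} \sum_{m \in [D \sin(-\Delta+\theta)+\xi,\; D+\xi]} \frac{1}{\sqrt{D^2 - (m - \xi)^2}} + \frac{1}{2\Delta} \sum_{m \in [D \sin(\Delta+\theta)+\xi,\; D+\xi)} \frac{1}{\sqrt{D^2 - (m - \xi)^2}} \tag{A.46}$$

It is easy to see that the formula reduces to (2.62) when $-\frac{\pi}{2} \leq \theta - \Delta < \theta + \Delta \leq \frac{\pi}{2}$.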
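To make the first case concrete, the following Python sketch evaluates (A.43) for an AoA interval contained in $[-\pi/2, \pi/2]$, where the sum runs over the integers $m$ in $[D\sin(\theta-\Delta)+\xi,\, D\sin(\theta+\Delta)+\xi]$. The function name and the parameter values in the usage lines are illustrative assumptions, and the guard that skips the integrable singularity at $|m - \xi| = D$ is a numerical convenience rather than part of the formula.

```python
import math


def eigenvalue_spectrum(xi, D, theta, delta):
    """Evaluate S(xi) as in (A.43); assumes -pi/2 <= theta - delta < theta + delta <= pi/2."""
    lo = D * math.sin(theta - delta) + xi
    hi = D * math.sin(theta + delta) + xi
    total = 0.0
    # integers m in the closed interval [lo, hi]
    for m in range(math.ceil(lo), math.floor(hi) + 1):
        radicand = D * D - (m - xi) ** 2
        if radicand > 0.0:  # skip the endpoint singularity |m - xi| = D
            total += 1.0 / math.sqrt(radicand)
    return total / (2.0 * delta)


if __name__ == "__main__":
    # half-wavelength spacing (D = 1/2), broadside group with a 45-degree half-width
    for xi in (-0.4, -0.2, 0.0, 0.2, 0.4):
        print(xi, eigenvalue_spectrum(xi, 0.5, 0.0, math.pi / 4))
```

For $D = 1/2$, $\theta = 0$ and $\Delta = \pi/4$, only $m = 0$ can contribute, so $S(\xi)$ is nonzero only for $|\xi| \leq D\sin(\Delta)$. The spectrum integrates to $r_0 = 1$ over $\xi \in [-1/2, 1/2]$, which gives a quick sanity check of the normalization.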
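Taking the limits from $-\pi$ to $\pi$ recovers the Fourier transform of the Bessel $J_0$ function commonly used to model correlated Rayleigh fading in an isotropic scattering environment [Bel63], given by $\frac{1}{\pi} \frac{\mathrm{rect}(\xi/2D)}{\sqrt{D^2 - \xi^2}}$, $\xi \in [-1/2, 1/2]$ for $D \in [0, \frac{1}{2}]$.

A.3 Converse of Theorem 4

Case $M \geq \sum_{g=1}^{G} r_g$: Letting $P_{g_k}$ denote the power allocated to user $g_k$, $\mathbf{Q}_g = \mathrm{diag}(P_{g_1}, \ldots, P_{g_{K'}})$ and $P_g = \sum_{k=1}^{K'} P_{g_k}$, the uplink-downlink duality result of [VJG03] yields the sum capacity in the form

\begin{align}
R_{\mathrm{sum}} &= \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} \log\det\left( \mathbf{I}_M + \sum_{g=1}^{G} \sum_{k=1}^{K'} \mathbf{h}_{g_k} \mathbf{h}_{g_k}^{H} P_{g_k} \right) \right] \notag \\
&= \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} \log\det\left( \mathbf{I}_M + \mathbf{H}\, \mathrm{diag}(\mathbf{Q}_1, \ldots, \mathbf{Q}_G)\, \mathbf{H}^{H} \right) \right] \notag \\
&= \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} \log\det\left( \mathbf{I}_K + \mathbf{H}^{H} \mathbf{H}\, \mathrm{diag}(\mathbf{Q}_1, \ldots, \mathbf{Q}_G) \right) \right] \notag \\
&\overset{(a)}{\leq} \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} \sum_{g=1}^{G} \log\det\left( \mathbf{I}_{K'} + \mathbf{H}_g^{H} \mathbf{H}_g \mathbf{Q}_g \right) \right] \notag \\
&\overset{(b)}{=} \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} \sum_{g=1}^{G} \log\det\left( \mathbf{I}_{r_g} + \Lambda_g^{1/2} \mathbf{W}_g \mathbf{Q}_g \mathbf{W}_g^{H} \Lambda_g^{1/2} \right) \right] \notag \\
&= \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} \sum_{g=1}^{G} \log\left( \det(\Lambda_g) \det\left( \Lambda_g^{-1} + \mathbf{W}_g \mathbf{Q}_g \mathbf{W}_g^{H} \right) \right) \right], \tag{A.47}
\end{align}

where (a) is due to the Hadamard inequality for block matrices and (b) follows by using the form (2.2) for the user channels and by letting the common covariance matrix of the channels in group $g$ be $\mathbf{R}_g = \mathbf{U}_g \Lambda_g \mathbf{U}_g^{H}$, of rank $r_g$. Extracting the constant term $\sum_{g=1}^{G} \log\det(\Lambda_g)$ and focusing on the remaining expression, we have

\begin{align}
\mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} \sum_{g=1}^{G} \log\det\left( \Lambda_g^{-1} + \mathbf{W}_g \mathbf{Q}_g \mathbf{W}_g^{H} \right) \right]
&\overset{(c)}{\leq} \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} \sum_{g=1}^{G} r_g \log\left( \frac{\mathrm{tr}\left( \Lambda_g^{-1} + \mathbf{W}_g \mathbf{Q}_g \mathbf{W}_g^{H} \right)}{r_g} \right) \right] \notag \\
&\overset{(d)}{\leq} \max_{\sum_{g=1}^{G} P_g \leq P} \sum_{g=1}^{G} r_g \log\left( \frac{\mathrm{tr}(\Lambda_g^{-1})}{r_g} + \mathbb{E}\left[ \max_k \|\mathbf{w}_{g_k}\|^2 \right] \frac{P_g}{r_g} \right) \notag \\
&\overset{(e)}{=} \max_{\sum_{g=1}^{G} P_g \leq P} \sum_{g=1}^{G} r_g \log\left( \frac{\mathrm{tr}(\Lambda_g^{-1})}{r_g} + \log(K') \frac{P_g}{r_g} + O(\log\log K') \right) \notag \\
&= \max_{\sum_{g=1}^{G} P_g \leq P} \left[ \sum_{g=1}^{G} r_g \log\left( \log(K') \frac{P_g}{r_g} \right) + o(1) \right] \tag{A.48}
\end{align}

where (c) follows by using the inequality $\det(\mathbf{A}) \leq \left( \frac{1}{r} \mathrm{tr}(\mathbf{A}) \right)^{r}$ for a matrix $\mathbf{A}$ of rank $r$, (d) is obtained by allocating the whole group power $P_g$ to the user $k = \arg\max \{ \|\mathbf{w}_{g_k}\|^2 \}$ and by using Jensen's inequality, and (e) follows from the fact that, for large $K'$, $\mathbb{E}[\max_k \|\mathbf{w}_{g_k}\|^2] = \log K' + O(\log\log K')$ (see [SH05], Appendix A). Hence, for large $K'$, putting together (A.47) and (A.48) and optimizing with respect to $P_1, \ldots, P_G$, the upper bound takes on the form:

$$R_{\mathrm{sum}} \leq \sum_{g=1}^{G} r_g \left[ \log\log(K') + \log\frac{P}{\sum_{g=1}^{G} r_g} \right] + \sum_{g=1}^{G} \log\det(\Lambda_g) + o(1). \tag{A.49}$$

Case $M < \sum_{g=1}^{G} r_g$: In this case, we can write the sum capacity as

\begin{align}
R_{\mathrm{sum}} &= \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} \log\det\left( \mathbf{I}_M + \sum_{g=1}^{G} \sum_{k=1}^{K'} \mathbf{h}_{g_k} \mathbf{h}_{g_k}^{H} P_{g_k} \right) \right] \notag \\
&\leq \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} M \log\left( \frac{\mathrm{tr}\left( \mathbf{I}_M + \sum_{g=1}^{G} \sum_{k=1}^{K'} \mathbf{h}_{g_k} \mathbf{h}_{g_k}^{H} P_{g_k} \right)}{M} \right) \right] \notag \\
&= \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} M \log\left( 1 + \sum_{g=1}^{G} \sum_{k=1}^{K'} \frac{\mathrm{tr}\left( \mathbf{h}_{g_k} \mathbf{h}_{g_k}^{H} \right) P_{g_k}}{M} \right) \right] \notag \\
&= \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} M \log\left( 1 + \sum_{g=1}^{G} \sum_{k=1}^{K'} \frac{\|\mathbf{h}_{g_k}\|^2 P_{g_k}}{M} \right) \right] \notag \\
&= \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} M \log\left( 1 + \sum_{g=1}^{G} \sum_{k=1}^{K'} \frac{\mathbf{w}_{g_k}^{H} \mathbf{R}_g \mathbf{w}_{g_k} P_{g_k}}{M} \right) \right] \notag \\
&\overset{(a)}{\leq} \mathbb{E}\left[ \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} M \log\left( 1 + \sum_{g=1}^{G} \sum_{k=1}^{K'} \frac{\|\mathbf{w}_{g_k}\|^2 \lambda_{\max} P_{g_k}}{M} \right) \right] \notag \\
&\overset{(b)}{\leq} \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} M \log\left( 1 + \sum_{g=1}^{G} \frac{\mathbb{E}\left[ \max_k \|\mathbf{w}_{g_k}\|^2 \right] \lambda_{\max} \sum_{k=1}^{K'} P_{g_k}}{M} \right) \notag \\
&= \max_{\sum_{g=1}^{G} \sum_{k=1}^{K'} P_{g_k} \leq P} M \log\left( 1 + \sum_{g=1}^{G} \frac{\lambda_{\max} P_g \log K'}{M} \right) + o(1) \notag \\
&= M \log\left( 1 + \frac{\lambda_{\max} P \log K'}{M} \right) + o(1) \notag \\
&= M \log \lambda_{\max} + M \log\frac{P}{M} + M \log\log K' + o(1) \tag{A.50}
\end{align}

where (a) follows from the Rayleigh-Ritz theorem, for which $\mathbf{w}_{g_k}^{H} \mathbf{R}_g \mathbf{w}_{g_k} \leq \lambda_{\max,g} \|\mathbf{w}_{g_k}\|^2$, where $\lambda_{\max,g}$ is the maximum eigenvalue of $\mathbf{R}_g$ and we let $\lambda_{\max} = \max_g \lambda_{\max,g}$, and (b) is due to Jensen's inequality. As a result, we have

$$R_{\mathrm{sum}} \leq M \left[ \log\log(K') + \log\frac{P}{M} \right] + M \log \lambda_{\max} + o(1),$$

where $\lambda_{\max} = \max_g \lambda_{\max,g}$ and $\lambda_{\max,g}$ is the maximum eigenvalue of $\mathbf{R}_g$.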
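The two converse bounds can be evaluated side by side with a short Python sketch. It drops the $o(1)$ terms, uses natural logarithms (rates in nats), and takes as input the nonzero eigenvalues of each group covariance $\mathbf{R}_g$, so that the rank $r_g$ is the length of each list. The function name and the example numbers are illustrative assumptions and not results from this work.

```python
import math


def sum_rate_upper_bound(M, P, K_prime, eigenvalues):
    """Leading terms of (A.49) or of the final bound following (A.50), without the o(1) term.

    eigenvalues: one list per group with the nonzero eigenvalues of R_g.
    """
    total_rank = sum(len(lam) for lam in eigenvalues)
    loglog_k = math.log(math.log(K_prime))
    if M >= total_rank:
        # (A.49): the pre-log factor is the sum of the group ranks
        log_det = sum(math.log(l) for lam in eigenvalues for l in lam)
        return total_rank * (loglog_k + math.log(P / total_rank)) + log_det
    # Case M < sum of ranks: the pre-log factor is limited by the number of BS antennas
    lam_max = max(l for lam in eigenvalues for l in lam)
    return M * (loglog_k + math.log(P / M)) + M * math.log(lam_max)


if __name__ == "__main__":
    groups = [[1.0, 1.0], [1.0, 1.0]]  # two groups of rank 2 with unit eigenvalues
    for M in (2, 8):
        print(M, sum_rate_upper_bound(M, P=4.0, K_prime=1000, eigenvalues=groups))
```

In both cases the coefficient multiplying $\log\log K'$ equals $\min(M, \sum_g r_g)$, which is the pre-log behaviour summarized in the Conclusion.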
Combining (A.49) and (A.51), we conclude that $R_{\mathrm{sum}}$ is upper bounded by a quantity in the form of the right-hand side of (3.1), and the converse is proved.

A.4 Limit of the growth function

Letting $i = 1$ in (3.12), we get

$$\mu_{g,m,1}(x) \leq \lambda_1(\mathbf{A}'_{g,m}) - x \lambda_{r_g}(\mathbf{A}''_{g,m}) \leq \lambda_1(\mathbf{A}'_{g,m}) \tag{A.51}$$

Thus, we have

$$\max_{\mathbf{q}} \left| \mathbf{q}^{H} \mathbf{U}_g^{H} \mathbf{b}_{g_m} \right|^2 \leq \mu_{g,m,1}(x) \leq \lambda_1(\mathbf{A}'_{g,m}) \tag{A.52}$$

Hence, $\mu_{g,m,1}(x)$ is uniformly upper and lower bounded by constants independent of $x$. Furthermore, from the expression of $\mathbf{A}_{g,m}(x)$ it is immediately seen that $\mu_{g,m,1}(x)$ is non-increasing in $x$. Therefore, the limit $\lim_{x \to \infty} \mu_{g,m,1}(x) = \mu^{\infty}_{g,m,1}$ exists. After some tedious but straightforward algebra, from (3.13) we obtain the growth function explicitly as

$$g(x) = \left[ \frac{1}{\rho \mu_{g,m,1}(x)} - \frac{x \dot{\mu}_{g,m,1}(x)}{\rho (\mu_{g,m,1}(x))^2} + \sum_{i=2}^{r_g} \frac{\mu_{g,m,i}(x) \dot{\mu}_{g,m,1}(x) - \mu_{g,m,1}(x) \dot{\mu}_{g,m,i}(x)}{(\mu_{g,m,1}(x) - \mu_{g,m,i}(x)) \mu_{g,m,1}(x)} \right]^{-1} \tag{A.53}$$

where $\dot{\mu}_{g,m,1}(x) = \frac{d}{dx} \mu_{g,m,1}(x)$. As $x \to \infty$, $\mu_{g,m,1}(x) \to \mu^{\infty}_{g,m,1}$, $x \dot{\mu}_{g,m,1}(x) \to 0$, $\frac{\mu_{g,m,i}(x) \dot{\mu}_{g,m,1}(x)}{\mu_{g,m,1}(x) - \mu_{g,m,i}(x)} \to 0$ and $\frac{\dot{\mu}_{g,m,i}(x) \mu_{g,m,1}(x)}{\mu_{g,m,1}(x) - \mu_{g,m,i}(x)} \to 0$, yielding $g^{\infty} = \rho \mu^{\infty}_{g,m,1}$, as we wanted to show. In order to prove these limits, we need the following result, which follows from standard eigenvalue perturbation theory [MH88]:

Lemma 5 If the eigenvector corresponding to the $i$-th largest eigenvalue of $\mathbf{A}_{g,m}(x) = \mathbf{A}'_{g,m} - x \mathbf{A}''_{g,m}$ is $\mathbf{u}_{g,m,i}(x)$, then we have

$$\dot{\mu}_{g,m,i}(x) = \mathbf{u}_{g,m,i}^{H}(x) \frac{d}{dx} \left[ \mathbf{A}'_{g,m} - x \mathbf{A}''_{g,m} \right] \mathbf{u}_{g,m,i}(x) = -\mathbf{u}_{g,m,i}^{H}(x) \mathbf{A}''_{g,m} \mathbf{u}_{g,m,i}(x). \tag{A.54}$$

It is clear that as $x \to \infty$, $\mu_{g,m,i}(x) \to -\infty$ for all $i > 1$. Also,

$$\lim_{x \to \infty} \dot{\mu}_{g,m,i}(x) = -\lim_{x \to \infty} \mathbf{u}_{g,m,i}^{H}(x) \mathbf{A}''_{g,m} \mathbf{u}_{g,m,i}(x) = \text{constant}$$

Furthermore, $\dot{\mu}_{g,m,1}(x) \to 0$ since $\mu_{g,m,1}(x)$ converges to a constant. Since

$$\dot{\mu}_{g,m,1}(x) = -\mathbf{u}_{g,m,1}^{H}(x) \mathbf{A}''_{g,m} \mathbf{u}_{g,m,1}(x), \tag{A.55}$$

the vector $\mathbf{u}^{\infty}_{g,m,1}$ lies in the orthogonal complement of $\mathbf{A}''_{g,m}$.
Thus, $\mu^{\infty}_{g,m,1} = \mathbf{u}_{g,m,1}^{H}(\infty) \mathbf{A}'_{g,m} \mathbf{u}_{g,m,1}(\infty)$. Now, we have

\begin{align}
\lim_{x \to \infty} \mu_{g,m,1}(x) &= \mathbf{u}_{g,m,1}^{H}(\infty) \mathbf{A}'_{g,m} \mathbf{u}_{g,m,1}(\infty) - \lim_{x \to \infty} x\, \mathbf{u}_{g,m,1}^{H}(x) \mathbf{A}''_{g,m} \mathbf{u}_{g,m,1}(x) \notag \\
&= \mu^{\infty}_{g,m,1} + \lim_{x \to \infty} x \dot{\mu}_{g,m,1}(x), \tag{A.56}
\end{align}

implying that $\lim_{x \to \infty} x \dot{\mu}_{g,m,1}(x) = 0$. Using these results, all the limits are proved.

References

[ABB+07] Peter Almers, Ernst Bonek, A Burr, Nicolai Czink, Mérouane Debbah, Vittorio Degli-Esposti, Helmut Hofstetter, P Kyösti, D Laurenson, Gerald Matz, et al. Survey of Channel and Radio Propagation Models for Wireless MIMO Systems. EURASIP Journal on Wireless Communications and Networking, 2007.
[AGL+06] Rashid Attar, Donna Ghosh, Chris Lott, Mingxi Fan, Peter Black, Ramin Rezaiifar, and Parag Agashe. Evolution of CDMA2000 cellular networks: Multi-carrier EV-DO. IEEE Communications Magazine, 44(3):46–53, 2006.
[AGM+06] H. Asplund, A.A. Glazunov, A.F. Molisch, K.I. Pedersen, and M. Steinbauer. The Cost 259 Directional Channel Model-Part II: Macrocells. Wireless Communications, IEEE Transactions on, 5(12):3434–3450, 2006.
[ANC11] Ansuman Adhikary, Vasileios Ntranos, and Giuseppe Caire. Cognitive Femtocells: Breaking the Spatial Reuse Barrier of Cellular Systems. In Information Theory and Applications Workshop (ITA), 2011, pages 1–10. IEEE, 2011.
[ANSH09] T.Y. Al-Naffouri, M. Sharif, and B. Hassibi. How Much Does Transmit Correlation Affect the Sum-Rate Scaling of MIMO Gaussian Broadcast Channels? IEEE Trans. on Commun., 57(2):562–572, 2009.
[ARAS+13] Omar El Ayach, Sridhar Rajagopal, Shadi Abu-Surra, Zhouyue Pi, and Robert W Heath Jr. Spatially Sparse Precoding in Millimeter Wave MIMO Systems. arXiv preprint arXiv:1305.2460, 2013.
[AWW+13] Y. Azar, G.N. Wong, K. Wang, R. Mayzus, J.K. Schulz, Hang Zhao, F. Gutierrez, D. Hwang, and T.S. Rappaport. 28 GHz Propagation Measurements for Outdoor Cellular Communications Using Steerable Beam Antennas in New York City.
In Communications (ICC), 2013 IEEE International Conference on, pages 5143–5147, 2013.
[Bel63] P. Bello. Characterization of Randomly Time-Variant Linear Channels. IEEE Transactions on Communications Systems, 11(4):360–393, 1963.
[BN02] Alexander Barg and D Yu Nogin. Bounds on Packings of Spheres in the Grassmann Manifold. Information Theory, IEEE Transactions on, 48(9):2450–2454, 2002.
[CAG08] Vikram Chandrasekhar, Jeffrey Andrews, and Alan Gatherer. Femtocell Networks: A Survey. Communications Magazine, IEEE, 46(9):59–67, 2008.
[CBTF09] G Caire, S Bellini, A Tomasoni, and M Ferrari. On the Selection of Semi-Orthogonal Users for Zero-Forcing Beamforming. In IEEE International Symposium on Information Theory, 2009. ISIT 2009, pages 1100–1104. IEEE, 2009.
[CD11] R. Couillet and M. Debbah. Random Matrix Methods for Wireless Communications. Cambridge Univ Pr, 2011.
[CDS11] Romain Couillet, Mérouane Debbah, and Jack W Silverstein. A Deterministic Equivalent for the Analysis of Correlated MIMO Multiple Access Channels. Information Theory, IEEE Transactions on, 57(6):3493–3514, 2011.
[CJKR10] Giuseppe Caire, Nihar Jindal, Mari Kobayashi, and Niranjay Ravindran. Multiuser MIMO Achievable Rates with Downlink Training and Channel State Feedback. IEEE Trans. on Inform. Theory, 56(6):2845–2866, June 2010.
[CRP10] G. Caire, S.A. Ramprashad, and H.C. Papadopoulos. Rethinking Network MIMO: Cost of CSIT, Performance Analysis, and Architecture Comparisons. In Information Theory and Applications Workshop (ITA), 2010, pages 1–10. IEEE, 2010.
[CS03] G. Caire and S. Shamai. On the Achievable Throughput of a Multi-Antenna Gaussian Broadcast Channel. IEEE Transactions on Information Theory, 49(7):1691–1706, 2003.
[CSAL+12] H.Y. Clayton Shepard, N. Anand, L.E. Li, T. Marzetta, R. Yang, and L. Zhong. Argos: Practical Many-Antenna Base Stations. In Proceedings of the 18th annual international conference on Mobile computing and networking, pages 53–64. ACM, 2012.
[CWD09] Romain Couillet, Sebastian Wagner, and Mérouane Debbah. Asymptotic Analysis of Linear Precoding Techniques in Correlated Multi-Antenna Broadcast Channels. CoRR, abs/0906.3682, 2009.
[DEKVV11] Vittorio Degli-Esposti, V Kolmonen, Enrico M Vitucci, and Pertti Vainikainen. Analysis and Modeling on co- and Cross-Polarized Urban Radio Propagation for Dual-Polarized MIMO Wireless Systems. Antennas and Propagation, IEEE Transactions on, 59(11):4247–4256, 2011.
[DS05] G. Dimic and N.D. Sidiropoulos. On Downlink Beamforming With Greedy User Selection: Performance Analysis and a Simple New Algorithm. Signal Processing, IEEE Transactions on, 53(10):3857–3868, 2005.
[ESS94] A. Eriksson, P. Stoica, and T. Soderstrom. On-Line Subspace Algorithms for Tracking Moving Sources. IEEE Trans. on Sig. Proc., 42(9):2319–2330, 1994.
[FMB98] J. Fuhl, A.F. Molisch, and E. Bonek. Unified channel model for mobile radio systems with smart antennas. Radar, Sonar and Navigation, IEE Proceedings -, 145(1):32–41, 1998.
[Fra12] Joel N Franklin. Matrix Theory. Courier Dover Publications, 2012.
[Gas06] Matthew Gast. 802.11 Wireless Networks. O'Reilly, 2006.
[GMR+12] Amitabha Ghosh, Nitin Mangalvedhe, Rapeepat Ratasuk, Bishwarup Mondal, Mark Cudak, Eugene Visotsky, Timothy A Thomas, Jeffrey G Andrews, Ping Xia, Han Shin Jo, et al. Heterogeneous Cellular Networks: From Theory to Practice. Communications Magazine, IEEE, 50(6):54–64, 2012.
[Gra06] R.M. Gray. Toeplitz and Circulant matrices: A Review. Now Pub, 2006.
[GS84] U. Grenander and G. Szegő. Toeplitz Forms and their Applications. Chelsea Pub Co, 1984.
[GS92] GR Grimmett and DR Stirzaker. Probability and Random Processes. Oxford science publications, 1992.
[GWJ10] T. Gou, C. Wang, and S.A. Jafar. Aiming Perfectly in the Dark - Blind Interference Alignment through Staggered Antenna Switching. In IEEE Global Telecommunications Conference (GLOBECOM 2010), pages 1–5. IEEE, 2010.
[HCPR12] H. Huh, G. Caire, H.C. Papadopoulos, and S.A. Ramprashad.
Achieving "Massive MIMO" Spectral Efficiency with a Not-so-Large Number of Antennas. IEEE Trans. on Wireless Commun., PP(99):1–14, 2012.
[HHBD13] Jakob Hoydis, Kianoush Hosseini, Stephan ten Brink, and Mérouane Debbah. Making Smart Use of Excess Antennas: Massive MIMO, Small Cells, and TDD. Bell Labs Technical Journal, 18(2):5–21, 2013.
[HM01] B.M. Hochwald and TL Marzetta. Adapting a Downlink Array from Uplink Measurements. IEEE Trans. on Sig. Proc., 49(3):642–653, 2001.
[HT07] Harri Holma and Antti Toskala. HSDPA/HSUPA for UMTS: High Speed Radio Access for Mobile Communications. Wiley.com, 2007.
[HtBD13] J. Hoydis, S. ten Brink, and M. Debbah. Massive MIMO in the UL/DL of Cellular Networks: How Many Antennas Do We Need? IEEE Jour. Select. Areas in Comm., 31(2):160–171, Feb. 2013.
[HTC12] Hoon Huh, Antonia M. Tulino, and Giuseppe Caire. Network MIMO With Linear Zero-Forcing Beamforming: Large System Analysis, Impact of Channel Estimation, and Reduced-Complexity Scheduling. IEEE Trans. on Inform. Theory, 58(5):2911–2934, 2012.
[JAMV11] J. Jose, A. Ashikhmin, T. L. Marzetta, and S. Vishwanath. Pilot Contamination and Precoding in Multi-Cell TDD Systems. IEEE Trans. on Wireless Comm., 10(8):2640–2651, Aug. 2011.
[KC12] M. Kobayashi and G. Caire. On the net DoF comparison between ZF and MAT over time-varying MISO broadcast channels. In IEEE International Symposium on Information Theory, 2012.
[KJC11] M. Kobayashi, N. Jindal, and G. Caire. Training and Feedback Optimization for Multiuser MIMO Downlink. IEEE Trans. on Commun., 59(8), 2011.
[Lap09] A. Lapidoth. A Foundation in Digital Communications. Cambridge University Press, 2009.
[Lee73] W.C.-Y. Lee. Effects on Correlation Between Two Mobile Radio Base-Station Antennas. Vehicular Technology, IEEE Transactions on, 22(4):130–140, 1973.
[LETM14] E. Larsson, O. Edfors, F. Tufvesson, and T. Marzetta. Massive MIMO for Next Generation Wireless Systems. Communications Magazine, IEEE, 52(2):186–195, February 2014.
[LL13] An Liu and Vincent Lau. Hierarchical Interference Mitigation for Large MIMO Cellular Networks. arXiv preprint arXiv:1306.2700, 2013.
[LLL+10] Qinghua Li, Guangjie Li, Wookbong Lee, Moon-il Lee, David Mazzarese, Bruno Clerckx, and Zexian Li. MIMO Techniques in WiMAX and LTE: A Feature Overview. IEEE Communications Magazine, 48(5):86–92, 2010.
[Llo82] Stuart Lloyd. Least Squares Quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982.
[LPGR+11] D. Lopez-Perez, I. Guvenc, G. D. L. Roche, M. Kountouris, T. Quek, and J. Zhang. Enhanced Inter-Cell Interference Coordination Challenges in Heterogeneous Networks. IEEE Wireless Commun. Mag., 18(3):22–30, June 2011.
[Mar10] T.L. Marzetta. Noncooperative Cellular Wireless with Unlimited Numbers of Base Station Antennas. Wireless Communications, IEEE Transactions on, 9(11):3590–3600, 2010.
[MAT10] M.A. Maddah-Ali and D. Tse. Completely Stale Transmitter Channel State Information is Still Very Useful. In Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on, pages 1188–1195. IEEE, 2010.
[MCD+13] Thomas L. Marzetta, Giuseppe Caire, Merouane Debbah, I Chih-Lin, and Saif K. Mohammed. Special Issue on Massive MIMO. Communications and Networks, Journal of, 15(4):333–337, Aug 2013.
[Mes08] X. Mestre. Improved Estimation of Eigenvalues and Eigenvectors of Covariance Matrices Using Their Sample Estimates. IEEE Trans. on Inform. Theory, 54(11):5113–5129, 2008.
[MH88] D.V. Murthy and R.T. Haftka. Derivatives of Eigenvalues and Eigenvectors of a General Complex Matrix. International Journal for Numerical Methods in Engineering, 26:293–311, 1988.
[MKLSS94] N. Merhav, G. Kaplan, A. Lapidoth, and S. Shamai Shitz. On Information Rates for Mismatched Decoders. IEEE Trans. on Inform. Theory, 40(6):1953–1967, 1994.
[Mol10] Andreas F Molisch. Wireless Communications. Wiley.com, 2010.
[MQO12] Francesco Mani, Francois Quitin, and Claude Oestges.
Directional Spreads of Dense Multipath Components in Indoor Environments: Experimental Validation of a Ray-Tracing Approach. Antennas and Propagation, IEEE Transactions on, 60(7):3389–3396, 2012. [MTS11] T.L. Marzetta, G.H. Tucci, and S.H. Simon. A Random MatrixTheoretic Approach to Handling Singular Covariance Estimates. IEEE Trans. on In- form. Theory, 57(9), 2011. [Mui74] WW Muir. Inequalities concerning the Inverses of Positive Definite Ma- trices. Proceedings of the Edinburgh Mathematical Society (Series 2), 19(02):109–113, 1974. [MW00] J Mo and J Walrand. Fair End-to-End Window-based Congestion Control . Networking, IEEE/ACM Transactions on, 8(5):556–567, 2000. [NA13] JunyoungNamandJae-YoungAhn. JointSpatialDivisionandMultiplexing -BenefitsofAntennaCorrelationinMulti-UserMIMO. InIEEE Int. Symp. on Inform. Theory (ISIT 2013), pages 1 – 6, Istanbul, Turkey, July 2013. [NAAC12] Junyoung Nam, Jae-Young Ahn, Ansuman Adhikary, and Giuseppe Caire. Joint Spatial Division and Multiplexing: Realizing Massive MIMO Gains with Limited Channel State Information. In 2012 46th Annual Conference on Information Sciences and Systems (CISS), pages 1–6, March 2012. 178 [NLM13] Hien Quoc Ngo, E.G. Larsson, and T.L. Marzetta. Energy and Spectral EfficiencyofVeryLargeMultiuserMIMOSystems. Communications, IEEE Transactions on, 61(4):1436–1449, April 2013. [PG11] Eldad Perahia and Michelle X Gong. Gigabit Wireless LANs: an overview of IEEE 802.11ac and 802.11ad. ACM SIGMOBILE Mobile Computing and Communications Review, 15(3):23–33, 2011. [Rap02] TheodoreSRappaport. WirelessCommunications: PrinciplesandPractice. 2002. [RCP09] S.A. Ramprashad, G. Caire, and H.C. Papadopoulos. Cellular and Network MIMO Architectures: MU-MIMO Spectral Efficiency and Costs of Channel StateInformation. InForty-Third Asilomar Conference on Signals, Systems and Computers, pages 1811–1818. IEEE, 2009. [RCP10] S.A. Ramprashad, G. Caire, and H.C. Papadopoulos. 
A Joint Scheduling and Cell Clustering Scheme for MU-MIMO Downlink with Limited Coor- dination. In IEEE International Conference on Communications (ICC), pages 1–6. IEEE, 2010. [Rem13] Wireless InSite Remcom. http://www.remcom.com/wireless-insite. September 2013. [RJ08] Niranjay Ravindran and Nihar Jindal. Multi-User Diversity vs. Accurate Channel Feedback for MIMO Broadcast Channels. In IEEE International Conference on Communications, 2008. ICC’08., pages 3684–3688. IEEE, 2008. [RPL + 13] F.Rusek,D.Persson,BuonKiongLau,E.G.Larsson,T.L.Marzetta,O.Ed- fors, and F. Tufvesson. Scaling up MIMO: Opportunities and Challenges with Very Large Arrays. Signal Processing Magazine, IEEE, 30(1):40–60, 2013. [RSM + 13] T.S. Rappaport, Shu Sun, R. Mayzus, Hang Zhao, Y. Azar, K. Wang, G.N. Wong, J.K. Schulz, M. Samimi, and F. Gutierrez. Millimeter Wave Mobile Communications for 5G Cellular: It Will Work! Access, IEEE, 1(1):335– 349, May 2013. [SFGK00] D.S. Shiu, G.J. Foschini, M.J. Gans, and J.M. Kahn. Fading Correlation and Its Effect on the Capacity of Multielement Antenna Systems. IEEE Trans. on Commun., 48(3):502–513, 2000. [SH05] M. Sharif and B. Hassibi. On the Capacity of MIMO Broadcast Channels With Partial Side Information. IEEE Trans. on Inform. Theory, 51(2):506– 522, 2005. [SMB01] M. Steinbauer, A.F. Molisch, and E. Bonek. The Double-Directional Radio Channel . Antennas and Propagation Magazine, IEEE, 43(4):51–63, 2001. 179 [SSH04] Q.H.Spencer,A.L.Swindlehurst,andM.Haardt. Zero-ForcingMethodsfor Downlink Spatial Multiplexing in Multiuser MIMO Channels. IEEE Trans. on Sig. Proc., 52(2):461–471, 2004. [SWA + 13] Mathew Samimi, Kevin Wang, Yaniv Azar, George N. Wong, Rimma Mayzus, Hang Zhao, Jocelyn K. Schulz, Shu Sun, Felix Gutierrez Jr., and Theodore S. Rappaport. 28 GHz Angle of Arrival and Angle of Departure Analysis for Outdoor Cellular Communications using Steerable Beam An- tennas in New York City. 
In Vehicular Technology Conference Fall (VTC 2013-Fall), 2013 IEEE 74th, 2013. [TLK + 02] M. Toeltsch, J. Laurila, K. Kalliola, A.F. Molisch, P. Vainikainen, and E. Bonek. Statistical Characterization of Urban Spatial Radio Channels. Selected Areas in Communications, IEEE Journal on, 20(3):539–549, 2002. [TV04] A.M. Tulino and S. Verd´ u. Random Matrix Theory and Wireless Commu- nications, volume 1. Now Publishers Inc, 2004. [VDL02] P.Viswanath, NCDavid, andR.Laroia. OpportunisticBeamformingUsing Dumb Antennas. IEEE Trans. on Inform. Theory, 48(6):1277, 2002. [VJG03] S.Vishwanath, N.Jindal, andA.Goldsmith. Duality, AchievableRatesand Sum-Rate Capacity of Gaussian MIMO Broadcast Channels. IEEE Trans. on Inform. Theory, 49(10):2658–2668, 2003. [VLM12] P. Vallet, P. Loubaton, and X. Mestre. Improved subspace estimation for multivariate observations of high dimension: the deterministic signals case. IEEE Trans. on Inform. Theory, 58(2):1043–1068, 2012. [WAE + 04] S. Wyne, P. Almers, G. Eriksson, J. Karedal, F. Tufvesson, and A.F. Molisch. Statistical Evaluation of Outdoor-to-Indoor Office MIMO Mea- surements at 5.2 GHz. In Vehicular Technology Conference, 2004. VTC2004-Fall. 2004 IEEE 60th, volume 1, pages 101–105 Vol. 1, 2004. [WCDS12] SebastianWagner,RomainCouillet,M´ erouaneDebbah,andDirkTMSlock. Large System Analysis of Linear Precoding in Correlated MISO Broadcast ChannelsunderLimitedFeedback. Information Theory, IEEE Transactions on, 58(7):4509–4537, 2012. [WSS06] H. Weingarten, Y. Steinberg, and S. Shamai. The Capacity Region of the GaussianMultiple-InputMultiple-OutputBroadcastChannel. IEEE Trans. on Inform. Theory, 52(9):3936–3964, 2006. [YC74] Tzay Y Young and Thomas W Calvert. Classification, Estimation and Pat- tern Recognition. American Elsevier Publishing Company, 1974. [YG06] T. Yoo and A. Goldsmith. On the Optimality of Multiantenna Broadcast Scheduling Using Zero-Forcing Beamforming. IEEE J. Select. Areas Com- mun., 24(3):528–541, 2006. 
180 [YGFL13] Haifan Yin, David Gesbert, Miltiades Filippou, and Yingzhuang Liu. A Co- ordinated Approach to Channel Estimation in Large-scale Multiple-antenna Systems. IEEE JOURNAL ON SELECTED AREAS IN COMMUNICA- TIONS, 31(2), 2013. [YM13] H. Yang and T. L. Marzetta. Performance of Conjugate and Zero-Forcing Beamforming in Large-Scale Antenna Systems. IEEE Jour. Select. Areas in Comm., 31(2):172–179, 2013. [Yu06] W. Yu. Sum-Capacity Computation for the Gaussian Vector Broad- cast Channel Via Dual Decomposition. IEEE Trans. on Inform. Theory, 52(2):754–759, 2006. [ZMS + 13] Hang Zhao, R. Mayzus, Shu Sun, M. Samimi, J.K. Schulz, Y. Azar, K.Wang,G.N.Wong,F.Gutierrez,andT.S.Rappaport. 28GHzMillimeter WaveCellularCommunicationMeasurementsforReflectionandPenetration Loss in and around Buildings in New York City. In Communications (ICC), 2013 IEEE International Conference on, pages 5163–5167, 2013. [ZT02] L. Zheng and D.N.C. Tse. Communication on the Grassmann Manifold: A Geometric Approach to the Noncoherent Multiple-Antenna Channel. IEEE Trans. on Inform. Theory, 48(2):359–383, 2002. 181
Abstract
A Large Scale Antenna System (LSAS) entails a large number (tens or hundreds) of base station antennas serving a much smaller number of terminals, with large gains in spectral efficiency and energy efficiency compared with conventional multiuser MIMO technology. However, enabling multiuser MIMO requires very accurate channel state information at the transmitter (CSIT), which can be acquired via uplink pilots in Time Division Duplexing (TDD) systems and via downlink pilots and uplink feedback in Frequency Division Duplexing (FDD) systems. In conventional cellular technology, where FDD is employed, acquiring CSIT becomes prohibitive due to the presence of a large number of antennas. In this work, we propose Joint Spatial Division and Multiplexing (JSDM) and show that it achieves significant savings both in the downlink training and in the CSIT uplink feedback, thus making the use of large antenna arrays at the base station potentially suitable also for FDD systems. JSDM is a two-stage beamforming scheme that relies on serving groups of users with approximately similar covariances.

We prove a simple condition under which JSDM incurs no loss of optimality with respect to the full CSIT case, and show that this condition is approached in the limit of a large number of antennas with uniformly spaced linear arrays. We extend these ideas to the case of a two-dimensional base station antenna array, with three-dimensional beamforming, including multiple beams in the elevation angle direction. We provide guidelines for optimization and calculate the system spectral efficiency under proportional fairness and max-min fairness criteria, showing extremely attractive performance. We also show that JSDM with simple opportunistic user selection is able to achieve the same scaling law of the system capacity as with full channel state information, and propose a simple scheme for grouping users in a realistic setting. We propose a low-overhead probabilistic scheduling algorithm that selects users at random with certain probabilities. As a result, only the pre-selected users are required to feed back their channel state information, realizing important savings in the CSIT feedback. We study the applicability of JSDM to mm-Wave channels and analyze its performance in some realistic propagation channels. Evaluations in propagation channels obtained from ray tracing results, as well as in measured outdoor channels, show that JSDM performs surprisingly well in mm-Wave channels. Finally, we study the performance of JSDM in a heterogeneous network consisting of a large number of small cells deployed under a macro-cellular "umbrella". We propose efficient inter-tier interference management schemes using JSDM as a sort of "spatial blanking", which is significantly more efficient than isotropic slot blanking (enhanced Inter-Cell Interference Coordination, eICIC) currently proposed in LTE standardization. Our numerical results are obtained via asymptotic random matrix theory, avoiding lengthy Monte Carlo simulations.
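The two-stage structure described in the abstract can be illustrated with a small NumPy sketch. This is not the dissertation's implementation; it is a toy example under loudly stated assumptions: two user groups whose covariance eigenspaces are exactly orthogonal (an idealization, modeled here with disjoint sets of DFT columns), a pre-beamformer equal to each group's covariance eigenvectors, and zero-forcing as the per-group second stage. The names `dft_basis` and `jsdm_precoders`, and all parameter values, are hypothetical, and power normalization is omitted.

```python
import numpy as np

def dft_basis(M):
    """Unitary M x M DFT matrix; its columns stand in for covariance eigenvectors."""
    n = np.arange(M)
    return np.exp(-2j * np.pi * np.outer(n, n) / M) / np.sqrt(M)

def jsdm_precoders(H_groups, U_groups):
    """Two-stage precoding sketch (assumed form, for illustration only).

    Stage 1: pre-beamformer B_g built from the group covariance eigenvectors only.
    Stage 2: zero-forcing on the reduced-dimension effective channel B_g^H H_g.
    """
    V = []
    for H, U in zip(H_groups, U_groups):
        B = U                                   # M x r, depends on statistics only
        H_eff = B.conj().T @ H                  # r x K effective channel
        P = H_eff @ np.linalg.inv(H_eff.conj().T @ H_eff)  # r x K zero-forcing
        V.append(B @ P)                         # M x K overall precoder of the group
    return V

rng = np.random.default_rng(0)
M, r, K = 32, 4, 3                              # antennas, group rank, users per group
F = dft_basis(M)
U_groups = [F[:, 0:4], F[:, 16:20]]             # orthogonal eigenspaces (assumption)
eigvals = np.array([4.0, 3.0, 2.0, 1.0])        # hypothetical covariance eigenvalues

H_groups = []
for U in U_groups:
    # Channel columns h = U diag(eigvals)^(1/2) w with w i.i.d. complex Gaussian.
    W = (rng.standard_normal((r, K)) + 1j * rng.standard_normal((r, K))) / np.sqrt(2)
    H_groups.append(U @ (np.sqrt(eigvals)[:, None] * W))

V_groups = jsdm_precoders(H_groups, U_groups)
H_all = np.hstack(H_groups)                     # M x 2K
V_all = np.hstack(V_groups)                     # M x 2K
G = H_all.conj().T @ V_all                      # end-to-end gain matrix
print(np.round(np.abs(G), 6))
```

In this idealized setting the second stage only needs the `r x K` effective channel of each group rather than the full `M x K` channel, which is the source of the training and feedback savings the abstract refers to. With non-orthogonal eigenspaces, the pre-beamformer would have to be designed to suppress inter-group interference, which this sketch does not attempt.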
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Design and analysis of reduced complexity transceivers for massive MIMO and UWB systems
Enabling massive distributed MIMO for small cell networks
Large system analysis of multi-cell MIMO downlink: fairness scheduling and inter-cell cooperation
Optimal distributed algorithms for scheduling and load balancing in wireless networks
Structured codes in network information theory
Hybrid beamforming for massive MIMO
Signal processing for channel sounding: parameter estimation and calibration
Channel state information feedback, prediction and scheduling for the downlink of MIMO-OFDM wireless systems
Achieving high data rates in distributed MIMO systems
Elements of next-generation wireless video systems: millimeter-wave and device-to-device algorithms
Communicating over outage-limited multiple-antenna and cooperative wireless channels
Double-directional channel sounding for next generation wireless communications
Real-time channel sounder designs for millimeter-wave and ultra-wideband communications
Multidimensional characterization of propagation channels for next-generation wireless and localization systems
Propagation channel characterization and interference mitigation strategies for ultrawideband systems
mmWave dynamic channel measurements for localization and communications
Neighbor discovery in device-to-device communication
Algorithmic aspects of energy efficient transmission in multihop cooperative wireless networks
Fundamentals of two user-centric architectures for 5G: device-to-device communication and cache-aided interference management
Distributed interference management in large wireless networks
Asset Metadata
Creator
Adhikary, Ansuman (author)
Core Title
Design and analysis of large scale antenna systems
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
07/15/2014
Defense Date
06/23/2014
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
3D beamforming, 5G, JSDM, massive MIMO, OAI-PMH Harvest, opportunistic beamforming, probabilistic scheduling
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Caire, Giuseppe (committee chair), Molisch, Andreas F. (committee member), Montgomery, M. Susan (committee member)
Creator Email
k.ansuman.k@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-442094
Unique identifier
UC11286691
Identifier
etd-AdhikaryAn-2697.pdf (filename), usctheses-c3-442094 (legacy record id)
Legacy Identifier
etd-AdhikaryAn-2697.pdf
Dmrecord
442094
Document Type
Dissertation
Rights
Adhikary, Ansuman
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA