Improving spectrum efficiency of 802.11ax networks
IMPROVING SPECTRUM EFFICIENCY OF 802.11AX NETWORKS

by Kaidong Wang

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

December 2019

Dedication

To my parents and sister for their unconditional love and support.

Acknowledgments

First, I would like to thank my advisor Prof. Konstantinos Psounis for his guidance and support during my PhD study. He gave me a lot of constructive advice and helped me make a breakthrough when my research got stuck. He also emphasized the importance of written and oral communication skills, especially how to express my thoughts clearly, which benefits me in both research and work.

I would like to thank my committee members Prof. Leana Golubchik, Prof. Keith Michael Chugg, Prof. Andreas Molisch, and Prof. John A Silvester for their great support and invaluable advice. I am also grateful to Prof. Giuseppe Caire for his help during my PhD application.

I would like to thank internship mentors Dr. Wai-tian Tan, Dr. Shyam Kapadia, Lifen Tian and colleagues Dr. Xiaoqing Zhu, Robert Edward Liston, Shi Su. I am glad to have been able to work with them.

Finally, I would like to express my deepest gratitude to all of my friends at USC, Yonglong Zhang, Po-han Huang, Mengjiong Qian, Weng Chon Ao, Hang Qiu, Tianyu Hao, Kwame-Lante Wright, Jason A. Tran, Martin Martinez, Pradipta Ghosh, and Shangxing Wang for their continued support.

Table Of Contents

Dedication
Acknowledgments
List Of Tables
List Of Figures
Abstract
Chapter 1: Introduction
Chapter 2: Scheduling and Resource Allocation
  2.1 Introduction
  2.2 Related Work
  2.3 Motivation
    2.3.1 The Varying Nature of the Wireless Channel
    2.3.2 Variable User Payload
    2.3.3 Variable Radio Capacity
    2.3.4 Fractional Frequency Reuse for Inter-cell Interference Mitigation
    2.3.5 The Importance of Scheduling and Resource Allocation
  2.4 System Model
    2.4.1 802.11ax Primer
    2.4.2 Physical Layer
      2.4.2.1 The channel model
      2.4.2.2 Abstraction of RUs
      2.4.2.3 User grouping of MU-MIMO
      2.4.2.4 Scheduling and resource allocation
      2.4.2.5 Weighted Sum Rate
      2.4.2.6 Effective throughput optimization for variable payload
    2.4.3 The Optimization Problem
  2.5 Scheduling algorithms
    2.5.1 The Relaxed Scheduling and Resource Allocation Problem
    2.5.2 The Original Scheduling and Resource Allocation Problem
      2.5.2.1 Exhaustive search
      2.5.2.2 Greedy algorithm
      2.5.2.3 UP-RA algorithm
    2.5.3 Recursive Scheduling
  2.6 Simulations
    2.6.1 Divide and Conquer versus Exhaustive Search
    2.6.2 Comparison between SRA Algorithms
    2.6.3 Comparison between Continuous-rate AMC and Discrete-rate AMC
    2.6.4 Impact of the Distribution of Users
    2.6.5 Impact of the Number of Users
    2.6.6 Impact of the Number of Antennas at the AP
    2.6.7 Impact of variable payload
  2.7 Experimental Results
  2.8 Conclusions and Future Work
Chapter 3: Scheduling and Resource Allocation
  3.1 Introduction
  3.2 Related Work
  3.3 System Model
    3.3.1 BSS Color
    3.3.2 Spatial Reuse
    3.3.3 Transmit Power Control
    3.3.4 CSMA/CA Modeling
    3.3.5 Possible Information Collected at an STA in 802.11ax
  3.4 Conventional Transmit Power Control
    3.4.1 Prior Conventional Algorithms
    3.4.2 Proposed Method - Binary Algorithm
  3.5 Reinforcement Learning based Transmit Power Control
    3.5.1 Reinforcement Learning Model
    3.5.2 UCB based TPC
    3.5.3 Reinforcement Comparison based TPC
  3.6 MATLAB Simulation
    3.6.1 Comparison of TPC Algorithms in Static Environments
    3.6.2 Comparison of TPC Algorithms in Dynamic Environments
    3.6.3 Impact of the Path Loss Exponent
    3.6.4 Impact of the Distribution of STAs around the APs
  3.7 NS-3 Simulation
  3.8 Conclusion
Chapter 4: Conclusion
References

List Of Tables

2.1 A valid user schedule.
2.2 The transmission rate of MCS on one subcarrier R(m) (in Mbps)
2.3 Notation glossary.
3.1 The initial TX_PWR and OBSS_PD of STAs
3.2 MATLAB simulation parameters
3.3 NS-3 simulation parameters

List Of Figures

2.1 The channel capacity of a frequency selective channel.
2.2 Channel utilization with variable user payload.
2.3 Sum rate distribution and average value.
2.4 RU locations in a 40MHz HE PPDU [61].
2.5 RUs of a 20MHz HE PPDU.
2.6 A valid partition of the bandwidth.
2.7 The relaxed SRA problem on RU(l, i).
2.8 The time complexity of exhaustive search.
2.9 The topology of the 802.11ax BSS.
2.10 The gap between exhaustive search and divide and conquer.
2.11 Comparison of the sum and proportionally fair rates between different algorithms.
2.12 Achieved rates using discrete-rate AMC (compare to continuous-rate AMC in Fig. 2.11).
2.13 Impact of different types of users.
2.14 Impact of number of users.
2.15 Impact of number of antennas.
2.16 Comparison of the effective throughput between different algorithms.
2.17 Experiment overview.
2.18 Office floor-plan.
2.19 Experimental results using WARP boards.
3.1 Problems with CSMA/CA: (a) hidden node problem (b) exposed node problem.
3.2 The topology of an 802.11ax system
3.3 The procedure of frame reception.
3.4 Illustration of the adjustment rule for OBSS_PD and TX_PWR.
3.5 The frame captured by Wireshark.
3.6 Reinforcement learning.
3.7 The topology of simulations.
3.8 The performance with different numbers of static STAs per BSS.
3.9 The performance under a dynamic environment with mobile STAs.
3.10 The overall throughput with different path loss exponents.
3.11 The overall throughput with different average distances from APs.
3.12 The ns-3 simulation results.

Abstract

Legacy 802.11 standards improved the throughput of Wi-Fi networks via various techniques, such as MIMO, frame aggregation, higher modulation and coding schemes (MCS), shorter guard intervals and so on. However, the multiple access method was unchanged: a node listens to the channel before transmission; if the channel is busy, the node waits for a random period of time, and if the channel is clear, it starts the transmission using the whole bandwidth. This may be inefficient, especially in dense deployments, due to the exposed node problem, frequency selective channels and limited payload sizes. The 802.11ax standard, also known as high-efficiency wireless (HE), aims at improving the spectrum efficiency of 802.11 networks, especially in dense deployments. 802.11ax introduces many new features to improve the spectrum efficiency, such as OFDMA, BSS coloring, uplink MU-MIMO and so on. The performance of these features depends heavily on the implementation, as shown in the following chapters. The goal of my work is to propose algorithms to optimize the implementation of these new features and hence improve the spectrum efficiency of 802.11ax networks. In Chapter 2, a divide and conquer based recursive scheduling algorithm is proposed to improve the OFDMA scheduling for downlink MU transmission. Both simulation and experiment results show that the recursive scheduling algorithm can optimize the OFDMA scheduling using the CSI feedback from STAs.
In Chapter 3, the transmit power control (TPC) of STAs is modeled as a multi-armed bandit (MAB) problem, so that the resulting algorithm can learn the optimal policy from the environment and exploit this knowledge to achieve a higher reward. The simulation results show that it outperforms conventional TPC and can adapt to various environments.

Chapter 1

Introduction

Chapter 2 investigates the scheduling and resource allocation (SRA) problem in 802.11ax, which aims at improving the throughput of downlink OFDMA transmissions. Previous SRA algorithms for general OFDMA systems or LTE systems do not satisfy the 802.11ax SRA constraints, hence they cannot be used in 802.11ax networks directly. Instead, inspired by the structure of RUs, a divide and conquer based SRA algorithm is proposed to provide an upper bound, and a recursive scheduling algorithm that satisfies the constraints is then shown to be fast and close to the optimal. The performance of all algorithms is evaluated in both simulations and experiments.

Whereas SRA aims at improving the throughput of a single BSS, BSS coloring is proposed to improve the throughput of multiple BSSs via spatial reuse. In Chapter 3, the joint adjustment of OBSS_PD and TX_PWR is proved to be a transmit power control (TPC) problem. Previous works proposed several conventional TPC algorithms whose performance relies on the parameters of the algorithms, hence they cannot adapt to different environments automatically. Reinforcement learning-based TPC algorithms, on the other hand, can explore the environment and learn the optimal policy. The performance of conventional and reinforcement learning-based TPC algorithms is evaluated in both static and dynamic environments. As shown in the simulations, reinforcement learning-based TPC algorithms outperform conventional TPC algorithms, especially in dynamic environments.
Chapter 2

Scheduling and Resource Allocation

2.1 Introduction

Multi-user (MU) transmission in WiFi networks is an important feature which can greatly improve the system throughput. It was first introduced in the 802.11ac standard. The MU transmission in 802.11ac relies on downlink MU-MIMO, which makes use of spatial diversity to cancel the interference between users. In one MU-MIMO transmission, the frame takes the whole bandwidth.

802.11ax introduces orthogonal frequency-division multiple access (OFDMA) in WiFi networks for the first time [61, 10, 48, 11]. It does not improve the peak data rate but allows efficient transmissions of small frames to a group of users simultaneously. In OFDMA transmissions under 802.11ax, the whole bandwidth is divided into multiple subsets of subcarriers, each subset called a resource unit (RU). Each RU is assigned to a user or a user group, a process typically referred to as user scheduling [61]. 802.11ax supports three types of MU transmissions: MU-MIMO, OFDMA, and joint MU-MIMO and OFDMA [11], the latter two being the ones we study in this paper as they involve OFDMA.

WiFi networks usually work in a multipath environment where the wireless channel of the whole bandwidth can be modeled as a frequency selective channel in both downlink (DL) and uplink (UL). The channel capacity of each user or user group changes over subcarriers, especially when MU-MIMO is used. A good user schedule can assign RUs to different users or user groups based on their channel state information (CSI) such that the sum rate is maximized [48]. To efficiently utilize the available bandwidth, one needs to optimally solve this scheduling and resource allocation (SRA) problem while taking into consideration the capabilities and limitations of real-world wireless access points (APs) and systems.

In this paper, we formulate the SRA problem in the context of 802.11ax.
To be able to study the problem analytically, we relax the original problem by allowing users and user groups to be assigned to multiple RUs. (802.11ax allows users to be assigned to a single RU only, unlike long-term evolution (LTE), which may allocate multiple resource blocks (RBs) to a user.) We then introduce a divide and conquer algorithm which optimally solves this relaxed version of the original problem. We also introduce a practical greedy algorithm with fast execution time and a practical recursive scheme which jointly splits the bandwidth into RUs and schedules users on them in a near-optimal fashion. Then, we conduct extensive simulations and compare the performance of our algorithms against the optimal in the original constrained setting where a user may be scheduled to a single RU only. (The optimal in this case is computed by exhaustive search.) The simulation results show that our practical greedy and recursive schemes perform very well in a variety of realistic setups, with the latter being consistently very close to the optimal while handling a plethora of real-world constraints, including variable packet sizes and limited radio capabilities of APs. Last, we further investigate the performance of our algorithms via experiments.

The outline of the rest of the paper is as follows. Section 2.2 briefly discusses prior work and Section 2.3 motivates our work. Then, Section 2.4 sets up the system model and formulates the optimization problem. In Section 2.5, several algorithms for the relaxed and original problems are discussed. The performance of the algorithms is compared in Section 2.6, where it is shown that our recursive algorithm can efficiently solve the scheduling and resource allocation problem at hand. In Section 2.7, the algorithms are also evaluated in experiments. Last, Section 2.8 concludes the paper.
2.2 Related Work

The management and allocation of resources is always a critical issue in wireless networks, since resources such as spectrum and transmit power are limited. Motivated by this, there is a large body of work on managing such resources and improving the overall performance of the system, see, for example, [32, 51, 64, 15, 18, 20, 14, 25, 57, 69, 40, 67, 19, 47, 35, 31, 24, 7] and references therein.

Specifically in the context of 802.11ax, [32] optimizes the scheduling duration for OFDMA-based 802.11ax WLANs, [51] offers a summary of resource allocation and scheduling algorithms in connection with the quality of service (QoS) at the MAC layer, and [64] uses the sum rate as the objective of the SRA problem without considering fairness, which leads to unfairness between users.

A large part of this prior work, [18, 15, 20, 14, 25, 57, 69, 40, 67, 19, 47, 35], shows that effective SRA can improve the throughput of OFDMA systems. For example, [15] proposed suboptimal algorithms for the SRA problem in a multiuser MIMO-OFDMA system which maximizes the system capacity, [18] modeled subcarrier allocation for services that have coupled UL/DL QoS requirements as a two-sided stable matching game and proposed a resource allocation scheme with notable performance gains in terms of the average utility per user, [20] generalized the framework of SRA to support various scheduling rules and objectives, [57] used a two-phase enhanced branch and bound algorithm to dynamically optimize the power and RB allocation based on traffic requirements, [69] optimized the SRA of LTE systems under per-user QoS constraints and then proposed an energy efficient algorithm which could achieve near-optimal performance, [25] investigated both SU-MIMO and MU-MIMO scheduling problems for the downlink in LTE-A networks, and [67, 19] formulated the SRA problem of the LTE uplink as a binary-integer optimization problem and proposed some efficient suboptimal algorithms whose performance was close to the
optimal solution.

In prior work dealing with the general SRA problem, see, for example, [15, 18, 20, 14], the whole bandwidth is split into multiple subchannels of equal size, the SRA on each subchannel is independent (except for the coupling from the power constraint) and a single user may be assigned to multiple subchannels. In 802.11ax, however, a user can only be assigned a single RU and RU sizes are variable. In [47, 35], the bandwidth is split into subcarriers and each subcarrier is allocated to users independently, while in 802.11ax, the smallest RU consists of 26 subcarriers. The SRA problem for LTE uplink/downlink, see, for example, [57, 67, 19], can assign a user multiple RBs (in the uplink case these RBs have to be adjacent to each other) and the locations of the RBs are flexible, while the RUs in 802.11ax are restricted to some specific locations, as discussed in Section 2.4.1. Thus, if an LTE SRA algorithm is applied in 802.11ax directly, a user or user group is likely to be allocated subcarriers which cannot be grouped into an RU. What is more, MU-MIMO in 802.11ax can only be used with RUs of at least 106 subcarriers, and LTE SRA algorithms do not satisfy this constraint [19]. Therefore, existing SRA algorithms cannot be applied to 802.11ax networks directly.

The SRA of 802.11ax also requires good user grouping algorithms which select some users to form an MU-MIMO user group, since 802.11ax supports MU-MIMO on RUs of at least 106 subcarriers. User grouping for MU-MIMO has been well studied for 802.11ac networks [71, 26, 59, 62]. These algorithms select the best user group based on the CSI over all subcarriers and allocate the whole bandwidth to this group. They can optimize the user group under the assumption that one transmission takes the whole bandwidth. In 802.11ax, the throughput can be further improved by allocating different user groups to different RUs.
Thus, despite the large body of work on the general SRA problem as well as on specific instantiations of the problem, e.g. for LTE networks, there is no existing solution that is applicable to the specific characteristics of the 802.11ax standard.

2.3 Motivation

OFDMA is a multiple access technique based on orthogonal frequency-division multiplexing (OFDM). Like OFDM, OFDMA divides the whole bandwidth into subcarriers. The subcarrier spacing is small enough that each subcarrier can be seen as a narrowband subchannel, even if the whole bandwidth is a frequency selective channel. The difference is that OFDM allocates all the subcarriers to a single user or user group, while in OFDMA, a user or user group is only assigned a subset of subcarriers and their data are carried on these subcarriers only. In this way, one frame can multiplex multiple users/user groups simultaneously.

In this section, we discuss the effect of various real-world features, such as CSI, user payload, user radio capacity, etc., on the achieved user rates in the context of a WiFi network, motivate the introduction of OFDMA in 802.11ax, and underscore the importance of a good scheduling and resource allocation scheme to maximize user rates by properly assigning users and user groups to OFDMA subcarriers.

2.3.1 The Varying Nature of the Wireless Channel

In a wireless network, wireless signals from APs propagate to the users through different paths and thus incur different path losses. Generally speaking, users with small path loss have higher received power and their data rate is higher. However, wireless networks usually operate in a multipath environment where the coherence bandwidth is small, such that the wireless channel for the whole bandwidth is considered a frequency selective channel. As a result, the CSI of a user changes over subcarriers and the capacity on each subcarrier, determined by the CSI, also changes [48].
If the variations over subcarriers are high enough, a user with larger path loss may have higher channel capacity on some subcarriers than users whose path loss is smaller. Fig. 2.1 shows the channel capacity of two MU-MIMO groups in a wireless network where the AP and the users are equipped with 4 and 1 antennas respectively and users are within 25m from the AP, under a typical indoor wireless channel (WINNER II [43]). The path loss of users in user group 1 is smaller, and the average channel capacity of the two user groups is 9.2 bps/Hz and 6.8 bps/Hz respectively. However, the channel capacity of user group 2 is higher than that of user group 1 on some subcarriers due to variations of CSI. In summary, the channel capacity is a function of both large scale fading and small scale fading, and it changes over users and subcarriers, thus an SRA scheme should take CSI into consideration.

[Figure 2.1: The channel capacity of a frequency selective channel. Axes: subcarrier index versus channel capacity (bps/Hz); curves: MU-MIMO user group 1 and MU-MIMO user group 2.]

2.3.2 Variable User Payload

Most Internet packets are either very small or close to the network's maximum transmission unit (MTU), see, for example, [39]. Also, it is often the case that user data queues are not full; they may only have a couple of packets waiting to be transmitted. If the users' payloads/packets are allocated to RUs without considering the payload size of each user, it is likely that the channel will be underutilized.

[Figure 2.2: Channel utilization with variable user payload. Panel (a): four users on 52-tone RUs, with underutilized channel time; panel (b): users on RUs of 106, 52, 26 and 26 tones; horizontal axis: time.]

For example, consider the simple scenario shown in Fig. 2.2 where an AP allocates the RUs of a 20MHz channel to 4 users whose payloads are different. Fig.
2.2(a) shows the channel utilization when RU allocation is agnostic to variable user payload: even though the payload of user 1 is 4 times, and that of user 2 is 2 times, the payload of users 3 and 4, all users are assigned RUs of equal size. Then, most of the channel time of users 2, 3 and 4 is wasted, since the duration of a transmission is determined by the longest channel time (user 1's channel time). If, instead, RUs are allocated to users in accordance with the payload/packet size of each user, this waste can be avoided, see Fig. 2.2(b). The ability of OFDMA to allocate bandwidth efficiently under variable user payload is well known, and this variability is one more aspect of the system that an SRA algorithm should take into consideration to maximize the effective throughput.

2.3.3 Variable Radio Capacity

Real-world WiFi devices often have different radio capacities and some of them may not support operating on the whole bandwidth that an AP may transmit on. This causes inefficiencies because APs may have to operate on smaller bandwidths than they can support, reducing the achieved rates of all users, including those that may support wider channels. 802.11ax, by supporting OFDMA, may use large channels efficiently. For example, if an AP uses an 80MHz channel while serving some end user devices that only support 20MHz channels, it can assign the primary 20MHz channel to one of these devices and the remaining 60MHz/RUs to other devices. The variability among radio capacities is yet another parameter for the SRA to take into account.

2.3.4 Fractional Frequency Reuse for Inter-cell Interference Mitigation

Dense, real-world WiFi deployments suffer from inter-cell interference when adjacent APs operate on the same channel. This is particularly severe for users which are located at the edges of the cells around the APs. One well-known promising technique to deal with this in the context of OFDMA is fractional frequency reuse (FFR).
Under FFR, a cell around an AP can be divided into an inter-cell-interference-free region and an outer region. The users located in the inter-cell-interference-free region may use all the subcarriers, while users in the outer region may only use a subset of subcarriers which do not suffer from excessive inter-cell interference [48]. An SRA algorithm may use this information to further maximize bandwidth use and user rates.

2.3.5 The Importance of Scheduling and Resource Allocation

The variations of channel capacity over users and subcarriers, of users' payload and radio capabilities, etc. make it important to select a good user schedule. To see this, consider a wireless network in an indoor environment where one AP and 30 users are equipped with 4 and 1 antennas respectively and zero-forcing beamforming (ZFBF) is used for SU-MIMO and MU-MIMO. We compare the performance of random and optimal user schedules where, for each subcarrier, the optimal SRA selects the best user/user group, while the random SRA selects a user/user group randomly. To keep things simple in this motivational example, we merely vary the channel according to the WINNER II model and do not consider variable user payloads, radio capabilities, and other complexities (which would only further strengthen our point). The simulation results, see Fig. 2.3, show that if the users are scheduled randomly without optimization, the sum rate of the system is significantly reduced, especially for joint OFDMA and MU-MIMO transmissions. This motivates us to find a good SRA algorithm for 802.11ax OFDMA transmissions.

[Figure 2.3: Sum rate distribution and average value. One panel shows the capacity (Mbps) of the random and optimal user schedules for OFDMA and for joint MU-MIMO & OFDMA; the other shows the CDF of the normalized sum rate for OFDMA and for joint MU-MIMO & OFDMA.]
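The comparison described in Section 2.3.5 can be illustrated with a minimal sketch. It is not the simulation behind Fig. 2.3: the capacities below are synthetic random numbers standing in for WINNER II channel realizations, only single users (no MU-MIMO groups) are scheduled, the 242-subcarrier count is an assumed value, and the names `per_subcarrier_sum`, `pick_best` and `make_pick_random` are hypothetical. It only shows why choosing the best user on every subcarrier can never do worse than choosing one at random.

```python
import random

def per_subcarrier_sum(capacity, pick):
    """Sum, over subcarriers, of the capacity of the user chosen by `pick`.

    capacity[k][s] is the capacity of user k on subcarrier s (arbitrary units).
    pick(column) receives the per-user capacities on one subcarrier and
    returns the index of the scheduled user.
    """
    num_users = len(capacity)
    num_subcarriers = len(capacity[0])
    total = 0.0
    for s in range(num_subcarriers):
        column = [capacity[k][s] for k in range(num_users)]
        total += column[pick(column)]
    return total

def pick_best(column):
    """Optimal per-subcarrier choice: the user with the largest capacity."""
    return max(range(len(column)), key=column.__getitem__)

def make_pick_random(rng):
    """Random schedule: any user, regardless of its capacity."""
    return lambda column: rng.randrange(len(column))

# Synthetic stand-in for a frequency selective channel (assumption, not WINNER II).
rng = random.Random(0)
NUM_USERS, NUM_SUBCARRIERS = 30, 242
capacity = [[rng.uniform(0.0, 10.0) for _ in range(NUM_SUBCARRIERS)]
            for _ in range(NUM_USERS)]

optimal_sum = per_subcarrier_sum(capacity, pick_best)
random_sum = per_subcarrier_sum(capacity, make_pick_random(rng))
print(f"random / optimal sum capacity: {random_sum / optimal_sum:.2f}")
```

The printed ratio depends entirely on the synthetic distribution chosen here, so it should not be read as a reproduction of the gap reported in Fig. 2.3.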
2.4 System Model

2.4.1 802.11ax Primer

802.11ax supports the following bands: 20MHz, 40MHz, 80MHz, 80+80MHz (combines two 80MHz channels) and 160MHz (a single 160MHz channel) [22, 61]. In an OFDMA transmission, the spectrum band is divided into multiple RUs [10, 61]. In the time domain, an RU spans the entire data portion of a high efficiency (HE) PLCP protocol data unit (PPDU). In the frequency domain, it consists of a subset of contiguous subcarriers, except for the RUs which "straddle DC" (where some nulls are placed in the middle of the band). The size of an RU in the frequency domain can be 26, 52, 106, 242, 484 or 996 subcarriers. The RUs in an HE MU PPDU using OFDMA transmission can only take one of these sizes.

The locations of RUs in an HE PPDU are fixed. Each RU larger than 26 subcarriers can be further divided into 2 smaller RUs. As an example, the locations of the RUs in a 40MHz HE PPDU in the frequency domain are shown in Fig. 2.4. The whole bandwidth can be used as a single 484-tone RU, or it can be divided into two 242-tone RUs, each of which may be further divided into smaller RUs until 26-tone RUs are reached. Once the RUs have been generated, the AP allocates one RU to each user/user group for transmission. If the bandwidth is split into RUs and each of them is allocated to an individual user, then the transmission is referred to as a pure OFDMA one; if an RU has at least 106 subcarriers it can also be used for MU-MIMO, and then the transmission is referred to as a joint MU-MIMO and OFDMA one.

[Figure 2.4: RU locations in a 40MHz HE PPDU [61].]

Under 802.11ax, the AP obtains CSI from users via channel sounding, e.g. to perform MU-MIMO. Specifically, the AP sends a null data packet announcement (NDPA) frame followed by a null data packet (NDP) and a Beamforming Report (BRP) trigger frame. As a response to the BRP trigger frame, users send their CSI reports to the AP simultaneously via UL MU transmission.
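The RU rules of Section 2.4.1 can be captured in a few lines. The sketch below is an illustration under simplifying assumptions: it uses only the RU sizes and the 106-subcarrier MU-MIMO threshold stated in this section, treats every split as a step to the next smaller size, and does not single out the case where the whole bandwidth is given to one MU-MIMO group. The names `split_chain`, `supports_mu_mimo` and `classify_transmission` are hypothetical.

```python
RU_SIZES = (26, 52, 106, 242, 484, 996)  # RU sizes in subcarriers (Section 2.4.1)
MU_MIMO_MIN_RU = 106                      # smallest RU that may carry MU-MIMO

def split_chain(top_size):
    """RU sizes reached by repeatedly splitting an RU of `top_size` subcarriers."""
    if top_size not in RU_SIZES:
        raise ValueError(f"{top_size} is not one of the RU sizes {RU_SIZES}")
    idx = RU_SIZES.index(top_size)
    return list(RU_SIZES[idx::-1])

def supports_mu_mimo(ru_size):
    """True if an RU of this size may be assigned to an MU-MIMO user group."""
    return ru_size >= MU_MIMO_MIN_RU

def classify_transmission(allocation):
    """Classify an allocation given as (ru_size, users_on_that_ru) pairs."""
    joint = False
    for ru_size, num_users in allocation:
        if ru_size not in RU_SIZES or num_users < 1:
            raise ValueError(f"invalid entry: {(ru_size, num_users)}")
        if num_users > 1:
            if not supports_mu_mimo(ru_size):
                raise ValueError(f"{ru_size}-tone RU cannot carry a user group")
            joint = True
    return "joint MU-MIMO and OFDMA" if joint else "pure OFDMA"

# 40MHz example from the text: 484 -> 242 -> 106 -> 52 -> 26.
print(split_chain(484))
print(classify_transmission([(106, 3), (52, 1), (52, 1)]))
```

Keeping the size list and the threshold as module-level constants makes the two constraints that distinguish 802.11ax from LTE-style allocation (fixed RU sizes, MU-MIMO only on larger RUs) explicit in one place.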
2.4.2 Physical Layer

2.4.2.1 The channel model

In 802.11, an AP together with all associated users is called a basic service set (BSS). Consider an 802.11ax BSS where the AP and the users are equipped with $N_T$ and $N_R$ ($N_T > N_R$) antennas respectively. (It is easy to extend our work to the scenario where users have different numbers of antennas.) As usual, consider that the downlink CSI to all users is transmitted to the AP through channel sounding (CSIT) and that the AP applies ZFBF for SU-MIMO or MU-MIMO. Index the users by the set $\mathcal{U} = \{1, 2, \ldots, N\}$. In downlink transmission, the AP decides to transmit to a set of users $U_s \subseteq \mathcal{U}$ on subcarrier $s$. The received signal at user $k$ on subcarrier $s$, $y_{k,s}$, can be expressed as follows:

$$ y_{k,s} = h_{k,s} w_{k,s} \sqrt{P_{k,s}}\, x_{k,s} + \sum_{j \in U_s,\, j \neq k} h_{k,s} w_{j,s} \sqrt{P_{j,s}}\, x_{j,s} + z_{k,s}, \tag{2.1} $$

where $x_{k,s}$, $h_{k,s}$, $w_{k,s}$, $P_{k,s}$ and $z_{k,s}$ are the data symbol, channel response, beamforming weight vector, transmit power and additive white Gaussian noise (AWGN) for user $k$ at subcarrier $s$, with $z_{k,s} \sim \mathcal{CN}(0, \sigma^2 I)$. The transmit power $P_s \triangleq \sum_{k \in U_s} v_{k,s}^{-1} P_{k,s}$ is constant over all subcarriers, where

$$ v_{k,s} = \frac{1}{\| w_{k,s} \|^2}. \tag{2.2} $$

$P_{k,s}$ can be optimized by water-filling [71], or set by equal power allocation for simplicity [34]. Note that in the scenarios of interest there are many users in each BSS, so we may always find some users with high SNR on all subcarriers. Motivated further by the fact that most commercial APs do not perform water-filling, we assume $P_{k,s}$ to be fixed, and thus the transmit power on each RU is proportional to the RU size.

The beamforming matrix $W_s = [w_{k,s}, k \in U_s]$, consisting of all beamforming weight vectors $w_{k,s}$, is the pseudo-inverse of $H_s = [h_{k,s}^T, k \in U_s]^T$, that is

$$ W_s = H_s^H (H_s H_s^H)^{-1}. \tag{2.3} $$

2.4.2.2 Abstraction of RUs

As mentioned in Section 2.4.1, each RU larger than 26 subcarriers can be split into two smaller RUs. As shown in Fig.
2.4, the whole bandwidth can be split at most $L-1$ times, where $L$ is referred to as the number of levels. $L$ depends on the total bandwidth, since a larger bandwidth can be split more times. 802.11ax supports 20MHz, 40MHz, 80MHz and 160MHz, for which $L$ equals 4, 5, 6 and 7 respectively. (Footnote 2: Even though $L \in \{4, 5, 6, 7\}$ in 802.11ax, in the analysis afterwards we may allow $L$ to assume more values. All the conclusions from the analysis are valid for all positive integers including 4, 5, 6, 7.)

We denote each RU by $RU(l,i)$, where $l$ is the level of the RU (the number of splits from the largest RU occupying the whole bandwidth to the current one) and $i$ is the index of the RU at its level. Since the whole bandwidth can be split into $2^l$ RUs of equal size at level $l$ ($l \in \{0, 1, \dots, L-1\}$), $i$ assumes values $0, 1, \dots, 2^l - 1$. Note that $RU(0,0)$ refers to the RU occupying the whole bandwidth and that each $RU(l,i)$ with $l < L-1$ can be split into two RUs, $RU(l+1, 2i)$ and $RU(l+1, 2i+1)$. Using this notation, Fig. 2.5 shows an example where we label the RUs of a 20MHz HE PPDU.

Figure 2.5: RUs of a 20MHz HE PPDU (levels $l = 0$ to $3$; level $l$ contains the RUs labeled $(l,0), \dots, (l, 2^l - 1)$).

2.4.2.3 User grouping of MU-MIMO

If the AP works in joint MU-MIMO and OFDMA mode, the SRA procedure includes selecting an optimal user group for MU-MIMO transmission given a specific RU and user set, which is referred to as user grouping. The optimal solution to user grouping in MU-MIMO is hard to obtain, and previous studies such as [71, 26, 59, 62] have proposed many efficient algorithms that provide a suboptimal solution whose performance is very close to the optimal one. Working on user grouping is beyond the scope of this work; our focus is instead on splitting the bandwidth into RUs and allocating users or user groups to RUs.
Thus, we use prior work to select user groups, in particular the method described in [59]. As discussed above (Section 2.3), some users are unable to use the whole bandwidth, e.g. due to radio capacity limitations, and thus can only be assigned to some of the RUs. As a result, the user set for user grouping on $RU(l,i)$, denoted by $U_{RU(l,i)}$, is a subset of $\mathcal{U}$, i.e., $U_{RU(l,i)} \subseteq \mathcal{U}$.

2.4.2.4 Scheduling and resource allocation

In one OFDMA transmission, the whole bandwidth is divided into a combination of RUs from different levels. Let $P = \{p_j;\ 0 \le j < N_P\}$ be a valid partition of the whole bandwidth, where $p_j = RU(l_j, i_j)$ is the $j$th RU in $P$ and $N_P$ is the total number of RUs in $P$, and let $\mathcal{P}$ be the universal set of all partitions. For example, one possible partition is shown in Fig. 2.6. It includes RUs from levels 1 to 3, specifically $p_0 = RU(2,0)$, $p_1 = RU(3,2)$, $p_2 = RU(3,3)$, and $p_3 = RU(1,1)$.

Figure 2.6: A valid partition of the bandwidth.

Having obtained a valid partition of the bandwidth, we need to allocate users to RUs. Let $u_{p_j}$ denote the user set allocated to $p_j$ (the $j$th RU in the valid partition of the whole bandwidth), where $u_{p_j} \subseteq U_{p_j}$ and $U_{p_j} = U_{RU(l_j, i_j)}$. Let $g = \{(p_j, u_{p_j});\ 0 \le j < N_P\}$ denote a valid user schedule. For example, one valid user schedule with the partition in Fig. 2.6 is shown in Table 2.1.

Table 2.1: A valid user schedule.

$j$    $p_j$     $u_{p_j}$
0      (2, 0)    {3}
1      (3, 2)    {16}
2      (3, 3)    {5}
3      (1, 1)    {1, 7, 10, 15}

2.4.2.5 Weighted Sum Rate

The performance of an SRA schedule can be evaluated by the weighted sum rate of the system, which is defined as

$$R_{ZFBF}(g) = \sum_k \sum_s \mu_k r_{k,s}, \tag{2.4}$$

where $\mu_k$ is the weighting coefficient for user $k$, and $r_{k,s}$ is the transmission rate for user $k$ on subcarrier $s$. The weighting coefficient $\mu_k$ accounts for user priority and fairness, and is adjusted according to system requirements.
Popular weight assignments include:

- The sum rate assignment, which sets all weights equally and aims at maximizing the sum rate of all the users, but leads to unfairness. In this case:

$$\mu_k = 1, \quad \forall k. \tag{2.5}$$

- The proportional fair (PF) assignment [20, 36], which maximizes $\sum_{k=1}^{N} \log \bar{r}_k$. In this case:

$$\mu_k = 1/\bar{r}_k, \quad \forall k, \tag{2.6}$$

where $\bar{r}_k$ is the average effective data rate allocated to user $k$.

- The modified largest weighted delay first (MLWDF) assignment [6], which depends on QoS and packet delay. It improves the spectral efficiency while maintaining good fairness. In this case:

$$\mu_k = \alpha_k W_{HOL,k} / \bar{r}_k, \quad \forall k, \tag{2.7}$$

where $\alpha_k$ is the priority level and $W_{HOL,k}$ is the head-of-line (HOL) delay experienced at the AP buffer of user $k$.

The transmission rate for user $k$, $r_{k,s}$, is a function of the SNR at user $k$ on subcarrier $s$, and the SNR equals $\gamma_{k,s} = P_{k,s}/\sigma^2$. In general, there are two ways to compute the rate as a function of the SNR, which depend on the type of Adaptive Modulation and Coding (AMC) assumed:

- Continuous-rate AMC: Due to its simplicity and relative accuracy, Shannon capacity has been widely used in the literature to compute the rate as a function of the SNR under a continuous-rate AMC assumption. In this work we specifically use the method in [20] to estimate the channel capacity under this assumption, since it uses a coefficient to represent the gap between ideal and practical coding schemes and thus is closer to practice. In this case, the transmission rate for user $k$ on subcarrier $s$ equals

$$r_{k,s} = \log_2\left(1 + \frac{\gamma_{k,s}}{\Gamma_k}\right), \tag{2.8}$$

where $\Gamma_k \ge 1$ represents the gap between ideal and practical coding schemes. If $\Gamma_k = 1$, (2.8) results in the Shannon capacity limit [20].

- Discrete-rate AMC: In realistic WiFi networks, the AP uses rate adaptation (RA) algorithms [12, 68, 29, 42] to choose the best MCS to maximize the throughput. Let $m$ denote an MCS.
Suppose the AP selects $m_k$ as the MCS for the transmission to user $k$ based on the effective SNR on user $k$'s RU. Then $r_{k,s}$ is determined by the transmission rate of $m_k$:

$$r_{k,s} = R(m_k), \tag{2.9}$$

where the function $R(m)$ is the transmission rate of MCS $m$ on one subcarrier, given in Table 2.2 for the different MCS options of the 802.11ax standard [61]. Note that $r_{k,s}$ is constant among all the subcarriers within user $k$'s RU, since the AP applies the same MCS on all these subcarriers.

Table 2.2: The transmission rate of MCS on one subcarrier, $R(m)$ (in Mbps).

MCS    Rate (Mbps)
0      0.0368
1      0.0735
2      0.1103
3      0.1471
4      0.2206
5      0.2941
6      0.3309
7      0.3676
8      0.4412
9      0.4902
10     0.5515
11     0.6127

Note that while we have used SNR above, the analysis can be easily extended to account for interference (SINR), since intra-cell interference is minimal and inter-cell interference, when not minimal, can be taken into consideration via ideas like the fractional frequency reuse discussed earlier. The only effect would be a more constrained set of valid partitions, as some users at the edge of the cell would not be allowed to use RUs with high interference from other cells. Along the same lines, radio capability limitations of users would merely further constrain the set of valid user schedules that we can work with.

2.4.2.6 Effective throughput optimization for variable payload

The above sections assume that the users' payload/packets can always fill the DL MU transmission. In practice, this may not always be the case, leading to channel underutilization if the AP solves the SRA problem without considering user traffic. Suppose users have different payload sizes $D_k$. The duration of a transmission $T$ is determined by the longest channel time, see Fig. 2.2 and the associated discussion.
Thus,

$$T = \max_k \frac{D_k}{r_k}, \tag{2.10}$$

and the effective throughput of user $k$ is

$$\tilde{r}_k = \frac{D_k}{T}. \tag{2.11}$$

Similarly, the weighted effective throughput of schedule $g$ can be defined as

$$\tilde{R}_{ZFBF}(g) = \sum_s \sum_k \mu_k \tilde{r}_{k,s}. \tag{2.12}$$

It is easy to see that by replacing $r_k$ and $R_{ZFBF}(g)$ with $\tilde{r}_k$ and $\tilde{R}_{ZFBF}(g)$, the SRA can take variable payloads into consideration.

2.4.3 The Optimization Problem

The SRA problem consists of two tasks: (i) split the bandwidth into one or multiple RUs and (ii) allocate the RUs to users (SU-MIMO) or user groups (MU-MIMO). The constraints are:

1. A user or user group can be assigned at most one RU.

2. MU-MIMO transmission only applies to RUs with at least 106 subcarriers, in other words, $l \le L-3$.

3. The number of users allocated on $RU(l,i)$ is between 1 and some number $M(l)$. $M(l)$ is the maximum number of users allowed on $RU(l,i)$ and it is a function of $l$. If the AP transmits in joint MU-MIMO and OFDMA mode and $l \le L-3$, $RU(l,i)$ can be used for MU-MIMO and $M(l) = \lfloor N_T/N_R \rfloor$; otherwise $RU(l,i)$ is used for SU-MIMO and $M(l) = 1$.

With all the above constraints, the SRA of 802.11ax can be formulated as the following optimization problem:

$$\begin{aligned}
\max_{g}\quad & R_{ZFBF}(g) \\
\text{s.t.}\quad & 0 \le \sum_j c_{j,k} \le 1 \\
& 1 \le \sum_k c_{j,k} \le M(l_j) \\
& c_{j,k} \in \{0, 1\} \\
& P = \{p_j\} \in \mathcal{P} \\
& u_{p_j} \subseteq U_{p_j},
\end{aligned} \tag{2.13}$$

where $c_{j,k}$ indicates whether user $k$ is allocated on the $j$th RU, and recall that $g = \{(p_j, u_{p_j})\}$ is a user schedule.

Table 2.3: Notation glossary.
Parameter                   Description
$N_T$                       Number of antennas at the AP
$N_R$                       Number of antennas at the user
$k$, $s$, $j$               User, subcarrier and RU index
$U$                         A user set
$\mathcal{U}$               The universal user set
$U_s$                       The user set allocated on subcarrier $s$
$N$                         The number of users
$y_{k,s}$, $x_{k,s}$        The received and transmitted signals of user $k$ on subcarrier $s$
$z_{k,s}$                   The noise signal of user $k$ on subcarrier $s$
$h_{k,s}$, $w_{k,s}$        The channel response and beamforming weight vector of user $k$ on subcarrier $s$
$H_s$, $W_s$                The channel response and beamforming weight matrix for all users on subcarrier $s$
$P_{k,s}$                   The transmit power allocated to user $k$ on subcarrier $s$
$P_s$                       The total transmit power allocated to subcarrier $s$
$R_{ZFBF}(\cdot)$           The weighted sum rate
$\tilde{R}_{ZFBF}(\cdot)$   The weighted effective throughput
$r_{k,s}$                   The transmission rate of user $k$ on subcarrier $s$
$\bar{r}_k$                 The average data rate allocated to user $k$
$\tilde{r}_k$               The effective throughput of user $k$
$\mu_k$                     The weighting coefficient for user $k$
$RU(l,i)$                   The $i$th RU at level $l$
$L$                         The max number of levels
$p_j$                       The $j$th RU in partition $P$, equivalent to $RU(l_j, i_j)$
$c_{j,k}$                   A binary number indicating if user $k$ is allocated on $p_j$
$M(l)$                      The max number of users allowed on an RU at level $l$
$g$                         A valid user schedule and resource allocation
$\mathcal{P}$               The set of all valid partitions

2.5 Scheduling algorithms

As discussed already, the 802.11ax SRA problem differs from prior SRA problems in some important ways which make it challenging. For example, (i) a user can be assigned to a single RU only, which breaks the independence of the SRA decision across different RUs, and (ii) MU-MIMO can be applied only to RUs with at least 106 subcarriers, which further complicates the problem by creating a potentially large gap in the total throughput between, say, using two 52-tone RUs versus one 106-tone RU. In this section we introduce a number of novel scheduling algorithms for the 802.11ax SRA problem.
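Before turning to the algorithms, the notation of Section 2.4 and the constraints of Problem (2.13) can be made concrete with a small Python sketch. It checks whether a set of RUs tiles the whole bandwidth and whether a user schedule respects Constraints 1 to 3. The function names, the representation of RU(l, i) as a tuple `(l, i)`, and the antenna configurations used in the example are illustrative assumptions; the example data are the partition of Fig. 2.6 and the schedule of Table 2.1 on a 20MHz channel (L = 4).

```python
from fractions import Fraction


def ru_interval(l, i):
    """Share of the band covered by RU(l, i): [i / 2^l, (i + 1) / 2^l)."""
    width = Fraction(1, 2 ** l)
    return i * width, (i + 1) * width


def is_valid_partition(rus, L):
    """True if the RUs cover the whole band exactly once."""
    if any(not (0 <= l < L and 0 <= i < 2 ** l) for l, i in rus):
        return False
    edge = Fraction(0)
    for start, end in sorted(ru_interval(l, i) for l, i in rus):
        if start != edge:  # a gap or an overlap
            return False
        edge = end
    return edge == 1


def max_users(l, L, n_t, n_r, joint_mode=True):
    """M(l): floor(N_T / N_R) where MU-MIMO is allowed, otherwise 1."""
    return n_t // n_r if joint_mode and l <= L - 3 else 1


def is_valid_schedule(schedule, L, n_t, n_r, joint_mode=True):
    """schedule maps an RU (l, i) to the set of user ids allocated on it."""
    if not is_valid_partition(list(schedule), L):
        return False
    seen = set()
    for (l, _i), users in schedule.items():
        if not 1 <= len(users) <= max_users(l, L, n_t, n_r, joint_mode):
            return False  # Constraints 2 and 3
        if seen & users:
            return False  # Constraint 1: a user sits on more than one RU
        seen |= users
    return True


# Partition of Fig. 2.6 with the schedule of Table 2.1 (L = 4).
table_2_1 = {(2, 0): {3}, (3, 2): {16}, (3, 3): {5}, (1, 1): {1, 7, 10, 15}}
print(is_valid_schedule(table_2_1, L=4, n_t=4, n_r=1))  # True
print(is_valid_schedule(table_2_1, L=4, n_t=2, n_r=1))  # False
```

With four AP antennas the four-user group on RU(1, 1) is allowed, since M(1) = 4; with two antennas M(1) = 2 and the same schedule violates Constraint 3.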
2.5.1 The Relaxed Scheduling and Resource Allocation Problem

We start with a relaxed/simplified case which provides an upper bound on our original problem. Specifically, Constraint 1) makes the SRA problem complicated since it requires taking into consideration all users scheduled on each RU. Thus we consider the problem which relaxes Constraint 1), i.e., a user or user group can be assigned to multiple RUs and the first constraint in Problem (2.13) can be ignored. In this case, the SRA on different RUs can be solved independently.

Figure 2.7: The relaxed SRA problem on $RU(l,i)$.

Consider the relaxed SRA problem on the subcarriers of $RU(l,i)$ in Fig. 2.7. We can use the subcarriers as a single RU, $RU(l,i)$, and allocate some users to it; otherwise, we need to split the subcarriers into multiple RUs at higher levels and allocate users to each of them. Note that no matter how we split the subcarriers of $RU(l,i)$, we first need to split $RU(l,i)$ into two RUs, $RU(l+1, 2i)$ and $RU(l+1, 2i+1)$. Each of them can be further split into more RUs. After we split the subcarriers of $RU(l,i)$ into one or multiple RUs, we can select optimal user groups for each RU independently, since the relaxed SRA problem does not require Constraint 1). Note that we need to make sure that the subcarriers are split optimally and that the user grouping on each RU is also optimal. As already mentioned, we use prior work such as [71, 26, 59, 62] to obtain a good user group. Our contribution is on how to split the subcarriers into RUs and how to allocate users/user groups to those RUs optimally. We do this with a novel algorithm we refer to as the divide and conquer algorithm, which consists of two steps:

1. In the divide step, as shown in Fig. 2.7, the subcarriers of $RU(l,i)$ are first split into two RUs, $RU(l+1, 2i)$ and $RU(l+1, 2i+1)$. Then, each of these two RUs can be further split.
The SRA problem on $RU(l,i)$ becomes two subproblems on $RU(l+1, 2i)$ and $RU(l+1, 2i+1)$ which can be solved independently. Let the optimal user schedules on $RU(l+1, 2i)$ and $RU(l+1, 2i+1)$ be $g_{opt}(l+1, 2i)$ and $g_{opt}(l+1, 2i+1)$ respectively, irrespectively of whether they are used as a single RU or split into multiple RUs.

2. In the merge step, $g_{opt}(l+1, 2i)$ and $g_{opt}(l+1, 2i+1)$ can be merged into a user schedule on $RU(l,i)$. Denote this user schedule as $g_m(l,i)$, where the subscript $m$ means there are multiple RUs in this user schedule.

Note that there is another optimal user schedule candidate on $RU(l,i)$, which uses all the subcarriers of $RU(l,i)$ as a single RU. Let the optimal user schedule in this case be denoted by $g_s(l,i)$, where the subscript $s$ means there is only one RU in this user schedule. For a single RU, $RU(l,i)$, all the subcarriers $s \in RU(l,i)$ are allocated to a single user or user group. Thus, this is the well-known problem of user grouping over a channel, which can be solved by existing user grouping algorithms [71, 26, 59, 62].

At this point, we have two user schedule candidates, $g_s(l,i)$ and $g_m(l,i)$, for $RU(l,i)$. The optimal user schedule on $RU(l,i)$ is

$$g_{opt}(l,i) = \arg\max_{g \in \{g_s(l,i),\, g_m(l,i)\}} R_{ZFBF}(g). \tag{2.14}$$

Last, the optimal user schedule of the relaxed problem is $g_{opt} = g_{opt}(0,0)$. For more details, see the provided pseudo code.
Algorithm 1 DIVIDE-AND-CONQUER(U, l, i)
Require: The CSI of all users
    G = ∅
    // Divide the problem into two subproblems
    if l < L-1 then
        g_opt(l+1, 2i) = DIVIDE-AND-CONQUER(U, l+1, 2i)
        g_opt(l+1, 2i+1) = DIVIDE-AND-CONQUER(U, l+1, 2i+1)
        g_m(l, i) = MERGE(g_opt(l+1, 2i), g_opt(l+1, 2i+1))
        G = G ∪ {g_m(l, i)}
    end if
    // User selection on RU(l, i)
    g_s(l, i) = USER-SELECTION(U, l, i)
    G = G ∪ {g_s(l, i)}
    // Select the optimal user schedule from G
    g_opt(l, i) = arg max_{g ∈ G} R_ZFBF(g)
    return g_opt(l, i)

    function USER-SELECTION(U, l, i)
        if l ≤ L-3 and joint MU-MIMO and OFDMA mode then
            g_s(l, i) = USER-GROUPING(U, l, i)
        else
            Select user u_opt ∈ U with max capacity on RU(l, i)
            g_s(l, i) = {(RU(l, i), u_opt)}
        end if
        return g_s(l, i)
    end function

Lemma 1. Given a specific set of weighting coefficients $\{\mu_k\}$, the optimal user schedule of the relaxed SRA problem can be obtained by the divide and conquer algorithm.

Proof. We proceed by induction on the max number of levels. If $L = 1$, the lemma is obviously true. Assume that Lemma 1 is true when the max number of levels is $L$; we show that it is also true when the max number of levels is $L+1$. Suppose there is another user schedule $g'$, different from the schedule $g_{opt}$ obtained by the divide and conquer algorithm, with $R_{ZFBF}(g') > R_{ZFBF}(g_{opt})$. If $g'$ consists of only one RU, then its sum rate is larger than $R_{ZFBF}(g_s(0,0))$, which contradicts the fact that $g_s(0,0)$ is the optimal user group on $RU(0,0)$. If $g'$ consists of multiple RUs, it decomposes into schedules $g'(1,0)$ and $g'(1,1)$ on $RU(1,0)$ and $RU(1,1)$, its sum rate is $R_{ZFBF}(g'(1,0)) + R_{ZFBF}(g'(1,1))$, and $R_{ZFBF}(g'(1,0)) + R_{ZFBF}(g'(1,1)) > R_{ZFBF}(g_{opt}(1,0)) + R_{ZFBF}(g_{opt}(1,1))$. This contradicts the fact that, by the induction hypothesis applied to each half of the bandwidth, $g_{opt}(1,0)$ and $g_{opt}(1,1)$ are optimal user schedules on $RU(1,0)$ and $RU(1,1)$ respectively. Therefore, Lemma 1 is also true for $L+1$.

Lemma 2. Given a specific set of weighting coefficients $\{\mu_k\}$, the weighted sum rate of the optimal user schedule of the relaxed SRA problem is an upper bound for the original SRA problem.
Proof. The optimal user schedule of the original SRA problem satisfies all the constraints of the relaxed SRA problem and thus is also a feasible solution of the relaxed SRA problem. Hence the weighted sum rate of the optimal user schedule of the relaxed SRA problem is no less than the weighted sum rate of the optimal user schedule of the original SRA problem.

Even though the optimal user schedule of the relaxed SRA problem does not satisfy all the constraints of the original problem, it can be used as an upper bound of the optimal user schedule of the original SRA problem. In Sections 2.6 and 2.7, we show via simulations and experiments that this upper bound is quite tight.

2.5.2 The Original Scheduling and Resource Allocation Problem

The original SRA problem with Constraint 1) cannot be solved by the divide and conquer algorithm: since the subproblems are solved independently on $\mathcal{U}$, a user may show up on multiple RUs when the user schedules of the subproblems are merged. The original problem can of course be solved by exhaustive search (for small scale scenarios) and by a greedy algorithm (leading to suboptimal solutions).

2.5.2.1 Exhaustive search

The exhaustive search traverses all the possible user schedules to search for the optimal solution. It is guaranteed to find the optimal user schedule, but it is impractical in WiFi networks due to its time complexity.

To assess the size of the search space, we compute the number of user schedules below. Let $n$ be the number of available users and $\psi(n,l)$ be the number of distinct user schedules which allocate these $n$ users to the subcarriers of an RU at level $l$. Suppose all the devices may use the whole bandwidth (wideband radios, no FFR, etc.), so that $U_{RU(l,i)} = \mathcal{U},\ \forall l, i$. Similar to Section 2.5.1, there are two kinds of user schedules on an RU at level $l$:

1. Case 1: All the subcarriers make up a single RU. There is only one way to allocate the RU to the $n$ users, if $n$ is no larger than $M(l)$.
So, the number of possible user schedules in this case is

$$\psi_s(n,l) = \begin{cases} 1 & n \le M(l) \\ 0 & \text{otherwise.} \end{cases} \tag{2.15}$$

2. Case 2: The subcarriers are split into multiple RUs. Similar to Algorithm 1, the subcarriers are first split into two RUs and the number of possible user schedules in this case is

$$\psi_m(n,l) = \begin{cases} \sum_{k=1}^{n-1} \binom{n}{k}\, \psi(k, l+1)\, \psi(n-k, l+1) & l < L-1 \\ 0 & \text{otherwise.} \end{cases} \tag{2.16}$$

Putting everything together, we have $\psi(n,l) = \psi_s(n,l) + \psi_m(n,l)$ and $\psi(1,l) = 1$, and, using Equations (2.15) and (2.16), $\psi(n,l)$ can be calculated recursively.

Last, suppose there are $N$ users in the network and $L$ levels of RUs. In every transmission the AP chooses some of the users to serve. The number of combinations of choosing $n$ users from $N$ users is $\binom{N}{n}$. Therefore, the number of possible user schedules is $\sum_{n=1}^{N} \binom{N}{n} \psi(n, 0)$.

Fig. 2.8 shows the number of user schedules as a function of $N$ and $L$. The size of the search space increases very fast as $N$ and $L$ increase. If a network has $N = 10$ users and works on 40MHz ($L = 5$), then the search space of OFDMA mode and joint MU-MIMO and OFDMA mode is $9.1 \times 10^8$ and $1.7 \times 10^9$ respectively. Clearly, the exhaustive search is computationally too expensive and thus impractical for real-world 802.11ax networks.

Figure 2.8: The time complexity of exhaustive search: size of the search space versus the number of users and the number of levels, for (a) OFDMA and (b) joint MU-MIMO and OFDMA.

2.5.2.2 Greedy algorithm

We propose a greedy algorithm which is very fast and can be used as a benchmark to compare the running time of other algorithms. Our algorithm first selects a level $l$ to operate at and then splits the whole bandwidth such that the partition consists of RUs of equal size. The level $l$ is chosen such that each RU can be assigned at least one user/user group.
Given a total number of $N$ users with, say, $N_R = 1$ antenna each, if every RU were to be assigned the maximum possible number of users $N_T$ (forming a maximum size user group), then we could assign users to $N/N_T$ RUs. Following this rationale, the algorithm chooses level

$$l = \min(L-1, \lfloor \log_2(N) \rfloor)$$

for OFDMA mode or

$$l = \min(L-3, \lfloor \log_2(N N_R / N_T) \rfloor)$$

for joint MU-MIMO and OFDMA mode. Looking at Fig. 2.5, once the level $l$ is identified, the algorithm moves from left to right within the chosen level, selecting the best user or user group for each RU and then moving on to the next RU to the right while excluding the already selected users from further consideration.

2.5.2.3 UP-RA algorithm

In [19], a low complexity suboptimal "user pairing-resource allocation" (UP-RA) algorithm was introduced to solve the SRA problem for LTE uplink. The UP-RA algorithm separates the SRA problem into two subproblems, user pairing and resource allocation, and solves these subproblems iteratively. In the RA procedure, the number of RBs allocated to a user pair can be more than one and the RBs can be at arbitrary locations. Thus, this algorithm cannot be used as is in the context of the 802.11ax SRA problem.

In order to compare our algorithms, which are designed from scratch to satisfy 802.11 constraints, with existing LTE SRA algorithms which may be retrofitted to satisfy those constraints, we have revised the UP-RA algorithm [19] to satisfy 802.11ax constraints. Specifically, in the RA procedure, instead of iteratively allocating remaining RBs (or, in the WiFi jargon, RUs) to the user groups assigned to the RB/RU to the left or right of the one under discussion, the AP only allows allocating $RU(l+1, 2i)$ to the user group on $RU(l+1, 2i+1)$ or vice versa. In this way, $RU(l+1, 2i)$ and $RU(l+1, 2i+1)$ may be merged into $RU(l,i)$ or remain as two separate RUs, maximizing the throughput while satisfying the constraints with respect to the position of the RUs.
For more details, please see the pseudocode that follows.

Algorithm 2 UP-RA revised
Require: The CSI of all users
    // Initialization: G = ∅, N = {0, 1, ..., 2^(L-1) - 1}
    // User grouping procedure
    for i = 0 : 2^(L-1) - 1 do
        g_opt(L-1, i) = USER-SELECTION(U, L-1, i)
        G = G ∪ g_opt(L-1, i)
    end for
    // User groups decoupling
    while G ≠ ∅ do
        i* = arg max_{i ∈ N} R_ZFBF(g_opt(L-1, i))
        G = G \ {g | g ∩ g_opt(L-1, i*) ≠ ∅}
        N = N \ i*
    end while
    // Remaining RU allocation procedure
    for l = L-2 : 0 do
        G' = ∅
        if g_opt(l+1, 2i) ∈ G and g_opt(l+1, 2i+1) ∉ G then
            for u_j in g_opt(l+1, 2i) do
                G' = G' ∪ {RU(l, i), u_j}
            end for
        else if g_opt(l+1, 2i) ∉ G and g_opt(l+1, 2i+1) ∈ G then
            for u_j in g_opt(l+1, 2i+1) do
                G' = G' ∪ {RU(l, i), u_j}
            end for
        else if g_opt(l+1, 2i) ∈ G and g_opt(l+1, 2i+1) ∈ G then
            G' = {g_opt(l+1, 2i) ∪ g_opt(l+1, 2i+1)}
        end if
        g_opt(l, i) = arg max_{g ∈ G'} R_ZFBF(g)
        G = G \ {g_opt(l+1, i)}
        G = G ∪ g_opt(l, i)
    end for

2.5.3 Recursive Scheduling

The divide and conquer algorithm violates Constraint 1), as the SRA problems on $RU(l+1, 2i)$ and $RU(l+1, 2i+1)$ are solved independently and some users may appear in both $g_{opt}(l+1, 2i)$ and $g_{opt}(l+1, 2i+1)$. In order to satisfy Constraint 1), we propose a new algorithm referred to as recursive scheduling, which excludes the users of $g_{opt}(l+1, 2i)$ from the user set when solving for $g_{opt}(l+1, 2i+1)$ and vice versa. The algorithm may return a suboptimal solution, since the optimal user schedule does not necessarily contain either $g_{opt}(l+1, 2i)$ or $g_{opt}(l+1, 2i+1)$. Interestingly, the simulation results in Section 2.6 show that the gap between the solution of recursive scheduling and the optimal user schedule is very small. For more details on the algorithm, see the provided pseudo code.
Algorithm 3 RECURSIVE-SCHEDULING(U, l, i)
Require: The CSI of all users
    G = ∅
    // Divide the problem into two subproblems
    if l < L-1 then
        // Solve the SRA on RU(l+1, 2i) first
        g_opt(l+1, 2i) = RECURSIVE-SCHEDULING(U, l+1, 2i)
        U_c = U \ {u | u ∈ g_opt(l+1, 2i)}
        g'_opt(l+1, 2i+1) = RECURSIVE-SCHEDULING(U_c, l+1, 2i+1)
        g_m(l, i) = MERGE(g_opt(l+1, 2i), g'_opt(l+1, 2i+1))
        // Solve the SRA on RU(l+1, 2i+1) first
        g_opt(l+1, 2i+1) = RECURSIVE-SCHEDULING(U, l+1, 2i+1)
        U'_c = U \ {u | u ∈ g_opt(l+1, 2i+1)}
        g'_opt(l+1, 2i) = RECURSIVE-SCHEDULING(U'_c, l+1, 2i)
        g'_m(l, i) = MERGE(g'_opt(l+1, 2i), g_opt(l+1, 2i+1))
        G = G ∪ {g_m(l, i), g'_m(l, i)}
    end if
    // User grouping on RU(l, i)
    g_s(l, i) = USER-SELECTION(U, l, i)
    G = G ∪ {g_s(l, i)}
    // Select the optimal user schedule from G
    g_opt(l, i) = arg max_{g ∈ G} R_ZFBF(g)
    return g_opt(l, i)

We now comment on the complexity of the algorithm. The most time-consuming operation of recursive scheduling is user selection. When executing recursive-scheduling on an RU at level $l < L-1$, the process calls itself four times and the user-selection process once. When executing recursive-scheduling on an RU at level $L-1$, the process does not call itself; it only calls the user-selection process once. Putting it all together, the number of times the user-selection process is called to get the solution for an RU at level $l$, referred to as $\nu(l)$, equals

$$\nu(l) = \begin{cases} 4\nu(l+1) + 1 & l < L-1 \\ 1 & l = L-1. \end{cases} \tag{2.17}$$

Thus, the recursive-scheduling process calls the user-selection process a total of $\nu(0) = (4^L - 1)/3$ times. Even though this grows exponentially with $L$, $L$ is no more than 7 in an 802.11ax network (for the largest possible 160MHz channel).
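The call count of Eq. (2.17) can be checked with a small Python sketch of Algorithm 3 restricted to pure OFDMA mode. This is a simplified illustration under stated assumptions, not the C implementation discussed in this section: `rates[u][(l, i)]` is a hypothetical table of synthetic weighted rates of user `u` on RU(l, i), user selection simply picks the available user with the highest rate, and an RU with no remaining user contributes zero rate.

```python
import random


def recursive_scheduling(users, l, i, L, rates, counter):
    """Return (schedule, weighted_rate) on RU(l, i) for the given user set.

    schedule maps an RU (l, i) to a user id; no user appears twice.
    counter[0] counts the calls to user selection.
    """
    candidates = []
    if l < L - 1:
        left, right = (l + 1, 2 * i), (l + 1, 2 * i + 1)
        # Solve the left child first, then the right child without its users.
        g_l, r_l = recursive_scheduling(users, *left, L, rates, counter)
        g_r, r_r = recursive_scheduling(
            users - set(g_l.values()), *right, L, rates, counter)
        candidates.append(({**g_l, **g_r}, r_l + r_r))
        # Same procedure with the roles of the two children swapped.
        g_r, r_r = recursive_scheduling(users, *right, L, rates, counter)
        g_l, r_l = recursive_scheduling(
            users - set(g_r.values()), *left, L, rates, counter)
        candidates.append(({**g_l, **g_r}, r_l + r_r))
    # User selection on RU(l, i) used as a single RU (one user in OFDMA mode).
    counter[0] += 1
    if users:
        best = max(users, key=lambda u: rates[u][(l, i)])
        candidates.append(({(l, i): best}, rates[best][(l, i)]))
    else:
        candidates.append(({}, 0.0))
    return max(candidates, key=lambda c: c[1])


def synthetic_rates(n_users, L, seed=0):
    """Synthetic rates (arbitrary units) that shrink with the RU bandwidth."""
    rng = random.Random(seed)
    return {
        u: {(l, i): rng.uniform(1.0, 10.0) / 2 ** l
            for l in range(L) for i in range(2 ** l)}
        for u in range(n_users)
    }


L, N = 4, 10
rates = synthetic_rates(N, L)
counter = [0]
schedule, total = recursive_scheduling(
    frozenset(range(N)), 0, 0, L, rates, counter)
print(schedule, round(total, 3))
print(counter[0], (4 ** L - 1) // 3)  # both equal 85 for L = 4
```

Each invocation performs one user selection and, below the last level, four recursive calls, so the counter reproduces $\nu(0) = (4^L - 1)/3$; because the second child is always solved on the user set with the first child's users removed, the returned schedule never places a user on more than one RU.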
In OFDMA mode, the user selection selects the best user from $N$ users with time complexity $O(N \log N)$; in joint MU-MIMO and OFDMA mode ($l \le L-3$), the user selection selects the best user group from $N$ users with time complexity $O(N^2)$ (assuming an efficient algorithm like the one in [59] is used for user grouping). Thus, the time complexity of recursive-scheduling is $O(4^L N \log N)$ in OFDMA mode and $O(4^L N^2)$ in joint MU-MIMO and OFDMA mode.

We have implemented both exhaustive search and recursive scheduling in C to get a sense of the relative running time. For 10 users, when running the algorithms on a MacBook Pro with a 2.7GHz dual-core Intel Core i5 processor, recursive scheduling is three orders of magnitude faster than exhaustive search. Specifically, it takes about 0.05s to run recursive scheduling versus 450s to run exhaustive search in OFDMA mode, and 0.18s versus 1180s in joint MU-MIMO and OFDMA mode.

Our recursive scheduling code is not particularly optimized. To further speed up the computation of the schedule, vendors could optimize it, partially implement it in hardware, run it every few transmissions, or use an even simpler user grouping scheme than the one we have used (as a matter of fact, all WiFi vendors today use very simple user grouping schemes with up to linear time complexity in the number of considered users). (Note that some vendors today update user groups about every 0.1s, so no further speedup may be required. If, however, one wanted to update user groups and OFDMA schedules at every transmission, then some speedup would be required, as a WiFi transmission typically takes a few ms, e.g. about 5ms when packet aggregation is activated.)

2.6 Simulations

The presented algorithms for both the relaxed SRA problem (divide and conquer based algorithm) and the original SRA problem (UP-RA revised, greedy and recursive scheduling) are evaluated in simulations.
Consider downlink MU transmissions in a single 802.11ax basic service set (BSS) in a 50m × 50m office area with a central frequency of 5GHz, where the wireless channel can be modeled with the WINNER II model [43, 44, 4]. According to WINNER II, the path loss is given as

$$PL = A \log_{10}(d[\mathrm{m}]) + B + C \log_{10}(f_c[\mathrm{GHz}]/5.0) + X, \tag{2.18}$$

where $d$ is the distance between the user and the AP, and $A$, $B$, $C$, $X$ are parameters related to scenarios which can be found in [43]. We choose the indoor office (A1) non-line-of-sight (NLOS) scenario for the simulations.

For each simulation scenario, we keep the location of the AP fixed and create different topologies by randomly distributing the users. We then report the throughput and fairness index under both the pure OFDMA and the joint MU-MIMO and OFDMA modes. The throughput is computed as the average (over many transmissions) sum rate or proportional fair rate (see Section 2.4) and the fairness index [36] is defined as

$$F = \frac{\left(\sum_k r_k\right)^2}{N \sum_k r_k^2}. \tag{2.19}$$

$F$ ranges from $\frac{1}{N}$ to 1, and a larger value indicates better fairness between users. Last, in all simulations we use SIEVE [59] as the user grouping algorithm.

Figure 2.9: The topology of the 802.11ax BSS.

2.6.1 Divide and Conquer versus Exhaustive Search

The divide and conquer algorithm is compared with exhaustive search to evaluate the tightness of the upper bound. As mentioned in Section 2.5.2.1, the exhaustive search is computationally expensive, thus we consider a small scale scenario: The BSS consists of 1 AP and 7 users with $N_T = 2$ and $N_R = 1$ antennas respectively. The bandwidth is 20MHz ($L = 4$). The throughput is computed as the average sum rate over hundreds of transmissions.

As shown in Fig. 2.10, the throughput of divide and conquer is slightly higher than exhaustive search, which is expected since it allows assigning multiple RUs to a user while exhaustive search does not. Interestingly, the gap between the exhaustive search and divide and conquer is very small.
Specifically, the average throughput is very similar (see left plot), and the CDF of the ratio of the throughput of exhaustive search over that of divide and conquer shows that the throughput of exhaustive search is always less than 8% off that of divide and conquer (see right plot).

In addition to the direct comparison between exhaustive search and divide and conquer under a small scale scenario, we perform further implicit comparisons under large scale scenarios in the following simulations. These simulations establish that recursive scheduling is very close to the upper bound (divide and conquer); since exhaustive search is by definition between recursive scheduling and divide and conquer, it is also very close to the upper bound. Thus, the sum rate achieved by the divide and conquer algorithm is a tight upper bound of the optimal user schedule, and we will use it in place of exhaustive search (optimal) for the larger scale scenarios discussed below. Note that since the total transmit power is constant, the sum rate does not change linearly with $N_T$.

Figure 2.10: The gap between exhaustive search and divide and conquer. Left: throughput (Mbps) of exhaustive search and divide and conquer in OFDMA and joint MU-MIMO and OFDMA modes. Right: CDF of the normalized throughput in the two modes.

2.6.2 Comparison between SRA Algorithms

The performance of UP-RA revised, greedy, recursive scheduling and divide and conquer is compared in a BSS with one AP and 30 users with $N_T = 4$ and $N_R = 1$ antennas respectively. The bandwidth is 40MHz ($L = 5$). The weighting coefficients are set to compute both the sum rate and the proportional fair rate. As shown in Fig. 2.11, the throughput of recursive scheduling is very close to that of divide and conquer under both sum rate and proportional fair, which means it is also close to the optimal user schedule achieved by exhaustive search.
The throughput of the greedy and UP-RA revised algorithms, however, is lower than that of recursive scheduling in both OFDMA and joint MU-MIMO and OFDMA modes.

Fig. 2.11 also plots fairness index values. For all considered algorithms, the fairness index is relatively close to 1. This is primarily because, in the scenario we consider here, the path loss is not that different among the various users, and thus the well-known tradeoff between maximizing throughput and achieving fairness (see, for example, [19]) is not strong. (See Fig. 2.13 below for a scenario with sizable differences in user path loss resulting in low fairness when maximizing the sum rate.) Note that recursive scheduling, our near-optimal algorithm, does achieve a good balance between throughput and fairness: its throughput is near optimal while its fairness index is quite close to 1 under the proportional fair rate, which strives for fairness by design.

2.6.3 Comparison between Continuous-rate AMC and Discrete-rate AMC

In the previous section, the performance of the algorithms is compared assuming continuous-rate AMC, specifically Shannon capacity, as the channel capacity. This assumption is widely used in previous works even though discrete-rate AMC, used in real 802.11 networks, is different from Shannon capacity. In this simulation, discrete-rate AMC, rather than Shannon capacity, is used to estimate the channel capacity. Comparing the performance of the algorithms shown in Fig. 2.11 and Fig.
2.12, it is shown that continuous-rate AMC overestimates the sum rate, but gives the same conclusion: recursive scheduling is better than UP-RA revised and is very close to divide and conquer. In the following simulations, Shannon capacity is used to estimate the channel capacity of a user or user group.

[Figure 2.11: Comparison of the sum and proportionally fair rates between different algorithms. Panels: (a) OFDMA, (b) Joint MU-MIMO and OFDMA. Each panel plots throughput (Mbps) and fairness index for UP-RA revised, greedy, recursive, and divide & conquer under sum rate and proportional fair rate.]

[Figure 2.12: Achieved rates using discrete-rate AMC (compare to continuous-rate AMC in Fig. 2.11). Panels: (a) OFDMA, (b) Joint MU-MIMO and OFDMA. Each panel plots throughput (Mbps) and fairness index for the same four algorithms.]

2.6.4 Impact of the Distribution of Users

Consider a BSS with 1 AP and 30 users with N_T = 4 and N_R = 1 antennas respectively and a bandwidth of 40 MHz (L = 5) like before. Suppose there are two groups of users: the first group is 10 m away from the AP and has a relatively small path loss, while the other group is 20 m away from the AP and has a larger path loss.
We gradually increase the proportion of users in the first group and compare the throughput of the divide and conquer, UP-RA revised, greedy, and recursive algorithms. With more users in the first group, the throughput of the system should increase, which is indeed the case as shown in Fig. 2.13. The figure also shows that the sum rate of the greedy algorithm is, as expected, lower than that of the other algorithms, since it doesn't allocate as many RUs as possible to users in the first group. With respect to fairness, note that if the proportion of users in the first group is very large, say 90%, or very small, say 10%, a large group of users have the same signal strength and thus similar throughput, and the fairness index is relatively large. When the split between the two user groups is more balanced, there is larger variability in the users' signal strength and the fairness index is lower, as a sizable number of users is rarely selected, especially when the goal is to maximize the sum rate. As a result, the fairness index is minimized when the proportion of users in the first group is neither very close to 0 nor very close to 1, but rather somewhere in between. Last, as expected, note that when the AP uses proportional fairness, the fairness index is closer to 1 than when it maximizes the sum rate, since the former does attempt to achieve some level of fairness by design.

2.6.5 Impact of the Number of Users

The impact of the number of users is evaluated in a BSS which consists of 1 AP and 20 ≤ N ≤ 50 users equipped with N_T = 4 and N_R = 1 antennas respectively. The bandwidth is 40 MHz (L = 5) like before. As shown in Fig. 2.14, the sum rate increases as N increases since there are more users who are close to the AP. If N is small, the performance can be improved by splitting the bandwidth into fewer RUs, since users with good channel conditions can get more frequency resources.
UP-RA revised, recursive scheduling and divide and conquer algorithms can adjust the number of RUs dynamically, while the greedy algorithm cannot. As a result, the throughput of the greedy algorithm is much lower than that of the other algorithms. Note that the fairness figures are zoomed in to show the difference between algorithms, which is quite small. Also note that as the number of users increases, there are more users with a higher (or lower) rate than average; therefore, the fairness index decreases slightly.

[Figure 2.13: Impact of different types of users. Panels: (a) OFDMA, (b) Joint MU-MIMO and OFDMA. Each panel plots throughput (Mbps) and fairness index versus the proportion of users in the first group, for UP-RA revised, greedy, recursive, and divide & conquer under sum rate (Sum) and proportional fair rate (PF).]
[Figure 2.14: Impact of number of users. Panels: (a) OFDMA, (b) Joint MU-MIMO and OFDMA. Each panel plots throughput (Mbps) and fairness index versus the number of users (20 to 50), for UP-RA revised, greedy, recursive, and divide & conquer under sum rate (Sum) and proportional fair rate (PF).]

2.6.6 Impact of the Number of Antennas at the AP

The impact of the number of transmit antennas is evaluated in a BSS which consists of 1 AP and 25 users equipped with 2 ≤ N_T ≤ 6 and N_R = 1 antennas respectively and a bandwidth of 40 MHz (L = 5) like before. As shown in Fig. 2.15, the sum rate increases as N_T increases, since there are more spatial streams in each MU-MIMO group. The sum rate of recursive scheduling is close to that of divide and conquer as N_T increases. The sum rate increases faster in joint MU-MIMO and OFDMA mode because the channel capacity of MIMO is related to both the SNR and the rank of the channel matrix H_s. The rank of SU-MIMO in OFDMA mode is constant (equal to 1), while the rank of MU-MIMO in joint MU-MIMO and OFDMA mode may change from 2 to 6 as we vary the number of antennas from 2 to 6. Last, note that since the total transmit power P_s is constant, the sum rate does not change linearly with N_T. In summary, the results in Figures 2.11 - 2.15 show that recursive scheduling indeed performs near optimally (quite close to divide and conquer) under various scenarios.
[Figure 2.15: Impact of number of antennas. Panels: (a) OFDMA, (b) Joint MU-MIMO and OFDMA. Each panel plots throughput (Mbps) and fairness index versus the number of antennas (2 to 6), for UP-RA revised, greedy, recursive, and divide & conquer under sum rate (Sum) and proportional fair rate (PF).]

2.6.7 Impact of variable payload

In the previous scenarios, we have assumed that there are always packets to send to users. When this is not the case, it is important to consider that users may have different payload sizes (different number and size of packets waiting to be transmitted to them). Suppose that the available payload of each user is a random variable uniformly distributed between 200 and 11454 bytes whenever the AP starts a DL MU transmission [39]. The AP solves the SRA problem in two manners, see Section 2.4.2.6.

1. (No adaptation) The AP optimizes the performance without considering user traffic. It assumes users' payload can always fill the DL MU transmission. In this case, channel time may be wasted.
2. (Adaptation) The AP adjusts the RU size according to the users' available payload to optimize the effective throughput.

The performance of these approaches under multiple scheduling schemes is compared in a BSS with one AP and 30 users with N_T = 4 and N_R = 1 antennas respectively and a bandwidth of 40 MHz. As shown in Fig.
2.16, the SRA algorithm without considering variable user payload leads to an effective throughput loss of about 40 Mbps for OFDMA and 100 Mbps for joint MU-MIMO and OFDMA. This demonstrates that our algorithms, by adjusting the RU size according to the users' available payload (Adaptation), can reduce the wasted airtime by taking advantage of the OFDMA capabilities of 802.11ax, without requiring padding as in 802.11ac [39].

[Figure 2.16: Comparison of the effective throughput between different algorithms. Panels: (a) OFDMA, (b) Joint MU-MIMO and OFDMA. Each panel plots effective throughput (Mbps) and fairness index for UP-RA revised, greedy, and recursive, with and without adaptation.]

2.7 Experimental Results

In the previous section, the performance of the algorithms is evaluated in simulations under some assumptions: 1) the wireless channel is modeled using the WINNER II model A1 scenario, 2) the implementation is ideal, e.g. there are no synchronization or channel estimation errors. In reality, however, the transmission can go through different wireless channels and suffer from imperfect implementations, such as synchronization errors and channel estimation errors. Therefore, the simulation results cannot capture the real-world performance exactly. Motivated by this, we evaluate the performance of the algorithms via experiments using WARPv3 boards [2], see Fig. 2.17 for results.

[Figure 2.17: Experiment overview. Diagram: WARP nodes (the AP and the users), each with multiple radios, and a PC, all connected through an Ethernet switch.]
We obtain these results using the following testbed setup: one WARP node is used as an AP with 4 antennas, while 4 other WARP nodes, each equipped with 4 antennas, act as 16 independent users with a single antenna each. All the WARP nodes and a PC, which runs WARPLab and the SRA algorithms based on WARP measurements, connect to a switch through Ethernet cables. The receiver executes synchronization and channel estimation using the preamble and pilot subcarriers in the frames [3]. The AP works on a 20 MHz channel. All the users can be allocated to any RUs. We present average results over 100 different topologies. For each topology, the AP stays at the same location but the 16 users are randomly placed at different locations inside the office space shown in Fig. 2.18. Note that the SRA algorithms are executed at the PC using the CSI measurements obtained by the testbed. The performance of all the algorithms is compared in Fig. 2.19. The recursive scheduling algorithm can achieve a throughput which is close to the one achieved by divide and conquer, while maintaining good fairness among users, consistent with the simulation results.

[Figure 2.18: Office floor-plan.]

2.8 Conclusions and Future Work

In this paper, we study how to jointly split bandwidth into RUs and schedule users or user groups within these RUs in the context of 802.11ax networks. We first investigate a relaxed version of this scheduling and resource allocation (SRA) problem which allows allocating a user to multiple RUs. It is proved that the relaxed SRA problem can be solved optimally by a divide and conquer based algorithm that we introduce, and the weighted sum rate of its optimal user schedule is a tight upper bound of the original SRA problem. However, the original SRA problem for 802.11ax, which allocates a user to at most one RU, can only be solved optimally by exhaustive search, which is computationally expensive.
Motivated by this, we propose a greedy and a recursive algorithm which solve the SRA problem efficiently. As shown via simulations and experiments, the greedy algorithm is fast and achieves reasonable performance, while the recursive scheduling algorithm yields a throughput which is close to the optimal and maintains good fairness among users. We also show that modified LTE SRA algorithms like UP-RA do not perform as well as recursive scheduling since they don't exploit the 802.11ax RU structure. Last, we show that our schemes can handle a plethora of real-world constraints including variable packet sizes, limited radio and CPU capabilities of access points, and inter-cell interference.

[Figure 2.19: Experimental results using WARP boards. Panels: (a) OFDMA, (b) Joint MU-MIMO and OFDMA. Each panel plots throughput (Mbps) and fairness index for UP-RA revised, greedy, recursive, and divide & conquer under sum rate and proportional fair rate.]

In this paper, it is assumed that the AP can obtain perfect CSI from channel sounding. In an 802.11ax network, the CSI at the AP is imperfect due to quantization, delay, channel estimation error and so on. The estimate of channel capacity, as a function of CSI, is therefore also inaccurate, which may change the optimal solution of the SRA problem. We leave the study of the influence of imperfect CSI on the SRA problem as future work.

Chapter 3 Scheduling and Resource Allocation

3.1 Introduction

Carrier-sense multiple access with collision avoidance (CSMA/CA) is the multiple access method used in 802.11 networks.
In the carrier sense phase, a node listens to the wireless channel to determine whether another node is transmitting or not. If an 802.11 preamble is detected and its strength is higher than a signal threshold, the Clear Channel Assessment (CCA) is reported as busy and the transmission is deferred for the length indicated in the incoming frame's Physical Layer Convergence Protocol (PLCP) length field [22]. This multiple access mechanism may cause hidden node problems (HNPs) and exposed node problems (ENPs). Specifically, suppose there are four nodes labeled as A, B, C, and D, with carrier sensing ranges as shown in Fig. 3.1 as colored segments. B can only exchange frames with A and C, while C can only exchange frames with B and D. In (a), both A and C want to transmit to B. A and C are unaware of each other and thus consider the channel as idle and start the transmission. These two frames collide with each other at B, but neither A nor C is aware of this collision (HNP scenario). In (b), on the other hand, B wants to transmit to A and C wants to transmit to D. B and C are aware of each other. Therefore, if one starts the transmission then the other will consider the channel as busy and defer its transmission. However, the deferral is unnecessary, since neither transmission interferes with the other's receiver (ENP scenario).

[Figure 3.1: Problems with CSMA/CA: (a) hidden node problem, (b) exposed node problem. Each panel shows nodes A, B, C, and D.]

The HNP can be addressed with the request-to-send/clear-to-send (RTS/CTS) mechanism where the transmitter and receiver exchange RTS and CTS frames before transmission. The transmitter sends an RTS frame to the receiver. The receiver then replies with a CTS frame. Any node that receives the CTS frame defers its transmission and thus prevents collision at the receiver.
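As a rough illustration of the two scenarios above, the following Python sketch classifies a pair of simultaneous transmissions as a hidden node or an exposed node case. The `HEARS` map (the chain A-B-C-D) and the `classify` helper are hypothetical names introduced here for illustration only; the model is purely topological and ignores signal strength and thresholds.

```python
# Hypothetical carrier-sensing topology for the four-node example: A - B - C - D.
# HEARS[u] is the set of nodes that u can hear (assumed symmetric).
HEARS = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}


def classify(link1, link2, hears=HEARS):
    """Classify two simultaneous (transmitter, receiver) links.

    "hidden":  the transmitters cannot hear each other, yet at least one
               receiver hears the other link's transmitter (collision).
    "exposed": the transmitters hear each other (one would defer), yet
               neither receiver hears the other link's transmitter.
    "other":   anything else.
    """
    (tx1, rx1), (tx2, rx2) = link1, link2
    tx_hear_each_other = tx2 in hears[tx1]
    rx_interfered = tx2 in hears[rx1] or tx1 in hears[rx2]
    if not tx_hear_each_other and rx_interfered:
        return "hidden"
    if tx_hear_each_other and not rx_interfered:
        return "exposed"
    return "other"


# Scenario (a): A and C both transmit to B.
scenario_a = classify(("A", "B"), ("C", "B"))
# Scenario (b): B transmits to A while C transmits to D.
scenario_b = classify(("B", "A"), ("C", "D"))
```

Under these assumptions, scenario (a) is classified as a hidden node case and scenario (b) as an exposed node case, matching the description of Fig. 3.1.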
Unfortunately, the RTS/CTS mechanism reduces the opportunities for spatial reuse (SR), since nodes exposed to the transmitter or receiver may be unnecessarily blocked [46]. To improve SR, one may increase the signal threshold; however, this can worsen the HNP, since more nodes are unaware of each other. This is the well known trade-off in dealing with HNP and ENP. Many previous academic studies try to improve SR in 802.11 networks, see, for example, [46, 55, 52, 53, 70] and references therein. Since industry adoption requires standardization, the newest WiFi standard, 802.11ax, introduced a mechanism to improve SR as well, mostly targeting dense environments since these are the environments where the problem is more acute [27, 33]. In particular, it introduced the concept of basic service set (BSS) coloring, where a node can identify an overlapping BSS (OBSS) frame, that is, a frame from a nearby BSS, by checking the BSS "color" of an incoming frame, and avoid backing off if the received signal strength indicator (RSSI) of that frame is under a threshold (called OBSS preamble detection, or OBSS_PD for short). It should be evident by now that the transmission power (TX_PWR) of a frame and the OBSS_PD together determine whether spatial reuse is allowed. Properly setting the values of these parameters can improve the overall throughput of the system [41, 56], and previous works [53, 55, 52] have adjusted the TX_PWR and the OBSS_PD based on models whose parameters depend on the environment. As a result, while a step in the right direction, these approaches do not adapt to environmental changes automatically and may lead to bad performance. Motivated by this, we investigate how to use reinforcement learning (RL), an area of machine learning which is widely used for self-learning systems in an unknown environment, to design algorithms that allow WiFi stations (STAs) to learn the environment and choose the right TX_PWR and OBSS_PD for each situation.
RL typically involves an agent which iteratively learns a policy to pursue a specific goal, adapting to the environment, and has been successfully used in areas like autonomous driving, robotic systems, computer vision, etc. In this paper, we apply several RL algorithms for transmit power control (TPC) to improve spatial reuse (SR) in 802.11ax. We show via extensive MATLAB simulations and realistic simulations in ns-3 that RL can significantly improve spatial reuse in a wide variety of realistic, representative scenarios, by quickly adjusting to environmental changes and yielding sizably higher overall throughput than prior work. The rest of the paper is organized as follows. Section 3.2 briefly summarizes prior work. Section 3.3 describes the 802.11ax standard rules which STAs need to observe when jointly adjusting the TX_PWR and OBSS_PD. It also introduces other system model aspects, e.g. a well known Continuous Time Markov Chain (CTMC) based model that is used to model CSMA in our MATLAB simulations. (The ns-3 simulations use a realistic implementation of the CSMA protocol specification.) Then, Section 3.4 introduces some existing and some novel decentralized algorithms which require manual tuning to adjust to different environments (we will refer to such algorithms as "conventional"), and Section 3.5 introduces some novel decentralized RL-based algorithms. The performance of the various algorithms is compared in Sections 3.6 (MATLAB simulations) and 3.7 (ns-3 simulations), where it is shown that SR is indeed significantly improved via RL-based approaches. Last, Section 3.8 concludes the paper.

3.2 Related Work

Improving SR is a well-known challenge in wireless networks, especially in enterprise WiFi networks. A number of approaches to improve SR for 802.11 networks have been suggested prior to the introduction of the 802.11ax version.
For example, a two-way handshake for efficient channel reservation and spatial reuse (SR) has been suggested in [46], where a transmitting node advertises a probe frame, reserves the channel before transmission, and releases the channel reservation after the transmission. As another example, dynamic sensitivity control (DSC) was proposed in [60], where an STA measures the RSSI of the access point (AP) beacon and then sets the receiver (RX) sensitivity threshold at a lower value by some amount called "margin". When 802.11ax was in development, BSS coloring and OBSS_PD were proposed to improve SR. The performance of the proposal was studied in multiple papers, see, for example, [17, 5, 56, 54, 28, 41]. [17, 5] describe BSS coloring and spatial reuse in 802.11ax and indicate the potential improvement by SR. [56] evaluates DSC and BSS coloring schemes for both downlink and uplink transmissions while taking into account the so-called physical layer capture (PLC). [56, 54] compare the performance of DSC and BSS coloring in 802.11ax, and show that BSS coloring can improve the spectrum efficiency if it is used jointly with the DSC algorithm. Last, [41] evaluates the expected performance benefits offered by the 802.11ax SR, with fixed OBSS_PD values. The 802.11ax standard specifies the rules for joint adjustment of OBSS_PD and TX_PWR, but leaves the adjustment policy open. Some prior work proposes specific algorithms to jointly adjust the OBSS_PD and TX_PWR for improved 802.11ax SR [53, 55, 52, 70]. [55, 52] adjust the value of OBSS_PD using the RSSI values of beacons from nearby APs, while [53] adjusts the TX_PWR using the expected transmission count (ETX) as a metric to assess the effectiveness of prior choices. While these works are a step in the right direction, they do not adjust well to dynamic environments.
Specifically, these works generally map some information to TX_PWR or OBSS_PD using a parameterized model, and the resulting performance depends on the parameter values which, in turn, depend on the environment. If the environment changes, as shown in Sections 3.6 and 3.7, these "conventional" transmit power control (TPC) algorithms for SR fail to adapt to the new environment and lead to a loss in overall throughput. Machine learning, more specifically reinforcement learning (RL), has been used for DSC and TPC in the context of wireless networks but not in particular for SR. For example, in [16], the authors propose Q learning based power control for wireless sensor networks, where they assume that the sensors can collect full information from the environment, thereby making the best decisions. In contrast, in our work we take into consideration real-world limitations and restrict the information that an STA may collect. In [23], an RL algorithm based on limited environmental information is leveraged to do power control and rate adaptation in the context of cellular networks. The authors in [58] apply multi-agent deep RL based power control in energy harvesting wireless networks. The wireless nodes learn the pattern of energy harvesting and adjust their power level accordingly to optimize the throughput. More related to our work are some recent works which use RL for the purpose of SR in WLANs. In [66, 65], the authors outline a general RL-based sensitivity control framework for improving SR in WLANs. [66] optimizes channel selection and TPC using a number of RL algorithms including ε-greedy, EXP3, UCB and Thompson sampling, but it does not consider a carrier sense threshold (CST) like OBSS_PD. [65] uses Thompson sampling to adjust the channel, the CST and the TX_PWR for static scenarios where STAs are fixed.
The action set considered is the combination of all possible channel, CST and TX_PWR values, which doesn't satisfy the 802.11ax constraints as shown in Equation (3.2). In summary, neither of these works is designed for 802.11ax, and, as a result, they are not applicable in the context of 802.11ax. Last, it is not clear how these algorithms would perform if the action set was to be restricted in some manner to satisfy 802.11ax constraints.

[Figure 3.2: The topology of an 802.11ax system. Diagram: BSS1 with AP1 and BSS2 with AP2, together with STA1, STA2, and STA3.]

3.3 System Model

3.3.1 BSS Color

BSS color is an identifier of a BSS which helps stations identify the source BSS of a received frame. This information is carried in the preamble of an 802.11ax PHY header. Upon the reception of a new frame, the STA checks the BSS color of the frame. If the BSS color is the same as that of its own BSS, the frame is considered an intra-BSS frame which comes from other nodes within its BSS; otherwise, the frame is considered an inter-BSS frame from an OBSS. The BSS color information is represented by a 6-bit BSS color field in the SIG-A field. It can identify up to 63 different BSSs (value 0 is reserved to indicate that BSS coloring is disabled). Thus, for all practical purposes, it can be assumed that there is no BSS color collision in the system as each BSS has a unique BSS color and all nodes can identify intra-BSS frames and inter-BSS frames correctly.

[Figure 3.3: The procedure of frame reception. Flowchart: frame detected, check BSS color; an intra-BSS frame gives Busy; for an inter-BSS frame, test RSSI < OBSS_PD: Yes gives Idle, No gives Busy.]

3.3.2 Spatial Reuse

802.11ax enables SR to improve spectrum efficiency, especially in dense environments. Each STA is configured with a pair of parameters, (TX_PWR, OBSS_PD). During carrier sense, if an intra-BSS frame is received by an STA, the clear channel assessment (CCA) is considered as busy.
If, however, an inter-BSS frame is received, the channel is considered as busy only if the signal strength is higher than the OBSS_PD; otherwise, the CCA is considered idle and the STA can start a new transmission. This procedure is depicted in Fig. 3.3. A high OBSS_PD allows the STAs to ignore inter-BSS frames even when the co-channel interference between these BSSs is strong, and thus increases the number of concurrently transmitting STAs. But the new transmission may harm the ongoing transmissions due to its strong interference and thus may cause collisions. In order to relieve co-channel interference while improving spatial reuse, an STA has to adjust the OBSS_PD in conjunction with its TX_PWR as follows [27]:

$$\mathrm{OBSS\_PD} \le \max\big(\mathrm{OBSS\_PD}_{\min},\ \min\big(\mathrm{OBSS\_PD}_{\max},\ \mathrm{OBSS\_PD}_{\min} + (\mathrm{TX\_PWR}_{\mathrm{ref}} - \mathrm{TX\_PWR})\big)\big), \tag{3.1}$$

where TX_PWR_ref is a reference power level defined as 21 dBm. The OBSS_PD is allowed to be adjusted within the range between OBSS_PD_max = -62 dBm and OBSS_PD_min = -82 dBm, in which a lower TX_PWR corresponds to a higher allowable OBSS_PD level. It should be noted that the STA can operate at the legacy CCA level by employing OBSS_PD = -82 dBm, which is referred to as no SR. The adjustment rule is illustrated in Fig. 3.4.

[Figure 3.4: Illustration of the adjustment rule for OBSS_PD and TX_PWR. Plot of the allowable OBSS_PD region versus TX_PWR, bounded by OBSS_PD_min, OBSS_PD_max, and TX_PWR_ref.]

In this paper, we assume that each STA selects the OBSS_PD as the largest value available, such that (3.1) becomes an equality, since, in a real world setup, all STAs will try their best to access the channel [53, 52]. Hence, setting either of OBSS_PD or TX_PWR determines the other, and optimizing the value selection of these two parameters is reduced to selecting one of them.
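The frame reception procedure and the adjustment rule (3.1) can be sketched in a few lines of Python. This is a minimal illustration under the values stated above (OBSS_PD_min = -82 dBm, OBSS_PD_max = -62 dBm, TX_PWR_ref = 21 dBm); the function names `max_obss_pd` and `cca_busy` are hypothetical and do not come from the 802.11ax standard or from any library.

```python
OBSS_PD_MIN_DBM = -82  # legacy CCA level (no SR)
OBSS_PD_MAX_DBM = -62
TX_PWR_REF_DBM = 21


def max_obss_pd(tx_pwr_dbm):
    """Largest OBSS_PD (dBm) allowed by rule (3.1) for a given TX_PWR (dBm)."""
    candidate = OBSS_PD_MIN_DBM + (TX_PWR_REF_DBM - tx_pwr_dbm)
    return max(OBSS_PD_MIN_DBM, min(OBSS_PD_MAX_DBM, candidate))


def cca_busy(same_bss_color, rssi_dbm, obss_pd_dbm):
    """CCA decision for a detected frame, following the flow of Fig. 3.3.

    Intra-BSS frames always make the channel busy. An inter-BSS frame makes
    the channel idle only if its RSSI is below the OBSS_PD threshold.
    """
    if same_bss_color:
        return True
    return not (rssi_dbm < obss_pd_dbm)


# A lower TX_PWR buys a higher (more permissive) OBSS_PD, within the limits.
thresholds = {p: max_obss_pd(p) for p in (1, 10, 21)}
```

With the largest allowable threshold selected, TX_PWR values of 1, 10, and 21 dBm map to OBSS_PD values of -62, -71, and -82 dBm respectively, so the sum TX_PWR + OBSS_PD stays at -61 over this range.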
We choose to select the TX_PWR, and, following real world considerations where commercial WiFi chipsets can only adjust the TX_PWR in 1 dBm steps, the constraints of the joint adjustment of TX_PWR and OBSS_PD become

$$\mathrm{TX\_PWR} + \mathrm{OBSS\_PD} = -61\ \mathrm{dBm}, \quad \mathrm{TX\_PWR} \in \mathcal{A}_T = \{1, 2, \ldots, 21\}\ \mathrm{dBm}, \quad \mathrm{OBSS\_PD} \in \mathcal{A}_O = \{-82, -81, \ldots, -62\}\ \mathrm{dBm}. \tag{3.2}$$

Note that Equation (3.2) introduces reciprocity in concurrent transmissions between any pair of STAs, even if they have different TX_PWR and OBSS_PD. To see this, suppose the path loss between STA 1 and STA 2 in Fig. 3.2 is PL and the interference from STA 1 to STA 2 is low and satisfies

$$\mathrm{TX\_PWR}_1 - PL < \mathrm{OBSS\_PD}_2. \tag{3.3}$$

Then, STA 2 can start a concurrent transmission during a transmission of STA 1. Under this scenario, the TX_PWR and OBSS_PD of STA 1 and STA 2 satisfy the equations

$$\mathrm{TX\_PWR}_1 + \mathrm{OBSS\_PD}_1 = -61\ \mathrm{dBm}, \quad \mathrm{TX\_PWR}_2 + \mathrm{OBSS\_PD}_2 = -61\ \mathrm{dBm}. \tag{3.4}$$

Substituting OBSS_PD_2 = -61 - TX_PWR_2 from (3.4) into (3.3) gives TX_PWR_1 + TX_PWR_2 - PL < -61, which is symmetric in the two STAs. Equations (3.3) and (3.4) therefore imply that

$$\mathrm{TX\_PWR}_2 - PL < \mathrm{OBSS\_PD}_1. \tag{3.5}$$

Therefore, the interference from STA 2 to STA 1 is also low such that STA 1 can also start a concurrent transmission during a transmission of STA 2.

3.3.3 Transmit Power Control

Each STA adjusts its TX_PWR and OBSS_PD according to Equation (3.2) periodically to maximize the overall system throughput. Throughout this work we do not allow STAs to coordinate with each other, to keep our findings applicable to real world deployment. For each STA, a higher TX_PWR can increase the RSSI at its associated AP, but doesn't necessarily increase the overall throughput of the system due to two reasons:

- Higher co-channel interference: The OBSSs receive higher co-channel interference from this STA, which reduces the SINR at the OBSS APs.
- Fewer concurrent transmissions: The STA has to reduce its OBSS_PD according to Equation (3.2). Therefore, it is less likely to transmit concurrently during OBSS transmissions.

For example, suppose the STAs in Fig.
3.2 initially set their TX_PWR and OBSS_PD as in Table 3.1, the path loss between STA 1 and STA 2 is 90 dB, and the path loss between STA 1 and STA 3 is 95 dB. The interference from STA 1 to STA 2 and STA 3 is -80 dBm and -85 dBm respectively, lower than their OBSS_PD. STA 1 can transmit concurrently with either STA 2 or STA 3. If STA 1 updates its TX_PWR to 15 dBm, it can increase the RSSI at AP 1 by 5 dB, but the interference from STA 1 to STA 2 and STA 3 becomes -75 dBm and -80 dBm respectively. STA 2 can no longer transmit concurrently with STA 1. Even though STA 3 can still transmit concurrently with STA 1, the new TX_PWR_1 increases the interference from STA 1 to AP 2 and reduces the SINR of the transmissions from STA 3 to AP 2. Therefore, the overall throughput is not necessarily increased.

Table 3.1: The initial TX_PWR and OBSS_PD of STAs

STA    TX_PWR (dBm)    OBSS_PD (dBm)
1      10              -71
2      15              -76
3      15              -76

3.3.4 CSMA/CA Modeling

For the purpose of our custom MATLAB-based simulations, we adopt the well known exponential CSMA/CA model, see, for example, [45, 10]. We also evaluate our algorithms using ns-3, which implements the real CSMA/CA protocol. Under the exponential CSMA/CA model, transmissions are considered exponentially distributed with mean $1/\mu$, countdown times are considered exponentially distributed with mean $1/\lambda$, and the medium can be sensed instantaneously [30]. Moreover, thanks to the reciprocity in concurrent transmissions between any pair of STAs mentioned previously, we have an undirected interference graph. The system can therefore be modeled as a Continuous Time Markov Chain (CTMC). We define a set of non-conflicting stations which can transmit at the same time as a feasible transmission pattern. The states of the CTMC are all the different patterns for the $N_s$ stations.
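To make the notion of feasible transmission patterns concrete, the following Python sketch enumerates them for a small interference graph and weights each pattern by ρ raised to the number of active stations, which is the product-form stationary distribution of this idealized CSMA model (Equation (3.6)). The three-station conflict list and the function names are hypothetical, chosen only for illustration; brute-force enumeration is only practical for small networks.

```python
from itertools import product


def feasible_patterns(num_stations, conflicts):
    """Return all 0/1 state vectors in which no two conflicting stations
    transmit at the same time (independent sets of the interference graph)."""
    patterns = []
    for v in product((0, 1), repeat=num_stations):
        if all(not (v[i] and v[j]) for i, j in conflicts):
            patterns.append(v)
    return patterns


def stationary_distribution(patterns, rho):
    """Weight each pattern by rho ** (number of active stations), then normalize."""
    weights = [rho ** sum(v) for v in patterns]
    total = sum(weights)
    return {v: w / total for v, w in zip(patterns, weights)}


# Hypothetical example: stations 0 and 1 conflict, station 2 conflicts with nobody.
patterns = feasible_patterns(3, conflicts=[(0, 1)])
pi = stationary_distribution(patterns, rho=2.0)
```

In this example six of the eight state vectors are feasible, and the patterns with two active stations receive the largest share of time, which is the intuition behind approximating the throughput with a single dominant pattern in large scenarios.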
Formally, let $v = \{v_k \mid k = 1, \ldots, N_s\}$ be the state vector of the CTMC, where we let $v_i = 1$ if station $i$ is transmitting and $v_i = 0$ otherwise, and let $V \subseteq \{0, 1\}^{N_s}$ be the state space of the CTMC. The limiting stationary distribution of state $v \in V$ can be calculated as follows:

$$\pi_v = \frac{\rho^{\|v\|_1}}{\sum_{v' \in V} \rho^{\|v'\|_1}}, \tag{3.6}$$

where $\rho = \lambda/\mu$ and $\|v\|_1$ is the number of stations transmitting during state $v$. This gives us the fraction of time that the network spends in each transmission pattern $v \in V$. We then calculate the throughput resulting from CSMA/CA by evaluating the spectral efficiency achieved for each transmission pattern, averaging these values with respect to the stationary distribution, and multiplying the result by the channel bandwidth. In practice, it is time consuming to obtain all the possible states and the corresponding limiting distribution. Therefore, for large scale scenarios we use the same technique as in [21] to get the maximal independent set (MIS) of the interference graph. The state obtained by the MIS typically takes a large portion of time and dominates the performance (see Eq. 3.6). Instead of enumerating all the possible states, one may calculate the throughput of that state and ignore the throughput of the others [45, 10]. Thus, the overall throughput of the system is approximately equal to

$$\mathrm{Throughput} = \sum_{k \in M} R(\mathrm{SINR}_k), \tag{3.7}$$

where $\mathrm{SINR}_k$ is the SINR of STA $k$, $M$ is the MIS, and $R(\cdot)$ is the throughput of the STA given the SINR.

3.3.5 Possible Information Collected at an STA in 802.11ax

To keep our study amenable to real world implementation, we restrict the information that an STA may collect to the following:

(R1) RSSI from the associated AP within its BSS.
(R2) RSSI from the other APs in OBSSs.
(S1) Statistics of its own previous transmissions.
(S2) Statistics of the transmissions by some other STAs.

Taking STA1 in Fig.
3.2 as an example, (R1) is the RSSI from AP 1 to STA 1, (R2) is the RSSI from AP 2 to STA 1, (S1) is the statistical information of its own TX PWR, and (S2) is the statistical information of STA 2, STA 3, and STA 4 at STA 1 if they can be heard by STA 1.

(R1) and (R2) can be collected and identified from BSS beacons, see, for example, [55, 52, 60]. (S1) and (S2) can be collected due to the broadcast nature of wireless signals: when a frame is sent out by a transmitter, all the nodes in the neighborhood can receive this frame, even if they are not the destination. These receiver nodes can then demodulate the frame and read the header and the payload size. Therefore, an STA can estimate the throughput of a neighbor by monitoring its frames and the corresponding acknowledgements.

We validate this mechanism in a real world setup: we conduct an experiment over an 802.11ac network with one AP and two STAs in a regular office room. One STA transmits frames to the AP continuously while the other STA operates in monitor mode and captures the frames as well as the ACKs from the AP using Wireshark, see Fig. 3.5 for one such captured frame. Then, the monitoring STA reads the header and the payload size of the captured frames and estimates the throughput of its neighbor. Note that the overhead of this mechanism is rather low in practice, and prior work has also assumed its usage, see, for example, [65].

Figure 3.5: The frame captured by Wireshark.

3.4 Conventional Transmit Power Control

As already mentioned, we refer to algorithms that require manual tuning to adjust to different environments as "conventional". In this section we briefly summarize a prior algorithm of this kind and introduce a novel one.

3.4.1 Prior Conventional Algorithms

Previous works on 802.11ax SR map the collected information to OBSS PD or TX PWR values via some functions. As an example, the so-called control OBSS PD sensitivity threshold (COST) algorithm was proposed in [55].
Its basic idea is to use the difference between the BSS beacon RSSI and the OBSS beacon RSSI ((R1) and (R2) in Section 3.3.5) to update a margin. Then the OBSS PD is calculated based on the updated margin, see [55] for more details.

3.4.2 Proposed Method - Binary Algorithm

We propose a threshold-based algorithm, which we term the binary algorithm. Similar to [55, 52, 60], the binary algorithm relies on beacon RSSI measurements ((R1) and (R2) in Section 3.3.5) to derive the proper TX PWR and OBSS PD values. The rationale of the binary algorithm is as follows: if the BSS beacon RSSI is much larger than the OBSS beacon RSSI, which indicates that the STA is close to its associated AP and far away from OBSS APs, then the STA can reduce its TX PWR and increase its OBSS PD to get a higher chance to transmit; otherwise, the STA is likely far away from its associated AP but close to OBSS APs, and thus it should increase its TX PWR and reduce its OBSS PD to increase the RSSI at its associated AP.

The binary algorithm sets up a threshold value $\mathit{Diff}_{thr}$ for the difference $\mathit{Diff}$ between the BSS beacon RSSI and the average OBSS beacon RSSI. An STA then selects its TX PWR value (and the corresponding OBSS PD) out of two predefined choices after comparing the difference with the threshold. Algorithm 4 demonstrates the procedure of the binary algorithm. Note that this algorithm is very simple to use, but its performance highly depends on the parameter settings, e.g., the threshold and the two TX PWR values. If the environment is given and the parameters are tuned properly, the algorithm can perform well, see Sections 3.6 and 3.7 for performance results.

Algorithm 4 Binary TPC
    $\mathit{Diff} = \mathit{BSS\ RSSI} - \mathit{OBSS\ RSSI}_{avg}$
    if $\mathit{Diff} > \mathit{Diff}_{thr}$ then
        $\mathit{TX\ PWR} = \mathit{TX\ PWR}_{low}$
    else
        $\mathit{TX\ PWR} = \mathit{TX\ PWR}_{high}$
    end if

3.5 Reinforcement Learning based Transmit Power Control

Reinforcement learning [63], an area of machine learning, is widely used for self-learning systems in an unknown environment.
It involves an agent and the environment as shown in Fig. 3.6. The agent iteratively learns the policy to pursue a specific goal, adapting to the environment. At episode $t$, the state of the environment is denoted by $s(t)$ and the agent takes action $a(t)$. The environment, influenced by the action, transitions to state $s(t+1)$ and provides a reward $r(t)$ to the agent. At the next episode, the agent observes the new state and takes a new action $a(t+1)$ based on the policy. In this way, the agent iteratively interacts with the environment and gradually adapts the policy to approach the goal.

Figure 3.6: Reinforcement learning. (The agent observes the state $s(t)$ and the reward $r(t)$ and takes action $a(t)$; the environment returns $s(t+1)$ and $r(t+1)$.)

We use RL to improve SR in 802.11ax. Suppose each action $a(t) = i \in A_T$ is associated with a probability distribution $D_i$ which is unknown to the STA at the beginning. The STA selects one action at each episode and observes the associated reward $r(t)$. The goal is to find the distribution with the highest expected reward and obtain as many rewards as possible. Each STA iteratively explores the environment and exploits its current knowledge to adjust its TX PWR and OBSS PD. Note that each STA may only partially observe the environment due to the limited wireless coverage and real world implementation constraints. Motivated by this, we formulate the problem as a multi-armed bandit (MAB) problem [8] to specify the policy used to select an action at each episode.

Traditionally, the MAB problem models a situation where a gambler sequentially pulls the arm of one of many slot machines in a casino, with the hope of maximizing the reward. In general, there are $K$ arms/actions available for a gambler/learner, and at each time $t \in \{1, \ldots, T\}$, the learner picks an action $a(t)$ while simultaneously the environment decides the reward $r(t)$, and the learner then suffers and observes the loss [8].
In our 802.11ax setting, an STA can explore different actions to get a better estimate of the distributions $\{D_i\}$, but exploring non-optimal actions causes a loss in the reward; or it can exploit the current estimate of the distributions $\{D_i\}$ to select the action with the highest expected reward, but the action may not be the optimal one due to the inaccurate estimate of $\{D_i\}$. Many MAB algorithms have been proposed to handle the tradeoff between exploration, where we gather more information that might lead us to better decisions in the future, and exploitation, where we make the best decision given the current information. In the following sections, we propose two RL-based algorithms that leverage this framework to balance the exploration-exploitation tradeoff in 802.11ax.

3.5.1 Reinforcement Learning Model

At episode $t$, STA $k$ takes action $a(t) = i$ by setting its TX PWR and OBSS PD to be $i$ dBm and $-61 - i$ dBm respectively (for example, 10 dBm and -71 dBm, as in Table 3.1). Then, it receives a reward $r_k(t)$. If each STA only considers its own throughput when setting its TX PWR and OBSS PD, this may not improve the overall throughput. To optimize the overall throughput rather than the throughput of each STA individually, we define $r_k(t)$ to be the sum of the throughput of STA $k$ and its neighboring STAs. An STA estimates this using statistics of its own previous transmissions and statistics of the transmissions by other stations within range ((S1) and (S2) in Section 3.3.5). Thus, the reward STA $k$ receives at episode $t$ is defined as

$$r_k(t) = R_k(t) + \tilde{R}_k(t), \tag{3.8}$$

where $R_k(t)$ is the throughput of STA $k$ at episode $t$ and $\tilde{R}_k(t)$ is the sum throughput of the neighboring STAs of STA $k$ at episode $t$:

$$\tilde{R}_k(t) = \sum_{j \in N(k)} R_j(t), \tag{3.9}$$

where $N(k)$ is the set of all the neighbors of STA $k$, i.e., of STAs within range, see prior work on neighborhood-based approaches for throughput maximization of CSMA networks [50].

3.5.2 UCB based TPC

UCB was proposed to minimize the expected regret [9].
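As an illustration of the kind of index policy that UCB uses, the following Python sketch tries every action once and then repeatedly picks the action with the largest sum of empirical mean reward and an exploration bonus of the form $c\sqrt{\ln t / n}$. It is a minimal sketch under stated assumptions, not the implementation used in the simulations: the function names, the three actions, the noisy reward model and the value of `c` are hypothetical, and in the TPC setting each action index would stand for one TX PWR level with its corresponding OBSS PD.

```python
import math
import random


def ucb_select(means, counts, t, c):
    """Pick the action with the largest empirical mean plus exploration bonus."""
    return max(
        range(len(means)),
        key=lambda i: means[i] + c * math.sqrt(math.log(t) / counts[i]),
    )


def run_ucb(reward_fn, num_actions, horizon, c=1.0):
    """Run the UCB loop for `horizon` episodes and return means and counts."""
    means = [0.0] * num_actions   # empirical mean reward per action
    counts = [0] * num_actions    # number of times each action was taken
    for t in range(1, horizon + 1):
        if t <= num_actions:
            action = t - 1        # initial round: take every action once
        else:
            action = ucb_select(means, counts, t, c)
        reward = reward_fn(action)
        counts[action] += 1
        # Incremental update of the running mean for the chosen action.
        means[action] += (reward - means[action]) / counts[action]
    return means, counts


# Hypothetical reward model: fixed mean per action plus small uniform noise.
rng = random.Random(0)
true_means = [0.2, 0.5, 0.8]
means, counts = run_ucb(
    lambda a: true_means[a] + rng.uniform(-0.1, 0.1), num_actions=3, horizon=2000
)
```

In this toy setting the action with the highest mean reward ends up being selected far more often than the others, while the remaining actions are still revisited occasionally because their bonus grows slowly with $\ln t$.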
UCB keeps the number of times that each action $i$ has been taken, denoted by $n_k^i(t)$, in addition to the empirical mean reward $\bar{r}_k^i(t)$. STA $k$ takes each action once initially and afterwards greedily selects the action

$$a_k(t) = \arg\max_{i \in A_T} \left( \bar{r}_k^i(t) + c \sqrt{\frac{\ln t}{n_k^i(t)}} \right), \tag{3.10}$$

where $c > 0$ controls the degree of exploration. The details of the algorithm are shown in Algorithm 5 below.

Algorithm 5 UCB based TPC
    for $t = 1$ to $|A_T|$ do
        Select action $a_k(t) = t$
    end for
    for $t > |A_T|$ do
        Select action $a_k(t) = i$ by Equation (3.10)
        Observe the reward $r_k^i(t)$
        $n_k^i(t+1) \leftarrow n_k^i(t) + 1$
        $\bar{r}_k^i(t+1) \leftarrow \frac{1}{n_k^i(t+1)} \sum_{\tau=1}^{t} r_k^i(\tau)$
        $t \leftarrow t + 1$
    end for

The idea of UCB is to balance the exploration-exploitation tradeoff by using the square-root term (i.e., the second term in Eq. 3.10) to measure the uncertainty in the estimated value of each action. Each time action $i$ is selected, $n_k^i$ increments and thus its uncertainty term decreases. If an action other than $i$ is selected, $t$ increases but $n_k^i$ does not, and the uncertainty term of action $i$ increases. This algorithm will eventually select all actions at least once. However, actions which have already been selected many times will be selected less frequently over time.

3.5.3 Reinforcement Comparison based TPC

STA $k$ maintains a distribution over actions $p_k^i(t)$, a set of preferences $\pi_k^i(t)$ and the average reward $\bar{r}_k(t)$. The probability $p_k^i(t)$ of taking an action is computed by comparing the empirical mean reward of that action, $\bar{r}_k^i(t)$, with the average reward $\bar{r}_k(t)$: $p_k^i(t)$ increases if $\bar{r}_k^i(t)$ is higher than $\bar{r}_k(t)$ and decreases otherwise.
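The comparison idea described above can be sketched in Python as follows: each action has a preference, actions are drawn from a Boltzmann (softmax) distribution over the preferences, and the preference of the chosen action moves up or down depending on whether its reward beats a running average. This is an illustrative sketch under stated assumptions rather than the simulation code: the function names, the deterministic three-action reward model, the learning rates `alpha` and `beta` and the seed are all hypothetical.

```python
import math
import random


def softmax(preferences):
    """Boltzmann distribution over actions, shifted by the max for stability."""
    m = max(preferences)
    exps = [math.exp(p - m) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def run_reinforcement_comparison(reward_fn, num_actions, horizon,
                                 alpha=0.1, beta=0.1, seed=0):
    """Return the final action probabilities and the running average reward."""
    rng = random.Random(seed)
    preferences = [0.0] * num_actions
    avg_reward = 0.0
    for _ in range(horizon):
        probs = softmax(preferences)
        action = rng.choices(range(num_actions), weights=probs)[0]
        reward = reward_fn(action)
        # Raise the preference if the reward beats the average, lower it otherwise.
        preferences[action] += beta * (reward - avg_reward)
        # Exponentially weighted running average of all observed rewards.
        avg_reward = (1 - alpha) * avg_reward + alpha * reward
    return softmax(preferences), avg_reward


# Hypothetical deterministic rewards for three actions.
true_means = [0.2, 0.5, 0.8]
probs, avg = run_reinforcement_comparison(lambda a: true_means[a], 3, 5000)
```

Because updates depend on the gap between a reward and the running average rather than on the absolute reward, an action that happened to look good early on stops gaining preference once the average catches up with it.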
At each episode $t$, the probability $p_k^i(t)$ is computed using a Boltzmann distribution based on the preferences $\pi_k^i(t)$, that is:

$$p_k^i(t) = \frac{e^{\pi_k^i(t)}}{\sum_{j \in A_T} e^{\pi_k^j(t)}}. \tag{3.11}$$

If action $i$ is selected at episode $t$ and a reward $r_k^i(t)$ is received, the preference $\pi_k^i(t)$ is updated as follows:

$$\pi_k^i(t+1) = \pi_k^i(t) + \beta \left( r_k^i(t) - \bar{r}_k(t) \right). \tag{3.12}$$

Also, at every turn, the mean of the rewards is updated as:

$$\bar{r}_k(t+1) = (1 - \alpha)\, \bar{r}_k(t) + \alpha\, r_k^i(t), \tag{3.13}$$

where $\alpha$ and $\beta$ are learning rates between 0 and 1. The complete algorithm is shown in pseudocode in Algorithm 6 below.

Algorithm 6 Reinforcement comparison based TPC
    while true do
        Select action $a_k(t) = i$ with probability $p_k^i(t)$ defined by Equation (3.11)
        Observe the reward $r_k^i(t)$
        Update $\pi_k^i(t+1)$ and $\bar{r}_k(t+1)$ by Equations (3.12) and (3.13) respectively
        $t \leftarrow t + 1$
    end while

The idea of reinforcement comparison is as follows. Instead of using the absolute reward value as the metric to select an action, we use the difference between the reward of this action and the average reward we have received so far to evaluate the benefit of using this action. We then use this benefit evaluation result to update the preference and the probability of taking this action (see Eq. 3.11 and 3.12). This can avoid the case where some actions with higher reward values at the beginning dominate the following action selection process, and it gives the other actions more opportunities to be explored.

3.6 MATLAB Simulation

The performance of the TPC algorithms (COST, binary, reinforcement comparison and UCB) is evaluated in MATLAB simulations. The baseline is no spatial reuse (no SR), where the OBSS PD and TX PWR of all STAs are set to -82 dBm and 21 dBm respectively. Consider the uplink transmissions of 4 BSSs working on the same channel, each of which consists of 1 AP and $N_s$ STAs. The APs and STAs are equipped with 1 antenna.
The STAs are uniformly distributed within a circle around their associated APs, with the distance between the AP and its STAs between 3 m and $r_B$ m, as shown in Fig. 3.7. The value of $r_B$ can be changed to control the typical distance of STAs around their associated APs. The wireless channel is assumed to follow the WINNER II model [38]. We choose the indoor office (A1) non-line-of-sight (NLOS) scenario for the simulations. According to this WINNER II model, the path loss $PL$ equals

$$PL = A \log_{10} d + B + C \log_{10}(f_c / 5) + X, \tag{3.14}$$

where $3 < d < 100$ is the distance in m between a transmitter and a receiver, $f_c$ is the carrier frequency in GHz, and $A$, $B$, $C$ and $X$ are scenario-dependent parameters which can be found in [38].

For each scenario, we keep the APs fixed and create different topologies by randomly distributing the STAs. When calculating the throughput of the system, the algorithm in [13] is used to find the MIS for CSMA modeling, and the rate adaptation is assumed to be optimal, i.e., the STAs can select the optimal MCS based on the SINR at the AP side [27]. The simulation parameters are summarized in Table 3.2.

Figure 3.7: The topology of simulations. (a) $r_B = 6$ m. (b) $r_B = 10$ m.

Table 3.2: MATLAB simulation parameters

Parameter                       Value
Number of BSSs $N_b$            4
Number of STAs per BSS $N_s$    6, 8, 10
Radius of BSS $r_B$             8, 10, 12 m
Frequency band                  5.22 GHz
Bandwidth                       20 MHz
A, B, C                         36.8, 43.8, 20

In addition to the overall throughput, we are also interested in how fairly the throughput is shared among STAs, and we measure this by Jain's fairness index [37]

$$F = \frac{\left( \sum_{k=1}^{N_s} r_k \right)^2}{N_s \sum_{k=1}^{N_s} r_k^2}. \tag{3.15}$$

3.6.1 Comparison of TPC Algorithms in Static Environments

The performance of the TPC algorithms is first evaluated under static scenarios where all STAs are fixed. We vary $N_s$ and set $r_B = 8$ m. As shown in Fig. 3.8, SR can greatly improve the overall throughput and fairness. In Fig.
3.8(a), the overall throughput of SR schemes decreases slightly as $N_s$ grows, while the overall throughput of no SR is stable as $N_s$ grows. The COST and binary algorithms perform well and outperform reinforcement comparison, as their parameters are tuned manually for the topology. UCB achieves the best performance and provides a gain of 45 to 68 Mbps in overall throughput compared with no SR. Note that one reason why reinforcement comparison does not perform as well here is that the actions do not have similar rewards, whereas the method is well known to work best when actions have similar rewards [63].

With respect to fairness, RL based algorithms are not as fair as binary and COST for the following reason. In the binary and COST algorithms all STAs set their TX PWR and OBSS PD without considering other STAs, while in RL based algorithms an STA takes into account the throughput of its neighborhood and may lower its own TX PWR if this may improve the overall throughput of its neighborhood. Hence, some STAs end up having less throughput than others. This is common when the objective function is the overall throughput, and there are well-known methods, like weighted sum rate optimization, which can also take fairness into account. In summary though, RL based algorithms can greatly improve the overall throughput while maintaining good fairness, close to that of conventional TPC algorithms.

3.6.2 Comparison of TPC Algorithms in Dynamic Environments

In practice, an 802.11ax system is a dynamic environment, since STAs are mobile and thus the path loss between any pair of nodes changes over time. Therefore, the STAs need to adjust their OBSS PD and TX PWR to adapt to this dynamic environment.
We consider a scenario where the STAs are initially uniformly distributed around their associated APs within $r_B = 10$ m, and then move around randomly with speed 0.25 m/s (a typical speed for so-called nomadic users, which is the type of users found in Wi-Fi networks, see, for example, [22]) while the APs' locations are fixed. The conventional TPC algorithms, binary and COST, use constant values for their parameters; in particular, we use in this scenario the optimized values we computed for the (static) scenario of Subsection 3.6.1.

Figure 3.8: The performance with different numbers of static STAs per BSS. (a) Overall throughput. (b) Fairness.

Note that it is impossible in practice to keep on selecting good parameter values for binary and COST in dynamic scenarios, as these algorithms have a very large state space that needs to be explored, there are multiple STAs that would need to do so quite often, and the environment changes at faster timescales than one would need to search the state space to optimize the parameter values. In contrast, RL based algorithms are designed to track environmental changes with little overhead. As shown in Fig. 3.9, the result is that RL based algorithms can adapt to the dynamic environment well and provide a sizable gain over conventional algorithms. Note that reinforcement comparison performs as well as UCB in this scenario, which can be explained by the fact that STA mobility makes the rewards of different actions similar over time due to averaging, which is the scenario where the scheme is known to do well, as already mentioned.
Last, note that even if we continually re-optimize the parameters of binary and COST offline to track the environmental changes, which would be impossible in practice but is doable in the context of an offline simulation, their performance gains are modest (less than 5%) and their overall throughput remains well below that of the RL based algorithms.

Figure 3.9: The performance under a dynamic environment with mobile STAs. (a) Snapshot: overall throughput (Mbps) versus time (s). (b) Overall throughput (Mbps) versus number of STAs per BSS $N_s$.

3.6.3 Impact of the Path Loss Exponent

The impact of the path loss exponent $A$ on SR performance under an environment where all STAs are static is shown in Fig. 3.10, where the path loss exponent $A$ ranges from 32 to 38, $N_s = 8$ and $r_B = 8$ m. As $A$ grows, the overall throughput of all schemes increases, since the co-channel interference between BSSs is lower and STAs have more opportunities to transmit.

Figure 3.10: The overall throughput with different path loss exponents.

3.6.4 Impact of the Distribution of STAs around the APs

The distribution of the STAs also influences the RSSI between nodes, and thus the overall throughput of different SR approaches. To show this, we change the value of $r_B$ to vary the typical distance of STAs around their associated APs in Fig. 3.7. As the value of $r_B$ grows, STAs tend to be further away from their associated APs but closer to OBSS APs. The RSSI at the associated AP decreases while the interference to OBSS APs is higher.

Figure 3.11: The overall throughput with different average distances from APs.
Besides, the STAs have to use a lower TX PWR to increase the OBSS PD due to the higher OBSS interference, otherwise SR would be blocked. Therefore, the overall throughput in Fig. 3.11 decreases as $r_B$ grows, especially for SR schemes. The binary and COST algorithms perform relatively worse when $r_B$ is larger, since their parameters are fixed and cannot adapt well to different topologies.

3.7 NS-3 Simulation

We also use ns-3 [1] to evaluate the performance of the TPC algorithms. The topology is the same as in Fig. 3.7 of Section 3.6 and we use the parameters shown in Table 3.3. The STAs send UDP packets to their associated APs in a saturated manner. The downlink traffic is disabled. Each BSS is assigned a unique BSS color; therefore, all STAs can identify intra-BSS frames and inter-BSS frames correctly. MCS 0 (BPSK, 1/2), 3 (16-QAM, 1/2) and 5 (64-QAM, 2/3) are used to study the TPC algorithms under different PHY rates, since ns-3 does not support the rate adaptation which we used in the MATLAB simulations.

Table 3.3: NS-3 simulation parameters

Parameter                       Value
Number of BSSs $N_b$            4
Number of STAs per BSS $N_s$    6, 8, 10
Radius of BSS $r_B$             8 m
Frequency band                  5.22 GHz
Bandwidth                       20 MHz
Guard interval                  800 ns
MCS                             0, 3, 5
RTS/CTS                         disabled
Traffic type                    uplink UDP
STA TX PWR max                  21 dBm
STA TX PWR min                  1 dBm
AP TX PWR                       21 dBm
UDP packet size                 1472 bytes

First, note that the overall throughput is lower than that in the MATLAB simulations, as the STAs are limited to MCS 0, 3 or 5 (rather than using any available MCS option). Second, note that the conventional TPC algorithms achieve similar performance to no SR, as they fail to adapt their parameters to the fixed MCS case. In particular, because they adjust the TX PWR and OBSS PD based on the RSSI of the beacon from the APs, sometimes the corresponding SINR values are below the threshold to transmit with the specified MCS.
RL TPC algorithms, on the other hand, maintain good performance as in the MATLAB simulations, showing their ability to adapt to the fixed MCS, especially reinforcement comparison, whose performance is close to that of UCB. Last, note that the confidence intervals for the MCS 5 results are larger than they are for the lower MCS cases. This is because, without rate adaptation, the higher the MCS, the larger the gap between the rate achieved with that MCS and the zero rate obtained when the resulting SINR falls below the transmission threshold [49].

Figure 3.12: The ns-3 simulation results. (a) MCS 0. (b) MCS 3. (c) MCS 5.

3.8 Conclusion

The overall throughput of 802.11ax networks can be improved by SR if TX PWR and OBSS PD are set properly. Conventional TPC algorithms can improve the overall throughput if their parameters are tuned properly for a specific environment, but these algorithms fail to adapt to different environments. RL based TPC algorithms, on the other hand, can exploit their current knowledge of the environment and adapt to different environments, thus providing better performance. As shown in Sections 3.6 and 3.7, RL based algorithms can adapt to dynamic environments with mobile STAs, different STA distances from APs, different channel models and different rate adaptation approaches.

Chapter 4: Conclusion

In this dissertation, two new features of 802.11ax, OFDMA and BSS coloring, are investigated. First, a divide and conquer based recursive scheduling algorithm is proposed to optimize the 802.11ax SRA problem, which splits the channel into RUs and allocates a user or user group to each RU.
The performance is evaluated in both MATLAB simulations and experiments on SDR. Compared with other algorithms, the recursive scheduling algorithm is efficient and near-optimal in various environments.

Then reinforcement learning-based algorithms are applied to solve the 802.11ax TPC problem. The TPC problem is modeled as a MAB problem due to the lack of coordination between STAs. As shown in the MATLAB and ns-3 simulations, MAB based TPC algorithms outperform conventional TPC algorithms in both static and dynamic environments, since they can explore the environment and adjust their policy based on the knowledge obtained.

References

[1] ns-3 Network Simulator. https://www.nsnam.org/.
[2] WARP Project. http://warpproject.org.
[3] WARPLab. http://warpproject.org/trac/wiki/WARPLab/Examples/MIMO_OFDM.
[4] MATLAB R2017a Communications System Toolbox 6.4. The MathWorks, Inc., Natick, Massachusetts, 2017.
[5] M. S. Afaqui, E. Garcia-Villegas, and E. Lopez-Aguilera. IEEE 802.11ax: Challenges and Requirements for Future High Efficiency WiFi. IEEE Wireless Communications, 24(3):130-137, June 2017.
[6] Matthew Andrews, Krishnan Kumaran, Kavita Ramanan, Alexander Stolyar, Phil Whiting, and Rajiv Vijayakumar. Providing Quality of Service over a Shared Wireless Link. IEEE Communications Magazine, 39(2):150-154, February 2001.
[7] Mustafa Y. Arslan, Jongwon Yoon, Karthikeyan Sundaresan, Srikanth V. Krishnamurthy, and Suman Banerjee. A Resource Management System for Interference Mitigation in Enterprise OFDMA Femtocells. IEEE/ACM Transactions on Networking, 21(5):1447-1460, October 2013.
[8] P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire. The Nonstochastic Multiarmed Bandit Problem. SIAM Journal on Computing, 32(1):48-77, 2002.
[9] Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2):235-256, May 2002.
[10] Boris Bellalta. IEEE 802.11ax: High-efficiency WLANs.
IEEE Wireless Communications, 23(1):38-46, February 2016.
[11] Boris Bellalta and Katarzyna Kosek-Szott. AP-initiated Multi-user Transmissions in IEEE 802.11ax WLANs. Ad Hoc Networks, 85:145-159, 2019.
[12] Saad Biaz and Shaoen Wu. Rate Adaptation Algorithms for IEEE 802.11 networks: A Survey and Comparison. In 2008 IEEE Symposium on Computers and Communications, pages 130-136, July 2008.
[13] Coen Bron and Joep Kerbosch. Algorithm 457: Finding All Cliques of an Undirected Graph. Communications of the ACM, 16(9):575-577, September 1973.
[14] Eduardo Castañeda, Adão Silva, Atílio Gameiro, and Marios Kountouris. An Overview on Resource Allocation Techniques for Multi-User MIMO Systems. IEEE Communications Surveys Tutorials, 19(1):239-284, Firstquarter 2017.
[15] Peter W.C. Chan and Roger S. Cheng. Capacity Maximization for Zero-Forcing MIMO-OFDMA Downlink Systems with Multiuser Diversity. IEEE Transactions on Wireless Communications, 6(5):1880-1889, May 2007.
[16] Michele Chincoli and Antonio Liotta. Self-Learning Power Control in Wireless Sensor Networks. Sensors, 18(2), 2018.
[17] D. Deng, Y. Lin, X. Yang, J. Zhu, Y. Li, J. Luo, and K. Chen. IEEE 802.11ax: Highly Efficient WLANs for Intelligent Information Infrastructure. IEEE Communications Magazine, 55(12):52-59, December 2017.
[18] Ahmad M. El-Hajj, Zaher Dawy, and Walid Saad. A Stable Matching Game for Joint Uplink/downlink Resource Allocation in OFDMA Wireless Networks. In 2012 IEEE International Conference on Communications (ICC), pages 5354-5359, June 2012.
[19] Jiancun Fan, Geoffrey Ye Li, Qinye Yin, Bingguang Peng, and Xiaolong Zhu. Joint User Pairing and Resource Allocation for LTE Uplink Transmission. IEEE Transactions on Wireless Communications, 11(8):2838-2847, August 2012.
[20] Guillem Femenias and Felip Riera-Palou. Scheduling and Resource Allocation in Downlink Multiuser MIMO-OFDMA Systems. IEEE Transactions on Communications, 64(5):2019-2034, May 2016.
[21] M. Garetto, T.
Salonidis, and E. W. Knightly. Modeling Per-Flow Throughput and Capturing Starvation in CSMA Multi-Hop Wireless Networks. IEEE/ACM Transactions on Networking, 16(4):864-877, August 2008.
[22] Matthew Gast. 802.11ac: A Survival Guide. O'Reilly Media, Inc., 1st edition, 2013.
[23] E. Ghadimi, F. Davide Calabrese, G. Peters, and P. Soldati. A Reinforcement Learning Approach to Power Control and Rate Adaptation in Cellular Networks. In 2017 IEEE International Conference on Communications (ICC), pages 1-7, May 2017.
[24] Zhangyu Guan, Tommaso Melodia, Dongfeng Yuan, and Dimitris A. Pados. Distributed Resource Management for Cognitive Ad Hoc Networks with Cooperative Relays. IEEE/ACM Transactions on Networking, 24(3):1675-1689, June 2016.
[25] Wei Guo, Jiancun Fan, Geoffrey Ye Li, Qinye Yin, and Xiaolong Zhu. Adaptive SU/MU-MIMO Scheduling Schemes for LTE-A Downlink Transmission. IET Communications, 11(6):783-792, 2017.
[26] Shengchun Huang, Hao Yin, Jiangxing Wu, and Victor C. M. Leung. User Selection for Multiuser MIMO Downlink With Zero-Forcing Beamforming. IEEE Transactions on Vehicular Technology, 62(7):3084-3097, September 2013.
[27] IEEE Standards Association. P802.11ax/D4.1 IEEE Draft Standard for Information Technology - Telecommunications and Information Exchange Between Systems Local and Metropolitan Area Networks - Specific Requirements Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications Amendment Enhancements for High Efficiency WLAN. IEEE P802.11ax/D4.1, pages 1-754, April 2019.
[28] Takeshi Itagaki, Yuichi Morioka, Masahito Mori, Koichi Ishihara, Shoko Shinohara, and Yasuhiko Inoue. Performance Analysis of BSS Color and DSC. Presentation doc. IEEE802.11-15/0045r0, pages 1-14, January 2015.
[29] Tobias Lindstrøm Jensen, Shashi Kant, Joachim Wehinger, and Bernard Henri Fleury. Fast Link Adaptation for MIMO OFDM. IEEE Transactions on Vehicular Technology, 59(8):3766-3778, October 2010.
[30] L.
Jiang and J. Walrand. A Distributed CSMA Algorithm for Throughput and Utility Maximization in Wireless Networks. IEEE/ACM Transactions on Networking, 18(3):960-972, June 2010.
[31] Zhefeng Jiang, Shiwen Mao, and Xin Wang. Dynamic Downlink Resource Allocation and Access Strategy for Femtocell Networks. Transactions on Emerging Telecommunications Technologies, 28(9):e3151-n/a, 2017.
[32] Mehmet Karaca, Saeed Bastani, Basuki Endah Priyanto, Mohammadhassan Safavi, and Björn Landfeldt. Resource Management for OFDMA based Next Generation 802.11 WLANs. In 2016 9th IFIP Wireless and Mobile Networking Conference (WMNC), pages 57-64, July 2016.
[33] E. Khorov, A. Kiryanov, A. Lyakhov, and G. Bianchi. A Tutorial on IEEE 802.11ax High Efficiency WLANs. IEEE Communications Surveys Tutorials, 21(1):197-216, Firstquarter 2019.
[34] Gibum Kim, Jinwoo Kim, Hyungsik Han, and Hyuncheol Park. A Low Complexity User Selection Scheme in Downlink MU-MIMO System. In 2014 IEEE International Wireless Symposium (IWS 2014), pages 1-4, March 2014.
[35] Keunyoung Kim, Youngnam Han, and Seong-Lyun Kim. Joint subcarrier and power allocation in uplink OFDMA systems. IEEE Communications Letters, 9(6):526-528, June 2005.
[36] Mari Kobayashi and Giuseppe Caire. Joint Beamforming and Scheduling for a Multi-Antenna Downlink with Imperfect Transmitter Channel Knowledge. IEEE Journal on Selected Areas in Communications, 25(7):1468-1477, September 2007.
[37] Mari Kobayashi and Giuseppe Caire. Joint beamforming and scheduling for a multi-antenna downlink with imperfect transmitter channel knowledge. IEEE Journal on Selected Areas in Communications, 25(7):1468-1477, September 2007.
[38] Pekka Kyösti, Juha Meinilä, Lassi Hentilä, Xiongwen Zhao, Tommi Jämsä, Christian Schneider, Milan Narandžić, Marko Milojević, Aihua Hong, Juha Ylitalo, Veli-Matti Holappa, Mikko Alatossava, Robert Bultitude, Yvo de Jong, and Terhi Rautiainen. WINNER II Channel Models. D1.1.2 V1.2.
IST-4-027756 WINNER II, September 2007.
[39] Chi-Han Lin, Yi-Ting Chen, Kate Ching-Ju Lin, and Wen-Tsuen Chen. FDoF: Enhancing Channel Utilization for 802.11ac. IEEE/ACM Transactions on Networking, 26(1):465-477, Feb 2018.
[40] Tarcisio F. Maciel and Anja Klein. On the Performance, Complexity, and Fairness of Suboptimal Resource Allocation for Multiuser MIMO-OFDMA Systems. IEEE Transactions on Vehicular Technology, 59(1):406-419, January 2010.
[41] Arjun Malhotra, Mukulika Maity, and Avik Dutta. How Much Can We Reuse? An Empirical Analysis of the Performance Benefits Achieved by Spatial-reuse of IEEE 802.11ax. In 2019 11th International Conference on Communication Systems Networks (COMSNETS), pages 432-435, January 2019.
[42] Gabriel Martorell, Felip Riera-Palou, and Guillem Femenias. Cross-Layer Fast Link Adaptation for MIMO-OFDM Based WLANs. Wireless Personal Communications, 56(3):599-609, February 2011.
[43] Juha Meinilä, Pekka Kyösti, Tommi Jämsä, and Lassi Hentilä. WINNER II Channel Models, pages 39-92. John Wiley & Sons, Ltd, 2009.
[44] Antonios Michaloliakos, Ryan Rogalin, Yonglong Zhang, Konstantinos Psounis, and Giuseppe Caire. Performance Modeling of Next-Generation WiFi Networks. Computer Networks, 105(Supplement C):150-165, 2016.
[45] Antonios Michaloliakos, Ryan Rogalin, Yonglong Zhang, Konstantinos Psounis, and Giuseppe Caire. Performance Modeling of Next-generation WiFi networks. Computer Networks, 105:150-165, 2016.
[46] J. Mvulla, Y. Kim, and E. Park. Probe/preack: A joint solution for mitigating hidden and exposed node problems and enhancing spatial reuse in dense wlans. IEEE Access, 6:55171-55185, 2018.
[47] Cho Yiu Ng and Chi Wan Sung. Low complexity subcarrier and power allocation for utility maximization in uplink OFDMA systems. IEEE Transactions on Wireless Communications, 7(5):1667-1675, May 2008.
[48] Hassan Aboubakr Omar, Khadige Abboud, Nan Cheng, Kamal Rahimi Malekshan, Amila Tharaperiya Gamage, and Weihua Zhuang.
A Survey on High Eciency Wireless Local Area Networks: Next Generation WiFi. IEEE Communications Surveys Tutorials, 18(4):2315{2344, Fourthquarter 2016. [49] Guangyu Pei and Thomas R Henderson. Validation of OFDM Error Rate Model in ns-3. Boeing Research Technology, pages 1{15, 2010. [50] Sumit Rangwala, Apoorva Jindal, Ki-Young Jang, Konstantinos Psounis, and Ramesh Govindan. Neighborhood-centric congestion control for multihop wireless mesh networks. IEEE/ACM Transactions on Networking, 19(6):1797{1810, December 2011. [51] D. Srinivasa Rao and V. Berlin Hency. QoS based Radio Resource Management Techniques for Next Generation MU-MIMO WLANs: A Survey. Journal of Telecommunication, Elec- tronic and Computer Engineering (JTEC), 8(1):97{105, 2016. [52] T. Ropitault. Evaluation of RTOT algorithm: A rst implementation of OBSS PD-based SR method for IEEE 802.11ax. In 2018 15th IEEE Annual Consumer Communications Networking Conference (CCNC), pages 1{7, Janurary 2018. [53] T. Ropitault and N. Golmie. ETP algorithm: Increasing Spatial Reuse in Wireless LANs Dense Environment using ETX. In 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), pages 1{7, October 2017. [54] I. Selinis, M. Filo, S. Vahid, J. Rodriguez, and R. Tafazolli. Evaluation of the DSC Algo- rithm and the BSS Color Scheme in Dense Cellular-like IEEE 802.11ax Deployments. In 2016 IEEE 27th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), pages 1{7, September 2016. [55] I. Selinis, K. Katsaros, S. Vahid, and R. Tafazolli. Control OBSS/PD Sensitivity Threshold for IEEE 802.11ax BSS Color. In 2018 IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), pages 1{7, September 2018. 79 [56] Ioannis Selinis, Konstantinos Katsaros, Seiamak Vahid, and Rahim Tafazolli. Exploiting the Capture Eect on DSC and BSS Color in Dense IEEE 802.11Ax Deployments. 
In Proceedings of the Workshop on Ns-3, WNS3 '17, pages 47{54, New York, NY, USA, 2017. ACM. [57] G okhan Se cinti, M uge Erel Oz cevik, Kaushik R. Chowdhury, and Berk Canberk. Dynamic Power Adjustment and Resource Allocation Framework for LTE Networks. In 2016 IEEE 21st International Workshop on Computer Aided Modelling and Design of Communication Links and Networks (CAMAD), pages 122{127, October 2016. [58] Mohit K. Sharma, Alessio Zappone, Mohamad Assaad, Merouane Debbah, and Spyridon Vassilaras. Multi-agent Deep Reinforcement Learning based Power Control for Large Energy Harvesting Networks. In Proc. 17th Int. Symp. on Modeling and Optim. in Mobile, Ad Hoc, and Wireless Networks (WiOpt), WiOpt '19. IEEE, 2019. [59] Wei-Liang Shen, Kate Ching-Ju Lin, Ming-Syan Chen, and Kun Tan. SIEVE: Scalable User Grouping for Large MU-MIMO Systems. In 2015 IEEE Conference on Computer Commu- nications (INFOCOM), pages 1975{1983, April 2015. [60] Graham Smith. Dynamic Sensitivity Control V2. IEEE 802.11ax IEEE 802.11{13/1012r4, pages 1{29, 2013. [61] Robert Stacey. Proposed TGax draft specication. IEEE 802.11-16/0024r1, March 2016. [62] Sanjib Sur, Ioannis Pefkianakis, Xinyu Zhang, and Kyu-Han Kim. Practical MU-MIMO User Selection on 802.11ac Commodity Networks. In Proceedings of the 22Nd Annual International Conference on Mobile Computing and Networking, MobiCom '16, pages 122{134, New York, NY, USA, 2016. ACM. [63] Richard S Sutton and Andrew G Barto. Reinforcement learning: An introduction. MIT Press, 2018. [64] Kaidong Wang and Konstantinos Psounis. Scheduling and Resource Allocation in 802.11ax. In 2018 IEEE Conference on Computer Communications (INFOCOM), April 2018. [65] Francesc Wilhelmi, Sergio Barrachina-Mu~ noz, Boris Bellalta, Cristina Cano, Anders Jonsson, and Gergely Neu. Potential and pitfalls of Multi-Armed Bandits for decentralized Spatial Reuse in WLANs. Journal of Network and Computer Applications, 127:26{42, 2019. 
[66] Francesc Wilhelmi, Cristina Cano, Gergely Neu, Boris Bellalta, Anders Jonsson, and Sergio Barrachina-Mu~ noz. Collaborative Spatial Reuse in wireless networks via selsh Multi-Armed Bandits. Ad Hoc Networks, 88:129 { 141, 2019. [67] Ian C. Wong, Oghenekome Oteri, and Wes Mccoy. Optimal Resource Allocation in Uplink SC-FDMA Systems. IEEE Transactions on Wireless Communications, 8(5):2161{2165, May 2009. [68] Starsky H.Y. Wong, Hao Yang, Songwu Lu, and Vaduvur Bharghavan. Robust Rate Adap- tation for 802.11 Wireless Networks. In Proceedings of the 12th Annual International Con- ference on Mobile Computing and Networking, MobiCom 2006, pages 146{157, New York, NY, USA, 2006. ACM. [69] Xiao Xiao, Xiaoming Tao, and Jianhua Lu. Energy-Ecient Resource Allocation in LTE- based MIMO-OFDMA Systems with User Rate Constraints. IEEE Transactions on Vehicular Technology, 64(1):185{197, January 2015. 80 [70] Yuan Yan, Bo Li, Mao Yang, and Zhongjiang Yan. ESR: Enhanced Spatial Reuse Mechanism for the Next Generation WLAN - IEEE 802.11ax. In Bo Li, Mao Yang, Hui Yuan, and Zhongjiang Yan, editors, IoT as a Service, pages 265{274, Cham, 2019. Springer International Publishing. [71] Taesang Yoo and A. Goldsmith. On the Optimality of Multiantenna Broadcast Scheduling Using Zero-Forcing Beamforming. IEEE Journal on Selected Areas in Communications, 24(3):528{541, March 2006. 81
Abstract
Legacy 802.11 standards improved the throughput of Wi-Fi networks through techniques such as MIMO, frame aggregation, higher-order modulation and coding schemes (MCS), and shorter guard intervals. However, the multiple access method remained unchanged: a node listens to the channel before transmitting. If the channel is busy, the node waits for a random period of time; if the channel is clear, it transmits using the whole bandwidth. This can be inefficient because of the exposed node problem, frequency-selective channels, and limited payload sizes, especially in dense deployments.

The 802.11ax standard, also known as High Efficiency (HE) wireless, aims to improve the spectrum efficiency of 802.11 networks, especially in dense deployments. It introduces several new features for this purpose, including OFDMA, BSS coloring, and uplink MU-MIMO. The performance of these features depends heavily on how they are implemented, as shown in the following chapters. The goal of my work is to propose algorithms that optimize the implementation of these new features and thereby improve the spectrum efficiency of 802.11ax networks.

In Chapter 2, a recursive scheduling algorithm based on divide and conquer is proposed to improve OFDMA scheduling for downlink MU transmission. Both simulation and experimental results show that the recursive scheduling algorithm can optimize OFDMA scheduling using the CSI feedback from STAs. In Chapter 3, the transmit power control (TPC) of STAs is modeled as a multi-armed bandit (MAB) problem, an approach that learns the optimal policy from the environment and exploits this knowledge to achieve a higher reward. The simulation results show that it outperforms conventional TPC and can adapt to various environments.
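As a rough illustration of how transmit power control can be framed as a multi-armed bandit, the sketch below treats each candidate power level as an arm and uses a simple epsilon-greedy rule. This is not the algorithm from the dissertation: the epsilon-greedy policy, the function names, the power levels, and the toy reward are all assumptions chosen only to show the explore/exploit idea.

```python
import random


def epsilon_greedy_tpc(power_levels_dbm, reward_fn, rounds=2000, epsilon=0.1, seed=0):
    """Choose a transmit power with an epsilon-greedy bandit (illustrative only).

    power_levels_dbm: candidate power levels, one per arm (hypothetical values).
    reward_fn(power_dbm, rng): returns the observed reward for one transmission.
    """
    rng = random.Random(seed)
    n_arms = len(power_levels_dbm)
    counts = [0] * n_arms    # number of times each arm was played
    means = [0.0] * n_arms   # running average reward per arm

    for t in range(rounds):
        if t < n_arms:
            arm = t                                        # play every arm once
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)                    # explore
        else:
            arm = max(range(n_arms), key=means.__getitem__)  # exploit

        reward = reward_fn(power_levels_dbm[arm], rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean

    best = max(range(n_arms), key=means.__getitem__)
    return power_levels_dbm[best], means, counts


def toy_reward(power_dbm, rng):
    # Hypothetical reward that peaks at 14 dBm: too little power lowers the
    # link rate, too much power hurts neighbours. Gaussian noise is added.
    return 1.0 - abs(power_dbm - 14) / 20.0 + rng.gauss(0.0, 0.05)


if __name__ == "__main__":
    levels = [5, 8, 11, 14, 17, 20]
    best, means, counts = epsilon_greedy_tpc(levels, toy_reward)
    print("selected power (dBm):", best)
    print("plays per level:", dict(zip(levels, counts)))
```

In a real deployment the reward would come from measured quantities such as throughput or packet delivery, and a non-stationary environment would likely need a forgetting mechanism rather than a plain running average.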
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Optimal distributed algorithms for scheduling and load balancing in wireless networks
Enabling massive distributed MIMO for small cell networks
Optimal resource allocation and cross-layer control in cognitive and cooperative wireless networks
Ultra-wideband multistatic and MIMO software defined radar sensor networks
Achieving efficient MU-MIMO and indoor localization via switched-beam antennas
Algorithmic aspects of energy efficient transmission in multihop cooperative wireless networks
Design and analysis of reduced complexity transceivers for massive MIMO and UWB systems
Multidimensional characterization of propagation channels for next-generation wireless and localization systems
Efficient data collection in wireless sensor networks: modeling and algorithms
Improving network security through cyber-insurance
Scheduling and resource allocation with incomplete information in wireless networks
Signal processing for channel sounding: parameter estimation and calibration
Efficient delivery of augmented information services over distributed computing networks
Intelligent near-optimal resource allocation and sharing for self-reconfigurable robotic and other networks
Large system analysis of multi-cell MIMO downlink: fairness scheduling and inter-cell cooperation
Performant, scalable, and efficient deployment of network function virtualization
Optimizing task assignment for collaborative computing over heterogeneous network devices
Understanding the characteristics of Internet traffic dynamics in wired and wireless networks
Optimizing privacy-utility trade-offs in AI-enabled network applications
Real-time channel sounder designs for millimeter-wave and ultra-wideband communications
Asset Metadata
Creator
Wang, Kaidong (author)
Core Title
Improving spectrum efficiency of 802.11ax networks
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
12/02/2019
Defense Date
10/15/2019
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
802.11ax, BSS coloring, MU-MIMO, OAI-PMH Harvest, OFDMA, scheduling and resource allocation, spatial reuse, transmit power control
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Psounis, Konstantinos (committee chair), Chugg, Keith Michael (committee member), Golubchik, Leana (committee member), Molisch, Andreas (committee member), Silvester, John (committee member)
Creator Email
kaidongw@usc.edu,kaidongwang23@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-240794
Unique identifier
UC11673443
Identifier
etd-WangKaidon-7967.pdf (filename),usctheses-c89-240794 (legacy record id)
Legacy Identifier
etd-WangKaidon-7967.pdf
Dmrecord
240794
Document Type
Dissertation
Rights
Wang, Kaidong
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA