Improve Cellular Performance with Minimal Infrastructure Changes
by
Xing Xu
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
December 2017
Copyright 2017 Xing Xu
Acknowledgements
First of all, I would like to express my deepest appreciation to Professor Ramesh Govindan. He is an
excellent academic advisor. All of my work in this dissertation would not have been possible without his
insightful guidance. He is also a helpful career mentor and a respected role model. Throughout my
pursuit of the Ph.D. under his supervision, I have become not only a good researcher and computer scientist,
but also a better person.
Besides my advisor, I would like to thank my dissertation committee, including Professor Antonio
Ortega and Professor Wyatt Lloyd, for their comments and suggestions on improving the quality of this
dissertation.
It is my honor to have the opportunity to collaborate with many prestigious researchers and I really
appreciate their contributions. Throughout this dissertation, I have collaborated with Ajay Mahimkar,
N.K. Shankaranarayanan, Jia Wang, Zihui Ge, Ioannis Broustis from AT&T Labs Research and Professor
Minlan Yu from Yale University. Although not included in this dissertation, I have worked with Professor
Ethan Katz-Bassett, Professor Antonio Ortega and Professor Wyatt Lloyd from the University of Southern
California, Professor David Choffnes from Northeastern University, Professor Tarek Abdelzaher from
the University of Illinois at Urbana-Champaign, Professor Amotz Bar-Noy from the City University of New York,
Bolian Yin, Ben Greenstein and Matt Welsh from Google, and my lab members Yurong Jiang, Tobias
Flach, Zahaib Akhtar, and Bin Liu.
Finally, I would like to thank my family for their sincere and selfless support, especially my beloved wife
Kaixi Xu, my parents Jianjun Xu and Shujun Li, and my lovely son, Eric Xu. Without their support, this
dissertation would not exist.
Table of Contents
Acknowledgements ii
List Of Tables v
List Of Figures vi
Abstract ix
Chapter 1: Introduction 1
1.1 Dissertation Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.1 Minimizing Service Disruption during Network Upgrades (Chapter 2) . . . . . . . 4
1.1.2 A Premium Service for Better Video QoE Performance (Chapter 3) . . . . . . . . 4
1.1.3 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Dissertation Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Chapter 2: Magus: Minimizing Cellular Service Disruption during Network Upgrades 7
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 The Solution Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Benefits of Reconfiguration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3.1 LTE Testbed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.2 Measurements and Observations . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4 Cellular Network Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4.1 Cellular Coverage and Capacity Model . . . . . . . . . . . . . . . . . . . . . . . 19
2.4.2 Operational Data Used in the Model . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4.3 Cellular Coverage Illustration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.5 Service Disruption Mitigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.6 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Chapter 3: SHADE: Providing Premium Service for Adaptive High Bandwidth Applications in
Cellular Networks 37
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.3 Providing Better Video QoE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.4 SHADE: A Premium Video Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4.1 Maintain Downlink Throughput . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.4.2 Select Bitrate Target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.5 Maintaining Downlink Throughput . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.5.1 Property of Fair Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.5.2 Achieve a Targeted Throughput . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.5.2.1 Calculate Required Number of PRBs . . . . . . . . . . . . . . . . . . . 52
3.5.2.2 Determine Weight to Obtain Certain PRBs . . . . . . . . . . . . . . . . 53
3.5.3 Throughput Maintenance: Tracking Network Dynamics . . . . . . . . . . . . . . 54
3.5.4 Supporting Multiple Critical Applications . . . . . . . . . . . . . . . . . . . . . . 55
3.6 Selecting Bitrate Target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.6.1 Limit Impact on Non-Critical Applications . . . . . . . . . . . . . . . . . . . . . 57
3.6.2 Distribute Resources Among Critical Applications . . . . . . . . . . . . . . . . . 57
3.7 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.7.1 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.7.2 Premium Video Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.7.3 Throughput Maintenance Component . . . . . . . . . . . . . . . . . . . . . . . . 69
3.7.3.1 Channel Condition Variations. . . . . . . . . . . . . . . . . . . . . . . . 69
3.7.3.2 User Dynamics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.7.3.3 Multiple Premium Users. . . . . . . . . . . . . . . . . . . . . . . . . . 74
Chapter 4: Literature Review 76
Chapter 5: Conclusions and Future Directions 81
References 85
List Of Tables
2.1 Experiment results for recovery ratio, calculated using Formula 2.7 and averaged for areas
we studied. (a), (b) and (c) indicate different upgrade scenarios shown in Figure 2.9. For
power-tuning, the greatest gains are in suburban areas; gains for rural and urban areas are
lower. In general, tilt-tuning cannot be as good as power-tuning, but the joint approach
greatly improves the results of power-tuning. . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2 Recovery ratio of one scenario, using different utility functions. Recovery ratio is calculated
using Formula 2.7; u_performance and u_coverage refer to the utility functions of
Formulas 2.5 and 2.6, respectively. Different utility functions converge to different tuning
changes, and Magus can choose a different utility function based on the requirement. . . . . . . 35
3.1 Common bitrate candidates of different content providers. . . . . . . . . . . . . . . . . . 46
3.2 Two fairness metrics on the number of PRBs allocated to all the users of Proportional Fair
scheduler, for different time intervals. Proportional Fair scheduler achieves good fairness
when the time interval is longer than 1 second. . . . . . . . . . . . . . . . . . . . . . . . . 51
3.3 Mean and Std of maintained downlink throughput per second using different MCS-Intervals.
Although all the intervals can maintain the average throughput at 1200Kbps, smaller in-
tervals can reduce the variation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.4 Mean and Std of maintained downlink throughput per second using different Requirement-
Intervals. A sweet spot occurs at the interval of 1s, which maintains the throughput at the
targeted value. Smaller or bigger intervals provide lower average throughput. . . . . . . . 73
List Of Figures
1.1 Overview for robustness and performance improvement opportunities investigated in this
dissertation. When there are base station upgrades/outages, Magus improves service ro-
bustness (coverage), by changing base station configurations. When base stations are over-
loaded, SHADE enhances video users’ performance (quality of experience), by modifying
the transmission resource scheduler. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1 Two dimensional solution space: operating time can be either proactive or reactive; type
of tuning can be either model-based or feedback-based. . . . . . . . . . . . . . . . . . . . 12
2.2 Demonstration of performance improvement using LTE Testbed. The left figure shows the
configuration setting before and after the upgrade; the right figure shows the corresponding
performance impact. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3 Demonstration of operational network path loss data. Each pixel represents a grid. Brighter
color indicates lower path loss and better receive power. . . . . . . . . . . . . . . . . . . . 20
2.4 Demonstration of the service coverage map. Grids that are served by the same sector are
painted in the same color. Black pixels indicate the grids that have receive power below a
threshold. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5 Service coverage map (Figure 2.4) on top of the satellite map. The grids that have good
receive power (higher than a threshold) are highlighted in red. The un-highlighted grids
are sparsely inhabited areas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.6 Magus’s workflow. The search algorithm picks one potential configuration change; the
analysis model applies the change by recomputing grid level and base station level infor-
mation; the evaluation module then evaluates the performance and thus decides whether to
accept this change or not. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.7 Path loss of changing power and changing tilt, brighter color means lower path loss and
hence good receive power. (a) path loss before change; (b) after increasing the transmission
power; (c) after an antenna uptilt. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.8 Coverage map of three different types of areas. An urban area contains a lot more base
stations than a rural area. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.9 Three different upgrade scenarios: (a) upgrading one sector at the center, (b) upgrading
three sectors of one base station, and (c) upgrading four sectors at the four corners of the
area. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.10 Illustration of coverage of tuning neighboring sector in rural area: (a) before taking down
the target sector; (b) after taking down the target sector; (c) after increasing one neighbor-
ing sector’s transmission power by 10dB. . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.11 Benefits of gradual tuning (Proactive Gradual). Gradual tuning reduces simultaneous han-
dovers by 3 and offers 99.7% of UEs a seamless handover. . . . . . . . . . . . . . . . . . 34
2.12 Comparing speed of convergence across tuning approaches. Reactive Feedback Based
approach takes much longer than other approaches. . . . . . . . . . . . . . . . . . . . . . 34
2.13 Improvement ratio of Magus (Algorithm 1) on naive approach. Magus is better for 81% of
scenarios, with maximum improvement ratio 3.87 and average improvement ratio 1.21. . . 34
3.1 Three QoE metrics (Average Bitrate, Rebuffering Ratio, Bitrate Switches) vs. downlink
throughput of watching content that has five bitrate candidates: 350, 700, 1200, 2400,
4800Kbps. We see that: (a) in general, higher downlink throughput provides better Aver-
age Bitrate, and we see a clear “staircase” shape, with the height of each stair correspond-
ing to one bitrate candidate; (b) Higher downlink throughput leads to lower Rebuffering
Ratio; (c) Using the downlink throughput that is close to one of the bitrate candidates can
significantly reduce the number of Bitrate Switches. . . . . . . . . . . . . . . . . . . . . . 44
3.2 SHADE’s work flow. For each prospective user, SHADE’s admission control module de-
termines whether it is feasible to provide premium video service to this new user. If so,
for each admitted premium user, bitrate selection component (Section 3.6) selects a bitrate
from this user’s video bitrate candidates. These selected bitrates are then passed to the
throughput maintenance component (Section 3.5). Selected bitrates need to be updated
periodically to adapt to network dynamics. . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3 An illustration of the first step of SHADE’s bitrate selection algorithm. For a given appli-
cation, SHADE first estimates its achievable bitrate using a fixed number of PRBs (Bitrate
of Fixed # of PRBs); the changes in that curve are due to channel condition changes. Then,
SHADE maps that bitrate result to the closest bitrate candidate (Selected Bitrate). . . . . . . 59
3.4 Comparison of competitors on three key video QoE metrics: (a) Average Bitrate; (b) Rebuffering
Ratio; and (c) Bitrate Switches, with different premium resource percentage p,
using R_1 as R_AC for admission control. . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.5 Comparison of competitors on two metrics for better analysis: (a) Downgrade Fraction;
(b) Bitrate Selection Switches, with different premium resource percentage p, using R_1 as
R_AC for admission control. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.6 Comparison of competitors on three key video QoE metrics: (a) Average Bitrate; (b) Rebuffering
Ratio; and (c) Bitrate Switches, with different premium resource percentage p,
using R_3 as R_AC for admission control. . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.7 Comparison of competitors on: (a) Average Bitrate; (b) Bitrate Switches; and (c) Bitrate
Selection Switches, with different user speed, using R_1 as R_AC for admission control. . . . 68
3.8 Comparison between the actual throughput per PRB (Scheduled PRBs) and the estimation
using the average of MCS Index of all the PRBs (Estimation). Such estimation under-
estimates the MCS Indexes and throughput per PRB significantly, sometimes more than
40%. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.9 Maintained throughput per second using two different MCS-Intervals: 0.1s and 1s. Using
the interval of 0.1s provides more stable throughput, with Std of 107Kbps compared to
218Kbps of 1s case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.10 Maintained throughput per second using Requirement-Interval of 1s (with corresponding
weight), compared to the case without throughput maintenance. SHADE keeps increasing
the weight as the number of users increases and maintains the throughput at the targeted
value. On the contrary, without the maintenance, the throughput drops to only 1/5 of its
initial value. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.11 Maintained throughput per second, compared to the case without throughput maintenance,
under the circumstance of continuously degrading channel condition due to user mobility. . 74
3.12 Maintained throughput for three premium users at different targeted throughput. Dotted
lines illustrate each user’s throughput without throughput maintenance. . . . . . . . . . . . 75
Abstract
Cellular networks have become more and more important to our daily lives. Nowadays, besides constantly
contacting friends and family, we rely on cellular networks to connect to the internet for many things,
including not only traditional email, web-browsing and voice communication applications, but also emerging
high bandwidth applications, e.g., video, gaming, and business-critical applications. The increase in
the number of cellular users and the requirements of these high bandwidth applications create a growing
challenge for cellular service providers: providing good cellular services.
Robustness and performance are two critical requirements for good cellular service. Robustness
ensures that cellular users can receive decent cellular services despite network and environmental
uncertainties. This is critical, as cellular service disruption is not received well by customers. On the other
hand, with the increase of cellular users and the emergence of high bandwidth applications, cellular per-
formance becomes more and more important. Because of dramatically increased traffic volumes, cellular
users are likely to receive degraded cellular performance. Both robustness and performance can be im-
proved by significantly improving the cellular infrastructure, e.g., building more base stations. However,
this solution is expensive and also requires time to deploy. In this dissertation, we focus on a different
direction: achieving robustness and performance improvement using current infrastructure, and without
significant modifications to the infrastructure.
For robustness, we focus on service disruptions induced by a frequently occurring network event:
planned upgrades. Planned upgrades occur every day, may often need to be performed on weekdays, and
can potentially degrade service robustness. We explore the problem of tuning base station configurations
to mitigate the impact due to a planned upgrade which takes the base station off-air. The objective is to
recover the loss in service which would have occurred without any modifications. We propose a proactive
approach based on a predictive model that uses operational data to quickly estimate the best power and
tilt configuration of neighboring base stations that enables high service recovery. These ideas, embodied
in a capability called Magus, enable us to recover up to 76% of the potential service loss due to planned
upgrades.
For performance, we consider video streaming, which is the dominant application. Current cellu-
lar networks do not provide any service quality guarantee. During periods of base station congestion,
video users will receive poor cellular service, which can lead to degraded video quality of experience
(QoE). We explore the idea of premium service for video users to improve their video QoE performances.
While increasing video user’s downlink throughput can improve QoE, we empirically observe that QoE
can be further improved by maintaining the downlink throughput at one of the candidate bitrates of the
corresponding video. We design SHADE to realize this idea. SHADE distributes transmission resources
among users smartly, by selecting a candidate bitrate for each premium user and maintaining the downlink
throughput at this bitrate target. SHADE achieves this with minimal changes to current cellular systems.
Our extensive simulations indicate that SHADE can significantly improve multiple video QoE metrics,
compared to previous proposals.
Chapter 1
Introduction
At the end of 2016, cellular data traffic reached 7.2 exabytes per month, an 18-fold increase over the past 5
years [2]. Two important factors drive this traffic explosion. First is the tremendous increase in cellular
users in recent years. Today, there are over 8.0 billion cellular devices globally. Second, users increasingly
rely upon cellular networks for their daily activities. Besides traditional applications like email, web-
browsing and voice communication, cellular users are using applications that require high bandwidth,
e.g., video, gaming, business critical applications, and so on. This trend is likely to continue. Cisco
predicts that cellular devices will constitute half of global devices in 2021 [2]. Further, these applications
are evolving rapidly: in recent years, we have seen video applications with higher and higher quality,
360 degree videos, and virtual reality applications. These factors together create a growing challenge in
providing good cellular services.
There are at least two important problems towards good cellular services: robustness and performance.
A robust service ensures that users continue to be able to use the network despite failures or outages. This
is critical, as disruption of cellular service harms user engagement and thus revenue. Severe outages can
affect cellular services across multiple states (in the U.S.) and across multiple service providers [8]. Some of
them even impact 911 calls [1]. Studies report that the impact of these kinds of service disruptions is in
the billions of dollars [15]. Besides severe outages, smaller outages happen more frequently but affect fewer
users, e.g., users within a cell-site or a single base station. When an outage happens, users may receive
significantly degraded (if not completely unusable) cellular services. Cellular service providers should
improve robustness to avoid such disruptions.
Performance is another important aspect for good cellular services. In recent years, cellular traffic
volumes have increased dramatically, but the capacity of access networks has not kept pace. Thus, traffic
can overload the base station, resulting in base-station-level congestion, which can greatly degrade users'
cellular performance and lead to low throughput, high latency, etc. This congestion is becoming
more frequent and severe with the emergence of high bandwidth applications like video streaming. To
deal with this, for example, Verizon throttles users' throughput during peak hours [7]; T-Mobile incentivizes
users to watch low quality videos [3]. Throttling is not a long-term solution. Throttled throughput, i.e.,
a lower throughput, leads to significantly degraded quality of experience (QoE) for these high bandwidth
applications. In addition, byproducts of throttling, e.g., increased loss rates, can further negatively impact
QoE [33]. There is an urgent need for providing better cellular performance, especially for those high
bandwidth applications that are more performance sensitive.
To tackle the robustness and performance problems in cellular networks, there are several orthogonal
directions. On one hand, we can improve the cellular infrastructure. For example, cellular service providers
can build backup base stations for better robustness or have more base stations and thus more capacity to
enhance performance. However, this solution is expensive and also requires significant time to deploy.
On the other hand, cellular service providers can reduce the traffic burden of each base station for better
robustness, e.g., by limiting the number of admitted users per base station, or throttling users' throughput.
Similarly, cellular performance can be improved by only serving a smaller subset of prospective users.
These solutions are clearly not ideal. Users expect to receive cellular services when needed, without any
throttling, because throttling harms application performance, especially for high bandwidth applications.
In this dissertation, we explore a different direction that is more feasible and easily deployable: to
improve cellular services while serving existing users, and without significantly modifying existing infras-
tructure. There are two high-level challenges in this direction. First, the improvement we want to achieve
should not require investments in new infrastructure. For example, we want to improve robustness under
the same cellular infrastructure, but not with more backup base stations; and we want to enhance a user's
cellular performance using the same amount of transmission resources, instead of introducing resources
from newly built base stations. This is challenging because modern cellular systems are already mature
and optimized, and additional improvement on top of the current infrastructure is hard to achieve. The second
challenge is due to the constraint that only minimal changes are allowed to the current infrastructure. A
significant re-design of cellular networks might provide better services. However, because we want
to have faster deployment and better compatibility with existing systems, we constrain ourselves to minimal
modifications. This constraint makes the problem more challenging because we have fewer degrees of freedom.
Figure 1.1: Overview for robustness and performance improvement opportunities investigated in
this dissertation. When there are base station upgrades/outages, Magus improves service robustness
(coverage), by changing base station configurations. When base stations are overloaded, SHADE
enhances video users' performance (quality of experience), by modifying the transmission resource
scheduler.
1.1 Dissertation Overview
In this dissertation, based on studies on two concrete problems, we illustrate that, cellular network’s ro-
bustness and performance can be improved significantly by making minimal changes to the cellular infras-
tructure (Figure 1.1). In Chapter 2, we study how to provide better robustness when there are off-air base
stations (due to base station outages or planned upgrades). We propose a system, called Magus, that auto-
matically improves the robustness of cellular services, by reconfiguring the antenna parameters of nearby
base stations. In Chapter 3, we consider the scenario where the base station is overloaded, and high band-
width applications receive significantly degraded performance. We design a premium service to enhance
video users’ performance for better video QoE. Below we provide overviews to these two problems in
turn.
1.1.1 Minimizing Service Disruption during Network Upgrades (Chapter 2)
Cellular service providers choose base station locations and associated configurations carefully to provide
good cellular service. However, due to outages or planned upgrades, base stations might become off-air
temporarily. In this situation, pre-chosen base station configurations are no longer optimal, which can
negatively impact cellular services of existing users. For example, some users can lose cellular service,
as their serving base station is off-air. These users are in a coverage hole (a geographic area that cannot
receive cellular service), a severe robustness issue. We explore the opportunity of reconfiguring nearby
base stations in order to mitigate this negative impact. The objective is to recover the loss in service which
would have occurred without any modifications. To realize this reconfiguration, we build a predictive
model that leverages real operational data (rather than idealized analytical models) to estimate the best
configurations of nearby base stations to improve the service robustness. These ideas, embodied in a
capability called Magus, enable significant service recovery, up to 76% of potential loss due to planned
upgrades.
1.1.2 A Premium Service for Better Video QoE Performance (Chapter 3)
Users increasingly rely upon cellular networks for high bandwidth applications, such as video applications,
which require high bitrates for good QoE. Currently, cellular service providers provide best-effort services
to all the users. Hence, without any service quality guarantee, during periods of base station conges-
tion, video users can receive poor cellular service performance, which can lead to significantly degraded
video QoE. To improve video users’ QoE, we explore the idea of premium video service. The enabler
of this premium service is a technique that can control and thus improve video user’s downlink through-
put. In addition, we empirically observe that video QoE is better improved by maintaining the downlink
throughput at one of the candidate bitrates of the video. We design and implement SHADE to realize this
idea. SHADE minimally changes how transmission resources are allocated among users in the base station,
and distributes limited transmission resources smartly to optimize video QoE for video users, with bounded
negative performance impact on non-video users. SHADE can significantly improve three key video QoE
metrics simultaneously (up to 10 times improvement), compared to previously proposed premium services.
1.1.3 Future Directions
This dissertation sheds light on several future directions:
Our studies illustrate that cellular services can often be improved significantly with minimal infras-
tructure changes. One important reason for this result is that cellular networks operate in a highly
variable environment. Because there are many network dynamics, being adaptive to them (by apply-
ing changes dynamically) can help cellular networks provide more optimized cellular services. One
natural future direction is towards more dynamic and adaptive cellular infrastructure, in order to cap-
ture these variable network dynamics and react accordingly. Another lesson we learned is that the
base station is a good place to apply these changes for more optimized cellular services. We believe
that deploying adaptive changes at the base station can lead to many more service improvements.
Another future direction is to explore the idea of embedding minimal application knowledge into
cellular networks for better services. As we envision that different applications will require different
underlying service qualities for good QoE (video applications need a high throughput, while gaming
applications require a low latency), it would be beneficial to make cellular networks application-
aware, i.e., traffic of different applications would be handled differently. In SHADE, we re-design
the cellular infrastructure to be video-aware for better video QoE. The idea of application-aware
cellular infrastructure for better application QoE can be applied to other applications as well.
Lastly, neither Magus nor SHADE has been fully implemented in real cellular systems. We are not
aware of any publicly accessible cellular network testbeds that are programmable and can accept
changes to easily evaluate research ideas. We believe that building such testbeds can benefit the
cellular networking research community greatly.
1.2 Dissertation Outline
This dissertation is organized as follows. We discuss Magus in Chapter 2 and SHADE in Chapter 3. In
Chapter 4, we present a comprehensive overview of related work in the literature. Finally, we summarize
and conclude the dissertation in Chapter 5.
Chapter 2
Magus: Minimizing Cellular Service Disruption during Network
Upgrades
2.1 Introduction
Mobile users increasingly rely upon cellular networks for their daily activities such as Web browsing,
voice communication, video on demand, social network applications, and business critical tasks. As such,
disruption of cellular service is not received well by customers: service outages are often publicly reported
in mainstream media outlets and some studies report the impact of these kinds of service disruptions to be
in the billions of dollars [15].
One reason for service disruption is due to some types of planned network upgrades. Cellular service
providers are rolling out new features (e.g., Voice over LTE) and upgrades at a rapid pace to keep up
with growing traffic and application demands, and to provide ultra-high quality of service and reliability.
Network upgrades can involve new software releases, hardware updates, configuration changes and even
equipment re-homes.
These planned upgrades have the potential to impact cellular service performance and thus have to
be carefully planned and executed. Some upgrades in the radio access network require the cellular base
station to be taken off-air for the duration of the planned work. For example, during power plant work or
hardware replacement, the cellular base station is not available to provide service to the end-users. The
cellular network operators carefully plan such upgrades during the off-peak hours and low-impact days,
when possible.
Despite the extraordinary care in scheduling these upgrades, it is sometimes not possible to avoid
service disruptions. Sometimes, upgrades can take longer than expected and thus spill over into the busy
hours. In other cases, they may have to be conducted during business hours depending on vendor availabil-
ity. Moreover, for certain locations such as busy airports, there is no specific preferred time for scheduling
the upgrade because of the 24/7 usage at these locations.
In this paper, we focus on planned upgrades that occur during the business hours and require the base
station to be taken offline for the duration of the work. Such planned upgrades are not infrequent. To
quantify this, we obtained one year’s worth of data on planned upgrades from a large cellular network
in North America. We observe that planned upgrades occur every day of the year and they are more than
twice as likely to occur on Tuesdays through Fridays than on other days. Typically, these planned upgrades
last 4-6 hours and impact all radio access technologies (such as LTE, UMTS as well as GSM). For some
planned upgrades, service disruptions cannot be completely avoided. Depending on the radio network
coverage and capacity, some end-users might either be denied service (due to coverage holes), or have a
degraded service performance (due to overload conditions).
Our approach. We focus on the important problem of minimizing service disruption during network
upgrades. Our work relies on the following observation. When a base station is taken off-air, end-users
can re-attach to neighboring base stations depending on their coverage overlaps and resource availability.
In general, radio network planners attempt to maximize coverage and minimize interference by setting base
station configuration parameters such as transmit power and antenna tilt. However, if a base station goes
offline during an upgrade, the coverage and capacity in that group of base stations will become sub-optimal,
and it may be possible to share resources from neighboring base stations. Thus, there is an opportunity
to improve the end-users’ quality of experience during an upgrade by controlling the configuration (e.g.,
increasing the transmit power, or adjusting the antenna tilt) on neighboring base stations.
Today, many cellular networks do not perform this kind of adaptation. Some advanced systems have
deployed dynamic reconfiguration techniques which iteratively and dynamically adjust the configuration
parameters of neighboring base stations to converge to near-optimal coverage and capacity if there is
any kind of base station outage (not just ones caused by planned upgrades). As we discuss later, this
dynamic adaptation in these “self-organizing networks” [39] can take significant time because each step of
the iteration requires base stations to measure signal and interference parameters to drive the adaptation.
Further, there can be operational constraints on the number of configuration changes that can be pushed to
a production network.
Given the prevalence and impact of planned upgrades, we propose a new approach, called Magus, for
automatic network re-configuration to minimize service disruptions during planned upgrades. Before the
base station is taken off-air for planned work, Magus proactively migrates end-users away towards its
neighbors by tuning the configuration accordingly. This helps by partially recovering the degradation in
performance or coverage resulting from the upgrade.
There are several challenges in achieving this kind of proactive re-configuration. First, determining
the best new configuration for the neighboring base stations can be challenging, especially in dense urban
settings. This is because modern base stations have a large number of power, tilt, and other configuration
settings, an offline base station may have tens of neighbors or more in an urban area, and changing the
configuration of one base station may increase interference for users attached to other neighboring base
stations. Moreover, the impact of a configuration change may not be known a priori, since it depends
upon a large number of complex factors including terrain, weather, the number of users etc. Second, the
migration of end-users involves a re-attachment to the new base station and needs to be carefully managed.
The re-attachment occurs via a handover and a handover is technology and implementation dependent.
Moreover, synchronized handovers resulting from a sudden configuration change can severely strain the
cellular network and potentially cause service disruptions for users.
To overcome the first challenge, Magus builds upon a predictive model to automatically learn a near-
optimal configuration setting. This model can quickly evaluate the impact of configuration changes without
deploying them. The model leverages the availability of large databases of path loss information that are
often used for network planning purposes (but, to our knowledge, have not been proposed for dynamic re-
configuration). Using this path loss information, Magus can estimate signal-to-interference ratios resulting
from the configuration change, then estimate the potential impact on coverage or capacity, allowing it
to quickly search the configuration space. To address the second challenge, Magus proactively starts
migrating users to neighboring base stations (and carefully tuning their configurations while doing so)
before the scheduled planned upgrade, so that the impact of synchronized handover is minimized, and
most users are migrated before the planned upgrade.
In this paper, we present the design, implementation and evaluation of Magus in the context of LTE
radio access technology on a single carrier. However, the principles underlying Magus apply to multiple
carriers and other technologies as well, such as small cells and UMTS. We do not tackle re-configuration
and migration of users across radio access technologies and defer it to future work.
Our contributions. The paper makes four contributions. First, a qualitative analysis (Section 2.2) of the
solution space reveals the tradeoffs between proactive (before upgrade) and reactive (after upgrade) and
between a model-based approach such as Magus and a feedback-based approach (which adapts configu-
rations after taking measurements). Second, experiments from an LTE testbed (Section 2.3) illustrate the
potential of re-configuring neighboring base stations in recovering lost performance. The third contribu-
tion is the design of Magus, a novel predictive model-based proactive re-configuration approach that relies
on operational data available to large mobile carriers (Section 2.4 and Section 2.5). (For the operational
data, we explicitly do not show any service performance numbers in the paper for proprietary reasons.)
Finally, evaluations using data from a large US mobile carrier on 3 major US cellular markets show that
Magus can recover up to 76% of lost performance, and reduce synchronized handovers by a factor of 8.
Interestingly, the performance recovery varies by area, being highest in suburban areas of moderate base
station density; these areas predominate in the three markets. This variability is surprising because planners
do account for outages, and network planning is a mature field; our result opens up avenues for better
network planning to enable higher recovery after planned upgrades.
2.2 The Solution Space
In this section, we qualitatively explore the solution space for minimizing service disruption during planned
network upgrades. Before describing the solution space, we briefly introduce some terminology. We
denote by C the configuration of the cellular network at any given instant. Each base station can be
configured using a number of parameters such as transmit power, antenna tilt, and so on, and C represents
the collective parameter settings of all base stations in the network. To tune a configuration means to
change the values of parameters for (some of) the base stations in the network. Thus, tuning takes the
network from some configuration C_1 to another configuration C_2. Finally, each configuration is associated
with a utility, which measures the goodness of the configuration. Typical utility functions capture either
coverage criteria (i.e., increase the number of connections that would otherwise have been dropped), or
service performance criteria such as data throughput, or combined coverage and service criteria. We make
these notions more precise in the next section.
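To make these notions concrete, the following Python sketch shows one plausible way to represent a configuration C as per-sector power and tilt settings, together with a generic utility-function type; the class and field names here are illustrative and are not part of Magus.

from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass(frozen=True)
class SectorConfig:
    """Tunable parameters of a single sector (an illustrative subset)."""
    tx_power_dbm: float   # transmit power
    tilt_deg: float       # antenna tilt

@dataclass(frozen=True)
class NetworkConfig:
    """A configuration C: the collective parameter settings of all sectors."""
    sectors: Dict[str, SectorConfig] = field(default_factory=dict)

    def tune(self, sector_id: str, new_setting: SectorConfig) -> "NetworkConfig":
        """Return a new configuration with one sector's parameters changed."""
        updated = dict(self.sectors)
        updated[sector_id] = new_setting
        return NetworkConfig(sectors=updated)

# A utility function maps a configuration to a scalar "goodness" value,
# e.g., a coverage-based or throughput-based metric.
UtilityFn = Callable[[NetworkConfig], float]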
Abstractly, a network reconfiguration after a planned upgrade (indeed, after any network change) takes
the network from one configuration C_before to another C_after. This is done by tuning the configurations,
with the goal of arriving at a maximal-utility configuration.
As illustrated in Figure 2.1, the solution space for network reconfiguration is defined by two dimen-
sions: (i) the operating time for tuning the configuration, either before the base station goes off-air
(proactive) or after (reactive), and (ii) the type of tuning, which can be either model-based or feedback-based.
Figure 2.1: Two-dimensional solution space: the operating time can be either proactive or reactive; the
type of tuning can be either model-based or feedback-based.
A model-based approach estimates configuration parameters by leveraging traffic and performance
history (e.g., how much traffic did the base station see at the same time the previous day or the previous
week?), as well as a detailed terrain-aware model of network path loss information (such models, e.g., [14],
are often used for network planning). When a base station goes down, the model-based approach directly
tunes the neighbors to the optimal configuration.
On the other hand, a feedback-based approach iteratively tunes configurations, relying, at each iteration,
on measured performance (the "feedback") after the previous iteration. This performance feedback
consists of measured coverage and capacity parameters, and accurately represents the traffic and service
performance during the planned upgrade. The feedback-based approach terminates when it reaches a
configuration whose performance cannot be improved.
A feedback-based approach can take K iterations to reach the optimal configuration, whereas a model-based
approach can reach it in one iteration. Depending on the number of neighbors and the possible values that the
configuration parameters can take, the number of iterations K can be very large. On the other hand, if
the network and traffic conditions do not match the history or the path loss model, then the model-based
approach might reach a sub-optimal configuration with lower utility than a feedback-based configuration.
This suggests that a hybrid approach may perform well: we can use the model-based approach to reach
a "good" but sub-optimal configuration C_so, and a feedback-based approach to go from C_so to a higher-utility
C_after in a small number of steps, denoted by k, with k ≪ K. For the rest of the paper, when we
discuss a model-based approach, we implicitly assume it is augmented with a feedback-based phase that
corrects for deviations from the model or from traffic history.
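The sketch below outlines this hybrid strategy in Python, assuming a hypothetical model-based predictor (model_best_config), a generator of small configuration tweaks (candidate_steps), and a feedback probe (measure_utility); these names are our own and are not part of Magus.

from typing import Callable

def hybrid_tune(
    c_upgrade,                        # configuration right after the base station is taken down
    model_best_config: Callable,      # model-based estimate: configuration -> C_so
    candidate_steps: Callable,        # enumerates small tweaks around a configuration
    measure_utility: Callable,        # feedback: deploy (or emulate) a configuration, measure its utility
    max_feedback_iters: int = 5,      # k iterations of feedback, with k much smaller than K
):
    """Jump to the model's estimate C_so, then refine it with a few feedback iterations."""
    current = model_best_config(c_upgrade)     # one step to reach C_so
    current_utility = measure_utility(current)

    for _ in range(max_feedback_iters):
        # Try small configuration tweaks and keep the best measured one.
        best_next, best_utility = current, current_utility
        for candidate in candidate_steps(current):
            u = measure_utility(candidate)
            if u > best_utility:
                best_next, best_utility = candidate, u
        if best_next is current:               # no improvement: converged
            break
        current, current_utility = best_next, best_utility

    return current, current_utility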
Let C_before be the configuration before an upgrade, C_upgrade the configuration just after the base station
is taken down, and C_after the configuration attained after tuning the neighbors. Let f(C_before),
f(C_upgrade), and f(C_after) respectively represent the utilities of these configurations. Then:
f(C_before) > f(C_after) ≥ f(C_upgrade)
With no tuning of neighbor configurations, the utility function would stay at f(C_upgrade) for the duration
of the planned upgrade. However, the key observation in this paper is that there exist opportunities for
improving the overall utility to reach f(C_after) by tuning the configurations. This discussion suggests four
strategies.
Reactive feedback-based. Tuning starts after the base station is taken off-air and configurations are
iteratively optimized using performance feedback until the utility function cannot be further improved
or the algorithm reaches the maximum number of iterations permissible. Prior work on Self Organizing
Networks [39] (SON) represents an instance of a reactive feedback-based approach, albeit for unplanned
base station outages.
Reactive model-based. After the base station is taken off-air, this strategy takes one iteration to reach
the optimal configuration setting on the neighbors. The advantage is faster convergence time to the final
configuration, but the network may have a utility f(C_upgrade) just after the base station is taken down, and
before the final configuration is reached.
Proactive feedback-based. This approach seeks to iteratively reach the optimal configuration setting
before the base station goes down, using performance feedback. An example strategy would start reducing
the transmission power of the target base station (that is going to be offline) and in each iteration, tune the
configuration settings of the neighbors to achieve a maximum value for the utility function.
Proactive model-based. This strategy uses predictive techniques to automatically learn the optimal con-
figuration setting for the neighbors before bringing down the base station.
Of these approaches, proactive model-based achieves the best performance by ensuring that the utility
function never goes below the optimal f(C_after), and it reaches the optimal configuration in 1 + k steps.
The only disadvantage of the proactive model-based solution is the excessive number of handovers that
would occur, at each step, from the base station under planned upgrade to its neighbors. We address this
by proposing a new proactive model-based strategy, Magus, that makes gradual changes and reaches the
optimal configuration. Note that only the model-based approach knows C_after a priori and can ensure that
the utility function in each iteration never goes below this value.
2.3 Benefits of Reconfiguration
In this section, we illustrate the potential of adaptively tuning configurations in the cellular network in order
to mitigate service disruptions during planned upgrades.
During eNodeB service disruption, some UEs (user equipment, such as a smartphone) experience
degraded services. Neighboring eNodeBs can be re-configured to mitigate the degradation, for example, by
changing power attenuation levels, which govern the transmit power of the radio. An eNodeB can increase
the transmit power for higher SINR and thus provide better performance for its UEs. However, such
changes need to be made carefully because they introduce interference to UEs served by other eNodeBs
[39, 74].
In a later section, we also discuss and evaluate another configuration parameter, antenna tilt. Antennas
can be electronically tilted up or down to, respectively, increase or decrease the coverage area of the base
station. The experimental hardware we use in this section does not support tilt, but operational cellular
base stations do.
In this section, we experimentally explore the opportunity of minimizing the impact of such service
disruptions by adaptively reconfiguring the cellular network parameters. For this, we use an LTE testbed,
in which we consider various topologies. There are two advantages to experimenting with a real cellu-
lar deployment: (a) we can observe the practical impact of service disruption on UEs; and (b) we can
practically evaluate the possibility of the network in continuing to offer high-quality service via adaptive
reconfiguration.
Our measurements provide the following insights. First, whenever an eNodeB is taken offline, UEs can
experience significant performance degradation: some may completely lose network connectivity, while others may
experience a throughput drop. Second, in many cases, tuning the transmission parameters of neighboring
eNodeBs (such as their transmission power) can significantly mitigate the effects of service disruption.
In what follows, we first discuss our experimental setup and then elaborate on our measurement-driven
insights.
2.3.1 LTE Testbed
Our testbed is a full-featured LTE Release-9 network that consists of 4 eNodeBs, 10 UEs and an Evolved
Packet Core (EPC) deployment. The testbed is deployed indoors on the 4th floor of a corporate building.
eNodeB: Each eNodeB is a re-programmable Cavium LTE small cell [13] that carries an Octeon Fusion
chip and runs Linux. We use an exclusive 10-MHz experimental license for transmission in band 7, where
the downlink and uplink frequencies are centered at 2635 MHz and 2515 MHz, respectively. Each eNodeB
carries a band-7 radio daughterboard with a transmission power that can reach up to 125 mWatts. Tuning
of the transmission power takes place by tuning a software based attenuator; the attenuation (L) can take
values starting from 30 (maximum attenuation - minimum power) down to 1, and can be tuned with a step
as small as 1. All LTE transmissions are over the air using omni-directional antennae. We have verified
that there is no external interference in our testbed.
UEs: The 4 eNodeBs serve 10 UEs deployed randomly in the same area. Each UE is hosted by a Core-
I3 Intel NUC box with 4 GB memory that runs Ubuntu 14.04x64 Linux. The UEs are Sierra Wireless
Aircard 330u USB dongles.
EPC: We use the Aricent EPC R2.1.0 software. Our EPC includes MME, SGW, PGW, HSS and PCRF
elements [16] (these acronyms refer to various services configured in the software; the services provide
mobility management and gatewaying between the base station and cellular users, and between the base
station and the external IP network). We have configured the access point name (APN) in the EPC to
always set up bearers with QCI=9 for all UEs, which provides best-effort service.
Figure 2.2: Demonstration of performance improvement using the LTE testbed. For each scenario, the left
figure shows the configuration setting before and after the upgrade; the right figure shows the corresponding
performance impact (utility over time, with the upgrade instant marked, for the Proactive, Reactive, and
No Tuning strategies). Scenario 1: the case for 2 eNodeBs, where eNodeB-2 is taken offline. Scenario 2:
the case for 3 eNodeBs, where eNodeB-2 is taken offline.
2.3.2 Measurements and Observations
Methodology. We consider two different eNodeB service disruption scenarios. For each scenario, our
goal is to empirically assess whether the network can be reconfigured such that users originally served by
eNodeB(s) taken offline can be re-attached to one of the active eNodeBs. Thus, in each experiment, we
first find the power configuration setting where the highest utility is achieved under normal conditions, i.e.,
in the absence of service disruption. Then, we take the eNodeB offline, and enumerate different power
levels for the remaining active neighbor eNodeBs in order to maximally increase utility.
In these experiments, our measure of utility of a configuration is the sum of the logarithms of the
UE downlink rates [51]: f(C) = \sum_{x \in UEs} log(r(x)), where C is the set of power attenuation levels of all
eNodeBs (i.e., the configuration of the eNodeBs) and r(x) indicates the downlink rate of UE x. This metric
eNodeBs (i.e., the configuration of the eNodeBs) and r(x) indicates the downlink rate of UE x. This metric
was chosen to indicate how well outage is mitigated. It balances the throughput performance while seeking
fairness for users who are disadvantaged. Compared to a simple “sum of rates”, the log property provides
a higher incentive to improve low rates of users experiencing poor radio conditions due to outage. This
metric is associated with proportional-fair scheduling which is widely used in cellular systems. Our goal
is to find the configuration with maximal utility.
During each experiment: (a) we first let the UEs attach to their preferred eNodeB, (b) then, we initiate
simultaneous 30-sec downlink TCP traffic sessions from the application server towards each UE, (c) we
measure the average downlink TCP throughput, and (d) we change the attenuations of eNodeB transmitters
and repeat the above steps until we reach max{f(C)}. We use iperf to measure the downlink throughput
for TCP traffic flowing from an application server that is connected to the eNodeB to the UEs. During
each experiment we capture C_before, C_after, and C_upgrade.
Scenario 1: 2 eNodeBs. We first consider a simple scenario with 2 eNodeBs, where one of them needs
to be taken offline, shown in Figure 2.2 Scenario 1 (left). Here, eNodeB-1 and eNodeB-2 serve UE-1,
UE-3 and UE-4, and we need to take eNodeB-2 offline. Prior to service disruption, eNodeB-1 uses power
attenuation L=30 and eNodeB-2 uses L=1. With this, f(C_before) = 3.31.
After we take eNodeB-2 offline, the best configuration setting C_after is achieved by setting L=1 at
eNodeB-1, i.e., by maximizing its transmission power. This makes f(C_after) = 3.09. On the other hand, if
there were no attenuation change at eNodeB-1 during service disruption, then the resulting f(C_upgrade)
would be only 2.68 instead (small changes in the utility function are significant, since the function computes
the sum of the logarithms of the rates). This is depicted in Figure 2.2 Scenario 1 (right). If we "proactively" tune
the attenuation of eNodeB-1 to the optimal value by the time we take eNodeB-2 offline, the best achieved
performance (f(C_after)) is reached faster than with a "reactive" strategy, where eNodeB-1 increases its
power progressively after the service disruption.
In this scenario, the attenuation tuning decision for eNodeB-1 is straightforward: eNodeB-1 should
use its highest power level since there is no interference from neighboring eNodeBs. Our next scenario
considers a setting where interference plays a key role.
Scenario 2: 3 eNodeBs. As shown in Figure 2.2 Scenario 2 (left), eNodeB-1, eNodeB-2 and eNodeB-
3 serve UE-1, UE-3, UE-5, UE-6 and UE-8. Assume that eNodeB-2 needs to be taken offline for main-
tenance. Under normal conditions when all 3 eNodeBs are online, the optimal power level configura-
tion is achieved by tuning the individual attenuation levels as follows: L=20 for eNodeB-1, L=20 for
eNodeB-3 and L=5 for eNodeB-2. After eNodeB-2 is taken offline, we find that, in order to maximize
our utility metric, C_after sets L=30 (the minimum transmission power) for eNodeB-1 and L=10 for
eNodeB-3. For this configuration, the resulting utility f(C_after) = 4.85, and in the absence of tuning it would be
f(C_upgrade) = 3.46. This illustrates that, in the presence of interference, power attenuation levels must be
carefully chosen in order to maximize utility (Figure 2.2 Scenario 2), and that the resulting utility is higher
than without tuning.
These experiments demonstrate that tuning the transmission power levels of active eNodeBs when a
neighboring eNodeB is taken offline can improve the network performance. Moreover, if one knew a
priori the optimal values of the attenuation levels that need to be applied upon service disruption, then the
proactive re-configuration of those values could alleviate the impact of the disruption to the user quality
of experience. How can we know these values in advance? We address this question in the next section,
where we propose a novel, accurate model-based approach for proactively deriving such values.
2.4 Cellular Network Model
In this section, we describe the model Magus uses to analyze and estimate the throughput and coverage of a
specific configuration of the cellular network. Since we consider the impact of one base station, or part of
one (a base station usually contains multiple sectors, typically 3, facing in different directions), being taken
out of service, our model is designed to faithfully represent signal coverage and user throughput, while
taking into account interference across sectors. In this paper, we focus on downlink rates, although our
methodology can also be used for uplink performance.
The unique aspect of Magus’s model is that it is data-driven, and uses data that is available to a cellular
network operator. The radio path loss is a critical characteristic, and a major contribution of our work is
to use realistic operational data for transmit powers and radio path loss. While this operational data is
used for planning purposes by many network operators, it is not used for online re-configuration in large
carriers, to the best of our knowledge.
We first describe Magus’s coverage and capacity model at a high level, followed by details of the
operational path loss data we use to realize the analysis model, and a visualization of sector regions.
2.4.1 Cellular Coverage and Capacity Model
Cellular network coverage is often modeled and measured based on a geographical grid. We thus divide
the area we want to analyze into grids, and assume users within the same grid perform equally.
The performance of a cellular network is determined by the rate enjoyed by each user. Magus models
the user’s rate based on two factors (a) radio link quality, and (b) sector load. The maximum rate that a
user can sustain depends on the quality of the radio link and we model that as a direct function of the UE’s
SINR. This is also the user rate if the serving sector serves no other users. The actual rate observed by
a UE in a loaded sector is lower than the maximum rate by a fraction that is directly related to capacity
sharing. We provide details in the following subsections.
Path Loss and SINR Calculation. There can be more than one (typically 3) sectors at the same base
station, aimed in different directions and covering adjacent areas. The radio signal transmitted from a sector is
attenuated due to the loss along the radio path to the user. In the model, the entire region is partitioned into
rectangular grids, and the most important metric is the matrix of the path loss values (typically expressed
in dB) from each sector to each geographical grid. If each sector b transmits with power P_b (dBm) and tilt
T_b, the path loss to grid g is L_b(T_b, g) (dB). Magus computes the Received Power (RP) (dBm) at each grid,
RP_b(g), for the transmission from each sector b, as follows:
Figure 2.3: Demonstration of operational network path loss data. Each pixel represents a grid.
Brighter color indicates lower path loss and better receive power.
Figure 2.4: Demonstration of the
service coverage map. Grids that
are served by the same sector are
painted in the same color. Black
pixels indicate the grids that have
receive power below a threshold.
Figure 2.5: Service coverage map
(Figure 2.4) on top of the satellite
map. The grids that have good re-
ceive power (higher than a thresh-
old) are highlighted in red. The un-
highlighted grids are sparsely in-
habited areas.
RP_b(g) = P_b + L_b(T_b, g)    (2.1)
To compute SINR for a given grid, we need the RP values from all the sectors. The sector that provides
the best RP (denoted by RP_best) becomes this grid's serving sector, and RP_best is thus the signal; the RPs
from other sectors become interference. The SINR can then be calculated as follows:

SINR(g) = RP_best(g) / (Noise + Interference) = RP_best(g) / (Noise + Σ_b RP_b(g) − RP_best(g))    (2.2)
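To make the computation above concrete, the following short sketch (in Python, with illustrative names such as path_loss_db and tx_power_dbm that are not part of Magus itself) shows one way to compute the received power and SINR for every grid from per-sector path-loss matrices; it is a minimal illustration, assuming powers are converted to linear units before summing interference, not Magus's implementation.

    import numpy as np

    def grid_sinr(tx_power_dbm, path_loss_db, noise_dbm=-104.0):
        """Compute per-grid serving sector and SINR from path-loss matrices.

        tx_power_dbm: array of shape (S,), transmit power of each sector (dBm).
        path_loss_db: array of shape (S, H, W), path loss L_b(T_b, g) in dB
                      (negative values, as in the operational data).
        Returns (serving_sector, sinr_db) arrays of shape (H, W).
        """
        # Received power per sector and grid, Formula 2.1: RP_b(g) = P_b + L_b(T_b, g)
        rp_dbm = tx_power_dbm[:, None, None] + path_loss_db

        # Convert dBm to milliwatts so that interference can be summed linearly.
        rp_mw = 10.0 ** (rp_dbm / 10.0)
        noise_mw = 10.0 ** (noise_dbm / 10.0)

        serving_sector = rp_mw.argmax(axis=0)          # best sector per grid
        best_mw = rp_mw.max(axis=0)                    # RP_best(g)
        interference_mw = rp_mw.sum(axis=0) - best_mw  # all other sectors

        # Formula 2.2: SINR(g) = RP_best(g) / (Noise + Interference)
        sinr = best_mw / (noise_mw + interference_mw)
        return serving_sector, 10.0 * np.log10(sinr)

    # Example: 3 sectors covering a 4x4 toy grid with random (negative) path losses.
    rng = np.random.default_rng(0)
    pl = -rng.uniform(60, 140, size=(3, 4, 4))
    serving, sinr_db = grid_sinr(np.array([46.0, 46.0, 43.0]), pl)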
UE’s Maximum Rate and Actual Rate. From each grid’s SINR, Magus can compute the maximum rate
of a UE (denoted by r_max(g)) in this grid, which is the user rate if there is no other user being served by
that sector. We assume the cellular system is based on the 3GPP LTE standard. In our model, we look up
the corresponding Modulation and Coding Scheme (MCS) index for a given SINR value ([4]), and then
look up the Transport Block Size (TBS) index ([10] Table 7.1.7.1-1) and finally the Transport Block Size
([10] Table 7.1.7.2.1-1) to map the SINR to the rate r_max(g). There is an SINR threshold SINR_min to provide
the minimum service, and if the SINR is less than SINR_min, we conclude that the grid is out of service and set
r_max(g) = 0 for that condition.
The rate r_max(g) is achieved if there are no other users. If the sector serves multiple UEs, the capacity
is shared by the users. For scheduling schemes such as round-robin and proportional-fair (in the long-term
average), capacity is shared uniformly. The actual rate that one UE in this grid can achieve equals the
maximum rate divided by the number of UEs the sector is currently serving. If we use N(g) to denote the
number of UEs that g's sector serves, G to denote all the grids, and UE(x) to denote the number of users in grid
x, then N(g) equals the sum of the number of UEs in all the grids served by g's sector:

N(g) = Σ_{x∈G} (UE(x) · 1_g(x))    (2.3)

where the indicator 1_g(x) indicates whether grid x and grid g are served by the same sector or not.
Then the actual rate for grid g is just^6:

r(g) = r_max(g) / N(g)    (2.4)
We have described how Magus uses grid level information to model the rate for all the UEs. For
a given scenario, Magus computes all grid level information: best sector, corresponding signal RP, the
interference, SINR, and the number of UEs it contains; and sector level information: a list of serving grids,
and the total number of served UEs (Figure 2.6).

^6 We assume there are no overheads involved in resource sharing.
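As a rough illustration of how the grid SINR is turned into per-UE rates, the sketch below applies the load-sharing step of Formulas 2.3 and 2.4. It is our own illustration under two stated assumptions: a hypothetical, coarse SINR-to-rate table stands in for the MCS/TBS lookups of [4] and [10], and UE counts per grid are supplied by the caller.

    import numpy as np

    # Hypothetical SINR (dB) thresholds and corresponding max rates (Mbps); the real
    # model uses the 3GPP MCS/TBS tables instead of this coarse stand-in.
    SINR_STEPS_DB = np.array([-6.0, 0.0, 6.0, 12.0, 18.0])
    RATE_MBPS     = np.array([ 2.0, 6.0, 12.0, 24.0, 40.0])
    SINR_MIN_DB = -6.0   # below this, the grid is out of service (r_max = 0)

    def max_rate(sinr_db):
        """Map a grid's SINR to r_max(g) via a lookup table."""
        if sinr_db < SINR_MIN_DB:
            return 0.0
        return RATE_MBPS[np.searchsorted(SINR_STEPS_DB, sinr_db, side="right") - 1]

    def actual_rates(sinr_db, serving_sector, ue_per_grid):
        """Formulas 2.3 and 2.4: share each sector's capacity among its UEs."""
        r_max = np.vectorize(max_rate)(sinr_db)
        rates = np.zeros_like(r_max)
        for b in np.unique(serving_sector):
            mask = serving_sector == b
            n_ues = ue_per_grid[mask].sum()          # N(g) for all grids of sector b
            if n_ues > 0:
                rates[mask] = r_max[mask] / n_ues    # r(g) = r_max(g) / N(g)
        return rates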
The model above is deliberately simple, and it serves to validate our approach. More sophisticated
models can be easily added if needed, and exploration of this is left to future work. The model’s simplicity
is dictated by the characteristics of the operational data used to drive the model. This is the most novel
aspect of our work: while operational data has been used before for network planning, we believe it is not
used for dynamic re-configuration in cellular networks.
2.4.2 Operational Data Used in the Model
Path Loss Data from Operational Networks. In Formula 2.1, most analytical research assumes some
classical model for the path loss information L_b, e.g., attenuation as a function of link distance, frequency,
antenna heights and empirical constants. However, state of the art path loss modeling in modern cellular
networks includes details such as terrain, buildings, foliage, etc, and this detail is treated differently for
each geographical grid region. The data we use comes from such a modeling tool called Atoll [14].
Path loss values are derived by using a Standard Propagation Model which is based on classical distance,
frequency, antenna height models, which are then modified with empirical constants to capture terrain,
foliage, and clutter effects for each grid.
Instead of making simple assumptions for the attenuation, Magus uses operational network path loss
data, which contains path loss information for a large US mobile carrier network. In the model, each
sector's path loss data covers a 60km × 60km square area, centered at the sector's location. This area
contains 600 × 600 grids, with a grid size of 100m × 100m, and there is one path loss reading for each
grid, resulting in one path-loss matrix (containing 600 × 600 path loss values, in dB) per antenna tilt
configuration. This path loss data is refreshed periodically as needed and Magus always uses latest path
loss data to build the model.
Figure 2.3 shows the path loss data of one sector in a metropolitan area. The path loss values range
from −20 dB for locations close to the sector to −200 dB at the boundary of the area. As we can see clearly,
this sector antenna is directional and pointing in the north-west direction. We can also see that the path
loss data contours are irregular, and thus cannot be represented easily by simple equations.
Base Station’s Location, Transmission Power and Tilt. Network operators choose base station loca-
tions, sector transmission powers and tilts carefully to provide better cellular service. In Magus, we use
the actual locations, transmission powers and tilts of sectors for Formula 2.1.
UE Distribution. Ideally Magus could use, as input, the number of UEs in each particular grid from
operational data for Formula 2.3. We did not have fine-grained LTE UE distribution data available at the
time we wrote the paper. As an alternative, we make a simple assumption: all grids served by a particular
sector contain the same number of UEs (i.e., UE distribution follows a uniform distribution at the sector
level). Thus, the number of UEs in each grid is obtained by dividing the total number of UEs served by
the sector by the number of grids that the sector serves. That said, if finer-grain information about UE
distribution across grids were available, we could easily incorporate this into our model, and we have left
this extension to future work.
2.4.3 Cellular Coverage Illustration
Figure 2.4 shows the predicted service map for a 300km × 150km area, derived from the Magus model.
Each pixel represents a grid, and grids with the same color clustered together are served by the same
sector. Black pixels represent the grids where SINR is lower than the SINR_min threshold we choose,
and we have intentionally chosen a high SINR threshold to show the clear difference between grids that
receive good service and other grids. In Figure 2.5, we show an overlaid satellite map of the same area
and highlight grids having service in red. We see that our model matches the real map nicely, clearly
bringing out coverage holes in sparsely inhabited areas (top-right corner of Figure 2.5). A more complete
model validation is logistically difficult, since it would require extensive measurements from UEs in known
locations. However, we are confident of the model accuracy since the data for the model comes from data
used operationally for network planning, and the model itself relies on well-studied methods in cellular
modeling.
Figure 2.6: Magus’s workflow. The search algorithm picks one potential configuration change; the
analysis model applies the change by recomputing grid level and base station level information; the
evaluation module then evaluates the performance and thus decides whether to accept this change
or not.
2.5 Service Disruption Mitigation
In this section, we discuss how Magus leverages the cellular network model to mitigate any service disrup-
tion due to sectors being taken down during network upgrade. It achieves this by finding the best power
and tilt configuration setting C_after. Note that Magus's tuning approach is model-based and proactive, and
thus different from dynamic power control optimization techniques like [26] that are reactive.
Figure 2.6 shows the components of Magus. The Search Algorithm searches for a good configuration,
which is fed as an input to the Analysis Model (described in Section 2.4), which analyzes the rates achieved
by users in the selected configuration. Finally, the Evaluation component determines the goodness of
the configuration. This component can also provide feedback to guide the selection of configurations
iteratively until Magus converges to a satisfactory configuration.
The Evaluation Component. Magus’s optimization goal is to provide cellular coverage to all UEs while
achieving the best overall performance of all the UEs. There is always a trade-off in cellular systems
between coverage, throughput, and fairness. We formulate the optimization goal in two steps: 1) we
calculate a utility value for each UE based on the value of its actual downlink rate; and 2) we calculate the
overall utility based on the utility values of all the UEs.
The utility function depends on the rate r. We use u() to denote the utility function. Thus a UE with
rate r_C for a particular configuration C has the utility given by u(r_C).^7 We use U(C) to denote the set of
utility values of all the UEs for configuration setting C, then we have:

U(C) = {u(r_C(UE_1)), u(r_C(UE_2)), u(r_C(UE_3)), ...}

The optimization goal, then, is just to maximize the overall utility f() of U(C):

max_C f(U(C))
We have not defined a UE’s utility function u() and overall utility function f() yet. It is desirable for
f() to be additive where the overall utility is a sum of the individual utility values for each user.
Magus can actually use different u() and f() for different mitigation purposes. For example, to
maximize coverage, i.e., provide more UEs with qualified service, Magus can use a binary utility function
u() indicating whether the service is qualified or not:
f(U) = Σ_{u∈U} u(r),    where u(r) = 1 if r > 0, and 0 otherwise    (2.5)
If the goal is to maximize performance, we can use the metric described in Section 2.3, the “sum of
the logarithm of the UE rate” [51]. Then we have:
f(U) = Σ_{u∈U} u(r),    where u(r) = log(r) if r > 0, and 0 otherwise    (2.6)
^7 More precisely, rate information is at a grid level, so r() should take the grid as input, not the UE. To simplify notation, here we
skip the step of getting the grid of the UE.
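For concreteness, the two overall utility functions can be written down directly; the small sketch below is our own illustration (not Magus code) of Formulas 2.5 and 2.6 applied to a list of per-UE rates.

    import math

    def coverage_utility(rates):
        """Formula 2.5: count UEs that receive any service (r > 0)."""
        return sum(1 for r in rates if r > 0)

    def performance_utility(rates):
        """Formula 2.6: sum of log(r) over served UEs (log-sum rate)."""
        return sum(math.log(r) for r in rates if r > 0)

    rates = [0.0, 1.2, 3.5, 0.8]        # example per-UE rates under some configuration C
    print(coverage_utility(rates))       # 3 UEs covered
    print(performance_utility(rates))    # overall f(U(C)) under the log-sum objective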
Figure 2.7: Path loss of changing power and changing tilt, brighter color means lower path loss and
hence good receive power. (a) path loss before change; (b) after increasing the transmission power;
(c) after an antenna uptilt.
Base Station Configuration Tuning. Restoring service for a sector suffering outage typically requires
that there be adequate radio signal coverage in the affected grids. There are two sector parameters that can
be tuned to increase signal coverage: (i) Power: the transmission power of the neighboring sector can be
increased, and/or (ii) Tilt: the antenna of the neighboring sector can be tilted vertically upwards (uptilt) to
shift the radio energy towards the target grids. Figures 2.7 (a), (b) and (c) illustrate the signal coverage of a
sector before tuning, after a power increase, and after an uptilt, respectively. Below, we discuss algorithms
for tuning power, tilt, and both power and tilt jointly.
Search Algorithm Component. To search for the best configuration, the simplest option is to use brute
force, which tries all the configuration settings. However, this does not leverage useful information from
the analysis model, and wastes time computing configurations which cannot mitigate induced performance
impact, e.g., tuning a sector far away from the target sector or a sector facing in the opposite direction.
To illustrate the huge search space, even if we only consider the nearby sectors, say, 10 sectors, and each
sector can increase its transmission power by 5 units, we have to explore more than 9 million (5^10) different
configurations in this simple example.
configurations in this simple example.
In this paper, we propose a heuristic iterative search algorithm that leverages the unique nature of our
problem and the observations of the radio network:
Algorithm 1: SEARCH ALGORITHM
 1  INPUT: involved sectors B, affected grids G
 2  search():
 3      β = ∅, T = 1
 4      for g ∈ G do
 5          for b ∈ B do
 6              if r_{C ⊕ P_b(T)}(g) > r_C(g) then β = β ∪ {b}
 7      b_best = argmax_{b ∈ β} f(C ⊕ P_b(T))
 8      C = C ⊕ P_{b_best}(T)
 9      update G
10      goto line 4 (increment T if needed)
i: The initial setting C_before is a good place to start since it provides coverage to all the sectors around
the target sector. Given how cellular networks are planned, users in the target sector are likely to get some
level of coverage from the neighboring sectors.
ii: From this configuration, we use the analysis model to make stepwise changes to find a better
configuration by making changes that can at least benefit some grids.
Transmission Power Tuning. Specifically, our search algorithm starts from C_before (i), and only tries
configurations that can improve the SINR of at least one grid (ii). Algorithm 1 illustrates the important
steps in Magus's search component for tuning power.
Let ⊕ denote a configuration change, and let C ⊕ P_b(Δ) denote a new configuration in which sector
b's transmission power has been changed by Δ. The algorithm takes as input the set of all the involved
sectors B in this scenario, which is chosen to be the set of neighbors of the sector(s) being upgraded.
Another input is the set of all the grids whose rate performance is degraded as a result of taking down one
or more sectors, G.
In each iteration, the algorithm calculates β, which is a set that contains sectors that can improve the
SINR performance of some grid; β is empty initially. For any grid g ∈ G, lines 4 and 5 identify all the
sectors that can improve g's SINR with T units^8 of transmission power change, with an initial condition of
T = 1.

^8 One unit is to increase the transmission power by 1dB.
These promising configuration changes in β are "conditionally" good because, although they can im-
prove some grids' SINR, they can also decrease other grids' performance, e.g., by introducing more interfer-
ence to other grids that are not served by this sector. To see whether some of the potential changes
in β are "globally" good, line 7 checks their overall utility by using the Evaluation component, and keeps
track of the sector that provides the best utility improvement, b_best. Then the algorithm applies the new con-
figuration, updates the affected grid set G, and moves on to the next iteration. If β is ∅, we increment the
tuning unit T. This process terminates when we have recovered all the grids that see degraded perfor-
mance, or when no sector can further improve the overall utility.
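To make the control flow of Algorithm 1 concrete, here is a compact Python sketch of the power-tuning search. The model object with rate, utility, and is_degraded helpers, and the config.bump method, are hypothetical stand-ins for the Analysis Model and Evaluation components; the sketch also omits details such as maximum power limits.

    def power_tuning_search(model, config, sectors_B, affected_grids, max_unit=5):
        """Heuristic power-tuning search (a sketch of Algorithm 1).

        model.rate(config, g)   -> rate of grid g under a configuration
        model.utility(config)   -> overall utility f(C)
        config.bump(b, T)       -> new configuration with sector b's power raised by T dB
        """
        T = 1
        while affected_grids and T <= max_unit:
            # Candidate sectors whose +T dB change improves at least one affected grid.
            beta = {b for g in affected_grids for b in sectors_B
                    if model.rate(config.bump(b, T), g) > model.rate(config, g)}
            if not beta:
                T += 1                      # no conditionally-good change: try a larger step
                continue
            # Keep the single change that maximizes overall utility ("globally" good).
            b_best = max(beta, key=lambda b: model.utility(config.bump(b, T)))
            if model.utility(config.bump(b_best, T)) <= model.utility(config):
                break                       # no sector can further improve the overall utility
            config = config.bump(b_best, T)
            # Grids whose degraded rate has been restored drop out of the affected set.
            affected_grids = {g for g in affected_grids if model.is_degraded(config, g)}
        return config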
We have also left to future work an understanding of this algorithm’s stability properties as well as
its relationship to an optimal strategy. Despite this, as we show later, Magus provides significant gains in
coverage and performance relative to naive strategies.
Antenna Tilt Tuning. The basic methodology discussed above extends to exploring different tilt config-
urations. Conceptually, we can compute path loss models for each sector for all possible
tilt settings (operational Atoll data [14] contains 16 different tilt settings besides the normal case). Once
we have these, we can effectively replace the test in step 4 of Algorithm 1 to check whether tilting b by a
specific tilt setting improves the rate.
For logistical reasons, we chose a simpler approach that approximates the effect of tilting, but is more
computationally efficient (and have left it to future work to explore a more faithful tilting model). First,
our approach makes the simplifying assumption that the change to a path loss matrix caused by a specific
uptilt or downtilt is the same across all sectors. This allows us to compute one change matrix for each
uptilt or downtilt across all sectors (rather than computing a separate path loss matrix for each sector and
tilt setting pair). Second, rather than search across all sectors inB to find the optimal combination of tilt
setting, we use a greedy algorithm: we incrementally uptilt the first neighboring sector until we reach a
point where the utility becomes worse, then we uptilt the second sector, and so on.
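The greedy tilt exploration described above can be sketched as follows, again assuming hypothetical model and config helpers, with config.uptilt(b) standing in for applying one approximate tilt-change matrix to sector b; this is an illustration, not the system's code.

    def greedy_tilt_tuning(model, config, neighbor_sectors):
        """Uptilt neighbors one at a time until the overall utility stops improving."""
        for b in neighbor_sectors:
            while True:
                candidate = config.uptilt(b)          # apply one more uptilt step to sector b
                if model.utility(candidate) <= model.utility(config):
                    break                             # this sector cannot help further; move on
                config = candidate
        return config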
Joint Tuning. As shown in Figures 2.7(b) and (c), tilt and power tuning produce different coverage results,
so combining the two can potentially provide better results. In our evaluations, we explore the benefit of
Figure 2.8: Coverage map of three different types of areas. An urban area contains a lot more base
stations than a rural area.
Figure 2.9: Three different upgrade scenarios: (a) upgrading one sector at the center, (b) upgrading
three sectors of one base station, and (c) upgrading four sectors at the four corners of the area.
first employing tilt-tuning, followed by power-tuning. More elaborate joint optimizations are possible, and
we have left an exploration of this to future work.
2.6 Evaluation
In this section, we first describe our evaluation methodology, then evaluate various properties of Magus.
Evaluation Methodology. Our evaluation uses operational cellular network data (base station locations,
user density estimates and path loss information) for three different markets in the United States (a mar-
ket roughly corresponds to the greater metropolitan area surrounding a major city). In each market, we
evaluate Magus on a few 10km × 10km areas, exploring several planned upgrade scenarios in each re-
gion (described below). For each upgrade scenario, we tune sector configurations within the area, but we
expand our analysis area to a larger 30km × 30km region to avoid boundary effects.
Since the trade-offs in radio network coverage can vary significantly with the radius of sectors, we
select three different types of areas: rural, suburban and urban areas. Figure 2.8 shows a service map
example for each of them. The density of sectors is quite different across the cases: in our experiments,
we observe on average 26 sectors that interfere with the sectors in our rural area, 55 that interfere with the
sectors in the suburban area and 178 that interfere with the sectors in the urban areas.
For each region, we attempt three different upgrade scenarios, shown in Figure 2.9: (a) upgrading a
single sector at a centrally-located base station, (b) upgrading three sectors located at the same central base
station, and (c) upgrading four sectors at the four corners of the region. The first scenario reflects a planned
upgrade on a single sector at one base station, the second reflects upgrading the entire base station, and the
third reflects a multi-sector concurrent upgrade.
We use a recovery ratio metric to evaluate the mitigation performance achieved by tuning to the C_after
network configuration. This ratio is defined as:

(f(C_after) − f(C_upgrade)) / (f(C_before) − f(C_upgrade))    (2.7)

The denominator is the degradation in the global utility due to a configuration change to C_upgrade,
while the numerator is the amount by which the global utility improves from the degraded value due to
a configuration modification made by Magus. Thus, the recovery ratio measures the fraction of the degraded
utility from the planned upgrade that is recovered by tuning. A ratio of 1 indicates full recovery to f(C_before) and
0 indicates no improvement from mitigation.
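The recovery ratio is straightforward to compute; the one-line helper below is our own illustration of Formula 2.7.

    def recovery_ratio(f_before, f_upgrade, f_after):
        """Formula 2.7: fraction of the utility lost at C_upgrade that tuning to C_after recovers."""
        return (f_after - f_upgrade) / (f_before - f_upgrade)

    # Example: a full recovery returns 1.0, no improvement returns 0.0.
    assert recovery_ratio(10.0, 6.0, 10.0) == 1.0
    assert recovery_ratio(10.0, 6.0, 6.0) == 0.0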
Mitigation of Service Disruption. We studied 3 different rural areas, suburban areas and urban areas
(total 9 different areas) across three markets. For each area we analyzed 3 different upgrade scenarios, for
a total of 27 different scenarios. Our results are averaged and summarized in Table 2.1. The table shows
recovery ratio results for power-tuning, tilt-tuning, and joint power-/tilt-tuning.
Types of      Rural                    Suburban                 Urban
Tuning     (a)     (b)     (c)      (a)     (b)     (c)      (a)     (b)     (c)
Power     18.3%   17.5%   11.0%    56.5%   32.2%   24.5%    17.1%   22.7%   14.1%
Tilt       8.4%   23.0%    9.3%    37.7%   27.9%   22.8%     8.8%   29.7%    3.8%
Joint     37.0%   28.9%   17.0%    76.4%   37.4%   38.8%    20.1%   32.0%   19.2%
Table 2.1: Experiment results for recovery ratio, calculated using Formula 2.7 and averaged for
areas we studied. (a), (b) and (c) indicate different upgrade scenarios shown in Figure 2.9. For
power-tuning, the greatest gains are in suburban areas; gains for rural and urban areas are lower. In
general, tilt-tuning is not as effective as power-tuning, but the joint approach greatly improves on the
results of power-tuning.
Figure 2.10: Illustration of coverage of tuning neighboring sector in rural area: (a) before taking
down the target sector; (b) after taking down the target sector; (c) after increasing one neighboring
sector’s transmission power by 10dB.
Power-Tuning. Table 2.1 shows that, across all areas in all three markets, Magus is able to recover at least
10% and up to 56% of performance by only tuning power. This is encouraging: from a cellular network
provider perspective, any recovery is beneficial since it means smaller impact on customers.
It is interesting to note in Table 2.1 that the greatest gains are in suburban areas. This was, to us, an
unexpected result. To a large degree, the efficacy of Magus is a function of the spare capacity available
in neighboring sectors, and network capacity planners go to great lengths to place base stations to ensure
adequate coverage. So, we expected that network planning would ensure uniform recovery regardless of
the type of area. Yet, this result suggests that the resulting configurations are, in some areas, more effective
at failure coverage than in other areas.
The reasons for the lower recovery in rural and urban areas are different. In the rural case, the sector sizes
tend to be large. The neighboring sectors are far away, and use up most of the available power to cover their
sectors and are noise limited (the noise in Formula 2.2 becomes significant). In Figure 2.10, after the central
sector is down, coverage cannot be recovered even if we increase the power of the closest neighboring
sector (marked in white in (c)) by 10dB (a 10× increase in power, and such an increment probably already exceeds
the maximum transmission power of that sector). The maximum transmission power limit becomes a
constraint. As a result, in rural cases, it is relatively harder for neighboring sectors to provide good service
to grids served by the target sector.
The urban case is the opposite of the rural case. There are several nearby neighboring sectors with enough
power for their signals to reach the grids affected by the upgrade. However, urban radio networks are
interference-limited, and severe interference to nearby grids limits the tuning potential. In addition, be-
cause sectors interact with many more neighbors, we may need to tune more sectors to get better results.
For example, we may need to carefully tune neighbors of the sectors we are currently tuning. In these
cases, our heuristic may get stuck at a local optimum. We remind the reader that our first objective is to
explore the viability of a model-based mitigation during a planned upgrade (to identify opportunities for
improvement enabled by configuration tuning). A more sophisticated version of Magus (which we have
left to future work) may do better.
In suburban areas, neighboring sectors can reach affected grids, and they are also relatively less
interference-limited. So, there is room to tune neighboring sectors, and Magus indeed tunes more for
suburban area cases.
In the three markets we study, nearly 49% of the areas are suburban, and only 6% are urban.^9 Thus,
Magus, even in its present form, can be extremely effective in recovering performance loss due to a planned
upgrade.
Tilt-Tuning and Joint Tuning. Table 2.1 shows that in general the recovery ratio of simply tuning tilt is not
as good as that of power-tuning. The reason is that tilt-tuning reshapes the angular distribution of radio energy
without increasing total power; it reaches further at the cost of sacrificing nearby areas, and it does not
increase radio signal in the side-lobe and back-lobe areas outside the primary zone of coverage. However,
^9 We categorize areas by looking at the number of base stations the area contains.
by combining the two, we see that the joint approach always performs better than power-tuning and tilt-
tuning individually, improving performance by 2× over power-tuning. Our joint tuning algorithm is fairly
simplistic; more sophisticated approaches might be able to improve recovery further.
From our current results, none of the upgrade scenarios offers consistently higher improvement than
others across all areas. To understand whether one of these upgrade scenarios is statistically better than
others would require evaluating a much larger number of areas, and we have left this to future work.
Benefits of Gradual Tuning. If we change the configuration from C_before to C_after in one step, many
UEs need to hand over to a different sector, and such handovers will happen simultaneously. This can
introduce a significant signaling burden in the cellular network. Moreover, handovers are faster and incur
less overhead when the source and destination sector are both online than when the source sector is taken
offline. Based on these observations, we see value in tuning the configuration setting from C_before to
C_after gradually, to avoid synchronized handovers, and to provide as many UEs as possible the opportunity
of a seamless handover before the sector upgrade.
A less sophisticated approach could also reduce the power of the target sector gradually. But this does
not guarantee the utility metric value, which could go through periods where the metric is even lower than
in the final f(C_after) scenario. Magus has an estimate of f(C_after), so it can ensure that, throughout the
gradual tuning process, the overall utility is never less than f(C_after). Magus uses a greedy strategy in
which it decreases the transmission power of the target sector (the one for which an upgrade is planned)
in small steps well before the planned upgrade time. This will force a small number of UEs to migrate
to neighboring stations. However, we make sure that the utility never goes below f(C_after): if Magus
predicts that it will, it tunes toward C_after a bit to compensate, by either uptilting or increasing the power of
the neighboring sectors. The process terminates either when Magus cannot compensate (in which case it
jumps directly to C_after), or when no UEs are left attached to the target sector.
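One way to read the gradual tuning strategy is as the following loop. It is a sketch under the same hypothetical model/config helpers as before; ues_attached and compensate_toward are assumed helpers, and f_after is the pre-computed utility of C_after.

    def gradual_tuning(model, config, target_sector, config_after, f_after, step_db=1):
        """Step the target sector's power down while keeping utility >= f(C_after)."""
        while model.ues_attached(config, target_sector) > 0:
            candidate = config.bump(target_sector, -step_db)   # small power decrease
            if model.utility(candidate) < f_after:
                # Compensate by tuning toward C_after (uptilt / raise neighbor power).
                candidate = model.compensate_toward(candidate, config_after)
                if candidate is None or model.utility(candidate) < f_after:
                    return config_after     # cannot compensate: jump directly to C_after
            config = candidate
        return config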
Figure 2.11 demonstrates one example of gradual tuning. At each step, UEs re-attach to nearby sectors
(handover) gradually, as opposed to the direct tuning case (Proactive) where all the UEs that need to
handover do it simultaneously at the time of upgrade. Moreover, the utility never goes below f(C_after)
[Plot: utility change and number of handovers vs. tuning steps, comparing Proactive and Proactive Gradual.]
Figure 2.11: Benefits of gradual tuning (Proactive Gradual). Gradual tuning reduces simultaneous
handovers by 3× and offers 99.7% of UEs a seamless handover.
[Plot: utility (×1e4) vs. time for Proactive Model Based, Reactive Model Based, No Tuning, and Reactive Feedback Based.]
Figure 2.12: Comparing speed of convergence across tuning approaches. The Reactive Feedback Based
approach takes much longer than the other approaches.
[Plot: CDF of the improvement ratio.]
Figure 2.13: Improvement ratio of Magus (Algorithm 1) over the naive approach. Magus is better for
81% of scenarios, with maximum improvement ratio 3.87 and average improvement ratio 1.21.
("^" marks the tuning steps at which we tune nearby sectors in Figure 2.11). With our approach, at any step, the
largest number of simultaneous handovers we see is 2457; without it, it is 9827. So, Magus can reduce
simultaneous handovers by 3×, and 99.7% of UEs can do a seamless handover (when the source sector
is still online) instead of a hard handover (when it is offline). Across all our scenarios, Magus reduces
simultaneous handovers by a factor of 8, and enables 96.1% of UEs to achieve seamless handover.
Speed of Convergence. We compare our proactive model-based approach to a reactive feedback approach
(Figure 2.12). We estimate the number of steps the reactive feedback approach needs to converge to the
best configuration setting.
We assume a simple tuning strategy for the reactive feedback approach: it can only tune 1 power-tuning
or tilt-tuning unit of one single neighboring sector in each step. To give benefit to this strategy, we set it up
so that at each step it picks the best configuration (we use our model to determine this). Even under this
idealized scenario, the reactive feedback approach still needs 27 steps to get the best configuration setting,
in one of our upgrade scenarios. In practice, it could be a lot longer: we estimate that a more
realistic number of steps for the same scenario is 310. Taking into account the time to obtain the feedback
(extracting performance measures from the sector), which can be on the order of several minutes, even the
idealized reactive feedback based approach could recover performance only two hours after the start
of the planned upgrade.
Flexibility of Using Different Utility Functions. As discussed in Section 2.5, Magus can use differ-
ent utility functions that specify different objectives. We illustrate this for a suburban area with upgrade
scenario (a), using both the log-sum rate performance utility (Formula 2.6) and the coverage utility (For-
mula 2.5). Table 2.2 shows the recovery ratio for both metrics, and the achieved utilities. When the
Optimization            Recovery Ratio
Utility Function     u_performance   u_coverage
u_performance            66.3%          2.6%
u_coverage              -29.3%         14.4%
Table 2.2: Recovery ratio of one scenario, using different utility functions. Recovery ratio is cal-
culated using Formula 2.7; u_performance and u_coverage refer to the utility functions of Formulas 2.6
and 2.5, respectively. Different utility functions converge to different tuning changes, and Magus can
choose a utility function based on the requirement.
performance utility is used, 63.3% performance is recovered but only 2.6% coverage is recovered. Con-
versely, when the coverage utility is used, 14.4% coverage is recovered, but at the cost of performance.
This illustration indicates that Magus can be used with different utility functions in different areas. In a
rural area, providing coverage might be a priority, but in an urban area, performance might be.
Benefit of Search Heuristic. To evaluate the quality of Magus's ability to detect the best tuning config-
uration given a current configuration, we could have used brute force, but the search space is too large:
even without tilt changes, if we tune only 10 sectors, and for each of them try 10 units of power change,
our search space is 10 billion. For this reason, Magus contains a heuristic search strategy (Algorithm 1,
which is for power-tuning only) that prunes the search space taking the problem structure into account. To
demonstrate that this heuristic is better than simpler approaches, we compare it to a naive approach similar
to the one we use for tilt-tuning: it increases transmission power by 1dB for the first neighbor at each step
until utility worsens, then does the same for the second neighbor and so on.
For the 27 different scenarios we studied, we compare our solution to the naive approach's solution,
and calculate the improvement ratio, which is just (Magus recovery ratio) / (naive recovery ratio). We plot the CDF in Figure 2.13.
Among these 27 scenarios, our algorithm is no worse than the naive approach for 22 scenarios, i.e.,
81% of the scenarios we study. For the 5 scenarios where our algorithm performs worse, we still generate
solutions with similar recovery ratios (the improvement ratio is never below 0.9). For more than 22% of the
scenarios, our solutions are 30% better than the naive approach (improvement ratio greater than 1.3). For the
best case, the improvement ratio is 3.87. Overall, our algorithm is 21% better than the naive approach.
Chapter 3
SHADE: Providing Premium Service for Adaptive High Bandwidth
Applications in Cellular Networks
3.1 Introduction
Cellular networks are designed to provide Quality of Service (QoS) to traffic flows over the radio access
network. For instance, LTE provides guaranteed bitrate (GBR) classes of traffic to guarantee radio re-
sources to meet the requirements of specific applications such as Voice over LTE. However, the GBR class is
only designed for low bitrate applications, and the bulk of IP data flows still use the non-guaranteed bitrate
(NGBR) class of service, without any service quality guarantee. This best-effort service provides poor
service quality when there is radio access network congestion (congestion happens at the base station).
Though there is also backhaul congestion, base station level congestion is becoming more frequent with
high bandwidth applications like adaptive bitrate applications, as the capacity of radio access networks
(determined by the quantity of transmission resources) has lagged behind the demand posed by these ap-
plications. In addition, these applications require better service quality than best-effort services like Web
and email, and this congestion can significantly affect the quality of experience (QoE), e.g., it can lower the
Average Bitrate (average quality), increase the Rebuffering Ratio (percentage of the time spent waiting),
and introduce more Bitrate Switches (number of quality changes). This degraded QoE leads to lower user
engagement and lower revenue [86, 23]. Therefore, there is an urgent need for providing better cellular
service to adaptive bitrate applications during periods of congestion.
In general, there are several orthogonal directions to solve this capacity scarcity issue at the access net-
work. On one hand, we can improve the infrastructure to have more transmission resources, to overcome
the limit on the available spectrum, and thus provide higher capacity, e.g., by building more base stations,
but this solution is expensive and also requires significant time to deploy. On the other hand, we can also
reduce the bandwidth requirement of these applications. However, this is difficult since users expect high
application quality, and application quality is an important QoE metric. Further, we expect future adaptive
bitrate applications, like 360-degree video, to have even more stringent QoE requirements. In this paper,
we explore a different direction that is more feasible and easily deployable: to improve the QoE of these
applications while serving existing users, and without significantly modifying existing infrastructure.
Today's adaptive bitrate application technologies are mainly adaptive bitrate streaming protocols. The
content is broken into chunks, each of which contains several seconds of the original content, and is encoded at a
few different bitrates (qualities). The client-side player monitors the current downlink bandwidth, and uses that
information to request the next chunk at a suitable bitrate. To have good Average Bitrate, Rebuffering
Ratio and Bitrate Switches, intuitively, the application should be provided with high downlink throughput.^1
With a higher downlink throughput, the player will request chunks of higher bitrates (better Average Bitrate),
and accumulate a longer buffer (better Rebuffering Ratio). However, as transmission resources are quite
limited, arbitrarily increasing applications’ throughputs is a sub-optimal solution. By empirically analyz-
ing the relationship between downlink throughput and application QoE, we observe that a stable downlink
throughput maintained at one of the bitrate candidates of the application can provide good performance
on these three QoE metrics with good efficiency on the usage of transmission resources. Using this obser-
vation as the design guideline, in this paper, we focus on improving applications’ downlink throughputs,
and maintaining their throughputs at one of their application bitrate candidates, without noticeably hurting
other applications.
^1 Though there are other QoE metrics, this paper considers the three most important QoE metrics [24].
For scheduling of limited transmission resources, current cellular systems generally allocate transmis-
sion resources among applications in a fair way. To treat adaptive bitrate applications differently, we use a
differentiated service model to enhance the quality of service of applications which are more critical. To
improve and maintain critical applications’ downlink throughput at one of the bitrate candidates, we re-
design the cellular network to be content-aware, where the infrastructure knows which users are adaptive
bitrate application users, and for each adaptive bitrate application, what the bitrate candidates of that
application are. In this paper, we describe a system called SHADE to realize this idea, which Stabilizes appli-
cation’s throughput at a Higher downlink bitrAte to provide better aDaptive bitrate application Experience.
At a high level, SHADE has two challenges. The first challenge is to stabilize a given application’s
downlink throughput at a pre-determined bitrate choice under all kinds of network dynamics, including
changes in the number of cellular users, changes in users' requirements, and more importantly, changes in users' channel
conditions. This requires a modification to the transmission resource scheduler of the cellular system. We
require SHADE to make only minimal modifications for feasible deployment, i.e., we consider a full re-
design of the cellular system relatively infeasible to deploy. In addition, as throughput can be
maintained by sacrificing utilization of transmission resources, SHADE also needs to be work-conserving
for high utilization, like the widely-used Proportional Fair scheduler. Naive reservation based schedulers
can maintain throughput by reserving transmission resources for users, but they are not work-conserving, i.e.,
those reserved resources would be wasted if the owner does not need them, whereas the Proportional Fair scheduler
never wastes resources. To tackle this challenge, SHADE implements a throughput maintenance component
on top of the Proportional Fair scheduler, to keep its work-conserving property, and with minimal changes,
i.e., a weight parameter. By dynamically tuning this weight parameter to react to network dynamics, each
application’s downlink throughput can be maintained.
With the throughput maintenance component in place, the second challenge is to distribute limited
transmission resources among applications efficiently. Concretely, SHADE needs to select a bitrate candi-
date for each critical application and send these selections to the throughput maintenance component;
the selected solution needs to 1) limit the negative impact^2 on non-critical applications and 2) provide
optimal overall QoE to critical applications. SHADE employs a bitrate selection component for this chal-
lenge. The first problem is solved by only using up to an operator-configurable amount of transmission
resources to promote critical applications, to bound the negative impact on non-critical applications. The
second problem, to pick a bitrate for each critical application, is more challenging. Intuitively, SHADE
should simply give more resources to applications with better channel conditions for better overall effi-
ciency. However, because of network dynamics, e.g., changes in users' mobility, requirements, and channel
conditions, SHADE must adapt to these changes, but also satisfy the conflicting goal of minimizing bitrate
selection changes. We find that stabilizing bitrate selections over time is more important than efficiency
and design SHADE’s bitrate selection component accordingly for optimal QoE performance.
Through extensive ns-3 simulations, we show that SHADE can improve Average Bitrate, Rebuffering
Ratio, and Bitrate Switches simultaneously, compared to multiple service baselines, including a strong
competitor that uses the same amount of resources to promote critical applications and also tries to maintain
critical application’s downlink throughput at one of the bitrate candidates. Compared to this baseline,
SHADE achieves a more than 10 times reduction in both Rebuffering Ratio and Bitrate Switches, while also
improving the Average Bitrate by 18%.
We make the following contributions in this paper:

• We make the observation that maintaining throughput at one of the bitrate candidates can provide better QoE
for adaptive bitrate applications (Section 3.3);

• The design and implementation of SHADE (Section 3.4):

  – A throughput maintenance component that maintains each user's downlink throughput at the
targeted value and provides high utilization at the same time, with only minimal changes to
the current scheduler (Section 3.5);

  – A bitrate selection component that provides good performance on three competing adaptive
bitrate application QoE metrics, using only limited resources (Section 3.6);

• An extensive evaluation of SHADE (Section 3.7).

^2 Negative impact in terms of reduced allocated transmission resources.
3.2 Background
SHADE needs to change how transmission resources are allocated in the cellular network to enable better
service to critical applications. Thus, we provide relevant background on the LTE downlink. We also provide
necessary adaptive bitrate application background, and definitions of three key adaptive bitrate application
QoE metrics we consider in this paper.
LTE Downlink. We describe the downlink transmission of the Long-Term Evolution (LTE) cellular technol-
ogy standard, with a focus on the resource scheduler.^3
In the downlink, LTE uses the Orthogonal Frequency Division Multiple Access (OFDMA) scheme. In
the frequency domain, the available bandwidth is divided into orthogonal subcarriers, each of which occupies
a bandwidth of 15 kHz. In the time domain, LTE divides the time duration into frames of 10 milliseconds.
Each frame is then divided into 10 Transmission Time Intervals (TTIs), and each TTI contains two slots.
A set of twelve consecutive subcarriers over the duration of one slot is called a Physical Resource Block
(PRB), which is the basic scheduling unit: OFDMA can assign each PRB to an arbitrary user.
Each LTE user periodically measures the channel condition and reports that back to the base station as
Channel Quality Indicator (CQI) reports. In general, the channel condition relies on Signal-to-Interference-
plus-Noise-Ratio (SINR), where a stronger signal improves the channel condition while a stronger inter-
ference degrades the condition. The CQI report is used in two ways. First, the base station uses it to determine
the Modulation and Coding Scheme (MCS) used by the radio to transmit to this user. Different MCSs
correspond to different downlink rates; the MCS is selected to maximize the data rate under the current channel con-
dition. Second, the LTE scheduler also uses it to schedule PRBs to improve bandwidth efficiency, i.e., each
^3 This paper uses LTE technology as the example, but the idea of our approach can be generalized to other cellular technologies
as well.
PRB should be used by the user with the best CQI over the corresponding subcarriers to achieve higher
system efficiency.
The PRB scheduling algorithm is important to the LTE network, as it must provide good efficiency (through-
put to users) and fairness at the same time. To maximize efficiency, the Maximum Rate [81] algorithm
always allocates a PRB to the user with the best reported downlink CQI. The Maximum Rate algorithm uti-
lizes PRBs in the most efficient way. However, because users with relatively low CQI values would not be
scheduled at all, this algorithm results in poor fairness performance. The Round Robin algorithm [31] is
designed to provide good fairness; it allocates an equal amount of transmission resources to each user.
Because the Round Robin algorithm does not take users' channel conditions into account, its throughput per-
formance (efficiency) can be quite poor.
The most widely deployed scheduler [57, 27], Proportional Fair scheduler [46], provides a balance
between efficiency and fairness. For each PRB and for each user i, the Proportional Fair algorithm calculates
two values: first, user i's achievable rate when using this PRB (denoted by r_i); and second, user i's average
data rate over a time interval in the past (denoted by R_i). This PRB is then assigned to the user with the
highest value of the metric m:

m = argmax_i (r_i / R_i)    (3.1)
Proportional Fair can achieve good efficiency and fairness. r_i ensures efficiency; for fairness, because
of R_i, a user with low r_i will also get scheduled if his r_i is relatively high compared to his past records.
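As a point of reference for the scheduler modification introduced later, a bare-bones Proportional Fair PRB assignment might look like the sketch below. This is our own illustration, not an LTE implementation: the achievable_rate structure and the exponential averaging window are simplifications.

    def proportional_fair_assign(achievable_rate, avg_rate, alpha=0.05):
        """Assign each PRB to the user maximizing r_i / R_i (Equation 3.1).

        achievable_rate: dict {user: {prb: r_i}} for the current TTI.
        avg_rate: dict {user: R_i}, exponentially averaged past throughput.
        Returns {prb: user} and updates avg_rate in place.
        """
        users = list(avg_rate)
        prbs = sorted({p for rates in achievable_rate.values() for p in rates})
        assignment, served = {}, {u: 0.0 for u in users}
        for prb in prbs:
            winner = max(users, key=lambda u: achievable_rate[u].get(prb, 0.0)
                                              / max(avg_rate[u], 1e-9))
            assignment[prb] = winner
            served[winner] += achievable_rate[winner].get(prb, 0.0)
        # Update the moving average R_i for the next scheduling round.
        for u in users:
            avg_rate[u] = (1 - alpha) * avg_rate[u] + alpha * served[u]
        return assignment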
Adaptive Bitrate Application Streaming. Today’s adaptive bitrate application streaming technologies
are mainly HTTP based adaptive streaming protocols. At the server-side, application content is encoded at
a few different bitrates (usually 5-6 of them [6]). Each bitrate version is then broken into multiple chunks,
each of which contains several seconds of the content. Chunks of different bitrates are aligned so that the player
can smoothly switch to a different bitrate at the chunk granularity.
At the client, an adaptive bitrate algorithm (ABR) is implemented to provide good adaptive bitrate
application QoE to users. The ABR looks at the recent available bandwidth^4 and the buffered playback to
determine a suitable bitrate for the next chunk to request. A higher bandwidth makes the download faster
and thus allows the ABR to request a chunk with a higher bitrate. Similarly, with a larger buffer, the next
chunk becomes less urgent and the ABR can potentially request a higher bitrate chunk. In this paper, we as-
sume that the user's available bandwidth is bottlenecked by the cellular downlink throughput, as the capacity
of the cellular network is relatively limited compared to the backbone network. Therefore, a user's down-
link throughput always indicates his available bandwidth. In addition, a user's available bandwidth can be
controlled by tuning his downlink throughput.
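The ABR behavior described here can be summarized by a simple throughput-and-buffer rule; the sketch below is a generic illustration, not any specific player's logic, and the thresholds such as safety and min_buffer_s are our own placeholders.

    def pick_next_bitrate(bitrate_candidates_kbps, measured_throughput_kbps,
                          buffer_s, safety=0.9, min_buffer_s=10.0):
        """Pick the highest bitrate that the recent throughput (and buffer) can sustain."""
        budget = measured_throughput_kbps * safety
        if buffer_s < min_buffer_s:
            budget *= 0.5    # be conservative when the buffer is nearly empty
        feasible = [b for b in sorted(bitrate_candidates_kbps) if b <= budget]
        return feasible[-1] if feasible else min(bitrate_candidates_kbps)

    # Example: with ~2700 Kb/s measured throughput and a healthy buffer,
    # the player sticks to 2400Kbps chunks.
    print(pick_next_bitrate([350, 700, 1200, 2400, 4800], 2700, buffer_s=20))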
There are multiple important adaptive bitrate application QoE metrics; in this paper, we consider the three
most important ones [24]: Average Bitrate, Rebuffering Ratio and Bitrate Switches. Average Bitrate is
the time average of the bitrates played for this content. The higher, the better, as better quality is prefer-
able. Rebuffering Ratio is computed as the ratio of time spent waiting (rebuffering) over time spent watching
smoothly. A high Rebuffering Ratio significantly degrades QoE. Bitrate Switches counts the number of
quality (bitrate) changes. Fewer switches are preferred.
3.3 Providing Better Video QoE
In this section, we identify two challenges of providing good video QoE in cellular networks. We then
describe our approach to address these challenges.
Low Downlink Throughput. ABR algorithms are designed to optimize QoE. In cellular networks, how-
ever, a user's QoE is still mainly dominated by the downlink throughput: when the downlink throughput is
limited, QoE can be bad even with a good ABR algorithm. To illustrate this, in Figure 3.1, we conduct ex-
periments to study the QoE performance (Y-axis) at different downlink throughputs (X-axis).^5

^4 Generally speaking, this is done by directly measuring the throughput of downloading the previous chunk. But potentially other
information can be helpful, e.g., the channel quality.
^5 We use an adaptive bitrate application streaming simulator to study this relationship, which will be described in detail in Sec-
tion 3.7.1.
[Figure panels: (a) Average Bitrate (Kbps), (b) Rebuffering Ratio (%), (c) Bitrate Switches, each plotted
against Downlink Throughput (Kb/s), with the video bitrate candidates marked.]
Figure 3.1: Three QoE metrics (Average Bitrate, Rebuffering Ratio, Bitrate Switches) vs. downlink
throughput of watching a content that has five bitrate candidates: 350, 700, 1200, 2400, 4800Kbps.
We see that: (a) in general, higher downlink throughput provides better Average Bitrate, and we see
a clear "staircase" shape, with the height of each stair corresponding to one bitrate candidate; (b)
higher downlink throughput leads to lower Rebuffering Ratio; (c) using a downlink throughput
that is close to one of the bitrate candidates can significantly reduce the number of Bitrate Switches.
Figure 3.1a
shows that the Average Bitrate increases with the downlink throughput; Figure 3.1b shows that the Re-
buffering Ratio can be reduced by increasing the downlink throughput. With a low downlink throughput,
a user will have both poor Average Bitrate and poor Rebuffering Ratio performance.
This low downlink throughput issue is the main challenge in supporting good QoE in cellular networks,
as cellular users frequently experience low downlink throughput. Traffic often overloads the base station,
resulting in degraded downlink throughput. Another reason is wireless channel variabil-
ity: a user might experience bad channel quality, e.g., when his serving base station is far away, or he is
experiencing strong interference. This can lead to low downlink throughput even if the base station is not
overloaded. It is important to improve users' downlink throughput to provide better QoE.
Unstable Downlink Throughput. Besides the issue of low downlink throughput, another problem is that
an arbitrary downlink throughput can lead to a large number of Bitrate Switches. Figure 3.1c illustrates Bi-
trate Switches vs. downlink throughput, with the vertical dotted lines being the bitrate candidates of the
content. We see that the downlink throughput values that provide the best (lowest) Bitrate Switches perfor-
mance are the bitrate candidates (350, 700, 1200, 2400 and 4800Kbps), and a throughput between two bitrate
candidates can lead to many Bitrate Switches (e.g., 1600Kbps). The intuition behind this observation is
that, because the ABR algorithm monitors downlink throughput to determine the bitrate level of requested
chunks, a throughput at one of the bitrate candidates makes the ABR algorithm stick to chunks of this
bitrate level. On the contrary, a throughput between two bitrate candidates makes the ABR algorithm pick
chunks of both of these bitrate levels and leads to many Bitrate Switches. Note that this switching effect
would not be alleviated by introducing more bitrates. In reality, due to downlink throughput variation, we
expect this issue to be even worse. Therefore, to have good Bitrate Switches performance, a user's downlink
throughput should be maintained at one of the bitrate candidates.
There is an additional benefit of doing that for Average Bitrate. Figure 3.1a shows a clear “staircase”
pattern (e.g., from 2400 to 4800Kbps) between adjacent candidate bitrates: starting from a bitrate candidate
(e.g., 2400Kbps), the Average Bitrate would not be improved much by increasing the downlink throughput
(e.g., from 2400 to 4000Kbps), unless the increment is significant enough to make the downlink throughput
Provider    Bitrate Candidates (Kbps)
Youtube     350, 670, 1200, 2400, 4400
Pub-E       340, 640, 1200, 1600, 2300, 3400, 5000
Pub-V       220, 360, 560, 660, 960, 1400, 2600, 3400, 4400, 5400

Table 3.1: Common bitrate candidates of different content providers.
close to the next bitrate level (4800Kbps). This means that, when it is expensive to increase one user’s
downlink throughput, we should either keep this user’s current downlink throughput (2400Kbps), or boost
it to the next level (4800Kbps), but never increase it to a value between these two bitrate levels.
Implications for Design. The above observations illustrate that, to support good QoE, cellular service providers
should maintain adaptive bitrate application users' downlink throughput at one of the content bitrate can-
didates (note that it is still up to the application to control the actual bitrate). To achieve that, cellular
service providers have to know these users' bitrate candidates in advance. This information is available:
it can be obtained by parsing the application manifest.^6 In fact, application content providers generally
use common bitrates across their contents, as the best bitrate candidates for mobile users and the gaps between
adjacent bitrates are recommended by HTTP Live Streaming (HLS) [6]. This makes parsing the man-
ifest unnecessary. In Table 3.1, we crawl videos from three providers and get common bitrates for each
of them. For example, Youtube provides 5 common bitrates to mobile users, from 350Kbps to 4400Kbps.
Therefore, instead of actually parsing the manifest, a simpler way of getting bitrate candidates is to recog-
nize the content provider and then retrieve pre-learned common bitrate candidates for this provider. In this
work, we assume this bitrate information is available to cellular service providers using the aforementioned
methods. Based on that, we re-design the cellular network to be content-aware. For the rest of the paper,
we assume the cellular infrastructure knows which users are adaptive bitrate application users, and what
the bitrate candidates are for each adaptive bitrate application user.
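A content-aware lookup of this kind can be as simple as the table-driven sketch below. Provider detection by hostname is our own simplification (not SHADE's mechanism); the bitrate lists mirror Table 3.1.

    # Pre-learned common bitrate candidates (Kbps), as in Table 3.1.
    COMMON_BITRATES_KBPS = {
        "youtube": [350, 670, 1200, 2400, 4400],
        "pub-e":   [340, 640, 1200, 1600, 2300, 3400, 5000],
        "pub-v":   [220, 360, 560, 660, 960, 1400, 2600, 3400, 4400, 5400],
    }

    def bitrate_candidates_for(hostname):
        """Return pre-learned bitrate candidates for a recognized content provider."""
        for provider, bitrates in COMMON_BITRATES_KBPS.items():
            if provider in hostname.lower():
                return bitrates
        return None   # unknown provider: fall back to parsing the manifest

    print(bitrate_candidates_for("www.youtube.com"))   # hypothetical request hostname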
Figure 3.2: SHADE's work flow. For each prospective user, SHADE's admission control module deter-
mines whether it is feasible to provide premium video service to this new user. If so, for each admitted
premium user, the bitrate selection component (Section 3.6) selects a bitrate from this user's video bi-
trate candidates. These selected bitrates are then passed to the throughput maintenance component
(Section 3.5). Selected bitrates need to be updated periodically to adapt to network dynamics.
3.4 SHADE: A Premium Video Service
Based on the above discussion, we design a better service for cellular service providers, called SHADE,
which is a differentiated service and Stabilizes critical applications' downlink throughput at a Higher rAte
for better aDaptive bitrate application Experience (Figure 3.2).^7 When an application requests a content,
the cellular service provider asks the user whether she wants to use the better service. If so, an admission control
module uses the current workload to determine whether this new application can be admitted as a new critical
application. If admitted, SHADE strives to maintain this application's downlink throughput at one of the
bitrate candidates of the content. SHADE has two main challenges:

• Maintain downlink throughput: SHADE needs to modify the scheduler of cellular systems to main-
tain the downlink throughput for critical applications, a completely new requirement for cellular
systems. In addition, to enable faster deployment, we design SHADE to only use minimal changes
to the current cellular system.

^6 In general, when the client makes the request, the first response is the manifest file, which describes available bitrates to allow
the client to pick one of them.
^7 The name SHADE also means to provide coolness for "hot" (overloaded) base stations.
Select bitrate target: SHADE needs to select a bitrate target for each critical application, to provide good QoE (high Average Bitrate, low Rebuffering Ratio, and few Bitrate Switches) to all the critical applications. SHADE needs to achieve this using limited transmission resources, and without noticeably degrading the performance of non-critical applications.
We provide an overview of SHADE’s solutions to these two challenges below.
3.4.1 Maintain Downlink Throughput
Traditionally, application’s downlink throughput can be maintained by using a reservation based approach,
like the Guaranteed Bit Rate (GBR) traffic class in LTE. This approach reserves transmission resources
for each application to guarantee a low throughput. Using this technique to maintain critical application’s
downlink throughput has two issues. First, it is not work-conserving. Reserved resources cannot be used
by others when the owner does not need them. Thus, this approach is not suitable to maintain a high
throughput, because reserved but not used resources can introduce too much inefficiency to the cellular
system. Second, it requires a major change to the hardware of cellular networks, which is hard to deploy.
To simplify deployment and maintain backward compatibility, we design SHADE to minimize the changes
to current cellular system.
SHADE builds upon the existing, widely used Proportional Fair scheduler. Proportional Fair is a sharing-based (not reservation-based), and thus work-conserving, scheduler. By building on top of Proportional Fair, SHADE keeps the scheduler's important high-efficiency property. Potentially, there are multiple modification methods that can achieve the same goal. However, to ease deployment, SHADE only applies a small change, i.e., a weight parameter W to the Proportional Fair metric of Equation 3.1. More importantly, this change is doable via a software parameter reconfiguration:

    m = \arg\max_i \; W_i \, \frac{r_i}{R_i}        (3.2)
Section 3.5 describes in detail how SHADE uses this weight parameter to maintain the downlink throughput. Here we provide a high-level intuition. From the scheduler's point of view, a critical application with weight W is equivalent to W identical applications with weight 1. Thus, by applying W, one application can get roughly W times the transmission resources (PRBs) (see Section 3.5 for a detailed analysis). This provides a way to control the number of PRBs assigned to this application. Then, to maintain the application's downlink throughput, SHADE dynamically adapts the number of PRBs needed to achieve its targeted throughput.
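As a rough illustration of the weighted metric, the Python sketch below applies Equation 3.2 to a single per-PRB scheduling decision. It is a minimal sketch under simplified assumptions (one PRB per decision, an exponential moving average for the long-term rate R_i); it is not the ns-3 scheduler that SHADE actually modifies, and all names and rates are illustrative.

    from dataclasses import dataclass

    @dataclass
    class User:
        name: str
        weight: float          # W_i; 1.0 for non-critical applications
        avg_rate: float = 1.0  # R_i, long-term average rate (bits per slot)

    def schedule_prb(users, inst_rate, beta=0.05):
        """Pick the user for one PRB: argmax_i W_i * r_i / R_i (Equation 3.2),
        then update every user's average rate with an exponential moving average."""
        chosen = max(users, key=lambda u: u.weight * inst_rate[u.name] / u.avg_rate)
        for u in users:
            served = inst_rate[u.name] if u is chosen else 0.0
            u.avg_rate = (1 - beta) * u.avg_rate + beta * served
        return chosen

    # Example: a critical application with weight 4 competing with two weight-1 users.
    users = [User("critical", 4.0), User("a", 1.0), User("b", 1.0)]
    rates = {"critical": 2000.0, "a": 2500.0, "b": 1800.0}  # achievable bits per PRB
    print(schedule_prb(users, rates).name)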
3.4.2 Select Bitrate Target
In addition to a mechanism for maintaining applications' throughput, we need a policy for selecting the targeted bitrate for each critical application. This problem is challenging because only a limited number of PRBs are available to SHADE. SHADE cannot simply maintain each critical application's throughput at the highest quality, as this may not be feasible. More importantly, supporting the highest quality for critical applications would significantly degrade non-critical applications' service quality. Thus, SHADE has two separate challenges: the first is to bound the negative impact on non-critical applications; the second is to provide good overall QoE to critical applications.
To bound the negative impact on non-critical applications, SHADE limits the aggregate resources that can be used by critical applications. This upper bound on critical resources effectively limits the impact on non-critical applications.
The second challenge, which is more important, is how to select a bitrate for each critical application so as to provide the best QoE to critical applications using limited resources. SHADE should perform well on all three competing QoE metrics: we want to achieve high Average Bitrate, low Rebuffering Ratio, and few Bitrate Switches at the same time. This is challenging because improving one QoE metric normally degrades the other two. In addition, SHADE has to adapt to network dynamics, e.g., changes in user mobility, requirements, and channel conditions, and frequently update its bitrate selection choices. This introduces the further challenge of stabilizing bitrate selection choices over time. In our experiments, we find that stabilizing the selected bitrates over time is a must for good QoE performance. Moreover, as a differentiated service, when feasible, SHADE should maintain at least minimum service (the minimum content bitrate) for all critical applications. To make this provision feasible most of the time, SHADE needs a separate admission control module, which must accept the right number of applications as critical applications. We do not focus on proposing the best admission control scheme in this paper; instead, we explore admission control choices with different degrees of conservativeness.
We describe SHADE’s solution to this bitrate selection problem in Section 3.6.
3.5 Maintaining Downlink Throughput
In this section, we describe how SHADE maintains the downlink throughput for critical applications by just applying a weight parameter to the Proportional Fair scheduler. We first show that the resource allocation behavior of the Proportional Fair scheduler is predictable, which offers a way to control the allocation (Section 3.5.1). Based on that, we discuss how to achieve a targeted throughput for a single critical application at a time instant (Section 3.5.2). With this basic technique, we then need to handle two challenges: 1) how SHADE adapts to network dynamics and thus maintains the throughput over time (Section 3.5.3); and 2) how to deal with multiple critical applications (Section 3.5.4).
3.5.1 Property of Fair Allocation
The Proportional Fair scheduler is designed to provide both good efficiency and fairness. Our simulation experiments show that the Proportional Fair algorithm can achieve up to 1.8x the average throughput of the Round Robin algorithm, indicating its good efficiency (see Section 3.7.1 for more details). These efficiency benefits over Round Robin are achieved by trading off fairness. However, perhaps surprisingly, Proportional Fair can achieve excellent fairness if we observe it over a longer duration (e.g., 1 second). Table 3.2 shows two different fairness metrics on the number of PRBs allocated to all the applications. Although its fairness performance is relatively poor when the time interval is small (< 1 second), it achieves excellent fairness over larger time intervals (e.g., 1 second), with a Jain Fairness Index [45] of 0.998 or higher.

Interval (s)    0.1         0.5         1           5
Jain Index      0.895       0.993       0.998       1.000
Mean/Std        37.0/12.6   184.8/15.7  369.9/17.2  1497.1/36.4

Table 3.2: Two fairness metrics on the number of PRBs allocated to all the users by the Proportional Fair scheduler, for different time intervals. The Proportional Fair scheduler achieves good fairness when the time interval is 1 second or longer.
Due to this fairness property, the Proportional Fair scheduler's PRB allocation over a long enough time interval (e.g., 1 second) is predictable: each application simply gets the fair share, 1/N x 100% of the total PRBs, where N is the number of applications. Because an application with weight parameter W is equivalent to W applications, the scheduler allocates roughly W x (1/N) x 100% of the total PRBs to it. Since there are in total N + W - 1 (rather than N) equivalent applications, this percentage is, more precisely:

    \frac{W}{N + W - 1} \times 100\%        (3.3)

Because the total number of PRBs is a known constant to cellular service providers, we can tune W in Equation 3.3 to control the number of PRBs assigned to an application.
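The sketch below is a small sanity check of Equation 3.3 and its inverse (the weight needed to obtain a desired PRB fraction); the numeric values are made up for the example.

    def prb_share(W, N):
        """Fraction of total PRBs granted to a weight-W application (Equation 3.3)."""
        return W / (N + W - 1)

    def weight_for_share(share, N):
        """Invert Equation 3.3: the weight needed to obtain a given PRB fraction."""
        assert 0 < share < 1
        return share * (N - 1) / (1 - share)

    N = 20
    W = weight_for_share(0.25, N)                     # want 25% of PRBs among 20 apps
    print(round(W, 2), round(prb_share(W, N), 2))     # ~6.33, 0.25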
3.5.2 Achieve a Targeted Throughput
With this technique for controlling the PRB allocation, we next explain how to achieve one critical application's targeted throughput at a time instant, in two steps:

SHADE first determines the number of PRBs required to achieve the targeted throughput at that time instant (Section 3.5.2.1).

SHADE then determines the appropriate weight parameter to obtain the calculated number of PRBs (Section 3.5.2.2).
3.5.2.1 Calculate Required Number of PRBs
SHADE first calculates, under the current configuration, the achievable throughput per PRB for a given critical application; it then determines how many PRBs the application needs to achieve the targeted throughput. A user's achievable throughput per PRB depends on her channel condition, which is measured by the Modulation and Coding Scheme (MCS) Index (denoted by I_MCS). There is a one-to-one mapping (function f) from an application's channel condition (I_MCS) to the achievable throughput per PRB (A). This mapping is given by the Transport Block Size Index Table and the Transport Block Size Table (Table 7.1.7.1-1 and Table 7.1.7.2.1-1 of [10]):

    A = f(I_{MCS})        (3.4)

Then, given the targeted throughput (denoted by T) of this critical application, we can determine her required number of PRBs (denoted by P) by:

    P = \frac{T}{A} = \frac{T}{f(I_{MCS})}        (3.5)
With Equations 3.4 and 3.5, the remaining challenge is how to estimate the application's MCS Index (I_MCS in Equation 3.4). SHADE uses the critical application's past MCS Indices for estimation. A naive estimate would average her MCS Indices over all recent PRBs. However, we find that this significantly underestimates her MCS Index. The reason is that only a small portion of the PRBs will be allocated to this critical application, and the MCS Indices of these allocated PRBs are much higher than those of the remaining PRBs, as the Proportional Fair scheduler generally allocates PRBs with higher MCS Indices to the application (which is also why Proportional Fair achieves good efficiency). This means that SHADE should instead estimate the MCS Index only over PRBs that have been allocated to this critical application. Our evaluation in Section 3.7.3.1 shows that this estimation approach works well for throughput maintenance.
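The sketch below illustrates this step, assuming a simplified stand-in for f(I_MCS); the per-PRB rates in F_MCS are illustrative placeholders, not the 3GPP transport block size tables cited above.

    F_MCS = {5: 1000, 10: 2200, 15: 3600, 20: 5000, 25: 6200}  # bits per PRB allocation

    def estimate_mcs(scheduled_prb_mcs):
        """Estimate I_MCS from PRBs actually scheduled to the premium user in the previous
        interval (averaging over all PRBs would underestimate it, as discussed above)."""
        return round(sum(scheduled_prb_mcs) / len(scheduled_prb_mcs))

    def required_prbs(target_bps, i_mcs):
        """Equation 3.5: P = T / f(I_MCS), rounded up to whole PRB allocations."""
        return -(-target_bps // F_MCS[i_mcs])   # ceiling division

    # Example: last interval's scheduled PRBs carried MCS Indices near 20; target 1200 Kbps.
    i_mcs = estimate_mcs([20, 19, 21, 20])
    print(required_prbs(1_200_000, i_mcs))      # PRB allocations per second for the target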
3.5.2.2 Determine Weight to Obtain Certain PRBs
The next task is to calculate this critical application's weight parameter W so that she is assigned P PRBs. Recalling from Equation 3.3 that a weight of W obtains a fraction W/(N + W - 1) of the total number of PRBs (denoted by M), we have:

    P = M \, \frac{W}{N + W - 1} \quad \text{and thus} \quad W = \frac{P(N-1)}{M - P}        (3.6)
Equation 3.6 assumes that all applications are backlogged, i.e., they require more resources than they get, and thus resources are fairly divided among these backlogged applications. In reality, however, there will be non-backlogged applications that require less than the fair share. Because the scheduler only allocates PRBs to applications that have outstanding demand, Equation 3.6 should be adjusted for non-backlogged applications.
The adjustment of Equation 3.6 is based on the fact that when there are non-backlogged applications, the scheduler only grants them their required PRBs instead of the fair share, and then divides the rest of the PRBs uniformly among the backlogged applications. Using Q(x) to denote non-backlogged application x's required number of PRBs, we replace Equation 3.6's M and N by M' and N':

    M' = M - \sum_{x \in \text{Non-Backlogged Users}} Q(x)        (3.7)

    N' = |\text{Backlogged Users}|        (3.8)
Non-backlogged applications' PRB requirements Q(x) are a new required input. To provide this information, SHADE's scheduler records the number of PRBs allocated to each application in previous epochs and uses that as an estimate of Q(·). Equation 3.7 also needs to know whether a given application x is non-backlogged or backlogged. By definition, if Q(x) is smaller than the fair share F, then x is non-backlogged. At the very beginning, the fair share is simply F = M/N. After SHADE labels some applications as non-backlogged, F should be updated according to the changes in the total PRBs M (which becomes M') and the number of backlogged applications N (which becomes N'). This updated F will be higher, because the non-backlogged applications free up some PRBs, which may in turn label more applications as non-backlogged. This iterative process keeps updating M', N', and F until no more applications are newly labeled non-backlogged (N' has converged).

Algorithm 2: Calculating the weight parameter for a premium user to achieve his targeted downlink throughput.
  Data: M, N, T, I_MCS, Q(·)
  Result: W
  Update-Weight():
      /* Label users to obtain M' and N' (initially M' = M and N' = N). */
      while N' has not converged do
          F = M' / N'
          M' = M;  N' = N
          for x in AllUsers do
              if Q(x) < F then              /* non-backlogged */
                  M' = M' - Q(x)
                  N' = N' - 1
      /* Calculate the weight. */
      P = T / f(I_MCS)
      W = P (N' - 1) / (M' - P)
In summary, Algorithm 2 illustrates the process of determining one critical application's weight parameter to achieve the targeted throughput.
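Below is a runnable Python sketch of Algorithm 2, under simplifying assumptions: f_mcs() stands in for the 3GPP transport-block-size tables, Q maps each user to its estimated PRB demand over the interval, and all names and numbers are illustrative.

    def effective_resources(M, N, Q):
        """Iteratively label non-backlogged users (demand below the fair share) and
        return (M', N') as in Equations 3.7 and 3.8; initially M' = M and N' = N."""
        M_eff, N_eff = M, N
        while True:
            fair_share = M_eff / N_eff
            M_new, N_new = M, N
            for demand in Q.values():
                if demand < fair_share:      # non-backlogged user
                    M_new -= demand
                    N_new -= 1
            if N_new == N_eff:               # no newly labeled users: converged
                return M_eff, N_eff
            M_eff, N_eff = M_new, N_new

    def update_weight(M, N, T, i_mcs, Q, f_mcs):
        """Weight a premium user needs to reach target throughput T (Algorithm 2)."""
        M_eff, N_eff = effective_resources(M, N, Q)
        P = T / f_mcs(i_mcs)                 # required PRBs, Equation 3.5
        return P * (N_eff - 1) / (M_eff - P) # Equation 3.6 with M', N'

    # Example with made-up numbers: 1000 PRBs per interval, 10 users, 1200 Kbps target.
    demand = {f"u{i}": d for i, d in
              enumerate([20, 30, 150, 200, 180, 160, 170, 40, 90, 140])}
    print(round(update_weight(1000, 10, 1_200_000, 20, demand, lambda m: 5000), 2))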
3.5.3 Throughput Maintenance: Tracking Network Dynamics
Algorithm 2 achieves a targeted throughput at a time instant; that throughput then needs to be maintained over time. So far we have assumed that there are no dynamics in the network. In reality, however, users join and leave the network frequently (changing N), their requirements are not constant (changing Q(·)), and channel conditions vary (changing I_MCS). The weight parameter therefore needs to be updated periodically to adapt to these network dynamics and maintain the throughput over time.
Intuitively, we could use a reactive updating approach, where an update happens whenever an input change is detected. For SHADE, we instead choose a proactive approach where updates happen periodically, for two reasons. The first is that some input data need a longer time to estimate accurately, e.g., Q(·), and thus should only be updated periodically. Concretely, for estimates that rely on the assumption of fair PRB allocation, SHADE has to observe the system for a long enough time interval to make the estimate accurate. For example, we count one user's PRBs to estimate her requirement Q(·); this relies on the assumption that the scheduler allocates PRBs fairly among users, which does not hold over short time intervals. The second reason for updating periodically is to limit the update frequency, and hence the complexity SHADE introduces at the base station. Dynamics occur at extremely high frequencies, e.g., channel conditions change all the time, so it is infeasible to update on every single change. It is also inefficient, because many changes do not lead to a significant change in the weight parameter.

Algorithm 3: Using different time intervals to update the MCS Index of the premium user and the PRB allocation of all the users.
  Update-MCS-Index():
      while True do
          Sleep(MCS-Interval)
          update I_MCS(x) for all x in PremiumUsers
          Update-Weight()
  Update-Requirements():
      while True do
          Sleep(Requirement-Interval)
          update Q(x) for all x in AllUsers; and M, N
          Update-Weight()
Therefore, in Algorithm 3, SHADE uses two different update time intervals, MCS-Interval and Requirement-Interval, to update I_MCS and Q(·) (Q(·) also implies M and N), respectively. Intuitively, we want a short MCS-Interval to capture channel condition variations quickly, but a long Requirement-Interval to accurately estimate users' requirements, since that estimate relies on the fair allocation assumption. The effects of using different interval values are evaluated in Sections 3.7.3.1 and 3.7.3.2. Note that updates at both the MCS-Interval and the Requirement-Interval trigger Algorithm 2, i.e., an update of the weight parameter.
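As a rough illustration of Algorithm 3's two update loops, the sketch below runs the two refreshes at different periods. The scheduler object and its refresh_*/update_weights hooks are hypothetical, and the interval values merely echo the choices evaluated in Section 3.7.3; this is not the base station implementation.

    import threading, time

    class WeightUpdater:
        """Runs Algorithm 3's two update loops at different periods."""
        def __init__(self, scheduler, mcs_interval=0.1, req_interval=1.0):
            self.s = scheduler                  # exposes the hypothetical hooks below
            self.mcs_interval = mcs_interval
            self.req_interval = req_interval
            self.running = True

        def mcs_loop(self):
            while self.running:
                time.sleep(self.mcs_interval)
                self.s.refresh_mcs_indices()    # per premium user (hypothetical hook)
                self.s.update_weights()         # rerun Algorithm 2

        def requirement_loop(self):
            while self.running:
                time.sleep(self.req_interval)
                self.s.refresh_prb_counts()     # Q(x), M, N (hypothetical hook)
                self.s.update_weights()

        def start(self):
            for loop in (self.mcs_loop, self.requirement_loop):
                threading.Thread(target=loop, daemon=True).start()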
3.5.4 Supporting Multiple Critical Applications
The above calculation assumes there is only one critical application. Next, we extend SHADE to the case of multiple critical applications, where SHADE has to determine multiple weight parameters. SHADE cannot simply use Algorithm 2 to determine each weight one by one, because the weights depend on each other: the calculation of the required number of PRBs of the first critical application, P_1, relies on the other critical applications' weights W_2, W_3, ..., since Equation 3.6 becomes:

    P_1 = M \, \frac{W_1}{N + (W_1 - 1) + (W_2 - 1) + (W_3 - 1) + \cdots}        (3.9)
SHADE solves this dependency problem by aggregating the critical applications' required PRBs and viewing them as one aggregated, virtual critical application. SHADE first determines this virtual critical application's weight using Algorithm 2, and then distributes this weight among the critical applications proportionally to their PRB requirements. Concretely, assuming there are S critical applications, SHADE creates a virtual user with:

    P_{virtual} = P_1 + P_2 + P_3 + \cdots + P_S        (3.10)

    N = N - S + 1        (3.11)

After Algorithm 2 returns the weight parameter W_{virtual}, SHADE calculates the i-th critical application's weight as:

    W_i = W_{virtual} \, \frac{P_i}{P_{virtual}}        (3.12)
SHADE uses this method to extend Algorithm 2 to multiple critical applications.
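The sketch below illustrates the virtual-user aggregation (Equations 3.10-3.12), reusing the effective_resources() helper and the f_mcs placeholder from the Algorithm 2 sketch above. Treating the demand dictionary as covering only the non-premium users is an assumption of this illustration.

    def weights_for_premium_users(M, N, targets, i_mcs, Q, f_mcs):
        """targets / i_mcs: per-premium-user target throughput and MCS Index estimate.
        Q: estimated PRB demand of the non-premium users (an assumption of this sketch)."""
        # Per-user PRB requirements (Equation 3.5) and their aggregate (Equation 3.10).
        P = {u: targets[u] / f_mcs(i_mcs[u]) for u in targets}
        P_virtual = sum(P.values())
        S = len(targets)
        # One virtual user among N - S + 1 users (Equation 3.11); label non-backlogged
        # users exactly as in Algorithm 2 to obtain M' and N'.
        M_eff, N_eff = effective_resources(M, N - S + 1, Q)
        W_virtual = P_virtual * (N_eff - 1) / (M_eff - P_virtual)   # Equation 3.6
        # Distribute the virtual weight proportionally to PRB needs (Equation 3.12).
        return {u: W_virtual * P[u] / P_virtual for u in P}

    # Example: two premium users with different targets and channel conditions.
    background = {f"bg{i}": d for i, d in enumerate([20, 30, 150, 200, 180, 160, 170, 40])}
    targets = {"alice": 1_200_000, "bob": 800_000}
    mcs = {"alice": 20, "bob": 15}
    print(weights_for_premium_users(1000, 10, targets, mcs, background,
                                    lambda m: {15: 3600, 20: 5000}[m]))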
3.6 Selecting Bitrate Target
In this section, we discuss how SHADE selects the bitrate target for each critical application. The two challenges in bitrate selection are how to limit the negative impact on non-critical applications, and how to provide good application QoE to critical applications using limited resources.
3.6.1 Limit Impact on Non-Critical Applications
To limit the negative impact on non-critical applications, SHADE limits the resources (i.e., PRBs) that can be used to promote service to a fraction p of the total PRBs. By doing so, we limit the negative impact on non-critical applications to be strictly less than p: assuming there are N non-critical applications, without this service each non-critical application would get 1/N of the total PRBs; with our service, this number drops to (1 - p)/N (footnote 9).
Note that, although we separate PRBs for critical applications (p x 100%) and non-critical applications ((1 - p) x 100%), SHADE is a work-conserving scheduler: when one category of applications in aggregate requires less than its PRB quota, those PRBs are not wasted; they can be used by the other category. SHADE is work-conserving because it is based on a non-reservation-based scheduler, Proportional Fair, which always allocates PRBs to applications that need them.
3.6.2 Distribute Resources Among Critical Applications
Admission Control. Because SHADE can only use a limited number of PRBs, we need to limit the number of critical applications in order to provide them reasonably good application QoE. Otherwise, SHADE would have to frequently downgrade admitted critical applications to the non-critical class. In this paper, we assume the simplest pricing model, where all critical applications pay the same price. We then design a simple admission control component, which makes an accept/reject decision for each prospective critical application: SHADE only accepts an application if a bitrate R_AC can be provided to all critical applications (including the new application). The conservativeness of this admission control can be tuned by varying R_AC: using the minimum bitrate option R_1 as R_AC admits applications aggressively, while using a higher R_AC (e.g., R_2, R_3, etc.) is more conservative. This paper does not focus on designing the best admission control algorithm, though the advantages and disadvantages of different degrees of conservativeness are evaluated in Section 3.7.

Footnote 9: Without this service, our estimate of each non-critical application's share, 1/N, is not exact; the actual share would be smaller, because the would-be critical applications are then non-critical applications and also consume PRBs. This makes the negative impact strictly less than p.
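The sketch below shows one possible form of this admission check (not the full SHADE admission controller): admit a new premium user only if bitrate R_AC can be sustained for every premium user within the premium PRB budget p*M. f_mcs() is the same placeholder per-PRB rate mapping as in earlier sketches, and all numbers are illustrative.

    def can_admit(new_user_mcs, premium_mcs, R_AC, M, p, f_mcs):
        """premium_mcs: MCS Index estimates of already admitted premium users;
        M: PRB allocations per second available at the sector (illustrative unit)."""
        all_mcs = list(premium_mcs) + [new_user_mcs]
        prbs_needed = sum(R_AC / f_mcs(m) for m in all_mcs)   # Equation 3.5 per user
        return prbs_needed <= p * M

    # Example: R_AC = R_1 = 350 Kbps, a 10% premium budget, 100,000 PRBs per second.
    f_mcs = lambda m: {10: 2200, 15: 3600, 20: 5000}[m]
    print(can_admit(15, [20, 20, 10, 15], 350_000, 100_000, 0.10, f_mcs))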
Bitrate Selection. Next, SHADE selects one of the content's candidate bitrates for each admitted critical application, using in aggregate only a fraction p of the total PRBs, so as to give critical applications good Average Bitrate, Rebuffering Ratio, and Bitrate Switches. In general, a higher bitrate leads to better application QoE (Figure 3.1). Thus, intuitively, SHADE should strive to provide the highest overall bitrates to critical applications. This intuition suggests an optimization problem that maximizes some metric of overall bitrate under the PRB-limit constraint. For example, Avis [30] maximizes the sum of the logarithms of the critical applications' bitrates. However, through evaluation, we find that there are more important factors than simply maximizing overall bitrate.
First, as a better service, SHADE should strive to provide at least the minimum bitrate R_1 to all admitted critical applications; in other words, SHADE should minimize the duration for which an application is downgraded to the non-critical class. This ensures that each admitted critical application has high application QoE, since applications downgraded to the non-critical class have to compete for PRBs with non-critical applications and may experience poor bitrate. Second, Figure 3.1b shows that a high bitrate can reduce Rebuffering Ratio; in particular, the ABR algorithm achieves zero Rebuffering Ratio at any bitrate higher than 700Kbps (R_2). So, ideally, it would be nice to provide at least R_2 to everyone. Third, and more importantly, because of network dynamics, SHADE has to adapt to changes and frequently update its bitrate selection choices. We observe that stabilizing the bitrate selection over time is of the highest priority for better QoE, i.e., SHADE should reduce changes of its bitrate selections over time (footnote 10). At a high level, only a stabilized bitrate benefits QoE as shown in Figure 3.1; an unstable bitrate can be worse than doing nothing (no bitrate selection and maintenance at all), because the delivered bitrate then becomes more variable, jumping up and down quickly between distinct discrete values (bitrate candidates) and leading to many Bitrate Switches that negatively affect QoE. In addition, excessive Bitrate Switches also degrade Average Bitrate: when a Bitrate Switch happens, the ABR has to throw away the current unfinished chunk, wasting the bandwidth already spent on that chunk.

Footnote 10: We use the term "bitrate selection switches" to differentiate from Bitrate Switches. The former are performed by the bitrate selection algorithm; the latter by the ABR algorithm.

Figure 3.3: An illustration of the first step of SHADE's bitrate selection algorithm. For a given application, SHADE first estimates the achievable bitrate using a fixed number of PRBs ("Bitrate of Fixed # of PRBs"); the changes in that curve are due to channel condition changes. SHADE then maps that bitrate to the closest bitrate candidate ("Selected Bitrate").
We design SHADE's bitrate selection algorithm based on the above three observations. The top priority is to stabilize the bitrate selection over time. To achieve that, we tried many different solutions and identified one property that yields better stability: each application's selected bitrate should not depend on other applications' channel conditions. In other words, each application should select its bitrate independently. Approaches without this property cannot stabilize their bitrate selections over time, because applications' channel conditions fluctuate independently. For example, Avis [30]'s bitrate optimization algorithm is comparison-based: it always favors applications with good channel conditions. However, applications' channel conditions change dramatically, so its bitrate selection may favor different applications over time, leading to unstable bitrate selections.
SHADE satisfies the above property by allocating a similar amount of PRBs to the same application over time, regardless of other applications' channel condition changes, as illustrated in Figure 3.3. For any given application, SHADE first uses a fixed amount of PRBs to estimate the achievable bitrate under the application's current channel condition ("Bitrate of Fixed # of PRBs"), and then maps that bitrate to the closest bitrate candidate, i.e., the one requiring the minimum PRB adjustment ("Selected Bitrate"). SHADE may choose a different "fixed amount of PRBs" for different applications; for example, applications that pay more could receive more initial PRBs. In this paper, without loss of generality, SHADE uses the same "fixed amount of PRBs" for all applications: the fair share, which is (1/N_P) x p of the total PRBs, where N_P is the number of critical applications.
To provide minimum service to all applications when possible, we force SHADE to map each application to at least one bitrate level first (e.g., R_1). In practice, we choose this level to be R_2 instead of R_1, because a bitrate of R_2 or higher can significantly reduce Rebuffering Ratio. By mapping to at least R_2, SHADE may use more PRBs than the allowed fraction p. When that happens, SHADE keeps downgrading applications' currently selected bitrates to save PRBs, until the total number of PRBs used is within the allowed budget. Algorithm 4 summarizes SHADE's bitrate selection process. Lines 1-3 map each critical application to a bitrate level, and lines 4-13 downgrade the mapped selections if needed. This downgrade works in two steps: in the first step (lines 4-9), SHADE keeps looking for an application whose selected bitrate is at or above R_2 and downgrades it by one bitrate level; when there is no such application left, SHADE moves to the second step, which downgrades applications from R_1 to the non-critical class (lines 10-13). This two-step downgrade ensures that SHADE supports R_2 or R_1 for everyone when possible. In addition, within each downgrade step, SHADE downgrades applications in order from the worst channel condition to the best, since each downgrade of a poor-channel application saves more PRBs.
Algorithm 4: SHADE's Bitrate Selection Algorithm. It first selects each user's closest bitrate candidate using the fair share (Figure 3.3), and then downgrades the selected bitrates until the allocated PRBs do not exceed the allowed value.
  Data: the total number of PRBs, T; the fraction of total PRBs that can be used by SHADE, p; the number of premium users, N_P; each user's channel condition, in terms of achievable rate per PRB, c. The PRB counter starts at 0.
  Result: the selected bitrate for each user, x.
      /* Map each user to a bitrate candidate (Figure 3.3) */
   1  for i in [1..N_P] do
          /* rate obtainable with the fair share: p T / N_P PRBs */
   2      x_i = ClosestBitrateCandidate(c_i * p T / N_P)
   3      PRB = PRB + x_i / c_i
      /* Downgrade until no more than p T PRBs are used */
   4  x = Sort()            /* sort users in ascending order of c */
   5  while PRB > p T and some x_i >= R_2 do
   6      for i in [1..N_P] do
   7          if x_i >= R_2 then
   8              x_i, PRB = DowngradeOneLevel(x_i, PRB)
   9              break
  10  while PRB > p T do
  11      for i in [1..N_P] do
  12          x_i, PRB = DowngradeToNonPremium(x_i, PRB)
  13          break
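The Python sketch below is a runnable rendering of Algorithm 4 under stated assumptions: the bitrate ladder (Kbps) and per-user rates per PRB are illustrative, the DowngradeOneLevel / DowngradeToNonPremium steps are inlined, and a selection of 0 denotes downgrading to the non-premium class.

    CANDIDATES = [350, 700, 1500, 2400, 4400]   # R_1 .. R_5 (illustrative, Kbps)

    def select_bitrates(c, T, p):
        """c: achievable Kbps per PRB for each premium user; T: total PRBs; p: premium share."""
        n = len(c)
        fair = p * T / n
        # Lines 1-3: map the fair-share rate to the closest bitrate candidate.
        x = [min(CANDIDATES, key=lambda r: abs(r - ci * fair)) for ci in c]
        prb = sum(xi / ci for xi, ci in zip(x, c))
        order = sorted(range(n), key=lambda i: c[i])        # worst channel first
        # Lines 5-9: downgrade one level at a time while any selection is >= R_2.
        while prb > p * T and any(xi >= CANDIDATES[1] for xi in x):
            for i in order:
                if x[i] >= CANDIDATES[1]:
                    lower = CANDIDATES[CANDIDATES.index(x[i]) - 1]
                    prb -= (x[i] - lower) / c[i]
                    x[i] = lower
                    break
        # Lines 10-13: if still over budget, downgrade users to non-premium (bitrate 0).
        while prb > p * T:
            for i in order:
                if x[i] > 0:
                    prb -= x[i] / c[i]
                    x[i] = 0
                    break
        return x

    print(select_bitrates(c=[30.0, 60.0, 120.0], T=100, p=0.15))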
3.7 Evaluation
In this section, we experimentally evaluate SHADE. Our main result is an evaluation of SHADE's QoE performance compared to other premium-service baselines. We also separately evaluate SHADE's throughput maintenance component, a prerequisite for SHADE.
3.7.1 Methodology
We use the LTE module [69] of the ns-3 simulator (ns-3.24.1) [5] to implement and evaluate SHADE. We then feed each user's ns-3-simulated TCP throughput trace into a video streaming simulator to evaluate that user's video QoE.

ns-3 Simulation Using AT&T's Real World Data. We modify the Proportional Fair scheduler of ns-3's LTE module by applying the weight parameter to each user and periodically logging MCS Index and PRB allocation statistics. We implement a throughput maintenance controller, which takes the MCS Indices and PRB allocation statistics as input and determines each user's weight parameter, to maintain that user's downlink throughput at the targeted value. The admission control and bitrate selection components run on top of the throughput maintenance controller; they collectively determine the bitrate target for each premium user and pass this information to the controller.
Our ns-3 simulation configures an evolved packet core (EPC) and a radio access network (RAN) with multiple eNodeBs. We use frequency division duplex (FDD) on LTE band 4 (downlink central frequency 2132.5 MHz and uplink central frequency 1732.5 MHz). We use a channel bandwidth of 20MHz for the downlink to simulate the equivalent bandwidth of 10MHz with 2x2 MIMO, which translates to a maximum downlink throughput of roughly 70Mbit/s. Inside the RAN, we use RLC AM (Radio Link Control Acknowledged Mode) and Hybrid ARQ (HARQ). These configurations represent a typical setting of AT&T's LTE network. At the PHY layer, we use the Friis transmission equation as the propagation model [68].
We use the real locations of AT&T's cellular network to place cell-sites. We first place one cell-site, which consists of three sectors, at the center of the map, and then place two surrounding rings of cell-sites. We use AT&T's sector-level user density statistics to place users in each sector randomly. For most experiments, we focus only on users served by one sector of the central cell-site, while the other sectors and cell-sites are configured as interfering sectors.

For each user, we configure a dedicated video server (TCP sender) and connect it to the EPC using a high-bandwidth link (100Gb/s) with a latency of 20ms. Because the server and link are dedicated to this user, they do not bottleneck the transmission. As we focus on video applications, we pay more attention to users' downlink traffic, which is configured differently for different experiments. We also generate uplink traffic to simulate a realistic workload.
Video Streaming Simulator. We feed our ns-3-simulated TCP throughput traces into a video streaming simulator (ABR simulator) to evaluate application-level performance. The simulator uses the throughput trace as the real-time bandwidth to perform a discrete-event simulation of a video player, choosing chunks from a set of candidate bitrates when adapting the bitrate. At the end of the simulation, it outputs three key QoE metrics: Average Bitrate, Rebuffering Ratio, and Bitrate Switches. The ABR simulator currently allows choosing among three adaptive bitrate algorithms: a state-of-the-art control-theoretic ABR [86], a buffer-based ABR [42], and a rate-based ABR [48]. Our experiments mainly use the control-theoretic ABR [86].
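To make the chunk-based player model concrete, the minimal sketch below simulates a player against a throughput trace. It is an illustration in the spirit of the ABR simulator (not its actual implementation): the bitrate ladder, chunk length, throughput trace, and the simple rate-based chunk picker are all assumptions of the sketch.

    CANDIDATES = [350, 700, 1500, 2400, 4400]   # Kbps; illustrative bitrate ladder
    CHUNK_SEC = 4.0

    def simulate(trace_kbps, num_chunks=30):
        """trace_kbps: average downlink throughput (Kbps) for each wall-clock second."""
        t, buffer, last = 0.0, 0.0, None
        rebuffer, switches, bitrates = 0.0, 0, []
        for _ in range(num_chunks):
            bw = trace_kbps[min(int(t), len(trace_kbps) - 1)]
            # Simple rate-based pick: highest candidate below 80% of current throughput.
            rate = max([r for r in CANDIDATES if r <= 0.8 * bw] or [CANDIDATES[0]])
            switches += last is not None and rate != last
            last = rate
            bitrates.append(rate)
            dl = rate * CHUNK_SEC / bw            # seconds to download this chunk
            rebuffer += max(0.0, dl - buffer)     # stall if the buffer drains first
            buffer = max(0.0, buffer - dl) + CHUNK_SEC
            t += dl
        played = num_chunks * CHUNK_SEC
        return sum(bitrates) / len(bitrates), rebuffer / (played + rebuffer), switches

    print(simulate([1800] * 20 + [600] * 100))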
3.7.2 Premium Video Service
We evaluate the video QoE performance of SHADE and four other competing baselines. We gradually add properties for good QoE to form these five competitors:
1. Non-Premium (NP): A baseline in which all would-be premium users are admitted as non-premium users, i.e., there is no premium service.

2. Paris Metro Pricing (PMP): (additional property: premium service) A premium service inspired by the Paris Metro Pricing model [28], where there are two classes of identical cars that differ only in ticket price. In PMP, we create the "1st class car" by reserving p x 100% of the PRBs for premium users. Each admitted premium user gets the same amount of PRBs ((1/N_P) x p x 100%). Note that, although PMP is a simple premium service, it still needs to leverage SHADE's dynamic weight-tuning technique to control each premium user's PRBs.

3. LogRate: (additional property: maintain downlink throughput at bitrate candidates) Avis's bitrate selection algorithm, which maximizes the sum of the logarithms of the selected bitrates with a bitrate-switch penalty [30]. We implement this solution with the parameters suggested by the Avis paper.

4. SHADE-MaxRate: (additional property: strives to provide R_2 or R_1) Similar to SHADE, SHADE-MaxRate provides R_2 or R_1 to every premium user when possible. The difference is that, for the rest of the PRBs, it is optimization-based: it maximizes the sum of selected bitrates across users.

5. SHADE: (additional property: the property for better stability) Our proposed solution.
We use the same admission control for all competitors and compare their performance on the three key QoE metrics: Average Bitrate, Rebuffering Ratio, and Bitrate Switches. For each QoE metric of a competitor, we calculate the average of that metric across all admitted premium users. In addition, we add a fourth metric, Downgrade Fraction, defined as the percentage of time during which a user has been downgraded to the non-premium class. A premium service should keep the Downgrade Fraction to a minimum. Moreover, a competitor that downgrades more users to the non-premium class has the advantage of allocating PRBs among fewer users, which can inflate its average performance; the Downgrade Fraction exposes this and enables a fair comparison among competitors.
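For concreteness, the sketch below shows one way the four reported metrics could be computed from a per-user playback log; the log format (chunk bitrates, stall time, downgraded time) is an assumption of this sketch, not the simulator's actual output format.

    def qoe_metrics(chunk_bitrates, rebuffer_sec, played_sec, downgraded_sec, total_sec):
        avg_bitrate = sum(chunk_bitrates) / len(chunk_bitrates)
        rebuffering_ratio = rebuffer_sec / (played_sec + rebuffer_sec)
        bitrate_switches = sum(a != b for a, b in zip(chunk_bitrates, chunk_bitrates[1:]))
        downgrade_fraction = downgraded_sec / total_sec
        return avg_bitrate, rebuffering_ratio, bitrate_switches, downgrade_fraction

    # Per-competitor results are then averaged across all admitted premium users.
    print(qoe_metrics([700, 700, 1500, 1500, 700], 2.0, 20.0, 30.0, 300.0))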
Figure 3.4: Comparison of competitors on three key video QoE metrics: (a) Average Bitrate; (b) Rebuffering Ratio; and (c) Bitrate Switches, for different premium resource percentages p, using R_1 as R_AC for admission control. (Curves: SHADE, SHADE-MaxRate, LogRate, PMP, NP.)

General Comparison. Figure 3.4 compares all the competitors under different premium resource percentages p. In general, we see that with higher p, all the premium services except NP achieve higher Average Bitrate (Figure 3.4a), which shows the benefit of using a premium service to improve video QoE. We also see that SHADE and SHADE-MaxRate achieve the best Average Bitrate, up to an 18% improvement over PMP. Because SHADE and PMP use the same amount of PRBs, this performance gain comes from using PRBs more efficiently and intelligently. Rebuffering Ratio is often viewed as the most important QoE metric; in Figure 3.4b, we see that SHADE and SHADE-MaxRate provide the lowest Rebuffering Ratio, because both try to provide the basic premium service to users. This reduces Rebuffering Ratio for two reasons. The first is that both strive not to downgrade users to the non-premium class, i.e., they provide at least R_1 to everyone. Figure 3.5a illustrates that SHADE and SHADE-MaxRate achieve similar Downgrade Fractions, whereas LogRate downgrades users more often. Providing the minimum premium service ensures that the Rebuffering Ratio does not become too bad; users who have been downgraded to the non-premium class, however, have to compete for resources with non-premium users and receive poor video QoE (LogRate downgrades so often that its Rebuffering Ratio shows some correlation with NP). The second reason is that SHADE and SHADE-MaxRate strive to first provide R_2 when possible, which yields zero Rebuffering Ratio, a much better result than R_1. For Bitrate Switches (Figure 3.5b), without maintaining the downlink bitrate, NP and PMP have similarly high numbers of Bitrate Switches. LogRate's performance is similar but for a different reason: it introduces too many bitrate selection switches (see Figure 3.5b). LogRate makes many more bitrate selection switches for two reasons: first, it is optimization-based and does not satisfy the stability property that users should pick their bitrates independently; second, although its optimization objective penalizes switches, it uses a greedy approach to round the solution of that optimization to bitrate candidates, and this rounding process is very sensitive to users' channel conditions, which reintroduces switches. As a result, we see that SHADE-MaxRate reduces Bitrate Switches by more than 50%; in addition, because SHADE satisfies the property for better stability, it achieves a further 50% reduction relative to SHADE-MaxRate (Figure 3.5b). In summary, SHADE outperforms PMP and LogRate on all three QoE metrics.
Figure 3.5: Comparison of competitors on two additional metrics for deeper analysis: (a) Downgrade Fraction; (b) Bitrate Selection Switches, for different premium resource percentages p, using R_1 as R_AC for admission control.

Admission Control with Different Conservativeness. Next, we study the effect of different degrees of conservativeness in the admission control component. In Figure 3.6, we use R_3 as R_AC to admit users, a very conservative admission control process, as it admits only 1/3 of the users compared to using R_1. By admitting fewer users, each user can use more PRBs, and thus we see improved performance on Average Bitrate (Figure 3.6a) and Rebuffering Ratio (Figure 3.6b) in general. For Bitrate Switches (Figure 3.6c), the competitors perform similarly to the less conservative admission control case (Figure 3.4c), except for LogRate, which achieves fewer Bitrate Switches. The reason is that when there are more PRBs, it is easier for LogRate to stabilize each user's bitrate selection, since every user can get a good bitrate; when PRBs are scarce, however, LogRate keeps moving PRBs to the users with good channel conditions, which leads to unstable bitrate selections. Interestingly, in Figure 3.6a, SHADE-MaxRate significantly outperforms SHADE on Average Bitrate, because it uses PRBs more efficiently by allocating more PRBs to users with good channel conditions, while SHADE gives every user a similar amount of PRBs. This benefit becomes more significant when admission control is conservative. In one case, we find that SHADE selects at least R_3 for every user, whereas SHADE-MaxRate gives R_2 to everyone and then uses the remaining PRBs to promote the 2 users with the best channel conditions to R_5.
Figure 3.6: Comparison of competitors on three key video QoE metrics: (a) Average Bitrate; (b) Rebuffering Ratio; and (c) Bitrate Switches, for different premium resource percentages p, using R_3 as R_AC for admission control. (Curves: SHADE, SHADE-MaxRate, LogRate, PMP, NP.)
Faster Channel Condition Changes. Lastly, we vary the degree of channel condition variation (Figure 3.7). We achieve this by varying user speed: a higher user speed introduces greater channel condition changes (footnote 11). In Figure 3.7a, we observe decreasing Average Bitrate for SHADE-MaxRate and LogRate: when the speed exceeds 10m/s, both SHADE-MaxRate and LogRate perform worse than PMP. The reason is that when a user's channel condition changes faster, it is more difficult to stabilize the bitrate selection, so we see increasing bitrate selection switches for all the competitors (Figure 3.7c), which leads to more Bitrate Switches (Figure 3.7b). We see that SHADE-MaxRate's Bitrate Switches approach those of PMP (LogRate is even worse), indicating the poor stability of its bitrate selection. On the other hand, SHADE's Bitrate Switches remain significantly better than PMP's (56% better in the worst case), which also leads to good Average Bitrate performance, indicating that the stability property brings an even more significant benefit when channel conditions change faster (Figure 3.7c). Even so, SHADE's Average Bitrate benefit over PMP degrades; this means that schemes that maintain throughput at one of the bitrate candidates need to stabilize their bitrate selections in order to enjoy extra benefits over schemes that do not maintain downlink throughput, which again confirms the importance of stabilizing the bitrate selections.

Footnote 11: We have also tried keeping the speed the same but shrinking the cell size, i.e., the distance between neighboring cells, to vary the degree of channel condition variation, and observe similar results.

Figure 3.7: Comparison of competitors on: (a) Average Bitrate; (b) Bitrate Switches; and (c) Bitrate Selection Switches, for different user speeds, using R_1 as R_AC for admission control.
Handovers. Handover is the process by which a mobile user changes its serving base station when traveling from the coverage area of one base station to another. During a handover, the user experiences poor performance for a couple of seconds. Such a short performance degradation generally does not affect video QoE much, as ABR usually maintains a buffer of minutes of video [], which can absorb the interruption caused by the handover. In our experiments, we have not observed significant QoE degradation due to the introduced handovers. However, SHADE and SHADE-MaxRate have the benefit of quickly downgrading users who are experiencing handovers to the non-premium class; by doing so, PRBs are used efficiently on other users instead of being wasted on handover users.

In summary, we verify that SHADE outperforms the other premium service baselines on all three key QoE metrics under different settings. We also confirm that the top priority for schemes like SHADE or LogRate, which maintain throughput at one of the bitrate candidates, is to stabilize the selected bitrates.
3.7.3 Throughput Maintenance Component
SHADE's good QoE performance described above relies on its throughput maintenance component working well. This subsection evaluates the throughput maintenance performance separately. We make the setting progressively more complex: we first evaluate the case where significant channel condition variations are introduced, then consider user dynamics (users joining, leaving, changing requirements, and moving), and lastly evaluate the case of multiple premium users.
3.7.3.1 Channel Condition Variations.
Channel condition variations mainly come from signal fading and interference, which occur all the time at high frequency. It is impossible to predict these changes. However, SHADE's task is to react quickly to them and maintain the long-term average downlink throughput with small variation. The MCS Index estimation component (Algorithm 3) performs this task by capturing past MCS Indices and reacting appropriately.
Figure 3.8: Comparison between the actual throughput per PRB (Scheduled PRBs) and the estimate using the average MCS Index of all PRBs (Estimation). This estimate underestimates the MCS Indices and the throughput per PRB significantly, sometimes by more than 40%.
MCS-Interval (s) 0.1 0.2 0.5 1 2
Mean (Kbps) 1206 1201 1193 1195 1202
Std (Kbps) 85 117 137 176 195
Table 3.3: Mean and Std of maintained downlink throughput per second using different MCS-
Intervals. Although all the intervals can maintain the average throughput at 1200Kbps, smaller
intervals can reduce the variation.
Estimate MCS Indices. We collect the user's past MCS Indices to predict his MCS Indices in the next time interval. A naive estimator would average the MCS Indices over all PRBs in the previous time interval; however, we find that using this estimator leads to a much higher maintained throughput than the targeted value. Figure 3.8 shows that this estimation approach significantly underestimates the throughput per PRB, compared to the average achieved rate of the PRBs that are actually scheduled to the user in the next time interval. The reason is that Proportional Fair tries to schedule PRBs with higher MCS Indices to this user, and the average MCS Index of those scheduled PRBs is much higher than the overall average, which also reflects the efficiency benefit of the Proportional Fair scheduler. This means that we should use the average over the scheduled PRBs in the previous time interval to estimate the MCS Indices for the next time interval.
Figure 3.9: Maintained throughput per second using two different MCS-Intervals: 0.1s and 1s. Using the 0.1s interval provides a more stable throughput, with a standard deviation of 107Kbps compared to 218Kbps for the 1s case.
Determine MCS-Interval. Intuitively, a smaller MCS-Interval can react to channel condition variations faster. Table 3.3 shows the mean and standard deviation (Std) of the throughput per second using different MCS-Intervals, which matches our intuition. All intervals maintain the mean throughput at 1200Kbps, but using an MCS-Interval of 0.1s reduces the Std from 195Kbps to 85Kbps. Figure 3.9 plots the throughput per second for two MCS-Intervals: 0.1s and 1s. The channel condition changes greatly at 25 seconds into the experiment; the throughput in the 1s case fluctuates significantly, while the 0.1s case maintains the throughput more stably. We therefore use an MCS-Interval of 0.1s for SHADE.
3.7.3.2 User Dynamics.
Besides channel condition variation, users create another dimension of dynamics. At any given time, new users join the network, some users leave, and their requirements change constantly because their traffic patterns are not constant. These changes affect the requirement estimate, Q(·), in Algorithm 2. In addition, a premium user's mobility changes the channel condition, which affects the PRB requirement for the same targeted downlink throughput. For throughput maintenance, SHADE needs to update the Q(·) of non-premium users and the required number of PRBs P for the premium user, according to Algorithm 2.
Estimate User Requirement. Applying the same idea as the MCS Index estimation, we use the user requirement log from the previous time interval to estimate the requirement for the next interval, and we need to determine the Requirement-Interval. This is a harder decision than determining the MCS-Interval: the Requirement-Interval cannot be too large, otherwise we cannot capture requirement changes promptly; but it cannot be too small either, because our calculation relies on the assumption that the scheduler treats users fairly, and a small interval breaks that assumption, making the estimate inaccurate.
We simulate a simple scenario with different numbers of users to vary Q(·), and evaluate the effect of using different Requirement-Intervals. We configure 45 users for the central sector. At the very beginning, there are only 5 users, including one premium user. Then, each of the remaining users joins the network after the previous one with a randomly chosen delay, drawn from a uniform distribution between 1 and 5 seconds. Initially, we do not need to tune the weight, as the premium user's throughput is higher than the targeted value; but as more and more users join the network, we need to apply a higher and higher weight for this premium user.

Table 3.4 shows the mean and Std of the maintained downlink throughput using different Requirement-Intervals, with a targeted throughput of 1200Kbps. We observe a sweet spot for the Requirement-Interval (around 1s). Both smaller and larger intervals tend to provide lower throughput, but for different reasons. A small interval breaks the assumption that the scheduler allocates resources fairly; instead, the scheduler favors users with better channel conditions, which makes SHADE neglect the existence of users who are currently experiencing relatively bad channel conditions. On the other hand, a larger interval reacts more slowly to requirement changes. In our example, as more users join the network, SHADE needs to keep increasing the weight parameter; a slow reaction uses too low a weight, which leads to a lower maintained throughput.
Requirement-Interval (s)    0.1     0.5     1       5       10
Mean (Kbps)                 1109    1200    1213    1167    1142
Std (Kbps)                  77      77      90      129     123

Table 3.4: Mean and Std of the maintained downlink throughput per second using different Requirement-Intervals. A sweet spot occurs at an interval of 1s, which maintains the throughput at the targeted value; smaller or larger intervals provide lower average throughput.

Figure 3.10: Maintained throughput per second using a Requirement-Interval of 1s (with the corresponding weight), compared to the case without throughput maintenance. SHADE keeps increasing the weight as the number of users increases and maintains the throughput at the targeted value. In contrast, without maintenance, the throughput drops to only 1/5 of its initial value.

Figure 3.10 shows the maintained throughput for the best Requirement-Interval setting of 1s. Without throughput maintenance, this user's throughput keeps going down as more and more users join the network. The case with throughput maintenance reacts to the requirement changes (the number of users) and keeps increasing the weight to maintain the same targeted throughput. In addition, from this figure we can also observe that it reacts to channel condition variation: a low-throughput point always triggers a higher weight value right away, to compensate for the channel condition degradation.
Introduce User Mobility. Using a random walk mobility model, we find that, unless the user's velocity is very high, mobility does not generate channel condition changes as variable as those from interference and fading. To generate significant channel condition degradation, we show a case where a premium user moves toward a neighboring base station at a velocity of 5m/s (18Km/h). In Figure 3.11, we see that the throughput is maintained at 700Kbps despite the continuously degrading channel condition due to mobility.
Figure 3.11: Maintained throughput per second, compared to the case without throughput maintenance, under a continuously degrading channel condition due to user mobility. (The figure also plots the achievable rate per PRB.)
3.7.3.3 Multiple Premium Users.
We explore throughput maintenance for the case of multiple premium users, with all the aforementioned network dynamics (interference, fading, users joining and leaving the network, changing their requirements, and moving). We see that three co-located premium users' throughputs can be maintained at three different targets (1500Kbps, 1200Kbps, 800Kbps); without throughput maintenance, they achieve much lower throughput. In particular, we introduce a large requirement surge at 30 seconds by starting a group of backlogged users and increasing existing users' requirements simultaneously. We see that, without throughput maintenance, their throughput drops by 40%. The maintained-throughput case is affected by this too, but it detects the sudden change and recovers quickly.
Figure 3.12: Maintained throughput for three premium users at different targeted throughputs. Dotted lines illustrate each user's throughput without throughput maintenance.
Chapter 4
Literature Review
This dissertation covers several topics in the area of improving cellular performance. This chapter discusses related work by categorizing techniques that can improve cellular services. Since recognizing issues in cellular services is a prerequisite for performance improvement, we first present impact assessment techniques. Cellular networks operate in a highly variable environment, and dynamic base station reconfiguration can adapt to these network dynamics for better cellular service; base station reconfiguration techniques comprise a large body of work, which we introduce next. We then discuss performance modeling techniques, as modeling the performance of cellular networks is an important tool to guide the changes (e.g., base station reconfiguration) we apply to them. In addition, beyond these concrete techniques, at the service-model level, a different service model (rather than the traditional fair service offered to all users and applications) can potentially enhance the cellular service of certain users and applications; we summarize proposals enabled by novel service models. Lastly, we discuss the component most essential to wireless systems, resource scheduling. Scheduling transmission resources differently is a powerful knob for better cellular service, e.g., to use limited transmission resources more efficiently, or to enhance certain users and applications.
Impact assessment. There are several proposals and industry solutions for impact assessment in IP networks [34, 35, 83], data center networks [49, 60, 37], and Software Defined Networks [72, 49]. In cellular networks, Litmus [65] and PRISM [64] focus on impact assessment of planned network changes. Magus does not only assess impact, but also provides reconfiguration settings for recovery. Its problem scope differs from that of state-of-the-art solutions, and managing upgrades for cellular networks brings in new sets of technical challenges, such as dependence on external environments, radio network configurations, and end-user workload and mobility patterns.
Base station reconfiguration. The idea of automatic configuration optimization [26, 77, 70] is envisioned in SON [9], which self-heals from network outages and adjusts configurations in response to variations in base station load, as discussed in 3GPP Release 10 [11]. Prior work has explored Cell Outage Detection (COD), Cell Outage Recovery (COR), and Cell Outage Compensation (COC) [58, 84, 59, 11, 19, 12, 25, 66, 18]. These reconfiguration algorithms tune the transmission power, antenna tilt, and antenna azimuth angle to improve cellular service. They are reactive and begin their tuning process after the occurrence of an outage, relying entirely on performance and configuration feedback from the field to make their tuning decisions. Magus, on the other hand, does not rely only on feedback and uses predictive, model-based approaches in a proactive manner to better manage service performance during planned upgrades. Magus also converges much faster than feedback-based approaches.
Performance modeling. Researchers have devoted great effort to performance modeling techniques to understand the performance of cellular networks [40, 75, 22]. Because it is extremely hard to estimate end-user performance, these models usually make simplifying assumptions. [58, 84, 59] model a base station's coverage range, treat locations within the coverage range equally, and use overlapping coverage areas to approximate interference and uncovered areas to approximate coverage holes. [44] assumes omni-directional base stations. [85, 26] make assumptions about the signal propagation model. In contrast, Magus divides the coverage area into a 100m x 100m grid and calculates each grid cell's SINR and throughput rate independently. We also leverage operational network data and ATOLL-based coverage models to make Magus's predictive model more realistic.
Service model. Cellular service providers traditionally offer a fair service to all users; a different service model can potentially enhance the cellular performance of certain users or applications. Some proposals use different pricing schemes to enhance user performance. For example, [38] alleviates congestion by implementing a time-dependent pricing scheme that allows users to defer their delay-tolerant traffic to save money. Others propose differentiated service models. [28, 67] discuss using the Paris Metro Pricing scheme to support differentiated digital services, e.g., cloud computing services or network access services. Compared to these works, SHADE's benefit is not achieved simply by leveraging a differentiated service model; a more important enabler of SHADE's benefits is a novel resource scheduling technique, which can maintain video users' downlink throughputs.
Given certain application knowledge of traffic, application-aware service can enhance application
QoE. This dissertation uses video as an application example and we mainly focus on video applications.
Researchers have long studied streaming video over variable bandwidth conditions. For example, Chapter 4 of [73] explores the problem space of adaptive video streaming thoroughly: it studies the trade-offs of performing adaptation and making adaptation decisions at different components (the video server, the network, or the video player client), and how adaptation is supported by different video codecs. Lakshman et al. [56] analyze how video encoding interacts with the underlying delivery network in an end-to-end video delivery system. They model the system as a video unit (a sender containing a video encoder, an encoding buffer and a user network interface), the network, and the video decoder. Based on this model, [56] evaluates different video streaming proposals on performance metrics including video quality, delay, and statistical multiplexing gains. Using the same model, Chapter 9 of [80] derives conditions that guarantee smooth playback (without any rebuffering events) under certain QoS support, with a particular focus on encoding techniques that can meet these conditions. Both [56] and [80] indicate the potential of encoding and streaming video at variable bitrates, and the benefit of feeding bandwidth-condition feedback to the video encoder. Inspired by these studies, the video streaming
industry has converged on one solution described in [73]: chunk-based adaptive bitrate (ABR) streaming. In this approach, videos are pre-encoded (rather than encoded on-the-fly) at different qualities, broken into chunks, and stored at the video server. The client (video player) then makes the adaptation
decision by requesting chunks of different qualities from the server. The chunk-based approach simplifies the model used by [73, 56, 80]: pre-encoding removes encoding latency and decouples video encoding techniques from video delivery systems. This dissertation assumes the chunk-based delivery framework, and explores what minimal application-awareness has to be embedded within the network and how to make minimal changes to the existing cellular base station infrastructure. We assume that state-of-the-art encoding techniques are used to provide the best video quality at each bitrate.
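As a concrete illustration of the chunk-based framework just described, here is a minimal, hypothetical client-side sketch: a throughput-rule player that, before each chunk, requests the highest pre-encoded bitrate below a safety margin of its estimated throughput. The bitrate ladder, safety factor, and callback names are assumptions for illustration, not any specific deployed ABR algorithm.

BITRATES_KBPS = [350, 750, 1500, 3000]   # hypothetical pre-encoded bitrate ladder

def pick_bitrate(estimated_kbps, safety=0.8):
    """Throughput rule: highest bitrate that fits under a safety margin."""
    feasible = [b for b in BITRATES_KBPS if b <= safety * estimated_kbps]
    return max(feasible) if feasible else BITRATES_KBPS[0]

def stream(chunk_count, estimate_throughput, download_chunk):
    """Request chunks one by one, re-deciding the quality before each request."""
    history = []
    for index in range(chunk_count):
        bitrate = pick_bitrate(estimate_throughput(history))
        history.append(download_chunk(index, bitrate))
    return history

# Toy usage: a constant 2 Mbps throughput estimate and a stub downloader.
log = stream(3, lambda history: 2000, lambda index, bitrate: (index, bitrate))
print(log)   # [(0, 1500), (1, 1500), (2, 1500)]

Note that, in SHADE's setting, the only piece of this the network needs to know is the bitrate ladder; the adaptation logic itself stays in the player.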
Researchers have proposed solutions that improve the current chunk-based ABR approach from different perspectives. For example, [36, 47] focus on picking the best CDN to serve users; [86, 42, 78] propose better ABR algorithms; [41, 17, 48] study the interaction between the video player and TCP; CQIC [62] leverages the user's channel condition to improve the video sending rate; QAVA [29] controls the video server and delivers an appropriate video bitrate so that the user does not exceed her monthly data quota; and AVIS [30] uses a traffic shaper to achieve better fairness for video users. Compared to these works, SHADE targets a different, more fundamental problem: the case when the downlink throughput is not high enough to support good video QoE. Because video applications prefer higher downlink throughput, our approach is complementary to the aforementioned works. There are also works that control a video user's bitrate, e.g., AGBR [82], QAVA [29], and AVIS [30]. Compared to AVIS [30], SHADE's bitrate selection targets better video QoE instead of better fairness. QAVA [29] selects an appropriate bitrate for each user to best use her monthly data quota, while SHADE allocates limited resources among competing video users. AGBR [82] also focuses on allocating resources among competing video users, but it is not QoE-aware. In addition, these works have not studied the importance of the stability of the selected bitrates over time, or how to achieve this stability. SHADE achieves stable downlink throughput by modifying the resource allocation of the base station, which is robust to network dynamics including wireless channel quality variations. Moreover, SHADE introduces only minimal changes to the current cellular infrastructure, and is thus more deployable.
Resource scheduling. Resource scheduling is important for two reasons: transmission resources are extremely limited, and scheduling directly affects users' throughputs. Besides Max Rate [81], Round Robin [31] and Proportional Fair [46], there are many resource scheduler proposals for wireless systems [71, 79, 20,
21, 50, 32]. Some Proportional Fair based schedulers [53, 76, 43] apply a weight parameter to Equation 3.1 to improve certain properties of scheduling, e.g., to reduce latency or queue length. SHADE uses the same technique for a different goal: to treat users differently and thereby maintain their downlink throughputs, as sketched below. Network virtualization (RAN sharing) provides services to multiple slices (customers) on a shared physical network [54, 55]. Schedulers designed for network virtualization mainly focus on allocating the right amount of resources to each slice, while SHADE controls per-user resources for throughput maintenance. In addition, many of these schedulers focus only on resource isolation and do not strive to allocate resources to the right user for better efficiency. SHADE's resource allocation, in contrast, is based on Proportional Fair and inherits its high efficiency.
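The following minimal sketch illustrates the weighted Proportional Fair idea (hypothetical names and parameters; this is not SHADE's exact scheduler): the scheduler serves the user maximizing a weighted ratio of instantaneous rate to average throughput, so raising a user's weight steers more resource, and hence more throughput, to her.

def pf_schedule(instant_rates, avg_throughputs, weights=None):
    """Weighted Proportional Fair: serve the user maximizing w_i * r_i(t) / R_i(t).

    instant_rates: achievable rate of each user in this scheduling interval.
    avg_throughputs: exponentially averaged past throughput of each user.
    weights: per-user multipliers (all 1.0 recovers plain Proportional Fair).
    """
    n = len(instant_rates)
    weights = weights or [1.0] * n
    metrics = [weights[i] * instant_rates[i] / max(avg_throughputs[i], 1e-9)
               for i in range(n)]
    return max(range(n), key=lambda i: metrics[i])

def update_average(avg, served_rate, alpha=0.05):
    """Exponential moving average that PF updates after each interval."""
    return (1 - alpha) * avg + alpha * served_rate

# Two users with identical channels: tripling user 1's weight biases selection.
print(pf_schedule([10.0, 10.0], [5.0, 5.0]))                       # 0
print(pf_schedule([10.0, 10.0], [5.0, 5.0], weights=[1.0, 3.0]))   # 1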
For video-aware services, there are proposals that modify the resource scheduler to enhance video QoE: [52] prioritizes the most important video packets across users (key frames); [61] takes the deadlines of video packets into account; [63] schedules based on users' different delay constraints. These proposals require detailed video knowledge that is hard to obtain, and thus introduce significant complexity and modifications to the cellular infrastructure. In contrast, although SHADE is also video-aware, it relies on easily accessible video information, and thus introduces minimal complexity and changes.
Chapter 5
Conclusions and Future Directions
In this dissertation, we have explored how to significantly improve the robustness and performance of cellular networks while making only minimal changes to the cellular infrastructure.
In Chapter 2, we describe the design and evaluation of Magus, which mitigates service disruption during planned upgrades. Such upgrades occur frequently and can have a significant impact in modern cellular networks; to our knowledge, no prior work has explored this problem. During an upgrade, when a base station is taken off-air, users can re-attach to neighboring base stations, as our experiments on an LTE testbed demonstrate. Magus is a proactive, model-based approach that predicts a near-optimal power and tilt configuration for neighboring base stations and tunes these stations before the planned outage. Magus uses real operational network data, such as base station locations, configurations, and path loss information, to achieve accuracy, and a heuristic search algorithm to converge to a near-optimal configuration for a given upgrade. Our experiments show that Magus can recover a significant fraction of lost capacity while also greatly reducing the number of synchronized handovers caused by the upgrade. Several directions exist for future work on Magus: a field deployment, improved joint tuning of power and tilt, and using Magus's predictive model for unplanned outages (using Magus's computed configuration as a starting point for feedback control, and pre-computing configurations for different outages), or for load balancing and congestion reduction.
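To illustrate the flavor of such a heuristic search, the sketch below greedily adjusts one neighbor's power and tilt deltas at a time, keeping any change that improves a predicted recovered capacity. The step sizes, the predict_recovered_capacity callback, and the toy model are assumptions for illustration only; Magus's actual predictive model and search are described in Chapter 2.

import itertools

POWER_STEPS_DB = [0.0, 1.0, 2.0]      # hypothetical power increments per neighbor
TILT_STEPS_DEG = [0.0, -1.0, -2.0]    # hypothetical tilt adjustments per neighbor

def greedy_search(neighbors, predict_recovered_capacity):
    """Greedy coordinate search over neighbor (power delta, tilt delta) settings:
    keep any single-neighbor change that raises the predicted recovered capacity."""
    config = {n: (0.0, 0.0) for n in neighbors}
    best = predict_recovered_capacity(config)
    improved = True
    while improved:
        improved = False
        for n in neighbors:
            for p, t in itertools.product(POWER_STEPS_DB, TILT_STEPS_DEG):
                trial = dict(config)
                trial[n] = (p, t)
                score = predict_recovered_capacity(trial)
                if score > best:
                    best, config, improved = score, trial, True
    return config, best

# Toy model: capacity gains saturate with power and degrade with tilt changes.
toy = lambda cfg: sum(min(p, 1.5) - 0.1 * abs(t) for p, t in cfg.values())
print(greedy_search(["east", "west"], toy))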
In Chapter 3, we describe the design of SHADE, a premium service that improves video QoE for cellular users. SHADE achieves better video QoE by allocating more transmission resources to premium users to support higher downlink throughputs, and by maintaining those throughputs at one of the video's candidate bitrates. SHADE employs a bitrate selection component that selects a bitrate for each premium user; it carefully apportions limited transmission resources to maximize overall QoE among premium users. A throughput maintenance component then maintains each premium user's downlink throughput at the targeted bitrate by dynamically assigning her different amounts of transmission resources. Through extensive ns-3 simulation, we show that SHADE's throughput maintenance component can keep a user's downlink throughput stable under a variety of conditions. Building on that, as a premium video service, SHADE can significantly improve multiple competing video QoE metrics (including Average Bitrate, Rebuffering Ratio and Bitrate Switches) at the same time, compared to previously proposed premium service baselines.
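To illustrate how resource-based throughput maintenance can work, here is a minimal multiplicative-feedback sketch (hypothetical; SHADE's actual mechanism is described in Chapter 3): every measurement window, the user's scheduling weight is scaled toward the bitrate target, so a user measuring below her target receives a larger resource share in the next window.

def adjust_weight(weight, measured_kbps, target_kbps, gain=0.5,
                  w_min=0.1, w_max=10.0):
    """One feedback step: scale the scheduling weight by the relative error
    between the target bitrate and the measured throughput, then clamp it."""
    error = (target_kbps - measured_kbps) / target_kbps
    return min(max(weight * (1.0 + gain * error), w_min), w_max)

# A user targeting 1500 kbps who measured only 1200 kbps gets a larger weight.
print(round(adjust_weight(1.0, 1200, 1500), 2))   # 1.1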
There are several future directions for SHADE. The first is an implementation of SHADE in real cellular systems: we want to understand whether SHADE's throughput maintenance technique can control and stabilize downlink throughput in a real cellular system rather than in a simulation environment, and what additional complexity it introduces at the base station. Second, since we have not discussed pricing, SHADE currently treats admitted premium users equally. However, the service would need a pricing scheme, and SHADE should assign resources to premium users in proportion to the price they pay; we leave SHADE's pricing scheme and the corresponding adjustments for future work. Third, ABR algorithms are designed to provide good video QoE under variable downlink throughput. Since SHADE can stabilize a video user's downlink throughput, ABR algorithms should be adapted accordingly. Intuitively, a stabilized downlink throughput eases the task of ABR algorithms and can thus potentially achieve better QoE. We will also explore the opportunity of an ABR algorithm designed specifically for SHADE.
Besides follow-up work on Magus and SHADE, this dissertation also sheds light on several more general future directions.
First, Magus and SHADE illustrate that cellular services can be improved significantly with minimal changes to the infrastructure. One important reason for this result is that cellular networks operate in a
highly variable environment. Because of these network dynamics, adapting to them by applying changes to the infrastructure can help cellular networks provide better-optimized services. For example, Magus optimizes service coverage under different sets of serving base stations; SHADE optimizes video QoE under variable user behaviors and channel conditions. One natural future direction is to design more dynamic and adaptive cellular infrastructure, in order to capture these network dynamics and react accordingly. There are two challenges in this direction. The first is to identify opportunities for better cellular services; we can start with frequently occurring network dynamics that have not been handled specifically (like network upgrades in Magus). The second challenge is to actually design the changes that realize the service improvements. An improvement might be achievable by multiple solutions (different changes), but we need to choose the solution with minimal changes to ease deployment. In our studies, both Magus and SHADE apply changes to the access network at the base station. We find that the base station is a good place to apply changes, for several reasons. First, because the capacity of the access network has not kept pace with cellular traffic, the access network is usually the bottleneck of the connection; it is therefore the right place to solve many capacity-related problems. Second, the base station represents the most challenging and opportune part of cellular networks: the wireless part. The unique challenges of wireless communication should naturally be tackled there. Last, the base station is an aggregation point for all cellular users, so modifications deployed on it can benefit every user. It is worth exploring opportunities for modifying the base station to improve cellular service. This approach is also feasible: because the base station is responsible for many important tasks, including admission control and transmission resource scheduling, it can potentially realize many sophisticated ideas. For example, SHADE tackles a challenging task, throughput maintenance, by introducing only minimal changes to the transmission resource scheduler.
The second future direction is towards application-aware cellular networks for better application performance. Concretely, we want to explore embedding minimal application knowledge into cellular networks and letting the network handle traffic from different applications differently, to provide better service. As applications become more and more heterogeneous, we envision that different applications will require
different underlying service qualities for good QoE. For example, video applications need high throughput, while gaming applications require low latency. Therefore, instead of providing a generic cellular service to all applications, it would be beneficial to make cellular networks application-aware. For example, in SHADE, we use video as an example and re-design the cellular infrastructure to be video-aware; we want to apply this application-aware idea to other applications as well. In general, application-aware cellular networks can provide two benefits. The first is, obviously, better application QoE due to this awareness. The second is more efficient use of resources: generic cellular services must perform well on all performance metrics, e.g., throughput, latency, and so on, but most applications do not require all of these metrics simultaneously. For example, many gaming applications do not require high throughput. With the performance requirements of different applications known in advance, application-aware cellular networks can provide different services to different applications, utilize resources more efficiently, and potentially achieve better results. While exploring this direction, we should also minimize the application knowledge that needs to be exposed to the cellular network. This knowledge should be easy to obtain, so as not to introduce significant overhead to the infrastructure. For example, SHADE requires only video bitrate information, which is easily accessible from the video manifest, as illustrated below. In contrast, some other proposals require video knowledge that is expensive to obtain, e.g., whether a packet contains a key frame, or the delivery deadline of a packet to avoid rebuffering.
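As an example of how little needs to be exposed, the sketch below extracts the candidate bitrates from the BANDWIDTH attributes of an HLS-style master playlist; the sample playlist and function name are fabricated for illustration.

import re

def bitrates_from_hls_master(playlist_text):
    """Collect the candidate bitrates (bps) advertised in an HLS master playlist;
    nothing else about the video needs to be exposed to the network."""
    return sorted(int(m) for m in
                  re.findall(r'#EXT-X-STREAM-INF:[^\n]*BANDWIDTH=(\d+)',
                             playlist_text))

sample = ("#EXTM3U\n"
          "#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360\nlow.m3u8\n"
          "#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720\nhigh.m3u8\n")
print(bitrates_from_hls_master(sample))   # [800000, 2500000]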
Lastly, neither Magus nor SHADE has been fully implemented in real cellular systems. We are not aware of any publicly accessible cellular network testbed that is programmable and can accept changes, so that research ideas can be evaluated easily. Magus does use an indoor LTE testbed, but larger-scale testbeds are needed to approximate real cellular networks, and testbeds also need to be programmable in order to evaluate ideas that involve changes. We believe that building such testbeds would greatly benefit the cellular networking research community.
References
[1] Cell Tower Outages Impact 911 Calls in Washington County. Available at https://
insidetowers.com/cell-tower-outages-impact-911-calls-washington-county/.
[2] Cisco Visual Networking Index: Global Mobile Data Traffic Forecast, 2016–2021. Avail-
able at http://www.cisco.com/c/en/us/solutions/collateral/service-provider/
visual-networking-index-vni/vni-forecast-qa.pdf.
[3] If You Take Advantage of T-Mobile’s Latest Offer, All of Your Videos are Go-
ing to Look a Lot Worse. Available at http://www.businessinsider.com/
t-mobile-binge-on-reduces-video-quality-2015-11.
[4] LENA Project (LTE-EPC Network simulAtor), LTE module. Available at http://lena.cttc.es/
manual/lte-testing.html.
[5] ns-3. Available at https://www.nsnam.org/.
[6] Technical Note TN2224: Best Practices for Creating and Deploying HTTP Live Streaming Media for
the iPhone and iPad. Available at https://developer.apple.com/library/ios/technotes/
tn2224/_index.html.
[7] Verizon Throttling Speeds at Peak Hours. Available at https://forums.verizon.com/t5/
Fios-Internet/Verizon-throttling-speeds-at-peak-hours-don-t-be-fooled-75-75/
td-p/773232.
[8] Why a massive cell phone outage hit the Southeast. Available at http://money.cnn.com/2015/
08/05/technology/cell-phone-outage/.
[9] 3rd Generation Partnership Project, TS 21.101 V8.0.0, Feb. 2009.
[10] 3rd Generation Partnership Project, TS 36.213 V9.2.0, June 2010.
[11] 3rd Generation Partnership Project, TS 32.541 V10.0.0, March 2011.
[12] Self-Optimizing Networks: The Benefits of SON in LTE, July 2011.
[13] Cavium Octeon Fusion, http://www.cavium.com/OCTEON-Fusion.html.
[14] Atoll: a Wireless Network Design and Optimisation Platform, http://www.forsk.com/atoll/.
[15] Mobile Network Outages & Service Degradations: A Heavy Reading Survey Analysis,
http://www.heavyreading.com/details.asp?sku_id=3103&skuitem_itemid=1524.
[16] Aricent Evolved Packet Core, http://www.aricent.com.
[17] Saamer Akhshabi, Lakshmi Anantakrishnan, Constantine Dovrolis, and Ali C. Begen. Server-based
traffic shaping for stabilizing oscillating adaptive streaming players. In Proceedings of the 23rd ACM Workshop on Network and Operating Systems Support for Digital Audio and Video, NOSSDAV ’13, pages 19–24, New York, NY, USA, 2013. ACM.
[18] M. Amirijoo, L. Jorguseski, T. Kurner, R. Litjens, M. Neuland, L.C. Schmelz, and U. Turke. Cell
Outage Management in LTE Networks. In ISWCS, 2009.
[19] M. Amirijoo, L. Jorguseski, R. Litjens, and R. Nascimento. Effectiveness of Cell Outage Compen-
sation in LTE Networks. In IEEE CCNC, 2011.
[20] M. Andrews, K. Kumaran, K. Ramanan, A. Stolyar, P. Whiting, and R. Vijayakumar. Providing
quality of service over a shared wireless link. IEEE Communications Magazine, 39(2):150–154, Feb
2001.
[21] M. Andrews, K. Kumaran, K. Ramanan, A. L. Stolyar, R. Vijayakumar, and P. Whiting. CDMA
data QoS scheduling on the forward link with variable channel conditions. Technical report, Bell
Laboratories Technical Report, April 2000.
[22] Athula Balachandran, Vaneet Aggarwal, Emir Halepovic, Jeffrey Pang, Srinivasan Seshan, Shobha
Venkataraman, and He Yan. Modeling Web Quality-of-experience on Cellular Networks. In ACM
MOBICOM, 2014.
[23] Athula Balachandran, Vyas Sekar, Aditya Akella, Srinivasan Seshan, Ion Stoica, and Hui Zhang.
Developing a predictive model of quality of experience for internet video. In Proceedings of the
ACM SIGCOMM 2013 Conference on SIGCOMM, SIGCOMM ’13, pages 339–350. ACM, 2013.
[24] Athula Balachandran, Vyas Sekar, Aditya Akella, Srinivasan Seshan, Ion Stoica, and Hui Zhang.
Developing a predictive model of quality of experience for internet video. In Proceedings of the
ACM SIGCOMM 2013 Conference on SIGCOMM, SIGCOMM ’13, pages 339–350, New York, NY,
USA, 2013. ACM.
[25] U. Barth. Self-X RAN: Autonomous Self Organizing Radio Access Networks. In IEEE WiOpt,
2009.
[26] Simon C. Borst, Arumugam Buvaneswari, Lawrence M. Drabeck, Michael J. Flanagan, John M.
Graybeal, Georg K. Hampel, Mark Haner, William M. MacDonald, Paul A. Polakos, Gee Ritten-
house, Iraj Saniee, Alan Weiss, and Philip A. Whiting. Dynamic Optimization in Future Cellular
Networks. Bell Labs Technical Journal, 10(2):99–119, 2005.
[27] T. Bu, L. Li, and R. Ramjee. Generalized proportional fair scheduling in third generation wireless
data networks. In Proceedings IEEE INFOCOM 2006. 25TH IEEE International Conference on
Computer Communications, pages 1–12, April 2006.
[28] Chi-Kin Chau, Qian Wang, and Dah-Ming Chiu. Economic viability of paris metro pricing for digital
services. ACM Trans. Internet Technol., 14(2-3):12:1–12:21, October 2014.
[29] Jiasi Chen, Amitabha Ghosh, Josphat Magutt, and Mung Chiang. Qava: Quota aware video adapta-
tion. In Proceedings of the 8th International Conference on Emerging Networking Experiments and
Technologies, CoNEXT ’12, pages 121–132, New York, NY, USA, 2012. ACM.
[30] Jiasi Chen, Rajesh Mahindra, Mohammad Amir Khojastepour, Sampath Rangarajan, and Mung Chi-
ang. A scheduling framework for adaptive video delivery over cellular networks. In Proceedings of
the 19th Annual International Conference on Mobile Computing & Networking, MobiCom ’13,
pages 389–400, New York, NY, USA, 2013. ACM.
[31] Erik Dahlman, Stefan Parkvall, Johan Skold, and Per Beming. 3G Evolution, Second Edition: HSPA
and LTE for Mobile Broadband. Academic Press, 2 edition, 2008.
[32] A. Eryilmaz and R. Srikant. Fair resource allocation in wireless networks using queue-length-based
scheduling and congestion control. IEEE/ACM Transactions on Networking, 15(6):1333–1344, Dec
2007.
[33] Tobias Flach, Pavlos Papageorge, Andreas Terzis, Luis Pedrosa, Yuchung Cheng, Tayeb Karim,
Ethan Katz-Bassett, and Ramesh Govindan. An internet-wide analysis of traffic policing. In Pro-
ceedings of the 2016 ACM SIGCOMM Conference, SIGCOMM ’16, pages 468–482, New York, NY,
USA, 2016. ACM.
[34] Pierre Francois. Disruption Free Topology Reconfiguration in OSPF Networks. IEEE INFOCOM,
2007, 2007.
[35] Pierre Francois, Pierre-Alain Coste, Bruno Decraene, and Olivier Bonaventure. Avoiding Disrup-
tions During Maintenance Operations on BGP Sessions. IEEE Transactions on Network and Service
Management, 4(3):1–11, 2007.
[36] Aditya Ganjam, Faisal Siddiqui, Jibin Zhan, Xi Liu, Ion Stoica, Junchen Jiang, Vyas Sekar, and Hui
Zhang. C3: Internet-scale control plane for video quality optimization. In 12th USENIX Symposium
on Networked Systems Design and Implementation (NSDI 15), pages 131–144, Oakland, CA, May
2015. USENIX Association.
[37] Soudeh Ghorbani and Matthew Caesar. Walk the Line: Consistent Network Updates with Bandwidth
Guarantees. In HotSDN, 2012.
[38] Sangtae Ha, Soumya Sen, Carlee Joe-Wong, Youngbin Im, and Mung Chiang. Tube: Time-
dependent pricing for mobile data. In Proceedings of the ACM SIGCOMM 2012 Conference on
Applications, Technologies, Architectures, and Protocols for Computer Communication, SIGCOMM ’12, pages 247–258, New York, NY, USA, 2012. ACM.
[39] S. Hamalainen, H. Sanneck, and C. Sartori. LTE Self-Organising Networks (SON): Network Man-
agement Automation for Operational Efficiency. Wiley, 1st edition, 2012.
[40] Junxian Huang, Feng Qian, Yihua Guo, Yuanyuan Zhou, Qiang Xu, Z. Morley Mao, Subhabrata
Sen, and Oliver Spatscheck. An In-depth Study of LTE: Effect of Network Protocol and Application
Behavior on Performance. In ACM SIGCOMM, 2013.
[41] Te-Yuan Huang, Nikhil Handigol, Brandon Heller, Nick McKeown, and Ramesh Johari. Confused,
timid, and unstable: Picking a video streaming rate is hard. In Proceedings of the 2012 ACM Con-
ference on Internet Measurement Conference, IMC ’12, pages 225–238, New York, NY, USA, 2012.
ACM.
[42] Te-Yuan Huang, Ramesh Johari, Nick McKeown, Matthew Trunnell, and Mark Watson. A buffer-
based approach to rate adaptation: Evidence from a large video streaming service. SIGCOMM Com-
put. Commun. Rev., 44(4):187–198, August 2014.
[43] Jong Hun Rhee and Dong Ku Kim. Scheduling of Real/Non-real Time Services in an AMC/TDM
System: EXP/PF Algorithm, pages 506–513. Springer Berlin Heidelberg, Berlin, Heidelberg, 2003.
[44] M.A. Ismail, X. Xu, and R. Mathar. Autonomous Antenna Tilt and Power Configuration Based on
CQI for LTE Cellular Networks. In ISWCS, 2013.
[45] Raj Jain, Arjan Durresi, and Gojko Babic. Throughput fairness index: An explanation, 1999.
[46] A. Jalali, R. Padovani, and R. Pankaj. Data throughput of cdma-hdr a high efficiency-high data rate
personal communication wireless system. In Vehicular Technology Conference Proceedings, 2000.
VTC 2000-Spring Tokyo. 2000 IEEE 51st, volume 3, pages 1854–1858 vol.3, 2000.
[47] Junchen Jiang, Vyas Sekar, Henry Milner, Davis Shepherd, Ion Stoica, and Hui Zhang. Cfa: A practi-
cal prediction system for video qoe optimization. In 13th USENIX Symposium on Networked Systems
Design and Implementation (NSDI 16), pages 137–150, Santa Clara, CA, March 2016. USENIX As-
sociation.
[48] Junchen Jiang, Vyas Sekar, and Hui Zhang. Improving fairness, efficiency, and stability in http-
based adaptive video streaming with festive. In Proceedings of the 8th International Conference on
Emerging Networking Experiments and Technologies, CoNEXT ’12, pages 97–108, New York, NY,
USA, 2012. ACM.
[49] Xin Jin, Hongqiang Harry Liu, Rohan Gandhi, Srikanth Kandula, Ratul Mahajan, Ming Zhang, Jen-
nifer Rexford, and Roger Wattenhofer. Dynamic Scheduling of Network Updates. In ACM SIG-
COMM, 2014.
[50] Yong ju XIAN, Feng chun TIAN, Chang biao XU, and Yue YANG. Analysis of m-lwdf fairness and
an enhanced m-lwdf packet scheduling mechanism. The Journal of China Universities of Posts and
Telecommunications, 18(4):82 – 88, 2011.
[51] Frank Kelly. Charging and Rate Control for Elastic Traffic. European Transactions on Telecommu-
nications, 1997.
[52] N. Khan, M. G. Martini, and D. Staehle. Opportunistic proportional fair downlink scheduling for
scalable video transmission over lte systems. In Vehicular Technology Conference (VTC Fall), 2013
IEEE 78th, pages 1–6, Sept 2013.
[53] Kinda Khawam, Daniel Kofman, and Eitan Altman. The weighted proportional fair scheduler. In
Proceedings of the 3rd International Conference on Quality of Service in Heterogeneous Wired/Wire-
less Networks, QShine ’06, New York, NY, USA, 2006. ACM.
[54] R. Kokku, R. Mahindra, H. Zhang, and S. Rangarajan. Nvs: A substrate for virtualizing wireless
resources in cellular networks. IEEE/ACM Transactions on Networking, 20(5):1333–1346, Oct 2012.
[55] R. Kokku, R. Mahindra, H. Zhang, and S. Rangarajan. Cellslice: Cellular wireless resource slic-
ing for active ran sharing. In 2013 Fifth International Conference on Communication Systems and
Networks (COMSNETS), pages 1–10, Jan 2013.
[56] T. V. Lakshman, A. Ortega, and A. R. Reibman. VBR video: Tradeoffs and potentials. Proceedings of
the IEEE, 86(5):952–973, May 1998.
[57] S. B. Lee, I. Pefkianakis, A. Meyerson, S. Xu, and S. Lu. Proportional fair frequency-domain packet
scheduling for 3gpp lte uplink. In INFOCOM 2009, IEEE, pages 2611–2615, April 2009.
[58] F. Li, X. Qiu, L. Meng, H. Zhang, and W. Gu. Achieving Cell Outage Compensation in Radio Access
Network With Automatic Network Management. In IEEE GLOBECOM, 2011.
[59] W. Li, P. Yu, Z. Jiang, and Z. Li. Centralized Management Mechanism for Cell Outage Compensation
in LTE Networks. IJDSN, 2012.
[60] Hongqiang Harry Liu, Xin Wu, Ming Zhang, Lihua Yuan, Roger Wattenhofer, and David Maltz.
zUpdate: Updating Data Center Networks with Zero Loss. In ACM SIGCOMM, 2013.
[61] Q. Liu, Z. Zou, and C. W. Chen. Qos-driven and fair downlink scheduling for video streaming over
lte networks with deadline and hard hand-off. In 2012 IEEE International Conference on Multimedia
and Expo, pages 188–193, July 2012.
[62] Feng Lu, Hao Du, Ankur Jain, Geoffrey M. Voelker, Alex C. Snoeren, and Andreas Terzis. CQIC:
Revisiting cross-layer congestion control for cellular networks. In Proceedings of the 16th Interna-
tional Workshop on Mobile Computing Systems and Applications, HotMobile ’15, pages 45–50, New
York, NY, USA, 2015. ACM.
[63] H. Luo, S. Ci, D. Wu, J. Wu, and H. Tang. Quality-driven cross-layer optimized video delivery over
lte. IEEE Communications Magazine, 48(2):102–109, February 2010.
[64] Ajay Mahimkar, Zihui Ge, Jia Wang, Jennifer Yates, Yin Zhang, Joanne Emmons, Brian Huntley,
and Mark Stockert. Rapid Detection of Maintenance Induced Changes in Service Performance. In
ACM CoNEXT, 2011.
[65] Ajay Mahimkar, Zihui Ge, Jennifer Yates, Chris Hristov, Vincent Cordaro, Shane Smith, Jing Xu,
and Mark Stockert. Robust Assessment of Changes in Cellular Networks. In ACM CoNEXT, 2013.
[66] W. Mohr. Self-Organisation in Wireless Networks - Use Cases and Their Interrelation, May 2009.
[67] Andrew Odlyzko. Paris metro pricing for the internet. In Proceedings of the 1st ACM Conference on
Electronic Commerce, EC ’99, pages 140–147, New York, NY, USA, 1999. ACM.
[68] J.D. Parsons. The mobile radio propagation channel. Halsted Press, 1992.
[69] Giuseppe Piro, Nicola Baldo, and Marco Miozzo. An lte module for the ns-3 network simulator. In
Proceedings of the 4th International ICST Conference on Simulation Tools and Techniques, SIMU-
Tools ’11, pages 415–422, ICST, Brussels, Belgium, Belgium, 2011. ICST (Institute for Computer
Sciences, Social-Informatics and Telecommunications Engineering).
[70] C. Prehofer and C. Bettstetter. Self-Organization in Communication Networks: Principles and De-
sign Paradigms. Communications Magazine, IEEE, 2005.
[71] H. A. M. Ramli, R. Basukala, K. Sandrasegaran, and R. Patachaianand. Performance of well known
packet scheduling algorithms in the downlink 3gpp lte system. In Communications (MICC), 2009
IEEE 9th Malaysia International Conference on, pages 815–820, Dec 2009.
[72] Mark Reitblatt, Nate Foster, Jennifer Rexford, Cole Schlesinger, and David Walker. Abstractions for
Network Update. In ACM SIGCOMM, 2012.
[73] Mihaela van der Schaar and Philip A. Chou. Multimedia over IP and Wireless Networks: Compres-
sion, Networking, and Systems. Academic Press, Inc., Orlando, FL, USA, 2007.
[74] S. Sesia, I. Toufik, and M. Baker. The UMTS Long Term Evolution, From Theory to Practice. John
Wiley and Sons Ltd; 2nd edition, August 2011.
[75] Muhammad Zubair Shafiq, Lusheng Ji, Alex X. Liu, Jeffrey Pang, Shobha Venkataraman, and Jia
Wang. A First Look at Cellular Network Performance During Crowded Events. In ACM SIGMET-
RICS, 2013.
[76] Sanjay Shakkottai and Alexander L. Stolyar. Scheduling algorithms for a mixture of real-time and
non-real-time data in HDR. In Proceedings of the 17th International Teletraffic Congress (ITC-17), pages 793–804.
[77] Chong Shen, Dirk Pesch, and James Irvine. A Framework for Self-Management of Hybrid Wire-
less Networks Using Autonomic Computing Principles. 3rd Annual Communication Networks and
Services Research Conference, 2005.
[78] Kevin Spiteri, Rahul Urgaonkar, and Ramesh K. Sitaraman. BOLA: near-optimal bitrate adaptation
for online videos. CoRR, abs/1601.06748, 2016.
[79] Alexander L. Stolyar and Kavita Ramanan. Largest weighted delay first scheduling: Large deviations and optimality. To appear in Annals of Appl. Prob.
[80] Ming-Ting Sun and Amy R. Reibman. Compressed Video over Networks. Marcel Dekker, Inc., New
York, NY, USA, 1st edition, 2000.
[81] B. S. Tsybakov. File transmission over wireless fast fading downlink. IEEE Transactions on Infor-
mation Theory, 48(8):2323–2337, Aug 2002.
[82] D. De Vleeschauwer, H. Viswanathan, A. Beck, S. Benno, G. Li, and R. Miller. Optimization of
http adaptive streaming over mobile cellular networks. In 2013 Proceedings IEEE INFOCOM, pages
898–997, April 2013.
[83] Ye Wang, Hao Wang, Ajay Mahimkar, Richard Alimi, Yin Zhang, Lili Qiu, and Yang Richard Yang.
R3: resilient routing reconfiguration. In ACM SIGCOMM, 2010.
[84] L. Xia, W. Li, H. Zhang, and Z. Wang. A Cell Outage Compensation Mechanism in Self-Organizing
RAN. In WiCOM, 2011.
[85] J. Yang and J. Lin. Optimization of Power Management in a CDMA Radio Network. In IEEE-VTS
Fall VTC, 2000.
[86] Xiaoqi Yin, Abhishek Jindal, Vyas Sekar, and Bruno Sinopoli. A control-theoretic approach for
dynamic adaptive video streaming over http. In Proceedings of the 2015 ACM Conference on Special
Interest Group on Data Communication, SIGCOMM ’15, pages 325–338, New York, NY, USA,
2015. ACM.
Abstract
Cellular networks have become more and more important to our daily lives. Nowadays, besides constantly contacting friends and family, we rely on cellular networks to connect to the Internet for many purposes, including not only traditional email, web browsing and voice communication, but also emerging high-bandwidth applications, e.g., video, gaming, and business-critical applications. The growth in the number of cellular users and the demands of these high-bandwidth applications create a growing challenge for cellular service providers: to provide good cellular service.

Robustness and performance are two critical requirements for good cellular service. Robustness ensures that cellular users receive decent service despite network and environmental uncertainties; this is critical, as service disruption is not received well by customers. Performance, on the other hand, becomes more and more important as the number of cellular users and high-bandwidth applications grows: because of dramatically increased traffic volumes, cellular users are likely to experience degraded performance. Both robustness and performance can be improved by significantly expanding the cellular infrastructure, e.g., by building more base stations, but this solution is expensive and takes time to deploy. In this dissertation, we focus on a different direction: improving robustness and performance using the current infrastructure, without significant modifications to it.

For robustness, we focus on service disruptions induced by a frequently occurring network event: planned upgrades. Planned upgrades occur every day, may often need to be performed on weekdays, and can potentially degrade service robustness. We explore the problem of tuning base station configurations to mitigate the impact of a planned upgrade that takes a base station off-air. The objective is to recover the loss in service that would otherwise occur. We propose a proactive approach based on a predictive model that uses operational data to quickly estimate the best power and tilt configuration of neighboring base stations, enabling high service recovery. These ideas, embodied in a capability called Magus, enable us to recover up to 76% of the potential service loss due to planned upgrades.

For performance, we consider video streaming, the dominant application. Current cellular networks do not provide any service quality guarantee; during periods of base station congestion, video users receive poor cellular service, which can lead to degraded video quality of experience (QoE). We explore the idea of a premium service for video users to improve their video QoE. While increasing a video user's downlink throughput can improve QoE, we empirically observe that QoE can be further improved by maintaining the downlink throughput at one of the candidate bitrates of the corresponding video. We design SHADE to realize this idea. SHADE distributes transmission resources among users smartly, by selecting a candidate bitrate for each premium user and maintaining the downlink throughput at this bitrate target. SHADE achieves this with minimal changes to current cellular systems. Our extensive simulations indicate that SHADE can significantly improve multiple video QoE metrics, compared to previous proposals.