AUTOMATED REPAIR OF LAYOUT ACCESSIBILITY ISSUES IN
MOBILE APPLICATIONS
by
Ali S. Alotaibi
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
December 2023
Copyright 2023 Ali S. Alotaibi
Dedication
To my parents, Saleh and Maneia, my wife Ghadah, and my brothers and sisters, for
their unwavering love, support, and encouragement, and to my children, Abdulaziz
and Aljawharah, for being my source of joy and happiness.
Acknowledgements
Pursuing a Ph.D. at USC has been among the most rewarding yet challenging experiences of my
life. I extend my heartfelt gratitude to Allah Almighty for His blessings and for endowing me
with the knowledge, strength, and patience to complete this dissertation. I would also like to
recognize and thank the many people who have provided support, guidance, and encouragement
along the way.
I would like to express my gratitude to my advisor, Professor William G.J. Halfond. His guidance, mentorship, and support have been invaluable throughout my research journey. I am also
grateful for his personal care and attention. He possesses the rare ability to nurture not just the
intellectual but also the human aspects of his students. He consistently advocated for a balanced
approach between life and work and always asked about both my personal well-being and the
well-being of my family. This dissertation and my growth as a researcher were all possible because of his guidance and support. I would also like to extend my sincere appreciation to my
committee members: Professor Nenad Medvidović, Professor Mukund Raghothaman, Professor
Gisele Ragusa, and Professor Chao Wang. Their invaluable feedback and insights have shaped
this dissertation and helped me think critically about my research and improve the quality of my
work.
I would like to thank my colleagues and lab mates for making my time at USC both enjoyable
and intellectually enriching. A special thank you goes to Paul Chiou, who has been both a coauthor and a dear friend. I am grateful for and very proud of all the great experiences and amazing
work that we have accomplished together. I also extend my gratitude to Abdulmajeed Alameer,
Negarsadat Abolhasani, Mian Wan, and Yingjun Lyu, who were the senior members of my lab
when I joined. Their wisdom, critical feedback, and exemplary roles helped me navigate the
early days of my Ph.D. journey. I also want to thank my lab mates, Sasha Volokh, Zhaoxu Zhang,
Robert Winn, Fazel Tawsif, and Christina Chaniotaki, for their help, support, and the cherished
fun moments we have shared.
I would like to acknowledge and thank my teachers at all stages of my education, my professors at King Saud University, and the individuals who have believed in me and inspired me to
learn and grow. I would like to extend special thanks to Professor Abdullah Alghamdi for believing in me and for his unwavering support and encouragement. I am also very grateful to my close
friends Ahmed Almunify, Abdulaziz Alshayban, Abdulaziz Alaboudi, and Abdulaziz Arashidi for
always being there for me. I would like to extend my gratitude to all my friends in the Saudi
Student Association at USC, my friends in Glendale, and all the wonderful people I have had the
privilege of meeting throughout Southern California and the U.S. Thank you for your friendships
and for contributing to all the memorable events and activities we’ve enjoyed together, which
have been a constant source of joy and inspiration, providing much-needed balance to my Ph.D.
life.
I owe a deep debt of gratitude to my family back in Saudi Arabia: my parents, my brothers,
and my sisters. Although we were separated by continents, your unconditional love, support,
and encouragement have always been with me. You have been a constant source of strength
and inspiration, keeping me motivated in even the most challenging phases of this journey. I
am lucky to have you as my family and am forever thankful to each one of you. Finally and
most importantly, my deepest love and appreciation go to my wife, Ghadah. The journey of
completing this Ph.D. has coincided with another significant journey—becoming parents to two
beautiful children, Abdulaziz and Aljawharah. Your sacrifices and unwavering support have been
the cornerstone of not just this dissertation but of our evolving family life. This journey would
not have been possible without you by my side. I feel profoundly grateful and blessed to have you
as my wife and as a partner in this incredible adventure, and for that, I am eternally thankful.
Table of Contents
Dedication
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
1.1 Major Challenges and Insights
1.1.1 Challenges for Automated Repair of Layout Accessibility Issues
1.1.2 Insights to Automatically Repair Layout Accessibility Issues
1.2 Hypothesis
1.3 Overview of Dissertation
1.3.1 Chapter 3: A Framework to Repair Layout Accessibility Issues in Mobile Apps
1.3.2 Chapters 4 and 5: Repairing Layout Accessibility Issues in Mobile Apps
1.4 Contributions
Chapter 2: Background
2.1 Mobile User Interfaces
2.2 Mobile Accessibility
Chapter 3: A Framework for Repairing Layout Accessibility Issues
3.1 Analysis Phase
3.1.1 Modeling the UI
3.1.2 UI Dependency Analysis
3.2 UI Generation Phase
3.2.1 Generating Candidate Repaired UIs
3.2.2 Rendering Candidate Repaired UIs
3.2.3 Evaluating and Ranking Candidate Repaired UIs
3.2.4 Terminating the Search and Generating the Final Repair
Chapter 4: Repairing UI Scaling Accessibility Failure (USAF)
4.1 Background
4.2 Approach
4.2.1 The Analysis Phase
4.2.1.1 Building the UI Model
4.2.1.2 UI Dependency Analysis
4.2.2 Generating Repaired UIs
4.2.2.1 Repair representation
4.2.2.2 Fitness Function
4.2.2.3 Rendering and Evaluating the Candidate Repairs
4.3 Evaluation
4.3.1 Implementation
4.3.2 Subjects
4.3.3 Experiment One
4.3.3.1 Protocol
4.3.3.2 Presentation of Results
4.3.3.3 Discussion of Results
4.3.4 Experiment Two
4.3.4.1 Protocol
4.3.4.2 Presentation of Results
4.3.4.3 Discussion of Results
4.3.5 Threats to Validity
4.4 Conclusion
Chapter 5: Repairing Size Based Inaccessibility Issues (SBIIs)
5.1 Background
5.2 Approach
5.2.1 Analysis Phase
5.2.1.1 Building the UI Model
5.2.1.2 UI Dependency Analysis
5.2.2 Generating Repaired UIs
5.2.2.1 Fitness Function
5.2.2.2 Solution Representation and Initial Population
5.2.2.3 Generating a Repair
5.3 Evaluation
5.3.1 Implementation
5.3.2 Subjects
5.3.3 Experiment One
5.3.3.1 Protocol
5.3.3.2 Presentation of Results
5.3.3.3 Discussion of Results
5.3.4 Experiment Two
5.3.4.1 Protocol
5.3.4.2 Presentation of Results
5.3.4.3 Discussion of Results
5.3.5 Threats to Validity
5.4 Conclusion
Chapter 6: Related Work
6.1 Accessibility and Usability of Mobile Applications
6.2 Accessibility and Usability of Web Applications
Chapter 7: Conclusion and Future Work
7.1 Summary
7.2 Future Work
List of Tables
4.1 ScaleFix’s effectiveness in repairing USAFs (RQ1)
5.1 Results for SALEM’s effectiveness in repairing SBIIs (RQ1) and its run time (RQ2).
List of Figures
1.1 Examples of accessibility issues impacting the UIs of mobile apps.
1.2 Example of USAFs caused by the use of scaling assistive services.
1.3 Example of Size-Based Inaccessibility Issue (SBII) issues: The left screenshot shows the original UI, while the right screenshot highlights the SBIIs as identified by the Google Accessibility Scanner.
2.1 Screenshot and associated View Hierarchy (VH) of the ‘End User License Agreement’ UI within the Fedex app, captured using the UI Automator Viewer.
2.2 Example of using Google Accessibility Scanner (GAS) to Identify Accessibility issues: The left screenshot displays issues as identified by GAS, and the right screenshot provides detailed information upon selecting one of these identified issues.
3.1 Overview of the repair framework
3.2 A simple example of a UI model
4.1 Examples of collision and text cutoff USAFs caused by the use of scaling assistive services.
4.2 Examples of missing USAFs caused by the use of scaling assistive services.
4.3 An example of a Dimension relationship
4.4 An example of a Weight relationship
4.5 An example of a Space relationship
4.6 Accessibility Rate for the FD UIs before / after repair.
4.7 Ratings of the original scaled UIs and our repairs.
5.1 Example of SBIIs issues: The left screenshot shows the original UI, while the right screenshot highlights the SBIIs as identified by the Google Accessibility Scanner.
5.2 Example that shows two apps’ UIs annotated with a simplified version of the visually related groups that were identified by the clustering algorithm.
5.3 Participants’ preference between the original and repaired UI versions
5.4 Example that demonstrates SALEM. The left screenshot shows the original UI, the middle screenshot highlights the SBIIs detected by Google Accessibility Scanner, and the right screenshot shows the UI after applying SALEM’s repair.
Abstract
Mobile accessibility is more critical than ever due to the significant increase in mobile app usage,
particularly among people with disabilities who rely on mobile devices to access essential information and services. People with vision and motor disabilities often use assistive technologies to
interact with mobile applications. However, recent studies show that a significant percentage of
mobile apps remain inaccessible due to layout accessibility issues, making them challenging to
use for older adults and people with disabilities. Unfortunately, existing techniques are limited
in helping developers debug these issues; they can only detect issues but not repair them. Therefore, the repair of layout accessibility issues remains a manual, labor-intensive, and error-prone
process.
Automated repair of layout accessibility issues is complicated by several challenges. First, a
repair must account for multiple issues holistically in order to preserve the relative consistency
of the original app design. Second, due to the complex relationship between UI components,
there is no straightforward way of identifying the set of elements and properties that need to be
modified for a given issue. Third, assuming the relevant views and properties could be identified,
the number of possible changes that need to be considered grows exponentially as more elements
and properties need to be considered. Finally, a change in one element can create cascading
changes that lead to new problems in other areas of the UI. Together, these challenges make a
seemingly simple repair difficult to achieve.
In this dissertation, I introduce a repair framework that builds and analyzes models of the
User Interface (UI) and leverages multi-objective genetic search algorithms to repair layout accessibility issues. To evaluate the effectiveness of the framework, I instantiated it to repair the
different known types of layout accessibility issues in mobile apps. The empirical evaluation of
these instantiations on real-world mobile apps demonstrated their effectiveness in repairing these
issues. In addition, I conducted user studies to assess the impact of the repairs on the UI quality
and aesthetics. The results demonstrated that the repairs not only made the UIs more accessible but
also did not distort or significantly change their original design. Overall, these results confirm my dissertation’s hypothesis that a repair framework employing a multi-objective genetic search-based approach can be
highly effective in automatically repairing layout accessibility issues in mobile applications.
Chapter 1
Introduction
Mobile devices have become one of the most common means of modern-day communication.
Studies show that a majority (59%) of users access daily tasks, such as shopping, product registration, and finances, via the mobile interface as opposed to a web-based interface [26]. As more
products and services are brought to consumers via mobile applications (apps), it is important
for the 16% of the world population with disabilities to have equal access to these apps [28, 65].
Yet the majority of apps today are not built to be accessible to users with disabilities [8, 65]. The design of a user interface (UI) plays an essential role in the quality of mobile
apps, including their accessibility and usability. However, many older adults and people with disabilities find it challenging to interact with and comprehend the content of mobile UIs, leading to
difficulties in recognizing and understanding the displayed information [6, 5, 8, 66]. The severity
of the situation is reflected in recent findings, where more than 78% of mobile apps have at least
10% of their UI elements inaccessible to disabled users [65]. Moreover, over 44% of mobile apps
contain accessibility issues that hinder their compatibility with assistive services [65]. Recent
studies also show that when mobile UIs are not designed with accessibility in mind, it can be
difficult for users to navigate mobile apps [5, 66, 67].
Ensuring that mobile apps are accessible is not only a social responsibility to ensure equal access for everyone but also a legal requirement. U.S. laws and regulations require digital content to
be accessible to people with disabilities. Case in point, the Americans with Disabilities Act (ADA)
in the U.S. prohibits discrimination against disabled people in public life settings, such as in education, transportation, and public accommodations. This law also requires companies and developers to make their services, including their websites and mobile apps, accessible to people with
disabilities. To help in compliance with these legal requirements, various standards and guidelines have been developed. These technical guidelines provide specifications and instructions
for how to make apps accessible and ensure their UIs are compatible with accessibility assistive
services. One notable example is the Web Content Accessibility Guidelines (WCAG) [83], which
was developed by accessibility experts and software professionals. WCAG provides instructions
and steps for making digital content, including mobile apps, accessible to users with disabilities.
Additionally, tech companies like Google and Apple have created their own accessibility guidelines to assist developers in building more accessible apps on their respective platforms [11, 17].
In addition to the legal requirements, making mobile apps accessible ensures that people with
disabilities have equal access to important information and services, just as their able-bodied
counterparts do.
Failure to comply with these laws and guidelines can result in legal and financial consequences
for app developers. An example of this is Robles v. Domino’s Pizza, where Domino’s Pizza was
sued for having an inaccessible mobile app [29]. In that lawsuit, a blind person sued the company
for accessibility issues that impacted disabled users navigating the app using assistive services
and that prevented them from being able to place orders online. The case went through multiple
levels of the federal judicial system, including the Supreme Court, before it was settled in June
2022 for an undisclosed amount. Similarly, companies such as Apple, Amazon, and Target have
faced lawsuits for not making their apps and platforms accessible to people with disabilities [76].
Target eventually settled the case and was required to pay $6 million. While the cases involving
Apple and Amazon are still ongoing, the precedents set by the Domino’s and Target cases suggest
that they could either result in substantial financial settlements or continue to be litigated.
These types of lawsuits and complaints are on the rise as more services are becoming available
online and as more people become reliant on web and mobile apps to access these services. In
fact, recent data show that ADA digital accessibility-related lawsuits are expected to jump by
approximately 82% from 2018 to 2023 [82].
(a) Feedly app (b) Yelp app (c) Cnet app
Figure 1.1: Examples of accessibility issues impacting the UIs of mobile apps.
A wide range of issues can negatively impact the accessibility of mobile apps.
Figure 1.1 shows examples of these accessibility issues in real-world mobile apps. A disabled
user navigating the Yelp app using navigation assistive services (e.g., TalkBack) will not be able
to rate restaurants and services or submit reviews of places because the rating element, highlighted
in Figure 1.1b, is completely inaccessible to these assistive services. Similarly, the three elements
on the main UI of the Feedly app, highlighted in Figure 1.1a, are completely inaccessible to users
navigating the app using assistive services. Without access to these elements, users will not be
able to dismiss the splash screen and access the main page, access the app’s main menu, or use
the app’s search functionality. In other words, users using assistive services will not have access
to any of the app’s functionality, rendering it completely useless for those users. In the Cnet
app, shown in Figure 1.1c, when the user activates a scaling assistive service to enhance text
readability, the content in the profile UI becomes distorted, making it difficult to understand the
content of the UI. Studies show that these scaling-based distortions can significantly impact the
accessibility of these apps and make it harder for people with disabilities to navigate and interact
with mobile apps [20, 8, 65].
This dissertation specifically addresses layout accessibility issues. These are accessibility issues that impact the organization and rendering of UI elements in mobile applications, making
them less accessible to people with disabilities. Layout accessibility issues primarily affect older
adults and people with vision and motor disabilities who depend on assistive services to use mobile apps [49, 9, 71]. Recent studies show that layout accessibility issues are widespread in mobile
apps [8, 49, 9], with more than 73% of apps having over 10% of their elements impacted by these
issues. In fact, studies indicate that these issues render the text in as much as 23% of popular
mobile apps unreadable for people with disabilities [8]. Layout accessibility issues can occur in
two main ways. First, issues occur when the design of a mobile app’s UI is not compatible with
the accessibility assistive services that many users with disabilities rely on. This leads to various problems and inconsistencies that hinder these users’ ability to access mobile UIs. Second,
issues occur when UI elements are not designed to meet the requirements outlined by the accessibility guidelines. As a result, these elements become challenging to access or interact with,
significantly impacting accessibility and user experience, especially for older adults and people
with motor disabilities. This dissertation divides UI layout accessibility issues into two main
categories: (1) UI Scaling Accessibility Failures (USAFs), which include issues that manifest due
to UI scaling, such as text scaling issues, missing scaling issues, and collision scaling issues; and
(2) Size-Based Inaccessibility Issues (SBIIs), which include issues that manifest due to the small
size of interactive elements in the UI.
(a) Elements colliding (Cnet and Fedex apps) (b) Elements missing (8Vim app) (c) Text cutoff (Infini app)
Figure 1.2: Example of USAFs caused by the use of scaling assistive services.
The first type of UI layout accessibility issue, USAFs, occurs when a mobile UI is not compatible with scaling assistive services. Scaling assistive services, which are the most commonly used
accessibility features among those who utilize accessibility services [1], enable users with disabilities to enlarge screen content to suit their needs. These services
are particularly helpful for individuals with mild visual or motor disabilities who may find it difficult to read small text or interact with small UI content. However, when a UI is scaled up, the
organization and rendering of elements across the screen can become distorted. These inconsistencies can result in lost content or inaccessible functionality. Recent studies have shown that
more than 23% of popular Android apps on Google Play [13] and F-Droid [34] contain USAFs [49,
8]. These issues can manifest in several ways. First, there are collision issues where multiple
elements overlap with each other. Examples of this issue are shown in Figure 1.2a. In the Cnet
app, the two text blocks at the top of the UI become unreadable after scaling the UI, while in the
Fedex app, the toggle and information buttons collide, making it difficult for users to distinguish
or activate one button without inadvertently activating the other. Second, there are missing element issues, where elements disappear from the UI when scaled. Figure 1.2b shows an example of
this issue in the 8Vim app, where scaling the UI causes links to the app’s Google Play Store page,
GitHub page, and other social accounts to move out of view and disappear. Finally, there are text
visibility issues, where text within UI elements is either cut off or rendered invisible. Figure 1.2c
shows an example of this issue in the Infini app. The app allows users to manage daily activities
through created lists. In the app’s popup dialog, five lines of instructional text designed to guide
the user on how to use the app are cut off from the bottom of the UI due to UI scaling.
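To make the collision and missing-element cases concrete, the following is a minimal sketch of how such conditions could be checked from rendered element bounds. It is an illustration only and not the detection logic of any tool cited here; the `Bounds` type, the element names, and the pixel values are hypothetical, and the check assumes it is applied to sibling elements, since a parent legitimately overlaps its children.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Bounds:
    """Axis-aligned screen bounds in pixels (hypothetical representation)."""
    left: int
    top: int
    right: int
    bottom: int


def overlaps(a: Bounds, b: Bounds) -> bool:
    """Return True if the two rectangles share a region of non-zero area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def is_off_screen(element: Bounds, screen: Bounds) -> bool:
    """Return True if the element lies entirely outside the visible screen."""
    return not overlaps(element, screen)


def find_collisions(elements: dict[str, Bounds]) -> list[tuple[str, str]]:
    """Return every pair of (sibling) elements whose bounds overlap."""
    return [
        (name_a, name_b)
        for (name_a, a), (name_b, b) in combinations(elements.items(), 2)
        if overlaps(a, b)
    ]


# Hypothetical bounds after scaling: the toggle and info buttons now overlap.
screen = Bounds(0, 0, 1080, 1920)
scaled = {
    "toggle": Bounds(800, 100, 1000, 200),
    "info": Bounds(950, 120, 1050, 180),
    "title": Bounds(0, 100, 700, 200),
}
collisions = find_collisions(scaled)
pushed_out = is_off_screen(Bounds(0, 2000, 500, 2100), screen)
```

Here `collisions` contains only the toggle/info pair, and `pushed_out` is true because the last element has moved below the bottom edge of the screen.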
The second type of UI layout accessibility issue, SBIIs, occurs when interactive elements (i.e.,
touch targets) in the UI are too small to comply with accessibility guidelines. These guidelines
require that such elements be large enough for older adults and people with motor disabilities to
interact with accurately [59, 19]. The presence of SBIIs can significantly hinder users’ access to
the UI by slowing them down, increasing the chance of making mistakes, and making app navigation challenging for people with disabilities and older adults [65, 8, 64]. SBII is a very common
accessibility issue in mobile apps. In fact, a study found that at least 73% of mobile apps have over
10% of their interactive elements affected by SBIIs. Another recent large-scale empirical study
on Android app accessibility ranked SBIIs as the second most common accessibility issue [8]. Figure 1.3 shows an example of this issue in the fintech app Credit Sesame, where all the interactive
elements of the "Create Your Account" UI are flagged by Google Accessibility Scanner [2] for having SBIIs. This could make creating an account a frustrating and error-prone process for users
with motor disabilities or for older adults.
Figure 1.3: Example of SBII issues: The left screenshot shows the original UI, while the right
screenshot highlights the SBIIs as identified by the Google Accessibility Scanner.
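The size requirement behind SBIIs can be illustrated with a short sketch. The text above does not state a numeric threshold, so the 48dp minimum used here is an assumption taken from Android's platform accessibility guidance; the function names are hypothetical. The conversion follows Android's definition of density-independent pixels, where 1dp equals 1px on a 160dpi screen.

```python
# Assumption: 48dp minimum touch target, per Android accessibility guidance.
MIN_TOUCH_TARGET_DP = 48.0


def px_to_dp(px: float, dpi: float) -> float:
    """Convert physical pixels to density-independent pixels (dp)."""
    return px * 160.0 / dpi


def is_too_small(width_px: float, height_px: float, dpi: float,
                 min_dp: float = MIN_TOUCH_TARGET_DP) -> bool:
    """Return True if either dimension falls below the minimum target size."""
    return px_to_dp(width_px, dpi) < min_dp or px_to_dp(height_px, dpi) < min_dp


# On a 480dpi screen, 144px corresponds to exactly 48dp.
compliant = is_too_small(144, 144, dpi=480)   # both sides are 48dp
violating = is_too_small(120, 144, dpi=480)   # width is only 40dp
```

A check of this kind only flags an element; as the following paragraphs discuss, deciding how to enlarge it without disturbing the rest of the layout is the harder problem.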
Existing techniques are limited in terms of helping developers repair layout accessibility issues. Many techniques can help developers detect these issues. For example, Google Accessibility Scanner [2], MATE [31], Accessibility Testing Framework [38], and IBM Accessibility
Checker [41] can automatically detect size-based layout accessibility issues. Meanwhile, techniques such as dVermin [71], OwlEye [49], and AccessiText [9] focus on detecting scaling layout
accessibility issues. However, all of these techniques focus on detection and do not attempt to
repair layout accessibility issues. Other techniques provide various workarounds to help users interact with inaccessible UIs rather than fix them. Interactiles [89] focuses on making touchscreens
accessible by attaching a hardware interface to Android phone screens to enhance tactile interaction for visually impaired users. Similar hardware overlay techniques [74, 40, 45] have been
proposed in HCI research, but they do not address the underlying issues and require hardware
cutouts to be tailored to fit the devices. Software-based approaches [44, 43, 18] mostly focus on
using audio-based interaction techniques to allow the visually impaired to access touchscreens.
Android Magnification service [84] enlarges part of the screen to make it easier for people to
interact with the UI, while Touch Guard [90] helps enlarge the touch area to disambiguate the
bounds between multiple targets. These tools serve as alternative assistive technologies when
users have trouble perceiving or operating an app’s UI but cannot be used to repair layout accessibility issues. Thus, repairing layout accessibility issues remains largely a manual process,
which is both labor-intensive and error-prone.
1.1 Major Challenges and Insights
1.1.1 Challenges for Automated Repair of Layout Accessibility Issues
Repairing layout accessibility issues is complicated by the following challenges. The first challenge involves determining what needs to be changed in the UI to repair a given issue. Although
accessibility detectors can often report elements exhibiting layout accessibility issues, they do
not specify which elements should be modified or which properties should be adjusted to repair these issues. This is mainly due to the complex relationships between UI elements and their
properties, which determine how these elements are rendered in the UI. Repairing these issues,
therefore, may require making changes not just to the elements exhibiting the issues but also to
other related elements, which may themselves have relationships with other elements that need
to be modified. For example, the size of an element in a mobile UI is determined by its own
size-related properties, by the size-related properties of its neighboring elements, and by the size of
the elements that visually contain it.
The second challenge involves the large space of solutions that may need to be considered.
Even if one can determine what needs to be changed to fix the issues (i.e., addressing the first
challenge), the space of possible repairs grows exponentially as the number of issues, elements,
and properties that need to be considered increases. A single UI element can have up to 10
properties, with many represented as numerical values ranging from zero to potentially large
numbers. When multiple properties across several elements are considered, the space of potential
solutions expands exponentially.
The third challenge is that a perfect repair may not even exist. Even if the first two challenges
are addressed, a repair that completely fixes the layout accessibility issues without distorting
the UI might not be achievable. Introducing changes to one part of the UI can trigger cascading
changes in other parts of the UI as elements change and move to accommodate the initial changes.
This complexity increases when multiple issues exist in the UI, as this increases the number of
necessary changes that need to be introduced and the likelihood that the final layout will be
distorted.
1.1.2 Insights for the Automated Repair of Layout Accessibility Issues
To address the challenges of repairing layout accessibility issues, I identified the following
insights:
Insight 1: The complex relationships between UI elements, when correctly modeled,
can guide the process of identifying the changes needed to fix a given layout accessibility issue. This insight directly addresses the first challenge by focusing on the core of that
challenge: the inter-dependency of UI elements and their properties. Modeling these relationships, especially those impacted by layout accessibility issues, allows for a systematic approach
to tracking these dependencies. This, in turn, helps in pinpointing which elements or properties
need to be modified to repair a specific layout accessibility issue. This insight also partially addresses the second challenge, as focusing on modifying a subset of elements and properties in the
UI reduces the space of possible solutions that may need to be considered.
Insight 2: A multi-objective genetic search approach can be effective in generating a UI
that repairs the layout accessibility issues. These search-based techniques offer a way to
effectively navigate large solution spaces without having to explore an exponential number of
solutions. This directly addresses the second challenge by recognizing that finding an absolute
"best" solution is not always practical or necessary. Instead, these techniques can identify “good
enough” solutions that repair layout accessibility issues in a reasonable runtime. These search-based techniques are also effective in considering trade-offs when finding a repair. This directly
addresses the third challenge by recognizing that a perfect repair may not always be achievable,
and therefore, trade-offs may be necessary. Using a fitness function, these techniques can act as
an approximation to identify satisfactory solutions more efficiently when multiple "good enough"
solutions may satisfy the constraints.
1.2 Hypothesis
Based on the insights presented above, the hypothesis of my dissertation is:
A repair framework that builds models of the UI and employs a multi-objective genetic
search-based approach to generate repairs can effectively repair layout accessibility issues.
To evaluate this hypothesis, I designed a framework that can automatically repair layout accessibility issues. At a high level, the framework builds models of the UI that capture important
relationships between its elements and their properties. The framework then analyzes these models to identify what needs to be changed in the UI to repair the issues. Once these changes are
identified, the framework starts the UI repair process by generating and evaluating candidate
repaired UIs using a search-based approach until a successful repair is identified.
To evaluate the effectiveness of this framework, I instantiated it to repair USAF and SBII
layout accessibility issues. I assessed these instantiations by applying them to the UIs of real-world mobile apps and conducting user studies to gauge the quality of the repaired UIs. The
notion of "effectiveness" is operationalized through several key metrics. First, the repair rate
showed that the framework successfully repaired over 91% of USAFs and 99% of SBIIs. Second,
the results of the conducted user studies not only confirmed that the repairs did not distort the
UI, but they also showed that participants’ ratings and preferences often favored the repaired
UIs over the originals. Therefore, with respect to these criteria, the results demonstrate that
the instantiations effectively repair layout accessibility issues without negatively impacting or
distorting the original UIs.
1.3 Overview of Dissertation
The goal of my dissertation is to assist developers in automatically repairing layout accessibility
issues in mobile apps. To accomplish this, I designed a repair framework that relies on building
and analyzing models of a mobile UI to determine what needs to be changed to fix the issues.
It then uses a multi-objective genetic search approach to generate the repaired UI. With this
repair framework, I then addressed two specific types of layout accessibility issues in mobile
apps: UI Scaling Accessibility Failures (USAFs) and Size-based Inaccessibility Issues (SBIIs). The
effectiveness of these framework instantiations was demonstrated by evaluating them on real-world mobile apps and conducting user studies with users affected by these issues. The evaluation
showed that these instantiations were effective in repairing the layout accessibility issues without
impacting the quality or aesthetics of the original UIs.
My dissertation work is divided into three main parts: the repair framework (Chapter 3), the
repair of USAFs (Chapter 4), and the repair of SBIIs (Chapter 5). I provide a high-level overview
of these three parts in the following subsections.
1.3.1 Chapter 3: A Framework to Repair Layout Accessibility Issues in Mobile Apps
In this chapter, I introduce a framework designed to repair layout accessibility issues in mobile
apps. This framework breaks down the repair tasks into separate, identifiable components and
offers instantiation points at key stages of the repair process. This allows the framework to be
adopted and instantiated to repair the different types of layout accessibility issues.
The chapter begins by detailing the repair process and showing the critical role each component of the framework plays in the repair process. It then outlines the steps required to instantiate
the framework for repairing a specific type of layout accessibility issue. Particularly, it discusses
how UI relationships and properties that control the visual rendering of elements in the UI can be
modeled. It then shows how this modeling can be employed to identify the elements that might
need to be modified to repair a layout accessibility issue. Furthermore, the chapter describes how
to generate candidate repairs, assess their qualities, and guide the search process toward finding
the optimal repair. The effectiveness of the framework is later evaluated by applying it to repair
layout accessibility issues in Chapter 4 and Chapter 5.
1.3.2 Chapters 4 and 5: Repairing Layout Accessibility Issues in Mobile Apps
In Chapters 4 and 5, my goal is to evaluate the effectiveness of my framework by applying it to the
two types of layout accessibility issues in mobile apps: UI Scaling Accessibility Failures (USAFs)
and Size-based Inaccessibility Issues (SBIIs). In each chapter, I focus on instantiating the different
components of the framework for the respective issue type. I also show the effectiveness of the
instantiation by evaluating it on real-world apps and assessing the impact of my repairs on the
visual quality of the UIs through a user study.
The work conducted in these chapters has been published at the 36th IEEE/ACM International
Conference on Automated Software Engineering (ASE) in 2021 and the 39th IEEE International
Conference on Software Maintenance and Evolution (ICSME) in 2023.
1. Ali S. Alotaibi, Paul T. Chiou and William G.J. Halfond. Automated Repair of Size-Based
Inaccessibility Issues in Mobile Applications. In 36th IEEE/ACM International Conference
on Automated Software Engineering (ASE). November 2021.
2. Ali S. Alotaibi, Paul T. Chiou, Fazel Mohammed Tawsif, and William G.J. Halfond. ScaleFix:
An Automated Repair of UI Scaling Accessibility Issues in Android Applications. In 39th
IEEE International Conference on Software Maintenance and Evolution (ICSME). October
2023.
1.4 Contributions
The contributions of my dissertation are as follows:
Repair Framework: I designed a conceptual framework for repairing layout accessibility issues in mobile applications. My approach is the first to provide a general repair framework for
layout accessibility issues in mobile applications. Part of this contribution involved an extensive
analysis of the different layout accessibility issues to understand how these issues manifest and
identify strategies for how they can be repaired. From this analysis, I abstracted the repair process into a modular set of components that can be instantiated and tailored to repair different
types of layout accessibility issues. I discuss the details of this contribution in Chapter 3.
Repairing Layout Accessibility Issues: To evaluate the effectiveness of the repair framework, I instantiated it to repair the two types of layout accessibility issues in Android applications. These are the first automated techniques to repair these issues in mobile UIs. Part of this
contribution involved the formalization and modeling of UI relationships related to and impacted
by the layout accessibility issues in Android UIs. The contribution also involved
identifying the set of UI qualities needed to rank the generated repairs and guide the search process to finding successful repairs. The contribution also included evaluating the instantiations on
a set of real-world apps and conducting a user study that showed my repairs do not compromise
the attractiveness of the UIs and are preferred for mobile usage. I discuss this contribution in
Chapter 4 and Chapter 5.
Chapter 2
Background
This chapter provides background information on concepts that are used throughout this dissertation. I start this chapter with Section 2.1, which gives an overview of mobile User Interfaces
(UIs), highlighting their structure and how they can be built and analyzed. Section 2.2 focuses on
mobile accessibility. In this section, I examine the types of accessibility issues that impact mobile apps, the existing guidelines that define what it means for an app to be accessible, and the
various methods that developers can employ to identify and address accessibility issues in their
applications.
2.1 Mobile User Interfaces
An Android app consists of a set of activities. An activity is the class that creates the user interface (UI) window. The UI is organized in a tree-based structure called the View Hierarchy (VH),
where elements are organized in hierarchical relationships. The VH contains information about
the logical relationships among elements, such as the parent-child relationship, and information
about the visual properties of each element, such as size, color, and location. UI elements are either Views or ViewGroups. A View (also known as a “widget”) occupies a rectangular area on the
screen and is visible by default. A ViewGroup is a special type of View that can contain other Views
as children, allowing for more complex and nested layouts. It controls the layout parameters of
its child Views as well as various aspects of their visual rendering on the UI. A touch target refers
to any element on the UI that a user can touch, click, or interact with to perform some action.
These include elements that are interactive by default (e.g., Buttons) and non-interactive elements
attached to an event handler that allow them to respond to user actions (e.g., implementing an
OnClick method for an ImageView or a LinearLayout).
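As an illustration only, the tree structure described above can be sketched in a few lines of Python. The `ViewNode` class, its fields, and the example element IDs are hypothetical simplifications and are not part of the Android SDK; the sketch only shows how touch targets could be collected by walking a View Hierarchy.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class ViewNode:
    """One element of a simplified, hypothetical View Hierarchy."""
    class_name: str              # e.g. "Button", "LinearLayout"
    view_id: str
    clickable: bool = False      # interactive by default or given a click handler
    children: List["ViewNode"] = field(default_factory=list)


def walk(node: ViewNode) -> Iterator[ViewNode]:
    """Yield every node of the hierarchy in depth-first pre-order."""
    yield node
    for child in node.children:
        yield from walk(child)


def touch_targets(root: ViewNode) -> List[str]:
    """Return the IDs of all elements a user can interact with."""
    return [node.view_id for node in walk(root) if node.clickable]


# A ViewGroup (RelativeLayout) containing another ViewGroup and a View.
root = ViewNode("RelativeLayout", "root", children=[
    ViewNode("LinearLayout", "row", clickable=True, children=[
        ViewNode("Button", "accept", clickable=True),
        ViewNode("TextView", "label"),
    ]),
    ViewNode("ImageView", "logo", clickable=True),
])
print(touch_targets(root))
```

Note that the clickable `LinearLayout` and `ImageView` are reported alongside the `Button`, mirroring the point that non-interactive elements become touch targets once an event handler is attached.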
The appearance of an element can be configured using a set of properties (i.e., attributes).
Android uses these properties to determine how these elements are rendered on the screen. For
example, height and width properties (i.e., dimensions) can be used to set the size of an element in
the UI. These properties can either be set to a specific number (e.g., 40dp), to match the size of the
parent element (i.e., using match_parent), or to be just large enough to enclose its content (i.e., using wrap_content). Elements can also be created with margins and padding, which add additional space
between elements and between an element and its content, respectively. Another configurable
property is the weight value, which specifies how much space an element should occupy within
its parent. This weight property is used to divide up extra space within the parent and allocate it
to its contained elements.
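The effect of the weight property can be illustrated with a short calculation. The following Python sketch assumes a simplified rule in which the space left over in the parent is shared among the children in proportion to their weights; the function name and the numbers are hypothetical, and details of the real layout pass (such as an explicit weight sum or multiple measurement passes) are ignored.

```python
from typing import List


def distribute_extra_space(parent_size: float,
                           base_sizes: List[float],
                           weights: List[float]) -> List[float]:
    """Share the parent's leftover space among children by weight.

    Simplifying assumption: each child first receives its base size, and
    the remaining space is split proportionally to the weights.
    """
    total_weight = sum(weights)
    extra = parent_size - sum(base_sizes)
    if total_weight == 0:
        # No child asked for extra space, so sizes stay unchanged.
        return list(base_sizes)
    return [base + extra * weight / total_weight
            for base, weight in zip(base_sizes, weights)]


# Three children of 60dp each inside a 300dp parent; weights 1, 2, and 0.
sizes = distribute_extra_space(300, [60, 60, 60], [1, 2, 0])
print(sizes)
```

In this example the 120dp of leftover space is split 40dp and 80dp between the first two children, while the third child, with a weight of zero, keeps its base size.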
Many properties exist to control how text appears and expands on the screen. For example,
the singleLine property can be used to specify whether the text should be confined to a single
line or allowed to expand to multiple lines. Elements can also be defined with constraints in relation to other elements. For example, the layout_constraintEnd_toEndOf constraint ensures that the end
of one element aligns with the end of another. Additional properties include element ID, which
helps identify an element, and content description, which helps assistive services understand the
role of the element and its functionality in the UI. Properties can also specify the visibility and the
types of interaction an element supports, such as touch, swipe, or long touch. In addition to the
properties defined for the elements inside the app layout files, elements can be assigned additional
properties after they have been rendered on the screen. These properties capture the elements’
runtime visual characteristics, such as their Minimum Bounding Rectangle (MBR), size, and positional relationships with each other. The VH of the UI and the properties of its elements can be
obtained by parsing the app’s layout files and utilizing Android SDK tools such as Android Debug
Monitor Library (ddmlib) [37], Layout Inspector [46], or UI Automator [77]. Figure 2.1 shows an
example of using UI Automator Viewer, a graphical interface of UI Automator, to capture both a
screenshot and the dynamic VH of an app. The tool allows developers to visually examine each
UI component along with its dynamic attributes.
Figure 2.1: Screenshot and associated VH of the ‘End User License Agreement’ UI within the
Fedex app, captured using the UI Automator Viewer.
Developers build mobile UIs using tools such as Android Studio, the official integrated development environment (IDE) for Android [15]. Within Android Studio, the Layout Editor offers a
drag-and-drop interface to build the UI while also providing a live preview of the UI across various device configurations. UI elements can be added, positioned, resized, or nested inside other
elements using this editor. For more granular control, developers can also access the XML editor
to directly modify the VH of the UI, setting elements’ properties, constraints, and relationships.
The rendering of a mobile application’s UI is a complex, dynamic process often influenced
by various factors such as device specifications and user settings. This process starts once the
application’s UI is launched. The framework typically parses the VH layout files in a hierarchical,
top-down manner, starting from the root node. During this rendering process, the framework
aims to adhere to all defined properties and constraints for each view. However, constraints,
such as varying device screen sizes, resolutions, or activated scaling options, might render this
task infeasible. In such cases, the Android framework employs a layout optimization process. It
performs multiple iterations of adjustments and approximations to ensure the UI is adequately
displayed on the screen. This involves recalculating dimensions, re-evaluating constraints, and,
if necessary, changing the properties of UI elements dynamically.
2.2 Mobile Accessibility
Mobile accessibility aims to ensure mobile applications are accessible to all users, regardless of
their physical or cognitive abilities. The process of designing and developing mobile applications
to be accessible involves ensuring that these applications are accessible to users navigating these
applications by touch as well as for people relying on assistive services and technologies, such
as screen readers and text-to-speech services. Various accessibility guidelines are available to
guide developers on how to ensure that their applications are accessible to users with varying
abilities. One of the most widely recognized accessibility guidelines is the Web Content Accessibility Guidelines (WCAG) [59], which, although primarily created for web content, are equally
relevant to mobile applications [59]. Further initiatives, like W3C’s Mobile Accessibility Task
Force, focus on producing guidelines tailored specifically for mobile applications [81]. Similarly,
tech companies, such as Google and Apple, have introduced their own accessibility guidelines for
Android [11] and iOS [17], respectively. Other guidelines, such as the BBC Mobile Accessibility
Guidelines [19], provide further recommendations on how to ensure mobile accessibility. These
guidelines provide specific recommendations for developing accessible applications. For example, the BBC Mobile Accessibility Guidelines [19] specify that touch targets should be sufficient
in size. Specifically for Android, the guideline specifies that touch targets should have a size of at
least 48×48dp, while for iOS, a minimum size of 44×44 pt is recommended. Another example is
Google’s Android Accessibility Guidelines, which recommend that for text smaller than 18pt or
bold text smaller than 14pt, the color contrast ratio should be at least 4.5:1, while a minimum color
contrast ratio of 3:1 is recommended for all other text. By following these guidelines, developers
can ensure their mobile applications are accessible to people with disabilities.
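The two recommendations above can be expressed as simple checks. The following Python sketch is illustrative: the function names are hypothetical, the contrast computation follows the relative-luminance definition used by WCAG 2.x for sRGB colors, and the thresholds are the ones quoted in the preceding paragraph.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.x definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: RGB, background: RGB) -> float:
    """Return the contrast ratio, from 1:1 (identical) up to 21:1 (black on white)."""
    l1, l2 = relative_luminance(foreground), relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def required_contrast(text_size_pt: float, bold: bool) -> float:
    """4.5:1 for text below 18pt (or bold text below 14pt), 3:1 otherwise."""
    is_small = text_size_pt < 14 if bold else text_size_pt < 18
    return 4.5 if is_small else 3.0


def touch_target_ok(width_dp: float, height_dp: float, minimum_dp: float = 48) -> bool:
    """Check a touch target against a minimum size (48dp on Android)."""
    return width_dp >= minimum_dp and height_dp >= minimum_dp


ratio = contrast_ratio((119, 119, 119), (255, 255, 255))  # gray text on white
print(ratio >= required_contrast(text_size_pt=12, bold=False))
print(touch_target_ok(width_dp=48, height_dp=40))
```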
At a high level, these guidelines are founded on four general accessibility principles with respect to the UI and its content. The first principle, Perceivable, emphasizes that the UI and the
information it presents can be easily perceived and identified by users. The next principle, Operable, emphasizes the importance of straightforward interaction with the UI and its components and
that users, regardless of their abilities, can easily navigate through the app. The third principle,
Understandable, emphasizes the importance of clarity and coherence, ensuring that the UI and its
content are clear and intuitive. The last principle, Robust, emphasizes the importance of consistent and
reliable presentation and interpretation of content across various devices and screen resolutions.
These principles form the basis for accessibility guidelines that offer actionable recommendations concerning UI design, content presentation, user interactions, and alternative input and
output methods. For example, in order to ensure the operability of the UI for touch-based users,
guidelines include the need to have larger touch targets in the UI. To ensure the perceivability
of the UI, especially for those with low vision, guidelines include the need to ensure sufficient
color contrast and offer meaningful captions for images and multimedia elements. By following
these guidelines, developers can create user-friendly applications that are accessible to everyone,
including people with disabilities.
Accessibility issues occur when developers do not adhere to accessibility guidelines. Accessibility issues can manifest in various ways, but in general, they impact one or more of the four
accessibility principles discussed earlier, making it difficult for users with disabilities to interact
with mobile applications and understand their content. Recent studies have examined accessibility issues that impact mobile apps. These studies found that many mobile applications lack the
support for important assistive services, which are essential for disabled users to interact with
and navigate these applications. Such lack of support can manifest as unintuitive navigation,
missing elements, or missing content labels [8, 65]. Small touch targets are another common issue in mobile applications [8, 65, 6] that can make it difficult for users with motor disabilities or
older users to interact with these applications. Similarly, low color contrasts or poor text legibility
are issues that primarily impact those with low vision or color vision disabilities. Additionally,
some applications fail to support alternative input methods, such as Switch Access [73], which are
important for users with physical disabilities who cannot interact with touchscreens using standard gestures. Various approaches exist to help developers identify accessibility issues in their
mobile applications. Developers can either manually explore the application (e.g., using assistive
services) and identify accessibility violations or employ specialized accessibility detectors. Tools
from both industry and academia have been developed to assist developers in finding accessibility issues [2, 65, 71, 49, 9, 5]. One popular example is the Google Accessibility Scanner (GAS) [2],
which can automatically identify a wide range of issues, from small touch targets and inadequate
color contrasts to missing content labels and unintuitive navigation. Examples of accessibility
issues detected by the GAS can be seen in Figure 2.2.
Figure 2.2: Example of using Google Accessibility Scanner (GAS) to identify accessibility issues:
The left screenshot displays issues as identified by GAS, and the right screenshot provides detailed
information upon selecting one of these identified issues.
Chapter 3
A Framework for Repairing Layout Accessibility Issues
This chapter introduces the framework for repairing layout accessibility issues in mobile apps.
The framework represents a distillation of my experience in fixing layout accessibility issues and
working on mobile UI testing and design. The framework outlines a systematic process for
repairing layout accessibility issues. It breaks down the repair tasks into separate, identifiable
components and shows how each of these components plays a critical role in the repair process.
Collectively, these components contribute to the effective repair of layout accessibility issues.
The framework also serves as a roadmap for future efforts in this domain. It provides instantiation points and extensible functions at the key stages of the repair process that can be adapted
or extended as needed. This allows the framework to accommodate a range of scenarios and
challenges that may be encountered when targeting specific issues.
The repair framework, as shown in Figure 3.1, consists of two main phases, and it takes two
inputs to initiate the repair process. The first input is the information about the UI that needs to
be repaired. This information includes a copy of the UI’s VH, which, as described in Section 2.1,
contains detailed information about the UI and its elements, and a screenshot of the visual rendering of the UI. This information can be obtained via one of the available tools that are part of
the mobile SDKs [70, 77] or any other tool that can generate this information. The second input is
a list of the detected layout accessibility issues within the UI, referred to as the detection report.
This report can be obtained through one of the available detectors (e.g., [71]) or supplied manually by the developer. The first phase of the framework, which I refer to as the analysis phase,
focuses on generating the necessary information needed to start the repair process. In particular,
it identifies what needs to be changed within the UI in order to repair the detected issues.
Figure 3.1: Overview of the repair framework
In the second phase of the framework, which I refer to as the UI generation phase, the framework employs a multi-objective genetic algorithm approach to generate and evaluate candidate
repaired UIs in search of the best possible repair to resolve the issues. In this phase, the framework utilizes the information generated from the analysis phase to create an initial set of repairs.
These repairs serve as the initial population for the search process. The framework then evaluates and ranks these generated repaired UIs using a fitness function. The framework continues
to generate and evaluate candidate repairs until it reaches a termination condition that tells the
framework to stop the search process. The best repair found at this point is selected as the final
repair for the UI. In the following subsections, I give more details on each of these phases. I will
elaborate on these aspects in detail when discussing the framework instantiations in Chapter 4
and Chapter 5.
3.1 Analysis Phase
The goal of this phase is to identify what may need to be changed in the UI to repair the detected
layout accessibility issues. To achieve this goal, the framework first constructs a model of the
UI that captures important information about the UI and the relationships between its elements.
The framework then utilizes the constructed model and the list of detected layout accessibility
issues to perform a dependency analysis, which identifies a set of elements and properties that
may need to be modified to fix these detected issues. In my framework, this set is referred to as
the FixSet. This set is used to create the initial set of candidate repaired UIs.
3.1.1 Modeling the UI
The first step in the Analysis Phase is to construct a model of the UI that captures relationships
that control or impact aspects of the rendering of elements on the screen. To accomplish this, the
framework analyzes the UI’s VH. The VH contains detailed information about the hierarchical
relationships between elements (such as parent-child or sibling relationships) and the properties
defined for the elements in the UI. This information enables the framework to infer and extract
relationships between elements in the UI. The UI model is represented by a graph where nodes
represent elements in the UI, and edges represent the identified relationships between these elements. For example, Figure 3.2 shows a simplified UI model. In this model, the nodes Button1
and LinearLayout are linked by an edge annotated with height. This annotation indicates the
existence of a height relationship between the two nodes.
Figure 3.2: A simple example of a UI model
[Figure content: nodes Button1, Button2, LinearLayout, and RelativeLayout; edge labels include "Contained by", "To the left of", "To the left of, bottom aligned", "Height, 0.8", and "Height, 0.65".]
The main challenge in building the model lies in identifying the relationships that should be
included in the model. This requires an understanding of how mobile UIs work, how their rendering is impacted by the targeted layout accessibility issues, and the potential repair strategies
for these issues. For example, if the focus is on repairing SBIIs, then relationships involving the
size of elements become important as they can help determine how to enlarge the size of small
touch targets in the UI. Once the essential relationships are identified, the next step is to build a
UI model that accurately models these relationships. This model should enable the framework to
reason about these relationships and identify transitive relationships and dependencies between
elements.
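To make the idea of a graph-based UI model concrete, the following Python sketch builds a small labeled graph from a flattened view hierarchy. It is a sketch under stated assumptions and not the framework's actual model: the input format, the function name, and the rule that every parent-child pair yields both a "contained_by" and a "height" edge are hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical flattened view hierarchy: element -> parent (None for the root).
VIEW_HIERARCHY: Dict[str, Optional[str]] = {
    "RelativeLayout": None,
    "LinearLayout": "RelativeLayout",
    "Button1": "LinearLayout",
    "Button2": "LinearLayout",
}

# The model maps each node to its outgoing edges: (neighbor, relation label).
UIModel = Dict[str, List[Tuple[str, str]]]


def build_ui_model(vh: Dict[str, Optional[str]]) -> UIModel:
    """Build a labeled graph whose nodes are UI elements."""
    model: UIModel = {element: [] for element in vh}
    for element, parent in vh.items():
        if parent is None:
            continue
        # Logical relationship taken directly from the hierarchy.
        model[element].append((parent, "contained_by"))
        # Assumed size relationship: enlarging a child may require
        # enlarging the element that visually contains it.
        model[element].append((parent, "height"))
    return model


model = build_ui_model(VIEW_HIERARCHY)
for node, edges in model.items():
    print(node, edges)
```

Keeping the relation label on each edge is what later allows an analysis to follow only the relationships relevant to a given issue type, for example only size-related edges when repairing SBIIs.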
3.1.2 UI Dependency Analysis
The next step in the Analysis Phase is to perform a UI dependency analysis. The goal of the
dependency analysis is to identify, given a layout accessibility issue within a UI and the UI model,
the elements and properties within the UI that need to be modified to repair the issue (i.e., the
FixSet). The dependency analysis takes as an input the UI model and the set of detected layout
accessibility issues and outputs a set of elements and properties that may need to be modified to
repair the detected issues. The specification of the dependency analysis will depend on the type
of relationships modeled in the UI model, the type of layout accessibility issues targeted, and how
they can be repaired. A challenge in specifying the dependency analysis lies in ensuring that the
FixSet is both minimal and safe to enable efficient repair of the layout accessibility issues within
a reasonable time. Trivial solutions, such as including all of a UI’s elements and properties in
the FixSet, while safe, are not minimal and could significantly increase the runtime, making the
repair process inefficient. On the other hand, including only elements directly impacted by the
issues may not be safe since it may not lead to a successful repair, as fixing the issues may require
modifications to related elements.
An advantage of modeling UI relationships as a graph is that it enables efficient and comprehensive analysis of these relationships, allowing the framework to traverse this complex set of
UI elements and their relationships. This is typically achieved by computing a transitive closure
of the model, starting from the element impacted by the identified issue. For example, referring
back to Figure 3.2, if Button1 represents an element having an SBII due to its inadequate height,
the dependency analysis can utilize the UI model to identify elements, such as LinearLayout and
RelativeLayout, that may also need to be changed to repair the issue caused by Button1. The properties that may need to be changed for each identified element can be determined by a predefined
mapping between edge types and properties. In the previous example with Figure 3.2, the edge
representing a height relationship between Button1 and LinearLayout suggests that properties
such as layout_height, minHeight, or maxHeight might need to be modified to resolve the issue
affecting Button1.
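The transitive-closure step described above can be sketched as a breadth-first traversal. In the following Python example, the graph, the function name `compute_fix_set`, and the width entry of the edge-to-property mapping are assumptions made for illustration; only the height properties (layout_height, minHeight, maxHeight) come from the example in the text.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# Assumed mapping between edge types and the properties they implicate.
EDGE_TO_PROPERTIES: Dict[str, Tuple[str, ...]] = {
    "height": ("layout_height", "minHeight", "maxHeight"),
    "width": ("layout_width", "minWidth", "maxWidth"),
}

# Hypothetical UI model: node -> list of (neighbor, relation).
UI_MODEL: Dict[str, List[Tuple[str, str]]] = {
    "Button1": [("LinearLayout", "height")],
    "Button2": [("LinearLayout", "height")],
    "LinearLayout": [("RelativeLayout", "height")],
    "RelativeLayout": [],
}


def compute_fix_set(model: Dict[str, List[Tuple[str, str]]],
                    issue_element: str,
                    issue_type: str) -> Set[Tuple[str, str]]:
    """Collect (element, property) pairs reachable over edges of one type."""
    properties = EDGE_TO_PROPERTIES[issue_type]
    fix_set = {(issue_element, prop) for prop in properties}
    visited = {issue_element}
    queue = deque([issue_element])
    while queue:
        current = queue.popleft()
        for neighbor, relation in model.get(current, []):
            # Follow only edges relevant to the issue, and visit nodes once.
            if relation != issue_type or neighbor in visited:
                continue
            visited.add(neighbor)
            queue.append(neighbor)
            fix_set.update((neighbor, prop) for prop in properties)
    return fix_set


fix_set = compute_fix_set(UI_MODEL, "Button1", "height")
print(sorted(fix_set))
```

Starting from Button1, the traversal reaches LinearLayout and RelativeLayout but not Button2, which keeps the resulting set smaller than the trivial choice of including every element in the UI.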
3.2 UI Generation Phase
The goal of this phase is to identify the best possible repair for the identified layout accessibility
issues and then apply that repair to generate a repaired version of the UI. To accomplish that,
the framework first utilizes the output of the Analysis Phase (i.e., the FixSet) to create an initial
set of candidate repaired UIs. The framework then employs a multi-objective genetic algorithm
approach to refine and improve these candidate repairs. In subsequent iterations, the framework
continues to generate and evaluate candidate repaired UIs until a termination condition is met.
At this point, the best repair found so far is selected as the final repair. The following subsections
elaborate on each step of this process in detail.
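When a search considers several objectives at once, one common way to compare candidates is Pareto dominance. The following Python sketch illustrates that general idea only; it is not taken from the framework, and the two objectives (number of remaining issues and a layout-distortion score, both to be minimized) as well as the candidate names and scores are hypothetical.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """True if `a` is no worse than `b` everywhere and strictly better somewhere.

    All objectives are assumed to be minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(candidates: Dict[str, Scores]) -> List[str]:
    """Return the candidates that no other candidate dominates."""
    return [name for name, score in candidates.items()
            if not any(dominates(other, score)
                       for other_name, other in candidates.items()
                       if other_name != name)]


# Hypothetical scores: (remaining issues, layout distortion).
candidates = {
    "repair_a": (0, 0.30),
    "repair_b": (1, 0.05),
    "repair_c": (1, 0.40),
    "repair_d": (0, 0.10),
}
front = pareto_front(candidates)
print(front)
```

Here repair_d fixes every issue with little distortion, while repair_b leaves one issue but distorts the layout even less; neither dominates the other, which is the kind of trade-off a multi-objective search has to weigh.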
3.2.1 Generating Candidate Repaired UIs
The goal of this step is to generate a set of candidate repaired UIs, also referred to as a population.
A population is a collection of chromosomes in a particular iteration of the search. A chromosome
consists of a set of genes. A gene represents a change and takes the form of ⟨i, p,a⟩. i is a UI
element to be changed, p represents the target of the change, and a is the change value applied to
p. For example, returning to the example in Figure 3.2, a change that targets increasing the height
of Button1 to fix its height SBII could be represented as ⟨Button1,layout_height,1.20⟩. Here, i is
the ID of the UI element to be changed (Button1), p is the target property to be changed (layout_height),
and a is the change value applied to p, indicating a 20% increase in the current height of
Button1.
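The gene representation can be illustrated with a short Python sketch. This is only an illustration and not the framework's implementation: the names Gene and apply_gene and the dictionary-based UI are assumptions introduced here, and the change value is treated as a multiplier, following the 1.20 example above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Gene:
    """One change <i, p, a> (hypothetical class name)."""
    element_id: str  # i: the UI element to be changed
    prop: str        # p: the target property
    value: float     # a: the change value, assumed here to be a multiplier


# A chromosome is a set of genes; a population is a collection of chromosomes.
Chromosome = List[Gene]
Population = List[Chromosome]


def apply_gene(ui: Dict[str, Dict[str, float]], gene: Gene) -> Dict[str, Dict[str, float]]:
    """Return a copy of the UI with the gene applied as a multiplicative change."""
    repaired = {eid: dict(props) for eid, props in ui.items()}
    repaired[gene.element_id][gene.prop] *= gene.value
    return repaired


# Assumed starting height of 40dp for Button1 (illustrative value only).
ui = {"Button1": {"layout_height": 40.0}}
gene = Gene("Button1", "layout_height", 1.20)
repaired_ui = apply_gene(ui, gene)
```

Returning a modified copy keeps the original UI intact, so many candidate repairs can be derived from the same starting point.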
At the beginning of the search process, an initial population is created based on the output
of the UI dependency analysis. The initial population is typically created by iterating through
the FixSet and generating a change for each element and property in the set. There are several
strategies for selecting the initial values for these changes. For example, one could use values
derived from accessibility guidelines, such as the recommended minimum height and width of
48dp for touch target elements in the UI. Alternatively, an initial change value could be selected
based on a percentage increase or decrease of an element’s existing value, like proposing a 20%
increase to the height of an element currently set at 60dp.
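The two seeding strategies mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the dictionary-based UI, and the choice to store the change value as an absolute target in dp are all introduced here for illustration; only the 48dp minimum and the 20% increase come from the text.

```python
MIN_TOUCH_TARGET_DP = 48.0  # recommended minimum height and width from the text


def guideline_value(current_dp):
    """Guideline-based seed: raise the value to at least the recommended minimum."""
    return max(current_dp, MIN_TOUCH_TARGET_DP)


def percentage_value(current_dp, factor=1.20):
    """Percentage-based seed: scale the existing value (20% increase by default)."""
    return current_dp * factor


def initial_population(fix_set, ui):
    """Create one single-gene candidate per (element, property) pair and strategy."""
    population = []
    for element_id, prop in fix_set:
        current = ui[element_id][prop]
        for strategy in (guideline_value, percentage_value):
            population.append([(element_id, prop, strategy(current))])
    return population


# Hypothetical FixSet and property values.
ui = {"Button1": {"layout_height": 40.0}, "LinearLayout": {"layout_height": 60.0}}
fix_set = [("Button1", "layout_height"), ("LinearLayout", "layout_height")]
population = initial_population(fix_set, ui)
```

With these inputs, the percentage strategy proposes 72dp for the element currently set at 60dp, matching the example in the text.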
After the creation of the initial population, the framework continues to create a generation
of candidate repaired UIs based on the generation of the prior iteration. This process involves
utilizing the typical genetic search operators, specifically selection, crossover, and mutation. Selection is performed by identifying the best repairs from the current population and including
them in the next generation for continued evaluation against new repairs. Crossover merges
changes from multiple existing repairs to generate new candidate repairs, or “offspring”. This
allows effective changes from the parents to be passed on to their offspring. Lastly, mutation
introduces random changes to enhance the diversity within the population. This diversity is important for enabling efficient exploration of the search space and for avoiding early convergence
on suboptimal solutions.
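The three operators can be sketched in Python as shown below. The sketch makes several assumptions that are not taken from the framework: a chromosome is stored as a dictionary from (element, property) to a change value, a lower fitness score is better, crossover is uniform, and the mutation rate and spread are arbitrary placeholder values.

```python
import random


def select(population, fitness, k):
    """Selection: keep the k candidates with the lowest (best) fitness score."""
    return sorted(population, key=fitness)[:k]


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each change is inherited from a parent that defines it."""
    child = {}
    for key in sorted(set(parent_a) | set(parent_b)):
        donors = [parent[key] for parent in (parent_a, parent_b) if key in parent]
        child[key] = rng.choice(donors)
    return child


def mutate(chromosome, rng, rate=0.2, spread=0.1):
    """Mutation: randomly perturb some change values by up to +/- spread."""
    mutated = dict(chromosome)
    for key in sorted(chromosome):
        if rng.random() < rate:
            mutated[key] *= 1.0 + rng.uniform(-spread, spread)
    return mutated


rng = random.Random(0)
parent_a = {("Button1", "layout_height"): 48.0, ("LinearLayout", "layout_height"): 60.0}
parent_b = {("Button1", "layout_height"): 52.0, ("Button1", "minHeight"): 48.0}
child = mutate(crossover(parent_a, parent_b, rng), rng)
```

The child contains every change defined by either parent, which is one simple way to let effective changes from both parents reach the offspring.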
3.2.2 Rendering Candidate Repaired UIs
After generating the candidate repairs, the next step is to render them on the target mobile device’s screen for runtime evaluation. Ensuring accurate rendering is critical for evaluating the
effectiveness of the repairs, as many UI elements’ visual properties can often only be fully assessed when they are rendered at runtime. This is particularly relevant for issues like USAFs,
where scaling assistive services must be activated during the rendering process to ensure accuracy. To perform this rendering, the framework iterates through the set of changes specified in
each candidate repair, and for each change, it modifies the corresponding UI element’s attributes
in the app’s layout files.
3.2.3 Evaluating and Ranking Candidate Repaired UIs
The next step in the process is to evaluate the generated repairs. The framework employs a fitness
function to perform this evaluation. Defining the fitness function is a critical step in instantiating
the framework. The fitness function enables the systematic evaluation and ranking of the different repairs. These rankings influence the selection of repairs used in the genetic algorithm’s
search operators, therefore guiding the generation of future candidate repaired UIs. This function
takes a generated repaired UI as input and outputs a score, indicating how effective the candidate repair is in satisfying certain objectives. These objectives represent desirable qualities or
characteristics that should exist in the repaired UI.
Objectives in the fitness function can be broadly classified into two main categories. First,
there are accessibility-related objectives. These are objectives specific to the targeted layout accessibility issues and are derived from the definitions of the issues. For example, minimizing the
number of missing or colliding elements may be examples of objectives that can be used in a fitness function of repairs that target USAFs. The details of how these objectives can be defined are
often informed by the accessibility guidelines describing the layout accessibility issues. Second,
there are aesthetic-related objectives. These objectives represent certain elements’ relationships
or design aspects that, when maintained in the repaired UI, could reduce the visual distortion
of the UI. These objectives do not directly target layout accessibility issues but play a role in
ensuring the framework penalizes repairs that distort or change the UI.
My examination of how layout accessibility issues impact the UI helped me identify two
aesthetic-related objectives that I found to be effective in helping guide the search toward minimizing layout distortions and are applicable to the different types of layout accessibility issues.
The first objective is to minimize the number of changes to the positional relationships among
the elements in the UI. The goal of this objective is to guide the search towards solutions that
have fewer changes to the positional relationships among elements in the UI. Positional relationships include directional relationships, such as one element being above the other, and alignment
relationships, such as two elements being bottom-aligned. These relationships play an important
role in helping to create a visual hierarchy and establish an intuitive layout for users. For example, related buttons in a form, such as "Submit" and "Cancel," are often placed next to each other
to indicate their connection, making it easier for users to find and interact with them. Similarly,
menu items are typically aligned either horizontally or vertically to create a visually consistent
structure. Altering these positional relationships can disrupt the UI’s visual hierarchy and intuitive layout, which users depend on for understanding and navigation. The second objective is
to minimize the amount of change introduced to the UI. This objective is included based on the
observation that when multiple repairs rank similarly in terms of addressing layout accessibility
issues, repairs that introduce fewer changes to the elements in the UI are closer to the original
design and the UI developer’s original aesthetic intentions compared to repairs that introduce
additional changes to these elements.
A challenge in defining the fitness function lies in quantifying its objectives and balancing
their weights to reflect their relative importance. Generally, accessibility-related objectives may
need to be prioritized over aesthetic-related objectives due to their direct role in addressing the
layout accessibility issue. Aesthetic-related objectives may be considered secondary and consequently assigned lower weights. The quantification of the objectives depends on the specific UI
quality from which the objective is derived and how that quality can be converted into a numerical value. For a more nuanced evaluation of the repairs, it is preferable to quantify objectives on
a gradient rather than using a step function. For example, using a step function to evaluate the
“sufficient touch target size” objective by simply counting the number of SBIIs fails to distinguish
between two repairs that both have the same number of SBIIs. However, one repair may have
touch targets that are, on average, larger in size, making it a more accessible and, therefore, a
preferable solution. A gradient function that also accounts for the varying sizes of touch targets
provides a more nuanced evaluation, enabling better differentiation between repairs and guiding
the search more effectively toward finding the best repair.
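The difference between the two kinds of scoring can be seen in a small Python sketch. The scoring formulas and the sample sizes are assumptions chosen for illustration; the 48dp threshold is the recommended minimum mentioned earlier, and lower scores are assumed to be better.

```python
MIN_SIZE_DP = 48.0  # recommended minimum touch target size


def step_score(sizes):
    """Step function: count the touch targets that are below the minimum size."""
    return sum(1 for size in sizes if size < MIN_SIZE_DP)


def gradient_score(sizes):
    """Gradient function: sum the relative shortfall of each undersized target."""
    return sum((MIN_SIZE_DP - size) / MIN_SIZE_DP for size in sizes if size < MIN_SIZE_DP)


# Two hypothetical repairs, each leaving two undersized touch targets.
repair_a = [30.0, 36.0, 48.0]
repair_b = [44.0, 46.0, 48.0]

step_a, step_b = step_score(repair_a), step_score(repair_b)
grad_a, grad_b = gradient_score(repair_a), gradient_score(repair_b)
```

Both repairs receive the same step score, while the gradient score is lower for repair_b, whose remaining undersized targets are much closer to the minimum.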
3.2.4 Terminating the Search and Generating the Final Repair
The process of generating and evaluating candidate repaired UIs continues until a termination
condition is met. These conditions are important for ensuring the efficiency of the search and
preventing it from running indefinitely. In my framework, I primarily consider two termination
conditions:
(1) Reaching a fixed point: The search process terminates if there is a lack of improvement in
the quality of the repairs over a given number of generations. This condition indicates that the
search has likely reached a point where new iterations are unable to improve the quality of the
repair. The exact criteria for what indicates a lack of improvement can be customized. Typically, a
lack of improvement can be defined as no change in the fitness score. However, it is also possible
to establish a threshold where improvements below a limit are considered negligible and treated
as no improvements.
(2) Reaching a maximum number of iterations: The search can also terminate when it reaches a
predefined maximum number of generations. This condition ensures that the search terminates
and does not run indefinitely in situations where the first condition cannot be satisfied. The
customization of this condition involves setting an upper limit on the number of generations to
be explored.
The precise specification of these conditions and associated numerical values depends on
factors such as the complexity of the targeted issue, the available computational resources, and
the required quality of the repairs. However, in general, the specification of the termination
conditions should be done in a way that allows for sufficient exploration of the solution space.
Once a termination condition is met, the search process concludes and the solution with the best
fitness score within the final population is selected as the final repair.
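The two termination conditions can be combined in a single loop, as in the following Python sketch. The function and parameter names (run_search, patience, epsilon) are hypothetical, lower scores are assumed to be better, and the toy population of raw scores exists only to make the example executable.

```python
def run_search(evolve, best_score, population,
               max_generations=100, patience=10, epsilon=0.0):
    """Iterate until a fixed point or the maximum number of generations is reached."""
    best = best_score(population)
    stale = 0  # consecutive generations without a non-negligible improvement
    for generation in range(1, max_generations + 1):
        population = evolve(population)
        score = best_score(population)
        if best - score > epsilon:  # improvement larger than the threshold
            best, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                return population, generation, "fixed point"
    return population, max_generations, "max generations"


def toy_evolve(population):
    """Toy step: every score drops by 1 per generation until it bottoms out at 3."""
    return [max(score - 1.0, 3.0) for score in population]


final_population, generations, reason = run_search(toy_evolve, min, [10.0], patience=3)
```

Setting epsilon above zero corresponds to treating improvements below a limit as negligible.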
Chapter 4
Repairing UI Scaling Accessibility Failure (USAF)
The design of a UI is an important software quality attribute for mobile apps, with a focus on
maximizing usability and user experience. Older adults and many people with visual disabilities
often have difficulties reading the text displayed, leading to challenges in recognizing and understanding the content of the UI [71, 9, 65]. Scaling adjustments, which include increasing text
size and display scaling, are the most commonly used accessibility features among people who
use accessibility options in mobile apps [1]. However, studies have shown that when increasing
the default scaling settings to make apps accessible, the layout of many apps that do not follow
accessibility guidelines [11, 59, 19] can become distorted, often leading to lost content or compromised functionality, which defeats the goal of accessibility. In fact, 17% of popular Android
apps on Google Play and F-droid contain UI display issues involving scaling [49]. In this chapter,
I refer to any layout inconsistencies that result from the use of scaling assistive services as UI
Scaling Accessibility Failures (USAFs).
The goal of my work here is to automatically repair USAFs, while maintaining, as much as
possible, the aesthetics of the UI. A naive approach to resolve these issues is to decrease the text
size or significantly reduce the content size of the UIs. This solution, however, directly contradicts the purpose of using these scaling services. An effective repair should fix USAFs without
negatively impacting the scaling of the UI, but this is complicated by several challenges: First,
Android UIs are controlled by complex relationships among elements, properties, and rendering
constraints that impact various aspects of the UI. Using scaling services to scale the UI further
complicates these relationships and makes it difficult to predict what changes are needed to fix
the issues. Second, introducing a change to the UI often leads to a chain
of cascading changes in other parts of the UI and can create additional layout problems. This
can become even more complex when multiple USAFs in the UI need to be addressed. Lastly,
examining the app’s final rendering at runtime is essential to assess the impact of the introduced
changes on the UI, as analyzing the app’s static files cannot reveal the actual UI layout or behavior
at runtime. Collectively, these challenges make repairing USAFs a demanding task.
Existing approaches cannot help in repairing USAFs. AccessiText [9] and dVermin [71] can
detect text scaling issues, such as USAFs, but not repair them. Android Magnification service [84]
enlarges part of the screen to make it easier for people to interact with the UI. Touch Guard
[90] helps to enlarge the touch area to disambiguate the bounds between multiple targets. These
existing tools serve as alternative assistive technologies when users have trouble perceiving or
operating an app’s UI, but they cannot be used to repair USAFs. Repairing USAFs remains largely
a manual process that can be time-consuming and error-prone, especially for developers not
familiar with these assistive services.
In this chapter, I describe how I instantiate the repair framework presented in Chapter 3
to repair USAFs. In this instantiation, which I refer to as ScaleFix, I employ different insights
from the problem domain to build the UI model, specify the UI dependency analysis, and define
heuristics to evaluate the generated repairs and guide the search toward finding the best repair
to fix USAFs without negatively impacting the UI. To ensure completeness, I present the full
approach in this chapter. While some components of the general framework are reiterated for
clarity, the focus here is on how these components are specifically tailored and instantiated to
address USAFs.
4.1 Background
Android assistive services enable users with disabilities to better access mobile apps. For those
with mild visual or motor disabilities who may struggle to clearly see text or cannot rely on
large touch targets, scaling services provide a means to enlarge or increase the size of the screen
content. Android scaling services adjust the display configuration to accommodate users’ needs,
offering several options: Font Scaling (the Font-Only configuration, or TO), which increases font
size; Display Scaling (the Display-Only configuration, or DO), which focuses on enlarging content size; and a combination of both Font and Display Scaling (the Font-Display configuration, or
FD) for maximum scaling. Users can choose their preferred scaling options to enhance the presentation of the content on their devices. When a UI is scaled up, the way elements are organized
across the screen can become distorted. This can cause UI elements to collide, disappear, or have
their text cut off. In this chapter, I use the term USAFs to refer to any layout distortion in a UI that
occurs after activating scaling assistive services. USAFs can manifest in different ways. Here, I
show examples of how these USAFs manifest in real-world apps.
Collision Scaling issues: These issues occur when scaling the UI causes visible elements to
collide and overlap. When two visible elements collide in the UI, it can make it difficult for users
to perceive the UI content or interact with it. Figure 4.1a shows the Cnet app, where two text
elements overlap after scaling is applied, making it difficult for users to read or understand the
text. The figure also shows a similar but more severe example that occurs in the Fedex app. The
toggle button and the information button collide after scaling. As a result, the user would not
be able to distinguish between these two buttons. Also, users would not be able to activate one
button without inadvertently activating the other.
Text Scaling Issues: When UI content size increases, elements with text that are not configured to adapt to UI scaling may have their text truncated or cut off. This can negatively impact
the user experience and accessibility, as crucial information could become unreadable or entirely
lost. For example, Figure 4.1b illustrates the Infini app, which allows users to manage daily activities through created lists. In the app’s popup dialog, five lines of instructional text designed to
guide the user on how to use the app are cut off from the bottom of the UI due to UI scaling.
Missing Scaling Issues: When the scaling of the UI causes elements to be rendered outside
of the viewport’s boundaries, their functionality becomes unavailable. As shown in Figure 4.2a,
in the 8Vim app, links to the app’s Google Play Store page and other accounts become inaccessible because they are out of view for users. Similarly, in the Weaver app, as shown in Figure 4.2b,
the login button completely disappears in the scaled version. In these examples, USAFs can prevent users who rely on scaling assistive services from logging in or accessing the app’s related
functionalities.
4.2 Approach
The goal of this approach is to automatically repair USAFs. Developing an approach that can
identify a single correct solution is challenging because of the complex relationships between UI
elements and properties and the various rendering constraints imposed on the UI, which makes
it difficult to predict what changes are needed to fix the issues. Moreover, when one aspect of the
UI is changed, it often leads to a chain of cascading changes in other parts of the UI. All of this
makes it hard to identify a single repair that correctly fixes the issues. To address these challenges,
I leverage the repair framework I described in Chapter 3. This framework is well-suited for the
domain as it employs a multi-objective genetic search algorithm, which works well for balancing
trade-offs between repairing different USAF types and maintaining the original UI design. The
framework also enables the use of insights into the problem domain to build UI models, which
can be analyzed to identify a subset of elements and properties that are most likely to resolve the
issues while minimizing the runtime needed to find a successful repair. Section 4.2.1 describes
the instantiation of the framework’s analysis phase. Section 4.2.2 describes the instantiation of
the framework’s UI generation phase, including the multi-objective genetic search approach.
Figure 4.1: Examples of collision and text cutoff USAFs caused by the use of scaling assistive services. (a) Collision scaling issues (Cnet and Fedex apps). (b) Text cutoff (Infini app).
Figure 4.2: Examples of missing USAFs caused by the use of scaling assistive services. (a) Elements missing from the 8Vim app. (b) Element missing from the Weaver app.
The approach takes two inputs. The first input is information about the UI that needs to be
repaired. The UI information includes a copy of the UI’s VH, which contains information about
the UI and its elements, and a screenshot of the visual rendering of the UI. This information can
be obtained via available tools that are part of the mobile SDKs [70, 77] or any other tool that can
generate this information. The second input is the set of detected USAFs in the UI under repair,
referred to as the detection report. This report can be obtained by one of the available detectors
(e.g., [71]) or supplied manually by the developer. In my approach, I refer to UI elements reported
to have USAFs in this detection report as the problematic elements. The output of my approach
is a repaired UI.
4.2.1 The Analysis Phase
The goal of this phase is to identify the initial population that will be used to start the search
process. This includes (1) building a model of the UI that captures the important UI relationships
and (2) performing the UI dependency analysis to identify the set of elements and properties that
will be targeted with change (i.e., the FixSet) and then suggest initial change values for this set.
I developed general guidelines for determining, given a USAF, what needs to be changed in
the UI, and therefore, how to build the UI model and specify the dependency analysis, based on an
examination of a diverse set of UIs from more than 100 apps across different categories, some with
USAFs and others without. Of this set, 16 were also part of the evaluation. The development of
the guidelines also drew from my years of experience in working with and building Android UIs.
This examination looked at how UIs visually changed after activating scaling services, including
changes in elements’ properties, such as height, width, text, and positions. I also investigated
how USAFs could be repaired by manually introducing different changes to the examined UIs and
identifying what elements and properties needed to change to fix the issues. For each element I
changed, I examined its relationship with the elements impacted by the problem and referenced
the Android developers’ guidelines [27] to understand the possible ways such a relationship could
be defined in a UI. It is worth noting that my approach demonstrated comparable performance
across both the set of apps that were part of my initial analysis and those that were not. Based
on this analysis, I identified a superset of elements and properties that had to be adjusted to
address those types of issues. My observations revealed that when assistive services scale the UI,
they trigger a series of changes that impact the height, width, text properties, and location of the
elements in the UI (referred to as the placement characteristics). Changes to these aspects can
result in the manifestation of USAFs, and repairing them requires modifications to the placement
characteristics of elements exhibiting the issues (i.e., the problematic elements) and other related
elements in the UI.
However, the placement characteristics of an element are determined based on complex relationships between the properties defined for that element and its relationships with other elements in the UI. Understanding these relationships, which I refer to as placement relationships,
is a critical first step in creating the FixSet. Placement relationships indicate how other elements
in the UI control or impact an element’s height, width, text, or location. In the following subsections, I first describe how my approach builds a model of the UI to capture these relationships
and then show how the approach performs the UI dependency analysis to identify the FixSet.
4.2.1.1 Building the UI Model
The next step after identifying the placement relationships is to build a model of the UI that captures these relationships. My approach models the placement relationships using a graph-based
abstraction of the UI, called the UI Information Graph (UIG). The UIG is formally represented
as a graph ⟨V,E⟩, where V represents the set of visible elements in the UI and E is a set of directed edges that represent the relationships between elements in the UI. E is represented as a
set of tuples of the form ⟨t, p⟩. t ∈ T, where T is the set of four placement relationship types:
{dimension, weight, space, constraint}. p ∈ P, where P is the set of properties for a UI element
(e.g., height). To build the graph, my approach first extracts the set of visible elements in the UI
from the VH that is provided as an input, including their properties and their hierarchical relationships, such as parent-child and sibling relationships. My approach adds these elements as
nodes in the UIG. The approach then iteratively processes each node and creates the placement
edges based on the conditions for each relationship type. In the following paragraphs, I discuss
the details of each of the placement relationship types.
Figure 4.3: An example of a Dimension relationship (figure labels: text, text, parent; 35 dp, 40 dp)
Figure 4.4: An example of a Weight relationship (figure labels: parent, textview, btn1, btn2; weight = 2, weight = 1, weight = 1)
A dimension-type placement relationship exists between an element and one of its ancestors
(i.e., containing elements) when the ancestor does not automatically change to accommodate
changes in the element’s dimensions. For example, in Figure 4.3, attempting to fix the cutoff
issues by increasing the dimensions of the two textviews will not be successful unless the ancestor dimension (40dp) is increased to allow the children’s dimensions to change. Modeling these
relationships is important to identify cases where ancestors must be included in the initial population and subsequently targeted with changes to fix USAFs. Equation (4.1) specifies the two cases
where an ancestor (v2) does not accommodate changes in its child element (v1). In the equation,
match_parent and wrap_content are represented as mp and wc, respectively. If the conditions
specified in Equation (4.1) are met, an edge of this type is created and added to the graph.
D = {(v1, v2) | v1, v2 ∈ V ∧ v2 ∈ ancestors(v1) ∧ (dim(v2) > 0 ∨ (dim(v1) = mp ∧ dim(v2) = wc))} (4.1)
Figure 4.5: An example of a Space relationship (figure labels: btn, text, content, margin, padding)
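The dimension-type condition can be illustrated with a short Python sketch. It reads Equation (4.1) as "v2 is an ancestor of v1, and either v2 has a fixed dimension or v1 is match_parent while v2 is wrap_content". The function name, the dictionary inputs, and the element names are assumptions made here. The values -1 and -2 are Android's integer constants for match_parent and wrap_content, which makes dim(v2) > 0 a test for a fixed size.

```python
MP, WC = -1, -2  # Android's LayoutParams constants for match_parent and wrap_content


def dimension_edges(parent_of, dim):
    """Return the set D of (v1, v2) pairs that satisfy the dimension-type condition."""
    edges = set()
    for v1 in dim:
        v2 = parent_of.get(v1)
        while v2 is not None:  # walk up through every ancestor of v1
            fixed_ancestor = dim[v2] > 0
            child_fills_wrapping_ancestor = dim[v1] == MP and dim[v2] == WC
            if fixed_ancestor or child_fills_wrapping_ancestor:
                edges.add((v1, v2))
            v2 = parent_of.get(v2)
    return edges


# Hypothetical hierarchy: two 35dp text elements inside a fixed 40dp parent.
parent_of = {"text1": "parent", "text2": "parent", "parent": "root"}
dim = {"text1": 35, "text2": 35, "parent": 40, "root": MP}
D = dimension_edges(parent_of, dim)
```

Only the fixed-size parent receives edges here, since the root is set to match_parent and therefore does not satisfy either case.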
A weight-type placement relationship exists between elements that share a sibling relationship
in the UI with a weight property defined for them. The weight property indicates how much of
the parent’s available space each child can occupy. Developers use weights to specify the size
of elements in the UI in proportion to each other. Modeling these relationships is important
to identify elements with weights that may need to be included in the initial population and
subsequently changed to fix USAFs. In Figure 4.4, for example, my approach may need to adjust
the weights of the three siblings to fix the collision for “btn2”. The approach creates the edges of
this type in the UIG based on the conditions specified in Equation (4.2). In particular, my approach
checks if a node vi in the UIG has siblings with defined weights and a height or width of 0. If so, a
weight-type edge is created between node vi and each of its siblings.
W = {(v1, v2) | v1, v2 ∈ V ∧ siblings(v1, v2) ∧ weight(v1) ≠ null ∧ weight(v2) ≠ null ∧ (height(v1) = 0 ∨ width(v1) = 0) ∧ (height(v2) = 0 ∨ width(v2) = 0)} (4.2)
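The weight-type condition in Equation (4.2) can be sketched in Python as follows. The function name and the dictionary inputs are assumptions, and the element names and values are hypothetical, loosely modeled on the three weighted siblings discussed above, plus one unweighted sibling added for contrast.

```python
from itertools import permutations


def weight_edges(children_of, weight, height, width):
    """Return the set W of sibling pairs where both elements are sized by weight."""
    edges = set()
    for siblings in children_of.values():
        for v1, v2 in permutations(siblings, 2):
            both_weighted = weight.get(v1) is not None and weight.get(v2) is not None
            v1_zero = height[v1] == 0 or width[v1] == 0
            v2_zero = height[v2] == 0 or width[v2] == 0
            if both_weighted and v1_zero and v2_zero:
                edges.add((v1, v2))
    return edges


# Hypothetical values: three weighted siblings with a width of 0, one fixed-size icon.
children_of = {"parent": ["textview", "btn1", "btn2", "icon"]}
weight = {"textview": 2, "btn1": 1, "btn2": 1}
width = {"textview": 0, "btn1": 0, "btn2": 0, "icon": 24}
height = {"textview": 40, "btn1": 40, "btn2": 40, "icon": 24}
W = weight_edges(children_of, weight, height, width)
```

Each pair of weighted siblings yields an edge in both directions, so a problematic element such as btn2 is connected to every sibling that shares the parent's available space with it.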
A space-type placement relationship exists between two elements in the UI when at least one
of them has padding or margin properties defined. Padding defines the space between the content
of an element and its border, while margin defines the space between the border of an element and
other elements. When a UI is scaled, this extra space can significantly increase, pushing elements
away and limiting the space available to make changes to the UI. Modeling these relationships
allows my approach to identify and add these elements to the initial population so they can be
subsequently changed or modified. In Figure 4.5, for example, fixing the collision for “btn” may
require adjusting the extra space defined for “text” and “btn”. My approach creates the edges of
this type in the UIG based on the conditions specified in Equation (4.3). In particular, for a node
vi in the UIG, my approach identifies other nodes in the graph that have defined space properties. My
approach then creates space-type edges between vi and these nodes and adds them to the graph.
S = {(v1, v2) | v1, v2 ∈ V ∧ space(v2) ≠ null} (4.3)
A constraint-type placement relationship exists between two nodes when a constraint defines
a layout relationship between them in the UI. Android enables developers to build complex UIs by
establishing how elements are structured and laid out in the UI based on relationships between
sibling elements and their parent elements. These relationships create dependencies between
elements in the UI that must be considered when attempting to change the UI. Modeling these
relationships allows my approach to identify these dependencies and add related elements to the
initial population so they can be subsequently modified to fix USAFs. The approach creates the
edges of this type in the UIG based on the conditions specified in Equation (4.4). Specifically, my
approach checks if a constraint relationship exists between node vi and any of its siblings (e.g.,
start_toEndOf and layout_above). If so, a constraint-type edge is created and added to the
graph.
C = {(v1, v2) | v1, v2 ∈ V ∧ siblings(v1, v2) ∧ is_constraint_related(v1, v2)} (4.4)
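One possible reading of the constraint-type condition is sketched below in Python. The sketch rests on several assumptions: is_constraint_related is interpreted as "v1 declares a positioning attribute whose value is the ID of v2", edges point from the dependent element to the element it references, and CONSTRAINT_ATTRS is a small, non-exhaustive subset of Android's relative positioning attributes chosen for illustration.

```python
from itertools import permutations

# Illustrative, non-exhaustive subset of relative positioning attributes.
CONSTRAINT_ATTRS = ("layout_above", "layout_below", "layout_toEndOf", "layout_toStartOf")


def is_constraint_related(v1, v2, attrs):
    """True if v1 declares a positioning attribute that refers to v2 (assumed reading)."""
    return any(attrs[v1].get(name) == v2 for name in CONSTRAINT_ATTRS)


def constraint_edges(children_of, attrs):
    """Return the set C of sibling pairs linked by a layout constraint."""
    edges = set()
    for siblings in children_of.values():
        for v1, v2 in permutations(siblings, 2):
            if is_constraint_related(v1, v2, attrs):
                edges.add((v1, v2))
    return edges


# Hypothetical layout: subtitle sits below title, icon sits at the end of title.
children_of = {"parent": ["title", "subtitle", "icon"]}
attrs = {
    "title": {},
    "subtitle": {"layout_below": "title"},
    "icon": {"layout_toEndOf": "title"},
}
C = constraint_edges(children_of, attrs)
```

A symmetric reading, in which each constraint produces edges in both directions, would be equally consistent with Equation (4.4).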
4.2.1.2 UI Dependency Analysis
The next step in my approach is to analyze the dependencies in the UIG, identify the elements
and properties that must be included in the initial population (i.e., the FixSet), and determine the
set of initial changes for these elements and properties. These changes will become genes in the
initial population. To identify the elements and properties that need to be included in the FixSet,
my approach starts by utilizing the detection report, one of the inputs to the approach, to identify
the problematic nodes. These problematic nodes are the UI elements reported to have USAFs. My
approach then computes the transitive closure starting from each problematic node and adds the
nodes to the FixSet. Next, based on the edge types (i.e., D, W, S, and C), my approach determines
the subset of applicable properties Ps ⊆ P that may need to be changed for these nodes.
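The closure computation can be sketched as a breadth-first traversal, as shown below. The EDGE_PROPERTIES mapping is a hypothetical example of a mapping from edge types to candidate properties and is not taken from the dissertation. The edge list and the decision to attach an edge's properties to both of its endpoints are likewise assumptions.

```python
from collections import deque

# Hypothetical mapping from edge type to the properties that may need to change.
EDGE_PROPERTIES = {
    "dimension": {"layout_height", "layout_width"},
    "weight": {"layout_weight"},
    "space": {"padding", "layout_margin"},
    "constraint": {"layout_above", "layout_below"},
}


def build_fix_set(edges, problematic):
    """Collect every node reachable from a problematic node, with candidate properties."""
    fix_set = {node: set() for node in problematic}
    visited = set(problematic)
    queue = deque(problematic)
    while queue:
        node = queue.popleft()
        for src, dst, edge_type in edges:
            if src != node:
                continue
            props = EDGE_PROPERTIES[edge_type]
            fix_set[node] |= props
            fix_set.setdefault(dst, set()).update(props)
            if dst not in visited:
                visited.add(dst)
                queue.append(dst)
    return fix_set


# Hypothetical UIG edges as (source, target, type); "logo" is unrelated to "text".
edges = [
    ("text", "card", "dimension"),
    ("card", "root", "dimension"),
    ("text", "btn", "space"),
    ("logo", "root", "dimension"),
]
fix_set = build_fix_set(edges, ["text"])
```

Because the traversal starts only from problematic nodes, elements such as "logo" that have no path from a problematic node stay out of the FixSet, which keeps the search space small.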
The next step after identifying the elements and properties is to identify the initial values
that can be assigned to them. To do that, my approach utilizes insights into how USAFs manifest
and how they can be repaired. For example, the general strategy for fixing a cutoff USAF focuses
on providing more space for the text content to be fully visible. This can be achieved by one or
more of the following: (1) increasing the dimensions of the problematic element and its ancestors,
(2) reducing the margins and padding of the problematic element and the elements around it, (3)
removing constraints that limit the size of the problematic element, and (4) modifying or changing
the defined text properties (e.g., modifying the singleLine attribute that forces the entire text to
appear in one line). Based on these strategies and the type of each property p ∈ Ps, my approach
determines the change values that will be introduced for each node in the FixSet. These sets of
changes represent the initial population that will be used to start the search process.
4.2.2 Generating Repaired UIs
In this section, I describe my multi-objective genetic search approach to repairing USAFs. My
approach employs a multi-objective genetic search because the goal is to balance two primary
objectives: repairing the USAFs while preserving the original design and aesthetics of the UI. At
a high level, my approach starts by generating the initial set of candidate repairs, which represents the initial population and serves as the starting point for the search. During the iterative
search process, my approach generates new sets of candidate repairs and evaluates their quality
using metrics defined in the fitness function (discussed in section 4.2.2.2). The candidate repairs
are subsequently ranked based on their fitness scores and then undergo the typical search operations (i.e., selection, crossover, and random mutation) to produce new candidate repairs in each
iteration. The search continues until a termination condition is met, either when no improvement in the population is observed over multiple iterations (reaching a fixed point) or when the
maximum number of generations is reached. In the following subsections, I focus on describing
the repair representation and the fitness function. These components represent the unique aspects of my search process that require detailed discussion. The remaining components follow
the general approach of a genetic search algorithm.
4.2.2.1 Repair representation
A repair is represented by a set of proposed changes R, where each change δ takes the form
⟨i, p,a⟩. In this representation, i is a UI element to be changed, p represents the target of the
change, and a is the change value applied to p. Each change δ is considered a gene, a set of
changes R forms a chromosome, and a collection of chromosomes in a particular iteration of the
search forms a population.
4.2.2.2 Fitness Function
The goal of the fitness function is to guide the search to a solution that resolves as many USAFs as
possible while preserving the aesthetics of the UI. The fitness function consists of five objectives
derived from two general goals that a repair should achieve. First, a repair should enhance the accessibility of the scaled UI by addressing as many USAFs as possible. To achieve this goal, I utilize
the definitions of the various types of USAFs to establish objectives that can guide the search in
resolving these issues. Specifically, I define the following objectives: Minimize Text Cutoff (TC),
Minimize Elements Collision (EC), and Minimize the Number of Missing Elements (ME). These three
primary objectives help guide the search towards finding repairs that minimize the USAFs in the
UI. Second, a repair should accomplish this without significantly distorting the original design
of the UI. To achieve this goal, I introduce two secondary objectives: Maintain Positional Relationships (PR) and Minimize the Amount of Change (AC). These objectives are mapped to the two
universal aesthetic-related objectives described in Section 3.2.3 and work by penalizing repairs
that introduce significant changes to the original design of the UI, which provides distinguishing
power between repairs that may otherwise score similarly in terms of the primary objectives.
The fitness function of a candidate repair is calculated as the weighted sum of these five
objectives as shown in Equation (4.5).
F(R) = w1 · TC(R) + w2 · EC(R) + w3 · ME(R) + w4 · Tanh(PR(R)) + w5 · Tanh(AC(R)) (4.5)
In this formula, F(R) represents the fitness score of a candidate repair R. The five objectives are
weighted by w1 to w5, respectively. Since the secondary objectives should be subordinate to the
primary objectives, it is important to ensure that these secondary objectives contribute to the
overall fitness score without dominating the primary objectives. To achieve this, my approach
uses a scaled hyperbolic tangent (tanh) function on the fourth and fifth objectives to scale their
values between 0 and 1. In the following paragraphs, I discuss each of the five objectives in detail.
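Equation (4.5) can be written out as a short Python sketch. The weights are the values reported in Section 4.3.1. The `scale` divisor inside tanh is an assumption: the text says the tanh is scaled but does not give the constant, so the value of 100 here is purely illustrative.

```python
import math
from typing import NamedTuple, Sequence


class Objectives(NamedTuple):
    tc: float  # text cutoff
    ec: float  # element collisions
    me: float  # missing elements
    pr: float  # positional-relationship violations
    ac: float  # amount of change


WEIGHTS = (2.0, 2.0, 3.0, 1.0, 1.0)  # w1..w5 as reported in Section 4.3.1


def fitness(o: Objectives, weights: Sequence[float] = WEIGHTS,
            scale: float = 100.0) -> float:
    """Weighted sum of Equation (4.5); `scale` is a hypothetical constant."""
    w1, w2, w3, w4, w5 = weights
    return (
        w1 * o.tc
        + w2 * o.ec
        + w3 * o.me
        + w4 * math.tanh(o.pr / scale)  # bounded in [0, 1) for non-negative input
        + w5 * math.tanh(o.ac / scale)
    )
```

Because tanh is bounded by 1, the two secondary terms together contribute at most w4 + w5 to the score, so with these weights even one additional missing element (weight 3) outweighs any amount of secondary penalty.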
4.2.2.2.1 Minimizing Text Cutoff The goal of this objective is to guide the repair approach
towards solutions that have fewer cutoff type USAFs by approximating the number of such issues
that exist in a repaired UI. The mechanism for quantifying this objective is based on identifying
and counting the discrepancies between the intended text to be displayed in the UI and the actual
text displayed in the UI, with the difference representing the approximate amount of text cutoff
present in the repaired UI. To identify the intended text to be displayed, my approach examines
the VH of the original unscaled app. In this VH, the text content included as a part of elements
set to be visible in the UI is considered the intended text. My approach uses Optical Character
Recognition (OCR) techniques to scan the screenshot of the scaled UI and determine the actual
text that is displayed in the repaired UI. The difference between the two texts represents the
text that has been cut off and should be displayed. To quantify this difference, my approach
computes the Levenshtein distance between the two sets of strings. The Levenshtein distance
is a string metric for measuring the difference between two sequences, quantifying how many
single-character edits are required to change one text into the other. It is important to note that
this is an approximation that heavily relies on the OCR’s ability to detect the difference in text,
which may have some false positives or false negatives. However, as demonstrated in Section 4.3,
it serves as a useful approximation that can lead the approach to solutions that minimize text
cutoff issues in the repaired version.
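For reference, the Levenshtein distance can be computed with the standard dynamic-programming recurrence. The sketch below is a textbook implementation, not code from ScaleFix, and the example strings are invented.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


# Hypothetical example: intended text from the VH vs. text recovered by OCR.
intended = "Create new account"
displayed = "Create new acc"
cutoff = levenshtein(intended, displayed)  # 4: the characters "ount" are missing
```

A fully displayed label yields a distance of 0, and the distance grows with the number of characters that were cut off, which gives the search a graded signal rather than a yes/no answer.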
4.2.2.2.2 Minimizing Elements Collisions The goal of this objective is to guide the repair
approach towards solutions that have fewer collision type USAFs by approximating the number of such issues in the repaired UI. My mechanism for quantifying this objective is based on
identifying and counting the differences between the elements intended to be overlapping and
the elements that are actually overlapping in the repaired UI. My approach can identify elements intended to be overlapping by examining the VH of the original unscaled UI and using
the elements’ MBR to find overlapping elements. Similarly, my approach can identify the actual
elements overlapping in the repaired UI by examining the elements’ MBR in the VH of the repaired
UI. The difference between the elements intended to be overlapping and the elements
actually overlapping in the repaired UI represents the set of colliding elements in the repaired UI.
One of the goals when quantifying a fitness function objective, as described in Section 3.2.3,
is to quantify it using a gradient approach instead of simply counting the number of violations
(i.e., issues). This allows for a more fine-grained distinction between repairs and, therefore, better
guides the search in finding more effective repairs. My approach achieves this goal for this
objective by computing the minimum Euclidean distance between the positions of the colliding
elements in the repaired UI and the position they need to be in to restore the original relationship.
My approach then sums this distance for all collisions and uses this value as the metric for this
objective. It is important to note that this approach serves as an approximation and may have false
negatives or false positives (e.g., in the case of invisible collisions). However, as demonstrated in
Section 4.3, this approximation is useful in guiding the search to solutions that minimize collision
USAFs in the repaired version.
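The collision objective can be illustrated with axis-aligned rectangles. In this sketch the distance for a colliding pair is taken to be the smallest axis-aligned shift that separates the two MBRs, which is one plausible reading of the minimum distance described above. The names `MBR`, `separation_distance`, and `collision_metric` are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, NamedTuple, Set


class MBR(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def overlaps(a: MBR, b: MBR) -> bool:
    return a.left < b.right and b.left < a.right and a.top < b.bottom and b.top < a.bottom


def separation_distance(a: MBR, b: MBR) -> float:
    """Smallest shift along one axis that makes a and b stop overlapping."""
    if not overlaps(a, b):
        return 0.0
    return min(a.right - b.left, b.right - a.left, a.bottom - b.top, b.bottom - a.top)


def overlapping_pairs(ui: Dict[str, MBR]) -> Set[FrozenSet[str]]:
    return {frozenset((x, y)) for x, y in combinations(ui, 2) if overlaps(ui[x], ui[y])}


def collision_metric(original: Dict[str, MBR], repaired: Dict[str, MBR]) -> float:
    """Sum of separation distances over pairs that overlap only in the repaired UI."""
    intended = overlapping_pairs(original)  # overlaps present by design
    total = 0.0
    for pair in overlapping_pairs(repaired) - intended:
        x, y = tuple(pair)
        total += separation_distance(repaired[x], repaired[y])
    return total
```

Pairs that already overlap in the unscaled UI are excluded, which also shows why collisions that exist invisibly in the original cannot be detected by this kind of comparison.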
4.2.2.2.3 Minimizing Missing Elements The goal of this objective is to guide the repair
approach towards solutions that have fewer missing type USAFs by approximating the number
of such issues in the repaired UI. My mechanism for quantifying this objective is based on identifying and counting the differences between the elements that are intended to be displayed in the
UI and the actual elements that are displayed in the repaired UI. My approach can identify the
elements that are intended to be displayed by examining the elements in the VH of the original
unscaled UI and the elements that are actually displayed in the repaired UI by examining the
VH of the repaired UI. The difference between the intended and displayed elements is counted
towards the total number of missing elements. However, a metric that solely relies on the number of missing elements is not sufficient, as it fails to capture the importance of these elements
within the UI. As described in Section 3.2.3, one of the goals when quantifying a fitness function objective is to provide a more nuanced and gradient approach rather than simply counting
the number of violations. Following the gradient approach, my mechanism for quantifying this
objective also takes into account the specific types of the missing elements. For example, a repair with a missing decorative element may have less impact on the UI compared to one with
a missing interactive button that users interact with to access certain functionality. To account
for such cases, my approach, in addition to counting the number of missing elements, assigns an
impact score for each missing element based on its view type, interactivity, and presence of text.
My approach then calculates a weighted sum of the missing scores for all missing elements and
reports this value as the metric for the candidate repair.
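A sketch of such an impact-weighted count follows. The three factors (view type, interactivity, presence of text) come from the description above, but every numeric weight and every name in the code is a hypothetical placeholder, since the text does not report the actual scores.

```python
from typing import Dict, NamedTuple, Set


class Element(NamedTuple):
    view_type: str
    interactive: bool
    has_text: bool


# Hypothetical weights; the text does not give the actual values.
TYPE_WEIGHT = {"Button": 1.0, "TextView": 0.5, "ImageView": 0.25}
DEFAULT_TYPE_WEIGHT = 0.5
INTERACTIVE_BONUS = 2.0
TEXT_BONUS = 1.0


def impact_score(e: Element) -> float:
    score = TYPE_WEIGHT.get(e.view_type, DEFAULT_TYPE_WEIGHT)
    if e.interactive:
        score += INTERACTIVE_BONUS
    if e.has_text:
        score += TEXT_BONUS
    return score


def missing_metric(intended: Dict[str, Element], displayed_ids: Set[str]) -> float:
    """Sum the impact scores of intended elements absent from the repaired UI."""
    return sum(impact_score(e) for eid, e in intended.items() if eid not in displayed_ids)
```

With these placeholder weights, a missing interactive button with text scores 4.0 while a missing decorative view scores 0.5, so a repair that loses the button is penalized far more heavily.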
4.2.2.2.4 Maintain Positional Relationships The goal of this objective is to guide the search
towards solutions that have fewer changes to the positional relationships between elements in
the UI. Positional relationships include directional relationships, such as one element being above
the other, and alignment relationships, such as two elements being bottom-aligned. These relationships play an important role in helping to create a visual hierarchy and establish an intuitive
layout for users. For example, related buttons in a form, such as "Submit" and "Cancel," are often
placed next to each other to indicate their connection, making it easier for users to find and interact with them. Similarly, menu items are typically aligned either horizontally or vertically to
create a visually consistent structure. Changes to these relationships, even though they do not
necessarily result in a USAF, can still introduce distortions to the UI.
My approach quantifies this objective by identifying and counting the differences in the intended positional relationships and actual positional relationships in the repaired UI. My approach can identify the intended relationships by examining the VH of the original unscaled UI
and using elements’ MBR to identify the positional relationships between them. My approach
can identify the actual relationships in the repaired UI by examining the VH of the repaired UI
and using elements’ MBR to identify these relationships. The differences between intended and
actual positional relationships in the repaired UI represent the set of violated positional relationships. To quantify this difference in a way that achieves the goal outlined in Section 3.2.3 and
following the gradient approach described in the Minimizing Element Collisions objective, my
approach computes the minimum Euclidean distance between the positions of the elements in
the violated relationship in the repaired UI and the position they need to be in to restore their
original intended relationship. My approach then uses the sum of these distances as the metric
for this objective. This quantification allows for a more fine-grained distinction between repairs
and better guides the search to find effective solutions. To ensure that this secondary objective
serves as a distinguishing factor between repairs that perform similarly in terms of the primary
objectives without dominating them, I apply a scaled tanh function to the computed metric. By
doing so, I account for these aspects of the UI design while maintaining the primary focus on
addressing the USAFs.
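The following sketch shows one way to derive positional relationships from MBRs and to measure how far a repaired UI departs from them. The set of relationships, their definitions, and all names are assumptions made for illustration. Each relationship is expressed as a violation distance, where 0 means the relationship holds.

```python
from itertools import combinations, permutations
from typing import Dict, NamedTuple, Set, Tuple


class MBR(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


# Distance by which the pair (a, b) violates each relationship; 0 means it holds.
# Screen coordinates are assumed, with y growing downward.
DIRECTIONAL = {
    "above": lambda a, b: max(0, a.bottom - b.top),
    "left_of": lambda a, b: max(0, a.right - b.left),
}
ALIGNMENT = {
    "top_aligned": lambda a, b: abs(a.top - b.top),
    "bottom_aligned": lambda a, b: abs(a.bottom - b.bottom),
    "left_aligned": lambda a, b: abs(a.left - b.left),
    "right_aligned": lambda a, b: abs(a.right - b.right),
}
RELATIONS = {**DIRECTIONAL, **ALIGNMENT}

Relation = Tuple[str, str, str]  # (relationship, first element, second element)


def relationships(ui: Dict[str, MBR]) -> Set[Relation]:
    found: Set[Relation] = set()
    names = sorted(ui)
    for x, y in permutations(names, 2):  # ordered pairs also cover "below" and "right of"
        found |= {(r, x, y) for r, gap in DIRECTIONAL.items() if gap(ui[x], ui[y]) == 0}
    for x, y in combinations(names, 2):  # alignment is symmetric, so count each pair once
        found |= {(r, x, y) for r, gap in ALIGNMENT.items() if gap(ui[x], ui[y]) == 0}
    return found


def positional_metric(original: Dict[str, MBR], repaired: Dict[str, MBR]) -> float:
    """Sum of violation distances for relationships that held in the original UI.

    Assumes every element of the original UI is also present in the repaired UI.
    """
    return sum(RELATIONS[r](repaired[x], repaired[y]) for r, x, y in relationships(original))
```

Relationships that still hold contribute 0, so the sum grows only with the size of each violation. The result would then be passed through the scaled tanh before entering the fitness score.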
4.2.2.2.5 Minimize the Amount of Change This is the second of the two secondary objectives in the fitness function. I included this objective based on the observation that when multiple
repairs rank similarly in terms of addressing USAFs, repairs that introduce fewer changes to the
location or size of the elements in the UI are closer to the original design compared to repairs that
introduce unnecessary changes to these aspects. Therefore, the goal of this secondary objective
is to prioritize repairs that minimize changes to the location and size of the elements in the UI.
To quantify this objective, my approach calculates a metric based on the differences in the elements’ sizes and locations between the original and repaired UI. The sum of all these changes is
used as the metric for this objective. Similar to the previous secondary objective, I apply a scaled
tanh function to the computed metric to ensure that it serves as a distinguishing factor between
repairs that perform similarly in terms of the primary objectives without dominating them.
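As a rough illustration, the amount of change can be computed by summing absolute differences in position and size per element. The exact formula is not given in the text, so the choice of absolute differences in pixels and the names below are assumptions.

```python
from typing import Dict, NamedTuple


class MBR(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def amount_of_change(original: Dict[str, MBR], repaired: Dict[str, MBR]) -> float:
    """Sum of location and size differences for elements present in both UIs."""
    total = 0.0
    for element_id, before in original.items():
        after = repaired.get(element_id)
        if after is None:
            continue  # absent elements are left to the missing-elements objective
        # Change in location (top-left corner).
        total += abs(after.left - before.left) + abs(after.top - before.top)
        # Change in width and height.
        total += abs((after.right - after.left) - (before.right - before.left))
        total += abs((after.bottom - after.top) - (before.bottom - before.top))
    return total
```

An untouched UI scores 0, and the score rises with every pixel an element is moved or resized, which is what lets this objective break ties between repairs that resolve the same USAFs.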
4.2.2.3 Rendering and Evaluating the Candidate Repairs
Before a candidate repaired UI can be evaluated using the fitness function, it must be rendered
on the screen, and information about its rendering must be captured. This process, as outlined in
Section 3.2.2, is essential in understanding how the introduced changes impact the rendering of
the UI as the final visual rendering of UI elements can only be accurately determined at runtime.
At a high level, the rendering process involves first applying the suggested changes to the app’s
layout files to create a new version of the UI, which can then be rendered and evaluated according
to the defined criteria. To map components between static and dynamic VHs and locate elements
during the app rewriting, the approach follows a process similar to the one used by prior work
(e.g., [72] and [6]). At a high level, I employ a matching mechanism based on elements’ IDs,
XPaths, and other related properties. To help increase the accuracy of the mapping, I preprocess
the app layout files and assign unique identifiers to UI elements. The approach then renders the
newly generated UI on an Android device and extracts its information (i.e., VH and screenshot).
Using the extracted information, my approach evaluates the candidate repair using the fitness
function discussed in Section 4.2.2.2 and ranks the candidate repairs based on their fitness score.
4.3 Evaluation
To evaluate my approach, I designed experiments to answer the following research questions:
RQ 1: How effective is the framework instantiation in repairing USAFs?
RQ 2: How much time does the framework instantiation need to repair USAFs?
RQ 3: How do the generated repairs impact the readability and aesthetics of the UI?
4.3.1 Implementation
I implemented ScaleFix as a Java prototype tool. ScaleFix uses Apktool [16] to decode APK resource files and rebuild these files into a new APK after the repair. I implemented an interface
based on UI Automator [77], the Android Debug Bridge (ADB) [14], and Android Debug Monitor
Library (ddmlib) [37] to interact with Android devices, automatically activate scaling assistive
services, and capture layout information. To process and extract text from UI screenshots, ScaleFix utilizes a Python implementation of a well-known open-source OCR technique, Tesseract [75],
with oem and psm configurations set to 3 and 6, respectively. The weights I found most effective
for the fitness function in Equation (4.5) were 2 for w1 and w2, 3 for w3, and 1 for w4 and w5. These
weights were determined considering the relative importance of the objectives. I assigned higher
weights to the primary objectives that reflect the quality metric with respect to repairing USAFs.
I assigned lower weights to the secondary objectives and scaled their value to a range between
0 and 1 to ensure they only act as a way to distinguish between repairs that may have similar
USAFs. The weights of the fitness function are configurable and can be adjusted as needed. In
my experiments, I utilized seven parallel instances of an Android emulator based on Android 9.0.
The number of emulators can be customized based on available resources, and the implementation is capable of supporting more if needed. I conducted the experiments on an Intel i7-11700
64-bit machine equipped with 64GB of memory and running Ubuntu Linux 20.04 LTS.
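The OCR configuration mentioned above can be expressed as follows. The text does not name the Python wrapper that was used, so the use of `pytesseract` with Pillow here is an assumption, as are the function names. The `--oem` and `--psm` options themselves are standard Tesseract command-line flags.

```python
def tesseract_config(oem: int = 3, psm: int = 6) -> str:
    """Build the Tesseract options: OCR engine mode 3 and page segmentation mode 6."""
    return f"--oem {oem} --psm {psm}"


def extract_text(screenshot_path: str) -> str:
    """Run OCR on a UI screenshot (requires pytesseract, Pillow, and Tesseract)."""
    # Imported lazily so this module loads even without the OCR dependencies.
    import pytesseract
    from PIL import Image

    with Image.open(screenshot_path) as image:
        return pytesseract.image_to_string(image, config=tesseract_config())
```

In Tesseract, engine mode 3 selects the default engine based on what is available, and page segmentation mode 6 treats the image as a single uniform block of text.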
4.3.2 Subjects
I conducted my experiments on a set of 38 UIs from 27 real-world mobile applications gathered
from three prior studies with confirmed USAFs [49, 71, 9]. For each UI, I obtained the APK with
the reported issue and manually confirmed the issue was reproducible by running the app using the assistive scaling service. During the filtering process, I excluded USAFs involved in (1)
applications that could not be run or analyzed, such as the MySTC app, which displayed a message
indicating the version was no longer supported, or the secure-file-manager app, where UI Automator displayed an error message when attempting to capture its UI information; (2) WebViews
or AdViews, as they require modifying web content that my approach does not handle; and (3)
USAFs impacting UI elements created in code as my approach does not rewrite code and is limited
to introducing layout changes.
With the goal of running the approach on all of the faulty scaling versions of each of the 38
UIs, I examined the presence of USAFs in all three different scaling configurations (i.e., TO, DO,
FD) for each UI. From this investigation, I identified and collected only those scaled UI versions
that exhibited USAFs, resulting in a total of 63 scaled versions: 11 TO, 14 DO, and 38 FD. I consider
these 63 scaled UI versions as the subject UIs for my experiments. In total, the subject UIs contain
122 USAFs (52 text cutoff, 47 missing, and 23 collision type). These issues are distributed
across the different scaling versions as follows: 25 USAFs in the TO version, 25 USAFs in the DO
version, and 72 USAFs in the FD version.
4.3.3 Experiment One
4.3.3.1 Protocol
To address RQ1, I ran ScaleFix on each subject UI and measured its effectiveness in repairing
USAFs. To calculate the effectiveness, I computed two metrics: (1) the repair success rate and (2)
the accessibility rate before and after the repair. The repair success rate measures the effectiveness
of ScaleFix in terms of repairing USAFs and is calculated as the ratio of fixed USAFs to the total
number of USAFs in the UI. The accessibility rate, a known metric to measure the accessibility of
a mobile app’s UI [8, 65, 6], evaluates the effectiveness of ScaleFix in improving the accessibility
of a UI. It is calculated as the ratio of the number of elements in a UI free from USAFs to the total
number of elements in the UI.
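Both metrics are simple ratios, as the sketch below shows. The function names and the example numbers are hypothetical and do not correspond to any subject UI.

```python
def repair_success_rate(fixed_usafs: int, total_usafs: int) -> float:
    """Ratio of fixed USAFs to the total number of USAFs in the UI."""
    if total_usafs <= 0:
        raise ValueError("the UI must contain at least one USAF")
    return fixed_usafs / total_usafs


def accessibility_rate(elements_without_usafs: int, total_elements: int) -> float:
    """Ratio of elements free from USAFs to all elements in the UI."""
    if total_elements <= 0:
        raise ValueError("the UI must contain at least one element")
    return elements_without_usafs / total_elements


# Hypothetical UI: 20 elements, 5 of them affected before the repair, none after.
before = accessibility_rate(15, 20)  # 0.75
after = accessibility_rate(20, 20)   # 1.0
```

The accessibility rate is reported before and after the repair, so the improvement for this hypothetical UI would be 25 percentage points.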
Two of my collaborators from the ScaleFix project [7] and I determined if the issues in a repaired UI were fixed by comparing the repaired UI with its oracle. The oracle for a UI is the
original unscaled version that does not exhibit the issues. The three of us performed this verification process independently. For each subject UI, we examined the two versions side-by-side
and assessed whether USAFs were present. To determine this, we answered the following three
questions: (1) Are there any visible elements with text in the oracle that have parts of their text
missing or disappearing in the repaired UI? (2) Are there any visible elements in the oracle that
are not visible on the repaired UI? (3) Are there any two elements that were visibly separated and
had no parts overlapping in the oracle but appear to overlap in the repaired UI? For each question
answered with “yes” (i.e., there are issues), we noted the number and type of issues identified.
An issue was considered fixed only if both of my collaborators and I confirmed that the issue was
resolved. I discuss the potential threat this protocol may impose on the results in Section 4.3.5.
After the verification steps, I calculated the repair success rate and accessibility rate before and after the repair, as discussed above. I also measured the runtime of ScaleFix during the experiments
to address RQ2.
Table 4.1. ScaleFix’s effectiveness in repairing USAFs (RQ1)
        TO                  DO                  FD
        SR(%)  AR(%)        SR(%)  AR(%)        SR(%)  AR(%)
Avg     100    84 / 100     86     75 / 98      86     78 / 99
Med     100    88 / 100     100    78 / 100     100    85 / 100
Max     100    86 / 100     100    90 / 100     100    96 / 100
Min     100    75 / 100     0      60 / 75      0      13 / 75
4.3.3.2 Presentation of Results
The results for RQ1 are shown in Table 4.1. In this table, “SR” refers to the repair success rate,
“AR” refers to the accessibility rate (Before / After). TO, DO, and FD represent the different scaling
versions. Figure 4.6 shows the change in accessibility rate across my 38 FD UIs.
[Plot omitted. x-axis: subjects (1 to 38); y-axis: accessibility rate (0% to 100%); legend: Change in Accessibility Rate After Repair, Original Accessibility Rate.]
Figure 4.6: Accessibility Rate for the FD UIs before / after repair.
4.3.3.3 Discussion of Results
The overall results show that ScaleFix was effective in repairing USAFs, achieving an average repair success rate of 90% across all UIs. The average rate of improvement in accessibility rate across
all UIs was 23% (with a maximum of 88%), indicating considerable improvements in accessibility.
Repair success rate for TO UIs was consistently 100%. For DO UIs, repair success rate was 100%
for 12 out of 14 UIs, and for FD, it was 100% for 34 out of 38 UIs. The median accessibility rate
after the repair across all scaling versions was 100%, demonstrating the effectiveness of ScaleFix
in repairing USAFs for most of the UIs.
I analyzed the cases where my approach failed to fix issues. It was mainly due to limitations
in the approximations used to quantify the fitness function objectives, specifically in identifying
element collisions and text cutoffs in the repaired subjects. For example, my fitness function did
not penalize repairs in the DO and FD versions of the easer UI because my approximation for
detecting collisions missed the colliding elements, as they were already invisibly colliding in
the unscaled version of the UI. In the budgeta UI, inaccuracies in the OCR detection led to repairs
that completely removed the USAF being heavily penalized and ranked lower compared to other
repairs. In all of these cases, my approach generated repaired UIs that fixed the USAFs; however,
these UIs were not selected by the search approach.
For the infini UI, my approach managed to completely fix USAFs in the TO and DO versions.
In the FD version (the largest scaling option), my approach successfully resolved the two missing
USAFs, but in doing so, introduced two cutoff issues. This occurred because the infini UI is a
popup with a partial screen window and a large amount of text, making it challenging to fix all
USAFs in the FD configuration.
The findings for RQ2 demonstrate that ScaleFix can generate repairs within a reasonable time
with an average of 5.9 minutes. A detailed analysis of the runtime revealed that approximately
99% of the time was spent evaluating the quality of the candidate repairs (i.e., fitness function).
To further optimize this step, as noted in Section 4.3.1, the number of parallel emulators can be
expanded beyond the seven used in these experiments. This would allow for a more effective
distribution of the workload across additional devices.
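Distributing fitness evaluation across several devices could look like the sketch below. It is not the ScaleFix implementation, which is written in Java. The function `evaluate_in_parallel` and the callback signature are hypothetical, and a queue of free device indices stands in for a pool of emulators.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

R = TypeVar("R")  # a candidate repair


def evaluate_in_parallel(
    candidates: Sequence[R],
    evaluate_on_device: Callable[[R, int], float],
    num_devices: int = 7,  # seven emulators were used in the experiments
) -> List[float]:
    """Score candidates concurrently, with at most one candidate per device at a time."""
    free_devices: "queue.Queue[int]" = queue.Queue()
    for device in range(num_devices):
        free_devices.put(device)

    def run(candidate: R) -> float:
        device = free_devices.get()  # wait for an idle device
        try:
            return evaluate_on_device(candidate, device)
        finally:
            free_devices.put(device)  # hand the device back

    with ThreadPoolExecutor(max_workers=num_devices) as pool:
        return list(pool.map(run, candidates))  # results keep the input order
```

Raising `num_devices` increases throughput only as far as the host machine can run additional emulators, since each evaluation involves rendering the UI on a device.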
4.3.4 Experiment Two
4.3.4.1 Protocol
To address RQ3, I conducted a user-based study to evaluate the impact of my repairs on the UI
from the users’ perspective. In the user study, I had two objectives. First, I wanted to assess my
repairs’ impact on the UI with respect to the original UI. Second, I wanted to assess how my
repairs would compare with repairs made by the developers. For the first objective, participants
were asked to rate the UI before and after the repair. In the first phase (Part 1), participants were
unaware of the USAFs. They were presented with screenshots of the scaled UIs before and after
repair, along with a screenshot of the original unscaled UI for reference. The order of the before and after versions was randomized, and they were only labeled as version1 and version2 to
minimize potential bias. Participants were asked to rate the two UIs based on attractiveness and
readability on a numeric scale of 1 to 10, where 10 represents the most attractive/readable. Participants were also asked to indicate their preference using a 5-point Likert scale. The preference
options ranged from “I strongly prefer version 1” to “I strongly prefer version 2,” with “No preference” as the middle option. These options allowed participants to indicate their preferences,
whether they had a strong preference for either version, a slight preference for one version over
the other, or no preference at all. I also collected written feedback from participants to understand the rationale behind their preference and rating for attractiveness and readability. In the
second phase (Part 2), participants were informed about the USAFs in each UI and, again, asked
to rate their preference between the UIs using a 5-point Likert scale.
For the second objective, I asked participants to directly compare my repairs with developers’
repairs. I conducted this direct comparison on a subset of UIs for which developers’ repairs were
available, and I could obtain the corresponding APKs. For this direct comparison, I followed the
same process described in Part 1. Participants were presented with screenshots of two versions
(my repair and developers’ repair) and a screenshot of the original unscaled UI for reference. The
order of the UIs was randomized, and the versions were labeled as version1 and version2.
I conducted the survey on a group of individuals with disabilities who were recruited through
a charity foundation based in California, USA, that I have worked with before. This foundation
congregates people with disabilities, primarily those with visual and motor disabilities. In total,
I had 27 participants, 41% of whom consider themselves to have vision-related issues
or have used screen magnification to adjust the display or font size. All of the participants have
some form of motor disability. I included participants with motor disabilities in addition to those
with visual disabilities because USAFs also affect them from an operational aspect. Many of the
motor-impaired participants rely on assistive services to increase the size of UI elements to make
interaction easier. All of the subject UIs were analyzed in the study. There were a total of 85 combinations from the different scaling configurations from these UIs, including those with available
developers’ repairs (i.e., second objective). I divided them into ten random surveys (about eight
per survey). Each participant was randomly assigned to complete three unique surveys.
4.3.4.2 Presentation of Results
The results for the aesthetics ratings from the user study are shown in Figure 4.7. The box plot
shows the attractiveness and readability scores when comparing my repaired UIs with the original
scaled UIs. The boxes represent the distribution of the numeric ratings given by the participants.
When comparing my repaired UIs to the original scaled UIs, the average attractiveness score
increased from 4.9 to 7.2, and the average readability score increased from 5.0 to 7.2. These results
were statistically significant under the Wilcoxon signed-rank test (p < .00001 for both). I
used the Wilcoxon signed-rank test for the analysis because I compared paired ratings from two
dependent samples, which were not normally distributed.
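For readers unfamiliar with the test, the sketch below computes the Wilcoxon signed-rank statistic for paired ratings, with a two-sided p-value from the normal approximation. It omits tie and continuity corrections, and the approximation is coarse for small samples, so a statistics package would normally be used instead. The ratings in the example are invented and are not data from the study.

```python
import math
from typing import Sequence, Tuple


def wilcoxon_signed_rank(before: Sequence[float], after: Sequence[float]) -> Tuple[float, float]:
    """Return (W, two-sided p-value) for paired samples, using a normal approximation."""
    diffs = [b - a for a, b in zip(before, after) if b != a]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda k: abs(diffs[k]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w, p


# Invented paired ratings (1 to 10) for eight UIs, before and after repair.
ratings_before = [4, 5, 3, 6, 5, 4, 6, 5]
ratings_after = [7, 8, 6, 7, 8, 6, 9, 7]
w_stat, p_value = wilcoxon_signed_rank(ratings_before, ratings_after)
```

The test works on the ranks of the paired differences rather than on their raw values, which is why it suits ratings that are not normally distributed.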
4.3.4.3 Discussion of Results
The user study results show that my approach successfully maintained the readability and aesthetics of its repaired UIs. On average, the participants’ attractiveness and readability ratings for
my repaired UIs were 44% and 47% higher than the original scaled UIs.
Figure 4.7: Ratings of the original scaled UIs and our repairs.
Participants also preferred my repairs more than the original scaled UIs. On average, 73%
preferred my repairs, where 56% rated “I prefer the repaired UI” and 17% rated “I strongly prefer
the repaired UI”. For the 12% of the participants that rated “No preference,” most of them thought
the UIs looked similar and could not tell the difference between them. I believe this indicates
that my repair maintained the design’s originality and did not negatively affect user preference
while fixing almost all of the USAFs. I investigated the 16% that preferred the original scaled UI
and found that, in some cases, the participants preferred a UI with more white space because it
“looks cleaner”, despite having minor overlapping or partial text cutoffs. This is more evident in
UIs where my repair had to introduce significant changes to repair the cutoff type issues. Once
the participants were aware of the implications of the USAFs, I saw an average increase of 19%,
with my repairs favored in 92% of the ratings. In particular, the number of “strongly prefer”
ratings more than doubled.
I directly compared my repaired UIs with the developers’ repaired UIs to better understand
their repair qualities. Regarding attractiveness and readability, I found the ratings very similar.
There were no statistically significant differences between the two. I believe this is a positive result
because it shows that my approach generated UIs that were indistinguishable in terms of aesthetic
quality from the ones done by the developers. Regarding user preferences, 46% of the participants
preferred my repair, while 37% preferred the developers’ repair. Based on the comments provided,
participants generally preferred the developers’ repairs when they were more similar to the original
UI than my repairs were. I investigated the UIs for which the developers’ repairs were preferred and found
that, in some cases, developers resolved the issues by fixing the font size so that the text would
not respond to scaling requests. Although this approach may prevent USAFs from manifesting, it
essentially defeats the purpose of using scaling assistive services to improve the accessibility of
the UI. In summary, the findings demonstrate that my automated repairs achieved a quality that
is comparable to, or even exceeds, that of the repairs manually constructed by the applications’
developers.
4.3.5 Threats to Validity
External Validity: The selection of participants may present a potential threat to the validity
of my user study. To address this threat, I specifically targeted individuals with disabilities who
are most affected by these issues. In particular, I sought participants with varying levels of visual
impairments, including those who require screen magnifiers, as well as individuals with motor
impairments, such as paralysis, who may experience difficulties in accurately interacting with
the UI.
Construct Validity: A potential threat is the possibility that the USAFs considered for repair
may not be genuine issues. However, these issues have been previously detected and reported in
prior studies on mobile accessibility. Many of these issues have already been reported by users
and acknowledged or repaired by the developers of these apps.
A potential threat to the construct validity is the subjectivity of the participant ratings in the
user study. To mitigate this threat, the user study is designed to evaluate pairs of UIs in relation
to one another. This approach ensures consistent ratings are assigned to the same pair of UIs,
even when participants apply different assessment criteria.
Internal Validity: A threat is the manual verification process for RQ1, which may be prone
to errors or mistakes. To mitigate this threat, the verification process was carried out through
an independent unanimous agreement between myself and two of my collaborators from the
ScaleFix project [7]. In addition, I calculated the inter-rater agreement between myself and my
collaborators and found that we had a high agreement (99%). To further mitigate the threat, I
asked user study participants (Experiment 2) to assess whether the issues in each UI were fixed
using a 5-point Likert scale. Participants confirmed that my repairs improved the accessibility
of the UIs by removing scaling issues: 67% selected “strongly agree” and 30% selected “agree” when asked whether my
repairs fixed the issues. The participants’ responses corroborated the results obtained for RQ1,
providing further evidence that the manual verification process was accurate and reliable. Please
note that I used a manual verification because of the limited scope and availability of the existing
detectors. Finally, a similar process for examining UIs to check for UI and accessibility issues has
been used in prior studies [49, 48, 86, 9].
Another potential threat to validity is that users evaluated the UIs based on screenshots instead of directly interacting with them on a mobile device. I made this choice for the
following reasons. First, my user study focuses on assessing the visual aspects of the rendered
UIs, which do not require direct interaction with the UIs. Second, using screenshots allows me
to control variations in the results that could arise from differences in participants’ settings or
devices, ensuring a consistent and uniform presentation format. Third, using screenshots simplifies comparisons, as users can view the UI versions side-by-side, eliminating the need to install,
run, and uninstall various app versions. It also allows me to reach a wider range of participants
without having to address the distribution and installation of numerous apps or hardware device
considerations. Lastly, screenshots are frequently employed in user studies focusing on the evaluation of UI attractiveness or readability and in determining if issues have been resolved (e.g.,
[51, 52, 3, 47, 6]).
4.4 Conclusion
This chapter partially confirms the hypothesis that a repair framework utilizing UI models and a
multi-objective genetic search approach can effectively repair layout accessibility issues. Specifically, I introduced ScaleFix, an instantiation of the framework for automated repair of USAFs in
mobile apps. ScaleFix creates a UI model that encapsulates the placement relationships between
elements and then leverages this model to identify the elements and properties that need to be
modified to fix the detected issues. The approach employs a multi-objective genetic algorithm,
with a fitness function designed based on insights into the impact of scaling assistive services on
UIs, to guide the search for potential candidate repairs.
The evaluation of ScaleFix demonstrated its effectiveness in repairing USAFs, achieving an
average success rate of 92% in real-world apps. A user study further validated the effectiveness
of the instantiation in generating quality repairs by showing that the repaired versions were
rated better in terms of attractiveness and readability compared to the original UI versions. In
fact, participants’ attractiveness and readability ratings for the repaired UIs were more than 40%
higher than the original scaled UIs. Moreover, 73% of the participants preferred the repaired
versions. When compared with repairs manually created by developers, participants rated our
repairs similarly to those created by developers. This successful instantiation and evaluation
contribute toward confirming my proposed hypothesis.
Chapter 5
Repairing Size Based Inaccessibility Issues (SBIIs)
Touchscreen technology has been the most prominent input method for users to interact with
mobile devices [60]. Yet, interacting with mobile devices by touch can be difficult for many people,
such as older adults and those with motor disabilities (e.g., paralysis, tremors, or neurological
diseases). These difficulties can translate into imprecise touches, increased touch mistakes, or
even the inability to access important functionalities in mobile apps. Studies have shown that
inadequate size of touch targets is the root cause that manifests these difficulties [30, 78, 39, 63].
This type of issue, known as Size-Based Inaccessibility Issues (SBIIs) [65], occurs when the size
of a touch target is less than the minimum size specified by the accessibility guidelines [55, 80,
19]. Recent studies have shown that SBIIs are among the most prevalent accessibility issues that
affect mobile apps [8, 65]. In fact, a study on real-world apps from 33 app categories of the Google
Play store showed that small touch target size was ranked as the second top accessibility issue
[8]. Another recent large-scale empirical study on accessibility issues found that 78% of apps had
more than 10% of their elements impacted by these issues [65].
Automatically repairing SBIIs is a challenging task for several reasons. First, a repair must
account for multiple SBIIs holistically in order to preserve the relative consistency of the original
UI design. Second, due to the complex relationship between Android UI components, there is no
clear way of identifying the set of views and properties that need to be modified for a given SBII.
Finally, assuming that the relevant views and properties can be identified, a change in the size of
one element can introduce further alignment or spacing issues to other areas of the UI. Together,
these challenges make a seemingly simple repair difficult to achieve.
Existing approaches cannot help developers repair SBIIs. Research by Zhang et al. developed
prototypes to address Android accessibility by enhancing user interactions. Their work Interactiles [89] focuses on making touchscreens accessible by attaching a hardware interface to the
Android phone’s screen to enhance tactile interaction for the visually impaired. Similar hardware
overlay techniques [74, 40, 45] have been proposed in HCI research, but they do not fix the underlying issues and require the hardware cutouts to be tailored to fit the devices. Software-based
approaches to improve touchscreen accessibility [44, 43, 18] are more robust, but they mostly
focus on using audio-based interaction techniques to allow the visually impaired to access touchscreens. Zhang et al. also introduced “interaction proxies” to be inserted on top of an app’s original
UI for disabled users to more easily manipulate the app [87]. While this approach can potentially address size-based inaccessibility, it relies heavily on manually remapping interactions into
new interactions. Touch Guard [90] helps users to access inaccessible small touch targets by enhancing their touched areas with screen magnification to enlarge and disambiguate the bounds
between multiple targets. These existing tools merely operate as “assistive technologies” to provide increased usability. However, they do not provide a way to help app developers repair the
root causes of the problem.
In this chapter, I describe how I instantiate the repair framework, presented in Chapter 3,
to repair SBIIs. In this instantiation, which I refer to as Size-based inAccessibiLity rEpair
in Mobile apps (SALEM), I employ different insights from the problem domain to derive the UI
qualities, build the UI model, and define heuristics to evaluate the generated repairs and guide the
search toward finding the best repair to fix SBIIs without negatively impacting the UI. To ensure
completeness, I present the full approach in this chapter. While some components of the general
framework are reiterated for clarity, the focus here is on how these components are specifically
tailored and instantiated to address SBIIs.
Figure 5.1: Examples of SBIIs: The left screenshot shows the original UI, while the right
screenshot highlights the SBIIs as identified by the Google Accessibility Scanner.
5.1 Background
Touch Targets are components of a user interface (UI) that respond to user interactions, such
as tapping, swiping, or pinching. These interactive elements provide access to a mobile app’s
functionality and can capture and respond to users’ actions. Ensuring that touch targets are
sufficiently sized is a critical aspect of mobile app accessibility, allowing users to interact with the
interface effectively. Small touch targets can be especially problematic for people with disabilities,
such as motor disabilities or reduced dexterity, as well as older individuals who may experience
age-related decline in fine motor skills. Struggling to interact with small touch targets can lead to
frustration, errors, and ultimately, a reduced ability to use the app effectively [89]. Ensuring that
touch targets are sufficiently sized is, therefore, essential for creating an accessible and inclusive
mobile experience.
Web and mobile accessibility guidelines emphasize the importance of adequately sized touch
targets. Google’s Material Design principles for Android accessibility [55] and Guideline 2.5.5 of
the international accessibility standard WCAG 2.1 [80] both outline the requirement for touch
targets to be at least 48dp × 48dp with respect to the screen [36, 55, 19]. Testing tools such
as Google Accessibility Scanner [2], Accessibility Testing Framework [38], and IBM’s Mobile
Accessibility Checker [41] are designed to detect touch target issues in mobile apps based on
these guidelines. In this chapter, I use the term Size-Based Inaccessibility Issue (SBII) to refer
to any violations of this guideline where a touch target falls below the required size threshold.
Figure 5.1 shows examples of this issue in the Fintech Credit Sesame app, where all the touch
targets of the "Create Your Account" UI are flagged by Google Accessibility Scanner [2] for having
SBIIs. This could make creating an account a frustrating and error-prone process for users with
motor disabilities or for older adults.
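As a rough illustration of the size check behind an SBII, the sketch below converts a rendered size in pixels to density-independent pixels using Android's standard relation dp = px × 160 / dpi and compares each dimension with the 48dp threshold discussed above. The function names and the example element are hypothetical and are not taken from any of the cited detection tools.

```python
MIN_TOUCH_TARGET_DP = 48.0  # minimum size discussed in the text above


def px_to_dp(px, dpi):
    """Convert pixels to density-independent pixels (160 dpi is the baseline)."""
    return px * 160.0 / dpi


def sbii_dimensions(width_px, height_px, dpi):
    """Return the dimensions of a touch target that fall below the minimum."""
    violated = []
    if px_to_dp(width_px, dpi) < MIN_TOUCH_TARGET_DP:
        violated.append("width")
    if px_to_dp(height_px, dpi) < MIN_TOUCH_TARGET_DP:
        violated.append("height")
    return violated


# Hypothetical button rendered at 300 x 90 px on a 480 dpi screen.
# Its height is 90 * 160 / 480 = 30dp, so only the height is flagged.
flagged = sbii_dimensions(300, 90, 480)
```

A check of this kind only identifies which elements violate the threshold; it says nothing about which elements or properties must change to fix the violation.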
5.2 Approach
The goal of this instantiation, SALEM, is to automatically repair the SBIIs in a mobile app’s UI
while maintaining, as much as possible, the aesthetics and design of the original UI. Fixing SBIIs
requires changes to the properties that control the size of elements in the UI to allow the UI to
meet the accessibility requirements. Finding the new values that fix the SBIIs while maintaining
the UI’s aesthetic is complicated by several challenges. The first challenge is to maintain the
visual consistency of the UI design. For example, for a navigation bar with a set of menu items,
changing the size of one item without updating the other menu items will distort the navigation
bar’s visual consistency. The second challenge is knowing what needs to be changed in order to
fix the SBIIs. Directly changing the elements with accessibility problems does not always fix the
problem because the final rendered appearance of an element depends not only on its
properties but also on those of its containing elements and nearby elements. Therefore, the set of elements and
properties that need to be adjusted to fix the SBIIs often include other elements in addition to the
one with an SBII. The third challenge is that a repair can have a cascading effect. A change to one
part of a UI can trigger a chain of changes in other parts of the UI as elements change and move
to accommodate the change. This challenge is compounded by the existence of multiple SBIIs
in a UI or when many elements must be adjusted together to maintain visual consistency, which
increases the likelihood that the final layout will be distorted.
To address these challenges, I leverage the repair framework I described in Chapter 3. This
framework is well-suited for this domain as it employs a multi-objective genetic search algorithm,
which works well for balancing trade-offs between repairing different SBIIs and maintaining the
original UI design. The framework also enables the use of insights into the problem domain to
build UI models, which can be analyzed to identify a subset of elements and properties that are
most likely to resolve the issues while minimizing the runtime needed to find a successful repair.
Section 5.2.1 describes the instantiation of the framework’s analysis phase. Section 5.2.2 describes
the instantiation of the framework’s UI generation phase, including the multi-objective genetic
search approach. The input for SALEM is an APK of an app along with a detection report that
lists each of the app’s UIs that exhibit SBIIs along with details about the SBIIs of each of these
UIs. The detection report can be provided by automated detection techniques, such as Google
Accessibility Scanner [2] or Accessibility Test Framework for Android (GATF) [38]. When the
search terminates, the best values obtained for all selected elements are used to update the UI’s
corresponding static layouts. Once all of the UIs in the app are repaired, the app is compiled and
provided as the output of the approach.
5.2.1 Analysis Phase
The goal of this phase is to identify, for each of the SBIIs, the set of elements and properties
that need to be changed in order to repair these issues. This set later determines the structure of the candidate repairs in the initial population that starts the search process. Although the reports from accessibility
detection tools can be used to identify elements that exhibit SBIIs, they do not necessarily indicate
which elements need to be adjusted to repair the SBIIs nor which properties should be adjusted.
There are several reasons for this limitation. First, the size of the element may be set based on an
interaction of its properties with the properties of other elements that are located close by or of
elements from which it inherits display constraints. For example, the size of an element that is set
to fill the available space depends on its neighbors’ size, or an element’s size may be bounded by
the fixed size of another element in which it is visually contained. Second, maintaining the consistency of the repaired UI requires modifications to other elements, which may themselves have
relationships with other elements that need to be modified to maintain consistency. One could
address both of the above challenges by making the repair phase of the approach consider modifying all elements and properties present in the UI. However, such a solution would dramatically
increase the search space for a possible repair and means that the search process could take a long
time to complete. Therefore, in this phase, my approach aims to identify a subset of all elements
and properties in the UI that is (1) safe, in that it contains the elements and properties that, when
modified, can repair the observed SBII and keep the UI consistent; and (2) minimal, to reduce the
runtime needed to identify a successful repair. To accomplish this goal, the approach first builds
a UI model that captures the visual and size rendering relationships among the elements in a UI.
Using this model, the approach then performs the UI dependency analysis to identify the set of
elements and properties that may need to be changed to fix the detected issues (i.e., the FixSet).
In the following subsections, I first describe how my approach builds a model of the UI and then
show how the approach performs the UI dependency analysis to identify the FixSet.
5.2.1.1 Building the UI Model
My approach models two types of relationships: consistency relationships, which exist between
elements that need to be changed together to maintain the visual consistency of the UI, and dependency relationships, which exist between two elements if the size of one constrains, in some
way, the size of the other. My approach models these relationships using a graph-based abstraction of the UI, called the Size Relation Graph (SRG). The Size Relation Graph (SRG) is formally
represented as a graph ⟨V,E,M⟩, where V represents the set of visible elements in the UI and E is
a set of directed edges that represent the relationships between elements in the UI. M is a function that maps each edge to a set of tuples of the form ⟨p,ϕ⟩, where p ∈ P, P = {height,width},
and ϕ is the ratio of the drawing values of p for the edge’s nodes. In the following subsections, I
discuss the details of these two relationships.
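One possible in-memory representation of the graph ⟨V,E,M⟩ is sketched below. The class and field names are illustrative assumptions, not the data structures of the actual implementation; the only parts taken from the text are the three components of the graph and the ⟨p,ϕ⟩ annotation on each edge.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    source: str  # element ID of the edge's source node
    target: str  # element ID of the edge's target node
    kind: str    # "consistency" or "dependency"


@dataclass
class SizeRelationGraph:
    nodes: set = field(default_factory=set)          # V: visible elements
    edges: set = field(default_factory=set)          # E: directed edges
    annotations: dict = field(default_factory=dict)  # M: edge -> {p: phi}

    def add_edge(self, source, target, kind, ratios):
        """Add a directed edge annotated with <p, phi> tuples.

        ratios maps a property p in {"height", "width"} to its ratio phi.
        """
        unknown = set(ratios) - {"height", "width"}
        if unknown:
            raise ValueError(f"unsupported properties: {sorted(unknown)}")
        edge = Edge(source, target, kind)
        self.nodes.update((source, target))
        self.edges.add(edge)
        self.annotations[edge] = dict(ratios)
        return edge


# Hypothetical element IDs; the ratio 1.0 means both heights are equal.
srg = SizeRelationGraph()
example_edge = srg.add_edge("sign_up", "privacy_policy", "consistency", {"height": 1.0})
```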
5.2.1.1.1 Consistency Relationships A consistency relationship exists between two elements that are visually related. The goal of modeling this type of relationship is to identify the
sets of elements that should be adjusted together to maintain the visual consistency of the repaired UI. Maintaining consistency among visually related elements (e.g., items in a menu list) is
essential to maintaining the aesthetics and design of the original UI. However, identifying these
sets is challenging since apps’ UIs vary significantly from each other. This variation can even
exist within the same app, as different UIs may have their own layouts with a varying number of
elements and a different set of visual relationships. This means relying on a predefined number
of groups with fixed rules on how to map the elements in any UI to those groups is not practical.
Instead, these groups need to be identified on a per-UI basis. A simplistic approach to identifying
these groups might put elements that have the same class type (e.g., all Buttons) or style into the
same group. However, in my experience, this was generally inaccurate since elements with the
same class type can vary widely in their appearance, and styles are not used in a disciplined way
by most developers. Techniques with similar goals but targeted to web applications (e.g., [68, 50,
51]) rely on various metrics, such as DOM structure, to group elements, and in my experience,
this also resulted in inaccurate groupings when applied to mobile app UIs.
To identify visually related elements, I characterized the problem as a clustering problem,
where elements represent the data points that need to be made into clusters, and the cluster
membership is determined by the similarity of the elements’ rendering attributes. To cluster elements, my approach uses the well-known density-based clustering technique, DBSCAN [33].
This particular technique is well suited for this problem since the algorithm (1) does not require
predefining the number of clusters and (2) produces mutually exclusive clusters (i.e., hard clustering). Both of these attributes are important for this problem domain since the variance of app
UI layouts means they can have varying numbers of groupings, and having non-mutually exclusive clusters could prevent the search process from converging. To define the distance function,
I found that (1) logical location (represented by the XPath), (2) element size, and (3) element class
type consistently resulted in the most useful groupings. In my experience, elements with similar XPaths had a higher chance of being related in terms of sharing a similar visual appearance
and/or inheriting the same properties from a mutual parent (for example, icons in the navigation
bar). I also found that elements with a similar size were often visually related (e.g., lists of buttons
or items). Finally, I found that while element class type was, by itself, insufficient to indicate element grouping, when combined with the other dimensions, it helped to improve the grouping’s
accuracy.
The first step in identifying the visually related elements is to analyze the VH of the UI and
extract each unique element and its properties. The elements become the data points that will
be clustered. Next, to determine the distance between those data points, my approach defines a
function based on the three above-mentioned metrics. To calculate the location distance, my approach computes the Levenshtein distance between the elements’ XPaths. The Levenshtein distance
between two XPaths is the minimum number of XPath tags that need to be modified to change
one XPath into the other. My approach then normalizes the value of the location distance metric
to a range of [0,1]. A metric value of zero indicates a complete match between the two elements,
while one indicates a maximum difference. To calculate the size distance, my approach computes
a metric for each of the size properties (height, width, and margins). If elements v1 and v2 have the
same value for a size property (e.g., height), then the metric value for that property is set to 0; otherwise, it
is set to 1. Similarly, my approach computes a metric to calculate the element class type distance.
If two elements have the same element class type, then the metric value is set to 0. Otherwise, it
is set to 1. The approach then calculates the overall distance as a weighted sum of the normalized
value of each of the above three metrics. The weights of the metrics were determined empirically based on my experiments. The DBSCAN algorithm then uses this information to group the
elements into different clusters. Each cluster then represents a set of visually related elements,
Figure 5.2: Example that shows two apps’ UIs annotated with a simplified version of the visually
related groups that were identified by the clustering algorithm.
and the set of all clusters is the output of this phase. Figure 5.2 shows a simplified version of the
visually related groups identified for two mobile apps’ UIs. Each number on the graph represents
an identified group.
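The distance function described above can be sketched as follows. This is a minimal illustration under several assumptions: the weights are placeholders (the text only says they were determined empirically), the location distance is normalized by the length of the longer XPath, the per-property size metrics are averaged into one value, margins are reduced to a single number, and the element dictionaries are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (here: lists of XPath tags)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete a tag
                            curr[j - 1] + 1,      # insert a tag
                            prev[j - 1] + cost))  # substitute a tag
        prev = curr
    return prev[-1]


def location_distance(xpath1, xpath2):
    """Tag-level Levenshtein distance, normalized to [0, 1]."""
    tags1 = xpath1.strip("/").split("/")
    tags2 = xpath2.strip("/").split("/")
    return levenshtein(tags1, tags2) / max(len(tags1), len(tags2))


def size_distance(e1, e2, props=("height", "width", "margin")):
    """Fraction of size properties whose values differ (0 = all equal)."""
    return sum(1 for p in props if e1[p] != e2[p]) / len(props)


def class_distance(e1, e2):
    return 0.0 if e1["class"] == e2["class"] else 1.0


# Placeholder weights, not the empirically determined ones.
WEIGHTS = {"location": 0.5, "size": 0.3, "class": 0.2}


def element_distance(e1, e2, weights=WEIGHTS):
    """Weighted sum of the three normalized metrics."""
    return (weights["location"] * location_distance(e1["xpath"], e2["xpath"])
            + weights["size"] * size_distance(e1, e2)
            + weights["class"] * class_distance(e1, e2))
```

A pairwise distance matrix built with a function like this could then be handed to any DBSCAN implementation that accepts precomputed distances.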
A consistency edge, therefore, is created to ensure size changes are propagated among elements within a visually related group to maintain their visual consistency. For example, for the
UI shown on the left-hand side of Figure 5.2, my approach creates consistency edges between the
nodes in the SRG that represent the three buttons in group ➍ to ensure that a change applied
to one can be propagated to the others. To create the consistency edges for a visually related
group, my approach iterates over its elements, and for each pair of elements v1 and v2, my approach creates a consistency edge between their corresponding nodes in the SRG. The approach
then creates an edge tuple to capture the relationship between each of the pair’s dimensions (i.e.,
height and width) and calculates the ratio ϕ for that tuple by dividing the drawing value of the
dimension for v1 by the drawing value of the dimension for v2. Returning to the example in Figure 5.2, for the edge created between the nodes in the SRG representing the ‘Sign up’ and ‘Privacy
Policy’ buttons, my approach creates a tuple that models the height relationship between them.
Since the height of both buttons is 30dp, the tuple will be initialized with the value ⟨height,1.0⟩.
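The pairwise construction of consistency edges can be sketched as below. The 30dp heights come from the example above; the widths, the third button, and the element IDs are hypothetical. The `propagate` helper shows one plausible way the ratio ϕ could carry a size change from v1 to v2; that rule is an assumption of this sketch rather than something stated in the text.

```python
from itertools import combinations

DIMENSIONS = ("height", "width")


def consistency_edges(group, drawn_sizes):
    """Return (v1, v2, {dimension: phi}) for every pair in a group.

    phi is the drawn value of the dimension for v1 divided by that for v2.
    """
    edges = []
    for v1, v2 in combinations(group, 2):
        ratios = {d: drawn_sizes[v1][d] / drawn_sizes[v2][d] for d in DIMENSIONS}
        edges.append((v1, v2, ratios))
    return edges


def propagate(new_v1_value, phi):
    """Size for v2 that keeps the original v1/v2 ratio (assumed rule)."""
    return new_v1_value / phi


sizes = {
    "sign_up": {"height": 30.0, "width": 120.0},
    "privacy_policy": {"height": 30.0, "width": 90.0},
    "terms": {"height": 30.0, "width": 60.0},
}
edges = consistency_edges(["sign_up", "privacy_policy", "terms"], sizes)
```

With ϕ = 1.0 for height, raising the height of one button to 48dp would raise the others to 48dp as well, which keeps the group visually uniform.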
5.2.1.1.2 Dependency Relationships A dependency relationship exists between two elements if one element can control or limit the other’s ability to change or expand. The goal of
modeling this type of relationship is to identify the set of nodes and properties that, given a
change to a property p for an element v, may need to be changed to accommodate the change
in v. To create the dependency edges, my approach iterates over the nodes in the VH of the UI.
For each node v, my approach iterates through the set of its ancestors (i.e., containing layouts),
starting from its parent. Then, based on the size attributes defined for that ancestor, my approach
will either create a dependency edge with that ancestor or skip it and move on to the analysis
of the next ancestor. My approach determines that based on the following three cases. First, if
the size attribute is set as an exact number, then my approach marks that ancestor as the target
node for the dependency edge. The reason for that is that an ancestor with a size attribute set as
a fixed number does not change in response to the change in the size of v. Therefore, the size of v
is dependent on this ancestor. Second, if the size attribute for the ancestor is set as wrap_content,
then my approach will only mark that ancestor as the target node for the dependency edge if
the size attribute for v was set as match_parent. That is because this is the only case where that
ancestor may need to be changed directly as v’s size cannot be directly increased. Third, if the
ancestor size is set as match_parent, then my approach skips that ancestor and moves to the next
one. The reason is that an ancestor vp with size set as match_parent follows the size of its own
parent. Therefore, if vp’s parent increases, then vp’s size will increase, allowing v to change. The
dependency edge is created between v and the identified ancestor in the SRG as determined by
these three cases. The approach then creates an edge tuple to capture the relationship between
the two nodes and calculates the ratio ϕ for that tuple by dividing the value of v by the value
of the identified ancestor.
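The three cases for choosing the target of a dependency edge can be sketched as a walk up the ancestor chain. The function below handles a single dimension and represents size attributes as plain strings; both choices, along with the element IDs in the example, are simplifying assumptions.

```python
KEYWORDS = ("wrap_content", "match_parent")


def dependency_target(element_size_attr, ancestors):
    """Return the ID of the ancestor that constrains the element, or None.

    ancestors: list of (ancestor_id, size_attr) ordered from parent to root.
    """
    for ancestor_id, ancestor_size_attr in ancestors:
        if ancestor_size_attr not in KEYWORDS:
            # Case 1: exact size (e.g., "56dp"); it will not grow with v.
            return ancestor_id
        if ancestor_size_attr == "wrap_content":
            # Case 2: only a target when v itself is match_parent.
            if element_size_attr == "match_parent":
                return ancestor_id
            continue
        # Case 3: match_parent follows its own parent, so keep walking up.
    return None


# Hypothetical chain: the parent row is match_parent, the toolbar is fixed.
target = dependency_target("wrap_content", [("row", "match_parent"), ("toolbar", "56dp")])
```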
5.2.1.2 UI Dependency Analysis
The next step in my approach is to analyze the dependencies in the SRG and identify the elements
and properties that must be included in the FixSet. To do that, my approach computes a subgraph
for each visually related group that contains an SBII. The subgraph identifies the elements that
will be targeted by the repair methodology in Section 5.2.2. The edges of the subgraph and their
corresponding annotations identify properties to be considered for the repair and provide information on how to propagate that repair to the other elements in the subgraph.
To compute the subgraphs, my approach iterates over each visually related group that contains an SBII. For each such group g, my approach identifies the set of elements and properties
that may need to be changed to resolve the SBII in g. To do this, my approach computes a subgraph of the SRG that corresponds to the transitive closure of the graph originating from the
element va in g, where va represents the view that has the SBII. If g contains more than one element with an SBII, my approach chooses, as va, the element that requires the largest size increase
to fix its SBII. The intuition of selecting va in this way is that the largest size increase applied to
this element will also likely repair the other elements that require a smaller size increase. The
computed subgraphs are represented as a set of tuples, A, with each tuple of the form of ⟨i,sb⟩
where i represents the ID of the element va ∈ V that contains an SBII, and sb represents the subgraph of the SRG computed for that node. These subgraphs essentially make up the FixSet for
each visually related group with an SBII.
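A minimal sketch of this analysis, assuming the SRG is available as an adjacency mapping from each node to its successors: va is chosen as the element needing the largest increase, and the subgraph is everything reachable from va. The element IDs and required increases are hypothetical.

```python
from collections import deque


def pick_va(group, required_increase):
    """Element in the group that needs the largest size increase."""
    return max(group, key=lambda v: required_increase.get(v, 0.0))


def reachable_subgraph(adjacency, start):
    """Nodes and edges reachable from start, found by breadth-first search."""
    seen = {start}
    edges = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            edges.append((node, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen, edges


adjacency = {
    "sign_up": ["privacy_policy", "terms", "footer"],  # consistency + dependency
    "footer": ["root"],                                # dependency
    "logo": ["root"],                                  # unrelated to the SBII
}
increase = {"sign_up": 18.0, "privacy_policy": 12.0, "terms": 10.0}
va = pick_va(["privacy_policy", "sign_up", "terms"], increase)
fix_nodes, fix_edges = reachable_subgraph(adjacency, va)
```

Elements that are not reachable from va, such as the hypothetical "logo" node here, stay out of the FixSet, which is what keeps the search space small.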
5.2.2 Generating Repaired UIs
In this section, I describe my multi-objective genetic search approach to repairing SBIIs in the UI.
My approach employs a multi-objective genetic search with the goal of repairing the SBIIs while
preserving the original design and aesthetics of the UI. The search-based technique I define follows the general approach of a genetic search algorithm. Therefore, I only give a brief overview
below of the overall flow of the search and then describe the unique parts (the fitness function,
problem representation, initial population, and repair generation) in more detail. In each iteration of the search, my approach evaluates the candidate repairs in the current population, using
the metrics defined in Section 5.2.2.1, then performs selection, uniform crossover, and uniform
random mutation. The approach terminates the search once the maximum number of predefined
generations has been reached or the approach reaches a fixed point where no improvement in
the population has been observed for multiple generations.
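The overall flow of such a search can be sketched as below. The text names selection, uniform crossover, uniform random mutation, and the two termination criteria; everything else here is an assumption for illustration, including the use of tournament selection, the parameter values, the representation of a chromosome as a list of numbers, and the treatment of fitness as a cost to minimize.

```python
import random


def uniform_crossover(parent_a, parent_b, rng):
    """Swap each gene between the two parents with probability 0.5."""
    child_a, child_b = [], []
    for gene_a, gene_b in zip(parent_a, parent_b):
        if rng.random() < 0.5:
            gene_a, gene_b = gene_b, gene_a
        child_a.append(gene_a)
        child_b.append(gene_b)
    return child_a, child_b


def uniform_mutation(chromosome, low, high, rate, rng):
    """Replace each gene, with probability rate, by a uniform random value."""
    return [rng.uniform(low, high) if rng.random() < rate else g for g in chromosome]


def tournament(population, cost, rng, k=2):
    """Assumed selection operator: best of k randomly drawn candidates."""
    return min(rng.sample(population, k), key=cost)


def search(population, cost, low, high, max_generations=100, patience=10,
           rate=0.1, seed=0):
    rng = random.Random(seed)
    best = min(population, key=cost)
    stale = 0
    for _ in range(max_generations):          # criterion 1: generation limit
        offspring = []
        while len(offspring) < len(population):
            a = tournament(population, cost, rng)
            b = tournament(population, cost, rng)
            for child in uniform_crossover(a, b, rng):
                offspring.append(uniform_mutation(child, low, high, rate, rng))
        population = offspring[:len(population)]
        candidate = min(population, key=cost)
        if cost(candidate) < cost(best):
            best, stale = candidate, 0
        else:
            stale += 1
            if stale >= patience:             # criterion 2: no improvement
                break
    return best
```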
5.2.2.1 Fitness Function
The goal of the fitness function is to guide the search to a solution that resolves as many SBIIs as
possible. However, solutions that resolve SBIIs may do so by increasing the size of touch targets
dramatically and in a way that distorts the UI. Therefore, I design the fitness function to include
not only metrics that guide the search to a UI with improved accessibility but also metrics that
penalize solutions that cause the resulting UI to significantly differ from the original or introduce
new design problems. First, I define the Accessibility Heuristic objective, which is derived from
the definition of SBIIs, to guide the search in improving UI accessibility. Additionally, I introduce three other objectives that aim to minimize UI changes and distortions. These objectives are
based on my experiments with the automatically generated touch target size adjustments, which
identified several aspects of repairs that, when penalized, helped my approach achieve this goal.
These objectives are as follows: changes to the positional relationships, spacing between elements, and the overall amount of changes in the UI. The fitness function for a candidate repair
is calculated as a weighted sum of these four objectives. Note that the two objectives related to
changes in the positional relationships between elements and the overall amount of changes are
in alignment with the two universal aesthetic-related objectives described in Section 3.2.3.
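Written out, the weighted sum could take the following form. The symbols are introduced here only for illustration and do not appear in the original text: c is a candidate repair, the four f terms are the accessibility, positional relationship, spacing, and amount-of-change metrics, and the w terms are their weights. The sketch assumes each metric is expressed as a cost, so lower values of F are better.

```latex
F(c) = w_{a}\, f_{\mathrm{acc}}(c)
     + w_{p}\, f_{\mathrm{pos}}(c)
     + w_{s}\, f_{\mathrm{space}}(c)
     + w_{c}\, f_{\mathrm{change}}(c)
```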
Accessibility Heuristic: The metric for this objective is the primary measure
of how good a solution is with respect to improving the identified SBIIs. Ideally, this could be
measured by inserting a solution into the app and then running Google’s Accessibility Scanner
[2] on the modified app and calculating a new accessibility score based on its report. However,
the process of running the scanner can take a significant amount of time. Therefore, I utilized
an approximation of the accessibility score. The approach inserts a candidate solution into the
app, then scans the rendered UI to identify the actual size of each touch target that had been
reported as having an SBII and is still below the minimum threshold for accessible size. Simply
using the number of such touch targets as the metric is insufficient since it defines a step function
that cannot distinguish between two solutions that leave the same number of SBII violations,
even though one may be closer to resolving them. To convert this information into a gradient function with more
useful notions of correctness, I calculate the amount of size that the touch targets would need to
increase to satisfy the touch target minimum. This enables the approach to value solutions that
are getting closer to a satisfying solution even if the resulting UI has not yet completely resolved
the detected SBIIs.
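The difference between the step function and the gradient function can be seen in a small sketch. The 48dp minimum comes from the guideline discussed in Section 5.1; the function names and the two candidate solutions are hypothetical, and summing the shortfall over both dimensions is an assumption about how the sizes are aggregated.

```python
MIN_DP = 48.0


def unresolved_count(targets):
    """Step function: number of touch targets still below the minimum."""
    return sum(1 for w, h in targets if w < MIN_DP or h < MIN_DP)


def size_shortfall(targets):
    """Gradient function: total size still missing across all targets."""
    return sum(max(0.0, MIN_DP - w) + max(0.0, MIN_DP - h) for w, h in targets)


# (width, height) in dp of the reported touch targets after each candidate.
solution_a = [(48.0, 30.0), (48.0, 40.0)]
solution_b = [(48.0, 46.0), (48.0, 47.0)]
```

Both candidates leave two violations, so the count cannot rank them, while the shortfall (26dp versus 3dp) shows that the second candidate is much closer to a full repair.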
Maintain Positional Relationships: Changes to the size of touch targets can cause changes
to the relative position and alignment of elements in the UI as they move to accommodate the
repaired elements’ changed size. In some cases, this can significantly distort the original layout
of the UI. Therefore, I introduce two metrics that favor solutions that result in lower amounts
of change in the relative positioning and alignments of its elements with respect to the original
UI. My approach realizes these metrics using the following steps: First, my approach extracts
the position of each element in the VH of the original UI and identifies the type of alignment
and relative position it has with the other elements. For relative position, any two elements
may have the following relationships: (1) intersection, (2) containment, (3) above, (4) below, (5)
to the left of, or (6) to the right of. For alignment, any two elements may be (1) top aligned,
(2) bottom aligned, (3) left aligned, or (4) right aligned. These relationships can be determined
by comparing the x and y coordinates of each element’s MBR. For example, two elements are
bottom-aligned if the y values of their bottom-right and bottom-left coordinates are equal. The
same process is repeated for the UI after a candidate solution has been applied to it. Then, the two
sets of alignments and relative positions are compared. If a difference exists, then my approach
computes the magnitude of the change by computing the minimum Euclidean distance between
the current position of the changed element and where it would need to be located in order to
restore the violated relationship. For the bottom-aligned example, this would be the absolute
difference between the y coordinates. My approach sums the differences for all elements that
have violated a prior alignment or relative position relationship and reports this as the metric for
the candidate solution.
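The alignment half of this metric can be sketched as follows; relative position relationships (above, below, containment, and so on) are omitted to keep the example short. The `Rect` type, the element IDs, and the coordinates are hypothetical, and coordinates are compared for exact equality, which a real implementation might relax with a tolerance.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Minimum bounding rectangle (MBR) of an element."""
    left: float
    top: float
    right: float
    bottom: float


SIDES = ("top", "bottom", "left", "right")


def alignments(a, b):
    """Sides on which two MBRs are aligned."""
    return {s for s in SIDES if getattr(a, s) == getattr(b, s)}


def alignment_change(original, repaired):
    """Sum of coordinate differences over alignments that held originally.

    original, repaired: dict mapping element ID -> Rect. An alignment that
    still holds contributes 0; a violated one contributes the distance the
    element would have to move to restore it.
    """
    total = 0.0
    ids = sorted(original)
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            for side in alignments(original[first], original[second]):
                total += abs(getattr(repaired[first], side)
                             - getattr(repaired[second], side))
    return total


before = {"a": Rect(0, 0, 100, 30), "b": Rect(120, 0, 220, 30)}
after = {"a": Rect(0, 0, 100, 48), "b": Rect(120, 0, 220, 30)}
```

In this example the two elements start out top-aligned and bottom-aligned; growing only "a" breaks the bottom alignment by 18 units, which is the value the metric reports.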
Minimum Spacing Between Elements: Touch targets that expand in size can do so by expanding into the space between each pair of touch targets. However, doing so can have an impact
on the layout of a UI and cause it to look very different from its original design. Therefore, I introduce a metric to favor solutions that do not cause the spacing between any pair of elements to
become too small. To realize this metric, the approach computes the distance between the MBRs
of each pair of touch targets, and if the resulting space is below the minimum value required
by Google’s Material Design guidelines, then the solution is penalized. This allows solutions to
utilize some of the space between touch targets but only penalizes them if the space falls below
this minimum value. This realization of the metric reflects my observations that many SBIIs could
not be repaired without significant distortion without utilizing at least some of the space between
touch targets.
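A minimal sketch of such a spacing penalty is shown below. The minimum spacing is left as a parameter because the text does not state its value; the 8.0 used in the example is only a placeholder. Measuring the penalty as the amount by which a gap falls short of the minimum is likewise an assumption.

```python
import math
from itertools import combinations


def gap(a, b):
    """Shortest distance between two MBRs given as (left, top, right, bottom).

    Overlapping or touching rectangles have a gap of 0.
    """
    dx = max(0.0, a[0] - b[2], b[0] - a[2])
    dy = max(0.0, a[1] - b[3], b[1] - a[3])
    return math.hypot(dx, dy)


def spacing_penalty(rects, min_spacing):
    """Total shortfall over all pairs whose gap is below min_spacing."""
    return sum(max(0.0, min_spacing - gap(a, b))
               for a, b in combinations(rects, 2))


# Three vertically stacked touch targets after a hypothetical repair.
top = (0.0, 0.0, 100.0, 48.0)
middle = (0.0, 52.0, 100.0, 100.0)
bottom = (0.0, 120.0, 100.0, 168.0)
penalty = spacing_penalty([top, middle, bottom], min_spacing=8.0)
```

Only the first pair is penalized here: its 4-unit gap is below the placeholder minimum, while the other gaps are large enough to be used freely by the repair.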
Minimize the Amount of Change: A drawback of my accessibility heuristic is that it favors
solutions that always increase the size of the touch targets. This can favor solutions that unnecessarily increase the size of the touch targets, which in turn increases the amount of distortion
relative to the original UI. To penalize these changes, my approach defines a metric that favors
solutions that minimize the amount of change in the size of the touch targets. To realize this
metric, my approach compares the size of each touch target in the original UI (using the element’s
MBR) to the size of the same touch target in the UI produced by a candidate repair.
The sum of all such changes across the touch targets is used as the metric.
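As a small illustration, the amount of change could be computed as the sum of absolute width and height differences per touch target. Combining the two dimensions by simple addition is an assumption, and the element ID and sizes are hypothetical.

```python
def amount_of_change(original, repaired):
    """Sum of absolute size changes over all touch targets.

    original, repaired: dict mapping element ID -> (width, height).
    """
    return sum(abs(repaired[i][0] - original[i][0])
               + abs(repaired[i][1] - original[i][1])
               for i in original)


original_ui = {"sign_up": (100.0, 30.0)}
minimal_repair = {"sign_up": (100.0, 48.0)}
oversized_repair = {"sign_up": (140.0, 80.0)}
```

Both repairs bring the button above 48dp, so the accessibility heuristic treats them as equally good; this metric is what separates them, scoring the minimal repair at 18 and the oversized one at 90.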
5.2.2.2 Solution Representation and Initial Population
Each candidate repair (chromosome) is comprised of a set S of tuples (each tuple corresponds to
a gene), where each tuple is of the form ⟨i, p, v⟩. In this tuple, i can refer to either a group (as
defined in Section 5.2.1) or an individual element; v denotes the amount of change or adjustment
that the repair will make to i; and p indicates the property of i to which v will be applied and can
be the height, spacing, or width. My candidate repairs allow the approach to change an entire
group (if i refers to a group) with one adjustment or an individual element. The group-identified
genes allow the approach to explore solutions that maintain consistency, while the individual-identified
genes represent elements that have a size dependency relationship with the element
containing the SBII.
For a given UI that contains SBIIs, my approach defines the genes that will be included in the
chromosome in the following way. For each visually related group g and the subgraph identified
for g in Section 5.2.1, the approach first identifies the subset of properties (e.g., height or width)
that might need to change to repair the SBIIs. These properties can be identified based on the
violation reported in the SBII detection report. For each such property p, the approach defines a
gene for g and a gene for each element that is connected to va in the subgraph via a dependency
edge. For example, if height is the property that needs to be changed for g, and vb is the node
connected to va via a dependency edge, then my approach will create two tuples in S. The first
tuple is created with i referring to the group g to which va belongs and p = height. The second
tuple is created with i referring to vb and p = height. Note that the value field v of each tuple
is undefined at this point since this step only defines the chromosome structure. Based on this
chromosome structure, my approach then creates an initial population of size n of candidate
solutions. For each of the n solutions, the approach creates a chromosome with the gene structure
defined using the above process. Then, the approach iterates over each gene and initializes its
value field v by sampling a random value from a Gaussian distribution based on the element’s value.
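The chromosome structure and initial population described above can be sketched as below. The names `Gene`, `chromosome_structure`, and `initial_population` are hypothetical. Centering the Gaussian on the current property value and using a relative standard deviation (`rel_sigma`) is one possible reading of "based on the element's value" and is an assumption, as is storing the sampled value directly in v.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gene:
    target: str                     # i: a group id or an individual element id
    prop: str                       # p: "height", "width", or "spacing"
    value: Optional[float] = None   # v: undefined until the gene is initialized


def chromosome_structure(group_id, dependents, properties):
    """One gene for the group plus one gene per dependent element, per property."""
    genes = []
    for p in properties:
        genes.append(Gene(group_id, p))
        genes.extend(Gene(e, p) for e in dependents)
    return genes


def initial_population(structure, current_values, n, rel_sigma=0.1, rng=None):
    """Create n chromosomes that share the structure but have sampled values.

    current_values maps (target, prop) to the current value in the UI.
    Assumption: each v is drawn from N(current, (rel_sigma * current)^2).
    """
    rng = rng or random.Random()
    population = []
    for _ in range(n):
        chromosome = []
        for g in structure:
            mu = current_values[(g.target, g.prop)]
            chromosome.append(Gene(g.target, g.prop, rng.gauss(mu, rel_sigma * abs(mu))))
        population.append(chromosome)
    return population
```

For the height example in the text, `chromosome_structure("g", ["vb"], ["height"])` yields the two tuples described, one for the group and one for vb, both with v still unset.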
5.2.2.3 Generating a Repair
When a candidate solution is ready to be evaluated by the fitness function, my approach converts
the solution to a repair that can be inserted into the UI. Given a candidate solution c, my approach
performs the following steps: (1) For each gene in c, my approach propagates the change represented by the gene to all of the elements in the subgraph. (2) Then my approach again traverses
the subgraphs, capturing the changes in a set R of concrete repairs, each of which is represented
as a tuple ⟨xr, pr, ar, vr⟩, where xr is the XPath of the node in the VH that needs to be changed,
pr is the property to be changed, ar is the attribute that needs to be modified when applying the
change to the layout files, and vr is the new value for pr of xr. After generating R, the approach
(3) rewrites the app’s layout files and generates a new APK that can be run. These three steps are
also used for generating the final and best solution identified by my approach.
In the first step, each gene in the candidate solution (c) is applied to each of the subgraphs.
To do this, my approach transitively traverses each outgoing dependency and consistency edge
in each subgraph and, for each edge vr → vt traversed, computes vt’s new value of
p by multiplying the value assigned to p of vr by the ratio, ϕ, defined by the edge tuple between
vr and vt. My approach then uses this new value of p to compute new values for the other size-related
properties defined for vt, such as padding and minimum size. This ensures that the ratio
between these properties and p is maintained after the size change. For each property changed
for vt, my approach creates a corresponding tuple in R.
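The ratio-based propagation in this first step can be illustrated with a short sketch. The adjacency-list encoding of the subgraph and the function name are assumptions made for the example; the node names vc and vd are hypothetical, and the recomputation of padding and minimum size is omitted.

```python
from collections import deque


def propagate(edges, start, new_value):
    """Propagate a new value of property p through a subgraph.

    edges: {source: [(target, phi), ...]}, where phi is the ratio stored
    on the edge between source and target.
    Returns {node: new value of p}. Each node is assigned once, the first
    time it is reached (a simplifying assumption for this sketch).
    """
    values = {start: new_value}
    queue = deque([start])
    while queue:
        src = queue.popleft()
        for dst, phi in edges.get(src, []):
            if dst not in values:
                values[dst] = values[src] * phi   # target value = source value * phi
                queue.append(dst)
    return values
```

For example, with edges va → vb (ϕ = 1.0), va → vc (ϕ = 1.5), and vc → vd (ϕ = 2.0), setting va to 48 gives vb = 48, vc = 72, and vd = 144.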
In the second step, my approach once again traverses the set of identified subgraphs. For
each subgraph, my approach sets the values of the corresponding tuples in R with the value set
for the node in the subgraph. For each node changed in the subgraph, my approach determines
the value of ar based on a predefined mapping between each property and the corresponding
attribute used in Android for that property. This is a direct mapping except in two cases. First,
when the value of the attribute that p is mapped to is defined as wrap_content, then instead
of mapping p to that attribute, my approach maps p to its corresponding min attribute (e.g.,
android:minHeight). Second, if the value of the attribute is set as match_parent, then the change
of p cannot be directly applied to the attribute in vt. Instead, this change is indirectly achieved
by propagating the change, using the dependency edges, to a containing node.
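The two special cases can be expressed as a small lookup. The mapping tables below are an assumed stand-in for the predefined mapping mentioned above, covering only height and width; the attribute names themselves are standard Android layout attributes, and the function name is hypothetical.

```python
# Assumed direct mapping from a size property to its Android layout attribute.
DIRECT = {"height": "android:layout_height", "width": "android:layout_width"}
# Corresponding min attributes, used when the direct attribute is wrap_content.
MIN = {"height": "android:minHeight", "width": "android:minWidth"}


def target_attribute(prop, current_value):
    """Return the attribute to rewrite for prop, or None.

    None signals that the change cannot be applied to this node directly
    and must instead be propagated to a containing node.
    """
    if current_value == "wrap_content":
        return MIN[prop]
    if current_value == "match_parent":
        return None
    return DIRECT[prop]
```

A node declared with a fixed size such as `40dp` is rewritten in place, one declared as `wrap_content` receives a minimum size instead, and one declared as `match_parent` defers to its container.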
In the third and final step, my approach iterates over the set of changes in R, and for each,
my approach modifies the corresponding attributes in the app’s layout files. To map components
between static and dynamic VHs and locate elements during the app rewriting, I employ a matching mechanism based on elements’ IDs, XPaths, and other related properties. To help increase
the accuracy of the mapping, I preprocess the app layout files and assign unique identifiers to UI
elements. The approach then renders the newly generated UI on an Android device and extracts
its information (i.e., VH and screenshot). Using the extracted information, the approach evaluates
the candidate repair using the fitness function discussed above and ranks the candidate repairs
based on their fitness score.
5.3 Evaluation
To evaluate SALEM, I designed experiments to answer the following research questions:
RQ1: How effective is SALEM in repairing SBIIs in Android applications?
RQ2: How long does it take for SALEM to generate repairs for SBIIs?
RQ3: How does SALEM impact the visual appeal of Android applications after applying the
selected repair?
5.3.1 Implementation
I implemented my approach in Java as a prototype tool, SALEM. The implementation uses Apktool [16]
to disassemble APK resource files and repack the modified files into a new APK file. To collect
UI information, I used UI Automator [77] and ADB [14] to dump the layout hierarchy files and
capture the screenshots when running an app on an Android Emulator based on Android 8.0. For
detecting the SBIIs in an app, I used Google Accessibility Scanner [2] and then filtered its output
to capture SBIIs. To get the style information and build the VH for the UIs, I used a tool based
on Layout Inspector [46] in addition to UI Automator. I ran the experiments with the following
configurations: population size = 9, generation size = 8. I ran SALEM on an AMD Ryzen 7 2700X
64-bit machine with 64GB memory, running Ubuntu Linux 18.04.4 LTS.
5.3.2 Subjects
I conducted the experiments on a set of 58 UIs from 48 real-world mobile applications gathered
from a dataset of applications used in a recent large-scale study on accessibility issues in mobile
applications [8]. This dataset consists of 1,000 applications collected from across 33 categories
in the Google Play store [35]. To select the subjects, I ran an accessibility evaluation tool on the
dataset [8] and randomly selected 48 applications that contained SBIIs. I confirmed these reported
SBIIs by manually verifying the size of each element reported. For each of these 48 apps, I selected
the UIs that the detection tool reported to have at least one SBII. From the list of SBIIs in each UI,
I filtered out the ones that were part of WebViews or AdViews, which SALEM does not handle because
repairing them requires modifying web content. In total, I had 220 SBIIs in
58 unique UIs across the 48 subjects.
5.3.3 Experiment One
5.3.3.1 Protocol
To address RQ1 and RQ2, I ran SALEM on each subject’s faulty UIs. To evaluate the
effectiveness of SALEM, for each UI I calculated its number of SBIIs and its accessibility rate
before and after the repair. The number of SBIIs was determined based on the reports from Google
Accessibility Scanner [2]. The accessibility rate was calculated as the ratio of the number of touch
targets that are free of SBIIs over the total number of touch targets in the UI. This is a widely
used metric to measure and rank the accessibility of UIs in a mobile app [8, 65]. To address RQ2,
I measured the time it took to run SALEM during the experiment.
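As a worked illustration of the accessibility rate, the sketch below applies the definition to the totals reported in Table 5.1 (305 touch targets, 220 SBIIs before repair, 2 after). It assumes each SBII corresponds to one distinct touch target; the function name is hypothetical.

```python
def accessibility_rate(touch_targets, sbiis):
    """Fraction of touch targets that are free of SBIIs."""
    return (touch_targets - sbiis) / touch_targets


before = accessibility_rate(305, 220)   # 85 / 305
after = accessibility_rate(305, 2)      # 303 / 305
print(f"before: {before:.0%}, after: {after:.0%}")
```

Rounded to whole percentages, these values agree with the 28% and 99% overall rates reported for the original and repaired UIs.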
5.3.3.2 Presentation of Results
The results for effectiveness (RQ1) and time (RQ2) are shown in Table 5.1. The “Original” and
“Repaired” columns correspond to the results before and after applying SALEM’s repairs. I list
the number of SBIIs and the resulting accessibility rate under “# of SBIIs” and “Accessibility Rate”
for the original and repaired versions. I also calculated the total, average, median, maximum, and
minimum (rounded to whole numbers) across all 58 UIs for each of the metrics.
Table 5.1. Results for SALEM’s effectiveness in repairing SBIIs (RQ1) and its run time (RQ2).
                          Original                  Repaired
          # of Touch      # of     A11y Rate        # of     A11y Rate      Time
          Targets         SBIIs    (%)              SBIIs    (%)            (mins)
All       305             220      28               2        99             579
Average   5               4        26               0        99             9
Median    5               4        17               0        100            8
Max       20              17       93               1        100            19
Min       1               1        0                0        80             6
5.3.3.3 Discussion of Results
Overall, the results of my experiment show that SALEM was able to significantly reduce the
number of SBIIs in the subject apps. Out of the total number of 220 reported SBIIs, SALEM was
able to completely fix 218 (99%) of them. The total accessibility rate across all 58 UIs after the
repair was 99%, compared to only 28% before the repair. These results indicate that SALEM was
effective in repairing the SBIIs and improving the accessibility of apps. I investigated the two
SBIIs in two different UIs that my approach could not repair and found they were UI elements
whose size properties are defined in code. These SBIIs can only be repaired by analyses that
would require analyzing and rewriting the source code, which is not handled by my approach.
The results for RQ2 show that SALEM generated repairs within a reasonable time: about 9 minutes
per UI on average and at most 19 minutes (Table 5.1). I
analyzed the runtime breakdown of each individual step in SALEM and found that SALEM spent
roughly 98% of its time evaluating the candidate repairs, which requires compiling a new APK
for each repair and then running it on the emulator. This part can be further optimized by
running these evaluations in parallel (e.g., using Amazon AWS).
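The suggested parallelization could look roughly like the sketch below. It is not part of SALEM: `evaluate` is a hypothetical callable standing in for the build, install, and scoring of one candidate repair, and a thread pool is assumed to be adequate because that work would be dominated by external tools rather than Python computation.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_all(candidates, evaluate, workers=4):
    """Evaluate candidate repairs concurrently.

    Returns the fitness results in the same order as the candidates.
    Each worker would need its own emulator or device in practice.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(evaluate, candidates))
```

Since the candidates within one generation are independent of each other, the achievable speedup is bounded mainly by the number of emulators available.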
5.3.4 Experiment Two
5.3.4.1 Protocol
To answer RQ3, I conducted a user-based study where I asked users to compare the original and
repaired UIs. The goal of this evaluation was to understand how my repairs affect the UI’s visual
layout from a user’s perspective. The surveys presented side-by-side screenshots of the original
and the repaired UIs, each calibrated to be shown in the resolution of the Nexus 6P mobile device
that was used to run the experiment. This device has a display and resolution that is within
the range of the most popular Android mobile screen sizes [69]. The order of the screenshots’
placement was randomized and only labeled Version 1 and Version 2.
Each survey was divided into two parts. For the first part, I wanted to measure the participants’ general opinion of the original and repaired versions of the UIs. I presented the two
versions and asked each participant to (1) rate their preference on a 5-point Likert scale; and (2)
rate each UI’s attractiveness on a numeric scale from 1 to 10. I also asked participants to provide
a written explanation of their answers to understand the reason for their preference. For the second part of the survey, I wanted to measure the participants’ opinions of the two versions after
knowing about the accessibility improvements. I presented the same set of UI screenshots as the
first part, but this time, I highlighted the SBIIs on each screenshot in the same way as they would
be shown in Google’s Accessibility Scanner [2]. I also presented a short description explaining
the issues and the functionalities that are activated by each touch target affected by the SBIIs. I
then asked the participants to again rate their preference between the original and the repaired
on a 5-point Likert scale.
I conducted the survey on participants from two sources: (1) Amazon Mechanical Turk (AMT),
a crowd-sourcing platform that has been widely used to conduct user studies [10]; and (2) a group
of individuals with motor disabilities, specifically those with paralysis and limited hand mobility.
These participants were recruited through a charity foundation based in California, USA, that
I have worked with before. For the AMT participants, I separated the responses into two age
categories: those under 55 years old and those 55 years old or older. To ensure the participants
understood the instructions, I limited the locality to the U.S. and Canada. I chose only those
workers who had been rated as highly reliable (with an approval rating of over 98%) and who
had completed over 5,000 approved tasks. I also followed AMT best practices by employing a
captcha and a check question. In total, I had 122 responses from the 55- group, 24 responses from
the 55+ group, and 20 responses from the group with motor disability.
[Bar chart. Percentage of ratings per response option, for each group before and after accessibility awareness:]
                Strongly prefer   Prefer      No            Prefer      Strongly prefer
                original          original    preference    repaired    repaired
Before (55-)    8                 12          57            17          7
After (55-)     3                 8           50            23          15
Before (55+)    1                 4           56            34          5
After (55+)     1                 2           41            36          20
Before (SCI)    1                 5           48            31          15
After (SCI)     0                 0           51            30          19
Figure 5.3: Participants’ preference between the original and repaired UI versions
5.3.4.2 Presentation of Results
The results from the user study are shown in Figure 5.3. The bar chart shows the distribution
of the 5-point Likert scale preference ratings where the lighter bars are the preference ratings
before accessibility awareness and the darker bars are those after accessibility awareness. I used
solid bars to represent the group that is under 55 years old (55- group), striped bars to represent
the group that is 55 years old or older (55+ group), and dotted bars to represent the group of users
with motor disability (SCI group).
In terms of average attractiveness, the participants rated the original (O) slightly higher than
the repaired (R) with an average of (O: 6.43 R: 6.40) among the 55- group. For the 55+ and the
SCI groups, the repaired version had a slightly higher rating of (O: 6.11 R: 6.25) and (O: 6.46 R:
6.94), respectively. The rating difference for the 55- group was not statistically significant (p-value
= 0.57563 > 0.05), and the rating differences for the 55+ and the SCI groups were statistically
significant (p-values = 0.03327 < 0.05, and 0.00891 < 0.05, respectively). I used the Wilcoxon
Signed-Rank test for the analysis because I was comparing paired ratings from two dependent
samples, and these ratings were not normally distributed.
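For readers unfamiliar with the test, the sketch below computes the Wilcoxon signed-rank statistics for paired ratings in plain Python. It is an illustration of the technique rather than the analysis script used in the study: it drops zero differences and assigns average ranks to ties, which are common conventions assumed here, and it does not compute a p-value (a library routine such as SciPy's `scipy.stats.wilcoxon` can be used for that).

```python
def wilcoxon_w(x, y):
    """Return (W_plus, W_minus) for paired samples x and y.

    Differences are y - x; pairs with a zero difference are dropped.
    Tied absolute differences receive the average of their ranks.
    """
    d = [b - a for a, b in zip(x, y) if b != a]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of equal absolute differences.
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1      # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    w_minus = sum(r for r, v in zip(ranks, d) if v < 0)
    return w_plus, w_minus
```

The smaller of the two sums is the usual test statistic; a value far below what chance would produce indicates that one version was rated consistently higher than the other.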
Figure 5.4: Example that demonstrates SALEM. The left screenshot shows the original UI, the
middle screenshot highlights the SBIIs detected by Google Accessibility Scanner, and the right
screenshot shows the UI after applying SALEM’s repair.
5.3.4.3 Discussion of Results
The result from the user study showed that SALEM was very successful in maintaining the visual
appeal of its repaired UIs. For preference, a majority of participants rated “No preference” when
deciding between the original and repaired versions. This is a very good indication that my repair
did not negatively affect user preference while it was able to fix almost all of the SBIIs. In fact,
across all three groups, when combining “No preference” with those that prefer the repaired UI,
my repair was favored in 90% of the ratings. I investigated the comments provided by the 10%
who did not prefer my repaired UI and found that the reason participants preferred the original
was that they perceived smaller UI components to be more attractive. Since this is a personal
preference, I do not think it undermines the quality of my repairs.
Participants preferred my repairs even more once they were aware of the implications of the
SBIIs. Across the three groups, I saw an average increase of 11% in ratings favoring the repaired UI. In particular, the number of ratings that “strongly prefer” the repaired UI doubled for the general 55-
group and quadrupled for the 55+ group. This is a very strong indication that participants value
accessibility and are willing to change their initial preference for the trade-off. I revisited those
10% that did not favor the repair and preferred a smaller layout to see whether their preference
changed. Interestingly, over half of them switched either to “No preference” or to preferring the
repaired version, leaving fewer than 5% still preferring the original after awareness. The comments from the participants who switched were overwhelmingly positive, expressing that they
were unaware of accessibility at first but had no problem adapting to the repaired UI for a greater
gain. One commented “... and it still looks good, and now it is workable.”
In addition to a positive impact on visual appeal, the user study also showed that my repair
was considered to be more accessible. I investigated the comments provided by the participants
to understand the reason for the repaired UI being both more attractive and preferable among the
55+ and SCI groups. I found, in general, that these groups perceived bigger UI components to be
better and more usable even before accessibility awareness. Many participants explained that the
bigger touch targets from the repaired UI could help them be more efficient and avoid mistakes
during interaction. Among the 19% of the SCI group that “strongly prefer” the repaired UI is a
quadriplegic participant who uses his knuckles instead of fingertips to activate touch screens. He
explained, “The larger spacing between lines would make it considerably easier for me to access each
input box with my knuckles.” These types of insights from actual mobile users show that SALEM
can be impactful in addressing accessibility.
5.3.5 Threats to Validity
External Validity: The first potential threat is that the selection of participants for the user-based study in my experiment may not be representative of individuals impacted by SBIIs. To
address this threat, I included an age question in the AMT surveys and sought users with
paralysis-related motor disabilities to ensure the participants were diverse in age and abilities.
A second threat is that the repaired UIs may not rate as well when displayed on screens with dimensions different from the one I used in the evaluation. This aspect of generalizability was not
tested in the evaluation. However, I believe that since my approach’s focus was on maintaining
relative visual relationships and Android uses a dynamic layout rendering approach, repairs on
screens with other dimensions would likely look similar from an aesthetic point of view.
Internal Validity: One potential threat is that screenshots used in the user-based study may
appear differently in size depending on the participants’ displays. To mitigate this threat, I asked
the participants to enter the display device they used for answering the survey and included only
those results with a screen PPI that would render the screenshots near the actual size of the Nexus
6P device’s UI that was used in the emulator to generate the screenshots.
Another potential threat is that users rated the UIs based on the screenshots without directly
interacting with the UIs on a mobile device. My decision to use screenshots was for the following
reasons. First, my user study does not ask users to evaluate apps’ usability, which would require
direct interaction with the apps. Instead, users are only asked to rate the attractiveness of the
rendered UIs. Second, the use of screenshots allows me to avoid any variations in the results that
may happen due to the differences in the participants’ mobile devices or their selected settings.
Third, the use of screenshots allows for easy comparison as users can view the two versions of
the UIs next to each other instead of having to install, run, and then uninstall different versions
of the subjects. Finally, screenshots are frequently used in user studies that attempt to evaluate
the attractiveness of UIs (e.g., [51, 52, 3, 47]).
Construct Validity: A potential threat is that my definition of SBIIs is dependent on the
reports of GATF and the Google Accessibility Scanner [2]. The use of this definition is reasonable
because it is based on Google’s own Material Design research [55]. The guidelines’ metric is
what is considered as the accessibility threshold by experts. As further validation, I analyzed the
severity of the repaired SBIIs to see how much larger they had to be in order to be considered
accessible. I found that the SBIIs needed an average 56% increase in their area. Of special note,
18% of the SBIIs required doubling their touch areas, and 5% of the SBIIs required an area increase
of over three times to become accessible. This indicates that many of the repaired SBIIs were
undersized and required significant size increases to become accessible.
Another potential threat is that the attractiveness and preference ratings by participants are
subjective. To mitigate this threat, my survey is designed to measure relative values with either side-by-side comparison or before-and-after repair versions for the activities. This ensures
that the same pair of activities receives consistent ratings even though different participants may rate
according to different standards.
5.4 Conclusion
This chapter partially confirms the hypothesis that a repair framework utilizing UI models and a
multi-objective genetic search approach can effectively repair layout accessibility issues. Specifically, I introduced SALEM, an instantiation of the framework for automated repair of SBIIs in
mobile apps. These issues occur when touch targets fail to meet the size threshold set by accessibility guidelines. SBII is a very common accessibility issue in mobile apps [8, 65]. It can
significantly impact the accessibility of mobile applications and hinder users’ ability to interact
with and access apps’ functionality. SALEM repairs SBIIs by first constructing an SRG that captures size dependency and consistency relationships in the UI. Utilizing the SRG, SALEM identifies
the elements and properties that need to be changed to fix the detected issues. For each identified SBII, SALEM computes a transitive closure of the graph, starting from the node representing the
element impacted by the issue. SALEM then employs a multi-objective genetic approach to generate potential candidate repairs. The fitness function, used for evaluating and ranking repairs,
is designed based on a set of UI qualities defined by the problem domain and the characteristics
expected to be preserved in the repaired UIs.
In the evaluation, SALEM was able to successfully repair 99% of SBIIs in real-world applications and was able to improve the overall accessibility rate of these applications from 28%
to 99% after repair. The quality of the repaired UIs was assessed through a user study involving older adults, individuals suffering from paralysis with limited hand mobility, and able-bodied
participants under 55 years of age. The user study results showed that SALEM was highly successful in preserving the aesthetics of the repaired UIs. Across all three participant groups, my
repair was favored in 90% of the ratings. Furthermore, participants preferred my repairs
even more once they were aware of the implications of the SBIIs. This successful instantiation
and evaluation contribute toward confirming my dissertation hypothesis.
Chapter 6
Related Work
6.1 Accessibility and Usability of Mobile Applications
Accessibility for mobile applications has been the focus of many recent studies. Chen et al. [22]
proposed an app exploration tool to scan mobile applications for accessibility issues and used
that tool to collect a dataset of more than 80,000 accessibility issues from over 2,000 unique Android apps. Using this dataset, the authors conducted an empirical study to understand the type
of accessibility issues, the UI components they impact, and their severity from the end-users’
perspective. Ross et al. [65] proposed an epidemiology-inspired framework for understanding
accessibility issues in mobile apps. Using this framework, they performed a large-scale empirical analysis of more than 9,000 Android applications to identify the type of accessibility barriers present. They identified seven accessibility barriers: TalkBack-related issues, missing labels,
duplicate labels, uninformative labels, editable TextViews with contentDescription, overlapping
clickable elements, and undersized elements. Alshayban et al. [8] conducted surveys with app developers to gauge their sentiments and understanding of mobile accessibility. Vendome et al. [79]
conducted a qualitative analysis of online discussions among developers about accessibility in Android apps. Through their investigation, they created a taxonomy categorizing the key aspects of
accessibility that are most frequently discussed by developers. Wu et al. [85] conducted a survey
to investigate users’ awareness of built-in accessibility features on mobile devices. Their findings show that users, particularly older adults, are not very aware of these accessibility features.
Based on this finding, the authors developed a prototype tool that helps recommend accessibility
features to users based on their interaction and potential needs. Nicolau et al. [62] compared how
people with and without disabilities interact with mobile touchscreens. Their study involved 15
people with tetraplegia and 18 people without disabilities. They found that smaller touch targets
led to more errors, emphasizing the need for sufficiently large touch targets on mobile screens.
Mateus et al. [56] conducted a comparative study to assess the effectiveness of automated accessibility evaluation tools in detecting real-world accessibility issues impacting users with visual
disabilities in mobile applications. Their results emphasized the importance of combining automated tools and manual user testing approaches for a comprehensive accessibility evaluation.
These studies focus on understanding how disabled people interact with mobile devices and on
investigating how accessibility issues, including layout accessibility issues, impact those users
rather than on repairing these issues.
There has been particular interest in developing techniques to automatically detect accessibility issues in mobile apps. dVermin [71] detects scaling accessibility issues in Android apps.
AccessiText [9] detects text scaling issues caused by increasing the font size in the UI. These techniques are limited to detecting different types of scaling accessibility issues but do not attempt
to repair layout accessibility issues. Alotaibi et al. [5] designed a technique that can automatically detect TalkBack navigation issues by simulating how disabled users would navigate mobile
applications using TalkBack swipe gestures. Similarly, Salehnamadi et al. [67] developed a technique that detects navigation issues by automatically crawling Android apps. These techniques
focus on detecting navigation accessibility issues. Other techniques have been proposed to detect various types of presentation issues in mobile apps. Bo et al. [86] analyzed Material Design
Guidelines [54] to identify design smells that impact UI elements in mobile apps, such as using
multiple colors in a bottom navigation bar or using the primary color as the background color
of text fields. Based on this analysis, the authors introduced UIS-Hunter, a technique to detect
these design smells in Android apps. Liu et al. [49] examined 10,330 unique screenshots from 562
mobile applications to determine common UI display issues. The authors then proposed an automated technique, OwlEye, based on Convolutional Neural Networks (CNNs) for automatically
detecting these issues. NightHawk [48] extends OwlEye by providing a better localization of issues in the UI. ITDroid [32] detects internationalization presentation issues in Android apps. All
of these techniques focus on detecting different types of presentation issues that impact mobile
UIs, and none of them attempt to repair layout accessibility issues.
Efforts have also been made to improve the usability and accessibility of mobile apps. GEMMA
[47] applies a search-based technique to recommend a darker color scheme that can reduce the
energy consumption of mobile UIs, using sets of heuristics and fitness functions specific to the
problem domain of the work. LabelDroid [20] employs a deep-learning approach to improve the
accessibility of mobile applications by generating content labels for icons and images within the
UI. COALA [58] improves the quality of the generated content labels by considering the context
of the UI and its elements. Chen et al. created a method using image classification and a few-shot
learning model to enhance the quality of icon labeling in mobile applications [21]. These approaches focus on improving the accessibility for disabled users interacting with applications via
screen readers. Zhang et al. used interaction proxies [87] and annotations of mobile UIs [88] to
improve disabled users’ interaction with mobile applications via touch and to improve the compatibility of mobile applications with TalkBack assistive service [12]. Smart Touch [60] enhances
user touch screen interaction accuracy by deploying a template-based matching algorithm that
maps user touches to specific target areas on the screen. Despite their contributions to improving
mobile app accessibility and usability in various ways, none of these techniques provide a repair
for layout accessibility issues.
Several studies have examined the implications of using assistive services on the security and
privacy of people with disabilities. Naseri et al. [61] discovered that activating assistive services
on Android might expose many applications to potential attacks, leading to inadvertent leaks of
sensitive user information. Similarly, Kalysch et al. [42] found that attackers could exploit assistive services on Android to illicitly obtain user information and execute denial-of-service attacks.
OverSight [57] is a tool designed to detect instances where Android’s assistive technologies may
inadvertently grant access to functionalities and data that should not be otherwise accessible to
users (i.e., detecting overly accessible elements). These studies focus on detecting accessibility
vulnerabilities and potential security issues related to the use of assistive services, but they do
not repair layout accessibility issues.
6.2 Accessibility and Usability of Web Applications
Recent techniques have increasingly focused on the usability and accessibility of web applications. KAFE [23] employs both static and dynamic analysis to build a model that represents
possible keyboard-based navigation through a web page’s visible UI elements. This model is
then utilized to automatically detect and localize Keyboard Accessibility Failures (KAFs). LOTUS [25] uses a graph-based approach to detect dialog-related keyboard navigation accessibility
failures in web pages. BAGEL [24] automatically analyzes web pages to detect various accessibility issues that impact keyboard-based users, such as insufficient focus indicator, unintuitive
change-of-context, and unintuitive navigation order issues. These techniques primarily target
keyboard-related accessibility issues in web pages. Other techniques have focused on UI presentation issues in web pages. xFix [53] proposed a technique to repair Cross Browser Issues
(XBIs) using a search-based technique. GWALI [4] is a technique that automatically detects presentation issues caused by the internationalization of web pages (i.e., translating a webpage from
the default original language into different languages). iFix [52] uses a search-based approach
to repair these internationalization issues in web pages. CBRepair [3] improves the repair of
internationalization failures by using a constraints-based approach, which the authors showed
to be effective in reducing runtime and increasing the effectiveness of the repairs. VizAssert
introduces a formalized approach for ensuring the accessibility of a web page’s layout
across various renderings and user configurations. It proposes a visual logic language to specify
color-contrast accessibility properties that should exist on a web page. It also offers an automated
approach for verifying these properties. mFix [51] repairs mobile-friendly problems in web pages
that occur when web pages are not optimized for small-screen devices such as mobile
phones. All these techniques are limited to detecting and repairing issues in web pages and cannot
repair layout accessibility issues in mobile apps.
Chapter 7
Conclusion and Future Work
7.1 Summary
The goal of my research is to automate the repair of layout accessibility issues in mobile apps.
The hypothesis of my dissertation is:
A repair framework that builds models of the UI and employs a multi-objective genetic
search-based approach to generate repairs can effectively repair layout accessibility issues.
To evaluate this hypothesis, I designed a framework for automatically repairing layout accessibility issues. The repair framework, detailed in Chapter 3, serves as a systematic roadmap for
addressing layout accessibility issues. It operationalizes key insights into how these issues can
be repaired and identifies essential components critical for the repair process. At a high level, the
framework consists of two main phases. The first is the analysis phase, in which the framework
builds models of the UI that capture relationships that control or impact the rendering of elements
in the UI. These models are used to perform a UI dependency analysis, in which the framework
attempts to identify a safe and minimal set of elements and properties that must be changed to
repair the detected issues. In the second phase, the UI generation phase, the framework employs
a multi-objective genetic algorithm to generate candidate repaired UIs and subsequently evaluates them using a fitness function defined to capture important qualities that must exist in the
repaired UIs. This iterative process continues until the best possible repair is identified or the
search terminates.
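As a rough illustration of how a multi-objective search can compare candidate repaired UIs, the sketch below filters a set of candidates down to its Pareto front. It is only a minimal sketch: the two objectives (remaining issues and amount of layout change), the candidate names, and the scores are hypothetical and are not the fitness function defined in this dissertation.

```python
from typing import Dict, List, Tuple

# Hypothetical objective vector for one candidate repaired UI; every value is
# minimized. These two objectives are illustrative assumptions only.
Objectives = Tuple[float, float]  # (remaining_issues, layout_change)


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Objectives]) -> List[str]:
    """Return the sorted names of candidates that no other candidate dominates."""
    return sorted(
        name
        for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    )


# Made-up scores for four candidate repairs.
candidates = {
    "A": (0.0, 5.0),
    "B": (1.0, 2.0),
    "C": (1.0, 6.0),  # worse than A on both objectives
    "D": (3.0, 1.0),
}
print(pareto_front(candidates))
```

In a genetic search, a filter of this kind is typically applied each generation so that candidates trading off one objective against another survive together, rather than collapsing all qualities into a single score.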
To evaluate the effectiveness of this framework, I created two instantiations, ScaleFix and
SALEM, to repair layout accessibility issues. ScaleFix, detailed in Chapter 4, focused on the repair of USAFs. USAFs refer to any layout distortion in a UI that occurs due to the use of scaling
assistive services. USAFs can manifest in different ways. They can manifest as text visibility issues, where elements in the UI have their text cut off, causing important information to become lost
or unreadable. USAFs can also manifest as collision issues, where scaling the UI causes visible
elements to collide and overlap. This can make it difficult for users to perceive the UI content or
interact with it. Lastly, USAFs can manifest as missing issues, where the scaling of the UI causes
elements to be rendered outside of the viewport’s boundaries. This leads to functionality associated with these elements becoming unavailable to disabled people. The evaluation of ScaleFix
demonstrated its high effectiveness, achieving an average success rate of 90% in repairing USAFs
in real-world applications. The results also showed that ScaleFix was able to increase the accessibility of the mobile UIs by up to 88%. This effectiveness was further confirmed by a user study,
which showed that repairs generated by ScaleFix were significantly preferred, both in terms of
attractiveness and readability, over the original UI versions. Specifically, the attractiveness and
readability ratings for the repaired UIs were more than 40% higher than those for the original
scaled UIs. When compared to repairs manually crafted by the apps’ developers, the repairs generated by ScaleFix were rated similarly well. These findings show the effectiveness of ScaleFix and
partially confirm my hypothesis that a repair framework that leverages UI models and employs
a multi-objective genetic search algorithm is effective in repairing layout accessibility issues in
mobile applications.
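The collision and missing issues described above can be pictured with simple bounding-box geometry. The following is a minimal sketch under simplifying assumptions: each element is reduced to an axis-aligned rectangle in pixels, scrolling containers are ignored, and the `Rect` type and the example coordinates are hypothetical rather than taken from ScaleFix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned bounding box in pixels (hypothetical helper type)."""
    left: int
    top: int
    right: int
    bottom: int


def overlaps(a: Rect, b: Rect) -> bool:
    """Collision: the two rectangles share a region of positive area."""
    return a.left < b.right and b.left < a.right and a.top < b.bottom and b.top < a.bottom


def outside_viewport(element: Rect, viewport: Rect) -> bool:
    """Missing: some part of the element lies beyond the viewport's boundaries."""
    return (
        element.left < viewport.left
        or element.top < viewport.top
        or element.right > viewport.right
        or element.bottom > viewport.bottom
    )


# Made-up layout after scaling: the title grows into the button,
# and the footer is pushed past the bottom edge of the screen.
viewport = Rect(0, 0, 1080, 1920)
title = Rect(40, 100, 700, 220)
button = Rect(650, 120, 1000, 240)
footer = Rect(40, 1850, 1040, 2000)

print(overlaps(title, button))             # collision between title and button
print(outside_viewport(footer, viewport))  # footer extends past the viewport
```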
The second framework instantiation, SALEM, detailed in Chapter 5, focused on the repair
of SBIIs. These layout accessibility issues occur when touch targets (i.e., interactive elements)
in the UI are too small to pass the minimum size threshold set by accessibility guidelines. SBIIs
can make interacting with UIs very challenging, especially for people with motor disabilities or
reduced dexterity, as well as older adults who may experience age-related decline in fine motor
skills. Struggling to interact with small touch targets can lead to errors and, ultimately, frustration
and reduced ability to use the app. The evaluation of SALEM on 58 UIs from 48 real-world apps
showed its effectiveness in repairing SBIIs. SALEM was able to completely repair the SBIIs in 56
of the 58 UIs. The average accessibility rate for the UIs significantly improved after the repair,
jumping from 26% to 99%. The effectiveness of the instantiation was further confirmed by a
user study conducted with participants with various abilities, including individuals with motor
disabilities and older adults. The results of the user study showed that even before knowing about
SBIIs, an overwhelming 91% of participants preferred the repaired UIs or had no preference. Once
they became aware of the SBIIs, the number of people who strongly preferred the repaired UI
tripled. These findings show the effectiveness of SALEM and partially confirm my hypothesis
that a repair framework that leverages UI models and employs a multi-objective genetic search
algorithm is effective in repairing layout accessibility issues in mobile applications.
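To make the notion of a size threshold concrete, the sketch below checks whether a touch target falls below a minimum size expressed in density-independent pixels (dp). It is a minimal sketch with stated assumptions: the 48 dp threshold follows Android's published touch-target guidance and is assumed here rather than taken from this chapter, and the function names are hypothetical. The conversion uses the standard Android relation dp = px × 160 / dpi.

```python
# Assumed threshold: Android's guidance recommends touch targets of at least
# 48 x 48 dp. Other guidelines may use a different value.
MIN_TARGET_DP = 48.0


def px_to_dp(px: float, dpi: float) -> float:
    """Convert physical pixels to density-independent pixels (160 dpi baseline)."""
    return px * 160.0 / dpi


def is_too_small(width_px: float, height_px: float, dpi: float,
                 min_dp: float = MIN_TARGET_DP) -> bool:
    """True if either dimension of the touch target is below the minimum size."""
    return px_to_dp(width_px, dpi) < min_dp or px_to_dp(height_px, dpi) < min_dp


# On a 480 dpi screen, 96 px corresponds to 32 dp and 144 px to 48 dp.
print(is_too_small(96, 96, dpi=480))    # below the assumed threshold
print(is_too_small(144, 144, dpi=480))  # meets the assumed threshold
```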
The evaluations of ScaleFix and SALEM demonstrated their effectiveness in repairing layout
accessibility issues. ScaleFix achieved a 90% success rate in repairing USAFs in real-world applications,
while SALEM successfully repaired SBIIs in 56 out of 58 UIs, improving the average accessibility
rate from 26% to 99%. Both techniques are instantiations of the repair framework, which utilizes
UI models and multi-objective genetic search algorithms, as proposed in my hypothesis. The
results from these two instantiations combined confirm my dissertation hypothesis that the repair
framework is effective in automatically repairing layout accessibility issues.
7.2 Future Work
As society becomes increasingly digital, the role of mobile applications in simplifying daily activities, accessing essential services, and maintaining social connections will continue to grow.
This means that ensuring mobile apps are accessible will continue to be important, not just for
today but also for the future. My dissertation identifies several challenges in the domain of mobile
accessibility and lays the foundation for future work in this domain.
One possible direction for future work is to expand the scope of the repair framework to
address a broader range of UI-related accessibility issues in mobile applications. For instance,
the framework could be customized to address color and navigation-related accessibility issues
in mobile apps. The use of UI modeling and a multi-objective genetic approach could enable the
building of techniques that effectively consider multiple aspects related to these problem domains
in the process of finding the best possible repairs. Another direction could focus on extending the
applicability of the repair framework to handle issues found in dynamic content such as WebViews
and AdViews, or even on other platforms like wearables and in-car entertainment systems.
Automated modifications to mobile UIs can enable work in various areas, including the repair of
UI presentation issues, automated UI generation, and UI synthesis and optimization. Techniques
could be developed to streamline and speed up the process of building and modifying UIs according to design principles and specific guidelines. The quantification and casting of accessibility
guidelines and UI properties as measurable metrics can enable work in various areas beyond
accessibility repair. For example, future work could utilize these metrics in improving the automated testing and evaluation of UIs, as well as in the auditing and ranking of mobile apps.
Another potential application is in UI personalization, where interfaces can adapt dynamically to
particular user needs. These metrics can also be employed in prototyping, where they serve as guiding
mechanisms to avoid violations in the early stages of development.
Lastly, future work could also consider integrating the repair framework into existing developer environments and workflows. This would not only improve developer adoption but also
help in the process of building accessible UIs across various platforms and devices.
References
[1] Accessibility research mobile apps. url: https://accessibility.q42.nl/ (visited on
07/14/2023).
[2] Accessibility Scanner - Apps on Google Play. en. url: https://play.google.com/store/
apps/details (visited on 12/28/2020).
[3] Abdulmajeed Alameer, Paul T. Chiou, and William G. J. Halfond. “Efficiently repairing
internationalization presentation failures by solving layout constraints”. In: 2019 12th IEEE
conference on software testing, validation and verification (ICST). 2019, pp. 172–182. doi:
10.1109/ICST.2019.00026.
[4] Abdulmajeed Alameer, Sonal Mahajan, and William G. J. Halfond. “Detecting and Localizing Internationalization Presentation Failures in Web Applications”. In: 2016 IEEE International Conference on Software Testing, Verification and Validation (ICST). Apr. 2016, pp. 202–
212. doi: 10.1109/ICST.2016.36.
[5] Ali S. Alotaibi, Paul T. Chiou, and William G.J. Halfond. “Automated Detection of TalkBack Interactive Accessibility Failures in Android Applications”. In: 2022 IEEE Conference
on Software Testing, Verification and Validation (ICST). ISSN: 2159-4848. Apr. 2022, pp. 232–
243. doi: 10.1109/ICST53961.2022.00033.
[6] Ali S. Alotaibi, Paul T. Chiou, and William G.J. Halfond. “Automated Repair of Size-Based
Inaccessibility Issues in Mobile Applications”. In: 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). ISSN: 2643-1572. Nov. 2021, pp. 730–742.
doi: 10.1109/ASE51524.2021.9678625.
[7] Ali S. Alotaibi, Paul T. Chiou, Fazle M. Tawsif, and William G.J. Halfond. “ScaleFix: An
automated repair of UI scaling accessibility issues in Android applications”. In: 39th IEEE
international conference on software maintenance and evolution (ICSME). Oct. 2023.
[8] Abdulaziz Alshayban, Iftekhar Ahmed, and Sam Malek. “Accessibility Issues in Android
Apps: State of Affairs, Sentiments, and Ways Forward”. en. In: 2020, p. 12.
[9] Abdulaziz Alshayban and Sam Malek. “AccessiText: automated detection of text accessibility issues in Android apps”. en. In: Proceedings of the 30th ACM Joint European Software
Engineering Conference and Symposium on the Foundations of Software Engineering. Singapore: ACM, Nov. 2022, pp. 984–995. isbn: 978-1-4503-9413-0. doi: 10.1145/3540250.3549118.
url: https://dl.acm.org/doi/10.1145/3540250.3549118 (visited on 11/17/2022).
[10] Amazon Mechanical Turk. url: https://www.mturk.com/ (visited on 02/07/2021).
[11] Android Accessibility for Developers. en. url: https://developer.android.com/guide/
topics/ui/accessibility (visited on 10/04/2022).
[12] Android accessibility overview - Android Accessibility Help. url: https://support.google.
com/accessibility/android/answer/6006564hl (visited on 10/05/2021).
[13] Android Apps on Google Play. url: https://play.google.com/store (visited on 07/24/2023).
[14] Android Debug Bridge (adb). en. url: https://developer.android.com/studio/command-line/adb (visited on 01/22/2021).
[15] Android Studio & App Tools - Android Developers. url: https://developer.android.
com/studio (visited on 08/12/2023).
[16] Apktool - A tool for reverse engineering 3rd party, closed, binary Android apps. url: https:
//ibotpeaches.github.io/Apktool/ (visited on 01/22/2021).
[17] Apple Apps Accessibility. en-US. url: https://developer.apple.com/documentation/
accessibility (visited on 07/24/2023).
[18] Shiri Azenkot, Cynthia L. Bennett, and Richard E. Ladner. “DigiTaps: Eyes-free number entry on touchscreens with minimal audio feedback”. In: Proceedings of the 26th annual ACM
symposium on user interface software and technology. UIST ’13. Number of pages: 6 Place:
St. Andrews, Scotland, United Kingdom. New York, NY, USA: Association for Computing
Machinery, 2013, pp. 85–90. isbn: 978-1-4503-2268-3. doi: 10.1145/2501988.2502056.
url: https://doi.org/10.1145/2501988.2502056.
[19] BBC Mobile Accessibility Guidelines. url: https://www.bbc.co.uk/accessibility/forproducts/guides/mobile/ (visited on 07/29/2020).
[20] Jieshan Chen, Chunyang Chen, Zhenchang Xing, Xiwei Xu, Liming Zhu, Guoqiang Li, and
Jinshui Wang. “Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI
Components by Deep Learning”. en. In: Mar. 2020. url: https://arxiv.org/abs/2003.
00380v2 (visited on 08/27/2020).
[21] Jieshan Chen, Amanda Swearngin, Jason Wu, Titus Barik, Jeffrey Nichols, and Xiaoyi Zhang.
“Towards complete icon labeling in mobile applications”. In: CHI. 2022. url: https://docs-assets.developer.apple.com/ml-research/papers/icon-labelling-mobile-apps-chi-22.pdf.
[22] Sen Chen, Chunyang Chen, Lingling Fan, Mingming Fan, Xian Zhan, and Yang Liu. “Accessible or Not? An Empirical Investigation of Android App Accessibility”. In: IEEE Transactions on Software Engineering (2021), pp. 1–1. issn: 1939-3520. doi: 10.1109/TSE.2021.3108162.
[23] Paul T. Chiou, Ali S. Alotaibi, and William G. J. Halfond. “Detecting and localizing keyboard
accessibility failures in web applications”. In: Proceedings of the 29th ACM Joint Meeting on
European Software Engineering Conference and Symposium on the Foundations of Software
Engineering. ESEC/FSE 2021. New York, NY, USA: Association for Computing Machinery,
Aug. 2021, pp. 855–867. isbn: 978-1-4503-8562-6. doi: 10.1145/3468264.3468581. url:
http://doi.org/10.1145/3468264.3468581 (visited on 08/23/2021).
[24] Paul T. Chiou, Ali S. Alotaibi, and William G.J. Halfond. “BAGEL: An approach to automatically detect navigation-based web accessibility barriers for keyboard users”. In: ACM CHI
conference on human factors in computing systems (CHI 2023). Apr. 2023.
[25] Paul T. Chiou, Ali S. Alotaibi, and William G.J. Halfond. “Detecting dialog-related keyboard
navigation failures in web applications”. In: IEEE/ACM international conference on software
engineering (ICSE 2023). May 2023.
[26] Desktop vs mobile vs tablet market share worldwide - september 2022. url: https://gs.
statcounter.com/platform-market-share/desktop-mobile-tablet.
[27] Developer guides. en. url: https://developer.android.com/guide (visited on 07/15/2023).
[28] Disability-WHO. en. url: https://www.who.int/news-room/fact-sheets/detail/
disability-and-health (visited on 09/14/2023).
[29] “Domino’s Pizza app must be accessible to blind people”. en-GB. In: BBC News (Jan. 2019).
url: https://www.bbc.com/news/technology-46894463 (visited on 07/24/2023).
[30] Sacha N. Duff, Curt B. Irwin, Jennifer L. Skye, Mary E. Sesto, and Douglas A. Wiegmann.
“The Effect of Disability and Approach on Touch Screen Performance during a Number
Entry Task”. In: Proceedings of the Human Factors and Ergonomics Society Annual Meeting
54.6 (Sept. 2010). Publisher: SAGE Publications Inc, pp. 566–570. issn: 2169-5067. doi: 10.
1177/154193121005400605. url: https://doi.org/10.1177/154193121005400605
(visited on 01/29/2021).
[31] Marcelo Medeiros Eler, Jose Miguel Rojas, Yan Ge, and Gordon Fraser. “Automated Accessibility Testing of Mobile Apps”. In: 2018 IEEE 11th International Conference on Software
Testing, Verification and Validation (ICST). Apr. 2018, pp. 116–126. doi: 10.1109/ICST.
2018.00021.
[32] Camilo Escobar-Velásquez, Michael Osorio-Riaño, Juan Dominguez-Osorio, Maria Arevalo,
and Mario Linares-Vásquez. “An Empirical Study of i18n Collateral Changes and Bugs in
GUIs of Android apps”. In: 2020 IEEE International Conference on Software Maintenance and
Evolution (ICSME). ISSN: 2576-3148. Sept. 2020, pp. 581–592. doi: 10.1109/ICSME46990.
2020.00061.
[33] Martin Ester, Hans-Peter Kriegel, and Xiaowei Xu. “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise”. en. In: (), p. 6.
[34] F-Droid - Free and Open Source Android App Repository. url: https://f-droid.org/en/
(visited on 07/24/2023).
[35] Google. Android Apps on Google Play. en. url: https://play.google.com/store/apps?
hl=en&gl=US (visited on 01/11/2021).
[36] Google Accessibility for Android. en. url: https://developer.android.com/guide/
topics/ui/accessibility (visited on 01/29/2021).
[37] Google Git: mirror-goog-studio-main - ddmlib. Timestamp: 2023-04-23. url: https://android.googlesource.com/platform/tools/base/+/refs/heads/mirror-goog-studio-main/ddmlib/.
[38] google/Accessibility-Test-Framework-for-Android. original-date: 2015-09-12T00:49:01Z. Dec.
2020. url: https://github.com/google/Accessibility-Test-Framework-for-Android (visited on 12/28/2020).
[39] Tiago Guerreiro, Hugo Nicolau, Joaquim Jorge, and Daniel Gonçalves. “Towards accessible touch interfaces”. In: Proceedings of the 12th international ACM SIGACCESS conference
on Computers and accessibility. ASSETS ’10. New York, NY, USA: Association for Computing Machinery, Oct. 2010, pp. 19–26. isbn: 978-1-60558-881-0. doi: 10.1145/1878803.
1878809. url: http://doi.org/10.1145/1878803.1878809 (visited on 12/17/2020).
[40] Liang He, Zijian Wan, Leah Findlater, and Jon E. Froehlich. “TacTILE: A preliminary toolchain
for creating accessible graphics with 3D-Printed overlays and auditory annotations”. In:
Proceedings of the 19th international ACM SIGACCESS conference on computers and accessibility. ASSETS ’17. Number of pages: 2 Place: Baltimore, Maryland, USA. New York, NY,
USA: Association for Computing Machinery, 2017, pp. 397–398. isbn: 978-1-4503-4926-0.
doi: 10.1145/3132525.3134818. url: https://doi.org/10.1145/3132525.3134818.
[41] IBM Mobile Accessibility Checker. original-date: 2017-11-06T14:35:17Z. May 2020. url: https:
//github.com/IBMa/MAC (visited on 12/28/2020).
[42] Anatoli Kalysch, Davide Bove, and Tilo Müller. “How Android’s UI Security is Undermined
by Accessibility”. en. In: Proceedings of the 2nd Reversing and Offensive-oriented Trends Symposium on ZZZ - ROOTS ’18. Vienna, Austria: ACM Press, 2018, pp. 1–10. isbn: 978-1-4503-
6171-2. doi: 10.1145/3289595.3289597. url: http://dl.acm.org/citation.cfm?
doid=3289595.3289597 (visited on 09/02/2019).
[43] Shaun K. Kane, Jeffrey P. Bigham, and Jacob O. Wobbrock. “Slide rule: Making mobile touch
screens accessible to blind people using multi-touch interaction techniques”. In: Proceedings of the 10th international ACM SIGACCESS conference on computers and accessibility.
Assets ’08. Number of pages: 8 Place: Halifax, Nova Scotia, Canada. New York, NY, USA:
Association for Computing Machinery, 2008, pp. 73–80. isbn: 978-1-59593-976-0. doi: 10.
1145/1414471.1414487. url: https://doi.org/10.1145/1414471.1414487.
[44] Shaun K. Kane, Meredith Ringel Morris, Annuska Z. Perkins, Daniel Wigdor, Richard E.
Ladner, and Jacob O. Wobbrock. “Access overlays: Improving non-visual access to large
touch screens for blind users”. In: Proceedings of the 24th annual ACM symposium on user
interface software and technology. UIST ’11. Number of pages: 10 Place: Santa Barbara, California, USA. New York, NY, USA: Association for Computing Machinery, 2011, pp. 273–
282. isbn: 978-1-4503-0716-1. doi: 10.1145/2047196.2047232. url: https://doi.org/
10.1145/2047196.2047232.
[45] Shaun K. Kane, Meredith Ringel Morris, and Jacob O. Wobbrock. “Touchplates: Low-cost
tactile overlays for visually impaired touch screen users”. In: Proceedings of the 15th international ACM SIGACCESS conference on computers and accessibility. ASSETS ’13. Number of
pages: 8 Place: Bellevue, Washington. Article no. 22. New York, NY, USA: Association for
Computing Machinery, 2013. isbn: 978-1-4503-2405-2. doi: 10.1145/2513383.2513442.
url: https://doi.org/10.1145/2513383.2513442.
[46] Layout Inspector. en. url: https://developer.android.com/studio/debug/layout-inspector (visited on 04/22/2021).
[47] Mario Linares-Vásquez, Gabriele Bavota, Carlos Bernal-Cárdenas, Massimiliano Di Penta,
Rocco Oliveto, and Denys Poshyvanyk. “Multi-Objective Optimization of Energy Consumption of GUIs in Android Apps”. In: ACM Transactions on Software Engineering and
Methodology 27.3 (Sept. 2018), 14:1–14:47. issn: 1049-331X. doi: 10.1145/3241742. url:
http://doi.org/10.1145/3241742 (visited on 08/12/2021).
[48] Zhe Liu, Chunyang Chen, Junjie Wang, Yuekai Huang, Jun Hu, and Qing Wang. “Nighthawk:
Fully Automated Localizing UI Display Issues via Visual Understanding”. In: IEEE Transactions on Software Engineering (2022), pp. 1–1. issn: 1939-3520. doi: 10.1109/TSE.2022.3150876.
[49] Zhe Liu, Chunyang Chen, Junjie Wang, Yuekai Huang, Jun Hu, and Qing Wang. “Owl eyes:
spotting UI display issues via visual understanding”. In: Proceedings of the 35th IEEE/ACM
International Conference on Automated Software Engineering. ASE ’20. New York, NY, USA:
Association for Computing Machinery, Dec. 2020, pp. 398–409. isbn: 978-1-4503-6768-4.
doi: 10.1145/3324884.3416547. url: http://doi.org/10.1145/3324884.3416547
(visited on 10/03/2022).
[50] S. Mahajan and W. G. J. Halfond. “Detection and Localization of HTML Presentation Failures Using Computer Vision-Based Techniques”. In: 2015 IEEE 8th International Conference
on Software Testing, Verification and Validation (ICST). ISSN: 2159-4848. Apr. 2015, pp. 1–10.
doi: 10.1109/ICST.2015.7102586.
[51] Sonal Mahajan, Negarsadat Abolhassani, Phil McMinn, and William G. J. Halfond. “Automated repair of mobile friendly problems in web pages”. en. In: Proceedings of the 40th
International Conference on Software Engineering. Gothenburg Sweden: ACM, May 2018,
pp. 140–150. isbn: 978-1-4503-5638-1. doi: 10.1145/3180155.3180262. url: https://dl.acm.org/doi/10.1145/3180155.3180262 (visited on 01/04/2021).
[52] Sonal Mahajan, Abdulmajeed Alameer, Phil McMinn, and William G. J. Halfond. “Automated repair of internationalization presentation failures in web pages using style similarity clustering and search-based techniques”. In: 2018 IEEE 11th international conference on
software testing, verification and validation (ICST). 2018, pp. 215–226. doi: 10.1109/ICST.
2018.00030.
[53] Sonal Mahajan, Abdulmajeed Alameer, Phil McMinn, and William G. J. Halfond. “XFix: an
automated tool for the repair of layout cross browser issues”. In: Proceedings of the 26th
ACM SIGSOFT International Symposium on Software Testing and Analysis. ISSTA 2017. New
York, NY, USA: Association for Computing Machinery, July 2017, pp. 368–371. isbn: 978-
1-4503-5076-1. doi: 10.1145/3092703.3098223. url: https://dl.acm.org/doi/10.
1145/3092703.3098223 (visited on 04/28/2023).
[54] Material Design. url: https://material.io/design/usability (visited on 04/20/2021).
[55] Material Design Accessibility. en. url: https://material.io/design/usability/accessibility.html#layout-and-typography (visited on 01/29/2021).
[56] Delvani Antônio Mateus, Carlos Alberto Silva, Marcelo Medeiros Eler, and André Pimenta
Freire. “Accessibility of mobile applications: evaluation by users with visual impairment
and by automated tools”. In: Proceedings of the 19th Brazilian Symposium on Human Factors
in Computing Systems. IHC ’20. New York, NY, USA: Association for Computing Machinery,
Oct. 2020, pp. 1–10. isbn: 978-1-4503-8172-7. doi: 10.1145/3424953.3426633. url: http:
//doi.org/10.1145/3424953.3426633 (visited on 01/14/2021).
[57] Forough Mehralian, Navid Salehnamadi, Syed Fatiul Huq, and Sam Malek. “Too Much Accessibility is Harmful! Automated Detection and Analysis of Overly Accessible Elements
in Mobile Apps”. en. In: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering. Rochester MI USA: ACM, Oct. 2022, pp. 1–13. isbn: 978-1-
4503-9475-8. doi: 10.1145/3551349.3560424. url: https://dl.acm.org/doi/10.
1145/3551349.3560424 (visited on 08/14/2023).
[58] Forough Mehralian, Navid Salehnamadi, and Sam Malek. “Data-driven accessibility repair
revisited: on the effectiveness of generating labels for icons in Android apps”. In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. ESEC/FSE 2021. New York, NY, USA:
Association for Computing Machinery, Aug. 2021, pp. 107–118. isbn: 978-1-4503-8562-6.
doi: 10.1145/3468264.3468604. url: https://doi.org/10.1145/3468264.3468604
(visited on 08/30/2021).
[59] Mobile Accessibility. en. url: https://www.w3.org/TR/mobile-accessibility-mapping/mobile-accessibility-considerations-primarily-related-to-principle2-operable (visited on 10/05/2021).
[60] Martez E. Mott, Radu-Daniel Vatavu, Shaun K. Kane, and Jacob O. Wobbrock. “Smart Touch:
Improving Touch Accuracy for People with Motor Impairments with Template Matching”.
In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. CHI
’16. New York, NY, USA: Association for Computing Machinery, May 2016, pp. 1934–1946.
isbn: 978-1-4503-3362-7. doi: 10.1145/2858036.2858390. url: http://doi.org/10.
1145/2858036.2858390 (visited on 12/17/2020).
[61] Mohammad Naseri, Nataniel P. Borges, Andreas Zeller, and Romain Rouvoy. “AccessiLeaks:
Investigating Privacy Leaks Exposed by the Android Accessibility Service”. en. In: Proceedings on Privacy Enhancing Technologies 2019.2 (Apr. 2019), pp. 291–305. issn: 2299-0984.
doi: 10.2478/popets-2019-0031. url: https://content.sciendo.com/view/journals/popets/2019/2/article-p291.xml (visited on 09/02/2019).
[62] Hugo Nicolau, Tiago Guerreiro, Joaquim Jorge, and Daniel Gonçalves. “Mobile touchscreen
user interfaces: bridging the gap between motor-impaired and able-bodied users”. en. In:
Universal Access in the Information Society 13.3 (Aug. 2014), pp. 303–313. issn: 1615-5297.
doi: 10.1007/s10209-013-0320-5. url: https://doi.org/10.1007/s10209-013-
0320-5 (visited on 01/29/2021).
[63] L. Nurgalieva, J. J. Jara Laconich, M. Baez, F. Casati, and M. Marchese. “A Systematic Literature Review of Research-Derived Touchscreen Design Guidelines for Older Adults”. In:
IEEE Access 7 (2019), pp. 22035–22058. issn: 2169-3536.
doi: 10.1109/ACCESS.2019.2898467.
[64] Yong S. Park, Sung H. Han, Jaehyun Park, and Youngseok Cho. “Touch key design for target
selection on a mobile phone”. In: Proceedings of the 10th international conference on Human
computer interaction with mobile devices and services. MobileHCI ’08. New York, NY, USA:
Association for Computing Machinery, Sept. 2008, pp. 423–426. isbn: 978-1-59593-952-4.
doi: 10.1145/1409240.1409304. url: http://doi.org/10.1145/1409240.1409304
(visited on 12/17/2020).
[65] Anne Spencer Ross, Xiaoyi Zhang, James Fogarty, and Jacob O. Wobbrock. “An Epidemiology-inspired Large-scale Analysis of Android App Accessibility”. In: ACM Transactions on Accessible Computing 13.1 (Apr. 2020), 4:1–4:36. issn: 1936-7228. doi: 10.1145/3348797. url:
http://doi.org/10.1145/3348797 (visited on 06/11/2020).
[66] Navid Salehnamadi, Abdulaziz Alshayban, Jun-Wei Lin, Iftekhar Ahmed, Stacy Branham,
and Sam Malek. “Latte: Use-Case and Assistive-Service Driven Automated Accessibility
Testing Framework for Android”. In: Proceedings of the 2021 CHI Conference on Human
Factors in Computing Systems. 274. New York, NY, USA: Association for Computing Machinery, May 2021, pp. 1–11. isbn: 978-1-4503-8096-6. url: http://doi.org/10.1145/
3411764.3445455 (visited on 07/14/2021).
[67] Navid Salehnamadi, Forough Mehralian, and Sam Malek. “Groundhog: An Automated Accessibility Crawler for Mobile Apps”. In: Proceedings of the 37th IEEE/ACM International
Conference on Automated Software Engineering. ASE ’22. New York, NY, USA: Association
for Computing Machinery, Jan. 2023, pp. 1–12. isbn: 978-1-4503-9475-8. doi: 10.1145/3551349.3556905.
url: https://dl.acm.org/doi/10.1145/3551349.3556905 (visited on 04/28/2023).
[68] A. Sanoja and S. Gançarski. “Block-o-Matic: A web page segmentation framework”. In:
2014 International Conference on Multimedia Computing and Systems (ICMCS). Apr. 2014,
pp. 595–600. doi: 10.1109/ICMCS.2014.6911249.
[69] Screen sizes. Timestamp: 2021-04-19. url: https://screensiz.es/nexus-6p.
[70] SDK Platform Tools. en. url: https://developer.android.com/tools/releases/
platform-tools (visited on 08/21/2023).
[71] Yuhui Su, Chunyang Chen, Junjie Wang, Zhe Liu, Dandan Wang, Shoubin Li, and Qing
Wang. “The Metamorphosis: Automatic Detection of Scaling Issues for Mobile Apps”. In:
Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering. ASE ’22. New York, NY, USA: Association for Computing Machinery, Jan. 2023, pp. 1–
12. isbn: 978-1-4503-9475-8. doi: 10.1145/3551349.3556935. url: https://doi.org/
10.1145/3551349.3556935 (visited on 02/06/2023).
[72] Yuhui Su, Chunyang Chen, Junjie Wang, Zhe Liu, Dandan Wang, Shoubin Li, and Qing
Wang. “The Metamorphosis: Automatic Detection of Scaling Issues for Mobile Apps”. In:
Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering. ASE ’22. New York, NY, USA: Association for Computing Machinery, Jan. 2023, pp. 1–
12. isbn: 978-1-4503-9475-8. doi: 10.1145/3551349.3556935. url: https://dl.acm.
org/doi/10.1145/3551349.3556935 (visited on 04/28/2023).
[73] Switch Access - Android Accessibility. url: https://support.google.com/accessibility/
android/answer/6122836?hl=en (visited on 08/24/2023).
[74] Brandon Taylor, Anind Dey, Dan Siewiorek, and Asim Smailagic. “Customizable 3D printed
tactile maps as interactive overlays”. In: Proceedings of the 18th international ACM SIGACCESS conference on computers and accessibility. ASSETS ’16. Number of pages: 9 Place: Reno,
Nevada, USA. New York, NY, USA: Association for Computing Machinery, 2016, pp. 71–79.
isbn: 978-1-4503-4124-0. doi: 10.1145/2982142.2982167. url: https://doi.org/10.
1145/2982142.2982167.
[75] Tesseract documentation. en-US. url: https://tesseract-ocr.github.io/ (visited on
04/28/2023).
[76] Top Companies That Got Sued Over Accessibility. url: https://www.accessi.org/blog/famous-web-accessibility-lawsuits/ (visited on 07/24/2023).
[77] UI Automator. en. url: https://developer.android.com/training/testing/uiautomator (visited on 01/22/2021).
[78] Xabier Valencia, J Eduardo Pérez, Myriam Arrue, Julio Abascal, Carlos Duarte, and Lourdes
Moreno. “Adapting the Web for People With Upper Body Motor Impairments Using Touch
Screen Tablets”. In: Interacting with Computers 29.6 (Nov. 2017), pp. 794–812. issn: 0953-5438. doi: 10.1093/iwc/iwx013. url: https://doi.org/10.1093/iwc/iwx013 (visited on 01/29/2021).
[79] Christopher Vendome, Diana Solano, Santiago Liñán, and Mario Linares-Vásquez. “Can
Everyone use my app? An Empirical Study on Accessibility in Android Apps”. In: 2019
IEEE International Conference on Software Maintenance and Evolution (ICSME). ISSN: 2576-3148. Sept. 2019, pp. 41–52. doi: 10.1109/ICSME.2019.00014.
[80] W3 Target Size. url: https://www.w3.org/WAI/WCAG21/Understanding/target-size
(visited on 01/29/2021).
[81] w3c_wai. Mobile Accessibility Task Force. url: https://www.w3.org/WAI/GL/mobilea11y-tf/ (visited on 08/11/2023).
[82] Jena Wallace. Key Takeaways from UsableNet’s ADA Report. en. July 2023. url: https://www.3playmedia.com/blog/key-takeaways-usablenets-ada-web-app-report/ (visited on 07/24/2023).
[83] Website Accessibility Conformance Evaluation Methodology (WCAG-EM). url: https://www.w3.org/TR/WCAG-EM/ (visited on 08/19/2022).
[84] Window magnifier in Android 12. en. url: https://source.android.com/docs/core/display/window-magnifier (visited on 04/28/2023).
[85] Jason Wu, Gabriel Reyes, Sam C. White, Xiaoyi Zhang, and Jeffrey P. Bigham. “When can
accessibility help? An exploration of accessibility feature recommendation on mobile devices”. In: W4A. 2021. url: https://arxiv.org/pdf/2105.01734.pdf.
[86] Bo Yang, Zhenchang Xing, Xin Xia, Chunyang Chen, Deheng Ye, and Shanping Li. “Don’t
Do That! Hunting Down Visual Design Smells in Complex UIs Against Design Guidelines”.
In: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). ISSN: 1558-1225. May 2021, pp. 761–772. doi: 10.1109/ICSE43902.2021.00075.
[87] Xiaoyi Zhang, Anne Spencer Ross, Anat Caspi, James Fogarty, and Jacob O. Wobbrock.
“Interaction proxies for runtime repair and enhancement of mobile application accessibility”. In: Proceedings of the 2017 CHI conference on human factors in computing systems.
CHI ’17. Number of pages: 14 Place: Denver, Colorado, USA. New York, NY, USA: Association for Computing Machinery, 2017, pp. 6024–6037. isbn: 978-1-4503-4655-9. doi:
10.1145/3025453.3025846. url: https://doi.org/10.1145/3025453.3025846.
[88] Xiaoyi Zhang, Anne Spencer Ross, and James Fogarty. “Robust Annotation of Mobile Application Interfaces in Methods for Accessibility Repair and Enhancement”. In: Proceedings
of the 31st Annual ACM Symposium on User Interface Software and Technology. UIST ’18.
New York, NY, USA: Association for Computing Machinery, Oct. 2018, pp. 609–621. isbn:
978-1-4503-5948-1. doi: 10.1145/3242587.3242616. url: http://doi.org/10.1145/3242587.3242616 (visited on 12/17/2020).
[89] Xiaoyi Zhang, Tracy Tran, Yuqian Sun, Ian Culhane, Shobhit Jain, James Fogarty, and Jennifer Mankoff. “Interactiles: 3D printed tactile interfaces to enhance mobile touchscreen accessibility”. In: Proceedings of the 20th international ACM SIGACCESS conference on computers and accessibility. ASSETS ’18. Number of pages: 12 Place: Galway, Ireland. New York, NY,
USA: Association for Computing Machinery, 2018, pp. 131–142. isbn: 978-1-4503-5650-3.
doi: 10.1145/3234695.3236349. url: https://doi.org/10.1145/3234695.3236349.
[90] Yu Zhong, Astrid Weber, Casey Burkhardt, Phil Weaver, and Jeffrey P. Bigham. “Enhancing
Android accessibility for users with hand tremor by reducing fine pointing and steady
tapping”. en. In: Proceedings of the 12th International Web for All Conference. Florence Italy:
ACM, May 2015, pp. 1–10. isbn: 978-1-4503-3342-9. doi: 10.1145/2745555.2747277.
url: https://dl.acm.org/doi/10.1145/2745555.2747277 (visited on 03/30/2021).
Abstract
Mobile accessibility is more critical than ever due to the significant increase in mobile app usage, particularly among people with disabilities who rely on mobile devices to access essential information and services. People with vision and motor disabilities often use assistive technologies to interact with mobile applications. However, recent studies show that a significant percentage of mobile apps remain inaccessible due to layout accessibility issues, making them challenging to use for older adults and people with disabilities. Unfortunately, existing techniques are limited in helping developers debug these issues; they can only detect issues but not repair them. Therefore, the repair of layout accessibility issues remains a manual, labor-intensive, and error-prone process.
Automated repair of layout accessibility issues is complicated by several challenges. First, a repair must account for multiple issues holistically in order to preserve the relative consistency of the original app design. Second, due to the complex relationships among UI components, there is no straightforward way to identify the set of elements and properties that must be modified for a given issue. Third, even if the relevant views and properties can be identified, the number of candidate changes grows exponentially with the number of elements and properties involved. Finally, a change to one element can trigger cascading changes that introduce new problems in other areas of the UI. Together, these challenges make a seemingly simple repair difficult to achieve.
In this dissertation, I introduce a repair framework that builds and analyzes models of the User Interface (UI) and leverages multi-objective genetic search algorithms to repair layout accessibility issues. To evaluate the framework, I instantiated it to repair the different known types of layout accessibility issues in mobile apps. The empirical evaluation of these instantiations on real-world mobile apps demonstrated their effectiveness in repairing these issues. In addition, I conducted user studies to assess the impact of the repairs on UI quality and aesthetics. The results showed that the repaired UIs were more accessible and did not distort or significantly change the original design. Overall, these results confirm my dissertation's hypothesis that a repair framework employing a multi-objective genetic search-based approach can be highly effective in automatically repairing layout accessibility issues in mobile applications.
Asset Metadata
Creator
Alotaibi, Ali S. (author)
Core Title
Automated repair of layout accessibility issues in mobile applications
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Computer Science
Degree Conferral Date
2023-12
Publication Date
05/01/2024
Defense Date
09/20/2023
Publisher
Los Angeles, California (original), University of Southern California (original), University of Southern California. Libraries (digital)
Tag
accessibility,AI,analysis,engineering,Mobile,OAI-PMH Harvest,search-based,software,testing
Format
theses (aat)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Halfond, William G.J. (committee chair), Medvidović, Nenad (committee member), Raghothaman, Mukund (committee member), Ragusa, Gisele (committee member), Wang, Chao (committee member)
Creator Email
aalotaib@usc.edu,alisotaibi@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113762979
Unique identifier
UC113762979
Identifier
etd-AlotaibiAl-12448.pdf (filename)
Legacy Identifier
etd-AlotaibiAl-12448
Document Type
Dissertation
Rights
Alotaibi, Ali S.
Internet Media Type
application/pdf
Type
texts
Source
20231103-usctheses-batch-1104 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu