THE EFFECTS OF REQUIRED SECURITY ON SOFTWARE DEVELOPMENT EFFORT
by
ELAINE VENSON
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
COMPUTER SCIENCE
August 2021
Copyright 2021 Elaine Venson
This dissertation is dedicated to Eduardo, Luísa, and Felipe.
ACKNOWLEDGMENTS
Throughout the writing of this dissertation, I have received a great deal of support and
assistance.
I would first like to thank my supervisor, Professor Barry Boehm, whose expertise and
guidance were invaluable for achieving the results in this work. More than giving advice, he
exemplifies work ethics and generosity that are truly inspiring.
I would like to thank all in the wonderful CSSE family. I would particularly like to single
out Brad Clark, who motivated me to pursue this research topic, made himself available to
discuss the research, and provided the tools I needed to complete my dissertation. I am
also grateful for the kind and uplifting support from my CSSE colleagues Reem Alfayez and
Kamonphop Srisopha, especially during this time we were isolated because of the pandemics.
I would like to acknowledge Diana Baklizky and Mauricio Aguiar for first encouraging me
to come to USC and for their support during this whole journey. The contact with software
organizations in Brazil and the data collection would not have been possible without their
collaboration.
Finally, I could not have completed this dissertation without the love and care from my
family and friends who were always there for me.
Table of Contents
Page
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Statement of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . 3
Aims and Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Research Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Dissertation Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Chapter 1: Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1 Definitions for Security . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2 Cost-effectiveness of Secure Software Development . . . . . . . . . 9
1.3 Need for Secure Software Development Cost Estimation . . . . . . 13
1.4 Approaches to Estimating Costs of Secure Software Development . 15
1.4.1 COCOMO-based Models for Costing Secure Software . . . . 15
1.4.2 Other Models for Costing Secure Software . . . . . . . . . . 18
1.4.3 Additional Costs of Secure Software . . . . . . . . . . . . . . 20
1.4.4 Validation of the Models . . . . . . . . . . . . . . . . . . . . 21
1.4.5 Accuracy of the Models . . . . . . . . . . . . . . . . . . . . . 22
1.5 Measuring the Software Security Level . . . . . . . . . . . . . . . . 22
1.6 Sources of Cost in Secure Software Development . . . . . . . . . . 26
1.7 Open Issues and Opportunities . . . . . . . . . . . . . . . . . . . . 27
Chapter 2: Costing Secure Software Development - State of the Art and
Practice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2 Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.1 Systematic Mapping . . . . . . . . . . . . . . . . . . . . . . 35
2.3.2 Survey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.4 Systematic Mapping Results . . . . . . . . . . . . . . . . . . . . . 48
2.4.1 RQ1.1 Which papers describe research on software security
and its relation to costs? . . . . . . . . . . . . . . . . . . . . 49
2.4.2 RQ1.2 What are the major sources of costs in developing se-
cure software? . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.4.3 RQ1.3: What approaches were developed in academia to es-
timating the costs of security in software development projects? 55
2.4.4 RQ1.4: Which software security standards and engineering
processes are used in the studies? . . . . . . . . . . . . . . . 59
2.5 Survey Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.5.1 Background Information . . . . . . . . . . . . . . . . . . . . 61
2.5.2 Projects Characterization . . . . . . . . . . . . . . . . . . . . 64
2.5.3 RQ2.1 What approaches are used in industry to estimating
the costs of security in software development projects? . . . . 65
2.5.4 RQ2.2 What is the frequency and effort spent on software
practices in projects? . . . . . . . . . . . . . . . . . . . . . . 68
2.5.5 RQ2.3 How much effort/cost is added to a project due to the
application of security practices? . . . . . . . . . . . . . . . . 72
2.5.6 RQ3: How does practice compare to the state of the art? . . . . . 72
2.6 Limitations of the Study . . . . . . . . . . . . . . . . . . . . . . . . 74
2.7 Lessons Learned . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
2.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
2.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Chapter 3: Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
3.1 Research Goal, Questions, and Methods . . . . . . . . . . . . . . . 83
3.2 Phase II - The Rating Scale Development . . . . . . . . . . . . . . 84
3.2.1 Item Development . . . . . . . . . . . . . . . . . . . . . . . . 86
3.2.2 Scale Development . . . . . . . . . . . . . . . . . . . . . . . 87
3.3 Phase III - Research Design for the Secure Software Development
Cost Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.3.1 Analyze Existing Literature . . . . . . . . . . . . . . . . . . 92
3.3.2 Perform Behavioral Analysis . . . . . . . . . . . . . . . . . . 92
3.3.3 Determine Form of Model, Identify Relative Significance of
Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.3.4 Perform Expert Judgement, Delphi Assessment . . . . . . . . 93
3.3.5 Gather Project Data . . . . . . . . . . . . . . . . . . . . . . 96
3.3.6 Build and Validate Model . . . . . . . . . . . . . . . . . . . 96
3.3.7 Gather more data, Refine Model . . . . . . . . . . . . . . . . 99
3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Chapter 4: Secure Software Development Scale . . . . . . . . . . . . . . . . . . 100
4.1 Development of the Rating Scale . . . . . . . . . . . . . . . . . . . 100
4.2 Evaluation of the Rating Scale . . . . . . . . . . . . . . . . . . . . 107
4.2.1 Focus Group . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.2.2 Wideband Delphi . . . . . . . . . . . . . . . . . . . . . . . . 108
4.2.3 Online Delphi . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.3 Initial Estimates for the Scale . . . . . . . . . . . . . . . . . . . . . 114
4.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
Chapter 5: A Statistical Cost Model for Secure Software Development . . . 118
5.1 Data Set Description . . . . . . . . . . . . . . . . . . . . . . . . . . 118
5.2 Collinearity Test Results . . . . . . . . . . . . . . . . . . . . . . . . 122
5.3 Model Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
5.4 SECU Quantification . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.5 Multiple Regression Results . . . . . . . . . . . . . . . . . . . . . . 127
5.6 Model Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
5.7 Outliers Detection and Removal . . . . . . . . . . . . . . . . . . . 129
5.8 Multiple Regression Results for the Filtered Dataset . . . . . . . . 131
5.9 Model Validation for the Filtered Dataset . . . . . . . . . . . . . . 134
5.10 The Resulting Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.11 Results for the SECU Cost Driver . . . . . . . . . . . . . . . . . . . . . . . 137
5.12 Results by System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.13 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Chapter 6: Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.1 Findings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.1.1 Multipliers for Required Software Security Levels . . . . . . . 143
6.1.2 Relationship of Security with Other Variables . . . . . . . . 146
6.1.3 Outliers for Productivity and Security-related Effort . . . . . 146
6.2 Implications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
6.3 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6.4 Recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Chapter 7: Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Appendices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Appendix A: Data Collection Instruments . . . . . . . . . . . . . . . . . . . . . 169
A.1 Instrument for Collecting Expert Opinion . . . . . . . . . . . . . . 169
A.2 Instrument for Collecting Project Data . . . . . . . . . . . . . . . . 174
Appendix B: Model Initial Values . . . . . . . . . . . . . . . . . . . . . . . . . . 179
Appendix C: Regression Results for System Architecture-based Models . . 180
C.1 Web-Mainframe . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
C.2 Mainframe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
C.3 Client-Server, Web-Mainframe . . . . . . . . . . . . . . . . . . . . 183
C.4 Web . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
C.5 Client-Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
List of Tables
2.1 Security Practices Description [87] . . . . . . . . . . . . . . . . . . . . . . . . 34
2.2 Research questions for the mapping study . . . . . . . . . . . . . . . . . . . 36
2.3 Search String Versions and Sensitivity . . . . . . . . . . . . . . . . . . . . . . 40
2.4 Properties extracted from each paper . . . . . . . . . . . . . . . . . . . . . . 42
2.5 Search Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.6 Group Search Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.7 Venues used more than once for publications . . . . . . . . . . . . . . . . . . 50
2.8 Sources of cost for Secure Software Development . . . . . . . . . . . . . . . . 56
2.9 Approaches to Estimating Costs of Secure Software Development . . . . . . 57
2.10 Usage of SWSec Practices on the Group . . . . . . . . . . . . . . . . . . . . 64
2.11 How Participant Became Aware of SWSec Practices . . . . . . . . . . . . . . 64
2.12 Development Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.13 Summary Statistics of Team Size and Project Duration . . . . . . . . . . . . 66
2.14 SW Estimation Technique and Planning of SWSec Activities . . . . . . . . . 67
4.1 Descriptors for Attribute Levels . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.2 Items Detailed Description . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.3 Practices Grouped . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.4 Practices Summarized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.5 Scale Version After the Focus Group . . . . . . . . . . . . . . . . . . . . . . 109
4.6 Productivity Ranges for the First Round of the Wideband Delphi . . . . . . 110
4.7 Productivity Ranges for the Second Round of the Wideband Delphi . . . . . 110
4.8 Productivity Range Statistics in Round 1 and Round 2 of the Online Delphi 112
4.9 Summary Results for Increase in Application Size . . . . . . . . . . . . . . . 114
4.10 Productivity Range for the Delphi Sessions Combined . . . . . . . . . . . . . 116
5.1 Dataset Summary Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
5.2 Cost Drivers Used in the Dataset . . . . . . . . . . . . . . . . . . . . . . . . 122
5.3 Predictor Summary Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . 123
5.4 Expert Opinion Estimates for the SECU Cost Driver . . . . . . . . . . . . . 127
5.5 Coefficients for the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5.6 Model Fitness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5.7 Model Validation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.8 Model Coefficients for the Filtered Dataset . . . . . . . . . . . . . . . . . . . 134
5.9 Model Coefficients for the Filtered Dataset without SECU . . . . . . . . . . 135
5.10 Model Fitness for the Two Versions of the Dataset . . . . . . . . . . . . . . . 135
5.11 Model Validation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.12 New Cost Drivers’ Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.13 Comparison of SECU Rating Values . . . . . . . . . . . . . . . . . . . . . . . 138
5.14 Projects by System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.15 Regression Results by System Architecture . . . . . . . . . . . . . . . . . . . 140
5.16 Stepwise Regression Results by System Architecture . . . . . . . . . . . . . . 141
5.17 Comparison of SECU Multipliers by System Architecture . . . . . . . . . . . 142
A.1 Software System Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
A.2 Software Functional Component . . . . . . . . . . . . . . . . . . . . . . . . 175
A.3 Software Functional Component Size . . . . . . . . . . . . . . . . . . . . . . 178
B.1 Initial Values for the Cost Drivers . . . . . . . . . . . . . . . . . . . . . . . . 179
List of Figures
1 Research Phases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1 Decomposition of the security production function into two steps [25] . . . . 10
1.2 Cost Models Compared . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.3 Cost Models’ Rating Scales From Low to Super High . . . . . . . . . . . . . 21
2.1 Study Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.2 Papers by Year . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.3 Papers by Category and Pertinence . . . . . . . . . . . . . . . . . . . . . . . 50
2.4 Position in Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.5 Experience in SWSec Development and Academic Degree . . . . . . . . . . . 63
2.6 Sector and Organization Size . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.7 Frequency of Security Practices Usage . . . . . . . . . . . . . . . . . . . . . 68
2.8 Effort of Security Practices Usage (each time) . . . . . . . . . . . . . . . . . 69
2.9 Participants’ Usage of Security Practices in the Project . . . . . . . . . . . . 71
2.10 Percentage of Project Effort Dedicated to SWSec Practices Across Develop-
ment Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
2.11 Percentage of Project Effort Dedicated to SWSec Practices Across Sectors . 74
3.1 Scale Development Phases and Steps . . . . . . . . . . . . . . . . . . . . . . 86
3.2 Modeling Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.3 The Online Delphi Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.4 Statistical Model-Building Process (adapted from [24]) . . . . . . . . . . . . 97
4.1 Security Practices Summarizing Steps . . . . . . . . . . . . . . . . . . . . . . 104
4.2 Productivity Range Distribution for the Requirements and Design group . . 112
4.3 Productivity Range Distribution for the Coding and Tools group . . . . . . . 113
4.4 Productivity Range Distribution for the V&V group . . . . . . . . . . . . . . 113
4.5 Effort Multipliers per Security Level Calculated from the Productivity Range 115
4.6 Effort Multipliers per Security Level Calculated from the Productivity Range 116
5.1 Function Points Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
5.2 Effort Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
5.3 Correlation Between Size and Effort . . . . . . . . . . . . . . . . . . . . . . . 121
5.4 Pairwise Correlation for the Predictors . . . . . . . . . . . . . . . . . . . . . 124
5.5 Pairwise Correlation for the Predictors with the new aggregate cost driver
EXPE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.6 Correlation Between SECU and Other Predictors . . . . . . . . . . . . . . . 125
5.7 Productivity Distribution in the Dataset . . . . . . . . . . . . . . . . . . . . 130
5.8 Security Effort Distribution in the Dataset . . . . . . . . . . . . . . . . . . . 131
5.9 Scatter Plot of Size vs Effort with Outliers Highlighted . . . . . . . . . . . . 132
5.10 Distribution of the Productivity Variable After Outliers Removal . . . . . . . 132
5.11 Distribution of the Security Effort Ratio Variable After Outliers Removal . . 133
5.12 SECU Coefficient Estimates with 95% Confidence Intervals . . . . . . . . . . 137
5.13 Comparison of Coefficients by System Architecture . . . . . . . . . . . . . . 139
6.1 Comparison of Models’ Security Multipliers . . . . . . . . . . . . . . . . . . 145
A.1 Data Collection Form - Wideband Delphi . . . . . . . . . . . . . . . . . . . . 170
A.2 Data Collection Form - Online Delphi - Round 1 . . . . . . . . . . . . . . . . 171
A.3 Data Collection Form - Online Delphi - Round 2 - Page 1 . . . . . . . . . . . 172
A.4 Data Collection Form - Online Delphi - Round 2 - Page 2 . . . . . . . . . . . 173
ABSTRACT
Software development teams are under pressure to adopt security practices in their projects
in response to cyber threats. Despite the effort required to perform these activities, the
few proposed cost models for security effort do not consider security practices as input and
were not properly validated with empirical data. This dissertation aims at examining the
effects of applying software security practices to the software development effort. Specifically,
it quantifies the effort required to develop secure software in increasing levels of rigor and
scope.
An ordinal scale to measure the degree of application of security practices was developed.
The scale items are based on the sources of cost for secure software development, captured
through a systematic mapping and a survey with security experts. Effort estimation experts
and software security experts evaluated the scale and provided initial estimates for the pro-
ductivity range through Wideband Delphi and online Delphi sessions. Finally, a statistical
model to quantify the security effort was built based on the estimates and on a dataset with
projects from the industry. The model calibration showed that the application of software security practices can increase the estimated effort by 19%, at the first level of the scale, and by up to 102%, at the highest level of the scale.
These results suggest that the effort required to develop secure software is lower than it
was estimated in previous studies, especially when considering the domain of Information
Systems software. This research builds on previous works on secure software cost models
and goes one step further by providing an empirical validation for the required software
security scale. The resulting model can be used by practitioners in this area to estimate
proper resources for secure software development. Additionally, the validated multipliers are
an important piece of information for researchers developing investment models for software
security.
Introduction
Software is increasingly pervading our daily lives, bringing benefits in all areas of human
activities. Nevertheless, as we become more dependent on it, concerns about the harmful
impact of systems’ flaws and misuse are raised, especially when they are security-related.
News of data breaches and other security issues has appeared constantly in the media,
and computer security has become an area of interest in industry and academia [77, 5, 4, 70,
47].
Although security has been considered important in software projects for a long time
[23], it was not a priority for most of the systems developed in the 20th century, as empha-
sis was placed on transforming advances in performance and capability into more features
[99]. However, beginning in the year 2000, a substantial increase in computers’ connectiv-
ity, ever-larger and ever-more complex information systems, and new flexible and extensible
architectures have contributed to a growing number of software vulnerabilities [78]. The
first books and academic classes on the topic of Software Security appeared in 2001 [76].
And on the industry side, organizations have published recommended security practices to
be applied in the software development processes [87]. The “building security in” mentality
of Software Security (SWSec), which is the idea of developing software that continues to
work correctly under malicious attack, has been evolving since then, as one response to the
security problems [78, 54].
The Software Security approach consists of incorporating security practices during the
software development life cycle (SDLC), distinguishing itself from the traditional approach
to security, which focuses on protecting software applications by building security around them
with network-based security tools. An emerging community of Secure Software Engineering
(SSE) researchers advocates for the improvement of software development processes and
activities in order to deliver products that are less vulnerable to security problems and
easier to protect [55, 129].
SSE researchers argue that post-development approaches to defending against attacks are
not sufficient, because many security problems are rooted in the way we build software [78, 55]. Trend analyses point out that implementation errors are the major source of vulnerabil-
ities, accounting for two-thirds of the total [72]. Threats have targeted the application layer
of the protocol stack, which is less protected by operational security tools [52]. Attacks like
SQL injection, which allows an attacker without credentials to gain direct access (read, write,
delete) to the tables of the back-end database, occupy the top positions of most frequent
vulnerabilities lists [32, 72, 41].
While identifying and fixing vulnerabilities has typically been carried out at the end of the
development cycle or even after deployment in network-centric approaches to security [78,
55], prior empirical research in Software Engineering has demonstrated the cost-effectiveness
of resolving flaws and bugs on early phases [23, 20, 117]. Motivated by those findings and
inspired by advances on Information Security Economics (ISE), the SSE community has
studied the cost-effectiveness of applying SWSec practices since the beginning of software
development projects to prevent vulnerabilities [52, 17, 56, 92, 31, 54, 100]. By identifying
vulnerabilities early in the SDLC, SWSec can reduce losses from security incidents, decrease
system’s downtime, and save operational effort/costs [52].
Statement of the Problem
Despite the efforts of several organizations in providing developers with practices for secure
software development [78, 75, 111, 97], Secure Software Development Life Cycles (S-SDLC)
have not been broadly embraced by the industry so far [32]. Practitioners often point to the costs as a barrier to the adoption of such practices, although studies show that there is a lack
of knowledge about the amount of resources needed to achieve a determined level of security
assurance [134, 56, 54, 45].
Current models for software effort prediction do not consider the security factor and may
cause inaccurate estimations, which affects the planning of required resources for software
project development. A few cost models were proposed to predict effort considering security
activities, but they do not take into account current software security practices and were
not sufficiently validated with empirical data [134, 126]. Verendel [127] points to the lack of
repeated large-sample empirical studies as an obstacle to validate security quantification
models.
While the amount of effort/cost required to incorporate security practices in software
development is still an open issue, studies demonstrate that this effort cannot be neglected
[31, 126, 56]. A survey with security professionals and developers on LinkedIn showed that
practitioners largely use expert judgement and work breakdown methods to plan project
resources, yet at the same time many projects fail to include some of the practices that
are used in the planning or even do not include any practice at all [126]. These facts
raise the concern that estimations are not providing enough resources for secure software development, which, in addition to the lack of a security culture among developers, managers, and business stakeholders, contributes to the increasing number of implementation and design
vulnerabilities found in software systems [72].
In summary, the overall problem addressed by this dissertation is:
Security threats urge software practitioners to include security practices in the
software development lifecycle. However, the few proposed cost models for
security effort do not consider security practices and were not properly
validated, challenging the resulting estimates, which, in turn, hinders
cost-effectiveness analysis and resource planning for software projects.
Aims and Objectives
In order to support cost estimation for secure software development and cost-effectiveness
evaluations, this dissertation aims at quantifying the effects of secure software development
on the development effort. To achieve this, the main objectives are to develop a scale
for measuring the usage degree of software security practices; and to build and validate a
security cost estimation model, based on the COCOMO framework, using data collected
from expert opinion and industry software projects with varying levels of security.
The quantification of software security effort, according to the particular needs of each
project, will allow business stakeholders, developers and managers to understand and agree
on the amount of security resources to be allocated in projects. Such information can improve
software security by leveraging the adoption of security practices. Furthermore, research on
cost-effectiveness of software security relies on information regarding the costs of applying
security practices and countermeasures. Thus, it will be beneficial for both industry and
academia to be able to estimate the costs driven by software security.
Research Questions
The research questions are centered on determining if there is and what is the increase in
software development effort caused by the incorporation of security practices:
• RQ1: How to measure the growing levels of secure software development?
• RQ2: What is the increase in development effort caused by growing levels of secure
software development?
The claim is that increasing levels of required security, which are determined by growing
security risks, will also raise software development effort or cost. The quantification of
the model’s parameters will indicate the magnitude of the effect that security has on the
development effort.
Dissertation Overview
This thesis is divided into three phases, as shown in Figure 1. Phase I comprises two preliminary studies that were conducted to provide an understanding of the sources of cost for
software security and to support the formulation of the research questions. The results of
these works are presented in Chapter 2.
Phase II, which addresses research question 1, is concerned with constructing an ordinal
scale to establish the levels of secure software development. The methodology and procedures selected are explained in the first part of Chapter 3 and the results are reported in Chapter
4.
To answer research question 2, Phase III focuses on building and evaluating a secure software development cost model. The methods used for developing the model are described in the second part of Chapter 3 and the results are detailed in Chapter 5.
Figure 1 Research Phases: December 2017 to July 2021. Phase I - systematic literature review and survey with practitioners; Phase II - establish a rating scale for security; Phase III - collect data from experts, industry, and OSS, build the statistical model and test the hypothesis, and evaluate the scale and model. Results: security rating scale, datasets, research report, and thesis defense.
Complementing this structure, Chapter 1, Literature Review, presents a review of the literature on
the topics related to this dissertation; Chapter 6, Discussion, explains the findings and
limitations of the research and provides recommendations of studies to follow; and, finally,
Chapter 7, Conclusion, summarizes and reflects on the results obtained in this dissertation.
Chapter 1
Literature Review
This chapter provides background definitions and presents a review of studies that analyze
the cost-effectiveness of secure software development and the need for security cost models.
Next, it describes and compares the existing cost models for security effort in software
development, presents the sources of cost, and explores the issues regarding the measurement
of security in the context of software development. Last, the open issues and opportunities
that were drawn from the literature are summarized.
1.1 Definitions for Security
Cyber Security is currently the most used term to designate the security aspects of the
digital world [112]. Schatz, Bashroush, and Wall [112] reviewed professional, academic, and
government literature to define Cyber Security as:
“The approach and actions associated with security risk management processes
followed by organizations and states to protect confidentiality, integrity and avail-
ability of data and assets used in cyber space.”
While Cyber Security is used in a broad context, Software Security has been defined as
an approach focused on the software development life cycle (SDLC). Many authors refer to
McGraw’s definition of Software Security as "the idea of building software that continues to
function properly under a malicious attack" [76].
Throughout this research other terms related to Software Security were found in the
literature, such as:
• Security Engineering (SE):
“Security engineering (SE) processes can be defined as the set of activities performed
to develop, maintain and deliver a secure software product; security activities may be
either sequential or iterative" [7].
“Security engineering activities include those activities needed to engineer a secure so-
lution. Examples include security requirements elicitation and definition, secure design
based on design principles for security, use of static analysis tools, secure reviews and
inspections, and secure testing methods" [40].
• Secure Software Engineering (SSE):
“Simply put, secure software engineering is not necessarily engineering security soft-
ware. SSE seeks to apply processes, principles, and methods to build vulnerability free
software, software that remains in a secure state under attack and continues to provide
service to authorized users" [52].
• Secure Software Development:
“According to the concepts of secure software development, software developers must
ensure that the developing software protects system and user assets against possible
security threats" [103] (as cited in Hedayatpour, Kama, and Chuprat [51]).
“Secure software development (otherwise known as ‘SWSec’ or secure software en-
gineering) leverages software engineering practice and risk management to raise the
quality of security decision making in all phases of software and system development"
[76] (as cited in Heitzenrater and Simpson [56]).
• Security by Design Paradigm:
“Security by Design paradigm reflects a systematic awareness for an integration of
security issues as an important SW quality criteria, during the whole lifecycle of a SW
and implies the design of SW’s security right from the start of the SW development"
[31].
As a common ground, these definitions establish the scope of software security as the
specific practices that yield security and are performed within the SDLC. They are used
interchangeably in this dissertation.
1.2 Cost-effectiveness of Secure Software Development
In secure software development, security competes for a slice of a limited budget - adding
more security usually means removing features from the scope. While the growing field of
Software Security provides technical solutions to address current security problems, financial
issues are still a barrier to their effective introduction in projects [56]. Research on cost-
effectiveness of secure software development explores the balance between security investment
and benefits.
The benefits of applying security practices early in the SDLC are frequently discussed,
but studies mostly present anecdotal evidence. Such benefits are considered as a motivation
in many studies, which state that security can be improved and the total cost of a software
system can be reduced with the appropriate allocation of resources in the early stages of
the software project [92, 52, 55, 31, 54, 137]. The total cost is reduced because security
defects are found and fixed close to their point of introduction. If the same security issues
are left to be found during testing and operation, the costs to repair will be much higher.
This statement is drawn as an analogy with known studies in Software Engineering, which
observed the realization of cost savings when problems are detected and fixed early in the
lifecycle [24]. Besides considering the savings in security patching, authors often relate
benefits to the avoided risks related to vulnerabilities. Researchers note that more secure software implies fewer losses from downtime and lower recovery costs from attacks [52, 54].
Market value is also cited, as a software product delivered with vulnerabilities may cause
customer dissatisfaction, reputation loss, and lost sales [137].
An important question to be answered in the context of the cost-effectiveness of secure
software development is how much to invest in order to get the benefits. Pfleeger and Rue
[101] say that software project managers need better data to support their decision-making
about security. Based on the results of diverse surveys, they urge for the development of
models that can answer how much to invest in cybersecurity.
Böhme [25], backed by a set of information security investment models, argues that
security investments exhibit decreasing marginal returns. He defends the introduction of
the security level variable as an intermediate factor to lessen the degree of abstraction of
such models. Figure 1.1, extracted from his paper, shows the decomposition of the security
production function into two steps: costs mapped to a security level, and the security level
mapped to benefits.
Figure 1.1 Decomposition of the security production function into two steps [25]
The security level, in this model, represents the quality of the security protection. The
security productivity, which can vary, is related to the efficiency of the security technology
and its ability to mitigate risk [25].
Bringing these concepts to Secure Software Engineering, Heitzenrater and Simpson [56]
discuss the challenges that need to be faced to realize economically-informed secure software
practices. Their paper, entitled ‘A Case for the Economics of Secure Software Development’,
makes a call for the establishment of an economics of secure software development. Heitzen-
rater, Bohme, and Simpson [55] defend that secure software engineering is a valid security
investment that decreases overall security expenditure. They extend one investment model
from Information Security research (named IWL, or Iterated Weakest Link), proposing an initial model that captures secure software development investment (IWL-SSE). According to the model, investing in security early in the lifecycle reduces the defender's uncertainty regarding the vulnerabilities present in the software and at the same time increases the costs to the
attacker. To evaluate the model, the authors propose the Return on Secure Software Process
(ROSSP) metric, defined as the difference between the Return on Security Investment (ROSI)
with secure development and ROSI without secure development. Based on a hypothetical
scenario, they find an optimal point of investment where the ROSSP is 11.1. Despite the
interesting results, the model relies on simplifying assumptions about the costs of security practices and their effectiveness, and lacks empirical application.
Another example of an investment model is the jump-diffusion approach, proposed by
Zheng et al. [137]. For them, the investment value of security is that it diminishes the
number of vulnerabilities and avoids damages caused by malicious attacks. If no security
investment is made, the emergence of vulnerabilities will change the value of the asset in a
discrete contingency. The announcement of a vulnerability is a jump in the model, which
will negatively impact the software market. The model was evaluated with hypothetical
numerical examples, showing promising results.
Empirical studies in this area are scarce, mainly due to the difficulty in acquiring data.
Chehrazi, Heimbach, and Hinz [31] conduct an empirical study with Open Source Soft-
ware (OSS) to demonstrate that there is a positive relation between considering security
in the early stages and the project's success. Their research combines information gathered through a survey with project leaders and information obtained from the projects' repositories. Nunez, Lindo, and Rodriguez [91] present a case study in an industrial setting, in which
they compare the development of two modules of the same software project, performed by
the same team. One module, M1, was developed testing security only at the end of the
lifecycle, named ‘Classic Scenario’, and another module, M2, was developed following a se-
cure development process, named ‘Emerging Scenario’. The results show that the number
of vulnerabilities was reduced by 66% in the ‘Emerging Scenario’. The criticality of the vulnerabilities was also significantly reduced. However, the study does not present clear
information about the size of each module, hindering the comparison. In terms of effort, the
percentage of effort dedicated to security was almost the same, 11.2% for M1 and 11.7% for
M2, with the difference that for M1 the effort was spent at the end of the development, and
for M2, the effort was distributed over the development process.
1.3 Need for Secure Software Development Cost Estima-
tion
Despite the efforts in showing that secure software development is cost-effective, costs are
often considered as a hurdle. According to Heitzenrater and Simpson [56], secure software
development can be costly because of additional processes and tools that require expertise,
time, and resources from the development organization. Software security practices applied
over the software lifecycle are pointed out as the main sources of cost for software security
[126, 125].
The importance of estimating the effort and planning for security is addressed in some
papers. Peeters and Dyson [100] raise the problem of prioritization of quality requirements,
such as security, in agile development. According to them, Agile’s fundamental goal is to
deliver software ‘fit for purpose’, instead of grand technical solutions, but the lack of estima-
tion and adequate planning for quality requirements causes problems. Agile developers tend to hide quality costs in the estimates of user stories, but this practice is harmful when protecting a software system from serious threats, which usually requires more resources. As
a solution, they defend the adoption of abuser stories (the security extension of user stories)
as a way to make security requirements explicit, thus helping in planning and estimating
efforts for secure software development [100]. Similarly, Heitzenrater and Simpson [53] ap-
proach the problem of making the costs of security explicit by applying economic utility functions within the development of negative use cases. Negative use cases, like abuser
stories, abuse cases, or misuse cases are applied at requirements and architectural level in
order to represent scenarios where an attacker tries to use the software system for malicious
purposes [78, 53]. By integrating economic factors, negative use cases become a means of
quantitative justification and provide a trade-space for dealing with resource constraints [53].
However, these approaches rely on developers having the adequate knowledge to define the
negative use cases and estimate the effort to implement the respective countermeasures.
A survey with 46 organizations at two security conferences indicated that for 23.9% of
the companies, formal secure software development lifecycles are too time consuming [45].
On the other hand, software security vendors say that the cost of implementing secure
development methodologies is much lower than developers assume [45]. At the
project level, stakeholders also often disagree. Heitzenrater and Simpson [56] observe that
security professionals often consider that they do not have enough resources to deal with
security in software projects. The survey with practitioners revealed a perception that it
is difficult to allocate effort for including security practices in the projects, mainly due
to lack of a security culture from developers, managers, and business stakeholders [125].
Further results of the survey reveal that the effort dedicated to security activities in new development projects is around 20% (median) of the total costs [125]. However, the planning
and estimation of such tasks is problematic. For more than 40% of the projects selected by
the participants, not all security activities, or even none at all, were taken into account during
planning. Also, as most of the projects applied Expert Judgment and Work breakdown as
estimation techniques, this means that many projects do not receive the resources needed for
properly implementing security [125]. These examples show that there is a lack of knowledge
and data about the costs of secure software development, which hinders the decision-making
of project stakeholders.
1.4 Approaches to Estimating Costs of Secure Software
Development
The systematic mapping of the literature revealed the existing models for estimating effort
for developing secure software. Most of the approaches are COCOMO II-based models, but
there are some other methods as described in the next subsections.
1.4.1 COCOMO-based Models for Costing Secure Software
The first attempt to introduce the security factor in a software cost model was made by
Reifer, Boehm, and Gangadharan [105], motivated by the security risks of incorporating
COTS software in critical systems. They describe an effort in enhancing COCOMO II to
modeltheimpactofsecurityondevelopmenteffortandduration, andrelateittoaframework
that estimates effort for COTS-based development (COCOTS). The study proposes the use
of an optional cost driver for security in COCOMO II, built upon the knowledge obtained
around the Common Criteria (CC) standard [35]. A Delphi exercise was conducted to obtain
expert-based values for the security cost driver (results are presented in Table 2.9). Such
ratings apply for the glue code, i.e., the code developed to integrate the COTS components
into the large application. The total cost of incorporating a COTS package is given by adding the glue code effort, the COTS tailoring effort, and the COTS assessment effort. The
result is expressed in a new security cost driver for the COCOTS model [105].
Some years later, Colbert and Boehm [34] presented COSECMO, an extension to CO-
COMO II, to account for the development of secure software. COSECMO provides the total
effort for the development of a new software system whose security is assured as the sum of
the effort to develop security functions and the effort to assure that the system is secure.
The effort to assure that the system is secure is computed according to an assurance level
(%Effort(AL)). Such assurance effort level is obtained through a new cost driver for security
assurance activities, named SECU, whose levels are based on the Common Criteria (CC)
Evaluation Assurance Levels (EAL) to indicate the security level of the software. For EAL
1 and EAL 2, no additional effort is considered. From EAL 3 to EAL 7, the additional effort
is computed with the exponential formula [34]:
%Effort(EAL) = %Effort_3 × SECU^(EAL - 3)   for EAL >= 3   (1.1)
The resulting additional effort percentage for security thus depends on the value of SECU, which was defined as 2.5, and on %Effort_3, calculated according to data provided by a real-time operating system's developer. For a 5 KSLOC software system, %Effort_3 was defined as 20%. The percentages of added effort for this model, presented in Table 2.9, were calculated by the authors for different software sizes ranging from 5 KSLOC to 1,000 KSLOC.
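To illustrate how quickly this surcharge grows, the sketch below evaluates Equation 1.1 with the values reported above (SECU = 2.5 and %Effort_3 = 20% for the 5 KSLOC case); it transcribes only the formula, not the rest of the COSECMO model.

```python
# Sketch of COSECMO's added-effort formula (Equation 1.1) for the 5 KSLOC case:
# %Effort(EAL) = %Effort_3 * SECU^(EAL - 3) for EAL >= 3, no surcharge below EAL 3.

def cosecmo_added_effort(eal, pct_effort_3=20.0, secu=2.5):
    """Additional security-assurance effort, in percent, for a given CC EAL."""
    if eal < 3:
        return 0.0
    return pct_effort_3 * secu ** (eal - 3)

for eal in range(1, 8):
    print(f"EAL {eal}: +{cosecmo_added_effort(eal):.1f}% effort")
# EAL 3..7 yield +20%, +50%, +125%, +312.5%, and +781.25%, i.e. effort multipliers
# of 1.20, 1.50, 2.25, 4.13, and 8.81 -- the COSECMO (min) column in Figure 1.2.
```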
The next COCOMO-based model was proposed by Lee, Gu, and Baik [74] to estimate
systems in the defense domain. Their study extends the seven-step modeling methodology
of COCOMO II [24] and extracts cost factors related to the defense domain to develop a
specialized software cost estimation model. Security is one amongst the five factors found for
the defense domain, which also include Interoperability, Hardware and software development
simultaneity, Hardware emulator quality, and Hardware precedentedness. This study was
an attempt to empirically validate a proposed model for software security, employing 73
data points from embedded software development projects on weapon systems in Korea.
Authors reported a mean magnitude of relative error (MMRE) value of 0.566 and number of
predictions within 30% of the actual, PRED(30), of 37 [74]. However, the coefficient for the
Security predictor did not achieve statistical significance, thus the value presented in Table
2.9 refers to the expert opinion collected through a Delphi exercise.
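For reference, MMRE and PRED(N) can be computed as in the sketch below; these are the standard definitions (mean magnitude of relative error, and the percentage of estimates whose relative error is at most N%), applied here to hypothetical data rather than the dataset of the cited study.

```python
# Standard definitions of the accuracy metrics quoted above (illustrative data only).

def mre(actual, predicted):
    """Magnitude of relative error for one project."""
    return abs(actual - predicted) / actual

def mmre(actuals, predictions):
    """Mean magnitude of relative error across projects."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)

def pred(actuals, predictions, n=30):
    """Percentage of predictions within n% of the actual effort."""
    within = [mre(a, p) <= n / 100 for a, p in zip(actuals, predictions)]
    return 100 * sum(within) / len(within)

# Hypothetical actual vs. predicted effort values (person-months).
actuals = [120, 80, 200, 60]
predictions = [100, 95, 150, 62]
print(f"MMRE = {mmre(actuals, predictions):.3f}, PRED(30) = {pred(actuals, predictions):.0f}%")
```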
Yang, Du, and Wang [134] reviewed previous estimation models for secure software de-
velopment and proposed a model based on COCOMO II, but customized for a Chinese IT
security standard. The model adds the cost driver SECU as a multiplier in the original
COCOMO II equation. Five rating values are defined for SECU, according to the security
degree of the software system to be estimated, which is mapped from the CC EAL. Authors
conducted a mini-Delphi including themselves and industry experts to establish the values for the rating scale. The values for this model are presented in Table 2.9 (Secure OS software cost
model). The study evaluates the model by comparing the actual effort to develop a secure
operating system with the predicted effort generated by the proposed model. The compari-
son is also made with estimations generated by the original COCOMO II model, the original
COSECMO model, and local calibrations of these two models. The relative error (RE) for
the proposed model resulted in 0.41 (prediction was underestimated by 41%). The second
best estimation was predicted by the local calibration of COSECMO, for which RE was
-0.53 (prediction was overestimated by 53%). On the other hand, the original COSECMO
model provided an estimate with the largest RE among the models compared, almost 8 times
overestimated (RE of -7.93) [134]. The authors argue that this difference can be explained by
three factors: (1) The secure OS system that they used for evaluation might be much smaller
than the product used to calibrate COSECMO; (2) the software industry in China presents
a higher productivity; and (3) differences in government software acquisition process from
China and US, which affects the development process.
1.4.2 Other Models for Costing Secure Software
Apart from COCOMO II-based models, five other studies found through the systematic
mapping present approaches related to estimating the costs of software security. However, except for the function points extended model, all other approaches have a more restricted scope of application compared to the COCOMO II-based models, which can be used to estimate the costs of a complete secure software development lifecycle.
Abdullah et al. [1] extend the Function Point Analysis (FPA) method for software sizing
to address the software security attribute. A software security characteristics formulation
was developed considering four common security standards and a survey with developers.
The score for the security attribute is determined according to the number of security char-
acteristics employed by the software project. The degree of influence for security can vary
from 0 to 5. When the degree is maximum (i.e., a score of 5), the function point count will be increased by 5%.
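A minimal sketch of that size adjustment follows, assuming the usual FPA convention that each degree of influence contributes one percentage point, so the maximum degree of 5 yields the 5% increase described above; the full security characteristics formulation of [1] is not reproduced here.

```python
# Sketch of the FPA security extension described above: a security degree of
# influence between 0 and 5 increases the function point count by 0% to 5%.

def security_adjusted_fp(unadjusted_fp, security_degree_of_influence):
    """Apply the security degree of influence (0-5) as a 1%-per-degree size increase."""
    if not 0 <= security_degree_of_influence <= 5:
        raise ValueError("security degree of influence must be between 0 and 5")
    return unadjusted_fp * (1 + 0.01 * security_degree_of_influence)

print(security_adjusted_fp(400, 5))  # 420.0 -> a 5% increase at the maximum degree
```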
Another software sizing metric, COSMIC Function Points (CFP), was applied to estimate the effort to fix vulnerabilities, as proposed by Dammak, Jedidi, and Gargouri [38]. The remediation activities of each vulnerability are described using the four CFP data movements,
which are summed to obtain a CFP total. This value is then combined with the severity
impact to help prioritize vulnerabilities to be fixed.
The prediction of remediation costs based on Machine Learning was proposed by Oth-
mane et al. [93]. Their study investigates the major factors that impact the time to fix
security issues based on data collected automatically within the secure development process
of the German company SAP. As a result they observed that the software structure, the
fixing processes, and the development groups are the dominant factors that impact the time
spent to address security issues.
Dashevskyi, Brucker, and Massacci [39] defined three security effort models – centralized,
distributed, and hybrid, focused on the costs of maintaining consumed free open source software (FOSS) components. The authors used data from 166 FOSS components to identify factors
that impact the maintenance effort. Only variables that could be automatically or semi-
automatically obtained were considered. They found that the most significant factors were
lines of code and the age of the component.
Already mentioned in section 1.2, the software security investment model proposed by
Heitzenrater, Bohme, and Simpson [55] defines an equation to account for the estimated
costs of secure software development. The equation defines the defender investment for the
Architecture and Design (AD) and Implementation (IT) phases as:
I_{AD;IT} = (i × c) + i(eff × e)   (1.2)
where i is the number of security reviews or security test iterations, c is the individual cost of the security practice, eff is the effectiveness of the practice (probability of finding a security issue via review or test), and e is the cost of repairing the security flaws or bugs.
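A small worked example of Equation 1.2 follows, using hypothetical inputs (three test iterations, a per-iteration practice cost of 10 effort units, a 40% chance of finding an issue per iteration, and a repair cost of 25 units); the numbers are illustrative and not taken from [55].

```python
# Worked example of Equation 1.2: defender investment in the Architecture and
# Design (AD) and Implementation (IT) phases, I = (i * c) + i * (eff * e).
# All input values below are hypothetical.

def defender_investment(i, c, eff, e):
    """i: review/test iterations, c: cost of one application of the practice,
    eff: probability of finding an issue per iteration, e: cost to repair an issue."""
    return (i * c) + i * (eff * e)

print(defender_investment(i=3, c=10.0, eff=0.4, e=25.0))  # 3*10 + 3*(0.4*25) = 60.0
```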
1.4.3 Additional Costs of Secure Software
Figure 1.2 compares the models' ratings, showing that the additional effort for security can vary widely according to the model and the level of security.
Even at the same level, the differences are considerable. For example, for a high level of
security, the models add 27% (COCOMO II security extensions), 20%-80% (COSECMO),
87% (Weapon systems), and 25%-50% (Secure OS software cost model) to the project effort.
These differences are illustrated in Figure 1.3, which excludes COSECMO (max) and the Ultra High level for better visualization.
Figure 1.2 Cost Models Compared (effort multipliers by security rating level):

                                 Low    Nominal  High   Very High  Extra High  Super High  Ultra High
COSECMO (max)                           1.00     1.80   3.00       6.00        13.50       32.25
COSECMO (min)                           1.00     1.20   1.50       2.25        4.13        8.81
Secure OS sw cost model (max)           1.00     1.50   2.00       2.75        3.75
Secure OS sw cost model (min)           1.00     1.25   1.75       2.00        3.00
Weapon systems cost model               1.00     1.87
COCOMO II security extension     0.94   1.02     1.27   1.43       1.75
Figure 1.3 Cost Models’ Rating Scales From Low to Super High (the same multipliers plotted without COSECMO (max) and the Ultra High level)
1.4.4 Validation of the Models
Table 2.9 also shows that most of the approaches were not empirically validated. Only one
study, Vulnerability fixing time prediction [39], performed cross validation and another study,
Secure OS software cost model [134], verified the model through a case study. The Weapon
systems development cost model study [74] presented an attempt to calibrate a model using
data from the defense domain; however, the data sample size was not sufficient to achieve
statistical significance.
For Yang, Du, and Wang [134], there is a vicious spiral: since models like COCOMO
II and FPA are not applicable to secure software development, estimators do not use them
and, as a consequence, do not produce historical data, which prevents models from being
validated or improved. Lee, Gu, and Baik [74] point out that for weapon systems, the long
development period hinders data collection, making it difficult to validate the model.
1.4.5 Accuracy of the Models
Regarding the accuracy of the validated models, the reported studies show some improvement
over original models. Lee, Gu, and Baik [74] compared their Weapon systems cost model
with the original COCOMO II and a local calibration, and showed enhanced values for
accuracy metrics, although the Security predictor was not significant in their model. Yang,
Du, and Wang [134] also compared their secure OS software cost model with the original
COCOMO II, a local calibration of COCOMO II, original COSECMO and a local calibration
of COSECMO, but for one project only. Their customized model achieved the smallest relative error, which, according to the authors, demonstrates the positive effects of the proposed
model.
1.5 Measuring the Software Security Level
Any cost model intended to provide estimates for secure software development will have as
input some measure to indicate how much security is being considered. Some models use
a count of some attribute to express the level of security, but they simplify assumptions
over the security scope. For example, the IWL-SSE model uses the number of iterations
of security verification and testing; the FPA security extension uses the count of security
activities executed, selected from a catalog of practices. Seeking to provide a broader scope,
most of the security cost estimation models, described in section 1.4, use the Common
Criteria standard (CC) Evaluation Assurance Levels (EAL) [36] as a rating scale to specify
the degree of security required by the application.
CC is an international standard that provides a framework to (1) specify security require-
ments and claims, and (2) evaluate and certify products according to the specification
[120]. The Evaluation Assurance Levels (EAL1 through EAL7) define a scale for measuring
the level of assurance of the system during a CC security evaluation. One of the most impor-
tant benefits of the framework is that it allows users to compare IT products according to
the implementation of security features. Thus, the standard has been mostly used to assess
and certify the security of IT Products [64].
Despite the importance of CC for Information Security, discussions around the suitability
of such a standard for secure software development have been raised. CC is sometimes criticized
because of difficulties related to comparability, “point in time” certification, concerns for
mutual recognition and the high costs of certification [73, 64]. Duncan and Whittington [44]
argue that compliance with standards does not necessarily leads to security, and add that a
more intelligent approach is to focus on internal process and commit with a security mindset.
CC evaluations are also expensive and take time. The CC assurance is heavily focused
on documentation, which must follow a specific structure, as required by the certification
process, requiring a reasonable effort [16]. According to Tierney and Boswell [120], a eval-
uation for EAL4 typically takes six to nine months, and higher level evaluations can take
significant longer.
While CC represents a foundation for the security evaluation of government and commercial
products, whose producers can justify allocating a considerable budget to security, it is hardly
considered by software developers who are seeking to introduce security practices in their software
development process. CC is a product evaluation focused standard that can provide some secure
software development guidance. EALs are defined around the depth and rigor of design,
tests, and reviews of security features. However, CC does not address secure software development
as directly as other models that were conceived around it, such as the Microsoft Security
Development Lifecycle (SDL) [83], the Building Security in Maturity Model (BSIMM) [84],
and the Software Assurance Maturity Model (SAMM) [97].
BSIMM and SAMM are maturity models that reflect how organizations evolve in the application
of security practices. Version 10 of BSIMM is a result of the observation of the
use of practices in 122 firms [84]. BSIMM presents a framework with 12 main practices,
whose maturity evolves in three levels. SAMM was developed as an open framework to help
organizations reduce security risks by engaging in security practices. Similar to BSIMM,
SAMM describes 12 main practices in three maturity levels. BSIMM and SAMM also provide
a measurement scale related to software security. However, unlike CC, the scales of
BSIMM and SAMM propose to measure the maturity level of the security practices
performed by an organization. In both models, such a score, which ranges from 0 to 3, is
assigned individually to each security practice, expressing the level of sophistication with which
the activities are performed, rather than the maturity of the organization.
Another issue with rating scales such as EAL is that they were not developed or assessed
regarding their ability to correctly measure the required security in software projects.
Measuring software security, and, consequently, defining such degrees of security for software
development, is still an open issue. In a recent study, Saarela et al. [110] studied security metrics
and concluded that theoretical and practical issues make the current metrics unsuitable
for use in daily engineering processes. For Zalewski et al. [135], the development of security
metrics is challenging, considering that 'refining and adjusting the concepts of computer
security assessment may take decades and in fact is a challenge for the entire generation'.
The difficulty in developing security metrics is related to the challenges in defining security
requirements. It is necessary to know a property to properly measure it. Security
requirements are derived from security goals expressed by stakeholders, which traditionally
involve preserving valuable assets in order to guarantee their confidentiality, integrity, and
availability (CIA) [102, 49]. Existing literature refers to security requirements sometimes
as functionality and sometimes as a software quality. Many approaches, techniques, and
notations to define security requirements have been proposed, but there is still no agreement
in the field of requirements engineering [124]. To achieve CIA objectives, developers can
propose security features, which can be regarded as functional requirements, but also define
constraints on the functional requirements, which characterize quality requirements. CC,
for example, defines security requirements by establishing a catalog of components, or security
features, to be selected, implemented, and assessed [35]. Most often, however, security
requirements are considered quality requirements [124]. Haley et al. [49] argue that defining
security requirements in terms of functions implies making choices on what to protect
before properly evaluating the reasons for it. They propose a framework to express security
requirements as specific constraints on specific functions in the system. For Türpe [124],
security design may be underspecified if driven only by functional security requirements, as
the implementation of such functions is not enough to guarantee security against attacks.
Haley et al. [49] explain three aspects that make security requirements challenging to
specify, implement, and verify: (1) security is usually treated as a negative property (to
prevent 'bad things' from happening, the absence of vulnerabilities), which is very hard to measure;
(2) it is difficult to determine when it is well enough satisfied, while stakeholders expect
a simple yes/no satisfaction criterion; and (3) the amount of resources provided to satisfy
security requirements may depend on the likelihood and impact of cyber attacks. Türpe
[124] systematizes the problem space of security requirements and proposes a meta-model
for security requirements engineering, composed of three dimensions: threats to which the
system is exposed, stakeholder goals of preventing attacks (manifestations of the threats),
and security design, which is shaped during development. Türpe [124] argues that security
requirements engineering needs to iterate through all three dimensions.
It is worth noting that security requirements that translate into security features can be
measured using traditional software sizing metrics such as lines of code, function points,
etc. However, none of the studied security cost models specifies whether the effort to develop these
requirements is accounted for in the size component of the model or in the effort multiplier
presented. If this distinction is not made, there is a risk of double counting the effort to
develop security features.
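As a hypothetical numerical illustration of this double-counting risk, consider a simplified COCOMO-like estimate; the constants, sizes, and multiplier below are invented and do not come from any of the cited models.

```python
# Hypothetical illustration of double counting security effort in a
# COCOMO-like model (simplified form: effort = A * size^E * EM_security).
A, E = 2.94, 1.10                 # illustrative constants
base_size = 50.0                  # KSLOC excluding security features (hypothetical)
security_feature_size = 5.0       # KSLOC of security features (hypothetical)
em_security = 1.3                 # multiplier intended to cover security work (hypothetical)

# Security features counted in the size input only.
effort_size_only = A * (base_size + security_feature_size) ** E
# Security work covered by the effort multiplier only.
effort_multiplier_only = A * base_size ** E * em_security
# Both at once: the security work is effectively counted twice.
effort_double_counted = A * (base_size + security_feature_size) ** E * em_security

print(f"features in size only:   {effort_size_only:.1f} PM")
print(f"multiplier only:         {effort_multiplier_only:.1f} PM")
print(f"double counted:          {effort_double_counted:.1f} PM")
```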
1.6 Sources of Cost in Secure Software Development
While establishing an objective measurement for software security is still far on the horizon,
the application of security practices can serve as an indicator of secure software development costs,
as found in the systematic mapping of the literature presented in Chapter 2.
The identification of the sources of cost in secure software development, one of the questions
of the study, showed that most of the papers relate the costs of secure software
development to security practices, which account for 80% of the sources extracted in the
study. At the top of the list is Perform Security Review, which considers the inspection of
all deliverables in a project (extracted from 21 papers), followed by Apply Threat Modeling,
to anticipate, analyze, and document how and why attackers may attempt to misuse the
software (from 18 papers), and Perform Security Testing (from 16 papers).
1.7 Open Issues and Opportunities
As security becomes more and more an Achilles' heel for software development, researchers
have been seeking ways to provide decision-makers with information about the costs and benefits
of building security in. The cost component needed to balance the cost-effectiveness equation
remains an open issue, as few cost models have been proposed to predict effort considering the security
factor. Two aspects of these models require deeper investigation: the scale used to measure the
level of security applied in the software and a sound empirical validation to demonstrate
their utility.
As observed in section 1.5, CC EALs, which are used by many cost models as a rating
scale, do not offer a well-fitted metric for secure software development in general. Furthermore,
objective metrics for software security seem to be far on the horizon, challenging their
use as input for cost models. The development of a new rating scale, founded on the real
sources of cost for secure software development, is an opportunity to improve such models.
As described in section 1.6, software security practices are the main sources of cost for secure
software development. Survey studies on the application of such practices [87, 125] confirm
that they are used by practitioners. Thus, a new rating scale that qualitatively measures
secure software development based on different levels of security practice application
is expected to better capture the actual development scenario in a project and, in this
way, facilitate security effort data collection and improve the models' accuracy.
A further issue is the validation of methods and models that quantify security with
empirical data, so as to make them useful for practitioners. With the development of the new
rating scale and the joining of efforts with the COCOMO III initiative comes an opportunity to
analyze data from industry projects. Besides data from projects, security experts contacted
through the survey study and cost estimation experts who participate in CSSE events are
a rich source of information that can provide inputs for a secure software development cost
model.
Chapter 2
Costing Secure Software Development -
State of the Art and Practice
This chapter presents two initial studies conducted to investigate the sources of cost in secure
software development. First, the field of secure software development effort estimation is
analyzed through a systematic mapping of the literature [126]. Next, a survey of software
security experts provides an overview of software security practices in industry [125].
Finally, the results of both studies are compared.
2.1 Introduction
The Software Security approach consists of incorporating security practices during the software
development life cycle (SDLC), distinguishing itself from the traditional approach to
security, which is focused on protecting software applications by building security around them
with network-based security tools. An emerging community of Secure Software Engineering
(SSE) advocates for the improvement of software development processes and activities in
order to deliver products that are less vulnerable to security problems and easier to protect
[55].
SSE researchers argue that post-development approaches to defending against attacks
are not sufficient, because many security problems are rooted in the way we build software
[78, 55]. Trend analyses point out that implementation errors are the major source of vulnerabilities,
accounting for two-thirds of the total [72]. Threats have targeted the application
layer of the protocol stack, which is less protected by operational security tools [52]. Attacks
like SQL injection, which allows an attacker without credentials to gain direct access (read,
write, delete) to the tables of the back-end database, occupy the top positions of the most frequent
vulnerability lists [32, 72, 41].
While identifying and fixing vulnerabilities has typically been carried out at the end of the
development cycle, or even after deployment in network-centric approaches to security [78,
55], prior empirical research in Software Engineering has demonstrated the cost-effectiveness
of resolving flaws and bugs in early phases [23, 20, 117]. Motivated by those findings and
inspired by advances in Information Security Economics (ISE), the SSE community has
studied the cost-effectiveness of applying SWSec practices from the beginning of software
development projects to prevent vulnerabilities [52, 17, 56, 92, 31, 54, 100]. By identifying
vulnerabilities early in the SDLC, SWSec can reduce losses from security incidents, decrease
system downtime, and save operational effort/costs [52].
However, few companies follow a Secure Software Development Life Cycle (S-SDLC) [32].
The amount of effort/cost required to achieve a certain level of software security and reap
the benefits is not yet clear. This lack of knowledge jeopardizes the adoption of the practices
[134, 56, 54], since costs are frequently considered a major challenge for software security
assurance [45]. It is paramount for users, developers, and managers to understand and agree
on the right amount of resources to be allocated for software projects to deliver proper
security. Aiming to shed light on these issues, this chapter investigates the state of the art
and the state of the practice on the topic of secure software development costs.
In order to gather a better understanding of the implications of SWSec for software
development costs, the following research questions were devised:
• RQ1: What are the major sources of costs for secure software development found in
the literature, and which approaches have been proposed by researchers to estimate
software security costs?
• RQ2: What approaches are used in industry to estimate the effort in secure software
development, and how much effort/cost is added to projects due to the application of
security practices?
• RQ3: How does practice compare to the state of the art in software security costs?
2.2 Related Works
Systematic reviews and mapping studies that report the state of the art have been published
on the topic of Software Development Cost Estimation and, more recently, on the topic of
Software Security. Jorgensen and Shepperd [62] reviewed papers on software cost estimation
published in journals up to 2004. Kitchenham, Mendes, and Travassos [68] analyzed the literature
in order to compare cross-company and within-company effort estimation models.
Idri, Hosni, and Abran [60] performed a systematic review on the ensemble effort estimation
(EEE) technique, a combination of existing models of software development effort estimation,
which covered papers from 2000 to 2016. Wickramaarachchi and Lai [128] searched
for literature related to effort estimation in Global Software Development. All these studies,
however, focus their attention on general aspects of cost/effort models, and thus they do not
present information on software security aspects.
Software security related systematic reviews began to appear more recently. Arunagiri,
Rakhi, and Jevitha [6] reviewed research done on commonly used browser extension vulnerabilities.
Mohammed et al. [85] identified existing software security approaches with respect
to different phases of the software development life cycle. Van den Berghe et al. [18] inventoried
the existing design notations for secure software and provided an in-depth, comparative analysis
of each of them to identify opportunities for original contributions. Ito et
al. [61] studied research papers that use security patterns (SPs) to build secure systems and
analyzed the nature of SPs to find out how SPs are being investigated, in order to guide future research.
Silva et al. [118] presented metrics about publications available in the literature that deal with the
security threats in the guide "Top Threats to Cloud Computing" from the Cloud Security
Alliance (CSA). Khan and Ikram [65] identified 15 clusters regarding
problems faced in security requirements engineering and their solutions. Nguyen et al. [90]
provided a detailed analysis of the state of the art in Model-Driven Security (MDS), which
is a specialized Model-Driven Engineering research area for supporting the development of
secure systems. This mapping study differs from the above-mentioned systematic reviews
by focusing on the costs of secure software development. In other words, we screened papers
that compose the intersection of software cost estimation research and software security research.
Surveys on secure software development are scarce and conducted with limited sampling
frames. The Errata Security company conducted a survey on the usage of Secure Development
Life Cycles (SDLs) [45]. The study intended to identify organizations that were not adopting
software security practices and to understand why. The study received 46 responses and
reported a margin of error of about 14.5%. Only 30.4% of the participants indicated that they
used a formal SDL. The study found that the main reasons for not adopting an SDL were
that the approaches are too time-consuming, that the participants are not aware of methodologies,
and that the methodologies require too many resources. The study also showed that the Microsoft
Security Development Life Cycle (SDL) and SDL-Agile [83] were the best-known methodologies,
followed by the Comprehensive Lightweight Application Security Process (CLASP) [96], the
Building Security in Maturity Model (BSIMM) [78], and the Software Assurance Maturity
Model (SAMM) [119]. Unlike [45], this study focuses on general software security
practices instead of surveying the adoption of SDLs. With this strategy, we were able to
collect more granular information about security usage in projects.
A list of software security practices, elaborated from four Security Engineering processes
(BSIMM, Microsoft SDL, CLASP, and SAFECode [111]), was compiled by Morrison, Smith,
and Williams [87]. The practices were selected through content analysis, grouped into 16 core
practices, and validated in a survey, which collected empirical data from 11 security-focused
open source projects. Table 2.1 presents the final 13 validated security practices along with
their descriptions as defined by [87], and indicates in how many of the four source processes
each practice was identified in the referred study.
This empirically validated set of security practices was used in this survey to elaborate
Table 2.1 Security Practices Description [87]
Apply Security Requirements: Consider and document security concerns prior to implementation of software features. (found in 3 of the 4 sources)
Apply Data Classification Scheme: Maintain and apply a Data Classification Scheme. Identify and document security-sensitive data, personal information, financial information, system credentials. (found in 2 of the 4 sources)
Apply Threat Modeling: Anticipate, analyze, and document how and why attackers may attempt to misuse the software. (found in all 4 sources)
Document Technical Stack: Document the components used to build, test, deploy, and operate the software. Keep components up to date on security patches. (found in all 4 sources)
Apply Secure Coding Standards: Apply (and define, if necessary) security-focused coding standards for each language and component used in building the software. (found in all 4 sources)
Apply Security Tooling: Use security-focused verification tool support (e.g. static analysis, dynamic analysis, coverage analysis) during development and testing. (found in all 4 sources)
Perform Security Testing: Consider security requirements, threat models, and all other available security-related information and tooling when designing and executing the software's test plan. (found in all 4 sources)
Perform Penetration Testing: Arrange for security-focused stress testing of the project's software in its production environment. Engage testers from outside the software's project team. (found in 3 of the 4 sources)
Perform Security Review: Perform security-focused review of all deliverables, including, for example, design, source code, software release, and documentation. Include reviewers who did not produce the deliverable being reviewed. (found in 2 of the 4 sources)
Publish Operations Guide: Document security concerns applicable to administrators and users, supporting how they configure and operate the software. (found in 3 of the 4 sources)
Track Vulnerabilities: Track software vulnerabilities detected in the software and prioritize their resolution. (found in 2 of the 4 sources)
Improve Development Process: Incorporate "lessons learned" from security vulnerabilities and their resolutions into the project's software development process. (found in 1 of the 4 sources)
Perform Security Training: Ensure project staff are trained in security concepts, and in role-specific security techniques. (found in all 4 sources)
some of the questions. This allowed the comparison of results and also contributed to the
topic under investigation by adding more evidence for the same list of practices. However,
while the sampling frame of the survey conducted by Morrison, Smith, and Williams [87]
was extracted from 11 open source projects, this study embraces a broader population by
considering professionals of the LinkedIn social network. There is also a difference in the
focus of the two studies: this survey is interested in how the practices impact the effort/costs
of projects, while their goal was to evaluate security practice adherence in software
development.
2.3 Methods
The systematic mapping approach was the method selected to determine the state of the
art on the costs of software security. Mapping studies provide a broader perspective than
standard systematic reviews, which are driven by a specific question to be investigated in
depth. For the state of the practice analysis, an online survey was conducted within the
professional social network LinkedIn.
Table 2.2 presents the detailed research questions that were devised to explore the topic
of software security costs. The following subsections describe the methods and procedures
applied in the systematic mapping, and then in the survey.
2.3.1 Systematic Mapping
The protocol for the systematic mapping was defined between November and December
of 2017. The protocol described the Search Strategy used to obtain a list of papers to be
analyzed, the Studies Selection Process specifying the criteria established to decide whether a paper
should be included in the review, and the Data Extraction and Analysis Processes
specifying the methods used to collect the evidence from the resulting papers. The systematic
mapping was executed from January 2017 to February 2019.
Table 2.2 Research questions for the mapping study

RQ1: What are the major sources of costs for secure software development found in the literature, and which approaches have been proposed by researchers to estimate software security costs?
Aim: To provide the state of the art in software security costs.

RQ1.1: Which papers describe research on software security and its relation to costs?
Aim: To provide an overview of published papers that report studies referring to the costs of software security.

RQ1.2: What are the major sources of costs in developing secure software?
Aim: To identify and categorize the aspects that affect the cost of security in software projects.

RQ1.3: What approaches were developed in academia to estimate the costs of security in software development projects?
Aim: To identify models or tools proposed to estimate secure software development costs.

RQ1.4: Which software security standards and engineering processes are used in the studies?
Aim: To relate standards and processes to the costs of secure software development.

RQ2: What approaches are used in industry to estimate the effort in secure software development, and how much effort/cost is added to projects due to the application of security practices?
Aim: To provide the state of the practice in software security costs.

RQ2.1: What approaches are used in industry to estimate the costs of security in software development projects?
Aim: To identify existing methods, models, or tools used in industry to estimate secure software development.

RQ2.2: What is the frequency and effort spent on security practices in projects?
Aim: To understand the usage of software security practices in industry.

RQ2.3: How much effort/cost is added to a project due to the application of security practices?
Aim: To understand the size of the cost impact of security practices applied to software development.

RQ3: How does practice compare to the state of the art in software security costs?
Aim: To perform a comparative analysis between the current state of the art and practice to find research gaps.
Search Strategy
The search strategy combined manual and automated search and applied the concept of a
Quasi-Gold Standard (QGS) set of papers to select and assess the completeness of the search
[136]. This mixed method improves the rigor of the search process [136] and is especially
recommended for mapping studies [69]. The strategy begins by selecting all papers published
in a journal or conference proceedings and then manually inspecting each paper according to
the inclusion and exclusion criteria. Since there is no specialized conference or workshop
for the topic under investigation that could be used as a basis for the manual search, venues
from three distinct sources were selected:
S1. The main internationally recognized software engineering venues that regularly
publish high quality studies [69].
S2. The four most important software cost estimation journals found in a systematic
review of Software Development Cost Estimation Studies [62].
S3. The main venues found in a systematic mapping study of software security ap-
proaches in software development life cycle [85].
The resulting list of venues for the manual search is (S1, S2 and S3 indicate the source
from which the venue was selected):
• IEEE Transactions on Software Engineering (TSE) —S1, S2
• ACM Transactions on Software Engineering Methodology (TOSEM) —S1
• Empirical Software Engineering Journal (EmSE) —S1, S2
• Journal of Systems and Software (JSS) —S1, S2
• Information and Software Technology (IST) —S1, S2
• Proceedings of the International Conference on Software Engineering (ICSE) —S1, S3
• Empirical Software Engineering and Metrics Conference (ESEM) —S1
• Workshop on Software Engineering for Secure Systems (SESS) —S3
• Software and Systems Modeling (SoSyM) —S3
Following the GQS method, the papers found through the manual search are used to
build a search string, which is then applied to the automated search. The resources for the
automated search were based on the recommendations from Kitchenham and Brereton [67]
and Kitchenham, Budgen, and Brereton [69], which are:
• IEEE Digital Library
• ACM Digital Library
• SpringerLink
• Scopus
• Web of Science
The search strategy for both the manual and automated search was constrained to work
published from 2000 to 2017, since studies on software security are relatively recent, as
discussed in the introduction of this chapter.
The completeness achieved with the search strategy was assessed by comparing the primary
studies identified through the automated search process against the set of studies
selected by the manual search process (the quasi-gold standard). This comparison was carried
out through the application of the sensitivity metric. Sensitivity (also named recall in some
studies) is the proportion of relevant studies retrieved for the topic, while precision is the
proportion of retrieved studies that are relevant [136]. They are calculated as [136]:
Sensitivity = (Number of relevant studies retrieved / Total number of relevant studies) × 100%   (2.1)

Precision = (Number of relevant studies retrieved / Number of studies retrieved) × 100%   (2.2)
The total number of relevant studies is the set of studies identified through the manual search (the
quasi-gold standard set of papers). The number of relevant studies retrieved is the subset of the studies
found through manual search that was also found through automated search. The number of
studies retrieved is the number of papers retrieved by the automated search.
Equation 2.1 actually measures a quasi-sensitivity, as the real number of relevant studies
is unknown. The quasi-gold standard was retrieved from the venues considered in the manual
search, instead of from the whole search universe, which would not be feasible to cover.
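As an illustration, both metrics can be computed directly from sets of paper identifiers; the sketch below uses made-up identifiers rather than the actual sets from this study.

```python
# Quasi-sensitivity and precision of an automated search, computed against the
# quasi-gold standard (QGS) built from manual search and snowballing.
qgs = {"p01", "p02", "p03", "p04", "p05"}               # hypothetical QGS papers
retrieved = {"p02", "p03", "p05", "p09", "p17", "p21"}  # hypothetical automated-search results

relevant_retrieved = qgs & retrieved                     # relevant papers found by the search
sensitivity = len(relevant_retrieved) / len(qgs) * 100
precision = len(relevant_retrieved) / len(retrieved) * 100

print(f"Sensitivity: {sensitivity:.0f}%")   # 3 of 5 -> 60%
print(f"Precision:   {precision:.0f}%")     # 3 of 6 -> 50%
```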
Whereas both the sensitivity and precision metrics can be used to assess the search performance,
achieving a high sensitivity is more desirable [136]. Sensitivity indicates whether the search
string derived for the automated search is consistent with the studies found through manual
search in the main venues selected. A threshold of 80% was used as an indicator that the
search performance was acceptable [136]. Precision is important because a high value of this
metric indicates that the burden on reviewers to check papers that turn out not to be relevant
is low [69]. However, as precision is increased, sensitivity is typically reduced because
the number of sources searched is narrowed.
The result of the execution of this strategy is illustrated in Figure 2.1. Since only nine
papers were retrieved from the manual search, the snowballing method was used to enhance
the quasi-gold standard set of papers. In snowballing, the reference list of a paper (backward
snowballing) or the citations to a paper (forward snowballing) are used to identify additional
related papers [132]. This measure was established as a contingency in the research protocol,
to be used if the number of papers found by the manual search did not reach 30 papers,
as recommended by Kitchenham, Budgen, and Brereton [69].
Using the nine papers found through the manual search as the starting set, five iterations
of backward and forward snowballing were performed, following the procedure described by
Wohlin [132]. After the fifth iteration no new paper was found. At the end of this process,
with 37 additional papers, the size of the quasi-gold standard grew to 46 papers, which was
used to assess the automated search.
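A minimal sketch of this iterative snowballing loop is shown below; the helper functions get_references, get_citations, and is_relevant are hypothetical stand-ins for the manual screening steps performed in the study.

```python
# Sketch of iterative backward/forward snowballing: repeat until an iteration
# adds no new relevant papers. The helper functions are hypothetical.
def snowball(start_set, get_references, get_citations, is_relevant):
    relevant = set(start_set)
    frontier = set(start_set)
    while frontier:
        candidates = set()
        for paper in frontier:
            candidates |= set(get_references(paper))   # backward snowballing
            candidates |= set(get_citations(paper))    # forward snowballing
        new_papers = {p for p in candidates if p not in relevant and is_relevant(p)}
        relevant |= new_papers
        frontier = new_papers                          # empty frontier ends the loop
    return relevant
```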
In the next step, a search string was proposed and refined, using the sensitivity metric
to evaluate its performance. Ten versions of the search string were tested, and six rounds of
paper retrieval from the resources of the automated search were required to achieve a sensitivity
of 84%, above the threshold of 80%. Four out of the 46 papers of the quasi-gold standard
Figure 2.1 Study Selection (the selection flow across the manual search, snowballing, and automated search, with preselection by title, preselection by abstract, and final selection: 11,657 and 10,172 search results, 324 and 589 potentially relevant papers, 9 relevant papers from the manual search, 37 relevant papers from snowballing, and 11 new relevant papers from the automated search with the final search string, yielding 57 relevant papers and 54 relevant studies after removal of duplicated studies).
were not considered in the computation of sensitivity, as they were not indexed by any of
the resources used in the automated search. Table 2.3 presents the first, the fifth, and the final
version of the search string with the corresponding sensitivity. Versions one through four of
the search string returned few papers and were used to stabilize the search string; therefore,
sensitivity was not computed for them.
Table 2.3 Search String Versions and Sensitivity
Version 1 (sensitivity n/a): Software AND Development AND (Security OR Safety OR Cybersecurity) AND ((Effort OR Cost) AND (Model OR Estimate))
Version 5 (sensitivity 57%): ('Software Security' OR 'Software Vulnerability') AND (Effort OR Cost OR Economics OR Budget)
Version 10 (sensitivity 84%): ('Software Security' OR 'Software Vulnerability' OR 'Secure software development' OR 'Security Development') AND (Effort OR Cost OR Economics OR Budget)
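For illustration only, the final search string can be approximated locally as a boolean keyword filter over titles and abstracts; the actual digital libraries apply their own query syntax and indexing, so the sketch below is not the query that was submitted.

```python
# Rough local approximation of search string version 10 as a keyword filter.
SECURITY_TERMS = ["software security", "software vulnerability",
                  "secure software development", "security development"]
COST_TERMS = ["effort", "cost", "economics", "budget"]

def matches_search_string(text):
    """Return True if the text matches the boolean structure of version 10."""
    text = text.lower()
    return (any(term in text for term in SECURITY_TERMS)
            and any(term in text for term in COST_TERMS))

print(matches_search_string("Estimating the effort of secure software development"))  # True
print(matches_search_string("A survey of software security testing tools"))           # False
```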
Primary Study Selection Procedure and Criteria
The primary study selection process was applied in three steps:
1. Check the title of each paper against the inclusion/exclusion criteria.
2. Check the title, keywords and abstract of each paper against the inclusion/exclusion
criteria.
3. Check the whole text of each paper against the inclusion/exclusion criteria.
After each step, the papers identified as irrelevant were removed from the set of candidate
papers. Then, the remaining papers with similar titles and authors were reviewed one
additional time in order to remove duplicate studies, since the same study can be reported
in different papers.
Two inclusion criteria were applied:
• IC1: Study about software security that considers effort/cost impacts.
• IC2: Study about effort/cost estimation or measurement that considers software secu-
rity issues.
The exclusion criteria established were:
• EC1: Paper describes the effort of an activity only to compare the efficiency of methods,
and the reported values do not directly impact software development cost/effort.
• EC2: Paper describes reactive approach to software security issues.
• EC3: Paper describes research about security management of business processes.
• EC4: Paper describes research about software safety, that is, software that can cause
harm.
• EC5: Paper describes research on information security, application security or com-
puter security only.
• EC6: Paper is not presented in English.
• EC7: Paper is not accessible in full-text.
• EC8: It is a book or gray literature.
• EC9: It is a tutorial, workshop or poster summary only.
• EC10: Study is duplicated (the most comprehensive paper reporting the study was
maintained).
• EC11: Paper published before 2000 or after 2017.
Data Extraction and Analysis
The extraction process involved the collection of evidence from the set of selected studies to
support the research questions. Table 2.4 presents the properties that were extracted.
Table 2.4 Properties extracted from each paper
Property Research Question
Publication details RQ1.1
Study type RQ1.1
Settings RQ1.1
Contribution RQ1.1
Pertinence RQ1.1
Sources of cost RQ1.2
Estimation/cost model/approach RQ1.3
Security standard/process RQ1.4
Thematic analysis was employed to identify, analyze, and report patterns within the
papers. An integrative approach employing both a start list of codes (deductive approach)
and the development of new codes along the way (inductive approach) was used, following
Cruzes and Dybå [37]. The initial list of codes was derived from the research questions.
The Qualitative Data Analysis (QDA) software NVivo (https://www.qsrinternational.com/nvivo/)
supported the coding of textual fragments and the analysis of the themes in the resulting set of
selected studies.
2.3.2 Survey
The survey was planned between December 2018 and February 2019, using the conceptual
framework proposed by Mello [79] and the checklist designed by Molléri, Petersen, and
Mendes [86] as references. It was executed between March and April 2019.
Population Search Plan
The target population was composed of Software Engineering practitioners who perform or
are concerned with software security. The professional social network LinkedIn was used
as the sample source.
The sampling frame was established by looking at members of LinkedIn groups of interest
that discuss software security concepts. The search question for determining the sampling
frame was: “Which are the groups from LinkedIn composed by SE practitioners that discuss
software security?”. The search mechanism of LinkedIn was then used to look for groups of
interest and collect their basic information, such as group title, group description, number
of members, and rules. Table 2.5 presents the search expressions derived from the search
question. Expressions were based on works from [87, 121].
Table 2.5 Search Expressions
Requirement Practices: "security requirements", "misuse case", "abuse case", "abuser story", "security features"
Design Practices: "threat modeling", "secure design", "cryptography strategy", "attack surface", "UMLSec", "design for security"
Coding Practices: "secure coding", "secure code", "security defect", "vulnerability"
Testing Practices: "security testing", "penetration testing", "security review", "security inspection"
Other: "security training", "secure development", "security practices", "secure practices", "security risks", "countermeasure", "security tool", "security assurance"
The following search algorithm, adapted from [81], was used to retrieve the groups from
LinkedIn:
For each keyword, do:
1. Submit a search expression (between quotes) preceded by the term “software” in
the option “Group Search”;
2. Identify all groups of interest returned, extracting the following data: name, de-
scription, group rules, group size (number of members), and group language.
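A sketch of this retrieval loop is shown below; the search_groups function is a hypothetical stand-in for LinkedIn's group search mechanism (the search in the study was performed through the platform's own interface), and the field names are assumptions for illustration.

```python
# Sketch of the group retrieval step. `search_groups` is a hypothetical stub
# representing LinkedIn's group search; field names are illustrative.
def collect_groups(search_expressions, search_groups):
    groups = {}
    for expression in search_expressions:
        query = f'software "{expression}"'     # expression in quotes, preceded by "software"
        for group in search_groups(query):     # each result carries the group's metadata
            record = groups.setdefault(group["name"], {
                "description": group.get("description", ""),
                "rules": group.get("rules", ""),
                "members": group.get("members", 0),
                "language": group.get("language", ""),
                "times_retrieved": 0,
            })
            record["times_retrieved"] += 1     # later reported as the '#Retr' column
    return groups
```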
Exclusion criteria were then established and applied to the resultant list of groups. The
criteria were based on the groups’ attributes information, following [81]. Groups were re-
moved if:
• explicitly prohibited the execution of studies;
• explicitly restricted the individual messaging between its members (a default feature
provided by LinkedIn);
• were explicitly directed to a city, region or country, since the target audience is not
geographically restricted;
• were focused on promoting specific organizations;
• had their description out of the scope of Software Engineering or Software Security;
• had a vague description;
• had a single member;
• were driven to headhunting and job offering;
• represented LinkedIn subgroups, since the sampling frame must be composed of
groups of interest; and
• had a non-English language as default.
Table 2.6 presents the five largest groups found by applying the algorithm and the ex-
clusion criteria. Along with the number of members, ‘%Acc’ represents the accumulated
percentage of the size of the group relative to the total number of members from all groups
retrieved (9,263), and ‘#Retr’ is the number of times that each group appears in the results
based on each search expression. ‘CSIAC’ was the largest group found, but it only appeared
once in the search. The second largest group, 'Software Security Group', appeared for seven
of the expressions during the search process.
Table 2.6 Group Search Results
CSIAC - Cyber Security and Information Systems Information Analysis Center: 3,587 members, 39% accumulated, retrieved 1 time
Software Security Group: 2,119 members, 62% accumulated, retrieved 7 times
Open Source Compliance & Security for Software Executives: 1,078 members, 73% accumulated, retrieved 1 time
CSSLP - study group: 616 members, 80% accumulated, retrieved 3 times
Secure Coding Forum: 605 members, 86% accumulated, retrieved 1 time
Sampling Strategy
The initial plan was to use a stratified sampling approach, considering the distinct groups
found from the search in LinkedIn as the sub-population (strata), as performed by Mello,
Silva, and Travassos [81]. The recruitment would then be conducted by sending individual
invitations to the users through the platform's messaging service (access to members' e-mail
addresses was not available).
However, after composing the sampling frame, it was discovered that LinkedIn had limited
the number of messages that a user can send to fellow group members to 15 messages
(https://www.linkedin.com/help/linkedin/answer/192). Even the top LinkedIn Premium accounts
have low limits, from 30 to 50 messages per month. And currently, only group owners and
group managers are able to send an unlimited number of messages to group members that are
not personal connections (https://www.linkedin.com/help/linkedin/answer/202/sending-messages-to-group-members-group-management).
In an attempt to overcome this situation, individual messages were sent to the group owners
of the sampling frame
explaining the aims of the research and asking if they would allow the researchers to be
promoted to group managers temporarily in order to conduct the study. The owner of the
‘Software Security Group’, which was the most pertinent group in the investigation according
to its description and to the number of times it appeared in the group search, kindly agreed
to the request. The owners of the other groups either did not answer or did not accept
the inquiry.
The study proceeded with simple random sampling (SRS) solely for the Software Security
group, which became the sampling frame. Even though LinkedIn showed the group
size as 2,119 members, only 2,102 records of users could be extracted. The sample size was
established over this number with a confidence level of 99% and a confidence interval of 3.5.
Over that number, an extra 10% was added as in [81], resulting in a sample of 908 individuals.
The list of all users retrieved from the group was randomly ordered using a tool from
Random.org (www.random.org), and then the first 908 members were selected and invited to
participate in the survey.
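One common way to reproduce this sample size is Cochran's formula with a finite population correction; the sketch below uses that approach (the exact rounding conventions and z-score of the calculator used in the study may differ slightly, so the result is approximate).

```python
import math

# Sample size for a finite population of 2,102 members, 99% confidence level,
# confidence interval of 3.5, plus 10% extra invitations, as described above.
z = 2.58            # z-score commonly used for a 99% confidence level
p = 0.5             # most conservative proportion
e = 0.035           # confidence interval of 3.5 (percentage points)
population = 2102

n0 = (z ** 2) * p * (1 - p) / (e ** 2)          # infinite-population sample size
n = n0 / (1 + (n0 - 1) / population)            # finite population correction
sample = math.ceil(n * 1.10)                    # add the 10% extra

print(f"base sample: {math.ceil(n)}, with 10% extra: {sample}")  # roughly 826 and 908
```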
Recruitment Strategy
It was not possible to apply the automated approach to sending the invitations through the
LinkedIn message service (which was used in [81]) due to recent changes in the platform
that modified the identification of interface elements. The identifiers of interface elements are now
dynamically established in each session, which prevents the automation of the message sending
process using web automation tools like iMacros. Therefore, all invitations to participate
in the survey were sent manually. This process took about 10 hours for the initial message
sending, and also nearly 10 hours for the reminder messages that were sent one week later.
To encourage responses, a raffle of an Amazon gift card was offered as a reward to the
individuals who completed the questionnaire.
Questionnaire
The questionnaire was developed by the researchers and reviewed by an external expert
in software effort estimation. It was piloted with ten members of the target population.
The main improvement made after the pilot execution was to include one specific question
to identify if the participant had previous experience in projects that involved software
security practices. Initially, the questionnaire only asked if security practices were executed
in the participant’s group, which was not clear enough to determine if the participant was
in position to answer questions about the application of software security practices.
The final version of the questionnaire is divided into four parts. The first part presents
the motivation for the study, the definition of software security, and four close-ended questions
regarding the participants' awareness of software security practices. It is mainly
intended to filter out the participants who had not experienced working on a project that adopted
software security practices. In the second part of the questionnaire, participants were instructed
to freely choose one project in which they were involved that applied software
security activities. The questions in this part seek to characterize the project and
the security practices performed; there are one multiple-choice question, three quantitative
questions, one scale question, and two matrices in which the participant selects, for each security
practice (rows), the frequency (multiple-choice columns) and effort (multiple-choice columns)
applied. The list of validated software security practices presented in Table 2.1 was used as
a reference to create these questions. The third part consists of questions about software security
effort estimation and is composed of two multiple-choice questions, one quantitative
question, and one open-ended question for the participant to describe difficulties faced when planning
and estimating the effort to perform security activities. The fourth and last part of the
questionnaire asks seven close-ended demographic questions to characterize the participant.
Data Collection and Analysis
The survey was conducted using the Qualtrics web-based survey tool
(https://itservices.usc.edu/qualtrics/), provided by USC, which facilitated the data collection
and the monitoring of the response rate.
The data analysis is mostly quantitative, using tables, charts, and descriptive statistics
to find patterns in the collected information. The data was analyzed with regard to the
impact of the adoption of security practices on the software development effort.
A qualitative approach was employed to analyze the open-ended question related to the
challenges in estimating software security activities. All the answers were examined and
categorized, and each segment of a response was associated with a distinct reported difficulty
(inductive approach).
2.4 Systematic Mapping Results
The mapping study analyzed 54 papers published between 2000 and 2017 that describe
research related to software security and its economic aspects. The following subsections
Figure 2.2 Papers by Year (number of papers published per year from 2000 to 2017: 0, 0, 1, 1, 0, 2, 1, 3, 2, 5, 6, 5, 2, 6, 4, 5, 8, 3).
present a summary of the papers and answer the research questions defined in Table 2.2.
2.4.1 RQ1.1 Which papers describe research on software security
and its relation to costs?
The number of studies published has been growing over time, as illustrated by Figure 2.2.
This tendency reinforces the perception that the interest in software security, and in its
costing aspects, is gaining importance.
The venue most frequently used for the publication of this type of study is the International
Conference on Availability, Reliability and Security (ARES), with four papers (see Table
2.7). Three other journals and one conference were used by two papers each, and all the
remaining papers were found in distinct publications.
The resulting set of 54 papers was grouped into 10 categories, according to the aims
and contributions of the studies. Figure 2.3 shows the categories found and the respective
Table 2.7 Venues used more than once for publications
International Conference on Availability, Reliability and Security (ARES), Conference: [53, 8, 106, 28]
Empirical Software Engineering, Journal: [116, 74]
International Symposium on Engineering Secure Software and Systems (ESSoS), Conference: [39, 131]
IEEE Security & Privacy, Journal: [100, 130]
IEEE Transactions on Software Engineering, Journal: [46, 114]
numbers of papers, according to their pertinence to this research. The pertinence property
(adapted from Rodríguez et al. [108]) classifies studies according to their focus on software
security cost aspects.
Figure 2.3 Papers by Category and Pertinence (number of papers per category, broken down into Fully Focused, Partially Focused, and Marginally Focused; categories: Economics of SecSW dev, Countermeasure ident/priorit, Vulnerability predict/detect, Software cost estimation model, Security in agile development, Security development method, Security effort analysis, Quantification of SwSec, Security practices in soft dev, Security req ident/priorit).
Economics of secure software development refers to papers that analyze software security
in the context of business, management and finances. In secure software development, secu-
rity competes for a slice of a limited budget: adding more security usually means removing
features from the scope. While the growing field of SWSec provides technical solutions to
address current security problems, financial issues are still a barrier to their effective introduction
in projects [56]. The effectiveness of applying security practices early in the SDLC
is frequently discussed [92, 31, 57, 55]. Other papers discuss planning for security [56, 53,
100], investment models and equations [52, 55, 57, 137], and the financial benefits of software
security [52, 55].
Countermeasure identification/prioritization papers discuss approaches that allow devel-
opers to make informed decisions regarding alternative security designs, through cost-benefit
analysis [46, 11, 27, 59, 17, 12, 28], or through collaborative security games [131, 130].
Vulnerability prediction/detection studies propose prediction models that delimit areas
of the code base that are more prone to security issues, saving resources in the execution of
security activities such as code reviews and security testing. Several approaches for develop-
ing prediction models were identified in the resulting set of papers, such as models based on
traditional code metrics [114], execution complexity metrics [115], traditional fault prediction
models [116], mining of existing vulnerability databases and version archives [89], and online
code changes [133]. Other studies investigated the characteristics of vulnerabilities discov-
ered by code review [26], and applied empirical methods to analyze the effectiveness of static
analysis tools to detect vulnerabilities [10, 29, 13].
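To illustrate the general shape of such metric-based prediction models (a hypothetical sketch with made-up metrics and data, not the specific models from the cited papers), a file-level classifier could be trained on simple code metrics and used to focus review and testing effort:

```python
# Hypothetical sketch of a metric-based vulnerability prediction model:
# classify files as vulnerability-prone from simple code metrics.
from sklearn.linear_model import LogisticRegression

# Columns: lines of code, cyclomatic complexity, number of past changes (made-up data).
X_train = [[120, 4, 2], [2300, 35, 40], [80, 2, 1], [1500, 28, 25], [400, 9, 5]]
y_train = [0, 1, 0, 1, 0]   # 1 = a vulnerability was later reported in the file

model = LogisticRegression(solver="liblinear").fit(X_train, y_train)

# Rank new files so that security review and testing effort can be focused.
X_new = [[1800, 30, 33], [95, 3, 1]]
for metrics, prob in zip(X_new, model.predict_proba(X_new)[:, 1]):
    print(metrics, f"predicted vulnerability-proneness: {prob:.2f}")
```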
Software cost estimation model papers present existing cost estimation models and meth-
ods that address security-related issues. Basha and Ponnurangam [15] published a compar-
ative study on empirical software effort estimation. Their investigation analyzes several
characteristics of estimation models and tools, and points out the ones that take security
into account.
One set of models builds upon the well-known COCOMO II software cost estimation
model [24]. Reifer, Boehm, and Gangadharan [105] described an effort in enhancing COCOMO
II to model the impact of security on development effort and duration, and related
it to a framework that estimates effort for COTS-based development (COCOTS). Colbert
and Boehm [34] presented COSECMO, an extension to COCOMO II that accounts for the
development of secure software with the inclusion of a new cost driver (“SECU") for security
assurance activities. Lee, Gu, and Baik [74] extended the seven step modeling methodology
of COCOMO II and extracted cost factors related to the military domain to develop a soft-
ware cost estimation model, including a security factor. Yang, Du, and Wang [134] reviewed
previous estimation models for secure software development and proposed a model based on
COCOMO II, customized to Chinese IT security standards.
Another approach to address security in estimation models is based on the Function
Point Analysis (FPA) method for software sizing. Abdullah et al. [1] applied characteristics
of security standards to extend the calculation of FPA, and investigated its acceptance in a
follow-up study [2].
Security in agile development papers discuss the impact of security activities according
to the agile values and principles. Studies address the integration of security engineering
with agile methods [14]; the proposition and evaluation of new agile development processes
that address security issues based on standards and existing security-engineering processes
[9, 7, 8]; and analyses of the compliance of a combination of agile methods and a security-
engineering process with Finland's established national security regulation [107, 106].
Security development method refers to those papers that propose new approaches to soft-
ware security while considering their cost-effectiveness. Included in this category are a unified
and formal definition of command injection vulnerabilities [63], a method to automatically
generate class diagrams from misuse cases [50], a model for conducting more efficient security
analysis in the software design phase [51], and a pattern-based method for database
designers to create database specifications compliant with the organizational security policies
regarding authorization [3].
Security effort analysis includes papers that report on factors that impact the security effort of
using a Free and Open Source Software (FOSS) component in a larger software product [39],
factors that affect the time to fix vulnerabilities [94], and the security impact and remediation
cost of vulnerabilities [38].
Quantification of software security groups papers that propose an approach to estimate
software security (related to its security level) in the early stage of a software development
life cycle [30], and a formal model to describe and analyze security metrics [71].
Security practices in software development papers propose a decomposition to solve abstraction
and effort issues in the Common Criteria evaluation methodology [104], and present a
set of security practices validated through literature review and survey data [87].
Security requirements identification/prioritization papers apply a lean methodology to
identify security requirements [48], and an extension to threat modeling to prioritize requirements
with a graph that models the value of assets, threats, and countermeasures [98].
2.4.2 RQ1.2 What are the major sources of costs in developing
secure software?
Each study was analyzed to identify which aspects related to software security are treated
by the authors as sources of secure software development costs. Table 2.8 summarizes the
aspects found in the papers.
The validated list of security practices developed by Morrison, Smith, and Williams [87]
was used as a starting point to classify such aspects, and new categories were added for
those that did not fit in the software practices list. In some cases the source of cost was
treated at a level of granularity that did not allow the identification of particular practices
or finer factors. Thus, the Implement Countermeasures category was added, which was
captured in nine studies, and can be considered as a general classification for strategies
that are adopted by software engineers to mitigate security risks. Other categories added
were Fix Vulnerabilities (cost spent to remediate security defects), Achieve Security Level
(investment required to achieve a certain level of security), Security Personnel (costs with
Security Experts, Security Group or Security Master), Functional Features (costs to secure
functional features), Hardening Procedures (effort to reduce the attack surface) and Security
by Design Paradigm (investment in security during the whole software life cycle).
As Table 2.8 shows, most of the papers relate the costs of secure software development
with security practices, which represent 80% of the sources extracted. At the top of the list
is Perform Security Review, which considers the inspection of all deliverables in a project
(extracted from 21 papers), followed by Apply Threat Modeling, to anticipate, analyze, and
document how and why attackers may attempt to misuse the software (from 18 papers), and
Perform Security Testing (from 16 papers).
As in the list of practices compiled by Morrison, Smith, and Williams [87], many studies
referred to security activities established by known secure software engineering processes and
standards, such as the Comprehensive, Lightweight Application Security Process (CLASP), the
Microsoft Secure Development Lifecycle (SDL), the Common Criteria for Information Technology
Security Evaluation (CC, https://www.commoncriteriaportal.org), and Touchpoints
(https://www.bsimm.com/framework/software-security-development-lifecycle.html).
2.4.3 RQ1.3: What approaches were developed in academia to estimate the costs of security in software development projects?
To answer this question, models or methods that allow estimating the costs (considering
effort and time) of secure software development, maintenance, or specific activities were
extracted from the set of relevant papers. Some of the approaches found were briefly
introduced in section 2.4.1, as those papers are fully devoted to estimating security effort.
Other approaches identified here were extracted from papers that used security estimation
or prediction as a means for a specific study analysis.
The approaches identified are described in Table 2.9, already presented in Chapter 1.
COCOMO II based models dominate the results, comprising five of the ten approaches.
Other methods for security estimation are based on extensions of software sizing methods, such as
the approaches that apply Function Point Analysis (FPA) and COSMIC Function Points
(CFP); based on machine learning (for vulnerability fixing time prediction); or are specific
equations (for FOSS integration maintenance and IWL-SSE).
Table 2.8 Sources of cost for Secure Software Development
Perform Security Review: 21 papers [1, 2, 7, 9, 34, 39, 56, 54, 55, 74, 87, 89, 105, 107, 106, 114, 115, 116, 131, 130, 133]
Apply Threat Modeling: 18 papers [1, 2, 7, 9, 11, 8, 17, 27, 28, 50, 51, 56, 74, 87, 105, 107, 131, 130]
Perform Security Testing: 16 papers [1, 2, 7, 9, 56, 54, 55, 74, 87, 89, 105, 107, 115, 116, 131, 130]
Apply Security Requirements: 11 papers [1, 2, 7, 9, 11, 48, 56, 74, 87, 105, 107]
Apply Security Tooling: 11 papers [7, 10, 9, 11, 13, 26, 29, 87, 95, 107, 131]
Implement Countermeasures: 9 papers [12, 38, 46, 52, 53, 59, 71, 98, 100]
Fix Vulnerabilities: 9 papers [13, 26, 30, 31, 52, 63, 92, 95, 93]
Apply Secure Coding Standards: 8 papers [1, 2, 7, 9, 11, 87, 105, 107]
Apply Data Classification Scheme: 7 papers [1, 2, 3, 11, 87, 105, 107]
Publish Operations Guide: 7 papers [1, 2, 9, 56, 87, 105, 107]
Perform Security Training: 6 papers [1, 2, 7, 14, 87, 107]
Improve Development Process: 5 papers [7, 9, 87, 104, 105]
Perform Penetration Testing: 5 papers [7, 9, 56, 87, 105]
Achieve Security Level: 3 papers [15, 134, 137]
Document Technical Stack: 3 papers [87, 107, 106]
Security Experts, Security Group, Security Master: 3 papers [8, 14, 51]
Track Vulnerabilities: 3 papers [7, 9, 87]
Functional Features: 2 papers [1, 2]
Hardening Procedures: 2 papers [11, 106]
Security by Design Paradigm: 1 paper [31]
Table 2.9 Approaches to Estimating Costs of Secure Software Development

COCOMO II security extension (Software development cost model) [105]
Additional cost of security: 0.94 (Low), 1.02 (Nominal), 1.27 (High), 1.43 (Very High), 1.75 (Extra High)
Source: Expert estimation. Validation: Not validated.

COCOTS security extension (COTS integration cost model) [105]
Additional cost of security: 1.00 (Low), 1.15 (Nominal), 1.29 (High), 1.44 (Very High)
Source: Expert estimation. Validation: Not validated.

COSECMO (COCOMO II extension) [34]
Additional cost of security: 0% (Nominal), 20% to 80% (EAL 3 - High), 50% to 200% (EAL 4 - Very High), 125% to 500% (EAL 5 - Extra High), 313% to 1250% (EAL 6 - Super High), 781% to 3125% (EAL 7 - Ultra High)
Source: COCOMO II RELY rating and one data point. Validation: Not validated.

Weapon systems development cost model (COCOMO II based) [74]
Additional cost of security: 1.0 (Low or Nominal), 1.87 (High)
Source: Expert estimation. Validation: Not validated.

Secure OS software cost model (COCOMO II based) [134]
Additional cost of security: 1 (EAL 1-2 - Nominal), 1.25 to 1.5 (EAL 3 - High), 1.75 to 2.0 (EAL 4 - Very High), 2.0 to 2.75 (EAL 5-6 - Extra High), 3.0 to 4.0 (EAL 7 - Super High)
Source: Expert estimation. Validation: Case study, Relative Error of 0.41.

FPA security extension (Software sizing method) [1]
Additional cost of security: 0 to 5% increase in the function points size of the project
Source: Practices from survey with developers. Validation: Not validated.

COSMIC FP security extension (Software sizing method) [38]
Additional cost of security: COSMIC FP needed to remediate vulnerabilities
Source: COSMIC FP extension. Validation: Not validated.

FOSS security maintenance effort estimation (Consumed FOSS) [39]
Additional cost of security: function of the number of known vulnerabilities and the number of products using the component
Source: Not applicable. Validation: Not validated.

Vulnerability fixing time prediction (Machine Learning based) [93]
Additional cost of security: machine learning prediction based on past data
Source: Vulnerabilities of SAP's tools. Validation: Cross validation.

IWL-SSE [55, 54]
Additional cost of security: function of the SSE activities executed, number of iterations, effectiveness of the activity, and time to fix
Source: Not applicable. Validation: Not validated.
Table 2.9 also shows that most of the approaches have not been empirically validated. Only two studies performed cross validation and one study verified the model through a case study. According to Yang, Du, and Wang [134], there is a vicious spiral: since models like COCOMO II and FPA are not applicable to secure software development, estimators do not use them and, as a consequence, do not produce historical data, which prevents the models from being validated or improved. Lee, Gu, and Baik [74] point out that for weapon systems, the long development period hinders data collection, making it difficult to validate the model.
Regarding the accuracy of the validated models, the results show improvements over the original models. Lee, Gu, and Baik [74] compared their Weapon systems cost model with the original COCOMO II and a local calibration, and showed enhanced values for accuracy metrics. Yang, Du, and Wang [134] also compared their secure OS software cost model with the original COCOMO II, a local calibration of COCOMO II, the original COSECMO, and a local calibration of COSECMO, but for one project only. Their customized model achieved the lowest relative error which, according to the authors, demonstrates the positive effect of the proposed model.
The information on the additional cost of security in Table 2.9 shows that the ranges of estimated
values are wide, according to the level of security required. Apart from COCOTS security
extension, which accounts for a specific estimation scope, all other COCOMO-based models
present coherent values for the security factor among the different levels. For example, for
a high level of security, the models add 27% (COCOMO II security extensions), 20%-80%
(COSECMO), 87% (Weapon systems), and 25%-50% (Secure OS software cost model) to
the project effort. As an example, Rindell, Hyrynsalmi, and Leppänen [106] investigated the
development of an Identity Management software for a government and reported that the
project management estimated a 1.5 to 2 factor increase in costs due to security features
and regulation.
2.4.4 RQ1.4: Which software security standards and engineering
processes are used in the studies?
The standards and engineering processes were analyzed together because many reviewed
papers use them in the same context.
The most frequent standards and processes were Common Criteria (CC, 8 studies, [1, 7,
9, 34, 46, 59, 104, 105]), Microsoft Security Development Lifecycle (SDL, 4 studies, [7, 9, 87,
107]), Open Web Application Security Project (OWASP, 2 studies, [1, 87]), SAP security
development lifecycle (S2DL, 2 studies, [94, 93]), and Touchpoints (2 studies, [7, 9]).
Other standards/processes with one reference were Building Security In Maturity Model (BSIMM) [87], Comprehensive, Lightweight Application Security Process (CLASP) [7], COBIT (https://cobitonline.isaca.org) [1], ISO/IEC 13335-5: Guidelines for Management of IT Security [46], Information Technology Security Cost Estimation Guide [1], and SAFECode [87].
It is relevant to note that some of them are related: for example, BSIMM is a study that quantifies security practices adopted by organizations, and Touchpoints is one category of BSIMM's activities. OWASP is a not-for-profit organization that promotes initiatives related to software security improvement, and CLASP is one project under OWASP.
CC’s Evaluation Assurance Levels (EAL) are used to build the SECU cost driver for
COCOMO II security extension [105] and COSECMO [34]. For the other papers, CC’s
activities are analyzed and compared to other standards and engineering processes.
2.5 Survey Results
The survey remained available online for the invitees to participate for two weeks. After that
period, 161 responses were obtained, 110 of which were complete. More than 40 participants
used the LinkedIn messaging service to interact with the researchers. Most of the messages
were to confirm participation. Other messages contained positive feedback about the study, including offers of further collaboration in the research, or informed that the sender did not consider themselves to be part of the target audience. One group member stated that he did not
agree with the approach to the topic under study, and another one informed that he was not
allowed to provide information under his current work contract.
Five notifications were received from users who informed that they did not consider themselves the target audience since they were sales professionals (three cases), a recruiter (one case), or 'unspecified' (one case). Given this information, the user profiles of the whole sample were analyzed to filter out the group members who were recruiters and sales professionals. Profiles from 49 participants were classified as sales professionals, 50 as recruiters, and one as 'other' (not clear from the profile, but the user declared himself not to be the target audience). Thus, 100 of the 908 profiles (11% of the recruited sample) were not considered part of the target audience.
Considering that 11% of the sample was composed of profiles that were not the target
audience, this percentage was deducted from the sample size and the population to evaluate
the extent to which the survey results represent the opinion of the professionals from the
Software Security group.
The sample frame was adjusted to the remaining 808 profiles. Taking into account the
adjusted sample frame, the completed responses (110) represent 13.61% of the total of the
recruited sample (808). This value can be considered to be good, since response rates for
Software Engineering survey studies, when reported, tend to be low [81], typically less than
10% [122]. A response rate between 3% to 4% was reported by Mello, Silva, and Travassos
[80] in a similar LinkedIn-based survey.
The fact that one of the researchers was a manager of the LinkedIn group, together
with the gift card raffle offered, may have influenced the participation positively. On the
other hand, the sensitive nature of security questions may have hindered more extensive
engagement, as exemplified by the case of the group member who notified that his work
contract did not allow his participation.
If 11% is deducted from the original set of 2,102 LinkedIn profiles, the adjusted population becomes 1,871 professionals. Based on this value and on the number of responses obtained, a confidence interval of 9.07 was calculated for a confidence level of 95%. This is similar to the result obtained by Mello, Silva, and Travassos [81], in which the best confidence interval obtained was 9.95.
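The reported value can be reproduced, approximately, with the standard margin-of-error formula for a proportion with a finite-population correction; the sketch below assumes p = 0.5 and z = 1.96, which may differ from the exact calculation used in the study.

```python
import math

def margin_of_error(n, N, p=0.5, z=1.96):
    """Margin of error (in percentage points) for a proportion estimate,
    with a finite-population correction for population size N."""
    se = math.sqrt(p * (1 - p) / n)       # standard error of the proportion
    fpc = math.sqrt((N - n) / (N - 1))    # finite-population correction
    return 100 * z * se * fpc

# Adjusted population (1,871) and completed responses (110) from this survey.
print(round(margin_of_error(n=110, N=1871), 2))  # ~9.07
```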
2.5.1 Background Information
The survey reached a broad set of countries. The majority of the participants were from
the United States (25%); followed by India (15%); Italy (6%); Ireland, Brazil, and UK (5%
each); Israel, Saudi Arabia, and Netherlands (4% each); Germany and Iran (3% each); and
18 other countries with 2% and 1% of the responses.
Figure 2.4 presents the positions of all respondents in their organizations. Most of the participants work as Security expert (37%); together with Management and Software developer (16% each) and Project leader (12%), these positions represent 81% of the sample.
[Figure 2.4 Position in Organization: Security expert 37%; Management (e.g. Area manager) 16%; Software developer 16%; Project leader in the development 12%; Member of the security group 5%; Security tester 1%; Other 13%]
Regarding experience, 73% of the participants have more than three years of experience in software security, and 34% of the whole sample reported more than ten years. For the academic degree question, the Master level was indicated by 39% and a Bachelor degree by 38% of the participants. Figure 2.5 depicts the participants' years of experience in secure software development combined with their academic degree.
The participants were asked to select the size and the primary domain or industrial sector of their organization. The Information sector, the Professional, Technical and Scientific Services sector, and the Financial and Insurance sector dominate the sample with 57% participation. The size of the organization was defined in terms of the number of employees. Half of them (49%) are large organizations with one thousand or more employees. The second most significant portion was small companies with 1 to 24 workers (21%). Figure 2.6 shows the distribution of the respondents across all distinct industries and the size of the organizations.
[Figure 2.5 Experience in SWSec Development and Academic Degree: stacked bars per experience band (less than 6 months, 6 months to < 1 year, 1 to < 3 years, 3 to < 6 years, 6 to < 10 years, 10 years or more), broken down by degree (High school, Bachelor, Associate, Master, PhD, Other)]
Most of the respondents in the sample confirmed that SWSec practices are used for all
or for most of the projects in their groups, as shown in Table 2.10.
[Figure 2.6 Sector and Organization Size: respondents by sector (Information; Professional, Technical and Scientific Services; Other; Financial and Insurance; Healthcare; Manufacturing; Education; Retail) and by organization size (1 to 24, 25 to 49, 50 to 249, 250 to 999, 1000 and more employees)]
A diverse set of sources for learning about software security practices was observed among the participants, as Table 2.11 shows.
Table 2.10 Usage of SWSec Practices on the Group
Security Practices Usage | n | %
No | 8 | 7.3%
Not Anymore | 2 | 1.8%
Yes | 100 | 90.9%
Yes - For few projects | 5 | 5.0%
Yes - For selected projects | 10 | 10.0%
Yes - For most of the projects | 30 | 30.0%
Yes - For all projects | 55 | 55.0%
Technical forums was the most common source
of information, closely followed by Training, Conference and Academic course.
Table 2.11 How Participant Became Aware of SWSec Practices
Source | n | %
Article in a scientific journal | 12 | 3.2%
Colleague (external) | 13 | 3.5%
Consultant | 23 | 6.2%
Other | 23 | 6.2%
Article in a practitioner journal or magazine | 29 | 7.8%
Industry workshops | 29 | 7.8%
Colleague (internal) | 30 | 8.1%
Textbook | 35 | 9.4%
Academic course | 38 | 10.2%
Conference | 42 | 11.3%
Training | 48 | 12.9%
Technical forums | 49 | 13.2%
Note: overall exceeds 100% due to multiple selection.
2.5.2 Projects Characterization
To answer questions about secure software development, the participants were asked to
freely choose one project they had participated in and in which software security practices were applied. From the 110 participants, 97 (88%) were involved in such projects, and 13
(12%) were not. From the 97 projects, 60 (61.9%) were a New development type of project.
The distribution across development types is presented in Table 2.12.
Table 2.12 Development Type
Project Type n %
New development 60 61.9%
Enhancement 22 22.7%
Migration 7 7.2%
Re-development 6 6.2%
Other 2 2.1%
Overall 97 100.0%
Table 2.13 summarizes the statistical measures related to the size of the project team, the
duration of the project from the project charter approval to the delivery of the last iteration,
the project size in Persons-Month (PM), and the security risk. Participants could classify the security risk on a scale from 1 to 5, where level 1 represents a low likelihood of attempts
to attack (low “attractiveness”), and in an eventual successful attack the impact would be
minor damages to the system or service and data; while level 5 means high probability of
attempts to attack (high “attractiveness”), and in an eventual successful attack the impact
would be full system compromise or undetected modification of data. The typical project
in this study’s sample takes around one year of execution, with an 8-person team, having a
security risk level of 4, as can be observed in Table 2.13.
Table 2.13 Summary Statistics of Team Size and Project Duration
Measure | Team Size | Duration (months) | Project Size (PM) | Security Risk Level
Min | 1.0 | 0.5 | 4.0 | 1.0
1st Qu. | 5.0 | 6.0 | 30.0 | 3.0
Median | 8.0 | 11.0 | 85.0 | 4.0
Mean | 33.2 | 14.3 | 564.3 | 3.7
3rd Qu. | 20.0 | 15.8 | 366.0 | 5.0
Max | 1000.0 | 97.0 | 12000.0 | 5.0
Std. Dev. | 108.7 | 14.6 | 1785.9 | 1.3
NA | 13.0 | 14.0 | 14.0 | 16.0
2.5.3 RQ2.1 What approaches are used in industry to estimate the costs of security in software development projects?
The participants were asked whether SWSec activities were taken into account when the time/effort to develop the project was planned. 56.7% of the participants answered affirmatively; for 34%, only part of the security activities was considered; for 6.2% of the projects these activities were not taken into account; and 3.1% answered that they did not participate in the project planning.
Regarding the method of software effort estimation used by the project, Expert judge-
ment, which includes story points, structured processes for expert judgment and paired
comparisons, was the most frequent, in 46.4% of the cases. Other methods were Work break-
down (WBS-based and other activity decomposition-based methods), with 21.6%; Analogy
(analogy- and case-based reasoning; e.g., analogy with different projects), with 11.3%; Func-
tion Point (methods based on function points, feature points, or use case points), with 6.2%;
Parametric model (e.g. COCOMO II, SEER-SEM, SLIM), with 2.1%; and Other with 4.1%.
Participants who did not know the effort estimation technique accounted for 8.2%.
Table 2.14 shows the responses for software estimation techniques and planning of SWSec
activities combined. Column ‘Yes’ means that all security activities were planned; ‘Part’,
activities were partially planned; ‘No’, were not planned; ‘NP’, the respondent did not
participate in the planning; ’Ov’ is the overall result.
The participants were invited to comment on the difficulties that they faced when plan-
ning and estimating the time/effort to perform security activities in the project. The ques-
Table 2.14 SW Estimation Technique and Planning of SWSec Activities
Method / Planning Yes Part No NP Ov(n) Ov(%)
Analogy Based 5 5 1 0 11 11.3%
Expert judgment 27 14 3 1 45 46.4%
Function Point Based 3 2 0 1 6 6.2%
Parametric model 1 1 0 0 2 2.1%
Work breakdown 15 4 2 0 21 21.6%
Not known 2 5 0 1 8 8.2%
Other 2 2 0 0 4 4.1%
Overall (n) 55 33 6 3 97 100.0%
Overall (%) 57% 34% 6% 3% 100%
tion, which was the only open-ended question in the whole survey, was answered by 78 participants. The lack of a security culture among developers, managers, and business stakeholders was the most cited challenge (mentioned six times):
• “There are a few, but getting people to truly stop, and understand 100% why the best
practices are needed, can be a challenge - when people get focused on delivery dates.
Once you explain the ’What could happen...’ - it tends to sink in."
• “Always people considered security as feature to add after business logic and program-
ming are finished so it happens to delay the project a lot."
• "Convincing project manager to incorporate security related time and effort."
• “Low priority from higher management, strict delivery deadlines - all estimates were
hard or rejected."
The prioritization of business features over security was also pointed out as a difficulty
(mentioned four times):
• “Business wants least time in security as the delivery is (the) main focus."
• “Fast development, to get feature out. Feature priority, security takes back seat some-
times."
• “Estimating time/effort wasn’t the real challenge. It was more of getting a buy-in from
Development team regarding time allocation for security assurance activities as these
were generally given lower priority due to their non-functional nature compared to
business/functional tasks."
2.5.4 RQ2.2 What is the frequency and effort spent on software security practices in projects?
The participants answered the extent to which they applied software security practices in
the project and the average effort required for their application each time. The practices
were selected according to the study performed by Morrison, Smith, and Williams [87], and
presented in Table 2.1. The results are illustrated in Figure 2.7 (frequency of engagement)
and in Figure 2.8 (effort each time).
[Figure 2.7 Frequency of Security Practices Usage: stacked bars per practice (Apply Security Requirements, Apply Data Classification Scheme, Apply Threat Modeling, Document Technical Stack, Apply Secure Coding Standards, Apply Security Tooling, Perform Security Testing, Perform Penetration Testing, Perform Security Review, Publish Operations Guide, Track Vulnerabilities, Improve Development Process, Perform Security Training) across the categories Daily, Weekly, Monthly, Quarterly, Annually, Once in the Project, Not Applied]
As Figure 2.7 shows, Application of Secure Coding Standards is performed daily in 54%
of the projects, the only practice that achieved majority with this frequency. Apply Security
Tooling was the second most frequent (36%), followed by Apply Security Requirements and
Track Vulnerabilities (27% both).
In common with the results obtained by [87], Application of Secure Coding Standards is
the practice most often applied daily. However, the second most frequent practice with daily
[Figure 2.8 Effort of Security Practices Usage (each time): stacked bars for the same practices as Figure 2.7, across the categories 15 min or less, 15-30 min, 30 min - 1 hour, 1-4 hours, 4-8 hours, 1-2 days, More than 2 days, Not Applied]
use in their study was Track Vulnerabilities, while in the present survey it is only the fourth. For this study's sample, Track Vulnerabilities is more commonly applied monthly than daily. Another point of agreement between this study and [87] is that Publish Operations Guide was the least applied practice of all in both studies.
Regarding effort per application, Perform Penetration Testing takes more than two days
for 45% of the projects. On the other hand, Apply Security Tooling, Apply Security Require-
ments and Apply Secure Coding Standards are the practices whose application requires one
day or less for more than 60% of the projects.
Figures 2.7 and 2.8 also demonstrate that the usage profile of security practices varies considerably according to the type of practice. Aspects like the nature of the task, the role responsible for it, the type of development process used, and the organization's culture can affect the practices' usage. For example, Application of Secure Coding Standards can be executed whenever a developer is working on the code base, in general a daily task for this role; on the other hand, activities typically performed by external roles, like Perform Penetration Testing, tend to be less frequent but can take more time when executed. Perform Security Testing will be applied more frequently if an Agile approach is used, and less frequently and more towards the end of the project in more traditional life cycles.
As in the study performed by Morrison, Smith, and Williams [87], the frequency and effort answers were converted from ordinal categories to ratio values to compute the overall effort that
each respondent spent in security practices. One difference is that while Morrison, Smith,
and Williams [87] treated the ordinal answers as ‘number of times used per year’, this study
treated them as ‘number of times in the project’. Thus, instead of normalizing the results per
year, the results were normalized per project, using information about the project duration
(PD), in months, provided by the participants.
The following translation was used for frequency: Daily=PD*20, Weekly=PD*4,
Monthly=PD, Quarterly=PD/4, Annually=PD/12, Once in the Project=1, Not Applied=0.
For effort, the translation was: 15 minutes or less=0.125, 15-30 minutes=0.375, 30 minutes-1 hour=0.75, 1-4 hours=2, 4-8 hours=6, 1-2 days=12, More than 2 days=40, Not Applied=0.
The usage of the practice was computed as the proportion of the amount of effort in
hours (frequency * effort each application) for each practice, divided by the overall effort for
one person in the project (PD*20*8, where 20 is the number of work days in a month, and
8 is the number of daily work hours). This assumes that the person worked full-time in the
project. Figure 2.9 presents the results. The median of the usage for all practices is below
10% and only Apply Secure Coding Standards, Perform Security Testing, and Apply Security
Tooling have the third quartile above 15%.
The chart in Figure 2.9 also shows the existence of outliers for all practices. Values lying more than 1.5 times the interquartile range beyond the ends of the box were considered outliers (the standard Tukey rule, https://www.purplemath.com/modules/boxwhisk3.htm). These points refer to cases where the participants reported daily practices with a high average effort for each execution. For example, if the practice was executed daily and took 4-8 hours in each execution (6 hours in this study's translation), the practice would be classified as 75% usage. In some cases, participants selected a practice as performed daily with an average execution time of more than 2 days, which would surpass 100%. These values were limited to 100%. This happened in 4.6% of the data points, notably for the Secure Coding Standards practice, which may indicate that some participants consider the practice an ongoing effort.
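The normalization described above can be sketched as follows; the function below is an illustrative reconstruction of the computation, not the exact script used in the study.

```python
def practice_usage(frequency, effort, pd):
    """Share of one full-time person's project hours spent on a practice,
    following the frequency/effort translation described in the text."""
    applications = {
        "Daily": pd * 20, "Weekly": pd * 4, "Monthly": pd,
        "Quarterly": pd / 4, "Annually": pd / 12,
        "Once in the Project": 1, "Not Applied": 0,
    }[frequency]
    hours_each_time = {
        "15 min or less": 0.125, "15-30 min": 0.375, "30 min - 1 hour": 0.75,
        "1-4 hours": 2, "4-8 hours": 6, "1-2 days": 12,
        "More than 2 days": 40, "Not Applied": 0,
    }[effort]
    person_hours = pd * 20 * 8  # 20 work days per month, 8 hours per day
    return min(applications * hours_each_time / person_hours, 1.0)  # cap at 100%

# A daily practice taking 4-8 hours (6 h in the translation) in an 11-month project:
print(practice_usage("Daily", "4-8 hours", pd=11))  # 0.75 -> 75% usage
```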
[Figure 2.9 Participants' Usage of Security Practices in the Project: box plots of usage per practice (Sec Req, Data Class, Threat Model, Tech Stack, Secure Code, Sec Tooling, Sec Testing, Pen Testing, Sec Review, Oper Guide, Tracking Vuln, Impro Process, Sec Training)]
2.5.5 RQ2.3 How much effort/cost is added to a project due to the
application of security practices?
The participants were asked to state a percentage value of the overall project effort dedi-
cated to security activities. Given the high variance of the results, the distribution of the
effort percentage provided by the participants was analyzed across the variables Type of
Development (Figure 2.10) and Industry Sector (Figure 2.11).
For New Development type of projects, which represents 61.9% of the sample, the median
of effort is 20% and the average is about 25%. Similarly to the practices' usage analysis, the presence of outliers is observed, suggesting that some participants view security as ingrained in the software development process.
Regarding industry sectors, Figure 2.11 shows that the Information and Healthcare verticals present the lowest median value of effort (about 15%), while Retail, Manufacturing, and Education had the highest values, about 30%, 30%, and 50%, respectively.
2.5.6 RQ3 How does practice compare to the state of the art?
The major sources of cost discovered in the systematic mapping are countermeasures in general or specific security practices, as listed in Table 2.8. Except for Document Technical Stack and Improve Development Process, all other practices included in the survey (Table 2.1) were also found in the sources of cost extracted from the literature. This shows that academia and practice are aligned on what drives the cost of software security. Despite that, the security activities were only partially considered (34% of the projects) or not considered at all (6.2%) in time/effort estimations when planning the projects in the industry.
This oversight yields inadequate resources for software security, which in turn can undermine the application of the practices.
[Figure 2.10 Percentage of Project Effort Dedicated to SWSec Practices Across Development Type: box plots of the percentage of project effort for New development, Enhancement, Migration, Re-development, and Other]
Comparing effort estimation methods or models, five out of the ten found in the systematic mapping were COCOMO II-based models. Other methods were extensions of functional size metrics like IFPUG Function Points and COSMIC Function Points, or other equations used by the researchers. For practitioners, on the other hand, COCOMO II- and functional size-based methods were the least used (2.1% and 6.2%, respectively). The small number of studies, some of which were developed for specific environments, in addition to the lack of empirical validation, suggests that they may not be ready to be adopted by the industry.
[Figure 2.11 Percentage of Project Effort Dedicated to SWSec Practices Across Sectors: box plots of effort percentage for Professional, Technical and Scientific Services; Information; Financial and Insurance; Retail; Other; Manufacturing; Healthcare; Education; Public Administration]
Regarding the added costs for security, the literature indicates that for projects with a
high level of security the additional costs can range from 20% to 80% (see Table 2.9). For
the New Development category of projects surveyed, the average effort dedicated to security
was 24% and the median was 20%, which in terms of added costs represents an average of
31.6% and median of 25%.
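The conversion from effort share to added cost appears to treat security effort as an overhead on the remaining (non-security) work; under that assumption, a share s of total effort corresponds to an added cost of s/(1 - s), which reproduces the reported figures:

\[
  \text{added cost} = \frac{s}{1 - s}, \qquad
  \frac{0.24}{0.76} \approx 31.6\%, \qquad
  \frac{0.20}{0.80} = 25\%
\]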
2.6 Limitations of the Study
Aiming to analyze the state of the art in software security costs, the first part of the study
conducted a literature review in the form of a systematic mapping. The common threats to
this type of research are the coverage of the study and researcher bias in the study selection process, data extraction, and analysis. The adoption of the quasi-gold standard search strategy,
and the broad selection of sources for the manual and automated search helped to cope
with the mapping coverage threat. One issue of this study was that the whole process was
conducted solely by a PhD student, while it is usually recommended to be based on a team
of experienced researchers. This limitation was minimized by preparing a detailed research
protocol, which was reviewed by the supervisor and another PhD student. In addition, the
paper screening and data extraction processes were later partially validated by two Master
students, who found minimal discrepancies.
For the survey part of the study, common threats are related to bias in the question-
naire construction and generalization of the results (external validity). To mitigate the first
threat, three measures were considered. Prior to constructing the questionnaire, previous
surveys related to the topic were reviewed; then, to assess if the questionnaire was capable
of conveying a common understanding between researchers and subjects, it was evaluated by
an external senior researcher; and finally, the questionnaire was also piloted with members
of the target population. Additionally, participants did not express any concerns regarding
the questionnaire in their feedback through the LinkedIn message tool.
The external validity of the survey assesses the extent to which the results can be generalized to the target population, represented by the sample analyzed. Although limitations imposed by the LinkedIn message tool prevented us from composing a broader set of stratified samples involving more professionals, the sampling strategy sought to extract a representative set of respondents, framed from a software security group of interest in the professional social network LinkedIn. Two strategies were used to engage the subjects: a reminder to participate in the survey was sent one week after the initial invitation, and a raffle of a US$50.00 Amazon gift card was offered as a reward for the ones who completed the questionnaire in time. The questionnaire garnered 110 responses from practitioners with varying experiences, backgrounds, development teams, and organizations. While LinkedIn does not represent the entire population of software security professionals, it is the most popular social network used by professionals and recruiters (https://www.lifewire.com/what-is-linkedin-3486382), offering a diverse demographic audience. However, the findings cannot be generalized to every other development situation and are limited by the sample size.
2.7 Lessons Learned
The lessons learned from this study are mostly related to the application of the quasi-gold
standard search strategy and the use of LinkedIn as a source of the sampling frame.
The systematic mapping, through the combination of manual search, snowballing, and automated search, resulted in a large number of papers to be screened. In this study, snowballing was the most effective method to identify relevant papers, accounting for 65% of the resultant set. On the other hand, nine venues were analyzed in the manual search, corresponding to a total of 11,657 papers, from which only nine studies were selected. The manual search step was
essential to avoid bias in the starter set of papers for the snowballing process and the search
string elaboration. However, care must be taken when selecting the venues and the period
of the search in order to achieve a good balance between recall and precision.
With respect to the survey, LinkedIn proved to be a demographically diverse database of industry professionals. However, there are limitations that need to be accounted for when planning a survey based on it.
The LinkedIn groups, which gather professionals according to mutual subject interests, are the source generally used to identify the population and establish the sampling frame. Nevertheless, to have access to the set of group members, it is necessary to be part of the group, which requires being accepted by the group owner or group managers. It was observed
that some group owners promptly accept requests for entering the group, while others take
more time to do it, and in some cases no response is received from them.
A list of all members of the pertinent groups needs to be extracted from LinkedIn if the
plan is to apply a random sampling approach or stratified sampling approach. As noted
by Mello, Silva, and Travassos [81], regular accounts limit the visualization of members to
2,000 records. Premium accounts offer additional filters which allow segmenting the list to
extract the whole group. However, having a premium account requires the payment of a
monthly fee.
In addition, Mello, Silva, and Travassos [81] observed that it was necessary to use a
premium account to access a high number of subjects. While this study was being conducted, it was realized that even these special accounts now impose very low limits on sending messages to other users on the platform. This restriction significantly impacts the use of LinkedIn to conduct surveys, if not making it infeasible. This new policy seems to be related to privacy concerns and the amount of spam messages that were sent by some users (https://business.linkedin.com/talent-solutions/blog/2014/07/inmail-policy-change). To overcome the restriction on message sending, group owners were asked to promote one of the researchers to a group manager. Group managers and group owners have no limits on sending messages
inside the group. Fortunately, access to the most pertinent group for this research was granted, but relying on this is a risky strategy when planning a survey. If this approach is chosen, it is recommended to execute a feasibility study step early in the survey planning.
After sending the initial invitation to the selected sample of group members, many sales professionals and recruiters who were not part of the target audience were detected in the group. All profiles in the sample were inspected, and it was verified that such profiles represented 11% of the sample. This phenomenon may occur in other groups as well. It was not possible to automate the process of sending personalized messages to the members in the sample as performed in [81]. The dynamic identification of the interface elements restricts the use of web automation tools. Since the sample was not too large (908), finding a solution would take more time than manually sending the invitations. However, this risk needs to be addressed for large sample sizes.
Despite all these limitations, LinkedIn is an excellent tool to get in touch with Software Engineering practitioners. The demographic results of the study were diverse, contributing to the representativeness of the sample. Moreover, the messaging tool facilitates the interaction with the participants, which could be further explored in future studies.
2.8 Conclusion
Software security practices have been increasingly applied by development teams to reduce
vulnerabilities in deployed applications and cope with external threats. The effort/costs of
performing these activities are often pointed out as a barrier to their wide use, although there
is a lack of knowledge about the amount of resources needed to achieve a determined level
of security assurance. This study provided the state of the art and practice related to the
costs of secure software development, through the execution of a systematic mapping of the
literature and a survey with practitioners recruited over the professional network LinkedIn.
For the systematic mapping, papers from 2000 to 2017 were selected combining manual search, snowballing, and automated search, which allowed assessing the completeness of the search strategy and improving the search string. 54 papers were found, categorized, and analyzed to answer the research questions. The results demonstrated that there is an increase in the number of papers published over the years, suggesting a growing interest in the community.
Papers were also classified into 10 categories, according to their goals and results. The groups related to Software Cost Estimation Models and the Economics of Software Security contained the papers most relevant to this study. Regarding sources of cost, most of the aspects are related to software security practices established by known secure software engineering processes and standards. The three most frequent sources of cost were Perform Security Review, Apply Threat Modeling, and Perform Security Testing. Ten approaches devoted to estimating the costs of software security were found, five of which were COCOMO II-based cost models. However, despite the coherence of the added costs of security among these models, validation remains a challenge. Economics of Secure Software Development papers
showed that this is a recent line of research, which seeks to shed light on the cost-benefit
aspects of Software Security.
For the survey, 110 responses were received from 29 different countries. The final sample was composed mainly of security experts, managers, and software developers with more than
three years of experience. The participants were asked to choose a project in which they had
applied security practices and answered questions regarding the usage and effort estimation
of such activities.
The effort estimation method most frequently used was Expert Judgement, followed by
Work Breakdown. The effort dedicated to security activities in new development projects is
around 20% (median). Analyzing this effort across sectors, the Information and Healthcare
verticals presented the lowest value (15% for the median), while Retail, Manufacturing and
Education presented the highest effort (median above 30%). The participants mentioned the
lack of a security culture and the prioritization of business features over security as the major
challenges for planning and estimating the security effort.
The results showed that all 13 security practices analyzed were applied in most of the
projects. The Application of Secure Coding Standards was the practice most often applied daily, while Publish Operations Guide was the least used. The usage of the practices was computed as the proportional effort of that activity relative to the overall one-person effort in the project. Again, Application of Secure Coding Standards presented the highest value for median and
average. For some participants, the usage was calculated as 100%, which may indicate
that they consider the security practice an ongoing effort. The wide variance and outliers
found in practices’ usage indicate that further investigation is needed to comprehend how
development teams and stakeholders consider these efforts in the projects.
It was observed that development projects in diverse industry verticals are indeed applying security practices (90.9% of the sample), and in most cases for all projects conducted in the organizational unit (according to 55% of the participants). The results show that security-related activities can represent a large share of a project's effort; however, the planning and estimation of such tasks is problematic. For more than 40% of the projects, not all, or even none, of the security activities were taken into account during planning. Since most of the projects apply Expert Judgment and Work breakdown as estimation techniques, this means that many projects do not receive the resources needed for properly implementing security. Furthermore, the lack of a security culture among developers, managers, and business stakeholders hinders the incorporation of security practices in the projects.
Comparing the results from the systematic mapping with the survey, similarities were
found regarding the study and use of the same security practices in academia (as found in the sources of cost) and industry (demonstrated by their usage). The ranges of added effort for security activities in development projects are comparable between the models proposed by researchers and the values reported by the practitioners. On the other hand, the methods for cost/effort estimation differ: the literature explores COCOMO- and functional sizing metrics-based models, while industry employs Expert- and Work breakdown-based methods.
Overall, the results of this study offer an opportunity to understand the current research
about security cost issues in software engineering and also provide valuable descriptive in-
formation about the current state of security practices adoption in software developing orga-
nizations. Software engineering practitioners who are initiating the introduction of security
practices can use these results as a basis to plan the activities and the effort and frequency
with which they are applied. Organizations that already apply these practices can use this
study as a benchmark for assessing their efforts.
Future work could address the challenges of cost model validation, with potential improvements to these models. The increment in effort for adopting security practices cannot
be neglected. However, how to make informed decisions about the right amount of resources
needed to achieve a proper level of security according to the characteristics of each project
is still a challenge. Further studies with an interdisciplinary approach between Software
Engineering and Software Security would be beneficial.
2.9 Summary
These two preliminary studies showed that software security practices, which are the main
source of costs, are currently being applied in the industry. It was also observed that the
effort predicted by the theoretical models are somewhat corroborated by the effort spent
in the projects according to the survey. For a project with a high level of security, the
estimates of the COCOMOII-based models ranged from 20% to 80%, and for the survey this
value was 31.6%. Despite of that, security activities were only partially considered (34% of
the projects) or not even considered (6.2%) on time/effort estimations when planning the
projects in the industry. This oversight may yield inadequate resources to software security,
which in turn can undermine the thorough application of the practices.
Overall, the prior research studies provided evidence that security is a factor that drives
effort in software projects and that security practices are an important source of the cost.
Such practices can represent an important factor for establishing a parametric model to
predict security effort.
Chapter 3
Methodology
This chapter describes the methods applied to answer the research questions. In the first
section, the research questions are revisited and the approaches to address each of them are
introduced. Sections 3.2 and 3.3 detail the methods for phases II and III of the research (see
Figure 1).
3.1 Research Goal, Questions, and Methods
As stated in the introduction, this research aims at quantifying the effects of secure software development on development effort. As a refresher, this dissertation answers two main
questions:
• RQ1: How to measure the growing levels of secure software development?
Based on the preliminary studies’ results, a new measurement scale was developed,
more in line with current development practices. A primer on scale development was
adapted to the context of secure software engineering to provide a foundation for the
new scale. The establishment of such a scale was an important step towards the data
collection and calibration of the software security cost model.
• RQ2: What is the increase in development effort caused by growing levels of secure
software development?
Research question 2 was addressed by building a statistical model to explain the in-
crease in development effort caused by specific degrees of secure software development.
Expert estimation and project effort data were collected and analyzed to build a
model and quantify the factors that affect software development effort, with a focus
on security.
3.2 Phase II - The Rating Scale Development
Phase II of the work addressed research question 1, establishing the degrees of application
of security practices in software development. According to the level of the security risk of a
software system, the development team may perform different security practices and apply
these practices with varying intensity. Also, developers may implement distinct security
features in response to security threats. These factors drive the effort required to develop
the software in order to mitigate the security risks.
Secure software development is a complex phenomenon, thus not trivial to measure. Most
of the previous security cost estimation models, as analyzed in the systematic mapping [126],
used the Common Criteria standard (CC) Evaluation Assurance Levels (EAL) [36] to define
the levels of security in software development. However, as discussed in chapter 1, CC EAL
is not an appropriate scale measure for secure software development. Thus, instead of using
the CC standard, this study developed a new rating scale to qualitatively measure secure
software development based on different levels of security practices application.
Scales are measurement instruments that combine sets of items into a composite score
and are frequently developed to reveal levels of theoretical variables that cannot be directly
observed [42]. Methods for scale development have long been discussed in health and social
sciences [42, 19], providing rigorous and tested approaches. Such methods and concepts
can be applied for scale development in Software Engineering, considering the nature of the
phenomenon being analyzed, which, in this case, involves development teams applying sets
of security practices in different degrees.
The process to create the security rating scale was adapted from a recent primer on best
practices for developing and validating scales for health, social, and behavioral research,
proposed by Boateng et al. [19]. The primer presents an overall process to develop scales
for measuring complex phenomena. The process has nine steps, divided into three phases,
which can be selected according to the purpose of the research, the intended end-users of
the scale, and the resources available.
In the first phase, items for the scale are generated and the content is validated. The
second phase involves the steps to construct the scale and pre-testing it. The third and
last phase is about the scale evaluation, regarding its dimensions, reliability, and validity.
Phase three was not applied in this study, partly because some of the tests (dimensionality)
were not necessary for the scale, and partly because of the unavailability of the required
data. Although a formal evaluation, as proposed in phase three, was not possible, informal
evaluations and feedback provided at the end of phase two served to prepare the scale for
its use.
Figure 3.1 illustrates the whole process, with the two phases and the activities planned for each phase, adapted to this research's needs and resources.
[Figure 3.1 Scale Development Phases and Steps: Item Development (1. Identification of domain and item generation; 2. Content validity) and Scale Development (3. Scale points description; 4. Item reduction; 5. Pre-testing scale; 6. Sampling and data collection)]
Adaptations and simplifications were made to the original process based on the characteristics of the proposed rating scale, which sometimes differ from typical scales from health and social sciences:
• The subject to be measured is a software development project.
• The property to be measured is the secure software development level, captured by the
degree of application of security practices.
• The final scale consists of one aggregated item (secure software development level)
composed of initial individual items (software security practices).
One important difference of the proposed scale from typical scales cited in the primer is that the subjects here are software projects instead of individuals. The availability of software projects to be measured is generally more restricted, especially when they require high levels of security. Another particularity of the proposed scale is that, since the original
items will not be answered individually, some activities in the scale development, related to
the joint analysis of items’ answers, do not apply. Further details on the adaptations are
described for each phase and step description in the following subsections.
3.2.1 Item Development
This phase is concerned with the content of the scale and has two steps:
• Step 1: Identification of domain and item generation: selecting items for the scale.
• Step 2: Content validity: assessing if the items adequately measure the domain of
interest.
Step 1, Identification of domain and item generation, specifies the boundaries of the
domain and the appropriate questions or items related. As recommended in the primer,
this study defines the domain and items of the scale based on a literature review. For the
item generation, it is recommended to balance a deductive method, as the literature review,
with an inductive method that obtains qualitative data through direct observations and
exploratory research methodologies. However, a study to explore the domain in an inductive
manner is not necessary in this case, as the practices identified in the literature review were
selected from frameworks used in the industry.
Step 2, Content validity, evaluates how adequate the items are to measure the target
phenomenon, regarding relevance, representativeness, and technical quality [19].
3.2.2 Scale Development
Phase 2, the scale construction, has four steps, including pre-testing the questions, adminis-
tering a survey with the scale, reducing the number of items, and understanding how many
factors the scale captures. For this research, the steps were adapted in the following manner:
• Step 3: Scale points description: specifying the meaning of each level in the rating
scale.
• Step 4: Item reduction: ensuring the scale is parsimonious.
• Step 5: Pre-testing scale: ensuring the items and rating descriptions are meaningful.
• Step 6: Sampling and data collection: getting enough data from the right projects.
Step 3, Scale points description is not in the primer, but this activity was needed for this
research because the proposed scale to measure the level of security in software development
requires more than a numeric item as response (as a Likert scale). Each point in the scale
needs a proper specification that describes the intensity and rigor of the execution of a set
of software security practices.
Given the difficulties in collecting data from software practitioners, step 4, Item reduction, originally the fifth step (after pre-testing questions and survey administration), was brought forward to prune the instrument before presenting it to the end-users. The purpose of this
step is to remove the less significant items of the scale, ensuring that only the items that
contribute to the scale measurement are retained to develop a parsimonious scale.
The criterion established to keep only the most relevant practices in the scale was to
evaluate the degree of contribution of the practice’s scale points description to distinguish
levels of security practices application. For the practices to be removed, the nature of the
activities involved should not change significantly from a basic level of security to a high
level of security, except for the descriptor for attribute level.
Step 5, Pre-testing scale, involves ensuring that the items are meaningful to the target
population, minimizing misunderstandings and measurement error [19]. Two components
are examined: the extent to which the questions reflect the domain being studied and the
extent to which answers produce valid measurements.
To evaluate if the rating scale reflects the domain being studied, the focus group technique
was chosen. This is one of the approaches suggested by Boateng et al. [19] for this step. Also,
to evaluate if the rating scale is able to produce valid measurements, a Wideband Delphi
session was selected as the strategy. Both approaches, the focus group discussion and the Wideband Delphi session, were planned as two workshops to take place at the annual research review event organized by CSSE.
This event is a good opportunity to execute these studies because it gathers researchers and
industry affiliates who are potential users of the proposed model. Also, the attendees of this
event are used to Wideband Delphi sessions.
The focus group goal is to discuss with participants if the rating scale correctly measures
distinct levels of secure software development. The following steps were followed for the
focus group discussion:
1. Provide an overview of the objectives of the study and the discussions, and provide
information on how participants should act during the session.
• Background information: purpose of the study, authors of the study, how the
information is going to be used.
• State that organizers guarantee the confidentiality and anonymity of the discus-
sions.
2. Context. Present background for the participants (from previous papers, survey).
3. Present the rating scale table. Provide hand-out with the scale to each participant.
• Describe practices and levels of practices usage.
• Show the condensed and the detailed views of the scale.
4. Ask participants:
• Is the scale easy to understand? Which aspects need clarification? (understand-
ability)
• Are there missing practices or other security related aspects that contribute to a
team’s effort? (completeness)
• Are there unnecessary practices or information? (completeness)
• How coherent are the levels of the practices for each degree of the scale? (coher-
ence)
• In which ways you or your team could use the rating scale to plan security efforts
in your projects? (usefulness)
• What else would you like to add to our discussion?
• Do you have any advice for us? (feedback)
5. Make a summary and adjust the rating scale.
The Wideband Delphi is a consensus-based technique for estimating effort, comprising
the following steps [23]:
1. Coordinator presents each expert with a specification and an estimation form.
2. Coordinator calls a group meeting in which the experts discuss estimation issues with
the coordinator and each other.
3. Experts fill out forms anonymously.
4. Coordinator prepares and distributes a summary of the estimates.
5. Coordinator calls a group meeting, specifically focusing on having the experts discuss
points where their estimates vary widely.
6. Experts fill out forms, again anonymously, and steps 4 to 6 are iterated for as many
rounds as appropriate.
The following steps were planned for the Wideband Delphi session, aimed at estimating
effort ranges according to the rating scale degrees:
1. Provide an overview of the objectives of the study and provide information on how par-
ticipants should act during the session. Guarantee the confidentiality and anonymity
of the discussions.
2. Present the adjusted rating scale table. Describe practices and levels of practices usage.
3. Participants estimate the range of effort required to perform the practices of the last
step in the rating scale.
4. Consolidate results of the participants and present to discussion.
5. Repeat steps 3 and 4 until consensus is reached (an acceptable coefficient of variation; see the sketch below).
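As an illustration of the stopping criterion, the coefficient of variation of a round's estimates could be computed as below; the threshold and the sample estimates are hypothetical values, not data from the study.

```python
import statistics

def coefficient_of_variation(estimates):
    """CV of one round of expert estimates; lower values indicate convergence."""
    return statistics.stdev(estimates) / statistics.mean(estimates)

# Hypothetical productivity-range estimates from one Delphi round:
round_1 = [2.0, 3.0, 2.5, 4.0, 2.8]
cv = coefficient_of_variation(round_1)
print(round(cv, 2), cv <= 0.20)  # 0.26 False -> another round would be needed
```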
Step 6, Sampling and data collection, involves using the scale to collect data. The secure
software development level is one among other variables to be collected for a set of software
development projects. The sample is composed of companies that are interested in the construction of the model.
The step Extraction of factors, which assesses the optimal number of factors to be drawn from the list of items, is not planned to be executed because the proposed scale is composed of one aggregated item.
3.3 Phase III - Research Design for the Secure Software
Development Cost Model
The third and last phase addressed research question 2 by establishing a parametric model
to explain the increase in development effort caused by specific degrees of secure software
development.
Data was collected from experts and software projects and analyzed to quantify the
factors that affect software development effort, including Security.
The model followed previous works on the COCOMO II statistical model-building process
[33, 24]. The modeling methodology comprises seven steps. The first four steps overlap
somewhat with the rating scale development process. Figure 3.2 depicts the process, adapted
from Boehm et al. [24].
[Figure 3.2 Modeling Methodology: 1. Analyze Existing Literature; 2. Perform Behavioral Analysis; 3. Determine Form of the Model, Identify Relative Significance of Parameters; 4. Perform Expert-Judgement, Delphi Assessment; 5. Gather Project Data; 6. Build and Validate Model; 7. Gather More Data, Refine Model]
The next subsections describe how each step is planned in order to answer research question 2.
3.3.1 Analyze Existing Literature
The existing literature around costing secure software was broadly analyzed in the systematic
mapping of literature (see Chapter 2). Existing cost models that address security were
surveyed, and sources of cost were extracted from the literature. The results of this review
revealed the issues with the existing models and provided insights on how to approach their
improvement by means of a new rating scale, further analysis, and empirical validation.
3.3.2 Perform Behavioral Analysis
A behavioral analysis was explored in the survey study with practitioners. The data gathered
from the participants, most of them security experts working on software projects, showed
that the introduction of security-related tasks in the software development process has a
significant impact on the overall effort. Furthermore, it was found that projects apply
security practices with different frequencies and amounts of time, suggesting that the effort may vary according to product characteristics, software lifecycle, and team structure.
3.3.3 Determine Form of Model, Identify Relative Significance of
Parameters
The original COCOMO II model has the following mathematical form [24]:
PM = A \times Size^{E} \times \prod_{i=1}^{n} EM_i \qquad (3.1)
where: A = Effort coefficient that can be calibrated
Size = the software size
EM = Effort Multipliers
E = Scaling Exponent for Effort
In equation 3.1, E is the exponent related to scale factors:
E = B + 0.01 \times \sum_{j=1}^{5} SF_j \qquad (3.2)
where: B = Scaling base-exponent for Effort that can be calibrated
SF = Scale Factors
A Required Software Security parameter (SECU) was introduced in the model, modifying
the original equation 3.1 to:
PM = A \times Size^{E} \times SECU \times \prod_{i=1}^{n} EM_i \qquad (3.3)
where: SECU = Effort Multiplier for Secure Software Development Level
This step also involves the determination of the rating scale and the definition of the meaning of each point in the scale, according to the project tasks. This was done in steps 3 and 4 of the scale development, as described in Phase II of the research (Section 3.2).
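To make equation 3.3 concrete, the sketch below computes the effort for a hypothetical project with and without a security multiplier. The constants A and B are the published COCOMO II.2000 calibration values, but every other input (size, scale factor ratings, effort multipliers, and the SECU value of 1.28) is a placeholder assumption used only for illustration.

def cocomo_effort(size_ksloc, A, B, scale_factors, effort_multipliers, secu=1.0):
    # Person-months per equation 3.3: PM = A * Size^E * SECU * prod(EM_i),
    # with E = B + 0.01 * sum(SF_j) as in equation 3.2.
    E = B + 0.01 * sum(scale_factors)
    pm = A * (size_ksloc ** E) * secu
    for em in effort_multipliers:
        pm *= em
    return pm

# Hypothetical 50-KSLOC project; scale factors and effort multipliers are placeholders.
baseline = cocomo_effort(50, A=2.94, B=0.91,
                         scale_factors=[3.72] * 5,       # placeholder SF ratings
                         effort_multipliers=[1.0] * 16,  # all EMs at Nominal
                         secu=1.00)
secured = cocomo_effort(50, A=2.94, B=0.91,
                        scale_factors=[3.72] * 5,
                        effort_multipliers=[1.0] * 16,
                        secu=1.28)                       # hypothetical SECU rating

print(f"Nominal security: {baseline:.1f} PM")
print(f"SECU = 1.28:      {secured:.1f} PM ({secured / baseline - 1:.0%} more effort)")

Because SECU enters the model as a pure multiplier, the relative increase in effort equals the SECU value minus one, regardless of the other parameter settings.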
3.3.4 Perform Expert Judgement, Delphi Assessment
Two expert panels were conducted. One Delphi session was held to test the security scale and collect initial estimates from cost model experts, as reported in step 5 (Pre-testing scale) of Phase II. A second Delphi was conducted online with security experts who work in software development.
For the second Delphi panel, the sample was composed of participants of the survey study (Phase I) who indicated interest in the follow-up of the research (82 participants).
[Figure 3.3 The Online Delphi Process: the facilitator requests estimates from the experts; the experts submit their estimates; the facilitator sends back a summary of the compiled results, clarifies assumptions, and adjusts questions; after the final round, the facilitator reports the results.]
Given the geographically diverse localization of the invitees and current travel restrictions,
this data collection was conducted as an online Delphi study (e-Delphi) [43, 66, 88].
The process was planned and executed as illustrated in Figure 3.3. Initially, the facilitator
sent an invitation request to the security experts, providing an overview of the study and a
presentation of the rating scale. Next, the participants provided their initial estimates and
comments for the rating scale by filling an online form. After the established deadline for
round 1, the researchers consolidated and discussed the results, prepared and sent back a
customized report for each participant. The report contained a summary of the compiled
results, with histograms showing the answers’ distribution and the participant’s estimates,
besides elucidation of common questions and assumptions made by other experts. In round
2, participants were invited again to discuss the results and re-estimate the effort range,
based on the feedback from round 1. Once again, the results were compiled and sent
back to all participants.
In each round the participants were asked to provide a productivity range for the three
groups of the scale - Requirements & Design, Coding & Tools, and Verification & Validation.
The productivity range is defined as the ratio between the highest level (Ultra High) and
the lowest level of the scale (Nominal). For example, a productivity range of 3 means that
to apply the practices at the Ultra High level, it would take 3 times the effort of the Nominal level (no explicit security-related activity), an increase of 200% in effort.
After round 1, four new questions were added for the participants to provide an estimate
of the percentage of increase in the application size (function points, requirements, lines of
code, etc) to develop the security controls.
Participants were also asked to consider a set of assumptions during estimation:
• Assume that there is a security team supporting the developers and performing spe-
cialized tasks (e.g. penetration testing, managing external penetration tests, etc).
• Assume a hybrid security practices approach is taken (manual and automated).
• Assume that other aspects that affect the productivity range (team experience, product
complexity, reuse, etc) will be captured by factors other than the security scale.
• Assume that the productivity range does not include initial setup effort for security
tools and infrastructure.
• Assume that the security measures are applied at the same level across the groups.
• Assume that the productivity range includes *only* the security practices applied (code
review, penetration testing, threat modeling, etc), excluding the development of the
security features.
The resulting estimate for the range of the security factor obtained with the Delphi
sessions was then used to set the initial values for the rating scale.
Considering that security costs tend to grow exponentially as the security level is increased (as shown by the decomposed production function in Section 1.2), the scale point values were calculated by the following equations:
EM_i = (1 + r)^{i} \qquad (3.4)

r = range^{(1/n)} - 1 \qquad (3.5)

where:
EM_i is the effort multiplier for level i
i is the security level (Nominal is 0)
r is the growth rate
range is the range of the effort multiplier
n is the index of the highest level (n = 4)
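A minimal sketch of equations 3.4 and 3.5 in Python is shown below; it distributes an overall productivity range across the five scale levels with exponential growth. The range value of 2.70 is only an example input; for that input the function yields multipliers of roughly 1.00, 1.28, 1.64, 2.11, and 2.70.

def effort_multipliers(prod_range, n_levels=5):
    # EM_i = (1 + r)^i for i = 0..n_levels-1, with r = range^(1/n) - 1.
    n = n_levels - 1                       # index of the highest level (n = 4)
    r = prod_range ** (1.0 / n) - 1        # equation 3.5
    return [(1 + r) ** i for i in range(n_levels)]  # equation 3.4

levels = ["Nominal", "High", "Very High", "Extra High", "Ultra High"]
for level, em in zip(levels, effort_multipliers(2.70)):
    print(f"{level:<11} {em:.2f}")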
The forms prepared to collect data in the Delphi sessions are presented in Appendix A.
3.3.5 Gather Project Data
The instrument for collecting project data points was in the form of a spreadsheet that
was distributed to industry partners to provide the security ratings, software size, effort,
and schedule data. The instrument also collects the attributes that provide contextual
information about the project and ratings for other factors that drive effort besides security.
Questions about security defect removal were also included; if filled in, they could have contributed to analyzing the effectiveness of the investment in the security effort.
The instrument was reviewed by estimation experts before being sent out to participants.
The final version of the instrument prepared to collect project data is presented in Appendix
A.
3.3.6 Build and Validate Model
This is the step where the model is statistically built based on the expert estimations and
data collected in the previous steps. The general process for building the model is depicted
in figure 3.4, as described in [24].
It starts with the examination of the data collected in the previous steps to verify whether the observations were properly recorded and to identify possible subgroup behavior or other issues that could indicate a need to review the data collected or change the form of the model.
One perspective analyzed in the model building was the software domain. As indicated
by Menzies et al. [82], different software application domains appear to establish different
development modes, which may drive costs and schedule of projects differently. A study
[Figure 3.4 Statistical Model-Building Process (adapted from [24]), comprising the stages: formulate the problem, exploratory data analysis, fit the model, evaluate the fit, criticize the model, and use for prediction, grounded in security and software engineering theory.]
from Rosa et al. [109] showed that the introduction of the domain variable improves effort
prediction in early phase cost models. Unfortunately, the data points available for analysis
pertained only to the domain of Information Systems. Nonetheless, this was an aspect that
could be analyzed when comparing to previous security cost models.
Having formulated the problem, the data was then fitted into the model, generating a
calibration of the model parameters. The calibration involves (1) linearizing the equation
3.3 by taking logarithms on both sides of the equation; and (2) applying multiple linear
regression (MLR) to find the coefficients of the equation.
The multiple regression model for equation 3.3 can be expressed as:
PM = \beta_0 \times Size^{\,\beta_1 + 0.01 \sum_{j=1}^{5} \beta_{j+1} SF_j} \times SECU^{\beta_7} \times \prod_{i=1}^{n} EM_i^{\,\beta_{i+7}} \times \omega \qquad (3.6)

In equation 3.6, Size, SF_j, SECU, and EM_i are the values of the predictor (regressor) variables, \beta_0, \ldots, \beta_{n+7} are the coefficients to be estimated, and \omega is the log-normally distributed error term (with \ln(\omega) = \varepsilon).

Taking the logarithms on both sides of equation 3.6 produces the linear equation to apply in the MLR:

\ln(PM) = \beta_0 + \beta_1 \ln[Size] + \beta_2 SF_1 \ln[Size] + \ldots + \beta_6 SF_5 \ln[Size] + \beta_7 \ln[SECU] + \beta_8 \ln[EM_1] + \ldots + \beta_{n+7} \ln[EM_n] + \varepsilon \qquad (3.7)
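As an illustration of the calibration mechanics, and not of the actual dataset, the sketch below generates synthetic project data, linearizes the model as in equation 3.7, and estimates a reduced set of coefficients (intercept, size exponent, and the SECU exponent) with ordinary least squares. The real calibration includes all scale factors and effort multipliers and uses the collected project data.

import numpy as np

rng = np.random.default_rng(0)
n_projects = 30

# Synthetic data: the "true" model is PM = 2.94 * Size^1.10 * SECU with noise.
size = rng.uniform(10, 500, n_projects)                         # KSLOC
secu = rng.choice([1.00, 1.28, 1.64, 2.11, 2.70], n_projects)   # SECU ratings
pm = 2.94 * size ** 1.10 * secu * rng.lognormal(0.0, 0.2, n_projects)

# Design matrix for ln(PM) = b0 + b1*ln(Size) + b7*ln(SECU) + error
X = np.column_stack([np.ones(n_projects), np.log(size), np.log(secu)])
y = np.log(pm)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print("intercept (ln A):", coef[0], "-> A =", np.exp(coef[0]))
print("size exponent   :", coef[1])
print("SECU exponent   :", coef[2])  # near 1 means SECU acts as a pure multiplier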
After estimating the parameters of the model, the coefficients calculated need to be
examined to verify their magnitude (t-value) and to check if they indicate the expected
behavior (correct signs). The remediation may involve changes in the model parameters or
the dataset (removing outliers, for example).
Deriving \beta_7, the coefficient for the SECU predictor, was the main focus of this research. The value of \beta_7 indicates the degree of influence that the security parameter has on the projects' effort. A \beta_7 close to zero would indicate that SECU does not influence effort.
Besides that, the t-value (ratio between the coefficient estimated value and corresponding
standard error) indicates the statistical significance of the SECU parameter. The higher
the t-value, the stronger the statistical significance of the predictor variable.
Once the model was considered adequate for the data and parameters estimated, the
fit was evaluated to verify if it can generate good predictions involving different levels of
required security for the projects. The ‘goodness of fit’ of the model to the data in the MLR
was evaluated by the adjusted R-squared value and the standard error. To test the model
accuracy, the Validation Set approach and k-fold Cross Validation were employed and the
metrics of the mean magnitude of relative error (MMRE) and the percentage of predictions
within 25 percent of the actuals PRED(25) were calculated. The magnitude of relative error
(MRE) and PRED are given as:
PRED(25) = \frac{100}{N} \sum_{i=1}^{N} \begin{cases} 1, & \text{if } MRE_i \le 0.25 \\ 0, & \text{otherwise} \end{cases} \qquad (3.8)

MRE_i = \frac{\lvert y_i - \hat{y}_i \rvert}{y_i} \qquad (3.9)

where:
N is the number of projects
y_i is the actual effort of project i
\hat{y}_i is the predicted effort of project i
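The accuracy metrics can be computed directly from the actual and predicted efforts, as in the small sketch below; the effort values shown are hypothetical placeholders.

def mre(actual, predicted):
    # Magnitude of relative error (equation 3.9).
    return abs(actual - predicted) / actual

def mmre(actuals, predictions):
    # Mean magnitude of relative error across all projects.
    return sum(mre(a, p) for a, p in zip(actuals, predictions)) / len(actuals)

def pred(actuals, predictions, threshold=0.25):
    # PRED(25): percentage of predictions within 25% of the actuals (equation 3.8).
    hits = sum(1 for a, p in zip(actuals, predictions) if mre(a, p) <= threshold)
    return 100.0 * hits / len(actuals)

actual_pm    = [120, 85, 300, 44, 210]   # hypothetical actual efforts (PM)
predicted_pm = [130, 70, 280, 60, 205]   # hypothetical model predictions (PM)

print(f"MMRE     = {mmre(actual_pm, predicted_pm):.2f}")
print(f"PRED(25) = {pred(actual_pm, predicted_pm):.0f}%")

In the Validation Set and k-fold Cross Validation approaches, these same metrics are computed only on projects held out from the fitting step.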
3.3.7 Gather more data, Refine Model
The modeling methodology, as illustrated in Figure 3.2, is iterative. As new data becomes
available, the model can be continually refined by reviewing any of the previous steps of the
methodology.
3.4 Summary
This chapter described the methods that were applied to investigate and answer the research
questions. Research question 1 was answered by applying the scale development process
specified in Section 3.2. Research question 2 was addressed by the development of a statistical model that incorporates the security scale established by question 1. Data was collected from experts and industry projects, allowing initial estimates to be established and the results to be validated. The significance of the security parameter, the goodness of fit of the model, and its accuracy allow the model to be evaluated and research question 2 to be answered.
Chapter 4
Secure Software Development Scale
This chapter describes the results of the application of the scale development process de-
scribed in Chapter 3. The first section presents the generation of the initial version of the
scale, based on the security practices and respective levels description. The second section
shows an improved version of the scale, based on the feedback of the focus group and Delphi
sessions. The last section presents the initial estimates for the Required Software Security
multipliers, obtained with the estimates collected from the experts.
4.1 Development of the Rating Scale
The domain of the scale was identified with the systematic mapping of the literature, which
revealed that previous cost models for security effort used the Evaluation Assurance Levels
(EAL) from the Common Criteria (CC) standard as a rating scale. However, as observed in
Chapter 1, the EAL scale is not completely adherent to the factors that drive security efforts
in software development projects. No other scale to measure secure software development
was found.
The preliminary studies also provided the foundation to develop the items of the scale.
From the systematic literature review, the factors that affect costs of security in software
Table 4.1 Descriptors for Attribute Levels
N Level Modifier Descriptor
1 Extra Low Extremely (Easy, Little, Simple, Informal, Casual)
2 Very Low Very (Easy, Little, Simple, Informal, Casual)
3 Low Fairly (Easy, Little, Simple, Informal, Casual)
4 Nominal Average
5 High Fairly (Hard, Extensive, Difficult, Formal, Rigorous)
6 Very High Very (Hard, Extensive, Difficult, Formal, Rigorous)
7 Extra High Extremely (Hard, Extensive, Difficult, Formal, Rigorous)
development, specifically the security practices, were extracted as the items to include in the
scale. The items’ list initially comprised 13 practices, extracted from the literature [87, 126],
and one additional item to account for the security features.
The survey with software security experts tested the content validity of the selected items. The participants of the survey are potential end-users of the rating scale. In the questionnaire, the respondents indicated, for each item (security practice), the frequency of use and the effort required to apply it each time. Results of the survey showed that all 13 practices are frequently used by professionals.
The scale points for each item were described based on descriptors for
attribute levels of the COCOMO II model, combined with existing models that reflect the
maturity of security practices application. Cost drivers’ levels in COCOMO II range from
the Extra Low level to the Extra High level. Table 4.1 presents levels and descriptors used
to classify the presence and degree of an attribute in the software product to be developed.
BSIMM [84] and SAMM [97] models were used to help describe the increasing levels of
security practices’ application. It is important to note the distinct meanings of maturity of
practices and degree of use of the practices. For example, one organization may be mature
in some security practices, but it may apply these practices with different frequency and
rigor in its projects, according to the security risks involved. The description of practices in
increasing maturity levels, thus, cannot be directly mapped to the increasing degrees of usage
but can help understand how sophistication is introduced in the process as the security needs
to step up. As a result, each of the items in the rating scale is composed of a combination of
the COCOMO II attribute level descriptors (Table 4.1) and the description of the practices
in increasing levels of maturity.
Table 4.2 presents the list of items with their five scale points descriptions, supported by
BSIMM, SAMM, and the descriptors for attribute levels.
Table 4.2 Items Detailed Description
Practice Nominal High Very High Extra High Ultra High
Apply Secure Cod-
ing Standards
Ad-hoc secure
coding
Address com-
mon vulnera-
bilities
Address common
and off-nominal
vulnerabilities
Address all vulner-
abilities and some
weakness
Coding to address
all known vulnera-
bilities and weak-
nesses
Perform Security
Testing
Ad-hoc secu-
rity testing
Basic adversar-
ial testing
Moderate ad-
versarial testing
driven with secu-
rity requirements
and security fea-
tures.
Extensive ad-
versarial testing
driven by high
security risks.
Rigorous adver-
sarial testing
driven by security
risks and attack
patterns.
Apply Security
Tooling
Casual use
of standard
static analysis
tool to iden-
tify security
defects.
Basic use of
static analysis
tool to iden-
tify security
defects.
Routine use of
static analysis and
penetration testing
tools to identify
security defects.
Extensive use of
static analysis,
penetration testing
and black-box
security testing
tools.
Rigorous use of
static analysis,
penetration testing
and black-box se-
curity testing tools
with tailored rules.
Perform Security
Review
Ad-hoc secu-
rity features
code review.
Basic security
features code
review.
Moderate security
code review.
Systematic exten-
sive security code
and design review.
Systematic rigor-
ous security code
and design review.
Track Vulnerabili-
ties (development
time)
Ad-hoc vulner-
abilities track-
ing and fixing.
Regular vul-
nerabilities
tracking and
fixing.
Systematic vulner-
abilities tracking
and fixing.
Extensive vulner-
abilities tracking
and fixing.
Rigorous vulner-
abilities tracking
and fixing.
Apply Security Re-
quirements
Ad-hoc secu-
rity require-
ments.
Basic security
requirements
derived from
business func-
tionality.
Moderate security
requirements de-
rived from business
functionality and
compliance drivers.
Complex security
requirements de-
rived from business
functionality, com-
pliance drivers and
known risks.
Extreme secu-
rity requirements
derived from
business function-
ality, compliance
drivers and ap-
plication/domain
specific security
risks.
Improve Soft-
ware Development
Process
Occasional
improvements
driven by secu-
rity incidents.
Regular im-
provements
driven by vul-
nerabilities
resolution.
Systematic im-
provements driven
by vulnerabilities
resolution.
Systematic and
frequent improve-
ments driven by
organizational
security knowledge
base.
Systematic and
rigorous improve-
ments driven by
security science
team.
Perform Penetra-
tion Testing
Ad-hoc pene-
tration testing.
Basic pene-
tration testing
addressing
common vul-
nerabilities
(for sanity
check before
shipping).
Routine penetra-
tion testing (each
release) addressing
common and criti-
cal vulnerabilities.
Frequent pene-
tration testing
(each increment)
based on project
artifacts.
Deep-dive analysis
and maximal pene-
tration testing.
Document Techni-
cal Stack
None Basic technical
stack docu-
mentation.
Moderate tech-
nical stack doc-
umentation with
explicit third-
party components
identification.
Detailed technical
stack documenta-
tion with third-
party components
identified and
assessed based on
security risks.
Exceptionaltechni-
cal stack documen-
tation with third-
party components
identified and rig-
orously assessed by
a security science
team.
Apply Threat
Modeling
None Ad-hoc threat
modeling.
Apply threat mod-
eling with generic
attacker profiles.
Apply threat mod-
eling with specific
attackers informa-
tion.
Apply threat mod-
eling using new at-
tack methods de-
veloped with a sci-
ence team.
Apply Data Classi-
fication Scheme
None Simple data
classification
scheme.
Moderate data
classification
scheme.
Complete data
classification
scheme.
Maximal data clas-
sification scheme.
Perform Security
Training
None Security aware-
ness training is
performed.
Security on-
demand training
and advanced-role
specific training
are performed.
Security central-
ized reporting
knowledge is used.
Material specific to
company history
is used in train-
ing. Vendors and
outsourced work-
ers are trained.
Annual train-
ing required for
everyone.
Progression on
security train-
ing curriculum is
rewarded.
Publish Operations
Guide
None Regular oper-
ations guide
with critical
security in-
structions for
deployment.
Moderate opera-
tions guide with
critical security
instructions and
procedures for
typical application
alerts.
Thorough opera-
tions guide with
with detailed se-
curity instructions
and, procedures
for all application
alerts.
Very thorough op-
erations guide with
with maximal se-
curity instructions
and, procedures
for all application
alerts.
Build Security Fea-
tures
None. Build basic se-
curity features
(authenti-
cation, role
management).
Build additional
security features
(authentication,
role management,
key management,
audit/log, cryptog-
raphy, protocols).
Build secure-by-
design middle-
ware for security
features (authen-
tication, role
management,
key management,
audit/log, cryptog-
raphy, protocols).
Build container-
based approaches
for security fea-
tures (authen-
tication, role
management,
key management,
audit/log, cryptog-
raphy, protocols).
For the rating scale to comply with the requirement of being compact, the list and
description of practices per level were progressively summarized as follows:
1. Practices Selection and Grouping: selection of the most relevant practices (items) and
grouping them into categories.
2. Practices Summarizing: aggregation of the scale points specification of each group in
one individual group description.
3. One-Line Summary: aggregation of the groups’ descriptions as a one-line description.
[Figure 4.1 Security Practices Summarizing Steps: the figure illustrates the three stages (practices consolidation, practices grouping, and practices summarization) that transform the detailed per-practice descriptions into grouped descriptions for Requirements and Design, Coding and Tools, and Verification and Validation, and finally into the one-line summary, with illustrative effort multipliers per group (Requirements and Design: 1 to 1.7; Coding and Tools: 1 to 1.6; Verification and Validation: 1 to 2).]
Figure 4.1 illustrates the progressive summarizing process.
The items defined in Table 4.2 were pruned by applying the criteria previously established,
leading to the removal of the following practices:
• Apply data classification scheme.
• Improve development process.
• Perform security training.
• Publish operations guide.
• Track vulnerabilities.
The nine remaining practices were grouped into three categories, namely, (1) Security Requirements and Design, (2) Secure Coding and Security Tools, and (3) Security Verification
and Validation, resulting in Table 4.3 (Practices Grouping).
Next, the descriptions of individual practices were summarized for each group, and then
the groups were summarized as a one-line summary, resulting in the scale presented in Table
4.4 (Practices Summarization and One-line Summary).
Table 4.3 Practices Grouped
Practice Nominal High Very High Extra High Ultra High
Requirements and Design
Apply Security Re-
quirements
No concern
with security
requirements.
Basic security
requirements
derived from
business func-
tionality.
Moderate security
requirements de-
rived from business
functionality and
compliance drivers.
Complex security
requirements de-
rived from business
functionality, com-
pliance drivers and
known risks.
Extreme secu-
rity requirements
derived from
business function-
ality, compliance
drivers and ap-
plication/domain
specific security
risks.
Security Features None. Build basic se-
curity features
(authenti-
cation, role
management).
Build additional
security features
(authentication,
role management,
key management,
audit/log, cryptog-
raphy, protocols).
Build secure-by-
design middle-
ware for security
features (authen-
tication, role
management,
key management,
audit/log, cryptog-
raphy, protocols).
Build container-
based approaches
for security fea-
tures (authen-
tication, role
management,
key management,
audit/log, cryptog-
raphy, protocols).
Apply Threat
Modeling
None. Basic threat
modeling.
Apply threat mod-
eling with generic
attacker profiles.
Apply threat mod-
eling with specific
attackers informa-
tion.
Apply threat mod-
eling using new at-
tack methods de-
veloped with a sci-
ence team.
Control security of
third-party compo-
nents
None Basic technical
stack docu-
mentation.
Moderate tech-
nical stack doc-
umentation with
explicit third-
party components
identification.
Detailed technical
stack documenta-
tion with third-
party components
identified and
assessed based on
security risks.
Exceptionaltechni-
cal stack documen-
tation with third-
party components
identified and rig-
orously assessed by
a security science
team.
Coding and Tools
Apply Secure Cod-
ing Standards
No secure cod-
ing standard
used.
Address com-
mon vulnera-
bilities
Address common
and off-nominal
vulnerabilities
Address all vulner-
abilities and some
weakness
Coding to address
all known vulnera-
bilities and weak-
nesses
Apply Security
Tooling
No use of
static analysis
tool to iden-
tify security
defects.
Basic use of
static analysis
tool to iden-
tify security
defects.
Routine use of
static analysis and
penetration testing
tools to identify
security defects.
Extensive use of
static analysis,
penetration testing
and black-box
security testing
tools.
Rigorous use of
static analysis,
penetration testing
and black-box se-
curity testing tools
with tailored rules.
Verification and Validation
Perform Security
Testing
No security
testing
Basic adversar-
ial testing
Moderate ad-
versarial testing
driven with secu-
rity requirements
and security fea-
tures.
Extensive ad-
versarial testing
driven by high
security risks.
Rigorous adver-
sarial testing
driven by security
risks and attack
patterns.
Perform Security
Review
No security
code review.
Basic security
features code
review.
Moderate security
code review.
Systematic exten-
sive security code
and design review.
Systematic rigor-
ous security code
and design review.
Perform Penetra-
tion Testing
No penetration
testing.
Basic pene-
tration testing
addressing
common vul-
nerabilities
(for sanity
check before
shipping).
Routine pene-
tration testing
(each release) ad-
dressing common
and off-nominal
vulnerabilities.
Frequent pene-
tration testing
(each increment)
based on project
artifacts.
Deep-dive analysis
and maximal pene-
tration testing.
Table 4.4 Practices Summarized
Nominal High Very High Extra High Ultra High
Requirements and Design
Security require-
ments considered
part of reliability.
Basic security
requirements and
features. Basic
threat modeling.
Moderate security re-
quirements and addi-
tional security features
(audit/log, cryptogra-
phy). Moderate threat
modeling.
Complex security re-
quirements, advanced
secure-by-design
security features mid-
dleware development.
Threat modeling with
specific attackers
information.
Extreme security re-
quirements, container-
based approaches for
advanced security
features develop-
ment. Rigorous threat
modeling.
Coding and Tools
No secure coding
and no use of static
analysis tool.
Basic vulnerabili-
ties applicable to
the software will
be prevented with
secure coding stan-
dards or detected
through basic use
of static analysis
tools.
Known and critical
vulnerabilities appli-
cable to the software
will be prevented
with secure coding
standards or detected
through routine use
of static analysis and
penetration testing
tool.
Extensive list of vul-
nerabilities and weak-
nesses applicableto the
software will be pre-
ventedwithsecurecod-
ing standards or de-
tected through exten-
sive use of static analy-
sis, black-box and pen-
etration testing tools.
Very extensive list
of vulnerabilities and
weaknesses applicable
to the software will
be prevented with se-
cure coding standards
or detected through
rigorous use of static
analysis, penetration
testing and black-box
security testing tools
with tailored rules.
Verification and Validation
Nosecuritytesting,
review or penetra-
tion testing.
Basic adversarial
testing and secu-
rity code review.
Basic penetration
testing before
shipping.
Moderate adversarial
testing and security
code review. Routine
penetration testing
each release.
Extensive adversarial
testing and security
design/code review.
Frequent and spe-
cialized penetration
testing each increment.
Rigorous adversarial
testing and security
design/code review.
Exhaustive deep-dive
analysis penetration
testing.
One-Line Summary
Security-related
activities for
requirements, cod-
ing, and testing
considered part of
reliability.
Basic security-
related activities
for requirements,
coding, and test-
ing. Typical
security functional
features. Regular
use of static analy-
sis tools to detect
security defects.
Moderate security-
related activities for
requirements, design,
coding, and testing.
Additional security
features (audit/log,
cryptography). Identi-
fication and controlled
update of third-party
components’ security
patches. Routine use
of static analysis and
penetration testing
tools.
Complex security re-
quirements and threat
modeling. Advanced
secure-by-design
security features. Ex-
tensive adversarial
testing and security
design/code review.
Security assessment
of third-party com-
ponents and timely
security patches up-
dates. Thorough use of
statics analysis, black-
box, and penetration
testing tools.
Extreme security re-
quirements and threat
modeling. Container-
based approaches to
advanced security
features. Exhaustive
adversarial testing,
security design/code
review, and deep-dive
analysis penetration
testing. third-party
components rigorously
assessed and updated
by a security science
team. Maximal use
of tools for static
analysis, penetration
testing, and black-box
security testing.
4.2 Evaluation of the Rating Scale
A focus group and a Wideband Delphi session were held in February 2020 as part of the
CSSE annual research review event to validate the contents of the rating scale and to collect
initial expert estimates for the developed scale.
4.2.1 Focus Group
Six attendees participated in the focus group: two of them were graduate students, one was a postdoc researcher, and the other three were estimation experts with more than ten years
of experience. Table 4.4 was presented and handed out to the participants. They were
asked to evaluate the scale regarding the aspects of understanding, completeness, coherence,
and usefulness. Any other feedback was also welcome. The discussion took two hours and
generated the following improvements for the scale:
Changes made for the Security Requirements and Design group:
• The description of the first level of security requirements and design was defined as
None, considering that there are projects that do not consider security as an explicit
requirement.
Changes made for the Security Coding and Tools group:
• The Penetration testing activity was removed from Apply Security Tooling and kept only in the Verification & Validation group.
• The application of formal methods in coding was added for the Ultra-High level de-
scription.
• The development and use of specialized in-house tools was added for the Ultra-High
level description of the group.
Changes made for the Security Verification and Validation (V&V) group:
• A new item was added to represent the degree of independence of the V&V activities. A description was added for each scale point. For example, at the High level, V&V activities are performed within the development group, whereas at the Ultra-High level, such activities are performed by a certified outside company.
• Formal verification was added for the Ultra-High level of the perform security review
item.
• The use of custom developed tools to perform penetration testing was added for the
Ultra-High level.
The modifications were also reflected in the one-line summary of the scale. The final result
is presented in Table 4.5.
Overall, the participants considered the scale easy to understand and ready to be applied.
After the adjustments were made, on the following day, a Wideband Delphi session was
conducted for collecting estimates for the productivity ranges of the rating scale.
4.2.2 Wideband Delphi
Five participants attended the Wideband Delphi session - two graduate students and one
postdoc researcher with some or moderate experience in effort estimation, and two experts
with more than ten years of experience. Regarding the experience with secure software
development, one participant had ten years and the other participants had one or two years
of experience.
The new version of the rating scale was again presented and explained to the attendees.
An online form was set up and made available for the participants to anonymously enter the
productivity range of the scale for each one of the three activity groups, namely Security
Table 4.5 Scale Version After the Focus Group
Nominal High Very High Extra High Ultra High
Requirements and Design
None Basic security
requirements and
features. Basic
threat modeling.
Moderate security re-
quirements and addi-
tional security features
(audit/log, cryptogra-
phy). Moderate threat
modeling.
Complex security re-
quirements, advanced
secure-by-design
security features mid-
dleware development.
Threat modeling with
specific attackers’
information.
Extreme security require-
ments, container-based
approaches for advanced
security features devel-
opment. Rigorous threat
modeling.
Coding and Tools
No secure coding
and no use of static
analysis tool.
Basic vulnerabil-
ities applicable
to the software
will be prevented
with secure coding
standards and/or
detected through
basic use of static
analysis tools.
Known and critical
vulnerabilities appli-
cable to the software
will be prevented with
secure coding stan-
dards and/or detected
through routine use of
static analysis tools.
Extensive list of vul-
nerabilities and weak-
nesses applicableto the
software will be pre-
ventedwithsecurecod-
ing standards and/or
detected through ex-
tensive use of static
analysis and black-box
tools.
Very extensive list of vul-
nerabilities and weaknesses
applicable to the software
will be prevented with
secure coding standards
and/or detected through
rigorous use of static anal-
ysis and black-box secu-
rity testing tools with tai-
lored rules. Employ formal
methods in coding.
Verification and Validation
None Basic adversarial
testing and secu-
rity code review.
Basic penetration
testing. Security
V&V activities
conducted within
the project.
Moderate adversarial
testing and security
code review. Routine
penetration testing.
Security V&V activi-
ties conducted by an
independent group.
Extensive adversarial
testing and security
design/code review.
Frequent and spe-
cialized penetration
testing. Security V&V
activities conducted by
an independent group
at the organizational
level.
Rigorous adversarial
testing and security
design/code review. Ex-
haustive deep-dive analysis
penetration testing. Use
of formal verification and
custom developed V&V
tools. Security V&V
activities conducted by an
outside certified company.
One-Line Summary
Security-related
activities for
requirements, cod-
ing, and testing
nonexistent
Basic security-
related activities
for requirements,
coding, and test-
ing. Typical
security functional
features. Regular
use of static analy-
sis tools to detect
security defects
within the project.
Moderate security-
related activities for
requirements, design,
coding, and testing.
Additional security
features (audit/log,
cryptography). Identi-
fication and controlled
update of third-party
components’ security
patches. Routine use
of static analysis and
penetration testing
tools. Security V&V
activities conducted by
an independent group.
Complex security re-
quirements and threat
modeling. Advanced
secure-by-design
security features. Ex-
tensive adversarial
testing and security
design/code review.
Security assessment
of third-party com-
ponents and timely
security patches up-
dates. Thorough use of
statics analysis, black-
box, and penetration
testing tools. V&V
activities conducted by
an independent group
at the organization
level.
Extreme security require-
ments and threat model-
ing. Container-based ap-
proaches to advanced se-
curity features. Exhaus-
tive adversarial testing, se-
curity design/code review,
deep-dive analysis penetra-
tion testing, and use of for-
mal methods throughout
the lifecyle. Third-party
components rigorously as-
sessedandupdatedbyase-
curity science team. Max-
imal use of tools for static
analysis, penetration test-
ing, and black-box secu-
rity testing. Use of formal
verification and custom de-
veloped V&V tools. Se-
curity V&V activities con-
ducted by an outside certi-
fied company.
Requirements and Design, Security Coding and Tools, and Security Verification & Validation.
The summary results of the first round are presented in Table 4.6.
Table 4.6 Productivity Ranges for the First Round of the Wideband Delphi
Group Average Standard Deviation Coefficient of Variation
(AVG) (SD) (SD/AVG)
Requirements and Design 1.58 0.642 41%
Coding and Tools 1.68 0.835 50%
Verification & Validation 2.38 0.669 28%
Productivity Range 6.317
The results of the first round presented a high level of variance, especially for the Security
Coding and Tools group. The productivity range, which is computed as the multiplication
of the ranges estimated for each group, was calculated as 6.317. This would mean that the
total effort for a project with an Ultra-High level of security would be more than six times
the effort estimated for the development.
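For reference, the sketch below shows how the statistics in Table 4.6 are derived from a set of individual estimates: the coefficient of variation is the standard deviation divided by the average for each group, and the overall productivity range is the product of the group averages. The per-participant numbers in the sketch are hypothetical; only the form of the computation reflects the session.

from statistics import mean, stdev

round1 = {  # hypothetical raw estimates from the five participants
    "Requirements and Design":   [1.2, 1.1, 2.5, 1.3, 1.8],
    "Coding and Tools":          [1.1, 1.2, 3.0, 1.5, 1.6],
    "Verification & Validation": [1.8, 2.0, 3.5, 2.2, 2.4],
}

prod_range = 1.0
for group, estimates in round1.items():
    avg, sd = mean(estimates), stdev(estimates)
    print(f"{group:<27} AVG={avg:.2f}  SD={sd:.3f}  CV={sd / avg:.0%}")
    prod_range *= avg  # combined range is the product of the group averages

print(f"Productivity Range = {prod_range:.3f}")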
The summarized results of Table 4.6 were presented to the participants and the effort
required for the activities was discussed, aiming to promote a common understanding. Then
the participants anonymously filled the form in a second round of the Wideband Delphi
technique. The new results are presented in Table 4.7.
Table 4.7 Productivity Ranges for the Second Round of the Wideband Delphi
Group Average Standard Deviation Coefficient of Variation
(AVG) (SD) (SD/AVG)
Requirements and Design 1.18 0.045 04%
Coding and Tools 1.18 0.130 11%
Verification & Validation 1.56 0.152 10%
Productivity Range 2.172
The variation of the estimates in the second round was significantly smaller, and no further estimation round was considered necessary. The final productivity range obtained for the
security factor was 2.172, which means that a project will double its effort to perform the
Ultra-High level of security.
4.2.3 Online Delphi
The first round of the online Delphi received responses from 17 security experts during ten
days. In round 2, which lasted 12 days, 14 out of the 17 participants responded with new
estimates. The participants reported an average of 10.88 years of experience in secure software development and an average of 11.05 years in effort estimation.
After the first round, some of the assumptions and comments made by the participants
were compiled and reported back to be considered for the re-estimation on the second round.
The assumptions were:
• Assume that there is a security team supporting the developers and performing spe-
cialized tasks (e.g. penetration testing, managing external penetration tests, etc).
• Assume a hybrid security practices approach is taken (manual and automated).
• Assume that other aspects that affect the productivity range (team experience, product
complexity, reuse, etc) will be captured by factors other than the security scale.
• Assume that the productivity range does not include initial setup effort for security
tools and infrastructure.
• Assume that the security measures are applied at the same level across the groups.
• Assume that the productivity range includes *only* the security practices applied (code
review, penetration testing, threat modeling, etc), excluding the development of the
security features.
Table 4.8 presents the summary results of the range estimates for each of the three security practice groups. Comparing Round 1 and Round 2, we can observe that the average and median productivity range for all groups were considerably lower in the second estimation. The coefficients of variation (CV) decreased in Round 2 for Secure Coding and Tools and for Security V&V, though they are still high.
Figure 4.2 presents the distribution of the estimates for the Requirements and Design
group for Round 2. The distribution is right-skewed, with estimates concentrated between
Table 4.8 Productivity Range Statistics in Round 1 and Round 2 of the Online
Delphi
Round 1 Round 2
Req & Des Coding & Tools Ver & Val Req & Des Coding & Tools Ver & Val
Average 2.276 2.709 5.447 1.957 2.046 2.561
Median 2 2.3 3 1.5 1.4 1.75
SD 1.043 1.997 11.528 1.093 1.193 2.335
CV 46% 74% 212% 56% 58% 91%
Figure 4.2 Productivity Range Distribution for the Requirements and Design group
1 and 2.05. The range is between 1 and 4.5, with a median of 1.5.
For the Coding and Tools group, the distribution of the range estimates on the second
round is also skewed right, with seven estimates between 1 and 1.4, as shown in Figure 4.3.
The median is 1.4 and the range is between 1 and 5.
The distribution of the range estimates for the Verification and Validation, illustrated in
Figure 4.4, is strongly right-skewed, with a median of 1.75. The range is between 1 and 10.
By multiplying the averages of the three groups, the productivity range for the security
scale is calculated as 10.256. However, considering the skewness of the distributions, it seems
more appropriate to use the median to obtain the productivity range, which in this case is
3.675 (1.5 * 1.4 * 1.75). This means that a project at level Ultra High would require 3.675
Figure 4.3 Productivity Range Distribution for the Coding and Tools group
Figure 4.4 Productivity Range Distribution for the V&V group
Table 4.9 Summary Results for Increase in Application Size
High Very High Extra High Ultra High
Min 1.05 1.1 1.15 1.2
Max 1.5 2.35 3.25 4.2
Average 1.170 1.393 1.668 1.914
Median 1.1 1.25 1.5 1.675
SD 0.125 0.366 0.590 0.839
CV 11% 26% 35% 44%
times the effort of a project at level Nominal.
In the second round of the Delphi, four new questions were introduced to ask the partici-
pants about the increase in the application size (function points, requirements, lines of code,
etc) to develop the security features. Figure 4.5 presents the distributions of the estimates
for the four levels of the scale and Table 4.9 presents the summary results for the estimates
about the increase in application size.
For the size increment, there was more agreement among the participants for the initial levels of the scale, as can be observed from the coefficients of variation in Table 4.9. The ranges for the increase-in-size estimates were smaller than the ranges for the productivity range.
For these results, considering the median, an application developed applying the practices at the High level will have an increase of 10% over the original application size due to the development of security features, whereas an application at the Ultra High level will have an increase of 67%.

Figure 4.5 Distribution of the Estimates for the Increase in Application Size (panels: High, Very High, Super High, Ultra High; Increase Factor vs. Number of Participants)
4.3 Initial Estimates for the Scale
The resulting productivity range for the Wideband Delphi session was 2.172. It was calculated as the average of the five participants' estimates and was almost normally distributed. The productivity range for the online Delphi was 3.675 and was based on the median, given the skewness of the responses' distributions.
The productivity range estimates from the second round of both Delphi sessions were then combined to produce one initial estimate for the scale, based on the median of the estimates for each group of security activities. The result is shown in Table 4.10.
Using an exponential growth curve to distribute the productivity range, the effort mul-
tipliers can be calculated for each level, as defined in Chapter 3. The resulting multipliers
for the 2.70 productivity range are presented in Figure 4.6.
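The calculation behind this distribution can be sketched in a few lines of Python. The sketch below is a minimal illustration, assuming four equal geometric steps from Nominal to Ultra High; it reproduces the multipliers shown in Figure 4.6.

# Minimal sketch: spread a productivity range geometrically across the scale levels.
levels = ["Nominal", "High", "Very High", "Extra High", "Ultra High"]
productivity_range = 2.70  # combined estimate from the Delphi sessions

steps = len(levels) - 1
multipliers = [round(productivity_range ** (i / steps), 2) for i in range(len(levels))]
print(dict(zip(levels, multipliers)))
# {'Nominal': 1.0, 'High': 1.28, 'Very High': 1.64, 'Extra High': 2.11, 'Ultra High': 2.7}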
Table 4.10 Productivity Range for the Delphi Sessions Combined
Group            Median   Average (AVG)   Standard Deviation (SD)   Coefficient of Variation (SD/AVG)
Req & Design     1.3      1.753           0.993                     57%
Coding & Tools   1.3      1.818           1.089                     60%
Ver & Val        1.6      2.297           2.037                     89%
Prod. Range      2.704
According to these estimates for the scale, a project applying security practices at the High level would require 28% additional effort, while a project applying security practices at the Ultra High level would require 170% additional effort.
Figure 4.6 Effort Multipliers per Security Level Calculated from the Productivity Range (Nominal: 1.00, High: 1.28, Very High: 1.64, Extra High: 2.11, Ultra High: 2.70)
4.4 Summary
This chapter presented the scale developed to measure the level of secure software develop-
ment and the initial cost multipliers estimated for each degree of the scale. The scale items
were identified and validated based on the results of the systematic mapping of the literature
and the survey conducted with security experts. The scale points for each item were specified
based on the description of security activities present in the BSIMM and SAMM models,
after which the items were reduced and grouped to generate a parsimonious scale with the
most relevant items. Focus group, Wideband Delphi, and online Delphi sessions were conducted with estimation and security experts, producing the initial values of the effort multipliers for the security cost driver.
Chapter 5
A Statistical Cost Model for Secure
Software Development
This chapter presents the results for the statistical model building. It starts by describing
the dataset used and the collinearity test of the candidate predictors for the Multiple Linear
Regression (MLR). Next, the statistical model is defined and the Required Security (SECU)
predictor is presented. The following sections present the regression results, the model
validation, the resulting model, and the values obtained for the SECU multiplier. Finally,
an analysis of the security predictor for different system architectures is presented.
5.1 Data Set Description
The dataset used to calibrate the cost model contains 1,140 maintenance projects, executed
during 2019 and 2020, by two Brazilian companies. The first company is one of the largest
banking institutions in Brazil, and the other is one of the largest telecommunications com-
panies in Brazil.
The banking institution uses Waterfall as its main software development life cycle. It applies statistical process control to its software development processes and has about 20 years
of experience in measuring Function Points. The telecom company employs Agile methods
and is less experienced in measuring projects with Function Points.
All projects in this dataset use Function Points (FP) as the size metric. Because they are maintenance projects, the size is a composition of added, changed, and deleted data functions and transaction functions. The changed and deleted functions are deflated by a factor representing the impact of the change according to the type of the function.
Equations 5.1 to 5.3 define the composition of the size metric used for all observations in
the dataset:
\[ Size = FP_{DataFunctions} + FP_{TransactionFunctions} \tag{5.1} \]

\[ FP_{DataFunctions} = DF_{Added} + DF_{Modified} \times 0.4 + DF_{Deleted} \times 0.3 \tag{5.2} \]

\[ FP_{TransactionFunctions} = TF_{Added} + TF_{Modified} \times 0.6 + TF_{Deleted} \times 0.3 \tag{5.3} \]

where DF represents the Function Points for the Data Functions and TF represents the Function Points for the Transaction Functions.
Table 5.1 presents the summary statistics for the main variables of the dataset.
Table 5.1 Dataset Summary Statistics
Variable              Mean       SD         Median   Min    Max
Size (FP)             79.51      154.52     37.64    1.18   2806.99
Effort (h)            1,047.45   1,774.29   524.94   1.00   22,024.00
Effort (PM)           6.89       11.67      3.45     0.01   144.89
Productivity (h/FP)   16.98      31.77      12.80    0.25   680.00
The projects’ size ranges from 1 to 2807 Function Points, with an average of 80, a
median of 38, and a standard deviation of 155 Function Points. Figure 5.1 illustrates how
the distribution of the Function Points is strongly right-skewed.
Projects’ effort ranges from 0.01 Person-Month (PM) or 1 hour to 145 PM or 22,024
hours, with an average of 7 PM, a median of 3.45 PM, and a standard deviation of 12 PM.
The effort in hours is divided by 152 to convert it to PM. Figure 5.2 shows the distribution
of the effort for the dataset, also strongly right-skewed.
Figure 5.1 Function Points Distribution (histogram, bin size = 50 FP)
Figure 5.2 Effort Distribution (histogram, bin size = 152 hours, i.e., 1 person-month)
The scatter plot in Figure 5.3 shows the correlation between size and effort. Given the skewness of the variables, both axes are presented on a log10 scale. As can be observed, the telecom projects are smaller, and their size is less strongly correlated with effort than that of the banking projects. The distinct contexts of software process maturity and software development life cycles may contribute to the differences in the plot.
Figure 5.3 Correlation Between Size and Effort (log-log scatter plot of Function Points vs. Effort in Hours; domains: bank, telecommunication)
The cost drivers measured in the data set are presented in Table 5.2. They are a subset
of the cost drivers established by the COCOMO II model, with the addition of the Required
Software Security (SECU) cost driver, which is defined according to the scale developed in
the previous chapter.
Table 5.3 shows the summary statistics for the cost drivers rated in the dataset. In this dataset, the SECU cost driver covers only the Nominal and High levels (rated 3 and 4, respectively).
Table 5.2 Cost Drivers Used in the Dataset
Code   Name                            Description
SECU   Required Software Security      This driver captures the degree of application of software security practices.
FAIL   Impact of Software Failure      This is the measure of the extent to which the software must perform its intended function over a period of time.
CPLX   Product Complexity              The complexity of the product is characterized by the operations (control, computational, device-dependent, data management, and user interface management) performed.
PLAT   Platform Constraints            This driver captures the limitations placed on the platform's capacity, such as execution time, primary/secondary storage, communications bandwidth, battery power, and maybe others.
PVOL   Platform Volatility             The targeted platform may still be evolving while the software application is being developed. This driver captures the impact of the instability of the targeted platform, resulting in additional or increased work.
APEX   Application Domain Experience   This driver captures the average level of application domain experience of the project team developing the software system or subsystem.
LTEX   Language and Tool Experience    This is a measure of the level of programming language and software tool experience of the project team developing the software system or subsystem.
PLEX   Platform Experience             This driver recognizes the importance of understanding the target platform.
TOOL   Use of Software Tools           This driver rates the use of software development tools using three characteristics.
SITE   Multisite Development           The multisite development effects on effort are significant. Determining its cost driver rating involves the assessment of collocation and communication.
5.2 Collinearity Test Results
All predictors - the size and the cost driver variables - were tested for multicollinearity, as it can negatively affect the results of the regression. The correlogram obtained for the pairwise
correlation between the predictors is shown in Figure 5.4.
Table 5.3 Predictor Summary Statistics
Predictor         N      Mean    SD       Min    Median   Max
Function Points   1140   79.85   154.40   1.18   38.05    2806.99
SECU              1140   3.11    0.31     3.00   3.00     4.00
FAIL              1140   3.50    0.77     1.00   4.00     5.00
CPLX              1140   2.71    0.54     1.00   3.00     5.00
PLAT              1140   3.45    0.71     3.00   3.00     6.00
PVOL              1140   3.13    0.58     2.00   3.00     5.00
APEX              1140   2.64    0.72     1.00   3.00     4.00
LTEX              1140   3.67    0.70     2.00   4.00     5.00
PLEX              1140   3.67    0.71     1.00   4.00     5.00
TOOL              1140   2.76    0.65     1.00   3.00     5.00
SITE              1140   4.00    0.47     3.00   4.00     6.00

The pairs Language and Tools Experience (LTEX) and Platform Experience (PLEX), Language and Tools Experience (LTEX) and Application Domain Experience (APEX), and Application Domain Experience (APEX) and Platform Experience (PLEX) presented strong correlations, equal to or greater than 0.9. As recommended in the literature, the correlated cost drivers were aggregated into one predictor, named Experience (EXPE), by applying the geometric mean.
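The aggregation can be illustrated with a minimal sketch in Python; the rating values below are purely illustrative and not taken from the dataset.

# Sketch: aggregate the correlated experience ratings (APEX, LTEX, PLEX)
# into a single EXPE predictor using the geometric mean.
apex, ltex, plex = 3.0, 4.0, 4.0          # illustrative rating values
expe = (apex * ltex * plex) ** (1 / 3)    # geometric mean of the three drivers
print(round(expe, 2))                      # 3.63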
The new correlogram, with the EXPE predictor replacing LTEX, PLEX, and APEX, is shown in Figure 5.5. This set of predictors has only weak correlations and was therefore used for the model building.
For the Required Software Security (SECU) cost driver, no significant association with
other variables was found. The maximum correlation value was 0.154, with the Platform
Constraints (PLAT) predictor, as illustrated in Figure 5.6.
Figure 5.4 Pairwise Correlation for the Predictors

5.3 Model Definition

Based on the dataset analysis, the model equation was defined as:
\[ Effort = A \times Size^{E} \times SECU \times FAIL \times CPLX \times PLAT \times PVOL \times EXPE \times TOOL \times SITE \tag{5.4} \]

where:
A = effort coefficient
Size = the software size in Function Points
E = scaling exponent
SECU = Required Software Security predictor
FAIL = Impact of Failure predictor
CPLX = Product Complexity predictor
PLAT = Platform Constraints predictor
PVOL = Platform Volatility predictor
EXPE = Experience predictor (aggregate of APEX, LTEX, and PLEX)
TOOL = Use of Software Tools predictor
SITE = Multisite Development predictor
Figure 5.5 Pairwise Correlation for the Predictors with the new aggregate cost driver EXPE

Figure 5.6 Correlation Between SECU and Other Predictors
The multiple regression model for Equation 5.4 can be expressed as:

\[ \widehat{Effort} = \beta_0 \times Size^{\beta_1} \times SECU^{\beta_2} \times FAIL^{\beta_3} \times CPLX^{\beta_4} \times PLAT^{\beta_5} \times PVOL^{\beta_6} \times EXPE^{\beta_7} \times TOOL^{\beta_8} \times SITE^{\beta_9} \times \omega \tag{5.5} \]

where:
β0 = coefficient to be estimated for the constant A
β1 = coefficient to be estimated for the exponent E
β2 to β9 = coefficients to be estimated for the cost drivers
ω = the log-normally distributed error term

The values of the new cost drivers are obtained by raising the initial values of the cost drivers to the respective coefficients β2 to β9.
Taking the logarithms on both sides of equation 5.5 will produce the linear equation to
apply in the multiple linear regression:
\[ \ln(\widehat{Effort}) = \beta_0 + \beta_1 \ln[Size] + \beta_2 \ln[SECU] + \beta_3 \ln[FAIL] + \beta_4 \ln[CPLX] + \beta_5 \ln[PLAT] + \beta_6 \ln[PVOL] + \beta_7 \ln[EXPE] + \beta_8 \ln[TOOL] + \beta_9 \ln[SITE] + \varepsilon \tag{5.6} \]
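Operationally, Equation 5.6 can be fit with ordinary least squares on log-transformed data. The sketch below is a minimal illustration in Python/NumPy using synthetic values; the variable layout is illustrative and does not imply the tooling actually used in the study.

import numpy as np

# Sketch: fit the log-linear form of Equation 5.6 by ordinary least squares.
# Column 0 of X is the size; columns 1-8 stand in for the cost drivers.
rng = np.random.default_rng(0)
n = 300
X = rng.uniform(1.0, 3.0, size=(n, 9))
X[:, 0] = rng.uniform(5.0, 500.0, size=n)            # size in Function Points
true_exponents = np.array([1.0] + [0.7] * 8)          # arbitrary "true" values
y = 10.0 * np.prod(X ** true_exponents, axis=1) * rng.lognormal(0.0, 0.3, n)

design = np.column_stack([np.ones(n), np.log(X)])     # intercept plus log predictors
betas, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)

print("intercept (beta_0):", round(betas[0], 3))      # close to ln(10)
print("size exponent (beta_1):", round(betas[1], 3))  # close to 1.0
print("cost-driver exponents:", np.round(betas[2:], 2))  # close to 0.7 each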
5.4 SECU Quantification
The Required Software Security (SECU) cost driver was measured according to the scale
developed and presented in the previous chapter. The banking company applies a set of software security practices compatible with the High level of the scale for certain projects that are designated by a Software Security team. Of the 1,140 observations in the dataset, 120 projects were developed at the High security level. For the telecom company projects, all observations were rated Nominal in the dataset.
The values for the SECU cost driver obtained through the Wideband and online Delphi
sessions were used to initialize the statistical analysis process. The scale and the correspond-
ing values are presented in Table 5.4. For the remaining cost drivers, the initial values were
adapted from the COCOMO II model (see Appendix B).
Table 5.4 Expert Opinion Estimates for the SECU Cost Driver
Parameter N H VH XH UH PR
SECU 1.00 1.28 1.64 2.11 2.70 2.70
5.5 Multiple Regression Results
The results of the multiple linear regression for the model based on Equation 5.6 are presented in Table 5.5. The model fitness information is shown in Table 5.6. The p-value of the F-statistic is < 2.2e-16, which is highly significant and shows that there are predictor variables significantly related to the effort.
Table 5.5 Coefficients for the Model
Term Estimate Std.Error t-value p-value Conf-low Conf-high
(Intercept) 2.602 0.069 37.887 < 2e-16 (***) 2.467 2.736
log(fsm_count) 0.976 0.014 71.823 < 2e-16 (***) 0.949 1.003
log(fail) 1.201 0.241 4.985 7.18e-07 (***) 0.728 1.674
log(cplx) 0.869 0.233 3.736 0.000197 (***) 0.413 1.326
log(secu) 1.111 0.214 5.201 2.35e-07 (***) 0.692 1.531
log(plat) 0.587 0.215 2.727 0.006490 (**) 0.165 1.009
log(pvol) 0.657 0.230 2.855 0.004383 (**) 0.205 1.108
log(expe) 3.189 0.315 10.108 < 2e-16 (***) 2.570 3.808
log(tool) 3.243 0.322 10.061 < 2e-16 (***) 2.611 3.875
log(site) -0.654 0.518 -1.264 0.206588 -1.670 0.361
Signif. codes: 0 ‘(***)’ 0.001 ‘(**)’ 0.01 ‘(*)’ 0.05 ‘(.)’ 0.1 ‘ ’ 1
Table 5.6 Model Fitness
R²      Adj. R²   RSE     F-statistic   P-value     DF   DF Residual   N Obs
0.838   0.837     0.536   650.1         < 2.2e-16   9    1130          1140
The R² is 0.838 and the Adjusted R² is 0.837, indicating that 84% of the variance in the
effort can be predicted by the variables in the model. The analysis of the t-statistics for the
coefficients in Table 5.5 reveals that, except for SITE, all other predictor variables have a
significant association with the dependent variable effort. The t-value measures how many
standard errors the coefficient is away from zero, thus the higher the t-value, the greater the
confidence in the coefficient as a predictor.
The intercept coefficient is 2.602 and back-transforming it from the natural log to the original scale (e^2.602) results in 13.49. This means that, without considering the other parameters, it takes 13.49 hours to develop one Function Point.
The estimated coefficient for the Required Software Security (SECU) parameter is 1.111,
with a significant t-value of 5.201. Also, the p-value for the SECU predictor is very low,
supporting the evidence that the SECU cost driver has an effect on the resulting effort.
The Multisite Development (SITE) predictor presented a negative coefficient with a high p-value. The negative value indicates that this parameter, which represents the impact of collocation and communication on effort, behaved in the opposite way to what the model expected. However, the p-value is large (0.207), also indicating that SITE is not significant in explaining the resulting effort in this sample. One possible explanation for this result is that, as most of the projects in the sample are small (the median effort is 3.45 person-months), they require small teams for which different communication and collocation levels may not affect productivity.
5.6 Model Validation
Table 5.6 also shows the Residual Standard Error (RSE) of 0.536. This value is in the natural log space and can be roughly interpreted as the standard error of a prediction measured in percentage points.
To further evaluate the accuracy of the model, two approaches were used to validate
it: (1) the Validation Set approach, which consists of randomly splitting the data into two
sets: one set is used to train the model and the other set is used to test the model, and (2)
the K-fold Cross-Validation approach, which evaluates the model performance on different
subsets of the training data and then calculates the average prediction error rate.
Table 5.7 presents the results using a 75/25 split for training/testing in the Validation
Set approach and 5-fold Cross validation.
Table 5.7 Model Validation Results
Approach R2 MMRE PRED25 PRED40
Validation Set 0.900 0.358 50.4% 68.7%
5-fold Cross Validation 0.693 0.393 53.7% 71.1%
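For reference, the accuracy metrics reported in Table 5.7 (MMRE and PRED(N)) can be computed directly from actual and predicted effort values. The sketch below uses illustrative arrays rather than the study's data.

import numpy as np

def mmre(actual, predicted):
    """Mean Magnitude of Relative Error."""
    return float(np.mean(np.abs(actual - predicted) / actual))

def pred(actual, predicted, level):
    """PRED(level): fraction of predictions within `level` of the actuals."""
    mre = np.abs(actual - predicted) / actual
    return float(np.mean(mre <= level))

# Illustrative actual vs. predicted effort (hours) for a held-out test set.
actual = np.array([500.0, 1200.0, 80.0, 300.0, 2400.0])
predicted = np.array([450.0, 1500.0, 95.0, 310.0, 1600.0])

print("MMRE   :", round(mmre(actual, predicted), 3))
print("PRED25 :", pred(actual, predicted, 0.25))
print("PRED40 :", pred(actual, predicted, 0.40))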
5.7 Outliers Detection and Removal
Outliers in a dataset may have a considerable impact on the fitness of a model as they
introduce bias and affect predictions. Two variables were analyzed to check for outliers -
productivity and security effort ratio.
The projects' productivity is given by the quotient between the effort and the size variables and, in this dataset, is expressed in hours per Function Point (h/FP). The productivity ranges from 0.25 h/FP to 680 h/FP, with an average of 16.9 h/FP and a median of 12.8 h/FP. Figure 5.7 illustrates the existence of observations far distant from the median.
Another variable in the dataset that presented extreme values is the security effort ratio.
This variable is calculated as the effort spent with security in the project divided by the
development effort (without the security effort). The security effort ratio ranges from 0.008
to 22.21, with an average of 0.87 and a median of 0.23. Figure 5.8 shows the security effort
ratio distribution.
129
0
200
400
600
Frequency
0 200 400 600
0 200 400 600
Productivity
0
200
400
600
Frequency
0 200 400 600
0 200 400 600
Productivity
Figure 5.7 Productivity Distribution in the Dataset
Considering that the extreme values of productivity and security effort ratio could affect the accuracy of the model, an analysis of the outliers was performed. A well-known rule to detect outliers was defined by Tukey [123], which labels as outliers the values more than 1.5 times the interquartile range (IQR) away from the quartiles — either below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. Hoaglin and Iglewicz [58] analyzed the performance of the parameter k = 1.5 established by Tukey and determined the probability that a sample with n observations contains no observations beyond the cutoffs, using probabilities of 90% and 95%. Their study showed that, for n above 10, the median value of k is 2.1 for a 90% probability and 2.3 for a 95% probability that no observations lie beyond the cutoffs.
Figure 5.8 Security Effort Ratio Distribution in the Dataset (histogram)

The parameter k = 2.3 was then applied to label the outliers for the productivity and security effort ratio variables in the dataset. The upper limit for the productivity variable was calculated as 30.71 h/FP, and 64 observations were labeled as outliers. For the security effort ratio variable, the upper limit was defined as 1.74, and 15 observations were identified as outliers. In total, 68 of the 1,140 observations were classified as outliers, as 11 observations were labeled by both the productivity and the security effort ratio rules. Figure 5.9 shows the position of the outliers in the Size versus Effort scatter plot, and Figures 5.10 and 5.11 show the distributions of the two variables after the outliers removal.
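The outlier labeling described above can be sketched as follows; the arrays are illustrative only, not the actual 1,140 observations, and k = 2.3 follows the Hoaglin and Iglewicz recommendation used in this analysis.

import numpy as np

def label_outliers(values, k=2.3):
    """Tukey-style rule: flag values beyond Q1 - k*IQR or Q3 + k*IQR."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Illustrative observations (not the actual dataset).
productivity = np.array([12.8, 9.5, 16.2, 680.0, 22.4])      # hours per Function Point
security_ratio = np.array([0.23, 0.10, 22.21, 0.85, 0.45])   # security / development effort

outliers = label_outliers(productivity) | label_outliers(security_ratio)
print(outliers)  # flags the 680 h/FP and the 22.21 ratio observations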
A new version of the dataset with the remaining 1,072 observations, named the Filtered Dataset, was created and used to calibrate a new version of the model.
5.8 Multiple Regression Results for the Filtered Dataset
The results of the coefficients for the model using the Filtered Dataset are presented in Table 5.8. The R² and the adjusted R², both rounded to 0.93, were improved with the Filtered Dataset-based model, keeping a strong p-value. This means that 93% of the variance in
effort can be explained by the variables in the model, compared to the 84% obtained with the complete version of the dataset.

Figure 5.9 Scatter Plot of Size vs Effort with Outliers Highlighted (log-log scale; legend: Common Outliers, Productivity Outliers, Security Effort Ratio Outliers)

Figure 5.10 Distribution of the Productivity Variable After Outliers Removal (histogram, bin size = 1 h/FP)
These results are also reflected in more significant p-values for the predictors Platform
Constraints (PLAT) and Platform Volatility (PVOL), whose p-values had significance <
0.01 in the original version of the dataset. The t-values of the coefficients confirm that
the predictors have a significant association with the effort. The only exception, again, is the Multisite Development (SITE) parameter. The sign of the coefficient for SITE changed (it was negative in the previous version), but the p-value remains non-significant.

Figure 5.11 Distribution of the Security Effort Ratio Variable After Outliers Removal (histogram, bin size = 0.1)
The SITE parameter was then removed from the equation and new coefficients were
calculated, as shown in Table 5.9. When compared to the previous table, the coefficients
were slightly changed because of the removal of the SITE parameter. The model fitness for
the two versions of the dataset can be viewed in Table 5.10.
When comparing the coefficients of the model based on the Filtered Dataset (Table 5.8
or Table 5.9) with the coefficients of the model based on the Complete Dataset (Table 5.5), it
can be observed that the removal of the outliers caused important changes in the coefficient
values. The new intercept coefficient (2.328), when back-transformed to the original scale (e^2.328), results in 10.26 h/FP, a lower value compared to the 13.49 h/FP of the previous model. This was expected, as the outliers analysis revealed extreme values for productivity, such as the maximum of 680 h/FP, which was removed once the upper limit for the productivity variable was set at 30.71 h/FP.
A similar phenomenon occurred with the Required Software Security (SECU) predictor.
Table 5.8 Model Coefficients for the Filtered Dataset
Term Estimate Std.Error t-value p-value Conf-low Conf-high
(Intercept) 2.361 0.048 48.985 < 2e-16 (***) 2.267 2.456
log(fsm_count) 1.038 0.010 107.851 2e-16 (***) 1.019 1.056
log(fail) 1.348 0.168 8.019 2.81e-15 (***) 1.018 1.678
log(cplx) 0.916 0.160 5.737 1.26e-08 (***) 0.603 1.229
log(secu) 0.702 0.155 4.543 6.18e-06 (***) 0.399 1.005
log(plat) 0.563 0.152 3.695 0.000231 (***) 0.264 0.862
log(pvol) 0.605 0.162 3.731 0.000200 (***) 0.287 0.923
log(expe) 2.339 0.225 10.374 2e-16 (***) 1.896 2.781
log(tool) 2.221 0.225 9.868 2e-16 (***) 1.780 2.663
log(site) 0.402 0.361 1.113 0.266 -0.306 1.110
Signif. codes: 0 ‘(***)’ 0.001 ‘(**)’ 0.01 ‘(*)’ 0.05 ‘(.)’ 0.1 ‘ ’ 1
The new coefficient for SECU, 0.707, is lower than the previous value of 1.111. The analysis
of the security effort ratio distribution revealed extreme observations such as the maximum
ratio, where the effort with security activities was 22.21 times higher than the effort required
to develop the maintenance project. The outlier rule set an upper limit of 1.74 for this
variable, which reduced the effort spent with security in the dataset. Nevertheless, the new
coefficient is still inside the confidence interval calculated for SECU predictor in the original
version of the dataset (0.692 to 1.531).
5.9 Model Validation for the Filtered Dataset
Table 5.10 also shows that the Residual Standard Error (RSE) for the Filtered Dataset-based
model is better than in the previous model. It was reduced from 0.536 to 0.364, suggesting
that the predictions are more accurate.
Further results of the accuracy assessment are presented in Table 5.11. All accuracy
metrics improved when using the Filtered Dataset. The best performance of the model
is shown in the 5-fold Cross-Validation approach with the Mean Magnitude of Relative
Error (MMRE) equal to 0.29, 64.2 percent of the predictions within 25 percent of the actuals (PRED25), and 82.2 percent of the predictions within 40 percent of the actuals (PRED40).

Table 5.9 Model Coefficients for the Filtered Dataset without SITE
Term             Estimate   Std.Error   Statistic   P-value           Conf-low   Conf-high
(Intercept)      2.328      0.038       61.900      < 2e-16 (***)     2.254      2.401
log(fsm_count)   1.039      0.010       109.319     < 2e-16 (***)     1.021      1.058
log(fail)        1.386      0.165       8.415       < 2e-16 (***)     1.063      1.709
log(cplx)        0.920      0.160       5.759       1.11e-08 (***)    0.606      1.233
log(secu)        0.707      0.154       4.575       5.33e-06 (***)    0.404      1.010
log(plat)        0.563      0.152       3.695       0.000231 (***)    0.264      0.862
log(pvol)        0.607      0.162       3.742       0.000193 (***)    0.289      0.925
log(expe)        2.413      0.215       11.209      < 2e-16 (***)     1.991      2.836
log(tool)        2.166      0.220       9.864       < 2e-16 (***)     1.735      2.597
Signif. codes: 0 ‘(***)’ 0.001 ‘(**)’ 0.01 ‘(*)’ 0.05 ‘(.)’ 0.1 ‘ ’ 1

Table 5.10 Model Fitness for the Two Versions of the Dataset
Dataset    R²      Adj. R²   RSE     F-statistic   P-value     DF   DF Residual   N Obs
Complete   0.838   0.837     0.536   650.1         < 2.2e-16   9    1130          1140
Filtered   0.927   0.926     0.364   1678.9        < 2.2e-16   9    1063          1072
Table 5.11 Model Validation Results
Dataset    Approach                  R²      MMRE    PRED25   PRED40
Complete   Validation Set            0.900   0.358   50.4%    68.7%
Complete   5-fold Cross Validation   0.693   0.393   53.7%    71.1%
Filtered   Validation Set            0.910   0.296   63.1%    79.9%
Filtered   5-fold Cross Validation   0.739   0.291   64.2%    82.2%
5.10 The Resulting Model
The resulting model equation, calibrated with the 1,072 observations of the Filtered Dataset
is:
\[ \widehat{\log(Effort\_h)} = 2.328 + 1.039\log(Size\_FP) + 0.707\log(SECU) + 1.386\log(FAIL) + 0.920\log(CPLX) + 0.563\log(PLAT) + 0.607\log(PVOL) + 2.413\log(EXPE) + 2.166\log(TOOL) \tag{5.7} \]

Back-transforming Equation 5.7 to the original units results in:

\[ \widehat{Effort\_h} = 10.25 \times Size\_FP^{1.039} \times SECU^{0.707} \times FAIL^{1.386} \times CPLX^{0.920} \times PLAT^{0.563} \times PVOL^{0.607} \times EXPE^{2.413} \times TOOL^{2.166} \tag{5.8} \]
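To illustrate how the calibrated equation can be applied, the sketch below evaluates Equation 5.8 for a hypothetical 100 FP project rated High on SECU and Nominal on all other drivers, using the initial rating values from Table 5.4 and Appendix B (Nominal drivers take the value 1.00). The project itself is invented purely for illustration.

# Sketch: evaluate Equation 5.8 for a hypothetical project.
# Cost-driver inputs are the initial (pre-calibration) rating values;
# the equation's exponents then adjust them, as described in Section 5.10.
size_fp = 100.0
initial_ratings = {"secu": 1.28,  # High on the security scale (Table 5.4)
                   "fail": 1.00, "cplx": 1.00, "plat": 1.00,
                   "pvol": 1.00, "expe": 1.00, "tool": 1.00}
exponents = {"secu": 0.707, "fail": 1.386, "cplx": 0.920, "plat": 0.563,
             "pvol": 0.607, "expe": 2.413, "tool": 2.166}

effort_hours = 10.25 * size_fp ** 1.039
for driver, rating in initial_ratings.items():
    effort_hours *= rating ** exponents[driver]

print(round(effort_hours))  # roughly 1,460 hours, about 19% above the all-Nominal case (~1,227 hours)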
The resulting values for the cost drivers’ multipliers are then obtained by applying the
respective exponent of Equation 5.8 to the initial values used in the multiple linear regression
(see Appendix B). Table 5.12 presents the cost drivers’ values obtained. Columns VL to UH
are the cost drivers' levels: Very Low (VL), Low (L), Nominal (N), High (H), Very High (VH), Extra High (XH), and Ultra High (UH). PR stands for Productivity Range, calculated as the maximum value of each cost driver divided by its minimum value.
Table 5.12 New Cost Drivers’ Values
Cost Driver VL L N H VH XH UH PR
SECU 1.00 1.19 1.42 1.69 2.02 2.02
FAIL 0.76 0.89 1.00 1.14 1.38 1.63 2.14
CPLX 0.75 0.88 1.00 1.16 1.31 1.66 2.21
PLAT 1.00 1.04 1.12 1.28 1.28
PVOL 0.92 1.00 1.09 1.17 1.27
TOOL 1.41 1.21 1.00 0.80 0.58 2.43
EXPE 1.56 1.24 1.00 0.78 0.64 0.55 2.84
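The mechanics of this calibration can be illustrated for SECU: each initial rating value from Table 5.4 is raised to the estimated regression coefficient, as in the short sketch below.

# Sketch: derive the calibrated SECU multipliers of Table 5.12 by raising the
# initial expert-opinion values (Table 5.4) to the estimated coefficient.
initial_secu = {"N": 1.00, "H": 1.28, "VH": 1.64, "XH": 2.11, "UH": 2.70}
coefficient = 0.707  # Filtered Dataset estimate; 1.102 would give the Complete Dataset row

calibrated = {level: round(value ** coefficient, 2) for level, value in initial_secu.items()}
print(calibrated)  # approximately 1.00, 1.19, 1.42, 1.70, 2.02 — matching Table 5.12 up to rounding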
5.11 Results for the SECU Cost Driver
The estimate of the SECU coefficient with the Complete Dataset is 1.102, with a 95% confidence interval between 0.683 and 1.521 (this is for the model without the SITE parameter). The estimate of SECU for the same model with the Filtered Dataset was 0.707, with a confidence interval between 0.404 and 1.010 (Table 5.9). Figure 5.12 illustrates the SECU coefficient estimates with their confidence intervals.
Figure 5.12 SECU Coefficient Estimates with 95% Confidence Intervals (Complete DS: 1.102 [0.683, 1.521]; Filtered DS: 0.707 [0.404, 1.010])
In both situations, the SECU coefficient is statistically significant, with a t-value above
4 and a p-value < 0.0001. Whether the complete dataset is used or the outliers are removed (Filtered Dataset), the null hypothesis that SECU does not influence the projects'
effort can be rejected. As depicted in Figure 5.12, both estimates and confidence intervals
for the SECU coefficient are safely distant from the zero value.
The resulting SECU multipliers for the 0.707 estimate of the coefficient, shown in Table
5.12, are lower than the initial estimates provided by experts in the Delphi sessions. Table
5.13 presents the multipliers for the SECU levels from both sources. According to the Filtered Dataset results, a software project planning to apply practices at the High level of the scale will spend 19% additional effort on the required security. The highest value of the SECU
multiplier is 2.02 for the Ultra High level, meaning that a project will double its effort if applying the practices specified at this level.
Table 5.13 Comparison of SECU Rating Values
Source N H VH XH UH PR
Expert Opinion (initial data) 1.00 1.28 1.64 2.11 2.70 2.70
Filtered DS Model (coef=0.707) 1.00 1.19 1.42 1.69 2.02 2.02
Complete DS Model (coef=1.102) 1.00 1.31 1.73 2.28 2.99 2.99
For the Complete Dataset results, the multipliers are slightly higher than the values provided by expert opinion. According to these results, a project applying security practices at the High level of the SECU scale will cost 31% more than the nominal estimate. Projects at the extreme of the SECU scale will spend three times the effort originally estimated.
5.12 Results by System Architecture
The dataset provided information about the system architecture for each maintenance
project. In total, there were 33 different classifications, including combinations of system
architectures. Table 5.14 presents the top five architectures and the respective number of
data points for the Complete and Filtered datasets.
Table 5.14 Projects by System Architecture
System Architecture N (Complete DS) N (Filtered DS)
Web-Mainframe 342 332
Mainframe 249 245
Client-Server, Web-Mainframe 137 135
Web 88 83
Client-Server 54 49
To analyze the behavior of the coefficients according to different system architectures,
five new datasets were created by selecting the projects from the Filtered dataset, according
to the architectures shown in Table 5.14. Figure 5.13 compares the coefficients obtained for
each model generated with all cost drivers available. It indicates that some of the predictors
have distinct effects depending on the architecture.
Figure 5.13 Comparison of Coefficients by System Architecture (models: Web-Mainframe, Mainframe, Web-Mainframe / Client-Server, Web, Client-Server)
Table 5.15 presents the results of the regression for the five datasets. For each predictor,
the first line shows the coefficient value and the significance level, while the second line
presents the standard error, in brackets.
Table 5.15 Regression Results by System Architecture
                 Web-Mainframe   Mainframe   Web-Mainframe / Client-Server   Web   Client-Server
(Intercept) 2.24 *** 2.26 *** 2.33 *** 2.77 *** 1.82 ***
(0.05) (0.07) (0.12) (0.18) (0.30)
log(fsm_count) 1.08 *** 1.07 *** 0.98 *** 1.03 *** 1.07 ***
(0.01) (0.01) (0.02) (0.04) (0.05)
log(secu) 1.08 *** 1.05 *** 0.36 0.77 2.13 *
(0.16) (0.16) (0.47) (0.78) (0.93)
log(fail) 1.39 *** 1.74 *** 1.97 *** 1.23 0.84
(0.18) (0.24) (0.38) (0.76) (0.68)
log(cplx) 0.52 *** 1.26 *** 0.65 1.73 * -0.95
(0.15) (0.21) (0.36) (0.83) (0.77)
log(plat) 1.02 *** 0.85 *** 1.06 *** 1.09 -0.37
(0.18) (0.19) (0.29) (0.82) (0.58)
log(pvol) 0.55 ** 0.32 0.21 1.1 4.02 ***
(0.20) (0.25) (0.45) (0.96) (1.00)
log(expe) 2.37 *** 3.21 *** 1.69 ** 1.64 3.91 ***
(0.30) (0.42) (0.58) (1.06) (1.03)
log(tool) 0.07 0.84 * 1.24 1.14 2.54
(0.31) (0.34) (1.01) (0.89) (1.26)
log(site) 1.20 ** 0.92 -1.48 4.99 ** -2.44
(0.42) (0.49) (1.25) (1.62) (3.00)
N 332 245 135 83 49
R2 0.97 0.97 0.95 0.93 0.95
Signif. codes: 0 ‘(***)’ 0.001 ‘(**)’ 0.01 ‘(*)’ 0.05 ‘(.)’ 0.1 ‘ ’ 1
Apart from the size (fsm_count) predictor, which was statistically significant in all ar-
chitectures, the other predictors presented different impacts on the dependent variable. To
better analyze the significant set of predictors for each architecture, a backward stepwise
selection was applied for each dataset. The results are presented in Table 5.16.
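The stepwise procedure can be sketched as follows. Since the exact selection criterion used in the study is not detailed here, a p-value-based backward elimination is shown purely as an indicative example, with synthetic data and statsmodels.

import numpy as np
import pandas as pd
import statsmodels.api as sm

def backward_stepwise(X, y, threshold=0.05):
    """Drop the least significant predictor until all p-values fall below the threshold."""
    predictors = list(X.columns)
    while predictors:
        model = sm.OLS(y, sm.add_constant(X[predictors])).fit()
        pvalues = model.pvalues.drop("const")
        worst = pvalues.idxmax()
        if pvalues[worst] <= threshold:
            return model
        predictors.remove(worst)   # eliminate the weakest predictor and refit
    return None

# Synthetic example: y depends on x1 and x2, while x3 is pure noise.
rng = np.random.default_rng(1)
X = pd.DataFrame(rng.normal(size=(200, 3)), columns=["x1", "x2", "x3"])
y = 2.0 * X["x1"] + 0.5 * X["x2"] + rng.normal(scale=0.5, size=200)

final_model = backward_stepwise(X, y)
print(list(final_model.params.index))  # typically ['const', 'x1', 'x2']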
Table 5.16 Stepwise Regression Results by System Architecture
                 Web-Mainframe   Mainframe   Web-Mainframe / Client-Server   Web   Client-Server
(Intercept) 2.24 *** 2.27 *** 2.32 *** 2.77 *** 1.99 ***
(0.05) (0.07) (0.11) (0.18) (0.20)
log(fsm_count) 1.08 *** 1.07 *** 0.98 *** 1.03 *** 1.08 ***
(0.01) (0.01) (0.02) (0.04) (0.04)
log(secu) 1.08 *** 1.05 *** 2.08 ***
(0.16) (0.16) (0.49)
log(fail) 1.39 *** 1.75 *** 1.94 *** 1.50 *
(0.18) (0.24) (0.38) (0.74)
log(cplx) 0.51 *** 1.35 *** 0.69 1.88 *
(0.15) (0.19) (0.35) (0.79)
log(plat) 1.02 *** 0.90 *** 1.11 *** 1.59 *
(0.18) (0.19) (0.29) (0.76)
log(pvol) 0.56 ** 3.94 ***
(0.19) (0.96)
log(expe) 2.39 *** 3.23 *** 1.68 ** 2.39 ** 3.11 ***
(0.29) (0.42) (0.58) (0.87) (0.76)
log(site) 1.21 ** 0.88 -1.8 4.53 **
(0.41) (0.49) (1.19) (1.53)
log(tool) 0.84 * 1.42 1.25 3.19 **
(0.34) (0.98) (0.87) (1.13)
N 332 245 135 83 49
R2 0.97 0.96 0.95 0.93 0.94
Signif. codes: 0 ‘(***)’ 0.001 ‘(**)’ 0.01 ‘(*)’ 0.05 ‘(.)’ 0.1 ‘ ’ 1
The results for the security predictor showed similar coefficients for the Web-Mainframe and Mainframe architectures, with values (1.08 and 1.05) above the coefficient obtained for the Filtered dataset (0.707). For the Client-Server architecture, the SECU coefficient turned out to be considerably higher than the other estimates (2.08). For the Web-Mainframe / Client-Server and Web architectures, the results were not statistically significant. Appendix ?? includes the complete analysis of the stepwise regression for each system architecture.
Table 5.17 presents the SECU multipliers calculated according to the coefficients of each
system architecture. As indicated by the comparison of the coefficients, the multipliers for
these three architectures presented higher values than the ones obtained for the Filtered
dataset. Especially for the Client-Server architecture, the resultant factors were significantly
higher. For example, projects applying software security practices at the High level would have a 30% additional effort if the architecture is Mainframe, but a 68% additional effort if
the architecture is Client-Server.
Table 5.17 Comparison of SECU Multipliers by System Architecture
Source N H VH XH UH
Web-Mainframe 1.00 1.31 1.71 2.23 2.92
Mainframe 1.00 1.30 1.69 2.19 2.85
Client-Server 1.00 1.68 2.81 4.72 7.92
5.13 Summary
This chapter presented the results for the statistical model building based on a dataset
with 1,140 maintenance projects. The multiple regression results offered evidence that the
proposed Required Software Security (SECU) parameter had a positive and statistically
significant effect on the projects’ effort. Through the analysis of the dataset, outlier obser-
vations were detected and a new version of the dataset, named Filtered Dataset, was used
to calculate new coefficients. The models based on both dataset versions were validated
using the Validation Set and the 5-fold Cross-Validation approaches. The Filtered Dataset-
based model presented improved accuracy for the MMRE and PRED metrics. Finally, an
analysis of the security coefficient for different system architectures showed that the Client-
Server model presented a considerably higher effort per security level when compared to other architectures.
Chapter 6
Discussion
This chapter presents the key findings of the dissertation, along with the implications of the results, limitations, and recommendations for studies to follow.
6.1 Findings
The lack of knowledge about the effort required to build security in software development
hinders the planning of proper resources to cope with security threats. In this dissertation,
this problem was tackled by developing a rating scale to measure the level of required software security and by providing effort multipliers for the levels of the scale.
6.1.1 Multipliers for Required Software Security Levels
In the first phase of the research, it was found that the application of security practices
is the main factor that drives the effort spent with security during software development.
Based on this result, a scale to measure levels of required security was developed and used to
collect effort data from two sources: expert opinion and project data. The productivity range
provided by experts for the Required Software Security scale was 2.7, from the nominal level
(no security practice applied) to the highest level of the scale (practices applied rigorously
and extensively). The productivity range obtained through the statistical model building
was 2.02 for the Filtered Dataset and 2.99 for the Complete Dataset.
Previous studies also reported estimates of the productivity range and the multipliers for the security factor in software cost models; however, as presented in the literature review, these were not validated with observational data. The effort models calibrated in this dissertation confirm that the application of security practices during the software development process has an impact on effort, providing evidence with data from real projects.
The outcomes of the statistical models also demonstrate that the costs of secure software
development are not as high as presented in previous studies. Figure 6.1 shows a comparison
between the prior models and the ones produced in this study. It can be observed that
the three sets of multipliers obtained in this research (Expert Opinion, Complete Dataset,
and Filtered Dataset) are positioned in the bottom part of the chart. Only the COCOMO
II Security Extension presents a very similar result to the one obtained by the Filtered
Dataset-based model. The multipliers for this model were estimated in a Delphi exercise
that involved commercial and aerospace organizations in 2003 [105].
On the top part of the chart are the COSECMO min and max values. These multipliers
were derived from a quotation provided by an independent agent for evaluating 5 KSLOC
of infrastructure software, according to the Common Criteria Evaluation Assurance Levels
[34]. Another model on the top is the Weapon System, with a 1.87 multiplier for the level
High of security, obtained from expert opinion with the aim of calibrating a model with 73
data points from the Korean defense domain [74]. The model calibration, however, did not
achieve a statistically significant coefficient for the security predictor.
The Secure OS min and max multipliers are intermediary values in this comparison.
They resulted from a mini-Delphi conducted among the authors and an industry expert in
an exploratory study for establishing an effort estimation model for secure operating system
(OS) software development [134].
Figure 6.1 Comparison of Models' Security Multipliers
Model                         Low    Nominal   High   Very High   Extra High   Super/Ultra High
Filtered DS (this study)             1.00      1.19   1.42        1.69         2.02
COCOMO II Sec. Ext.           0.94   1.02      1.27   1.43        1.75
Expert Opinion (this study)          1.00      1.28   1.64        2.11         2.70
Complete DS (this study)             1.00      1.31   1.73        2.28         2.99
Secure OS (min)                      1.00      1.25   1.75        2.00         3.00
Secure OS (max)                      1.00      1.50   2.00        2.75         4.00
COSECMO (min)                        1.00      1.20   1.50        2.25         4.13
COSECMO (max)                        1.00      1.80   3.00        6.00         13.50
Weapon Systems                       1.00      1.87

The comparison of the models suggests that the domain of the input data has an im-
pact on the security multipliers. Recall that the dataset used in this study is based on
projects from bank and telecom organizations that could be classified in the Information
Systems super-domain. Except for the COCOMO II Security Extension, whose estimates
are very close to the multipliers produced by the Filtered-DS model, all other studies in-
volved domains or applications related to the Engineering and Real-Time super-domains.
This interpretation confirms previous expectations [82] and is in line with the findings of
Rosa et al. [109], whose investigation, involving 20 agile projects from the US Department
of Defense (DoD), demonstrated that the productivity in the Information Systems super-
domain is higher than the productivity in the Engineering super-domain, which, in turn, is
higher than the productivity in the Real-Time super-domain [109].
Another plausible explanation for the lower multiplier values in our results is the advances in automation in software development, particularly in the security field, which tend to improve productivity. The dataset analyzed in this dissertation comes from projects developed during 2019 and 2020 and may have benefited from security tools that were not available years ago, when the estimates from the other studies were collected.
6.1.2 Relationship of Security with Other Variables
Software qualities such as Security and Reliability share some characteristics - e.g. both
contribute to dependability - and are commonly associated [22, 21]. This relationship has
been hypothesized in some studies [134, 34]. However, contrary to this assumption, the
collinearity test for the predictors in the analyzed dataset revealed no meaningful correlation
between the security predictor (SECU) and the impact of software failure (FAIL) - another
denomination for Reliability. In fact, no correlation was found between security and any
other predictor in the dataset.
While the independence of the SECU parameter is positive for using it as a predictor in
statistical model building, the low correlation values with other variables are an interesting result per se. Studies on the interrelations of software qualities point out that there can be
conflicts and synergies between qualities [113]. For example, rigorous security tests impact
positively on reliability - a synergy; on the other hand, a measure to improve security by
using a single-agent key distribution system can introduce a single point of failure in terms
of reliability - a conflict [21].
6.1.3 Outliers for Productivity and Security-related Effort
The identification of the outliers in the dataset created a situation where the resulting co-
efficient for the SECU parameter can adjust the initial multipliers to a higher value, if the
outliers are maintained, or adjust them to a lower value, if the outliers are removed.
The histograms for the two variables used for analyzing the outliers, productivity and
security effort ratio, showed that both presented extreme values.
One example was the max value of 22.21 for the security effort ratio, indicating that
the effort with security was more than 22 times the effort spent in developing the software.
This specific data point is a project of 2.82 Function Points, developed in 14.8 hours, with
329 hours dedicated to security activities. An interview with the person responsible for the dataset indicated that this is likely a case where the software was submitted to a security scanner to detect vulnerabilities for the first time, and it revealed a backlog of threats that had to be patched in order to deploy the enhancement project. Considering that this practice could have happened for other projects as well, this contextualization helps to make the case for
the removal of security effort ratio outliers.
For the productivity variable, one extreme case was the data point with a productivity of 680 h/FP. This project was a 4.8 Function Point maintenance that took 5,440 hours, or 35.8 person-months, to be developed. While the information available in the dataset does not make it possible to state that this was a measurement error, this possibility cannot be discarded. The constant A was estimated as 10.25 h/FP for the Filtered dataset and 13.49 h/FP for the Complete dataset, which can be considered the productivity indexes if the cost drivers are ignored. Even considering the worst-case scenario for all cost drivers, a productivity of 680 h/FP would not be possible.
Another aspect to consider in favor of the outliers removal was that a conservative ap-
proach was taken to determine the limits to label the outliers. The limits were calculated
using a parameter that considers a probability of 95% that the sample contains no observa-
tions beyond the cutoffs [58].
6.2 Implications
These results build on previous research of parametric cost models that proposed a security
cost driver by providing an empirical validation for the determination of the multipliers.
In terms of practice, the results indicate that software security costs less than previously
reported in the literature, at least for the Information Systems domain. This is encouraging
for practitioners, who can use the calibrated model to produce software estimates considering
their security needs. Another important outcome for practitioners is that the model was
calibrated for the Function Points sizing metric, which has been largely adopted in industry
and public agencies.
The empirically validated security multipliers also contribute to research on the cost-effectiveness analysis of secure software development. The secure software engineering community has been trying to demonstrate that applying security practices early in the software development life cycle can lead to a decrease in the total cost of a software system, as the risks and costs of patching vulnerabilities after deployment tend to be much lower. So far, however, authors have relied on hypothetical scenarios to make their claims. The mapping between the level of secure software development and estimated costs provided in this dissertation is a valuable input for the development of software security investment models.
6.3 Limitations
Despite the large dataset used in this research, the generalization of the results is limited in
some aspects, such as the type and size of the projects, the software sizing method, and the
variability of the cost drivers ratings.
Also, the dataset is composed of maintenance projects, classified mostly as small projects.
While models like COCOMO II do not differentiate development and maintenance projects
when defining the cost drivers, it is not clear if larger development projects could affect the
resulting multipliers.
For the sizing method, all projects applied a factor to deflate the project size for changed
and deleted Function Points. This is a common practice in industry when measuring the
scope of maintenance projects that involve added, changed, and deleted functions. In a way,
it normalizes the size, making it more comparable with development projects, which in general contain only added functions. However, it must be considered that the use of the multipliers is conditional on a similar approach being used to size the project to be estimated.
Although limited by the aspects presented above, the results of this research are nonethe-
less valid for the purpose of answering the research questions. Access to datasets from the
software industry containing variables such as size, effort, and security information is rare,
making any sample available a valuable resource.
6.4 Recommendations
Considering the difficulties in acquiring data to develop research in this area, one recommen-
dation for future studies is to explore public repositories of open source software. Recent
research on software security and software effort estimation has been collecting data from
Open Source Projects (OSP) and comparing it with Closed Source Projects (CSP). Organizations participating in OSP, as well as foundations that support many OSP, are interested in quantifying the effort that each collaborating company, and also the community, puts into the project. However, cost models that traditionally use only effort as output need to be reviewed to fit this new environment.
The analysis of different architectures showed that there are considerable differences in
effort for applying security practices in distinct system architectures. It would be interesting
to investigate how secure software development differs in these cases. Besides examining
the additional effort with security for system architectures, the domain of the application
is another factor to consider. Gathering data that contemplate such variables is one way
forward.
Also, further research is needed to understand the interrelations between security and
other cost drivers. Studies that analyze the synergies and conflicts between software qualities
can give support to deepen the analysis at least for some of the cost drivers. Another
opportunity to bring new insights to this research topic is to involve security experts and
cybersecurity researchers in Software Engineering studies.
Security in software development also implies the implementation of security features,
such as authentication, cryptography, auditing, etc. Despite the estimates obtained from
security experts about the increase in the application size due to these features, the dataset
did not provide information to validate such estimates. More investigation is needed to
understand this functional aspect of security and its impact on the development effort.
Lastly, analyzing the effects of investing in different levels of secure software development
is also one way forward. For example, there is an expected relation between the increase of
software security costs and the reduction of software vulnerabilities. With the extraction of
vulnerability data, a model could be developed to represent the introduction and removal of vulnerabilities based on the project characteristics and the selected software security degree. As a
result, it would be possible to predict the residual vulnerabilities in the software according to
the distinct levels of security applied, which, in turn, could help to answer questions about
the cost-effectiveness of software security.
Chapter 7
Conclusion
This research aimed to develop a cost estimation model in line with current software secu-
rity practices in order to quantify the effects of required software security on the software
development effort.
An ordinal scale was developed, based on the sources of cost discovered through a sys-
tematic review of the literature and a survey with practitioners. The descriptions for the
scale levels were evaluated and improved with the feedback obtained from estimation and
security experts. Their expertise was also put to use through in-person and online Delphi
sessions to collect initial estimates for the productivity range of the security cost driver.
Gathering data for building the statistical model was a great challenge in this project.
After many meetings and attempts, a large dataset from two organizations, containing a subset of maintenance projects that performed security activities, was obtained. Based on the results of a multiple linear regression on this dataset, it can be concluded that the application of software security practices can impact the cost estimates, ranging from 19% additional effort (the first level of the scale) to 102% additional effort (the highest level of the scale).
This research builds on previous works on secure software cost models and goes one
step forward by providing an empirical validation for the required software security scale.
Even though the scope of the dataset limits the generalization of the results, the resulting
model can be used by practitioners in this field to estimate proper resources for secure
software development. Additionally, the validated multipliers represent an important source
of information for researchers developing investment models for software security.
Recommendations of studies to follow include further investigating the relationships be-
tween security and other cost drivers, exploring open source software repositories as a means
to obtain security and productivity data, investigating the size of security features, and
analyzing the impact of the adoption of different levels of security on the number of vulner-
abilities reported.
REFERENCES
[1] N. A. S. Abdullah et al. “Extended function point analysis prototype with security
costing estimation”. In: 2010 International Symposium on Information Technology.
Vol. 3. June 2010, pp. 1297–1301. doi: 10.1109/ITSIM.2010.5561460.
[2] Nur Atiqah Sia Abdullah et al. “User Acceptance for Extended Function Point Analy-
sis in Software Security Costing”. en. In: Software Engineering and Computer Systems.
Communications in Computer and Information Science. Springer, Berlin, Heidelberg,
June 2011, pp. 346–360. isbn: 978-3-642-22190-3 978-3-642-22191-0. doi: 10.1007/
978-3-642-22191-0_31. url: https://link.springer.com/chapter/10.1007/978-3-642-
22191-0_31 (visited on 11/03/2017).
[3] Jenny Abramov, Arnon Sturm, and Peretz Shoval. “Evaluation of the Pattern-based
method for Secure Development (PbSD): A controlled experiment”. In: Information
and Software Technology 54.9 (2012), pp. 1029–1043. issn: 0950-5849. doi: https:
//doi.org/10.1016/j.infsof.2012.04.001. url: https://www.sciencedirect.com/
science/article/pii/S0950584912000729.
[4] E. M. O. Abu-Taieh. “Cyber Security Body of Knowledge”. In: 2017 IEEE 7th Inter-
national Symposium on Cloud and Service Computing (SC2). Nov. 2017, pp. 104–111.
doi: 10.1109/SC2.2017.23.
[5] E. Amoroso. “Recent Progress in Software Security”. In: IEEE Software 35.2 (Mar.
2018), pp. 11–13. issn: 0740-7459. doi: 10.1109/MS.2018.1661316.
[6] J. Arunagiri, S. Rakhi, and K. P. Jevitha. “A Systematic Review of Security Measures
for Web Browser Extension Vulnerabilities”. en. In: SpringerLink (2016), pp. 99–112.
doi: 10.1007/978-81-322-2674-1_10. url: https://link.springer.com/chapter/10.
1007/978-81-322-2674-1_10 (visited on 12/01/2017).
[7] Tigist Ayalew, Tigist Kidane, and Bengt Carlsson. “Identification and Evaluation of
Security Activities in Agile Projects”. en. In: Secure IT Systems. Ed. by Hanne Riis
Nielson and Dieter Gollmann. Lecture Notes in Computer Science. Springer Berlin
Heidelberg, 2013, pp. 139–153. isbn: 978-3-642-41488-6.
[8] D. Baca et al. “A Novel Security-Enhanced Agile Software Development Process Ap-
plied in an Industrial Setting”. In: 2015 10th International Conference on Availability,
Reliability and Security. Aug. 2015, pp. 11–19. doi: 10.1109/ARES.2015.45.
[9] Dejan Baca and Bengt Carlsson. “Agile Development with Security Engineering Activ-
ities”. In: Proceedings of the 2011 International Conference on Software and Systems
Process. ICSSP ’11. New York, NY, USA: ACM, 2011, pp. 149–158. isbn: 978-1-4503-
0730-7. doi: 10.1145/1987875.1987900. url: http://doi.acm.org/10.1145/1987875.
1987900 (visited on 11/02/2018).
[10] Dejan Baca, Bengt Carlsson, and Lars Lundberg. “Evaluating the Cost Reduction
of Static Code Analysis for Software Security”. In: Proceedings of the Third ACM
SIGPLAN Workshop on Programming Languages and Analysis for Security. PLAS
’08. New York, NY, USA: ACM, 2008, pp. 79–88. isbn: 978-1-59593-936-4. doi: 10.
1145/1375696.1375707. url: http://doi.acm.org/10.1145/1375696.1375707 (visited
on 11/02/2018).
[11] Dejan Baca and Kai Petersen. “Countermeasure graphs for software security risk
assessment: An action research”. In: Journal of Systems and Software 86.9 (2013),
pp. 2411–2428. issn: 0164-1212. doi: https://doi.org/10.1016/j.jss.2013.04.023. url:
https://www.sciencedirect.com/science/article/pii/S0164121213001027.
[12] Dejan Baca and Kai Petersen. “Prioritizing Countermeasures through the Counter-
measure Method for Software Security (CM-Sec)”. en. In: Product-Focused Software
Process Improvement. PROFES. Springer, Berlin, Heidelberg, June 2010, pp. 176–
190. doi: 10.1007/978-3-642-13792-1_15. url: https://link.springer.com/chapter/
10.1007/978-3-642-13792-1_15 (visited on 07/17/2018).
[13] Dejan Baca et al. “Improving software security with static automated code analysis
in an industry setting”. en. In: Software: Practice and Experience 43.3 (Mar. 2013),
pp. 259–279. issn: 1097-024X. doi: 10.1002/spe.2109. url: http://onlinelibrary.
wiley.com/doi/abs/10.1002/spe.2109 (visited on 11/02/2018).
[14] D. A. Barbosa and S. Sampaio. “Guide to the Support for the Enhancement of Secu-
rity Measures in Agile Projects”. In: 2015 6th Brazilian Workshop on Agile Methods
(WBMA). Oct. 2015, pp. 25–31. doi: 10.1109/WBMA.2015.9.
[15] Saleem Basha and Dhavachelvan Ponnurangam. “Analysis of Empirical Software Ef-
fort Estimation Models”. In: International Journal of Computer Science and Infor-
mation Security 7.3 (Apr. 2010). arXiv: 1004.1239, pp. 68–77. issn: 1947-5500. url:
http://arxiv.org/abs/1004.1239 (visited on 06/04/2018).
[16] Kristian Beckers et al. “Common criteria compliant software development (CC-
CASD)”.en.In: Proceedings of the 28th Annual ACM Symposium on Applied Comput-
ing - SAC ’13.Coimbra,Portugal:ACMPress,2013,p.1298.isbn:978-1-4503-1656-9.
doi: 10.1145/2480362.2480604. url: http://dl.acm.org/citation.cfm?doid=2480362.
2480604 (visited on 06/12/2020).
[17] Punam Bedi et al. “Mitigating Multi-threats Optimally in Proactive Threat Manage-
ment”. In: SIGSOFT Softw. Eng. Notes 38.1 (Jan. 2013), pp. 1–7. issn: 0163-5948.
154
doi: 10.1145/2413038.2413041. url: http://doi.acm.org/10.1145/2413038.2413041
(visited on 11/02/2018).
[18] Alexander van den Berghe et al. “Design notations for secure software: a systematic
literature review”. en. In: Software & Systems Modeling 16.3 (July 2017), pp. 809–
831. issn: 1619-1366, 1619-1374. doi: 10.1007/s10270-015-0486-9. url: https:
//link.springer.com/article/10.1007/s10270-015-0486-9 (visited on 10/30/2017).
[19] Godfred O. Boateng et al. “Best Practices for Developing and Validating Scales for
Health, Social, and Behavioral Research: A Primer”. In: Frontiers in Public Health 6
(June 2018). issn: 2296-2565. doi: 10.3389/fpubh.2018.00149. url: https://www.
ncbi.nlm.nih.gov/pmc/articles/PMC6004510/ (visited on 01/21/2020).
[20] B. Boehm and V. R. Basili. “Top 10 list [software development]”. In: Computer 34.1
(Jan. 2001), pp. 135–137. issn: 0018-9162. doi: 10.1109/2.962984.
[21] Barry Boehm and Nupul Kukreja. “An Initial Ontology for System Qualities”. en. In:
INSIGHT 20.3 (2017), pp. 18–28. issn: 2156-4868. doi: 10.1002/inst.12160. url:
https://onlinelibrary.wiley.com/doi/abs/10.1002/inst.12160 (visited on 01/13/2020).
[22] Barry Boehm et al. “The Key Roles of Maintainability in an Ontology for System
Qualities”. en. In: INCOSE International Symposium 26.1 (2016), pp. 2026–2040.
issn: 2334-5837. doi: 10.1002/j.2334-5837.2016.00278.x. url: https://onlinelibrary.
wiley.com/doi/abs/10.1002/j.2334-5837.2016.00278.x (visited on 01/13/2020).
[23] Barry W. Boehm. Software Engineering Economics. English. 1 edition. Englewood
Cliffs, N.J: Prentice Hall, Nov. 1981. isbn: 978-0-13-822122-5.
[24] Barry W. Boehm et al. Software Cost Estimation with COCOMO II. 1st. Upper
Saddle River, NJ, USA: Prentice Hall Press, 2000. isbn: 978-0-13-702576-3.
[25] Rainer Böhme. “Security Metrics and Security Investment Models”. en. In: Advances
in Information and Computer Security. Ed. by Isao Echizen, Noboru Kunihiro,
and Ryoichi Sasaki. Vol. 6434. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010,
pp. 10–24. isbn: 978-3-642-16824-6 978-3-642-16825-3. doi: 10.1007/978-3-642-
16825-3_2. url: http://link.springer.com/10.1007/978-3-642-16825-3_2 (visited on
10/26/2018).
[26] Amiangshu Bosu et al. “Identifying the Characteristics of Vulnerable Code Changes:
AnEmpiricalStudy”. In: Proceedings of the 22Nd ACM SIGSOFT International Sym-
posium on Foundations of Software Engineering. FSE 2014. New York, NY, USA:
ACM, 2014, pp. 257–268. isbn: 978-1-4503-3056-5. doi: 10.1145/2635868.2635880.
url: http://doi.acm.org/10.1145/2635868.2635880 (visited on 07/13/2018).
[27] S. A. Butler. “Security attribute evaluation method: a cost-benefit approach”. In: Pro-
ceedings of the 24th International Conference on Software Engineering. ICSE 2002.
May 2002, pp. 232–240. doi: 10.1109/ICSE.2002.1007971.
155
[28] D. Byers and N. Shahmehri. “Prioritisation and selection of software security activi-
ties”. In: 2009, pp. 201–207. doi: 10.1109/ARES.2009.52.
[29] B. Carlsson and D. Baca. “Software security analysis - execution phase audit”. In:
31st EUROMICRO Conference on Software Engineering and Advanced Applications.
Aug. 2005, pp. 240–247. doi: 10.1109/EUROMICRO.2005.53.
[30] S. Chandra, R. A. Khan, and A. Agrawal. “Security Estimation Framework: Design
Phase Perspective”. In: 2009 Sixth International Conference on Information Technol-
ogy: New Generations. Apr. 2009, pp. 254–259. doi: 10.1109/ITNG.2009.157.
[31] Golriz Chehrazi, Irina Heimbach, and Oliver Hinz. “The Impact of Security by Design
on the Success of Open Source Software”. en. In: ECIS 2016 Proceedings. 2016, p. 18.
url: http://aisel.aisnet.org/ecis2016_rp/179.
[32] Raoul Chiesa and Marco De Luca Saggese. “Data Breaches, Data Leaks, Web De-
facements: Why Secure Coding Is Important”. en. In: Proceedings of 4th International
Conference in Software Engineering for Defence Applications. Ed. by Paolo Cian-
carini et al. Vol. 422. Series Title: Advances in Intelligent Systems and Computing.
Cham: Springer International Publishing, 2016, pp. 261–271. isbn: 978-3-319-27894-0
978-3-319-27896-4. doi: 10.1007/978-3-319-27896-4_22. url: http://link.springer.
com/10.1007/978-3-319-27896-4_22 (visited on 06/18/2020).
[33] S. Chulani. “Bayesian analysis of software cost and quality models”. In: Proceedings
IEEE International Conference on Software Maintenance. ICSM 2001. 2001, pp. 565–
568. doi: 10.1109/ICSM.2001.972773.
[34] Ed Colbert and Dr Barry Boehm. “Cost Estimation for Secure Software & Systems”.
en.In:ISPA/SCEA 2008 Joint International Conference.TheNetherlands,2008,p.9.
[35] Common Criteria. Common Criteria for Information Technology Security Evaluation
v3.1 - Part 3: Security assurance components. 2017.
[36] Common Criteria : New CC Portal. May 2019. url: https : / / www .
commoncriteriaportal.org/index.cfm? (visited on 05/22/2019).
[37] Daniela S. Cruzes and T. Dybå. “Recommended Steps for Thematic Synthesis in
Software Engineering”. In: 2011 International Symposium on Empirical Software En-
gineering and Measurement. Sept. 2011, pp. 275–284. doi: 10.1109/ESEM.2011.36.
[38] Salma Dammak, Faiza Ghozzi Jedidi, and Faiez Gargouri. “Quantifying Security in
Web ETL Processes”. en. In: SpringerLink. Springer, Cham, July 2015, pp. 160–173.
doi: 10.1007/978-3-319-31811-0_10. url: https://link.springer.com/chapter/10.
1007/978-3-319-31811-0_10 (visited on 11/30/2017).
[39] Stanislav Dashevskyi, Achim D. Brucker, and Fabio Massacci. “On the Security Cost
of Using a Free and Open Source Component in a Proprietary Product”. en. In: Engi-
156
neering Secure Software and Systems. Lecture Notes in Computer Science. Springer,
Cham, Apr. 2016, pp. 190–206. isbn: 978-3-319-30805-0 978-3-319-30806-7. doi: 10.
1007/978-3-319-30806-7_12. url: https://link.springer.com/chapter/10.1007/978-
3-319-30806-7_12 (visited on 10/23/2017).
[40] Noopur Davis. Secure Software Development Life Cycle Processes: A Technology
Scouting Report. en. Tech. rep. CMU/SEI-2005-TN-024. Carnegie Mellon University,
2005, p. 39. url: https://resources.sei.cmu.edu/asset_files/TechnicalNote/2005_
004_001_14516.pdf (visited on 03/05/2019).
[41] G. Deepa and P. Santhi Thilagam. “Securing web applications from injection and
logic vulnerabilities: Approaches and challenges”. en. In: Information and Software
Technology 74 (June 2016), pp. 160–180. issn: 0950-5849. doi: 10.1016/j.infsof.2016.
02.005. url: http://www.sciencedirect.com/science/article/pii/S0950584916300234
(visited on 06/18/2020).
[42] Robert F. DeVellis. Scale Development: Theory and Applications. English. Third edi-
tion. Thousand Oaks, Calif: SAGE Publications, Inc, June 2011. isbn: 978-1-4129-
8044-9.
[43] Holly Donohoe, Michael Stellefson, and Bethany Tennant. “Advan-
tages and Limitations of the e-Delphi Technique”. In: American Jour-
nal of Health Education 43.1 (Jan. 2012). Publisher: Routledge _eprint:
https://doi.org/10.1080/19325037.2012.10599216, pp. 38–46. issn: 1932-5037.
doi: 10.1080/19325037.2012.10599216. url: https://doi.org/10.1080/19325037.2012.
10599216 (visited on 08/25/2020).
[44] Bob Duncan and Mark Whittington. “Compliance with Standards, Assurance and
Audit:DoesThisEqualSecurity?” In: Proceedings of the 7th International Conference
on Security of Information and Networks. SIN ’14. New York, NY, USA: ACM, 2014,
77:77–77:84. isbn: 978-1-4503-3033-6. doi: 10.1145/2659651.2659711. url: http:
//doi.acm.org/10.1145/2659651.2659711 (visited on 08/28/2018).
[45] David Geer. “Are Companies Actually Using Secure Development Life Cycles?” In:
Computer 43.6(June2010).ConferenceName:Computer,pp.12–16.issn:1558-0814.
doi: 10.1109/MC.2010.159.
[46] G. Georg et al. “Verification and Trade-Off Analysis of Security Properties in UML
System Models”. In: IEEE Transactions on Software Engineering 36.3 (May 2010),
pp. 338–356. issn: 0098-5589. doi: 10.1109/TSE.2010.36.
[47] Georg Schmitt. “High-Level Cybersecurity Meeting Warns of Dire Effects of Cyber-
attacks on Prosperity, Innovation and Global Collaboration”. In: World Economic
Forum (Nov. 2018). url: https://www.weforum.org/press/2018/11/high-
level-cybersecurity-meeting-warns-of-dire-effects-of-cyberattacks-on-prosperity-
innovation-and-global-collaboration/ (visited on 04/25/2019).
157
[48] Matteo Giacalone et al. “Security triage: an industrial case study on the effectiveness
of a lean methodology to identify security requirements”. en. In: Proceedings of the
8th ACM/IEEE International Symposium on Empirical Software Engineering and
Measurement - ESEM ’14. Torino, Italy: ACM Press, 2014, pp. 1–8. isbn: 978-1-
4503-2774-9. doi: 10.1145/2652524.2652585. url: http://dl.acm.org/citation.cfm?
doid=2652524.2652585 (visited on 11/02/2018).
[49] Charles Haley et al. “Security Requirements Engineering: A Framework for Repre-
sentation and Analysis”. In: IEEE Transactions on Software Engineering 34.1 (Jan.
2008), pp. 133–153. issn: 0098-5589, 1939-3520, 2326-3881. doi: 10.1109/TSE.2007.
70754.
[50] SpyrosT.Halkidis,AlexanderChatzigeorgiou,andGeorgeStephanides.“Movingfrom
Requirements to Design Confronting Security Issues: A Case Study”. en. In: On the
Move to Meaningful Internet Systems: OTM 2009. Ed. by Robert Meersman, Tharam
Dillon, and Pilar Herrero. Lecture Notes in Computer Science. Springer Berlin Hei-
delberg, 2009, pp. 798–814. isbn: 978-3-642-05151-7.
[51] Saman Hedayatpour, Nazri Kama, and Suriayati Chuprat. “Analyzing Security As-
pects during Software Design Phase using Attack-based Analysis Model”. en. In: In-
ternational Journal of Software Engineering and Its Applications (2014), p. 14.
[52] Daniel Hein and Hossein Saiedian. “Secure Software Engineering: Learning from
the Past to Address Future Challenges”. In: Information Security Journal: A
Global Perspective 18.1 (Feb. 2009), pp. 8–25. issn: 1939-3555. doi: 10.1080/
19393550802623206. url: https://doi.org/10.1080/19393550802623206 (visited on
11/02/2018).
[53] C. Heitzenrater and A. Simpson. “Misuse, Abuse and Reuse: Economic Utility Func-
tions for Characterising Security Requirements”. In: 2016 11th International Confer-
ence on Availability, Reliability and Security (ARES). Aug. 2016, pp. 572–581. doi:
10.1109/ARES.2016.90.
[54] C. Heitzenrater and A. Simpson. “Software Security Investment: The Right Amount
of a Good Thing”. In: 2016 IEEE Cybersecurity Development (SecDev). Nov. 2016,
pp. 53–59. doi: 10.1109/SecDev.2016.020.
[55] Chad Heitzenrater, Rainer Bohme, and Andrew Simpson. “The Days Before Zero
Day: Investment Models for Secure Software Engineering”. en. In: 2016, p. 14.
[56] Chad Heitzenrater and Andrew Simpson. “A Case for the Economics of Secure Soft-
ware Development”. In: Proceedings of the 2016 New Security Paradigms Workshop.
NSPW ’16. New York, NY, USA: ACM, 2016, pp. 92–105. isbn: 978-1-4503-4813-3.
doi: 10.1145/3011883.3011884. url: http://doi.acm.org/10.1145/3011883.3011884
(visited on 10/19/2018).
158
[57] ChadDHeitzenrater.“SoftwareSecurityInvestmentModellingforDecision-Support”.
en. PhD thesis. Oxford: University of Oxford, 2017. url: https://ora.ox.ac.uk/
catalog/uuid:64ddd45e-87ab-4c92-a085-df2d0d4e22e0/download_file?file_format=
pdf&safe_filename=2018.07.12-Dissertation-Heitzenrater-CORRECTIONS.pdf&
type_of_work=Thesis (visited on 12/05/2018).
[58] David C. Hoaglin and Boris Iglewicz. “Fine-Tuning Some Resistant Rules for Outlier
Labeling”. In: Journal of the American Statistical Association 82.400 (1987). Pub-
lisher:[AmericanStatisticalAssociation,Taylor&Francis,Ltd.],pp.1147–1149.issn:
0162-1459.doi: 10.2307/2289392.url: http://www.jstor.org/stable/2289392 (visited
on 04/12/2021).
[59] S. H. Houmb et al. “Cost-benefit trade-off analysis using BBN for aspect-oriented
risk-driven development”. In: 10th IEEE International Conference on Engineering of
Complex Computer Systems (ICECCS’05). June 2005, pp. 195–204. doi: 10.1109/
ICECCS.2005.30.
[60] Ali Idri, Mohamed Hosni, and Alain Abran. “Systematic literature review of en-
semble effort estimation”. In: Journal of Systems and Software 118.Supplement C
(Aug. 2016), pp. 151–175. issn: 0164-1212. doi: 10.1016/j.jss.2016.05.016. url:
http://www.sciencedirect.com/science/article/pii/S0164121216300450 (visited on
11/08/2017).
[61] Yurina Ito et al. “Systematic Mapping of Security Patterns Research”. In: Proceedings
of the 22Nd Conference on Pattern Languages of Programs. PLoP ’15. event-place:
Pittsburgh, Pennsylvania. USA: The Hillside Group, 2015, 14:1–14:10. isbn: 978-1-
941652-03-9. url: http://dl.acm.org/citation.cfm?id=3124497.3124514 (visited on
03/08/2019).
[62] Magne Jorgensen and Martin Shepperd. “A Systematic Review of Software Develop-
ment Cost Estimation Studies”. In: IEEE Transactions on Software Engineering 33.1
(Jan. 2007). Conference Name: IEEE Transactions on Software Engineering, pp. 33–
53. issn: 1939-3520. doi: 10.1109/TSE.2007.256943.
[63] G. Jourdan. “Securing Large Applications Against Command Injections”. In: 2007
41st Annual IEEE International Carnahan Conference on Security Technology. Oct.
2007, pp. 69–78. doi: 10.1109/CCST.2007.4373470.
[64] Samuel Paul Kaluvuri, Michele Bezzi, and Yves Roudier. “A Quantitative Analysis
of Common Criteria Certification Practice”. en. In: Trust, Privacy, and Security in
Digital Business. Ed. by Claudia Eckert, Sokratis K. Katsikas, and Günther Pernul.
Vol. 8647. Cham: Springer International Publishing, 2014, pp. 132–143. isbn: 978-
3-319-09769-5 978-3-319-09770-1. doi: 10.1007/978-3-319-09770-1_12. url: http:
//link.springer.com/10.1007/978-3-319-09770-1_12 (visited on 10/26/2018).
159
[65] N. F. Khan and N. Ikram. “Security Requirements Engineering: A Systematic Map-
ping (2010-2015)”. In: 2016 International Conference on Software Security and As-
surance (ICSSA). Aug. 2016, pp. 31–36. doi: 10.1109/ICSSA.2016.13.
[66] Dmitry Khodyakov et al. “Practical Considerations in Using Online Modified-Delphi
Approaches to Engage Patients and Other Stakeholders in Clinical Practice Guideline
Development”. en. In: The Patient - Patient-Centered Outcomes Research 13.1 (Feb.
2020), pp. 11–21. issn: 1178-1661. doi: 10.1007/s40271-019-00389-4. url: https:
//doi.org/10.1007/s40271-019-00389-4 (visited on 08/19/2020).
[67] B.KitchenhamandP.Brereton.“Asystematicreviewofsystematicreviewprocessre-
searchinsoftwareengineering”.In:Information and Software Technology 55.12(2013),
pp. 2049–2075. doi: 10.1016/j.infsof.2013.07.010.
[68] B. A. Kitchenham, E. Mendes, and G. H. Travassos. “Cross versus Within-Company
Cost Estimation Studies: A Systematic Review”. In: IEEE Transactions on Software
Engineering 33.5 (May 2007), pp. 316–329. issn: 0098-5589. doi: 10.1109/TSE.2007.
1001.
[69] Barbara Ann Kitchenham, David Budgen, and Pearl Brereton. Evidence-Based Soft-
ware Engineering and Systematic Reviews. English. 1 edition. Boca Raton: Chapman
and Hall/CRC, Nov. 2015. isbn: 978-1-4822-2865-6.
[70] A. Kott, A. Swami, and P. McDaniel. “Security Outlook: Six Cyber Game Changers
for the Next 15 Years”. In: Computer 47.12 (Dec. 2014), pp. 104–106.issn: 0018-9162.
doi: 10.1109/MC.2014.366.
[71] LeanidKrautsevich,FabioMartinelli,andArtsiomYautsiukhin.“FormalApproachto
Security Metrics.: What Does "More Secure" Mean for You?” In: Proceedings of the
Fourth European Conference on Software Architecture: Companion Volume. ECSA
’10. New York, NY, USA: ACM, 2010, pp. 162–169. isbn: 978-1-4503-0179-4. doi:
10.1145/1842752.1842787. url: http://doi.acm.org/10.1145/1842752.1842787
(visited on 11/02/2018).
[72] Rick Kuhn, Mohammad Raunak, and Raghu Kacker. “It Doesn’t Have to Be Like
This: Cybersecurity Vulnerability Trends”. In: IT Professional 19.6 (Nov. 2017). Con-
ference Name: IT Professional, pp. 66–70. issn: 1941-045X.doi: 10.1109/MITP.2017.
4241462.
[73] Min-gyu Lee et al. “Secure Software Development Lifecycle which supplements secu-
rity weakness for CC certification”. English. In: International Information Institute
(Tokyo). Information; Koganei 19.1 (Jan. 2016), pp. 297–302. issn: 13434500. url:
http://search.proquest.com/docview/1776684205/abstract/3E850391C94D4932PQ/
1 (visited on 08/28/2018).
[74] Taeho Lee, Taewan Gu, and Jongmoon Baik. “MND-SCEMP: an empirical study of
a software cost estimation modeling process in the defense domain”. en. In: Empirical
160
Software Engineering 19.1 (Feb. 2014), pp. 213–240.issn: 1382-3256, 1573-7616.doi:
10.1007/s10664-012-9220-1. url: https://link.springer.com/article/10.1007/s10664-
012-9220-1 (visited on 02/09/2018).
[75] M. Howard and S. Lipner. The Security Development Lifecycle. Redmond, WA, USA:
Microsoft Press, 2006.
[76] G. McGraw. “Software security”. In: IEEE Security Privacy 2.2 (Mar. 2004), pp. 80–
83. issn: 1540-7993. doi: 10.1109/MSECP.2004.1281254.
[77] Gary McGraw. “Cyber War is Inevitable (Unless We Build Security In)”. In: Journal
of Strategic Studies 36.1 (Feb. 2013), pp. 109–119. issn: 0140-2390. doi: 10.1080/
01402390.2012.742013. url: https://doi.org/10.1080/01402390.2012.742013 (visited
on 03/12/2018).
[78] Gary McGraw. Software Security: Building Security In. English. 1 edition. Upper
Saddle River, NJ: Addison-Wesley Professional, Feb. 2006. isbn: 978-0-321-35670-3.
[79] Rafael Maiani de Mello. “Conceptual Framework for Supporting the Identification of
Representative Samples for Surveys in Software Engineering”. en. PhD thesis. Brazil:
Universidade Federal do Rio de Janeiro (UFRJ), 2016.
[80] Rafael Maiani de Mello, Pedro Correa da Silva, and Guilherme Horta Travassos.
“Sampling Improvement in Software Engineering Surveys”. In: Proceedings of the 8th
ACM/IEEE International Symposium on Empirical Software Engineering and Mea-
surement. ESEM ’14. New York, NY, USA: ACM, 2014, 13:1–13:4. isbn: 978-1-4503-
2774-9. doi: 10.1145/2652524.2652566. url: http://doi.acm.org/10.1145/2652524.
2652566 (visited on 11/16/2018).
[81] Rafael Maiani de Mello, Pedro Corrêa da Silva, and Guilherme Horta Travassos.
“Investigating probabilistic sampling approaches for large-scale surveys in software
engineering”. en. In: Journal of Software Engineering Research and Development 3.1
(June 2015), p. 8. issn: 2195-1721. doi: 10.1186/s40411-015-0023-0. url: https:
//doi.org/10.1186/s40411-015-0023-0 (visited on 11/15/2018).
[82] Tim Menzies et al. “Negative results for software effort estimation”. en. In: Empirical
Software Engineering 22.5 (Oct. 2017), pp. 2658–2683. issn: 1382-3256, 1573-7616.
doi: 10.1007/s10664-016-9472-2. url: https://link.springer.com/article/10.1007/
s10664-016-9472-2 (visited on 09/20/2017).
[83] Microsoft Security Development Lifecycle. en-us. url: https://www.microsoft.com/
en-us/securityengineering/sdl (visited on 10/04/2019).
[84] Sammy Migues, John Steven, and Mike Ware. Building Security in Maturity Model
(BSIMM) - Version 10. en. Tech. rep. 10. Synopsys Software Integrity Group, 2019,
p. 92. url: https://www.bsimm.com/download.html (visited on 10/17/2019).
161
[85] Nabil M. Mohammed et al. “Exploring software security approaches in software devel-
opment lifecycle: A systematic mapping study”. In: Computer Standards & Interfaces
50.Supplement C (Feb. 2017), pp. 107–115. issn: 0920-5489. doi: 10.1016/j.csi.2016.
10.001. url: http://www.sciencedirect.com/science/article/pii/S0920548916301155
(visited on 12/01/2017).
[86] Jefferson Seide Molléri, Kai Petersen, and Emilia Mendes. “An Empirically Evaluated
Checklist for Surveys in Software Engineering”. In: arXiv:1901.09850 [cs] (Jan. 2019).
arXiv: 1901.09850. url: http://arxiv.org/abs/1901.09850 (visited on 02/04/2019).
[87] PatrickMorrison,BenjaminH.Smith,andLaurieWilliams.“SurveyingSecurityPrac-
tice Adherence in Software Development”. In: Proceedings of the Hot Topics in Sci-
ence of Security: Symposium and Bootcamp. HoTSoS. New York, NY, USA: ACM,
2017, pp. 85–94. isbn: 978-1-4503-5274-1. doi: 10.1145/3055305.3055312. url: http:
//doi.acm.org/10.1145/3055305.3055312 (visited on 11/03/2017).
[88] KatieLMortonetal.“Engagingstakeholdersandtargetgroupsinprioritisingapublic
health intervention: the Creating Active School Environments (CASE) online Delphi
study”. en. In: BMJ Open 7.1 (Jan. 2017), e013340. issn: 2044-6055, 2044-6055. doi:
10.1136/bmjopen-2016-013340. url: http://bmjopen.bmj.com/lookup/doi/10.1136/
bmjopen-2016-013340 (visited on 08/19/2020).
[89] StephanNeuhausetal.“PredictingVulnerableSoftwareComponents”.In: Proceedings
of the 14th ACM Conference on Computer and Communications Security. CCS ’07.
New York, NY, USA: ACM, 2007, pp. 529–540. isbn: 978-1-59593-703-2. doi: 10.
1145/1315245.1315311. url: http://doi.acm.org/10.1145/1315245.1315311 (visited
on 07/09/2018).
[90] Phu H. Nguyen et al. “An extensive systematic review on the Model-Driven Devel-
opment of secure systems”. en. In: Information and Software Technology 68 (Dec.
2015), pp. 62–81. issn: 09505849. doi: 10.1016/j.infsof.2015.08.006. url: http:
//linkinghub.elsevier.com/retrieve/pii/S0950584915001482 (visited on 03/22/2018).
[91] J.C.S. Nunez, A.C. Lindo, and P.G. Rodriguez. “A preventive secure software de-
velopment model for a software factory: A case study”. In: IEEE Access 8 (2020),
pp. 77653–77665. doi: 10.1109/ACCESS.2020.2989113.
[92] MohammedM.OlamaandJamesNutaro.“Secureitnoworsecureitlater:thebenefits
of addressing cyber-security from the outset”. In: Cyber Sensing 2013. Vol. 8757.
International Society for Optics and Photonics, May 2013, p. 87570L. doi: 10.1117/
12.2015465. url: https://www-spiedigitallibrary-org.libproxy1.usc.edu/conference-
proceedings-of-spie/8757/87570L/Secure-it-now-or-secure-it-later--the-benefits/10.
1117/12.2015465.short (visited on 11/02/2018).
[93] Lotfi Ben Othmane et al. “Time for Addressing Software Security Issues: Prediction
ModelsandImpactingFactors”.en.In: Data Science and Engineering 2.2(June2017),
162
pp.107–124.issn:2364-1185,2364-1541.doi:10.1007/s41019-016-0019-8.url:https:
//link.springer.com/article/10.1007/s41019-016-0019-8 (visited on 06/06/2018).
[94] Lotfi ben Othmane et al. “Factors Impacting the Effort Required to Fix Security
Vulnerabilities”. en. In: Information Security. Springer, Cham, Sept. 2015, pp. 102–
119. doi: 10.1007/978-3-319-23318-5_6. url: https://link.springer.com/chapter/10.
1007/978-3-319-23318-5_6 (visited on 07/17/2018).
[95] Lotfi ben Othmane et al. “Incorporating attacker capabilities in risk estimation and
mitigation”. In: Computers & Security 51 (2015), pp. 41–61. issn: 0167-4048. doi:
https://doi.org/10.1016/j.cose.2015.03.001. url: https://www.sciencedirect.com/
science/article/pii/S0167404815000334.
[96] OWASP. CLASP Concepts - OWASP. url: https://www.owasp.org/index.php/
CLASP_Concepts (visited on 04/23/2019).
[97] OWASP SAMM Project. Software Assurance Maturity Model (SAMM): A guide to
building security into software development - v1.5. en. Tech. rep. Version 1.5. 2017,
p. 72. url: https://owaspsamm.org/ (visited on 04/23/2019).
[98] Keun-Young Park, Sang-Guun Yoo, and Juho Kim. “Security Requirements Prioriti-
zation Based on Threat Modeling and Valuation Graph”. en. In: Convergence and Hy-
brid Information Technology. Vol. 206. Berlin, Heidelberg: Springer Berlin Heidelberg,
2011, pp. 142–152. isbn: 978-3-642-24105-5 978-3-642-24106-2. doi: 10.1007/978-3-
642-24106-2_19. url: http://link.springer.com/10.1007/978-3-642-24106-2_19
(visited on 10/25/2018).
[99] David A. Patterson. “20th century vs. 21st century C&C: the SPUR manifesto”. In:
Communications of the ACM 48.3 (Mar. 2005), pp. 15–16. issn: 0001-0782. doi:
10.1145/1047671.1047688. url: http://doi.org/10.1145/1047671.1047688 (visited on
06/18/2020).
[100] J. Peeters and P. Dyson. “Cost-Effective Security”. In: IEEE Security Privacy 5.3
(May 2007), pp. 85–87. issn: 1540-7993. doi: 10.1109/MSP.2007.56.
[101] S. L. Pfleeger and R. Rue. “Cybersecurity Economic Issues: Clearing the Path to
Good Practice”. In: IEEE Software 25.1 (Jan. 2008), pp. 35–42. issn: 0740-7459.doi:
10.1109/MS.2008.4.
[102] Awais Rashid et al. The Cyber Security Body of Knowledge. en. 2019. url: https:
//www.cybok.org/.
[103] Sanjay Rawat and Ashutosh Saxena. “Application Security Code Analysis: a Step
Towards Software Assurance”. In: Int. J. Inf. Comput. Secur. 3.1 (June 2009), pp. 86–
110. issn: 1744-1765. doi: 10.1504/IJICS.2009.026622. url: http://dx.doi.org/10.
1504/IJICS.2009.026622 (visited on 03/06/2019).
163
[104] M. Razzazi et al. “Common Criteria Security Evaluation: A Time and Cost Effective
Approach”. In: 2006 2nd International Conference on Information Communication
Technologies. Vol. 2. Apr. 2006, pp. 3287–3292. doi: 10.1109/ICTTA.2006.1684943.
[105] DonaldJ.Reifer,BarryW.Boehm,andMuraliGangadharan.“EstimatingtheCostof
Security for COTS Software”. en. In: COTS-Based Software Systems. Lecture Notes
in Computer Science. Springer, Berlin, Heidelberg, Feb. 2003, pp. 178–186. isbn:
978-3-540-00562-9 978-3-540-36465-8. doi: 10.1007/3-540-36465-X_17. url: https:
//link.springer.com/chapter/10.1007/3-540-36465-X_17 (visited on 11/03/2017).
[106] K. Rindell, S. Hyrynsalmi, and V. Leppänen. “Case Study of Security Development in
an Agile Environment: Building Identity Management for a Government Agency”. In:
2016 11th International Conference on Availability, Reliability and Security (ARES).
Aug. 2016, pp. 556–563. doi: 10.1109/ARES.2016.45.
[107] Kalle Rindell, Sami Hyrynsalmi, and Ville Leppänen. “A Comparison of Security
Assurance Support of Agile Software Development Methods”. In: Proceedings of the
16th International Conference on Computer Systems and Technologies. CompSysTech
’15. New York, NY, USA: ACM, 2015, pp. 61–68. isbn: 978-1-4503-3357-3. doi: 10.
1145/2812428.2812431. url: http://doi.acm.org/10.1145/2812428.2812431 (visited
on 10/05/2018).
[108] Pilar Rodríguez et al. “Continuous deployment of software intensive products and
services: A systematic mapping study”. In: Journal of Systems and Software 123
(Jan. 2017), pp. 263–291. issn: 0164-1212. doi: 10.1016/j.jss.2015.12.015. url:
http://www.sciencedirect.com/science/article/pii/S0164121215002812 (visited on
11/09/2018).
[109] Wilson Rosa et al. “Early Phase Cost Models for Agile Software Processes in the US
DoD”. In: 2017 ACM/IEEE International Symposium on Empirical Software Engi-
neering and Measurement (ESEM). ISSN: null. Nov. 2017, pp. 30–37. doi: 10.1109/
ESEM.2017.10.
[110] Marko Saarela et al. “Measuring Software Security from the Design of Software”. In:
Proceedings of the 18th International Conference on Computer Systems and Tech-
nologies. CompSysTech’17. New York, NY, USA: ACM, 2017, pp. 179–186. isbn:
978-1-4503-5234-5. doi: 10.1145/3134302.3134334. url: http://doi.acm.org/10.
1145/3134302.3134334 (visited on 07/12/2018).
[111] SAFECode. Fundamental Practices for Secure Software Development: Essential Ele-
ments of a Secure Development Lifecycle Program. Mar. 2018. url: https://safecode.
org/wp-content/uploads/2018/03/SAFECode_Fundamental_Practices_for_
Secure_Software_Development_March_2018.pdf.
[112] Daniel Schatz, Rabih Bashroush, and Julie Wall. “Towards a More Representative
Definition of Cyber Security”. In: Journal of Digital Forensics, Security and Law 12.2
164
(June 2017). issn: (Print) 1558-7215. doi: https://doi.org/10.15394/jdfsl.2017.1476.
url: https://commons.erau.edu/jdfsl/vol12/iss2/8.
[113] Severine Sentilles et al. “Software Qualities and their Dependencies Report on two
editions of the workshop”. In: ACM SIGSOFT Software Engineering Notes 45.1 (Jan.
2020), pp. 31–33. issn: 0163-5948. doi: 10.1145/3375572.3375581. url: http://doi.
org/10.1145/3375572.3375581 (visited on 05/25/2020).
[114] Y. Shin et al. “Evaluating Complexity, Code Churn, and Developer Activity Metrics
as Indicators of Software Vulnerabilities”. In: IEEE Transactions on Software Engi-
neering 37.6 (Nov. 2011), pp. 772–787. issn: 0098-5589. doi: 10.1109/TSE.2010.81.
[115] Yonghee Shin and Laurie Williams. “An Initial Study on the Use of Execution Com-
plexity Metrics As Indicators of Software Vulnerabilities”. In: Proceedings of the 7th
International Workshop on Software Engineering for Secure Systems. SESS ’11. New
York, NY, USA: ACM, 2011, pp. 1–7.isbn: 978-1-4503-0581-5.doi: 10.1145/1988630.
1988632.url: http://doi.acm.org/10.1145/1988630.1988632 (visited on 11/02/2018).
[116] Yonghee Shin and Laurie Williams. “Can traditional fault prediction models be used
for vulnerability prediction?” en. In: Empirical Software Engineering 18.1 (Feb. 2013),
pp. 25–59. issn: 1382-3256, 1573-7616. doi: 10.1007/s10664-011-9190-8. url: https:
//link.springer.com/article/10.1007/s10664-011-9190-8 (visited on 06/06/2018).
[117] F. Shull et al. “What we have learned about fighting defects”. In: Proceedings Eighth
IEEE Symposium on Software Metrics. June 2002, pp. 249–258. doi: 10.1109/
METRIC.2002.1011343.
[118] Carlo Marcelo Revoredo da Silva et al. “Systematic Mapping Study On Secu-
rity Threats in Cloud Computing”. In: arXiv:1303.6782 [cs] (Mar. 2013). arXiv:
1303.6782. url: http://arxiv.org/abs/1303.6782 (visited on 03/08/2019).
[119] Software Security Engineering: A Guide for Project Managers (white paper). url:
https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=297359 (visited on
11/14/2017).
[120] John Tierney and Tony Boswell. “Common Criteria: Origins and Overview”. en. In:
Smart Cards, Tokens, Security and Applications. Ed. by Keith Mayes and Kon-
stantinos Markantonakis. Cham: Springer International Publishing, 2017, pp. 193–
216. isbn: 978-3-319-50500-8. doi: 10.1007/978-3-319-50500-8_8. url: https:
//doi.org/10.1007/978-3-319-50500-8_8 (visited on 10/27/2019).
[121] Tony Rice et al. Fundamental Practices for Secure Software Development. White Pa-
per Third Edition. Software Assurance Forum for Excellence in Code (SAFECode),
Mar. 2018. url: https://safecode.org/wp-content/uploads/2018/03/SAFECode_
Fundamental_Practices_for_Secure_Software_Development_March_2018.pdf
(visited on 11/20/2018).
165
[122] M. Torchiano et al. “Lessons Learnt in Conducting Survey Research”. In: 2017
IEEE/ACM 5th International Workshop on Conducting Empirical Studies in Industry
(CESI). May 2017, pp. 33–39. doi: 10.1109/CESI.2017.5.
[123] John Wilder Tukey. Exploratory data analysis. eng. Addison-Wesley series in behav-
ioral science. Reading, Mass: Addison-Wesley PubCo, 1977. isbn: 978-0-201-07616-5.
[124] Sven Türpe. “The Trouble with Security Requirements”. In: 2017 IEEE 25th Inter-
national Requirements Engineering Conference (RE). ISSN: 2332-6441. Sept. 2017,
pp. 122–133. doi: 10.1109/RE.2017.13.
[125] Elaine Venson. The Impact of Software Security Practices on Development Effort. en.
Washington, DC, Nov. 2019. url: https://sercuarc.org/wp-content/uploads/2019/
12/7.-Venson-SDSF-2019.pdf.
[126] Elaine Venson et al. “Costing Secure Software Development: A Systematic Map-
ping Study”. In: Proceedings of the 14th International Conference on Availability,
Reliability and Security. ARES ’19. event-place: Canterbury, CA, United Kingdom.
New York, NY, USA: ACM, 2019, 9:1–9:11. isbn: 978-1-4503-7164-3. doi: 10.1145/
3339252.3339263. url: http://doi.acm.org/10.1145/3339252.3339263 (visited on
08/26/2019).
[127] Vilhelm Verendel. “Quantified Security is a Weak Hypothesis: A Critical Survey of
Results and Assumptions”. In: Proceedings of the 2009 Workshop on New Security
Paradigms Workshop. NSPW ’09. event-place: Oxford, United Kingdom. New York,
NY, USA: ACM, 2009, pp. 37–50. isbn: 978-1-60558-845-2. doi: 10.1145/1719030.
1719036.url: http://doi.acm.org/10.1145/1719030.1719036 (visited on 10/11/2019).
[128] Dilani Wickramaarachchi and Richard Lai. “Effort estimation in global software de-
velopment - a systematic review”. In: Computer Science and Information Systems
14.2 (2017), pp. 393–421. url: http://www.doiserbia.nb.rs/Article.aspx?ID=1820-
02141700007W&AspxAutoDetectCookieSupport=1 (visited on 10/26/2017).
[129] L. Williams, G. McGraw, and S. Migues. “Engineering Security Vulnerability Pre-
vention, Detection, and Response”. In: IEEE Software 35.5 (2018), pp. 76–80. doi:
10.1109/MS.2018.290110854.
[130] L. Williams, A. Meneely, and G. Shipley. “Protection Poker: The New Software Secu-
rity "Game";” in: IEEE Security Privacy 8.3 (May 2010), pp. 14–20. issn: 1540-7993.
doi: 10.1109/MSP.2010.58.
[131] Laurie Williams, Michael Gegick, and Andrew Meneely. “Protection Poker: Struc-
turing Software Security Risk Assessment and Knowledge Transfer”. en. In: Engi-
neering Secure Software and Systems. Ed. by Fabio Massacci, Samuel T. Redwine,
and Nicola Zannone. Lecture Notes in Computer Science. Springer Berlin Heidelberg,
2009, pp. 122–134. isbn: 978-3-642-00199-4.
166
[132] Claes Wohlin. “Guidelines for Snowballing in Systematic Literature Studies and a
Replication in Software Engineering”. In: Proceedings of the 18th International Con-
ference on Evaluation and Assessment in Software Engineering. EASE ’14. New York,
NY, USA: ACM, 2014, 38:1–38:10. isbn: 978-1-4503-2476-2. doi: 10.1145/2601248.
2601268.url: http://doi.acm.org/10.1145/2601248.2601268 (visited on 10/30/2017).
[133] L. Yang, X. Li, and Y. Yu. “VulDigger: A Just-in-Time and Cost-Aware Tool for
Digging Vulnerability-Contributing Changes”. In: GLOBECOM 2017 - 2017 IEEE
Global Communications Conference. Dec. 2017, pp. 1–7. doi: 10.1109/GLOCOM.
2017.8254428.
[134] Ye Yang, Jing Du, and Qing Wang. “Shaping the Effort of Developing Secure Soft-
ware”. In: Procedia Computer Science. 2015 Conference on Systems Engineering Re-
search 44.Supplement C (Jan. 2015), pp. 609–618. issn: 1877-0509. doi: 10.1016/
j.procs.2015.03.041. url: http://www.sciencedirect.com/science/article/pii/
S187705091500277X (visited on 11/14/2017).
[135] Janusz Zalewski et al. “Measuring Security: A Challenge for the Generation”. en.
In: Sept. 2014, pp. 131–140. doi: 10.15439/2014F490. url: https://fedcsis.org/
proceedings/2014/drp/490.html (visited on 06/05/2020).
[136] He Zhang, Muhammad Ali Babar, and Paolo Tell. “Identifying relevant studies in
software engineering”. In: Information and Software Technology. Special Section: Best
papers from the APSEC 53.6 (June 2011), pp. 625–637. issn: 0950-5849. doi: 10.
1016/j.infsof.2010.12.010. url: http://www.sciencedirect.com/science/article/pii/
S0950584910002260 (visited on 11/02/2017).
[137] J.Zhengetal.“Ajump-diffusionapproachtomodellingsoftwaresecurityinvestment”.
In: 2012 Fifth International Conference on Business Intelligence and Financial En-
gineering. Aug. 2012, pp. 274–278. doi: 10.1109/BIFE.2012.149.
167
Appendices
Appendix A
Data Collection Instruments
A.1 Instrument for Collecting Expert Opinion
Figure A.1 presents the form used in rounds one and two of the Wideband Delphi exercise,
performed in February 2020.
Figures A.2, A.3 and A.4 present the forms used in rounds one and two of the Online
Delphi exercise, performed with security experts, in September 2020.
SECU Rating Scale - 1st round (Wideband Delphi). * Required
1. What is your experience using a COCOMO model? * (Mark only one oval: Little or none / Some / A moderate amount / An extensive amount / An extensive amount + experience in teaching COCOMO)
2. What is your level of experience with secure software development (in years)? *
3. What is your estimation for the productivity range of the SECU cost driver - Security Requirements and Design? *
4. What is your estimation for the productivity range of the SECU cost driver - Security Coding and Tools? *
5. What is your estimation for the productivity range of the SECU cost driver - Security Verification & Validation? *
Figure A.1 Data Collection Form - Wideband Delphi
1. What is your estimation of the productivity range for the Security Requirements and Design group? *
2. What is your estimation of the productivity range for the Security Coding and Tools group? *
3. What is your estimation of the productivity range for the Security Verification & Validation group? *
4. How confident you are in your estimations? * (Mark only one oval, from 1 = Not confident at all to 5 = Completely confident)
5. Any comments about assumptions you made for the estimations?
6. What is your level of experience with secure software development (in years)? *
Figure A.2 Data Collection Form - Online Delphi - Round 1
1. What is your new estimation of the productivity range for the Security Requirements and Design group? *
2. What is your new estimation of the productivity range for the Security Coding and Tools group? *
3. What is your new estimation of the productivity range for the Security Verification & Validation group? *
Effort to Develop Security Controls: The next 4 questions ask you to estimate the increase in the application size (function points, requirements, lines of code, etc.) to develop the security controls. For example, if you consider that an application at Level 1 will have an increase of 20% over the original application size due to the development of security controls, answer 1.20 for Level 1.
4. Size increase factor for Level 1
5. Size increase factor for Level 2
6. Size increase factor for Level 3
Figure A.3 Data Collection Form - Online Delphi - Round 2 - Page 1
7. Size increase factor for Level 4
8. How confident you are in your estimations? * (Mark only one oval, from 1 = Not confident at all to 5 = Completely confident)
9. Any comments about assumptions you made for the estimations?
10. Enter your email address to receive the summary results of the second round of estimations.
Figure A.4 Data Collection Form - Online Delphi - Round 2 - Page 2
A.2 Instrument for Collecting Project Data
The form to collect data from projects was composed of three main parts: (1) System Context, to collect information about the overall software system; (2) Functional Components, to describe the functional components of the software system; and (3) Functional Component Size, to provide information about the size of the component in Function Points. Tables A.1, A.2, and A.3 list the information requested in each part.
Table A.1 Software System Context

System Details (Data Element: Instructions)
System Name: Enter a unique, sanitized name for the system for which the data is being reported. For instance, 521, AMZ, etc. Only you will know the true identity of the system.
System Description/Architecture: List the major capabilities of this system and describe if this system is part of a Systems-of-Systems. Specify if the system requires either a dedicated computing platform, servers or a host site; uses open architecture/COTS; and list the platform operating system (e.g., Windows, Linux, Wind River, etc.).
Application Super-Domain: Select ONE overall super-domain for this system: Real-Time, Engineering, Automated Information Systems.
# Unique SW Baselines Maintained: Provide a count of the number of unique baselines of the software that are concurrently being developed/maintained. Multiple baselines may exist to support different platform configurations.
Development Process: Select ONE software lifecycle process used to control the software development (pull-down).
System Comments: Provide any additional information that would help explain this system.

System-level Cost Drivers (Data Element: Instructions)
Precedentedness: This driver captures the project's ability to provide new ideas and methods for solving challenges posed by application requirements.
Development Flexibility: This driver captures the flexibility in delivered requirements, development processes and implementation constraints.
Risk/Opportunity Management: This driver captures the project's use of a comprehensive, effective risk/opportunity management process, and the amount of risk on the current project.
Software Architecture Understanding: This driver rates the degree of understanding of determining and managing the system architecture in terms of platforms, standards, new and NDI (COTS/GOTS) components, connectors (protocols), and constraints.
Stakeholder Team Cohesion: This driver accounts for the sources of project turbulence and entropy due to difficulties in synchronizing the project's stakeholders: users, customers, developers, maintainers, supply chain and distribution chain partners, interfacers, others.
Process Capability & Usage: This driver accounts for the consistency and effectiveness of the project team in performing Software Engineering (SWE) processes.
Required Development Schedule: This driver expresses the effect of schedule compression / expansion from nominal or average.
Table A.2 Software Functional Component

Component Details (Data Element: Instructions)
Component Name: Provide the unique, sanitized name for the Component. Only you will know the true identity of the Component.
Component Description: Provide a sanitized functional description of this Component.
Component Magnitude: Specify if this Component is big, small, little, or use another description of magnitude.
Unusual Circumstances: Provide information on any unusual circumstances that impact the size, effort or duration of the Component, e.g. changes in funding, high staff turnover, contractor change, unexpected functional changes.
Development Iteration: If the lifecycle process is iterative, which iteration did the data in this form come from?
Component Comments: Provide any additional information that would better explain component details.

Component Effort (Data Element: Instructions)
Total Effort: Provide the total number of hours expended on this Component.
Phases Included in Effort: Provide the effort hours in each phase that are included in the reported hours:
Inception Phase: The first phase, in which the initial idea, or request for proposal, for the previous generation is brought to the point (at least internally) of being funded to enter the elaboration phase.
Elaboration Phase: The second phase, in which the product vision and its architecture are defined.
Construction Phase: The third phase, in which the software is brought from an executable architectural baseline to the point at which it is ready to be transitioned to the user community.
Transition Phase: The fourth phase, in which the software is turned over to the user community.
Effort Comments: Provide explanations or additional information that would help explain the effort data. Specify whether the effort information refers to the actuals or estimated values.

Component Duration (Data Element: Instructions)
Start Date: Provide the start date for work on the component. (Format: MM/DD/YYYY)
Start Criteria: Describe the criteria used to start work on the component.
End Date: Provide the end date for work on the component. (Format: MM/DD/YYYY)
End Criteria: Describe the criteria used to stop work on the component.
Duration Comments: Provide any additional information that would clarify the above duration data, e.g. alternative criteria for start / end dates. Specify whether the duration information refers to the actuals or estimated values.

Component-level Cost Drivers (Data Element: Instructions)
Impact of Software Failure: This is the measure of the extent to which the software must perform its intended function over a period of time.
Product Complexity: Complexity is divided into five characteristics. The complexity rating is the subjective average of the characteristic ratings. See definition.
Developed for Reusability: This driver accounts for the additional effort needed to construct components intended for reuse on the current or future projects.
Required Software Security: This driver captures the protection required by the software to stop unauthorized access. See definition.
Platform Constraints: This driver captures the limitations placed on the platform's capacity such as execution time, primary/secondary storage, communications bandwidth, battery power, and maybe others. See definition.
Platform Volatility: The targeted platform may still be evolving while the software application is being developed. This driver captures the impact of the instability of the targeted platform resulting in additional/increased work. See definition.
Analyst Capability: The Analyst Team is important for solution space development and supervision of its implementation. Analysts work on requirements, high-level design and detailed design in the applicable domain. See definition.
Programmer Capability: Major attributes to be considered in this rating are the programming team's ability, competence, proficiency, aptitude, thoroughness, and the ability to communicate and cooperate. See definition.
Personnel Continuity: This driver captures the stability of the program team on a multi-year project or an organization's program team in the case of multiple deliveries in a year.
Application Domain Experience: This driver captures the average level of application domain experience of the project team developing the software system or subsystem.
Language and Tool Experience: This is a measure of the level of programming language and software tool experience of the project team developing the software system or subsystem. See definition.
Platform Experience: This driver recognizes the importance of understanding the target platform. See definition.
Use of Software Tools: This driver rates the use of software development tools using three characteristics. See definition.
Multisite Development: The multisite development effects on effort are significant. Determining its cost driver rating involves the assessment of collocation and communication.
Automated Analysis: Automated analysis includes code analyzers, syntax and semantics analyzers, type checkers, requirements and design consistency and traceability checkers, model checkers, formal verification and validation, etc.
Peer Reviews: Peer reviews cover the spectrum of group review activities including reviewer roles, change control processes, checklists and statistical process control.
Execution Testing & Tools: Execution Testing and Tools covers the process and use of tools for different types of testing and includes the use of test checklists, problem tracking, and test data management.
Cost Driver Comments: Provide any additional information that would clarify the ratings of the above cost drivers.
Table A.3 Software Functional Component Size

Component-level Software Size (Data Element: Instructions)
Implementation Programming Language: Provide the programming language(s) used in implementation.

Functional Size Measure (FSM) (Data Element: Instructions)
FSM Description: Identify the FSM being reported, e.g. SW Requirements, Function Points, Cosmic Function Points, Simple Function Points, Use Case Points, Stories, etc.
FSM Count: Provide the total FSM count.
Development Type: Identify whether the component was part of a New Development effort, Enhancement project, Reused components requiring changes, or Other. Description of the development types is given below.
Enhancement Parameters: Provide the following additional information for Enhancement tasks/projects.
FSM Added Count: In terms of the Functional Size used, provide the size of added functionality/features/modules.
FSM Modified Count: In terms of the Functional Size used, provide the size of the modified functionality/features/modules.
FSM Deleted Count: In terms of the Functional Size used, provide the size of deleted functionality/features/modules.
FSM Unmodified Count: In terms of the Functional Size used, provide the size of the unmodified functionality/features/modules.
Software Understanding: Software Understanding is an assessment of the comprehensibility of the software source code. See definition for rating info.
Programmer Unfamiliarity: Programmer Unfamiliarity assesses the programming team's knowledge of the adapted software. See definition for rating info.
Product Complexity: Rate the following Product Complexity characteristics individually (see definition for details): Control Operations; Computational Operations; Device-Dependent Operations; Data Management Operations; User Interface Management Operations.
FSM Count for Security Features: Provide the count of the security features, i.e. the count related to functions like authentication, authorization, security audit, etc.
FSM Comments: Provide any additional information that would help explain the FSM count. Specify whether the FSM counts refer to the actuals or estimated values.
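As a small, hypothetical illustration of how the size fields above can be combined during analysis, the R sketch below computes the share of a component's functional size attributable to security features. The record, the column names, and the numbers are invented for the example and do not come from the collected dataset.

# Hypothetical component record using two of the FSM fields from Table A.3.
component <- data.frame(
  fsm_count          = 250,  # total FSM count (e.g., function points)
  fsm_security_count = 30    # FSM Count for Security Features (authentication, audit, etc.)
)

# Share of the functional size taken up by security features.
component$security_share <- component$fsm_security_count / component$fsm_count
component$security_share  # 0.12, i.e., 12% of the size is security functionality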
Appendix B
Model Initial Values
Table B.1 presents the initial values used in the model to perform the multiple linear regression. Columns VL to UH are the cost drivers' levels: Very Low (VL), Low (L), Nominal (N), High (H), Very High (VH), Extra High (XH), and Ultra High (UH). PR stands for Productivity Range.
Table B.1 Initial Values for the Cost Drivers
Parameter VL L N H VH XH UH PR
SECU 1.00 1.282 1.644 2.109 2.704 2.704
FAIL 0.820 0.920 1.000 1.100 1.260 1.420 1.730
CPLX 0.730 0.870 1.000 1.170 1.340 1.740 2.380
PLAT 1.000 1.080 1.230 1.545 1.550
PVOL 0.870 1.000 1.150 1.300 1.490
APEX 1.220 1.100 1.000 0.880 0.810 1.510
LTEX 1.200 1.090 1.000 0.910 0.840 0.770 1.560
PLEX 1.190 1.090 1.000 0.910 0.850 0.790 1.510
TOOL 1.170 1.090 1.000 0.900 0.780 1.500
SITE 1.220 1.090 1.000 0.930 0.860 0.800 1.530
EXPE 1.203 1.093 1.000 0.900 0.833 0.780 1.526
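To make the role of these initial values concrete, the minimal R sketch below applies them in the COCOMO II-style multiplicative form that underlies this work, effort = A * size^B * product of effort multipliers. The constant A, the exponent B, the size value, and the chosen rating settings are illustrative assumptions only; the multiplier values themselves are taken from Table B.1.

# Minimal sketch of a multiplicative cost-driver model (COCOMO-style form):
#   effort = A * size^B * prod(EM_i)
# A, B, size, and the chosen rating settings are illustrative assumptions;
# only the multiplier values come from Table B.1.
A    <- 2.94   # assumed calibration constant (not calibrated in this study)
B    <- 1.0    # assumed size exponent
size <- 500    # functional size (e.g., function points)

em <- c(secu = 1.282,  # SECU one step above its baseline 1.00 in Table B.1
        fail = 1.100,  # FAIL one step above nominal (1.000)
        cplx = 1.170,  # CPLX one step above nominal (1.000)
        apex = 0.880)  # APEX one step better than nominal (1.000)

effort <- A * size^B * prod(em)
effort  # estimated effort under these illustrative settings

In the regression models of Appendix C this same product appears in log space, with each driver entering as a logarithm and receiving its own fitted coefficient.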
Appendix C
Regression Results for System
Architecture-based Models
The following sections present the regression tables for different system architectures and
combinations.
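As a hedged sketch of how regression tables like the ones in the following sections can be produced and interpreted, the R call below mirrors the log-log model form visible in the outputs. The data frame df and its columns (effort_h, fsm_count, and the numeric driver values secu, fail, cplx, and so on) are assumptions inferred from the variable names in the outputs; the data preparation itself is not reproduced here.

# Sketch of the log-log regression form used in the following sections.
# df is assumed to hold one row per component, with effort in hours (effort_h),
# functional size (fsm_count), and cost drivers already mapped to their
# numeric multiplier values (secu, fail, cplx, ...).
fit <- lm(log(effort_h) ~ log(fsm_count) + log(fail) + log(cplx) +
            log(secu) + log(plat) + log(pvol) + log(expe) + log(site),
          data = df)

summary(fit)                     # coefficient tables like those shown below
exp(coef(fit)[["(Intercept)"]])  # back-transformed multiplicative constant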
C.1 Web-Mainframe
Call:
lm(formula = log(effort_h) ~ log(fsm_count) + log(fail) + log(cplx) +
log(secu) + log(plat) + log(pvol) + log(expe) + log(site),
data = df)
Residuals:
Min 1Q Median 3Q Max
-0.90832 -0.12490 -0.00431 0.13618 0.39929
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.24406 0.05352 41.928 < 2e-16 ***
log(fsm_count) 1.07691 0.01073 100.379 < 2e-16 ***
log(fail) 1.39136 0.17788 7.822 7.46e-14 ***
log(cplx) 0.50822 0.14758 3.444 0.00065 ***
log(secu) 1.07733 0.15524 6.940 2.15e-11 ***
log(plat) 1.02061 0.17802 5.733 2.26e-08 ***
log(pvol) 0.55810 0.19164 2.912 0.00384 **
log(expe) 2.39147 0.28808 8.301 2.85e-15 ***
log(site) 1.21179 0.41284 2.935 0.00357 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1947 on 323 degrees of freedom
Multiple R-squared: 0.9701, Adjusted R-squared: 0.9694
F-statistic: 1311 on 8 and 323 DF, p-value: < 2.2e-16
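To illustrate how a fitted table such as the one above can be read, the short sketch below converts the coefficient on log(secu) into an effective effort multiplier for one hypothetical SECU setting. The SECU value of 1.282 is borrowed from Table B.1 purely as an example; it is not a claim about any particular project in the dataset.

# In the Web-Mainframe model above, effort scales multiplicatively:
#   effort_h = exp(2.244) * fsm_count^1.077 * secu^1.077 * (other drivers)
# A component whose SECU multiplier is set to 1.282 therefore carries roughly
# the following relative effort penalty compared to secu = 1.0:
b_secu <- 1.07733  # coefficient on log(secu) from the table above
secu   <- 1.282    # illustrative SECU setting (from Table B.1)
secu^b_secu        # approximately 1.31, i.e., about 31% extra effort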
C.2 Mainframe
Call:
lm(formula = log(effort_h) ~ log(fsm_count) + log(fail) + log(cplx) +
log(secu) + log(plat) + log(expe) + log(tool) + log(site),
data = df)
Residuals:
Min 1Q Median 3Q Max
-0.64381 -0.15491 0.01688 0.15258 0.71620
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.26759 0.06500 34.886 < 2e-16 ***
log(fsm_count) 1.07236 0.01373 78.101 < 2e-16 ***
log(fail) 1.75230 0.24121 7.265 5.42e-12 ***
log(cplx) 1.35301 0.19385 6.979 2.97e-11 ***
log(secu) 1.05327 0.16448 6.404 8.13e-10 ***
log(plat) 0.89809 0.18763 4.787 2.99e-06 ***
log(expe) 3.23190 0.42483 7.608 6.63e-13 ***
log(tool) 0.83714 0.33857 2.473 0.0141 *
log(site) 0.88190 0.49080 1.797 0.0736 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2276 on 236 degrees of freedom
Multiple R-squared: 0.965, Adjusted R-squared: 0.9638
F-statistic: 812.2 on 8 and 236 DF, p-value: < 2.2e-16
C.3 Client-Server, Web-Mainframe
Call:
lm(formula = log(effort_h) ~ log(fsm_count) + log(fail) + log(cplx) +
log(plat) + log(expe) + log(tool) + log(site), data = df)
Residuals:
Min 1Q Median 3Q Max
-1.34420 -0.13551 0.01364 0.16446 1.14444
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.31985 0.11469 20.227 < 2e-16 ***
log(fsm_count) 0.98375 0.02061 47.727 < 2e-16 ***
log(fail) 1.93712 0.37585 5.154 9.5e-07 ***
log(cplx) 0.68857 0.35252 1.953 0.052985 .
log(plat) 1.10656 0.28616 3.867 0.000175 ***
log(expe) 1.68238 0.57840 2.909 0.004286 **
log(tool) 1.42041 0.97766 1.453 0.148726
log(site) -1.80281 1.19084 -1.514 0.132537
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2853 on 127 degrees of freedom
Multiple R-squared: 0.9547, Adjusted R-squared: 0.9522
F-statistic: 382.2 on 7 and 127 DF, p-value: < 2.2e-16
C.4 Web
Call:
lm(formula = log(effort_h) ~ log(fsm_count) + log(fail) + log(cplx) +
log(plat) + log(expe) + log(tool) + log(site), data = df)
Residuals:
Min 1Q Median 3Q Max
-1.44675 -0.21889 0.04249 0.18672 1.01578
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.76565 0.17648 15.671 < 2e-16 ***
log(fsm_count) 1.03404 0.03507 29.484 < 2e-16 ***
log(fail) 1.50072 0.73863 2.032 0.04572 *
log(cplx) 1.87622 0.78624 2.386 0.01954 *
log(plat) 1.58867 0.75780 2.096 0.03942 *
log(expe) 2.39059 0.86777 2.755 0.00737 **
log(tool) 1.24679 0.87308 1.428 0.15743
log(site) 4.52869 1.52967 2.961 0.00411 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4354 on 75 degrees of freedom
Multiple R-squared: 0.93, Adjusted R-squared: 0.9235
F-statistic: 142.4 on 7 and 75 DF, p-value: < 2.2e-16
C.5 Client-Server
Call:
lm(formula = log(effort_h) ~ log(fsm_count) + log(secu) + log(pvol) +
log(expe) + log(tool), data = df)
Residuals:
Min 1Q Median 3Q Max
-0.71735 -0.17647 -0.01157 0.14058 0.72229
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.98508 0.19682 10.086 6.66e-13 ***
log(fsm_count) 1.07915 0.04462 24.184 < 2e-16 ***
log(secu) 2.07986 0.48880 4.255 0.000111 ***
log(pvol) 3.93924 0.96474 4.083 0.000190 ***
log(expe) 3.11137 0.75927 4.098 0.000181 ***
log(tool) 3.19042 1.12674 2.832 0.007019 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3113 on 43 degrees of freedom
Multiple R-squared: 0.9406, Adjusted R-squared: 0.9337
F-statistic: 136.1 on 5 and 43 DF, p-value: < 2.2e-16
Abstract

Software development teams are under pressure to adopt security practices in their projects in response to cyber threats. Despite the effort required to perform these activities, the few proposed cost models for security effort do not consider security practices as input and were not properly validated with empirical data. This dissertation aims at examining the effects of applying software security practices on the software development effort. Specifically, it quantifies the effort required to develop secure software in increasing levels of rigor and scope.

An ordinal scale to measure the degree of application of security practices was developed. The scale items are based on the sources of cost for secure software development, captured through a systematic mapping and a survey with security experts. Effort estimation experts and software security experts evaluated the scale and provided initial estimates for the productivity range through Wideband Delphi and online Delphi sessions. Finally, a statistical model to quantify the security effort was built based on the estimates and on a dataset with projects from the industry. The model calibration showed that the application of software security practices can impact the cost estimations, ranging from a 19% additional effort on the first level of the scale to a 102% additional effort on the highest level of the scale.

These results suggest that the effort required to develop secure software is lower than estimated in previous studies, especially when considering the domain of Information Systems software. This research builds on previous work on secure software cost models and goes one step further by providing an empirical validation for the required software security scale. The resulting model can be used by practitioners in this area to estimate proper resources for secure software development. Additionally, the validated multipliers are an important piece of information for researchers developing investment models for software security.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Calibrating COCOMO® II for functional size metrics
Domain-based effort distribution model for software cost estimation
A model for estimating cross-project multitasking overhead in software development projects
Security functional requirements analysis for developing secure software
Improved size and effort estimation models for software maintenance
Quantitative and qualitative analyses of requirements elaboration for early software size estimation
Software quality understanding by analysis of abundant data (SQUAAD): towards better understanding of life cycle software qualities
Assessing software maintainability in systems by leveraging fuzzy methods and linguistic analysis
A model for estimating schedule acceleration in agile software development projects
Incremental development productivity decline
A search-based approach for technical debt prioritization
Software security economics and threat modeling based on attack path analysis; a stakeholder value driven approach
A reference architecture for integrated self‐adaptive software environments
Value-based, dependency-aware inspection and test prioritization
Experimental and analytical comparison between pair development and software development with Fagan's inspection
Process implications of executable domain models for microservices development
Quantifying the impact of requirements volatility on systems engineering effort
A value-based theory of software engineering
Toward better understanding and improving user-developer communications on mobile app stores
Techniques for methodically exploring software development alternatives
Asset Metadata
Creator: Venson, Elaine (author)
Core Title: The effects of required security on software development effort
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Computer Science
Degree Conferral Date: 2021-08
Publication Date: 07/17/2021
Defense Date: 06/03/2021
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: OAI-PMH Harvest, secure software development, software cost model, software effort estimation, software security
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Boehm, Barry William (committee chair), Adler, Paul (committee member), Wang, Chao (committee member)
Creator Email: elaine.venson@gmail.com, venson@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-oUC15600910
Unique identifier: UC15600910
Legacy Identifier: etd-VensonElai-9759
Document Type: Dissertation
Rights: Venson, Elaine
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu