Article history: Received 18 October 2021; Revised 1 February 2022; Accepted 14 February 2022; Available online 16 February 2022

Keywords: Security; Vulnerability databases; Software development; Software security; Vulnerability analysis

Abstract

Over the last decade, several software vulnerability databases (SVDBs) have been introduced to guide researchers and developers in developing more secure and reliable software. While the Software Engineering (SE) research community is increasingly becoming aware of these vulnerability databases, no comprehensive literature survey exists that studies how they are used in software development. The objective of our survey is to provide insights into how the SVDBs research landscape has evolved over the past 17 years and to outline some open challenges associated with their use in non-security domains. More specifically, we introduce a semi-automated methodology based on topic modeling to discover relevant topics from our dataset of 99 relevant SE research articles. We find 24 topics discussing the use of SVDBs in the SE domain. The results show that i) topics describing the use of SVDBs range from empirical security (case) studies to tools for generating security test cases; ii) the majority of the surveyed papers cover a limited number of software engineering contributions or activities (e.g., maintenance); and iii) most of the surveyed articles rely on only one SVDB as their knowledge source. Dataset and results are available at https://github.com/isultane/svdbs_dataset

© 2022 Published by Elsevier Ltd.

E-mail address: ssalqahtani@imamu.edu.sa

1 https://nvd.nist.gov/vuln/data-feeds
2 https://www.cert.org/
3 https://www.mozilla.org/en-US/security/advisories/
4 https://nvd.nist.gov/vuln/data-feeds
5 https://cve.mitre.org/index.html

https://doi.org/10.1016/j.cose.2022.102661
0167-4048/© 2022 Published by Elsevier Ltd.
S.S. Alqahtani Computers & Security 116 (2022) 102661
For the next processing step, we use the cleansed data as input to our topic model algorithm. The topic modeling is used to automate the discovery of topics from our dataset. In this paper, we rely on Latent Dirichlet Allocation (LDA), a statistical topic modeling approach which is best suited for finding discussion topics in natural language text documents (Blei et al., 2003). LDA creates topics when it finds sets of words that co-occur frequently in the documents of the corpus. Often, the words in a discovered

popular(zk) = (1 / |D|) · |{ di ∈ D : θ(di, zk) ≥ δ }|   (1)

where D is the set of all articles in our dataset. The popularity metric allows us to assess the relative popularity of a topic zk across all articles. For example, if a topic has a popularity metric of 10%, then 10% of all articles contain this topic.
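As a concrete illustration, the popularity metric of Eq. (1) can be computed directly from an LDA document-topic matrix. The following is a minimal sketch, assuming a NumPy matrix `theta` holding the memberships θ(di, zk) and a threshold `delta`; the toy values are illustrative only and are not taken from our dataset.

```python
# Minimal sketch of the popularity metric popular(z_k) from Eq. (1).
# Assumptions: `theta` is a |D| x K document-topic matrix theta(d_i, z_k)
# produced by an LDA model; `delta` is the membership threshold.
import numpy as np

def popularity(theta: np.ndarray, delta: float) -> np.ndarray:
    """Per topic z_k: fraction of articles d_i with theta(d_i, z_k) >= delta."""
    return (theta >= delta).mean(axis=0)

# Toy memberships for |D| = 4 articles and K = 3 topics.
theta = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
])
print(popularity(theta, delta=0.5))  # topic popularities: 0.5, 0.25, 0.25
```

With δ = 0.5, the first topic appears in two of the four toy articles, so its popularity is 50%, matching the reading of the metric given in the text.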
where G(zk, sp, y) denotes all topics zk related to SE phase sp in year y. This metric measures the number of articles covering an SE phase during a year, relative to all articles in that year.

3. Results

We now present the results of applying our research methodology to our dataset and report on findings related to our research questions introduced earlier.

3.1. What types of SVDBs are most commonly used (RQ#1)?

Our survey shows that none of the surveyed SE articles reported on the use of common SVDBs prior to 2006 (see Fig. 3). This is due to the fact that the first widely recognized common SVDBs (e.g., Karlsson, 2012) became publicly available only in late 2004 and early 2005, with more specialized public SVDBs emerging in SE articles in late 2010.

Our analysis also shows that the majority of surveyed articles (91%) use common SVDBs in their work, whereas only 9% rely on specialized SVDBs as their primary resource for vulnerability information. Further analysis of common SVDB usage in these papers (see Fig. 3) shows that most of the surveyed articles (26%) used NVD as their SVDB of choice, followed by CVE6 (16%). It should be noted that NVD is based on the CVE dictionary, augmented with additional analysis information, a database, and a fine-grained search engine. NVD is synchronized regularly with CVE such that any CVE update will also be reflected in NVD (after approval by the NVD security engineers). NVD includes security checklists, security-related software flaws, misconfigurations, affected product names, and impact metrics. OWASP, another common SVDB which has been used by

3.2. What are the main security discussion topics in SE articles (2006–2017) (RQ#2)?

In what follows, we describe in more detail the 24 topics discovered by the topic analysis used in our methodology. The full list of topics discovered, including their popularity metric (popular, Eq. (1), introduced earlier), can be found in Appendix A (see Table A3) and in our public repository (Alqahtani, 2021).

Our analysis also shows that topics span several security concepts, such as “security study”, “sql injection”, “detecting overflow”, “prediction model”, etc. We use the groupings of top words to reflect their semantic similarities (see Table A1 in the Appendix). For example, words such as “model”, “predict”, “evaluation”, and “components” are grouped together in this context as part of the Vulnerability Prediction Model topic (Hovsepyan et al., 2012). In addition to this automated classification, we also manually verified most of the top documents (SE articles) to ensure a natural fit with both the given topic and the other topics in the documents.

In what follows, we present a subset (top 4 from Table A1) of topics along with representative example articles (Article-id: title and SVDB used in the article) to illustrate what constitutes each topic.

Security study topic: Mining security software repositories is often used to study how programmers deal with security concerns during software development. To this extent, we identified multiple security topics, including Security Study (Empirical), and Security Maintenance and Design. Below are two examples of articles that fall into the Security Study topic.
6 https://cve.mitre.org/
7 https://blog.osvdb.org/
8 https://www.mitre.org/
9 http://seclists.org/bugtraq/
10 https://www.exploit-db.com/
11 http://www.cs.cmu.edu/∼lujiang/resources/igraph.pdf
this a manifold such as: (1) specialized SVDBs contain known security vulnerabilities affecting specific systems written in a specific programming language (e.g., PHP). Analysis results obtained from specialized SVDBs are typically not generalizable to other systems (e.g., using Java vs. PHP), therefore limiting the potential impact of the published work; (2) common SVDBs contain more diverse known security vulnerabilities affecting different types of software systems and can therefore accommodate different research interests; (3) among the common SVDBs, we found NVD to be the most popular SVDB used in the SE community. There are several reasons for this popularity of NVD, such as ease of access (e.g., automatic data feeds), updates, size, and quality of the dataset.

Even with the popularity of common SVDBs, studies have shown that developers are often not aware of known security vulnerabilities affecting their systems (Cadariu et al., 2015; Alqahtani et al., 2016b; Plate et al., 2015), resulting in situations where known vulnerabilities are patched late or never after the disclosure of a vulnerability. This implies limited communication between vendors in charge of patching the vulnerabilities and common SVDB providers, since vendors are expected to provide a new (patched) version of components with known vulnerabilities or at least provide users with patch information on how to fix the vulnerability.

A limitation of many common SVDBs is that they do not include the actual code causing the security vulnerability, in contrast to specialized SVDBs, which often share the code of known security vulnerabilities. Having direct access to this vulnerable code fragment simplifies the work of SE researchers evaluating their security analysis approaches.

RQ#2: What are the main security topics covered by the reviewed SE articles? and RQ#3: Have the security interests in specific SE phases changed over time? We studied security topics discussed in SE articles that use SVDBs in their research methodology to identify how these SVDBs are used. We further clustered these topics based on topics' term relationships describing SE activities for a more fine-grained analysis. Our findings related to RQ#2 reveal that security studies (empirical or case studies) are among the most common research activities covered by our reviewed articles, with most articles citing only a single SVDB, even though some research has shown that combining multiple SVDBs can improve vulnerability detection coverage and performance (Massacci and Nguyen, 2014; Alqahtani et al., 2017).
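To make the "automatic data feeds" mentioned above concrete, NVD distributes its entries as downloadable JSON feeds (footnote 1). The sketch below parses a record shaped like an NVD JSON 1.1 feed entry; the inline sample, its CVE ids, and the score are invented for illustration and are not real feed data.

```python
# Illustrative parser for data shaped like NVD's JSON 1.1 feeds
# (https://nvd.nist.gov/vuln/data-feeds). The sample entry below is a
# simplified, made-up excerpt, not an actual feed record.
import json

sample_feed = json.loads("""
{
  "CVE_Items": [
    {"cve": {"CVE_data_meta": {"ID": "CVE-2021-0001"}},
     "impact": {"baseMetricV3": {"cvssV3": {"baseScore": 9.8}}}},
    {"cve": {"CVE_data_meta": {"ID": "CVE-2021-0002"}},
     "impact": {}}
  ]
}
""")

def cve_scores(feed: dict) -> dict:
    """Map each CVE id in a feed to its CVSS v3 base score (None if unscored)."""
    scores = {}
    for item in feed["CVE_Items"]:
        cve_id = item["cve"]["CVE_data_meta"]["ID"]
        metric = item.get("impact", {}).get("baseMetricV3")
        scores[cve_id] = metric["cvssV3"]["baseScore"] if metric else None
    return scores

print(cve_scores(sample_feed))  # {'CVE-2021-0001': 9.8, 'CVE-2021-0002': None}
```

Such machine-readable feeds are what allow researchers to synchronize local datasets with NVD automatically, one of the reasons for its popularity noted above.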
Our results for RQ#3 introduce 7 clusters which represent SE activities such as Maintenance, Testing and Tools design, Modeling, Coding, Risk Analysis, and other. Due to space limits, we discuss in some detail the top three, Maintenance, Testing, and Modeling, summarized as follows:

Maintenance: Among the SE activities supported by SVDBs, maintenance is the most common one. Our further manual analysis of these maintenance-related articles showed that many of them focus on SVDBs for vulnerability evolution. Like traditional software evolution research, with its focus on the analysis and characterization of how a software system evolves over time, the presence of vulnerabilities is a crucial problem in this context. Vulnerability evolution requires organizations to monitor and manage the evolution of vulnerabilities to ensure the security and reliability of their systems. Furthermore, it often relies on information on how certain vulnerabilities evolve over time and what the causes for these vulnerabilities are (Alhazmi and Malaiya, 2006). Common to the reviewed papers is that SVDBs provide tangible evidence about security issues affecting software systems and how these vulnerabilities and the systems in which they occur have evolved over time (e.g., Murtaza et al., 2016; Stuckman and Purtilo, 2014; Meneely et al., 2013).

Models: Modeling in the software engineering community is primarily concerned with reducing the gap between software problems and implementation through the use of models that describe complex systems at multiple levels of abstraction and from a variety of perspectives (Atlee et al., 2007). In vulnerability analysis, modeling techniques play an important role for resource allocation during patch development and when evaluating the risk of vulnerability exploitation. Vulnerability discovery models (e.g., Massacci and Nguyen, 2014; Alhazmi and Malaiya, 2006) have been introduced. A widely used example of such a prediction model is the Vulnerability Prediction Model (VPM) (Rountev et al., 2004; Morrison et al., 2015), introduced to predict the occurrence or absence of security vulnerabilities in software systems. The use of VPM is also evident in the common use of the “prediction model” topic in our surveyed papers. Due to the available vulnerability patch information provided by SVDBs, we find that SE researchers have also started including SVDBs in their vulnerability prediction analysis and recommendations for patching the vulnerabilities (e.g., Sampaio and Garcia, 2016; Theisen et al., 2015; Wang et al., 2017; Appelt et al., 2015; Chatzipoulidis et al., 2015). Furthermore, SVDB data is often used to increase the precision or recall of existing models (e.g., Theisen et al., 2015) or to further enrich and train models with “real” security vulnerability data (e.g., Chatzipoulidis et al., 2015).

Testing and tools: Automated tools play an important role in the software engineering domain (O’Regan, 2017) to support different activities and tasks. From our survey, we observed that SE researchers introduce different types of automated tools that support vulnerability analysis and avoidance. For example, papers propose automatic test generation (suggested by the “test cases” topic) and penetration testing tools that automate the process of detecting and exploiting SQL injection flaws (suggested by the “exploit” and “sql injection” topics). Among the articles clustered in the Tool and Testing activity, SVDBs were used for evaluating results of the proposed tools or as the main knowledge resource for the actual approach presented in the papers (e.g., Wang et al., 2017; Stivalet and Fong, 2016; Alqahtani et al., 2016a; Pham et al., 2015; Blome et al., 2013).

While several security test case generation and vulnerability prediction tools have emerged, many of these tools remain at the prototype level. We also observed that little work exists in analyzing cross-cutting security concerns across different security testing and modeling approaches.

Other usages of SVDBs: Our survey also showed that SE researchers used SVDBs for topics not associated with any of our topic clusters, for example:

Studying the lifetime of vulnerabilities: Frei et al. (2006) use SVDBs (e.g., NVD) to quantify the time period between a vulnerability disclosure, the time of exploiting the vulnerability, and the time it takes to patch a vulnerability. Zhang et al. (2011) used data from the NVD to predict the time until a new vulnerability is discovered in a software product. Zaman et al. (2011) performed an exploratory study using Firefox and the CVE dataset to uncover security bugs. The study reveals that security bugs are fixed faster than design bugs, and that security bugs tend to be reopened multiple times in a bug repository.

Studying vulnerabilities and their hidden impact: Wijayasekara et al. (2014, 2012) study the hidden impact of vulnerabilities, vulnerabilities that are discovered after a bug has been made public. They used CVE and bug repositories for the analysis of the Linux kernel and MySQL and observed that these systems had 32% and 62% of hidden impact vulnerabilities between 2006 and 2011.

5. The road ahead

There are two key observations that we believe will impact the future development of SVDBs and how SE researchers and practitioners will use SVDBs. The first observation can be helpful for SVDB designers to enhance current SVDB features to meet additional requirements from SE researchers and practitioners. The second is an observation that can guide SE researchers and practitioners to further improve traceability and documentation of product changes caused by patched vulnerabilities.

Observation 1. Using SVDBs beyond just being information silos.

Our results show that a majority of SE researchers use SVDBs to gain security-related knowledge. In fact, developers already use SVDBs to identify security vulnerabilities and determine features (e.g., vulnerability patch information) that they want to implement. Presently, the role of SVDBs is mostly that of a repository for reporting known security vulnerabilities. However, we envision that future versions of SVDBs will play an increasing role as an integrated knowledge source for guiding secure software development, providing security testing, and refining software security design. Hence, we believe that future versions of SVDBs need to incorporate a mechanism where SE researchers can link and trace vulnerability information directly across knowledge resources.

Another interesting finding is that SE researchers and practitioners usually reuse vulnerability information from only one source (a single SVDB), limiting their analysis approach to the data available in this SVDB. One approach to address this problem is by improving the accessibility of information across SVDB boundaries. Providing users with standardized access to these knowledge resources, where queries retrieve information across SVDB boundaries, would represent a first step toward new types of vulnerability analysis (e.g., global security impact). While linking these knowledge resources is an important initial step, additional semantic modeling will be needed to ensure the consistency and quality of knowledge across SVDB boundaries. For example, threats to consistency and ambiguity across these knowledge resources will have to be addressed to ensure that a vulnerability reported in two databases is actually the same (or a different) instance. One approach would be to replace the current proprietary knowledge modeling approaches used by SVDB providers and agree upon a standardized knowledge modeling approach, which would include the ability to semantically link and query SVDBs across repository boundaries and to provide each vulnerability with a global, unique identifier, similar to the Universal Resource Identifier used by the Semantic Web.

Observation 2. Linking Security Commit Changes to SVDBs

With a more widespread use of SVDBs in software development, we believe that SVDBs should become an integrated part of current software development processes and best practices. Similar to the current practice of adding an issue number to a commit message, commit messages should also include a link to the vulnerability in the SVDB where it is reported. Such vulnerability traceability can provide additional insights and documentation to QA and future maintainers when analyzing and comprehending the code patch. Furthermore, a bi-directional link from the vulnerabilities to the known and patched code would be desirable. We further believe that next-generation IDEs should not only facilitate this linking process, but also take advantage of these links to recommend patches or identify potential impacts of these vulnerabilities on other parts of the system.

6. Conclusion

While the SE research community is increasingly focusing on security and reliability, no comprehensive literature survey exists that studies how software vulnerability databases are used and integrated in a developer's tool chain. In this paper, we proposed a methodology to discover and quantify security topics and trends of using SVDBs in SE research. Our methodology is based on LDA, a widely applied statistical topic modeling approach, which we used to discover topics from our dataset of relevant SE research articles. We define various metrics to quantify how security topics have evolved over time, allowing us to gain insights on how SVDBs are used in SE research over the last 17 years. From our analysis of the 99 papers in our dataset we can conclude the following:

• there is an increasing awareness of SVDBs in the research community in terms of papers being published describing the use and application of SVDBs in the SE domain;
• the majority of the surveyed studies apply SVDBs only to a limited number of software engineering activities;
• most studies rely only on one SVDB for their contribution;
• researchers usually treat SVDBs as trusted information knowledge resources without fully integrating them with other software lifecycle artifacts.

Our study can be used to further increase the SVDB awareness of both SE researchers and practitioners by providing them with some directions for future research and application of SVDBs in the SE domain. Additionally, this paper can be extended into a systematic review that includes SE publications in vulnerability analysis between 2018 and 2022 and investigates research questions such as: What are common SE repositories which are used together with SVDBs? Which SE software lifecycle activities are supported by SVDBs?

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

A. Appendix

Tables A1–A4
Table A1
The top 10 terms in the 24 topics clustering 6 major SE phases as found by this research.
1 Maintenance Topic 1 Taint analysis Analysi, execut, analyz, taint, result, hypercal, compar, becom, inform,
lightweight
Topic 2 Fuzzy logic Program, input, fuzz, explor, base, path, real, symbol, stage, limit,
Topic 4 Detecting Detect, technique, overflow, automat, buffer, fals, earli, static, posit, analysi
overflow
Topic 7 Attack Attack, approach, browser, base, firewall, request, detect, implement,
mechanism present, includ
Topic 10 Web application Applic, web, access, control, state, client, check, world, make, real
Topic 13 Bugs and release Bug, releas, research, analysi, increase, non, empir, examin, associ, differ
analysis
Topic 23 Static analysis Static, valid, cross, input, script, site, propos, string, common, dynam
2 Source Code Topic 8 Security design Secur, threat, approach, function, design, base, engine, level, process, featur
Topic 5 Security study Secur, report, problem, studi, process, reliabl, fix, collect, detect, discuss
Topic 19 Development Develop, secur, improve, practice, priorit, evalu, context, find, defin,
framework
Topic 21 Source code Code, sourc, develop, open, contain, line, linux, integ, perform, transform
3 Testing Topic 6 System analysis System, use, base, analysi, signatur, forma, specif, metric, architecture,
scenario
Topic 11 Test cases Test, generat, case, result, use, effect, algorithm, xssvs, data, complex
Topic 20 Software security Softwar, provid, part, time, statist, reduc, general, combin, exploit, secur
Topic 22 Exploit Exploit, system, paper, mitig, challeng, worm, defens, diagnosi, major,
memori
4 Modeling Topic 12 Malware Method, malwar, use, show, behavior, learn, detect, malici, comput, featur
methods
Topic 18 Patch study Vendor, disclosur, patch, time, respons, show, sever, valu, releas, studi
Topic 24 Prediction model Model, predict, paper, evalu, propos, use, compon, methodology, issu, effort
5 Risk Analysis Topic 9 Risk estimation Inform, risk, avail, estim, potenti, oper, cvss, allow, data, provid
Topic 17 Project Data, identify, project, repository, inform, sourc, relat, across, exist,
repository research
6 Tool Topic 3 Binary tool Tool, binary, user, mani, emonstr, creat, present, engine, structur, crash
Topic 16 SQL injection Inject, sql, database, identify, slice, tool, work, xml, type, obtain
7 Other Topic 14 Version Version, investing, assess, caus, autom, trace, anomali, surfac, current,
assessment approxim
Topic 15 Vulnerability Vulner, discov, discoveri, type, signific, known, order, result, exist, flow
discovery
Table A2
The 24 topics discovered by LDA.

Security study: Secur, report, problem, studi, process, reliabl, fix, collect, detect, discuss
Prediction model: Model, predict, paper, evalu, propos, use, compon, methodology, issu, effort
Bugs and release analysis: Bug, releas, research, analysi, increase, non, empir, examin, associ, differ
Static analysis: Static, valid, cross, input, script, site, propos, string, common, dynam
System analysis: System, use, base, analysi, signatur, forma, specif, metric, architecture, scenario
Test cases: Test, generat, case, result, use, effect, algorithm, xssvs, data, complex
Web application: Applic, web, access, control, state, client, check, world, make, real
Project repository: Data, identify, project, repository, inform, sourc, relat, across, exist, research
Detecting overflow: Detect, technique, overflow, automat, buffer, fals, earli, static, posit, analysi
Vulnerability discovery: Vulner, discov, discoveri, type, signific, known, order, result, exist, flow
Patch study: Vendor, disclosur, patch, time, respons, show, sever, valu, releas, studi
Malware: Method, malwar, use, show, behavior, learn, detect, malici, comput, featur
Exploit: Exploit, system, paper, mitig, challeng, worm, defens, diagnosi, major, memori
Risk estimation: Inform, risk, avail, estim, potenti, oper, cvss, allow, data, provid
Attack mechanism: Attack, approach, browser, base, firewall, request, detect, implement, present, includ
Taint analysis: Analysi, execut, analyz, taint, result, hypercal, compar, becom, inform, lightweight
Development: Develop, secur, improve, practice, priorit, evalu, context, find, defin, framework
Fuzzy logic: Program, input, fuzz, explor, base, path, real, symbol, stage, limit
Binary tool: Tool, binary, user, mani, emonstr, creat, present, engine, structur, crash
SQL injection: Inject, sql, database, identify, slice, tool, work, xml, type, obtain
Source code: Code, sourc, develop, open, contain, line, linux, integ, perform, transform
Version assessment: Version, investing, assess, caus, autom, trace, anomali, surfac, current, approxim
Security design: Secur, threat, approach, function, design, base, engine, level, process, featur
Software security: Softwar, provid, part, time, statist, reduc, general, combin, exploit, secur
Table A3
Topic shares and trends.
Topic name | Popularity (%) | Trend (p-value)
Table A4
99 Articles included in the final dataset.
Title DOI
A39 Assessing the Threat Landscape for Software Libraries
10.1109/ISSREW.2014.58
A40 Input injection detection in Java code
10.1109/ICODSE.2014.7062698
A41 Automated Test Generation from Vulnerability
Signatures 10.1109/ICST.2014.32
A42 Mining Security Vulnerabilities from Linux Distribution
Metadata 10.1109/ISSREW.2014.101
A43 Security Benchmarks for Web Serving Systems
10.1109/ISSRE.2014.38
A44 A New Technique for Counteracting Web Browser
Exploits 10.1109/ASWEC.2014.28
A45 Mining SQL injection and cross site scripting
vulnerabilities using hybrid program analysis 10.1109/ICSE.2013.6606610
A46 Path sensitive static analysis of web applications for
remote code execution vulnerability detection 10.1109/ICSE.2013.6606611
A47 Program transformations to fix C integers
10.1109/ICSE.2013.6606625
A48 Automated software architecture security risk analysis
using formalized signatures 10.1109/ICSE.2013.6606612
A49 Automatic Generation of Test Drivers for Model
Inference of Web Applications 10.1109/ICSTW.2013.57
A50 VERA: A Flexible Model-Based Vulnerability Testing
Tool 10.1109/ICST.2013.65
A51 A scalable approach for malware detection through
bounded feature space behavior modeling 10.1109/ASE.2013.6693090
A52 When a Patch Goes Bad: Exploring the Properties of
Vulnerability-Contributing Commits 10.1109/ESEM.2013.19
A53 Model-Based Vulnerability Testing for Web Applications
10.1109/ICSTW.2013.58
A54 Vulnerability of the Day: Concrete demonstrations for
software engineering undergraduates 10.1109/ICSE.2013.6606667
A55 Extracting and Analyzing the Implemented Security
Architecture of Business Applications 10.1109/CSMR.2013.37
A56 Using software reliability models for security
assessment - Verification of assumptions 10.1109/ISSREW.2013.6688858
A57 Securing web-clients with instrumented code and
dynamic runtime monitoring 10.1016/j.jss.2013.02.047
A58 A Large Scale Exploratory Analysis of Software
Vulnerability Life Cycles 10.1109/ICSE.2012.6227141
A59 Predicting common web application vulnerabilities
from input validation and sanitization code patterns 10.1145/2351676.2351733
A60 Supporting automated vulnerability analysis using
formalized vulnerability signatures 10.1145/2351676.2351691
A61 Using Multiclass Machine Learning Methods to Classify
Malicious Behaviors Aimed at Web Systems 10.1109/ISSRE.2012.30
A62 Fast Detection of Access Control Vulnerabilities in PHP
Applications 10.1109/WCRE.2012.34
A63 SPaCiTE – Web Application Testing Engine
10.1109/ICST.2012.187
A64 Automated detection of client-state manipulation
vulnerabilities 10.1145/2531921
A65 Securing Opensource Code via Static Analysis
10.1109/ICST.2012.123
A66 Structured Binary Editing with a CFG Transformation
Algebra 10.1109/WCRE.2012.11
A67 CAWDOR: Compiler Assisted Worm Defense
10.1109/SCAM.2012.30
A68 Improving VRSS-based vulnerability prioritization using
analytic hierarchy process 10.1016/j.jss.2012.03.057
A69 SimFuzz: Test case similarity directed deep fuzzing
10.1016/j.jss.2011.07.028
A70 Empirical Results on the Study of Software
Vulnerabilities (NIER Track) 10.1145/1985793.1985960
A71 One Technique is Not Enough: A Comparison of
Vulnerability Discovery Techniques 10.1109/ESEM.2011.18
A72 Using SQL Hotspots in a Prioritization Heuristic for
Detecting All Types of Web Application Vulnerabilities 10.1109/ICST.2011.15
A73 An empirical investigation into open source web
applications’ implementation vulnerabilities 10.1007/s10664-010-9131-y
A74 Searching for a Needle in a Haystack: Predicting
Security Vulnerabilities for Windows Vista 10.1109/ICST.2010.32
A75 Security Trend Analysis with CVE Topic Models
10.1109/ISSRE.2010.53
A76 Client-Side Detection of Cross-Site Request Forgery
Attacks 10.1109/ISSRE.2010.12
A77 Mining security changes in FreeBSD
10.1109/MSR.2010.5463289
A78 Detecting recurring and similar software vulnerabilities
10.1145/1810295.1810336
A79 Quantifying security risk level from CVSS estimates of
frequency and impact 10.1016/j.jss.2009.08.023
A80 MUTEC: Mutation-based testing of Cross Site Scripting
10.1109/IWSESS.2009.5068458
A81 Improving CVSS-based vulnerability prioritization and
response with context information 10.1109/ESEM.2009.5314230
A82 Vulnerability analysis for a quantitative security
evaluation 10.1109/ESEM.2009.5315969
A83 Security of open source web applications
10.1109/ESEM.2009.5314215
A84 Towards a Unifying Approach in Understanding Security
Problems 10.1109/ISSRE.2009.25
A85 On mining data across software repositories
10.1109/MSR.2009.5069498
A86 An empirical study of security problem reports in Linux
distributions 10.1109/ESEM.2009.5315985
A87 Static detection of cross-site scripting vulnerabilities
10.1145/1368088.1368112
A88 An Empirical Analysis of the Impact of Software
Vulnerability Announcements on Firm Stock Price 10.1109/TSE.2007.70712
A89 Efficiency of Vulnerability Disclosure Mechanisms to
Disseminate Vulnerability Knowledge 10.1109/TSE.2007.26
A90 Software Vulnerability Assessment Version Extraction
and Verification 10.1109/ICSEA.2007.64
A91 Threat-driven modeling and verification of secure
software using aspect-oriented Petri nets 10.1109/TSE.2006.40
A92 Modeling Software Vulnerabilities With Vulnerability
Cause Graphs 10.1109/ICSM.2006.40
A93 Measuring and Enhancing Prediction Capabilities of
Vulnerability Discovery Models for Apache and IIS HTTP 10.1109/ISSRE.2006.26
Servers
A94 Large-scale vulnerability analysis
10.1145/1162666.1162671
A95 An Empirical Study on Using the National Vulnerability
Database to Predict Software Vulnerabilities 10.1007/978-3-642-23088-2_15
A96 Security versus performance bugs: a case study on
Firefox 10.1145/1985441.1985457
A97 Vulnerability identification and classification via text
mining bug databases 10.1109/IECON.2014.7049035
A98 Mining Bug Databases for Unidentified Software
Vulnerabilities 10.1109/HSI.2012.22
A99 Estimating ToE Risk Level Using CVSS
10.1109/ARES.2009.151
CRediT authorship contribution statement

Sultan S. Alqahtani: Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft.

References

Alhazmi, O., Malaiya, Y., 2006. Measuring and enhancing prediction capabilities of vulnerability discovery models for Apache and IIS HTTP servers. In: 2006 17th International Symposium on Software Reliability Engineering, pp. 343–352. doi:10.1109/ISSRE.2006.26.
Alqahtani, S.S., 2021. Dataset. https://github.com/isultane/svdbs_dataset (accessed Dec. 12, 2021).
Alqahtani, S.S., Eghan, E.E., Rilling, J., 2016a. SV-AF — a security vulnerability analysis framework. In: 2016 IEEE 27th International Symposium on Software Reliability Engineering (ISSRE), pp. 219–229. doi:10.1109/ISSRE.2016.12, Oct.
Alqahtani, S.S., Eghan, E.E., Rilling, J., 2016b. Tracing known security vulnerabilities in software repositories – a Semantic Web enabled modeling approach. Sci. Comput. Program. 121, 153–175. doi:10.1016/j.scico.2016.01.005, Jun.
Alqahtani, S.S., Eghan, E.E., Rilling, J., 2017. Recovering semantic traceability links between APIs and security vulnerabilities: an ontological modeling approach. In: 2017 IEEE International Conference on Software Testing, Verification and Validation (ICST), pp. 80–91. doi:10.1109/ICST.2017.15, Mar.
Appelt, D., Nguyen, C.D., Briand, L., 2015. Behind an application firewall, are we safe from SQL injection attacks? In: 2015 IEEE 8th International Conference on Software Testing, Verification and Validation (ICST), pp. 1–10. doi:10.1109/ICST.2015.7102581, Apr.
Atlee, J.M., France, R., Georg, G., Moreira, A., Rumpe, B., Zschaler, S., 2007. Modeling in software engineering. In: 29th International Conference on Software Engineering (ICSE'07 Companion), pp. 113–114. doi:10.1109/ICSECOMPANION.2007.53, May.
Bartlett, R.F., 1993. Linear modelling of Pearson's product moment correlation coefficient: an application of Fisher's z-transformation. Statistician 42 (1), 45. doi:10.2307/2348110.
Blei, D.M., Ng, A.Y., Jordan, M.I., 2003. Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022.
Blome, A., Ochoa, M., Li, K., Peroli, M., Dashti, M.T., 2013. VERA: a flexible model-based vulnerability testing tool. In: 2013 IEEE Sixth International Conference on Software Testing, Verification and Validation, pp. 471–478. doi:10.1109/ICST.2013.65, Mar.
Bozic, J., Garn, B., Simos, D.E., Wotawa, F., 2015. Evaluation of the IPO-family algorithms for test case generation in web security testing. In: 2015 IEEE Eighth International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pp. 1–10. doi:10.1109/ICSTW.2015.7107436, Apr.
Cadariu, M., Bouwers, E., Visser, J., van Deursen, A., 2015. Tracking known security vulnerabilities in proprietary software systems. In: IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER), pp. 516–519. doi:10.1109/SANER.2015.7081868, Mar.
Chatzipoulidis, A., Michalopoulos, D., Mavridis, I., 2015. Information infrastructure risk prediction through platform vulnerability analysis. J. Syst. Softw. 106, 28–41. doi:10.1016/j.jss.2015.04.062, Aug.
D'Ambros, M., Lanza, M., 2006. Software bugs and evolution: a visual approach to uncover their relationship. In: Conference on Software Maintenance and Reengineering (CSMR'06), pp. 229–238. doi:10.1109/CSMR.2006.51.
Fang, M., Hafiz, M., 2014. Discovering buffer overflow vulnerabilities in the wild. In: Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement - ESEM '14, pp. 1–10. doi:10.1145/2652524.2652533.
Frei, S., May, M., Fiedler, U., Plattner, B., 2006. Large-scale vulnerability analysis. In: Proceedings of the 2006 SIGCOMM Workshop on Large-Scale Attack Defense - LSAD '06, pp. 131–138. doi:10.1145/1162666.1162671.
Hindle, A., Bird, C., Zimmermann, T., Nagappan, N., 2015. Do topics make sense to managers and developers? Empir. Softw. Eng. 20 (2), 479–515. doi:10.1007/s10664-014-9312-1, Apr.
Feinerer, I., 2013. Introduction to the tm package: text mining in R, pp. 1–8. [Online]. Available: https://cran.r-project.org/web/packages/tm/vignettes/tm.pdf.
Hovsepyan, A., Scandariato, R., Joosen, W., Walden, J., 2012. Software vulnerability prediction using text analysis techniques. In: Proceedings of the 4th International Workshop on Security Measurements and Metrics - MetriSec '12, p. 7. doi:10.1145/2372225.2372230.
Karlsson, M., 2012. The edit history of the national vulnerability database and similar vulnerability databases.
Lebeau, F., Legeard, B., Peureux, F., Vernotte, A., 2013. Model-based vulnerability testing for web applications. In: 2013 IEEE Sixth International Conference on Software Testing, Verification and Validation Workshops, pp. 445–452. doi:10.1109/ICSTW.2013.58, Mar.
Massacci, F., Nguyen, V.H., 2014. An empirical methodology to evaluate vulnerability discovery models. IEEE Trans. Softw. Eng. 40 (12), 1147–1162. doi:10.1109/TSE.2014.2354037, Dec.
Mendes, N., Madeira, H., Duraes, J., 2014. Security benchmarks for web serving systems. In: 2014 IEEE 25th International Symposium on Software Reliability Engineering, pp. 1–12. doi:10.1109/ISSRE.2014.38, Nov.
Meneely, A., Srinivasan, H., Musa, A., Tejeda, A.R., Mokary, M., Spates, B., 2013. When a patch goes bad: exploring the properties of vulnerability-contributing commits. In: 2013 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 65–74. doi:10.1109/ESEM.2013.19, Oct.
Ming, J., Wu, D., Wang, J., Xiao, G., Liu, P., 2016. StraightTaint: decoupled offline symbolic taint analysis. In: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering - ASE 2016, pp. 308–319. doi:10.1145/2970276.2970299.
Morrison, P., Herzig, K., Murphy, B., Williams, L., 2015. Challenges with applying vulnerability prediction models. In: Proceedings of the 2015 Symposium and Bootcamp on the Science of Security - HotSoS '15, pp. 1–9. doi:10.1145/2746194.2746198.
Murtaza, S.S., Khreich, W., Hamou-Lhadj, A., Bener, A.B., 2016. Mining trends and patterns of software vulnerabilities. J. Syst. Softw. 117, 218–228. doi:10.1016/j.jss.2016.02.048, Jul.
O'Regan, G., 2017. Software engineering tools, pp. 279–295.
Palsetia, N., Deepa, G., Khan, F.A., Thilagam, P.S., Pais, A.R., 2016. Securing native XML database-driven web applications from XQuery injection vulnerabilities. J. Syst. Softw. 122, 93–109. doi:10.1016/j.jss.2016.08.094, Dec.
Panichella, A., Dit, B., Oliveto, R., Di Penta, M., Poshyvanyk, D., De Lucia, A., 2013. How to effectively use topic models for software engineering tasks? An approach based on genetic algorithms. In: Proceedings of the 2013 International Conference on Software Engineering, pp. 522–531.
Pham, V.-T., Ng, W.B., Rubinov, K., Roychoudhury, A., 2015. Hercules: reproducing crashes in real-world application binaries. In: 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, pp. 891–901. doi:10.1109/ICSE.2015.99, May.
Plate, H., Ponta, S.E., Sabetta, A., 2015. Impact assessment for vulnerabilities in open-source software libraries. In: 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 411–420. doi:10.1109/ICSM.2015.7332492, Sep.
Raghavan, U.N., Albert, R., Kumara, S., 2007. Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 76 (3), 036106. doi:10.1103/PhysRevE.76.036106, Sep.
Rountev, A., Kagan, S., Gibas, M., 2004. Static and dynamic analysis of call chains in Java. ACM SIGSOFT Softw. Eng. Notes 29 (4), 1. doi:10.1145/1013886.1007514, Jul.
Sampaio, L., Garcia, A., 2016. Exploring context-sensitive data flow analysis for early vulnerability detection. J. Syst. Softw. 113, 337–361. doi:10.1016/j.jss.2015.12.021, Mar.
Schumacher, M., Haul, C., Hurler, M., Buchmann, A., 2000. Data mining in vulnerability databases. Comput. Sci. 12.
Stivalet, B., Fong, E., 2016. Large scale generation of complex and faulty PHP test cases. In: 2016 IEEE International Conference on Software Testing, Verification and Validation (ICST), pp. 409–415. doi:10.1109/ICST.2016.43, Apr.
Stuckman, J., Purtilo, J., 2014. Mining security vulnerabilities from Linux distribution metadata. In: 2014 IEEE International Symposium on Software Reliability Engineering Workshops, pp. 323–328. doi:10.1109/ISSREW.2014.101, Nov.
Theisen, C., Herzig, K., Morrison, P., Murphy, B., Williams, L., 2015. Approximating attack surfaces with stack traces. In: 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, pp. 199–208. doi:10.1109/ICSE.2015.148, May.
Thomas, S.W., 2011. Mining software repositories with topic models. In: Software Engineering (ICSE), 2011 33rd International Conference on, pp. 1138–1139. doi:10.1145/1985793.1986020.
Walden, J., Stuckman, J., Scandariato, R., 2014. Predicting vulnerable components: software metrics vs text mining. In: 2014 IEEE 25th International Symposium on Software Reliability Engineering, pp. 23–33. doi:10.1109/ISSRE.2014.32, Nov.
Wang, R., Liu, P., Zhao, L., Cheng, Y., Wang, L., 2017. deExploit: identifying misuses of input data to diagnose memory-corruption exploits at the binary level. J. Syst. Softw. 124, 153–168. doi:10.1016/j.jss.2016.11.026, Feb.
Wijayasekara, D., Manic, M., McQueen, M., 2014. Vulnerability identification and classification via text mining bug databases. In: IECON 2014 - 40th Annual Conference of the IEEE Industrial Electronics Society, pp. 3612–3618. doi:10.1109/IECON.2014.7049035, Oct.
Wijayasekara, D., Manic, M., Wright, J.L., McQueen, M., 2012. Mining bug databases for unidentified software vulnerabilities. In: 2012 5th International Conference on Human System Interactions, pp. 89–96. doi:10.1109/HSI.2012.22, Jun.
Zaman, S., Adams, B., Hassan, A.E., 2011. Security versus performance bugs. In: Proceedings of the 8th Working Conference on Mining Software Repositories - MSR '11, p. 93. doi:10.1145/1985441.1985457.
Zhang, S., Caragea, D., Ou, X., 2011. An empirical study on using the national vulnerability database to predict software vulnerabilities, pp. 217–231.