2020 8th International Conference on Information Technology and Multimedia (ICIMU)

Design and Development of Machine Learning Technique for Software Project Risk Assessment - A Review
Mohamed Najah Mahdi¹, Mohamed Zabil M.H², Azlan Yusof², Lim Kok Cheng², Muhammad Sufyian Mohd Azmi², Abdul Rahim Ahmad
{ najah.mahdi, hazli, azlany, kokcheng, sufyian, abdrahim }@uniten.edu.my
¹ Institute of Informatics and Computing in Energy, Universiti Tenaga Nasional
² College of Computing and Informatics, Universiti Tenaga Nasional

Abstract—Accurate assessment of software project risk is among the key activities in a software project. It directly impacts the time and cost of software projects. This paper presents a literature review of designing and developing machine learning techniques for software project risk assessment. The results of the review identify prominent trends in machine learning approaches, size metrics, and study findings in the growth and advancement of machine learning in project management. In addition, this research provides deeper insight and an important framework for future work in software project risk assessment. Furthermore, we demonstrate that assessing project risk using machine learning is more efficient in reducing a project's faults. It also increases the probability of the software project's prediction and response, and provides a further way to effectively reduce the chances of failure and to increase the software development performance ratio.

Keywords— Software Project Management, formatting, Machine Learning Technique, Assessment, Risk.

I. INTRODUCTION

Software project cost estimation relates to estimating the effort required to create a unit of a software product. Lack of proper estimation [1],[2] is cited as one of the major reasons for the astounding failure rates of software projects. Although there are definitions of project success and failure in the literature, there are long-standing disagreements about how to assess project progress. Views differ on what constitutes project success and how project success is measured [1]. Hughes, et al. [2] and PMI [3] differentiate between the performance of the project and the performance factors in the assessment of success or failure. Performance factors for projects are considered as inputs contributing to performance in projects. A good project is historically characterized in terms of delivery of the desired results within an agreed timeline, using the chosen resources [4]. PMI [3] recognizes projects as effective when they meet the project objectives, criteria, and aspirations of the stakeholders. The classical outcome measures, such as project cost (below, on, or above budget), length of the project (early, on time, or late), and the quality of the outcome of the project (with fewer or better than the necessary properties and functions), are defined by researchers such as Aladwani [5], Cates and Mollaghasemi [6], Parsons [7], and Rosenfeld [8]. The numerous activities in the software project planning phase can be grouped into two major activities, namely effort estimation and risk management [2]. Software effort estimation calculates the effort that is required in a software project based on several cost factors, while risk management activities include identifying, addressing, and eliminating software project risks before undesirable outcomes occur [3].

This paper is further organized as follows: Section II highlights the research questions. Section III discusses the taxonomy of literature on software project risk assessment using machine learning techniques. Section IV describes the review and survey articles and summarizes the current state of understanding of software development, the challenges in this field of research, and a new view of software project risk assessment using machine learning techniques. Finally, a conclusion is presented in Section V.

II. RESEARCH QUESTIONS

Software project management (SPM) assessment has been addressed in many studies to identify possible solutions to its concomitant risks. Numerous studies have highlighted the requirements for SPM assessment but inadequately covered its effects and solutions. Thus, the main objectives of this paper are to identify, classify, and review the methods in these articles that would help software project risk review with machine learning (ML) approaches, and to define and explain the threats, advantages, and recommendations.

However, because of the huge number of ML algorithms, various machine learning algorithms remain unanalyzed in this paper. In particular, we focused on identifying the potential attributes that were used as predictors of SPM assessment in the literature and the prediction techniques that were used to improve the accuracy of prediction. Two research questions were formulated to understand the status. The first research question is: what does the existing research literature reveal about the design and development of SPM assessment using machine learning techniques? The second research question is: how efficiently and successfully can these techniques be applied?

III. DESIGN & DEVELOPMENT REVIEW

The review is divided into two categories, which are ML methods and other methods used in software development techniques and risk assessment.

A. Studies Conducted on Machine Learning Methods

Selected works were classified into broad categories depending on the ML methods used in software development techniques and risk assessment. This section covers fifteen articles on ML methods that used many algorithms in software development techniques.

The goal of Lochmann, et al. [9] is to make it as quick and convenient as possible to implement and incorporate quality models in the development process. The authors propose continuous quality control to provide developers with

978-1-7281-7310-8/20/$31.00 ©2020 IEEE

immediate feedback when quality problems are introduced. This includes an ongoing assessment of quality. The aim is, therefore, to make the assessment quick and effortless using a high-quality model. Furthermore, the authors concentrate on analyzing the trade-off between broad, robust quality models and more narrowly oriented quality models that use a specific range of quality assurance steps.

Shivhare and Rath [10] propose a solution for estimation from non-quantitative data that is applied in two steps and is based on ML techniques. The first phase selects the optimum feature set from high-dimensional information related to past projects; for feature reduction, an analysis is performed with rough set theory. The second phase estimates the effort based on the optimum feature set from the first phase. The estimation is carried out separately with the Naive Bayes classifier and Artificial Neural Networks (ANN). Compared with the neural network, the Naive Bayes classifier obtained better estimation performance.

Dehghan, et al. [11] merge the most reliable of the previously developed models to create a more accurate hybrid model. The hybrid framework uses three different attribute sets: (1) attributes based on early metadata, (2) naming, and (3) program job definition. Compared with previous models, the model is also suitable for a wider range of tasks, as it is not limited to a single kind of data source that is not always available.

Hu, et al. [12] provide an analytical model that can forecast and track the risk involved in software development overall, rather than relying solely on one specific factor: project performance. In this analysis, a structured risk assessment model was first developed, and actual cases were then obtained from app developers to establish a risk forecasting model. Two ML algorithms, ANN and Support Vector Machine (SVM), are compared to test the efficiency of the model. The experiments demonstrate that the SVM-based risk prediction model performs better.

Fitzgerald, et al. [13] offer a tool-implemented architecture that constructs automatic failure analysis models using ML classification algorithms and evaluate the results of various techniques on Firefox and Netbeans. The evaluation is based on a cost-benefit model to determine the value of additional upfront analysis. In this model, the value of more upfront analysis depends on the probability that it would avoid failures and on the potential cost of the failures concerned. It is shown that, for realistic estimates of these two parameters, automatic prediction models have more utility than a series of baselines for such failure types and projects.

Ramaswamy, et al. [14] propose an empirical analysis of many projects developed in different software industries, using the K-means technique to cluster them by defects. The paper offers a clustering method for grouping projects according to project performance. The result is then used to build a classifier model based on K-fold cross-validation with different classification algorithms. Random forest achieves the best precision compared with the other classifiers for the specified data set. This method of project management, based on effective data mining technology, ensures that a company's success rate is predicted accurately.

Manalif, et al. [15] incorporate fuzzy reasoning, which can handle scenario uncertainty and linguistic factors, to boost the responsiveness of project risk evaluation tools with the aid of Expert COCOMO. The approach suggested is a simpler way that can be used to assist a project manager or senior staff to make connections with projects that depend on decisions.

Rana and Staron [16] give a general framework for applying machine learning to assess and forecast quality in large software organizations. ISO 15939 illustrates how various software indicators can be used to build software quality models, which can in turn be used to analyze and forecast output and to satisfy the organizations' knowledge requirements in terms of output. An example using a bottom-up approach is given, in which ML techniques such as ANN may be used to determine which quality category a specific software module or component falls into at a given moment during its development.

Tariq, et al. [17] present a general description of effort estimation using different ML methods that follow the latest trends. Adopting the latest approach helps reduce cost and effort. The efficiency of the proposed system is evaluated by estimating project effort and comparing results on criteria such as the right amount, mean absolute error, root mean absolute error, and relative absolute error.

Iwata, et al. [18] provide classifications for embedded software development projects using an ANN and SVM. The paper discusses the effect of classification on the estimation of software development effort. It is especially necessary to estimate the effort requirements for new software projects. Because outliers are damaging to estimates, many estimation models exclude them. However, these outliers can be identified in practice only once the projects have been finalized, and therefore they should not be excluded during model creation and when estimating the required effort.

Minku and Yao [19] formulated the development of software effort estimation (SEE) models as a multi-objective learning problem. To help explain the trade-offs between different performance measures when constructing SEE models, a multi-objective evolutionary algorithm (MOEA) is used to optimize these measures concurrently. The study shows that the performance measures behave very differently and sometimes even show opposite trends. They are then used as a source of diversity to create SEE ensembles. A good trade-off between different measures can be achieved using an ensemble of MOEA solutions.

Hu, et al. [20] discuss how overlooking a project failure costs somewhat more than misclassifying a successful project as a failure. In software project risk prediction, ensemble learning, a technique that is well known for improving prediction performance in other areas, has still not been fully studied. This study seeks to close these research gaps by examining methods for cost-sensitive analysis and classification. Comparative analyses with t-tests, using 327 samples of outsourced software projects across 60 different risk prediction models, indicate that the ideal model is a homogeneous ensemble of decision trees (DT) based on bagging. DT on its own, however, was less accurate than the support vector machine (SVM), though it fared better in cost-sensitive analysis within the proposed context.

The review by Pospieszny, et al. [21] is intended to overcome these shortcomings and to close a gap, in the initial project development life cycle, between up-to-date results and their future application in the operation of reliable ML algorithms. This was achieved with the ISBSG dataset, intelligent data processing, and a hybrid averaging of three ML algorithms with cross-validation. For organizations designing or adopting information programs, the templates for
effort and duration estimates obtained are intended to serve as a decision support resource.

Joseph [22] aims to create an ML-based risk stimulator system that provides the most relevant risk prompts for a software development project based on specifications, scenarios, and taxonomy tags. The reduction in the number of prompts for a "keyword" search is 50 percent. Developing the ML algorithm to generate risks needs less scenario, taxonomy, and specification data, but requires an advanced array of distributed ML jobs. As the taxonomies are assumed to be independent of risk triggers, the analysis can be carried out independently for each of these taxonomies.

The ANN-COA model was proposed by Desai and Mohanty [23]; that is, an ANN trained with a cuckoo optimization algorithm (COA) to forecast software cost estimates. The primary objective is to show the use of a new ANN learning process to achieve better software cost estimation. The results are reported as root mean square error and mean magnitude of relative error.

Nassif, et al. [24] use the industrial datasets of the International Software Benchmarking Standards Group (ISBSG) Release 11 to perform an objective comparison between neural network models. The ISBSG Release 11 data sets contain over 5,000 projects from around the globe. The main features/issues of the ISBSG datasets are: 1. The ISBSG projects were built with different programming languages and numerous development life-cycle models, for different platforms. 2. Various software size measures are used, including SLOC, IFPUG, and COSMIC. Estimating software development effort is one of SPM's key activities.

The architecture introduced by Schleier-Smith [25] delivers tailored feedback to millions of people, answers queries within a second, and takes in fresh knowledge within seconds. It runs in production at volume, delivers top performance from 10,000,000 sub-second reaction times, and incorporates the latest features within a few seconds in real time. The program relies on innovative feature development, focusing on letting data scientists develop improvements that produce useful signals from raw information.

Volf and Shmueli [26] explain how project gating systems, a basic category of CI systems, work and how they are used to keep the build line clean. They propose and test three heuristics for the gating system's processing of submissions. The first is that every submission is always screened, which is shown to have an adverse impact on output. The second is to vary the processing of submissions, using the success rate as a measure of the proportion of submissions to be screened. This is shown to give a good outcome through the use of screening at a high success rate and continuous screening at a low rate. The third and last heuristic leverages ML to optimize the selection.

Liyi, et al. [27] have developed a project risk assessment model based on the LS-SVM. The simulation reveals that the predicted result of the SVM is very practical. The Least Squares SVM (LS-SVM) method was used to study the project risk estimation model. Expert risk evaluation data are used to train the developed LS-SVM regression model to learn the mapping between the risk and its characteristics. On the test samples, the built LS-SVM and the BP neural network were compared. The result demonstrated that the LS-SVM model is highly accurate and generalizes well.

To predict the duration of software enhancement, Lopez-Martin, et al. [28] suggest the use of two forms of SVR known as ε-SVR and ν-SVR. For the training and development of ε-SVR and ν-SVR, two data sets of software projects were used. The estimation precision of the SVRs was then compared with that of linear regression. Based on statistical tests, the results showed that an SVR with a linear kernel was statistically better than the statistical regression model when software projects were enhanced on the Mid-Range platform and coded in third-generation programming languages.

Chou, et al. [29] present ESIM, an integrated hybrid intelligence model that combines a fast messy genetic algorithm (fmGA) with an SVM. The SVM primarily provides learning and curve fitting, while the fmGA reduces errors to a minimum. The empirical findings in this study suggest that the proposed ESIM, compared with ANN and SVM, offers manufacturing companies superior precision, a shorter training period, and less overfitting when estimating in the early phase of an ERP software development project. The key aim of this work was to forecast ERP software development effort by implementing an Evolutionary SVM Inference Model (ESIM).

To define the relevant metrics to be used at a particular time, Dahab et al. [?] propose to analyze and classify measurements at runtime using an SVM learning method. A mechanism is defined for recommending which measures to incorporate, either selecting metrics from the current plan or reorienting the measurement program. This improves measurement plans by making metrics flexible, which is an essential requirement for project managers. It enables specific useful measurements to be targeted and avoids measurements that are not always relevant during a given time frame.

For applications with a 4 percent average defect rate, Koroglu, et al. [30] produce a prediction model with high accuracy. The model uses Random Forest, which has greater predictive ability in this setting than Naive Bayes, Logistic Regression, and Decision Trees. The goal of this project is to incorporate defect prediction into the continuous software development process and the testing effort. The method is described in the context of a journal of practice. In particular, data were obtained from seven older software releases, and additional features were used to estimate the defects of new releases. The most reliable defect indicator in the business was compared with different classification methods, including Naive Bayes, Decision Trees, and Random Forest, and the training results were re-examined.

Petkovic, et al. [31] propose an approach to the evaluation and, most notably, the prediction of the outcomes of software engineering collaboration, based on evidence from a joint software engineering unit. They provide the following contributions: a) a comprehensive team activity assessment data set, including over 40 objective and measurable measurements collected from student teams working on class projects, is described in detail for the first time; b) an ML system relates team activities to team outcomes by using the Random Forest algorithm to predict team members.

Alsri, et al. [32] propose a neural network-based task allocation model, in which a suitable site can be identified and then located for a specified task. The sites identified may be used as an alternative to the appropriate site or as alternate places to carry out different activities concurrently. The suggested model offers project managers a set of possible locations from which to choose the appropriate Global Software Development sites for particular tasks. The suggested task allocation model is also analyzed and tested in conjunction with other methods. The alternate sites are chosen using a methodology identical to that used in the k-NN algorithm,
centered on the Euclidean distance of their N-dimensional vectors.

Basgalupp, et al. [33] propose the use of an evolutionary algorithm, LEGAL-Tree, which evolves decision trees tailored to a collection of software effort data provided by a large, global IT company. The results indicate that the evolutionarily induced decision trees statistically outperform greedily induced ones and conventional logistic regression.

Bakır, et al. [34] analyze, in the embedded context, the homogeneity of cost data across the device domains. To this end, they develop three experimental setups to compare cross-domain-data models with domain-data models. While the first experiment uses domain data as a baseline, the second and third experiments deal with cross-domain data with different training sizes. Different ML techniques were applied in each setup to make the results independent of the techniques used.

Helming, et al. [35] introduce a mixture of methods for semi-automatic release planning and semi-automatic role assignment. The issue of iteration planning is thus addressed with a semi-automated two-dimensional approach: a genetic algorithm is proposed for optimizing the resulting plans in both dimensions, and a neural network is proposed for building a list ordered by expertise for every requirement. A genetic algorithm is also used to optimize the release plan. The combined result is an iteration plan that includes the information on which developer works on which requirement.

Choetkiertikul, et al. [36] propose a prediction model for story points based on two powerful deep learning architectures: long short-term memory and recurrent highway network. The prediction system is trained end-to-end, from raw input data to prediction outcomes, without manual feature engineering. The authors offer a comprehensive dataset for story-point-based estimation with 23,313 issues from 16 open-source projects. The empirical assessment shows that the approach consistently outperforms three common baselines and six alternatives in mean absolute error, median absolute error, and standardized accuracy.

B. Other Methods

This section reviews the other methods and how they are used. These articles were divided into various topics and applications. Selected works were classified into broad categories depending on the methods used in software development techniques and risk assessment.

Niinimäki, et al. [37] described the research design for a study of communication in agile software development projects. The aim is to analyze various methods of data collection and interpretation for agile software testing and, finally, to incorporate these approaches in a more detailed study on the application of agile engineering techniques in global software development projects. The research is also expected to gather data from the teams' issue trackers, assignment software, and version control systems. Based on such details, the authors hope that the association between behavior, psycho-subjective variables, and project performance can be analyzed.

Gousios and Zaidman [38] explain how project proposals were chosen and how the chosen features were evaluated, and introduce an ML package for the R statistical environment. In the pull-based development model, the main repository of the project is not shared between potential contributors; instead, contributors fork and independently alter the repository. When a set of changes is ready, a pull request is made, which specifies the local branch that is to be merged with a branch in the main repository.

To forecast cost slippage using data mining techniques, Makris, et al. [39] propose a classification model. The model uses the budget and schedule from the initial planning of an ICT project and then forecasts the project's cost slippage category. There are three categories: a small slip is deemed normal, a medium slip needs to be recognized, and a big slip needs intervention. The authors describe how data mining techniques may be used to construct a classification model to estimate cost slippage. The suggested model takes data from a small range of project parameters and classifies a project into one of three groups.

Gouthaman and Sankaranarayanan [40] discussed the need to build sophisticated systems such as IoT, Fog, and cloud using suitable methods, as well as the value of agile methodology. Consequently, a framework incorporating design and a structure for risk analysis was suggested to further this goal. The proposed system also includes machine learning techniques, which prove more conducive to a continuous system. By comparison, when the existing business risk assessment measures are applied to IoT, the outcomes cannot be sufficient. IoT-based software indicators will also be used for greater efficiency.

De Andrés, et al. [41] recommend that a robust quantile regression method be routinely incorporated to account for risk aversion, whereby more risk-sensitive decision-makers rely on higher-order quantiles of the relationship between project size and effort. Economies of scale can be seen in a cubic quantile regression model. The method is illustrated with an empirical application to a real project database. The authors show that higher-order regression quantiles may differ dramatically from conditional medians and demonstrate that shifting or multiplying an average estimate is a naive expedient for taking risk aversion into account.

Pa and SNSVSC [42] cascade fuzzy logic controllers in two and six stages for software effort estimation. The proposed work improves the performance of the fuzzy logic controllers by cascading them while keeping the rule base to a minimal size. The result is a cascade of fuzzy logic controllers. The exact number of cascaded fuzzy logic controllers can only be found by trying it out. A subtractive clustering mechanism was suggested to resolve this. A further reduction in the rules is obtained for models produced through subtractive clustering. In terms of rule minimization and different evaluation criteria, fuzzy models developed using subtractive clustering provide better estimates of software development effort.

Nassif, et al. [43] put forward a method referred to as "regression fuzzy logic." Advanced performance measures such as standardized accuracy, effect size, and mean balanced relative error are used for model evaluation and statistical tests. Models were trained and evaluated using the international ISBSG data set of industrial projects. Tests showed that model performance was influenced by data heteroscedasticity. It was noticed that fuzzy logic models were very sensitive to outliers.

Huang [44] suggested a CBR approach based on clustering mechanisms (CBR-C) for software project development, to provide an effective cost estimate. The aim of the CBR-C method is to reduce estimation error and time, and to allow managers to understand estimation processes quickly. To
identify appropriate apps required by managers to evaluate the critical functions, the clustering approach is used to index cases. In a new situation, the classifier identifies it first. Next, the two most similar cases are retrieved, and the cost estimator takes both cases together with the new case as input. The analysis indicates that the suggested CBR-C method offers a strong framework for estimating the costs of software development projects.

Wang, et al. [45] established an Earth-Moon software project risk management model that takes into account software development characteristics, in line with software project lifecycle theory. The paper generalizes two forms of life cycle in risk management for software projects, analyzes their relationships, and then introduces the 'Earth-Moon' philosophical paradigm of software project risk management. It also analyzes and solves the key technologies involved in the new model.

Amasaki, et al. [46] use Cross-Project Defect Prediction (CPDP) methods to identify and quantify the effects of data simplification (DS). Experiments were conducted to compare the predictive ability of CPDP with and without data simplification. The data simplification framework was introduced for software effort estimation using an adaptive learning approach. The experiments included 44 OSS projects, 4 predictive models, and two CPDP methods, namely Burak filter selection and cross-project selection. Data simplification greatly enhanced the prediction results of cross-project selection.

To forecast software effort using the use case point method, Nassif, et al. [47] have suggested a Treeboost model. Software size, productivity, and complexity are the inputs of the model. A multiple linear regression model was created, and the Treeboost model was evaluated against the multiple linear regression model and the use case point model using four performance criteria: MMRE, PRED, MdMRE, and MSE. The Treeboost model can be used to estimate software effort with encouraging performance.

IV. DISCUSSION AND RECOMMENDATIONS

In this paper, the use of ML techniques and their application in software project risk assessment are reviewed, including the current evaluation of software project risk. Secondly, a taxonomy can indicate study gaps. The mapping of the literature on applications for project risk assessment into separate categories highlights areas of weak and strong research coverage. The taxonomy in this research demonstrates, for example, the significance of classes of individual applications in analysis and evaluation at the cost of consolidated approaches and systems, and growth activities, obstacles to using such technologies effectively, and the guidance about how such issues can be alleviated.

This section presents some recommendations for solving the issues and challenges in software project risk assessment using the ML technique (see Fig. 1).

A. Recommendations for Software Effort Estimation

For effective software projects, estimating software effort is quite critical. Precise estimates of effort are necessary if the project schedule and expenditure are to be correct. To preserve their business position, a specific budget is essential to tech firms.

Clustering methods have several potential avenues for future work. In addition to the suggestion of a more structured upgrade methodology for cross-company projects, other clustering approaches, base learners, project input attributes, clustering software features, parameter values, and (automated) tuning procedures can be discussed [48].

B. Recommendations for Expert Based Measures

Identifying Expert Stakeholders. The know-how of a project, which includes expense, time, distance, efficiency, and resource management, is based on the expert's understanding, skills, experience, and intuition (which often contributes to over-optimistic estimates). This reflects collections of procedures, strategies, and activities that produce and accumulate the information required to carry out a project. There are two levels of information relevant to measures: micro and macro.

Software Process Model Recommendation Method. At the early stage of the development phase, project managers are encouraged to select the right software life-cycle model for a current project based on existing information technology evidence. First, the method casts process model recommendation as a classification problem. Second, it analyzes the different variations of the alternative classification and selection algorithms and then uses the recommendation model with historical software development data to predict the process model for a new software project with just a few data [50].

The Architecture of Personalized Recommendations Event History. This recommendation provides an interface that allows teams of data scientists to successfully participate in ML initiatives. An open-source implementation such as Antelope integrates the concept of event history, provides a versatile software engineering framework, and uses many ML resources along with the technology built there. It is capable of deeper integrations, including those with common data management systems, which only have to implement an interface [51].
C. Recommendations for Management Software Process
(as is expressed in the abundance of their categories).
Following an appropriate analysis, the taxonomy further Agile software development initiatives carried out in
demonstrates the absence of studies on the production of broader project settings frequently have trouble achieving
project risk evaluations. The literature is a representative performance, given overall system requirements that are
study. Studies in this area seek to grow ML or share their deemed reasonable. This may also be correlated with agile
knowledge. To cope with new trends and to strengthen communities that clash with non-agile crops contributing to
inactive areas, statistical data on individual taxonomy tension between agile and non-agile values[52]. The trends
categories identify sectors engaged in the ML technique. can assist in defining disputes described above and include a
The taxonomy proposed in this review is a common short-term solution for short-term changes and a long-term
language, similar to taxonomies in other fields, for researchers strategy, to resolve disputes of values by utilizing Agile
to convey and discuss emerging works such as development strategies. Different societies converge at this stage and
papers, comparative studies, and software project risk therefore communicate and various belief structures encounter
assessment review using ML technology. The research show one another. Such relationships may be fiscal, technical,
facets of the literature content: the reasons behind the creation functional, organizational, or communicative.
of digital project risk management utilizing ML technology,
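The four accuracy criteria named for the Treeboost comparison [47] (MMRE, MdMRE, PRED, and MSE) are standard in effort estimation and can be sketched in a few lines of Python. This is an illustrative sketch only, not code from any of the reviewed studies: the function names `mre` and `effort_metrics` are hypothetical, and the PRED threshold of 0.25 is an assumed, commonly used convention rather than a value stated in this review.

```python
from statistics import mean, median


def mre(actual, predicted):
    """Magnitude of relative error for one project: |actual - predicted| / actual."""
    return abs(actual - predicted) / actual


def effort_metrics(actuals, predictions, level=0.25):
    """Return MMRE, MdMRE, PRED(level) and MSE for paired effort values.

    `level` is the PRED threshold; 0.25 is an assumed default.
    """
    pairs = list(zip(actuals, predictions))
    errors = [mre(a, p) for a, p in pairs]
    return {
        "MMRE": mean(errors),                      # mean magnitude of relative error
        "MdMRE": median(errors),                   # median magnitude of relative error
        "PRED": sum(e <= level for e in errors) / len(errors),  # share within threshold
        "MSE": mean((a - p) ** 2 for a, p in pairs),            # mean squared error
    }


if __name__ == "__main__":
    # Made-up effort values (e.g. person-hours), for illustration only.
    actual_effort = [100, 200, 400, 50]
    predicted_effort = [110, 150, 400, 80]
    print(effort_metrics(actual_effort, predicted_effort))
```

Lower MMRE, MdMRE, and MSE indicate better estimates, while a higher PRED indicates that more projects were estimated within the chosen tolerance.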


Value-Based Conflicts of Agile Software. Agile processes do not stand on their own. The literature suggests that the organization is a vital success factor and has to become an agile organization. Little can be found in the literature on what can be done when an agile project cannot influence or alter its project environment, and there is likewise no advice on how to track and control such conflicts and interferences. The methodology by Pechau [53] combines agile software development, value-based conflicts, and project structures, and defines the challenges encountered and their solutions. The perspective is that of an agile software development team that operates in a non-trivial project setting which it cannot substantially influence or alter. The value structures of certain individuals and organizations are part of this environment, and it is thus important to analyze agile principles.

Automated Estimation Method for Agile. An automated story card estimation system effectively applies current ML algorithms to historical human estimation data [54]. This "Auto-Estimate" method enhances planning poker, the most common manual estimation method in agile environments [55]. Auto-Estimate uses features extracted from story cards in an agile environment. Its goals are to: (a) increase the accuracy of estimation by reducing the impact of poor estimates; (b) demonstrate that automated estimation improves on planning poker in the later part of the project; (c) establish the importance of writing properly structured story cards; and (d) create a simple estimation tool that requires no data collection beyond the information already stored.

Project Control Through Computational Intelligence Methods. Project control involves managing numerical and linguistic data, noise caused by measurement errors, people's subjective assessments, and vague concepts in decision-making. The review in [56] also examines schools and technological tools for managing projects, as well as open-source software for applying computational intelligence techniques, over the past decades. Current tendencies and improvement areas, valuing niche markets with high applicability around the thematic goal, are also analyzed [56]. The contribution relates to the predicted need to construct new models and IT tools for project control that integrate ML-based approaches and the treatment of imprecision, vagueness, or uncertainty in the information, using key performance indicators linked to fundamental knowledge areas. The implementation of new libraries for learning evaluation in project control with open-source software tools opens a field of research related to increasing technological integration with IT project management tools.

Strategic Release Planning (SRP). SRP is a crucial phase in iterative software development. SRP requires assigning features or requirements to releases while considering hard and soft constraints such as time, effort, cost, or resources. The SRP-Plugin is a good illustration that software delivered as plug-ins for a commonly used application helps boost development process efficiency [57]. The plug-in enhances the rich Visual Studio environment with advanced release planning capabilities, which improve the efficiency of release planning, boost productivity, and strengthen communication among project stakeholders. SRP-Plugin extends Visual Studio with ReleasePlanner™'s efficient release plan generation capability, which rests on a sound and comprehensive systematic methodology. However, the generation of release plans is just the first step in a wider vision that seeks to provide greater decision support in the area of strategic release planning.

D. Recommendations for Risk Prediction

Risk assessment is essential for increasing the success rate of software-related initiatives irrespective of their area of operation. Customer needs are still not addressed in proven programs that undergo thorough monitoring [58]. The evaluation of software risks is a key element in determining the success or failure of a project. Almost all businesses use specialized tools to thoroughly identify, reduce, and eliminate risk.

Intelligent Risk Prediction Model. How can a high-risk project be identified in time? Most existing models assume that all misclassification costs are equal. This is not consistent with reality: in software project risk prediction, the cost of predicting a fail-prone project as success-prone differs from the cost of predicting a success-prone project as fail-prone. To the best of our knowledge, the cost-sensitive learning method has not yet been applied to outsourced software project risk management, although it has been widely used in a variety of fields [59]. There are two main research gaps in software project risk prediction models. First, there is little research on risk prediction models specifically for outsourced software projects. Second, although there is plenty of research on software project risk prediction, none of it introduces the cost-sensitive learning method.

The Risk Predictive Models. Bayesian networks are important methods in requirements engineering for the automation of risk assessment. The accuracy of the Bayesian models obtained is calculated and compared using multiple classification measures. A tree-augmented network structure shows good experimental performance across all data sets. The relationships obtained between the variables, in addition to those defined by requirement engineers, are used to assess the degree of risk under a given condition. In software development, risk management aims to identify, measure, plan for, and respond to possible threats to prevent their effect on the software project.

V. CONCLUSIONS

Considerable research has been conducted on ML approaches in software project risk assessment. The distribution of work has been constant over the years. The principal ML methods used for automated effort estimation are ANN, fuzzy logic, genetic algorithms, and regression algorithms. Accurate effort estimation is one of the main software development practices; it directly affects the duration and complexity of software projects. This paper aims to offer observations through the study and taxonomisation of similar works. Specific patterns can be drawn from the different works on software effort estimation. These works focus approximately on the types of academic findings and


reports on ML science. An in-depth review of these articles would help the software project risk review with

[Figure: a central node, "Software Project Risk Assessment Recommendation", linked to four categories of recommendations:
Recommendations for Software Effort Estimation: The Hybrid Model; Use Case Point method; Clustering Methods.
Recommendations for Management Software Process: Value Based Conflicts of Agile Software; Automated Estimation Method for Agile; Project Control Through Computational Intelligence Methods; Strategic Release Planning (SRP).
Recommendations for Expert Based Measures: Identifying Expert Stakeholders; Software Process Model Recommendation Method; The Architecture of Personalized Recommendations Event History.
Recommendations for risk prediction: Intelligent Risk Prediction Model; The Agile Based Software Management; The Risk Predictive Models.]

Fig. 1. Categories of recommendations for Using Software Project Assessment

the ML approaches to define and explain the threats, advantages, and recommendations. Although the literature on project management explains the success and failure of projects, there is a long tradition of disagreement over how project success should be measured. Research has yet to explore the estimation of software effort based on machine learning approaches that focus on risk assessment. Another factor is the use of collaborative filtering methods to minimize the problem by creating neighborhoods of similar stakeholders and to predict whether a stakeholder is aware of the issue.

ACKNOWLEDGMENT

This research is generously funded by Universiti Tenaga Nasional (UNITEN) and UNITEN R&D Sdn. Bhd, under TNB Seed Grant project U-TC-RD-19-10 titled “Smart Software Project Management Assessment System”.

REFERENCES

[1] H. G. Gemünden, "Success factors of global new product development programs, the definition of project success, knowledge sharing, and special issues of project management journal®," Project Management Journal, vol. 46, pp. 2-11, 2015.
[2] S. W. Hughes, D. D. Tippett, and W. K. Thomas, "Measuring project success in the construction industry," Engineering Management Journal, vol. 16, pp. 31-37, 2004.
[3] PMI, "A guide to the project management body of knowledge (PMBOK guide)," Project Management Institute, 2013.
[4] L. J. Kirsch, "Software project management: An integrated perspective for an emerging paradigm," Framing the Domains of IT Management: Projecting the Future... Through the Past, pp. 285-304, 2000.
[5] A. M. Aladwani, "IT project uncertainty, planning and success," Information Technology & People, 2002.
[6] G. R. Cates and M. Mollaghasemi, "The project assessment by simulation technique," Engineering Management Journal, vol. 19, pp. 3-10, 2007.
[7] V. S. Parsons, "Project performance: How to assess the early stages," Engineering Management Journal, vol. 18, pp. 11-15, 2006.
[8] Y. Rosenfeld, "Root-cause analysis of construction-cost overruns," Journal of construction engineering and management, vol. 140, p. 04013039, 2014.
[9] K. Lochmann, J. Ramadani, and S. Wagner, "Are comprehensive quality models necessary for evaluating software quality?," in Proceedings of the 9th International Conference on Predictive Models in Software Engineering, 2013, pp. 1-9.
[10] J. Shivhare and S. K. Rath, "Software effort estimation using machine learning techniques," in Proceedings of the 7th India Software Engineering Conference, 2014, pp. 1-6.
[11] A. Dehghan, K. Blincoe, and D. Damian, "A hybrid model for task completion effort estimation," in Proceedings of the 2nd International Workshop on Software Analytics, 2016, pp. 22-28.
[12] Y. Hu, X. Zhang, X. Sun, M. Liu, and J. Du, "An intelligent model for software project risk prediction," in 2009 International Conference on Information Management, Innovation Management and Industrial Engineering, 2009, pp. 629-632.
[13] C. Fitzgerald, E. Letier, and A. Finkelstein, "Early failure prediction in feature request management systems," in 2011 IEEE 19th International Requirements Engineering Conference, 2011, pp. 229-238.
[14] V. Ramaswamy, V. Suma, and T. Pushphavathi, "An approach to predict software project success by cascading clustering and classification," 2012.
[15] E. Manalif, L. F. Capretz, A. B. Nassif, and D. Ho, "Fuzzy-ExCOM software project risk assessment," in 2012 11th International Conference on Machine Learning and Applications, 2012, pp. 320-325.
[16] R. Rana and M. Staron, "Machine learning approach for quality assessment and prediction in large software organizations," in 2015 6th IEEE International Conference on Software Engineering and Service Science (ICSESS), 2015, pp. 1098-1101.
[17] S. Tariq, M. Usman, R. Wong, Y. Zhuang, and S. Fong, "On learning software effort estimation," in 2015 3rd International Symposium on Computational and Business Intelligence (ISCBI), 2015, pp. 79-84.
[18] K. Iwata, T. Nakashima, Y. Anan, and N. Ishii, "Effort estimation for embedded software development projects by combining machine learning with classification," in 2016 4th Intl Conf on Applied Computing and Information Technology/3rd Intl Conf on Computational Science/Intelligence and Applied Informatics/1st Intl Conf on Big Data, Cloud Computing, Data Science & Engineering (ACIT-CSII-BCD), 2016, pp. 265-270.
[19] L. L. Minku and X. Yao, "Software effort estimation as a multiobjective learning problem," ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 22, pp. 1-32, 2013.
[20] Y. Hu, B. Feng, X. Mo, X. Zhang, E. Ngai, M. Fan, et al., "Cost-sensitive and ensemble-based prediction model for outsourced software project risk prediction," Decision Support Systems, vol. 72, pp. 11-23, 2015.
[21] P. Pospieszny, B. Czarnacka-Chrobot, and A. Kobylinski, "An effective approach for software project effort and duration estimation with machine learning algorithms," Journal of Systems and Software, vol. 137, pp. 184-196, 2018.


[22] H. R. Joseph, "Poster: Software Development Risk Management: Using Machine Learning for Generating Risk Prompts," in 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, 2015, pp. 833-834.
[23] V. S. Desai and R. Mohanty, "ANN-Cuckoo Optimization Technique to Predict Software Cost Estimation," in 2018 Conference on Information and Communication Technology (CICT), 2018, pp. 1-6.
[24] A. B. Nassif, M. Azzeh, L. F. Capretz, and D. Ho, "Neural network models for software development effort estimation: a comparative study," Neural Computing and Applications, vol. 27, pp. 2369-2381, 2016.
[25] J. Schleier-Smith, "An architecture for Agile machine learning in real-time applications," in Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015, pp. 2059-2068.
[26] Z. Volf and E. Shmueli, "Screening heuristics for project gating systems," in Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, 2017, pp. 872-877.
[27] M. Liyi, Z. Shiyu, and G. Jian, "A project risk forecast model based on support vector machine," in 2010 IEEE International Conference on Software Engineering and Service Sciences, 2010, pp. 463-466.
[28] C. Lopez-Martin, S. Banitaan, A. Garcia-Floriano, and C. Yanez-Marquez, "Support vector regression for predicting the enhancement duration of software projects," in 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), 2017, pp. 562-567.
[29] J.-S. Chou, M.-Y. Cheng, Y.-W. Wu, and C.-C. Wu, "Forecasting enterprise resource planning software effort using evolutionary support vector machine inference model," International Journal of Project Management, vol. 30, pp. 967-977, 2012.
[30] Y. Koroglu, A. Sen, D. Kutluay, A. Bayraktar, Y. Tosun, M. Cinar, et al., "Defect prediction on a legacy industrial software: A case study on software with few defects," in 2016 IEEE/ACM 4th International Workshop on Conducting Empirical Studies in Industry (CESI), 2016, pp. 14-20.
[31] D. Petkovic, M. Sosnick-Pérez, S. Huang, R. Todtenhoefer, K. Okada, S. Arora, et al., "Setap: Software engineering teamwork assessment and prediction using machine learning," in 2014 IEEE Frontiers in Education Conference (FIE) Proceedings, 2014, pp. 1-8.
[32] A. Alsri, S. Almuhammadi, and S. Mahmood, "A model for work distribution in global software development based on machine learning techniques," in 2014 Science and Information Conference, 2014, pp. 399-403.
[33] M. P. Basgalupp, R. C. Barros, and D. D. Ruiz, "Predicting software maintenance effort through evolutionary-based decision trees," in Proceedings of the 27th Annual ACM Symposium on Applied Computing, 2012, pp. 1209-1214.
[34] A. Bakır, B. Turhan, and A. B. Bener, "A new perspective on data homogeneity in software cost estimation: a study in the embedded systems domain," Software Quality Journal, vol. 18, pp. 57-80, 2010.
[35] J. Helming, M. Koegel, and Z. Hodaie, "Towards automation of iteration planning," in Proceedings of the 24th ACM SIGPLAN conference companion on Object oriented programming systems languages and applications, 2009, pp. 965-972.
[36] M. Choetkiertikul, H. K. Dam, T. Tran, T. Pham, A. Ghose, and T. Menzies, "A deep learning model for estimating story points," IEEE Transactions on Software Engineering, vol. 45, pp. 637-656, 2018.
[37] T. Niinimäki, A. Piri, P. Hynninen, and C. Lassenius, "Studying communication in agile software development: a research framework and pilot study," in Proceedings of the ICMI-MLMI'09 Workshop on Multimodal Sensor-Based Systems and Mobile Phones for Social Computing, 2009, pp. 1-4.
[38] G. Gousios and A. Zaidman, "A dataset for pull-based development research," in Proceedings of the 11th Working Conference on Mining Software Repositories, 2014, pp. 368-371.
[39] C. Makris, P. Vikatos, and J. Visser, "Classification model for predicting cost slippage in governmental ICT projects," in Proceedings of the 30th Annual ACM Symposium on Applied Computing, 2015, pp. 1238-1241.
[40] P. Gouthaman and S. Sankaranarayanan, "Agile Software Risk Management Architecture for IoT-Fog based systems," in 2018 International Conference on Smart Systems and Inventive Technology (ICSSIT), 2018, pp. 48-51.
[41] J. De Andrés, M. Landajo, and P. Lorca, "Using nonlinear quantile regression for the estimation of software cost," in International conference on hybrid artificial intelligence systems, 2018, pp. 422-432.
[42] R. S. Pa and R. SNSVSC, "Improving efficiency of fuzzy models for effort estimation by cascading & clustering techniques," Procedia Computer Science, vol. 85, pp. 278-285, 2016.
[43] A. B. Nassif, M. Azzeh, A. Idri, and A. Abran, "Software development effort estimation using regression fuzzy models," Computational intelligence and neuroscience, vol. 2019, 2019.
[44] Z.-W. Huang, "Cost Estimation of Software Project Development by Using Case-Based Reasoning Technology with Clustering Index Mechanism," in 2009 Fourth International Conference on Innovative Computing, Information and Control (ICICIC), 2009, pp. 1049-1052.
[45] Y.-H. Wang, J. Jia, and Y. Qu, "The “Earth-Moon” model on software project risk management," in 2010 International Conference on Machine Learning and Cybernetics, 2010, pp. 1999-2003.
[46] S. Amasaki, K. Kawata, and T. Yokogawa, "Improving cross-project defect prediction methods with data simplification," in 2015 41st Euromicro Conference on Software Engineering and Advanced Applications, 2015, pp. 96-103.
[47] A. B. Nassif, L. F. Capretz, D. Ho, and M. Azzeh, "A treeboost model for software effort estimation based on use case points," in 2012 11th International Conference on Machine Learning and Applications, 2012, pp. 314-319.
[48] L. Song, L. L. Minku, and X. Yao, "The impact of parameter tuning on software effort estimation using learning machines," in Proceedings of the 9th international conference on predictive models in software engineering, 2013, pp. 1-10.
[50] A. Poth, S. Sasabe, A. Mas, and A. L. Mesquida, "Lean and agile software process improvement in traditional and agile environments," Journal of Software: Evolution and Process, vol. 31, p. e1986, 2019.
[51] P. Lops, M. De Gemmis, and G. Semeraro, "Content-based recommender systems: State of the art and trends," in Recommender systems handbook, ed: Springer, 2011, pp. 73-105.
[52] S. S. M. Fauzi, N. Ramli, and M. H. N. M. Nasir, "Software Configuration Management A Result from the Assessment and its Recommendation," in 2009 International Conference on Information Management and Engineering, 2009, pp. 416-419.
[53] J. Pechau, "Rafting the agile waterfall: value based conflicts of agile software development," in Proceedings of the 16th European Conference on Pattern Languages of Programs, 2011, pp. 1-15.
[54] K. Moharreri, A. V. Sapre, J. Ramanathan, and R. Ramnath, "Cost-effective supervised learning models for software effort estimation in agile environments," in 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC), 2016, pp. 135-140.
[55] B. Prakash and V. Viswanathan, "A Survey on Software Estimation Techniques in Traditional and Agile Development Models," Indonesian Journal of Electrical Engineering and Computer Science, vol. 7, pp. 867-876, 2017.
[56] J. A. L. García, A. B. Peña, P. Y. P. Pérez, and R. B. Pérez, "Project control and computational intelligence: Trends and challenges," International Journal of Computational Intelligence Systems, vol. 10, pp. 320-335, 2017.
[57] J. G. Mohebzada, G. Ruhe, and A. Eberlein, "SRP-plugin: a strategic release planning plug-in for visual studio 2010," in Proceedings of the 1st Workshop on Developing Tools as Plug-ins, 2011, pp. 36-39.
[58] T. H. Kappen, Y. Vergouwe, L. Van Wolfswinkel, C. Kalkman, K. Moons, and W. Van Klei, "Impact of adding therapeutic recommendations to risk assessments from a prediction model for postoperative nausea and vomiting," British journal of anaesthesia, vol. 114, pp. 252-260, 2015.
[59] U. Kanimozhi, S. Ganapathy, D. Manjula, and A. Kannan, "An intelligent risk prediction system for breast cancer using fuzzy temporal rules," National Academy Science Letters, vol. 42, pp. 227-232, 2019.


[60] R. Anderson, L. S. Peranich, R. Dungca, J. P. Milana, X. Shao, P. C. Dulany, et al., "Detecting and measuring risk with predictive models using content mining," ed: Google Patents, 2008.
