
Evaluation of Software Architectures under Uncertainty:
A Systematic Literature Review

DALIA SOBHY, Computer Engineering Department, Arab Academy of Science and Technology and Maritime Transport, Egypt
RAMI BAHSOON, School of Computer Science, University of Birmingham and FRSA, UK
LEANDRO MINKU, School of Computer Science, University of Birmingham, UK
RICK KAZMAN, Information Technology Management, University of Hawaii and SEI/CMU, USA
Context: Evaluating software architectures in uncertain environments raises new challenges, which require continuous approaches. We define continuous evaluation as multiple evaluations of the software architecture that begin at the early stages of development and are periodically and repeatedly performed throughout the lifetime of the software system. Although approaches for handling dynamics and uncertainties at run-time have been developed over the past years, continuous evaluation approaches are still very few, limited, and lack maturity.
Objective: This review surveys efforts on architecture evaluation and provides a unified terminology and perspective on the subject.
Method: We conducted a systematic literature review to identify and analyse architecture evaluation approaches for uncertainty, both continuous and non-continuous, covering work published between 1990 and 2020. We examined each approach and provided a classification framework for this field. We present an analysis of the results and provide insights regarding open challenges.
Major results and conclusions: The survey reveals that most of the existing architecture evaluation approaches typically lack an explicit linkage between design-time and run-time. Additionally, there is a general lack of systematic approaches for how continuous architecture evaluation can be realised or conducted. To remedy this, we present a set of necessary requirements for continuous evaluation and describe some examples.

Additional Key Words and Phrases: Continuous Software Architecture Evaluation, Design-time Software Architecture Evaluation, Run-time Software Architecture Evaluation, Uncertainty

Reference Format:
Dalia Sobhy, Rami Bahsoon, Leandro Minku, and Rick Kazman. 2021. Evaluation of Software Architectures under Uncertainty: A Systematic Literature Review. 1, 1 (April 2021), 50 pages.
1 INTRODUCTION
Architecture evaluation is a milestone in the decision-making process. It aims at justifying the extent to which architecture design decisions meet a system's quality requirements and their trade-offs, particularly in the face of operational uncertainties and changing requirements. The evaluation can aid in early identification and mitigation of design risks; the exercise is typically done in an effort to save integration, testing and evolution costs [124]. Examples of seminal work include the Architecture Tradeoff Analysis Method (ATAM) [85] and the Cost Benefit Analysis Method (CBAM) [82].
Authors' addresses: Dalia Sobhy, Computer Engineering Department, Arab Academy of Science and Technology and Maritime Transport, Alexandria, Egypt, dalia.sobhi@aast.edu; Rami Bahsoon, School of Computer Science, University of Birmingham and FRSA, Birmingham, UK, r.bahsoon@cs.bham.ac.uk; Leandro Minku, School of Computer Science, University of Birmingham, Birmingham, UK, l.l.minku@cs.bham.ac.uk; Rick Kazman, Information Technology Management, University of Hawaii and SEI/CMU, Hawaii, USA, kazman@hawaii.edu.

Software architectures that operate in dynamic and non-stationary environments (e.g., IoT and cloud applications) require a fundamental shift in the way evaluations are conducted. This is due to unforeseen factors that may affect the evaluation, including (but not limited to) fluctuations in QoS, multi-tenancy, hyper-connectivity, and sensor ageing effects [71, 109, 130].
Though existing design-time evaluation approaches promise to evaluate flexibility in architectures under uncertainty and their responses in enabling change [19, 82, 85], in the context of highly dynamic environments these approaches tend to be limited because there may be emerging scenarios where the architect cannot rely solely on design-time evaluation. Such scenarios require a run-time evaluation to inform and calibrate the design-time decisions. In this context, a more continuous approach would benefit the evaluation process. We define continuous software architecture evaluation as multiple evaluations of the software architecture that begin at the early stages of development and are periodically and repeatedly performed throughout the lifetime of the software system. Continuous evaluation is performed either continuously or sporadically, covering either one feature (e.g. QoS) or multiple features.
There have been many research studies aimed at evaluating software architectures to deal with uncertainty, which may implicitly or explicitly adopt continuous approaches (e.g. DevOps [17]). The field has attracted a wide range of researchers and practitioners. However, continuous evaluation has not been viewed as a key area within software architecture research. We still lack a clear vision regarding the elements of a continuous software architecture evaluation approach.
In past years, many research studies have reviewed design-time architecture evaluation methods (e.g. [27, 53, 122]), while some have attempted to review run-time methods without addressing them from the context of continuous architecture evaluation (e.g. [26, 47, 93, 98, 131]). In particular, to date there is no systematic literature review of software architecture evaluation approaches for uncertainty which may implicitly or explicitly adopt continuous approaches. A systematic literature review (SLR) is a methodological means to aggregate empirical studies, to systematically investigate a research topic, answer specific research questions, and finally determine the gaps and research directions for the research topic [88, 89, 116].
The objective of this study is to (i) provide a basic classification schema which categorises software architecture evaluation approaches under uncertainty; (ii) categorise the current design-time and run-time approaches for evaluating software architectures based on this schema; (iii) determine the necessary guidelines for developing a continuous evaluation approach; and (iv) point out current gaps and directions for future research in software architectures for environments characterised by uncertainty, where we consider both design-time and run-time evaluation approaches that take into account the possibility of uncertainties in the environment where the system will operate / is operating. Concretely, we aim to provide answers for the following research questions:
(1) How can the current research on software architecture evaluation under uncertainty be categorised and what are the current state-of-the-art approaches with respect to this categorisation? The goal is to provide a categorisation of existing architecture evaluation approaches under uncertainty and classify the state-of-the-art approaches under this categorisation.
(2) What are the actions taken by these architecture evaluation approaches to deal with uncertainty? The aim of this question is to demonstrate and discuss how the existing approaches deal with uncertainty and whether these actions can contribute to developing more continuous approaches.
(3) What are the current trends and future directions in software architecture evaluation for uncertainty and their consideration for continuous evaluation? This question aims to show how researchers and practitioners can benefit from the existing approaches to draw inspiration on the essential requirements and address the pitfalls when developing a continuous evaluation approach.
The manuscript is structured as follows: Section 1.1 identifies and explains the necessary concepts to ease the understanding of the review. Section 2 describes the systematic literature review process. Section 3 provides an overview of the included studies from the chronological and distribution perspectives. Section 4 categorises the included studies with respect to a classification framework and presents the limitations of the review. Related reviews are discussed in Section 5. New trends and research directions are discussed in Section 6. Finally, Section 7 concludes the work.

1.1 Preliminaries and Basic Concepts
In this section, we list descriptions of the main concepts used in this review to ease the analysis.
1.1.1 Architecture Design Decisions. The foundation of an architecture is in the set of design decisions taken [25, 80, 137]. The architects define the possible set of candidate architectures to serve a particular concern and then, based on their experience and knowledge, choose the best candidate [35]. For example, in an IoT application, the architect could prefer processing the data in the cloud rather than on fog devices to reduce energy consumption. However, this design decision could have a negative impact on performance. This motivates the need for software architecture evaluation.

1.1.2 Software Architecture. In the literature, software architecture is defined in many ways. In our work, we use the definition introduced by ISO/IEC/IEEE 42010:2011: "the fundamental concepts or properties of a system in its environment embodied in its elements, relationships, and in the principles of its design and evolution". This definition is complementary to [115, 125] and later ones [16]. In this context, a software architecture represents the abstractions for a software system by defining its structure, behaviour, and key properties [125]. These include software components (i.e. processing and computational elements), connectors (i.e. interaction elements), and their relation to the environmental conditions [16, 115].
1.1.3 Architecture Evaluation. It is a milestone in the decision-making process. Classical approaches to architecture evaluation are generally human-centric, where architects and various stakeholders (e.g. developers, managers, etc.) are involved to evaluate the extent to which the architecture design decisions and adopted styles can meet quality attributes of interest and their trade-offs. The exercise also involves analysis of the costs and likely added value of the decisions. Classical approaches heavily rely on experts' judgement; they utilise human-generated inputs, such as scenarios, for evaluating the architecture. Evaluation is conducted at design-time and before the system is built, covering the statics of an architecture (e.g. style, structure and topology) and its dynamics (e.g. likely performance and scalability).

1.1.4 Design-time Architecture Evaluation. It is the process where humans, tools, and methods are used to reason about the architecture of the system-to-be. The evaluation can cover both static aspects of the architecture relating to structure, topology, environment, style, etc. and dynamic analysis that relates to behavioural properties of the architecture, such as performance, scalability, etc. The evaluation can heavily rely on stakeholders' involvement and their estimates. Estimation can be backed up by experts' judgement about the domain, historical data and benchmarks that relate to the likely performance of similar systems, or what-if analysis of simulated instances for the projected deployment environments and the predicted or eventual load (before the system is deployed).


1.1.5 Run-time Architecture Evaluation. It entails the execution of the architecture under study; this can be a typical execution profile or it can be the actual deployed system implementing the architecture, for the objectives of profiling, refinement or enrichment. In either case, architects can collect dynamic, near-real-time or real-time information about the performance of the quality attributes of interest to inform the evaluation or further tuning of the running system. In other cases, simulated data (e.g. QoS data) are used to capture the dynamic behaviour of architecture decisions under uncertainty at run-time, and such information is used to profile and evaluate design decisions if full deployment is expensive. The evaluation can leverage simulation tools with inputs from the running system to perform anticipatory evaluation of key design decisions and their possible variants based on the run-time contextual requirements.

1.1.6 Continuous Architecture Evaluation. It goes beyond the classical architecture evaluation approaches to include additional run-time information that can assist the evaluation and help in tuning the parameters. Several flavours can implement this category of evaluation: for example, info-symbiotic simulation1 can be linked to the architecture to simulate how an architecture could behave if implemented and deployed in a particular environment. The run-time information can then be fed into the evaluation to tune the parameters. This step can involve a self-adaptive mechanism and can leverage components of the MAPE-K loop to tune the parameters. As for the actors involved in the evaluation, these can be various stakeholders (architects, developers, etc.) and automated agents (taking the form of monitoring agents for the environment, with analysis, planning and actuation for the observed inputs, which can be automatic and/or interactive).
We see continuous architecture evaluation as including two activities: design-time and run-time evaluation. In particular, design-time evaluation can be used to support the necessary initial system design and deployment based on estimations only. After that, run-time evaluation can assist continuous architecture evaluation in monitoring QAs and suggesting re-configuration from a repository of candidate options, some of whose technical viability has been established but which require further profiling and confirmation following continuous monitoring at run-time. The recommendation can utilise learning and suggest a suitable configuration; it can also call for further refinements and/or phasing out of an existing configuration. Once the architecture is adopted, it is very expensive to change the architecture or amend its structural design. Should the architecture appear to lag behind optimality, run-time evaluation may recommend more structural changes to the architecture, which can be very expensive to deal with following deployment, unless the context is aimed at prototyping and learning through prototypes. In other words, the evaluation can also be used to repeatedly assess to what extent the architecture options created at design-time, as well as other potential architecture options, perform well at run-time. This enables architects to make informed decisions on potential changes to the architecture, so that its performance remains good over time. In other contexts, evaluations can be intertwined and interleaved between design-time and run-time. Consider, for example, modern incremental software development (e.g. DevOps, microservices), where the design of each change to the system as it evolves is again "design-time".

1 A term that is widely used by the dynamic data driven simulation system community (e.g. http://1dddas.org/InfoSymbiotics/DDDAS2020, https://sites.google.com/view/dddas-conf/home).
1.1.7 Uncertainty in Architecture Evaluation. A common issue in architecture evaluation is the presence of uncertainty. In architecture evaluation and decision-making, uncertainty is the lack of full knowledge about the outcomes of deploying the architecture options [95]. For instance, the architects may be uncertain about the effect of a proposed software architecture on benefit (e.g. performance, availability, etc.) and cost. Uncertainty may also arise due to unpredictable situations in dynamic applications, such as IoT: for instance, sensor ageing effects, varying internet connectivity and mobility of sensors, fluctuations in QoS, and so forth [1, 71, 108, 109].
Architectures can experience two sources of uncertainty: aleatory and epistemic [15, 50, 66]. The aleatory conception of uncertainty holds that uncertainty arises from variability in the possible realisations of a stochastic event, where unknown and different results could appear every time one runs an experiment under similar conditions. It is also defined as "the inherent variation associated with the physical system or environment under consideration" [111]. This type of uncertainty is more common at run-time. In other words, it comprises the uncertainties occurring in the later execution environment. For instance, in IoT systems, new types of sensors with new communication behaviour might be introduced, which do not match the workload model assumed for the system. This knowledge will only become available after running the system. The epistemic conception of uncertainty denotes uncertainty that arises due to lack of confidence in, or missing knowledge about, a fact which is either true or false. It is also defined as "uncertainty of the outcome due to the lack of knowledge or information in any phase or activity of the modelling process" [111]. This type of uncertainty is more common at design-time. In particular, it may occur due to the impact of decisions at design-time that are not yet known (e.g. designing a new way of communication without knowing yet how much performance can be improved in a distributed and parallel setup by this decision, which one needs to implement and measure to find out). In some contexts, this type of uncertainty could be partially reduced at design-time.
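The distinction can be made tangible with a small numerical illustration. The following Python sketch uses invented numbers (mean latencies, standard deviations, sample sizes) purely for exposition: aleatory variability remains even when the model is fully known, whereas epistemic uncertainty about an unknown parameter shrinks as evidence is gathered.

import random
import statistics

random.seed(0)

# Aleatory uncertainty: even with a fully known model (mean 100 ms, sd 15 ms),
# individual outcomes still vary from run to run; this variability cannot be removed.
known_mean, known_sd = 100.0, 15.0
aleatory_samples = [random.gauss(known_mean, known_sd) for _ in range(5)]
print("Aleatory (known model, varying outcomes):", [round(x, 1) for x in aleatory_samples])

# Epistemic uncertainty: the true mean latency of a new design decision is unknown
# at design-time; gathering measurements reduces this lack of knowledge.
true_mean = 90.0                      # hidden from the architect
measurements = []
for n in (1, 10, 100):
    while len(measurements) < n:
        measurements.append(random.gauss(true_mean, known_sd))
    estimate = statistics.mean(measurements)
    print(f"Epistemic estimate after {n:3d} measurements: {estimate:6.1f} ms")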
1.1.8 Quality Attribute. We adopt the definition introduced by the IEEE Standard for Software Quality Metrics [45], where a quality attribute is "a characteristic of software, or a generic term applying to quality factors, quality sub-factors, or metric values". Examples of quality attributes are performance, reliability, energy consumption, availability, security, and so forth.

1.1.9 Stakeholder. We adopt the notion used by ISO/IEC/IEEE 42010:2011: "an individual, team, organization, or classes thereof, having an interest in a system". In this context, stakeholders have a stake in the success of the architecture, and of any systems that are derived from the architecture. So this could include customers, programmers, testers, reusers, architects, integrators, users, managers, etc. An architect is just one stakeholder among many, whose needs are less important (and hence lower priority) than the needs of many of the other stakeholders.
2 SYSTEMATIC LITERATURE REVIEW PROCESS
In this section, we discuss the SLR protocol, how the systematic review process has been carried out, and finally the existing architecture evaluation approaches with respect to the criteria and review objectives.
2.1 SLR Protocol
We have followed the systematic literature review guidelines and procedures [116] and the work of [27] to develop our review protocol. In particular, the protocol identifies the objectives of the review, the necessary background, research questions, inclusion and exclusion criteria, search strategy, data extraction, and analysis of the gathered data. One author developed the review protocol and the outcome was then revised by the other authors to limit bias. The review objectives, background, and research questions are discussed in Section 1, whereas the other procedures are described below.
2.2 Inclusion and Exclusion Criteria
Initially, we needed to set up criteria to aid in the search process and the filtration of irrelevant studies. We considered English papers published in peer-reviewed journals, conferences, and workshops between 1990 and early 2020. This time frame was chosen because one of the earlier well-known architecture evaluation approaches (e.g. SAAM [83]) was published in 1994. We excluded studies that do not have software architecture evaluation as one of their main contributions. We also excluded editorials, opinion pieces, keynotes, abstracts, tutorial summaries, position papers, panel discussions, technical reports, panels and poster sessions. Moreover, we found that some studies are duplicated in different versions that appear as books, journal papers, conference and workshop papers. In this context, we included only the latest and most complete version. We provide a summary of the inclusion and exclusion criteria below. Publications are included if they cover all the inclusion criteria in Section 2.2.1, and publications are excluded if they fit any of the exclusion criteria in Section 2.2.2.
2.2.1 Inclusion Criteria.
• Studies published between 1990 and early 2020.
• Studies in the domain of software architecture evaluation. In particular, the study should include a software architecture evaluation method as one of its contributions.
• Studies that discuss architecture evaluation approaches with an explicit focus on high-level architecture design (e.g. component level, style, architecture design decisions and tactics), covering design-time, run-time and continuous evaluation; we exclude approaches which discuss low-level structural design (e.g. code and class refactoring).
• Studies that report on software architecture evaluation supported by quantitative analysis/models (e.g., using utility theory as part of ATAM; using cost-benefit analysis as part of CBAM, etc.).
2.2.2 Exclusion Criteria.
• Studies that do not explicitly consider architecture evaluation. For example, some self-adaptive system studies may make use of architecture evaluation to inform self-adaptation, but may not explicitly refer to this as architecture evaluation. Such studies were excluded.
• Studies that are not peer reviewed.
• Studies not written in English or not accessible in full-text.
2.3 Search Strategy
The search strategy was performed to identify the studies through the following:
1. Applying an initial search to determine the current systematic reviews and mapping studies, and hence identifying significantly related primary studies.
2. Using the concept of the "quasi-gold" standard, as introduced by Zhang and Babar [142], where we performed a manual scan of the most well-known venues of the software architecture and software engineering domains to cross-check the automated search results.
3. Performing several trials using different combinations of keywords derived from the main objectives of the review (i.e. automated search from recognised bibliographical data sources).
4. Performing an additional search to manually check and analyse the related references (snowballing) [140] to ensure that we did not miss any important study and hence guarantee a representative set of studies.
All the prior procedures aided us in defining valid search strings, along with the other procedures discussed in Section 2.3.1. For the venues, we manually searched the following:
− International Conference on Software Engineering (ICSE).
− International Conference on Software Architecture (ICSA)2.
− European Conference on Software Architecture (ECSA).

2 Formerly the Working IEEE/IFIP Conference on Software Architecture (WICSA) and International Conference Series on the Quality of Software Architectures (QoSA).
Our manual search included the title, keywords, and abstract of each publication. After finishing the manual and automatic searches, we checked the differences between the results to guarantee the most appropriate coverage of the domain. We found that all the manual results were a subset of the automatic results (i.e. meeting the "quasi-gold" standard).
2.3.1 Keyword Selection. As mentioned above, we used both automatic and manual search. In the automatic search, we tried several keywords on the search engines of electronic bibliographical sources. Manual search is not a practical procedure on its own, as it retrieves thousands of results, which are difficult to filter manually. However, we still performed a manual search (to meet the "quasi-gold" standard [142]) to ensure that we used the most suitable search queries.
One of the main challenges identified through our automatic search is a lack of well-defined terminology for the process of continuous architecture evaluation. As an example, some self-* systems can implicitly incorporate some principles that resemble architecture evaluation. To avoid missing any relevant studies, we used some generic keywords in the search query of the automatic search (e.g. "run-time", "dynamic", etc.). This led to retrieving some studies that were indeed relevant to our search. We also performed a manual search for studies which could be considered run-time architecture evaluation approaches. To obtain our search query, we applied the following procedures:
1. Extract the major keywords from the objectives of the review and the main research topics.
2. Determine and try different spellings, related terms and synonyms for the major keywords, if applicable.
3. Use the "advanced" search option to insert the complete search query and filter by date, if the bibliographical source allows for that (Section 2.3.2).
4. Pilot various combinations of search keywords in test queries.
5. Validate the results of (4) against the "quasi-gold" standard.
From our pilot testing, we found that the notion of "continuous" architecture evaluation is used in different forms in the context of software architecture and software engineering, with other closely related alternative terms, such as run-time and dynamic. This is because the term "continuous" is not clearly defined. We also incorporated additional keywords which may implicitly refer to continuous evaluation, such as design-time and static (i.e. the state-of-the-art approaches for architecture evaluation). Furthermore, in other contexts, architecture evaluation is interpreted as architecture assessment or architecture analysis. Therefore, we tried to consider these related keywords in our search query and used them in an interchangeable manner.
The search query is composed of five major terms, Continuous AND Software AND Architecture AND Evaluation AND Uncertainty. To generate the main search query, we used the alternate keywords listed above. This is performed by connecting these terms through logical OR as follows:
(design-time OR run-time OR design time OR runtime OR static OR dynamic OR continuous) AND Software AND (architecture OR architectural) AND (evaluation OR analysis OR assessment) AND uncertainty
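As an aside, composing such a query per database is straightforward to automate. The following is a minimal Python sketch; the term groups mirror the query above, while the helper name and the optional quoting of multi-word phrases are our own illustrative assumptions, not part of the review protocol.

# Term groups derived from the five major concepts of the query.
TERM_GROUPS = [
    ["design-time", "run-time", "design time", "runtime",
     "static", "dynamic", "continuous"],
    ["Software"],
    ["architecture", "architectural"],
    ["evaluation", "analysis", "assessment"],
    ["uncertainty"],
]

def build_query(groups, quote_phrases=True):
    """Join alternates with OR inside a group and join the groups with AND."""
    parts = []
    for group in groups:
        terms = [f'"{t}"' if quote_phrases and " " in t else t for t in group]
        clause = " OR ".join(terms)
        parts.append(f"({clause})" if len(terms) > 1 else clause)
    return " AND ".join(parts)

print(build_query(TERM_GROUPS))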
2.3.2 Bibliographical Sources. The selected databases present the most important and highest impact journals and conference proceedings. They also provided us with the ability to perform expert search with a variety of Boolean operations and to limit the search to the Title, Abstract and Keywords fields and a time frame, which returned more relevant results compared to searching all the fields. For instance, this allowed us to use Boolean "OR" to try different spellings and synonyms, and to use Boolean "AND" to link the major keywords (e.g. software AND architecture AND evaluation).
The electronic bibliographical sources used include:
− IEEE Xplorer (http://ieeexplore.ieee.org/Xplore/)
− ACM digital library (http://portal.acm.org/)
− SpringerLink (http://www.springerlink.com/)
− ScienceDirect (http://www.sciencedirect.com/)
− GoogleScholar (http://scholar.google.com/)
Note that we included Google Scholar as there are some works in software architecture evaluation (e.g. ATAM) which were not retrieved in the first four databases. We have found that Google retrieves many irrelevant results after the first pages of retrieved results. This is because Google enables retrieval of results that do not match the search query completely. Therefore, we have limited the Google Scholar results to 1000. Other works (e.g. [2]) have also limited the Google Scholar results to a specific number of pages.

Table 1. Summary of search results and included studies from each database. Note that the number of included studies listed for each database excludes studies that have already been included by a former database. A total of 48 unique studies have been included.

Database                     Search Results    # Included Studies
IEEE Xplorer                 994               11
ACM digital library          2108              8
SpringerLink                 999               3
ScienceDirect                524               5
GoogleScholar                1000              7
Other: Snowballing Process   349               14
Total                                          48
2.4 Search Execution
In this stage, we executed the search process shown in Figure 1, realising the procedures in Section 2.3. Initially, we manually searched the current systematic reviews and mapping studies (e.g. [11, 27, 53, 98, 122]) to identify significantly related primary studies (13 results). We then performed a manual search (17 results) to determine the set of studies to be compared with the automatic search list (i.e. the "quasi-gold" standard). After that, we searched through all the search engines and bibliographical sources mentioned in Section 2.3.2 using the search queries created in Section 2.3.1. All the search engines provided the option to save the results to Excel spreadsheets, except for Springer, which exports only the first 999 relevant results, and ScienceDirect, which does not have that option, so a manual scan was performed. We then filtered the 5,625 primary studies using title, abstract, full-text (when needed), and the inclusion and exclusion criteria. We also snowballed the primary studies [140], where we scanned the list of references of the primary studies and their citations to add related works (349 results) which were not identified by the bibliographical engines. In the end we included 48 studies. The search results and number of included studies from each database and the snowballing process are listed in Table 1.
Fig. 1. Search Execution.

2.5 Quality Assessment
To assess the quality of the findings, we adopted similar quality criteria to the ones used by [27]. The following criteria show the credibility of an individual study when analysing the results:
1. The study provides evidence or theoretical reasoning for its experimental evaluation and data analysis rather than relying on non-justified or ad hoc statements.
2. The study describes the context in which the research was conducted.
3. The design and implementation of the research is mapped to the study objectives.
4. The study provides a full description of its data collection process.
All 48 studies identified in the search described above met the quality assessment criteria.

2.6 Data Extraction Process
In this process, we performed a thorough scan of the 48 included papers to extract the relevant data, which were managed using Excel spreadsheets and the bibliographical management tool BibTeX. The data extraction for the 48 studies was driven by the form depicted in Table 2 and the classification framework in Section 4.1. For the data analysis, we investigated the extracted data with respect to their relationships. The results of this process are given in the subsequent sections. The list of included studies is presented in Appendix B.

Table 2. Data Extraction Criteria.

Extracted Data               Description
Study Identification         Unique ID for the study
Bibliographical references   Author, title, publication type, source and year
Study Type                   Book, journal paper, conference paper, workshop paper
Study Focus                  Main area and study objectives
Strengths and Limitations    Identified strengths and limitations of the approach and its application, and its potential for future directions

Fig. 2. Distribution of the publication types.
3 OVERVIEW OF THE INCLUDED STUDIES
Here we provide an overview of the included studies with respect to their distribution along publication channels, over the years, and their ranks.
Table 3. Distribution of included studies along the publication channels.

Publication Channel                                                                                                          No. of Studies
IEEE International Conference on Software Engineering (ICSE)                                                                 8
International Conference on Software Architecture (ICSA)3                                                                    7
Software Engineering for Self-Adaptive Systems (SEAMS)                                                                       4
Journal of Systems and Software (JSS)                                                                                        5
Book                                                                                                                         3
IEEE Transactions on Software Engineering (TSE)                                                                              2
IEEE Internet Computing                                                                                                      1
Software Quality Journal                                                                                                     1
Empirical Software Engineering                                                                                               1
European Conference on Software Architecture (ECSA)                                                                          2
IEEE International Conference on Software Maintenance (ICSM)                                                                 1
ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)       1
IEEE International Conference on Autonomic Computing (ICAC)                                                                  1
ACM/SPEC International Conference on Performance Engineering (ICPE)                                                          1
International Conference on Software Reuse (ICSR)                                                                            1
International Conference on Quality of Software Architectures (QoSA)                                                         2
IEEE International Conference and Workshops on Engineering of Computer-Based Systems (ECBS)                                  1
International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE)                                   1
IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD)   1
International Workshop on the Economics of Software and Computation (ESC)                                                    1
IEEE International Enterprise Distributed Object Computing Conference Workshops (EDOC)                                       1
Proceedings of the 3rd International Workshop on Software and Performance (WOSP)                                             1
Software and Systems Modeling (Springer)                                                                                     1
International Workshop on Software Engineering for Embedded Systems (SEES)                                                   1
Total                                                                                                                        48
3.1 Distribution of Studies over Publication Channels
Most of the included studies (i.e. 48 studies) were published in the most well-known and prominent journals and conferences. In Table 3, we provide an overview of the included studies with respect to their publication channels and the number of studies per channel. We have checked the included studies against the criteria for quality assessment and confirmed that they indeed fulfil the quality criteria introduced in Section 2.5. We have also plotted the distribution of the included studies by publication type (i.e. conference, journal, etc.) in Figure 2. From these results, we found that a significant number of studies were published in conferences (about 62%), followed by a smaller number of studies (19%) in journals. Fewer studies were published in workshops (roughly 13%) and books (about 6%). This indicates that architecture evaluation approaches are still largely presented in conferences, while some of them have matured and been published through books and journals.

3 Formerly the Working IEEE/IFIP Conference on Software Architecture (WICSA) and International Conference Series on the Quality of Software Architectures (QoSA).

Fig. 3. Distribution of the publication types among the years.

Table 4. An overview of the citation rate of included studies.

Cited by               <10    10-50    50-100    >100
Number of Studies       8      21        5        14
(Total = 48)
3.2 Distribution of Included Studies Through the Years
By analysing the studies by year of publication, as depicted in Figure 3, we observe an increasing trend in the domain of software architecture evaluation from 2003 until 2013 (with some oscillation). Though it may seem that interest in architecture evaluation has decreased in the past four years, there were recent studies that provided new architecture evaluation approaches, which are included in this survey (e.g., [127, 136]).
3.3 Citation Rate of Included Studies
We list in Table 4 the citation rate for the included studies, which was obtained from Google Scholar4. The citation rate is not meant for comparing studies; instead we use it to provide a rough estimate of the quality of the papers. In particular, eight studies were cited by fewer than 10 sources. Two of them were published in 2004 and 2010 and hence we do not expect that they will be cited further, whereas the others are relatively new. Almost 45% of the studies (21 publications) were cited by 10-50 other sources, and five studies were cited 50-100 times. Fourteen studies have very high rates with more than 100 citations, and the first ranked study was cited almost 1578 times. This shows that the included studies are, in general, highly cited, which signifies their quality and impact. In Table 5, we present the most cited publications. The first study is a book, and the remainder are journal and conference papers.

4 http://www.googlescholar.com

Table 5. The most cited studies (above 100 citations).

Rank  Ref    Author(s)                                                                 Year  Title
1     [43]   R. Kazman, M. Klein, P. Clements and others                               2003  Evaluating software architectures
2     [83]   R. Kazman, L. Bass, G. Abowd, & M. Webb                                   1994  SAAM: A method for analyzing the properties of software architectures
3     [19]   P. Bengtsson, N. Lassing, J. Bosch, and H. Vliet                          2004  Architecture-level modifiability analysis (ALMA)
4     [30]   R. Calinescu, L. Grunske, M. Kwiatkowska, R. Mirandola, & G. Tamburrelli  2011  Dynamic QoS management and optimization in service-based systems
5     [58]   I. Epifani, C. Ghezzi, R. Mirandola, & G. Tamburrelli                     2009  Model evolution by run-time parameter adaptation
6     [82]   R. Kazman, J. Asundi, & P. Clements                                       2001  Quantifying the costs and benefits of architectural decisions
7     [139]  G. Williams, U. Smith                                                     2002  PASA: A Method for the Performance Assessment of Software Architectures
8     [18]   P. Bengtsson, J. Bosch                                                    1998  Scenario-based software architecture reengineering
9     [133]  G. Tesauro                                                                2007  Reinforcement learning in autonomic computing: A manifesto and case studies
10    [40]   S. Cheng                                                                  2004  Rainbow: cost-effective software architecture-based self-adaptation
11    [5]    T. Al-Naeem, I. Gorton, and M. Babar                                      2005  A quality-driven systematic approach for architecting distributed software applications
12    [62]   N. Esfahani, E. Kouroshfar, & S. Malek                                    2011  Taming uncertainty in self-adaptive software
13    [145]  L. Zhu, A. Aurum, I. Gorton, & R. Jeffery                                 2005  Tradeoff and sensitivity analysis in software architecture evaluation using analytic hierarchy process
14    [32]   R. Calinescu & M. Kwiatkowska                                             2009  Using quantitative analysis to implement autonomic IT systems
633
This section aims to provide answers for the first and second research question: (1) How can the
634
current research on software architecture evaluation under uncertainty be categorised and what are
635
the current state-of-the-art approaches with respect to this categorisation?; (2) What are the actions
636
taken by these architecture evaluation approaches to deal with uncertainty? Our analysis of research
637

, Vol. 1, No. 1, Article . Publication date: April 2021.


14 Sobhy et al.

638 topics addressed in each study and the systematic reviews and surveys found in literature (e.g.
639 [11, 27, 53, 93, 98]) helped us in developing the following classification framework. This classification
640 aided us in filtering, mapping, and understanding the architecture evaluation domain. We also
641 discuss how the included evaluation approaches deal with uncertainty.
642
4.1 Classification Framework
Next, we explain in detail the criteria presented in Figure 4.
1. Quality Evaluation: Architecture evaluation is typically done as a milestone review that aims at justifying the extent to which the architecture design decisions meet the quality requirements and their trade-offs. The evaluation can aid in early identification and mitigation of design risks. The point of the exercise is to avoid poor decisions, identify a stable architecture and thus save integration, testing and evolution costs that can be attributed to design decisions that are not fit to meet the changes [124]. We review the Stage of Evaluation, covering design-time, run-time and continuous evaluation, along with Approaches to Evaluation, covering major efforts including utility-based, scenario-based, parametric-based, search-based, economics-based, and learning-based approaches.
2. Quality Attributes Considerations: Our literature review aims to show how the studied software architecture evaluation methods address quality attributes (i.e. whether they focus on a single QA versus multiple QAs), as well as which quality attributes are supported. Examples of quality attributes are performance, reliability, security, cost, etc. Further monitoring and treatment of quality attributes is an important aspect to discuss, which could provide architects and architecture evaluators with the necessary elements to design a continuous architecture evaluation framework.
3. Level of Autonomy: In software architecture evaluation, the level of autonomy is an important aspect when designing a continuous architecture evaluation framework. In this context, we review how the studies performed the management of stakeholder input and the management of trade-offs between conflicting requirements.
4. Uncertainty Management: In this category, we focus on discussing the sources of uncertainty and how the literature has treated uncertainty.
In Sections 4.2 to 4.5, we aim to provide answers for the review's research questions mentioned earlier. We classify the architecture evaluation approaches as design-time and run-time. In each category we further classify and explain the existing architecture evaluation approaches with respect to the framework (answering research question 1). We also discuss the actions taken by these architecture evaluation approaches to deal with uncertainty (answering research question 2). Tables 6-12 provide a summary of the representative contributions with respect to the classification framework.
4.2 Quality Evaluation
4.2.1 Approaches to Evaluation Under Uncertainty. Architecture evaluation methods can take several forms: the methods can be bespoke, providing phases and systematic guidance for architects to evaluate the extent to which the architecture can meet its non-functional goals and trade-offs (e.g. ATAM [85], CBAM [82], etc.). Additionally, the architects can utilise generic frameworks for quality assessment, which can be used to evaluate any artefact under consideration, where the software architecture can be a beneficiary. Regardless of the type of evaluation used, the architects can adopt one of the commonly used approaches below to evaluate architecture design decisions and choices in the presence of uncertainty. These approaches can be categorised as utility-based, scenario-based, parametric-based, search-based, economics-based, and learning-based.
Fig. 4. The proposed classification of architecture evaluation approaches.
1. Utility-based: This category focuses on architecture evaluation methods that adopt utility functions for decision-making when justifying architecture design decisions, adopting a tactic or style among alternative candidates, etc. Utility functions are used in two contexts. First, a utility function is a measure of the extent to which a candidate solution satisfies the set of quality attributes in question. Second, it can be used to capture a stakeholder's preferences over a set of quality attributes, which is called a weighted utility function (a small illustrative sketch follows the summary of this category below). Various methods have adopted utility theory to shortlist the candidate architectures operating under uncertainty, such as [63, 95, 113].
− Osterlind et al. [113] used utility theory to balance quality attributes against each other to obtain the best possible architecture.
− GuideArch [63] is an architecture framework that explicitly models the uncertainty of architecture decisions using fuzzy logic to rank and determine the optimal architecture decision. However, the use of fuzzy logic cannot be empirically evaluated and adjusted.
− Letier et al. [95] designed a method, based on GuideArch and CBAM, to deal with uncertainty. Utility theory and Monte Carlo simulation were used to calculate the costs and benefits of candidate architecture decisions under uncertainty. The latter approach made an assumption that the probability distributions of model attributes are accurate; this may affect its applicability, particularly in dynamic environments.
− The architecture evaluation approach in [96] focuses on middleware and design pattern integration for developing adaptive self-managing architectures at design-time that are able to recover from failures. This approach suffers from the same limitation as other design-time approaches: the design-time patterns (i.e. decisions) may not be able to handle the changing environmental conditions at run-time.
− Architecture Software Quality Assurance (aSQA) [42] is an evaluation method that uses metrics to determine the user's satisfaction with prioritised quality requirements, especially in agile software projects. Although it focuses on a single point of evaluation to lighten the evaluation process, it misses the main aim of evaluation (i.e. assessing the impact of architecture decisions on quality attributes).
− Decision-centric software architecture evaluation method (DACAR) [57] assesses the architecture decisions made or to be made independently, using utility functions based on stakeholders' beliefs, rather than evaluating the whole architecture. The method could potentially be adopted in agile projects, since the architecture decisions could be evaluated as they appear in the process. However, the approach is not as flexible as scenario-based methods in obtaining novel paradigms and significant change domains from stakeholders.
− Heaven et al. [75] reported on an approach tailored for self-managed software systems. The approach provides the following features: high-level task planning, architecture configuration and reconfiguration, and component-based control. Their approach uses weighted utility functions to represent quality attributes and determine the total utility of configurations by taking into account reliability and performance concerns.
− Esfahani et al. [62] proposed an approach that elicits from stakeholders their beliefs regarding uncertainty with respect to attributes such as network bandwidth. In particular, the stakeholders provide an estimate for the range of uncertainty with respect to the expected level of input variation. The approach also quantifies the uncertainty through profiling, by comparing the actual values with the estimates from stakeholders, and hence provides probability distributions for the variation in data collection. After that, the overall uncertainty is computed using fuzzy math.
− Veritas [67] is another utility-based approach which adopts utility functions for the management of run-time test cases to improve the adaptation procedure.
− Cooray et al. [46] proposed a proactive approach, which continuously updates reliability predictions in response to environmental changes. The approach has proven effective in adapting the system before it experiences a significant performance drop. However, the approach does not consider cost and suffers from scalability issues.
− Models@run.time [22, 39, 107] includes built-in mechanisms for evaluating the behaviour of software systems through continuous monitoring, planning, and model transformation. However, the effort was not discussed from the architecture evaluation angle. In particular, the authors state that "models of the functional and/or non-functional software behaviour are analysed at run-time, in order to select system configurations that satisfy the requirements" [29]. Models@run.time operates on the assumption that possible run-time configurations have already been evaluated and encoded in the system, where evaluation can be an afterthought through profiling configurations and recommending alternatives. It aims to "reevaluate requirements satisfaction while the system is evolving" [54]. In the spirit of models@run.time, several architecture-centric approaches have been discussed in the context of self-adaptive and managed architectures [36, 47, 68, 91, 131]. Examples of these approaches include [7, 30, 40, 46, 58, 69], which formally analyse their architectural models.
– The Rainbow framework [40] uses Markov processes to determine the likely aggregated impact of each strategy on each quality attribute. It requires high human intervention to determine the effects of strategies with respect to quality attributes (i.e. predefined probabilities) [41].
– Epifani et al. [58] proposed a utility-based approach leveraging a Discrete Time Markov Chain and Bayesian estimators to provide continuous automatic verification of requirements at run-time and support failure detection and prediction. Their approach does not consider multiple quality attributes, switching cost, or variance in run-time data.
– In [104], Meedeniya et al. proposed a Discrete Markov Chain approach that performs Monte Carlo simulations to predict the reliability of heterogeneous software architectures (a tiny numerical illustration of such a chain is sketched after this list). The approach also adjusts the number of architecture evaluations with respect to particular performance levels. They then extended the work to deal with different sources of uncertainty, which occur in different software architecture evaluation models [105]. One major concern in this approach is its assumption that all software architectures can be modelled as Markov chains, which may not be true in some contexts due to complexity.
– Ghezzi et al.'s [69] method is one of the few that complement design-time with run-time analysis. At design-time, the approach integrates goal-refinement methodologies with Discrete Time Markov Chains to determine all possible execution paths for the goal. At run-time, it exploits utility functions to measure the utility of paths, based on assumptions. For example, the utility for a 5ms response time is 1, and so forth. Given these assumptions, a hill climbing algorithm is used to select the optimal goal. We will discuss [7, 30, 46] in the learning-based section.
– Other utility-based approaches are found in Table 6.
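To give a concrete flavour of the Markov-chain-style reasoning referenced above, the following is a minimal, hypothetical Python sketch; the states and transition probabilities are invented for illustration and are not taken from any of the cited works.

import numpy as np

# A tiny Discrete Time Markov Chain over architecture-level states (illustrative):
# state 0 = Operational, state 1 = Degraded, state 2 = Failed (absorbing).
P = np.array([
    [0.95, 0.04, 0.01],   # transitions from Operational
    [0.30, 0.60, 0.10],   # transitions from Degraded
    [0.00, 0.00, 1.00],   # Failed is absorbing
])

def failure_probability(steps: int) -> float:
    """Probability of having reached Failed within `steps` transitions, starting Operational."""
    dist = np.array([1.0, 0.0, 0.0]) @ np.linalg.matrix_power(P, steps)
    return float(dist[2])

for steps in (10, 100, 1000):
    print(f"P(failed within {steps:4d} steps) = {failure_probability(steps):.4f}")

In the surveyed approaches, such chains are typically combined with sampling (e.g. Monte Carlo) or Bayesian updating to handle uncertainty in the transition probabilities themselves; the sketch only shows the deterministic core.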
Summary and Reflection: Generally, the major problems of the prior approaches are: (i) the high reliance on stakeholders for utility estimations, which is subject to their experience; (ii) the utility functions are hard to define; and (iii) there is complexity and uncertainty in the quantification of utility values. This motivated the need to integrate learning techniques to learn over time and hence improve the analysis (discussed in the learning-based section).
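For illustration only, the sketch below shows the kind of weighted utility computation, combined with Monte Carlo sampling over uncertain attribute values, that several of the above methods build on. The attribute names, weights, utility shapes, and distributions are hypothetical assumptions chosen for exposition; they do not reproduce any specific surveyed method.

import random

random.seed(42)

# Stakeholder-provided weights over quality attributes (hypothetical values).
WEIGHTS = {"performance": 0.5, "availability": 0.3, "cost": 0.2}

# Per-attribute utility functions mapping a raw value to [0, 1] (illustrative shapes).
UTILITIES = {
    "performance": lambda latency_ms: max(0.0, 1.0 - latency_ms / 200.0),
    "availability": lambda avail: avail,                   # already in [0, 1]
    "cost": lambda dollars: max(0.0, 1.0 - dollars / 1000.0),
}

# Uncertainty about each candidate option expressed as sampling functions (illustrative).
CANDIDATES = {
    "cloud-processing": {
        "performance": lambda: random.gauss(120, 20),
        "availability": lambda: random.uniform(0.95, 0.999),
        "cost": lambda: random.gauss(400, 50),
    },
    "fog-processing": {
        "performance": lambda: random.gauss(80, 30),
        "availability": lambda: random.uniform(0.90, 0.98),
        "cost": lambda: random.gauss(600, 100),
    },
}

def weighted_utility(samples):
    """Aggregate per-attribute utilities using the stakeholder weights."""
    return sum(WEIGHTS[a] * UTILITIES[a](samples[a]) for a in WEIGHTS)

def expected_utility(option, runs=10_000):
    """Monte Carlo estimate of the expected weighted utility of an option."""
    total = 0.0
    for _ in range(runs):
        samples = {attr: draw() for attr, draw in option.items()}
        total += weighted_utility(samples)
    return total / runs

ranking = sorted(CANDIDATES, key=lambda name: expected_utility(CANDIDATES[name]), reverse=True)
for name in ranking:
    print(f"{name}: expected utility = {expected_utility(CANDIDATES[name]):.3f}")

In practice, the hard parts flagged in the summary above remain: eliciting the weights and utility shapes from stakeholders and judging how trustworthy the assumed distributions are.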
2. Scenario-based: The foundation of most architecture evaluation approaches rests on scenarios [27, 53, 122]. These approaches use quantitative evaluation to determine the fitness of operational quality attributes. They elicit from stakeholders the utilities of architecture decisions and their effect on the quality attributes of interest. Some of the scenario-based approaches have been validated and used in industry over the past decades [53].
− Software Architecture Analysis Method (SAAM) [83]: is the first well-known architecture evaluation method that aimed to reify quality attributes via a set of scenarios as a means to evaluate the architecture design decisions under concern and identify risks in an architecture. It assesses the extent to which the architecture satisfies the quality goals. It was originally used for assessing modifiability, but it has been applied to other quality attributes, such as portability and extensibility. SAAM takes as input: business goals, a software architecture description, and quality requirements that illustrate the interaction between stakeholders and the system being analysed. It then maps between scenarios and architecture components to assess anticipated changes to the system. This mapping can also be employed to estimate the amount of effort needed to handle these changes. SAAM does not explicitly deal with trade-offs between quality attributes. The lack of trade-off management has contributed to the evolution of the Architectural Trade-off Analysis Method.
− Architectural Trade-off Analysis Method (ATAM) [43]: is the most popular architecture evaluation method and an evolved version of SAAM. Unlike SAAM, ATAM focuses on a comprehensive evaluation of quality attributes rather than concentrating only on modifiability, portability, and extensibility. ATAM is a generic design-time architecture evaluation method that uses scenarios to assess the value of architecture design decisions. Specifically, it aims to reveal the degree to which an architecture will meet its quality requirements (e.g. availability, security, usability, and modifiability), and the interaction between those goals through trade-off analysis.
− Cost-Benefit Analysis Method (CBAM) [82]: is an architecture evaluation method that extends ATAM to provide cost/benefit analysis of architecture design decisions. The CBAM was created to "develop a process that helps a designer choose amongst architectural options, during both initial design and its subsequent periods of upgrade, while being constrained to finite resources" [9]. Although CBAM uses cost/benefit information to value the architecture design decisions and to justify their selection, it is unable to dynamically profile the added value of architecture decisions, which is essential for applications operating in uncertain environments (such as IoT). It only deals with uncertainty through a set of scenarios, similar to ATAM.
− Scenario-Based Architecture Re-engineering (SBAR) [18]: is another scenario-based architecture evaluation method that uses different techniques to assess the quality attributes of interest and implicitly deal with uncertainty: scenarios, simulation, and mathematical modeling. For example, if a quality attribute concerns development and design-time properties, such as maintainability and reusability, scenario-based techniques are best utilized. While scenario-based analysis can still be used for behavioral and run-time properties, such as performance and fault-tolerance, simulation and/or mathematical models can provide more meaningful insights and can complement scenario-based ones. A major concern in SBAR is its use of impractical assumptions. For instance, to address the reusability concern, the architect has to define all the scenarios related to the reuse of parts of the architecture, which is not feasible.
− Architecture-Level Modifiability Analysis (ALMA) [19]: Unlike ATAM and CBAM, ALMA focuses on a single quality attribute, and hence it does not consider trade-offs. It utilizes probabilities to determine the likelihood of the impact of scenarios at the software architecture level with respect to the modifiability concern (e.g. maintenance cost prediction and risk assessment).
− Systematic Quantitative Analysis of Scenarios' Heuristics (SQUASH) [79]: is a systematic quantitative method for scenario-based value, risk, and cost analysis. The method focuses on evaluating the relative benefits of proposed scenarios in the early stages of architecting. It extends some steps of CBAM by providing extensive evaluations of the internal structure of the scenarios to predict the quality attributes of architecture decisions. In this context, the approach relies more on stakeholders than CBAM and hence may not be easy to apply in practical settings.
− Analytic Principles and Tools for the Improvement of Architectures (APTIA) [84]: is an architecture improvement method that combines existing architecture evaluation methods (such as ATAM, CBAM, etc.) through "quality attribute models, design principles in the form of tactics, scenario-based quality attribute elicitation and analysis, and explicit elicitation of the costs and benefits of architecture decisions from stakeholders" [84], as well as the use of architecture documentation templates. It also adds new steps to the analysis. In particular, it identifies design decisions linked to the analysis rather than stating their future problems. It was able to help a team of architects propose architecture design decisions for a complex system in a short period of time.
− Architectural Tradeoff Method using Implied Scenario (ATMIS) [64]: is an extension of ATAM through the adoption of Implied Scenarios for security testing [3]. The main aim of this approach is to apply trade-off analysis between security and any other quality attribute through the use of implied scenarios.
− Further, there is another scenario-based method that differs from the commonly used scenario-based architecture evaluation methods: Performance Assessment of Software Architecture (PASA) [139]. In PASA, the architect uses the architecture specification to form performance models. The generated models are then utilised to assess whether the performance objectives are met. ATAM uses scenarios to determine, prioritise and refine the key quality attributes by constructing a utility tree, where each leaf in the tree represents a scenario (a minimal utility-tree sketch follows this list). PASA instead employs scenarios in the form of UML and sequence diagrams to demonstrate how the software architecture will achieve the performance objectives.
− Finally, Yang et al. [141] proposed a utility-based approach that extends the scenario-based approaches (e.g. ATAM, CBAM) and profiles run-time information to better manage the QA trade-offs. It aims to improve decision-making and handle the uncertainty that may be better managed at run-time. In particular, their approach determines the potential QA trade-off points, designs the adaptive architecture decisions, and finally deploys the system on a middleware platform to collect run-time information. Though this approach is one of the few attempts to extend scenario-based approaches to run-time, it lacks the ability to learn over time and hence cannot forecast the future potential of architecture decisions.
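As an illustration of the utility-tree structure mentioned for ATAM above, the following is a minimal Python sketch. The node class, the weights and the scenario leaves are invented for illustration and are not taken from the ATAM or PASA documentation.

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        """A utility-tree node: a quality-attribute refinement or a leaf scenario."""
        name: str
        weight: float = 1.0            # stakeholder-elicited relative importance
        utility: float | None = None   # only set on leaves (scenario utility in [0, 1])
        children: list["Node"] = field(default_factory=list)

        def score(self) -> float:
            # Leaves return their elicited utility; inner nodes aggregate a weighted sum.
            if not self.children:
                return self.utility or 0.0
            total_w = sum(c.weight for c in self.children)
            return sum(c.weight * c.score() for c in self.children) / total_w

    # Illustrative tree: root -> quality attributes -> scenario leaves.
    tree = Node("Utility", children=[
        Node("Performance", weight=0.6, children=[
            Node("Handle 2x load spike", weight=0.7, utility=0.8),
            Node("95th percentile latency < 200 ms", weight=0.3, utility=0.6),
        ]),
        Node("Modifiability", weight=0.4, children=[
            Node("Swap persistence layer in < 2 weeks", weight=1.0, utility=0.5),
        ]),
    ])
    print(round(tree.score(), 3))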
Summary and Reflection: Scenario-based evaluation approaches can be described as best-effort, where the evaluators' expertise, the choice of stakeholders, etc., are all factors that influence the evaluation. In particular, these approaches heavily rely on human inputs and expert judgement. These processes can thus suffer from subjectivity and bias, and can never be complete. As for their effectiveness in evaluating for uncertainties, these methods advocate the use of exploratory, growth, stress and similar scenarios that can test the likelihood of an issue (e.g., a sudden spike in load, downtime in part of the network, a hostile attack) being confronted by the architecture, along with the architecture's response, the quality trade-offs affected, and the soundness of the architecture design decisions and choices in responding to these issues. The choice of these scenarios is a critical input to the evaluation process and to its conclusion on the extent to which the architecture can be resilient to uncertainty. Hence, the soundness of the evaluation for uncertainty can be influenced by human expertise, judgement, and skill and experience in identifying uncertainty-revealing scenarios to steer the evaluation exercise.
3. Parametric-based: The previous scenario-based approaches used simplistic mathematical models and relied heavily on stakeholders for the elicitation of scenarios and on expert evaluators for the impact of these scenarios on quality attributes. Here, we will discuss approaches that assess architecture decisions using parametric models, i.e. parameterised mathematical models whose identified and supplied parameters can aid decision-making. Stakeholders often provide values for these parameters (i.e. design-time and interactive approaches), or the parameters can be provided or calibrated at run-time by observing the relevant concerns of the parameterised functions.
− Analytic Hierarchy Process (AHP) [123] is a mathematical modelling tool used in dealing with complex decision-making. AHP has been used in two contexts for architecture evaluation: managing trade-offs and determining the relative importance of scenarios and decisions. Zhu et al. [145] adopted AHP to explicitly determine the trade-offs being made and the relative size of these trade-offs. It has been used with CBAM to determine the relative importance of scenarios through pair-wise comparisons [94] (a small pair-wise comparison sketch follows this list). It relies on eliciting the benefits and costs from stakeholders, and hence suffers from the same limitations as scenario-based approaches.
− ArchDesigner [5] is an architecture framework that first adopts AHP to elicit from stakeholders their preferred architecture decisions. It then uses integer programming to determine the optimal architecture decision that satisfies conflicting stakeholder quality goals subject to project constraints, such as cost and time.
− LiVASAE [86] (a lightweight value-based architecture evaluation technique) attempts to measure the level of uncertainty using AHP and also provides three simplified evaluation procedures compared to CBAM. All these approaches rely on stakeholders for evaluating the candidate architecture decisions as well as their benefits and costs.
− Other approaches, including [40, 58, 103–105] (mentioned in the utility-based approaches section), can also be classified as parametric-based, as they use Markov chains to determine the QoS of architectures.
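The following minimal Python sketch illustrates the pair-wise comparison step that AHP-based methods rely on: a judgement matrix over three hypothetical architecture decisions is reduced to priority weights using the standard column-normalisation approximation, followed by a rough consistency check. The matrix values are illustrative assumptions.

    # Pair-wise judgements: A[i][j] = how much option i is preferred over option j
    # (illustrative values on Saaty's 1-9 scale).
    options = ["decision_A", "decision_B", "decision_C"]
    A = [
        [1.0, 3.0, 5.0],
        [1/3, 1.0, 2.0],
        [1/5, 1/2, 1.0],
    ]

    # Approximate the principal eigenvector: normalise columns, then average rows.
    n = len(A)
    col_sums = [sum(A[i][j] for i in range(n)) for j in range(n)]
    normalised = [[A[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    weights = [sum(row) / n for row in normalised]
    for name, w in zip(options, weights):
        print(f"{name}: {w:.3f}")

    # Rough consistency check via lambda_max; values near 0 indicate consistent judgements.
    Aw = [sum(A[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(Aw[i] / weights[i] for i in range(n)) / n
    print(f"consistency index: {(lambda_max - n) / (n - 1):.3f}")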
Summary and Reflection: Though all the prior approaches provide some management of uncertainties, they suffer from the same concern: the high reliance on stakeholders for the elicitation of the relative importance (i.e. rank) of architecture decisions and their impact on quality attributes.
4. Search-based: This category focuses on showing how search-based techniques have been used to complement architecture evaluation (we do not cover work on search-based techniques in software architecture unless the work is evaluation-related). Search-based software engineering is "the application of metaheuristic search techniques, such as genetic algorithms, simulated annealing and tabu search" to the analysis [73]. In software architecture, it is used to solve complex problems by searching for the most suitable (i.e. optimal) candidate architecture choice [73]. In this context, it is sometimes called search-based optimisation [74].
− Evolutionary algorithms are generally adopted for decision-making in software systems [7]. For example, ArcheOpterix [6] is a tool that exploits evolutionary algorithms for multi-objective optimization of an embedded system's architecture.
− Grunske et al. [70] proposed a method to automate the trade-off management process using an evolutionary algorithm. The aim of the approach was to rank design decisions (architecture refactorings) by taking into consideration competing quality goals. However, this was an initial attempt without a complete evaluation (i.e. it has not been applied to architecture evaluation methods).
− As aforementioned in the utility-based section, Ghezzi et al.'s [69] method uses a hill climbing algorithm to select the optimal goal, which could also be seen as a search-based technique.
− Among the notable excluded work is [6], as the work does not explicitly or implicitly address uncertainties in architecture evaluation, though it covers some phases of design-time and run-time evaluation. However, we have included the subsequent work [102], as it addresses uncertainty in architecture evaluation decision-making. In particular, Meedeniya et al. [102] proposed the Robust ArcheOpterix framework, which accounts for uncertain information related to system parameters and hence searches for the most optimal and robust candidate architecture. The framework provides the architect with the flexibility to choose the most suitable optimisation algorithm from the following list [101, 118]: Multi-Objective Genetic Algorithm (MOGA), Non-dominated Sorting Genetic Algorithm (NSGA-II), Pareto Ant Colony Algorithm (P-ACO), Simulated Annealing (SA), Hill Climbing, Bayesian Heuristic for Component Deployment Optimization (BHCDO), Random Search, and Brute-Force algorithms (a toy simulated-annealing sketch follows this list). The underlying software architecture evaluation model is based on their previous work [104].
− PerOpteryx [28] is an automated tool based on the Palladio framework [120] for selecting the optimal candidate architecture. This approach performs evaluation at design-time, so run-time monitoring is still important to complement the design-time decisions. However, we see potential in extending the PerOpteryx tool with run-time analysis to develop a continuous evaluation framework.
− Other approaches, including [5, 62, 63, 67, 95, 103, 127] (mentioned in other sections), can also be classified as search-based, as they adopt search-based algorithms for the analysis.
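As a concrete, simplified illustration of the kind of metaheuristic search listed above, the following Python sketch applies simulated annealing to a toy component-to-host deployment problem. The cost and latency models, the weights in the fitness function and the annealing schedule are all made-up assumptions, not the evaluation models used by Robust ArcheOpterix.

    import math
    import random

    # Hypothetical candidate space: assign each of 4 components to one of 3 hosts.
    HOSTS = [0, 1, 2]
    COST = {0: 1.0, 1: 2.0, 2: 3.5}    # assumed cost per component on each host
    SPEED = {0: 3.0, 1: 2.0, 2: 1.0}   # assumed processing time per component

    def fitness(assignment):
        latency = sum(SPEED[h] for h in assignment)
        cost = sum(COST[h] for h in assignment)
        return -(0.7 * latency + 0.3 * cost)   # higher is better

    def neighbour(assignment):
        a = list(assignment)
        a[random.randrange(len(a))] = random.choice(HOSTS)
        return tuple(a)

    def simulated_annealing(steps=5000, t0=2.0, cooling=0.999):
        current = tuple(random.choice(HOSTS) for _ in range(4))
        best, t = current, t0
        for _ in range(steps):
            cand = neighbour(current)
            delta = fitness(cand) - fitness(current)
            # Accept improvements always, and worse moves with a temperature-dependent probability.
            if delta > 0 or random.random() < math.exp(delta / t):
                current = cand
            if fitness(current) > fitness(best):
                best = current
            t *= cooling
        return best, fitness(best)

    print(simulated_annealing())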
Summary and Reflection: Search-based techniques, which are fundamentally optimisation-based, have been used to evaluate architecture design decisions and choices. These techniques often rely on the assumption that fitness functions guide the search. They suffer from the following limitations: stopping criteria for the search are often difficult to confirm with confidence, and solutions tend to be "good enough" optima. Additionally, as much of the work on architecture evaluation is scenario-based, mapping the concerns of the scenarios and their constraints into search-based objective functions can be complex to abstract if one seeks a search that reflects on these scenarios. Nevertheless, search-based techniques can be specifically useful if one uses search and evolutionary techniques to generate new styles and architecture configurations that could better meet the requirements of interest.
5. Economics-based: This category presents approaches that inform architecture evaluation using economics- and finance-inspired methods; these approaches quantitatively evaluate the worthiness, short- and long-term benefits, options, risks and costs of the architecture design decisions. Though these approaches can be essentially utility-based and/or parametric, we discuss here the economics-driven perspectives that were utilised in steering these efforts. In most cases, economics-based approaches have been used to evaluate architectures at design-time.
− Traditional cost-benefit analysis methods have been used to evaluate software. For instance, Cellini et al. [34] computed the net benefit of a software system by deducting total costs from total benefits. These attributes were obtained from software architects through a group of questions (e.g. "what is the state of the world in the absence of the program?"). CBAM [82] is a utility-based architecture evaluation method that uses cost-benefit analysis to analyse the impact of architecture decisions on quality attributes of interest. This approach only partially captures uncertainties, which motivated the need to integrate finance-inspired approaches into the software engineering field. Boehm [23, 24] was among the first to introduce economics and finance theories to evaluate software design decisions. Examples of these approaches include Net Present Value (NPV) [56, 97], Modern Portfolio Theory (MPT) [100], and Real Options Analysis (ROA) [8], which are discussed afterwards.
− More recently, [136] proposed an architecture evaluation approach inspired by CBAM [82] for run-time decision-making in self-adaptive systems that considers the benefits and costs of decisions. The approach adopts a weighted utility measure of the qualities that the adaptation decisions can provide to the stakeholders. Although this approach seems to provide continuous evaluation, it requires additional elements, such as online machine learning techniques, and extra experimental evaluation for applicability and efficiency.
Real Options Analysis and Modern Portfolio Theory have been used to inform the analysis of software architecture in the presence of uncertainty. Though they have been used in various software engineering and design domains, such as [14, 60, 128, 132], to evaluate low-level design decisions (e.g. modularity in design) using economics-based thinking, those works were not concerned with architecture evaluation. There are a few other works (e.g. [12, 13, 112, 136]) which initiated the use of economics-based techniques in architecture evaluation. In this context, in Table 10 we outline the software architecture evaluation-related approaches (e.g. [12, 13, 112, 136]), as they operate on widely used architecture frameworks such as ATAM and CBAM, were obtained from our search results, and satisfy our inclusion/exclusion criteria.
− Net Present Value (NPV) [56, 97]: is a popular approach used to value software. It values a software project by discounting its expected future returns at an established discount (interest) rate. A positive NPV indicates that it is financially beneficial to invest (i.e. deploy this architecture decision), whereas a negative NPV indicates the opposite (a small numerical sketch contrasting NPV with an option to defer follows this list). It was originally used in [49, 65].
− Modern Portfolio Theory (MPT) [100]: was first introduced by the Nobel prize winner Markowitz in the 1950s. MPT aims to improve the decision-making process by allocating capital to a portfolio of diverse investment assets. MPT handles uncertainty through the distribution of capital among assets to minimize risk and maximize returns. In particular, it provides a weighted combination (i.e. portfolio) of the assets, where the weight denotes the investor's share of capital in each asset. In this context, MPT seeks to demonstrate the rewards of holding a diversified portfolio of assets. MPT is well-known in the finance domain and has also been introduced into the software engineering domain as a means to deal with uncertainties. In software architecture [112], it has been adopted with CBAM to determine which portfolio of architecture decisions will deliver value by considering sustainability dimensions. Although this approach explicitly deals with uncertainty, it provides short-term value and does not embed flexibility as real options analysis does.
− Real Options Analysis (ROA) [8]: provides an analysis paradigm that emphasizes the value-generating power of flexibility under uncertainty. An option is the right, but not the obligation, to make an investment decision in accordance with given circumstances for a particular duration into the future, ending with an expiration date [134]. Real options are typically used for real (non-financial) assets, such as a property or a new product design. ROA treats uncertainty as an option which may provide future opportunities to the project, and which could be exercised when it provides a high option value. On the contrary, MPT specifically deals with financial assets and considers uncertainty as a risk that should be minimized. Real options analysis has been used in software architecture in [12, 13, 114]. Bahsoon et al. [12] used real options analysis along with CBAM to measure the architecture's stability. They then used their method to value scalability in distributed architectures [13].
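The following minimal Python sketch contrasts a plain NPV calculation with a one-period valuation of an option to defer an architectural investment, illustrating why flexibility can add value under uncertainty. All cash flows, probabilities and rates are illustrative assumptions rather than figures from the surveyed studies, and the binomial treatment is deliberately simplified.

    # Plain NPV: discount a stream of net cash flows (index 0 = today).
    def npv(cash_flows, rate):
        return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

    print(round(npv([-100, 40, 45, 50], rate=0.10), 2))  # NPV of investing now

    def deferral_option_value(invest, up, down, p_up, rate):
        """Extra value of waiting one period and investing only if the upside occurs."""
        # If we wait: invest only when the payoff exceeds the investment cost.
        wait = (p_up * max(up - invest, 0) + (1 - p_up) * max(down - invest, 0)) / (1 + rate)
        now = p_up * up + (1 - p_up) * down - invest
        return wait - max(now, 0)   # value created by the flexibility to defer

    print(round(deferral_option_value(invest=100, up=160, down=60, p_up=0.5, rate=0.10), 2))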
Summary and Reflection: NPV has been discouraged because it ignores the value of flexibility under uncertainty [14, 59, 129]. Modern Portfolio Theory provides some treatment of uncertainty, but for short-term evaluation. In contrast, real options analysis methods could be used as a way to manage uncertainty over the long term. Further, in software architecture evaluation, few methods have embedded finance-inspired techniques in their analysis. However, we see great potential for including these techniques in the evaluation, especially in highly dynamic and unpredictable environments.
6. Learning-based: We define learning-based architecture evaluation methods as methods which adopt machine learning techniques to improve the evaluation. In most cases, learning-based approaches have been used to evaluate architectures at run-time. "The effectiveness of model-based reasoning about the properties of a software system depends on the accuracy of the models used in the analysis" [29]. For example, some models may become obsolete due to evolution in the software architecture. The same applies to the use of utilities for evaluation and decision-making. Therefore, machine learning could be adopted to enhance the evaluation by profiling observations of the system properties over time, as in the following studies [31, 61, 87, 106, 133].
− In the context of reinforcement learning techniques, Tesauro et al. [133] integrated queuing policies with reinforcement learning, forming a hybrid approach to enhance the dynamic resource-allocation decision-making process in data centers. The approach suffers from scalability and performance overhead. A reinforcement learning online planning technique was used by Kim et al. [87] to improve a robot's operation with respect to changes in the environment, by dynamically discovering the appropriate adaptation plans. However, it does not continuously evaluate the cost-effectiveness of architecture decisions over time. These approaches [87, 133] tend to be domain-specific. Further, Calinescu et al. [31] proposed initial attempts at the use of Bayesian learning and ageing coefficients to update the model parameters, where the ageing coefficients may be a useful element for a continuous evaluation approach because they allow the architect to tune the sensitivity of the approach to present/past observations (a small ageing-coefficient sketch follows this list). Though their work had potential, it was still work-in-progress (i.e. only an initial evaluation of the approach had been performed and hence it requires further analysis).
− FUSION [61] is another learning-based approach that adopts a machine learning algorithm named Model Trees Learning (MTL) to tune the adaptation logic towards unpredictable triggers, rather than using static analytical models. It also uses utility functions to determine the benefit of the models in question. The major benefit of FUSION is its ability to learn over time and improve the adaptation actions due to its promising learning accuracy. However, FUSION has the following limitations: (i) it is specifically tailored to feature modelling; and (ii) it only detects goal violations, i.e. constraints, but does not have the ability to check if the current architecture option is getting worse.
− In [127], a run-time architecture evaluation approach has been proposed, which is suited for systems that exhibit uncertainty and dynamism in their operation. The method uses machine learning and cost-benefit analysis at run-time to continuously profile the architecture decisions made and to assess their added value. It is a reactive approach, as it ignores the future potential of architecture decisions, yet it is one of the few attempts that explicitly evaluate software architectures at run-time.
− Moreno et al. [106] proposed a proactive latency-aware adaptation approach that constructs most of the Markov Decision Processes offline through stochastic dynamic programming. Their method focuses on optimizing the latency of adaptation actions based on forecasts, without considering the cost of architecture decisions and multiple stakeholder concerns.
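The ageing-coefficient idea mentioned for Calinescu et al. above can be illustrated with a minimal Python sketch of exponential forgetting: a smaller coefficient makes the run-time estimate react faster to recent observations. The decay values and the observation stream are illustrative assumptions.

    # Exponentially weighted estimate of a monitored quality attribute.
    def aged_estimate(observations, ageing=0.9):
        """Smaller `ageing` forgets the past faster."""
        estimate = None
        for obs in observations:
            estimate = obs if estimate is None else ageing * estimate + (1 - ageing) * obs
        return estimate

    latencies_ms = [50, 52, 49, 51, 120, 130, 125]  # a drift appears mid-stream
    print(round(aged_estimate(latencies_ms, ageing=0.9), 1))   # slow to react to the drift
    print(round(aged_estimate(latencies_ms, ageing=0.5), 1))   # reacts faster to the drift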
Summary and Reflection: The use of machine learning in architecture evaluation can be challenging. First, formulating the evaluation as a learning problem requires data that relate to historical observations, along with data evaluation for recency, decay, relevance, etc. Second, the problem with any study involving machine learning is that the results may not generalise to other data sets; therefore, the methods should be tested on various data sets with different input parameters. Further, comparative studies should be provided to confirm the validity of the model. Accuracy and error metrics should also be adopted to determine how far the forecast values are from the actual ones, and the selected measures should be unbiased towards under- or over-estimation. Additionally, the software architecture community can benefit from guidance on the types of learners best suited for the evaluation of software architectures under uncertainty, yet such guidance is lacking and bridging efforts are still needed.
4.2.2 Stage of Evaluation. The evaluation could occur at design-time and/or run-time. Design-time evaluation occurs before system deployment, where the stakeholders are more involved in reasoning about the system under study, whereas run-time evaluation approaches use run-time and/or simulated data (e.g. QoS) to capture the dynamic behaviour of architecture decisions under uncertainty and use such information to profile or evaluate design decisions either during the prototyping stage or post-deployment.
1. Design-time Evaluation: The design-time evaluation of software architectures aims at eliciting a proper specification of the problem, which is the first step on the path of analysing architecture decisions for suitability.
− Documented efforts on systematic design-time architecture evaluation approaches are best linked to the seminal work of [19, 82, 83, 85]. These approaches focus on identifying design decisions that best fit the quality requirements of interest and their trade-offs using scenarios (i.e. scenario-based approaches).
− Other examples of design-time approaches are treated as utility-based (e.g. [18, 19, 63, 79, 82, 84, 95, 139, 145]), parametric-based (e.g. [5, 18, 84, 103, 145]), search-based (e.g. [5, 28, 63, 95, 102, 103]), and economics-based (e.g. [12, 13, 112, 114]). Since learning-based approaches require run-time analysis, we have not found design-time methods that are learning-based. Table 10 summarises the included studies related to design-time architecture evaluation approaches with respect to the proposed classification.
Summary and Reflection: Design-time evaluation has received significant attention over the years and the subject is a relatively mature area. However, as we can see from the various discussed methods, the evaluation is essentially human-reliant and the treatment of uncertainty has been left to the evaluators; this can include their choice of scenarios to steer the evaluation, the adopted models, the stakeholders involved, etc. The process can then suffer from subjectivity and bias, and can never be complete. Therefore, a systematic design-time evaluation approach that explicitly deals with uncertainty, rather than relying on ad hoc evaluation or implicit mitigation of uncertainty, is necessary.
2. Run-time Evaluation: By run-time evaluation, we refer to approaches that use run-time and/or simulated data (e.g. QoS data) to capture the dynamic behaviour of architecture decisions under uncertainty and to use such information to profile and evaluate design decisions. Table 11 summarises the run-time architecture evaluation methods studied.
− In software architecture evaluation, utility functions are commonly used to select the optimal architecture option. They have also been adopted to determine the stakeholders' preferences towards quality attributes of interest and are therefore utilized as a way to model trade-offs between quality attributes. Utility functions have been used at run-time (i.e. utility-based) for self-adaptive and self-managed systems, such as [38, 40, 46, 61, 62, 67, 69, 75, 136, 141] (a minimal weighted-utility sketch follows this list).
− Other run-time evaluation approaches apply machine learning techniques (i.e. learning-based) to improve the decision-making process through profiling the observations of the system properties over time, as in the following studies [31, 61, 87, 133].
− [141] is one of the few attempts to extend scenario-based approaches to run-time. As mentioned earlier, this approach lacks the ability to learn over time and hence cannot forecast the future potential of architecture decisions.
− To the best of our knowledge, there are no economics-based approaches that evaluate architectures at run-time.
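As a minimal illustration of run-time, utility-based selection, the following Python sketch scores two hypothetical architecture options by a weighted utility over monitored quality attributes. The attribute weights, normalisation bounds and monitored values are illustrative assumptions, not drawn from any of the cited approaches.

    # Stakeholder-elicited weights over quality attributes (assumed).
    WEIGHTS = {"latency_ms": 0.5, "availability": 0.3, "cost_per_hour": 0.2}

    # For each attribute: (worst, best) bounds used to normalise raw values to [0, 1].
    BOUNDS = {"latency_ms": (500.0, 20.0), "availability": (0.95, 0.9999), "cost_per_hour": (10.0, 1.0)}

    def normalise(attr, value):
        worst, best = BOUNDS[attr]
        return max(0.0, min(1.0, (value - worst) / (best - worst)))

    def utility(monitored):
        return sum(w * normalise(a, monitored[a]) for a, w in WEIGHTS.items())

    # Hypothetical monitored QoS of two candidate architecture options.
    options = {
        "replicated":  {"latency_ms": 80.0,  "availability": 0.999, "cost_per_hour": 6.0},
        "single_node": {"latency_ms": 140.0, "availability": 0.98,  "cost_per_hour": 2.0},
    }
    best = max(options, key=lambda o: utility(options[o]))
    print(best, {o: round(utility(m), 3) for o, m in options.items()})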
Summary and Reflection: As far as we know, the majority of run-time evaluation approaches rely on models for the analysis, which may be subject to scalability and complexity concerns. To address this, these approaches have adopted machine learning algorithms, such as reinforcement learning, to update their models at run-time. Despite their potential, these approaches suffer from the following limitations: (i) they assume that the quality data about architecture decisions is available at every timestep, which may not be true in non-stationary environments such as IoT; and (ii) they lack the capability to check whether the current architecture decision is getting worse. The method proposed in [127] has provided some techniques to handle the above concerns but still requires further investigation, and more techniques are needed to enhance the evaluation. The most important component in a continuous evaluation approach is the run-time approach to be included. Some of the above approaches (e.g. [55, 127]) seem to provide important elements for a run-time approach in terms of providing learning techniques. These techniques could aid the architect in predicting the impact of architecture decisions on quality attributes under different scenarios of interest. However, few of the approaches were explicitly used in the context of software architecture evaluation (e.g. [127]).
3. Continuous Evaluation: We define continuous evaluation as multiple evaluations of the software architecture that begin at the early stages of development and are periodically and repeatedly performed throughout the lifetime of the software system.
− Continuous Performance Assessment of Software Architecture (CPASA) [117] is one of the few explicit attempts at continuous evaluation. It is an extension of PASA, with an explicit focus on deployment in an agile development process. It provides an interactive system that aids the architect in the automatic assessment of performance attributes through modelling of architecture decisions. The authors define "continuous" assessment as the production of continuous performance evaluation tests. Despite the attempts in PASA and CPASA to handle cost-benefit trade-offs, (i) the evaluation was incomplete; (ii) they do not use any run-time information to refine their architecture decisions; and (iii) they lack run-time monitoring and forecasting of the performance of architecture decisions. In such cases architecture is, at best, a modelling tool, which may (or may not) be applicable in dynamic environments. Therefore, these approaches are still design-time evaluation approaches.
− Further, the approaches proposed in [69], [136] and [127] seem to provide some initial attempts at continuous evaluation, but they suffer from the concerns mentioned in (ii) and (iii).
Summary and Reflection: A continuous architecture evaluation approach starts at design-time and continues to operate at run-time, with design-time architecture evaluation being at its earlier stages. Continuous evaluation shall provide built-in support for dealing with operational uncertainties and dynamicity, starting from design-time by predicting run-time behaviour and calibrating its evaluation at run-time and post-deployment. Continuous evaluation can leverage machine learning to provide predictive and proactive diagnostic capabilities; however, such improvement requires data that relate to the architecture design decisions and quality attribute performance, which might not always be available or easy to extract from operational and maintenance logs. In the absence of real-time data, the evaluation can, for example, benefit from info-symbiotic^a simulations and digital twin capabilities to improve the prospect of the evaluation in dealing with uncertainties.
^a A term that is widely used by the dynamic data driven simulation system community (e.g. http://1dddas.org/InfoSymbiotics/DDDAS2020, https://sites.google.com/view/dddas-conf/home).
4.3 Quality Attribute Considerations
4.3.1 Addressing quality attributes: Some evaluation methods focus on a single quality attribute and others on multiple quality attributes. Based on the results, we have found that most of the software architecture evaluation studies have addressed multiple quality attributes, e.g. modifiability with portability and extensibility [83], or stability with cost [12]. Other examples of studies are found in Tables 7-9.
4.3.2 Supported Quality Attributes: We categorised the quality attributes supported into general and specific. For general, we consider the literature that discusses the support of any quality attribute, such as performance, availability, reliability, etc. For instance, some studies (62% of included studies) propose methods that are generic enough to be applicable to various quality attributes. However, there are others that focus on specific quality attributes (e.g. performance only, i.e. any single QA), 36% of included studies, whereas others focus on cost only, 49% of included studies. Based on our review, for the approaches that evaluate the software architecture at design-time, some studies (e.g. [43, 70, 79, 82, 84, 145]) accept generic quality attributes, whereas others focus on specific quality attributes (e.g. [12, 18, 19, 79, 82–84, 96, 104, 145]). One remarkable observation is that very few run-time architecture evaluation approaches consider costs in the evaluation process (e.g. [40, 61, 127, 136, 141]), and most of the run-time approaches evaluate with respect to specific quality attributes (e.g. performance and energy consumption [32], and reliability and performance [30, 31, 75]). As for the few continuous approaches, their proposed techniques could be applied to generic quality attributes. Other examples of studies are found in Tables 7-9. The way existing evaluation methods consider cost and value is not done in isolation but in alignment with the qualities under consideration and their trade-offs. Our review holds examples from mainstream architecture evaluation methods (e.g. [5, 34, 79, 82, 84, 139, 145]). Other examples are shown in Tables 7-9.
4.3.3 Monitoring and treatment of quality attributes: This criterion is relevant to run-time and continuous evaluation approaches, where quality attribute values are either determined through run-time monitoring or through prediction. Similarly, for the treatment there are two types [72, 93, 121]: reactive and proactive. A reactive approach triggers a switch after experiencing a drop in performance, a goal violation, etc. A proactive approach switches architecture options without experiencing a drop in performance; instead it is based on predictions that a significant change in performance may occur in the near future. Based on our investigation, most of the approaches used reactive monitoring and treatment of quality attributes (e.g. [30, 31, 61, 62, 67]), whereas very few approaches embedded proactivity into their architecture evaluation method (e.g. [46, 69]). Other examples of studies are found in Tables 7-9.
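The distinction between reactive and proactive treatment can be illustrated with a minimal Python sketch: the reactive rule switches only after an observed violation, whereas the proactive rule switches when a naive trend forecast predicts a violation. The threshold, window and linear forecast are illustrative assumptions.

    THRESHOLD_MS = 100.0

    def reactive_switch(history):
        """Switch only after the last observation violates the threshold."""
        return history[-1] > THRESHOLD_MS

    def proactive_switch(history, horizon=3):
        """Switch if a naive linear trend predicts a violation within `horizon` steps."""
        if len(history) < 2:
            return False
        slope = (history[-1] - history[0]) / (len(history) - 1)
        return history[-1] + slope * horizon > THRESHOLD_MS

    window = [70, 75, 82, 90, 96]   # rising trend, no violation yet
    print(reactive_switch(window))   # False: nothing has broken yet
    print(proactive_switch(window))  # True: the trend predicts a violation soon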
Summary and Reflection: Quality attributes continue to be the driver for architecture evaluation, testing the architecture's fitness with respect to the considered attributes. Considering multiple quality attributes and their simultaneous effect on the architecture is still a challenging task if the evaluation is to consider uncertainties that relate to the provision and support of these attributes. Research also has to look at how the evaluation can consider multiple sources of uncertainty that relate to the simultaneous provision of these attributes. Research can benefit from search-based and evolutionary computing to provide the basis for automatic refinements of the architecture in supporting quality attributes and embracing various sources of uncertainty. The challenge, however, is to construct sound fitness functions and stopping criteria for managing the search. The support can go beyond classical monitoring and reactive interventions to provide a holistic approach for proactive and preventive diagnostics of the software architecture, with multiple qualities and their corresponding sources of uncertainty as first-class citizens in the evaluation.
4.4 Level of Autonomy
4.4.1 Management of Stakeholder Involvement in Evaluation: This category has been further categorised into human-reliant, semi-autonomous, and autonomous. We distinguish between: (i) human-reliant (i.e. totally dependent on stakeholders for evaluating the behaviour of candidate architecture options); (ii) semi-autonomous processes for architecture evaluation, with a human in the loop (e.g. stakeholders and architects in the loop for interactive evaluation); and (iii) autonomous (i.e. the evaluation is performed autonomously without human intervention). To further clarify these categories, we consider the case of architecture evaluation in self-adaptive systems (SAS): there are human-reliant architecture design decisions (such as whether to introduce a self-adaptation mechanism), semi-autonomous ones (such as human-in-the-loop participation in self-adaptive systems [33]), and autonomous architecture design decisions (such as the SAS adapting and deploying components to different servers at run-time). Another example of the use of autonomous architecture design decisions is the incorporation of intelligent and learning mechanisms, evolutionary computation, etc., to assist in the automatic evaluation of decisions. Continuous architecture evaluation can monitor QAs and suggest re-configuration from a repository of candidate options, some of which have established technical viability but require further profiling and confirmation. The evaluation process can then learn and suggest a suitable configuration; it can also call for further refinements and/or the phasing out of a reconfiguration.
For classical design-time architecture evaluation approaches (e.g. scenario-based), most tend to fully involve the stakeholders in their analysis, e.g. ATAM, CBAM, ATMIS, etc. Other design-time approaches (e.g. utility-based, economics-based and search-based) are semi-autonomous, such as [5, 12, 70, 95, 105, 113], in the sense of requiring some inputs for evaluation from the architect (e.g. utilities, users' satisfaction towards quality attributes, QoS constraints, etc.). Since run-time architecture evaluation approaches operate at run-time (e.g. learning-based), most of them are autonomous (e.g. [31, 32, 40, 87, 133]), whereas a few require some human involvement (e.g. [30, 75, 141]). Other studies are depicted in Tables 7-9.
4.4.2 Management of Trade-offs: A common problem in selecting an optimal architecture decision is the management of trade-offs [21]. For example, an architecture decision concerning a sensor could provide fast response times but low energy efficiency, so one objective could be to select an architecture decision that can satisfy both quality attributes. There are two types of trade-off management: manual and automatic. Manual management denotes the adoption of tools or techniques that require human intervention, whereas automatic indicates the use of parametric models that automatically select and/or shortlist trade-off candidates. Some of the design-time architecture evaluation approaches (e.g. ATAM, CBAM, ATMIS, and APTIA) handle trade-offs manually through the analysis of trade-off points elicited from stakeholders, whereas others do not consider trade-offs at all (e.g. SAAM, SBAR, SQUASH, and ALMA). As for the run-time architecture evaluation approaches, some provide automatic management of trade-offs, such as [30, 32, 40, 62, 87, 127], whereas a noticeable observation is that many approaches have no support for trade-off management, such as [31, 67, 69, 75, 141]. Other studies are shown in Tables 7-9.
Summary and Reflection: Providing semi- or fully-autonomous and automated techniques for trade-off management is crucial in a continuous evaluation framework. In particular, research shall look at how the evaluation can support continuous and seamless management of various quality trade-offs and their corresponding uncertainties. In line with what we discussed in the quality attribute considerations section, seamless management may need to consider simultaneous qualities, their inference, risk contributions and aversions. Additionally, the autonomous evaluation can operate on various views of the architecture (e.g. the 4+1 views [92]), where the evaluation can then converge to a seamless negotiation of the various views for conflicts and reconcile these views while considering the various uncertainties within the architecture and across the views; the ultimate objective is to provide a holistic, seamless evaluation of the architecture.
4.5 Uncertainty Management
4.5.1 Source of Uncertainty: As aforementioned in Section 1.1, an architecture can experience two sources of uncertainty: aleatory and epistemic [15, 50, 66]. To summarise, the aleatory conception of uncertainty holds that uncertainty arises from variability in the possible realisations of a stochastic event, where unknown and different results may appear every time one runs an experiment under similar conditions; the epistemic conception denotes uncertainty arising from lack of confidence in, or missing knowledge about, a fact which is either true or false. We analysed the works based on the sources of uncertainty they address (a toy illustration of the two sources follows this list).
− We found that most of the design-time architecture evaluation approaches address epistemic uncertainty (e.g. [18, 19, 43, 79, 82, 83]).
− Aleatory uncertainty is encountered in most of the run-time architecture evaluation approaches (e.g. [32, 40, 64, 87, 96, 133]).
− On the contrary, very few design-time (e.g. [64, 79, 104, 105]), run-time (e.g. [30, 40, 58, 61, 62, 136]), and continuous (e.g. [127]) approaches address both epistemic and aleatory uncertainties.
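A toy Python illustration of the two sources, under made-up numbers: aleatory uncertainty appears as run-to-run variability of a known stochastic process, while epistemic uncertainty appears as our imperfect estimate of the process parameter itself.

    import random
    import statistics

    TRUE_FAILURE_RATE = 0.02   # ground truth, unknown to the evaluator

    def run_experiment(n_requests=1000, rate=TRUE_FAILURE_RATE):
        # Aleatory: identical conditions, different outcomes on every run.
        return sum(random.random() < rate for _ in range(n_requests))

    runs = [run_experiment() for _ in range(5)]
    print("observed failures per run:", runs)

    # Epistemic: our estimate of the rate is uncertain because knowledge is limited;
    # more data narrows this estimate, whereas the aleatory spread above remains.
    print("estimated failure rate:", round(statistics.mean(runs) / 1000, 4))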
4.5.2 Treatment of Uncertainty: In the research literature there are approaches that deal with uncertainty explicitly or implicitly. Explicit approaches are those that consider uncertainty a main focus, whereas implicit ones do not mention uncertainty but their tools and techniques could be used to handle it. Next, we summarise how the studied architecture evaluation approaches dealt with uncertainty.
− Uncertainties and risks linked to the deployment are implicitly discussed and mitigated through envisioning a set of scenarios, taking the form of use case, growth, and exploratory scenarios [43, 85, 90], as defined by ATAM (a design-time architecture evaluation approach). A use case scenario reveals how stakeholders envision the system usage. A growth scenario illustrates planned and foreseen refinements to the architecture, whereas an exploratory scenario helps to probe the extent to which the architecture can adapt to future changes (e.g. functionality upgrades, new quality attribute requirements). Hence, the evaluation and its conclusions are highly dependent on the choice of these scenarios.
Table 6. Summary of Contributions for the included studies with Respect to Approaches to Evaluation.

Stage | Utility-based | Scenario-based | Parametric-based | Search-based | Economics-based | Learning-based
Design-time | [18, 19, 63, 79, 82, 84, 95, 139, 145] | [18, 19, 79, 82–84, 139, 145] | [5, 18, 84, 103, 145] | [5, 28, 63, 95, 102, 103] | [12, 13, 112, 114] | -
Design-time (continued) | [5, 42, 57, 64, 70, 94, 96, 113] [63, 95, 112, 114] [102, 103, 105] [86, 102–105]
Run-time | [30–32, 40, 62, 75, 87, 133, 141], [46, 58, 61, 67, 69, 106] | [141] | [40, 58] | [62, 67] | - | [61, 87, 106, 133]
Continuous | [69, 117, 127, 136] | [136] | [69] | [69, 127] | [127, 136] | [127]
ATAM defines and records the risks that may threaten the achievement of quality attribute goals. These include architecture decisions leading to subsequent problems in some quality attributes (risks), architecture decisions where a slight alteration results in a significant impact on quality attribute responses (sensitivity points), and the simultaneous effect of a single decision on multiple quality attributes (trade-off points) [81]. ATAM focuses on the risks and benefits of architecture decisions and does not explicitly consider cost. CBAM extends ATAM by considering cost-benefit trade-offs.
− To summarise, among the design-time scenario-based architecture evaluation approaches, ATAM, CBAM, ATMIS, SQUASH and APTIA partially capture uncertainty through scenarios (as mentioned in the previous point), although they do not conduct evaluation at run-time. However, they suffer from the same drawbacks as design-time evaluation in general (i.e. high reliance on stakeholders). ATMIS is also specifically tailored to security. ALMA is similar to ATAM and CBAM in taking a more utility-based perspective for the evaluation; it aids in performing architecture evaluation more systematically than SBAR. Scenario-based approaches do not provide explicit management of uncertainties, and include manual tools/techniques which may not be effective at run-time.
− For the other architecture evaluation approaches, some provided explicit management of uncertainty through the use of probability distributions (e.g. [103, 104]), fuzzy math (e.g. [62, 63]), Monte Carlo simulation (e.g. [95]), Modern Portfolio Theory (e.g. [112]), Real Options Analysis (e.g. [12, 13, 114]), ageing coefficients (e.g. [31, 127]), the AHP consistency rate [86], utility theory (e.g. [41]), etc.
Summary and Reflection: The treatment of uncertainty, its sources and its management has been discussed in earlier sections in relation to quality attribute management, trade-offs and autonomy, covering various stages, techniques (e.g., utility-based, economics-based, evolutionary, search-based) and methods of evaluation (e.g., design-time and run-time). This is because the discussion and treatment of uncertainty is orthogonal to all of the above and cannot be discussed in isolation from the solution domains. Interested readers can refer to the relevant summary and reflection sections. However, the software architecture evaluation community may need to develop a common language and knowledge for eliciting architecture uncertainties at various levels and provide guidance for mitigating their consequences on the software architecture. The community may also identify various techniques for managing the uncertainties, covering various contexts, application domains, etc.
4.6 Limitations of the Review
Though this review was developed following the typical systematic literature review methodology [88, 89, 116], there are some limitations that require clarification:
− The main threat to validity in this SLR is selection bias when including the studies and extracting the data. To resolve this when determining the relevant studies, a research protocol (Section 2) was conducted. We applied this protocol to set out the objectives of the review, the necessary background, the research questions, inclusion and exclusion criteria,

Table 7. Summary of Contributions for Design-time Architecture Evaluation approaches with other categories.

Category (Design-time) | Sub-category | Representative Contributions
Addressing QA | Single | [19, 104, 139]
Addressing QA | Multiple | [18, 28, 43, 79, 82, 83], [70, 84, 86, 94, 102, 145], [5, 12, 28, 95, 113, 117], [13, 57, 103, 105, 112, 114]
Supported QA | General | [28, 43, 79, 82, 84, 145], [63, 70, 86, 94, 113], [12, 57, 95, 105, 112, 117]
Supported QA | Specific (Cost) | [5, 79, 82, 84, 139, 145], [42, 70, 86, 94, 117], [12, 13, 63, 95, 113], [112, 114]
Supported QA | Specific (Other QA) | [18, 19, 64, 83, 96, 102], [12, 13, 103, 104, 139]
Management of stakeholder input | Full | [43, 79, 82, 83, 86, 139], [18, 19, 84, 96, 145], [57, 64, 94, 117]
Management of stakeholder input | Semi-Autonomous | [5, 12, 28, 70, 95, 105, 113], [13, 42, 63, 102–104, 112, 114]
Management of Trade-offs | Manual | [43, 82, 84, 86, 145], [12, 13, 64, 94, 117, 139], [112, 114]
Management of Trade-offs | Automatic | [5, 28, 42, 63, 70, 95, 102, 103, 113]
Management of Trade-offs | No Support | [18, 19, 57, 79, 83, 96, 104, 105]
Treatment of Uncertainty | Implicit | [18, 19, 43, 82, 83], [5, 70, 79, 84, 96], [57, 117, 139]
Treatment of Uncertainty | Explicit | [86, 94, 95, 102, 113, 145], [12, 13, 28, 42, 64, 114], [63, 103–105, 112]
Source of Uncertainty | Epistemic | [18, 19, 43, 79, 82, 83, 102, 139], [5, 64, 84, 86, 94, 117, 145, 145], [12, 13, 57, 63, 95, 113], [42, 70, 105, 112, 114], [18, 19, 28, 43, 82, 83, 104]
Source of Uncertainty | Aleatory | [86, 94, 95, 113, 145], [12, 13, 42, 64, 102, 114], [63, 103–105, 112]
search strategy, data extraction and analysis of gathered data. The SLR protocol was arranged by one author and then revised by the other authors to verify and evaluate the research questions and whether the search queries map to the review objectives and research questions. They also checked the relevance between the data to be extracted and the research questions.
Table 8. Summary of Contributions for Run-time Architecture Evaluation approaches with other categories.

Category (Run-time) | Sub-category | Representative Contributions
Addressing QA | Single | [58, 67, 133]
Addressing QA | Multiple | [32, 40, 46, 75, 87, 106, 141], [30, 31, 61, 62, 69]
Supported QA | General | [40, 46, 61, 62, 69, 87, 106, 141]
Supported QA | Specific (Cost) | [40, 61, 141]
Supported QA | Specific (Other QA) | [30, 31, 58, 67, 75, 133]
Management of stakeholder input | Semi-Autonomous | [30, 58, 62, 67, 69, 75, 141]
Management of stakeholder input | Autonomous | [31, 32, 40, 46, 61, 87, 106, 133]
Management of Trade-offs | Automatic | [30, 32, 40, 46, 61, 62, 87]
Management of Trade-offs | No Support | [31, 58, 67, 69, 75, 106, 133, 141]
Treatment of Uncertainty | Implicit | [32, 46, 58, 75, 141]
Treatment of Uncertainty | Explicit | [31, 40, 62, 87, 133], [30, 61, 67, 69, 106]
Source of Uncertainty | Epistemic | [18, 19, 43, 79, 82, 83, 114], [5, 64, 84, 86, 94, 112, 145], [12, 13, 42, 57, 63, 70, 95, 113]
Source of Uncertainty | Aleatory | [32, 40, 64, 87, 96, 133], [58, 61, 62, 67, 69, 75, 106, 136, 141]
Monitoring and Treatment of QAs | Reactive | [30, 32, 40, 75, 87, 133, 141], [31, 61, 62, 67]
Monitoring and Treatment of QAs | Proactive | [46, 58, 69, 106]
Table 9. Summary of Contributions for Continuous Architecture Evaluation approaches with other categories.

Category (Continuous) | Sub-category | Representative Contributions
Addressing QA | Single | -
Addressing QA | Multiple | [69, 117, 127, 136]
Supported QA | General | [69, 117, 127, 136]
Supported QA | Specific (Cost) | [117, 127, 136]
Management of stakeholder input | Human-Reliant | [117]
Management of stakeholder input | Semi-Autonomous | [69]
Management of stakeholder input | Autonomous | [127, 136]
Management of Trade-offs | Manual | [117]
Management of Trade-offs | Automatic | [127, 136]
Management of Trade-offs | No Support | [69]
Treatment of Uncertainty | Implicit | [117]
Treatment of Uncertainty | Explicit | [69, 127, 136]
Source of Uncertainty | Epistemic | [117, 127, 136]
Source of Uncertainty | Aleatory | [69, 127, 136]
Monitoring and Treatment of QAs | No treatment | [117]
Monitoring and Treatment of QAs | Reactive | [127, 136]
Monitoring and Treatment of QAs | Proactive | [69]

− Several junior and senior researchers (with up to 15-30 years of experience in architecture evaluation) assessed and reviewed the SLR. They provided feedback which reduced the bias in the formalisation of the protocol due to the selection of search keywords. There is still a risk of missing some related studies. This could occur in cases where software architecture evaluation keywords are not standardized and clearly identified. For instance, continuous evaluation is defined under different terms, such as continuous, run-time, dynamic, etc. Therefore, we agreed with each other on the definitions of unclear keywords. In some cases it was difficult to elaborate how the authors of reviewed studies interpreted terms such as continuous, run-time or dynamic (Section 2.3.1). In this context, we tried our best to include all the related terms that imply continuity. However, we cannot guarantee completeness.
− We also used a data extraction form to select information for answering the research questions, hence improving the consistency of data extraction (Section 2.6). To ensure that the findings and results were credible, we conducted a quality assessment on the related studies (Section 2.5).
− The limited number of included studies might open a question about the completeness and coverage of the review, as compared to other SLRs (e.g. [7]). However, the objective of this review was to focus on a specific goal, i.e. the state-of-the-art in software architecture evaluation approaches for uncertainty and the extent to which continuous software architecture evaluation approaches are used. This results in a narrowed scope for the review. This is analogous to the case of [98], which conducted a review focusing on methods that handle multiple quality attributes in architecture-based self-adaptive systems (54 included studies), and [99], which studied the variability in quality attributes of service-based software systems (48 included studies). The narrow scope of SLRs explains the limited number of search results and included studies. We believe that the studies relevant to the research topic were indeed included. Further, the quality of the conferences, journals, and books of the included studies ensures the significance of the analysis.
− In our search execution, some relevant studies may not have appeared in the search results of the bibliographical sources, since automated searches depend on the quality of the search engine. However, the selected bibliographical sources are considered the largest and most significant sources for conducting SLRs and the most used ones in software architecture and software engineering [27, 98]. We also performed manual and automated searches through the most popular venues for software architecture and software engineering [98]. Consequently, we are confident that the included studies are the most relevant and important ones and that further relevant studies are unlikely to have been missed.
− We applied our search to meta-data (i.e. abstract, title, and keywords) only, and some studies might have used architecture evaluation as part of their proposed work without mentioning it explicitly there. Since authors define the meta-data of their own studies, our set of included studies also depends on the quality of the bibliographical digital sources in classifying and indexing studies.
− One of the main threats to validity is the validation of the classification framework. In this context, the development of the classification framework was guided by a method for building taxonomies [110], where we took a conceptual-to-empirical approach informed by the SLR to capture the concepts of software architecture evaluation under uncertainty. The process was iterative. We then applied subjective and objective evaluation to validate our classification framework. The subjective evaluation of the process of building the classification framework was inspired by [110]. In particular, our team members held several interactive sessions (∼4 meetings) to discuss the initial build-up of the classification framework. Subsequent iterations and refinements were informed by three working and feedback sessions
with team members, each taking an average of 2.5 hrs: one senior member with more than 30 years of experience in academic and industrial software architecture research, considered to be one of the founders of the field of architecture evaluation; a second senior member with more than 20 years of experience in software architecture research and practice; and another two members with 5-6 years of experience in software architecture and computational intelligence in software engineering, covering uncertainties. Our team also consulted two external collaborators with expertise in software architecture for additional feedback. The following criteria, inspired by [110], informed our refinements and iterations: checks for the extent to which the classification framework is concise (i.e. with a limited number of dimensions and a limited number of characteristics for each dimension), robust (i.e. with sufficient dimensions and characteristics to determine software architecture evaluation approaches under uncertainty), comprehensive (i.e. able to categorize all known dimensions of architecture evaluation approaches under uncertainty within the software architecture domain), extensible (i.e. allowing the inclusion of additional dimensions and new characteristics within a dimension when new types of architecture evaluation approaches under uncertainty appear), and explanatory (i.e. providing useful discussion of the architecture evaluation approaches under uncertainty to facilitate the understanding of how to evaluate software architectures under uncertainty). As for the objective evaluation, also inspired by [110], we ensured that every category (e.g. Quality Evaluation, Quality Attributes Considerations, Level of Autonomy, and Uncertainty Management) is unique and not repeated. All characteristics of architecture evaluation under uncertainty have been examined, and no new characteristics needed to be added.
− Our review focuses on architecture evaluation in the presence of uncertainty. In particular, the focus of the survey is on how existing architecture evaluation methods and commonly used approaches can provide ways of mitigating uncertainty. Architecture evaluation can take several forms: the methods can be bespoke, providing phases and systematic guidance for architects to evaluate the extent to which the architecture can meet its non-functional goals and trade-offs (e.g. ATAM, CBAM). These methods can provide support for mitigating uncertainties. As an example, the use of exploratory and stress scenarios in ATAM is a way to anticipate likely or extreme cases and to design the architecture so that it can withstand these changes. Additionally, architecture evaluation can also focus on one concern (e.g. performance or security), where the analysis can utilise low-level design models (e.g. state charts) and model-based analysis to analyse the system for specific qualities. Though these approaches are often regarded as design-level evaluation with a restricted focus on specific qualities (e.g. performance, security, reliability, liveness, etc.), the feedback gathered from their low-level design analysis can help architects to refine the software architecture under evaluation (e.g. ATMIS [64], performance modelling approaches [76–78, 135], etc.). Analysis using model-based approaches can help architects reach architectures that are more robust with respect to the qualities of interest (e.g. security or performance) through continuous refinements that can better cater for uncertainties. For example, the architect can use performance models [76–78, 135] to inform refinements of the architecture so that it copes better with uncertainties. Model-based analyses are design-level analyses: they typically focus on one or more sets of qualities using model-based modelling, analysis, and tooling. Though such analysis operates at a lower level of abstraction than the architecture, its feedback can help software architects and designers to evaluate software architectures for uncertainties and to suggest refinements that better mitigate them. These methods were not specifically discussed as either (i) methods for architecture evaluation, or (ii) methods for evaluating and mitigating uncertainties.
Nevertheless, we acknowledge their complementary role, should the architect wish to utilise them. Hence, model-based analysis is not the core objective of our survey, given its versatile and context-dependent use.
− Since the self-adaptive and self-managed domain is large, we did our best to include the studies that feature architecture evaluation as part of their approach. In particular, we added studies from a list of 5974 papers (the output of the search process in Figure 1) obtained through search databases and a snowballing process, in addition to some manual search. However, we may have missed some works unintentionally.
− Furthermore, a common threat to validity is that for some criteria, such as dealing with uncertainty and the management of trade-offs, the papers' authors do not explicitly mention whether they are addressed or not. In this context, we attempted to infer these criteria. Similarly, a common concern in the run-time approaches is that, in most cases, "the proactiveness or reactiveness of the approaches are not explicitly discussed and it can only be inferred from the adaptation strategies" [98]. Accordingly, we made our best effort to infer the reactiveness and proactiveness of the examined approaches.
− Other approaches, such as self-healing works, were excluded. Self-healing refers to the process of automatic recovery from failure, whereas our SLR is concerned with the extent to which architecture design decisions, tactics, and architecture choices meet the quality requirements of the system and their trade-offs. As for uncertainty, it refers to the evaluation of these decisions in situations where it is difficult to predict the performance of these qualities due to dynamism in the system's operations and/or the lack of an adequate understanding of the application domain. Though self-healing is not among the objectives of this paper, it can represent a specific scenario for the evaluation, where architects evaluate the extent to which the architecture design decisions can realise self-recovery from faults under uncertainty.
− Some continuous approaches were excluded from the list of studies. For example, the focus of [17] has been primarily on development, whereas [146] focused on continuous testing, so their relevance to the inclusion criteria is weak. Nevertheless, these types of approaches have motivated us to review and introduce continuous software architecture evaluation to the software architecture community.
5 RELATED REVIEWS
In the area of design-time architecture evaluation, there are many studies, such as [11, 27, 53, 122]. For instance, Dobrica and Niemela [53] focused on surveying the most popular methods, such as ATAM [85], CBAM [82], and ALMA [19]. Babar et al. [11] provided a framework for classifying design-time software architecture evaluation methods, and a comparative analysis specifically of the scenario-based approaches in [10]. Roy et al. [122] extended the previous reviews and considered most of the design-time evaluation methods available at that time. The authors in [27] systematically reviewed and classified architecture evaluation methods from the architecture evolution perspective.
Other surveys focused on run-time methods, such as self-adaptive systems [47, 93, 98], self-managed systems [26], and models@run-time [131]. From [26, 47, 93, 98, 131], we found that none of the studies explicitly demonstrated the use of run-time architecture evaluation principles, and none of the works examined continuous software architecture evaluation. This is surprising because some research studies implicitly provide the elements of a continuous evaluation approach. Our survey bridges this gap by rethinking architecture evaluation and providing classifications that can do the following: (i) help architects to conduct the evaluation in continuous settings by determining the elements of a continuous evaluation approach; (ii) help in identifying common approaches for
this type of evaluation; (iii) identify common concerns for systems that can benefit from this type of evaluation; (iv) point out the strengths and weaknesses of these types of approaches.
6 DISCUSSION AND RECOMMENDATIONS FOR FUTURE RESEARCH
Based on the SLR, it is clear that the area of software architecture evaluation has received substantial attention in recent years. Nevertheless, the results reveal several observations that could guide future research. In particular, this SLR has identified several gaps in relation to architecture evaluation for uncertainty with respect to decisions related to designing dynamic and complex systems, such as IoT, cloud, and volunteer computing. In this context, this section addresses the third research question, RQ3: What are the current trends and future directions in software architecture evaluation for uncertainty and their consideration for continuous evaluation? This question aims to show how we can benefit from the existing approaches, drawing inspiration from their requirements and addressing their pitfalls when developing a continuous evaluation approach. In Section 6.1, we present the maturation stages and classification of the architecture evaluation research area. We then highlight the important objectives that should be accomplished by the research community to advance this research area (Sections 6.2 and 6.3). This is in line with the summary and reflection presented in Section 4.
6.1 Research Area Maturation
In this systematic review, we aim to investigate the extent to which architecture evaluation for uncertainty, and the consideration for continuous evaluation, have matured as a discipline. For this purpose, we examine the included studies with respect to the Redwine-Riddle model [119], which provides six stages of technology (research area) maturity. These stages are [119]:
1. Basic Research: investigating the ideas and concepts, and providing a clear articulation of the problem's scope.
2. Concept Formulation: presenting a comprehensive evaluation of the solution approach through a seminal paper or a demonstration system.
3. Development and Extension: preliminary use of the ideas and extension of the general approach to a broader solution.
4. Internal Enhancement and Exploration: extending the general approach to solve real problems in other research areas.
5. External Enhancement and Exploration: creating a broader group and involving them in decision-making to provide substantial evidence of value and applicability.
6. Popularization: showing production quality, providing supported versions, as well as marketing and commercializing the technology.
Initially, one author classified the 48 included studies against the Redwine-Riddle model, and the outcome was revised independently by the other authors. Discussions and agreements were carried out in cases of discrepancies between the authors' categorizations. Figure 5 shows the results of the classification. It is clear that almost 80% of the studies are still in the early maturity stages (Basic Research and Concept Formulation), whereas almost 20% have been extended to broader problem domains and applied in practice. Among the approaches that are already adopted by industry, none are deployed at run-time; they focus only on design-time evaluation. In particular, maturity has only been demonstrated for design-time approaches, such as ATAM and CBAM. This explains why only 4% (2) of the approaches have reached the popularization stage. In Appendix B, we tabulate the studies with respect to domain maturity level.
Fig. 5. Distribution of the included studies over the domain maturity classification model (the maturity distribution is shown in percentages).

We have seen some examples of continuous evaluation that are either implicit, partial, or explicit, such as CPASA and DevOps. However, these research efforts have not demonstrated and
documented how to adapt those practices in the evaluation of software architectures for uncertainty. Therefore, to mature the architecture evaluation research area with continuous evaluation approaches, we need a set of guidelines, tools, systematic procedures, acceptance from (and case studies with) real-world organizations, and shared benchmarks across companies for best practices.
6.2 Leveraging Existing Approaches To Develop A Continuous Evaluation Framework
Having conducted this SLR, we observe that elements from different approaches could be combined to develop a continuous software architecture evaluation framework. We briefly present our views on potential combinations that seem worthwhile to explore further:
• Architecture capabilities that can better cope and respond to uncertainty: examples of these capabilities are architecture design diversification [51], tactics for meeting non-functional requirements, etc. Consider diversification as one of the capabilities that could enrich the architecture to cater for uncertainty and provide means for reliability and for continuously meeting the behavioural requirements. Such a capability requires the software architecture community to leverage findings on design diversity in software engineering to develop fundamentals of software architecture diversity for uncertainty, covering styles, decisions, tactics, etc., which is in line with our earlier work [126, 127], as well as to rethink architecture evaluation to consider dynamism and uncertainty. In this context, a systematic design-time evaluation approach that can deal with these capabilities and handle uncertainties is necessary; it is an important foundation of a continuous evaluation framework. Some initial works have discussed these potentials [126], but further investigation is still required.
• The use of economics-based approaches in architecture evaluation: based on the existing approaches, we infer that there is a lack of well-documented, real-world examples of economics-based approaches in the context of design-time evaluation (Tables 6 and 10). In particular, these approaches ([12, 13, 114]) have not been used to deal with cost-benefit trade-offs in dynamic environments, such as IoT. Further, they have not been explored from the perspective of forecasting the long-term value of architecture decisions to determine whether complex design decisions, such as diversity in design [51], can handle uncertainties attributed to dynamic changes in the environment. As mentioned previously, CBAM [82] is a scenario-based design-time evaluation method that determines the influence of architecture decisions on cost-benefit trade-offs. CBAM provides an implicit mitigation of uncertainty through different types of scenarios. However, this type of evaluation approach would not be suitable for emerging technologies and paradigms, such as IoT and cloud-based systems. We believe that economics-based approaches, such as real options analysis [8] and modern portfolio theory [100], could be combined with CBAM to support the analysis (a minimal, purely illustrative sketch of an options-style valuation is given after this list). Real options analysis is one of the few design-time techniques that can embed flexibility under uncertainty. Therefore, it can aid the architect in predicting the impact of architecture decisions on the quality attributes of interest. It can also shortlist the candidate options for deployment at run-time and thus reduce unnecessary costs. This is still very much a research area that requires further investigation in the context of design-time evaluation, as an initial stage of continuous evaluation.
• New methods for continuous architecture evaluation that interleave and intertwine design-time and run-time architecture evaluation: we found that most of the architecture evaluation approaches focus on design-time (about 60% of the approaches) and fewer on run-time (about 40% of the approaches). Evaluation approaches also tend to focus on development (i.e. mostly human-centric activities) and lack a consolidated approach that integrates design-time and run-time considerations. On the contrary, in the context of architecting and evaluating dynamic and complex systems, a more continuous approach that starts at the early stages of development and continues to evaluate the architecture options during the lifetime of the system at run-time is necessary to cope with operational uncertainties, such as high fluctuations in QoS, sensor ageing effects, etc. (A skeleton of such a loop is sketched at the end of this list.)
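To make the idea of embedding flexibility more concrete, the following minimal sketch (our own illustration, not drawn from any of the reviewed studies) values the option to defer a costly architectural decision using a one-period binomial model. The payoff figures, up/down factors, risk-free rate, and investment cost are purely hypothetical assumptions.

```python
# Illustrative sketch only: valuing the flexibility to defer an architectural
# investment (e.g. adopting a diversified design) with a one-period binomial model.
# All numbers are hypothetical assumptions, not data from the reviewed studies.

def deferral_option_value(benefit_now: float,
                          up: float, down: float,
                          risk_free: float,
                          investment_cost: float) -> float:
    """Value of waiting one period before committing to an architecture decision."""
    # Risk-neutral probability implied by the assumed up/down factors.
    p = ((1 + risk_free) - down) / (up - down)

    # Payoff of committing next period in each scenario (never negative:
    # the architect simply does not invest when it is not worthwhile).
    payoff_up = max(benefit_now * up - investment_cost, 0.0)
    payoff_down = max(benefit_now * down - investment_cost, 0.0)

    # Discounted expected payoff under the risk-neutral measure.
    return (p * payoff_up + (1 - p) * payoff_down) / (1 + risk_free)


if __name__ == "__main__":
    # Hypothetical figures for a single candidate decision.
    option = deferral_option_value(benefit_now=100.0, up=1.4, down=0.7,
                                   risk_free=0.05, investment_cost=95.0)
    commit_now = 100.0 - 95.0  # naive NPV of committing immediately
    print(f"Commit now (NPV): {commit_now:.2f}")
    print(f"Value of deferring: {option:.2f}")
```

Under these assumed numbers, waiting is worth more than committing immediately; this is the kind of comparison that an options-style extension of a CBAM-like analysis could surface for decisions whose value is uncertain.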
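The skeleton below (again our own illustration rather than an approach proposed in the reviewed studies) shows how a design-time shortlist of options could be interleaved with a run-time loop that monitors QoS, forecasts each option's utility with a simple moving-average learner, and switches proactively only when the predicted gain is significant. The option names, utility weights, switching margin, and the placeholder monitor() are all assumptions.

```python
# Illustrative skeleton of a continuous (design-time + run-time) evaluation loop.
# Names, weights, and thresholds are hypothetical; monitor() stands in for
# whatever QoS monitoring infrastructure a concrete system provides.
import random
from collections import defaultdict, deque

OPTIONS = ["centralised", "diversified", "edge-offloading"]  # design-time shortlist
WEIGHTS = {"latency": -0.7, "cost": -0.3}                    # utility weights (assumed)
SWITCH_MARGIN = 0.1                                          # hysteresis against needless switches

history = defaultdict(lambda: deque(maxlen=20))              # recent utilities per option

def monitor(option):
    """Placeholder for a run-time QoS observation of the given option."""
    return {"latency": random.uniform(0.1, 1.0), "cost": random.uniform(0.1, 0.5)}

def utility(qos):
    return sum(WEIGHTS[k] * v for k, v in qos.items())

def forecast(option):
    """Very simple learner: moving average of recent utilities (proactive estimate)."""
    h = history[option]
    return sum(h) / len(h) if h else float("-inf")

def continuous_evaluation(steps=50):
    deployed = OPTIONS[0]                  # choice made by design-time evaluation
    for _ in range(steps):
        history[deployed].append(utility(monitor(deployed)))   # run-time observation
        probe = random.choice(OPTIONS)                         # occasional probing of alternatives
        history[probe].append(utility(monitor(probe)))
        # Proactive re-evaluation: switch only if the forecast gain is significant.
        best = max(OPTIONS, key=forecast)
        if best != deployed and forecast(best) > forecast(deployed) + SWITCH_MARGIN:
            deployed = best
    return deployed

if __name__ == "__main__":
    print("Option in use after continuous evaluation:", continuous_evaluation())
```

The switching margin plays the role of guarding against the unnecessary switches noted in our review, while the forecasting step is one simple way to make the loop proactive rather than purely reactive.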
6.3 Finding The Necessary Ingredients For Developing A Continuous Evaluation Framework
Modern software system environments, such as IoT, cloud, volunteer computing, and microservices, are a challenge for existing software architecture evaluation methods. Such systems are largely data-driven and characterised by their dynamism, unpredictability in operation, hyper-connectivity, and scalability. Properties such as performance, delayed delivery, and scalability are acknowledged to pose great risk and are difficult to evaluate at design-time only. Therefore, a run-time evaluation approach is necessary to complement design-time analysis. This run-time stage should be able to handle different sources of uncertainty and to evaluate complex design-time decisions. In this regard, we need to determine the necessary ingredients of this run-time stage. We briefly present our views on potential directions for the run-time stage that seem worthwhile to explore further below:
• Analysing cost as a quality concern when developing a continuous evaluation approach: one interesting observation is that just 25% of the run-time approaches address cost as a concern (Tables 6 and 11). Since the management of cost-benefit trade-offs is essential in dynamic environments [71], cost will strongly influence the value of architecture decisions. When evaluating software architectures, there will be conflicting QoS goals. Therefore, when designing a continuous evaluation approach, one could benefit from the literature on multi-objective optimisation under uncertainty, such as the use of Pareto optimality in [44, 70], genetic algorithms in [138], fuzzy logic in [144], etc. (the first sketch after this list illustrates a Pareto-based filtering of candidate decisions).
• The need to incorporate change detection tests into the evaluation: based on the results of our review (Table 11), most of the run-time approaches handle uncertainty either by checking goal violations or by providing some probabilistic estimates. However, in highly dynamic environments such as [4, 37, 71, 108, 109], this is not sufficient. Even if the currently deployed architecture decision is not violating any goal, this does not mean that it is performing well. For example, in some cases an architecture decision meets its quality constraints but provides poor performance. In this context, a change detection test is a necessary component of a continuous evaluation framework to determine significant drifts in the architecture decisions. This type of test can provide the architect with the flexibility of adjusting the sensitivity to changes. Therefore, determining the type of test and its efficiency could be a potential future direction. In [127], one type of change detection test was used; however, we see potential in exploring other change detection tests [52] to handle different forms of uncertainty (the second sketch after this list gives one example).
• The need for ageing parameters in data analysis: most of the existing run-time methods rely on historical or online data to perform the evaluation, but they do not consider the age of the data. Therefore, embedding ageing parameters that emphasise the relative importance of older versus more recent data could potentially improve the analysis [31]. Further investigation is required into the use of these parameters and how the architect could tune them to enhance the evaluation (the second sketch after this list uses an exponential forgetting factor as one such parameter).
• The need for new proactive approaches for continuous architecture evaluation: from the run-time perspective (Tables 6 and 11), it is clear that most of the current approaches (e.g. [31, 61, 143]) tend to be reactive when simplistic learning or partial or incomplete knowledge is used. Thus they may suggest incorrect decisions due to unexpected future environment changes and recommend unnecessary switches due to the lack of future knowledge about the candidate architecture decisions. This in turn may affect the architecture's stability and overall behaviour. To bridge this gap, further proactive approaches are necessary to improve the continuous evaluation process.
• Embedding machine learning and forecasting techniques in the continuous evaluation framework: our analysis shows that just 25% of the run-time approaches embed machine learning principles in the decision-making process. Using machine learning approaches in decision-making has shown great improvements (e.g. [20]). Therefore, another important element when developing a continuous architecture evaluation framework is leveraging machine learning techniques. There are methods (e.g. [58, 117]) that explicitly mention continuous architecting and assessment, and others that adopt it implicitly (e.g. [17]). These approaches can benefit from further investigation into how continuous evaluation could dynamically track and forecast architecture decisions and automatically manage cost-benefit trade-offs (the loop skeleton at the end of Section 6.2 includes a simple forecasting step as an illustration).
• Consider scalability when designing a continuous evaluation framework: the literature shows that there are some approaches (e.g. [46, 58, 69, 106]) that are proactive in terms of failure prediction and recommending alternatives. These approaches may, however, experience scalability problems. Moreover, they assume that the impact of architecture decisions on QoS is available at run-time, which is not always the case in uncertain environments such as IoT. To this end, novel solutions are required to determine how QoS monitoring challenges could be handled.
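As a minimal illustration of the multi-objective view discussed above, the first sketch below computes the Pareto-optimal set of hypothetical architecture candidates described by two objectives to be minimised (expected response time and monetary cost). The candidate names and all figures are invented for illustration only.

```python
# Illustrative sketch: Pareto-optimal filtering of candidate architecture decisions
# under two objectives to be minimised. All candidates and figures are hypothetical.

def dominates(a, b):
    """True if candidate a is at least as good as b in all objectives and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the candidates not dominated by any other candidate."""
    return {
        name: objs
        for name, objs in candidates.items()
        if not any(dominates(other, objs)
                   for other_name, other in candidates.items()
                   if other_name != name)
    }

if __name__ == "__main__":
    # (expected response time in ms, monthly cost in $) -- assumed values.
    candidates = {
        "single-cloud":        (220.0, 120.0),
        "multi-cloud-diverse": (150.0, 200.0),
        "edge-assisted":       (90.0, 260.0),
        "over-provisioned":    (160.0, 320.0),  # dominated by multi-cloud-diverse
    }
    for name, (rt, cost) in pareto_front(candidates).items():
        print(f"{name}: response time {rt} ms, cost ${cost}/month")
```

A cost-aware continuous evaluation could then restrict run-time switching decisions to this non-dominated set before applying any further preference or utility weighting.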
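The second sketch combines two of the ingredients above: observations of a QoS metric are summarised with an exponential forgetting factor (an ageing parameter), and a Page-Hinkley test, a standard drift detector from the data-stream literature, flags a sustained degradation. The forgetting factor, tolerance, threshold, and the simulated response times are illustrative assumptions that would need tuning for a real system.

```python
# Illustrative sketch: ageing-weighted monitoring of a QoS metric plus a
# Page-Hinkley change detection test. Parameter values are assumptions that
# an architect would tune (sensitivity to change vs. false alarms).

class AgeingMonitor:
    """Exponentially weighted average: recent observations count more than old ones."""
    def __init__(self, forgetting_factor: float = 0.9):
        self.alpha = forgetting_factor   # closer to 1 => older data retains more influence
        self.estimate = None

    def update(self, value: float) -> float:
        if self.estimate is None:
            self.estimate = value
        else:
            self.estimate = self.alpha * self.estimate + (1 - self.alpha) * value
        return self.estimate


class PageHinkley:
    """Flags a sustained upward drift (e.g. growing response time) in a stream."""
    def __init__(self, tolerance: float = 0.05, threshold: float = 2.0):
        self.tolerance = tolerance
        self.threshold = threshold
        self.mean = 0.0
        self.n = 0
        self.cumulative = 0.0
        self.min_cumulative = 0.0

    def update(self, value: float) -> bool:
        self.n += 1
        self.mean += (value - self.mean) / self.n
        self.cumulative += value - self.mean - self.tolerance
        self.min_cumulative = min(self.min_cumulative, self.cumulative)
        return (self.cumulative - self.min_cumulative) > self.threshold


if __name__ == "__main__":
    import random
    monitor, detector = AgeingMonitor(0.9), PageHinkley(tolerance=0.05, threshold=2.0)
    for step in range(200):
        # Hypothetical response times: the deployed decision degrades after step 100.
        observed = random.gauss(1.0, 0.1) if step < 100 else random.gauss(1.8, 0.1)
        smoothed = monitor.update(observed)
        if detector.update(smoothed):
            print(f"Drift detected at step {step}: smoothed response time {smoothed:.2f}s")
            break
```

The forgetting factor plays the role of the ageing parameter discussed above, while the tolerance and threshold control how aggressively drifts trigger re-evaluation, and hence how often the architecture might be switched.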
7 CONCLUSION
Continuous evaluation has been discussed under different labels, such as run-time, dynamic, and continuous, along with assessment and analysis. The common characteristic among these efforts is that they start at design-time (even if they do not mention that explicitly) and continue to evaluate architecture decisions during the lifetime of the system by observing environmental conditions. In this review we have attempted to unify these efforts. We performed a systematic literature review to examine existing architecture evaluation methods that deal with uncertainty either at design-time or at run-time. We also provided guidelines on the necessary elements to develop and conduct a continuous architecture evaluation approach. We both automatically and manually
searched well-known venues for software architecture and engineering, other related systematic reviews and mapping studies, and significant bibliographical data sources. In addition, we applied a snowballing process to collect our primary studies.
The results of our investigation are the following: (a) design-time architecture evaluation approaches have garnered more attention than run-time ones, though the latter are increasingly important to handle the dynamism and increasing complexity of software systems; (b) there is a lack of examples demonstrating how continuous evaluation approaches can be realised and conducted; (c) few methods focus on managing trade-offs between benefits and costs at run-time; (d) few methods adopt machine learning techniques in the evaluation; and (e) most of the run-time approaches tend to be reactive (and may recommend unnecessary switches and hence increase deployment costs).
In summary, based on our main findings listed in Tables 6, 7, 8, 10, and 11, we suggest the following opportunities for future work in this area: (i) employ economics-based approaches (i.e. forecasting the long-term value of complex architecture decisions); (ii) adopt economics-based principles in the design-time evaluation approach (the initial stage of a continuous evaluation approach) because they embed flexibility under uncertainty; (iii) perform additional research analysing the use of machine learning techniques to improve architecture evaluation at run-time (the ongoing stage of a continuous evaluation approach); (iv) investigate the development of proactivity in the architecture evaluation process; (v) explore how tuning the input parameters of the continuous evaluation (e.g. sensitivity to changes, monitoring intervals, the relative importance of present versus past data) affects the evaluation, and what the most suitable parameters are to improve decision-making; and (vi) analyse the use of continuous architecture evaluation in dynamic environments, such as IoT and cloud systems.
A THE LIST OF INCLUDED STUDIES WITH RESPECT TO CLASSIFICATION FRAMEWORK
In this appendix, we tabulate the list of included studies with respect to the classification framework in Tables 10-12.
B THE INCLUDED STUDIES AND THEIR MATURITY LEVEL
In this appendix, we first tabulate the studies with respect to domain maturity level in Table 13 and then provide the list of studies included in the systematic literature review in Tables 14, 15, and 16.
REFERENCES
[1] 2017. (July 2017). https://techbeacon.com/5-challenges-performance-engineering-iot-apps?amp
1899
[2] Amritanshu Agrawal, Tim Menzies, Leandro L Minku, Markus Wagner, and Zhe Yu. 2020. Better software analytics
1900 via" DUO": Data mining algorithms using/used-by optimizers. Empirical Software Engineering 25, 3 (2020), 2099–2136.
1901 [3] Sarah Al-Azzani and Rami Bahsoon. 2010. Using implied scenarios in security testing. In Proceedings of the 2010 ICSE
1902 Workshop on Software Engineering for Secure Systems. ACM, 15–21.
1903 [4] Yahya Al-Dhuraibi, Fawaz Paraiso, Nabil Djarallah, and Philippe Merle. 2018. Elasticity in cloud computing: state of
the art and research challenges. IEEE Transactions on Services Computing 11, 2 (2018), 430–447.
1904
[5] Tariq Al-Naeem, Ian Gorton, Muhammed Ali Babar, Fethi Rabhi, and Boualem Benatallah. 2005. A quality-driven
1905 systematic approach for architecting distributed software applications. In Proceedings of the 27th international conference
1906 on Software engineering. ACM, 244–253.
1907 [6] Aldeida Aleti, Stefan Bjornander, Lars Grunske, and Indika Meedeniya. 2009. ArcheOpterix: An extendable tool for
1908 architecture optimization of AADL models. In Model-Based Methodologies for Pervasive and Embedded Software, 2009.
MOMPES’09. ICSE Workshop on. IEEE, 61–71.
1909
[7] Aldeida Aleti, Barbora Buhnova, Lars Grunske, Anne Koziolek, and Indika Meedeniya. 2013. Software architecture
1910 optimization methods: A systematic literature review. IEEE Transactions on Software Engineering 39, 5 (2013), 658–683.
1911


1912 Table 10. Representative Contributions for Design-time Architecture Evaluation.


Study | Approaches to Evaluation | Addressing QA | Supported QA | Management of stakeholder input | Management of trade-offs | Treatment of uncertainty | Source of uncertainty
1916 [83] Scenario-based Multiple Specific (Modifiability, Portability, Human-Reliant No Support Implicit Epistemic
Extensibility)
1917
[43] Utility-based Multiple General Human-Reliant Manual Implicit Epistemic
1918 Scenario-based
[82] Utility-based Multiple General + Specific (Cost) Human-Reliant Manual Implicit Epistemic
1919
Scenario-based
1920 [18] Utility-based Multiple Specific (Performance, Human-Reliant No Support Implicit Epistemic
1921 Scenario-based Fault-tolerance, Maintainability,
Parametric-based Reusability)
1922 [19] Utility-based Single Specific (Modifiability) Human-Reliant No Support Implicit Epistemic
1923 Scenario-based
[79] Utility-based Multiple General + Specific (Cost) Human-Reliant No Support Implicit Epistemic
1924
Scenario-based + Aleatory
1925 [84] Utility-based Multiple General + Specific (Cost) Human-Reliant Manual Implicit Epistemic
1926 Scenario-based
Parametric-based
1927 [139] Utility-based Single Specific (Performance+Cost) Human-Reliant Manual Implicit Epistemic
1928 Scenario-based
[145] Utility-based Multiple General + Specific (Cost) Human-Reliant Manual Explicit Epistemic
1929
Scenario-based
1930 Parametric-based
[5] Utility-based Multiple General + Specific (Cost) Semi-Autonomous Automatic Implicit Epistemic
1931
Parametric-based
1932 Search-based
1933 [70] Utility-based Multiple General + Specific (Cost) Semi-Autonomous Automatic Implicit Epistemic
[86] Utility-based Multiple General + Specific (Cost) Human-Reliant Manual Explicit Epistemic
1934 Parametric-based
1935 [96] Utility-based Multiple Specific (Dependability, Reliability Human-Reliant No Support Implicit Aleatory
and Maintainability)
1936
[94] Utility-based Multiple General + Specific (Cost) Human-Reliant Manual Explicit Epistemic
1937 [42] Utility-based Multiple General + Specific (Cost) Semi-Autonomous Automatic Explicit Epistemic
1938 [64] Utility-based Multiple General + Human-Reliant Manual Explicit Epistemic
+ Aleatory
1939 [113] Utility-based Multiple General + Specific (Cost) Semi-Autonomous Automatic Explicit Epistemic
1940 [57] Utility-based Multiple General Human-Reliant No Support Implicit Epistemic
[63] Utility-based Multiple General + Specific (Cost) Semi-Autonomous Automatic Explicit Epistemic
1941
Scenario-based
1942 Search-based
1943 [95] Utility-based Multiple General + Specific (Cost) Semi-Autonomous Automatic Explicit Epistemic
Scenario-based
1944 Search-based
1945 [12] Economics-based Multiple Specific (Stability) + Specific (Cost) Semi-Autonomous Manual Explicit Epistemic
[13] Economics-based Multiple Specific (Scalability) + Specific (Cost) Semi-Autonomous Manual Explicit Epistemic
1946
[114] Economics-based Multiple General + Specific (Cost) Semi-Autonomous Manual Explicit Epistemic
1947 Scenario-based
1948 [112] Economics-based Multiple General + Specific (Cost) Semi-Autonomous Manual Explicit Epistemic
Scenario-based
1949 [104] Utility-based Single Specific (Reliability) Semi-Autonomous No Support Explicit Epistemic
1950 Parametric-based + Aleatory
[103] Utility-based Multiple Specific (Reliability) Semi-Autonomous Automatic Explicit Aleatory
1951
Parametric-based
1952 Search-based
[102] Utility-based Multiple Specific (Reliability + Performance) Semi-Autonomous Automatic Explicit Epistemic
1953
Parametric-based
1954 Search-based
1955 [28] Utility-based Multiple Generic Semi-Autonomous Automatic Explicit Epistemic
Search-based
1956 [105] Utility-based Multiple Generic Semi-Autonomous No Support Explicit Epistemic
1957 Parametric-based + Aleatory

1958
1959
1960

1961 Table 11. Representative Contributions for Run-time Architecture Evaluation.


Study | Approaches to Evaluation | Addressing QA | Supported QA | Management of stakeholder input | Management of trade-offs | Treatment of uncertainty | Source of uncertainty | Monitoring & Treatment of QAs
[40] Utility-based Multiple General + Specific (Cost) Autonomous Automatic Explicit Epistemic Reactive
1965
Parametric-based + Aleatory
1966 [133] Utility-based Single Specific (Performance) Autonomous No Support Explicit Aleatory Reactive
Learning-based
1967 [32] Utility-based Multiple Specific (Performance, Autonomous Automatic Implicit Aleatory Reactive
Energy Consumption)
1968 [87] Utility-based Multiple General Autonomous Automatic Explicit Aleatory Reactive
1969 Learning-based
[75] Utility-based Multiple Specific (Reliability and Performance) Semi-Autonomous No Support Implicit Aleatory Reactive
1970 [141] Utility-based Multiple General + Specific (Cost) Semi-Autonomous No Support Implicit Aleatory Reactive
Scenario-based
1971 [31] Utility-based Multiple Specific (Reliability and Performance) Autonomous No Support Explicit Aleatory Reactive
[30] Utility-based Multiple Specific (Reliability and Performance) Semi-Autonomous Automatic Explicit Epistemic Reactive
1972 + Aleatory
1973 [62] Utility-based Multiple General Semi-Autonomous Automatic Explicit Epistemic Reactive
Search-based + Aleatory
1974 [61] Utility-based Multiple General + Specific (Cost) Autonomous Automatic Explicit Epistemic Reactive
Learning-based + Aleatory
1975 [67] Utility-based Single Specific (Energy Consumption) Semi-Autonomous No Support Explicit Aleatory Reactive
Search-based
1976 [58] Utility-based Single Specific (Reliability) Semi-Autonomous No Support Implicit Epistemic Proactive
1977 Parametric-based + Aleatory
[69] Utility-based Multiple General Semi-Autonomous No Support Explicit Aleatory Proactive
1978 [46] Utility-based Multiple Specific (Reliability and Efficiency) Autonomous Automatic Implicit Aleatory Proactive
[106] Utility-based Multiple Specific (Performance) Autonomous No Support Explicit Aleatory Proactive
1979 Learning-based
1980
1981 Table 12. Representative Contributions for Continuous Architecture Evaluation.
Study | Approaches to Evaluation | Addressing QA | Supported QA | Management of stakeholder input | Management of trade-offs | Treatment of uncertainty | Source of uncertainty | Monitoring & Treatment of QAs
1984
[117] Utility-based Multiple General + Specific (Cost) Human-Reliant Manual Implicit Epistemic No treatment
1985 [69] Utility-based Multiple General Semi-Autonomous No Support Explicit Aleatory Proactive
Parametric-based
1986
Search-based
1987 [136] Utility-based Multiple General + Specific (Cost) Autonomous Automatic Explicit Epistemic Reactive
Economics-based + Aleatory
1988 [127] Utility-based Multiple General + Specific (Cost) Autonomous Automatic Explicit Epistemic Reactive
1989 Search-based + Aleatory
Scenario-based
1990 Learning-based
Economics-based
1991
1992 Table 13. Studies with respect to Domain maturation level.
1993
Domain Maturation Level | Studies | # of Studies
1995
Basic Research [28, 83, 113, 117] 4
1996
Concept Formulation [5, 12, 40, 70, 79, 84, 86, 104, 105, 114, 133, 139, 145] 35
1997
[13, 30–32, 46, 58, 61, 62, 75, 87, 94, 96, 141]
1998
[48, 67, 69, 95, 102, 103, 106, 112, 136]
1999
Development and Extension [18, 57] 2
2000
Internal Enhancement [63, 64] 2
2001
2002
External Enhancement [19, 42] 2
2003
Popularization [43, 82] 2
2004 Total 48
2005
2006
2007
2008
2009


2010 Table 14. Studies included in the review.


2011
2012 Study Ref Author(s) Year Title
2013 S1 [83] R. Kazman, L. Bass, 1994 SAAM: A method for analyzing
2014 G. Abowd, & M. Webb the properties of software architectures
2015 S2 [18] P. Bengtsson & J. Bosch 1998 Scenario-based software
2016 architecture reengineering
2017 S3 [82] R. Kazman, J. Asundi, 2001 Quantifying the costs and benefits of
2018 & P. Clements architectural decisions
2019 S4 [139] L. Williams & C. Smith 2002 PASASM: A Method for the Performance
2020 Assessment of Software Architectures
2021 S5 [43] R. Kazman, M. Klein, 2003 Evaluating software architectures
2022 P. Clements & others
2023 S6 [19] P. Bengtsson, N. Lassing, 2004 Architecture-level modifiability analysis
2024 J. Bosch, & H. Vliet (ALMA)
2025 S7 [12] R. Bahsoon & W. Emmerich 2004 Evaluating architectural stability
2026 with real options theory
2027 S8 [40] S. Cheng 2004 Rainbow: cost-effective software
2028 architecture-based self-adaptation
2029 S9 [79] M. Ionita, P. America, 2004 A Scenario-Driven Approach for Value,
2030 D. Hammer, H. Obbink Risk, and Cost Analysis in
2031 & J. Trienekens System Architecting for Innovation
2032 S10 [145] L. Zhu, A. Aurum, 2005 Tradeoff and sensitivity analysis in software
2033 I. Gorton, & R. Jeffery architecture evaluation using analytic
2034 hierarchy process
2035 S11 [5] T. Al-Naeem, I. Gorton, 2005 A quality-driven systematic
2036 M. Babar, F. Rabhi approach for architecting
2037 & B. Benatallah distributed software applications
2038 S12 [84] R. Kazman, L. Bass 2006 The essential components of software
2039 & M. Klein architecture design and analysis
2040 S13 [70] L. Grunske 2006 Identifying good architectural design alternatives
2041 with multi-objective optimization strategies
2042 S14 [133] G. Tesauro 2007 Reinforcement learning in autonomic computing:
2043 A manifesto and case studies
2044 S15 [114] I. Ozkaya, R. Kazman 2007 Quality-attribute based economic
2045 & M. Klein valuation of architectural patterns
2046 S16 [86] C. Kim, D. Lee, 2007 A Lightweight Value-based Software
2047 I. Ko & J. Baik Architecture Evaluation
2048 S17 [13] R. Bahsoon & W. Emmerich 2008 An economics-driven approach for valuing
2049 scalability in distributed architectures
2050 S18 [96] Y. Liu, M. Babar 2008 Middleware Architecture Evaluation for
2051 & I. Gorton Dependable Self-managing Systems
2052
2053
2054
[8] Martha Amram, Nalin Kulatilaka, et al. 1998. Real Options:: Managing Strategic Investment in an Uncertain World.
2055 OUP Catalogue (1998).
2056 [9] Jayatirtha Asundi, Rick Kazman, and Mark Klein. 2001. Using economic considerations to choose among architecture
2057 design alternatives. Technical Report. DTIC Document.
2058

2059 Table 15. Studies included in the review (Continued).


2060
2061 Study# Ref Author(s) Year Title
2062
S19 [75] W. Heaven, D. Sykes, 2009 A case study in goal-driven
2063
J. Magee & J. Kramer architectural adaptation
2064
S20 [87] D. Kim & S. Park 2009 Reinforcement learning-based dynamic
2065
adaptation planning method for architecture-
2066
based self-managed software
2067
S21 [141] J. Yang, G. Huang, W. Zhu, 2009 Quality attribute tradeoff through
2068
X. Cui & H. Mei adaptive architectures at runtime
2069
S22 [94] J. Lee, S. Kang 2009 Software architecture evaluation methods
2070
& C. Kim based on cost benefit analysis
2071
and quantitative decision making
2072
S23 [58] I. Epifani, C. Ghezzi, 2009 Model evolution by run-time
2073
R. Mirandola parameter adaptation
2074
& G. Tamburrelli
2075
S24 [32] R. Calinescu 2009 Using quantitative analysis to implement
2076
& M. Kwiatkowska autonomic IT systems
2077
S25 [42] He. Christensen, K. Hansen 2011 Lightweight and continuous architectural
2078
& B. LindstrÃÿm software quality assurance using
2079
the asqa technique
2080
S26 [117] R. Pooley & A. Abdullatif 2010 Cpasa: continuous performance assessment of
2081
software architecture
2082
S27 [64] F. Faniyi, R. Bahsoon, 2011 Evaluating security properties of architectures
2083
A. Evans & R. Kazman in unpredictable environments:
2084
A case for cloud
2085
S28 [62] N. Esfahani, E. Kouroshfar 2011 Taming uncertainty in self-adaptive software
2086
& S. Malek
2087
S29 [31] R. Calinescu, K. Johnson 2011 Using observation ageing to
2088
& Y. Rafiq improve Markovian model learning
2089
in QoS engineering
2090
S30 [30] R. Calinescu, L. Grunske, 2011 Dynamic QoS management and optimization
2091
M. Kwiatkowska, in service-based systems
2092
R. Mirandola
2093
& G. Tamburrelli
2094
S31 [104] I. Meedeniya, I. Moser, A. Aleti, 2011 Architecture-based reliability
2095
& L. Grunske evaluation under uncertainty
2096
S32 [103] I. Meedeniya, I. Moser, A. Aleti, 2011 Architecture-driven reliability optimization
2097
& L. Grunske with uncertain model parameters
2098
2099
2100
[10] Muhammad Ali Babar and Ian Gorton. 2004. Comparison of scenario-based software architecture evaluation methods.
2101
In Software Engineering Conference, 2004. 11th Asia-Pacific. IEEE, 600–607.
2102 [11] Muhammad Ali Babar, Liming Zhu, and Ross Jeffery. 2004. A framework for classifying and comparing software
2103 architecture evaluation methods. In Software Engineering Conference, 2004. Proceedings. 2004 Australian. IEEE, 309–318.
2104 [12] Rami Bahsoon and Wolfgang Emmerich. 2004. Evaluating architectural stability with real options theory. In Software
2105
Maintenance, 2004. Proceedings. 20th IEEE International Conference on. IEEE, 443–447.
[13] Rami Bahsoon and Wolfgang Emmerich. 2008. An economics-driven approach for valuing scalability in distributed
2106
architectures. In Software Architecture, 2008. WICSA 2008. Seventh Working IEEE/IFIP Conference on. IEEE, 9–18.
2107


2108 Table 16. Studies included in the review (Continued).


2109
2110 Study# Ref Author(s) Year Title
2111 S33 [102] I. Meedeniya, A. Aleti, I. Avazpour 2012 Robust ArcheOpterix: Architecture
2112 & A. Amin Optimization of Embedded Systems
2113 Embedded Systems under uncertainty
2114 S34 [113] M. Osterlind, P. Johnson, 2013 Enterprise architecture evaluation
2115 K. Karnati, R. Lagerstrom using utility theory
2116 & M. Valja
2117 S35 [63] N. Esfahani, S. Malek 2013 GuideArch: guiding the exploration of
2118 & K. Razavi architectural solution space
2119 under uncertainty
2120 S36 [46] D. Cooray, E. Kouroshfar, 2013 Proactive self-adaptation for improving
2121 S. Malek & R. Roshandel embedded, the reliability of
2122 mission-critical, and mobile software
2123 S37 [61] N. Esfahani, A. Elkhodary 2013 A learning-based framework for
2124 & S. Malek engineering feature-oriented self-adaptive
2125 software systems
2126 S38 [69] C. Ghezzi & A. Sharifloo 2013 Dealing with non-functional requirements
2127 for adaptive systems via dynamic
2128 software product-lines
2129 S39 [67] E. Fredericks, B. DeVries 2014 Towards run-time adaptation of test
2130 & B. Cheng cases for self-adaptive systems in
2131 the face of uncertainty
2132 S40 [95] E. Letier, D. Stefan 2014 Uncertainty, risk, and information value
2133 & E. Barr in software requirements and architecture
2134 S41 [105] I. Meedeniya, A. Aleti, 2014 Evaluating probabilistic models with
2135 & L. Grunske uncertain model parameters
2136 S42 [57] V. Eloranta, U. Heesch, 2015 Lightweight Evaluation of Software
2137 P. Avgeriou, N. Harrison Architecture Decisions
2138 & K. Koskimies
2139 S43 [112] B. Ojameruaye, R. Bahsoon 2016 Sustainability debt: a portfolio-based
2140 & L. Duboc approach for evaluating sustainability
2141 requirements in architectures
2142 S44 [106] G. Moreno, J. Camara, 2016 Efficient decision-making under uncertainty
2143 D. Garlan & B. Schmerl for proactive self-adaptation
2144 S45 [136] V. Donckt, M. Jeroen, 2018 Cost-Benefit Analysis at Runtime for
2145 D. Weyns, M. Iftikhar Self-Adaptive Systems Applied to an
2146 & R. Singh Internet of Things Application
2147 S46 [48] M. De Sanctis, R. Spalazzese, 2019 Qos-based formation of software
2148 & C. Trubiani architectures in the internet of things
2149 S47 [28] A. Busch, D. Fuchß, 2019 Peropteryx: Automated improvement of
2150 & A. Koziolek software architectures
2151 S48 [127] D. Sobhy, L. Minku, R. Bahsoon 2020 Run-time evaluation of architectures:
2152 T. Chen & R. Kazman A case study of diversification in IoT
2153
2154
2155
2156
, Vol. 1, No. 1, Article . Publication date: April 2021.
Evaluation of Software Architectures under Uncertainty:
A Systematic Literature Review 45

2157
[14] Carliss Young Baldwin and Kim B Clark. 2000. Design rules: The power of modularity. Vol. 1. MIT press.
2158 [15] Linden J Ball, Balder Onarheim, and Bo T Christensen. 2010. Design requirements, epistemic uncertainty and solution
2159 development strategies in software design. Design Studies 31, 6 (2010), 567–589.
2160 [16] Len Bass, Paul Clements, and Rick Kazman. 2012. Software architecture in practice. Addison-Wesley Professional.
2161 [17] Len Bass, Ingo Weber, and Liming Zhu. 2015. DevOps: A Software Architect’s Perspective. Addison-Wesley Professional.
[18] PerOlof Bengtsson and Jan Bosch. 1998. Scenario-based software architecture reengineering. In Software Reuse, 1998.
2162
Proceedings. Fifth International Conference on. IEEE, 308–317.
2163 [19] PerOlof Bengtsson, Nico Lassing, Jan Bosch, and Hans van Vliet. 2004. Architecture-level modifiability analysis
2164 (ALMA). Journal of Systems and Software 69, 1 (2004), 129–147.
2165 [20] Amel Bennaceur, Valérie Issarny, Daniel Sykes, Falk Howar, Malte Isberner, Bernhard Steffen, Richard Johansson, and
2166 Alessandro Moschitti. 2012. Machine learning for emergent middleware. In International Workshop on Eternal Systems.
Springer, 16–29.
2167
[21] Patrik Berander, Lars-Ola Damm, Jeanette Eriksson, Tony Gorschek, Kennet Henningsson, Per Jönsson, Simon Kågström,
2168 Drazen Milicic, Frans Mårtensson, Kari Rönkkö, et al. 2005. Software quality attributes and trade-offs. Blekinge Institute
2169 of Technology (2005).
2170 [22] Gordon Blair, Nelly Bencomo, and Robert B France. 2009. Models@ run. time. Computer 42, 10 (2009).
2171 [23] Barry W Boehm et al. 1981. Software engineering economics. Vol. 197. Prentice-hall Englewood Cliffs (NJ).
[24] Barry W Boehm and Kevin J Sullivan. 2000. Software economics: a roadmap. In Proceedings of the conference on The
2172
future of Software engineering. ACM, 319–343.
2173 [25] Jan Bosch. 2004. Software architecture: The next step. In European Workshop on Software Architecture. Springer,
2174 194–199.
2175 [26] Jeremy S Bradbury, James R Cordy, Juergen Dingel, and Michel Wermelinger. 2004. A survey of self-management
2176 in dynamic software architecture specifications. In Proceedings of the 1st ACM SIGSOFT workshop on Self-managed
systems. ACM, 28–33.
2177
[27] Hongyu Pei Breivold, Ivica Crnkovic, and Magnus Larsson. 2012. A systematic review of software architecture
2178 evolution research. Information and Software Technology 54, 1 (2012), 16–40.
2179 [28] Axel Busch, Dominik Fuchß, and Anne Koziolek. 2019. Peropteryx: Automated improvement of software architectures.
2180 In 2019 IEEE International Conference on Software Architecture Companion (ICSA-C). IEEE, 162–165.
2181 [29] Radu Calinescu. 2013. Emerging techniques for the engineering of self-adaptive high-integrity software. In Assurances
for Self-Adaptive Systems. Springer, 297–310.
2182
[30] Radu Calinescu, Lars Grunske, Marta Kwiatkowska, Raffaela Mirandola, and Giordano Tamburrelli. 2011. Dynamic
2183 QoS management and optimization in service-based systems. IEEE Transactions on Software Engineering 37, 3 (2011),
2184 387–409.
2185 [31] Radu Calinescu, Kenneth Johnson, and Yasmin Rafiq. 2011. Using observation ageing to improve Markovian model
2186 learning in QoS engineering. In Proceedings of the 2nd ACM/SPEC International Conference on Performance engineering.
ACM, 505–510.
2187
[32] Radu Calinescu and Marta Kwiatkowska. 2009. Using quantitative analysis to implement autonomic IT systems. In
2188 Software Engineering, 2009. ICSE 2009. IEEE 31st International Conference on. IEEE, 100–110.
2189 [33] Javier Cámara, Gabriel Moreno, and David Garlan. 2015. Reasoning about human participation in self-adaptive systems.
2190 In 2015 IEEE/ACM 10th International Symposium on Software Engineering for Adaptive and Self-Managing Systems. IEEE,
2191 146–156.
[34] Stephanie Riegg Cellini and James Edwin Kee. 2015. Cost-effectiveness and cost-benefit analysis. Handbook of practical
2192
program evaluation 4 (2015).
2193 [35] Humberto Cervantes and Rick Kazman. 2016. Designing software architectures: a practical approach. Addison-Wesley
2194 Professional.
2195 [36] Franck Chauvel, Nicolas Ferry, Brice Morin, Alessandro Rossini, and Arnor Solberg. 2013. Models@ Runtime to
2196 Support the Iterative and Continuous Design of Autonomic Reasoners.. In MoDELS@ Run. time. 26–38.
[37] Lianping Chen. 2018. Microservices: architecting for continuous delivery and DevOps. In 2018 IEEE International
2197
Conference on Software Architecture (ICSA). IEEE, 39–397.
2198 [38] Tao Chen, Ke Li, Rami Bahsoon, and Xin Yao. 2018. FEMOSAA: Feature-Guided and Knee-Driven Multi-Objective
2199 Optimization for Self-Adaptive Software. ACM Trans. Softw. Eng. Methodol. 27, 2, Article 5 (June 2018), 50 pages.
2200 https://doi.org/10.1145/3204459
2201 [39] Betty HC Cheng, Kerstin I Eder, Martin Gogolla, Lars Grunske, Marin Litoiu, Hausi A Müller, Patrizio Pelliccione,
Anna Perini, Nauman A Qureshi, Bernhard Rumpe, et al. 2014. Using models at runtime to address assurance for
2202
self-adaptive systems. In Models@ run. time. Springer, 101–136.
2203 [40] Shang-Wen Cheng. 2008. Rainbow: cost-effective software architecture-based self-adaptation. ProQuest.
2204
2205


[41] Shang-Wen Cheng. 2008. Rainbow: Cost-effective Software Architecture-based Self-adaptation. Ph.D. Dissertation. Pittsburgh, PA, USA. Advisor(s) Garlan, David. AAI3305807.
[42] Henrik Bærbak Christensen, Klaus Marius Hansen, and Bo Lindstrøm. 2010. Lightweight and continuous architectural software quality assurance using the asqa technique. In European Conference on Software Architecture. Springer, 118–132.
[43] Paul Clements, Rick Kazman, Mark Klein, et al. 2003. Evaluating software architectures. Tsinghua University Press Beijing.
[44] David W Coit, Tongdan Jin, and Naruemon Wattanapongsakorn. 2004. System optimization with component reliability estimation uncertainty: a multi-criteria approach. IEEE Transactions on Reliability 53, 3 (2004), 369–380.
[45] Software Engineering Standards Committee et al. 1998. IEEE Standard for a software quality metrics methodology, Std. 1061-1998. Technical Report.
[46] Deshan Cooray, Ehsan Kouroshfar, Sam Malek, and Roshanak Roshandel. 2013. Proactive self-adaptation for improving the reliability of mission-critical, embedded, and mobile software. IEEE Transactions on Software Engineering 39, 12 (2013), 1714–1735.
[47] Rogério De Lemos, Holger Giese, Hausi A Müller, Mary Shaw, Jesper Andersson, Marin Litoiu, Bradley Schmerl, Gabriel Tamura, Norha M Villegas, Thomas Vogel, et al. 2013. Software engineering for self-adaptive systems: A second research roadmap. In Software Engineering for Self-Adaptive Systems II. Springer, 1–32.
[48] Martina De Sanctis, Romina Spalazzese, and Catia Trubiani. 2019. QoS-based formation of software architectures in the Internet of Things. In European Conference on Software Architecture. Springer, 178–194.
[49] Mark Denne and Jane Cleland-Huang. 2004. The incremental funding method: Data-driven software development. IEEE Software 21, 3 (2004), 39–47.
[50] Armen Der Kiureghian and Ove Ditlevsen. 2009. Aleatory or epistemic? Does it matter? Structural Safety 31, 2 (2009), 105–112.
[51] Yves Deswarte, Karama Kanoun, and Jean-Claude Laprie. 1998. Diversity against accidental and deliberate faults. In CSDA. IEEE, 171.
[52] Gregory Ditzler, Manuel Roveri, Cesare Alippi, and Robi Polikar. 2015. Learning in nonstationary environments: A survey. IEEE Computational Intelligence Magazine 10, 4 (2015), 12–25.
[53] Liliana Dobrica and Eila Niemela. 2002. A survey on software architecture analysis methods. IEEE Transactions on Software Engineering 28, 7 (2002), 638–653.
[54] Donia El Kateb, François Fouquet, Grégory Nain, Jorge Augusto Meira, Michel Ackerman, and Yves Le Traon. 2014. Generic cloud platform multi-objective optimization leveraging models@run.time. In Proceedings of the 29th Annual ACM Symposium on Applied Computing. ACM, 343–350.
[55] Ahmed Elkhodary, Naeem Esfahani, and Sam Malek. 2010. FUSION: a framework for engineering self-tuning self-adaptive software systems. In Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering. ACM, 7–16.
[56] Salah E Elmaghraby, Willy S Herroelen, et al. 1990. The scheduling of activities to maximize the net present value of projects. European Journal of Operational Research 49, 1 (1990), 35–49.
[57] Veli-Pekka Eloranta, Uwe van Heesch, Paris Avgeriou, Neil Harrison, and Kai Koskimies. 2015. Lightweight Evaluation of Software Architecture Decisions. In Relating System Quality and Software Architecture. Elsevier, 157–179.
[58] Ilenia Epifani, Carlo Ghezzi, Raffaela Mirandola, and Giordano Tamburrelli. 2009. Model evolution by run-time parameter adaptation. In Proceedings of the 31st International Conference on Software Engineering. IEEE Computer Society, 111–121.
[59] Hakan Erdogmus. 2002. A Real Options Perspective of Software Reuse. In International Workshop on Reuse Economics "Redirecting Reuse Economics".
[60] Hakan Erdogmus and Jennifer Vandergraaf. 1999. Quantitative approaches for assessing the value of COTS-centric development. (1999).
[61] Naeem Esfahani, Ahmed Elkhodary, and Sam Malek. 2013. A learning-based framework for engineering feature-oriented self-adaptive software systems. IEEE Transactions on Software Engineering 39, 11 (2013), 1467–1493.
[62] Naeem Esfahani, Ehsan Kouroshfar, and Sam Malek. 2011. Taming uncertainty in self-adaptive software. In Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering. ACM, 234–244.
[63] Naeem Esfahani, Sam Malek, and Kaveh Razavi. 2013. GuideArch: guiding the exploration of architectural solution space under uncertainty. In Software Engineering (ICSE), 2013 35th International Conference on. IEEE, 43–52.
[64] Funmilade Faniyi, Rami Bahsoon, Andy Evans, and Rick Kazman. 2011. Evaluating security properties of architectures in unpredictable environments: A case for cloud. In 2011 Ninth Working IEEE/IFIP Conference on Software Architecture. IEEE, 127–136.
[65] John Favaro. 1996. A comparison of approaches to reuse investment analysis. In Software Reuse, 1996. Proceedings Fourth International Conference on. IEEE, 136–145.
[66] Craig R Fox and Gülden Ülkümen. 2011. Distinguishing two dimensions of uncertainty. Perspectives on thinking, judging, and decision making (2011), 21–35.
[67] Erik M Fredericks, Byron DeVries, and Betty HC Cheng. 2014. Towards run-time adaptation of test cases for self-adaptive systems in the face of uncertainty. In Proceedings of the 9th International Symposium on Software Engineering for Adaptive and Self-Managing Systems. ACM, 17–26.
[68] David Garlan and Bradley Schmerl. 2004. Using architectural models at runtime: Research challenges. In European Workshop on Software Architecture. Springer, 200–205.
[69] Carlo Ghezzi and Amir Molzam Sharifloo. 2013. Dealing with non-functional requirements for adaptive systems via dynamic software product-lines. In Software Engineering for Self-Adaptive Systems II. Springer, 191–213.
[70] Lars Grunske. 2006. Identifying good architectural design alternatives with multi-objective optimization strategies. In Proceedings of the 28th international conference on Software engineering. ACM, 849–852.
[71] Jayavardhana Gubbi, Rajkumar Buyya, Slaven Marusic, and Marimuthu Palaniswami. 2013. Internet of Things (IoT): A vision, architectural elements, and future directions. Future Generation Computer Systems 29, 7 (2013), 1645–1660.
[72] Marcus Handte, Gregor Schiele, Verena Matjuntke, Christian Becker, and Pedro José Marrón. 2012. 3PC: System support for adaptive peer-to-peer pervasive computing. ACM Transactions on Autonomous and Adaptive Systems (TAAS) 7, 1 (2012), 10.
[73] Mark Harman and Bryan F Jones. 2001. Search-based software engineering. Information and Software Technology 43, 14 (2001), 833–839.
[74] Mark Harman, S Afshin Mansouri, and Yuanyuan Zhang. 2012. Search-based software engineering: Trends, techniques and applications. ACM Computing Surveys (CSUR) 45, 1 (2012), 1–61.
[75] William Heaven, Daniel Sykes, Jeff Magee, and Jeff Kramer. 2009. A case study in goal-driven architectural adaptation. In Software Engineering for Self-Adaptive Systems. Springer, 109–127.
[76] Robert Heinrich. 2016. Architectural run-time models for performance and privacy analysis in dynamic cloud applications. ACM SIGMETRICS Performance Evaluation Review 43, 4 (2016), 13–22.
[77] Robert Heinrich, Christian Zirkelbach, and Reiner Jung. 2017. Architectural Runtime Modeling and Visualization for Quality-Aware DevOps in Cloud Applications. In 2017 IEEE International Conference on Software Architecture Workshops (ICSAW). IEEE, 199–201.
[78] Nikolaus Huber, Fabian Brosig, Simon Spinner, Samuel Kounev, and Manuel Bähr. 2016. Model-based self-aware performance and resource management using the descartes modeling language. IEEE Transactions on Software Engineering 43, 5 (2016), 432–452.
[79] Mugurel T Ionita, Pierre America, Dieter K Hammer, Henk Obbink, and Jos JM Trienekens. 2004. A scenario-driven approach for value, risk, and cost analysis in system architecting for innovation. In Software Architecture, 2004. WICSA 2004. Proceedings. Fourth Working IEEE/IFIP Conference on. IEEE, 277–280.
[80] Anton Jansen and Jan Bosch. 2005. Software architecture as a set of architectural design decisions. In Software Architecture, 2005. WICSA 2005. 5th Working IEEE/IFIP Conference on. IEEE, 109–120.
[81] Lawrence G Jones and Anthony J Lattanze. 2001. Using the architecture tradeoff analysis method to evaluate a wargame simulation system: A case study. Technical Report. DTIC Document.
[82] Rick Kazman, Jai Asundi, and Mark Klein. 2001. Quantifying the costs and benefits of architectural decisions. In Proceedings of the 23rd international conference on Software engineering. IEEE Computer Society, 297–306.
[83] Rick Kazman, Len Bass, Gregory Abowd, and Mike Webb. 1994. SAAM: A method for analyzing the properties of software architectures. In Software Engineering, 1994. Proceedings. ICSE-16., 16th International Conference on. IEEE, 81–90.
[84] Rick Kazman, Len Bass, and Mark Klein. 2006. The essential components of software architecture design and analysis. Journal of Systems and Software 79, 8 (2006), 1207–1216.
[85] Rick Kazman, Mark Klein, and Paul Clements. 2000. ATAM: Method for architecture evaluation. Technical Report. DTIC Document.
[86] Chang-Ki Kim, Dan-Hyung Lee, In-Young Ko, and Jongmoon Baik. 2007. A lightweight value-based software architecture evaluation. In Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, 2007. SNPD 2007. Eighth ACIS International Conference on, Vol. 2. IEEE, 646–649.
[87] Dongsun Kim and Sooyong Park. 2009. Reinforcement learning-based dynamic adaptation planning method for architecture-based self-managed software. In 2009 ICSE Workshop on Software Engineering for Adaptive and Self-Managing Systems. IEEE, 76–85.
[88] Barbara Kitchenham, O Pearl Brereton, David Budgen, Mark Turner, John Bailey, and Stephen Linkman. 2009. Systematic literature reviews in software engineering–a systematic literature review. Information and Software Technology 51, 1 (2009), 7–15.
[89] Barbara Kitchenham, Rialette Pretorius, David Budgen, O Pearl Brereton, Mark Turner, Mahmood Niazi, and Stephen Linkman. 2010. Systematic literature reviews in software engineering–a tertiary study. Information and Software Technology 52, 8 (2010), 792–805.
[90] Heiko Koziolek. 2011. Sustainability evaluation of software architectures: a systematic review. In Proceedings of the joint ACM SIGSOFT conference–QoSA and ACM SIGSOFT symposium–ISARCS on Quality of software architectures–QoSA and architecting critical systems–ISARCS. ACM, 3–12.
[91] Jeff Kramer and Jeff Magee. 2007. Self-managed systems: an architectural challenge. In 2007 Future of Software Engineering. IEEE Computer Society, 259–268.
[92] Philippe B Kruchten. 1995. The 4+1 view model of architecture. IEEE Software 12, 6 (1995), 42–50.
[93] Christian Krupitzer, Felix Maximilian Roth, Sebastian VanSyckel, Gregor Schiele, and Christian Becker. 2015. A survey on engineering approaches for self-adaptive systems. Pervasive and Mobile Computing 17 (2015), 184–206.
[94] Jihyun Lee, Sungwon Kang, and Chang-Ki Kim. 2009. Software architecture evaluation methods based on cost benefit analysis and quantitative decision making. Empirical Software Engineering 14, 4 (2009), 453–475.
[95] Emmanuel Letier, David Stefan, and Earl T Barr. 2014. Uncertainty, risk, and information value in software requirements and architecture. In Proceedings of the 36th International Conference on Software Engineering. ACM, 883–894.
[96] Yan Liu, Muhammad Ali Babar, and Ian Gorton. 2008. Middleware architecture evaluation for dependable self-managing systems. In International Conference on the Quality of Software Architectures. Springer, 189–204.
[97] Timothy A Luehrman. 1998. Strategy as a portfolio of real options. Harvard Business Review 76 (1998), 89–101.
[98] Sara Mahdavi-Hezavehi, Vinicius HS Durelli, Danny Weyns, and Paris Avgeriou. 2017. A systematic literature review on methods that handle multiple quality attributes in architecture-based self-adaptive systems. Information and Software Technology 90 (2017), 1–26.
[99] Sara Mahdavi-Hezavehi, Matthias Galster, and Paris Avgeriou. 2013. Variability in quality attributes of service-based software systems: A systematic literature review. Information and Software Technology 55, 2 (2013), 320–343.
[100] Harry Markowitz. 1959. Portfolio Selection, Cowles Foundation Monograph No. 16. John Wiley, New York.
[101] R Timothy Marler and Jasbir S Arora. 2004. Survey of multi-objective optimization methods for engineering. Structural and Multidisciplinary Optimization 26, 6 (2004), 369–395.
[102] Indika Meedeniya, Aldeida Aleti, Iman Avazpour, and Ayman Amin. 2012. Robust ArcheOpterix: Architecture optimization of embedded systems under uncertainty. In 2012 Second International Workshop on Software Engineering for Embedded Systems (SEES). IEEE, 23–29.
[103] Indika Meedeniya, Aldeida Aleti, and Lars Grunske. 2012. Architecture-driven reliability optimization with uncertain model parameters. Journal of Systems and Software 85, 10 (2012), 2340–2355.
[104] Indika Meedeniya, Irene Moser, Aldeida Aleti, and Lars Grunske. 2011. Architecture-based reliability evaluation under uncertainty. In Proceedings of the joint ACM SIGSOFT conference–QoSA and ACM SIGSOFT symposium–ISARCS on Quality of software architectures–QoSA and architecting critical systems–ISARCS. ACM, 85–94.
[105] Indika Meedeniya, Irene Moser, Aldeida Aleti, and Lars Grunske. 2014. Evaluating probabilistic models with uncertain model parameters. Software & Systems Modeling 13, 4 (2014), 1395–1415.
[106] Gabriel A Moreno, Javier Cámara, David Garlan, and Bradley Schmerl. 2016. Efficient decision-making under uncertainty for proactive self-adaptation. In Autonomic Computing (ICAC), 2016 IEEE International Conference on. IEEE, 147–156.
[107] Brice Morin, Olivier Barais, Jean-Marc Jezequel, Franck Fleurey, and Arnor Solberg. 2009. Models@run.time to support dynamic adaptation. Computer 42, 10 (2009).
[108] Klara Nahrstedt, Hongyang Li, Phuong Nguyen, Siting Chang, and Long Vu. 2016. Internet of mobile things: Mobility-driven challenges, designs and implementations. In Internet-of-Things Design and Implementation (IoTDI), 2016 IEEE First International Conference on. IEEE, 25–36.
[109] Nanjangud Narendra and Prasant Misra. 2016. Research Challenges in the Internet of Mobile Things. (March 2016). https://iot.ieee.org/newsletter/march-2016/research-challenges-in-the-internet-of-mobile-things.html
[110] Robert C Nickerson, Upkar Varshney, and Jan Muntermann. 2013. A method for taxonomy development and its application in information systems. European Journal of Information Systems 22, 3 (2013), 336–359.
[111] William L Oberkampf, Jon C Helton, Cliff A Joslyn, Steven F Wojtkiewicz, and Scott Ferson. 2004. Challenge problems: uncertainty in system response given uncertain parameters. Reliability Engineering & System Safety 85, 1-3 (2004), 11–19.
[112] Bendra Ojameruaye, Rami Bahsoon, and Leticia Duboc. 2016. Sustainability debt: a portfolio-based approach for evaluating sustainability requirements in architectures. In Proceedings of the 38th International Conference on Software Engineering Companion. ACM, 543–552.
[113] Magnus Osterlind, Pontus Johnson, Kiran Karnati, Robert Lagerstrom, and Margus Valja. 2013. Enterprise architecture evaluation using utility theory. In 2013 17th IEEE International Enterprise Distributed Object Computing Conference Workshops. IEEE, 347–351.
[114] Ipek Ozkaya, Rick Kazman, and Mark Klein. 2007. Quality-attribute based economic valuation of architectural patterns. In Economics of Software and Computation, 2007. ESC'07. First International Workshop on the. IEEE, 5–5.
[115] Dewayne E Perry and Alexander L Wolf. 1992. Foundations for the study of software architecture. ACM SIGSOFT Software Engineering Notes 17, 4 (1992), 40–52.
[116] Kai Petersen, Sairam Vakkalanka, and Ludwik Kuzniarz. 2015. Guidelines for conducting systematic mapping studies in software engineering: An update. Information and Software Technology 64 (2015), 1–18.
[117] RJ Pooley and AAL Abdullatif. 2010. CPASA: continuous performance assessment of software architecture. In Engineering of Computer Based Systems (ECBS), 2010 17th IEEE International Conference and Workshops on. IEEE, 79–87.
[118] Aurora Ramirez, José Raúl Romero, and Sebastian Ventura. 2019. A survey of many-objective optimisation in search-based software engineering. Journal of Systems and Software 149 (2019), 382–395.
[119] Samuel T Redwine Jr and William E Riddle. 1985. Software technology maturation. In Proceedings of the 8th international conference on Software engineering. IEEE Computer Society Press, 189–200.
[120] Ralf H Reussner, Steffen Becker, Jens Happe, Robert Heinrich, Anne Koziolek, Heiko Koziolek, Max Kramer, and Klaus Krogmann. 2016. Modeling and simulating software architectures: The Palladio approach. MIT Press.
[121] Matthias Rohr, Simon Giesecke, Marcel Hiel, Willem-Jan van den Heuvel, Hans Weigand, and Wilhelm Hasselbring. 2006. A classification scheme for self-adaptation research. (2006).
[122] Banani Roy and TC Nicholas Graham. 2008. Methods for evaluating software architecture: A survey. School of Computing TR 545 (2008), 82.
[123] Thomas L Saaty. 2008. Decision making with the analytic hierarchy process. International Journal of Services Sciences 1, 1 (2008), 83–98.
[124] Software Engineering Institute (SEI). 2018. Reduce Risk with Architecture Evaluation. Technical Report. SEI/CMU.
[125] Mary Shaw and David Garlan. 1996. Software architecture: perspectives on an emerging discipline. Vol. 1. Prentice Hall Englewood Cliffs.
[126] Dalia Sobhy, Rami Bahsoon, Leandro Minku, and Rick Kazman. 2016. Diversifying Software Architecture for Sustainability: A Value-based Perspective. Proceedings of the 2016 European Conference on Software Architecture (ECSA) (2016).
[127] Dalia Sobhy, Leandro Minku, Rami Bahsoon, Tao Chen, and Rick Kazman. 2020. Run-time evaluation of architectures: A case study of diversification in IoT. Journal of Systems and Software 159 (2020), 110428.
[128] Kevin J Sullivan, Prasad Chalasani, Somesh Jha, and Vibha Sazawal. 1999. Software design as an investment activity: a real options perspective. Real options and business strategy: Applications to decision making (1999), 215–262.
[129] Kevin J Sullivan, William G Griswold, Yuanfang Cai, and Ben Hallen. 2001. The structure and value of modularity in software design. In ACM SIGSOFT Software Engineering Notes, Vol. 26. ACM, 99–108.
[130] Harald Sundmaeker, Patrick Guillemin, Peter Friess, and Sylvie Woelfflé. 2010. Vision and challenges for realising the Internet of Things. (2010).
[131] Michael Szvetits and Uwe Zdun. 2016. Systematic literature review of the objectives, techniques, kinds, and architectures of models at runtime. Software & Systems Modeling 15, 1 (2016), 31–69.
[132] Brendan Tansey and Eleni Stroulia. 2007. Valuating software service development: integrating COCOMO II and real options theory. In Economics of Software and Computation, 2007. ESC'07. First International Workshop on the. IEEE, 8–8.
[133] Gerald Tesauro. 2007. Reinforcement learning in autonomic computing: A manifesto and case studies. IEEE Internet Computing 11, 1 (2007), 22–30.
[134] Lenos Trigeorgis. 1996. Real options: Managerial flexibility and strategy in resource allocation. MIT Press.
[135] Catia Trubiani, Indika Meedeniya, Vittorio Cortellessa, Aldeida Aleti, and Lars Grunske. 2013. Model-based performance analysis of software architectures under uncertainty. In Proceedings of the 9th international ACM Sigsoft conference on Quality of software architectures. ACM, 69–78.
[136] M Jeroen Van Der Donckt, Danny Weyns, M Usman Iftikhar, and Ritesh Kumar Singh. 2018. Cost-Benefit Analysis at Runtime for Self-adaptive Systems Applied to an Internet of Things Application. In Proceedings of ENASE 2018, Portugal.
[137] Jan Salvador van der Ven, Anton GJ Jansen, Jos AG Nijhuis, and Jan Bosch. 2006. Design decisions: The bridge between rationale and architecture. In Rationale management in software engineering. Springer, 329–348.
[138] Naruemon Wattanapongsakorn and David W Coit. 2007. Fault-tolerant embedded system design and optimization considering reliability estimation uncertainty. Reliability Engineering & System Safety 92, 4 (2007), 395–407.
[139] Lloyd G Williams and Connie U Smith. 2002. PASA SM: a method for the performance assessment of software architectures. In Proceedings of the 3rd international workshop on Software and performance. ACM, 179–189.
[140] Claes Wohlin. 2014. Guidelines for snowballing in systematic literature studies and a replication in software engineering. In Proceedings of the 18th international conference on evaluation and assessment in software engineering. ACM, 38.
[141] Jie Yang, Gang Huang, Wenhui Zhu, Xiaofeng Cui, and Hong Mei. 2009. Quality attribute tradeoff through adaptive architectures at runtime. Journal of Systems and Software 82, 2 (2009), 319–332.
[142] He Zhang and Muhammad Ali Babar. 2010. On searching relevant studies in software engineering. (2010).
[143] Dongbin Zhao and Zhaohui Hu. 2011. Supervised adaptive dynamic programming based adaptive cruise control. In 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL). IEEE, 318–323.
[144] Ruiqing Zhao and Baoding Liu. 2004. Redundancy optimization problems with uncertainty of combining randomness and fuzziness. European Journal of Operational Research 157, 3 (2004), 716–735.
[145] Liming Zhu, Aybüke Aurum, Ian Gorton, and Ross Jeffery. 2005. Tradeoff and sensitivity analysis in software architecture evaluation using analytic hierarchy process. Software Quality Journal 13, 4 (2005), 357–375.
[146] Peter Zimmerer. 2018. Strategy for continuous testing in iDevOps. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings. ACM, 532–533.