
Article

Speculative Computing for Efficient AAFM Solutions in Large-Scale Product Configurations
Cristian Vidal-Silva 1, Jesennia Cárdenas-Cobo 2, Vannessa Duarte 3, Iván Veas 4,*, José Miguel Rubio-León 5

1 School of Videogame Development and Virtual Reality Engineering, Faculty of Engineering, University of
Talca, Campus Talca, Chile; cvidal@utalca.cl
2 Facultad de Ciencias e Ingenierías, Universidad Estatal de Milagro, Milagro, 091706, Ecuador;
jcardenasc@unemi.edu.ec
3 Escuela de Ciencias Empresariales, Universidad Católica del Norte, Coquimbo, 1781421, Chile;
vannessa.duarte@ucn.cl
4 Departamento de Administración, Facultad de Economía y Administración, Universidad Católica del
Norte, Antofagasta, 1270709, Chile; iveas@ucn.cl
5 Escuela de Computación e Informática, Universidad Bernardo O’Higgins, Av. Viel 1497, Santiago, 8320000,
Santiago, Chile; josemiguel.rubio@ubb.cl
* Correspondence: cvidal@utalca.cl; Tel.: +56-9-62002702

Abstract: The configuration of variability-intensive systems selects options regarding the system's user requirements and configuration restrictions. Feature models are a "de facto" standard for modeling variability-intensive systems' functionality and configurations. Nonetheless, manual analysis of configurations in variability-intensive systems is practically impossible due to the large number of configuration instances they represent and the high number of components in each configuration. The automated analysis of feature models emerged to address this problem. Traditional solutions for automated feature model analysis apply sequential computing. Although these solutions may be efficient for low-variability systems, they require excessive processing time on large-scale, high-variability systems. Because of their sequential computing approach, they cannot exploit additional computing resources, such as multi-core technology, to reduce the response time when working with large-scale feature models and configurations. This article describes and assesses the adaptation and effective scalability of existing operation solutions of automated analysis of feature models to improve their response time with large-scale models and configurations. Specifically, this article summarizes Parallel QuickXPlain, the adaptation of QuickXPlain for detecting a minimal set of conflicting constraints, and Parallel FastDiag, the adaptation of FastDiag to obtain a preferred minimal diagnosis. Parallel QuickXPlain and Parallel FastDiag apply speculative programming to try to reduce the number of direct consistency checks and thus obtain more efficient results with large-scale models.

Keywords: Conflict detection; diagnosis detection; configuration; speculative programming; Parallel QuickXplain; Parallel FastDiag.

Citation: Vidal-Silva, C.; Cárdenas-Cobo, J.; Duarte, V.; Veas, I.; Rubio-León, J.M. Speculative Computing for Efficient Computing AAFM Solutions in Large-Scale Product Configurations. Journal Not Specified 2023, 1, 0. https://doi.org/

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2023 by the authors. Submitted to Journal Not Specified for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Variability-Intensive Systems (VIS) are software systems in which variability management and product configuration are core activities.
Product configuration is designing a product according to a set of requirements and configuration rules [1,2]. In more detail, product configuration systems need a knowledge base of components and combination rules, together with the customers' requirements, to select product components (a configuration) that match the customers' preferences [3]. A valid product configuration only depends on the selected features in a consistent knowledge base scenario; each valid product results from the composition of component type instances that respect the set of defined combination rules [4].

Version October 1, 2023 submitted to Journal Not Specified https://www.mdpi.com/journal/notspecified



Product configuration systems require systematically managing all features and composition rules to analyze the feature selection for desired products [5]. Variability Models (VMs) describe the different relationships and configuration options for the variability management of software systems in software engineering. Different VMs exist, such as Feature Models (FMs) and Orthogonal Variability Models (OVMs). FMs permit representing functional commonalities and variabilities of software systems [6], whereas OVMs permit describing the variant parts of the base model of systems [7]. Kang et al. [8] introduced FMs as part of the FODA (Feature-Oriented Domain Analysis) method, and they became the most used VM in the Software Product Line (SPL) community afterward.
A Feature Model (FM) defines a set of features and their relationships for defining valid feature combinations or products, that is, sets of features that respect the FM's defined relationships. For Chacón-Luna et al. [9], an FM permits representing all the products of an SPL. An FM is organized as a tree-like structure that starts at the root feature, which identifies the SPL and from which tree branches of features emerge. We can then define a product in terms of the set of features that compose it, and each feature describes an increment of functionality in the products containing that feature [10]. An FM supports binary and set relationships between parent and child features, and cross-tree relationships to symbolize dependencies between features. Fig. 1a shows an FM describing an operating system SPL and a valid configuration of it (the grey-colored features). Fig. 1b shows the same FM along with a non-valid configuration of it.

1.1. Problem statement, goal and contributions


Product configuration assists mass customization in production [4]. VIS instances follow mass customization in software engineering by addressing variability in the phases of the software development process. Because VIS users expect software products to adapt to their needs, managing the variability of the users' requirements is an essential activity for those systems [11]. Research on managing the variability of VIS already exists in the literature, for example for the Linux operating system [12,13], the Debian-based distributions of Linux [14], the Android mobile system [15], and the content management framework Drupal [16]. Those works use variability models for representing and analyzing VIS.
An SPL is a case of VIS that systematically manages commonalities and variabilities for configuring software products [17]. SPL defines domain engineering as analyzing and developing reusable common and variable functionalities (features) in the products' domain, and application engineering as producing customized products according to the user's feature selection. Defining valid configurations in an SPL is a complex task because of the growing complexity of the configuration knowledge base [15]. When the users' feature selection conflicts with a consistent configuration knowledge base, it is necessary to identify those conflicts to solve them. As Benavides et al. [6] remark, the manual analysis of variability models, such as FMs, is an error-prone and time-consuming task, mainly because of their increasing size. For example, variability and configuration models for Debian-based distributions describe around twenty-eight thousand variability points [14], and manually analyzing those models without mistakes is impractical. Mechanisms for the Automated Analysis of Feature Models (AAFM) [6] are a solution to those issues.

[Figure 1 image: feature model rooted at Debian with child features texteditor, bash, gui, and game; texteditor groups vi, gedit, and openoffice.org-1 (with openoffice.org-1.1 and openoffice.org-1.2); gui groups gnome and kde; game groups gnuchess and glchess. Legend: Mandatory, Optional, Alternative, Or, Requires, Excludes; Non-selected, Selected.]
(a) Feature model with a valid configuration example [1].
(b) Feature model with a non-valid configuration example [1].
Figure 1. Feature model example of a valid and non-valid configuration of an operating system.

A set of features in an FM is called a configuration, and each software variant in an FM identifies a valid configuration or product [18]. FMs permit organizing the configuration space to facilitate the construction of software variants by describing configuration options using interdependent features or functionalities. Hence, AAFM operations that assist in obtaining conflict-free FMs and configurations are high-value tasks. Nonetheless, existing AAFM operations usually follow a sequential computing approach and cannot scale to large-scale and high-variability models. Various algorithms and solutions applicable to the AAFM exist in the literature, such as QuickXPlain [19] and FastDiag [20], which detect a minimal conflict and a minimal preferred diagnosis, respectively, in a set of conflicting constraints. Because of their sequential computing nature, those solutions cannot use additional resources such as multiple cores or network technologies for parallel and distributed computing. Next, we describe those operations in more detail.
• Minimal conflict set. For a consistent FM that we can define as a set of constraints, a non-consistent configuration violates those constraints. The feature model of Fig. 1a exemplifies a consistent configuration; this configuration does not violate the FM constraints. If the feature gnuchess were also selected for that FM and configuration, the resulting configuration would be non-consistent because a conflict exists between the features gnuchess and glchess. In this example, {gnuchess, glchess} is a Minimal Conflict Set (MCS) because no proper subset of it is a conflict set. As noted in [20], we can solve an MCS by merely deleting one of its constraints. After finding and solving all the MCS instances, the configuration is valid (conflict-free).
  QuickXPlain relies on consistency checks over constraint sets, a costly action, as the primary step to achieve its main purpose: to identify a preferred MCS. The QuickXPlain algorithm efficiently finds a preferred MCS with respect to the order in which the constraints are defined. Applying QuickXPlain to the conflict analysis of large-scale FMs, such as those of the Android mobile operating system [15], the Linux kernel [12], and Linux distributions like Debian [14], is computationally expensive. That occurs mainly because of the sequential nature of QuickXPlain and the high demand for computing resources, such as execution time and memory space, when working with large-scale models.
• Minimal diagnosis. Given the constraints of a consistent FM and a non-consistent configuration that violates the FM constraints, a diagnosis is a set of constraints whose removal yields a consistent configuration. For the configuration of Fig. 1b, in which the feature gnuchess is also selected, we know that {gnuchess, glchess} is a Minimal Conflict Set (MCS). Then, by deleting either gnuchess or glchess, we can solve that conflict; that is, {gnuchess} and {glchess} are examples of diagnoses for it.
  FastDiag relies on consistency checks over constraint sets, a costly action, as the primary step to achieve its main purpose: to identify a preferred minimal diagnosis. The FastDiag algorithm efficiently finds a preferred minimal diagnosis with respect to the order in which the constraints are defined. Applying this algorithm to the diagnosis analysis of large-scale FMs, such as those of the Android mobile operating system [15], the Linux kernel [12], and Linux distributions like Debian [14], is computationally expensive. That occurs mainly because of the sequential nature of FastDiag and the high demand for computing resources, such as execution time and memory space, when working with large-scale models.
Developing FM product configurations without conflicts requires identifying each conflict and the necessary steps to solve or diagnose it. Hence, conflict detection and diagnosis are essential operations for getting conflict-free models. Because AAFM solutions for conflict detection, diagnosis, and completion of products already exist, the main goal of this article is to review the functionality and computing cost of existing traditional and speculative AAFM solutions for assisting product configuration on large-scale models.
In 2010, Benavides et al. [6] presented a Systematic Literature Review (SLR) about FM and AAFM operations. Different AAFM proposals exist using various formal approaches such as CSP, SAT, and BDD solvers [7]. The main contributions of this article are:
• To analyze the functionality and computing performance of commonly used AAFM solutions for minimal conflict detection, minimal diagnosis, and the minimal completion of partial product configurations.
• To describe speculative programming as a computing approach that can improve the performance of AAFM solutions, and to describe application examples.

This article highlights research opportunities for developing new and more efficient solutions for conflict detection and diagnosis of large-scale product configurations. To improve the readability of the study, Tab. 1 presents the acronyms used in those topics with their meanings.
The rest of this paper is organized as follows. Section 2 describes and exemplifies the use of FMs. Section 3 describes the Automated Analysis of Feature Models (AAFM) and the product configuration processes. That section also details the conflict detection and diagnosis operations and existing solutions. Section 4 presents practical results of the analyzed solutions for the product configuration for a test set. Section 5 details speculative solutions and practical results of the analyzed solutions for the product configuration for a test set. Section 7 details a few practical issues of our research. The paper concludes by summarizing the benefits of our academic experience and detailing the motivation for continuing with it in the current and future years.

2. Background
A feature model is an information model that represents the variant flexibility and maintainability for systems' variability and configuration [2]. A feature is an abstraction of a prominent or distinctive user-visible aspect, requirement, quality, or functional characteristic of a family of software systems [21,22]; each feature constitutes a user-visible configuration option of the problem domain [23]. An FM is a tree-like structure commonly used to represent common and variable functionalities (features) and their relationships for the configuration of products in a Software Product Line (SPL) [8]. Kang et al. [8] introduced FMs in the FODA (Feature-Oriented Domain Analysis) method, and they are the "de facto" standard for describing common and variable features in system families [24,25] regardless of their size, because FMs facilitate software reuse [26].

Table 1: Acronym list

Acronym     Meaning
FM, FMs     Feature Model, Feature Models
AAFM        Automated Analysis of Feature Models
VIS         Variability-Intensive System
VM, VMs     Variability Model, Variability Models
OVMs        Orthogonal Variability Models
SPL         Software Product Line
MCS         Minimal Conflict Set
SLR         Systematic Literature Review
MD          Minimal Diagnosis
FODA        Feature-Oriented Domain Analysis

An FM starts with the root feature. Each successively deeper level in the FM corresponds to a more fine-grained configuration option for product-line variants. Features are the nodes of that tree, and the edges are the relationships and constraints between features [21]. The relationships among features are of two types: structural relationships between a parent and its child features, and cross-tree or cross-hierarchy constraints [21]. FMs represent an effective communication medium between customers and developers of SPLs [27]. As Benavides et al. [6] describe, different FM dialects exist nowadays, such as basic FMs, cardinality-based FMs, and extended FMs using feature attributes [10,28,29].

2.0.1. Basic feature models


A basic FM supports two types of relationships between features: structural relationships between parents and their child features, and cross-tree constraints [6]. Thus, each non-root feature has a parent feature and is either part of a group or not. The next lines describe each type of FM relationship.

• Structural relationships between parents and their child features:
  – Mandatory: A mandatory relationship states that a parent feature requires its child. The top-left figure of Table 2 shows the graphic representation of a mandatory relationship between parent and child features.
  – Optional: An optional relationship states that a child feature may or may not be present (its parent feature does not require it). The top-right figure of Table 2 illustrates an optional relationship between parent and child features.
  – Set: A defined number of child features (sub-features) is selectable for products when their parent is selected. A cardinality relation [x, y] gives this number of features, with x ≤ y and y ≤ n, where n is the number of child features in the set. Two cases are Or (inclusive) and XOR (alternative) sets.
    * Inclusive Or: At least one child feature must be present. In this case, the cardinality relation is [1, n] (n corresponds to the number of child features). The middle-left figure of Table 2 illustrates an inclusive relationship between a parent feature and a set of child features.
    * Alternative XOR: Exactly one child feature must be present. The associated cardinality relation is [1, 1] in this case. The middle-right figure of Table 2 illustrates an alternative relationship between a parent feature and a set of child features.
• Cross-Tree Constraints:
  – Requires: For two features, A and B, if A requires B, then A's presence implies the presence of B in a product. The top division in the bottom row of Table 2 illustrates a requires cross-tree constraint between a source feature A and a target feature B.
  – Excludes: For two features, A and B, if A excludes B, then A and B cannot be present in the same product. The bottom division of the bottom row of Table 2 illustrates an excludes cross-tree constraint between features A and B.

Table 2: Feature model relations.

Relation type            Relations
Unary relations          Mandatory; Optional
Set relations            Inclusive (OR); Exclusive (XOR)
Cross-tree constraints   Requires; Excludes
More complex cross-tree relationships exist in the literature to define constraints as generic propositional formulas such as "A and not B implies C" [6].
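The relation types above can be read as propositional checks over a set of selected features. The following is a minimal sketch under stated assumptions, not the authors' tooling: the helper names (`mandatory`, `or_group`, `violated`, and so on) are hypothetical, the rule list encodes the Debian model of Fig. 1 as this article describes it in prose, and the sample configurations assume that the selected OpenOffice version is openoffice.org-1.1.

```python
from typing import Callable, FrozenSet, Iterable, List, Sequence, Tuple

Config = FrozenSet[str]
Rule = Tuple[str, Callable[[Config], bool]]  # (label, predicate over a configuration)


def mandatory(parent: str, child: str) -> Rule:
    # The child is present exactly when its parent is present.
    return (f"{parent} mandatory {child}", lambda s: (parent in s) == (child in s))


def optional(parent: str, child: str) -> Rule:
    # The child may only appear together with its parent.
    return (f"{parent} optional {child}", lambda s: child not in s or parent in s)


def or_group(parent: str, children: Sequence[str]) -> Rule:
    # Cardinality [1, n]: at least one child exactly when the parent is present.
    return (f"{parent} or-group", lambda s: any(c in s for c in children) == (parent in s))


def xor_group(parent: str, children: Sequence[str]) -> Rule:
    # Cardinality [1, 1]: exactly one child with the parent, none without it.
    return (f"{parent} xor-group",
            lambda s: sum(c in s for c in children) == (1 if parent in s else 0))


def requires(a: str, b: str) -> Rule:
    return (f"{a} requires {b}", lambda s: a not in s or b in s)


def excludes(a: str, b: str) -> Rule:
    return (f"{a} excludes {b}", lambda s: not (a in s and b in s))


OO = "openoffice.org-1"
# Hypothetical encoding of the Debian FM, following the prose description of Fig. 1.
DEBIAN_FM: List[Rule] = [
    ("root Debian", lambda s: "Debian" in s),
    mandatory("Debian", "texteditor"),
    mandatory("Debian", "bash"),
    mandatory("Debian", "gui"),
    optional("Debian", "game"),
    or_group("texteditor", ["vi", "gedit", OO]),
    or_group("gui", ["gnome", "kde"]),
    xor_group("game", ["gnuchess", "glchess"]),
    xor_group(OO, [OO + ".1", OO + ".2"]),
    requires(OO, "gnome"),
]


def violated(config: Iterable[str], rules: Sequence[Rule] = DEBIAN_FM) -> List[str]:
    """Return the labels of all rules that the configuration breaks."""
    s = frozenset(config)
    return [label for label, holds in rules if not holds(s)]


VALID = {"Debian", "texteditor", "vi", "gedit", OO, OO + ".1", "bash",
         "gui", "gnome", "kde", "game", "glchess"}
INVALID = (VALID - {"gnome"}) | {"gnuchess"}

print(violated(VALID))    # []
print(violated(INVALID))  # ['game xor-group', 'openoffice.org-1 requires gnome']
```

Checking a complete configuration this way is cheap; the costly operations discussed in this article arise when a solver must decide whether any completion of a partial constraint set exists.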
Fig. 1a illustrates a valid configuration for the FM of the Debian operating system: nodes represent the features of the model (i.e., selectable packages to install), and edges are the constraints between features (e.g., packages that require the installation of other packages). In this example, we can observe that the packages texteditor, bash, and gui are mandatory (i.e., they must always be included in any Debian configuration). In contrast, the package game is optional. We can also observe that texteditor requires at least one of the packages vi, gedit, or openoffice. Likewise, the feature gui requires at least one of gnome or kde. In the case of the package game, we observe that it requires either gnuchess or glchess, but only one, not both. Similarly, the package openoffice.org-1 requires either version openoffice.org-1.1 or openoffice.org-1.2. Finally, we observe that openoffice.org-1 strictly requires the installation of gnome. Hence, the selection of features {Debian, texteditor, vi, gedit, openoffice, openoffice-1, bash, gui, gnome, kde, game, glchess} exemplifies a valid product. Fig. 1b illustrates a non-valid configuration for the FM of the Debian operating system: the selection of features {Debian, texteditor, vi, gedit, openoffice, openoffice-1, bash, gui, kde, game, gnuchess, glchess} does not respect the requires cross-tree constraint between openoffice.org-1 and gnome (only the first option is selected), and both gnuchess and glchess are selected for the option game when only one of them may be selected (game represents an alternative set of features).
The application and analysis of FMs is a common approach to performing analysis tasks. Benavides et al. [6] mention that the manual analysis of FMs is a time-demanding and error-prone activity, and the AAFM process permits solving those issues. The AAFM process starts by translating the FM and additional information, such as global restrictions, into logical constraints. Afterward, queries can be run against the translated model using an off-the-shelf solver and other tools such as programming solutions, thus obtaining analysis results [30]. Fig. 2 illustrates the AAFM process.

Figure 2. Automated Analysis of Feature Models (AAFM) process [1].

3. Automated Analysis of Variability-Intensive Systems


The development process of a VIS considers identifying and representing the system's components and the relationships among those components as two core activities. The application and analysis of FMs is a common approach to performing those analysis tasks. Benavides et al. [6] mention that the manual analysis of FMs is a time-demanding and error-prone activity, and the AAFM process permits solving those issues. The AAFM process starts by translating the FM and additional information, such as global restrictions, into logical constraints. Afterward, queries can be run against the translated model using an off-the-shelf solver and other tools such as programming solutions, thus obtaining analysis results [30].
For Galindo et al. [30], six different variability facets exist where the AAFM is currently applied: i) product configuration and derivation; ii) testing and evolution; iii) reverse engineering; iv) multi-model variability analysis; v) variability modeling; and vi) variability-intensive systems. The first AAFM application is the most traditional usage of automated analysis mechanisms. This article aims to contribute to it.
Developing FMs and product configurations without errors or conflicts requires identifying each conflict and the necessary steps to solve or diagnose it. Hence, conflict detection and diagnosis are operations needed for getting conflict-free models. Completing a product configuration of an FM by hand is also an error-prone and time-consuming task. Solutions that perform those tasks efficiently on large-scale models are highly valuable nowadays. AAFM solutions for product conflict detection, diagnosis, and completion already exist. The next sections describe an existing algorithm for detecting Minimal Conflict Sets (MCS), a current algorithm for detecting a Minimal Diagnosis (MD), and traditional approaches to complete product configurations.

3.1. Product Configuration Solutions


Minimal Conflict Sets (MCS) detection: An MCS of a system represents a minimal set of constraints in conflict. For definition 3.1 [4], it is necessary to identify the set of constraints B that represents a consistent background knowledge and the set of constraints C that is the suspected subject of a conflict search.
Definition 3.1. A set AC = B ∪ C = {c1, c2, ..., cn} represents the set of all constraints in the knowledge base; that is, AC is the union of the consistent knowledge base B and the suspicious set of constraints C that is the subject of the conflict search. Then, a conflict CS = {ca, cb, ..., cz} is a non-empty subset of C such that B ∪ CS is non-consistent. CS is minimal if ¬∃ CS′ such that CS′ ⊂ CS and CS′ is also a conflict. CS is preferred if the order of its constraints follows a defined ranking of preferences.
For the FM of Fig. 1b, concerning definition 3.1, the consistent background knowledge B is a formal definition of the FM, that is, a logic representation of the set of features and their relationships. We can detect conflicts in the configuration of products for that model. For the product configuration C = {Debian, texteditor, bash, gui, game, vi, gedit, openoffice.org-1, gnome, kde, glchess, openoffice.org-1}, the resulting minimal conflict set is {} because C represents a consistent configuration. For the product configuration C = {Debian, texteditor, bash, gui, game, vi, gedit, openoffice.org-1, kde, gnuchess, glchess, openoffice.org-1}, the resulting preferred minimal conflict set is {openoffice.org-1, ¬gnome}. The next lines describe the QuickXPlain algorithm for efficiently detecting a preferred MCS.
QuickXPlain [19] is an efficient approach to determining a minimal conflict set. QuickXPlain receives C, the set of suspicious constraints with conflicts, and B, the consistent constraints of the background knowledge. A conflict does not exist if B ∪ C is consistent or C is empty. Otherwise, QuickXPlain returns the result of the function QX. QX receives the parameters C (initially the complete set of constraints with conflicts), B (initially the knowledge base), and Bδ (initially empty), which represents the last items added to B. Function QX follows a divide-and-conquer approach for conflict detection. Hence, Bδ corresponds to the set of constraints added for reviewing the consistency of the knowledge base, and C is the set of constraints to continue analyzing if the current B is consistent. Algorithms 1 and 2 show the pseudo-code of the functions of QuickXPlain.

Algorithm 1 QuickXPlain(C, B) : CS

  if Consistent(B ∪ C) then
    return('no conflict')
  else if C = ∅ then
    return(∅)
  else
    return(QX(C, B, ∅))
  end if

Algorithm 2 QX(C = {c1..cm}, B, Bδ) : CS

  if Bδ ≠ ∅ and Inconsistent(B) then
    return(∅)
  end if
  if C = {cα} then
    return({cα})
  end if
  k = ⌊m/2⌋
  Ca ← c1...ck; Cb ← ck+1...cm;
  Δ2 ← QX(Ca, B ∪ Cb, Cb);
  Δ1 ← QX(Cb, B ∪ Δ2, Δ2);
  return(Δ1 ∪ Δ2)

QuickXPlain determines one MCS per computation. Felfernig et al. [4] indicate that, to solve an MCS, we need to adequately update or delete one of its constraints and, if the model is still non-consistent, apply QuickXPlain again and repeat the process. When the resulting model is consistent, the updated constraints represent a diagnosis or solution for the model.
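As an illustration of Algorithms 1 and 2, the sketch below transcribes them into Python. It is a hedged example rather than the authors' implementation: the brute-force `consistent` function is an assumed stand-in for a real solver call, and the feature subset, the `select` and `deselect` helpers, and the constraint ordering are hypothetical choices made for this example.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple

Assignment = Dict[str, bool]
Constraint = Tuple[str, Callable[[Assignment], bool]]  # (name, predicate)

# Assumed toy universe: a fragment of the Debian model of Fig. 1.
FEATURES = ["openoffice.org-1", "gnome", "kde", "game", "gnuchess", "glchess"]


def consistent(constraints: List[Constraint]) -> bool:
    # Stand-in for a solver: search all assignments for one that satisfies everything.
    for values in product([False, True], repeat=len(FEATURES)):
        a = dict(zip(FEATURES, values))
        if all(pred(a) for _, pred in constraints):
            return True
    return False


def quickxplain(c: List[Constraint], b: List[Constraint]) -> Optional[List[Constraint]]:
    # Algorithm 1: None plays the role of 'no conflict'.
    if consistent(b + c):
        return None
    if not c:
        return []
    return qx(c, b, [])


def qx(c: List[Constraint], b: List[Constraint], b_delta: List[Constraint]) -> List[Constraint]:
    # Algorithm 2: divide-and-conquer search for one minimal conflict.
    if b_delta and not consistent(b):
        return []
    if len(c) == 1:
        return list(c)
    k = len(c) // 2
    ca, cb = c[:k], c[k:]
    delta2 = qx(ca, b + cb, cb)
    delta1 = qx(cb, b + delta2, delta2)
    return delta1 + delta2


def select(f: str) -> Constraint:
    return (f, lambda a: a[f])


def deselect(f: str) -> Constraint:
    return ("not " + f, lambda a: not a[f])


# Background knowledge B: two rules of the feature model.
B: List[Constraint] = [
    ("openoffice.org-1 requires gnome", lambda a: not a["openoffice.org-1"] or a["gnome"]),
    ("game xor-group", lambda a: a["gnuchess"] + a["glchess"] == (1 if a["game"] else 0)),
]
# User requirements C, in an ordering chosen for this example.
C = [select("kde"), select("gnuchess"), select("glchess"),
     select("openoffice.org-1"), deselect("gnome")]

mcs = quickxplain(C, B)
print([name for name, _ in mcs])  # ['not gnome', 'openoffice.org-1']
```

With this ordering the sketch reports the conflict between openoffice.org-1 and the missing gnome; listing the same requirements in a different order can yield the {gnuchess, glchess} conflict instead, which reflects the dependence of the preferred MCS on the constraint order.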

3.2. Minimal diagnosis detection


Identifying and solving conflicts one by one is necessary to obtain a conflict-free model: we need to identify a conflict first, adapt (update or eliminate) constraints of that conflict to solve it, and repeat this process until no more conflicts exist, that is, until reaching a consistent model. The set of all the adapted constraints for getting a conflict-free model represents a diagnosis. Definition 3.2 formally defines the term diagnosis [4,31].
Definition 3.2. A set AC = {c1, c2, ..., cn} represents the set of all constraints in the problem for diagnosis; that is, AC is the union of the consistent background knowledge B and the set of constraints C that is the subject of the conflict search: AC = B ∪ C. Then, a diagnosis is a set of constraints Δ ⊆ C such that B ∪ (C − Δ) is a consistent or conflict-free set. Δ is minimal if ¬∃ Δ′ such that Δ′ ⊂ Δ and Δ′ is also a diagnosis. A minimal diagnosis Δ is of minimal cardinality if there does not exist a minimal diagnosis Δ′ such that |Δ′| < |Δ|.
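To make the definitions of conflict and diagnosis concrete, the following sketch enumerates them by brute force for the chess fragment of the Debian example. It is only an illustration under stated assumptions: the three-feature universe, the `xor_game` rule, and the helper names are hypothetical, and exhaustive enumeration is viable only for tiny constraint sets.

```python
from itertools import combinations, product
from typing import Callable, Dict, FrozenSet, Iterable, Iterator, List

Assignment = Dict[str, bool]
Pred = Callable[[Assignment], bool]

FEATURES = ["game", "gnuchess", "glchess"]


def xor_game(a: Assignment) -> bool:
    # game is an alternative (XOR) set: exactly one chess package iff game is selected.
    return a["gnuchess"] + a["glchess"] == (1 if a["game"] else 0)


B: List[Pred] = [xor_game]  # consistent background knowledge
# C: the user selects all three features (one named constraint per feature).
C: Dict[str, Pred] = {n: (lambda a, n=n: a[n]) for n in FEATURES}


def consistent(preds: List[Pred]) -> bool:
    # Stand-in for a solver: does any assignment satisfy every predicate?
    for values in product([False, True], repeat=len(FEATURES)):
        a = dict(zip(FEATURES, values))
        if all(p(a) for p in preds):
            return True
    return False


def subsets(names: Iterable[str]) -> Iterator[FrozenSet[str]]:
    names = list(names)
    for r in range(len(names) + 1):
        for combo in combinations(names, r):
            yield frozenset(combo)


def minimal(sets: List[FrozenSet[str]]) -> List[FrozenSet[str]]:
    # Keep the sets that have no proper subset in the same collection.
    return [s for s in sets if not any(t < s for t in sets)]


# Conflicts: subsets CS of C such that B ∪ CS is non-consistent.
conflicts = [s for s in subsets(C) if not consistent(B + [C[n] for n in s])]
# Diagnoses: subsets Δ of C such that B ∪ (C − Δ) is consistent.
diagnoses = [s for s in subsets(C)
             if consistent(B + [C[n] for n in C if n not in s])]

print(minimal(conflicts))
print(minimal(diagnoses))
```

For this fragment the only minimal conflict is {gnuchess, glchess}, and the minimal diagnoses are {gnuchess} and {glchess}, both of minimal cardinality.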
A minimal diagnosis for the FM configuration of Fig. 1b has to consider solutions for each conflict. Hence, this example contains two diagnosis options. To get a conflict-free model, the user has to solve each diagnosis. Cases with multiple diagnosis instances exist, and determining all the diagnoses can be computationally expensive. Model constraints can be ordered by relevance to obtain a preferred diagnosis. Obtaining all the diagnoses to look for the preferred one is a time-demanding and wasteful activity, since solving one diagnosis is enough for a conflict-free model. The next lines describe the FastDiag algorithm to determine a minimal preferred diagnosis.
The FastDiag algorithm determines a preferred or leading diagnosis with respect to a previously defined relevance order of the constraints in the knowledge base. FastDiag follows the algorithmic structure and reasoning of QuickXPlain for a different purpose: diagnosis detection without calculating MCS instances. Hence, FastDiag is based on conflict-independent search strategies [32]. Algorithms 3 and 4 give the pseudo-code of the FastDiag functions.
Algorithm 3 FastDiag(C, AC) : diagnosis Δ

  if C = ∅ or Inconsistent(AC − C) then
    return(∅)
  else
    return(FD(∅, C, AC))
  end if

Algorithm 4 FD(D, C = {c1..cq}, AC) : diagnosis Δ

  if D ≠ ∅ and Consistent(AC) then
    return(∅)
  end if
  if |C| = 1 then
    return(C)
  end if
  k = ⌊q/2⌋
  Ca ← c1...ck; Cb ← ck+1...cq;
  Δ1 ← FD(Cb, Ca, AC − Cb);
  Δ2 ← FD(Δ1, Cb, AC − Δ1);
  return(Δ1 ∪ Δ2)

If the constraint set C is non-empty and AC without C is consistent, FastDiag calls the
recursive algorithm FD and waits for its result. FD first reviews the consistency of AC
as a source of diagnosis. AC always contains C and does not contain D; C is the constraint
set under review, and D is empty in the first call. When D is not empty and AC is
consistent, D is the source of conflict, and FD returns the empty set. When that base case
does not hold, either because D is empty (such as at the beginning) or because AC is
inconsistent (AC can only become consistent after removing elements from it, and D
represents the last removed elements), AC is still in conflict, and C is a source of
conflict. Then, FD reviews the size of C: if C has minimal size (size 1), then C is the
diagnosis. If C is not of minimal size, FD partitions C into the sets Ca and Cb, of which
the last one corresponds to the most preferred partition. Afterward, FD calls itself over
Cb, Ca, and AC − Cb to review whether Cb is the diagnosis source and, if not, to continue
reviewing Ca with that goal.
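
The recursion of Algorithms 3 and 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in this article: the brute-force `consistent` check over three Boolean features, the helper `minus`, and the toy constraints are hypothetical and only serve to make the example self-contained. A realistic setup would delegate the consistency checks to a SAT solver.

```python
from itertools import product

VARS = ("a", "b", "c")  # toy feature names (assumption for this sketch)

def consistent(constraints):
    """Brute force: True if some assignment of VARS satisfies every constraint."""
    for values in product([False, True], repeat=len(VARS)):
        env = dict(zip(VARS, values))
        if all(pred(env) for _, pred in constraints):
            return True
    return False

def minus(xs, ys):
    """Ordered set difference xs - ys."""
    return [x for x in xs if x not in ys]

def fd(d, c, ac):
    """Algorithm 4: d = last removed constraints, c = candidates, ac = all constraints."""
    if d and consistent(ac):
        return []
    if len(c) == 1:
        return list(c)
    k = len(c) // 2
    ca, cb = c[:k], c[k:]
    delta1 = fd(cb, ca, minus(ac, cb))
    delta2 = fd(delta1, cb, minus(ac, delta1))
    return delta1 + delta2

def fastdiag(c, ac):
    """Algorithm 3: returns a diagnosis as an ordered list of constraints from c."""
    if not c or not consistent(minus(ac, c)):
        return []
    return fd([], c, ac)

# Base knowledge B: "a requires b" and "a excludes c".
B = [("a requires b", lambda e: (not e["a"]) or e["b"]),
     ("a excludes c", lambda e: not (e["a"] and e["c"]))]
# User requirements C, conflicting with B.
C = [("a=1", lambda e: e["a"]),
     ("b=0", lambda e: not e["b"]),
     ("c=1", lambda e: e["c"])]

diagnosis = [name for name, _ in fastdiag(C, B + C)]
reordered = [C[1], C[2], C[0]]
diagnosis_reordered = [name for name, _ in fastdiag(reordered, B + reordered)]
print(diagnosis, diagnosis_reordered)
```

With the order a=1, b=0, c=1 the sketch returns the diagnosis {b=0, c=1}; with the order b=0, c=1, a=1 it returns {a=1}. Both are minimal diagnoses of the same toy configuration, which illustrates how the constraint order selects the preferred one.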
In summary, QuickXplain and FastDiag are efficient algorithmic solutions for
identifying MCS and minimal diagnoses. Even though they are efficient sequential-computing
solutions, as Vidal et al. [5] highlight, they are inadequate for large-scale
FMs. The following section reviews the computing performance of those solutions.

4. AAFM Product Configuration Solutions in Practice

Next, we summarize experiments to evaluate the computing performance of QuickXplain
and FastDiag.
QX complexity. Assuming a splitting $k = \lfloor m/2 \rfloor$ of $C = \{c_1..c_m\}$, the worst-case time
complexity of QuickXplain in terms of the number of consistency checks needed for
calculating one minimal conflict is $2k \times \log_2(m/k) + 2k$, where $k$ is the minimal conflict set
size and $m$ represents the underlying number of constraints [19]. We should optimize the
computing performance of consistency checks because they are the most time-consuming
part of conflict detection.
FD complexity. Assuming a splitting $d = \lfloor n/2 \rfloor$ of $S = \{s_1..s_n\}$, the worst-case time
complexity of FD in terms of the number of consistency checks needed for calculating
one minimal diagnosis is $2d \times \log_2(n/d) + 2d$, where $d$ is the minimal diagnosis set size
and $n$ represents the underlying number of constraints [32]. The runtime performance of
the underlying algorithms must be optimized because consistency checks are the most
time-consuming part of diagnosis detection.
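
As a quick numeric illustration of the two bounds above, the following sketch evaluates the expression 2k × log2(m/k) + 2k for the cardinalities used later in the experiments. The function name `worst_case_checks` and the choice of 100 constraints (10% of about 1,000 variables) are assumptions made for this example only.

```python
import math

def worst_case_checks(size, total):
    """Worst-case number of consistency checks: 2k*log2(m/k) + 2k.

    size  -- cardinality k of the minimal conflict (or d of the minimal diagnosis)
    total -- number m (or n) of constraints under analysis
    """
    if not 1 <= size <= total:
        raise ValueError("expected 1 <= size <= total")
    return 2 * size * math.log2(total / size) + 2 * size

# Hypothetical setting: 100 requirements, conflict cardinalities 1 to 16.
bounds = [worst_case_checks(k, 100) for k in (1, 2, 4, 8, 16)]
for k, bound in zip((1, 2, 4, 8, 16), bounds):
    print(f"cardinality {k:2d}: at most {bound:.1f} consistency checks")
```

The bound grows with the cardinality of the conflict or diagnosis, so every saved or pre-computed consistency check matters more as configurations become more conflicting.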
Execution environment. For QuickXplain and FastDiag, all reported experiments
were conducted on an AMD EPYC 7571 machine equipped with an eight-core CPU
running at 2.60 GHz. Each core supported up to two threads, which means that 16 cores
could be simulated using hyper-threading. The machine had 64 GB of RAM.
Characteristics of the knowledge bases. To evaluate QuickXplain and FastDiag, we
generated configuration knowledge bases (feature models) with the publicly available
Betty tool suite [29], which allows for systematic testing of different consistency
checking and conflict detection approaches for knowledge bases. The knowledge base
instances (represented as background knowledge B in QuickXplain and AC in FastDiag)
selected for our evaluation had around 1,000 binary variables (derived from the 1,000
features used) and varied in the number of included constraints depending on the
different feature relationships and the total of derived clauses (around 1,600 SAT clauses
in the generated CNF files). Based on these knowledge bases, we randomly generated
requirements (ci ∈ C) that covered 10% of the variables included in the knowledge base.
These requirements were generated to analyze conflict sets of different cardinalities. We
also shuffled C to get different orders because the order can affect the number of
consistency checks needed.
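
The requirement generation described above can be approximated with a short script. The sketch below is only an assumption about one possible procedure: the function `random_requirements`, the DIMACS-style signed literals, and the fixed seed are introduced here for illustration and do not describe the Betty tool suite or the scripts used in the experiments.

```python
import random

def random_requirements(num_features, coverage=0.10, seed=None):
    """Pick `coverage` of the features and require each one selected or deselected.

    Features are numbered 1..num_features. A positive literal f means
    "feature f is selected"; a negative literal -f means "feature f is deselected".
    """
    rng = random.Random(seed)
    count = round(num_features * coverage)
    chosen = rng.sample(range(1, num_features + 1), count)
    requirements = [f if rng.random() < 0.5 else -f for f in chosen]
    rng.shuffle(requirements)  # the order of C can affect the number of checks
    return requirements

requirements = random_requirements(1000, seed=1)
print(len(requirements), requirements[:5])
```

Fixing the seed keeps a generated requirement set reproducible, while calling the function with different seeds yields different orders of C.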

4.1. QuickXplain results

Tab. 3 summarizes the results of the QuickXplain performance analysis to identify
a preferred minimal conflict of product configurations. Each entry represents the average
runtime in msec for all knowledge bases with a preferred conflict set of cardinality n
(1–16). The runtime increases when more conflicts exist in the analyzed product
configurations. Moreover, because QuickXplain identifies only one conflict per run, a new
execution is necessary after solving that conflict to identify the remaining ones.

Table 3: Avg. runtime (in msec) of QX when determining minimal conflicts.

        Conflict Cardinality
lmax    1          2          4          8          16
1       1,167.68   2,120.68   3,415.95   5,488.71   9,242.53

lmax=1 is equivalent to sequential QuickXplain. Each cell follows a
heat map coloring: the darker, the slower. In bold, the cells with the fastest
time for a given conflict cardinality.

4.2. FastDiag results

Tab. 4 summarizes the results of the FastDiag performance analysis to identify a
preferred minimal diagnosis of product configurations. Each entry represents the average
runtime in msec for all knowledge bases with a preferred diagnosis set of cardinality n
(1–16). There is a striking difference in execution time between conflict and
diagnosis detection; that is, FastDiag is more efficient than QuickXplain even
though they pursue different tasks. Tab. 4 also shows that the time tends to increase
when more conflicts exist in the product configurations because FastDiag has to
identify diagnoses of higher cardinality.

Table 4: Avg. runtime (in msec) of FD (lmax=1) for determining preferred diagnosis.

        Conflict Cardinality
lmax    1          2          4          8          16
1       1,161.14   1,162.47   1,162.48   1,166.44   1,166.12

lmax=1 is equivalent to sequential FastDiag. Each cell follows a heat map
coloring: the darker, the slower. In bold, the cells with the fastest time for a
given conflict cardinality.

The QuickXplain and FastDiag implementations, together with the data for the
experiments, are available in footnotes 1 and 2, respectively.

5. Speculative Programming

Speculative programming is an optimization technique for writing programs with
portions of code that can be executed before their results are needed. Theobald et al. [33]
remark that speculative execution seeks to speed up the execution of programs by
running portions of code before knowing whether they will be reached. Speculative
execution is the pre-execution or pre-calculation of results that can contribute to
computing the expected outcome [34]. We can define parallel tasks for executing those
pre-calculations, an approach known as Thread Level Speculation (TLS) [35]. We can
differentiate effective and non-valid speculative executions: the program execution time
can improve with the former or even deteriorate with the latter because of the cost of
orchestrating the parallel tasks. The current availability of parallel computing resources
permits developing solutions that use speculative programming to eventually obtain
more efficient solutions [36].

Figure 3. Flow diagram example with four tasks A, B, C, and D.

Fig. 3 shows a flow diagram with four tasks A, B, C, and D. A is the first task;
after it, task B or C occurs depending on a condition, and then task D executes
before the end. Task D depends on the results of B or C to proceed. Fig. 4a shows the
tracking for the sequential execution options, each one containing three steps according
to the defined sequential order. Fig. 4b shows a speculative execution that executes
tasks A, B, and C in parallel and then executes task D. Task D proceeds
after getting the correct input from the previous tasks. In this example, tasks B and C
both produce an output; hence, it is necessary to decide the correct input for D before its
execution. Executing this example on a machine with adequate parallel resources would
permit the speculative solution to run more efficiently than the sequential one.
This example illustrates that speculating about future actions permits improving the
execution time of a set of tasks when resources for parallel computing are available. We

1 https://github.com/cvidalmsu/A-Python-QX-implementation
2 https://github.com/cvidalmsu/A-Python-FD-implementation

(a) Tracking options for the tasks of Fig. 3.

(b) Speculative tracking for the tasks of Fig. 3.

Figure 4. Example of Traditional and Speculative Execution for the Tasks of Fig. 3.

can use speculative programming to look for a parallel, more efficient version of existing
solutions to solve the first two issues.
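
The four-task example of Fig. 3 can be sketched in Python. The task bodies below are placeholders invented for this illustration, and the sketch uses threads from `concurrent.futures` for brevity; for CPU-bound work in CPython, separate processes would be needed to obtain real parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder tasks (assumptions): A evaluates the branch condition,
# B and C are the two alternative branches, D consumes the chosen result.
def task_a(x):
    return x > 0

def task_b(x):
    return x * 2

def task_c(x):
    return x + 100

def task_d(value):
    return f"D({value})"

def sequential(x):
    """A, then B or C depending on A's outcome, then D."""
    branch_result = task_b(x) if task_a(x) else task_c(x)
    return task_d(branch_result)

def speculative(x):
    """Start A, B, and C together; keep only the branch that A selects."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        future_a = pool.submit(task_a, x)
        future_b = pool.submit(task_b, x)  # speculation: may be discarded
        future_c = pool.submit(task_c, x)  # speculation: may be discarded
        chosen = future_b if future_a.result() else future_c
        return task_d(chosen.result())

print(sequential(5), speculative(5), speculative(-1))
```

Both versions compute the same result; the speculative one trades the wasted work of the discarded branch for a shorter critical path when enough parallel resources are available.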

5.1. Parallel QuickXplain

Our approach parallelizes the consistency checks in QX by substituting
the direct solver call Inconsistent(B) in QX with the activation of a lookahead function
(QXGen), in which consistency checks are triggered not only to provide direct feedback
to QX requests but also to provide fast answers for consistency checks
potentially relevant in upcoming states of a QX instance. We follow the principles of
speculative programming [37]: we start calculating consistency checks that could be
useful in the future. The advantage is that we can anticipate resource-intensive reasoning
tasks. The drawback is that we use some computation resources that will be wasted if
the pre-calculation is finally not used. Therefore, the challenge in this kind of technique
is to find algorithms able to anticipate as many reusable calculations as possible while
reducing the calculation tasks that are not reusable.

Algorithm 5 Inconsistent(C, B, Bδ): Boolean

1: if ¬ExistsConsistencyCheck(B) then
2:     QXGen({C}, {Bδ}, {B − Bδ}, {Bδ}, 0)
3: end if
4: return(¬LookUp(B))
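
The lookup-then-speculate pattern of Algorithm 5 can be sketched as a small cache of pending consistency checks. Everything below is an illustrative assumption: the class `CheckCache`, the toy `solver`, and the `speculate` function are hypothetical, and `speculate` is a simple stand-in for QXGen rather than an implementation of it. Threads are used for brevity instead of the multiprocessing package.

```python
from concurrent.futures import ThreadPoolExecutor

class CheckCache:
    """Stores one future per constraint set; mirrors AddCC, ExistsConsistencyCheck, LookUp."""

    def __init__(self, solver, workers=4):
        self._solver = solver  # callable: frozenset -> True if consistent
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._futures = {}

    def add_cc(self, constraints):
        key = frozenset(constraints)
        if key not in self._futures:
            self._futures[key] = self._pool.submit(self._solver, key)

    def exists(self, constraints):
        return frozenset(constraints) in self._futures

    def lookup(self, constraints):
        return self._futures[frozenset(constraints)].result()  # waits if still running

    def close(self):
        self._pool.shutdown(wait=True)

def inconsistent(cache, b, speculate):
    """Answer the current request and schedule checks that may be needed next."""
    if not cache.exists(b):
        cache.add_cc(b)
        for guess in speculate(b):
            cache.add_cc(guess)
    return not cache.lookup(b)

calls = []

def solver(constraints):
    calls.append(constraints)
    return not {"p", "q"} <= constraints  # toy rule: p and q conflict

def speculate(b):
    return [b - {x} for x in b]  # hypothetical lookahead: drop one constraint

cache = CheckCache(solver)
b = frozenset({"p", "q", "r"})
first = inconsistent(cache, b, speculate)            # triggers 4 solver calls
second = inconsistent(cache, b - {"q"}, speculate)   # answered from the cache
cache.close()
print(first, second, len(calls))
```

The second request is served by a check that was already scheduled speculatively, so no additional solver call is issued for it.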

The QXGen function (Algorithm 6) is based on the idea of issuing recursive calls
and adapting the parameters of the calls depending on the two possible situations:
(1) consistent(Bδ ∪ B) and (2) inconsistent(Bδ ∪ B).

Algorithm 6 QXGen(C, Bδ, B, δ, l)

C = {C1..Cr} ... consideration set (subsets Cα)
Bδ = {Bδ1..Bδn} ... added knowledge (subsets Bδβ)
B = {B1..Bo} ... background (subsets Bγ)
δ = {D1..Dp} ... to be checked (subsets Dπ)
l ... current lookahead depth
1: if l < lmax then
2:     if |f(δ)| > 0 then
3:         AddCC(Bδ ∪ B)
4:     end if
5:     {Bδ ∪ B assumed consistent}
6:     if |f(C)| = 1 ∧ |f(Bδ)| > 0 then
7:         QXGen(Bδ, ∅, B ∪ {C1}, {C1}, l + 1)
8:     else if |f(C)| > 1 then
9:         Split(C, Ca, Cb)
10:         QXGen(Ca, Cb ∪ Bδ, B, Cb, l + 1)
11:     end if
12:     {Bδ ∪ B assumed inconsistent}
13:     if |f(Bδ)| > 0 ∧ |f(δ)| > 0 then
14:         QXGen({Bδ1}, Bδ − {Bδ1}, B, ∅, l + 1)
15:     end if
16: end if
The experimentation was conducted with a Python3 implementation of the
QuickXplain algorithm and the parallelized QuickXplain (QX) version presented in
this section. For the implementation, we used the multiprocessing Python package for
running parallel tasks. For representing our test knowledge bases and conducting the
corresponding consistency checks, we used Sat4J [38], as it is one of the most used solvers
and is integrated into many software (product line) engineering tools such as FeatureIDE [39],
FAMA [40], and FAMILIAR [41], among others [42–44]. We chose Python for its
parallelization capabilities and Sat4J for its wide use in the SPL community.
Nonetheless, any other technologies could have been used.

Table 5: Avg. runtime (in msec) of parallelized QX when determining minimal conflicts.

        Conflict Cardinality
lmax    1          2          4          8          16
1       1,167.68   2,120.68   3,415.95   5,488.71   9,242.53
2       1,094.12   1,760.60   2,540.17   3,960.71   6,121.19
3         859.43   1,506.16   2,228.37   3,443.92   5,470.39
4         900.89   1,475.80   2,153.87   3,083.36   5,071.48
5         892.96   1,602.45   2,203.80   3,358.84   5,233.00

lmax=1 is equivalent to sequential QuickXplain. Each cell follows a heat map coloring: the darker, the slower. In bold, the cells with
the fastest time for a given conflict cardinality. With lmax=5, performance deteriorates because of the limited number of available cores.

Table 5 shows a summary of the results of our QXGen performance analysis. On
average, the runtime needed by standard QuickXplain (lmax = 1) to identify a
preferred minimal conflict of cardinality 16 is 1.82× higher than that of the parallelized
solution based on QXGen (lmax = 4). In Table 5, each entry represents the average
runtime in msec for all knowledge bases with a preferred conflict set of cardinality n,
where the same set of knowledge bases has been evaluated for lmax sizes 1–5 (lmax = 1
corresponds to the usage of standard QX without QXGen integration).
It can be observed that the performance of QX improves with an increasing lmax.
However, with lmax = 5, a performance deterioration can be observed, which can be
explained by the fact that the number of pre-generated consistency checks starts to exceed
the number of physically available processors. In line with our algorithm analysis, the
number of relevant consistency checks that can be performed with lmax = 5 is between
5 and 3. Taking into account the overheads for managing the parallelized consistency
checks, the shown results support our theoretical analysis of QXGen. Figure 5 depicts
the same results.
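
The speedup figures discussed above can be recomputed directly from the runtimes of Table 5. The sketch below only transcribes the table values; the names `TABLE5` and `speedup` are chosen for this example.

```python
# Average runtimes in msec from Table 5, indexed by lmax; columns follow CARDINALITIES.
CARDINALITIES = [1, 2, 4, 8, 16]
TABLE5 = {
    1: [1167.68, 2120.68, 3415.95, 5488.71, 9242.53],
    2: [1094.12, 1760.60, 2540.17, 3960.71, 6121.19],
    3: [859.43, 1506.16, 2228.37, 3443.92, 5470.39],
    4: [900.89, 1475.80, 2153.87, 3083.36, 5071.48],
    5: [892.96, 1602.45, 2203.80, 3358.84, 5233.00],
}

def speedup(lmax, cardinality):
    """Runtime of sequential QX (lmax = 1) divided by the runtime at the given lmax."""
    column = CARDINALITIES.index(cardinality)
    return TABLE5[1][column] / TABLE5[lmax][column]

for lmax in sorted(TABLE5):
    row = "  ".join(f"{speedup(lmax, c):.2f}x" for c in CARDINALITIES)
    print(f"lmax={lmax}: {row}")
```

For conflict cardinality 16, the ratio between lmax = 1 and lmax = 4 evaluates to about 1.82, and the ratio for lmax = 5 is lower than for lmax = 4, in line with the deterioration described in the text.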

5.2. Parallel FastDiag

As Felfernig et al. [4] argue, consistency checking (CC) is an expensive
computing step. Our approach to parallelizing CC in FD substitutes the direct solver
call Consistent(AC) with the activation of a lookahead function (FDGen), in which
consistency checks are triggered not only to provide direct feedback to FD requests
but also to provide fast answers for consistency checks potentially relevant in
upcoming states of an FD instance. We follow the principles of speculative programming
[37]: we start calculating consistency checks that could be useful in the future. The
advantage is that we can anticipate resource-intensive reasoning tasks. The drawback is
that we use some computation resources that will be wasted if some pre-calculation is
finally not used. Therefore, the challenge in this kind of technique is finding algorithms
able to anticipate as many reusable calculations as possible while reducing the calculation
tasks that are not reusable.
Algorithm 7 Consistent(D, C, AC): Boolean

1: if ¬ExistsConsistencyCheck(AC) then
2:     FDGen({D}, {C}, {AC ∪ D}, {D}, 0)
3: end if
4: return(LookUp(AC))

In the Parallel FastDiag variant that we propose, CC is activated
by FD with Consistent(D, C, AC) (see Algorithm 7). This also activates FDGen (see
Algorithm 8), which starts to generate and trigger (in a parallelized fashion) further CC
instances that might be relevant in upcoming FD phases. For describing FDGen, we
employ a two-level ordered set notation, which requires embedding the FD parameter D
into {D}, C into {C}, and AC into {AC}. In FDGen, D, C, and AC are interpreted as
ordered sets.

Algorithm 8 FDGen(D, C, AC, δ, l)

D = {D1..Dr} ... removed set (subsets Dα)
C = {C1..Cn} ... set (subsets Cβ) to consider
AC = {AC1..ACt} ... base knowledge with knowledge to consider (subsets ACγ)
δ = {δ1..δp} ... to be checked (subsets δπ)
l ... current lookahead depth
1: if l < lmax then
2:     if |f(δ)| > 0 then
3:         AddCC(AC − D)
4:     end if
5:     {AC − D assumed inconsistent}
6:     if |f(C)| = 1 ∧ |f(D)| > 0 then
7:         FDGen(∅, D, AC − {C1}, {C1}, l + 1)
8:     else if |f(C)| > 1 then
9:         Split(C, Ca, Cb)
10:         FDGen(Cb ∪ D, Ca, AC, Cb, l + 1)
11:     end if
12:     {AC − D assumed consistent}
13:     if |f(D)| > 0 ∧ |f(δ)| > 0 then
14:         FDGen(D − {D1}, {D1}, AC, ∅, l + 1)
15:     end if
16: end if

Figure 5. Performance of QuickXplain vs Parallel QuickXplain with 2 to 5 threads.

We conducted the experimentation based on Python3 implementations of
FastDiag and Parallel FastDiag. We used the multiprocessing Python package for
running parallel tasks. We used Sat4J [38] for representing our test knowledge bases and
conducting the corresponding consistency checks since it is one of the most used solvers
and is integrated into many software (product line) engineering tools such as FeatureIDE [39],
the FAMA framework [40], and FAMILIAR [41], among others [42–44]. Nonetheless, we could
use any other technology capable of writing and reasoning on AAFM solutions.

Table 6: Avg. runtime (in msec) of FD (lmax=1) and parallelized FD (lmax>1) for
determining preferred diagnosis.

        Conflict Cardinality
lmax    1       2       4       8       16
1       61.14   62.47   62.48   66.44   66.12
2       56.14   35.78   56.89   57.10   70.73
3       49.78   44.16   45.07   48.54   50.14
4       48.78   46.73   46.69   47.89   54.34
5       43.68   44.13   43.68   48.25   50.55

lmax=1 is equivalent to sequential FastDiag. Each cell follows a heat map coloring: the
darker, the slower. In bold, the cells with the fastest time for a given conflict cardinality.

Table 6 shows a summary of the performance analysis results of FastDiag
and FDGen. On average, the runtime needed by the parallelized solution based on FDGen
(lmax = 5) to identify a preferred minimal diagnosis for conflicts of cardinality 16 is
23.54% lower than the runtime needed by standard FastDiag (lmax = 1) for the same purpose.
In Table 6, each entry represents the average runtime in msec for all knowledge bases
with a conflict set of cardinality n, where the same set of knowledge bases has been
evaluated for lmax sizes 1–5 (lmax = 1 corresponds to the usage of standard FD without
FDGen integration).

Figure 6. Performance of FastDiag vs Parallel FastDiag with 2 to 5 threads.

We can observe that with an increasing lmax, the performance improvement of
FD increases with a few exceptions: the solution for four threads is the best for models
with eight conflicts, and the solution for three threads is the best for models with sixteen
conflicts. A deterioration can exist with lmax = 4 and lmax = 5 because the number of
pre-generated consistency checks starts to exceed the number of physically available
processors. The obtained results support our theoretical analysis of FDGen, taking
into account the overheads for managing the consistency checks in parallel. Figure
6 illustrates the performance results of Table 6. The performance improvement of
Parallel FastDiag shows a scalability tendency, even though it is not as pronounced as
for Parallel QuickXplain. The results also indicate that some conflicts are solvable by
updating only one or a few constraints. Hence, finding a conflict set with various conflicts
can require more computation.
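
The observations on Table 6 can be checked with a few lines of Python. The sketch below transcribes the table values; the helper names `best_lmax` and `reduction_pct` are chosen for this example.

```python
# Average runtimes in msec from Table 6, indexed by lmax; columns follow CARDINALITIES.
CARDINALITIES = [1, 2, 4, 8, 16]
TABLE6 = {
    1: [61.14, 62.47, 62.48, 66.44, 66.12],
    2: [56.14, 35.78, 56.89, 57.10, 70.73],
    3: [49.78, 44.16, 45.07, 48.54, 50.14],
    4: [48.78, 46.73, 46.69, 47.89, 54.34],
    5: [43.68, 44.13, 43.68, 48.25, 50.55],
}

def best_lmax(cardinality):
    """lmax value with the lowest average runtime for the given conflict cardinality."""
    column = CARDINALITIES.index(cardinality)
    return min(TABLE6, key=lambda lmax: TABLE6[lmax][column])

def reduction_pct(lmax, cardinality):
    """Runtime reduction in percent relative to sequential FastDiag (lmax = 1)."""
    column = CARDINALITIES.index(cardinality)
    baseline = TABLE6[1][column]
    return 100 * (baseline - TABLE6[lmax][column]) / baseline

best = {c: best_lmax(c) for c in CARDINALITIES}
print(best)
print(f"{reduction_pct(5, 16):.1f}% lower runtime with lmax=5 at cardinality 16")
```

The fastest configuration is lmax = 4 for cardinality 8 and lmax = 3 for cardinality 16, and the reduction at lmax = 5 for cardinality 16 is about 23.5%, which matches the figures discussed above.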

6. Discussion
We conducted all experiments using an AMD EPYC 7000 machine equipped with a
CPU with 16 cores and 2.50GHz. It had 64 GB of RAM.
This article emphasizes the relevance of conflict detection and diagnosis detection,
which are relevant operations in the product configuration of VIS. It also
remarks that QuickXplain and FastDiag are still among the most suitable and
efficient conflict detection and diagnosis operations. Building on the work of Vidal-Silva et
al. [45], who applied speculative programming to QuickXplain to optimize the conflict
detection process, this article highlights the usability of speculative programming for
optimizing diagnosis and other conflict detection operations such as MergeXplain
[46]. Vidal et al. [47] showed the usability and efficiency of applying diagnosis solutions
such as FastDiag for product completion. Speculative programming seems directly
applicable for optimizing the computing performance of FastDiag because of its
algorithmic and computing similarity with QuickXplain.
One issue of speculative programming is the generation of non-effective
speculations, that is, computations with non-usable results. Thus, applying speculative
programming demands computing as many effective speculations as possible.
That requires a deep study of the current solutions to
define and compute speculation strategies that can guarantee their effectiveness.

To show the functionality and evaluate the performance of our solutions, we
implemented QuickXplain and FastDiag using Python and FAMA [17].

7. Threats to Validity

This work presents relevant operations for the Automated Analysis of Product
Configuration of Feature Models. Nonetheless, it is necessary to discuss the following
practical issues:

• We implemented our solutions to run in Python and FAMA [17]. For executing
QuickXplain and FastDiag, Python and FAMA should be installed on the computer. That
does not seem to be a problem because Python was one of the most used programming
environments in 2022, and FAMA is freely accessible online.

• We used FMs generated with Betty. Product configuration solutions can
be more precise in inaccurate models and configuration cases. Nonetheless, the
generated models are adequate for the simulation goal.

8. Conclusion
This article reviewed the functionality, computing performance, and main details of
QuickXplain and Parallel QuickXplain for conflict detection and of FastDiag and
Parallel FastDiag for diagnosing the product configuration of small-scale and
large-scale products. This study provides the basis for, and highlights, the speculative
programming approach as an algorithmic optimization technique applicable to efficient
sequential solutions that work on the product configuration of large-scale products:

1. We recognized that conflict detection is a base step for solving configuration
issues. We found that QuickXplain represents an efficient solution for detecting
a minimal preferred conflict. Although QuickXplain uses an efficient
divide-and-conquer algorithmic approach, analyzing large-scale FMs and configurations
takes a long time. Moreover, because of its sequential nature, QuickXplain cannot use
computing resources such as multiple cores for parallel computing. This article
parallelized QuickXplain to develop a more efficient solution for detecting
conflicts in large-scale configuration scenarios. Our analysis found a costly
operation step that uses data from the previous executions in the functioning of
QuickXplain. We pre-calculate that operation by applying speculative computation
to look for improvements. Thus, Parallel QuickXplain was born. The obtained
results validated the efficiency of Parallel QuickXplain with respect to traditional
QuickXplain for analyzing large-scale FMs and configurations.

2. We found that FastDiag represents an efficient solution for detecting a minimal
preferred diagnosis using an efficient divide-and-conquer algorithmic approach.
However, FastDiag takes a long time to analyze large-scale FMs and configurations,
and, because of its sequential nature, it cannot use computing resources such as
multiple cores for parallel computing. Hence, as our second research goal, we
parallelized FastDiag to obtain a more efficient solution for diagnosis in large-scale
configuration scenarios. As in the analysis of QuickXplain, our analysis
found a costly operation step that uses data from the previous executions in the
functioning of FastDiag. We pre-calculate that operation by applying speculative
computation to look for improvements. Thus, Parallel FastDiag was born.
The obtained results validated the efficiency of Parallel FastDiag with respect to
traditional FastDiag for analyzing large-scale FMs and configurations.

Acknowledgment
Our research team continues working in the software engineering and product line
areas. Our special thanks go, first, to the University of Seville for providing the fundamental
basis of this research. We also thank the University of Talca in Talca, Chile, and Universidad
Católica del Norte in Antofagasta and Coquimbo, Chile, for assisting us during the
whole process and providing the budget to fulfill our research tasks. We feel encouraged
to continue with these research activities in the coming years.

References
1. Vidal-Silva, C.; Duarte, V.; Cardenas-Cobo, J.; Serrano-Malebran, J.; Veas, I.; Rubio-León, J. Reviewing Automated Analysis of
Feature Model Solutions for the Product Configuration. Applied Sciences 2023, 13. doi:10.3390/app13010174.
2. Modrak, V.; Bednar, S.; Soltysova, Z. Resolving Product Configuration Conflicts. Closing the Gap Between Practice and Research
in Industrial Engineering; Viles, E.; Ormazábal, M.; Lleó, A., Eds.; Springer International Publishing: Cham, 2018; pp. 95–104.
3. Le, V.M.; Felfernig, A.; Uta, M.; Tran, T.N.T.; Silva, C.V. WipeOutR: Automated Redundancy Detection for Feature Models.
Proceedings of the 26th ACM International Systems and Software Product Line Conference - Volume A; Association for Computing
Machinery: New York, NY, USA, 2022; SPLC ’22, p. 164–169. doi:10.1145/3546932.3546992.
4. Felfernig, A.; Hotz, L.; Bagley, C.; Tiihonen, J. Knowledge-based Configuration: From Research to Business Cases, 1 ed.; Morgan
Kaufmann Publishers Inc.: San Francisco, CA, USA, 2014.
5. Vidal-Silva, C.; Felfernig, A.; Galindo, J.A.; Atas, M.; Benavides, D. Explanations for Over-Constrained Problems with Parallelized
QuickXPlain. Journal of Intelligent Information Systems - Integrating Artificial Intelligence and Database Technologies; Helic,
D.; Leitner, G.; Stettinger, M.; Felfernig, A.; Raś, Z.W., Eds.; Springer International Publishing: Cham, 2021; pp. –.
6. Benavides, D.; Segura, S.; Ruiz-Cortés, A. Automated Analysis of Feature Models 20 Years Later: A Literature Review. Information
Systems 2010, 35, 615–636. doi:10.1016/j.is.2010.01.001.
7. Galindo, J.A.; Roos-Frantz, F.; Benavides, D.; Ruiz-Cortés, A.; García-Galán, J. Automated Analysis of Diverse Variability Models
with Tool Support. XIX Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2014), 2014, pp. 160–168.
8. Kang, K.C.; Cohen, S.G.; Hess, J.A.; Novak, W.E.; Peterson, A.S. Feature-Oriented Domain Analysis (FODA) Feasibility Study.
Technical report, Carnegie-Mellon University Software Engineering Institute, 1990.
9. Chacón-Luna, A.E.; Gutiérrez, A.M.; Galindo, J.A.; Benavides, D. Empirical software product line engineering: A systematic
literature review. Information and Software Technology 2020, 128, 106389. doi:https://doi.org/10.1016/j.infsof.2020.106389.
10. Bhushan, M.; Negi, A.; Samant, P.; Goel, S.; Kumar, A. A Classification and Systematic Review of Product Line Feature Model
Defects. Software Quality Journal 2020, 28, 1507–1550. doi:10.1007/s11219-020-09522-1.
11. Galster, M. Variability-Intensive Software Systems: Product Lines and Beyond. Proceedings of the 13th International Workshop
on Variability Modelling of Software-Intensive Systems; Association for Computing Machinery: New York, NY, USA, 2019;
VAMOS ’19. doi:10.1145/3302333.3302336.
12. She, S.; Lotufo, R.; Berger, T.; Wąsowski, A.; Czarnecki, K. The Variability Model of The Linux Kernel. Fourth International
Workshop on Variability Modelling of Software-Intensive Systems, Linz, Austria, January 27-29, 2010. Proceedings; Benavides,
D.; Batory, D.S.; Grünbacher, P., Eds. Universität Duisburg-Essen, 2010, Vol. 37, ICB-Research Report, pp. 45–51.
13. Rothberg, V.; Dintzner, N.; Ziegler, A.; Lohmann, D. Feature Models in Linux: From Symbols to Semantics. Proceedings of the
Tenth International Workshop on Variability Modelling of Software-intensive Systems; ACM: New York, NY, USA, 2016; VaMoS
’16, pp. 65–72. doi:10.1145/2866614.2866624.
14. Galindo, J.; Benavides, D.; Segura, S. Debian Packages Repositories as Software Product Line Models. Towards Automated
Analysis 2010. pp. 29–34.
15. Galindo, J.A.; Turner, H.; Benavides, D.; White, J. Testing Variability-intensive Systems Using Automated Analysis: An
Application to Android. Software Quality Journal 2016, 24, 365–405. doi:10.1007/s11219-014-9258-y.
16. Sánchez, A.B.; Segura, S.; Parejo, J.; Ruiz-Cortés, A. Variability testing in the wild: the Drupal case study. Software & Systems
Modeling 2015, pp. 1–22. doi:10.1007/s10270-015-0459-z.
17. Lettner, M.; Rodas-Silva, J.; Galindo, J.A.; Benavides, D. Automated analysis of two-layered feature models with feature attributes.
J. Comput. Lang. 2019, 51, 154–172. doi:10.1016/j.cola.2019.01.005.
18. Lienhardt, M.; Damiani, F.; Johnsen, E.B.; Mauro, J. Lazy Product Discovery in Huge Configuration Spaces. Proceedings of the
ACM/IEEE 42nd International Conference on Software Engineering; Association for Computing Machinery: New York, NY,
USA, 2020; ICSE ’20, p. 1509–1521. doi:10.1145/3377811.3380372.
19. Junker, U. QuickXPlain: Preferred Explanations and Relaxations for Over-constrained Problems. 19th national conference on
Artifical intelligence; AAAI Press: San Jose, CA, 2004; pp. 167–172.
20. Felfernig, A.; Benavides, D.; Galindo, J.; Reinfrank, F. Towards Anomaly Explanation in Feature Models. Proceedings of the 15th
International Configuration Workshop, August 2013.
21. Benavides, D.; Galindo, J.A. Automated analysis of feature models: current state and practices. Proceedings of the 22nd
International Systems and Software Product Line Conference - Volume 1, SPLC 2018, Gothenburg, Sweden, September 10-14,
2018, 2018, p. 298. doi:10.1145/3233027.3233055.
22. Zhou, F.; Jiao, J.R.; Yang, X.J.; Lei, B. Augmenting feature model through customer preference mining by hybrid sentiment
analysis. Expert Systems with Applications 2017, 89, 306 – 317. doi:https://doi.org/10.1016/j.eswa.2017.07.021.
23. Weckesser, M.; Lochau, M.; Schnabel, T.; Richerzhagen, B.; Schürr, A. Mind the Gap! Automated Anomaly Detection for
Potentially Unbounded Cardinality-Based Feature Models. Proceedings of the 19th International Conference on Fundamental
Approaches to Software Engineering - Volume 9633; Springer-Verlag New York, Inc.: New York, NY, USA, 2016; pp. 158–175.
doi:10.1007/978-3-662-49665-7_10.
24. Santos, A.R.; Santana de Almeida, E. Do #Ifdef-based Variation Points Realize Feature Model Constraints? SIGSOFT Softw. Eng.
Notes 2015, 40, 1–5. doi:10.1145/2830719.2830728.
25. Segura, S.; Sánchez, A.B.; Ruiz-Cortés, A. Automated Variability Analysis and Testing of an E-commerce Site: An Experience
Report. Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering; ACM, ACM: New
York, NY, USA, 2014; ASE ’14, pp. 139–150. doi:10.1145/2642937.2642939.
26. Nieke, M.; Mauro, J.; Seidl, C.; Thüm, T.; Yu, I.C.; Franzke, F. Anomaly Analyses for Feature-model Evolution. Proceedings of the
17th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences; ACM: New York, NY,
USA, 2018; GPCE 2018, pp. 188–201. doi:10.1145/3278122.3278123.
27. Le, V.M.; Felfernig, A.; Uta, M.; Benavides, D.; Galindo, J.; Tran, T.N.T. DIRECTDEBUG: Automated Testing and Debugging of
Feature Models. 2021 IEEE/ACM 43rd International Conference on Software Engineering: New Ideas and Emerging Results
(ICSE-NIER), 2021, pp. 81–85. doi:10.1109/ICSE-NIER52604.2021.00025.
28. Karataş, A.S.; Oğuztüzün, H. Attribute-based Variability in Feature Models. Requir. Eng. 2016, 21, 185–208.
doi:10.1007/s00766-014-0216-9.
29. Galindo, J.; Benavides, D. Towards a new repository for feature model exchange. Proceedings of the 23rd International Systems
and Software Product Line Conference, SPLC 2019, Volume B, Paris, France, September 9-13, 2019; Cetina, C.; Díaz, O.; Duchien,
L.; Huchard, M.; Rabiser, R.; Salinesi, C.; Seidl, C.; Tërnava, X.; Teixeira, L.; Thüm, T.; Zadi, T., Eds. ACM, 2019, pp. 85:1–85:4.
doi:10.1145/3307630.3342405.
30. Galindo, J.; Benavides, D.; Trinidad, P.; Gutiérrez-Fernández, A.; Ruiz-Cortés, A. Automated analysis of feature models: Quo
vadis? Computing 2019, 101, 387–433. doi:10.1007/s00607-018-0646-1.
31. Bhushan, M.; Duarte, J.Á.G.; Samant, P.; Kumar, A.; Negi, A. Classifying and resolving software product line redundancies using
an ontological first-order logic rule based method. Expert Syst. Appl. 2021, 168, 114167. doi:10.1016/j.eswa.2020.114167.
32. Felfernig, A.; Schubert, M.; Zehentner, C. An Efficient Diagnosis Algorithm for Inconsistent Constraint Sets. Artif. Intell. Eng. Des.
Anal. Manuf. 2012, 26, 53–62. doi:10.1017/S0890060411000011.
33. Theobald, K.B.; Gao, G.R.; Hendren, L.J. Speculative Execution and Branch Prediction on Parallel Machines. Proceedings of the
7th International Conference on Supercomputing; Association for Computing Machinery: New York, NY, USA, 1993; ICS ’93,
pp. 77–86. doi:10.1145/165939.165958.
34. Tatemura, J. Speculative parallelism of intelligent interactive systems. Proceedings of IECON ’95 - 21st Annual Conference on
IEEE Industrial Electronics, 1995, Vol. 1, pp. 193–198. doi:10.1109/IECON.1995.483357.
35. Martínez, J.F.; Torrellas, J. Speculative Synchronization: Applying Thread-Level Speculation to Explicitly Parallel Applications.
Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems;
Association for Computing Machinery: New York, NY, USA, 2002; ASPLOS X, pp. 18–29. doi:10.1145/605397.605400.
36. Bramas, B. Increasing the degree of parallelism using speculative execution in task-based runtime systems. PeerJ Computer
Science, 2019. doi:10.7717/peerj-cs.183.
37. Burton, F.W. Speculative computation, parallelism, and functional programming. IEEE Transactions on Computers 1985, C-34,
1190–1193. doi:10.1109/TC.1985.6312218.
38. Le Berre, D.; Parrain, A. The Sat4j library, release 2.2. Journal on Satisfiability, Boolean Modeling and Computation 2010, 7, 59–64.
39. Kästner, C.; Thüm, T.; Saake, G.; Feigenspan, J.; Leich, T.; Wielgorz, F.; Apel, S. FeatureIDE: A tool framework for feature-oriented
software development. 2009 IEEE 31st International Conference on Software Engineering. IEEE, 2009, pp. 611–614.
40. Alférez, M.; Acher, M.; Galindo, J.A.; Baudry, B.; Benavides, D. Modeling variability in the video domain: Language and
experience report. Software Quality Journal 2019, 27, 307–347.
41. Acher, M.; Collet, P.; Lahire, P.; France, R.B. FAMILIAR: A domain-specific language for large scale management of feature
models. Sci. Comput. Program. 2013, 78, 657–681. doi:10.1016/j.scico.2012.12.004.
42. Batory, D. Feature Models, Grammars, and Propositional Formulas. Proceedings of the 9th International Conference on Software
Product Lines; Springer-Verlag: Berlin, Heidelberg, 2005; SPLC’05, pp. 7–20. doi:10.1007/11554844_3.
43. Doux, G.; Albert, P.; Barbier, G.; Cabot, J.; Del Fabro, M.D.; Lee, S.U.J. An MDE-based approach for solving configuration
problems: An application to the Eclipse platform. European Conference on Modelling Foundations and Applications. Springer,
2011, pp. 160–171.
44. Thüm, T.; Batory, D.; Kästner, C. Reasoning about Edits to Feature Models. 2009, pp. 254–264.
45. Vidal-Silva, C.; Felfernig, A.; Galindo, J.A.; Atas, M.; Benavides, D. A Parallelized Variant of Junker’s QuickXPlain Algorithm.
Foundations of Intelligent Systems; Helic, D.; Leitner, G.; Stettinger, M.; Felfernig, A.; Raś, Z.W., Eds.; Springer International
Publishing: Cham, 2020; pp. 457–468.
46. Shchekotykhin, K.; Jannach, D.; Schmitz, T. MERGEXPLAIN: Fast Computation of Multiple Conflicts for Diagnosis. Proceedings
of the 24th International Conference on Artificial Intelligence. AAAI Press, 2015, IJCAI’15, pp. 3221–3228.
47. Vidal, C.; Galindo, J.A.; Giráldez, J.; Benavides, D. Automated completion of partial configurations as a diagnosis task. Using
FastDiag to improve performance. ISMIS 2020, Industry Session, Graz University of Technology, Graz, Austria, September 2020.