
Future Generation Computer Systems 122 (2021) 209–219

Contents lists available at ScienceDirect

Future Generation Computer Systems


journal homepage: www.elsevier.com/locate/fgcs

A variable scale case-based reasoning method for evidence location in digital forensics✩

Ai Wang a, Xuedong Gao b
a School of Humanities and Social Science, University of Science and Technology Beijing, No.30 Xueyuan Road, Beijing, 100083, China
b School of Economics and Management, University of Science and Technology Beijing, No.30 Xueyuan Road, Beijing, 100083, China

article info

Article history:
Received 6 May 2020
Received in revised form 12 March 2021
Accepted 17 March 2021
Available online 21 April 2021

Keywords:
Digital forensics
Scale transformation
Variable-scale clustering
Case-based reasoning

abstract

With the increasing use of information technology in criminal activity, digital forensic analysis, especially multimedia forensics, has become an emerging technique for cybercrime investigators to improve examination efficiency. This study focuses on the digital triage problem of evidence location during the automatic forensic process. After defining a multi-scale knowledge base for storing digital forensic investigators’ prior knowledge, a variable scale case-based reasoning method (VSCBR) is proposed to support investigators in predicting evidential areas. A variable-scale clustering algorithm based on the scale transformation strategy (VSC-STS) is also put forward, which can identify highly similar past cases containing candidate evidence in the case reuse and revise phases. A case study is established using a real 15.9 GB bidding case dataset, which contains both text bidding documents and image technical drawings. Numerical experimental results show that the validity of the proposed VSC-STS is significantly improved compared with the traditional single-scale clustering algorithm, and that it is insensitive to the initial parameter threshold. Moreover, the proposed VSCBR method is able to help investigators locate suspicious rule-violating evidence in practice.

© 2021 Elsevier B.V. All rights reserved.

✩ This document is the result of a research project funded by the National Natural Science Foundation of China [71272161] and the China Scholarship Council.
∗ Corresponding author.
E-mail addresses: ai.wang@uta.edu (A. Wang), gaoxuedong@manage.ustb.edu.cn (X. Gao).

https://doi.org/10.1016/j.future.2021.03.019
0167-739X/© 2021 Elsevier B.V. All rights reserved.

1. Introduction

Traditional human-dependent forensic approaches cannot meet the requirements of today’s sophisticated cybercrime investigation. Consequently, advanced digital forensic techniques have gained great significance in collecting, locating, and even identifying evidence that supports investigation [1].

In order to accelerate the automatic forensic process, digital triage modules have been widely utilized in forensic software [2]. For instance, EnCase Portable, as a powerful device triage solution, allows forensic professionals and non-experts alike to recover data by file type, such as images, documents, and even emails [3]. However, this digital triage approach driven by preset file categories might suffer from low efficiency when a large number of files of the same type as the evidence file are maintained on one target computer.

Horsman et al. [4] propose a digital forensic method for device triage using the case-based reasoning framework (CBR-FT). The CBR-FT could improve the efficiency of locating suspicious areas on a target device by automatically analyzing investigators’ prior knowledge. Only files stored in those areas are targeted as candidate evidence for criminal behaviors like fraud offenses, waiting for further examination. Compared to the file-type-focused approach, this file-location-focused approach has the advantage of avoiding the gathering of large amounts of non-relevant data. Besides, the CBR-FT is not sensitive to the modification of digital content.

Although the case-based reasoning method has been proven to be effective in digital forensics, especially digital triage, there still exists a time-varying problem in the case learning phase that heavily affects the accuracy of forensic results [4]. For instance, the traditional CBR-FT knowledge base is established under a fixed time period (single scale), so it is unable to keep up with criminal cyber activities that change over time.

The multi-scale data theory considers that each object can obtain multiple values of different scales under each attribute [5]. These attribute values follow the information transformation relation between scale levels, and could divide the global objects into several granules through a certain partition principle, e.g. indistinguishability, similarity, etc. [6,7]. Therefore, how to obtain the scale hierarchical structure of the initial object dataset with clear scale characteristics has become one of the main tasks of multi-scale data analysis [8].

The variable-scale clustering method (VSC) studies how to obtain object classes with clear scale characteristics by changing the observation scale of objects [9]. Hence, after constructing the multi-scale data model, the VSC could automatically obtain the optimal observation scale with clear management objectives, as well as the management objects with scale characteristics that are responsible for each level.
Thus, this paper aims to solve the digital triage problem of locating evidence during the automatic forensic process. The main contributions are as follows. Firstly, the structure of the multi-scale knowledge base is established following the multi-scale data model. Compared to the traditional CBR knowledge base, the multi-scale knowledge base could solve the time-varying problem by adding multiple scales (like the temporal scale) to every case attribute.

Secondly, the optimistic scale transformation strategy (OSTS) and the pessimistic scale transformation strategy (PSTS) are proposed to optimize the attribute selection in the scale transformation process. Especially for the multiple attribute scale transformation problem, the scale transformation strategy and mechanism could clarify the scale transformation sequence and direction.

Thirdly, a variable scale clustering algorithm based on the scale transformation strategy (VSC-STS) is proposed to overcome the uncertainty caused by the random attribute selection in the VSC. Experimental results illustrate that the validity of the proposed VSC-STS is significantly improved compared with the traditional single-scale clustering algorithm, and that it is insensitive to the initial parameter threshold.

Finally, after defining the multi-scale case similarity measurement, a variable scale case-based reasoning method (VSCBR) is proposed to support investigators in predicting evidential areas. A case study is established using a real 15.9 GB bidding case dataset, which contains both text bidding documents and image technical drawings. Experimental results illustrate that the proposed VSCBR method is able to help investigators locate likely rule-violating evidence in practice.

The structure of this paper is as follows. Section 2 summarizes the relevant research work, including digital forensics in cybercrime, the case-based reasoning mechanism and variable-scale data analysis. Section 3 first defines the structure of the multi-scale knowledge base, and then proposes the variable-scale clustering method based on the scale transformation strategy (VSC-STS). The variable scale case-based reasoning method (VSCBR) is also put forward using the VSC-STS. Experimental procedures and analysis results on a real 15.9 GB bidding dataset are discussed in Section 4. The paper is concluded in Section 5.

2. Related works

2.1. Digital forensics in cybercrime

The rapid development of computer technology, especially digitization-related techniques, has been accompanied by an increase of cybercrimes worldwide [10]. The abuse of intelligent devices like smart phones could not only be used to commit crime, but also bring troubles for forensic investigation [11].

Academic institutions and industrial departments have paid attention to automatic digital forensic techniques to prevent cybercrimes [12]. How to collect, preserve and analyze digital evidence has become one of the most critical tasks for investigating criminal activities [13].

One of the mainstream directions is to study how to identify and distinguish whether the digital evidence is original or modified by criminals [14]. Digital multimedia contents, especially video and image, gain more research significance due to their high evidential value in the judicial process [15].

For instance, Quinto et al. [16] propose a technical solution for digital evidence identification through detecting the integrity and authenticity of videos, which fully considers the feature patterns of current multimedia platforms.

Another research direction supporting digital evidence investigation is to speed up and optimize the digital triage process. Hence, this paper studies the digital evidence location problem using advanced machine learning algorithms.

2.2. Case-based reasoning mechanism

The case-based reasoning methodology (CBR) aims to solve new problems through previous valuable knowledge and experience to improve accuracy and efficiency. It has been widely applied in various fields, such as assessing the thrombophilia predisposition risk in the health care industry [17].

The case-based reasoning mechanism consists of five phases (see Fig. 1): case retrieve, case reuse, case revise, case retain and case learning (or case-base maintenance) [18].

Fig. 1. The case-based reasoning mechanism.

(1) Case Retrieve. The objective of the case retrieve phase is to identify past cases in the knowledge base that have similar characteristics to the new problem. The prior experience of past cases could be applied to the solution of the new problem.
(2) Case Reuse. According to the results of the case retrieve phase, if similar past cases are retrieved successfully, the case reuse phase will directly output all the potential solutions within those cases for the new problem.
(3) Case Revise. According to the results of the case retrieve phase, if there are no past cases similar to the new problem, the case revise phase will start to correct one or more historical solutions, and output the verified solution.
(4) Case Retain. After the case reuse and revise phases, the structured new case, which contains not only the new problem but also the final solution, will be established and saved into the knowledge base for the next query.
(5) Case Learning. After the case retain phase, the case learning mechanism and methods will maintain the knowledge base through evaluating the quality of the new solution, and update relevant information for the next query.

Compared to the traditional CBR framework, this paper proposes the multi-scale knowledge base model to solve the time-varying problem of criminal behaviors for digital forensics.

2.3. Variable-scale data analysis

With the wide accumulation of business data in different industries and the rapid development of multivariate data analysis tools, a standardized application process methodology has gradually formed in the application of data mining technology, namely the cross-industry standard process for data mining (CRISP-DM) [19–21]. According to the life cycle of the data mining project, the CRISP-DM divides the application process of the data mining technology into six stages: business understanding, data understanding, data preparation, modeling, evaluation and deployment [22,23].
The data preparation, modeling and evaluation stages constitute the execution process of the data mining project [24]. Data preparation contains all the activities that form the final data set from the original data, such as attribute selection, data transformation, and cleaning of the data set [25]. The objective of the modeling stage is to develop the data mining algorithm that meets the implementation conditions, and to optimize the algorithm parameters according to the characteristics of datasets and mining tasks [26]. The objective of the evaluation stage is to verify whether the model results meet the business demands [27].

The variable scale data analysis technology (like the variable-scale clustering method, VSC) studies the automatic mechanism of the execution process of the data mining project (i.e., the data preparation, modeling and evaluation stages), through modeling the scale transformation characteristics of human thinking activities, which could be applied to the intelligent decision-making process under various business scenarios [28–30].

The scale transformation theory, which is the theoretical basis of variable-scale clustering, consists of the concept space model (CS) for describing the relations between different scales, the scale transformation rate measurement (STR) for controlling the scale transformation process, the granular deviation index (GrD) for evaluating the satisfaction degree of VSC results, as well as the depth and width scale transformation modes.

Table 1 shows the research summary of the scale transformation theory. Compared to the previous research work, the proposed method VSC-STS (see Section 3) mainly focuses on the optimistic and pessimistic scale transformation strategy problem, in order to improve the scale transformation efficiency.

Table 1
Research summary of the scale transformation theory.
# Research Problem [31] [8] [9] [32] [33] [34] [35] The proposed method
√ √ √ √
1 Concept space model of categorical variables √ √ √
2 Concept space model of binary variables √ √
3 Concept space model of numerical variables √ √ √ √ √ √ √
4 Scale depth transformation mode √ √ √
5 Scale breadth transformation mode √ √ √ √ √ √ √
6 Single scale up transformation direction √ √
7 Single scale down transformation direction √
8 Scale transformation feedback mechanism √
9 Optimistic scale transformation strategy √
10 Pessimistic scale transformation strategy

3. Variable scale case-based reasoning method for digital forensics

3.1. Multi-scale knowledge base

Since the novel machine learning algorithm (variable-scale clustering, VSC) has achieved high efficiency in various practical problems, this section applies the VSC to the case-based reasoning mechanism for locating evidence in digital forensics.

Definition 1 (Multi-Scale Knowledge Base). Let $D^S = (U, A^S, d, V^S, f)$ represent the multi-scale knowledge base, where $U = \{x_1, x_2, \ldots, x_h\}$ is the case set; $A^S = \{A^1, A^2, \ldots, A^r\}$ is the attribute set, and at least one attribute $A^\lambda$ ($A^\lambda \in A^S$) has more than one observation scale, that is $\exists A^\lambda, CL(A^\lambda) = \langle A_0^\lambda, A_1^\lambda, \ldots, A_n^\lambda \rangle$ ($\lambda = 1, 2, \ldots, r$); $d = \{d_1, d_2, \ldots, d_t\}$ is the solution set of every case; $V^S$ is the case value domain of case set $U$; and $f$ is the information function, that is $f: U \times A^S \to V^S$.

Fig. 2 shows the hierarchy structure of the multi-scale knowledge base. Given a multi-scale attribute $A^\lambda$, it can be seen that: (1) There is a partial order relation between observation scale pairs in the concept chain at different hierarchies, that is $A_i^\lambda \preceq A_{i+1}^\lambda$ ($i \in N$); (2) The observation scale decides the case value in the value space at the same hierarchy, that is $A_i^\lambda = \{V_{ij}^\lambda \mid j \in N^+\}$; (3) There is a partial order relation between case value pairs in the value space following the relation of observation scales at different hierarchies, that is $A_i^\lambda \preceq A_{i+1}^\lambda \to V_{ij}^\lambda \not\geqslant V_{(i+1)k}^\lambda$ ($i \in N$; $j, k \in N^+$).

Fig. 2. The hierarchy structure of the multi-scale knowledge base.

Lemma 1. Each attribute of a single-scale case dataset that is constructed from the multi-scale knowledge base has one and only one observation scale.

According to Lemma 1, given a multi-scale knowledge base $D^S$ that has $r$ attributes $A^\lambda \in A^S \in D^S$ ($\lambda = 1, 2, \ldots, r$), if every attribute has $n^\lambda$ observation scales $A_i^\lambda \in A^\lambda$ ($i = 0, 1, 2, \ldots, (n^\lambda - 1)$), there are $\prod_{\lambda=1}^{r} n^\lambda$ different single-scale case datasets in total.

Lemma 2. Any case partition of the multi-scale knowledge base that is divided by a lower hierarchical observation scale is a refinement of its higher hierarchical partition result.

According to Lemma 2, for observation scales $A_i^\lambda, A_s^\lambda \in A^\lambda$ with $A_i^\lambda \preceq A_s^\lambda$ ($\lambda = 1, 2, \ldots, r$; $i = 0, 1, 2, \ldots, (n - 1)$), every case partition $U_{it}^\lambda \in U/A_i^\lambda$ is a refinement of its higher hierarchical partition result $U_{st}^\lambda \in U/A_s^\lambda$, that is $U_{it}^\lambda \subseteq U_{st}^\lambda$ ($t \in N^+$).

Compared to the traditional CBR knowledge base (in Section 1), the multi-scale knowledge base could solve the time-varying problem by adding multiple scales (like the temporal scale) to every case attribute.

Compared to the traditional concept space model (in Section 2.3), the multi-scale knowledge base adds the solution set structure (solution space), which could store the suspicious evidential areas of all the cases for the automatic forensic investigation.

After defining the structure of the multi-scale knowledge base, the scale transformation process of the multi-scale knowledge base needs to be measured quantitatively so that it can be controlled.

According to the scale transformation theory, there are two different scale transformation approaches, that is the scale up transformation ($ST$) and the scale down transformation ($ST$). For the scale up transformation process, the change of case partitions in the multi-scale knowledge base $D^S$ could be measured by the scale up transformation rate (STR) as:

$$STR(A_i^\lambda, A_s^\lambda) = \frac{\left| \bigcup_{q=1}^{l} \{ U_{sq}^\lambda \} \right|}{|U|} \tag{1}$$

where $U \in D^S$ is the case set and $|U|$ is the total number of cases in $U$; the observation scales satisfy $A_i^\lambda, A_s^\lambda \in A^\lambda \in A^S \in D^S$ ($\lambda = 1, 2, \ldots, r$) and $A_i^\lambda \preceq A_s^\lambda$; the case partitions are $U_{ip}^\lambda \in U/A_i^\lambda$ ($p = 1, 2, \ldots, t$) and $U_{sq}^\lambda \in U/A_s^\lambda$ ($q = 1, 2, \ldots, l$), and $\exists U_{ip}^\lambda \to U_{ip}^\lambda = U_{sq}^\lambda$.

Similarly, for the scale down transformation process, the change of case partitions in the multi-scale knowledge base $D^S$ could be measured by the scale down transformation rate (STR) as:

$$STR(A_s^\lambda, A_i^\lambda) = \frac{\left| \bigcup_{p=1}^{t} \{ U_{ip}^\lambda \} \right|}{|U|} \tag{2}$$

where $U \in D^S$ is the case set and $|U|$ is the total number of cases in $U$; the observation scales satisfy $A_i^\lambda, A_s^\lambda \in A^\lambda \in A^S \in D^S$ ($\lambda = 1, 2, \ldots, r$) and $A_i^\lambda \preceq A_s^\lambda$; the case partitions are $U_{ip}^\lambda \in U/A_i^\lambda$ ($p = 1, 2, \ldots, t$) and $U_{sq}^\lambda \in U/A_s^\lambda$ ($q = 1, 2, \ldots, l$), and $\exists U_{sq}^\lambda \to U_{ip}^\lambda = U_{sq}^\lambda$.

Table 2 shows a multi-scale knowledge base $D^S = (U, A^S, d)$ with six cases $U = \{x_k \mid k \in [1, 6]\}$, two attributes $A^S = \{A^1, A^2\}$ with three observation scales each (that is $A_0^1 \preceq A_1^1 \preceq A_2^1$, $A_0^2 \preceq A_1^2 \preceq A_2^2$), and six solutions $d = \{d_k \mid k \in [1, 6]\}$.

Table 2
Example: Multi-scale knowledge base.

Taking the basic single-scale case dataset (in which all the attributes stand at the lowest hierarchy) $D_0 = (U, \{A_0^1, A_0^2\}, d)$ as an example, it can be seen that $U/A_0^1 = \{\{x_1\}, \{x_2\}, \{x_3\}, \{x_4\}, \{x_5\}, \{x_6\}\}$, $U/A_1^1 = \{\{x_1, x_2, x_3\}, \{x_4\}, \{x_5, x_6\}\}$, $U/A_2^1 = \{\{x_1, x_2, x_3, x_4\}, \{x_5, x_6\}\}$.

If a one-step scale up transformation $ST(A^1, 1)$ is applied to attribute $A^1$, the effect of $ST(A^1, 1)$ is $STR(A_0^1, A_1^1) = \frac{|\{x_4\}|}{|U|} = 0.17$; if a two-step scale up transformation $ST(A^1, 2)$ is applied to attribute $A^1$, the effect of $ST(A^1, 2)$ is $STR(A_0^1, A_2^1) = \frac{|\emptyset|}{|U|} = 0$.

Similarly, taking the highest single-scale case dataset (in which all the attributes stand at the highest hierarchy) $D_2 = (U, \{A_2^1, A_2^2\}, d)$ as an example, if a one-step scale down transformation $ST(A^2, 1)$ is applied to attribute $A^2$, the effect of $ST(A^2, 1)$ is $STR(A_2^2, A_1^2) = \frac{|\{x_5, x_6\}|}{|U|} = 0.33$; if a two-step scale down transformation $ST(A^2, 2)$ is applied to attribute $A^2$, the effect of $ST(A^2, 2)$ is $STR(A_2^2, A_0^2) = \frac{|\{x_5, x_6\}|}{|U|} = 0.33$.

From the above numerical examples, it can be seen that both the scale up and scale down transformation rates satisfy $STR \in [0, 1]$. A smaller STR represents more significant changes of case partitions during one scale transformation; on the contrary, a larger STR represents less significant changes of case partitions during one scale transformation.

3.2. Variable-scale clustering method based on the scale transformation strategy

In Section 3.1, the scale transformation rate (STR) makes it possible to quantitatively evaluate the case partition change of one scale transformation of each attribute, that is the attribute scale transformation value. A smaller STR represents a more significant scale transformation value; on the contrary, a larger STR represents a less significant scale transformation value.

However, for the multiple attribute scale transformation problem, it is necessary to build the scale transformation strategy and mechanism following the scale transformation value of each attribute, in order to clarify the scale transformation sequence and the transformation direction across multiple attributes.

Hence, two scale transformation strategies are proposed following the scale transformation mechanism in Section 2.3 (see Fig. 3).

Strategy 1. (Optimistic scale transformation strategy, OSTS) During each scale transformation, priority is always given to the attribute with the maximum scale transformation value (the attribute with the smallest scale transformation rate).

Fig. 3 shows the scale transformation process under the OSTS. In the beginning, establish the scale transformation space of the initial single-scale case dataset through the multi-scale knowledge base, and define the candidate observation scale combination of all attributes under all scale transformation directions. Then, according to the scale transformation space, calculate the scale transformation rate of each attribute. For the multi-scale data analysis adopting the OSTS, the attribute with the maximum scale transformation value is selected as the target attribute of each scale transformation. Finally, apply the new observation scale to represent the target attribute and update the dataset.

Strategy 2. (Pessimistic scale transformation strategy, PSTS) During each scale transformation, priority is always given to the attribute with the minimum scale transformation value (the attribute with the largest scale transformation rate).

Fig. 3 also shows the scale transformation process under the PSTS. Similarly, in the beginning, establish the scale transformation space of the initial single-scale case dataset through the multi-scale knowledge base, and define the candidate observation scale combination of all attributes under all scale transformation directions. Then, according to the scale transformation space, calculate the scale transformation rate of each attribute. For the multi-scale data analysis adopting the PSTS, the attribute with the minimum scale transformation value is selected as the target attribute of the scale transformation. Finally, apply the new observation scale to represent the target attribute and update the dataset.

In conclusion, the scale transformation performance of the multi-scale knowledge base could be heavily affected by several factors, such as the scale transformation steps (i.e., single-step transformation versus multiple-step transformation), the scale transformation strategies (i.e., optimistic scale transformation strategy versus pessimistic scale transformation strategy), and the scale transformation directions (i.e., scale up transformation direction versus scale down transformation direction).

Therefore, in order to better control the scale transformation process of the multi-scale knowledge base, especially the variable-scale clustering process, three assumptions are put forward following the traditional VSC method in Section 2.3:
① During one iteration of variable-scale clustering, only the observation scales that have a direct partial order relation with the target scale could become candidates for scale transformation.
② During one iteration of variable-scale clustering, only one scale transformation strategy could be applied for scale transformation.
③ During one iteration of variable-scale clustering, only one attribute in the whole multi-scale knowledge base could be applied for scale transformation.

Fig. 3. The optimistic and pessimistic scale transformation strategies.

Moreover, in order to evaluate the clustering validity of the multi-scale knowledge base $D^S = (U, A^S, d)$, the traditional granular deviation (GrD) could be expressed as:

$$GrD(X_I, A^\lambda) = \sqrt{\frac{n \sum_{j=1}^{r} \sum_{i=1}^{n} \delta(x_{ij}, x_{Ij})}{\sum_{k=1}^{t} |U_k^\lambda|^2}} \tag{3}$$

where $X_I \subseteq U$ is any case cluster obtained by one iteration of variable-scale clustering, and $X_I = \{x_1, x_2, \ldots, x_n\}$; $A^\lambda$ ($\lambda = 1, 2, \ldots, r$) is the target attribute, and $U/A^\lambda = \{U_1^\lambda, U_2^\lambda, \ldots, U_t^\lambda\}$; $x_I$ is the cluster center of $X_I$, and for all $x_{ij} \in X_I$, if $x_{ij} = x_{Ij}$, $\delta(x_{ij}, x_{Ij}) = 0$; otherwise, $\delta(x_{ij}, x_{Ij}) = 1$.

Fig. 4 shows the variable-scale clustering process under the scale transformation strategies. Firstly, initial clustering analysis is performed on the single-scale case dataset $D_{1st}$ at the basic hierarchy level, and the satisfaction threshold $R_0$ is determined via the GrD. The clusters with scale characteristics that satisfy $R_0$ are retained. Then, all the objects of satisfied clusters are deleted from the original multi-scale dataset, and one attribute is selected for scale transformation following one of the scale transformation strategies (OSTS or PSTS), in order to obtain the new single-scale case dataset $D_{2nd}$ with higher hierarchy and fewer objects. After that, clustering analysis is started on $D_{2nd}$ via $R_0$, and all the satisfied clusters with scale characteristics are retained. Similarly, only the cases in the remaining unsatisfied clusters continue to the next scale transformation process. Finally, the VSC-STS stops when all objects have been partitioned into a satisfied cluster.

Fig. 4. The variable-scale clustering process under scale transformation strategies.

According to the optimistic and pessimistic scale transformation strategies, combined with the scale up transformation mechanism, a variable scale clustering algorithm based on the scale transformation strategy (VSC-STS) is proposed. The algorithm steps are shown in Algorithm 1.

Algorithm 1: Variable-Scale Clustering Algorithm Based on the Scale Transformation Strategy (VSC-STS)
Input: Multi-scale knowledge base $D^S$ with clear hierarchy structure (CS), number of clusters $k$, scale transformation range $[S_{start}, S_{end}]$
Output: Scale hierarchy (scale transformation route), satisfied clusters with scale characteristics.
Step 1. Start the initial clustering analysis on the basic scale combination of $D^S$ and evaluate the satisfaction degree of clustering results via GrD (see Eq. (3)).
Step 2. Identify satisfied clusters and take the largest satisfied granular deviation value as $R_0$.
Step 3. Output all satisfied clusters with scale characteristics and delete all the objects of satisfied clusters from $D^S$.
Step 4. Start scale transformation following CS.
  • If OSTS is adopted, scale up one attribute with the minimum STR under the range condition $[S_{start}, S_{end}]$, and update $\dot{D}$.
  • If PSTS is adopted, scale up one attribute with the maximum STR under the range condition $[S_{start}, S_{end}]$, and update $\dot{D}$.
Step 5. Start clustering analysis on $\dot{D}$ via $R_0$. If the GrD of all clusters exceeds $R_0$, reset $R_0$ as the minimum granular deviation value.
Step 6. Output all satisfied clusters with scale characteristics and delete all the objects of satisfied clusters from $\dot{D}$.
Step 7. If all objects are partitioned into a satisfied cluster (i.e. $\dot{D} = \emptyset$), output the scale transformation route; otherwise, go to Step 4.

The time complexity of VSC-STS is $O(t\phi)$, where $\phi = \min(m, nr)$, $r$ is the number of attributes, $n$ is the maximum number of scale levels in one attribute, $m$ is the number of cases, and $t$ is the time consumed by the meta clustering process.

For instance, let $D_{1st} = (U, \{A_0^1, A_0^2\}, d)$ represent the initial single-scale case dataset of the multi-scale knowledge base $D^S$ in Table 2. According to the first step of the VSC-STS, the scale up transformation space is $STS(D_{1st}) = \{A_1^1, A_1^2\}$. If the PSTS is adopted, $A^2$ becomes the target attribute for the second iteration because $STR(A_0^1, A_1^1) < STR(A_0^2, A_1^2)$, and the single-scale case dataset for the second iteration process is $D_{2nd} = (U, \{A_0^1, A_1^2\}, d)$.
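The attribute selection step of this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `scale_transformation_rate` and `select_target` are hypothetical, the STR is read as the share of cases whose class is identical in both partitions (Eqs. (1) and (2)), and the partitions of $A^2$ are hypothetical values chosen to agree with the rates quoted in Section 3.1, since the cell values of Table 2 are not reproduced in the text. The partitions of $A^1$ are the ones listed in Section 3.1.

```python
from fractions import Fraction


def scale_transformation_rate(from_partition, to_partition, n_cases):
    """Share of cases whose class is identical in both partitions (STR)."""
    from_classes = {frozenset(c) for c in from_partition}
    unchanged = [frozenset(c) for c in to_partition if frozenset(c) in from_classes]
    covered = set().union(*unchanged)
    return Fraction(len(covered), n_cases)


def select_target(partitions, levels, strategy, n_cases):
    """Choose the attribute to scale up by one level.

    OSTS takes the attribute with the smallest STR, PSTS the largest.
    Attributes already at their highest scale are skipped.
    """
    rates = {
        attr: scale_transformation_rate(
            partitions[attr][lvl], partitions[attr][lvl + 1], n_cases
        )
        for attr, lvl in levels.items()
        if lvl + 1 < len(partitions[attr])
    }
    choose = min if strategy == "OSTS" else max
    return choose(rates, key=rates.get), rates


# U/A^1_0, U/A^1_1, U/A^1_2 as listed in Section 3.1 (cases numbered 1..6).
A1 = [
    [{1}, {2}, {3}, {4}, {5}, {6}],
    [{1, 2, 3}, {4}, {5, 6}],
    [{1, 2, 3, 4}, {5, 6}],
]
# Hypothetical partitions for A^2 (assumption): chosen so that both
# scale-down rates from A^2_2 equal |{x5, x6}| / |U| = 0.33.
A2 = [
    [{1}, {2}, {3}, {4}, {5, 6}],
    [{1, 2}, {3, 4}, {5, 6}],
    [{1, 2, 3, 4}, {5, 6}],
]
PARTITIONS = {"A1": A1, "A2": A2}

# First step under PSTS, then second step under OSTS, as in the example.
first_target, first_rates = select_target(PARTITIONS, {"A1": 0, "A2": 0}, "PSTS", 6)
second_target, second_rates = select_target(PARTITIONS, {"A1": 0, "A2": 1}, "OSTS", 6)
print(first_target, second_target)
```

Exact fractions are used so that the rates can be compared without rounding; with these inputs the PSTS step selects $A^2$ and the following OSTS step selects $A^1$, matching the route described in the text.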

Where Caseλik (Caseλjk ) is the value of Casei (Casej ) under the


observation scale Aλk , (nλ + 1) is the number of scales in at-
tribute Aλ . If Caseλik = Caseλjk , δ (Caseλik , Caseλjk ) = 1; Otherwise,
δ (Caseλik , Caseλjk ) = 0.
For instance, the multi-scale similarity between x5 and x6 in
Table 2 is SimS (x5 , x6 ) = 0.83.
Finally, the variable scale case-based reasoning algorithm
(VSCBR) is proposed. The algorithm steps is shown in Algorithm
2.
Algorithm 2: Variable-Scale Case-Based Algorithm using the
VSC-STS (VSCBR)
Input: Multi-scale knowledge base DS , New case C ,Similarity
threshold ξ , Number of clusters k, Scale transformation
range [Sstart , Send ]
Output: Past solutions.
Step 1. Calculate the multi-scale similarity between new case
Fig. 5. Scale transformation process of the VSC-STS. C and all cluster centers in the knowledge base DS
using Eq.(4) and Eq.(5). If there are clusters that exceed
the similarity threshold ξ , go to Step 2; Otherwise go
During the second iteration, the scale up transformation space is to Step 3.
STS(D2nd) = {A11, A22}. If the OSTS is adopted, A1 becomes the target attribute for the third iteration because STR(A10, A11) < STR(A21, A22), and the single-scale case dataset for the third iteration is D3rd = (U, {A11, A21}, d). Finally, the scale transformation route can be obtained (see Fig. 5).

3.3. Variable scale case-based reasoning method using the VSC-STS

According to the structure of multi-scale knowledge case in Section 3.1, the Case Retrieve phase is supposed to consider the similarity between new and past cases under all observation scales.

Definition 2 (Multi-Scale Case Similarity). Let $Case_i$ and $Case_j$ represent two different cases in the multi-scale knowledge base $D^S = (U, A^S, d)$, where the multi-scale attribute set is $A^S = \{A^{\lambda} \mid \lambda = 1, 2, \ldots, r\}$ and each attribute is $A^{\lambda} = \{A^{\lambda}_k \mid k = 0, 1, 2, \ldots, n^{\lambda}\}$. The multi-scale similarity between $Case_i$ and $Case_j$ is defined as:

$$Sim^S(Case_i, Case_j) = \sum_{\lambda=1}^{r} d(Case_i^{\lambda}, Case_j^{\lambda}) \qquad (4)$$

$$d(Case_i^{\lambda}, Case_j^{\lambda}) = \sum_{k=0}^{n^{\lambda}} \frac{\delta(Case_{ik}^{\lambda}, Case_{jk}^{\lambda})}{n^{\lambda} + 1} \qquad (5)$$

Step 2. Output the past accurate solutions of all qualified cases, in descending order of case similarity, and go to Step 5.
Step 3. Revise the initial parameters (that is, the number of clusters k and the scale transformation range [Sstart, Send]) and calculate new clusters of the multi-scale knowledge base $D^S$ through VSC-STS(D, k, [Sstart, Send]). If there are clusters that exceed the similarity threshold ξ, go to Step 5; otherwise, go to Step 4.
Step 4. Decrease the similarity threshold ξ, and go to Step 1.
Step 5. Store the new case and the top qualified solution (the one that has the greatest similarity with the new case) in the multi-scale knowledge base $D^S$.
Step 6. Update the case clusters of the multi-scale knowledge base $D^S$ through VSC-STS(D, k, [Sstart, Send]).

Compared to the CBR-FT in Section 1, the proposed VSCBR uses the case similarity information to replace more prior knowledge (like the evidence relevance rating), which could decrease the subjective risks caused by digital forensic investigators. Moreover, the VSCBR knowledge base can be updated over time by adding current criminal cases under different temporal scales.
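As an illustration of Eqs. (4) and (5) and of the ranking in Step 2, the following minimal Python sketch may help. It rests on assumptions that this section does not state: δ is taken to be an exact-match indicator (1 if two values are equal, 0 otherwise), a case is "qualified" when its multi-scale similarity exceeds ξ, and the names `Case`, `delta`, `attribute_similarity`, `multi_scale_similarity` and `rank_qualified` are hypothetical. The authors' implementation is in Matlab and may differ.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: a case maps each attribute name to its values
# at scales k = 0..n_lambda, e.g. "A1" -> ("1", "E", "Y").
Case = Dict[str, Sequence[str]]


def delta(a: str, b: str) -> float:
    """Assumed per-scale similarity: 1 if the two values match, else 0."""
    return 1.0 if a == b else 0.0


def attribute_similarity(vals_i: Sequence[str], vals_j: Sequence[str]) -> float:
    """Eq. (5): average of delta over the n_lambda + 1 scales of one attribute."""
    if len(vals_i) != len(vals_j):
        raise ValueError("both cases need the same number of scales")
    return sum(delta(a, b) for a, b in zip(vals_i, vals_j)) / len(vals_i)


def multi_scale_similarity(case_i: Case, case_j: Case) -> float:
    """Eq. (4): sum of the per-attribute similarities over all attributes."""
    return sum(attribute_similarity(case_i[name], case_j[name]) for name in case_i)


def rank_qualified(new_case: Case, past: Dict[str, Case], xi: float) -> List[Tuple[str, float]]:
    """Keep past cases whose similarity exceeds xi, most similar first."""
    scored = [(cid, multi_scale_similarity(new_case, c)) for cid, c in past.items()]
    return sorted((p for p in scored if p[1] > xi), key=lambda p: p[1], reverse=True)


# Three rows taken from Table 5 (solution attribute d omitted).
case1 = {"A1": ("1", "E", "Y"), "A2": ("1", "L", "N"), "A3": ("1", "T", "Y"), "A4": ("1", "Y")}
case3 = {"A1": ("1", "E", "Y"), "A2": ("2", "L", "N"), "A3": ("2", "U", "Y"), "A4": ("1", "Y")}
case14 = {"A1": ("5", "F", "N"), "A2": ("5", "I", "N"), "A3": ("4", "S", "O"), "A4": ("2", "N")}

ranking = rank_qualified(case1, {"Case3": case3, "Case14": case14}, xi=1.0)
print(ranking)
```

Under these assumptions, Case3 scores 1 + 2/3 + 1/3 + 1 = 3 against Case1, while Case14 scores 1/3 and is filtered out by ξ = 1.0. With an indicator δ, the similarity is bounded by the number of attributes r.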

Table 3
Investigation content for digital forensics in the bidding context.
# Laws and regulations Number of articles
1 Law of the People’s Republic of China on Tenders and Bids 67
2 Regulations for Implementations on the Law of the People’s Republic of China on Tenders and Bids 82
3 Government Procurement Law of the People’s Republic of China 86
4 Regulations for Implementations on the Government Procurement Law of the People’s Republic of China 78
5 The Administrative Rules on Bidding Invitation and Procurement of the Shenhua Group (the 2016 version) 137
6 Measures for Bidding Invitation and Procurement of Goods Needed in Construction Projects 64
7 Measures for Tenders and Bids for Investigation and Design of Engineering Construction Projects 59
8 Measures for Tenders and Bids for Construction of Engineering Construction Projects 91
9 Handling Measures for Dealing with Complaints in Bidding Invitation and Bidding in Construction Projects 30
10 Provisional Rules on Bidding Evaluation Committee and on Bidding Evaluation Methods 61
11 Provisional Measures for Administration of Bidding Evaluation Experts and Bidding Evaluation Tank of Experts 18
12 Provisional Rules for Announcing Bidding Results 21
13 Measures for Invitation for Online Bidding 66
14 The Administrative Rules on Bidding Invitation and Procurement of the Shenhua Group (the 2017 version) 109

Table 4
Summary of variables.
Attribute  Scale  Description
Supply     A10    Supply risk rating during the latest three months observation scale, that is [2017-09-01,2017-12-31]
Supply     A11    Supply risk rating during the latest six months observation scale, that is [2017-06-01,2017-12-31]
Supply     A12    Supply risk rating during the latest one year observation scale, that is [2017-01-01,2017-12-31]
Product    A20    Product value rating during the latest three months observation scale, that is [2017-09-01,2017-12-31]
Product    A21    Product value rating during the latest six months observation scale, that is [2017-06-01,2017-12-31]
Product    A22    Product value rating during the latest one year observation scale, that is [2017-01-01,2017-12-31]
Demand     A30    Demand risk rating during the latest three months observation scale, that is [2017-09-01,2017-12-31]
Demand     A31    Demand risk rating during the latest six months observation scale, that is [2017-06-01,2017-12-31]
Demand     A32    Demand risk rating during the latest one year observation scale, that is [2017-01-01,2017-12-31]
Project    A40    Location of the project under city observation scale
Project    A41    Location of the project under province observation scale
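
Table 4 lists three observation scales for each of the first three attributes and two for the fourth, so picking one scale per attribute gives 3 × 3 × 3 × 2 = 54 single-scale datasets, which matches the size of the entire scale space reported in Section 4.2. The short Python sketch below enumerates them; the dictionary layout and the function name `scale_combinations` are illustrative assumptions, not part of the paper.

```python
from itertools import product

# Scale labels per attribute, as listed in Table 4.
SCALES = {
    "A1": ["A10", "A11", "A12"],  # supply risk, temporal scales
    "A2": ["A20", "A21", "A22"],  # product value, temporal scales
    "A3": ["A30", "A31", "A32"],  # demand risk, temporal scales
    "A4": ["A40", "A41"],         # project, spatial scales
}


def scale_combinations(scales):
    """Return every way of choosing exactly one scale per attribute."""
    names = list(scales)
    return [dict(zip(names, combo)) for combo in product(*(scales[n] for n in names))]


combos = scale_combinations(SCALES)
print(len(combos))  # 54
print(combos[0])    # {'A1': 'A10', 'A2': 'A20', 'A3': 'A30', 'A4': 'A40'}
```

The first combination is the one built from the lowest scale of every attribute, and each later combination replaces one or more of those scales with a higher one.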

4. Experimental analysis

4.1. Experiment design

A real 15.9 GB bidding knowledge base from the Shenhua Group is used for the experiments; it includes both text bidding documents and image technical drawings. Table 3 shows the 969 articles from fourteen laws and regulations that need to be investigated. All 85 evidence locations of the bidding cases are encoded as four digits (see Fig. 6). It can be seen that, like the multi-scale attribute set, the solution set (evidence location) of the bidding knowledge base also has a hierarchical structure.

Fig. 6. Example: Evidence location of bidding cases.

Three numerical experiments are designed to test the effectiveness of the VSC-STS and VSCBR theoretically and experimentally.
(1) Numerical experiments on the performance of the VSC-STS
The objective of the theoretical value verification is to test whether the VSC-STS is able to improve the accuracy of the traditional clustering algorithm under various evaluation metrics.
① Validation analysis of the clustering results
According to the algorithm steps of the VSC-STS (see Algorithm 1), k-modes is selected as the meta clustering algorithm. Moreover, we repeat the numerical experiment fifty times because of the random nature of k-modes, and compare the variable-scale clustering results with the clustering results over the entire scale space.
② Sensitivity analysis of the parameter threshold
According to the algorithm steps of the VSC-STS, the scale transformation range is set to [3,6] and the variation step to 0.1. Similarly, we repeat the numerical experiment fifty times because of the random nature of k-modes, and compare the average clustering accuracy under different initial parameter values.
There are two widely used approaches for evaluating clustering algorithms, namely internal validation and external validation. Since the case set (domain) keeps decreasing during every scale transformation process, the observation scale of the final variable-scale clustering result is inconsistent with that of the other comparison algorithms. Thus, only the external validation approach is adopted for the comparative experiments.
(2) Numerical experiment on the performance of the VSCBR
The objective of the practical value verification is to test whether the VSCBR is able to predict suspicious evidential areas for digital forensics investigators. Since the VSC-STS is used to support key steps of digital triage in the VSCBR (see Algorithm 2), the practical value of the VSC-STS is also verified through this numerical experiment.

4.2. Data preparation

After field investigations with eight bidding managers from the Shenhua Group, the multi-scale attributes (dimensions) of the bidding knowledge base are established (see Table 4) jointly with managers and data analysts. According to the managers' experience, there are four dimensions in total that greatly affect the forensic investigation in the practical bidding process, i.e., the supply risk dimension, product value dimension, demand risk dimension and project value dimension.
Moreover, since the bidding managers report on their responsible districts every quarter, three temporal observation scales (i.e., the latest three months, the latest six months and the latest one year) and two spatial observation scales (i.e., the city level and province level) are determined for the bidding knowledge base. Therefore, all bidding raw data relevant to the first three dimensions are extracted, transformed and loaded under three different temporal scales to obtain the first nine variables. All bidding raw data relevant to the fourth dimension are extracted, transformed and loaded under two different spatial scales to obtain the last two variables.
In order to clearly show the bidding experimental data, a sample of the bidding knowledge base with twenty-five cases, four observation attributes (A1, A2, A3, A4) and one solution attribute (d) is provided (see Table 5). Taking the lowest observation scales as the basic scale combination (A10, A20, A30), it can be seen that the entire scale space could generate 54 different single-scale case datasets (scale combinations).
Moreover, the experiments use classic external evaluation metrics (such as Fmeasure, NMI and RI) to analyze the validation of the VSC-STS in the first two experiments. All experiments are performed in an OS X (10.14.4) environment on a machine with 8 GB RAM and a 2.9 GHz Intel Core i5 CPU. All methods are coded in Matlab R2017a.

4.3. Experiment results and discussion

In this section, the VSC-STS results of the first two experiments and the VSCBR (combined with the VSC-STS) results of the third experiment are discussed respectively.
(1) Numerical experiment on the validation of the VSC-STS
Fig. 7 shows the validation experiment results of the VSC-STS. The black broken line represents the evaluation results of each external validation metric, i.e., the Fmeasure, NMI and RI [36,37]. The blue dashed line represents the average clustering accuracy of k-modes under the basic scale combination, while the red dashed line represents the average clustering accuracy of k-modes under the optimal (best) scale combination (see Table 6).
It is found that the VSC-STS could completely meet the requirements of traditional single-scale clustering methods on result validation. Further discussion is as follows.
① During the fifty experiments, the average performance of the VSC-STS method on all evaluation indexes is better than the average results obtained by the traditional single-scale clustering method k-modes at the basic scale level. The accuracy improvement rate exceeds 10%, which verifies that the VSC-STS could meet the validity requirement of traditional single-scale clustering algorithms.
② During the fifty experiments, the deviation between the average performance of the VSC-STS method and the average results obtained by the traditional single-scale clustering method k-modes under the optimal scale level is less than 5%. This indicates that the VSC-STS could improve the efficiency of traditional single-scale clustering algorithms.
(2) Numerical experiment on the parameter sensitivity of the VSC-STS
Fig. 8 shows the sensitivity experiment results for the initial algorithm parameter of the VSC-STS. The broken line represents the standardized average clustering validation evaluated by the external validation metrics (i.e., the Fmeasure, NMI and RI) within the scale transformation range of the initial algorithm parameter. It can be seen that all evaluation indexes stay at a relatively stable level as the algorithm parameter increases. Further discussion is as follows.
① Although the validation evaluation results of the three indexes fluctuate as the initial satisfaction (degree) threshold changes, the maximum fluctuation range is less than 1%, which indicates that the validation of the VSC-STS is insensitive to the initial algorithm parameter.

② The overall tendency of the three validation evaluation results increases slightly with the increase of the initial satisfaction parameter, which indicates that an overly strict initial satisfaction constraint is not conducive to the VSC-STS obtaining the optimal solution.

Fig. 7. The validation experiment results of the VSC-STS.

Fig. 8. The experiment results of algorithm parameter sensitivity.

Table 5
Example: Bidding knowledge base.
U A10 A11 A12 A20 A21 A22 A30 A31 A32 A40 A41 d
Case1 1 E Y 1 L N 1 T Y 1 Y 1101
Case2 1 E Y 1 L N 1 T Y 1 Y 1101
Case3 1 E Y 2 L N 2 U Y 1 Y 1101
Case4 2 E Y 2 L N 3 R O 1 Y 1101
Case5 2 E Y 2 L N 2 U Y 2 N 1201
Case6 2 E Y 3 L N 3 R O 1 Y 2002
Case7 3 G Y 3 L N 3 R O 2 N 2002
Case8 3 G Y 4 L N 2 U Y 2 N 1201
Case9 3 G Y 4 L N 3 R O 1 Y 2002
Case10 3 G Y 5 I N 2 U Y 1 Y 3001
Case11 4 G Y 5 I N 3 R O 2 N 2002
Case12 4 G Y 5 I N 4 S O 1 Y 3001
Case13 4 G Y 5 I N 1 T Y 1 Y 3001
Case14 5 F N 5 I N 4 S O 2 N 5001
Case15 5 F N 6 I N 2 U Y 2 N 5001
Case16 5 F N 6 I N 3 R O 1 Y 5001
Case17 5 F N 6 I N 4 S O 1 Y 5001
Case18 6 B N 6 I N 1 T Y 2 N 5001
Case19 6 B N 6 I N 5 Q N 1 Y 3001
Case20 6 B N 7 M Y 5 Q N 2 N 4001
Case21 7 B N 6 I N 4 S O 3 N 4001
Case22 7 B N 7 M Y 5 Q N 3 N 4001
Case23 7 B N 8 M Y 5 Q N 4 N 4001
Case24 7 B N 8 M Y 6 Q N 4 N 4001
Case25 7 B N 8 M Y 6 Q N 4 N 4001
* Variables A10, A11 and A12 are encoded as [1-11], [A-G] and [Y,N] respectively according to the level of supply risk under each temporal scale; variables A20, A21 and A22 are encoded as [1-9], [I-M] and [Y,N] respectively according to the level of product value under each temporal scale; variables A30, A31 and A32 are encoded as [1-7], [Q-U] and [Y,O,N] respectively according to the level of demand risk under each temporal scale; variables A40 and A41 are encoded as [1-5] and [Y,N] respectively according to the level of project value under each spatial scale.

Table 6
Evaluation results of the VSC-STS.
Objectives  Initial observation scale combination (A10 A20 A30 A40)   Optimal observation scale combination (A11 A21 A31 A40)
            Average  Maximum  Minimum  Standard deviation             Average  Maximum  Minimum  Standard deviation
Fmeasure    0.6269   0.7863   0.4777   0.0692                         0.6940   0.8436   0.5543   0.0615
NMI         0.5791   0.7610   0.3830   0.0742                         0.6548   0.8537   0.4786   0.0669
RI          0.8079   0.8900   0.7300   0.0369                         0.8409   0.9167   0.7733   0.0288

(3) Numerical experiment on the performance of the VSCBR
Fig. 9(a) shows the comparison results of the traditional CBR and the proposed VSCBR method in the case reuse phase. Fifteen new cases are used to test whether the VSCBR could locate the accurate evidence area automatically, which would reduce the digital triage workload of forensic investigators. It can be seen that the time-consumption curves of both algorithms are stable overall with local fluctuations, and decrease markedly in the initial stage. This is mainly because of the heavy initial case clustering work on the bidding knowledge base, which is reduced rapidly, as shown by the sharp decrease in calculation time. The experimental results demonstrate that historical experience contributes significantly to the solution of new problems (cases). In addition, the time consumption of the VSCBR is always lower than that of the traditional CBR in all cases, which indicates that the proposed VSCBR has high efficiency in practical forensic tasks.
Fig. 9(b) shows the comparison results of the traditional CBR and the proposed VSCBR method in the case revise phase. Ten new cases are used to test whether the VSCBR could locate all the suspicious evidential areas automatically. It can be seen that the time-consumption curves of both algorithms are again stable overall with local fluctuations, and decrease markedly in the initial stage. The time consumption of the VSCBR is always lower than that of the traditional CBR in all cases, which further indicates that the proposed VSCBR has high efficiency in practical forensic tasks.

Fig. 9. The comparison results of the traditional CBR and the proposed VSCBR.

5. Conclusions

As for the increasingly criminal usage of information technology, the digital forensic analysis (like the digital triage methods)
becomes an emerging technique for cybercrime investigators to improve examination efficiency. This paper aims to solve the digital triage problem for locating evidence during the automatic forensic process. Firstly, according to STR, the optimistic scale transformation strategy (OSTS) and the pessimistic scale transformation strategy (PSTS) are proposed to optimize the attribute selection in the scale transformation process. Secondly, a variable scale clustering algorithm based on the scale transformation strategy (VSC-STS) is proposed to overcome the uncertainty caused by the random attribute selection in the VSC. We test the effectiveness of the VSC-STS theoretically and experimentally. Finally, after defining the structure of the multi-scale knowledge base, a variable scale case-based reasoning method (VSCBR) is proposed to support investigators in predicting evidential areas. A case study is established using a real 15.9 GB bidding case dataset, which contains both text bidding documents and image technical drawings. Experiment results illustrate that the validation of the proposed VSC-STS is significantly improved compared with the traditional single-scale clustering algorithm, and that it is insensitive to the initial parameter threshold. Also, the proposed method VSCBR is able to help investigators locate likely rule-violating evidence in practice.
Future studies will focus on improving the efficiency of the VSCBR through optimizing the initial parameter and human involvement, as well as on explaining why a certain attack has been made [38]. More types of criminal behaviors for digital forensics will be considered in the construction of the knowledge base.

CRediT authorship contribution statement

Ai Wang: Conceptualization, Methodology, Software, Validation, Writing - original draft. Xuedong Gao: Supervision, Resources, Writing - review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] F. Amato, G. Cozzolino, V. Moscato, F. Moscato, Analyse digital forensic evidences through a semantic-based methodology and nlp techniques, Future Gener. Comput. Syst. 98 (2019) 297–307.
[2] A.H. Lone, R.N. Mir, Forensic-chain: Blockchain based digital forensics chain of custody with poc in hyperledger composer, Digit. Investig. 28 (2019) 44–55.
[3] S. Bunting, EnCase Computer Forensics: The Official EnCE: EnCase Certified Examiner Study Guide, SYBEX Inc., 2012.
[4] G. Horsman, C. Laing, P. Vickers, A case-based reasoning method for locating evidence during digital forensic device triage, Decis. Support Syst. 61 (2014) 69–78.
[5] D. Sun, S. Zhao, Multi-scale clustering algorithm based on the weight vector, Comput. Sci. 42 (2015) 263–267.
[6] Y. Han, S. Zhao, Multi-scale clustering algorithm, Comput. Sci. 43 (2016) 244–248.
[7] Y. Ren, P. Liu, Bp neural network model for prediction of listing corporation stock price of qinghai province, J. Syst. Manage. Sci. 6 (2016) 54–65.
[8] A. Wang, X. Gao, Multifunctional product marketing using social media based on the variable-scale clustering, Teh. Vjesn. 26 (1) (2019) 193–200.
[9] X. Gao, A. Wang, Customer satisfaction analysis and management method based on enterprise network public opinion, Oper. Res. Manage. Sci. 29 (7) (2020) 232–239.
[10] P.R. Yogesh, S.R. Devane, Network forensic investigation protocol to identify true origin of cyber crime, J. King Saud Univ. - Comput. Inform. Sci. (2019).
[11] N. Akatyev, J.I. James, Evidence identification in iot networks based on threat assessment, Future Gener. Comput. Syst. 93 (2019) 814–821.
[12] N.L. Beebe, J.G. Clark, G.B. Dietrich, M.S. Ko, D. Ko, Post-retrieval search hit clustering to improve information retrieval effectiveness: Two digital forensics case studies, Decis. Support Syst. 51 (4) (2011) 732–744.
[13] H. Henseler, S.V. Loenhout, Educating judges, prosecutors and lawyers in the use of digital forensic experts, Digit. Investig. 24 (2018) S76–S82.
[14] N. Koroniotis, N. Moustafa, E. Sitnikova, A new network forensic framework based on deep learning for internet of things networks: A particle deep framework, Future Gener. Comput. Syst. 110 (2020) 91–106.
[15] F.M. Awaysheh, M. Alazab, M. Gupta, T.F. Pena, J.C. Cabaleiro, Next-generation big data federation access control: A reference model, Future Gener. Comput. Syst. 108 (2020) 726–741.
[16] C. Quinto Huamán, A.L. Sandoval Orozco, L.J. García Villalba, Authentication and integrity of smartphone videos through multimedia container structure analysis, Future Gener. Comput. Syst. 108 (2020) 15–33.
[17] J. Vilhena, H. Vicente, M.R. Martins, J.M. Grañeda, F. Caldeira, R. Gusmão, J. Neves, J. Neves, A case-based reasoning view of thrombophilia risk, J. Biomed. Inform. 62 (2016) 265–275.
[18] T.P.D. Homem, P.E. Santos, A.H.R. Costa, R.A.D.C. Bianchi, R.L.D. Mantaras, Qualitative case-based reasoning and learning, Artif. Intell. 283 (2020) 103258.1–103258.23.


[19] M.L. Reid, S.M. Emery, Scale-dependent effects of gypsophila paniculata invasion and management on plant and soil nematode community diversity and heterogeneity, Biol. Cons. 224 (2018) 153–161.
[20] C.D.D. Sohoulande, K. Stone, V.P. Singh, Quantifying the probabilistic divergences related to time-space scales for inferences in water resource management, Agricult. Water Manag. 217 (2019) 282–291.
[21] H. Jiawei, K. Micheline, Data mining: Concepts and techniques, Data Mining Concepts Models Methods & Algorithms, vol. 5(4), second ed, 2006, pp. 1–18.
[22] N. Takashina, M.L. Marissa, L. Baskett, Exploring the effect of the spatial scale of fishery management, J. Theoret. Biol. 390 (2016) 14–22.
[23] G. Mariscal, O. Marban, C. Fernandez, A survey of data mining and knowledge discovery process models and methodologies, Knowl. Eng. Rev. 25 (2) (2010) 137–166.
[24] S. Wu, X. Gao, M. Bastien, Data Warehousing and Data Mining, Metallurgical Industry Press, 2003.
[25] T. Gocken, M. Yaktubay, Comparison of different clustering algorithms via genetic algorithm for vrptw, Int. J. Simul. Modell. 18 (4) (2019) 574–585.
[26] R.C. Hrosik, E. Tuba, E. Dolicanin, R. Jovanovic, M. Tuba, Brain image segmentation based on firefly algorithm combined with k-means clustering, Stud. Inform. Control 28 (2) (2019) 167–176.
[27] L. Wang, Z. Hao, X. Han, R. Zhou, Gravity theory-based affinity propagation clustering algorithm and its applications, Teh. Vjesn. 25 (4) (2018) 1125–1135.
[28] Z.Z. Wang, Y. Xiong, R. Wang, C.H. Zhong, Numerical investigation of the scale effect of hydrodynamic performance of the hybrid crp pod propulsion system, Appl. Ocean Res. 54 (2016) 26–38.
[29] G. Tavakoli Mehrjardi, R. Behrad, S.N. Moghaddas Tafreshi, Scale effect on the behavior of geocell-reinforced soil, Geotext. Geomembr. 47 (2) (2019) 154–163.
[30] G. Feng, D. Ming, M. Wang, J. Yang, Connotations of pixel-based scale effect in remote sensing and the modified fractal-based analysis method, Comput. Geosci. 103 (C) (2017) 183–190.
[31] X. Gao, A. Wang, Variable-scale clustering, in: Proceeding of 8th International Conference on Logistics, Informatics and Service Sciences, 2018.
[32] A. Wang, X. Gao, Hybrid variable-scale clustering method for social media marketing on user generated instant music video, Teh. Vjesn. 26 (3) (2019) 771–777.
[33] A. Wang, X. Gao, M. Yang, Variable-scale clustering based on the numerical concept space, in: Proceeding of 9th International Conference on Logistics, Informatics and Service Sciences, 2019.
[34] A. Wang, X. Gao, Intelligent computing: Knowledge acquisition method based on the management scale transformation, Comput. J. 64 (3) (2021) 314–324, http://dx.doi.org/10.1093/comjnl/bxaa077.
[35] A. Wang, X. Gao, M. Tang, Computer supported data-driven decisions for service personalization: A variable-scale clustering method, Stud. Inform. Control 29 (1) (2020) 55–65.
[36] S. Liang, D. Han, Y. Yang, Cluster validity index for irregular clustering results, Appl. Soft Comput. 95 (2020) 106583.
[37] J.M. Luna-Romera, M. Martínez-Ballesteros, J. García-Gutiérrez, J.C. Riquelme, External clustering validity index based on chi-squared statistical test, Inform. Sci. 487 (2019) 1–17.
[38] L. Longo, R. Goebel, F. Lecue, P. Kieseberg, Explainable Artificial Intelligence: Concepts, Applications, Research Challenges and Visions, Machine Learning and Knowledge Extraction, in: Lecture Notes, LNCS 12279, 2020, http://dx.doi.org/10.1007/978-3-030-57321-8-1.

Ai Wang received her Ph.D degree in 2021 in Management Science and Engineering and she is currently doing her postdoctoral research in the School of Humanities and Social Science, University of Science and Technology Beijing, China. She has published papers in respected journals like IEEE Access, Studies in Informatics and Control. Her research interests focus on data mining and decision making, as well as emergency management.

Xuedong Gao received his Bachelor degree from Nankai University, China in 1983, and the Ph.D. degree from Belarusian State University in 1993. He is currently the professor in the Department of Management Science and Engineering, School of Economics and Management at University of Science and Technology Beijing, China. His research interests include management process optimization, data mining, decision making.
