You are on page 1of 12

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 17, NO.

8, AUGUST 2021 5645

Running Industrial Workflow Applications in a


Software-Defined Multicloud Environment Using
Green Energy Aware Scheduling Algorithm
Zhenyu Wen , Member, IEEE, Saurabh Garg , Gagangeet Singh Aujla , Senior Member, IEEE,
Khaled Alwasel, Deepak Puthal , Senior Member, IEEE, Schahram Dustdar , Fellow, IEEE,
Albert Y. Zomaya , Fellow, IEEE, and Rajiv Ranjan , Senior Member, IEEE

Abstract—Industry 4.0 have automated the entire manu- with the availability of green energy on different cloud data
facturing sector (including technologies and processes) by centers, where the energy consumption consists of the us-
adopting Internet of Things and cloud computing. To han- age of running application over multiple data centers and
dle the workflows from Industrial Cyber-Physical systems, transferring the data among them through SDWAN. The
more and more data centers have been built across the evaluation shows that our proposed method can increase
globe to serve the growing needs of computing and stor- usage of green energy for the execution of industrial work-
age. This has led to an enormous increase in energy usage flow up to 3× times with a slight increase in the cost when
by cloud data centers, which is not only a financial burden compared to cost-based workflow scheduling methods.
but also increases their carbon footprint. The private soft-
ware defined wide area network (SDWAN) connects a cloud Index Terms—Big data, green energy, industrial
provider’s data centers across the planet. This gives the clouds, industrial workflow applications, software defined
opportunity to develop new scheduling strategies to man- networking.
age cloud providers workload in a more energy-efficient
manner. In this context, this article addresses the problem
of scheduling data-driven industrial workflow applications
over a set of private SDWAN connected data centers in an I. INTRODUCTION
energy-efficient manner while managing tradeoff of a cloud
HE INDUSTRY 4.0 revolution helps to gather data in
provider’ revenue. Our proposed algorithm aims to mini-
mize the cloud provider’s revenue and the usage of nonre-
newable energy by utilizing the real-world electricity prices
T real-time and then analyze it at remote clouds to improve
the industrial processes and detect faulty operations. This helps
to realize the future operating problems in advance and make
Manuscript received September 2, 2020; revised December 15, 2020; the industrial processes more stringent. This transition triggers
accepted December 15, 2020. Date of publication December 18, 2020;
date of current version May 3, 2021. This work was supported in part high quality productivity that further improves the industrial
by the PACE Project EP/T021985/1, in part by the SUPER Project productivity, economics, and workflows. However, this makes
EP/R033293/1, in part by the National Natural Science Foundation of more data-driven undertakings, which increases the data sharing
China under Grant 62072408, and in part by the Zhejiang Provincial
Natural Science Foundation of China under Grant LY20F020030. Paper across multiples sites and even across industrial (factory) bound-
no. TII-20-4197. (Corresponding author: Gagangeet Singh Aujla.) aries. As a result, the dependence on cloud computing will in-
Zhenyu Wen and Deepak Puthal are with the School of Computing, crease though the deployment of machine data and functionality
Newcastle University, NE1 7RU Newcastle upon Tyne, U.K. (e-mail:
wenluke427@gmail.com; dputhal88@gmail.com). over remote clouds. Eventually, this leads to an increase in the in-
Saurabh Garg is with the University of Tasmania, Hobart TAS 146, dustrial workflow applications running at the cloud data centers.
Australia (e-mail: saurabh.kr.garg@gmail.com). However, more and more industrial applications are harnessing
Gagangeet Singh Aujla is with the Department of Computer
Science, Durham University, DH1 3LE Durham, U.K. (e-mail: cloud resources. To satisfy the needs of their customers and
gagi_aujla82@yahoo.com). industrial workflow applications, cloud computing providers in
Khaled Alwasel is with the School of Computing Science, Newcastle general maintain very large infrastructure.
University, NE1 7RU Newcastle upon Tyne, U.K., and also with the Col-
lege of Computing and Informatics, Saudi Electronic University, Riyadh It is quite evident that the amount of energy consumed by the
11673, Saudi Arabia (e-mail: kalwasel@gmail.com). world’s data centers—the repositories of billions of gigabytes of
Schahram Dustdar is with the Computer Science (Informatics), TU information—will exponentially increase over the next decade,
Wien, 1040 Vienna, Austria (e-mail: dustdar@dsg.tuwien.ac.at).
Albert Y. Zomaya is with the School of Computer Science, putting an enormous strain on energy sources. In 2010, elec-
The University of Sydney, Sydney, NSW 2006, Australia (e-mail: tricity usage in global data centers accounted for about 1.3% of
albert.zomaya@sydney.edu.au). total electricity usage worldwide [1]. Data centers now consume
Rajiv Ranjan is with the School of Computing Science, Newcastle
University, NE1 7RU Newcastle upon Tyne, U.K., and also with the about three percent of the global electricity supply. Clearly,
Chinese University of Geosciences, Wuhan 430074, China (e-mail: to operate such large infrastructure, a very large amount of
rranjans@gmail.com). electricity is required depending on the size of the data cen-
Color versions of one or more figures in this article are available at
https://doi.org/10.1109/TII.2020.3045690. ters. This also contributes significantly to their high operational
Digital Object Identifier 10.1109/TII.2020.3045690 cost. According to a report published by the European Union,
1551-3203 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.
5646 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 17, NO. 8, AUGUST 2021

a decrease in emission volume of 15–30% is required before storage, computation, and network [6]–[8]. However, most of
2020 to keep the global temperature increase below 2 ◦ C. Thus, these solutions are not directly applicable in the context of
high energy consumption and the carbon footprint of cloud data scientific or industrial workflow applications, which is the focus
center infrastructures have become key environmental concerns of this article. For each application workload and execution
and have immense potential to deal a hefty blow to efforts to profile, a different strategy is needed to minimize their energy
contain global warming. The majority of cloud providers (such usage while optimizing SLAs. Scientific and industrial work-
as Amazon, Apple, and Google) offer a multicloud environment flows are one of the most complex applications where several
(including industrial clouds) which is Geo-distributed across dif- tasks have to be executed in a synchronous manner to achieve
ferent countries and are connected via software-defined network the required quality of service [9]. Communication between
(SDN) [2]. Some of them are designed to utilize the green energy, different tasks makes the matter worse as the energy usage of
provided from local providers.1 So, this provides an opportunity the network also needs to be considered with other constraints.
for the cloud providers to schedule their workloads to the data Giacobbe et al. [10] proposed usage of multiple data center
centers, which utilize more renewable energies. locations to improve energy cost and also minimize environ-
mental impact. However, these solutions are designed for simple
A. Research Question applications that consist of tasks, which can run independently.
Also, the existing solutions do not take the advantages of SDN
How to execute an industrial workflow application across
network to account to further improving the data provision
multiple data centers via private SDWAN? To execute an in-
strategies.
dustrial workflow task on cloud, we need sufficient computing
In order to minimize energy usage while avoiding SLA vi-
resources provided by cloud provider, the codebase of the task
olations of workflow applications, we need new system and
belonged to user and data provided by user or generated from
algorithmic solutions that can consider several factors including
upstream tasks. To this end, the workflow scheduler has to handle
dependency between different tasks with data transfer cost in
the three factors at the same time, i.e., resource provisioning, task
private SDWAN, in addition to energy cost and carbon footprint
provisioning, and data provisioning. Wen et al. [3], [4] developed
associated with application execution. To this end, we propose
the schedulers to run scientific workflow over multiple data
an adaptive genetic algorithm-based mechanism to schedule
centers. However, they do not consider the software-defined data
workflow applications considering application users’ require-
centers that offers more flexible for design new green energy
ments such as deadline and budget. To minimize the carbon
scheduling algorithm. WARM [5] aim of scheduling the tasks
footprint [11], the proposed algorithm selects the schedule
in SDN-based cloud data center while maximizing the revenue
that favors the data centers where green energy being utilized.
of the cloud provider. It optimizes the latency of tasks both
However, as green energy availability varies with time [12],
in network and VM. However, this solution is not suitable for
thus, our algorithm also considers resources from different data
multiple clouds scenarios.
centers. Moreover, from the user perspective execution cost
How to optimize for energy efficiency while avoiding SLA
and minimum execution time is also important; our algorithm
violations? While cloud computing aims to optimize the use
also considers this tradeoff between execution cost, usage of
of hosted ICT resources, the cloud providers do not (yet) have
green energy, and execution time. In particular, our proposed
an effective solution for simultaneously optimizing energy con-
algorithm minimizes execution cost while selecting solutions
sumption and SLAs (e.g., deadline, processing cost) especially
with minimum carbon footprint for overall schedule by using
for big data-driven scientific and industrial workflow application
multiple data centers with more green energy usage (Table III
scheduling in software-defined multicloud environments. One of
highlights the novelty of our work). The contributions of this
the major reasons for this state of affairs is that cloud providers
article are as follows.
operate multiple large data centers distributed across multiple
1) A new SDN-based workflow broker (SDNWB) to de-
locations. Depending on the location of the data center and
ploy industrial workflow tasks across multiple software-
location of the application owner, the scheduling process for
defined data centers while automating the task provision-
cloud applications has to automate cloud data center selection
ing, data provisioning, and resource provisioning.
and, in doing so, ensure that SLA (e.g., application hosting
2) An adaptive genetic algorithm (GA) and associated
cost, application run-time performance) and energy (e.g., total
SDNWB for green scheduling of industrial workflow
electricity bills, sustainability goals) requirements are met at
applications.
the same time, which are often conflicting. When selecting
3) Tradeoff analysis of different factors such as energy cost,
ICT resources (e.g., virtual machines, containers, storage space)
green energy availability, and workflow requirements
from multiple data centers, cloud providers must consider het-
based on real data.
erogeneous set of criteria and complex dependencies across
4) Extensive experimental evaluation to study the feasibility
multiple layers (e.g., application level, data center level), which
of the proposed scheduling algorithm and architecture.
is impossible to resolve manually.
There is a substantial amount of related work address-
ing the improvement of the carbon footprint of data centers II. SYSTEM MODEL
by managing customer workloads at different levels such as
Fig. 1 presents a high level system model with components
of SDNWB utilized by public cloud providers for executing the
1 [Online]. Available:www.google.com/green industrial workflow application.

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.
WEN et al.: RUNNING INDUSTRIAL WORKFLOW APPLICATIONS IN A SOFTWARE-DEFINED MULTI-CLOUD ENVIRONMENT 5647

Fig. 1. High level system model for running industrial workflows.

The system S, in this article, consists of a set of software- where Tl (vmi ) is time required to launch VMi ∈
defined data center owned by a provider such as Ama- {small(VM), medium(VM), large(VM)}, and Te (vi , vmi )
zon EC2. S = {R1 , R2 , . . . , R3 } ∪ D, where Ri , 1 ≤ i ≤ k, = represents the time required execute vi over vmi . Tt (vi , vmi ) is
{vm1 , vm2 , . . . , vmi , vmk } ∪ di . the time required transfer vi ’s input data to vmi . However, if all
In a data center Ri , vmi is a virtual machine for hosting the input data are in the same VM, the transferring is equal to
application services (e.g., a workflow task) and di is the cloud 0, i.e., Tt (vi , vmi ) = 0. The models for calculating these times
specific data repository (such as S3 in the case of Amazon S3 will be detailed in the following section. Based on (1), we can
cloud). identify the total cost of executing a workflow over a set of
A public cloud provider utilizes Workflow Orchestrator that VMs that are allocated over different data centers as follows:
deploys the industrial workflow application across multiple data 
center and the SDN Controller optimizes the data transferring T Cost(λ, VM) = Cost(vi , VMi ). (2)
among the data centers while executing a workflow application, vi ∈λ, VMi ∈VM
such that the application can be executed with minimal execution
cost and carbon footprint. In particular, the users submit their B. Performance Model
industrial workflows with all the executables and information
such as execution requirements, task description, and the desired As abovementioned, the makespan (or performance) of ex-
security requirements to our broker. Workflow Orchestrator is ecuting an industrial workflow application over different VMs
responsible for matching different workflow tasks to different that are deployed over different data centers includes three parts:
data centers based on their electricity prices and usage of green the time the VM (Tl ) is launched, transferring input data to
energy. Based on the planning, the Workflow Orchestrator inter- destination VM (Tt ), and executing the service (Te ). In order
acts with each data center to prepare virtual machines to execute to model this, we assume that the launching time of each type
the workflow tasks in a defined order. SDN Controller manages of VM is the same which is equal to Tl . Next, the network
the data transfer between different tasks during execution by throughput (or bandwidth) is tp, 2tp, and 3tp corresponding to
configuring the flowable of the related SDN-switches. small(VM), medium(VM) and large(VM), noting tpvmi . How-
ever, data transmission rate is not only affected by throughput
of the deployed VMs, but also the geographical location of the
A. Cost Model
VMs.
We assume that each dci has three types of VM: small(VM), Same data center: A workflow includes two tasks v1, v2,
medium(VM), and large(VM). The price of these three where the data are generated from v1 and transferred to v2.
VMs are: 4 ∗ Price(Small(VM)) = 2 ∗ Price(medium(VM)) = They are allocated to two different VMs VM1 and VM2 , and
Price(large(VM)) = 4M . In this article, we assume that the the VMs are deployed over the same data center. Thus, the
VMs are charged according to how long they are used for, throughput between v1 and v2 is tpv1→v2 = min(tpvm1 , tpvm2 )).
therefore, the cost of running small(VM) for four minutes is Thus, the time required to transfer P size of data from v1 to v2
4M . is: Tt (v1, v2) = min(tpvmP ,tpvm ) .
1 2
Based on the assumptions abovementioned, we can have the Moreover, the host VMs of v1 and v2 are allocated in different
cost of executing vi over small(VM) as follows: data centers. For instance, if the vm1 , which is used to host v1 is
Cost(vi , vmi ) = (Tl (small(VM))) + Te (vi , vmi ) deployed on dc1 , and vm2 , which is used to host v2 is deployed
on dc2 . So the time of transferring P data from v1 to v2 is:
+ Tt (vi , vmi )) ∗ Price(vmi ) (1) Tt (v1, v2) = min(tpvmP ,tpvm ) + P ∗ ϕ ∗ H(dc1 , dc2 ), where ϕ is
1 2

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.
5648 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 17, NO. 8, AUGUST 2021

the average latency incurred at each hop and H(dc1 , dc2 ) is the D. Electricity Cost
number of core network hops between two data centers. The electricity cost is caused by data processing and
The execution time of each service depends on the perfor- communication. We compute the electricity cost of data pro-
mance of the host VM, in general, the larger VM has better cessing by multiplying the local electricity price with energy
performance. Therefore, the execution of time vi , which is consumed by the corresponding VM as shown in 8, where
hosted on vmi can be represented as: Te (vi , vmi ). Thus, the Eprice(vmi ) represent electricity price of the data center vmi
makespan of executing λ is: deployed
 
Makespan(V, VM) = Tl (vi , vmi ) + Tt (vi , vj ) comp
Ehost (VM) = [(α ∗ ρ + (1 − α) ∗ ρ ∗ uvmi )
vi ,vj ∈V⊂λ vmi ∈VM
vmi ,vmj ∈VM
i =j
∗ Eprice(vmi )]. (8)
+ Tt (vi , vmi ) (3)
Regarding the electricity cost cause by data exchanging,
where V is a set of the services, which belong to the critical path. we assume that the electricity price is a constant value Ω for
each hop. As the result, the total electricity cost for running a
C. Energy Model workflow with deployment solution λ is formalized in
comp
There are two key operations in the execution of an industrial T Ele(λ, VM) = Ehost (VM)
workflow application where energy is spent: a) in data process- 
ing or computation and b) communication. For computation, in + P comm (vmi , vmj ) ∗ Ω. (9)
vmi ∈VM
general the power consumption of a server varies as a function vmj ∈VM
of its utilization level. If a server is idle, the power saving
mechanism lowers the frequency of CPU and, thus, only a small III. PROPOSED ENERGY-AWARE INDUSTRIAL WORKFLOW
proportion (α) of peak power is utilized. If ρ is peak power SCHEDULING ALGORITHM: GREENGA
consumption and u is utilization of resources, then the power
consumption by a host in the data center will be In this article, we aim to find an optimized solution that
maximizes the proportion of renewable energy and minimizes
comp
Phost = α ∗ ρ + (1 − α) ∗ ρ ∗ u. (4) the real electricity cost under deadline constraints from users.
Therefore, this can be considered a dual objective optimization
A host may run several VMs at a time, thus, this power will problem.
be spent by each VM according to its usage of resources. Even Given the complexity of the problem with multiple objec-
though a host may have several resources such as CPU cores, tive functions and constraints, it is not possible to find the
disk, memory, and other elements, we assume that uvmi indicates solution to the scheduling problem in polynomial time. Thus,
aggregate resources utilized by each VM i hosted on the server. we adapted a well known evolutionary algorithm (i.e., genetic
The power consumption of server (host) will be algorithm), which is known to find the near-optimal solution for
scheduling workflow applications. Previously the genetic algo-
comp

Phost (VM) = (α ∗ ρ + (1 − α) ∗ ρ ∗ uvmi ). (5) rithm has been applied for optimizing makespan of workflow
vmi ∈VM applications; however, its applicability and performance has not
been tested for optimizing different factors such as revenue,
Let one VM i be transmitting data to another VM j. Let energy consumption, and carbon footprint. In our approach,
H(i, j) be the number of hops or routers/switches between these we first converted this multiobjective problem into a single
VMs. For communication, the power consumption depends on objective optimization problem by multiplying each objective;
the bandwidth used by a VM in communication and number of the resultant problem is formulated as
routers those data need to be transmitted from to reach to the
destination VM. If B is the total bandwidth available and ξrouter min (f (λ) ∗ g(λ))
is the power consumed by a router, the power consumption for s.t. f (λ) = T Ele(λ, VM)
the communication will be
g(λ) = (1 − σ(λ))T Energy(λ, VM)
tpv1→v3
P comm
(vmi , vmj ) = ξrouter ∗ H(i, j) ∗ . (6) Makespan(V, VM) ≤ deadline
B
Therefore, the total energy consumption can be formalized as T Cost(λ, VM) ≤ budget (10)
comp f represents the total monetary cost of running a given workflow
T Energy(λ, VM) = Phost (VM)
 with deployment solution λ. Next, g indicates the nongreen
+ P comm (vmi , vmj ). (7) energy consumption and where σ(λ) is a function that calculates
vmi ∈VM the proportion of renewable energy consumption on deployment
vmj ∈VM
solution λ. Finally, deadline and budget are given by users as

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.
WEN et al.: RUNNING INDUSTRIAL WORKFLOW APPLICATIONS IN A SOFTWARE-DEFINED MULTI-CLOUD ENVIRONMENT 5649

the hard constraints of the execution time and the maximum


Algorithm 1: Elitist Prevention.
deployment cost of the given workflow.
s–elitist size; elist– elitist list; pop– all individuals O–task
A. Algorithm Details list; C–cloud list
The aim of GA is to search the solution space and find the best if elist is empty then
value for objective function or fitness function, i.e., combination  ASCsort sorts the pop as ascending order
of renewable energy and energy cost. pop ← ASCsort (pop)
To this end, we need to encode the objective function in (10).  copy the first s number of solutions to elist
Our deployment solution λ is encoded as a vector [si1 ; sj2 :::: skn ], elist ← from pop[0] to pop[s − 1]
where sij means that service or task si is deployed on cloud cj . end
Therefore, the vector can be used to compute the value of the for o in O do
objective function as well as the constraints based on the cost for c in C do
model, energy model, and performance model. After encoding, pop ← combine(elist, pop) pop ← ASCsort (pop)
we can perform the adaptive GA to compute a suboptimal delete s numbers of pop in tail
solution through the following four phases. elist ← from pop[0] to pop[s − 1]
1) Candidate list generation and initializing population: end
Initially, we randomly generate the population, which is end
coded as abovementioned. We select the clouds from the
‘candidate list” that lists the clouds meet the constraints of
workflow task such as deadline to reduce the possibility Algorithm 2: Diversity Protection.
of generating an infeasible solution. pop– all individuals; size–size of pop;
2) Selection: To generate efficient solutions, two factors threshold–threshold of diversity; rate–the current
(selection pressure and population diversity) have to be mutation rate; M ax–maximum mutation rate.
carefully considered. We utilized the elitism method [13]  function removeDup removes the duplication
given in Algorithm 1, to prevent superior individuals from rpop ← removeDup(pop)
getting destroyed in crossover and mutation. Thus, if a sr ← |rpop|
sr
solution is tagged as elitist, it should be part of the new d ← 1 − size
population generation process. This method can ensure if d < threshold then
that our algorithm does not waste time to rediscovery the increase rate
good results that have been already obtained in previous end
generations. For the selection process two methods are else if rate > M ax then
used: fitness function and diversity analysis. The fitness decrease rate
function can transfer the fitness of a coding into a numeric end
representation to select superior solutions. The fitness
function is the same as the objective function defined in
(10), which consists of green energy usage and execu- 1.75
The increasing and decreasing step is computed as |λ|∗|pop|
tion cost. The diversity analysis is important step as it following the suggestion of [16].
influences further steps of crossover and mutation. Low The abovementioned steps are repeated until the termination
diversity of population usually indicates a local extreme, constraints are reached.
which impacts the search for optimal solutions. Time complexity. The proposed method is split into four
3) Crossover and mutation: Crossover aims to exchanges phases: selection, crossover, mutation, and diversity mainte-
the parts of two chromosomes to generate two new chro- nance. The time complexity of the selection phase is O(|P | ×
mosome. In this article, we use one-point crossover [14]. |G| × |O|), where P is the size of population; G is the num-
Mutation can enhance the search range, thus, we devel- ber of generations O represent the total number of tasks of
oped a solution that randomly select a small proportion (as a given workflow. For crossover phases, we need to operate
described in [15]) of chromosomes in current generation each individual in every generation, so the complexity of both
and changing them to new feasible chromosomes for next is O(|P | × |G|). Although mutation does not operate each
generation. We set a very small initial mutation rate as individual in each generation, the mutation rate is decided on
0.015. However, it can be dynamically adjusted by our diversity maintenance phase, which requires to sort solutions.
proposed algorithm, detailed in the following. Thus, the time complexity mutation and diversity maintenance
4) Diversity maintenance: The diversity of a chromosome together is O(|P | × |G|). As a result, the overall time complexity
affects by the mutation rate adjusted by Algorithm 2. First, of the proposed method is O(|P | × |G| × |O|).
we compute the density d of the population by comparing
unique chromosomes (sr) with total number of popula-
B. Termination Method
tion (size). Next, we increase the mutation rate if d is
less than predefined threshold. However, if the mutation If the number of iterations iter is ∞, GA can provide an
rate is higher than its maximum rate, it will be decreased. optimal solution. However, the computation resource is limited.

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.
5650 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 17, NO. 8, AUGUST 2021

for example, it will cost 10.5 US Dollar per hour and


its network bandwidth, CPU, VM size, and RAM are
1000(Kb/s), 70(mips), 1000(GB), and 512(MB), respec-
tively, where the mips describes the CPU powers, i.e.,
millions of instructions per second. Also, we assume that
the Small VM consume 4.5 kw electric energy per hour.
c) Data center networking. Fig. 2 indicates the allocation
of the data center, and where the weight of each edge
represents the number of hops that have been passed for
transferring data from one data center to another. Also
we assume that the network latency between data centers
will add 0.3 extra time for transferring data from one data
center to another on average.
d) Electricity prices. Market electricity price varies by coun-
try and by hour of the day. We use United Kingdom
day-ahead market2 observed over one week to simulate
Fig. 2. Data centers the market electricity cost of each data centre. The prices
are modified based on the cost of energy in the countries
TABLE I in which the data centers are allocated.3 More details can
NUMBER OF TASKS OF EACH WORKFLOW AT EACH SCALE be found in [12].
2) User Configuration:
a) Workflow generation. To evaluate our algorithm, four
common workflow applications are consider: Cyber-
Shake (earthquake risk characterization), Montage (gen-
eration of sky mosaics), LIGO (detection of gravita-
tional waves), and Epigenomics (bioinformatics)4 . Table I
TABLE II shows the number of tasks of each workflow application.
CONFIGURATION OF DIFFERENT TYPE OF VMS
Notably, our simulator only consider input and output data
size and execution time of each task.
b) Deadline generation. In this article, deadline is a hard
constraint, which is defined as the mean of fastest solution
and slowest solution. fastest solution is the deployment
solution, deploying the workflow over most the powerful
VMs in the same data center, and the slowest solution is to
deploy the workflow over the least powerful VMs across
different data centers.
In this article, we terminate our algorithm if there is not further
improvement of the solution in a fixed number of interactions R. B. Experimental Results
We evaluate the performance of our proposed algorithm
IV. PERFORMANCE EVALUATION based on five objectives: electricity cost, energy consumption,
A. Experimental Setup deadline, and proportion of the usage of renewable energy. To
this end, we compare the solutions generated by our proposed
In this article, we use CloudSim [17] to simulate the multiple
algorithm with the best and worst case of each objective. More-
data center environments to investigate our algorithm. CloudSim
over, we evaluate the performance of our adaptive algorithm in
is one of the most widely used simulators in the world and it was
two ways: 1) to optimize only one objective and keep others
evaluated and made comparison with the real-world test-beds in
within predefined constraints; 2) to optimize more than one
many scenarios, including deploying scientific and industrial
objectives by mapping these objectives into a weighted linear
workflow on multiple clouds [4], [18].
function while ensuring other objectives within the predefined
1) Cloud Provider Configurations:
constraints.
a) Data center location and proportion of green energy We
1) Electricity Cost: We develop two versions of GA-based
assume there are six cloud data centers, which are allo-
algorithms namely EleCostGA and GreenGA to optimize the
cated in different areas, as shown in Fig. 2. The proportion
of the usage of green energy of the given area is {0.895,
2 [Online].Available: http://www.nordpoolspot.com accessed 01-06-2015
0.895, 0.934, 0.932, 0.622, 0.071}. 3 [Online].Available: https://en.wikipedia.org/wiki/Electricitypricing
b) VM configuration We assume each data center has three 4 The XML description files of the workflows are available via the Pe-
types of VM: small, medium, and large. Table II shows gasus project: [Online]. Available: https://confluence.pegasus.isi.edu/display/
the configuration of each type of VM. The small VM pegasus/WorkflowGenerator

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.
WEN et al.: RUNNING INDUSTRIAL WORKFLOW APPLICATIONS IN A SOFTWARE-DEFINED MULTI-CLOUD ENVIRONMENT 5651

Fig. 3. Electricity cost for medium size workflow. Fig. 6. Electricity cost versus number of generation.

Fig. 4. Electricity cost for large size workflow.


Fig. 7. Energy consumption for Epigenomics.

lower bound can be obtained by minimizing the electricity cost


of executing each task of the workflow. To this end, we first select
the VMs which can minimize the excutionTime ∗ Consumption
of each task, and then choose the data centers, which have
the cheapest electricity price for deploying the selected VMs
(mainly considering when the selected VMs are started and
Fig. 5. Electricity cost for very large size workflow.
how long the selected VMs will be launched).
Fig. 6 shows the relation between the number of generations of
GA and electricity cost. We first set the result of ten generations
workflow deployment across multiple data centers. EleCostGA as the baseline and then compute how many percentage can be
only considers minimization of the electricity cost, while meet- saved with the increasing generations.
ing other constraints in terms of makespan, energy consumption, 2) Energy Efficiency: In this article, we do not aim to min-
and user budgets. GreenGA shares the same constraints as imize the energy consumption for executing workflow appli-
EleCostGA, but also aims to minimize both electricity cost and cations across a set of data centers. However, our proposed
consumption of nonrenewable energy (i.e., green energy). method allows cloud providers to set a constraint for energy
Figs. 3–5 show the results of applying both algorithms to consumption. To this end, we first provide the lower bound and
different types of workflow. The lower bound represents the upper bound of energy consumption for executing a workflow
lowest electricity cost that be obtained without considering any application over available data centers.
constraints, where Y-axis is the ratio of the results generated by Lower bound and upper bound. The lower bound of energy
EleCostGA
proposed algorithms with the lower bound (i.e., Lower bound or consumption is computed by selecting the most energy effi-
GreenGA
Lower bound ). cient VMs inside the same data center to execute the given
The results illustrate that the cloud providers have to spend workflow. Furthermore, the most energy efficient Vs for a task
comp
more when they optimize the proportion of the usage of renew- Vi is: arg minVMi ∈VM Makespan(vi , VMi ) ∗ Phost (VMi ). The
able energy. However, with the increasing size of the workflows, upper bound includes both energy consumption for VM and
the differences of electricity cost for GreenGA and ElecCostGA transferring data over hops. Therefore, we choose the type of
become smaller. VM that consumes the most energy and the data center, which
Lower bound is generated by a greedy-based method, which contains the most hops for data transmission.
is briefly illustrated as follows. Figs. 7–10 indicate the energy consumption of executing the
Lower bound: The electricity cost is calculated by given workflows over different data centers. The Y-axis indicates
excutionTime ∗ electricityPrice ∗ Consumption. In this article, the ratio of the energy consumption of different solution with the
we do not have the electricity price of each hop when data is lower bound of energy consumption. Although both GreenGA
transferred from one data center to another. Also, in practice this and ElectCostGA are not designed to minimize energy consump-
cost is considered by cloud providers. Therefore, the electricity tion, they can guarantee energy consumption that will meet the
cost of execution workflow over multiple data centers, which predefined constraints, while minimizing the electricity cost.
can be simply computed by adding the electricity cost of each 3) Green Energy Efficiency: This section shows the
data center where the target workflow is allocated. So the proportion of the usage of renewable energy of each solution,

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.
5652 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 17, NO. 8, AUGUST 2021

Fig. 13. Proportion of the usage of renewable energy for very large
Fig. 8. Energy consumption for CyberShake. size workflow.

Fig. 9. Energy consumption for Montage. Fig. 14. Presentation of time save comparing with deadline.

The local optimal has a very high probability of causing the ter-
mination of the program by meeting the predefined termination
condition, as described in Section III-B.
4) Performance/Makespan: Deadline is a hard constraint,
which means that the execution time of each submitted workflow
must be equal to or less than the specific deadline. Fig. 14
shows the time saving of the generated solutions, comparing
Fig. 10. Energy consumption for LIGO. with user given deadlines, where the Y-axis represents the ratio
of saving time and the given deadline( savedTime
deadLine ). The X-axis is the
type of workflow, where “M EP”, “L EP”, and “VL EP” are the
medium, large and very large size of “Epigenomics” workflows.
The results show that all generated solutions can guarantee the
given deadline.

V. ENERGY EFFICIENCY IN SDWAN


To evaluate the energy efficiency and flexibility of the SDN
Fig. 11. Proportion of the usage of renewable energy for medium size for data transmission in WAN, we conduct the following experi-
workflow.
ments by using IoTSim-SDWAN [19], which simulates multiple
cloud data center connected via traditional WAN and SDWAN
environments. It provides the facilities to evaluate energy con-
sumption of networks in both traditional WAN and SDWAN
environments.
Experiment configuration. We consider two types of network
topology: small scale WAN and large scale WAN. The number of
hops in the large scale WAN are twice as in small scale WAN. Re-
garding the data centers and workflows, we keep configuration
Fig. 12. Proportion of the usage of renewable energy for large size similar to the experiments performed in the previous sections. In
workflow. the SDN-enabled WAN environment, we use the shortest path
to transfer the data between two data centers. However, in the
traditional WAN environment, we randomly select a path for the
which is generated by different algorithms. Figs. 11–13 show data transmission.
that the solutions, which are generated by GreenGA use more We report the experimental results as the ratio of the en-
green energy than those generated by EleCostGA. However, the ergy consumption of SDN-based solution and non-SDN-based
difference reduces with the increasing size of workflows. The solution, i.e., SDN/Non−SDN. Figs. 15–17 show that SDN-
larger size of workflow corresponding to more deployment so- enabled environment consume less energy than traditional WAN
lutions, the GA-based methods are easier to reach local optimal. environment on transferring data across multiple clouds. The

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.
WEN et al.: RUNNING INDUSTRIAL WORKFLOW APPLICATIONS IN A SOFTWARE-DEFINED MULTI-CLOUD ENVIRONMENT 5653

Fig. 15. Energy consumption for transferring medium size workflow Fig. 19. Energy consumption for transferring large size workflow data
data in small scale WAN. in large scale WAN.

Fig. 16. Energy consumption for transferring large size workflow data
in small scale WAN. Fig. 20. Energy consumption for transferring very large size workflow
data in large scale WAN.

to non-SDWAN solution, SDWAN solution consumes 73.8%


less energy, as shown in Figs. 18 –20. Similar to the small
scale network typology case, this advantage degrades with the
increase size of the workflow. However, this degradation is very
slow. From medium size workflow to very large size workflow,
the energy saving is reduced from 77 to 70.75%.
Fig. 17. Energy consumption for transferring very large size workflow
data in small scale WAN.
VI. DISCUSSION
The evaluations in the simulated environment show that our
proposed algorithm outperforms the comparison methods in
terms of green-energy efficiency and network efficiency. In order
to conduct the evaluations in the real-world environments, the
following challenges need to be considered: 1) multidata-centre
SDWAN network, which are currently owned and by managed
by cloud providers (e.g., Google, Facebook), which is restricted
or no access to the network control plane of the data centres;
Fig. 18. Energy consumption for transferring medium size workflow 2) monitoring the energy consumption, while running workflow
data in large scale WAN. applications in data centers, is not currently supported by the
proprietary cloud monitoring tools (e.g., AWS Cloudwatch); 3)
the use of real data centre servers and networks for benchmark-
SDWAN solution can save energy up to 32.5%, compared to ing energy efficiency and workflow application performance is
non-SDWAN solution. However, the advantage reduces with the often constrained by their heterogeneity (e.g., hypervisor type,
increase size of workflow. For example, the energy saving of network type). To overcome the abovementioned challenges, we
SDWAN solution is around 21.5%. This is because the small propose the potential solutions as follows. One can utilize the
scale network topology has less option of routing paths in network trace data (e.g., from B4 project [2]) to parameterize
which are very similar number of hops. The increase size of the a microbenchmark, which is emulated by a lab level test-bed.
workflow, the more shortest paths are selected by the random Regarding to the large scale experiments, the environments
solution. can be simulated using the real-world network trace data to
When the network topology becomes more complicated, the parameterize our simulators. Similarly the energy consumption
advantage of the SDWAN become more significant. Compared of different VMs, can be modeled based on the data center cluster

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.
5654 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 17, NO. 8, AUGUST 2021

TABLE III
COMPARE WITH OTHER RELATED WORK

traces such as Microsoft Azure Dataset.5 and Alibaba Cluster carbon footprint. For instance, Aksanli et al. [31] proposed a
Data.6 database scheduling strategy, which predicts the green energy
availability reducing rescheduling of jobs. Goiri et al. [32] also
VII. RELATED WORK predicted green energy availability to schedule map reduce jobs.
Deng et al. [33] proposed an online algorithm to minimize
A. Cost and Performance-Based Tasks Scheduling the operational cost of data centers by using mutiple energy
To improve the performance of run a workflow application, resources. Kaushik et al. [34] proposed an energy saving cloud
Mao and Humphrey [23] introduced an autoscaling method to storage solution by dividing the storage structures in different
allocate workflow tasks to a set of VMs to meet the deadline zones based on power characteristics. Most of the abovemen-
constraints. tioned works focus on single data centers. Garg et al. [11]
Malawski et al. [24] considered the monetary cost for run- proposed a green cloud framework, which utilizes multiple
ning workflow over cloud. They developed the algorithm to clouds to improve energy consumption. However, the work does
overcome the tradeoff between makespan and financial cost. not give a mechanism to maximize the usage of green energy.
An new algorithm was proposed in [17] that utilizes the idle Kiani et al. [35] shared similar aims to ours, i.e., to increase
time of provisioned resources and surplus budget to scale up the green energy usage and cutting the cost of electricity across
execution of workflow application to catch of chance of meeting multiple data centers. They utilize the concept of decomposing
deadlines. There are also some algorithms [25] considering the workload into green and brown, however, they focus on a
security for running the workflow applications on cloud. simple workload consisting of individual tasks. Giacobbe et al.
In multiple clouds case, Yuan et al. [20] proposed an effective [10], [36] introduced approaches for migrating virtual machines
algorithm to scheduling the tasks across public cloud and private, among more distributed federated clouds where costs ([36]) and
while minimizing the cost and ensuring the delay within a environmental impact (using renewable energy along with the
boundary. However, this method is not suitable for scientific or selection of data centers with the lower PUE [10]) are taken into
industrial workflow applications. PANDA [26] was developed to account. Yuan et al. [7] proposed an algorithm that focus on
schedule workflow across private cloud and public while finding ensuring the deadline of executing tasks on green data centers.
the “best” tradeoff between performance and cost. Fard et al. These works consider simple computation resources at the level
[27] solved the tradeoff between monetary cost and completion of generic VMs, no further investigation from this perspective
time via a Pareto-optimal-based algorithm. However, none of is performed.
them considered security and cloud availability change. Security In summary, to the best of our knowledge, our proposed
was consider in [28] that introduced a static method to optimize GA-based scheduling is the first work that focuses on increas-
the deployment of a workflow application on multiple cloud by ing usage of green energy and minimizing electricity cost for
considering security, makespan, and monetary cost. workflow application execution in multiple cloud data centers.
We also consider variable electricity cost across different cloud
B. Green Scheduling data centers. Table III provides a comparison between our work
and the state of the arts.
Various existing proposals [21], [22] suggested different so-
lutions with respect to green scheduling in cloud computing.
Catena and Tonellotto [29] proposed a predictive energy saving VIII. CONCLUSION
online scheduling algorithm to reduce the energy consumption
of distributed web search engines. Li et al. [30] proposed an In recent years, several works were attempted to develop
energy model for edge and core cloud, to estimate the energy mechanisms to be able to efficiently execute industrial workflow
consumption based on the number of IoT devices and the desired applications in software-defined cloud environments with min-
application QoS. There have been many works [6], [31]–[33] imal cost. However, over the years, as usage of cloud increases,
that have proposed techniques to improve cloud data center concern about its carbon footprint has also became a critical
efficiency in terms of electricity usage and decreasing their topic of research. In this context, this article proposed an adaptive
GA-based industrial workflow scheduling algorithm that utilizes
5 [Online]. Available: https://github.com/Azure/AzurePublicDataset multiple software-defined cloud data center resources not only to
6 [Online]. Available: https://github.com/alibaba/clusterdata improve green energy usage but also keep the cost of execution

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.
WEN et al.: RUNNING INDUSTRIAL WORKFLOW APPLICATIONS IN A SOFTWARE-DEFINED MULTI-CLOUD ENVIRONMENT 5655

to a minimum. The performance of our algorithm was evaluated [18] M. Malawski, G. Juve, E. Deelman, and J. Nabrzyski, “Algorithms for cost-
using real industrial workflow workload with different sizes and deadline-constrained provisioning for scientific workflow ensembles
in IAAS clouds,” Future Gener. Comput. Syst., vol. 48, pp. 1–18, 2015.
under various configurations of virtual machines. We compared [19] K. Alwasel et al., “IOTSIM-SDWAN: A simulation framework for in-
our algorithm with another GA-base algorithm that just opti- terconnecting distributed datacenters over software-defined wide area
mizes electricity cost. The experimental results clearly show network (SD-WAN),” J. Parallel Distrib. Comput., vol. 143, pp. 17–35,
2020.
that our proposed algorithm favors more green energy usage [20] H. Yuan, J. Bi, W. Tan, M. Zhou, B. H. Li, and J. Li, “TTSA: An effective
with expenditure similar to the base algorithm for large and very scheduling approach for delay bounded tasks in hybrid clouds,” IEEE
large size workflows. For smaller size workflows, with 10–20% Trans. Cybern., vol. 47, no. 11, pp. 3658–3668, Nov. 2017.
[21] G. S. Aujla and N. Kumar, “SDN-based energy management scheme for
increase in electricity cost, our algorithm can generate a schedule sustainability of data centers: An analysis on renewable energy sources
that uses almost 200% times more green energy. and electric vehicles participation,” J. Parallel Distrib. Comput., vol. 117,
In future, we will evaluate our proposed algorithm in real pp. 228–245, 2018.
[22] G. S. Aujla and N. Kumar, “Mensus: An efficient scheme for energy
cloud environments and integrate with workflow engines. We management with sustainability of cloud data centers in edge–cloud envi-
will also work on improving the algorithm by considering ronment,” Future Gener. Comput. Syst., vol. 86, pp. 1279–1300, 2018.
dynamic changes in green energy availability. [23] M. Mao and M. Humphrey, “Auto-scaling to minimize cost and meet ap-
plication deadlines in cloud workflows,” in Proc. Int. Conf. High Perform.
Comput., Netw., Storage Anal., Nov. 2011, pp. 1–12.
[24] M. Malawski, G. Juve, E. Deelman, and J. Nabrzyski, “Cost- and
REFERENCES deadline-constrained provisioning for scientific workflow ensembles in
IaaS clouds,” in Proc. Int. Conf. High Perform. Comput., Netw., Storage
[1] J. Koomey et al., “Growth in data center electricity use 2005 to 2010,” Anal., 2012, pp. 22–1–22–11.
Rep. Anal. Press, Completed Request New York Times, vol. 9, no. 2011, [25] J. Mace, A. van Moorsel, and P. Watson, “The case for dynamic security
p. 161, 2011. solutions in public cloud workflow deployments,” in Proc. IEEE/IFIP 41st
[2] S. Jain et al., “B4: Experience with a globally-deployed software defined Int. Conf. Dependable Syst. Netw. Workshops, Jun. 2011, pp. 111–116.
wan,” ACM SIGCOMM Comput. Commun. Rev., vol. 43, no. 4, pp. 3–14, [26] M. R. H. Farahabady, Y. C. Lee, and A. Y. Zomaya, “Pareto-optimal
2013. cloud bursting,” IEEE Trans. Parallel Distrib. Syst., vol. 25, no. 10,
[3] Z. Wen, J. Cała, P. Watson, and A. Romanovsky, “Cost effective, reliable pp. 2670–2682, Oct. 2013.
and secure workflow deployment over federated clouds,” IEEE Trans. Serv. [27] H. M. Fard, R. Prodan, and T. Fahringer, “A truthful dynamic workflow
Comput., vol. 10, no. 6, pp. 929–941, Nov./Dec. 2016. scheduling mechanism for commercial multicloud environments,” IEEE
[4] Z. Wen, R. Qasha, Z. Li, R. Ranjan, P. Watson, and A. Romanovsky, Trans. Parallel Distrib. Syst., vol. 24, no. 6, pp. 1203–1212, Jun. 2013.
“Dynamically partitioning workflow over federated clouds for optimising [28] F. Jrad, J. Tao, I. Brandic, and A. Streit, “{SLA} enactment for large-
the monetary cost and handling run-time failures,” IEEE Trans. Cloud scale healthcare workflows on multi-cloud,” Future Gener. Comput. Syst.,
Comput., vol. 8, no. 4, pp. 1093–1107, Oct.–Dec. 2020. vol. 43/44, pp. 135–148, 2015.
[5] H. Yuan, J. Bi, M. Zhou, and K. Sedraoui, “Warm: Workload-aware multi- [29] M. Catena and N. Tonellotto, “Energy-efficient query processing in
application task scheduling for revenue maximization in SDN-based cloud web search engines,” IEEE Trans. Knowl. Data Eng., vol. 29, no. 7,
data center,” IEEE Access, vol. 6, pp. 645–657, 2017. pp. 1412–1425, Jul. 2017.
[6] J. Wu, S. Guo, J. Li, and D. Zeng, “Big data meet green chal- [30] Y. Li, A.-C. Orgerie, I. Rodero, B. L. Amersho, M. Parashar, and J.-M.
lenges: Greening big data,” IEEE Syst. J., vol. 10, no. 3, pp. 873–887, Menaud, “End-to-end energy models for edge cloud-based IoT platforms:
Sep. 2016. Application to data stream analysis in IoT,” Future Gener. Comput. Syst.,
[7] H. Yuan, J. Bi, M. Zhou, and A. C. Ammari, “Time-aware multi-application vol. 87, pp. 667–678, 2018.
task scheduling with guaranteed delay constraints in green data center,” [31] B. Aksanli, J. Venkatesh, L. Zhang, and T. Rosing, “Utilizing green energy
IEEE Trans. Autom. Sci. Eng., vol. 15, no. 3, pp. 1138–1151, Jul. 2018. prediction to schedule mixed batch and service jobs in data centers,” ACM
[8] J. Bi et al., “Application-aware dynamic fine-grained resource provision- SIGOPS Oper. Syst. Rev., vol. 45, no. 3, pp. 53–57, 2012.
ing in a virtualized cloud data center,” IEEE Trans. Autom. Sci. Eng., [32] Í. Goiri, K. Le, T. D. Nguyen, J. Guitart, J. Torres, and R. Bianchini,
vol. 14, no. 2, pp. 1172–1184, Apr. 2017. “Greenhadoop: Leveraging green energy in data-processing frameworks,”
[9] K. Vahi et al., “A case study into using common real-time workflow mon- in Proc. 7th ACM Eur. Conf. Comput. Syst., ACM, 2012, pp. 57–70.
itoring infrastructure for scientific workflows,” J. Grid Comput., vol. 11, [33] W. Deng, F. Liu, H. Jin, C. Wu, and X. Liu, “Multigreen: Cost-minimizing
no. 3, pp. 381–406, 2013. multi-source datacenter power supply with online control,” in Proc. 4th
[10] M. Giacobbe, A. Celesti, M. Fazio, M. Villari, and A. Puliafito, “Evaluating Int. Conf. Future Energy Syst., ACM, 2013, pp. 149–160.
a cloud federation ecosystem to reduce carbon footprint by moving com- [34] R. T. Kaushik, L. Cherkasova, R. Campbell, and K. Nahrstedt, “Lightning:
putational resources,” in Proc. IEEE Symp. Comput. Commun., Jul. 2015, Self-adaptive, energy-conserving, multi-zoned, commodity green cloud
pp. 99–104. storage system,” in Proc. 19th ACM Int. Symp. High Perform. Distrib.
[11] S. K. Garg, C. S. Yeo, and R. Buyya, “Green cloud framework for improv- Comput., ACM, 2010, pp. 332–335.
ing carbon efficiency of clouds,” in Proc. Eur. Conf. Parallel Process., [35] A. Kiani and N. Ansari, “Toward low-cost workload distribution for inte-
Springer, 2011, pp. 491–502. grated green data centers,” IEEE Commun. Lett., vol. 19, no. 1, pp. 26–29,
[12] R. Blanco, M. Catena, and N. Tonellotto, “Exploiting green energy to Jan. 2015.
reduce the operational costs of multi-center web search engines,” in Proc. [36] M. Giacobbe, A. Celesti, M. Fazio, M. Villari, and A. Puliafito, “An
25th Int. Conf. World Wide Web, 2016, pp. 1237–1247. approach to reduce energy costs through virtual machine migrations in
[13] D. Bhandari, C. Murthy, and S. K. Pal, “Genetic algorithm with elitist cloud federation,” in Proc. IEEE Symp. Comput. Commun., Jul. 2015,
model and its convergence,” Int. J. Pattern Recognit. Artif. Intell., vol. 10, pp. 782–787.
no. 6, pp. 731–747, 1996.
[14] R. Poli and W. B. Langdon, “Genetic programming with one-point Zhenyu Wen (Member, IEEE) received the
crossover and point mutation,” in Proc. Soft Comput. Eng. Des. Manuf., M.Sc. and Ph.D. degrees in computer science
Springer-Verlag, 1997, pp. 180–189. from Newcastle University, Newcastle Upon
[15] D. Goldberg, Genetic Algorithms in Search Optimization and Machine Tyne, U.K., in 2011 and 2016, respectively.
Learning. Boston, MA, USA: Addison-Wesley, 1988, pp. 62–66. He is currently a Postdoc Researcher with
[16] J. Smith and T. C. Fogarty, “Self adaptation of mutation rates in a steady the School of Computing, Newcastle Univer-
state genetic algorithm,” in Proc. IEEE Int. Conf. Evol. Comput., 1996, sity, Newcastle upon Tyne, U.K. His current re-
pp. 318–323. search interests include multi-objects optimisa-
[17] R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. F. De Rose, and tion, crowdsources, AI, and cloud computing.
R. Buyya, “CloudSim: A toolkit for modeling and simulation of For his contributions to the area of scalable data
cloud computing environments and evaluation of resource provision- management for the Internet-of-Things, he was
ing algorithms,” Softw. Pract. Experience, vol. 41, no. 1, pp. 23–50, awarded the IEEE TCSC Award for Excellence in Scalable Computing
Jan. 2011. (Early Career Researchers) in 2020.

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.
5656 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 17, NO. 8, AUGUST 2021

Saurabh Garg received the 5-year Integrated Schahram Dustdar (Fellow, IEEE) received the
Master of Technology in mathematics and com- M.Sc. and PhD. degrees in business informat-
puting from the Indian Institute of Technology ics (Wirtschaftsinformatik) from the University of
(IIT) Delhi, India, in 2006, and the Ph.D. degree Linz, Austria, in 1990 and 1992, respectively.
in computer science from the University of Mel- He is Full Professor of computer science and
bourne, Melbourne, Australia, in 2010. head of the Distributed Systems Group, the TU
He is one of the few Ph.D. students who Wien, Austria. From 2004 to 2010, he was also
completed in less than three years. His doctoral Honorary Professor of Information Systems with
thesis focused on devising novel and market- the Department of Computing Science, the Uni-
oriented meta-scheduling mechanisms for dis- versity of Groningen (RuG), Netherlands. From
tributed systems under conditions of concurrent December 2016 until January 2017, he was a
and conflicting resource demand. He has gained about three years of Visiting Professor with the University of Sevilla, Spain and from January
experience in the Industrial Research while working at IBM Research until June 2017 he was a Visiting Professor with UC Berkeley, USA.
Australia and India. Dr. Dustdar is the Editor-in-Chief of Computing (Springer) and an
Associate Editor for the IEEE TRANSACTIONS ON CLOUD COMPUTING,
IEEE TRANSACTIONS ON SERVICES COMPUTING, ACM TRANSACTIONS ON
THE WEB, and ACM TRANSACTIONS ON INTERNET TECHNOLOGY and on
the editorial board of IEEE INTERNET COMPUTING AND IEEE COMPUTER.
Gagangeet Singh Aujla (Senior Member,
IEEE) received the bachelor’s and master’s de-
grees in computer science from the Punjab
Technical University, Kapurthala, India, in 2003
and 2013, respectively, and the Ph.D. degree
in computer science from the Thapar University,
Patiala, India, in 2018.
Albert Y. Zomaya (Fellow, IEEE) received
He is an Assistant Professor of Computer
the B.S. degree in electrical engineering from
Science with Durham University, Durham, U.K.
Kuwait University, Kuwait, in 1987, and the Ph.D.
Before this, he was a Postdoc Research Asso-
degree in control engineering from Sheffield
ciate with Newcastle University, Newcastle upon
University, U.K., in 1999.
Tyne, U.K., a Research Associate with Thapar University, India, a Vis-
He is currently the Chair Professor of High
iting Researcher with the University of Klagenfurt, Klagenfurt, Austria,
Performance Computing and Networking with
and on various other academic positions. For his contributions to the
the School of Computer Science, University of
area of scalable and sustainable computing, he was awarded the 2018
Sydney. He is also the Director of the Centre for
IEEE TCSC Outstanding Ph.D. Dissertation Award of Excellence.
Distributed and High Performance Computing,
which was established in late 2009. He was an
Australian Research Council Professorial Fellow during 2010–2014 and
held the CISCO Systems Chair Professor of Internet working during the
period 2002–2007 and also was the Head of school for 2006–2007 in
Khaled Alwasel received the B.S. and M.S. the same school.
degrees in information technology from Indiana
University-Purdue University Indianapolis, Indi-
anapolis, IN, USA and from Florida Interna-
tional University, Miami, FL, USA, in 2014 and
2015 respectively. He is currently working to-
ward the Ph.D. degree in computer science with
the School of Computing Science, Newcastle Rajiv Ranjan (Senior Member, IEEE) received
University, Newcastle upon Tyne, U.K. the Ph.D. degree in computer science and soft-
His research interest include areas of SDN, ware engineering from the Department of Com-
big data, IoT, and cloud computing. puter Science and Software Engineering, Uni-
versity of Melbourne, Melbourne, Australia, in
2009.
He is a Full Professor in computing science
with Newcastle University, U.K. Before moving
to Newcastle University, he was Julius Fellow
(2013–2015), Senior Research Scientist and
Project Leader in the Digital Productivity and
Deepak Puthal (Senior Member, IEEE) Services Flagship of Commonwealth Scientific and Industrial Research
received the Ph.D. degree in computer science Organization (CSIRO C Australian Government’s Premier Research
and information systems from the University of Agency). He was a Senior Research Associate (Lecturer level B) with
Technology Sydney, Ultimo, Australia, in 2017. the School of Computer Science and Engineering, University of New
He is an Assistant Professor with the School South Wales.
of Computing, Newcastle University, Newcastle
upon Tyne, U.K., and an Honorary Fellow with
the Faculty of Engineering and Information
Technology, University of Technology Sydney,
Australia. Before this position, he was working
as a Lecturer with the University of Technology
Sydney, and as a Graduate Research Fellow with Data 61, CSIRO,
Australia.
Dr. Puthal is the recipient of the 2019 Best IEEE ComSoc Young Re-
searcher Award (For EMEA Region), 2018 IEEE Technical Committee
on Scalable Computing Award for Excellence in Scalable Computing
(Early Career Researcher), and 2017 IEEE Distinguished Doctoral
Dissertation Award.

Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 04:20:03 UTC from IEEE Xplore. Restrictions apply.

You might also like