15C009 - 15C018 - 15C054 - Carbon Footprint Minimization Through Cost-Aware Load Balancing Model in Cloud Computing
Submitted by
AJITHA S (15C009)
APARNA A (15C018)
KEERTHANAA K (15C054)
of
BACHELOR OF ENGINEERING
in
BONAFIDE CERTIFICATE
SIGNATURE SIGNATURE
Abstract
Balancing the incoming data traffic across the servers is termed as load
balancing. Load balancing depends on different factors, which include balancing
the loads and the carbon footprint of the virtual machines at the data center.
In this paper, a cost-aware ACO based load balancing model is proposed for a
dynamic environment. This model enables balancing the load across the
virtual machines in the data center and evaluating the overall performance.
Table of Contents
Abstract
List of Tables
List of Abbreviations
1. Introduction
2. Literature Survey
3. Requirements
4. Proposed Method
4.1 UserBase
4.2.1 Closest Data Center (CDC)
5. Implementation
6.2 Results
8. References
9. Publications
List of Tables
1. Facebook subscribers statistics as of June 30, 2017
9. Response time of various algorithms for CDC and ORT under S1: One data center with 25 VMs, located at region 0
List of Figures
1. CACO load balancing model
3. Workspace Creation
6. UserBase configuration
8. Advanced configuration
9. Internet configurations
16. Average Processing time of various algorithms for ORT under S2: Two data centers with 25 VMs each, located at region 0 and 2 respectively
24. Average Response time of various algorithms for ORT under S2: Two data centers with 25 VMs each, located at region 0 and 2 respectively
List of Abbreviations
CPU Central Processing Unit
VM Virtual Machine
DC Data Centre
CACO Cost-Aware Ant Colony Optimization
BW Bandwidth
MB Megabyte
GB Gigabyte
TB Terabyte
RR Round Robin
LA Location Aware
HB Honeybee
ms Milliseconds
kW Kilowatt
kWh Kilowatt-hour
CHAPTER 1
INTRODUCTION
For the betterment of our environment, the idea of green computing has
emerged. Green computing is a technique of efficiently utilizing IT resources by
applying better policies and algorithms. One of the important key concepts of
green computing is virtualization (Jain & Choudhary, 2016). Virtualization
generally refers to the abstraction of computing resources such as CPU,
memory, network, storage and database-related applications from the clients
utilizing the services provisioned by the cloud providers (Kumar & Charu, 2015). It
enables a multi-tenancy model by providing a scalable, shared resource platform for
all tenants (Xing & Zhan, 2012).
High energy consumption results in a considerable amount of carbon dioxide
emission into the environment. In the context of cloud computing, this issue
arises from the inefficient usage of data centers. A data center's efficiency
can be increased by applying appropriate load balancing algorithms. By balancing
the workload across all the nodes of a cloud, overheating is reduced, which in turn
reduces the energy consumption (Uddin, Darabidarabkhani, Shah, & Memon). As
the energy consumption increases, the carbon footprint increases. Therefore, by
reducing the energy consumption, the aim of green computing is achieved (Thakur
& Chaurasia, 2016).
In this paper, an ACO based carbon-aware load balancing model is proposed for
improving the allocation of tasks to the virtual machines. This model also provides
a comparison with various load balancing algorithms. The objectives of the
proposed work are 1) to reduce the processing time, 2) to lessen the response time,
3) to lower the cost, 4) to bring down the power consumption, and 5) to minimize
the carbon footprint. The load balancing algorithms are analyzed under different
service broker policies in order to provide a better performance analysis.
CHAPTER 2
LITERATURE SURVEY
(Naqvi, Javaid, Butt, Kamal, Hamza, & Kashif, 2018) have proposed a bio-
inspired algorithm called Ant Colony Optimization for load balancing. The ant
colony optimization algorithm is a swarm-based metaheuristic. It works on
the mechanism of real ants using pheromones in order to explore their paths. In the
same way, the allocation path of the cloudlets is identified based on the minimal
path cost in a probabilistic manner.
(Meftah, Youssef, & Zakariah, 2018) aims to illustrate how the load
balancing algorithms available by default in the Cloud Analyst tool work
under various service broker policies. It is inferred that Throttled is better than
Round Robin and Equally Spread by analysing performance parameters such as
average processing time and average response time with respect to different
scenarios.
(Rong, Zhang, Xiao, Li, & Hu, 2016) focuses on technologies for energy
conservation in the data center environment. The energy measure is calculated based
on power usage effectiveness, which is the ratio of facility load to IT load. This paper
analyses the energy consumption under different scenarios for various time
periods.
(Patel & Patel, 2015) have implemented and compared the service broker
policies in Cloud Analyst, which is built over CloudSim, in both static and dynamic
approaches. The paper concludes that a broker policy should be proposed based on
cost and performance, which includes processing time and response time.
(Kishor, 2014) has proposed a new service broker policy. The main role of
any service broker policy is to select a data center for processing a request within a
region. The algorithm is designed in such a way that it selects a data center based
on the underlying infrastructure. Based on the number of computing elements
available and the resource handling capacity of the data centers, a proportion weight
is assigned to each data center. The data center with the greatest proportion
weight is selected for processing the user requests.
CHAPTER 3
REQUIREMENTS
Hardware Support
Architecture : i5
Software requirements
Version : 1607
(ii) CloudAnalyst
Technologies used
Java
XML
CHAPTER 4
PROPOSED METHOD
The ACO based carbon-aware model identifies a solution for efficient load
balancing by considering factors such as processing time and response time in order
to reduce the carbon footprint in the cloud computing environment. The proposed
algorithm is based on the swarm-based Ant Colony Optimization metaheuristic,
which uses path cost and a threshold. The major components are 1) User Base,
2) Data center selector, 3) VM selector and allocator, and 4) Efficiency analyzer.
UserBase
The user base is a collection of users grouped under a region. The chosen
configuration includes 6 regions across the world (Limbani, 2012). A single user
base may consist of thousands of users, and each user, in turn, may request
thousands of tasks. The user bases generate traffic as in real time. The increasing
number of users determines the efficiency of the simulation. The number of
simultaneous users in each user base can be bundled as a single unit by the grouping
factor.
Data Center Selector
The data center selector maps the data center with the traffic-generating user base
depending on the service broker policy. The service broker policies are Closest
Data Center (CDC) and Optimise Response Time (ORT).
Closest Data Center (CDC)
The user bases are routed to the data center which has the minimum network
latency, irrespective of network bandwidth (Patel & Patel, 2015). If there are two
data centers within the same region proximity, one of them is randomly chosen.
This policy calls the data center selector to identify the closest data center. This
default broker policy is advantageous when the requests are being processed by
a data center within the same location.
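The CDC selection rule above can be sketched as follows. This is a minimal illustration, not CloudAnalyst's actual routing code; the latency map is a hypothetical stand-in for its internal region-latency tables.

```java
import java.util.*;

// Sketch of the Closest Data Center policy: pick the data center with the
// lowest network latency; break ties between equally close centers at random.
public class ClosestDataCenter {
    public static String select(Map<String, Integer> latencyMs, Random rnd) {
        int best = Collections.min(latencyMs.values());
        List<String> closest = new ArrayList<>();
        for (Map.Entry<String, Integer> e : latencyMs.entrySet())
            if (e.getValue() == best) closest.add(e.getKey()); // all centers tied at the minimum
        return closest.get(rnd.nextInt(closest.size()));       // random choice among the ties
    }

    public static void main(String[] args) {
        Map<String, Integer> latency = new LinkedHashMap<>();
        latency.put("DC1", 50);
        latency.put("DC2", 30);
        latency.put("DC3", 30);
        // DC2 and DC3 are equally close, so either may be returned
        System.out.println("Selected: " + select(latency, new Random()));
    }
}
```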
Optimise Response Time (ORT)
This service broker policy uses the same methodology as the Closest Data Center
policy to select the data center as per the network latency (Kishor, 2014). In
addition, it calculates the current response time and checks whether the estimated
best response time is the same as that of the closest data center. Otherwise, the data
center with the least response time and the one within the closest proximity are
chosen evenly, with an occurrence ratio of 50:50.
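The 50:50 tie-break described above can be expressed as a small decision function. This is an illustrative sketch under the stated policy description, with hypothetical data center names and response-time values:

```java
import java.util.Random;

// Sketch of the Optimise Response Time tie-break: if the fastest data center
// is not also the closest one, choose between the two with a 50:50 ratio.
public class OptimiseResponseTime {
    public static String select(String closestDc, double closestRtMs,
                                String fastestDc, double fastestRtMs, Random rnd) {
        if (fastestRtMs >= closestRtMs) return closestDc;  // closest is already the fastest
        return rnd.nextBoolean() ? fastestDc : closestDc;  // 50:50 between the two candidates
    }

    public static void main(String[] args) {
        // DC-Region2 responds faster, so each of the two is picked about half the time
        String pick = select("DC-Region0", 420.0, "DC-Region2", 310.0, new Random());
        System.out.println("Routed to: " + pick);
    }
}
```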
The VM selector and allocator uses the VM load balancer to allocate the cloudlets
(user-requested tasks) to the virtual machines. The existing load balancing policies in
Cloud Analyst are Round Robin, Equally Spread Current Execution Load, Throttled,
Honeybee and Ant Colony Optimization.
The proposed CACO algorithm follows the approach of the swarm-based Ant
Colony Optimization algorithm. Real ants deposit a biochemical substance known as
a pheromone in order to explore their paths. Analogously, the cloudlets are treated
as ants. Each cloudlet maintains a pheromone table in which the path cost is
updated. Initially, each cloudlet chooses a virtual machine randomly. The next
available virtual machine is identified based on the score function and the workload.
After a cloudlet completes its tour, the pheromone table is updated. Once all the
cloudlets have completed their trips, the makespan of the cloudlets is calculated and
the optimal solution is retained. When the maximum number of iterations is
reached, the iteration stops and the best solution is yielded.
CACO Load Balancing Algorithm
Input: List of ants (cloudlets) and VMs
Steps:
While the maximum number of iterations is not reached
    For each ant
        Choose a virtual machine based on the score function and workload
        Update the pheromone table with the path cost
    End for
    If all ants have completed their tours, calculate the makespan and retain the optimal solution
End while
Output: Best allocation of cloudlets to VMs
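The allocation and pheromone-update steps above can be sketched as follows. The score function used here (path cost plus current load, discounted by accumulated pheromone) is a hypothetical simplification for illustration, not the exact function used in the report:

```java
// Minimal sketch of one CACO allocation step: score each VM, pick the best,
// then reinforce the chosen path in the pheromone table.
public class CacoSketch {
    // Lower score is better; pheromone deposits make a VM more attractive.
    public static int chooseVm(double[] pathCost, int[] load, double[] pheromone) {
        int best = 0;
        double bestScore = Double.MAX_VALUE;
        for (int vm = 0; vm < pathCost.length; vm++) {
            double score = (pathCost[vm] + load[vm]) / (1.0 + pheromone[vm]);
            if (score < bestScore) { bestScore = score; best = vm; }
        }
        return best;
    }

    // After an ant (cloudlet) finishes its tour, reinforce the chosen path.
    public static void updatePheromone(double[] pheromone, int vm, double deposit) {
        pheromone[vm] += deposit;
    }

    public static void main(String[] args) {
        double[] cost = {5.0, 2.0, 8.0};
        int[] load = {3, 1, 0};
        double[] pher = {0.0, 0.0, 0.0};
        int vm = chooseVm(cost, load, pher); // VM 1 has the lowest cost + load
        updatePheromone(pher, vm, 0.5);
        System.out.println("Cloudlet assigned to VM " + vm);
    }
}
```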
CHAPTER 5
IMPLEMENTATION
In our project, we have used the Eclipse editor for implementing the proposed
algorithm and integrated CloudAnalyst with Eclipse for running the simulation. The
detailed description of our work is given below:
Fig. 4. Creating New Java project
Fig. 5. Execution of load balancing algorithm
Fig. 7. Data center configuration
Fig. 9. Internet configurations
Fig. 11. Completion of Simulation
CHAPTER 6
The purpose of our model is to reduce the carbon footprint by efficiently allocating
the tasks to the virtual machines using different service broker policies. Social
networking media connect people across the world on an online platform. Social
networking is one of the largest Internet applications that can be served via cloud
computing. Facebook is one such social media application with a large population
of users. As of 30 June 2017, Facebook had about 1.97 billion users across various
geographic locations, as shown in Table 1. In our experimental model, we have
used the CloudAnalyst tool for the simulation to analyze the characteristics of the
Facebook application in the cloud environment.
Experimental setup
We have chosen six user bases to represent the various geographic locations, as
shown in Table 2. For ease of simulation, we have chosen 1/10th of Facebook's
population. The peak utilization time is assumed to be at night for about 2 hours
in each time zone. It is also assumed that 1% of the users are using the platform
simultaneously during the peak hours, and one-tenth of that number are using it
simultaneously during the off-peak hours. The CloudAnalyst tool accepts various
parameters as input, as mentioned in Table 3.
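The sampling assumptions above (1/10th of the population simulated, 1% of that sample online at peak, one-tenth of the peak count online off-peak) reduce to simple arithmetic. The subscriber count below is a hypothetical round number, not a value from Table 1:

```java
// Deriving simultaneous-user counts from the stated assumptions:
// sample = population / 10; peak = 1% of sample; off-peak = peak / 10.
public class UserLoad {
    public static long peakUsers(long regionalPopulation) {
        return regionalPopulation / 10 / 100;   // 1% of the 1/10th sample
    }

    public static long offPeakUsers(long regionalPopulation) {
        return peakUsers(regionalPopulation) / 10;
    }

    public static void main(String[] args) {
        long subscribers = 100_000_000L; // hypothetical regional subscriber count
        System.out.println("Peak: " + peakUsers(subscribers)
                + ", Off-peak: " + offPeakUsers(subscribers));
    }
}
```

Note that each off-peak figure in Table 2 is exactly one-tenth of the corresponding peak figure, matching the assumption encoded here.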
Table 2. User base configuration

User Base          Region  Peak Hours (GMT)  Simultaneous Online Users  Simultaneous Online Users
                                             During Peak Hours          During Off-peak Hours
N. America         0       13:00–15:00       2630812                    263081
S. America         1       15:00–17:00       3709753                    370975
Europe             2       20:00–22:00       3432737                    343273
Asia               3       01:00–03:00       7360030                    736003
Africa             4       21:00–23:00       1602070                    160207
Oceania/Australia  5       09:00–11:00       194632                     19463
Table 3. Data center parameter values used in the experiments
Simulation Scenarios
Table 4. Scenarios chosen for Simulation (Meftah, Youssef, & Zakariah, 2018)
In scenario S1, the simulation is executed with a single data center with 25 VMs located
in region 0. In the second scenario S2, two data centers are chosen with the same count
of VMs at different locations. In the third scenario S3, two data centers are chosen with
a variable count of VMs at different geographical locations. In the fourth scenario S4,
three data centers with a variable count of VMs are chosen at different geographical
locations across the globe. The chosen algorithms, along with the proposed one, are
simulated in all four scenarios and their respective behaviours are analyzed.
Results
The min, max, and overall average values of processing time and response time
are recorded for each scenario during the simulation for about 60 hours, as shown in
Table 5 to Table 8 and Table 9 to Table 12 respectively. The existing algorithms,
namely Round Robin (RR), Equally Spread Current Execution Load (ESCE),
Location Aware (LA), Honeybee (HB) and Ant Colony Optimization (ACO), and the
proposed CACO algorithm are taken into consideration for analysis. The results are
recorded for the factors including processing time, response time, total cost and
power consumed, and are tabulated in Table 5 to Table 8, Table 9 to Table 12, Table
13 and Table 14 respectively. A sample screenshot of the generated result is shown
in Fig. 12.
Processing Time
Processing time is calculated based on the length of the tasks requested by users and
the capacity of the virtual machines assigned to handle the respective tasks
(Rehman, Javaid, Ali, Saif, Ashraf, & Abbasi, 2018). The processing times produced
by the algorithms are reported in milliseconds (ms). The processing time for the
various algorithms under the CDC and ORT service broker policies for S1, S2, S3
and S4 is tabulated in Table 5, Table 6, Table 7 and Table 8 respectively, and
graphically represented in Fig. 13 to Fig. 20.
Table 5. Processing time of various algorithms for CDC and ORT under S1: One data center with
25 VMs, located at region 0
Fig. 13. Average Processing time of various algorithms for CDC under S1: One data center with 25 VMs, located at region 0
Fig. 14. Average Processing time of various algorithms for ORT under S1: One data center with 25 VMs, located at region 0
Table 6. Processing time of various algorithms for CDC and ORT under S2: Two data centers with 25 VMs each, located at region 0 and 2 respectively

Fig. 15. Average Processing time of various algorithms for CDC under S2: Two data centers with 25 VMs each, located at region 0 and 2 respectively
Fig. 16. Average Processing time of various algorithms for ORT under S2: Two data centers with 25 VMs each, located at region 0 and 2 respectively

Table 7. Processing time of various algorithms for CDC and ORT under S3: Two data centers with 25,50 VMs, located at region 0 and 2 respectively
Fig. 17. Average Processing time of various algorithms for CDC under S3: Two data centers with 25,50 VMs, located at region 0 and 2 respectively

Fig. 18. Average Processing time of various algorithms for ORT under S3: Two data centers with 25,50 VMs, located at region 0 and 2 respectively
Table 8. Processing time of various algorithms for CDC and ORT under S4: Three data centers with 25,30,50 VMs, located at three different regions 0, 2, and 1 respectively

Fig. 19. Average Processing time of various algorithms for CDC under S4: Three data centers with 25,30,50 VMs, located at three different regions 0, 2, and 1 respectively
Fig. 20. Average Processing time of various algorithms for ORT under S4: Three data centers with 25,30,50 VMs, located at three different regions 0, 2, and 1 respectively
Response Time
Response time is the time that elapses between a user base sending a request and
receiving the corresponding response from the data center. The response times
produced by the algorithms are reported in milliseconds (ms). The response time for
the various algorithms under the CDC and ORT service broker policies for S1, S2,
S3 and S4 is tabulated in Table 9 to Table 12 respectively, and graphically
represented in Fig. 21 to Fig. 28.
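A simple way to relate the two metrics: the response time observed at a user base can be modeled as the network transfer delay plus the data center's processing time. This decomposition is an illustrative assumption for this report's setting, not CloudAnalyst's exact internal computation:

```java
// Hedged sketch: response time = network delay + processing time.
public class ResponseTimeModel {
    public static double responseMs(double networkDelayMs, double processingMs) {
        return networkDelayMs + processingMs;
    }

    public static void main(String[] args) {
        // e.g. 120.5 ms of network delay on top of 310.0 ms of processing
        System.out.println(responseMs(120.5, 310.0) + " ms");
    }
}
```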
Table 9. Response time of various algorithms for CDC and ORT under S1: One data center with 25 VMs, located at region 0

Fig. 21. Average Response time of various algorithms for CDC under S1: One data center with 25 VMs, located at region 0
Fig. 22. Average Response time of various algorithms for ORT under S1: One data center with 25 VMs, located at region 0
Table 10. Response time of various algorithms for CDC and ORT under S2: Two data centers with 25 VMs each, located at region 0 and 2 respectively
Fig. 23. Average Response time of various algorithms for CDC under S2: Two data centers with 25 VMs each, located at region 0 and 2 respectively
Fig. 24. Average Response time of various algorithms for ORT under S2: Two data centers with 25 VMs each, located at region 0 and 2 respectively
Table 11. Response time of various algorithms for CDC and ORT under S3: Two data centers with 25,50 VMs, located at region 0 and 2 respectively

Fig. 25. Average Response time of various algorithms for CDC under S3: Two data centers with 25,50 VMs, located at region 0 and 2 respectively
Fig. 26. Average Response time of various algorithms for ORT under S3: Two data centers with 25,50 VMs, located at region 0 and 2 respectively
Table 12. Response time of various algorithms for CDC and ORT under S4: Three data centers
with 25,30, 50 VMs, located at three different regions 0, 2, and 1 respectively
Fig. 27. Average Response time of various algorithms for CDC under S4: Three data centers with 25,30,50 VMs, located at three different regions 0, 2, and 1 respectively
Fig. 28. Average Response time of various algorithms for ORT under S4: Three data centers with 25,30,50 VMs, located at three different regions 0, 2, and 1 respectively
From Table 5 to Table 12, we can infer that the proposed CACO algorithm gives
better results in terms of processing time and response time when compared to the
other existing algorithms, with the exception of ACO. A closer look at the results
reveals that, in general, the proposed CACO load balancing algorithm outperforms
the other existing algorithms, including ACO, in scenarios S1 and S2 under the CDC
service broker policy, as the overall processing time and the overall response time
are comparatively better.
Total Cost
One of the most important parameters of cloud computing is cost. The total cost
includes the virtual machine migration cost and the data transfer cost, and is
represented in dollars ($). During the simulation, the algorithms Round Robin,
Equally Spread Current Execution Load, Location Aware and Honeybee give the
same results for total cost, as cost is not included as a factor in these algorithms.
Hence these algorithms are collectively referred to as "Others" in Table 13,
Table 14, Table 15 and Table 16.
Fig. 29. Comparative analysis of Total Cost ($) of ACO, CACO and the other algorithms for S1–S4 under the Closest Data Center (CDC) and Optimise Response Time (ORT) policies
Power Consumption
Power consumption is the rate at which energy is drawn by a data center while
serving user requests. The figures include the total power consumed by all the data
centers under the various scenarios for the different algorithms, as mentioned in
Table 14. Power is represented in kilowatts (kW). The graphical representation of
power consumption is shown in Fig. 30.
Table 14. Comparative analysis of Power Consumption

Fig. 30. Comparative analysis of Power Consumption (kW) of ACO, CACO and the other algorithms for S1–S4 under the CDC and ORT policies
Energy Consumption
Data centers consume a large amount of energy because of their high-performance
components. Energy is one of the major factors in the management of data centers
in the cloud (Rong, Zhang, Xiao, Li, & Hu, 2016). Energy is represented in
kilowatt-hours (kWh) for the chosen scenarios in Table 15. The graphical
representation of energy consumption is shown in Fig. 31.
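Under the simplifying assumption of a constant average power draw, the energy figures follow directly from the power figures and the 60-hour simulation window mentioned in the Results section. The power value below is a hypothetical example, not one of the tabulated results:

```java
// Energy over the simulation window, assuming constant average power draw:
// energy (kWh) = power (kW) × duration (hours).
public class EnergyModel {
    public static double energyKwh(double powerKw, double hours) {
        return powerKw * hours;
    }

    public static void main(String[] args) {
        // e.g. a hypothetical 16,000 kW average draw over the 60-hour simulation
        System.out.println(energyKwh(16000.0, 60.0) + " kWh"); // 960000.0 kWh
    }
}
```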
Fig. 31. Comparative analysis of Energy consumption (kWh) of ACO, CACO and the other algorithms for S1–S4 under the CDC and ORT policies
Carbon Footprint Reduction
The data center carbon footprint is the amount of carbon released into the
atmosphere. Taking social welfare into account, the CACO model aims at reducing
the carbon footprint in the cloud environment. Reducing the power usage greatly
contributes to reducing the carbon footprint (Khosravi & Buyya, 2018). It is
assumed that 1000 kWh of power consumption emits 0.72 tons of CO2 (Uddin,
Darabidarabkhani, Shah, & Memon). The carbon footprint analysis is presented in
Table 16 and the corresponding graphical representation is shown in Fig. 32.
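The stated conversion factor (1000 kWh of consumption emits 0.72 tons of CO2) turns any energy figure from Table 15 into a carbon figure directly:

```java
// Carbon footprint from the stated conversion factor:
// tons of CO2 = energy (kWh) × 0.72 / 1000.
public class CarbonFootprint {
    public static double tonsCo2(double energyKwh) {
        return energyKwh * 0.72 / 1000.0;
    }

    public static void main(String[] args) {
        // e.g. one million kWh of consumption is roughly 720 tons of CO2
        System.out.println(tonsCo2(1_000_000) + " tons");
    }
}
```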
Fig. 32. Carbon footprint (tons) of ACO, CACO and the other algorithms for S1–S4 under the CDC and ORT policies
The proposed cost-aware ACO based load balancing model has been compared
with other load balancing algorithms. The main focus of our work is to reduce the
carbon footprint in cloud data centers. In most of the scenarios, under both service
broker policies, our proposed CACO model gives better results in terms of cost,
power, energy and carbon footprint. Thus, by managing the carbon footprint
efficiently, our model contributes to the betterment and welfare of the
environment.
CHAPTER 7
CONCLUSION
In this paper, we have analyzed the effects of various load balancing algorithms
and service broker policies under different scenarios in a large-scale cloud
environment. The existing algorithms, namely Round Robin, Equally Spread
Current Execution Load, Location Aware, Honeybee and Ant Colony Optimization,
and the service broker policies Closest Data Center and Optimise Response Time,
are taken into consideration in order to analyze the performance of the proposed
work. To accomplish our work, we have chosen the CloudAnalyst simulation tool
and the Facebook dataset for configuration. The tabulated results clearly reveal that
the proposed CACO algorithm performs better in scenarios S1 and S2 with the
CDC service broker policy. As power consumption increases, the carbon footprint
also increases. Therefore, by reducing the power consumption we can reduce the
carbon footprint, which greatly benefits the environment. Future work will
concentrate on finding further procedures to reduce power consumption and on
powering data centers with renewable energy sources.
CHAPTER 8
REFERENCES
8. Limbani, D. &. (2012). A proposed service broker policy for data center
selection in cloud environment with implementation. International Journal
of Computer Technology & Applications , 3(3), 1082-1087.
9. Meftah, A., Youssef, A. E., & Zakariah, M. (2018). Effect of service broker
policies and load balancing algorithms on the performance of large scale
Internet applications in cloud data centers. International Journal of Advanced
Computer Science and Applications, 9(5), 219-227.
10. Mell, P., & Grance, T. (2011). The NIST definition of cloud computing.
11. Mishra, R. &. (2012). Ant colony optimization: A solution of load balancing
in cloud. International Journal of Web & Semantic Technology , 3(2), 33.
12. Naqvi, S. A., Javaid, N., Butt, H., Kamal, M. B., Hamza, A., & Kashif, M.
(2018). Metaheuristic optimization technique for load balancing in cloud-fog
environment integrated with smart grid. International Conference on
Network-Based Information Systems (pp. 700-711). Cham: Springer.
14. Patel, H., & Patel, R. (2015). Cloud analyst: an insight of service broker
policy. International Journal of Advanced Research in Computer and
Communication Engineering , 4(1), 122-127.
15. Rehman, M., Javaid, N., Ali, M. J., Saif, T., Ashraf, M. H., & Abbasi, S. H.
(2018,October). Threshold based load balancer for efficient resource
utilization of smart grid using cloud computing. In International Conference
on P2P, Parallel, Grid, Cloud and Internet Computing (pp. 167-179).
Cham: Springer.
16. Rong, H., Zhang, H., Xiao, S., Li, C., & Hu, C. (2016). Optimizing energy
consumption for data centers. Renewable and Sustainable Energy Reviews ,
58, 674-691.
17. Subalakshmi, S., & Malarvizhi, N. (2017). Enhanced hybrid approach for
load balancing algorithms in cloud computing. Int. J. Sci. Res. Comput. Sci.,
Eng. Inf. Technol. IJSRCSEIT , 2(2).
19. Uddin, M., Darabidarabkhani, Y., Shah, A., & Memon, J. Evaluating power
efficient algorithms for efficiency and carbon emissions in cloud data
centers: A review. Renewable and Sustainable Energy Reviews, 51,
1553-1563.
21. Xing, Y., & Zhan, Y. (2012). Virtualization and cloud computing. In Future
Wireless Networks and Information Systems (pp. 305-312). Springer,
Berlin, Heidelberg.
22. Zhao, W., Peng, Y., Xie, F., & Dai, Z. (2012, November). Modeling and
simulation of cloud computing: A review. In 2012 IEEE Asia Pacific cloud
computing congress (APCloudCC) (pp. 20-24). IEEE.
CHAPTER 9
PUBLICATIONS
Enhancing Test Suites through analysis of Code
Coverage and Mutation Testing
S. Ajitha #1, A. Aparna #2, K. Keerthanaa #3
Computer Science and Engineering,
Thiagarajar College of Engineering,
Madurai 625015, Tamil Nadu, India
1 ajithasundarjee@gmail.com
2 aparnaarunpandi@gmail.com
3 keerthanaakalyan@gmail.com
Abstract— Code Coverage and Mutation testing is an important aspect of Software testing. Code Coverage helps to explore the areas which are not covered by the test cases, while mutation testing explores the robustness of the test cases. Automated tools help us to enhance coverage, stability, testability, and maintainability. The choice of the appropriate testing tool is still a challenge for developers. This paper aims to provide an analysis of code coverage and mutation testing tools, thereby helping developers to choose their testing tools suitably. Therefore, we performed a controlled experiment to investigate the performance of these tools by implementing the Bubble sort algorithm in the Eclipse plug-in.

Index Terms— Code Coverage, Analysis of Code Coverage Tools, Mutation testing, Analysis of mutation testing tools.

I. INTRODUCTION
Software testing [9] is used to ensure that the software system is defect free by ensuring that the actual results meet the expected results. It is used for verification of the System under Test. It is used to identify the errors or missing requirements to make the software system better. Software testing can be done either manually or by using automated tools.

Black box Testing
Black box testing, also known as functional testing, is a testing technique that does not take the internal mechanism of the system into account but rather focuses on the output generated by the system against any given input. In black box testing, tests are done from a user's point of view and test cases can be written with the help of the specification alone. Black box testing is mostly used for validation.

White box Testing
White box testing, also known as structural or glass box testing, is a testing technique that considers the internal mechanism of a system. White box testing is mostly used for verification.

Types of Testing
There are many types of testing [21]. They are listed as follows.
1. Unit Testing
2. Functional Testing
3. Integration Testing
4. System Testing
5. Stress Testing
6. Regression Testing
7. Acceptance Testing
8. Usability Testing
9. Performance Testing
10. Beta Testing

Unit Testing
Unit testing is a type of white-box testing which tests a single unit or a group of related units. It is used by the programmer to test if the units are producing the expected output against given input for the program executed.

Functional Testing
Functional testing is a type of black box testing which ensures that the functionality specified in the system requirements works correctly.
Integration Testing
Integration testing is the technique of ensuring that a group of combined components produces the aimed output. Also, the relation and interaction between the software and hardware components are tested using integration testing. It may be used as both white box testing and black box testing depending upon the software system.

System Testing
System testing is the technique to ensure that the software is compatible in different environments (e.g., operating systems). System testing can be done only with the full system implementation and environment. It fits inside the class of black box testing.

Stress Testing
Stress testing is a type of black box testing. It is the technique to evaluate how the system works under unfavorable conditions. This testing is performed beyond the limits mentioned in the specifications.

…beta version. Beta testing aims to cover the unexpected errors that have occurred in the system unknowingly.

II. CODE COVERAGE
Code coverage [12] is the measurement of the extent to which the source program is executed when the corresponding test suite runs. It is measured in percentage. If the program has high code coverage then there is less chance of undetected bugs, while lower coverage results in comparatively more undetected bugs. Different metrics are used to calculate code coverage. The basic measure of code coverage is the coverage item. A coverage item is anything that can be counted and checked to see whether it is covered or not by the test suites.

Coverage can be measured by the following formula:

Coverage = (Number of coverage items exercised / Total number of coverage items) × 100.
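The coverage formula above, applied directly; the counts used in the example are hypothetical:

```java
// Coverage percentage: items exercised by the test suite over total items, ×100.
public class CoverageCalc {
    public static double percent(int exercised, int total) {
        return 100.0 * exercised / total;
    }

    public static void main(String[] args) {
        // e.g. 19 of 20 statements executed by the suite
        System.out.println(percent(19, 20) + "%"); // 95.0%
    }
}
```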
import java.io.*;
import java.util.*;
import junit.framework.*;
// assertArrayEquals is not part of junit.framework; it comes from JUnit 4's Assert
import static org.junit.Assert.assertArrayEquals;

public class BubbleSortTest extends TestCase
{
    private BubbleSort s1;

    public void setUp()
    {
        s1 = new BubbleSort();
    }

    // (the beginning of test1 was not recovered from the original)
        for(int i=0;i<exp.length;i++)
        {
            assertEquals(exp[i],res[i]);
        }
    }

    public void test2() throws IOException
    {
        int[] ar={2,3,4,5,6,1};
        assertNotNull(ar);
    }

    public void test3() throws IOException
    {
        int[] arr1={6,3,3};
        int[] res=s1.bubbleSort(arr1,arr1.length);
        int[] exp={3,3,6};
        for(int i=0;i<exp.length;i++)
        {
            assertEquals(exp[i],res[i]);
        }
        assertArrayEquals(exp,res);
    }

    public void test4() throws IOException
    {
        int[] arr2=new int[]{2,2,2};
        int[] res=s1.bubbleSort(arr2,arr2.length);
        int[] exp={2,2,2};
        for (int i=0;i<exp.length;i++)
        {
            assertEquals(exp[i],res[i]);
        }
        assertArrayEquals(exp,res);
    }
}

Fig. 2. Bubble Sort Test code

A. Clover
Clover [14] is available as either an Eclipse or IDEA plug-in. It also works using ANT scripts. It supports Java and .NET programming. It supports statement, branch, method, class, and package coverage. This tool provides an accurate and configurable coverage analysis. It is not open source, yet it is of less cost compared to other commercially available coverage testing tools. Coverage reporting is done in XML, HTML, or via a Swing GUI. It also provides clear and understandable metrics displayed in HTML reports. Clover can be integrated with test frameworks including JUnit, TestNG, and Spock.

Fig. 3. Coverage results-Clover
Fig. 4. Test run explorer window-Clover
Fig. 5. Generated report-Clover
Fig. 6. Class coverage distribution and complexity-Clover

B. CodeCover
CodeCover [15] is an extensible open source testing tool for Java programmers. It can be integrated into Eclipse. It performs source code instrumentation and provides statement and branch coverage. It helps to increase test quality and to develop new test cases. It generates its report in HTML. The GUI support provided by CodeCover makes it easy to observe the coverage of the code. It uses a correlation matrix to find redundant test cases and optimize the test suite, and it can be integrated with the JUnit framework.

Fig. 7. Coverage results-CodeCover
Fig. 9. Boolean analyzer-CodeCover
Fig. 10. Correlation matrix-CodeCover

C. EclEmma
EclEmma [16] is a free code coverage tool for Java. It performs byte code instrumentation (as listed in Table I). It carries out dead code analysis and verifies which parts of the program are exercised by the test suite. It supports class, method, line, and basic block coverage. It provides GUI support and generates reports in XML and HTML. It can be integrated with the JUnit framework.
D. Observations
For the conducted experiment, a bubble sort program is taken and executed. The coverage for the bubble sort program in CodeCover is 95.13%, with an overall coverage of 98%. CodeCover also enables the results to be viewed as a Boolean analyzer, correlation matrix, coverage graph, and test session. The coverage in EclEmma is similar to CodeCover, and EclEmma is supported in the Eclipse Kepler version and later versions. EclEmma is easy to handle since it displays the coverage results readily, but it does not present results as a graph or matrix as CodeCover does. Clover, on the other hand, resulted in 94.4% coverage with an overall coverage of 97.9%. It is not open source, but a 30-day trial version is available. Clover provides more in-depth reports than the other two tools, including boundary value analysis, class complexity, and method complexity, none of which can be observed from the other two tools.

Fig. 12. Coverage results-EclEmma
TABLE II
COMPARISON OF COVERAGE METRICS

Coverage metrics    Clover    CodeCover    EclEmma
Statement           100       100          100
Loop                -         66.7         -
Condition           -         87.5         -
Method              94.4      -            100

Fig. 13. Instructions covered-EclEmma

TABLE I
COMPARISON OF CODE COVERAGE TOOLS

Features                 Clover                     CodeCover               EclEmma
Access                   Charge                     Open-source             Open-source
Supported Languages      Java, .Net                 Java                    Java
Instrumentation          Source code                Source code             Byte code
Coverage Measures        Statement, method,         Statement, branch,      Statement, block,
Supported                class, and package         loop and condition      method and class
                         coverage                   coverage                coverage
Integrations             Ant, Maven, Grails,        Ant, Jenkins,           Makefile and ANT
                         Eclipse, IDEA, Bamboo,     Eclipse
                         Gradle, Griffon, Jenkins,
                         Hudson, Sonar
JUnit Support            Yes                        Yes                     Yes

III. MUTATION TESTING

Mutation testing [5] is a type of white box testing mainly used for unit testing. It is done by mutating certain statements in the program and then checking whether the test cases are able to find the error. The program is mutated only to a small extent so as not to affect its overall objective. It is also called a fault-based testing strategy. The goal of mutation testing is to make the software more robust against failures.

Mutation Score:
The mutation score is defined as the ratio of killed mutants to the total number of mutants, expressed as a percentage:

Mutation score = (Number of killed mutants / Total number of mutants) * 100

By decreasing the count of live mutants, the mutation score can be increased.
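The mutation score can be computed in the same way as the coverage percentage. A small illustrative sketch (the class and method names are not from any tool):

```java
public class MutationScore {
    // Mutation score = (killed mutants / total mutants) * 100
    public static double score(int killed, int total) {
        if (total == 0) {
            return 0.0; // no mutants generated, score undefined; report 0
        }
        return (killed * 100.0) / total;
    }

    public static void main(String[] args) {
        // 45 of 50 mutants killed -> 5 live mutants, score 90.0 %
        System.out.println(score(45, 50));
    }
}
```

Killing the 5 surviving mutants (by adding or strengthening test cases) would raise the score to 100%, which is what the text means by decreasing the count of live mutants.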
Analysis of Mutation Testing Tools
This section presents mutation testing tools [3], [8], [19] that ensure the robustness of the formulated test suites. Mutation testing tools are available either free or commercially. The following are some of them.
A. Muclipse
Muclipse [18] is an open source mutation testing tool for Java developers. It supports JUnit and can be integrated with Eclipse. Muclipse provides both source code and byte code instrumentation. By injecting mutants, the correctness of the test cases is verified.
B. PITclipse
PITclipse [20] is a Java-based open source mutation testing tool. It applies a configurable set of mutation operators [1] (or mutators) to the byte code generated by compiling the code. The reports produced by PIT are easily understandable; it generates them in HTML and XML. It supports the JUnit framework.
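To make the idea of a mutant concrete, the sketch below hand-writes the kind of change a mutation operator injects: negating a conditional. Tools such as PIT apply this to byte code automatically; the names and methods here are illustrative only.

```java
public class MutantDemo {
    // Original method under test.
    static int max(int a, int b) {
        return (a > b) ? a : b;
    }

    // A "negate conditional" mutant of max(): the kind of change
    // a mutation operator would inject ('>' mutated to '<').
    static int maxMutant(int a, int b) {
        return (a < b) ? a : b;
    }

    public static void main(String[] args) {
        // A test asserting max(2, 3) == 3 passes on the original
        // but fails on the mutant, so this mutant is "killed"
        // and counts toward the mutation score.
        System.out.println(max(2, 3));       // original: 3
        System.out.println(maxMutant(2, 3)); // mutant: 2
    }
}
```

A mutant that no test case distinguishes from the original stays "live" and lowers the mutation score, signaling a gap in the test suite.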
IV. CONCLUSION