
DETECTION AND PREVENTION TECHNIQUE (DAPT)

TO DODGE THE OPENING OF A VIRUS FILE

A PROJECT REPORT

Submitted by

ANIRUDH.R 211615104025
ARUN KUMAR.S 211615104037
BHARGAVAN.N 211615104048

in partial fulfillment for the award of the degree

of

BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE AND ENGINEERING

RAJALAKSHMI ENGINEERING COLLEGE, CHENNAI-602105

ANNA UNIVERSITY : CHENNAI 600 025

MARCH 2019

ANNA UNIVERSITY: CHENNAI 600 025


BONAFIDE CERTIFICATE

Certified that this project report “DETECTION AND PREVENTION TECHNIQUE (DAPT)
TO DODGE THE OPENING OF A VIRUS FILE” is the bonafide work of “ANIRUDH.R
[211615104025], ARUN KUMAR.S [211615104037] and BHARGAVAN.N [211615104048]”
who carried out the project work under my supervision.

SIGNATURE                            SIGNATURE
Dr. P. KUMAR, M.E., Ph.D.            Mr. S. SURESH KUMAR, M.E.
HEAD OF THE DEPARTMENT               SUPERVISOR
Professor,                           Associate Professor,
Department of Computer Science       Department of Computer Science
and Engineering                      and Engineering
Rajalakshmi Engineering College      Rajalakshmi Engineering College
Chennai – 602 105                    Chennai – 602 105

ANNA UNIVERSITY : CHENNAI 600 025

This project is submitted for Viva Voce Examination to be held on __/__/2019.

ANIRUDH.R 211615104025
ARUN KUMAR.S 211615104037
BHARGAVAN.N 211615104048

INTERNAL EXAMINER EXTERNAL EXAMINER

ACKNOWLEDGEMENT

First, we thank the Almighty for being with us through every walk of our life and
for showering his blessings on our endeavor to put forth this report.

Our sincere thanks to our Chairman Mr. S. MEGANATHAN, B.E, F.I.E, and our
respected Chairperson Dr. (Mrs) THANGAM MEGANATHAN, Ph.D., for providing us
with the requisite infrastructure and for their sincere efforts in educating us in
their premier institution.

Our sincere thanks to Dr. S.N. MURUGESAN, M.E., Ph.D., our beloved Principal,
for his kind support and for the facilities provided to complete our work in time.

We express our sincere thanks to Dr. P. KUMAR, M.E., Ph.D., Head of the
Department of Computer Science and Engineering, Rajalakshmi Engineering
College, for his guidance and encouragement, and for extending all the facilities to
us to work on this project. We convey our sincere and deepest gratitude to our
internal guide, Mr. S. SURESH KUMAR, M.E., Associate Professor, Department of
Computer Science and Engineering, Rajalakshmi Engineering College, for his
guidance, constant encouragement, and support. His meticulous attention and
creative thinking have been a source of inspiration for us throughout this project.

We are glad to thank our project coordinator, Mrs. PRIYA VIJAY, M.E,
Associate Professor, Department of Computer Science and Engineering,
Rajalakshmi Engineering College, for her useful tips during our reviews, which
helped us build our project.

Finally, we express our gratitude to our parents and classmates for their
moral support and valuable suggestions during the course of the project.

ABSTRACT

The internet has become an essential part of day-to-day life, and security on the
internet has always been an important issue. Malware is used to breach a target
system. There are different types of malware, such as viruses, worms, rootkits,
Trojan horses, and ransomware. Each type affects the target system in its own way,
thereby causing harm to the system. A virus may be hidden in an arbitrary file
which, when opened, can modify or delete the contents or data in the target system.
Therefore, we propose a prevention technique for one specific type of malware, the
virus. The technique uses an algorithm to prevent the opening of a virus-affected
file. Generally, a system runs free antivirus software, which offers limited
functionality. The proposed model overcomes these limitations by using an
algorithm that detects and prevents a virus attack on a system effectively. The
algorithm checks for a virus in a file by using a comparison method. In this method,
the pattern of a virus file is analyzed by passing the file through the pattern
checker. The next step involves generation of bytes; each file has a different byte
size. Byte comparison is then done to compare a normal file with a corrupted file.
The corrupted file will have a different byte length. If the byte lengths of the
normal file and the corrupted file do not match, the file is considered a virus file
and its opening is prevented.

TABLE OF CONTENTS

CHAPTER NO    TITLE

1.    INTRODUCTION
      1.1 AIM
      1.2 OBJECTIVE
      1.3 SCOPE
      1.4 OVERVIEW
2.    LITERATURE SURVEY
      2.1 COMPUTER VIRUS
      2.2 TECHNIQUE TO THWART THE OPENING OF A VIRUS AFFECTED FILE
          WITHOUT THE AID OF AN ANTIVIRUS SOFTWARE
      2.3 THE NEW AGE OF COMPUTER VIRUS AND THEIR DETECTION
      2.4 A COMPARATIVE STUDY OF VIRUS DETECTION TECHNIQUES
      2.5 SHA FORMALIZATION
      2.6 EVOLUTION OF COMPUTER VIRUS CONCEALMENT AND ANTIVIRUS
          TECHNIQUES: A SHORT SURVEY
      2.7 MALWARE DETECTION AND KERNEL ROOTKIT PREVENTION IN CLOUD
          COMPUTING ENVIRONMENT
3.    SYSTEM ANALYSIS
      3.1 EXISTING SYSTEM
          3.1.1 DISADVANTAGES OF EXISTING SYSTEM
      3.2 PROPOSED SYSTEM
          3.2.1 ADVANTAGES OF PROPOSED SYSTEM
4.    SYSTEM REQUIREMENT SPECIFICATION
      4.1 INTRODUCTION
      4.2 HARDWARE AND SOFTWARE SPECIFICATION
          4.2.1 HARDWARE SPECIFICATIONS
          4.2.2 SOFTWARE SPECIFICATIONS
      4.3 TOOLS USED
          4.3.1 VMWARE
                4.3.1.1 ARCHITECTURE OF VMWARE
                4.3.1.2 ADVANTAGES OF USING VMWARE
          4.3.2 ECLIPSE
                4.3.2.1 ARCHITECTURE OF ECLIPSE
                4.3.2.2 ADVANTAGES OF USING ECLIPSE
          4.3.3 GOOGLE CLOUD
                4.3.3.1 ARCHITECTURE OF GOOGLE CLOUD
                4.3.3.2 ADVANTAGES OF GOOGLE CLOUD
      4.4 TECHNOLOGY USED
          4.4.1 JAVA
                4.4.1.1 FEATURES OF JAVA
5.    SYSTEM DESIGN
      5.1 ARCHITECTURE DIAGRAM
      5.2 OVERVIEW
      5.3 UML DIAGRAMS
          5.3.1 ACTIVITY DIAGRAM
          5.3.2 USE CASE DIAGRAM
          5.3.3 CLASS DIAGRAM
          5.3.4 SEQUENCE DIAGRAM
          5.3.5 COMPONENT DIAGRAM
          5.3.6 DEPLOYMENT DIAGRAM
          5.3.7 COLLABORATION DIAGRAM
6.    SYSTEM IMPLEMENTATION
      6.1 MODULE DESCRIPTION
          6.1.1 PATTERN CHECKER
          6.1.2 BYTE GENERATION
          6.1.3 BYTE COMPARISON
7.    SOFTWARE TESTING
      7.1 GENERAL
      7.2 TESTING OBJECTIVES
      7.3 TYPES OF TESTING
          7.3.1 WHITE BOX TESTING
          7.3.2 BLACK BOX TESTING
          7.3.3 UNIT TESTING
          7.3.4 INTEGRATION TESTING
          7.3.5 VALIDATION TESTING
8.    CONCLUSION AND FUTURE WORK
      8.1 CONCLUSION
      8.2 FUTURE WORK
9.    APPENDICES
      9.1 SAMPLE CODE
      9.2 SCREENSHOTS
10.   REFERENCES

LIST OF FIGURES

FIG. NO        TITLE

4.3.1.1 (a)    ARCHITECTURE OF VMWARE
4.3.1.1 (b)    ARCHITECTURE OF VMWARE
4.3.2.1        ARCHITECTURE OF ECLIPSE
4.3.3.1        ARCHITECTURE OF GOOGLE CLOUD
5.1            ARCHITECTURE DIAGRAM
5.3.1          ACTIVITY DIAGRAM
5.3.2          USE CASE DIAGRAM
5.3.3          CLASS DIAGRAM
5.3.4          SEQUENCE DIAGRAM
5.3.5          COMPONENT DIAGRAM
5.3.6          DEPLOYMENT DIAGRAM
5.3.7          COLLABORATION DIAGRAM
6.1.1          PATTERN CHECKER MODULE
6.1.2          BYTE GENERATION MODULE
6.1.3          BYTE COMPARISON MODULE

CHAPTER – 1

INTRODUCTION

1.1 AIM
The main aim of the system is to scan each file when it enters the device, to detect
whether the file is a virus file, and to alert the user about the infected file,
thereby preventing the opening of a virus-infected file.

1.2 OBJECTIVE
The objective of this project is to scan each file that enters the system so that a
virus-affected file can be detected and the user can be prevented from opening it.
A preliminary test checks for any virus signatures in the file. The file size is
then represented in bytes, and the file is compared with the virus file using the
byte comparison method. Before this second step, a backup of all the files is
taken. After the virus infects the files, the byte comparison is done. A file is
concluded to be virus infected if there is a size difference between the original
file and the affected file.
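
The size comparison described above can be sketched in a few lines of Java. This is
a minimal illustration under stated assumptions, not the project's actual module:
the class name ByteSizeCheck and the method sizeDiffers are hypothetical, and the
two paths stand for the backed-up copy and the current copy of the same file. Note
that a size check alone cannot notice a modification that leaves the length
unchanged.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: compares the byte length of a backup copy with the current copy.
final class ByteSizeCheck {

    /** Returns true when the two files have different lengths in bytes. */
    static boolean sizeDiffers(Path backup, Path current) throws IOException {
        return Files.size(backup) != Files.size(current);
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the backed-up and the current file.
        Path backup = Files.createTempFile("backup", ".txt");
        Path current = Files.createTempFile("current", ".txt");
        Files.write(backup, "hello".getBytes(StandardCharsets.UTF_8));
        Files.write(current, "hello plus appended bytes".getBytes(StandardCharsets.UTF_8));
        System.out.println("Size differs: " + sizeDiffers(backup, current));
    }
}
```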

1.3 SCOPE
This project helps in detecting a virus by identifying its signature. A signature
refers to a property or pattern of the virus. These parameters represent the
behavior of the virus, which makes the detection process easier. In this project,
the hash value calculated using the Secure Hash Algorithm (SHA) is used as a
signature for finding traces of a virus in a file. This approach avoids the inferior
scanning found in free antivirus software; the DAPT method overcomes this drawback.
Further, it gives the full functionality of detecting viruses in the system, which
free antivirus software does not provide, and it also avoids memory management
problems in the system.
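
As an illustration of using a SHA value as a signature, the following sketch
computes the SHA-256 digest of a byte array with the standard
java.security.MessageDigest class and prints it as a hexadecimal string. The class
name Sha256Signature and the method hexDigest are assumptions made for this
example; the report's own code is given in the appendix.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: turns file content into a SHA-256 hex string ("signature").
final class Sha256Signature {

    /** Returns the SHA-256 digest of the data as 64 lowercase hex characters. */
    static String hexDigest(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(data);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to an unsigned value so each byte prints as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) {
        // Standard test vector; prints
        // ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
        System.out.println(hexDigest("abc".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Changing even one byte of the input produces a different digest, which is why the
digest can serve as a compact fingerprint of a file's content.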

1.4 OVERVIEW
Cloud computing security, or cloud security, refers to a broad set of policies,
technologies, applications, and controls utilized to protect virtualized IP, data,
applications, services, and the associated infrastructure of cloud computing. It is
a sub-domain of computer security, network security, and information security.
Cloud computing and storage provide users with capabilities to store and process
their data in third-party data centers.

Scanning from the outside and inside using free or commercial products is very
important because, without a hardened environment, a service is considered a soft
target. A virtual server should be hardened like a physical server against data
leakage, malware, and exploited vulnerabilities. “Data loss or leakage represents
24.6% and cloud related malware with 3.4% of threats causing cloud outages.” The
three key properties to be maintained for security-related data are
Confidentiality, Access Control, and Integrity. Confidentiality is a set of rules
that limits access to information; that is, data can be viewed only by authorized
persons. Integrity is the assurance that the information is trustworthy and
accurate and has not been altered. Data must not be altered in transit, and steps
must be taken to ensure that data cannot be altered by unauthorized people. These
steps include file permissions and user access controls.

A computer virus is a type of malicious software that, when executed, replicates
itself by modifying other computer programs and inserting its own code. When the
replication succeeds, the affected areas are said to be infected with a computer
virus. A virus reproduces its own code by attaching itself to executable files in
such a way that the virus code is executed when the infected executable file is
executed. An example of an executable file is a program (a com or exe file). The
virus strategy involves two phases: infect and attack. In the infect phase, the
virus enters the target system and starts taking control of the system, but it does
not initiate execution. For example, a document or some other system resource is
infected. At this point the virus has not started its execution, which means the
system contains the virus but is not yet affected. Once the virus starts its
execution, the system performance or memory is affected. This is the attack phase,
in which the virus performs its action.

A virus balances infection against the possibility of detection. Some viruses use a
variety of techniques to hide or mask their identity. Trojan horses, worms, and
logic bombs are malicious code similar to viruses. Often the characteristics of a
virus and a worm are similar, which makes the situation even more complex. Some
viruses activate only on a triggering action from the attacker's side. Not every
virus gets activated, but every virus steals system resources. A virus may also
contain bugs that cause unintended damage. The balance between viruses, worms, and
Trojan horses changes from time to time. The number of viruses in existence is not
known, as there are a great many viruses that can cause damage to a system. There
is nothing good about a virus. There are many reasons for using a virus to infect a
system and damage its data; one of them is money. The intention varies with the
attacker. The categories of virus are many and diverse. There are four major phases
of a virus.

The first phase is the dormant phase. In this phase the virus is idle. The virus has
managed to access the target user's computer system or software, but it does not
take any action. The virus will eventually be activated by the "trigger", which
states which event will execute the virus, such as a date, the presence of another
program or file, the capacity of the disk exceeding some limit, or the user taking
a certain action (e.g., double-clicking on a certain icon or opening an e-mail).
Not all viruses have this phase. The second phase is the propagation phase, in
which the virus starts multiplying and replicating itself. The virus places a copy
of itself into other programs or into certain system areas on the disk. The copy
may not be identical to the propagating version; viruses often "morph" or change to
evade detection by IT professionals and anti-virus software. Each infected program
will now contain a clone of the virus, which will itself enter a propagation phase.
The third phase is the triggering phase. A dormant virus moves into this phase when
it is activated, and it will now perform the function for which it was intended.
The triggering phase can be caused by a variety of system events, including a count
of the number of times that this copy of the virus has made copies of itself; the
virus then takes control of the system so that it can perform its action throughout
the system. The fourth and last phase is the execution phase. This is the actual
work of the virus, where the "payload" is released. It can be destructive, such as
deleting files on disk, crashing the system, or corrupting files, or relatively
harmless, such as popping up humorous or political messages on screen.

One may reduce the damage done by viruses by making regular backups of data (and
the operating systems) on different media that are either kept unconnected to the
system most of the time (as with a hard drive), read-only, or not accessible for
other reasons, such as using different file systems. This way, if data is lost
through a virus, one can start again using the backup, which will hopefully be
recent. If a backup session on optical media like CD and DVD is closed, it becomes
read-only and can no longer be affected by a virus (so long as a virus or infected
file was not copied onto the CD/DVD). Likewise, an operating system on a bootable
CD can be used to start the computer if the installed operating systems become
unusable. Backups on removable media must be carefully inspected before
restoration.
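
A simple backup step of the kind described above could look like the following
sketch. It is an assumption-laden illustration rather than the project's
implementation: the class FolderBackup and the method backUp are hypothetical
names, only the regular files directly inside the source folder are copied, and
subfolders are skipped.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: copies the files of one folder into a backup folder.
final class FolderBackup {

    /** Copies every regular file in source into backup and returns the number copied. */
    static int backUp(Path source, Path backup) throws IOException {
        Files.createDirectories(backup);
        int copied = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(source)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    // Overwrite an older backup copy of the same name, if any.
                    Files.copy(entry, backup.resolve(entry.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING);
                    copied++;
                }
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempDirectory("original");
        Files.write(source.resolve("notes.txt"), new byte[]{72, 105});
        Path backup = source.resolveSibling(source.getFileName() + "-backup");
        System.out.println("Files backed up: " + backUp(source, backup));
    }
}
```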

A virus might be hidden in arbitrary files which, when opened, can modify or delete
the contents or data in the target system. Free anti-virus software has limited
features: an inferior scanning process, slower scans, frequent upgrade prompts,
lack of customer support, and lack of comprehensive protection. These problems can
be overcome by passing the file through a model that checks the file type, extracts
the file, calculates the file size, gets the binary size of the file (0's and 1's),
compares it with the infected file, and generates a new sequence for the compared
file.

This model can be used to find virus-affected files and to alert the user so that
the file is not opened, thereby keeping it from executing on the system. The
overall process involves sending two files through the pattern checker and
analyzing them to find out whether any traces of a virus are hidden within the
file. If there are any traces of a virus in the file, the files are sent to byte
generation. In byte generation, the checked files are extracted and their byte
values are calculated. The proposed model overcomes the limited functionality of
free antivirus software by using an algorithm, thereby detecting and preventing a
virus attack on a system effectively. This can be implemented without the aid of
any anti-virus software. Generally, the antivirus software on a computer system is
a free or demo version that provides less functionality. To get the full
functionality of the anti-virus software, the user must pay or subscribe to the
service provider. This project provides an implementation technique in which a
virus can be detected and its execution stopped without any of the problems
mentioned above.

CHAPTER – 2

LITERATURE SURVEY

2.1 COMPUTER VIRUS

Author: Fred Cohen

About: An overview of the computer virus, its behavior, causes, and effects. Basic
theoretical results on various prevention techniques are provided. The prevention
techniques are based on fundamental properties of the system, and code related to
each prevention technique is provided.

Advantages:

• A wide variety of prevention techniques is provided along with a detailed
description of its behavior.

Disadvantages:

• Lack of experimental results on prevention techniques.

2.2 TECHNIQUE TO THWART THE OPENING OF A VIRUS AFFECTED FILE
WITHOUT THE AID OF AN ANTIVIRUS SOFTWARE

Author: Sriram Kalyanaraman

About: A technique is devised to prevent the spread of a virus by first detecting it
and then stopping its execution by not allowing the infected file to open in the
first place. An algorithm is designed to implement this technique. The technique is
capable of carrying out this functionality without the aid of antivirus software.

Advantages:

• Efficient logic for detection of a virus.

Disadvantages:

• The code used in this technique implements the virus detection logic
incorrectly.

2.3 THE NEW AGE OF COMPUTER VIRUS AND THEIR DETECTION

Author: Nitesh Kumar Dixit, Lokesh Mishra, Mahendra Singh Charan, Bhabesh
Kumar Dey

About: This paper presents a general overview of computer viruses and detection
techniques. Various types of virus and different detection techniques, along with
their implementation methodology, are listed in this paper.

Advantages:

• Detection techniques using the latest technologies are mentioned.
• Experimental results are provided.

Disadvantages:

• Lack of a detection technique to find hidden files.

2.4 A COMPARATIVE STUDY OF VIRUS DETECTION TECHNIQUES

Author: Sulaiman Al Amro, Ali Alkhalifah

About: A comparative study of different types of virus detection techniques. It
provides the advantages and disadvantages of each detection technique and gives a
view of an efficient detection technique that can be used for detecting a virus.

Advantages:

• Makes it easy to choose a correct and efficient detection technique.

Disadvantages:

• Experimental results are not provided.


2.5 SHA FORMALIZATION
Author: Diana Toma, Dominique Borrione

About: A description of various secure hash algorithms along with their properties
and working. The steps involved in calculating the hash values are mentioned in
this study.

Advantages:

• Covers all types of SHA algorithms and proves analogous safety theorems for all
of them.

2.6 EVOLUTION OF COMPUTER VIRUS CONCEALMENT AND ANTIVIRUS
TECHNIQUES: A SHORT SURVEY

Author: Babak Bashari Rad, Maslin Masrom, Suhaimi Ibrahim

About: An overview of the evolution of concealment methods in computer viruses and
the defensive techniques employed by anti-virus products. This paper reviews the
methodologies used by antivirus products and outlines their strengths and
weaknesses.

Advantages:

• Provides a comparative study of implementation techniques.

2.7 MALWARE DETECTION AND KERNEL ROOTKIT PREVENTION IN CLOUD
COMPUTING ENVIRONMENT

Author: Lars Baumgartner, Pablo Graubner

About: This paper provides an approach for combined malware detection and kernel
rootkit prevention in cloud computing environments. It uses a signature-based
detection technique that holds the details of virus behavior. A SHA-256 hash
algorithm is used in the proposed work.

CHAPTER – 3
SYSTEM ANALYSIS

3.1 EXISTING SYSTEM


The existing system has the general security measures that an operating system
provides to a computer system. Some of the common features provided by operating
systems are protection from spam, protection from malicious advertisements,
extended life of computers and protection from acquaintance. Generally, most
operating systems come with free antivirus software. The problem is that, since it
is a free version, not all the functionality of the antivirus software is provided.
This leads to several disadvantages.

3.1.1 DISADVANTAGES OF EXISTING SYSTEM

• Since the antivirus provides less functionality, there is a lack of comprehensive
protection for the computer system.

• The user frequently gets update prompts for the antivirus software, asking the
user to pay to get all the features of the software.

• Poor customer support is provided.

• The scanning process is inferior. For example, if some files are affected by a
virus, only a few of them will be detected, and there is a chance that some
infected files in the system go undetected.

• The scanning process is slower. It takes a lot of time to scan all the files in
the computer. As a result, the performance of the system may be affected, because
the antivirus uses some of the computer's RAM while it is operating. If the user
often performs intensive tasks such as gaming or video editing, the antivirus has
to scan at the same time, and the user has to pay for that capability.

3.2 PROPOSED SYSTEM


The proposed system uses an algorithm which checks for a virus in a file by using a
comparison method. In this method, the pattern of a virus file is analyzed by
passing the file through the pattern checker. The next step involves generation of
bytes; each file has a different byte size. Byte comparison is then done to compare
a normal file with a corrupted file. The corrupted file will have a different byte
length. If the byte lengths of the normal file and the corrupted file do not match,
the file is considered a virus file and its opening is prevented. Before performing
this operation, a backup of all the files is taken so that the data is not lost,
corrupted, or deleted during the process. The backup files are later compared with
the current files, which are affected by the virus.

The pattern checker module works on two files: the file in the original folder and
the file in the backed-up folder. The backed-up folder contains files that were
backed up before being affected by the virus. Secure Hash Algorithms (SHA) are a
class of cryptographic hash functions designed to keep data secure. A SHA-256 hash
value can be calculated for each file, and if the file is modified, its SHA-256
value changes. Therefore, if the original file is attacked by a virus, its hash
value will change. Each virus has a SHA value which represents its pattern or
signature. The file is checked for the virus pattern by calculating the SHA-256
values of the original file and the backed-up file and comparing them. If the
values differ, the virus pattern matches the file. In this project we have
considered a sample file and calculated the hash value for that file.
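
The two checks described above, hash comparison followed by byte-length comparison,
can be combined as in the sketch below. It is a minimal illustration under
assumptions, not the code from the appendix: the class DaptCheck, the Verdict enum,
and the methods compare and shouldBlockOpening are hypothetical names. The text
only specifies that a file whose length differs from its backup is blocked; how to
treat a file whose hash changed but whose length did not is left to the caller
here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of the pattern checker plus byte comparison for one file pair.
final class DaptCheck {

    enum Verdict { UNCHANGED, CHANGED_SAME_LENGTH, CHANGED_LENGTH_DIFFERS }

    /** Compares the current file with its backup copy. */
    static Verdict compare(Path current, Path backup) throws IOException {
        byte[] currentBytes = Files.readAllBytes(current);
        byte[] backupBytes = Files.readAllBytes(backup);
        // Pattern checker: equal SHA-256 digests mean the file is treated as unmodified.
        if (MessageDigest.isEqual(sha256(currentBytes), sha256(backupBytes))) {
            return Verdict.UNCHANGED;
        }
        // Byte generation and comparison: compare the lengths in bytes.
        return currentBytes.length != backupBytes.length
                ? Verdict.CHANGED_LENGTH_DIFFERS
                : Verdict.CHANGED_SAME_LENGTH;
    }

    /** Blocks opening only in the case the text describes: the lengths differ. */
    static boolean shouldBlockOpening(Verdict verdict) {
        return verdict == Verdict.CHANGED_LENGTH_DIFFERS;
    }

    private static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path backup = Files.createTempFile("backup", ".txt");
        Path current = Files.createTempFile("current", ".txt");
        Files.write(backup, "report".getBytes(StandardCharsets.UTF_8));
        Files.write(current, "report+payload".getBytes(StandardCharsets.UTF_8));
        Verdict verdict = compare(current, backup);
        System.out.println(verdict + ", block opening: " + shouldBlockOpening(verdict));
    }
}
```

Reading whole files into memory keeps the sketch short; for large files a streaming
digest would be the more economical choice.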

3.2.1 ADVANTAGES OF PROPOSED SYSTEM

• Effective method for detecting viruses by identifying the signature.

• No inferior scanning process involved.

• Immediate alert to the user when a virus file is found.

• Scanning of files and comparing to find out if the file is virus affected.

CHAPTER – 4

SYSTEM REQUIREMENT SPECIFICATION

4.1 INTRODUCTION
The requirements specification is a technical specification of the requirements
for the software product. It is the first step in the requirements analysis process
and lists the requirements of a specific software system, including functional,
performance, and security requirements. The requirements also provide usage
scenarios from user, operational, and administrative perspectives. The purpose of
the software requirements specification is to provide a detailed overview of the
software project, its parameters, and its goals. It describes the project's target
audience and its user interface, hardware, and software requirements.

4.2 HARDWARE AND SOFTWARE REQUIREMENTS

4.2.1 HARDWARE SPECIFICATION


Processor              : Core i5 and above
Hard Disk              : 1 TB
RAM                    : 8 GB or more

4.2.2 SOFTWARE SPECIFICATION

Operating System       : Windows 8 and above
Tool                   : VMware and Eclipse
Cloud Service Provider : Google Cloud
Coding Language        : Java

4.3 TOOLS USED


4.3.1 VMWARE

VMware is a virtualization and cloud computing software provider based in Palo
Alto, California. Founded in 1998, VMware is a subsidiary of Dell Technologies.
EMC Corporation originally acquired VMware in 2004; EMC was later acquired by
Dell Technologies in 2016. VMware bases its virtualization technologies on its
bare-metal hypervisor ESX/ESXi on the x86 architecture.

With VMware server virtualization, a hypervisor is installed on the physical server
to allow multiple virtual machines (VMs) to run on the same physical server. Each
VM can run its own operating system (OS), which means multiple OSes can run on one
physical server. All the VMs on the same physical server share resources, such as
networking and RAM.

VMware is low-level enough to make a guest OS appear to be receiving hardware
interrupts (such as timer interrupts) and behave as if it were the only OS on the
machine. At the same time, it provides isolation so that a failure or misbehavior
of a guest OS does not affect other guest OSes or the underlying system. For
instance, a guest OS crashing will not crash the underlying system. As opposed to a
software simulator, much of the code running in a VM executes directly on the
hardware without interpretation. Only privileged instructions are trapped and
impose additional overhead. The major advantage of using a VM as opposed to a
simulator is the performance improvement possible through direct execution of
unprivileged instructions.

Operating systems currently supported as guest operating systems under VMware
include Windows 95/98/2000/NT, FreeBSD, Solaris, Novell Netware, DOS, and Linux,
all of which run unmodified. Theoretically, any OS that can run on an x86
architecture can run as a guest OS, since it will see a complete virtualized PC
environment.

4.3.1.1 ARCHITECTURE OF VMWARE

Fig 4.3.1.1 (a) ARCHITECTURE OF VMWARE

The components and the overall architecture of ESXi are designed to ensure the
security of the ESXi system as a whole. From a security perspective, ESXi consists
of three major components: the virtualization layer, the virtual machines, and the
virtual networking layer. VMware designed the virtualization layer, or VMkernel,
to run virtual machines. It controls the hardware that hosts use and schedules the
allocation of hardware resources among the virtual machines. Virtual machines are
the containers in which applications and guest operating systems run. By design,
all VMware virtual machines are isolated from one another.

ESX ARCHITECTURE

Fig 4.3.1.1 (b) ARCHITECTURE OF VMWARE

4.3.1.2 ADVANTAGES OF USING VMWARE

• Cost-effective use of hardware.

• Large portions of the production environment can be replicated on a few servers.

• Lower cost of hardware for the entire test environment.

• Faster rollback during testing.

• Faster deployment of a new test platform.

• Test VMs can be decommissioned and even deleted when they are no longer needed.

• Reliability.

• Security.

• Manageability.

• Hardware independence.

4.3.2 ECLIPSE
Eclipse is an integrated development environment (IDE) used in computer
programming, and it is one of the most widely used Java IDEs. It contains a base
workspace and an extensible plug-in system for customizing the environment.
Eclipse is written mostly in Java and its primary use is for developing Java
applications, but it may also be used to develop applications in other programming
languages via plug-ins.

4.3.2.1 ARCHITECTURE OF ECLIPSE

Fig 4.3.2.1 ARCHITECTURE OF ECLIPSE


4.3.2.2 ADVANTAGES OF USING ECLIPSE

• Using an IDE saves time and effort.

• Navigation is made easier.

• Auto-completion of keywords.

• Debugging is easy; the developer can navigate directly to the line with the error.

• All files can be viewed and managed on the same screen.

• It is free and open source.


4.3.3 GOOGLE CLOUD

Google Cloud Platform is a suite of public cloud computing services offered
by Google. The platform includes a range of hosted services for compute, storage
and application development that run on Google hardware. Google Cloud Platform
services can be accessed by software developers, cloud administrators and other
enterprise IT professionals over the public internet or through a dedicated network
connection.

4.3.3.1 ARCHITECTURE OF GOOGLE CLOUD

Fig 4.3.3.1 ARCHITECTURE OF GOOGLE CLOUD

4.3.3.2 ADVANTAGES OF USING GOOGLE CLOUD

• Better pricing plans

• Enhanced execution

• Benefits of live migration

• Private network

• Commitment to constant development

• Control and security

• Redundant backups

4.4 TECHNOLOGY USED

4.4.1 JAVA
Java is a programming language that produces software for multiple
platforms. When a programmer writes a Java application, the compiled code
(known as bytecode) runs on most operating systems (OS), including Windows,
Linux and Mac OS. Java derives much of its syntax from the C and C++
programming languages.

Java produces applets (browser-run programs), which facilitate graphical user
interface (GUI) and object interaction by Internet users. Prior to Java applets,
Web pages were typically static and non-interactive. Java applets have diminished
in popularity with the release of competing products, such as Adobe Flash and
Microsoft Silverlight. Java applets run in a Web browser with a Java Virtual
Machine (JVM), which translates Java bytecode into native processor instructions
and allows indirect OS or platform program execution. The JVM provides the majority
of the components needed to run bytecode, which is usually smaller than executable
programs written in other programming languages. Bytecode cannot run if a system
lacks the required JVM.

Java program development requires a Java software development kit (SDK) that
typically includes a compiler, interpreter, documentation generator and other tools
used to produce a complete application. 

Development time may be accelerated through the use of integrated development
environments (IDEs) such as JBuilder, NetBeans, Eclipse or JCreator. IDEs
facilitate the development of GUIs, which include buttons, text boxes, panels,
frames, scrollbars and other objects, via drag-and-drop and point-and-click actions.

Java programs are found in desktops, servers, mobile devices, smart cards and
Blu-ray Discs (BD).

4.4.1.1 FEATURES OF JAVA

Java is a simple programming language: it is easy to learn, and Java code is easy
to read and write. It is familiar throughout the IT industry. Java is similar to
C/C++ but removes drawbacks and complexities of C/C++ such as pointers and multiple
inheritance, so programmers with a background in C/C++ will find Java familiar and
easy to learn. Java is an object-oriented programming language. Unlike C++, which
is semi object-oriented, Java is a fully object-oriented programming language. It
has all OOP features such as abstraction, encapsulation, inheritance and
polymorphism.

One major advantage of the Java programming language is its robustness. With
automatic garbage collection and a simple memory management model (no pointers
as in C/C++), plus language features like generics and try-with-resources, Java
guides programmers toward reliable programming habits for creating highly reliable
applications.
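As a small illustration of one of these features, the sketch below shows how try-with-resources releases a resource automatically. The TrackedResource class is hypothetical and exists only to make the closing behaviour visible; it is not part of the project code.

```java
class TryWithResourcesDemo {
    // Uses the resource and returns it so the caller can inspect its state.
    static TrackedResource useAndRelease() {
        TrackedResource handle = new TrackedResource();
        try (TrackedResource r = handle) {
            r.read();
        } // r.close() runs here, even if read() had thrown an exception
        return handle;
    }

    public static void main(String[] args) {
        System.out.println("closed after try block: " + useAndRelease().closed);
    }
}

// Hypothetical resource type, used only for this illustration.
class TrackedResource implements AutoCloseable {
    boolean closed = false;

    String read() {
        if (closed) {
            throw new IllegalStateException("resource already closed");
        }
        return "data";
    }

    @Override
    public void close() {
        closed = true; // called automatically when the try block exits
    }
}
```

The same construct can replace an explicit finally block whenever a stream such as FileInputStream has to be closed.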

It is secure: the Java platform is designed with security features built into the
language and runtime system, such as static type-checking at compile time and
runtime checking (security manager), which make it harder for malicious code to
compromise Java applications. Java source code is compiled to bytecode before
execution rather than interpreted directly from source, which contributes to good
performance. The runtime environment of Java is independent of the operating
system platform. That is, the bytecode can be executed on any operating system
that has a JVM.

Java code is compiled into bytecode, which the Java virtual machine (JVM) can
compile further to native code at run time, so that Java applications execute at
high speed. In addition, compute-intensive code can be rewritten in native code and
interfaced with the Java platform via the Java Native Interface (JNI), thus improving
performance.

The Java platform is designed with multithreading capabilities built into the
language. That means you can build applications with many concurrent threads of
activity, resulting in highly interactive and responsive applications.

Java code is compiled into an intermediate format (bytecode), which can be executed
on any system to which the Java virtual machine has been ported. That means you can
write a Java program once and run it on Windows, Mac, Linux or Solaris without
recompiling. Hence Java's slogan, “Write once, run anywhere”.

Besides the above features, programmers can benefit from a strong and vibrant
Java ecosystem:

• Java is powered by Oracle, one of the leaders in the industry. Java
also gets enormous support from big technology companies like IBM,
Google and Red Hat, so it has kept evolving over the years.
• Java is easy to learn. Java was designed to be easy to use and is
therefore easier to write, compile, debug and learn than many other
programming languages.
• There are a lot of open source libraries which you can choose from when
building your applications.
• There are many superior tools and IDEs that make Java
development easier.
• There are many frameworks that help you build highly reliable
applications quickly.
• The community around Java technology is very big and mature, so
you can get support easily.

CHAPTER – 5
SYSTEM DESIGN AND IMPLEMENTATION

5.1 ARCHITECTURE DIAGRAM

Fig 5.1 ARCHITECTURE DIAGRAM

5.2 OVERVIEW

The above architecture diagram shows the overall system architecture.
VMware is used to run the guest OS. The general system consists of the
application, kernel mode and the hardware. The first task in the system is the
Pattern Checker, which scans the incoming files and looks for any trace of malware
or its signatures. This is the initial test. If no malware is found, the process
moves to the next task. The second task is Byte Generation, which generates the
bytes for each file and stores them. A backup of all the files is taken before they
can be affected. The process then moves to the next part, Byte Comparison.
Each file has different byte content and varies in size. Byte Comparison compares
the backed-up file with the possibly infected file, byte by byte. If the file size,
that is, the byte value of the file, has changed, the file is considered to be a
virus-infected file.
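The backup and comparison steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the class name BackupThenCompare, the method names and the temporary directories are all hypothetical, and a tampered file is simulated by rewriting a sample text file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

class BackupThenCompare {
    // Step 1: keep a known-good copy of the file before it can be modified.
    static Path backUp(Path original, Path backupDir) throws IOException {
        Path target = backupDir.resolve(original.getFileName());
        return Files.copy(original, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Step 2: report whether the current file still matches its backup.
    static boolean isUnchanged(Path current, Path backup) throws IOException {
        if (Files.size(current) != Files.size(backup)) {
            return false; // cheap size check first
        }
        return Arrays.equals(Files.readAllBytes(current), Files.readAllBytes(backup));
    }

    public static void main(String[] args) throws IOException {
        // Assumed locations: temporary directories stand in for real folders.
        Path workDir = Files.createTempDirectory("dapt-work");
        Path backupDir = Files.createTempDirectory("dapt-backup");
        Path file = workDir.resolve("sample.txt");
        Files.write(file, "clean contents".getBytes(StandardCharsets.UTF_8));

        Path backup = backUp(file, backupDir);
        System.out.println("unchanged before tampering: " + isUnchanged(file, backup));

        // Simulate an infection by rewriting the monitored file with extra bytes.
        Files.write(file, "clean contents + payload".getBytes(StandardCharsets.UTF_8));
        System.out.println("unchanged after tampering: " + isUnchanged(file, backup));
    }
}
```

Checking the size first is only a shortcut; the byte-by-byte check is still needed because a modification can leave the file size unchanged.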

5.3 UML DIAGRAMS

The Unified Modeling Language (UML) is a standard visual modeling
language intended to be used for modeling business and similar processes,
analysis, design, and implementation of software-based systems.

UML is a common language for business analysts, software architects and
developers, used to describe, specify, design, and document existing or new
business processes and the structure and behavior of artifacts of software systems.
UML can be applied to diverse application domains (e.g., banking, finance, internet,
aerospace, healthcare). It can be used with all major object and component
software development methods and various implementation platforms (e.g.,
J2EE, .NET). UML is not a programming language, but there are tools that can
generate code in various languages from UML diagrams. UML has a
direct relation with object-oriented analysis and design. UML diagrams provide a
way of visualizing the system design, and different types of UML diagrams can be
used for designing the architecture.

5.3.1 ACTIVITY DIAGRAM

An activity diagram is a flowchart that represents the flow from one activity to
another. An activity can be described as an operation of the system, and
the control flow is drawn from one activity to the next. In this system,
some of the activities are Pattern Checker, Byte Generation and Byte Comparison.

Fig 5.3.1 ACTIVITY DIAGRAM

5.3.2 USECASE DIAGRAM

Use case diagrams give a graphic overview of the actors involved in a
system, the different functions needed by those actors and how these
functions interact. A use case diagram can identify the different types of
users of a system and the different use cases, and it will often be accompanied by
other types of diagrams as well. The use cases are represented by either circles or
ellipses. In this system, the only actors are the users, who scan files using the
proposed system and get an alert if a virus file is found during the scan.

Fig 5.3.2 USE CASE DIAGRAM

5.3.3 CLASS DIAGRAM

A class diagram shows the classes in a system, the attributes and operations of each
class and the relationships between the classes. The classes in a class diagram
represent both the main elements and interactions in the application and the classes
to be programmed. It is the main building block of any object-oriented solution. A
class has three parts: the name at the top, attributes in the middle and operations
or methods at the bottom. In large systems with many related classes, classes are
grouped together to create class diagrams. A class diagram is used to represent the
object-oriented design of an application or a product. Different relationships
between classes are shown by different types of arrows. In the design of a system,
a number of classes are identified and grouped together in a class diagram that
helps to determine the static relations between them. Detailed modelling gives a
better visualization of the object-oriented design.

Fig 5.3.3 CLASS DIAGRAM

5.3.4 SEQUENCE DIAGRAM

A sequence diagram in UML shows how objects interact with each other and the
order in which those interactions occur. It is important to note that it
shows the interactions for a specific scenario. A sequence diagram consists of
actors, lifelines, messages and guards. There are different types of messages, such
as synchronous, asynchronous, create and delete messages. A message sent from
outside the diagram can be represented by a message originating from a filled-in
circle (a found message in UML) or from the border of the sequence diagram. The
participants in this process are the user, the pattern checker, byte generation and
byte comparison. The sequence diagram represents the sequence of the process flow:
each participant is drawn as a vertical lifeline and the interactions are shown
as arrows. The sequence of activities in our system is shown below. The first
message represents the action that the user performs: files are sent to the
pattern checker module to find traces of a virus in the file. Once the pattern
checker condition is true, the files are sent to the other two modules to
confirm whether the file is affected by the virus.

Fig 5.3.4 SEQUENCE DIAGRAM

5.3.5 COMPONENT DIAGRAM

Component diagrams are used to visualize the organization of and relationships
among the components in a system. A component is something required to execute a
stereotype function. Examples of stereotypes in components include executables,
documents, database tables, files and library files. Components are wired together
by using an assembly connector to connect the required interface of one
component with the provided interface of another component. These diagrams are
used to build an executable system. The components in this system include
User and System. When a component diagram is used to show the internal structure
of a component, the provided and required interfaces of the encompassing
component can delegate to the corresponding interfaces of the contained
components.

Fig 5.3.5 COMPONENT DIAGRAM

5.3.6 DEPLOYMENT DIAGRAM

A deployment diagram is a structure diagram that shows the architecture of the
system as the deployment of software artifacts to deployment targets. It is used to
visualize the topology of the physical components of a system on which the software
components are deployed, and it consists of nodes and their relationships. Artifacts
represent concrete elements in the physical world that are the results of a
development process. The deployment diagram represents the deployment view of a
system and describes its software artifacts. It is related to the component diagram
because the components are deployed using deployment diagrams. Nodes are the
physical hardware used to deploy the application. In this project, the
deployment diagram consists of the user and the system as its components, and it
describes the relation between the components and how the process works.

Figure 5.3.6 DEPLOYMENT DIAGRAM

5.3.7 COLLABORATION DIAGRAM

A collaboration diagram, also called a communication diagram, is an
illustration of the relationships and interactions among software objects in the
Unified Modeling Language. It resembles a flowchart that portrays the roles,
functionality and behavior of individual objects as well as the overall operation of
the system in real time. Objects are shown as rectangles with naming labels inside.
These labels are preceded by colons and may be underlined. The relationships between
the objects are shown as lines connecting the rectangles. The messages between
objects are shown as arrows connecting the relevant rectangles, along with labels
that define the message sequencing. Collaboration diagrams are best suited to the
portrayal of simple interactions among relatively small numbers of objects. As the
number of objects and messages grows, a collaboration diagram can become
difficult to read. Several vendors offer software for creating and editing
collaboration diagrams. The collaboration diagram below illustrates the
relationships between the modules inside the system, which are
represented by messages. In this project, the collaboration
diagram contains the pattern checker, files, byte generation and byte comparison as
objects.

Fig 5.3.7 COLLABORATION DIAGRAM

CHAPTER – 6

SYSTEM IMPLEMENTATION

6.1 MODULE DESCRIPTION

The different modules in the system are as follows:

• Pattern Checker
• Byte Generation
• Byte Comparison

6.1.1 PATTERN CHECKER

The file is sent to the pattern checker, which checks it for a virus signature.
If the file has a virus signature, it may be affected by a virus. The SHA-256
value of the file is used as the benchmark for pattern checking: a SHA-256 hash is
calculated for the original file and for its backed-up copy, and the two hash values
are compared. If a file is modified, its SHA-256 value changes. Therefore, if the
original file is attacked by a virus, its hash value no longer equals that of the
backup, which is the initial sign of a virus in the computer.

Secure Hash Algorithms (SHA) are a class of cryptographic hash functions designed
to keep data secure. They transform the data into a hash value by applying
bitwise operations, modular additions and compression functions. The main families
are SHA-1, SHA-2 and SHA-3, and their outputs differ in the number of bits. Here the
SHA-256 algorithm, a member of the SHA-2 family, is used. It produces a 256-bit
(32-byte) signature for any input, and it is practically infeasible to find two
different inputs with the same signature. Each virus has a SHA value which
represents its pattern or signature. If the hash values of the original file and the
backed-up file differ, the virus pattern is taken to match the file. In this project
we have considered a sample file and calculated the hash value for that file.

Fig 6.1.1 PATTERN CHECKER MODULE

The pseudo code for the pattern checker is given below.

class Sha256

    function main
        create MessageDigest object md
        set hex to checksum() method
        set hex2 to checksum() method
        print hex
        print hex2
        set isequal to true
        for i := 1 to length of hex
        do
            if (hex[i] != hex2[i])
            do
                set isequal to false
        if isequal is true
        do
            print "the virus pattern does not match with the file"
        else
            print "the virus pattern matches with the file"

    function checksum
        // file hashing with DigestInputStream
        create DigestInputStream object dis
        while dis.read() is not equal to -1
            do nothing // empty loop that reads all the data through the digest
        set md to dis.getMessageDigest()
        // bytes to hex
        create string builder object result
        for byte b in md.digest()
        do
            append the hex value of b to result
        return result
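To make the hashing step concrete, the following is a minimal sketch that hashes in-memory data instead of files. It uses the standard java.security.MessageDigest class; the class name Sha256Sketch, the method name sha256Hex and the sample inputs are assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha256Sketch {
    // Returns the SHA-256 digest of the input as 64 lowercase hex characters.
    static String sha256Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest(data); // always 32 bytes for SHA-256
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // "abc" stands in for the backed-up content, "abd" for a modified copy.
        String original = sha256Hex("abc".getBytes(StandardCharsets.UTF_8));
        String modified = sha256Hex("abd".getBytes(StandardCharsets.UTF_8));
        System.out.println(original);
        System.out.println(modified);
        System.out.println("digests equal: " + original.equals(modified));
    }
}
```

For the input "abc" the first line printed is the standard test vector ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad. Changing a single character produces a different digest, which is the property the pattern checker relies on.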

6.1.2 BYTE GENERATION

If the initial test was successful, the next process is second-level
testing. The first stage of second-level testing is byte generation, where the
files are converted and represented in byte format. First, two byte arrays of
appropriate size are created. Then the function that converts a file to bytes is
called; it returns the array to the calling function, and the byte arrays are saved
to temporary files. The called function creates a byte array whose length equals
the length of the file. The file passed as an argument is then opened, and a
FileInputStream object is used to read bytes from the file into the byte array.
The Java FileInputStream class obtains input bytes from a file in a file system and
is used for reading byte-oriented data (streams of raw bytes) such as image data,
audio, video or text. The byte array is then returned to the calling function.
Each file has its own byte values. If the contents of the file change, its byte
values change as well. If the file is attacked by a virus, its contents and
therefore its byte values change. So the byte values of the file are used as a
benchmark to check whether the file is a virus-affected file.

Fig 6.1.2 BYTE GENERATION MODULE
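The byte generation step can be sketched as below. This is an illustrative variant, not the project's code: it loops over read() because a single FileInputStream.read(byte[]) call is not guaranteed to fill the whole array. The class name ByteGenerationSketch and the temporary sample file are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ByteGenerationSketch {
    // Reads the whole file through FileInputStream, looping because one
    // read() call may return fewer bytes than the buffer can hold.
    static byte[] readBytesFromFile(String filePath) throws IOException {
        try (InputStream in = new FileInputStream(filePath);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed input: a small temporary file stands in for a monitored file.
        Path sample = Files.createTempFile("dapt-sample", ".bin");
        Files.write(sample, "hello".getBytes(StandardCharsets.UTF_8));
        byte[] bytes = readBytesFromFile(sample.toString());
        System.out.println("bytes read: " + bytes.length);
        Files.delete(sample);
    }
}
```

Where no streaming is needed, the standard Files.readAllBytes method gives the same result in one call.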

6.1.3 BYTE COMPARISON

This module involves comparison of the byte values of the files. Every file
has its own byte values. If the contents of the file change, then
the byte values of the file also change. Thus, if the original file is affected by
a virus, the changes are reflected in the byte values of the file. Every
single byte is compared for both files. If the byte values are
different, we can conclude that the file is a virus-affected file. The pseudocode
for byte generation and comparison is given below.

Class Filebyte

    Function main
        Set isequal to true
        Create bfile[] byte array for original file
        Create bfile2[] byte array for backed up file
        bfile = readbytesfromfile(filepath of original file)
        bfile2 = readbytesfromfile(filepath of backed up file)
        Set bfile to a temporary file1
        Set bfile2 to a temporary file2
        if bfile.length() != bfile2.length()
        do
            isequal = false
        Set i to zero
        Set j to zero
        while i < bfile.length() && j < bfile2.length()
        do
            if bfile[i] != bfile2[j]
            do
                isequal = false
            i++
            j++
        if (isequal == true)
        do
            print “The bytes value of the files are unchanged”
        else
            print “The bytes value of the files have changed! The file is affected by virus”

    Function readbytesfromfile(String filepath)
        Set bytearray[] to null
        try
            FILE file = new FILE(filepath)
            bytearray = new BYTE[(int) file.length()]
            fileInputStream = new FileInputStream(file)
            fileInputStream.read(bytearray)
        catch(exception)
        finally
            fileInputStream.close()
        return bytearray

Fig.6.1.3 BYTE COMPARISON MODULE
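The comparison described in this module can also report where two files first differ. The sketch below is one possible way to do this, assuming the file contents have already been read into byte arrays; the class name ByteComparisonSketch, the method name firstDifference and the sample arrays are hypothetical.

```java
class ByteComparisonSketch {
    // Returns the index of the first difference, or -1 if the arrays match.
    // A length difference counts as a difference at the end of the shorter array.
    static int firstDifference(byte[] original, byte[] current) {
        int common = Math.min(original.length, current.length);
        for (int i = 0; i < common; i++) {
            if (original[i] != current[i]) {
                return i;
            }
        }
        return original.length == current.length ? -1 : common;
    }

    public static void main(String[] args) {
        byte[] backup = {10, 20, 30, 40};
        byte[] sameContent = {10, 20, 30, 40};
        byte[] changedByte = {10, 20, 99, 40};
        byte[] appended = {10, 20, 30, 40, 50};
        System.out.println(firstDifference(backup, sameContent)); // -1
        System.out.println(firstDifference(backup, changedByte)); // 2
        System.out.println(firstDifference(backup, appended));    // 4
    }
}
```

Treating a length difference as a mismatch matters because bytes appended to a file leave every byte of the common prefix unchanged.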

CHAPTER – 7

SOFTWARE TESTING

7.1 GENERAL

Testing is the process of detecting errors. Testing plays a critical role in
assuring quality and ensuring the reliability of software. The results of testing
are also used later, during maintenance.

7.2 TESTING OBJECTIVES


The main objective of testing is to uncover a host of errors, systematically
and with minimum effort and time.

• Testing is a process of executing a program with the intent of finding
an error.

• A good test case is one that has a high probability of finding an error, if it
exists.

• Tests that find no errors may simply be inadequate to detect the errors that
are present.

• Otherwise, the software more or less conforms to the quality and reliability
standards.

7.3 TYPES OF TESTS


7.3.1 WHITE BOX TESTING
White Box Testing (also known as Clear Box Testing, Open Box Testing,
Glass Box Testing, Transparent Box Testing, Code-Based Testing or Structural
Testing) is defined as the testing of a software solution's internal structure, design,
and coding. In this type of testing, the code is visible to the tester. It focuses
primarily on verifying the flow of inputs and outputs through the application,
improving design and usability, and strengthening security. It is usually performed
by developers. White box testing goes beyond the user interface and into the
nitty-gritty of a system.

7.3.2 BLACK BOX TESTING

Black box testing is defined as a testing technique in which the functionality of
the Application Under Test (AUT) is tested without looking at the internal code
structure, implementation details or internal paths of the software.
This method of testing can be applied at virtually every level of software
testing: unit, integration, system and acceptance. It is sometimes referred to as
specification-based testing.

This type of testing is based entirely on software requirements and specifications.
For example, a tester, without knowledge of the internal structures of a website,
tests the web pages by using a browser, providing inputs (clicks, keystrokes) and
verifying the outputs against the expected outcome.

7.3.3 UNIT TESTING

Unit testing involves the design of test cases that validate that the internal
program logic is functioning properly and that program inputs
produce valid outputs. The objective of unit testing is to isolate a section of code
and verify its correctness. All decision branches and internal code flow should be
validated. It is the testing of individual software units of the application, done
after the completion of an individual unit and before integration. In the SDLC, STLC
and V-Model, unit testing is the first level of testing, done before integration
testing. Unit testing is a white box testing technique that is usually performed by
the developer. In practice, however, due to time constraints or the reluctance of
developers to test, QA engineers also do unit testing.

This is structural testing that relies on knowledge of the code's construction and
is invasive. Unit tests perform basic tests at component level and test a specific
business process, application, and/or system configuration. For example, testing
whether a function, loop or statement in a program works properly is
unit testing. An example of a framework that allows automated
unit testing is JUnit (a unit testing framework for Java).
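A unit test of this kind can be sketched without any framework, as shown below. The helper isModified and the test cases are hypothetical and only illustrate the idea of checking one isolated unit; with JUnit the same checks would normally be written as annotated test methods.

```java
import java.util.Arrays;

class ByteCheckTest {
    // Unit under test (hypothetical helper): true when the two contents differ.
    static boolean isModified(byte[] backup, byte[] current) {
        return !Arrays.equals(backup, current);
    }

    // Minimal assertion helper so the sketch runs without a test framework.
    static void check(boolean condition, String testName) {
        if (!condition) {
            throw new AssertionError("test failed: " + testName);
        }
    }

    // Runs every test case and returns how many passed.
    static int runAll() {
        int passed = 0;
        check(!isModified(new byte[] {1, 2, 3}, new byte[] {1, 2, 3}), "identical contents");
        passed++;
        check(isModified(new byte[] {1, 2, 3}, new byte[] {1, 9, 3}), "one byte changed");
        passed++;
        check(isModified(new byte[] {1, 2, 3}, new byte[] {1, 2, 3, 4}), "bytes appended");
        passed++;
        check(!isModified(new byte[0], new byte[0]), "two empty files");
        passed++;
        return passed;
    }

    public static void main(String[] args) {
        System.out.println(runAll() + " tests passed");
    }
}
```

Each case exercises one branch of the unit (equal contents, changed byte, changed length, empty input), which is the kind of decision coverage the paragraph above asks for.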

7.3.4 INTEGRATION TESTING

Integration testing is defined as a type of testing where software modules
are integrated logically and tested as a group. A typical software project consists
of multiple software modules, coded by different programmers. Integration testing
focuses on checking data communication amongst these modules.

Software integration testing is the incremental integration testing of two or more
integrated software components on a single platform to find failures caused by
interface defects. The task of the integration test is to check that components or
software applications, e.g. components in a software system or
software applications at the company level, interact without error. For
example, testing the keyboard of a computer on its own is unit testing, but
combining the keyboard and the mouse to see whether they work together
is integration testing. It is therefore a prerequisite for
integration testing that each part of the system has been unit tested.

7.3.5 VALIDATION TESTING

Validation is the process of evaluating the final product to check whether the
software meets the business needs. In simple words, the test execution that is
done day to day is the validation activity, which includes smoke
testing, functional testing, regression testing, system testing, etc.

A validation test succeeds when the software functions in a manner that can be
reasonably expected by the client. Software validation is achieved through a series
of black box tests which confirm conformance to the requirements. Black box testing
is conducted at the software interface. The test is designed to uncover interface
errors; it is also used to demonstrate that the software functions are operational,
that input is properly accepted and output is produced, and that the integrity of
external information is maintained.

CHAPTER - 8

CONCLUSION AND FUTURE WORK

8.1 CONCLUSION

Virus detection is a very topical subject domain. This system has been developed to
overcome the problems associated with traditional signature-based virus detection.
The objective of this project is to scan each file that enters the system, so that
a virus-affected file can be detected and the user prevented from opening
it. The preliminary test checks for any virus signatures in
the file. Before the second step, a backup of all the files is taken. The file is
then represented in bytes and, after a virus may have infected it, compared with the
backed-up copy using the byte comparison method. To conclude, a file can be
considered a virus-infected file if there is a size
difference between the original file and the affected file.

In this project, we focused on the detection of a virus-affected file and on
preventing the spread of the virus by preventing the file from being opened.
Generally, a virus that affects files spreads when the file is opened. Its operation
is therefore stopped by a prevention model that uses two levels of checking to
confirm the presence of a virus in a file. This project has concentrated more on
virus detection because once the virus is detected, its spread can be stopped by not
opening the file. This is the first level of prevention. The experimental results
confirm that the approach detects the virus correctly.

8.2 FUTURE WORK

In this project, we focused on detecting the virus-affected file so that its
execution can be stopped by not opening the file. In future, we will try to stop
the process that is executed by the virus so that the virus can be eradicated, try
to recover the file, and implement this model in a cloud environment.

CHAPTER – 9

APPENDICES

9.1 SAMPLE CODING

PATTERN CHECKER CODE

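import java.io.FileInputStream;
import java.io.IOException;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha256 {

    public static void main(String[] args) throws NoSuchAlgorithmException, IOException {
        MessageDigest md = MessageDigest.getInstance("SHA-256"); //SHA, MD2, MD5, SHA-256, SHA-384...
        String hex = checksum("C://Users//Maalolan//dll backup files//KEYBOARD.SYS", md);
        String hex2 = checksum("C://Windows//System32//KEYBOARD.SYS", md);
        System.out.println(hex);
        System.out.println(hex2);

        // equal digests mean the system file still matches its backup
        boolean isequal = hex.equals(hex2);
        if (isequal)
            System.out.println("equal");
        else
            System.out.println("not equal");
    }

    // file hashing with DigestInputStream, following the pseudocode in section 6.1.1
    private static String checksum(String filepath, MessageDigest md) throws IOException {
        try (DigestInputStream dis = new DigestInputStream(new FileInputStream(filepath), md)) {
            while (dis.read() != -1) ; // empty loop: reading feeds the data to the digest
            md = dis.getMessageDigest();
        }
        // bytes to hex
        StringBuilder result = new StringBuilder();
        for (byte b : md.digest()) {
            result.append(String.format("%02x", b));
        }
        return result.toString();
    }
}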

BYTE CONVERSION AND COMPARISON CODE

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Filebit {

    public static void main(String[] args) {
        boolean isequal = true;
        int i = 0, j = 0;
        try {
            // convert file to byte[]
            // Note: both paths are identical in this listing; for a real check one of
            // them should point to the backed-up copy of the file.
            byte[] bFile = readBytesFromFile("C://Windows//System32//kernel32.dll");
            byte[] bFile2 = readBytesFromFile("C://Windows//System32//kernel32.dll");

            // save byte[] into a file
            Path path = Paths.get("C://users//Maalolan//Desktop//New Text Document(2).txt");
            Path path2 = Paths.get("C://users//Maalolan//Desktop//New Text Document(3).txt");
            Files.write(path, bFile);
            Files.write(path2, bFile2);

            // a difference in length already means the contents differ
            if (bFile.length != bFile2.length)
                isequal = false;

            // print bytes[] and compare them one by one
            while (i < bFile.length && j < bFile2.length) {
                System.out.print(bFile[i]);
                if (bFile[i] != bFile2[j])
                    isequal = false;
                i++;
                j++;
            }

            if (isequal)
                System.out.println("The file is not affected by virus");
            else
                System.out.println("The bytes value of the file has changed! The file has been affected by virus");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static byte[] readBytesFromFile(String filePath) {
        FileInputStream fileInputStream = null;
        byte[] bytesArray = null;
        try {
            File file = new File(filePath);
            bytesArray = new byte[(int) file.length()];
            // read file into bytes[]
            fileInputStream = new FileInputStream(file);
            fileInputStream.read(bytesArray);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (fileInputStream != null) {
                try {
                    fileInputStream.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        return bytesArray;
    }
}

9.2 SCREENSHOTS

PATTERN CHECKER OUTPUT

BYTE COMPARISON OUTPUT

CHAPTER – 10

REFERENCES

[1] Fred Cohen, “Computer virus”, Elsevier Science Publishers B.V (North-
Holland), 1987
[2] Sriram Kalyanaraman, “Technique to thwart the opening of a virus embedded
file without the aid of an anti-virus software”, International Journal of Computer
Science and Network Security (IJCSNS), VOL.11 No.9, September 2011
[3] Nitesh Kumar Dixit, Lokesh Mishra, Mahendra Singh Charan, Bhabesh
Kumar Dey, “The new age of computer virus and their detection”, International
journal of Network Security & Its Applications (IJNSA), Vol.4, No.3, May 2012
[4] Sulaiman Al Amro, Ali Alkhalifah, “A Comparative Study of Virus Detection
techniques”, World Academy of Science, Engineering and Technology,
International Journal of Computer and Information Engineering, Vol:9, No.6, 2015
[5] Babak Bashari Rad, Maslin Masrom and Sushaimi Ibrahim, “Evolution of
Computer Virus Concealment and Anti-Virus Techniques: A Short Survey”, IJCSI
International Journal of Computer Science Issues, Vol. 8, Issue 1, January 2011,
ISSN (Online): 1694-0814, www.IJCSI.org
[6] Diana Toma, Dominique Borrione, “SHA Formalization”, TIMA Laboratory,
Grenoble, France.
[7] Lars Baumgartner, Pablo Graubner, “Malware Detection and Kernel Rootkit
Prevention in Cloud Computing Environment”, Department of Mathematics and
Computer Science, University of Marburg, Hans Meerwein-Str. 3, D-35032,
Germany, March 2011
[8] http://www.offensivecomputing.net/
[9] https://www.virustotal.com/#/home/upload
