
ANDROID MALWARE DETECTION

USING MACHINE LEARNING


QIS COLLEGE OF ENGINEERING AND TECHNOLOGY

INFORMATION TECHNOLOGY

Guided by: K. Sreenath (Asst. Prof., M.Tech)

Presented by:
D.Akhila (17491A1207)
G.Jahnavi (17491A1211)
N.Preethi (17491A1220)
T.Bhaanu (17491A1229)
V.Srikanth (17491A1255)
CONTENTS

1. Abstract
2. Introduction
3. Existing System
4. Proposed System
5. Requirements
6. Modules
7. Modules Description
8. Conclusion
Abstract

 Android is the world's most popular operating system, with
over a billion users, and it has drawn the attention of
cybercriminals, who operate mainly through the wide distribution
of malicious applications. In recent years, wide-ranging research
has been conducted on malware analysis and detection, but those
techniques were not able to detect unknown malware. To detect
such malware, we propose an effective machine learning approach
that makes use of evolutionary genetic algorithms. The genetic
algorithm selects the features used to train machine learning
classifiers, whose capability in identifying malware is then
evaluated.
Introduction
 Malware is malicious software that is intentionally designed
to cause damage to computers, tablets, and smartphones.
 There are six main types of malicious software: viruses,
worms, Trojan horses, spyware, adware, and ransomware.
 Because of Android's open-source nature, many people
unknowingly install malware apps from the Google Play Store and
give their personal details to access the application.
 We are therefore developing antivirus software using malware
analysis and reverse engineering.
 Malware analysis is of two types: static and dynamic.
Existing System
The main contribution of the work is the reduction of the
feature dimension to less than half of the original feature set
using a genetic algorithm, so that it can be fed as input to
machine learning classifiers for training with reduced
complexity while maintaining their accuracy in malware
classification.
The optimized feature set obtained using the genetic algorithm
is used to train two machine learning algorithms: Support
Vector Machine and Neural Network.
Proposed System

 Two sets of Android apps (APKs), malware and goodware, are
reverse engineered to extract features such as permissions and
the counts of app components such as Activities, Services,
Content Providers, etc.
 In the proposed methodology, static features are obtained
from AndroidManifest.xml, which contains all the important
information the Android platform needs about an app. The
Androguard tool is used to disassemble the APKs and extract the
static features.
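As a rough illustration of the static features described above, the following sketch counts permissions and app components in a manifest. It assumes the manifest has already been decoded to plain XML text (inside an APK it is stored in a binary form, which a tool such as Androguard can decode). The function name, the sample manifest, and the package name are hypothetical; the project's actual extraction code is not shown in these slides.

```python
import xml.etree.ElementTree as ET

# Namespace used by Android manifest attributes such as android:name.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME_ATTR = f"{{{ANDROID_NS}}}name"
# The four Android app component types, by manifest tag name.
COMPONENT_TAGS = ("activity", "service", "provider", "receiver")


def extract_static_features(manifest_xml):
    """Return requested permissions and component counts from decoded manifest XML."""
    root = ET.fromstring(manifest_xml)
    permissions = sorted(
        elem.get(NAME_ATTR)
        for elem in root.iter("uses-permission")
        if elem.get(NAME_ATTR)
    )
    app = root.find("application")
    counts = {
        tag: (len(app.findall(tag)) if app is not None else 0)
        for tag in COMPONENT_TAGS
    }
    return {"permissions": permissions, "component_counts": counts}


# Hypothetical manifest, used only to exercise the function.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.demo">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.SEND_SMS"/>
  <application>
    <activity android:name=".MainActivity"/>
    <activity android:name=".SettingsActivity"/>
    <service android:name=".SyncService"/>
  </application>
</manifest>"""

features = extract_static_features(SAMPLE_MANIFEST)
print(features)
```

Each permission can then become one binary column and each component count one numeric column of the feature vector for an app.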
Software Requirements
 For developing the application:
1. Python
2. Django
3. MySQL
4. MySQL client
5. WampServer 2.4
Hardware Requirements
 Supported operating system:
1. Windows
 Processor – Pentium IV or higher
 RAM – 12 GB
Modules

1. Upload Android dataset
2. Generate Train & test model
3. Pre-processing
4. Run SVM & Neural network algorithms
5. Display Accuracy Graph
Upload Android dataset

We have collected malware datasets from previous attacks, and
we upload these datasets to our proposed model.

Based on these datasets, the model identifies whether an
Android app is malware or not.
Generate Train & test model

In this module, we divide the dataset in an 80:20 ratio into
training and test sets. We then train our model to identify
malware apps and links on Android.

After training is complete, we verify that the model works
properly by checking it against the test data.
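The 80:20 division described above can be sketched as follows. This is a minimal standard-library version; the function name, the fixed seed, and the toy samples are assumptions made for illustration, not the project's own code.

```python
import random


def split_80_20(samples, labels, test_ratio=0.2, seed=42):
    """Shuffle the data and split it into training and test portions."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(indices) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [samples[i] for i in train_idx]
    X_test = [samples[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Toy data (hypothetical): ten one-feature samples with alternating labels.
samples = [[i] for i in range(10)]
labels = [i % 2 for i in range(10)]
X_train, X_test, y_train, y_test = split_80_20(samples, labels)
print(len(X_train), len(X_test))
```

Shuffling before splitting avoids a test set made up only of the last rows of the file, which matters when a dataset is stored sorted by class.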
Pre-processing

 In this module, we analyze the complete data, remove null
values, and filter the data according to our requirements.
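One possible form of the null-value removal is sketched below. It assumes the dataset is a CSV file with integer feature columns and a label column; the column names (including `class`), the set of strings treated as missing, and the sample rows are all hypothetical.

```python
import csv
import io

# Strings treated as missing values (an assumption; adjust to the real dataset).
MISSING = {"", "null", "none", "nan"}


def clean_rows(csv_text, label_column="class"):
    """Drop rows with missing values and return (features, labels) as integers."""
    reader = csv.DictReader(io.StringIO(csv_text))
    feature_columns = [c for c in reader.fieldnames if c != label_column]
    features, labels = [], []
    for row in reader:
        values = [(row.get(c) or "").strip() for c in reader.fieldnames]
        if any(v.lower() in MISSING for v in values):
            continue  # skip incomplete rows instead of guessing their values
        features.append([int(row[c]) for c in feature_columns])
        labels.append(int(row[label_column]))
    return features, labels


# Hypothetical sample: two of the four rows contain a missing value.
SAMPLE_CSV = """SEND_SMS,INTERNET,activity_count,class
1,1,3,1
0,1,,0
0,null,2,0
0,1,5,0
"""

features, labels = clean_rows(SAMPLE_CSV)
print(features, labels)
```

Dropping incomplete rows is the simplest policy; filling in missing values would be an alternative, but the slides only mention removal.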
Run SVM & Neural network algorithms

To analyze the input data, we use feature selection driven by a
genetic algorithm. The selected features are used to train an
SVM classifier for better accuracy, and also a neural network,
so that the two can be compared.

We combine the genetic algorithm with both classifiers and can
also test each classifier on its own, so that we can identify
which performs better.
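The idea of genetic-algorithm feature selection can be sketched in a few lines. This is not the project's implementation: to stay dependency-free, a simple nearest-centroid classifier stands in for the SVM and the neural network, and the population size, mutation rate, size penalty, and toy data are all assumed values chosen for illustration.

```python
import random


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Accuracy of a nearest-centroid classifier using only features where mask is 1.

    This simple classifier is only a stand-in for the SVM / neural network.
    """
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in sorted(set(y_train)):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]

    def distance(x, centre):
        return sum((x[c] - m) ** 2 for c, m in zip(cols, centre))

    correct = 0
    for x, y in zip(X_test, y_test):
        predicted = min(centroids, key=lambda label: distance(x, centroids[label]))
        correct += predicted == y
    return correct / len(y_test)


def genetic_feature_selection(fitness, n_features, pop_size=20, generations=30,
                              mutation_rate=0.05, seed=0):
    """Return the best 0/1 feature mask found by a basic genetic algorithm."""
    rng = random.Random(seed)
    # Seed the population with the full feature set so the result is never worse.
    population = [[1] * n_features] + [
        [rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size - 1)
    ]
    best = max(population, key=fitness)

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            cut = rng.randint(1, n_features - 1)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Toy data (hypothetical): feature 0 equals the label, the others carry no signal.
X = [[i % 2, (i // 2) % 2, 1, (i // 4) % 2] for i in range(40)]
y = [i % 2 for i in range(40)]
X_train, y_train, X_test, y_test = X[:32], y[:32], X[32:], y[32:]


def fitness(mask):
    # Reward accuracy and lightly penalise large feature subsets.
    accuracy = nearest_centroid_accuracy(X_train, y_train, X_test, y_test, mask)
    return accuracy - 0.01 * sum(mask)


best_mask = genetic_feature_selection(fitness, n_features=4)
print(best_mask, round(fitness(best_mask), 2))
```

To compare classifiers as the module describes, the same search can be run once per classifier by swapping the accuracy function inside `fitness`, and each classifier can also be scored on the all-ones mask to see what it achieves without feature selection.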
Display Accuracy Graph

After the analysis is complete, the results are shown as a
graph of how accurate each algorithm is.

In the graph, the x-axis represents the algorithm name and the
y-axis represents accuracy. Overall, SVM achieved high
accuracy.
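The layout of such a graph (algorithm names on the x-axis, accuracy on the y-axis) can be sketched without any plotting library as a text bar chart. The function name is hypothetical, and the accuracy values below are placeholders chosen only to exercise the code; they are not results from this project.

```python
def render_accuracy_chart(accuracies, height=10):
    """Render a vertical text bar chart: names on the x-axis, accuracy on the y-axis."""
    names = list(accuracies)
    width = max(len(name) for name in names) + 2
    lines = []
    for level in range(height, 0, -1):
        threshold = level / height
        # Draw a bar segment wherever the accuracy reaches this level.
        row = "".join(
            ("#" if accuracies[name] >= threshold else " ").center(width)
            for name in names
        )
        lines.append(f"{threshold:4.1f} |{row}")
    lines.append("     +" + "-" * (width * len(names)))
    lines.append("      " + "".join(name.center(width) for name in names))
    return "\n".join(lines)


# Placeholder values for illustration only, not measured results.
placeholder_accuracies = {"SVM": 0.95, "Neural Network": 0.75}
chart = render_accuracy_chart(placeholder_accuracies)
print(chart)
```

In a real application a plotting library would normally draw this as a bar chart; the text version only shows how the two axes map onto the data.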
Conclusion
 As the number of threats posed to Android platforms increases
day by day, spreading mainly through malicious applications, it
is very important to design a framework that can detect such
malware accurately. The proposed methodology uses a genetic
algorithm to obtain an optimized feature subset, which is then
used to train machine learning algorithms efficiently.
THANK YOU
