
VIRUS DETECTION USING DEEP LEARNING

By
Saurabh Malusare
Rojan Sudev
Rishabh Nrupnarayan

Under The Guidance of


Prof. Anil M. Bhadgale
INTRODUCTION

A computer virus is a program or piece of code
that, when executed, replicates by reproducing
itself or infecting other computer programs by
modifying them.
VIRUS DETECTING TECHNIQUES

• Signature Based Detection


• Heuristic Based Detection
• Detection using Bait
LIMITATIONS OF CONVENTIONAL
TECHNIQUES

• Time lag between the creation of a virus and its detection
• A large signature database has to be maintained
• New virus patterns cannot be detected
PROBLEM DEFINITION

Using deep learning to classify whether a file is
a virus or legitimate, while overcoming the
existing limitations of conventional techniques.
System Architecture
Important fields of PE header:
Feature Selection
• Extract only the features relevant to classification
• Fisher Score algorithm used for feature selection
• Fisher Score assigns each feature a rank between 0 and 1
• The higher the rank, the more relevant the feature
Fisher Score formula:

F(i) = (µi,p − µi,n)² / (σ²i,p + σ²i,n)

where:
• µi,p = mean of positive samples for the ith PE header feature
• µi,n = mean of negative samples for the ith PE header feature
• σi,p = standard deviation of positive samples for the ith PE header feature
• σi,n = standard deviation of negative samples for the ith PE header feature
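The per-feature score above can be sketched in NumPy as follows. This is an illustrative implementation, not the project's actual code; the toy data, the `fisher_scores` name, and the small epsilon (added to avoid division by zero for constant features) are assumptions. Raw Fisher scores are not bounded by 1; to obtain ranks in [0, 1] as the slides describe, the scores would additionally be min-max rescaled.

```python
import numpy as np

def fisher_scores(X, y):
    """Fisher score for each feature column of X given binary labels y.

    X: (n_samples, n_features) array of PE-header feature values
    y: (n_samples,) labels, 1 = virus (positive), 0 = legitimate (negative)
    """
    pos, neg = X[y == 1], X[y == 0]
    mu_p, mu_n = pos.mean(axis=0), neg.mean(axis=0)
    var_p, var_n = pos.var(axis=0), neg.var(axis=0)
    # (mean difference)^2 over summed variances; eps guards constant features
    return (mu_p - mu_n) ** 2 / (var_p + var_n + 1e-12)

# Toy data: feature 0 separates the classes, feature 1 is noise
X = np.array([[5.0, 1.0], [6.0, 0.9], [1.0, 1.1], [0.0, 1.0]])
y = np.array([1, 1, 0, 0])
scores = fisher_scores(X, y)
```

The discriminative feature receives a much higher score than the noise feature, which is exactly how the top 21 features would be ranked.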
Feature Extraction
• Extract the 21 most relevant features as determined
by Fisher Score.
• These features are real-valued.
• Normalize features using min-max
normalization.
• Features are scaled to [0, 1].
• Normalized feature values are then converted
to binary values using the condition:

If feature > mean(feature)
    feature = 1
else
    feature = 0
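The normalization and binarization steps can be sketched together. This is a minimal illustration, assuming per-feature (column-wise) min, max, and mean over the dataset; the function name and sample values are made up.

```python
import numpy as np

def binarize_features(X):
    """Min-max normalize each feature to [0, 1], then threshold at its mean."""
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    # eps guards features that are constant across the dataset
    X_norm = (X - x_min) / (x_max - x_min + 1e-12)
    # 1 where the normalized value exceeds that feature's mean, else 0
    return (X_norm > X_norm.mean(axis=0)).astype(np.uint8)

# Two toy PE-header features over three files
X = np.array([[10.0, 200.0],
              [20.0, 100.0],
              [40.0, 400.0]])
B = binarize_features(X)
```

The resulting 0/1 vectors are what the binary-valued visible layer of the first RBM consumes.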
DBN
• A Deep Belief Network is obtained by stacking
several RBMs (Restricted Boltzmann Machines)
on top of each other.
• The hidden layer of the RBM at layer `i`
becomes the input of the RBM at layer `i+1`.
• When used for classification, the DBN is
treated as an MLP by adding a logistic
regression layer on top.
RBM

Fig. RBM

Fig. Forward phase

Fig. Backward phase


RBM Training
Contrastive Divergence-k (CD-k):
• Take a training sample v, compute the
probabilities of the hidden units, and sample a
hidden activation vector h from this
probability distribution.
• Compute the outer product of v and h and call
this the positive gradient.
• From h, sample a reconstruction v1 of the
visible units, then resample the hidden
activations h1 from this.
• Repeat the above step k times to obtain vk and
hk; the outer product of vk and hk is the
negative gradient.
• Update the weights by the learning rate times
(positive gradient − negative gradient).
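The CD-k steps above can be sketched for a single training sample. This is an illustrative NumPy version, not the project's code; the learning rate, bias names (`b` visible, `c` hidden), and seeded generator are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd_k(v0, W, b, c, k=1, lr=0.1):
    """One CD-k update for an RBM with weights W, visible bias b, hidden bias c."""
    # Positive phase: hidden probabilities and a sampled activation vector h
    ph0 = sigmoid(v0 @ W + c)
    h = (rng.random(ph0.shape) < ph0).astype(float)
    vk, hk_prob = v0, ph0
    for _ in range(k):
        # Reconstruct the visibles from h, then resample the hiddens
        pv = sigmoid(h @ W.T + b)
        vk = (rng.random(pv.shape) < pv).astype(float)
        hk_prob = sigmoid(vk @ W + c)
        h = (rng.random(hk_prob.shape) < hk_prob).astype(float)
    # Update: positive gradient (outer(v0, h0)) minus negative gradient (outer(vk, hk))
    W += lr * (np.outer(v0, ph0) - np.outer(vk, hk_prob))
    b += lr * (v0 - vk)
    c += lr * (ph0 - hk_prob)
    return W, b, c

# One update on a toy 4-visible / 3-hidden RBM
W = np.zeros((4, 3)); b = np.zeros(4); c = np.zeros(3)
v0 = np.array([1.0, 0.0, 1.0, 0.0])
W, b, c = cd_k(v0, W, b, c, k=1)
```

Using probabilities rather than samples for the final hidden state in the update is a common variance-reduction choice; either variant fits the description above.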
Training DBN
• The DBN is trained in a semi-supervised way,
in 2 phases:
1) Unsupervised training phase
2) Supervised training phase
Unsupervised Training
Algorithm:
1. Train the first layer as an RBM that models the raw input as its visible
layer.
2. Use that first layer to obtain a representation of the input that will be
used as data for the second layer.
3. Train the second layer as an RBM, taking the transformed data
(samples or mean activations) as training examples (for the visible layer of that RBM).
4. Iterate (2 and 3) for the desired number of layers, each time
propagating upward either samples or mean values.
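The greedy layer-wise procedure above can be sketched end to end. This is a compact illustration under assumed details (CD-1 inside each RBM, binary samples propagated upward, made-up layer sizes and hyperparameters), not the project's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
sig = lambda x: 1.0 / (1.0 + np.exp(-x))

def train_rbm(V, n_hidden, epochs=5, lr=0.1):
    """Train one RBM with CD-1 on binary rows V; return (W, c) for the upward pass."""
    n_visible = V.shape[1]
    W = rng.normal(0, 0.01, (n_visible, n_hidden))
    b, c = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        ph = sig(V @ W + c)                                   # hidden probabilities
        h = (rng.random(ph.shape) < ph).astype(float)         # sampled hiddens
        vk = (rng.random(V.shape) < sig(h @ W.T + b)).astype(float)  # reconstruction
        phk = sig(vk @ W + c)
        W += lr * (V.T @ ph - vk.T @ phk) / len(V)            # pos - neg gradient
        b += lr * (V - vk).mean(axis=0)
        c += lr * (ph - phk).mean(axis=0)
    return W, c

def pretrain_dbn(data, layer_sizes):
    """Steps 1-4: greedily train one RBM per layer on the layer below's output."""
    params, inp = [], data
    for n_hidden in layer_sizes:
        W, c = train_rbm(inp, n_hidden)
        params.append((W, c))
        # Propagate upward: here, binary samples of the hidden units
        inp = (rng.random((len(inp), n_hidden)) < sig(inp @ W + c)).astype(float)
    return params

# 16 toy files, 21 binarized features, two stacked RBMs of 16 and 8 hidden units
layers = pretrain_dbn(rng.integers(0, 2, (16, 21)).astype(float), [16, 8])
```

Each trained layer's weights are then reused as the corresponding MLP layer before supervised fine-tuning.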
Supervised Training
• Uses Logistic Regression on top of the DBN.
• The Logistic Regression model is trained in a
supervised way, using labelled virus and
legitimate files.
• Logistic regression is a probabilistic, linear
classifier parametrized by a weight
matrix W and a bias vector b.
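For the binary virus-vs-legitimate case, the top layer reduces to a single sigmoid unit. A minimal sketch, assuming hypothetical (untrained, made-up) parameter values and a 0.5 decision threshold:

```python
import numpy as np

def logistic_layer(features, W, b):
    """Probability that a file is a virus, given the top DBN layer's output.

    features: (n_hidden,) activation vector from the top RBM
    W: (n_hidden,) weight vector, b: scalar bias (binary-class case)
    """
    return 1.0 / (1.0 + np.exp(-(features @ W + b)))

# Hypothetical trained parameters for a 3-unit top layer
W = np.array([2.0, -1.0, 0.5])
b = -0.25
p = logistic_layer(np.array([1.0, 0.0, 1.0]), W, b)
label = "virus" if p >= 0.5 else "legitimate"
```

During fine-tuning, the cross-entropy gradient of this layer is backpropagated through the pretrained RBM layers, which is what turns the DBN into an ordinary MLP classifier.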
Fine Tuning Parameters

• Number of hidden layers


• Number of processing units per hidden layer
• Learning rate
PERFORMANCE EVALUATION

03/06/17 CS-152 23
SNAPSHOTS
RESULTS
• Feature Extractor capable of extracting
relevant features from dataset and input
file.
• DBN capable of classifying a given PE
structure file as virus or legitimate with an
accuracy of 94.5%.
CONCLUSION
