
MALWARE DETECTION

Contents

1 Overview: malware
2 Anomaly
3 Simple examples
4 Types of intrusion detection systems
5 Limitation and challenges
6 Applications of Anomaly Detection
7 Botnet architecture
8 Detection Techniques
9 Literature Review
10 Problem statement
11 Objectives
12 Proposed Model
Research scope

[Figure: research scope, covering network security, malware, data mining techniques, and cloud computing]

Overview: malware

Malware is any software intentionally designed to cause damage to a computer, server, client,
or computer network. It includes computer viruses, worms, Trojan horses, spyware, adware,
botnets, and others.

In the present world, huge amounts of data are stored and transferred from one location to another.
Data is exposed to attack whenever it is transferred or stored. Although various techniques and
applications are available to protect data, loopholes exist.

Anomaly
• Generally, anomalies are anything noticeably different from the expected.
• An anomaly is a pattern in the data that does not conform to the expected
behavior.
• These unexplained phenomena could be outliers, anomalies, cyber-attacks,
novelties, exceptions, deviations, surprises, or noise, where outliers are the
data points that are considered out of the ordinary.
• Outliers can be detected using outlier detection methods. Anomalies are a
special kind of outlier that carries actionable, potentially meaningful
information; anomaly detection methods are used to detect them. Similarly,
fault detection is used to detect noise, which is unwanted or wrong data that
has to be removed. Cyber-attacks, on the other hand, are more sophisticated:
they can be hidden among the data points and are hard to detect.
Simple examples
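As one simple illustration of outlier detection, the sketch below flags values whose z-score (distance from the mean in standard deviations) exceeds a threshold. The z-score method, the threshold of 2.0, and the traffic readings are all assumptions made for the example; they are not taken from the slides.

```python
from statistics import mean, stdev


def find_outliers(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical packets-per-second readings with one obvious spike.
readings = [52, 48, 50, 51, 49, 47, 53, 50, 400, 51]
print(find_outliers(readings))
```

A z-score test assumes roughly bell-shaped data and is itself distorted by extreme values, so it is only a starting point for real traffic.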

There are two types of intrusion detection systems:
• An anomaly-based intrusion detection system regularly
monitors events and compares them with a statistical
model of normal behavior.

• The signature-based detection method used by
intrusion prevention systems relies on a dictionary of
uniquely identifiable signatures located in the code of
each exploit.
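The signature dictionary idea can be sketched as a lookup of known byte patterns inside a file's contents. The signature names and byte patterns below are invented for illustration; real products use far larger databases and more robust matching.

```python
# Hypothetical signature dictionary: name -> byte pattern unique to an exploit.
SIGNATURES = {
    "demo-dropper": b"\xde\xad\xbe\xef\x13\x37",
    "demo-keylogger": b"HOOK_KEYBOARD_V1",
}


def scan(data: bytes, signatures=SIGNATURES):
    """Return the sorted names of all signatures found in the data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


clean_file = b"just an ordinary document"
infected_file = b"\x00\x01header" + b"\xde\xad\xbe\xef\x13\x37" + b"payload"

print(scan(clean_file))     # no signature matches
print(scan(infected_file))  # the dropper pattern is present
```

A scanner of this kind only recognizes patterns already in its dictionary, which is why compressed or encrypted samples can evade it.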

Limitation and challenges

• Defining a representative normal region is challenging
• The boundary between normal and outlying behavior is often
not precise
• Data might contain noise
• If the program is compressed or encrypted, it is difficult for
the anti-virus to decrypt or decompress it
• The malicious program may contain code that is difficult for
the anti-virus to decipher
• The attacker may add tools that make the anti-virus's task more
difficult
• According to recent research, 70% to 90% of the world's spam
email traffic is caused by botnets

Applications of Anomaly Detection

• Network intrusion detection

• Insurance / Credit card fraud detection

• Healthcare Informatics / Medical diagnostics

• Industrial Damage Detection

• Image Processing / Video surveillance


Botnet architecture

• Centralized (HTTP, IRC)
• Decentralized (P2P)
• Hybrid (centralized & decentralized)

[Figure: botnet topologies showing the bot master, the C&C server, and bots 1 to n]

Malware analysis

Malware analysis answers the following questions:
1. Is the file malware?
2. How does the malware modify the system?
3. How does the malware communicate, and why?
4. How can the malware be detected or removed?

Malware analysis takes two forms:
• Static analysis: without execution
• Dynamic analysis: with execution
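To make "without execution" concrete, the following sketch inspects a sample purely as bytes: it computes a SHA-256 fingerprint and extracts printable strings, two common first steps in static analysis. The function name, the minimum string length, and the sample bytes are assumptions for illustration.

```python
import hashlib
import re


def static_report(data: bytes, min_len: int = 4):
    """Inspect raw bytes without ever executing them."""
    # Runs of printable ASCII characters of at least min_len bytes.
    strings = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "strings": [s.decode("ascii") for s in strings],
    }


# Hypothetical sample: binary noise around two readable strings.
sample = b"\x00\x01\x02connect http://example.invalid/c2\x00\xff\xfeab\x00cmd.exe\x00"
report = static_report(sample)
print(report["sha256"])
print(report["strings"])
```

Dynamic analysis, by contrast, would run the sample in an isolated sandbox and record its behavior, which this sketch does not attempt.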

DoS and DDoS

[Figure: DoS and DDoS attacks]

Detection Techniques

Detection based on data mining techniques:

Mining techniques are used to discover botnets based on:

• Detection based on signature
• Detection based on behavior

[Figure: signature-based threat detection. The system holds the known signatures A, B and C,
which are malicious; D is not a known signature. The system looks for threats by checking
whether a signature exists among the known signatures; if it does, the software is malicious.
New signatures are added to the system.]
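The known-signature check in the figure can be sketched as a set lookup. Using a SHA-256 hash of the sample as its signature is an assumption made here for simplicity; the class and method names are hypothetical, and the samples A to D are stand-ins for real files.

```python
import hashlib


class SignatureDatabase:
    """Minimal known-signature store (illustrative only)."""

    def __init__(self):
        self._known = set()

    @staticmethod
    def signature_of(sample: bytes) -> str:
        # Assumption: the signature is simply the SHA-256 of the sample.
        return hashlib.sha256(sample).hexdigest()

    def add_signature(self, sample: bytes) -> None:
        self._known.add(self.signature_of(sample))

    def is_malicious(self, sample: bytes) -> bool:
        return self.signature_of(sample) in self._known


db = SignatureDatabase()
for known_sample in (b"A", b"B", b"C"):
    db.add_signature(known_sample)

print(db.is_malicious(b"A"))  # A is a known signature
print(db.is_malicious(b"D"))  # D is not known, so it passes unnoticed
db.add_signature(b"D")        # once D is identified as a threat, add it
print(db.is_malicious(b"D"))
```

The middle check shows the main weakness of this approach: a new threat goes undetected until its signature has been added.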
Literature Review

No. 1
Title: Intelligent Windows Malware Type Detection based on Multiple Sources of Dynamic Characteristics
Year: 2019
Summary: Thilo Denzer et al. [13] suggest an approach for reliable multifamily malware
classification using dynamic characteristics from the community-accepted Cuckoo Sandbox.
The goal of the research is to explore a way to improve such classification by exploiting
the available dynamic characteristics. Instead of a binary classification into malicious
and benign, malware is classified into a sub-group based on its functionality and targeted
activities. The research applies popular machine learning algorithms to malware
classification, namely Naive Bayes, Support Vector Machine (SVM), Artificial Neural
Network (ANN), k-Nearest Neighbors (KNN), J48, Logistic Regression (LR), and Random
Forest (RF). The best classification result, achieved with Random Forest, was 87.5% for
10 malware families.

No. 2
Title: The Use of Machine Learning Techniques to Advance the Detection and Classification of Unknown Malware
Year: 2020
Summary: Suggests an effective way to detect unknown malware attacks based on machine
learning algorithms. The work uses a Random Forest feature selection algorithm to reduce
the number of features and cross-validation for data splitting, which yields sensible
performance improvements. Several machine learning algorithms were applied to a benchmark
dataset in the experiments on the proposed model. The results showed accuracy improvements
over all binary and multi-class classifiers. The highest accuracy was 98.2% for binary
classification, achieved by Decision Tree, and 95.8% for multi-class classification,
achieved by Random Forest. The lowest accuracy was achieved by Bernoulli Naïve Bayes, with
91% for binary classification and 81.8% for multi-class classification.

No. 3
Title: Machine Learning-Based IoT-Botnet Attack Detection with Sequential Architecture
Year: 2020
Summary: Uses three types of machine learning (ML) to detect botnet attacks on IoT
devices. An efficient feature selection approach is adopted to improve the accuracy of the
machine learning. The overall detection performance reaches around 99% for botnet attack
detection using three different ML algorithms: artificial neural network (ANN), J48
decision tree, and Naïve Bayes. The experimental results indicate that the proposed
architecture can effectively detect botnet-based attacks and can be extended with
corresponding sub-engines for new kinds of attacks.

No. 4
Title: Multilayer Framework for Botnet Detection Using Machine Learning Algorithms
Year: 2021
Summary: Proposes an approach for botnet detection built from two modules. The first
module performs clustering using K-means in WEKA, while the second module uses three
classifiers: k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Multilayer
Perceptron (MLP). The accuracy of both layers is more than 90%, and the false-negative
rate is less than 2.5%. This approach is quite popular, as these techniques presume that
botnet traffic will be behaviorally different from normal traffic.
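The two figures quoted in these studies, accuracy and false-negative rate, follow directly from the confusion-matrix counts. The sketch below uses the standard definitions, accuracy = (TP + TN) / total and FNR = FN / (FN + TP); the labels are made-up data, with 1 meaning botnet and 0 meaning normal traffic.

```python
def detection_metrics(y_true, y_pred):
    """Compute accuracy and false-negative rate for binary labels (1 = botnet)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # FNR: share of real botnet flows that the detector missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return {"accuracy": accuracy, "false_negative_rate": fnr}


# Hypothetical ground truth and predictions for ten flows.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(detection_metrics(y_true, y_pred))
```

For botnet detection the false-negative rate often matters more than raw accuracy, because a missed bot keeps operating inside the network.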

No. 5
Title: Attribution Classification Method of APT Malware in IoT Using Machine Learning Techniques
Year: 2021
Summary: Shudong Li et al. [25] offer a machine-learning-based method for classifying the
attribution organizations of advanced persistent threat (APT) malware. Starting from
malicious software samples used in APT attacks, the method first analyzes the samples
dynamically, preprocesses the acquired behavior data, and constructs a behavioral data set
of malware samples. It then uses the TF-IDF (term frequency-inverse document frequency)
method to represent the features as a vector matrix. The research proposes an organization
classification model that combines the Synthetic Minority Oversampling Technique (SMOTE)
with Random Forest (RF) to identify APT attacks effectively. The results show that the
feature extraction method can achieve more than 80% accuracy in general models, and that
the SMOTE-RF model performs well and has stable performance in the
classification of APT malware.
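The TF-IDF step can be illustrated with a small pure-Python sketch that turns behavior traces into a vector matrix. It uses the plain variant, tf = count / trace length and idf = log(N / document frequency); the cited paper may use a different weighting, and the API call names in the traces are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Turn token lists into a TF-IDF matrix; returns (vocabulary, matrix)."""
    vocab = sorted({tok for doc in documents for tok in doc})
    n_docs = len(documents)
    # Number of documents in which each token appears at least once.
    doc_freq = Counter(tok for doc in documents for tok in set(doc))
    matrix = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty traces
        matrix.append([
            (counts[tok] / total) * math.log(n_docs / doc_freq[tok])
            for tok in vocab
        ])
    return vocab, matrix


# Hypothetical behavior traces: one list of API call names per sample.
traces = [
    ["CreateFile", "WriteFile", "CreateFile"],
    ["CreateFile", "RegSetValue"],
    ["CreateFile", "Connect", "Send"],
]
vocab, matrix = tf_idf(traces)
print(vocab)
for row in matrix:
    print([round(x, 3) for x in row])
```

Calls that appear in every trace, such as `CreateFile` here, get weight zero, so the vectors emphasize the behavior that distinguishes one sample from another.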

Problem statement

1- The constant threat to data from attackers, especially when
using the cloud
2- The difficulty of detecting malicious botnets as a result of their
rapid spread and the emergence of new botnets

Objectives
Malicious programs such as botnets or worms are considered the biggest threat
to cyber security. The main objective of this research is to detect the activities
of malicious programs by using one of the data mining techniques, after
analyzing the botnet and providing the information needed to respond to the
network penetration.
• Data mining techniques are very useful for extracting unexpected network
patterns.
• Design a new mechanism for detecting malware using data mining
techniques.
• Evaluate the performance of the proposed technique.
• Verify and validate the proposed technique based on the results
obtained.
Proposed Model
Our proposed model constructs a botnet malware detection model
using a hybrid approach that combines more than one data mining
technique to achieve network security in a cloud environment.
It is divided into a preprocessing part and a detection part, as shown in the figure.
In our proposal, classifiers based on different data mining algorithms
may be applied to detect different attacks, which leads to better
detection performance, shorter processing times, and a better
implementation.

[Figure: flowchart of the proposed model]
start → data set → preprocessing → training data set → detection mechanism → detect the botnet?
• yes: produce an alert and store the botnet information in the database
• no: end
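The flow in the figure can be sketched end to end as follows. This is only an illustration under stated assumptions: the nearest-centroid classifier is a simple stand-in for the detection mechanism (not the hybrid model the proposal describes), and the record layout, feature choice, and function names are all hypothetical.

```python
from statistics import mean


def preprocess(records):
    """Drop incomplete records and min-max scale each feature to [0, 1]."""
    complete = [r for r in records if None not in r["features"]]
    columns = list(zip(*(r["features"] for r in complete)))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return [
        {**r, "features": [(v - lo) / s for v, lo, s in zip(r["features"], lows, spans)]}
        for r in complete
    ]


def train(training_set):
    """Stand-in model: one mean feature vector (centroid) per label."""
    model = {}
    for label in {r["label"] for r in training_set}:
        rows = [r["features"] for r in training_set if r["label"] == label]
        model[label] = [mean(col) for col in zip(*rows)]
    return model


def detect(model, features):
    """Return the label whose centroid is closest to the features."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist2(model[label]))


def run_pipeline(records):
    data = preprocess(records)
    model = train([r for r in data if r["label"] is not None])
    alerts, database = [], []
    for r in data:
        if r["label"] is None and detect(model, r["features"]) == "botnet":
            alerts.append(f"ALERT: botnet-like traffic from {r['id']}")
            database.append({"id": r["id"], "features": r["features"]})
    return alerts, database


# Hypothetical flows: [packets per second, distinct destination ports].
records = [
    {"id": "h1", "features": [10, 2], "label": "normal"},
    {"id": "h2", "features": [12, 3], "label": "normal"},
    {"id": "h3", "features": [900, 60], "label": "botnet"},
    {"id": "h4", "features": [950, 80], "label": "botnet"},
    {"id": "h5", "features": [11, None], "label": "normal"},  # incomplete, dropped
    {"id": "h6", "features": [870, 70], "label": None},       # unlabeled
    {"id": "h7", "features": [15, 2], "label": None},         # unlabeled
]
alerts, database = run_pipeline(records)
print(alerts)
```

In a hybrid design, `detect` could be replaced by a combination of several classifiers, for example a vote among them, without changing the rest of the flow.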
Timeline

First six months (2021-2022):
1. Literature review
2. Write paper for publication
3. Collect data
4. Selection of assistive programs
5. Determine the API (Python, MATLAB)

Second six months (2021-2022):
1. Pre-process data
2. Apply data mining algorithms
3. Find results

Third six months (2022-2023):
1. Test results
2. Improve and compare results

Fourth six months (2022-2023):
1. Write two papers for publication
2. Write thesis

References
• Yuval Sinay. Common malware evasion techniques. http://blogs.microsoft.co.il/yuval14/2017/06/20/common-malware-evasion-techniques/, 2017. Accessed: 2019-01-11.
• R. Tian, L. Batten, R. Islam, and S. Versteeg. An automated classification system based on the strings of trojan and virus families. In 2009 4th International Conference on Malicious and Unwanted Software (MALWARE), pages 23–30, Oct 2009.
• Aijaz, U.N., Patra, A., Siddiq, A.S., Chatterjee, B., Ghiyas Khan, M., 2018. Malware Detection on Server using Distributed Machine Learning. Proceedings of Knowledge Discovery in Information Technology and Communication Engineering (KITE-2018) 2, 172–175. URL: http://www.pices-journal.com/downloads/V2I7-PICES0046.pdf.
• Amos, B., Turner, H., White, J., 2013. Applying machine learning classifiers to dynamic android malware detection at scale. 2013 9th International Wireless Communications and Mobile Computing Conference, IWCMC 2013, 1666–1671. doi:10.1109/IWCMC.2013.6583806.
• Baychev, Y., Bilge, L., 2018. Spearphishing malware: Do we really know the unknown?, in: International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, Springer. pp. 46–66.
• Chumachenko, K., et al., 2017. Machine learning methods for malware detection and classification.
• Firdausi, I., Lim, C., Erwin, A., Nugroho, A.S., 2010. Analysis of machine learning techniques used in behavior-based malware detection. Proceedings - 2010 2nd International Conference on
• Green, J.P., Chandnani, A.D., Christensen, S.D., 2019. Detecting script-based malware using emulation and heuristics. US Patent 10,387,647.
• Jung, J., Kim, H., Shin, D., Lee, M., Lee, H., Cho, S., Suh, K., 2018. Android malware detection based on useful api calls and machine learning, in: 2018 IEEE First International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), pp. 175–178. doi:10.1109/AIKE.2018.00041
• Cisco. Cisco Visual Networking Index (VNI) Global Mobile Data Traffic Forecast Update, 2017–2022 White Paper; Cisco Systems Inc.: San Jose, CA, USA, 2019.
• Symantec Internet Security Threat Report 2019. Volume 24. Available online: https://docs.broadcom.com/doc/istr-24-2019-en (accessed on 2 January 2020).
• Kuzin, M.; Shmelev, Y.; Kuskov, V. New Trends in the World of IoT Threats. Securelist, Kaspersky Lab, 2018. Available online: https://securelist.com/new-trends-in-the-world-of-iot-threats/87991/ (accessed on 2 January 2020).
• Shafiq, M.; Tian, Z.; Sun, Y.; Du, X.; Guizani, M. Selection of effective machine learning algorithm and Bot-IoT attacks traffic identification for internet of things in smart city. Futur. Gener. Comput. Syst. 2020, 107, 433–442.
• P. Wainwright and H. Kettani, ‘‘An analysis of botnet models,’’ in Proc. 3rd Int. Conf. Compute Data Anal., New York, NY, USA, Mar. 2019, pp. 116–121
• K. M. Prasad, A. R. M. Reddy, and K. V. Rao, ‘‘BARTD: Bio-inspired anomaly based real time detection of under rated app-DDoS attack on Web,’’ J. King Saud Univ.-Comput. Inf. Sci., vol. 32, no. 1, pp. 73–87, Jan. 2020
• M. Pawlicki, M. Choraś, and R. Kozik, ‘‘Defending network intrusion detection systems against adversarial evasion attacks,’’ Future Gener. Comput. Syst., vol. 110, pp. 148–15
• Doffman, Z. Cyberattacks On IOT Devices Surge 300% In 2019, ‘Measured in Billions’. Available online: https://www.forbes.com/sites/zakdoffman/2019/09/14/dangerous-cyberattacks-on-iot-devices-up-300-in-2019-now-rampant-report-claims/?sh=24e245575892 (accessed on 10 November 2020)
• Furbush, J. Machine Learning: A Quick and Simple Definition. Available online: https://www.oreilly.com/content/machine-learning-a-quick-and-simple-definition/ (accessed on 10 November 2020).
• Jmj, A. 5 Industries That Heavily Rely on Artificial Intelligence and Machine Learning. Available online: https://medium.com/datadriveninvestor/5-industries-that-heavily-rely-on-artificial-intelligence-and-machine-learning-53610b6c1525 (accessed on 10 November 2020)
