CHAPTER 1
INTRODUCTION
CHAPTER 2
CHAPTER 3
Cyber terrorism can be discussed here in relation to the spoofing of confidential information.
Such spoofing can follow a security breach that gives an unauthorized user access. Malicious
software, such as Trojan horses and viruses, causes security violations that can lead to
antisocial activity in the world of cyber crime. Cyber security also includes applications
that analyze data in order to audit computer applications. We can build a data warehouse that
contains the data to be audited and then use existing data mining tools to check whether
potential anomalies are present. Data mining techniques can help restrict confidential
information to legitimate users and stop unauthorized access. Data mining can be used
effectively to detect and prevent cyber attacks, but it can also aggravate security issues
such as privacy and inference. The security model is shown in the figure below.
This includes servers, web clients, operating systems, networks, and databases. Most cyber
attacks and acts of cyber terrorism result from malicious intrusion, in which someone without
any authorization attacks a secure network to obtain confidential information. The intruder
may be a human or malicious automated software, such as a bot, created by a human. When
discussing cyber attacks or malicious intrusions, it is often useful to draw analogies with
the non-cyber world that are relevant to cyber terrorism and to apply them to computer
systems and networks. Cyber terrorism is increasing day by day worldwide, as shown in the
figure below.
CHAPTER 4
Data mining has many applications in security including in national security (e.g.,
surveillance) as well as in cyber security (e.g., virus detection). The threats to national
security include attacking buildings and destroying critical infrastructures such as power
grids and telecommunication systems. Data mining techniques are being used to identify
suspicious individuals and groups, and to discover which individuals and groups are
capable of carrying out terrorist activities. Cyber security is concerned with protecting
computer and network systems from corruption due to malicious software including
Trojan horses and viruses. Data mining is also being applied to provide solutions such as
intrusion detection and auditing. In this paper we will focus mainly on data mining for
cyber security applications. Data mining is one of the four detection methods used today
for detecting malware. The other three are scanning, activity monitoring, and integrity
checking. When building a security app, developers use data mining methods to improve
the speed and quality of malware detection as well as to increase the number of detected
zero-day attacks.
There are five strategies for detecting malware:
Anomaly detection
Misuse detection
Hybrid detection
Text classification technique
Cluster based technique
The term `misuse' is herein defined in a broad sense as the use or behavior of a network
environment in any way that is not consistent with the system's expected functionality, as
perceived by the provider of the network service. Misuse detection is also sometimes
referred to as signature-based detection because alarms are generated based on specific
attack signatures. This work focuses on the detection of such misuse events. The misuse is
often unauthorized access to the system or use of the system in an unauthorized way. A
protection mechanism that detects such misuse is called an Intrusion Detection System (IDS).
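As a minimal sketch of signature-based (misuse) detection, the example below matches log lines
against a small table of attack signatures and raises an alarm when one matches. The signature
names, the regular expressions, and the function match_signatures are hypothetical and chosen
only for illustration; they do not come from any real IDS rule set.

```python
import re

# Hypothetical signatures: name -> regular expression matched against a log line.
SIGNATURES = {
    "sql_injection": re.compile(r"(?i)union\s+select|or\s+1=1"),
    "path_traversal": re.compile(r"\.\./"),
}


def match_signatures(line):
    """Return the sorted names of all signatures that match the given line."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern.search(line))


if __name__ == "__main__":
    for entry in ["GET /home", "GET /../../etc/passwd"]:
        hits = match_signatures(entry)
        print(entry, "->", hits if hits else "no alarm")
```

An alarm is generated only for lines that match a known signature, which is why this approach
cannot by itself flag attacks that have no signature yet.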
The H-IDS designed in this paper is based on an original approach in which the outputs of
an anomaly-based detector and a signature-based detector are collected. The parameters
of the detectors are controlled by a centralized node, referred to as the hybrid
detection engine (HDE). The design goal of this intrusion detection system is to improve the
overall performance of DDoS attack detection by shortening the detection delay while
increasing the detection accuracy.
The block diagram of the proposed H-IDS is shown in the figure. As the figure shows, the
observed data, containing normal traffic and DDoS attacks, is processed to extract features;
the processed data is then passed to the signature-based and anomaly-based detectors.
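The idea of combining the two detectors can be sketched as follows. This is an illustration
under stated assumptions, not the H-IDS design itself: the anomaly detector is assumed to be a
z-score on the packet rate, the signature detector an exact match on a feature tuple, and the
combination rule a simple logical OR. All function names and the default z_threshold are
hypothetical.

```python
from statistics import mean, pstdev


def anomaly_score(rate, baseline):
    """Z-score of the observed packet rate against a baseline of normal rates."""
    sigma = pstdev(baseline)
    return 0.0 if sigma == 0 else (rate - mean(baseline)) / sigma


def signature_hit(features, known_bad):
    """True if the observed feature tuple equals a known attack signature."""
    return features in known_bad


def hybrid_decision(rate, features, baseline, known_bad, z_threshold=3.0):
    """Assumed fusion rule: alert if either detector fires."""
    return signature_hit(features, known_bad) or anomaly_score(rate, baseline) > z_threshold


if __name__ == "__main__":
    baseline = [100, 110, 90, 105, 95]   # packets per second under normal traffic
    known_bad = {("SYN", "flood")}       # hypothetical signature set
    print(hybrid_decision(500, ("ACK", "normal"), baseline, known_bad))
    print(hybrid_decision(102, ("ACK", "normal"), baseline, known_bad))
```

In this sketch the threshold and the signature set play the role of the detector parameters
that a central node would tune.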
To understand the topology of cyber terrorist networks and discover how they operate, their
sub-committees or cells should first be identified using cyber community detection
approaches. This would help investigators extract valuable knowledge about the structures
and strategies of cyber terrorist groups from a vast amount of gathered data. Efficient use
of organizational data contributes considerably to developing the network map that describes
the structure of a cyber terrorist group and to understanding the roles of individuals
within it. In addition, detecting certain actors (nodes) in every subgroup of the group makes
the clustering process possible. This process groups a set of objects that share
characteristics and meet the same criteria into groups called clusters.
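As a simple illustration of grouping actors (nodes) into clusters, the sketch below treats
observed communications as edges of an undirected graph and returns its connected components.
Connected components are used here only as a stand-in for a real community detection
algorithm; the function name find_clusters and the sample edges are hypothetical.

```python
from collections import defaultdict


def find_clusters(edges):
    """Group nodes into connected components of an undirected graph."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    seen, clusters = set(), []
    for node in sorted(graph):
        if node in seen:
            continue
        # Depth-first traversal collects every node reachable from `node`.
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(graph[current] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


if __name__ == "__main__":
    links = [("a", "b"), ("b", "c"), ("x", "y")]
    print(find_clusters(links))
```

Each returned list is one candidate subgroup; nodes that never communicate with each other,
directly or indirectly, end up in different clusters.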
CHAPTER 5
5.1 IMPLEMENTATION
In this research paper, I present the use of data mining techniques to identify
cyber-attacks. My focus is on finding patterns in a log file (a record of the events that
occur in the system), which shows the sequence of events. To start with, I use the clustering
technique to discover one type of cyber-crime, the denial of service (DoS) attack.
Clustering groups data that has similar features, so it helps to discover similar patterns
that occur repeatedly in the log file.
Step 1: Evaluate the log file.
Step 2: Mine the date with time.
Step 3: Scan the data.
Step 4: Add the found data to the main file.
When the above procedure is carried out, we record data that contains both normal patterns
and abnormal (malicious) patterns. Using the clustering technique, we identify the data that
occurs repeatedly [9].
System Configuration: To run the obtained data, we use Windows Server to maintain the
database. Initially we run the data that contains zero attacks and add it to the master file,
or log file. An ICMP (Internet Control Message Protocol) attack makes the system unresponsive
by sending a very large number of “ping” requests. The data that contains normal activities
and the data that contains attacks are then passed through the proposed technique. If the
observations in the log file show normal behavior, they are ignored. If the observations show
multiple requests for the same transaction, the data is passed to the “Apriori” algorithm and
appears in the attack logs. Before treating a pattern as an attack, the algorithm checks
whether similar patterns of requests exist in the normal records. If the algorithm finds the
pattern, or finds that the number of requests for the same transaction exceeds the threshold
value, the pattern is considered an attack and a message is sent to the administrator about
the suspected attack.
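The log-scanning procedure described above can be sketched roughly as follows. The log format
(date, time, source address, request, separated by spaces), the function names, and the sample
entries are assumptions made for illustration; the actual log layout used in this work is not
specified here.

```python
from collections import Counter


def parse_line(line):
    """Split a hypothetical log line 'DATE TIME SOURCE REQUEST' into its fields."""
    date, time, source, request = line.split(maxsplit=3)
    return date, time, source, request


def find_suspected_attacks(lines, threshold):
    """Return (source, request) pairs seen more than `threshold` times."""
    counts = Counter()
    for line in lines:
        _, _, source, request = parse_line(line)
        counts[(source, request)] += 1
    return {key: n for key, n in counts.items() if n > threshold}


if __name__ == "__main__":
    log = ["2024-01-01 10:00:0%d 10.0.0.9 PING" % i for i in range(6)]
    log.append("2024-01-01 10:00:07 10.0.0.2 GET /index")
    for (source, request), n in find_suspected_attacks(log, threshold=5).items():
        print("notify administrator:", source, request, "seen", n, "times")
```

Entries that stay at or below the threshold are treated as normal behavior and ignored, which
mirrors the rule that only repeated requests for the same transaction are reported.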
As the figure above shows, the DoS attack is launched by an anonymous user (intruder) who
first gains access to the system (server) by posing as an authenticated user. In this denial
of service attack, the attacker gains access through vulnerabilities present in the system,
copies a message sent by an authenticated user, and sends multiple copies of the same request
or query to the server. The server therefore processes the same request many times and is
kept busy. This is called a denial of service attack. Another example is the “ping” attack,
in which one or more users send many ping requests and the server is again overloaded with
processing them. This type of attack is severe. We apply data mining techniques to identify
these types of attacks by finding similar patterns of requests from users. In our approach,
we define a minimum support threshold of 5. If the server receives the same request more
times than the threshold value, the request is treated as an attack and the administrator is
notified. The threshold value can be adjusted to suit the working environment.
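For readers unfamiliar with Apriori, the following is a compact sketch of the standard
algorithm applied to request records, using the minimum support of 5 mentioned above. It is
not the implementation used in this work. Each transaction is assumed to be a set of items
such as a source address and a request type, and the sketch follows the usual Apriori
convention that an itemset is frequent when its count is at least the minimum support.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: keep candidates whose every (k-1)-subset is frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


if __name__ == "__main__":
    records = [{"10.0.0.9", "PING"}] * 5 + [{"10.0.0.2", "GET"}]
    for itemset, support in apriori(records, min_support=5).items():
        print(sorted(itemset), support)
```

A frequent itemset that pairs one source with one request type corresponds to the repeated
pattern that would be reported to the administrator.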
Procedures:
Step 1: Start
5.2.2 DISADVANTAGES
While this list of the benefits is impressive, there are also certain drawbacks
you need to know about:
Data mining is complex, resource-intensive, and expensive
Building an appropriate classifier may be a challenge
Potentially malicious files need to be inspected manually
Classifiers need to be constantly updated to include samples of new malware
There are certain data mining security issues, including the risk of unauthorized
disclosure of sensitive information
Data mining helps you quickly analyze huge datasets and automatically discover hidden
patterns, which is crucial when it comes to creating an effective anti-malware solution that’s
CHAPTER 6
CONCLUSIONS
As future work, this study proposes continuing the work on mining security risks in massive
datasets. For instance, this work examined only the risks of deep-link hijacking. Since deep
links could be used to attack mobile browsers or applications, this paper suggests working
on the detection of vulnerable communication applications and malicious deep links on the
web. Additionally, the methods carried out here need to be refined to make them more
applicable to diverse real-world applications, for instance by updating the existing
data-leak detection scheme and manifests so that the procedure can handle streaming network
traffic.
It can be concluded that this research has developed a proof of concept of a methodology for
detecting documents that contain information related to cyber terrorism, using text
classification techniques on English textual documents. In addition, applying the feature
selection method known as the Best First algorithm avoids expensive computation and cuts
execution time without decreasing the performance of the classifiers, and it can even
improve their accuracy.
Lastly, a comparison of the results of each classifier shows that the Support Vector Machine
algorithm performs best, achieving 100% accuracy with a term-frequency representation and
feature selection. This result demonstrates the capability of the Support Vector Machine in
a high-dimensional input space. Future work related to this research is the use of TF-IDF
(Term Frequency-Inverse Document Frequency) as the vector representation.
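For reference, a TF-IDF representation can be computed as in the sketch below. It assumes
whitespace tokenization, term frequency normalized by document length, and the weighting
tf * log(N / df); other TF-IDF variants exist, and the function name tf_idf and the sample
documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({term: (count / len(tokens)) * math.log(n_docs / df[term])
                        for term, count in tf.items()})
    return weights


if __name__ == "__main__":
    docs = ["attack on server", "server maintenance log"]
    for vector in tf_idf(docs):
        print(vector)
```

Terms that occur in every document receive a weight of zero, so the representation emphasizes
the terms that distinguish one document from the others.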
CHAPTER 7
TEXT REFERENCES
[1]. Bhavani Thuraisingham, Latifur Khan, Mohammad M. Masud, and Kevin W. Hamlen. Data
Mining for Security Applications.
[2]. Rakesh Agrawal, Tomasz Imieliński, and Arun Swami. Mining association rules
between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD
International Conference on Management of Data.
[3]. Daniel Barbará and Sushil Jajodia, editors. Applications of Data Mining in
Computer Security. Kluwer Academic Publishers.
[4]. Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, and Jörg Sander. LOF:
identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD
International Conference on Management of Data.
[5]. Varun Chandola and Vipin Kumar. Summarization - compressing data into an
informative representation. In Fifth IEEE International Conference on Data Mining.
WEB REFERENCES
[6]. https://ieeexplore.ieee.org/document/5946881
[7]. https://sci-hub.se/.
[8]. https://www.apriorit.com/dev-blog/527-data-mining-cyber-security#:~:text=Data%20mining%20has%20great%20potential,known%20and%20zero%2Dday%20attacks.
[9]. https://www.cs.odu.edu/~mukka/cs795sum10dm/Lecturenotes/Day7/Barbara%20Jajodia%20Data%20Mining%20Book.pdf