Project Title

Project Proposal

Session: 2021-2025

Submitted by:
Aniqa Noor 2021-CS-626
Nimra Shahzadi 2021-CS-636
Iqra Asghar 2021-CS-702

Supervised by:
Dr. Irfan Yousuf

Department of Computer Science, New Campus


University of Engineering and Technology
Lahore, Pakistan
Contents
List of Figures
List of Tables
1 Proposal Synopsis
1.1 Abstract
1.2 Introduction
1.3 Problem Statement
1.4 Objectives
1.5 Features/Scope
1.6 Related Work
1.7 Proposed Methodology/System
1.8 Tools and Technologies
1.9 Team Members Individual Tasks/Work Division
1.10 Data Gathering Approach
1.11 Timeline/Gantt chart
2 References

List of Figures
1.1 Workflow of Intrusion Detection System
1.2 Sample Gantt chart

List of Tables
1.1 Related System Analysis
1.2 Work Division

Chapter 1

Proposal Synopsis

1.1 Abstract
In today's digital marketplace, consumers face a plethora of options when
shopping online, highlighting the importance of informed decision-making for
optimal value. To address this, we propose a dynamic price comparison website
utilizing web scraping technology. This platform empowers users to seamlessly
compare product prices across multiple e-commerce platforms, enabling
informed purchasing decisions. Our website features a user-friendly interface
with a simple search bar initiating real-time price retrieval from designated
retailers. The data is meticulously analyzed and presented clearly, allowing users
to identify the most competitive offers effortlessly. Additionally, the platform
integrates user reviews, product ratings, and specifications to provide a
comprehensive view of the product landscape. Privacy and data security are
prioritized, ensuring compliance with regulations and robust protection
measures. Ultimately, our price comparison website aims to empower consumers
with the insights needed to optimize their online shopping experience.

1.2 Introduction
In the rapidly evolving digital marketplace, consumers are increasingly
confronted with the challenge of sifting through a myriad of online retailers to
find the best deals and prices for products. Presently, various price comparison
tools and websites attempt to address this need, yet many lack comprehensive
coverage or user-friendly interfaces, leaving consumers frustrated and
dissatisfied. In response to these shortcomings, we propose the development of a
dynamic price comparison website powered by innovative web scraping
technology.

Unlike existing solutions that may offer limited coverage or present data in a
complex manner, our solution stands out for its comprehensive approach and
user-friendly interface. By leveraging web scraping technology, our platform
systematically gathers real-time pricing data from a wide range of e-commerce
websites, providing users with a holistic view of the market landscape. This
ensures that users have access to the most up-to-date and extensive pricing
information, enabling them to make informed purchasing decisions with
confidence.

The advantages of our price comparison website are manifold. Firstly, it offers
unparalleled convenience and efficiency by streamlining the process of
comparing prices across multiple retailers. With just a few clicks, users can access
comprehensive pricing information, saving time and effort. Additionally, our
platform enhances transparency in the online shopping experience, empowering
consumers with the knowledge needed to secure the best deals and optimize
their purchasing power. By prioritizing user experience and comprehensive
coverage, our solution revolutionizes the way consumers navigate the
complexities of the digital marketplace.

In conclusion, our dynamic price comparison website represents a significant
advancement in the realm of online shopping. By combining innovative web
scraping technology with a user-centric design, we offer consumers a powerful
tool to navigate the vast landscape of online retailers with ease and confidence.
From simplifying price comparisons to enhancing transparency and efficiency,
our platform empowers consumers to make informed and cost-effective
purchasing decisions, ultimately transforming their online shopping experience
for the better.

1.3 Problem Statement


The online shopping landscape lacks a centralized platform for efficiently
comparing prices across multiple e-commerce websites, leading to time-
consuming searches and frustration for users. Additionally, the absence of
essential features like product reviews exacerbates decision-making complexity.
This gap highlights the need for a comprehensive price comparison website that
simplifies price comparisons and integrates crucial features, addressing these
pain points and enhancing the overall shopping experience.

In essence, the problem can be summarized as follows:

• Lack of a centralized platform for comparing prices of products across
multiple e-commerce websites.
• Absence of additional features such as product reviews, ratings, and
specifications to aid in decision-making.
• Manual and time-consuming process for users to find the best deals,
leading to frustration and potential loss of value.

Therefore, the primary objective of this project is to develop a price comparison
website that uses web scraping to compare product prices across different
websites efficiently. The website will also incorporate product reviews, ratings,
and specifications to enrich the shopping experience and help users make
informed purchasing decisions.

1.4 Objectives
• Develop a user-friendly price comparison website capable of comparing
prices of specific products across multiple e-commerce websites.
• Implement web scraping algorithms to retrieve real-time pricing data from
designated online retailers.
• Design and integrate a streamlined search functionality, allowing users to
input the desired product and retrieve comprehensive price comparisons.
• Incorporate additional features such as product reviews, ratings, and
specifications to provide users with a holistic view of the product
landscape.
• Ensure the scalability and reliability of the website to accommodate a
growing database of products and retailers.
• Optimize the website for performance and speed to enhance user
experience and minimize loading times.
• Implement robust security measures to safeguard user information and
ensure compliance with data protection regulations.
• Conduct thorough testing and debugging to identify and resolve any
technical issues or discrepancies in price comparison accuracy.
• Gather feedback from users to continually improve and enhance the
functionality and usability of the website.
• Measure and evaluate the effectiveness of the website based on metrics
such as user engagement, satisfaction, and the frequency of successful
price comparisons.
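
The comparison step behind these objectives can be sketched as follows. This is
a minimal illustration under stated assumptions, not the project's
implementation: the Offer record, the parse_price and cheapest helpers, the
retailer names, and the price strings are all hypothetical, and the scraping
step itself is replaced by hard-coded strings.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Offer:
    """One scraped listing (hypothetical structure)."""
    retailer: str
    title: str
    price: float


def parse_price(raw: str) -> Optional[float]:
    """Extract a numeric price from text such as 'Rs. 1,299.00'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        return None
    return float(match.group(0).replace(",", ""))


def cheapest(offers: List[Offer]) -> Optional[Offer]:
    """Return the lowest-priced offer, or None if there are no offers."""
    return min(offers, key=lambda o: o.price, default=None)


# Hard-coded stand-ins for text a scraper would return (assumed values).
scraped = [
    ("ShopA", "USB-C Cable", "Rs. 1,299.00"),
    ("ShopB", "USB-C Cable", "PKR 1,150"),
    ("ShopC", "USB-C Cable", "Price on request"),
]

offers = []
for retailer, title, raw_price in scraped:
    price = parse_price(raw_price)
    if price is not None:  # skip listings without a usable price
        offers.append(Offer(retailer, title, price))

best = cheapest(offers)
print(best)
```

Keeping price parsing separate from the comparison means a listing with no
usable price is skipped rather than compared as zero.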

1.5 Scope
The following points outline the scope of the intrusion detection system:

1. Utilization of Deep Learning algorithms (CNNs and RNNs) for real-time
analysis of network traffic data.
2. Adaptive learning capability for detecting unknown attacks based on
learned patterns.
3. Integration with existing security infrastructure for comprehensive
protection against cyber threats.

1.6 Related Work


1. Liu and Lang [1] discuss various methods for monitoring network traffic and
detecting potential security threats: packet-based, flow-based, session-based,
and log-based detection. Packet-based IDSs lack context and are
resource-intensive. Flow-based IDSs struggle with heterogeneous data and may
miss complex attacks. Session-based IDSs face session variability and
scalability issues. Log-based IDSs deal with log format variability and high
volumes, requiring specialized expertise.
2. Alrawashdeh and Purdy [2] present a deep learning approach for anomaly
detection using a Restricted Boltzmann Machine (RBM) and a deep belief
network, achieving a 97.9% detection rate and a low false negative rate of
2.47% on the DARPA KDDCUP’99 dataset. The authors report that the architecture
outperforms previous methods in both detection speed and accuracy. Future work
includes applying the strategy to larger and more challenging datasets with
larger classes of attacks.
3. Denning [3] describes a real-time intrusion-detection expert system designed
to detect security violations by monitoring audit records for abnormal system
usage patterns. The system includes profiles for representing subject
behaviour towards objects using metrics and statistical models, along with
rules for acquiring knowledge from audit records and detecting anomalous
behaviour. It is designed to be independent of specific systems, applications,
vulnerabilities, or intrusion types, providing a general-purpose framework for
intrusion detection.
4. Zhao et al. [4] propose an intrusion detection method using a deep belief
network (DBN) and a probabilistic neural network (PNN) to address issues such
as redundant information and long training times. The method converts raw data
to low-dimensional data using the DBN, optimizes the number of hidden-layer
nodes with particle swarm optimization, and then classifies the data using the
PNN.
5. Ahmim et al. [5] propose an IDS that combines REP Tree, JRip, and Forest PA
classifiers, reporting better accuracy, detection rate, false alarm rate, and
time overhead than existing schemes. The system classifies network traffic as
Attack or Benign using features from the dataset together with the outputs of
the classifiers, and its effectiveness is demonstrated on the CICIDS2017
dataset.
6. Radford et al. [6] use LSTM RNNs for anomaly detection on the ISCX IDS
dataset, achieving AUC values between 0.39 and 0.84. A recurrent neural
network (RNN) learns a model of network communication sequences, enabling the
identification of outlier network traffic without relying on known malicious
behaviour patterns.
7. Pektaş and Acarman [7] propose a combination of convolutional and recurrent
neural networks for botnet detection, achieving high accuracy (99.3%) and
F-measure (99.1%) on datasets such as CTU-13 and ISOT, surpassing traditional
methods such as blacklists and signature matching. The approach analyses
botnet network communication flows and uses rich datasets for training and
testing, demonstrating the effectiveness of deep learning for identifying
evolving botnet variants.
8. Nguyen et al. [8] propose an IDS platform, IDS-CNN, based on convolutional
neural networks (CNNs) for detecting DoS attacks, achieving high accuracy (up
to 99.87%) compared with traditional machine learning techniques such as KNN,
SVM, and Naïve Bayes. The system aims to provide real-time protection against
malicious network traffic, addressing challenges such as detection complexity
and execution time.
9. Shone et al. [9] introduce a deep learning technique, the non-symmetric deep
autoencoder (NDAE), for intrusion detection, aiming to reduce human
interaction and improve detection accuracy. The proposed model, implemented in
GPU-enabled TensorFlow, shows promising results on the KDD Cup '99 and NSL-KDD
datasets, with improvements over existing approaches and strong potential for
modern NIDSs.
10. Farnaaz and Jabbar [10] present a model for intrusion detection using the
random forest classifier, aiming to address the complexities and challenges
of IDSs. The model, evaluated on the NSL-KDD dataset, achieves low false alarm
rates and high detection rates, indicating its effectiveness in detecting and
classifying attacks compared to traditional classifiers.
Table 1.1: Related System Analysis

Related System: Liu and Lang [1]
Weakness: Limited context and high false positives, resource intensiveness,
challenges with encrypted traffic, and difficulty in detecting sophisticated
attacks
Proposed Project Solution: Make packet-based systems smarter, speed up
packet-based and session-based systems, look inside encrypted messages, and
get smarter against attacks

Related System: Alrawashdeh and Purdy [2]
Weakness: Redundant information, high-dimensional data, local optima during
training, dimensionality reduction trade-offs, and challenges in designing
optimal network structures
Proposed Project Solution: Feature selection and deep learning, optimization
algorithms, hybrid dimensionality reduction, automated network structure
design, and hardware acceleration and cloud solutions

Related System: Dorothy E. Denning [3]
Weakness: Heavy reliance on predefined profiles and static models, sensitivity
to thresholds, and limited adaptability to evolving threats
Proposed Project Solution: Adopting dynamic profiling, integrating machine
learning, incorporating threat intelligence feeds, implementing continuous
monitoring, and fostering collaborative defense strategies

Related System: Zhao et al. [4]
Weakness: Computational intensity, interpretability challenges, data labelling
dependency, susceptibility to adversarial attacks, complex model tuning, and
scalability challenges
Proposed Project Solution: Computational intensity, interpretability, data
labelling dependency, vulnerability to adversarial attacks, and scalability

Related System: Ahmim et al. [5]
Weakness: Limited model diversity, complexity in maintenance, susceptibility
to training data bias, increased computational overhead, limited
generalization, reduced interpretability, and vulnerability to adversarial
attacks
Proposed Project Solution: Diverse classifiers, automated maintenance, bias
mitigation, computational optimization, adaptive updates, improved
transparency, and enhanced security

Related System: Radford et al. [6]
Weakness: Reliance on representative training data, susceptibility to
overfitting, interpretability challenges, computational complexity,
sensitivity to hyperparameters, and limitations in adapting to emerging
threats
Proposed Project Solution: Enhancing training data quality, implementing
regularization techniques, employing interpretability tools, optimizing model
architectures, leveraging automated hyperparameter tuning, and incorporating
continuous learning mechanisms

Related System: Pektaş and Acarman [7]
Weakness: Complexity of model architecture, challenges in optimization and
resource management, lack of generalization, interpretability hurdles,
training time and false positives/negatives, and vulnerability to adversarial
attacks
Proposed Project Solution: Simplify model architecture, balance class
distribution, utilize transfer learning, incorporate explainable AI
techniques, optimize and parallelize, address false positives with
post-processing methods, address adversarial attacks with robustness testing,
and collaborate among experts

Related System: Nguyen et al. [8]
Weakness: Accuracy for different attack types, large execution time,
optimization challenges, and dataset specificity
Proposed Project Solution: Accuracy improvement, execution time reduction,
optimization streamlining, and generalizability improvement

Related System: Shone et al. [9]
Weakness: Limited class performance, challenges in novel threats, issues in
scalability, and challenges in diverse network environments
Proposed Project Solution: Handling of smaller classes, generalization to
zero-day attacks, improved scalability and efficiency, and evaluation on
real-world data

Related System: Farnaaz and Jabbar [10]
Weakness: Limited feature selection, dataset specificity, and limited
evaluation metrics
Proposed Project Solution: Evaluate performance on diverse datasets, consider
additional evaluation metrics, and explore comprehensive evaluation frameworks

1.7 Proposed Methodology/System

Data Collection -> Data Pre-processing -> Model Training -> Model Evaluation -> Deployment (Optional)

Figure 1.1: Workflow of Intrusion Detection System

1. Data Collection:
• We will evaluate the model on a publicly available dataset and will not
capture traffic ourselves.
• Use the network configurations, traffic interactions, attack timings, and
metadata provided with the dataset.
• Store data in a suitable format for preprocessing.
2. Data Preprocessing:
• Remove noise and irrelevant information.
• Extract relevant features for model training.
3. Model Training:
• Use labelled datasets to train Deep Learning models.
• Use training algorithms to optimize model parameters.
4. Model Evaluation:
• Evaluate the performance of the trained models using
metrics such as accuracy, precision, recall, and F1-score.
• Use cross-validation to ensure the robustness of the models.
5. Deployment (Optional):
• Implement the trained models in the IDS.
• Analyse incoming network traffic in real-time.
• Classify traffic as normal or malicious based on learned
patterns.
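
The evaluation step (item 4) can be illustrated with a small, dependency-free
sketch. It computes accuracy, precision, recall, and F1-score from binary
labels and builds k-fold train/test index splits for cross-validation. The
label encoding (1 = malicious, 0 = normal), the function names binary_metrics
and k_fold_indices, and the toy labels are assumptions made for illustration;
in practice an established library would typically provide these utilities.

```python
from typing import Dict, List, Tuple


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for labels in {0, 1} (1 = malicious)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds; each fold is the test set once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        splits.append((train, test))
        start += size
    return splits


# Toy labels standing in for model output (made-up values).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)

splits = k_fold_indices(len(y_true), k=4)
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
```

The folds here are contiguous and unshuffled to keep the sketch short; shuffled
or stratified splits are usually preferable when attack classes are imbalanced.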

1.8 Tools and Technologies


The following tools and technologies will be used in designing this model:
1. IDEs and notebook environments:
• Jupyter Notebook
• PyCharm
• Visual Studio Code
• Google Colab
2. Languages:
• Python
• JavaScript
3. Techniques:
• Deep Learning models
4. Packages:
• PyTorch/TensorFlow
• React.js

1.9 Team Members Individual Tasks/Work Division


Table 1.2: Work Division

SR #   Member’s Name    Tasks
1.     Fatima Shahid    Model Evaluation, Model Training, Deployment (Optional), Documentation
2.     Faiza Atta       Data Pre-processing, Model Training, Deployment (Optional), Documentation
3.     Sadia Latif      Data Collection, Model Evaluation, Model Training, Documentation

1.10 Data Gathering Approach


The following points outline the data gathering approach:

• The CICIDS2017 dataset is designed for evaluating Intrusion Detection Systems
(IDSs).
• It addresses the need for reliable test datasets in anomaly-based intrusion
detection.
• The dataset includes a diverse range of attacks and benign traffic.
• It covers attacks such as Brute Force, DoS, DDoS, Heartbleed, Web Attack,
Infiltration, Botnet, etc.
• Traffic was captured over a period of five days, from July 3 to July 7, 2017,
and the attacks were executed within this period.
• The dataset includes detailed network configurations, traffic interactions, attack
timings, and metadata.
• It provides a realistic representation of real-world network data for cybersecurity
research and development.
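
As an illustration of how flow records from such a dataset might be prepared, a
minimal sketch is given below. The column names, the "BENIGN" label value, and
the presence of stray header spaces and non-finite values are assumptions about
the published CSV files and should be checked against the actual data. The
sample rows are made up, and load_flows is a hypothetical helper.

```python
import csv
import io
import math
from typing import List, Tuple

# Tiny in-memory stand-in for one dataset CSV (made-up values, assumed column names).
SAMPLE_CSV = """ Flow Duration,Flow Bytes/s, Label
1200,350.5,BENIGN
15,Infinity,DDoS
980,NaN,BENIGN
43,91000.0,DDoS
"""


def load_flows(text: str, label_column: str = "Label") -> Tuple[List[List[float]], List[int]]:
    """Parse CSV text into feature rows and binary labels (0 = benign, 1 = attack)."""
    reader = csv.reader(io.StringIO(text))
    header = [name.strip() for name in next(reader)]  # drop stray spaces in names
    label_idx = header.index(label_column)
    features, labels = [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        try:
            values = [float(v) for i, v in enumerate(row) if i != label_idx]
        except ValueError:
            continue  # skip rows with non-numeric feature values
        if not all(math.isfinite(v) for v in values):
            continue  # drop rows containing NaN or Infinity
        features.append(values)
        labels.append(0 if row[label_idx].strip().upper() == "BENIGN" else 1)
    return features, labels


X, y = load_flows(SAMPLE_CSV)
print(X, y)
```

Collapsing every attack type into a single "attack" class matches the
normal-versus-malicious framing used in this proposal; a multi-class setup would
keep the original label strings instead.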

1.11 Timeline/Gantt chart

Figure 1.2: Sample Gantt chart



References
[1] Liu, H., & Lang, B. (2019). Machine learning and deep learning methods for
intrusion detection systems: A survey. Applied Sciences, 9(20), 4396.
[2] Alrawashdeh, K., & Purdy, C. (2016, December). Toward an online anomaly
intrusion detection system based on deep learning. In 2016 15th IEEE
international conference on machine learning and applications (ICMLA) (pp.
195-200). IEEE.
[3] Denning, D. E. (1987). An intrusion-detection model. IEEE Transactions on Software
Engineering, SE-13(2), 222-232.

[4] Zhao, G., Zhang, C., & Zheng, L. (2017, July). Intrusion detection using deep belief network
and probabilistic neural network. In 2017 IEEE international conference on computational
science and engineering (CSE) and IEEE international conference on embedded and
ubiquitous computing (EUC) (Vol. 1, pp. 639-642). IEEE.

[5] Ahmim, A., Maglaras, L., Ferrag, M. A., Derdour, M., & Janicke, H. (2019, May). A novel
hierarchical intrusion detection system based on decision tree and rules-based models.
In 2019 15th International Conference on Distributed Computing in Sensor Systems
(DCOSS) (pp. 228-233). IEEE.

[6] Radford, B. J., Apolonio, L. M., Trias, A. J., & Simpson, J. A. (2018).
Network traffic anomaly detection using recurrent neural networks. arXiv
preprint arXiv:1803.10769.
[7] Pektaş, A., & Acarman, T. (2019). Deep learning to detect botnet via network flow
summaries. Neural Computing and Applications, 31(11), 8021-8033.

[8] Nguyen, S. N., Nguyen, V. Q., Choi, J., & Kim, K. (2018, February). Design and
implementation of intrusion detection system using convolutional neural network for DoS
detection. In Proceedings of the 2nd international conference on machine learning and soft
computing (pp. 34-38).

[9] Shone, N., Ngoc, T. N., Phai, V. D., & Shi, Q. (2018). A deep learning approach to
network intrusion detection. IEEE Transactions on Emerging Topics in Computational
Intelligence, 2(1), 41-50.

[10] Farnaaz, N., & Jabbar, M. A. (2016). Random forest modelling for
network intrusion detection system. Procedia Computer Science, 89, 213-217.
