
ONLINE FAKE REVIEWS DETECTION IN E-COMMERCE

ABSTRACT

Most people look for reviews of a product before spending their money on it. They come across
various reviews on a website, but they cannot tell whether these reviews are genuine or fake. On
some review websites, good reviews are added by people from the product company itself in order
to produce falsely positive product reviews. They give good reviews for many different products
manufactured by their own firm, and the user is not able to find out whether a review is genuine
or fake. The “Online Fake Reviews Detection in E-Commerce applications” system is introduced to
find fake reviews on a website. The system detects fake reviews, made by posting fake comments
about a product, by identifying the IP address along with review posting patterns. A user logs in
to the system with a user ID and password, views various products, and gives reviews about them.
To decide whether a review is fake or genuine, the system finds the IP address of the user; if it
observes fake reviews sent from the same IP address many times, it informs the admin so that
those reviews can be removed from the system. This system uses data mining methodology and
helps the user find correct reviews of a product.
1. INTRODUCTION

The rapid growth of internet access has given rise to a digital era. The availability of internet
access has pushed almost 70% of the population to switch to the internet for their daily needs and
accessories. In particular, e-commerce platforms are being used at a much higher rate than ever
before. People who buy from these e-commerce platforms decide whether to buy a
product solely based on the ratings and reviews of the product that are provided by these
platforms. Due to the simple nature of this review system, sellers and even individuals tend to
exploit it by writing dishonest reviews with the intention of either boosting a product's ratings or
sabotaging them. These fake reviews are aimed at deceiving customers and convincing them to buy
or avoid a certain product. Due to the lack of a robust system to distinguish real from fake
reviews, these spam reviews manage to show up on top. To avoid this problem, a more efficient way
to filter reviews is needed. This work focuses on designing a machine
learning model for fake review detection and compares the performance of three different
algorithms. In this research work, the random forest algorithm outperforms the other two
algorithms. A web-based User Interface (UI) is designed to remove fake reviews and display trusted
reviews based on the ranking. In general, e-commerce websites give users the option to review a
product or service. These reviews can serve as a source of knowledge. For instance,
businesses can use them to decide how to design their goods or services, and prospective customers
can use them to decide whether to purchase a product or not. Positive reviews influence customers to
buy products and generate revenue, whereas negative reviews frequently result in sales declines
on e-commerce websites. Unfortunately, phony reviews written by some parties to boost the
popularity of their product or to disparage a competitor's goods limit the value of reviews. As
a result, fake reviews are a serious issue for e-commerce sites and other service providers
because today's consumers rely heavily on reviews. The main aim of this project is to assist
customers in selecting the best product by identifying phony reviews on e-commerce sites using
the review text, the ratings given to the product, and other information. The project also performs
sentiment analysis on customer reviews to classify them as positive or negative based on
the review text, the product ratings, and so on. Customers increasingly rely on reviews
for product information. However, the usefulness of online reviews is impeded by fake reviews
that give an untruthful picture of product quality. Therefore, detection of fake reviews is needed.
Unfortunately, so far, automatic detection has only had partial success in this challenging task. In
this research, we address the creation and detection of fake reviews. This issue is important for
marketing and e-commerce domains for three main reasons. First, fake reviews can erode consumer
trust in online reviews as a whole, which could signify a major market decline. Sincere consumers write
reviews to share their experiences, either positive or negative. Hence, truthful reviewing renders
a valuable service in the marketplace, as the information in these reviews provides a signal of
quality for other consumers. A truthful marketplace for reviews is also in the interest of
companies, as they can receive authentic feedback from customers that can be analysed to
improve products and services. If fake reviews were to permeate the marketplace at scale, this
would risk systematically degrading source credibility of online reviews in general. The
consequence might be adverse selection, a process in which consumers are unable to distinguish
good reviews from bad ones. Second, fake reviews can influence a product's ranking either
positively (when the fake review is positive) or negatively (when the fake review is negative).
This is because online marketplaces' algorithms use reviews as a signal to determine a product's
ranking among other products in the same category. Therefore, fake reviews can result in unfair
competition, where a product's ranking is artificially inflated or deflated. This means that fake
reviews can be used against competitors: an unethical firm may generate an influx of negative
reviews about its rival.
Flooding the market with such reviews can cause the ranking algorithms of online platforms to
lower the visibility of the attacked firm. It is essential to detect and prevent such effects from
taking place in order to protect firms from unfair competition.
2. LITERATURE REVIEW

2.1 Title: Fake Online Reviews: A Unified Detection Model Using Deception Theories
Author: Mujahed Abdulqader

Year: 2022
Online reviews influence consumers’ purchasing decisions. However, identifying fake online
reviews automatically remains a complex problem, and current detection approaches are
inefficient in preventing the spread of fake reviews. The literature on fake reviews detection
lacks a comprehensive and interpretable theory-based model with high performance, which
enables us to understand the phenomenon from a psychological perspective and analyze reviews
based on user-generated content as well as consumer behavior. In this research, we synthesized
ten well-founded deception theories from psychology, namely leakage theory, four-factor theory,
interpersonal deception theory, self-presentational theory, reality monitoring theory, criteria-
based content analysis, scientific content analysis, verifiability approach, truth-default theory,
and information manipulation theory, and selected nine relevant constructs to develop a unified
model for detecting fake online reviews. These constructs include specificity, quantity, non-
immediacy, affect, uncertainty, informality, consistency, source credibility, and deviation in
behavior. We characterized the selected constructs using verbal and non-verbal features to
validate the proposed model empirically. Subsequently, we extracted features from the Yelp
datasets and used them to train four machine learning algorithms, specifically Logistic
Regression, Naïve Bayes, Decision Tree, and Random Forest. We demonstrated that quantity,
non-immediacy, affect, informality, consistency, source credibility, and deviation in behavior are
essential constructs for detecting fake reviews. To our surprise, we discovered that non-verbal
features are more important than verbal features and that combining features from both types
improves the prediction performance. Our theory-based model outperformed most of the state-
of-the-art fake review detection models and yielded high interpretability and low complexity.
2.2 Title: Fraud Detection in Online Product Review Systems via Heterogeneous Graph
Transformer
Author: Songkai Tang; Luhua Jin; Fan Cheng
Year: 2021
In online product review systems, users are allowed to submit reviews about their purchased
items or services. However, fake reviews posted by fraudulent users often mislead consumers
and bring losses to enterprises. Traditional fraud detection algorithms mainly utilize rule-based
methods, which are insufficient for rich user interactions and graph-structured data. In recent
years, graph-based methods have been proposed to handle this situation, but few prior works
have noticed fraudsters' camouflage behavior and the inconsistent, heterogeneous nature of the data.
Existing methods have either not addressed these two problems or addressed them only partially,
which results in poor performance. Alternatively, we propose a new model named Fraud Aware
Heterogeneous Graph Transformer (FAHGT), to address camouflage and inconsistency problems in a
unified manner. FAHGT adopts a type-aware feature mapping mechanism to handle heterogeneous
graph data, then implements various relation scoring methods to alleviate inconsistency and
discover camouflage. Finally, the neighbors’ features are aggregated together to build an
informative representation. FAHGT shows a remarkable performance gain compared to several
baselines on different datasets. GraphConsis addresses the inconsistency problem by computing
the similarity score between node embeddings, which cannot distinguish nodes with different
types. CARE-GNN enhances GNN-based fraud detectors against camouflaged fraudsters with a
reinforcement learning-based neighbor selector and a relation-aware aggregator. Its performance
still suffers on heterogeneous graphs. In this paper, we introduce the Fraud Aware
Heterogeneous Graph Transformer (FAHGT), where we propose heterogeneous mutual attention
to address the inconsistency problem and design a label-aware neighbor selector to solve the
camouflage problem. Both are implemented in a unified manner called the “score head
mechanism”. We demonstrate the fraud detection performance of FAHGT on many real-world
datasets. It is verified that FAHGT can considerably improve F1 score, KS and AUC over
several baselines.
2.3 Title: An Ensemble Model for Fake Online Review Detection Based on Data Resampling,
Feature Pruning, and Parameter Optimization.
Author: Jianrong Yao; Yuan Zheng; Hui Jiang

Year: 2021

With the wide spread of fake online reviews, the detection of fake reviews has become a hot
research issue. Despite the efforts of existing studies on fake review detection, the issues of
imbalanced data and feature pruning still lack sufficient attention. To address these gaps, the
present study proposes an ensemble model for the detection of fake online reviews. The model
consists of four steps, and the first three steps are proposed to optimize the base classifiers: (i)
Data resampling: We propose a novel way to address the data imbalance problem by combining
the resampling and the grid search technique. (ii) Feature pruning: We propose an ablation study
to drop unimportant features. (iii) Parameters optimization: We apply the grid search algorithm
to determine suitable values of the relevant parameters for each base classifier. (iv) Classifier
ensembling: We apply majority voting and stacking strategies to integrate the optimized base
classifiers. The proposed data resampling method is also applied for the meta-classifier in the
stacking ensemble model. This study produces advances in terms of combining different
methods or algorithms into a model and the results show that the proposed ensemble model
outperforms some existing techniques, thereby providing a new way to solve the data imbalance
and feature pruning issues in the field of fake review detection. The prevalence of fake reviews is
becoming a severe problem, as it misleads consumers when making their purchase decisions and
results in great damage to the sustainable development of online review systems. Some websites
allow consumers to report reviews that they suspect to be fake. However, it is difficult for
consumers to identify fake reviews because some of them are written carefully and resemble
authentic reviews. Because of the difficulty of identifying fake reviews manually, searching for
an automatic detection method is the main direction of related research. Among various types of
fake review detection methods, machine learning methods have been widely used. However,
some problems remain understudied.
2.4 Title: Revisiting Semi-Supervised Learning for Online Deceptive Review Detection
Author: Jitendra Kumar Rout; Anmol Dalmia; Kim-Kwang Raymond Choo

Year: 2017

With more consumers using online opinion reviews to inform their service decision making,
opinion reviews have an economic impact on the bottom line of businesses. Unsurprisingly,
opportunistic individuals or groups have attempted to abuse or manipulate online opinion
reviews (e.g., spam reviews) to make profits and so on, so detecting deceptive and fake
opinion reviews is a topic of ongoing research interest. In this paper, we explain how semi-
supervised learning methods can be used to detect spam reviews, prior to demonstrating their
utility using a data set of hotel reviews. Deceptive online review detection is generally
considered a classification problem, and one popular approach is to use supervised text
classification techniques. These techniques are robust if the training is performed using large
datasets of labelled instances from both classes, deceptive opinions (positive instances) and
truthful opinions (negative examples). However, it is challenging in practice to obtain such large
and accurate training sets. Prior work has explained that identification of deceptive online
reviews is often performed using prior human knowledge, which increases the probability of
mislabelled reviews due to the potential for subjectivity during the labelling process. Therefore,
in some studies, synthetic datasets of deceptive reviews are used. In these approaches, the classification
of reviews is performed by investigating the psycholinguistic and structural differences between
deceptive and non-deceptive reviews.
2.5 Title: Detecting Spammer Groups From Product Reviews: A Partially Supervised Learning
Model
Author: Lu Zhang; Zhiang Wu; Jie Cao

Year: 2017

Nowadays, online product reviews play a crucial role in the purchase decision of consumers. A
high proportion of positive reviews will bring substantial sales growth, while negative reviews
will cause sales loss. Driven by the immense financial profits, many spammers try to promote
their products or demote their competitors' products by posting fake and biased online reviews.
By registering a number of accounts or releasing tasks in crowdsourcing platforms, many
individual spammers could be organized as spammer groups to manipulate the product reviews
together and can be more damaging. Existing works on spammer group detection extract
spammer group candidates from review data and identify the real spammer groups using
unsupervised spamicity ranking methods. Actually, according to previous research, labeling a
small number of spammer groups is easier than one assumes; however, few methods try to make
good use of these important labeled data. In this paper, we propose a partially supervised
learning model (PSGD) to detect spammer groups. By labeling some spammer groups as positive
instances, PSGD applies positive unlabeled learning (PU-Learning) to learn a classifier as a
spammer group detector from positive instances (labeled spammer groups) and unlabeled
instances (unlabeled groups). Specifically, we extract a reliable negative set in terms of the
positive instances and the distinctive features. By combining the positive instances, extracted
negative instances and unlabeled instances, we convert the PU-Learning problem into the well-
known semi-supervised learning problem, and then use a Naive Bayesian model and an EM
algorithm to train a classifier for spammer group detection. Experiments on real-life Amazon.cn
data set show that the proposed PSGD is effective and outperforms the state-of-the-art spammer
group detection methods.
3. SYSTEM ANALYSIS AND DESIGN

3.1 EXISTING SYSTEM

Recommender systems are indispensable for providing personalized services on the Web.
Recommending items that match a user’s preferences has been researched for a long time, and
many useful approaches exist. We first discuss existing Collaborative Filtering methods with
explicit feedback, meaning that both positive and negative feedback are observed in the dataset.
Collaborative Filtering methods can be divided into memory-based methods, model-based methods,
and combinations of the two. The memory-based methods include the Neighborhood method, which
calculates the similarity of users or items. The model-based methods include the Matrix
Factorization model, the Probabilistic model and the Cluster-based model. The biggest problem in
Collaborative Filtering is the sparseness of observed values: feedback is observed for only a very
small portion of all possible user-item pairs. However, the Matrix Factorization model is known to
work better than other models even if the data is sparse.
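
The remark that matrix factorization copes with sparse feedback can be illustrated with a minimal sketch. It is not part of the system described here: it assumes a hypothetical list of (user, item, rating) triples and learns a few latent factors per user and item by stochastic gradient descent over the observed pairs only. The names `factorize`, `predict` and `rmse`, the sample data and all hyperparameters are illustrative assumptions.

```python
import random


def predict(P, Q, u, i):
    """Predicted rating: dot product of user u's and item i's latent factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def factorize(ratings, n_users, n_items, k=2, epochs=1000, lr=0.02, reg=0.02, seed=0):
    """Learn user factors P and item factors Q from observed ratings only."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization on both factors
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error over the observed ratings."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5


# Hypothetical sparse data: only 7 of the 16 user-item pairs are observed
observed = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1), (2, 1, 5), (2, 3, 2), (3, 2, 2)]

P0, Q0 = factorize(observed, 4, 4, epochs=0)  # untrained factors
P, Q = factorize(observed, 4, 4)              # trained factors
print("RMSE before training:", rmse(observed, P0, Q0))
print("RMSE after training: ", rmse(observed, P, Q))
```

Once trained, `predict(P, Q, u, i)` can also be called for user-item pairs that were never observed, which is how a factorization model fills in the missing entries of a sparse rating matrix.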

DISADVANTAGES

• Only ratings from user reviews are analyzed

• Fake reviews can’t be analyzed by the existing work

• Users can’t identify genuine reviews

• Only a limited number of product reviews can be handled


3.2 PROPOSED SYSTEM

A recommendation system has been implemented based on a hybrid approach that combines stochastic
learning with a context-based engine. We have tried to combine existing recommendation
applications to come up with a hybrid one, which improves performance by overcoming the
drawbacks of traditional recommendation systems. Recommender systems, being a part of
information filtering systems, are used to forecast the preference or rating a user tends to give
to an item. Among the different kinds of recommendation approaches, the collaborative filtering
technique is very popular because of its effectiveness. Traditional collaborative filtering
systems can work very effectively and produce standard recommendations, even for
wide-ranging problems. For items, an entropy-based technique that relies on neighbors’ preferences
creates better suggestions than others, whereas other techniques such as content-based filtering
suffer from poor accuracy, scalability, data sparsity and large prediction errors. To address these
issues we have used a user-based collaborative filtering approach. In the item-based filtering
technique, we first examine the user-item rating matrix and identify the relationships among
various items, and then we use these relationships to compute the recommendations for the user.
Cosine similarity, which serves as a similarity weight, plays an important role in the item-based
filtering approach and is also used to select trustworthy users from the given set of users. These
weights give us a method to increase or decrease the significance of a particular user or item. In
the present methodology we use adjusted similarity for the computation of the similarity weights
of items.
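
The item-based filtering steps described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes that "adjusted similarity" refers to the commonly used adjusted cosine similarity (each rating is centred on its user's mean before the cosine is taken), and the rating data, function names and the choice to use only positively similar items are hypothetical.

```python
from math import sqrt


def user_means(ratings):
    """Mean rating of every user; ratings maps user -> {item: rating}."""
    return {u: sum(r.values()) / len(r) for u, r in ratings.items()}


def adjusted_cosine(ratings, means, i, j):
    """Adjusted cosine similarity of items i and j over users who rated both."""
    num = den_i = den_j = 0.0
    for u, r in ratings.items():
        if i in r and j in r:
            di, dj = r[i] - means[u], r[j] - means[u]
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / (sqrt(den_i) * sqrt(den_j))


def predict(ratings, means, user, item):
    """Similarity-weighted average of the user's ratings on positively similar items."""
    num = den = 0.0
    for j, r_uj in ratings[user].items():
        s = adjusted_cosine(ratings, means, item, j)
        if s > 0:  # assumption: ignore dissimilar items
            num += s * r_uj
            den += s
    # Fall back to the user's mean when no similar item was rated
    return num / den if den else means[user]


# Hypothetical user-item rating matrix
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "C": 2},
    "carol": {"A": 1, "B": 2, "C": 5},
    "dave": {"A": 5, "C": 1},  # has not rated item B
}
means = user_means(ratings)
print("sim(A, B) =", adjusted_cosine(ratings, means, "A", "B"))
print("sim(A, C) =", adjusted_cosine(ratings, means, "A", "C"))
print("predicted rating of dave for B =", predict(ratings, means, "dave", "B"))
```

In this toy data, items A and B are rated alike by the same users and come out positively similar, while A and C come out negatively similar, so the prediction for B is driven by the rating the user gave to A.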

ADVANTAGES

• The system helps the user find correct reviews of a product

• Handles a large amount of contextual information

• Users can easily buy genuine products

• Recommends positive products based on user reviews

• Automatic decision-making system for product recommendation


SCOPE OF THE PROJECT

The Student Feedback System is a management information system for education
establishments to manage student data. Student Feedback Systems provide
capabilities for selecting a particular subject for feedback, generating the report
automatically, building student details, and meeting student-related data needs in a college. A
Student Feedback System is an automatic feedback generation system that
provides proper feedback to the teachers according to categories such as always, poor,
usually, very often, and sometimes.

This system is developed mainly for students to give feedback
about their teachers.

• Students can give feedback to the assigned faculty.

• This system is developed exclusively for college students and lecturers.

• A college can register with the system with branch and faculty details.

• Students have to register for the feedback process through the valid string ID which
is given by the admin.
SYSTEM ARCHITECTURE

Systems design is the process of defining the elements of a system, such as
modules, architecture, components, and their interfaces and data, based on the
specified requirements. It also defines the interfaces and data for an
electronic control system to satisfy specified requirements. Systems design could
be seen as the application of systems theory to product development. There is
some overlap with the disciplines of systems analysis, systems architecture and
systems engineering.

Various organizations define systems architecture in different ways, including:

• An allocated arrangement of physical elements which provides the design
solution for a consumer product or life-cycle process intended to satisfy the
requirements of the functional architecture and the requirements baseline.
• Architecture comprises the most important, pervasive, top-level, strategic
inventions, decisions, and their associated rationales about the overall
structure (i.e., essential elements and their relationships) and associated
characteristics and behavior.
• If documented, it may include information such as a detailed inventory of
current hardware, software and networking capabilities; a description of
long-range plans and priorities for future purchases, and a plan for upgrading
and/or replacing dated equipment and software.
• The composite of the design architectures for products and their life-cycle
processes.
4. SYSTEM REQUIREMENTS

4.1 HARDWARE REQUIREMENTS

• Processor : Dual core processor, 2.6 GHz
• RAM : 1 GB
• Hard disk : 160 GB
• Compact Disk : 650 MB
• Keyboard : Standard keyboard
• Monitor : 15 inch color monitor

4.2 SOFTWARE REQUIREMENTS

• Operating system : Windows OS
• Front End : Python
• Back End : MySQL Server
• IDE : Python 2.7 IDLE
5. MODULE DESCRIPTION & LIBRARY DESCRIPTION

MODULES

• Login

• Upload review dataset

• Text processing

• Sentiment analysis

• Classification

• Fake review detection

MODULES DESCRIPTION

Login- This module allows the admin to log in to the system. The admin enters a
username and password to log in.

Upload review dataset- After the login process, the admin collects the user reviews and uploads
the dataset into the system. This dataset, which contains user reviews from the e-commerce
website, is stored in the database.
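
A minimal sketch of the upload step is given below. It makes several assumptions that are not stated in this report: the dataset is a CSV file with the hypothetical columns `user_id`, `product_id`, `rating` and `review_text`, and Python's built-in `sqlite3` module is swapped in for the MySQL back end so that the example runs without a database server. With a MySQL driver the overall flow would be similar, although the parameter placeholder style differs (for example `%s` instead of `?`).

```python
import csv
import io
import sqlite3


def load_reviews(csv_text, conn):
    """Create the reviews table if needed and insert every CSV row into it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reviews ("
        "review_id INTEGER PRIMARY KEY, user_id TEXT, product_id TEXT, "
        "rating INTEGER, review_text TEXT)"
    )
    rows = [
        (row["user_id"], row["product_id"], int(row["rating"]), row["review_text"])
        for row in csv.DictReader(io.StringIO(csv_text))
    ]
    # Parameterized insert: values are passed separately from the SQL text
    conn.executemany(
        "INSERT INTO reviews (user_id, product_id, rating, review_text) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


# Hypothetical two-row dataset standing in for an uploaded file
sample = (
    "user_id,product_id,rating,review_text\n"
    "u1,p1,5,Great phone\n"
    "u2,p1,1,Stopped working after a week\n"
)
conn = sqlite3.connect(":memory:")  # in-memory stand-in for the MySQL database
inserted = load_reviews(sample, conn)
print(inserted, "reviews stored")
```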

Text processing- In this module, user reviews are taken from the dataset. The reviews are in text
format. The text is first pre-processed using text processing methods.

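
The report does not name the text processing methods, so the following is only a sketch of one common choice: lowercasing, stripping punctuation and digits, tokenizing, and removing stop words. The tiny `STOPWORDS` set and the function name `preprocess` are hypothetical.

```python
import re

# Hypothetical, deliberately tiny stop-word list
STOPWORDS = {"a", "an", "and", "is", "it", "of", "the", "this", "to", "was"}


def preprocess(review):
    """Lowercase the text, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", review.lower())
    return [t for t in tokens if t not in STOPWORDS]


print(preprocess("This phone is AMAZING!!! Best purchase ever."))
# ['phone', 'amazing', 'best', 'purchase', 'ever']
```
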
Sentiment analysis- After text processing, particular keywords are extracted from the text, and
then the sentiment of the text is analyzed, that is, whether the review is positive or negative.

Classification- The sentiment analyzed from the text is classified as positive or
negative based on the extracted keywords.
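
Keyword-based sentiment classification, as outlined in the two modules above, can be sketched as a simple lexicon count. The keyword sets, the function names and the tie-breaking rule are assumptions made for illustration; the report does not specify them.

```python
# Hypothetical keyword lists; a real lexicon would be far larger
POSITIVE = {"good", "great", "amazing", "best", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "worst", "broke", "waste"}


def sentiment_score(tokens):
    """Number of positive keywords minus number of negative keywords."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def classify(tokens):
    """Label a tokenized review as 'positive' or 'negative'."""
    # Assumption: ties and reviews without any keyword fall back to "negative"
    return "positive" if sentiment_score(tokens) > 0 else "negative"


print(classify(["phone", "amazing", "best", "purchase"]))  # positive
print(classify(["terrible", "battery", "broke"]))          # negative
```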

Fake review detection- In this module, the admin can identify users’ fake reviews. Multiple
fake reviews posted from the same user account are detected using this approach.
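
One way to read this module, together with the IP-address check mentioned in the abstract, is as a repeated-source rule: reviews of the same product that come from the same account or the same IP address more than a few times are flagged for the admin. The sketch below illustrates that reading only; the record fields, the threshold `max_per_source` and the function name are hypothetical.

```python
from collections import defaultdict


def flag_suspicious(reviews, max_per_source=2):
    """Return sorted IDs of reviews whose account or IP reviewed a product too often."""
    by_account = defaultdict(list)
    by_ip = defaultdict(list)
    for r in reviews:
        by_account[(r["user_id"], r["product_id"])].append(r["review_id"])
        by_ip[(r["ip"], r["product_id"])].append(r["review_id"])
    flagged = set()
    for groups in (by_account, by_ip):
        for ids in groups.values():
            if len(ids) > max_per_source:
                flagged.update(ids)
    return sorted(flagged)


# Hypothetical review records; three reviews of p1 share one IP address
reviews = [
    {"review_id": 1, "user_id": "u1", "ip": "10.0.0.5", "product_id": "p1"},
    {"review_id": 2, "user_id": "u2", "ip": "10.0.0.5", "product_id": "p1"},
    {"review_id": 3, "user_id": "u3", "ip": "10.0.0.5", "product_id": "p1"},
    {"review_id": 4, "user_id": "u4", "ip": "10.0.0.9", "product_id": "p1"},
    {"review_id": 5, "user_id": "u1", "ip": "10.0.0.7", "product_id": "p2"},
]
print(flag_suspicious(reviews))
# [1, 2, 3]
```

A rule of this kind only surfaces candidates; in the workflow described here, the admin still decides whether the flagged reviews are removed.
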
6. PYTHON DESCRIPTION

FRONT END: PYTHON


Python is an interpreted high-level programming language for general-purpose
programming. Created by Guido van Rossum and first released in 1991, Python has a design
philosophy that emphasizes code readability, notably using significant whitespace. It provides
constructs that enable clear programming on both small and large scales. In July 2018, Van
Rossum stepped down as the leader in the language community. Python features a dynamic type
system and automatic memory management. It supports multiple programming paradigms,
including object-oriented, imperative, functional and procedural, and has a large and
comprehensive standard library. Python interpreters are available for many operating systems.
CPython, the reference implementation of Python, is open source software and has a community-
based development model, as do nearly all of Python's other implementations. Python and
CPython are managed by the non-profit Python Software Foundation. Rather than having all of
its functionality built into its core, Python was designed to be highly extensible. This compact
modularity has made it particularly popular as a means of adding programmable interfaces to
existing applications. Van Rossum's vision of a small core language with a large standard library
and easily extensible interpreter stemmed from his frustrations with ABC, which espoused the
opposite approach. While offering choice in coding methodology, the Python philosophy rejects
exuberant syntax (such as that of Perl) in favor of a simpler, less-cluttered grammar. As Alex
Martelli put it: "To describe something as 'clever' is not considered a compliment in the Python
culture." Python's philosophy rejects the Perl "there is more than one way to do it" approach to
language design in favour of "there should be one—and preferably only one—obvious way to do
it".

      Python's developers strive to avoid premature optimization, and reject patches to non-
critical parts of CPython that would offer marginal increases in speed at the cost of clarity.
When speed is important, a Python programmer can move time-critical functions to extension
modules written in languages such as C, or use PyPy, a just-in-time compiler. Cython is also
available, which translates a Python script into C and makes direct C-level API calls into the
Python interpreter. An important goal of Python's developers is keeping it fun to use. This is
reflected in the language's name, a tribute to the British comedy group Monty Python, and in
occasionally playful approaches to tutorials and reference materials, such as examples that refer
to spam and eggs (from a famous Monty Python sketch) instead of the standard foo and bar.

A common neologism in the Python community is pythonic, which can have a wide
range of meanings related to program style. To say that code is pythonic is to say that it uses
Python idioms well, that it is natural or shows fluency in the language, that it conforms with
Python's minimalist philosophy and emphasis on readability. In contrast, code that is difficult to
understand or reads like a rough transcription from another programming language is called
unpythonic. Users and admirers of Python, especially those considered knowledgeable or
experienced, are often referred to as Pythonists, Pythonistas, and Pythoneers. Python is an
interpreted, object-oriented, high-level programming language with dynamic semantics. Its high-
level built in data structures, combined with dynamic typing and dynamic binding, make it very
attractive for Rapid Application Development, as well as for use as a scripting or glue language
to connect existing components together. Python's simple, easy to learn syntax emphasizes
readability and therefore reduces the cost of program maintenance. Python supports modules and
packages, which encourages program modularity and code reuse. The Python interpreter and the
extensive standard library are available in source or binary form without charge for all major
platforms, and can be freely distributed. Often, programmers fall in love with Python because of
the increased productivity it provides. Since there is no compilation step, the edit-test-debug
cycle is incredibly fast. Debugging Python programs is easy: a bug or bad input will never cause
a segmentation fault. Instead, when the interpreter discovers an error, it raises an exception.
When the program doesn't catch the exception, the interpreter prints a stack trace. A source level
debugger allows inspection of local and global variables, evaluation of arbitrary expressions,
setting breakpoints, stepping through the code a line at a time, and so on. The debugger is written
in Python itself, testifying to Python's introspective power. On the other hand, often the quickest
way to debug a program is to add a few print statements to the source: the fast edit-test-debug
cycle makes this simple approach very effective.
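
The exception behaviour described above can be illustrated with a short sketch. The helper names are hypothetical; the example only shows that bad input raises an exception (here `ValueError` from `int()`), that the program can catch it, and that the stack trace the interpreter would otherwise print is available through the standard `traceback` module.

```python
import traceback


def parse_rating(text):
    """Convert a rating string to an int; int() raises ValueError on bad input."""
    return int(text)


def safe_parse_rating(text, default=None):
    """Return the parsed rating, or `default` when the input is not a number."""
    try:
        return parse_rating(text)
    except ValueError:
        # Print the stack trace that an uncaught exception would produce, then recover
        traceback.print_exc()
        return default


print(safe_parse_rating("5"))     # 5
print(safe_parse_rating("five"))  # None, after the trace is written to stderr
```
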
Python’s initial development was spearheaded by Guido van Rossum in the late 1980s.
Today, it is developed by the Python Software Foundation. Because Python is a multiparadigm
language, Python programmers can accomplish their tasks using different styles of programming:
object oriented, imperative, functional or reflective. Python can be used in Web development,
numeric programming, game development, serial port access and more.

There are two attributes that make development time in Python faster than in other programming
languages:

1. Python is an interpreted language, which precludes the need to compile code before
executing a program because Python does the compilation in the background. Because
Python is a high-level programming language, it abstracts many sophisticated details
from the programming code. Python focuses so much on this abstraction that its code can
be understood by most novice programmers.
2. Python code tends to be shorter than comparable code in other languages. Although Python
offers fast development times, it lags slightly in terms of execution time. Compared to fully
compiled languages like C and C++, Python programs execute more slowly. Of course, with
the processing speeds of computers these days, the speed differences are usually only
observed in benchmarking tests, not in real-world operations. In most cases, Python is
already included in Linux distributions and Mac OS X machines.

BACK END: MYSQL

MySQL was, as of 2008, the world's most used open source relational database management system
(RDBMS). It runs as a server providing multi-user access to a number of databases.
The MySQL development project has made its source code available under the terms of the
GNU General Public License, as well as under a variety of proprietary agreements. MySQL was
owned and sponsored by a single for-profit firm, the Swedish company MySQL AB, now owned
by Oracle Corporation.

MySQL is a popular choice of database for use in web applications, and is a central component
of the widely used LAMP open source web application software stack—LAMP is an acronym
for "Linux, Apache, MySQL, Perl/PHP/Python." Free-software-open source projects that require
a full-featured database management system often use MySQL.

For commercial use, several paid editions are available, and offer additional functionality.
Applications which use MySQL databases include: TYPO3, Joomla, WordPress, phpBB,
MyBB, Drupal and other software built on the LAMP software stack. MySQL is also used in
many high-profile, large-scale World Wide Web products, including Wikipedia, Google (though
not for searches), Facebook, Twitter, Flickr, Nokia.com, and YouTube.

Interfaces

MySQL is primarily an RDBMS and ships with no GUI tools to administer MySQL databases or
manage data contained within the databases. Users may use the included command line tools, or
use MySQL "front-ends", desktop software and web applications that create and manage MySQL
databases, build database structures, back up data, inspect status, and work with data records.
The official set of MySQL front-end tools, MySQL Workbench, is actively developed by Oracle,
and is freely available for use.

Graphical

The official MySQL Workbench is a free integrated environment developed by MySQL AB, that
enables users to graphically administer MySQL databases and visually design database
structures. MySQL Workbench replaces the previous package of software, MySQL GUI Tools.
Similar to other third-party packages, but still considered the authoritative MySQL frontend,
MySQL Workbench lets users manage database design & modeling, SQL development
(replacing MySQL Query Browser) and Database administration (replacing MySQL
Administrator).

MySQL Workbench is available in two editions, the regular free and open source Community
Edition which may be downloaded from the MySQL website, and the proprietary Standard
Edition which extends and improves the feature set of the Community Edition.

Command line

MySQL ships with some command line tools. Third parties have also developed tools to manage
a MySQL server, some of which are listed below.

 Maatkit - a cross-platform toolkit for MySQL, PostgreSQL and Memcached, developed
in Perl. Maatkit can be used to prove replication is working correctly, fix corrupted data,
automate repetitive tasks, and speed up servers. Maatkit is included with several GNU/Linux
distributions such as CentOS and Debian.

Programming

MySQL works on many different system platforms, including AIX, BSDi, FreeBSD, HP-UX,
eComStation, i5/OS, IRIX, Linux, Mac OS X, Microsoft Windows, NetBSD, Novell NetWare,
OpenBSD, OpenSolaris, OS/2 Warp, QNX, Solaris, Symbian, SunOS, SCO OpenServer, SCO
UnixWare, Sanos and Tru64. A port of MySQL to OpenVMS also exists.

MySQL is written in C and C++. Its SQL parser is written in yacc, and it uses a home-brewed
lexical analyzer. Many programming languages with language-specific APIs include libraries for
accessing MySQL databases. These include MySQL Connector/Net for integration with
Microsoft's Visual Studio (languages such as C# and VB are most commonly used) and the
JDBC driver for Java. In addition, an ODBC interface called MyODBC allows additional
programming languages that support the ODBC interface to communicate with a MySQL
database, such as ASP or ColdFusion. The HTSQL URL-based query method also ships with a
MySQL adapter, allowing direct interaction between a MySQL database and any web client via
structured URLs.

Features

As of April 2009, MySQL offered MySQL 5.1 in two different variants: the open source MySQL
Community Server and the commercial Enterprise Server. MySQL 5.5 is offered under the same
licences. They have a common code base and include the following features:

 A broad subset of ANSI SQL 99, as well as extensions
 Cross-platform support
 Stored procedures
 Triggers
 Cursors
 Updatable Views
 Information schema
 Strict mode (ensures MySQL does not truncate or otherwise modify data to conform to an
underlying data type, when an incompatible value is inserted into that type)
 X/Open XA distributed transaction processing (DTP) support; two phase commit as part
of this, using Oracle's InnoDB engine
 Independent storage engines (MyISAM for read speed, InnoDB for transactions and
referential integrity, MySQL Archive for storing historical data in little space)
 Transactions with the InnoDB and Cluster storage engines; savepoints with InnoDB
 SSL support
 Query caching
 Sub-SELECTs (i.e. nested SELECTs)
 Replication support (i.e. Master-Master Replication & Master-Slave Replication) with
one master per slave, many slaves per master, no automatic support for multiple masters per
slave
 Full-text indexing and searching using MyISAM engine
 Embedded database library
 Unicode support (prior to 5.5.3, UTF-8 and UCS-2 encoded strings are limited to
the BMP; in 5.5.3 and later, utf8mb4 provides full Unicode support)
 ACID compliance when using transaction capable storage engines (InnoDB and Cluster)
 Partitioned tables with pruning of partitions in optimiser
 Shared-nothing clustering through MySQL Cluster
 Hot backup (via mysqlhotcopy) under certain conditions
 Multiple storage engines, allowing one to choose the one that is most effective for each
table in the application (in MySQL 5.0, storage engines must be compiled in; in MySQL 5.1,
storage engines can be dynamically loaded at run time): Native storage engines (MyISAM,
Falcon, Merge, Memory (heap), Federated, Archive, CSV, Blackhole, Cluster, EXAMPLE,
Maria, and InnoDB, which was made the default as of 5.5). Partner-developed storage engines
(solidDB, NitroEDB, ScaleDB, TokuDB, Infobright (formerly Brighthouse), Kickfire, XtraDB,
IBM DB2). InnoDB used to be a partner-developed storage engine, but with recent acquisitions,
Oracle now owns both MySQL core and InnoDB.

6.1 Building the API/GUI

Flask Implementation:

Flask is often referred to as a "micro" framework because it focuses on
simplicity and minimalism. It does not come with built-in tools for database
abstraction, form validation, or other components that are considered part of a
larger framework. Instead, Flask allows developers to choose and integrate their
preferred tools and libraries, offering a high degree of customization. Flask can
be installed with Python's package manager, pip, using the following steps.

Install Python:
Make sure you have Python installed on your system. You can download the latest
version of Python from the official Python website. Follow the installation
instructions for your operating system.

Verify pip Installation:
Pip is the package installer for Python. It usually comes installed with Python.
You can check if pip is installed by running the following command in your
terminal or command prompt:
#### pip --version

Install Flask:
Once pip is installed, you can install Flask using the following command:
#### pip install Flask

Verify Flask Installation:
After the installation finishes, check the installed version with:
#### flask --version

This should display the Flask version if the installation was successful.
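
The same check can also be made from inside Python, for example in a setup script. The following is a minimal sketch using the standard library module importlib.metadata (Python 3.8 or later); the helper name installed_version is an assumption made for illustration and is not part of Flask or pip.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    flask_version = installed_version("flask")
    if flask_version is None:
        print("Flask is not installed; run: pip install Flask")
    else:
        print("Flask version:", flask_version)
```

Returning None rather than raising keeps the calling script simple: it can decide whether to stop or to print an installation hint.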

6.2 IMPORT LIBRARIES NEEDED IN THE PROJECT

To create a Flask application or any other Python program, you often need to
import various libraries or modules to extend the functionality of your code. Below
are some libraries and modules that you might want to import when working with
Flask for the online fake review detection system.

from flask import Flask, render_template, flash, request, session
from flask import redirect, url_for
# from wtforms import Form, TextField, TextAreaField, validators, StringField, SubmitField
# from werkzeug.utils import secure_filename
import mysql.connector
import datetime
7. SYSTEM TESTING

7.1 SYSTEM TESTING

Testing is a series of different tests whose primary purpose is to fully exercise the
computer-based system. Although each test has a different purpose, all of them should verify
that all system elements have been properly integrated and perform their allocated functions.
Testing is the process of checking whether the developed system works according to the actual
requirements and objectives of the system. The philosophy behind testing is to find errors. A
good test is one that has a high probability of finding an undiscovered error, and a successful
test is one that uncovers such an error. Test cases are devised with this purpose in mind. A test
case is a set of data that the system will process as input.

7.2 TYPES OF TESTING

SYSTEM TESTING

After a system has been verified, it needs to be thoroughly tested to ensure that every
component of the system is performing in accordance with the specific requirements and that it is
operating as it should including when the wrong functions are requested or the wrong data is
introduced.

Testing measures consist of developing a set of test criteria either for the entire system or for
specific hardware, software and communications components. For an important and sensitive
system such as an electronic voting system, a structured system testing program may be
established to ensure that all aspects of the system are thoroughly tested.

Testing measures that could be followed include:

 Applying functional tests to determine whether the test criteria have been met.
 Applying qualitative assessments to determine whether the test criteria have been met.
 Conducting tests in “laboratory” conditions and conducting tests in a variety of “real
life” conditions.

Test measures for hardware may include:

 Applying “non-operating” tests to ensure that equipment can stand up to expected levels
of physical handling.
 Testing “hard wired” code in hardware (firmware) to ensure its logical correctness and
that appropriate standards are followed.

Tests for software components also include:

 Testing all programs to ensure their logical correctness and that appropriate design,
development and implementation standards have been followed.
 Conducting “load tests”, simulating as closely as possible a variety of “real life” conditions
using or exceeding the amounts of data that could be expected in an actual situation.
 Verifying that integrity of data is maintained throughout its required manipulation.

UNIT TESTING

The first test in the development process is the unit test. The source code is normally
divided into modules, which in turn are divided into smaller pieces called units. These units
have specific behavior, and the test done on these units of code is called a unit test. Unit
testing depends on the language in which the project is developed. Unit tests ensure that each
unique path of the project performs accurately to the documented specifications and has
clearly defined inputs and expected results. Unit testing produces tests for the behavior of
components (nodes and vertices) of a product to ensure their correct behavior prior to system
integration.
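
As an illustration, a unit test for one small unit of a review checker might look like the following minimal sketch. The function count_negative_words and its shortened word list are hypothetical stand-ins, not the project's actual code; the test cases use Python's built-in unittest module.

```python
import unittest

# Hypothetical, shortened word list used only for this illustration.
NEGATIVE_WORDS = {"nasty", "worthless", "ugly", "stupid"}


def count_negative_words(review_text):
    """Count the distinct negative words in a review (case-insensitive)."""
    words = set(review_text.lower().split())
    return len(words & NEGATIVE_WORDS)


class CountNegativeWordsTest(unittest.TestCase):
    def test_review_without_negative_words(self):
        self.assertEqual(count_negative_words("Great phone and fast delivery"), 0)

    def test_review_with_negative_words(self):
        self.assertEqual(count_negative_words("Ugly design and a nasty smell"), 2)

    def test_repeated_word_is_counted_once(self):
        self.assertEqual(count_negative_words("ugly ugly ugly"), 1)
```

Saved in a file such as test_reviews.py (an assumed name), the cases can be run with python -m unittest test_reviews. Each case fixes one input and one expected result, which matches the "clearly defined inputs and expected results" requirement described above.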
INTEGRATION TESTING

In integration testing, modules are combined and tested as a group. Modules are typically
code modules, individual applications, source and destination applications on a network, etc.
Integration testing follows unit testing and precedes system testing. Beta testing takes place
after the product is code complete. Betas are often widely distributed, or even distributed to
the public at large, in the hope that recipients will buy the final product when it is released.

ACCEPTANCE TESTING

This testing is done to verify the readiness of the system for the implementation.
Acceptance testing begins when the system is complete. Its purpose is to provide the end user
with the confidence that the system is ready for use. It involves planning and execution of
functional tests, performance tests and stress tests in order to demonstrate that the implemented
system satisfies its requirements.

VALIDATION TESTING

Valid and invalid data should be created, and the program should be made to process this
data to catch errors. The user of each module enters the system through the login page using
a user ID and password. If the user gives the wrong user ID or password, a message such as
“you must enter user id and password” is shown to the user. Here the inputs given by the user
are validated: password validation, date format checks and textbox validation. Any changes
found necessary as a result of this testing are then made.
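
The checks described above can be sketched as small helpers that run before the credentials are looked up in the database. This is a minimal sketch only: the function names validate_login_input and is_valid_date, the message texts and the date format are assumptions made for illustration.

```python
from datetime import datetime


def validate_login_input(user_id, password):
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    if not user_id or not user_id.strip():
        errors.append("You must enter a user ID.")
    if not password:
        errors.append("You must enter a password.")
    return errors


def is_valid_date(text, date_format="%Y-%m-%d"):
    """Return True if text is a real calendar date in the expected format."""
    try:
        datetime.strptime(text, date_format)
        return True
    except ValueError:
        return False
```

Both valid and invalid inputs should be fed to such helpers during validation testing, for example an empty user ID, a correct pair of credentials, and an impossible date such as 2024-02-30.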
8.1 SOURCE CODE

from flask import Flask, render_template, flash, request, session, redirect, url_for
from werkzeug.utils import secure_filename
import hashlib
import mysql.connector
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

app = Flask(__name__)
# Placeholder secret key used to sign the session cookie
app.secret_key = 'any random string'
app.config['DEBUG'] = True

# Named check_review because data() is also the name of the /data route handler below
def check_review(n):
    """Return 1 if the review text n matches the negative word list, otherwise 0."""
    # Tokenize the review and drop English stop words
    stop_words = set(stopwords.words('english'))
    word_tokens = word_tokenize(n)
    filtered_sentence = [w for w in word_tokens if w not in stop_words]
    print(word_tokens)
    print(filtered_sentence)

    # Negative word list; the parts of an entry joined with '+' must all occur
    d = ["careless", "together", "criminal", "corrupt", "depressed", "Overcritical", "Aggressive",
         "Armchair", "critic", "Cynical", "Impulsive", "Tactless", "Thoughtless", "badmood",
         "hurtful", "lose",
         "lousy", "lumpy", "naive", "nasty", "naughty", "negate", "negative", "never", "nobody",
         "non",
         "descript", "noxious", "sad", "stupid", "stressful", "upset", "worthless", "zero", "ugly",
         "undermine", "unfair", "unfavorable", "unhappy", "unhealthy", "not+good"]

    # Count list words among the filtered tokens and among all tokens
    s1 = set(filtered_sentence)
    s2 = set(d)
    s111 = set(word_tokens)
    print(s1.intersection(s2))
    print(len(s1.intersection(s2)))
    cn = len(s111.intersection(s2))
    print(cn)

    def check_all(sentence, ws):
        # Substring check: every part of the entry must occur in the review text
        return all(w in sentence for w in ws)

    if any(check_all(n, word.split('+')) for word in d):
        print(n)
        sta = 1
    else:
        print('not found')
        sta = 0
    print(sta)
    return sta

@app.route("/")
def homepage():
    return render_template('index.html')

@app.route("/register")
def register():
    return render_template('register.html')

@app.route("/admin")
def admin():
    return render_template('login.html')

@app.route("/userhome")
def userhome():
    # List all products on the user's home page
    conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cur = conn.cursor()
    cur.execute("SELECT * FROM product")
    data = cur.fetchall()
    conn.close()
    return render_template('userhome.html', data=data)

@app.route("/adminhome")
def adminhome():
    # List all registered users on the admin home page
    conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cur = conn.cursor()
    cur.execute("SELECT * FROM register")
    data = cur.fetchall()
    conn.close()
    return render_template('adminhome.html', data=data)

@app.route("/user")
def user():
    return render_template('userlogin.html')

@app.route("/addproduct")
def addproduct():
    return render_template('addproduct.html')

@app.route("/orderdetails")
def orderdetails():
    # Show every order to the administrator
    conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM buy")
    data = cursor.fetchall()
    conn.close()
    return render_template('orderdetails.html', data=data)

@app.route("/viewproduct")
def viewproduct():
    conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cur = conn.cursor()
    cur.execute("SELECT * FROM product")
    data = cur.fetchall()
    conn.close()
    return render_template('viewproduct.html', data=data)

@app.route("/view")
def view():
    conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cur = conn.cursor()
    cur.execute("SELECT * FROM product")
    data = cur.fetchall()
    conn.close()
    return render_template('view.html', data=data)

@app.route("/rview")
def rview():
    # Show the ratings stored for one product
    pid = request.args.get('id')
    conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cur = conn.cursor()
    # Bind the product id as a parameter instead of concatenating it into the SQL
    cur.execute("SELECT * FROM ratingdb1 WHERE pid=%s", (pid,))
    data = cur.fetchall()
    conn.close()
    return render_template('rview.html', data=data)

@app.route("/adminlog", methods=['GET', 'POST'])
def adminlog():
    error = None
    if request.method == 'POST':
        # Administrator credentials are hard-coded in this listing
        if request.form['uname'] == 'admin' and request.form['password'] == 'admin':
            conn = mysql.connector.connect(user='root', password='', host='localhost',
                                           database='shop')
            cur = conn.cursor()
            cur.execute("SELECT * FROM register")
            data = cur.fetchall()
            conn.close()
            return render_template('adminhome.html', data=data)
        error = 'Invalid Credentials. Please try again.'
    return render_template('index.html', error=error)

@app.route("/userlog", methods=['GET', 'POST'])
def userlog():
    if request.method == 'POST':
        username = request.form['uname']
        password = request.form['password']
        conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
        cursor = conn.cursor()
        # Bind the form values as parameters rather than concatenating them into the SQL
        cursor.execute("SELECT * FROM register WHERE uname=%s AND password=%s",
                       (username, password))
        data = cursor.fetchone()
        conn.close()
        if data is None:
            return 'Username or Password is wrong'
        # Remember the user only after the credentials have been verified
        session['cuname'] = username
        return render_template('userhome.html')
    return render_template('userlogin.html')

@app.route("/reg", methods=['GET', 'POST'])
def reg():
    if request.method == 'POST':
        n = request.form['name']
        g = request.form['gender']
        address = request.form['address']
        pnumber = request.form['pnumber']
        email = request.form['email']
        uname = request.form['uname']
        password = request.form['password']
        conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
        cursor = conn.cursor()
        # Form values are bound as parameters; the first column is the row id
        cursor.execute(
            "INSERT INTO register VALUES ('', %s, %s, %s, %s, %s, %s, %s)",
            (n, g, address, pnumber, email, uname, password))
        conn.commit()
        conn.close()
    return render_template('userlogin.html')

@app.route("/adduser2", methods=['GET', 'POST'])
def adduser2():
    if request.method == 'POST':
        n = request.form['cat']
        f = request.files['file']
        # Store the sanitized file name so the database matches the saved file
        filename = secure_filename(f.filename)
        f.save("static/upload/" + filename)
        g = request.form['scat']
        pname = request.form['pname']
        price = request.form['price']
        stock = request.form['stock']
        conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
        cursor = conn.cursor()
        cursor.execute(
            "INSERT INTO product VALUES ('', %s, %s, %s, %s, %s, %s)",
            (n, g, pname, price, stock, filename))
        conn.commit()
        conn.close()
    return render_template('adminhome.html')

@app.route("/upload", methods=['GET', 'POST'])
def upload():
    if request.method == 'POST':
        n = request.form['cat']
        f = request.files['file']
        # Store the sanitized file name so the database matches the saved file
        filename = secure_filename(f.filename)
        f.save("static/upload/" + filename)
        g = request.form['scat']
        pname = request.form['pname']
        price = request.form['price']
        stock = request.form['stock']
        conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
        cursor = conn.cursor()
        cursor.execute(
            "INSERT INTO product VALUES ('', %s, %s, %s, %s, %s, %s)",
            (n, g, pname, price, stock, filename))
        conn.commit()
        conn.close()
    return render_template('adminhome.html')

@app.route("/buy", methods=['GET'])
def buy():
    cat = request.args.get('cat')
    pid = request.args.get('id')
    pname = request.args.get('pname')
    price = request.args.get('price')
    uname = session['cuname']
    # The order row is inserted later by /buy1; here only the product is looked up
    conn1 = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cur1 = conn1.cursor()
    cur1.execute("SELECT * FROM product WHERE id=%s", (pid,))
    data = cur1.fetchall()
    conn1.close()
    return render_template('order.html', data=data)

@app.route('/data', methods=['GET'])
def data():
    # Read the value of the user query parameter (i.e. ?user=some-value)
    user = request.args.get('user')
    uname = session['cuname']
    conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cursor = conn.cursor()
    cursor.execute(
        "INSERT INTO buy VALUES (%s, '', %s, '', '', '', '', '')",
        (user, uname))
    conn.commit()
    conn.close()
    return render_template('order.html')

@app.route("/buy1", methods=['GET', 'POST'])
def buy1():
    p = 0  # total price; stays 0 when the page is opened without a submitted form
    if request.method == 'POST':
        cat = request.form['cat']
        pid = request.form['pid']
        pname = request.form['pname']
        price = request.form['price']
        qty = request.form['qty']
        uname = session['cuname']
        conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
        cursor = conn.cursor()
        cursor.execute(
            "INSERT INTO buy VALUES ('', %s, %s, %s, %s, %s, %s, '', '')",
            (pid, cat, pname, uname, price, qty))
        conn.commit()
        conn.close()
        p = int(qty) * int(price)
    return render_template('pay.html', data=p)

@app.route("/pay")
def pay():
    # Mark the logged-in user's orders as paid
    uname = session['cuname']
    conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cursor = conn.cursor()
    cursor.execute("UPDATE buy SET status='1' WHERE uname=%s", (uname,))
    conn.commit()
    conn.close()
    return 'Payment successful'

@app.route("/oview")
def oview():
    # Show the logged-in user's own orders
    uname = session['cuname']
    conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM buy WHERE uname=%s", (uname,))
    data = cursor.fetchall()
    conn.close()
    return render_template('oview.html', data=data)

@app.route("/rate")
def rate():
    # Load the order that the user wants to rate
    oid = request.args.get('id')
    uname = session['cuname']
    conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM buy WHERE id=%s", (oid,))
    data = cursor.fetchone()
    conn.close()
    return render_template('rating.html', data=data)

@app.route("/commt", methods=['GET', 'POST'])
def commt():
    if request.method == 'POST':
        n = request.form['commt']
        oid = request.form['id']
        pid = request.form['pid']
        star = request.form['star']
        ar = request.form['ar']
        uname = session['cuname']
        sta = 0
        conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
        cursor = conn.cursor()
        # Allow only one active rating per user and product
        cursor.execute(
            "SELECT * FROM ratingdb1 WHERE pid=%s AND uname=%s AND status='1'",
            (pid, uname))
        data = cursor.fetchone()
        if data is not None:
            conn.close()
            return "Rating Already Given"
        cursor.execute(
            "INSERT INTO ratingdb1 (id, pid, uname, rating, srate, commt, st, status) "
            "VALUES ('', %s, %s, %s, %s, %s, %s, '1')",
            (pid, uname, ar, star, n, str(sta)))
        conn.commit()
        conn.close()
    return render_template('oview.html')

@app.route("/cview")
def cview():
    # Show the ratings written by the logged-in user
    uname = session['cuname']
    conn = mysql.connector.connect(user='root', password='', host='localhost', database='shop')
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM ratingdb1 WHERE uname=%s", (uname,))
    data = cursor.fetchall()
    conn.close()
    return render_template('cview.html', data=data)

if __name__ == '__main__':
    app.run(debug=True, use_reloader=True)
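
Several routes in the listing pass form values into SQL statements. Binding such values as query parameters, rather than joining them into the SQL string, keeps crafted input from changing the statement. The sketch below demonstrates the difference with Python's built-in sqlite3 module as a stand-in, so that it runs without a MySQL server; the table layout and sample user are assumptions made for illustration. Note that sqlite3 uses ? placeholders, whereas mysql.connector uses %s.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database holding a single hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE register (uname TEXT, password TEXT)")
    conn.execute("INSERT INTO register VALUES (?, ?)", ("alice", "secret"))
    conn.commit()
    return conn


def login_unsafe(conn, uname, password):
    """String concatenation: the input becomes part of the SQL text."""
    query = ("SELECT * FROM register WHERE uname='" + uname +
             "' AND password='" + password + "'")
    return conn.execute(query).fetchone() is not None


def login_safe(conn, uname, password):
    """Bound parameters: the input is always treated as data."""
    query = "SELECT * FROM register WHERE uname=? AND password=?"
    return conn.execute(query, (uname, password)).fetchone() is not None


if __name__ == "__main__":
    demo = make_demo_db()
    crafted = "' OR '1'='1"
    print(login_unsafe(demo, "alice", crafted))  # True: the crafted input logs in
    print(login_safe(demo, "alice", crafted))    # False: the crafted input is rejected
```

With the concatenated query, the crafted password turns the WHERE clause into a condition that is true for every row, so the login succeeds without the real password. With bound parameters the same text is compared literally and the login fails.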

8.2 SCREENSHOT
9. Conclusion & Future Enhancement

This project proposed a product recommendation system based on a hybrid recommendation
algorithm. The main advantages of the method are a visual organization of the data based on
its underlying structure and a significant reduction in the size of the search space per
result output. Users can search for products anywhere and at any time. Ratings, reviews
and emoticons are analyzed and categorized as positive or negative sentiments, and products
can be searched using price-based and review-based filtering. A Medium Access Control
(MAC) based filtering approach can be used to avoid fake reviews. The current results are
notably better than a random approach. However, we feel that a better dataset and a number of
improvements to the method may achieve better results. Hybrid recommendation is one of the
main modules of the system and helps overcome the drawbacks of traditional collaborative
and content-based recommendation. Promising results have been obtained using the
current model.
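
As a rough illustration of address-based filtering, the sketch below counts how many reviews each source address has submitted for the same product and flags the pairs that exceed a threshold. The record layout, the threshold value and the function name flag_repeat_reviewers are assumptions made for illustration; the address field could hold a MAC or an IP address, depending on what the deployment can observe.

```python
from collections import Counter


def flag_repeat_reviewers(reviews, max_reviews_per_product=1):
    """Return the set of (address, product_id) pairs with too many reviews.

    reviews is an iterable of (address, product_id) tuples, one per review.
    """
    counts = Counter(reviews)
    return {pair for pair, n in counts.items() if n > max_reviews_per_product}


if __name__ == "__main__":
    sample = [
        ("10.0.0.1", "p1"),
        ("10.0.0.1", "p1"),
        ("10.0.0.2", "p1"),
        ("10.0.0.1", "p2"),
    ]
    print(flag_repeat_reviewers(sample))
```

Flagged pairs would be passed to the administrator for review rather than deleted automatically, since several genuine customers can share one address.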

9.1 FUTURE WORK


This work can be extended in a number of directions in the future. It is possible to
modify the algorithm to use an approach that lies between collaborative filtering and
content-based filtering. There are multiple ways to do this. One way would be to include
user-specific data, such as information about products liked by a user’s friends and
information from reviews written by the user and their similarity to other interests, as
inputs to the hybrid system.

