
1 Group Id : 2A

2 Project Title : “Sentimental analysis on E-Governance”

3 Project Option : Internal Project

4 Internal Guide : Prof. R.A.Deshmukh

5 Sponsorship and External Guide : None

6 Technical Keywords (As per ACM Keywords)


Sentiment analysis, Natural language processing, reviews, E-governance, Extract

7 Problem Statement

The project is a portal for government schemes: an online application through which citizens can
learn about the schemes provided by the government and apply for them on the same portal by
submitting the required documents. Citizens can register on the portal and receive confirmations
and notifications. They can also comment on policies and schemes.
8 Abstract:

Sentiment analysis, or opinion mining, is the computational study of people’s opinions,
sentiments, attitudes, and emotions expressed in written language. It has been one of the most
active research areas in natural language processing and text mining in recent years. A main
reason for its popularity is its wide range of applications: opinions are central to almost all
human activities and are key influencers of our behavior. Whenever we need to make a decision,
we want to hear others’ opinions. Hence sentiment analysis is necessary.

In this project the GUI is built as a product tool with two user roles: admin and user. The
admin is a member of the government team and can add schemes and the policies belonging to each
scheme. A normal user who logs in can view all schemes and their policies, and can comment on
the policies added by the admin. The comments are written to a text file, and the text file is
stored in Hadoop. When a new user logs in for the first time, the most positively commented
policy or scheme is shown first as a recommendation. The comment text is classified as
“POSITIVE”, “NEGATIVE” or “NEUTRAL”. Experiments on review-level categorization will be
performed and are expected to give promising outcomes.

9. Goals and Objectives:-

The main objective of the proposed system is to provide a government portal on which schemes
and policies can be added and on which users can comment on those policies.
The proposed system will cover schemes such as:
1. Rojgar Yojna.
2. Mamta Yojna.
3. KrushiVikas Yojna.

The proposed system uses Hadoop to store large data files, to which comments about schemes
and policies are added. For sentiment analysis, the Stanford CoreNLP (Natural Language
Processing) jar will be used.
The proposed system has two user roles:
1. Admin
2. Users
The admin can add policies and schemes to the government portal and can modify them. A user
can view those policies and schemes and comment on them. A user who logs in for the first time
is shown the most positively commented policies and schemes.
All schemes that the government adds to the portal will be stored in Hadoop for big data
analytics, and Map-Reduce will then be applied to that data.
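
The Map-Reduce step over the stored comments could look roughly like the sketch below. It is plain Python that imitates the map and reduce phases, not Hadoop code. The record layout (scheme name, sentiment label), the function names and the sample records are assumptions made for illustration only.

```python
from collections import defaultdict


def map_phase(records):
    """Emit ((scheme, label), 1) for every classified comment record."""
    for scheme, label in records:
        yield (scheme, label), 1


def reduce_phase(pairs):
    """Sum the counts emitted for each (scheme, label) key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def most_positive_scheme(totals):
    """Return the scheme with the most POSITIVE comments, or None if there are none."""
    positives = {scheme: n for (scheme, label), n in totals.items() if label == "POSITIVE"}
    if not positives:
        return None
    # Sorting first makes ties resolve alphabetically.
    return max(sorted(positives), key=positives.get)


# Hypothetical sample records (scheme, sentiment label).
records = [
    ("Rojgar Yojna", "POSITIVE"),
    ("Rojgar Yojna", "NEGATIVE"),
    ("Mamta Yojna", "POSITIVE"),
    ("Mamta Yojna", "POSITIVE"),
    ("KrushiVikas Yojna", "NEUTRAL"),
]
totals = reduce_phase(map_phase(records))
recommended = most_positive_scheme(totals)
```

With these sample records, `recommended` is the scheme a first-time user would be shown first.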

10. Relevant mathematics associated with the Project

System Specification:

S = {s, X, Y, T, fmain, DD, NDD, ffriend, memory shared, CPUcount}

• S (system): the proposed system, described by the elements listed below.
• s (initial state at time T): the GUI of Sentiment analysis on Government websites for
Scheme rating. The GUI provides a field in which the user enters a query/input.
• X (input to the system): the input query. The user first enters a query, which may or may
not be ambiguous and which represents what the user wants to search for.
• Y (output of the system): a list of URLs with snippets. When the user enters a query, the
system generates a result that contains relevant and irrelevant URLs and their snippets.
• T (number of steps to be performed): 6. This is the total number of steps required to
process a query and generate results.
• fmain (main algorithm): contains process P, which comprises the input, the output and the
subordinate functions. It shows how the query is processed by the different modules and how
the results are generated.
• DD (deterministic data): the database data. Here we consider the E-Government database,
which contains a number of ambiguous queries. These queries are used for showing results.
Hence, the E-Government database is our DD.
• NDD (non-deterministic data): the number of input queries. A user can enter any number of
queries, so we cannot predict how many queries are entered in a single session. Hence, the
number of input queries is our NDD.
• ffriend: WC and IE. WC and IE are the friend functions of the main function. Since both
functions are used, both are included in ffriend. WC is the Web Crawler, which is a bot, and
IE is Information Extraction, which is used for extracting information in the browser.
• Memory shared: the database. It stores information such as the list of receivers,
registration details and the number of receivers. Since it is the only memory shared in our
system, we have included it as the shared memory.
• CPUcount: 2. The system requires 1 CPU for the server and at least 1 CPU for the client.
Hence, CPUcount is 2.

Subordinate functions:
• Identify the processes as P.
S = {I, O, P, ...}
P = {RC, IE}

Where,
• RC is the Recommendation System.
• IE is Information Extraction.
• P is the set of processes.

• WC = {U, MAX, CP}

Where,
• U = Schemes and Policies.
• MAX = {1, 2, 3, … , n}
• CP is the output of the Recommendation System, which is generated by the application.

• IE = {CP, NLP Techniques, Info}

Where,
• CP is the output of the Recommendation System, given as input to IE.
• NLP is used for transforming all letters to lowercase, stemming and removing stop
words.
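
The three NLP preprocessing operations named above (lowercasing, stemming and stop-word removal) can be sketched as follows. This is a minimal illustration and not the Stanford CoreNLP pipeline the project plans to use. The stop-word list, the suffix-stripping rule that stands in for a real stemmer, and the sample comment are all assumptions.

```python
import re

# Tiny illustrative stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "of", "to",
              "and", "for", "this", "that", "it", "in", "on"}

# Naive suffix stripping, used here only as a stand-in for a real stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip the first matching suffix if at least three letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(comment):
    """Lowercase the comment, drop stop words and stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", comment.lower())
    return [stem(token) for token in tokens if token not in STOP_WORDS]


tokens = preprocess("The scheme is helping farmers and provided loans")
```

The crude stemmer produces truncated forms such as "provid", which is typical of suffix stripping and acceptable when the tokens are only used for matching.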
Algorithm:
Step 1: Log in with one of the two user roles.
    Step 1.1: Log in as Admin.
    Step 1.2: Log in as User.
Step 2: Add Schemes and Government Policies when logged in as Admin.
Step 3: Call the RC function.
    Step 3.1: Get U (Schemes and Policies) as input to WC.
    Step 3.2: For i = 0 to MAX
              // MAX = maximum number of policies
    Step 3.3: Visit i (E-Government site) when logged in as a normal user.
    Step 3.4: Go to Step 3.2 until MAX is reached.
    Step 3.5: Output CP: the user is able to view all policies and schemes.
Step 4: Call the IE function.
    Step 4.1: Get CP (Comment) as input.
    Step 4.2: Call the NLP function.
    Step 4.3: Process with NLP by removing stop words.
    Step 4.4: Get the relevant information on positive, negative and neutral
              comments.
Step 5: Display the result in graphical form.
Step 6: Stop.
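
Steps 4 and 5 of the algorithm can be sketched as below. The proposed system relies on the Stanford CoreNLP jar for classification. This sketch swaps in a simple word-list scorer so that it stays self-contained. The word lists, function names, sample comments and the text bar chart are assumptions for illustration only.

```python
import re
from collections import Counter

# Hypothetical word lists; the proposed system would use Stanford CoreNLP instead.
POSITIVE_WORDS = {"good", "great", "helpful", "useful", "excellent"}
NEGATIVE_WORDS = {"bad", "poor", "useless", "slow", "corrupt"}

LABELS = ("POSITIVE", "NEGATIVE", "NEUTRAL")


def classify(comment):
    """Label a comment by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z]+", comment.lower())
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "POSITIVE"
    if score < 0:
        return "NEGATIVE"
    return "NEUTRAL"


def summarize(comments):
    """Count how many comments fall into each sentiment class."""
    return Counter(classify(comment) for comment in comments)


def text_chart(counts):
    """Render the class counts as a simple text bar chart (stand-in for Step 5)."""
    return [f"{label:<8} {'#' * counts[label]}" for label in LABELS]


# Hypothetical sample comments.
comments = [
    "This scheme is very helpful and useful",
    "The process is slow and the office is corrupt",
    "I applied last week",
    "Good policy",
]
counts = summarize(comments)
for line in text_chart(counts):
    print(line)
```

A word-list scorer ignores negation and context, which is one reason the project plans to use a trained sentiment model.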
State Transition Diagram:

Where,
s = input state
x = query
q1 = accept ambiguous query
q2 = user log
q3 = feedback session
q4 = pseudo-documents
q5 = restructuring result
q6 = optimization
e = end state
Fig: State Transition Diagram

Explanation

1. State ‘q1’ accepts the ambiguous query ‘x’ from state ‘s’, which is the initial state.
2. State ‘q2’ is the user log, which stores the query x accepted in state q1. The stored
query is used to find the related URLs in the database, and these records are sent to
state q3.
3. In state ‘q3’, the records from the database and from the click-through log are retrieved
and the feedback sessions are generated.
4. State ‘q4’ generates pseudo-documents of URLs from the feedback sessions, using the URLs
clicked by the user.
5. A clustering algorithm is applied to form groups of relevant keywords from the
pseudo-documents, so that each cluster represents one user search objective.
6. The next state, ‘q6’, organizes the words of one cluster into a topic-subtopic hierarchy
using optimization, which displays the result in restructured form.
7. Also in state ‘q6’, CAP evaluation is used to measure the performance of the restructured
result.

11. Names of Conferences / Journals where papers can be published

1. International Journal of Innovative Research in Computer and Communication Engineering

2. International Journal of Advanced Research in Electrical, Electronics and Instrumentation

3. International Journal of Innovative Research in Science, Engineering and Technology

12. Review of Conference/Journal Papers supporting Project idea

1. Sentiment Crawling: Extremist Content collection through a sentiment analysis guided
web-crawler
2. Concept-Level Sentiment Analysis: A World Wide Web Conference 2014 Tutorial
3. Sentiment Extraction by Leveraging Aspect-Opinion Association Structure
4. An Evaluation of Machine Translation for Multilingual Sentence-level Sentiment
Analysis
Literature Survey

Sentiment analysis is a machine learning problem that has been a research interest in recent
years. This literature survey reviews the relevant work done to solve it. Although several
notable works have appeared in this field, a fully automated and highly efficient system has not
yet been introduced. This is because of the unstructured nature of natural language, and its
very large vocabulary makes the problem even harder. Several challenges still exist in the field
of machine learning, among them named entity recognition, co-reference resolution and domain
dependency.

Concept-level sentiment analysis focuses on a semantic analysis of text through the use of web
ontologies or semantic networks, which allow the aggregation of conceptual and affective
information associated with natural language opinions. By relying on external knowledge, such
approaches step away from blind use of keywords and word co-occurrence count, but rather rely
on the implicit features associated with natural language concepts. Unlike purely syntactical
techniques, concept-based approaches can also detect sentiments that are expressed in a
subtle manner, e.g., through the analysis of concepts that do not explicitly convey any emotion,
but which are implicitly linked to other concepts that do so. The bag-of-concepts model can
represent semantics associated with natural language much better than bags-of-words. In the bag-
of-words model, in fact, a concept such as cloud computing would be split into two separate
words, disrupting the semantics of the input sentence (in which, for example, the word cloud
could wrongly activate concepts related to weather). The analysis at concept level allows for the
inference of semantic and affective information associated with natural language text and, hence,
enables comparative fine-grained feature-based sentiment analysis.
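
The difference between the two representations can be seen in a small sketch. Real concept-level systems draw on web ontologies or semantic networks. Here a hand-written set of two-word concepts stands in for that external knowledge, and the set, the function names and the sample sentence are assumptions for illustration.

```python
# Hypothetical set of known two-word concepts, standing in for a semantic network.
CONCEPTS = {("cloud", "computing"), ("machine", "learning")}


def bag_of_words(text):
    """Split the text into individual lowercase words."""
    return text.lower().split()


def bag_of_concepts(text):
    """Keep known two-word concepts together instead of splitting them."""
    words = text.lower().split()
    result = []
    i = 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in CONCEPTS:
            result.append(" ".join(pair))
            i += 2
        else:
            result.append(words[i])
            i += 1
    return result


sentence = "Cloud computing reduces cost"
words = bag_of_words(sentence)        # "cloud" and "computing" are separate tokens
concepts = bag_of_concepts(sentence)  # "cloud computing" stays one token
```

In the bag-of-words output the token "cloud" appears on its own and could be matched against weather-related entries, while the bag-of-concepts output keeps "cloud computing" intact.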

Sentiment extraction is an important task for aspect-level sentiment analysis. Previous works can
be divided into two categories: sentence level extraction and corpus level extraction. For
sentence-level extraction, previous methods mainly aimed to identify all opinion target/word
mentions in sentences by supervised learning. They regarded it as a sequence labeling task,
where several classical models were used. Jin and Ho applied a lexicalized HMM model to learn
patterns to extract aspects and opinion expressions. Li et al. integrated two CRF variations, i.e.,
Skip-CRF and Tree-CRF, to extract aspects and also opinions.
13. Plan of Project Execution
