
Twitter Sentiment Analysis

Project Report

By

Jaanav Mathavan
S Vishal 
Bevin Sukil Subash

Class: 12
Section: F1

(Affiliated to Central Board of Secondary Education, New Delhi)
(Chettinad House, R.A.Puram, Chennai – 600 028)

COMPUTER SCIENCE

Certified to be the Bonafide Record of work done by

________________________________________ of Std XII Sec ____

in the Computer Science Lab of the CHETTINAD VIDYASHRAM,

CHENNAI, during the year 2021 – 2022.

Date: Teacher-in-charge

REGISTER NO. ____________________

Submitted for All India Senior Secondary Practical Examination in

Computer Science held on ______________________________at

Chettinad Vidyashram, Chennai – 600 028.

Principal Internal Examiner External Examiner

ACKNOWLEDGEMENT

I would like to express my sincere thanks to Meena Aunty and our Principal,
Mrs. S. Amudhalakshmi, for their encouragement and support in taking up this
project. I am grateful to my computer science teacher, Uma Mageshwari R, and
to the computer science department for their constant guidance and support in
completing the project.

INDEX
S.No.  Topic
1      OVERVIEW OF PYTHON
2      PROJECT DESCRIPTION
3      FUNCTIONS USED
4      FILES USED
5      SOURCE CODE
6      SAMPLE OUTPUTS
7      CONCLUSION
8      BIBLIOGRAPHY

OVERVIEW OF PYTHON

Python is a high-level, interpreted, interactive and object-oriented scripting language. Python
is designed to be highly readable. It uses English keywords frequently, whereas other
languages use punctuation, and it has fewer syntactical constructions than other languages.

• Python is Interpreted − Python is processed at runtime by the interpreter. You do not
  need to compile your program before executing it. This is similar to PERL and PHP.
• Python is Interactive − You can actually sit at a Python prompt and interact with the
  interpreter directly to write your programs.
• Python is Object-Oriented − Python supports the object-oriented style or technique of
  programming that encapsulates code within objects.
• Python is a Beginner's Language − Python is a great language for beginner-level
  programmers and supports the development of a wide range of applications, from
  simple text processing to WWW browsers to games.
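For example, a short session at the interactive Python prompt shows the interpreted
and interactive nature of the language; the statements run immediately, with no
separate compilation step:

>>> greeting = "Hello, Twitter!"
>>> print(greeting.upper())
HELLO, TWITTER!
>>> len(greeting)
15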

Introduction
What is sentiment analysis?
Sentiment Analysis is the process of ‘computationally’ determining whether a
piece of writing is positive, negative or neutral. It is also known as opinion
mining, as it derives the opinion or attitude of a speaker.
Sentiment analysis (also known as opinion mining or emotion AI) refers to the
use of natural language processing, text analysis, computational linguistics, and
biometrics to systematically identify, extract, quantify, and study affective states
and subjective information. Sentiment analysis is widely applied to voice of the
customer materials such as reviews and survey responses, online and social
media, and healthcare materials for applications that range from marketing to
customer service to clinical medicine.
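The classification rule used later in this project can be illustrated with a
minimal sketch based on TextBlob, the library used in the source code: the
polarity score of a piece of text (a value between -1.0 and +1.0) is mapped to
positive, neutral or negative. The example sentences below are only illustrative.

# Minimal sketch: polarity-based sentiment classification with TextBlob,
# using the same thresholds as get_tweet_sentiment() in the source code.
from textblob import TextBlob

def classify(text):
    polarity = TextBlob(text).sentiment.polarity   # ranges from -1.0 to +1.0
    if polarity > 0:
        return 'positive'
    elif polarity == 0:
        return 'neutral'
    else:
        return 'negative'

print(classify("I love this phone, the camera is brilliant"))   # expected: positive
print(classify("Worst service I have ever experienced"))        # expected: negative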
Why sentiment analysis?
• Business: In marketing, companies use it to develop their strategies, to
  understand customers’ feelings towards products or brands, how people
  respond to their campaigns or product launches, and why consumers do not
  buy some products.
• Politics: In the political field, it is used to keep track of political views and to
  detect consistency and inconsistency between statements and actions at
  the government level. It can be used to predict election results as well!
• Public Actions: Sentiment analysis is also used to monitor and analyse
  social phenomena, to spot potentially dangerous situations and to
  determine the general mood of the blogosphere.


Overview of the Proposed System

Introduction
As the term suggests, microblogging is the blogging of small statements such as
“I am having lunch” and is considered a passive form of blogging.
Microblogging services provide a simple, easy form of communication that
enables users to broadcast and share information about their day-to-day
activities, opinions, news stories, current status, and other interests. 
Commercial or purposive microblogs also exist and are used to promote
websites, services, products, or individuals by using microblogging on popular
platforms such as Twitter, Facebook, etc., as marketing and public relations
services. 
Since its launch in October 2006, Twitter has become a ubiquitous real-time
information network powered by people all around the world that lets users
share and discover what is happening now. Twitter is a social medium for
people to communicate and stay connected through the exchange of quick,
frequent messages. People write short updates, often called “tweets,” originally
limited to 140 characters, about various topics such as their day-to-day activities. They
share information, news, and opinions with followers, and seek knowledge and
expertise through public tweets. 
Twitter employs a social-networking model called “following,” in which
Twitter users can follow any other user without permission, i.e., the relationship
of following requires no reciprocation. To follow someone on Twitter means to
subscribe to their tweets or updates on the site almost in real time. A
“follower” is another Twitter user who has followed you. A “reply” is a tweet
posted in reply to another user’s message; it begins with “@username,” where
the “@” sign is used to call out usernames in tweets.
“RT,” which stands for “retweet,” is the act of forwarding another user’s tweet
to all of your followers. Users can respond to another person’s tweet, which is
called “mention.” A “mention” is any Twitter update that contains
“@username” in the tweet content. It is important for popular users such as
celebrities, politicians, or corporations to understand their audiences, and to
measure their influence toward audiences on Twitter. The goal of this study is to

develop a measure of positive-negative influence for popular users on Twitter
and reveal how this measure of influence is related to real-world phenomena.
We will collect the tweets of certain popular users, together with the tweets of
their audiences, and carry out an empirical analysis of user sentiment on
Twitter based on an analysis of negative and positive words. We will develop a
measure of the positive-negative influence between popular users and their
audience and then investigate whether this positive-negative influence changes
over time. The primary contribution of this work is that this measure of
influence on Twitter can be used as an indicator of real-world audience
sentiment, providing new insights into influence and a better understanding of
popular users. 

Methodology Adopted
The proposed methodology for our project can be summarised by the following
four steps (a condensed sketch of the classification and plotting steps is
given after the list):
• First, we authorize the Twitter API client with the help of the Twitter API
  credentials that have been provided to us.
• We then make a GET request to the Twitter API for a particular query with
  the help of the tweepy library, which is explained later in detail.
• We then parse the tweets and classify each tweet as positive, negative or
  neutral.
• We then create a pie chart with the help of the matplotlib library showing
  the percentage of tweets that are positive, negative and neutral.
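A condensed sketch of the classification and plotting steps (steps 3 and 4) is
given below. It assumes the tweet texts have already been fetched from the API;
the list fetched_texts is a hypothetical placeholder for that result, and the
full fetching code appears in the source code section.

# Sketch of steps 3 and 4: classify already-fetched tweet texts and plot a pie chart.
import matplotlib.pyplot as plt
from textblob import TextBlob

# hypothetical placeholder for texts returned by the Twitter API
fetched_texts = [
    "Loving the new update, great work!",
    "This release broke everything, very disappointed.",
    "Server maintenance is scheduled for tonight.",
]

counts = {'Positive': 0, 'Negative': 0, 'Neutral': 0}
for text in fetched_texts:
    polarity = TextBlob(text).sentiment.polarity
    if polarity > 0:
        counts['Positive'] += 1
    elif polarity < 0:
        counts['Negative'] += 1
    else:
        counts['Neutral'] += 1

# pie chart showing the share of positive, negative and neutral tweets
plt.pie(list(counts.values()), labels=list(counts.keys()), autopct='%1.1f%%')
plt.show()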

Architecture or Modules of the Proposed System

Fig: System Architecture (diagram included in the original report)

Authentication:

In order to fetch tweets through the Twitter API, one needs to register an app
through their Twitter account. Follow these steps (a short sketch of how the
copied credentials are then supplied to tweepy appears after the list):

• Make a Twitter Developer account.
• Choose ‘Create New App’.
• Fill in the application details. You can leave the callback URL field empty.
• Open the ‘Keys and Access Tokens’ tab.
• Copy the ‘Consumer Key’, ‘Consumer Secret’, ‘Access Token’ and ‘Access
  Token Secret’.
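Once these four values have been copied, they can be supplied to tweepy roughly
as sketched below. The key strings are placeholders for your own credentials,
and the tweepy 3.x interface used elsewhere in this report is assumed.

# Sketch of authorizing the Twitter API client (placeholder credentials).
import tweepy
from tweepy import OAuthHandler

consumer_key = 'YOUR_CONSUMER_KEY'
consumer_secret = 'YOUR_CONSUMER_SECRET'
access_token = 'YOUR_ACCESS_TOKEN'
access_token_secret = 'YOUR_ACCESS_TOKEN_SECRET'

auth = OAuthHandler(consumer_key, consumer_secret)        # OAuth handler
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)                                    # object used for all API calls

# With tweepy 3.x, tweets for a query can then be fetched with api.search(),
# as done in get_tweets() in the source code section.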

Modules imported 

tweepy                 to access the Twitter API

textblob.TextBlob      to process textual data

tweepy.OAuthHandler    to set the credentials to be used in all API calls

tkinter                to create the interface

tkinter.ttk            to create the frames on which the output is shown

tkinter.messagebox     to display the error message

Installation:

• TextBlob: textblob is the Python library for processing textual data.
  Install it using the following pip command:

      pip install textblob

• We also need to install some NLTK corpora using the following command:

      python -m textblob.download_corpora

Functions used in Interface

error()                            to display an error message if no tweets
                                   exist for the entered person/item

list_of_queries()                  to open the ‘List_of_queries.txt’ file and
                                   copy its contents

checking()                         to check whether a file exists in the name
                                   of the entered person/item and, if it
                                   exists, derive the data from it

create_and_append()                to create a text file in the name of the
                                   person/item searched and add that name to
                                   the ‘List_of_queries.txt’ file

function_for_output()              to create the output window

angle()                            to find the angle of a particular region in
                                   the pie chart

update_to_box()                    to add the contents of the drop-down menu
                                   to the entry box

clear_content()                    to delete the contents of the entry box

dd()                               to refresh the drop-down menu after each
                                   execution

python_code()                      to authorize the Twitter API client, fetch
                                   and classify the tweets for a newly entered
                                   query and save the results

__init__(self)                     constructor of the TwitterClient class;
                                   authenticates with the Twitter API

clean_tweet(self, tweet)           to clean the tweet text by removing links,
                                   mentions and special characters

get_tweet_sentiment(self, tweet)   to classify a tweet as positive, negative
                                   or neutral using TextBlob

get_tweets(self, query, count=10)  to fetch tweets for a query and parse them

main()                             to create the TwitterClient object, compute
                                   the tweet statistics and prepare the pie
                                   chart

Files used

List_of_queries.txt       to store the names of the persons/items searched

<person/item name>.txt    to store the data pertaining to that person/item
SOURCE CODE

import re
import sys
import tweepy
import matplotlib.pyplot as plt
from tweepy import OAuthHandler
from textblob import TextBlob
 
from tkinter import *
from tkinter import ttk
from tkinter import messagebox
entered_value=[];new=[];lst=[];show_queries=[]
positive_tweets=[];negative_tweets=[];neutral_tweets=[];code_analysis={}
g=''
def python_code():
   global new
   global entered_value
   global qu
   new=[]
   queryna=entry.get().upper().split()
   queryname=''
   for i in queryna:
       queryname+=i+' '
   queryname=queryname.rstrip(' ')   
  
   new.append(queryname)
   i=len(new)
   if new[i-1] in entered_value:
       pass
 

   else:
       entered_value.append(entry.get())
       class TwitterClient(object):
           def __init__(self):
               # keys and tokens from the Twitter Dev Console
               consumer_key = 'xM7IgmY4nbq7tgIw9ENVXyBEw'
               consumer_secret='qkcW5AgsgLNOKjIhn4Hd0LTt1ktq7ox52pypYEc5lJUNy2fJXQ'
               access_token = '1225054341913923584-WlGX3batUtcb9KXTPGKQ6z7bf2Nqw5'
               access_token_secret ='qndhd5zaI3WAwIXBLe5gyRiV5X1u6fwJ8O6KC39X2PRgu'
               # attempt authentication
               try:
                   # create OAuthHandler object
                   self.auth = OAuthHandler(consumer_key, consumer_secret)
                   # set access token and secret
                   self.auth.set_access_token(access_token, access_token_secret)
                   # create tweepy API object to fetch tweets
                   self.api = tweepy.API(self.auth)
               except:
                   print("Error: Authentication Failed")
           def clean_tweet(self, tweet):
               '''
               Utility function to clean tweet text by removing links, special characters
               using simple regex statements.
               '''
               return ' '.join(re.sub("(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", tweet).split())
           def get_tweet_sentiment(self, tweet):
 
               '''
               Utility function to classify sentiment of passed tweet
               using textblob's sentiment method
               '''
               # create TextBlob object of passed tweet text
               analysis = TextBlob(self.clean_tweet(tweet))
               # set sentiment
               if analysis.sentiment.polarity > 0:
                   return 'positive'
               elif analysis.sentiment.polarity == 0:
                   return 'neutral'
               else :
                   return 'negative'
           def get_tweets(self, query, count = 10):
 

               '''
               Main function to fetch tweets and parse them.
               '''
               # empty list to store parsed tweets
               tweets = []
               try:
 
                   # call twitter api to fetch tweets
                   fetched_tweets = self.api.search(q = query, count = count)
                   # parsing tweets one by one
                   for tweet in fetched_tweets:
                       # empty dictionary to store required params of a tweet
                       parsed_tweet = {}
                       # saving text of tweet
                       parsed_tweet['text'] = tweet.text
                       # saving sentiment of tweet
                       parsed_tweet['sentiment'] =self.get_tweet_sentiment(tweet.text)
                       # appending parsed tweet to tweets list
                       if tweet.retweet_count > 0:
                     # if tweet has retweets, ensure that it is appended only once
                           if parsed_tweet not in tweets:
                               tweets.append(parsed_tweet)
                       else:
                           tweets.append(parsed_tweet)
                     # return parsed tweets
                   return tweets
               except tweepy.TweepError as e:
                   # print error (if any)
                   print("Error : " + str(e))
 
       def main():
           # creating object of TwitterClient Class
           api = TwitterClient()
           global g,positive_tweets,negative_tweets,neutral_tweets,code_analysis
           i=len(entered_value)-1
           g = entered_value[i]
           if g==None:
               g=input("enter value")
               # calling function to get tweets
           tweets = []
           try:
               tweets = api.get_tweets(query = g, count = 10)
               if not tweets:

                   return 'this should raise an error'
                    # unreachable: the sentinel string returned above is checked by the
                    # caller, which then shows the "No Tweets for this topic!!" error box
               a=str(len(tweets))
               if len(a)==1:
                   a= '0'+str(a)
               code_analysis['No. of tweets\t\t \t:'] = a
               # picking positive tweets from tweets
               ptweets = [tweet for tweet in tweets if tweet['sentiment'] == 'positive']
               # percentage of positive tweets
               pp=0
               pp=round(100*len(ptweets)/len(tweets),2)
               code_analysis['No. of positive tweets \t\t:'] = str(len(ptweets))
               # picking negative tweets from tweets
               ntweets = [tweet for tweet in tweets if tweet['sentiment'] == 'negative']
               code_analysis['No. of negative tweets \t\t:'] = str(len(ntweets))
               # percentage of negative tweets
               np=0
               np=round(100*len(ntweets)/len(tweets),2)   
               #picking neutral tweets from tweets
               netweets = [tweet for tweet in tweets if tweet['sentiment'] == 'neutral']
               # percentage of neutral tweets
               nup=0
               nup=round(100*len(netweets)/len(tweets),2)
               code_analysis['No. of neutral tweets \t\t:'] = str(len(netweets))
               code_analysis['Positive tweets Percentage\t\t:'] = pp
               code_analysis['Negative tweets Percentage\t\t:'] = np 
               code_analysis['Neutral tweets Percentage\t\t:'] = nup
              
               # printing first 5 positive tweets
               ptw=''
               for tweet in ptweets:
                   ptw=ptw+','+str(tweet['text'])+'\n'
                   positive_tweets.append(ptw)
                   ptw=''
                  
               # printing first 5 negative tweets
               ntw=''
               for tweet in ntweets:
                   ntw=ntw+', '+str(tweet['text'])+'\n'
                   negative_tweets.append(ntw)
                   ntw=''
          
               # printing first 5 neutral tweets

               netw=''
               for tweet in netweets:
                   netw=netw+', '+str(tweet['text'])+'\n'
                   neutral_tweets.append(netw)
                   netw=''
          
               exp_vals=[pp,np,nup]
               exp_labels=["Positive","Negative","Neutral"]
               plt.pie(exp_vals, labels=exp_labels)
           except Exception as error:
               print("ERROR : ", repr(error))
          
 
       if __name__ == "__main__":
           # calling main function
           a=main()
           if a=='this should raise an error':
               error()
           else:   
               show_queries.insert(-2,queryname)
               dd()
               create_and_append(new[0])
               function_for_output()
         
def error():
   messagebox.showerror('ERROR','No Tweets for this topic!!')
def list_of_queries():
   global lst
   try:
       f=open('List_of_queries.txt','r')
       s=' '
       while s:
           s=f.readline()
           s=s.rstrip('\n')
           s=s.strip(' ')
           lst.append(s)
       f.close()
       print(lst)
   except:
       lst=[' ']
def checking():
   global positive_tweets,negative_tweets,neutral_tweets,code_analysis,lst,show_queries
   list_of_queries()

   queryna=entry.get().upper().split()
   queryname=''
   positive_tweets.clear();negative_tweets.clear();neutral_tweets.clear()
   for i in queryna:
       queryname+=i+' '
   queryname=queryname.rstrip(' ')
   lst=[i for i in lst if i!='']
   if queryname in lst:   
       name = open(queryname+'.txt','r',encoding='UTF-8')
       a=''
       lst1=name.readlines()
       name.close()
       code_analysis['No. of tweets\t\t \t:'] = (lst1[0].split())[-1]
       code_analysis['No. of positive tweets \t\t:'] = (lst1[1].split())[-1]
       code_analysis['No. of negative tweets \t\t:'] = (lst1[2].split())[-1]
       code_analysis['No. of neutral tweets \t\t:'] = (lst1[3].split())[-1]
       code_analysis['Positive tweets Percentage\t\t:'] = float(lst1[4].split()[-1])
       code_analysis['Negative tweets Percentage\t\t:'] = float(lst1[5].split()[-1])
       code_analysis['Neutral tweets Percentage\t\t:'] = float(lst1[6].split()[-1])
       print(lst1)
       u=int(lst1[1].split()[-1])
       v=int(lst1[2].split()[-1])
       w=int(lst1[3].split()[-1])
       for i in lst1[7:]:
           i=i.rstrip('\n')
           print(i)
           if i=='newline':
               if len(positive_tweets)<u:
                   positive_tweets.append(a)
               elif len(negative_tweets)<v:
                   negative_tweets.append(a)
               elif len(neutral_tweets)<w:
                   neutral_tweets.append(a)
               a=''
               continue
           elif i in ['',' ']:
               pass
           else :
               a+= i+' '     
               print(a)
       function_for_output()  
       clear_content()
   else:

       python_code()
       clear_content()
 
def create_and_append(queryname):
   global positive_tweets,negative_tweets,neutral_tweets,code_analysis
   file_contents = open(queryname+'.txt','w',encoding='UTF-8')
   contents=[positive_tweets,negative_tweets,neutral_tweets]
   for i in code_analysis:
       file_contents.write(i+' '+ str(code_analysis[i])+'\n')
   for i in contents:
       for j in i:
           file_contents.write(j+'\n')
           file_contents.write('newline\n')
   file_contents.close()   
   searched = open('List_of_queries.txt','a',encoding='UTF-8')
   searched.write(queryname+'\n')
   searched.close()
def function_for_output():
   #global entered_value
   #entered_value.append(entry.get())
   output=Toplevel()
   output.geometry("1400x700")
   output.title('output')
   output.config(bg='#fafcff')
   notebook=ttk.Notebook(output)
   tab1=Frame(notebook,width=1250,height=600)
   tab2=Frame(notebook,width=1250,height=600)
   tab3=Frame(notebook,width=1250,height=600)
   tab4=Frame(notebook,width=1250,height=600)
   notebook.add(tab1,text='code analysis')
   notebook.add(tab2,text='positive tweets')
   notebook.add(tab3,text='negative tweets')
   notebook.add(tab4,text='neutral tweets')
   notebook.place(x=0,y=0)
   c=0
   for i in code_analysis:
       Label(tab1,text=i + str(code_analysis[i]),font=('Bahnschrift SemiBold',14,'bold')).place(x=0,y=0+(40*c))
       c+=1
   for i in range(len(positive_tweets)):
       Label(tab2,text=str(i+1) + positive_tweets[i],font=('Bahnschrift SemiBold',12,'bold')).place(x=0,y=0+(40*i))
   if len(positive_tweets)==0:

       Label(tab2,text='No Positive Tweets for this Topic!!',font=('Bahnschrift SemiBold',12,'bold'),fg='red').place(x=0,y=0)
      
   for i in range(len(negative_tweets)):
       Label(tab3,text=str(i+1) + negative_tweets[i],font=('Bahnschrift SemiBold',12,'bold')).place(x=0,y=0+(40*i))
   if len(negative_tweets)==0:
       Label(tab3,text='No Negative Tweets for this Topic!!',font=('Bahnschrift SemiBold',12,'bold'),fg='red').place(x=0,y=0)
      
   for i in range(len(neutral_tweets)):
       Label(tab4,text=str(i+1) + neutral_tweets[i],font=('Bahnschrift SemiBold',12,'bold')).place(x=0,y=0+(40*i))
   if len(neutral_tweets)==0:
       Label(tab4,text='No Neutral Tweets for this Topic!!',font=('Bahnschrift SemiBold',12,'bold'),fg='red').place(x=0,y=0)
      
   def angle(n):
       if n!=100.00:
           return (360*n)/100
       else:
           return 359
   canvas=Canvas(tab1,width=200,height=200)
   canvas.place(x=50,y=320)
   canvas.create_arc((2,2,150,150),fill = 'green', outline = 'green', start = 0, extent = angle(code_analysis['Positive tweets Percentage\t\t:']))
   canvas.create_arc((2,2,150,150),fill = 'red', outline = 'red', start = angle(code_analysis['Positive tweets Percentage\t\t:']), extent = angle(code_analysis['Negative tweets Percentage\t\t:']))
   canvas.create_arc((2,2,150,150),fill = 'blue', outline = 'blue', start = angle(code_analysis['Positive tweets Percentage\t\t:'])+angle(code_analysis['Negative tweets Percentage\t\t:']), extent = angle(code_analysis['Neutral tweets Percentage\t\t:']))
   Label(tab1,bg='green').place(x=220,y=330)
   Label(tab1,text='positive tweets ({})'.format(code_analysis['Positive tweets Percentage\t\t:'])).place(x=260,y=330)
   Label(tab1,bg='red').place(x=220,y=370)
   Label(tab1,text='negative tweets ({})'.format(code_analysis['Negative tweets Percentage\t\t:'])).place(x=260,y=370)
   Label(tab1,bg='blue').place(x=220,y=410)
   Label(tab1,text='neutral tweets ({})'.format(code_analysis['Neutral tweets Percentage\t\t:'])).place(x=260,y=410)
 
#Creating main window   

mw=Tk()
mw.geometry("1500x700")
mw.title("TWITTER SENTIMENTAL ANALYSIS")
mw.config(bg='#1DA1F2')
 
clicked=StringVar()
clicked.set('Show Searched Queries')       
def update_to_box():
   if clicked.get() in ['Show Searched Queries','',' ']:
       clicked.set('Show Searched Queries')
   else:
       update=clicked.get()
       entry.delete(0,'end')
       entry.insert(0,update)
       clicked.set('Show Searched Queries')
 
def clear_content():
   entry.delete(0,'end')
def dd():
   global clicked
   print('show queries= ',show_queries)
   list_of_queries()#lst                                                                                                 
   drop = OptionMenu(mw,clicked,*show_queries)     
   drop.place(x=400,y=340)
   Button(mw,text='Update to Entry Box',command=update_to_box).place(x=570,y=340,height=30)
                              
   Button(mw,text='X',command=clear_content).place(x=800,y=300,height=30)   
try:
   f=open('List_of_queries.txt','r')
   s=' '
   while s:
       s=f.readline()
       s=s.rstrip('\n')
       s=s.strip(' ')
       show_queries.append(s)
   f.close()
   print(lst)
except:
   show_queries=[' ']
 
#inserting image
photo = PhotoImage(file='twitter logo.png')     

label=Label(mw,image=photo)
label.pack()
 
#creating label
label2=Label(mw,
            text="Learn from twitter",
            font=('Bahnschrift SemiBold',14,'bold'),
            bg='#ffffff')
label2.place(x=400,y=260)
 
# creating the entry box
entry=Entry(mw,
           font=('Bookman Old Style',14))
entry.place(x=400,y=300,width= 400,height=30)
 
dd()
 
#creating the search button
sub=Button(mw,
          text='search',
          bg="#e4edf5",
          font=('Franklin Gothic Medium',14,'bold'),
          command=checking)
sub.place(x=815,y=300,height=30)
#creating a drop down menu
mw.mainloop()
 
 

SAMPLE OUTPUT

[Screenshots of the main window, the searched-queries drop-down and the four
output tabs (code analysis with the pie chart, positive tweets, negative tweets
and neutral tweets) were attached here in the original report.]
CONCLUSION:
In this project, sentiment analysis is performed with the Twitter API to
highlight the popularity of specific hashtags and topics of discussion. The
rise of streaming services in the 21st century means that the work done and
the progress achieved in this project are directly relevant to today's
real-time computing landscape. Additionally, we perform a sentiment analysis
of individual tweets, judging from their content what message the tweeter is
trying to convey. Automating this task is important because, given the sheer
volume of tweets that would have to be analysed on Twitter each day, it is
practically impossible to analyse the contents of individual tweets manually.

BIBLIOGRAPHY
1. Honey, C. and Herring, S.C., 2009, January. Beyond microblogging:
   Conversation and collaboration via Twitter. In 2009 42nd Hawaii
   International Conference on System Sciences (pp. 1-10). IEEE.
2. Huberman, B.A., Romero, D.M. and Wu, F., 2008. Social networks that
   matter: Twitter under the microscope. arXiv preprint arXiv:0812.1045.
3. Cha, M., Haddadi, H., Benevenuto, F. and Gummadi, K.P., 2010, May.
   Measuring user influence in Twitter: The million follower fallacy. In the
   Fourth International AAAI Conference on Weblogs and Social Media.
4. https://www.youtube.com/watch?v=TuLxsvK4svQ&t=8721s
5. https://www.geeksforgeeks.org/twitter-sentiment-analysis-using-python/
6. Stack Overflow
7. W3Schools
8. https://classroom.google.com/u/0/c/MzEzMDQ3NjAwOTM3/m/MzY2MDMwMTk1NTUz/details
9. Computer Science with Python, Textbook for Class 12, by Sumita Arora.
10. https://webapps.stackexchange.com/questions/19241/how-can-i-get-code-syntax-highlighting-in-google-docs

