A DYNAMIC MODEL OF SOCIAL MEDIA MONITORING TOOLS WITH
SENTIMENT ANALYSIS
B.Sc.
IN
COMPUTER SCIENCE [SOFTWARE ENGINEERING]
BY
UMAR, HAUWA
BU/17C/IT/2744
TO
THE DEPARTMENT OF COMPUTER SCIENCE
BAZE UNIVERSITY, ABUJA
SEPTEMBER, 2020
ACKNOWLEDGMENT
I would like to express my sincere gratitude to Allah SWT for giving me the health,
opportunity, wisdom and understanding to carry out this project. I would also like to thank my
parents, grandma and siblings for their emotional, physical, financial and moral support all
through my university experience; they were always my source of inspiration and motivation.
I am grateful to my project supervisors, Dr Amit Mishra and Mr Charles Isah Saidu, for their
support, patience and constructive criticism throughout the implementation of this project. I
am also grateful to all the lecturers who have made an impact on my educational upbringing at
Baze University, either directly or indirectly. I could mention these lecturers individually, but
the list would go on. I am forever grateful for the impact they have all had on my life.
I would also like to thank my friends for motivating me and having a positive impact on my life,
from the friendly competition among ourselves to helping each other during difficult times in
school. I will forever hold you close to my heart.
Finally, I would like to thank Baze University for grooming me both morally and
educationally and setting me up for a bright future.
DECLARATION
This is to certify that this Thesis entitled “A Dynamic Model of Social Media Monitoring
Tools with Sentiment Analysis”, which is submitted by Umar Hauwa in partial fulfilment of
the requirement for the award of degree for B.Sc. in Information Technology to the
Department of Computer Science, Baze University Abuja, Nigeria, comprises only my
original work and due acknowledgement has been made in the text to all other materials used.
APPROVED BY …………………
HOD
Dept. of Computer Science
CERTIFICATION
This is to certify that this Thesis entitled “A Dynamic Model of Social Media Monitoring
Tools with Sentiment Analysis”, which is submitted by Umar Hauwa in partial fulfilment of
the requirement for the award of degree for B.Sc. in Information Technology to the
Department of Computer Science, Baze University Abuja, Nigeria is a record of the
candidate’s own work carried out by the candidate under my/our supervision. The matter
embodied in this thesis is original and has not been submitted for the award of any other
degree.
Date: Supervisor:
APPROVAL
This is to certify that the research work, “A Dynamic Model of Social Media Monitoring
Tools with Sentiment Analysis” and the subsequent preparation by Umar Hauwa with
BU/17C/IT/2744 has been approved by the Department of Computer Science, Faculty of
Computing and Applied Science, Baze University, Abuja, Nigeria.
By
DEDICATION
This project is dedicated to Allah SWT for His mercy, blessings and guidance throughout
my life, and for putting me in a position to complete this project and this chapter of my life. I also
dedicate this project to my dad Mr. Umar Hamza Kangiwa, my mum Mrs. Bilkisu Umar, my
brothers Mukhtar and Abdulmalik Umar, and my grandma Mrs. Aisha Kaoje. They have been
my support system since forever and I am forever grateful to God for blessing me with my
family.
ABSTRACT
The proposed system is a dynamic model of a social media monitoring tool with sentiment
analysis. Social media platforms contain a large amount of data that can appear ambiguous
to businesses and organizations. Most organizations use social media platforms for
advertisement because they reach a large audience in a short period of time and are more
affordable than other options. However, businesses find it difficult to gain clear insight
into how well they have grown after each advertisement or product release. The aim of
the project is to provide relevant analyses, including sentiment analysis, on real-time
data from social media platforms for keywords or social media accounts specified by
registered users. Sentiment analysis is a process in which an algorithm determines the
emotion of a text, i.e. positive, negative or neutral. It helps businesses
understand how their customers feel about their products. Businesses can also compare their
products with those of their competitors to gain clearer insight into how their services are
doing online. This application was developed using the incremental
methodology and produced the desired results for the first iteration of the
project. The application was implemented in Python with a PostgreSQL database. The
first iteration focuses on Twitter data and meets all the requirements
specified. The application passed through different types of tests and the obtained
accuracy was 98%. Future enhancements will be delivered in the remaining
iterations.
TABLE OF CONTENTS
ABSTRACT vi
LIST OF TABLES IX
LIST OF FIGURES X
LIST OF ABBREVIATIONS XI
CHAPTER 1: INTRODUCTION 1
1.1 OVERVIEW 1
1.2 BACKGROUND AND MOTIVATION 1
1.3 STATEMENT OF THE PROBLEM 2
1.4 AIM AND OBJECTIVES 3
1.5 SIGNIFICANCE OF THE PROJECT 4
1.6 PROJECT RISKS ASSESSMENT 5
1.7 SCOPE/PROJECT ORGANIZATION 7
CHAPTER 2: LITERATURE REVIEW 8
2.1 INTRODUCTION 8
2.2 HISTORICAL OVERVIEW 8
2.3 RELATED WORK 9
2.4 SUMMARY 13
CHAPTER 3: REQUIREMENTS ANALYSIS AND DESIGN 14
3.1 OVERVIEW 14
3.2 PROPOSED METHODOLOGY 14
3.3 APPROACH TO CHOSEN METHODOLOGY/METHODS 19
3.4 TOOLS AND TECHNIQUES 21
3.5 ETHICAL CONSIDERATION 22
3.6 REQUIREMENT ANALYSIS 23
3.7 REQUIREMENTS SPECIFICATIONS 24
3.7.1 Functional Requirement Specifications 24
3.7.2 Non-Functional Requirement Specifications 26
3.8 SYSTEM DESIGN 26
3.8.1 Application Architecture 26
3.8.2 Use Case 27
3.8.3 Activity Diagrams 30
3.8.4 Data Flow Diagram 31
3.8.5 Entity-Relationship Diagram (ERD) 34
3.8.6 User Interface Design 35
3.9 Summary 39
CHAPTER 4: IMPLEMENTATION AND TESTING 40
4.1 OVERVIEW 40
4.2 MAIN FEATURES 40
4.3 IMPLEMENTATION PROBLEMS 43
4.4 OVERCOMING IMPLEMENTATION PROBLEMS 44
4.5 TESTING 44
4.5.1 Tests Plans (for Unit Testing, Integration Testing, and System Testing) 45
4.5.2 Test Suite (for Unit Testing, Integration Testing, and System Testing) 46
4.5.3 Test Traceability Matrix (for Unit Testing, Integration Testing, and System Testing) 74
4.5.4 Test Report Summary (for Unit Testing, Integration Testing, and System Testing) 76
4.5.5 Error Reports and Corrections 76
4.6 USE GUIDE 77
4.7 SUMMARY 77
CHAPTER 5: DISCUSSION, CONCLUSION, AND RECOMMENDATIONS 78
5.1 OVERVIEW 78
5.2 OBJECTIVE ASSESSMENT 78
5.3 LIMITATIONS AND CHALLENGES 78
5.4 FUTURE ENHANCEMENTS 78
5.5 RECOMMENDATIONS 79
5.6 CONCLUSION 80
5.7 SUMMARY 80
REFERENCES 81
APPENDICES 83
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
IT Information Technology
CHAPTER 1:
INTRODUCTION
1.1 Overview
The internet is becoming more and more embedded in people's day-to-day activities and
lives. Social media platforms are used around the clock by users
from all over the world to express their opinions freely on different topics.
Many brands and companies take advantage of these platforms to create strategies and
understand their target market, for example by advertising and marketing through influencers
on Instagram, Twitter, Facebook etc.
Social media monitoring is the process of identifying data relevant to a topic or organization
on a social media platform and analyzing that data for different purposes
such as academic research, decision making and marketing strategy.
This project will show how social media has helped, and will continue to help, businesses analyze
the opinions of customers on their products so as to improve their marketing strategies
and know what their target market wants (Amandeep, Deepesh, Khushboo, & Ranjit Singh,
2016).
Sentiment analysis classifies texts as positive, negative or neutral using text
analysis techniques (Sentiment Analysis, 2020).
By using sentiment analysis to predict the emotions expressed in words, sentences or documents,
one can gain an overview of the wider public opinion about a product. Many
people use social media sites for news and for networking with other people. Social media
gives people a platform to voice their opinions. For example, a person who has had
a very hot day might buy a refreshing bottle of Coke and post a picture of
themselves on Twitter with a caption about it. This kind of information can be used by
organizations to evaluate and rate the performance of their products around the world as
either successful or not (Khalid & Ahmad, 2017). This is one of the many features the project will
provide.
1.2 Background and Motivation
Previous methods of advertising, marketing and understanding the opinions of the masses
were very time consuming and not always efficient. Social media platforms have
provided a cheaper and more efficient way to get an organization's products and services out to
the masses. With this development, analyzing how well these products are doing in the
market has become possible. Many social media platforms provide APIs that give the public
access to their data.
TweetReach is a social media tool built specifically for Twitter. It performs analysis based on
keywords, URLs or account names and gives an overview of all the tweets posted.
A social media monitoring tool provides the following advantages:
- Users can get an overview of what people are saying about their products and how many
mentions their products are receiving. It can break down who is making the
posts by gender, location, source of the post etc.
- Users get real-time analysis of their products, including the number of
positive and negative reviews and comments people post about them. This can help
the organization decide the next step to take with its marketing strategies (Sentiment
Analysis, 2020).
- Users can also keep track of their competitors.
This project is not limited to organizations or businesses; academic scholars and people
in politics can also use it to carry out analysis in their respective fields.
These are some of the advantages and features the implementation of this project has to offer
to academia, industry and government.
It helps to establish brand awareness, as most social media platforms have millions of users.
This can help a business reach a wide range of customers locally and internationally.
It is also very cost effective. Previous methods such as billboards,
television ads and newspapers usually cost a lot,
whereas social platforms make it possible to get products into the market at almost no cost
by comparison (Khanna, 2018).
Political candidates likewise find it difficult to obtain accurate data on how society
feels about them and how their campaigns or adverts have affected people’s view of
them.
The main problem is the accessibility of data for individuals or organizations on their
products, and analyzing this data to know how well their products or
strategies are working. This project gives its users access to that information.
These are the aims and objectives of the current project. As Chapter 3 explains,
the project is separated into iterations, with an enhancement made at every
iteration.
Sentiment analysis also draws on data science, which is a vital branch of computer science.
Implementing this project in the country will enhance data management and manipulation,
that is, understanding the importance of data and how it can help in business analytics and
intelligence as well as many other sectors.
Social media monitoring gives individuals and organizations access to data available
on the internet. It carries out analysis on the extracted data to give better insight into their
topic.
1.6 Project Risk Assessment
This section identifies problems that may occur at any point in this project
and suggests possible solutions and prevention techniques.
- Lack of skills
1.6.4 Design isn’t feasible
This is when a feature in the project is not achievable, due to a lack of either skills or time. It
can lead to an incomplete project or to time being spent re-evaluating the project.
Solution: This risk is caused by improper planning and an inadequate feasibility study. These
stages are therefore vital, as they affect the end product of the project,
and should be taken seriously.
CHAPTER 2:
LITERATURE REVIEW
2.1 Introduction
A lot of research work has been done in relation to social media monitoring and sentiment
analysis. This section shows previous research works and products in the field of sentiment
analysis, social media monitoring and data mining. Some of the research works have also been
briefly tabulated in Table.3 and Table .4.
The first academic work on sentiment analysis, or on a topic related to it, appeared during
World War II with politics as its motivation (Stagner, 1940). Modern sentiment analysis started
in the 21st century and focused mainly on product reviews (Dave, Lawrence, & Pennock, 2003;
Mika, Daniel, & Kuutila, 2016). It has grown to cover other topics such as market
prediction and reactions to terrorist attacks (Nassirtoussi, Aghabozorgi, Wah, & Ngo, 2014;
Mika, Daniel, & Kuutila, 2016). There has also been research on irony detection and
multilingual support (Mika, Daniel, & Kuutila, 2016). Further research has been done on
differentiating negative emotions such as anger and grief (Cambria, Gastaldo, Bisio, & Zunnio,
2015).
In the 21st century, many people own phones with access to the internet. The
internet is used for many purposes, including receiving and posting data. Social
media platforms, e-commerce websites etc. have made it very easy for users to post their
opinions on topics, products etc. There has been a lot of research on sentiment analysis, which
is discussed in Section 2.3.
The first social media site was SixDegrees.com, created in 1997. However, it was not used by just
anyone and it did not perform the common tasks that today's social media platforms perform; it
was used to send messages within networks. After SixDegrees, many other social media
platforms surfaced over the years, such as Friendster, Myspace, Facebook and Twitter. The
use of these platforms has increased to the point where it has become a daily
routine for the majority of the world's population. As a result, social media platforms
contain a lot of data that can be analyzed for various reasons. This section summarizes
articles and products on social media monitoring tools that are relevant to this project.
2.3.1.1 Social media monitoring: Aims, methods and challenges for international companies
This research paper presents the aims, methods and challenges of social media monitoring for
international companies. It discusses the different methods of monitoring topics or
accounts on social media. The challenges include deciding what to
track and identifying the metrics to be analyzed. There are also ethical constraints, because
some influencers would not like to be monitored by companies (Boyang & Marita, 2014).
2.3.2.1 Sentiment Analysis: Adjectives and adverbs are better than Adjectives alone
“Sentiment Analysis: Adjectives and adverbs are better than Adjectives alone” presents a linguistic
approach to sentiment analysis. Its premise is that assigning polarities using both
adjectives and adverbs gives better sentiment analysis than
using adjectives alone (Farah, Carmine, Antonio, Diego, & Subrahmanian, 2006). The research
compared three methods: Variable Priority Scoring, Adjective Priority Scoring and Adverb
First Priority Scoring. After experimenting on about 200 documents from news sources, the results
showed that Adjective Priority Scoring produced the best result with a weight of 35% (Farah,
Carmine, Antonio, Diego, & Subrahmanian, 2006).
2.3.1.2 Opinion Digger: Unsupervised Opinion Miner from Unstructured Product Reviews
Opinion Digger is another sentiment analysis solution, introduced by Moghaddam and Ester
(Samaneh & Martin, 2010). It is an unsupervised opinion miner that provides a summary of
ratings for the major aspects of a product, where an aspect is a feature of the product, e.g. for a
phone it could be the battery life or camera. It gives the user insight into the quality of the
product and allows users to compare it with similar products (Samaneh & Martin, 2010). It
works at sentence level. The approach consists of two phases: extracting product
aspects and estimating aspect ratings (Samaneh & Martin, 2010). It rated products
at aspect level with a reported success of 0.51. Its drawback was that it only worked with known data
(Najma, Pintu, Monika, Sourabh, & S.K., 2019).
Figure 2.2: Sentiment Analysis methodology used in Sentiment Analysis of News Articles: A
Lexicon based Approach (Soonh, Bakhtawer, & Areej, 2019)
Over 1000 news articles were used in this experiment and were classified as positive, negative
or neutral, with a score of +1 for positive and -1 for negative. The limitation of this work is
limited word coverage, which requires the lexical database to be updated constantly with new words
and their semantics (Soonh, Bakhtawer, & Areej, 2019).
2.3.1.4 A Joint Model of Feature Mining and Sentiment Analysis for Product Review
Rating
This is a machine learning sentiment analysis solution introduced by Jorge et al. “It focuses on
measuring the polarity and strength of opinions in product reviews” (Jorge, Laura, Pablo, &
Alberto, 2011). The approach has four steps. First,
it recognizes the features the user cares about for a specific product. Second, it finds reviews in which
those features are mentioned. Third, it checks the polarity of each review. Finally,
it assigns a score to each review based on its polarity, represents the reviews as a Vector of Feature
Intensities and feeds them into an algorithm that predicts the rating (Jorge, Laura,
Pablo, & Alberto, 2011). The drawback of this work is its use of WordNet, but it achieved a
prediction accuracy of 71% (3 categories) and 46.9% (5 categories) (Najma, Pintu,
Monika, Sourabh, & S.K., 2019).
Table 2.1: Sentiment Analysis Tools/Techniques
2.4 Summary
The research reviewed in this chapter discusses the challenges faced in monitoring social
media platforms due to issues such as privacy and vagueness of data. The proposed
solution will therefore ensure that data extracted from social media platforms is less ambiguous and
more meaningful to prospective users. Section 2.3.2 covers the various techniques used to
perform sentiment analysis and identifies the best option for delivering the most accurate result within
budget and time.
Chapter 3 presents the requirement analysis and proposed methodology used to develop this
project.
CHAPTER 3:
REQUIREMENT ANALYSIS AND DESIGN
3.1 OVERVIEW
The aim of this chapter is to give a detailed breakdown of the requirement elicitation, analysis
and design of the system. The proposed methodology is explained in
detail, as are the functionalities and how they are carried out, with the
help of UML diagrams.
with requirement gathering and also reduces the chances of failure. On the downside, it
increases management cost, and customers might become too involved in the project,
which could delay its completion and cause many changes that
alter the workflow.
3.2.1.3 Scrum Development Methodology
Scrum Development Methodology is another software development methodology, and
is considered an agile approach. It is similar to the rapid application
development methodology in that it also aims to develop software solutions in a short
period of time, with changes made through iterations. It gives room for
improvement and updates. However, it does not work well with large projects and it requires
highly skilled experts on the team.
Figure 3.4: Waterfall Methodology Model (K, 2018)
3.2.1.6 Incremental Model
This method develops software solutions by using iterations to create
different releases of the project, i.e. the project is broken down into multiple standalone
modules. It allows software development to be quick and flexible. However, it
requires good planning. This methodology is used in this project because it gives
room for improvement and allows for quick development. The finished web
app will have many functionalities, as it will
integrate data mining from Twitter, Facebook and Instagram. These cannot all be completed
before the deadline. This model allows a thorough understanding of the
goal of the web app and breaks it down into mini projects to be completed. Figure 3.5
shows how the project is going to be carried out in iterations after the important
functionalities are defined.
Figure 3.6: Iterative Incremental Model (Tutorialspoint, SDLC - Iterative Incremental Model -
Tutorialspoint, 2020)
- Machine Learning: A model is trained on a dataset so that the machine can learn and
improve through experience using algorithms (Team, 2017). To begin
the analysis, labeled datasets are created. Feature spaces are then used to record which words
are present in a document and which are not, using keywords. This is
important so that only words useful for the analysis are kept. A
supervised algorithm is then used to create a model. The advantage is that it gives a better result
than the lexicon-based or rule-based approaches.
- Rule based: Words are given polarities individually. The polarities of all the
words in a sentence are combined and averaged. The end result then determines whether
the sentence is positive or negative (Andrew, et al., 2011).
- Lexicon based: This uses semantic orientation, measuring the opinion and subjectivity
of reviews, to generate a sentiment polarity (Mika, Daniel, & Miikka, 2018). It uses
dictionaries in which words are tagged as negative or positive, then goes through
the documents to count the number of positive and negative words in each one.
Each document is then tagged as either positive or negative. The advantage of this approach is that it
requires no labels.
Based on the research reviewed in Chapter 2, machine learning sentiment analysis delivered the most
accurate result.
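The lexicon-based and rule-based approaches described above can be illustrated with a minimal sketch. The word lists and polarity scores below are hypothetical and far smaller than any real dictionary, the function names are illustrative, and this is not the classifier used in the project.

```python
# Hypothetical mini-lexicon; a real system would use a much larger dictionary.
POSITIVE = {"good", "great", "love", "refreshing"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}

# Hypothetical per-word polarity scores in [-1, 1] for the rule-based variant.
POLARITY = {"good": 0.5, "great": 0.8, "love": 0.9, "bad": -0.5, "terrible": -0.9}


def lexicon_label(text):
    """Lexicon based: count the words tagged positive and negative."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def rule_based_label(text):
    """Rule based: average the polarity of the words that have a score."""
    scores = [POLARITY[w] for w in text.lower().split() if w in POLARITY]
    if not scores:
        return "neutral"
    mean = sum(scores) / len(scores)
    if mean > 0:
        return "positive"
    if mean < 0:
        return "negative"
    return "neutral"
```

Neither function needs labeled training data, which is the advantage noted above; the trade-off is that words missing from the dictionaries contribute nothing to the result.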
only able to access the free API that provides limited but still substantial data for our
analysis.
- Iteration 2: This will focus on data extracted from Facebook. Facebook is also among
the most popular social media apps of the 21st century. It provides an API for
extracting data from pages or events. Pages are used by brands or businesses to
connect with their customers, followers or fans. Analysis can be done on the data
extracted from them to show whether posts are receiving positive or negative
feedback.
- Iteration 3: This will focus on data extracted from Instagram. Instagram is a social
media platform that allows its users to make posts and comment on posts. Instagram is
owned by the owners of Facebook, and an API is also available to the public for
analysis. It will be added to this web app to give more insight into what people feel
about a topic on the most popular social media platforms.
- Iteration n: Future enhancements.
The next section shows the method of requirement elicitation for iteration 1.
Table 3.1: Brainstorming Report
SN | Idea | Time Efficiency (/5) | Feasibility (/5) | Risk Analysis (/5) | Total (/15)
1 | Using a web scraper to scrape data from social media and filtering it by keywords | 2 | 5 | 1 | 8
2 | Developing a data warehouse that stores all data gotten from different social media platforms and analyzing it | 1 | 3 | 3 | 7
3 | Using social media APIs to extract data and performing analysis based on users’ requests | 3 | 5 | 4 | 12
From the table above, the safest and most feasible approach is to use social
media APIs to extract data based on keywords, run analysis on this data and
present the results to users in a web application.
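The totals in Table 3.1 can be checked with a short calculation. The scores are copied from the table; the dictionary keys are shortened labels chosen here for illustration.

```python
# Scores from Table 3.1: (time efficiency, feasibility, risk analysis), each out of 5.
ideas = {
    "web scraper": (2, 5, 1),
    "data warehouse": (1, 3, 3),
    "social media APIs": (3, 5, 4),
}

# Total out of 15 for each idea, and the idea with the highest total.
totals = {name: sum(scores) for name, scores in ideas.items()}
best = max(totals, key=totals.get)
```

The highest total belongs to the API-based approach, which matches the choice made above.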
which helps manage the database without having to use long query syntax. This is one of the
many advantages of using the Flask framework.
IDE (Integrated Development Environment): The IDE used for the development of this web
application is Visual Studio Code. It is a code editor that supports different
programming languages and operations such as debugging, task running and version
control. Different file types such as .html, .py, .css and .js can be created and
linked together, which helps coordinate all the project files in one place.
Third-party APIs and libraries: The APIs and libraries used for the development of Mention Feeds
are listed below.
Twitter API: Twitter created a platform for developers to gain access to the data
on its platform. It provides a free API for the public to use with limited
resources. Tweepy is a Python library that allows users to extract information
from Twitter.
Bokeh: This is a data visualization tool written in Python that is used to
represent the analysis diagrammatically as line graphs, bar charts, pie charts etc.
It is a very good tool because it is responsive, meaning the size of the graphs
adjusts to the size of the screen, so it can be used on both desktops and mobile
phones.
c.) Confidentiality: The API may give users access to confidential information that is not
made public. Users of the API should keep such information confidential unless they
have gained the right to do otherwise.
d.) Permission: If an application wishes to make posts or perform specific activities
on a user's behalf on their Twitter account, it must obtain consent from the user
and give details of what it plans to do.
The above are some of the terms and conditions that developers accept when given access
to the Twitter API. The following ethical considerations were observed when
designing the website.
a.) Confidentiality of user’s information: Any confidential information provided by the user
will be stored in hashed form in order to reduce the chances of the information being stolen
by hackers or cyber criminals.
This project deals with a lot of data and it is important to ensure personal information is not
accessed by unauthorized parties.
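As an illustration of storing confidential information in hashed form, the sketch below hashes a password with a random salt using Python's standard library. The text does not name the hashing scheme used in the project, so the choice of PBKDF2, the iteration count and the function names are all assumptions.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor, not taken from the project


def hash_password(password):
    """Return (salt, digest) to store instead of the plain password."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, digest):
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```

Because only the salt and digest are stored, a stolen database row does not directly reveal the original password.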
Organizations might also want to compare one keyword (brand) with another in terms of
sentiment analysis, to know how their competitors are performing on these social media
platforms. Analysts can also use this functionality to compare two topics for research
purposes.
These organizations or small businesses would also like to know the best time to post on
their social media accounts. This functionality will also be available: it
extracts the latest 20 posts made by the user and shows the time each post was made and the
number of engagements (likes and retweets) for each post.
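One way to turn the post times and engagement counts into a suggested posting time is to average engagement per hour of the day. This is a sketch under assumptions: the post dictionaries and their keys (created_at, likes, retweets) are hypothetical, and the text does not state how the application itself derives the best time.

```python
from collections import defaultdict
from datetime import datetime


def engagement_by_hour(posts):
    """Average likes + retweets for each posting hour (0-23)."""
    by_hour = defaultdict(list)
    for post in posts:
        by_hour[post["created_at"].hour].append(post["likes"] + post["retweets"])
    return {hour: sum(values) / len(values) for hour, values in by_hour.items()}


def best_hour(posts):
    """Hour of the day with the highest average engagement."""
    averages = engagement_by_hour(posts)
    return max(averages, key=averages.get)


# Hypothetical sample standing in for a user's latest posts.
sample_posts = [
    {"created_at": datetime(2020, 9, 1, 9, 0), "likes": 10, "retweets": 2},
    {"created_at": datetime(2020, 9, 2, 9, 30), "likes": 20, "retweets": 4},
    {"created_at": datetime(2020, 9, 1, 18, 0), "likes": 5, "retweets": 1},
]
```

With only 20 posts the averages are rough, so the result is best read as a hint rather than a firm recommendation.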
R-108 | The web application shall extract the top 100 posts on Twitter based on the keyword entered by a user using the Twitter API | Functional
R-109 | The web application shall run sentiment analysis on each post extracted and give a cumulative result | Functional
R-110 | The web application shall also classify each post based on the type of device used to make the post | Functional
R-111 | The web application shall also classify each tweet into original tweet, retweet or reply | Functional
R-112 | The web application shall show all posts extracted by the API | Functional
R-113 | The web application shall show the number of tweets posted on that keyword on a daily basis | Functional
R-114 | The web application shall visualize the analysis carried out with pie charts, bar charts and graphs | Functional
R-115 | The web application shall allow users to select two keywords and display analysis of each keyword side by side | Functional
R-116 | The web application shall run analysis on a Twitter account defined by a user | Functional
R-117 | The web application shall allow users to get historic tweets based on their keywords | Functional
R-118 | The web application shall run analysis on the historic tweets extracted | Functional
R-119 | The web application shall extract personal information of Twitter accounts | Functional
R-120 | The web application shall extract a Twitter account timeline, i.e. the latest 20 posts by a Twitter account | Functional
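Requirements R-110, R-111 and R-113 amount to grouping extracted tweets by source device, by tweet type and by day. The sketch below shows one way to do this on plain dictionaries. The field names are modelled on Twitter API tweet objects but should be treated as assumptions, and created_at is assumed to have been normalized to an ISO 8601 string beforehand.

```python
from collections import Counter


def tweet_type(tweet):
    """Classify a tweet dict as 'retweet', 'reply' or 'original'."""
    if tweet.get("retweeted_status") is not None:
        return "retweet"
    if tweet.get("in_reply_to_status_id") is not None:
        return "reply"
    return "original"


def summarize(tweets):
    """Count tweets by type, by source device and by calendar day."""
    return {
        "type": Counter(tweet_type(t) for t in tweets),
        "source": Counter(t["source"] for t in tweets),
        # The first 10 characters of an ISO 8601 timestamp are the date (YYYY-MM-DD).
        "per_day": Counter(t["created_at"][:10] for t in tweets),
    }


# Hypothetical tweets standing in for API results.
sample_tweets = [
    {"source": "Twitter for iPhone", "created_at": "2020-09-01T10:00:00"},
    {"source": "Twitter for Android", "created_at": "2020-09-01T12:00:00",
     "retweeted_status": {"id": 1}},
    {"source": "Twitter for iPhone", "created_at": "2020-09-02T08:00:00",
     "in_reply_to_status_id": 5},
]
```

The resulting counts are the kind of data that the pie charts and bar charts in R-114 would display.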
3.7.2 Non-functional Requirement specification
Table 3.3: Non-functional Requirement specification
Req. No. | Description | Type
NR-101 | When the URL is searched, the web application will run unless the user is not connected to the internet | Performance
NR-102 | The web application shall ensure sensitive information is secure | Security
NR-103 | The web application shall explain the different functionalities the web application has to offer on the page | Usability
NR-104 | The web application shall be user-friendly | Usability
NR-105 | Non registered members will not be able to access the functionalities | Security
NR-106 | The web application shall run well on desktop and mobile devices | Configuration
Figure 3.7.1: Mention Feeds Architecture
Figure 3.7.2.1: Use case Diagram for one keyword
Figure 3.7.2.2: Use Case for Compare Keyword
3.8.3 Activity diagram
An activity diagram elaborates on use cases and gives a more in-depth visualization of them (Tanwir,
Adnan, Junaid, Dragos, & Ivan, 2019). This activity diagram shows the process of user registration
and login.
3.8.4 Data Flow Diagram
A data flow diagram shows the flow of data in a system. The diagrams below show
the Context level DFD, Level 0 DFD and Level 1 DFD.
Figure 3.7.4.3: Level 1 DFD
CONTEXT LEVEL DFD LEVEL 0 DFD LEVEL 1 DFD
3.8.5 Entity-Relationship Diagram (ERD)
An entity-relationship diagram shows the structural
design of an application's database. It gives an account of all the entities and attributes present in
the database. A brief description of the data model follows.
USER: This holds all the vital information about the user, including the information needed to
authenticate them. Each user has a unique id and email, which ensures that each user is unique.
KEYWORD: Each keyword has a unique id and is linked to the user who added it,
with user_id holding the id of that user.
TWITTERUSER: Each username has a unique id and is linked to the user who
added it, with user_id holding the id of that user.
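The three entities can be sketched as a small relational schema. The example uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs without a server. Table and column names other than id, email and user_id are hypothetical, and the user table is named app_user here because USER is a reserved word in PostgreSQL.

```python
import sqlite3

SCHEMA = """
CREATE TABLE app_user (
    id       INTEGER PRIMARY KEY,
    email    TEXT NOT NULL UNIQUE,
    password TEXT NOT NULL          -- stored as a hash, never as plain text
);
CREATE TABLE keyword (
    id      INTEGER PRIMARY KEY,
    word    TEXT NOT NULL,
    user_id INTEGER NOT NULL REFERENCES app_user(id)
);
CREATE TABLE twitteruser (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL,
    user_id  INTEGER NOT NULL REFERENCES app_user(id)
);
"""


def make_db():
    """Create an in-memory database containing the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript(SCHEMA)
    return conn
```

The UNIQUE constraint on email and the user_id foreign keys express the two rules stated above: each user is unique, and every keyword or Twitter username belongs to the user who added it.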
3.8.6 User Interface Design
This section presents the user interface, the front-end design through which the user interacts
with the web application.
Figure 3.7.6.3: User Keyword Dashboard
Figure 3.7.6.5: Compare Keywords
Figure 3.7.6.7: Add new Twitter Username
Login Page: Figure 3.7.6.1 shows the login page, where registered users
fill in their login information to gain access to their dashboard.
Register Page: Figure 3.7.6.2 shows the register page, where
prospective users can provide the data needed to become
registered users.
User Keyword Dashboard: Figure 3.7.6.3 shows the first page that
appears on a user's screen after logging in. This page offers two options,
Tracker and Historic Data, each described below.
a.) Tracker: Under this option, registered users are provided with two further options,
Add Tracker and Compare. Figure 3.7.6.4 shows what
appears when a registered user selects “Add Tracker”: a pop-up where
registered users can add new keywords to their dashboard. Figure 3.7.6.5
shows what pops up when a user selects “Compare”: a list of all the
keywords the user has added to the dashboard. Users can then select two
keywords and get comparisons of them based on sentiment analysis
and other categories.
Also, the list of keywords on the front page is the list of keywords the user has
added to their dashboard. When a user clicks submit, only the latest 100 posts
are extracted, and analysis is carried out on each post.
b.) Historic Data: Under this option, registered users can have more than 500
posts analyzed, which gives a more detailed result.
3.9 Summary
In this section, requirement elicitation and analysis were examined, showing the different
methods used to carry out this task. This led to an understanding of the design of the web
application, detailed using UML diagrams and other tools.
CHAPTER 4:
4.1 Overview
The aim of this chapter is to document the process of development of the main features. It gives
a detailed breakdown of the problems encountered and how they were resolved. It also goes
through the test plan and test report for the project to ensure all the functionalities work
properly. This chapter is where the project is implemented.
Language Toolkit (NLTK). Before a tweet is classified as Positive, Negative or Neutral, its
text has to be cleaned, that is, stripped of punctuation marks, so as to get the most accurate
result. Afterwards, the tweets are classified as positive, negative or neutral. Using bokeh, a
pie chart represents the result.
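The clean-then-classify flow described above can be illustrated with a minimal, self-contained sketch. The word lists and the function names clean_tweet, classify and summarize are hypothetical stand-ins: the project itself relies on a sentiment library, whereas this toy scorer only shows the order of the steps and the three-way counts that would feed a pie chart.

```python
import string
from collections import Counter

# Toy word lists standing in for a real sentiment model; purely illustrative.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}


def clean_tweet(text):
    """Remove ASCII punctuation marks and collapse repeated whitespace."""
    stripped = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(stripped.split())


def classify(text):
    """Label a tweet Positive, Negative or Neutral from its cleaned words."""
    words = clean_tweet(text).lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


def summarize(tweets):
    """Count how many tweets fall in each class (the pie chart's input)."""
    counts = Counter(classify(t) for t in tweets)
    return {label: counts.get(label, 0) for label in ("Positive", "Negative", "Neutral")}
```

In a real deployment the classify step would be replaced by the chosen sentiment library, while the cleaning and counting steps would stay largely the same.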
database and linked to the user that added the keyword. Beside every Twitter username there is a
delete button. When it is clicked, the object for that particular Twitter account is deleted
from the database with the help of SQLAlchemy, which simplifies SQL queries. Figure 3.7.6.7 is
a diagram that shows the pop-up that appears when a user clicks “Add Twitter Account”.
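The delete action described above might look like the following sketch. It is written against Python's built-in sqlite3 module instead of SQLAlchemy so that it runs on its own, and the twitter_user table, its columns and the function name delete_twitter_account are assumptions made for illustration.

```python
import sqlite3


def delete_twitter_account(conn, account_id, user_id):
    """Delete one tracked account, but only if it belongs to user_id.

    Returns True when a row was removed and False otherwise.
    """
    cur = conn.execute(
        "DELETE FROM twitter_user WHERE id = ? AND user_id = ?",
        (account_id, user_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

Filtering on both the account id and the owner's id is a design choice that keeps one user from deleting an entry on another user's dashboard.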
4.4 Overcoming Implementation Problems
In section 4.3 the problems encountered during the implementation were discussed. This
section explains how those problems were overcome.
Integrating bokeh graph to HTML Page: This issue was resolved with the help of Stack
Overflow and the documentation provided on Bokeh’s website. Bokeh allows
developers to pass script and div code to HTML.
Deleting Keywords and Twitter Accounts from a user’s dashboard: This problem
was caused by improper implementation of the code; its structure was not
arranged properly. It was solved by going through the SQLAlchemy
documentation to gain a clearer understanding of the library.
Extracting data using the Twitter API: This problem was solved by going through
the documentation of the Twitter API and with the help of GitHub. The solution
was to change the time on the hardware used to implement the web application.
Managing sessions: This problem was solved by making use of Flask-Login, which
provides user session management for Flask. It helps manage a user’s login and
logout sessions.
4.5 Testing
In this section, the functionalities will be verified and validated. This helps developers identify
and understand the limitations and possible vulnerabilities of the web application.
4.5.1 Test Plans (for Unit Testing, Integration Testing and System Testing)
Below is the test plan for “Mention Feeds”.
4.5.1.2 Reference
Mention Feeds
Work Plan
Detailed project documentation
Test Summary
4.5.1.3 Introduction
This section provides a breakdown of the tests carried out to verify and validate the
functional requirements for Mention Feeds. White box and Black box testing were
carried out.
4.5.1.5 Approach
Visual Studio Code is used for testing the web application before it is deployed. It
supports the Python and Flask extensions that were used for the implementation of this
project, and it also includes debugging.
4.5.1.6 Deliverables
The deliverables of the test plan include:
Test Cases
Test report
Traceability matrix
Test results
Error report
APPROVALS
Hauwa Umar
Supervisors
4.5.2 Test Suite (for Unit Testing, Integration Testing, and System Testing)
Test case TC-001(Register User)
Table 4.1: Test Suite Performed for Register
Test Suite ID R-121
Status Successful
Remarks The test was successful
Created by Hauwa Umar
Date Created 6th June 2020
Executed by Hauwa Umar
Date Execution 6th June 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-001 (Register User): The aim of this test case is to ensure the registration process
works accordingly. It also ensures users enter the right information. Error messages appear to
guide users on how to fill in the registration form appropriately. If the user successfully
registers, the application redirects the user to the login page and shows a success message.
This test case was successful.
Figure 4.1.2: Successful Registration
Test case TC-002 (Login User): The aim of this test is to ensure registered users can log in. It
verifies that the email and password entered by the user are correct. If the user enters the right
information, the user is redirected to their dashboard. Otherwise, an error message
appears. This test case was successful.
Test case TC-003 (Add New Keyword)
Table 4.3: Test Suite Performed to Add Keyword to Users’ Dashboard
Test Suite ID R-123
Test Case ID TC-003
Test case summary This test case is to ensure registered users
can add keywords to their dashboard.
Related Requirement R-123
Prerequisite User must be logged in
User must be on “Keyword Based
Analysis Page”
User must click the “Add
Keyword” button
Test Procedure 1. Login
2. Click “Add Keyword” button
Test Data Keyword
Expected Result The web application should add the
keyword to a users’ dashboard and link it
to the user in the database.
Actual Result The web application adds the keyword to a
users’ dashboard and links it to the user in
the database.
Status Successful
Remarks The test was successful
Created by Hauwa Umar
Date Created 10th June 2020
Executed by Hauwa Umar
Date Execution 11th June 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-003 (Add New Keyword): The aim of this test case is to ensure registered users can
add keywords to their dashboard. Logged-in users find the “Add Keyword” button on their
dashboard, which triggers a pop-up form that allows users to enter their desired
keyword and add it to their dashboard. This test case was successful and it performed everything
that was expected.
Figure 4.4.1: Pop up to add Keyword
Test case TC-004 (Add Twitter Account)
Table 4.4: Test Suite Performed to Add Twitter Account to Users’ Dashboard
Test Suite ID R-124
Test Case ID TC-004
Test case summary The aim of this test case is to ensure
registered users can add twitter accounts to
their dashboard.
Related Requirement R-124
Prerequisite User must be logged in
User must navigate to “Social
Media Account Analysis” Page
User must click on “Add Twitter
Account” button
Test Procedure 1. Login
2. Navigate to “Social Media Account
Analysis” Page
3. Click “Add Twitter Account”
button
Test Data Twitter Account
Expected Result The web application should add the twitter
account to a users’ dashboard and link it to
the user in the database.
Actual Result The web application adds the twitter
account to a users’ dashboard and links it
to the user in the database.
Status Successful
Remarks This test case is successful
Created by Hauwa Umar
Date Created 12th June 2020
Executed by Hauwa Umar
Date Execution 13th June 2020
Test Environment Hardware: HP Laptop
Software: Browser - Google Chrome
Test case TC-004 (Add Twitter Account): The aim of this test case is to ensure registered users
can add Twitter accounts to their dashboard. Logged-in users navigate to the “Social Media
Account Analysis” Page and find the “Add Twitter Account” button on their dashboard, which
triggers a pop-up form that allows users to enter the desired Twitter account username
and add it to their dashboard. This test case was successful and it performed everything that was
expected.
Figure 4.5.1: Test to Add New Keyword
Test case TC-005 (Extract latest 100 Tweets from Twitter based on Keyword)
Table 4.5: Test Suite Performed to extract latest 100 Tweets from Twitter based on
Keyword
Test Suite ID R-125
Test Case ID TC-005
Test case summary The aim of this test case is to extract the latest 100 Tweets
from Twitter based on Keyword.
Related Requirement R-125
Prerequisite User must be logged in
User must be on “Keyword Based Analysis” Page
User must have a keyword on their dashboard
Test Procedure 1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data Keyword
Expected Result The web application should display the latest 100 tweets
from twitter based on the selected keyword.
Actual Result The web application displays the latest 100 tweets from
twitter based on the selected keyword.
Status Successful
Remarks This test case was successful
Created by Hauwa Umar
Date Created 14th June 2020
Executed by Hauwa Umar
Date Execution 20th June 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-005 (Extract latest 100 Tweets from Twitter based on Keyword): The aim of this
test case is to extract the latest 100 Tweets from Twitter based on Keyword. The registered user
must have a keyword on their dashboard. There is a submit button beside each keyword on the
dashboard. The user clicks the submit button beside whichever keyword they want analysis on,
which then extracts the latest 100 Tweets from Twitter based on that keyword. This test was
successful and performed what was expected.
Figure 4.6: Tweets Extracted Successfully
Test case TC-006 (Data Visualization for number of posts extracted per day)
Table 4.6: Test Suite Performed to display Data Visualization for number of posts per day
Test Suite ID R-126
Test Case ID TC-006
Test case summary The aim of this test case is to display a line graph that shows
the number of posts extracted per day
Related Requirement R-126
Prerequisite User must be logged in
User must be on “Keyword Based Analysis” Page
User must have a keyword on their dashboard
Test Procedure 1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data Keyword
Expected Result The web application should display a line graph showing the
number of tweets extracted based on the keyword per day
Actual Result The web application displays a line graph showing the number
of tweets extracted based on the keyword per day
Status Successful
Remarks The test case was successful
Created by Hauwa Umar
Date Created 21st June 2020
Executed by Hauwa Umar
Date Execution 26th June 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-006 (Data Visualization for number of posts extracted per day): The aim of this test
case is to display a line graph that shows the number of posts extracted per day. The registered
user must have a keyword on their dashboard. There is a submit button beside each keyword on
the dashboard. The user clicks the submit button beside whichever keyword they want analysis
on, which then displays a line graph that shows the number of posts extracted per day.
However, if all 100 tweets were posted on the same day, a bar chart appears instead. This test
was successful and performed what was expected.
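The per-day count and the bar chart fallback described in TC-006 can be sketched as follows. The function names posts_per_day and choose_chart and the sample timestamps are hypothetical; the sketch only illustrates the grouping logic and does not draw anything.

```python
from collections import Counter
from datetime import datetime


def posts_per_day(timestamps):
    """Count tweets per calendar day; returns a date-sorted list of (date, count)."""
    counts = Counter(ts.date() for ts in timestamps)
    return sorted(counts.items())


def choose_chart(daily_counts):
    """A line graph needs at least two days; otherwise fall back to a bar chart."""
    return "line" if len(daily_counts) > 1 else "bar"


# Illustrative timestamps only, not data from the project.
sample = [
    datetime(2020, 6, 25, 9, 0),
    datetime(2020, 6, 25, 18, 30),
    datetime(2020, 6, 26, 7, 45),
]
daily = posts_per_day(sample)
chart_kind = choose_chart(daily)
```

With the sample above, two days are present, so the sketch selects a line graph; if every timestamp shared one date it would select a bar chart.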
Test case TC-007 (Sentiment Analysis of posts Extracted)
Table 4.7: Test Suite Performed to perform Sentiment Analysis of posts Extracted
Test Suite ID R-127
Test Case ID TC-007
Test case summary The aim of the test case is to perform sentiment analysis on
each post extracted and give a cumulative result
Related Requirement R-127
Prerequisite User must be logged in
User must be on “Keyword Based Analysis” Page
User must have a keyword on their dashboard
Test case TC-007 (Sentiment Analysis of posts Extracted): The aim of the test case is to perform
sentiment analysis on each post extracted and give a cumulative result. The registered user must
have a keyword on their dashboard. There is a submit button beside each keyword on the
dashboard. The user clicks the submit button beside whichever keyword they want analysis on,
which then performs sentiment analysis on each tweet extracted and gives a cumulative result
of the number of positive, negative and neutral tweets extracted. This test was successful and
performed what was expected.
Test case TC-008 (Categorizing tweets into Type of Device Used)
Table 4.8: Test Suite Performed to Categorize tweets into Type of Device Used
Test Suite ID R-128
Test Case ID TC-008
Test case summary The aim of the test case is to categorize tweets into the type of
device used to make each post tweeted.
Related Requirement R-128
Prerequisite User must be logged in
User must be on “Keyword Based Analysis” Page
User must have a keyword on their dashboard
Test Procedure 1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data Keyword
Expected Result The web application should categorize tweets into the type of
device used to make each post tweeted.
Actual Result The web application categorizes tweets into the type of device
used to make each post tweeted.
Status Successful
Remarks The test case was successful
Created by Hauwa Umar
Date Created 6th July 2020
Executed by Hauwa Umar
Date Execution 6th July 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-008 (Categorizing tweets into Type of Device Used): The aim of the test case is to
categorize tweets by the type of device used to post each tweet. This was successful.
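A minimal sketch of the device categorization tested in TC-008 is shown below. It assumes each tweet carries a plain-text source label such as "Twitter for iPhone"; the labels, the function names and the matching rules are assumptions for illustration, and the four categories are the ones the thesis names (iPhone, Android, WebApp or Others).

```python
from collections import Counter


def device_category(source):
    """Map a tweet's source label to iPhone, Android, WebApp or Others."""
    label = source.lower()
    if "iphone" in label:
        return "iPhone"
    if "android" in label:
        return "Android"
    if "web" in label:
        return "WebApp"
    return "Others"


def device_breakdown(sources):
    """Count tweets per device category."""
    return dict(Counter(device_category(s) for s in sources))
```

For example, device_breakdown(["Twitter for iPhone", "Twitter Web App"]) yields one tweet in each of the iPhone and WebApp categories.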
Test case TC-009 (Categorizing tweets into Tweet Type (Original Tweet, Reply or Retweet))
Table 4.9: Test Suite Performed to categorize tweets into Tweet Type (Original Tweet,
Reply or Retweet)
Test Suite ID R-129
Test Case ID TC-009
Test case summary The aim of this test case is to categorize tweets into Tweet
Type (Original Tweet, Reply or Retweet)
Related Requirement R-129
Prerequisite User must be logged in
User must be on “Keyword Based Analysis” Page
User must have a keyword on their dashboard
Test Procedure 1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data Keyword
Expected Result The web application should categorize tweets into Tweet
Type (Original Tweet, Reply or Retweet)
Actual Result The web application categorizes tweets into Tweet Type
(Original Tweet, Reply or Retweet)
Status Successful
Remarks The test case was successful
Created by Hauwa Umar
Date Created 7th July 2020
Executed by Hauwa Umar
Date Execution 7th July 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-009 (Categorizing tweets into Tweet Type (Original Tweet, Reply or Retweet)):
The aim of this test case is to categorize tweets into Tweet Type (Original Tweet, Reply or
Retweet). This was successful.
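The tweet type categorization in TC-009 can be sketched as below, assuming each tweet is available as a dictionary shaped like a Twitter API v1.1 tweet object, where a retweet carries a retweeted_status field and a reply carries a non-null in_reply_to_status_id. The function name tweet_type and the sample dictionaries are hypothetical.

```python
def tweet_type(tweet):
    """Classify a tweet dictionary as Retweet, Reply or Original Tweet."""
    if tweet.get("retweeted_status") is not None:
        return "Retweet"
    if tweet.get("in_reply_to_status_id") is not None:
        return "Reply"
    return "Original Tweet"


# Illustrative tweet dictionaries, not real API responses.
examples = [
    {"text": "RT @someone: hello", "retweeted_status": {"id": 1}},
    {"text": "@someone thanks!", "in_reply_to_status_id": 42},
    {"text": "Good morning"},
]
labels = [tweet_type(t) for t in examples]
```

Checking for a retweet first is a deliberate ordering in this sketch, so that a retweet of a reply is still counted as a retweet.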
Test case TC-010 (Data Visualization for All Analysis made on tweets extracted)
Table 4.10: Test Suite Performed to display data visualization for analysis carried out
Test Suite ID R-130
Test Case ID TC-010
Test case summary The aim of this test case is to visualize the analysis carried out
with pie charts, bar charts and graphs.
Related Requirement R-130
Prerequisite User must be logged in
User must be on “Keyword Based Analysis” Page
User must have a keyword on their dashboard
Test Procedure 1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data Keyword
Expected Result The web application should visualize the analysis carried out
with pie charts, bar charts and graphs.
Actual Result The web application visualizes the analysis carried out with
pie charts, bar charts and graphs.
Status Successful
Remarks The test case was successful
Created by Hauwa Umar
Date Created 8th July 2020
Executed by Hauwa Umar
Date Execution 13th July 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-010 (Data Visualization for All Analysis made on tweets extracted): The aim of
this test case is to visualize the analysis carried out with pie charts, bar charts and graphs. When
a user clicks submit, several analyses are carried out on the extracted tweets and the results are
displayed using graphs and pie charts. The test case was successful.
Figure 4.8.1: Sentiment Analysis Visualization
Figure 4.8.3: Post Type Visualization
Test case TC-011 (Compare two keywords)
Table 4.11: Test Suite Performed to compare two keywords
Test Suite ID R-131
Test Case ID TC-011
Test case summary The aim of this test case is to allow registered users compare
analysis on two keywords
Related Requirement R-131
Prerequisite User must be logged in
User must be on “Keyword Based Analysis” Page
User must have at least two keywords on their dashboard
Test Procedure 1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data Two Keywords
Expected Result The web application should compare the analysis of tweets
extracted for two keywords and display them
Actual Result The web application compares the analysis of tweets extracted for
two keywords and displays them
Status Successful
Remarks The test case was successful
Created by Hauwa Umar
Date Created 14th July 2020
Executed by Hauwa Umar
Date Execution 16th July 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-011 (Compare two keywords): The aim of this test case is to allow registered users
to compare analysis on two keywords. The user must have at least two keywords on their
dashboard. They can select only two keywords and get analysis on both side by side. This test
case was successful.
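One way to picture the side-by-side comparison in TC-011 is to convert each keyword's sentiment counts into percentage shares and pair them up. The sketch below is an assumption about how such a comparison could be computed, not the project's implementation; the function names and the counts in the usage note are made up for illustration.

```python
def sentiment_shares(counts):
    """Convert raw sentiment counts into percentage shares (one decimal place)."""
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in counts}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


def compare_keywords(first_counts, second_counts):
    """Pair up the shares of two keywords; both inputs must use the same labels."""
    first = sentiment_shares(first_counts)
    second = sentiment_shares(second_counts)
    return {label: (first[label], second[label]) for label in first}
```

For instance, comparing {"Positive": 60, "Negative": 30, "Neutral": 10} with {"Positive": 25, "Negative": 50, "Neutral": 25} pairs 60.0 with 25.0 for the positive share, which is the kind of contrast a comparison page would chart.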
Figure 4.9.1: Test to Compare Keywords
Figure 4.9.3: Compare Analysis Visualization
Test case TC-012 (Extract Information of Twitter Account)
Table 4.12: Test Suite Performed to Extract Information of Twitter Account
Test Suite ID R-132
Test Case ID TC-012
Test case summary The aim of this test case is to extract personal information of a
twitter account on a users’ dashboard
Related Requirement R-132
Prerequisite User must be logged in
User must be on “Social Media Account Analysis” Page
User must have a valid twitter account on their
dashboard
Test Procedure 1. Login
2. Navigate to “Social Media Account Analysis” Page
3. Click submit beside desired twitter account
Test Data Twitter Account
Expected Result The web application should extract personal information of a
twitter account on a users’ dashboard
Actual Result The web application extracts personal information of a twitter
account on a users’ dashboard
Status Successful
Remarks The test case is successful; however, the Twitter API gives
information depending on the Twitter account’s status, i.e. Private or
Public
Created by Hauwa Umar
Date Created 17th July 2020
Executed by Hauwa Umar
Date Execution 18th July 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-012 (Extract Information of Twitter Account): The aim of this test case is to extract
personal information of a twitter account on a users’ dashboard. Logged in users will navigate to
“Social Media Account Analysis” Page and must have a valid twitter account on their page. If
the account is valid, it will extract personal information from the twitter account. If it is not, it
will delete the account from the users’ dashboard. This test case was successful.
Figure 4.9.4: Extract Twitter Account Information
Test case TC-013 (Extract User Timeline)
Table 4.13: Test Suite Performed to Extract User Timeline
Test Suite ID R-133
Test Case ID TC-013
Test case summary The aim of this test case is to extract the latest 20 posts made by a
twitter account on a user’s dashboard
Related Requirement R-133
Prerequisite User must be logged in
User must be on “Social Media Account Analysis” Page
User must have a valid twitter account on their
dashboard
Test Procedure 1. Login
2. Navigate to “Social Media Account Analysis” Page
3. Click submit beside desired twitter account
Test Data Twitter Account
Expected Result The web application should extract the latest 20 posts made by a
twitter account on a user’s dashboard
Actual Result The web application extracts the latest 20 posts made by a twitter
account on a user’s dashboard
Status Successful
Remarks The test case is successful
Created by Hauwa Umar
Date Created 19th July 2020
Executed by Hauwa Umar
Date Execution 20th July 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-013 (Extract User Timeline): The aim of this test case is to extract the latest 20
posts made by a Twitter account on a user’s dashboard. Along with extracting information
about a Twitter account, the web application extracts that account’s latest 20 tweets. This test
case was successful.
Figure 4.9.5: User Timeline Extracted
a twitter user’s timeline and display it on the web application
Status Successful
Remarks The test case was successful
Created by Hauwa Umar
Date Created 21st July 2020
Executed by Hauwa Umar
Date Execution 28th July 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-014 (Data Visualization for Optimization of posts): The aim of this test case is to
run analysis on the tweets extracted from a Twitter user’s timeline and display it on the web
application. The web application categorizes the engagement on each post on an hourly basis
and displays a bar graph to visualize this analysis. This test case was successful.
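The hourly grouping of engagement described in TC-014 can be sketched as follows. Treating engagement as likes plus retweets is an assumption made for this example, as are the field names created_at, likes and retweets, the function names and the sample posts.

```python
from collections import defaultdict
from datetime import datetime


def engagement_by_hour(posts):
    """Sum likes and retweets for each hour of the day (0 to 23)."""
    totals = defaultdict(int)
    for post in posts:
        totals[post["created_at"].hour] += post["likes"] + post["retweets"]
    return dict(sorted(totals.items()))


def best_hour(posts):
    """Return the hour with the highest total engagement, or None if there are no posts."""
    totals = engagement_by_hour(posts)
    return max(totals, key=totals.get) if totals else None


# Illustrative posts only, not data from the project.
sample_posts = [
    {"created_at": datetime(2020, 7, 21, 9, 15), "likes": 10, "retweets": 2},
    {"created_at": datetime(2020, 7, 22, 9, 40), "likes": 5, "retweets": 1},
    {"created_at": datetime(2020, 7, 22, 20, 5), "likes": 7, "retweets": 0},
]
hourly = engagement_by_hour(sample_posts)
```

The resulting hour-to-total mapping is the kind of data a bar graph of engagement per hour would plot.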
Test case TC-015 (Historic Data)
Test Suite ID R-135
Test Case ID TC-015
Test case summary The aim of this test case is to extract up to 1000 posts based on
a keyword from twitter, perform analysis and provide data
visualization for them.
Related Requirement R-135
Prerequisite User must be logged in
User must be on “Keyword Based Analysis” Page
User must have a Keyword on their dashboard
Test Procedure 1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data Keyword
Expected Result The web application should extract up to 1000 posts based on
a keyword from twitter, perform analysis and provide data
visualization for them.
Actual Result The web application extracts up to 1000 posts based on a
keyword from twitter, performs analysis and provides data
visualization for them.
Status Successful
Remarks The test case was successful
Created by Hauwa Umar
Date Created 29th July 2020
Executed by Hauwa Umar
Date Execution 10th August 2020
Test Environment Hardware: HP Laptop
Software: Browser – Google Chrome
Test case TC-015 (Historic Data): The aim of this test is to extract from 200 to 1000 tweets to
run analysis using a library called GetOldTweets3. It will also provide a data visualization for the
analysis which is sentiment analysis. This test case was successful.
Figure 4.9.7: Historic Data Page
Figure 4.9.8.2: Historic Data Sentiment Analysis
4.5.3 Test Traceability Matrix (for Unit Testing, Integration Testing and
System Testing)
Table 4.15: Test Traceability Matrix
Reqt. | Description | Priority | Test Case | Test Date | Test Result
R-121 | The web application should allow users register | 8 | 1 | 6th June 2020 | Successful
R-122 | The web application should allow users login | 7 | 2 | 8th June 2020 | Successful
R-123 | The web application should allow users add New Keywords to their dashboard | 4 | 3 | 11th June 2020 | Successful
R-124 | The web application should allow users add Twitter Accounts to their dashboard | 4 | 4 | 13th June 2020 | Successful
R-125 | The web application should extract latest 100 Tweets from Twitter based on Keyword | 6 | 5 | 20th June 2020 | Successful
R-126 | The web application should display a line graph showing the number of tweets extracted based on the keyword per day | 6 | 6 | 26th June 2020 | Successful
R-127 | The web application should perform sentiment analysis on each post extracted | 9 | 7 | 5th July 2020 | Successful
R-128 | The web application should categorize tweets into type of device used to make each post tweeted. | 8 | 8 | 6th July 2020 | Successful
R-129 | The web application should categorize tweets into Tweet Type (Original Tweet, Reply or Retweet) | 8 | 9 | 7th July 2020 | Successful
R-130 | The web application should visualize the analysis carried out with pie charts, bar charts and graphs. | 7 | 10 | 13th July 2020 | Successful
R-131 | The web application should compare the analysis of tweets extracted for two keywords and display them | 8 | 11 | 16th July 2020 | Successful
R-132 | The web application should extract personal information of a twitter account on a users’ dashboard | 5 | 12 | 18th July 2020 | Successful
R-133 | The web application should extract the latest 20 post made by a twitter account on a users’ dashboard | 6 | 13 | 20th July 2020 | Successful
R-134 | The web application should run analysis on the tweets extracted from a twitter user’s timeline and display it on the web application | 8 | 14 | 28th July 2020 | Successful
R-135 | The web application should extract up to 1000 posts based on a keyword from twitter, perform analysis and provide data visualization for them. | 7 | 15 | 10th August 2020 | Successful
4.5.4 Test Report Summary (for Unit Testing, Integration Testing and
System Testing)
Table 4.16: Test Report Summary
SUMMARY OF TEST CARRIED OUT | RESULTS
Number of functions tested | 15
Number of functions not tested | 0
Number of tests passed | 15
Number of tests failed | 0
Percentage of tests passed | 100%
Percentage of test failed | 0%
4.6 User Guide
The user guide helps prospective and registered users understand the functionalities
the web application provides and how to access them.
4.7 Summary
This chapter gives a clear breakdown of the main features and functionalities of the web
application and the various methods and techniques used to implement the features. Each
functionality was tested and documented. Errors faced while carrying out implementation were
discussed as well as how the errors were solved. This chapter played a crucial role in the
development of this web application.
CHAPTER 5:
5.1 Overview
The aim of this chapter is to discuss the objective assessment of the project, the limitations and
challenges encountered during the development of the project, possible future plans to enhance
the project and also recommendations for future projects in social media monitoring tools.
or business-like use. Currently Mention Feeds only provides current data on Twitter. Below are
brief descriptions of possible enhancements for Mention Feeds.
Upgrading the Twitter API access: This will grant access to more data on twitter and help
improve the features Mention Feeds can provide.
Adding analysis from other social media platforms: The aim of Mention Feeds is not to
focus only on data on Twitter but data from big social media platforms which include
Facebook and Instagram. These platforms also provide APIs for developers to gain
access to their data. This will provide users of Mention Feeds access to more information
that can be beneficial for individuals and organizations.
Organization Users and Individual Users: Mention Feeds should be able to allow users
define their roles as either Organizations or Individuals which would provide different
features for both users.
Enhancing result for Sentiment Analysis: TextBlob was used to carry out the
sentiment analysis for this version of Mention Feeds, and it provides a reasonable level of
accuracy. It provides a generic sentiment analysis for all text. A future plan is to
create a model that allows organizations/businesses to understand exactly what the
negative and positive comments are talking about.
Streaming live tweets and giving live sentiment analysis for tweets as they are posted.
These are some future enhancements that could be implemented over a period of time to help
achieve the overall aim of Mention Feeds.
5.5 Recommendations
Social media platforms provide a large amount of unstructured data which can be very beneficial
for analysts, businesses, organizations or influencers. Based on the research carried out on
Sentiment Analysis and Social Media Monitoring tools, I have come up with the
following recommendations:
As discussed in the future enhancement section, Sentiment Analysis should be
user specific to give a more specific result. Businesses/Organizations/Influencers could
take advantage of this information to improve their services and understand the
positive and negative feedback provided by their customers.
Small Businesses in Nigeria should be enlightened on the benefits of social media and
how the social media monitoring tools can help improve and grow their businesses.
In the 21st century, social media plays a role in the day-to-day activities of the average
person, so more projects should go into analyzing this data for the benefit of businesses,
analysts and influencers.
5.6 Conclusion
In this project, we proposed a web application that analyzes data obtained from Twitter. Twitter
is a social media platform that contains a large amount of unstructured data. If analyzed
properly, this data can be very beneficial to industry and the educational sector. As input,
Mention Feeds takes either a keyword specified by a registered user or a public Twitter account
defined by a user. Mention Feeds carries out various analyses on the extracted data, including
sentiment analysis and categorization of posts by Type of Post (Reply, Retweet or Original
Tweet) and Device Used (iPhone, Android, WebApp or Others). It also allows comparison of
analysis between two keywords, and it analyzes Twitter accounts for optimization of posts.
Access to real-time data on their brand can have a positive effect on the growth of businesses
and organizations, especially at the start of a business. It provides a more accurate and less
biased customer insight into the business. It also allows businesses to compare the public’s
reaction to their products/services with the reaction to their competition. Influencers can see
when the majority of their followers are active, which helps increase their engagement rate.
Analysts can also benefit from Mention Feeds, as the historic data feature gives them access to
a large amount of data.
Mention Feeds aims to provide analysis of data from all major social platforms for prospective
users, which will be achieved at a later time. Mention Feeds demonstrates the significance
of social media platforms to society.
5.7 Summary
This chapter provides an objective assessment of the project, focusing on whether all the
specified requirements were met. The challenges faced during the implementation of this project
were also documented in this chapter. Future enhancement suggestions and recommendations
were also discussed. Overall, this chapter is the conclusion of the entire thesis.
REFERENCES
Ali, H., Sana, M., Ahmad, K., & Shahaboddin, S. (2018). Machine Learning-Based Sentimental Analysis
for Twitter Accounts. Mathematical Computational Application.
Amandeep, K., Deepesh, K., Khushboo, V., & Ranjit Singh, S. (2016). Sentiment Analysis on Twitter
using Apache Spark. Ottawa: Carleton University.
Andrew, M. L., Raymond, D. E., Peter, P. T., Dan, H., Andrew, N. H., & Potts, C. (2011). Learning Word
Vectors for Sentiment Analysis. Proceedings of 49th Annual Meeting of the Association for
Computational Linguistics, 1-7.
Boyang, Z., & Marita, V. (2014). Social media monitoring: Aims, methods and challenges for
international companies. Corporate Communications: An International Journal, 371-383.
Cambria, E., Gastaldo, P., Bisio, F., & Zunnio, R. (2015). An ELM-based model for affective analogical
reasoning. Neurocomputing, 443-455.
Dave, K., Lawrence, S., & Pennock, D. (2003). Mining the peanut gallery: Opinion extraction and
semantic classification of product reviews. Proceedings of the12th international conference on
World Wide Web (pp. 1-18). New York: Association for Computing Machinery.
Dumbleton, R. (2018, December 19). Complete Guide to Sentiment Analysis: Updated 2020. Retrieved
from getthematic: https://getthematic.com/insights/sentiment-analysis/
Emmanouil, P., George, M., & Ioannis, K. (2019). Social Media Monitoring: An Innovative Intelligent
Approach.
Farah, B., Carmine, C., Antonio, P., Diego, R., & Subrahmanian, V. (2006). Sentiment Analysis:
Adjectives and Adverbs are better than Adjectives alone. International Conference on Weblogs
and Social Media, (pp. 1-7). ICWSM.
Jorge, C. d., Laura, P., Pablo, G., & Alberto, D. (2011). A Joint Model of Feature Mining and Sentiment
Analysis for Product Review Rating. Proceedings of International Conference on Advances in
Information Retrieval, (pp. 55-66).
K, J. (2018, June 01). 12 Best Software Development Methodologies with Pros & Cons. Retrieved from
acodez: https://acodez.in/12-best-software-development-methodologies-pros-cons/
Khalid, N., & Ahmad, I. (2017). Discovering and Analyzing Important Real-Time Trends in Noisy
Twitter Streams. Systemics, Cybernetics and Informatics, 26-30.
Khanna, U. (2018, 01 29). The Impact of Social Media Marketing Today. Retrieved from Social Media
Impact: http://www.socialmediaimpact.com/impact-social-media-marketing-today/#
81
Lin, Y. (2019, 11 30). 10 Twitter Statistics Every Marketer Should Know in 2020 [Infographic].
Retrieved from Oberlo: https://ng.oberlo.com/blog/twitter-
statistics#:~:text=Here's%20a%20summary%20of%20the,are%20between%2035%20and%2065.
Mika, M. V., Daniel, G., & Kuutila, M. (2016). The Evolution of Sentiment Analysis - A Review of
Research Topics, Venues, and Top Cited Paper. Computer Science Review, 18-32.
Mika, V. M., Daniel, G., & Miikka, K. (2018). The Evolution of Sentiment Analysis-A Review of
Research Topics. Computer Science Review, 16-32.
Najma, S., Pintu, K., Monika, R. P., Sourabh, C., & S.K., S. A. (2019). Sentiment analysis for product
review. International Journal of Soft Computing, 1913-1919.
Nassirtoussi, A., Aghabozorgi, S., Wah, T., & Ngo, D. (2014). Text mining for market prediction: A
systematic review. Expert Systems with Applications, 7653-7670.
Samaneh, M., & Martin, E. (2010). Opinion Digger. Proceedings of 19th ACM International Conference
on Information and Knowledge Management, (pp. 1825-1828).
Sentiment Analysis. (2020, January 02). Retrieved from Monkey Learn:
https://monkeylearn.com/sentiment-analysis/
Soonh, T., Bakhtawer, S., & Areej, F. M. (2019). Sentiment Analysis of News Articles: A Lexicon based
Approach. International Conference on Computing, Mathematics and Engineering Technologies.
Stagner, R. (1940). The cross-out technique as a method in public opinion analysis. The Journal of Social
Psychology, 79-90.
Tanwir, A., Adnan, A., Junaid, I., Dragos, T., & Ivan, P. (2019). Model-based testing using UML activity
diagrams: A systematic mapping study. Computer Science Review, 98-112.
Team, E. S. (2017, March 7). What is Machine Learning? A definition-Expert System. Retrieved from
Expert System: https://expertsystem.com/machine-learning-definition/
Tutorialspoint. (2020). SDLC - Iterative Incremental Model - Tutorialspoint. Retrieved from
Tutorialspoint.com:
https://www.tutorialspoint.com/adaptive_software_development/sdlc_iterative_incremental_mod
el.htm
Tutorialspoint. (2020). SDLC - Spiral Model - Tutorialspoint. Retrieved from Tutorialspoint.com:
https://www.tutorialspoint.com/sdlc/sdlc_spiral_model.htm
ZAYNABZAHRA. (2017, October 7). SCRUM Methodology. Retrieved from Zaynab's Blog:
https://zaynabzahrablog.wordpress.com/2017/10/07/scrum-methodology/
82
APPENDICES
Appendix B – Source Codes
Appendix C – Test Cases
Test case TC-005 (Extract latest 100 Tweets from Twitter based on Keyword)
Test Suite ID: R-125
Test Case ID: TC-005
Test Case Summary: The aim of this test case is to extract the latest 100 tweets from
Twitter based on a keyword.
Related Requirement: R-125
Prerequisites:
- User must be logged in
- User must be on the “Keyword Based Analysis” page
- User must have a keyword on their dashboard
Test Procedure:
1. Log in
2. Navigate to the “Keyword Based Analysis” page
3. Click submit beside the desired keyword
Test Data: Keyword
Expected Result: The web application should display the latest 100 tweets from
Twitter based on the selected keyword.
Actual Result: The web application displays the latest 100 tweets from Twitter
based on the selected keyword.
Status: Successful
Remarks: This test case was successful
Created by: Hauwa Umar
Date Created: 14th June 2020
Executed by: Hauwa Umar
Date of Execution: 20th June 2020
Test Environment:
- Hardware: HP Laptop
- Software: Browser – Google Chrome
Test case TC-006 (Data Visualization for number of posts extracted per day)
Test Suite ID: R-126
Test Case ID: TC-006
Test Case Summary: The aim of this test case is to display a line graph that shows
the number of posts extracted per day.
Related Requirement: R-126
Prerequisites:
- User must be logged in
- User must be on the “Keyword Based Analysis” page
- User must have a keyword on their dashboard
Test Procedure:
1. Log in
2. Navigate to the “Keyword Based Analysis” page
3. Click submit beside the desired keyword
Test Data: Keyword
Expected Result: The web application should display a line graph showing the
number of tweets extracted per day based on the keyword.
Actual Result: The web application displays a line graph showing the number
of tweets extracted per day based on the keyword.
Status: Successful
Remarks: This test case was successful
Created by: Hauwa Umar
Date Created: 21st June 2020
Executed by: Hauwa Umar
Date of Execution: 26th June 2020
Test Environment:
- Hardware: HP Laptop
- Software: Browser – Google Chrome
- A line graph/bar chart showing the number of posts extracted in the last seven days.
- A pie chart showing the cumulative sentiment analysis of the tweets extracted
- A pie chart showing the device used to make the tweets
- A pie chart showing the types of tweets extracted (Original Tweets, Retweets or
Replies)
- On the top left there is a side navigation button which, when clicked, slides out a side
navigation bar with the options “Analysis” and “Post”.
- Under “Post”, the user can see all the extracted tweets, the positive tweets and the
negative tweets, each in a separate table.
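The device and tweet-type pie charts in the list above depend on placing every extracted tweet into one category of each kind. The sketch below illustrates one way this grouping could be done. It assumes each tweet is a dictionary shaped like a Twitter API v1.1 tweet object, with the `source`, `retweeted_status` and `in_reply_to_status_id` fields. The function names `device_of`, `type_of` and `summarize` are hypothetical and are not taken from the Mention Feeds source code.

```python
import re
from collections import Counter


def device_of(tweet):
    """Map the tweet's `source` field to iPhone, Android, WebApp or Others."""
    # `source` is an HTML anchor, e.g. '<a href="...">Twitter for iPhone</a>',
    # so the tags are stripped before the visible label is inspected.
    label = re.sub(r"<[^>]+>", "", tweet.get("source") or "").lower()
    if "iphone" in label:
        return "iPhone"
    if "android" in label:
        return "Android"
    if "web" in label:
        return "WebApp"
    return "Others"


def type_of(tweet):
    """Classify a tweet as Retweet, Reply or Original Tweet."""
    if "retweeted_status" in tweet:
        return "Retweet"
    if tweet.get("in_reply_to_status_id") is not None:
        return "Reply"
    return "Original Tweet"


def summarize(tweets):
    """Count tweets per device and per type, ready to feed a pie chart."""
    return {
        "device": Counter(device_of(t) for t in tweets),
        "type": Counter(type_of(t) for t in tweets),
    }
```

Each `Counter` maps a category label to the number of tweets in it, which is the form most charting libraries expect for a pie chart.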
On the “Tracker” page, users can also compare the analyses of two keywords by
clicking on the “Compare” button.
Once a user clicks on the “Compare” button, a pop-up appears with a list of the keywords
already added, from which the user can select exactly two keywords to compare.
On the analysis page for comparing keywords, the following will be provided:
- A double line graph or bar chart showing the number of posts extracted in the last
seven days on both keywords.
- A pie chart showing the cumulative sentiment analysis of the tweets extracted for
both keywords
- A pie chart showing the device used to make the tweets for both keywords
- A pie chart showing the types of tweets extracted (Original Tweets, Retweets or
Replies) for both keywords
- Extra information on both keywords.
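For the double line graph in the list above, the two keywords need daily counts over the same seven-day window, including days on which a keyword had no posts. The following is a minimal sketch of that alignment. It assumes the posting dates of each keyword's tweets are already available as `datetime.date` values. The names `daily_counts` and `compare_keywords` are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(post_dates, today, days=7):
    """Return (day, count) pairs for the last `days` days, oldest first."""
    counts = Counter(post_dates)
    window = [today - timedelta(days=offset) for offset in range(days - 1, -1, -1)]
    # Days with no posts are kept as zero so both series share one x-axis.
    return [(day, counts.get(day, 0)) for day in window]


def compare_keywords(dates_a, dates_b, today, days=7):
    """Align the daily counts of two keywords as (day, count_a, count_b) rows."""
    series_a = daily_counts(dates_a, today, days)
    series_b = daily_counts(dates_b, today, days)
    return [(day, a, b) for (day, a), (_, b) in zip(series_a, series_b)]
```

Each returned row holds one day and the two counts for that day, so the rows can be plotted directly as two lines or as grouped bars.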
On the “Historic Data” page, the keywords added by the user will be displayed with a
dropdown option which allows users to gain access to up to 1000 tweets based on a
keyword.
On the analysis page for Historic Data, the following will be provided:
- A line graph/bar chart showing the number of posts extracted monthly.
- A pie chart showing the cumulative sentiment analysis of the tweets extracted.
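The two Historic Data charts above can be derived from the same list of extracted tweets. The sketch below shows one possible way to do so, assuming the posting dates are `datetime.date` values and that each tweet has already been given a sentiment label by the classifier. The label names used in the example, in particular "neutral", are assumptions, and `monthly_counts` and `sentiment_shares` are hypothetical names.

```python
from collections import Counter
from datetime import date


def monthly_counts(post_dates):
    """Count posts per (year, month), oldest month first."""
    counts = Counter((d.year, d.month) for d in post_dates)
    return sorted(counts.items())


def sentiment_shares(labels):
    """Return each sentiment label's share of all tweets, as a percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Illustrative input only; these dates and labels are made up for the example.
example_months = monthly_counts([date(2020, 5, 30), date(2020, 6, 1), date(2020, 6, 14)])
example_shares = sentiment_shares(["positive", "positive", "negative", "neutral"])
```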
On the “Social Media Account Analysis” page, users can add valid Twitter accounts by
clicking on the “Add Twitter Account” button, which adds the Twitter account
to their dashboard.
Beside each added Twitter account there is a submit button which gives users access to the
analysis of that account. If the Twitter account is invalid or private, it is deleted from
the dashboard automatically. There is also a delete button which
allows users to delete Twitter accounts from their dashboard.
On the analysis page for Twitter accounts, the following will be provided:
- Details about the Twitter account
- A grouped bar chart that shows the number of engagements on an hourly basis on the
latest 20 posts made by the Twitter account.
- A table containing the latest 20 posts by the Twitter account
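The hourly engagement chart in the list above can be pictured as a sum of engagement counts grouped by the hour at which each post was made. The following is a minimal sketch under stated assumptions: each post is a dictionary whose `created_at` value is a `datetime` object, and engagements are taken to be the `favorite_count` and `retweet_count` fields of a Twitter API v1.1 tweet object. The exact engagement measures used by Mention Feeds are not stated here, and `hourly_engagement` is a hypothetical name.

```python
from collections import defaultdict
from datetime import datetime


def hourly_engagement(posts, limit=20):
    """Sum likes and retweets per hour of day over the newest `limit` posts."""
    latest = sorted(posts, key=lambda p: p["created_at"], reverse=True)[:limit]
    totals = defaultdict(lambda: {"likes": 0, "retweets": 0})
    for post in latest:
        hour = post["created_at"].hour
        totals[hour]["likes"] += post.get("favorite_count", 0)
        totals[hour]["retweets"] += post.get("retweet_count", 0)
    # One entry per hour that has posts, ordered for the chart's x-axis.
    return dict(sorted(totals.items()))
```

Hours with the highest totals indicate when followers are most active, which is the information an influencer would use to time new posts.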
There is also a logout option for logged-in users.