
A DYNAMIC MODEL OF SOCIAL MEDIA MONITORING TOOLS WITH

SENTIMENT ANALYSIS

BY

UMAR, HAUWA
BU/17C/IT/2744

THE DEPARTMENT OF COMPUTER SCIENCE


BAZE UNIVERSITY, ABUJA

SEPTEMBER, 2020
A DYNAMIC MODEL OF SOCIAL MEDIA MONITORING TOOLS WITH
SENTIMENT ANALYSIS

THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENT


FOR THE DEGREE OF

B.Sc.
IN
COMPUTER SCIENCE [SOFTWARE ENGINEERING]

BY

UMAR, HAUWA
BU/17C/IT/2744

TO
THE DEPARTMENT OF COMPUTER SCIENCE
BAZE UNIVERSITY, ABUJA

SEPTEMBER, 2020
ACKNOWLEDGMENT
I would like to express my sincere gratitude to Allah SWT for giving me the health,
opportunity, wisdom and understanding to carry out this project. I would also like to thank my
parents, grandma and siblings for their emotional, physical, financial and moral support
throughout my university experience; they were always my source of inspiration and motivation.
I am grateful to my project supervisors, Dr Amit Mishra and Mr Charles Isah Saidu, for their
support, patience and constructive criticism throughout the implementation of this project. I
am also grateful to all the lecturers who have, directly or indirectly, made an impact on my
education at Baze University. I could mention these lecturers individually, but
the list would go on. I am forever grateful for the impact they have all had on my life.
I would also like to thank my friends for motivating me and having a positive impact on my life,
from the friendly competition among ourselves to helping each other during difficult times in
school. I will forever hold you close to my heart.
Finally, I would like to thank Baze University for grooming me both morally and
educationally and setting me up for a bright future.

DECLARATION

This is to certify that this Thesis entitled “A Dynamic Model of Social Media Monitoring
Tools with Sentiment Analysis”, which is submitted by Umar Hauwa in partial fulfilment of
the requirement for the award of the degree of B.Sc. in Information Technology to the
Department of Computer Science, Baze University Abuja, Nigeria, comprises only my
original work, and due acknowledgement has been made in the text to all other materials used.

Date: Name of Student: Umar Hauwa

APPROVED BY …………………
HOD
Dept. of Computer Science

CERTIFICATION

This is to certify that this Thesis entitled “A Dynamic Model of Social Media Monitoring
Tools with Sentiment Analysis”, which is submitted by Umar Hauwa in partial fulfilment of
the requirement for the award of the degree of B.Sc. in Information Technology to the
Department of Computer Science, Baze University Abuja, Nigeria, is a record of the
candidate’s own work carried out under my/our supervision. The matter
embodied in this thesis is original and has not been submitted for the award of any other
degree.

Date: Supervisor:

APPROVAL
This is to certify that the research work, “A Dynamic Model of Social Media Monitoring
Tools with Sentiment Analysis”, and its subsequent preparation by Umar Hauwa
(BU/17C/IT/2744) have been approved by the Department of Computer Science, Faculty of
Computing and Applied Science, Baze University, Abuja, Nigeria.

By

Dr Amit Mishra Date/Sign


1st Supervisor

Mr Charles Isah Saidu Date/Sign


2nd Supervisor

Dr. C. V. Uppin Date/Sign


Head, Department of Computer Sciences

Prof. Mohammed Hammawa Baba Date/Sign


Dean, Faculty of Computing and Applied Science

Prof Ahmed Baita Garko Date/Sign


External Examiner

DEDICATION

This project is dedicated to Allah SWT for His mercy and blessings, for guiding me throughout
my life and for putting me in a position to complete this project and chapter of my life. I also
dedicate this project to my dad Mr. Umar Hamza Kangiwa, my mum Mrs. Bilkisu Umar, my
brothers Mukhtar and Abdulmalik Umar, and my grandma Mrs. Aisha Kaoje. They have been
my support system since forever, and I am forever grateful to God for blessing me with my
family.

ABSTRACT

The proposed system is a dynamic model of a social media monitoring tool with sentiment
analysis. Social media platforms contain large amounts of data that can appear ambiguous
to businesses and organizations. Most organizations use social media platforms for
advertisement because they reach a large audience in a short period of time and are more
affordable than other options. However, businesses find it difficult to get a clear insight
into how well they have grown after each advertisement or product release. The aim of
the project is to provide relevant analyses, including sentiment analysis, on real-time
data from social media platforms for keywords or social media accounts specified by
registered users. Sentiment analysis is a process in which an algorithm is used to determine the
emotion of a text, i.e. positive, negative or neutral. It helps businesses
understand how their customers feel about their products. Businesses can also compare their
products with those of their competitors to gain a clearer insight into how their services are
doing online relative to other businesses. This application was developed using the incremental
methodology and produced the desired results for the first iteration of the
project. The application was implemented in Python with a PostgreSQL database. The
first iteration focuses on Twitter data and meets all the requirements
specified. The application passed through different types of tests, and the obtained
accuracy was 98%. Future enhancements will be made in the remaining
iterations.

TABLE OF CONTENTS

ABSTRACT vi
LIST OF TABLES IX
LIST OF FIGURES X
LIST OF ABBREVIATIONS XI

CHAPTER 1: INTRODUCTION 1
1.1 OVERVIEW 1
1.2 BACKGROUND AND MOTIVATION 1
1.3 STATEMENT OF THE PROBLEM 2
1.4 AIM AND OBJECTIVES 3
1.5 SIGNIFICANCE OF THE PROJECT 4
1.6 PROJECT RISKS ASSESSMENT 5
1.7 SCOPE/PROJECT ORGANIZATION 7
CHAPTER 2: LITERATURE REVIEW 8
2.1 INTRODUCTION 8
2.2 HISTORICAL OVERVIEW 8
2.3 RELATED WORK 9
2.4 SUMMARY 13
CHAPTER 3: REQUIREMENTS ANALYSIS AND DESIGN 14
3.1 OVERVIEW 14
3.2 PROPOSED METHODOLOGY 14
3.3 APPROACH TO CHOSEN METHODOLOGY/METHODS 19
3.4 TOOLS AND TECHNIQUES 21
3.5 ETHICAL CONSIDERATION 22
3.6 REQUIREMENT ANALYSIS 23
3.7 REQUIREMENTS SPECIFICATIONS 24
3.7.1 Functional Requirement Specifications 24
3.7.2 Non-Functional Requirement Specifications 26
3.8 SYSTEM DESIGN 26
3.8.1 Application Architecture 26
3.8.2 Use Case 27
3.8.3 Activity Diagrams 30
3.8.4 Data Flow Diagram 31
3.8.5 Entity-Relationship Diagram (ERD) 34
3.8.6 User Interface Design 35
3.9 Summary 39
CHAPTER 4: IMPLEMENTATION AND TESTING 40
4.1 OVERVIEW 40
4.2 MAIN FEATURES 40
4.3 IMPLEMENTATION PROBLEMS 43
4.4 OVERCOMING IMPLEMENTATION PROBLEMS 44
4.5 TESTING 44
4.5.1 Tests Plans (for Unit Testing, Integration Testing, and System Testing) 45
4.5.2 Test Suite (for Unit Testing, Integration Testing, and System Testing) 46
4.5.3 Test Traceability Matrix (for Unit Testing, Integration Testing, and System Testing) 74
4.5.4 Test Report Summary (for Unit Testing, Integration Testing, and System Testing) 76
4.5.5 Error Reports and Corrections 76
4.6 USE GUIDE 77
4.7 SUMMARY 77
CHAPTER 5: DISCUSSION, CONCLUSION, AND RECOMMENDATIONS 78
5.1 OVERVIEW 78
5.2 OBJECTIVE ASSESSMENT 78
5.3 LIMITATIONS AND CHALLENGES 78
5.4 FUTURE ENHANCEMENTS 78
5.5 RECOMMENDATIONS 79
5.6 CONCLUSION 80
5.7 SUMMARY 80

REFERENCES 81
APPENDICES 83

LIST OF TABLES

TABLE 1 RISK ASSESSMENT 5


TABLE 2 SENTIMENT ANALYSIS TOOLS/TECHNIQUES 12
TABLE 3.1 BRAINSTORMING REPORT 20
TABLE 3.2 FUNCTIONAL REQUIREMENT SPECIFICATIONS 23
TABLE 3.3 NON-FUNCTIONAL REQUIREMENT SPECIFICATIONS 25
TABLE 4.1 TEST SUITE PERFORMED FOR REGISTER 45
TABLE 4.2 TEST SUITE PERFORMED FOR LOGIN 47
TABLE 4.3 TEST SUITE PERFORMED TO ADD KEYWORD TO USERS' DASHBOARD 48
TABLE 4.4 TEST SUITE PERFORMED TO ADD TWITTER ACCOUNT TO USERS' DASHBOARD 50
TABLE 4.5 TEST SUITE PERFORMED TO EXTRACT LATEST 100 TWEETS FROM TWITTER BASED ON KEYWORD 52
TABLE 4.6 TEST SUITE PERFORMED TO DISPLAY DATA VISUALIZATION FOR NUMBER OF POSTS PER DAY 54
TABLE 4.7 TEST SUITE PERFORMED TO PERFORM SENTIMENT ANALYSIS OF POST EXTRACTED 56
TABLE 4.8 TEST SUITE PERFORMED TO CATEGORIZE TWEETS INTO TYPE OF DEVICE USED 57
TABLE 4.9 TEST SUITE PERFORMED TO CATEGORIZE TWEETS INTO TWEET TYPE 58
TABLE 4.10 TEST SUITE PERFORMED TO DISPLAY DATA VISUALIZATION FOR ANALYSIS CARRIED OUT 59
TABLE 4.11 TEST SUITE PERFORMED TO COMPARE TWO KEYWORDS 62
TABLE 4.12 TEST SUITE PERFORMED TO EXTRACT INFORMATION OF TWITTER ACCOUNT 65
TABLE 4.13 TEST SUITE PERFORMED TO EXTRACT USER TIMELINE 67
TABLE 4.14 TEST SUITE PERFORMED TO DISPLAY DATA VISUALIZATION FOR OPTIMIZATION OF POSTS 68
TABLE 4.15 TEST TRACEABILITY MATRIX 73
TABLE 4.16 TEST REPORT SUMMARY 75
TABLE 4.17 ERROR REPORTS AND CORRECTIONS 75

LIST OF FIGURES

FIGURE 2.1 GOOGLE TREND ON SENTIMENT ANALYSIS 8


FIGURE 2.2 SENTIMENT ANALYSIS METHODOLOGY USED IN SENTIMENT ANALYSIS OF NEWS ARTICLES: A LEXICON BASED APPROACH 11
FIGURE 3.1 PROTOTYPE METHODOLOGY 14
FIGURE 3.2 RAPID APPLICATION DEVELOPMENT MODEL 14
FIGURE 3.3 SCRUM DEVELOPMENT METHODOLOGY MODEL 15
FIGURE 3.4 WATERFALL METHODOLOGY MODEL 16
FIGURE 3.5 SPIRAL MODEL 16
FIGURE 3.6 ITERATIVE INCREMENTAL MODEL 17
FIGURE 3.7.1 MENTION FEEDS ARCHITECTURE 26
FIGURE 3.7.2.1 USE CASE DIAGRAM FOR ONE KEYWORD 27
FIGURE 3.7.2.2 USE CASE FOR COMPARE KEYWORD 28
FIGURE 3.7.3 ACTIVITY DIAGRAM 29
FIGURE 3.7.4.1 CONTEXT LEVEL DIAGRAM 30
FIGURE 3.7.4.2 LEVEL 0 DFD 30
FIGURE 3.7.4.3 LEVEL 1 DFD 31
FIGURE 3.7.4.4 PROCESS DECOMPOSITION 32
FIGURE 3.7.5 ENTITY RELATIONSHIP DIAGRAM OF MENTION FEEDS 33
FIGURE 3.7.6.1 LOGIN PAGE 34
FIGURE 3.7.6.2 REGISTER PAGE 34
FIGURE 3.7.6.3 USER KEYWORD DASHBOARD 35
FIGURE 3.7.6.4 ADD NEW KEYWORD 35
FIGURE 3.7.6.5 COMPARE KEYWORDS 36
FIGURE 3.7.6.6 USER SOCIAL MEDIA ACCOUNT DASHBOARD 36
FIGURE 3.7.6.7 ADD NEW TWITTER USERNAME 37
FIGURE 4.1 BOKEH VISUALIZATION 42
FIGURE 4.1.1 TESTING USER REGISTRATION 46
FIGURE 4.1.2 SUCCESSFUL REGISTRATION 46
FIGURE 4.3.1 TESTING LOGIN PAGE 48
FIGURE 4.3.2 LOGIN SUCCESSFUL 48
FIGURE 4.4.1 POP UP TO ADD KEYWORD 49
FIGURE 4.4.2 KEYWORD SUCCESSFULLY ADDED 50
FIGURE 4.5.1 TEST TO ADD NEW KEYWORD 51
FIGURE 4.5.2 TWITTER ACCOUNT SUCCESSFULLY ADDED 52
FIGURE 4.6 TWEETS EXTRACTED SUCCESSFULLY 53
FIGURE 4.7.1 LINE GRAPH FOR EXTRACTED POSTS 55
FIGURE 4.7.2 BAR CHART FOR EXTRACTED POSTS 55
FIGURE 4.8.1 SENTIMENT ANALYSIS VISUALIZATION 60
FIGURE 4.8.2 DEVICE USED VISUALIZATION 60
FIGURE 4.8.3 POST TYPE VISUALIZATION 61
FIGURE 4.9.1 TEST TO COMPARE KEYWORDS 63
FIGURE 4.9.2 COMPARE KEYWORDS DOUBLE LINE GRAPH 63
FIGURE 4.9.3 COMPARE ANALYSIS VISUALIZATION 64
FIGURE 4.9.4 EXTRACT TWITTER ACCOUNT INFORMATION 66
FIGURE 4.9.5 USER TIMELINE EXTRACTED 68
FIGURE 4.9.6 DATA VISUALIZATION FOR POST OPTIMIZATION 69
FIGURE 4.9.7 HISTORIC DATA PAGE 71
FIGURE 4.9.8.1 HISTORIC DATA GRAPH 71
FIGURE 4.9.8.2 HISTORIC DATA SENTIMENT ANALYSIS 72

LIST OF ABBREVIATIONS

CPU Central Processing Unit

ERD Entity Relationship Diagram

IT Information Technology

API Application Programming Interface

HTML HyperText Markup Language

CSS Cascading Style Sheets

CHAPTER 1:
INTRODUCTION
1.1 Overview
The internet is becoming more and more embedded in people's day-to-day activities and
lives. Social media platforms are used around the clock by users
from all over the world to express their opinions freely on different topics.
Many brands and companies take advantage of these platforms to create strategies and
understand their target market, for example by advertising and marketing through influencers
on Instagram, Twitter, Facebook etc.
Social media monitoring is the process of identifying data relevant to a topic or organization
on a social media platform and analyzing that data for different purposes,
such as academic research, decision making and marketing strategy.
This project will show how social media has helped, and will continue to help, businesses analyze
the opinions of customers on their products so as to improve their marketing strategies
and know what their target market wants (Amandeep, Deepesh, Khushboo, & Ranjit Singh,
2016).
Sentiment analysis is used to classify texts as positive, negative or neutral by using text
analysis techniques (Sentiment Analysis, 2020).
By using sentiment analysis to predict the emotions expressed in words, sentences or documents,
this work can gain an overview of the wider public opinion behind a product. A lot of
people use social media sites for networking and for news. Social media
provides a platform for people to voice their opinions. For example, a person might have had
a very hot day, bought a refreshing bottle of Coke and posted a picture of
themselves on Twitter with a caption about it. This kind of information can be used by
organizations to evaluate and rate the performance of their products around the world as
either successful or not (Khalid & Ahmad, 2017). This is one of the many features the project will
provide.

1.2 Background and Motivation
Previous methods of advertising, marketing and understanding the opinions of the masses
were very time-consuming and not always efficient. Social media platforms have
provided a cheaper and more efficient way to get an organization's products and services out to
the masses. With this development, analyzing how well these products are doing in the
market is now possible, since many social media platforms provide APIs that give the public
access to their data.
TweetReach is a social media tool built specifically for Twitter. It performs analysis based on
keywords, URLs or account names and gives an overview of all the tweets posted.
The advantages of using a social media monitoring tool are the following:
- Users can get an overview of what people are saying about their products and how many
mentions their products are receiving. The tool can break down who is making the
posts by gender, location, source of the post etc.
- Users get real-time analysis of their products, including the number of
positive and negative reviews and comments people post about them. This can help
the organization decide the next step to take with its marketing strategies (Sentiment
Analysis, 2020).
- Users can also keep track of their competitors.

This project is not limited to organizations or businesses; academic scholars and people
in politics can also use it to perform analysis in their respective fields.
These are some of the advantages and features that the implementation of this project offers
to academia, industry and government.

1.3 Statement of the Problem


In the past, companies, organizations and brands had to use radio, television ads, newspapers,
printouts etc. to get their products and services to the market. Social media has
helped businesses advertise and market their products and services in better ways by
improving customer insight: businesses can use the platforms to easily understand what
customers like and dislike on a personal level.
It also helps to establish brand awareness, as most social media platforms have millions of users.
This can help a business reach a wide range of customers locally and internationally.
It is also very cost-effective. Previous methods such as billboards,
television ads and newspapers usually cost a lot,
whereas getting products into the market through social platforms costs almost nothing
by comparison (Khanna, 2018).
Political candidates likewise find it difficult to obtain accurate data on what society
feels about them and how their campaigns or adverts have affected people’s view of
them.

The main problem is the difficulty individuals and organizations face in accessing data about
their products and in analyzing this data to know how successful or unsuccessful their products or
strategies are. This project gives its users access to that information.

1.4 Aims and Objectives


Companies, organizations, influencers, political candidates and many others find it
difficult to analyze data or information on the internet that is relevant to them. This thesis as
a whole aims to provide this data from various social media platforms so as to analyze it
and give users an insight into what social media users feel about them or their
products/services. Due to the time constraints of this project, this particular work will
focus mainly on Twitter.
This project aims to find the most accurate method of sentiment analysis to help achieve
the following:
1.) For brands or keywords:
- Extract tweets containing the particular keyword requested by the user.
- Find the most efficient method for sentiment analysis on each tweet and then give an
average of positive, negative and neutral tweets.
- Show the devices used to post the tweets.
- Show whether tweets are originals, replies or retweets.
- Compare different brands, e.g. iPhone vs Samsung.
- Show the number of engagements on a page.
- Show the sources of posts (e.g. iPhone, Android, Web App etc.).
- Compare two keywords and provide a graph showing the number of posts for each keyword.
- Show the number of followers and following of a Twitter account.
- Show the location of tweets.
- Develop graphs to give a clearer insight into the analysis being carried out.
2.) For a Twitter account:
- Show the number of followers and the number of accounts followed.
- Show the types of post made by the user (i.e. whether reply, original or retweet).
- Show the top posts with their number of likes and retweets.
- Show the amount of engagement on the user's page.
- Develop a graph that shows the best time to make a post in a day based on the
previous day.
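
Two of the account-level items listed above, the breakdown of posts by type and the best hour to post, can be illustrated with a minimal Python sketch. The record layout (`is_retweet`, `reply_to`, `hour`, `likes`, `retweets`) and the function names are assumptions made purely for illustration; they are not the project's actual schema or the Twitter API payload, and "best hour" is taken here to mean the hour with the highest total of likes plus retweets.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified tweet records. The field names are assumptions made
# for this illustration; they are not the project's schema or Twitter's payload.
SAMPLE_TWEETS = [
    {"is_retweet": False, "reply_to": None, "hour": 9, "likes": 12, "retweets": 3},
    {"is_retweet": True, "reply_to": None, "hour": 14, "likes": 2, "retweets": 1},
    {"is_retweet": False, "reply_to": "12345", "hour": 18, "likes": 5, "retweets": 0},
    {"is_retweet": False, "reply_to": None, "hour": 18, "likes": 20, "retweets": 7},
]


def post_type(tweet):
    """Label one tweet as 'retweet', 'reply' or 'original'."""
    if tweet["is_retweet"]:
        return "retweet"
    if tweet["reply_to"] is not None:
        return "reply"
    return "original"


def type_breakdown(tweets):
    """Count how many tweets fall into each post type."""
    return Counter(post_type(t) for t in tweets)


def best_hour(tweets):
    """Return the hour of day with the highest total engagement, or None if empty."""
    engagement = defaultdict(int)
    for t in tweets:
        engagement[t["hour"]] += t["likes"] + t["retweets"]
    if not engagement:
        return None
    return max(engagement, key=engagement.get)
```

Calling `type_breakdown(SAMPLE_TWEETS)` and `best_hour(SAMPLE_TWEETS)` on the previous day's posts would give the figures that a post-type chart and a best-time graph could be drawn from.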

OBJECTIVE OF THE PROJECT


1.) To find the most accurate method of sentiment analysis.
2.) To develop interactive visualizations for the analysis performed on the data.
3.) To perform analysis on data obtained based on either keywords or Twitter accounts.

These are the aims and objectives of the current project. As Chapter 3 explains,
the project is separated into iterations, with an enhancement being made at every
iteration.

1.5 Significance of the Project


The implementation of this project has potential benefits for the academic field and society.
For society, this research will enhance the marketing strategies used by organizations
and companies in Nigeria by quickly extracting insights from large volumes of text data. Product
sentiment analysis will provide a platform for companies and brands to understand the opinions
of the masses on their products and advertisements.
It provides some answers as to what the most important issues are, at least from the perspective
of customers. Because sentiment analysis can be automated, decisions can be made
based on a significant amount of data rather than plain intuition, which isn’t always right
(Dumbleton, 2018).
Sentiment analysis also draws on data science, which is a vital branch of computer science.
Implementing this project in the country will enhance data management and manipulation,
that is, understanding the importance of data and how it can help in business analytics and
intelligence as well as many other sectors.
Social media monitoring helps individuals and organizations gain access to data available
on the internet and analyzes the extracted data to give a better insight into their
topic.
1.6 Project Risk Assessment
This section identifies problems that may occur at any point during the project
and suggests possible solutions and prevention techniques.

Table 1: Risk Assessment

S/N | Risk event | Risk timeframe | Probability (out of 100) | Impact | Factors
1.6.1 | Time efficiency | Undefined | 40 out of 100 | Incomplete project | Lack of time management
1.6.2 | Scope creep inflates scope | 20-30 days after project has started | 50 out of 100 | Incomplete project | Changes from lecturers or supervisors
1.6.3 | Dependencies are inaccurate | Undefined | 70 out of 100 | More time or cost spent | Lack of proper planning
1.6.4 | Design isn’t feasible | 10-20 days after project has started | 10 out of 100 | Incomplete project | May be excessively costly; lack of skills
1.6.5 | Loss of hardware/software data or resources | Undefined | 50 out of 100 | Setback in project | Improper maintenance of the system etc.

1.6.1 Time Efficiency


The time given to complete this project may not be sufficient. This can affect the
completion of the product and lead to a poor-quality product. It can be due to poor
time management or the size of the project.
Solution: Have a proper Gantt chart and work plan that explains each task and its
dependencies from start date to end date.

1.6.2 Scope creep inflates scope


Changes in requirements when the project has reached a particular stage in the
development process can be very expensive in both resources and time, and can cause
a setback in the project. This can be caused by insufficient review of the project topic by
the lecturer, or by some requirements turning out not to be feasible halfway through
the implementation.
Solution: Perform proper feasibility studies before implementing the
project.
1.6.3 Dependencies are inaccurate
For any successful project, a work plan has to be created which shows the step-by-step
procedure to complete the project. Dependencies are used to show tasks that have to be
completed before another task can be performed. Some steps must be done
before others, and if this order isn’t followed it can cause setbacks in the project. This
is usually a result of improper planning at the beginning of the project.
Solution: Re-evaluate the work plan and Gantt chart for errors and anomalies.

1.6.4 Design isn’t feasible
This is when a feature in the project is not possible due to lack of either skills or time. It
can lead to an incomplete project or to time being spent re-evaluating the project.
Solution: This risk is caused by improper planning and feasibility study, so these
stages should be taken seriously, as they affect the end product of the project.

1.6.5 Loss of hardware/software resources


The system in use may crash or develop faults, which can cause a major setback in the
development process or delivery of the project.
Solution/Prevention: Use services like GitHub and Google Drive to back up code in case
of such issues, and also use Google Drive to save all project materials in the cloud,
where they can be retrieved from any system.

1.7 Scope/Project Organization


The rest of this report describes how the project is implemented. Chapter 2 is
a literature review that presents research works relevant to this project and shows how far
the field of sentiment analysis and opinion mining has come. Chapter 3 presents the
methodology and approach used to carry out this project, including the
UML diagrams, which give a clearer picture of what the functions are and how they
will be implemented. Chapter 4 covers implementation; this is where the code is
written and tested. Chapter 5 is the conclusion, which assesses whether the project's
goals were met and gives recommendations for future work.

CHAPTER 2:
LITERATURE REVIEW

2.1 Introduction
A lot of research work has been done on social media monitoring and sentiment
analysis. This section presents previous research works and products in the fields of sentiment
analysis, social media monitoring and data mining. Some of the research works have also been
briefly tabulated in Table 3 and Table 4.

2.2 Historical Overview


Social media monitoring is an important research area in data science. There has been
interest in opinion mining since people started talking, from simple feedback from a friend
on an outfit to what customers think about the latest iPhone product. Sentiment analysis and
opinion mining play a crucial role in the marketing, advertising and design of products in
organizations and businesses. There has been a remarkable increase in the amount of research
work done on sentiment analysis and opinion mining. By 2016, more than 7000 papers had been
published, and 99% of them were published after 2004 (Mika, Daniel, & Kuutila, 2016).
Figure 2.1 shows how searches for “Sentiment Analysis” on the Google search engine have
increased over time (Mika, Daniel, & Kuutila, 2016).

Figure 2.1: Google Trends on Sentiment Analysis. Google Trends
(http://www.google.com/trends) data showing the number of searches for the string “Sentiment
Analysis” worldwide on the Google search engine.

The first academic work on sentiment analysis, or on a topic related to it, appeared during
World War II with politics as its motivation (Stagner, 1940). Modern sentiment analysis started
in the 21st century and focused mainly on product reviews (Dave, Lawrence, & Pennock, 2003;
Mika, Daniel, & Kuutila, 2016). It has grown to cover other topics such as prediction of
markets, reactions to terrorist attacks etc. (Nassirtoussi, Aghabozorgi, Wah, & Ngo, 2014;
Mika, Daniel, & Kuutila, 2016). There has also been research work on irony detection and
multi-lingual support (Mika, Daniel, & Kuutila, 2016). Further research has been done on
differentiating negative emotions such as anger and grief (Cambria, Gastaldo, Bisio, & Zunnio,
2015).
In the 21st century, a lot of people own phones with internet access. The
internet is used for many purposes, including receiving and posting data. Social
media platforms, e-commerce websites etc. have made it very easy for users to post their
opinions on topics, products etc. A lot of research has been done on sentiment analysis; it
is discussed in Section 2.3.

2.3 RELATED WORKS

2.3.1 Social media monitoring research works

The first social media site, SixDegrees.com, was created in 1997. However, it was not used by just
anyone and did not perform the common tasks of today's social media platforms; it
was used to send messages within networks. After SixDegrees, many other social media
platforms surfaced over the years, such as Friendster, Myspace, Facebook, Twitter etc. The
use of these platforms has increased to the point where it has become a daily
routine for the majority of the world’s population. As a result, social media platforms
contain a lot of data that can be analyzed for various reasons. This section summarizes
articles and products on social media monitoring tools that are relevant to this project.

2.3.1.1 Social media monitoring: Aims, methods and challenges for international companies

This research paper presents the aims, objectives and challenges of social media monitoring for
international companies. It discusses the different methods of monitoring topics or
accounts on social media. Some of the challenges one might face include deciding what to
track and identifying the metrics to be analyzed. There are also ethical constraints, because
some influencers would not like to be monitored by companies (Boyang & Marita, 2014).

2.3.1.2 Social Media Monitoring: An Innovative Intelligent Approach


This article proposes the use of artificial intelligence to analyze social media posts. It
explores how organizations can use an intelligent approach to gain a deeper and
clearer insight into how customers feel about their products and/or services (Emmanouil, George,
& Ioannis, 2019).

2.3.2 Sentiment Analysis research works


A lot of research work and many products have also been produced on sentiment analysis and opinion
mining. Social media has played a huge role in sentiment analysis research because it provides the
large amount of data that sentiment analysis requires. Politicians, companies,
influencers etc. are now interested in how people portray them on social media. This
section summarizes some research papers and products on sentiment analysis and opinion
mining. Table 2.1 lists research work done on sentiment analysis for products.

2.3.2.1 Sentiment Analysis: Adjectives and adverbs are better than Adjectives alone
This paper takes a linguistic approach to sentiment analysis, on the premise that using both
adjectives and adverbs to assign polarities gives better sentiment analysis than
using adjectives alone (Farah, Carmine, Antonio, Diego, & Subrahmanian, 2006). The research
worked with three methods: Variable Priority Scoring, Adjective Priority Scoring and Adverb
First Priority Scoring. After experiments on about 200 documents from news sources, the results
showed that Adjective Priority Scoring produced the best result with a weight of 35% (Farah,
Carmine, Antonio, Diego, & Subrahmanian, 2006).

2.3.2.2 Opinion Digger: Unsupervised Opinion Miner from Unstructured Product Reviews
Opinion Digger is another sentiment analysis solution, introduced by Samaneh and Ester
(Samaneh & Martin, 2010). It is an unsupervised opinion miner that provides a summary of
ratings for the major aspects of a product, where an aspect is defined as a feature of the
product; e.g. for a phone it could be its battery life or camera. It gives the user an insight
into the quality of the product and allows users to compare it with similar products
(Samaneh & Martin, 2010). It
works at sentence level. The approach consists of two phases: extracting product
aspects and estimating aspect ratings (Samaneh & Martin, 2010). It was successful at product
rating at the aspect level, with a reported success of 0.51. Its drawback is that it only
works with known data (Najma, Pintu, Monika, Sourabh, & S.K., 2019).

2.3.2.3 Sentiment Analysis of News Articles: A Lexicon based Approach


This is another sentiment analysis research work, by Soonh, Bakhtawer and Fatemah. In this
article they use a lexicon-based approach, which entails using a dictionary of opinion words
that are each attached to a polarity, then looking for words in the sentence that match words
in the dictionary and finding their polarity (Soonh, Bakhtawer, & Areej, 2019). The methodology
used for their experiment is shown in Figure 2.2.

Figure 2.2: Sentiment Analysis methodology used in Sentiment Analysis of News Articles: A
Lexicon based Approach (Soonh, Bakhtawer, & Areej, 2019)

Over 1000 news articles were used in this experiment and were classified as positive, negative
or neutral, with a score of +1 for positive and -1 for negative. The problem with this approach is
limited word coverage, which requires the lexical database to be constantly updated with new words
and their semantics (Soonh, Bakhtawer, & Areej, 2019).
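
The lexicon-based idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the cited authors' implementation: the six-word `LEXICON`, the tokenizer and the function names are all hypothetical, and the sketch simply sums word polarities and labels the text by the sign of the total.

```python
import re

# Tiny illustrative lexicon (an assumption for this sketch); a real opinion
# lexicon would contain thousands of words, each attached to a polarity.
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "poor": -1, "hate": -1}


def polarity(text, lexicon=LEXICON):
    """Sum the polarity of every word in the text that appears in the lexicon."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def classify(text, lexicon=LEXICON):
    """Label the text by the sign of its summed polarity."""
    score = polarity(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With this lexicon, `classify("This phone is awesome")` is labelled neutral because "awesome" is not in the dictionary, which illustrates the limited word coverage problem noted above.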

2.3.2.4 A Joint Model of Feature Mining and Sentiment Analysis for Product Review
Rating
This is a machine learning sentiment analysis solution introduced by Jorge et al. “It focuses on
measuring the polarity and strength of opinions in product reviews” (Jorge, Laura, Pablo, &
Alberto, 2011). The approach has four steps. First,
it recognizes the features the user cares about in a specific product. Secondly, it finds reviews
in which those features are mentioned. Then, it checks the polarity of each review. Finally,
based on the polarity, it assigns a score to each review, represents the reviews as a Vector of
Feature Intensities and feeds them into an algorithm that predicts the rating (Jorge, Laura,
Pablo, & Alberto, 2011). The drawback of this work is its reliance on WordNet; it achieves a
prediction accuracy of 71% (3 categories) and 46.9% (5 categories) (Najma, Pintu,
Monika, Sourabh, & S.K., 2019).

2.3.1.5 Machine Learning-Based Sentimental Analysis for Twitter Account


The aim of this research paper is to find the best machine learning technique for sentiment
analysis among SentiWordNet, TextBlob and W-WSD. The Twitter API was used to gather
sufficient data to carry out the research. The results show that TextBlob has 76% accuracy
while SentiWordNet has 54.75% accuracy (Ali, Sana, Ahmad, & Shahaboddin, 2018).

Table 2.1: Sentiment Analysis Tools/Techniques

2.4 Summary
The research work reviewed in this chapter discusses the challenges faced in monitoring social
media platforms due to issues such as privacy and vagueness of data. The proposed solution will
therefore ensure that data extracted from social media platforms is less ambiguous and more
meaningful to prospective users. Section 2.3.2 discusses the various techniques used to perform
sentiment analysis and identifies the best option for delivering the most accurate result within
budget and time.
Chapter 3 presents the requirement analysis and the proposed methodology used to develop this
project.

CHAPTER 3:
REQUIREMENT ANALYSIS AND DESIGN
3.1 OVERVIEW
The aim of the chapter is to give a detailed breakdown of the requirement elicitation, analysis
and design of the system. The proposed methodology used for the project will be explained in
detail in this chapter as well as the functionalities and how they are being carried out with the
help of UML Diagrams.

3.2 PROPOSED METHODOLOGY


The purpose of Mention Feeds is to provide a platform where registered users can gain insight
into what people on social media platforms feel about their brands, and to provide an analysis
of users’ accounts. It will be connected to a database that stores each user’s personal
information as well as the keywords (brands) and social media accounts the user is interested in.
It will also use the Twitter API to extract tweets relevant to a user’s keyword as well as
information about a user’s account. This project has four parts: web development, data mining,
sentiment analysis and data visualization. These will be achieved using different methodologies.

3.2.1 WEB DEVELOPMENT


In software development, projects can be carried out using various methodologies such as the
Waterfall methodology, Prototype methodology, Rapid Application Development, Scrum
Development Methodology and the Spiral Model. The following subsections discuss these
software development methodologies and evaluate which one would work best for this project.

3.2.1.1 Prototype Methodology


Prototype Methodology is a software development methodology where the developers
create a sample of the project/software solution so that the customers and the development
team can have a glimpse of how the end product will work before the final product is
developed. This methodology can be advantageous because it helps with requirement
gathering and reduces the chances of failure. On the downside, it increases management
cost, and customers might become too involved in the project, which could delay its
completion and cause a lot of changes that alter the workflow.

Figure 3.1: PROTOTYPE METHODOLOGY

3.2.1.2 Rapid Application Development Methodology


Rapid Application Development Methodology is a software development methodology
that makes release of product solutions at a rapid pace a priority. Unlike some other
methodologies that try to be rigid and work solely on the plan from the beginning, this
methodology gives room for changes at a low management cost. This encourages code
reuse and also allows for better risk management. However, it does not work with
projects that have a small budget and also requires a skilled team to handle complexities
in the project. (K, 2018)

Figure 3.2: Rapid Application Development Model (K, 2018)

3.2.1.3 Scrum Development Methodology
Scrum Development Methodology is another software development methodology and is
considered an agile software development approach. It is similar to the Rapid Application
Development methodology in that it also aims to develop software solutions in a short
period of time, with changes made through iterations. It gives room for improvement and
updates. However, it does not work well with large projects and it requires extremely
skilled experts on the team.

Figure 3.3: Scrum Development Methodology Model (ZAYNABZAHRA, 2017)

3.2.1.4 Waterfall Methodology


Waterfall methodology is a traditional method for software development in which the
processes are carried out linearly: each stage must be completed before proceeding to
the next stage. It is usually used for large projects, but it does not give much room for
maintenance or updates.

Figure 3.4: Waterfall Methodology Model (K, 2018)

3.2.1.5 Spiral Model


Spiral Model is a software development model that uses iteration for the development of
the software solution. It combines the waterfall model with the iterative development
process model. It works very well with projects that have very high risk and gives room
for functionalities to be added later. Its disadvantages are that it can be very costly and
may lead to serious damage and setbacks if risk analysis is not carried out properly.

Figure 3.5: Spiral Model (Tutorialspoint, 2020)

3.2.1.6 INCREMENTAL MODEL
This is a method of developing software solutions that uses iterations to create
different releases of the project, i.e. the project is broken down into multiple standalone
modules. It allows software development to be quick and flexible. However, it
requires good planning. This methodology will be used in this project because it gives
room for improvement and allows for quick development of software solutions. The web
app will have a lot of functionalities when the end product is completed, as it will
integrate data mining from Twitter, Facebook and Instagram. These cannot all be completed
within the deadline. This model allows a thorough understanding of the
goal of the web app and breaks it down into mini projects to be completed. Figure 3.6
shows how the project is going to be carried out in iterations after the important
functionalities are defined.

Figure 3.6: Iterative Incremental Model (Tutorialspoint, SDLC - Iterative Incremental Model -
Tutorialspoint, 2020)

3.2.2 SENTIMENT ANALYSIS


Sentiment Analysis is the process of having a machine classify phrases, sentences or documents
into positive, negative or neutral. It is basically used to predict or get
In sentiment analysis, there are also various methodologies/approaches such as rule-based
approach and automatic approach.

- Machine Learning: This is when a model is trained on a dataset so that machines can learn
and improve on knowledge through experience using algorithms (Team, 2017). To begin
analysis, labeled datasets are created. Feature spaces are then used to show which words
are in the document and which are not, using keywords. This is important so that only
words that are useful for the analysis are kept. A supervised algorithm is then used to
create a model. The advantage is that it gives a better result than the lexicon-based or
rule-based approaches.
- Rule based: This is when words are given polarity individually. The polarities of all the
words in the sentence are combined and averaged. The end result then determines whether
the sentence is positive or negative. (Andrew, et al., 2011)
- Lexicon based: It uses semantic orientation by measurement of opinion and subjectivity
of reviews, then generates sentiment polarity (Mika, Daniel, & Miikka, 2018). This uses
dictionaries where words are tagged as negative or positive, then goes through
documents to count the number of positive and negative words in each document. Each
document is then tagged as either positive or negative. The advantage of this approach is
that it requires no labels.
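
To make the feature-space step of the machine learning approach more concrete, the sketch below builds a binary presence/absence vector for each document. It is only an illustration under stated assumptions: the two labelled phrases and the function names `build_vocabulary` and `to_feature_vector` are hypothetical, and a real system would pass such vectors to a supervised algorithm rather than stop here.

```python
def build_vocabulary(documents):
    """Collect the distinct words across all documents in a stable (sorted) order."""
    return sorted({word for doc in documents for word in doc.lower().split()})

def to_feature_vector(document, vocabulary):
    """Mark each vocabulary word as present (1) or absent (0) in the document."""
    words = set(document.lower().split())
    return [1 if term in words else 0 for term in vocabulary]

# Hypothetical labelled examples; a real training set would be far larger.
labelled = [("love this phone", "positive"), ("hate this phone", "negative")]
vocabulary = build_vocabulary(text for text, _ in labelled)
features = [to_feature_vector(text, vocabulary) for text, _ in labelled]
print(vocabulary)
print(features)
```

In this toy case the two vectors differ only in the positions for "love" and "hate", which is exactly the information a supervised algorithm would learn from the labels.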

Based on the research reviewed in Chapter 2, machine learning sentiment analysis delivered the
most accurate result.

3.3 APPROACH TO CHOSEN METHODOLOGY/METHOD


We have decided to go with the incremental methodology because this project has three different
stages. The aim of the project as a whole is to create a web app that gives its users insight into
data gathered from the most popular social media platforms in relation to what is relevant to them.
Due to lack of time and proper funding, this thesis will focus on just one iteration. The list below
gives a little insight into what each iteration aims to achieve.
- Iteration 1: This will focus on data extracted from Twitter. Twitter is one of the most
popular social media apps of the 21st century. It has over 300 million users (Lin, 2019),
with an average of 100 million active users on a daily basis (Lin, 2019). This means
Twitter holds a lot of data that can be useful for data analysis. Twitter provides an API
that allows developers to gain access to its posts. Due to the lack of funding, we are
only able to access the free API, which provides limited but still substantial data for our
analysis.
- Iteration 2: This will focus on data extracted from Facebook. Facebook is also among
the most popular social media apps of the 21st century. Facebook also provides an API for
extracting data from pages or events. Pages are used by brands or businesses to
connect with their customers, followers or fans. Analysis can also be done on the data
extracted from here to know whether posts are receiving positive or negative feedback.
- Iteration 3: This will focus on data extracted from Instagram. Instagram is a social
media platform that allows its users to make posts and comment on posts. Instagram is
also owned by the owners of Facebook, and an API is also available to the public for
analysis. It will be added to this web app to give more insight into what people feel
about a topic on the most popular social media platforms.
- Iteration n: Future enhancements.

The next section shows the method of requirement elicitation for iteration 1.

3.3.1 METHOD 1 (BRAINSTORMING)


Brainstorming is a process of critical thinking undertaken to derive a permanent or
temporary solution to a particular problem. It involves giving a clear definition of the
problem and coming up with ideas to solve it.
Definition of problem: There is a lot of publicly generated data available on the internet, on
social media platforms in particular. However, this data is very ambiguous. There needs to
be a way for people to get data relevant to them, as well as analysis that gives them insight into
the topic they are interested in.
Table 3.1 below shows the various ideas as well as their scores based on the following criteria:
Time efficiency: Whether the idea can be achieved within the time available for the project.
1 being very bad and 5 being very good.
Feasibility: Whether the developer has the skills, as well as the tools/funding, to carry out
the solution. 1 being very bad and 5 being very good.
Risk analysis: How risky it would be to embark on this solution. 1 being very risky and 5 being
moderately okay.

Table 3.1: Brainstorming Report

| SN | Ideas | Time Efficiency /5 | Feasibility /5 | Risk Analysis /5 | Total /15 |
|----|-------|--------------------|----------------|------------------|-----------|
| 1 | Using web scraper to scrape data from social media and filtering them by keywords | 2 | 5 | 1 | 8 |
| 2 | Developing a data warehouse that stores all data gotten from different social media platforms and analyzing them | 1 | 3 | 3 | 7 |
| 3 | Using social media APIs to extract data and performing analysis based on users’ requests | 3 | 5 | 4 | 12 |

From the table above, the safest and most feasible approach is to use social media APIs to
extract data based on keywords, run analysis on this data and present the results to users
in a web application.
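
The totals in Table 3.1 are simply the sum of the three criterion scores for each idea. A short sketch of that arithmetic is given below; the shortened idea labels and variable names are illustrative, while the scores are those listed in the table.

```python
# Scores copied from Table 3.1: (time efficiency, feasibility, risk analysis), each out of 5.
ideas = {
    "Web scraper filtered by keywords": (2, 5, 1),
    "Data warehouse of all social media data": (1, 3, 3),
    "Social media APIs queried per user request": (3, 5, 4),
}

totals = {idea: sum(scores) for idea, scores in ideas.items()}
best = max(totals, key=totals.get)  # idea with the highest total out of 15
print(totals)
print(best)
```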

3.3.2 METHOD 2 (DESK RESEARCH)


Desk research, in literal terms, is research done while sitting at a desk using resources such
as the internet, articles and journals. It is also known as secondary research, where individuals
study other people’s work to gain a broader understanding of their projects. See Chapter 2 for
how this was carried out.

3.4 TOOLS AND TECHNIQUES


Flask is a micro web framework written in Python. It provides pre-written Python code that
makes it easier to develop web applications, allowing us to focus on what users are requesting
and the response that should be returned once a request has been made. It provides a platform
for creating interactive web applications using different Python libraries together with HTML,
CSS and JavaScript. The resulting application can be accessed from browsers such as Chrome,
Safari, Opera Mini and Edge.
Flask is an appropriate framework for this project because it is very flexible and allows the use
of third-party tools and libraries. It is easy to learn and works with libraries such as SQLAlchemy,
which helps manage the database without having to write long query syntax. This is one of the
many advantages of using the Flask framework.
IDE (Integrated Development Environment): The IDE used for the development of this web
application is Visual Studio Code. It is a code editor that supports different programming
languages and operations such as debugging, task running and version control. Different types
of files, such as .html, .py, .css and .js, can be created on this platform and linked together.
It helps coordinate all the different files in one place.
Third-party APIs and libraries: Various APIs and libraries were used for the development of
Mention Feeds and they are listed below.
- Twitter API: Twitter created a platform for developers to gain access to the data
on its platform. It provides a free API for the public to use, with limited resources.
Tweepy is a Python library that allows users to extract information from Twitter.
- Bokeh: This is a data visualization tool written in Python that is used to represent the
analysis diagrammatically, for example as line graphs, bar charts and pie charts.
It is a very good tool because it is responsive, meaning the size of the graphs
adjusts to the size of the screen, so it can be used on both desktops and mobile
phones.

3.5 ETHICAL CONSIDERATION


This section explores and explains some ethical constraints that should be considered to ensure
the application is not harming its users or providing any data that can be used for illegal
activities. Because this project focuses on Twitter, and the Twitter API provides a lot of
information that can cause harm to its users if not handled properly, we will focus on some of
the terms and conditions for using the Twitter API.
a.) Location Data: The API allows users to get the geo coordinates of where a tweet was
made. However, users of the API are not allowed to store this information or allow others
to have access to it. No one is allowed to use the data on a standalone basis.
b.) Security: When using the API, users are given token keys and passwords to gain access to
the Twitter data. Users are not allowed to share these credentials with a third party and
must ensure they are secured to prevent hackers from stealing them.
c.) Confidentiality: The API may give users access to confidential information that is not
made public, and users of the API should keep it confidential unless they have gained
the right to do otherwise.
d.) Permission: If an application wishes to make posts or perform specific activities on a
user’s behalf on their Twitter account, it must first obtain consent from the user,
giving details of what it plans to do.

The above are some of the terms and conditions given to developers when they are granted
access to the Twitter API. Now, we will look at the ethical considerations that were observed
when designing the website.
a.) Confidentiality of user’s information: Any confidential information provided by the user
will be stored in hashed form in order to reduce the chances of the information being stolen
by hackers or cyber criminals.

This project deals with a lot of data and it is important to ensure personal information is not
accessed by unauthorized parties.

3.6 REQUIREMENT ANALYSIS


The requirement of this project was elicited from the information gathered during brainstorming
and desk research. Social media platforms contain a lot of data provided by their millions of
users.
Social media is the new trend of the 21st century and a lot of small and large organizations have
made use of these platforms to their advantage as it provides a cheaper platform for them to
make advertisements and marketing strategies. However, after these strategies are carried out to
get a company’s brand out to the public, understanding the data gotten back from users’ posts on
social media can be quite overwhelming for individuals to carry out analysis manually.
The solution to this is to create an application where organizations can add a brand name to the
dashboard. The application then extracts posts from social media platforms and performs
sentiment analysis on each post to understand whether that brand is getting positive, negative or
neutral feedback from the public. It will also show the type of device used to make each post and
the type of post made (i.e. original tweet, retweet or reply).

Organizations might also desire to compare a keyword(brand) to another keyword in terms of
their sentiment analysis, to know how their competitors are performing on these social media
platforms as well. Analysts can also use this functionality to compare two topics for research
purposes.
These organizations or small businesses would also like to know the best time to make a post on
these social media platforms on their accounts. This functionality will also be available which
would extract the latest 20 posts made by the user and show the time each post was made and the
number of engagements (likes and retweets) for each post.

3.7 REQUIREMENTS SPECIFICATIONS


This section will define the requirements that are necessary for the system to be considered a
success (functional requirements) and also the requirements that help improve the quality of
the app (non-functional requirement).

3.7.1 Functional Requirement Specifications


Table 3.2: Functional Requirement Specifications

| Req. No. | Description | Type |
|----------|-------------|------|
| R-101 | The web application shall run on Windows, Mac, iPhone and Android | Configuration |
| R-102 | The web application shall include user interface | Functional |
| R-103 | The web application shall require access to the internet to function | Configuration |
| R-104 | The web application shall require users to register to make use of functionalities | Functional |
| R-105 | The web application shall accept inputs from users | Functional |
| R-106 | The web application shall allow users enter desired keyword/brand | Functional |
| R-107 | The web application shall allow users enter valid Twitter accounts to their dashboard | Functional |
| R-108 | The web application shall extract the top 100 posts on Twitter based on the keyword entered by a user using Twitter API | Functional |
| R-109 | The web application shall run sentiment analysis on each post extracted and give a cumulative result | Functional |
| R-110 | The web application shall also classify each post based on the type of device used to make the post | Functional |
| R-111 | The web application shall also classify each tweet into original tweet, retweet or reply | Functional |
| R-112 | The web application shall show all posts extracted by the API | Functional |
| R-113 | The web application shall show the number of tweets posted on that keyword on a daily basis | Functional |
| R-114 | The web application shall visualize the analysis carried out with pie charts, bar charts and graphs | Functional |
| R-115 | The web application shall allow users select two keywords and display analysis of each keyword side by side | Functional |
| R-116 | The web application shall run analysis on a Twitter account defined by a user | Functional |
| R-117 | The web application shall allow users get historic tweets based on their keywords | Functional |
| R-118 | The web application shall run analysis on the historic tweets extracted | Functional |
| R-119 | The web application shall extract personal information of Twitter accounts | Functional |
| R-120 | The web application shall extract a Twitter account timeline, i.e. the latest 20 posts by a Twitter account | Functional |

3.7.2 Non-functional Requirement specification
Table 3.3: Non-functional Requirement specification

| Req. No. | Description | Type |
|----------|-------------|------|
| NR-101 | When the URL is searched, the web application will run unless the user is not connected to the internet | Performance |
| NR-102 | The web application shall ensure sensitive information is secure | Security |
| NR-103 | The web application shall explain the different functionalities the web application has to offer on the page | Usability |
| NR-104 | The web application shall be user-friendly | Usability |
| NR-105 | Non registered members will not be able to access the functionalities | Security |
| NR-106 | The web application shall run well on desktop and mobile devices | Configuration |

3.8 System Design Overview


The rest of this chapter contains detailed designs that describe the web application and
its components using diagrams such as the use case diagram, activity diagram and data-flow
diagram, as well as the web application’s user interface and application architecture.

3.8.1 Application Architecture


Application Architecture is a diagrammatic representation of an application’s structure that
gives the developers a visualization of what the application may look like and how each page
connects to the others.

Figure 3.7.1: Mention Feeds Architecture

3.8.2 Use Case


A use case diagram is used to depict the interaction between users and the system. It shows the
functionalities of the system from a users’ point of view and the different activities a user can
perform.

Figure 3.7.2.1: Use case Diagram for one keyword

Figure 3.7.2.2: Use Case for Compare Keyword

3.8.3 Activity diagram
An activity diagram is used to elaborate on use cases and give a more in-depth visualization of
them (Tanwir, Adnan, Junaid, Dragos, & Ivan, 2019). This activity diagram shows the process of
registration and logging in of users.

Figure 3.7.3: Activity Diagram (Registration Process)

3.8.4 Data Flow Diagram
Data flow diagram is used to show the flow of data in a system. The diagrams below will show
the Context level DFD, Level 0 DFD and Level 1 DFD.

Figure 3.7.4.1: Context Level Diagram

Figure 3.7.4.2: Level 0 DFD

Figure 3.7.4.3: Level 1 DFD

Figure 3.7.4.4: Process Decomposition (Context Level DFD, Level 0 DFD, Level 1 DFD)

3.8.5 Entity-Relationship Diagram (ERD)
An entity relationship diagram is a type of diagram used to show the structural design of an
application’s database. It gives an account of all the entities and attributes present in
a database. A brief description of the data model is given below.
USER: This holds all the vital information of the user, including the information needed to
authenticate the user. Each user has a unique id and email, which ensures that each user is unique.
KEYWORD: Each keyword has a unique id and is linked to the user who added it, with
user_id holding the id of that user.
TWITTERUSER: Each username has a unique id and is linked to the user who
added it, with user_id holding the id of that user.
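
The data model described above can be sketched as a relational schema. The example below uses Python's built-in sqlite3 module purely for illustration; the thesis manages its database through SQLAlchemy, and the column names `password_hash`, `word` and `username` are assumptions, since only `id`, `email` and `user_id` are named in the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE user (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL          -- assumed column name
);
CREATE TABLE keyword (
    id INTEGER PRIMARY KEY,
    word TEXT NOT NULL,                  -- assumed column name
    user_id INTEGER NOT NULL REFERENCES user(id)
);
CREATE TABLE twitteruser (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL,              -- assumed column name
    user_id INTEGER NOT NULL REFERENCES user(id)
);
""")

# One user who owns one keyword, then look the keyword up through the user_id link.
conn.execute("INSERT INTO user (id, email, password_hash) VALUES (1, 'owner@example.com', 'x')")
conn.execute("INSERT INTO keyword (word, user_id) VALUES ('mybrand', 1)")
rows = conn.execute(
    "SELECT keyword.word FROM keyword JOIN user ON user.id = keyword.user_id "
    "WHERE user.email = ?",
    ("owner@example.com",),
).fetchall()
print(rows)
```

The UNIQUE constraint on email and the user_id foreign keys mirror the two rules stated above: each user is unique, and every keyword or Twitter username belongs to the user who added it.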

Figure 3.7.5: Entity Relationship Diagram of Mention Feeds

3.8.6 User Interface Design
This section presents the user interface, which is the front-end design the user will use to
interact with the web application.

Figure 3.7.6.1: Login Page

Figure 3.7.6.2: Register Page

Figure 3.7.6.3: User Keyword Dashboard

Figure 3.7.6.4: Add New Keyword

Figure 3.7.6.5: Compare Keywords

Figure 3.7.6.6: User Social Media Account Dashboard

Figure 3.7.6.7: Add new Twitter Username

- Login Page: Figure 3.7.6.1 shows the login page, where registered users fill in their
login information to gain access to their dashboard.
- Register Page: Figure 3.7.6.2 shows the register page, where prospective users provide
the data needed to become registered users.
- User Keyword Dashboard: Figure 3.7.6.3 shows the first page that appears on a user’s
screen after logging in. On this page, there are two options, Tracker and Historic Data,
which are described below.
a.) Tracker: Under this option, registered users are provided with two options,
Add Tracker and Compare. Figure 3.7.6.4 shows the pop-up that appears when a
registered user selects “Add Tracker”, where registered users can add new
keywords to their dashboard. Figure 3.7.6.5 shows what pops up when a user
selects “Compare”: a list of all the keywords the user has added to the
dashboard. Users can then select two keywords and get a comparison of both
keywords based on sentiment analysis and other categories.
The keywords listed on the front page are those the user has added to his/her
dashboard. When a user clicks on submit, only the latest 100 posts will be
extracted and analysis will be carried out on each post.
b.) Historic Data: Under this option, registered users can access more than 500
posts to be analyzed, which gives a more detailed result.
- User Social Media Account Dashboard: This is another feature of Mention Feeds, where
users can get analysis of Twitter accounts added to the dashboard using the
“Add Twitter Account” option. Figure 3.7.6.7 shows the pop-up that appears when
the “Add Twitter Account” option is selected, where users can add valid Twitter
accounts.

3.9 Summary
In this chapter, requirement elicitation and analysis were examined, showing the different
methods used to carry out these tasks. This led to an understanding of the design of the web
application, which was detailed using UML diagrams and other tools.

CHAPTER 4:

IMPLEMENTATION AND TESTING

4.1 Overview
The aim of this chapter is to document the development process of the main features. It gives
a detailed breakdown of the problems encountered and how they were resolved. It also goes
through the test plan and test report for the project to ensure all the functionalities work
properly.

4.2 Main Features


This section shows the main features of the web application and how they were
implemented.

4.2.1 Registration and Login


For an individual or organization to use the functional features of the web application,
they have to be registered and logged in. A sign-up page and a login page are provided for
prospective users to register and log in to gain access to their individual dashboard. The form
that accepts the user’s input is managed by Flask Form, which simplifies the creation of forms
in the HTML and provides some other functionalities. When a user registers, all their credentials
are saved into a database. Passlib.hash, a library that converts raw text to a hash using
an algorithm, is used so that passwords are saved in the database as hashed values rather than
raw text, reducing the chances of unauthorized people gaining access to a registered user’s
account. Flask-Login is an extension for the Flask framework that helps manage
sessions. It tracks the status of a user (i.e. whether they are logged in) and
provides a single call to log a user out. It also ensures that unauthenticated visitors cannot
access pages they are not authorized to see.
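
The principle of storing a hash instead of the raw password can be sketched as follows. This is not the project's code: it swaps in PBKDF2 from Python's standard hashlib module in place of Passlib so that the example is self-contained, and the iteration count and function names are assumptions made for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for this sketch

def hash_password(password):
    """Return (salt, digest); only these are stored, never the raw password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password, salt, digest):
    """Re-hash the login attempt with the stored salt and compare in constant time."""
    attempt = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(attempt, digest)

salt, stored = hash_password("s3cret")
print(verify_password("s3cret", salt, stored))
print(verify_password("wrong", salt, stored))
```

Because only the salt and digest are kept, a leaked database does not directly reveal any user's password.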

4.2.2 Addition and Deletion of Keyword


Mention Feeds allows registered users to add keywords to and delete keywords from their
dashboard. This feature is available to all registered users on the user’s dashboard. A button
labelled “Add Tracker” allows users to add their desired keywords to their dashboard. When this
button is clicked, a pop-up appears for the user to enter a desired keyword. The form that accepts
the user’s input is managed by Flask Form, which simplifies the creation of forms in the HTML and
provides some other functionalities. After the user clicks submit, the keyword is added to the
database and linked to the user that added it. Every keyword has a delete button beside it.
When clicked, the object for that particular keyword is deleted from the database
with the help of SQLAlchemy, which simplifies SQL queries. Figure 3.7.6.4 shows the pop-up
that appears when a user clicks “Add Tracker”.

4.2.3 Keyword Analysis


After a user adds a keyword to their dashboard, the user can then click the submit button attached
to the keyword they desire to get analysis on. After clicking the submit button each of the
following will occur.

4.2.3.1 Tweet Extraction


Mention Feeds extracts tweets made on Twitter about a particular keyword requested by a
registered user. This is possible because Twitter provides a free API for developers to gain
access to this data. The API provides its data as objects and lists, which can be manipulated to
produce different analyses. To gain access to this data, developers have to apply for a developer
account and create an app in order to get the credentials required to access the data. Tweepy,
a Python library for accessing the Twitter API, is then used to authenticate the API
credentials and extract the data provided by the Twitter API. A class was created in
Python to authenticate the credentials, which are then passed to another class called
TwitterClient, where a function is defined to extract the top 100 tweets. The tweets extracted
are in the form of dictionaries with various information about each tweet.
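
Since each extracted tweet arrives as a dictionary, the post type required by R-111 (original tweet, retweet or reply) could be derived from its fields. The sketch below is an assumption about how this might be done rather than the project's actual code: it relies on the `retweeted_status` and `in_reply_to_status_id` fields of the Twitter API tweet object, and the sample dictionaries are hand-made stand-ins for real API results.

```python
def tweet_type(tweet):
    """Classify a tweet dictionary as a retweet, reply or original tweet."""
    if "retweeted_status" in tweet:  # field present only on retweets
        return "retweet"
    if tweet.get("in_reply_to_status_id") is not None:  # set when replying to a tweet
        return "reply"
    return "original"

# Hand-made stand-ins for API results; real tweet dictionaries carry many more fields.
sample = [
    {"text": "Loving the new release"},
    {"text": "RT @brand: Loving the new release", "retweeted_status": {"id": 1}},
    {"text": "@brand when is it out?", "in_reply_to_status_id": 42},
]
print([tweet_type(t) for t in sample])
```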

4.2.3.2 Sentiment Analysis


Sentiment Analysis is the process of determining the emotions expressed in a text by using an
algorithm to classify the text as Negative, Positive or Neutral. TextBlob is a Python library
for processing textual data. It helps with various natural language processing tasks such as
translation, classification and sentiment analysis. It is built on the Natural Language
Toolkit (NLTK). Before a tweet is classified as Positive, Negative or Neutral, the text has to
be cleaned, that is, its punctuation marks are removed. This is done to get the most
accurate result. Afterwards, the tweets are classified as positive, negative or neutral.
Using Bokeh, a pie chart is used to represent the result.
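
The cleaning and labelling steps can be sketched as below. The polarity value is assumed to come from TextBlob, whose sentiment polarity is a number between -1.0 and 1.0; to keep the example self-contained, the score is passed in directly. The function names and the choice of zero as the boundary between classes are assumptions for illustration.

```python
import string

def clean_tweet(text):
    """Strip punctuation marks and collapse whitespace before classification."""
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())

def label_from_polarity(polarity):
    """Map a polarity score in [-1.0, 1.0] to one of the three classes."""
    if polarity > 0:
        return "Positive"
    if polarity < 0:
        return "Negative"
    return "Neutral"

print(clean_tweet("Wow!!! This phone is great, isn't it?"))
print(label_from_polarity(0.8), label_from_polarity(0.0), label_from_polarity(-0.3))
```

Note that removing all punctuation also removes the "@" and "#" symbols, so mentions and hashtags are reduced to plain words.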

4.2.3.3 Device Type


Mention Feeds provides a data visualization that shows the type of device used to make each
post extracted for a keyword. Each tweet extracted comes in a dictionary which holds
information about that particular tweet, including the type of device, the date it was created
and the location it was tweeted from. With this information, Pandas and NumPy are used to
classify each tweet by device type as iPhone, Android, WebApp or Others. A pie chart is then
used to represent this information diagrammatically on the web application using Bokeh.
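
A minimal sketch of the device classification is shown below. It uses plain Python in place of Pandas and NumPy to stay self-contained, and it assumes the device is read from the tweet's `source` field, which typically names the client (for example "Twitter for iPhone") and may be wrapped in an HTML anchor tag. The matching rules and sample values are illustrative.

```python
import re
from collections import Counter

def device_type(source):
    """Reduce a tweet's source field to iPhone, Android, WebApp or Others."""
    label = re.sub(r"<[^>]+>", "", source)  # drop the surrounding HTML anchor tag, if any
    if "iPhone" in label:
        return "iPhone"
    if "Android" in label:
        return "Android"
    if "Web" in label:
        return "WebApp"
    return "Others"

sources = [
    '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
    '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>',
    "Twitter Web App",
    "Some Scheduling Tool",
]
counts = Counter(device_type(s) for s in sources)
print(counts)
```

The resulting counts per category are the values a pie chart would display.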

4.2.3.4 Compare Keyword


Mention Feeds allows registered users to compare two keywords on their dashboard.
This provides a double line graph that shows the number of tweets made on each keyword
per day for the past 7 days. It also shows the sentiment analysis, as well as other analyses,
side by side for both keywords. This allows users to compare their brand to the competition and
have a clearer view of how their competitors are doing in the market. A compare button is visible
on the user’s dashboard which, when clicked, pops up a list of all the keywords available
on the user’s dashboard, as shown in Figure 3.7.6.5. Tweets made on both keywords are
analysed and displayed with Bokeh.
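
The data behind the double line graph is a count of tweets per day over the last seven days for each keyword. One way this could be computed is sketched below; the function name `tweets_per_day`, the fixed reference date and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

def tweets_per_day(created_dates, today, days=7):
    """Count tweets for each of the last `days` days, including days with no tweets."""
    counts = Counter(created_dates)
    window = [today - timedelta(days=offset) for offset in range(days - 1, -1, -1)]
    return [(day, counts.get(day, 0)) for day in window]

today = date(2020, 9, 7)
brand_a = [date(2020, 9, 7), date(2020, 9, 7), date(2020, 9, 5)]
brand_b = [date(2020, 9, 6), date(2020, 8, 20)]  # the August tweet falls outside the window
series_a = tweets_per_day(brand_a, today)
series_b = tweets_per_day(brand_b, today)
print([count for _, count in series_a])
print([count for _, count in series_b])
```

Filling in zero for days without tweets keeps both series the same length, so the two lines can share one date axis.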

4.2.4 Addition and Deletion of Twitter Usernames


Mention Feeds allows registered users to add and delete Twitter usernames on their dashboard. This feature is available to all registered users. A button labelled “Add Twitter Account” allows users to add their desired Twitter account to their dashboard. When this button is clicked, a pop-up appears for the user to enter a Twitter account. The form which accepts the user’s input is managed by FlaskForm (from the Flask-WTF extension), which simplifies the creation of forms in HTML and provides some other functionalities. After the user clicks submit, the Twitter account is added to the database and linked to the user that added it. Beside every Twitter username there is a delete button. When clicked, the object for that particular Twitter account is deleted from the database with the help of SQLAlchemy, which simplifies SQL queries. Figure 3.7.6.7 shows the pop-up that appears when a user clicks “Add Twitter Account”.
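The add and delete operations amount to inserting a row linked to the owning user and later removing it. The sketch below illustrates this with the standard-library `sqlite3` module in place of SQLAlchemy, so that it runs without extra packages. The table names, column names and the ownership check on delete are hypothetical and are not taken from the project's schema.

```python
import sqlite3


def make_db():
    """Create an in-memory database with hypothetical tables."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    db.execute(
        "CREATE TABLE twitter_account ("
        " id INTEGER PRIMARY KEY,"
        " username TEXT NOT NULL,"
        " user_id INTEGER NOT NULL REFERENCES users(id))"
    )
    return db


def add_account(db, user_id, username):
    """Store the username and link it to the user who added it."""
    cur = db.execute(
        "INSERT INTO twitter_account (username, user_id) VALUES (?, ?)",
        (username.lstrip("@"), user_id),
    )
    db.commit()
    return cur.lastrowid


def delete_account(db, user_id, account_id):
    """Delete the row only if it belongs to the requesting user (assumption)."""
    cur = db.execute(
        "DELETE FROM twitter_account WHERE id = ? AND user_id = ?",
        (account_id, user_id),
    )
    db.commit()
    return cur.rowcount == 1


def list_accounts(db, user_id):
    """Return the usernames on one user's dashboard, oldest first."""
    rows = db.execute(
        "SELECT username FROM twitter_account WHERE user_id = ? ORDER BY id",
        (user_id,),
    )
    return [row[0] for row in rows]
```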

4.2.5 User Timeline Extraction


Mention Feeds allows registered users to get analysis on any public Twitter account. It uses Tweepy to extract the account’s user timeline, categorizes the extracted tweets using NumPy and Pandas, and then gives a data visualization, rendered with bokeh, that shows the best time to make a post based on the engagement each post received. However, if a user enters an invalid username (i.e. a Twitter username that does not exist), the web application automatically deletes that username from the user’s dashboard.
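One way to derive a "best time to post" from a timeline is to total the engagement per hour of day. The sketch below is a simplified illustration in plain Python rather than Pandas. It assumes each tweet is a dictionary with a `created_at` datetime and the `favorite_count` and `retweet_count` fields of the Twitter API v1.1, and it defines engagement as likes plus retweets, which the source does not specify. The function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def engagement_by_hour(tweets):
    """Sum likes + retweets (assumed engagement measure) per hour of day."""
    totals = defaultdict(int)
    for tweet in tweets:
        hour = tweet["created_at"].hour
        totals[hour] += tweet["favorite_count"] + tweet["retweet_count"]
    return dict(totals)


def best_hour(tweets):
    """Return the hour (0-23) with the highest total engagement, or None."""
    totals = engagement_by_hour(tweets)
    return max(totals, key=totals.get) if totals else None
```

The per-hour totals are the values a bar graph of post optimization would display, with the tallest bar marking the suggested posting hour.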

4.2.6 Logout User


A user can log out at any point after logging in. A logout button is provided on every page for logged-in users. Flask-Login, an extension for Flask, manages users’ sessions and reduces logging a user out to a single line of code.

4.3 Implementation Problems


A number of bugs were encountered during the implementation process. The bugs and errors I faced while implementing the project are listed below.
• Integrating Bokeh graphs into an HTML page: Bokeh is a Python library for developing interactive data visualizations for web pages. Embedding the generated graphs in a particular web page proved somewhat tasking.
• Deleting keywords and Twitter accounts from a user’s dashboard: A problem was faced when implementing the delete button using SQLAlchemy. The delete button should delete the keyword or Twitter account it is attached to from the database, which in turn removes it from the web page.
• Extracting data using the Twitter API: Another problem encountered during the implementation process was using the Twitter API to extract relevant data. An error code was repeatedly returned, which blocked access to the data.
• Managing sessions: Sessions are used to keep track of who is using a web application. Managing the login session of users was another problem encountered.

4.4 Overcoming Implementation Problems
Section 4.3 discussed the problems encountered during implementation. This section describes how those problems were overcome.
• Integrating Bokeh graphs into an HTML page: This issue was resolved with the help of Stack Overflow and the documentation on Bokeh’s website. Bokeh allows developers to generate script and div code for a plot and pass it to the HTML page.

Figure 4.1: Bokeh Visualization

• Deleting keywords and Twitter accounts from a user’s dashboard: This problem was caused by improper implementation of the code, which was not structured properly. It was solved by going through the SQLAlchemy documentation to gain a clearer understanding of the library.
• Extracting data using the Twitter API: This problem was solved by going through the documentation of the Twitter API and with the help of GitHub. The solution was to correct the system time on the computer used to develop the web application, since requests signed with an incorrect timestamp are rejected by the API.
• Managing sessions: This problem was solved by using Flask-Login, which provides user session management for Flask. It manages a user’s login and logout session.
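For the Bokeh integration, the script and div fragments are typically produced by `bokeh.embed.components` and then placed into the page. The sketch below shows only that placement step, using the standard-library `string.Template` as a stand-in for the Flask template so that the example stays dependency-free. `PAGE`, `bokeh_parts` and `render_page` are hypothetical names, and `bokeh_parts` is a thin wrapper that needs Bokeh installed only when it is called.

```python
from string import Template

# Stand-in for an HTML template: resources and script go in the head,
# the div marks where the chart appears in the body.
PAGE = Template("""<!DOCTYPE html>
<html>
  <head>
    $resources
    $script
  </head>
  <body>
    $div
  </body>
</html>""")


def bokeh_parts(plot):
    """Return the (script, div) pair for a Bokeh plot."""
    from bokeh.embed import components  # third-party; imported only when called
    return components(plot)


def render_page(script, div, resources=""):
    """Insert the generated fragments into the page template."""
    return PAGE.substitute(script=script, div=div, resources=resources)
```

In a Flask view the two strings would normally be passed to `render_template` and marked as safe in the Jinja template so that the markup is not escaped. The page also has to load the BokehJS resources, represented here by the `resources` string.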

4.5 Testing
In this section, the functionalities will be verified and validated. This helps developers identify
and understand the limitations and possible vulnerabilities of the web application.

4.5.1 Test Plans (for Unit Testing, Integration Testing and System Testing)
Below is the test plan for “Mention Feeds”.

4.5.1.1 Test Identifier:


TEST LEVEL: Master Test Plan
AUTHOR’S NAME: Hauwa Umar
AUTHOR’S CONTACT: umarhauwa67@gmail.com

4.5.1.2 Reference
• Mention Feeds
• Work Plan
• Detailed project documentation
• Test Summary

4.5.1.3 Introduction
This section provides a breakdown of the tests carried out to verify and validate the
functional requirements for Mention Feeds. White box and Black box testing were
carried out.

4.5.1.4 Features to be Tested


Listed below are the areas that are to be tested for the web application.
• Register Page
• Login Page
• Add New Keyword
• Add Twitter Account
• Extract latest 100 Tweets from Twitter based on Keyword
• Data Visualization for number of posts extracted per day
• Sentiment Analysis of posts Extracted
• Categorizing tweets into Type of Device Used
• Categorizing tweets into Tweet Type (Original Tweet, Reply or Retweet)
• Data Visualization for All Analysis made on tweets extracted
• Compare two keywords
• Extract Information of Twitter Account
• Extract User Timeline
• Data Visualization for Optimization of posts
• Logout Button
• Historic Data

4.5.1.5 Approach
Visual Studio Code was used to test the web application before deployment. It supports the Python and Flask extensions used to implement this project, which also provide debugging.

4.5.1.6 Deliverables
The deliverables of the test plan include:
• Test cases
• Test report
• Traceability matrix
• Test results
• Error report

APPROVALS

Hauwa Umar
Supervisors

4.5.2 Test Suite (for Unit Testing, Integration Testing, and System Testing)
Test case TC-001 (Register User)
Table 4.1: Test Suite Performed for Register
Test Suite ID: R-121
Test Case ID: TC-001
Test Case Summary: This is to ensure users can register
Related Requirement: R-121
Prerequisite:
• The web app must be opened in a browser
• An uninterrupted internet connection must be available
Test Procedure:
1. Navigate to the site using the URL
2. Click the Register button
Test Data: User’s information
Expected Result: The user should be registered
Actual Result: The web application registered the user
Status: Successful
Remarks: The test was successful
Created by: Hauwa Umar
Date Created: 6th June 2020
Executed by: Hauwa Umar
Date Executed: 6th June 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-001 (Register User): The aim of this test case is to ensure the registration process works as intended and that users enter the right information. Error messages appear to guide users on how to fill in the registration form correctly. If the user registers successfully, the application redirects the user to the login page and shows a success message. This test case was successful.

Figure 4.1.1: Testing User Registration Process

Figure 4.1.2: Successful Registration

Test case TC-002 (Login User)
Table 4.2: Test Suite Performed for Login
Test Suite ID: R-122
Test Case ID: TC-002
Test Case Summary: This test is to ensure registered users can log in
Related Requirement: R-122
Prerequisite:
• User must be connected to the internet
• User must be registered
Test Procedure:
1. Navigate to the login page
2. Log in
Test Data:
1. Registered email
2. Password
Expected Result: User should be logged in and redirected to the dashboard
Actual Result: User is logged in
Status: Successful
Remarks: The test case was successful
Created by: Hauwa Umar
Date Created: 7th June 2020
Executed by: Hauwa Umar
Date Executed: 8th June 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-002 (Login User): The aim of this test is to ensure registered users can log in. It verifies that the email and password entered by the user are correct. If the user enters the right information, the user is redirected to their dashboard. Otherwise, an error message appears. This test case was successful.

Figure 4.3.1: Testing Login Page

Figure 4.3.2: Login Successful

Test case TC-003 (Add New Keyword)
Table 4.3: Test Suite Performed to Add Keyword to User’s Dashboard
Test Suite ID: R-123
Test Case ID: TC-003
Test Case Summary: This test case is to ensure registered users can add keywords to their dashboard.
Related Requirement: R-123
Prerequisite:
• User must be logged in
• User must be on “Keyword Based Analysis” Page
• User must click the “Add Keyword” button
Test Procedure:
1. Login
2. Click “Add Keyword” button
Test Data: Keyword
Expected Result: The web application should add the keyword to a user’s dashboard and link it to the user in the database.
Actual Result: The web application adds the keyword to a user’s dashboard and links it to the user in the database.
Status: Successful
Remarks: The test was successful
Created by: Hauwa Umar
Date Created: 10th June 2020
Executed by: Hauwa Umar
Date Executed: 11th June 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-003 (Add New Keyword): The aim of this test case is to ensure registered users can add keywords to their dashboard. Logged-in users will find the “Add Keyword” button on their dashboard, which triggers a popup form that allows them to enter their desired keyword and add it to their dashboard. This test case was successful and performed everything that was expected.

Figure 4.4.1: Pop up to add Keyword

Figure 4.4.2: Keyword Successfully Added

Test case TC-004 (Add Twitter Account)
Table 4.4: Test Suite Performed to Add Twitter Account to User’s Dashboard
Test Suite ID: R-124
Test Case ID: TC-004
Test Case Summary: The aim of this test case is to ensure registered users can add Twitter accounts to their dashboard.
Related Requirement: R-124
Prerequisite:
• User must be logged in
• User must navigate to “Social Media Account Analysis” Page
• User must click on “Add Twitter Account” button
Test Procedure:
1. Login
2. Navigate to “Social Media Account Analysis” Page
3. Click “Add Twitter Account” button
Test Data: Twitter Account
Expected Result: The web application should add the Twitter account to a user’s dashboard and link it to the user in the database.
Actual Result: The web application adds the Twitter account to a user’s dashboard and links it to the user in the database.
Status: Successful
Remarks: This test case is successful
Created by: Hauwa Umar
Date Created: 12th June 2020
Executed by: Hauwa Umar
Date Executed: 13th June 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-004 (Add Twitter Account): The aim of this test case is to ensure registered users can add Twitter accounts to their dashboard. Logged-in users navigate to the “Social Media Account Analysis” Page and find the “Add Twitter Account” button, which triggers a popup form that allows them to enter the desired Twitter account username and add it to their dashboard. This test case was successful and performed everything that was expected.

Figure 4.5.1: Test to Add New Keyword

Figure 4.5.2: Twitter account successfully added

Test case TC-005 (Extract latest 100 Tweets from Twitter based on Keyword)
Table 4.5: Test Suite Performed to extract latest 100 Tweets from Twitter based on Keyword
Test Suite ID: R-125
Test Case ID: TC-005
Test Case Summary: The aim of this test case is to extract the latest 100 Tweets from Twitter based on Keyword.
Related Requirement: R-125
Prerequisite:
• User must be logged in
• User must be on “Keyword Based Analysis” Page
• User must have a keyword on their dashboard
Test Procedure:
1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data: Keyword
Expected Result: The web application should display the latest 100 tweets from Twitter based on the selected keyword.
Actual Result: The web application displays the latest 100 tweets from Twitter based on the selected keyword.
Status: Successful
Remarks: This test case was successful
Created by: Hauwa Umar
Date Created: 14th June 2020
Executed by: Hauwa Umar
Date Executed: 20th June 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-005 (Extract latest 100 Tweets from Twitter based on Keyword): The aim of this test case is to extract the latest 100 tweets from Twitter based on a keyword. The registered user must have a keyword on their dashboard. There is a submit button beside each keyword on the dashboard. When the user clicks the submit button beside the keyword they want analysis on, the web application extracts the latest 100 tweets from Twitter for that keyword. This test was successful and performed what was expected.

Figure 4.6: Tweets Extracted Successfully

Test case TC-006 (Data Visualization for number of posts extracted per day)
Table 4.6: Test Suite Performed to display Data Visualization for number of posts per day
Test Suite ID: R-126
Test Case ID: TC-006
Test Case Summary: The aim of this test case is to display a line graph that shows the number of posts extracted per day
Related Requirement: R-126
Prerequisite:
• User must be logged in
• User must be on “Keyword Based Analysis” Page
• User must have a keyword on their dashboard
Test Procedure:
1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data: Keyword
Expected Result: The web application should display a line graph showing the number of tweets extracted based on the keyword per day
Actual Result: The web application displays a line graph showing the number of tweets extracted based on the keyword per day
Status: Successful
Remarks: The test case was successful
Created by: Hauwa Umar
Date Created: 21st June 2020
Executed by: Hauwa Umar
Date Executed: 26th June 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-006 (Data Visualization for number of posts extracted per day): The aim of this test case is to display a line graph that shows the number of posts extracted per day. The registered user must have a keyword on their dashboard. There is a submit button beside each keyword on the dashboard. When the user clicks the submit button beside the keyword they want analysis on, the web application displays a line graph showing the number of posts extracted per day. However, if all 100 tweets were posted on the same day, a bar chart is displayed instead. This test was successful and performed what was expected.
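The fallback from line graph to bar chart can be expressed as a small rule on the number of distinct days. The sketch below is a hypothetical illustration: the function names are not from the project, and creation dates are assumed to be ISO strings of the form "YYYY-MM-DD" so that they sort chronologically.

```python
from collections import Counter


def posts_per_day(created_dates):
    """Return (day, count) pairs sorted by day."""
    return sorted(Counter(created_dates).items())


def chart_kind(created_dates):
    """Pick 'bar' when every tweet falls on one day, otherwise 'line'."""
    return "line" if len(set(created_dates)) > 1 else "bar"
```

A single day gives only one point, which a line graph cannot show meaningfully, so a bar is the more readable choice in that case.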

Figure 4.7.1: Line Graph for Extracted Posts

Figure 4.7.2: Bar Chart for Extracted Posts

Test case TC-007 (Sentiment Analysis of posts Extracted)
Table 4.7: Test Suite Performed to perform Sentiment Analysis of posts Extracted
Test Suite ID: R-127
Test Case ID: TC-007
Test Case Summary: The aim of the test case is to perform sentiment analysis on each post extracted and give a cumulative result
Related Requirement: R-127
Prerequisite:
• User must be logged in
• User must be on “Keyword Based Analysis” Page
• User must have a keyword on their dashboard
Test Procedure:
1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data: Keyword
Expected Result: The web application should perform sentiment analysis on each post extracted
Actual Result: The web application performs sentiment analysis on each post extracted
Status: Successful
Remarks: This test case is successful
Created by: Hauwa Umar
Date Created: 27th June 2020
Executed by: Hauwa Umar
Date Executed: 5th July 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-007 (Sentiment Analysis of posts Extracted): The aim of the test case is to perform sentiment analysis on each post extracted and give a cumulative result. The registered user must have a keyword on their dashboard. There is a submit button beside each keyword on the dashboard. When the user clicks the submit button beside the keyword they want analysis on, the web application performs sentiment analysis on each tweet extracted and gives a cumulative count of the positive, negative and neutral tweets. This test was successful and performed what was expected.

Test case TC-008 (Categorizing tweets into Type of Device Used)
Table 4.8: Test Suite Performed to Categorize tweets into Type of Device Used
Test Suite ID: R-128
Test Case ID: TC-008
Test Case Summary: The aim of the test case is to categorize tweets into the type of device used to make each post tweeted.
Related Requirement: R-128
Prerequisite:
• User must be logged in
• User must be on “Keyword Based Analysis” Page
• User must have a keyword on their dashboard
Test Procedure:
1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data: Keyword
Expected Result: The web application should categorize tweets into the type of device used to make each post tweeted.
Actual Result: The web application categorizes tweets into the type of device used to make each post tweeted.
Status: Successful
Remarks: The test case was successful
Created by: Hauwa Umar
Date Created: 6th July 2020
Executed by: Hauwa Umar
Date Executed: 6th July 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-008 (Categorizing tweets into Type of Device Used): The aim of the test case is to
categorize tweets into the type of device used to make each post tweeted. This was successful.

Test case TC-009 (Categorizing tweets into Tweet Type (Original Tweet, Reply or Retweet))
Table 4.9: Test Suite Performed to categorize tweets into Tweet Type (Original Tweet, Reply or Retweet)
Test Suite ID: R-129
Test Case ID: TC-009
Test Case Summary: The aim of this test case is to categorize tweets into Tweet Type (Original Tweet, Reply or Retweet)
Related Requirement: R-129
Prerequisite:
• User must be logged in
• User must be on “Keyword Based Analysis” Page
• User must have a keyword on their dashboard
Test Procedure:
1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data: Keyword
Expected Result: The web application should categorize tweets into Tweet Type (Original Tweet, Reply or Retweet)
Actual Result: The web application categorizes tweets into Tweet Type (Original Tweet, Reply or Retweet)
Status: Successful
Remarks: The test case was successful
Created by: Hauwa Umar
Date Created: 7th July 2020
Executed by: Hauwa Umar
Date Executed: 7th July 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-009 (Categorizing tweets into Tweet Type (Original Tweet, Reply or Retweet)):
The aim of this test case is to categorize tweets into Tweet Type (Original Tweet, Reply or
Retweet). This was successful.
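A plausible rule for this categorization is sketched below. It assumes each tweet is a dictionary in the Twitter API v1.1 JSON form, where a retweet embeds a `retweeted_status` object and a reply carries a non-null `in_reply_to_status_id`. The function names are hypothetical, and the rule is an illustration that may differ from the one used in the project.

```python
from collections import Counter


def tweet_type(tweet):
    """Classify a tweet dict as 'Retweet', 'Reply' or 'Original Tweet'."""
    if "retweeted_status" in tweet:
        return "Retweet"
    if tweet.get("in_reply_to_status_id") is not None:
        return "Reply"
    return "Original Tweet"


def tweet_type_counts(tweets):
    """Tally tweet types; the counts feed the post type chart."""
    return Counter(tweet_type(t) for t in tweets)
```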

Test case TC-010 (Data Visualization for All Analysis made on tweets extracted)
Table 4.10: Test Suite Performed to display data visualization for analysis carried out
Test Suite ID: R-130
Test Case ID: TC-010
Test Case Summary: The aim of this test case is to visualize the analysis carried out with pie charts, bar charts and graphs.
Related Requirement: R-130
Prerequisite:
• User must be logged in
• User must be on “Keyword Based Analysis” Page
• User must have a keyword on their dashboard
Test Procedure:
1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data: Keyword
Expected Result: The web application should visualize the analysis carried out with pie charts, bar charts and graphs.
Actual Result: The web application visualizes the analysis carried out with pie charts, bar charts and graphs.
Status: Successful
Remarks: The test case was successful
Created by: Hauwa Umar
Date Created: 8th July 2020
Executed by: Hauwa Umar
Date Executed: 13th July 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-010 (Data Visualization for All Analysis made on tweets extracted): The aim of
this test case is to visualize the analysis carried out with pie charts, bar charts and graphs. When
a user clicks submit a lot of analysis is carried out on the tweets extracted and should be
displayed using graphs and pie charts. The test case was successful.
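Pie charts in Bokeh are commonly drawn with its wedge glyph, which takes a start and an end angle for each slice. The sketch below shows how category counts could be converted into those angles. `wedge_angles` is a hypothetical helper written in plain Python, not code from the project.

```python
import math


def wedge_angles(counts):
    """Return {label: (start, end)} angles in radians covering a full circle."""
    total = sum(counts.values())
    if total == 0:
        return {}
    angles, start = {}, 0.0
    for label, n in counts.items():
        end = start + 2 * math.pi * n / total
        angles[label] = (start, end)
        start = end  # the next slice begins where this one ends
    return angles
```

For example, counts of 50 positive, 25 negative and 25 neutral tweets give a half-circle slice followed by two quarter-circle slices.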

Figure 4.8.1: Sentiment Analysis Visualization

Figure 4.8.2: Device Used Visualization

Figure 4.8.3: Post Type Visualization

Test case TC-011 (Compare two keywords)
Table 4.11: Test Suite Performed to compare two keywords
Test Suite ID: R-131
Test Case ID: TC-011
Test Case Summary: The aim of this test case is to allow registered users to compare analysis on two keywords
Related Requirement: R-131
Prerequisite:
• User must be logged in
• User must be on “Keyword Based Analysis” Page
• User must have at least two keywords on their dashboard
Test Procedure:
1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data: Two Keywords
Expected Result: The web application should compare the analysis of tweets extracted for two keywords and display them
Actual Result: The web application compares the analysis of tweets extracted for two keywords and displays them
Status: Successful
Remarks: The test case was successful
Created by: Hauwa Umar
Date Created: 14th July 2020
Executed by: Hauwa Umar
Date Executed: 16th July 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-011 (Compare two keywords): The aim of this test case is to allow registered users to compare analysis on two keywords. The user must have at least two keywords on their dashboard and can select only two at a time, getting the analysis for both keywords side by side. This test case was successful.

Figure 4.9.1: Test to Compare Keywords

Figure 4.9.2: Compare Keywords Double Line Graph

Figure 4.9.3: Compare Analysis Visualization

Test case TC-012 (Extract Information of Twitter Account)
Table 4.12: Test Suite Performed to Extract Information of Twitter Account
Test Suite ID: R-132
Test Case ID: TC-012
Test Case Summary: The aim of this test case is to extract personal information of a Twitter account on a user’s dashboard
Related Requirement: R-132
Prerequisite:
• User must be logged in
• User must be on “Social Media Account Analysis” Page
• User must have a valid Twitter account on their dashboard
Test Procedure:
1. Login
2. Navigate to “Social Media Account Analysis” Page
3. Click submit beside desired Twitter account
Test Data: Twitter Account
Expected Result: The web application should extract personal information of a Twitter account on a user’s dashboard
Actual Result: The web application extracts personal information of a Twitter account on a user’s dashboard
Status: Successful
Remarks: The test case is successful; however, the Twitter API gives information based on the Twitter account’s status, i.e. Private or Public
Created by: Hauwa Umar
Date Created: 17th July 2020
Executed by: Hauwa Umar
Date Executed: 18th July 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-012 (Extract Information of Twitter Account): The aim of this test case is to extract personal information of a Twitter account on a user’s dashboard. Logged-in users navigate to the “Social Media Account Analysis” Page and must have a valid Twitter account on their page. If the account is valid, the web application extracts personal information from the Twitter account. If it is not, the web application deletes the account from the user’s dashboard. This test case was successful.

Figure 4.9.4: Extract Twitter Account Information

Test case TC-013 (Extract User Timeline)
Table 4.13: Test Suite Performed to Extract User Timeline
Test Suite ID: R-133
Test Case ID: TC-013
Test Case Summary: The aim of this test case is to extract the latest 20 posts made by a Twitter account on a user’s dashboard
Related Requirement: R-133
Prerequisite:
• User must be logged in
• User must be on “Social Media Account Analysis” Page
• User must have a valid Twitter account on their dashboard
Test Procedure:
1. Login
2. Navigate to “Social Media Account Analysis” Page
3. Click submit beside desired Twitter account
Test Data: Twitter Account
Expected Result: The web application should extract the latest 20 posts made by a Twitter account on a user’s dashboard
Actual Result: The web application extracts the latest 20 posts made by a Twitter account on a user’s dashboard
Status: Successful
Remarks: The test case is successful
Created by: Hauwa Umar
Date Created: 19th July 2020
Executed by: Hauwa Umar
Date Executed: 20th July 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-013 (Extract User Timeline): The aim of this test case is to extract the latest 20 posts made by a Twitter account on a user’s dashboard. Along with extracting information about a Twitter account, the web application extracts the account’s latest 20 tweets. This test case was successful.

Figure 4.9.5: User Timeline Extracted

Test case TC-014 (Data Visualization for Optimization of posts)
Table 4.14: Test Suite Performed to display Data Visualization for Optimization of posts
Test Suite ID: R-134
Test Case ID: TC-014
Test Case Summary: The aim of this test case is to run analysis on the tweets extracted from a Twitter user’s timeline and display it on the web application
Related Requirement: R-134
Prerequisite:
• User must be logged in
• User must be on “Social Media Account Analysis” Page
• User must have a valid Twitter account on their dashboard
Test Procedure:
1. Login
2. Navigate to “Social Media Account Analysis” Page
3. Click submit beside desired Twitter account
Test Data: Twitter Account
Expected Result: The web application should run analysis on the tweets extracted from a Twitter user’s timeline and display it on the web application
Actual Result: The web application runs analysis on the tweets extracted from a Twitter user’s timeline and displays it on the web application
Status: Successful
Remarks: The test case was successful
Created by: Hauwa Umar
Date Created: 21st July 2020
Executed by: Hauwa Umar
Date Executed: 28th July 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-014 (Data Visualization for Optimization of posts): The aim of this test case is to run analysis on the tweets extracted from a Twitter user’s timeline and display it on the web application. The web application categorizes the engagement on each post on an hourly basis and displays a bar graph to visualize this analysis. This test case was successful.

Figure 4.9.6: Data Visualization for Post Optimization

Test case TC-015 (Historic Data)
Test Suite ID: R-135
Test Case ID: TC-015
Test Case Summary: The aim of this test case is to extract up to 1000 posts based on a keyword from Twitter, perform analysis and provide data visualization for them.
Related Requirement: R-135
Prerequisite:
• User must be logged in
• User must be on “Keyword Based Analysis” Page
• User must have a keyword on their dashboard
Test Procedure:
1. Login
2. Navigate to “Keyword Based Analysis” Page
3. Click submit beside desired keyword
Test Data: Keyword
Expected Result: The web application should extract up to 1000 posts based on a keyword from Twitter, perform analysis and provide data visualization for them.
Actual Result: The web application extracts up to 1000 posts based on a keyword from Twitter, performs analysis and provides data visualization for them.
Status: Successful
Remarks: The test case was successful
Created by: Hauwa Umar
Date Created: 29th July 2020
Executed by: Hauwa Umar
Date Executed: 10th August 2020
Test Environment:
Hardware: HP Laptop
Software: Browser – Google Chrome

Test case TC-015 (Historic Data): The aim of this test is to extract between 200 and 1000 tweets using a library called GetOldTweets3 and run analysis on them. It also provides a data visualization of the analysis, which is sentiment analysis. This test case was successful.

Figure 4.9.7: Historic Data Page

Figure 4.9.8.1: Historic Data Graph

Figure 4.9.8.2: Historic Data Sentiment Analysis

4.5.3 Test Traceability Matrix (for Unit Testing, Integration Testing and System Testing)
Table 4.15: Test Traceability Matrix
Reqt. | Description | Priority | Test Case | Test Date | Test Result
R-121 | The web application should allow users to register | 8 | 1 | 6th June 2020 | Successful
R-122 | The web application should allow users to log in | 7 | 2 | 8th June 2020 | Successful
R-123 | The web application should allow users to add New Keywords to their dashboard | 4 | 3 | 11th June 2020 | Successful
R-124 | The web application should allow users to add Twitter Accounts to their dashboard | 4 | 4 | 13th June 2020 | Successful
R-125 | The web application should extract latest 100 Tweets from Twitter based on Keyword | 6 | 5 | 20th June 2020 | Successful
R-126 | The web application should display a line graph showing the number of tweets extracted based on the keyword per day | 6 | 6 | 26th June 2020 | Successful
R-127 | The web application should perform sentiment analysis on each post extracted | 9 | 7 | 5th July 2020 | Successful
R-128 | The web application should categorize tweets into type of device used to make each post tweeted. | 8 | 8 | 6th July 2020 | Successful
R-129 | The web application should categorize tweets into Tweet Type (Original Tweet, Reply or Retweet) | 8 | 9 | 7th July 2020 | Successful
R-130 | The web application should visualize the analysis carried out with pie charts, bar charts and graphs. | 7 | 10 | 13th July 2020 | Successful
R-131 | The web application should compare the analysis of tweets extracted for two keywords and display them | 8 | 11 | 16th July 2020 | Successful
R-132 | The web application should extract personal information of a Twitter account on a user’s dashboard | 5 | 12 | 18th July 2020 | Successful
R-133 | The web application should extract the latest 20 posts made by a Twitter account on a user’s dashboard | 6 | 13 | 20th July 2020 | Successful
R-134 | The web application should run analysis on the tweets extracted from a Twitter user’s timeline and display it on the web application | 8 | 14 | 28th July 2020 | Successful
R-135 | The web application should extract up to 1000 posts based on a keyword from Twitter, perform analysis and provide data visualization for them. | 7 | 15 | 10th August 2020 | Successful

4.5.4 Test Report Summary (for Unit Testing, Integration Testing and System Testing)
Table 4.16: Test Report Summary
Summary of Tests Carried Out | Results
Number of functions tested | 15
Number of functions not tested | 0
Number of tests passed | 15
Number of tests failed | 0
Percentage of tests passed | 100%
Percentage of tests failed | 0%

4.5.5 Error Reports and Corrections


Table 4.17: Error Reports and Corrections
S/N | Error Reports | Corrections
1 | Logical bugs and errors | These were corrected by going through the documentation of the libraries and skimming through the code to change the logic to fit the functionalities.

4.6 User Guide
The user guide helps prospective and registered users understand the functionalities the web application has to offer and how to access them.

4.7 Summary
This chapter gives a clear breakdown of the main features and functionalities of the web
application and the various methods and techniques used to implement the features. Each
functionality was tested and documented. Errors faced while carrying out implementation were
discussed as well as how the errors were solved. This chapter played a crucial role in the
development of this web application.

CHAPTER 5:

DISCUSSION, CONCLUSION AND RECOMMENDATION

5.1 Overview
The aim of this chapter is to discuss the objective assessment of the project, the limitations and
challenges encountered during the development of the project, possible future plans to enhance
the project and also recommendations for future projects in social media monitoring tools.

5.2 Objective Assessment


Mention Feeds is a web application that performs analysis on data extracted from social media
platforms for interested users. All functional requirements discussed in Chapter 3 were
implemented and satisfied. Table 4.16 (Test Report Summary) shows that all 15 functions
tested passed.

5.3 Limitations and Challenges


This section discusses the problems and challenges faced during the development process.
• The COVID-19 pandemic broke out during the early stages of the development of this
project, which led to a nationwide lockdown. As a result, communication with supervisors
and knowledgeable lecturers was restricted to online meetings only.
• Twitter provides a free API for developers, but it is restricted and provides limited
data. Developers can upgrade to gain access to more data, but this requires funds that
were not available during the implementation of this version of the web application.
• Due to lack of sufficient time, the implementation of the overall goal of Mention Feeds
was separated into iterations, as discussed in Chapter 3.

5.4 Future Enhancements


The overall objective of Mention Feeds is to provide a platform for interested users to gain
access to meaningful and relevant data from social media platforms for personal or business
use. Currently, Mention Feeds only provides current data on Twitter. Below are brief
descriptions of possible enhancements for Mention Feeds.
• Upgrading the Twitter API access: This will grant access to more data on Twitter and help
improve the features Mention Feeds can provide.
• Adding analysis from other social media platforms: The aim of Mention Feeds is not to
focus only on data from Twitter but also on data from other big social media platforms,
including Facebook and Instagram. These platforms also provide APIs for developers to gain
access to their data. This will give users of Mention Feeds access to more information
that can be beneficial for individuals and organizations.
• Organization users and individual users: Mention Feeds should allow users to define their
role as either an organization or an individual, with different features provided for
each role.
• Enhancing the results of sentiment analysis: TextBlob was used to carry out the sentiment
analysis for this version of Mention Feeds, and it provides a reasonable level of
accuracy. However, it provides a generic sentiment analysis for all text. A future plan is
to create a model that allows organizations and businesses to understand exactly what the
negative and positive comments are about.
• Streaming live tweets and giving live sentiment analysis for tweets as they are posted.

These are some future enhancements that could be implemented over a period of time to help
achieve the overall aim of Mention Feeds.
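
To illustrate what a generic sentiment analysis means in practice, the sketch below maps polarity scores to the positive, negative and neutral labels used in the sentiment pie charts. TextBlob reports polarity as a float between -1.0 and 1.0 through `TextBlob(text).sentiment.polarity`. The scores here are hard-coded so that the example runs without TextBlob installed, and the function names and the zero threshold are assumptions rather than the actual Mention Feeds implementation.

```python
from collections import Counter


def label_from_polarity(polarity: float, threshold: float = 0.0) -> str:
    """Map a polarity score in [-1.0, 1.0] to a sentiment label."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def sentiment_breakdown(polarities) -> Counter:
    """Count how many scores fall under each sentiment label."""
    return Counter(label_from_polarity(p) for p in polarities)


# With TextBlob installed, the scores could instead be obtained as:
#   polarities = [TextBlob(text).sentiment.polarity for text in tweet_texts]
sample_polarities = [0.8, -0.4, 0.0, 0.35, -0.1]  # illustrative values only
breakdown = sentiment_breakdown(sample_polarities)
print(dict(breakdown))
```

Because the label depends only on the overall polarity of the text, it says nothing about which product or service a comment refers to, which is the gap the proposed enhancement aims to close.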

5.5 Recommendations
Social media platforms provide a large amount of unstructured data which can be very beneficial
for analysts, businesses, organizations and influencers. Based on the research carried out on
sentiment analysis and social media monitoring tools, I make the following recommendations:
• As discussed in the Future Enhancements section, sentiment analysis should be user
specific in order to give more specific results. Businesses, organizations and influencers
could take advantage of this information to improve their services and understand the
positive and negative feedback provided by their customers.
• Small businesses in Nigeria should be enlightened on the benefits of social media and
how social media monitoring tools can help improve and grow their businesses.

In the 21st century, social media plays a role in the day-to-day activities of the average
person. More projects should therefore go into analyzing this data for the benefit of
businesses, analysts and influencers.

5.6 Conclusion
In this project, we proposed a web application that analyzes data obtained from Twitter. Twitter
is a social media platform that contains a lot of unstructured data. If analyzed properly, this
data can be very beneficial to industry and the educational sector. As input, Mention Feeds
takes either a keyword specified by a registered user or a public Twitter account defined by a
user. Mention Feeds carries out various analyses on the extracted data, including sentiment
analysis and the categorization of posts by Type of Post (Reply, Retweet or Original Tweet) and
Device Used (iPhone, Android, WebApp or Others). It also allows the analysis of two keywords to
be compared, and it analyzes Twitter accounts to help optimize posts.
Access to real-time data on their brand can have a positive effect on the growth of businesses
and organizations, especially when it is used at the start of a business. It provides a more
accurate and less biased insight into customers, and it allows businesses to compare the
public's reaction to their products and services with the reaction to those of their
competitors. Influencers can see when the majority of their followers are active, which helps
increase their engagement rate. Analysts can also benefit from Mention Feeds, as the historic
data feature gives them access to a large amount of data.
Mention Feeds aims to provide analysis of data from all major social media platforms, which will
be achieved in later versions. Mention Feeds demonstrates the significance of social media
platforms to society.
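
As a rough illustration of the Type of Post and Device Used categorization described above, the sketch below classifies tweets represented as dictionaries. It assumes the field names of the Twitter API v1.1 tweet object (`retweeted_status`, `in_reply_to_status_id` and `source`) and client names such as "Twitter for iPhone". The helper names and sample tweets are hypothetical and do not reproduce the Mention Feeds source code.

```python
import re


def post_type(tweet: dict) -> str:
    """Classify a tweet as a Retweet, a Reply or an Original Tweet."""
    if "retweeted_status" in tweet:
        return "Retweet"
    if tweet.get("in_reply_to_status_id") is not None:
        return "Reply"
    return "Original Tweet"


def device_used(tweet: dict) -> str:
    """Map the tweet's source field to iPhone, Android, WebApp or Others."""
    # The source field is an HTML anchor; strip the tags to get the client name.
    client = re.sub(r"<[^>]+>", "", tweet.get("source", "")).strip()
    if client == "Twitter for iPhone":
        return "iPhone"
    if client == "Twitter for Android":
        return "Android"
    if client == "Twitter Web App":
        return "WebApp"
    return "Others"


# Hypothetical sample tweets, reduced to the fields used above.
tweets = [
    {"source": '<a href="http://twitter.com/download/iphone">Twitter for iPhone</a>'},
    {"retweeted_status": {"id": 1},
     "source": '<a href="http://twitter.com/download/android">Twitter for Android</a>'},
    {"in_reply_to_status_id": 42,
     "source": '<a href="https://mobile.twitter.com">Twitter Web App</a>'},
    {"in_reply_to_status_id": None,
     "source": '<a href="https://example.com">Some Scheduler</a>'},
]

categories = [(post_type(t), device_used(t)) for t in tweets]
print(categories)
```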

5.7 Summary
This chapter provided an objective assessment of the project, which focused on whether all the
specified requirements were met. The challenges faced during the implementation of this project
were also documented, and future enhancements and recommendations were discussed. This
chapter concludes the thesis.
REFERENCES

References
Ali, H., Sana, M., Ahmad, K., & Shahaboddin, S. (2018). Machine Learning-Based Sentimental Analysis
for Twitter Accounts. Mathematical Computational Application.
Amandeep, K., Deepesh, K., Khushboo, V., & Ranjit Singh, S. (2016). Sentiment Analysis on Twitter
using Apache Spark. Ottawa: Carleton University.
Andrew, M. L., Raymond, D. E., Peter, P. T., Dan, H., Andrew, N. H., & Potts, C. (2011). Learning Word
Vectors for Sentiment Analysis. Proceedings of 49th Annual Meeting of the Association for
Computational Linguistics, 1-7.
Boyang, Z., & Marita, V. (2014). Social media monitoring: Aims, methods and challenges for
international companies. Corporate Communications: An International Journal, 371-383.
Cambria, E., Gastaldo, P., Bisio, F., & Zunnio, R. (2015). An ELM-based model for affective analogical
reasoning. Neurocomputing, 443-455.
Dave, K., Lawrence, S., & Pennock, D. (2003). Mining the peanut gallery: Opinion extraction and
semantic classification of product reviews. Proceedings of the12th international conference on
World Wide Web (pp. 1-18). New York: Association for Computing Machinery.
Dumbleton, R. (2018, December 19). Complete Guide to Sentiment Analysis: Updated 2020. Retrieved
from getthematic: https://getthematic.com/insights/sentiment-analysis/
Emmanouil, P., George, M., & Ioannis, K. (2019). Social Media Monitoring: An Innovative Intelligent
Approach.
Farah, B., Carmine, C., Antonio, P., Diego, R., & Subrahmanian, V. (2006). Sentiment Analysis:
Adjectives and Adverbs are better than Adjectives alone. International Conference on Weblogs
and Social Media, (pp. 1-7). ICWSM.
Jorge, C. d., Laura, P., Pablo, G., & Alberto, D. (2011). A Joint Model of Feature Mining and Sentiment
Analysis for Product Review Rating. Proceedings of International Conference on Advances in
Information Retrieval, (pp. 55-66).
K, J. (2018, June 01). 12 Best Software Development Methodologies with Pros & Cons. Retrieved from
acodez: https://acodez.in/12-best-software-development-methodologies-pros-cons/
Khalid, N., & Ahmad, I. (2017). Discovering and Analyzing Important Real-Time Trends in Noisy
Twitter Streams. Systemics, Cybernetics and Informatics, 26-30.
Khanna, U. (2018, 01 29). The Impact of Social Media Marketing Today. Retrieved from Social Media
Impact: http://www.socialmediaimpact.com/impact-social-media-marketing-today/#

81
Lin, Y. (2019, 11 30). 10 Twitter Statistics Every Marketer Should Know in 2020 [Infographic].
Retrieved from Oberlo: https://ng.oberlo.com/blog/twitter-
statistics#:~:text=Here's%20a%20summary%20of%20the,are%20between%2035%20and%2065.
Mika, M. V., Daniel, G., & Kuutila, M. (2016). The Evolution of Sentiment Analysis - A Review of
Research Topics, Venues, and Top Cited Paper. Computer Science Review, 18-32.
Mika, V. M., Daniel, G., & Miikka, K. (2018). The Evolution of Sentiment Analysis-A Review of
Research Topics. Computer Science Review, 16-32.
Najma, S., Pintu, K., Monika, R. P., Sourabh, C., & S.K., S. A. (2019). Sentiment analysis for product
review. International Journal of Soft Computing, 1913-1919.
Nassirtoussi, A., Aghabozorgi, S., Wah, T., & Ngo, D. (2014). Text mining for market prediction: A
systematic review. Expert Systems with Applications, 7653-7670.
Samaneh, M., & Martin, E. (2010). Opinion Digger. Proceedings of 19th ACM International Conference
on Information and Knowledge Management, (pp. 1825-1828).
Sentiment Analysis. (2020, January 02). Retrieved from Monkey Learn:
https://monkeylearn.com/sentiment-analysis/
Soonh, T., Bakhtawer, S., & Areej, F. M. (2019). Sentiment Analysis of News Articles: A Lexicon based
Approach. International Conference on Computing, Mathematics and Engineering Technologies.
Stagner, R. (1940). The cross-out technique as a method in public opinion analysis. The Journal of Social
Psychology, 79-90.
Tanwir, A., Adnan, A., Junaid, I., Dragos, T., & Ivan, P. (2019). Model-based testing using UML activity
diagrams: A systematic mapping study. Computer Science Review, 98-112.
Team, E. S. (2017, March 7). What is Machine Learning? A definition-Expert System. Retrieved from
Expert System: https://expertsystem.com/machine-learning-definition/
Tutorialspoint. (2020). SDLC - Iterative Incremental Model - Tutorialspoint. Retrieved from
Tutorialspoint.com:
https://www.tutorialspoint.com/adaptive_software_development/sdlc_iterative_incremental_mod
el.htm
Tutorialspoint. (2020). SDLC - Spiral Model - Tutorialspoint. Retrieved from Tutorialspoint.com:
https://www.tutorialspoint.com/sdlc/sdlc_spiral_model.htm
ZAYNABZAHRA. (2017, October 7). SCRUM Methodology. Retrieved from Zaynab's Blog:
https://zaynabzahrablog.wordpress.com/2017/10/07/scrum-methodology/

82
APPENDICES

Appendix A – Brainstorming Report


Time efficiency: Whether the solution can be achieved within the time available for the project.
1 is very bad and 5 is very good.
Feasibility: Whether the developer has the skills, tools and funding needed to carry out the
solution. 1 is very bad and 5 is very good.
Risk analysis: How risky it would be to embark on the solution. 1 is very risky and 5 is
moderately okay.

S/N  Ideas                                               Time Efficiency /5  Feasibility /5  Risk Analysis /5  Total /15
1    Using web scraper to scrape data from social        2                   5               1                 8
     media and filtering them by keywords
2    Developing a data warehouse that stores all data    1                   3               3                 7
     gotten from different social media platforms and
     analyzing them
3    Using social media APIs to extract data and         3                   5               4                 12
     performing analysis based on users’ requests

Appendix B – Source Codes

Appendix C – Test Cases
Test case TC-005 (Extract latest 100 Tweets from Twitter based on Keyword)
Test Suite ID         R-125
Test Case ID          TC-005
Test case summary     The aim of this test case is to extract the latest 100 tweets from
                      Twitter based on a keyword.
Related Requirement   R-125
Prerequisite          • User must be logged in
                      • User must be on the “Keyword Based Analysis” page
                      • User must have a keyword on their dashboard
Test Procedure        1. Log in
                      2. Navigate to the “Keyword Based Analysis” page
                      3. Click submit beside the desired keyword
Test Data             Keyword
Expected Result       The web application should display the latest 100 tweets from
                      Twitter based on the selected keyword.
Actual Result         The web application displays the latest 100 tweets from Twitter
                      based on the selected keyword.
Status                Successful
Remarks               This test case was successful
Created by            Hauwa Umar
Date Created          14th June 2020
Executed by           Hauwa Umar
Date of Execution     20th June 2020
Test Environment      Hardware: HP Laptop
                      Software: Browser – Google Chrome

Test case TC-006 (Data Visualization for number of posts extracted per day)
Test Suite ID         R-126
Test Case ID          TC-006
Test case summary     The aim of this test case is to display a line graph that shows
                      the number of posts extracted per day.
Related Requirement   R-126
Prerequisite          • User must be logged in
                      • User must be on the “Keyword Based Analysis” page
                      • User must have a keyword on their dashboard
Test Procedure        1. Log in
                      2. Navigate to the “Keyword Based Analysis” page
                      3. Click submit beside the desired keyword
Test Data             Keyword
Expected Result       The web application should display a line graph showing the
                      number of tweets extracted based on the keyword per day.
Actual Result         The web application displays a line graph showing the number
                      of tweets extracted based on the keyword per day.
Status                Successful
Remarks               This test case was successful
Created by            Hauwa Umar
Date Created          21st June 2020
Executed by           Hauwa Umar
Date of Execution     26th June 2020
Test Environment      Hardware: HP Laptop
                      Software: Browser – Google Chrome
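
The per-day counts behind the line graph checked in TC-006 could be computed along the following lines. This is a minimal sketch: it assumes that each tweet carries a `created_at` string in the Twitter API v1.1 format (for example "Sat Jun 20 10:15:00 +0000 2020"), and the function name `posts_per_day` and the sample data are hypothetical.

```python
from collections import Counter
from datetime import date, datetime, timedelta

# Timestamp format used by the Twitter API v1.1 created_at field (assumed here).
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def posts_per_day(tweets, end_day: date, days: int = 7) -> dict:
    """Count tweets per calendar day for the `days` days ending on `end_day`."""
    counts = Counter(
        datetime.strptime(t["created_at"], TWITTER_TIME_FORMAT).date()
        for t in tweets
    )
    # Build the window oldest-first so the result plots left to right.
    window = [end_day - timedelta(days=offset) for offset in range(days - 1, -1, -1)]
    return {day.isoformat(): counts.get(day, 0) for day in window}


# Hypothetical sample tweets, reduced to the created_at field.
sample_tweets = [
    {"created_at": "Sat Jun 20 10:15:00 +0000 2020"},
    {"created_at": "Sat Jun 20 18:40:00 +0000 2020"},
    {"created_at": "Thu Jun 18 09:00:00 +0000 2020"},
]

daily_counts = posts_per_day(sample_tweets, end_day=date(2020, 6, 20))
print(daily_counts)
```

Days with no tweets are kept in the result with a count of zero, so the plotted line covers the full seven-day window.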

Appendix D – User Guide/Manual

• Access the web application by navigating to the website’s URL.
• Navigate to the register page and fill out the form with the appropriate information.
• If a user successfully registers, the web application will redirect the user to the login
page, where the user provides their correct email and password.
• If login is successful, the user will be redirected to their dashboard.
• On the dashboard there are two pages: “Keyword Based Analysis” and “Social Media
Account Analysis”.
• On the “Keyword Based Analysis” page there are two tab options: “Tracker” and
“Historic Data”.
• On the “Tracker” page, users can add keywords by clicking on the “Add Keyword”
button.
• Once a keyword has been added, the user can click the submit button beside the keyword
to load the analysis generated for it, or click the delete button beside the keyword to
delete it.
• On the analysis page, the following will be provided:
- A line graph/bar chart showing the number of posts extracted in the last seven days.
- A pie chart showing the cumulative sentiment analysis of the tweets extracted.
- A pie chart showing the devices used to make the tweets.
- A pie chart showing the types of tweets extracted (Original Tweets, Retweets or
Replies).
- On the top left there is a side nav button that, when clicked, slides out a side
navigation bar with the options “Analysis” and “Post”.
- Under “Post”, the user can see all the tweets extracted in one table, the positive
tweets in a separate table and the negative tweets in another table.
• On the “Tracker” page, users can also compare the analysis of two keywords by
clicking on the “Compare” button.
• Once a user clicks on the “Compare” button, a pop-up will appear with a list of the
keywords already added by the user, from which exactly two keywords can be selected for
comparison.
• On the analysis page for comparing keywords, the following will be provided:
- A double line graph or bar chart showing the number of posts extracted in the last
seven days for both keywords.
- A pie chart showing the cumulative sentiment analysis of the tweets extracted for
both keywords.
- A pie chart showing the devices used to make the tweets for both keywords.
- A pie chart showing the types of tweets extracted (Original Tweets, Retweets or
Replies) for both keywords.
- Extra information on both keywords.
• On the “Historic Data” page, the keywords added by the user will be displayed with a
dropdown option which allows users to gain access to up to 1000 tweets based on a
keyword.
• On the analysis page for Historic Data, the following will be provided:
- A line graph/bar chart showing the number of posts extracted monthly.
- A pie chart showing the cumulative sentiment analysis of the tweets extracted.
• On the “Social Media Account Analysis” page, users can add valid Twitter accounts by
clicking on the “Add Twitter Account” button, which adds the Twitter account to their
dashboard.
• Beside each Twitter account added there is a submit button which gives users access to
the analysis of that Twitter account. If the Twitter account is invalid or private, it is
deleted from the dashboard automatically. There is also a delete button which allows
users to delete Twitter accounts from their dashboard.
• On the analysis page for Twitter accounts, the following will be provided:
- Details about the Twitter account.
- A grouped bar chart that shows the number of engagements on an hourly basis for the
latest 20 posts made by the Twitter account.
- A table containing the latest 20 posts by the Twitter account.
• There is also a logout option for logged-in users.
