
YZV 201E - Data Structures

Exercise #2

Introduction
For this assignment, you are expected to implement a program called TweetAnalyzer that analyzes
people’s opinions from hashtags and tweets using sentiment analysis, that is, classifying a piece of text
as positive, neutral, or negative.

For any issues regarding the assignment, please contact Barış Bilen (bilenb20@itu.edu.tr).

Implementation Notes
The implementation details below are included in the grading:

1. Please follow a consistent coding style (indentation, variable names, etc.) with comments.
2. Please do not use any pre-compiled header files or STL commands.

1 Data
You are given three .txt files: positive_words_tr [1], negative_words_tr [1], and tweets.

1.1 positive_words_tr
A list of words that have a positive effect in a sentence (e.g., good). The first line contains the number
of words in the file.

1.2 negative_words_tr
A list of words that have a negative effect in a sentence (e.g., bad). The first line contains the number
of words in the file.
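
Because the first line of each word file gives the number of words, the array that holds them can be allocated once, before any word is read. The sketch below shows one possible way to do this with plain C strings and no STL containers. The names WordList, loadWordList, and freeWordList are hypothetical, and the code assumes one word per line with each word shorter than 64 bytes, which the handout does not state.

```cpp
#include <cassert>
#include <cstring>
#include <fstream>

// Hypothetical holder for one lexicon file; the name is not from the assignment.
struct WordList {
    int count;      // number of words actually loaded
    char** words;   // dynamically allocated array of C strings
};

// Loads a lexicon whose first line is the word count.
// Assumption: the remaining lines hold one word each, shorter than 64 bytes.
WordList loadWordList(const char* path) {
    assert(path != nullptr);
    WordList list = {0, nullptr};

    std::ifstream file(path);
    int expected = 0;
    if (!(file >> expected) || expected <= 0) {
        return list;  // missing file or malformed header: return an empty list
    }

    list.words = new char*[expected];
    char buffer[64];
    while (list.count < expected) {
        file.width(sizeof buffer);        // cap each read at 63 characters
        if (!(file >> buffer)) break;     // fewer words than announced
        list.words[list.count] = new char[std::strlen(buffer) + 1];
        std::strcpy(list.words[list.count], buffer);
        ++list.count;
    }
    return list;
}

// Releases everything allocated by loadWordList.
void freeWordList(WordList& list) {
    for (int i = 0; i < list.count; ++i) delete[] list.words[i];
    delete[] list.words;
    list.words = nullptr;
    list.count = 0;
}
```

If the file announces more words than it contains, count reflects the number actually read, so later loops should use count rather than the header value.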

1.3 tweets
A file that consists of hashtags and tweets including username and date in the following format:
#hashtag
@username
id
day month year
<tweet>
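
The five-line record layout above can be read with one getline call per line. The following is a minimal sketch under several assumptions that the handout does not confirm: the date line holds three integers (for example 14 3 2022), the id is kept as text, and the fixed buffer sizes are large enough. TweetRecord, readRecord, and countRecords are hypothetical names.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <istream>

// Hypothetical plain record mirroring the five-line layout.
// The fixed buffer sizes are assumptions, not requirements.
struct TweetRecord {
    char hashtag[64];
    char username[64];
    char id[32];
    int day, month, year;
    char text[512];
};

// Reads one five-line record.
// Returns false at end of input or when a record is incomplete or malformed.
bool readRecord(std::istream& in, TweetRecord& rec) {
    char dateLine[64];
    if (!in.getline(rec.hashtag, sizeof rec.hashtag)) return false;
    if (!in.getline(rec.username, sizeof rec.username)) return false;
    if (!in.getline(rec.id, sizeof rec.id)) return false;
    if (!in.getline(dateLine, sizeof dateLine)) return false;
    if (!in.getline(rec.text, sizeof rec.text)) return false;
    // Assumption: the date line holds three integers, e.g. "14 3 2022".
    return std::sscanf(dateLine, "%d %d %d",
                       &rec.day, &rec.month, &rec.year) == 3;
}

// Counts the well-formed records in a file; returns 0 if it cannot be opened.
int countRecords(const char* path) {
    assert(path != nullptr);
    std::ifstream file(path);
    TweetRecord rec;
    int count = 0;
    while (readRecord(file, rec)) ++count;
    return count;
}
```

In the real program, the body of the loop in countRecords is where each record would be handed to the linked list instead of merely being counted.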

2 Implementation Details
You are expected to implement three classes: HashtagList, Hashtag, and Tweet.

Notice: You need to store hashtags and tweets in the linked list structure, as shown in the figure
above. Hence, do not forget to implement any necessary variables and functions that are not listed below.
Hashtags should be stored in alphabetical order (A-Z), and tweets should be ordered from newest to oldest.
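
One way to keep hashtags in A-Z order is to insert each new hashtag at its sorted position rather than sorting afterwards. The sketch below assumes a singly linked list and plain strcmp ordering, which compares bytes, so it is case-sensitive and does not follow Turkish alphabet rules. HashtagNode and findOrInsert are hypothetical names; the real Hashtag class would also hold the head of its tweet list.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical node type; the real Hashtag class would also own a tweet list.
struct HashtagNode {
    char text[64];
    HashtagNode* next;
};

// Returns the node for `text`, inserting it in A-Z order if it is not present.
// Ordering uses strcmp (byte order), which is an assumption about what
// "alphabetical" should mean here.
HashtagNode* findOrInsert(HashtagNode*& head, const char* text) {
    assert(text != nullptr);

    // `link` always points at the pointer that would reference the new node.
    HashtagNode** link = &head;
    while (*link != nullptr && std::strcmp((*link)->text, text) < 0) {
        link = &(*link)->next;
    }
    if (*link != nullptr && std::strcmp((*link)->text, text) == 0) {
        return *link;  // hashtag already in the list
    }

    HashtagNode* node = new HashtagNode;
    std::strncpy(node->text, text, sizeof node->text - 1);
    node->text[sizeof node->text - 1] = '\0';
    node->next = *link;
    *link = node;
    return node;
}
```

Walking the list with a pointer to a pointer avoids a separate case for insertion at the head.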

[1] Chen, Y., & Skiena, S. (2014). Building Sentiment Lexicons for All Major Languages. In ACL (2) (pp. 383-389).

2.1 HashtagList
Public Attributes:
The HashtagList class has a country (e.g., Turkey) to specify which country the hashtag list belongs to.

Methods:
1. AnalyzeAll: Check the sentiment of every tweet in this list and report the overall happiness of the
country that the list is associated with.

To check the sentiment of a tweet, you can use the strtok function to tokenize (split) a sentence into
words, and check whether they are in the positive or negative word lists. For every word that is in the
positive word list, increase the sentiment value of the tweet by one, and for every word that is in the
negative word list, decrease the sentiment value of the tweet by one. Do nothing for words that belong
to neither list.
A small tip about using strtok:
#include <cstdlib>   // free
#include <cstring>   // strdup, strtok
#include <iostream>
#include <string>

using namespace std;

int main() {
    string input = "the quick brown fox jumps over the lazy dog";

    // duplicate the string because strtok modifies its input in place
    // (strdup is a POSIX function, not standard C++)
    char* dup = strdup(input.c_str());
    char* token = strtok(dup, " ");
    while (token) {
        cout << token << endl;
        token = strtok(nullptr, " ");
    }

    free(dup);
}

The code above produces the following output:


the
quick
brown
fox
jumps
over
the
lazy
dog

2. AnalyzeHashtag: Check the sentiment of every tweet under a given hashtag in the list and report its
happiness.
3. AnalyzeTweet: Check the sentiment of a single tweet from the list and report its happiness.

4. IncreaseHappiness: Delete all tweets that have a lower sentiment value than the given input so that
people have less negativity in their lives. If a hashtag has no tweets left after this operation, remove
it from the list.
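
The scoring rule described under AnalyzeAll (plus one per positive word, minus one per negative word) can be sketched as follows. This is only an illustration: contains and scoreTweet are hypothetical names, the word lists are assumed to be plain arrays of C strings, lookup is a simple linear search, and tokens are split on spaces only, so punctuation and letter case are not normalized.

```cpp
#include <cassert>
#include <cstring>

// Linear search over an array of C strings (hypothetical helper).
bool contains(const char* const* words, int count, const char* token) {
    for (int i = 0; i < count; ++i) {
        if (std::strcmp(words[i], token) == 0) return true;
    }
    return false;
}

// Returns the sentiment value of one tweet:
// +1 for each token in the positive list, -1 for each token in the negative list.
int scoreTweet(const char* text,
               const char* const* positive, int positiveCount,
               const char* const* negative, int negativeCount) {
    assert(text != nullptr);

    // strtok writes into its input, so work on a private copy.
    char* copy = new char[std::strlen(text) + 1];
    std::strcpy(copy, text);

    int score = 0;
    for (char* token = std::strtok(copy, " "); token != nullptr;
         token = std::strtok(nullptr, " ")) {
        // Two independent checks: a word found in both lists nets to zero.
        if (contains(positive, positiveCount, token)) ++score;
        if (contains(negative, negativeCount, token)) --score;
    }

    delete[] copy;
    return score;
}
```

AnalyzeTweet could report this value directly, while AnalyzeHashtag and AnalyzeAll could sum it over the tweets they visit. How the sum maps to "happiness" is left open by the handout.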

2.2 Hashtag
Public Attributes:
The Hashtag class has a text (e.g., #ITU).
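
Since each hashtag holds its own list of tweets, the IncreaseHappiness operation from Section 2.1 amounts to unlinking nodes at two levels: first the tweets below the threshold, then any hashtag whose tweet list became empty. The sketch below assumes singly linked lists and that each tweet already stores its computed sentiment value. TweetNode, HashtagNode, pruneTweets, and increaseHappiness are hypothetical, cut-down stand-ins for the real classes.

```cpp
#include <cassert>

// Cut-down, hypothetical node types; real classes carry more fields.
struct TweetNode {
    int sentiment;       // assumed to be computed beforehand
    TweetNode* next;
};

struct HashtagNode {
    char text[64];
    TweetNode* tweets;   // head of this hashtag's tweet list
    HashtagNode* next;
};

// Deletes every tweet whose sentiment is strictly lower than `threshold`.
// Returns how many tweets were removed.
int pruneTweets(TweetNode*& head, int threshold) {
    int removed = 0;
    TweetNode** link = &head;
    while (*link != nullptr) {
        if ((*link)->sentiment < threshold) {
            TweetNode* victim = *link;
            *link = victim->next;   // unlink before deleting
            delete victim;
            ++removed;
        } else {
            link = &(*link)->next;
        }
    }
    return removed;
}

// Prunes every hashtag, then drops hashtags left without tweets.
// Returns the total number of tweets removed.
int increaseHappiness(HashtagNode*& head, int threshold) {
    int removed = 0;
    HashtagNode** link = &head;
    while (*link != nullptr) {
        removed += pruneTweets((*link)->tweets, threshold);
        if ((*link)->tweets == nullptr) {
            HashtagNode* empty = *link;
            *link = empty->next;
            delete empty;
        } else {
            // A hashtag that stays in the list must still own a tweet.
            assert((*link)->tweets != nullptr);
            link = &(*link)->next;
        }
    }
    return removed;
}
```

Because removal only unlinks nodes, the A-Z order of the remaining hashtags and the date order of the remaining tweets are preserved.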

2.3 Tweet
Public Attributes:
The Tweet class has a username (e.g., @itu_yzv), id, day, month, year, and text (the tweet itself).
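
The day, month, and year attributes are what make the newest-to-oldest ordering required in Section 2 possible. One simple approach, shown here only as a sketch, is to pack the three fields into a single integer key and insert each tweet in front of the first older one. The code assumes the month is stored as a number from 1 to 12, which the handout does not specify. TweetNode, dateKey, and insertNewestFirst are hypothetical names.

```cpp
#include <cassert>

// Hypothetical node holding only the fields needed for ordering.
struct TweetNode {
    int day, month, year;
    TweetNode* next;
};

// Packs a date into one integer so that later dates compare greater,
// e.g. 14 3 2022 becomes 20220314. Assumes a numeric month.
int dateKey(int day, int month, int year) {
    assert(day >= 1 && day <= 31 && month >= 1 && month <= 12);
    return year * 10000 + month * 100 + day;
}

// Inserts `node` so that the list stays ordered from newest to oldest.
// Tweets with the same date keep their insertion order.
void insertNewestFirst(TweetNode*& head, TweetNode* node) {
    assert(node != nullptr);
    const int key = dateKey(node->day, node->month, node->year);

    TweetNode** link = &head;
    while (*link != nullptr &&
           dateKey((*link)->day, (*link)->month, (*link)->year) >= key) {
        link = &(*link)->next;   // skip tweets that are newer or same-day
    }
    node->next = *link;
    *link = node;
}
```

If the tweets file stores month names instead of numbers, a small name-to-number lookup would be needed before calling dateKey.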

3 Program Flow

Based on the user's choice, your program should be able to handle the commands in the menu above. So,
implement any necessary functions and variables that are not listed above.
