
Lecture 3: Analyze Twitter Data by Time Period
Import libraries
• import numpy as np
• import pandas as pd
• from pandas import DataFrame
• from pandas import Series
• import calendar
• import matplotlib.pyplot as plt
• df = pd.read_csv('d:/CSR_user_timeline_2013.csv')
Convert created_at column to time variable
• df.columns
• len(df.columns)
• df.dtypes
• To work with time, the dataframe needs a variable that indicates time. We will use the
created_at column, which holds the time at which each tweet was created. The following
line converts this variable from text format to pandas' datetime format (datetime64).
Convert created_at column to time variable (continued)
df.dtypes[:14]
df['created_at'] = pd.to_datetime(df['created_at'])

df.dtypes[:14]
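To see what the conversion does without the lecture's CSV file, here is a minimal sketch on a made-up dataframe. The rows and the timestamp format are assumptions for illustration; the real file may store created_at differently.

```python
import pandas as pd

# Hypothetical sample rows; the real file's timestamp format may differ.
sample = pd.DataFrame({
    "created_at": ["2013-01-01 08:15:00",
                   "2013-01-01 21:40:10",
                   "2013-01-02 09:05:30"],
    "content": ["first tweet", "second tweet", "third tweet"],
})

# Before conversion the column holds plain text.
print(sample.dtypes["created_at"])

# Parse the text into pandas datetime values.
sample["created_at"] = pd.to_datetime(sample["created_at"])
print(sample.dtypes["created_at"])

# Once the column is a datetime, its parts are available through .dt
hours = sample["created_at"].dt.hour.tolist()
print(hours)
```

After the conversion, time components such as the hour or the weekday can be read directly from the column, which is what the grouping steps later in the lecture rely on.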
Set the Index
• df = df.set_index(['created_at'])
• df.head(2)
Generate Number of Tweets over Different Time Periods
• def f(x):
      return Series(dict(Number_of_tweets=x['content'].count()))
Generate Daily Counts
• daily_count = df.groupby(df.index.date).apply(f)
• print(len(daily_count))
• daily_count.head(5)
Naming Index column

• daily_count.index.name = 'date'
• daily_count.head(5)
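The daily-count steps above can be tried end to end on a small made-up dataframe. This is only a sketch: the three sample tweets are invented, and the content column name is taken from the function f defined earlier.

```python
import pandas as pd
from pandas import Series

# Hypothetical sample data standing in for the real tweet file.
df = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2013-01-01 08:15:00",
        "2013-01-01 21:40:10",
        "2013-01-02 09:05:30",
    ]),
    "content": ["first tweet", "second tweet", "third tweet"],
}).set_index("created_at")

def f(x):
    # Return a one-element Series per group; pandas stacks these
    # into a dataframe with one row per group.
    return Series(dict(Number_of_tweets=x["content"].count()))

# Group rows by the calendar date of the index and count tweets per day.
daily_count = df.groupby(df.index.date).apply(f)
daily_count.index.name = "date"
print(daily_count)
```

With this sample, the first of January has two tweets and the second has one, so daily_count has two rows.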
Minimum and maximum daily values
• daily_count.index.min()
• Result:
• datetime.date(2013, 1, 1)
• daily_count.index.max()
• Result:
• datetime.date(2013, 12, 31)
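Note that index.min() and index.max() return the earliest and latest dates in the data, not the days with the fewest or most tweets. The sketch below shows both on a made-up daily_count table; the counts are invented, and idxmin/idxmax are one possible way to find the quietest and busiest days.

```python
import datetime
import pandas as pd

# Hypothetical daily counts; the numbers are invented for illustration.
daily_count = pd.DataFrame(
    {"Number_of_tweets": [5, 12, 3]},
    index=pd.Index(
        [datetime.date(2013, 1, 1),
         datetime.date(2013, 1, 2),
         datetime.date(2013, 1, 3)],
        name="date",
    ),
)

# Earliest and latest dates covered by the data.
first_day = daily_count.index.min()
last_day = daily_count.index.max()

# Smallest and largest daily counts, and the dates on which they occur.
lowest = daily_count["Number_of_tweets"].min()
highest = daily_count["Number_of_tweets"].max()
quietest_day = daily_count["Number_of_tweets"].idxmin()
busiest_day = daily_count["Number_of_tweets"].idxmax()

print(first_day, last_day)
print(quietest_day, lowest)
print(busiest_day, highest)
```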
Generate Day-of-the-Week Tweets
• weekday_count = df.groupby(df.index.weekday).apply(f)
• print(len(weekday_count))
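The weekday index produced above is numeric (0 for Monday through 6 for Sunday). As a possible follow-up, the calendar module imported at the start can turn those numbers into day names. The sketch below uses invented sample tweets; relabelling the index this way is a suggestion, not a step from the lecture.

```python
import calendar
import pandas as pd
from pandas import Series

# Hypothetical sample data: 1 Jan 2013 was a Tuesday.
df = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2013-01-01 08:00:00",   # Tuesday
        "2013-01-08 09:00:00",   # Tuesday
        "2013-01-02 10:00:00",   # Wednesday
        "2013-01-07 11:00:00",   # Monday
    ]),
    "content": ["a", "b", "c", "d"],
}).set_index("created_at")

def f(x):
    return Series(dict(Number_of_tweets=x["content"].count()))

# Group by weekday number: 0 = Monday ... 6 = Sunday.
weekday_count = df.groupby(df.index.weekday).apply(f)

# Replace the numbers with day names (names follow the current locale).
weekday_count.index = [calendar.day_name[int(i)] for i in weekday_count.index]
weekday_count.index.name = "weekday"
print(weekday_count)
```

Only weekdays that actually occur in the data appear in the result, so len(weekday_count) can be less than 7 for a small sample.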
Creating plot for daily count
• daily_plot = daily_count['Number_of_tweets'].plot(kind='line',
      alpha=1, legend=True, color='blue', figsize=(12,8))
• daily_plot.set_xlabel('Daily', weight='bold', labelpad=15)
• daily_plot.set_ylabel('# Tweets (Messages)',
      weight='bold', labelpad=1, fontsize=17)
Creating plot for weekday count
• weekday_plot = weekday_count['Number_of_tweets'].plot(kind='line',
      alpha=1, legend=True, color='blue', figsize=(12,8))
• weekday_plot.set_xlabel('Day of the Week', weight='bold', labelpad=15)
• weekday_plot.set_ylabel('# Tweets (Messages)',
      weight='bold', labelpad=1, fontsize=17)
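The plotting calls can be tested without the tweet file. The sketch below draws the same kind of line plot from invented counts; the numbers, the abbreviated day labels, and the choice of the non-interactive Agg backend (so the code runs without a display) are all assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to see a window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical weekday counts; the numbers are invented.
weekday_count = pd.DataFrame(
    {"Number_of_tweets": [40, 55, 52, 48, 60, 20, 15]},
    index=["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"],
)

# Series.plot returns the matplotlib Axes, which is then used for labelling.
weekday_plot = weekday_count["Number_of_tweets"].plot(
    kind="line", alpha=1, legend=True, color="blue", figsize=(12, 8)
)
weekday_plot.set_xlabel("Day of the week", weight="bold", labelpad=15)
weekday_plot.set_ylabel("# Tweets (Messages)", weight="bold",
                        labelpad=1, fontsize=17)
```

In an interactive session, calling plt.show() afterwards displays the figure; with the Agg backend, weekday_plot.figure.savefig(...) can write it to a file instead.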
ASSIGNMENT 3
1. Generate the monthly number of tweets.
2. Generate the yearly number of tweets.
3. Generate the hourly number of tweets.
4. Generate the number of tweets per minute.
5. Generate the number of tweets per second.
6. Create a plot of the monthly number of tweets.
7. Create a plot of the hourly number of tweets.
8. Create a plot of the number of tweets per minute.
9. What are the minimum and maximum weekly numbers of tweets?
10. What are the minimum and maximum monthly numbers of tweets?
11. Change the index from created_at back to the original index.
12. Set tweet_id as the index.
