
TE Computer Engineering 2022-23 Mini project report

Artificial Intelligence
Mini Project Report
On

"Sentiment Analysis using Natural Language Processing"

Submitted in the fulfillment of requirements of Savitribai Phule Pune University


For the degree of T.E (Computer Engineering)

By

Mr. Shivam Bharambe Seat No. T190554212 (TE-A-108)

Department of Computer Engineering


PCET-NMVPM’s
NUTAN MAHARASHTRA INSTITUTE OF ENGINEERING AND TECHNOLOGY
TALEGAON-DABHADE,
Vishnupuri, Talegaon Station-410507

SAVITRIBAI PHULE PUNE UNIVERSITY


2021-2022

Department Of Computer Engineering , NMIET, Talegaon



NUTAN MAHARASHTRA INSTITUTE OF ENGINEERING & TECHNOLOGY, TALEGAON-DABHADE

CERTIFICATE
This is to Certify that

Mr. Shivam Bharambe Seat No. T190554212

Has satisfactorily completed the requirements of Artificial Intelligence mini project simple
portfolio for the degree of T.E (Computer Engineering)
On

‘Sentiment Analysis using Natural Language Processing’

Prof. Bharti Jadhao (Subject Teacher)
Dr. Saurabh Saoji (Head of Department)
Dr. Vilas Deotare (Principal)


Sr. No.   Title

1         Abstract
2         Title of Mini Project
3         Problem Statement / Aim
4         Objectives
5         Source Code
6         Output
7         Conclusion


ABSTRACT

This report presents the development of a sentiment analysis project using natural language
processing techniques. The project aims to classify text data into positive, negative, or neutral
sentiment categories. The report outlines the project's objectives, methodology, data sources, and
tools used. The project utilizes Python and its natural language processing libraries, including
NLTK and Scikit-learn, to preprocess the data and perform sentiment analysis. The report also
includes an evaluation of the model's performance and potential improvements, demonstrating the
effectiveness of natural language processing techniques in sentiment analysis tasks.

• Sentiment analysis project using natural language processing techniques
• Aims to classify text data into positive, negative, or neutral sentiment categories
• Utilizes Python and its natural language processing libraries
• Includes an evaluation of the model's performance and potential improvements
• Demonstrates effectiveness of natural language processing in sentiment analysis tasks.


Title: Sentiment Analysis using Natural Language Processing


Objective: To perform sentiment analysis on a dataset using Python's Natural Language
Toolkit (NLTK) library and classify the data into positive, negative or neutral sentiments.

Problem Statement:
Sentiment analysis is a technique used to analyze and classify the sentiment of a text. With the
increase in social media usage, it has become important to monitor the sentiment of people
towards a particular product, service or event. The aim of this project is to perform sentiment
analysis on a dataset of tweets related to a particular topic and classify them into positive,
negative or neutral sentiments.
Outcome: The output of this project will be a report containing the sentiment analysis of the
given dataset and a classification of the tweets into positive, negative or neutral sentiments.
Software Requirements:
To perform the analytics, we need the following software:
• Python 3.5 or above
• NLTK library
• pandas library
• matplotlib library
• Scikit-learn library

Theory Concept:

Natural Language Processing (NLP): A subfield of Artificial Intelligence that deals with the
interaction between computers and humans using natural language. NLP techniques are used to
process, analyze and understand human language.

Sentiment Analysis: A technique used to analyze and classify the sentiment of a text. Sentiment
analysis can be performed using machine learning algorithms or lexicon-based approaches.

NLTK Library: A Python library used for natural language processing. It provides a suite of text
processing libraries and tools, including tokenization, stemming, lemmatization, part-of-speech
tagging, and sentiment analysis.

Pandas: A Python library used for data manipulation and analysis. It provides data structures for
efficiently storing and manipulating large datasets, and functions for data cleaning, merging,
filtering, and other operations.
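The lexicon-based approach mentioned in the theory can be illustrated with a minimal sketch. The word lists below are tiny, hypothetical examples and not a real sentiment lexicon, and the scoring rule (positive word count minus negative word count) is only one simple assumption among many possible ones. The function name lexicon_sentiment is likewise made up for this illustration.

```python
import re

# Hypothetical toy lexicon, for illustration only (not a real sentiment resource).
POSITIVE_WORDS = {"love", "great", "good", "excellent", "happy"}
NEGATIVE_WORDS = {"hate", "bad", "terrible", "poor", "sad"}


def lexicon_sentiment(text):
    """Return 1 (positive), -1 (negative) or 0 (neutral) for a text."""
    # Lowercase the text and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Score = number of positive words minus number of negative words.
    score = sum(t in POSITIVE_WORDS for t in tokens) - sum(
        t in NEGATIVE_WORDS for t in tokens
    )
    if score > 0:
        return 1
    if score < 0:
        return -1
    return 0


if __name__ == "__main__":
    for sentence in ["I love this product", "This is a terrible phone", "It arrived today"]:
        print(sentence, "->", lexicon_sentiment(sentence))
```

Unlike the machine learning approach used in this project, such a scorer needs no training data, but it cannot learn topic-specific vocabulary and ignores negation ("not good").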


Program

1. Import necessary libraries: Import the required libraries - NLTK, Pandas and Scikit-learn.
2. Load the dataset: Load the dataset of tweets into a Pandas dataframe.
3. Text preprocessing: Preprocess the text data by removing stop words, punctuation marks, and
   performing tokenization.
4. Feature extraction: Extract features from the preprocessed text data using bag-of-words and
   TF-IDF techniques.
5. Sentiment analysis: Train a machine learning algorithm (such as Naive Bayes or Support Vector
   Machine) using the extracted features and perform sentiment analysis on the dataset.
6. Classification: Classify the tweets into positive, negative or neutral sentiments based on the
   sentiment score.
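Steps 3 and 4 can be sketched in plain Python to show what the feature extraction computes. This is an illustrative sketch, not the project's code: the stop-word list is a small hypothetical stand-in for NLTK's list, the helper names are made up, and the TF-IDF weight uses the textbook form tf × log(N / df). Scikit-learn's TfidfVectorizer applies a smoothed IDF and L2 normalization by default, so its numbers will differ.

```python
import math
import re
from collections import Counter

# Small hypothetical stop-word list; NLTK's English list is much longer.
STOP_WORDS = {"i", "this", "is", "a", "the", "it", "and"}


def preprocess(text):
    """Lowercase, tokenize on alphabetic runs, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def bag_of_words(docs):
    """Return one Counter of term counts per document."""
    return [Counter(preprocess(d)) for d in docs]


def tf_idf(docs):
    """Weight each term count by log(N / document frequency)."""
    counts = bag_of_words(docs)
    n_docs = len(counts)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for c in counts for term in c)
    return [
        {term: tf * math.log(n_docs / doc_freq[term]) for term, tf in c.items()}
        for c in counts
    ]


if __name__ == "__main__":
    tweets = ["I love this product", "I hate this product", "The product is okay"]
    print(bag_of_words(tweets)[0])
    print(tf_idf(tweets)[0])
```

A term that occurs in every document (here "product") receives a weight of zero under this formula, which is the intuition behind TF-IDF: words that do not distinguish documents contribute nothing.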

Code and Output

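# NLP and machine learning imports
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, confusion_matrix
import pandas as pd

# Download the NLTK resources used below: the stop-word list and the
# tokenizer models (newer NLTK releases may ask for 'punkt_tab' instead)
nltk.download('stopwords')
nltk.download('punkt')

# Load the tweets; the CSV is expected to have 'text' and 'sentiment' columns
df = pd.read_csv('tweets.csv')

# Lowercase, tokenize, and keep alphabetic tokens that are not stop words
stop_words = set(stopwords.words('english'))
df['text'] = df['text'].apply(lambda x: ' '.join(
    [word for word in word_tokenize(x.lower())
     if word.isalpha() and word not in stop_words]))

# Bag-of-words features and a Multinomial Naive Bayes classifier
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(df['text'])
clf = MultinomialNB()
clf.fit(X, df['sentiment'])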


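# Apply the same preprocessing to a new tweet before vectorizing it
test_tweet = "I love this product"
test_tweet = ' '.join([word for word in word_tokenize(test_tweet.lower())
                       if word.isalpha() and word not in stop_words])
test_tweet = vectorizer.transform([test_tweet])

# predict() returns an array, so take its single element
# (labels are assumed to be 1 = positive, -1 = negative, otherwise neutral)
prediction = clf.predict(test_tweet)[0]
if prediction == 1:
    print("Positive Sentiment")
elif prediction == -1:
    print("Negative Sentiment")
else:
    print("Neutral Sentiment")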

Output:

• If the sentiment predicted for the test_tweet is positive, the output will be: "Positive Sentiment"
• If the sentiment predicted for the test_tweet is negative, the output will be: "Negative Sentiment"
• If the sentiment predicted for the test_tweet is neutral, the output will be: "Neutral Sentiment"

The exact output depends on the contents of the 'tweets.csv' dataset used for training and on the
sentiment the model predicts for the specific test_tweet.
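The program imports accuracy_score and confusion_matrix but trains on the full dataset and never measures performance. A minimal sketch of a held-out evaluation is given below. The inline texts and labels are hypothetical toy data standing in for 'tweets.csv', the function name evaluate is made up, and the 75/25 split with a fixed seed is an assumption; numbers obtained on such a tiny sample say nothing about the real dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data (1 = positive, -1 = negative, 0 = neutral).
TEXTS = [
    "love this product", "great phone works well", "excellent service happy",
    "really good quality", "hate this product", "terrible phone broke",
    "bad service very sad", "poor quality awful", "package arrived today",
    "ordered the phone", "store opens at nine", "product comes in blue",
]
LABELS = [1, 1, 1, 1, -1, -1, -1, -1, 0, 0, 0, 0]


def evaluate(texts, labels, test_size=0.25, seed=0):
    """Train on one part of the data and score on the held-out part."""
    train_text, test_text, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed
    )
    vectorizer = CountVectorizer()
    # Fit the vocabulary on the training texts only, then reuse it for the test texts.
    X_train = vectorizer.fit_transform(train_text)
    X_test = vectorizer.transform(test_text)

    clf = MultinomialNB()
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)

    acc = accuracy_score(y_test, y_pred)
    # Rows are true labels, columns are predicted labels, in the order -1, 0, 1.
    cm = confusion_matrix(y_test, y_pred, labels=[-1, 0, 1])
    return acc, cm


if __name__ == "__main__":
    accuracy, matrix = evaluate(TEXTS, LABELS)
    print("Accuracy:", accuracy)
    print("Confusion matrix:")
    print(matrix)
```

Fitting the vectorizer on the training portion only keeps words from the test tweets out of the model's vocabulary, so the reported accuracy reflects text the model has not seen.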


Conclusion: The sentiment analysis project demonstrated how machine learning can be used to
classify text into positive, negative, or neutral sentiment. The project used CountVectorizer for
feature extraction and Multinomial Naive Bayes for classification. Overall, the project showcased
the potential of AI and NLP in analysing large volumes of text data.

Date of Presentation Performance Understanding Total Sign with


Checking (10) (10) (10) (10) Date
(10)
