
WhatsApp Chat Analyser

A Minor Project Report


Submitted in partial fulfillment of the requirements for the
Degree of
BACHELOR OF TECHNOLOGY in COMPUTER SCIENCE &
ENGINEERING
BY
Shivani Gehlot
EN21CS301720

Under the Guidance of


Prof. Mandakini Ingle

Department of Computer Science & Engineering


Faculty of Engineering
MEDI-CAPS UNIVERSITY, INDORE- 453331

APRIL-2024
Report Approval

The project work “WhatsApp Chat Analyser” is hereby approved as a creditable
study of an engineering/computer application subject carried out and presented in a
manner satisfactory to warrant its acceptance as a prerequisite for the Degree for
which it has been submitted.

It is to be understood that by this approval the undersigned do not endorse or
approve any statement made, opinion expressed, or conclusion drawn therein, but
approve the “Project Report” only for the purpose for which it has been submitted.

Internal Examiner
Name:
Designation
Affiliation

External Examiner
Name:
Designation
Affiliation

Declaration

I hereby declare that the project entitled “WhatsApp Chat Analyser”, submitted in
partial fulfillment for the award of the degree of Bachelor of Technology/Master of
Computer Applications in ‘Computer Science & Engineering’ and completed under
the supervision of Prof. Mandakini Ingle, Assistant Professor in the Department
of Computer Science and Engineering, Faculty of Engineering, Medi-Caps
University, Indore, is an authentic work.

Further, I declare that the content of this project work, in full or in part, has
neither been taken from any other source nor been submitted to any other
Institute or University for the award of any degree or diploma.

Signature and name of the student(s) with date

Certificate

I, Prof. Mandakini Ingle, certify that the project entitled “WhatsApp Chat
Analyser”, submitted in partial fulfillment for the award of the degree of Bachelor
of Technology/Master of Computer Applications by Shivani Gehlot, is a record of
work carried out by her under my guidance and that the work has not formed the basis of
award of any other degree elsewhere.

________________________________

Prof. Mandakini Ingle


Computer Science and Engineering

Medi-Caps University, Indore

_____________________

Dr. Ratnesh Litoriya

Head of the Department


Computer Science & Engineering

Medi-Caps University, Indore

Acknowledgements

I would like to express my deepest gratitude to the Honorable Chancellor, Shri R C Mittal, who
has provided me with every facility to successfully carry out this project, and my profound
indebtedness to Prof. (Dr.) D. K. Patnaik, Vice Chancellor, Medi-Caps University, whose
unfailing support and enthusiasm have always boosted my morale. I also thank Prof. (Dr.)
Pramod S. Nair, Dean, Faculty of Engineering, Medi-Caps University, for giving me a chance
to work on this project. I would also like to thank my Head of the Department, Dr. Ratnesh
Litoriya, for his continuous encouragement for the betterment of the project.

It is thanks to their help and support that I was able to complete the design
and technical report.

Without their support this report would not have been possible.

Shivani Gehlot
B.Tech. III Year
Department of Computer Science & Engineering
Faculty of Engineering
Medi-Caps University, Indore

Abstract
WhatsApp has evolved into an essential platform for global communication, serving billions of
users across personal and professional spheres. As the volume of data generated from WhatsApp
conversations continues to surge, there's a growing interest in extracting meaningful insights and
patterns from this wealth of information. Enter the WhatsApp chat analyzer—a web application
crafted to cater to this need, furnishing users with a robust tool for unraveling the intricacies of
their conversations.

At its essence, the WhatsApp chat analyzer harnesses a blend of machine learning, natural
language processing (NLP), and data visualization methodologies to offer users a holistic
understanding of their WhatsApp chats. It serves as a web-based platform specifically
engineered to analyze both group and individual chats.

Utilizing popular Python libraries such as Matplotlib, Streamlit, Seaborn, re, and pandas,
alongside key NLP concepts, the analyzer delves deep into the data, providing users with
actionable insights and compelling visualizations. By combining machine learning and NLP
techniques, it elevates the analysis process, empowering users to glean valuable information
from their chat data.

Through interactive visualizations generated using Matplotlib, Seaborn, and Streamlit, users gain
a nuanced perspective of their chat dynamics. From graphical representations of message
frequency over time to word clouds highlighting frequently used terms, the visualizations offer a
comprehensive overview of the chat data, making it easier for users to identify patterns and
derive insights.

Ultimately, the WhatsApp chat analyzer serves as an invaluable toolkit for dissecting and
comprehending WhatsApp conversations. By amalgamating cutting-edge technologies and
visualization techniques, it equips users with the tools needed to make informed decisions,
enhance communication strategies, and unlock the full potential of their chat data.

Table of Contents
S.NO Content
Report Approval
Declaration
Certificate
Acknowledgement
Abstract
Table of Contents
List of figures
Abbreviations
Chapter 1 Introduction
1.1 Introduction
1.2 Literature Review
1.3 Objectives
1.4 Significance
1.5 Research Design
1.6 Source of Data
Chapter 2 Requirements Specification
2.1 User Characteristic
2.2 Functional Requirements
2.3 Dependencies
2.4 Performance Requirements
2.5 Hardware Requirements
2.6 Constraints & Assumptions
Chapter 3 Design
System Design
3.1 Data Flow Diagrams (Level 0, Level 1)
3.2 Activity Diagram
3.3 Flow Chart
3.4 Class Diagram

3.5 ER Diagram
3.6 Sequence diagram
3.7 Use-Case Diagram
Chapter 4 Implementation, Testing, and Maintenance
4.1 Introduction to Languages, IDEs, Tools and Technologies used
for Implementation
4.2 Testing Techniques and Test Plans (According to project)
4.3 Installation Instructions
4.4 End User Instructions
Chapter 5 Results and Discussions
5.1 User Interface Representation (of Respective Project)
5.2 Brief Description of Various Modules of the system
5.3 Snapshots of system with brief detail of each
Chapter 6 Summary and Conclusions
Chapter 7 Future scope
Appendix
Bibliography

MEDI-CAPS UNIVERSITY, INDORE

List of Figures

S.No. Figure Name Fig no.


1 Data Flow Diagrams Level 0 Fig – 3.1.1
2 Data Flow Diagrams Level 1 Fig – 3.1.2
3 Activity Diagram Fig – 3.2.1
4 Flow Chart Fig – 3.3.1
5 Class Diagram Fig – 3.4.1
6 ER Diagram Fig – 3.5.1
7 Sequence diagram Fig – 3.6.1
8 Use Case Diagram Fig – 3.7.1
9 Top Statistics Fig – 5.3.1
10 Monthly Timeline Fig – 5.3.2
11 Daily Timeline Fig – 5.3.3
12 Activity Map Fig – 5.3.4
13 Most Busy User Fig – 5.3.5
14 Weekly Activity Map Fig – 5.3.6
15 Word Cloud Fig – 5.3.7
16 Most Common Words Fig – 5.3.8
17 Emoji Analysis Fig – 5.3.9

COMPUTER SCIENCE AND ENGINEERING



Abbreviations

The following abbreviations are used in the context of the WhatsApp Chat Analyzer:

1. NLP: Natural Language Processing. This refers to the field of artificial intelligence that
focuses on the interaction between computers and humans through natural language.

2. EDA: Exploratory Data Analysis. This is an approach to analyzing data sets to summarize
their main characteristics, often using visual methods.

3. CSV: Comma-Separated Values. This is a file format used to store tabular data, where each
line corresponds to a row, and the values within each line are separated by commas.

4. HTML: Hypertext Markup Language. This is the standard markup language used for creating
web pages and web applications.

5. API: Application Programming Interface. This is a set of rules and protocols that allows
different software applications to communicate with each other.

6. GUI: Graphical User Interface. This is a type of user interface that allows users to interact
with electronic devices using graphical icons and visual indicators.

7. ML: Machine Learning. This is a branch of artificial intelligence that enables computers to
learn from data and improve their performance on specific tasks without being explicitly
programmed.

8. DL: Deep Learning. This is a subset of machine learning that uses neural networks with many
layers (deep neural networks) to learn from large amounts of data.


Chapter-1

Introduction

1.1 Introduction

In this report, I propose the development of a WhatsApp Chat Analyzer, aimed at extracting
insights from the rich variety of communication found within WhatsApp chats, whether in group
settings or personal conversations. WhatsApp chats are treasure troves of diverse topics and
discussions, offering valuable data for analysis and exploration, particularly for technologies like
machine learning. The efficacy of machine learning models heavily relies on the quality and
quantity of the data they are trained on. Therefore, having access to a robust dataset from
WhatsApp chats can significantly enhance the learning experience of these models. This
proposed application aims to bridge the gap between raw chat data and actionable insights by
providing comprehensive analysis tools. Notably, the advantage of this application lies in its
simplicity and accessibility, as it is built using widely-used Python libraries such as Seaborn,
Pandas, NumPy, Streamlit, and Matplotlib. These libraries offer powerful capabilities for data
manipulation, visualization, and analysis, allowing users to effortlessly create data frames,
generate various types of graphs, and derive meaningful insights from their WhatsApp
conversations. By harnessing the potential of these libraries, the WhatsApp Chat Analyzer
promises to be a user-friendly and efficient tool for unlocking the hidden value within WhatsApp
chat data.

1.2 Literature Review

Existing System

Previously, there was no ready-made way to analyze a WhatsApp chat. The WhatsApp
application does not provide a structured file such as a CSV; it only exports a raw .txt file,
which is complicated to analyze directly. The WhatsApp Chat Analyzer is proposed to
replace this manual approach.

Disadvantages of Existing System

1 Raw data.
2 Time consuming.
3 Difficult to Analyze.
4 Analysis is not accurate.

Proposed System

The “WhatsApp Chat Analyzer” provides a platform that enables users to analyze
WhatsApp chats online through a Heroku link. The application allows users to browse for a
WhatsApp-exported (.txt) file, import it into the analyzer, and get an analysis based on that
file. The user starts the analysis by clicking the Show Analysis button.

Advantages of WhatsApp Chat Analyzer.

• Runs on all devices.
• Shows analysis based on the uploaded WhatsApp chat file.
• Shows different visualizations.
• Total messages.
• Total words.
• Media shared.
• Links shared.
• Monthly timeline.
• Most busy day.
• Most busy month.
• Weekly activity.
• Most busy users.
• Most used words.
• Emoji analysis.
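
As a rough illustration of how figures such as the most busy users, the most busy day, and the monthly timeline could be derived, the sketch below uses only the Python standard library. The `messages` sample and the `activity_summary` helper are hypothetical names introduced for this example; the project itself works with pandas data frames, so this is not the report's actual implementation.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical, already-parsed messages: (timestamp, user, text).
messages = [
    (datetime(2023, 5, 12, 21, 41), "Alice", "Hello there"),
    (datetime(2023, 5, 12, 21, 42), "Bob", "Hi!"),
    (datetime(2023, 5, 13, 8, 5), "Alice", "Good morning"),
    (datetime(2023, 6, 1, 9, 0), "Alice", "New month"),
]

def activity_summary(messages):
    """Count messages per user, per weekday, and per calendar month."""
    by_user = Counter(user for _, user, _ in messages)
    by_weekday = Counter(WEEKDAYS[ts.weekday()] for ts, _, _ in messages)
    by_month = Counter(f"{ts.year}-{ts.month:02d}" for ts, _, _ in messages)
    return {
        # Users ordered from most to least active.
        "busiest_users": by_user.most_common(),
        # Weekday with the highest message count (None for an empty chat).
        "busiest_day": by_weekday.most_common(1)[0][0] if by_weekday else None,
        # Chronological (month, count) pairs for a timeline plot.
        "monthly_timeline": sorted(by_month.items()),
    }

summary = activity_summary(messages)
```

The resulting dictionary could then be passed to a plotting library such as Matplotlib to draw the bar charts and timelines.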

1.3 Objective

Nearly everyone today uses WhatsApp for daily conversation. It has become one of the biggest business
engines, where many e-commerce businesses share product designs and product details,
accept orders, handle money transactions, and much more. However, WhatsApp
does not offer analytics support with which people or businesses can analyze their
monthly or daily activities to get an idea of where they are lacking, customer demand,
sales, marketing, the activeness of group members, and so on.

To address this gap, we aim to develop a complete interface where
users can upload a WhatsApp chat in text format after exporting it from WhatsApp. It will
provide users with two options to study the chats. On submitting the chat, the engine will display
the complete report with interactive graphs, which is easy to understand. The user can get an
in-depth idea of how the business over WhatsApp is performing. The report will
include the following analyses of the chat:

• Total number of messages
• Total words
• Number of media and links shared
• Monthly and daily timeline: chat activity on a daily basis and on a monthly basis
• Most busy day and month: which day of the week is the most active, and which month of
the year includes the most conversations
• Weekly activity map
• Most busy users
• Top and common words in conversation
• Emoji analysis
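
The headline figures in such a report (total messages, total words, media shared, links shared, and common words) could be computed along the following lines. This is a minimal sketch under stated assumptions: the `<Media omitted>` placeholder text, the tiny stop-word list, and the helper names `top_statistics` and `common_words` are illustrative choices, not taken from the project code.

```python
import re
from collections import Counter

MEDIA_PLACEHOLDER = "<Media omitted>"        # assumed placeholder for media
URL_RE = re.compile(r"https?://\S+")          # simple link detector
STOP_WORDS = {"the", "a", "is", "to", "and"}  # tiny illustrative list

def top_statistics(texts):
    """Return message, word, media, and link counts for message texts."""
    real = [t for t in texts if t.strip() != MEDIA_PLACEHOLDER]
    return {
        "messages": len(texts),
        "words": sum(len(t.split()) for t in real),
        "media": len(texts) - len(real),
        "links": sum(len(URL_RE.findall(t)) for t in real),
    }

def common_words(texts, n=3):
    """Return the n most frequent words, ignoring links and stop words."""
    counter = Counter()
    for text in texts:
        if text.strip() == MEDIA_PLACEHOLDER:
            continue
        cleaned = URL_RE.sub(" ", text.lower())
        for word in re.findall(r"[a-z']+", cleaned):
            if word not in STOP_WORDS:
                counter[word] += 1
    return counter.most_common(n)

# Hypothetical message texts for demonstration.
texts = [
    "Order is ready to ship",
    "<Media omitted>",
    "Track the order at https://example.com/track",
    "Thanks, order received",
]
stats = top_statistics(texts)
top_word = common_words(texts, 1)
```

A word cloud or bar chart of the most common words can be built from the same counts.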

1.4 Significance

The WhatsApp Chat Analyzer project offers several benefits:

Insightful Data Analysis: Users gain valuable insights into their WhatsApp conversations,
including communication patterns, popular topics, and sentiment analysis, enabling them to
make informed decisions based on data-driven insights.

Enhanced Communication Strategies: By understanding communication dynamics better,
users can optimize their messaging strategies, improve engagement, and foster more effective
communication within their groups or personal chats.

Data-Driven Decision Making: The analyzer enables users to make data-driven decisions,
whether in personal matters or professional settings, by providing actionable insights derived
from chat data analysis.

Improved Productivity: Understanding communication patterns and identifying key topics can
lead to increased productivity, as users can prioritize discussions, address important issues
promptly, and streamline their interactions.

Customized Visualizations: The project offers customizable visualizations that allow users to
present their insights in a clear and visually appealing manner, facilitating communication and
collaboration with others.

User-Friendly Interface: The analyzer features a user-friendly interface, making it accessible to
a wide range of users, from individuals managing personal chats to businesses analyzing group
discussions.

Continuous Improvement: The project is built on the foundation of continuous improvement,
with regular updates and enhancements based on user feedback and evolving communication
trends, ensuring that it remains relevant and valuable over time.

1.5 Research Design

To design a research study for evaluating the effectiveness and usability of the WhatsApp Chat
Analyzer, we need to outline the research objectives, methodology, participants, data collection
methods, and analysis techniques.

1. Research Objectives:

• Evaluate the usability of the WhatsApp Chat Analyzer in analyzing WhatsApp
conversations.
• Assess the effectiveness of the analyzer in providing actionable insights from chat data.
• Identify potential areas for improvement and enhancement of the analyzer.

2. Methodology:

• Mixed-Methods Approach: Employ a mixed-methods research design, combining both
quantitative and qualitative methods for comprehensive evaluation.
• Quantitative Analysis: Use surveys and metrics to quantitatively measure user
satisfaction, efficiency, and effectiveness of the analyzer.
• Qualitative Analysis: Conduct interviews and user testing sessions to gather qualitative
feedback, insights, and suggestions for improvement.


3. Participants:

• Target Audience: Users who frequently use WhatsApp for personal or professional
communication, including individuals, businesses, and researchers.
• Sample Size: Aim for a diverse sample comprising both novice and experienced users
of data analysis tools and WhatsApp.

4. Data Analysis Techniques:

• Quantitative Analysis: Analyze survey responses using statistical techniques such as
descriptive statistics, correlation analysis, and regression analysis to identify patterns and
trends.
• Qualitative Analysis: Conduct thematic analysis of interview transcripts and qualitative
data from user testing sessions to identify common themes, issues, and recommendations.
• Triangulation: Integrate findings from both quantitative and qualitative analyses to
provide a comprehensive understanding of user perceptions and experiences.

1.6 Source of Data

The source of data for the WhatsApp Chat Analyzer typically comes from the chat logs or
message archives exported from the WhatsApp application. WhatsApp provides users with the
functionality to export their chat history, either for individual chats or group chats, in the form of
text files. These exported chat files usually contain a record of all messages exchanged within the
chat, along with timestamps and other metadata.

Exported Chat Logs: Users can export their chat logs directly from the WhatsApp application.
This feature allows users to save their chat history locally on their device or to cloud storage
services. The exported chat logs are typically in text format and contain a chronological record of
all messages exchanged within the chat.
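
To make the structure of an exported chat concrete, the sketch below parses lines of the assumed form `12/05/2023, 21:41 - Alice: Hello there` into (timestamp, user, message) records using the `re` module. The exact layout of a real export varies with the device, locale, and clock settings, so the pattern, the day/month order, and the `group_notification` label for system lines are assumptions made for illustration only.

```python
import re
from datetime import datetime

# Assumed header layout: "DD/MM/YYYY, HH:MM - rest of line".
HEADER_RE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{4}), (\d{1,2}:\d{2}) - (.*)$")

def parse_chat(text):
    """Parse exported chat text into (timestamp, user, message) tuples."""
    records = []
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            date_str, time_str, body = match.groups()
            stamp = datetime.strptime(f"{date_str} {time_str}",
                                      "%d/%m/%Y %H:%M")
            if ": " in body:
                user, message = body.split(": ", 1)
            else:
                # System lines (group created, member added) have no sender.
                user, message = "group_notification", body
            records.append([stamp, user, message])
        elif records:
            # A line without a header continues the previous message.
            records[-1][2] += "\n" + line
    return [tuple(record) for record in records]

# Hypothetical export fragment.
sample = (
    '12/05/2023, 21:40 - Alice created group "Team"\n'
    "12/05/2023, 21:41 - Alice: Hello there\n"
    "12/05/2023, 21:42 - Bob: Hi!\n"
    "How are you?\n"
)
messages = parse_chat(sample)
```

The resulting records can be loaded into a tabular structure (for example a pandas data frame) for further analysis.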


Chapter-2

Requirement Specification

2.1 User Characteristic

• Personal Users: Individuals using the tool for personal chat analysis.

Characteristics: Casual users, seeking a simple and intuitive interface.

• Business Users: Professionals or businesses analyzing customer feedback, team
communication, or marketing insights.

Characteristics: Goal-oriented, seeking actionable insights.

• Data Analysts: Professionals specializing in data analysis, using the tool for in-depth
analysis.

Characteristics: Analytical mindset, seeking detailed insights.

• Developers: Interested in extending or integrating the tool's functionality.

Characteristics: Tech-savvy, seeking customization options.

• Researchers: Academic or industry researchers studying communication patterns.

Characteristics: Research-oriented, seeking data for studies.

2.2 Functional Requirements

Functional requirements are product features or functions that developers must implement to
enable users to accomplish their tasks. They describe system behavior under specific conditions
and define precisely what the software must do and how the system must respond to
inputs. Functional requirements may involve calculations, technical requirements, or basic
facilities that the end user specifically expects the system to
offer. Functional requirements define the software's goals, meaning that the software will not
serve its purpose if these requirements are not met. It is essential to make functional requirements
clear for both the development team and stakeholders.

Functional requirements for a WhatsApp chat analyzer:

Import Chat Data: The system should be able to import chat data from WhatsApp, including
text messages, media files, and other attachments.

User Authentication: Users should be able to authenticate themselves to access their chat
data securely.

Chat Analysis: The system should analyze the chat data to extract meaningful insights such
as message frequency, word usage, sentiment analysis, and media usage.

Keyword Search: Users should be able to search for specific keywords or phrases within the
chat data.

User Insights: The system should provide insights into individual user behavior, such as
message activity, response times, and interaction patterns.

Group Chat Analysis: The system should be able to analyze group chats.
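
The keyword search requirement, optionally narrowed to a single participant, could look roughly like the sketch below. The `search_messages` helper and the sample data are hypothetical and serve only to show the idea of a case-insensitive substring match over (sender, text) pairs.

```python
def search_messages(messages, keyword, user=None):
    """Return (sender, text) pairs whose text contains the keyword.

    The match is case-insensitive. If `user` is given, only that
    sender's messages are considered; otherwise the whole chat is searched.
    """
    needle = keyword.casefold()
    return [
        (sender, text)
        for sender, text in messages
        if needle in text.casefold() and (user is None or sender == user)
    ]

# Hypothetical chat data for demonstration.
messages = [
    ("Alice", "Is the invoice ready?"),
    ("Bob", "Invoice sent this morning"),
    ("Alice", "Thanks!"),
]
hits = search_messages(messages, "invoice")
bob_hits = search_messages(messages, "invoice", user="Bob")
```

Searching the whole chat versus a single user mirrors the distinction between group-level and individual-level analysis.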

2.3 Dependencies

The dependencies for the WhatsApp Chat Analyzer project typically include both software
libraries and frameworks needed for data analysis, natural language processing (NLP), and web
development. Here are the main dependencies:

1. Python: The core programming language for the project.

2. Pandas: A powerful data manipulation and analysis library in Python, used for handling chat
data.

3. NumPy: A fundamental package for scientific computing with Python, often used for
numerical operations and array manipulation.

4. Matplotlib: A popular plotting library for creating static, interactive, and animated
visualizations.

5. Seaborn: A statistical data visualization library built on top of Matplotlib, offering enhanced
visualizations and statistical graphics.

6. Regular Expressions (re): Python's built-in module for working with regular expressions,
used for pattern matching within text data.

7. Streamlit: A Python library for creating interactive web applications for data science and
machine learning projects. It provides a user-friendly interface for visualizing and analyzing
data.

These dependencies enable the functionality of the WhatsApp Chat Analyzer, allowing users to
import chat data, analyze it using machine learning and NLP techniques, visualize insights, and
interact with the results through a web-based interface.

2.4 Performance Requirements

Response Time: The system should respond to user queries and requests within a reasonable
timeframe, such as 1-2 seconds for basic operations and 3-5 seconds for more complex
operations.

Throughput: The system should be able to handle a certain number of concurrent users or
requests, such as 1000 requests per minute, without significant degradation in performance.

Scalability: The system should be scalable to accommodate an increasing amount of chat data
and users over time without compromising performance.

Resource Utilization: The system should efficiently utilize hardware resources, such as CPU,
memory, and storage, to ensure optimal performance.

Availability: The system should be available for use during specified hours of operation, such as
99% uptime during business hours.

2.5 Hardware Requirements

The hardware platform requirements for a WhatsApp Chat Analyzer are relatively modest, as the
primary computational tasks involve data processing, analysis, and visualization rather than
intensive computation. Here are the hardware components typically required:

Computer: A standard desktop or laptop computer is sufficient for running the WhatsApp Chat
Analyzer. The computer should have adequate processing power and memory to handle data
processing tasks efficiently. Most modern computers, including those with mid-range
specifications, should suffice.

Processor (CPU): A multi-core processor with decent clock speeds is recommended to ensure
smooth execution of data analysis tasks. Processors from Intel Core i5 or AMD Ryzen series and
above are suitable for most analytical tasks.

Memory (RAM): Sufficient RAM is essential for handling large datasets and performing data
manipulations. A minimum of 8GB of RAM is recommended, although 16GB or more would
provide better performance, especially for handling larger chat datasets.

Storage: Adequate storage space is required for storing chat data, analysis results, and any
associated files. A solid-state drive (SSD) is preferred over a traditional hard disk drive (HDD)
for faster data access speeds and improved overall system performance.

Operating System: The WhatsApp Chat Analyzer can run on various operating systems,
including Windows, macOS, and Linux. Users should choose the operating system that they are
most comfortable with and that best suits their needs.

2.6 Constraints & Assumptions

Constraints and assumptions play a crucial role in defining the scope, limitations, and
functionalities of the WhatsApp Chat Analyzer. Here are some common constraints and
assumptions:

1. Data Format: The analyzer assumes that the WhatsApp chat data is provided in a specific
format, typically a text file containing messages with timestamps and metadata. Any deviation
from this format may result in errors or inaccuracies in the analysis.

2. Language Support: The analyzer may have limitations in analyzing chat data in languages
other than the ones it supports. It assumes that the majority of the chat data is in a language
supported by the natural language processing (NLP) tools and libraries used in the analysis.

3. Message Structure: The analyzer assumes a standard message structure within the chat data,
including sender information, message content, and timestamps. Non-standard message formats
or multimedia messages may not be fully supported.

4. Accuracy of Analysis: The accuracy of the analysis depends on the quality and completeness
of the chat data provided. The analyzer may not be able to provide accurate insights if the chat
data is incomplete, inconsistent, or contains errors.
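
Because the analysis depends on the input matching the expected layout, a simple pre-check could warn the user before any statistics are computed. The sketch below is one possible approach, not the project's own validation: the header pattern, the `<Media omitted>` placeholder, and the `min_ratio` threshold are all assumptions chosen for illustration.

```python
import re

# Assumed header layout: "DD/MM/YYYY, HH:MM - ".
HEADER_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{4}, \d{1,2}:\d{2} - ")
MEDIA_PLACEHOLDER = "<Media omitted>"  # assumed placeholder for media

def check_export(text, min_ratio=0.5):
    """Report how closely the text resembles an exported chat.

    `min_ratio` is a hypothetical threshold: the share of non-empty
    lines that must start with a timestamp header.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    headers = sum(1 for ln in lines if HEADER_RE.match(ln))
    media = sum(1 for ln in lines if ln.endswith(MEDIA_PLACEHOLDER))
    ratio = headers / len(lines) if lines else 0.0
    return {
        "lines": len(lines),
        "headers": headers,
        "media_placeholders": media,
        "looks_valid": ratio >= min_ratio,
    }

good = check_export(
    "12/05/2023, 21:41 - Alice: Hello\n"
    "12/05/2023, 21:42 - Bob: <Media omitted>\n"
    "12/05/2023, 21:43 - Bob: Hi!\n"
    "How are you?"
)
bad = check_export("Meeting notes\nNo timestamps here")
```

A file that fails the check can be rejected with a clear message instead of producing misleading charts.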

Chapter-3

Design

System Design

3.1 Data Flow Diagrams (Level 0, Level 1)

A data flow diagram (DFD) maps out the flow of information for any process or system. It uses
defined symbols like rectangles, circles and arrows, plus short text labels, to show data inputs,
outputs, storage points and the routes between each destination. Data flowcharts can range from
simple, even hand-drawn process overviews, to in-depth, multi-level DFDs that dig progressively
deeper into how the data is handled. They can be used to analyze an existing system or model a
new one. Like all the best diagrams and charts, a DFD can often visually “say” things that would
be hard to explain in words, and they work for both technical and nontechnical audiences, from
developer to CEO. That’s why DFDs remain so popular after all these years. While they work
well for data flow software and systems, they are less applicable nowadays to visualizing
interactive, real-time or database-oriented software or systems.

Fig 3.1.1 Level 0 DFD


Fig 3.1.2 Level 1 DFD

3.2 Activity Diagram

An activity diagram is another important behavioral diagram in UML, used to describe the dynamic
aspects of a system. It is essentially an advanced version of a flow chart that
models the flow from one activity to another. Activity diagrams describe how
activities are coordinated to provide a service, which can be at different levels of abstraction.
Typically, an event needs to be achieved by some operations, particularly where the operation is
intended to achieve a number of different things that require coordination. They also show how the
events in a single use case relate to one another, in particular in use cases where activities may
overlap and require coordination. Activity diagrams are also suitable for modeling how a collection
of use cases coordinate to represent business workflows.


Fig 3.2.1 Activity Diagram

3.3 Flow Chart

A flowchart is a diagram that depicts a process, system or computer algorithm. They are widely
used in multiple fields to document, study, plan, improve and communicate often complex
processes in clear, easy-to-understand diagrams. Flowcharts, sometimes spelled as flow charts,
use rectangles, ovals, diamonds and potentially numerous other shapes to define the type of step,
along with connecting arrows to define flow and sequence. They can range from simple, hand-
drawn charts to comprehensive computer-drawn diagrams depicting multiple steps and routes. If
we consider all the various forms of flowcharts, they are one of the most common diagrams on
the planet, used by both technical and non-technical people in numerous fields.


Fig 3.3.1 Flowchart

3.4 Class Diagram

A class diagram in the Unified Modeling Language (UML) is a type of static structure
diagram that describes the structure of a system by showing the system's classes, their attributes,
operations (or methods), and the relationships among objects.

A class is a description of a group of objects all with similar roles in the system, which consists
of:

Structural features (attributes) define what objects of the class "know"

• Represent the state of an object of the class

• Are descriptions of the structural or static features of a class

Behavioral features (operations) define what objects of the class "can do"

• Define the way in which objects may interact



• Operations are descriptions of behavioral or dynamic features of a class

Fig 3.4.1 Class Diagram

3.5 ER Diagram

The Entity-Relationship (ER) model is a model for identifying the entities to be represented in the
database and representing how those entities are related. The ER data model specifies an enterprise
schema that represents the overall logical structure of a database graphically. The Entity
Relationship Diagram explains the relationships among the entities present in the database. ER
models are used to model real-world objects like a person, a car, or a company and the relations
between these real-world objects. In short, the ER Diagram is the structural format of the
database.


Fig 3.5.1 ER Diagram

3.6 Sequence Diagram

Sequence diagrams are interaction diagrams that detail how operations are carried out. They
capture the interaction between objects in the context of a collaboration. Sequence diagrams are
time-focused: they show the order of the interaction visually by using the vertical axis of the
diagram to represent time, that is, which messages are sent and when.

Sequence diagrams capture:

• The interaction that takes place in a collaboration that either realizes a use case or an
operation (instance diagrams or generic diagrams)

• High-level interactions between the user of the system and the system, between the system
and other systems, or between subsystems (sometimes known as system sequence diagrams)

Fig 3.6.1 Sequence Diagram

3.7 Use Case Diagram

In the Unified Modeling Language (UML), a use case diagram can summarize the details of your
system's users (also known as actors) and their interactions with the system. To build one, you'll
use a set of specialized symbols and connectors. An effective use case diagram can help your
team discuss and represent:

• Scenarios in which your system or application interacts with people, organizations, or
external systems

• Goals that your system or application helps those entities (known as actors) achieve

• The scope of your system


Fig 3.7.1 Use-Case Diagram


Chapter-4

Implementation, Testing, and Maintenance

4.1 Introduction to Languages, IDEs, Tools and Technologies used for Implementation

IDEs

Jupyter Notebook: Jupyter Notebook serves as a versatile and interactive platform for
developing and deploying WhatsApp chat analyzers. Its flexibility and integration with various
data science libraries make it an ideal environment for conducting analysis on WhatsApp chat
data. Users can write and execute Python code within Jupyter Notebook cells, allowing for
seamless integration of data processing, analysis, and visualization tasks. The platform's rich text
capabilities enable users to document their analysis process, including explanations,
observations, and insights, alongside code cells. Additionally, Jupyter Notebook supports the
visualization of results using libraries such as Matplotlib, Seaborn, and Plotly, providing users
with interactive and visually appealing representations of their chat data. With its ease of use and
collaborative features, Jupyter Notebook facilitates the development of robust WhatsApp chat
analyzers, empowering users to gain valuable insights from their conversations.

PyCharm: PyCharm is a powerful and feature-rich IDE developed by JetBrains specifically for
Python development. It offers a wide range of tools and features to support developers in writing,
debugging, and managing Python code effectively. PyCharm provides advanced code editing
capabilities such as syntax highlighting, code completion, and code refactoring, which help
improve productivity and code quality. It also includes powerful debugging tools, version control
integration, and support for various frameworks and libraries commonly used in Python
development. PyCharm is available in both free and paid versions, with the paid version offering
additional features such as web development support and database tools.

Python IDLE: Python IDLE (Integrated Development and Learning Environment) is a simple
and lightweight IDE that comes bundled with the Python programming language. It provides
basic features for writing and running Python code, making it suitable for beginners and users
who prefer a minimalistic development environment. Python IDLE includes a Python shell,
which allows users to execute Python code interactively and see the results immediately. It also
offers basic code editing features such as syntax highlighting and indentation support. While
Python IDLE lacks some of the advanced features and tools found in other IDEs like PyCharm, it
serves as a convenient and easy-to-use option for writing and experimenting with Python code.

Tools and Technologies

The WhatsApp Chat Analyzer utilizes a combination of tools and technologies to perform
various tasks, including data processing, analysis, visualization, and presentation. Here are the
key tools commonly used in building such an analyzer:

Python Programming Language: Python is widely used for its simplicity, versatility, and
extensive libraries for data analysis and natural language processing (NLP).

Jupyter Notebook: Jupyter Notebook provides an interactive environment for writing and
executing Python code, facilitating exploratory data analysis and documentation.

Pandas: Pandas is a powerful Python library used for data manipulation and analysis,
particularly for handling structured data such as chat logs.

NumPy: NumPy is a fundamental library for scientific computing in Python, providing support
for large, multi-dimensional arrays and matrices.

Matplotlib: Matplotlib is a popular Python library for creating static, interactive, and animated
visualizations, including line plots, bar charts, histograms, and scatter plots.

Seaborn: Seaborn is a statistical data visualization library built on top of Matplotlib, offering
higher-level functions for creating attractive and informative statistical graphics.

Regular Expressions (re): Regular expressions are used for pattern matching within text data,
facilitating tasks such as message extraction, cleaning, and parsing.

Streamlit: Streamlit is a Python library for creating interactive web applications for data science
and machine learning projects, providing a user-friendly interface for visualizing and analyzing
data.

Scikit-learn: Scikit-learn is a machine learning library in Python, offering simple and efficient
tools for data mining and data analysis, including clustering, classification, regression, and
dimensionality reduction algorithms.

WordCloud: WordCloud is a Python library for creating word clouds, a visual representation of
text data where the size of each word indicates its frequency or importance within the dataset.

By leveraging these tools and libraries, developers can build a robust and feature-rich WhatsApp
Chat Analyzer capable of extracting valuable insights from chat data.
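
As an illustration of how regular expressions can turn an exported chat into structured records, the following is a minimal sketch, not the project's actual parser. It assumes the Android export layout "DD/MM/YYYY, HH:MM - Sender: message" with a 24-hour clock; other locales and iOS exports use different layouts. The names LINE_RE and parse_chat, and the label "group_notification" for system notices, are hypothetical.

```python
import re
from datetime import datetime

# Assumed export format (Android, 24-hour clock):
#   "DD/MM/YYYY, HH:MM - Sender: message"
# Other locales and iOS exports use different layouts.
LINE_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{4}), (\d{1,2}:\d{2}) - (?:(.+?): )?(.*)$"
)


def parse_chat(text):
    """Return a list of dicts with keys: timestamp, user, message."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            date, time, user, message = match.groups()
            records.append({
                "timestamp": datetime.strptime(f"{date} {time}", "%d/%m/%Y %H:%M"),
                # Lines without a "Sender: " part are system notices (joins, leaves, ...).
                "user": user or "group_notification",
                "message": message,
            })
        elif records:
            # No timestamp prefix: continuation of the previous multi-line message.
            records[-1]["message"] += "\n" + line
    return records
```

The resulting list of dictionaries can be passed to pandas.DataFrame(records) when the Pandas library is used for the later analysis steps.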

4.2 Testing Techniques and Test Plans

Testing is essential to ensure the functionality, reliability, and usability of the WhatsApp Chat
Analyzer. The testing techniques used for the analyzer are outlined below:

1. Unit Testing:

• Objective: To test individual components or units of the analyzer, such as functions,
methods, or modules, in isolation.
• Techniques: Use Python testing frameworks like unittest or pytest to write and execute
unit tests for each function or method.
• Examples: Test functions for data parsing, message analysis, and visualization
generation.

2. Integration Testing:

• Objective: To test the integration and interaction between different components or
modules of the analyzer.
• Techniques: Create test cases to simulate various scenarios where components interact,
ensuring seamless integration and data flow.
• Examples: Test data processing pipelines, integration between NLP modules and data
visualization components.

3. Functional Testing:


• Objective: To verify that the analyzer meets its functional requirements and performs as
expected.
• Techniques: Develop test cases based on functional requirements and user stories, covering
all key functionalities of the analyzer.
• Examples: Test importing chat data, analyzing messages for sentiment analysis, generating
visualizations, exporting analysis results.

4. User Interface (UI) Testing:

• Objective: To evaluate the usability, responsiveness, and user experience of the analyzer's
web interface.
• Techniques: Conduct manual and automated tests to assess UI elements, navigation, input
validation, and error handling.
• Examples: Test user interactions such as importing chat files, selecting analysis options,
interacting with visualizations.

5. Performance Testing:

• Objective: To assess the performance, scalability, and resource usage of the analyzer
under various loads and conditions.
• Techniques: Use load testing tools to simulate multiple users accessing the analyzer
concurrently and measure response times, memory usage, and CPU utilization.
• Examples: Test the analyzer's performance when analyzing large chat datasets, generating
complex visualizations, and handling multiple users simultaneously.
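
The unit-testing technique listed above can be sketched as follows. This is only an illustration: count_words and extract_links are hypothetical helper functions, not the analyzer's actual code, and the tests are written as plain pytest-style functions that use bare assert statements.

```python
import re


def count_words(message):
    """Hypothetical helper: number of whitespace-separated words in a message."""
    return len(message.split())


def extract_links(message):
    """Hypothetical helper: list of http(s) URLs found in a message."""
    return re.findall(r"https?://\S+", message)


def test_count_words():
    # Normal message and the empty-message edge case.
    assert count_words("hello there friend") == 3
    assert count_words("") == 0


def test_extract_links():
    # One link embedded in text, and a message without links.
    assert extract_links("see https://example.com now") == ["https://example.com"]
    assert extract_links("no links here") == []
```

If the code is saved in a file whose name starts with test_, running the pytest command in that directory discovers and executes both test functions.
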
4.3 Installation Instructions

The WhatsApp Chat Analyzer is a Python-based application. The general steps for installing and
setting up the analyzer are as follows:

1. Python Installation:

- Ensure that Python is installed on your system. You can download and install Python from
the official website: https://www.python.org/downloads/

2. Clone the Repository:

- If the WhatsApp Chat Analyzer is available as a Git repository, clone it to your local
machine:

git clone <repository_url>

3. Install Dependencies:

- Navigate to the project directory and install the required dependencies using pip:

cd whatsapp_chat_analyzer

pip install -r requirements.txt

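4. Run the Analyzer:

- Once the dependencies are installed, start the WhatsApp Chat Analyzer. Assuming the web
interface is built with Streamlit (see Section 4.1), the script is launched through the Streamlit
command-line tool:

streamlit run whatsapp_chat_analyzer.py

5. Access the Web Interface:

- Streamlit normally opens the application in the default web browser. If it does not, open a web
browser and navigate to the URL printed in the terminal (by default, http://localhost:8501) to
access the analyzer.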

6. Import WhatsApp Chat Data:

- Follow the instructions provided by the analyzer to import your WhatsApp chat data. This
typically involves uploading the chat file exported from the WhatsApp application.

7. Analyze Chat Data:

- After importing the chat data, explore the analysis features provided by the WhatsApp Chat
Analyzer, such as visualizations, sentiment analysis, and keyword extraction.
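
After the dependencies are installed, a short script can confirm that they are importable before the analyzer is started. The sketch below is an assumption-based example: the list REQUIRED is derived from the libraries named in Section 4.1 (scikit-learn is imported as sklearn), and the project's actual requirements.txt may differ.

```python
from importlib.util import find_spec

# Import names assumed from the libraries listed in Section 4.1;
# the project's actual requirements.txt may differ.
REQUIRED = [
    "pandas", "numpy", "matplotlib", "seaborn",
    "streamlit", "sklearn", "wordcloud",
]


def missing_packages(names):
    """Return the subset of top-level module names that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any package reported as missing can then be installed with pip before running the analyzer.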

4.4 End User Instructions

Below are end-user instructions for using the WhatsApp Chat Analyzer:

1. Export WhatsApp Chat:

Begin by exporting the chat data from your WhatsApp application. Open the chat you want to
analyze, tap on the three dots (menu), select "More", then "Export chat". Choose whether to
include media files and select a destination to save the chat file. The exported file typically has
a .txt extension.

2. Open the Website:

Open your web browser and navigate to the website hosting the WhatsApp Chat Analyzer.

3. Upload WhatsApp Chat File:

On the website, locate the option to upload the WhatsApp chat file. This is usually a button
labeled "Upload" or "Choose File". Click on it to open a file dialog, then select the exported chat
file from your device.

4. Initiate Analysis:

After selecting the chat file, there should be a button labeled "Show Analysis" or similar. Click
on this button to initiate the analysis process. The analyzer will start processing the uploaded
chat file to extract insights and generate visualizations.

5. View Analysis:

Once the analysis is complete, the analyzer will display the results on the web interface. This
may include various visualizations, statistics, and insights derived from the chat data. Review the
analysis to gain insights into the chat's content, activity, and patterns.

6. Individual Analysis:

If you wish to analyze individual users' contributions within the chat, look for an option to select
users. This could be a dropdown menu or checkboxes next to user names. Choose the user(s) you
want to analyze, then click on the "Show Analysis" button again. The analyzer will generate
specific insights and visualizations for the selected user(s).

7. Explore Insights:

Explore the insights provided by the analyzer, such as message frequency, word clouds,
sentiment analysis, and more. Use the visualizations and statistics to understand communication
patterns, popular topics, and user activity within the chat.


Chapter-5

Results and Discussions

5.1 User Interface Representation

User Interfaces

The web-based user interface for the WhatsApp chat analyzer will be intuitive and user-friendly,
accessible on various devices. It will feature a dashboard for an overview of analysis results,
easy navigation for different features, and a simple data import interface. Users can customize
analysis settings and view results through interactive visualizations. The interface will also
support data export and be responsive to ensure a consistent experience across devices.

Hardware Interfaces

Hard Disk: Greater than 500 GB

RAM: Greater than 4 GB

Processor: Intel Core i3 or above

Software Interfaces

WhatsApp API: Interface with the WhatsApp API to extract chat data from WhatsApp chat logs.

Language – Python 3.8

Data Analysis Libraries: Interface with data analysis libraries such as pandas, numpy, and
matplotlib for performing analysis on the chat data.

Communications Interfaces

WhatsApp API Integration: The WhatsApp chat analyzer will establish seamless
communication with WhatsApp's APIs, enabling direct access to chat data while adhering to
WhatsApp's data privacy and security protocols.

Data Retrieval: Through the WhatsApp APIs, the analyzer will retrieve chat logs and relevant
metadata, ensuring comprehensive access to messaging data for analysis and visualization within
the tool's interface.

Real-time Updates: Leveraging WhatsApp APIs, the analyzer will support real-time updates,
allowing users to monitor and analyze ongoing conversations dynamically, ensuring the most
up-to-date insights are available for analysis and visualization.

5.2 Brief Description of Various Modules of the system

The WhatsApp Chat Analyzer provides several modules designed to give comprehensive
insights into WhatsApp conversations. The key modules of the system are described below:

Top Statistics: This feature presents top-level statistics summarizing various aspects of the
WhatsApp chat. It may include metrics such as total messages sent, average message length,
most active participants, and overall engagement levels.
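
A minimal sketch of how such top-level statistics could be computed is given here. It assumes the chat has already been parsed into (user, text) pairs; the function name top_statistics and the choice of measuring message length in words are illustrative assumptions, not the project's confirmed implementation.

```python
from collections import Counter


def top_statistics(messages):
    """Summarize a chat given as a list of (user, text) tuples."""
    total = len(messages)
    words = sum(len(text.split()) for _, text in messages)
    per_user = Counter(user for user, _ in messages)
    return {
        "total_messages": total,
        "total_words": words,
        # Average message length measured in words (assumption).
        "avg_words_per_message": words / total if total else 0.0,
        "most_active": per_user.most_common(1)[0][0] if per_user else None,
    }
```

For example, calling top_statistics on three messages, two of them from the same sender, reports a total of 3 messages and names that sender as the most active participant.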

Monthly Timeline: The monthly timeline feature visualizes message activity over time, showing
how communication patterns fluctuate month by month. It helps users understand trends, identify
busy periods, and track changes in conversation dynamics over time.
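
The monthly aggregation can be sketched as follows, assuming each message already carries a datetime timestamp. The function name monthly_timeline and the "YYYY-MM" label format are hypothetical choices made for this example.

```python
from collections import Counter
from datetime import datetime


def monthly_timeline(timestamps):
    """Count messages per calendar month, returned in chronological order."""
    counts = Counter((ts.year, ts.month) for ts in timestamps)
    return [
        (f"{year}-{month:02d}", n)
        for (year, month), n in sorted(counts.items())
    ]
```

The returned list of (label, count) pairs can be plotted directly as a line chart; grouping on ts.date() instead of (ts.year, ts.month) gives the corresponding per-day counts.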

Daily Timeline: Similar to the monthly timeline, the daily timeline feature displays message
activity on a daily basis. It provides a granular view of communication patterns, allowing users
to pinpoint specific days of high or low activity and analyze daily trends.

Activity Map: The activity map feature maps the geographical locations associated with
message senders, providing a visual representation of where participants are located. This feature
can be particularly useful for group chats with members from different regions or countries.

Weekly Activity: This feature aggregates message activity on a weekly basis, offering insights
into weekly communication patterns. It helps users identify recurring trends, such as peak
activity days or quiet periods, and plan communication strategies accordingly.
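
One way to obtain such weekly patterns is to count messages per day of the week. The sketch below assumes datetime timestamps; weekly_activity and DAYS are hypothetical names, and days without messages are reported with a count of zero.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def weekly_activity(timestamps):
    """Return a dict mapping each weekday name to its message count."""
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    counts = Counter(ts.weekday() for ts in timestamps)
    return {day: counts.get(index, 0) for index, day in enumerate(DAYS)}
```

The dictionary preserves the Monday-to-Sunday order, so it can be passed to a bar chart without further sorting.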

Common Words: The common words feature analyzes the frequency of words used in the chat
and identifies the most commonly used words. This helps users understand the main topics of
conversation and the language patterns prevalent within the chat.
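
A word-frequency count of this kind can be sketched with the standard library. The example rests on stated assumptions: STOP_WORDS is a tiny illustrative list rather than a real stop-word file, and "<Media omitted>" is assumed to be the placeholder that English-language exports use for media messages.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "of", "in", "it", "i", "you"}

# Placeholder assumed to mark media messages in English-language exports.
MEDIA_PLACEHOLDER = "<Media omitted>"


def common_words(messages, top_n=20):
    """Return the top_n most frequent words as (word, count) pairs."""
    counter = Counter()
    for text in messages:
        if text == MEDIA_PLACEHOLDER:
            continue
        # [^\W\d_]+ matches runs of letters, including non-ASCII letters.
        for word in re.findall(r"[^\W\d_]+", text.lower()):
            if word not in STOP_WORDS:
                counter[word] += 1
    return counter.most_common(top_n)
```

The same (word, count) pairs can also serve as input for a word cloud, where the count determines the size of each word.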

Emoji Analysis: The emoji analysis feature examines the use of emojis in the chat and provides
insights into the emotional tone and sentiment of conversations. It may highlight frequently used
emojis, popular emoji combinations, and trends in emoji usage over time.
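
Emoji frequencies can be approximated without external libraries, as in the sketch below. It is a simplification under a stated assumption: a character is treated as an emoji when its code point falls in the ranges listed in EMOJI_RANGES, which covers the common pictographs but counts multi-code-point emojis (flags, skin-tone variants, joined sequences) by their components. A dedicated emoji library would be more accurate.

```python
from collections import Counter

# Simplified assumption: code points in these ranges are treated as emoji.
EMOJI_RANGES = [
    (0x1F300, 0x1FAFF),  # pictographs, emoticons, transport, supplemental symbols
    (0x2600, 0x27BF),    # miscellaneous symbols and dingbats
]


def is_emoji(char):
    """Return True if the character falls in one of the assumed emoji ranges."""
    code_point = ord(char)
    return any(low <= code_point <= high for low, high in EMOJI_RANGES)


def emoji_counts(messages):
    """Return (emoji, count) pairs sorted from most to least frequent."""
    counter = Counter(
        char for text in messages for char in text if is_emoji(char)
    )
    return counter.most_common()
```

The resulting pairs can be shown as a table or as a pie chart of the most frequently used emojis.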


5.3 Snapshots of system with brief detail of each


Top Statistics: This feature presents top-level statistics summarizing various aspects of the
WhatsApp chat. It may include metrics such as total messages sent, average message length,
most active participants, and overall engagement levels.

Fig 5.3.1

Monthly Timeline: The monthly timeline feature visualizes message activity over time,
showing how communication patterns fluctuate month by month. It helps users understand
trends, identify busy periods, and track changes in conversation dynamics over time.

Fig 5.3.2


Daily Timeline: Similar to the monthly timeline, the daily timeline feature displays message
activity on a daily basis. It provides a granular view of communication patterns, allowing users
to pinpoint specific days of high or low activity and analyze daily trends.

Fig 5.3.3

Activity Map: The activity map feature maps the geographical locations associated with
message senders, providing a visual representation of where participants are located. This feature
can be particularly useful for group chats with members from different regions or countries.

Fig 5.3.4

Most Busy User: The most busy user feature, part of the group analysis in the WhatsApp Chat
Analyzer, provides insights into the level of activity and participation of group members
within a WhatsApp group chat. This analysis identifies and highlights the user(s) who contribute
the most to the conversation based on metrics such as the number of messages sent, the
frequency of interactions, and the overall engagement level.

Fig 5.3.5
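
A ranking of the most active users can be sketched as follows, assuming the sender of every message is available as a list of names. The function name most_busy_users and the percentage column are illustrative assumptions rather than the analyzer's confirmed output.

```python
from collections import Counter


def most_busy_users(users, top_n=5):
    """Return (user, message_count, percent_of_all_messages) for the top senders."""
    counts = Counter(users)
    total = sum(counts.values())
    return [
        (user, n, round(100 * n / total, 2))
        for user, n in counts.most_common(top_n)
    ]
```

For a group in which one member sent three of six messages, the first row of the result shows that member with a count of 3 and a share of 50.0 percent.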

Weekly Activity: This feature aggregates message activity on a weekly basis, offering
insights into weekly communication patterns. It helps users identify recurring trends, such as
peak activity days or quiet periods, and plan communication strategies accordingly.

Fig 5.3.6


Word Cloud: A word cloud is a popular visualization technique used to represent textual data,
where the size of each word indicates its frequency or importance within the dataset. In the
context of the WhatsApp Chat Analyzer, a word cloud visually depicts the most commonly used
words or phrases within a chat conversation, providing insights into the topics, themes, and
sentiments expressed by the participants.

Fig 5.3.7

Common Words: The common words feature analyzes the frequency of words used in the
chat and identifies the most commonly used words. This helps users understand the main topics
of conversation and the language patterns prevalent within the chat.


Fig 5.3.8

Emoji Analysis: The emoji analysis feature examines the use of emojis in the chat and
provides insights into the emotional tone and sentiment of conversations. It may highlight
frequently used emojis, popular emoji combinations, and trends in emoji usage over time.

Fig 5.3.9


Chapter-6

Summary and Conclusions


The implementation of the WhatsApp Chat Analyzer has met its primary objective, as identified
during the initial requirement analysis phase. The system delivers dependable results through a
user-friendly interface that is accessible to users with different levels of computer proficiency.
Replacing manual inspection of chats with automated processing overcomes the limitations of
manual methods and improves the accuracy and efficiency of data analysis.

In conclusion, the WhatsApp Chat Analyzer stands as a testament to the efficacy of automated
systems in enhancing data analysis processes. Its array of features not only ensures reliable
results but also enhances user satisfaction. Moving forward, the analyzer sets a benchmark for
similar systems, showcasing the potential for automation to streamline tasks and improve
outcomes in various domains. As technology continues to evolve, such tools will undoubtedly
play a crucial role in facilitating efficient data analysis and decision-making processes.

The WhatsApp Chat Analyzer boasts several features that contribute to its effectiveness and user
satisfaction:

1. User-Friendly Interface: The system is designed with simplicity and intuitiveness in mind,
allowing users to navigate through menus and functionalities effortlessly. This user-friendly
approach ensures that even users with limited computer knowledge can operate the system with
ease.

2. Time-Saving: By automating the process of analyzing WhatsApp chat data, the analyzer
significantly reduces the time and effort required for manual analysis. Users can quickly upload
chat files and obtain insights without the need for manual data entry or analysis.

3. Cross-Platform Compatibility: The system is designed to run on any device with internet
access, including desktops, laptops, tablets, and smartphones. This flexibility enables users to
access the analyzer from their preferred devices, enhancing convenience and accessibility.

4. Compatibility with WhatsApp Files: The analyzer is capable of analyzing any WhatsApp
chat file imported by the user. Whether it's an individual chat or a group conversation, the system
can process the data efficiently and generate insights and visualizations.

5. Accuracy and Reliability: With rigorous testing and validation processes in place, the
analyzer ensures the accuracy and reliability of the analysis results. Users can trust the system to
provide precise insights into their WhatsApp conversations, helping them make informed
decisions based on reliable data.

6. Ease of Use: The system's intuitive interface and streamlined workflows make it easy for
users to perform complex analyses without requiring extensive training or technical expertise.
Users can upload chat files, initiate analysis, and interpret results effortlessly, enhancing overall
usability and satisfaction.

In summary, the WhatsApp Chat Analyzer's array of features underscores its commitment to
enhancing user experience and efficiency in WhatsApp chat data analysis. From its user-friendly
interface to its compatibility with various devices and chat file formats, the analyzer empowers
users to glean valuable insights from their conversations with ease and accuracy.


Chapter-7

Future scope


The future scope and further enhancement of the WhatsApp Chat Analyzer project are
promising, with numerous avenues for development. Advanced natural language processing
(NLP) techniques can be integrated to extract deeper insights from chat data, including sentiment
analysis at a more granular level and syntactic analysis for understanding message structures.
Implementing advanced topic modeling algorithms could help users uncover latent topics and
emerging trends within conversations. Real-time analysis capabilities would enable users to
monitor chat activity as it happens, providing immediate insights and alerts. Integration with
external data sources, such as social media or CRM systems, could offer a more comprehensive
view of communication across various channels. Collaborative features would facilitate sharing
and discussion of analysis results among users. Predictive analytics models could forecast future
trends in chat activity, aiding users in proactive communication management. Enhanced
visualization options, customization features, and optimization for scalability and performance
would further enhance the usability and effectiveness of the analyzer. Additionally, extending
support to analyze chat data from other messaging platforms would broaden its applicability and
utility. By embracing these future enhancements, the WhatsApp Chat Analyzer can continue to
evolve as a valuable tool for understanding and optimizing communication strategies in the
digital realm.

Appendix

• Integration of Advanced NLP Techniques: Future development can incorporate
advanced natural language processing (NLP) techniques to extract deeper insights from
chat data. This may include sentiment analysis at a more granular level and syntactic
analysis for understanding message structures.

• Implementation of Advanced Topic Modeling Algorithms: Introducing advanced
topic modeling algorithms could aid users in uncovering latent topics and emerging
trends within conversations, enhancing the depth of analysis provided by the tool.

• Real-time Analysis Capabilities: Enabling real-time analysis capabilities would allow
users to monitor chat activity as it occurs, providing immediate insights and alerts for
timely response and intervention.

• Integration with External Data Sources: Integration with external data sources such as
social media or CRM systems could provide a more comprehensive view of
communication across various channels, enriching the analytical capabilities of the tool.

• Collaborative Features: Adding collaborative features would enable users to share and
discuss analysis results among team members or stakeholders, fostering collaboration and
informed decision-making.

• Predictive Analytics Models: Incorporating predictive analytics models could empower
users to forecast future trends in chat activity, aiding in proactive communication
management and strategy development.

• Enhanced Visualization Options and Customization Features: Offering enhanced
visualization options and customization features would provide users with greater
flexibility in interpreting and presenting analysis results according to their specific needs
and preferences.

• Optimization for Scalability and Performance: Optimizing the tool for scalability and
performance would ensure its effectiveness even as the volume of data processed and the
user base grows over time.

• Support for Other Messaging Platforms: Extending support to analyze chat data from
other messaging platforms would broaden the applicability and utility of the tool, catering
to a wider range of user needs and preferences.


