SEMINAR REPORT

On
Multimodal NLP for sentimental analysis

BY
Kunal Jadhav
(TECO-B39)

Under The Guidance of

Mrs Shrinika Inamdar

DEPARTMENT OF COMPUTER ENGINEERING


Pimpri Chinchwad College of Engineering and
Research
Plot No. B, Sector No. 110, Gate No. 1, Laxminagar,
Ravet, Haveli, Pune - 412101
Department of Computer Engineering
Pimpri Chinchwad College of Engineering and Research

CERTIFICATE

This is to certify that Kunal Jadhav from Third Year Computer Engineering
has successfully completed the seminar work titled “Multimodal NLP
for sentimental analysis” at Pimpri Chinchwad College of Engineering
and Research, Ravet, in partial fulfillment of the bachelor’s degree in
engineering.

Mrs Shrinika Inamdar      Dr. Archana Chaugule      Dr. H. U. Tiwari
Seminar Guide             Head of Department        Principal

Abstract

Artificial Intelligence (AI) and Machine Learning (ML) have emerged as
indispensable tools in advancing the field of Multimodal Natural Language
Processing (NLP). The integration of AI and ML techniques has
revolutionized how machines interpret and comprehend diverse forms of human
communication, transcending the boundaries of traditional unimodal NLP.
In the domain of Multimodal NLP, AI and ML algorithms play a pivotal
role in processing and fusing information from various modalities such as
text, images, audio, and video. Through advanced neural network
architectures, including Convolutional Neural Networks (CNNs), Recurrent Neural
Networks (RNNs), and Transformer models, machines are equipped to learn
complex patterns and relationships between different modalities. This
enables them to decipher nuanced context, emotions, and intent embedded
within multimodal data.

Acknowledgments
It gives me great pleasure to present the seminar report on ‘Multimodal
NLP for sentimental analysis’.

I would like to take this opportunity to thank my internal guide, Mrs Shrinika
Inamdar, for giving me all the help and guidance I needed. I am really
grateful to her for her kind support. Her valuable suggestions were very
helpful.

I am also grateful to Dr. Archana Chaugule, Head of Computer Engineering
Department, PCCOE&R, for her indispensable support and suggestions.

Finally, my special thanks go to Dr. H. U. Tiwari for providing various
resources for the seminar, such as a laboratory with all the needed
software platforms and a continuous Internet connection.

Kunal Jadhav
(T.E. Computer Engg.)
Multimodal NLP for sentimental analysis

Contents

1 Introduction
  1.1 Seminar Idea
  1.2 Motivation of the Seminar
  1.3 Introduction Part

2 Literature Survey

3 Methodology/Proposed system
  3.1 Architecture

4 Results

5 Summary and Conclusion

6 Plagiarism Report

7 References

PCCOE&R, Department of Computer Engineering 2023 5



List of Figures

3.1 Multimodal-nlp


List of Tables

2.1 Literature Survey


Chapter 1

Introduction

1.1 Seminar Idea


• This seminar aims to give a clear idea of how new approaches to
multimodal natural language processing can accurately understand and
interpret context and emotion (sentiment analysis).

1.2 Motivation of the Seminar


• Enhanced Contextual Understanding: Combining textual and visual
information allows for a deeper contextual understanding of language.
Visual cues can provide additional context, helping AI models better
comprehend ambiguous text, idiomatic expressions, and references.

• Cross-Modal Data Fusion: By integrating different modalities, AI
models can take advantage of complementary information, improving the
overall accuracy of tasks such as information retrieval, summarization,
translation, and question answering.

• Robustness in Real-World Applications: In real-world scenarios,
communication often involves multiple modalities. Multimodal NLP equips
AI systems to handle the complexity of real-life conversations, which
often include text, voice, images, and gestures.


1.3 Introduction Part


• Multimodal Natural Language Processing (NLP) is a cutting-edge method
for comprehending human communication by fusing textual data with
other modalities such as images, video, and audio.

• By enabling machines to understand not just linguistic cues but
also the visual and audio context, this developing field broadens the
boundaries of traditional NLP.

• By combining the contextual information offered by several modalities,
multimodal NLP seeks to improve the precision of language
understanding, sentiment analysis, and emotion detection. Its spectrum of
applications includes virtual assistants, social media analysis,
multimedia content interpretation, and medical diagnostics.


Chapter 2

Literature Survey

[1] Abdu et al. (2023)
    Objective: To develop a multimodal sentiment analysis system for
    recognizing a person's aggressiveness in pain.
    Methods/Techniques: The system uses a combination of text and visual
    information to identify whether a person is exhibiting aggressive
    behavior due to pain. The system is composed of five components:
    1. Text preprocessing 2. Feature extraction 3. Classification
    4. Image preprocessing 5. Feature extraction
    Relevant findings/Limitations identified: The system has the potential
    to be a valuable tool for a variety of applications, such as pain
    management, law enforcement, and customer service. However, the system
    is still under development and more research is needed to evaluate its
    effectiveness in real-world settings.

[2] Zeyd Boukhers et al. (2022)
    Objective: Develop a multimodal approach for metadata extraction from
    scientific PDFs.
    Methods/Techniques: Computer vision, NLP, Machine Learning.
    Relevant findings/Limitations identified: Improved metadata extraction
    accuracy; challenges in handling non-standard layouts.

[3] Neeraj Bhadani et al. (2021)
    Objective: Develop a multimodal deep learning model for predicting
    hateful memes.
    Methods/Techniques: CNN, LSTM, Multimodal Fusion.
    Relevant findings/Limitations identified: Enhanced performance in
    identifying hateful memes; challenges in handling diverse visual and
    textual content.

[4] Quigfu Qi et al. (2022)
    Objective: Develop a novel multimodal sentiment analysis model using a
    Transformer-based architecture.
    Methods/Techniques: Multimodal Encoder-Decoder, Transformer.
    Relevant findings/Limitations identified: Improved sentiment analysis
    accuracy; interpretability of multimodal attention mechanisms.

Table 2.1: Literature Survey
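
Several of the surveyed systems extract features from text and from images
before classification. One common way to combine such features is
feature-level (early) fusion, in which the per-modality feature vectors are
concatenated into a single input for the classifier. The following is a
minimal sketch of that idea only. The vocabulary, the image descriptor, and
all function names are assumptions made for illustration and are not taken
from any of the cited papers.

```python
import re
from typing import List

# Hypothetical toy vocabulary; a real system would learn text features from data.
VOCAB = ["good", "bad", "pain", "happy", "angry"]


def text_features(text: str) -> List[float]:
    """Bag-of-words counts over the toy vocabulary."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [float(tokens.count(word)) for word in VOCAB]


def image_features(pixels: List[List[int]]) -> List[float]:
    """Placeholder image descriptor: mean and range of grayscale
    intensities (0-255), each scaled to [0, 1]."""
    flat = [p for row in pixels for p in row]
    if not flat:
        raise ValueError("image has no pixels")
    mean = sum(flat) / len(flat) / 255.0
    spread = (max(flat) - min(flat)) / 255.0
    return [mean, spread]


def early_fusion(text: str, pixels: List[List[int]]) -> List[float]:
    """Concatenate text and image features into one vector for a classifier."""
    return text_features(text) + image_features(pixels)


fused_vector = early_fusion("Bad pain, really bad", [[0, 255], [255, 0]])
print(fused_vector)
```

The fused vector has one entry per vocabulary word followed by the two image
values, so a single classifier can be trained on both modalities at once.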


Chapter 3

Methodology/Proposed system

3.1 Architecture
A multimodal NLP architecture for sentiment analysis is a system that
combines text, audio, and video information to identify and understand
the sentiment of a piece of content. Such architectures can improve the
accuracy of sentiment analysis by taking into account the complementary
information found in different modalities. For example, a model might
use facial expressions to identify the sentiment of a speaker even if
their words are neutral or ambiguous. Similarly, a model might use the
speaker's tone of voice to identify the sentiment of a conversation even
if the words themselves are positive.

Figure 3.1: Multimodal-nlp
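
The idea of combining modalities can be illustrated with decision-level
(late) fusion, where each modality produces its own sentiment score and the
scores are then merged. The sketch below assumes that separate text, audio,
and video models already output a score in [-1, 1]. The weights, the
threshold, and the function names are hypothetical and are not part of the
architecture in Figure 3.1.

```python
from typing import Dict

# Hypothetical per-modality weights; in practice these would be tuned on
# validation data.
DEFAULT_WEIGHTS: Dict[str, float] = {"text": 0.4, "audio": 0.3, "video": 0.3}


def fuse_sentiment(scores: Dict[str, float],
                   weights: Dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted average of per-modality sentiment scores in [-1, 1].

    Modalities without a weight are ignored, and the weights of the
    modalities that are present are renormalised to sum to 1.
    """
    present = [m for m in scores if m in weights]
    if not present:
        raise ValueError("no known modality in scores")
    total = sum(weights[m] for m in present)
    return sum(weights[m] * scores[m] for m in present) / total


def label(score: float, threshold: float = 0.1) -> str:
    """Map a fused score to a coarse sentiment label."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Neutral words, but a negative tone of voice and facial expression.
example = {"text": 0.0, "audio": -0.6, "video": -0.8}
fused = fuse_sentiment(example)
print(label(fused))
```

With text alone the example would be labelled neutral, while the fused score
is negative because the audio and video scores contribute.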


Chapter 4

Results

Multimodal NLP models improve the accuracy of sentiment analysis by
taking into account the complementary information that can be found in
different modalities. For example, a model might use facial expressions to
identify the sentiment of a speaker even if their words are neutral or
ambiguous. Similarly, a model might use the speaker's tone of voice to
identify the sentiment of a conversation even if the words themselves are
positive. Application areas include customer service, market research, and
social media monitoring.


Chapter 5

Summary and Conclusion

In conclusion, multimodal sentiment analysis is a rapidly developing field
with the potential to revolutionize the way we understand human sentiment.
As multimodal NLP models continue to improve, we can expect to see them
used in even more applications in the future. Multimodal sentiment analysis
has the potential to make a significant impact on the way we interact with
the world around us. By better understanding human sentiment, we can
create more meaningful and impactful experiences for everyone.


Chapter 6

Plagiarism Report

PLAGIARISM SCAN REPORT

Date: 2023-10-02
Plagiarised: 0%
Unique: 100%
Words: 630
Characters: 4892

Content Checked For Plagiarism

Abstract:
Artificial Intelligence (AI) and Machine Learning (ML) have emerged as indispensable tools in advancing the field of
Multimodal Natural Language Processing (NLP). The integration of AI and ML techniques has revolutionized how machines
interpret and comprehend diverse forms of human communication, transcending the boundaries of traditional unimodal
NLP. In the domain of Multimodal NLP, AI and ML algorithms play a pivotal role in processing and fusing information from
various modalities such as text, images, audio, and videos. Through advanced neural network architectures, including
Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformer models, machines are
equipped to learn complex patterns and relationships between different modalities. This enables them to decipher
nuanced context, emotions, and intent embedded within multimodal data.
Keywords: Sentimental analysis
Introduction:
Multimodal Natural Language Processing (NLP) is a cutting-edge method for comprehending human communication by
fusing textual data with other modalities like images, videos, and audio. By enabling machines to understand not just
language clues but also visual and audio environment, this developing field broadens the traditional NLP boundaries. By
combining the contextual information offered by several modalities, multimodal NLP seeks to improve the precision of
language understanding, sentiment analysis, and emotion detection. Its spectrum of applications includes virtual assistants,
social media analysis, multimedia content interpretation, and medical diagnostics.

Objectives:
The motivation/objective for multimodal natural language processing (NLP) using AI and machine learning lies in
leveraging the synergy between diverse data modalities, such as text, images, audio, and more. By integrating these
modalities, researchers and practitioners aim to enhance the understanding, interpretation, and generation of human
language, which in turn can lead to a range of benefits and advancements:
1. Enhanced Contextual Understanding: Combining textual and visual information allows for a deeper contextual
understanding of language. Visual cues can provide additional context, helping AI models better comprehend ambiguous
text, idiomatic expressions, and references.
2. Robustness in Real-World Applications: In real-world scenarios, communication often involves multiple modalities.
Multimodal NLP equips AI systems to handle the complexity of real-life conversations, which often include text, voice,
images, and gestures.
3. Richer User Interaction: Integrating speech, text, and visual inputs can create more natural and immersive user
interactions with AI systems, making interfaces more user-friendly and accessible, especially for individuals with different
communication preferences.
4. Cross-Modal Data Fusion: By integrating different modalities, AI models can take advantage of complementary
information, improving the overall accuracy of tasks such as information retrieval, summarization, translation, and question
answering.
5. Enabling New Applications: Multimodal NLP opens the door to innovative applications such as image captioning,
video description, interactive chatbots with visual understanding, automatic video content generation, and more.


Chapter 7

References

[1] Abdu et al. (2023), J. Yang and S. Jana, “Deepxplore: Automated
whitebox testing of deep learning systems”, Proceedings of the 26th
Symposium on Operating Systems Principles, ser. SOSP ’17, pp. 1-18, 2017.

[2] Zeyd Boukhers et al., Sooryanarayan Gobu Doraisamy and Navaneeth
Kumar Kanakarajan, “A Multimodal Approach for Extracting Content
Descriptive Metadata from Lecture Videos”, J. Intell. Inf. Syst., vol. 46,
no. 1, pp. 121-145, 2016, [online] Available:
https://doi.org/10.1007/s10844-015-0356-5.

[3] Neeraj Bhadani et al., Richard Tzong-Han Tsai, Cheng-Lung Sung,
Chiu-Chen Hsieh, Cheng-Wei Lee, Shih-Hung Wu, et al., “Reference
metadata extraction using a hierarchical knowledge representation
framework”, Decision Support Systems, vol. 43, no. 1, pp. 152-167, 2022,
[online] Available: https://doi.org/10.1016/j.dss.2006.08.006.

[4] Quigfu Qi et al., D Wang, S Kumari et al., “Large-scale atlas of
microarray data reveals the distinct expression landscape of different
tissues in Arabidopsis”, Plant J, vol. 86, no. 6, pp. 472-80, Jun 2016.
