
IMAGE-CAPTION GENERATION
PRESENTED BY:
DEEPANSHU SINGH
MAYANK
OVERVIEW

• IMAGE CAPTIONING IS THE PROCESS OF GENERATING A TEXTUAL DESCRIPTION OF AN IMAGE.

• IT USES BOTH NATURAL LANGUAGE PROCESSING AND COMPUTER VISION TO GENERATE THE CAPTIONS.
DEEP LEARNING

• DEEP LEARNING IS A SUBFIELD OF MACHINE LEARNING CONCERNED WITH ALGORITHMS INSPIRED BY THE STRUCTURE AND FUNCTION OF THE BRAIN, CALLED ARTIFICIAL NEURAL NETWORKS.

• IN SHORT, IT IS NEURAL NETWORKS WITH MULTIPLE LAYERS.

• DEEP LEARNING TACKLES THE PROBLEM OF IMAGE CAPTIONING BETTER THAN ANY OTHER PROGRAMMING PARADIGM.
PLATFORMS & TOOLS

• KAGGLE PLATFORM
• GOOGLE CLOUD CONSOLE PLATFORM
• KERAS WITH TENSORFLOW AS BACKEND
• PRE-TRAINED VGG MODEL BY OXFORD
• VIM TEXT-EDITOR
DATASET DESCRIPTION
• A GOOD DATASET TO USE WHEN GETTING STARTED WITH IMAGE CAPTIONING IS THE FLICKR8K DATASET.

• FLICKR8K_DATASET.ZIP (1 GIGABYTE): AN ARCHIVE OF ALL PHOTOGRAPHS (6,000 TRAINING + 2,000 DEVELOPMENT/TEST).

• FLICKR8K_TEXT.ZIP (2.2 MEGABYTES): AN ARCHIVE OF ALL TEXT DESCRIPTIONS FOR THE PHOTOGRAPHS (5 CAPTIONS PER IMAGE).

• THE REASON FOR USING THE FLICKR8K DATASET IS THAT IT IS REALISTIC AND SMALL ENOUGH TO BUILD MODELS ON YOUR WORKSTATION USING A CPU.
CONCEPTS
• CNN (CONVOLUTIONAL NEURAL NETWORK), AS USED IN VGG.
• LONG SHORT-TERM MEMORY (LSTM) RECURRENT NEURAL NETWORK.
• TEXT PROCESSING.
• BLEU SCORE FOR TEXTUAL EVALUATION.
• LINUX AND THE COMMAND LINE.
• GOOGLE CLOUD CONSOLE USAGE.
BRIEF APPROACH
[Diagram: pre-processed words feed an RNN (LSTM); its output and the pre-processed image features are combined in a merge layer.]
VGG (VISUAL GEOMETRY GROUP)
• VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE VISUAL RECOGNITION.

• CONVOLUTIONAL NETWORKS (CONVNETS) CURRENTLY SET THE STATE OF THE ART IN VISUAL RECOGNITION.

• THEY CREATED 16- AND 19-LAYER MODELS FOR THE IMAGENET (ILSVRC) 2014 COMPETITION.

• WE HAVE USED THE 16-LAYER VGG MODEL IN OUR PROJECT TO EXTRACT THE FEATURES OF IMAGES.
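As a sketch of how the 16-layer model can serve as a feature extractor with Keras (the project's stated framework), the classification layer is dropped so the second-to-last fully connected layer becomes the output; the function names here are illustrative, not from the project code:

```python
# Sketch: extracting 4096-d image features with the 16-layer VGG model
# from keras.applications. The final 1000-way softmax is removed so the
# fc2 layer's activations are returned for each image.
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model

def build_feature_extractor(weights="imagenet"):
    base = VGG16(weights=weights)
    # Keep everything up to (and including) the 4096-d fc2 layer.
    return Model(inputs=base.inputs, outputs=base.layers[-2].output)

def extract_features(model, image_batch):
    # image_batch: float array of shape (n, 224, 224, 3), RGB order.
    return model.predict(preprocess_input(image_batch), verbose=0)
```

Running each photograph through this model once and caching the 4096-d vectors avoids re-computing VGG features on every training epoch.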
CONVOLUTIONAL NEURAL NETWORK
(CNN)

• CNNS ARE VERY SIMILAR TO ORDINARY NEURAL NETWORKS.

• CONVNET ARCHITECTURES MAKE THE EXPLICIT ASSUMPTION THAT THE INPUTS ARE IMAGES, WHICH ALLOWS US TO ENCODE CERTAIN PROPERTIES INTO THE ARCHITECTURE.

• CNNS OPERATE OVER VOLUMES.

• THEY HAVE FILTERS AND POOLING LAYERS BECAUSE OF THE ASSUMPTION THAT THE INPUT IS ALWAYS AN IMAGE.
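A toy Keras example of "operating over volumes": a convolution layer slides its filters across the full depth of the input, so an RGB image volume becomes a deeper feature volume, and pooling then shrinks only the spatial dimensions. The layer sizes mirror the first VGG block but are otherwise arbitrary:

```python
# A 224x224x3 image volume passed through 64 3x3 filters ('same'
# padding) yields a 224x224x64 volume; 2x2 max-pooling then halves the
# spatial size, leaving the depth untouched.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),  # input volume: RGB image
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),        # downsamples width/height only
])
print(model.output_shape)  # (None, 112, 112, 64)
```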
RNN AND LSTM
• RECURRENT NEURAL NETWORKS ARE THE STATE-OF-THE-ART ALGORITHM FOR SEQUENTIAL DATA AND ARE USED BY, AMONG OTHERS, APPLE'S SIRI AND GOOGLE'S VOICE SEARCH.

• AN RNN HAS AN INTERNAL MEMORY, WHICH MAKES IT PERFECTLY SUITED FOR MACHINE LEARNING PROBLEMS THAT INVOLVE SEQUENTIAL DATA, BECAUSE IT CAN REMEMBER ITS INPUTS.

• IN OUR MODEL, TEXT INPUT SEQUENCES WITH A PRE-DEFINED LENGTH (34 WORDS) ARE FED INTO AN EMBEDDING LAYER THAT USES A MASK TO IGNORE PADDED VALUES. THIS IS FOLLOWED BY AN LSTM LAYER WITH 256 MEMORY UNITS.
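The merge architecture described above can be sketched in Keras as two branches joined by addition; the 34-word length, masked embedding, and 256 LSTM units come from the slides, while `vocab_size`, layer names, and the dropout rate are illustrative assumptions rather than the project's exact code:

```python
# Sketch of the merge model: a 4096-d VGG image-feature branch and a
# text branch (masked embedding + 256-unit LSTM) are added together and
# decoded into a next-word prediction over the vocabulary.
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
from tensorflow.keras.models import Model

def define_model(vocab_size, max_length=34):
    # Image-feature branch: VGG fc2 activations -> 256-d representation.
    inputs1 = Input(shape=(4096,))
    fe2 = Dense(256, activation="relu")(Dropout(0.5)(inputs1))
    # Sequence branch: padded word indices -> embedding (mask_zero
    # ignores the zero padding) -> 256-unit LSTM.
    inputs2 = Input(shape=(max_length,))
    se1 = Embedding(vocab_size, 256, mask_zero=True)(inputs2)
    se3 = LSTM(256)(Dropout(0.5)(se1))
    # Merge both branches and predict the next word.
    decoder = Dense(256, activation="relu")(add([fe2, se3]))
    outputs = Dense(vocab_size, activation="softmax")(decoder)
    model = Model(inputs=[inputs1, inputs2], outputs=outputs)
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    return model
```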
BLEU SCORE
• BLEU, OR THE BILINGUAL EVALUATION UNDERSTUDY, IS A SCORE FOR COMPARING A TRANSLATION OF TEXT TO ONE OR MORE REFERENCE TRANSLATIONS.

• IN OUR MODEL THERE IS MORE THAN ONE POSSIBLE CAPTION FOR AN IMAGE, SO WE EVALUATE OUR MODEL USING THE BLEU SCORE.

• A PERFECT MATCH RESULTS IN A SCORE OF 1.0, WHEREAS A PERFECT MISMATCH RESULTS IN A SCORE OF 0.0.

• NLTK PROVIDES AN IMPLEMENTATION OF THE BLEU SCORE.

• WE HAVE USED CORPUS_BLEU FOR CALCULATING THE BLEU SCORE OVER MULTIPLE SENTENCES, SUCH AS A PARAGRAPH OR A DOCUMENT.
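A minimal example of NLTK's `corpus_bleu` as used for caption evaluation: each image contributes a list of tokenised reference captions plus one generated candidate. The captions here are made up purely for illustration:

```python
# corpus_bleu takes, per image, a list of reference token lists and one
# candidate token list; a perfect match scores 1.0.
from nltk.translate.bleu_score import corpus_bleu

references = [
    [["a", "dog", "runs", "on", "the", "beach"],
     ["a", "dog", "running", "along", "a", "beach"]],
]
candidates = [["a", "dog", "runs", "on", "the", "beach"]]

# BLEU-1 (unigram precision only) via the weights argument.
score = corpus_bleu(references, candidates, weights=(1.0, 0, 0, 0))
print(round(score, 2))  # 1.0
```

Passing weights such as (0.5, 0.5, 0, 0) gives BLEU-2 and so on, which is how the usual BLEU-1 through BLEU-4 figures are produced.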
OUR EVALUATION
[OUTPUT-1 AND OUTPUT-2: SAMPLE IMAGES WITH GENERATED CAPTIONS]
CONCLUSION

HERE WE CAN CONCLUDE THAT AN RNN + VGG MODEL CAN GIVE GOOD RESULTS FOR IMAGE CAPTIONING EVEN AFTER TRAINING ON SMALL DATASETS LIKE FLICKR8K.
Thank You
Everyone
