Submitted By
Aoun Javeed
Shahzaib Naveed
(2018-2022)
Project
Submitted to
Mr. Saqib Ubaid
Department of Computer Science
In partial fulfillment of the requirements
For the degree of Bachelor of Science in Computer Science
By
Aoun Javeed
CS181065
Shahzaib Naveed
CS181119
(2018-2022)
Signature: ________________
Signature: ________________
APPROVAL FOR SUBMISSION
I certify that this project report entitled “Image to Story”, prepared by AOUN JAVEED
and SHAHZAIB NAVEED, has met the required standard for submission in partial
fulfillment of the requirements for the award of the degree of Bachelor of Science in
Computer Science at Khwaja Fareed University of Engineering & Information Technology.
Approved by:
Signature : _________________________
Date : _________________________
ACKNOWLEDGEMENT
First of all, we thank Almighty ALLAH for granting us knowledge and for all the blessings
that He has provided and poured upon us. He has shown His unconditional and pure love
through the people around us, who let us feel loved and cared for.

We would also like to thank our institute, KFUEIT. The development and documentation
phases of the FYP were a great opportunity for learning and professional development.

We gratefully acknowledge the support and patience of our family, professors, and friends
throughout our studies; without them, this project report could never have been completed.

We are grateful to Dr. Saleem Ullah, Head of the Department of Computer Science, for his
support and appreciation. We are also very grateful to Mr. Saqib Ubaid, Lecturer in the
Department of Computer Science and supervisor of our project, for his support, guidance,
and great supervision.

We again thank our course fellows for their cooperation during the course. Throughout this
phase of documentation, we not only gained a lot of knowledge but, more importantly, also
had a great chance to sharpen our skills in a professional working environment.

We would like to thank everyone who contributed to the successful completion of this
project, and to express our gratitude to our project supervisor, Mr. Saqib Ubaid, for his
invaluable advice, guidance, and enormous patience throughout the development of this
research.

In addition, we would also like to express our gratitude to our loving parents and friends
who helped and encouraged us.
ABSTRACT
A picture is worth a thousand words. An image contains many thoughts and emotions, so
this system is developed to generate words that carry meaning and form a story. Stories
have been a proven medium for influencing, explaining, and teaching for ages. Stories
provoke thought and bring out insights that are hard to decipher and understand otherwise.
The idea of this project is to develop an intelligent system that generates short stories about
images. Such a system can be used in many learning and educational fields, and authors can
use it to enhance their work. There are some existing systems for story generation from
images, but they perform specific tasks such as generating reports for trading markets,
generating romantic stories, or writing scripts. The problem is that there is no system that
generates different types of stories; for example, the system named “Neural Storyteller”
only generates romantic stories and is unable to produce stories of other categories. A
further challenge is to develop a system that describes the image to visually impaired
persons. In response to these problems, we introduce a system that can generate stories of
all categories and is not bound to a single type, such as romantic stories.
Table of Contents
Chapter 1 Introduction
  1.1 Introduction
  1.2 Problem Statement
  1.3 Objective
  1.4 Scope
  1.5 Advantages of Proposed Solutions
  1.6 Relevance to Study Program
  1.7 Chapter Summary
Chapter 2 Existing System
  2.1 Existing System
  2.2 Drawbacks of Existing System
  2.3 Examples of Existing System
  2.4 Need to Enhance Existing System
  2.5 Chapter Summary
Chapter 3 Requirement Engineering
  3.1 Detailed Description of Proposed System
  3.2 Understanding the System
    3.2.1 User Involvement
    3.2.2 Stakeholders
    3.2.3 Domain
  3.3 Requirements Engineering
    3.3.1 Functional Requirements
    3.3.2 Non-Functional Requirements
    3.3.3 Gantt Chart
  3.4 Hurdles in Optimizing the Current System
  3.5 Chapter Summary
Chapter 4 Design
  4.1 Software Process Model
  4.2 Benefits of Model
  4.3 Limitations of Model
  4.4 Design
    4.4.1 Methodology Diagram / System Overview Diagram
    4.4.2 System Flow Diagram
    4.4.3 UML Diagrams
      4.4.3.1 Use Case Diagram
      4.4.3.2 Activity Diagram
      4.4.3.3 Collaboration Diagram
  4.5 Chapter Summary
List of Figures
Figure 2.1 Neural Network Sample Output
Figure 2.2 Pix2Story Sample Output
Figure 3.1 Gantt Chart
Figure 4.1 Agile Model
Figure 4.2 Methodology Diagram/System Overview Diagram
Figure 4.3 System Flow Diagram
Figure 4.4 Use Case Diagram
Figure 4.5 Activity Diagram
Figure 4.6 Collaboration Diagram
List of Tables
Table 2.1 Comparison of Existing Systems
Table 3.1 Functional Requirements
Table 3.2 Non-Functional Requirements
Table 4.1 Trained Model Module
Table 4.2 Model Prediction Module
Chapter 1 Introduction
1.1 Introduction
A picture is worth a thousand words. An image contains many thoughts and emotions, so
this system is developed to generate words that carry meaning and form a story. Stories
have been a proven medium for influencing, explaining, and teaching for ages. Stories
provoke thought and bring out insights that are hard to decipher and understand otherwise.
The art of storytelling is often overlooked in data-driven operations, and businesses fail to
understand that the best stories, if not presented well, can end up being useless.
The purpose of this project is therefore to generate stories from images, turning a picture
into a description. To achieve this target, we perform the tasks described below.
There are two steps to generate a story from an image. The first is to generate a caption
from the image. Image caption generation is a popular research area of Artificial
Intelligence that deals with image understanding and producing a language description for
an image. Generating well-formed sentences requires both syntactic and semantic
understanding of the language, and being able to describe the content of an image using
accurately formed sentences is a very challenging task. The second step is caption-to-story
generation, in which we generate a story according to the caption. Different techniques can
be used to generate a relatively short story. The final story describes the caption of the
image and tells what happened in it.
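The two-step pipeline above can be shown as a structural sketch. The helper functions here are hypothetical placeholders of our own naming; in a real system they would be backed by a trained captioning model and a trained story decoder.

```python
# Structural sketch of the two-step pipeline. generate_caption and
# caption_to_story are hypothetical placeholders, not trained models.

def generate_caption(image_name: str) -> str:
    # Step 1 placeholder: a trained CNN-RNN captioner would run here.
    sample_captions = {"beach.jpg": "a dog running on the beach"}
    return sample_captions.get(image_name, "an unrecognized scene")

def caption_to_story(caption: str, genre: str = "adventure") -> str:
    # Step 2 placeholder: a trained caption-to-story decoder would run here.
    return f"Once upon a time there was {caption}, and a great {genre} began."

def image_to_story(image_name: str, genre: str = "adventure") -> str:
    # The full pipeline: image -> caption -> story.
    return caption_to_story(generate_caption(image_name), genre)

print(image_to_story("beach.jpg"))
```

The point of the sketch is only the data flow: the story generator never sees the image directly, only the caption produced in step one.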
1.2 Problem Statement
Automatically describing the content of images using natural language is a fundamental
and challenging task. The main challenge is to develop a system that describes an image to
visually impaired persons.
1.3 Objective
To serve the community, in particular visually impaired persons who are unable to see and
understand images.
1.4 Scope
Community Services.
1.5 Advantages of Proposed Solutions
A machine-learning solution like the Image to Story generator will help authors and writers
draw ideas from its output, and the project can be extended to create scripts from images.
The main advantage is that a visually impaired person can easily understand what is going
on in a picture.
1.6 Relevance to Study Program
In software terms, this project is relevant to the study program in the phase where image
processing is performed. The back-end code of this project is written in Python, which is
also relevant to the study program because we use such technologies and development
frameworks (IoT, image processing, etc.).
1.7 Chapter Summary
A picture is worth a thousand words. An image contains many thoughts and emotions, so
this system is developed to generate words that carry meaning and form a story. Stories
have been a proven medium for influencing, explaining, and teaching for ages, and they
provoke thought and bring out insights that are hard to decipher otherwise. Although great
progress has been made in computer vision, and tasks such as object recognition, action
classification, image classification, attribute classification, and scene recognition are now
possible, it is a relatively new task to let a computer describe an image given to it in the
form of a human-like sentence. For this goal of image captioning, the semantics of images
must be captured and expressed in natural language. To build our image caption generator
model, we merge CNN and RNN architectures: an image is given as input, and the method
outputs an English sentence describing the content of the image. This model analyzes the
image and generates relevant words for it.
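As a toy illustration of the CNN-RNN merge idea above, the following sketch combines a stand-in CNN image feature vector with a stand-in RNN language state and picks the next word greedily. All weights and dimensions are random illustrative values, not a trained model.

```python
import numpy as np

# Toy sketch of the merge step in a CNN-RNN captioner: a CNN-derived
# image feature vector and an RNN-derived sentence state are combined,
# then projected to vocabulary logits to pick the next word.
# All weights here are random stand-ins, not trained parameters.

rng = np.random.default_rng(0)
feature_dim, vocab_size = 256, 1000

image_features = rng.normal(size=feature_dim)   # stand-in for CNN encoder output
sentence_state = rng.normal(size=feature_dim)   # stand-in for RNN language state

merged = np.tanh(image_features + sentence_state)   # simple additive merge
W_out = rng.normal(size=(vocab_size, feature_dim))  # output projection
logits = W_out @ merged                             # scores over the vocabulary
next_word_id = int(np.argmax(logits))               # greedy next-word choice
print(next_word_id)
```

In a trained captioner this step is repeated, feeding each chosen word back into the RNN until an end-of-sentence token is produced.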
Chapter 2 Existing System
2.1 Existing System
There are some existing systems for story generation from images, but they perform
specific tasks such as generating reports for trading markets, generating romantic stories, or
writing scripts. These systems use already-established concepts for story generation. Rapid
advances in computer vision, aided by Convolutional Neural Networks (CNNs), are
enabling applications that produce results in our daily life, such as image classification,
object detection, neural style transfer, and face recognition. These systems use a CNN to
classify the objects in an image and tell what type of objects are present. In recent years,
many approaches to image classification have emerged, and all of them do the same thing:
classify objects using a CNN. After that, GPT-based approaches are used to generate text
on the basis of the input.
2.1.1 Neural Storyteller
Neural-storyteller is a recurrent neural network that generates little stories about images. Its
repository contains code for generating stories with your own images, as well as
instructions for training new models. It combines:
Skip-thought vectors
Image-sentence embeddings
A conditional neural language model
Style shifting
The 'style-shifting' operation is what allows this model to transfer standard image captions
to the style of stories from novels.
The system first trains a recurrent neural network (RNN) decoder on romance novels. Each
passage from a novel is mapped to a skip-thought vector. The RNN then conditions on the
skip-thought vector and aims to generate the passage that it has encoded. It uses romance
novels collected from the BookCorpus dataset.
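A minimal sketch of this conditioning step, with random stand-in weights rather than a trained decoder, might look like the following: the skip-thought vector seeds the hidden state, and each step emits one token greedily.

```python
import numpy as np

# Minimal numpy sketch of a decoder conditioning on a skip-thought
# vector: the vector seeds the RNN hidden state, then tokens are
# emitted one at a time. Weights and dimensions are random
# illustrative stand-ins, not a trained model.

rng = np.random.default_rng(1)
hidden_dim, vocab_size, steps = 64, 50, 3

skip_thought = rng.normal(size=hidden_dim)            # encodes the target passage
W_hh = 0.1 * rng.normal(size=(hidden_dim, hidden_dim))
W_xh = 0.1 * rng.normal(size=hidden_dim)
W_out = 0.1 * rng.normal(size=(vocab_size, hidden_dim))

h = np.tanh(skip_thought)      # conditioning: hidden state starts at the vector
token, generated = 0, []       # token 0 as a start-of-sequence stand-in
for _ in range(steps):
    h = np.tanh(W_hh @ h + W_xh * token)   # simple recurrent update
    token = int(np.argmax(W_out @ h))      # greedy decoding
    generated.append(token)
print(generated)
```

Because the hidden state is initialized from the skip-thought vector, different passages lead to different generated token sequences, which is the essence of conditional generation.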
In parallel, a visual-semantic embedding is trained between COCO images and captions. In
this model, captions and images are mapped into a common vector space; after training, the
system can embed new images and retrieve captions. Given these models, a way is needed
to bridge the gap between retrieved image captions and passages in novels. The caption is
given to the decoder, which keeps the "thought" of the caption but replaces the
image-caption style with that of a story.
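Caption retrieval in such a joint embedding space can be sketched with cosine similarity. The vectors below are random stand-ins for learned embeddings, arranged so that the image vector lands near one caption.

```python
import numpy as np

# Toy sketch of caption retrieval in a joint image-caption embedding
# space: images and captions live in the same vector space, and the
# nearest caption is retrieved for a new image. All vectors are
# random stand-ins for learned embeddings.

rng = np.random.default_rng(7)
dim = 300
captions = ["a dog on the beach", "a city at night", "a bowl of fruit"]
caption_vecs = rng.normal(size=(3, dim))                 # caption embeddings
image_vec = caption_vecs[1] + 0.01 * rng.normal(size=dim)  # image near caption 1

def normalize(v):
    # Scale vectors to unit length so the dot product is cosine similarity.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

scores = normalize(caption_vecs) @ normalize(image_vec)  # cosine similarities
best = captions[int(np.argmax(scores))]
print(best)
```

After training a real embedding, the same nearest-neighbour lookup retrieves captions for images the model has never seen.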
2.1.2 Pix2Story:
Pix2Story is an AI system that generates stories about an image in different literary genres.
A trained visual-semantic embedding model analyzes the image and generates captions.
The Pix2Story application then becomes the storyteller by transforming the captions and
generating a narrative: it obtains the captions from the uploaded picture and feeds them to a
recurrent neural network model to generate the narrative based on the genre and the picture.
A visual-semantic embedding model was trained on the MS COCO captions dataset of
300,000 images to make sense of the visual input by analyzing the uploaded image and
generating the captions. The captions are then transformed into a narrative based on the
selected genre: Adventure, Sci-Fi, or Thriller. For this, an encoder-decoder model was
trained for two weeks on more than 2,000 novels. This training allows each passage of the
novels to be mapped to a skip-thought vector, a way of embedding thoughts in vector
space.
Pix2Story uses the Azure Machine Learning Service and the Azure Model Management
SDK with Python 3 to create a Docker image containing these models and deploy it using
AKS with GPU capability, making the project production-ready.
2.1.3 Picture Books
Picture Books generates stories for children aged four to six based on a set of picture
elements selected by the user, namely background, characters, and objects. The system
creates an abstract story representation of the theme for the given picture and uses NLG
techniques to generate the sentences in the story. Manual evaluations on 15 generated
stories were conducted to validate the correctness of their grammar and the appropriateness
of their content. The generated stories received an average score of 91.5% for grammar
correctness and 88% for appropriateness of content. Automated evaluation was also
performed by comparing the generated stories against reference stories, with the system
receiving a low word error rate of 9.71% and a sentence error rate of 41.96%. The test
results demonstrated that computers can generate stories from pictures, and that the
system's knowledge representation, using a combination of a story-pattern data structure
and a semantic ontology, is feasible and usable for story generation systems regardless of
the domain.
Table 2.1 Comparison of Existing Systems
Neural StoryTeller    No    No    No    No    Yes
Pix2Story             No    Yes   Yes   Yes   No
Picture Book          Yes   Yes   No    No    Yes
2.2 Drawbacks of Existing System
The drawback of all the existing systems is that each creates only a specific type of story.
For example, Neural Storyteller will create only "romantic" short stories, so it is unable to
generate stories of different categories.
2.3 Examples of Existing System
Neural StoryTeller
Pix2Story
Picture Books
2.4 Need to Enhance Existing System
The need to enhance the existing systems is that there has been no work on generating
different types of stories (romantic, funny, action, sci-fi, thriller, adventure) in one system.
2.5 Chapter Summary
Some existing systems use established concepts for story generation. Rapid advances in
computer vision, driven by Convolutional Neural Networks, have enabled applications such
as image classification, object detection, neural style transfer, and face recognition; these
systems use a CNN to classify objects and tell what type of objects are present in a
particular image. Different existing systems generate different types of stories: Neural
StoryTeller generates romantic stories, Pix2Story generates adventure, sci-fi, and thriller
stories, and Picture Books generates stories only for children. The drawback of all the
existing systems is that none of them supports image-to-story writing for all categories in
one system.
Chapter 3 Requirement Engineering
3.1 Detailed Description of Proposed System
There are two major steps in the image-to-story generator system. The first is the image
caption generator. Image caption generation is a popular research area of Artificial
Intelligence that deals with image understanding and producing a language description for
an image. Generating well-formed sentences requires both syntactic and semantic
understanding of the language. Being able to describe the content of an image using
accurately formed sentences is a very challenging task, but it could also have a great
impact by helping visually impaired people better understand the content of images.
The second step is the caption-to-story generator, which generates a story according to the
caption. Different techniques can be used to generate a relatively short story; the final story
describes the caption of the image and tells what happened in the image. We first train a
recurrent neural network (RNN) decoder on romance novels. Each passage from a novel is
mapped to a skip-thought vector, and the decoder conditions on the skip-thought vector and
aims to generate the passage that it has encoded. In parallel, we train a visual-semantic
embedding between COCO images and captions: captions and images are mapped into a
common vector space, and after training we can embed new images and retrieve captions.
Given these models, we need a way to bridge the gap between retrieved image captions and
passages in novels. That is, if we had a function F that maps a collection of image caption
vectors x to a book passage vector, then we could feed F(x) to the decoder to get our story.
Here, c is the mean of the skip-thought vectors for the Microsoft COCO training captions.
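Neural-storyteller realizes this bridge as a vector "style shift" of the form F(x) = x - c + b, where b is the analogous mean over book-passage vectors. A sketch with random stand-in vectors (real skip-thought embeddings would come from a trained encoder):

```python
import numpy as np

# Sketch of the style-shift bridge: F(x) = x - c + b, where x pools the
# retrieved caption vectors, c is the mean skip-thought vector of COCO
# captions, and b is the mean skip-thought vector of book passages.
# All vectors are random stand-ins for real skip-thought embeddings.

rng = np.random.default_rng(42)
dim = 4800                                    # skip-thought dimensionality

caption_vectors = rng.normal(size=(5, dim))   # retrieved caption embeddings
c = rng.normal(size=dim)                      # mean of COCO caption vectors
b = rng.normal(size=dim)                      # mean of book passage vectors

x = caption_vectors.mean(axis=0)              # pool the retrieved captions
story_vector = x - c + b                      # shift caption style to book style
print(story_vector.shape)
```

Intuitively, subtracting c removes the "caption style" component and adding b injects the "novel style" component, while the content of x is preserved; the shifted vector is then fed to the decoder.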
The image-to-story generator is a recurrent neural network that generates little stories about
images. The repository contains code for generating stories with your own images, as well
as instructions for training models. There are many choices for the type of story, such as
romantic, funny, or action.
3.2 Understanding the System
3.2.1 User Involvement
In the user involvement of the system, two types of people are directly involved with the
system; one of them is the owner, whom the system notifies in an alarming situation.
3.2.2 Stakeholders
Every project has stakeholders, which can be internal or external. Internal stakeholders are
those involved in the development of the project, and external stakeholders are those who
want the project to be constructed. In our project, the stakeholders are the developer and the
user. The stakeholders mentioned above can affect or be affected by the outcome of our
project, and they are directly involved in it. Two types of stakeholders are involved in the
development of this type of project, as follows.
Developer
User
3.2.3 Domain
The domain of this system is Deep Learning and Artificial Intelligence, implemented in
Python.
3.3 Requirements Engineering
In our project, we clearly define and specify the requirements that relate to our system's
services. These requirements were obtained through different requirement-engineering
stages, which are as follows:
Requirement elicitation
Requirement specification
Requirement verification & validation
Requirement elicitation
To identify the missing features in the existing systems, we interviewed experienced users
who could tell us the requirements, and we also identified requirements ourselves.
Requirement specification
We specify the requirements of our project by drawing a physical model of the system. All
requirements, functional as well as non-functional, are specified by this physical model.
3.3.1 Functional Requirements
Functional requirements define the basic system behavior. Essentially, they are what the
system does or must not do, and can be thought of in terms of how the system responds to
inputs. Functional requirements usually define if/then behaviors and include calculations,
data input, and business processes. Functional requirements are features that allow the
system to function as intended.
3.3.2 Non-Functional Requirements
Requirement ID    Requirement Description                                        Must/Want
NFR003            The system never stores any user data and provides privacy.    Must
3.4 Hurdles in Optimizing the Current System
The current system has a complex structure and uses old techniques that are outdated and
slow.
3.5 Chapter Summary
There are two major steps in the image-to-story generator system. The first is the image
caption generator, a popular research area of Artificial Intelligence that deals with image
understanding and producing a language description for an image. Being able to describe
the content of an image using accurately formed sentences is a very challenging task, but it
could also have a great impact by helping visually impaired people better understand the
content of images. The second step is the caption-to-story generator, which generates a
story according to the caption; the final story describes the caption of the image. That is, if
we had a function F that maps a collection of image caption vectors x to a book passage
vector, we could feed F(x) to the decoder to get our story. The image-to-story generator is a
recurrent neural network that generates little stories about images, and the repository
contains code for generating stories with your own images as well as instructions for
training models. Our Image to Story generator system is mainly developed for
entertainment and learning purposes.
Chapter 4 Design
4.1 Software Process Model
As we know, many process models are used for developing different types of
computer-based systems and applications. Every process model is suited to developing a
specific type of application within a particular time span. Two major approaches are
commonly used in software development: the "plan-driven" approach and the "Agile"
approach. Each approach works on specific criteria for developing computer-based systems
and applications, and each has a set of process models followed by the development
personnel. We follow the Agile approach for our project, "Image to Story".
4.2 Benefits of Model
In Agile project management, testing is an integrated part of the project execution phase,
which means that the overall quality of the final product is greater.
Agile allows managers to have better control over the project due to its transparency,
feedback integration, and quality-control features. Quality is ensured throughout the
implementation phase of the project and all stakeholders are involved in the process
with daily progress reports through advanced reporting tools and techniques.
With increased visibility, predicting risks, and coming up with effective mitigation plans
becomes easier. Within the Agile framework, there are greater ways to identify and predict
risks and plan to ensure that the project runs smoothly.
In theory, any project using an Agile methodology will never fail. Agile works in small
sprints that focus on continuous delivery. There is always a small part that can be salvaged
and used in the future even if a particular approach doesn’t go as planned.
4.3 Limitations of Model
4.3.1 Hard to Predict Effort
Because Agile is based on the idea that teams won't know what their result (or even a few
cycles of delivery down the line) will look like from day one, it is challenging to predict
efforts like cost, time, and resources required at the beginning of a project, and this
challenge becomes more pronounced as projects get bigger and more complex.
4.3.2 Less Detailed Documentation
In Agile, documentation happens throughout a project and is often "just in time" for
building the output, not at the beginning. As a result, it becomes less detailed and often
falls to the back burner.
4.3.3 No finite end
The fact that Agile requires minimal planning at the beginning makes it easy to get
sidetracked delivering new, unexpected functionality. Additionally, it means that projects
have no finite end, as there is never a clear vision of what the “final product” looks like.
4.4 Design
A design is a plan or specification for the construction of an object or system or for the
implementation of an activity or process, or the result of that plan or specification in the form
of a prototype, product or process. The verb to design expresses the process of developing a
design.
4.4.1 Methodology Diagram / System Overview Diagram
The System Overview Diagram can help you identify the building blocks of your system
without yet having to determine whether they are to be hardware, software, or mechanics.
The System Overview Diagram is drawn before splitting into hardware and software block
diagrams.
4.4.2 System Flow Diagram
System flowcharts are a way of displaying how data flows in a system and how decisions
are made to control events. Symbols are used to illustrate this; they are connected together
to show what happens to data and where it goes. Data flow charts, by contrast, do not
include decisions; they just show the path that data takes, where it is held and processed,
and then output.
4.4.3 UML Diagrams
UML is an acronym that stands for Unified Modeling Language. Simply put, UML is a
modern approach to modeling and documenting software; in fact, it is one of the most
popular business process modeling techniques.
4.4.3.1 Use Case Diagram
In the Unified Modeling Language (UML), a use case diagram can summarize the details
of your system's users (also known as actors) and their interactions with the system. To
build one, you use a set of specialized symbols and connectors. An effective use case
diagram can help your team discuss and represent the scenarios in which your system
interacts with people, organizations, or external systems, the goals that the system helps
those actors achieve, and the scope of the system.
4.5 Chapter Summary
Many process models are used for developing different types of computer-based systems
and applications, and every process model is suited to developing a specific type of
application within a particular time span. Two major approaches are commonly used in
software development, the "plan-driven" approach and the "Agile" approach; we follow the
Agile approach for our project, "Image to Story". The System Overview Diagram can help
you identify the building blocks of your system without yet having to determine whether
they are hardware, software, or mechanics, and it is drawn before splitting into hardware
and software block diagrams. UML is an acronym that stands for Unified Modeling
Language; it is one of the most popular business process modeling techniques. A use case
diagram can summarize the details of your system's users (also known as actors) and their
interactions with the system, and an effective use case diagram can help your team discuss
and represent the scenarios in which the system interacts with people, organizations, or
external systems, the goals that the system helps those actors achieve, and the scope of the
system. An activity diagram is a behavioral diagram. A collaboration diagram, also known
as a communication diagram, is an illustration of the relationships and interactions among
software objects in the Unified Modeling Language (UML). The system consists of an
admin panel and a user panel.