Multimedia and Games: Project Proposal
Project proposal
Thanh-Hai Tran
Outline
Context and Motivation
Existing approaches
Proposed solutions
Topics
Schedule
Context and Motivation
Context:
In Vietnam, there are about 2.6 million deaf people
The number of people providing translation services is very limited
(about 20)
Deaf people have difficulty accessing information (medical,
health, security, education, etc.)
Motivation: building an automatic system that helps
Vietnamese deaf people understand information.
Ideal system: speech2sign = speech2text + text2sign
In this work: text2sign
An example
Existing approach using computer graphics
Existing approach
Existing approach
Existing approach using generative model
Text2Sign: Towards Sign Language Production Using Neural Machine Translation and Generative Adversarial Networks,
January 2020, International Journal of Computer Vision
Generated results
Text2Sign: Towards Sign Language Production Using Neural Machine Translation and Generative Adversarial Networks,
January 2020, International Journal of Computer Vision
Our proposed solution
T1: Overview of VSL
Main tasks:
Survey research on Vietnamese Sign Language (VSL)
Main characteristics of VSL
Common vocabulary of VSL
Resources for VSL (video/text)
List and contact people who provide translation services, for data
collection
Expected results:
Report on the topics above
All resources organized in a shared drive (e.g. Google Drive)
Required skills:
Reading research papers, summarizing, and reporting
Good communication and organization
T2: Related works
Main tasks:
Study existing work on sign language production, covering two main
approaches: computer graphics based and computer vision based
Methods, datasets
Obtained results
Code (if accessible)
Share your knowledge with other groups
Expected results:
Report on related work
All resources (references, links, libraries, datasets) organized in a
shared drive (e.g. Google Drive)
Required skills:
Reading research papers in English, summarizing, and reporting
Good background in mathematics and probability
Some knowledge of image processing and computer vision
T3: Building a dictionary for VSL
Main tasks:
Study existing datasets for American, Japanese, and German sign languages,
along with their data collection methods and organization
Collect a set of phrases and words for VSL in specific domains (one
subset per domain, for example: coronavirus outbreak, healthcare, daily news,
weather) from different sources on the Internet
Organize and encode the data according to the format used in an existing
dataset
Work with a human sign language translator to shorten the phrases (together
with Group 1)
Expected results:
Four sets of common phrases, one per domain: coronavirus outbreak, healthcare,
daily news, weather
These sets are organized and encoded according to a pre-defined format
Required skills:
Reading research papers in English, summarizing, and reporting
Some knowledge of natural language processing
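As an illustration of what a "pre-defined format" could look like, the sketch below stores each dictionary entry as one JSON record per line. The field names (entry_id, domain, text, gloss, video), the domain labels, and the sample phrase are hypothetical and are not taken from any existing dataset; the actual format should follow whichever dataset the group chooses to imitate.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical domain labels matching the four subsets named in this task.
DOMAINS = {"coronavirus", "healthcare", "daily_news", "weather"}


@dataclass
class PhraseEntry:
    entry_id: str     # unique identifier within the dictionary
    domain: str       # one of DOMAINS
    text: str         # original written phrase
    gloss: str = ""   # shortened phrase, filled in later with a translator
    video: str = ""   # relative path to a recording, if one exists

    def __post_init__(self):
        if self.domain not in DOMAINS:
            raise ValueError(f"unknown domain: {self.domain}")


def dump_entries(entries):
    """Serialize entries as one JSON object per line (JSON Lines)."""
    return "\n".join(json.dumps(asdict(e), ensure_ascii=False) for e in entries)


def load_entries(text):
    """Parse the output of dump_entries back into PhraseEntry objects."""
    return [PhraseEntry(**json.loads(line))
            for line in text.splitlines() if line.strip()]
```

Writing with ensure_ascii=False keeps Vietnamese diacritics readable in the file, and the domain check rejects entries that do not belong to one of the four subsets.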
T4: Shorten phrase (text2gloss)
Main tasks:
Study existing methods for automatically shortening textual phrases
(text2gloss)
Select one method (e.g. open source with good performance) and implement it
(preferably in Python)
Evaluate the results against the ground truth (produced by a human
translator).
Expected results:
A module: input is a textual phrase, output is a shortened phrase
Quantitative evaluation and report
Required skills:
Good implementation skills in Python
Some knowledge of natural language processing
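To make the module interface and the quantitative evaluation concrete, here is a minimal sketch, not a recommended method. It assumes a naive baseline that drops stop words and uppercases the rest, and it scores the output against a translator's reference with a token-level word error rate. The stop list is an English placeholder chosen for readability; a real system needs Vietnamese resources or a learned model, and the choice of metric is also an assumption.

```python
# Illustrative stop list only; a real system needs a Vietnamese list or a model.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of"}


def text2gloss(phrase):
    """Naive baseline: drop stop words and uppercase the remaining tokens."""
    tokens = phrase.lower().split()
    return [t.upper() for t in tokens if t not in STOP_WORDS]


def word_error_rate(reference, hypothesis):
    """Token-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)
```

For example, text2gloss("The weather is cold") returns ["WEATHER", "COLD"], and comparing that against an identical reference gives a word error rate of 0.0.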
T5: Video dataset collection
Main tasks:
Study existing datasets of Vietnamese Sign Language
Invite people who know VSL to perform the collected subsets.
Expected results:
Video
Annotated video (video with corresponding text, annotated with the ELAN tool)
Quantitative evaluation and report
Required skills:
Knowing how to set up and synchronize cameras
Some programming experience, OpenCV
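ELAN saves annotations as XML (.eaf files), so the annotated videos can be turned into (start, end, text) segments with a short script. The sketch below assumes time-aligned annotations only and reads them from an inline sample; the tier name "gloss" and the annotation values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Minimal hand-written sample in the EAF layout; values are illustrative.
SAMPLE_EAF = """
<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1200"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="2500"/>
  </TIME_ORDER>
  <TIER TIER_ID="gloss">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>HELLO</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
          TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>DOCTOR</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_aligned_annotations(eaf_xml, tier_id):
    """Return sorted (start_ms, end_ms, text) tuples for one tier."""
    root = ET.fromstring(eaf_xml)
    # Map each time slot id to its value in milliseconds.
    slots = {s.get("TIME_SLOT_ID"): int(s.get("TIME_VALUE"))
             for s in root.iter("TIME_SLOT")
             if s.get("TIME_VALUE") is not None}
    segments = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            text = ann.findtext("ANNOTATION_VALUE", default="")
            segments.append((start, end, text))
    return sorted(segments)
```

Calling read_aligned_annotations(SAMPLE_EAF, "gloss") yields the two segments with their start and end times in milliseconds, which can then be used to cut the video with OpenCV.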
An example
T6: Biometric face recognition
Main tasks:
Study existing methods for biometric face recognition
Select and implement a method
Evaluate on a dataset
Expected results:
A module of biometric face recognition
Quantitative evaluation and report
Required skills:
Programming in Python, OpenCV
Some knowledge of image processing, computer vision, and deep learning
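One common way to evaluate a face recognition module is rank-1 identification accuracy: each probe face is assigned the identity of its most similar gallery face. The sketch below assumes that some model has already turned each face image into an embedding vector; the model itself, the cosine similarity measure, and all names used here are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("zero-length embedding")
    return dot / norm


def identify(probe, gallery):
    """Return the gallery identity whose embedding is most similar to probe.

    gallery maps an identity label to one reference embedding.
    """
    return max(gallery, key=lambda name: cosine_similarity(probe, gallery[name]))


def rank1_accuracy(probes, gallery):
    """Fraction of (label, embedding) probes identified correctly."""
    correct = sum(1 for label, emb in probes if identify(emb, gallery) == label)
    return correct / len(probes)
```

The embeddings would in practice come from the method selected in this task; toy two-dimensional vectors are enough to check the evaluation code itself.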
An example
T7: Human pose estimation
Main tasks:
Study existing methods for human pose estimation
Use the OpenPose library to estimate human pose
Evaluate the results against the ground truth.
Expected results:
A module: input is a video (sequence of frames), output is a video of poses
Quantitative evaluation and report
Required skills:
Good implementation skills in Python
Some knowledge of computer vision and deep learning
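OpenPose can write one JSON file per frame, in which each detected person has a flat pose_keypoints_2d list of x, y, confidence values. The sketch below reads such a frame and scores it with a simplified PCK (percentage of correct keypoints). Using a fixed pixel threshold rather than one scaled by body size, and evaluating only the first detected person, are simplifying assumptions.

```python
import json
import math


def parse_openpose_frame(json_text):
    """Return the first person's keypoints as (x, y, confidence) triples."""
    people = json.loads(json_text).get("people", [])
    if not people:
        return []
    flat = people[0]["pose_keypoints_2d"]
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def pck(predicted, ground_truth, threshold):
    """Fraction of keypoints within `threshold` pixels of the ground truth.

    predicted: list of (x, y, confidence); ground_truth: list of (x, y).
    """
    if not ground_truth or len(predicted) != len(ground_truth):
        raise ValueError("keypoint lists must be non-empty and equal in length")
    hits = sum(1 for (px, py, _), (gx, gy) in zip(predicted, ground_truth)
               if math.hypot(px - gx, py - gy) <= threshold)
    return hits / len(ground_truth)
```

Averaging this score over all annotated frames gives a single number for the quantitative evaluation.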
An example
T8: Create a 3D human model with a
skeleton using Blender
Main tasks:
Using Blender, create a 3D human model (MakeHuman)
Program the human model to move according to the pre-defined skeleton
determined in the previous task
Expected results:
A 3D model
A module to move a human skeleton
Quantitative evaluation and report
Required skills:
Good skills in computer graphics and Blender
T9: Create animation
T10: Gloss2Pose
Main tasks:
Study existing methods
Select an open-source method and implement it
Test on a dataset
Expected results:
Module in Python
Quantitative evaluation and report
Required skills:
Good skills in computer vision
T11: Pose2Video
Main tasks:
Study existing methods
Select an open-source method and implement it
Test on a dataset
Expected results:
Module in Python
Quantitative evaluation and report
Required skills:
Good skills in computer vision
An example
T12: text2action
Main tasks:
Study existing methods
Select an open-source method and implement it
Test on a dataset
Expected results:
Module in Python
Quantitative evaluation and report
Required skills:
Good skills in computer vision
An example
Hope you enjoy the project and succeed!