
LE HOAI DUY

010-4289-9823 ⋄ Seoul, South Korea


hoaiduy1396@gmail.com ⋄ https://www.linkedin.com/in/hoaiduy1396/ ⋄ https://github.com/HoaiDuyLe

ABOUT ME
Artificial intelligence engineer with 4+ years of experience in building machine learning/deep learning models and developing software. Fluent in Python, with the ability to apply machine learning/deep learning
techniques to solve real-world business problems. My services cover ML Engineering, DL Engineering, Computer
Vision, Natural Language Processing, and Data Science.

KEY SKILLS
• Apply machine learning and deep learning in Natural Language Processing tasks including Machine Reading
Comprehension, Question Answering System, Text Classification, and Sentiment Analysis.
• Apply machine learning and deep learning to solve Computer Vision problems such as Image Classification,
Object Detection, and Semantic Segmentation.
• Apply machine learning and deep learning in Multimodal Learning, including Video Classification, Text-to-
image Generation, Text-image Retrieval, and Text-video Retrieval.
• Efficiently search, read, and understand Machine Learning/Deep Learning papers in diverse fields.
• Proficient in implementing models in Python across various deep learning frameworks (TensorFlow, Keras, and
PyTorch). Able to understand open-source code and quickly adapt it to new domains or related tasks.

TECHNICAL SKILLS

Programming: Proficient in Python; familiar with C++ and C#.
Libraries: Proficient in TensorFlow, Keras, PyTorch, and HuggingFace.
Soft Skills: Quick learning, problem-solving, critical thinking, teamwork, and project management.
Tools: Jira, Notion, Git, AWS EC2
Language: English - TOEIC 830 (tested in Jan 2021)

SELECTED PROJECTS

I. Fine-tune BERT with a dual learning network for a Question Answering System. (AI research and modeling)
Written in Python using TensorFlow. Duration: Oct 2018 - May 2019 (8 months).
• Overview: fine-tuning BERT with a proposed dual learning paradigm on the question-answering task to replace
a previous model that uses traditional NLP pipelines.
• Input: a paragraph (context or corpus) and related questions (both as strings).
• Output: answer sentences (as string) extracted from the corpus.
• Method:
1. Use pre-trained BERT to extract word embeddings of input Questions, Answers, and Contexts.
2. Capture sequential information and encode sentence embeddings or paragraph embeddings using Bi-LSTM.
3. Leverage dot-product attention mechanism to extract contextual correlation between question, answer, and
corpus.
4. Apply FC layers to predict pointers (start and end positions) of the answers and questions; the network is
trained in a dual-task learning scheme (a condensed sketch follows below).
• Result: The proposed model outperforms BERT on the SQuAD dataset. The model was embedded in the
Pepper robot, and it is the first AI project to win a purchase order (PO) for the company.
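The snippet below is a minimal, illustrative sketch of this architecture, not the production code: the BERT token embeddings are assumed to be precomputed, the layer sizes and names are placeholders, and the dual task (recovering the question span from the answer) is omitted for brevity.

```python
# Illustrative sketch only: BERT embeddings are assumed precomputed; sizes and
# names are placeholders, and only the answer-extraction head is shown.
import tensorflow as tf
from tensorflow.keras import layers

HIDDEN = 768   # BERT-base hidden size
UNITS = 256    # Bi-LSTM units (illustrative)

# Step 1: pre-extracted BERT token embeddings for the context and the question.
context_emb = layers.Input(shape=(None, HIDDEN), name="context_bert_emb")
question_emb = layers.Input(shape=(None, HIDDEN), name="question_bert_emb")

# Step 2: a shared Bi-LSTM captures sequential information in each sequence.
bilstm = layers.Bidirectional(layers.LSTM(UNITS, return_sequences=True))
context_enc = bilstm(context_emb)     # (batch, ctx_len, 2*UNITS)
question_enc = bilstm(question_emb)   # (batch, q_len, 2*UNITS)

# Step 3: dot-product attention lets each context token attend to the question.
attended = layers.Attention()([context_enc, question_enc])
features = layers.Concatenate()([context_enc, attended])

# Step 4: FC layers predict start/end pointers of the answer span in the context.
hidden = layers.Dense(UNITS, activation="relu")(features)
start_logits = layers.Dense(1)(hidden)            # (batch, ctx_len, 1)
end_logits = layers.Dense(1)(hidden)              # (batch, ctx_len, 1)
squeeze = layers.Lambda(lambda t: tf.squeeze(t, axis=-1))
model = tf.keras.Model([context_emb, question_emb],
                       [squeeze(start_logits), squeeze(end_logits)])
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
```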
II. Object Detection system for Fault Detection on Wafer. (AI modeling and software development)
Written in Python using TensorFlow and C# with WPF. Duration: July 2020 - Feb 2021 (8 months). Onsite in
Singapore.
• Overview: built a detection model using YOLOv4 to detect and recognize defects on the semiconductor
wafer surface. Carried out an end-to-end project from data pre-processing and dataset construction to model
building and deployment. Within 3 months, learned C# and WPF to develop a .NET desktop application for
deploying the model in production.
• Input: images with or without defects.
• Output: defective or non-defective classification; if defective, predicted bounding boxes of the defects.
• Method: modify and fine-tune YOLOv4 on custom datasets. Apply image processing operations (dilation,
erosion, contour finding) as a post-processing step to enhance model performance (a sketch of this step follows below).
• Results: The model achieves better performance than the customer's traditional image processing method.
The project subsequently received a purchase order (PO).
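The following is a small illustrative sketch of the post-processing step only, assuming a binary defect map as input; the function name, kernel size, and area threshold are placeholders, and the YOLOv4 fine-tuning itself (a standard detection pipeline) is omitted.

```python
# Illustrative post-processing sketch: clean a binary defect map with
# dilation/erosion and return bounding boxes of the remaining defect blobs.
import cv2
import numpy as np

def refine_defect_regions(defect_map: np.ndarray, min_area: float = 25.0):
    """defect_map: uint8 array (H, W) with 255 where the model flags a defect.
    min_area: blobs smaller than this are treated as noise (placeholder value)."""
    kernel = np.ones((3, 3), np.uint8)
    # Dilation followed by erosion (a morphological closing) fills small gaps.
    cleaned = cv2.dilate(defect_map, kernel, iterations=1)
    cleaned = cv2.erode(cleaned, kernel, iterations=1)

    # Keep only sufficiently large contours and report their bounding boxes.
    contours, _ = cv2.findContours(cleaned, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]
    return boxes  # list of (x, y, w, h); an empty list means "non-defective"
```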

III. Semantic Segmentation for Defect Inspection on Chip Surfaces. (AI modeling and software development)
Written in Python using TensorFlow. Duration: June 2019 - May 2020 (11 months). Onsite in Taiwan.
• Overview: within 3 months, moved from NLP to deep learning for computer vision and built an end-to-end
AI-based pipeline (data processing, dataset construction, model building & training, and model inference)
using Mask R-CNN to segment and classify defects on chip surfaces.
• Input: images with or without defects.
• Output: defective or non-defective classification results; if defective, predicted binary masks of the defects.
• Method: fine-tune Mask R-CNN on custom datasets. Little modification of the original Mask R-CNN
architecture; instead, a creative annotation approach was proposed to obtain strong performance (false positive
ratio < 1% and false negative ratio < 3%). A sketch of the output post-processing follows below.
• Results: The model markedly reduces false negatives with an acceptable false positive ratio (< 1%).
The model was sold to KLA Inc. and ASE Group, and it is the first AI project to generate revenue for the company.
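Below is a minimal sketch, with assumed names and a placeholder score threshold, of how per-instance Mask R-CNN outputs can be reduced to the image-level defective/non-defective decision and one binary defect mask; it is illustrative rather than the deployed code.

```python
# Illustrative sketch: turn per-instance Mask R-CNN outputs into an image-level
# decision plus a combined binary defect mask. Threshold values are placeholders.
import numpy as np

def defect_decision(scores: np.ndarray, masks: np.ndarray, score_thresh: float = 0.5):
    """scores: (N,) instance confidences; masks: (N, H, W) soft masks in [0, 1]."""
    keep = scores >= score_thresh
    if not np.any(keep):
        # No confident instance: report the image as non-defective.
        return False, np.zeros(masks.shape[1:], dtype=np.uint8)
    # Union of the kept instance masks gives one binary defect mask per image.
    combined = (masks[keep] >= 0.5).any(axis=0).astype(np.uint8)
    return True, combined
```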

IV. Transformer-based Video Emotion Recognition. (research project)


Written in Python using PyTorch. Duration: Feb 2022 - Dec 2022 (10 months).
• Overview: built a video emotion recognition model that uses a transformer encoder to fuse information
from subtitles, frames, and audio waveforms, and a transformer decoder to learn label-level representations.
• Input: short videos with transcripts, video frames, and audio signals.
• Output: multi-label emotion categories (happy, sad, surprise, ...).
• Method (condensed in the sketch below):
1. Backbone: VGG19 for video frames and audio Mel spectrogram chunks, pre-trained ALBERT for text.
2. Fusion: multi-layer transformer encoders to fuse multimodal features at the token level.
3. Label-level embedding network: multi-layer transformer decoders to learn label-level representation
from fused features.
• Results: significant improvement over strong baseline methods, with gains of +0.2% accuracy and +3.8%
F1-score on IEMOCAP, and +2.0% weighted accuracy and +0.6% F1-score on CMU-MOSEI. Submitted a
research paper to the IEEE Access journal.
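The sketch below condenses the method above into a small PyTorch module; the dimensions, layer counts, and per-label classifier head are illustrative assumptions, and the VGG19/ALBERT backbones are assumed to have already produced the modality token features. Training would pair the per-label logits with a multi-label binary cross-entropy loss.

```python
# Illustrative sketch: transformer encoder fuses modality tokens, transformer
# decoder with learned label queries builds label-level representations.
import torch
import torch.nn as nn

class MultimodalLabelTransformer(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_layers=2, n_labels=6):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.fusion_encoder = nn.TransformerEncoder(enc_layer, n_layers)
        self.label_decoder = nn.TransformerDecoder(dec_layer, n_layers)
        # One learned query per emotion label (label-level embedding network).
        self.label_queries = nn.Parameter(torch.randn(n_labels, d_model))
        self.classifier = nn.Linear(d_model, 1)  # one logit per label query

    def forward(self, text_tok, frame_tok, audio_tok):
        # Token-level fusion: concatenate modality tokens and self-attend over all.
        tokens = torch.cat([text_tok, frame_tok, audio_tok], dim=1)  # (B, T, d)
        fused = self.fusion_encoder(tokens)
        # Label queries cross-attend to the fused tokens to form label representations.
        queries = self.label_queries.unsqueeze(0).expand(tokens.size(0), -1, -1)
        label_repr = self.label_decoder(queries, fused)               # (B, L, d)
        return self.classifier(label_repr).squeeze(-1)                # (B, L) logits
```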
