
SIP PROJECT REPORT

Rishi Gupta, M.Tech AI, 19729

I. ABSTRACT

In this project we performed speech-to-text conversion using the deep learning model Wav2Vec2. Since the output of Wav2Vec2 alone is not very accurate, we implemented a language model over Wav2Vec2 using KenLM to improve accuracy. We used a medical domain dataset to train the language model.

II. TECHNICAL DETAILS

We used the deep learning model Wav2Vec2, which converts speech into text. An n-gram language model is also implemented over Wav2Vec2 with KenLM. The main advantage of Wav2Vec2 is that it is pretrained on unlabeled data, so only a few hours of labeled data are required for fine-tuning. A minimal transcription sketch follows below.
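As a concrete illustration of the pipeline, the following sketch loads a pretrained Wav2Vec2 checkpoint with transformers, reads an audio file with librosa, and decodes greedily, i.e. without the language model yet. The checkpoint name "facebook/wav2vec2-base-960h" and the path "sample.wav" are placeholders, not the report's fine-tuned model or recordings.

    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Placeholder checkpoint; the report fine-tunes its own model.
    processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
    model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

    # Wav2Vec2 expects 16 kHz mono audio.
    speech, _ = librosa.load("sample.wav", sr=16_000)
    inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

    with torch.no_grad():
        logits = model(inputs.input_values).logits  # (batch, frames, vocab)

    # Greedy CTC decoding: most likely token per frame, then collapse repeats.
    pred_ids = torch.argmax(logits, dim=-1)
    print(processor.batch_decode(pred_ids)[0])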
A. Training dataset

For the training data, we collected medical domain conversations between a doctor and a patient and recorded them with five of our friends; this data was used to fine-tune the pretrained Wav2Vec2 model.

B. Test Data
For testing, we used audio sentences recorded by the same set of friends.

C. Implementation stages
There are three stages of implementation:
1. Raw audio input is given to the Wav2Vec2 model, which converts it into a latent speech representation and then performs quantization.
2. Some of the quantized speech vectors are hidden before the sequence is given to the context (Transformer) network; this step is called masking, illustrated in the sketch after this list.
3. The context network predicts the vectors that were masked in the previous step and produces the output text; how accurately it does so determines the word error rate (WER).
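The masking in stage 2 can be pictured with a small, self-contained PyTorch sketch. The shapes, the one-in-two masking ratio, and the randomly initialized mask embedding are illustrative assumptions; the real model masks contiguous spans and learns the mask embedding during pretraining.

    import torch

    # Toy masking step: hide some latent speech frames before the
    # sequence is passed to the context (Transformer) network.
    batch, frames, dim = 1, 10, 8
    latents = torch.randn(batch, frames, dim)  # output of the feature encoder
    mask_embedding = torch.randn(dim)          # a learned vector in the real model

    mask = torch.rand(batch, frames) < 0.5     # frames to hide (toy ratio)
    masked_latents = latents.clone()
    masked_latents[mask] = mask_embedding      # hidden frames all share one vector

    # Pretraining asks the context network to pick out the true quantized
    # vectors at exactly these masked positions (a contrastive task).
    print(int(mask.sum()), "of", frames, "frames masked")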

D. Language model
We integrated a 5-gram language model over Wav2Vec2. To train the language model we used the Hugging Face medical dialog dataset with KenLM. This medical dialog data consists of millions of dialogues related to question answering between doctors and patients. A sketch of building the 5-gram model and attaching it to the decoder follows below.
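A minimal sketch of this step, under the assumption that the corpus has been exported to plain text: KenLM's lmplz tool estimates the ARPA file offline, and pyctcdecode pairs it with the acoustic model's vocabulary for beam-search decoding. The paths medical_dialog.txt and 5gram.arpa are hypothetical placeholders.

    # The 5-gram itself is estimated offline with KenLM's lmplz tool, e.g.:
    #   lmplz -o 5 < medical_dialog.txt > 5gram.arpa
    from pyctcdecode import build_ctcdecoder
    from transformers import Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")

    # pyctcdecode expects the labels in vocabulary-id order.
    vocab = processor.tokenizer.get_vocab()
    labels = [tok for tok, _ in sorted(vocab.items(), key=lambda kv: kv[1])]

    # Beam-search decoder that rescores CTC hypotheses with the 5-gram LM.
    decoder = build_ctcdecoder(labels=labels, kenlm_model_path="5gram.arpa")

    # logits: a (frames, vocab) numpy array from Wav2Vec2ForCTC, e.g.
    #   logits = model(inputs.input_values).logits[0].cpu().numpy()
    #   print(decoder.decode(logits))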

III. RESULTS

When we fed our test data to the Wav2Vec2 model, some words were mismatched with the original data. For example:
1. "Doctor I felt weakness in my body from several days."
When this sentence is passed through Wav2Vec2 without the language model, the output is:
"Doctor i affect to be ness in my body from several days."
The model alone can predict normal text but is not able to predict keywords of a specific domain.
When the same test data is passed through Wav2Vec2 with the language model, it decodes the sentence as:
"Doctor I felt weakness in my body from several days."
A small WER computation for this sentence pair is sketched after the tools list.

IV. TOOLS USED

1. Libraries such as pyctcdecode, transformers, and datasets.
2. From transformers we import Wav2Vec2Tokenizer and Wav2Vec2ForCTC.
3. Librosa for reading audio files.
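Stage 3 of the implementation names WER as the accuracy measure; the snippet below computes it for the sentence pair above. The jiwer library is not among the tools listed in this report; it is assumed here only as a convenient way to compute WER.

    import jiwer

    reference  = "doctor i felt weakness in my body from several days"
    without_lm = "doctor i affect to be ness in my body from several days"
    with_lm    = "doctor i felt weakness in my body from several days"

    # WER = (substitutions + deletions + insertions) / words in the reference
    print("WER without LM:", jiwer.wer(reference, without_lm))
    print("WER with LM:   ", jiwer.wer(reference, with_lm))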