
IDENTIFICATION OF THE NATION’S TRIBAL DIALECTS
WITH MEL-FREQUENCY CEPSTRAL COEFFICIENT
AND ZERO CROSSING RATE WITH DEEP NEURAL
NETWORK CLASSIFIER

Gesha Faithul Ajrin


1101180510

1st Advisor: Ir. Rita Magdalena, M.T.


2nd Advisor: Dr. Ir. Bambang Hidayat, DEA, IPM.
Outline

1. Introduction
2. Basic Theory
3. System Design
4. Result & Analysis
5. Conclusion

Introduction
Speech Recognition

• Voice signal identification


• Voice command

Speech Recognition Process



Goals

1. Understand how the MFCC and ZCR methods extract audio characteristics.
2. Understand how the Deep Neural Network determines whether a voice signal is a dialect of the Batak, Serawai, or Makassar tribe.
3. Build a simulation in Matlab that identifies tribal dialects, and determine the accuracy of the Deep Neural Network in recognizing the tribal group based on validation test parameters.

Basic Theory

MFCC ZCR

DNN

Three Classes:
1. Batak
2. Serawai
3. Makassar

MFCC
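
MFCC extraction is usually described as framing, windowing, FFT, mel filter bank, logarithm, and DCT. The slide does not state which variant was used, so the sketch below only illustrates the mel-scale step, assuming the common 2595 * log10(1 + f/700) formula. The function names are hypothetical and are not taken from the study's Matlab code.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (common 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Center frequencies (Hz) of n_filters filters spaced evenly in mels."""
    mel_min = hz_to_mel(f_min_hz)
    mel_max = hz_to_mel(f_max_hz)
    # n_filters centers need n_filters + 1 equal steps between the two edges.
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]
```

Because the spacing is even in mels, the centers are denser at low frequencies and sparser at high frequencies, which is the property that makes the mel filter bank approximate human pitch perception.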

ZCR
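
The zero crossing rate counts how often the signal changes sign within a frame. A minimal sketch follows, assuming the rate is normalized by the number of adjacent sample pairs and that a sample equal to zero is treated as non-negative. The slide does not specify these conventions, and the helper names are hypothetical.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    A sample equal to zero is treated as non-negative (an assumption).
    """
    if len(frame) < 2:
        return 0.0
    crossings = 0
    for prev, cur in zip(frame, frame[1:]):
        if (prev >= 0) != (cur >= 0):
            crossings += 1
    return crossings / (len(frame) - 1)


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def zcr_features(samples, frame_len, hop):
    """One ZCR value per frame, usable as a feature vector for a classifier."""
    return [zero_crossing_rate(f) for f in frame_signal(samples, frame_len, hop)]
```

Each utterance then yields one ZCR value per frame, a far smaller feature than the MFCC vector.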

DNN

System Design

Result & Analysis

Validation Test DNN

Testing   Hidden   Hidden   L2Weight          L2Weight          Sparsity          Sparsity          Epoch   Validation
          Size 1   Size 2   Regularization 1  Regularization 2  Regularization 1  Regularization 2
First     100      100      0.1               0.1               1                 1                 100     67.5%
Second    150      150      0.1               0.1               1                 1                 100     74.7%
Third     200      200      0.1               0.1               2                 2                 150     76%
Fourth    250      250      0.1               0.1               2                 2                 200     81.7%
Fifth     300      300      0.1               0.1               2                 2                 100     85.7%
Sixth     300      300      0.01              0.01              5                 5                 200     91.3%
Seventh   300      300      0.01              0.01              4                 4                 100     87%
Eighth    300      300      0.001             0.001             3                 3                 150     97.7%
Ninth     300      300      0.0001            0.0001            2                 2                 100     100%
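
The parameter names in this table match the options of a sparse autoencoder layer. The slides do not state the training objective, so the following is a common form of that objective, given as an assumption, to show where the two regularization coefficients would enter. Here \lambda stands for L2WeightRegularization, \beta for SparsityRegularization, \rho is a target average activation, and \hat{\rho}_i is the observed average activation of hidden unit i.

```latex
E = \frac{1}{N}\sum_{n=1}^{N}\sum_{k=1}^{K}\left(x_{kn}-\hat{x}_{kn}\right)^{2}
    + \lambda\,\Omega_{\text{weights}}
    + \beta\,\Omega_{\text{sparsity}}

\Omega_{\text{weights}} = \frac{1}{2}\sum_{l}\sum_{j}\sum_{i}\left(w_{ji}^{(l)}\right)^{2}

\Omega_{\text{sparsity}} = \sum_{i}\left[\rho\log\frac{\rho}{\hat{\rho}_{i}}
    + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_{i}}\right]
```

Under this reading, lowering \lambda from 0.1 to 0.0001 weakens the penalty on large weights, which is consistent with validation accuracy rising across the tests in the table.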

Accuracy Test MFCC+DNN

Parameter                          Value
Hidden size 1 & 2                  300
L2WeightRegularization H1 & H2     0.0001
SparsityRegularization H1 & H2     0.0001
Epoch                              100

[Figures: Sentence Accuracy Test 1, Sentence Accuracy Test 2]

1=Batak ; 2=Serawai ; 3=Makassar
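
With the class coding 1=Batak, 2=Serawai, 3=Makassar, the accuracy of a test can be read off a confusion matrix. The sketch below assumes accuracy means correct predictions divided by total predictions. The function names are hypothetical and the code is not taken from the study.

```python
CLASS_NAMES = {1: "Batak", 2: "Serawai", 3: "Makassar"}


def confusion_matrix(true_labels, predicted_labels, classes=(1, 2, 3)):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

For example, with made-up labels `true = [1, 1, 2, 2, 3, 3]` and `pred = [1, 2, 2, 2, 3, 1]`, four of six samples lie on the diagonal, so `accuracy(confusion_matrix(true, pred))` is 4/6.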



Parameter                          Value
Hidden size 1 & 2                  300
L2WeightRegularization H1 & H2     0.0001
SparsityRegularization H1 & H2     0.0001
Epoch                              100

[Figures: Sentence Accuracy Test 3, Sentence Accuracy Test 4]

Sentence Accuracy
Test 5

Parameter Value

Hidden size 1 & 2 300

L2WeightRegularization H1 & H2 0.0001

SparsityRegularization H1 & H2 0.0001

Epoch 100

Accuracy Test ZCR+DNN

Parameter                          Value
Hidden size 1 & 2                  300
L2WeightRegularization H1 & H2     0.0001
SparsityRegularization H1 & H2     0.0001
Epoch                              100

[Figures: Sentence Accuracy Test 1, Sentence Accuracy Test 2]

Parameter                          Value
Hidden size 1 & 2                  300
L2WeightRegularization H1 & H2     0.0001
SparsityRegularization H1 & H2     0.0001
Epoch                              100

[Figures: Sentence Accuracy Test 3, Sentence Accuracy Test 4]

Sentence Accuracy
Test 5

Parameter Value

Hidden size 1 & 2 300

L2WeightRegularization H1 & H2 0.0001

SparsityRegularization H1 & H2 0.0001

Epoch 100

Conclusion

• The Deep Neural Network (DNN) classifier built in this study can
classify the dialects of the Batak, Serawai, and Makassar tribes.
• Mel-Frequency Cepstral Coefficient (MFCC) features give a higher
accuracy rate than Zero Crossing Rate (ZCR) features: the best
accuracy is 86% with MFCC and 80% with ZCR.
• These results were obtained with 300 training data and 75 testing
data, using Hidden size 300, L2 Weight Regularization 0.0001,
Sparsity Regularization 2, and Epoch 100.

Thank You
