
Implementation of Machine Learning and Deep Learning
Achmad Benny Mutiara
Dean of the Faculty of Computer Science and Information Technology, Universitas Gunadarma
Secretary General, APTIKOM
2020
The Relationship Between DS, BD, AI, ML, and DL Today

Source: adapted from Ian Goodfellow et al., 2016, and Matthew Mayo, 2016
Machine learning techniques

Machine learning has three main types of learning techniques:
• Supervised learning
• Unsupervised learning
• Reinforcement learning
Machine learning task categories

1. Classification
2. Regression
3. Clustering
4. Anomaly detection
5. Association
6. Recommendation
7. Dimensionality reduction
8. Computer Vision
9. Text Analytics
• Book links
• Big Data and Data Science:
https://drive.google.com/drive/folders/18jbNHjUWsRor8W64oNDxggOd_yWqMzHs?usp=sharing
The Machine Learning Process
Implementation Tool: MATLAB
• MATLAB https://www.mathworks.com/products/matlab.html
• Commercial; latest version R2020a
• Available toolboxes: AI, Data Science, and Statistics
• Statistics and Machine Learning Toolbox
• Deep Learning Toolbox
• Reinforcement Learning Toolbox
• Text Analytics Toolbox
• Predictive Maintenance Toolbox

• MATLAB book links:
https://drive.google.com/drive/folders/1qHLqc2kYrI7REC2UClijIZhrzICmm8AF?usp=sharing
• Deep Learning with MATLAB book links:
https://drive.google.com/drive/folders/1QuU9tAMPF-XPwM4WmSBRiSYQoj8aA9Wg?usp=sharing
Implementation Tool: RapidMiner
• RapidMiner https://rapidminer.com/
• A data science software platform developed by the company of the same name. It provides an
integrated environment for data preparation, machine learning, deep learning, text mining, and
predictive analytics.
• Used for business and commercial applications as well as research, education, training, rapid
prototyping, and application development. It supports every step of the machine learning
process, including data preparation, results visualization, model validation, and
optimization.
• RapidMiner is developed on an open core model. RapidMiner Studio Free Edition,
limited to 1 logical processor and 10,000 data rows, is available under the AGPL license.
RapidMiner Studio 9.7 (https://my.rapidminer.com/nexus/account/index.html#downloads)
Commercial pricing starts at $2,500 and is available from the developer.
• RapidMiner book links:
https://drive.google.com/drive/folders/1ln2R4ryr2qj_Iwbk-ZZT_T9wTyvpuhaN?usp=sharing
Implementation Tool: RStudio
Why Use the R Language?
• R is a free, open-source software environment and programming language developed in
1995 at the University of Auckland for statistical
computing and graphics (Ihaka and Gentleman, 1996).
• Since then R has become one of the dominant software environments for
data analysis and is used by a variety of scientific disciplines, including soil
science, ecology, and geoinformatics (Environmetrics CRAN Task View; Spatial
CRAN Task View).
• R is particularly popular for its graphical capabilities, but it is also prized for
its GIS capabilities, which make it relatively easy to generate raster-based
models.
• More recently, R has also gained several packages which are designed
specifically for analyzing soil data.
Implementation Tools: Python, Jupyter, Anaconda
• Python
• Version 3.8.x
• Available IDE: Spyder https://www.spyder-ide.org/
• Interactive tool: Jupyter (Project Jupyter exists to develop open-source software,
open standards, and services for interactive computing across dozens of programming
languages.) https://jupyter.org/
• Toolkit: Anaconda (the open-source Individual Edition (Distribution) is the easiest way
to perform Python/R data science and machine learning on a single machine.
Developed for solo practitioners, it is the toolkit that equips you to work with
thousands of open-source packages and libraries) https://www.anaconda.com/
• Google Colab: Colaboratory, or "Colab" for short, allows you to write and execute
Python in your browser, with
• Zero configuration required
• Free access to GPUs
• Easy sharing
• Whether you're a student, a data scientist or an AI researcher, Colab can make your work easier
https://colab.research.google.com/notebooks/intro.ipynb
Jupyter
Anaconda
Cheat-Sheet for AI

Download: https://drive.google.com/drive/folders/1o3t1Ceg_0fL6lfXaNtfq6MzNLgWKqLKH?usp=sharing
Library: scikit-learn

scikit-learn: https://scikit-learn.org/stable/


• Book links
• Big Data and Data Science:
https://drive.google.com/drive/folders/18jbNHjUWsRor8W64oNDxggOd_yWqMzHs?usp=sharing
• Deep Learning and Machine Learning:
https://drive.google.com/drive/folders/1hJ-E5OJhg35R7LC7_bHy99ccoY7CJ3nO?usp=sharing
• Python:
https://drive.google.com/drive/folders/1zqr5GPjQhP96XqKcWMxcmeWiMAAZ1iVx?usp=sharing
scikit-learn
• scikit-learn user guide, Mar 01, 2019:
https://drive.google.com/drive/folders/1rRsU6WdnPUlT3d9NcsTuk2f6N92PLZkc?usp=sharing
Example: scikit-learn, Classifier Comparison
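A classifier comparison of this kind can be sketched in a few lines. The version below is a minimal illustration, not the example shown on the slide: the synthetic `make_moons` dataset, the choice of five classifiers, and the helper name `compare_classifiers` are all assumptions made for the sketch.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(random_state=0):
    """Train several classifiers on one dataset and return test accuracy per model."""
    # Synthetic two-class data (an assumption; any labelled dataset would do).
    X, y = make_moons(n_samples=400, noise=0.3, random_state=random_state)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.4, random_state=random_state
    )
    classifiers = {
        "k-NN": KNeighborsClassifier(n_neighbors=5),
        "SVM (RBF)": SVC(kernel="rbf", gamma="scale"),
        "Decision tree": DecisionTreeClassifier(max_depth=5, random_state=random_state),
        "Random forest": RandomForestClassifier(n_estimators=50, random_state=random_state),
        "Naive Bayes": GaussianNB(),
    }
    scores = {}
    for name, clf in classifiers.items():
        clf.fit(X_train, y_train)
        # score() returns mean accuracy on the held-out test split.
        scores[name] = clf.score(X_test, y_test)
    return scores


if __name__ == "__main__":
    for name, accuracy in compare_classifiers().items():
        print(f"{name:>15}: {accuracy:.3f}")
```

Using the same train/test split for every model keeps the accuracies comparable; a fuller comparison would use cross-validation rather than a single split.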
Example: scikit-learn, Clustering Algorithm Comparison
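A clustering comparison can be sketched the same way. This is an illustrative example under stated assumptions, not the slide's figure: the three blob centers, the `eps` value for DBSCAN, and the helper name `compare_clusterers` are chosen for the sketch. Each algorithm is scored with the adjusted Rand index against the known blob labels.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score


def compare_clusterers(random_state=0):
    """Cluster one dataset with three algorithms and return the adjusted Rand index of each."""
    # Three well-separated blobs (assumed centers, chosen for illustration).
    X, y_true = make_blobs(
        n_samples=300,
        centers=[[-5, -5], [0, 5], [5, -5]],
        cluster_std=0.6,
        random_state=random_state,
    )
    clusterers = {
        "KMeans": KMeans(n_clusters=3, n_init=10, random_state=random_state),
        "Agglomerative": AgglomerativeClustering(n_clusters=3),
        # DBSCAN infers the number of clusters from density; eps is an assumption.
        "DBSCAN": DBSCAN(eps=1.0, min_samples=5),
    }
    scores = {}
    for name, model in clusterers.items():
        labels = model.fit_predict(X)
        # 1.0 means perfect agreement with the true grouping, about 0.0 means random.
        scores[name] = adjusted_rand_score(y_true, labels)
    return scores


if __name__ == "__main__":
    for name, ari in compare_clusterers().items():
        print(f"{name:>14}: ARI = {ari:.3f}")
```

On data this cleanly separated all three methods should agree closely; differences between the algorithms show up on non-convex or unevenly dense data.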
Example: scikit-learn, SVM Kernels
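The effect of the kernel choice can be illustrated with a small experiment. The sketch below assumes a synthetic concentric-circles dataset (`make_circles`) and a hypothetical helper `compare_kernels`; it is not the example shown on the slide. A linear kernel cannot separate the two rings, while a degree-2 polynomial or an RBF kernel can.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def compare_kernels(random_state=0):
    """Fit an SVM with three kernels on ring-shaped data and return test accuracy per kernel."""
    # Two concentric rings: not linearly separable in the original 2-D space.
    X, y = make_circles(n_samples=400, factor=0.3, noise=0.1, random_state=random_state)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state
    )
    scores = {}
    for kernel in ("linear", "poly", "rbf"):
        # degree only affects the polynomial kernel; the other kernels ignore it.
        clf = SVC(kernel=kernel, degree=2, gamma="scale")
        clf.fit(X_train, y_train)
        scores[kernel] = clf.score(X_test, y_test)
    return scores


if __name__ == "__main__":
    for kernel, accuracy in compare_kernels().items():
        print(f"{kernel:>6}: {accuracy:.3f}")
```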
Interactive Machine Learning Experiments

https://github.com/trekhleb/machine-learning-experiments
Google-Colab: TensorFlow

https://www.tensorflow.org/tutorials/quickstart/beginner
Related to COVID-19: Using OpenCV, Computer Vision, and Deep Learning for
Social Distancing

The steps to build a social distancing detector include:
1. Apply object detection to detect all
people (and only people) in a video
stream (see this tutorial on building
an OpenCV people counter)
2. Compute the pairwise distances
between all detected people
3. Based on these distances, check to
see if any two people are less than N
pixels apart
https://www.pyimagesearch.com/2020/06/01/opencv-social-distancing-detector/
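Steps 2 and 3 above can be sketched with NumPy alone. This is a minimal illustration, not the code from the linked tutorial: it assumes the object detector from step 1 has already produced one (x, y) centroid per person, and the function name `find_violations` and the parameter `min_distance` (the N in step 3) are hypothetical.

```python
import numpy as np


def find_violations(centroids, min_distance):
    """Return the indices of people closer than min_distance pixels to someone else."""
    points = np.asarray(centroids, dtype=float)
    violations = set()
    if len(points) < 2:
        return violations  # a pair needs at least two detections
    # Step 2: pairwise Euclidean distances via broadcasting, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    distances = np.sqrt((diff ** 2).sum(axis=2))
    # Step 3: keep each pair once (upper triangle, excluding the diagonal).
    too_close = np.triu(distances < min_distance, k=1)
    for i, j in zip(*np.where(too_close)):
        violations.update((int(i), int(j)))
    return violations


if __name__ == "__main__":
    people = [(0, 0), (30, 40), (200, 200)]  # hypothetical centroids in pixels
    print(find_violations(people, min_distance=60))
```

Distances measured in pixels depend on camera position and perspective, so the threshold has to be tuned per camera.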
Project Structure
• YOLO object detector files including the CNN
architecture definition, pre-trained weights, and class
names are housed in the yolo-coco/ directory. This YOLO
model is compatible with OpenCV’s DNN module.
• pyimagesearch module consists of:
• social_distancing_config.py : A Python file holding a
number of constants in one convenient place.
• detection.py : YOLO object detection with OpenCV
involves more lines of code than some easier
models.
• Social distance detector application logic resides in the
social_distance_detector.py script. This file is
responsible for looping over frames of a video stream
and ensuring that people are maintaining a healthy
distance from one another during a pandemic. It is
compatible with both video files and webcam streams.
• Our input video file is pedestrians.mp4 and comes from
TRIDE’s Test video for object detection. The output.avi
file contains the processed output file.
Results
Example of Deep Learning Research
DEEP LEARNING MODELING WITH
LONG SHORT-TERM MEMORY (LSTM) FOR RECOGNITION OF
SPOKEN DECIMAL DIGITS IN INDONESIAN

Ericks Rachmat Swedia, Achmad Benny Mutiara


SPEECH RECOGNITION
RESEARCH OBJECTIVES

• Develop a speaker-independent LSTM model for spoken Indonesian digits.
• Achieve higher recognition accuracy than previous research.
• Achieve a more optimal execution time.
DATABASE
Speaker independent

Training data:
• 799 students
• Ages 19-22
• 389 female, 410 male
• 7990 samples

Test data:
• 79 students
• Ages 19-22
• 23 female, 56 male
• 790 samples

Noise: air conditioner sound, doors opening and closing, other students coughing, keyboard
sounds, footsteps of other people walking, etc.

Digit   Indonesian   Pronunciation
0       Nol          e-nol
1       Satu         Sa-tu
2       Dua          du-a
3       Tiga         ti-ga
4       Empat        em-pat
5       Lima         li-ma
6       Enam         e-nam
7       Tujuh        tu-juh
8       Delapan      de-la-pan
9       Sembilan     sem-bi-lan
Sample rate: 8000 Hz
According to Nyquist's theorem, the sampling rate must be at least twice the highest frequency in the signal.

Human speech frequency: 300 Hz to 3400 Hz.
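The choice of 8000 Hz can be checked against the Nyquist criterion with simple arithmetic. The sketch below uses the figures from this slide (3400 Hz upper speech frequency, 8000 Hz sample rate); the function names are hypothetical.

```python
def min_sampling_rate(max_frequency_hz):
    """Nyquist criterion: sample at no less than twice the highest frequency."""
    return 2.0 * max_frequency_hz


def satisfies_nyquist(sample_rate_hz, max_frequency_hz):
    """True if the sample rate is high enough to capture max_frequency_hz."""
    return sample_rate_hz >= min_sampling_rate(max_frequency_hz)


if __name__ == "__main__":
    required = min_sampling_rate(3400)  # 2 * 3400 Hz
    print(f"Required: {required:.0f} Hz, chosen: 8000 Hz, "
          f"sufficient: {satisfies_nyquist(8000, 3400)}")
```

Twice 3400 Hz is 6800 Hz, so 8000 Hz leaves some margin above the minimum.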


RESEARCH METHODOLOGY
DESIGNING THE LSTM MODEL

[Figure: LSTM model design. Labels: spectrogram / MFCC coefficients; "How many?"; number of digit classes.]
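For readers unfamiliar with spectrogram features, the sketch below shows one common way to compute a power spectrogram with NumPy: split the signal into overlapping frames, apply a window, and take the FFT of each frame. It is an illustration only. The frame length, hop length, and Hann window are assumptions, and the slides do not state how the study computed its features (the research was carried out in MATLAB).

```python
import numpy as np


def spectrogram(signal, sample_rate=8000, frame_len=256, hop_len=128):
    """Return (frequencies in Hz, power array of shape (n_frames, frame_len // 2 + 1))."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        # Zero-pad very short signals so that at least one frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    window = np.hanning(frame_len)  # tapers frame edges to reduce spectral leakage
    frames = np.stack([
        signal[i * hop_len : i * hop_len + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    return freqs, power


if __name__ == "__main__":
    t = np.arange(8000) / 8000.0          # one second at 8000 Hz
    tone = np.sin(2 * np.pi * 1000 * t)   # synthetic 1000 Hz test tone
    freqs, power = spectrogram(tone)
    print(power.shape, freqs[power.mean(axis=0).argmax()])
```

Each row of the result is one time step, which is the shape a sequence model such as an LSTM consumes.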
PROPOSED LSTM MODEL

• Spectrogram: 1 LSTM layer, 700 LSTM cells
• MFCC: 1 LSTM layer, 400 LSTM cells
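The data flow through a single-layer LSTM classifier can be sketched as a NumPy forward pass. This is an illustration under assumptions, not the authors' implementation (the study used MATLAB): the sizes of 64 inputs, 700 cells, and 10 outputs follow the model described in the conclusions, while the random untrained weights, the gate ordering, and the function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(n_in=64, n_hidden=700, n_out=10, seed=0):
    """Random, untrained weights; the four gates are stacked along the first axis."""
    rng = np.random.default_rng(seed)
    scale = 0.01
    return {
        "W": rng.normal(0.0, scale, (4 * n_hidden, n_in)),      # input weights
        "U": rng.normal(0.0, scale, (4 * n_hidden, n_hidden)),  # recurrent weights
        "b": np.zeros(4 * n_hidden),
        "W_fc": rng.normal(0.0, scale, (n_out, n_hidden)),      # fully connected layer
        "b_fc": np.zeros(n_out),
    }


def lstm_classify(sequence, params):
    """Run a (time, features) sequence through LSTM -> fully connected -> softmax."""
    n_hidden = params["U"].shape[1]
    h = np.zeros(n_hidden)  # hidden state
    c = np.zeros(n_hidden)  # cell state
    for x_t in sequence:
        z = params["W"] @ x_t + params["U"] @ h + params["b"]
        i, f, g, o = np.split(z, 4)  # input, forget, candidate, output
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g            # update the cell memory
        h = o * np.tanh(c)           # expose part of the memory as output
    logits = params["W_fc"] @ h + params["b_fc"]
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return exp / exp.sum()               # probabilities over the 10 digits


if __name__ == "__main__":
    params = init_params()
    frames = np.random.default_rng(1).normal(size=(20, 64))  # dummy feature frames
    print(lstm_classify(frames, params).round(3))
```

Only the last hidden state feeds the classifier here, so an utterance of any length maps to one probability vector over the ten digits. With untrained weights the output is close to uniform.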
RESEARCH RESULTS
[Figure: bar chart of recognition accuracy (%) with MFCC and spectrogram features for DTW, HMM, RNN, LSTM Nahid, LSTM Daneshvar, and LSTM Digit Indonesia. Data labels in the chart: 96.58, 96.33, 93.17, 92.28, 88.73, 88.98, 87.56, 81.01, 78.73, 75.69, 25.19, 10.13.]

                           DTW      HMM      RNN      LSTM
Accuracy (%)               88.73    88.98    81.01    96.58
Execution time (seconds)   588.40   133      7.82     6.0024
GPU SPEED-UP

Execution time (seconds)
Process          LSTM            LSTM on GPU    Speed-up
Training         121,712.3445    11,124.295     10.9411
Classification   6.0024          0.780          7.6953

https://www.geforce.com/hardware/notebook-gpus/geforce-gtx-970m/specifications
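The speed-up column is the ratio of the CPU execution time to the GPU execution time. The short sketch below recomputes it from the timings on this slide; the function name `speedup` is hypothetical.

```python
def speedup(baseline_seconds, accelerated_seconds):
    """Speed-up factor: how many times faster the accelerated run is."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated_seconds must be positive")
    return baseline_seconds / accelerated_seconds


if __name__ == "__main__":
    # Timings in seconds, taken from the slide.
    print(f"Training:       {speedup(121712.3445, 11124.295):.4f}x")
    print(f"Classification: {speedup(6.0024, 0.780):.4f}x")
```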
CONCLUSIONS
• The LSTM developed is able to recognize spoken Indonesian digits
with the following model:
• 64 inputs in the input layer, 700 LSTM cells, a fully connected layer of 10 units, 1 softmax layer,
and 10 outputs in the output layer.

• Accuracy is higher than that of other LSTM models.

• The execution time of LSTM recognition is faster.

• MATLAB programming with a GPU speeds up both training and classification.
SUGGESTIONS
• Extend the LSTM model to recognize all words
in Indonesian or in the regional languages of Indonesia.
• The speaker's emotional state could also be studied further to improve
recognition. The database should also be extended with different pronunciations
and dialects so that the system recognizes speech better.
• For security when speaking passwords or PINs, a system needs to be developed
that recognizes the characteristics of the speaker's voice, not only the spoken
digits, so research on speaker recognition
is much needed.
RESEARCH CONTRIBUTIONS
• A database of spoken Indonesian digits that can be used
for future research.
• 7990 training samples.
• 790 test samples.
• Pronunciation verification for passwords, ATMs, PINs, etc.
• A speech recognition system for spoken Indonesian digits (ASR).
THANK YOU
Education is the most powerful weapon
which you can use to change the world
(Nelson Mandela)
