
Big Data on Indonesian Research:

Challenges and Research Opportunities


Imam Much Ibnu Subroto, ST, M.Sc, Ph.D
BIODATA
Short CV: Imam Much Ibnu Subroto, ST, M.Sc, Ph.D
Affiliation: Lecturer, Universitas Islam Sultan Agung
Position: Lecturer, Department of Informatics Engineering; SINTA RISTEKBRIN expert team; IAES Founder; IEEE member
Areas of expertise: Artificial Intelligence, Data Mining, e-Learning
Born: Semarang, 13 March 1973

Founder, Science and Technology Index (SINTA) (sinta.ristekdikti.go.id)
Founder, Indonesian Publication Index (IPI) (portalgaruda.org)
Founder, Garba Rujukan Digital (GARUDA) (garuda.ristekdikti.go.id)
Founder, Repository Tugas Akhir Mahasiswa (RAMA) (rama.ristekdikti.go.id)
Founder, Indonesia Menulis (IDMENULIS) (idmenulis.ristekdikti.go.id)
Founder, JARLITBANGNOV Jateng, the Central Java researcher network (jarlitbangnov.bappeda.jatengprov.go.id)
Founder, e-SINAU Adaptive e-Learning Systems

EDUCATION
• ST (bachelor's degree), Electrical Engineering, Universitas Gadjah Mada (UGM), 1998
• M.Sc (master's degree), Computer Science, Universiti Teknologi Malaysia (UTM), 2007
• PhD (doctorate), Computer Science, Universiti Teknologi Malaysia (UTM), 2015

ORGANIZATION EXPERIENCE
• 2002-2005: Head of the Computer and Tele-education Unit (UPT Komputer dan Teledukasi), UNISSULA
• 2015-2017: Head of the Master of Electrical Engineering (MTE) Program, UNISSULA
• 2017-2019: Head of the Department of Informatics Engineering, UNISSULA

RESEARCH TOPICS
» Big Data on Research and Innovation
» Plagiarism Detection Using Multiple Classifiers
» Adaptive e-Learning Systems using Machine Learning
» Publication Performance Measurement based on Science and Technology Index (SINTA)

INTERNATIONAL JOURNAL MANAGEMENT
• Chief Editor, Journal of Telematics and Informatics (JTI)
• Managing Editor, International Journal of Artificial Intelligence (IJ-AI)
• Editorial Board, Journal of Information Technology and Communication (IJ-ICT)
• Editorial Board, Computer Engineering and Applications Journal (ComengApp)
• etc.

INTELLECTUAL PROPERTY RIGHT (IPR)
• Science and Technology Index (SINTA), copyright, RISTEKDIKTI 2018 (granted)
• E-SINAU Adaptive e-Learning System, copyright, 2018 (granted)

MASTERPIECE AWARD
• Third place, Outstanding Lecturer in Science, LLDIKTI (KOPERTIS) Wilayah IV Jawa Tengah, 2018
• GARUDA Developer Award, Director of Intellectual Property, RISTEKDIKTI, 2018
• First place, Outstanding Lecturer, Universitas Islam Sultan Agung, Quality Day, Nov 2017
• Intellectual Property Award, "Developer of SINTA Science and Technology Index", Directorate General of Research and Development, RISTEKDIKTI, HAKTEKNAS, 2017
• Best IT Innovation Award for Practical Used, International Conference on Research Innovation in Information System (ICRIIS 2009), Malaysia, 2009
• Bronze Medal, "MyCopyDetect: Plagiarism Detection Tool", INATEX Industrial Art and Technology Exhibition, Malaysia, 2009

PUBLICATION
• GS INDEXED: H-Index 9, Documents 72, Citations 634
• SCOPUS INDEXED: H-Index 7, Documents 25, Citations 209
Outline
• Data Science
  • Big Data, Data Mining, Machine Learning, etc.
• Research Data: Strategic interest
  • Indicator of national progress
  • Human resource potential
  • Experts & Expertise
  • Hidden knowledge
• Data Mining & Machine Learning Implementation & Research Opportunities
  • Classification, Estimation, Prediction, Clustering, Association
Data Science?
• Big Data
• Data Mining
• Machine Learning
• Data Visualization
• Models
• Research
• Business Intelligence
Popular Machine Learning Models
• Linear Regression
• Support Vector Machine
• Decision Tree
• K Nearest Neighbours
• K Means
• Naive Bayes
• ANN
• Data that is processed well and appropriately leads to the right decisions
Research Database: What?
• Research Material
• Research Progress
• Research Fund
• Publication
• Bibliography
• Citation
• Patent
• Researcher
• Affiliation
• Collaboration Network
• Research Trend
Research Information System
• Reading material
• Research execution
• Report
• Academic integrity
• Publication and intellectual property
• Research measurement
Research Data: Strategic interest
1. Research is an indicator of an advanced society/nation
• Research and innovation aim to improve people's lives
• Data and facts: advanced nations have a larger research track record and receive more citations
Source: Scimagojr.com, accessed 08-08-2022

International Rank
Asiatic Region Rank
Publication Trend: the rise of Indonesian publications?
Research Data: Strategic interest
2. Human resource potential and regional potential
3. Expertise of Indonesian researchers
4. Hidden knowledge in big data
Hidden knowledge can be extracted from large volumes of data in several ways:
• Data Visualization
• Data mining:
  • Classification
  • Estimation
  • Prediction
  • Clustering
  • Association
• Machine learning, as a learning-based computational method within data mining
Users of Indonesian Research Big Data
• Researchers: students and professionals
• Research institutions: universities, study centers, regional research agencies, government research agencies
• Government
• Industry
Big Data: Data Source
• SINTA
• GARUDA
• Google
• Scopus
• WoS
Potential of Indonesian Research Big Data
• Number of higher-education institutions: 4,500
• Number of lecturers: 280,000
• Number of students: 9 million
• Number of journals: 15,000 indexed in GARUDA
• National publications indexed in GARUDA: 2.2 M (with full text)
• Number of publications (SINTA): 4,661,160 articles
• Visitors of SINTA (September 2022): 2.1 M visitors
• *Search history (SINTA/GARUDA) => big data
  • Recommender system
  • Data analytics
  • Collaboration
SINTA Interoperability
Data Visualization
• Objective:
• Decision Making
• Insight
• Most data consists of numbers and letters, which on their own are not very meaningful for decision making
• Insight means looking at the data from a different point of view
International Research Network (IRN)
National Research Network
Examples of Data Mining on Indonesian Research Data
• Classification
  • Field Area: Journal & Publications
  • Expertise (fingerprint): researcher
• Prediction
  • Academic integrity: identifying dishonest researcher behavior
  • Field Area: development of scientific fields
  • Productivity: predicting the number of research outputs from historical data
• Estimation
  • Predict "Quality" of paper
• Association
  • Collaboration Network (National & International)
  • Citation Network
• Clustering
  • Clustering of higher-education institutions
• Recommender System
  • Related Work Recommender System
  • Citation Recommender System
How to Implement?
1. Classification
• Example applications:
  • Field Area: Journal & Publications
  • Expertise (fingerprint): researcher
  • Academic integrity: identifying dishonest researcher behavior
• Methods:
  • Preprocessing: NLP (Natural Language Processing)
  • Machine learning:
    • Artificial Neural Network (ANN)
    • K-Nearest Neighbours (KNN)
    • Support Vector Machine (SVM)
    • Naïve Bayes (NB)
    • Linear Regression
Steps for Field Area Classification (Python)
• Setup: Importing Libraries
• Loading the data set & Exploratory Data Analysis
• Text pre-processing
• Extracting vectors from text (Vectorization)
• Running ML algorithms
• Evaluation
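The steps above can be sketched end to end with scikit-learn. This is a minimal illustration, not the presenter's actual pipeline: the titles and labels are invented toy data, and TF-IDF with Naive Bayes is just one of the vectorizer and classifier combinations the slides allow.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Loading the data set: invented toy titles with a field-area label each.
train_titles = [
    "design of vibrating plates and ring supports",
    "optimization of sewer network design",
    "vibration analysis of steel plates",
    "flavonoids in plant rhizomes",
    "thermal isomerization of lutein in plants",
    "antioxidant flavonoids from medicinal plants",
]
train_labels = [
    "Engineering & Technology",
    "Engineering & Technology",
    "Engineering & Technology",
    "Life Sciences & Medicine",
    "Life Sciences & Medicine",
    "Life Sciences & Medicine",
]

# Held-out titles used only for evaluation (also invented).
test_titles = ["vibration of circular plates", "flavonoids from rhizomes"]
test_labels = ["Engineering & Technology", "Life Sciences & Medicine"]

# Text pre-processing and vectorization (lowercasing + TF-IDF), followed by
# a Naive Bayes classifier, chained into a single pipeline.
model = make_pipeline(TfidfVectorizer(lowercase=True), MultinomialNB())

# Running the ML algorithm.
model.fit(train_titles, train_labels)

# Evaluation on the held-out titles.
predicted = model.predict(test_titles)
accuracy = accuracy_score(test_labels, predicted)
print(list(predicted), accuracy)
```

On a real dataset the held-out set would come from a proper train/test split, and accuracy alone would usually be complemented by per-class precision and recall.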
NLP with Python
• Using the NLTK and Sastrawi libraries
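A typical pre-processing step lowercases a title, splits it into tokens, and removes stopwords. The sketch below uses only the standard library so that it runs anywhere; the `STOPWORDS` set and the `preprocess` helper are illustrative assumptions, and in practice the stopword list and stemming would come from NLTK (English) or Sastrawi (Indonesian).

```python
import re
from typing import List

# Tiny hand-written stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "in", "for", "and", "on", "to"}

def preprocess(title: str) -> List[str]:
    """Lowercase the title, keep alphanumeric tokens, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    return [token for token in tokens if token not in STOPWORDS]

tokens = preprocess(
    "Optimal design of internal ring supports for vibrating circular plates"
)
print(tokens)
```

The resulting token list is what a vectorizer would then turn into numeric features.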
Example field area dataset
ID | Title | Field Area
1238883 | A test for constant correlations in a multivariate GARCH model | Arts & Humanities
1296804 | Parametric excitations of linear systems having many degrees of freedom | Arts & Humanities
1443530 | Habitat Templets and the Changing Worldview of Ecology | Arts & Humanities
1504659 | The politics of space: changing discourses on Chinese burial grounds in post-war Singapore | Arts & Humanities
1559694 | Role of Temporal Integration and Fluctuation Detection in the Highly Irregular Firing of a Leaky Int | Arts & Humanities
62555 | Classifications and graph-based representations of switching functions using a novel complex spectral technique | Engineering & Technology
68914 | Poly(trimethylsilylcyclooctatetraene): A soluble conjugated polyacetylene via olefin metathesis | Engineering & Technology
75274 | Mechanism of Decomposition of Cuprous Cyanide. Infrared and Thermal Evidence | Engineering & Technology
79025 | Inclusion shaping and extremal property of the Taylor-Saffman bubble | Engineering & Technology
80544 | Optimal design of internal ring supports for vibrating circular plates | Engineering & Technology
8897 | Thermal Isomerization of All-trans-Lutein in a Benzene Solution | Life Sciences & Medicine
9099 | Synthesis, crystal and molecular structures of pyridine adducts of the zinc and cadmium bis-1,2,4-tr | Life Sciences & Medicine
11027 | Flavonoids in the black rhizomes of Boesenbergia panduta | Life Sciences & Medicine
17863 | Cobalt(II)-Catalyzed Reaction of Aldehydes with Acetic Anhydride under an Oxygen Atmosphere: Scope a | Life Sciences & Medicine
7350 | Catalytic Ring Closing Metathesis of Dienynes: Construction of Fused Bicyclic Rings | Natural Sciences
8368 | Spontaneous formation of complex and ordered structures on oxygen-plasma-treated elastomeric polydim | Natural Sciences
8897 | Thermal Isomerization of All-trans-Lutein in a Benzene Solution | Natural Sciences
9099 | Synthesis, crystal and molecular structures of pyridine adducts of the zinc and cadmium bis-1,2,4-tr | Natural Sciences
1264539 | Digestibility and asymmetric information in the choice between acquisitions and joint ventures: Wher | Social Sciences & Management
1288863 | Optimization modeling for sewer network management | Social Sciences & Management
1289491 | Some iterative techniques for general monotone variational inequalities | Social Sciences & Management
1305324 | Sales forecasts for existing consumer products and services: Do purchase intentions contribute to ac | Social Sciences & Management
1339088 | Verbal, vocal, and visible factors in judgments of another's affect | Social Sciences & Management
NLP with Python (cont.)
• Machine learning libraries:
  • Pandas
  • Scikit-learn
  • NumPy
  • TensorFlow
  • Keras
  • etc.
Field Area Classification Results
Fingerprints (Field Area)
Example Fingerprint (Field Area)
Source: Elsevier Pure
How to Implement?
2. Prediction
• Example applications:
  • Academic integrity: identifying dishonest researcher behavior
  • Field Area: development of scientific fields
  • Productivity: predicting the number of research outputs from historical data
• Methods:
  • Linear Regression
  • Non-linear Regression
  • Artificial Neural Network (ANN)
Identifying anomalous researcher behavior
Linear regression approach:
• x-axis: SINTA score over the last 3 years
• y-axis: overall SINTA score
• Data points 1, 2, and 3 are outliers that merit further observation
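One way to flag such outliers is to fit a least-squares line and look at the residuals. The sketch below is only an illustration: the scores are invented rather than real SINTA data, and the cut-off of 2 standard deviations is an assumed threshold, not one given in the slides.

```python
import numpy as np

# Invented scores for eight researchers (not real SINTA data).
score_3yr = np.array([100, 200, 300, 400, 500, 600, 700, 400], dtype=float)
score_overall = np.array(
    [300, 600, 900, 1200, 1500, 1800, 2100, 3000], dtype=float
)

# Least-squares fit: score_overall ~ slope * score_3yr + intercept.
slope, intercept = np.polyfit(score_3yr, score_overall, deg=1)

# Residual = observed overall score minus the value the line predicts.
residuals = score_overall - (slope * score_3yr + intercept)

# Standardize the residuals and flag points far from the line.
THRESHOLD = 2.0  # assumed cut-off, in standard deviations
z_scores = residuals / residuals.std()
outlier_idx = np.flatnonzero(np.abs(z_scores) > THRESHOLD)
print(outlier_idx)
```

With these numbers only the last researcher (index 7) is flagged. A flagged point is a prompt for manual review, not evidence of misconduct in itself.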
Predicting the Number of Publications
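A simple baseline for this kind of forecast is a linear trend fitted to past yearly counts. The sketch below assumes invented yearly publication counts for a single researcher. The non-linear regression or ANN models listed among the methods would replace the straight-line fit when the history is not close to linear.

```python
import numpy as np

# Invented publication counts per year for one researcher.
years = np.array([2017, 2018, 2019, 2020, 2021])
counts = np.array([10, 14, 18, 22, 26], dtype=float)

# Shift years to start at 0 so the fit is numerically well conditioned.
t = years - years[0]
slope, intercept = np.polyfit(t, counts, deg=1)

# Extrapolate the fitted line one year past the last observation.
next_year = years[-1] + 1
forecast = slope * (next_year - years[0]) + intercept
print(next_year, round(forecast, 1))
```

Because the toy history grows by exactly 4 papers per year, the forecast for 2022 is 30.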
Development of a scientific field
keyword: Machine Learning
How to Implement?
3. Estimation
• Example applications:
  • Rank: Metrics and Score
  • Estimating research quality
• Methods:
  • Statistics
  • ANN, SVM, KNN
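As an example of a statistics-based metric, the h-index reported in the biodata can be computed directly from per-paper citation counts. The function below follows the standard definition. The citation counts are invented, and nothing here should be read as the scoring formula SINTA itself uses.

```python
from typing import Iterable

def h_index(citations: Iterable[int]) -> int:
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Invented citation counts for one researcher's papers.
paper_citations = [25, 8, 5, 3, 3, 1, 0]
print(h_index(paper_citations))
```

Here three papers have at least 3 citations each, but there are not four papers with at least 4, so the h-index is 3.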
How do we measure the quality of this paper?
How to Implement?
4. Association
• Example applications:
  • Collaboration Network (International & National)
  • Relevance between scientific fields
  • Closeness between researchers
• Methods:
  • Data Visualization
  • Association + Clustering
  • Apriori
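Apriori can be applied to co-authorship by treating each paper's author list as a transaction and looking for groups of authors that appear together often. The sketch below is a small pure-Python version of the algorithm. The author names, the papers, and the minimum support of 2 are all invented for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {
            a | b for a, b in combinations(list(frequent), 2) if len(a | b) == k
        }
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        # Count support of the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result

# Invented author lists, one per paper.
papers = [
    {"Ani", "Budi", "Citra"},
    {"Ani", "Budi"},
    {"Ani", "Citra"},
    {"Budi", "Dewi"},
    {"Ani", "Budi", "Dewi"},
]
frequent_groups = apriori(papers, min_support=2)
for group, support in sorted(frequent_groups.items(), key=lambda kv: -kv[1]):
    print(sorted(group), support)
```

Frequent pairs such as Ani and Budi (3 shared papers) can then be drawn as edges of a collaboration network.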
Collaboration Network
How to Implement?
5. Clustering
• Example applications:
  • Clustering higher-education institutions by performance
  • Clustering higher-education institutions by closeness of scientific fields
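Clustering institutions by performance could be sketched with K-Means, one of the models listed earlier. The university names and the two indicators (publications and citations) are invented, and the choice of two clusters is an assumption made only to keep the example small.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Invented indicators per institution: [publications, citations].
universities = ["Univ A", "Univ B", "Univ C", "Univ D", "Univ E", "Univ F"]
indicators = np.array(
    [
        [1200, 9000],
        [1100, 8500],
        [1300, 9500],
        [150, 600],
        [120, 500],
        [180, 700],
    ],
    dtype=float,
)

# Standardize so that both indicators contribute on the same scale.
scaled = StandardScaler().fit_transform(indicators)

# Group the institutions into two clusters (assumed number of clusters).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(scaled)

for name, label in zip(universities, labels):
    print(name, label)
```

The cluster numbers themselves are arbitrary; what matters is which institutions end up grouped together.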
How to Implement?
6. Recommender System
• User profile:
  • Research and publication history
  • Data search history
Based on the user profile, the recommender system suggests:
• Related work (recommended research sources)
• Related researchers (recommended research collaborators)
• Related research institutions (recommended research institutions)
• Potential citations
Citation recommendation system:
Content-based author Co-citation network
Graph + KNN
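The content-based part of such a system can be sketched as a nearest-neighbour search over TF-IDF vectors of paper titles. This is an illustrative assumption about how "content-based + KNN" might look. The paper titles and the query are invented, and the co-citation graph part is not modeled here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Invented candidate papers that could be recommended for citation.
papers = [
    "plagiarism detection using machine learning classifiers",
    "adaptive e-learning systems using machine learning",
    "clustering universities by research performance",
    "citation network analysis of research collaboration",
]

# Represent each paper by the TF-IDF weights of its words.
vectorizer = TfidfVectorizer()
paper_vectors = vectorizer.fit_transform(papers)

# KNN index using cosine distance between TF-IDF vectors.
knn = NearestNeighbors(n_neighbors=2, metric="cosine")
knn.fit(paper_vectors)

# The query stands in for the user's profile or current manuscript.
query = "machine learning for adaptive learning"
distances, indices = knn.kneighbors(vectorizer.transform([query]))

recommended = [papers[i] for i in indices[0]]
print(recommended)
```

The two nearest papers are returned in order of similarity to the query. A fuller system would combine this content score with signals from the author co-citation graph.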
Thank you
