
BUSINESS INTELLIGENCE

with Data Analytics


~ Data Preparation ~
Faizal Mahananto
Bio
• Faizal Mahananto
• Department of Information Systems, ITS
• Academic background: ITS (bachelor's), Kumamoto Univ. (master's, doctorate)
• Research interests
• Data analysis (time series, signal)
• Medical, Economic, Engineering
• Projects
• ICU patient monitoring and analysis
• Outpatient rehabilitation apps
• Predictive maintenance using ML
Training Module
• Business Intelligence (BI) is a COMPULSORY course taught in the
Department of Information Systems, ITS (4 credits).
• Decision Support System is an ELECTIVE course for the Data
Engineering and Business Intelligence Lab (RDIB).
• Other courses that overlap with the content of this training are
Statistics, Data Mining, Machine Learning, and Forecasting Techniques.
Business Intelligence

Turban et al, 2010

Learning outcome: use BI to make the most of the data the organization holds in achieving its goals.
A simple solution
Outline
1. Data Formatting
   1. Create data
   2. Google Colab env.
2. Basic Python
   1. Data connection
   2. Library for desc analytics
   3. Functions
3. Auto Visualization
   1. Matplotlib
   2. Visualizing (Bar, Scatter, Line)
4. Basic Descriptive Analytics
   1. Describe function
   2. Individual measurement calculation
5. Hands on
Preparation
• Data: BI TSA.xlsx
• Access to Google Colab: https://colab.research.google.com/
Google Colab
• Cloud-based Python environment.
Working with Google Colab
• Create notebook
• Create simple code
• Rename notebook
• Check Google Drive working folder
Data Formatting
• Export Excel to CSV
• CSV File to Google Drive
• Excel file to Google Drive
Basic connection
• CSV data access

Browser

Code
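A minimal sketch of CSV access with pandas. The sample rows, column names, and the temporary file are assumptions made so the example runs anywhere; in Colab the path would instead point to an uploaded file or to a file in the Google Drive working folder.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample rows standing in for the exported workshop file.
sample_csv = (
    "date,visitors,revenue\n"
    "2023-01-01,120,1500\n"
    "2023-01-02,95,1100\n"
    "2023-01-03,140,1800\n"
)

# Write the sample to a temporary file so the example is self-contained.
csv_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(csv_path, "w") as f:
    f.write(sample_csv)

# Read the CSV into a DataFrame and parse the date column as datetimes.
df = pd.read_csv(csv_path, parse_dates=["date"])
print(df.head())
print(df.dtypes)
```

Parsing the date column at read time tends to make later grouping by day, month, or year simpler.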
Basic connection
• Google Sheet data Access

Code - Read Code - Write


Basic connection
• Database Access
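The general pattern for database access (connect, run a query, load the result into a DataFrame) can be sketched as follows. This sketch swaps in Python's built-in `sqlite3` with an in-memory database as a stand-in for a real server; the `orders` table and its rows are hypothetical.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database used as a stand-in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, order_year INTEGER, subtotal REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 2013, 100.0), (2, 2013, 250.0), (3, 2014, 300.0)],
)
conn.commit()

# Let the database aggregate, then load the result into a DataFrame.
query = (
    "SELECT order_year, SUM(subtotal) AS total "
    "FROM orders GROUP BY order_year ORDER BY order_year"
)
orders_per_year = pd.read_sql_query(query, conn)
conn.close()
print(orders_per_year)
```

With a server-based database only the connection step changes; the query-then-DataFrame flow stays the same.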
Library for desc. analytics
• Libraries for numeric work: pandas and NumPy
Library for desc. analytics
• Access data
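A short sketch of accessing data with pandas and NumPy. The `sales` table and its column names are hypothetical and only serve to illustrate column selection, row selection, filtering, and conversion to arrays.

```python
import numpy as np
import pandas as pd

# Hypothetical daily sales figures, for illustration only.
sales = pd.DataFrame(
    {
        "day": ["Mon", "Tue", "Wed", "Thu"],
        "units": [10, 12, 9, 15],
        "price": [2.5, 2.5, 3.0, 3.0],
    }
)

# Select one column (a pandas Series) and one row by position.
units = sales["units"]
first_row = sales.iloc[0]

# Filter rows with a boolean condition.
busy_days = sales[sales["units"] > 10]

# Convert columns to NumPy arrays for element-wise numeric work.
revenue = sales["units"].to_numpy() * sales["price"].to_numpy()
print(busy_days)
print(revenue, np.sum(revenue))
```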
Python function

With a return value


Without a return value
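The two kinds of function can be sketched as follows. The function names and the `visitors` list are hypothetical examples, not taken from the slides.

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def print_summary(name, values):
    """Print a one-line summary; this function has no return statement."""
    print(f"{name}: n={len(values)}, mean={average(values):.2f}")


visitors = [120, 95, 140]  # hypothetical data

# A returned value can be stored and reused in later calculations.
mean_visitors = average(visitors)

# A function without a return statement still runs, but yields None.
result = print_summary("visitors", visitors)
```

A function that returns its result is usually easier to reuse in analysis code, while a function that only prints is convenient for quick inspection.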
Basic Descriptive Analytics
Data analytics can be broken into four key types:
1. Descriptive, which answers the question, “What happened?”
2. Diagnostic, which answers the question, “Why did this happen?”
3. Predictive, which answers the question, “What might happen in the
future?”
4. Prescriptive, which answers the question, “What should we do
next?”
Basic Descriptive Analytics
• Descriptive analysis answers:
• Current business performance
• Average store visitors per day, revenue, inventory, production
• Trends in historical data
• Increases/decreases in sales, followers, engagement rate
• Comparisons of data and business position
• Revenue-to-expense ratio, ROI

• Descriptive analysis gives a high-level overview of the business and its data.
• However, it says little about the causes of those conditions or about
what will happen next.
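The three kinds of question above (current performance, historical trend, comparison) can be sketched with pandas. The `store` table, its column names, and all values are hypothetical.

```python
import pandas as pd

# Hypothetical daily store data, for illustration only.
store = pd.DataFrame(
    {
        "date": pd.date_range("2023-01-01", periods=6, freq="D"),
        "visitors": [100, 120, 110, 130, 150, 140],
        "revenue": [1000.0, 1300.0, 1200.0, 1500.0, 1800.0, 1600.0],
        "expenses": [800.0, 900.0, 850.0, 950.0, 1000.0, 1000.0],
    }
)

# Current performance: average visitors per day.
avg_visitors = store["visitors"].mean()

# Historical trend: day-over-day percentage change in revenue.
store["revenue_change_pct"] = store["revenue"].pct_change() * 100

# Comparison: ratio of total revenue to total expenses.
revenue_expense_ratio = store["revenue"].sum() / store["expenses"].sum()

print("average visitors per day:", avg_visitors)
print(store[["date", "revenue", "revenue_change_pct"]])
print("revenue / expenses:", round(revenue_expense_ratio, 3))
```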
Basic Descriptive Analytics
• Descriptive analysis workflow
1. Determine the business measure/variable to analyze
2. Identify the data required
3. Prepare the data
4. Perform the required data analysis
5. Visualize the data
Basic Descriptive Analytics
• Descriptive statistics
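A minimal sketch of descriptive statistics in pandas, first with `describe()` and then as individual measures. The `orders` data is hypothetical.

```python
import pandas as pd

# Hypothetical order values, for illustration only.
orders = pd.DataFrame({"subtotal": [20.0, 35.0, 35.0, 50.0, 110.0]})

# describe() returns count, mean, std, min, quartiles and max in one call.
summary = orders["subtotal"].describe()
print(summary)

# The same kinds of measure can also be computed one at a time.
mean_value = orders["subtotal"].mean()
median_value = orders["subtotal"].median()
mode_value = orders["subtotal"].mode().iloc[0]  # most frequent value
std_value = orders["subtotal"].std()  # sample standard deviation (ddof=1)
value_range = orders["subtotal"].max() - orders["subtotal"].min()
```

Comparing the mean with the median gives a quick hint about skew: here a single large order pulls the mean above the median.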
Basic Descriptive Analytics
• Examples
Basic Descriptive Analytics
• Data Trend
• https://www.kaggle.com/datasets/kandij/electric-production
• kaggle datasets download -d kandij/electric-production
• Code
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # dataelc is the DataFrame read from Electric_Production.csv
    # Use the row index (0, 1, 2, ...) as the time variable
    X = np.arange(len(dataelc)).reshape(-1, 1)
    y = dataelc['Value']

    model = LinearRegression()
    model.fit(X, y)
    r_sq = model.score(X, y)
    print('R2:', r_sq)
    print('intercept:', model.intercept_)
    print('slope:', model.coef_)
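To show what the slope, intercept, and R2 values mean, the same kind of straight-line fit can be reproduced by hand. This sketch uses `np.polyfit` instead of `LinearRegression`, and a synthetic series instead of the Kaggle data; both are assumptions made to keep the example self-contained.

```python
import numpy as np

# Synthetic series: a linear trend (intercept 3, slope 2) plus a small wiggle.
t = np.arange(10)
y = 3.0 + 2.0 * t + np.array([0.5, -0.5] * 5)

# Least-squares line; polyfit returns the highest-degree coefficient first.
slope, intercept = np.polyfit(t, y, deg=1)

# R^2 = 1 - SS_res / SS_tot: the share of variance explained by the line.
y_hat = intercept + slope * t
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print("slope:", slope)
print("intercept:", intercept)
print("R2:", r_squared)
```

A positive slope indicates an upward trend per time step, and an R2 close to 1 indicates that a straight line describes the series well.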
Access data from Kaggle
• Create a Kaggle account.
• Download your personal API token (kaggle.json)
• Install the Kaggle client
    !pip install -q kaggle
• Import the API token
    from google.colab import files
    files.upload()  # select kaggle.json
    !mkdir -p ~/.kaggle
    !cp kaggle.json ~/.kaggle/
    !chmod 600 ~/.kaggle/kaggle.json
    !kaggle datasets download -d kandij/electric-production
    !unzip electric-production.zip
Access data from Kaggle
• Access with pandas
    import pandas as pd
    dataelc = pd.read_csv('/content/Electric_Production.csv')
Visualize
    from matplotlib import pyplot as plt

    X = np.arange(len(dataelc)).reshape(-1, 1)

    # Data points with the fitted regression line
    plt.scatter(X, dataelc['Value'], color='g')
    plt.plot(X, model.predict(X), color='g')

    # Data points joined by a line
    plt.scatter(X, dataelc['Value'], color='r')
    plt.plot(X, dataelc['Value'], color='r')
    plt.show()
Hands on
Data → business variables
Access data from database
• hostname: relational.fit.cvut.cz
• port: 3306
• username: guest
• password: relational
• database : AdventureWorks2014

• DB info: MySQL database, online server.

    !pip install mysql-connector-python
Perform the analysis: Trend Analysis
    import numpy as np
    import pandas as pd
    import mysql.connector as mc
    from sklearn.linear_model import LinearRegression

    host_mysql = 'relational.fit.cvut.cz'
    user_mysql = 'guest'
    pass_mysql = 'relational'
    db_mysql = 'AdventureWorks2014'

    connmysql = mc.connect(host=host_mysql, user=user_mysql, password=pass_mysql, database=db_mysql)
    cursormysql = connmysql.cursor()

    cursormysql.execute("select year(orderdate) as 'Year', month(orderdate) as 'Month', onlineorderflag, customerid, salespersonid, subtotal from SalesOrderHeader")

    # Columns are numbered 0-5 in SELECT order: 0 = Year, 1 = Month, 5 = subtotal
    data2 = pd.DataFrame(cursormysql.fetchall())

    # Total subtotal per (Year, Month), in chronological order
    monthly_groups = data2.groupby([0, 1])
    data = np.array(monthly_groups[5].sum(), dtype=float)

    # Fit a straight line to the monthly totals against the month index
    model = LinearRegression()
    model.fit(np.arange(data.size).reshape(-1, 1), data)
    r_sq = model.score(np.arange(data.size).reshape(-1, 1), data)
    print('R2:', r_sq)
    print('intercept:', model.intercept_)
    print('slope:', model.coef_)
Visualization?
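One possible way to approach the visualization, sketched under assumptions: the monthly totals below are hypothetical stand-ins for the grouped query result, `np.polyfit` is used in place of `LinearRegression` for brevity, and the output file name `trend.png` is arbitrary.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; this line can be omitted in Colab
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical monthly sales totals standing in for the grouped query result.
data = np.array([120.0, 135.0, 150.0, 140.0, 170.0, 185.0, 200.0, 195.0])
months = np.arange(data.size)

# Fit a straight line to the totals.
slope, intercept = np.polyfit(months, data, deg=1)
trend = intercept + slope * months

# Scatter the monthly totals and overlay the fitted trend line.
fig, ax = plt.subplots()
ax.scatter(months, data, color="r", label="monthly subtotal")
ax.plot(months, trend, color="g", label="linear trend")
ax.set_xlabel("month index")
ax.set_ylabel("subtotal")
ax.legend()
fig.savefig("trend.png")
```

Plotting the raw points and the trend line in different colors makes it easier to judge by eye how well the line fits.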
