
DATA WAREHOUSE

Architecture & Structure


Granularity, Partitioning, & Summary

Asrul Azhari Muin


| OUTLINE

1. DW Architecture
2. DW Structure
3. Granularity
4. Partitioning
5. Summary
| #1 DW ARCHITECTURE
| #1 DATA SOURCE COMPONENT

Source data coming into the data warehouse may be grouped into four broad categories:
• Production Data: This type of data comes from the various operational systems of the
enterprise. Based on the data requirements of the data warehouse, we choose segments of
the data from the different operational systems.
• Internal Data: In every organization, users keep their "private" spreadsheets,
reports, customer profiles, and sometimes even departmental databases. This is the internal
data, part of which could be useful in a data warehouse.
• Archived Data: Operational systems are mainly intended to run the current business. In
every operational system, we periodically take the old data and store it in archived files.
• External Data: Most executives depend on information from external sources for a large
percentage of the information they use. They use statistics relating to their industry
produced by external agencies.
| #1 DATA STAGING COMPONENT

1) Data Extraction: This function has to deal with numerous data sources. We
have to employ the appropriate technique for each data source.
2) Data Transformation: As we know, data for a data warehouse comes from
many different sources. If data extraction for a data warehouse poses big
challenges, data transformation presents even greater ones. We perform
several individual tasks as part of data transformation.
3) Data Loading: Two distinct categories of tasks form the data loading function.
When we complete the design and construction of the data warehouse and go
live for the first time, we do the initial loading of data into the data
warehouse storage. The initial load moves high volumes of data and takes a
substantial amount of time. After that, ongoing loads apply the incremental
changes from the source systems.
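The three staging steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the CSV content, the `sales_fact` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the warehouse storage.

```python
import csv
import io
import sqlite3

# Hypothetical source extract: a CSV export from an operational system.
SOURCE_CSV = """order_id,customer,amount,order_date
1, alice ,10.50,2024-01-05
2,BOB,20.00,2024-01-06
3,alice,,2024-01-07
"""

def extract(text):
    """Extraction: read raw rows from one source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transformation: clean names, convert types, drop incomplete rows."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip rows with a missing amount
        cleaned.append((
            int(row["order_id"]),
            row["customer"].strip().lower(),
            float(row["amount"]),
            row["order_date"],
        ))
    return cleaned

def load(conn, rows):
    """Loading: write the prepared rows into the (stand-in) warehouse storage."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")
loaded = load(conn, transform(extract(SOURCE_CSV)))
total = conn.execute("SELECT SUM(amount) FROM sales_fact").fetchone()[0]
customers = [r[0] for r in conn.execute(
    "SELECT customer FROM sales_fact ORDER BY order_id")]
```

Keeping the three steps as separate functions mirrors the slide: each source would get its own `extract`, while `transform` and `load` stay shared.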
| #1 DATA STORAGE COMPONENT

Data storage for the data warehouse is a separate repository. The
data repositories for the operational systems generally include
only the current data. Also, these data repositories hold the
data in highly normalized structures for fast and efficient
processing.
| #1 INFORMATION DELIVERY COMPONENT

The information delivery component enables users to
subscribe to data warehouse files and have them
transferred to one or more destinations according to a
customer-specified scheduling algorithm.
| #1 METADATA & DATA MARTS

• Metadata
Metadata in a data warehouse is similar to the data dictionary or the data catalog in a database
management system. In the data dictionary, we keep the data about the logical data structures, the
data about the records and addresses, the information about the indexes, and so on.

• Data Marts
A data mart includes a subset of corporate-wide data that is of value to a specific group of users. Its
scope is confined to particular selected subjects. Data in a data warehouse should be fairly current, but not
necessarily up to the minute, although developments in the data warehouse industry have made regular and
incremental data dumps more achievable. Data marts are smaller than data warehouses and usually
contain data for one part of the organization. The current trend in data warehousing is to develop a data
warehouse with several smaller related data marts for particular kinds of queries and reports.
| #1 MANAGEMENT & CONTROL COMPONENT

The management and control component coordinates the services and functions
within the data warehouse. It controls the data transformation
and the data transfer into the data warehouse storage. It also
moderates the data delivery to the clients. It works with the database management
system so that data is correctly saved in the repositories, and it monitors the
movement of data into the staging area and from there into the data
warehouse storage itself.
| #2 DW STRUCTURE

▪ The specific structure is found in the data warehouse manager component

▪ According to Vidette Poe, a data warehouse has a specific structure whose parts differ in
level of data detail and age of data
| #2 DW STRUCTURE (CONT)

▪ Current detail data is the detailed data in effect right now; it reflects the current state of the business
and is the lowest level in the data warehouse. In this area the warehouse stores all the detail
data found in the database schema. The volume is very large, so it requires correspondingly large
storage that can be accessed quickly. The downside is that managing the data becomes more
complex and the cost becomes high.
▪ Older detail data is the historical counterpart of current detail data; it may be backup or
archive data kept in separate storage.
▪ Lightly summarized data is a summary of the current detail data, rolled up
by period or by other dimensions as needed.
▪ Highly summarized data is a further roll-up of lightly summarized data, a
summary at the overall (total) level; it can be accessed, for example, for comparative analysis
over a given time sequence and for analysis of multidimensional data.
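A minimal sketch of how the four levels relate to one another. The records, the cutoff date, and the choice of "per day and product" and "per month" as the two summary levels are assumptions made for the example, not something the slides specify.

```python
from collections import defaultdict
from datetime import date

# Hypothetical detail records: (sale_date, product, amount).
DETAIL = [
    (date(2023, 11, 3), "A", 100),
    (date(2023, 11, 3), "B", 50),
    (date(2024, 5, 1), "A", 70),
    (date(2024, 5, 1), "A", 30),
    (date(2024, 5, 2), "B", 40),
]
CUTOFF = date(2024, 1, 1)  # assumed boundary between "older" and "current"

# Current vs. older detail data: same level of detail, split by age.
current_detail = [r for r in DETAIL if r[0] >= CUTOFF]
older_detail = [r for r in DETAIL if r[0] < CUTOFF]

# Lightly summarized: current detail rolled up per day and product.
lightly = defaultdict(int)
for day, product, amount in current_detail:
    lightly[(day, product)] += amount

# Highly summarized: the light summary rolled up again, per month.
highly = defaultdict(int)
for (day, _product), amount in lightly.items():
    highly[(day.year, day.month)] += amount
```

Each level is derived from the one below it, so the highly summarized totals can always be rebuilt as long as the detail is kept.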
| #3 GRANULARITY

▪ Inmon (2005: 498) states that granularity is the level of detail contained in each
unit of data. The more detailed the data, the lower the level of granularity. Conversely,
the less detailed the data, the higher the level of granularity.

▪ The level of granularity must be decided at the start of data warehouse design, because it affects
how the data warehouse can be used, its volume, and its complexity.
▪ Usually a middle ground is taken in the form of lightly summarized data
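The trade-off behind the choice of granularity can be illustrated with a small example. The call records and the `summarize` helper below are hypothetical; the point is only that a higher level of granularity stores fewer rows but can no longer answer questions about individual events.

```python
from collections import Counter

# Hypothetical call records at the lowest granularity: one row per call,
# as (customer, call_date, minutes).
calls = [
    ("cust1", "2024-03-01", 5),
    ("cust1", "2024-03-01", 12),
    ("cust1", "2024-03-15", 3),
    ("cust2", "2024-03-02", 8),
    ("cust2", "2024-04-09", 20),
]

def summarize(rows, key):
    """Roll rows up to a higher granularity, keeping call count and minutes."""
    count, minutes = Counter(), Counter()
    for row in rows:
        k = key(row)
        count[k] += 1
        minutes[k] += row[2]
    return {k: (count[k], minutes[k]) for k in count}

# Higher granularity: one row per customer per month.
monthly = summarize(calls, key=lambda r: (r[0], r[1][:7]))

# Fewer rows to store at the higher granularity...
rows_detail, rows_monthly = len(calls), len(monthly)

# ...but only the detail can answer a question about a single day.
calls_on_march_1 = sum(
    1 for r in calls if r[0] == "cust1" and r[1] == "2024-03-01"
)
```

The monthly rows still answer "how many minutes did cust1 use in March?", but the per-day question needs the detail, which is why the level has to be fixed before the warehouse is built.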
| #4 PARTITIONING

Data partitioning refers to breaking data up into separate physical units that can be managed
independently. In a DW, data partitioning is mandatory and needs to be designed carefully and precisely.
The benefit of partitioning is that the following operations become easier (because each unit is
smaller):
1. Restructuring
2. Indexing
3. Sequential scanning
4. Reorganization
5. Recovery
6. Monitoring
Partition by:
• Time
• Organization
• Geography
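As a rough sketch of the idea, the example below splits a set of rows into independent units by a partition key. The rows and the `partition_by` helper are hypothetical, and in-memory lists stand in for the physical units a real warehouse would use.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, amount).
rows = [
    (2022, "east", 10),
    (2023, "east", 20),
    (2023, "west", 30),
    (2024, "west", 40),
]

def partition_by(rows, key):
    """Split rows into independent units, one list per partition key."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[key(row)].append(row)
    return dict(partitions)

by_year = partition_by(rows, key=lambda r: r[0])    # partition by time
by_region = partition_by(rows, key=lambda r: r[1])  # partition by geography

# A sequential scan for 2023 touches only that partition.
total_2023 = sum(amount for _, _, amount in by_year[2023])

# Each unit can be managed on its own, e.g. archiving the oldest year.
archived = by_year.pop(2022)
```

Because every partition is a separate, smaller unit, operations such as scanning, reorganizing, or recovering one year do not have to touch the others.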
| #5 SUMMARY

Detailed data:
• Much storage required
• No loss of details
• Much processing to do anything with data

Summarized data:
• Very compact
• Some loss of details
• The older data gets, the less detail is kept
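One way to read "the older data gets, the less detail is kept" is an age-based roll-up: recent days stay at daily detail, and older days are collapsed into monthly totals. The sketch below assumes a 7-day retention window and a fixed "today"; both values and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date

TODAY = date(2024, 6, 30)  # assumed "current" date for the example
KEEP_DAILY_FOR = 7         # assumed retention window, in days

# Hypothetical daily totals: (day, amount).
daily = [
    (date(2024, 5, 10), 5),
    (date(2024, 5, 20), 7),
    (date(2024, 6, 1), 3),
    (date(2024, 6, 28), 4),
    (date(2024, 6, 29), 6),
]

recent = []                 # kept at daily detail
monthly = defaultdict(int)  # older days collapsed into monthly totals
for day, amount in daily:
    if (TODAY - day).days <= KEEP_DAILY_FOR:
        recent.append((day, amount))
    else:
        monthly[(day.year, day.month)] += amount

# Fewer rows are stored, and the overall total is preserved.
stored_rows = len(recent) + len(monthly)
grand_total = sum(a for _, a in recent) + sum(monthly.values())
```

The totals survive the roll-up, but the individual May days do not, which is the "some loss of details" listed above.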

“THE WORLD’S MOST VALUABLE RESOURCE IS NO LONGER OIL, BUT DATA”
The Economist
#2 DW Architecture & Structure

Thank you
