
Open Source Business Intelligence Tools

Alex Meadows
TriLUG, January 2012

Agenda

Business Intelligence Overview
Review of OSBI Tools
    Data Warehousing
    Data Integration
    Reporting/OLAP
    Visualization
    Statistical Analysis/Predictive Analytics

What Is Business Intelligence?

Utilizing technology to identify and analyze trends in data to make better business decisions.

Overlapping Fields

Source: Back In Business, Klimberg, Miori (www.informs.org)

Competing On Analytics

Source: Competing on Analytics; Thomas Davenport, Jeanne Harris

Phases of Growth

The Three Types of Questions

What happened?
    How was performance last week?

What is currently happening?
    How is performance right now?

What will happen?
    What can I do to reach our goals?

Data Warehousing

Store data outside of the application/normal business environment (e.g. ERP systems)
Built specifically for reporting/analytics
Modeling styles:
    3NF (normal database modeling)
    Data Marts (aka star schemas)
    Data Vault (hybrid 3NF/Data Mart)
    Anchor Modeling (6NF)
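As an illustration of the star schema style listed above, here is a minimal sketch using Python's built-in sqlite3 module. The table names (dim_product, dim_date, fact_sales), the columns, and the sample rows are all made up for this example; a real warehouse would have many more dimensions and rows.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table surrounded by two dimension tables (hypothetical names)."""
    conn.executescript("""
        CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, iso_date TEXT, month TEXT);
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            quantity INTEGER,
            amount REAL
        );
    """)
    # Illustrative sample data only.
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20120101, "2012-01-01", "2012-01"), (20120201, "2012-02-01", "2012-02")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 20120101, 2, 20.0), (2, 20120101, 1, 15.0),
         (3, 20120201, 4, 40.0), (1, 20120201, 1, 10.0)],
    )


def sales_by_category(conn):
    """Typical reporting query: join the fact table to a dimension and aggregate."""
    return conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    print(sales_by_category(connection))
```

The fact table holds only keys and measures, so reporting questions become a join to one or two dimensions followed by a GROUP BY.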

Data Warehousing

Databases
    MySQL, Postgres, etc.

Columnar Data Stores
    Infobright*, LucidDB, InfiniDB*, etc.

Hybrid Data Warehouse Databases
    Greenplum* (both RDBMS and columnar)

NoSQL
    Hadoop, CouchDB, MongoDB, etc.

*Hardware and/or Software limitations in community editions

RDBMS vs Columnar

Source: http://www.calpont.com/column-oriented-database-bi
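To make the row-oriented versus column-oriented distinction concrete, here is a toy in-memory sketch. It is not how any of the databases named in this deck store data on disk; the record fields and values are invented purely to show the two layouts.

```python
# Row-oriented layout: each record keeps all of its fields together.
ROWS = [
    {"id": 1, "region": "East", "amount": 10.0},
    {"id": 2, "region": "West", "amount": 20.0},
    {"id": 3, "region": "East", "amount": 5.5},
]


def to_columns(rows):
    """Pivot row records into a column-oriented layout: one list per field."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


def total_from_rows(rows, field):
    """Row store: every whole record is visited even though one field is needed."""
    return sum(row[field] for row in rows)


def total_from_columns(columns, field):
    """Column store: only the values of the requested field are read."""
    return sum(columns[field])


if __name__ == "__main__":
    cols = to_columns(ROWS)
    print(total_from_rows(ROWS, "amount"), total_from_columns(cols, "amount"))
```

Analytic queries usually aggregate a few fields over many records, which is the access pattern the column layout favors.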

NoSQL?

Not Only SQL
Unstructured/semi-structured data
Huge data sets (multi-terabyte to petabyte+)

Source: http://www.information-management.com/specialreports/20040622/1005301-1.html

Data Integration

Syncing data across systems
Includes:
    ETL (Extract, Transform, Load)
    MDM (Master Data Management)
    EAI (Enterprise Application Integration)
    EII (Enterprise Information Integration)
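The ETL pattern can be sketched in a few lines of standard-library Python. This is only an illustration of the three steps, not how Talend or Kettle work internally; the CSV content, the orders table, and the function names are assumptions made up for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source data, standing in for an export from an operational system.
RAW_CSV = """customer,amount
 alice ,10.50
BOB,4
carol,not-a-number
"""


def extract(text):
    """Extract: read raw records from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: normalize names, cast amounts, and drop rows that fail to parse."""
    clean = []
    for record in records:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # a real job would log or quarantine rejected rows
        clean.append((record["customer"].strip().title(), amount))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()


def run_etl(conn, text=RAW_CSV):
    """Run the three steps in order and return what ended up in the target."""
    load(conn, transform(extract(text)))
    return conn.execute(
        "SELECT customer, amount FROM orders ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl(sqlite3.connect(":memory:")))
```

Keeping the three steps as separate functions mirrors how ETL tools model a job as a chain of independent steps.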

Talend

Data Management Tool Suite
    ETL
    MDM
    Data Profiling
    Data Quality

Code generator
Eclipse based
Extensible plugin architecture

Pentaho K.E.T.T.L.E.

Kettle Extraction, Transport, Transformation, and Loading Environment
Focus on ETL
Extensible plugin architecture
Engine based

Reporting

Focus: Historical Analysis

Reporting Options
Tools compared: BIRT, Pentaho, JasperReports, SQL Power Wabit, Saiku
Features compared: MDX, SQL, Other Sources*, Pivot Table, Charting, Drill Through, Parameterized

*Flat Files, NoSQL, etc.
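Pivot tables are one of the reporting features compared above. As a rough sketch of the idea, independent of any of the listed tools, the following plain-Python function groups records by one field for rows and another for columns, then sums a measure. The sample sales records and field names are invented for the example.

```python
from collections import defaultdict


def pivot(records, row_key, col_key, value_key):
    """Sum value_key for every (row_key, col_key) combination found in records."""
    table = defaultdict(lambda: defaultdict(float))
    for record in records:
        table[record[row_key]][record[col_key]] += record[value_key]
    # Convert the nested defaultdicts to plain dicts for easier display.
    return {row: dict(cols) for row, cols in table.items()}


# Hypothetical sales records.
SALES = [
    {"region": "East", "quarter": "Q1", "amount": 10.0},
    {"region": "East", "quarter": "Q1", "amount": 20.0},
    {"region": "East", "quarter": "Q2", "amount": 5.0},
    {"region": "West", "quarter": "Q1", "amount": 7.0},
]

if __name__ == "__main__":
    for region, by_quarter in pivot(SALES, "region", "quarter", "amount").items():
        print(region, by_quarter)
```

An OLAP engine answers the same kind of question, but against pre-modeled dimensions and measures rather than ad hoc records.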

BIRT Example

Visualization

Focus: Trending and Present

Pentaho CDE/CDF

Dashboard framework and editor built into Pentaho BI Server
Community developed
Uses open web languages (JavaScript, HTML, etc.)

Statistics/Predictive Analytics

Focus: All relevant data used to predict outcomes

Statistics/Predictive Analytics

R: stats oriented
Weka: machine learning oriented
RapidMiner: mixed
    Originally YALE
    Weka and R plugins
    Like SAS Enterprise Miner
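To show what "using data to predict outcomes" means in its simplest form, here is a small least-squares line fit in plain Python. It stands in for the far richer models available in R, Weka, or RapidMiner and does not use any of their APIs; the weekly sales figures are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Apply the fitted line to a new x value."""
    return slope * x + intercept


if __name__ == "__main__":
    weeks = [1, 2, 3, 4]           # hypothetical week numbers
    sales = [3.0, 5.0, 7.0, 9.0]   # hypothetical weekly sales
    slope, intercept = fit_line(weeks, sales)
    print("forecast for week 5:", predict(slope, intercept, 5))
```

The fit answers "what happened?" as a trend, and the prediction step extends that trend to answer "what will happen?".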

BI From Reporting to Statistical Analysis


Suites compared: Jaspersoft, Pentaho, SpagoBI
Capabilities compared: ETL, Metadata, Reporting, Dashboards, OLAP***, Statistics, Automated Decisions

* Utilizes Talend ETL
** Utilizes Weka Data Mining
*** All use Mondrian for OLAP, with different front ends

Shameless Plug

RTP Pentaho User Group


On LinkedIn (soon also on Meetup)
Meets quarterly
