Agenda
Utilizing technology to identify and analyze trends in data to make better business decisions.
Overlapping Fields
Competing On Analytics
Phases of Growth
What happened?
How was performance last week? How is performance right now? What can I do to reach our goals?
Data Warehousing
Store data outside of the application/normal business environment (e.g. ERP systems)
Specific for reporting/analytics
Modeling Styles
3NF (normal database modeling)
Data Marts (aka star schemas)
Data Vault (hybrid 3NF/Data Mart)
Anchor Modeling (6NF)
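To make the data mart (star schema) style concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (dim_date, dim_product, fact_sales), the columns, and the sample rows are all hypothetical and chosen only for illustration; a real warehouse would have more dimensions and far more data.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table surrounded by two dimension tables (hypothetical names)."""
    conn.executescript(
        """
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            day TEXT,
            month TEXT
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            name TEXT,
            category TEXT
        );
        -- The fact table holds measures plus a key into each dimension.
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity INTEGER,
            amount REAL
        );
        """
    )
    # Made-up sample data, only to make the query below return something.
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(1, "2012-01-30", "2012-01"), (2, "2012-02-01", "2012-02")],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget", "Hardware"), (2, "Manual", "Books")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 1, 2, 20.0), (1, 2, 1, 5.0), (2, 1, 3, 30.0)],
    )


def sales_by_category(conn):
    """Typical reporting query: join the fact table to a dimension and aggregate."""
    return conn.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
print(sales_by_category(conn))
```

The point of the layout is that reporting questions become one join per dimension plus a GROUP BY, rather than a walk through many normalized tables.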
Data Warehousing
Databases
RDBMS: MySQL, Postgres, etc.
Columnar: Infobright*, LucidDB, InfiniDB*, etc.
Greenplum* (both RDBMS and Columnar)
NoSQL: Hadoop, CouchDB, MongoDB, etc.
RDBMS vs Columnar
Source: http://www.calpont.com/column-oriented-database-bi
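As a rough illustration of the row-versus-column distinction, the sketch below holds the same records in both layouts using plain Python structures. The field names and values are made up, and real columnar engines add compression and on-disk layouts that are not modeled here; the only idea shown is that an aggregate over one column needs to touch just that column's values.

```python
# Row-oriented layout: each record is stored together (hypothetical sample data).
rows = [
    {"order_id": 1, "region": "East", "amount": 20.0},
    {"order_id": 2, "region": "West", "amount": 5.0},
    {"order_id": 3, "region": "East", "amount": 30.0},
]


def to_columns(records):
    """Pivot row records into a column-oriented layout: one list per field."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_row_store(records):
    """Summing one field still walks every full record."""
    return sum(record["amount"] for record in records)


def total_column_store(columns):
    """Summing one field reads only that field's list."""
    return sum(columns["amount"])


columns = to_columns(rows)
print(total_row_store(rows), total_column_store(columns))
```

Both functions return the same total; the difference is how much unrelated data has to be read to get there, which is why analytic queries over a few columns tend to favor the columnar layout.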
NoSQL?
Not Only SQL
Unstructured/semi-structured data
Huge data sets (multi-terabyte to petabyte+)
Source: http://www.information-management.com/specialreports/20040622/1005301-1.html
Data Integration
ETL (Extract, Transform, Load)
MDM (Master Data Management)
EAI (Enterprise Application Integration)
EII (Enterprise Information Integration)
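The ETL pattern in the list above can be sketched in a few lines of standard-library Python. This is not how Talend or Kettle work internally; it is only a hand-rolled illustration of the three steps. The CSV content, the orders table, and the cleaning rules (trim whitespace, normalize region casing, skip rows with no amount) are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with the kind of mess ETL jobs usually clean up.
SOURCE_CSV = """order_id,region,amount
1, east ,20.00
2,WEST,5.50
3,East,
"""


def extract(text):
    """Extract: read raw records from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: trim whitespace, normalize casing, drop rows missing an amount."""
    cleaned = []
    for record in records:
        amount = record["amount"].strip()
        if not amount:
            continue  # skip incomplete rows
        cleaned.append(
            (int(record["order_id"]), record["region"].strip().title(), float(amount))
        )
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the reporting database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(SOURCE_CSV)))
print(warehouse.execute("SELECT region, amount FROM orders ORDER BY order_id").fetchall())
```

Dedicated ETL tools add scheduling, connectors, and error handling on top of this, but the extract, transform, load split stays the same.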
Talend
Pentaho K.E.T.T.L.E.
Kettle Extraction, Transport, Transformation, and Loading Environment
Focus on ETL
Extensible plugin architecture
Engine based
Reporting
Reporting Options
Tools compared: BIRT, Pentaho, JasperReports, SQL Power Wabit, Saiku
Features compared: MDX, Pivot Table, Charting, SQL, Other Sources*, Drill Through, Parameterized
BIRT Example
Visualization
Pentaho CDE/CDF
Dashboard framework and editor built into Pentaho BI Server
Community developed
Uses open web languages (JavaScript, HTML, etc.)
Statistics/Predictive Analytics
* Utilizes Talend ETL
** Utilizes Weka Data Mining
*** All use Mondrian for OLAP, with different front ends
Shameless Plug