Business Intelligence Systems

Need for data analysis


o Managers track daily transactions to evaluate how the business is performing
o Strategies should be developed to meet organizational goals using operational
databases
o Data analysis provides information about short-term tactical evaluations and
strategies.
Business Intelligence
o Comprehensive, cohesive, and integrated tools and processes
o Capture, collect, integrate, store, and analyze data
o Generate information to support business decision making
o Framework that allows businesses to transform data into information, information
to knowledge, and knowledge to wisdom
Business Architecture
o Composed of data, people, processes, technology and management.
o KPIs (key performance indicators) are measurements that assess a company's
effectiveness or success in reaching its goals
o Multiple tools from different vendors can be integrated into one single Business
Intelligence framework
Business Framework (check notes)
Business Intelligence Benefits
o Integrating architecture
o Common user interface for data reporting and analysis
o Common data repository fosters single version of company data
o Improved organizational performance (since there is one version)
Evolution of Business Intelligence Information dissemination formats
o 1970s: centralized reporting
o 1980s: spreadsheets
o 1990s: enterprise reporting and OLAP
o 2000s: mobile BI, dashboards
Business Intelligence Trends
o Data storage improvements
o Business intelligence appliances
o Business intelligence as a service
o Big Data analytics
o Personal analytics
Decision Support Data
o Business intelligence effectiveness depends on the quality of data gathered at
operational level
o Operational data are seldom well-suited for decision making (which is why they
should be integrated and transformed into decision support data)
o Data must be reformatted to be useful for Business Intelligence
Operational                                | Decision Support
Current                                    | Historical
Real time                                  | Time variant
Atomic-detailed data                       | Summarized data
Low summarization level (some              | High summarization level (many
aggregate yields)                          | aggregate yields)
Highly normalized (mostly relational DBMS) | Non-normalized (mostly multidimensional, some relational)
Updates critical                           | Retrievals critical
Simple queries                             | Complex queries
Decision Support Database Requirements
o Specialized DBMS tailored to provide fast answers to complex queries
o Main requirements: database schema, data extraction and loading, database size
Data Warehouse: integrated, subject-oriented, time-variant, and non-volatile collection of
data that supports decision making
Transform operational data to decision support data
o Operational data is in tabular format, with each row representing a single
transaction
o Decision support system (DSS) data covers a broader time span and has multiple dimensions
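One way to picture this transformation is a roll-up from transaction rows to summarized rows. The sketch below is only an illustration: the `transactions` rows, the field names, and the `summarize` helper are all made up for this example, and a real ETL process would do far more (cleaning, integration, loading).

```javascript
// Hypothetical operational data: one row per sales transaction.
const transactions = [
  { date: "2024-01-05", region: "East", amount: 120 },
  { date: "2024-01-17", region: "East", amount: 80 },
  { date: "2024-01-20", region: "West", amount: 50 },
  { date: "2024-02-02", region: "East", amount: 200 },
];

// Roll transactions up along two dimensions (month and region).
// The result is time-variant, summarized data rather than atomic rows.
function summarize(rows) {
  const totals = new Map();
  for (const row of rows) {
    const month = row.date.slice(0, 7); // "YYYY-MM"
    const key = `${month}|${row.region}`;
    const entry =
      totals.get(key) ?? { month, region: row.region, total: 0, count: 0 };
    entry.total += row.amount;
    entry.count += 1;
    totals.set(key, entry);
  }
  return [...totals.values()];
}

const summary = summarize(transactions);
console.log(summary);
```

Four atomic rows collapse into three summarized rows, one per month and region combination.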
ETL Process (check notes)
Star Schemas
o Data modeling technique that maps multidimensional decision support data into
relational data
o Components: facts, dimensions, attributes
o Ex. Sales
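A minimal sketch of the Sales example, assuming a fact table of sales measures that points at two dimension tables. Every table, key, and attribute name here (`salesFact`, `productDim`, `locationDim`, `totalBy`) is hypothetical and chosen only to show how facts, dimensions, and attributes relate.

```javascript
// Dimension tables: descriptive attributes used to slice the facts.
const productDim = [
  { productId: 1, name: "Laptop", category: "Computers" },
  { productId: 2, name: "Mouse", category: "Accessories" },
  { productId: 3, name: "Tablet", category: "Computers" },
];
const locationDim = [
  { locationId: 10, city: "Boston", region: "East" },
  { locationId: 20, city: "Denver", region: "West" },
];

// Fact table: numeric measures plus one foreign key per dimension.
const salesFact = [
  { productId: 1, locationId: 10, units: 2, revenue: 2000 },
  { productId: 2, locationId: 10, units: 5, revenue: 100 },
  { productId: 3, locationId: 20, units: 1, revenue: 400 },
  { productId: 1, locationId: 20, units: 1, revenue: 1000 },
];

// Join each fact to one dimension and total a measure per attribute value.
function totalBy(facts, dim, keyName, attribute, measure) {
  const lookup = new Map(dim.map((d) => [d[keyName], d]));
  const totals = {};
  for (const fact of facts) {
    const group = lookup.get(fact[keyName])[attribute];
    totals[group] = (totals[group] || 0) + fact[measure];
  }
  return totals;
}

const revenueByCategory = totalBy(salesFact, productDim, "productId", "category", "revenue");
const revenueByRegion = totalBy(salesFact, locationDim, "locationId", "region", "revenue");
console.log(revenueByCategory, revenueByRegion);
```

The same fact table answers both questions; only the dimension and attribute used for grouping change.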
Data Mining
o Use sophisticated statistical and mathematical techniques to find patterns and
relationships
o Analyze data, uncover problems hidden in data relationships and predict business
behavior
o Two modes: guided and automated
Data Mining Phases
o Data prep
    Identify data set
    Clean data set
    Integrate data set
o Data analysis and classification
    Classification analysis
    Clustering and sequential analysis
    Link analysis
    Trend and deviation analysis
o Knowledge acquisition
    Select and apply algorithms
        Neural networks
        Inductive logic
        Decision trees
        Clustering
        Regression tree
        Nearest neighbor
        Visualization, etc.
o Prognosis
    Modeling
    Forecasting
    Predicting
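As an illustration of one of the listed algorithms, here is a minimal nearest-neighbor sketch. The customer data, feature meanings, and segment labels are invented for this example, and the features are left unscaled, which a real analysis would usually correct.

```javascript
// Hypothetical prepared data set: [visits per month, average basket] -> segment.
const training = [
  { features: [1, 20], label: "occasional" },
  { features: [2, 25], label: "occasional" },
  { features: [8, 90], label: "loyal" },
  { features: [10, 110], label: "loyal" },
];

// Straight-line distance between two feature vectors of equal length.
function euclidean(a, b) {
  return Math.sqrt(a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0));
}

// 1-nearest-neighbor: predict the label of the closest known example.
function nearestNeighbor(examples, point) {
  let best = null;
  for (const example of examples) {
    const distance = euclidean(example.features, point);
    if (best === null || distance < best.distance) {
      best = { label: example.label, distance };
    }
  }
  return best.label;
}

const predicted = nearestNeighbor(training, [9, 95]);
console.log(predicted); // "loyal"
```

The training array stands in for the output of the data prep phase, and the prediction for a new customer is the prognosis step.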

Big Data

7 Vs: VOLUME, VELOCITY, VARIETY, veracity, variability, value, volatility


Store Big Data in (1) Hadoop, (2) NoSQL
Hadoop:
o Assumptions include:
    High volume, more than 1 terabyte
    Write once, read many
    Fault tolerance, replication factor of 3
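The effect of a replication factor of 3 can be sketched with simple arithmetic and a toy placement rule. This is an assumption-laden illustration: `rawStorageNeeded` and `placeReplicas` are hypothetical helpers, and the round-robin placement below is not how HDFS chooses nodes (it uses a rack-aware policy).

```javascript
// Every block is stored replicationFactor times, so raw capacity scales with it.
function rawStorageNeeded(logicalTerabytes, replicationFactor = 3) {
  return logicalTerabytes * replicationFactor;
}

// Toy placement: put the copies of a block on consecutive, distinct nodes.
function placeReplicas(blockId, nodes, replicationFactor = 3) {
  if (nodes.length < replicationFactor) {
    throw new Error("need at least as many nodes as replicas");
  }
  const start = blockId % nodes.length;
  return Array.from(
    { length: replicationFactor },
    (_, i) => nodes[(start + i) % nodes.length]
  );
}

const nodes = ["n1", "n2", "n3", "n4"]; // hypothetical data nodes
const replicas = placeReplicas(5, nodes);
// If one node fails, the block is still readable from the remaining copies.
const survivors = replicas.filter((node) => node !== "n2");

console.log(rawStorageNeeded(10)); // 30
console.log(replicas, survivors);
```

Storing 10 TB of data therefore needs about 30 TB of raw disk, and losing any single node still leaves two copies of each block.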
NoSQL:
o Key-value: Dynamo
o Document: MongoDB
o Column: Cassandra
o Graph: Neo4j
SQL limitations:
o Rigid schema, difficulty adding columns
o JOINs are expensive
o Can't handle unstructured data
o Not adaptive to new requirements
When to use RDBMS                         | When to use NoSQL
Centralized application (ERP)             | Decentralized application (IoT, mobile, web)
Moderate to high availability             | Continuous availability, no downtime
Moderate velocity data                    | High velocity data (devices, sensors, etc.)
Data from one to few locations            | Data from many locations (variety)
Primarily structured data                 | Structured, semi/unstructured data
Complex/nested transactions               | Simple transactions
Primary concern scaling reads             | Primary concern scaling reads AND writes
Scale UP for more data/users              | Scale OUT for more data/users
Maintain moderate data volume with purge  | Maintain high data volumes; retain forever

MongoDB

Key-value Databases (Dynamo)


o Simplest NoSQL data model
o Only stores key-value pairs (values can be anything)
o Key-value pairs are organized into buckets (logical groupings of keys)
o Operations include: get, store, delete
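The model is small enough to sketch in a few lines. This is an in-memory illustration only, not Dynamo's API: the `KeyValueStore` class and its method signatures are hypothetical, and it ignores persistence, partitioning, and replication.

```javascript
// Buckets are logical groupings of keys; values can be anything.
class KeyValueStore {
  constructor() {
    this.buckets = new Map(); // bucket name -> Map of key -> value
  }

  store(bucket, key, value) {
    if (!this.buckets.has(bucket)) this.buckets.set(bucket, new Map());
    this.buckets.get(bucket).set(key, value);
  }

  get(bucket, key) {
    const entries = this.buckets.get(bucket);
    return entries ? entries.get(key) : undefined;
  }

  // Returns true if the key existed and was removed.
  delete(bucket, key) {
    const entries = this.buckets.get(bucket);
    return entries ? entries.delete(key) : false;
  }
}

const kv = new KeyValueStore();
kv.store("sessions", "user:42", { cart: ["pen", "ink"] });
kv.store("settings", "user:42", "dark-mode");
console.log(kv.get("sessions", "user:42"));
console.log(kv.get("settings", "user:42"));
```

The same key can live in two buckets without conflict, and the store never inspects the value, which is why lookups are only possible by key.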
Document Databases (MongoDB)
o Subset of key-value databases
o Value component is always an encoded document (XML, JSON)
o Tags provide meaning
o Organized into collections
o Supports aggregate functions
Operations in MongoDB
o List all databases: show dbs
o Create (or switch to) database: use itpdb (the database is created when data is first stored)
o Delete database: db.dropDatabase()
o List all collections: show collections
o Create collection: db.createCollection("name")
o Delete collection: db.name.drop()
How to populate Database:
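o Single document (insertOne replaces the deprecated insert; values below are placeholders)
db.products.insertOne({
  name: "<name>",
  id: "<id>",
  model: "<model>",
  tags: ["<tag1>", "<tag2>", "<tag3>"]
})
o Multiple documents (pass an array to insertMany)
db.products.insertMany([
  { /* doc1 */ },
  { /* doc2 */ },
  { /* doc3 */ }
])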

Primary Keys
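o MongoDB automatically creates a PK field called _id (a 12-byte ObjectId by default)
o Older versions build it from a timestamp, machine id, process id, and counter;
since MongoDB 3.4 the machine and process ids are replaced by a random value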
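Because the timestamp comes first in the generated _id, the creation time can be read back from the value. The sketch below assumes the usual 24-character hexadecimal form of an ObjectId, whose first 8 characters hold the seconds since the Unix epoch. `objectIdTimestamp` is a hypothetical helper written for this note and `exampleId` is just an illustrative value.

```javascript
// Decode the creation time embedded in the first 4 bytes of an ObjectId.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const exampleId = "507f1f77bcf86cd799439011"; // illustrative value only
const created = objectIdTimestamp(exampleId);
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z
```

A side effect of this layout is that sorting by _id roughly sorts documents by creation time.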
Query
o To query: db.products.find()
o To filter query: db.collection.find({
key1: value
})
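What an equality filter does can be sketched in plain JavaScript, without a database. This is a simplified stand-in, not MongoDB's implementation: `matches` and `find` are hypothetical helpers, the `products` array is made-up data, and only top-level equality is handled (no query operators, nested fields, or array matching).

```javascript
// Hypothetical collection contents.
const products = [
  { name: "Laptop", model: "X1", tags: ["work"] },
  { name: "Laptop", model: "Z9", tags: ["gaming"] },
  { name: "Phone", model: "P5", tags: ["mobile"] },
];

// A document matches when every key in the filter has an equal value.
function matches(doc, filter) {
  return Object.entries(filter).every(([key, value]) => doc[key] === value);
}

// An empty filter matches everything, like db.products.find().
function find(collection, filter = {}) {
  return collection.filter((doc) => matches(doc, filter));
}

console.log(find(products).length);
console.log(find(products, { name: "Laptop" }).length);
console.log(find(products, { name: "Laptop", model: "Z9" }));
```

Listing several keys in one filter narrows the result, since all of them must match at once.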