
INTRODUCTION TO THE PENN STATE DATA WAREHOUSE

Where Have We Been?

Transactions processed on the mainframe: ISIS, IBIS, ADIS
Many years of historical data, millions of mainframe records
Accessing mainframe data requires knowledge of and skill in the Natural programming language

Steps Required to get Data from the Mainframe


Security forms to get access to files

Complete form. Specify purpose of request, what data is required. Signed by Access and Security Representative (ASR). Must be approved by all data stewards (may be several).

Steps Required to get Data from the Mainframe


Security forms to get access to files
Write a Natural program


Natural is a complex language, requires programming skills and extensive training. File structures are complex.

Steps Required to get Data from the Mainframe


Security forms to get access to files
Write a Natural program
Write the JCL

Use Roscoe system to write Job Control Language.

Steps Required to get Data from the Mainframe


Security forms to get access to files
Write a Natural program
Write the JCL
Submit the job

Goes into a queue with other similar jobs: first in, first out.

Steps Required to get Data from the Mainframe


Security forms to get access to files
Write a Natural program
Write the JCL
Submit the job
Wait at least overnight

Jobs in queue run only at night. Queues can be quite long. Some jobs will wait several days in the queue until they reach the top.

Steps Required to get Data from the Mainframe


Security forms to get access to files
Write a Natural program
Write the JCL
Submit the job
Wait at least overnight
Job failed? Repeat steps

Check the status. If anything failed, fix problems and repeat the steps.

Steps Required to get Data from the Mainframe

Security forms to get access to files
Write a Natural program
Write the JCL
Submit the job
Wait at least overnight
Job failed? Repeat steps
Days or weeks to complete

An experienced Natural programmer writing a simple job will take several days of coding, testing, and running. Anything complicated can take longer.

Solution to a problem -- AIDA


Administrative Information Decision Aid (AIDA)
Available for ISIS (registration, admissions, enrollment) and IBIS (human resources) data
Run on the mainframe against extract (i.e. not live data) files
Create reports or data files for downloading

Solution to a problem -- AIDA

AIDAs were developed to fulfill ad-hoc reporting needs


DISADVANTAGES:
Inflexible
Outdated technology
Long development time for a new AIDA
Difficult to change or to add new features

ADVANTAGES:
No programming required
Easy to use
Available to everyone, even from terminals
Eliminated a lot of ad-hoc programming

What is a Data Warehouse?

The consolidation of data from mainframe legacy systems into subject-oriented tables that are accessible through desktop tools

What is a Data Warehouse?


Move the data from the mainframe to a server where it can be accessed from the user's PC

Data extracted periodically. Changes on the mainframe may not be reflected on the warehouse for a week or more.

Snapshot data, NOT live

Transcripts are on the warehouse, but official transcripts are only available through ISIS.

Use for data analysis, NOT operational

AIS Three-Tier Data Structure


EIS: summary data
Enterprise Information System extracts data from the warehouse and summarizes it.

Data Warehouse: detail data extracted periodically
Data Warehouse extracts detail data from the mainframe on a periodic schedule.

ISIS, IBIS, ADIS: detail transactions processed immediately
On-line transactions update the mainframe systems immediately.
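
The difference between the detail tier and the summary tier can be sketched with a small query. This is only an illustration: Python's built-in sqlite3 module stands in for the warehouse, and the table `enrollment_detail`, its columns, and the sample rows are hypothetical, not taken from the actual Penn State warehouse.

```python
import sqlite3

def summarize_enrollment(conn):
    """Return (campus, student_count) rows summarized from detail records."""
    cur = conn.execute(
        "SELECT campus, COUNT(*) AS student_count "
        "FROM enrollment_detail "
        "GROUP BY campus "
        "ORDER BY campus"
    )
    return cur.fetchall()

# Hypothetical detail rows, one per enrolled student (the warehouse tier).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollment_detail (student_id INTEGER, campus TEXT, semester TEXT)"
)
conn.executemany(
    "INSERT INTO enrollment_detail VALUES (?, ?, ?)",
    [
        (1, "University Park", "FA"),
        (2, "University Park", "FA"),
        (3, "Altoona", "FA"),
    ],
)

# The summary tier keeps only the aggregated counts, not the individual rows.
summary = summarize_enrollment(conn)
print(summary)
```

A summary system would store or display only the aggregated rows, while the warehouse keeps the underlying detail available for ad-hoc questions.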

Data Transformation
Data goes through a series of steps as it is moved to the warehouse:
Extract programs

Write Natural programs to extract data from the mainframe data base

Data Transformation
Data goes through a series of steps as it is moved to the warehouse:
Extract programs
Verify data

Verify accuracy and consistency of data -- ensure data legibility

Data Transformation
Data goes through a series of steps as it is moved to the warehouse:
Extract programs
Verify data
Create tables

Create normalized tables on the warehouse -- eliminate data redundancy (e.g. an address appears in one place only)

Data Transformation
Data goes through a series of steps as it is moved to the warehouse:
Extract programs
Verify data
Create tables
Load tables

Load warehouse tables with extracted data

Data Transformation
Data goes through a series of steps as it is moved to the warehouse:
Extract programs
Verify data
Create tables
Load tables
Refresh data

Establish a schedule to refresh the data. Frequency depends on volatility of the data. Some data is refreshed weekly, some once per semester.
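
The five steps above can be sketched end to end. This is a minimal sketch under stated assumptions, not the warehouse's actual process: the real extracts are written in Natural, whereas here a hard-coded list stands in for the extract file, sqlite3 stands in for the warehouse server, and every table name, column, and sample row is hypothetical.

```python
import sqlite3

# Hypothetical extract: rows as they might arrive from a mainframe extract file.
EXTRACTED_ROWS = [
    {"student_id": "1001", "name": "A. Student", "address": "101 Main St"},
    {"student_id": "1002", "name": "B. Student", "address": "202 Oak Ave"},
    {"student_id": "", "name": "Bad Record", "address": "Unknown"},  # fails verification
]

def verify(rows):
    """Verify data: keep only rows with a numeric student_id."""
    return [r for r in rows if r["student_id"].isdigit()]

def create_tables(conn):
    """Create normalized tables: each address is stored in one place only."""
    conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE address ("
        "student_id INTEGER PRIMARY KEY REFERENCES student(student_id), "
        "address TEXT)"
    )

def load(conn, rows):
    """Load verified rows into the warehouse tables."""
    for r in rows:
        sid = int(r["student_id"])
        conn.execute("INSERT INTO student VALUES (?, ?)", (sid, r["name"]))
        conn.execute("INSERT INTO address VALUES (?, ?)", (sid, r["address"]))
    conn.commit()

def refresh(conn, rows):
    """Refresh data: replace the current snapshot with a newly extracted one."""
    conn.execute("DELETE FROM address")
    conn.execute("DELETE FROM student")
    load(conn, verify(rows))

conn = sqlite3.connect(":memory:")
create_tables(conn)
refresh(conn, EXTRACTED_ROWS)
loaded_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(loaded_count)
```

Calling `refresh` again on a schedule (weekly, or once per semester) would replace the snapshot, which is why warehouse data can lag behind the mainframe.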

How is the Warehouse Accessed?

Query Tool to retrieve data


Tools are off-the-shelf software that run on the desktop. Users can purchase whatever package they want based on platform, price, preference.

Diagram: End Users use a Query Tool to retrieve data from the Data Warehouse.

Features of Query Tools

Easy access to data


Programming ability not required. Tools have Graphical User Interface -- point and click.

Features of Query Tools


Easy access to data
Quick results

Results are returned quickly, often within minutes.

Features of Query Tools


Easy access to data
Quick results
Interactive approach to creating reports

Query criteria can be easily and quickly adjusted to modify results or obtain additional data
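
Behind the point-and-click interface, adjusting criteria amounts to changing the conditions of a SQL query and running it again. The sketch below illustrates the idea, assuming a hypothetical `enrollment` table with made-up columns and rows, and using Python's built-in sqlite3 module in place of a real warehouse connection.

```python
import sqlite3

def run_query(conn, min_credits, campus=None):
    """Run the same query with adjustable criteria, as a query tool would."""
    sql = "SELECT student_id, campus, credits FROM enrollment WHERE credits >= ?"
    params = [min_credits]
    if campus is not None:
        # Optional extra criterion, added only when the user asks for it.
        sql += " AND campus = ?"
        params.append(campus)
    sql += " ORDER BY student_id"
    return conn.execute(sql, params).fetchall()

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (student_id INTEGER, campus TEXT, credits INTEGER)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [(1, "University Park", 15), (2, "Altoona", 9), (3, "University Park", 12)],
)

first = run_query(conn, 12)                       # initial criteria
adjusted = run_query(conn, 9, campus="Altoona")   # criteria adjusted and re-run
print(first)
print(adjusted)
```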

Features of Query Tools


Easy access to data
Quick results
Interactive approach to creating reports
Data available on the user's desktop

Results returned directly to desktop

Features of Query Tools


Easy access to data
Quick results
Interactive approach to creating reports
Data available on the user's desktop
Manipulation of data for customized report layout

Data can be exported to other desktop software for formatting and analysis
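
One common way to hand query results to other desktop software is a comma-separated file, which spreadsheets can open directly. The following is a small sketch of that idea, not a description of any particular query tool; the `staff` table and its contents are hypothetical, and sqlite3 again stands in for the warehouse.

```python
import csv
import io
import sqlite3

def export_to_csv(conn, sql, out):
    """Write the query's column names and rows to a file-like object as CSV."""
    cur = conn.execute(sql)
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cur.description])  # header row
    writer.writerows(cur)                                  # data rows

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (dept TEXT, headcount INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?)", [("Physics", 30), ("History", 12)])

# An in-memory buffer is used here; a real export would open a file on disk.
buffer = io.StringIO()
export_to_csv(conn, "SELECT dept, headcount FROM staff ORDER BY dept", buffer)
print(buffer.getvalue())
```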

Features of Query Tools


Easy access to data
Quick results
Interactive approach to creating reports
Data available on the user's desktop
Manipulation of data for customized report layout
Unlimited retrieval of data

Amount of data retrieved limited only by query tool and size of PC hard drive

Query Tool Requirements


To access Penn State's data warehouse, you can purchase any query tool that satisfies the following two criteria:
Uses Structured Query Language (SQL)
Supports Open Data Base Connectivity (ODBC)

Readily Available Query Tools


Microsoft Access
Microsoft Excel
Microsoft Query

If you have Microsoft Office, you already have three query tools on your desktop that will work with Penn State's data warehouse.

Who Can Access the Warehouse?


Users in departments, colleges, campuses

Hundreds of Penn State data warehouse users in departments, colleges, campuses.

Who Can Access the Warehouse?


Users in departments, colleges, campuses

Two requirements for warehouse users:
Understand the data

This understanding is critical and not simple to acquire. If you've used AIDA or the mainframe systems, you already have some of this knowledge. Takes time and training.

Who Can Access the Warehouse?


Users in departments, colleges, campuses

Two requirements for warehouse users:
Understand the data
Familiar with query tools

Hands-on training available

How to Get Connected


A PC or Macintosh connected to the Penn State Backbone with TCP/IP communications software
SQL Client License or MS BackOffice license (purchased from the Microcomputer Order Center)

For Windows, SQL Client Tools (provided by the MOC when the SQL Client License is purchased) must be installed on the PC. Not required for Macintosh.

Open Data Base Connectivity (ODBC) compliant query tool

First Steps for New Users


Be authorized as a warehouse user

Request access for each database separately. Requests submitted via e-mail. Instructions on the web site.

First Steps for New Users


Be authorized as a warehouse user
Set up your workstation


Purchase client license. Install client tools for Windows. Instructions on the web site.

First Steps for New Users


Be authorized as a warehouse user
Set up your workstation
Select a query tool


ODBC-compliant. Install on your PC.

First Steps for New Users


Be authorized as a warehouse user
Set up your workstation
Select a query tool
Get training

Training offered through HRDC and TLT: hands-on and web-based

First Steps for New Users


Be authorized as a warehouse user
Set up your workstation
Select a query tool
Get training
Learn the data

Work with it; documentation on the web site; help available from steward offices.

First Steps for New Users


Be authorized as a warehouse user
Set up your workstation
Select a query tool
Get training
Learn the data
Get help

User groups, listservs, web site

More Detailed Information ... on our Web Site http://ais.its.psu.edu/dataware/


Web site contains data documentation, instructions for setting up your PC to access the warehouse, sample queries, schedules for training and user group meetings.

Benefits of the Warehouse

Improves access to administrative information for faculty and staff


Can get data quickly and easily to do analysis. We can work with better information, make decisions based on data.

Benefits of the Warehouse


Improves access to administrative information for faculty and staff
Reduces cost

Mainframe can be used for transaction processing. Programmers can use skills to enhance the mainframe systems instead of writing ad-hoc reports.

Benefits of the Warehouse


Improves access to administrative information for faculty and staff
Reduces cost
Timely retrieval of data

Can retrieve data in minutes rather than days.

Benefits of the Warehouse


Improves access to administrative information for faculty and staff
Reduces cost
Timely retrieval of data
Flexibility

Data can be retrieved with limitless combinations of criteria and can be exported into other desktop tools for analysis and manipulation.

What Does the Future Hold?


More data will be added to the warehouse
Mainframe will primarily be used for transaction processing
