
What Does “Ab Initio” Mean?

• Ab Initio is Latin for “From the Beginning.”


• From the beginning our software was
designed to support the largest, most complex
business applications. Crucial capabilities like
parallelism and checkpointing can’t be added
after the fact.
• The Graphical Development Environment and a
powerful set of components allow our
customers to get valuable results from the
beginning.

Confidential & Proprietary


Ab Initio’s focus

• “Big Data” problems
  • High volume
  • High complexity
• High-performance, scalable solutions
• High-productivity development



Ab Initio Software

• Ab Initio software is a general-purpose data processing platform for
enterprise-class, mission-critical applications such as:
  • Data warehousing
  • Batch processing
  • Click-stream analysis
  • Data movement
  • Data transformation



Parallel Computer
Architecture
• Computers come in many “shapes and sizes”:
  • Single-CPU
  • Multi-CPU
  • Network of single-CPU nodes
  • Network of multi-CPU nodes

• Multi-CPU machines are often called SMPs (for Symmetric Multiprocessors).

• Specially built networks of machines are often called MPPs (for
Massively Parallel Processors).



A Single-CPU Computer

[Diagram labels: Processor, Bus, Memory, Disk]



A Multi-CPU Computer (SMP)



A Network of Single-CPU
Nodes

[Diagram label: Network]
If all of these nodes make up one computer, it may be an MPP.



A Network of Multi-CPU
Nodes



A Network of Networks



Ab Initio Provides For:

• Distribution - a platform for applications to run on collections of CPUs.

• Complexity - the ability for applications to run in parallel on any
combination of single-CPU computers, multi-CPU computers, and networks
of computers.
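
The idea of running one application in parallel across several CPUs can be sketched in plain Python. This is only an illustration of data parallelism, not Ab Initio code: the function names, the `customer_id` and `amount` fields, and the use of a thread pool as a stand-in for separate CPUs or nodes are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def partition_by_key(records, key, n_partitions):
    """Split records into n_partitions lists using a hash of the key field."""
    partitions = [[] for _ in range(n_partitions)]
    for record in records:
        partitions[hash(record[key]) % n_partitions].append(record)
    return partitions

def total_amount(partition):
    """Work done independently on each partition."""
    return sum(record["amount"] for record in partition)

# Hypothetical input data: 100 records spread over 7 customers.
records = [{"customer_id": i % 7, "amount": i} for i in range(100)]
partitions = partition_by_key(records, "customer_id", 4)

# Each partition is processed by its own worker (standing in for a CPU or node).
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_totals = list(pool.map(total_amount, partitions))

# Combine the per-partition results into one answer.
grand_total = sum(partial_totals)
```

Because every record with the same key lands in the same partition, per-key work never needs data from another worker; only the final combine step touches all partitions.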



Applications of Ab Initio
Software

• “Big Data” processing.

• Parallel execution of existing applications.

• Parallel sort/merge processing.

• Data transformation.

• Rehosting of corporate data.



Applications of Ab Initio
Software

• Front end of Data Warehouse:
  • Transformation of disparate sources
  • Aggregation and other preprocessing
  • Referential integrity checking
  • Database loading

• Back end of Data Warehouse:
  • Extraction for external processing
  • Aggregation and loading of Data Marts



Ab Initio Product Architecture

[Layered diagram, top to bottom]

User Applications

Development Environments: GDE, Shell, C++

Component Suite (Partitioners, Transforms, ...) | User Components | 3rd Party Components

The Co>Operating System

Native Operating Systems (Unix, Windows, OS/390)



Co>Operating System Runs on:
• Sun Solaris 2.6, 7, and 8 (SPARC)
• IBM AIX 4.2 and 4.3
• Hewlett-Packard HP-UX 10.20, 11.00, and 11.11
• Siemens Pyramid Reliant UNIX Release 5.43
• IBM DYNIX/ptx 4.4.6, 4.4.8, 4.5.1, and 4.5.2
• Silicon Graphics IRIX 6.5
• Red Hat Linux 6.2 and 7.0 (x86)
• Windows NT 4.0 (x86) with SP 4, 5 or 6
• Windows 2000 (x86) with no service pack or SP1
• Digital UNIX V4.0D (Rev. 878) and 4.0E (Rev. 1091)
• Compaq Tru64 UNIX Versions 4.0F (Rev 1229) and 5.1 (Rev 732)
• IBM OS/390 Version 2.8, 2.9, and 2.10
• NCR MP-RAS 3.02



Connectivity to Other
Software
• Common, high-performance database interface:
  • IBM DB2, DB2/PE, UDB
  • Oracle
  • Informix XPS
  • Sybase
  • Teradata
  • MS SQL Server 7
• Other software packages:
  • SAS
  • Trillium
  • Postalsoft
  • ...



Co>Operating System
Services

• Parallel and distributed application execution
  • Control
  • Data transport
• Transactional semantics at the application level.
• Checkpointing.
• Monitoring and debugging.
• Parallel file management.
• Metadata-driven components.



Components

• A component is a program.

• Components may run on any computer running the Co>Operating System.

• Different components do different jobs.

• The particular work a component accomplishes depends on its
parameter settings.

• Some parameters are computational metadata.
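
As a rough analogy for how one component does different work depending on its parameters, consider the Python sketch below. It is not Ab Initio code: `filter_records` and its `select` parameter are hypothetical names, and passing a function as a parameter only loosely mirrors the idea of a parameter that carries computational metadata.

```python
def filter_records(records, select):
    """Hypothetical component: keep the records for which `select` is true."""
    return [record for record in records if select(record)]

# Hypothetical input records.
records = [
    {"region": "east", "amount": 50},
    {"region": "west", "amount": 150},
    {"region": "east", "amount": 200},
]

# The same component, configured twice with different parameter settings.
large = filter_records(records, select=lambda r: r["amount"] > 100)
east = filter_records(records, select=lambda r: r["region"] == "east")
```

The program itself never changes; only the parameter value decides which records pass through.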



Datasets

• A dataset is a source or destination of data. It can be a file, a
database table, a SAS dataset, ...
• Datasets may reside on any machine running the Co>Operating System.
• Datasets may reside on other machines if connected by FTP or
database middleware.
• Data is described by record format metadata.
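
To make "data is described by record format metadata" concrete, here is a minimal Python sketch in which a list of field names and types drives the parsing of a delimited record. The format shown, the field names, and `parse_record` are assumptions for illustration; this is not Ab Initio's record format language.

```python
# Hypothetical record format: field name plus a converter for its type.
RECORD_FORMAT = [("id", int), ("name", str), ("balance", float)]

def parse_record(line, record_format, delimiter=","):
    """Turn one delimited line into a dict, guided only by the record format."""
    values = line.rstrip("\n").split(delimiter)
    if len(values) != len(record_format):
        raise ValueError(
            f"expected {len(record_format)} fields, got {len(values)}"
        )
    return {
        name: convert(value)
        for (name, convert), value in zip(record_format, values)
    }

record = parse_record("42,Ada,10.5\n", RECORD_FORMAT)
```

The parser contains no knowledge of the data itself, so the same code can read a different dataset once it is given a different record format.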



The Graph Model



The Graph Model: Naming
the Pieces
[Figure: a graph with its pieces labeled: datasets, components, and flows]



The Graph Model: Some
Details
[Figure: a graph with labels for ports, record format metadata, and
expression metadata]
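
The pieces named on these slides (datasets, components, flows, ports, and the record format attached to a port) can be modeled as a small data structure. The Python sketch below is an assumption-laden illustration, not the product's internal representation: the class names, the node and port names, `customer.dml`, and the rule that both ends of a flow should share a record format are all invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A dataset or a component; both expose named ports."""
    name: str
    kind: str  # "dataset" or "component"
    ports: dict = field(default_factory=dict)  # port name -> record format name

@dataclass
class Flow:
    """A connection from one node's port to another node's port."""
    source: tuple  # (node name, port name)
    target: tuple  # (node name, port name)

# Hypothetical three-node graph: dataset -> component -> dataset.
nodes = [
    Node("customers", "dataset", {"read": "customer.dml"}),
    Node("filter_active", "component",
         {"in": "customer.dml", "out": "customer.dml"}),
    Node("active_customers", "dataset", {"write": "customer.dml"}),
]
flows = [
    Flow(("customers", "read"), ("filter_active", "in")),
    Flow(("filter_active", "out"), ("active_customers", "write")),
]

def check_flows(nodes, flows):
    """Return the flows whose two ports disagree on record format."""
    by_name = {node.name: node for node in nodes}
    return [
        flow for flow in flows
        if by_name[flow.source[0]].ports[flow.source[1]]
        != by_name[flow.target[0]].ports[flow.target[1]]
    ]

mismatched = check_flows(nodes, flows)
```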



File Extensions
You have just created a set of directories to hold
the pieces of your graphs:
• mp - graphs
• dml - record formats
• xfr - transforms
• db - database-related files
• run - deployed graph scripts
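
For readers who want to reproduce this directory layout by hand, the following Python sketch creates the five directories listed above. The directory names and their purposes come from the slide; the `create_project_dirs` helper, the `my_project` root, and the use of a temporary directory are assumptions for the example.

```python
import tempfile
from pathlib import Path

# Directory name -> what it holds (taken from the list above).
PROJECT_DIRS = {
    "mp": "graphs",
    "dml": "record formats",
    "xfr": "transforms",
    "db": "database-related files",
    "run": "deployed graph scripts",
}

def create_project_dirs(root):
    """Create one subdirectory per entry in PROJECT_DIRS under root."""
    root = Path(root)
    for name in PROJECT_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    return sorted(p.name for p in root.iterdir() if p.is_dir())

# Use a temporary location so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    created = create_project_dirs(Path(tmp) / "my_project")
```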
