19 April 2022 | 4 May 2022 | 10 May 2022 | 17 May 2022 | June 2022
Live Session Time (identical for each of the four live sessions): 9:30 AM–10:30 AM BST | 10:30 AM–11:30 AM CEST | 2:00 PM–3:00 PM IST
IDMC
Introduction
IDMC vs. IICS
IDMC: Informatica's Intelligent Data Management Cloud
IICS: Informatica Intelligent Cloud Services
FRAGMENTATION: Cloud Data Platforms | “Big Data” | SaaS Apps | Enterprise Data Warehouses | Enterprise Apps | Relational Databases | Mainframe
THEN TODAY
TYPE Structured and Unstructured
LATENCY Batch and Real-time
USERS Technical and Business Users
75% of organizations are actively migrating data to the cloud
90% of organizations, by 2022, will utilize multiple CSPs and will require significant augmented data management and integration
Cloud Modernization
CATALOG: Discover, catalog, and curate all enterprise data
INGEST: Multi-latency data ingestion and edge computing
INTEGRATE: Integrate all types of data
CLEANSE: Make data fit for purpose
RELATE: Match and relate identities and entities
GOVERN: Define and verify data governance policies
PROTECT: Detect and protect sensitive data
PREPARE: For analytics and collaborate on projects
SHARE & DELIVER: Publish and manage APIs and Data Services
Azure
Lack of agility and innovation
Application Cloud: Salesforce | Oracle | SAP | Adobe | ServiceNow | Coupa | Workday | Apptio
Analytics Cloud: Tableau | ThoughtSpot | Power BI
Database Cloud: Microsoft SQL Azure | MongoDB | Oracle | Amazon Aurora
Snowflake | Amazon Redshift | Google BigQuery | Azure | Databricks
Azure | Amazon | Google Cloud | Oracle | Rackspace
MULTI-HYBRID
DATA CATALOG | DATA INTEGRATION | API & APP INTEGRATION | DATA QUALITY | MDM & 360 APPLICATIONS | GOVERNANCE & PRIVACY | DATA MARKETPLACE
Connectivity
Metadata System of Record
SaaS Self-Managed
Cloud Modernization
Data Warehouse, Lakes and App Modernization
Source:
1 – Harvard Business Review Services Survey, 2019
DATA CONSUMERS
ETL Developer | Data Engineer | Citizen Integrator | Data Scientist | Data Analyst | Business Users
DISCOVER & UNDERSTAND | ACCESS & INTEGRATE | CONNECT & AUTOMATE | CLEANSE & TRUST | MASTER & RELATE | GOVERN & PROTECT | SHARE & DEMOCRATIZE
DATA CATALOG | DATA INTEGRATION | API & APP INTEGRATION | DATA QUALITY | MDM & 360 APPLICATIONS | GOVERNANCE & PRIVACY | DATA MARKETPLACE
Connectivity
Metadata System of Record
DATA SOURCES
SaaS Apps Sources + On-premises Sources + Real-time / Streaming Sources
Mainframe | Applications | Databases | IoT | Machine Data | Logs
Simplicity
Address Any Cloud Integration Need
Productivity
Only AI @ Scale Delivering the Enterprise System of Record for Metadata
Most Comprehensive Active Metadata Across the Enterprise
✓ Google-like search
✓ Amazon-like recommendations
DATA TRANSFORMATION RECOMMENDATIONS | RELATIONSHIP INFERENCE | COLUMN DATA SIMILARITY | SEARCH RANKING
DATASET RECOMMENDATIONS | VOLUME PROJECTIONS | NATURAL LANGUAGE DESCRIPTION OF CODE | OPERATIONAL ANOMALY DETECTION
BUSINESS TERM ASSOCIATIONS | SCHEMA INFERENCE | DATASET SIMILARITY | BUSINESS RULE ASSOCIATIONS
ECONOMIC VALUE OF DATA | SMART DATA VISUALIZATION | SCHEMA MAPPING | COST OF DATA BREACH
PREDICTIVE OPERATIONAL ANALYTICS | SCHEMA MATCHING | ENTITY EXTRACTION
Scale
SCALE: Performant & cost-effective data management engine
CODELESS AND SERVERLESS
Advanced Elastic Engine
Advanced Pushdown Optimization
Auto Scaling + Auto Tuning
True Serverless
3TB data processed under 2 hours
50X more performant than ETL
The Informatica Difference
Plug-and-Play Connectivity to Any Data Type
Achieve the core goal of delivering trusted, actionable data when and how the
business needs it
Aligned Program
[Architecture diagram, numbered steps 1 to 6]
Sources, Streaming: IoT | Machine Data | Apps | Log Files | Social | Mobile
Sources, On-Premises: Mainframe | Application Servers | Databases | Documents | Data Warehouse
Sources, SaaS: ERP | DRM
Ingestion and processing: Mass Ingestion | Stream Processing | Stream Storage | Data Integration & Quality | CDI-Elastic | CDI-Elastic/A-PDO | A-PDO | Data Provisioning
Storage: Cloud Data Lake | CDW | Landing Zone | Data Enrichment | Enterprise Zone
Analytics: Real-time Analytics | Enterprise Analytics | Line of Business / Self-Service Analytics | Data Science / AI
Users: Business User | Data Analyst | Line of Business | Data Engineer | Data Scientist
[Diagram labels: Corporate Network | Secure Agent Group | firewall | Cloud Applications | Data]
DEMO
Admin Service
Cloud Mass Ingestion (CMI)
Informatica Data Warehouse and DataLake Architecture
[Diagram labels: Files | Databases | Streaming | On-Premise | Machine Data | IoT | Mass Ingestion + Data Integration | Google Cloud Storage | ADLS Gen2 | Data Warehouse | Advanced Analytics | Dashboards | Self Service Analytics]
Cloud Connectivity | API Management | Streaming Ingestion | File Mass Ingestion | Database Mass Ingestion | Mass Ingestion Applications
• Transfer any file type with high performance and scalability
• Job and file level tracking and log monitoring
• Ingest relational database data from Oracle, SQL Server, and MySQL. Schema drift is also supported on CDC-supported databases
[Diagram labels: On-Premises Sources | Secure | MI Task | Update Job | MI Metadata | Cloud Data Integration | Google storage | Cloud | Redshift | S3 | Azure DW, Blob, Data Lake | Data Warehouses | Data Lakes]
54 © Informatica. Proprietary and Confidential.
Benefits of Mass Ingestion Databases
• Ingest streaming data: logs, clickstream, social media, Kafka, Kinesis, S3, ADLS, Firehose, etc.
• Real-time monitoring of ingestion jobs with lifecycle management and alerting in case of issues
[Diagram labels: WebLogs | Social Media | Messaging Systems | Data Lake & ML Consumption]
• The SaaS and on-premises applications used in your business or organization store large
amounts of business-critical data on a daily basis. You can use MIA to transfer the data
stored by your applications to cloud-based targets that can handle large volumes of
data.
• After you transfer the data to the target, you can consolidate the data and use it for
various purposes, such as advanced data analytics and data warehousing.
• Initial load
- Loads source data read at a single point in time to a target.
• Incremental load
- Loads data changes continuously or until the ingestion job is stopped or ends.
• Initial and Incremental load
- Performs an initial load of point-in-time data to the target and then automatically switches to
propagating incremental data changes made to the same source objects on a continuous
basis
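The three load types above can be illustrated with a small Python sketch. It is a toy model only: the `Source` class, its change log, and the functions `initial_load` and `incremental_load` are hypothetical names invented for this example and are not part of any Informatica API. The combined "initial and incremental" mode is modelled as one initial load followed by incremental loads that resume from the recorded log position.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-ins for a source system and a target.
# Nothing here is an Informatica API; the names are illustrative only.
Change = Tuple[str, str, str]  # (operation, key, value)


@dataclass
class Source:
    rows: Dict[str, str] = field(default_factory=dict)
    change_log: List[Change] = field(default_factory=list)

    def upsert(self, key: str, value: str) -> None:
        self.rows[key] = value
        self.change_log.append(("upsert", key, value))

    def delete(self, key: str) -> None:
        self.rows.pop(key, None)
        self.change_log.append(("delete", key, ""))


def initial_load(source: Source, target: Dict[str, str]) -> int:
    """Copy a point-in-time snapshot and return the log position at that time."""
    target.clear()
    target.update(source.rows)
    return len(source.change_log)


def incremental_load(source: Source, target: Dict[str, str], position: int) -> int:
    """Apply only the changes recorded after `position`; return the new position."""
    for operation, key, value in source.change_log[position:]:
        if operation == "upsert":
            target[key] = value
        else:
            target.pop(key, None)
    return len(source.change_log)


# Initial and incremental load: snapshot first, then keep applying changes.
source = Source()
source.upsert("1", "alice")
source.upsert("2", "bob")

target: Dict[str, str] = {}
pos = initial_load(source, target)

source.upsert("2", "bobby")
source.delete("1")
pos = incremental_load(source, target, pos)
print(target)
```

Keeping the log position between calls is what lets the incremental step skip data that the initial load already copied.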
• Ease of Use
• Templates and Wizards
• Micro-service Architecture
• Reusability
• Broad Hybrid and Multi-Cloud
Connectivity
• No coding across the platform
• Performance optimizations such as CDC, parallel processing, pushdown optimization, and Mass Ingestion
Task Flows
• Data Integration: Build a template once, then automate mapping execution for thousands of sources with different schemas
• Mapping self-adjusts dynamically to external schema changes and column characteristics
• Generic Source and Target with varying schemas
• Varying logic, e.g., apply TRIM for a varying number of String fields in the Source
Dynamic Mapping – Features
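The idea of one rule that adapts to whatever schema arrives can be sketched in plain Python. This is an illustration of the concept only, not how Informatica dynamic mappings are implemented: the function `trim_string_fields` and the sample `customers` and `orders` records are hypothetical, and Python's `str.strip()` (which removes all leading and trailing whitespace) stands in for TRIM.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def trim_string_fields(rows: Iterable[Row]) -> List[Row]:
    """Apply a TRIM-like rule to every string-valued field, whatever the schema.

    The rule is selected by field type rather than by field name, so the
    same function handles sources with different columns.
    """
    return [
        {name: value.strip() if isinstance(value, str) else value
         for name, value in row.items()}
        for row in rows
    ]


# Two hypothetical sources with different schemas and a different
# number of string fields.
customers = [{"id": 1, "name": "  Ada ", "city": "London  "}]
orders = [{"order_id": 7, "sku": " A-1 ", "qty": 2, "note": " rush "}]

trimmed_customers = trim_string_fields(customers)
trimmed_orders = trim_string_fields(orders)
print(trimmed_customers)
print(trimmed_orders)
```

Because the rule keys on type, adding or removing a string column in either source requires no change to the function.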
Same, familiar Informatica Design-Time
✓ Auto-Scaling
✓ Multiple AWS regions
✓ Resiliency and HA
✓ Tenant Isolation
[Diagram labels: Web browser (Build & manage) | Microservices | Metadata | Mappings | Informatica VPC | Customer VPC | Elastic Compute | Data | VPN | DMZ]
CDI-e: Automated Performance Tuning
Powered by CLAIRE
Manual work: 30% of your engineers' time
Frequent outages: pager ringing at 3 AM
Slow and expensive: missing SLAs every week
Developer cycle: pick new parameters, run the job, analyze the logs
[Diagram labels: Source File Directory | Execute Elastic Mapping | Process]
• Solution:
• CDI-Elastic can track data that has been processed during a previous run of an MCT by
persisting the state information of the job run.
• Incremental File Load is a feature of CDI-Elastic which will maintain the state information and
prevent reprocessing of old data.
• Time travel lets you go back to an earlier point in time and reprocess files
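A minimal Python sketch of the state-tracking idea follows. It is not CDI-Elastic's implementation: the JSON state file, the use of file modification times as the checkpoint, and the names `load_checkpoint`, `run_incremental`, and `reprocess_from` are all assumptions made for illustration. Passing `reprocess_from` stands in for time travel by overriding the stored checkpoint for a single run.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import List, Optional


def load_checkpoint(state_file: Path) -> float:
    """Return the newest modification time already processed (0.0 if none)."""
    if state_file.exists():
        return json.loads(state_file.read_text())["last_mtime"]
    return 0.0


def run_incremental(source_dir: Path, state_file: Path,
                    reprocess_from: Optional[float] = None) -> List[str]:
    """Process only files newer than the checkpoint, then persist the new state.

    `reprocess_from` is a hypothetical "time travel" override: it replaces
    the stored checkpoint for this run so older files are picked up again.
    """
    stored = load_checkpoint(state_file)
    checkpoint = stored if reprocess_from is None else reprocess_from
    new_files = sorted(
        (p for p in source_dir.iterdir()
         if p.is_file() and p.stat().st_mtime > checkpoint),
        key=lambda p: p.stat().st_mtime,
    )
    if new_files:
        newest = max(p.stat().st_mtime for p in new_files)
        # Never move the stored checkpoint backwards during a replay.
        state_file.write_text(json.dumps({"last_mtime": max(newest, stored)}))
    return [p.name for p in new_files]


with tempfile.TemporaryDirectory() as tmp:
    source_dir = Path(tmp) / "source"
    source_dir.mkdir()
    state_file = Path(tmp) / "state.json"  # kept outside the source directory

    (source_dir / "a.csv").write_text("1\n")
    os.utime(source_dir / "a.csv", (1000, 1000))  # fixed mtime for the demo
    first_run = run_incremental(source_dir, state_file)

    (source_dir / "b.csv").write_text("2\n")
    os.utime(source_dir / "b.csv", (2000, 2000))
    second_run = run_incremental(source_dir, state_file)

    third_run = run_incremental(source_dir, state_file)
    replay = run_incremental(source_dir, state_file, reprocess_from=0.0)

print(first_run, second_run, third_run, replay)
```

One limitation of this sketch: a file whose modification time equals the checkpoint exactly is treated as already processed, so a production design would need a finer-grained record, such as a set of processed file names.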