Abstract:
The topic of data warehousing encompasses architectures, algorithms, and tools for bringing
together selected data from multiple databases or other information sources into a single repository, called a
data warehouse, suitable for direct querying or analysis. Data warehousing provides a single, unified enterprise
data integration platform that allows companies and government organizations of all sizes to access,
discover, and integrate data from virtually any business system, in any format, and to deliver that data
throughout the enterprise at any speed. A mapping is a set of source and target definitions linked by
transformation objects that define the rules for data transformation. Mappings represent the data flow
between sources and targets. When the Integration Service runs a session, it uses the instructions configured
in the mapping to read, transform, and write data; this is the ETL (Extract, Transform, and Load) process.
INTRODUCTION
A data warehouse is a single source for the key corporate information needed to enable business
decisions. An application that updates the database is called an on-line transaction processing (OLTP)
application. An application that issues queries against a read-only database is called a decision support
system (DSS). A data mart is a subset of the data warehouse that may make it simpler for users to access
key corporate data. The warehouse obtains results from the information sources and performs the
appropriate translation, filtering, and merging before delivering the information to the user or application.
Figure 1: Informatica Server Architecture
The architecture of conventional ETL is shown in Fig. 1. The phases of extract, transform, and load
are executed in one single process. Under the conventional framework, the ETL process is defined [1] as
follows: for each data source, develop and compile a program or script; retrieve records from the source
database; after extraction, convert the data according to the user's requirements; load the data into the target
data warehouse; and process the records piece by piece until the end of the source database. The integrator is
responsible for installing the information in the warehouse, which may include filtering the information,
summarizing it, or merging it with information from other sources. In order to properly integrate new
information into the warehouse, it may be necessary for the integrator to obtain further information from the
same or different information sources to achieve the desired performance. The architecture and basic
functionality described here are more general than those provided by most commercial data warehousing
systems. In particular, current systems usually assume that the sources and the warehouse subscribe to a
single data model, normally relational, and that propagation of information from the sources to the
warehouse is performed as a batch process.
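The record-by-record loop described above can be sketched in Python. This is only a minimal illustration of the single-process model, not the paper's system: the source rows, field layout, and transformation rule are assumptions made for the example.

```python
# Minimal sketch of the conventional single-process ETL loop described
# above. The source rows, field layout, and transformation rule are
# illustrative assumptions, not taken from the paper.

def extract(source_rows):
    """Retrieve records from the source, one at a time."""
    yield from source_rows

def transform(row):
    """Convert the data according to the user's requirement
    (here: trim and upper-case the department name)."""
    empno, dname = row
    return (empno, dname.strip().upper())

def load(row, warehouse):
    """Install the record in the target warehouse."""
    warehouse.append(row)

source = [(1, " sales "), (2, "research")]
warehouse = []
for record in extract(source):          # piece by piece until the end
    load(transform(record), warehouse)

print(warehouse)  # → [(1, 'SALES'), (2, 'RESEARCH')]
```

Because each record passes through all three phases inside one process, a slow transform stalls the whole pipeline, which is the limitation batch-oriented commercial tools address.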
A Source Qualifier can be described as a default transformation that appears when we select a
source for the mapping. It provides a default SQL query, which can be generated by clicking Generate
SQL (Properties tab > SQL Query attribute). You can have multiple sources in a mapping, but you can
have only one Source Qualifier for the mapping. You can enter any SQL statement supported by
your source database, with proper joins to the other sources. When you drag a source (representing data read
from relational or flat-file sources) into the Mapping Designer workspace [2] (Fig. 2), you add an instance of
the source definition to the mapping, and one Source Qualifier automatically comes with it; if we are
using multiple sources, multiple Source Qualifiers will automatically pop up (as shown in the screenshot
below), which requires us to delete the additional Source Qualifiers [3].
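As an illustration of the kind of join query that can be entered in the SQL Query attribute, the following sketch uses sqlite3 as a stand-in for the real source database; the EMP/DEPT tables and their columns are assumptions made for the example, not part of the original mapping.

```python
# Illustrative sketch of a join query such as one entered in the Source
# Qualifier's SQL Query attribute. sqlite3 stands in for the real source
# database; the EMP/DEPT tables and columns are assumed for the example.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE emp  (empno INTEGER, ename TEXT, deptno INTEGER);
    CREATE TABLE dept (deptno INTEGER, dname TEXT);
    INSERT INTO emp  VALUES (7369, 'SMITH', 20), (7499, 'ALLEN', 30);
    INSERT INTO dept VALUES (20, 'RESEARCH'), (30, 'SALES');
""")

# One SQL statement with a proper join between the two sources.
query = """
    SELECT e.empno, e.ename, d.dname
    FROM emp e JOIN dept d ON e.deptno = d.deptno
    ORDER BY e.empno
"""
rows = con.execute(query).fetchall()
print(rows)  # → [(7369, 'SMITH', 'RESEARCH'), (7499, 'ALLEN', 'SALES')]
```

Joining the two sources in one query this way is what lets a single Source Qualifier serve multiple source definitions.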
2.1 Mapping:
Fig. 3 logically defines the ETL process.
Figs. 4 and 5: In Windows, in the System DSN tab of the ODBC Data Source Administrator, create a user
ID with the DBA grant and the default TNS name ORCL. Create two ODBC connections named scott_source
and kanth_target.
Figure 6: Source Table in the Source Analyzer
Fig. 6: When you add a relational or a flat-file source definition to a mapping, you need to connect it to a
Source Qualifier transformation [4]. The Source Qualifier transformation represents the records that the
Informatica server reads when it runs a session.
Fig. 7: Target definitions define the structure of tables in the target database, or the structure of the file
targets the PowerCenter Server creates when you run a workflow. If you add a target definition to the
repository that does not yet exist in a relational database, you need to create the target tables in your target
database [5]. You do this by generating and executing the necessary SQL code within the Warehouse Designer.
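The following sketch shows what "generating and executing the necessary SQL" amounts to: building a CREATE TABLE statement from a target definition and running it against the target database. The T_EMPLOYEES definition here is hypothetical, and sqlite3 stands in for the real target database.

```python
# Sketch of generating and executing target-table DDL from a target
# definition. T_EMPLOYEES and its columns are a hypothetical example;
# sqlite3 stands in for the real target database.
import sqlite3

target_definition = {
    "name": "T_EMPLOYEES",
    "columns": [("EMPNO", "INTEGER"), ("ENAME", "TEXT"), ("SAL", "REAL")],
}

cols = ", ".join(f"{c} {t}" for c, t in target_definition["columns"])
ddl = f"CREATE TABLE {target_definition['name']} ({cols})"
print(ddl)  # → CREATE TABLE T_EMPLOYEES (EMPNO INTEGER, ENAME TEXT, SAL REAL)

con = sqlite3.connect(":memory:")
con.execute(ddl)  # the Warehouse Designer executes the generated DDL
```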
Figure 8: The logical ports of the Expression Transformation
Fig 8. In the following steps, you will copy the EMPLOYEES source definition into the
Warehouse Designer to create the target definition [6]. Then, you will modify the target definition by
deleting and adding columns to create the definition you want.
IIF(ISNULL(COMM),SAL,SAL+COMM)
Fig. 10: The expression above operates on the EMP table. When V_TOTAL is above 3000, tax on the
salary is calculated at 0.25; when V_TOTAL is 3000 or below, it is calculated at 0.15:
IIF(V_TOTAL>3000, V_TOTAL*0.25, V_TOTAL*0.15)
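The two IIF expressions above can be checked with a small Python equivalent; None stands in for a NULL COMM, and the sample salary and commission values are made up for the illustration.

```python
# Python equivalent of the two IIF expressions above, for checking the
# logic. None stands in for a NULL COMM; the sample values are made up.

def v_total(sal, comm):
    # IIF(ISNULL(COMM), SAL, SAL + COMM)
    return sal if comm is None else sal + comm

def tax(total):
    # IIF(V_TOTAL > 3000, V_TOTAL * 0.25, V_TOTAL * 0.15)
    return total * 0.25 if total > 3000 else total * 0.15

print(v_total(2500, None))  # → 2500
print(v_total(2500, 600))   # → 3100
print(tax(3100))            # → 775.0  (above 3000, taxed at 0.25)
print(tax(2500))            # 3000 or below, taxed at 0.15
```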
Go to Transformation (under the toolbar), select Create, and select Expression from the dropdown.
Fig. 13: The diagram above shows the Workflow Designer connected to the session; within this session, the
read and write operations defined in the Mapping Designer are performed internally.
Figure 14: Executing the Informatica PowerCenter Workflow Monitor
Fig. 14: The Informatica PowerCenter Workflow Monitor displays the progress of the executing process and
reports whether it succeeded or failed.
REFERENCES
[1] W. Inmon, D. Strauss, and G. Neushloss, DW 2.0: The Architecture for the Next Generation of Data
Warehousing. Burlington, MA: Morgan Kaufmann, 2008, pp. 215-229.
[2] R. J. Davenport, "ETL vs. ELT: A Subjective View," Insource IT Consulting Ltd., U.K., September 2007.
[Online]. Available: http://www.insource.co.uk/pdf/ETL_ELT.pdf
[3] T. Jun, C. Kai, F. Yu, and T. Gang, "The Research and Application of ETL Tools in Business Intelligence
Project," in Proc. International Forum on Information Technology and Applications, IEEE, 2009,
pp. 620-623.
[7] L. Troy and C. Pydimukkala, "How to Use PowerCenter with Teradata to Load and Unload Data,"
Informatica Corporation. [Online]. Available: www.myinformatica.com