
Design Specification for GIM Heatmap v.

1 APPENDIX

1.1 Robustness for GIM-Heatmap


Robustness: The new Robustness process checks the validity of both full and delta loads.
We have a new source system, IRMS (IS for short). In the IRMS Stage layer schema of the Sophia
DWH, SPH_STG_IS, the following tables are created for each IRMS source table. The examples
below use the source table IRMS.REQUEST:
1. SPH_STG_IS.H_<table_name>: Loads the data from the source table incrementally, for example
SPH_STG_IS.H_REQUEST for IRMS.REQUEST. The table structure is the same as the source table,
without the unused CLOB columns and with the additional FEED_ID and FEED_RUN_ID columns.

2. SPH_STG_IS.IS_AUDIT_TABLE_<table_name>: Holds the Insert, Update and Delete record counts
used for the Robustness check, for example SPH_STG_IS.IS_AUDIT_TABLE_REQUEST. The table
structure is given below:

COLUMN NAME            DATA TYPE       REMARK
CB_SYS_IT_SYSTEM_NO    NUMBER          From FEED_RUN_CALENDER
LOAD_ID                NUMBER          From FEED_RUN_CALENDER
TABLE_NAME             VARCHAR2(50)    Source Table Name
JOB_NAME               VARCHAR2(100)   Job or Session Name
START_TIME             DATE            Start time of Job.
END_TIME               DATE            End time of Job.
STATUS                 VARCHAR2(20)    ‘SUCCEEDED’
SRC_RECORDS_READ       NUMBER          Number of records read from source.
INSERT_COUNT           NUMBER          Number of records valid for INSERT.
UPDATE_COUNT           NUMBER          Number of records valid for UPDATE.
DELETE_COUNT           NUMBER          Number of records valid for DELETE.
NO_CHANGE_COUNT        NUMBER          Number of records valid for NO CHANGE.
FULL_DELTA_FLAG        VARCHAR2(1)     Value is ‘F’ if the table is of type Full load else ‘D’

3. SPH_STG_IS.R_<table_name>: Receives the valid load (after the Robustness check) from the
corresponding H_ table, for example SPH_STG_IS.R_REQUEST from SPH_STG_IS.H_REQUEST. The
table structure is the same as the source table, without the unused CLOB columns and with the
additional FEED_ID and FEED_RUN_ID columns.
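
The naming convention for the three per-source tables can be sketched as follows. This is only an illustration, assuming the source table is given as IRMS.<table_name>; the function name and the dictionary keys are hypothetical and do not appear in the specification.

```python
def stage_tables(source_table, schema="SPH_STG_IS"):
    """Derive the three stage-layer table names for one IRMS source table.

    The keys ("incremental", "audit", "validated") are illustrative labels only.
    """
    # Drop the source schema prefix, e.g. "IRMS.REQUEST" -> "REQUEST".
    name = source_table.split(".")[-1].upper()
    return {
        "incremental": f"{schema}.H_{name}",
        "audit": f"{schema}.IS_AUDIT_TABLE_{name}",
        "validated": f"{schema}.R_{name}",
    }


tables = stage_tables("IRMS.REQUEST")
print(tables["incremental"])  # SPH_STG_IS.H_REQUEST
```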

Apart from the three tables above, the following tables are created for the IRMS stage layer.

• SPH_STG_IS.IS_AUDIT_TABLE: This will contain data from the different
SPH_STG_IS.IS_AUDIT_TABLE_* tables. Table structure is given below:

COLUMN NAME            DATA TYPE       REMARK
CB_SYS_IT_SYSTEM_NO    NUMBER          From FEED_RUN_CALENDER
LOAD_ID                NUMBER          From FEED_RUN_CALENDER
TABLE_NAME             VARCHAR2(50)    Source Table Name
JOB_NAME               VARCHAR2(100)   Job or Session Name
START_TIME             DATE            Start time of Job.
END_TIME               DATE            End time of Job.
STATUS                 VARCHAR2(20)    ‘SUCCEEDED’
SRC_RECORDS_READ       NUMBER          Number of records read from source.
INSERT_COUNT           NUMBER          Number of records valid for INSERT.
UPDATE_COUNT           NUMBER          Number of records valid for UPDATE.
DELETE_COUNT           NUMBER          Number of records valid for DELETE.
NO_CHANGE_COUNT        NUMBER          Number of records valid for NO CHANGE.
FULL_DELTA_FLAG        VARCHAR2(1)     Value is ‘F’ if the table is of type Full load else ‘D’

• SPH_STG_IS.IS_STAGE_LOAD_VALIDITY: Details about valid loads only. Table structure is
given below:

COLUMN NAME            DATA TYPE       REMARK
CB_SYS_IT_SYSTEM_NO    NUMBER          From FEED_RUN_CALENDER
LOAD_ID                NUMBER          From FEED_RUN_CALENDER
IS_STAGE_LOAD_VALID    NUMBER          If a Load is valid then it will be 1.

• SPH_STG_IS.IS_STAGE_TABLE_TARGET: Contains the base percentage values, which are
controlled by a .txt file placed externally on the server.

COLUMN NAME                  DATA TYPE      REMARK
TYPE                         VARCHAR2(1)    I, U, D or F.
IS_TABLE_PERENTAGE_TARGET    NUMBER
CB_SYS_IT_SYSTEM_NO          NUMBER         From FEED_RUN_CALENDER

• SPH_STG_IS.IS_STAGE_TABLE_EXCEPT: Names of the tables to which the Robustness check will
NOT be applied. This is controlled by a .txt file placed externally on the server. There is
always one dummy entry in this table and in IS_STAGE_TABLE_EXCEPT.txt; its value is
‘IRMS_DUMMY_TABLE’.


COLUMN NAME DATA TYPE REMARK

IS_STAGE_TABLE_NAME VARCHAR2(50)
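
A minimal sketch of how the exception list could be consulted is shown below. The layout of IS_STAGE_TABLE_EXCEPT.txt is not described in this document, so the sketch assumes one table name per line and case-insensitive matching; both function names are hypothetical, and the entry REQUEST_LOG is made up for the example.

```python
DUMMY_ENTRY = "IRMS_DUMMY_TABLE"  # the dummy entry named in the specification


def parse_exception_list(text):
    """Return the set of table names excluded from the Robustness check.

    Assumption: the .txt file holds one table name per line; blank lines are ignored.
    """
    return {line.strip().upper() for line in text.splitlines() if line.strip()}


def is_exempt(table_name, exceptions):
    """True if the Robustness check should be skipped for this table."""
    return table_name.upper() in exceptions


# Example file content; REQUEST_LOG is a made-up table name.
exceptions = parse_exception_list("IRMS_DUMMY_TABLE\nREQUEST_LOG\n")
print(is_exempt("REQUEST_LOG", exceptions))  # True
print(is_exempt("REQUEST", exceptions))      # False
```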

FEED_RUN_CALENDER for GIM-Heat map: There are different ETL mappings for the different data
sources. This table is present in the SPH_GENERAL schema and has the following structure:

COLUMN NAME            DATA TYPE       REMARK
CB_SYS_IT_SYSTEM_NO    NUMBER          Value will be taken from XCL_SYS_ITSYSTEM.
LOAD_ID                NUMBER          Different for every data load.
SERIAL_NO              NUMBER          Same as Earlier.
LAYER                  NVARCHAR2(10)   Same as Earlier.
EXTRACT_FROM           DATE            Same as Earlier.
EXTRACT_TO             DATE            Same as Earlier.
ETL_START_TIME         DATE            Same as Earlier.
ETL_END_TIME           DATE            Same as Earlier.
ETL_STATUS             NVARCHAR2(10)   Same as Earlier.
RUN_TYPE               NVARCHAR2(10)   Same as Earlier.
UPDATE_TIME            DATE            Same as Earlier.
CREATE_TIME            DATE            Same as Earlier.

FEED_RUN_CALENDER_HIST for GIM-Heat map: This table is present in the SPH_GENERAL schema and
holds the history of ETL data loads.

COLUMN NAME            DATA TYPE       REMARK
CB_SYS_IT_SYSTEM_NO    NUMBER          Value will be taken from FEED_RUN_CALENDER.
LOAD_ID                NUMBER          Value will be taken from FEED_RUN_CALENDER.
SERIAL_NO              NUMBER          Same as Earlier.
LAYER                  NVARCHAR2(10)   Same as Earlier.
EXTRACT_FROM           DATE            Same as Earlier.
EXTRACT_TO             DATE            Same as Earlier.
ETL_END_TIME           DATE            Same as Earlier.
ETL_STATUS             NVARCHAR2(10)   Same as Earlier.
UPDATE_TIME            DATE            Same as Earlier.
CREATE_TIME            DATE            Same as Earlier.



Plan to load data from different Sources in Sophia STG layer:


1. FEED_RUN_CALENDER captures EXTRACT_FROM and EXTRACT_TO.
2. The SPH_STG_IS.H_RESPONSE table extracts the incremental data from the IRMS.RESPONSE table
based on the EXTRACT_FROM and EXTRACT_TO timestamps from FEED_RUN_CALENDER.
3. Every other table uses the RESPID column from SPH_STG_IS.H_RESPONSE to extract the
incremental data from its corresponding IRMS source table.
4. Data is loaded from the source table into the H_* table as a new load with a new LOAD_ID.
This can be a full or an incremental load.
5. The record counts (Insert, Update and Delete) are determined.
6. The Robustness check is applied to the counts from step 5 to decide whether the load is valid.
7. If the load is valid, this information is written to the SPH_STG_IS.IS_STAGE_LOAD_VALIDITY table.
8. After the Robustness check, if the load is valid, the data from all H_* tables is loaded to the
R_* tables, regardless of whether a table was ignored in the Robustness check.
9. After the Robustness check, if the load is not valid, the data from the H_* tables is not loaded
to the R_* tables. This applies to all tables, whether or not the table was ignored in the
Robustness check.
10. Data is loaded from each H_* table to the corresponding R_* table by joining
SPH_STG_IS.IS_STAGE_LOAD_VALIDITY to FEED_RUN_CALENDER and the corresponding H_*
table.
11. If the load is not valid, it is not loaded to the R_* tables. Only valid loads are moved on
from the H_* tables.
12. If a particular load is not valid or fails in the stage layer (for example, the ETL fails because
of a space issue), the load stays in the H_* tables only; it is not loaded to the R_* tables or
any further.
13. If a particular load is not valid or fails (for example, because of a space issue), the ETL
framework is restarted and FEED_RUN_CALENDER generates a new LOAD_ID. Therefore, only
valid loads with the new LOAD_ID are processed further from the H_* tables.
14. The H_* tables hold all kinds of loads, i.e. FAILED, VALID and INVALID.
15. Data is loaded into the CORE from the R_* tables.
16. All R_* tables are truncate-and-load tables; the H_* tables are not.
17. Solutions for Full and Delta Load:

• Full Load

    • The number of rows in the current load will be compared with the Latest Recoverable
    load stored in the H_* tables.

    • Among the loads still available in the H_* tables (the housekeeping jobs remove old
    loads and keep the latest 6 loads), we need to find the oldest valid load.

    • This oldest valid load is the Latest Recoverable load.

    • Therefore, the total number of records of the Latest Recoverable load in the H_* table
    becomes the base count = 100%.

• Delta Load

    • The number of rows in the load will be compared with the total row count in SPH_CORE
    (for this table) [Current_Version = 1].

    • For a particular Delta load, the validity is checked on the Insert, Update and Delete
    records from the source.

    • For a valid load, each individual percentage of Insert, Update and Delete records should
    be less than the prescribed limit. If any single one (Insert, Update or Delete) crosses
    the limit, that particular table load is invalid and hence the complete load is invalid.
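
The full and delta checks described above can be sketched as follows. This is a hedged illustration, not the actual ETL logic: the class and function names, the limit values in TARGETS and the row counts are all hypothetical. The document does not state how the full-load row count is compared with the base count, so the sketch assumes the absolute deviation from the base count, expressed as a percentage, must stay below the F limit. It also assumes that LOAD_ID grows with every new load, so the oldest load has the smallest LOAD_ID.

```python
from dataclasses import dataclass

# Placeholder limits keyed by TYPE as in IS_STAGE_TABLE_TARGET (I, U, D or F).
# The numbers are illustrative only, not values from the specification.
TARGETS = {"I": 10.0, "U": 10.0, "D": 5.0, "F": 20.0}


@dataclass
class AuditRow:
    """Subset of the IS_AUDIT_TABLE columns needed for the check."""
    table_name: str
    src_records_read: int
    insert_count: int
    update_count: int
    delete_count: int
    full_delta_flag: str  # 'F' for full load, 'D' for delta load


def percent(count, base):
    """Percentage of count relative to base; an empty base only tolerates 0."""
    if base == 0:
        return 0.0 if count == 0 else float("inf")
    return 100.0 * count / base


def latest_recoverable_load(retained_load_ids, valid_load_ids):
    """Oldest valid load among the retained loads (assumes LOAD_ID increases)."""
    valid = [load_id for load_id in retained_load_ids if load_id in valid_load_ids]
    return min(valid) if valid else None


def table_is_valid(row, base_count, targets=TARGETS, exempt=frozenset()):
    """base_count is the SPH_CORE row count for 'D' tables and the row count of
    the Latest Recoverable load for 'F' tables."""
    if row.table_name in exempt:
        return True  # tables listed in IS_STAGE_TABLE_EXCEPT are not checked
    if row.full_delta_flag == "F":
        # Assumption: deviation from the 100% base count must stay below the limit.
        deviation = abs(row.src_records_read - base_count)
        return percent(deviation, base_count) < targets["F"]
    return (percent(row.insert_count, base_count) < targets["I"]
            and percent(row.update_count, base_count) < targets["U"]
            and percent(row.delete_count, base_count) < targets["D"])


def load_is_valid(rows, base_counts, targets=TARGETS, exempt=frozenset()):
    """One invalid table makes the complete load invalid."""
    return all(
        table_is_valid(row, base_counts.get(row.table_name, 0), targets, exempt)
        for row in rows
    )


rows = [
    AuditRow("REQUEST", 80, 50, 20, 10, "D"),   # 5% / 2% / 1% of 1000 core rows
    AuditRow("RESPONSE", 105, 0, 0, 0, "F"),    # 5% away from a base count of 100
]
print(load_is_valid(rows, {"REQUEST": 1000, "RESPONSE": 100}))  # True
```

If load_is_valid returns False for a load, none of its H_* data would move on to the R_* tables, including the data of tables that were exempt from the check.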

Load percentages:
For the robustness check, base percentages are finalized for INSERT, UPDATE and DELETE records
for the source system. These percentages are stored in the SPH_STG_IS.IS_STAGE_TABLE_TARGET
table and controlled externally through a .txt file placed on the Informatica server.
If a particular load is invalid or the ETL fails, these base percentages are adjusted so that the
next load can accommodate the records accumulated since the last valid load.
Once the next valid load has finished, the base percentages are adjusted again (as agreed between
the Sophia team and the IRMS team) to check the robustness of the regular daily record counts.
