How do you run a shell script within the scope of a DataStage job?

Use the "ExecSH" routine in the Before-job or After-job subroutine setting of the job properties.

How do you eliminate duplicate rows?

SQL (Oracle, using rowid): delete from tablename a where a.rowid > (select min(b.rowid) from tablename b where a.key_column = b.key_column);
DataStage server: use a Sort stage (option: Allow Duplicates yes/no) or a hashed file, where rows with the same key overwrite each other.
DataStage parallel: use the Remove Duplicates stage.
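
The rowid approach above can be tried out locally. What follows is only a minimal sketch, assuming SQLite (which also exposes a rowid) in place of Oracle; the table name "customers", the column "cust_id", and the helper names remove_duplicates and demo are made up for illustration and are not part of DataStage.

```python
import sqlite3


def remove_duplicates(conn, table, key_columns):
    """Keep the lowest-rowid row per key combination and delete the rest.

    Table and column names are interpolated into the SQL text, so this
    sketch should only be used with trusted identifiers.
    """
    # Correlate the inner alias "b" with the outer table on every key column.
    match = " AND ".join(f"b.{col} = {table}.{col}" for col in key_columns)
    sql = (
        f"DELETE FROM {table} "
        f"WHERE rowid > (SELECT MIN(b.rowid) FROM {table} b WHERE {match})"
    )
    cur = conn.execute(sql)
    conn.commit()
    return cur.rowcount  # number of duplicate rows removed


def demo():
    """Build a small in-memory table (hypothetical data) and deduplicate it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (cust_id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Ann"), (1, "Ann"), (2, "Bob"), (1, "Ann"), (3, "Cy")],
    )
    deleted = remove_duplicates(conn, "customers", ["cust_id"])
    remaining = conn.execute(
        "SELECT cust_id, name FROM customers ORDER BY cust_id"
    ).fetchall()
    conn.close()
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

The first row loaded for each key survives, which mirrors the min(rowid) condition in the SQL answer.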

What is the importance of a surrogate key in data warehousing?

A surrogate key is the primary key of a dimension table. Its main advantage is that it is independent of the underlying database, i.e. it is not affected by changes made in that database. A surrogate key is a system-generated numeric key. It is the primary key in the dimension table and a foreign key in the fact table. It is also used to handle missing data and other complex situations in DataStage.
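
To make the idea concrete, here is a rough sketch of how system-generated numeric keys might be assigned to natural (source) keys. It is not how DataStage implements surrogate keys; the class name SurrogateKeyGenerator, the reserved UNKNOWN_KEY value of 0, and the sample keys are all assumptions chosen for illustration.

```python
from itertools import count

# Assumption: key 0 is reserved for an "unknown" dimension row, so that
# fact rows with missing or unmatched natural keys still have a valid
# foreign key.
UNKNOWN_KEY = 0


class SurrogateKeyGenerator:
    """Assign sequential integer surrogate keys to natural keys."""

    def __init__(self, start=1):
        self._next_key = count(start)
        self._by_natural_key = {}

    def key_for(self, natural_key):
        """Dimension load: return the existing key or assign the next one."""
        if natural_key is None:
            return UNKNOWN_KEY
        if natural_key not in self._by_natural_key:
            self._by_natural_key[natural_key] = next(self._next_key)
        return self._by_natural_key[natural_key]

    def lookup(self, natural_key):
        """Fact load: resolve a natural key without creating a new one."""
        return self._by_natural_key.get(natural_key, UNKNOWN_KEY)


if __name__ == "__main__":
    gen = SurrogateKeyGenerator()
    for source_key in ["CUST-A", "CUST-B", "CUST-A", None]:
        print(source_key, "->", gen.key_for(source_key))
```

Because the generated key does not depend on the source value, a change to the natural key format in the source system does not force a change to the fact table's foreign keys.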

