Pre Requisites
----------------
* SQL Server
    * Working with SELECT, joins, subqueries, SPs, cursors, triggers
* Basic knowledge of .Net framework
    * MSBI 2000 version (scripting languages)
    * MSBI 2005 version (VB.Net + scripting languages)
    * MSBI 2008 version (C#.Net + VB.Net + scripting languages)
* Knowledge of VB.Net
    * Basic programming features like variables, control statements
    * Working with functions, classes
    * Working with databases, xml files etc

What is BI?
------------
* It is a collection of tools, services or technologies whose main aim is to convert data into information.
* This information can be used to guide the organization's decisions.
* The term BI was popularized by Howard Dresner in 1989. (Bill Inmon is known for pioneering the data warehouse concept.)
* BI supports 3 major tools
    * ETL Tools (Extract Transform and Load)
        * SSIS, Informatica etc
    * Reporting Tools
        * SSRS, Cognos, Crystal Reports, Business Objects etc
    * Analytical Tools
        * SSAS
* BI project Layers
    * Data Source layer (dbs, xml files, .xls, .txt files, multidimensional structures etc)
    * Data Integration Layer (ETL Component)
    * Database Layer (RDBMS)
    * Analysis Layer (Analytical tools)
    * Reporting Layer (Reporting Tools)

Advantages
------------
* Data integration from different sources into a centralized location.
* Implementing automation
    * Loading the data warehouse periodically from operational databases.
* Reporting
* Analysis and decision making
* High Performance
* Security
* Reduced cost

MSBI
------
* Microsoft Business Intelligence suite is a collection of various tools to implement BI projects
* A BI project can be implemented using 4 tools - ETL, Reporting, Analysis, RDBMS.
* All these four tools are available as part of MSBI
* MSBI supports
    * ETL Tools
        * In 2000 version --> DTS
        * In 2005 version --> SSIS
    * Reporting
        * SSRS
    * Analysis
        * SSAS
* MSBI 2000 supports scripting languages (VB Script, JavaScript)
* MSBI 2005 supports VB.Net, scripting languages
* MSBI 2008 supports VB.Net and C#.Net, scripting languages

SSIS (SQL Server Integration Services)
-----------------------------------------------
* It is the enterprise data integration component present as part of MSBI, which supports the ETL process.
* It was introduced in SQL Server 2005 in the place of DTS.
* An ETL project is developed as a collection of packages (like classes in languages). Each package is a collection of tasks (like methods in a class).
* A package is stored with the .dtsx extension.
* Package hierarchy
    * Every package consists of 2 major components
        * Control Flow items (Tasks)
        * Data Flow items (Pipeline components)
            * Source, Transformations, Destination

Features
---------
* Data integration from any source.
* Automation
    * Loading the data warehouse periodically.
* Supports enhancements using .Net languages
* Supports checkpoints
    * Checkpoints are restartable points: a failed package can be restarted from the point of failure instead of from the beginning.
* Supports package configuration
    * Configuration is required for easy enhancements and to move packages from one location to another.
    * Supports 2 types of configurations
        * Direct
            * Each package has its own configuration file
        * Indirect
            * Multiple packages can share the same configuration file.
* Supports Logging
    * Storing step by step execution details or required events in a separate location so that we can troubleshoot the errors.
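
The checkpoint feature listed above can be pictured with a small sketch. This is plain Python, not SSIS: the function name `run_with_checkpoint`, the JSON checkpoint file and the two demo tasks are all hypothetical, chosen only to show the idea of recording a restart point after each completed task and skipping completed tasks on the next run.

```python
import json
import os
import tempfile


def run_with_checkpoint(tasks, checkpoint_path):
    """Run (name, callable) pairs in order, skipping ones already recorded as done."""
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    for name, action in tasks:
        if name in done:
            continue                      # finished in an earlier run
        action()                          # an exception stops the run here
        done.append(name)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)            # record the new restart point
    os.remove(checkpoint_path)            # clean finish: next run starts fresh


calls = []
state = {"fail_load": True}


def extract():
    calls.append("extract")


def load():
    if state["fail_load"]:
        raise RuntimeError("simulated failure")
    calls.append("load")


tasks = [("extract", extract), ("load", load)]
path = os.path.join(tempfile.gettempdir(), "pkg_checkpoint_demo.json")
if os.path.exists(path):
    os.remove(path)

try:
    run_with_checkpoint(tasks, path)      # first run fails in "load"
except RuntimeError:
    pass
state["fail_load"] = False
run_with_checkpoint(tasks, path)          # second run resumes at "load"
```

In the demo the first run completes `extract` and fails in `load`; the second run skips `extract` and only executes `load`, which is the behaviour a restartable point is meant to give.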

Parts of SSIS Package
-------------------------
* Control Flow
    * To define the complete logic of a package
    * Implemented using tasks
    * Tasks are connected with precedence constraints
        * Success
        * Failure
        * Completion
        * Condition based
    * SSIS supports 2 types of tasks
        * Container Tasks
            * To group one or more tasks
            * To execute tasks repeatedly a number of times
            * E.g. For Loop Container, ForEach Loop Container, Sequence Container
        * Standard Tasks
* Data Flow
    * Consists of data flow items which are used to move data from source to destination.
    * Consists of
        * Source
        * Transformations
        * Destinations
* Event Handlers
    * To handle runtime errors.
    * Works like a catch block in other languages
    * We can handle package level as well as task level events,
      e.g. OnError, OnWarning, OnInformation, OnPostExecute
* Connection Managers
    * Maintain source and destination connections
* Package Explorer
    * Shows the complete details of the package

Package Life Cycle
----------------------
1. Development
    * Import and Export Wizard (dtswizard)
    * BIDS (devenv)
    * NotePad
2. Deployment
    * Placing the package on a centralized server so that users can access it.
    * SSIS supports deploying packages to
        * File System
        * MSDB (SQL Server)
3. Executing
    * We can schedule as well as execute packages after deployment
    * SSMS -- We can schedule the packages.
    * BIDS (F5)
    * DTExec, DTExecUI tools

Ex: Create a package which takes data from emp table into emp.txt file.

Requirement
    * Source      : emp table
    * Destination : emp.txt file
Implementation
    * Tasks       : Data Flow Task
    * Data Flow   : OLEDBSource, FlatFileDestination
Steps

1. Creating database
        use master
        go
        create database SSISDb
2. Creating table
        use SSISDb
        go
        create table emp(empid int, ename varchar(40), sal money, deptno int)
        go
        insert emp values(1,'Rajesh',6000,10)
        insert emp values(2,'Kumar',4500,20)
3. Creating package
    * Start --> Programs --> Microsoft SQL Server --> SQL Server Business Intelligence Development Studio
    * Go to File --> New --> Project
        Under templates select "Integration Services Project"
        Name: SSIS_Basics
        Location: d:\SSIS_Examples
        OK
4. Go to Control Flow and place a Data Flow Task --> double click on it
5. Place OLEDBSource and Flat File Destination.
6. Connect both the controls
7. Right-click on OLEDBSource --> Edit -->
        Connection Manager = New
        Click New
        Enter server name = required name
        Select database = SSISDb
        OK
        OK
        Under table or view = select emp table
        OK
8. Right-click on Flat File Destination --> Edit -->
        Connection Manager = New
        Connection Manager Name = emp.txt
        File Name = c:\emp.txt
        Select checkbox "Column names in first data row"
        OK
        Click on Columns
        OK
        Click on Mappings
        OK
9. Run the package (press F5)
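
What the data flow in this example does (read every row from the source table, write it to a delimited text file with the column names in the first row) can be sketched outside SSIS. The snippet below is an illustration only: it uses Python's built-in `sqlite3` as a stand-in for SQL Server, the function name `export_table_to_flat_file` is hypothetical, and the output goes to the system temp folder rather than c:\emp.txt.

```python
import csv
import os
import sqlite3
import tempfile


def export_table_to_flat_file(conn, table, path):
    """Write every row of `table` to a comma-delimited file, header first."""
    cur = conn.execute(f"SELECT * FROM {table}")  # table name is trusted here
    columns = [d[0] for d in cur.description]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(columns)   # "Column names in first data row"
        writer.writerows(cur)      # one line per source row
    return columns


# Stand-in for the SSISDb emp table from steps 1-2 (sqlite3, not SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (empid INTEGER, ename TEXT, sal NUMERIC, deptno INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(1, "Rajesh", 6000, 10), (2, "Kumar", 4500, 20)],
)
conn.commit()

out_path = os.path.join(tempfile.gettempdir(), "emp.txt")
export_table_to_flat_file(conn, "emp", out_path)
with open(out_path, newline="") as f:
    exported_lines = f.read().splitlines()
```

The source query plays the role of OLEDBSource and the CSV writer plays the role of the Flat File Destination; the column mapping step has no counterpart here because the columns are written in the order they are read.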

Working with tasks
----------------------
1. Data Flow Task
    * It is required to move data from source to destination.
    * It uses transformations to modify, remove or split data while it flows.
2. Execute SQL Task
    * To execute SQL commands as well as stored procedures
    * We can set the command
        * Directly
        * From Variable
        * From File
    Properties
    * Connection
    * SQL Statement
    * Result Set
        None
        Single row
        Full result set
        XML
3. File System Task
    * To work with files and folders
    * We can copy, move, delete files and folders
    Properties
    * Operation
    * Source Connection
    * Destination Connection
4. For Each Loop Container
    * To execute one or more tasks repeatedly.
    * To work with collections
        * For Each File
        * For Each ADO - to work with rows
        * For Each Variable

Ex: Create a package which copies all .txt files present in My Documents into d:\txtFiles folder. Store the copied files information in the following table

        Files_Copied
        FileID    FileNames    Date_Copied

Requirement
    * Source      : .txt files of My Documents
    * Destination : d:\txtFiles folder
Implementation
    * Tasks       : For Each Loop
                    File System Task
                    Execute SQL Task
Steps

1. Go to SSMS --> create the following table
        use SSISDb
        go
        create table Files_Copied
        (
            FileID int IDENTITY,
            FileNames varchar(100),
            Date_Copied datetime default getdate()
        )
2. Take a new package and design it as follows (layout, as implied by the steps below: a For Each Loop container holding a File System Task connected to an Execute SQL Task)
3. Right-click on For Each Loop --> Edit
        Collection --> Enumerator : For Each File
        Folder = click on Browse button and select My Documents
        Files = *.txt
        Go to Variable Mapping --> Under Variable --> take New Variable
            Name: strFilePaths
            Value Type: String
            Value: a
        OK
        OK
4. Right-click on File System Task --> Edit
        Operation = Copy file
        IsSourcePathVariable: True
        SourceVariable: strFilePaths
        DestinationConnection = New Connection
            Usage Type = Existing Folder
            Folder = d:\txtFiles (create this folder manually)
        OK
        OK
5. Right-click on Execute SQL Task --> Edit
        Connection = New connection --> create a connection for the required server name and SSISDb database.
        SQL Statement = insert Files_Copied(FileNames) values(?)
        Go to Parameter Mapping
            Add
            From Variables --> select strFilePaths
            Direction: Input
            Data Type: Varchar
            Parameter name: 0
            Parameter size: 100
        OK
6. Run the package

5. Execute Process Task
-----------------------------------
* To execute .exe and .bat files
* To stop or start the processes present in Windows
Properties
* Executable
* Arguments
* Working directory

Ex: Create a .rar file of all the files present in d:\txtFiles

Requirements
    * Source : all files in d:\txtFiles
Implementation
    * Tasks  : Execute Process Task
Steps
1. Take a new package --> place Execute Process Task --> right-click --> Edit
        Go to Process
        Executable: C:\Program Files\WinRAR\Rar.exe
        Arguments: a text_Files *.txt
        Working Directory: d:\txtFiles
        OK
2. Run the package

6. FTP Task
-----------------
* To maintain website files we need an FTP server.
* Many web applications use an FTP server where the website resources, i.e. web pages, image files etc., are maintained.
* We can upload, download, create and delete files and folders on the FTP server with the FTP task.
Properties
* FTPConnection
* Operation
* Source Path
* Destination Path

Ex: Create a package which uploads text_Files.rar into FTP server.

Steps
1. Open the previous package --> place FTP Task
2. Go to SSIS menu --> Variables --> Add Variable -->
        Name: strFileName
        Data Type: String
        Value: d:\txtFiles\text_Files.rar
3. Right-click on FTP Task --> Edit --> FTP Connection = New connection
        Server Name: ftp.xxxxxx.com
        User Name
        Password
        Click on Test Connection
4. Go to File Transfer
        Operation: Send files
        IsLocalPathVariable: True
        Variable: strFileName
        RemotePath: /Backup (any folder present on the FTP server)
        OK
5. Run the package

7. Web Service Task
-----------------------------
* A web service is used to implement common business logic which can be used in any application.
* The business logic is stored in a file with the extension .asmx
* It consists of logic in the form of classes and methods.
* The web service's functionality is described in a file with the extension .wsdl
* The .wsdl file acts as an interface between the web service and the SSIS package or client
* Web services are independent of
    * O/S
    * Language
    * Web server

Ex: Create a package which loads data from customers.txt file into customers table of SSISDb. Verify the EmailID before loading.

Requirement
--------------
* Source      : customers.txt
* Destination : Customers table

Implementation
-----------------
* Tasks       : For Each Loop
                Data Flow
                Web Service Task
                Execute SQL Task

Steps
1. Create a file with the name customers.txt as follows
        custID,custName,EmailID,PhoneNumber
        1,Jagadeesh,jagadees@gmail.com,998543333
        2,Harish,hari@yahoo.com,985344343
2. Create the following table in SSISDb
        create table Customers
        (
            CustID int, CustName varchar(40), EmailID varchar(40), Phone varchar(12)
        )
3. Take a new package and design it as follows (layout, as implied by the steps below: a Data Flow Task followed by a For Each Loop container holding a Web Service Task, a Script Task and an Execute SQL Task)
4. Double click on Data Flow Task --> place Flat File Source and Recordset Destination.
    * Right-click on Flat File Source --> configure to Customers.txt file
    * Go to SSIS menu --> Variables --> Add variable -->
            Name: objCustomers
            Data Type: Object
            Scope: Package4 (package level)
5. Right-click on Recordset Destination --> Edit -->
        VariableName: select objCustomers
        Go to Input Columns --> select all columns --> OK
6. Right-click on For Each Loop container --> Edit --> Go to Collection
        Enumerator = For Each ADO enumerator
        ADO Object Source Variable: objCustomers
        Go to Variable Mapping --> Under Variables --> new variable

            Name          Value Type    Value
            vCustID       Int32         0
            strCustName   String
            vEmailID      String        x
            vPhoneNo      String        x

7. Right-click on Web Service Task --> Edit -->
        HttpConnection = New Connection
            Server URL: http://www.webservicex.net/ValidateEmail.asmx?WSDL
            Click on Test Connection --> OK
        WSDL file: d:\ValidateEmail.wsdl
        Click on Download WSDL button
        Go to Input
            Service = select ValidateEmail
            Method = IsValidEmail
            Under Variable = select checkbox
            Under Value = select vEmailID variable
        Go to Output
            File = New connection
            Usage Type: Create File
            File: d:\email.xml
        OK
        OK
8. Go to SSIS menu --> Variables --> Add variable -->
        Name: strResult
        Type: String
9. Right-click on Script Task --> Edit -->
        ReadWriteVariables = User::strResult
        Edit Script --> type the following

            using System.Xml;

        In the Main() method write the following logic

            // Read the <boolean> element written by the Web Service Task
            XmlTextReader r = new XmlTextReader(@"d:\email.xml");
            r.ReadStartElement("boolean");
            // The reader is now on the element's text ("true" or "false")
            Dts.Variables["strResult"].Value = r.Value;
            r.Close();

    * Go to Build menu --> Build .................
    * File --> Exit
    * OK
10. Right-click on Execute SQL Task --> Edit -->
        Connection = New Connection --> select server name and db name
        SQL Statement: insert Customers values(?,?,?,?)
        Go to Parameter Mapping
        Add 4 parameters and map them to the above four variables.

            Variable      Direction   Data Type                 Parameter Name   Parameter Size
            vCustID       Input       (respective data type)    0
            strCustName   Input       (respective data type)    1
            vEmailID      Input       (respective data type)    2
            vPhoneNo      Input       (respective data type)    3

        OK
11. Right-click on the connector between Script Task and Execute SQL Task --> Edit
        Evaluation Operation = Expression and Constraint
        Value = Success
        Expression = @strResult=="true"
        Click on Test button
        OK
12. Run the package
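
The control logic of this package (for each customer row, read the validation result, and insert the row only when the result is "true") can be sketched in a few lines. This is a Python illustration under stated assumptions, not the SSIS implementation: `fake_validate_email` is a hypothetical stand-in that returns a `<boolean>` element and does not call the real web service, `sqlite3` stands in for SSISDb, and the third customer row is invented purely to show a rejected record.

```python
import sqlite3
import xml.etree.ElementTree as ET


def read_validation_result(xml_text):
    """Return the text of the root element, e.g. 'true' or 'false'."""
    root = ET.fromstring(xml_text)
    return (root.text or "").strip()


def fake_validate_email(email):
    """Hypothetical stand-in for the web service call; NOT the real check."""
    ok = "@" in email and "." in email.split("@")[-1]
    return "<boolean>%s</boolean>" % ("true" if ok else "false")


customers = [
    (1, "Jagadeesh", "jagadees@gmail.com", "998543333"),
    (2, "Harish", "hari@yahoo.com", "985344343"),
    (3, "Invalid", "not-an-email", "900000000"),  # invented row to show the skip
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustID INTEGER, CustName TEXT, EmailID TEXT, Phone TEXT)"
)

for cust in customers:                       # For Each Loop over the recordset
    str_result = read_validation_result(fake_validate_email(cust[2]))
    if str_result == "true":                 # precedence-constraint expression
        conn.execute("INSERT INTO Customers VALUES (?, ?, ?, ?)", cust)

loaded_ids = [
    row[0] for row in conn.execute("SELECT CustID FROM Customers ORDER BY CustID")
]
```

The `if` line corresponds to the "Expression and Constraint" setting in step 11: the insert runs only when the previous step succeeded and the result variable equals "true".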