
DAY 1

In this session we covered the following topics:


SQL Server and its queries, the different types of joins, constraints, analytical functions,
DDL, DML and TCL commands, the data warehouse (DWH) and data mart with a practical
example, normalisation, and data modelling.

Types of Joins:
1. Inner join - The INNER JOIN keyword selects records that have matching values in both
tables.
2. Outer join
● Left join - The LEFT JOIN keyword returns all records from the left table (table1), and
the matched records from the right table (table2). The right-side columns are NULL
when there is no match.
● Right join - The RIGHT JOIN keyword returns all records from the right table (table2),
and the matched records from the left table (table1). The left-side columns are NULL
when there is no match.
● Full outer join or full join - The FULL OUTER JOIN keyword returns all records from
both the left (table1) and right (table2) tables. Rows that have no match on the other
side are still returned, with NULL in the columns of the other table.

Note: FULL OUTER JOIN can potentially return very large result-sets!

3. Self join - A self join is a regular join in which the table is joined with itself.

4. Cross join - It pairs every row of the first table with every row of the second table. If no
WHERE clause is used, the number of rows in the result is the number of rows in the first table
multiplied by the number of rows in the second table. This kind of result is called a Cartesian
product. If a WHERE clause that relates the two tables is used with CROSS JOIN, it behaves
like an INNER JOIN.
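The join types above can be tried out with a small sketch. It uses Python's built-in sqlite3 module purely as a convenient way to run the SQL; the tables, column names and sample rows are assumptions made for illustration. RIGHT and FULL OUTER JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the session).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
    INSERT INTO department VALUES (10, 'Sales'), (20, 'HR'), (30, 'IT');
    INSERT INTO employee VALUES (1, 'Asha', 10), (2, 'Ravi', 20), (3, 'Meena', NULL);
""")

# INNER JOIN: only employees whose dept_id matches a department.
inner_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee e INNER JOIN department d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# LEFT JOIN: every employee; dept_name is NULL (None) when there is no match.
left_rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee e LEFT JOIN department d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# CROSS JOIN without WHERE: rows(employee) * rows(department).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM employee CROSS JOIN department"
).fetchone()[0]

print(inner_rows)
print(left_rows)
print(cross_count)
```

With three employees and three departments, the cross join yields 3 x 3 = 9 rows, while the left join keeps the employee that has no department.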

Problem statement:
We have to find the names of the employees for whom no valid department number is provided.
We have 2 different tables:
the employee table and the department table.

Solution:
1. Select e.emp_name
From employee e left join department d on e.dept_id = d.dept_id
Where d.dept_id is null;
Or,

2. Select e.emp_name
From employee e
Where not exists (select d.dept_id from department d where e.dept_id = d.dept_id);
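As a quick check, the two approaches can be compared on sample data. This is a minimal sketch using Python's sqlite3 module; the sample rows (one employee with a NULL dept_id and one with a dept_id that does not exist in department) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
    INSERT INTO department VALUES (10, 'Sales'), (20, 'HR');
    -- Hypothetical rows: Meena has no dept_id, Kiran points at a missing department.
    INSERT INTO employee VALUES (1, 'Asha', 10), (2, 'Meena', NULL), (3, 'Kiran', 99);
""")

LEFT_JOIN_SQL = """
    SELECT e.emp_name
    FROM employee e LEFT JOIN department d ON e.dept_id = d.dept_id
    WHERE d.dept_id IS NULL
    ORDER BY e.emp_name
"""

NOT_EXISTS_SQL = """
    SELECT e.emp_name
    FROM employee e
    WHERE NOT EXISTS (SELECT d.dept_id FROM department d WHERE e.dept_id = d.dept_id)
    ORDER BY e.emp_name
"""

via_left_join = [row[0] for row in conn.execute(LEFT_JOIN_SQL)]
via_not_exists = [row[0] for row in conn.execute(NOT_EXISTS_SQL)]

print(via_left_join)
print(via_not_exists)
```

Both queries return the employee with a NULL dept_id and the employee whose dept_id has no matching department row.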

Difference between truncate, delete and drop?


● Delete: Rows are deleted according to the condition specified. A delete can be rolled
back.

● Truncate: It removes all rows but the table structure remains. It is faster than delete.
In SQL Server it can be rolled back only inside an explicit transaction; in some other
databases (for example Oracle) it commits implicitly and cannot be rolled back.

● Drop: It removes the table from the database completely. All the indexes and constraints
on the table are also removed.
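The difference between DELETE and DROP can be seen in a short sketch. It uses Python's sqlite3 module, and the orders table is a hypothetical example. SQLite has no TRUNCATE statement, so only DELETE (with a rollback) and DROP are shown here; the TRUNCATE behaviour described above is not demonstrated.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the transaction below is controlled explicitly with BEGIN / ROLLBACK.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'open'), (2, 'cancelled'), (3, 'cancelled')")

# DELETE removes only the rows matching the condition and can be rolled back.
conn.execute("BEGIN")
conn.execute("DELETE FROM orders WHERE status = 'cancelled'")
rows_inside_txn = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
conn.execute("ROLLBACK")
rows_after_rollback = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# DROP removes the table definition itself, not just its rows.
conn.execute("DROP TABLE orders")
table_exists = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'orders'"
).fetchone()[0] == 1

print(rows_inside_txn, rows_after_rollback, table_exists)
```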

What are Slowly changing dimensions?


Dimensions whose attribute values change over time.
The commonly listed SCD types are 0, 1, 2, 3, 4 and 6 (six in total).
SCD 0 is used when no change is tracked; the original value is kept.
SCD 1 overwrites the old value, so only the latest change from the source table is kept.
SCD 2 maintains the full history by adding a new row for each change.
SCD 3 maintains limited history, but instead of adding rows we add columns.
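The difference between SCD 1 and SCD 2 may be easier to see in a small sketch. It uses Python's sqlite3 module; the table layouts, the valid_from / valid_to / is_current columns and the helper functions apply_scd1 and apply_scd2 are all assumptions chosen for illustration, not a layout prescribed in the session.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- SCD 1: one row per customer, overwritten in place.
    CREATE TABLE customer_scd1 (customer_id INTEGER PRIMARY KEY, city TEXT);
    -- SCD 2: one row per version of the customer, with validity dates.
    CREATE TABLE customer_scd2 (
        surrogate_key INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id   INTEGER,
        city          TEXT,
        valid_from    TEXT,
        valid_to      TEXT,
        is_current    INTEGER
    );
    INSERT INTO customer_scd1 VALUES (1, 'Pune');
    INSERT INTO customer_scd2 (customer_id, city, valid_from, valid_to, is_current)
    VALUES (1, 'Pune', '2020-01-01', NULL, 1);
""")

def apply_scd1(customer_id, new_city):
    """SCD 1: overwrite the attribute; the old value is lost."""
    conn.execute(
        "UPDATE customer_scd1 SET city = ? WHERE customer_id = ?",
        (new_city, customer_id),
    )

def apply_scd2(customer_id, new_city, change_date):
    """SCD 2: close the current row, then insert a new current row."""
    conn.execute(
        "UPDATE customer_scd2 SET valid_to = ?, is_current = 0 "
        "WHERE customer_id = ? AND is_current = 1",
        (change_date, customer_id),
    )
    conn.execute(
        "INSERT INTO customer_scd2 (customer_id, city, valid_from, valid_to, is_current) "
        "VALUES (?, ?, ?, NULL, 1)",
        (customer_id, new_city, change_date),
    )

apply_scd1(1, "Mumbai")
apply_scd2(1, "Mumbai", "2023-06-01")

scd1_rows = conn.execute("SELECT customer_id, city FROM customer_scd1").fetchall()
scd2_rows = conn.execute(
    "SELECT city, valid_from, valid_to, is_current FROM customer_scd2 ORDER BY surrogate_key"
).fetchall()

print(scd1_rows)
print(scd2_rows)
```

After the change, the SCD 1 table holds a single row with the new city, while the SCD 2 table holds both the closed historical row and the new current row.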

Data warehouse and Data mart


DWH - a central repository of integrated data from multiple sources, used for effective and
precise decision making. It is typically maintained for the whole organisation.
Data mart - a subset of the data warehouse that is maintained for a smaller unit, such as a
single department.

Data flow :
OLTP system ---> operational data source ---> DWH ---> DM ---> reporting

Data modelling:
Data modelling defines how data is structured so that it can be retrieved and delivered in
a short time. A well-designed data model is needed in order to get good performance.

Normalisation :
Splitting data into smaller, related tables so as to avoid anomalies and redundancy.

Note: We generally model our data with query performance in mind.


Fact and dimension tables
● Fact table - transactional information that arrives at a high rate. It holds foreign keys that
reference the primary keys of the dimensions, plus measures like price, quantity etc.
● Dimension table - it provides descriptive information for all the measurements recorded
in the fact table.
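A minimal sketch of how a fact table and a dimension table work together, using Python's sqlite3 module. The names fact_sales and dim_product and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension: descriptive attributes of a product.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    -- Fact: one row per sale, with a foreign key to the dimension and the measures.
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product (product_id),
        quantity   INTEGER,
        price      REAL
    );
    INSERT INTO dim_product VALUES (1, 'Pen'), (2, 'Notebook');
    INSERT INTO fact_sales VALUES (1, 1, 10, 5.0), (2, 2, 3, 40.0), (3, 1, 4, 5.0);
""")

# Measures come from the fact table; the label used for grouping comes from the dimension.
revenue = conn.execute("""
    SELECT p.product_name, SUM(f.quantity * f.price) AS revenue
    FROM fact_sales f JOIN dim_product p ON f.product_id = p.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()

print(revenue)
```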

In our second session we discussed the different kinds of problems that can occur in our table
data, for example an employee who does not have a valid department, or two managers for
one employee.

We have 6 tables: Employee, Manager, Customer, Product, Order, Department.


In our product table there may be data that changes with time. So we have to maintain a
separate table called stock, which keeps records like stock quantity, price etc.

Problem statement:
I have an Employee table with the following columns: Emp_id, Emp_Name and Mgr_id. I have to
find the current manager of each employee.

Solution:
Select Distinct e.Emp_Name as employee, e.Mgr_id as reports_to, m.Emp_Name as manager
FROM Employee e join Employee m
ON e.Mgr_id = m.Emp_id;
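The self join can be run on sample data as a sketch, here with Python's sqlite3 module. The column names follow the problem statement (Emp_id, Emp_Name, Mgr_id); the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Emp_id INTEGER PRIMARY KEY, Emp_Name TEXT, Mgr_id INTEGER);
    -- Hypothetical hierarchy: Kiran has no manager; Asha and Ravi report to Kiran.
    INSERT INTO Employee VALUES
        (1, 'Kiran', NULL),
        (2, 'Asha', 1),
        (3, 'Ravi', 1),
        (4, 'Meena', 2);
""")

# Alias e is the employee row; alias m is the same table read as the manager row.
manager_rows = conn.execute("""
    SELECT e.Emp_Name AS employee, m.Emp_Name AS manager
    FROM Employee e JOIN Employee m ON e.Mgr_id = m.Emp_id
    ORDER BY e.Emp_id
""").fetchall()

print(manager_rows)
```

Because this is an inner join, the employee with a NULL Mgr_id does not appear in the result; a LEFT JOIN would keep that row with a NULL manager.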

DAY 2

QlikView Introduction
QlikView is a dashboarding and data discovery tool in which we can visualize our data using
tables and, mainly, analyze it. QlikView lets us create our own data model, and it maintains
a security model as well.

ADVANTAGES:
1. Reports can be generated securely using the report wizard and ‘Drag and Drop’
objects.
2. Data can be exported from an object into XML, Excel, CSV, TXT etc.
3. Data analysis becomes easier.
4. Flexibility in the analysis of data.
5. In-memory processing, which makes analysis fast.
6. Automatic association in modelling.

QlikView uses the following file extensions:


1. QVW: the extension of a QlikView document, which stores the data, data objects and UI
objects, e.g. abc.qvw
2. QVD: the extension of a QlikView data file, a binary file that stores table data exported
from a QlikView document, e.g. abc.qvd
3. QVX: the extension of a QlikView data exchange file, used to bring data from external
sources into QlikView.

Schema:
● STAR Schema:
The simplest form of dimensional model, in which data is organized into facts and
dimensions. It is diagrammed by surrounding each fact table with its associated
dimension tables, so the resulting diagram resembles a star.
● SNOWFLAKE Schema:
It is an extension of the star schema in which additional dimensions are attached to the
dimensions of a star schema in a relational environment.
The schema is diagrammed with each fact surrounded by its associated dimensions, as in a
star schema, and those dimensions are further related to other dimensions, branching out
into a snowflake pattern.

Normalized Data:
A well-structured form of data which doesn't have any repetition or redundancy. It is a kind
of relational data.

Denormalized Data:
Data that is kept together in wide tables instead of being split into related tables, so it
contains redundancy.

In a QlikView document we keep denormalized data for faster retrieval and merging; we can also
use a snowflake schema for data modelling.

QlikView Script Editor :

Data Loading in QlikView:


Data can be loaded into QlikView by using the load script option. We specify the name of the
table we want to load, and tables that have columns in common get associated automatically,
which is an advantage of QlikView over other reporting tools.
We can import data from Excel, CSV, QVD etc.
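The idea of associating tables through common column names can be illustrated with a small sketch. This is plain Python written only to show the concept; it is not QlikView's own implementation, and the table definitions and the shared_fields helper are hypothetical.

```python
from itertools import combinations

# Hypothetical tables described by their column names.
tables = {
    "Employee": ["Emp_id", "Emp_Name", "Dept_id"],
    "Department": ["Dept_id", "Dept_Name"],
    "Order": ["Order_id", "Product_id"],
}

def shared_fields(tables):
    """Return, for each pair of tables, the column names they have in common."""
    links = {}
    for (name_a, cols_a), (name_b, cols_b) in combinations(tables.items(), 2):
        common = sorted(set(cols_a) & set(cols_b))
        if common:
            links[(name_a, name_b)] = common
    return links

links = shared_fields(tables)
print(links)
```

In this example only Employee and Department share a column name (Dept_id), so they are the only pair that would be linked; Order has no column in common with the others and stays unconnected.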

Importing data from Excel into QlikView:


1. Wizard driven:
To load data from any file, we first need to click on “Edit Script”.
Steps:
1. Open the script editor.
2. Select the path of the table file from the wizard.
3. Select the data inside the file that we want to import.
4. Give labels if the data doesn't have them.
5. Multiple settings can be applied while selecting the data.
6. Then click on the “Reload” button.

2. Script driven:
When data is loaded through the wizard, QlikView writes the corresponding script itself.
When we open the script editor, some script is already present; it sets default values and
formats.
e.g. SET var = 10;
LOAD script format:
<TableName>: /* name given to this data set */
LOAD [column name1] as [newcolname], /* alias for the column, i.e. renaming the
column */
[column name2],
….
[column nameN]
FROM <path_of_file>/<table_name>
(<file type>, <label type>, table is [tablename]);
If we don't know the column names in the table, we can use the ‘*’ symbol after LOAD to load
all columns from the table.
For example:
DemoTable:
LOAD cola as A,
colb,
cold as C
FROM [C:\abc\xyz.xlsx]
(ooxml, embedded labels, table is [emp]);

Click the “Reload” button to load or reload the data.
