DWH Team

An Introduction to Database Concepts

Topics Covered…


Database

• A database is a structured collection of records or data stored in a computer system so that a program or person can easily retrieve and manipulate the data using a query language.

Database Management System [DBMS]

• A database management system (DBMS) is computer software designed for creating and maintaining databases; it also allows users to retrieve information from those databases.

• Examples: MS Access, FoxPro, dBase

Relational Database Management System [RDBMS]

• A type of DBMS in which the database is organized and accessed according to the relationships between data values. It is based on the relational model.

• Examples: Oracle, DB2, MS SQL Server, Teradata

Database Model

• A database model is a theory or specification describing how a database is structured and used.

Three types of data models are…
• Hierarchical Model
• Network Model
• Relational Model

Hierarchical Model

• The hierarchical data model organizes data in a tree structure of parent and child data segments. Each record has exactly one parent record (except the root, which has none) and can have many child records.

Network Data Model

• The network data model organizes data as a network (a graph), which allows each record to have multiple parent and child records.

Relational Data Model

• A relational database allows the definition of data structures, storage and retrieval operations and integrity constraints. In such a database the data and relations between them are organised in tables.

Database Objects

• Table
A table is the database object in which data is stored. A table consists of rows and columns. A row in a table is called a 'tuple' and a column is called an 'attribute'.

• View
A view is a virtual (logical) table computed from data in the database. Changing the data in an underlying table changes the data shown in the view.
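
The relationship between a table and a view can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the in-memory database and the employee and dwh_staff names are hypothetical and do not come from the slides.

```python
import sqlite3

# In-memory database; table, view and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", "DWH"), (2, "Ravi", "HR")],
)

# A view stores the query, not a copy of the rows.
conn.execute(
    "CREATE VIEW dwh_staff AS SELECT name FROM employee WHERE dept = 'DWH'"
)
before = conn.execute("SELECT name FROM dwh_staff ORDER BY name").fetchall()

# Changing the underlying table changes what the view shows.
conn.execute("INSERT INTO employee VALUES (3, 'Meena', 'DWH')")
after = conn.execute("SELECT name FROM dwh_staff ORDER BY name").fetchall()

print(before, after)
```

The view is queried twice with the same statement; only the table changed in between.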

Database Objects

• Stored Procedure
A stored procedure is a subroutine available to applications accessing a relational database system. Stored procedures are stored in the database itself.

• Trigger
A database trigger is procedural code that is automatically executed in response to certain events on a particular table in a database. Triggers can restrict access to specific data, perform logging, or audit data modifications.
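
As a rough sketch of a trigger used for auditing, the following uses Python's sqlite3 module. SQLite is an assumption made for runnability (it supports triggers but not stored procedures), and the account, audit_log and log_balance_change names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
    CREATE TABLE audit_log (
        account_id  INTEGER,
        old_balance INTEGER,
        new_balance INTEGER
    );

    -- Runs automatically after every UPDATE of account.balance.
    CREATE TRIGGER log_balance_change
    AFTER UPDATE OF balance ON account
    BEGIN
        INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
    END;

    INSERT INTO account VALUES (1, 100);
    UPDATE account SET balance = 150 WHERE id = 1;
""")

# The application never wrote to audit_log directly; the trigger did.
audit_rows = conn.execute("SELECT * FROM audit_log").fetchall()
print(audit_rows)
```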

Database Objects

• Index
A database index is a data structure that improves the speed of lookup operations on a table. Indexes can be created on one or more columns. The disk space required to store an index is typically less than that required by the table itself.

Types of index are
• Unique Index
• Non-unique Index
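
The difference between the two index types can be sketched as follows, assuming SQLite through Python's sqlite3 module. The customer table and the index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, email TEXT, city TEXT)")

# Non-unique index: speeds up lookups by city, duplicates are allowed.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
# Unique index: additionally rejects duplicate values in the indexed column.
conn.execute("CREATE UNIQUE INDEX idx_customer_email ON customer (email)")

conn.execute("INSERT INTO customer VALUES (1, 'a@example.com', 'Pune')")
conn.execute("INSERT INTO customer VALUES (2, 'b@example.com', 'Pune')")  # same city is fine

try:
    conn.execute("INSERT INTO customer VALUES (3, 'a@example.com', 'Delhi')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # The unique index refuses a second row with the same email.
    duplicate_rejected = True

row_count = len(conn.execute("SELECT * FROM customer").fetchall())
print(duplicate_rejected, row_count)
```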

Keys in a table

• Primary Key
  - It is a unique and non-nullable attribute of the table.
  - The primary key of a relational table uniquely identifies each record in the table.

• Foreign Key
  - It is a field in a relational table that matches the primary key column of another table.
  - A foreign key is used to establish and enforce a link between the data in two tables.
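
A minimal sketch of how primary and foreign keys are enforced, assuming SQLite through Python's sqlite3 module. The department and employee tables and the rejected helper are hypothetical; note that SQLite only checks foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department (dept_id)
    );
    INSERT INTO department VALUES (10, 'DWH');
    INSERT INTO employee VALUES (1, 'Asha', 10);
""")


def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Same emp_id as an existing row: violates the primary key.
duplicate_pk = rejected("INSERT INTO employee VALUES (1, 'Ravi', 10)")
# dept_id 99 does not exist in department: violates the foreign key.
orphan_fk = rejected("INSERT INTO employee VALUES (2, 'Ravi', 99)")
print(duplicate_pk, orphan_fk)
```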

Keys in a table

• Candidate Key
  - A minimal set of columns that uniquely identifies rows in a table. Any of the identified candidate keys can be used as the table's primary key.

• Super Key
  - Any set of columns that uniquely identifies rows in a table. A candidate key is a super key that contains only the minimum number of columns necessary to determine uniqueness.

Database Normalization
It is a technique for designing relational database tables to minimize duplication of information.

The goals of normalization are
• Eliminating redundant data
• Ensuring data dependencies make sense

Database Normalization

Types of normalization are
• 1st Normal Form
• 2nd Normal Form
• 3rd Normal Form
• Boyce-Codd Normal Form
• 4th Normal Form
• 5th Normal Form

Database Normalization

• 1st Normal Form
For a relation to be in 1NF, each column must contain only a single (atomic) value and each row must contain the same columns.

• 2nd Normal Form
In order to be in Second Normal Form, a relation must first fulfill the requirements to be in First Normal Form. Additionally, each non-key attribute in the relation must be functionally dependent upon the whole primary key, not just a part of a composite key.

Database Normalization

• 3rd Normal Form
In order to be in Third Normal Form, a relation must first fulfill the requirements to be in Second Normal Form. Additionally, no non-key attribute may depend on another non-key attribute; attributes that depend on the primary key only indirectly (transitively) must be moved to a separate relation.

• Boyce-Codd Normal Form
A relation is in Boyce-Codd Normal Form (BCNF) if every determinant is a candidate key.
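
A decomposition into Third Normal Form might look like the sketch below, written with Python's sqlite3 module. The orders_flat, customer and orders tables are hypothetical: in orders_flat the city depends on customer_id rather than directly on the key order_id, which is the kind of indirect dependency 3NF removes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Not in 3NF: customer_city depends on customer_id, not directly on order_id.
    CREATE TABLE orders_flat (
        order_id      INTEGER PRIMARY KEY,
        customer_id   INTEGER,
        customer_city TEXT
    );
    INSERT INTO orders_flat VALUES (1, 7, 'Pune'), (2, 7, 'Pune'), (3, 8, 'Delhi');

    -- Decomposition: the city is now stored once per customer.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer (customer_id)
    );
    INSERT INTO customer SELECT DISTINCT customer_id, customer_city FROM orders_flat;
    INSERT INTO orders SELECT order_id, customer_id FROM orders_flat;
""")

# Joining the two tables reproduces the original rows, so nothing was lost.
rebuilt = conn.execute("""
    SELECT o.order_id, o.customer_id, c.city
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
original = conn.execute("SELECT * FROM orders_flat ORDER BY order_id").fetchall()

# 'Pune' appeared twice in orders_flat but only once in customer.
pune_rows = len(conn.execute("SELECT * FROM customer WHERE city = 'Pune'").fetchall())
print(rebuilt == original, pune_rows)
```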

Database Normalization

• 4th Normal Form
To be in Fourth Normal Form, a relation must first be in Boyce-Codd Normal Form. Additionally, every non-trivial multivalued dependency in the relation must have a super key as its determinant, so the relation does not store two or more independent multivalued facts about the same entity.

• 5th Normal Form
A 4NF table is said to be in 5NF if and only if every join dependency in it is implied by the candidate keys.

Structured Query Language [SQL]

SQL is an ANSI standard computer language for accessing and manipulating database systems. Its statements are grouped into
• Data Definition Language [DDL]
• Data Manipulation Language [DML]
• Data Control Language [DCL]
• Transaction Control Language [TCL]

Data Definition Language [DDL]
SQL statements that can be used either interactively or within programming language source code to define databases and their components. DDL commands are …
• CREATE - creates tables
• ALTER - changes the table definition
• DROP - drops tables
• RENAME - renames a table
• TRUNCATE - deletes all data from a table but keeps the table definition
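
The DDL commands can be tried out with Python's sqlite3 module, as in this sketch. SQLite's dialect is an assumption here and differs slightly from other systems: renaming is written ALTER TABLE ... RENAME TO, and there is no TRUNCATE, so an unqualified DELETE stands in for it. The staff and employee names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names():
    """List the user tables currently defined in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


conn.execute("CREATE TABLE staff (id INTEGER, name TEXT)")   # CREATE
conn.execute("ALTER TABLE staff ADD COLUMN dept TEXT")        # ALTER
conn.execute("ALTER TABLE staff RENAME TO employee")          # RENAME (SQLite spelling)
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
after_rename = table_names()

conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'DWH')")
# SQLite has no TRUNCATE; DELETE without WHERE empties the table
# while keeping its definition.
conn.execute("DELETE FROM employee")
rows_left = len(conn.execute("SELECT * FROM employee").fetchall())

conn.execute("DROP TABLE employee")                           # DROP
after_drop = table_names()
print(columns, after_rename, rows_left, after_drop)
```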

Data Manipulation Language [DML]

SQL statements that can be used to manipulate the data in a relational table. It includes …
• SELECT - extracts data from a database table
• UPDATE - updates data in a database table
• DELETE - deletes data from a database table
• INSERT INTO - inserts new data into a database table
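
The four DML statements are shown together in the following sketch, which assumes SQLite through Python's sqlite3 module and a hypothetical product table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")

# INSERT INTO: add new rows.
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "pen", 10.0), (2, "book", 50.0), (3, "bag", 300.0)],
)
# UPDATE: change existing rows.
conn.execute("UPDATE product SET price = 12.0 WHERE name = 'pen'")
# DELETE: remove rows.
conn.execute("DELETE FROM product WHERE id = 3")
# SELECT: read rows back.
products = conn.execute("SELECT name, price FROM product ORDER BY id").fetchall()
print(products)
```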

Data Control Language [DCL]

SQL statements that are used to control access to data in a database. It includes…
• GRANT - allows specified users to perform specified tasks
• REVOKE - cancels previously granted or denied permissions

Transaction Control Language [TCL]

SQL statements that are used to control transactional processing in a database. It includes…
• COMMIT - applies the changes of a transaction
• ROLLBACK - undoes all changes of a transaction
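
The effect of COMMIT and ROLLBACK can be sketched with Python's sqlite3 module. The account table is hypothetical, and opening the connection with isolation_level=None is a choice made for this example so that the explicit BEGIN, COMMIT and ROLLBACK statements are the only transaction control in play.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below fully control the transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 0)")


def balances():
    return conn.execute("SELECT balance FROM account ORDER BY id").fetchall()


# Transfer 40 from account 1 to account 2 and make it permanent.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 40 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 40 WHERE id = 2")
conn.execute("COMMIT")
after_commit = balances()

# Start a change and then undo it.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 0")
conn.execute("ROLLBACK")
after_rollback = balances()
print(after_commit, after_rollback)
```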

Select Options…
• where - used to apply conditions to queries.
• group by - divides the rows in a table into groups, to return summary information for each group.
• having - used in combination with the GROUP BY clause. It can be used in a SELECT statement to filter the records that a GROUP BY returns.
• order by - specifies the order in which rows appear in the result.
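
How the four options combine in one query can be sketched as follows, assuming SQLite through Python's sqlite3 module. The sale table and its figures are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (region TEXT, year INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO sale VALUES (?, ?, ?)", [
    ("North", 2023, 100), ("North", 2024, 250),
    ("South", 2024, 40),  ("South", 2024, 30),
    ("East",  2024, 300), ("East",  2023, 500),
])

summary = conn.execute("""
    SELECT region, SUM(amount) AS total
    FROM sale
    WHERE year = 2024            -- filter individual rows first
    GROUP BY region              -- then form one group per region
    HAVING SUM(amount) > 100     -- keep only the groups that pass the condition
    ORDER BY total DESC          -- finally sort the result
""").fetchall()
print(summary)
```

WHERE removes the 2023 rows before grouping, while HAVING removes the South group afterwards because its 2024 total is only 70.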

Functions in SQL
The syntax for built-in SQL functions is
SELECT function(column) FROM table

Important types of functions…
• String Functions
• Conversion Functions
• Mathematical Functions
• Date Functions

String Functions

String functions are used to manipulate strings. Some string functions are…
• length - returns the length of the specified string
• concat - concatenates two strings together
• substr - extracts a substring from a string
• upper - converts all letters to uppercase
• trim - removes all specified characters from the beginning or the end of a string
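
A quick way to try these functions is Python's sqlite3 module, as sketched below. SQLite is an assumption here; the example uses the || operator for concatenation rather than concat, because concat is not available in all SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
string_results = conn.execute("""
    SELECT length('warehouse'),        -- number of characters
           'data' || 'base',           -- concatenation (|| operator)
           substr('warehouse', 1, 4),  -- 4 characters starting at position 1
           upper('dwh'),               -- uppercase conversion
           trim('  dwh  ')             -- strip spaces from both ends
""").fetchone()
print(string_results)
```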

Conversion Functions

Conversion functions are used to convert values from one datatype to another. Some conversion functions are…
• cast - converts one datatype to another
• to_char - converts a number or date to a string
• to_date - converts a string to a date
• to_number - converts a string to a number
• to_lob - converts LONG or LONG RAW values to LOB values
• convert - converts a string from one character set to another

Aggregation Functions

Aggregate functions operate against a collection of values, but return a single value. Some aggregation functions are…
• avg(column) - returns the average value of a column
• count(column) - returns the number of rows that have a non-null value in the column
• max(column) - returns the highest value of a column
• first(column) - returns the value of the first record in a specified field
• sum(column) - returns the total sum of a column
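
The behaviour of the aggregate functions, including how they treat NULL, can be sketched with Python's sqlite3 module. The score table is hypothetical, and first(column) is left out because SQLite does not provide it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE score (student TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO score VALUES (?, ?)",
    [("a", 70), ("b", 90), ("c", None), ("d", 80)],  # student c has no marks yet
)

# Each function collapses the whole column into a single value.
# NULL marks are ignored by avg, count(marks), max and sum.
aggregates = conn.execute("""
    SELECT avg(marks), count(marks), count(*), max(marks), sum(marks)
    FROM score
""").fetchone()
print(aggregates)
```

count(marks) counts only the three non-null values, whereas count(*) counts all four rows.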

Date Functions

Date functions are used to manipulate date and time values. Some date functions are…
• current_date - returns the current date
• last_day - returns the last day of the month based on a date value
• next_day - returns the date of the first specified weekday that falls after a given date
• to_date - converts a string to a date
• round - returns a date rounded to a specific unit of measure

Some other Commands / Functions

rowid - a pseudocolumn that uniquely identifies a row within a table, but not within a database.

rownum - for each row returned by a query, the ROWNUM pseudocolumn returns a number indicating the order in which Oracle selects the row from a table.

coalesce function - returns the first non-null expression in the list. If all expressions evaluate to null, the coalesce function returns null.

sample - used in queries to return a sample of the rows instead of the full output.
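
The coalesce function is available in most SQL dialects, so it can be sketched with Python's sqlite3 module. The contact table below is hypothetical; the query picks the mobile number if present and falls back to the landline.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (name TEXT, mobile TEXT, landline TEXT)")
conn.executemany("INSERT INTO contact VALUES (?, ?, ?)", [
    ("Asha", None, "020-555"),     # no mobile: falls back to landline
    ("Ravi", "98-111", "020-777"),  # mobile present: used as is
    ("Meena", None, None),          # nothing present: result is NULL
])

# coalesce returns the first non-null argument, or NULL if all are NULL.
best_number = conn.execute(
    "SELECT name, coalesce(mobile, landline) FROM contact ORDER BY name"
).fetchall()
print(best_number)
```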

Set Operations in Database

• Union: The UNION operator combines the results of two SQL queries into a single table of all matching rows. The two queries must have the same number of columns and compatible data types in order to be combined. Any duplicate records are automatically removed unless UNION ALL is used.

• Intersect: The SQL INTERSECT operator takes the results of two queries and returns only rows that appear in both result sets.

• Except: The SQL EXCEPT operator takes the distinct rows of one query and returns the rows that do not appear in a second result set.
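
The set operators can be compared side by side with a small sketch in Python's sqlite3 module. The single-column tables a and b and the run helper are hypothetical and exist only to make the differences visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")


def run(operator):
    """Combine the two tables with the given set operator, sorted by x."""
    sql = f"SELECT x FROM a {operator} SELECT x FROM b ORDER BY x"
    return [x for (x,) in conn.execute(sql)]


union_rows = run("UNION")          # duplicates removed
union_all_rows = run("UNION ALL")  # duplicates kept
intersect_rows = run("INTERSECT")  # rows present in both
except_rows = run("EXCEPT")        # rows in a that are not in b
print(union_rows, union_all_rows, intersect_rows, except_rows)
```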

A JOIN clause in SQL combines records from two tables in a relational database and results in a new (temporary) table, also called a "joined table". Structured Query Language specifies two types of joins: Inner join and Outer join

Inner Joins
INNER JOIN returns the rows from both tables where there is a match.

Inner join is subdivided into three types
• Equi-join
• Natural join
• Cross join

Inner Joins

• Equi-join: The type of join which links the columns of two tables using an equality relationship.

• Natural join: An inner join in which redundant columns are eliminated.

• Cross join: A cross join (or Cartesian product join) returns a result table where each row from the first table is combined with each row from the second table. The number of rows in the result table is the product of the number of rows in each table.
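
The three join types can be contrasted with a short sketch in Python's sqlite3 module. The employee and department tables are hypothetical; Meena's department 30 has no matching row, so she is absent from the equi-join and the natural join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, dept_id INTEGER);
    CREATE TABLE department (dept_id INTEGER, dept_name TEXT);
    INSERT INTO employee VALUES ('Asha', 10), ('Ravi', 20), ('Meena', 30);
    INSERT INTO department VALUES (10, 'DWH'), (20, 'HR');
""")

# Equi-join: explicit equality condition; dept_id appears from both tables.
equi = conn.execute("""
    SELECT e.name, e.dept_id, d.dept_id, d.dept_name
    FROM employee AS e
    INNER JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.name
""").fetchall()

# Natural join: joins on the shared column name and returns it only once.
natural = conn.execute(
    "SELECT * FROM employee NATURAL JOIN department ORDER BY name"
).fetchall()

# Cross join: every employee paired with every department (3 x 2 rows).
cross = conn.execute("SELECT * FROM employee CROSS JOIN department").fetchall()
print(equi, natural, len(cross))
```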

Outer Joins
A join operation in which records from one or both source tables are included in the result even when they do not satisfy the join condition.

Outer joins are subdivided into
• Left outer join
• Right outer join
• Full outer join

Outer Joins

• Left outer join: The resultant table contains all the records from the first table, plus the records from the second table that satisfy the join condition. Where there is no match, the columns of the second table are null.

• Right outer join: The resultant table contains all the records from the second table, plus the records from the first table that satisfy the join condition. Where there is no match, the columns of the first table are null.

• Full outer join: A full outer join combines the results of both left and right outer joins. The joined table contains all records from both tables.
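
The three outer joins can be sketched with Python's sqlite3 module. The tables are hypothetical, and because older SQLite versions lack RIGHT and FULL OUTER JOIN, this example emulates them: the right join is a left join with the tables swapped, and the full join is the UNION of the two. That emulation is an assumption of this sketch, and since UNION removes duplicate rows it is not an exact substitute in general.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, dept_id INTEGER);
    CREATE TABLE department (dept_id INTEGER, dept_name TEXT);
    INSERT INTO employee VALUES ('Asha', 10), ('Ravi', 20), ('Meena', 30);
    INSERT INTO department VALUES (10, 'DWH'), (20, 'HR'), (40, 'Finance');
""")

# Left outer join: every employee, with NULL where no department matches.
LEFT_SQL = """
    SELECT e.name, d.dept_name
    FROM employee AS e
    LEFT OUTER JOIN department AS d ON e.dept_id = d.dept_id
"""
# Right outer join, emulated by swapping the tables of a left join:
# every department, with NULL where no employee matches.
RIGHT_SQL = """
    SELECT e.name, d.dept_name
    FROM department AS d
    LEFT OUTER JOIN employee AS e ON e.dept_id = d.dept_id
"""

left_rows = set(conn.execute(LEFT_SQL))
right_rows = set(conn.execute(RIGHT_SQL))
# Full outer join, emulated as the union of both directions.
full_rows = set(conn.execute(f"{LEFT_SQL} UNION {RIGHT_SQL}"))
print(left_rows, right_rows, full_rows)
```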

Joins Diagrams
