
The Fundamentals of Database Administration
Sean Dunn
This book is for sale at http://leanpub.com/dbafundamentals

This version was published on 2021-05-07

This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing
process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and
many iterations to get reader feedback, pivot until you have the right book and build traction once
you do.

© 2020 - 2021 Sean Dunn


Contents

CHAPTER 1: The Who, What, and Why?
  Introduction
  Who should read this book?
  What is SQL, SQL Server, and a Database Administrator?
  Why should you learn the fundamentals of Database Administration?

Chapter Two: Database Types
  Introduction
  What is a database?
  Flat-File databases
  Hierarchical databases
  Relational databases
  Workbooks VS Databases
  Summary

Chapter 3: Relational Database Concepts
  Introduction
  Tables, Rows, and Columns
  Language Elements
  Normalization
  Indexes
  Pages and Files
  Summary

Chapter 4: Data Types
  Introduction
  Common Built-in Data Types
  Built-in Data Types
  Precision and Scale
  Large Data items
  Data Conversions
  Summary

Chapter 5: DML VS DDL
  Introduction
  Data Manipulation Language (DML)
  Data Definition Language (DDL)
  Summary

Chapter 6: SQL Server Management Studio
  Introduction
  Download SQL Server
  Download SSMS
  Access SSMS
  Exploring SSMS
  Executing Queries
  Other Tools
  Summary

Chapter 7: Database Objects
  Introduction
  Installing AdventureWorks
  Creating and Using Tables
  Creating and Using Views
  Creating and Using Stored Procedures
  SQL Injection
  Summary

CHAPTER 1: The Who, What, and Why?


Introduction
Allow me to introduce myself. My name is Sean Dunn. I currently work as a software consultant and
have been a .NET developer for over five years now. During that time, I have studied and completed
several courses and certifications, compiling an impressive amount of notes. Looking back through
them, I realized how much I have learned over the years and how much I have yet to learn. This
experience made me want to share what I learned, what made it click, and how it helped me succeed
in my career. If you wish to contact me, please email me at sdbooks23@gmail.com.

Who should read this book?


Everyone! I am writing with the assumption that you currently know nothing about the concepts
covered in this book. With that in mind, I will provide plenty of examples, explanations, and stories
that helped me understand the concepts. Hopefully, they will help you too. There are no prerequisites for
this book. Anyone with an interest in learning more about SQL, SQL Server, or the skillset of a DBA
will find this book helpful. This book will also help prepare you for exam 98-364 MTA: Database
Administration Fundamentals.

What is SQL, SQL Server, and a Database Administrator?
This book is compiled from my notes and experiences preparing for Microsoft’s 98-364 MTA:
Database Administration Fundamentals exam. The exam covers SQL and SQL Server and helps
prepare you for a role as a Database Administrator. I will not assume that everyone who reads
this knows what those are, so here is a brief introduction to each of them.
1. What is SQL?

• SQL (pronounced “sequel” or “ess-que-el” depending on who you ask) stands for Structured
Query Language. It is the primary language you use to communicate with a database. If you
are new to SQL, then this communication is a black box for you. In programming, a black box
is when you are aware of the inputs and the output but not aware of the steps that create the
output. For example, when you order food at a restaurant, you submit your order (input). A bit
later, you receive your food (output). However, you are unaware of the process used to create
your meal.

To see how useful and popular SQL is, take a look at the image from the most recent Stack Overflow
Developer Survey.

Top Technologies

SQL has consistently ranked among the top programming languages in Stack Overflow’s survey since 2013,
when they started tracking it. By the end of this book, you should be comfortable communicating
with databases using SQL.
2. What is Microsoft’s SQL Server?

• SQL Server is a relational database management system. What a relational database means
will be covered in the next chapter. It is worth noting that SQL Server is usually used together
with Microsoft’s SQL Server Management Studio (SSMS), which is downloaded separately, as
covered in Chapter 6. SSMS is a Graphical User Interface (GUI) developed by Microsoft to allow
the user to easily configure, manage, and administer components in SQL Server.

3. What is a Database Administrator?



• A Database Administrator (DBA) is a person who takes on the role of maintaining a safe
database environment by directing or performing all related activities to keep the data secure.
The primary duty of a DBA is to maintain data integrity, meaning the DBA ensures that
information is protected from unauthorized access but remains available to users.

Why should you learn the fundamentals of Database Administration?
Database administration is a highly in-demand job that can lead to a rewarding and profitable career.
As a DBA, you will be in charge of your company’s data and will often play a vital role in the company.
Even if you don’t want to be a DBA, learning the fundamentals can help you in a variety of other
fields, from Web Development to Artificial Intelligence. Learning to handle and manipulate data
makes you a valued asset at any company.
Chapter Two: Database Types


Introduction
In this chapter, we will be covering some of the fundamental concepts about databases, defining
what a database is, as well as the three most common types of databases.

What is a database?
A database (DB) is an organized collection of data, similar to a phone book, stored in an electronic
format. It allows you to insert, update, delete, and retrieve data quickly. Traditionally, databases are
composed of files, records, and fields.
That definition may be too technical for some, so let us examine a phone book as a simple example.
If you could digitize a phone book and store it on a disk, it would turn into a file. Therefore, the
book itself is a file. If you were to open that file (open the book), you would see a list of records.
Each of these would include a name, address, and phone number. In other words, a record would
be a single row listed in the phone book. Each of those pieces of information (name, address, and
phone number) is a field. A field is the same as a column.

Phonebook

As you can see, a phone book can be pretty substantial, with hundreds of records listed on each page.
Imagine having to flip through an unsorted phone book trying to find the person you are looking
up. Thankfully, databases don’t work that way. To retrieve data from a database, you run what is
called a query. A query is a request you send to the database that returns information.
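To make the idea of a query concrete, here is a minimal sketch in Python. It uses the built-in sqlite3 module as a lightweight stand-in for a database server (it is not SQL Server), and the phone_book table, its rows, and the find_phone function are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; the phone_book table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phone_book (name TEXT, address TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO phone_book (name, address, phone) VALUES (?, ?, ?)",
    [
        ("John Carter", "123 Main St", "555-0100"),
        ("Sally Yates", "321 Second St", "555-0101"),
        ("Michael Williams", "567 First St", "555-0102"),
    ],
)


def find_phone(connection, name):
    """Send a query to the database and return the matching phone number, if any."""
    row = connection.execute(
        "SELECT phone FROM phone_book WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


print(find_phone(conn, "Sally Yates"))
```

The SELECT statement is the query: it describes the record we want, and the database does the searching for us.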
Databases, like phone books, can often be massive, containing thousands of records with many fields.
Even modern computers can take a significant amount of time searching through a table to find the
requested data. In a phone book, when we want to find something fast, we flip to the Index page
to see where the information exists. Similarly, we can create an index data structure in a database
table to speed up data retrieval. We will be covering how indexes work in greater detail later.
Now that we have a basic understanding of a database, let’s talk about where you can find and
access them. Most databases exist on a database server. The advantage of a database server is that
it allows multiple users to access the database and provides a high level of performance. Microsoft
SQL Server is one such database server. Interestingly, very few people access a database server
directly. Instead, they use a database management system (DBMS) to access the server indirectly.
As mentioned in the previous chapter, a DBMS allows you to configure, manage, and administer
components easily in a database server. Microsoft’s SQL Server Management Studio is a DBMS.
To get a better idea of how this works, imagine yourself working at the ticket counter at a movie
theater. When a customer purchases a ticket from you, you will use a ticket system program on the
computer at your station to punch in the order. The system would then access the database to add
the purchase. You are not directly updating the database. The program is doing that for you.
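The ticket counter scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite stands in for the database server, and the ticket_sales table and sell_ticket function are hypothetical names, not part of any real ticketing system.

```python
import sqlite3

# Hypothetical ticket database; SQLite stands in for the real database server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ticket_sales (sale_id INTEGER PRIMARY KEY, movie TEXT, seats INTEGER)"
)


def sell_ticket(connection, movie, seats):
    """What the ticket program does when the clerk punches in an order."""
    with connection:  # commits on success, rolls back on error
        cursor = connection.execute(
            "INSERT INTO ticket_sales (movie, seats) VALUES (?, ?)", (movie, seats)
        )
    return cursor.lastrowid


# The clerk only calls the program; the program talks to the database.
sale_id = sell_ticket(conn, "Example Movie", 2)
total_seats = conn.execute("SELECT SUM(seats) FROM ticket_sales").fetchone()[0]
print(sale_id, total_seats)
```

The person at the counter never writes SQL. They use the program, and the program updates the database on their behalf.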
The last thing we will cover in this section is the types of files SQL Server uses to store databases.
Primary data files have a .mdf extension, are the first files the database creates, and are where
most, if not all, of your data lives.
Secondary data files have a .ndf extension and are used when you run out of space on your first
hard disk or are dealing with large databases. By putting some of the data into the primary .mdf and
other data in the secondary .ndf files, you give SQL Server more scalability. Scalability means SQL
Server can handle a more intensive workload without issues. For example, say you just opened a new
web store and expect a thousand people to visit on the first day. To your
surprise, the store is a hit, and ten thousand people come to your site. If you had only set up SQL
Server to handle your estimated one thousand, it would have been unable to handle the workload.
Your website would become laggy and unresponsive, and the server could potentially shut down.
Fortunately, you set your server up with the ability to scale by adding a secondary .ndf file. Now
able to handle the workload, you rake in the profit. If you need more scalability for your site, you
can add more secondary data files on additional hard disks.
The third and final type of file is the .ldf log file. If you don’t know what a log file is, consider it a
checklist for transactions. The log file keeps track of all the changes made to the databases. We will
go into more detail about why this is useful in a later lesson.
Now that you know what a database is, how databases are accessed, and how they physically store
data, it is time to go over the most common database types. There are three types of databases you
should be familiar with when choosing how to build your database tables.

• Flat-file databases
• Hierarchical databases
• Relational databases

In the following sections, we will cover each database type and its distinctive design features.

Flat-File databases
The first type of database is a flat-file database. Flat files are “flat” because they are
two-dimensional tables consisting of rows and columns. These are very simple in design. If you
have ever entered data into an Excel sheet, you have created what is essentially a flat-file database.
Because they hold one record per line, they are efficient, allowing users to access and run queries
on the database quickly. Other common examples include a comma-separated values (CSV) file and
a tab-delimited file (TDF).

CSV Database

Name   Age  Address        City      State
John   28   123 Main St    NY City   NY
Jane   45   321 Second st  Santa Fe  NM
Sally  21   567 First st   Dallas    TX
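
As a rough sketch of how a flat file is read, the Python snippet below parses the same data as CSV text using the standard csv module. The comma-separated layout and the load_people function are assumptions for illustration; io.StringIO stands in for a file on disk.

```python
import csv
import io

# The CSV text mirrors the table above; io.StringIO stands in for a file on disk.
CSV_TEXT = """Name,Age,Address,City,State
John,28,123 Main St,NY City,NY
Jane,45,321 Second st,Santa Fe,NM
Sally,21,567 First st,Dallas,TX
"""


def load_people(text):
    """Parse CSV text into a list of dicts, one dict per record (line)."""
    return list(csv.DictReader(io.StringIO(text)))


people = load_people(CSV_TEXT)

# A simple "query": the names of everyone older than 25.
older_than_25 = [person["Name"] for person in people if int(person["Age"]) > 25]
print(older_than_25)
```

Each line of the file becomes one record, and each comma-separated value becomes one field of that record.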

Hierarchical databases
The next type of database is a Hierarchical database. This type of database uses a tree structure
to define relationships between multiple tables. A tree structure is comparable to a family tree. A
family tree is composed of parent and child relationships. Each child has one set of parents, and
parents can have one or more children.
Similarly, in a hierarchical database, each table can have one parent and multiple children. Most
people have probably used tree structures without realizing it. If you have ever opened File Explorer
and navigated through folders, you were browsing through a tree structure. Let’s take a quick look at a
simple example to see how this works with tables.
(Parent) Employee Table

EmpID Name Position


1 John Manager
2 Sally Developer

(Child) Equipment Table

SerialID Type EmpID


123456 Laptop 1
456789 Computer 2
012345 Monitor 2

In this example, the parent table holds the employee data. Each record, or row, contains
information about one employee. The most important field, or column, in the table is the employee’s
ID (EmpID), and you will see why in just a second. The child table holds information about the
company’s equipment and, thanks to the EmpID field, who is using it. The EmpID links the employee
table to the equipment table. For example, if you wanted to know what equipment Sally was using,
you would look up her EmpID (2), then look at the equipment table. From the equipment table, you
would see that she is using the computer and monitor, as they match her EmpID. Likewise, if
you wanted to know who was using the laptop, you would look at the EmpID on the equipment
table and see that it is 1. You would then go to the employee table and see that John has an EmpID of
1. You would then know that John is using the laptop.
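The two lookups described above can be sketched in plain Python. The data is copied from the Employee and Equipment tables; the function names equipment_for and user_of are made up for this illustration and do not come from any database product.

```python
# Data copied from the Employee (parent) and Equipment (child) tables above.
employees = [
    {"EmpID": 1, "Name": "John", "Position": "Manager"},
    {"EmpID": 2, "Name": "Sally", "Position": "Developer"},
]
equipment = [
    {"SerialID": "123456", "Type": "Laptop", "EmpID": 1},
    {"SerialID": "456789", "Type": "Computer", "EmpID": 2},
    {"SerialID": "012345", "Type": "Monitor", "EmpID": 2},
]


def equipment_for(name):
    """Parent to children: find the employee's EmpID, then match equipment rows."""
    emp_ids = [e["EmpID"] for e in employees if e["Name"] == name]
    return [item["Type"] for item in equipment if item["EmpID"] in emp_ids]


def user_of(equipment_type):
    """Child to parent: read the EmpID on the equipment row, then find that employee."""
    for item in equipment:
        if item["Type"] == equipment_type:
            for employee in employees:
                if employee["EmpID"] == item["EmpID"]:
                    return employee["Name"]
    return None


print(equipment_for("Sally"))
print(user_of("Laptop"))
```

In both directions, the EmpID value is the only thing that connects a row in one table to a row in the other.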
Hopefully, you have a decent understanding of how hierarchical databases work as the next database
type will be similar, but slightly more complicated.

Relational databases
The last and most crucial database type is the relational database. We will be working with
relational databases for the rest of this book. A relational database is composed of multiple flat-file
tables that have relationships with one another. These relationships work
similarly to those in a hierarchical database, except each child table can have multiple parent tables.
In contrast, in the hierarchical database, each child can only have one parent. Since all the flat-file
tables are connected or related to one another, we call this database type a relational database.
(Parent) Customer Table
CustID Name
100 Jake
101 Logan
102 Emily

(Parent) Product Table


ProdID  Name      Desc                       NumInStock
1       Keyboard  Black mechanical keyboard  10
2       Laptop    Apple laptop               5
3       Desktop   Windows desktop setup      3

(Child) Order Table



OrderID ProdID CustID


1000233 3 102
1000234 1 101
1000235 1 102
1000236 2 100

Let’s take a look at the tables shown above and see how they are related to one another. The first
thing to note is the table names. We have a customer, product, and order table. The customer table
contains customer ID and name fields. Likewise, the product table consists of product ID, name,
description, and number-in-stock fields.
Interestingly, the order table has three ID fields: the order ID, the product ID, and the customer
ID. These IDs allow the order table to have two parent tables: the product table and the customer
table. This design is something we would be unable to do with a hierarchical database, as it only
allows a single parent relationship.
Let’s dive further into what our example is showing us. If we wanted to know what items Jake has
purchased, we would look at the order table and see that his CustID (100) is related to order number
1000236. From there, we can see he purchased a product with an ID of 2. Looking at our product
table, we know that he bought an Apple laptop. What if you want to know who has purchased a
keyboard? Looking at our product table, we see the keyboard has a product ID of 1. Moving to our
order table, we see that the keyboard has been purchased twice, in orders 1000234 and 1000235. From
there, we see that customers 101 and 102 have both purchased a keyboard. Looking at our customer
table, we find out that Logan and Emily have both acquired a keyboard. What if we want to know what
makes up order 1000233? You should get the idea by this point. The order table shows the
related product ID of 3 and a customer ID of 102. Viewing the product and customer tables shows
that the order consists of a desktop purchased by Emily.
The point of walking you through each of those scenarios is to drill into how relational databases
find data. Understanding how tables relate to each other and how a database finds data is crucial
for DBAs.
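The same three lookups can be expressed as queries. The sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the exact syntax may differ slightly from what we will write later in SSMS. The order table is named Orders and the Desc column Description here because ORDER and DESC are reserved words in SQL; the function names are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server. Data is copied from the three tables above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Product (ProdID INTEGER PRIMARY KEY, Name TEXT,
                      Description TEXT, NumInStock INTEGER);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY,
                     ProdID INTEGER REFERENCES Product(ProdID),
                     CustID INTEGER REFERENCES Customer(CustID));
INSERT INTO Customer VALUES (100, 'Jake'), (101, 'Logan'), (102, 'Emily');
INSERT INTO Product VALUES (1, 'Keyboard', 'Black mechanical keyboard', 10),
                           (2, 'Laptop', 'Apple laptop', 5),
                           (3, 'Desktop', 'Windows desktop setup', 3);
INSERT INTO Orders VALUES (1000233, 3, 102), (1000234, 1, 101),
                          (1000235, 1, 102), (1000236, 2, 100);
""")

# The child table (Orders) is joined to both of its parents through the ID columns.
JOINED = """
SELECT o.OrderID, c.Name, p.Name, p.Description
FROM Orders AS o
JOIN Customer AS c ON c.CustID = o.CustID
JOIN Product AS p ON p.ProdID = o.ProdID
"""


def purchases_of(customer_name):
    """Descriptions of every product a customer has ordered."""
    rows = conn.execute(JOINED + "WHERE c.Name = ? ORDER BY o.OrderID", (customer_name,))
    return [description for _, _, _, description in rows]


def buyers_of(product_name):
    """Names of the customers who ordered a given product."""
    rows = conn.execute(JOINED + "WHERE p.Name = ? ORDER BY o.OrderID", (product_name,))
    return [customer for _, customer, _, _ in rows]


def order_summary(order_id):
    """(customer name, product name) for one order, or None if it does not exist."""
    row = conn.execute(JOINED + "WHERE o.OrderID = ?", (order_id,)).fetchone()
    return (row[1], row[2]) if row else None


print(purchases_of("Jake"))
print(buyers_of("Keyboard"))
print(order_summary(1000233))
```

Each function follows the same path we walked by hand: start in one table, follow the matching ID into the order table, and then follow the other ID out to the second parent table.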

Workbooks VS Databases
Starting the database creation process with a spreadsheet is often a good idea. Worksheets allow us
to create headings and begin entering data quickly. It is also easy to edit the formatting and sort
through the data in a variety of ways.
We have previously gone over that a spreadsheet in Excel is essentially a flat-file type of database.
To create a mock-up of a relational database, we would add more sheets to the workbook. Excel also
allows us to develop relationships between these tables, but we won’t be going over that here. Just
know it is possible. Let’s take a look at the spreadsheets below to get an understanding.

Customer Table in Excel

Product Table in Excel



Order Table in Excel

The spreadsheets display the same tables that we previously went through. You can see that this
workbook consists of the customer, product, and order worksheets. At this point, you may be asking
how this is different from a database. The difference lies in scale. Spreadsheets can handle short lists
of data but become unwieldy after a few thousand records. Databases, on the other hand, are used to store
thousands, millions, or possibly billions of records. Databases are also generally much faster at
accessing data. In a spreadsheet, you would most likely scroll through the sheet to look up the data.
This method is usually not very efficient.
Furthermore, databases have access to all of a server’s resources, allowing thousands of users to use
them at once. Usually, only a single user can access a spreadsheet at a time. As you can see, a database
can do everything a spreadsheet does, but faster and at a much larger scale.

Summary
Let’s review what we learned in this chapter. We started off defining a database.

A database (DB) is an organized collection of data, similar to a phone book, stored in
an electronic format. It allows you to insert, update, delete, and retrieve data quickly.
Traditionally, databases are composed of files, records, and fields.

We then went over some of the more useful database concepts, as defined below.

• A query is a request you send to the database that returns information.
• An index data structure in a database table speeds up data retrieval.
• A database server hosts databases.
• A database management system (DBMS) allows indirect access and provides an easy-to-use
interface to interact with the database server.

We followed that by going over the three types of files a database creates to store data and process
transactions.

• Primary data files (.mdf) are the first files the database creates on the primary hard disk and
are where most, if not all, of your data lives.
• Secondary data files (.ndf) are used when you run out of space on your first hard disk or
are dealing with large databases. By putting some of the data into the primary .mdf and other data
in the secondary .ndf files, you give SQL Server more scalability.
• The log file (.ldf) keeps track of all the changes made to the databases.

The next three sections covered the three primary database types.

• A flat-file database is simple in design. These databases hold one record per line, allowing
a user to access and run queries quickly. Common types include CSV files, TDF files, and
Excel spreadsheets (.xlsx).
• A hierarchical database uses a tree structure to create relationships between tables. Each
parent table can have multiple children, but each child table can have only one parent.
• A relational database is composed of multiple flat-file tables that have relationships with
one another. These relationships work similarly to those in a hierarchical
database, except each child table can have multiple parent tables.

The last thing we covered in this chapter was comparing an Excel workbook to a database. In the
end, we determined that the main difference was scale. Here are some things to remember:

• A spreadsheet is an excellent place to start creating a database.
• Spreadsheets are useful for small data lists and crunching numbers but become unwieldy after a few
thousand entries.
• Databases store thousands, millions, or possibly billions of data records.
• Databases are more efficient at finding data than spreadsheets.
• Databases have access to all of the resources of a server, allowing thousands of users at once.
• A spreadsheet usually only allows one user to access it at a time.

That concludes this chapter. Hopefully, by now, you have a good grasp of all the concepts listed
above. In the next part, we will delve deeper into relational database concepts.
Chapter 3: Relational Database Concepts

Introduction
Before designing and creating your first database, you must first understand the components of a
relational database and the terminology used to describe them. In this chapter, we will cover essential
concepts for relational databases. Specifically, we will cover tables, rows, and columns and the
relationships between them, then talk about some of the crucial language elements you need to know,
and move on to a brief overview of normalization. Lastly, we will cover indexes and SQL Server’s
physical storage: pages and files.

Tables, Rows, and Columns


A relational database consists of multiple tables that help organize data and define relationships within it.
Each table is essentially a flat-file database, similar to an Excel spreadsheet or a CSV file. From our
previous examples, we see that a table typically stores one type of data: a customer, a product, an order,
etc. Tables include columns, which are also called fields or domains, and rows, which are also called
records or entities.
Each column represents one piece of information that you want to store in the database, such as
a name, ID, or address. Similarly, each row contains all of the data for a single record, such as a
customer or a product. The table’s name is usually a good hint for what a row contains. Let’s take
another look at some of the tables given in the previous chapter.
Customer Table
CustID Name
100 Jake
101 Logan
102 Emily

Order Table
OrderID ProdID CustID
1000233 3 102
1000234 1 101
1000235 1 102
1000236 2 100

The customer table, predictably, contains information about customers. There are three records, or
customers, listed in this table. Each column contains one piece of information about those customers:
a CustID number that internally keeps track of different customers and a Name column for their
name. The bottom table contains information about customer orders. There are four rows here and
three columns. One of the columns, CustID, creates a relationship with the customer table’s
CustID column. Additionally, each order has its own order ID and a link, through the ProdID, to
the product the customer purchased.

Language Elements
We know from the previous section that a table consists of columns and rows. New to us is the
concept of data types. A data type is an attribute that specifies the type of data of an object.
Customer Table
CustID Name
100 Jake
101 Logan
102 Emily

For example, taking a look at the table above, we can classify the data in CustID as an int, short
for integer, data type. An int data type means only a whole number is a valid input into the CustID
column. You can’t enter a decimal or character. We will cover data types in a lot more detail in the
next chapter.
Constraints are rules placed on a field to limit the type of data entered. You can think of them as
restrictions for data. For example, if we add a new column to our Customer table for the customer’s
age, we know that input has to be positive as a person can’t have an age below zero.
Customer Table
CustID Name Age
100 Jake 45
101 Logan 32
102 Emily 21

Listed below are the most common constraints that a DBA should know.

• Unique: Prevents duplicate values. Note, the NULL value can only be listed once.
• Check: Limits the values a user can insert into a column.
• Default: Inserts a default value if no other value is specified.
• Not Null: The field must have a value; it cannot be NULL.
• Primary Key: Uniquely identifies each record in the table. Each table can have only one
primary key. Note, while primary keys follow the unique constraint, they cannot contain
a NULL value.
• Foreign Key: Points from a field in one table to the primary key of another table. Note, foreign
key columns can contain NULL values. Furthermore, a foreign key can reference columns in
the same table. This practice is called a self-reference.
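
To see these constraints in action, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The Email and Country columns, the Orders table, and the is_rejected helper are hypothetical additions for illustration. Note that SQLite behaves differently from SQL Server in a few details (for example, SQLite allows several NULLs in a unique column), so the snippet avoids those cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Customer (
    CustID  INTEGER PRIMARY KEY,            -- Primary Key
    Name    TEXT NOT NULL,                  -- Not Null
    Email   TEXT UNIQUE,                    -- Unique (hypothetical column)
    Age     INTEGER CHECK (Age >= 0),       -- Check
    Country TEXT DEFAULT 'Unknown'          -- Default (hypothetical column)
);
CREATE TABLE Orders (
    OrderID INTEGER PRIMARY KEY,
    CustID  INTEGER REFERENCES Customer(CustID)  -- Foreign Key
);
INSERT INTO Customer (CustID, Name, Email, Age)
VALUES (100, 'Jake', 'jake@example.com', 45);
""")

INSERT_CUSTOMER = "INSERT INTO Customer (CustID, Name, Email, Age) VALUES (?, ?, ?, ?)"
INSERT_ORDER = "INSERT INTO Orders (OrderID, CustID) VALUES (?, ?)"


def is_rejected(sql, params):
    """Return True if the database refuses the statement because of a constraint."""
    try:
        with conn:  # rolls back automatically if the statement fails
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "duplicate primary key": is_rejected(INSERT_CUSTOMER, (100, "Logan", "logan@example.com", 32)),
    "missing name": is_rejected(INSERT_CUSTOMER, (101, None, "logan@example.com", 32)),
    "duplicate email": is_rejected(INSERT_CUSTOMER, (101, "Logan", "jake@example.com", 32)),
    "negative age": is_rejected(INSERT_CUSTOMER, (101, "Logan", "logan@example.com", -1)),
    "order for unknown customer": is_rejected(INSERT_ORDER, (1, 999)),
    "valid row": is_rejected(INSERT_CUSTOMER, (101, "Logan", "logan@example.com", 32)),
}
default_country = conn.execute(
    "SELECT Country FROM Customer WHERE CustID = 100"
).fetchone()[0]

for case, rejected in results.items():
    print(case, "->", "rejected" if rejected else "accepted")
print("default country:", default_country)
```

The database itself refuses the bad rows, so the rules hold no matter which program tries to insert the data.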

Normalization
Normalization is a vital practice for DBAs. In a nutshell, normalization is the design process used
to eliminate redundant data to save space in the database by splitting the database into two or more
tables and defining relationships between them. This process helps to improve the storage of the
database. If you eliminate unnecessary data, you make the database smaller. This process, in turn,
helps to improve the efficiency of the database.
The basic idea is that you shouldn’t have duplicates of data. Each piece of data is stored only one
time within the database. If you need to reference data outside of where it is stored, you create a
relationship to refer back to it.
Let’s look at an example of a denormalized database and then compare it to a normalized database.
Denormalized Database
Family Table
Mother Father Child1 Child2 Child3


Emily Gary Sean Matthew
Jennifer Roy Shelby
Jannet Rusty Michael Will Caroline

Normalized Database
Marriage Table

MarriageID Mother Father


1 Emily Gary
2 Jennifer Roy
3 Jannet Rusty

Child Table
ChildID Name Parents
1 Sean 1
2 Matthew 1
3 Shelby 2
4 Michael 3
5 Will 3
6 Caroline 3

The top table shows a denormalized database. The Family table contains a mother, a father, and three
child attributes. You can see a few key differences if you compare it to the bottom two tables, which
form a normalized database. Firstly, the normalized database comprises two tables, the Marriage
and Child tables, instead of the single Family table of the denormalized database. Splitting the
data into two tables allows us to eliminate the multiple child columns. Let’s think about why this is
useful. If a family had four children, where would the fourth child be stored? The Family table only
has three child columns. By creating a Child table, we can add as many children as we need and relate
them to their parents.
The second big difference is the use of IDs. In the normalized database, each record is uniquely
identified by a MarriageID or a ChildID. Let’s think about why this is important. These IDs allow us
to create a relationship between a child and their parents: the Parents column of the Child table
holds the MarriageID of the child’s parents. We can see that Sean and Matthew are born from the
marriage of Gary and Emily, Shelby is born from Jennifer and Roy, and Michael, Will, and Caroline are
born from Jannet and Rusty. Now you might be thinking, so what? We can tell that from the
denormalized table as well. It’s easy to see in a small table, but databases can hold millions of
rows. You need some way to identify a record to retrieve it quickly. What if we used the parent
names? There is no guarantee that the parents’ names are unique across several million rows. There
could be several marriages between people named Emily and Gary. No matter how you look at it, the
denormalized database can’t identify each record uniquely because it doesn’t contain a primary key.
Keys are essential for creating relationships between tables and for enforcing entity integrity and
referential integrity. Entity integrity means that every record in a table can be uniquely
identified, and referential integrity means that a key referenced from another table must actually
exist in the table it points to. A key doesn’t have to be an ID. However, IDs work in most cases and
are easy to set up and use.
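The Marriage and Child tables can be tried out in a few lines. What follows is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server (SQLite's types and syntax differ in places). The fourth child, Anna, and the non-existent marriage 99 are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; table and column names follow the example tables.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.executescript("""
CREATE TABLE Marriage (
    MarriageID INTEGER PRIMARY KEY,
    Mother     TEXT NOT NULL,
    Father     TEXT NOT NULL
);
CREATE TABLE Child (
    ChildID INTEGER PRIMARY KEY,
    Name    TEXT NOT NULL,
    Parents INTEGER NOT NULL REFERENCES Marriage (MarriageID)
);
INSERT INTO Marriage VALUES (1, 'Emily', 'Gary'), (2, 'Jennifer', 'Roy'), (3, 'Jannet', 'Rusty');
INSERT INTO Child VALUES (1, 'Sean', 1), (2, 'Matthew', 1), (3, 'Shelby', 2),
                         (4, 'Michael', 3), (5, 'Will', 3), (6, 'Caroline', 3);
""")

# A fourth child (hypothetical) is just one more row, not a new column.
conn.execute("INSERT INTO Child (Name, Parents) VALUES ('Anna', 3)")
children_of_3 = [row[0] for row in conn.execute(
    "SELECT Name FROM Child WHERE Parents = 3 ORDER BY ChildID")]
print(children_of_3)

# Referential integrity: marriage 99 does not exist, so the insert is rejected.
try:
    conn.execute("INSERT INTO Child (Name, Parents) VALUES ('Nobody', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

The primary keys give every row an identity, and the foreign key on Parents is what stops a child from pointing at a marriage that is not there.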

To summarize, normalization is the design process of organizing data into multiple tables and
defining relationships between them. This method allows us to remove redundant data and ensure
integrity between our data tables and records. We will jump into the nitty-gritty of how to normalize
a database in a later chapter. Having a concept of what it is and why it is essential will help you
better understand some of the upcoming lessons.

Indexes
Without an index, SQL Server retrieves data by checking each row of a table for the data you are
trying to find. In the simplest case, rows are stored in the order they were inserted, not in an
order that helps your search. To see why this is inefficient, let’s look at a simple example.
Name Table
ID FirstName LastName
1 John Carter
2 Sally Yates
.. ……… ……..
104524 Michael Williams
.. ……… ……..

In the table above, we see that there is no order to how people are stored. Let’s say we want to find
the person Michael Williams. To do so, we write a query searching for a person with FirstName =
Michael and LastName = Williams. SQL Server will then start with row 1 and compare it to our
query to see if it matches. It will then proceed through each row in turn until it reaches row
104524. This brute force approach can be very inefficient, depending on the location of the data.
Indexes help us solve this issue.
Indexes work by keeping a sorted copy of selected columns, together with a reference back to the
full row, so that SQL Server can look up information quickly. This design operates similarly to an
index in a book. You look at the index to quickly find what you are looking for, then use the page
number or other information provided to flip straight to the correct section of the book. I should
point out that an index does duplicate data. However, since an index is not a table and exists to
make the table more efficient, it does not violate the normalization rules.
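The effect of an index can be observed by asking a database how it plans to run a query. The sketch below assumes SQLite (through Python's sqlite3 module) rather than SQL Server, and the index name IX_Name_Last_First is invented for the example. SQL Server exposes similar information through its execution plans, but the wording of the output differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Name (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT)")
conn.executemany(
    "INSERT INTO Name (FirstName, LastName) VALUES (?, ?)",
    [("John", "Carter"), ("Sally", "Yates"), ("Michael", "Williams")],
)

query = "SELECT ID FROM Name WHERE FirstName = 'Michael' AND LastName = 'Williams'"

def plan(sql):
    """Return SQLite's own description of how it intends to run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " / ".join(row[-1] for row in rows)  # the last column holds the description

plan_without_index = plan(query)  # describes a scan of the whole table
conn.execute("CREATE INDEX IX_Name_Last_First ON Name (LastName, FirstName)")
plan_with_index = plan(query)     # describes a search that uses the new index

print(plan_without_index)
print(plan_with_index)
```

The first plan reports a scan, which is the row-by-row approach described above; the second reports a search using the index.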

Pages and Files


SQL Server stores data on disk in 8 KB chunks. These chunks of data are called pages. Each row in
a table must fit within one page, but a page can hold multiple rows if the rows are small. Indexes
are also stored in pages. When SQL Server operates on the disk, it does so one page at a time.
This process plays a critical role in how SQL Server works. If SQL Server needs to modify data, it
will perform a read operation on the disk, meaning it will find the 8 KB page that contains the data,
then read that page into memory. Once the page has been changed in memory, the change is not
instantly written back to the disk. SQL Server knows that there might be more upcoming changes, and
it will be faster if it just keeps this information in memory. Later it will apply all the changes
to the disk to keep the data safe.
Keeping information in memory, while more efficient, is also dangerous. If the power gets cut or
a server goes down, the information held in memory is wiped out. Luckily, SQL Server has a way of
recovering lost data through the log file we previously discussed. SQL Server keeps track of all
the changes in a transaction log. This log operates as a checklist of sorts. When a transaction is
written back to disk, the log file puts a check by it to note that. This way, when we are restoring
a database after a power outage, we know which transactions are safe on the disk and which
transactions still need to be written to disk. The transaction log is stored on the drive to prevent
its loss during a crash. While SQL Server is making changes in memory, it also quickly adds them to
the transaction log.
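The checklist idea can be modelled in a few lines of Python. This is a toy sketch only: the class ToyEngine and all of its method names are invented here, and SQL Server's real recovery process is considerably more involved (for example, it also undoes transactions that never finished).

```python
class ToyEngine:
    """Toy model: change pages in memory, log every change on disk, flush pages later."""

    def __init__(self):
        self.disk_pages = {}  # page_id -> contents; survives a crash
        self.disk_log = []    # transaction log, also on disk; survives a crash
        self.memory = {}      # page_id -> contents; lost in a crash

    def write(self, page_id, contents):
        # Record the change in the log, then change the page in memory only.
        self.disk_log.append({"page": page_id, "contents": contents, "on_disk": False})
        self.memory[page_id] = contents

    def flush(self):
        # Copy the changed pages to disk and tick them off in the log.
        self.disk_pages.update(self.memory)
        for entry in self.disk_log:
            entry["on_disk"] = True

    def crash(self):
        self.memory = {}  # power cut: everything in memory is gone

    def recover(self):
        # Replay every logged change that never reached the data pages.
        for entry in self.disk_log:
            if not entry["on_disk"]:
                self.disk_pages[entry["page"]] = entry["contents"]
                entry["on_disk"] = True


engine = ToyEngine()
engine.write(1, "Dallas 104")
engine.flush()
engine.write(2, "Denver 89")  # logged, but the page itself is still only in memory
engine.crash()
pages_after_crash = dict(engine.disk_pages)
engine.recover()
pages_after_recovery = dict(engine.disk_pages)

print(pages_after_crash)
print(pages_after_recovery)
```

After the crash only page 1 is on disk; replaying the unchecked log entry brings page 2 back.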

Summary
This chapter focused on relational database concepts and the terminology used to describe them. We
started off covering the basics of tables, columns, and rows.

• Tables: In a relational database, a table is essentially a flat-file database.


• Columns: A column is also known as a field or domain and contains one attribute used to
describe the row.
• Rows: A row is also known as a record or entity and contains data for each instance.

We then moved on to covering specific language elements you need to know: data types and
constraints.

• Data types: A data type is an attribute that specifies the type of data of an object.
• Constraints: are rules placed on a field to limit the type of data entered. You can think of them
as restrictions for data.

After defining constraints, we then covered the most common types.

• Unique: prevents duplicate values. Note, the NULL value can only be listed once.
• Check: limits the types of data a user can insert into the database.
• Default: inserts a default value if no other value is specified.
• Not Null: The field must have a value, it cannot be NULL.
• Primary Key: Uniquely identifies each record in the table. Each table can have only one
primary key. Note, while primary keys follow the unique constraint, they cannot contain
a NULL value.
• Foreign Key: Points from a field in one table to the primary key of another table. Note, foreign
key columns can contain NULL values. Furthermore, a foreign key can reference columns in
the same table. This practice is called a self-reference.

Normalization is one of the more essential concepts we covered in this chapter.

• Normalization: Normalization is the design process used to eliminate redundant data to save
space in the database. It does so by splitting the database into two or more tables and defining
relationships between them.

Following normalization, we covered another vital concept, indexes.

• Index: An index is an on-disk structure associated with a table that speeds up data retrieval
from the table.

Lastly, we talked about how SQL Server physically stores data using pages and files.

• Page: An 8KB chunk of data stored on a disk.


• File: The log file is critical to how SQL Server operates and to restoring a database in the
event of a failure.

Hopefully, this chapter gives you a good grasp of some of the more crucial relational database
concepts.
Chapter 4: Data Types

Data Types

Introduction
In this chapter, we will be covering database data types, why they are essential, and how they affect
storage requirements. Each column in a table has a designated data type that indicates the type of
data that the column will store. When choosing a data type, we need to carefully consider which type
provides the most efficient storage and querying. Data types can be challenging to change
once a table has been created and holds data, which is why we need to plan them when designing the
table’s schema. One of the critical roles of a DBA is to decide which data type best suits the current
application. SQL Server includes a large number of built-in data types, and programmers can also
create custom user-defined data types. User-defined data types are rare, as the built-in data
types cover almost all situations.

Common Built-in Data Types


SQL Server organizes built-in data types into the following categories:

• Exact numbers
• Approximate numbers
• Date and time
• Character strings
• Unicode character strings
• Binary strings
• Other data types
• CLR data types
• Spatial data types

You will find yourself using some of these data types regularly and others sporadically. The most
common data types you will need to know are as follows:

• Money(numeric)
• Datetime/Datetime2
• Integer
• Varchar
• Boolean
• Float

Let’s quickly cover these data types. Money is a numeric data type commonly used wherever you
store money or currency. Money is a fixed-precision data type: it stores values to exactly four
decimal places. If you need more decimal places, for example when calculating a percentage, it is
best to use the float data type. A float is an approximate data type, but we will cover that more in
a bit.
There are two commonly used date and time data types. The datetime data type stores dates between
January 1, 1753, and December 31, 9999, and is accurate to 3.33 milliseconds. The datetime2 data
type stores dates between January 1, 0001, and December 31, 9999, and is accurate to 100
nanoseconds. The second significant difference between the two types is storage: datetime always
uses 8 bytes, whereas datetime2 uses 6 to 8 bytes, depending on the fractional-second precision.
The integer (int) is a numeric data type that stores whole numbers. It ranges from -2^31
(-2,147,483,648) to 2^31 - 1 (2,147,483,647). Common uses for integers are unique identifiers and
mathematical computations where you don’t require a decimal in the output.
The varchar datatype is a character-string datatype commonly used when you are supporting only
English text. If you are supporting multiple languages, use the nvarchar data type to minimize issues
with character conversions.
SQL Server has no data type named boolean; the bit data type plays that role. A bit stores 1 or 0,
which stand for true and false. Typical uses include setting a flag on a record. For example, if we
need to know whether a customer has finished payment processing for an order, we can have a
FinishedPaymentFlag attribute that is 0 if the customer hasn’t paid and 1 if they have finished. You
may have noticed that I used 0 for false and 1 for true. That is because SQL Server converts the
strings TRUE and FALSE to bit values: TRUE converts to 1 and FALSE converts to 0.

The last commonly used data type is the float data type. The float data type is an approximate-
number data type. Approximate-number data type means that not every value within the range can be
represented exactly; some values are stored as the nearest value the type can hold. A float supports
a precision of either 7 or 15 significant digits, depending on the type of float used.
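Approximate storage is easy to see in any language that uses binary floating point. The sketch below uses Python, whose float is an 8-byte binary floating-point number; to my understanding this is the same format as the larger SQL Server float, but treat that equivalence as an assumption. Python's Decimal plays the part of an exact type such as decimal or money.

```python
from decimal import Decimal

# Binary floating point cannot hold 0.1 or 0.2 exactly, so the sum drifts slightly.
float_sum = 0.1 + 0.2
float_is_exact = (float_sum == 0.3)

# An exact decimal type stores the digits as written.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_is_exact = (decimal_sum == Decimal("0.3"))

print(float_sum, float_is_exact)      # 0.30000000000000004 False
print(decimal_sum, decimal_is_exact)  # 0.3 True
```

This is why exact types are the usual choice for values that must add up to the cent, while float suits measurements where a tiny error is acceptable.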

Built-in Data Types


This section lists the built-in data types but does not dive into much detail. If you want to
know more about a specific data type, Microsoft has documentation online about each one.
Exact Numerics
Data Type   Range                                                                        Storage
bit         0 to 1                                                                       1 byte
tinyint     0 to 255                                                                     1 byte
smallint    -2^15 (-32,768) to 2^15 - 1 (32,767)                                         2 bytes
int         -2^31 (-2,147,483,648) to 2^31 - 1 (2,147,483,647)                           4 bytes
bigint      -2^63 (-9,223,372,036,854,775,808) to 2^63 - 1 (9,223,372,036,854,775,807)   8 bytes
numeric     -10^38 + 1 to 10^38 - 1                                                      Varies
decimal     -10^38 + 1 to 10^38 - 1                                                      Varies
smallmoney  -214,748.3648 to 214,748.3647                                                4 bytes
money       -922,337,203,685,477.5808 to 922,337,203,685,477.5807                        8 bytes
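The ranges of the signed integer types follow directly from their storage size: one bit records the sign, and because zero takes up one of the non-negative slots, the largest positive value is one less than the size of the negative range. A short Python sketch (the function name signed_range is made up for this example) makes the arithmetic concrete. It does not apply to tinyint, which is unsigned, or to bit.

```python
def signed_range(storage_bytes):
    """Smallest and largest whole number a signed integer of this size can hold."""
    bits = storage_bytes * 8
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Storage sizes in bytes, as listed for the exact numeric types.
integer_types = {"smallint": 2, "int": 4, "bigint": 8}
ranges = {name: signed_range(size) for name, size in integer_types.items()}

for name, (low, high) in ranges.items():
    print(f"{name}: {low:,} to {high:,}")
```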

Approximate Numerics

Data Type  Range                    Storage
float      -1.79E+308 to 1.79E+308  Varies
real       -3.40E+38 to 3.40E+38    4 bytes

DateTimes
Data Type       Range                                                                   Storage
datetime        January 1, 1753 to December 31, 9999 with an accuracy of 3.33           8 bytes
                milliseconds
datetime2       January 1, 0001 to December 31, 9999 with an accuracy of 100            6-8 bytes
                nanoseconds
smalldatetime   January 1, 1900 to June 6, 2079 with an accuracy of 1 minute            4 bytes
date            January 1, 0001 to December 31, 9999                                    3 bytes
time            Accuracy of 100 nanoseconds                                             3-5 bytes
datetimeoffset  January 1, 0001 to December 31, 9999 with an accuracy of 100            10 bytes
                nanoseconds, with time zone awareness
timestamp       Stores a unique number that gets updated every time a row gets
                created or modified. The timestamp value is based upon an internal
                clock and does not correspond to real time. Each table may have only
                one timestamp column.

Character Strings

Data Type     Description                      Max Size               Width
char(n)       Fixed width character string     8,000                  Defined width
varchar(n)    Variable width character string  8,000                  2 bytes + number of chars
varchar(max)  Variable width character string  2 GB (2^31 - 1 bytes)  2 bytes + number of chars

Unicode Character Strings

Data Type      Description                    Max Size                    Width
nchar          Fixed width Unicode string     4,000                       Defined width x 2
nvarchar       Variable width Unicode string  4,000
nvarchar(max)  Variable width Unicode string  2 GB (2^30 - 1 characters)

Binary Strings

Data Type Description Max Size Width


binary(n) Fixed width binary string 8,000
varbinary Variable width binary string 8,000
varbinary(max) Variable width binary string 2GB

Other Data Types

Data Type         Description
sql_variant       Stores up to 8,000 bytes of data of various data types, except text, ntext, and
                  timestamp
uniqueidentifier  Stores a globally unique identifier (GUID)
xml               Stores XML formatted data. Maximum 2GB
cursor            Stores a reference to a cursor used for database operations
table             Stores a result-set for later processing

Precision and Scale


Decimal and numeric data types have a fixed precision and scale. The syntax is as follows:

• decimal[(p[,s])]
• numeric[(p[,s])]

This syntax may be hard to follow, so let us start by defining precision and scale and then walk
through how the syntax works.
Precision is the maximum total number of digits to the left and right of the decimal point. Precision
ranges from a minimum of 1 up to a maximum of 38. For example, the number 180.34 has a precision of
five because there are five digits. You should note that the default precision is 18, meaning that if
you don’t specify a value, it will be 18.
Scale is the maximum number of digits to the right of the decimal point. The value ranges from 0
to p, where p is the defined precision value. You can only specify a scale if you have also
specified a precision. For example, the number 180.34 has a precision of 5 and a scale of 2. The
default value for scale is 0.
Now that we have defined our terms, let us look at how to use the decimal syntax to create our data
type. Let us start with the number we used above, 180.34. We have stated that this number has a
precision of 5 and a scale of 2, so the data type that fits it exactly is decimal(5,2). What happens
if we define the data type as decimal(2,5)? Well, it would be invalid. According to our rules, the
scale can only go up to the value of precision. If we define it as decimal(5,4), four of the five
digits are reserved for the right of the decimal point, which leaves only one digit to the left, so
180.34 does not fit and SQL Server raises an arithmetic overflow error. Decimal(5) would store 180
because our default value for scale is 0, so the digits after the decimal point are rounded away.
Plain decimal, with no precision or scale, would also store 180 because it defaults to
decimal(18,0). It is essential to use the smallest precision and scale that meet your needs, because
storage grows with precision.
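The rules for precision and scale can be checked with a small helper. This is a sketch in Python rather than T-SQL: the function to_decimal is invented for illustration and only mimics how a decimal(p,s) column treats a value, assuming values are rounded to the scale and rejected when too many digits remain to the left of the decimal point.

```python
from decimal import Decimal, ROUND_HALF_UP, localcontext


def to_decimal(value, precision=18, scale=0):
    """Mimic storing a value in decimal(precision, scale)."""
    if not (1 <= precision <= 38 and 0 <= scale <= precision):
        raise ValueError("need 1 <= precision <= 38 and 0 <= scale <= precision")
    with localcontext() as ctx:
        ctx.prec = 60  # enough working digits for any precision up to 38
        # Round to the requested number of digits after the decimal point.
        rounded = Decimal(value).quantize(Decimal(1).scaleb(-scale), rounding=ROUND_HALF_UP)
        # Only (precision - scale) digits are available left of the decimal point.
        if abs(rounded) >= Decimal(10) ** (precision - scale):
            raise OverflowError(f"{value} does not fit in decimal({precision},{scale})")
    return rounded


print(to_decimal("180.34", 5, 2))  # 180.34
print(to_decimal("180.34", 5))     # 180
print(to_decimal("180.34"))        # 180

try:
    to_decimal("180.34", 5, 4)
except OverflowError as error:
    print(error)
```

Calling to_decimal("180.34", 2, 5) raises a ValueError, because a scale larger than the precision is not a valid definition in the first place.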

Large Data items


SQL Server is often used to store large data items. Varchar(max) and varbinary(max) are the data
types most frequently used to store large blocks of text and BLOBs. BLOB stands for binary large
object, which here means a chunk of data that doesn’t fit in the 8 KB page. In this case, the page
contains a pointer to where the BLOB is stored. The BLOB is outside the page on one or more
successive pages. While useful for blocks of text, images, and large binary files, it is not the
most efficient use of SQL Server resources in some cases. For example, if we are trying to store
Word or Excel documents, a better process is to let those files sit on the disk and save the file
path, or a pointer to the file, in the table’s records. SQL Server also offers the FILESTREAM
feature, which helps integrate a file system and file attachments with the data. We won’t cover
FILESTREAM, as it is more advanced than the introduction to concepts this book intends to cover. I
mention it to make you aware that SQL Server has many different capabilities to help you with the
storage of large data.

Data Conversions
There are times when you will need to convert between data types. In some cases, SQL Server can
implicitly convert between them. An implicit conversion means that SQL Server does the conversion
under the hood for you. It works like a black box: you supply the input, and you don’t see how SQL
Server does the conversion behind the scenes. Implicit conversions can carry a performance penalty.
Therefore, I would advise you to store the data in the form you intend to use or use SQL Server’s
functions to convert the data types explicitly.
SQL Server has two explicit conversion functions: CAST and CONVERT. CAST is compliant with
ANSI standards, so queries that use it are easier to move to other database management systems.
CONVERT is more flexible, but it is specific to SQL Server, so queries that use it are harder to
move to other systems.

The syntax of the CAST function is:

CAST(source-value AS destination-type)

For example, if you wanted to convert a decimal to an int, you would do the following:

CAST(25.55 AS int)

This CAST would output 25.
The syntax for the CONVERT function is:

CONVERT(data_type[(length)], expression[, style])

This syntax is a little more complex but is very useful once you master it. The length parameter
allows you to specify how many digits or characters the value will have. The style parameter
selects the output format, which lets you convert between different regional conventions. For
example:

CONVERT(nvarchar(10), TransactionDate, 101)

The above call will convert the TransactionDate from a datetime type to an nvarchar value. The
101 style represents the USA style of dating, i.e., mm/dd/yyyy.
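Both ideas can be tried without a SQL Server instance. The sketch below assumes SQLite through Python's sqlite3 module: SQLite supports CAST, but it has no CONVERT, so its strftime function with an explicit pattern stands in for style 101. The date 2021-05-07 is an arbitrary sample value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CAST to an integer type drops the fractional part, as in the CAST example above.
cast_result = conn.execute("SELECT CAST(25.55 AS INTEGER)").fetchone()[0]

# Stand-in for CONVERT(nvarchar(10), TransactionDate, 101): format a date as mm/dd/yyyy.
us_date = conn.execute("SELECT strftime('%m/%d/%Y', '2021-05-07')").fetchone()[0]

print(cast_result)  # 25
print(us_date)      # 05/07/2021
```

The CAST line would run largely unchanged on other database systems, which is the portability advantage described above; the date formatting is where each system goes its own way.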
As you can see, conversion can be very tricky. The following URL leads to Microsoft documentation
about conversions:
https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-conversion-database-engine?view=sql-server-2017
I often reference this documentation as it provides a great chart of which conversions are
explicit, implicit, or not allowed.

Summary
In this chapter, we started by defining a data type.

• Data type: indicates the type of data that the column will store

We then covered the most common data types.

• Money(numeric)
• Datetime/Datetime2
• Integer
• Varchar
• Boolean
• Float

Following that, we talked about precision and scale in decimal and numeric data types.

• Precision: Precision is the maximum total number of digits to the left and right of the decimal
point. It ranges from 1 to 38.
• Scale: Scale is the maximum number of digits to the right of the decimal point.

We then quickly covered some of the ways you can deal with massive data on SQL Server.

• BLOB: binary large object, a chunk of data that doesn’t fit within the 8kb page.

Finally, we covered both implicit and explicit conversions.

• Implicit: SQL Server will make the conversion under the hood for a performance cost.
• CAST: A CAST is an explicit conversion function that follows the ANSI standards.
• CONVERT: A more powerful explicit conversion function than CAST, but one that is specific to
SQL Server and so harder to move to other systems.

By the end of this chapter, you should know the most common data types, and be familiar with
the others. You should also understand precision and scale, be familiar with how to store large
pieces of data, understand how implicit conversions work, and know how to use both the CAST and
CONVERT functions to convert data types explicitly.
Chapter 5: DML VS DDL

DDL VS DML

Introduction
In this chapter, we will be covering the basics of the SQL language. SQL stands for Structured
Query Language. We will focus on two different portions of SQL: Data Manipulation Language (DML)
and Data Definition Language (DDL).

Data Manipulation Language (DML)


Data Manipulation Language (DML) is the language element of SQL that allows you to issue
queries to retrieve, insert, or manipulate existing data. SQL Server uses T-SQL, or Transact-SQL,
a variant of the SQL language. T-SQL is compliant with American National Standards Institute
(ANSI) industry standards.
The core DML statements include the following:

• SELECT: Retrieves data from the database through the selection of one or more rows or
columns from one or more tables.
• INSERT: Adds data to the database. It can add one or multiple rows.
• UPDATE: Updates existing data in the database.
• DELETE: Removes existing data in the database.
• MERGE: Performs an insert, update, or delete based on the results of another query. A join is
often used instead of a merge. We will cover joins in a later chapter.

In the next few sections, we will be covering examples of each DML statement.

SELECT Examples
The SELECT statement is likely to be the most common DML statement you will use. SELECT allows
you to retrieve data from a table and follows this form:

SELECT column1, column2, ...
FROM table_name;

Weather Base Table


City State High Low
Dallas TX 104 95
Nashville TN 97 89
Denver CO 89 84

You can see in the table above that we are looking at the weather for different cities. The table
contains City, State, High, and Low attributes. Let’s put that SELECT statement to use by writing a
query that retrieves only the names of the cities where the High is above 95.

SELECT City FROM Weather WHERE High > 95

Breaking apart our SQL query, we are selecting the City attribute from the Weather table where the
High is greater than 95. The WHERE clause is something we haven’t covered yet, but it is pretty
straightforward: it filters the result based on the conditions provided, in this case keeping only
the cities whose High is greater than 95. This query will result in the following table.
Query 1 Results

City
Dallas
Nashville

Dallas and Nashville return because they have highs above 95, and we only wanted the City field.
What if, instead of just the City field, we wanted all the attributes returned?

SELECT * FROM Weather WHERE High > 95

Looking at the query, we notice one big difference from the previous statement. Instead of City,
we have *. In SQL, * means all columns. Our statement then reads: select all columns from the
Weather table where High is greater than 95. This query will result in the following table.
Query 2 Results

City State High Low


Dallas TX 104 95
Nashville TN 97 89

You should notice that all the columns have returned. If we only wanted the City and State, then we
would replace * with City, State.
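The three variations can be run side by side. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of SQL Server; the SELECT statements themselves are the ones from this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Weather (City TEXT, State TEXT, High INTEGER, Low INTEGER)")
conn.executemany(
    "INSERT INTO Weather VALUES (?, ?, ?, ?)",
    [("Dallas", "TX", 104, 95), ("Nashville", "TN", 97, 89), ("Denver", "CO", 89, 84)],
)

# One column, every column, and a hand-picked pair of columns, all with the same filter.
cities = conn.execute("SELECT City FROM Weather WHERE High > 95").fetchall()
all_columns = conn.execute("SELECT * FROM Weather WHERE High > 95").fetchall()
city_and_state = conn.execute("SELECT City, State FROM Weather WHERE High > 95").fetchall()

print(cities)
print(all_columns)
print(city_and_state)
```

Each result contains the Dallas and Nashville rows; only the number of columns per row changes.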

INSERT Example
The INSERT statement is used to create new entities in a table and follows this syntax:

INSERT INTO table_name
VALUES (value1, value2, value3, ...);

Let us look back at the weather table.


Weather Base Table
City State High Low
Dallas TX 104 95
Nashville TN 97 89
Denver CO 89 84

Suppose we receive a request to add Lexington, Kentucky, with a High of 95 and a Low of 87 to the
table. We can write the following query to insert our new entity into the table.

INSERT INTO Weather
VALUES ('Lexington', 'KY', 95, 87);

We can read this query as an insert into the Weather table, a City of Lexington, State of KY, a High
of 95, and a Low of 87. This statement will create the following result.
Weather Insert Table 1

City State High Low


Dallas TX 104 95
Nashville TN 97 89
Denver CO 89 84
Lexington KY 95 87

We can see our new record in the table. The previous example seems pretty straightforward, so let’s
look at a more complex example. Our client comes back to us and requests that we add a new record
to the table for Naples, Florida, but doesn’t provide us with the temperatures. The question we need
to answer is how to insert data into only some of the columns. Luckily, SQL gives us an easy way
to accomplish this.

INSERT INTO Weather (City, State)
VALUES ('Naples', 'FL');

As you can see, we can specify which columns we want to add data to by listing them in parentheses
after the table name. This syntax is convenient when we don’t have all the data but still need to
insert a record.
At this point, you may be wondering what happens to the attributes for which we did not insert
data. Let us take a look at the resulting table.
Weather Insert Table 2
City State High Low
Dallas TX 104 95
Nashville TN 97 89
Denver CO 89 84
Lexington KY 95 87
Naples FL NULL NULL

We can see a new value called NULL added into those columns. NULL marks a field with no data.
Therefore, if you don’t define a value for a column in an INSERT statement, NULL is inserted instead.
If you fail to specify a value for a column that has a NOT NULL constraint and no default value, the
insert will fail.
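Both behaviours can be reproduced in a few lines. The sketch below again assumes Python's sqlite3 module in place of SQL Server. The NOT NULL constraints on City and State, and the city Miami, are assumptions added for the demonstration; they are not part of the Weather table as defined in this chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: City and State are declared NOT NULL for this demonstration.
conn.execute(
    "CREATE TABLE Weather (City TEXT NOT NULL, State TEXT NOT NULL, High INTEGER, Low INTEGER)"
)
conn.execute("INSERT INTO Weather VALUES ('Lexington', 'KY', 95, 87)")
conn.execute("INSERT INTO Weather (City, State) VALUES ('Naples', 'FL')")

# The columns we skipped come back as None, which is how Python represents NULL.
naples = conn.execute("SELECT * FROM Weather WHERE City = 'Naples'").fetchone()
print(naples)

# Leaving out State breaks its NOT NULL constraint, so the whole insert is rejected.
try:
    conn.execute("INSERT INTO Weather (City, High) VALUES ('Miami', 90)")
    not_null_enforced = False
except sqlite3.IntegrityError:
    not_null_enforced = True
print(not_null_enforced)
```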

UPDATE Example
The UPDATE statement is used to modify existing data in the table and follows this syntax:

UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;

The WHERE clause is optional, but you should almost always include it. Otherwise, whatever column
values you set will be applied to every record in the table.

Let us take a look at the following scenario and how you handle it. Your boss comes to you and
says that the Weather for Nashville is displaying the wrong High. The High should be 95, but it is
showing a High of 97. He has tasked you to fix this mistake.
Weather Base Table
City State High Low
Dallas TX 104 95
Nashville TN 97 89
Denver CO 89 84

Looking at the table, we can verify that the High for Nashville is 97. To fix this issue, we run the
following statement.

UPDATE Weather
SET High = 95
WHERE City = 'Nashville'

The statement reads as update the weather table, set the High attribute to 95 where the City equals
Nashville. This SQL statement will result in the following table.
Weather Update Table 1

City State High Low


Dallas TX 104 95
Nashville TN 95 89
Denver CO 89 84

We can see that Nashville now has a High of 95. I previously mentioned that you should almost
always include the WHERE condition. Let us understand what would happen if we didn’t include
it.

UPDATE Weather
SET High = 95

This statement will result in the following updated table.


Weather Update Table 2

City State High Low


Dallas TX 95 95
Nashville TN 95 89
Denver CO 95 84

As we can see, not including the WHERE condition has set every record’s High attribute to 95.
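One way to guard against this is to check how many rows a condition matches before running the UPDATE, and to look at how many rows the statement reports as changed afterwards. The sketch below assumes Python's sqlite3 module in place of SQL Server, where the rowcount attribute plays the part of the "rows affected" message you would see in a query tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Weather (City TEXT, State TEXT, High INTEGER, Low INTEGER)")
conn.executemany(
    "INSERT INTO Weather VALUES (?, ?, ?, ?)",
    [("Dallas", "TX", 104, 95), ("Nashville", "TN", 97, 89), ("Denver", "CO", 89, 84)],
)

# Preview: how many rows does the WHERE condition match?
matched = conn.execute("SELECT COUNT(*) FROM Weather WHERE City = 'Nashville'").fetchone()[0]

# With the WHERE clause, only the matching row changes.
changed_with_where = conn.execute(
    "UPDATE Weather SET High = 95 WHERE City = 'Nashville'").rowcount

# Without it, every row in the table is updated.
changed_without_where = conn.execute("UPDATE Weather SET High = 95").rowcount

highs = [row[0] for row in conn.execute("SELECT High FROM Weather")]
print(matched, changed_with_where, changed_without_where, highs)
```

If the number of changed rows is larger than the preview count, something is wrong with the statement.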

DELETE Example
The DELETE statement is used to remove existing data from a table and follows this syntax:

DELETE FROM table_name WHERE condition;

Like the UPDATE statement, the WHERE clause in the DELETE statement is optional; however,
you should almost always include it. Otherwise, you will delete every single record from the table.
Suppose your boss comes to you and says we need to remove Colorado cities from the database.
Colorado has decided to break away from the United States and form its own union. Since our weather
system only tracks towns in the United States, we must remove them from our database.
Weather Base Table
City State High Low
Dallas TX 104 95
Nashville TN 97 89
Denver CO 89 84

While an unlikely scenario, we do as our boss asks and issue the following statement to the database.

DELETE FROM Weather
WHERE State = 'CO';

This statement reads as delete from the Weather table, all records where the State equals CO. This
script results in the following modified table.
Weather Delete Table 1
City State High Low
Dallas TX 104 95
Nashville TN 97 89

As we can see, no cities are listed with CO as the State, satisfying our boss. I mentioned before that
you should almost always include the WHERE clause in the DELETE statement. Let us see what
happens when we remove the WHERE clause.

DELETE FROM Weather

Weather Delete Table 2
City State High Low

There are no rows in the table. We effectively told the database to delete all records from the Weather
table. This process can be hazardous, so take note to always include the WHERE clause in the
DELETE statement.
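Two habits make an accidental DELETE less likely to hurt. The sketch below assumes Python's sqlite3 module in place of SQL Server, and it relies on transactions, which this book has only touched on in connection with the transaction log, so treat it as a preview rather than a complete recipe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Weather (City TEXT, State TEXT, High INTEGER, Low INTEGER)")
conn.executemany(
    "INSERT INTO Weather VALUES (?, ?, ?, ?)",
    [("Dallas", "TX", 104, 95), ("Nashville", "TN", 97, 89), ("Denver", "CO", 89, 84)],
)
conn.commit()  # make the three starting rows permanent

# Habit 1: preview the rows with a SELECT that uses the same WHERE clause.
to_delete = conn.execute("SELECT City FROM Weather WHERE State = 'CO'").fetchall()

# Habit 2: a DELETE that has not been committed yet can still be rolled back.
conn.execute("DELETE FROM Weather")  # the mistake: no WHERE clause
rows_after_mistake = conn.execute("SELECT COUNT(*) FROM Weather").fetchone()[0]
conn.rollback()                      # undo the uncommitted delete
rows_after_rollback = conn.execute("SELECT COUNT(*) FROM Weather").fetchone()[0]

print(to_delete, rows_after_mistake, rows_after_rollback)
```

The preview shows that only Denver would be removed, and the rollback brings back all three rows after the mistaken statement.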

Data Definition Language (DDL)


Data Definition Language (DDL) is used to create and manipulate database objects such as
tables and views. The ANSI standards define the core DDL statements, but every database vendor adds
its own extensions, so DDL varies more between vendors than DML does. Most of the DDL statements are
simpler to execute using the SSMS interface. However, this book will focus on using DDL statements
in T-SQL scripts, as it allows a bit more power, flexibility, and control. You can also use DDL
statements to script tasks or activities that need to be executed on a schedule or as required.
There are five main DDL statements:

• USE: Changes the database context.


• CREATE: Creates a database object such as a table or a view.
• ALTER: Changes an existing object.
• DROP: Removes an object from the database.
• TRUNCATE: Removes rows from a table and frees up the resources used by those rows.
TRUNCATE is similar to the DELETE DML command, which also removes rows from a table
but does not free up space.

USE Example
The USE statement allows a user to specify which database they want to query and follows this
syntax:

USE { database_name }

For example, if we are currently working in the master database but want to pull some information
from the WeatherSystem database, we can use the following statement.

USE [WeatherSystem]

SELECT * FROM Weather

When we analyze the above statement, it is essential to note that we started out in a database other
than WeatherSystem. The USE statement switches the context, so the SELECT that follows runs against
WeatherSystem. The USE statement is useful for pulling information from multiple databases and for
scripting out procedures.

CREATE Example
The CREATE statement creates database objects such as tables and views.
For example, if we want to create the weather table we have been using in our cases, we would do
the following.

USE [WeatherSystem]
GO
CREATE TABLE [Weather](
    [ID] [int] IDENTITY(1,1) PRIMARY KEY,
    [City] [varchar] (50),
    [State] [char] (2),
    [High] [smallint],
    [Low] [smallint]
)
GO

The USE statement specifies that we want to create the table in the WeatherSystem database. GO
marks the end of a batch, so the USE statement is executed before the CREATE TABLE that follows. In
our CREATE TABLE, we specify Weather as the name. We set an ID of type int as the primary key. The
IDENTITY(1,1) property auto-increments the value of the column, starting at one and increasing by
one for each new row. Using IDENTITY is a convenient way of handling a primary key. We then include
the City, State, High, and Low attributes we have seen above. We define the City attribute as
varchar(50). I am unaware of any city names that would reach this limit, so a length of 50 is a good
option. Since we are using state abbreviations, char(2) is a great data type to use. Initially,
tinyint might seem to be a good data type for the High and Low attributes, since the temperature
will never go above 255. However, we have to analyze further. Tinyint can’t store numbers below
zero, but temperatures can be negative. Therefore, smallint is a better data type to use. The final
GO ends the batch that contains the CREATE TABLE command, resulting in the following table schema.
New Weather Table
ID City State High Low

We can see this is the same design as the example tables we have been using, except we added an
ID column as a primary key.
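The same design can be tried without a SQL Server instance. This sketch assumes Python's sqlite3 module. SQLite has no IDENTITY(1,1), so INTEGER PRIMARY KEY AUTOINCREMENT is used as the closest equivalent, and SQLite accepts type names such as VARCHAR(50) without enforcing the length. The Low of -4 for Denver is a made-up value to show a negative temperature.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: AUTOINCREMENT stands in for SQL Server's IDENTITY(1,1).
conn.execute("""
CREATE TABLE Weather (
    ID    INTEGER PRIMARY KEY AUTOINCREMENT,
    City  VARCHAR(50),
    State CHAR(2),
    High  SMALLINT,
    Low   SMALLINT
)
""")

# We never supply an ID; the database hands out 1, 2, ... on its own.
conn.execute("INSERT INTO Weather (City, State, High, Low) VALUES ('Dallas', 'TX', 104, 95)")
conn.execute("INSERT INTO Weather (City, State, High, Low) VALUES ('Denver', 'CO', 89, -4)")

ids = [row[0] for row in conn.execute("SELECT ID FROM Weather ORDER BY ID")]
columns = [row[1] for row in conn.execute("PRAGMA table_info(Weather)")]
denver_low = conn.execute("SELECT Low FROM Weather WHERE City = 'Denver'").fetchone()[0]

print(ids)         # [1, 2]
print(columns)     # ['ID', 'City', 'State', 'High', 'Low']
print(denver_low)  # -4
```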

ALTER Example
The ALTER statement modifies an existing database object, such as a table or view. For example,
suppose we are adding new functionality to our weather system. In addition to tracking the high and
low of the City, we want to track the average temperature. To do so, we write the following statement.

USE [WeatherSystem]
GO
ALTER TABLE Weather
ADD Average smallint
GO

This statement reads as: in the WeatherSystem database, alter the Weather table by adding a new
column called Average of the smallint data type. This results in the following table schema:
Altered Weather Table
ID City State High Low Average

We can now see the Average column in our table.
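A natural follow-up question is what the new column contains for rows that already exist. The sketch below assumes Python's sqlite3 module in place of SQL Server; note that SQLite spells the clause ADD COLUMN, whereas T-SQL uses ADD on its own. Filling Average from High and Low is an illustrative choice, not something this book's weather system prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Weather (City TEXT, State TEXT, High SMALLINT, Low SMALLINT)")
conn.execute("INSERT INTO Weather VALUES ('Nashville', 'TN', 97, 89)")

# SQLite spelling; in T-SQL this would be: ALTER TABLE Weather ADD Average smallint
conn.execute("ALTER TABLE Weather ADD COLUMN Average SMALLINT")

# Existing rows get NULL (None in Python) in the new column.
average_before = conn.execute("SELECT Average FROM Weather").fetchone()[0]

# Illustrative backfill: the midpoint of High and Low.
conn.execute("UPDATE Weather SET Average = (High + Low) / 2")
average_after = conn.execute("SELECT Average FROM Weather").fetchone()[0]

print(average_before)  # None
print(average_after)   # 93
```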

DROP Example
The DROP statement removes existing database objects. For example, if we wanted to delete our
Weather table, we would run the following command:

DROP TABLE Weather

On a successful result, we would no longer have the Weather table in our database. If you attempt
to drop a table that is referenced by a FOREIGN KEY constraint, you would either have to drop the
child tables first or alter the child table schema to remove the FOREIGN KEY constraint. Otherwise,
SQL Server will fail to drop the table.
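
The foreign key situation can be sketched as follows. The Forecast table and the FK_Forecast_Weather constraint name are hypothetical and exist only for this illustration; the script assumes the Weather table has the ID primary key shown earlier.

```sql
-- Hypothetical child table that references Weather(ID).
CREATE TABLE Forecast (
    ForecastID int IDENTITY(1,1) PRIMARY KEY,
    WeatherID  int NOT NULL,
    CONSTRAINT FK_Forecast_Weather FOREIGN KEY (WeatherID) REFERENCES Weather (ID)
);
GO
-- At this point DROP TABLE Weather would fail, because Forecast references it.
-- Option 1: remove the constraint, then drop the parent table.
ALTER TABLE Forecast DROP CONSTRAINT FK_Forecast_Weather;
GO
DROP TABLE Weather;
GO
-- Option 2 (instead of option 1): drop the child table first, then the parent.
-- DROP TABLE Forecast;
-- DROP TABLE Weather;
```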

TRUNCATE Example
The TRUNCATE statement deletes all the data inside of a table, frees up the resources that data
used, and keeps the table schema intact. For example, suppose we have our earlier Weather table.
Weather Base Table
City State High Low
Dallas TX 104 95
Nashville TN 95 89

We run the following statement.

TRUNCATE TABLE Weather

We end up with the following result.



Weather Truncated Table


City State High Low

As we can see, the Weather table has no more data. The TRUNCATE statement is very similar to a
DELETE statement without a WHERE clause. In large tables, it is suggested to use TRUNCATE over
DELETE when every row must be removed, as it is more efficient and frees up the space the rows used.
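
A minimal side-by-side sketch of the two statements, assuming the Weather table from this chapter. The WHERE filter is only there to show what DELETE allows and TRUNCATE does not.

```sql
-- DELETE is DML: it removes rows one at a time and can be filtered with WHERE.
DELETE FROM Weather
WHERE State = 'TX';
GO
-- TRUNCATE is DDL: it removes every row and accepts no WHERE clause.
-- If the table has an IDENTITY column, the counter is reset to its seed value.
TRUNCATE TABLE Weather;
GO
-- Returns 0 after the truncate.
SELECT COUNT(*) AS RemainingRows
FROM Weather;
GO
```

Note that TRUNCATE TABLE cannot be used on a table that is referenced by a FOREIGN KEY constraint; in that case DELETE is the fallback.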

Summary
The core DML statements include the following:

• SELECT: Retrieves data from the database through the selection of one or more rows or
columns from one or more tables.
• INSERT: Adds data to the database. It can add one or multiple rows.
• UPDATE: Updates existing data in the database.
• DELETE: Removes existing data in the database.
• MERGE: Performs an insert, update, or delete based on the results of another query. A join is
often used instead of a merge. We will cover joins in a later chapter.

There are five main DDL statements:

• USE: Changes the database context.
• CREATE: Creates a database object such as a table or a view.
• ALTER: Changes an existing object.
• DROP: Removes an object from the database.
• TRUNCATE: Removes all rows from a table and frees up the resources used by those rows.
TRUNCATE is similar to the DELETE DML command, which also removes rows from a table
but does so one row at a time and may not free up the space right away.

By the end of this chapter, you should know the difference between DML and DDL statements, as
well as how to use the ones listed above.
Chapter 6: SQL Server Management Studio

SQL Server Management Studio

Introduction
SQL Server Management Studio (SSMS) is the primary graphical user interface (GUI) used to
interact with SQL Server. In this chapter, we will cover how to download and install SQL Server and
SSMS locally, familiarize ourselves with the interface, identify important database objects, go over
how to execute queries, and briefly go over some other tools you may need.

Download SQL Server


To download SQL Server, go to this link https://www.microsoft.com/en-us/sql-server/sql-server-downloads
and click on “Download Now” for SQL Server Express.

Step 1

Once downloaded, double click on the SQL2019-SSEI-Expr.exe file. The executable file will run and
open up a new window.

Step 2

Select the “Basic” option.



Step 3

Accept the terms and conditions, then click the Install button on the next step.

Step 4

You should see the following screen when it has finished installing.

Step 5

Download SSMS
To download SSMS, go to this link https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms?view=sql-server-ver15
and click on Download SQL Server Management Studio (SSMS). Once downloaded, we will have a file
called “SSMS-Setup-ENU.exe.” Double-click on it.

Step 1

Click on the install button. After clicking the install button, you should get a progress indicator.

Step 2

You should see the following screen when SSMS has finished installing.

Step 3

Access SSMS
Now that we have installed SSMS, we are ready to access it. Go to Start Menu > Programs > Microsoft
SQL Server Tools 18 > Microsoft SQL Server Management Studio 18, or wherever you installed
SSMS.
The management studio will open up and present you with a Connect to Server screen.

Connection Window

Let’s look at what each field does. First, we have the server type field. The server type lets you
select which of several SQL Server services to connect to. For this book, we will only be working with
the ‘Database Engine’ server type. Second, we have the server name field. The server name field
is where we specify the name of the SQL Server instance we want to connect to.
Third, we have the authentication field. The authentication field should default to ‘Windows
Authentication.’ We will cover authentication in more detail in a later chapter. Lastly, we have the
‘User Name’ and ‘Password’ fields. These become available if we switch the authentication to ‘SQL
Server Authentication’ or another authentication type that requires a login.
If you don’t see the server name, click on the browse button and navigate to the server instance. The
name should be the same as the one shown when we finished installing SQL Server. Verify that the
authentication type is Windows Authentication, then hit the Connect button.
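
Once connected, the connection details can be confirmed from a query window (opening one is covered later in this chapter). A minimal sketch using built-in functions:

```sql
-- Name of the instance this query window is connected to.
SELECT @@SERVERNAME AS ServerName;
-- Version and edition details of that instance.
SELECT @@VERSION AS VersionInfo;
-- Login the session runs under (your Windows account when using Windows Authentication).
SELECT SUSER_SNAME() AS LoginName;
```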

Exploring SSMS
Once you are connected to SQL Server and in SSMS, you should see a screen similar to the following
image.

SQL Server Management Studio

The first thing you will notice is the Object Explorer on the left-hand side of the screen. The Object
Explorer allows you to browse your database components, create new objects, and alter existing
objects and schema. It works similarly to the file structures we commonly use on our computers,
which, as we covered in a previous chapter, are tree structures. If we look in our Databases
folder, we can find some system databases that come with SQL Server.

DB Window
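
The same list of databases can also be retrieved with a query. A minimal sketch using the sys.databases catalog view:

```sql
-- database_id 1 to 4 are the system databases master, tempdb, model, and msdb.
SELECT name, database_id, create_date
FROM sys.databases
ORDER BY database_id;
```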

You usually wouldn’t do too much with these, as they are databases that SQL Server uses to run
itself. Until we start creating our own DBs, we will look at them to familiarize ourselves with their
layout. There are four folders of importance that we will cover. First off is the Tables folder. This
folder contains all the tables in the database. It allows you to find and edit existing tables easily.

Tables Window

If you right-click on a table and click Design, a new window will open on the right side of the
studio. This design tool allows you to add new columns easily and to see and edit existing columns,
data types, and constraints. Clicking on a column will bring up the column properties
window. This window provides useful information, such as whether the column has a default value.
If you are working with a table, this is an excellent place to start, as it will allow you to understand
what restrictions are on the table and what you can and cannot do with the table.

Design View

You can also quickly look at data by right-clicking on a table and selecting either ‘Select Top 1000
Rows’ or ‘Edit Top 200 Rows’. The image below shows the result of selecting the top thousand
rows. Only three rows are shown because there are only three rows in this table.

Top 1000
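
Behind the scenes, the menu option simply runs a SELECT statement with a TOP clause. A rough equivalent is sketched below, assuming the table in question is MSreplication_options in the master database (the table used in the query examples later in this chapter).

```sql
-- Return at most 1000 rows; a table with fewer rows returns all of them.
SELECT TOP (1000) *
FROM [master].[dbo].[MSreplication_options];
```

The script SSMS generates lists every column by name instead of using the asterisk, but the result is the same.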

The edit top 200 option is a simple way to edit data and can be especially useful for small datasets.
When dealing with extensive datasets, it is unlikely you will use this option.

Edit 200

The next folder is the views folder. Similarly, it contains all the views in the database.

Views View

Next, we have the programmability folder. This name might not be self-explanatory, but it provides
a folder for all our existing stored procedures and functions in the database. We will not cover the
other folders for this book.

Lastly, we have the security folder. This folder handles users, roles, and permissions for the database.
You should know where the users and roles folders are, but we will not go over the other folders.

Executing Queries
SSMS also comes with a query analyzer (labeled the Query Editor in current versions of SSMS). The
Query Analyzer is a tool used to execute queries and view the results. It also includes tools to
evaluate query performance and execution plans so you can fine-tune the efficiency of your queries.
The easiest way to access the query analyzer is to hit the New Query button
in the toolbar. Doing so will open a new query window on the right side of the screen.

Query Analyzer

One way to select the database you want to execute your commands on is to apply the USE command
we learned about in the last chapter. However, SSMS also provides an easy way to switch between
databases. We can use the dropdown list in the top left to switch between databases.

DB DropDown
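
Whichever way the database is selected, the current context can be checked with the built-in DB_NAME() function. A minimal sketch, using the master database as an example:

```sql
USE [master]
GO
-- With no argument, DB_NAME() returns the name of the current database.
SELECT DB_NAME() AS CurrentDatabase;
GO
```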

Let us run a quick query to see how the query analyzer works. A cool thing about the query analyzer
is that it uses IntelliSense to help complete your statement.

IntelliSense

As you can see in the image above, the query analyzer attempts to help complete the statement by
providing a list of possible matching objects. IntelliSense can be beneficial, especially when you don’t
quite remember what attributes an object contains.

QA Example 1

You can see from the image above that I ran a simple query to retrieve all the data from the
MSreplication_options table. After executing the statement, the results pop up in the grid
below the query. If we want to run multiple queries in one execution, we can write them one after
another, separated by a line break. We can also break a single statement across multiple lines by
hitting enter, which is useful for organizing complex statements. For example, if we want to retrieve
all the data from the MSreplication_options table and the spt_monitor table, we can do the following.

QA Example 2
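
A script along the lines of the one in the image can be sketched as follows, assuming both tables live in the master database. The semicolons are optional here but make the statement boundaries explicit.

```sql
USE [master]
GO
-- First statement: produces the first result grid.
SELECT *
FROM MSreplication_options;

-- Second statement: produces the second result grid.
SELECT *
FROM spt_monitor;
GO
```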

You now see two tables listed in the results below. You can highlight one of the statements to
run only that statement. For example, if we only want to view the spt_monitor results from that script,
we would highlight it and then execute it.

QA Example 3

We can see that only the highlighted statement returned a result. If you click on the Messages tab,
you should see something along the lines of “1 row affected.” This message is a good sign. It means
there were no errors, and the query returned one row. In contrast, if we ran the following
statement, the Messages tab would return an error message.

SELECT *

QA Example 4

As you can see from the image, the Messages tab returned an error saying “Must specify table to
select from.” Most errors you receive will tell you why the query failed to execute. In this case, we
didn’t specify from which table we want to retrieve data.
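
The fix is to add a FROM clause that names the table. A minimal sketch, again using the MSreplication_options table in the master database as an example:

```sql
-- Fails: there is no FROM clause, so SQL Server cannot tell which table to read.
-- SELECT *

-- Works: the FROM clause names the table.
SELECT *
FROM [master].[dbo].[MSreplication_options];
```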

Other Tools
There are several useful tools that you should know about beyond the SSMS interface. XQuery is a
query language used to query XML-structured data stored within a column of a table. SQL Command
(SQLCMD) is a command-line tool used to execute queries. SQLCMD sounds like the Query Analyzer
tool, but SQLCMD is mostly used to run automated batch files. This book won’t cover these tools in
any more detail, but they are useful to know about.
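
To give a feel for XQuery, here is a minimal sketch that reads values out of a small XML document with the value() method of the xml data type. The XML content and the @Reading variable are hypothetical; in practice the XML would usually sit in an xml column of a table.

```sql
-- Hypothetical XML document held in a variable for the demonstration.
DECLARE @Reading xml = N'<weather><city>Dallas</city><high>104</high><low>95</low></weather>';

-- value() takes an XQuery path and the SQL type to convert the result to.
SELECT @Reading.value('(/weather/city)[1]', 'varchar(50)') AS City,
       @Reading.value('(/weather/high)[1]', 'smallint')    AS High;
```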

Summary
In this chapter, we covered how to install a local instance of SQL Server Express and connect to it
using SSMS. We then moved on to exploring the Object Explorer and interface and located several
vital folders. We also learned how to use the ‘Select Top 1000 Rows’ and ‘Edit Top 200 Rows’
functions by right-clicking on a table.
After the Object Explorer, we moved on to understanding how to use the query analyzer. The query
analyzer allows us to run and analyze SQL statements. We can organize complex queries by breaking
the code across multiple lines. If we want to run several queries, we can separate them by line
breaks before executing them. We can also highlight a SQL statement and hit execute to run only
the highlighted query.
Lastly, we learned about XQuery and SQLCMD tools. These tools exist outside of SSMS, and we
won’t go into any detail on them, but it is essential to know that they exist.
By the end of this chapter, you should be comfortable knowing your way around the SSMS interface.
Chapter 7: Database Objects


Introduction
In this chapter, we will learn how to create and use several commonly used database objects. These
objects include tables, views, and stored procedures. We will then briefly go over SQL injection and
how to prevent it.

Installing AdventureWorks
Before we start creating new database objects, we will install a sample database provided by
Microsoft called AdventureWorks. To get started, go to this link:
https://docs.microsoft.com/en-us/sql/samples/adventureworks-install-configure?view=sql-server-ver15&tabs=tsql
and download the .bak file that matches your version of SQL Server.

After you have downloaded the .bak file, move the file to your SQL Server backup folder. The backup
location could be different, but the default location for SQL Server 2019 is
C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\Backup.
Once you have moved the file, connect to SQL Server in SSMS. Right-click on the Databases folder
and select Restore Database. Click the Device option, then browse. Click Add, then select the file you
downloaded and click OK. You should now have a sample DB.
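
The same restore can also be done in T-SQL. The sketch below assumes the backup file is named AdventureWorks2019.bak and sits in the default backup folder of a default instance; both the file name and the path are assumptions to adjust for your own download and installation.

```sql
USE [master]
GO
-- File name and folder are assumptions; change them to match your download and instance.
RESTORE DATABASE [AdventureWorks2019]
FROM DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\Backup\AdventureWorks2019.bak'
WITH RECOVERY, STATS = 10;  -- STATS = 10 prints a progress message every 10 percent
GO
```

If the data and log file locations recorded in the backup do not exist on your machine, the restore may also need WITH MOVE clauses pointing at valid folders.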

Creating and Using Tables


We have already covered tables, but let’s do a quick recap. A table is a database structure that stores
data within the database. Tables contain rows, each of which is a record of an entity, and columns,
each of which holds an attribute of that entity.
In the next section, we will learn how to create a table using SSMS, followed by learning how to
create a table using T-SQL.

Creating a table using SSMS


Once you are inside of SSMS and connected to a server, expand the database in which you want to
create the new table, then follow these steps:
1. Right-click on the tables folder, hover over New, and select Table.

Creating a New Table

2. Complete the details for Column Name, Data Type, and Allow Nulls columns.

Adding Table Details

3. Save your table by selecting File > Save Table_1 or hitting Ctrl + S on your keyboard.

Save Table

4. Type the table’s name and click OK.

Type Table Name

If you refresh the object explorer, you will see the table under the Tables section.

New Table Location



Creating a table using T-SQL


We can also create tables using SQL. We have already seen an example of this when we were learning
about DDL. Let’s look at an example to review.

CREATE TABLE [Weather](
    [ID] [int] IDENTITY(1,1) PRIMARY KEY,
    [City] [varchar](50),
    [State] [char](2),
    [High] [smallint],
    [Low] [smallint]
)

Executing this statement in the query analyzer will create the same table we created using SSMS.
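
To confirm that the table came out as intended, its columns can be listed with a query instead of opening the designer. A minimal sketch using the standard INFORMATION_SCHEMA.COLUMNS view:

```sql
-- One row per column of the Weather table, in the order the columns were defined.
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Weather'
ORDER BY ORDINAL_POSITION;
```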

Creating and Using Views


A view is a virtual table consisting of different columns from one or more tables. Unlike a table, a
view is stored in a database as a query object. The tables that make up the view are also known as
underlying tables. Once you have defined a view, you can reference it as you would a table.
Views are used as security mechanisms. They ensure that users can retrieve and modify only the data
they have been given permission to see through the view, preventing access to the remaining data in
the underlying tables.
Views are also used as mechanisms to simplify query execution. Complex queries can be stored in
the form of a view, and data from the view can then be accessed using simple query statements.
Views ensure the security of data by restricting access to the following data:

• Specific rows of tables
• Specific columns of tables
• Specific rows and columns of tables
• Rows obtained by using joins
• Statistical summaries of data in given tables
• Subsets of another view or subsets of views and tables

Some common examples of views include the following:

• A subset of rows or columns of a base table
• A union of two or more tables
• A join of two or more tables
• A statistical summary of base tables
• A subset of another view or some combination of views and base tables

Database views are designed to present the data in one or more underlying tables as a virtual table,
in an alternative way. There are two reasons you would use a view instead of allowing users direct
access to the underlying tables.

• Views allow you to limit the type of data a user can access. You can grant and deny view
permissions for the data.
• Views reduce complexity for end users. You can write the complex queries on behalf of the end
user and hide them in a view.
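
The permission side of this can be sketched as follows. ReportingUser is a hypothetical database user, and the script assumes a view named vCustomer exists in the dbo schema over the Sales.Customer table of AdventureWorks (the view built later in this chapter).

```sql
-- Hypothetical database user without a login, created only for this demonstration.
CREATE USER ReportingUser WITHOUT LOGIN;
GO
-- Allow reading through the view...
GRANT SELECT ON OBJECT::dbo.vCustomer TO ReportingUser;
-- ...but block direct access to the underlying table.
DENY SELECT ON OBJECT::Sales.Customer TO ReportingUser;
GO
```

With this setup the user sees only the columns the view exposes. This relies on the view and the table having the same owner, which is the usual case when both were created by a database administrator.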

Creating a view using SSMS


1. Right-click the Views folder, then select New View.

SSMS New View

2. This will open up the Add Table box.



SSMS Add Table Box

Let’s explain a bit about how to use the Add Table dialog box.

• To specify the table to be used as the primary source, click the appropriate table in the Tables
tab of the dialog box.
• To use another existing view, click the Views tab of the dialog box.
• If you want to generate records from a function, go to the Functions tab.
• If you want to use more than one source, you can click each of the different tabs to find the
table, view, or function you wish to add to your query.

3. Once you have selected the desired source(s), click the Add button for each one. In this example,
I selected and added the Customer (Sales) table.
4. Once you have selected and added all the desired sources, click the Close button.
5. After you have selected the objects you wish to use, the View Designer toolbar will be added,
and in the designer you can further map out the columns you wish to incorporate into your view.
In this example, I checked the CustomerID and AccountNumber fields.

SSMS View Designer



6. Save your view and enter a name. I named my view vCustomer.

Creating a view using T-SQL


You can create the same view using the following T-SQL script:

CREATE VIEW vCustomer
AS
SELECT CustomerId, AccountNumber
FROM Sales.Customer
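
Once the view exists, it can be queried just like a table. A minimal sketch, assuming the view was created in the AdventureWorks sample database under your default schema:

```sql
-- Read from the view; the underlying Sales.Customer table is not named at all.
SELECT TOP (10) CustomerId, AccountNumber
FROM vCustomer
ORDER BY CustomerId;
```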

Creating and Using Stored Procedures


A stored procedure is a previously written set of one or more SQL statements that has been saved
into the database. Stored procedures are useful when you run the same query often.
They are also commonly used when integrating your database with .NET web applications. For
example, if you are building a website on the .NET Framework, you could use stored procedures to
query and retrieve data from the database to display on your website.

Creating a Stored Procedure using SSMS


1. Expand the Programmability section, then right-click on Stored Procedures and choose New
Stored Procedure.

SSMS New Stored Proc

2. The text editor window will open with a default stored procedure template for you to edit.

SSMS Generated Stored Proc

3. Edit the template by giving the procedure a name, adding parameters (optional), and adding your query.

Creating a Stored Procedure using T-SQL


The following code will generate a stored procedure using T-SQL:

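-- Procedures share a namespace with tables, so the name must differ from the Sales.Customer table
CREATE PROCEDURE [Sales].[GetCustomer]
    -- Optional parameter; defaults to NULL when the caller omits it
    @CustomerID BIGINT = NULL
AS
BEGIN
    -- Return the customer whose ID matches the parameter
    SELECT CustomerId, AccountNumber
    FROM Sales.Customer
    WHERE CustomerID = @CustomerID
END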
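
Once a procedure has been created, it is run with the EXEC statement. The sketch below is self-contained: it creates a small procedure under the hypothetical name dbo.uspGetCustomerAccount (chosen here only for illustration) and then calls it. It assumes the AdventureWorks sample database and a version of SQL Server that supports CREATE OR ALTER (2016 SP1 or later).

```sql
-- Hypothetical procedure name, picked so it does not clash with any existing table.
CREATE OR ALTER PROCEDURE dbo.uspGetCustomerAccount
    @CustomerID int
AS
BEGIN
    SELECT CustomerID, AccountNumber
    FROM Sales.Customer
    WHERE CustomerID = @CustomerID;
END
GO
-- Call the procedure, passing the parameter by name.
EXEC dbo.uspGetCustomerAccount @CustomerID = 1;
GO
```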

SQL Injection
SQL injection is an attack in which malicious code is inserted into strings that are later passed on
to an instance of SQL Server to be parsed and executed. Any procedure that constructs
SQL statements should be reviewed for injection vulnerabilities because SQL Server will execute all
syntactically valid queries that it receives, from any source.
The main form of SQL injection is a direct insertion of code into a user-input variable that
is concatenated with a SQL command and then executed. A second and less direct way is to inject
malicious code into strings that are going to be stored in a table or as metadata. When these stored
strings are later concatenated into a dynamic SQL command, the malicious code is executed.
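
The difference between a vulnerable and a safer approach can be sketched in T-SQL. The @Account variable stands in for text typed by a user, and its value is a hypothetical attack string; the script assumes the AdventureWorks Sales.Customer table. Passing input as a parameter through sp_executesql is one common defense, shown here as an illustration rather than a complete prevention strategy.

```sql
-- Stand-in for user input: a hypothetical attack string.
DECLARE @Account nvarchar(50) = N'AW00000001'' OR 1 = 1 --';

-- Vulnerable: the input is concatenated into the command and becomes part of the SQL.
-- The resulting text ends in: WHERE AccountNumber = 'AW00000001' OR 1 = 1 --'
DECLARE @Unsafe nvarchar(max) =
    N'SELECT CustomerID, AccountNumber FROM Sales.Customer WHERE AccountNumber = '''
    + @Account + N'''';
-- EXEC (@Unsafe);  -- left commented out: it would return every customer

-- Safer: the input is passed as a parameter, so it is only ever treated as a value.
DECLARE @Safe nvarchar(max) =
    N'SELECT CustomerID, AccountNumber FROM Sales.Customer WHERE AccountNumber = @AccountNumber';
EXEC sp_executesql @Safe, N'@AccountNumber nvarchar(50)', @AccountNumber = @Account;
```

In the parameterized version, the whole attack string is compared against AccountNumber as a single value, so the injected OR condition never becomes part of the query.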

Summary
By the end of this chapter, you should be comfortable creating tables, views, and stored procedures
using either SSMS or T-SQL. You should also have an understanding of SQL injection and the main
causes of it.
