
Physical database architecture

Training Division New Delhi

Pages

• The fundamental unit of data storage in Microsoft SQL Server is the page.
• In SQL Server 7.0, a page is 8 KB, so there are 128 pages per megabyte.
• Each page starts with a 96-byte header that stores information such as:
  • the type of page,
  • the amount of free space on the page,
  • the object ID of the object that owns the page.
• The body of the page is 8096 bytes.

[Figure: an 8 KB page made up of (A) a 96-byte page header and (B) an 8096-byte body.]

Types of Pages

Page type | Contents
Data | Data rows with all data except text, ntext, and image data
Index | Index entries
Text/Image | text, ntext, and image data
Global Allocation Map | Information about allocated extents
Page Free Space | Information about free space available on pages
Index Allocation Map | Information about extents used by a table or index

Data Pages

• Data pages contain all the data in data rows except text, ntext, and image data, which are stored in separate pages.
• Data rows are placed serially on the page, starting immediately after the header.
• Rows cannot span pages in SQL Server.
• In SQL Server 7.0, the maximum amount of data contained in a single row is 8060 bytes, not including text, ntext, and image data.

Row offset table

• The row offset table starts at the end of the page and determines the location of each row within the page.
• It contains one entry for each row on the page, and each entry records how far the first byte of the row is from the start of the page.
• The entries in the row offset table are in reverse sequence from the sequence of the rows on the page.

[Figure sequence: a data page and its row offset table after three steps: inserting Name4, deleting Name1, and inserting Name5.]
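The row offset table can be illustrated with a small model. The following is only a sketch in Python, not SQL Server's actual on-disk format: the class `Page` and its methods are hypothetical names, rows are simply appended after the 96-byte header, and reuse of the space freed by a delete is not modelled.

```python
HEADER_SIZE = 96  # bytes of page header, as stated in the slides


class Page:
    """Toy data page: rows are stored serially, offsets are kept per row."""

    def __init__(self):
        self.data = bytearray()  # row bytes, placed right after the header
        self.offsets = []        # one entry per row in row order; None once deleted

    def insert(self, row: bytes) -> int:
        """Append a row and record how far its first byte is from the page start."""
        offset = HEADER_SIZE + len(self.data)
        self.data += row
        self.offsets.append(offset)
        return len(self.offsets) - 1  # slot number of the new row

    def delete(self, slot: int) -> None:
        """Mark the row's offset entry as unused (space reuse is not modelled)."""
        self.offsets[slot] = None

    def read(self, slot: int, length: int) -> bytes:
        """Find a row through its offset entry."""
        offset = self.offsets[slot]
        if offset is None:
            raise KeyError(f"slot {slot} has been deleted")
        start = offset - HEADER_SIZE
        return bytes(self.data[start:start + length])

    def offset_table(self):
        """Entries as they appear from the end of the page: reverse of row order."""
        return list(reversed(self.offsets))


# Same sequence as the figure: insert Name4, delete Name1, insert Name5.
page = Page()
for name in (b"Name1", b"Name2", b"Name3", b"Name4"):
    page.insert(name)
page.delete(0)
page.insert(b"Name5")
print(page.offset_table())
```

Each five-byte row moves the next free offset forward by five bytes, while the offset table read from the end of the page lists the newest row first.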

Index Pages

Index pages store index entries.

An index page has the same layout as a data page.

Each row in an index page consists of the index key and a pointer to a page at the next lower level.

Extents

• Extents are the basic unit in which space is allocated to tables and indexes.
• An extent is 8 contiguous pages, or 64 KB, so a database has 16 extents per megabyte.

SQL Server 7.0 has two types of extents:

• Uniform extents are owned by a single object; all eight pages in the extent can be used only by the owning object.
• Mixed extents are shared by up to eight objects.
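The page and extent figures quoted so far follow from simple arithmetic. The Python sketch below just recomputes them. The helper `extent_of` is a hypothetical name and assumes that pages are numbered from 0 and that extents are aligned groups of eight consecutive pages, which the slides do not state explicitly.

```python
KB = 1024
MB = 1024 * KB

PAGE_SIZE = 8 * KB                         # 8192 bytes
PAGE_HEADER = 96                           # bytes
PAGE_BODY = PAGE_SIZE - PAGE_HEADER        # bytes left for rows and the offset table
PAGES_PER_EXTENT = 8
EXTENT_SIZE = PAGES_PER_EXTENT * PAGE_SIZE
PAGES_PER_MB = MB // PAGE_SIZE
EXTENTS_PER_MB = MB // EXTENT_SIZE


def extent_of(page_number: int) -> int:
    """Extent that holds a page (assumes 0-based, aligned numbering)."""
    return page_number // PAGES_PER_EXTENT


print(PAGE_BODY, PAGES_PER_MB, EXTENT_SIZE // KB, EXTENTS_PER_MB)
```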

Log Data

The log is stored in a location that is physically separate from the data.

It is no longer stored in a system table.

As a result, it does not compete with data for memory resources.

It is physically stored as one or more log files, which SQL Server treats as a series of log records.

Text and Image data

The text, ntext, and image data types are used.

For each row, a column of one of these types can store up to 2 GB.

The data page holds a 16-byte pointer to the location of the text or image data.

A table has one collection of pages to hold its text and image data. (This collection is recorded in sysindexes as the row for the table with indid = 255.)

Page Free Space Pages

Page Free Space (PFS) pages record:

• whether an individual page has been allocated,
• the amount of space free on each page.

Each PFS page covers 8,000 pages.

For each page, the PFS has a bitmap recording whether the page is:

• empty,
• 1-50% full,
• 51-80% full,
• 81-95% full,
• or 96-100% full.
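The five fullness ranges can be expressed as a small lookup. This Python sketch is illustrative only: `pfs_category` is a hypothetical helper, the 8096-byte capacity is the page body size given earlier, and rounding the percentage up to a whole number is an assumption made here so that every non-empty page falls into exactly one range.

```python
import math


def pfs_category(bytes_used: int, capacity: int = 8096) -> str:
    """Map the space used on a page to one of the five PFS fullness ranges."""
    if not 0 <= bytes_used <= capacity:
        raise ValueError("bytes_used must be between 0 and capacity")
    if bytes_used == 0:
        return "empty"
    # Assumption: round the percentage up so 50.01% counts as 51%.
    percent = math.ceil(100 * bytes_used / capacity)
    if percent <= 50:
        return "1-50% full"
    if percent <= 80:
        return "51-80% full"
    if percent <= 95:
        return "81-95% full"
    return "96-100% full"


for used in (0, 4048, 4049, 7000, 8096):
    print(used, pfs_category(used))
```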

Global Allocation Map pages

GAM pages record:

• which extents have been allocated,
• whether they have been allocated to objects and indexes,
• whether the allocation is for uniform or mixed extents.

There are two types of Global Allocation Maps:

Global Allocation Map (GAM)

• Keeps track of allocated extents, regardless of whether the allocation is for a mixed or a uniform extent.
• The GAM has one bit for each extent in the interval it covers.
• If the bit is 1, the extent is free.
• If the bit is 0, the extent is allocated.

Shared Global Allocation Map (SGAM)

• Records which extents are currently used as mixed extents and have at least one unused page.
• The SGAM has one bit for each extent in the interval it covers.
• If the bit is 1, the extent is being used as a mixed extent and has free pages.
• If the bit is 0, the extent is not being used as a mixed extent, or it is a mixed extent whose pages are all in use.

Each GAM and each SGAM covers 64,000 extents, or nearly 4 GB of data.

Extent usage | GAM bit | SGAM bit
Free | 1 | 0
Uniform | 0 | 0
Mixed, with no free pages | 0 | 0
Mixed, with free pages | 0 | 1

• A new table or index is allocated pages from mixed extents.
• When the table or index grows to the point that it has eight pages, it is switched to uniform extents.

To keep space allocation efficient, SQL Server 7.0 does not allocate entire extents to tables with small amounts of data.
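The GAM and SGAM bit combinations can be decoded mechanically. The Python sketch below is a toy model under stated assumptions: `extent_state` and `find_free_extent` are hypothetical helper names, and each map is represented as a plain list of 0/1 values rather than a real bitmap page. It also recomputes the "nearly 4 GB" coverage figure.

```python
EXTENT_SIZE = 64 * 1024      # bytes per extent (8 pages of 8 KB)
EXTENTS_PER_MAP = 64_000     # extents covered by one GAM or SGAM


def extent_state(gam_bit: int, sgam_bit: int) -> str:
    """Decode one extent's GAM and SGAM bits into its usage."""
    if (gam_bit, sgam_bit) == (1, 0):
        return "free"
    if (gam_bit, sgam_bit) == (0, 0):
        return "uniform, or mixed with no free pages"
    if (gam_bit, sgam_bit) == (0, 1):
        return "mixed, with free pages"
    raise ValueError("bit combination not listed in the table")


def find_free_extent(gam_bits):
    """Index of the first extent whose GAM bit is 1, or None if all are allocated."""
    for index, bit in enumerate(gam_bits):
        if bit == 1:
            return index
    return None


coverage_bytes = EXTENTS_PER_MAP * EXTENT_SIZE
print(extent_state(0, 1), find_free_extent([0, 0, 1, 0]), coverage_bytes / 2**30)
```

The coverage works out to 4,194,304,000 bytes, a little under 4 GB when a gigabyte is taken as 2^30 bytes.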

Database Files and Filegroups

• A database is mapped over a set of operating system files.
• These files are created at the same time as the database.
• A minimum of two operating system files are created for each database:
  • the primary data file,
  • the log file.

SQL Server 7.0 allows the following three types of database files:

Primary data files:

Every database has one primary data file, which keeps track of all the other files in the database in addition to storing data. By convention, the name of a primary data file has the extension MDF.

Secondary data files:

A database might have zero or more secondary data files. By convention, the name of a secondary data file has the extension NDF.

Log files:

Every database has at least one log file, which contains the information necessary to recover all the transactions in the database. By convention, the name of a log file has the extension LDF.

• Maximum size of a database (data) file: 32 TB
• Maximum size of a log file: 4 TB

SQL Server 7.0 databases have three types of files:

• Primary data files
  • The primary data file is the starting point of the database.
  • It points to the rest of the files in the database.
  • Every database has one primary data file.
  • The recommended file extension for primary data files is .mdf.
• Secondary data files
  • These comprise all of the data files other than the primary data file.
  • Some databases may not have any secondary data files, while others have several.
  • The recommended file extension for secondary data files is .ndf.
• Log files
  • These hold all of the log information used to recover the database.
  • There must be at least one log file for each database, although there can be more than one.
  • The recommended file extension for log files is .ldf.

When a database is created, for example "Training", the two files that are created are:

A: C:\MSSQL7\data\training_Data.MDF

B: C:\MSSQL7\data\training_Log.LDF

where A is the primary data file and B is the log file.

Information about the database files is held in the system table "sysfiles".

Sysfiles table

Column | Description
fileid | File identification number, unique within each database
groupid | Identification of the filegroup to which the file belongs
size | Size of the file, in pages
maxsize | Maximum size of the file; 0 means no growth, -1 means the file grows until the disk is full
growth | Autogrowth increment, in pages or as a percentage of the file size
perf | Reserved for future use
name | Logical name of the file
filename | Physical name of the file, including the path

• SQL Server 7.0 files can grow automatically from their originally specified size.
• When you define a file, you can specify a growth increment.
• Each time the file fills, it increases its size by the growth increment.
• If there are multiple files in a filegroup, they do not autogrow until all the files are full.
• Each file can also have a maximum size specified.
• If a maximum size is not specified, the file can continue to grow until it has used all available space on the disk.
• Letting files autogrow as needed lessens the administrative burden of monitoring the amount of free space in the database and allocating additional space manually.

Points to remember:

If the database must never be allowed to grow beyond its initial size, set the growth increment of its files to zero (FILEGROWTH = 0).

This prevents the database files from growing. If the database files fill with data, no more data can be added until more data files are added to the database or the existing files are expanded.
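The growth rules can be simulated in a few lines. This Python sketch is a simplified model, not SQL Server's allocation logic: `grow` is a hypothetical helper, sizes are in MB, the figures (50 MB initial size, 10 MB increment, 100 MB maximum) are illustrative values, and any rounding SQL Server applies to the increment is ignored.

```python
def grow(size_mb, growth, maxsize_mb=None, percent=False):
    """Return the file size after one autogrow step.

    growth     -- increment in MB, or a percentage of the current size if percent=True
    maxsize_mb -- optional ceiling; None means grow until the disk is full
    """
    if growth == 0:
        return size_mb  # a zero increment disables growth
    increment = size_mb * growth / 100 if percent else growth
    new_size = size_mb + increment
    if maxsize_mb is not None:
        new_size = min(new_size, maxsize_mb)
    return new_size


# A 50 MB file with a 10 MB increment and a 100 MB maximum.
sizes = [50]
for _ in range(7):
    sizes.append(grow(sizes[-1], growth=10, maxsize_mb=100))
print(sizes)
```

After five growth steps the file reaches its maximum size and stays there, which is the point at which inserts would start to fail unless another file is added or the limit is raised.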

Fragmentation of Files

Allowing files to grow automatically can cause fragmentation of those files if a large number of files share the same disk.

Therefore, it is recommended that files or filegroups be created on as many different available local physical disks as possible.

Place objects that compete heavily for space in different filegroups.

Disk Management Techniques

SQL Server can:

• allow the database files to grow automatically,
• shrink the size of the database if the space is not needed.

The following statement creates a database, specifying the primary, secondary, and log files, with autogrowth enabled:

create database training
on
-- primary data file: starts at 50 MB, grows in 10 MB steps up to 100 MB
(name='training_data1',
filename='c:\sql_data\training1.mdf',
size=50,
maxsize=100,
filegrowth=10),
-- secondary data file: no maximum size, grows in 20 MB steps
(name='training_data2',
filename='d:\sql_data\training2.ndf',
size=100,
filegrowth=20)
-- log file: grows by 20% of its current size
log on
(name='training_log',
filename='e:\sql_data\training_log.ldf',
size=50,
filegrowth=20%)
go

Shrinking of databases

• Each file within a database can be shrunk to remove unused pages.
• Both data and transaction log files can be shrunk.
• The database files can be shrunk manually, either as a group or individually, and can also be set to shrink automatically at given intervals.
• Shrinking occurs in the background and does not affect user activity within the database.

DBCC SHRINKDATABASE shrinks the size of the data files in the specified database.

DBCC SHRINKDATABASE
( database_name [, target_percent]
[, {NOTRUNCATE | TRUNCATEONLY}]
)

Shrinking size of database file

DBCC SHRINKFILE shrinks the size of the specified data file or log file for the related database.

DBCC SHRINKFILE
( {file_name | file_id}
{ [, target_size]
| [, {EMPTYFILE | NOTRUNCATE | TRUNCATEONLY}]
}
)

Database filegroups

A database comprises:

• a primary filegroup,
• any user-defined filegroups,
• a default filegroup, which is one of the above.

The primary filegroup contains:

• the primary data file,
• any other files that are not put into another filegroup,
• all pages for the system tables.

User-defined filegroups

These are filegroups that are specified using the FILEGROUP keyword in a CREATE DATABASE or ALTER DATABASE statement, or on the property page within SQL Server Enterprise Manager.

Default filegroup

The default filegroup contains the pages for all tables and indexes that do not have a filegroup specified when they are created. In each database, only one filegroup at a time can be the default filegroup. If no default filegroup is specified, the primary filegroup is the default.

Some important facts about filegroups:

• No file can be a member of more than one filegroup.
• Log files are never part of a filegroup.
• Files in a filegroup do not autogrow unless there is no space available in any of the files in the filegroup.
• A maximum of 256 filegroups can be created per database, and filegroups can contain only data files.
• It is not possible to move files to a different filegroup once the files have been added to the database.

Advantages of filegroups:

File groups allow files to be grouped together for administrative and data allocation/placement purposes.

For example, three files (data1.ndf, data2.ndf, and data3.ndf) can be created on three disk drives, respectively, and assigned to the filegroup fgroup1.

A table can then be created specifically on the filegroup fgroup1. Queries for data from the table will be spread across the three disks, thereby improving performance.

The same performance improvement can be accomplished with a single file created on a RAID (redundant array of independent disks) stripe set.

Files and filegroups, however, allow you to easily add new files on new disks.

Additionally, if your database exceeds the maximum size for a single Microsoft Windows file, you can use secondary data files to allow your database to continue to grow.

By creating a filegroup on a specific disk or RAID device, you can control where tables and indexes in your database are physically located.

Reasons for placing tables and indexes on specific disks include:

• improved query performance,
• parallel queries.

The following example creates a database with a primary data file, a user-defined filegroup, and a log file. The primary data file is in the primary filegroup and the user-defined filegroup has two secondary data files. An ALTER DATABASE statement makes the user-defined filegroup the default. A table is then created specifying the user-defined filegroup.

-- The database is named Trg so that it matches the ALTER DATABASE and USE statements
CREATE DATABASE Trg
ON PRIMARY
( NAME='Trg_Primary',
FILENAME='c:\mssql7\data\Trg_Prm.mdf',
SIZE=4,
MAXSIZE=10,
FILEGROWTH=1),
FILEGROUP Trg_FG1
( NAME='Trg_FG1_Dat1',
FILENAME='c:\mssql7\data\Trg_FG1_1.ndf',
SIZE=1MB,
MAXSIZE=10,
FILEGROWTH=1),
( NAME='Trg_FG1_Dat2',
FILENAME='c:\mssql7\data\Trg_FG1_2.ndf',
SIZE=1MB,
MAXSIZE=10,
FILEGROWTH=1)
LOG ON
( NAME='trg_log',
FILENAME='c:\mssql7\data\Trg.ldf',
SIZE=1,
MAXSIZE=10,
FILEGROWTH=1)
GO

An ALTER DATABASE statement makes the user-defined filegroup the default.

ALTER DATABASE Trg
MODIFY FILEGROUP Trg_FG1 DEFAULT
GO

A table is then created specifying the user-defined filegroup.

USE Trg
CREATE TABLE TrgTable
( par_id int PRIMARY KEY,
par_nm char(8) )
ON Trg_FG1
GO

Indexes

Indexes can be of two types:

1. Clustered index:

• Data rows are sorted and stored in the table based on their key values.
• There can be only one clustered index per table, because the data rows themselves can be sorted in only one order.
• Clustered indexes are efficient for finding rows.
• The data rows form the lowest level of the clustered index.

2. Nonclustered index:

• Nonclustered indexes have a structure that is completely separate from the data rows.
• The lowest level contains the nonclustered index key values, and each key value entry has pointers to the data rows containing that key value.
• The data rows are not stored in order based on the nonclustered key.
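The difference between the two index types can be sketched with ordinary Python lists. This is a deliberately simplified stand-in: a sorted list searched with `bisect` replaces the multi-level index structure, row positions in a list replace real row pointers, and all names and sample rows are hypothetical.

```python
import bisect

# Rows in arrival order, as (id, name).
rows = [(30, "Carol"), (10, "Alice"), (20, "Bob")]

# Clustered index on id: the data rows themselves are kept in key order,
# so the lowest level of the index is the data.
clustered = sorted(rows)

# Nonclustered index on name: a separate sorted structure whose entries
# hold the key value and a pointer (here, a position in `rows`).
nonclustered = sorted((name, pos) for pos, (_id, name) in enumerate(rows))


def clustered_lookup(key):
    """Binary search directly over the ordered data rows."""
    keys = [row[0] for row in clustered]
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return clustered[i]
    return None


def nonclustered_lookup(name):
    """Binary search the index entries, then follow the pointer to the row."""
    keys = [entry[0] for entry in nonclustered]
    i = bisect.bisect_left(keys, name)
    if i < len(keys) and keys[i] == name:
        return rows[nonclustered[i][1]]
    return None


print(clustered_lookup(20), nonclustered_lookup("Carol"))
```

The clustered lookup lands on the row itself, while the nonclustered lookup needs one extra step to follow the pointer into rows that are stored in no particular order.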

Two types of tables:

1. Clustered tables

• These are tables that have a clustered index.
• The data rows are stored in order based on the clustered index key.
• The data pages are linked in a doubly-linked list.
• The index is implemented as a B-tree index structure that supports fast retrieval of the rows based on their clustered index key values.

2. Heaps

• These are tables that have no clustered index.
• The data rows are not stored in any particular order, and there is no particular order to the sequence of the data pages.
• The data pages are not linked in a linked list.

Maximum Capacity Specifications

This table specifies the maximum sizes and numbers of various objects defined in Microsoft SQL Server databases or referenced in Transact-SQL statements.

Object | SQL Server 7.0
Batch size | 65,536 * Network Packet Size
Bytes per short string column | 8000
Bytes per text, ntext, or image column | 2 GB - 2
Bytes per GROUP BY, ORDER BY | 8060
Bytes per index | 900
Bytes per foreign key | 900
Bytes per primary key | 900
Bytes per row | 8060
Bytes in source text of a stored procedure | Lesser of batch size or 250 MB
Clustered indexes per table | 1
Columns in GROUP BY, ORDER BY | Limited only by number of bytes
Columns or expressions in a GROUP BY WITH CUBE or WITH ROLLUP statement | 10
Columns per index | 16
Columns per foreign key | 16
Columns per primary key | 16
Columns per base table | 1024
Columns per SELECT statement | 4096
Columns per INSERT statement | 1024
Connections per client | Max. value of configured connections
Database size | 1,048,516 TB
Databases per server | 32,767
Filegroups per database | 256
Files per database | 32,767
File size (data) | 32 TB
File size (log) | 4 TB
Foreign key table references per table | 253
Identifier length (in characters) | 128
Locks per connection | Max. locks per server
Locks per server | 2,147,483,647 (static); 40% of SQL Server memory (dynamic)
Nested stored procedure levels | 32
Nested subqueries | 32
Nested trigger levels | 32
Nonclustered indexes per table | 249
Objects concurrently open in a server* | 2,147,483,647
Objects in a database* | 2,147,483,647
Parameters per stored procedure | 1024
REFERENCES per table | 63
Rows per table | Limited by available storage
SQL string length (batch size) | 128 * TDS packet size
Tables per database | Limited by number of objects in a database
Tables per SELECT statement | 256
Triggers per table | Limited by number of objects in a database
UNIQUE indexes or constraints per table | 249 nonclustered and 1 clustered

* Database objects include all tables, views, stored procedures, extended stored procedures, triggers, rules, defaults, and constraints. The sum of the number of all these objects in a database cannot exceed 2,147,483,647.