
VSAM Day 1

UNIT
VSAM Intro…

Topics:

• Access Methods
• VSAM Introduction
• VSAM Organization
• Control Interval

Access Methods

What is an access method?

Most computer applications are designed to manipulate data and generate results based on the data.

Data must be stored in such a way that its retrieval is easy and quick.

Access methods are ways to maximize the efficiency of data storage and retrieval.

The following are some of the traditional access methods available in the Multiple Virtual Storage (MVS) environment:

• Queued Sequential Access Method (QSAM)
• Basic Sequential Access Method (BSAM)
• Indexed Sequential Access Method (ISAM)
• Basic Direct Access Method (BDAM)
• Basic Partitioned Access Method (BPAM), used for partitioned data sets (PDS and PDSE)


VSAM Introduction

VSAM stands for Virtual Storage Access Method.

It is used to access data sets quickly and efficiently.

It is used to organize, store, catalog, retrieve, and delete data sets.
Role of VSAM
VSAM acts as an interface between processing programs and the operating system.

VSAM groups individual data records into larger units in order to reduce the number of I/O requests required when sequentially retrieving records.

These larger units are transferred between the Direct Access Storage Device (DASD) and virtual storage by the operating system.
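
The saving from grouping records can be illustrated with a little arithmetic. The sketch below is a hypothetical Python model, not VSAM code: the function name `io_requests` and the record counts are assumptions chosen for illustration. It only counts how many transfers a sequential read needs when several records travel together in one unit.

```python
import math

def io_requests(record_count: int, records_per_unit: int) -> int:
    """Transfers needed to read record_count records sequentially when
    each transfer brings records_per_unit records into storage."""
    if record_count < 0 or records_per_unit < 1:
        raise ValueError("record_count must be >= 0 and records_per_unit >= 1")
    return math.ceil(record_count / records_per_unit)

# Hypothetical workload: 1000 records read one after another.
ungrouped = io_requests(1000, 1)   # one transfer per record
grouped = io_requests(1000, 8)     # eight records per transfer
print(ungrouped, grouped)
```

With eight records per unit, the same sequential pass needs one eighth of the transfers.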
How does VSAM retrieve a record?

In retrieving a record, VSAM goes through the following steps:

1. VSAM interprets the processing program's logical request and determines what services are desired.

2. VSAM makes the required input or output (I/O) request(s) to the operating system.

3. The operating system performs the physical I/O operation(s) between the device and virtual storage.

4. VSAM locates and extracts the desired data before returning it to the processing program.

[Figure: DASD, Operating System, VSAM, Processing Program]
Advantages of VSAM

• Data can be accessed faster.

• Records can be inserted in a more efficient manner.

• Deleting a record physically removes it from the disk.

• Records can be accessed sequentially or randomly.

• VSAM data sets are device independent.


Disadvantages of VSAM

• VSAM data sets require more storage space than other types of data sets.

• VSAM data sets require additional free space that must be embedded in them.


Data Sets in VSAM

VSAM supports the following data set types:

• Entry-Sequenced Data Set (ESDS)
• Key-Sequenced Data Set (KSDS)
• Relative Record Data Set (RRDS)
• Linear Data Set (LDS)
Entry-Sequenced Data Set

Records in an ESDS are stored in the order in which they are written and are retrieved by addressed access.

Records are loaded irrespective of their contents, and their byte addresses cannot be changed.

An ESDS is also referred to as a sequential VSAM data set, because records in an ESDS are normally processed sequentially.

An ESDS is best suited for applications where most processing is done sequentially.


Key-Sequenced Data Set

Records in a KSDS are stored in key sequence and are controlled by an index. The key field values determine the order in which records are stored.

In a KSDS, records can be processed both sequentially and randomly using their key field values.

The advantages of a KSDS are:

• Sequential processing is useful for retrieving records in sorted order.

• Random or direct processing of records is useful in online applications.
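
The two ways of processing a KSDS can be mimicked with a toy in-memory model. The Python sketch below is an illustration only, not how VSAM is implemented: the class `ToyKsds`, its method names, and the sample keys are all hypothetical. A sorted key list stands in for the index and a dictionary stands in for the data.

```python
import bisect

class ToyKsds:
    """Toy in-memory model of a key-sequenced data set (illustration only)."""

    def __init__(self):
        self._keys = []      # sorted key list, standing in for the index
        self._records = {}   # key -> record, standing in for the data

    def insert(self, key, record):
        if key in self._records:
            raise KeyError(f"duplicate key: {key!r}")
        bisect.insort(self._keys, key)   # keep the keys in key sequence
        self._records[key] = record

    def read_random(self, key):
        """Direct retrieval by key value."""
        return self._records[key]

    def read_sequential(self):
        """Yield records in ascending key order."""
        for key in self._keys:
            yield self._records[key]

ksds = ToyKsds()
# Records are inserted out of order on purpose.
for key, name in [("C003", "Carol"), ("A001", "Alice"), ("B002", "Bob")]:
    ksds.insert(key, name)

print(list(ksds.read_sequential()))   # records come back in key order
print(ksds.read_random("B002"))       # one record fetched by its key
```

Whatever order the records are inserted in, sequential reading returns them sorted by key, while a single key is enough to fetch one record directly.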
Relative Record Data Set

Records in an RRDS are loaded into fixed-length or variable-length slots.

These records are represented by the Relative Record Numbers (RRNs) of their slots.

A processing program uses the RRN to provide random access to records.

[Figure: four slots with Relative Record Numbers 1 to 4, holding records R1, R2, and R3; one slot is empty]
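
Random access by RRN can be sketched for the fixed-length case. The Python below is a simplified illustration under stated assumptions: `ToyRrds`, `slot_offset`, and the 80-byte slot length are hypothetical, RRNs are taken to start at 1, and control information is ignored.

```python
SLOT_LENGTH = 80  # hypothetical fixed slot length in bytes

def slot_offset(rrn: int, slot_length: int = SLOT_LENGTH) -> int:
    """Byte offset of a slot, assuming equal-length slots and RRNs from 1."""
    if rrn < 1:
        raise ValueError("relative record numbers start at 1")
    return (rrn - 1) * slot_length

class ToyRrds:
    """Toy model of a relative record data set with fixed-length slots."""

    def __init__(self, slot_count: int):
        self._slots = [None] * slot_count   # None marks an empty slot

    def _check(self, rrn: int) -> None:
        if not 1 <= rrn <= len(self._slots):
            raise IndexError(f"no slot with RRN {rrn}")

    def write(self, rrn: int, record) -> None:
        self._check(rrn)
        self._slots[rrn - 1] = record

    def read(self, rrn: int):
        self._check(rrn)
        return self._slots[rrn - 1]

rrds = ToyRrds(slot_count=4)
rrds.write(1, "first")
rrds.write(2, "second")
rrds.write(4, "fourth")

print(rrds.read(2))      # record found directly from its RRN
print(rrds.read(3))      # empty slot
print(slot_offset(4))    # where slot 4 would start
```

No search is needed: the RRN alone identifies the slot, which is what makes random access cheap.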


Linear Data Set

An LDS is a data set containing only a contiguous string of data bytes with no intervening control information.

An LDS is divided into blocks. These blocks can be sequentially retrieved by a processing program in physical order.

A processing program can group several logical records together into a single block.

An LDS can be kept permanently in storage for enhanced performance.
Relative Byte Address

What is a Relative Byte Address?

The Relative Byte Address (RBA) of a record is its displacement (in bytes) from the beginning of the data set.

VSAM treats data as a contiguous string of bytes. This approach makes the address of a record device-independent.

A VSAM data set can be moved without affecting the RBAs of its records.

In addition to data records, VSAM also stores control information. The presence of control information affects the RBAs of subsequent data records.

[Figure: Record 1, Record 2, and Record 3 at Relative Byte Addresses 0, 100, and 200; the next record would start at 300]

The example represents a VSAM data set containing 100-byte, fixed-length records.

The RBA of the first record is 0.
The RBA of the second record is 100.
The RBA of the third record is 200, and so on.
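
The RBAs in this example are running totals of the record lengths. The short Python sketch below shows the calculation. The function name `rbas` is hypothetical, and the sketch deliberately ignores the control information that VSAM also stores, so it applies only to records laid end to end.

```python
from itertools import accumulate

def rbas(record_lengths):
    """RBA of each record when records are laid end to end from byte 0,
    ignoring any control information stored between them."""
    starts = [0] + list(accumulate(record_lengths))
    return starts[:-1]   # the last running total is where the next record would go

print(rbas([100, 100, 100]))   # three 100-byte, fixed-length records
print(rbas([220, 340, 420]))   # the same idea with variable-length records
```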
Cluster

What is a cluster?

A cluster is the collection of physical data sets that make up one logical data set.

The concept of a cluster is best suited to a KSDS.

A KSDS cluster has two data sets. One data set holds the actual data records. The other data set contains an index component. The index component permits the direct retrieval of data.

[Figure: KSDS.CLUSTER, made up of KSDS.INDEX and KSDS.DATA]

With an ESDS, an RRDS, and an LDS, the cluster name and the data set component name both refer to the same data set, and a cluster consists of only a single physical data set: the data component.

[Figure: cluster C and its related data sets A and B]
Control Interval

What is a control interval?

o A control interval is the amount of data transferred between the device and virtual storage.

o When records are read from or written to a data set, VSAM groups individual data records into larger units of storage. These units of storage are called control intervals.

o The size of a control interval must be at least 512 bytes and increases in multiples of 512 bytes up to a limit of 8 KB. Beyond that, it increases in multiples of 2 KB up to a maximum of 32 KB.

o Control interval size = n * 512 or n * 2048, where n is between 1 and 16.
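
The sizing rule can be written as a small check. The Python sketch below encodes the rule exactly as stated in this unit. The function name `is_valid_ci_size` is hypothetical, and the sketch is not a statement about what any particular release of VSAM accepts.

```python
def is_valid_ci_size(size: int) -> bool:
    """Check a control interval size against the rule stated above:
    multiples of 512 up to 8K, then multiples of 2K up to 32K."""
    if 512 <= size <= 8192:
        return size % 512 == 0
    if 8192 < size <= 32768:
        return size % 2048 == 0
    return False

# Enumerate every size that the rule allows.
valid_sizes = [s for s in range(512, 32768 + 1, 512) if is_valid_ci_size(s)]
print(len(valid_sizes))
print(valid_sizes[:3], valid_sizes[-3:])
```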
[Figure: control interval layout: three records, unused space, three RDFs (420, 340, 220), and the CIDF (980, 1055)]
A control interval has four components:
• Data
• Control Interval Definition Field (CIDF)
• Record Definition Field (RDF)
• Free space

Data

Contains the actual data processed by the program.
[Figure: Control Interval 1 holding records of 220, 340, and 420 bytes, followed by 1055 bytes of unused space, the RDFs (420, 340, 220), and the CIDF (980, 1055)]
Control Interval Definition Field (CIDF)

Contains information about the free space within the CI.

The field is 4 bytes long.


Record Definition Field (RDF)

Contains information about the records within the data space.

[Figure: records R1 to R4, each 432 bytes, followed by unused space and two RDFs (4, 432)]
An RDF is a three-byte field that is used to define the location and length of a record or a group of records.

The RDFs immediately precede the CIDF, at the end of the control interval.


ESDS Organization

In an ESDS, records are stored in the order in which they are written and can be read in the same order.

[Figure: Control Interval 1, Control Interval 2, and Control Interval 3, each holding four records]
The characteristics of an ESDS are summarized below:

• Records are stored sequentially.

• Records can be of fixed or variable length.

• Records are physically grouped into control intervals.

• Control intervals contain control information along with data.
JCL for ESDS
//SAMP003B JOB ,,CLASS=M,
// MSGLEVEL=(1,1),NOTIFY=SAMP003,TIME=(1)
//* Define an entry-sequenced (NONINDEXED) VSAM cluster with IDCAMS
//EXEC1    EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE CLUSTER -
         ( -
           NAME(SAMP003.ESDS.CLUSTER) -
           TRACKS(2,1) -
           CISZ(512) -
           RECORDSIZE(80,80) -
           NONINDEXED -
         ) -
         DATA( -
           NAME(SAMP003.ESDS.DATA) -
         )
/*

How is the size of a control interval determined?

The size of a control interval is determined when the data set is defined.

The size can be determined in either of the following ways:

• The size can be specified in the DEFINE CLUSTER command that defines the data set.

• VSAM selects the size that best utilizes DASD storage.

Control Info

What is control information?

VSAM uses some information to locate records within a control interval. This information is called control information.

Data records are stored at the beginning of the control interval, while the control information is located at the end of the control interval.


[Figure: Control Interval 1 with no records: unused space followed by the CIDF (0, 1020)]

What does the CIDF contain?

The CIDF contains the following two values, each two bytes long:

• The first value indicates where the unused space in the control interval begins, stored as a displacement from the beginning of the control interval.

• The second value indicates the length of the unused space.
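
The two CIDF values follow directly from the record lengths. The Python sketch below is a minimal illustration: the function name `cidf` is hypothetical, the 4-byte CIDF and 3-byte RDF lengths come from this unit, and the control interval sizes used (2048 and 1024 bytes) are assumptions inferred from the numbers in the figures.

```python
CIDF_LENGTH = 4   # bytes: two 2-byte values
RDF_LENGTH = 3    # bytes per record definition field

def cidf(ci_size: int, record_lengths, rdf_count: int):
    """Return (free_space_offset, free_space_length) for a control interval.
    rdf_count is supplied by the caller; RDF grouping is not modelled here."""
    used = sum(record_lengths)   # records are packed from the start of the CI
    free_length = ci_size - used - rdf_count * RDF_LENGTH - CIDF_LENGTH
    if free_length < 0:
        raise ValueError("records do not fit in the control interval")
    return used, free_length

# Three records of 220, 340 and 420 bytes in an assumed 2048-byte CI.
print(cidf(2048, [220, 340, 420], rdf_count=3))

# An empty control interval, assumed to be 1024 bytes.
print(cidf(1024, [], rdf_count=0))
```

The free space starts where the last record ends, and its length is whatever remains after the records, the RDFs, and the CIDF itself are accounted for.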
[Figure: Control Interval 1 with records R1 (220), R2 (340), and R3 (420), unused space (1055), RDF3 (420), RDF2 (340), RDF1 (220), and the CIDF (980, 1055)]

Record Definition Field

What is a Record Definition Field?

The data records in a control interval are described by a group of record definition fields (RDFs).
Record Definition Field – Case 1

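[Figure: Control Interval 1 with records R1 (220), R2 (340), and R3 (420), unused space (1055), RDF3 (420), RDF2 (340), RDF1 (220), and the CIDF (980, 1055)]

Case 1: Consider a case where no two consecutive records have the same length and the control interval has unused space. Each record is then described by its own RDF, which holds the record's length.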


Record Definition Field – Case 2

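[Figure: Control Interval 1 with records R1 to R4 (432 bytes each), unused space, RDF2 (4), RDF1 (432), and the CIDF (1728, 310)]

Case 2: Consider a case where all records in a control interval are of the same length and the control interval has unused space. The records are then described by a pair of RDFs: one holds the common record length (432) and the other holds the number of records (4).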
Record Definition Field – Case 3
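[Figure: Control Interval 1 with records R1 (332), R2 (380), R3 (380), R4 (380), and R5 (332), unused space (228), RDF4 (332), RDF3 (3), RDF2 (380), RDF1 (332), and the CIDF (1804, 228)]

Case 3: Consider a case where records are of variable length, but some consecutive records are of the same length. Each run of equal-length records is described by a pair of RDFs (length and count), and each remaining record by a single RDF.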
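
The three cases can be reproduced with one grouping rule. The Python sketch below is a simplified model, not the actual RDF byte layout: the names `build_rdfs` and `free_space` are hypothetical, each RDF is represented as a labelled value instead of three bytes, and the 2048-byte control interval size is an assumption inferred from the figures.

```python
from itertools import groupby

RDF_LENGTH = 3    # bytes per RDF
CIDF_LENGTH = 4   # bytes for the CIDF

def build_rdfs(record_lengths):
    """List RDFs in the order RDF1, RDF2, ...: a lone record gets one length
    RDF; a run of equal-length records gets a length RDF plus a count RDF."""
    rdfs = []
    for length, run in groupby(record_lengths):
        count = len(list(run))
        rdfs.append(("length", length))
        if count > 1:
            rdfs.append(("count", count))
    return rdfs

def free_space(ci_size, record_lengths):
    """Return (offset, length) of the unused space, as held in the CIDF."""
    used = sum(record_lengths)
    control = len(build_rdfs(record_lengths)) * RDF_LENGTH + CIDF_LENGTH
    return used, ci_size - used - control

cases = {
    "Case 1": [220, 340, 420],
    "Case 2": [432, 432, 432, 432],
    "Case 3": [332, 380, 380, 380, 332],
}
for name, lengths in cases.items():
    print(name, build_rdfs(lengths), free_space(2048, lengths))
```

Grouping equal-length neighbours is what keeps the control information small: four identical records need two RDFs, not four.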
Managing Control Intervals
[Figure: five 5K (5120-byte) control intervals holding records of 3000, 5000, 3000, 3000, and 5000 bytes]

How is space wasted in a control interval?

The example shows how a data set containing records of 3000 bytes and 5000 bytes would be stored if a 5K (5120-byte) control interval is used.

The following inferences can be made from the given example:

• Only a single record of either size can fit in a control interval.

• For every 3000-byte record, 2113 bytes are unused, and for every 5000-byte record, 113 bytes are unused.

• As a result, for just 5 records, 6565 bytes of space are wasted.
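
These figures can be checked with a few lines of arithmetic. The Python sketch below assumes that each control interval holds exactly one record, one 3-byte RDF, and the 4-byte CIDF. The function name `unused_bytes` is hypothetical.

```python
CIDF_LENGTH = 4   # bytes
RDF_LENGTH = 3    # bytes

def unused_bytes(ci_size: int, record_length: int) -> int:
    """Unused space when a CI holds exactly one record, one RDF and the CIDF."""
    return ci_size - record_length - RDF_LENGTH - CIDF_LENGTH

records = [3000, 5000, 3000, 3000, 5000]
per_record = [unused_bytes(5120, length) for length in records]
print(per_record)        # unused bytes in each control interval
print(sum(per_record))   # total wasted space for the five records
```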
How can you avoid wastage of space?

There are two solutions to prevent space wastage in control intervals:

• Increasing the control interval size
• Spanned records
[Figure: a single 16K control interval]

Increasing the Size of Control Intervals

In this case, a 16K control interval might result in less wasted space, depending on the proportion and sequence of the 3000-byte and 5000-byte records. However, as VSAM always reads an entire control interval into virtual storage, this solution is not very efficient.
[Figure: 3K (3072-byte) control intervals holding 3000, 3062 + 1938 (spanned), 3000, 3000, and 3062 + 1938 (spanned) bytes]

Spanned Records

The problem of wasted space can be resolved by reducing the control interval size and directing VSAM to split records over control intervals. Records that are split between control intervals are called spanned records. A spanned record, thus, is a logical record contained in more than one control interval.

In the given case, if 3K control intervals are used with spanned records instead of 5K control intervals, the total amount of wasted space is reduced to 2443 bytes: 65 bytes in each of the three control intervals that hold a 3000-byte record, and 1124 bytes in each of the two control intervals that hold the final 1938-byte segment of a 5000-byte record.
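
The arithmetic behind spanning can be checked with a short sketch. The Python below is illustrative only: `wasted_with_spanning` is a hypothetical helper. It assumes a 4-byte CIDF and a 3-byte RDF, one RDF in a control interval that holds a whole record, and two RDFs in a control interval that holds a segment of a spanned record, which leaves the 3062 bytes per segment shown in the figure. It also assumes that each control interval holds a single record or segment, so the total depends on these assumptions.

```python
CI_SIZE = 3072
CIDF_LENGTH = 4
RDF_LENGTH = 3

def wasted_with_spanning(record_lengths, ci_size=CI_SIZE):
    """Total unused bytes when records that do not fit in one CI are spanned.
    Assumes one record (or one segment) per control interval."""
    whole_capacity = ci_size - CIDF_LENGTH - RDF_LENGTH          # one RDF
    segment_capacity = ci_size - CIDF_LENGTH - 2 * RDF_LENGTH    # two RDFs (assumed)
    wasted = 0
    for length in record_lengths:
        if length <= whole_capacity:
            wasted += whole_capacity - length
            continue
        remaining = length
        while remaining > 0:
            segment = min(remaining, segment_capacity)
            wasted += segment_capacity - segment   # zero for a full segment
            remaining -= segment
    return wasted

records = [3000, 5000, 3000, 3000, 5000]
print(wasted_with_spanning(records))
```

Only the last segment of each spanned record leaves a gap, because that control interval is not shared with the next record.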
