1
Outline I: Fundamental File
Structure Concepts
• Stream Files
• Field Structures
• Reading a Stream of Fields
• Record Structures
• Record Structures that use a length
indicator
2
Outline II: Managing Files of
Records
• Record Access
• More About Record Structures
• File Access and File Organization
• More Complex File Organization and
Access
• Portability and Standardization
3
Field and Record Organization:
Overview
• When we deal with file structures:
– Data must be persistent
– i.e. data written to a file by one program should be
the same when read back by another program.
• The basic logical unit of data is the field
which contains a single data value.
• Fields are organized into aggregates, either as
many copies of a single field (an array) or as
a list of different fields (a record).
4
Field and Record Organization:
Overview
• When a record is stored in memory, we
refer to it as an object and refer to its
fields as members.
• Here we will study the ways that objects
can be represented as records in files.
5
Stream Files
• Here we look at how data is handled
in streams.
• For example:
6
Stream Files
Input 1                  Input 2
• Mary Ames              • Alan Mason
• 123 Maple              • 90 Eastgate
• Stillwater, OK 74075   • Ada, OK 74820
7
Stream Files
• In stream files, the information is written as a
stream of bytes containing no added
information.
• Problems when each field is given a fixed length:
– Wastage of space
• Ames requires 4 bytes but we use 10 bytes.
– A value may require more space than allotted.
• Fixing the lengths at larger values avoids this, but wastes more space.
11
Field Structures
Method 2: Begin each field with a length indicator
• Begin each field with the length of that field
value.
• If a field value can be long, the length indicator itself
needs more space (one byte can count only up to 255).
• Looks as follows:
12
Field Structures
Method 3: Place a delimiter at the end of each field to
separate it from the next field.
• Each field is separated by a delimiter.
• The delimiter can be a white-space character such as blank,
new line or tab.
• But these characters can occur within the values, e.g. a blank
can appear in an address.
• Hence we use the vertical bar character.
13
Field Structures
Method 4: Use a “keyword = value” expression to
identify each field and its content.
15
Reading a Stream of Fields
• This time, we do preserve the notion of
fields, but something is missing:
– We get only a stream of fields
– These should be grouped into two records
16
Record Structure I
• A record can be defined as a set of fields that
belong together when the file is viewed in
terms of a higher level of organization.
• Like the notion of a field, a record is another
conceptual tool which need not exist in the
file in any physical sense.
• Yet records are an important logical notion
included in the file’s structure.
17
Record Structures II
• Methods for organizing the records of a file
include:
– Requiring that the records be a predictable number of
bytes in length.
– Requiring that the records be a predictable number of
fields in length.
– Beginning each record with a length indicator
consisting of a count of the number of bytes that the
record contains.
– Using a second file to keep track of the beginning byte
address for each record.
– Placing a delimiter at the end of each record to
separate it from the next record.
18
Record Structures II
Method 1: Requiring that the records be a predictable number of
bytes in length (the length is fixed for the record, not necessarily for each field).
19
Record Structures II
Method 3: Beginning each record with a length indicator
consisting of a count of the number of bytes that the
record contains.
20
Record Structures II
Method 5: Placing a delimiter at the end of each
record to separate it from the next record.
21
Record Structures that Use a
Length Indicator
• To see how record structures are implemented,
we will consider the length indicator method.
• Implementation:
– Writing the variable-length records to the
file
– Representing the record length
– Reading the variable-length record from the
file.
22
Record Structures that Use a
Length Indicator
Writing the variable-length records to the file:
– We want to write the length of a record at its initial position.
– So we need to know the length of the record before writing it.
– Hence we first pack the data into a buffer, then find its length
using the strlen function.
23
Record Structures that Use a
Length Indicator
Representing the record length:
• As a 2-byte binary integer, or
• Converted into a character string:
fprintf(file, "%d ", length); // C stream
stream << length << ' '; // C++ stream
Each of the above two statements inserts the length and places a
space after it as a delimiter.
24
Record Structures that Use a
Length Indicator
Reading the variable-length record from the file:
– Read the records from the file one at a time.
– Each record is read into a buffer.
– The buffer contents are placed in the character string
strbuff.
– The fields are then unpacked from strbuff into object p.
25
Mixing numbers & characters:
Use a file dump Contd..
• The actual length represented in a file
as a character string is as follows:
26
Mixing numbers & characters:
Use a file dump Contd…
27
Mixing numbers & characters:
Use a file dump
• On UNIX, the file contents can be dumped as
shown (od is the UNIX dump command).
28
Using Classes to Manipulate
Buffers
• Buffer classes differ mainly in whether
they are:
– Fixed length
– Variable length
• They also differ in how fields are separated:
– Delimiter
29
Using Classes to Manipulate
Buffers-I
• Class with delimiter:
30
Using Classes to Manipulate
Buffers-I
• Pack function of the delimited buffer class:
31
Using Classes to Manipulate
Buffers-I
• Unpack Function (Fields):
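The original class listings are not reproduced in these slides. As a stand-in, here is a minimal sketch of a delimited-text buffer with pack and unpack operations. The class name DelimBuffer and all member names are hypothetical, and the delimiter is assumed not to occur inside values.

```cpp
#include <string>

// Hypothetical delimited-text buffer (not the class from the slides).
class DelimBuffer {
public:
    explicit DelimBuffer(char delim = '|') : delim_(delim), next_(0) {}

    // Append one field value followed by the delimiter.
    void pack(const std::string& value) { data_ += value; data_ += delim_; }

    // Extract the next field; returns false when no complete field is left.
    bool unpack(std::string& value) {
        std::string::size_type end = data_.find(delim_, next_);
        if (end == std::string::npos) return false;
        value = data_.substr(next_, end - next_);
        next_ = end + 1;
        return true;
    }

    const std::string& data() const { return data_; }

private:
    char delim_;
    std::string data_;
    std::string::size_type next_;  // position of the next field to unpack
};
```

Fields come back from unpack in the same order in which they were packed.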
32
Using Classes to Manipulate
Buffers-II
• For Fixed length buffers:
33
Using Classes to Manipulate
Buffers-II
• There is an initialize function which
initializes the fields of the file.
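A minimal sketch of a fixed-length buffer, assuming field sizes are declared once in an initialization step and short values are padded with blanks. FixedBuffer, addField and pack are hypothetical names, not the class shown on the original slide.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical fixed-length buffer: field sizes are declared up front.
class FixedBuffer {
public:
    // Initialization step: declare the size of the next field.
    void addField(std::size_t size) { sizes_.push_back(size); }

    // Pack a value into the next declared field, padded with blanks.
    // Returns false if no field is left or the value is too long.
    bool pack(const std::string& value) {
        if (next_ >= sizes_.size() || value.size() > sizes_[next_]) return false;
        data_ += value;
        data_.append(sizes_[next_] - value.size(), ' ');
        ++next_;
        return true;
    }

    const std::string& data() const { return data_; }

private:
    std::vector<std::size_t> sizes_;
    std::size_t next_ = 0;
    std::string data_;
};
```

With a 10-byte field, Ames is stored as four characters plus six blanks, which is the wasted space noted earlier for fixed-length fields.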
34
Using Inheritance for Record
Buffer Classes
• Here we use inheritance to remove
duplication of code when the same procedures
are used by several classes.
• We have seen the classes
– fstream, istream, ostream
– fstream inherits its input/output operations
from its parent class iostream.
– iostream in turn inherits from both istream and
ostream.
35
Using Inheritance for Record
Buffer Classes
37
Using Inheritance for Record
Buffer Classes
38
Using Inheritance for Record
Buffer Classes
• IOBuffer is the base class
• Protected members: can be used only by the class
itself and its derived classes
39
Using Inheritance for Record
Buffer Classes
• All methods are declared virtual: this allows each
subclass to provide its own implementation.
• = 0 (pure virtual method):
– IOBuffer doesn’t include an implementation of
any such method.
– No IOBuffer objects can be created.
40
Using Inheritance for Record
Buffer Classes
• Write function of the variable-length buffer class.
• tellp(): returns the current position in the output
sequence.
• Write returns the address at which the record was written.
41
Using Inheritance for Record
Buffer Classes
42
Assignment-1
• Explain with a program how data is
packed and unpacked with fixed-length
records.
• Explain with a program how data is
packed and unpacked with variable-length
records.
43
Record Access: Keys
45
Record Access: Keys
• Unique keys:
– A key may be shared by many records, e.g.
• key: AMES
• To prevent the above:
– Define a primary key
– Which is unique to a record
• We can also create secondary keys in
support of the primary key.
46
Record Access: Keys
• When we choose a primary key we
should be careful if it contains real
data:
• The key should be unchangeable.
• To avoid the above problem we should
not choose data of a record as the key
(discussed later).
47
Record Access:
Using Sequential Search
• Evaluating Performance of Sequential
Search.
• Improving Sequential Search
Performance with Record Blocking.
• When is Sequential Search Useful?
48
Record Access:
Using Sequential Search
Evaluating Performance of Sequential
Search:
– Best case: 1
– Average case: n/2
– Worst case: n
Sequential search steps:
– One read call is made for each record.
– Each read may require a seek to reach the record.
– E.g. 10 records => 10 read calls => 10 seeks
– Seeking takes more time than reading.
49
Record Access:
Using Sequential Search
Improving Sequential Search
Performance with Record Blocking:
• If we have 100 records => 100 read calls
• Hence group the records into blocks
– E.g. 1 block => 10 records
– Then 10 read calls => 10 blocks
– The block size is usually tied to the sector size.
– E.g. 1 sector => 512 bytes => 10 records
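The arithmetic can be written as a one-line helper, assuming each read call fetches exactly one block and the last block may be partly empty. The name readCalls is hypothetical.

```cpp
#include <cstddef>

// Read calls needed to scan all records when each read fetches one block.
// recordsPerBlock must be at least 1.
std::size_t readCalls(std::size_t records, std::size_t recordsPerBlock) {
    return (records + recordsPerBlock - 1) / recordsPerBlock;  // round up
}
```

With 100 records, unblocked access (1 record per block) needs 100 reads, and 10 records per block needs 10.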
50
Record Access:
Using Sequential Search
Points of record blocking:
– Searching is still O(n) as the number of records is the
same.
– Seek time is reduced.
– The amount of data transferred is larger.
• Even if we need only the first record, a whole block is transferred.
– This can be too expensive.
51
Record Access:
Using Sequential Search
When is Sequential Search Good?
– It is extremely easy to program
– It needs only simple file structures
Performance mainly depends on:
• Processor speed
Mainly used with:
• Tapes
• Files with a small number of records
52
Record Access:
UNIX tools for sequential
processing
54
Direct Access
• How do we know where the beginning of the
required record is?
– It may be in an index (discussed in a different
unit)
– We know the relative record number (RRN)
– RRN: position of a record relative to the beginning of the file
– E.g. first record => RRN 0, next record => RRN 1
and so on
55
Direct Access
• RRNs are not useful when working with variable-length
records: the access is still sequential.
• In order to work with RRNs we need to work with
fixed-length records.
– If records are of fixed length:
• Using the RRN we can calculate the byte offset
• ByteOffset = n * r, where n => record length in bytes
and r => RRN of the record.
– If the fixed length is 512 bytes and RRN = 500, then
ByteOffset = ?
56
Record Structure
Choosing a Record Structure and Record
Length
Header Records
Adding Headers to C++ Buffer classes
57
Record Structure
Choosing a Record Structure and Record Length:
• To use the RRN for direct access:
– First we should fix the record length.
– Fixing the record length means deciding how the sizes of the fields are fixed.
• Two ways to do this:
– Fixed length field
– Fixed record length
58
Record Structure
1. Fixed length field approach:
• Simplicity
2. Fixed record length approach:
• More efficient, as unused space is left only at the end of the record.
In the above 2 methods, one distinction has to be made:
– Differentiate between real data and unused space in
the record.
– This can be done as follows:
• Record length indicator
• Delimiter
• Count fields
59
Record Structure
Header records:
• Hold general information about a file.
• A header record is placed at the beginning of the file to
hold this information.
• Information in the header record:
– Count of the number of records
– Length of the data records
– Date and time the file was last updated
– Name of the file
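A minimal sketch of a header record, assuming a made-up 16-byte text layout that holds only the record count and the record length as two 8-digit decimal fields (so both values must stay below 100000000). The struct and function names are hypothetical, and the date and file name are left out for brevity.

```cpp
#include <cstdio>
#include <string>

// Hypothetical header layout: two fixed-width decimal fields, 16 bytes total.
struct Header {
    unsigned recordCount;
    unsigned recordLength;
};

// Build the 16-byte header text from the two values.
std::string packHeader(const Header& h) {
    char text[17];
    std::snprintf(text, sizeof text, "%08u%08u", h.recordCount, h.recordLength);
    return std::string(text, 16);
}

// Parse the header text back; false if it is too short or malformed.
bool unpackHeader(const std::string& text, Header& h) {
    if (text.size() < 16) return false;
    return std::sscanf(text.c_str(), "%8u%8u",
                       &h.recordCount, &h.recordLength) == 2;
}
```

With a fixed-size header like this, data record r starts at byte 16 + r * recordLength instead of r * recordLength.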
60
Record Structure
• A header record makes the file a self-describing
object
• Any program that accesses the file will know about:
– The file structures used in the file
– This helps in accessing a record
– E.g. header record:
61
Record Structure
• Header record an example:
62
Encapsulating Record I/O
Operation in a single class
• Till now we have done a read/write
operation in
– Two steps:
• Read/write to a buffer
• Then buffer to a file
• Here we will use a class that hides the
buffer.
• It looks as though we have read/written
directly with a file.
63
Encapsulating Record I/O
Operation in a single class
• RecordFile is a class that inherits from BufferFile
• BufferFile contains functions to read/write through a
buffer.
• We will use only these functions.
64
Encapsulating Record I/O
Operation in a single class
• Shows how the read/write functions of
BufferFile are used to perform our task of
reading/writing.
65
File Access and File
Organization: A Summary
67