File I/O: Introduction, C++ streams, C++ streams classes, File Stream classes, file operations,
finding end of file, File opening modes, File Organization, working with Text Files and Binary
Files, Random Access of a Record in a file.
File Organizations: Sequential, indexed sequential, direct, inverted, multi-list, directory systems,
Indexing using B-tree and B+ tree
C++ STREAMS
The C++ I/O system contains a hierarchy of classes that are used to define streams to
deal with both the console and disk files.
In standard C++ the stream classes are declared in the headers <iostream> and <fstream>.
Pre-standard compilers such as Turbo C++ used iostream.h and fstream.h instead.
ios class
This class is the base class for all stream classes.
The streams can be input or output streams.
It defines the members that do not depend on the character type of the stream, such as
the formatting flags, the error-state flags and the file opening modes.
istream class
The istream class handles input streams in C++.
Input stream objects are used to read and interpret the input as a sequence of
characters.
cin is an object of class istream that handles the standard input stream.
ostream class
The ostream class handles output streams in C++.
Output stream objects are used to write data as a sequence of characters, for example
on the screen.
cout is an object of class ostream that handles the standard output stream.
iostream class
The iostream class handles streams that support both input and output.
The iostream class inherits the properties of the istream and ostream classes.
Its objects can be used both to write data as a sequence of characters and to read and
interpret input as a sequence of characters.
streambuf class
The streambuf class is used to create a stream buffer.
Stream buffer is an object in charge of performing the reading and writing operations of
the stream object it is associated with.
The stream delegates all such operations to its associated stream buffer object, which is
an intermediary between the stream and its controlled input and output sequences.
The streambuf class is a base class designed to provide a uniform public
interface for all derived buffer classes, such as filebuf.
fstream class
The fstream class is used to create files, read data from files and write data to files.
In standard C++ fstream is derived from iostream, so it offers the operations of both
ifstream (file input stream) and ofstream (file output stream).
It can therefore read from files and write to files.
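Because the file stream classes derive from istream and ostream, code written against the
base classes works with any stream. The sketch below illustrates this; countWords is a
hypothetical helper name, not part of the library.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts whitespace-separated words on any input stream.
// It accepts std::cin, a std::ifstream or a std::istringstream alike,
// because all of them derive from std::istream.
std::size_t countWords(std::istream& in)
{
    std::size_t count = 0;
    std::string word;
    while (in >> word)   // extraction fails at end of input, which ends the loop
        ++count;
    return count;
}
```

The same function can be called as countWords(std::cin) or with an opened std::ifstream.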
filebuf class
The filebuf class is used to create a file stream buffer object.
A file stream buffer is used to read from and write to files.
These objects are associated with a file by calling the member function open(). Once open,
all input/output operations performed on the object are reflected in the associated file.
Objects of this class may internally maintain an intermediate input buffer and/or
an intermediate output buffer, where individual characters are read or written by I/O
operations.
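A filebuf is rarely used directly, but doing so shows how a stream delegates its work to
the buffer. The following is a minimal sketch; roundTripThroughFilebuf is a hypothetical
name and the file path is chosen by the caller.

```cpp
#include <fstream>
#include <istream>
#include <ostream>
#include <string>

// Writes one line through a filebuf, then reads it back through another.
// Returns the line that was read, or an empty string if a file could not be opened.
std::string roundTripThroughFilebuf(const char* path, const std::string& text)
{
    std::filebuf out;
    if (!out.open(path, std::ios::out | std::ios::trunc))
        return "";
    std::ostream os(&out);      // an ostream that delegates to the filebuf
    os << text << '\n';
    out.close();                // flushes the intermediate output buffer

    std::filebuf in;
    if (!in.open(path, std::ios::in))
        return "";
    std::istream is(&in);       // an istream that reads through the filebuf
    std::string line;
    std::getline(is, line);
    in.close();
    return line;
}
```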
FILE OPERATIONS
Opening a file
In order to perform any operation on a file we need to open the file.
A file can be opened in two ways:
1. using the open() function
2. during the creation of the file stream object (through its constructor)
using open() function
Syntax:
filestreamobject.open(filename, filemode);
Example 1 (text file):
fin.open("file1.txt", ios::in);
Example 2 (binary file):
fin.open("file1.txt", ios::in | ios::binary);
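Both ways of opening a file can be sketched as follows. The function names are
hypothetical; the mode flags ios::out, ios::trunc, ios::app and ios::in are standard
opening modes (the others are ios::ate and ios::binary).

```cpp
#include <fstream>
#include <string>

// Way 1: create the stream object first, then call open() with name and mode.
bool writeWithOpen(const std::string& path, const std::string& text)
{
    std::ofstream fout;
    fout.open(path, std::ios::out | std::ios::trunc);   // trunc discards old contents
    if (!fout)
        return false;
    fout << text;
    return fout.good();   // the destructor flushes and closes the file
}

// Way 2: pass the name and mode to the constructor.
// ios::app places every write at the end of the existing contents.
bool appendWithConstructor(const std::string& path, const std::string& text)
{
    std::ofstream fout(path, std::ios::out | std::ios::app);
    if (!fout)
        return false;
    fout << text;
    return fout.good();
}

// Way 2 again, for input: the constructor opens the file in ios::in mode.
std::string readFirstWord(const std::string& path)
{
    std::ifstream fin(path, std::ios::in);
    std::string word;
    fin >> word;          // word stays empty if the file could not be opened
    return word;
}
```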
Text Files
Text files are a special subset of binary files that store human-readable
characters, either as a plain text document or as a rich text document.
Text files store data as sequential bytes, and the bits in a text file represent characters.
Text files are less prone to corruption: an unwanted change usually shows up as soon as
the file is opened and can then easily be corrected.
Text files can be classified as plain text files and rich text files.
Because of their simple and standard format, text files are one of the most widely used
file formats for storing textual data and are supported by many applications.
Binary File
Binary files are used to store multiple types of data (images, audio, text, etc.) in a
single file.
A binary file stores data as a sequence of bytes, each byte being a group of eight bits,
in the same form in which the data is held in memory.
When data is stored in a file in binary format, reading and writing is faster
because no time is lost in converting the data from one format to another.
A small change in a binary file can corrupt the file and make it unreadable to the
supporting application.
Common examples of binary files are image files such as .PNG or .JPG.
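The difference between the two formats can be seen by storing the same int both ways.
This is a minimal sketch with hypothetical function names: the text file holds one byte
per digit, while the binary file holds a copy of the int as it sits in memory.

```cpp
#include <fstream>
#include <string>

// Writes the same value twice: as text digits and as raw bytes.
void writeBothWays(int value, const std::string& textPath, const std::string& binPath)
{
    std::ofstream text(textPath);
    text << value;                                    // formatted: converted to digits

    std::ofstream bin(binPath, std::ios::binary);
    bin.write(reinterpret_cast<const char*>(&value),  // unformatted: no conversion
              sizeof(value));
}

// Returns the size of a file in bytes, or -1 if it cannot be opened.
long fileSize(const std::string& path)
{
    std::ifstream in(path, std::ios::binary | std::ios::ate);   // ate: start at the end
    if (!in)
        return -1;
    return static_cast<long>(in.tellg());
}
```

For the value 1234567 the text file is 7 bytes long, while the binary file is sizeof(int)
bytes long whatever the value is.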
// Write an Employee record to a text file, then read it back.
#include <iostream>
#include <fstream>
#include <cstdlib>
using namespace std;

// class Employee declaration
class Employee
{
public:
    int empID;
    char empName[100];
    char designation[100];
    int ddj, mmj, yyj;   // date fields, not used in this example
    int ddb, mmb, yyb;   // date fields, not used in this example

    // function to read employee details from the keyboard
    void readEmployee()
    {
        cout << "EMPLOYEE DETAILS" << endl;
        cout << "ENTER EMPLOYEE ID : ";
        cin >> empID;
        cout << "ENTER NAME OF THE EMPLOYEE : ";
        cin >> empName;        // reads a single word
        cout << "ENTER DESIGNATION : ";
        cin >> designation;    // reads a single word
    }

    // function to display employee details
    void displayEmployee()
    {
        cout << "EMPLOYEE ID: " << empID << endl;
        cout << "EMPLOYEE NAME: " << empName << endl;
        cout << "DESIGNATION: " << designation << endl;
    }
};

int main()
{
    Employee emp;
    emp.readEmployee();

    // open the file for writing; the data is stored as text
    fstream file;
    file.open("empnew1.txt", ios::out);
    if (!file)
    {
        cout << "Error in creating file...\n";
        exit(1);
    }
    file << emp.empID << endl;
    file << emp.empName << endl;
    file << emp.designation << endl;
    file.close();
    cout << "Data saved into the file.\n";

    // reopen the same file for reading
    file.open("empnew1.txt", ios::in);
    if (!file)
    {
        cout << "Error in opening file...\n";
        exit(1);
    }
    // test the extraction itself instead of calling eof() before reading
    if (file >> emp.empID >> emp.empName >> emp.designation)
    {
        cout << endl << endl;
        cout << "Data extracted from file..\n";
        // print the object
        emp.displayEmployee();
    }
    else
    {
        cout << "Error in reading data from file...\n";
        exit(1);
    }
    file.close();
    return 0;
}
Output
EMPLOYEE DETAILS
ENTER EMPLOYEE ID : 90
ENTER NAME OF THE EMPLOYEE : rani
ENTER DESIGNATION : nurse
Data saved into the file.
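The program above reads a single record. When a file holds an unknown number of lines,
the end of the file is usually detected by testing the result of each read, rather than by
calling eof() before reading. A minimal sketch, with readAllLines as a hypothetical name:

```cpp
#include <fstream>
#include <string>
#include <vector>

// Reads every line of a text file until the end of file is reached.
std::vector<std::string> readAllLines(const std::string& path)
{
    std::vector<std::string> lines;
    std::ifstream fin(path);
    std::string line;
    // getline returns the stream, which tests false once a read fails,
    // so the loop stops at end of file without processing a stale line.
    while (std::getline(fin, line))
        lines.push_back(line);
    return lines;
}
```

If the file cannot be opened, the first read fails and the function returns an empty vector.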
// Write an Employee object to a binary file, then read it back.
#include <iostream>
#include <fstream>
#include <cstdlib>
using namespace std;

// class Employee declaration
class Employee
{
private:
    int empID;
    char empName[100];
    char designation[100];
    int ddj, mmj, yyj;   // date fields, not used in this example
    int ddb, mmb, yyb;   // date fields, not used in this example

public:
    // function to read employee details from the keyboard
    void readEmployee()
    {
        cout << "EMPLOYEE DETAILS" << endl;
        cout << "ENTER EMPLOYEE ID : ";
        cin >> empID;
        cout << "ENTER NAME OF THE EMPLOYEE : ";
        cin >> empName;        // reads a single word
        cout << "ENTER DESIGNATION : ";
        cin >> designation;    // reads a single word
    }

    // function to display employee details
    void displayEmployee()
    {
        cout << "EMPLOYEE ID: " << empID << endl;
        cout << "EMPLOYEE NAME: " << empName << endl;
        cout << "DESIGNATION: " << designation << endl;
    }
};

int main()
{
    Employee emp;
    emp.readEmployee();

    // open the file for binary output
    fstream file;
    file.open("emp.dat", ios::out | ios::binary);
    if (!file)
    {
        cout << "Error in creating file...\n";
        exit(1);
    }
    // write the bytes of the whole object in one operation
    file.write(reinterpret_cast<const char*>(&emp), sizeof(emp));
    file.close();
    cout << "Data saved into the file.\n";

    // reopen the same file for binary input
    file.open("emp.dat", ios::in | ios::binary);
    if (!file)
    {
        cout << "Error in opening file...\n";
        exit(1);
    }
    // read() returns the stream, which tests false if the record is incomplete
    if (file.read(reinterpret_cast<char*>(&emp), sizeof(emp)))
    {
        cout << endl << endl;
        cout << "Data extracted from file..\n";
        // print the object
        emp.displayEmployee();
    }
    else
    {
        cout << "Error in reading data from file...\n";
        return -1;
    }
    file.close();
    return 0;
}
Output
EMPLOYEE DETAILS
ENTER EMPLOYEE ID : 90
ENTER NAME OF THE EMPLOYEE : rani
ENTER DESIGNATION : nurse
Data saved into the file.
File pointer
Each file stream class contains a file pointer that is used to keep track of the current
read/write position within the file.
By default, when a file is opened for reading or writing, the file pointer is set to the
beginning of the file. The exceptions are the ios::app and ios::ate modes, which start at
the end of the file.
In random file access the file pointer can be set to any desired position and the data can
be read from or written at that position.
Random file access is performed using the seekg(), seekp(), tellg() and tellp() functions.
tellg()
The tellg() function is used with input streams, and returns the current "get" position of
the pointer in the stream.
It has no parameters and returns a value of the member type pos_type, which can be
converted to an integer giving the current position of the get pointer.
Syntax:
pos_type tellg();
Returns: the current position of the get pointer on success, -1 on failure.
tellp()
The tellp() function is used with output streams, and returns the current "put" position of
the pointer in the stream.
It has no parameters and returns a value of the member type pos_type, which can be
converted to an integer giving the current position of the put pointer.
Syntax:
pos_type tellp();
Returns: the current position of the put pointer on success, -1 on failure.
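The two functions can be tried out with a short sketch such as the following. The function
names are hypothetical, and the files are opened in binary mode so that the reported
positions are plain byte counts.

```cpp
#include <fstream>
#include <string>

// Writes text to a file and returns the put position reported afterwards.
long putPositionAfterWrite(const std::string& path, const std::string& text)
{
    std::ofstream fout(path, std::ios::binary);
    fout << text;
    return static_cast<long>(fout.tellp());   // number of bytes written so far
}

// Reads n characters and returns the get position reported afterwards.
long getPositionAfterRead(const std::string& path, int n)
{
    std::ifstream fin(path, std::ios::binary);
    char c;
    for (int i = 0; i < n && fin.get(c); ++i)
    {
        // nothing to do: each successful get() advances the get pointer by one
    }
    return static_cast<long>(fin.tellg());    // -1 if the stream is in a failed state
}
```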
seekg()
seekg() is a member function of istream (part of the standard library) that allows you
to seek to an arbitrary position in a file.
It is used in file handling to set the position of the next character to be extracted from
the input stream of a given file.
Syntax: there are two forms of seekg() in file handling:
istream& seekg(streampos position);
istream& seekg(streamoff offset, ios_base::seekdir dir);
Description:
position is an absolute position counted from the beginning of the file.
offset is a number of bytes relative to dir, which is ios::beg (beginning of the file),
ios::cur (current position) or ios::end (end of the file).
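Both forms of positioning can be sketched as follows: an offset from ios::beg is zero or
positive, while an offset from ios::end is zero or negative. The function names are
hypothetical, and binary mode is assumed so that offsets are plain byte counts.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Returns the rest of the line that starts offset bytes from the beginning.
std::string lineFromOffset(const std::string& path, long offset)
{
    std::ifstream fin(path, std::ios::binary);
    fin.seekg(offset, std::ios::beg);
    std::string line;
    std::getline(fin, line);
    return line;
}

// Returns the last n bytes of the file by seeking relative to the end.
std::string lastBytes(const std::string& path, long n)
{
    std::ifstream fin(path, std::ios::binary);
    fin.seekg(-n, std::ios::end);                 // negative offset: move back from the end
    std::string tail(static_cast<std::size_t>(n), '\0');
    fin.read(&tail[0], n);
    tail.resize(static_cast<std::size_t>(fin.gcount()));   // keep only what was read
    return tail;
}
```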
seekp()
The seekp() method of ostream is used to set the position of the put pointer in the output
sequence.
Syntax: like seekg(), it has two forms:
ostream& seekp(streampos pos);
ostream& seekp(streamoff offset, ios_base::seekdir dir);
Parameters: pos is the new absolute position; offset is a number of bytes relative to dir
(ios::beg, ios::cur or ios::end).
Return Value: this method returns the same ostream object, with the put pointer set to the
specified new position.
Example:
fout.seekp(14, ios::cur); // move forward 14 bytes
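A typical use of seekp() is to overwrite part of an existing file in place. The sketch
below assumes the file already exists; overwriteAt is a hypothetical name.

```cpp
#include <fstream>
#include <string>

// Overwrites bytes in an existing file, starting offset bytes from the beginning.
// ios::in | ios::out opens the file for update without truncating it.
bool overwriteAt(const std::string& path, long offset, const std::string& text)
{
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file)
        return false;
    file.seekp(offset, std::ios::beg);   // move the put pointer
    file << text;                        // replaces the bytes at that position
    return file.good();
}
```

Opening with ios::out alone would truncate the file, so the update mode is used here.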
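// Read lines from sample.txt after moving the get pointer with seekg().
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
using namespace std;

int main()
{
    ifstream fin("sample.txt");
    // If we couldn't open the input file stream for reading
    if (!fin)
    {
        // Print an error and exit
        cout << "Uh oh, sample.txt could not be opened for reading!" << endl;
        exit(1);
    }
    string strData;
    fin.seekg(0);                 // move to the beginning of the file
    getline(fin, strData);        // reads the first line
    cout << strData << endl;
    fin.seekg(0, ios::cur);       // offset 0 from the current position: no movement
    getline(fin, strData);        // reads the second line
    cout << strData << endl;
    // An offset from ios::end must be zero or negative to stay inside the file.
    // -9 lands on "raju" when sample.txt ends with "raju\nmani", that is, with
    // one-byte line endings and no newline after the last line.
    fin.seekg(-9, ios::end);
    getline(fin, strData);
    cout << strData << endl;
    return 0;
}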
sample.txt
kiran
jyothi
ravi
raju
mani
Output
kiran
jyothi
raju
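Random access of a record in a file relies on fixed-size records: if every record
occupies the same number of bytes, record number n starts at byte n * sizeof(Record) and
can be reached with a single seekg(). The following is a minimal sketch; the Record
layout and the function names are hypothetical.

```cpp
#include <fstream>
#include <string>

// Fixed-size record: every record occupies sizeof(Record) bytes in the file.
struct Record
{
    int  id;
    char name[20];
};

// Writes count records one after another into a new binary file.
bool writeRecords(const std::string& path, const Record* recs, int count)
{
    std::ofstream fout(path, std::ios::binary | std::ios::trunc);
    if (!fout)
        return false;
    fout.write(reinterpret_cast<const char*>(recs),
               static_cast<std::streamsize>(count * sizeof(Record)));
    return fout.good();
}

// Reads record number index (0-based) directly, without reading the earlier ones.
bool readRecord(const std::string& path, int index, Record& out)
{
    std::ifstream fin(path, std::ios::binary);
    if (!fin)
        return false;
    const std::streamoff offset =
        static_cast<std::streamoff>(index) * static_cast<std::streamoff>(sizeof(Record));
    fin.seekg(offset, std::ios::beg);             // jump straight to the record
    fin.read(reinterpret_cast<char*>(&out), sizeof(Record));
    // succeed only if a complete record was read
    return fin.gcount() == static_cast<std::streamsize>(sizeof(Record));
}
```

A file written this way is only meant to be read back by a program built with the same
Record layout.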
A file is a collection of records. A file contains records, records contain
fields, fields contain data items, and data items contain characters (alphabets, digits,
special characters, etc.). Each character occupies one byte of storage.
The technique used to represent and store the records in a file is known as file organization.
A record in a file is searched for using a single key or multiple keys.
File organizations are classified into different types based on whether they use a single
key or multiple keys.
The different types of file organization are:
1. Sequential File Organization
2. Direct Access or Random Access File Organization
3. Index Sequential Access File Organization
4. Inverted list File Organization
5. Multi list File Organization
Directory: A collection of nodes containing information about all files is called a directory.
There are 5 types of directory structure in an operating system.
1) Single Level Directory
2) Two Level Directory
3) Tree Structured Directory
4) Acyclic Graph Directory
5) General Graph Directory
4) Acyclic Graph Directory
An acyclic graph is a graph with no cycles.
It allows directories to share subdirectories and files.
With a shared file, only one actual file exists, so any changes made by one person are
immediately visible to the other.
A problem with the acyclic graph structure is: how do we guarantee there are no cycles?
5) General Graph Directory
A general graph allows cycles.
To keep the structure free of cycles, one option is to allow links only to files and not
to subdirectories.
Another option is to run a cycle detection algorithm every time a new link is added, to
determine whether the link is acceptable.
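The cycle check mentioned above can be sketched with a depth-first search: a new link
from parent to child would close a cycle exactly when parent is already reachable from
child. The graph representation and the function names below are assumptions made for
illustration, not how any particular operating system stores directories.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Directory graph: each directory name maps to the directories it links to.
typedef std::map<std::string, std::vector<std::string> > DirGraph;

// Depth-first search: is target reachable from start by following links?
bool reachable(const DirGraph& g, const std::string& start,
               const std::string& target, std::set<std::string>& seen)
{
    if (start == target)
        return true;
    if (!seen.insert(start).second)
        return false;                              // already visited
    DirGraph::const_iterator it = g.find(start);
    if (it == g.end())
        return false;                              // no outgoing links
    for (std::size_t i = 0; i < it->second.size(); ++i)
        if (reachable(g, it->second[i], target, seen))
            return true;
    return false;
}

// Adds the link parent -> child only if it would not close a cycle.
bool addLink(DirGraph& g, const std::string& parent, const std::string& child)
{
    std::set<std::string> seen;
    if (reachable(g, child, parent, seen))
        return false;                              // link rejected
    g[parent].push_back(child);
    return true;
}
```

Sharing a subdirectory between two parents is still accepted, because that creates no cycle.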
INDEXING
Indexing is a data structure technique which allows you to quickly retrieve records from a
database file.
Indexing optimizes the performance of a database by minimizing the number of
disk accesses required when a query is processed.
The index is itself a data structure. It is used to locate and access the data in a
database table quickly.
It is defined on an indexing attribute, also called the indexing field or key.
Indexing techniques are implemented using B Trees and B+ Trees.
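The idea of an index can be sketched with an in-memory map from key to byte offset. The
example below is an assumption-laden illustration: it uses std::map in place of a B Tree
or B+ Tree, and the Student record layout and function names are hypothetical.

```cpp
#include <fstream>
#include <map>
#include <string>

struct Student          // hypothetical fixed-size record
{
    int  roll;
    char name[20];
};

typedef std::map<int, std::streamoff> Index;   // key -> byte offset of the record

// One sequential pass over the data file builds the index.
Index buildIndex(const std::string& path)
{
    Index index;
    std::ifstream fin(path, std::ios::binary);
    Student s;
    std::streamoff offset = 0;
    while (fin.read(reinterpret_cast<char*>(&s), sizeof(s)))
    {
        index[s.roll] = offset;
        offset += static_cast<std::streamoff>(sizeof(s));
    }
    return index;
}

// Looks the key up in the index, then reads exactly one record from the file.
bool findByRoll(const std::string& path, const Index& index, int roll, Student& out)
{
    Index::const_iterator it = index.find(roll);
    if (it == index.end())
        return false;                          // key is not in the file
    std::ifstream fin(path, std::ios::binary);
    fin.seekg(it->second, std::ios::beg);      // a single seek instead of a scan
    fin.read(reinterpret_cast<char*>(&out), sizeof(out));
    return !fin.fail();
}
```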
B TREE
A B Tree is a specialized m-way tree that is widely used for disk access.
Each node of a B Tree of order m can have at most m-1 keys and m children.
One of the main reasons for using a B Tree is its capability to store a large number of
keys in a single node, which keeps the height of the tree relatively small.
A B Tree of order m has all the properties of an m-way tree. In addition, it has
the following properties.
Every node in a B Tree contains at most m children.
Every node in a B Tree, except the root node and the leaf nodes, contains at least ⌈m/2⌉
children.
The root node must have at least 2 children, unless it is a leaf.
All leaf nodes must be at the same level.
It is not necessary that all the nodes contain the same number of children, but each
internal node other than the root must have at least ⌈m/2⌉ children.
Insertions are done at the leaf node level. The following algorithm is used
to insert an item into a B Tree.
1. Traverse the B Tree to find the appropriate leaf node in which the key can
be inserted.
2. If the leaf node contains fewer than m-1 keys, insert the element in increasing
order.
3. Otherwise, if the leaf node already contains m-1 keys, follow these steps.
   1. Insert the new element in the increasing order of elements.
   2. Split the node into two nodes at the median.
   3. Push the median element up to its parent node.
   4. If the parent node also contains m-1 keys, split it too by
   following the same steps.
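The leaf-level part of this algorithm can be sketched as below. It handles one node only:
finding the leaf and propagating the split to the parent are left out, and the order m is
assumed to be at least 3. SplitResult and insertIntoLeaf are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Result of inserting a key into one B Tree leaf.
struct SplitResult
{
    bool split;               // true if the leaf overflowed and was split
    int median;               // key to push up to the parent (valid if split)
    std::vector<int> right;   // keys of the new right sibling (valid if split)
};

// Inserts key into the sorted key list of a leaf of a B Tree of order m.
// If the leaf then holds more than m-1 keys, it is split at the median.
SplitResult insertIntoLeaf(std::vector<int>& keys, int key, std::size_t m)
{
    SplitResult r;
    r.split = false;
    r.median = 0;
    // insert the new key so that the keys stay in increasing order
    keys.insert(std::upper_bound(keys.begin(), keys.end(), key), key);
    if (keys.size() <= m - 1)
        return r;                                  // no overflow, nothing else to do
    const std::size_t mid = keys.size() / 2;       // position of the median
    r.split = true;
    r.median = keys[mid];                          // moves up to the parent
    r.right.assign(keys.begin() + mid + 1, keys.end());
    keys.resize(mid);                              // left half stays in this node
    return r;
}
```

With m = 5 and a full leaf 10, 20, 30, 40, inserting 25 leaves 10, 20 in the node, creates
a sibling holding 30, 40 and pushes 25 up to the parent.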
B+ TREES
A B+ tree is an m-way tree with a variable but often large number of children per node.
B+ trees are used in file systems to store data for efficient retrieval from
block-oriented storage.
A B+ tree consists of a root, internal nodes and leaves. The root may be either a leaf or
a node with two or more children.
All leaves are at the same distance from the root.
The leaf nodes have an entry for every value of the search field, along with a data
pointer to the record.
The leaf nodes of the B+ tree are linked together to provide ordered access on the
search field to the records.
Internal nodes of a B+ tree are used to guide the search. Some search field values from
the leaf nodes are repeated in the internal nodes of the B+ tree.
Structure of Internal node
Each internal node is of the form < P1, K1, P2, K2, . . . , Pn-1, Kn-1, Pn > where each Ki
is a key and each Pi is a tree pointer.
Within each internal node, K1 < K2 < . . . < Kn-1.
For every search field value x in the subtree pointed at by Pi, we have K(i-1) < x <= Ki
(for the first pointer only x <= K1 applies, and for the last pointer only x > Kn-1).
Each internal node has at most p tree pointers.
Each internal node, except the root, has at least ⌈p/2⌉ tree pointers.
Structure of a leaf node
Each leaf node is of the form << K1, P1 >, < K2, P2 >, . . . , < Kn-1, Pn-1 >, Pnext >, where
Pnext points to the next leaf node.
Within each leaf node, K1 < K2 < . . . < Kn-1.
Pi is a data pointer that points to the record whose search field value is Ki.
Each leaf node has at least ⌈p/2⌉ values. All leaf nodes are at the same level.
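The benefit of linking the leaves through Pnext can be sketched with a range scan. The
Leaf structure below is a simplification made for illustration: the data pointers Pi are
omitted and the keys are plain integers.

```cpp
#include <cstddef>
#include <vector>

// Simplified B+ tree leaf: sorted keys plus a pointer to the next leaf (Pnext).
struct Leaf
{
    std::vector<int> keys;
    Leaf* next;
};

// Collects every key in [low, high], starting at the given leaf and following
// the next-pointers instead of going back up through the internal nodes.
std::vector<int> rangeScan(const Leaf* start, int low, int high)
{
    std::vector<int> result;
    for (const Leaf* leaf = start; leaf != nullptr; leaf = leaf->next)
    {
        for (std::size_t i = 0; i < leaf->keys.size(); ++i)
        {
            if (leaf->keys[i] > high)
                return result;                 // keys are sorted, so the scan can stop
            if (leaf->keys[i] >= low)
                result.push_back(leaf->keys[i]);
        }
    }
    return result;
}
```

In a full B+ tree the starting leaf would be found by a normal search for low from the root.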
Inserting in B+ tree
Perform a search operation in the B+ tree to find the leaf node in which the new key
should be placed.
If the node is not full (adding the key does not violate the B+ tree property), then add
the key to this node.
Otherwise, insert the key, split the node into two nodes and pass the middle key up to the
parent node.
Repeat the split on the parent node if it becomes full in turn; if the root is split, a new
root is created.
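A leaf split in a B+ tree differs slightly from a B Tree split. In the common textbook
formulation, which is assumed here, the middle key is copied to the parent and also stays
in the new right leaf, so that every key still appears at the leaf level. The sketch below
covers a single leaf only; LeafSplit, insertIntoBPlusLeaf and maxKeys are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Result of inserting a key into one B+ tree leaf.
struct LeafSplit
{
    bool split;               // true if the leaf overflowed and was split
    int separator;            // key copied up to the parent (valid if split)
    std::vector<int> right;   // keys of the new right leaf (valid if split)
};

// Inserts key into a sorted leaf that may hold at most maxKeys keys.
LeafSplit insertIntoBPlusLeaf(std::vector<int>& keys, int key, std::size_t maxKeys)
{
    LeafSplit r;
    r.split = false;
    r.separator = 0;
    // insert the new key so that the keys stay in increasing order
    keys.insert(std::lower_bound(keys.begin(), keys.end(), key), key);
    if (keys.size() <= maxKeys)
        return r;                                  // the leaf was not full
    const std::size_t mid = keys.size() / 2;
    r.split = true;
    r.right.assign(keys.begin() + mid, keys.end());   // middle key stays in a leaf
    r.separator = r.right.front();                 // and is also copied to the parent
    keys.resize(mid);                              // left half stays in this leaf
    return r;
}
```

With maxKeys = 3 and a full leaf 10, 20, 30, inserting 25 leaves 10, 20 in the old leaf,
puts 25, 30 in the new leaf and copies 25 up as the separator.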
B Tree | B+ Tree
-------|--------
Search keys cannot be stored repeatedly. | Redundant search keys can be present.
Data can be stored in leaf nodes as well as internal nodes. | Data can only be stored in the leaf nodes.
Searching for some data is a slower process since data can be found on internal nodes as well as on the leaf nodes. | Searching is comparatively faster as data can only be found on the leaf nodes.
Deletion of internal nodes is complicated and time consuming. | Deletion is never a complex process since an element is always deleted from the leaf nodes.
Leaf nodes cannot be linked together. | Leaf nodes are linked together to make the search operations more efficient.