
File Structures

Presentation By:
Abhishek Kumar Giri
23265001
Outline:
 WHAT ARE FILES?

 FILE STRUCTURES

 FILE ORGANIZATION:
o SEQUENTIAL FILE ORGANIZATION.
o INDEXED FILE ORGANIZATION.
o DIRECT FILE ORGANIZATION.

 PROS AND CONS OF ALL FILE ORGANIZATIONS
 COMPARISONS OF ALL FILE ORGANIZATIONS
 REAL LIFE APPLICATIONS
WHAT ARE FILES?
 A file is a collection of related records.

 The file size is limited by the size of memory and storage medium.
FILE STRUCTURES

 File structure refers to the logical arrangement of data in a file.


 It concerns how the data within a file is structured and relates to
the format of the data. For example, a file could be structured as
a text file, a binary file, a CSV file, a JSON file, etc. The structure of
a file determines how the data can be read and interpreted.
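To illustrate how structure determines interpretation, here is a minimal sketch that writes one record in three formats. The record, its field names, and the binary layout are assumptions made up for this example.

```python
import csv
import io
import json
import struct

record = {"id": 7, "name": "Asha", "score": 91.5}  # hypothetical record

# CSV: fields separated by commas, one record per line, no field names.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerow(
    [record["id"], record["name"], record["score"]]
)
csv_text = buf.getvalue()

# JSON: self-describing text that carries the field names.
json_text = json.dumps(record)

# Binary: assumed fixed layout (4-byte int, 10-byte name, 8-byte float),
# little-endian with no padding.
LAYOUT = struct.Struct("<i10sd")
binary = LAYOUT.pack(record["id"], record["name"].encode(), record["score"])

# Reading the bytes back only works if the reader knows the same layout.
rid, raw_name, score = LAYOUT.unpack(binary)
decoded = {"id": rid, "name": raw_name.rstrip(b"\x00").decode(), "score": score}
```

The same data occupies different bytes in each format, and each format needs its own reader.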
File organization
 File organization refers to the physical arrangement of data on the storage
medium. It’s about how the data is physically stored and accessed on the
disk.

 Characteristics of File organization:


o Efficient: Insert, delete, and update transactions on the records should be quick and
easy.
o No Redundancy: Duplicate records must not be introduced as a result of an insert, update, or
delete.
o Cost effective: Records should be stored efficiently, at minimal storage cost.

*The organization of a file affects the speed and efficiency of data retrieval.*
Types of File organization

1. Sequential file organization
2. Indexed file organization
3. Direct access (or hash) file organization
1. Sequential file organization
 In this method, records are stored one after the other in a sequential manner.
The records are stored based on a key field which is a part of each record.
This key field is also known as the ‘Primary Key’.
Types of Sequential File Organization
1.) Pile File Method
2.) Sorted File Method

PROS OF SFO:
• Simplicity: It’s straightforward to implement because records are stored in a
specific order.
• Efficient for Large Volumes of Data: When dealing with large amounts of data that
need to be processed in sequence, this method is efficient.

CONS OF SFO:
• Slow Access for Individual Records: If you need to access a specific record, you
may need to go through many other records first, which can be time-consuming.
• Insertion and Deletion: Inserting and deleting records can be inefficient because it
requires shifting records.
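The two sequential methods and their trade-offs can be sketched in memory as follows. This is an illustrative model only: the lists stand in for files, and the function names and sample records are assumptions, not part of the original slides.

```python
import bisect

def pile_insert(records, record):
    """Pile file: append the new record at the end, in arrival order."""
    records.append(record)

def sorted_insert(records, record):
    """Sorted file: keep records ordered by key; later records shift right."""
    bisect.insort(records, record)  # tuples compare by their first field (the key)

def sequential_search(records, key):
    """Scan from the start; return (record, number of records examined)."""
    for examined, record in enumerate(records, start=1):
        if record[0] == key:
            return record, examined
    return None, len(records)

pile, ordered = [], []
for rec in [(30, "C"), (10, "A"), (20, "B")]:  # (key, payload), hypothetical data
    pile_insert(pile, rec)
    sorted_insert(ordered, rec)
```

Appending to the pile is cheap, while `sorted_insert` may move many records to make room. In both cases, finding one record can require examining every record before it.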
2. Indexed sequential access method (ISAM)

 ISAM is an advanced form of sequential file organization. In this method, records
are stored in the file using the primary key. An index value is generated for each
primary key and mapped to the record. This index contains the address of the
record in the file.
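A minimal sketch of the idea: an index that maps each primary key to the byte offset of its record. This simplification uses a dense in-memory dictionary, which is an assumption made for brevity and not how a production ISAM index is laid out. The `build_file` and `fetch` helpers and the `key,payload` line format are hypothetical.

```python
import io

def build_file(records):
    """Write 'key,payload' lines in key order; return (file, index).

    The index maps each primary key to the byte offset of its record.
    """
    data = io.BytesIO()
    index = {}
    for key, payload in sorted(records):
        index[key] = data.tell()
        data.write(f"{key},{payload}\n".encode())
    return data, index

def fetch(data, index, key):
    """Jump straight to the record's address instead of scanning the file."""
    offset = index.get(key)
    if offset is None:
        return None
    data.seek(offset)
    line = data.readline().decode().rstrip("\n")
    stored_key, payload = line.split(",", 1)
    return int(stored_key), payload

data, index = build_file([(30, "C"), (10, "A"), (20, "B")])
```

With the index in hand, `fetch(data, index, 20)` seeks directly to the record rather than reading the records stored before it.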
PROS OF ISAM
 Fast Access: By using an index, records can be accessed quickly.
 Efficient Modification: Records can be inserted, deleted, or modified in the middle
of the file.

CONS OF ISAM
 Space Requirement: This method requires additional space to store the index.
 Overhead of Index Maintenance: Maintaining the index can add overhead,
especially when records are inserted or deleted.
3. Direct access file organization

 In this method, a hash function is used to compute the address of each record.
This allows for quick access to records as you can directly compute the
address of the record and retrieve it.
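As a rough sketch of address computation, the example below hashes a key to a fixed-size slot in a byte buffer that stands in for the file. The record layout, the slot count, the modulo hash function, and the restriction to positive integer keys are all assumptions chosen for illustration. Collisions are deliberately ignored here.

```python
import struct

RECORD = struct.Struct("<i16s")   # assumed layout: 4-byte key + 16-byte payload
NUM_SLOTS = 8                     # assumed file capacity

def address(key):
    """Hash function: map a key to the byte offset of its slot."""
    return (key % NUM_SLOTS) * RECORD.size

def store(storage, key, payload):
    """Write the record at its computed address (overwrites on collision)."""
    offset = address(key)
    storage[offset:offset + RECORD.size] = RECORD.pack(key, payload.encode())

def load(storage, key):
    """Read the slot at the computed address; keys are assumed positive."""
    offset = address(key)
    stored_key, raw = RECORD.unpack_from(storage, offset)
    if stored_key != key:
        return None  # slot is empty or holds a different key
    return raw.rstrip(b"\x00").decode()

storage = bytearray(NUM_SLOTS * RECORD.size)
store(storage, 42, "Asha")
```

No search is involved: `load` computes one offset and reads one slot, whatever the size of the file.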
PROS OF HASHED
• Fast Access: Since the location of each record can be directly computed, access to
records is very fast.

CONS OF HASHED
• Hash Collisions: Two different keys may hash to the same value, leading to a
collision. Handling these collisions can be complex and may require additional
storage.
• Space Utilization: If the hash function does not distribute records evenly, some
areas of storage may be heavily used while others are underutilized.
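Both drawbacks can be seen in a small sketch that resolves collisions by chaining, one of several possible strategies and chosen here only for brevity. The bucket count and the keys are hypothetical, picked so that the modulo hash distributes them badly.

```python
NUM_BUCKETS = 4  # assumed, deliberately small so that collisions occur

def bucket_of(key):
    """Simple modulo hash function."""
    return key % NUM_BUCKETS

def insert(buckets, key, payload):
    """Chaining: colliding records share one bucket's overflow list."""
    buckets[bucket_of(key)].append((key, payload))

def lookup(buckets, key):
    """Search only the bucket the key hashes to."""
    for stored_key, payload in buckets[bucket_of(key)]:
        if stored_key == key:
            return payload
    return None

buckets = [[] for _ in range(NUM_BUCKETS)]
for key in (4, 8, 12, 16, 5):  # mostly multiples of 4, so they collide
    insert(buckets, key, f"rec{key}")

load_per_bucket = [len(b) for b in buckets]
```

Four of the five records land in the first bucket while two buckets stay empty, so lookups in the crowded bucket degrade toward a sequential scan.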
Real life applications

Sequential File Organization
• Transaction Processing Systems: In banking or any other transaction
processing system, records such as transactions are typically appended
in a sequential manner.
• Log Files: Log files generated by systems and applications often use
sequential file organization. Each new log entry is appended at the
end of the file.

Direct Access (or Hashed) File Organization
• Databases: Many databases use hashing for direct access to records.
This allows for quick retrieval of data, which is crucial for
performance.
• Caching Systems: In-memory caches like Memcached or Redis
use hashing to store and retrieve data quickly.

Indexed File Organization
• Databases: Databases often use indexed file organization to speed
up data retrieval. Indexes are used to quickly locate data without
having to search every row in a database table every time the
table is accessed.
• File Systems: Some file systems use indexing to speed up file
retrieval. For example, the NTFS file system used by Windows
creates an index of files to speed up search operations.
How to choose a file organization method

Sequential File Organization:
• When the order of records is important.
• When you need to process large volumes of data in a batch or in a
specific sequence.
• When the system performs many read operations that need to
process the entire file or a large portion of it.

Direct Access (Hashed) File Organization:
• When you need quick access to specific records.
• When the system performs many random read and write
operations.
• When the key value of the data is uniformly distributed.

Indexed File Organization:
• When you need quick access to records but also need to maintain
the order of records.
• When the system performs many random read operations.
• When you need to perform range queries.
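Range queries favor an ordered index because the matching keys sit next to each other, whereas a hash function scatters neighboring keys across the file. A minimal sketch, assuming a sorted list of keys stands in for the index (the keys and the `range_query` helper are hypothetical):

```python
import bisect

keys = [10, 20, 30, 40, 50]  # hypothetical index keys, kept in sorted order

def range_query(sorted_keys, low, high):
    """Return all keys k with low <= k <= high using two binary searches."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return sorted_keys[start:end]
```

Here `range_query(keys, 15, 40)` locates both ends of the range by binary search and returns the contiguous slice between them.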
Thank YOU!
