
5 THE ART OF FILE PROCESSING
INTRODUCTION
• A file may generally be defined as an organized collection of well-
ordered, well-related, and self-contained information held on a stable
storage medium.
• The information in a file is written in a specific way and read back in the
same way, and it must be kept together as a unit, in the same sequence.
• The stable storage medium may be a piece of paper, a magnetic or optical
disk, or a magnetic tape or any other medium.
• Information with these characteristics does not constitute a file while it
is held only in the main memory of a computer, because main memory retains
it only as long as power is supplied.
FILE CLASSIFICATION
• Files can be classified into two basic types:
• A program file is a file that contains a sequential set of instructions in a
computer language that can direct a computer in the performance of some
specific task.
• A data file is a collection of records about closely-related or similar entities.
• Both types of file should possess all the features stated in the generalized definition.
• A record is an ordered collection of the attribute values of an entity.
• An attribute is any characteristic or feature of an entity that tells something
about the entity, where an entity is anything with a physical or conceptual
existence.
• A fact is anything that is true about an entity. To collect facts about an entity,
we first decide on some attributes of the entity and procure facts on those
attributes.
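The terms entity, attribute, and record can be illustrated with a small Python sketch. The `Student` class, its field names, and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, astuple


@dataclass
class Student:
    # Each field is one attribute of the entity "student".
    student_id: int
    name: str
    course: str


# A record is the ordered collection of attribute values for one entity.
# The values below are made-up sample facts.
record = Student(student_id=101, name="Asha", course="Python")
values = astuple(record)
print(values)
```

Here the class plays the role of the chosen attributes, and each instance holds the facts collected on those attributes for one entity.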
FILE ORGANIZATION
• In the context of file organization, the term file typically refers to a data file.
• File processing is the set of activities performed on the records
of a file to generate some desired information.
• Basically, file organization can be classified into three
categories:
• Sequential File Organization
• Indexed File Organization
• Hashed/Relative/Random File Organization
SEQUENTIAL FILE ORGANIZATION

• Sequential file organization is one in which records are kept in a file,
one after another, and processed in the same sequence in which they are
written. The term sequential means one after another, and hence the name
bears the nature of the organization of the file.
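A minimal sketch of sequential organization in Python, assuming one comma-separated record per line of a text file. The function names, file name, and sample records are hypothetical.

```python
import os
import tempfile


def write_sequential(path, records):
    """Store records one after another, one record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(",".join(rec) + "\n")


def read_sequential(path):
    """Read records back in the same order in which they were written."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n").split(",") for line in f]


# Made-up sample records written to a temporary file.
records = [["101", "Asha"], ["102", "Ben"], ["103", "Chen"]]
path = os.path.join(tempfile.mkdtemp(), "students.txt")
write_sequential(path, records)
read_back = read_sequential(path)
print(read_back)
```

Reaching the third record requires reading past the first two, which is the defining cost of this organization.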
INDEXED FILE ORGANIZATION
• Indexed file organization is one in which sequentially organized records are
associated with an index for the purpose of direct access to the records.
• An index is a special kind of file that contains records consisting of two
attribute values, one that is a unique identifying attribute of the records in the
sequential file and the other that contains the address of the records in the
main file.
• The identifying attribute is also known as the key attribute or key field.
• The records in the index are kept in the ascending order of the key field
values.
• To access a record in an indexed file, the user runs a binary search on the
index for the desired key field value. The index record found by the search
supplies the address of the desired record in the main file.
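The lookup described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: the index is held in memory as two parallel lists rather than in a separate file, the address is a byte offset into the main file, and all names and sample records are hypothetical.

```python
import bisect
import os
import tempfile


def build_index(path, records):
    """Write records to the main file; return keys and byte offsets sorted by key."""
    entries = []
    with open(path, "wb") as f:
        for key, name in records:
            entries.append((key, f.tell()))      # address of this record
            f.write(f"{key},{name}\n".encode("utf-8"))
    entries.sort()                               # index kept in ascending key order
    keys = [k for k, _ in entries]
    offsets = [o for _, o in entries]
    return keys, offsets


def lookup(path, keys, offsets, key):
    """Binary-search the index, then read the record directly from the main file."""
    pos = bisect.bisect_left(keys, key)
    if pos == len(keys) or keys[pos] != key:
        return None                              # key not present in the index
    with open(path, "rb") as f:
        f.seek(offsets[pos])
        return f.readline().decode("utf-8").rstrip("\n")


# Made-up sample records written to a temporary main file.
path = os.path.join(tempfile.mkdtemp(), "main.dat")
keys, offsets = build_index(path, [(103, "Chen"), (101, "Asha"), (102, "Ben")])
print(lookup(path, keys, offsets, 102))
```

Only the index is searched; the main file is touched once, at the offset the index supplies.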
HASHED FILE ORGANIZATION
• A hashed or relative file organization is also a direct access file
organization.
• In such a file organization, the key field or identifying attribute value is
hashed or converted to some location address in the file space relative to
the beginning of the file-record positions on the basis of some predefined
function.
• The predefined function is called a hash routine and the method is called
hashing.
• Because the hashing is done dynamically during the creation of the file, no
extra file space is needed for this purpose. The records can later be
located directly by applying the same hash function.
• The main difficulty with this type of organization is the proper selection of
the hash function and its implementation through programming.
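A minimal sketch of hashed organization in Python, assuming fixed-length records, positive integer keys, and a simple remainder function as the hash routine. The slot count, record layout, and names are hypothetical, and the sketch deliberately ignores the case where two keys hash to the same position.

```python
import os
import struct
import tempfile

NUM_SLOTS = 11                       # assumed number of record positions
RECORD = struct.Struct("<i20s")      # int32 key followed by a 20-byte name


def hash_routine(key):
    """Convert a key to a record position relative to the start of the file."""
    return key % NUM_SLOTS


def create_file(path):
    """Reserve the file space: every slot starts zero-filled (key 0 = empty)."""
    with open(path, "wb") as f:
        f.write(b"\x00" * RECORD.size * NUM_SLOTS)


def store(path, key, name):
    """Write the record at the position given by the hash routine."""
    with open(path, "r+b") as f:
        f.seek(hash_routine(key) * RECORD.size)
        f.write(RECORD.pack(key, name.encode("utf-8")))


def fetch(path, key):
    """Locate the record directly by applying the same hash routine."""
    with open(path, "rb") as f:
        f.seek(hash_routine(key) * RECORD.size)
        stored_key, raw = RECORD.unpack(f.read(RECORD.size))
    if stored_key != key:
        return None                  # slot is empty or holds another key
    return raw.rstrip(b"\x00").decode("utf-8")


# Made-up sample records in a temporary file.
path = os.path.join(tempfile.mkdtemp(), "hashed.dat")
create_file(path)
store(path, 101, "Asha")
store(path, 205, "Ben")
print(fetch(path, 101), fetch(path, 205), fetch(path, 300))
```

No index is stored anywhere: the position of a record is recomputed from its key each time, which is why the choice of hash function matters so much.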
EXERCISE
• Problem 5.1. Construct a flowchart to show how the records of the students in a computer training
institution are kept in a file. Each record consists of:
• STUDENT-ID (for unique identification of the students)
• STUDENT-NAME
• COURSE-NAME
• COURSE-FEE
• FEES-PAID
• DATE-OF-ADMISSION
• Task Analysis. The logic of this problem is straightforward. Data for the attributes of one student
at a time is accepted from the terminal, in the order in which the attributes are specified, to form
a student record. The record is then written to the file space designated by the statement that opens
the file. This repeats until the user signals that there are no more records to be written. The file
space is then delinked by a statement that closes the file. The user can signal the end of input by
entering an invalid value for the first attribute of the record, say 0 for STUDENT-ID.
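The problem asks for a flowchart, but the same logic can be sketched in Python as a cross-check. This is one possible reading of the task analysis, not the book's solution: the function name, the comma-separated storage format, and the scripted sample answers are assumptions made for illustration.

```python
import os
import tempfile

FIELDS = ["STUDENT-ID", "STUDENT-NAME", "COURSE-NAME",
          "COURSE-FEE", "FEES-PAID", "DATE-OF-ADMISSION"]


def write_student_records(path, read_value=input):
    """Accept records field by field until STUDENT-ID is 0; return the count."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:    # opening designates the file space
        while True:
            student_id = read_value(FIELDS[0] + ": ").strip()
            if student_id == "0":                   # sentinel: no more records
                break
            rest = [read_value(name + ": ").strip() for name in FIELDS[1:]]
            f.write(",".join([student_id] + rest) + "\n")
            count += 1
    return count                                    # leaving 'with' closes the file


# Scripted, made-up answers stand in for terminal input in this demonstration.
answers = iter(["1", "Asha", "Python", "500", "200", "2024-01-15", "0"])
path = os.path.join(tempfile.mkdtemp(), "students.txt")
count = write_student_records(path, lambda prompt: next(answers))
with open(path, encoding="utf-8") as f:
    contents = f.read()
print(count, contents)
```

Calling `write_student_records(path)` without the second argument would read from the terminal through the built-in `input`, which matches the interactive flow in the task analysis.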