
LECTURE – VI

BY: ABHISHEK BHARDWAJ


PGT- COMPUTER SCIENCE
Introduction to File Handling

A file is a collection of bytes stored on a storage device such as a hard disk or thumb drive.
• Need of Data Files:

Program → Execute → Output

1. When we write a program and run it, we see its output only temporarily and assume that the program has executed successfully.
2. Once we close IDLE, the output is lost; to see it again we must repeat Step 1 from the execution part (Run).
3. In a banking system, for example, this practice is not acceptable, because every piece of data is important and must be preserved.
From the previous example, it is clear that we must store the output of our program.

Program → Execute → Output → STORE (in a Database or in Data Files)

We may store data pertaining to a specific application in databases or in data files for later use. We will first discuss storing data in data files.
• Data file handling is the way a program stores its data in files and reads it back.
Data Files

• Data files can be stored in two ways: as Text Files or as Binary Files.

Text Files: A text file stores information in the form of a stream of ASCII or Unicode characters (whichever is the default for the programming platform).
In text files, each line of text is terminated (delimited) with a special character known as the EOL (End of Line) character.
Some internal translation takes place when this EOL character is read or written (e.g. when we press Enter, the next line becomes our input area).
In Python, the default EOL character is the newline character ('\n'). On Windows, a line ends on disk with the carriage-return, newline combination ('\r\n'; a carriage return moves the cursor to the beginning of the line), which Python translates to and from '\n' in text mode.
Text files can be of the following types:

Regular text files: These are text files which store the text in the same form as typed. Here the EOL is translated and ends a line. File extension: .txt
Delimited text files: A specific character is stored after every value to separate the values, e.g. a tab (TSV, tab-separated values file) or a comma (CSV, comma-separated values file).

Regular text file content : I am simple text.
TSV file content          : I    am    simple    text.   (values separated by tab characters)
CSV file content          : I, am, simple, text.

NOTE: Some setup files (e.g. initialization .INI files) and rich text format files (.RTF files) are also text files.
Binary Files: A binary file stores information in the form of a stream of bytes.
It holds the information in the same format in which the information is held in memory.
File contents are raw (no translation and no specific encoding is applied).
There is no delimiter for a line (a delimiter is a blank space, comma, or other character or symbol that indicates the beginning or end of a character string, word, or data item).
As no translation occurs in binary files, these files are faster and easier for a program to read and write than text files.
Text files can be opened in any text editor and are in human-readable form, while binary files are not in human-readable form.
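The difference between the two kinds of access can be seen by reading the same file in text mode and in binary mode. The following is a minimal sketch; the file name demo.txt, its contents and the temporary folder are assumptions made only for this illustration.

```python
import os
import tempfile

# Hypothetical file name and content, used only for this illustration.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.txt")

# Write two lines in text mode; newline="\n" keeps the bytes identical on every OS.
with open(path, "w", encoding="utf-8", newline="\n") as f:
    f.write("I am simple text.\nSecond line.\n")

# Text mode returns str: the bytes are decoded into characters.
with open(path, "r", encoding="utf-8") as f:
    as_text = f.read()

# Binary mode returns bytes: nothing is decoded or translated.
with open(path, "rb") as f:
    as_bytes = f.read()

print(type(as_text), repr(as_text))
print(type(as_bytes), repr(as_bytes))
```

The same data comes back as a str object in text mode and as a bytes object in binary mode.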
Difference between Text Files and Binary Files

Ser | Text Files | Binary Files
1. | Stores information in the form of a stream of ASCII or Unicode characters. | Stores information in the form of a stream of bytes.
2. | Each line of text is terminated (delimited) with a special character known as the EOL (End of Line) character. | No delimiter for a line.
3. | Some internal translation takes place when the EOL character is read or written. | As no translation occurs in binary files, these files are faster and easier for a program to read and write than text files.
4. | Can be opened in any text editor and are in human-readable form. | Not in human-readable form.
Working With Data Files

• The most basic file manipulation tasks include adding, modifying or deleting data in a file.
• Any one or a combination of the following operations may be performed:
    • Reading data from files
    • Writing data to files
    • Appending data to files

NOTE: In order to work with a file, first open it in a specific mode.


File Access Modes

Text File Mode | Binary File Mode | Description | Notes
'r' | 'rb' | read only | Default mode. File must already exist, otherwise Python raises FileNotFoundError.
'w' | 'wb' | write only | If the file does not exist, it is created. If the file exists, its existing data is truncated, so this mode must be used with caution.
'a' | 'ab' | append | File is in write-only mode. If the file exists, its data is retained and new data is appended to the end. If the file does not exist, it is created.
'r+' | 'r+b' or 'rb+' | read and write | File must exist, otherwise an error is raised. Both reading and writing operations can take place.
'w+' | 'w+b' or 'wb+' | write and read | If the file does not exist, it is created. If the file exists, it is truncated. Both reading and writing operations can take place.
'a+' | 'a+b' or 'ab+' | write and read | If the file does not exist, it is created. If the file exists, its data is retained and new data is appended. Both reading and writing operations can take place.
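The behaviour of the 'r', 'w' and 'a' modes described above can be checked with a short experiment. This is a sketch only; the file name and the strings written are made up for the demonstration, and a temporary folder is used so that no real file is overwritten.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "student.txt")   # hypothetical file for this demo

# 'r' on a file that does not exist raises FileNotFoundError.
try:
    open(path, "r")
    missing_error = None
except FileNotFoundError as exc:
    missing_error = type(exc).__name__

# 'w' creates the file (or truncates it if it already exists).
with open(path, "w") as f:
    f.write("first\n")

# 'a' keeps the old data and adds the new data at the end.
with open(path, "a") as f:
    f.write("second\n")

with open(path, "r") as f:
    after_append = f.read()

# Opening again with 'w' wipes the old contents.
with open(path, "w") as f:
    f.write("third\n")

with open(path, "r") as f:
    after_rewrite = f.read()

print(missing_error)
print(repr(after_append))
print(repr(after_rewrite))
```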
Opening and Closing Files

• Files are opened with the open() function, using one of the following syntaxes:

<file_objectname> = open(<filename>)
<file_objectname> = open(<filename>, <mode>)

e.g. myfile = open("student.txt")
A file object, also known as a file handle, is a reference to a file on disk. open() opens the file and makes it available for different tasks.
[Python looks for this file in the current working directory (usually the directory in which we store our program or module file).]
• The opened file is attached to its file object, e.g. myfile.
• The default mode of an opened file is read mode (we may also pass the mode "r" explicitly).
NOTE: In read mode, the given file must exist in the folder, otherwise Python raises FileNotFoundError.
<file_objectname> = open(<filename>, <mode>)

myfile = open("student.txt", "r")

myfile1 = open("student.txt", "w")

myfile2 = open("e:\\main\\student.txt", "w")
Path: Python will look in the main folder on the E: drive.

myfile3 = open(r"e:\main\student.txt", "r")

In a normal string, a backslash must be written as \\ to stand for itself. The prefix r in front of a string makes it a raw string, in which the backslash has no special meaning.
f = open("c:\temp\data.txt", "r")    In this example \t will be treated as a tab character, so the path is not the one intended.
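The effect of escape sequences on a path can be seen without opening any file. In this sketch the path c:\temp\notes.txt is a hypothetical example, chosen because both \t and \n are escape sequences.

```python
# Illustrative Windows-style paths; nothing is opened here.
plain = "c:\temp\notes.txt"      # \t becomes a tab and \n becomes a newline
escaped = "c:\\temp\\notes.txt"  # every backslash doubled
raw = r"c:\temp\notes.txt"       # raw string: backslashes kept as typed

print(repr(plain))          # shows the tab and newline hidden in the string
print(escaped == raw)       # both spell the same path
print(len(plain), len(raw)) # the plain string is shorter: two escapes collapsed
```

Either the doubled backslash or the raw string gives the intended path; the plain string silently contains a tab and a newline instead.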
• File objects are used to read and write data to a file on disk. A file object holds a reference to the file on disk and opens it for a number of different tasks.
All the operations we perform on a data file are performed through file objects.
The file mode governs the type of operations (e.g. read/write/append) possible on the opened file, i.e. it specifies how the file will be used once it is opened.
The close() method is used to close a file. In Python, files are automatically closed at the end of the program, but it is good practice to close files explicitly, because if the program exits unexpectedly there is a danger that data may not have been written to the file.
close() breaks the link between the file object and the file on the disk. After close(), no tasks can be performed on that file through the file object (file handle).
<file_object>.close()

e.g. myfile3.close()

NOTE: open() is a built-in function (used stand-alone), while close() is a method called on a file object.
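A small sketch of what close() does to a file object is given below. The file name and the names written are assumptions for illustration; the with statement shown at the end is a standard Python alternative that closes the file automatically.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "student.txt")   # hypothetical file for this demo

myfile = open(path, "w")
myfile.write("Asha\n")
closed_before = myfile.closed    # False: the link to the file is still active
myfile.close()
closed_after = myfile.closed     # True: the link is broken

# Any task on a closed file object raises ValueError.
try:
    myfile.write("Ravi\n")
    error_name = None
except ValueError as exc:
    error_name = type(exc).__name__

# A with block closes the file by itself when the block ends.
with open(path, "r") as f:
    content = f.read()
closed_by_with = f.closed

print(closed_before, closed_after, error_name, closed_by_with)
```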
Working with Text Files

• Reading from Text Files

Ser | Method | Syntax | Description
1. | read() | <fileobject>.read([n]) | Reads at most n bytes (n characters for a file opened in text mode); if n is not specified, reads the entire file. Returns the data read in the form of a string.
2. | readline() | <fileobject>.readline([n]) | Reads a line of input; if n is specified, reads at most n bytes. Returns the data read in the form of a string ending with the newline ('\n') character, or returns an empty string if nothing is left for reading in the file.
3. | readlines() | <fileobject>.readlines() | Reads all lines and returns them in a list.
Reading a file's first 30 bytes and printing it.

[Figure: text file ssc.txt, Code Snippet 1, Code Snippet 2 and output]
One snippet applies if ssc.txt is in the same folder as the program file; the other applies if ssc.txt is stored in some other drive/location.

Reading n bytes and then reading more bytes from the last position read.

[Figure: text file ssc.txt, code snippet and output]

Reading a file's entire content.

[Figure: text file ssc.txt, code snippet and output]
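The three reading tasks above can be sketched as follows. The original contents of ssc.txt are not available, so the sample text here is invented, and the file is created in a temporary folder purely so that the example runs on its own.

```python
import os
import tempfile

# Invented sample content standing in for ssc.txt.
SAMPLE = ("Line one of the sample file.\n"
          "Line two of the sample file.\n"
          "Line three is the last line.\n")

folder = tempfile.mkdtemp()
path = os.path.join(folder, "ssc.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(SAMPLE)

with open(path, "r", encoding="utf-8") as f:
    first_30 = f.read(30)   # first 30 characters
    next_10 = f.read(10)    # continues from the last position read
    rest = f.read()         # no n given: everything that is left

with open(path, "r", encoding="utf-8") as f:
    whole = f.read()        # entire content in one call

print(first_30)
print(next_10)
print(whole)
```

Each read() call starts where the previous one stopped, so the three pieces joined together give back the whole file.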
Reading a file's first three lines, line by line.

[Figure: text file ssc.txt, code snippet and output]

Reading a complete file, line by line.

[Figure: text file ssc.txt, code snippet and output]
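A possible way to do both line-by-line tasks with readline() is sketched below. The four sample lines are assumptions standing in for the lost contents of ssc.txt.

```python
import os
import tempfile

# Invented sample lines standing in for ssc.txt.
SAMPLE_LINES = ["first line\n", "second line\n", "third line\n", "fourth line\n"]

folder = tempfile.mkdtemp()
path = os.path.join(folder, "ssc.txt")
with open(path, "w") as f:
    f.writelines(SAMPLE_LINES)

# First three lines: call readline() three times.
first_three = []
with open(path, "r") as f:
    for _ in range(3):
        first_three.append(f.readline())

# Complete file: keep calling readline() until it returns an empty string.
all_lines = []
with open(path, "r") as f:
    line = f.readline()
    while line != "":
        all_lines.append(line)
        line = f.readline()

for line in all_lines:
    print(line, end="")   # end="" because each line already ends with '\n'
```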
Displaying the size of a file after removing EOL (\n) characters, leading and trailing white spaces and blank lines.

[Figure: text file ssc.txt, code snippet and output]
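One way to approach this task is to strip every line and add up the lengths of the lines that are not empty. The sample text below is made up for the illustration and is not the original ssc.txt.

```python
import os
import tempfile

# Invented content with extra spaces and blank lines.
SAMPLE = "  Hello world  \n\nPython\n   \nfile handling\n"

folder = tempfile.mkdtemp()
path = os.path.join(folder, "ssc.txt")
with open(path, "w") as f:
    f.write(SAMPLE)

with open(path, "r") as f:
    total_size = len(f.read())       # size including spaces and '\n'

stripped_size = 0
with open(path, "r") as f:
    for line in f:
        text = line.strip()          # removes '\n' and surrounding white space
        if text:                     # skips lines that are now empty
            stripped_size += len(text)

print("Size with EOL and spaces:", total_size)
print("Size after stripping:", stripped_size)
```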
Reading a complete file in a List.

[Figure: text file ssc.txt, code snippet and output]
read() and readline() read bytes and return them as a string.
readlines() reads lines and returns them in a list.
Write a program to display the size of a file in bytes.

[Figure: text file ssc.txt, code snippet and output]

Write a program to display the number of lines in the file.

[Figure: text file ssc.txt, code snippet and output]
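Both programs can be sketched in a few lines. The sample content is invented; the size is measured by opening the file in 'rb' mode, which is an assumption of this sketch (it counts bytes exactly, whereas text mode counts characters), and the lines are counted with readlines().

```python
import os
import tempfile

SAMPLE = "alpha\nbeta\ngamma\n"   # invented content standing in for ssc.txt

folder = tempfile.mkdtemp()
path = os.path.join(folder, "ssc.txt")
# newline="\n" keeps one byte per line ending on every operating system.
with open(path, "w", encoding="utf-8", newline="\n") as f:
    f.write(SAMPLE)

# Size in bytes: read the raw bytes and count them.
with open(path, "rb") as f:
    size_in_bytes = len(f.read())

# Number of lines: readlines() returns one list item per line.
with open(path, "r", encoding="utf-8") as f:
    line_count = len(f.readlines())

print("Size of file:", size_in_bytes, "bytes")
print("Number of lines:", line_count)
```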
Working with Text Files

• Writing onto Text Files

Ser | Method | Syntax | Description
1. | write() | <fileobject>.write(str) | Writes the string str to the file referred to by <fileobject>.
2. | writelines() | <fileobject>.writelines(L) | Writes all strings in list L to the file referred to by <fileobject>. No newline is added between the strings, so each string must carry its own '\n' to become a separate line.
Create a file to hold data of 5 student names.

[Figure: code snippet and output]

Create a file to hold data of 5 names separated as lines.

[Figure: code snippet and output]

Creating a file with some names separated by newline characters without using write() function.

[Figure: code snippet and output]
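The three writing tasks can be sketched as below. The five names and the file names are hypothetical, and a fixed list is used in place of input() so that the example runs without prompting.

```python
import os
import tempfile

names = ["Amit", "Bina", "Chetan", "Deepa", "Esha"]   # hypothetical names

folder = tempfile.mkdtemp()
p1 = os.path.join(folder, "names1.txt")
p2 = os.path.join(folder, "names2.txt")
p3 = os.path.join(folder, "names3.txt")

# 1. write() without '\n': the names run together on one line.
with open(p1, "w") as f:
    for name in names:
        f.write(name)

# 2. write() with '\n': one name per line.
with open(p2, "w") as f:
    for name in names:
        f.write(name + "\n")

# 3. writelines(): no write() call; '\n' is added to each string first.
with open(p3, "w") as f:
    f.writelines([name + "\n" for name in names])

with open(p1) as f:
    run_together = f.read()
with open(p2) as f:
    by_write = f.read()
with open(p3) as f:
    by_writelines = f.read()

print(run_together)
print(by_writelines, end="")
```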
The flush() Function

When we write onto a file using any of the write functions, Python holds the data to be written in a buffer and pushes it onto the actual file on the storage device at a later time.
The flush() function can be used to force Python to write the contents of the buffer onto storage.
• Python automatically flushes the file buffers when closing files, i.e. flush() is implicitly called by the close() function.
• Calling flush() explicitly writes the buffered data out before the file is closed.
• The syntax to use the flush() function is:

<fileobject>.flush()
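The effect of flush() can be observed by checking the size of the file on disk before and after the call. This is only a sketch: the file name is made up, and the size seen before flushing depends on the buffering in use (for a small write it is normally 0, but that is not guaranteed).

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "buffer_demo.txt")   # hypothetical file name

f = open(path, "w", newline="\n")
f.write("hello\n")                               # goes into the buffer first

size_before_flush = os.path.getsize(path)        # usually 0 at this point
f.flush()                                        # force the buffer onto storage
size_after_flush = os.path.getsize(path)         # the 6 characters are now on disk

f.close()                                        # close() would also have flushed

print("Before flush:", size_before_flush)
print("After flush:", size_after_flush)
```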
Write a program to get roll numbers, names and marks of the students of a class (prompt user) and store these details in a file called "Marks.txt".

[Figure: code snippet, inputs given by the user at run time, and output]
In this program the values are separated by commas. This is called CSV (Comma Separated Values) format.

Write a program to add two more students' details to the file "Marks.txt" created in the last program.

[Figure: code snippet, inputs given by the user at run time, and output]
Open the file in append ("a") mode, as the old data must be retained.

Write a program to display the contents of the file "Marks.txt" created in the last two programs.

[Figure: code snippet, input file and output]
Here the extra blank line between the output lines appears because each line read from the file already ends with '\n' and print() adds another one.
If you don't want this, use end="" in the print() function.
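The three Marks.txt programs can be sketched together as follows. To keep the example runnable without typing, fixed lists of hypothetical records are used where the original programs prompt the user with input(); the helper function save() is also an invention of this sketch.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "Marks.txt")

# Hypothetical records standing in for values typed by the user.
first_batch = [(1, "Amit", 78), (2, "Bina", 91), (3, "Chetan", 64)]
second_batch = [(4, "Deepa", 88), (5, "Esha", 95)]

def save(records, mode):
    """Write each record as one comma-separated line."""
    with open(path, mode) as f:
        for roll, name, marks in records:
            f.write(f"{roll},{name},{marks}\n")

save(first_batch, "w")    # program 1: create the file
save(second_batch, "a")   # program 2: append, so old data is retained

# Program 3: display the contents without extra blank lines.
shown = []
with open(path, "r") as f:
    for line in f:
        print(line, end="")
        shown.append(line)
```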
Read, Write and Search CSV (Comma Separated Value) Files

CSV files are delimited files that store tabular data (data stored in rows and columns).
The separator character of a CSV file is called a delimiter. The default and most popular delimiter is the comma. Others are the tab (\t), colon (:), pipe (|) and semi-colon (;) characters.
Since CSV files are text files, we may apply text file procedures to them and then split the values using the split() function, but the csv module in Python handles CSV files directly.
The csv module of Python provides functionality to read and write tabular data in CSV format.
It provides two specific types of objects, the reader and writer objects, to read from and write into CSV files.
Why are CSV files popular?

• Easier to create.
• Preferred export and import format for databases and spreadsheets.
• Capable of storing large amounts of data.

Opening and Closing CSV Files:

obj = open("student.csv", "w")     CSV file opened in write mode with the file handle obj
fobj = open("student.csv", "r")    CSV file opened in read mode with the file handle fobj
obj.close()                        A CSV file is closed in the same manner as any other file.
Writing in CSV files.

ROLE OF THE CSV WRITER OBJECT
Input user data (in memory) → csv.writer object → delimited data → CSV file on storage disk
The csv.writer object converts the user data into CSV-writable form, i.e. a delimited string as per the csv settings. Its writerow() method is used to write data through the writer object.

FUNCTIONS
csv.writer() returns a writer object which writes data into a CSV file.
<writerobject>.writerow() writes one row of data onto the writer object.
<writerobject>.writerows() writes multiple rows of data onto the writer object.


Reading in CSV files.

ROLE OF THE CSV READER OBJECT
CSV file on storage disk → csv.reader object → iterable (in memory) → loop fetches one row of data at a time
The csv.reader object parses the delimited CSV file data and loads it into an iterable. A loop then fetches one row at a time from the reader object.

FUNCTION
csv.reader() returns a reader object which loads data from a CSV file into an iterable after parsing the delimited data.
Python Program to Write, Read and Search into CSV file

[Figure: program code with numbered callouts 1 to 8]
1. If you use with open(), there is no need to call close() at the end of the program.
2. fobj is the file handle/object/pointer for the opened file.
3, 5. writerow() takes only one argument, so pass a list to write multiple values.
4. True means the loop will execute until we terminate it.
6. break terminates the loop.
7. next() skips the first line, so searching starts from line no. 2. The first line is skipped because it holds the headings 'Roll_No', 'Name', 'Total_Marks', so there is no number to compare in the condition i[2]>=90.
8. i[2] refers to index no. 2, i.e. value no. 3 in the list.
Note: Additional parameter newline=''

NOTE: The csv.writer writes its own line terminator ('\r\n' by default) into the file directly, so open the file with the additional parameter newline='' (empty string).

If we do not use this extra parameter, then on platforms that translate '\n' (such as Windows) the CSV file will show every alternate line blank (data in the first row, second row blank, data in the third row, fourth row blank and so on).
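The write, read and search steps described in the callouts can be sketched as below. The records are hypothetical and are taken from a fixed list instead of a while True loop with input(); int() is applied to i[2] because csv.reader returns every value as a string.

```python
import csv
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "student.csv")

rows = [[1, "Amit", 78], [2, "Bina", 91], [3, "Chetan", 95]]   # hypothetical data

# newline='' lets the csv module control the line endings itself.
with open(path, "w", newline="") as fobj:
    writer = csv.writer(fobj)
    writer.writerow(["Roll_No", "Name", "Total_Marks"])   # heading row
    writer.writerows(rows)                                 # all records at once

toppers = []
with open(path, "r", newline="") as fobj:
    reader = csv.reader(fobj)
    next(reader)                     # skip the heading row
    for i in reader:
        if int(i[2]) >= 90:          # values come back as strings
            toppers.append(i)

print(toppers)
```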
Python Program to Write records of students into a CSV file and Search it by the roll no. given by the user.
Python Program to Write records of students into a CSV file and Search it to Print the record of the student having 'MAX' marks.
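Both searches can be sketched with the csv module as follows. The records, the file location and the helper functions read_records() and search_by_roll() are assumptions of this sketch; the roll number is passed as an argument where the original program would take it from the user.

```python
import csv
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "student.csv")

with open(path, "w", newline="") as fobj:
    writer = csv.writer(fobj)
    writer.writerow(["Roll_No", "Name", "Total_Marks"])
    writer.writerows([[1, "Amit", 78], [2, "Bina", 91], [3, "Chetan", 95]])

def read_records():
    """Return all data rows (heading skipped) as lists of strings."""
    with open(path, "r", newline="") as fobj:
        reader = csv.reader(fobj)
        next(reader)
        return list(reader)

def search_by_roll(roll_no):
    """Return the row with the given roll number, or None if absent."""
    for row in read_records():
        if int(row[0]) == roll_no:
            return row
    return None

found = search_by_roll(2)
not_found = search_by_roll(7)

# Record with the maximum marks: compare rows by their marks column.
topper = max(read_records(), key=lambda row: int(row[2]))

print(found, not_found, topper)
```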
Binary Files in Python

• Store information in the form of a stream of bytes.
• No delimiter for a line.
• As no translation occurs in binary files, these files are faster and easier for a program to read and write than text files.
• Binary files are not in human-readable form.
As data in binary files is stored as a stream of bytes, non-simple objects like dictionaries, tuples and lists must be stored in such a way that their structure/hierarchy is maintained.
For this purpose, objects are often serialized and then stored in binary files.

Structure (List/Dictionary) → Pickling/Serialisation → Byte Stream
Byte Stream → Unpickling/De-Serialisation → Structure (List/Dictionary)

"The pickle module implements a fundamental, but powerful algorithm for serializing and de-serializing a Python object structure."
Pickling and Unpickling

Pickling/Serialisation: the process of converting a Python object hierarchy into a byte stream so that it can be written into a file.

Unpickling/De-Serialisation: the inverse of pickling, where a byte stream is converted back into an object hierarchy. Unpickling produces an exact replica of the original object.
• In order to work with the pickle module, import it:
import pickle
pickle.dump(Structure, file_object)          dump(): to write to a binary file.
Structure_var = pickle.load(file_object)     load(): to read from a binary file.


Working with pickle module

Process of working with binary files:

(i) Import the pickle module.
(ii) Open the binary file in the required file mode (read or write mode).
(iii) Process the binary file by writing/reading objects using the pickle module's methods.
(iv) Once done, close the file.


Writing into Binary File (Structure : List)

[Figure: program code and output; the file is saved in the directory where the Python program was saved]

Reading in Binary File (Structure : List)

[Figure: program code and output]

Reading and Writing into Binary File (Structure : Dictionary)

[Figure: program code and output]
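Writing a list and a dictionary to binary files and reading them back might look like the sketch below. The file names list.dat and dict.dat and the sample data are assumptions made for illustration.

```python
import os
import pickle
import tempfile

folder = tempfile.mkdtemp()
list_path = os.path.join(folder, "list.dat")   # hypothetical file names
dict_path = os.path.join(folder, "dict.dat")

marks = [78, 91, 64]
student = {"roll": 1, "name": "Amit", "marks": 78}

# Writing: open in 'wb' mode and dump() the whole structure.
with open(list_path, "wb") as f:
    pickle.dump(marks, f)
with open(dict_path, "wb") as f:
    pickle.dump(student, f)

# Reading: open in 'rb' mode and load() the structure back.
with open(list_path, "rb") as f:
    marks_copy = pickle.load(f)
with open(dict_path, "rb") as f:
    student_copy = pickle.load(f)

print(marks_copy)
print(student_copy)
```

The loaded objects are equal to the originals but are new objects, which is what "exact replica" means here.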
Reading, Writing (Multiple Records) & Searching into Binary File (Structure : Nested List)

[Figure: program code, continued over three slides, and output]
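A sketch of writing several records, reading them all back and searching them is given below. It assumes one pickle.dump() call per record and reads until EOFError is raised; the original program may instead have dumped the whole nested list at once. The records, the file name and the search() helper are hypothetical.

```python
import os
import pickle
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "students.dat")    # hypothetical file name

records = [[1, "Amit", 78], [2, "Bina", 91], [3, "Chetan", 64]]

# Write each record as a separate pickled object.
with open(path, "wb") as f:
    for rec in records:
        pickle.dump(rec, f)

# Read records one by one; load() raises EOFError when none are left.
loaded = []
with open(path, "rb") as f:
    while True:
        try:
            loaded.append(pickle.load(f))
        except EOFError:
            break

def search(roll_no):
    """Return the record with the given roll number, or None."""
    for rec in loaded:
        if rec[0] == roll_no:
            return rec
    return None

found = search(2)
missing = search(9)
print(loaded)
print(found, missing)
```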
Setting Offsets in a File

The functions (read/write) which we have used till now access the data in a file sequentially. If we want to access data in a random fashion, Python gives us the seek() and tell() functions to do so.

tell(): This function returns an integer that specifies the current position of the file object in the file. The position so specified is the number of bytes from the beginning of the file up to the current position of the file object.

The syntax of using tell() is: file_object.tell()


seek(): This method is used to position the file object at a particular position in a file.
Syntax: file_object.seek(offset [, reference_point])
        file_object.seek(offset, from_what)
• offset is the number of bytes by which the file object is to be moved.
• reference_point (from_what) indicates the position from which the offset is counted:
    0 - Beginning of the file
    1 - Current position in the file
    2 - End of the file
By default, the value of reference_point is 0, i.e. the offset is counted from the beginning of the file.
Program to know the Position of your File Pointer

[Figure: five slides, each showing text file ssc.txt, a code snippet and its output]
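How the value returned by tell() changes as a file is read can be sketched as below. The sample content is invented, and the file is opened in 'rb' mode in this sketch so that the positions are plain byte counts.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "ssc.txt")

# Invented 12-byte content; newline="\n" keeps one byte per line ending.
with open(path, "w", newline="\n") as f:
    f.write("Hello\nWorld\n")

positions = []
with open(path, "rb") as f:
    positions.append(f.tell())   # at the start of the file
    f.read(5)                    # reads b"Hello"
    positions.append(f.tell())
    f.readline()                 # reads the rest of line 1 (just b"\n")
    positions.append(f.tell())
    f.read()                     # reads everything that is left
    positions.append(f.tell())   # now at the end of the file

print(positions)
```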
Program to Position your File Pointer at Specified Location in File

[Figure: text file ssc.txt, code snippet with callouts 1 and 2, and output]
1. By default, read mode keeps the file pointer at the start of the file.
2. The default value of from_what/reference_point is also 0, so the offset is counted from the start of the file.
Difference between read() and seek()

• A read() call reads the specified number of bytes from a file.
• A read() call also advances the file position by the number of bytes it has read.
• Seeking is the equivalent of just dragging the scroll bar to whatever position you want.
• It doesn't read anything in between the jump.


Program to Position your File Pointer at Specified Location in File

[Figure: seven slides, each showing text file ssc.txt, a code snippet and its output]

While opening the text file, mode 'rb' has been used in place of 'r', because in Python 3 a file opened in text mode raises io.UnsupportedOperation if a non-zero offset is used with any reference_point other than the default 0.
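The three reference points of seek(), and the error raised in text mode, can be sketched as follows. The 12-byte sample content is an assumption standing in for ssc.txt.

```python
import io
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "ssc.txt")

# Invented 12-byte content; newline="\n" keeps one byte per line ending.
with open(path, "w", newline="\n") as f:
    f.write("Hello\nWorld\n")

with open(path, "rb") as f:
    f.seek(6)                    # 6 bytes from the beginning (reference point 0)
    from_start = f.read(5)

    f.seek(-6, 2)                # 6 bytes back from the end (reference point 2)
    from_end = f.read(5)

    f.seek(0)
    f.read(2)                    # position is now 2
    f.seek(3, 1)                 # 3 bytes ahead of the current position
    pos_after_relative = f.tell()

# In text mode, a non-zero offset from the end is not allowed.
with open(path, "r") as f:
    try:
        f.seek(-6, 2)
        text_mode_error = None
    except io.UnsupportedOperation as exc:
        text_mode_error = type(exc).__name__

print(from_start, from_end, pos_after_relative, text_mode_error)
```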
