CENG 351 1
CENG 351-Section 2
• Instructor: Nihan Kesim Çiçekli
• Office: A308
• Email: nihan@ceng.metu.edu.tr
• Lecture Hours:
Tue. 9:40; Thu. 13:40,14:40 (BMB3)
• Course Web page: http://cow.ceng.metu.edu.tr
• Teaching Assistants:
Ömer Nebil Yaveroğlu
Nilgün Dağ
References
1. Betty Salzberg, File Structures: An Analytic
Approach, Prentice Hall, 1988.
2. Raghu Ramakrishnan, Database Management
Systems (3rd ed.), McGraw-Hill, 2003.
3. Michael J. Folk, Bill Zoellick and Greg
Riccardi, File Structures: An Object-Oriented
Approach with C++, Addison-Wesley, 1998.
4. R. Elmasri, S.B. Navathe, Fundamentals of
Database Systems (4th ed.), Addison-Wesley,
2004.
Course Outline
1. Introduction: Secondary storage devices
2. Fundamental File Structure Concepts:
Sequential Files
3. External Sorting
4. Indexed Sequential Files (B-trees)
5. Direct access (Hashing)
6. Introduction to Database Systems:
E/R modeling, relational model
7. Query languages: Relational algebra, relational
calculus, SQL
8. Query Evaluation
Grading
3 written HW, 3 programming assignments 30%
Midterm Exam 1 20%
Midterm Exam 2 20%
Final Exam 30%
Grading Policies
• Policy on missed midterm:
– no make-up exam
• Lateness policy:
– Late assignments are penalized up to 10% per day.
• All assignments and programs are to be your own
work. No group projects or assignments are
allowed.
Introduction to File Management
Motivation
Most computers are used for data processing
(over $100 billion/year), a big growth area in
the “information age”.
This course covers data processing from a
computer science perspective:
– Storage of data
– Organization of data
– Access to data
– Processing of data
Data Structures vs File Structures
• Both involve:
– Representation of Data
+
– Operations for accessing data
• Difference:
– Data structures: deal with data in main memory
– File structures: deal with data in secondary
storage
Where do File Structures fit in
Computer Science?
Application
DBMS
File system
Operating System
Hardware
Computer Architecture
• Main memory (RAM): semiconductors; fast,
expensive, volatile, small. Data is manipulated here.
• Data is transferred between main memory and
secondary storage.
Advantages
• Main memory is fast
• Secondary storage is big (because it is cheap)
• Secondary storage is stable (non-volatile), i.e.
data is not lost during power failures
Disadvantages
• Main memory is small. Many databases are too
large to fit in main memory (MM).
• Main memory is volatile, i.e. data is lost during
power failures.
• Secondary storage is slow (10,000 times slower
than MM)
How fast is main memory?
• Typical time for getting info from:
Main memory: ~120 nanosec = 120 x 10^-9 sec
Magnetic disks: ~30 millisec = 30 x 10^-3 sec
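To get a feel for the gap, the two access times can simply be divided. The sketch below is illustrative arithmetic only: the function name slowdown_factor is hypothetical, and the inputs are assumed to be the approximate figures 120 x 10^-9 sec and 30 x 10^-3 sec.

```c
#include <assert.h>

/* Ratio of two access times, both given in seconds.
   Hypothetical helper, for illustration only. */
double slowdown_factor(double slow_seconds, double fast_seconds)
{
    assert(fast_seconds > 0.0);   /* avoid dividing by zero */
    return slow_seconds / fast_seconds;
}
```

Calling slowdown_factor(30e-3, 120e-9) gives roughly 250,000 for these particular figures; the exact factor depends on the hardware.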
Goal of the file structures
• Minimize the number of trips to the disk in
order to get desired information
• Group related information so that we are
likely to get everything we need with only
one trip to the disk.
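One way to see why grouping helps: if records are packed into blocks and each trip to the disk fetches one whole block, the number of trips needed to read n records shrinks by the number of records per block. The sketch below is a back-of-the-envelope count under that assumption; disk_trips and its parameter names are hypothetical, not taken from the course material.

```c
#include <assert.h>

/* Number of block reads needed to fetch n_records when each trip
   to the disk brings in records_per_block records (assumes the
   records are stored contiguously and one block is read per trip). */
unsigned long disk_trips(unsigned long n_records,
                         unsigned long records_per_block)
{
    assert(records_per_block > 0);
    /* Ceiling division: a partially filled last block still costs a trip. */
    return (n_records + records_per_block - 1) / records_per_block;
}
```

For example, 1000 records read one at a time cost 1000 trips, while the same records grouped 50 to a block cost 20.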
Physical Files and Logical Files
• physical file: a collection of bytes stored on a disk or
tape
• logical file: a "channel" (like a telephone line) that
connects the program to a physical file
• The program (application) sends (or receives) bytes
to (from) a file through the logical file. The program
knows nothing about where the bytes go (or come from).
• The operating system is responsible for associating a
logical file in a program with a physical file on disk or
tape. Writing to or reading from a file in a program is
done through the operating system.
Files
• The physical file has a name, for instance
myfile.txt
• The logical file has a logical name (a
variable) inside the program.
– In C :
FILE * outfile;
– In C++:
fstream outfile;
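As a minimal sketch of how the two names meet in C, the function below connects a logical file (the variable outfile) to a physical file and writes one line through it. The helper name write_greeting and the text written are hypothetical; fopen, fputs and fclose are the standard C library calls.

```c
#include <assert.h>
#include <stdio.h>

/* Connects a logical file to the physical file named by path,
   writes one line through it, and closes it.
   Returns 0 on success, -1 on failure. Hypothetical helper. */
int write_greeting(const char *path)
{
    FILE *outfile;                 /* logical file: a program variable */

    assert(path != NULL);
    outfile = fopen(path, "w");    /* ask the OS to attach the physical file */
    if (outfile == NULL)
        return -1;

    if (fputs("hello\n", outfile) == EOF) {
        fclose(outfile);
        return -1;
    }
    return fclose(outfile) == 0 ? 0 : -1;
}
```

A call such as write_greeting("myfile.txt") is the only place where the physical name appears; everything else in the program refers to outfile.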
Basic File Processing Operations
• Opening
• Closing
• Reading
• Writing
• Seeking
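The five operations above can be sketched with the standard C library as follows. This is one possible illustration, not the course's own code: the helper name demo_basic_operations and the bytes written are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Opens path, writes "ABCDEF", seeks back to byte offset 3,
   reads one byte into *out, and closes the file.
   Returns 0 on success, -1 on any failure. Hypothetical helper. */
int demo_basic_operations(const char *path, char *out)
{
    FILE *fp;
    int c;

    assert(path != NULL && out != NULL);

    fp = fopen(path, "w+");                  /* opening (create, read/write) */
    if (fp == NULL)
        return -1;

    if (fwrite("ABCDEF", 1, 6, fp) != 6      /* writing six bytes */
        || fseek(fp, 3L, SEEK_SET) != 0      /* seeking to offset 3 */
        || (c = fgetc(fp)) == EOF) {         /* reading one byte */
        fclose(fp);
        return -1;
    }
    *out = (char)c;

    return fclose(fp) == 0 ? 0 : -1;         /* closing */
}
```

The seek between the write and the read is required by C for streams opened in update mode, and it also shows that a program can move to any byte offset without reading the bytes before it.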
File Systems
Example
• A student file may be a collection of student
records, one record for each student
• Each student record may have several fields, such
as
– Name
– Address
– Student number
– Gender
– Age
– GPA
• Typically, each record in a file has the same fields.
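A record with these fields might be declared in C as below, together with two helpers that store and fetch a record by its position in the file. This is a sketch under stated assumptions: the field sizes and types, and the names student, put_student and get_student, are hypothetical. Writing the struct bytes directly is the simplest fixed-length layout, but it is not portable across machines or compilers.

```c
#include <assert.h>
#include <stdio.h>

/* One fixed-length student record. Field sizes are assumptions
   chosen for illustration; the slide only names the fields. */
struct student {
    char   name[32];
    char   address[64];
    long   student_number;
    char   gender;
    int    age;
    double gpa;
};

/* Writes rec as record number index (0-based) of an open binary file. */
int put_student(FILE *fp, long index, const struct student *rec)
{
    assert(fp != NULL && rec != NULL && index >= 0);
    /* Every record has the same size, so its offset is index * size. */
    if (fseek(fp, index * (long)sizeof *rec, SEEK_SET) != 0)
        return -1;
    return fwrite(rec, sizeof *rec, 1, fp) == 1 ? 0 : -1;
}

/* Reads record number index (0-based) into rec. */
int get_student(FILE *fp, long index, struct student *rec)
{
    assert(fp != NULL && rec != NULL && index >= 0);
    if (fseek(fp, index * (long)sizeof *rec, SEEK_SET) != 0)
        return -1;
    return fread(rec, sizeof *rec, 1, fp) == 1 ? 0 : -1;
}
```

Because each record has the same fields and the same length, the position of any record can be computed from its number, so a single seek reaches it without scanning the records before it.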
Properties of Files