Operating System (OS) and Data Storage is an exciting software area because the design of an operating system exerts a major influence on the overall function and performance of the entire computer. This field is changing at a breathtakingly rapid rate, as computers are now prevalent in virtually every application, from games for children to the most sophisticated planning tools for governments and multinational firms. The objective of this book is to provide a complete discussion of OS principles, algorithms, implementation with code, implementation issues, and lab exercises, to give you an understanding of contemporary OS practice and data storage. My goal is to explain general OS principles and data storage techniques so that you will have a deep understanding of how an OS is designed. The study of operating systems has traditionally been one of the most challenging and exciting software disciplines in computer science. I hope this book makes the complex aspects of operating systems and data storage easy to understand and avoids making the simple aspects boring. I hope you will enjoy it as much as I do!
Omprakash Sangwan
OPERATING SYSTEM AND DATA STORAGE
Course Contents:
Module I: Introduction
What is an Operating System, Types of Operating Systems, Simple Batch Systems, Multiprogramming Systems, Time-Sharing Systems, Parallel Systems, Distributed Systems, Real-time Systems
Module II: Operating System Structures
System Components, System Calls, System Programs, System Structure, Virtual Machines
Module III: Processes Management
Process Concept, Process Scheduling, Operation on Processes, Cooperating Processes, Interprocess Communication, Background Process
Module IV: CPU Scheduling
Basic Concepts, Scheduling Criteria, Scheduling Algorithms, Multi-Processor Scheduling, Real-Time Scheduling, Algorithm Examination, System Models, Methods for Handling Deadlocks, Deadlock Prevention, Deadlock Avoidance, Deadlock Detection, Deadlock Recovery
Module V: Memory Management
Memory Management, Address Space, Memory Allocation Techniques, Swapping, Paging, Segmentation with Paging, Virtual Memory, Demand Paging, Performance of Demand Paging, Page Replacement, Thrashing, Demand Segmentation
Module VI: File System Interface
File Concept, Access Methods, Directory Structure, Protection, File System Structure, and Allocation Methods
Module VII: The Unix System Case Study
History, Design Principle, Programmer Interface, User Interface, Process Management, Memory Management, File Management, Interprocess Communication
Module VIII: Data Storage
Disks, organization of disks, capacity and space, organizing tracks by sectors, clusters, extents, fragmentation, organizing tracks by block. Magnetic tapes, number of tracks/character set used. Storage by data warehousing. OLAP, DSS (decision support system). Characteristics of data warehouses. Data modeling for data warehouses, two-dimensional matrix models. Three-dimensional data cube method, roll-up operation, drill-down operation. Acquisition of data for the warehouse, typical functionality of a data warehouse. Problems and open issues in data warehouses.
S.I. No.   Chapter No.   Subject
1          I             Introduction
2          II            Operating System Structure
3          III           Processes Management
4          IV            CPU Scheduling
5          V             Memory Management
6          VI            File System Interface
7          VII           The UNIX System Case Study
8          VIII          Data Storage
Contents:
1.1 What is an operating system?
1.2 Types of operating system
1.2.1 Serial processing system
1.2.2 Batch processing system
1.2.3 Multiprogramming operating system
1.3 Time sharing systems
1.4 Parallel systems
1.4.1 Symmetric multiprocessor
1.4.2 Asymmetric multiprocessor
1.5 Distributed systems
1.6 Real time systems
1.0 Operating System
1.1 What is an operating system?
An operating system is a manager or supervisor that controls the resources of a computer system. It manages the entire working of the computer through its management functions: memory management, file management, process management, device management, and input-output management. It is a well-organized and well-designed set of software programs that provides a better environment for the execution of programs. Other software programs stored in secondary memory are loaded into main memory by the operating system whenever needed. Some parts of the operating system are permanently resident in main memory, while the rest are loaded into the transient area of main memory as and when required. The operating system keeps track of the status of each process and resource and decides which process is assigned which resource, and for how long. Thus an operating system is system software that ensures the proper use of all the resources of the computer system. Examples of operating systems are MS-WINDOWS, DOS, UNIX, and LINUX.
1.2 Types of Operating System
Operating systems are broadly divided into three categories:
Serial processing operating system
Batch processing operating system
Multiprogramming operating system
1.2.1 Serial Processing Operating System - Early computer systems used the concept of a bare machine, in which each job had to be performed manually. The first type of operating system to move away from the bare machine, by acting as a mediator between the user and the hardware, was the serial processing operating system. In a serial processing operating system, only one program runs at a time. A lot of time is therefore wasted when many processes are waiting to run, which reduces the efficiency with which system resources are used.
1.2.2 Batch Processing Operating System - After serial processing operating systems, the next step was the batch processing operating system. Here several jobs are collected into a batch, which is then submitted to the system. A set of control statements, written in JCL (Job Control Language), is used to identify the next job when the previous job is completed. Each job is thus processed individually, and after all the programs have been processed, control is transferred back to the user. Batch processing increases resource utilization, especially in a multi-user environment, but it also increases the turnaround time (the time from when a job is submitted until its output is received). Thus in a batch operating system a general user has to put up with a long turnaround time.
1.2.3 Multiprogramming Operating System - A single user cannot keep all the devices busy all the time, so the concept of multiprogramming was introduced. The idea is that whenever one program waits for an I/O activity, another program that is waiting for the CPU is scheduled for execution. Running two or more programs in this way, so as to keep the CPU and the I/O devices busy almost all the time, is called multiprogramming, and an operating system that implements it is called a multiprogramming operating system. The number of programs actively competing for the resources of a multiprogramming computer system is called the degree of multiprogramming. The higher the degree of multiprogramming, the higher the resource utilization. A multiprogramming operating system also supports multiple users.
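The link between the degree of multiprogramming and resource utilization can be illustrated with a common textbook approximation that this chapter itself does not state. Assume each program waits for I/O a fraction p of the time, independently of the others; the CPU is then idle only when all n programs wait at once, which gives a utilization of about 1 - p^n. The function name and the value p = 0.8 below are illustrative assumptions, not figures from the text.

```python
def cpu_utilization(io_wait_fraction, degree):
    """Approximate CPU utilization with `degree` programs in memory.

    Assumption (not from the text): each program waits for I/O a fraction
    `io_wait_fraction` of the time, independently of the other programs.
    """
    if not 0.0 <= io_wait_fraction <= 1.0:
        raise ValueError("io_wait_fraction must be between 0 and 1")
    if degree < 0:
        raise ValueError("degree must be non-negative")
    # The CPU is idle only when every program is waiting for I/O.
    return 1.0 - io_wait_fraction ** degree


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, round(cpu_utilization(0.8, n), 3))
```

Under this simplified model, utilization rises with every added program but with diminishing returns, which is consistent with the statement that a higher degree of multiprogramming gives higher resource utilization.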
1.3 Time Sharing System
When multiple users want to share the same machine and operate on a common platform, a time-sharing system is needed, and this is how the concept came into existence. It is a logical extension of the multiprogramming system. It serves multiple interactive users at a reasonable cost. Each user thinks that the machine is dedicated to him or her, but this is not so. Time-sharing systems use time-slicing scheduling. Programs are executed according to a priority that increases while a program waits and drops after service is granted. Each user shares the computer system. Switching the CPU from one process to another takes much less time than an I/O operation, so a CPU that waited for each user's input would sit idle. Instead of leaving the CPU idle, the O.S. rapidly switches the CPU from one user to the next. Thus one computer is shared by several users.
1.4 Parallel System
When there are multiple users, or many processes are given to a single processor, that processor is burdened. In such a situation it is better to distribute the workload among several processors, which reduces the load on each one. In parallel systems, the processors share the memory, the bus, the clock, and the peripheral devices. Such systems are also known as multiprocessor systems. Parallel systems have several advantages. When the number of processors increases, the amount of work completed in a unit of time also increases; in other words, more work is done in less time, so the throughput of the system increases. Because the memory, buses, peripheral devices, and power supplies are shared, a parallel system costs less than multiple single-processor systems. Because the workload is distributed among several processors, the failure of one processor does not stop the system. The work of the failed processor is distributed among the others, which slows processing down, but only slightly. Parallel processors are generally used in two different ways:
1.4.1 Symmetric Multiprocessor - In this arrangement each processor works independently and all processors play an identical role. They communicate with each other when needed.
Figure: Symmetric multiprocessing system
1.4.2 Asymmetric Multiprocessor - There is a master processor that controls the system, and all the other processors either look to the master for instructions or have predefined tasks. Such a scheme is called a master-slave relationship. It is the master that allocates work to the slave processors.
1.5 Distributed System
A distributed system is a collection of autonomous computer systems capable of communication and co-operation with the help of software and hardware connections. In a distributed system, each processor has its own local memory rather than sharing a memory or a clock as in a parallel system. The processors communicate with one another through various communication media such as telephone lines and high-speed buses. Such a system is called a loosely coupled system or distributed system. The processors, which may be minicomputers or large general-purpose computers, are often called sites.
Advantages of a distributed system are:
Users sitting at different locations can share the resources.
Users can exchange data with one another, for example by electronic mail.
Backup and recovery of data is easily possible.
The failure of one processor only slows the system down. However, if each system is assigned a pre-specified task, the failure of one system can halt the whole system.
1.6 Real Time Systems
Real-time systems are systems in which time requirements on the operation of the CPU or the flow of data are critical. Sensors bring data to the computer. The computer first analyzes the data and possibly adjusts the controls to modify the sensor inputs. Applications such as scientific control equipment, medical systems, and industrial control systems come under real-time systems. Real-time systems can be divided mainly into two categories:
Hard real-time system
Soft real-time system
A system that guarantees that a task is completed within its time limit is called a hard real-time system. Such a system cannot be mixed with other types of systems. A system in which a time-critical process merely gets priority over other processes is called a soft real-time system. Soft real-time systems can therefore easily be mixed with other types of systems.
End Chapter quizzes:
Q1. The operating system of a computer serves as a software interface between the user and the ________.
a. Hardware b. Peripheral c. Memory d. Screen
Q2. Distributed OS works on the ________ principle.
a. File Foundation b. Single system image c. Multi system image d. Networking image
Q3. Which file system does Windows 95 typically use?
a. FAT16 b. FAT32 c. NTFS d. LMFS
Q4. The operating system manages ________.
a. Memory b. Processor c. Disk and I/O devices d. All of the above
Q5. The term "Operating System" means ________.
a. A set of programs which controls computer working
b. The way a computer operator works
c. Conversion of high-level language into machine-level language
d. The way a floppy disk drive operates
Chapter-II Operating System Structure
Contents:
2.1 System Components
2.2 System Calls
2.3 System Programs
2.4 System Structure
2.5 Virtual Machine
In this chapter, we explore three aspects of OS, showing the viewpoints of users, programmers, and operating-system designers.
2.1 System Components
The functions of an OS vary with the type of operating system; for example, the functions of a single-user OS differ from those of a multi-user OS. In general, however, the basic functions performed by every OS are the same. These functions are as follows:
a. Process Management
b. Memory Management
c. I/O Management
d. File Management
a. Process Management
A program in the active state is called a process; in other words, a process can be thought of as a program in execution. It is the O.S. that assigns the different jobs to be performed by the processor. Process management performs the following activities:
1. It controls the progress of processes.
2. It creates and deletes both user and system processes.
3. It allocates hardware resources to processes.
4. It controls signal communication between processes.
5. It handles deadlock conditions.
b. Memory Management
Memory management allocates main memory to user programs and their data during execution. It does the following tasks:
It allocates and de-allocates memory as required.
It tracks how much memory is to be allocated and when.
It also manages secondary storage and virtual memory.
In a multi-user system, it partitions main memory and resolves overlapping.
c. I/O Management
I/O management handles input and output requests. It is also known as the device management function: it controls the input/output devices by keeping track of the devices, channels, and control units, and it coordinates and assigns the different input and output devices. It may also be called the I/O traffic controller. It also acts as the disk scheduler: it manages requests for disk access by examining the requests in the pipeline to find the most efficient way to serve them.
d. File Management
File management provides the facilities needed to manage the complete file system. The functions performed by file management are:
It maps files onto secondary storage.
It creates backups of files on stable storage.
It handles the creation and deletion of files.
2.2 System Calls
System calls provide the interface between a process and the operating system. These calls are listed in the manuals used by assembly-language programmers. However, some higher-level languages, such as C, also allow system calls to be made directly. How a system call is made depends on the computer in use. System calls are broadly divided into five categories: process control, file management, device management, information maintenance, and communication.
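As a rough illustration of the file management and information maintenance categories, the sketch below uses Python's os module, whose functions are thin wrappers around the corresponding operating system calls. The helper name write_then_read and the file name demo.txt are hypothetical, chosen only for this example; the sketch assumes small files, for which a single write and a single read are enough.

```python
import os
import tempfile


def write_then_read(path, data):
    """Write bytes with low-level calls, then read them back."""
    # File management: open (creating if needed), write, close.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
    # File management: open again read-only, read, close.
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(data))
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "demo.txt")
        print(write_then_read(path, b"hello"))
    # Information maintenance: ask the OS for this process's identifier.
    print(os.getpid())
```

Each os.open, os.write, os.read, and os.close call hands control to the operating system, which performs the operation on behalf of the process and returns the result.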
2.3 System Programs
System programs provide a convenient environment for program development and execution. They can be divided into the following categories:
File management
Status information
File modification
Programming language support
Program loading and execution
Communications
Application programs
Most users' view of the operating system is defined by system programs, not by the actual system calls. Some system programs are simply user interfaces to system calls; others are considerably more complex. Most operating systems are supplied with programs that solve common problems or perform common operations. A few of them are as follows:
a. File management - These programs create, delete, copy, rename, print, dump, list, and generally manipulate files and directories.
b. File modification - Text editors create and modify files; special commands search the contents of files or perform transformations of the text.
c. Programming-language support - Compilers, assemblers, debuggers, and interpreters are sometimes provided.
d. Program loading and execution - Absolute loaders, relocatable loaders, linkage editors, and overlay loaders, together with debugging systems for higher-level and machine language.
e. Communications - These programs provide the mechanism for creating virtual connections among processes, users, and computer systems. They allow users to send messages to one another's screens, browse web pages, send electronic-mail messages, log in remotely, and transfer files from one machine to another.
2.4 Operating System Structure
The design of operating system architecture traditionally follows the separation of concerns principle. This principle suggests structuring the operating system into relatively independent parts that provide simple individual features, thus keeping the complexity of the design manageable. Besides managing complexity, the structure of the operating system can influence key features such as robustness or efficiency:
The operating system possesses various privileges that allow it to access otherwise protected resources such as physical devices or application memory. When these privileges are granted to the individual parts of the operating system that require them, rather than to the operating system as a whole, the potential for both accidental and malicious privilege misuse is reduced.
Breaking the operating system into parts can have an adverse effect on efficiency because of the overhead associated with communication between the individual parts. This overhead can be exacerbated when coupled with hardware mechanisms used to grant privileges.
The following sections outline typical approaches to structuring the operating system.
Monolithic Systems
A monolithic design of the operating system architecture makes no special accommodation for the special nature of the operating system. Although the design follows the separation of concerns, no attempt is made to restrict the privileges granted to the individual parts of the operating system. The entire operating system executes with maximum privileges. The communication overhead inside the monolithic operating system is the same as the communication overhead inside any other software, and is considered relatively low. CP/M and DOS are simple examples of monolithic operating systems. Both CP/M and DOS are operating systems that share a single address space with the applications. In CP/M, the 16-bit address space starts with system variables and the application area and ends with three parts of the operating system, namely CCP (Console Command Processor), BDOS (Basic Disk Operating System) and BIOS (Basic Input/Output System). In DOS, the 20-bit address space starts with the array of interrupt vectors and the system variables, followed by the resident part of DOS and the application area, and ends with a memory block used by the video card and BIOS.
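To give a sense of scale for the two address spaces mentioned above, the short sketch below computes how many bytes a given number of address bits can reach, assuming one byte per address. The function name is illustrative only.

```python
def address_space_bytes(address_bits):
    """Number of distinct byte addresses reachable with `address_bits` bits."""
    return 2 ** address_bits


if __name__ == "__main__":
    for name, bits in (("CP/M", 16), ("DOS", 20)):
        size = address_space_bytes(bits)
        print(f"{name}: {bits}-bit addresses -> {size} bytes ({size // 1024} KiB)")
```

A 16-bit address space therefore covers 64 KiB and a 20-bit address space covers 1024 KiB (1 MiB), and in both systems the operating system and the applications have to fit into that single space together.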
Figure: Simple Monolithic Operating Systems Example
Most contemporary operating systems, including Linux and Windows, are also considered monolithic, even though their structure is certainly significantly different from the simple examples of CP/M and DOS.
Layered Systems
A layered design of the operating system architecture attempts to achieve robustness by structuring the architecture into layers with different privileges. The most privileged layer would contain code dealing with interrupt handling and context switching, the layers above that would follow with device drivers, memory management, file systems, user interface, and finally the least privileged layer would contain the applications. MULTICS is a prominent example of a layered operating system, designed with eight layers formed into protection rings, whose boundaries could only be crossed using specialized instructions. Contemporary operating systems, however, do not use the layered design, as it is deemed too restrictive and requires specific hardware support.
Microkernel Systems
A microkernel design of the operating system architecture targets robustness. The privileges granted to the individual parts of the operating system are restricted as much as possible and the communication between the parts relies on specialized communication mechanisms that enforce the privileges as necessary. The communication overhead inside the microkernel operating system can be higher than the communication overhead inside other software; however, research has shown this overhead to be manageable. Experience with the microkernel design suggests that only very few individual parts of the operating system need to have more privileges than common applications. The microkernel design therefore leads to a small system kernel, accompanied by additional system applications that provide most of the operating system features. MACH is a prominent example of a microkernel that has been used in contemporary operating systems, including the NextStep and OpenStep systems and, notably, OS X. Most research operating systems also qualify as microkernel operating systems.
Virtualized Systems
Attempts to simplify maintenance and improve utilization of operating systems that host multiple independent applications have led to the idea of running multiple operating systems on the same computer. Similar to the manner in which the operating system kernel provides an isolated environment to each hosted application, virtualized systems introduce a hypervisor that provides an isolated environment to each hosted operating system. Hypervisors can be introduced into the system architecture in different ways.
A native hypervisor runs on bare hardware, with the hosted operating systems residing above the hypervisor in the system structure. This makes it possible to implement an efficient hypervisor, paying the price of maintaining a hardware specific implementation.
A hosted hypervisor partially bypasses the need for a hardware specific implementation by running on top of another operating system. From the bottom up, the system structure then starts with the host operating system that includes the hypervisor, and then the guest operating systems, hosted above the hypervisor.
2.5 Virtual Machines
A virtual machine may be defined as an efficient, isolated duplicate of a real machine. Current use includes virtual machines which have no direct correspondence to any real hardware. Virtual machines are separated into two major categories, based on their use and degree of correspondence to any real machine.
1. System virtual machine: It provides a complete system platform which supports the execution of a complete operating system (OS). It is also known as a hardware virtual machine.
2. Process virtual machine: It is designed to run a single program, which means that it supports a single process. It is also called an application virtual machine.
An essential characteristic of a virtual machine is that the software running inside is limited to the resources and abstractions provided by the virtual machine; it cannot break out of its virtual world.
Example: A program written in Java receives services from the Java Runtime Environment software by issuing commands to, and receiving the expected results from, the Java software. By providing these services to the program, the Java software is acting as a "virtual machine", taking the place of the operating system or hardware for which the program would ordinarily be tailored.
Figure: (a) Non-virtual machine (b) Virtual machine
System virtual machines
System virtual machines allow the sharing of the underlying physical machine resources between different virtual machines, each running its own operating system. The software layer providing the virtualization is called a virtual machine monitor or hypervisor. A hypervisor can run on bare hardware (Type 1 or native VM) or on top of an operating system (Type 2 or hosted VM).
The main advantages of system VMs are:
Multiple OS environments can co-exist on the same computer, in strong isolation from each other.
The virtual machine can provide an instruction set architecture (ISA) that is somewhat different from that of the real machine.
Application provisioning, maintenance, high availability and disaster recovery.
The main disadvantage of system VMs is:
The virtual machine is less efficient than a real machine when it accesses the hardware indirectly.
Process virtual machines
A process VM runs as a normal application inside an OS and supports a single process. It is created when that process is started and destroyed when it exits. Its purpose is to provide a platform-independent programming environment that abstracts away details of the underlying hardware or operating system, and allows a program to execute in the same way on any platform. A process VM provides a high-level abstraction, that of a high-level programming language. Process VMs are implemented using an interpreter; performance comparable to compiled programming languages is achieved by the use of just-in-time compilation. This type of VM has become popular with the Java programming language, which is implemented using the Java virtual machine. Other examples include the Parrot virtual machine, which serves as an abstraction layer for several interpreted languages, and the .NET Framework, which runs on a VM called the Common Language Runtime. A special case of process VMs are systems that abstract over the communication mechanisms of a (potentially heterogeneous) computer cluster. Such a VM does not consist of a single process, but of one process per physical machine in the cluster. They are designed to ease the task of programming parallel applications by letting the programmer focus on algorithms rather than the communication mechanisms provided by the interconnect and the OS. They do not hide the fact that communication takes place, and as such do not attempt to present the cluster as a single parallel machine. Unlike other process VMs, these systems do not provide a specific programming language, but are embedded in an existing language; typically such a system provides bindings for several languages (e.g., C and FORTRAN). Examples are PVM (Parallel Virtual Machine) and MPI (Message Passing Interface).
They are not strictly virtual machines, as the applications running on top still have access to all OS services, and are therefore not confined to the system model provided by the "VM".
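The statement in this section that process VMs are implemented using an interpreter can be made concrete with a toy example. The sketch below interprets a made-up three-instruction stack language; the opcodes PUSH, ADD, and MUL are hypothetical and far simpler than the instruction set of the Java virtual machine or any other real VM.

```python
def run(program):
    """Interpret a list of (opcode, operand) pairs on a small stack machine."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            # The hosted program can only use what the VM provides.
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# (2 + 3) * 4 expressed in the toy instruction set.
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]

if __name__ == "__main__":
    print(run(PROGRAM))
```

The hosted program never touches the hardware or the operating system directly: it sees only the stack and the three instructions, which is the sense in which software inside a virtual machine is limited to the abstractions the machine provides.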
End Chapter quizzes:
Q1. A major problem with priority scheduling is _________.
a. Definite blocking b. Starvation c. Low priority d. None of the above
Q2. Which is not performed by the OS?
a. Memory management b. Device management c. Supplying input to system d. Process management
Q3. The OS used in intelligent devices is:
a. Multiplexing b. Multiprocessing c. Handheld OS d. Real time OS
Q4. On the basis of the number of processors supported, how many types of OS are there?
a. One b. Two c. Three d. Four
Q5. This is a program used to write source code:
a. Editor b. Compiler c. Assembler d. Interpreter
Chapter-III Process Management
Contents:
3.1 Process concept
3.2 Process scheduling
3.3 Operation on process
3.4 Cooperating process
3.5 Inter process communication
3.6 Background process
3.1 Process Concept
A process is the unit of work in most systems. A program may be in one of two states, active or passive. A program in the active state is called a process. A process may be in one of the following states:
New state: the process is being created.
Running state: the process is using the CPU at that instant.
Suspended or waiting state: the process is waiting for some event to occur.
Ready state: the process is runnable but has temporarily stopped running to let another process run.
Terminated state: the process has finished execution.
Process management performs the following functions:
Creating and removing processes.
Managing the progress of a process.
Handling interrupts and errors during program execution.
Allocating hardware resources among several processes.
Figure 3.1: Process states
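The states listed in this section can be modelled as a small state machine. The figure itself is not reproduced here, so the set of allowed transitions below is an assumption based on the conventional five-state diagram (new to ready, ready to running, running to ready, waiting, or terminated, and waiting to ready). The class and variable names are illustrative.

```python
# Allowed transitions, assumed from the conventional five-state model.
TRANSITIONS = {
    "new": {"ready"},
    "ready": {"running"},
    "running": {"ready", "waiting", "terminated"},
    "waiting": {"ready"},
    "terminated": set(),
}


class Process:
    """A process that only accepts state changes listed in TRANSITIONS."""

    def __init__(self, pid):
        self.pid = pid
        self.state = "new"

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition: {self.state} -> {new_state}")
        self.state = new_state


if __name__ == "__main__":
    process = Process(pid=1)
    for state in ("ready", "running", "waiting", "ready", "running", "terminated"):
        process.move_to(state)
        print(process.pid, process.state)
```

In this model a waiting process cannot go straight back to running: once its event occurs it becomes ready and has to be scheduled again.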
3.2 Process Scheduling
Process scheduling refers to the set of policies and mechanisms that govern the order in which the computer completes its work. There are basically two scheduling philosophies:
a. Preemptive scheduling
b. Non-preemptive scheduling
a. Preemptive scheduling: The currently running process is replaced by a higher-priority process even if its time slice has not expired.
b. Non-preemptive scheduling: The running process retains control of the CPU and all allocated resources until it terminates.
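The difference between the two philosophies can be seen in a small simulation. This is a minimal sketch, not a scheduler from any real system: time advances in whole units, a smaller priority number means higher priority (an assumption made for this example), and the job names and values are invented.

```python
def simulate(jobs, preemptive):
    """Return the name of the job that runs in each time unit.

    jobs: list of (name, arrival_time, burst_time, priority) tuples.
    Assumption: a smaller priority number means a higher priority.
    """
    remaining = {name: burst for name, _, burst, _ in jobs}
    info = {name: (arrival, priority) for name, arrival, _, priority in jobs}
    timeline = []
    current = None
    time = 0
    while any(remaining.values()):
        ready = [n for n in remaining if remaining[n] > 0 and info[n][0] <= time]
        if not ready:
            timeline.append(None)  # CPU idle until the next arrival
            time += 1
            continue
        # Non-preemptive: an unfinished running job keeps the CPU.
        keep_current = current in ready and not preemptive
        if not keep_current:
            # Pick the highest priority; break ties by earlier arrival.
            current = min(ready, key=lambda n: (info[n][1], info[n][0]))
        remaining[current] -= 1
        timeline.append(current)
        time += 1
    return timeline


JOBS = [("A", 0, 4, 2), ("B", 1, 2, 1)]  # B arrives later with higher priority

if __name__ == "__main__":
    print("preemptive:    ", simulate(JOBS, preemptive=True))
    print("non-preemptive:", simulate(JOBS, preemptive=False))
```

With preemption, job B takes the CPU from A as soon as it arrives at time 1, and A resumes only after B finishes. Without preemption, A runs its full four units before B starts.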
3.3 Operations on Processes
The operating system provides a mechanism for process creation and termination.
3.3.1 Process Creation
A process may create several processes, via a create-process system call, during the course of execution. The creating process is called a parent process, whereas the new processes are called the children of that process. Each of these new processes may in turn create other processes, forming a tree of processes. When a process creates a sub-process, the sub-process may be able to obtain its resources directly from the operating system, or it may be constrained to a subset of the resources of the parent process. Resource sharing takes one of the following forms:
a. Parent and children share all resources.
b. Children share a subset of the parent's resources.
c. Parent and child share no resources.
When a process creates a new process, two possibilities exist in terms of execution:
a. The parent continues to execute concurrently with its children.
b. The parent waits until some or all of its children have terminated.
In UNIX, the parent creates a child process using the fork system call. Each process is identified by its process identifier (pid). The value returned by fork is zero in the child process; in the parent it is the pid of the child, an integer greater than zero. The parent can wait for the child process to complete with the wait system call. When the child process completes, the parent process resumes from the call to wait and eventually finishes using the exit system call.
3.3.2 Process Termination
A process terminates when it finishes executing its final statement and asks the operating system to delete it by using the exit system call. Termination can also occur under other circumstances. A process can cause the termination of another process via an appropriate system call (for example, abort). A parent may terminate the execution of one of its children for a variety of reasons.
The other most common operations performed on processes are as follows:
a. CREATE b. DELETE c. ABORT d. JOIN/FORK e. SUSPEND f. DELAY g. RESUME h. GET ATTRIBUTES i. CHANGE PRIORITY
a. The CREATE Call: When the O.S. encounters the CREATE call, it creates a new process with the specified attributes and identifier.
b. The DELETE Call: When the DELETE system call executes, the O.S. destroys the designated process and removes it from main memory. A process can be deleted either by itself or by another process.
c. The ABORT Call: When the ABORT system call executes, the O.S. terminates a process. When a process is aborted, the O.S. usually furnishes a register and memory dump together with the identity of the aborting process and the reason for this action. A process can be aborted either by itself or by other processes.
d. The JOIN/FORK Call: This is a method of process creation and deletion. The FORK command is used to split a sequence of statements into two concurrently executable sequences of code divided by the FORK.
e. The SUSPEND Call: The specified process is suspended indefinitely and placed in the suspended state. A process remains in the suspended state until its suspending condition is removed.
f. The DELAY Call: When the DELAY call is invoked, the target process is suspended for the duration of the specified time period. A process can be delayed either by itself or by some other process.
g. The GET-ATTRIBUTES Call: When the GET-ATTRIBUTES call is invoked, the O.S. accesses the current values of the process attributes from its PCB. Generally this call is used to check the status of a process.
h. The CHANGE PRIORITY Call: When this call is executed, the priority of the designated process is changed. The call receives two arguments: the PCB and the new priority. This call is not implemented in systems where process priority is static.
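The fork, wait and exit calls described in 3.3.1 can be sketched as follows. This is a minimal illustration using Python's os module on a UNIX-like system, not the book's own code; the function name run_child_and_wait and the exit code 7 are assumptions made for the example.

```python
import os

def run_child_and_wait(child_exit_code):
    """Fork a child that exits with the given code; return the code the parent observes."""
    pid = os.fork()  # returns 0 in the child and the child's pid in the parent
    if pid == 0:
        # Child process: terminate at once with the requested status.
        # os._exit ends the process without running the parent's cleanup handlers.
        os._exit(child_exit_code)
    # Parent process: block in wait until this child has terminated.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None  # the child was ended by a signal instead of calling exit

if __name__ == "__main__":
    print("child exited with status", run_child_and_wait(7))
```

Here the parent takes the second of the two execution options listed above: it waits until its child has terminated before continuing.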
3.4 Cooperating Processes
The concurrent processes executing in the operating system may be either independent processes or cooperating processes. An independent process cannot affect or be affected by the execution of another process. A cooperating process can affect or be affected by the other processes executing in the system. Clearly, any process that shares data with other processes is a cooperating process. Cooperating processes may share any type of data, and they must synchronize with each other when they use shared resources. Cooperating processes may either directly share a logical address space or be allowed to share data only through files. The following factors call for an environment that allows process cooperation:
(i) In a multiuser environment, several users may be interested in the same type of resources, so a suitable environment must be provided that allows concurrent access to these resources.
(ii) A task can run faster if it is broken into subtasks, each of which executes in parallel with the others.
(iii) A user may have many tasks to work on at one time.
Advantages of process cooperation:
Information sharing
Computation speed-up
Modularity
Convenience
3.5 Interprocess Communication
Interprocess communication is mainly of three types:
a. Interprocess synchronization
b. Interprocess signaling
c. Interprocess communication
a. Interprocess Synchronization: It is a set of protocols and mechanisms used to preserve system integrity and consistency when concurrent processes share a serially usable resource. For example, a printer is a serially usable device, which can process the print command of only one user at a time. It can start printing the job of another user only when the print job of the current user is over.
b. Interprocess Signaling: It is the exchange of timing signals between concurrent processes. It is used to coordinate the collective progress of the processes.
c. Interprocess Communication: Concurrent cooperating processes must communicate with each other to exchange data, report progress and gather collective results. Shared memory provides a simple and common means of interprocess communication, since all cooperating processes can access it.
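The printer example of a serially usable resource can be sketched with a mutual-exclusion lock. This is only an illustration under stated assumptions: it uses threads inside one Python process rather than separate OS processes, and the names printer_lock, print_job and run_print_jobs are hypothetical.

```python
import threading

printer_lock = threading.Lock()  # guards the serially usable "printer"
printed_pages = []               # stands in for the printer's output tray

def print_job(user, pages):
    # Only one job may hold the printer at a time; other jobs block here.
    with printer_lock:
        for page in range(1, pages + 1):
            printed_pages.append((user, page))

def run_print_jobs(jobs):
    """Run every (user, pages) job in its own thread and return the printed pages."""
    printed_pages.clear()
    threads = [threading.Thread(target=print_job, args=job) for job in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(printed_pages)

if __name__ == "__main__":
    print(run_print_jobs([("alice", 3), ("bob", 2)]))
```

Because each job holds the lock for its whole duration, the pages of one user are never interleaved with the pages of another, whichever job happens to start first.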
3.6 Background Process
A background process (or batch process) is a computer process that runs with a relatively low priority, requires little or no input, and generates a minimum of output. The two types of background processes are daemons and compute-intensive tasks. Daemon processes: They offer services such as serving web pages, transferring email and synchronizing time. They usually consume little CPU and memory, and run quietly without user interaction. They mainly communicate with other computer programs, or with other computers via the network. Compute-intensive tasks: The background is also used for long tasks that require a huge amount of computation and thus CPU time. Running this kind of task at low priority may seem counterintuitive, but it becomes clearer when one considers that a computer typically spends more than 90% of its time waiting for user input. One can assign a high priority to interactive tasks, which will appear highly responsive, and leave the majority of the time to low-priority tasks.
Chapter-III Process Management
End Chapter quizzes:
Q1. Information about a process is maintained in a _________.
(a) Stack (b) Translation Lookaside Buffer (c) Process Control Block (d) Program Control Block
Q2. Which is not a scheduling algorithm?
(a) FIFS (b) RRS (c) STRN (d) SRTN
Q3. How many states of a process are there?
(a) Two (b) Four (c) Three (d) Seven
Q4. How many types of schedulers can there be?
(a) Two (b) Four (c) Three (d) Seven
Q5. Which is not a form of interprocess interaction?
(a) Communication (b) Synchronization (c) Signaling (d) Co-operation
Chapter-IV CPU Scheduling
Contents: 4.1 Basic Concepts 4.1.1 CPU-I/O Burst Cycle 4.1.2 Scheduler 4.1.3. Dispatcher 4.2 Scheduling Criteria 4.3 Scheduling Algorithms 4.4 Multi-Processor Scheduling 4.5 Real-Time Scheduling 4.6 Deadlocks 4.7 Deadlock Prevention 4.8 Deadlock Avoidance 4.9 Deadlock Detection and Recovery
4.1 Basic Concepts
CPU scheduling is the basis of multiprogrammed operating systems. By switching the CPU among processes, the operating system can make the computer more productive. The objective of multiprogramming is to have some process running at all times, in order to maximize CPU utilization. The idea of multiprogramming is relatively simple. A process is executed until it must wait, typically for the completion of some input/output request. In a simple computer system, the CPU would then sit idle; all this waiting time is wasted. With multiprogramming, we try to use this time productively. Several processes are kept in memory at one time. When one process has to wait, the operating system takes the CPU away from that process and gives the CPU to another process.
4.1.1 CPU-I/O Burst Cycle
Process execution consists of a cycle of CPU execution and I/O wait. Processes alternate between these two states. Process execution begins with a CPU burst, followed by an I/O burst, then another CPU burst, then another I/O burst, and so on. Eventually, the last CPU burst ends with a system request to terminate execution of the process, rather than with another I/O burst. For effective utilization of the CPU, the set of processes in memory should be a mix of I/O-bound and CPU-bound programs.
4.1.2 Scheduler
Whenever the CPU becomes idle, the operating system must select one of the processes in the ready queue to be executed. The selection is carried out by the scheduler. There are three types of schedulers used by the operating system for process scheduling. Their details are as follows.
a. Long-term scheduler (also called admission scheduler or high-level scheduler).
b. Mid-term or medium-term scheduler.
c. Short-term scheduler.
a. Long-term scheduler: The long-term, or admission, scheduler decides which jobs or processes are to be admitted to the ready queue; that is, when an attempt is made to execute a program, its admission to the set of currently executing processes is either authorized or delayed by the long-term scheduler. Thus, this scheduler dictates what processes are to run on a system, and the degree of concurrency to be supported at any one time, i.e. whether a high or low number of processes are to be executed concurrently, and how the split between I/O-intensive and CPU-intensive processes is to be handled. In modern operating systems, this is used to make sure that real-time processes get enough CPU time to finish their tasks. Without proper real-time scheduling, modern GUI interfaces would seem sluggish. Long-term scheduling is also important in large-scale systems such as batch processing systems, computer clusters, supercomputers and render farms. In these cases, special-purpose job scheduler software is typically used to assist these functions, in addition to any underlying admission scheduling support in the operating system.
b. Medium-term scheduler: The mid-term scheduler temporarily removes processes from main memory and places them on secondary memory (such as a disk drive) or vice versa. This is commonly referred to as "swapping out" or "swapping in". The mid-term scheduler may decide to swap out a process which has not been active for some time, a process which has a low priority, a process which is page faulting frequently, or a process which is taking up a large amount of memory, in order to free up main memory for other processes. It swaps the process back in later, when more memory is available or when the process has been unblocked and is no longer waiting for a resource. In many systems today, the mid-term scheduler may actually perform the role of the long-term scheduler, by treating binaries as "swapped out processes" upon their execution. In this way, when a segment of the binary is required it can be swapped in on demand, or "lazy loaded".
c. Short-term (or CPU) scheduler: The short-term scheduler, or CPU scheduler, decides which of the ready, in-memory processes is to be executed (allocated a CPU) next, following a clock interrupt, an I/O interrupt, an operating system call or another form of signal. Thus the short-term scheduler makes scheduling decisions much more frequently than the long-term or mid-term schedulers: a scheduling decision will at a minimum have to be made after every time slice, and these are very short. This scheduler can be preemptive, implying that it is capable of forcibly removing processes from a CPU when it decides to allocate that CPU to another process, or non-preemptive ("co-operative"), in which case the scheduler is unable to force processes off the CPU.
4.1.3 Dispatcher
Another component involved in the CPU-scheduling function is the dispatcher. The dispatcher is the module that gives control of the CPU to the process selected by the short-term scheduler. This function involves the following:
• Switching context
• Switching to user mode
• Jumping to the proper location in the user program to restart that program
The dispatcher should be as fast as possible, since it is invoked during every process switch. The time it takes for the dispatcher to stop one process and start another running is known as the dispatch latency.
4.2 Scheduling Criteria
CPU scheduling algorithms have different scheduling criteria and may favor one class of processes over another. The characteristics used for comparison can make a substantial difference in the determination of the best algorithm. The criteria include the following.
CPU Utilization: We want to keep the CPU as busy as possible.
Throughput: If the CPU is busy executing processes, then work is being done. One measure of work is the number of processes that are completed per time unit, called throughput. For long processes, this rate may be one process per hour; for short transactions, it may be 10 processes per second.
Turnaround time: From the point of view of a particular process, the important criterion is how long it takes to execute that process. The interval from the time of submission of a process to the time of completion is the turnaround time. Turnaround time is the sum of the periods spent waiting to get into memory, waiting in the ready queue, executing on the CPU, and doing I/O.
Waiting time: The CPU scheduling algorithm does not affect the amount of time during which a process executes or does I/O; it affects only the amount of time that a process spends waiting in the ready queue. Waiting time is the sum of the periods spent waiting in the ready queue.
Response time: In an interactive system, turnaround time may not be the best criterion. Often, a process can produce some output fairly early and can continue computing new results while previous results are being output to the user. Thus, another measure is the time from the submission of a request until the first response is produced. This measure, called response time, is the time it takes to start responding, not the time it takes to output the response. The turnaround time is generally limited by the speed of the output device.
It is desirable to maximize CPU utilization and throughput and to minimize turnaround time, waiting time, and response time.
4.3 Scheduling Algorithms
CPU scheduling deals with the problem of deciding which of the processes in the ready queue is to be allocated the CPU. There are many different CPU scheduling algorithms. In this section, we describe several of them.
4.3.1 First-Come, First-Served Scheduling
First-Come, First-Served (FCFS) is the simplest scheduling algorithm. It is also known as First-In, First-Out (FIFO). FIFO simply queues processes in the order that they arrive in the ready queue.
• Since context switches occur only upon process termination, and no reorganization of the process queue is required, scheduling overhead is minimal.
• Throughput can be low, since long processes can hog the CPU.
• Turnaround time, waiting time and response time can be high for the same reason.
• No prioritization occurs, thus this system has trouble meeting process deadlines.
The lack of prioritization does permit every process to eventually complete, hence no starvation.
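The effect of a long process at the head of a FCFS queue can be checked with a few lines of code. This is a minimal sketch: the burst times of 24, 3 and 3 ms are hypothetical, and all processes are assumed to arrive at time zero.

```python
def fcfs_waiting_times(bursts):
    """Return the waiting time of each process when served strictly in arrival order."""
    waits = []
    clock = 0
    for burst in bursts:
        waits.append(clock)  # a process waits until every earlier arrival has finished
        clock += burst
    return waits

if __name__ == "__main__":
    for bursts in ([24, 3, 3], [3, 3, 24]):
        waits = fcfs_waiting_times(bursts)
        print(bursts, waits, sum(waits) / len(waits))
```

With the long job first the waits are 0, 24 and 27 ms (average 17); with the same jobs in the order 3, 3, 24 they are 0, 3 and 6 ms (average 3), which is why FCFS waiting time depends so heavily on arrival order.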
4.3.2 Shortest remaining time or Shortest Job First (SJF)
With this strategy the scheduler arranges processes so that the one with the least estimated processing time remaining is next in the queue. This requires advance knowledge or estimates of the time required for a process to complete.
If a shorter process arrives during another process' execution, the currently running process may be interrupted, dividing that process into two separate computing blocks. This creates excess overhead through additional context switching. The scheduler must also place each incoming process into a specific place in the queue, creating additional overhead.
This algorithm is designed for maximum throughput in most scenarios. Waiting time and response time increase as the computational requirements of a process increase. Since turnaround time is based on waiting time plus processing time, longer processes are significantly affected by this. Overall waiting time is smaller than with FIFO, however, since no process has to wait for the termination of the longest process.
No particular attention is given to deadlines; the programmer can only attempt to make processes with deadlines as short as possible. Starvation is possible, especially in a busy system with many small processes being run.
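The preemptive variant (shortest remaining time first) can be sketched as a unit-step simulation. The assumptions here are loud ones: time advances in whole units, bursts are positive integers known in advance, ties go to the earlier arrival, and the function name srtf and the three sample processes are hypothetical.

```python
def srtf(processes):
    """processes: dict name -> (arrival, burst). Returns dict name -> completion time."""
    remaining = {name: burst for name, (_, burst) in processes.items()}
    completion = {}
    clock = 0
    while remaining:
        ready = [n for n in remaining if processes[n][0] <= clock]
        if not ready:
            clock += 1  # CPU idles until the next arrival
            continue
        # Pick the ready process with the least remaining time (earlier arrival wins ties).
        current = min(ready, key=lambda n: (remaining[n], processes[n][0]))
        remaining[current] -= 1
        clock += 1
        if remaining[current] == 0:
            del remaining[current]
            completion[current] = clock
    return completion

if __name__ == "__main__":
    procs = {"P1": (0, 8), "P2": (1, 4), "P3": (2, 2)}
    done = srtf(procs)
    for name, (arrival, burst) in procs.items():
        print(name, "completes at", done[name], "waits", done[name] - arrival - burst)
```

In this run P1 is preempted at t = 1 by P2, which is itself preempted at t = 2 by P3, so the longest process finishes last and waits the longest, as the text describes.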
4.3.3 Fixed priority pre-emptive scheduling
The O/S assigns a fixed priority rank to every process, and the scheduler arranges the processes in the ready queue in order of their priority. Lower priority processes get interrupted by incoming higher priority processes.
Overhead is not minimal, nor is it significant. FPPS has no particular advantage in terms of throughput over FIFO scheduling.
Waiting time and response time depend on the priority of the process. Higher priority processes have smaller waiting and response times. Deadlines can be met by giving processes with deadlines a higher priority. Starvation of lower priority processes is possible with large amounts of high priority processes queuing for CPU time.
4.3.4 Round-robin scheduling
The scheduler assigns a fixed time unit per process, and cycles through them.
• RR scheduling involves extensive overhead, especially with a small time unit.
• Throughput is balanced between FCFS and SJF: shorter jobs are completed faster than in FCFS, and longer processes are completed faster than in SJF.
• Average response time is the fastest; waiting time depends on the number of processes, not on the average process length.
• Because of high waiting times, deadlines are rarely met in a pure RR system.
• Starvation can never occur, since no priority is given. The order of time unit allocation is based upon process arrival time, similar to FCFS.
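Round-robin cycling can be sketched with a FIFO queue of ready processes. This is an illustration only: all processes are assumed to arrive at time zero in dictionary order, context-switch cost is ignored, and the function name round_robin and the sample bursts are hypothetical.

```python
from collections import deque

def round_robin(bursts, quantum):
    """bursts: dict name -> CPU burst. Returns dict name -> completion time."""
    remaining = dict(bursts)
    queue = deque(bursts)  # ready queue in arrival order
    completion = {}
    clock = 0
    while queue:
        name = queue.popleft()
        run = min(quantum, remaining[name])  # run one quantum or until the burst ends
        clock += run
        remaining[name] -= run
        if remaining[name] == 0:
            completion[name] = clock
        else:
            queue.append(name)  # unfinished: go to the back of the queue
    return completion

if __name__ == "__main__":
    bursts = {"P1": 8, "P2": 5, "P3": 4}
    done = round_robin(bursts, quantum=2)
    print(done, {n: done[n] - bursts[n] for n in bursts})
```

For bursts of 8, 5 and 4 ms and a quantum of 2 ms, the completion times are 17, 15 and 12 ms, so the waiting times are 9, 10 and 8 ms.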
Consider the following set of processes, with the length of the CPU-burst time given in milliseconds:

Process   Burst Time   Priority
P1        8            2
P2        5            1 (High)
P3        4            3

The processes are assumed to have arrived in the order P1, P2, P3, all at time t = 0.
a. Draw four Gantt charts illustrating the execution of these processes using FCFS, SJF, non-preemptive priority (a smaller priority number implies a higher priority), and RR (quantum = 2) scheduling.
b. What is the waiting time of each process for each of the scheduling algorithms in part a?
c. What is the turnaround time of each process for each of the scheduling algorithms in part a?

Solutions:
(a) Gantt chart for First-Come First-Served (FCFS) Scheduling
P1 (0-8), P2 (8-13), P3 (13-17)
Gantt chart for Shortest Job First (SJF) Scheduling
P3 (0-4), P2 (4-9), P1 (9-17)
Gantt chart for Priority Scheduling
P2 (0-5), P1 (5-13), P3 (13-17)
Gantt chart for RR Scheduling
P1 (0-2), P2 (2-4), P3 (4-6), P1 (6-8), P2 (8-10), P3 (10-12), P1 (12-14), P2 (14-15), P1 (15-17)

(b) Waiting time for each process and average waiting time

Algorithm   P1   P2   P3   Average
FCFS         0    8   13    7.00
SJF          9    4    0    4.33
Priority     5    0   13    6.00
RR           9   10    8    9.00

(c) Turnaround time for each process and average turnaround time

Algorithm   P1   P2   P3   Average
FCFS         8   13   17   12.67
SJF         17    9    4   10.00
Priority    13    5   17   11.67
RR          17   15   12   14.67
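The non-preemptive rows of the solution (FCFS, SJF and priority) differ only in the order in which the three processes are run, so they can be checked with one small helper. The helper name run_in_order is hypothetical; the burst and priority values are those of the exercise.

```python
def run_in_order(order, bursts):
    """Run the processes back to back in the given order; return (waiting, turnaround) dicts."""
    waiting, turnaround = {}, {}
    clock = 0
    for name in order:
        waiting[name] = clock       # time spent in the ready queue before starting
        clock += bursts[name]
        turnaround[name] = clock    # all processes arrived at t = 0
    return waiting, turnaround

bursts = {"P1": 8, "P2": 5, "P3": 4}
priority = {"P1": 2, "P2": 1, "P3": 3}  # smaller number means higher priority

orders = {
    "FCFS": ["P1", "P2", "P3"],                       # arrival order
    "SJF": sorted(bursts, key=bursts.get),            # shortest burst first
    "Priority": sorted(priority, key=priority.get),   # highest priority first
}

if __name__ == "__main__":
    for algorithm, order in orders.items():
        waiting, turnaround = run_in_order(order, bursts)
        print(algorithm, order,
              round(sum(waiting.values()) / 3, 2),
              round(sum(turnaround.values()) / 3, 2))
```

Average turnaround always equals average waiting time plus the average burst (17/3 ms here), which gives a quick consistency check on any row of the two tables.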
4.3.5 Multilevel Queue Scheduling
This is used for situations in which processes are easily classified into different groups. For example, a common division is made between foreground (interactive) processes and background (batch) processes. These two types of processes have different response-time requirements and so may have different scheduling needs.
4.3.6 Multilevel Feedback Queue Scheduling
Multilevel feedback queue scheduling allows a process to move between queues. The idea is to separate processes with different CPU-burst characteristics. If a process uses too much CPU time, it will be moved to a lower-priority queue. Similarly, a process that waits too long in a lower-priority queue may be moved to a higher-priority queue. In general, a multilevel feedback queue scheduler is defined by the following parameters:
• The number of queues
• The scheduling algorithm for each queue
• The method used to determine when to upgrade a process to a higher-priority queue
• The method used to determine when to demote a process to a lower-priority queue
• The method used to determine which queue a process will enter when that process needs service
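The demotion rule can be sketched with a small simulation. This is a deliberately reduced model and not a full scheduler: all processes are assumed to arrive at time zero, every process enters the top queue, a process that uses its whole quantum drops one level, and there is no upgrade (aging) rule. The function name mlfq and the sample values are hypothetical.

```python
from collections import deque

def mlfq(bursts, quanta):
    """bursts: dict name -> CPU burst.
    quanta: quantum of each queue, highest priority first; None means run to completion (FCFS).
    Returns dict name -> completion time."""
    queues = [deque() for _ in quanta]
    queues[0].extend(bursts)  # every process enters the highest-priority queue
    remaining = dict(bursts)
    completion = {}
    clock = 0
    while any(queues):
        level = next(i for i, q in enumerate(queues) if q)  # highest non-empty queue
        name = queues[level].popleft()
        quantum = quanta[level]
        run = remaining[name] if quantum is None else min(quantum, remaining[name])
        clock += run
        remaining[name] -= run
        if remaining[name] == 0:
            completion[name] = clock
        else:
            # The process used its full quantum: demote it one level (bottom queue is the floor).
            queues[min(level + 1, len(queues) - 1)].append(name)
    return completion

if __name__ == "__main__":
    print(mlfq({"A": 3, "B": 9, "C": 1}, quanta=[2, 4, None]))
```

With quanta of 2 and 4 time units and a FCFS bottom queue, the short process C finishes at t = 5, A at t = 6, and the CPU-heavy process B sinks to the bottom queue and finishes at t = 13.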
4.4 Multi-Processor Scheduling
There are many reasons to use multiprocessor scheduling:
• To support multiprogramming
• Large numbers of independent processes
• Simplified administration
• To support parallel programming
• A "job" consists of multiple cooperating/communicating threads and/or processes
• Load sharing
We consider only shared-memory multiprocessors. The ready queue can be organized in two ways:
• Central queue: the queue can be a bottleneck
• Distributed queue: load balancing between queues
There are two types of multi-processor scheduling viz. symmetric and asymmetric. Asymmetric multiprocessing is far simpler than symmetric multiprocessing, because only
one processor accesses the system data structures, alleviating the need for data sharing. Typically, asymmetric multiprocessing is implemented first within an operating system, and then upgraded to symmetric multiprocessing as the system evolves.
4.5 Real-Time Scheduling
A multitasking operating system intended for real-time applications is known as a real-time operating system (RTOS). Such applications include embedded systems (programmable thermostats, household appliance controllers, mobile telephones), industrial robots, spacecraft, industrial control and scientific research equipment.
Features:
• A real-time operating system facilitates the creation of a real-time system, but it does not guarantee that the final result will be real-time; this requires correct development of the software.
• A real-time operating system does not necessarily have high throughput; rather, it provides facilities that, if used properly, guarantee that deadlines can be met generally (soft real-time) or deterministically (hard real-time).
• A real-time operating system typically uses specialized scheduling algorithms in order to provide the real-time developer with the tools necessary to produce deterministic behavior in the final system.
• A real-time operating system is valued more for how quickly and predictably it can respond to a particular event than for the amount of work it can perform over time.
• The key factors in a real-time operating system are therefore a minimal interrupt latency and a minimal thread switching latency.
An early example of a large-scale real-time operating system was the so-called "control program" developed by American Airlines and IBM for the Sabre Airline Reservations System.
4.6 Deadlocks
A deadlock is a situation in a multiprogramming system in which each process of a group has acquired some of the resources needed for its completion while it waits for other resources, held by other processes of the same group, to be released. This situation permanently blocks the processes involved, and the system may come to a halt. Deadlock is the major side effect of synchronization and concurrent processing. It occurs as a result of uncontrolled granting of system resources to requesting processes. For example, consider a system with three tape drives. Suppose each of three processes holds one tape drive. If every process now requests another tape drive, the three processes will be in a deadlock state. Another example: one process holds the tape drive and requires the printer to complete its job, while a second process holds the printer and requires the tape drive; a deadlock occurs in which both processes are waiting for resources. Deadlock can occur as a result of competition over any shared device. During deadlock, the processes involved never finish and the resources they hold are blocked. This prevents new jobs from starting.
Necessary conditions for deadlock to occur:
For a deadlock to occur, the following conditions must be true:
1. Mutual Exclusion: shared resources are acquired and used in a mutually exclusive manner, i.e. by only one process at a time.
2. Hold and Wait: each process holds the resources allocated to it while waiting for other resources.
3. No Preemption: once allocated, a resource can only be released by the process holding it. The system cannot forcefully revoke it.
4. Circular Waiting: the deadlocked processes form a circular chain such that each process holds some resources needed by the next process in the chain.
All of the above conditions must hold for a deadlock to occur.
Methods for handling deadlock:
To ensure that deadlock will never occur, the system can use either a deadlock prevention or a deadlock avoidance scheme. Most deadlock handling techniques fall into one of these three classes:
1. Deadlock prevention
2. Deadlock avoidance
3. Deadlock detection and recovery
Deadlock prevention is a set of methods for ensuring that at least one of the necessary conditions cannot hold, whereas deadlock avoidance requires that the operating system be given, in advance, additional information concerning which resources a process will request and use during its lifetime. If a system employs neither a deadlock prevention nor a deadlock avoidance algorithm, then a deadlock situation may arise, and in this environment deadlock detection and recovery methods are used.
4.7 Deadlock Prevention
The basic idea of deadlock prevention is to deny at least one of the four conditions that are necessary for deadlock. Of these conditions, mutual exclusion is usually very difficult to deny, as doing so affects system performance, but we can consider the other three conditions.
(a) Eliminating Hold and Wait: the hold and wait condition can be eliminated by forcing a process to release all the resources it holds whenever it requests an unavailable resource. This can be done using two strategies:
1. A process can request resources only when it holds no resources at all, so it must request everything it needs at once.
2. A process can request resources step by step, in such a manner that it releases the resources it has acquired before requesting another resource that is unavailable.
The first strategy looks very easy to implement, but a process that follows it has to wait until all of its resources are available at the same time. This leads to major system degradation. The second approach requires careful holding and releasing of resources. It avoids the disadvantage of the first approach, but some resources cannot be reacquired later, for example, files in temporary memory.
(b) Eliminating No Preemption: the no preemption condition can be denied by allowing preemption. That means the system can revoke ownership of resources from a process. This requires storing the state of the process before revoking a resource from it. Preemption is possible for some types of resources, e.g. CPU and memory, whereas it cannot be applied to resources such as a printer.
(c) Eliminating Circular Wait: circular wait can be avoided by a linear ordering of resources in the system. The system resources are divided into different classes Ci, where i ranges from 1 to N. Deadlocks are prevented by forcing processes to acquire system resources in increasing order of resource class. For example, if a process has requested a resource of class 2, it can now request resources of class 3 or above, but not resources of class 1.
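The linear-ordering rule for eliminating circular wait can be sketched with ranked locks. This is an illustration under stated assumptions: threads stand in for processes, the class numbers 1 (tape drive) and 2 (printer) are arbitrary, and the names RankedResource, acquire_in_order, release_all and run_two_workers are hypothetical.

```python
import threading

class RankedResource:
    """A resource with a class number (rank) and a lock that guards it."""
    def __init__(self, name, rank):
        self.name = name
        self.rank = rank
        self.lock = threading.Lock()

def acquire_in_order(resources):
    """Acquire the resources in increasing rank, whatever order the caller listed them in."""
    ordered = sorted(resources, key=lambda r: r.rank)
    for resource in ordered:
        resource.lock.acquire()
    return ordered

def release_all(held):
    for resource in reversed(held):
        resource.lock.release()

def run_two_workers():
    tape = RankedResource("tape drive", 1)
    printer = RankedResource("printer", 2)
    finished = []

    def worker(name, wanted):
        held = acquire_in_order(wanted)
        finished.append(name)  # both resources are held at this point
        release_all(held)

    threads = [
        threading.Thread(target=worker, args=("A", [tape, printer])),
        threading.Thread(target=worker, args=("B", [printer, tape])),  # opposite request order
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    return sorted(finished)

if __name__ == "__main__":
    print(run_two_workers())
```

Worker B asks for the printer first, but the helper still takes the tape drive first, so neither worker can hold a higher-class resource while waiting for a lower-class one, and no cycle can form.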
4.8 Deadlock Avoidance
The basic idea of deadlock avoidance is to grant only those resource requests that cannot possibly result in a state of deadlock. This strategy is implemented by having a resource allocator in the system, which examines the effect of allocating a resource and grants access only if it will not result in a state of deadlock; otherwise the requesting process is suspended until it is safe to grant access to the required resource. To avoid deadlock, the system requires each process to specify its maximum resource needs before its execution. A process requesting more resources than its pre-stated limit is not admitted for execution. The resource allocator keeps track of the number of allocated and free resources of each type. A process that requests an unavailable resource is made to wait. When the resource becomes available, the resource allocator analyzes whether granting access would result in a state of deadlock. If not, access is granted; if yes, the process is suspended. A graph-based algorithm may also be used. In this approach, a graph of the current system state and of the state after the grant is constructed to decide whether to grant access to a requested resource.
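One common way to implement the allocator's check is a safety test in the style of the banker's algorithm, which the text above does not name; it is used here as an assumption. The sketch looks for an order in which every process can obtain its declared maximum and finish. The function name safe_sequence and the sample matrices are illustrative.

```python
def safe_sequence(available, allocation, maximum):
    """Return an order in which all processes can finish, or None if the state is unsafe.

    available:  free units of each resource type
    allocation: allocation[i][j] = units of type j currently held by process i
    maximum:    maximum[i][j]    = declared maximum need of process i for type j
    """
    n = len(allocation)
    need = [[m - a for m, a in zip(maximum[i], allocation[i])] for i in range(n)]
    work = list(available)
    finished = [False] * n
    sequence = []
    progressed = True
    while progressed:
        progressed = False
        for i in range(n):
            if not finished[i] and all(nd <= w for nd, w in zip(need[i], work)):
                # Process i can get everything it still needs, run, and release its holdings.
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                sequence.append(i)
                progressed = True
    return sequence if all(finished) else None

if __name__ == "__main__":
    available = [3, 3, 2]
    allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
    maximum = [[7, 5, 3], [3, 2, 2], [9, 0, 2], [2, 2, 2], [4, 3, 3]]
    print(safe_sequence(available, allocation, maximum))
```

To decide on a request, an allocator built this way would tentatively apply the request to the available and allocation data, run the check, and grant the request only if a safe sequence still exists.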
4.9 Deadlock Detection and Recovery
In deadlock detection and recovery, the system grants access to each requesting process freely. It occasionally checks for deadlocks in order to reclaim the resources held by processes in deadlock. When the system is checked for deadlock, the detection algorithm examines all possible completion sequences for the incomplete processes. If a completion sequence exists, then the system is not in a deadlock state; otherwise the system is in a deadlock state and all incomplete processes are blocked. Deadlock detection is only one part of the deadlock detection and recovery process: detection only reveals a problem, it does not solve it. The system must then break the deadlock so that the processes can proceed. There are two options for deadlock recovery, viz. process termination and resource preemption. The first step in deadlock recovery is to identify the deadlocked processes. The next step is to roll back or restart one or more of the processes causing the deadlock. Restarting leads to the loss of the work done by a process, so the processes chosen are those that are least costly to roll back. The process is rolled back to the point where the deadlock is released. Such facilities are required by systems that need high reliability and/or availability, but these algorithms are dangerous when processes have made changes that cannot be rolled back.
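The search for a completion sequence can be sketched as follows. This is one possible formulation under stated assumptions: each process is described by what it holds and what it is currently requesting, and a process that can be satisfied is assumed to finish and release everything. The function name deadlocked_processes is hypothetical; the sample data models the three-tape-drive example from section 4.6.

```python
def deadlocked_processes(available, allocation, request):
    """Return the indices of processes that can never complete (empty list: no deadlock).

    available:  free units of each resource type
    allocation: allocation[i][j] = units of type j held by process i
    request:    request[i][j]    = units of type j that process i is waiting for
    """
    n = len(allocation)
    work = list(available)
    finished = [False] * n
    progressed = True
    while progressed:
        progressed = False
        for i in range(n):
            if not finished[i] and all(r <= w for r, w in zip(request[i], work)):
                # Process i can be satisfied, so assume it completes and frees its resources.
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                progressed = True
    return [i for i in range(n) if not finished[i]]

if __name__ == "__main__":
    # Three tape drives, one held by each process, and every process wants one more.
    print(deadlocked_processes([0], [[1], [1], [1]], [[1], [1], [1]]))
    # Same holdings, but the first process is not waiting for anything.
    print(deadlocked_processes([0], [[1], [1], [1]], [[0], [1], [1]]))
```

The first call reports all three processes as deadlocked; in the second, the first process can finish and release its drive, which lets the others complete in turn, so the list is empty.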
Chapter-IV CPU Scheduling
End Chapter quizzes:
Q1. A major problem with priority scheduling is _________.
a. Definite blocking b. Starvation c. Low priority d. None of the above
Q2. Mutual exclusion
a. if one process is in a critical region others are excluded b. prevents deadlock c. requires semaphores to implement d. is found only in the Windows NT operating system
Q3. In one of the deadlock prevention methods, impose a total ordering of all resource types, and require that each process requests resources in an increasing order of enumeration. This violates the _______________ condition of deadlock.
a. Mutual exclusion b. Hold and Wait c. Circular Wait d. No Preemption
Q4. In a multithreaded environment _______.
a. Each thread is allocated with new memory from main memory. b. Main threads terminate after the termination of child threads. c. Every process can have only one thread. d. None of the above
Chapter-V Memory Management
Contents: 5.1 Memory Management 5.2 Address Space 5.3 Memory Allocation Techniques 5.4 Swapping, Segmentation with paging 5.5 Virtual Memory, Demand Paging 5.6 Performance of Demand Paging 5.7 Page Replacement 5.8 Thrashing 5.9 Demand Segmentation
5.1 Memory Management
Memory management is a technique used by the O.S. to allocate physical memory of finite capacity to multiple requesting processes. In other words, memory management is concerned with the allocation and de-allocation of physical memory to user and system processes.
Address Binding: The binding of instructions and data to memory addresses is called address binding. It can take place at any of the following steps of processing.
(a) Compile time binding: If the address of the process in memory is known at compile time, absolute code can be generated. This type of binding is called compile time binding. With compile time binding, if the starting location in physical memory changes, the whole program has to be recompiled.
[Figure: Compile time binding (compiler, process, main memory)]
(b) Load time binding: If the final binding of addresses is done at load time, the binding is called load time binding. In this scheme, if the starting address changes, the program only has to be reloaded to incorporate the changed value.
[Figure: Load time binding (process, main memory)]
(c) Run time binding: If the address binding is done at run time, the scheme is called run time binding or execution time binding. Special hardware called the memory management unit is used for this kind of binding.
Memory management schemes are broadly divided into two major categories, i.e. Contiguous Allocation and Non-contiguous Allocation. In contiguous memory allocation, each logical object is placed in physical memory at consecutive addresses. A common approach with contiguous allocation is to partition the available physical memory and use the partitions to fulfill the requirements of requesting processes. Memory partitioning may be either static or dynamic.
Static Partitioning: In this scheme, memory partitions are created some time before the execution of user programs and these partitions remain fixed, which means that one cannot change them once they are defined. In static partitioning, the wasting of memory within a partition, due to a difference between the size of the partition and the size of the resident process within it, is called internal fragmentation.
Dynamic Partition Memory Allocation: In this scheme, memory partitions are created dynamically in response to process requests. Partition creation goes on until the whole memory has been utilized or the maximum allowable degree of multiprogramming is reached. The sizes of the partitions are not fixed, which means that they can change after they are defined. Dynamic partitioning removes the problem of internal fragmentation by making each partition only as large as necessary to satisfy the request of the incoming process. In dynamic partitioning, the wasting of memory between partitions, due to scattering of the free partitions, is called external fragmentation.
5.2 Address Space
The range of virtual addresses that the operating system assigns to a user or separately running program is called an address space. This is the area of contiguous virtual addresses available for executing instructions and storing data. The range of virtual addresses in an address space starts at zero and can extend to the highest address permitted by the operating system architecture.
Logical versus Physical Address Space: An address generated by the CPU is commonly referred to as a logical address, whereas an address seen by the memory unit (that is, the one loaded into the memory-address register of the memory) is commonly referred to as a physical address. The set of all logical addresses generated by a program is its logical-address space; the set of all physical addresses corresponding to these logical addresses is its physical-address space.
5.3 Memory Allocation Techniques
Memory allocation is of two kinds: Contiguous Memory Allocation. Non-contiguous Memory Allocation.
Contiguous Memory Allocation: In contiguous memory allocation, a memory-resident program occupies a single contiguous block of physical memory. The memory is partitioned into blocks of different sizes to accommodate the programs. The partitioning is of two kinds:
(a) Fixed Partitioning: The memory is divided into a fixed number of partitions of different sizes, chosen to suit the range of commonly occurring program sizes. Each partition can accommodate exactly one process, so the degree of multiprogramming is fixed. Whenever a program needs to be loaded, a partition big enough to accommodate it is allocated. Since a program may not exactly fit the allocated partition, some space may be left unoccupied after loading the program. This space is wasted and is termed Internal Fragmentation. Memory management is implemented using a table called the Partition Description Table (PDT). This table records the base and size of each partition, along with its status (F: Free or A: Allocated).
Example: Partition Description Table (PDT)
Partition Id   Partition Base   Partition Size   Partition Status
0              0                50 K             A
1              50 K             50 K             F
2              100 K            50 K             A
3              150 K            100 K            A
4              250 K            250 K            F
Physical Memory Space
(0-50 K)      Operating System (0-45 K)   (Internal Fragmentation: 5 K)
(50-100 K)    Free
(100-150 K)   Process 'B' (100-140 K)     (Internal Fragmentation: 10 K)
(150-250 K)   Process 'A' (150-230 K)     (Internal Fragmentation: 20 K)
(250-500 K)   Free
Advantages
1. The implementation is very simple.
2. The processing overheads are low.
Disadvantages
1. The degree of multiprogramming is fixed, since the number of partitions is fixed.
2. It suffers from internal fragmentation.
(b) Variable Partitioning: This scheme is free of the limitations encountered in the case of fixed partitioning.
Non-Contiguous Memory Allocation: It offers the following advantages over contiguous memory allocation:
Permits sharing of code and data amongst processes.
There is no external fragmentation of physical memory.
Supports the virtual memory concept.
However, non-contiguous memory allocation involves a complex implementation and additional costs in terms of memory and processing. Non-contiguous memory allocation can be implemented through Paging and Segmentation.
Paging: Paging permits the physical address space of a process to be non-contiguous. The logical address space of a process is divided into blocks of fixed size called Pages. The physical memory is likewise divided into blocks of fixed size called Frames. In a given system, pages and frames are of the same size, typically from 512 bytes to a few MB. Whenever a process is to be executed, its pages are moved from secondary storage (a fast disk) to available frames in physical memory. The frame number in which each page is resident is entered in the page table, which is indexed by page number.
Implementation of Paging: The system makes use of a page table to implement paging. When a process is to be loaded, its pages are moved to free frames in physical memory. The frame number where each page is stored is entered in the page table. During process execution, the CPU generates a logical address that comprises a page number (p) and an offset within the page (d). The page number p is used to index into the page table and fetch the corresponding frame number (f). The physical address is obtained by combining the frame number (f) with the offset (d).
Given a Logical Address L, How to Compute the Corresponding Physical Address? For an m-bit processor, the address will be m bits long. Let the page size be 2^n bytes. Then the lower order n bits of a logical address L represent the page offset (d) and the higher order m-n bits represent the page number (p). Then,
page number p = L / 2^n (integer quotient)
page offset d = L % 2^n
Let f be the frame number that holds the page referenced by logical address L. Then f can be obtained by indexing into the page table, using the page number p as the index, i.e.,
f = page-table[p];
Corresponding physical address = f * 2^n + d
The Memory Management Unit (MMU) uses this physical address to access the memory location referenced by logical address L. The maximum size of the logical address space of a process is 2^m bytes, i.e., up to 2^(m-n) pages.
Internal Fragmentation in Paging: Since a program size may not be an exact multiple of the page size, some space remains unoccupied in the last page of a process. This results in internal fragmentation. The average memory loss due to internal fragmentation is of the order of half a page per process. So, the larger the page size, the larger the loss of memory due to internal fragmentation. Suppose a system supports a page size of P bytes. Then a program of size M bytes, where M is not an exact multiple of P, will have an internal fragmentation of P - (M % P) bytes.
Limitations of the Basic Paging Scheme (discussed above) vis-a-vis Contiguous Memory Allocation:
The effective memory access time increases, since to access an operand its frame number must first be fetched. Since the page table resides in RAM itself, the effective access time to get an operand will be twice the RAM access time. So, if the RAM access time is 100 ns, the effective access time would be 200 ns.
The page table occupies a significant amount of memory.
Example: A logical address is laid out as: Page # (p) | Offset (d)
For a 32 bit processor with a page size of 1024 bytes:
Size of logical address = m = 32 bits
Page size = 2^n = 2^10 = 1024 bytes
Number of bits to represent page offset = n = 10
Number of bits to represent page number = m - n = 22
The low order 10 bits of a logical address represent the page offset and the higher order 22 bits represent the page number.
Max size of logical address space = 2^32 bytes = 4 GB.
Max number of pages in logical address space = 2^22 = 4 Million.
So, the max length of the page table of a process = 4 M entries, each entry being 4 bytes. A page table would therefore occupy 16 MB in RAM. If the physical memory size is 256 MB, the page table of a single process would occupy 1/16th of the whole memory, a substantial amount just to accommodate one process's page table. The page table is per process, so if 5 processes are memory resident simultaneously, their page tables would occupy 80 MB of RAM. Since the page table is per process, it also needs to be switched during context switching. Evidently, the Basic Paging Scheme needs some enhancements to reduce its memory overheads. These are discussed in the following paragraphs.
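The translation rule and the page-table sizing example above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the small demo page table are assumptions, while the 32-bit address, the 1024-byte page and the 4-byte entry come from the example.

```python
ADDRESS_BITS = 32                  # size of a logical address in bits
PAGE_OFFSET_BITS = 10              # page size = 2**10 = 1024 bytes
PAGE_SIZE = 1 << PAGE_OFFSET_BITS


def translate(logical_address, page_table):
    """Return the physical address for a logical address."""
    page = logical_address >> PAGE_OFFSET_BITS     # p = L / 2^n
    offset = logical_address & (PAGE_SIZE - 1)     # d = L % 2^n
    frame = page_table[page]                       # f = page-table[p]
    return frame * PAGE_SIZE + offset              # f * 2^n + d


def page_table_bytes(entry_bytes=4):
    """Size of a full single-level page table for one process."""
    entries = 1 << (ADDRESS_BITS - PAGE_OFFSET_BITS)   # 2^22 entries
    return entries * entry_bytes


# Hypothetical page table: page 0 is in frame 5, page 1 is in frame 2.
demo_table = {0: 5, 1: 2}
print(translate(1030, demo_table))            # page 1, offset 6 -> 2*1024 + 6 = 2054
print(page_table_bytes() // (1024 * 1024))    # 16 (MB)
```

Using a shift and a mask instead of division and modulo works only because the page size is a power of two, which is why page sizes are chosen that way.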
Translation Look-aside Buffer (TLB): The TLB is a page-table cache implemented in fast associative memory. Associative memory is distinguished by its ability to search for a key concurrently in all entries of a table. This property makes associative memory much faster than conventional RAM, but also much costlier. Because of the higher cost, it may not be cost-effective to hold the entire page table in the TLB, but the currently active subset of the page table can be moved to the TLB. It is implemented as follows:
Each entry of the TLB contains the Page # of a page and the Frame # where the page is stored in RAM.
Whenever a logical address is generated, the page number p of the logical address is fed as a key to the TLB. The key is searched in parallel in all the entries of the TLB.
If a match is found for the page number p, it is termed a TLB Hit. The entry with the matching page number contains the Frame # f where the page is stored. The frame number is used to access the desired physical location in RAM.
If no match is found for the page number, it is a TLB Miss. The frame number is then fetched from the page table. The page table entry is also moved to the TLB, so that for further references to that page its frame number can be obtained from the TLB itself. If the TLB is full, a replacement algorithm is used to replace one of the existing entries, for example replacement of the least recently used (LRU) entry.
This improves the effective memory access time, as illustrated by the following example:
Let TLB Hit Ratio H = 0.9 (this is the probability that an intended frame number is found in the TLB itself, with no need to look into the page table; the larger the TLB, the higher the TLB Hit Ratio).
Let RAM access time T = 100 ns
and TLB access time t = 20 ns
Effective memory access (with TLB) = H * (T + t) + (1 - H) * (2T + t)
= 0.9 * 120 + 0.1 * 220
= 108 + 22
= 130 ns
Effective memory access (without TLB) = 2T = 200 ns
Reduction in effective access time = (200 - 130) * 100 / 200 = 35%
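The arithmetic of the TLB example can be checked with a short Python sketch. The function name is an assumption; the figures used (hit ratio 0.9, RAM access 100 ns, TLB access 20 ns) are those of the example.

```python
def effective_access_time(hit_ratio, ram_ns, tlb_ns):
    """Effective memory access time with a TLB in front of the page table."""
    hit_cost = tlb_ns + ram_ns         # TLB lookup, then the operand
    miss_cost = tlb_ns + 2 * ram_ns    # TLB lookup, page table access, then the operand
    return hit_ratio * hit_cost + (1 - hit_ratio) * miss_cost


with_tlb = effective_access_time(0.9, ram_ns=100, tlb_ns=20)
without_tlb = 2 * 100                  # page table access plus operand access
reduction = (without_tlb - with_tlb) * 100 / without_tlb
print(round(with_tlb), round(reduction))   # 130 35
```

Trying other hit ratios shows how quickly the benefit grows: the miss cost stays fixed, so the effective time moves linearly from 220 ns at a hit ratio of 0 to 120 ns at a hit ratio of 1.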
Inverted Page Table (IPT): The size of the inverted page table is related to the size of the physical memory, not the size of the logical address space. Since each frame in physical memory has at most one entry in the inverted page table, the table can act as a system-wide table if each entry contains the process-id (of the process to which the page belongs) along with the Page #. Because the process-id of each page is available in the IPT, a single IPT is sufficient for the entire system. So, the IPT is not process-specific and does not need switching during context switching. A logical address generated by the CPU contains the process-id (p-id), page number (p) and page offset (d). A search is carried out in the IPT to find a match for the process-id and the Page #. The offset of the matching slot in the IPT gives the Frame # f where the desired page resides. The Frame # f, combined with the offset d, gives the intended physical address.
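A minimal Python sketch of an inverted page table lookup follows. The function names, the 1024-byte page size and the four-frame demo table are assumptions made for illustration; a real system would typically speed up the search rather than scan linearly.

```python
def ipt_lookup(ipt, pid, page):
    """Return the frame number holding (pid, page), or None if it is not resident."""
    for frame, entry in enumerate(ipt):
        if entry == (pid, page):
            return frame            # the index of the matching slot is the frame number
    return None


def ipt_translate(ipt, pid, page, offset, page_size=1024):
    """Combine the frame number found in the IPT with the page offset."""
    frame = ipt_lookup(ipt, pid, page)
    if frame is None:
        raise LookupError("page not resident")
    return frame * page_size + offset


# Hypothetical 4-frame memory: one entry per frame, None marks a free frame.
ipt = [(1, 0), (2, 3), None, (1, 7)]
print(ipt_translate(ipt, pid=1, page=7, offset=20))   # frame 3 -> 3*1024 + 20 = 3092
```

The table has one slot per physical frame, so its size does not change with the number of processes or the size of their logical address spaces.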
Multi-level Paging: The page table is split into multiple levels. For example, in a two-level page table, a logical address comprises the following fields:
Page number p1, for indexing into the outer page table. Each entry in the outer page table contains the base address of an inner page table. So, indexing the outer page table with p1 selects the intended inner page table.
Base address of inner page table = Outer-page-table[p1];
Page number p2, for indexing into the inner page table selected by p1. Each entry of an inner page table contains the number of the frame that holds the intended page. So, indexing the selected inner page table with p2 yields the frame that contains the intended page.
Frame number f = Inner-page-table[p2];
Offset or displacement d, which is used to index into the selected frame to obtain the desired operand.
Physical address = f * page-size + d;
The combined length of all the inner page tables is the same as the length of the single page table in single-level paging.
Example: Suppose
Size of logical address = m = 32 bits
Page size = 2^n = 2^10 = 1024 bytes
Number of bits to represent page offset = n = 10
Number of bits to represent page number = m - n = 22
Length of page table (for single-level paging) = 2^22 = 4 M entries
Suppose, for two-level paging, that of the total 22 bits representing the page number, 12 bits are used for the outer page number and 10 bits for the inner page number. Then,
Length of outer page table = 2^12 = 4 K entries
Number of outer page tables = 1
Length of each inner page table = 2^10 = 1 K entries
Number of inner page tables = 2^12 = 4 K
Combined length of all inner page tables = 1 K x 4 K = 4 M entries
This is exactly equal to the page table length in single-level paging. But in two-level paging, all inner page tables need not be memory resident at the same time. Suppose the executing program has a logical address space of 32 MB. This comprises 32 K pages. Each inner page table can address 1 K pages. So, only 32 inner page tables (out of the total 4 K inner page tables) need to be memory resident. Thus, it reduces the memory overhead of the page table.
Advantage of Multi-level Paging: All the inner page tables are not required to be memory resident simultaneously. Depending upon the size of the executing program, only a small fraction of the inner page tables needs to be memory resident, thus reducing the memory overhead of the page table.
Disadvantage of Multi-level Paging: To access an operand, multi-level paging needs extra memory accesses. For example, in two-level paging an additional memory access is required to get the base address of the inner page table.
Hashed Page Table: A page table of length M is created. Whenever a logical address is generated, a hashing function is applied to the page number p to generate an index value i:
i = p % M;
The index value i is used to index into the page table. Each entry in the page table is a pointer to a linked list. A node in the linked list provides the mapping between a page number p and the corresponding frame number f. Each node contains the following information:
Page number (say p1), such that p1 % M == i.
Frame number f1, where page number p1 resides.
Pointer to the next node in the list.
So, the linked list accessed through the index value i is traversed until a match is found for page number p, and the corresponding frame number f is obtained.
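The lookup described above can be sketched in Python as follows. The class and method names are assumptions, and each chain is held in an ordinary Python list standing in for the linked list of nodes.

```python
class HashedPageTable:
    def __init__(self, length):
        self.length = length                          # M, the table length
        self.buckets = [[] for _ in range(length)]    # one chain per table entry

    def insert(self, page, frame):
        """Add a (page, frame) node to the chain selected by i = p % M."""
        self.buckets[page % self.length].append((page, frame))

    def lookup(self, page):
        """Walk the chain at index i = p % M until the page number matches."""
        for candidate, frame in self.buckets[page % self.length]:
            if candidate == page:
                return frame
        return None                                   # page is not memory resident


table = HashedPageTable(length=8)
table.insert(page=3, frame=40)
table.insert(page=11, frame=17)    # 11 % 8 == 3, so it shares a chain with page 3
print(table.lookup(11))            # 17
```

A longer table gives shorter chains and faster lookups at the cost of more table memory, which is the trade-off behind the choice of M.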
Using a hashed table, we can have a page table of any length. Translation from logical address to physical address involves indexing into the page table and then traversing the linked list looking for a match of the page number. Traversing the linked list is sequential and therefore time-consuming. But this scheme reduces the memory overhead of the page table, since linked list nodes are created only for those pages that are memory resident.
Segmentation: In segmentation, the logical address space is divided into segments of varying sizes. Each segment is assigned a unique segment number. Memory management is done through a segment table, which is indexed by segment number. For each segment, the table has an entry that provides (a) the base address of the segment and (b) the size of the segment. The logical address contains the segment number 's' and the offset within the segment 'd'. Using the segment number, the system obtains the base address of the segment. It then checks whether the offset is within the segment size. If it is, the offset d is valid and the physical address is computed by adding the offset to the base address; otherwise it is an error.
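The segment-table check and address computation can be sketched as below. The function name and the base and size values in the demo table are hypothetical.

```python
def segment_translate(segment_table, s, d):
    """Translate (segment number s, offset d) into a physical address."""
    base, size = segment_table[s]
    if not 0 <= d < size:              # the offset must lie within the segment
        raise ValueError("offset outside segment")
    return base + d


# Hypothetical segment table: segment number -> (base address, size).
segment_table = {0: (1400, 1000), 1: (6300, 400)}
print(segment_translate(segment_table, 1, 53))   # 6300 + 53 = 6353
```

Unlike paging, the offset is added to the base rather than concatenated with it, because segments can start at any address and have any length.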
5.4 Swapping and Segmentation with paging
Swapping is a technique in which a suspended or preempted process is moved from main memory to secondary memory and later brought back into main memory. Moving a process from main memory to secondary memory is called swapping out, and bringing it back is called swapping in. The part of the operating system that performs this action is called the swapper. The swapper performs the following functions:
1. It selects, from among the blocked or unblocked processes, the processes to swap out.
2. It selects the processes to swap in.
3. It allocates and manages the swap space.
In segmentation with paging, the logical address space is divided into a number of segments of varying sizes and each segment is divided into a number of pages of a fixed size. Memory management is done through a segmentation table. Each segment has an entry in the table. An entry contains the base address of the segment's page table and the size of that page table. The logical address contains the segment number 's', the page number 'p' within the segment, and the offset 'd' within the page. The segment number 's' is used to access the segment's entry in the segmentation table. The page table base address 'B' in the entry is used to access the segment's page table, so each segment has a separate page table. The page number 'p' is used to obtain the frame number 'f' from the page table, provided p < M (where M is the size of the segment's page table); otherwise it is an invalid page number. The frame number 'f' is combined with the offset 'd' to compute the physical address.
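A small Python sketch of this two-step translation is given below. The names, the 1024-byte page size and the demo tables are assumptions; each segment's page table is modelled as a list of frame numbers, so its length plays the role of M.

```python
SEG_PAGE_SIZE = 1024    # assumed page size in bytes


def seg_page_translate(segment_table, s, p, d):
    """Translate (segment s, page p, offset d) into a physical address."""
    page_table = segment_table[s]        # each segment has its own page table
    if not 0 <= p < len(page_table):     # p must be below M, the page-table size
        raise ValueError("invalid page number")
    f = page_table[p]                    # frame number for page p of segment s
    return f * SEG_PAGE_SIZE + d


# Hypothetical layout: segment 0 has 2 pages, segment 1 has 3 pages.
segment_table = [[8, 3], [5, 9, 2]]
print(seg_page_translate(segment_table, 1, 2, 100))   # frame 2 -> 2*1024 + 100 = 2148
```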
5.5 Virtual Memory, Demand Paging
Virtual memory is a memory management scheme that allows partial loading of the virtual address space of a resident process into physical memory. In real memory management, when a process is swapped out from main memory, the entire process image has to be swapped out, and later the entire image has to be swapped back in. These schemes are easy to implement, but if the size of physical memory is limited, the number of active processes is also limited. To overcome this limitation, virtual memory allows the execution of processes that may not be completely in main memory. Thus virtual memory is a memory management technique; it is not a part of main or secondary memory. Its most visible advantage is that it can execute programs whose size is greater than the capacity of the available physical memory.
Advantages of virtual memory over the real memory system:
1. A process is no longer limited to the amount of main memory available.
2. The degree of multiprogramming increases, as each user program takes less main memory space.
3. I/O operations take less time when a process is loaded or swapped out.
In demand paging, the execution of a process starts with at least one page in main memory; this page must contain the starting point of execution of the process. There is no need to load all the pages into main memory. When a memory reference is made to a page that is not in main memory, a page fault occurs, which raises an interrupt, and the operating system brings the required page into main memory. Because pages are loaded only when they are demanded, the technique is called demand paging.
5.6 Performance of Demand Paging
Page fault rate always lies between zero and one. Let p = probability of a page fault (the page fault rate), with 0 <= p <= 1.
If p = 0, there are no page faults.
If p = 1, every reference results in a page fault.
Effective Access Time (EAT)
EAT = (1 - p) * hit-time + p * miss-time
It may also be defined as
EAT = (1 - p) * effective_memory_access + p * (page fault overhead + [swap out] + swap in + restart overhead)
Let us consider an example:
– effective_memory_access is 107 nanoseconds (with a TLB, when TLB access takes 5 nanoseconds).
– Memory access time is 100 nanoseconds.
– Page fault overhead is 100 microseconds.
– Page swap time is 10 milliseconds.
– 50% of the time, the page being replaced has been modified and therefore needs to be swapped out.
– Restart overhead is 20 microseconds.
– p is the page fault rate.
Working in nanoseconds:
Effective Access Time = 107*(1-p) + (100000 + 10000000 + 0.5*10000000 + 20000 + 5 + 200)*p
EAT = 107*(1-p) + 15,120,205*p
Suppose p is 1%. Then EAT = 107*0.99 + 15,120,205*0.01, which is approximately 151,308 nanoseconds.
For the luxury of virtual memory to cost only 10% overhead, we need p to be around 0.00000066. This means one page fault for every 1500000 memory accesses.
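The effective access time formula can be evaluated for any fault rate with a short Python sketch. The function and parameter names are assumptions; the default values are the figures of the example, with the trailing 5 ns and 200 ns terms of the formula kept together as a single parameter.

```python
NS_PER_US = 1_000
NS_PER_MS = 1_000_000


def demand_paging_eat(p, hit_ns=107, fault_overhead_us=100, swap_ms=10,
                      dirty_fraction=0.5, restart_us=20, extra_access_ns=205):
    """Effective access time in nanoseconds for page fault rate p."""
    miss_ns = (fault_overhead_us * NS_PER_US              # page fault overhead
               + swap_ms * NS_PER_MS                      # swap the page in
               + dirty_fraction * swap_ms * NS_PER_MS     # swap out a modified victim
               + restart_us * NS_PER_US                   # restart overhead
               + extra_access_ns)                         # the 5 + 200 ns terms of the formula
    return (1 - p) * hit_ns + p * miss_ns


print(round(demand_paging_eat(0.01)))    # about 151308 ns
print(demand_paging_eat(0.0))            # 107.0 ns when there are no faults
```

Even a small fault rate dominates the result because a fault costs about fifteen milliseconds while a hit costs about a tenth of a microsecond.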
5.7 Page Replacement
The following are the page replacement algorithms:
1. FIFO
2. Least Recently Used (LRU)
3. Optimal Page Replacement
4. Clock Page Replacement
5. Least Frequently Used (LFU)
6. Most Frequently Used (MFU)
1. FIFO Page Replacement Algorithm
It replaces the page that has been in memory the longest. One possible implementation is a FIFO queue of the pages resident in memory, with the oldest page at the head of the queue. Whenever a page fault occurs, the page at the head of the queue is made the victim and the new page is put at the tail of the queue.
Reference String: A reference string is the sequence of page numbers referenced by a program during its execution.
Page number = Quotient (Logical Address / Page-size)
Therefore, if logical address = 0745 and page size = 100 bytes, then page number = Quotient (0745/100) = 7.
Assume the reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1 and a set of 3 frames available for allocation. The table below has one row per page fault; references that hit a resident page change nothing and are omitted.
Fault   Page referenced   Page Frames   Page in   Page out   FIFO Queue (newest first)
1       7                 7 - -         7         -          7
2       0                 7 0 -         0         -          0 7
3       1                 7 0 1         1         -          1 0 7
4       2                 2 0 1         2         7          2 1 0
5       3                 2 3 1         3         0          3 2 1
6       0                 2 3 0         0         1          0 3 2
7       4                 4 3 0         4         2          4 0 3
8       2                 4 2 0         2         3          2 4 0
9       3                 4 2 3         3         0          3 2 4
10      0                 0 2 3         0         4          0 3 2
11      1                 0 1 3         1         2          1 0 3
12      2                 0 1 2         2         3          2 1 0
13      7                 7 1 2         7         0          7 2 1
14      0                 7 0 2         0         1          0 7 2
15      1                 7 0 1         1         2          1 0 7
Number of Page Faults in FIFO = 15
Plus points of FIFO Algorithm: Implementation is fairly simple.
Limitations of FIFO Algorithm: This algorithm does not take into account the current usage of pages and may often eject pages that are currently active. Such pages need to be brought in again in the near future. Also, if the system has global page replacement, a program with a large number of allocated pages is likely to own the oldest page and so to lose pages often. More generally, under FIFO, giving a process more frames can, for some reference strings, increase its number of page faults. This phenomenon is called Belady's Anomaly and it defies intuition.
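The FIFO fault count can be reproduced with a short Python simulation. The function name is an assumption; the first reference string is the one used in the text, while the second is a commonly used illustration (not taken from the text) in which FIFO produces more faults with four frames than with three.

```python
from collections import deque


def fifo_faults(reference_string, frame_count):
    """Count page faults under FIFO replacement."""
    resident = set()
    queue = deque()                    # oldest resident page at the left
    faults = 0
    for page in reference_string:
        if page in resident:
            continue                   # a hit does not change the FIFO order
        faults += 1
        if len(resident) == frame_count:
            victim = queue.popleft()   # evict the page that has been resident longest
            resident.remove(victim)
        resident.add(page)
        queue.append(page)
    return faults


refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_faults(refs, 3))                               # 15

belady = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(belady, 3), fifo_faults(belady, 4))     # 9 10
```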
2. Least Recently Used (LRU)
It replaces the LRU page, i.e., the page that has been used least recently. So, while choosing a resident page for replacement, the algorithm takes its current usage into account. It is presumed that the page used least recently is the one least likely to be accessed in the near future. One implementation of this algorithm uses a stack. Whenever a new page is brought in, it is placed at the top of the stack. Whenever a resident page is accessed, it is removed from its current position and moved to the top of the stack. Whenever a page is to be replaced, the victim is chosen from the bottom of the stack. Let us see the performance of this algorithm for the above reference string. The table has one row per reference; Page in and Page out are filled only when a page fault occurs.
Page referenced   Page Frames   Page in   Page out   Special Stack (top first)
7                 7 - -         7         -          7
0                 7 0 -         0         -          0 7
1                 7 0 1         1         -          1 0 7
2                 2 0 1         2         7          2 1 0
0                 2 0 1                              0 2 1
3                 2 0 3         3         1          3 0 2
0                 2 0 3                              0 3 2
4                 4 0 3         4         2          4 0 3
2                 4 0 2         2         3          2 4 0
3                 4 3 2         3         0          3 2 4
0                 0 3 2         0         4          0 3 2
3                 0 3 2                              3 0 2
2                 0 3 2                              2 3 0
1                 1 3 2         1         0          1 2 3
2                 1 3 2                              2 1 3
0                 1 0 2         0         3          0 2 1
1                 1 0 2                              1 0 2
7                 1 0 7         7         2          7 1 0
0                 1 0 7                              0 7 1
1                 1 0 7                              1 0 7
Number of Page Faults in LRU = 12
Plus points of LRU Algorithm: While selecting a resident page for replacement, it takes the current usage of pages into consideration. The algorithm is free from Belady's Anomaly.
Limitations of LRU Algorithm: The algorithm has a lot of processing overhead, which is needed to keep track of the LRU page.
3. Optimal (OPT) Page Replacement
In its ideal form, this algorithm replaces the page that will be referenced furthest in the future. Since it requires knowledge of the future, its ideal form is not practically realizable. The significance of this algorithm is only theoretical: it is used as a yardstick against which the performance of practically realizable algorithms is compared. For the same reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1, the table below has one row per page fault.
Fault   Page referenced   Page Frames   Page in   Page out
1       7                 7 - -         7         -
2       0                 7 0 -         0         -
3       1                 7 0 1         1         -
4       2                 2 0 1         2         7
5       3                 2 0 3         3         1
6       4                 2 4 3         4         0
7       0                 2 0 3         0         4
8       1                 2 0 1         1         3
9       7                 7 0 1         7         2
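The LRU and optimal traces can be reproduced with the following Python sketch. The function names are assumptions, and the optimal version is only possible here because the whole reference string is known in advance.

```python
def lru_faults(refs, frame_count):
    """Count page faults under LRU, keeping resident pages in recency order."""
    stack = []                         # least recently used first, most recent last
    faults = 0
    for page in refs:
        if page in stack:
            stack.remove(page)         # a hit moves the page to the top
        else:
            faults += 1
            if len(stack) == frame_count:
                stack.pop(0)           # evict from the bottom of the stack
        stack.append(page)
    return faults


def opt_faults(refs, frame_count):
    """Count page faults when the page used furthest in the future is evicted."""
    frames = []
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == frame_count:
            future = refs[i + 1:]
            # Pages never used again rank as furthest away.
            victim = max(frames,
                         key=lambda q: future.index(q) if q in future else len(future))
            frames.remove(victim)
        frames.append(page)
    return faults


refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(lru_faults(refs, 3), opt_faults(refs, 3))    # 12 9
```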
Number of page faults in Optimal = 09
Going by the number of page faults, the Optimal Algorithm is the best, but it is not feasible to implement, since it requires knowledge of the future. Of the FIFO and LRU algorithms, LRU is better, since it takes the current usage of pages into consideration whenever a resident page is considered for replacement. The page that has been least recently used is chosen for replacement, and the algorithm is free of Belady's Anomaly.
4. Clock Algorithm
This algorithm combines the relatively low overhead of the FIFO algorithm with the page-usage consideration of the LRU algorithm. It is also sometimes referred to as Not Recently Used (NRU). It replaces a resident page that has not been accessed in the near past. It works as follows:
The algorithm maintains a circular list of all resident pages. A referenced bit is associated with each page and is set whenever the page is referenced.
Either at regular intervals or whenever the number of free page frames falls below a preset threshold, the algorithm sweeps the circular list. While sweeping, it operates as follows:
It inspects the referenced bit of each page. If the referenced bit of a page is set, the page has been referenced in the near past, i.e., since the last sweep. The algorithm clears the referenced bit and proceeds to the next page.
If the referenced bit of a page is not set, the page has not been referenced since the last sweep. If the modified bit of the page is not set, the page is declared non-resident and its frame is freed. If the modified bit is set, the page is scheduled for writing to disk.
The algorithm continues the sweep until the required number of frames has been freed. Next time, it resumes from the point where it left off (as indicated by the pointer).
5. Least Frequently Used (LFU)
In this algorithm, a count is associated with each resident page and is incremented by one whenever the page is referenced. Whenever a replacement is necessary, the page with the least count is replaced. The main drawback of this algorithm is that some pages may have very high usage initially and build up a high count. Such pages remain memory resident because of the high count even if their usage is low subsequently. One solution is to shift the count of each resident page right by one (divide by two) occasionally. The count then becomes an exponentially decaying average usage count.
6. Most Frequently Used (MFU)
This algorithm replaces the page with the largest usage count. It is based on the assumption that pages with smaller counts have been brought in only recently and need to remain resident. This algorithm is very close to the FIFO algorithm and shares the anomalies of FIFO.
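The sweep of the clock algorithm described above can be sketched in Python as follows. The class name, the dictionary layout of a page and the limit of one revolution per sweep are assumptions made for illustration; the sketch only records which pages would be written to disk and does not model the write itself.

```python
class ClockSweeper:
    def __init__(self, pages):
        # Each page is a dict with 'id', 'referenced', 'modified' and 'resident' keys.
        self.pages = pages
        self.hand = 0                  # position where the next sweep resumes

    def sweep(self, frames_needed):
        """Free up to frames_needed frames; return (freed ids, ids queued for writing)."""
        freed, write_queue = [], []
        for _ in range(len(self.pages)):           # at most one revolution per sweep
            if len(freed) >= frames_needed:
                break
            page = self.pages[self.hand]
            self.hand = (self.hand + 1) % len(self.pages)
            if not page["resident"]:
                continue
            if page["referenced"]:
                page["referenced"] = False         # used since the last sweep: spare it
            elif page["modified"]:
                write_queue.append(page["id"])     # dirty: schedule a write to disk
            else:
                page["resident"] = False           # clean and unreferenced: free the frame
                freed.append(page["id"])
        return freed, write_queue


pages = [
    {"id": "A", "referenced": True, "modified": False, "resident": True},
    {"id": "B", "referenced": False, "modified": True, "resident": True},
    {"id": "C", "referenced": False, "modified": False, "resident": True},
    {"id": "D", "referenced": False, "modified": False, "resident": True},
]
sweeper = ClockSweeper(pages)
print(sweeper.sweep(1))    # (['C'], ['B'])
print(sweeper.sweep(1))    # (['D'], [])
```

The second call starts at page D because the hand position is kept between sweeps, which is what gives every page a full revolution before it is considered again.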
5.8 Thrashing
In page replacement, when a page fault occurs, the new page may replace a resident page of any process. This can lead to a serious problem when the number of frames held by a process falls below the minimum number specified by the architecture. If a process does not have this number of frames, it will page fault very quickly and repeatedly, and will spend more time paging than executing. This high paging activity is called thrashing. As the degree of multiprogramming increases, CPU utilization also increases until a maximum is reached, say at point T. If the degree of multiprogramming is increased beyond point T, thrashing sets in and CPU utilization drops sharply, because the system spends most of its time shuttling pages between main memory and secondary memory.
Figure – CPU utilization versus degree of multiprogramming
5.9 Demand Segmentation
Demand segmentation is used when there is insufficient hardware to implement demand paging.
– OS/2 allocates memory in segments, which it keeps track of through segment descriptors.
– A segment descriptor contains a valid bit to indicate whether the segment is currently in memory.
– If the segment is in main memory, access continues.
– If it is not in memory, a segment fault occurs.
Hybrid Scheme: Segmentation with demand paging
Chapter-V Memory Management
End Chapter quizzes:
Q1. Virtual Memory is commonly implemented by __________.
a. Segmentation
b. Swapping
c. Demand Paging
d. None of the above
Q2. _________ page replacement algorithm suffers from Belady's anomaly.
a. LRU
b. MRU
c. FIFO
d. LIFO
Q3. Paging _________.
a. solves the memory fragmentation problem
b. allows modular programming
c. allows structured programming
d. avoids deadlock
Q4. In the memory management technique called paging, physical memory is broken into fixed-sized blocks called ___________.
a. Pages
b. Frames
c. Blocks
d. Segments
Q5. Virtual memory is __________.
a. An extremely large main memory
b. An extremely large secondary memory
c. An illusion of extremely large main memory
d. A type of memory used in super computers
Chapter-VI File System Interface
Contents:
6.1 File Concept
6.2 Access Methods
6.3 Directory Structure
6.4 Protection
6.5 File System Structure and Allocation Methods
6.1 File Concept
One of the most important functions of an operating system is the effective management of information. The modules of the operating system dealing with the management of information are known as the file system. The file system provides the mechanism for online storage of and access to both data and programs. The file system resides permanently on secondary storage, whose main requirement is that it must be able to hold a large amount of data permanently. The desirable features of a file system are:
Minimal I/O operations.
Flexible file naming facilities.
Automatic allocation of file space.
Dynamic allocation of file space.
Unrestricted flexibility between logical record size and physical block size.
Protection of files against illegal forms of access.
Static and dynamic sharing of files.
Reliable storage of files.
In general a file system consists of two main parts:
Files: collections of logically related data items.
Directory: a collection of related files.
A directory structure organizes all the files and contains information about every file in the system.
6.2 Access Methods
There are several different ways in which the data stored in a file may be accessed for reading and writing. The operating system is responsible for supporting these file access methods. The fundamental methods for accessing information in a file are (a) sequential access, in which information must be accessed in the order it is stored in the file, (b) direct access, and (c) index sequential access.
a. Sequential access method
A sequential file is the most primitive of all file structures. It has no directory and no linking pointers. The records are generally organized in a specific sequence according to a key field. In other words, a particular attribute is chosen whose value determines the order of the records. Access proceeds sequentially from start to finish. Operations to read or write the file need not specify the logical location within the file, because the operating system maintains a file pointer that determines the location of the next access. Sometimes, when the attribute value is the same for a large number of records, a second key is chosen to give an order when the first key fails to discriminate. Use of a sequential file requires data to be sorted in the desired sequence according to the key field before storing or processing them.
Its main advantages are:
It is easy to implement.
It provides fast access to the next record if the records are accessed in lexicographic order.
Its disadvantages are:
It is difficult to update, and insertion of a new record may require moving a large proportion of the file.
Random access is extremely slow.
b. Direct access method
In direct access file organization, any record can be accessed irrespective of the current position in the file. Direct access files are created on direct access storage devices. Whenever a record is to be inserted, its key value is mapped to an address using a hashing function, and the record is stored at that address. The advantage of direct access file organization is realized when the records are to be accessed randomly (not sequentially). Otherwise this organization has a number of limitations, such as (a) poor utilization of the I/O medium and (b) time consumed in calculating record addresses.
c. Index sequential access method
The index sequential file organization is essentially a compromise between the sequential and direct file organizations. It uses indexing to locate the desired records, and the records are stored in sorted order. The principle is to store and access small groups of records in an essentially sequential manner and to provide a hierarchical index that determines which group contains a particular record. Thus, instead of searching the whole file sequentially, the search is reduced to a small group of records. In this method, a track index table is maintained. The track index indicates the smallest and largest student number on each track, as shown in the figure.
Figure – Track index and the records stored on each track (tracks #1 to #6; only #1 to #3 are filled in)
Track Number    Start    End    Records on the track
#1              1        14     1, 5, 7, 12, 14
#2              17       30     17, 21, 26, 29, 30
#3              33       42     33, 35, 38, 40, 42
...
#6              ----     ----   ----
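The two-step search described above (consult the index, then scan one small group) can be sketched as follows. This is a minimal illustration, not the book's implementation: the student numbers are the ones that appear in the figure, while the names `TRACKS`, `TRACK_INDEX`, `find_track` and `lookup` are assumptions introduced for the example.

```python
from bisect import bisect_left

# Records stored on each track (student numbers taken from the figure).
TRACKS = {
    1: [1, 5, 7, 12, 14],
    2: [17, 21, 26, 29, 30],
    3: [33, 35, 38, 40, 42],
}

# Track index: (largest key on the track, track number), sorted by key.
TRACK_INDEX = sorted((records[-1], track) for track, records in TRACKS.items())


def find_track(key):
    """Return the track whose key range may contain key, or None."""
    ends = [end for end, _ in TRACK_INDEX]
    pos = bisect_left(ends, key)          # first track whose End >= key
    if pos == len(ends):
        return None                       # key is larger than every End
    track = TRACK_INDEX[pos][1]
    if key < TRACKS[track][0]:            # key falls in a gap between tracks
        return None
    return track


def lookup(key):
    """Index step first, then a short sequential scan of one track."""
    track = find_track(key)
    if track is None:
        return None
    for position, record in enumerate(TRACKS[track]):
        if record == key:
            return track, position
    return None
```

For example, `lookup(26)` consults the index, selects track 2, and then scans only the five records on that track instead of the whole file.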
6.3 Directory Structure
The file systems of computers can be extensive. Some systems store thousands of files on hundreds of gigabytes of disk. To manage all these data, we need to organize them. This organization is usually done in two parts. First, the file system is broken into partitions, known as volumes in the PC world. In this way, the user needs to be concerned with only the logical directory and file structure, and can ignore completely the problems of physically allocating space for files. For this reason partitions can be thought of as virtual disks. Second, each partition contains information about the files within it. This information is kept in a device directory or volume table of contents. The device directory (more commonly known simply as a “directory”) records information such as name, location, size and type for all files on that partition. The following are the different types of directory structure.
Single-Level Directory: The simplest directory structure is the single-level directory. A single-level system has only one directory. All files are contained in the same directory, which is easy to support and understand. Names in that directory refer to files or other non-directory objects. Such a system is practical only on systems with a very limited number of files. A single-level directory has significant limitations when the number of files increases or when there is more than one user. Since all files are stored in the same directory, the name given to each file must be unique. If two users give the same name to their files, the names collide.
Figure – Single Level Directory
Even with a single user, as the number of files increases, it becomes difficult to remember the names of all the files so as to create only files with unique names. It is not uncommon for a user to have hundreds of files on one computer system and an equal number of additional files on another system. In such an environment, keeping track of so many files is a daunting task.
Two-Level Directory: The disadvantage of a single-level directory is confusion of file names. The standard solution is to create a separate directory for each user. In a two-level system, only the root-level directory may contain names of directories, and all other directories refer only to non-directory objects. In the two-level directory structure, each user has his/her own user file directory (UFD). Each UFD has a similar structure, but lists only the files of a single user. When a user job starts or a user logs in, the system’s master file directory is searched. The master file directory is indexed by user name or account.
Figure – Two-Level Directory (master file directory with entries ajay, moham and raj, each pointing to a user directory)
Tree-Structured Directories: A tree system allows growth of the tree beyond the second level. Any directory may contain names of additional directories as well as non-directory objects. This generalization allows users to create their own sub-directories and to organize their files accordingly. The MS-DOS system, for instance, is structured as a tree. In fact, a tree is the most common directory structure. The tree has a root directory. Every file in the system has a unique path name. A path name is the path from the root, through all the subdirectories, to a specified file.
Figure – Tree-Structured Directory (nodes a, A1, A2, A23, B, B1, B2, C, C1, C2, C3)
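Resolving a path name in a tree-structured directory can be sketched as a walk from the root, one component at a time. The sketch below is only an illustration: the node names are borrowed from the tree figure, but the exact parent/child layout, the dictionary representation and the function `resolve` are assumptions made for the example.

```python
# Directories are dicts; files are strings holding their contents.
# The layout below is a hypothetical arrangement of the figure's node names.
ROOT = {
    "a": {"A1": "file A1", "A2": {"A23": "file A23"}},
    "B": {"B1": "file B1", "B2": "file B2"},
    "C": {"C1": "file C1", "C2": "file C2", "C3": "file C3"},
}


def resolve(path, root=ROOT):
    """Follow an absolute path name from the root to a file or directory."""
    if not path.startswith("/"):
        raise ValueError("absolute path names start at the root")
    node = root
    for name in filter(None, path.split("/")):
        if not isinstance(node, dict):
            raise NotADirectoryError(name)    # tried to descend into a file
        if name not in node:
            raise FileNotFoundError(path)
        node = node[name]
    return node
```

Because every file is reached by exactly one sequence of directory names, `resolve("/a/A2/A23")` identifies a single file, which is what makes the path name unique.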
Acyclic-Graph Directories: Sharing of files is another important issue in deciding the directory structure. If more than one user is working on a common project, the files associated with that project should be placed in a common directory that can be shared among those users.
Figure – Acyclic-Graph Directory (nodes A, A1, A2, A3, A4, B, B1, B2, B3)
The important characteristic of sharing is that a change made by one user to a shared file is reflected to the other users as well. In this way a shared file is not the same as two copies of the file. With two copies, each programmer works on his/her own copy rather than the original, so if one programmer changes the file, the changes will not appear in the other’s copy. With a shared file, there is only one actual file, so any change made by one person is immediately visible to the other.
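The difference between a shared file and a copy can be made concrete with a small in-memory model. This is a sketch under stated assumptions: the class `File` and the directory dictionaries are hypothetical stand-ins for real directory entries, where a shared entry refers to the same file object and a copied entry refers to a separate one.

```python
import copy


class File:
    """A file reduced to its contents, for illustration only."""

    def __init__(self, text):
        self.text = text


# Two user directories map names to File objects.
report = File("draft 1")
dir_a = {"report": report}
dir_b_shared = {"report": report}               # link: the same File object
dir_b_copied = {"report": copy.copy(report)}    # copy: a separate File object

dir_a["report"].text = "draft 2"                # user A edits the file

shared_view = dir_b_shared["report"].text       # sees user A's change
copied_view = dir_b_copied["report"].text       # still the old contents
```

Here `shared_view` is `"draft 2"` while `copied_view` remains `"draft 1"`, mirroring the behaviour described in the paragraph above.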
We have already studied that a file is a collection of related information. Therefore it needs protection from both physical damage (reliability) and improper access (protection). A file system can be damaged by hardware problems, such as power failures, head crashes, dirt, very high temperatures, and so on. Files can also be deleted, knowingly or unknowingly. Reliability is generally achieved by making duplicate copies of the disk files on another storage device, say a tape drive, at regular intervals. Protection can be achieved by limiting use. For complete protection, the system would not permit any access to the files of other users. A typical protection mechanism instead provides controlled access by limiting the types of file access that can be made. There are many different protection mechanisms, and the mechanism that suits a heavily used system is not necessarily needed on a system that is lightly used. Therefore the protection mechanism varies from system to system. A very simple protection mechanism is to associate an access list with each file and directory. When a process requests access to a file, the operating system checks the access list associated with that file. If the requesting user is in the access list for the requested type of access, the request is granted; otherwise the system flags an error, such as a protection violation. The main problem with this scheme is that it is generally not known in advance which user is going to use which file, so the lists are tedious to construct. Additionally, it increases the directory size. The problem is overcome by dividing the users into three categories: owner, group and others.
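The owner/group/others scheme can be sketched as a lookup of the requester's category followed by a check of the operations allowed for that category. The following is a minimal illustration, not any particular system's implementation; `FileEntry`, `category`, `check_access` and the sample users are names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    owner: str
    group: str
    # Allowed operations per category, e.g. {"owner": {"read", "write"}}.
    permissions: dict = field(default_factory=dict)


def category(user, user_groups, entry):
    """Classify the requesting user as owner, group, or others."""
    if user == entry.owner:
        return "owner"
    if entry.group in user_groups:
        return "group"
    return "others"


def check_access(user, user_groups, entry, operation):
    """Grant the request or raise a protection violation."""
    allowed = entry.permissions.get(category(user, user_groups, entry), set())
    if operation not in allowed:
        raise PermissionError(f"protection violation: {user} cannot {operation}")
    return True


notes = FileEntry(
    owner="ajay",
    group="project",
    permissions={"owner": {"read", "write"}, "group": {"read"}, "others": set()},
)
```

With this entry, the owner may read and write, members of the `project` group may only read, and everyone else is refused. Only three small sets are stored per file, rather than one list entry per user.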
6.5 File System Structure and Allocation Methods
An operating system's file system structure is its most basic level of organization. Almost all of the ways an operating system interacts with its users, applications, and security model are dependent upon the way it stores its files on a storage device. It is crucial for a variety of reasons that users, as well as programs, be able to refer to a common guideline to know where to read and write files. A file system can be seen in terms of two different logical categories of files:
• Shareable vs. unshareable files
• Variable vs. static files
Shareable files are those that can be accessed by various hosts; unshareable files are not available to any other hosts. Variable files can change at any time without any intervention; static files, such as read-only documentation and binaries, do not change without an action from the system administrator or an agent that the system administrator has placed in motion to accomplish that task. The reason for looking at files in this way is to help you understand the type of permissions given to the directory that holds them. The way in which the operating system and its users need to use the files determines the directory where those files should be placed, whether the directory is mounted read-only or read-write, and the level of access allowed on each file. The top level of this organization is crucial, as the access to the underlying directories can be restricted or security problems may manifest themselves if the top level is left disorganized or without a widely-used structure. However, simply having a structure does not mean very much unless it is a standard. Competing structures can actually cause more problems than they fix. Because of this, Red Hat has chosen the most widely-used file system structure and extended it only slightly to accommodate special files used within Red Hat Linux. In short, we have the following main points about file structure:
• File structure:
– Logical storage unit.
– Collection of related information.
• File system resides on secondary storage (disks).
• Information about files is kept in the directory structure, which is maintained on the disk.
• The directory structure organizes the files.
• File Control Block (FCB) – storage structure consisting of information about a file.
Allocation Methods
The direct-access nature of disks allows us flexibility in the implementation of files. In almost every case, many files will be stored on the same disk. The main problem is how to allocate space to these files so that disk space is utilized effectively and files can be accessed quickly. Three major methods of allocating disk space are in wide use: contiguous, linked, and indexed. Each method has advantages and disadvantages. Some systems (such as Data General’s RDOS for its Nova line of computers) support all three. More commonly, a system will use one particular method for all files.
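To see how the three methods differ, it helps to compare how each one finds the disk block that holds a given logical block of a file. The sketch below is illustrative only: the function names, the dictionary used as a pointer chain and the list used as an index block are assumptions, not structures defined in this text.

```python
def contiguous_block(start, length, logical):
    """Contiguous allocation: the file occupies start .. start+length-1."""
    if not 0 <= logical < length:
        raise IndexError(logical)
    return start + logical                # a single addition


def linked_block(first, next_block, logical):
    """Linked allocation: follow the chain of pointers block by block."""
    block = first
    for _ in range(logical):
        block = next_block[block]         # one pointer per step
        if block is None:
            raise IndexError(logical)     # ran past the end of the file
    return block


def indexed_block(index_block, logical):
    """Indexed allocation: one lookup in the file's index block."""
    return index_block[logical]
```

With a hypothetical file stored in disk blocks 9, 16, 1, 10, 25, both `linked_block(9, {9: 16, 16: 1, 1: 10, 10: 25, 25: None}, 3)` and `indexed_block([9, 16, 1, 10, 25], 3)` give block 10, but the linked version must visit three pointers first. This is why linked allocation suits sequential access better than direct access.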
Chapter-VI File System Interface
End Chapter quizzes:
Q1. File extensions are used in order to (a) Name the file (b) Ensure the filename is not lost (c) Identify the file (d) Identify the file type.
Q2. Which of the following memory allocation schemes suffers from external fragmentation? a. Segmentation b. Pure demand paging c. Swapping d. Paging
Q3. A ___________ contains information about the file, including ownership, permissions, and location of the file contents. a. File Control Block (FCB) b. File c. Device drivers d. File system
Q4. In the ___________ method of data transfer, the participation of the processor is eliminated during data transfer. a. Buffering b. Caching c. Direct Memory Access d. Indirect Memory Access
Q5. ___________ begins at the root and follows a path down to the specified file. a. Relative path name b. Absolute path name c. Standalone name d. All of the above
Chapter-VII Unix System (Case Study)
Contents: 7.1 History 7.2 Design Principles 7.3 Programmer Interface 7.4 User Interface 7.5 Process Management 7.6 Memory Management 7.7 File Management 7.8 Interprocess Communication
UNIX is one of the most popular operating systems for multi-user systems. It was developed at AT&T Bell Laboratories in 1969 by Ken Thompson and Dennis Ritchie. Brian Kernighan named it UNICS (Uniplexed Information and Computing Service). In 1970, the name was changed from UNICS to UNIX. Originally written in assembler, UNIX was rewritten in C in 1973.
7.2 The Design Principles
The UNIX operating system is divided into three major components: the Kernel, the Shell, and the Utilities and Application Programs.
Architecture of the UNIX operating system
The functioning of UNIX is managed at three levels: 1. Kernel 2. Shell 3. UNIX Utilities and Application Software
Shell: It is the interface between the user and the kernel that effectively insulates the user from knowledge of kernel functions. It reads the commands and interprets them before forwarding them to another agency, which actually executes them. It also has a programming capability of its own: as a programming language, it controls how and when commands are carried out.
UNIX Utilities and Application Software: The UNIX utilities are a collection of about 200 programs that service day-to-day processing requirements. In addition, more than a thousand UNIX-based application programs, such as database management systems, word processors, accounting software and language processors, are available from independent software developers.
Kernel: The kernel is the heart of the UNIX system. It is the UNIX operating system proper: a collection of programs written in C that communicate directly with the hardware. It is the memory-resident portion of the system and does not interact directly with the user. There is only one kernel for any system.
7.3 Program Interface
resource allocation – to multiple users or multiple jobs running at the same time
accounting – accumulate usage statistics and bill users accordingly
protection – ensuring that all access to system resources is controlled
7.4 User’s Interface
program execution – capability to load a program into memory and to run it
I/O operations – since user programs are restricted, the OS must provide I/O
file system manipulation – capability to read, write, create, and delete files
communication – exchange of information between processes executing either on the same computer or in a distributed network (implemented via shared memory or message passing)
error detection – ensure correct computing by detecting errors in the CPU and memory hardware, in I/O devices, or in user programs
7.5 Process Management
A process is the execution of a program and consists of a pattern of bytes that the CPU interprets as machine instructions. A process on a UNIX system is the entity that is created by the fork system call. Every process except process 0 is created when another process executes the fork system call. The process that invoked the fork system call is the parent process, and the newly created process is the child process. Every process has one parent process, but a process can have many child processes. The kernel identifies each process by its process number, called the process ID (PID). Process 0 is a special process that is created "by hand" when the system boots; after forking a child process (process 1), process 0 becomes the swapper process. Process 1, known as init, is the ancestor of every other process in the system and enjoys a special relationship with them.
The current activity of a process is known as its state. As a process executes, its state changes. A process can exist in one of the following states:
Process states
The New state: The process is being created.
The Running state: Instructions are being executed. On a single CPU, this is the one process that is currently being executed.
The Waiting state: The process is waiting for some event to occur (such as I/O completion or reception of a signal).
The Ready state: The process has acquired the required resources and is waiting to be assigned to a processor; this covers any process that is ready to be executed.
The Terminate state: The process has finished execution.
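The parent/child relationship created by fork can be observed from Python, whose `os.fork`, `os.getpid`, `os.getppid` and `os.waitpid` functions are thin wrappers over the corresponding UNIX system calls. The sketch below assumes a UNIX-like system (`os.fork` is not available on Windows); the helper name `run_child` and the exit codes are invented for the example.

```python
import os


def run_child(exit_code):
    """Fork a child that exits with exit_code; return (child_pid, status)."""
    parent_pid = os.getpid()
    pid = os.fork()
    if pid == 0:
        # Child: fork returned 0. The child has its own PID, and its
        # parent is the process that called fork.
        ok = os.getpid() != parent_pid and os.getppid() == parent_pid
        os._exit(exit_code if ok else 255)
    # Parent: fork returned the child's PID. Wait until the child terminates.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)
```

A call such as `run_child(7)` returns the child's PID together with the exit code 7. Between `os.fork` and `os.waitpid` the child passes through the states listed above, ending in the Terminate state once `os._exit` is called.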
7.6 Memory Management
Memory management decides which processes should reside in main memory and manages the parts of the virtual address space of a process that reside on secondary storage devices. It monitors the amount of available physical memory and secondary storage.
Swapping: Early UNIX systems transferred entire processes between primary memory and a secondary storage device but did not transfer parts of a process independently, except for shared text. Such a memory management policy is called swapping.
Demand Paging: Berkeley introduced demand paging to UNIX with BSD (Berkeley Software Distribution), which transferred memory pages instead of whole processes to and from a secondary device. When a process needs a page and the page is not in main memory, a page fault to the kernel occurs, a frame of main memory is allocated, and the kernel then loads the page into the frame.
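The page-fault sequence just described (page missing, frame allocated, page loaded) can be mimicked with a toy model. This is a sketch under simplifying assumptions: the class `DemandPager` is hypothetical, there is no page replacement, and "loading" a page is reduced to recording which frame holds it.

```python
class DemandPager:
    """Toy pager: pages are loaded into frames only on first access."""

    def __init__(self, num_frames):
        self.free_frames = list(range(num_frames))
        self.page_table = {}      # page number -> frame number
        self.page_faults = 0

    def access(self, page):
        """Return the frame holding page, loading it on a page fault."""
        if page not in self.page_table:
            self.page_faults += 1                 # page fault: trap to kernel
            if not self.free_frames:
                raise MemoryError("no free frame; page replacement needed")
            frame = self.free_frames.pop(0)       # allocate a frame
            self.page_table[page] = frame         # "load" the page into it
        return self.page_table[page]
```

Accessing pages 0, 1, 0, 2, 1 with three frames causes three page faults, one for each distinct page; the repeated accesses find the page already resident. A real kernel would choose a victim page instead of raising an error when no frame is free.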
7.7 File management
UNIX uses a hierarchical file structure to store information. This structure gives maximum flexibility in grouping information. It allows for easy maintenance and efficient implementation.
The Directory Structure
The UNIX file structure is hierarchical. That is, directories are linked in the form of a family tree, and files are stored under specific directories. The basic structure of a UNIX system is:
root – The primary directory of the system. It is also the super-user's Home directory.
bin – This directory contains the standard UNIX utility programs, that is, the UNIX commands.
etc – This directory contains the administrative and system configuration utilities.
dev – This directory contains the utilities and special files that control communications with terminals, printers, disk drives and other peripherals.
lib – This directory contains object libraries, that is, libraries of information used by UNIX utilities.
tmp – This directory may be used to store temporary files, that is, files that are not wanted after a program or process has been completed. The UNIX system may be set to clear old files from tmp at regular intervals.
usr – This directory contains a number of sub-directories for purposes such as printer scheduling utilities, user-to-user communications, extra UNIX utilities and libraries. In many UNIX systems it is also used to store users' Home directories.
7.8 Inter-Process Communication
Inter-process communication (IPC) provides a mechanism that allows processes to communicate with each other and to synchronize their actions. IPC is commonly provided by message passing, for example through a message queue. The communicating processes may be running on one computer or on several computers connected by a network. The method of IPC used may vary based on the bandwidth and latency of communication between the processes, and on the type of data being communicated.
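One simple form of message passing between two processes is a pair of pipes. The sketch below uses Python's standard `subprocess.run` to start a child process, send it a message on its standard input and read the reply from its standard output. It illustrates pipes rather than a message queue, and the helper name `ask_child` and the tiny child program are assumptions made for the example.

```python
import subprocess
import sys

# Child program: read a message from standard input, reply on standard output.
CHILD = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def ask_child(message):
    """Send message to a child process through a pipe and return its reply."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],   # run the child with the same Python
        input=message,                   # written to the child's stdin pipe
        capture_output=True,             # stdout/stderr come back via pipes
        text=True,
        check=True,                      # raise if the child exits with error
    )
    return result.stdout
```

The parent and child share no memory here; everything they exchange travels through the pipes, which also synchronizes them, since the parent waits for the child's reply.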
Chapter-VII Unix System (Case Study)
End Chapter quizzes:
Q1. In a UNIX system, how many block addresses does an inode have? (a) 10 Block Addresses (b) 13 Block Addresses (c) 15 Block Addresses (d) 05 Block Addresses
Q2. The ___________ is used to create a new process from an existing process. (a) fork() (b) Fork a child (c) pid (d) create ()
Q3. What are the process states in UNIX? (a) Running (b) Waiting (c) Stopped (d) Zombie (e) All of the above
Q4. Inter process communication can be done through __________. (a) Mails (b) Messages (c) System calls (d) Traps
Chapter-VIII Data Storage
Contents: 8.1 Disk Structure 8.1.1 Organization of disks, capacity & space 8.1.2 Organizing tracks by sectors 8.1.3 Clusters & Extents 8.1.4 Fragmentation 8.1.5 Organizing tracks by block 8.2 Disk Management 8.2.1 Magnetic tapes 8.2.2 Storage by Data warehousing 8.2.3 OLAP DSS (decision support system) 8.3 Characteristics of data warehouses 8.4 Functionality of data warehouse 8.5 Problems and open issues in data warehouses
8.1 Disk Structure
8.1.1 Disk Organization Before data can be stored on a magnetic disk, the disk must first be divided into numbered areas so the data can be easily retrieved. Dividing the disk so the data can be easily written and retrieved is known as formatting the disk. The format program divides each data surface into tracks and sectors. Tracks Concentric rings, called tracks, are written on the disk during the formatting process. Floppy disks have 40 or 80 tracks per side. Fixed disks and disk packs can have from 300 to over 1,000 tracks per side. Figure 8-1 shows an example of how tracks are written on a disk surface. Each track is assigned a number. The outermost track on a disk is assigned number 00. The innermost track is assigned the highest consecutive number. Sectors Each track is divided into sectors. Sectors are numbered divisions of the tracks designed to make data storage more manageable. Without sectors, each track would hold more than 4,500 bytes of information and small files would use an entire track.
Figure 8-1 Tracks on a segment of a magnetic disk.
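The way tracks and sectors determine capacity can be checked with a little arithmetic. The sketch below is an illustration under stated assumptions: 512 bytes per sector, and a geometry of 2 sides, 40 tracks per side and 9 sectors per track (a common double-density floppy format). None of these sector counts is given in the text above, and the function names are invented for the example.

```python
BYTES_PER_SECTOR = 512  # assumption: common sector size on magnetic disks


def capacity_bytes(sides, tracks_per_side, sectors_per_track,
                   bytes_per_sector=BYTES_PER_SECTOR):
    """Formatted capacity = sides x tracks x sectors x bytes per sector."""
    return sides * tracks_per_side * sectors_per_track * bytes_per_sector


def sector_number(side, track, sector, sides, sectors_per_track):
    """Map (track, side, sector) to a single 0-based sector number.

    Sectors are counted from 1 within a track; tracks and sides from 0.
    """
    return (track * sides + side) * sectors_per_track + (sector - 1)


# Assumed geometry: 2 sides, 40 tracks per side, 9 sectors per track.
track_bytes = 9 * BYTES_PER_SECTOR              # bytes on one track
disk_bytes = capacity_bytes(2, 40, 9)           # bytes on the whole disk
```

Under these assumptions one track holds 4,608 bytes, in line with the "more than 4,500 bytes" mentioned above, and the whole disk holds 368,640 bytes, that is 360 × 1,024.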
Fixed Disks Fixed disks are small sealed units that contain one or more disk platters. Fixed disks are known by several terms, such as Winchester drive, hard drive, or fixed disk. For clarity, we refer to them as fixed disks throughout this chapter. Fixed disks are used in minicomputers and personal computers. They can also be adapted for use in mainframe computers instead of having separate disk file units.
Floppy Disks Floppy disks come in several sizes and densities. They are called floppy disks because the magnetic coating is placed on a thin flexible polyester film base.
THE 8-INCH Floppy Disk: The 8-inch floppy disk was the first disk widely used for commercial purposes. It is available as both single- or double-sided and single- or double-density. The 8-inch disk is quickly becoming obsolete.
THE 5.25-INCH Floppy Disk: The 5.25-inch floppy disks are used with both personal computers and minicomputers. The standard double-sided, double-density disk has a capacity of 360 kilobytes (K). Quad-density disks hold 720K, while the newest high-density disks can hold 1.2 megabytes (M).
THE 3.5-INCH Floppy Disk: The current disk of choice is the 3.5-inch floppy disk. These disks are also used with personal computers and minicomputers. These smaller disks have data capacities of 720K for double-density disks and 1.44M for high-density disks.
8.1.2 Organizing tracks by sectors
Fig.8.1.2 Disk structures: (A) Track (B) Geometrical sector (C) Track sector (D) Cluster
In the context of computer disk storage, a sector is a subdivision of a track (Figure 8.1.2, item A) on a magnetic disk or optical disc. Each sector stores a fixed amount of data. The typical formatting of these media provides space for 512 bytes (for magnetic disks) or 2048 bytes (for optical discs) of user-accessible data per sector. Mathematically, the word sector means a portion of a disk between a center, two radii and a corresponding arc (see Figure 8.1.2, item B), shaped like a slice of a pie. Thus, the common disk sector (Figure 8.1.2, item C) actually refers to the intersection of a track and a mathematical sector. Early on in various computing fields, the term block was used for this small chunk of data, but sector appears to have become more prevalent. One quite probable reason for this is that block has often been applied to data chunks of varying sizes for many types of data streams, rather than being limited to the smallest accessible amount of data on a medium.
8.1.3 Clusters & Extents
Clusters In disk storage, a cluster (Figure 8.1.2, item D) is a group of contiguous sectors that the file system allocates as a single unit. The same term is also used for a computer cluster, which the rest of this subsection describes. A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks. Clusters are usually deployed to improve performance and/or availability over that of a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.
Types of clusters 1. High-availability clusters 2. Load-balancing clusters
1. High-availability (HA) clusters: High-availability clusters, also known as failover clusters, are implemented primarily for the purpose of improving the availability of the services that the cluster provides. They operate by having redundant nodes, which are then used to provide service when system components fail. The most common size for an HA cluster is two nodes, which is the minimum requirement to provide redundancy. HA cluster implementations attempt to use redundancy of cluster components to eliminate single points of failure. There are commercial implementations of high-availability clusters for many operating systems.
2. Load-balancing clusters: Load balancing is when multiple computers are linked together to share a computational workload or to function as a single virtual computer. Physically they are multiple machines, but from the user's side they function as a single virtual machine. Requests initiated by the user are managed by, and distributed among, all the standalone computers that form the cluster. This results in balanced computational work among the different machines, improving the performance of the cluster system.
Compute clusters Often clusters are used primarily for computational purposes, rather than for handling I/O-oriented operations such as web service or databases. For instance, a cluster might support computational simulations of weather or vehicle crashes. The primary distinction within compute clusters is how tightly coupled the individual nodes are. For instance, a single compute job may require frequent communication among nodes; this implies that the cluster shares a dedicated network, is densely located, and probably has homogeneous nodes. This cluster design is usually referred to as a Beowulf cluster. The other extreme is where a compute job uses one or a few nodes, and needs little or no inter-node communication. This latter category is sometimes called "Grid" computing. Tightly coupled compute clusters are designed for work that might traditionally have been called "supercomputing".
Middleware such as MPI (Message Passing Interface) or PVM (Parallel Virtual Machine) permits compute clustering programs to be portable to a wide variety of clusters. Extents Overview An extent is a logical unit of database storage space allocation made up of a number of contiguous data blocks. One or more extents in turn make up a segment. When the existing space in a segment is completely used, Oracle allocates a new extent for the segment.
Extents Allocation When you create a table, Oracle allocates to the table's data segment an initial extent of a specified number of data blocks. Although no rows have been inserted yet, the Oracle data blocks that correspond to the initial extent are reserved for that table's rows. If the data blocks of a segment's initial extent become full and more space is required to hold new data, Oracle automatically allocates an incremental extent for that segment. An incremental extent is a subsequent extent of the same or greater size than the previously allocated extent in that segment. For maintenance purposes, the header block of each segment contains a directory of the extents in that segment. Extents De-allocation In general, the extents of a segment do not return to the table space until you drop the schema object whose data is stored in the segment (using a DROP TABLE or DROP CLUSTER statement). Exceptions to this include the following: The owner of a table or cluster, or a user with the DELETE ANY privilege, can truncate the table or cluster with a TRUNCATE...DROP STORAGE statement. A database administrator (DBA) can deallocate unused extents using the following SQL syntax: ALTER TABLE table_name DEALLOCATE UNUSED; Periodically, Oracle deallocates one or more extents of a rollback segment if it has the OPTIMAL size specified. When extents are freed, Oracle modifies the bitmap in the datafile (for locally managed tablespaces) or updates the data dictionary (for dictionary managed tablespaces) to reflect the regained extents as available space. Any data in the blocks of freed extents becomes inaccessible. Extents in Nonclustered Tables As long as a nonclustered table exists or until you truncate the table, any data block allocated to its data segment remains allocated for the table. Oracle inserts new rows into a block if there is enough room. Even if you delete all rows of a table, Oracle does not reclaim the data blocks for use by other objects in the table space. 
After you drop a nonclustered table, this space can be reclaimed when other extents require free space. Oracle reclaims all the extents of the table's data and index segments for the table spaces that they were in and makes the extents available for other schema objects in the same table space. In dictionary managed tablespaces, when a segment requires an extent larger than the available extents, Oracle identifies and combines contiguous reclaimed extents to form a larger one. This is called coalescing extents. Coalescing extents is not necessary in locally managed tablespaces, because all contiguous free space is available for allocation to a new extent regardless of whether it was reclaimed from one or more extents. Extents in Clustered Tables Clustered tables store information in the data segment created for the cluster. Therefore, if you drop one table in a cluster, the data segment remains for the other tables in the cluster, and no extents are deallocated. You can also truncate clusters (except for hash clusters) to free extents. Extents in Materialized Views and Their Logs Oracle deallocates the extents of materialized views and materialized view logs in the same manner as for tables and clusters. Extents in Indexes All extents allocated to an index segment remain allocated as long as the index exists. When you drop the index or associated table or cluster, Oracle reclaims the extents for other uses within the table space. Extents in Temporary Segments When Oracle completes the execution of a statement requiring a temporary segment, Oracle automatically drops the temporary segment and returns the extents allocated for that segment to the associated table space. A single sort allocates its own temporary segment in the temporary table space of the user issuing the statement and then returns the extents to the table space. Multiple sorts, however, can use sort segments in a temporary table space designated exclusively for sorts. These sort segments are allocated only once for the instance, and they are not returned after the sort, but remain available for other multiple sorts. 
A temporary segment in a temporary table contains data for multiple statements of a single transaction or session. Oracle drops the temporary segment at the end of the transaction or session, returning the extents allocated for that segment to the associated table space. Extents in Rollback Segments Oracle periodically checks the rollback segments of the database to see if they have grown larger than their optimal size. If a rollback segment is larger than is optimal (that is, it has too many extents), then Oracle automatically deallocates one or more extents from the rollback segment.
8.1.4 Fragmentation
In computer storage, fragmentation is a phenomenon in which storage space is used inefficiently, reducing storage capacity and in most cases performance. The term is also used to denote the wasted space itself. There are three different but related forms of fragmentation: internal fragmentation, external fragmentation, and data fragmentation.
Various storage allocation schemes exhibit one or more of these weaknesses. Fragmentation can be accepted in return for increase in speed or simplicity. (1) Internal Fragmentation Internal fragmentation occurs when storage is allocated without ever intending to use it. This space is wasted. While this seems foolish, it is often accepted in return for increased efficiency or simplicity. The term "internal" refers to the fact that the unusable storage is inside the allocated region but is not being used. For example, in many file systems, each file always starts at the beginning of a cluster, because this simplifies organization and makes it easier to grow files. Any space left over between the last byte of the file and the first byte of the next cluster is a form of internal fragmentation called file slack or slack space. Similarly, a program which allocates a single byte of data is often allocated many additional bytes for metadata and alignment. This extra space is also internal fragmentation.
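The slack space left in a file's last cluster is easy to compute. The short sketch below is only an illustration; the function name `slack_space` and the 4,096-byte cluster size used in the example are assumptions, not values taken from the text.

```python
def slack_space(file_size, cluster_size):
    """Return (clusters allocated, unused bytes in the last cluster)."""
    clusters = -(-file_size // cluster_size)      # ceiling division
    return clusters, clusters * cluster_size - file_size
```

For instance, a 10,000-byte file stored in 4,096-byte clusters occupies three clusters (12,288 bytes), so `slack_space(10000, 4096)` reports 2,288 bytes of internal fragmentation.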
Another common example: English text is often stored with one character in each 8-bit byte even though in standard ASCII encoding the most significant bit of each byte is always zero. The unused bits are a form of internal fragmentation. Similar problems with leaving reserved resources unused appear in many other areas. For example, IP addresses can only be reserved in blocks of certain sizes, resulting in many IPs that are reserved but not actively used. This is contributing to the IPv4 address shortage. Unlike other types of fragmentation, internal fragmentation is difficult to reclaim; usually the best way to remove it is with a design change. For example, in dynamic memory allocation, memory pools drastically cut internal fragmentation by spreading the space overhead over a larger number of objects. (2)External Fragmentation External fragmentation is the phenomenon in which free storage becomes divided into many small pieces over time. It is a weakness of certain storage allocation algorithms, occurring when an application allocates and deallocates ("frees") regions of storage of varying sizes, and the allocation algorithm responds by leaving the allocated and deallocated regions interspersed. The result is that although free storage is available, it is effectively unusable because it is divided into pieces that are too small to satisfy the demands of the application. The term "external" refers to the fact that the unusable storage is outside the allocated regions. For example, in dynamic memory allocation, a block of 1000 bytes might be requested, but the largest contiguous block of free space has only 300 bytes. Even if there are ten blocks of 300 bytes of free space, separated by allocated regions, one still cannot allocate the requested block of 1000 bytes, and the allocation request will fail. External fragmentation also occurs in file systems as many files of different sizes are created, change size, and are deleted. 
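The 1000-byte request from the example above can be replayed with a simple first-fit search over the free holes. This is a sketch under stated assumptions: first fit is just one possible placement policy, and the function name `first_fit` and the list representation of free space are invented for the illustration.

```python
def first_fit(free_holes, request):
    """Return the index of the first hole that can hold request, else None."""
    for index, size in enumerate(free_holes):
        if size >= request:
            return index
    return None


holes = [300] * 10                 # ten separate 300-byte holes
total_free = sum(holes)            # 3000 bytes free in total
placed = first_fit(holes, 1000)    # None: no single hole is large enough
compacted = [total_free]           # the same free space as one 3000-byte hole
placed_after = first_fit(compacted, 1000)
```

Although 3,000 bytes are free, `placed` is `None` because no single hole is large enough. Once the free space is gathered into one region, the same request succeeds.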
The effect is even worse if a file which is divided into many small pieces is deleted, because this leaves similarly small regions of free spaces.
(3) Data Fragmentation
Data fragmentation occurs when a piece of data in memory is broken up into many pieces that are not close together. It is typically the result of attempting to insert a large object into storage that has already suffered external fragmentation. For example, files in a file system are usually managed in units called blocks or clusters. When a file system is created, there is free space to store file blocks together contiguously. This allows for rapid sequential file reads and writes. However, as files are added, removed, and changed in size, the free space becomes externally fragmented, leaving only small holes in which to place new data. When a new file is written, or when an existing file is extended, the new data blocks are necessarily scattered, slowing access due to seek time and rotational delay of the read/write head, and incurring additional overhead to manage additional locations. This is called file system fragmentation. As another example, if the nodes of a linked list are allocated consecutively in memory, this improves locality of reference and enhances data cache performance during traversal of the list. If the memory pool's free space is fragmented, new nodes will be spread throughout memory, increasing the number of cache misses. Just as compaction can eliminate external fragmentation, data fragmentation can be eliminated by rearranging data storage so that related pieces are close together. For example, the primary job of a defragmentation tool is to rearrange blocks on disk so that the blocks of each file are contiguous. Most defragmenting utilities also attempt to reduce or eliminate free space fragmentation. Some moving garbage collectors will also move related objects close together (this is called compacting) to improve cache performance. 8.1.5 Organism tracks by block In computing (specifically data transmission and data storage), a block is a sequence of bytes or bits, having a nominal length (a block size). 
Data thus structured are said to be blocked. The process of putting data into blocks is called blocking. Blocking is used to facilitate the handling of the data-stream by the computer program receiving the data. Blocked data are normally read a whole block at a time. Blocking is almost universally employed when storing data to 9-track magnetic tape, to rotating media such as floppy disks, hard disks, optical discs and to NAND flash memory. Most file systems are based on a block device, which is a level of abstraction for the hardware responsible for storing and retrieving specified blocks of data, though the block
size in file systems may be a multiple of the physical block size. In classical file systems, a single block may only contain a part of a single file. This leads to space inefficiency due to internal fragmentation, since file lengths are often not multiples of block size, and thus the last block of files will remain partially empty. This will create slack space, which averages half a block per file. Some newer file systems attempt to solve this through techniques called block sub-allocation and tail merging. Block storage is normally abstracted by a file system or database management system for use by applications and end users. The physical or logical volumes accessed via block I/O may be devices internal to a server, direct attached via SCSI or Fibre Channel, or distant devices accessed via a storage area network (SAN) using a protocol such as iSCSI, or AoE. Database management systems often use their own block I/O for improved performance and recoverability as compared to layering the DBMS on top of a file system.
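The claim that slack space averages half a block per file can be checked with a small calculation. The sketch below assumes a classical file system in which a block holds data from only one file; the function name slack_bytes and the 4096-byte block size are illustrative choices, not values taken from the text.

```python
def slack_bytes(file_size, block_size):
    """Unused bytes in the last block of a file of `file_size` bytes."""
    if file_size == 0:
        return 0  # assumption: an empty file occupies no data blocks
    remainder = file_size % block_size
    return 0 if remainder == 0 else block_size - remainder

BLOCK = 4096  # illustrative block size in bytes

# A 10,000-byte file needs three blocks; the third is only partly used.
example_slack = slack_bytes(10_000, BLOCK)
print(example_slack)

# Average slack over every file size from 1 byte up to one full block.
sizes = range(1, BLOCK + 1)
average_slack = sum(slack_bytes(s, BLOCK) for s in sizes) / len(sizes)
print(average_slack, BLOCK / 2)
```

If file sizes are spread evenly relative to the block size, the average comes out just under half a block, which matches the rule of thumb above.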
8.2 Disk Management
8.2.1 Magnetic tape
Magnetic tape is a medium for magnetic recording, made of a thin magnetizable coating on a long, narrow strip of plastic. Nearly all recording tape is of this type, whether it is used for audio, video or computer data storage. It was developed in Germany, based on magnetic wire recording. Devices that record and play back audio and video using magnetic tape are tape recorders and video tape recorders. A device that stores computer data on magnetic tape is a tape drive (tape unit, streamer).
Magnetic tape revolutionized broadcast and recording. When all radio was live, it allowed programming to be prerecorded. At a time when gramophone records were recorded in one take, it allowed recordings to be made in multiple parts, which could then be mixed and edited with tolerable loss in quality. It was also a key technology in early computer development, allowing unparalleled amounts of data to be mechanically created, stored for long periods, and rapidly accessed.
Today, other technologies can perform the functions of magnetic tape. In many cases these technologies are replacing tape. Despite this, innovation in the technology continues and tape is still widely used.
Audio recording
Magnetic tape was invented for recording sound by Fritz Pfleumer in 1928 in Germany, based on the invention of magnetic wire recording by Valdemar Poulsen in 1898. Pfleumer's invention used an iron oxide (Fe2O3) powder coating on a long strip of paper. This invention was further developed by the German electronics company AEG, which manufactured the recording machines, and BASF, which manufactured the tape. In 1933, working for AEG, Eduard Schuller developed the ring-shaped tape head. Previous head designs were needle-shaped and tended to shred the tape. An important discovery made in this period was the technique of AC biasing, which improved the fidelity of the recorded audio signal by increasing the effective linearity of the recording medium. Due to the escalating political tensions and the outbreak of World War II, these developments were largely kept secret. Although the Allies knew from their monitoring of Nazi radio broadcasts that the Germans had some new form of recording technology, its nature was not discovered until the Allies acquired captured German recording equipment as they invaded Europe in the closing stages of the war. A wide variety of recorders and formats have developed since, most significantly reel-to-reel and Compact Cassette.
Video recording
The practice of recording and editing audio using magnetic tape rapidly established itself as an obvious improvement over previous methods. Many saw the potential of making the same improvements in recording television. Television ("video") signals are similar to audio signals. A major difference is that video signals use more bandwidth than audio signals. Existing audio tape recorders could not practically capture a video signal. Many set to work on resolving this problem.
Jack Mullin (working for Bing Crosby) and the BBC both created crude working systems that involved moving the tape across a fixed tape head at very fast speeds. Neither system saw much use. It was the team at Ampex,
led by Charles Ginsburg, that made the breakthrough of using a spinning recording head and normal tape speeds to achieve a very high head-to-tape speed that could record and reproduce the high-bandwidth signals of video. The Ampex system was called Quadruplex and used 2-inch-wide tape, mounted on reels like audio tape, which wrote the signal in what is now called transverse scan. Later improvements by other companies, particularly Sony, led to the development of helical scan and the enclosure of the tape reels in an easy-to-handle cartridge. Nearly all modern videotape systems use helical scan and cartridges. Videocassette recorders are very common in homes and television production facilities, though many functions of the VCR are being replaced. Since the advent of digital video and computerized video processing, optical disc media and digital video recorders can now perform the same role as videotape. These devices also offer improvements like random access to any scene in the recording and "live" time shifting and are likely to replace videotape in many situations.
Data storage
In all tape formats, a tape drive uses motors to wind the tape from one reel to another, passing tape heads to read, write or erase as it moves. Magnetic tape was first used to record computer data in 1951 on the Eckert-Mauchly UNIVAC I. The recording medium was a thin strip of metal one half inch (12.7 mm) wide, consisting of nickel-plated bronze (called Vicalloy). Recording density was 128 characters per inch (198 micrometres per character) on eight tracks. Early IBM tape drives were floor-standing drives that used vacuum columns to physically buffer long U-shaped loops of tape. The two tape reels visibly fed tape through the columns, intermittently spinning the reels in rapid, unsynchronized bursts, resulting in visually striking action. Stock shots of such vacuum-column tape drives in motion were widely used to represent "the computer" in movies and television.
Most modern magnetic tape systems use reels that are much smaller than the 10.5 inch open reels and are fixed inside a cartridge to protect the tape and facilitate handling. Many late 1970s and early 1980s home computers used Compact Cassettes encoded with the Kansas City standard. Modern cartridge formats include LTO, DLT, and DAT/DDS. Tape remains a viable alternative to disk in some situations due to its lower cost per bit. Though the areal density of tape is lower than for disk drives, the available surface area on a tape is far greater. The highest capacity tape media are generally on the same order as the largest available disk drives (about 1 TB in 2007). Tape has historically offered enough advantage in cost over disk storage to make it a viable product, particularly for backup, where media removability is necessary.
8.2.2 Storage by Data warehousing
The concept of data warehousing has evolved out of the need for easy access to a structured store of quality data that can be used for decision making. It is globally accepted that information is a very powerful asset that can provide significant benefits to any organization and a competitive advantage in the business world. Organizations have vast amounts of data but have found it increasingly difficult to access it and make use of it. This is because it is in many different formats, exists on many different platforms, and resides in many different file and database structures developed by different vendors. Thus organizations have had to write and maintain perhaps hundreds of programs that are used to extract, prepare, and consolidate data for use by many different applications for analysis and reporting. Also, decision makers often want to dig deeper into the data once initial findings are made. This would typically require modification of the extract programs or development of new ones. This process is costly, inefficient, and very time consuming. Data warehousing offers a better approach.
Data warehousing implements the process to access heterogeneous data sources; clean, filter, and transform the data; and store the data in a structure that is easy to access, understand, and use. The data is then used for query, reporting, and data analysis. As such, the access, use, technology, and performance requirements are completely different from those in a transaction-oriented operational environment. The volume of data in data warehousing can be very high,
particularly when considering the requirements for historical data analysis. Data analysis programs are often required to scan vast amounts of that data, which could have a negative impact on operational applications, which are more performance sensitive. Therefore, there is a requirement to separate the two environments to minimize conflicts and degradation of performance in the operational environment.
The following general stages of use of the data warehouse can be distinguished:
Off line Operational Database: Data warehouses in this initial stage are developed by simply copying the data off an operational system to another server where the processing load of reporting against the copied data does not impact the operational system's performance.
Off line Data Warehouse: Data warehouses at this stage are updated from data in the operational systems on a regular basis and the data warehouse data is stored in a data structure designed to facilitate reporting.
Real Time Data Warehouse: Data warehouses at this stage are updated every time an operational system performs a transaction.
Integrated Data Warehouse: Data warehouses at this stage are updated every time an operational system performs a transaction. The data warehouses then generate transactions that are passed back into the operational systems.
8.2.3 OLAP
DSS (Decision Support System) Overview
This area evolved via consultants, RDBMS vendors, and startup companies. All had something to prove and had to "differentiate their product". As a result, the area is a mess. Researchers are making a little (but just a little) headway in cleaning up the mess.
A "data warehouse" is an organization-wide snapshot of data, typically used for decision-making.
A DBMS that runs these decision-making queries efficiently is sometimes called a "Decision Support System" (DSS).
DSS systems and warehouses are typically separate from the on-line transaction processing (OLTP) system.
By contrast, one class of DSS queries is sometimes called on-line analytic processing (OLAP).
A "data mart" is a mini-warehouse: typically a DSS for one aspect or branch of a company, with lots of relatively homogeneous data.
Warehouse/DSS properties
Very large: 100 gigabytes to many terabytes (or as big as you can go).
Tends to include historical data.
Workload: mostly complex queries that access lots of data and do many scans, joins, and aggregations. They tend to look for "the big picture". Some workloads are canned queries (OLAP), some are ad hoc (general DSS). Parallelism is a must.
Updates are pumped to the warehouse in batches (overnight).
Data Cleaning
Data Migration: simple transformation rules (replace "gender" with "sex").
Data Scrubbing: use domain-specific knowledge (e.g. zip codes) to modify data. Try parsing and fuzzy matching from multiple sources.
Data Auditing: discover rules and relationships (or signal violations thereof). Not unlike data "mining".
Data Load: can take a very long time (sorting, indexing, summarization). Parallelism is a must.
Full load: like one big transaction (xact); the change from old data to new is atomic.
Incremental loading ("refresh") makes sense for big warehouses, but the transaction model is more complex: the load has to be broken into lots of transactions that are committed periodically to avoid locking everything. Care is needed to keep metadata and indices consistent along the way.
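The incremental loading idea above, breaking one large load into many smaller transactions, might look roughly like the following sketch. It uses Python's standard sqlite3 module with an in-memory database purely for illustration; the table name sales, its columns, the sample rows and the batch size are all made-up assumptions, and a real warehouse loader would also have to maintain indexes and metadata.

```python
import sqlite3

def incremental_load(conn, rows, batch_size):
    """Insert rows in batches, committing after each batch.

    Returns the number of commits (i.e. transactions) performed.
    """
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany(
            "INSERT INTO sales(day, amount) VALUES (?, ?)", batch
        )
        conn.commit()  # end this small transaction and release its locks
        commits += 1
    return commits

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales(day TEXT, amount REAL)")

# Ten hypothetical rows, loaded four at a time.
rows = [(f"2024-01-{d:02d}", 10.0 * d) for d in range(1, 11)]
commits = incremental_load(conn, rows, batch_size=4)
loaded = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(commits, loaded)
```

A full load would instead wrap all the inserts in a single transaction, so that readers see either the old data or the new data but never a mixture.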
OLAP Overview
To facilitate analysis and visualization, data is often modeled multi-dimensionally.
Think n-dimensional spreadsheet rather than relational table.
E.g. for a sales warehouse we have dimensions time_of_sale, sales_district, salesperson, product.
Dimensions can be organized hierarchically into more detail, e.g.
o time_of_sale may be "rolled up" into day-month-quarter-year
o product "rolled up" into product-category-industry
o opposite of "rollup": "drill-down"
Other fun ops:
o Slice_and_dice (i.e. selection & projection in the dimensions)
o Pivot (re-orient the multidimensional view)
The values stored in the multidimensional cells are called numeric measures, e.g. sales, budget, revenue, inventory, ROI (return on investment), etc. These are things over which you might aggregate.
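Roll-up, drill-down and slicing can be illustrated on a tiny list of facts. The sketch below is not how an OLAP engine is implemented; the sample data, the reduced set of dimensions, and the helper names (month_of, quarter_of, roll_up, slice_by_district) are all hypothetical.

```python
from collections import defaultdict

# Each fact: (time_of_sale, sales_district, product, sales measure).
facts = [
    ("2024-01-15", "North", "Truck", 100),
    ("2024-01-20", "South", "Truck", 50),
    ("2024-02-03", "North", "Car", 70),
    ("2024-04-10", "North", "Truck", 30),
]

def month_of(day):
    """Map a day 'YYYY-MM-DD' to its month 'YYYY-MM'."""
    return day[:7]

def quarter_of(day):
    """Map a day 'YYYY-MM-DD' to its quarter, e.g. '2024-Q1'."""
    year, month = day[:4], int(day[5:7])
    return f"{year}-Q{(month - 1) // 3 + 1}"

def roll_up(facts, time_level):
    """Aggregate the sales measure at the given level of the time hierarchy."""
    totals = defaultdict(int)
    for day, _district, _product, sales in facts:
        totals[time_level(day)] += sales
    return dict(totals)

def slice_by_district(facts, district):
    """Select only the facts for one value of the district dimension."""
    return [fact for fact in facts if fact[1] == district]

by_quarter = roll_up(facts, quarter_of)  # rolled up to quarters
by_month = roll_up(facts, month_of)      # drill-down to months
north_by_quarter = roll_up(slice_by_district(facts, "North"), quarter_of)
print(by_quarter, by_month, north_by_quarter)
```

Moving from by_quarter to by_month is a drill-down along the time hierarchy, and north_by_quarter combines a slice on one dimension with a roll-up on another.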
ROLAP vs. MOLAP
ROLAP (Relational OLAP) uses standard relational tables & engine to do OLAP:
o Requires denormalized schema
o Star Schema: Fact table + table per dimension
o Snowflake Schema: off of the dimensions, have rolled-up versions
o Products: MicroStrategy, Metacube (Informix), Information Advantage
o Uses standard relational query processing, with lots of indexes and precomputation
MOLAP (Multidimensional OLAP) actually stores things in multi-d format:
o Special index structures are used to support this
o Note that much of the cube is empty! (no sales of Purple Chevy Trucks in June in Reno)
o Identify the "dense" and "sparse" dimensions. Build an index over combos of sparse dimensions. Within a combo, find a dense subcube of the dense dimensions.
o Zhao, et al. propose a different model.
o Products: Essbase (Arbor), Express (Oracle), Lightship (Pilot)
o Essentially everything is precomputed
More recently, HOLAP (Hybrid OLAP) to combine the best of both
Microsoft Plato due out soon, will make OLAP commonplace
Some vendors (e.g. Informix/Essbase) talking about MOLAP ADTs inside an ORDBMS
Keep this in mind as we read Zhao/Deshpande/Naughton
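As a rough illustration of the ROLAP approach described above, a star schema can be built with ordinary relational tables: one fact table plus one table per dimension. The sketch uses Python's standard sqlite3 module; the table names, columns and sample values are invented for the example and are not taken from any of the products mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE time_dim (time_id INTEGER PRIMARY KEY, month TEXT, year INTEGER);
-- The fact table holds the measure plus a key into each dimension table.
CREATE TABLE sales_fact (product_id INTEGER, time_id INTEGER, sales REAL);

INSERT INTO product_dim VALUES (1, 'Chevy Truck', 'Truck'), (2, 'Sedan', 'Car');
INSERT INTO time_dim VALUES (1, 'May', 2024), (2, 'June', 2024);
INSERT INTO sales_fact VALUES (1, 1, 100.0), (2, 1, 40.0), (2, 2, 60.0);
""")

# Join the fact table to its dimensions and aggregate the measure.
rows = conn.execute("""
    SELECT p.category, t.month, SUM(f.sales)
    FROM sales_fact AS f
    JOIN product_dim AS p ON p.product_id = f.product_id
    JOIN time_dim AS t ON t.time_id = f.time_id
    GROUP BY p.category, t.time_id
    ORDER BY p.category, t.time_id
""").fetchall()
print(rows)
```

There is no row for trucks in June because no such sale exists in the fact table. A relational fact table stores only the combinations that occur, whereas a MOLAP cube has to deal with such empty cells explicitly.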
8.3 Characteristics of data warehouses
Generally speaking, a data warehouse can easily be assembled using the same drive types and storage arrays that service other aspects of the organization. The common objectives of high reliability, data integrity and good storage performance should always be considered, but data warehouse workload patterns generally favor fast sequential reads, rather than the random I/O often encountered with file systems and database queries. Sequential read performance allows storage to efficiently stream vast amounts of information to the BI applications. In terms of disk choice, analysts note that disks should be selected to achieve a reasonable cost/performance tradeoff. High-end Fibre Channel (FC) disks running at 15,000 rpm can offer significant performance that may be ideal for busy BI platforms that only have seconds to process information, such as finding relevant products for returning e-commerce site visitors. Still, the disks are expensive and their capacity is limited, forcing an even larger storage investment. But high-end disks are not always necessary or appropriate. "Data warehouses are not update intensive," says Greg Schulz, founder and senior analyst at the Storage I/O Group. "Other than adding data to the warehouse, there are not a lot of transactions taking place. In some cases, a data warehouse is a step right before archiving." This means slower and less-expensive 10,000 rpm FC drives can be employed in a dedicated storage area network (SAN). The use of nearline SATA drives has also become very appealing for many data warehouse systems. In fact, DATAllegro Inc. supplies dedicated data warehousing appliances based on enterprise-class SATA drives. DATAllegro's C-series appliance will soon be incorporating 500 GB 7,200 rpm Caviar RE drives.
Benefits
Some of the benefits that a data warehouse provides are as follows:
A data warehouse provides a common data model for all data of interest regardless of the data's source. This makes it easier to report and analyze information than it would be if multiple data models were used to retrieve information such as sales invoices, order receipts, general ledger charges, etc.
Prior to loading data into the data warehouse, inconsistencies are identified and resolved. This greatly simplifies reporting and analysis.
Information in the data warehouse is under the control of data warehouse users so that, even if the source system data is purged over time, the information in the warehouse can be stored safely for extended periods of time.
Because they are separate from operational systems, data warehouses provide retrieval of data without slowing down operational systems.
Data warehouses can work in conjunction with and, hence, enhance the value of operational business applications, notably customer relationship management (CRM) systems.
Data warehouses facilitate decision support system applications such as trend reports, exception reports, and reports that show actual performance versus goals.
Disadvantages
There are also disadvantages to using a data warehouse. Some of them are:
Data warehouses are not the optimal environment for unstructured data.
Because data must be extracted, transformed and loaded into the warehouse, there is an element of latency in data warehouse data.
Over their life, data warehouses can have high costs.
Data warehouses can get outdated relatively quickly. There is a cost of delivering suboptimal information to the organization.
There is often a fine line between data warehouses and operational systems. Duplicate, expensive functionality may be developed. Or, functionality may be developed in the data warehouse that, in retrospect, should have been developed in the operational systems and vice versa.
8.4 Functionality of data warehouse
How should the functionality of a Data Warehouse be counted?
1. Applications.
2. The Data Warehouse may, or may not, provide the required reporting functions. In some cases, external applications access the Data Warehouse files to generate their own reports and queries.
3. Data Warehouse functions are often based upon packaged software. An example is the Business Objects product. Where the data within the Warehouse supports the reporting requirements
FUNCTIONAL OVERVIEW
While it is recognized that Data Warehouse systems each have their own unique characteristics, there are certain generic characteristics shared by the family of systems known as Data Warehouses. These include:
1. The prime purpose of a Data Warehouse is to store, in one system, data and information that originates from multiple applications within, or across, organizations. The data may be stored 'as received' from the source application, or it may be processed upon input to validate, translate, aggregate or derive new data/information.
2. Most of the data load functions are processed in batch. There are few on-line data maintenance functions. The on-line functions that do exist tend to update the reference files and data translation tables.
3. A database alone does not constitute a Data Warehouse system. At a minimum, a Data Warehouse system must include the database and corresponding load functions. Data reporting functions are optional. They may, or may not, be an integral part of the Data Warehouse system.
4. The prime purpose of storing the data is to support the information reporting requirements of an organization, i.e. multiple users and multiple applications. Because of the multiple applications and users, the data may be physically stored based upon the user requirements. Separate database segments may store the 'user views' for a particular user. This results in the physical storage of a considerable amount of redundant data. The data storage approach is designed to optimize data/information retrieval.
8.5 Problems and open issues in data warehouses
Problems in data warehouses: There are many situations where data warehouse projects fail. We will detail the types of situations that could be characterized as failures.
1. The Project Is Over Budget
Depending on how much the actual expenditures exceeded the budget, the project may be considered a failure. The cause may have been an overly optimistic budget or the inexperience of those calculating the estimate. The inadequate budget might be the result of not wanting to tell management the bitter truth about the costs of a data warehouse. Unanticipated and expensive consulting help may have been needed. Performance or capacity problems, more users, more queries or more complex queries may have required more hardware or extra effort to resolve the problems. The project scope may have been extended without a change in the budget.
Extenuating circumstances such as delays caused by hardware problems, software problems, user unavailability, changes in the business or other factors may have resulted in additional expenses.
2. Slipped Schedule
Most of the factors listed in the preceding section could also have contributed to the schedule not being met, but the major reason for a slipped schedule is the inexperience or optimism of those creating the project plan. In many cases management wanting to “put a stake in the ground” were the ones who set the schedule by choosing an arbitrary date for delivery in the hope of giving project managers something to shoot for. The schedule becomes a deadline without any real reason for a fixed delivery date. In those cases the schedule is usually established without input from those who know how long it takes to actually perform the data warehouse tasks. The deadline is usually set without the benefit of a project plan. Without a project plan that details the tasks, dependencies and resources, it is impossible to develop a realistic date by which the project should be completed.
3. Functions and Capabilities Not Implemented
The project agreement specified certain functions and capabilities. These would have included what data to deliver, the quality of the data, the training given to the users, the number of users, the method of delivery (e.g. web based), service level agreements (performance and availability), pre-defined queries, etc. If important functions and capabilities were not realized or were postponed to subsequent implementation phases, these would be indications of failure.
4. Unhappy Users
If the users are unhappy, the project should be considered a failure. Unhappiness is often the result of unrealistic expectations. Users were expecting far more than they got. They may have been promised too much or there may have been a breakdown in communication between IT and the user. IT may not have known enough to correct the users’ false expectations, or may have been afraid to tell them the truth. We often observe situations where the user says jump, and IT is told to say “how high?” Also, the users may have believed the vendors’ promises for grand capabilities and grossly optimistic schedules. Furthermore, users may be unhappy about the cleanliness of their data, response time, availability, usability of the system, anticipated function and capability, or the quality and availability of support and training.
5. Unacceptable Performance
Unacceptable performance has often been the reason that data warehouse projects are cancelled. Data warehouse performance should be explored for both the query response time and the extract/transform/load time. Any characterization of good query response time is relative to what is realistic and whether it is acceptable to the user. If the user was expecting sub-second response time for queries that join two multi-million-row tables, the expectation would cause the user to say that performance was unacceptable. In this example, good performance should have been measured in minutes, not fractions of a second. The user needs to understand what to expect. Even though the data warehouse may require executing millions of instructions and may require accessing millions of rows of data, there are limits to what the user should be expected to tolerate. We have seen queries where response time is measured in days. With few exceptions, this is clearly unacceptable.
As data warehouses get larger, the extract/transform/load (ETL) process will take longer, sometimes as long as days. This will impact the availability of the data warehouse to the users. Database design, architecture, hardware configuration, database tuning and the ETL code (whether an ETL product or hand-written code) will significantly impact ETL performance. As the ETL process time increases, all of the factors have to be evaluated and adjusted. In some cases the service level agreement for availability will also have to be adjusted. Without such adjustments, the ETL processes may not complete on time, and the project would be considered a failure.
6. Poor Availability
Availability is both scheduled availability (the days per week and the number of hours per day) as well as the percentage of time the system is accessible during scheduled hours. Availability failure is usually the result of the data warehouse being treated as a second-class system.
Operational systems usually demand availability service level agreements. The performance evaluations and bonus plans of those IT members who work in operations and in systems often depend on reaching high availability percentages. If the same standards are not applied to the data warehouse, problems will go unnoticed and response to problems will be casual, untimely and ineffective.
7. Inability to Expand
If a robust architecture and design is not part of the data warehouse implementation, any significant increase in the number of users or increase in the number of queries or complexity of queries may exceed the capabilities of the system. If the data warehouse is successful, there will also be a demand for more data, for more detailed data and, perhaps, a demand for more historical data to perform extended trend analysis, e.g. five years of monthly data.
8. Poor Quality Data/Reports
If the data is not clean and accurate, the queries and reports will be wrong, in which case users will either make the wrong decisions or, if they recognize that the data is wrong, will mistrust the reports and not act on them. Users may spend significant time validating the report figures, which in turn will impact their productivity. This impact on productivity puts the value of the data warehouse in question.
9. Too Complicated for Users
Some tools are too difficult for the target audience. Just because IT is comfortable with a tool and its interfaces, it does not follow that all the users will be as enthusiastic. If the tool is too complicated, the users will find ways to avoid it, including asking other people in their department or asking IT to run a report for them. This nullifies one of the primary benefits of a data warehouse, which is to empower the users to develop their own queries and reports.
10. Project Not Cost Justified
Every organization should cost justify its data warehouse projects. Justification includes an evaluation of both the costs and the benefits. When the benefits were actually measured after implementation, they may have turned out to be much lower than expected, or the benefits came much later than anticipated. The actual costs may have been much higher than the estimated costs. In fact, the costs may have exceeded both the tangible and intangible benefits.
11. Management Does Not Recognize the Benefits
In many cases, organizations do not measure the benefits of the data warehouse or do not properly report those benefits to management. Project managers, and IT as a whole, are often shy about boasting of their accomplishments. Sometimes they may not know how to report on their progress or on the impact the data warehouse is having on the organization. The project managers may believe that everyone in the organization will automatically know how wonderfully IT performed, and that everyone will recognize the data warehouse for the success that it is. They are wrong. In most cases, if management is not properly briefed on the data warehouse, they will not recognize its benefits and will be reluctant to continue funding something they do not appreciate.
Data Warehouse Issues
There are certain issues surrounding data warehouses that companies need to be prepared for. A failure to prepare for these issues is one of the key reasons why many data warehouse projects are unsuccessful. One of the first issues companies need to confront is that they are going to spend a great deal of time loading and cleaning data. Some experts have said that the typical data warehouse project will require companies to spend 80% of their time doing this. While the percentage may or may not be as high as 80%, one thing that you must realize is that most vendors will understate the amount of time you will have to spend doing it. While cleaning the data can be complicated, extracting it can be even more challenging. No matter how well a company prepares for the project management, they must face the fact that the scope of the project will probably be larger than they estimate. While most projects will begin with specific requirements, they will conclude with data. Once the end users see what they can do with the data warehouse once it is completed, it is very likely that they will place higher demands on it. While there is nothing wrong with this, it is best to find out what the users of the data warehouse need next rather than what they want right now. Another issue that companies will have to face is having problems with their systems placing information in the data warehouse. When a company enters this stage for the first time, they will find that problems that have been hidden for years will suddenly appear.
Once this happens, the business managers will have to decide whether the problem can be fixed via the transaction processing system or via a data warehouse that is read-only. It should also be noted that a company will often be responsible for storing data that has not been collected by its existing systems. This can be a headache for developers who run into the problem, and the only way to solve it is to store that data in the system. Many companies will also find that some of their data is not being validated by the transaction processing programs. In a situation like this, the data will need to be validated.
When data is placed in a warehouse, a number of inconsistencies will occur within fields. Many of these fields hold descriptive information. One of the most common issues arises when no controls are placed on the names of customers. This causes headaches for the warehouse user who wants to run an ad hoc query to select a specific customer by name. The developer of the data warehouse may have to alter the transaction processing systems. In addition, they may be required to purchase certain forms of technology.
One of the most critical problems a company may face is a transaction processing system that feeds information into the data warehouse with too little detail. This may occur frequently in a data warehouse that is tailored towards products or customers. Some developers refer to this as a granularity issue. Regardless, it is a problem you will want to avoid at all costs. It is important to make sure that the information placed in the data warehouse is rich in detail.
Many companies also make the mistake of not budgeting enough for the resources connected to the feeder system structure. To deal with this, companies will want to construct a portion of the cleaning logic on the feeder system platform. This is especially important if the platform happens to be a mainframe. During the cleaning process, you will be expected to do a great deal of sorting. The good news is that mainframe utilities are often proficient in this area. Some users choose to construct aggregates on the mainframe, since aggregation also requires a lot of sorting.
It should also be noted that many end users will not use the training that they receive for using the data warehouse.
However, it is important that they be taught the fundamentals of using it, especially if the company wants them to use the data warehouse frequently.
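The cleaning and aggregation issues described above (customer names entered without controls, and aggregates that depend on sorting) can be illustrated with a small Python sketch. It is not taken from any particular warehouse product. The function names `normalize_name` and `aggregate_sales`, the sample rows, and the normalization rule itself are assumptions made for illustration; real name matching usually needs far more than case and punctuation handling.

```python
import re
from itertools import groupby
from operator import itemgetter


def normalize_name(raw: str) -> str:
    """Collapse case, punctuation, and spacing differences in a customer name.

    Hypothetical cleaning rule: punctuation becomes a space, text is
    upper-cased, and runs of whitespace are squeezed to one space.
    """
    cleaned = re.sub(r"[^\w\s]", " ", raw)
    return " ".join(cleaned.upper().split())


def aggregate_sales(rows):
    """Return the total amount per cleaned customer name.

    rows: iterable of (customer_name, amount) pairs, standing in for an
    extract from a feeder (transaction processing) system.
    """
    cleaned = [(normalize_name(name), amount) for name, amount in rows]
    # groupby only merges adjacent equal keys, so the rows must be sorted
    # first. This is why aggregation implies a lot of sorting.
    cleaned.sort(key=itemgetter(0))
    return {
        name: sum(amount for _, amount in group)
        for name, group in groupby(cleaned, key=itemgetter(0))
    }


# Made-up sample data: the same two customers typed in several ways.
feeder_rows = [
    ("Acme Corp.", 120),
    ("ACME  corp", 80),
    ("Bolt Ltd", 50),
    ("acme corp", 30),
    ("bolt ltd.", 25),
]

totals = aggregate_sales(feeder_rows)
print(totals)  # {'ACME CORP': 230, 'BOLT LTD': 75}
```

Without the normalization step, an ad hoc query for one customer would see five different names instead of two, which is the kind of inconsistency that appears when the transaction processing system places no controls on the name field.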
Chapter-VIII Data Storage
End Chapter quizzes:
Q1. Which of the following is the crucial time while accessing data on the disk?
(a) Seek time (b) Rotational time (c) Transmission time (d) Waiting time
Q2. What is the smallest unit of a magnetic disk?
(a) Cylinder (b) Sector (c) Track (d) Platter
Q3. Which of the following memory allocation schemes suffers from external fragmentation?
(a) Segmentation (b) Pure demand paging (c) Swapping (d) Paging
Q4. Select all statements that are true.
(a) External fragmentation is possible with paged memory systems
(b) Internal fragmentation is possible with paged memory systems
(c) External fragmentation is possible with a memory management policy that allocates variable-sized partitions of memory
Q5. The collection of processes on the disk that is waiting to be brought into memory for execution forms the ___________
(a) Ready queue (b) Device queue (c) Input queue (d) Priority queue
Text & References:
Text:
Operating System Concepts, Silberschatz and Galvin, Fifth Edition, Addison-Wesley.
Modern Operating Systems, A. S. Tanenbaum, Prentice Hall of India, New Delhi, 1995.
References:
Design of UNIX Operating System, Maurice J. Bach, Prentice Hall of India.
Operating Systems Design, Peterson & Galvin.
Operating Systems, Third Edition, Pearson Education, 2008.