Your Success, Our Strength ™





× Tested by Programmers
× Suggested by Educators
× Written by me
× Helped by friends

1. Dec 2007
2. Sep 2009




- Review C 3.0.0
- Review C++ 3.0.0
- Review Java 3.0.0
- Review VB part 1 3.0.0
- Working with .Net 2.0.0
- Let's Make Linux Together 3.0.0
- Review DAA 3.0.0

Note from Author










Making a book is a bear of an undertaking, not only for the person who creates it but also
for those around him and those who have to use the book. Mere words can't do justice to
the effect these people have had on me, either through technical assistance or moral
support. Still, words are all I have.

This book is derived from its previous editions. Those who helped me with the previous
edition include Mr. Ravinder Kajal and Mr. Manoj Rana, lecturers in the department of
computer science at Maharaja Surajmal Institute. I am thankful to them for their
dedication and attention towards the book.

This book would not have been possible without the work of several other people. Firstly,
I want to thank Mr. Vikas Polowalia, who gave me the idea of writing this integrated
solution. Many friends and colleagues have contributed greatly to the quality of this
book. I thank all of you for your help and constructive criticism. My thanks go to
- Mr. Vikas Polowalia, who helped me with typing and with finding other topics
- MSI's laboratory of computer science
- Mrs. Krishna Devi (mother), who motivated and cared for me while I was working on
this book.
- Mr. Nishant Kohli for his practical work.

Grateful acknowledgement is extended to all the people who made this book possible
with their enthusiasm and encouragement.

I am grateful to my friends for supporting and motivating me to write this book
efficiently and effectively. They have provided an environment that has
helped me develop my self-confidence and have generously given me the resources needed to
make this book. I really appreciate your support. For once words fail me. I have no idea
what I could say that would be appropriate for all the times you have been there
throughout the years. "Thank you" seems pitifully inadequate, but it's all I have.

Rohit Kumar
12/25/2007 11:28:29 AM


This book provides an integrated solution for the Linux environment. I have tried to write it
in simple language that is easily understandable by all students. I wrote this book as
a text for an introductory course in Linux Programming at the junior or undergraduate
level. It works as a textbook, or more precisely as a set of notes for your
syllabus. Getting marks is no longer difficult if you complete this book.
To teachers: This book is designed to be both versatile and complete. You will find it
useful and interesting both to read and to teach from. The problems
given in this book will sharpen your mind.
To students: I hope that this textbook / set of notes provides you with an enjoyable introduction
to the field of Linux. Before you start reading, you must have some knowledge of C.
To professionals: The wide range of topics in this book makes it a useful
handbook. Because each unit is relatively self-contained, you can focus on the topics
that most interest you. Most of the material can be implemented in practice and can be seen in
the Linux kernel.
To colleagues: I have supplied an extensive bibliography and pointers to the current
literature. This book ends with model test papers and MYLinuxEntrance (T&P). Despite
myriad requests from students for solutions to problems and exercises, I have chosen as
a matter of policy not to supply references for problems and exercises, to remove the
temptation for students to look up a solution rather than to find it themselves.


1. Operating System Principles by Abraham Silberschatz, …
2. Linux Kernel Programming by M. Beck, …
3. Beginning Linux Programming by Neil Matthew, …
4. Fedora 6 Bible by Christopher Negus
8. HAL by Bill Weinberg
12. Memory management by Zhihua (Scott) Jiang
20. Named pipes by Faisal Faruqui



- History
- Advantages + Disadvantages
- Kernel + Shell
- Multi-user + time sharing system
- Internal and external commands
- Help
- Files

- History
- Features
- Advantages
- Distributions
- Unix vs. Linux
- Pattern matching + Basic commands
- Overview of commands

- Architecture
- Editors
- Process management
- System calls
- Shell Scripts
- System administration

- Data Structure in kernel
- Memory management
o Arch. Independent memory model
o Virtual address space
o Block devices + Caching
o Paging under Linux
- File system representation
- Proc
- Ext2

- Multiprocessing
- Debugging
- Modules

o Synchronization in kernel
o Files
o Pipes
o System V IPC
o Socket
o Ptrace

- Different types of kernels
- Booting up of Linux
- Compiling Kernel

- Model Test Paper 1
- Model Test Paper 2
- Model Test Paper 3

- MYLinuxEntrance 2007



History of Unix:
- Unix started as a single-user OS.
- In 1969, Ken Thompson, with Dennis Ritchie and others, wrote a general-purpose OS
for small machines. It attracted a large number of users and was then developed into
a multi-user OS.
- In 1973, Thompson and Ritchie rewrote the Unix OS in C, breaking away from
the tradition of writing an OS in assembly.
- Around 1974, Unix was licensed to universities for educational purposes and a few
years later was made commercially available.
- It is interesting to note that MS-DOS was created much later than Unix. By that
time the industry had begun to accept Unix as a standard OS.
- Many vendors such as IBM, Sun, HP and others purchased the Unix code and
developed their own versions of Unix.
o IBM uses AIX
o HP uses HP-UX
- To avoid confusion, some standards were laid down, known as POSIX (Portable
Operating System Interface for Unix Environments). POSIX is a set of
standards that enables software to run on different Unix-based OSs without
changes to the source code.
- Many features of Unix are the same as those of Linux.
- In 1984, the BSD (Berkeley Software Distribution) group ported Unix to their
VAX machines. They implemented certain improvements over the original Unix
and created the BSD versions of Unix.
- Improvements in the BSD versions were:
o Support for virtual memory
o Demand paging
o A new shell (csh)
o TCP/IP network protocols
· Services like remote login, file transfer and e-mail are possible
because of the TCP/IP network protocols supported by Unix.
· Unix is still the most preferred OS for networking.

Advantages of Unix:
- Unix is more secure than many other OSs. It allows only authorised users to modify files
and directories.
- It is rarely affected by viruses.
- Stable: It can stay up for several years without any problem.
- Multi-user OS
- Can be loaded on any type of computer system
- Multitasking OS: you can work with multiple programs simultaneously.
Disadvantages of Unix:
- Slightly difficult to install and configure
- Difficult to learn for PC users who are used to Windows
- Windows software such as MS Office and Internet Explorer is not available on Unix;
however, replacements are available. eg: WordPerfect and Netscape

Unix Architecture:

The kernel is the core of the OS. It is loaded into RAM as soon as the system starts
up. It manages memory, files and peripheral devices, and also maintains the date and
time. The different functions performed by the kernel are:
- Managing memory: allocation and de-allocation of memory
- Scheduling: enabling each user to work efficiently
- Organizing data transfer between I/O devices and memory
- Accepting instructions from the shell and executing them
- Enforcing security measures


The shell is a program which interprets commands given by the user. These commands
can be given in two ways: either typed directly or placed in a file called a shell script.
The shells supported by Unix include the Bourne shell, Korn shell and C shell.
Features of Shell:
- Communication between the user and the Unix system takes place through the shell.
- The shell allows background processing.
- A sequence of commands can be collected in a file called a shell script. The name
of the file can be used to execute the sequence of commands automatically.
- The shell includes features like loops and conditional constructs.
- The input of one command can be taken from the output of another command, and the
output of one command can be diverted to a file or a printer. Two
commands can be combined using the pipe operator.
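The redirection and pipe features in the list above can be tried with a short sketch like the one below. The file name demo.txt and the scratch directory are assumptions made only for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so nothing important is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# '>' diverts the output of a command into a file (hypothetical name demo.txt).
printf 'pear\napple\nmango\n' > demo.txt

# '<' takes the input of a command from a file.
sorted=$(sort < demo.txt)

# '|' feeds the output of one command into the input of another.
# tr strips the padding that some versions of wc add.
count=$(cat demo.txt | wc -l | tr -d ' ')

echo "$sorted"
echo "lines: $count"
```

Each operator only changes where a command reads from or writes to; the commands themselves are unchanged.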
Multi User:
Unix is a multi-user OS. Each user has a different terminal. A terminal consists of
a keyboard and a monitor and is connected to the main computer, known as the host computer. A
host consists of hard disks, memory, processor, printer etc. These resources are
accessible to all the users.

Time Sharing System:
In a time sharing system, CPU time is divided among all the users. Unix works on
the concept of time sharing. Each user's program is allocated a short period of
CPU time, one by one. This short period of CPU time is called a time slice, time slot or
quantum. It may be 10-20 ms. In a time sharing environment, the CPU switches from one
user to another so rapidly that each user has the illusion that he alone is using the
computer. eg: let us assume that the time slice for a time sharing system is 10 ms; that is,
in one cycle each user gets 10 ms to execute its task.
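To make the 10 ms example concrete, the sketch below works out how long one full cycle takes and how many turns each user gets per second. The figure of 5 users is an assumption chosen for illustration, and the time spent switching between users is ignored.

```shell
#!/bin/sh
slice_ms=10   # time slice from the example above
users=5       # assumed number of logged-in users (illustrative only)

# One full cycle gives every user exactly one slice.
cycle_ms=$((users * slice_ms))

# Number of turns each user receives in one second (1000 ms).
turns_per_second=$((1000 / cycle_ms))

echo "cycle length: ${cycle_ms} ms"
echo "turns per user per second: ${turns_per_second}"
```

With these numbers each user is served again within 50 ms, which is why the switching is not noticeable at the terminal.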

Command Line and Command Syntax:
A command in Unix is a series of characters. These characters consist of words
separated by white space. The first word is the command itself and the rest are the
command's arguments. Unix commands are case-sensitive, ie: cd is different from Cd or
CD. Type all your commands in lower case. Commands are issued to the shell at the
command line. A command line comprises the command, its options
and any command-line arguments that you provide. eg: a command line $man cp.
Commands are entered at the shell prompt ($,#). The prompt is merely a symbol that appears
at the start of a command line. The prompt means Unix is ready and waiting for your
command.
Note: a command line may have more than one command, separated by semicolons (;),
pipes (|) or ampersands (&).
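The three separators mentioned in the note can be seen side by side in a minimal sketch such as this one. The commands used (echo, tr, sleep) are standard utilities; the variable names are made up for the example.

```shell
#!/bin/sh
# ';' runs commands one after another on the same line.
first=$(echo one; echo two)

# '|' passes the output of the left command to the right command.
upper=$(echo hello | tr 'a-z' 'A-Z')

# '&' starts a command in the background; 'wait $!' pauses until
# that background command finishes and returns its exit status.
sleep 1 &
wait $!
bg_status=$?

echo "$first"
echo "$upper"
echo "background job exit status: $bg_status"
```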

Internal and External Commands:
Internal commands are built into the shell itself, so they are already in
memory whenever the shell is running and no separate program file has to be loaded.
eg: cd, echo, pwd and exit are internal commands.
External commands are not built in but are loaded from separate program
files which reside on the HDD/floppy. They are loaded as and when required.
eg: cp, mv, rmdir and ls are external commands.
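One way to check which kind a given command is: the shell's type command reports whether a name is built into the shell or is loaded from a program file. This is a small sketch; the exact wording of the report differs from shell to shell.

```shell
#!/bin/sh
# 'type' reports how the shell would interpret each name.
# The wording of the report differs between shells.
for name in cd ls
do
    type "$name"
done
```

For a built-in the report mentions the shell itself, while for an external command it shows the path of the program file.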

Help:
To seek help on a command, we use $ man <command name>
eg: $man cp
A manual page will be displayed on the screen, with information such as
the command's purpose, its syntax, the switches/options available and examples.

File System:
- All information in Unix is treated as a file.
- A single disk can store thousands of files.
- To manage these files, the OS provides a file system.
- Similar files can be grouped in a directory.
- The file system is the main key to the success of the Unix system.

File Type in Unix:
Everything in Unix is treated as a file. eg: it treats I/O devices as files. There are
three types of files in Unix.
a) Ordinary files: All files created by users are called ordinary/regular files. These
include data files, program files, object files and executable files.
b) Directory files: For each directory, there is a file with the same name as the
directory which contains information about the files under that directory. eg: for
the directory abc, there will be a directory file called "abc". This file contains
information on all the files and sub-directories under the directory abc.
Some main points about this are:
a. It is automatically created by Unix whenever a directory is created.
b. A directory file contains 2 fields: the name of the file and its identification
number (inode number).
c. It cannot be modified by the user. It is changed only by Unix, when an
old file is deleted or a new file is created.
c) Device files: Each I/O device is associated with a special file called a device
file. Any communication with an I/O device is done through its device file.
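The three file types can be told apart from the shell with its file tests. The function below is a sketch only; the name file_kind is made up for this example, and the sample paths are ones that exist on most Unix systems.

```shell
#!/bin/sh
# Report the type of a path using the shell's file tests.
file_kind() {
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -c "$1" ] || [ -b "$1" ]; then
        echo "device"          # character or block device file
    elif [ -f "$1" ]; then
        echo "ordinary"
    else
        echo "other"
    fi
}

file_kind /            # the root directory
file_kind /dev/null    # a device file present on most Unix systems
file_kind /etc/passwd  # an ordinary text file on most Unix systems
```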

Text and Binary files:
A text file is a file that consists of text characters. A text file can be read by any
editor or word processor.
A binary file is a program or data file that contains binary information in a
machine readable form rather than in a human readable form.

Structure of File System:
It follows a tree or hierarchical directory structure. It starts with the root directory, which is
represented by a forward slash (/). In Unix the forward slash is also used as the separator. Note that
Windows and DOS use the backslash (\) as a separator. Under the root directory, there are
several system and home directories. Brief descriptions of these are given here:
1. /bin: contains binary, executable program files. In this directory we can find
the files for Unix commands. It is the counterpart of the command files in DOS.
2. /dev: This directory contains the device files. eg: on Linux the first printer is lp0 and
the first IDE HDD is hda. Its first partition is hda1.
3. /etc: This directory contains all the configuration information of the system.
4. /lib: contains the library files. It holds reusable functions and routines for
programmers to use.
5. /tmp: This directory contains temporary information, the same as the
c:\windows\temp\ directory in Windows.

6. /mnt: This contains the directories where other file systems are mounted, such as
floppy, CD and other partitions of the HDD.
7. /usr: This directory contains the home directories of the users. There is one home
directory for each user. (On Linux, home directories are usually kept under /home.)
8. /kernel: This directory contains all the kernel-specific code.

File naming conventions:
1. A filename can be up to 255 characters long.
2. It may or may not have an extension.
3. It can contain letters, digits, dots (.), hyphens (-) and underscores (_) anywhere.
4. It can have any number of dots. eg: a.b.c
5. It can contain both upper and lower case characters.
6. File names are case-sensitive.
7. It should not contain a blank space or tab.


History of Linux:
- Linus Torvalds, a student at the University of Helsinki, Finland, introduced
Linux in 1991.
- Torvalds worked on the Linux project and wrote the source code of the Linux kernel.
He made Linux available on the net.
- Many programmers added to the code, changed it and built in support for all
kinds of hardware.
- There are several versions of Linux available for different hardware platforms.
o Version 0.02 of the Linux kernel was released in 1991.
- Torvalds and hundreds of developers from across the world worked on it and
in March 1994, version 1.0 of the Linux kernel was released.
o Red Hat Linux 6.0 uses version 2.2.5 of the Linux kernel.
- Linux is a Unix clone and was written from scratch by Torvalds. Torvalds
had been working with Minix, a miniature version of Unix which was used for
teaching purposes in universities and colleges.
- Linux follows the open development model. Torvalds made the source code
available over the internet for study and changes, and it still is.
- Torvalds accepts modifications to the kernel code. The result is that whenever
a new version of Linux with new functionality is released, people work
on it to fix any bugs.
- To tell whether a given version is stable, the following scheme is used.
eg: in version 1.x.y, an even x signifies a stable version; an odd x signifies a version
that is not stable and will change very soon.
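The even/odd rule can be written as a small check on the second number of the version. This is a sketch under the numbering scheme described above, which applies to the older kernel series these notes cover; the function name kernel_series is made up for the example.

```shell
#!/bin/sh
# Classify a kernel version of the form major.minor.patch by the
# parity of its minor number (the x in 1.x.y).
kernel_series() {
    minor=$(echo "$1" | cut -d . -f 2)
    if [ $((minor % 2)) -eq 0 ]; then
        echo "stable"
    else
        echo "development"
    fi
}

kernel_series 1.2.13   # even minor number
kernel_series 2.5.0    # odd minor number
```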

Main Features of Linux:
1. Multitasking: Linux supports true preemptive multitasking. All processes
run entirely independently of each other. No process needs to be concerned with
making processor time available to other processes.
2. Multi-user: Linux allows a number of users to work with the system at the
same time.
3. Multiprocessing: From version 2.0 onwards, Linux also runs on multiprocessor
architectures. This means that the OS can distribute several applications
across several processors.
4. Architecture independence: Linux runs on almost all platforms that are able
to process bits and bytes. The supported hardware ranges from embedded systems to
IBM mainframes. This kind of hardware independence is not
achieved by any other serious OS.
5. Demand load executables: Only those parts of a program actually required for
execution are loaded into memory.
6. Shared libraries: Linux supports shared libraries. Libraries which are used
by almost all programs are kept as shared libraries
and are loaded from that one place when required.
7. Support for POSIX: Linux supports the POSIX standards.

8. Various executable formats: Linux supports various executable formats. The
older a.out format is supported, as is the newer ELF format,
and .exe files can be run with a DOS emulator.
9. Different file systems: This is the point that makes Linux more popular
than other OSs. Linux supports 13 different file systems (NTFS, vfat,
FAT16, FAT32, DOS, ext2, proc, swap …) with the help of a virtual file system
layer, VFS.
10. Cron scheduler: Linux has a scheduler program called cron. It is
used to run commands, shell scripts or programs at scheduled times.
11. Office suites
12. Data archiving utilities: Linux provides utilities for basic data backup such as
tar, cpio and dd. Red Hat 5.0 onwards also provides the Backup and Restore Utility
(BRU), which can be purchased. BRU offers automated backup and restore.
13. Licensing: Linux is copyrighted under the GNU General Public License. This
license, as applied to Red Hat Linux, states that a person can make any number of
copies of the software and distribute them freely or charge a price for them. One can
freely download Linux from the internet for use.

Advantages of Linux:
1. Reliability: Linux is a stable OS. Linux servers can run for years
without being shut down. This means that users of Linux work consistently without
reporting OS failures.
2. Backward compatibility: Linux has excellent support for older hardware. It
can run on different types of processors, not just Intel 386/486 but also
DEC's Alpha, Sun SPARC machines, PowerPC etc.
3. GUI interface: The graphical interface for Linux is the X Window System. It is
divided into 2 subsystems, a server and a client. Linux has a number
of graphical user interfaces called desktop environments, such as the K Desktop
Environment (KDE) and the GNU Network Object Model Environment (GNOME), both of
which run on top of the X Window System.
a. When we start KDE, the desktop is organized into folders such as auto
start, CDROM, printer and floppy drive in the form of icons.
b. GNOME can be configured in the way you want to use it. It supports
the drag and drop mechanism. GNOME follows the Common Object
Request Broker Architecture (CORBA) standard to allow different
software to communicate easily.
4. Excellent security features: It supports high security, which is why many ISPs
(internet service providers) are replacing their current OS with Linux.
5. Development libraries: Linux provides an excellent platform for many
development languages like C++, C, Perl, Java, PHP and many more.
6. Can support high user load: Linux can support a large number of users
working simultaneously.
7. Few viruses: Linux has been largely free from virus attacks so far; very few
viruses are known for Linux.
8. Multiple distributors: there are many distributors of Linux.

9. Simple upgrade: the installation procedure of most Linux versions is menu
driven and easy. It includes the ability to upgrade from a prior version. The
upgrade process preserves the existing configuration files and maintains a list
of its actions during installation.

Multiple Distributions:
To install Linux, the user requires a distribution. This consists of a boot diskette
and other diskettes or a CD-ROM. Installation scripts enable even inexperienced users to
install systems that can be run. It helps that many software packages are already adapted
to Linux and appropriately configured: this saves a lot of time. Discussions are constantly
taking place within the Linux community on the quality of the various distributions; putting
together a distribution of this sort is a very lengthy and complex task.
Internationally, the Red Hat, S.u.S.E, Debian and Slackware distributions are
widely used. Which of these distributions is used is a matter of taste. Distributions can be
obtained from FTP servers, e-mail systems, public domain distributors and some
bookshops. Sources of supply can be found by consulting specialist magazines or the
Linux newsgroups on Usenet.

Unix vs. Linux:
× Unix is costly while Linux is freely available.
× The source of Linux is available while the source of Unix is not.
× Linux supports all of the network topologies while Unix supports only star.
× Linux can run on different platforms, while there are
different versions of Unix for different hardware.
× Linux has a GUI while Unix has a CUI.
× Linux is a paged system while Unix is a swapping system.
× Linux can support 78 GB and 2 GB RAM while Unix can support 2 GB
and 512 MB.
× More shells are available under Linux.
× Linux supports development libraries and scripting languages but Unix does not.
Pattern Matching:
1. ls a.???
2. ls a*.*
3. ls a[124]
4. ls a[1-5]
5. ls a[!1-5]

What are these patterns, what do they do and why are they used? Try them on a
system and note the differences, or ask any teacher of Linux. For further
information, search for these patterns on Google.
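One way to try the patterns safely is in an empty scratch directory with a few sample files, as in the sketch below. The file names are made up to exercise each pattern, and echo is used in place of ls so that only the matched names are printed.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample files (hypothetical names chosen to exercise each pattern).
touch a.txt a.c a1 a2 a3 a7 ab.doc

echo a.???      # '?' matches exactly one character, so only a.txt fits
echo a*.*       # '*' matches any string, even an empty one
echo a[124]     # one character from the set 1, 2, 4
echo a[1-5]     # one character in the range 1 to 5
echo a[!1-5]    # one character that is NOT in the range 1 to 5
```

The shell expands each pattern into the matching file names before the command runs, so the same patterns work with ls, cat, rm and any other command.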

Basic Commands:
These commands are executed on a terminal (main menu > system tools > terminal).

$ useradd: to add a new user
$ adduser: to add a new user
$ passwd: to change the password
$ passwd rohit: changes the password of user rohit by asking you for the
new password.
$ passwd: with no user name, changes the password of the current user (root, if you
are logged in as the administrator) by asking for the new password and its confirmation.
$ exit: to exit the terminal
$ cat filename: to display the contents of a file
$ cat > filename: to create a file; press ctrl + d to save
$ cat firstfilename secondfilename: to display the contents of both files, the
first file followed by the second
$ cat Ifile > IIfile: copies the contents of the first file into the second file
$ cat Ifile >> IIfile: appends the contents of the first file to the second file
$ cat > .filename: creates a hidden file
$ ls: to see the list of files under the current directory
$ ls -l: to see a long listing with details such as permissions and owners
$ ls -a: to see all files, hidden and normal
$ ls -x: to see files in multiple columns
$ ls -Fx: F is used for identifying directories and executable files: * marks
executable files and / marks directories.
$ ls [mv]?r*: to list files that begin with m or v, followed by any one character,
then r and then any string.
$ cat a* > b: to write the contents of all files whose names start with a into b
$ bc: basic calculator. Type 12+5 and press enter, and it will show 17. You can use
multiple expressions separated by semicolons.
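The cat forms listed above can be tried together in a scratch directory. This is a minimal sketch; the file names Ifile, IIfile and copy are illustrative only.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create two small files (names are illustrative only).
printf 'first\n'  > Ifile
printf 'second\n' > IIfile

cat Ifile IIfile      # display both files, the first followed by the second
cat Ifile > copy      # '>' overwrites: copy now holds only the first file
cat IIfile >> copy    # '>>' appends: copy now holds both files
cat copy
ls -a                 # '-a' also lists hidden entries such as . and ..
```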

Overview of commands:
access system call
Used to check the accessibility of files.
access(pathname, access_mode)
char *pathname;
int access_mode;
The access modes are:
04 read
02 write
01 execute (search)
00 checks existence of a file
& operator
Executes a command as a background process.
banner command
Prints the specified string in large letters. Each argument may be up to 10 characters long.
break statement
Used to break out of a loop. It does not exit from the program.
cal command
Produces a calendar of the current month as standard output. When given, the month (1-12)
and year (1-9999) must be specified in full numeric format.
cal [[month] year]
calendar command
Displays the contents of the calendar file.
case operator
The case operator is used to check a string against multiple patterns.
case $string in
pattern1)
command list;;
pattern2)
command list;;
pattern3)
command list;;
esac
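A small working case construct might look like the sketch below. The function name answer_kind and the patterns are made up for illustration.

```shell
#!/bin/sh
# Classify a string with case; each command list ends with ';;'
# and the whole construct is closed by 'esac'.
answer_kind() {
    case $1 in
        y|Y|yes) echo "affirmative";;
        n|N|no)  echo "negative";;
        *)       echo "unknown";;     # '*' catches everything else
    esac
}

answer_kind yes
answer_kind N
answer_kind maybe
```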
cat command
The cat (for concatenate) command is used to display the contents of a file. Used without
arguments, it takes input from standard input; <Ctrl d> is used to terminate input.
cat [filename(s)]
cat > [filename]
Data can be appended to a file using >>
Some of the available options are:
cat [-options] filename(s)

-s silent about files that cannot be accessed
-v enables display of non-printing characters (except tabs, new lines, form-feeds)
-t when used with -v, it causes tabs to be printed as ^I's
-e when used with -v, it causes $ to be printed at the end of each line
The -t and -e options are ignored if the -v option is not specified.
cd command
Used to change directories.
chgrp command
Changes the group that owns a file.
chgrp [group-id] [filename]
chmod command
Allows file permissions to be changed for each class of user. File permissions can be changed
only by the owner(s).
chmod [ugo][+/-][rwx] [filename]
chown command
Used to change the owner of a file.
The command takes a file(s) as source files and the login id of another user as the target.
chown [user-id] [filename]
The cmp command compares two files (text or binary) byte-by-byte and displays the first
occurrence where the files differ.
cmp [filename1] [filename2]
-l gives a long listing
The comm command compares two sorted files and displays the instances that are
common. The display is separated into 3 columns.
comm filename1 filename2
The first column displays what occurs in the first file but not in the second.
The second column displays what occurs in the second file but not in the first.
The third column displays what is common to both files.
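The three columns of comm are easiest to see with two short sorted files. The sketch below uses made-up file names list1 and list2; the -12 option suppresses the first two columns so that only the common lines remain.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# comm expects both inputs to be sorted.
printf 'apple\nbanana\ncherry\n' > list1
printf 'banana\ncherry\ndate\n'  > list2

comm list1 list2        # three columns: only list1, only list2, both
comm -12 list1 list2    # suppress columns 1 and 2: common lines only
```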
continue statement
The rest of the commands in the loop are ignored for the current cycle. Control does not
leave the loop but moves on to the next cycle.
The cp (copy) command is used to copy a file.
cp [filename1] [filename2]
cpio (copy input/output)
Utility program used to take backups.
cpio operates in three modes:
-o output
-i input
-p pass

creat system call
The creat system call creates a new file or prepares to rewrite an existing file. The file
pointer is set to the beginning of the file.
int creat(path, mode)
char *path;
int mode;
cut command
Used to cut out parts of a file. It takes filenames as command line arguments or input from
standard input. The command can cut columns as well as fields in a file. It does
not, however, delete the selected parts from the file.
cut [-cf] [column/field] filename
cut -d ":" -f1,2,3 filename
where -d indicates a delimiter specified within quotes (here ":")
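As an illustration, cut can pick fields or character columns out of a colon-delimited record. The record below is made-up data in the style of a password file entry.

```shell
#!/bin/sh
# A colon-delimited record (made-up data for illustration).
record='rohit:x:500:500:Demo User:/home/rohit:/bin/sh'

fields=$(echo "$record" | cut -d ":" -f1,3)   # fields 1 and 3
columns=$(echo "$record" | cut -c1-5)         # characters 1 to 5

echo "$fields"
echo "$columns"
```

The original record is left untouched; cut only prints the selected parts.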
df command
Used to find the number of free blocks available for all the mounted file systems.
# /etc/df [filesystem]
The diff command compares text files. It gives an index of all the lines that differ in the
two files along with the line numbers. It also displays what needs to be changed.
diff filename1 filename2
The echo command echoes arguments on the command line.
echo [arguments]

env command
Displays the permanent environment variables associated with a user's login id.
exit command
Used to stop the execution of a shell script.
expr command
The expr command is used for numeric computation.
The operators + (add), - (subtract), * (multiply), / (divide) and % (remainder) are allowed.
Calculations are performed in order of normal numeric precedence.
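A few expr calculations are sketched below. Two details are worth noting: each operator and operand must be a separate argument, and * has to be escaped so that the shell does not treat it as a wildcard. The variable names are made up for the example.

```shell
#!/bin/sh
# Operators and operands must be separate arguments.
# '*' is escaped so the shell does not expand it as a wildcard.
sum=$(expr 7 + 3)
product=$(expr 7 \* 3)
quotient=$(expr 7 / 3)      # integer division
remainder=$(expr 7 % 3)
mixed=$(expr 2 + 3 \* 4)    # '*' is evaluated before '+'

echo "$sum $product $quotient $remainder $mixed"
```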
The find command searches through directories for files that match the specified criteria.
It can take full pathnames and relative pathnames on the command line.
To display the output on screen, the -print option must be specified.
for operator
The for operator may be used in looping constructs where there is repetitive execution of
a section of the shell program.
for var in val1 val2 val3 val4;
do commands; done

fsck command
Used to check the file system and repair damaged files. The command takes a device
name as an argument.
# /etc/fsck /dev/file-system-to-be-checked
grave operator
Used to store the standard output of a command in an environment variable. (`)
The grep (global regular expression and print) command can be used as a filter to search
for strings in files. The pattern may be either a fixed character string or a regular
expression.
grep "string" filename(s)
HOME variable
User's home directory
if operator
The if operator allows conditional execution.
if expression; then commands; fi
if … then … else … fi
$ if; then
efile; then
kill command
Used to stop background processes.
ln command
Used to link files. An additional name is created for the same file.
logname command
Displays the user's login name.
ls command
Lists the files in the current directory.

Some of the available options are:
-l gives a long listing
-a displays all files (including hidden files)
lp command
Used to print data on the line printer.
lp [options] filename(s)
The mesg command controls messages received on a terminal.
-n does not allow messages to be displayed on screen
-y allows messages to be displayed on screen

mkdir command
Used to create directories.
The more command is used to display data one screenful at a time.
more [filename]
mv (move) moves a file from one directory to another or simply changes filenames. The
command takes filenames and pathnames as source names and a filename or existing
directory as the target name.
mv [source-file] [target-file]
The news command allows a user to read news items published by the system
administrator.
nl command
Displays the contents of a file with line numbers.
passwd command
Changes the password.
The paste command joins lines from two files and displays the output. It can take a
number of filenames as command line arguments.
paste file1 file2
PATH variable
The directories that the system searches to find commands.

pg command
Used to display data one page (screenful) at a time. The command can take a number of
filenames as arguments.
pg [option] [filename] [filename2]…
pipe operator
The pipe operator (|) takes the output of one command as the input of another command.

ps command
Gives information about all the active processes.
PS1 variable
The system prompt.
pwd command
(print working directory) Displays the current directory.
The rm (remove) command is used to delete files from a directory. A number of files may
be deleted simultaneously. A file(s) once deleted cannot be retrieved.
rm [filename1] [filename2]…

shift command
Using shift, the arguments of a script are moved one position to the left: $2 is shifted to
$1, $3 to $2 and so on. The old value of $1 is discarded.
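A common use of shift is to walk through all the arguments one at a time. In the sketch below the function name print_args is made up for the example; $# holds the number of arguments still remaining.

```shell
#!/bin/sh
# Print every argument by repeatedly reading $1 and shifting.
print_args() {
    while [ $# -gt 0 ]
    do
        echo "arg: $1"
        shift        # $2 becomes $1, $3 becomes $2, and so on
    done
}

print_args alpha beta gamma
```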
The sleep command is used to suspend the execution of a shell script for the specified
time. This is usually given in seconds.
sort is a utility program that can be used to sort text files in numeric or alphabetical order.
sort [filename]
split command
Used to split a large file into smaller files.
split -n filename
split can take a second filename on the command line.
su command
Used to switch to superuser or any other user.
sync command
Used to copy data in buffers to files.
system function
Used to run a UNIX command from within a C program.
The tail command may be used to view the end of a file.
tail [filename]
tar command
Used to save and restore files to tapes or other removable media.
tar [function[modifier]] [filename(s)]
tee command
Output that is being redirected to a file can also be viewed on standard output.
test command
It compares strings and numeric values.
The test command has two forms. The first is the test command itself:
if test ${variable} = value
then
do commands
else
do commands
fi
The test command can also be written with the special operators [ ]. When these operators
follow an if, they are interpreted by the shell as a test and not as wildcard characters.
if [ -f ${variable} ]
then
do commands
elif [ -d ${variable} ]
then
do commands
else
do commands
fi
Many different tests are possible: tests on files, and comparisons of numbers, character
strings and values of environment variables.
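Both forms of the test command can be combined in one script, as in this sketch. The function name describe and the sample paths are assumptions made for the example.

```shell
#!/bin/sh
# Describe a path using both forms of the test command.
describe() {
    if test -z "$1"          # first form: the word 'test'
    then
        echo "empty name"
    elif [ -f "$1" ]         # second form: square brackets
    then
        echo "file"
    elif [ -d "$1" ]
    then
        echo "directory"
    else
        echo "missing"
    fi
}

describe ""
describe /
describe /no/such/path
```

Here -z tests for an empty string, -f for an ordinary file and -d for a directory.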
time command
Used to display the execution time of a program or a command. Time is reported in
seconds.
time filename values

The tr command is used to translate characters.
tr [-option] [string1 [string2]]
tty command
Displays the terminal pathname.
umask command
Used to specify default permissions while creating files.
The uniq command is used to display the uniq(ue) lines in a sorted file.
sort filename | uniq
until operator
The until operator executes the commands within a loop as long as the test condition is false.
wall command
Used to send a message to all users logged in.
# /etc/wall message
wait command
The wait command halts the execution of a script until all child processes, executed as
background processes, are completed.
The wc command can be used to count the number of lines, words and characters in a
file.
wc [filename(s)]
The available options are:
wc -[options] [filename]
-l counts lines
-w counts words
-c counts characters
while operator
The while operator repeatedly performs an operation until the test condition proves false.
$ while test-condition
> do
> commands
> done
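A complete while loop might look like the following sketch, which counts from 1 to 3. The variable name count is made up for the example.

```shell
#!/bin/sh
# Count from 1 to 3; the loop ends when the test condition becomes false.
count=1
while [ "$count" -le 3 ]
do
    echo "pass $count"
    count=$((count + 1))
done
```

After the loop finishes, count holds the first value for which the condition was false.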
who command
Displays information about all the users currently logged onto the system: the user name,
terminal number and the date and time that each user logged onto the system.
The syntax of the who command is who [options]
The write command allows inter-user communication. A user can send messages by
addressing the other user's terminal or login id.
write user-name [terminal number]

Linux architecture:
The core of the Linux system is the kernel, the operating system program. The kernel
controls the resources of the computer and allots them to different users and tasks. It
interacts directly with the hardware, which makes programs easy to write and portable
across different hardware platforms. Since the kernel communicates directly with the
hardware, part of the kernel must be customized to the hardware features of each system.
However, the kernel does not deal directly with the user. Instead, the login process
starts up a separate interactive program, called the shell, for each user.

Linux has a simple user interface between the user and the kernel, called the shell,
which has the power to provide the services that a user wants. It protects the user from
having to know the intricate hardware details.

Linux utilities and application programs:
Linux utilities or commands are a collection of programs that service day-to-day
processing requirements. These programs are invoked through the shell, which is itself
another utility. Apart from the utilities provided as a part of the Linux OS, more than a
thousand Linux-based application programs such as DBMS, word processors and various other
programs are available from independent software vendors.

Vi editor:
General Startup
To use vi: vi filename
To exit vi and save changes: ZZ or :wq
To exit vi without saving changes: :q!
To enter vi command mode: [esc]
A number preceding any vi command tells vi to repeat that command that many times.
Cursor Movement
h move left (backspace)
j move down
k move up
l move right (spacebar)
[return] move to the beginning of the next line
$ last column on the current line
0 move cursor to the first column on the current line
^ move cursor to first nonblank column on the current line
w move to the beginning of the next word or punctuation mark
W move past the next space
b move to the beginning of the previous word or punctuation mark
B move to the beginning of the previous word, ignores punctuation
e end of next word or punctuation mark
E end of next word, ignoring punctuation
H move cursor to the top of the screen
M move cursor to the middle of the screen

L move cursor to the bottom of the screen
Screen Movement
G move to the last line in the file
xG move to line x
z+ move current line to top of screen
z. move current line to the middle of screen
z- move current line to the bottom of screen
^F move forward one screen
^B move backward one screen
^D move forward one half screen
^U move backward one half screen
^R redraw screen ( does not work with VT100 type terminals )
^L redraw screen ( does not work with Televideo terminals )

r replace character under cursor with next character typed
R keep replacing character until [esc] is hit
i insert before cursor
a append after cursor
A append at end of line
O open line above cursor and enter append mode

x delete character under cursor
dd delete line under cursor
dw delete word under cursor
db delete word before cursor
Copying Code
yy (yank) 'copies' a line, which may then be put by the p (put) command. Precede
with a count for multiple lines.

Put Command
brings back previous deletion or yank of lines, words, or characters
P bring back before cursor
p bring back after cursor
Find Commands
? finds a word going backwards
/ finds a word going forwards
f finds a character on the line under the cursor going forward
F finds a character on the line under the cursor going backwards
t find a character on the current line going forward and stop one character
before it
T find a character on the current line going backward and stop one
character after it
; repeat last f, F, t, T

Miscellaneous Commands
. repeat last command
u undoes last command issued
U undoes all commands on one line
xp deletes first character and inserts after second (swap)
J join current line with the next line
^G display current line number
% if at one parenthesis, will jump to its mate
mx mark current line with character x
'x find line marked with character x
NOTE: Marks are internal and not written to the file.
Line editor mode
Any command from the line editor ex can be issued
upon entering line mode.
To enter: type ':'
To exit: press [return] or [esc]
Reading files
:r filename copies (reads) filename after the cursor into the file currently being edited
Write file

:w saves the current file without quitting

:# move to line #
:$ move to last line of file

Starting ee
To edit a file simply type ee followed by the filename at your Unix prompt, for example:
ee mytext
If a file of that name exists, the start of the file is then displayed on the screen, otherwise
an empty file of that name is created.
Full details of the ee command can be seen by typing
man ee
at your Unix prompt.
Text and Commands
Unlike some other editors, there are no special modes to worry about. Typing ordinary
text will insert it into the document at the position of the cursor, and commands are
provided by [Ctrl] key combinations, a list of which is shown permanently at the top of
the screen as follows:

^[ (escape) menu   ^e search prompt   ^y delete line     ^u up      ^p prev
^a ascii code      ^x search          ^z undelete line   ^d down    ^n next page
^b bottom of text  ^g begin of line   ^w delete word     ^l left
^t top of text     ^o end of line     ^v undelete word   ^r right
^c command         ^k delete char     ^f undelete char

The caret symbol (^) indicates that the [Ctrl] key should be held down while pressing the
relevant key. On most keyboards, the cursor arrow keys will work correctly as well as the
[Ctrl] commands for cursor movement, and some keyboards also have [Page Up],
[Page Down] and [Delete] keys which can be used.

Using Menus
You can call up a menu of additional commands by pressing [Ctrl+[] or [Esc]. This main
menu appears as follows:

The cursor will be over the top menu item, leave editor in this case.
To select a menu item, move the cursor down to the required item using the cursor
motion commands or arrow keys and press the Enter key. Some items call up a further
menu.
Further Commands
Typing [Ctrl+C] will cause a prompt command: to appear at the bottom of the screen and
the panel at the top of the screen to be replaced by the following list of commands:

help : get help info |file : print file name |line : print line #
read : read a file |char : ascii code of char |0-9 : go to line "#"
write: write a file |case : case sensitive search |exit : leave and save
!cmd : shell "cmd" |nocase: ignore case in search |quit : leave, no
expand: expand tabs |noexpand: do not expand tabs

To obey one of these commands, simply type the command name at the prompt and press
the Enter key. Most of the commands are also available via the main menu, so you can
choose your favourite way of doing some tasks. Commands may be abbreviated, as long
as that abbreviation is unique. For example, as no other command starts with a q, you can
quit the editor without saving your changes by typing the sequence [Ctrl+C] [q] [Enter].
Invoking ee from Mail and other Programs
A number of programs, including the pine mailer, call a text editor directly. The default
editor on IS Unix Services is ee.
Tip for mail users: to incorporate a file into a mail message, you can use the read
command on the [Ctrl+C] menu.

Quitting ee
Leaving the editor is achieved by selecting the top item (leave editor) in the main menu,
which calls the following further menu:

+---------------------+
| leave menu          |
|                     |
| save changes        |
| no save             |
|                     |
| press Esc to cancel |
+---------------------+

So the usual method of finishing an editing session is to type the key sequence [Esc]
[Enter] [Enter]

ee stands for "easy editor". ee is a very easy to use editor which always displays the
keybindings on screen, so that you don't have to know anything to use it. It is good for
people who don't want to learn how to use a text editor.
Emacs is a popular text editor. It has several features, including extensibility and
customisability, and several powerful facilities: file comparison, spell checking, syntax
highlighting (which is good, but not as good as that in vim), an IDE, and mail and news
reading. Calling emacs an editor is a little misleading. Compared to vim, it seems fast
for editing small regions of text but slower for editing large files, as the vim commands
are more compact and in some cases more powerful. A nice use of emacs-style keybindings
is in interactive shells: the default line editor in bash resembles emacs.
Jed is a versatile editor that can "impersonate" a number of well known text editors,
including emacs. For people looking for a wordstar clone or a lightweight version of
emacs, jed is nice.
Joe is a lot like jed: it is a lightweight editor that includes different modes including
pico, wordstar and emacs.
mcedit is the editor that ships with midnight commander. It is very DOS-like: it looks and
feels a lot like DOS edit. Good for nostalgic Windows refugees.
Nedit is a GUI editor which should make Windows users feel at home. It has some
programmer-friendly features, and is essentially a superset of the classic Windows text
editors (notepad) in functionality.
Pico ships with the popular email client pine. Pico is much like ee in a number of ways.
It is not terribly powerful (in fact quite the opposite) but easy to use. Good for people
who don't want to know how to use a text editor.

Process Management
Any application that runs on a Linux system is assigned a process ID or PID. This is a
numerical representation of the instance of the application on the system. In most
situations this information is only relevant to the system administrator who may have to
debug or terminate processes by referencing the PID. Process Management is the series
of tasks a System Administrator completes to monitor, manage, and maintain instances of
running applications.
Process Management begins with an understanding of the concept of multitasking. Linux is
what is referred to as a preemptive multitasking operating system. Preemptive
multitasking systems rely on a scheduler. The function of the scheduler is to control the
process that is currently using the CPU. In contrast, cooperative multitasking systems such
as Windows 3.1 relied on each running process to voluntarily relinquish control of the
processor. If an application in such a system hung or stalled, the entire computer system
stalled. By making use of an additional component to pre-empt each process when its
"turn" is up, stalled programs do not affect the overall flow of the operating system.
Each "turn" is called a time slice, and each time slice is only a fraction of a second long.
It is this rapid switching from process to process that allows a computer to "appear" to be
doing two things at once, in much the same way a movie "appears" to be a continuous
picture.
Types of Processes
There are generally two types of processes that run on Linux. Interactive processes are
those processes that are invoked by a user and can interact with the user. vi is an
example of an interactive process. Interactive processes can be classified into foreground
and background processes. The foreground process is the process that you are currently
interacting with, and is using the terminal as its stdin (standard input) and stdout
(standard output). A background process is not interacting with the user and can be in one
of two states – paused or running.
The following exercise will illustrate foreground and background processes.
1. Log on as root.
2. Run [cd /]
3. Run [vi]
4. Press [ctrl + z]. This will pause vi
5. Type [jobs]
6. Notice vi is listed as a stopped job in the background
7. Type [fg %1]. This will bring the first background job to the foreground.
8. Close vi.
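Background processes and PIDs can also be observed from a script rather than interactively. The sketch below is only illustrative: it uses sleep as a stand-in for a long-running program, and the variable names bg_pid, bg_state and bg_status are assumptions.

```shell
#!/bin/sh
# Start a long-running command in the background; "&" returns control at once.
sleep 30 &
bg_pid=$!            # $! holds the PID of the most recent background process

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$bg_pid" 2>/dev/null; then
    bg_state="running"
else
    bg_state="gone"
fi
echo "background process $bg_pid is $bg_state"

# Terminate the process by PID, then collect its exit status with wait.
kill "$bg_pid"
bg_status=0
wait "$bg_pid" 2>/dev/null || bg_status=$?
echo "exit status after kill: $bg_status"
```

An exit status above 128 indicates that the process was ended by a signal.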
The second general type of process that runs on Linux is a system process or daemon
(day-mon). Daemon is the term used to refer to processes that are running on the computer
and provide services but do not interact with the console. Most server software is
implemented as a daemon. Apache, Samba, and inn are all examples of daemons.
Any process can become a daemon as long as it is run in the background and does not
interact with the user. A simple example of this can be achieved using the [ls -R]
command. Run from the root directory, this will list all subdirectories on the computer,
and is similar to the [dir /s] command on Windows. This command can be set to run in the
background by typing [ls -R &], and although technically you have control over the shell
prompt, you will be able to do little work as the screen displays the output of the
process that you have running in the background. You will also notice that the standard
pause (ctrl+z) and kill (ctrl+c) commands do little to help you.

System Call:
A system call is a service provided by the Linux kernel. In C programming, system calls
are usually made through functions defined in libc, which provides wrappers for many
system calls. Section 2 of the manual pages provides more information about system calls.
To get an overview, use "man 2 intro" in a command shell.
It is also possible to invoke a system call directly through the syscall() function. Each
system call has a function number defined in <syscall.h> or <unistd.h>. Internally, on
i386 a system call is invoked by software interrupt 0x80, which transfers control to the
kernel. The system call table is defined in the Linux kernel source file
"arch/i386/kernel/entry.S".
System Call Example
#include <sys/syscall.h>   /* SYS_getpid */
#include <unistd.h>        /* syscall(), getpid() */
#include <stdio.h>
#include <sys/types.h>
int main(void) {
    long ID1, ID2;
    /* direct system call */
    /* SYS_getpid (func no. is 20 on i386) */
    ID1 = syscall(SYS_getpid);
    printf("syscall(SYS_getpid)=%ld\n", ID1);
    /* "libc" wrapped system call */
    /* SYS_getpid (func no. is 20 on i386) */
    ID2 = (long) getpid();   /* pid_t is cast to match %ld */
    printf("getpid()=%ld\n", ID2);
    return 0;
}
System Call Quick Reference
No Func Name Description
1 exit
terminate the current process
2 fork
create a child process
3 read
read from a file descriptor
4 write
write to a file descriptor
5 open
open a file or device
6 close
close a file descriptor
7 waitpid
wait for process termination
8 creat
create a file or device ("man 2 open" for details)
9 link
make a new name for a file
10 unlink
delete a name and possibly the file it refers to
11 execve
execute program
12 chdir
change working directory
13 time
get time in seconds
14 mknod
create a special or ordinary file
15 chmod
change permissions of a file
16 lchown

change ownership of a file
18 stat
get file status
19 lseek
reposition read/write file offset
20 getpid
get process identification
21 mount
mount filesystems
22 umount
unmount filesystems
23 setuid
set real user ID
24 getuid
get real user ID
25 stime
set system time and date
26 ptrace
allows a parent process to control the execution of
a child process
27 alarm
set an alarm clock for delivery of a signal
28 fstat
get file status
29 pause
suspend process until signal
30 utime
set file access and modification times
33 access
check user's permissions for a file
34 nice

change process priority
36 sync
update the super block
37 kill
send signal to a process
38 rename
change the name or location of a file
39 mkdir
create a directory
40 rmdir
remove a directory
41 dup
duplicate an open file descriptor
42 pipe
create an interprocess channel
43 times
get process times
45 brk
change the amount of space allocated for the
calling process's data segment
46 setgid
set real group ID
47 getgid
get real group ID
48 sys_signal
ANSI C signal handling
49 geteuid
get effective user ID
50 getegid
get effective group ID

51 acct
enable or disable process accounting
52 umount2
unmount a file system
54 ioctl
control device
55 fcntl
file control
56 mpx
57 setpgid
set process group ID
58 ulimit
59 olduname
obsolete uname system call
60 umask
set file creation mask
61 chroot
change root directory
62 ustat
get file system statistics
63 dup2
duplicate a file descriptor
64 getppid
get parent process ID
65 getpgrp
get the process group ID
66 setsid
creates a session and sets the process group ID
67 sigaction
POSIX signal handling functions

68 sgetmask
ANSI C signal handling
69 ssetmask
ANSI C signal handling
70 setreuid
set real and effective user IDs
71 setregid
set real and effective group IDs
72 sigsuspend
install a signal mask and suspend caller until a signal arrives
73 sigpending
examine signals that are blocked and pending
74 sethostname
set hostname
75 setrlimit
set maximum system resource consumption
76 getrlimit
get maximum system resource consumption
77 getrusage
get resource usage
78 gettimeofday
get the date and time
79 settimeofday
set the date and time
80 getgroups
get list of supplementary group IDs
81 setgroups
set list of supplementary group IDs
82 old_select
sync. I/O multiplexing

83 symlink
make a symbolic link to a file
84 lstat
get file status
85 readlink
read the contents of a symbolic link
86 uselib
select shared library
87 swapon
start swapping to file/device
88 reboot
reboot or enable/disable Ctrl-Alt-Del
89 old_readdir
read directory entry
90 old_mmap
map pages of memory
91 munmap
unmap pages of memory
92 truncate
set a file to a specified length
93 ftruncate
set a file to a specified length
94 fchmod
change access permission mode of file
95 fchown
change owner and group of a file
96 getpriority
get program scheduling priority
97 setpriority
set program scheduling priority

98 profil
execution time profile
99 statfs
get file system statistics
100 fstatfs
get file system statistics
101 ioperm
set port input/output permissions
102 socketcall
socket system calls
103 syslog
read and/or clear kernel message ring buffer
104 setitimer
set value of interval timer
105 getitimer
get value of interval timer
106 sys_newstat
get file status
107 sys_newlstat
get file status
108 sys_newfstat
get file status
109 olduname
get name and information about current kernel
110 iopl
change I/O privilege level
111 vhangup
virtually hangup the current tty
112 idle
make process 0 idle
113 vm86old
enter virtual 8086 mode

114 wait4
wait for process termination, BSD style
115 swapoff
stop swapping to file/device
116 sysinfo
returns information on overall system statistics
117 ipc
System V IPC system calls
118 fsync
synchronize a file's complete in-core state with that
on disk
119 sigreturn
return from signal handler and cleanup stack
120 clone
create a child process
121 setdomainname
set domain name
122 uname
get name and information about current kernel
123 modify_ldt
get or set ldt
124 adjtimex
tune kernel clock
125 mprotect
set protection of memory mapping
126 sigprocmask
POSIX signal handling functions
127 create_module
create a loadable module entry
128 init_module

initialize a loadable module entry
129 delete_module
delete a loadable module entry

130 get_kernel_syms
retrieve exported kernel and module symbols
131 quotactl
manipulate disk quotas
132 getpgid
get process group ID
133 fchdir
change working directory
134 bdflush
start, flush, or tune buffer-dirty-flush daemon
135 sysfs
get file system type information
136 personality
set the process execution domain
137 afs_syscall
138 setfsuid
set user identity used for file system checks
139 setfsgid
set group identity used for file system checks
140 sys_llseek
move extended read/write file pointer
141 getdents
read directory entries
142 select
sync. I/O multiplexing
143 flock

apply or remove an advisory lock on an open file
144 msync
synchronize a file with a memory map
145 readv
read data into multiple buffers
146 writev
write data into multiple buffers
147 sys_getsid
get process group ID of session leader
148 fdatasync
synchronize a file's in-core data with that on disk
149 sysctl
read/write system parameters
150 mlock
lock pages in memory
151 munlock
unlock pages in memory
152 mlockall
disable paging for calling process
153 munlockall
reenable paging for calling process
154 sched_setparam
set scheduling parameters
155 sched_getparam
get scheduling parameters
156 sched_setscheduler
set scheduling algorithm parameters
157 sched_getscheduler
get scheduling algorithm parameters
158 sched_yield
yield the processor
159 sched_get_priority_max
get max static priority range
160 sched_get_priority_min
get min static priority range
161 sched_rr_get_interval
get the SCHED_RR interval for the named process
162 nanosleep
pause execution for a specified time (nano seconds)
163 mremap
re-map a virtual memory address
164 setresuid
set real, effective and saved user or group ID
165 getresuid
get real, effective and saved user or group ID
166 vm86
enter virtual 8086 mode
167 query_module
query the kernel for various bits pertaining to modules
168 poll
wait for some event on a file descriptor
169 nfsservctl
syscall interface to kernel nfs daemon
170 setresgid
set real, effective and saved user or group ID
171 getresgid
get real, effective and saved user or group ID
172 prctl
operations on a process

173 rt_sigreturn
174 rt_sigaction
175 rt_sigprocmask
176 rt_sigpending
177 rt_sigtimedwait
178 rt_sigqueueinfo
179 rt_sigsuspend
180 pread
read from a file descriptor at a given offset
181 sys_pwrite
write to a file descriptor at a given offset
182 chown
change ownership of a file
183 getcwd
Get current working directory
184 capget
get process capabilities
185 capset
set process capabilities
186 sigaltstack
set/get signal stack context
187 sendfile
transfer data between file descriptors
188 getpmsg
189 putpmsg
190 vfork
create a child process and block parent

System Administration
System administration means looking after everything about a Linux system, which
includes:
a. monitoring disk space and taking backups
b. handling unexpected system problems
c. handling every eventuality; the system administrator must have thorough
practical knowledge of every system component
d. installing all system peripherals
e. devising scripts to automate operations that are carried out regularly
f. configuring the system's initialization scripts

Logging in as system administrator:
login: root (press enter)
password: _ (press enter after entering password)
# (will appear, not $ as for other users)
Acquiring super user mode:
$su (press enter)
Change password:
#passwd (will change password of root)
#passwd rohit (will change password of user rohit)
#date MMDDhhmm (to set the date: month, day, hour, minute)
#calendar (to see the schedule of all users)
To limit file size:
#ulimit (press enter: shows the current limit)
#ulimit 20791510 (press enter: this will change the file size limit)
#shutdown -g2 (shutdown after 2 minutes)
#shutdown -y -g0 (shutdown immediately)
#shutdown 17:30 (shutdown at 17:30)
#shutdown -r now (shutdown and reboot now)
#shutdown -h +10 (halt after 10 minutes)
#df (shows how much disk space is free on each mounted file system)
/ ---
/home ---
/root ---
/stand ---
Disk usage:
#du /home/rohit (shows usage for a particular directory)
/home/rohit/one ---

File compression:
$compress filename
$uncompress filename.Z

Gnu zip:
$gzip foldername/filename
$gunzip gzipfile
Locating files:
$find <pathlist> <selection criteria> <action>

#find / -name a.txt -print (search for file a.txt under / and display its path)
#find /home -atime +365 -print | mail root (search for files last accessed more than
365 days ago and mail the list to root)
#find / -size -2048 -print (search for files smaller than 2048 blocks of 512 bytes,
i.e. below 1 MB)
#find / -size +1024 -atime -365 -exec rm {} \; (remove files larger than 1024 blocks
that were accessed within the last 365 days)
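The pathlist, selection criteria and action parts of find can be tried safely in a scratch directory instead of on /. The following is a minimal sketch; the directory layout and file names are assumptions made for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so nothing on the real system is touched.
scratch=$(mktemp -d)
mkdir "$scratch/docs"
printf 'hello\n' > "$scratch/docs/a.txt"
printf 'world\n' > "$scratch/b.log"
: > "$scratch/empty.txt"

# Selection by name, with the action -print.
by_name=$(find "$scratch" -name a.txt -print)
echo "$by_name"

# Selection by type and pattern; quoting stops the shell expanding *.txt itself.
txt_count=$(find "$scratch" -type f -name '*.txt' -print | wc -l | tr -d ' ')
echo "text files: $txt_count"

# The action -exec runs a command on each match; here empty files are removed.
find "$scratch" -type f -size 0 -exec rm {} \;
remaining=$(find "$scratch" -type f -print | wc -l | tr -d ' ')
echo "files left: $remaining"

rm -rf "$scratch"
```

Running a destructive action with -print first is a simple way to check which files a selection matches.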
Copying diskettes and tapes
Copy floppy to temp folder:
#dd if=/dev/rdsk/f0q18dt of=/temp bs=147456
(if = input file name, of = output file name, bs = block size)
Copy tape:
#dd if=/dev/rct0 of=/dev/rct1 bs=9K

CPIO (copy input output)
#ls | cpio -ov > /dev/rdsk/f0q18dt
#cpio -iv < /dev/rdsk/f0q18dt
For understanding: o - output, i - input, v - verbose
Tar: tape archive program
tar vs cpio
- tar can take directories as input
- it copies the entire directory tree
- it can create several versions in a single archive
- it can append without overwriting
#tar -cvf /dev/fdsk/f0q18dt /home/rohit/*
#tar -xvf /dev/fdsk/f0q18dt
x - extract
c - create
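The tar options above can be practised without a floppy or tape device by archiving to an ordinary file. This is a sketch under that assumption; the directory and file names are illustrative, and the -C option (change directory) is used as supported by the common GNU and BSD tar implementations.

```shell
#!/bin/sh
# Archive to an ordinary file instead of a floppy or tape device.
work=$(mktemp -d)
mkdir -p "$work/src/sub" "$work/restore"
echo "one" > "$work/src/one.txt"
echo "two" > "$work/src/sub/two.txt"

# c = create, v = verbose, f = archive name follows; -C changes directory first.
tar -cvf "$work/backup.tar" -C "$work" src

# t = list the table of contents without extracting.
listing=$(tar -tf "$work/backup.tar")
echo "$listing"

# x = extract; the whole directory tree is recreated under restore/.
tar -xf "$work/backup.tar" -C "$work/restore"
restored=$(cat "$work/restore/src/sub/two.txt")
echo "restored content: $restored"

rm -rf "$work"
```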
mount and unmount:
#mount /dev/hda1 /mnt/flash (mount the first partition, e.g. c:\, on /mnt/flash)
#umount /dev/hda1 (for unmounting) or
#umount /mnt/flash ( - - -)
Note that the command is spelled umount, not unmount.
creating partitions
#fdisk (it will give list of commands to perform)

Ques1: Write any five functions of a Linux system administrator.

Ques2: How does the kernel access a file?
1. The kernel knows the inode of the current directory, which is maintained in memory.
With this, it first searches the inode blocks, finds the inode of this directory, and
then fetches the address of the data block that contains the directory file.
2. It opens the directory file, searches for the file 'a.txt', finds the inode number for
'a.txt', goes to that inode and reads its details such as size, indirect blocks etc.
3. It instructs the disk driver to move the disk head to the respective blocks, and reads
while counting the bytes until the count matches the file size.

Shell Script
1) written in any editor
2) to run, we
i. firstly change its mode
1. $chmod +x filename
ii. then run it
1. $./filename

a. to display: echo string
b. to read: read variablename
c. to pick the value of a variable: $variablename
d. to put a value in a variable: variablename=value (no $ and no spaces around =)
e. to evaluate an expression: `expr 1 + 3` (spaces around the operator are required)


echo "enter string to find"








1) echo "hello";;
2) echo "hi";;
3) echo "by";;
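The basic script elements listed in this section (echo, read, variable assignment, expr and case) can be combined in one short script. This is a minimal sketch, not the script from the original text; the function name greet and the variable names are assumptions, and the input to read is supplied through a pipe so the script runs without a keyboard.

```shell
#!/bin/sh
# greet is a hypothetical function wrapping a case statement.
greet() {
    case "$1" in
        1) echo "hello";;
        2) echo "hi";;
        3) echo "by";;
        *) echo "unknown choice";;
    esac
}

count=5                       # assignment: no $ and no spaces around =
total=`expr $count + 3`       # expr needs spaces around the operator
echo "total is $total"        # $ is used only to read the value

# read normally takes input from the keyboard; here a pipe supplies it.
reply=$(echo 2 | { read choice; greet "$choice"; })
echo "$reply"
```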


Unit 4
Data Structure in Linux Kernel*:
* This unit is written for quick revision. To understand it fully you will need to study
further on your own, though I have tried to make it understandable.

Intro to Linux kernel:
Linux was not designed on a drawing board, but developed in an evolutionary manner, and
it continues to develop. Every function of the kernel has been repeatedly altered and
expanded to get rid of bugs and incorporate new features. The Linux kernel is not in every
respect a good model of structured programming. There are magic numbers instead of
constant declarations in header files, inline expansion of functions instead of function
calls, goto instead of break, and assembler instructions instead of C code. A large part of
the kernel is time-critical, so kernel code is optimized for good running time rather than
easy readability. This distinguishes Linux from Minix, which was written as a "Teaching
Operating System" and never designed for everyday use.

Data structure in Linux:
. Task Structure
. Process Table
. Files and Inodes
. Dynamic Memory Management
. Queue and Semaphores
. System time and timers

System time and timers:
In the Linux system there is just one internal time base. It is measured in ticks
elapsed since the system was booted, with one tick equal to 10 milliseconds. These ticks
are generated by a timer chip in the hardware and counted by the timer interrupt. The
functions used are:

void add_timer(struct timer_list *timer);
int del_timer(struct timer_list *timer);
int mod_timer(struct timer_list *timer, unsigned long expires);

add_timer : activates a timer by entering it in the global timer list
del_timer : removes it from the global timer list
mod_timer : modifies the expiry time of an activated timer
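The relation between the tick rate and the tick length can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the CLK_TCK value reported by getconf is the tick rate exposed to user programs, which is assumed here to be available and may differ from the kernel's internal rate.

```shell
#!/bin/sh
# ms_per_tick converts a tick rate in Hz to the tick length in milliseconds.
ms_per_tick() {
    echo $((1000 / $1))
}

# A rate of 100 ticks per second gives the 10 ms tick described above.
ms_per_tick 100

# Query the user-visible tick rate; fall back to "unknown" if getconf is missing.
user_hz=$(getconf CLK_TCK 2>/dev/null || echo unknown)
echo "user-visible ticks per second: $user_hz"

# ticks_to_seconds converts an elapsed tick count to whole seconds.
ticks_to_seconds() {
    echo $(($1 / $2))
}
ticks_to_seconds 360000 100   # one hour of uptime at 100 ticks per second
```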

Dynamic Memory Management:
Under Linux, memory is managed on a page basis. One page contains 2 raised to
12 bytes (4096 bytes). The basic operations to request free pages are the functions

(1) struct page * __alloc_pages(int gfp_mask, unsigned long order);
(2) unsigned long __get_free_pages(int gfp_mask, unsigned long order);
Function (1) allocates pages and returns a pointer to their page structure; function (2)
gets free pages in memory and returns their address.
gfp_mask is needed to control which pages are used and how the functions behave.

Like the C library functions malloc() and free(), the kernel uses kmalloc() and kfree()
for allocating and freeing small amounts of memory.
void * kmalloc(size_t size, int flags);
void kfree(void * object);
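The order argument of the page allocation functions is the base-2 logarithm of the number of contiguous pages requested, so order n means 2 raised to n pages. The arithmetic can be sketched as follows; the function names are hypothetical and the 4096-byte page size is the value stated above.

```shell
#!/bin/sh
# pages_for_order: an allocation of order n covers 2^n contiguous pages.
pages_for_order() {
    echo $((1 << $1))
}

# bytes_for_order ORDER PAGE_SIZE: total bytes in that allocation.
bytes_for_order() {
    echo $(( (1 << $1) * $2 ))
}

page_size=$((1 << 12))        # 2 raised to 12 = 4096 bytes
echo "page size: $page_size bytes"

order=0
while [ "$order" -le 5 ]
do
    echo "order $order: $(pages_for_order $order) pages, $(bytes_for_order $order $page_size) bytes"
    order=$((order + 1))
done
```

With 4 KB pages, orders 0 to 5 correspond to areas of 4, 8, 16, 32, 64 and 128 Kbytes.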

Process Table:
Every process occupies exactly one entry in the process table. init_task is the
first task in the system. It is initialized with the INIT_TASK macro when the system
starts. After the system has been booted, it is only responsible for using up
unclaimed system time (the idle process, pid 0).

Even though the process table has a dynamic structure, the number of tasks is restricted
to max_threads in the system.
int max_threads;
However, it can be changed through the sysctl interface.
To work through every process, the kernel uses the for_each_task() macro:
#define for_each_task(p) \
    for (p = &init_task; (p = p->next_task) != &init_task; )

Queue and Semaphores:
struct list_head {
    struct list_head *next, *prev;
};
struct _wait_queue {
    struct task_struct *task;
    struct list_head task_list;
};
struct wait_queue_head {
    struct list_head task_list;
};
add_wait_queue() and remove_wait_queue() are used for adding to and deleting from a wait queue.
/* simplified semaphore operations; sem->count starts at 1 */
void down(struct semaphore *sem)
{
    while (sem->count <= 0)
        sleep_on(&sem->wait);   /* block until up() is called */
    sem->count -= 1;
}

void up(struct semaphore *sem)
{
    sem->count += 1;
    wake_up(&sem->wait);        /* wake a waiting process */
}

Files and inodes:

struct file
{
    mode_t f_mode;            //access mode in which file is opened
    loff_t f_pos;             //position of read/write pointer
    atomic_t f_count;         //reference counter
    unsigned int f_flags;     //additional flags for access
    struct dentry *f_dentry;  //reference to entry in directory cache,
                              //used to access the inode
};

struct inode
{
    kdev_t i_dev;             //description of device (fd0,cdrom,hda,sda)
    unsigned long i_ino;      //inode number: identifies file within the device
    off_t i_size;             //size of file
    time_t i_mtime;           //last modification time
    time_t i_atime;           //last access time
    time_t i_ctime;           //last modification to inode eg: file move, cut-paste
};


Task Structure:
One of the most important concepts in a multitasking system such as Linux is the
task: the data structures and algorithms for process management form the central core of
Linux.

struct task_struct
{
    volatile long state;        //current state of process, e.g. TASK_RUNNING
    unsigned long flags;        //bit mask of system status (sys. working, stand
                                //by, logoff, switch user)
    unsigned long ptrace;       //whether the process is monitored by another process
    long counter;               //number of ticks assigned
    long nice;                  //priority, default: {NZERO}
    unsigned long policy;       //SCHED_FIFO, SCHED_RR, SCHED_OTHER
    unsigned long rt_priority;  //real-time priority
    ...
};

Process/Task relations: (subpart of the task structure)

struct task_struct *prev_task, *next_task;  //previous and next task
struct task_struct *p_opptr;                //pointer to original parent
struct task_struct *p_pptr;                 //pointer to current parent
struct task_struct *p_cptr;                 //pointer to youngest child
struct task_struct *p_ysptr;                //pointer to next younger sibling
struct task_struct *p_osptr;                //pointer to next older sibling

Memory management:
• The Architecture-independent Memory Model in LINUX
• The Virtual Address Space for a Process
• Block Device Caching
• Paging Under LINUX

The architecture-independent memory model
• Pages of Memory
• Defined by the PAGE_SIZE macro in the asm/page.h
• For X86, the size is 4k bytes
• For Alpha uses 8K bytes
• Virtual Address Space
• Given by reference to a segment selector and the offset within the segment
• C pointers hold the offsets
• Defined in asm/segment.h
• KERNEL_DS (segment selector for kernel data)
• USER_DS (segment selector for user data)
• By carrying out a conversion on the segment selector register, a system
function can be given pointers to the kernel segment.
• Used by UMSDOS file system to simulate a Unix file system
• MMU of an x86 processor converts the virtual address to a linear address
• 4 Gbytes by width of the linear address
• 3 Gbytes for user segment
• 1 Gbyte for kernel segment
• Alpha does not support segmentation
• Offset addresses for the user segment not permitted to overlap with the
offset addresses for the kernel segment
• Converting the Linear Address

• The Page Directory
• The Page Middle Directory
• The Page Table

The virtual address space for a process
• The User Segment
o In user mode, access only in user segment
o Individual page tables for different processes
o system call fork
o child and parent processes have different page directories and page tables
o however, in the kernel segment page tables are shared by all processes
o system call clone
o old and new threads share the memory fully
o Some explanation for shared libraries in the user segment
+ Originally, linked into one binary, lead to efficiency
+ Drawback is the growth of the length
+ Stored in separate files and loaded at program start
+ Linked to static addresses
+ With ELF, allowed shared libraries to be loaded during
program execution
+ No absolute address references in the compiled code
• Virtual Memory Areas
o A process does not use all of its functions at any one time
o Processes can share code if they are run from the same executable file
o Copy-on-write strategy used for memory management
• The System Call brk
o The brk field points to the end of the BSS segment for non-statically
initialized data
o Used for allocating or releasing dynamic memory
o The system call brk can be used to find the current value of the pointer
or to set it to a new one under protection check
o The request is rejected if the memory required exceeds the estimated size
o The function sys_brk() calls do_mmap() to map a private and anonymous
area between the old and new values of brk
• The Kernel Segment
o In x86 architecture, a system call is generally initiated by the software
interrupt 128 (0x80) being triggered.
o Any processes in system mode will encounter the same kernel segment
o The kernel segment in the Alpha architecture cannot start at address 0
o A PAGE_OFFSET is provided between physical and virtual addresses

• Static Memory Allocation in the Kernel Segment
o Initialization routine for character-oriented devices is called as follows
o memory_start = console_init(memory_start, memory_end);
o Reserves memory by returning a value higher than the parameter
o The memory between the return value and memory_start can be used
as desired by the initialized component
• Dynamic Memory Allocation in the Kernel Segment

o In the LINUX kernel, kmalloc() and kfree() are used for dynamic memory allocation
o void * kmalloc(size_t size, int priority);
o void kfree(void *obj);
o To increase efficiency, the memory reserved is not initialized
o In LINUX kernel 1.2, __get_free_pages() could only reserve contiguous
areas of memory of 4, 8, 16, 32, 64, and 128 Kbytes in size
o kmalloc() can reserve far smaller areas of memory
o sizes[] contains descriptors for the different sizes of memory areas
+ one manages memory suitable for DMA
+ the other is responsible for ordinary memory

o kmalloc() and kfree() are restricted to the size of one page of memory
o vmalloc() and vfree() can handle multiples of the size of one page of memory
o The maximum value of size is limited by the amount of physical memory
o Memory reserved by vmalloc() won't be copied to external storage

Block Device Caching:
• Block Buffering
o Block size may be 512, 1024, 2048, or 4096 bytes
o Held in memory via a buffering system
o A special case applies for blocks taken from files opened with the flag O_SYNC
o These are transferred to disk every time their contents are modified
o Data is organized so that frequently requested data lie very close together
and can be kept in the processor cache
• The update and bdflush Processes
o At periodic intervals, the update process calls the system call bdflush with
a parameter
o All modified buffer blocks are written back to disk with all superblock
and inode information
o bdflush writes back the number of block buffers marked "dirty"
given in the bdflush parameter
o Always activated when a block is released by means of brelse()
o Also activated when new block buffers are requested or the size of the
buffer cache needs to be reduced
• List Structures for the Buffer Cache
o LINUX manages its block buffers via a number of different doubly
linked lists
o Block buffers in use are managed in a set of special LRU lists
LRU list (index)   Description
BUF_CLEAN          Block buffers not managed in other lists;
                   content matches relevant block on hard disk
BUF_UNSHARED       Block buffers formerly (but no longer)
                   managed in BUF_SHARED
BUF_LOCKED         Locked block buffers (b_lock != 0)
BUF_LOCKED1        Locked block buffers for inodes and superblocks
BUF_DIRTY          Block buffers with contents not matching the
                   relevant block on hard disk
BUF_SHARED         Block buffers situated in a page of memory
                   mapped to the user segment of a process

The various LRU lists
• Using the Buffer Cache
o Function bread() is called for block read
o A variant of bread(), breada(), reads not only the requested block into the
buffer cache but also a number of following blocks

Paging under Linux:
• Page Cache and Management

• LINUX can save pages to external media in 2 ways
• a complete block device as the external medium, typically a partition on a
hard disk
• fixed-length files on a file system for its external storage
• Data that belong together are stored in a cache line (16 bytes)
• Finding a Free Page
• __get_free_pages() is called to reserve physical pages of memory
• unsigned long __get_free_pages(int priority, unsigned long order, int dma)
Priority       Description
GFP_BUFFER     Free page to be returned only if free pages are still available
               in physical memory
GFP_ATOMIC     The function __get_free_pages() must not interrupt the current
               process, but a page should be returned if possible
GFP_USER       The current process may be interrupted to swap pages
GFP_KERNEL     This parameter is the same as GFP_USER
GFP_NOBUFFER   The buffer cache won't be reduced by an attempt to find a
               free page in memory
GFP_NFS        The difference between this and GFP_USER is that the number of
               pages reserved for GFP_ATOMIC is reduced from
               min_free_pages to five. This will speed up NFS operations

Priorities for the function __get_free_page()

• Page Errors and Reloading a Page
• do_page_fault() is called when a page fault interrupt is generated
• void do_page_fault(struct pt_regs *regs, unsigned long error_code);
• If the address is in a virtual memory area, the legality of the read or write
operation is checked by reference to the flags for the virtual memory area;
do_no_page() or do_wp_page() is then called

Linux File System:
In the PC field, variety in file systems is common: practically every OS has its
own file system, and each of these of course claims to be faster, better and more secure
than its predecessors. The large number of file systems supported by Linux is undoubtedly
one of the main reasons why Linux has gained acceptance so quickly in its short life. Not
every user is in a position to put in the time and effort to convert his or her old data to a
new file system.
The range of file systems supported is made possible by the unified interface of the
Linux kernel. This is the virtual file system switch (VFS). Note that it is not a file system in
its own right but an interface providing a clearly defined link between the OS kernel and the
different file systems.
The virtual file system supplies the applications with the system calls for file
management, maintains internal structures and passes tasks on to the appropriate actual
file system. Another important task of the VFS is the performance of default actions. As a
rule, no file system implementation will actually provide an lseek() function, because the
VFS provides one as a default. The VFS is also commonly known as the virtual file system.
The central demands on a file system are purposeful structuring, speed of access and
a facility for random access. Random access is made possible by block-oriented devices,
which are divided into a specific number of equal-sized blocks. The functions of the
buffer cache can be used to access any of the sequentially numbered blocks on a given
device. The file system itself must be capable of ensuring unique allocation of the data
to the hardware blocks.
In Unix, the information required for management is kept strictly apart from the
data and collected in separate inode structures for each file. This information includes
access time, access rights, pointers to data blocks, indirect pointers to data blocks, double
indirect pointers, triple indirect pointers, owner, size etc. Access to larger files is
provided via indirect blocks which also contain block numbers. Each file is represented
by just one inode, which means that within a file system, each inode has a unique number
and the file itself can also be accessed using this number.
Directories allow the file system to be given a hierarchical structure. These are
also implemented as files, but the kernel assumes them to contain pairs consisting of a
file name and its inode number.
The basic structure is the same for all the different Unix file systems. Each file
system starts with a boot block. This block is reserved for the code required to boot
the operating system. The boot block is present whether or not the
computer is booted from the device.
All the information which is essential for managing the file system is held in the
super block. This is followed by a number of inode blocks containing the inode structures
for the file system. The remaining blocks for the device provide the space for the data.
The data blocks thus contain ordinary files along with the directory entries and indirect
blocks.
A new file system can be mounted onto any directory. This original directory is
then known as the mount point and is occupied by the root directory of the new file system
along with its subdirectories and files. Unmounting the file system releases the hidden
directory structure again.

Representation of File System in kernel:
Every file system needs to be made known to the VFS via the following



Before a file can be accessed, the file system containing the file must be mounted.
This can be done using either the system call mount or the function mount_root(). Every
mounted file system is represented by a super_block structure. These structures are
held in the dynamic list super_blocks, which is chained through a struct list_head. The
maximum length of this list is limited by the max_super_blocks variable. The VFS function
read_super() is used to initialize the superblock. It creates an empty superblock, puts it
in the superblock list and calls the function provided by every file system implementation
to create the superblock.

The file system specific function read_super() reads information from the
corresponding block device. This is also the reason why a process is necessary for
mounting a file system.
struct list_head s_list;        // list of superblocks
kdev_t s_dev;                   // device
unsigned long s_blocksize;      // block size
unsigned char s_dirt;           // set if superblock has been changed
wait_queue_head_t s_wait;       // waiting queue




struct list_head i_list;        // chaining
unsigned long i_ino;            // inode number
uid_t i_uid;                    // user id
gid_t i_gid;                    // group id
kdev_t i_rdev;                  // real device
time_t i_atime;                 // last access time
off_t i_size;                   // size of file
time_t i_mtime;                 // last modification time
time_t i_ctime;                 // last change to inode





struct list_head f_list;        // chaining
mode_t f_mode;                  // access type
loff_t f_pos;                   // position in file
unsigned int f_uid, f_gid;      // owner



Proc file system:
Linux supports different file systems. Each process currently running in the system
is assigned a directory /proc/pid, where pid is the process identification number of
the relevant process. This directory contains files holding information on certain
characteristics of the process.
When the proc file system is mounted, the VFS function read_super() is called by
do_mount() and in turn calls the function proc_read_super() for the proc file system in
the file_system list.
iget() generates the inode for the proc root directory, which is entered in the
superblock. parse_options() then processes the mount options data that have been
provided and sets the owner of the root inode.
Accessing the file system is always carried out by accessing the root inode of the
file system. The first access is made by calling iget(). If the inode does not exist, this
function then calls the proc_read_inode() function entered in the proc_sops structure.
The inode describes the directory with read and execute permissions for all
processes. The proc_root_inode_operations structure only provides two functions: readdir
in the form of the proc_readroot() function and lookup as proc_lookuproot(). Both functions
operate using the table root_dir[], which contains the different entries for the root directory.

const char *name;                       // name of entry
mode_t mode;                            // mode
uid_t uid;                              // user id
gid_t gid;                              // group id
unsigned long size;                     // size of file
struct proc_dir_entry *next, *parent;   // chaining

Ext2 file system:
As Linux was initially developed under MINIX, it is hardly surprising that the first
Linux file system was the Minix file system. However, this file system restricts partitions to
64 MB and file names to 14 characters, so the search for a better file system was obvious.
The extended file system (Ext) that followed allowed partitions of up to 2 GB and file
names of up to 255 characters. It included several significant extensions but offered
unsatisfactory performance. The second extended file system (Ext2) was introduced in 1994.

i) Block Fragmentation: it allows different sized blocks to be allocated.
ii) Access Control List: allows ACL to be associated with each file
iii) Handles compressed and encrypted files
iv) Logical Deletion: an undelete option will allow users to easily recover
removed files.
The Ext2 file system is divided into block groups, and each block group has
a. a copy of the file system's super block
b. a copy of the group of block group descriptors
c. a data block bitmap
d. a group of inodes (the inode table)
e. an inode bitmap
f. a chunk of data belonging to files (the data blocks)
An ext2 disk super block is stored in ext2_super_block structure which contains the total
number of inodes, file system size, number of reserved blocks, free blocks counter, free
inodes counter, block size, fragment size and other important information.

Directory entries are maintained in a singly linked list.

struct ext2_dir_entry_2 {
__u32 inode;                  // inode number
__u16 rec_len;                // length of directory entry
__u8 name_len;                // length of file name
__u8 file_type;               // type of file
char name[EXT2_NAME_LEN];     // file name
};

Block allocation:
a. Target oriented: This algorithm first looks for the target block. If that is not free,
it looks within 32 blocks near the target block. If none of these is free, it tries to find
a block at least in the same block group, and only if that also fails does it search
elsewhere and allocate a block there.
b. Preallocation: If a free block is found, up to eight following blocks are reserved.
When the file is closed, the remaining blocks still reserved are released. This also
guarantees that as many data blocks as possible are collected into one cluster.

Proc vs ext2:

    Proc                                  ext2

1.  procedure oriented                    access oriented
2.  file name length = 14 chars           255 chars
3.  max. partition = 512 MB               2 GB
4.  mounted on /proc                      mounted on /
5.  currently running processes assigned  all files stored on it

Ever more advanced and faster processors are entering the market, but there will always be
applications that require still more processor power. In a multitasking system, the solution
is to employ several processors in order to achieve true parallel processing of tasks.
Performance doesn't increase linearly with the number of processors; rather, the OS bears an
increased responsibility to distribute all tasks among the processors in such a way that as
few processors as possible hamper each other.

Intel Multiprocessor Specification:
The Pentium already has some internal functions which support multiprocessor operation, such
as cache synchronization, interrupt handling and atomic operations for checking, setting
and exchanging values in main memory. Cache synchronization facilitates the implementation
of symmetric multiprocessing in the kernel.
The Intel multiprocessor specification version 1.4 defines the interaction between hardware
and software in order to facilitate the development of SMP-capable operating systems and to
make it possible for these systems to run on new hardware. It defines a highly symmetric
architecture in terms of
Memory Symmetry: all processors share the same main memory, the same data and code, and
the same OS and applications
I/O Symmetry: all processors share the I/O subsystem, which reduces possible I/O bottlenecks.
The following diagram shows a typical SMP system with two processors connected via the
Interrupt Controller Communications (ICC) bus to one or more Advanced Programmable
Interrupt Controllers (APICs). Pentium processors also have a local APIC; the local APICs
and the I/O APIC constitute a unit which deals with the distribution of incoming interrupts.

One processor is chosen by the BIOS as the boot processor (BSP) and is used for system
initialization. All other application processors (APs) are initially halted by the BIOS. The
multiprocessor (MP) configuration information is filled in by the BIOS and informs the OS
about the existing multiprocessor system. The BIOS initially forwards all interrupts only to
the boot processor, so that single-processor systems see no difference and run only on the BSP.
Problems with multiprocessor systems:
For the correct functioning of a multitasking system, it is important that data in the kernel
can only be changed by one processor at a time, so that identical resources cannot be
allocated twice.

Unix system:
a) Coarse-grained locking: lock the whole kernel
b) Finer-grained locking: reduce the time that a lock must be held => reduces
particularly critical latency times

Linux system:
Rules were established
i. no process running in kernel mode is interrupted by another process running in
kernel mode, except when it releases control and sleeps
ii. interrupt handling can interrupt a process in kernel mode, but in the end
control is returned to the same process
iii. interrupt handling is processed completely; it cannot be interrupted by a
process in kernel mode, only by an interrupt of higher priority

Initially, a single semaphore was used by all processes to monitor the transition to kernel
mode. This obeys all the rules but gives low performance. Later, finer-grained locking was
used. The transition can be carried out hierarchically by substituting one
semaphore with several others which each cover a smaller area of the Linux kernel. This
guarantees higher parallelism and higher system performance.

Symmetric Multiprocessing:
There are two or more processors, and the kernel decides which process should be allocated
to which processor. Memory is shared between the processors. If an update made by one
processor were kept only in that processor's private copy, it would not be reflected back to
the others, which would create problems; shared memory is used because changes are
reflected to all processors. SMP denotes a multiprocessor architecture in which no CPU is
selected as a master CPU; rather, all of them cooperate on an equal basis. When the kernel
is to be loaded, the boot processor selected by the BIOS loads the kernel.
Features of SMP:
1. Shared Memory
2. Shared IO Port
3. Hardware Cache Synchronization: synchronization means that the caches are kept
consistent. Suppose two processors are working on the same program and storing its
variables in different caches, and both modify the same variable at the same time; this
is not safe, and therefore only one cache may hold the modifiable copy at a time.
4. SMP is supported by atomic operations (read, modify, write back) on values in main memory
5. Distributed Interrupt Handling: every processor is provided with its own
interrupt controller, known as the local APIC

Changes to kernel:
In order to implement SMP in Linux, changes have to be made to both portable
and processor specific implementations.

Kernel Initialization:
All processors must be started, because the BIOS has halted all APs and only the boot
processor is running. Only this processor enters start_kernel(); after the normal
initialization it calls smp_init() and then smp_boot_cpus(). This activates all other
processors. Each processor receives its own stack. Each processor executes the startup code
and jumps to start_kernel(). Once exception handling and interrupt handling are initialized,
the processors are trapped by smp_callin() inside start_secondary().
asmlinkage void start_secondary(void)
void smp_callin(void)

calibrate_delay(); //determine processor's bogomips
smp_store_cpu_info(cpuid); //save processor's parameters


How a halted processor is started:
This is done by the APIC, which allows an interprocessor interrupt (IPI) and an
INIT IPI to be sent to all processors. The INIT signal works like a reset, and via the reset
flag the processor jumps to the BIOS. Then a startup IPI is sent to begin execution of the
real mode routine. After all processors are started, smp_num_cpus contains the number of
currently running processors. Then an idle task is created for each processor in order not
to block kernel mode for all other processors. After smp_init(), smp_commence() is called,
which sets the smp_commenced flag, whereupon all APs can leave smp_callin() and process
their individual idle task.

This is about an OS concept in which input is written to magnetic tape and then
processed, and the output is later written to tape again; when the printer becomes free, it
picks up the output and prints it.
The rest of this topic I leave to you as homework.

The Linux scheduler shows only slight changes. The task structure now has
NO_PROC_ID; // value used if no processor has been assigned as yet
last_processor; // number/ID of the processor that last ran the task
Every processor is assigned a new task which is executable and has not been assigned
to any other processor. Those tasks are preferred that last ran on the currently available
processor. This leads to an improvement in system performance when the internal processor
caches still contain data valid for the selected process.
current_set[smp_processor_id()]; // the task currently running on this processor

Message exchange between processors:
Messages in the form of interprocessor interrupts (IPIs) are handled via interrupts 13
and 16. Interrupt 13 is a fast interrupt: it does not require the kernel lock, does not
disturb the scheduler and is used only to deliver messages, ex: keyboard interrupts.
Interrupt 16 is a slow interrupt: it waits for the kernel lock and triggers the scheduler.
It is used to start the scheduler on other processors, ex: timer interrupts.
Interrupt Handling:
Interrupts are distributed by the I/O APIC. At system start, all interrupts are forwarded
only to the BSP. Each SMP OS must switch the APIC to SMP mode so that other processors can
handle interrupts. Linux does not use this operating mode; interrupts are only delivered to
the BSP. This compromises the latency time, since incoming interrupts can only be handled
when no processor is in the kernel or only the BSP is in the kernel. If there is an AP in
the kernel, the interrupt handling routine must wait until the AP has left the kernel. In
order to use the APIC's SMP mode, changes must be made to the current interrupt handling.


Every significant piece of software will contain defects, typically 2-5 per 100 lines of
code. Because of these mistakes, software does not behave as it is supposed to. Bug tracking,
identification and removal can consume a large amount of a programmer's time during
software development. Debugging is not testing (the task of verifying the program's
operation in all possible conditions), although testing and debugging are related and many
bugs are discovered during the testing process.

Types of errors:
Specification errors:
If a program is incorrectly specified, it will inevitably fail to
perform as required. Even the best programmer in the world can
write the wrong program. Before starting programming, the
programmer must understand what the program needs to do. You can
detect and remove many specification errors by reviewing the
requirements and agreeing them with the people who will use the program.
Design errors:
Programs of any size need to be designed before they are created.
Take time to think about how you will construct the program, what
data structures you will need and how they will be used.
Coding errors:
Everyone makes typing errors. Creating source code from a design
is an imperfect process. If you find a bug, do not overlook the option of
rereading the code or asking someone else to read it. It is surprising just how
many bugs you can detect and remove by talking through the implementation with
someone else.
Note: a compiler (as for C) can catch errors at compile time, while an interpreter (such as
the Linux shell) catches them at run time

NOTE: try executing the core of the program on paper; this is called dry running

The five stages of debugging are:

× Testing : find out what defects or bugs exist
× Stabilization : making bugs reproducible
× Localization : identify the lines responsible
× Correction : fixing the code
× Verification : making sure the fix works

In brief, following approach used for debugging and testing a Linux program

1. code inspection: change program and run it
2. instrumentation: gain more info, what is happening inside program
3. controlled execution: inspect program operation directly

Sometimes a segmentation fault occurs: the OS sends a signal to the program saying it
has detected an illegal memory access and is terminating the program to prevent
memory from being corrupted. The ability of the OS to detect illegal memory accesses
depends on its hardware configuration and its memory management.

Code inspection: reread the program. Tools available for this: the compiler
ex: gcc -Wall -pedantic -ansi
-Wall enables warnings and helpful information; -pedantic and -ansi add
checks for conformance to the C standard.

Instrumentation: the adding of code to a program for the purpose of collecting more
information about the behavior of the program as it runs. Ex: printf statements

#ifdef DEBUG
#endif                     (compile with gcc -DDEBUG)

__LINE__ for current line
__FILE__ for current file
__DATE__ current date
__TIME__ current time

#ifdef DEBUG
printf("line %d date %s\n", __LINE__, __DATE__);
#endif
$ cc -o cfile -DDEBUG cfile1.c
Controlled execution: add extra lines or use a debugger like the GNU debugger (gdb),
xxgdb or tgdb. The Emacs editor also has a facility that allows
you to run gdb on your program, set breakpoints and see
which line is being executed.
Working with gdb:
$ cc -g -o file1 file1.c (press enter)
$ gdb file1 (press enter)
. . .
. . .
(gdb) help (press enter)
For displaying commands
(gdb)run (press enter)
Starting program : /root/file1
. . .
Program received signal . . ., segmentation fault
. . .
(gdb) print j
$1 = 4
(gdb) print a[3]
$2 = { . . . }
(gdb) help breakpoint
. . .
(gdb) break 20
To insert breakpoint at line 20
(gdb) run
To run the program again from the start (it stops at the breakpoint)
(gdb) print a[0]@5
Print five elements, from a[0] to a[4]
(gdb) cont
To continue execution
(gdb) disable break 1
To disable the first breakpoint
(gdb) commands 2
The commands written here are run whenever the second breakpoint is hit
> set variable m=m+1
> cont
> end

Other debuggers:
1. $ valgrind --leak-check=yes -v ./file1
2. printk: with printk debugging, check points are created in the code and, when an
error occurs, an appropriate alarm message is printed. ex: whenever a kernel
segment routine wishes to access the data or code of a user segment process,
verify_area() is called, which checks the whole area concerned, and if any error
occurs, printk is called to print an appropriate message.

Modules are components of the Linux kernel that can be loaded and attached to it as
needed. To add support for a new device, you can now simply instruct the kernel to load its
module. In some cases, you may have to recompile only that module to provide support
for your device. The use of modules has the added advantage of reducing the size of the
kernel program. The kernel can load modules into memory only as they are needed. ex: the
modules for BLOCK devices and FILE SYSTEMS.

Implementation of modules in kernel:
Linux provides three system calls, create_module, init_module and
delete_module, for the implementation of Linux modules. A further system call is used by the
user process to obtain a copy of the kernel's symbol table.
The administration of modules under Linux makes use of a list in which all the
modules loaded are included. The list also administers the modules' symbol tables and
references. As far as the kernel is concerned, modules are loaded in two steps
corresponding to the system calls create_module and init_module. For the user process, this
procedure divides into four phases.
1. The process fetches the content of the object file into its own address space.
To get the code and data into a form in which they can actually be executed,
the actual load address must be added at various points. This process is called
relocation.
2. The system call create_module is now used, firstly to obtain the final address
of the object module and secondly to reserve memory for it. To do this, a
structure module is entered for the module in the list of modules and the
memory is allocated. The return value gives us the address to which the
module will later be copied.
3. The load address received from create_module is used to relocate the object file.
This procedure takes place in a memory area belonging to the process: if the
process is a user process, the load takes place in the user area, and if it is a
kernel process, in the kernel segment.
When a module is already loaded for one process and another process wishes to use it,
the second process uses the module that was loaded earlier. This mechanism is known as
module stacking.
4. Once the preliminary work is complete, we can load the object module. This
uses the system call init_module(). cleanup() is called when the module is
removed.
5. By using the system call delete_module(), a module that has been loaded can
be removed again. Two preconditions need to be met for this:
a. There must be no references to the module
b. The module's use counter must hold the value zero.

Select: the select function checks whether data can be read from the device or written to
it. If the device is free or the argument wait is NULL, the device will only be checked. If
it is ready for the function concerned, select() will return 1, otherwise 0. If wait is not
NULL, the process must be held up until the device becomes available.

Kernel daemon:
The kernel daemon is a process which automatically carries out the loading and
removing of modules without the user noticing it. ex: whenever a file on a floppy is
accessed, the kernel daemon loads the block device module for handling the block device and
loads the file system module for the particular file system. But how does the kernel daemon
know that modules need to be loaded?
Communication between the Linux kernel and the kernel daemon is carried out by
means of IPC. The kernel daemon opens a message queue with the new flag
IPC_KERNELD. The kernel sends messages to the kernel daemon using the kerneld_send
function. A request is stored in the kerneld_msg structure, which includes different
components:

mtype: contains the message type
id: indicates whether the kernel expects an answer
pid: holds the PID of the process that triggered the kernel request

responsibility for loading and releasing modules lies with the functions:

a) request_module: kernel requests the loading of a module and waits until the
operation has been carried out.
b) release_module: removes a module
c) delayed_release_module: allows a module to be removed with a specific delay
d) cancel_release_module: cancels a delayed removal of a module that was requested earlier

*-This unit is totally written by me after working with IPC however contents are copied
but that are changed to make proper understanding. For any queries regarding this unit
just write to my email. All queries asked within support period will be answered and
explained in detail.

There are many applications in which processes need to cooperate with each other. If a
process wants to use a shared resource, it is important to make sure that no other process
is accessing that resource at the same time. If this is not ensured, the result is a race
condition. Eliminating race conditions is a major use of IPC. A variety of forms of IPC can
be used under Linux. These support resource sharing, synchronization, connectionless and
connection-oriented exchange of data, or combinations of the above.

1. Resource Sharing: resources such as a shared printer or shared memory need
communication. Both must be handled with care during communication, so that no
process can modify another process or access another process's data.
2. Connectionless/connection oriented: In connection-oriented communication, two processes
must set up a connection before communicating, like making a telephone call; in Linux we
use pipes for this. In connectionless communication, we send a packet and leave it to the
infrastructure to deliver it, like sending a letter.
3. Synchronization: used to eliminate race conditions.

Synchronization in kernel:
Because the kernel manages the system resources, access by processes to these resources
must be synchronized. Normally, a process in the kernel will not be interrupted by the
scheduler unless it explicitly allows the execution of other processes by calling
schedule(). A process in the kernel can, however, be interrupted by an interrupt handling
routine; this may result in a race condition even if the process is not executing any
function that can block.
In the Linux kernel, processes are blocked and wait for particular events via
wait queues. A process can sit on a wait queue and will not run again until the
processes in the wait queue are reactivated by an interrupt handling routine or another
process. The structures wait_queue, wait_queue_head and list_head used for this are already
given in unit 4.
For adding and removing tasks from a wait queue, we use add_wait_queue() and
remove_wait_queue(). A process can be moved to the TASK_INTERRUPTIBLE or the
TASK_UNINTERRUPTIBLE state. In the first case, the task can be interrupted by signals,
while in the second case it cannot. Note that processes in both states can be placed in the
same wait queue. A process can be put to sleep using the sleep_on function and woken by the
wake_up macro.
sleep_on(struct wait_queue **p)
struct wait_queue wait;

Synchronization in kernel is done by semaphores. explain up and down functions of

Communication via files:
This is the oldest way of exchanging data between programs. Program A writes data to
a file and program B reads it. In a multitasking system, both programs can run as
processes at least quasi-parallel to each other. Race conditions then usually produce
inconsistencies in the file data, for example when program B reads the data before A has
finished modifying it. In Linux, there are two kinds of locking. Mandatory locking blocks
read and write operations on the locked area. Advisory locking locks file records, but
reading from and writing to the file remain possible even after the lock has been set, so
every cooperating process has to check for the lock itself.

Mandatory: it blocks read/write operations throughout the entire locked area. There are
two methods for locking an entire file:
a) Lock the whole file by means of the fcntl() system call. The whole file can also be
locked with flock(), but flock() is not defined in the POSIX standard, so
programmers are advised against using it.
b) In addition to the file to be locked, an auxiliary file known as a lock file is
created; while it is present, access to the file is refused. The lock file can be set
up in three ways:
1. link(): creates the lock file if it does not exist yet. On a failure, the process
calls sleep().
2. creat(): aborts with an error code if the calling process does not possess the
appropriate access rights.
3. open(): creates the lock file if it does not exist (O_CREAT | O_EXCL); otherwise
an error is returned.

- The drawback of all three methods is that after a failure the process must repeat its
attempt to set a lock file. The process calls sleep() to wait for 1 second and then
tries again.
+ The process can work on the file knowing that no other process can modify it or
write data to it.
+ No more than one process can access the same data. While the lock is held, other
processes cannot write to any area of the file.
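The lock-file method with open() can be sketched as follows. This is a minimal illustration, not a complete locking library: the function names lockfile_acquire(), lockfile_acquire_wait() and lockfile_release() are made up for this example, and the one-second retry simply mirrors the sleep() described above.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to create the lock file atomically.
 * Returns 0 if we now hold the lock, -1 if it is already held or on error. */
int lockfile_acquire(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd == -1)
        return -1;          /* errno == EEXIST: another process holds the lock */
    close(fd);
    return 0;
}

/* Retry once per second, for at most max_tries attempts. */
int lockfile_acquire_wait(const char *path, int max_tries)
{
    while (max_tries-- > 0) {
        if (lockfile_acquire(path) == 0)
            return 0;
        if (errno != EEXIST)
            return -1;      /* a real error, not just "already locked" */
        sleep(1);
    }
    return -1;
}

/* Release the lock by removing the lock file. */
int lockfile_release(const char *path)
{
    return unlink(path);
}
```

Because O_CREAT | O_EXCL makes the existence check and the creation a single step, two processes cannot both succeed in creating the same lock file.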

Advisory locking: with advisory locking, all processes accessing the file for read/write
operations have to set the appropriate lock and release it again. Locking a file area is
usually referred to as record locking. Advisory locking of a file area is achieved with the
fcntl() system call.
If there is no existing lock on an area, both read and write locks can be set.
If one or more read locks exist, a further read lock can be set but a write lock cannot.
If a write lock exists, neither a read lock nor a write lock can be set.
- Deadlocks are possible.
+ More than one process can work on different areas of a single file.
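As a rough sketch of record locking with fcntl(), the code below lets a parent process write-lock bytes 0-9 of a file and then checks, in a child process, that a conflicting lock on the same bytes is refused while a lock on bytes 10-19 is granted. The helper names lock_region() and record_lock_demo() are invented for this illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Apply type (F_RDLCK, F_WRLCK or F_UNLCK) to bytes [start, start+len). */
int lock_region(int fd, short type, off_t start, off_t len)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, F_SETLK, &fl);   /* non-blocking: fails on a conflict */
}

/* Returns 1 if the child is refused bytes 0-9 but granted bytes 10-19,
 * 0 if it behaves differently, -1 on a setup error. */
int record_lock_demo(const char *path)
{
    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd == -1)
        return -1;
    if (lock_region(fd, F_WRLCK, 0, 10) == -1) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: record locks belong to a process and are not inherited. */
        int cfd = open(path, O_RDWR);
        int refused = (lock_region(cfd, F_WRLCK, 0, 10) == -1);
        int granted = (lock_region(cfd, F_WRLCK, 10, 10) == 0);
        _exit(refused && granted ? 0 : 1);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    lock_region(fd, F_UNLCK, 0, 10);
    close(fd);
    unlink(path);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

F_SETLK returns immediately; using F_SETLKW instead would make the child wait for the lock, which is where the deadlock risk noted above comes from.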

Pipes:
A pipe is a way for one process to communicate with another in FIFO manner. It is a
one-way flow of data between processes: all the data written by one process to the pipe
is routed by the kernel to another process, which can then read it.
In the Linux shell, pipes are created by means of the | (pipe) operator.
Pipes are a very simple form of IPC, so I leave it to you to understand them and make your own notes.
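A small sketch of an anonymous pipe between a parent and a child may help when making those notes. It assumes the usual pipe() and fork() calls; the function name pipe_roundtrip() is made up for this example.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send msg from a child process to the parent through a pipe.
 * Stores the received text in out (null-terminated) and returns its
 * length, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outsz)
{
    int fds[2];                 /* fds[0] = read end, fds[1] = write end */
    if (outsz == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {             /* child: the writer */
        close(fds[0]);
        if (write(fds[1], msg, strlen(msg)) == -1)
            _exit(1);
        close(fds[1]);
        _exit(0);
    }

    close(fds[1]);              /* parent: the reader */
    size_t total = 0;
    ssize_t n;
    while (total < outsz - 1 &&
           (n = read(fds[0], out + total, outsz - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Each side closes the end it does not use; the parent's read() returns 0 (end of file) only once every write end has been closed.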







Named pipes:
Named pipes allow two unrelated processes to communicate with each other. They are
also known as FIFOs (first-in, first-out) and can be used to establish a one-way (half-
duplex) flow of data.
Named pipes are identified by their access point, which is basically in a file kept on the
file system. Because named pipes have the pathname of a file associated with them, it is
possible for unrelated processes to communicate with each other; in other words, two
unrelated processes can open the file associated with the named pipe and begin
communication. Unlike anonymous pipes, which are process-persistent objects, named
pipes are file system-persistent objects, that is, they exist beyond the life of the process.
They have to be explicitly deleted by one of the processes by calling "unlink" or else
deleted from the file system via the command line.
In order to communicate by means of a named pipe, the processes have to open the file
associated with the named pipe. By opening the file for reading, the process has access to
the reading end of the pipe, and by opening the file for writing, the process has access to
the writing end of the pipe.
A named pipe supports blocked read and write operations by default: if a process opens
the file for reading, it is blocked until another process opens the file for writing, and vice
versa. However, it is possible to make named pipes support non-blocking operations by
specifying the O_NONBLOCK flag while opening them. A named pipe must be opened
either read-only or write-only. It must not be opened for read-write because it is half-
duplex, that is, a one-way channel.
Shells make extensive use of pipes; for example, we use pipes to send the output of one
command as the input of the other command. In real-life UNIX® applications, named
pipes are used for communication, when the two processes need a simple method for
synchronous communication.
Creating a Named Pipe
A named pipe can be created in two ways -- via the command line or from within a
program.
From the Command Line
A named pipe may be created from the shell command line. For this one can use either
the "mknod" or "mkfifo" commands.
To create a named pipe with the file named "npipe" you can use one of the following
commands:

% mknod npipe p
% mkfifo npipe
You can also provide an absolute path of the named pipe to be created.
Now if you look at the file using "ls -l", you will see the following output:
prw-rw-r-- 1 secf other 0 Jun 6 17:35 npipe
The 'p' on the first column denotes that this is a named pipe. Just like any file in the
system, it has access permissions that define which users may open the named pipe, and
whether for reading, writing, or both.
Within a Program
The function "mkfifo" can be used to create a named pipe from within a program. The
signature of the function is as follows:
int mkfifo(const char *path, mode_t mode)
The mkfifo function takes the path of the file and the mode (permissions) with which the
file should be created. It creates the new named pipe file as specified by the path.
The function call assumes the O_CREAT|O_EXCL flags, that is, it creates a new named
pipe or returns an error of EEXIST if the named pipe already exists. The named pipe's
owner ID is set to the process' effective user ID, and its group ID is set to the process'
effective group ID, or if the S_ISGID bit is set in the parent directory, the group ID of the
named pipe is inherited from the parent directory.
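The EEXIST behaviour can be handled along the following lines. This is only a sketch; the helper name ensure_fifo() is hypothetical and not part of any library.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a FIFO at path unless one is already there.
 * Returns 0 if a FIFO exists at path afterwards, -1 otherwise. */
int ensure_fifo(const char *path, mode_t mode)
{
    struct stat st;

    if (mkfifo(path, mode) == 0)
        return 0;                   /* newly created */
    if (errno != EEXIST)
        return -1;                  /* some other failure, e.g. EACCES */

    /* Something already exists at path: accept it only if it is a FIFO. */
    if (stat(path, &st) == -1 || !S_ISFIFO(st.st_mode))
        return -1;
    return 0;
}
```

Checking S_ISFIFO() guards against the case where an ordinary file happens to have the same name as the intended named pipe.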
Opening a Named Pipe
A named pipe can be opened for reading or writing, and it is handled just like any other
normal file in the system. For example, a named pipe can be opened by using the open()
system call, or by using the fopen() standard C library function.
As with normal files, if the call succeeds, you will get a file descriptor in the case of
open(), or a 'FILE' structure pointer in the case of fopen(), which you may use either
for reading or for writing, depending on the parameters passed to open() or to fopen().
Therefore, from a user's point of view, once you have created the named pipe, you can
treat it as a file so far as the operations for opening, reading, writing, and deleting are
concerned.

Reading From and Writing to a Named Pipe
Reading from and writing to a named pipe are very similar to reading and writing from or
to a normal file. The read() and write() system calls can be used
for reading from and writing to a named pipe. These operations are blocking, by default.
The following points need to be kept in mind while doing read/writes to a named pipe:
• A named pipe cannot be opened for both reading and writing. The process opening it
must choose either read mode or write mode. The pipe opened in one mode will
remain in that mode until it is closed.
• Read and write operations to a named pipe are blocking, by default. Therefore if a
process reads from a named pipe and the pipe does not have data in it, the reading
process will be blocked. Similarly, if a process opens a named pipe for writing and
the pipe has no reader, the writing process is blocked until another process opens
the named pipe for reading. This, of course, can be overridden by specifying the
O_NONBLOCK flag while opening the named pipe.
• Seek operations (via the lseek function) cannot be performed on named pipes.
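The effect of O_NONBLOCK and the restriction on seeking can be observed with a short experiment. The sketch below assumes POSIX behaviour as found on Linux; the function name fifo_nonblock_demo() is invented for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if all three behaviours are observed on a fresh FIFO at path,
 * 0 if any of them differs, -1 if the FIFO could not be created. */
int fifo_nonblock_demo(const char *path)
{
    int ok = 1;

    unlink(path);
    if (mkfifo(path, 0600) == -1)
        return -1;

    /* 1. Write-only, non-blocking, no reader yet: open() fails with ENXIO. */
    int wfd = open(path, O_WRONLY | O_NONBLOCK);
    if (!(wfd == -1 && errno == ENXIO))
        ok = 0;
    if (wfd != -1)
        close(wfd);

    /* 2. Read-only, non-blocking: open() returns at once, even with no writer. */
    int rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd == -1)
        ok = 0;

    /* 3. Seeking on a FIFO fails with ESPIPE. */
    if (rfd != -1) {
        if (!(lseek(rfd, 0, SEEK_SET) == (off_t)-1 && errno == ESPIPE))
            ok = 0;
        close(rfd);
    }

    unlink(path);
    return ok;
}
```

Without O_NONBLOCK, both open() calls would simply wait for the other end to be opened.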
Full-Duplex Communication Using Named Pipes
Although named pipes give a half-duplex (one-way) flow of data, you can establish full-
duplex communication by using two different named pipes, so each named pipe provides
the flow of data in one direction. However, you have to be very careful about the order in
which these pipes are opened in the client and server, otherwise a deadlock may occur.
For example, let us say you create the following named pipes:
NP1 and NP2
In order to establish a full-duplex channel, here is how the server and the client should
treat these two named pipes:
Let us assume that the server opens the named pipe NP1 for reading and the second pipe
NP2 for writing. Then in order to ensure that this works correctly, the client must open
the first named pipe NP1 for writing and the second named pipe NP2 for reading. This
way a full-duplex channel can be established between the two processes.
Failure to observe the above-mentioned sequence may result in a deadlock situation.
Benefits of Named Pipes
• Named pipes are very simple to use.
• mkfifo is a thread-safe function.
• No synchronization mechanism is needed when using named pipes.
• A write (using the write function call) of up to PIPE_BUF bytes to a named pipe is
guaranteed to be atomic. It is atomic even if the named pipe is opened in
non-blocking mode.
• Named pipes have permissions (read and write) associated with them, unlike
anonymous pipes. These permissions can be used to enforce secure communication.
Limitations of Named Pipes
• Named pipes can only be used for communication among processes on the same host.
• Named pipes can be created only in the local file system of the host, that is, you
cannot create a named pipe on the NFS file system.
• Due to the basic blocking nature of pipes, careful programming is required for the
client and server, in order to avoid deadlocks.
• Named pipe data is a byte stream, and no record identification exists.
Code Samples
The code samples given here were compiled using the GNU C compiler version 3.0.3 and
were run and tested on a SPARC processor-based Sun Ultra 10 workstation running the
Solaris 8 Operating Environment.
The following code samples illustrate half-duplex and full-duplex communication
between two unrelated processes by using named pipes.
Example of Half-Duplex Communication
In the following example, a client and server use named pipes for one-way
communication. The server creates a named pipe, opens it for reading and waits for
input on the read end of the pipe. Named-pipe reads are blocking by default, so the server
waits for the client to send some request on the pipe. Once data becomes available, it
converts the string to upper case and prints via STDOUT.
The client opens the same named pipe in write mode and writes a user-specified string to
the pipe (see Figure 1).
The following table shows the contents of the header file used by both the client and
server. It contains the definition of the named pipe that is used to communicate between
the client and the server.

Filename : halfduplex.h
#define HALF_DUPLEX "/tmp/halfduplex"
#define MAX_BUF_SIZE 255

Server Code
The following table shows the contents of Filename : hd_server.c.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <ctype.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include "halfduplex.h" /* For name of the named-pipe */

int main(int argc, char *argv[])
{
    int fd, ret_val, count, numread;
    char buf[MAX_BUF_SIZE];

    /* Create the named - pipe */
    ret_val = mkfifo(HALF_DUPLEX, 0666);

    if ((ret_val == -1) && (errno != EEXIST)) {
        perror("Error creating the named pipe");
        exit (1);
    }

    /* Open the pipe for reading */
    fd = open(HALF_DUPLEX, O_RDONLY);

    /* Read from the pipe, leaving room for the terminating null */
    numread = read(fd, buf, MAX_BUF_SIZE - 1);
    if (numread < 0) {
        perror("Error reading from the named pipe");
        exit (1);
    }

    buf[numread] = '\0';

    printf("Half Duplex Server : Read From the pipe : %s\n", buf);

    /* Convert the string to upper case */
    count = 0;
    while (count < numread) {
        buf[count] = toupper((unsigned char) buf[count]);
        count++;
    }

    printf("Half Duplex Server : Converted String : %s\n", buf);

    return 0;
}

Client Code
The following table shows the contents of Filename : hd_client.c.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <ctype.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include "halfduplex.h" /* For name of the named-pipe */

int main(int argc, char *argv[])
{
    int fd;

    /* Check if an argument was specified. */
    if (argc != 2) {
        printf("Usage : %s <string to be sent to the server>\n", argv[0]);
        exit (1);
    }

    /* Open the pipe for writing */
    fd = open(HALF_DUPLEX, O_WRONLY);

    /* Write to the pipe */
    write(fd, argv[1], strlen(argv[1]));

    return 0;
}

Running the Client and the Server

When you run the server, it will block on the read call and will wait until the client writes
something to the named pipe. After that it will print what it read from the pipe, convert
the string to upper case, and then terminate. In a typical implementation this server will
be either an iterative or a concurrent server. But for simplicity and to demonstrate the
communication through the named pipe, we have kept the server code very simple. When
you run the client, you will need to give a string as an argument.
Make sure you run the server first, so that the named pipe gets created.
Expected output:
1. Run the server:

% hd_server &

The server program will block here, and the shell will return control to the command
line.
2. Run the client:

% hd_client hello
3. The server prints the string read and terminates:

Half Duplex Server : Read From the pipe : hello
Half Duplex Server : Converted String : HELLO
Example of Full-Duplex Communication
In the following example, a client and server use named pipes for two-way
communication. The server creates two named pipes. It opens the first pipe for reading
and the second pipe for writing to communicate back to the client. It then waits for input
on the read pipe. Once data is available, it converts the string to upper case and writes the
converted string to the write pipe, which the client will read and print.
The client opens the first pipe for writing, and it sends data through this pipe to the
server. The client opens the second pipe for reading, and through this pipe, it reads the
server's response (see Figure 2).
The following table shows the contents of the header file used by both the client and
server. It contains the definition of the two named pipes that are used to communicate
between the client and the server.

Filename : fullduplex.h
#define NP1 "/tmp/np1"
#define NP2 "/tmp/np2"
#define MAX_BUF_SIZE 255

Server Code
The following table shows the contents of Filename : fd_server.c.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <ctype.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include "fullduplex.h" /* For name of the named-pipe */

int main(int argc, char *argv[])
{
    int rdfd, wrfd, ret_val, count, numread;
    char buf[MAX_BUF_SIZE];

    /* Create the first named - pipe */
    ret_val = mkfifo(NP1, 0666);

    if ((ret_val == -1) && (errno != EEXIST)) {
        perror("Error creating the named pipe");
        exit (1);
    }

    /* Create the second named - pipe */
    ret_val = mkfifo(NP2, 0666);

    if ((ret_val == -1) && (errno != EEXIST)) {
        perror("Error creating the named pipe");
        exit (1);
    }

    /* Open the first named pipe for reading */
    rdfd = open(NP1, O_RDONLY);

    /* Open the second named pipe for writing */
    wrfd = open(NP2, O_WRONLY);

    /* Read from the first pipe, leaving room for the terminating null */
    numread = read(rdfd, buf, MAX_BUF_SIZE - 1);
    if (numread < 0) {
        perror("Error reading from the named pipe");
        exit (1);
    }

    buf[numread] = '\0';

    printf("Full Duplex Server : Read From the pipe : %s\n", buf);

    /* Convert the string to upper case */
    count = 0;
    while (count < numread) {
        buf[count] = toupper((unsigned char) buf[count]);
        count++;
    }

    /*
     * Write the converted string back to the second
     * pipe
     */
    write(wrfd, buf, strlen(buf));

    return 0;
}

Client Code
The following table shows the contents of Filename : fd_client.c.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <ctype.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include "fullduplex.h" /* For name of the named-pipe */

int main(int argc, char *argv[])
{
    int wrfd, rdfd, numread;
    char rdbuf[MAX_BUF_SIZE];

    /* Check if an argument was specified. */
    if (argc != 2) {
        printf("Usage : %s <string to be sent to the server>\n", argv[0]);
        exit (1);
    }

    /* Open the first named pipe for writing */
    wrfd = open(NP1, O_WRONLY);

    /* Open the second named pipe for reading */
    rdfd = open(NP2, O_RDONLY);

    /* Write to the pipe */
    write(wrfd, argv[1], strlen(argv[1]));

    /* Read from the pipe, leaving room for the terminating null */
    numread = read(rdfd, rdbuf, MAX_BUF_SIZE - 1);
    if (numread < 0) {
        perror("Error reading from the named pipe");
        exit (1);
    }

    rdbuf[numread] = '\0';

    printf("Full Duplex Client : Read From the pipe : %s\n", rdbuf);

    return 0;
}

Running the Client and the Server
When you run the server, it will create the two named pipes and will block on the read
call. It will wait until the client writes something to the named pipe. After that it will
convert the string to upper case and then write it to the other pipe, which will be read by
the client and displayed on STDOUT. When you run the client you will need to give a
string as an argument.
Make sure you run the server first, so that the named pipes get created.
Expected output:
1. Run the server:

% fd_server &

The server program will block here, and the shell will return control to the command
line.
2. Run the client:

% fd_client hello

The client program will send the string to server and block on the read to await the
server's response.
3. The server prints the following:

Full Duplex Server : Read From the pipe : hello
The client prints the following:

Full Duplex Client : Read From the pipe : HELLO

Shared memory:
Shared memory is an efficient means of passing data between programs. One
program will create a memory portion which other processes (if permitted) can access.
In the Solaris 2.x operating system, the most efficient way to implement shared memory
applications is to rely on the mmap() function and on the system's native virtual memory
facility. Solaris 2.x also supports System V shared memory, which is another way to let
multiple processes attach a segment of physical memory to their virtual address spaces.
When write access is allowed for more than one process, an outside protocol or
mechanism such as a semaphore can be used to prevent inconsistencies and collisions.
A process creates a shared memory segment using shmget(). The original owner of a
shared memory segment can assign ownership to another user with shmctl(). It can also
revoke this assignment. Other processes with proper permission can perform various
control functions on the shared memory segment using shmctl(). Once created, a shared
segment can be attached to a process address space using shmat(). It can be detached
using shmdt() (see shmop()). The attaching process must have the appropriate
permissions for shmat(). Once attached, the process can read or write to the segment, as
allowed by the permission requested in the attach operation. A shared segment can be
attached multiple times by the same process. A shared memory segment is described by a
control structure with a unique ID that points to an area of physical memory. The
identifier of the segment is called the shmid. The structure definition for the shared
memory segment control structures and prototypes can be found in <sys/shm.h>.
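The whole life cycle (create, attach, use, detach, remove) can be sketched in one short function. This is an illustration under simple assumptions: it uses a private segment (IPC_PRIVATE) shared between a parent and its child, waitpid() serves as the only synchronization, and the name shm_roundtrip() is invented for the example.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/* A child writes msg into a private segment; the parent copies it to out.
 * Returns 0 on success, -1 on failure. */
int shm_roundtrip(const char *msg, char *out, size_t outsz)
{
    size_t len = strlen(msg) + 1;           /* include the terminating null */
    if (len > outsz)
        return -1;

    int shmid = shmget(IPC_PRIVATE, len, IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    char *mem = shmat(shmid, NULL, 0);      /* let the system pick the address */
    if (mem == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    int rc = -1;
    pid_t pid = fork();
    if (pid == 0) {                         /* child inherits the attachment */
        memcpy(mem, msg, len);
        shmdt(mem);
        _exit(0);
    }
    if (pid > 0) {
        waitpid(pid, NULL, 0);              /* wait until the child has written */
        memcpy(out, mem, len);
        rc = 0;
    }

    shmdt(mem);                             /* detach ... */
    shmctl(shmid, IPC_RMID, NULL);          /* ... and remove the segment */
    return rc;
}
```

With unrelated processes, both sides would call shmget() with the same key instead of IPC_PRIVATE, and a semaphore would normally replace waitpid() as the coordination mechanism.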
Accessing a Shared Memory Segment
shmget() is used to obtain access to a shared memory segment. It is prototyped by:
int shmget(key_t key, size_t size, int shmflg);
The key argument is an access value associated with the shared memory segment ID. The size
argument is the size in bytes of the requested shared memory. The shmflg argument
specifies the initial access permissions and creation control flags.
When the call succeeds, it returns the shared memory segment ID. This call is also used
to get the ID of an existing shared segment (from a process requesting sharing of some
existing memory portion).
The following code illustrates shmget():
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <stdlib.h>


key_t key; /* key to be passed to shmget() */
int shmflg; /* shmflg to be passed to shmget() */
int shmid; /* return value from shmget() */
int size; /* size to be passed to shmget() */


key = ...
size = ...
shmflg = ...

if ((shmid = shmget (key, size, shmflg)) == -1) {
    perror("shmget: shmget failed");
    exit(1);
} else {
    (void) fprintf(stderr, "shmget: shmget returned %d\n", shmid);
}
Controlling a Shared Memory Segment
shmctl() is used to alter the permissions and other characteristics of a shared memory
segment. It is prototyped as follows:
int shmctl(int shmid, int cmd, struct shmid_ds *buf);
The process must have an effective ID of owner, creator or superuser to perform this
command. The cmd argument is one of the following control commands:
SHM_LOCK -- Lock the specified shared memory segment in memory. The process must
have the effective ID of superuser to perform this command.
SHM_UNLOCK -- Unlock the shared memory segment. The process must have the effective
ID of superuser to perform this command.
IPC_STAT -- Return the status information contained in the control structure and place
it in the buffer pointed to by buf. The process must have read permission on the
segment to perform this command.
IPC_SET -- Set the effective user and group identification and access permissions. The
process must have an effective ID of owner, creator or superuser to perform this
command.
IPC_RMID -- Remove the shared memory segment.

The buf is a structure of type struct shmid_ds which is defined in <sys/shm.h>.
The following code illustrates shmctl():
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>


int cmd; /* command code for shmctl() */
int shmid; /* segment ID */
int rtrn; /* return value from shmctl() */
struct shmid_ds shmid_ds; /* shared memory data structure to
                             hold results */

shmid = ...
cmd = ...
if ((rtrn = shmctl(shmid, cmd, &shmid_ds)) == -1) {
    perror("shmctl: shmctl failed");
}
Attaching and Detaching a Shared Memory Segment
shmat() and shmdt() are used to attach and detach shared memory segments. They are
prototyped as follows:
void *shmat(int shmid, const void *shmaddr, int shmflg);

int shmdt(const void *shmaddr);
shmat() returns a pointer, shmaddr, to the head of the shared segment associated with a
valid shmid. shmdt() detaches the shared memory segment located at the address
indicated by shmaddr.
The following code illustrates calls to shmat() and shmdt():
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

static struct state { /* Internal record of attached segments. */
    int shmid; /* shmid of attached segment */
    char *shmaddr; /* attach point */
    int shmflg; /* flags used on attach */
} ap[MAXnap]; /* State of current attached segments. */
int nap; /* Number of currently attached segments. */


char *addr; /* address work variable */
register int i; /* work area */
register struct state *p; /* ptr to current state entry */

p = &ap[nap++];
p->shmid = ...
p->shmaddr = ...
p->shmflg = ...

p->shmaddr = shmat(p->shmid, p->shmaddr, p->shmflg);
if (p->shmaddr == (char *)-1) {
    perror("shmop: shmat failed");
    nap--; /* the attach failed, so give the state entry back */
} else {
    (void) fprintf(stderr, "shmop: shmat returned %p\n",
                   (void *) p->shmaddr);
}

i = shmdt(addr);
if (i == -1) {
    perror("shmop: shmdt failed");
} else {
    (void) fprintf(stderr, "shmop: shmdt returned %d\n", i);

    /* Drop the detached segment from the internal record. */
    for (p = ap, i = nap; i--; p++)
        if (p->shmaddr == addr) *p = ap[--nap];
}

Example: two processes communicating via shared memory: shm_server.c, shm_client.c
We develop two programs here that illustrate the passing of a simple piece of memory (a
string) between the processes if running simultaneously:
shm_server.c -- simply creates the string and shared memory portion.

shm_client.c -- attaches itself to the created shared memory portion and uses the string.
The code listings of the 2 programs now follow:
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define SHMSZ 27

int main(void)
{
    char c;
    int shmid;
    key_t key;
    char *shm, *s;

    /*
     * We'll name our shared memory segment
     * "5678".
     */
    key = 5678;

    /* Create the segment. */
    if ((shmid = shmget(key, SHMSZ, IPC_CREAT | 0666)) < 0) {
        perror("shmget");
        exit(1);
    }

    /* Now we attach the segment to our data space. */
    if ((shm = shmat(shmid, NULL, 0)) == (char *) -1) {
        perror("shmat");
        exit(1);
    }

    /*
     * Now put some things into the memory for the
     * other process to read.
     */
    s = shm;

    for (c = 'a'; c <= 'z'; c++)
        *s++ = c;
    *s = '\0';

    /*
     * Finally, we wait until the other process
     * changes the first character of our memory
     * to '*', indicating that it has read what
     * we put there.
     */
    while (*shm != '*')
        sleep(1);

    return 0;
}

/*
 * shm-client - client program to demonstrate shared memory.
 */
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <stdlib.h>

#define SHMSZ 27

int main(void)
{
    int shmid;
    key_t key;
    char *shm, *s;

    /*
     * We need to get the segment named
     * "5678", created by the server.
     */
    key = 5678;

    /* Locate the segment. */
    if ((shmid = shmget(key, SHMSZ, 0666)) < 0) {
        perror("shmget");
        exit(1);
    }

    /* Now we attach the segment to our data space. */
    if ((shm = shmat(shmid, NULL, 0)) == (char *) -1) {
        perror("shmat");
        exit(1);
    }

    /* Now read what the server put in the memory. */
    for (s = shm; *s != '\0'; s++)
        putchar(*s);
    putchar('\n');

    /*
     * Finally, change the first character of the
     * segment to '*', indicating we have read
     * the segment.
     */
    *shm = '*';

    return 0;
}


Explain p/up and v/down function
Semaphore Example 1: Locking
The most typical use of a semaphore is to protect a chunk of code that can only be
executed by one thread at a time. The semaphore acts as a lock; acquire_sem() locks the
code, release_sem() releases it. Semaphores that are used as locks are (almost always)
created with a thread count of 1.
As a simple example, let's say you keep track of a maximum value like this:
/* max_val is a global. */
uint32 max_val = 0;

/* bump_max() resets the max value, if necessary. */
void bump_max(uint32 new_value)
{
    if (new_value > max_val)
        max_val = new_value;
}

bump_max() isn't thread safe; there's a race condition between the comparison and the
assignment. So we protect it with a semaphore:
sem_id max_sem;
uint32 max_val = 0;

/* Initialize the semaphore during a setup routine. */
status_t init()
{
    if ((max_sem = create_sem(1, "max_sem")) < B_NO_ERROR)
        return B_ERROR;
    return B_NO_ERROR;
}

void bump_max(uint32 new_value)
{
    if (acquire_sem(max_sem) != B_NO_ERROR)
        return;
    if (new_value > max_val)
        max_val = new_value;
    release_sem(max_sem);
}
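The acquire/release pair used here corresponds to the classic P (down) and V (up) operations. On Linux the same idea can be sketched with System V semaphores instead of the create_sem()/acquire_sem() calls shown above; this is a different API, swapped in only for illustration. The helper names sem_make(), sem_down(), sem_up(), sem_peek() and sem_remove() are invented for this example.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* semctl() takes a union as its optional fourth argument; on Linux the
 * caller has to declare it. A private name avoids clashing with systems
 * whose headers already define union semun. */
union sem_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private set holding one semaphore with the given start value. */
int sem_make(int initial)
{
    union sem_arg arg;
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (id == -1)
        return -1;
    arg.val = initial;
    if (semctl(id, 0, SETVAL, arg) == -1) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }
    return id;
}

/* Add delta to semaphore 0 of the set; blocks if the value would go below 0. */
static int sem_change(int id, short delta)
{
    struct sembuf op;
    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = 0;
    return semop(id, &op, 1);
}

int sem_down(int id)   { return sem_change(id, -1); }      /* P: acquire */
int sem_up(int id)     { return sem_change(id, 1); }       /* V: release */
int sem_peek(int id)   { return semctl(id, 0, GETVAL); }   /* current value */
int sem_remove(int id) { return semctl(id, 0, IPC_RMID); } /* delete the set */
```

A critical section is then bracketed by sem_down(id) and sem_up(id), with the semaphore created by sem_make(1) so that only one process can be inside at a time.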

Semaphore Example 2: Benaphores
A "benaphore" is a combination of an atomic variable and a semaphore that can improve
locking efficiency. If you're using a semaphore as shown in the previous example, you
should consider using a benaphore instead (if you can).

Here's the example re-written to use a benaphore:
sem_id max_sem;
uint32 max_val = 0;
int32 ben_val = 0;

status_t init()
{
    /* This time we initialize the semaphore to 0. */
    if ((max_sem = create_sem(0, "max_sem")) < B_NO_ERROR)
        return B_ERROR;
    return B_NO_ERROR;
}

void bump_max(uint32 new_value)
{
    int32 previous = atomic_add(&ben_val, 1);
    if (previous >= 1)
        if (acquire_sem(max_sem) != B_NO_ERROR)
            goto get_out;

    if (new_value > max_val)
        max_val = new_value;

get_out:
    previous = atomic_add(&ben_val, -1);
    if (previous > 1)
        release_sem(max_sem);
}

The point, here, is that acquire_sem() is called only if it's known (by checking the
previous value of ben_val) that some other thread is in the middle of the critical section.
On the releasing end, the release_sem() is called only if some other thread has since
entered the function (and is now blocked in the acquire_sem() call). An important point,
here, is that the semaphore is initialized to 0.
Semaphore Example 3: Imposing an Execution Order
Semaphores can also be used to coordinate threads that are performing separate
operations, but that need to perform these operations in a particular order. In the
following example, we have a global buffer that's accessed through separate reading and
writing functions. Furthermore, we want writes and reads to alternate, with a write going
first.
We can lock the entire buffer with a single semaphore, but to enforce alternation we need
two semaphores:
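sem_id write_sem, read_sem;
char buffer[1024];

/* Initialize the semaphores */
status_t init()
{
    if ((write_sem = create_sem(1, "write")) < B_NO_ERROR) {
        return B_ERROR;
    }
    if ((read_sem = create_sem(0, "read")) < B_NO_ERROR) {
        delete_sem(write_sem);
        return B_ERROR;
    }
    return B_NO_ERROR;
}

status_t write_buffer(const char *src)
{
    if (acquire_sem(write_sem) != B_NO_ERROR)
        return B_ERROR;

    strncpy(buffer, src, 1024);

    /* Let the reader run next. */
    release_sem(read_sem);
    return B_NO_ERROR;
}

status_t read_buffer(char *dest, size_t len)
{
    if (acquire_sem(read_sem) != B_NO_ERROR)
        return B_ERROR;

    strncpy(dest, buffer, len);

    /* Let the writer run next. */
    release_sem(write_sem);
    return B_NO_ERROR;
}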


The initial thread counts ensure that the buffer will be written to before it's read: If a
reader arrives before a writer, the reader will block until the writer releases the read_sem
semaphore.

Message queue:
The basic idea of a message queue is a simple one.
Two (or more) processes can exchange information via access to a common system
message queue. The sending process places via some (OS) message-passing module a
message onto a queue which can be read by another process (Figure 24.1). Each message
is given an identification or type so that processes can select the appropriate message.
Process must share a common key in order to gain access to the queue in the first place
(subject to other permissions -- see below).

Fig. Basic Message Passing
IPC messaging lets processes send and receive messages, and queue messages for
processing in an arbitrary order. Unlike the file byte-stream data flow of pipes, each IPC
message has an explicit length. Messages can be assigned a specific type. Because of this,
a server process can direct message traffic between clients on its queue by using the
client process PID as the message type. For single-message transactions, multiple server
processes can work in parallel on transactions sent to a shared message queue.
Before a process can send or receive a message, the queue must be initialized (through
the msgget() function; see below). Operations to send and receive messages are performed
by the msgsnd() and msgrcv() functions, respectively.
When a message is sent, its text is copied to the message queue. The msgsnd() and
msgrcv() functions can be performed as either blocking or non-blocking operations. Non-
blocking operations allow for asynchronous message transfer -- the process is not
suspended as a result of sending or receiving a message. In blocking or synchronous
message passing the sending process cannot continue until the message has been
transferred or has even been acknowledged by a receiver. IPC signal and other
mechanisms can be employed to implement such transfer. A blocked message operation
remains suspended until one of the following three conditions occurs:

• The call succeeds.
• The process receives a signal.
• The queue is removed.
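Sending and receiving typed messages can be sketched as below. The example uses the non-blocking form (IPC_NOWAIT) so that it never suspends; the structure demo_msg and the helpers mq_send() and mq_recv() are invented for this illustration, and the 64-byte body limit is an arbitrary choice.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define TEXT_MAX 64

struct demo_msg {
    long mtype;             /* message type, must be greater than 0 */
    char mtext[TEXT_MAX];   /* message body */
};

/* Queue text (including its terminating null) under the given type. */
int mq_send(int msqid, long type, const char *text)
{
    struct demo_msg m;
    size_t len = strlen(text) + 1;

    if (type <= 0 || len > TEXT_MAX)
        return -1;
    m.mtype = type;
    memcpy(m.mtext, text, len);
    return msgsnd(msqid, &m, len, IPC_NOWAIT);
}

/* Fetch the oldest message of the given type without blocking.
 * Returns the body length, or -1 if there is none or on error. */
ssize_t mq_recv(int msqid, long type, char *out, size_t outsz)
{
    struct demo_msg m;
    ssize_t n = msgrcv(msqid, &m, TEXT_MAX, type, IPC_NOWAIT);

    if (n == -1 || (size_t)n > outsz)
        return -1;
    memcpy(out, m.mtext, (size_t)n);
    return n;
}
```

Because msgrcv() selects by type, a receiver can pick up a later message of one type while an earlier message of another type stays on the queue. Passing 0 instead of IPC_NOWAIT gives the blocking behaviour described above.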
Initialising the Message Queue
The msgget() function initializes a new message queue:
int msgget(key_t key, int msgflg)
It can also return the message queue ID (msqid) of the queue corresponding to the key
argument. The value passed as the msgflg argument must be an octal integer with settings
for the queue's permissions and control flags.
The following code illustrates the msgget() function.
#include <sys/ipc.h>
#include <sys/msg.h>
#include <stdio.h>


key_t key; /* key to be passed to msgget() */
int msgflg; /* msgflg to be passed to msgget() */
int msqid; /* return value from msgget() */

key = ...
msgflg = ...

if ((msqid = msgget(key, msgflg)) == -1) {
    perror("msgget: msgget failed");
} else {
    (void) fprintf(stderr, "msgget succeeded");
}
IPC Functions, Key Arguments, and Creation Flags: <sys/ipc.h>
Processes requesting access to an IPC facility must be able to identify it. To do this,
functions that initialize or provide access to an IPC facility use a key_t key argument.
(key_t is essentially an int type defined in <sys/types.h>.)
The key is an arbitrary value or one that can be derived from a common seed at run time.
One way is with ftok() , which converts a filename to a key value that is unique within
the system. Functions that initialize or get access to messages (also semaphores or shared
memory see later) return an ID number of type int. IPC functions that perform read,
write, and control operations use this ID. If the key argument is specified as
IPC_PRIVATE, the call initializes a new instance of an IPC facility that is private to the
creating process. When the IPC_CREAT flag is supplied in the flags argument
appropriate to the call, the function tries to create the facility if it does not exist already.
When called with both the IPC_CREAT and IPC_EXCL flags, the function fails if the
facility already exists. This can be useful when more than one process might attempt to
initialize the facility. One such case might involve several server processes having access

to the same facility. If they all attempt to create the facility with IPC_EXCL in effect,
only the first attempt succeeds. If neither of these flags is given and the facility already
exists, the functions to get access simply return the ID of the facility. If IPC_CREAT is
omitted and the facility is not already initialized, the calls fail. These control flags are
combined, using logical (bitwise) OR, with the octal permission modes to form the flags
argument. For example, the statement below initializes a new message queue if the queue
does not exist.
msqid = msgget(ftok("/tmp", key), (IPC_CREAT | IPC_EXCL | 0400));
The first argument evaluates to a key based on the pathname "/tmp" and on the project
identifier passed as the second argument of ftok() (here the variable key). The second
argument evaluates to the combined permissions and control flags.
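The flag rules above can be combined into a "create it, or open it if somebody else already did" routine. The following is a minimal sketch, not a definitive implementation; the helper name open_or_create_queue() and its created parameter are assumptions made for illustration and are not part of the System V API.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/*
 * open_or_create_queue() is a hypothetical helper. It first tries to
 * create the queue exclusively; if the queue already exists (EEXIST),
 * it opens the existing queue instead. *created is set to 1 if this
 * call created the queue and to 0 otherwise.
 * Returns the queue ID, or -1 with errno set.
 */
int open_or_create_queue(key_t key, int perms, int *created)
{
    int msqid = msgget(key, IPC_CREAT | IPC_EXCL | perms);

    if (msqid != -1) {
        *created = 1;
        return msqid;
    }
    if (errno != EEXIST)
        return -1;              /* a real failure, e.g. EACCES */

    *created = 0;
    return msgget(key, perms);  /* no IPC_CREAT: open only */
}
```

A server that calls this helper with the same key from several processes would see created set to 1 in exactly one of them, which is the process that should carry out any one-time initialisation.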
Controlling message queues
The msgctl() function alters the permissions and other characteristics of a message queue.
The owner or creator of a queue can change its ownership or permissions using msgctl().
Also, any process with permission to do so can use msgctl() for control operations.
The msgctl() function is prototyped as follows:
int msgctl(int msqid, int cmd, struct msqid_ds *buf )
The msqid argument must be the ID of an existing message queue. The cmd argument is
one of:
IPC_STAT -- Place information about the status of the queue in the data structure pointed
to by buf. The process must have read permission for this call to succeed.
IPC_SET -- Set the owner's user and group ID, the permissions, and the size (in number of
bytes) of the message queue. A process must have the effective user ID of the owner,
creator, or superuser for this call to succeed.
IPC_RMID -- Remove the message queue specified by the msqid argument.
The following code illustrates the msgctl() function with the IPC_STAT and IPC_SET commands:
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct msqid_ds buf; /* queue descriptor buffer */

if (msgctl(msqid, IPC_STAT, &buf) == -1) {
    perror("msgctl: msgctl failed");
}
if (msgctl(msqid, IPC_SET, &buf) == -1) {
    perror("msgctl: msgctl failed");
}
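As a small illustration of the IPC_STAT and IPC_RMID commands, the sketch below reads the number of waiting messages from the queue descriptor and removes a queue. The helper names queue_message_count() and remove_queue() are hypothetical and chosen only for this example.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/*
 * queue_message_count() is a hypothetical helper: it uses IPC_STAT to
 * copy the queue's msqid_ds structure and returns msg_qnum, the number
 * of messages currently on the queue. Returns -1 on failure.
 */
long queue_message_count(int msqid)
{
    struct msqid_ds ds;

    if (msgctl(msqid, IPC_STAT, &ds) == -1)
        return -1;
    return (long) ds.msg_qnum;
}

/*
 * remove_queue() wraps IPC_RMID. The buf argument is not used by this
 * command, so NULL is passed.
 */
int remove_queue(int msqid)
{
    return msgctl(msqid, IPC_RMID, NULL);
}
```

Because IPC_STAT needs read permission, queue_message_count() returns -1 for a queue the caller cannot read, as well as for a queue that has already been removed.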
Sending and Receiving Messages
The msgsnd() and msgrcv() functions send and receive messages, respectively:
int msgsnd(int msqid, const void *msgp, size_t msgsz, int msgflg);

int msgrcv(int msqid, void *msgp, size_t msgsz, long msgtyp, int msgflg);
The msqid argument must be the ID of an existing message queue. The msgp argument is
a pointer to a structure that contains the type of the message and its text. The structure
below is an example of what this user-defined buffer might look like:
struct mymsg {
    long mtype; /* message type */
    char mtext[MSGSZ]; /* message text of length MSGSZ */
};
The msgsz argument specifies the length of the message text (the mtext member) in bytes.
The structure member mtype is the received message's type as specified by the sending
process.
The argument msgflg specifies the action to be taken if one or more of the following are
true:
• The number of bytes already on the queue is equal to msg_qbytes.
• The total number of messages on all queues system-wide is equal to the system-
imposed limit.
These actions are as follows:
• If (msgflg & IPC_NOWAIT) is non-zero, the message will not be sent and the
calling process will return immediately.
• If (msgflg & IPC_NOWAIT) is 0, the calling process will suspend execution until
one of the following occurs:
o The condition responsible for the suspension no longer exists, in which case the
message is sent.
o The message queue identifier msqid is removed from the system; when this
occurs, errno is set equal to EIDRM and -1 is returned.
o The calling process receives a signal that is to be caught; in this case the message
is not sent and the calling process resumes execution.
Upon successful completion, the following actions are taken with respect to the data
structure associated with msqid:
o msg_qnum is incremented by 1.
o msg_lspid is set equal to the process ID of the calling process.
o msg_stime is set equal to the current time.
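The IPC_NOWAIT behaviour described above can be wrapped in a small non-blocking send routine. This is a sketch under stated assumptions: struct demo_msg, DEMO_TEXT_MAX and try_send() are names invented for the example, and the three-way return value is a design choice of the sketch rather than something msgsnd() provides.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define DEMO_TEXT_MAX 128            /* assumed payload limit for this sketch */

struct demo_msg {                    /* hypothetical message layout */
    long mtype;                      /* message type, must be > 0 */
    char mtext[DEMO_TEXT_MAX];       /* message text */
};

/*
 * try_send() attempts a non-blocking send of a NUL-terminated string.
 * Returns 0 if the message was queued, 1 if the queue was full
 * (msgsnd() failed with EAGAIN because IPC_NOWAIT was set), and -1 on
 * any other error, including text that does not fit in mtext.
 */
int try_send(int msqid, long type, const char *text)
{
    struct demo_msg m;
    size_t len = strlen(text) + 1;   /* include the terminating NUL */

    if (type <= 0 || len > sizeof m.mtext) {
        errno = EINVAL;
        return -1;
    }
    m.mtype = type;
    memcpy(m.mtext, text, len);

    /* msgsz counts only the mtext bytes, not the mtype field. */
    if (msgsnd(msqid, &m, len, IPC_NOWAIT) == 0)
        return 0;
    return (errno == EAGAIN) ? 1 : -1;
}
```

A caller that gets 1 back can retry later or drop the message; a caller that would rather wait can pass 0 instead of IPC_NOWAIT to msgsnd() and accept that the call may block.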
The following code illustrates msgsnd() and msgrcv():
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

int msgflg; /* message flags for the operation */
struct msgbuf *msgp; /* pointer to the message buffer */
int msgsz; /* message size */
long msgtyp; /* desired message type */
int msqid; /* message queue ID to be used */
int rtrn; /* return value from msgrcv() */

msgp = (struct msgbuf *)malloc((unsigned)(sizeof(struct msgbuf)
    - sizeof msgp->mtext + maxmsgsz));

if (msgp == NULL) {
    (void) fprintf(stderr, "msgop: %s %d byte messages.\n",
        "could not allocate message buffer for", maxmsgsz);
    exit(1);
}

msgsz = ...
msgflg = ...

if (msgsnd(msqid, msgp, msgsz, msgflg) == -1)
    perror("msgop: msgsnd failed");

msgsz = ...
msgtyp = first_on_queue;
msgflg = ...

/* The extra parentheses matter: without them rtrn would receive the
   result of the comparison, not the return value of msgrcv(). */
if ((rtrn = msgrcv(msqid, msgp, msgsz, msgtyp, msgflg)) == -1)
    perror("msgop: msgrcv failed");
POSIX Messages: <mqueue.h>
The POSIX message queue functions are:
mq_open() -- Connects to, and optionally creates, a named message queue.
mq_close() -- Ends the connection to an open message queue.
mq_unlink() -- Removes the name of a message queue; the queue itself is destroyed once
the last process that has it open closes it.
mq_send() -- Places a message in the queue.
mq_receive() -- Receives (removes) the oldest, highest priority message from the queue.
mq_notify() -- Notifies a process or thread that a message is available in the queue.
mq_setattr(), mq_getattr() -- Set or get message queue attributes.
The basic operation of these functions is as described above. For full function prototypes
and further information see the UNIX man pages.
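To give a feel for how the POSIX calls listed above fit together, here is a minimal sketch that sends one message to a named queue and reads it back in the same process. The helper name posix_mq_roundtrip(), the constant DEMO_MQ_MSGSIZE and the queue depth of 4 are assumptions made for the example; on some systems the program has to be linked with -lrt.

```c
#include <fcntl.h>
#include <mqueue.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

#define DEMO_MQ_MSGSIZE 64           /* assumed maximum message size */

/*
 * posix_mq_roundtrip() is a hypothetical helper. It creates the named
 * queue, sends `text` with the given priority, receives it back into
 * `out` (which must hold DEMO_MQ_MSGSIZE bytes), then closes the queue
 * and unlinks its name. Returns the number of bytes received, or -1.
 */
ssize_t posix_mq_roundtrip(const char *name, const char *text,
                           unsigned int prio, char *out,
                           unsigned int *prio_out)
{
    struct mq_attr attr;
    mqd_t mq;
    ssize_t n = -1;
    size_t len = strlen(text) + 1;   /* include the terminating NUL */

    if (len > DEMO_MQ_MSGSIZE)
        return -1;

    memset(&attr, 0, sizeof attr);
    attr.mq_maxmsg = 4;              /* assumed queue depth */
    attr.mq_msgsize = DEMO_MQ_MSGSIZE;

    mq = mq_open(name, O_CREAT | O_RDWR, 0600, &attr);
    if (mq == (mqd_t) -1)
        return -1;

    /* The receive buffer must be at least mq_msgsize bytes long. */
    if (mq_send(mq, text, len, prio) == 0)
        n = mq_receive(mq, out, DEMO_MQ_MSGSIZE, prio_out);

    mq_close(mq);
    mq_unlink(name);
    return n;
}
```

Queue names start with a slash, for example "/demo_queue" (a made-up name). In a real application the sender and the receiver would be separate processes that each call mq_open() with the same name.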
Example: Sending messages between two processes
The following two programs should be compiled and run at the same time to illustrate the
basic principle of message passing:
message_send.c -- Creates a message queue and sends one message to the queue.
message_rec.c -- Reads the message from the queue.
message_send.c -- creating and sending to a simple message queue
The full code listing for message_send.c is as follows:
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MSGSZ 128

/*
 * Declare the message structure.
 */
typedef struct msgbuf {
    long mtype;
    char mtext[MSGSZ];
} message_buf;

int main(void)
{
    int msqid;
    int msgflg = IPC_CREAT | 0666;
    key_t key;
    message_buf sbuf;
    size_t buf_length;

    /*
     * Get the message queue id for the "name" 1234,
     * creating the queue if it does not exist yet.
     */
    key = 1234;

    (void) fprintf(stderr, "\nmsgget: Calling msgget(%#lx, %#o)\n",
        (unsigned long) key, (unsigned) msgflg);

    if ((msqid = msgget(key, msgflg)) < 0) {
        perror("msgget");
        exit(1);
    }
    (void) fprintf(stderr, "msgget: msgget succeeded: msqid = %d\n", msqid);

    /*
     * We'll send message type 1
     */
    sbuf.mtype = 1;
    (void) strcpy(sbuf.mtext, "Did you get this?");

    /* Include the terminating NUL so the receiver gets a C string. */
    buf_length = strlen(sbuf.mtext) + 1;

    /*
     * Send a message.
     */
    if (msgsnd(msqid, &sbuf, buf_length, IPC_NOWAIT) < 0) {
        printf("%d, %ld, %s, %lu\n", msqid, sbuf.mtype, sbuf.mtext,
            (unsigned long) buf_length);
        perror("msgsnd");
        exit(1);
    }
    printf("Message: \"%s\" Sent\n", sbuf.mtext);
    exit(0);
}


The essential points to note here are:
• The message queue is created with a basic key and message flag msgflg =
IPC_CREAT | 0666 -- create the queue and make it readable and writable by all.
• A message of type (sbuf.mtype) 1 is sent to the queue with the message "Did you
get this?"
message_rec.c -- receiving the above message
The full code listing for message_send.c's companion process, message_rec.c, is as
follows:
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <stdio.h>
#include <stdlib.h>

#define MSGSZ 128

/*
 * Declare the message structure.
 */
typedef struct msgbuf {
    long mtype;
    char mtext[MSGSZ];
} message_buf;

int main(void)
{
    int msqid;
    key_t key;
    message_buf rbuf;

    /*
     * Get the message queue id for the
     * "name" 1234, which was created by
     * the sender.
     */
    key = 1234;

    if ((msqid = msgget(key, 0666)) < 0) {
        perror("msgget");
        exit(1);
    }

    /*
     * Receive an answer of message type 1.
     */
    if (msgrcv(msqid, &rbuf, MSGSZ, 1, 0) < 0) {
        perror("msgrcv");
        exit(1);
    }

    /*
     * Print the answer.
     */
    printf("%s\n", rbuf.mtext);
    exit(0);
}
The essential points to note here are:
• The message queue is opened with msgget (message flag 0666) and the same key
as message_send.c.
• A message of the same type 1 is received from the queue with the message "Did
you get this?" stored in rbuf.mtext.

Some further example message queue programs
The following suite of programs can be used to investigate interactively a variety of
message passing ideas (see exercises below).
The message queue must be initialised with the msgget.c program. The effects of
controlling the queue and sending and receiving messages can be investigated with
msgctl.c and msgop.c respectively.
msgget.c: Simple Program to illustrate msgget()
/*
 * msgget.c: Illustrate the msgget() function.
 * This is a simple exerciser of the msgget() function. It prompts
 * for the arguments, makes the call, and reports the results.
 */

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

int main(void)
{
    key_t key;   /* key to be passed to msgget() */
    long keyval; /* key as typed by the user */
    int msgflg,  /* msgflg to be passed to msgget() */
        msqid;   /* return value from msgget() */

    (void) fprintf(stderr,
        "All numeric input is expected to follow C conventions:\n");
    (void) fprintf(stderr,
        "\t0x... is interpreted as hexadecimal,\n");
    (void) fprintf(stderr, "\t0... is interpreted as octal,\n");
    (void) fprintf(stderr, "\totherwise, decimal.\n");
    (void) fprintf(stderr, "IPC_PRIVATE == %#lx\n",
        (unsigned long) IPC_PRIVATE);
    (void) fprintf(stderr, "Enter key: ");
    (void) scanf("%li", &keyval);
    key = (key_t) keyval;
    (void) fprintf(stderr, "\nExpected flags for msgflg argument are:\n");
    (void) fprintf(stderr, "\tIPC_EXCL =\t%#8.8o\n", IPC_EXCL);
    (void) fprintf(stderr, "\tIPC_CREAT =\t%#8.8o\n", IPC_CREAT);
    (void) fprintf(stderr, "\towner read =\t%#8.8o\n", 0400);
    (void) fprintf(stderr, "\towner write =\t%#8.8o\n", 0200);
    (void) fprintf(stderr, "\tgroup read =\t%#8.8o\n", 040);
    (void) fprintf(stderr, "\tgroup write =\t%#8.8o\n", 020);
    (void) fprintf(stderr, "\tother read =\t%#8.8o\n", 04);
    (void) fprintf(stderr, "\tother write =\t%#8.8o\n", 02);

    (void) fprintf(stderr, "Enter msgflg value: ");
    (void) scanf("%i", &msgflg);

    (void) fprintf(stderr, "\nmsgget: Calling msgget(%#lx, %#o)\n",
        (unsigned long) key, (unsigned) msgflg);
    if ((msqid = msgget(key, msgflg)) == -1) {
        perror("msgget: msgget failed");
        exit(1);
    } else {
        (void) fprintf(stderr,
            "msgget: msgget succeeded: msqid = %d\n", msqid);
        exit(0);
    }
}
msgctl.c: Sample Program to Illustrate msgctl()
/*
 * msgctl.c: Illustrate the msgctl() function.
 * This is a simple exerciser of the msgctl() function. It allows
 * you to perform one control operation on one message queue. It
 * gives up immediately if any control operation fails, so be careful
 * not to set permissions to preclude read permission; you won't be
 * able to reset the permissions with this code if you do.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <time.h>

static void do_msgctl(int msqid, int cmd, struct msqid_ds *buf);
static char warning_message[] = "If you remove read permission for "
    "yourself, this program will fail frequently!";

int main(void)
{
    struct msqid_ds buf; /* queue descriptor buffer for IPC_STAT
                            and IPC_SET commands */
    int cmd,   /* command to be given to msgctl() */
        msqid; /* queue ID to be given to msgctl() */
    long val;  /* scratch variable for numeric input */

    (void) fprintf(stderr,
        "All numeric input is expected to follow C conventions:\n");
    (void) fprintf(stderr,
        "\t0x... is interpreted as hexadecimal,\n");
    (void) fprintf(stderr, "\t0... is interpreted as octal,\n");
    (void) fprintf(stderr, "\totherwise, decimal.\n");

    /* Get the msqid and cmd arguments for the msgctl() call. */
    (void) fprintf(stderr,
        "Please enter arguments for msgctl() as requested.");
    (void) fprintf(stderr, "\nEnter the msqid: ");
    (void) scanf("%i", &msqid);
    (void) fprintf(stderr, "\tIPC_RMID = %d\n", IPC_RMID);
    (void) fprintf(stderr, "\tIPC_SET = %d\n", IPC_SET);
    (void) fprintf(stderr, "\tIPC_STAT = %d\n", IPC_STAT);
    (void) fprintf(stderr, "\nEnter the value for the command: ");
    (void) scanf("%i", &cmd);

    switch (cmd) {
    case IPC_SET:
        /* Modify settings in the message queue control structure. */
        (void) fprintf(stderr, "Before IPC_SET, get current values:");
        /* fall through to IPC_STAT processing */
    case IPC_STAT:
        /* Get a copy of the current message queue control
         * structure and show it to the user. */
        do_msgctl(msqid, IPC_STAT, &buf);
        (void) fprintf(stderr,
            "msg_perm.uid = %ld\n", (long) buf.msg_perm.uid);
        (void) fprintf(stderr,
            "msg_perm.gid = %ld\n", (long) buf.msg_perm.gid);
        (void) fprintf(stderr,
            "msg_perm.cuid = %ld\n", (long) buf.msg_perm.cuid);
        (void) fprintf(stderr,
            "msg_perm.cgid = %ld\n", (long) buf.msg_perm.cgid);
        (void) fprintf(stderr, "msg_perm.mode = %#o, ",
            (unsigned) buf.msg_perm.mode);
        (void) fprintf(stderr, "access permissions = %#o\n",
            (unsigned) buf.msg_perm.mode & 0777);
        (void) fprintf(stderr, "msg_cbytes = %lu\n",
            (unsigned long) buf.msg_cbytes);
        (void) fprintf(stderr, "msg_qbytes = %lu\n",
            (unsigned long) buf.msg_qbytes);
        (void) fprintf(stderr, "msg_qnum = %lu\n",
            (unsigned long) buf.msg_qnum);
        (void) fprintf(stderr, "msg_lspid = %ld\n",
            (long) buf.msg_lspid);
        (void) fprintf(stderr, "msg_lrpid = %ld\n",
            (long) buf.msg_lrpid);
        (void) fprintf(stderr, "msg_stime = %s", buf.msg_stime ?
            ctime(&buf.msg_stime) : "Not Set\n");
        (void) fprintf(stderr, "msg_rtime = %s", buf.msg_rtime ?
            ctime(&buf.msg_rtime) : "Not Set\n");
        (void) fprintf(stderr, "msg_ctime = %s",
            ctime(&buf.msg_ctime));
        if (cmd == IPC_STAT)
            break;
        /* Now continue with IPC_SET. Each value is read into a long
         * first so that the scanf() conversion matches its argument. */
        (void) fprintf(stderr, "Enter msg_perm.uid: ");
        (void) scanf("%li", &val);
        buf.msg_perm.uid = (uid_t) val;
        (void) fprintf(stderr, "Enter msg_perm.gid: ");
        (void) scanf("%li", &val);
        buf.msg_perm.gid = (gid_t) val;
        (void) fprintf(stderr, "%s\n", warning_message);
        (void) fprintf(stderr, "Enter msg_perm.mode: ");
        (void) scanf("%li", &val);
        buf.msg_perm.mode = (unsigned short) val;
        (void) fprintf(stderr, "Enter msg_qbytes: ");
        (void) scanf("%li", &val);
        buf.msg_qbytes = (unsigned long) val;
        do_msgctl(msqid, IPC_SET, &buf);
        break;
    case IPC_RMID:
    default:
        /* Remove the message queue or try an unknown command. */
        do_msgctl(msqid, cmd, (struct msqid_ds *)NULL);
        break;
    }
    exit(0);
}

/*
 * Print indication of arguments being passed to msgctl(), call
 * msgctl(), and report the results. If msgctl() fails, do not
 * return; this example doesn't deal with errors, it just reports
 * them.
 */
static void
do_msgctl(int msqid, int cmd, struct msqid_ds *buf)
{
    int rtrn; /* hold area for return value from msgctl() */

    (void) fprintf(stderr, "\nmsgctl: Calling msgctl(%d, %d, %s)\n",
        msqid, cmd, buf ? "&buf" : "(struct msqid_ds *)NULL");
    rtrn = msgctl(msqid, cmd, buf);
    if (rtrn == -1) {
        perror("msgctl: msgctl failed");
        exit(1);
    } else {
        (void) fprintf(stderr, "msgctl: msgctl returned %d\n", rtrn);
    }
}

The client server model
Most interprocess communication uses the client server model. These terms refer to the
two processes which will be communicating with each other. One of the two processes,
the client, connects to the other process, the server, typically to make a request for
information. A good analogy is a person who makes a phone call to another person.
Notice that the client needs to know of the existence of and the address of the server, but
the server does not need to know the address of (or even the existence of) the client prior
to the connection being established. Notice also that once a connection is established,
both sides can send and receive information.
The system calls for establishing a connection are somewhat different for the client and
the server, but both involve the basic construct of a socket. A socket is one end of an
interprocess communication channel. The two processes each establish their own socket.
The steps involved in establishing a socket on the client side are as follows:
1. Create a socket with the socket() system call
2. Connect the socket to the address of the server using the connect() system call
3. Send and receive data. There are a number of ways to do this, but the simplest is
to use the read() and write() system calls.
The steps involved in establishing a socket on the server side are as follows:
1. Create a socket with the socket() system call
2. Bind the socket to an address using the bind() system call. For a server socket on
the Internet, an address consists of a port number on the host machine.
3. Listen for connections with the listen() system call
4. Accept a connection with the accept() system call. This call typically blocks until
a client connects with the server.
5. Send and receive data
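The two step lists above map onto code fairly directly. The sketch below packages the server-side steps 1 to 3 and the client-side steps 1 and 2 as two helper functions; the names make_listener() and connect_to() are assumptions made for this example and do not appear in the sample programs discussed later.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * make_listener(): server-side steps 1-3 (socket, bind, listen).
 * Returns a listening descriptor, or -1 on failure. Passing port 0
 * asks the system to pick any free port.
 */
int make_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *) &addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * connect_to(): client-side steps 1-2 (socket, connect). `ip` is a
 * dotted-decimal IPv4 address such as "127.0.0.1".
 * Returns a connected descriptor, or -1 on failure.
 */
int connect_to(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
        connect(fd, (struct sockaddr *) &addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The server would then call accept() on the descriptor returned by make_listener() and use read() and write() on the descriptor that accept() returns, while the client reads and writes on the descriptor returned by connect_to().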
Socket Types
When a socket is created, the program has to specify the address domain and the socket
type. Two processes can communicate with each other only if their sockets are of the
same type and in the same domain. There are two widely used address domains, the Unix
domain, in which two processes which share a common file system communicate, and the
Internet domain, in which two processes running on any two hosts on the Internet
communicate. Each of these has its own address format.
The address of a socket in the Unix domain is a character string which is basically an
entry in the file system.
The address of a socket in the Internet domain consists of the Internet address of the host
machine (under IPv4, every host on the Internet is identified by a 32 bit address, often
referred to as its IP address). In addition, each socket needs a port number on that host.
Port numbers are 16 bit unsigned integers. The lower numbers are reserved in Unix for
standard services. For example, the port number for the FTP server is 21. It is important
that standard services be at the same port on all computers so that clients will know their
addresses. However, port numbers above 2000 are generally available.
There are two widely used socket types, stream sockets and datagram sockets. Stream
sockets treat communications as a continuous stream of characters, while datagram
sockets have to read entire messages at once. Each uses its own communications
protocol. Stream sockets use TCP (Transmission Control Protocol), which is a reliable,
stream oriented protocol, and datagram sockets use UDP (User Datagram Protocol),
which is unreliable and message oriented.
The examples in this tutorial will use sockets in the Internet domain using the TCP
protocol.
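Although the rest of the tutorial uses stream sockets, the "entire messages at once" behaviour of datagram sockets is easy to see in a few lines. The following sketch sets up a pair of UDP sockets on the loopback interface; the helper name udp_loopback_pair() is hypothetical and exists only for this illustration.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * udp_loopback_pair() is a hypothetical helper: it creates a UDP
 * receiver bound to a system-chosen port on the loopback address and
 * a sender whose default destination is that receiver.
 * Returns 0 on success, -1 on failure.
 */
int udp_loopback_pair(int *rx, int *tx)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    *rx = socket(AF_INET, SOCK_DGRAM, 0);
    *tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (*rx < 0 || *tx < 0)
        goto fail;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;               /* let the system choose a port */

    if (bind(*rx, (struct sockaddr *) &addr, sizeof addr) < 0 ||
        getsockname(*rx, (struct sockaddr *) &addr, &len) < 0 ||
        connect(*tx, (struct sockaddr *) &addr, sizeof addr) < 0)
        goto fail;
    return 0;

fail:
    if (*rx >= 0) close(*rx);
    if (*tx >= 0) close(*tx);
    return -1;
}
```

If the sender calls send() twice, once with 3 bytes and once with 5 bytes, each recv() on the receiver returns one whole datagram, so the message boundaries are preserved. On a stream socket the same 8 bytes could arrive in a single read().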
Sample code
C code for a very simple client and server are provided for you. These communicate
using stream sockets in the Internet domain. The code is described in detail below.
However, before you read the descriptions and look at the code, you should compile and
run the two programs to see what they do.
Save the two programs in files called server.c and client.c and compile them separately into
two executables called server and client. They probably won't require any special
compiling flags, but on some Solaris systems you may need to link to the socket library
by appending -lsocket to your compile command.
Ideally, you should run the client and the server on separate hosts on the Internet. Start
the server first. Suppose the server is running on a machine called cheerios. When you
run the server, you need to pass the port number in as an argument. You can choose any
number between 2000 and 65535. If this port is already in use on that machine, the server
will tell you this and exit. If this happens, just choose another port and try again. If the
port is available, the server will block until it receives a connection from the client. Don't
be alarmed if the server doesn't do anything; it's not supposed to do anything until a
connection is made. Here is a typical command line:
server 51717
To run the client you need to pass in two arguments, the name of the host on which the
server is running and the port number on which the server is listening for connections.
Here is the command line to connect to the server described above:
client cheerios 51717
The client will prompt you to enter a message. If everything works correctly, the server
will display your message on stdout, send an acknowledgement message to the client and
terminate. The client will print the acknowledgement message from the server and then
terminate.
You can simulate this on a single machine by running the server in one window and the
client in another. In this case, you can use the keyword localhost as the first argument to
the client.
Server code
The server code uses a number of ugly programming constructs, and so we will go
through it line by line.

#include <stdio.h>
This header file contains declarations used in most input and output and is typically
included in all C programs.

#include <sys/types.h>
This header file contains definitions of a number of data types used in system calls. These
types are used in the next two include files.

#include <sys/socket.h>
The header file socket.h includes a number of definitions of structures needed for sockets.

#include <netinet/in.h>
The header file netinet/in.h contains constants and structures needed for Internet domain
addresses.

void error(char *msg)
{
    perror(msg);
    exit(1);
}
This function is called when a system call fails. It displays a message about the error on
stderr and then aborts the program. The perror man page gives more information.

int main(int argc, char *argv[])
int sockfd, newsockfd, portno, n;
socklen_t clilen;
sockfd and newsockfd are file descriptors, i.e. array subscripts into the file descriptor
table. These two variables store the values returned by the socket system call and the
accept system call.
portno stores the port number on which the server accepts connections.
clilen stores the size of the address of the client. This is needed for the accept system call,
which expects a pointer to a socklen_t.
n is the return value for the read() and write() calls; i.e. it contains the number of
characters read or written.

char buffer[256];
The server reads characters from the socket connection into this buffer.

struct sockaddr_in serv_addr, cli_addr;
A sockaddr_in is a structure containing an internet address. This structure is defined in
<netinet/in.h>. Here is the definition:
struct sockaddr_in {
    short sin_family; /* must be AF_INET */
    u_short sin_port;
    struct in_addr sin_addr;
    char sin_zero[8]; /* Not used, must be zero */
};
An in_addr structure, defined in the same header file, contains only one field, an unsigned
long called s_addr. The variable serv_addr will contain the address of the server, and
cli_addr will contain the address of the client which connects to the server.

if (argc < 2) {
    fprintf(stderr,"ERROR, no port provided\n");
    exit(1);
}

The user needs to pass in the port number on which the server will accept connections as
an argument. This code displays an error message if the user fails to do this.

sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0)
error("ERROR opening socket");
The socket() system call creates a new socket. It takes three arguments. The first is the
address domain of the socket. Recall that there are two possible address domains, the
Unix domain for two processes which share a common file system, and the Internet
domain for any two hosts on the Internet. The symbolic constant AF_UNIX is used for the
former, and AF_INET for the latter (there are actually many other options which can be
used here for specialized purposes).
The second argument is the type of socket. Recall that there are two choices here, a
stream socket in which characters are read in a continuous stream as if from a file or pipe,
and a datagram socket, in which messages are read in chunks. The two symbolic
constants are SOCK_STREAM and SOCK_DGRAM. The third argument is the protocol.
If this argument is zero (and it always should be except for unusual circumstances), the
operating system will choose the most appropriate protocol. It will choose TCP for
stream sockets and UDP for datagram sockets.
The socket system call returns an entry into the file descriptor table (i.e. a small integer).
This value is used for all subsequent references to this socket. If the socket call fails, it
returns -1. In this case the program displays an error message and exits. However, this
system call is unlikely to fail.
This is a simplified description of the socket call; there are numerous other choices for
domains and types, but these are the most common. The socket() man page has more
information.

bzero((char *) &serv_addr, sizeof(serv_addr));
The function bzero() sets all values in a buffer to zero. It takes two arguments, the first is
a pointer to the buffer and the second is the size of the buffer. Thus, this line initializes
serv_addr to zeros.

portno = atoi(argv[1]);
The port number on which the server will listen for connections is passed in as an
argument, and this statement uses the atoi() function to convert this from a string of digits
to an integer.
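One weakness of atoi() is that it cannot report bad input: a non-numeric argument such as "abc" simply yields 0. A slightly more defensive alternative, sketched here with strtol(), is shown below. The function name parse_port() and the accepted range of 1 to 65535 are assumptions made for this example, not part of the sample server.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * parse_port() is a hypothetical replacement for the atoi() call.
 * Returns the port number (1-65535), or -1 if the string is empty,
 * is not entirely a decimal number, or is out of range.
 */
int parse_port(const char *s)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0')
        return -1;              /* overflow, no digits, or trailing junk */
    if (v < 1 || v > 65535)
        return -1;              /* does not fit in a 16 bit port number */
    return (int) v;
}
```

In the server, the result could be checked before it is stored in portno, and the program could print a usage message when parse_port() returns -1.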

serv_addr.sin_family = AF_INET;
The variable serv_addr is a structure of type struct sockaddr_in. This structure has four
fields. The first field is short sin_family, which contains a code for the address family. It
should always be set to the symbolic constant AF_INET.

serv_addr.sin_port = htons(portno);
The second field of serv_addr is unsigned short sin_port, which contains the port number.
However, instead of simply copying the port number to this field, it is necessary to
convert this to network byte order using the function htons(), which converts a port
number in host byte order to a port number in network byte order.
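Network byte order is big-endian, that is, the most significant byte comes first. The short sketch below makes the effect of htons() visible by copying the converted value into a byte array; the helper name port_wire_bytes() is hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * port_wire_bytes() is a hypothetical helper that stores the two bytes
 * of a port number exactly as they would be laid out in sin_port, i.e.
 * in network byte order (most significant byte first).
 */
void port_wire_bytes(unsigned short port, unsigned char out[2])
{
    uint16_t net = htons(port);

    memcpy(out, &net, sizeof net);
}
```

For the port 0x1234 the array holds 0x12 followed by 0x34 on any machine, whether the host itself is little-endian or big-endian. The inverse conversion is ntohs(), so ntohs(htons(p)) gives back p.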

serv_addr.sin_addr.s_addr = INADDR_ANY;
The third field of sockaddr_in is a structure of type struct in_addr which contains only a
single field, unsigned long s_addr. This field contains the IP address of the host. For
server code, the symbolic constant INADDR_ANY is normally used here; it tells the system
to accept connections arriving on any of the IP addresses of the machine on which the
server is running.

if (bind(sockfd, (struct sockaddr *) &serv_addr,
sizeof(serv_addr)) < 0)
error("ERROR on binding");
The bind() system call binds a socket to an address, in this case the address of the current
host and port number on which the server will run. It takes three arguments, the socket
file descriptor, the address to which it is bound, and the size of that address. The second
argument is a pointer to a structure of type sockaddr, but what is
passed in is a structure of type sockaddr_in, and so this must be cast to the correct type.
This can fail for a number of reasons, the most obvious being that this port is already in
use on this machine. The bind() man page has more information.

listen(sockfd,5);
The listen system call allows the process to listen on the socket for connections. The first
argument is the socket file descriptor, and the second is the size of the backlog queue,
i.e., the number of connections that can be waiting while the process is handling a
particular connection. The example sets this to 5, historically the maximum size permitted
by many systems. If the first argument is a valid socket, this call is unlikely to fail, and so
the code doesn't check for errors. The listen() man page has more information.

clilen = sizeof(cli_addr);
newsockfd = accept(sockfd, (struct sockaddr *) &cli_addr, &clilen);
if (newsockfd < 0)
error("ERROR on accept");
The accept() system call causes the process to block until a client connects to the server.
Thus, it wakes up the process when a connection from a client has been successfully
established. It returns a new file descriptor, and all communication on this connection
should be done using the new file descriptor. The second argument is a pointer
to the address of the client on the other end of the connection, and the third argument is
a pointer to the size of this structure. The accept() man page has more information.

bzero(buffer,256);
n = read(newsockfd,buffer,255);
if (n < 0) error("ERROR reading from socket");
printf("Here is the message: %s\n",buffer);
Note that we would only get to this point after a client has successfully connected to our
server. This code initializes the buffer using the bzero() function, and then reads from the
socket. Note that the read call uses the new file descriptor, the one returned by accept(),
not the original file descriptor returned by socket(). Note also that the read() will block
until there is something for it to read in the socket, i.e. after the client has executed a
write(). It will read either the total number of characters in the socket or 255, whichever
is less, and return the number of characters read. The read() man page has more
information.

n = write(newsockfd,"I got your message",18);
if (n < 0) error("ERROR writing to socket");
Once a connection has been established, both ends can both read and write to the
connection. Naturally, everything written by the client will be read by the server, and
everything written by the server will be read by the client. This code simply writes a short
message to the client. The last argument of write is the size of the message. The write()
man page has more information.

return 0;
This terminates main and thus the program. Since main was declared to be of type int as
specified by the ANSI standard, some compilers complain if it does not return anything.
Client code
As before, we will go through the program client.c line by line.
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
The header files are the same as for the server with one addition. The file netdb.h defines
the structure hostent, which will be used below.

void error(char *msg)

int main(int argc, char *argv[])
int sockfd, portno, n;
struct sockaddr_in serv_addr;
struct hostent *server;
The error() function is identical to that in the server, as are the variables sockfd, portno,
and n. The variable serv_addr will contain the address of the server to which we want to
connect. It is of type struct sockaddr_in.
The variable server is a pointer to a structure of type hostent. This structure is defined in
the header file netdb.h as follows:
struct hostent {
    char *h_name; /* official name of host */
    char **h_aliases; /* alias list */
    int h_addrtype; /* host address type */
    int h_length; /* length of address */
    char **h_addr_list; /* list of addresses from name server */
#define h_addr h_addr_list[0] /* address, for backward compatibility */
};
It defines a host computer on the Internet. The members of this structure are:
h_name Official name of the host.

h_aliases A zero terminated array of alternate
names for the host.

h_addrtype The type of address being returned;
currently always AF_INET.

h_length The length, in bytes, of the address.

h_addr_list A pointer to a list of network addresses
for the named host. Host addresses are
returned in network byte order.
Note that h_addr is an alias for the first address in the array of network addresses.

char buffer[256];
if (argc < 3) {
    fprintf(stderr,"usage %s hostname port\n", argv[0]);
    exit(0);
}
portno = atoi(argv[2]);
sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0)
    error("ERROR opening socket");
All of this code is the same as that in the server.

server = gethostbyname(argv[1]);
if (server == NULL) {
    fprintf(stderr,"ERROR, no such host\n");
    exit(0);
}
argv[1] contains the name of a host on the Internet. The function
struct hostent *gethostbyname(char *name)
takes such a name as an argument and returns a pointer to a hostent containing
information about that host. The field char *h_addr contains the IP address. If the
returned pointer is NULL, the system could not locate a host with this name.
In the old days, this function worked by searching a system file called /etc/hosts, but with
the explosive growth of the Internet, it became impossible for system administrators to
keep this file current. Thus, the mechanism by which this function works is now complex
and often involves querying large databases on remote machines. The gethostbyname() man
page has more information.
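On current systems the same lookup is usually done with getaddrinfo(), which fills in a ready-to-use socket address. The sketch below shows one way this might look for the IPv4 case used in this tutorial; the helper name resolve_ipv4() is hypothetical, and the sample client itself uses gethostbyname().

```c
#include <string.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * resolve_ipv4() is a hypothetical helper built on getaddrinfo().
 * On success it fills *out with the first IPv4 address found for
 * `host` together with the given port (both stored in network byte
 * order) and returns 0. On failure it returns the non-zero error code
 * from getaddrinfo(), which gai_strerror() can turn into text.
 */
int resolve_ipv4(const char *host, unsigned short port,
                 struct sockaddr_in *out)
{
    struct addrinfo hints, *res;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;       /* IPv4 only, to match sockaddr_in */
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0)
        return rc;

    memcpy(out, res->ai_addr, sizeof *out);
    out->sin_port = htons(port);
    freeaddrinfo(res);
    return 0;
}
```

The structure filled in by resolve_ipv4() can be passed to connect() after a cast to struct sockaddr *, so no separate bcopy() of the address bytes is needed.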

bzero((char *) &serv_addr, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
bcopy((char *)server->h_addr,
      (char *)&serv_addr.sin_addr.s_addr,
      server->h_length);
serv_addr.sin_port = htons(portno);
This code sets the fields in serv_addr. Much of it is the same as in the server. However,
because the field server->h_addr is a character string, we use the function:
void bcopy(char *s1, char *s2, int length)
which copies length bytes from s1 to s2.

if (connect(sockfd,&serv_addr,sizeof(serv_addr)) < 0)
error("ERROR connecting");
The connect function is called by the client to establish a connection to the server. It takes
three arguments, the socket file descriptor, the address of the host to which it wants to
connect (including the port number), and the size of this address. This function returns 0
on success and -1 if it fails. The connect() man page has more information.
Notice that the client needs to know the port number of the server, but it does not need to
know its own port number. This is typically assigned by the system when connect is
called.
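To see the system-assigned client port, one can call getsockname() after connect(). The sketch below is self-contained under stated assumptions: it sets up its own listening socket on the loopback address so that no separate server is needed, and client_ephemeral_port is a hypothetical name chosen for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns the local port the system picked for a
   client socket, or -1 on error. */
int client_ephemeral_port(void)
{
    struct sockaddr_in srv, local;
    socklen_t len = sizeof(srv);
    int port = -1;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);   /* stand-in server */
    int cfd = socket(AF_INET, SOCK_STREAM, 0);   /* client */

    if (lfd < 0 || cfd < 0)
        goto done;
    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = 0;                            /* any free port */
    if (bind(lfd, (struct sockaddr *)&srv, sizeof(srv)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&srv, &len) < 0)
        goto done;
    /* The client never calls bind(); connect() assigns its port. */
    if (connect(cfd, (struct sockaddr *)&srv, sizeof(srv)) < 0)
        goto done;
    len = sizeof(local);
    if (getsockname(cfd, (struct sockaddr *)&local, &len) == 0)
        port = ntohs(local.sin_port);
done:
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return port;
}
```

The returned value will usually differ from run to run, since the system chooses any free port from its ephemeral range.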

printf("Please enter the message: ");
bzero(buffer,256);
fgets(buffer,255,stdin);
n = write(sockfd,buffer,strlen(buffer));
if (n < 0)
    error("ERROR writing to socket");
bzero(buffer,256);
n = read(sockfd,buffer,255);
if (n < 0)
    error("ERROR reading from socket");
printf("%s\n",buffer);
return 0;
}
The remaining code should be fairly clear. It prompts the user to enter a message, uses
fgets to read the message from stdin, writes the message to the socket, reads the reply
from the socket, and displays this reply on the screen.
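One detail the client glosses over is that write() on a socket may accept fewer bytes than requested. A minimal sketch of a loop that keeps writing until the whole message is sent follows; write_all is a hypothetical helper, not part of the original client.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes exactly len bytes unless an error occurs.
   Returns len on success and -1 on failure. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                    /* skip past what was accepted */
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

In the client, write_all(sockfd, buffer, strlen(buffer)) could stand in for the single write() call.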
Enhancements to the server code
The sample server code above has the limitation that it only handles one connection, and
then dies. A "real world" server should run indefinitely and should have the capability of
handling a number of simultaneous connections, each in its own process. This is typically
done by forking off a new process to handle each new connection.

The following code has a dummy function called dostuff(int sockfd). This function will
handle the connection after it has been established and provide whatever services the
client requests. As we saw above, once a connection is established, both ends can use
read and write to send information to the other end, and the details of the information
passed back and forth do not concern us here. To write a "real world" server, you would
make essentially no changes to the main() function, and all of the code which provided
the service would be in dostuff().
To allow the server to handle multiple simultaneous connections, we make the following
changes to the code:
1. Put the accept statement and the following code in an infinite loop.
2. After a connection is established, call fork() to create a new process.
3. The child process will close sockfd and call dostuff, passing the new socket file
descriptor as an argument. When the two processes have completed their conversation, as
indicated by dostuff() returning, this process simply exits.
4. The parent process closes newsockfd. Because all of this code is in an infinite
loop, it will return to the accept statement to wait for the next connection.
Here is the code.
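while (1) {
    newsockfd = accept(sockfd,
                (struct sockaddr *) &cli_addr, &clilen);
    if (newsockfd < 0)
        error("ERROR on accept");
    pid = fork();
    if (pid < 0)
        error("ERROR on fork");
    if (pid == 0) {
        /* child: stop listening, serve this client, then exit */
        close(sockfd);
        dostuff(newsockfd);
        exit(0);
    }
    else close(newsockfd);   /* parent: the child owns this connection */
} /* end of while */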
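The text leaves dostuff() as a dummy function. As a sketch of what it might contain, the version below reads one message and sends a fixed reply; the reply text and the buffer size are placeholders chosen for illustration, not the author's service.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative body for dostuff(): one request, one reply. */
void dostuff(int sock)
{
    char buffer[256];
    static const char reply[] = "I got your message";   /* placeholder */
    ssize_t n;

    memset(buffer, 0, sizeof(buffer));
    n = read(sock, buffer, sizeof(buffer) - 1);
    if (n < 0)
        return;                    /* read error: give up on this client */
    /* A real service would inspect buffer and act on the request here. */
    if (write(sock, reply, sizeof(reply) - 1) < 0)
        return;
}
```

Because each child runs dostuff() in its own process, a slow client only delays its own child, never the parent's accept loop.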
The zombie problem
The above code has a problem; if the parent runs for a long time and accepts many
connections, each of these connections will create a zombie when the connection is
terminated. A zombie is a process which has terminated but cannot be permitted to
fully die because at some point in the future, the parent of the process might execute a
wait and would want information about the death of the child. Zombies clog up the
process table in the kernel, and so they should be prevented. Unfortunately, the code
which prevents zombies is not consistent across different architectures. When a child
dies, it sends a SIGCHLD signal to its parent. On systems such as AIX, the following
code in main() is all that is needed.
signal(SIGCHLD,SIG_IGN);
This says to ignore the SIGCHLD signal. However, on systems running SunOS, you have
to use the following code:
void SigCatcher(int n)
{ wait3(NULL,WNOHANG,NULL); }
int main()
{ /* ... */ signal(SIGCHLD,SigCatcher); /* ... */ }
The function SigCatcher() will be called whenever the parent receives a SIGCHLD signal
(i.e. whenever a child dies). This will in turn call wait3, which collects the exit status of
the dead child so that the kernel can remove the zombie. The WNOHANG flag is set, which
causes this to be a non-blocking wait.
Alternative types of sockets
This example showed a stream socket in the Internet domain. This is the most common
type of connection. A second type of connection is a datagram socket. You might want to
use a datagram socket in cases where there is only one message being sent from the client
to the server, and only one message being sent back. There are several differences
between a datagram socket and a stream socket.
1. Datagrams are unreliable, which means that if a packet of information gets lost
somewhere in the Internet, the sender is not told (and of course the receiver does not
know about the existence of the message). In contrast, with a stream socket, the
underlying TCP protocol will detect that a message was lost because it was not
acknowledged, and it will be retransmitted without the process at either end knowing
about this.
2. Message boundaries are preserved in datagram sockets. If the sender sends a
datagram of 100 bytes, the receiver must read all 100 bytes at once. This can be
contrasted with a stream socket, where if the sender wrote a 100 byte message, the
receiver could read it in two chunks of 50 bytes or 100 chunks of one byte.
3. The communication is done using special system calls sendto() and recvfrom()
rather than the more generic read() and write().
4. There is a lot less overhead associated with a datagram socket because
connections do not need to be established and broken down, and packets do not need to
be acknowledged. This is why datagram sockets are often used when the service to be
provided is short, such as a time-of-day service.
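The second difference, preserved message boundaries, can be observed with a small experiment. The sketch below assumes a connected pair of Unix domain datagram sockets created with socketpair() so that it runs in one process without a network; datagram_boundaries is a hypothetical name.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: sends two datagrams, then records how many bytes
   each receive returns. Returns 0 on success, -1 on error. */
int datagram_boundaries(ssize_t sizes[2])
{
    int sv[2];
    char buf[1024];
    int rc = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    if (send(sv[0], "hello", 5, 0) == 5 &&
        send(sv[0], "datagram", 8, 0) == 8) {
        /* The buffer could hold both messages, yet each recv()
           returns exactly one datagram. */
        sizes[0] = recv(sv[1], buf, sizeof(buf), 0);
        sizes[1] = recv(sv[1], buf, sizeof(buf), 0);
        rc = 0;
    }
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

With a stream socket in place of the datagram pair, the first read could legitimately return all 13 bytes at once.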
The datagram server and client can be compiled and run in exactly the same way as the
server and client using a stream socket.
Server code with a datagram socket
Most of the server code is similar to the stream socket code. Here are the differences.
sock=socket(AF_INET, SOCK_DGRAM, 0);

Note that when the socket is created, the second argument is the symbolic constant
SOCK_DGRAM instead of SOCK_STREAM. The protocol will be UDP, not TCP.

fromlen = sizeof(struct sockaddr_in);
while (1) {
n = recvfrom(sock,buf,1024,0,(struct sockaddr *)&from,&fromlen);
if (n < 0) error("recvfrom");
Servers using datagram sockets do not use the listen() or the accept() system calls. After a
socket has been bound to an address, the program calls recvfrom() to read a message.
This call will block until a message is received. The recvfrom() system call takes six
arguments. The first three are the same as those for the read() call, the socket file
descriptor, the buffer into which the message will be read, and the maximum number of
bytes. The fourth argument is an integer argument for flags. This is ordinarily set to zero.
The fifth argument is a pointer to a sockaddr_in structure. When the call returns, the
values of this structure will have been filled in for the other end of the connection (the
client). The size of this structure will be in the last argument, a pointer to an integer. This
call returns the number of bytes in the message (or -1 on an error condition). The
recvfrom() man page has more information.

n = sendto(sock,"Got your message\n",17,
0,(struct sockaddr *) &from,fromlen);
if (n < 0) error("sendto");
To send a datagram, the function sendto() is used. This also takes six arguments. The first
three are the same as for a write() call, the socket file descriptor, the buffer from which
the message will be written, and the number of bytes to write. The fourth argument is an
int argument called flags, which is normally zero. The fifth argument is a pointer to a
sockaddr_in structure. This will contain the address to which the message will be sent.
Notice that in this case, since the server is replying to a message, the values of this
structure were provided by the recvfrom call. The last argument is the size of this
structure. Note that this is not a pointer to an int, but an int value itself. The sendto() man
page has more information.
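Putting recvfrom() and sendto() together, the following sketch runs a one-shot datagram exchange inside a single process over the loopback address. The function name udp_round_trip and the "ping" request are assumptions made for illustration; the reply string is the one used in the server fragment above.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: returns the length of the reply, or -1 on error. */
ssize_t udp_round_trip(char *reply, size_t replylen)
{
    struct sockaddr_in srv, from;
    socklen_t len = sizeof(srv), fromlen = sizeof(from);
    char buf[1024];
    ssize_t n = -1;
    int server = socket(AF_INET, SOCK_DGRAM, 0);
    int client = socket(AF_INET, SOCK_DGRAM, 0);

    if (server < 0 || client < 0)
        goto done;
    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = 0;                         /* any free port */
    if (bind(server, (struct sockaddr *)&srv, sizeof(srv)) < 0 ||
        getsockname(server, (struct sockaddr *)&srv, &len) < 0)
        goto done;
    /* Client: no connect(); the destination goes in each sendto(). */
    if (sendto(client, "ping", 4, 0,
               (struct sockaddr *)&srv, sizeof(srv)) < 0)
        goto done;
    /* Server: recvfrom() fills in the sender's address and its size. */
    if (recvfrom(server, buf, sizeof(buf), 0,
                 (struct sockaddr *)&from, &fromlen) < 0)
        goto done;
    /* Server: reply to that address; the size is passed by value. */
    if (sendto(server, "Got your message\n", 17, 0,
               (struct sockaddr *)&from, fromlen) < 0)
        goto done;
    n = recvfrom(client, reply, replylen, 0, NULL, NULL);
done:
    if (client >= 0) close(client);
    if (server >= 0) close(server);
    return n;
}
```

The exchange works without threads because the kernel queues each datagram until the receiving socket asks for it.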
The Client Code
The client code for a datagram socket client is the same as that for a stream socket with
the following differences.
• the socket system call has SOCK_DGRAM instead of SOCK_STREAM as its
second argument.
• there is no connect() system call
• instead of read and write, the client uses recvfrom and sendto which are described
in detail above.
Sockets in the Unix Domain
A client and server can also communicate using a stream socket in the Unix domain.

The only difference between a socket in the Unix domain and a socket in the Internet
domain is the form of the address. Here is the address structure for a Unix Domain
address, defined in the header file sys/un.h.
struct sockaddr_un
{
    short sun_family;    /* AF_UNIX */
    char sun_path[108];  /* path name (gag) */
};
The field sun_path has the form of a path name in the Unix file system. This means that
both client and server have to be running the same file system. Note that on systems
running AFS, such as the Rensselaer Computer System, these sockets must be created in
the directory /tmp. Once a socket has been created, it remains until it is explicitly deleted,
and its name will appear with the ls command, always with a size of zero. Sockets in the
Unix domain are virtually identical to named pipes (FIFOs).
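As a sketch of how a server might set up such an address, the helper below binds a stream socket to a path name and starts listening. make_unix_listener is a hypothetical name, and the path passed in is whatever the caller chooses.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: returns a listening Unix domain socket bound to
   path, or -1 on error. */
int make_unix_listener(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;                 /* path does not fit in sun_path */
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    unlink(path);                  /* remove a stale socket file, if any */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Closing the descriptor does not remove the file system entry, so a tidy server also calls unlink() on the path when it shuts down.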
Designing servers
There are a number of different ways to design servers. These models are discussed in
detail in a book by Douglas E. Comer and David L. Stevens entitled Internetworking with
TCP/IP Volume III:Client Server Programming and Applications published by Prentice
Hall in 1996. These are summarized here.
Concurrent, connection oriented servers
The typical server in the Internet domain creates
a stream socket and forks off a process to handle each new connection that it receives.
This model is appropriate for services which will do a good deal of reading and writing
over an extended period of time, such as a telnet server or an ftp server. This model has
relatively high overhead, because forking off a new process is a time consuming
operation, and because a stream socket which uses the TCP protocol has high kernel
overhead, not only in establishing the connection but also in transmitting information.
However, once the connection has been established, data transmission is reliable in both
directions.
Iterative, connectionless servers
Servers which provide only a single message to the
client often do not involve forking, and often use a datagram socket rather than a stream
socket. Examples include a finger daemon or a timeofday server or an echo server (a
server which merely echoes a message sent by the client). These servers handle each
message as it arrives, in the same process. There is much less overhead with this
type of server, but the communication is unreliable. A request or a reply may get lost in
the Internet, and there is no built-in mechanism to detect and handle this.
Single process concurrent servers
A server which needs the capability of handling several
clients simultaneously, but where each connection is I/O dominated (i.e. the server spends
most of its time blocked waiting for a message from the client) is a candidate for a single
process, concurrent server. In this model, one process maintains a number of open
connections, and listens at each for a message. Whenever it gets a message from a client,
it replies quickly and then listens for the next one. This type of service can be done with
the select system call.
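The core of such a server is a call to select() over all open descriptors. A minimal sketch follows; first_readable is a hypothetical helper that reports which descriptor has data waiting, and a real server would loop over every ready descriptor instead of stopping at the first.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: returns the index of the first descriptor in fds
   that is ready for reading, or -1 on timeout or error. */
int first_readable(const int *fds, int nfds, int timeout_ms)
{
    fd_set readset;
    struct timeval tv;
    int i, maxfd = -1;

    FD_ZERO(&readset);
    for (i = 0; i < nfds; i++) {
        FD_SET(fds[i], &readset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    /* select() blocks until a descriptor is readable or time runs out. */
    if (select(maxfd + 1, &readset, NULL, NULL, &tv) <= 0)
        return -1;
    for (i = 0; i < nfds; i++)
        if (FD_ISSET(fds[i], &readset))
            return i;
    return -1;
}
```

The server would read from the descriptor at the returned index, reply, and call the helper again, so no client is served by more than one process.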

Execution tracing is a technique that allows a program to monitor the execution of
another program. The traced program can be executed step by step, until a signal is
received, or until a system call is invoked. Execution tracing is widely used by debuggers,
together with other techniques like the insertion of breakpoints in the debugged program
and run time access to its variables. In Linux, execution tracing is performed through the
ptrace() system call, which can handle the following commands:
1. PTRACE_TRACEME: start execution tracing for the current process
2. PTRACE_ATTACH: start execution tracing for another process
3. PTRACE_DETACH: terminate execution tracing
4. PTRACE_KILL: kill the traced process
5. PTRACE_PEEKTEXT: read a 32 bit value from the text segment
6. PTRACE_PEEKDATA: read a 32 bit value from data segment
7. PTRACE_POKETEXT: write a 32 bit value to text segment
8. PTRACE_POKEDATA: write a 32 bit value to data segment
9. PTRACE_CONT: resume execution
Several monitored events can be associated with a traced program:
- end of execution of a single assembly instruction
- entering a system call
- exiting from a system call
- receiving a signal

When a monitored event occurs, the traced program is stopped and a SIGCHLD signal is
sent to its parent. When the parent wishes to resume the child's execution, it can use one
of the ptrace() commands listed above, such as PTRACE_CONT.
A process can also be traced using some debugging features of the Intel
processors. For example, the parent could set the value of the dr0 . . . dr7 debug registers.
When a condition monitored by these registers occurs, the CPU raises the "debug"
exception; the exception handler can then suspend the traced
process and send the SIGCHLD signal to the parent.
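The stop-and-resume cycle can be sketched with two of the commands listed above. In this illustration, which assumes a Linux system where ptrace() is permitted, the child asks to be traced with PTRACE_TRACEME and stops itself; the parent sees the stop through waitpid() and resumes the child with PTRACE_CONT. The function name trace_child_once and the exit code 7 are arbitrary choices.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the traced child's exit status, -1 on error. */
int trace_child_once(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: request tracing by the parent, then stop. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGSTOP);
        _exit(7);                  /* runs only after PTRACE_CONT */
    }
    /* Parent: wait for the monitored event (the child's stop). */
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status)) {
        kill(pid, SIGKILL);
        return -1;
    }
    /* Resume the child; a data argument of 0 delivers no signal. */
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) < 0) {
        kill(pid, SIGKILL);
        return -1;
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A debugger follows the same pattern, but between the stop and the PTRACE_CONT it would inspect the child with commands such as PTRACE_PEEKDATA.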

The greatest contributions of the Industrial Revolution were standardization and
interchangeable parts. Open software and the APIs it embodies certainly capture the spirit
of outward standardization, but the same can't be said for parts inside the box. Linux, for
all its flexibility and portability, still feels like a flintlock factory when it comes to
retargeting it across new systems, boards, and CPUs. In contrast, the BSP is solidly
industrial. Twenty years ago, the term BSP, which stands for Board Support Package, was
coined to describe the abstraction of hardware dependencies in an embedded OS. More
recent TLAs (three-letter acronyms) like HAL and OAL (Hardware and OEM
Abstraction Layers) have joined BSP in the acronym hall of fame, but BSP in pervasive
computing has stuck. Developers, partners, sales people, all call our company regularly to
check on "BSP availability." A HAL, when implemented well, allows an OS to
generalize many of the particulars of a system's CPU, cache, MMU/TLBs, serial ports,
NICs, display device, interrupt controller, memory map, etc., both to allow the OS to
focus on "big issues" and to facilitate porting to new hardware configurations.
The existence of Ready Systems' and Wind River's BSP/HAL specifications eased the
migration of the VRTX and VxWorks kernels onto literally thousands of boards with
dozens of architectures; the WindowsCE OAL helped Microsoft target over a hundred
boards with dozens of different CPUs in a very short time.
So, why doesn't Linux have a HAL? I can tell you the answer in one word: Tradition.
The Linux kernel emanates from a project which essentially produces a white box OS,
supporting x86/IA-32 compatible CPUs. With that Wintel architecture, things like code
compatibility, BIOS, and chipsets come together to form what I call the PC/AT "virtual
machine." Linux, like Windows, leverages basic knowledge about this platform, so that
booting and hardware initialization are taken care of, leaving a kernel to worry about the
more interesting things. As one hacker says, "on x86, it just works!"
Linux does have a HAL: it's the PC. Pervasive computing is not about the ubiquity of
the PC. Architectures other than x86/IA-32, like PowerPC, ARM, MIPS, and
Hitachi SuperH, dominate the space and each presents its own take on hardware
configuration, while none offers a broadly accepted set of hardware support conventions.
Some abstraction work is indeed under way. As each Linux architecture tree matures,
conventions arise through the magic of Open Source cooperation. The PowerPC tree is a
case in point. The PowerPC CPU family, while binary-compatible for user-space
applications, diverges vastly among members in terms of MMU, cache, floating point,
breakpoint registers, and with Book E, new instructions, and PowerPC boards present
over half a dozen boot monitors.
Porting to new PowerPC hardware might still not be for the faint of heart, but agreed-
upon abstractions have matured to the point that new boards can be added to the corpus
of supported systems with modifications to as few as three files. Now, the MIPS folks are
talking to the PowerPC maintainers and appear to be headed in the same direction.
While these grass-roots efforts represent the best of Open Source, fragmentation across
architectures, which still abounds, cannot be good for Linux overall. The current
prediction is that Linux will truly triumph in embedded where it has not done so on the
desktop. That means the Community needs to get together with both bottom-up and
top-down initiatives to accelerate quick and easy porting to new hardware.

What is POSIX®? POSIX is the Portable Operating System Interface, the open operating
interface standard accepted world-wide. It is produced by IEEE and recognized by ISO
and ANSI.
POSIX support assures code portability between systems and is increasingly mandated
for commercial applications and government contracts. For instance, the USA's Joint
Technical Architecture—Army (JTA-A) standards set specifies that conformance to the
POSIX specification is critical to support software interoperability.
POSIX conformance is worth more than POSIX compliance
POSIX conformance is what real-time embedded developers are usually looking for.
POSIX conformance means that the POSIX.1 standard is supported in its entirety. In the
case of the LynxOS real-time operating system, the routines of the POSIX.1b and
POSIX.1c subsets are also supported.
Certified POSIX conformance exists when conformance is certified by an accredited,
independent certification authority. For example, LynxOS has been certified conformant
to POSIX 1003.1-1996 by Mindcraft, Inc. and tested against FIPS 151-2 (Federal
Information Processing Standard).
POSIX compliance is a less powerful label, and could merely mean that a product
provides partial POSIX support. "POSIX compliance" means that documentation is
available that shows which POSIX features are supported and which are not.
Be wary of claims like POSIX operating system or 95% POSIX, which do not
specify POSIX conformance.
Remember that POSIX compliance does not always mean that all POSIX-defined
features are supported.
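A program can ask the system which revision of POSIX.1 it claims to implement. The sketch below uses sysconf() with _SC_VERSION; posix_version is a hypothetical wrapper name. The value is only the implementation's own claim, which is exactly why the distinction between compliance and certified conformance matters.

```c
#include <unistd.h>

/* Hypothetical wrapper: returns the POSIX.1 revision the running system
   reports, encoded as a year and month (for example 200809L), or -1 if
   the value cannot be determined. */
long posix_version(void)
{
    return sysconf(_SC_VERSION);
}
```

The compile-time counterpart is the _POSIX_VERSION macro from unistd.h, which reports the revision the headers were written for.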

Different Kernel Designs Overview:
Kernel terminology gets tossed about quite a bit. One of the more common topics
regarding operating system kernels is the overall design, in particular how the kernel is
structured. Generally, there are three major types of kernels: monolithic, microkernel, and
modular/hybrid.
A monolithic kernel is one single program that contains all of the code necessary to
perform every kernel related task. Most UNIX and BSD kernels are monolithic by
default. Recently more UNIX and BSD systems have been adding the modular capability
which is popular in the Linux kernel. The Linux kernel started off monolithic; however, it
gravitated towards a modular/hybrid design for several reasons. In the monolithic kernel,
some advantages hinge on these points:
- Since there is less software involved it is faster.
- As it is one single piece of software it should be smaller both in source and
compiled forms.
- Less code generally means fewer bugs, which can translate to fewer security
problems.
Those points are dependent upon how well the software is written in the first place. It can
be assumed that a stable kernel that has modular capability added to it will, of course,
grow both in raw software terms and regarding internal communications.
Most work in the monolithic kernel is done via system calls. These are interfaces, usually
kept in a tabular structure, that access some subsystem within the kernel such as disk
operations. Essentially calls are made within programs and a checked copy of the request
is passed through the system call, so the request does not have far to travel.
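The "tabular structure" can be pictured as an array of function pointers indexed by call number. The toy below is a user-space illustration only, not kernel code: the call numbers, handlers, and the dispatch function are all hypothetical names invented for this sketch.

```c
#include <stddef.h>

/* Every handler in this toy table shares one signature. */
typedef long (*syscall_fn)(long arg0, long arg1);

static long sys_add(long a, long b) { return a + b; }
static long sys_max(long a, long b) { return a > b ? a : b; }

/* Hypothetical call numbers used as indexes into the table. */
enum { NR_ADD, NR_MAX, NR_COUNT };

static const syscall_fn syscall_table[NR_COUNT] = {
    [NR_ADD] = sys_add,
    [NR_MAX] = sys_max,
};

/* Checks the request, then jumps through the table. */
long dispatch(long nr, long arg0, long arg1)
{
    if (nr < 0 || nr >= NR_COUNT || syscall_table[nr] == NULL)
        return -1;                 /* unknown call number */
    return syscall_table[nr](arg0, arg1);
}
```

The bounds check stands in for the "checked copy of the request": nothing reaches a handler until the call number has been validated.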
The disadvantages of the monolithic kernel are the converse of its advantages. Modifying
and testing monolithic systems takes longer than their microkernel counterparts. When a
bug surfaces within the core of the kernel the effects can be far reaching. Also, patching
monolithic systems can be more difficult (especially for source patching).
The microkernel architecture is very different from the monolithic. In the microkernel,
only the most fundamental of tasks are performed, such as being able to access some
(not necessarily all) of the hardware, manage memory and coordinate message passing
between the processes. Some systems that use microkernels are QNX and the HURD. In
the case of QNX and HURD, user sessions can be entire snapshots of the system itself, or
views as they are referred to. The very essence of the microkernel architecture illustrates
some of its advantages:
- Maintenance is generally easier. Patches can be tested in a separate instance, then
swapped in to take over a production instance.
- Rapid development time: new software can be tested without having to reboot the
kernel.
- More persistence in general: if one instance goes haywire, it is often possible to
substitute it with an operational mirror.
Again, all of the points are making certain assumptions about the code itself. Assuming
the code is well formed, those points should stand reasonably well.
Most microkernels use a message passing system of some sort to handle requests from
one server to another. The message passing system generally operates on a port basis
with the microkernel. As an example, if a request for more memory is sent, a port is
opened with the microkernel and the request sent through. Once within the microkernel,
the steps are similar to system calls.
Disadvantages in the microkernel exist, however. A few examples are:
- Larger running memory footprint.
- More software for interfacing is required, so there is a potential for performance loss
(note, the QNX system is extraordinarily fast).
- Messaging bugs can be harder to fix due to the longer trip they have to take versus
the one-off copy in a monolithic kernel.
- Process management in general can be very complicated.
The disadvantages for microkernels are extremely context based. As an example, they
work well for small single purpose (and critical) systems because if not many processes
need to run, then the complications of process management are effectively mitigated.
Modular/Hybrid Kernels
Many traditionally monolithic kernels are now at least adding (if not actively exploiting)
the module capability. The most well known of these kernels is the Linux kernel. The
modular kernel essentially can have parts of it that are built into the core kernel binary or
binaries that load into memory on demand. It is important to note that a code tainted
module has the potential to destabilize a running kernel. Many people become confused
on this point when discussing microkernels. It is possible to write a driver for a
microkernel in a completely separate memory space and test it before going live. When a
kernel module is loaded, it accesses the monolithic portion's memory space by adding to
it what it needs, therefore opening the doorway to possible pollution. A few advantages
to the modular kernel are:
- Faster development time for drivers that can operate from within modules. No
reboot required for testing (provided the kernel is not destabilized).
- On demand capability versus spending time recompiling a whole kernel for things
like new drivers or subsystems.
- Faster integration of third party technology (related to development but pertinent
unto itself nonetheless).
Modules, generally, communicate with the kernel using a module interface of some sort.
The interface is generalized (although particular to a given operating system) so it is not
always possible to use modules. Often the device drivers may need more flexibility than
the module interface affords. Essentially, it is two system calls and often the safety
checks that only have to be done once in the monolithic kernel now may be done twice.
Some of the disadvantages of the modular approach are:
- With more interfaces to pass through, the possibility of increased bugs exists
(which implies more security holes).
- Maintaining modules can be confusing for some administrators when dealing with
problems like symbol differences.

Booting process:
1. BIOS: The Basic Input/Output System is the lowest level interface between the
computer and peripherals. The BIOS performs integrity checks on memory and
seeks instructions on the Master Boot Record (MBR) on the floppy drive or hard
drive.
2. The MBR points to the boot loader (GRUB or LILO: Linux boot loader).
3. The boot loader (GRUB or LILO) will then ask for the OS label which will identify
which kernel to run and where it is located (hard drive and partition specified).
The installation process requires the creation/identification of partitions and where
to install the OS. GRUB/LILO are also configured during this process. The boot
loader then loads the Linux operating system.
- See the YoLinux tutorial on creating a boot disk for more information on
GRUB and LILO and also to learn how to put the MBR and boot loader
on a floppy for system recovery.
4. The first thing the kernel does is to execute the init program. Init is the root/parent
of all processes executing on Linux.
5. The first process that init starts is the script /etc/rc.d/rc.sysinit.

The boot script works as follows:
- Run /sbin/initlog
- Run devfs to generate/manage system devices
- Run network scripts: /etc/sysconfig/network
- Start graphical boot (if so configured): rhgb
- Start console terminals, load keymap, system fonts and print console greeting:
mingetty, setsysfonts
The various virtual console sessions can be viewed with the key-stroke
ctrl-alt-F1 through F6. F7 is reserved for the GUI screen invoked in run level 5.
- Mount /proc and start device controllers.
- Done with boot configuration for root drive (initrd). Unmount root drive.
- Re-mount root file system as read/write
- Direct kernel to load kernel parameters and modules: sysctl, depmod, modprobe
- Set up clock: /etc/sysconfig/clock
- Perform disk operations based on fsck configuration
- Check and mount non-root file systems, then check and enable quotas: fsck, mount,
quotacheck, quotaon
- Initialize logical volume management: vgscan, /etc/lvmtab
- Activate syslog, write to log files: dmesg
- Configure sound: sndconfig
- Activate PAM
- Activate swapping: swapon

More details on booting process:
The process of booting a Linux® system consists of a number of stages. But whether
you're booting a standard x86 desktop or a deeply embedded PowerPC® target, much of
the flow is surprisingly similar. This article explores the Linux boot process from the
initial bootstrap to the start of the first user-space application. Along the way, you'll learn
about various other boot-related topics such as the boot loaders, kernel decompression,
the initial RAM disk, and other elements of Linux boot.
In the early days, bootstrapping a computer meant feeding a paper tape containing a boot
program or manually loading a boot program using the front panel address/data/control
switches. Today's computers are equipped with facilities to simplify the boot process, but
that doesn't necessarily make it simple.
Let's start with a high-level view of Linux boot so you can see the entire landscape. Then
we'll review what's going on at each of the individual steps. Source references along the
way will help you navigate the kernel tree and dig in further.
Figure 1 gives you the 20,000-foot view.

Figure 1. The 20,000-foot view of the Linux boot process

When a system is first booted, or is reset, the processor executes code at a well-known
location. In a personal computer (PC), this location is in the basic input/output system
(BIOS), which is stored in flash memory on the motherboard. The central processing unit
(CPU) in an embedded system invokes the reset vector to start a program at a known
address in flash/ROM. In either case, the result is the same. Because PCs offer so much
flexibility, the BIOS must determine which devices are candidates for boot. We'll look at
this in more detail later.
When a boot device is found, the first-stage boot loader is loaded into RAM and
executed. This boot loader is less than 512 bytes in length (a single sector), and its job is
to load the second-stage boot loader.
When the second-stage boot loader is in RAM and executing, a splash screen is
commonly displayed, and Linux and an optional initial RAM disk (temporary root file
system) are loaded into memory. When the images are loaded, the second-stage boot
loader passes control to the kernel image and the kernel is decompressed and initialized.

At this stage, the kernel checks the system hardware, enumerates the
attached hardware devices, mounts the root device, and then loads the necessary kernel
modules. When complete, the first user-space program (init) starts, and high-level system
initialization is performed.
That's Linux boot in a nutshell. Now let's dig in a little further and explore some of the
details of the Linux boot process.

System startup
The system startup stage depends on the hardware that Linux is being booted on. On an
embedded platform, a bootstrap environment is used when the system is powered on, or
reset. Examples include U-Boot, RedBoot, and MicroMonitor from Lucent. Embedded
platforms are commonly shipped with a boot monitor. These programs reside in a special
region of flash memory on the target hardware and provide the means to download a
Linux kernel image into flash memory and subsequently execute it. In addition to having
the ability to store and boot a Linux image, these boot monitors perform some level of
system test and hardware initialization. In an embedded target, these boot monitors
commonly cover both the first- and second-stage boot loaders.
In a PC, booting Linux begins in the BIOS at address 0xFFFF0. The first step of the
BIOS is the power-on self test (POST). The job of the POST is to perform a check of the
hardware. The second step of the BIOS is local device enumeration and initialization.
Given the different uses of BIOS functions, the BIOS is made up of two parts: the POST
code and runtime services. After the POST is complete, it is flushed from memory, but
the BIOS runtime services remain and are available to the target operating system.
To boot an operating system, the BIOS runtime searches for devices that are both active
and bootable in the order of preference defined by the complementary metal oxide
semiconductor (CMOS) settings. A boot device can be a floppy disk, a CD-ROM, a
partition on a hard disk, a device on the network, or even a USB flash memory stick.
Commonly, Linux is booted from a hard disk, where the Master Boot Record (MBR)
contains the primary boot loader. The MBR is a 512-byte sector, located in the first sector
on the disk (sector 1 of cylinder 0, head 0). After the MBR is loaded into RAM, the BIOS
yields control to it.

Extracting the MBR
To see the contents of your MBR, use this command:
# dd if=/dev/hda of=mbr.bin bs=512 count=1
# od -xa mbr.bin
The dd command, which needs to be run as root, reads the first 512 bytes from /dev/hda
(the first Integrated Drive Electronics, or IDE, drive) and writes them to the mbr.bin file.
The od command prints the binary file in hex and ASCII formats.

Stage 1 boot loader
The primary boot loader that resides in the MBR is a 512-byte image containing both
program code and a small partition table (see Figure 2). The first 446 bytes are the
primary boot loader, which contains both executable code and error message text. The
next sixty-four bytes are the partition table, which contains a record for each of four
partitions (sixteen bytes each). The MBR ends with two bytes that are defined as the
magic number (0xAA55). The magic number serves as a validation check of the MBR.

Figure 2. Anatomy of the MBR

The job of the primary boot loader is to find and load the secondary boot loader (stage 2).
It does this by looking through the partition table for an active partition. When it finds an
active partition, it scans the remaining partitions in the table to ensure that they're all
inactive. When this is verified, the active partition's boot record is read from the device
into RAM and executed.

Stage 2 boot loader
The secondary, or second-stage, boot loader could be more aptly called the kernel loader.
The task at this stage is to load the Linux kernel and optional initial RAM disk.

The first- and second-stage boot loaders combined are called Linux Loader (LILO) or
GRand Unified Bootloader (GRUB) in the x86 PC environment. Because LILO has some
disadvantages that were corrected in GRUB, let's look into GRUB. (See the Resources
section for additional material on GRUB, LILO, and related topics.)
The great thing about GRUB is that it includes knowledge of Linux file systems. Instead
of using raw sectors on the disk, as LILO does, GRUB can load a Linux kernel from an
ext2 or ext3 file system. It does this by making the two-stage boot loader into a three-
stage boot loader. Stage 1 (MBR) boots a stage 1.5 boot loader that understands the
particular file system containing the Linux kernel image. Examples include
reiserfs_stage1_5 (to load from a Reiser journaling file system) or e2fs_stage1_5 (to load
from an ext2 or ext3 file system). When the stage 1.5 boot loader is loaded and running,
the stage 2 boot loader can be loaded.
With stage 2 loaded, GRUB can, upon request, display a list of available kernels (defined
in /boot/grub/grub.conf on Red Hat-style systems, with soft links from
/boot/grub/menu.lst and /etc/grub.conf). You can
select a kernel and even amend it with additional kernel parameters. Optionally, you can
use a command-line shell for greater manual control over the boot process.
With the second-stage boot loader in memory, the file system is consulted, and the
default kernel image and initrd image are loaded into memory. With the images ready,
the stage 2 boot loader invokes the kernel image.


GRUB stage boot loaders
The /boot/grub directory contains the stage1, stage1.5, and stage2 boot loaders, as well
as a number of alternate loaders (for example, CD-ROMs use the iso9660_stage1_5).

Manual boot in GRUB
From the GRUB command line, you can boot a specific kernel with a named initrd image as
follows:
grub> kernel /bzImage-

grub> initrd /initrd-
[Linux-initrd @ 0x5f13000, 0xcc199 bytes]

With the kernel image in memory and control given from the stage 2 boot loader, the
kernel stage begins. The kernel image isn't so much an executable kernel as a compressed
kernel image. Typically this is a zImage (compressed image, less than 512KB) or a
bzImage (big compressed image, greater than 512KB) that has been previously compressed
with zlib. At the head of this kernel image is a routine that does some minimal amount of
hardware setup and then decompresses the kernel contained within the kernel image and
places it into high memory. If an initial RAM disk image is present, this routine moves it
into memory and notes it for later use. The routine then calls the kernel and the kernel
boot begins.
When the bzImage (for an i386 image) is invoked, you begin at ./arch/i386/boot/head.S in
the start assembly routine (see Figure 3 for the major flow). This routine does some basic
hardware setup and invokes the startup_32 routine in
./arch/i386/boot/compressed/head.S. This routine sets up a basic environment (stack, etc.)
and clears the Block Started by Symbol (BSS). The kernel is then decompressed through
a call to a C function called decompress_kernel (located in
./arch/i386/boot/compressed/misc.c). When the kernel is decompressed into memory, it is
called. This is yet another startup_32 function, but it resides in a different source file.
In the new startup_32 function (also called the swapper or process 0), the page tables are
initialized and memory paging is enabled. The type of CPU is detected along with any
optional floating-point unit (FPU) and stored away for later use. The start_kernel function
is then invoked (init/main.c), which takes you to the non-architecture specific Linux
kernel. This is, in essence, the main function for the Linux kernel.

Figure 3. Major functions flow for the Linux kernel i386 boot

grub> boot

Uncompressing Linux... Ok, booting the kernel.

If you don't know the name of the kernel to boot, just type a forward slash (/) and press
the Tab key. GRUB will display the list of kernels and initrd images.
With the call to start_kernel, a long list of initialization functions is called to set up
interrupts, perform further memory configuration, and load the initial RAM disk. In the
end, a call is made to kernel_thread (in arch/i386/kernel/process.c) to start the init
function, which is the first user-space process. Finally, the idle task is started and the
scheduler can now take control (after the call to cpu_idle). With interrupts enabled, the
pre-emptive scheduler periodically takes control to provide multitasking.
During the boot of the kernel, the initial-RAM disk (initrd) that was loaded into memory
by the stage 2 boot loader is copied into RAM and mounted. This initrd serves as a
temporary root file system in RAM and allows the kernel to fully boot without having to
mount any physical disks. Since the necessary modules needed to interface with
peripherals can be part of the initrd, the kernel can be very small, but still support a large
number of possible hardware configurations. After the kernel is booted, the root file
system is pivoted (via pivot_root) where the initrd root file system is unmounted and the
real root file system is mounted.
The initrd function allows you to create a small Linux kernel with drivers compiled as
loadable modules. These loadable modules give the kernel the means to access disks and
the file systems on those disks, as well as drivers for other hardware assets. Because the
root file system is a file system on a disk, the initrd function provides a means of
bootstrapping to gain access to the disk and mount the real root file system. In an
embedded target without a hard disk, the initrd can be the final root file system, or the
final root file system can be mounted via the Network File System (NFS).

After the kernel is booted and initialized, the kernel starts the first user-space application.
This is the first program invoked that is compiled with the standard C library. Prior to this
point in the process, no standard C applications have been executed.
In a desktop Linux system, the first application started is commonly /sbin/init. But it need
not be. Rarely do embedded systems require the extensive initialization provided by init
(as configured through /etc/inittab). In many cases, you can invoke a simple shell script
that starts the necessary embedded applications.

decompress_kernel output
The decompress_kernel function is where you see the usual decompression messages emitted
to the display:
Uncompressing Linux... Ok, booting the kernel.

Overview of Linux and compiling the kernel:

Linux is a clone of the operating system Unix, written from scratch by Linus Torvalds
with assistance from a loosely-knit team of hackers across the Net. It aims towards
POSIX and Single UNIX Specification compliance.
It has all the features you would expect in a modern fully-fledged Unix, including true
multitasking, virtual memory, shared libraries, demand loading, shared copy-on-write
executables, proper memory management, and multistack networking including IPv4 and
IPv6. It is distributed under the GNU General Public License - see the
accompanying COPYING file for more details.

Although originally developed first for 32-bit x86-based PCs (386 or higher), today
Linux also runs on (at least) the Compaq Alpha AXP, Sun SPARC and UltraSPARC,
Motorola 68000, PowerPC, PowerPC64, ARM, Hitachi SuperH, Cell, IBM S/390, MIPS,
HP PA-RISC, Intel IA-64, DEC VAX, AMD x86-64, AXIS CRIS, Xtensa, AVR32 and
Renesas M32R architectures. Linux is easily portable to most general-purpose 32- or 64-
bit architectures as long as they have a paged memory management unit (PMMU) and a
port of the GNU C compiler (gcc) (part of The GNU Compiler Collection, GCC). Linux
has also been ported to a number of architectures without a PMMU, although
functionality is then obviously somewhat limited. Linux has also been ported to itself.
You can now run the kernel as a userspace application - this is called UserMode Linux (UML).

- There is a lot of documentation available both in electronic form on the Internet and in
books, both Linux-specific and pertaining to general UNIX questions. I'd recommend
looking into the documentation subdirectories on any Linux FTP site for the LDP (Linux
Documentation Project) books. This README is not meant to be documentation on the
system: there are much better sources available.

- There are various README files in the Documentation/ subdirectory: these typically
contain kernel-specific installation notes for some drivers for example. See
Documentation/00-INDEX for a list of what is contained in each file. Please read the
Changes file, as it contains information about the problems that may result from
upgrading your kernel.

- The Documentation/DocBook/ subdirectory contains several guides for kernel
developers and users. These guides can be rendered in a number of formats: PostScript
(.ps), PDF, and HTML, among others. After installation, "make psdocs", "make pdfdocs",
or "make htmldocs" will render the documentation in the requested format.

INSTALLING the kernel:

- If you install the full sources, put the kernel tarball in a directory where you have
permissions (e.g. your home directory) and unpack it with whichever command matches
the archive format:
gzip -cd linux-2.6.XX.tar.gz | tar xvf -
or
bzip2 -dc linux-2.6.XX.tar.bz2 | tar xvf -

Replace "XX" with the version number of the latest kernel.
Do NOT use the /usr/src/linux area! This area has a (usually incomplete) set of kernel
headers that are used by the library header files. They should match the library, and not
get messed up by whatever the kernel-du-jour happens to be.

- You can also upgrade between 2.6.xx releases by patching. Patches are distributed in
the traditional gzip and the newer bzip2 format. To install by patching, get all the newer
patch files, enter the top level directory of the kernel source (linux-2.6.xx) and execute
one of the following, depending on the patch format:
gzip -cd ../patch-2.6.xx.gz | patch -p1
or
bzip2 -dc ../patch-2.6.xx.bz2 | patch -p1

(repeat xx for all versions bigger than the version of your current source tree,
_in_order_) and you should be ok. You may want to remove the backup files (xxx~ or
xxx.orig), and make sure that there are no failed patches (xxx# or xxx.rej). If there are,
either you or I have made a mistake.

Unlike patches for the 2.6.x kernels, patches for the 2.6.x.y kernels (also known as the -
stable kernels) are not incremental but instead apply directly to the base 2.6.x kernel.
Please read Documentation/applying-patches.txt for more information.
Alternatively, the script patch-kernel can be used to automate this process. It
determines the current kernel version and applies any patches found.
linux/scripts/patch-kernel linux
The first argument in the command above is the location of the kernel source. Patches
are applied from the current directory, but an alternative directory can be specified as the
second argument.

- If you are upgrading between releases using the stable series patches (for example,
patch-2.6.xx.y), note that these "dot-releases" are not incremental and must be applied to
the 2.6.xx base tree. For example, if your base kernel is 2.6.12 and you want to apply a
later stable patch for it, you do not and indeed must not first apply the earlier stable
patches for that base. Similarly, if you are running one stable release and want to jump
to a later one, you must first reverse the stable patch you have applied (that is,
patch -R) _before_ applying the newer patch. You can read more on this in
Documentation/applying-patches.txt.

- Make sure you have no stale .o files and dependencies lying around:

cd linux
make mrproper

You should now have the sources correctly installed.


Compiling and running the 2.6.xx kernels requires up-to-date versions of various
software packages. Consult Documentation/Changes for the minimum version numbers
required and how to get updates for these packages. Beware that using excessively old
versions of these packages can cause indirect errors that are very difficult to track down,
so don't assume that you can just update packages when obvious problems arise during
build or operation.

BUILD directory for the kernel:

When compiling the kernel, all output files are by default stored together with the
kernel source code. Using the option "make O=output/dir" allows you to specify an
alternate place for the output files (including .config). For example:
kernel source code: /usr/src/linux-2.6.N
build directory: /home/name/build/kernel

To configure and build the kernel use:
cd /usr/src/linux-2.6.N
make O=/home/name/build/kernel menuconfig
make O=/home/name/build/kernel
sudo make O=/home/name/build/kernel modules_install install

Please note: If the 'O=output/dir' option is used then it must be used for all invocations of
make.

CONFIGURING the kernel:

Do not skip this step even if you are only upgrading one minor version. New
configuration options are added in each release, and odd problems will turn up if the
configuration files are not set up as expected. If you want to carry your existing
configuration to a new version with minimal work, use "make oldconfig", which will
only ask you for the answers to new questions.

- Alternate configuration commands are:
"make config"           Plain text interface.
"make menuconfig"       Text based color menus, radiolists & dialogs.
"make xconfig"          X windows (Qt) based configuration tool.
"make gconfig"          X windows (Gtk) based configuration tool.
"make oldconfig"        Default all questions based on the contents of
                        your existing ./.config file and asking about
                        new config symbols.
"make silentoldconfig"  Like above, but avoids cluttering the screen
                        with questions already answered.
"make defconfig"        Create a ./.config file by using the default
                        symbol values from arch/$ARCH/defconfig.
"make allyesconfig"     Create a ./.config file by setting symbol
                        values to 'y' as much as possible.
"make allmodconfig"     Create a ./.config file by setting symbol
                        values to 'm' as much as possible.
"make allnoconfig"      Create a ./.config file by setting symbol
                        values to 'n' as much as possible.
"make randconfig"       Create a ./.config file by setting symbol
                        values to random values.

The allyesconfig/allmodconfig/allnoconfig/randconfig variants can also use the
environment variable KCONFIG_ALLCONFIG to specify a filename that contains
config options that the user requires to be set to a specific value. If
KCONFIG_ALLCONFIG=filename is not used, "make *config" checks for a file
named "all{yes/mod/no/random}.config" for symbol values that are to be forced. If this
file is not found, it checks for a file named "all.config" to contain forced values.

NOTES on "make config":
- Having unnecessary drivers will make the kernel bigger, and can under some
circumstances lead to problems: probing for a nonexistent controller card may confuse
your other controllers.
- Compiling the kernel with "Processor type" set higher than 386 will result in
a kernel that does NOT work on a 386. The kernel will detect this on bootup, and give
up.
- A kernel with math-emulation compiled in will still use the coprocessor if one
is present: the math emulation will just never get used in that case. The kernel will be
slightly larger, but will work on different machines regardless of whether they have a
math coprocessor or not.
- The "kernel hacking" configuration details usually result in a bigger or slower
kernel (or both), and can even make the kernel less stable by configuring some routines
to actively try to break bad code to find kernel problems (kmalloc()). Thus you should
probably answer 'n' to the questions for "development", "experimental", or "debugging"
features.

COMPILING the kernel:

- Make sure you have at least gcc 3.2 available. For more information, refer to
Documentation/Changes. Please note that you can still run a.out user programs with this
kernel.

- Do a "make" to create a compressed kernel image. It is also possible to do "make
install" if you have lilo installed to suit the kernel makefiles, but you may want to check
your particular lilo setup first. To do the actual install you have to be root, but none of
the normal build should require that. Don't take the name of root in vain.

- If you configured any of the parts of the kernel as `modules', you will also have to do
"make modules_install".

- Keep a backup kernel handy in case something goes wrong. This is especially true for
the development releases, since each new release contains new code which has not been
debugged. Make sure you keep a backup of the modules corresponding to that kernel, as
well. If you are installing a new kernel with the same version number as your working
kernel, make a backup of your modules directory before you do a "make
modules_install". Alternatively, before compiling, use the kernel config option
"LOCALVERSION" to append a unique suffix to the regular kernel version.
LOCALVERSION can be set in the "General Setup" menu.

- In order to boot your new kernel, you'll need to copy the kernel image (e.g.
.../linux/arch/i386/boot/bzImage after compilation) to the place where your regular
bootable kernel is found.

- Booting a kernel directly from a floppy without the assistance of a bootloader such as
LILO, is no longer supported.

If you boot Linux from the hard drive, chances are you use LILO which uses the
kernel image as specified in the file /etc/lilo.conf. The kernel image file is usually
/vmlinuz, /boot/vmlinuz, /bzImage or /boot/bzImage. To use the new kernel, save a
copy of the old image and copy the new image over the old one. Then, you MUST
RERUN LILO to update the loading map!! If you don't, you won't be able to boot the
new kernel image.
Reinstalling LILO is usually a matter of running /sbin/lilo. You may wish to edit
/etc/lilo.conf to specify an entry for your old kernel image (say, /vmlinux.old) in case the
new one does not work. See the LILO docs for more information.
After reinstalling LILO, you should be all set. Shut down the system, reboot, and enjoy!

If you ever need to change the default root device, video mode, ramdisk size, etc. in
the kernel image, use the 'rdev' program (or alternatively the LILO boot options when
appropriate). No need to recompile the kernel to change these parameters.

- Reboot with the new kernel and enjoy.


- If you have problems that seem to be due to kernel bugs, please check the file
MAINTAINERS to see if there is a particular person associated with the part of the
kernel that you are having trouble with. If there isn't anyone listed there, then the second
best thing is to mail them to me, and possibly to any other relevant mailing list or
newsgroup.

- In all bug-reports, *please* tell what kernel you are talking about, how to duplicate the
problem, and what your setup is (use your common sense). If the problem is new, tell
me so, and if the problem is old, please try to tell me when you first noticed it.

- If the bug results in a message like

unable to handle kernel paging request at address C0000010
Oops: 0002
EIP: 0010:XXXXXXXX
eax: xxxxxxxx ebx: xxxxxxxx ecx: xxxxxxxx edx: xxxxxxxx
esi: xxxxxxxx edi: xxxxxxxx ebp: xxxxxxxx
ds: xxxx es: xxxx fs: xxxx gs: xxxx
Pid: xx, process nr: xx
xx xx xx xx xx xx xx xx xx xx

or similar kernel debugging information on your screen or in your system log, please
duplicate it *exactly*. The dump may look incomprehensible to you, but it does contain
information that may help debugging the problem. The text above the dump is also
important: it tells something about why the kernel dumped code (in the above example
it's due to a bad kernel pointer). More information on making sense of the dump is in
the kernel's Documentation/ subdirectory.

- If you compiled the kernel with CONFIG_KALLSYMS you can send the dump
as is, otherwise you will have to use the "ksymoops" program to make sense of the
dump (but compiling with CONFIG_KALLSYMS is usually preferred). This utility can
be downloaded from ftp://ftp.<country> .
Alternately you can do the dump lookup by hand:

- In debugging dumps like the above, it helps enormously if you can look up what the
EIP value means. The hex value as such doesn't help me or anybody else very much: it
will depend on your particular kernel setup. What you should do is take the hex value
from the EIP line (ignore the "0010:"), and look it up in the kernel namelist to see which
kernel function contains the offending address.

To find out the kernel function name, you'll need to find the system binary associated
with the kernel that exhibited the symptom. This is the file 'linux/vmlinux'. To extract
the namelist and match it against the EIP from the kernel crash, do:
nm vmlinux | sort | less

This will give you a list of kernel addresses sorted in ascending order, from which it is
simple to find the function that contains the offending address. Note that the address
given by the kernel debugging messages will not necessarily match exactly with the
function addresses (in fact, that is very unlikely), so you can't just 'grep' the list: the list
will, however, give you the starting point of each kernel function, so by looking for the
function that has a starting address lower than the one you are searching for but is
followed by a function with a higher address you will find the one you want. In fact, it
may be a good idea to include a bit of "context" in your problem report, giving a few
lines around the interesting one.

If you for some reason cannot do the above (you have a pre-compiled kernel image or
similar), telling me as much about your setup as possible will help. Please read the
REPORTING-BUGS document for details.

- Alternately, you can use gdb on a running kernel. (read-only; i.e. you cannot change
values or set break points.) To do this, first compile the kernel with -g; edit
arch/i386/Makefile appropriately, then do a "make clean". You'll also need to enable
CONFIG_PROC_FS (via "make config").

After you've rebooted with the new kernel, do "gdb vmlinux /proc/kcore". You can
now use all the usual gdb commands. The command to look up the point where your
system crashed is "l *0xXXXXXXXX". (Replace the XXXes with the EIP value.)

gdb'ing a non-running kernel currently fails because gdb (wrongly) disregards the
starting offset for which the kernel is compiled.


Model Test Papers

1. Each model test paper is of 100 marks.
2. Time for solving is 3 hrs.
3. Millennium Year recommends solving every question given in this unit to get good
marks.

Model Test Paper 1: attempt all questions

Q1 What is Linux? Write the features that make it popular. (3)
Q2 Write the difference between a monolithic and a modular kernel. (3)
Q3 Write a short note on semaphores. (3)
Q4 What is a kernel? Illustrate its function in Unix/Linux. (3)
Q5 What is multiprocessing? Explain symmetric multiprocessing. (3)
Q6 Explain the representation of the file system in Linux. (5)
Q7 Explain the data structures in the Linux kernel. (5)

Attempt one part in each question
Each part is of 12.5 marks

Q8 a) Explain the file system in Unix.
b) What is the BSD version of Unix? Write the advantages and disadvantages of Unix.
Explain the kernel architecture.
Q9 a) When was Linux born and how was it developed? Write the differences between
Unix and Linux, and between Linux and Windows NT.
b) Explain any 25 commands in Linux.
Q10 a) Explain the Linux architecture and its editor.
b) Why is system administration necessary? Also explain the concept of
process and system calls.
Q11 a) Write a short note on memory management.
b) What is a file system? What are the various types of file system?
Discuss proc and ext2.
Q12 a) Explain the changes to the kernel in case of multiprocessing.
b) Explain modules and debugging in brief.
Q13 a) What is synchronization? Explain communication via files and debugging with
b) Explain the concept of a pipe with the help of C programming.

Model Test Paper2:

1. a. What is a file system in Linux? (5)
b. Explain sockets. (5)
c. Explain major and minor devices. (5)
d. Explain chown, grave, chmod, telinit, pg, ps, top, head, tr, cut. (5)
e. Explain the working of fork, exec, wait, message queues and shared memory. (5)

Attempt any six questions from 2 to 9
2. a. What are character and block devices? (4)
b. What are ARP and subnetting? (2)
c. Symmetric multiprocessing (2.5)
d. Dirty block, sticky, suid and sgid bits (4)

3. a) Write a program using sockets in which the client sends a number to the server and
the server returns its square. (6)
b) i) Demonstrate the use of pipes and FIFOs in a program. (6)
ii) Write the difference between a pipe and a FIFO.

4. Explain multiprocessing. (12.5)
5. a. What are modules? (6)
b. What is debugging? Explain GDB or SDB. (1/2 + 6)

6. a. Name any five file systems and explain any one of them. (6)
b. Name any six editors and explain any one of them. (6.5)
7. Explain the booting process of Linux. (12.5)
8. Give answers
a. Write the names of different environment variables. (1.5)
b. What is HAL? (1.5)
c. What is POSIX? (1.5)
d. What are IPC and a race condition? (2)
e. Explain the concept of virtual address space. (6)
9. Give answers
a. Explain static and dynamic allocation. (2.5)
b. Why are the cat and ls commands used? (2)
c. What are dd, uucp, gunzip, gzip, tar, wc, tty, echo, rm, mv? (5)
d. What is the lp command? (1)
e. What is the man command? (1)
f. What are internal and external commands? (1)

Model test paper 3:

Attempt any eight questions

1. Explain (12.5)
2. i. Write shell scripts (10)
a. to find a factorial
b. to check whether a number is prime or not
c. to display a table
d. to copy two files into a third file
e. to take a number and show the corresponding month
ii. What is memory and I/O symmetry? (2.5)
3. Answer
a. What is proc? Explain. (4)
b. Write the difference between proc and ext2. (2)
c. What are modules? (2.5)
d. Explain the structure of the inode and superblock and their operations. (4)
4. Explain
a. process management (4)
b. system calls (4.5)
c. multiprocessing (4)
5. Write C programs
a. to demonstrate the use of shared memory (3)
b. to demonstrate a message queue (3)
c. for a client-server program using sockets (4.5)
d. Explain why the pipe mechanism is not efficient in client-server programs.
6. Write a short note on system administration.
7. What is the architecture-independent memory model?
8. Explain the various data structures in the Linux kernel.
9. What are the various IPC mechanisms? Explain semaphores and the use of
semaphores in a C program.


Linux for competitors

Attempt all questions
Theory exam      80 marks    5 hrs
Practical exam   20 marks    3 hrs
Total           100 marks    8 hrs

Millennium Year recommends solving all questions given in this unit to get good marks

MY Linux entrance 2007 (theory)

Attempt all questions; each question carries 10 marks

1) Answer
a. Linux is compatible with the _____________ standard.
b. Linux was born in _____________
c. Linux is still a __________-bit OS
d. In ext2, the file name length can be up to ________________
e. ______________ is the first C function called in the booting of Linux
f. The loff_t type is used for ________________
g. max_thread can be altered through the ___________ interface
h. What is query_module?
i. What is gdb?
j. APIC stands for
2) Explain any 10 features of Linux
3) Explain any 20 commands
4) Explain memory management
5) Explain
a. all six data structures in Linux (6)
b. system administration (2)
c. process management (2)
6) Explain
a. the representation of the file system in the kernel, with at least six structures (3)
b. proc and ext2, and write the difference between NTFS and ext2 (4)
c. any six system calls (3)
7) Explain
a. all IPC mechanisms with C examples (8)
b. synchronization in the kernel (2)
8) Write short notes on (10)
a. HAL
b. Compiling the kernel
d. Different types of kernel
e. Booting up of Linux

MY Linux entrance 2007 (practical)

Attempt any 20 questions; each question carries 1 mark

Write shell scripts to

1. send the output of date, cal, time, ps and who to a file "a.txt"
2. convert °C to °F (Celsius to Fahrenheit)
3. find even/odd
4. find whether a year is a leap year or not, taking the year at the command line
5. find prime/not, taking the number at run time
6. sort numbers
7. find/search in a file
8. read a number and show the month using case
9. find the factorial of a number
10. display the Fibonacci series
11. display a multiplication table
12. find the square of a number by calling a function
13. demonstrate the use of returning a value from a function
14. show all parameters entered on the command line
15. copy one file to another using the command line
16. in the above program, also create a file that records all the data copied by your
program
17. compile the Linux kernel
18. mount a USB drive and a floppy, and copy all the data of the floppy to the USB drive
by making a folder
19. delete all the data and files on a floppy disk, and all folders as well
20. search for a string in a directory and, if found, delete all those files. Note: the
string and directory are entered by the user
21. display the shell name, user name, path where commands are found, login name, OS
type, number of columns on your screen, your home directory, number of rows on your
screen, and shell version.


1 answers :
a) POSIX-1003.1
b) 1991
c) 32
d) 255
e) start_kernel()
f) telling f_pos
g) sysctl
j) Advanced Programmable Interrupt Controller


19: check if it is a folder, then move inside and delete files and then folder
for file in *
do
if [ -d "$file" ] ; then
cd "$file"
rm *
cd ..
rmdir "$file"
else
rm "$file"
fi
done

20. echo

Example1: write a shell script to read name, grade, basic salary and display it
echo "enter name:"
read name
echo "enter grade:"
read grade
echo "enter basic salary:"
read bs
echo "name: $name, grade: $grade, bs=$bs"
exit 0

Example2: write a program in C to show use of message queue
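/* Sender: places one message on the queue */
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
struct msg {
    long int msg_type;
    char text[100];
};
int main() {
    struct msg data;
    int msgid=msgget(1234,0666|IPC_CREAT);
    if(msgid==-1) { perror("msgget"); return 1; }
    data.msg_type=1; /* the message type must be greater than 0 */
    strcpy(data.text,"hello");
    msgsnd(msgid,(void *)&data,sizeof(data.text),0);
    return 0;
}

/* Receiver: takes the first message off the same queue */
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/msg.h>
struct msg {
    long int msg_type;
    char text[100];
};
int main() {
    struct msg data;
    int msgid=msgget(1234,0666|IPC_CREAT);
    if(msgid==-1) { perror("msgget"); return 1; }
    if(msgrcv(msgid,(void *)&data,sizeof(data.text),0,0)!=-1)
        printf("received: %s\n",data.text);
    msgctl(msgid,IPC_RMID,0); /* remove the queue */
    return 0;
}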

Example3: to show use of shared memory
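/* Writer: attaches the segment and stores a string in it */
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
struct data {
    char text[100];
};
int main() {
    struct data *d;
    int shmid=shmget(1234,sizeof(struct data),0666|IPC_CREAT);
    void *shared_memory=(void *)0;
    if(shmid==-1) { perror("shmget"); return 1; }
    shared_memory=shmat(shmid,(void *)0,0);
    if(shared_memory==(void *)-1) { perror("shmat"); return 1; }
    d=(struct data *)shared_memory;
    strcpy(d->text,"hello");
    shmdt(shared_memory);
    return 0;
}

/* Reader: attaches the same segment and prints the string */
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
struct data {
    char text[100];
};
int main() {
    struct data *d;
    int shmid=shmget(1234,sizeof(struct data),0666|IPC_CREAT);
    void *shared_memory=(void *)0;
    if(shmid==-1) { perror("shmget"); return 1; }
    shared_memory=shmat(shmid,(void *)0,0);
    if(shared_memory==(void *)-1) { perror("shmat"); return 1; }
    d=(struct data *)shared_memory;
    printf("%s\n",d->text);
    shmdt(shared_memory);
    shmctl(shmid,IPC_RMID,0); /* remove the segment */
    return 0;
}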

Example4: using FIFO
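#include <string.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#define fifo1 "/home/fifo.1"
int main() {
    int childpid,readfd,writefd;
    char msg[100];
    strcpy(msg,"hello");
    mkfifo(fifo1,0666); /* create the named pipe */
    if((childpid=fork())==0) { readfd=open(fifo1,O_RDONLY); read(readfd,msg,sizeof(msg)); printf("%s\n",msg); close(readfd); } /* child reads */
    else if(childpid>0) { writefd=open(fifo1,O_WRONLY); write(writefd,msg,strlen(msg)+1); close(writefd); } /* parent writes */
    return 0; }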
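
Example5: use of Pipes
#include <stdio.h>
#include <string.h>
#include <unistd.h>
struct share {
    char string[10]; };
int main() {
    int pipe1[2];
    int pid;
    if(pipe(pipe1)==-1) { perror("pipe"); return 1; } /* create the pipe before fork() */
    pid=fork();
    if(pid>0) { //parent
        struct share s;
        strcpy(s.string,"hello");
        write(pipe1[1],&s,sizeof(struct share));
    }
    else if(pid==0) { //child
        struct share t;
        read(pipe1[0],&t,sizeof(struct share));
        printf("child read: %s\n",t.string);
    }
    return 0;
}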

Example6: to check user is minor, young, old.
echo "enter age:"
read age
if [ $age -le 12 ]
then
echo "minor"
elif [ $age -le 18 ]
then
echo "major"
elif [ $age -le 25 ]
then
echo "young"
else
echo "old"
fi
exit 0