Linux Open Source

Published by Vimal EA




Published by: Vimal EA on Feb 15, 2013
Copyright:Traditional Copyright: All rights reserved








What Software is Needed?
 Operating Systems
 Application Software
 Software Development Tools
 Web services
 Database Servers and RDBMSs




What is open-source software (OSS)?
 Software comes in two forms: compiled code (binaries) and the human-readable source code from which these binaries are compiled.
 Open-source software is software that is distributed as source code as well as binaries.
 The distributor cannot restrict any party from redistributing the software, nor can any party be restricted from making modifications or derivative works based on the source code.




Open Source Definition (OSD)
1. Free Redistribution The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.




Open Source Definition (OSD)
2. Source Code The program must include source code, and must allow distribution in source code as well as compiled form.
3. Derived Works The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.




Open Source Definition (OSD)
 Integrity of The Author’s Source Code
 Distinguished changes from the base source.

 No Discrimination Against Persons or Groups.

 Distribution of License
 No need for execution of an additional license.

 License Must Not Be Specific to a Product
 Must not depend on the program’s being part of a particular software distribution.

 License Must not Restrict Other Software




What is open-source software (OSS)? (continued)

 Open Source Software (OSS) is an example of a second order Internet effect.
 The first order was commercialization through buying and selling (e.g., Amazon and eBay).
 The second order is based on collaboration and information sharing (e.g.,

 Programmers throughout the world can be engaged in software development.




Open Source Vs. Closed Source Software

Closed source software
 Developed by companies; developers work for economic purposes.
 Centralized, single site development.
 Users may suggest requirements, but they may or may not be implemented.
 Releases are infrequent; there may be only yearly releases.

Open source software
 Developed by volunteers who work for peer recognition. People know that recognition as a good developer has great advantages.
 Decentralized, distributed, multi-site development.
 Users suggest additional features that often get implemented.
 Software is released on a daily or weekly basis.

Open Source Vs. Closed Source Software (Continued)

Closed source software
 The market believes commercial CSS is highly secure because it is developed by a group of professionals confined to one geographical area under a strict time schedule. But quite often this is not the case: hiding information does not make it secure, it only hides its weaknesses.
 Security cannot be enhanced by modifying the source code.

Open source software
 OSS is not market driven; it is quality driven.
 Community reaction to bug reports is much faster compared to CSSD, which makes it easier to fix bugs and make the component highly secure.
 The ability to modify the source code could be a great advantage if you want to deploy a highly secure system.

 Open Source is a certification standard issued by the Open Source Initiative (OSI)
 indicates that the source code of a computer program is made available free of charge to the general



 OSI dictates that in order to be considered "OSI Certified" a product must
 The author or holder of the license of the source code cannot collect royalties on the distribution of the program.
 The distributed program must make the source code accessible to the user.
 The author must allow modifications and derivations of the work under the program's original

 No person, group or field of endeavor can be denied access to the program




Example of open source software
 Programming Tools
 Zope, and PHP, are popular engines behind the "live content" on the World Wide

 Languages:
 Perl  Python  Ruby  Tcl/Tk

 GNU compilers and tools
 GCC, Make, Autoconf, Automake, etc.




Open source software sites
 Free Software Foundation www.fsf.org
 Open Source Initiative www.opensource.org
 Freshmeat.net
 SourceForge.net
 OSDir.com
 developer.BerliOS.de
 Bioinformatics.org
 see also individual project sites, e.g., www.apache.org, www.cpan.org, etc.




What open-source software is available? (continued)
 Software Development
o GCC - The compiler for C, C++, Fortran, and Java that comes standard with all the major OSS operating systems http://gcc.gnu.org/
o JBOSS - A popular open-source implementation of J2EE http://www.jboss.org
o Perl - A very popular language widely used in scripts to drive "live content" on the World Wide Web http://www.perl.org
o PHP - A very popular scripting language for interactive web development and applications
o Python - A popular object-oriented scripting language for web and desktop development





What open-source software is available?
 Multi-user Networked Operating Systems

o Linux - The most popular OSS operating system on the planet http://www.linux.org
 Internet/intranet Services and Applications
o Apache web server - Accounts for over 60% of the web servers on the Internet http://www.apache.org
o BIND name server - The software that provides the DNS (domain name service). Many of the root name servers as well as the Internet backbone network ISPs use BIND http://www.isc.org/products/BIND/
o Sendmail (Mail Exchange server) - The most widely used email transport software on the Internet http://www.sendmail.org



What open-source software is available? (continued)
 Database Systems
o MySQL - A very popular open-source RDBMS http://www.mysql.com
o PostgreSQL - A popular open-source RDBMS with many advanced features
 Desktop Applications
o OpenOffice.org - An integrated office suite featuring word-processing, spreadsheet, drawing and presentation software largely compatible with Microsoft Office http://www.openoffice.org
o Ximian Evolution - A GUI desktop application for personal email, calendar and diary with a look and feel similar to Microsoft Outlook http://www.ximian.org
o Mozilla - The open-source evolution of the popular Netscape web browser





Can We Count On OSS?
 OSS is developed and/or maintained by volunteer programmers, so is a single party fully accountable for it?
 Yes. For common open source projects we find non-profit foundations or normal businesses supporting the software.
 For example, Apache is supported through the Apache Software Foundation, and Red Hat Linux is supported and maintained by Red Hat, Inc.




Open source companies
 IBM
 uses and develops Apache and Linux
 created Secure Mailer and other software on alphaWorks
 Apple
 released core layers of Mac OS X Server as an open source BSD operating system called Darwin
 open sourcing the QuickTime Streaming Server and the OpenPlay network gaming toolkit
 HP
 uses and releases products running Linux
 Sun
 uses Linux; supports some open source development efforts (Eclipse IDE for Java and the Mozilla web browser)




Open source companies
 Red Hat Software
 Linux vendor

 ActiveState
 develops and sells professional tools for Perl, Python, and Tcl/tk developers.




Can We Get Support On OSS?
 The most frequently cited reason against using OSS in corporations is the lack of support.
 With proprietary CSS we can rely on the vendor for support.
 But there exist professional companies providing service and support for open-source software (e.g., Red Hat for Linux, Zend for PHP, and recently Sun Microsystems for MySQL).
 The Internet is another great source of informal support that is efficient (newsgroups, FAQs and HOW-TO documents).








 Free Software Foundation (FSF)
 Open Source Initiative (OSI)




Open source licensing
 The licence is what determines whether software is open source

 The licence must be approved by the Open Source Initiative (OSI)
 The Open Source Initiative approves open source licenses after they have successfully gone through the approval process and comply with the Open Source Definition (above).
 All approved licences meet the Open Source Definition
 There are more than 50 approved licences, including the GPL, LGPL, MPL and BSD.


UNIT-I 2/18/2013

 Software is protected by copyright law
 By default only the owner of software may copy, adapt or distribute it.
 The owner of software can agree to let another person copy, adapt or distribute the code - this agreement is called a licence.



Free/Open Source Software Licenses

 Academic Free License
 Apache Software License
 Apple Public Source License
 Artistic license
 BSD license
 GNU GPL
 GNU LGPL
 IBM Public License
 Intel Open Source License
 MIT license
 Sun Public License
 ….




Open Source Software licensing and copyright
 The two most common types of OSS licensing are:
o BSD (Berkeley Software Distribution) style: this category of license allows one to take open-source software and redistribute it, with or without modifications, as proprietary software (e.g. Apache, BIND).
o GNU General Public License (GPL): a license that requires that any product derived from the original open-source software must also be distributed under the same licensing regime as the original. Thus it cannot be turned into a closed-source product.




Free/Open Source Software Licenses
If you distribute your software under one of these licenses, you are permitted to say that your software is "OSI Certified Open Source Software." If you see either of the OSI Certified certification marks on a piece of software, the software is being distributed under a license that conforms to the Open Source Definition.




BSD license
 no restriction on derivative work
 The only restrictions placed on users of software released under a typical BSD license are that if they redistribute such software in any form, with or without modification, they must include in the redistribution
 (1) the original copyright notice,
 (2) a list of two simple restrictions and
 (3) a disclaimer of liability.

These restrictions can be summarized as
 (1) one should not claim that they wrote the software if they did not write it
 (2) one should not sue the developer if the software does not function as expected or as desired




Mozilla Public License
 Divides software into:
1. Open Source part
2. anything added by user
 Things added by the user can be proprietary, if he does not modify the Open Source part
 Everything is Open Source if he modifies the Open Source part




Apache license
BSD + condition:  derived products may not use the Apache name without prior written permission




Taxonomy of Software by FSF

Proprietary: the use, redistribution or modification of the software is prohibited, or requires you to ask for permission, or is restricted so much that you effectively can't do it freely.


Semi-free: not free, but comes with permission for individuals to use, copy, distribute, and modify (including distribution of modified versions) for non-profit purposes. e.g. PGP


A. Copylefted: redistribution cannot add additional restriction
B. Non-Copylefted:




Copyleft – as explained by FSF
 Copyleft is a general method for making a program free software and requiring all modified and extended versions of the program to be free software as well.
 To copyleft a program, we first state that it is copyrighted; then we add distribution terms, which are a legal instrument that gives everyone the rights to use, modify, and redistribute the program's code or any program derived from it but only if the distribution terms are unchanged.
 Thus, the code and the freedoms become legally inseparable.




Free, copyrighted but not copylefted
 Non-copylefted free software comes from the author with permission to redistribute and modify, and also to add additional restrictions to it.
 If a program is free but not copylefted, then some copies or modified versions may not be free at all.
 A software company can compile the program, with or without modifications, and distribute the executable file as a proprietary software product.
 Example: X11 Window System




 The biggest difference between the GPL and BSD licenses is the fact that the former is a copyleft license and the latter is not.
 Copyleft is the application of copyright law to permit the free creation of derivative works but requiring that such works be redistributable under the same terms (i.e., the same license) as the original work.




Typical OSS development model
[Figure: typical OSS development model. Roles shown: User as Developer, Developer, Development Community, Trusted Developer, Trusted Repository, Distributor. Flows shown: bug reports; improvements (as source code) and evaluation results.]

“Stone soup development”
• OSS users typically use software without paying licensing fees
• OSS users typically pay for training & support (competed)
• OSS users are responsible for paying/developing new improvements & any evaluations that they need; often cooperate with others to do so
• Goal: Active development community (like a consortium)



Advantages of open source software





1. The availability of the source code and the right to modify it is very important.
 It enables the unlimited tuning and improvement of a software product.
 It also makes it possible to port the code to new hardware, to adapt it to changing conditions, and to reach a detailed understanding of how the system works.

2. The right to redistribute modifications and improvements to the code, and to reuse other open source code, permits all the advantages due to the modifiability of the software to be shared by large communities.

3. No exclusive rights to the software. Open source software is open to everyone. Because of this no individual programmer or company can dictate the direction that the development should take.

4. The right to use the software in any way.
 This, combined with redistribution rights, ensures (if the software is useful enough) a large population of users, which helps in turn to build up a market for support and customization of the software, which can only attract more and more developers to work in the project.
 This in turn helps to improve the quality of the product, and to improve its functionality.

5. The biggest advantage of open source for users is that most projects are free to download and use.




Disadvantages of open source software
 A common drawback is that the focus is often on backend processing of information and not on user interfaces.
 Microsoft Windows has arguably one of the easiest interfaces with which to work.
 Often, open source software such as Linux requires the user to have specialized knowledge, and it cannot be configured with just a few clicks of a mouse.
 In addition, open source projects often do not have good documentation to walk the user through learning and using the technologies.







Types of Operating System
 Tasks
 Uni-tasking, Multi-tasking
 Users
 Single User, Multi User
 Processing
 Uni-processing, Multi-processing
 Timesharing
 is the sharing of a computing resource among many users by means of multiprogramming and multi-tasking




 Multics – 1964
 Unics – 1969
 Minix – 1987
 Linux – 1991




 Multiplexed Information and Computing Service
 Development began in 1964
 Mainframe timesharing OS
 The last installation was shut down on October 30, 2000
 Monolithic kernel
 Disadvantages: crashes, insecure, error prone, expensive




 Uniplexed Information and Computing System
 Later renamed UNIX
 Written in 1969
 Ken Thompson and Dennis Ritchie were among the developers
 Multi user, multi tasking and timesharing
 Monolithic kernel




 Minimal Unix
 Tanenbaum developed this OS
 Mainly for educational purposes
 Unix-like OS, implemented with a microkernel; the name is short for "mini-Unix"




What is Linux?
 Developed in 1991 by Linus Torvalds, then a student at the University of Helsinki, Finland.
 Basically a kernel, it was combined with the various software and compilers from the GNU Project to form an OS, called GNU/Linux
 Linux is a full-fledged OS available in the form of various Linux distributions
 RedHat, Fedora, SuSE, Ubuntu, Debian are examples of Linux distros
 Linux is supported by big names such as IBM, Google, Sun, Novell, Oracle, HP, Dell, and many





 Used in most of the computers, ranging from super computers to embedded

 Multi user
 Multi tasking
 Time sharing
 Monolithic kernel
 Stable version of Linux kernel – 2.6.28, released on 24-Dec-2008




History of Linux
 Inspired by the UNIX OS, the Linux kernel was developed as a clone of UNIX
 GNU was started in 1984 with a mission to develop a free UNIX-like OS
 Linux was the best fit as the kernel for the GNU Project
 The Linux kernel was passed on to many interested developers throughout the Internet
 Linux today is a result of the efforts of thousands of individuals, apart from Torvalds




Free Software Foundation & GNU
 Organisation that started developing copylefted programs
 GNU Project: announced by Richard Stallman on September 27th 1983.
 The GNU Project was launched in 1984 to develop a complete Unix-like operating system which is free software: the GNU system.
 GNU's kernel isn't finished, so GNU is used with the kernel Linux. The combination of GNU and Linux is the GNU/Linux operating system, now used by millions.
 www.gnu.org




Linux on Servers and Supercomputers
 Linux is the most used OS on servers
 5 out of 10 reliable web hosting companies use Linux
 Linux is the cornerstone of the LAMP server-software combination (Linux, Apache, MySQL, Perl/PHP/Python) which has achieved popularity among developers
 Out of the top 500 supercomputers, Linux is deployed on 426 of them




Linux on Embedded Systems
 16.7% of smartphones worldwide use Linux as OS
 Linux poses major competition to the most popular OS in this segment – Symbian
 Nokia and Openmoko supply Linux on select smartphones




Why should you use Linux?
 Far fewer viruses target Linux
 Linux systems are extremely stable
 Linux is free
 Linux comes with most of the required software pre-installed
 Linux does not tend to slow down over time
 Linux can run even on very old hardware




 Free Open Source Software
 Free – Means liberty and is not related to price or cost
 Open – Source code is available and anybody can contribute to the development. Organization independent




4 Freedoms with FOSS
 Freedom to run the software anywhere
 Freedom to study how the programs work, i.e. the source code will be accessible
 Freedom to redistribute copies
 Freedom to improve the software

If software has all 4 of these freedoms, then it is FOSS




 Core or nucleus of an operating system
 Interacts with the hardware
 First program to get loaded when the system starts; runs until the session is terminated
 Different from the BIOS, which is hardware dependent.
 Kernel is software dependent
 LINUX: On the hard disk, it is represented by the file /vmlinuz.




Kernel types
 Monolithic
 All OS-related code is placed in a single module
 Available as a single file
 Advantage: Faster functioning
 Micro
 OS components are isolated and run in their own address space
 Device drivers, programs and system services run outside kernel memory space. Only a few functions such as process scheduling and interprocess communication are included in the microkernel
 Supports modularity and is smaller in size



 Program that interacts with the kernel
 Bridge between the kernel and the user
 Command interpreter
 The user types a command, the command is conveyed to the kernel, and it is executed




Types of Shell
 Sh – Bourne Shell
 BASH – Bourne Again Shell
 KSH – Korn Shell
 CSH – C Shell
 SSH – Secure Shell (a remote login tool rather than a command interpreter)
 To use a particular shell, type the shell name at the command prompt.
 E.g. $ csh – will switch the current shell to the C shell
 To view the current shell that is being used, type echo $SHELL at the command





Linux Distributions
 Today there are hundreds of different distributions available. Popular Linux distributions include
■ SUSE Linux
■ Fedora Linux
■ Red Hat Enterprise Linux
■ Debian Linux
■ ALT Linux
■ TurboLinux
■ Mandrake Linux
■ Lycoris Linux
■ Linspire
■ Gentoo Linux
■ Ubuntu



Red Hat Linux: One of the original Linux distributions. The commercial, non-free version is Red Hat Enterprise Linux, which is aimed at big companies using Linux servers and desktops in a big way.
• Free version: Fedora Project.

Debian GNU/Linux: A free software distribution. Popular for use on servers. However, Debian is not what many would consider a distribution for beginners, as it's not designed with ease of use in

SuSE Linux: SuSE was recently purchased by Novell. This distribution is primarily for pay because it contains many commercial programs, although there's a stripped-down free version available that you can download.

Mandrake Linux: Mandrake is perhaps strongest on the desktop. Originally based on Red Hat Linux.

Gentoo Linux: Gentoo is a specialty distribution meant for programmers.

Linux OS




[Figure: an operating system shared by User 1 and User 2]








 User mode is the normal mode of operation for programs. Web browsers, calculators, etc. all run in user mode.
 They don't interact directly with the kernel; instead, they just give instructions on what needs to be done, and the kernel takes care of the rest.
 Code running in user mode must delegate to system APIs to access hardware or memory.
 Due to the protection afforded by this sort of isolation, crashes in user mode are always
 Most of the code running on your computer will execute in user mode.
 When in user mode, some parts of RAM can’t be addressed, some instructions can’t be executed, and I/O ports can’t be accessed




 Kernel mode, on the other hand, is where programs communicate directly with the kernel.
 Kernel-mode programs run in the background, making sure everything runs smoothly: things like printer drivers, display drivers, and drivers that interface with the monitor, keyboard, mouse, etc.
 The executing code has complete and unrestricted access to the underlying hardware.
 It can execute any CPU instruction and reference any memory address.
 Kernel mode is generally reserved for the lowest-level, most trusted functions of the operating system.
 Crashes in kernel mode are catastrophic; they will halt the entire PC.
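
The split between the two modes can be observed from an ordinary program. The sketch below is only an illustration: the function name cpu_time_split and the choice of os.stat as the repeated request are assumptions made for this example. It uses Python's os.times(), which reports the CPU time a process has spent in user mode and the time the kernel has spent working on its behalf.

```python
import os


def cpu_time_split(calls=50000):
    """Return (user_seconds, system_seconds) used while issuing `calls` stat requests.

    cpu_time_split is a hypothetical helper written for this illustration.
    """
    before = os.times()
    for _ in range(calls):
        # Each os.stat call asks the kernel for file metadata, so part of
        # the work is accounted as system (kernel-mode) time.
        os.stat(".")
    after = os.times()
    return after.user - before.user, after.system - before.system


if __name__ == "__main__":
    user, system = cpu_time_split()
    print(f"user mode: {user:.3f}s  kernel mode: {system:.3f}s")
```

The exact numbers depend on the machine and on timer resolution, so no particular output should be expected.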




 A good example of this would be device drivers.
 A device driver must tell the kernel exactly how to interact with a piece of hardware, so it must be run in kernel mode
 Because of this close interaction with the kernel, the kernel is also a lot more vulnerable to programs running in this mode, so it becomes highly crucial that drivers are properly debugged before being released to the public.




 The only way a user space application can explicitly initiate a switch to kernel mode during normal operation is by making a system call such as open, read, write etc.
 Whenever a user application calls these system call APIs with appropriate parameters, a software interrupt/exception (SWI) is triggered.
 As a result of this SWI, the control of the code execution jumps from the user application to a predefined location in the Interrupt Vector Table [IVT] provided by the OS.
 This IVT contains an address for the SWI exception handler routine, which performs all the necessary steps required to switch the user application to kernel mode and start executing kernel instructions on behalf of the user process.
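
The open, read and write calls named above can be exercised directly. The following is a minimal sketch, not a definitive implementation: write_then_read and the file name demo.txt are hypothetical names chosen for the example. Python's os.open, os.write, os.read and os.close are thin wrappers that hand the request to the kernel through the corresponding system calls.

```python
import os
import tempfile


def write_then_read(path, payload):
    """Write `payload` (bytes) to `path`, then read it back.

    write_then_read is a hypothetical helper; it assumes a small payload
    written to a regular file, so one write and one read are enough.
    """
    # os.open returns a raw integer file descriptor handed out by the kernel.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)   # request a write from the kernel
    finally:
        os.close(fd)            # release the descriptor

    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(payload))   # request a read from the kernel
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(write_then_read(os.path.join(tmp, "demo.txt"), b"hello kernel\n"))
```

Each of these calls crosses from user mode into kernel mode and back; the program never touches the disk hardware itself.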




Decomposition of Linux System into Major Subsystems
User Applications -- the set of applications in use on a particular Linux system will be different depending on what the computer system is used for. Examples: word processing and a web browser.

O/S Services -- these are services that are typically considered part of the operating system (a windowing system, command shell, etc.); also, the programming interface to the kernel (compiler tool and library) is included in this subsystem.

Linux Kernel -- the kernel abstracts and mediates access to the hardware resources, including the CPU.

Hardware Controllers -- this subsystem is comprised of all the possible physical devices in a Linux installation; for example, the CPU, memory hardware, hard disks, and network hardware




The fundamental architecture of the GNU/Linux operating system




User Applications
 At the top is the user, or application, space.
 This is where the user applications are executed.
 Below the user space is the kernel space where the Linux kernel exists.




GNU C Library (glibc)
 provides the system call interface that connects to the kernel
 provides the mechanism to transition between the user-space application and the kernel. This is important because the kernel and user application occupy different protected address spaces.
 And while each user-space process occupies its own virtual address space, the kernel occupies a single address space




Fundamental architecture of the GNU/Linux operating system
 The Linux kernel can be further divided into three gross levels.
 At the top is the system call interface, which implements the basic functions such as read and
 Below the system call interface is the kernel code, which can be more accurately defined as the architecture-independent kernel code.
 This code is common to all of the processor architectures supported by Linux.
 Below this is the architecture-dependent code, which forms what is more commonly called a BSP (Board Support Package).
 This code serves as the processor and platform-specific code for the given architecture.




File management

Directory Tree

When you log on to the Linux OS using your username, you are automatically placed in your home directory.

Most important subdirectories
 /bin : Important Linux commands available to the average user.
 /boot : The files necessary for the system to boot. Not all Linux distributions use this one. Fedora does.
 /dev : All device drivers. Device drivers are the files that your Linux system uses to talk to your
 /etc : System configuration files.
 /home : Every user except root gets her own folder in here, named for her login account. So, the user who logs in with linda has the directory /home/linda, where all of her personal files are kept.
 /lib : System libraries. Libraries are just bunches of programming code that the programs on your system use to get things done.

Most important subdirectories
• /mnt : Mount points. When you temporarily load the contents of a CD-ROM or USB drive, you typically use a special name under /mnt.
• /root : The root user's home directory.
• /sbin : Essential commands that are only for the system administrator.
• /tmp : Temporary files and storage space. Don't put anything in here that you want to keep. Most Linux distributions (including Fedora) are set up to delete any file that's been in this directory longer than three days.
• /usr : Programs and data that can be shared across many systems and don't need to be changed.
• /var : Data that changes constantly (log files that contain information about what's happening on your system, data on its way to the printer, and so on).

Home directory
 You can see what your home directory is called by entering

pwd (print current working directory)
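
The same information that pwd prints is available to programs. The sketch below is an illustration only; where_am_i is a hypothetical helper name. It uses os.getcwd() for the current working directory and os.path.expanduser("~") for the home directory of the logged-in user.

```python
import os


def where_am_i():
    """Return the current working directory and the user's home directory.

    where_am_i is a hypothetical helper written for this illustration.
    """
    return {
        "cwd": os.getcwd(),                 # what the pwd command prints
        "home": os.path.expanduser("~"),    # e.g. /home/linda for user linda
    }


if __name__ == "__main__":
    for name, path in where_am_i().items():
        print(f"{name}: {path}")
```

Right after login the two values are normally the same, because the session starts in the home directory.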

 Block diagram of Linux Kernel

Linux Kernel- System Call Interface
 System call is the mechanism used by an application program to request service from the operating system.
 An API is a function definition that specifies how to obtain a given service (e.g. calloc, malloc, free), while a system call is an explicit request to the kernel made via a software interrupt
 Invoking a system call by user mode process

Five main subsystems-Overview
 The Process Scheduler (SCHED) is responsible for controlling process access to the CPU.
 The scheduler enforces a policy that ensures that processes will have fair access to the CPU, while ensuring that necessary hardware actions are performed by the kernel on time.
 The Memory Manager (MM) permits multiple processes to securely share the machine's main memory system.
 In addition, the memory manager supports virtual memory that allows Linux to support processes that use more memory than is available in the system.
 Unused memory is swapped out to persistent storage using the file system then swapped back in when it is needed.

Five main subsystems
 The Virtual File System (VFS) abstracts the details of the variety of hardware devices by presenting a common file interface to all devices.
 In addition, the VFS supports several file system formats that are compatible with other operating systems.
 The Network Interface (NET) provides access to several networking standards and a variety of network hardware.
 The Inter-Process Communication (IPC) subsystem supports several mechanisms for process-to-process communication on a single Linux system.
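
A pipe is one of the simplest IPC mechanisms. The sketch below is an assumption-laden illustration, not the kernel's own code: pipe_roundtrip is a hypothetical helper, and it relies on os.fork, so it runs on Linux and other Unix-like systems only. A child process writes into the pipe and the parent reads from it.

```python
import os


def pipe_roundtrip(message):
    """Send `message` (bytes) from a child process to its parent through a pipe.

    pipe_roundtrip is a hypothetical helper; the child upper-cases the
    message so the parent can tell the data really came from the child.
    """
    read_fd, write_fd = os.pipe()      # kernel-managed one-way channel
    pid = os.fork()
    if pid == 0:
        # Child process: write into the pipe, then exit immediately.
        try:
            os.close(read_fd)
            os.write(write_fd, message.upper())
        finally:
            os._exit(0)

    # Parent process: close the unused write end, then read until EOF.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)                 # reap the child
    return b"".join(chunks)


if __name__ == "__main__":
    print(pipe_roundtrip(b"hello from the child"))
```

The two processes share no memory here; all data travels through the kernel's pipe buffer.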

Kernel Subsystem Overview

Linux Kernel-Memory Management
 Linux’s physical memory-management system deals with allocating and freeing pages, groups of pages, and small blocks of memory.
 It has additional mechanisms for handling virtual memory, memory mapped into the address space of running processes

Managing Physical Memory
 The page allocator allocates and frees all physical pages; it can allocate ranges of physically-contiguous pages on request.
 The allocator uses a buddy-heap algorithm to keep track of available physical pages.
 Each allocatable memory region is paired with an adjacent partner.
 Whenever two allocated partner regions are both freed up they are combined to form a larger region.
 If a small memory request cannot be satisfied by allocating an existing small free region, then a larger free region will be subdivided into two partners to satisfy the request.
 Memory allocations in the Linux kernel occur either statically (drivers reserve a contiguous area of memory during system boot time) or dynamically (via the page allocator).
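
The splitting and merging of partner regions can be sketched in a few lines. This is a toy model under stated assumptions, not the Linux page allocator: the class name BuddyAllocator, its methods, and the use of Python sets as free lists are all choices made for this illustration. Block sizes are powers of two, expressed as an "order" (a block of order k covers 2**k pages).

```python
class BuddyAllocator:
    """Toy buddy allocator over 2**max_order pages (illustrative only)."""

    def __init__(self, max_order):
        self.max_order = max_order
        # free_lists[k] holds the start page of every free block of 2**k pages.
        self.free_lists = [set() for _ in range(max_order + 1)]
        self.free_lists[max_order].add(0)   # initially one big free block

    def alloc(self, order):
        """Allocate a block of 2**order pages and return its start page."""
        k = order
        # Find the smallest free block that is large enough.
        while k <= self.max_order and not self.free_lists[k]:
            k += 1
        if k > self.max_order:
            raise MemoryError("no free block large enough")
        start = min(self.free_lists[k])
        self.free_lists[k].remove(start)
        # Split repeatedly; the upper half of each split becomes a free buddy.
        while k > order:
            k -= 1
            self.free_lists[k].add(start + (1 << k))
        return start

    def free(self, start, order):
        """Free a block and merge it with its buddy while the buddy is free."""
        k = order
        while k < self.max_order:
            buddy = start ^ (1 << k)        # partner block of the same size
            if buddy not in self.free_lists[k]:
                break
            self.free_lists[k].remove(buddy)
            start = min(start, buddy)       # merged block starts at the lower one
            k += 1
        self.free_lists[k].add(start)


if __name__ == "__main__":
    allocator = BuddyAllocator(3)           # 8 pages in total
    first = allocator.alloc(0)
    second = allocator.alloc(0)
    allocator.free(first, 0)
    allocator.free(second, 0)
    print(allocator.free_lists)
```

The XOR in free works because two partners of order k differ only in bit k of their start page, which is what makes finding the buddy cheap.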

Virtual Memory
 The VM system maintains the address space visible to each process: It creates pages of virtual memory on demand, and manages the loading of those pages from disk or their swapping back out to disk as required.
 The VM manager maintains two separate views of a process’s address space:
 A logical view describing instructions concerning the layout of the address space.
 The address space consists of a set of non-overlapping regions, each representing a contiguous, page-aligned subset of the address space.
 A physical view of each address space which is stored in the hardware page tables for the process.
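
A process can ask for a new page-aligned region of its address space with a memory mapping. The sketch below is an illustration under assumptions: map_anonymous is a hypothetical helper, and it expects a positive byte count. It uses Python's mmap module to create an anonymous mapping rounded up to whole pages.

```python
import mmap


def map_anonymous(nbytes):
    """Map at least `nbytes` (> 0) of anonymous memory and return the mapped length.

    map_anonymous is a hypothetical helper written for this illustration.
    """
    page = mmap.PAGESIZE
    length = ((nbytes + page - 1) // page) * page   # round up to whole pages
    # -1 means the mapping is not backed by a file (anonymous memory).
    with mmap.mmap(-1, length) as region:
        # On Linux, physical pages are typically supplied only when the
        # region is first touched, as with this write.
        region[0:5] = b"hello"
        return len(region)


if __name__ == "__main__":
    print(mmap.PAGESIZE, map_anonymous(1))
```

The region is released when the with block ends, which removes it from the process's address space again.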

File System
 A file system is the methods and data structures that an operating system uses to keep track of files on a disk or partition; that is, the way the files are organized on the disk.
 A file is an ordered string of bytes
 Files are organized in directories.
 File information like size, owner, access permission etc. is stored in a separate data structure called an inode.
 Superblock is a data structure containing information about the file system
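
The fields kept in an inode can be inspected from user space. The following is a small sketch; describe is a hypothetical helper name. It uses os.stat, which returns the metadata the file system stores for a file, and stat.filemode to render the permission bits in the familiar ls -l style.

```python
import os
import stat
import tempfile


def describe(path):
    """Return selected inode fields for `path`.

    describe is a hypothetical helper written for this illustration.
    """
    info = os.stat(path)
    return {
        "inode": info.st_ino,                         # inode number
        "size": info.st_size,                         # size in bytes
        "owner_uid": info.st_uid,                     # numeric owner id
        "permissions": stat.filemode(info.st_mode),   # e.g. -rw-r--r--
        "links": info.st_nlink,                       # number of hard links
    }


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as handle:
        handle.write(b"12345")
        handle.flush()
        print(describe(handle.name))
```

Note that the file name is not among these fields: names live in directories, which map each name to an inode.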

 The Virtual Filesystem (also known as Virtual Filesystem Switch or VFS) is a kernel software layer that handles all system calls related to a standard Unix filesystem.
 Its main strength is providing a common interface to several kinds of filesystems, e.g. copying a file from an MS-DOS filesystem to Linux

Network stack
 The network stack, by design, follows a layered architecture modeled after the protocols themselves. Recall that the Internet Protocol is the core network layer protocol that sits below the transport protocol.
 Above TCP is the sockets layer, which is invoked through the system call interface (SCI).
 The sockets layer is the standard API to the networking subsystem and provides a user interface to a variety of networking protocols.
 From raw frame access to IP protocol data units and up to TCP and the User Datagram Protocol (UDP), the sockets layer provides a standardized way to manage connections and move data between endpoints.

Architecture-dependent code
 While much of Linux is independent of the architecture on which it runs, there are elements that must consider the architecture for normal operation and for efficiency.
 The ./linux/arch subdirectory defines the architecture-dependent portion of the kernel source, contained in a number of subdirectories that are specific to the architecture.
 Each architecture subdirectory contains a number of other subdirectories that focus on a particular aspect of the kernel, such as boot, kernel, memory management, and others.


Linux Kernel_Process
 A process is a program in execution.
 A process is represented in the OS by a Process Control Block (PCB).

Interactive process
 Interactive processes are those processes that are invoked by a user and can interact with the user.
 Examples: shells, text editors, GUI applications.
 Interactive processes can be classified into foreground and background processes.
 The foreground process is the process that you are currently interacting with, and is using the terminal as its stdin (standard input) and stdout (standard output).
 A background process is not interacting with the user and can be in one of two states: paused or running.
 There has to be someone connected to the system to start these processes; they are not started automatically as part of the system functions.

System Process
 Daemon (pronounced day-mon) is the term used to refer to processes that are running on the computer and provide services but do not interact with the user.
 Most server software is implemented as a daemon. Apache and Samba are examples of daemons.
 Any process can become a daemon as long as it is run in the background and does not interact with the user.
 A simple example is the ls -l command, run in the background by typing ls -l &

Automatic or batch processes
 Automatic or batch processes are not connected to a terminal.
 Rather, these are tasks that can be queued into a spooler area, where they wait to be executed on a FIFO (first-in, first-out) basis.
Such tasks can be executed using one of two criteria:
 At a certain date and time: done using the at command.
 At times when the total system load is low enough to accept extra jobs: done using the batch command.
 By default, tasks are put in a queue where they wait to be executed until the system load is lower than 0.8.

Batch processes
 In large environments, the system administrator may prefer batch processing when large amounts of data have to be processed or when tasks demanding a lot of system resources have to be executed on an already loaded system.
 Batch processing is also used for optimizing system performance.

Process State-FLAGS
 Each process on the system is in exactly one of five different states. This value is represented by one of five flags:
 TASK_RUNNING: The process is runnable; it is either currently running or on a runqueue waiting to run.
   This is the only possible state for a process executing in user-space; it can also apply to a process in kernel-space that is actively running.
 TASK_INTERRUPTIBLE: The process is sleeping (that is, it is blocked), waiting for some condition to become true or for a signal to be received.
   When the condition becomes true, the kernel sets the process's state to TASK_RUNNING.
   The process can also awake and become runnable if it receives a signal.
 TASK_UNINTERRUPTIBLE: This state is identical to TASK_INTERRUPTIBLE except that the process does not wake up and become runnable if it receives a signal.
   This is used in situations where the process must wait without interruption or when the event is expected to occur quite quickly.
   Because the task does not respond to signals in this state, TASK_UNINTERRUPTIBLE is less often used than TASK_INTERRUPTIBLE.
 TASK_ZOMBIE: The task has terminated, but its parent has not yet issued a wait() system call.
   The task's task structure must remain in case the parent wants to access it.
   If the parent calls wait(), the task structure is deallocated.
 TASK_STOPPED: Process execution has stopped; the task is not running nor is it eligible to run.
   This occurs if the task receives the SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU signal or if it receives any signal while it is being debugged.

Zombie process
 A zombie process or defunct process is a process that has completed execution but still has an entry in the process table.
 This entry is still needed to allow the parent process to read its child's exit status.
 When a process ends, all of the memory and resources associated with it are deallocated so they can be used by other processes. However, the process's entry in the process table remains.
 The parent can read the child's exit status by executing the wait system call, whereupon the zombie is removed.
 When a child exits, the parent process will receive a SIGCHLD signal to indicate that one of its children has finished executing; the parent process will typically call the wait() system call at this point.
 That call will provide the parent with the child's exit status, and will cause the child to be reaped, or removed from the process table.
 It is possible that the parent process is intentionally leaving the process in a zombie state to ensure that future children that it may create will not receive the same pid.

Causes of Zombie Processes
 When a subprocess exits, its parent is supposed to use the "wait" system call and collect the process's exit information.
 The subprocess exists as a zombie process until this happens, which is usually immediately.
 However, if the parent process isn't programmed properly or has a bug and never calls "wait," the zombie process remains, eternally waiting for its information to be collected by its parent.

Process states(USING FLAGS)

Process states in Linux:
 Running: the process is either running or ready to run.
 Interruptible: a blocked state in which the process waits for an event or a signal from another process.
 Uninterruptible: a blocked state in which the process waits for a hardware condition and cannot handle any signal.
 Stopped: the process is stopped or halted and can be restarted by some other process.
 Zombie: the process has terminated, but its information is still there in the process table.


System calls used for process management in linux

System calls used for Process management:
 fork(): create a new process
 exec(): execute a new program
 wait(): wait until a child process finishes execution
 exit(): exit from the process
 getpid(): get the unique process ID of the calling process
 getppid(): get the process ID of the parent process

Pid and Parentage
 A process ID or pid is a positive integer that uniquely identifies a running process, and is stored in a variable of type pid_t.
 You can get the process's pid or its parent's pid:
   getpid returns the PID of the calling process.
   getppid returns the PID of the parent of the calling process.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid, ppid;

    /* cast: pid_t is an integer type */
    printf("My PID is:%d\n\n", (int)(pid = getpid()));
    printf("Par PID is:%d\n\n", (int)(ppid = getppid()));
    return 0;
}

2. fork()
 #include <sys/types.h>
  #include <unistd.h>
  pid_t fork(void);
 Creates a child process by making a copy of the parent process: an exact duplicate.
 Implicitly specifies code, registers, stack, data, files.
 Both the child and the parent continue running.

fork() as a diagram
[Diagram: after pid = fork(), the child sees pid == 0 and the parent sees the new child's PID (e.g. pid == 5). The program is shared; the data is copied.]


Process IDs (pids revisited)
 When a fork is executed:
   The child gets a unique pid.
   The parent's file descriptor table is copied. (The implication is that all files open to the parent are also open to the child.)
 The return value from fork is:
   -1 on failure (e.g., the process table is full)
   0 in the child
   the pid of the child, in the parent

 wait is called by a parent process to await termination by one of its children. This allows a simple form of synchronization between parent and child.
  pid_t wait(int *statloc);
 When a process calls wait:
   If the caller has no children, the wait call returns immediately with an error code.
   If the caller has children, but none has terminated, the caller is blocked until one does.
   If a child process has terminated which has not been waited for (a so-called zombie process), the child is removed from the process table, and the wait call returns with the status of the child.

 When the call returns, the return value is the pid of the terminated child and the child's status is stored at statloc.
 Two similar system calls, waitid and waitpid, provide options to allow waiting on a specific child and to return without blocking.




wait() Actions
 A process that calls wait() can:
   suspend (block) if all of its children are still running, or
   return immediately with the termination status of a child, if a child has already terminated, or
   return immediately with an error if there are no child processes.


Using the exec Family
 The exec functions replace the program running in a process with another program.
 When a program calls an exec function, that process immediately ceases executing that program and begins executing a new program from the beginning, assuming that the exec call doesn't encounter an error.
 Within the exec family, there are functions that vary slightly in their capabilities and how they are called.
   Functions that contain the letter p in their names (execvp and execlp) accept a program name and search for a program by that name in the current execution path; functions that don't contain the p must be given the full path of the program to be executed.
   Functions that contain the letter v in their names (execv, execvp, and execve) accept the argument list for the new program as a NULL-terminated array of pointers to strings.
   Functions that contain the letter l (execl, execlp, and execle) accept the argument list using the C language's varargs mechanism.
   Functions that contain the letter e in their names (execve and execle) accept an additional argument, an array of environment variables.
   The argument should be a NULL-terminated array of pointers to character strings. Each character string should be of the form VARIABLE=value.

Simple Execlp Example
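#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;

    /* fork a child process */
    pid = fork();
    if (pid < 0) {
        fprintf(stderr, "Fork Failed\n");
        exit(EXIT_FAILURE);
    } else if (pid == 0) {
        /* child process: replace this program with ls */
        execlp("/bin/ls", "ls", (char *)NULL);
        /* execlp returns only on error */
        perror("execlp");
        _exit(EXIT_FAILURE);
    } else {
        /* parent process */
        /* parent will wait for child to complete */
        wait(NULL);
        printf("Child Complete\n");
        exit(0);
    }
}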


Linux Process Scheduling Policy
 A scheduling policy is the set of decisions you make regarding scheduling priorities, goals, and objectives.
 A scheduling algorithm is the instructions or code that implements a given scheduling policy.
 Linux has several conflicting objectives:
   Fast process response time
   Good throughput for background jobs
   Avoidance of process starvation

Linux Process Scheduling Policy
 Linux uses a timesharing technique.
   This means that each process is assigned a small quantum, or time slice, during which it is allowed to execute.
 Linux schedules processes according to a priority ranking; this is a "goodness" ranking.
 Linux uses dynamic priorities, i.e., priorities are adjusted over time to eliminate starvation.
   Processes that have not received the CPU for a long time get their priorities increased; processes that have received the CPU often get their priorities decreased.

Linux Process Scheduling Policy
We can classify processes using two schemes
 CPU-bound versus I/O-bound
   I/O-bound programs have the property of performing only a small amount of computation before performing I/O. Such programs typically do not use up their entire CPU quantum.
   CPU-bound programs, on the other hand, use their entire quantum without performing any blocking I/O operations.
   Consequently, one could make better use of the computer's resources by giving higher priority to I/O-bound programs and allowing them to execute ahead of the CPU-bound programs.
 Interactive versus batch versus real-time
   These classifications are somewhat independent, e.g., a batch process can be either I/O-bound or CPU-bound.
 Linux recognizes real-time programs and assigns them high priority.

Linux Process Scheduling Policy
 Linux uses process preemption; a process is preempted when:
   Its time quantum has expired
   A new process enters the TASK_RUNNING state and its priority is greater than the priority of the currently running process
 The preempted process is not suspended; it is still in the ready queue, it simply no longer has the CPU.

Consider a text editor and a compiler
 Since the text editor is an interactive program, its dynamic priority is higher than the compiler's.
 The text editor will block often, since it is waiting for I/O.
 When the I/O interrupt receives a key-press for the editor, the editor is put on the ready queue and the scheduler is called, since the editor's priority is higher than the compiler's.
 The editor gets the input and quickly blocks for more I/O.

Linux Process Scheduling Policy
 Determining the length of the quantum
   It should be neither too long nor too short.
   If too short, the overhead caused by process switching becomes excessively high.
   If too long, processes no longer appear to be executing concurrently.
 For Linux, long quanta do not necessarily degrade response time for interactive processes because their dynamic priority remains high, thus they get the CPU as soon as they need it.
 The rule for Linux is to choose the longest possible quantum that does not affect responsiveness; this turns out to be about 20 "clock ticks", or 210 milliseconds.

Linux Process Scheduling Algorithm
 The Linux scheduling algorithm is not based on a continuous CPU time axis; instead it divides the CPU time into epochs.
 An epoch is a division of time or a period of time.
 In each epoch, every process gets a specified time quantum.
   quantum = maximum CPU time assigned to the process in that epoch
   The duration of the quantum is computed when the epoch begins.
   Different processes may have different time quantum durations.
   When a process forks, the remainder of the parent's quantum is split between parent and child.
 An epoch ends when all runnable processes have exhausted their quanta.
 At the end of an epoch, the scheduler algorithm recomputes the time-quantum durations of all processes; then a new epoch starts and all processes get a new quantum.

Linux Process Scheduling Algorithm
When does an epoch end? Important!
 An epoch ends when all processes in the ready queue have used their quantum.
 This does not include processes that are blocking on some wait queue; they will still have quantum remaining.
 The end of an epoch is only concerned with processes on the ready queue.


Linux Process Scheduling Algorithm
 Selecting a process to run next: the scheduler considers the priority of each process. There are two kinds of priorities:
   Static priorities: these are assigned to real-time processes and range from 1 to 99; they never change.
   Dynamic priorities: the dynamic priority is calculated from the static priority and the average sleep time.
     When a process wakes up, record how long it was sleeping, up to some maximum value.
     When the process is running, decrease that value each timer tick.
 The static priority of a real-time process is always higher than the dynamic priority of conventional processes.
   Conventional processes will only execute when there are no real-time processes to execute.

Linux Process Scheduling Algorithm
Calculating process quanta for an epoch
 Each process is initially assigned a base time quantum; as mentioned previously, it is about 20 "clock ticks".
 If a process uses its entire quantum in the current epoch, then in the next epoch it will get the base time quantum again.
 If a process does not use its entire quantum, then the unused quantum carries over into the next epoch (the unused quantum is not directly used, but a "bonus" is calculated).
 Why? Processes that block often will not use their quantum; this is used to favor I/O-bound processes because this value is used to calculate priority.
 When forking a new child process, the parent process's remaining quantum is divided in half: half for the parent and half for the child.

Linux Process Scheduling Algorithm
Scheduling data in the process descriptor
 The process descriptor (task_struct in Linux) holds essentially all of the information for a process, including scheduling information.
 Recall that Linux keeps a list of all process task_structs and a list of all ready process task_structs.

Linux Process Scheduling Algorithm
 Each process descriptor (task_struct) contains the following fields:
 need_resched: this flag is checked every time an interrupt handler completes to decide if rescheduling is necessary.
 policy: the scheduling class of the process.
   For real-time processes this can have the value of:
     SCHED_FIFO: first-in, first-out with unlimited time quantum
     SCHED_RR: round-robin with time quantum, fair CPU usage
   For all other processes the value is SCHED_OTHER.
   For processes that have yielded the CPU, the value is SCHED_YIELD.

Linux Process Scheduling Algorithm
Process descriptor fields (con’t)
 rt_priority: the static priority of a real-time process; not used for other processes.
 priority: the base time quantum (or base priority) of the process.
 counter: the number of CPU ticks left in its quantum for the current epoch. This field is updated on every clock tick.
 The priority and counter fields are used to implement timesharing and dynamic priorities in conventional processes.

Linux Process Scheduling Algorithm
 Scheduling actually occurs in schedule().
 Its objective is to find a process in the ready queue and then assign the CPU to it.
 It is invoked in two ways:
   Direct invocation
   Lazy invocation

Linux Process Scheduling Algorithm
Direct invocation of schedule()
 The scheduler is invoked directly when the current process must be blocked right away because the resource it needs is not available.
 A device driver can also invoke schedule() directly if it will be executing a long iterative task.
 The current process is taken off of the ready queue and is placed on the appropriate wait queue; its state is changed to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE.
 The scheduler also runs when a process is woken up and its priority is higher than that of the current process.

Linux Process Scheduling Algorithm
Lazy invocation of schedule()
 The scheduler can also be invoked in a lazy way by setting the need_resched flag field to 1. This occurs when:
   The current process has used up its quantum
   A process is added to the ready queue and its priority is higher than that of the currently executing process
   A process calls sched_yield()
 sched_yield() causes the calling thread to relinquish the CPU. The thread is moved to the end of the queue for its static priority and a new thread gets to run.

Linux Process Scheduling Algorithm
Actions performed by schedule()
 First it runs any kernel control paths that have not completed and other uncompleted house-keeping tasks.
 Remember, the kernel is not preemptive, so it cannot switch to another process if a process is already in the kernel or if the kernel is in the middle of doing something else.
 If the current process is SCHED_RR and has used all of its quantum, then it is given a new quantum and placed at the end of the ready queue.
 If the process is not SCHED_RR, then it is removed from the ready queue.

Linux Process Scheduling Algorithm
Actions performed by schedule() (con’t)
 It scans the ready queue for the highest priority process.
   It calculates the priority using the goodness() function.
 It may not find any processes that are "good" when all processes on the ready queue have used up their quantum (i.e., all have a zero counter field). In this case it must start a new epoch by assigning a new quantum to all processes.
 If a higher priority process was found, then the scheduler performs a process switch.

Linux Process Scheduling Algorithm
How good is a runnable process?
 goodness() is used to determine priority:
   goodness == -1000: do not select this process
   goodness == 0: process has exhausted its quantum
   0 < goodness < 1000: conventional process with quantum remaining
   goodness >= 1000: real-time process

Linux Process Scheduling Algorithm
Linux scheduler issues
 Does not scale very well as the number of processes grows, because it has to recompute dynamic priorities.
   Tries to minimize this by computing at the end of an epoch only.
 Large numbers of runnable processes can slow response time.
 The predefined quantum is too long for high system loads.
 I/O-bound process boosting is not optimal.
   Some I/O-bound processes are not interactive (e.g., database search or network transfer).
 Support for real-time processes is weak.

Personalities in Linux

Execution Domains_persona System call
 The execution domain system allows Linux to provide limited support for binaries compiled under other UNIX-like operating systems.
 A notable feature of Linux is its ability to execute files compiled for other operating systems. Of course, this is possible only if the files include machine code for the same computer architecture on which the kernel is running.
 Two kinds of support are offered for these "foreign" programs:
   Emulated execution: necessary to execute programs that include system calls that are not POSIX-compliant.
   Native execution: valid for programs whose system calls are totally POSIX-compliant.
 POSIX, the "Portable Operating System Interface", is a family of standards specified by the IEEE for maintaining compatibility between operating systems.
 Microsoft MS-DOS and Windows programs are emulated: they cannot be natively executed, because they include APIs that are not recognized by Linux.
 POSIX-compliant programs compiled on operating systems other than Linux can be executed without too much trouble, because POSIX operating systems offer similar APIs.
 A process specifies its execution domain by setting the personality field of its descriptor. Each process has an associated personality identifier that can slightly modify the semantics of certain system calls. It is used primarily by emulation libraries to request that system calls be compatible with certain specific flavors of OS.
 A process can change its personality by issuing a suitable system call named personality().
 Programmers are not expected to directly change the personality of their programs; instead, the personality() system call should be issued by the glue code that sets up the execution context of the process.

Personalities in LINUX
Personality     Operating system
PER_LINUX       Standard execution domain
PER_BSD         BSD Unix
PER_SUNOS       SunOS
PER_RISCOS      RISC OS
PER_SOLARIS     Sun's Solaris

 Linux supports different execution domains, or personalities, for each process.
 Among other things, execution domains tell Linux how to map signal numbers into signal actions.
