
SCHEDULING IN LINUX

Irene Monica N Geetha A P

Brindha M Harini K S
❑ The Linux operating system is preemptive.

❑ Only one thread is allowed to execute on the same CPU at a time.

❑ An example of threads can be seen in almost every video game: multiple threads
handle the various modules of the game, one for user input, one for graphics,
one for sound processing and another for AI.
Process Scheduling Types
• Real time Processes
– Deadlines that have to be met
– Should never be blocked by a low-priority task

• Normal Processes
– Interactive
• Constantly interact with their users, therefore spend a lot of time waiting for key presses and
mouse operations.

• When input is received, the process must wake up quickly (the delay must be
between 50 and 150 ms)

– Batch
• Do not require any user interaction, often run in the background.

History (Schedulers For Normal Processes)

• O(n) Scheduler – Linux 2.4 To 2.6

• O(1) Scheduler – Linux 2.6 To 2.6.22

• CFS Scheduler – Linux 2.6.23 Onwards


First some concepts
Thread
❑ To the scheduler, a thread (task) is nothing but a process

Timeslice:
❑ Time allocated for each thread/task to run on CPU in a certain time interval/timeout
❑ CPU cycles divided in proportion to thread’s weight

Runtime:
❑ Accumulated time the thread has spent on the CPU.
❑ Once runtime > timeslice, thread is preempted.

Runqueue:
❑ Queue of threads waiting to be executed by CPU
❑ Queue sorted by runtime
[Figure: a CPU followed by a runqueue of four waiting threads (T)]
A Running Queue:
❑ A running queue (rq) is created for each processor (CPU).

❑ It is defined in kernel/sched.c as struct runqueue.

❑ Each rq contains a list of runnable processes on a given processor.
Kernel 2.4 - An O(n) scheduler

❑ Goes through the entire “global runqueue” to determine the next task to be run.

❑ This is an O(n) algorithm, where 'n' is the number of processes.

❑ The time taken is proportional to the number of active processes in the system.

❑ A global runqueue: all CPUs had to wait for other CPUs to finish accessing it.

❑ A single global runqueue served all processors in a Symmetric Multiprocessing (SMP) system.

❑ This meant a task could be scheduled on any processor, which can be good for load
balancing but bad for memory caches.

❑ This led to large performance hits under heavy workloads.
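
The linear scan described above can be sketched as follows. This is a minimal illustration, not the kernel's code: the `Task` class, its `priority` score and the sample tasks are all assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int          # hypothetical score: larger means more deserving
    runnable: bool = True


def pick_next(global_runqueue):
    """Scan every task once and return the best runnable one: O(n)."""
    best = None
    for task in global_runqueue:        # visits all n tasks on every call
        if not task.runnable:
            continue
        if best is None or task.priority > best.priority:
            best = task
    return best


runqueue = [
    Task("editor", 5),
    Task("backup", 1),
    Task("player", 9),
    Task("shell", 12, runnable=False),  # sleeping tasks are skipped
]
chosen = pick_next(runqueue)
print(chosen.name)
```

Every call walks the whole list, so the cost of a scheduling decision grows with the number of tasks, which is the scaling problem the slide points out.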


Multi-core (Global runqueue)
CPU Core 0 CPU Core 1 CPU Core 2 CPU Core 3

Global Runqueue




Problems
• Context Switching requires access to runqueue
• Only one core can access/manipulate runqueue at any one time
• Other cores must wait to get new threads
O(1) Scheduler – Linux 2.6

✶ Each CPU has a runqueue made up of 140 priority lists that are
serviced in FIFO order.
✶ Tasks that are scheduled to execute are added to the end of
their respective runqueue's priority list.
✶ Each task has a time slice that determines how much time it's
permitted to execute.
✶ The first 100 priority lists of the runqueue are reserved for
real-time tasks, and the last 40 are used for user tasks.
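
A rough sketch of such a runqueue, with 140 FIFO lists (100 real-time plus 40 user), is given below. The `PriorityArray` class and the task names are hypothetical, and the bitmap of non-empty lists is one possible way to find the highest-priority list without looking at individual tasks; it is an assumption of this sketch rather than something stated on the slide.

```python
from collections import deque

NUM_PRIORITIES = 140  # 0-99 real-time tasks, 100-139 user tasks


class PriorityArray:
    def __init__(self):
        self.lists = [deque() for _ in range(NUM_PRIORITIES)]
        self.bitmap = 0  # bit p is set while lists[p] is non-empty

    def enqueue(self, task, prio):
        self.lists[prio].append(task)   # added to the end: FIFO order
        self.bitmap |= 1 << prio

    def pick_next(self):
        if self.bitmap == 0:
            return None
        # Lowest set bit = lowest-numbered (highest-priority) non-empty list.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.lists[prio].popleft()
        if not self.lists[prio]:
            self.bitmap &= ~(1 << prio)
        return task


rq = PriorityArray()
rq.enqueue("shell", 120)
rq.enqueue("daemon", 139)
rq.enqueue("audio", 10)     # real-time range
rq.enqueue("editor", 120)
picked = [rq.pick_next() for _ in range(4)]
print(picked)
```

The work done by `pick_next` depends on the fixed number of priority levels, not on how many tasks are queued.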
Naive runqueue load-balancing algorithms
Balance runqueues by same number of threads?
• Ignores thread priority; some threads are more important than others

Balance runqueues by thread weights?
• Some high-priority threads can sleep a lot
• Scenario: one sleepy high-priority thread in a queue
• Waste of CPU resources

Core 0 Runqueue           Core 1 Runqueue
Thread A (W=80, 25%)      Thread B (W=25, 60%)
-                         Thread C (W=25, 40%)
-                         Thread D (W=10, 50%)
-                         Thread E (W=20, 50%)
Total Weight = 80         Total Weight = 80
Slightly improved load-balancing algorithm
✓ Concept of “load”

✓ load(thread) = Thread Weight * Average % of CPU Utilisation

✓ Balance runqueues by total load

Before balancing:
Core 0 Runqueue         Thread load    Core 1 Runqueue         Thread load
Thread A (W=80, 25%)    20             Thread B (W=25, 60%)    15
-                       -              Thread C (W=25, 40%)    10
-                       -              Thread D (W=10, 50%)    5
-                       -              Thread E (W=20, 50%)    10

After balancing:
Core 0 Runqueue         Thread load    Core 1 Runqueue         Thread load
Thread A (W=80, 25%)    20             Thread B (W=25, 60%)    15
Thread E (W=20, 50%)    10             Thread C (W=25, 40%)    10
-                       -              Thread D (W=10, 50%)    5
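
The arithmetic behind this example can be checked with a few lines of Python. The weights and utilisation percentages are the ones from the slide; the helper names (`load`, `total_weight`, `total_load`) are made up for this sketch, and moving Thread E is simply the move that evens out the totals here, not a general balancing algorithm.

```python
def load(weight, cpu_percent):
    """load(thread) = weight * average % of CPU utilisation."""
    return weight * cpu_percent / 100


def total_weight(runqueue):
    return sum(w for w, _ in runqueue.values())


def total_load(runqueue):
    return sum(load(w, pct) for w, pct in runqueue.values())


# name -> (weight, average CPU utilisation in percent)
core0 = {"A": (80, 25)}
core1 = {"B": (25, 60), "C": (25, 40), "D": (10, 50), "E": (20, 50)}

# Balanced by weight, yet Core 1 carries twice the load of Core 0.
print(total_weight(core0), total_weight(core1))   # 80 80
print(total_load(core0), total_load(core1))       # 20.0 40.0

# Move Thread E to Core 0 to balance by load instead.
core0["E"] = core1.pop("E")
print(total_load(core0), total_load(core1))       # 30.0 30.0
```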

O(1) scheduler

• Constant time required to pick the next process to execute
– easily scales to a large number of processes
– the task on the highest-priority list is picked to execute

Priority Arrays
❑ are the data structures that provide O(1) scheduling by mapping
each runnable task to a priority queue

❑ Each runqueue contains pointers to 2 priority array objects:
active and expired.

❑ Priority arrays are defined in kernel/sched.c

O(1) SCHEDULER
• Two ready queues in each CPU
– Each queue has 40 priority classes (100 – 139)
– 100 has highest priority, 139 has lowest priority

The time taken to find a task to execute depends not on the number
of active tasks but instead on the number of priorities

[Figure: Active run queues and Expired run queues, each with priority levels from 139 (low priority) down to 100 (high priority)]
The Scheduling Policy
• Pick the first task from the lowest-numbered non-empty run queue
• When done, put the task in the appropriate queue among the expired run
queues

[Figure: Active run queues and Expired run queues, priority levels 139 down to 100, with a task from the active run queues marked "execute"]
The Scheduling Policy
• Once the active run queues are empty (every task has used its timeslice)
– Make expired run queues active and vice versa

[Figure: Active run queues and Expired run queues, each with priority levels from 139 (low priority) down to 100 (high priority)]
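
The active/expired policy on these slides can be sketched roughly as below. The `RunQueue` class and task names are hypothetical, every task is assumed to use up its whole timeslice each time it runs, and priorities stay fixed (dynamic priority is not modelled).

```python
from collections import deque

PRIORITIES = range(100, 140)   # the 40 user priority classes


class RunQueue:
    def __init__(self):
        self.active = {p: deque() for p in PRIORITIES}
        self.expired = {p: deque() for p in PRIORITIES}

    def add(self, task, prio):
        self.active[prio].append(task)

    def run_one(self):
        """Run the first task of the lowest-numbered non-empty active queue."""
        if not any(self.active.values()):
            # Active queues are empty: expired becomes active and vice versa.
            self.active, self.expired = self.expired, self.active
        for prio in PRIORITIES:            # at most 40 steps, however many tasks
            if self.active[prio]:
                task = self.active[prio].popleft()
                self.expired[prio].append(task)   # timeslice used up
                return task
        return None


rq = RunQueue()
rq.add("a", 120)
rq.add("b", 100)
rq.add("c", 120)
order = [rq.run_one() for _ in range(4)]
print(order)
```

The fourth call finds the active queues empty, swaps the two arrays, and starts again from the highest-priority task.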
Dynamic Priority and Run Queues
• Dynamic priority is used to determine which run queue to put the task in
• No matter how ‘nice’ you are, you still need to wait on run queues;
this prevents starvation

[Figure: Active run queues and Expired run queues, priority levels 139 down to 100, with a task from the active run queues marked "execute"]
Completely Fair Scheduler (Single-Core)

CPU Core: (no thread running)
Time elapsed (s): 0
Time interval: 1 second

Thread name   Weight   Timeslice calculation            Assigned timeslice
                       (Weight / Total) * Interval
A             10       10 / 200 * 1                     0.05
B             20       20 / 200 * 1                     0.10
C             40       40 / 200 * 1                     0.20
D             50       50 / 200 * 1                     0.25
E             80       80 / 200 * 1                     0.40
Total         200

Runqueue sorted by Runtime
Sorted threads   Runtime
A                0
B                0
C                0
D                0
E                0
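
The timeslice column can be reproduced with a short calculation. The sketch below works in integer milliseconds to avoid floating-point rounding; the function name `timeslices` is an assumption of this example.

```python
def timeslices(weights, interval_ms):
    """Split one interval among threads in proportion to their weights."""
    total = sum(weights.values())
    return {name: w * interval_ms // total for name, w in weights.items()}


weights = {"A": 10, "B": 20, "C": 40, "D": 50, "E": 80}
slices = timeslices(weights, 1000)   # interval of 1 second
print(slices)   # {'A': 50, 'B': 100, 'C': 200, 'D': 250, 'E': 400}
```

The slices add up to the full 1000 ms interval, matching 0.05 + 0.10 + 0.20 + 0.25 + 0.40 = 1 second in the table.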
Completely Fair Scheduler (Single-Core)

CPU Core: Thread A
Time elapsed (s): 0.05
Time interval: 1 second
Timeslices (unchanged): A 0.05, B 0.10, C 0.20, D 0.25, E 0.40 (total weight 200)

Runqueue sorted by Runtime
Sorted threads   Runtime
B                0
C                0
D                0
E                0
Completely Fair Scheduler (Single-Core)

CPU Core: Thread B
Time elapsed (s): 0.15
Time interval: 1 second
Timeslices (unchanged): A 0.05, B 0.10, C 0.20, D 0.25, E 0.40 (total weight 200)

Runqueue sorted by Runtime
Sorted threads   Runtime
C                0
D                0
E                0
A                0.05
Completely Fair Scheduler (Single-Core)

CPU Core: Thread C
Time elapsed (s): 0.35
Time interval: 1 second
Timeslices (unchanged): A 0.05, B 0.10, C 0.20, D 0.25, E 0.40 (total weight 200)

Runqueue sorted by Runtime
Sorted threads   Runtime
D                0
E                0
A                0.05
B                0.10
Completely Fair Scheduler (Single-Core)

CPU Core: Thread D
Time elapsed (s): 0.60
Time interval: 1 second
Timeslices (unchanged): A 0.05, B 0.10, C 0.20, D 0.25, E 0.40 (total weight 200)

Runqueue sorted by Runtime
Sorted threads   Runtime
E                0
A                0.05
B                0.10
C                0.20
Completely Fair Scheduler (Single-Core)

CPU Core: Thread E
Time elapsed (s): 1.00
Time interval: 1 second
Timeslices (unchanged): A 0.05, B 0.10, C 0.20, D 0.25, E 0.40 (total weight 200)

Runqueue sorted by Runtime
Sorted threads   Runtime
A                0.05
B                0.10
C                0.20
D                0.25
Completely Fair Scheduler (Single-Core)

CPU Core: (no thread running)
Time elapsed (s): 1.00
Time interval: 1 second
Timeslices (unchanged): A 0.05, B 0.10, C 0.20, D 0.25, E 0.40 (total weight 200)

Runqueue sorted by Runtime
Sorted threads   Runtime
A                0.05
B                0.10
C                0.20
D                0.25
E                0.40
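
The walkthrough on the preceding slides can be replayed with a small simulation. This is a simplified model of the slides, not the kernel's implementation: the function `simulate`, the use of a heap as the sorted runqueue and integer milliseconds are all choices made for this sketch, and each thread is assumed to run for exactly its timeslice.

```python
import heapq
from itertools import count


def simulate(weights, interval_ms):
    """Run one interval: always pick the thread with the least runtime."""
    total = sum(weights.values())
    order = count()                      # tie-breaker: earlier entry first
    runqueue = [(0, next(order), name) for name in weights]
    heapq.heapify(runqueue)
    elapsed, trace = 0, []
    while elapsed < interval_ms:
        runtime, _, name = heapq.heappop(runqueue)
        slice_ms = weights[name] * interval_ms // total
        elapsed += slice_ms              # thread runs for its whole timeslice
        trace.append((name, elapsed))
        heapq.heappush(runqueue, (runtime + slice_ms, next(order), name))
    final = [(name, runtime) for runtime, _, name in sorted(runqueue)]
    return trace, final


trace, final = simulate({"A": 10, "B": 20, "C": 40, "D": 50, "E": 80}, 1000)
print(trace)   # [('A', 50), ('B', 150), ('C', 350), ('D', 600), ('E', 1000)]
print(final)   # [('A', 50), ('B', 100), ('C', 200), ('D', 250), ('E', 400)]
```

The trace reproduces the elapsed times 0.05, 0.15, 0.35, 0.60 and 1.00 s, and the final runqueue matches the sorted runtimes shown at the end of the interval.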
Completely Fair Scheduler (Single-Core)

CPU Core: (no thread running)
Time elapsed (s): 0
Time interval: 1 second

Thread name   Weight   Timeslice calculation            Assigned timeslice
                       (Weight / Total) * Interval
A             10       10 / 250 * 1                     0.04
B             20       20 / 250 * 1                     0.08
C             40       40 / 250 * 1                     0.16
D             50       50 / 250 * 1                     0.20
E             80       80 / 250 * 1                     0.32
F             50       50 / 250 * 1                     0.20
Total         250

Runqueue sorted by Runtime
Sorted threads   Runtime
F                0
A                0.05
B                0.10
C                0.20
D                0.25
E                0.40
Completely Fair Scheduler (Single-Core)

CPU Core: Thread F
Time elapsed (s): 0.20
Time interval: 1 second
Timeslices (unchanged): A 0.04, B 0.08, C 0.16, D 0.20, E 0.32, F 0.20 (total weight 250)

Runqueue sorted by Runtime
Sorted threads   Runtime
A                0.05
B                0.10
C                0.20
D                0.25
E                0.40
Completely Fair Scheduler (Single-Core)

CPU Core: Thread A
Time elapsed (s): 0.24
Time interval: 1 second
Timeslices (unchanged): A 0.04, B 0.08, C 0.16, D 0.20, E 0.32, F 0.20 (total weight 250)

Runqueue sorted by Runtime
Sorted threads   Runtime
B                0.10
C                0.20
F                0.20
D                0.25
E                0.40
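
The effect of the new thread F can be checked the same way. The sketch below starts from the runtimes accumulated in the first interval, recomputes the timeslices for a total weight of 250, and replays the first two scheduling decisions. The variable names and the heap-based runqueue are assumptions of this example; times are in integer milliseconds.

```python
import heapq
from itertools import count

INTERVAL_MS = 1000
weights = {"A": 10, "B": 20, "C": 40, "D": 50, "E": 80, "F": 50}
runtimes = {"A": 50, "B": 100, "C": 200, "D": 250, "E": 400, "F": 0}

# Timeslices shrink because the total weight grew from 200 to 250.
total = sum(weights.values())
slices = {name: w * INTERVAL_MS // total for name, w in weights.items()}

order = count()                          # tie-breaker: earlier entry first
runqueue = [(runtimes[name], next(order), name) for name in weights]
heapq.heapify(runqueue)

# F has the least runtime, so it runs first and is queued again afterwards.
runtime, _, first = heapq.heappop(runqueue)
heapq.heappush(runqueue, (runtime + slices[first], next(order), first))

# Next pick: the thread that now has the least runtime.
_, _, second = heapq.heappop(runqueue)
waiting = [(name, rt) for rt, _, name in sorted(runqueue)]

print(slices)
print(first, second)
print(waiting)
```

With these inputs F runs first, then A, and the waiting threads are ordered B, C, F, D, E, as on the last slide; F lands behind C because both have 200 ms of runtime and C was queued earlier.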
Real-Time (RT) Scheduling

❑ The scheduler supports real-time scheduling quite well: it will do its best to
meet predetermined deadlines but does not guarantee them.

❑ The RT task priority range is [0,99]; these tasks will always preempt user
tasks, as user tasks are in the range [100,139]
Scheduling schemes

1. SCHED_FIFO – As the name implies, first in, first out. Timeslices are irrelevant in this
scheme; the task with the highest priority runs until it finishes (or blocks or yields).

2. SCHED_RR – Round robin. Tasks are scheduled by priority; tasks of the same priority
run in a round-robin fashion, each for a pre-allotted timeslice.

3. SCHED_BATCH – “batch” style of execution of processes, for the lowest-priority tasks (nice +19)
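
The difference between the first two schemes can be illustrated with a toy model. It is only a sketch under simplifying assumptions: all tasks are ready at time 0, none of them blocks, a lower number means a higher priority (as in the [0,99] range above), and the function names `run_fifo` and `run_rr` are invented here. It does not call any real scheduling API.

```python
from collections import deque


def run_fifo(tasks):
    """SCHED_FIFO-like: highest priority first, each task runs to completion.

    tasks is a list of (name, priority, work_units) in arrival order.
    """
    ordered = sorted(tasks, key=lambda t: t[1])   # stable: ties keep arrival order
    return [name for name, _, _ in ordered]


def run_rr(tasks, timeslice):
    """SCHED_RR-like: tasks of equal priority take turns, one timeslice each."""
    timeline = []
    for prio in sorted({p for _, p, _ in tasks}):
        queue = deque((n, w) for n, p, w in tasks if p == prio)
        while queue:
            name, work = queue.popleft()
            timeline.append(name)                 # runs for one timeslice
            if work > timeslice:
                queue.append((name, work - timeslice))
    return timeline


tasks = [("T1", 10, 3), ("T2", 10, 2), ("T3", 5, 1)]
fifo_order = run_fifo(tasks)
rr_timeline = run_rr(tasks, timeslice=1)
print(fifo_order)    # ['T3', 'T1', 'T2']
print(rr_timeline)   # ['T3', 'T1', 'T2', 'T1', 'T2', 'T1']
```

Under the FIFO-like policy T1 finishes all its work before T2 starts; under the round-robin policy the two equal-priority tasks alternate.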
THANKS..
