
Hardware/Software Codesign
Summary — Andreas Biri, D-ITET, 27.01.18

1. Introduction

Embedded Systems are designed for specialized processes
- systems consist of dedicated, specialized hardware
- require design optimizations targeted at the intended usage
(performance, cost, power consumption, reliability)

Embedded systems (ES): information processing systems embedded into a larger product
- deliver enhanced functionality of an existing system
- work on parallel & distributed target platforms
- require assured reliability, guarantees & safety
(predictability & bounded execution times are critical)
- usage known at design-time, not programmable by the user
- fixed run-time requirements (at lowest possible cost)

Often, such systems require real-time processing while still offering low power consumption for energy independence.

Multiprocessor system-on-a-chip (MPSoC): dedicated system, highly specialized with dedicated HW
- application characterized by a variety of tasks

General Purpose Computer: broad class of applications
- programmable by the end user, usage might vary over time

Design Challenges

Application complexity: want adaptability, but specialized
- large systems must often provide legacy compatibility
- mixture of event-driven & data-flow tasks

Target system complexity: design space is very large
- different technologies, processor types, designs
- use finished systems-on-chip, distributed implementation

Constraints & design objectives
- cost, power consumption, timing constraints, size, predictability, processing power, temperature

2. Specification & Models of Computation

Very specific systems for one description of a model
- restricted language, restricted rules
- offer more possibilities to optimize (more pre-knowledge)
- ease-of-use & better analysis (can verify correctness)
- efficient usage, high abstraction level

Model of Computation: “What happens inside & how do the parts interact?”
- components & execution model for computations
- communication model for information exchange

Levels of Abstraction

Specification: “Computer requires FPGA, DSP, …”
- model formally describes selected system properties
- consists of data and associated methods

Synthesis: step from abstraction to real system
- connects levels of abstraction (refinement)
- from problem-level description to implementation

Modelling: connects implementation to problem level
- estimation of lower-layer properties to improve the design

Structural view: abstract layout of components
Behavioural view: describes the function of a component
Physical view: effective hardware as seen on the chip

Hardware/Software Mapping: partitioning of the system into programmable components (software) & specialized HW
- can adapt the implementation HW to match the problem set

Models of Computation

Observer: subject changes state, observers are notified
- one-to-many dependency between subject & observers

Synchronized: causes processes to run sequentially
- solves race conditions, but bad performance as not parallel
- can easily cause deadlocks if dependences are circular

Discrete Event model: associate an event with a (trigger) time
- search for the next event to simulate, independent of real time
- allows efficient simulation, as only actions are computed
- VHDL (hardware description language): sensitivity lists

Finite state machines (FSM): abstract process representation
Differential equations: describe a component mathematically

Shared memory: potential race conditions & deadlocks
- Critical section: must receive exclusive access to the resource

Asynchronous message passing: non-blocking
- the sender does not have to wait until the message is delivered
- potential buffer overflow if the receiver doesn’t read enough

Synchronous message passing: blocking
- requires simultaneous actions by both end components
- automatically synchronizes the devices
- Communicating Sequential Processes (CSP): rendez-vous based communication, both indicate when ready for it
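The observer model of computation mentioned above can be sketched in a few lines; the class and observer names below are illustrative, not from the lecture:

```python
class Subject:
    """Sketch of the observer pattern: the subject holds a
    one-to-many list of observers and notifies each of them
    whenever its state changes."""
    def __init__(self):
        self._observers = []
        self.state = None

    def attach(self, observer):
        self._observers.append(observer)

    def set_state(self, state):
        self.state = state
        for notify in self._observers:   # one-to-many notification
            notify(state)

log = []
subject = Subject()
subject.attach(lambda s: log.append(("display", s)))
subject.attach(lambda s: log.append(("logger", s)))
subject.set_state(42)
print(log)  # → [('display', 42), ('logger', 42)]
```

Any number of observers can be attached without the subject knowing them, which is exactly the one-to-many dependency.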
Specification requirements

Hierarchy: dependencies between components
- Behavioural hierarchy: states, processes, procedures
- Structural hierarchy: processors, racks, circuit boards
Timing behaviour: mostly required to be tightly bounded
State-oriented behaviour: required for reactive systems
Dataflow-oriented behaviour: parts send data streams

Classical automata: input changes state & output (FSM)
- complex graphs are difficult to read & analyse by humans
- Moore: output only depends on the state, not the input
- Mealy: output depends on state and input

2.1 StateCharts (2-25)

Introduce hierarchies to increase readability:
- Super-state: can have internal substates
- Basic state: its super-state is called its ancestor state

OR-super-state: is in exactly one of the substates
AND-super-state: is in all of the immediate sub-states

Computation of state sets: traverse the tree representation
- OR-super-states: addition of sub-states
- AND-super-states: multiplication of sub-states

Timers: indicated by special edges, traversed after timeout

Besides states, can use variables to keep information:
- action: variable changes as a result of a state transition
(multiple actions executed simultaneously: (a := b; b := a))
- condition: dependences of state transitions

Simulation phase

All edge labels are evaluated in 3 different phases:
1. Effects of external changes & conditions are evaluated
2. The set of transitions to be made is computed
3. Transitions become effective (simultaneously)

Can also consider internal events instantaneous
- external events are only considered in a stable state, i.e. when no more internal steps are conducted & the status remains
- the state diagram only represents stable states

StateCharts solve some, but not all problems:
- hierarchy allows nesting of OR & AND states
- can auto-generate C code (but often inefficient)
- no object-orientation & structural hierarchy
- not useful for distributed applications
(as these have asynchronous events & executions)

Specification & Description Language (SDL): unambiguous specification of reactive & distributed systems
- allows asynchronous message passing
- can still have non-determinism if race conditions occur
(e.g. if we have multiple inputs for the same FIFO queue)

2.2 Data-Flow Models (2-55)

Try to make the result independent of time, focus on data
- processes communicate through FIFO buffers
- one buffer per connection to avoid time dependence

All processes run simultaneously with imperative code
- processes can only communicate through buffers
- map easily to parallel hardware / block-diagram specs
- fuzzy & difficult to implement with balanced rates

Kahn Process Network
- read: destructive & blocking (empty queue → busy-wait)
- write: non-blocking
- FIFO queues are of infinite size
- determinate: no random variables, always know which queue will be read next as reads are blocking (cannot peek & decide)

Random: the information known about the system & its input is not sufficient to determine its outputs (non-deterministic)
Determinate: the histories of the channels depend only on the inputs
- independent of timing, state, hardware; only the function

Kahn process: monotonic mapping of inputs to outputs
- creates output solely based on previous input

Adding non-determinacy can occur in multiple ways:
- allow processes to test for emptiness of input channels
- allow shared channels (read or write)
- allow shared data between processes (variables)

Scheduling Kahn networks

Responsibility of the system, not the programmer
- bounded memory (buffers) can overflow if not managed

Tom Parks’ algorithm: iteratively increase buffer sizes
1. Start with the network with blocking writes
2. Use a scheduling which doesn’t stall if not all block
3. Run as long as no deadlocks occur
4. On deadlock, increase the size of the smallest buffer

Finite buffer sizes: limit buffers by introducing a reverse channel
- a process that wants to send always needs to first read from the reverse channel
- blocks if it already wrote too much, will wait for a token

Kahn process networks offer various advantages:
- the scheduling algorithm does not affect the functional behaviour
- match stream-based processing & explicit parallelism
- easy mapping to distributed & multi-processor platforms

Synchronous Dataflow (SDF): allows a compile-time schedule
- each process reads/writes a fixed number of tokens each time
- uses relative execution rates found by solving linear equations
- requires rank n − 1 for n processes & sufficient initial tokens
- the period is found when the same number of initial tokens is reached again
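The SDF balance equations can be solved mechanically for the repetition vector. A minimal sketch, assuming a graph encoded as (src, dst, produced, consumed) edges — an encoding chosen here for illustration, not lecture notation:

```python
from fractions import Fraction
from math import lcm

def repetition_vector(n_actors, edges):
    """Solve the SDF balance equations prod * q[src] = cons * q[dst]
    for every edge and return the smallest integer firing counts.
    edges: list of (src, dst, prod, cons) tuples."""
    q = [None] * n_actors
    q[0] = Fraction(1)
    changed = True
    while changed:                      # propagate rates along edges
        changed = False
        for src, dst, prod, cons in edges:
            if q[src] is not None and q[dst] is None:
                q[dst] = q[src] * Fraction(prod, cons)
                changed = True
            elif q[dst] is not None and q[src] is None:
                q[src] = q[dst] * Fraction(cons, prod)
                changed = True
    if any(r is None for r in q):
        raise ValueError("graph is not connected")
    for src, dst, prod, cons in edges:  # consistency: schedule exists
        assert q[src] * prod == q[dst] * cons, "inconsistent rates"
    scale = lcm(*(r.denominator for r in q))
    return [int(r * scale) for r in q]

# actor 0 produces 2 tokens per firing, actor 1 consumes 3:
# balance equation 2 * q0 = 3 * q1 → fire them 3 and 2 times
print(repetition_vector(2, [(0, 1, 2, 3)]))  # → [3, 2]
```

Running each actor its computed number of times returns every buffer to its initial token count, which is exactly the compile-time period.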
3. Mapping Application-Architecture

Allocation: select components
Binding: assign functions to components (run what where)
Scheduling: determine the execution order
Partitioning: allocation & binding
Mapping: binding & scheduling

Synthesis: implementation of the given specifications
- uses an underlying model of computation

Data Flow graph (DFG): shows operations & communication
- nodes: operations; edges: communication / data
- initial, abstract representation

Control Flow graph (CFG): node ≙ line of code
- includes loops and conditional statements

Architecture Specification: reflects structure & properties
- can be done at different abstraction levels

Mapping relates application & architecture specifications:
- binds processes to processors
- binds communication between processes to busses/paths
- specifies resource sharing disciplines & scheduling

DFG Application model (3-11): initial specification
- functional nodes V_P^f: tasks, procedures
- communication nodes V_P^c: data dependencies

Architecture model (3-12): describes the physical hardware
- functional resources V_A^f: processor, RISC, DSP
- bus resources V_A^c: shared bus, PTP bus, fiber

Specification graph: maps the application graph to the architecture
- action of binding abstract functions to actual hardware
- must occur after allocation
- each data flow node must have an outgoing edge
- communication must actually connect the correct HW

4. System Partitioning

Assign tasks to computing resources (& communication to networks)

Optimal partitioning can be achieved in various ways:
- compare design alternatives (design space exploration)
- estimate with analysis, simulation, prototyping
(if system parameters are unknown / not determinable)

Usually, one has conflicting design goals & constraints
- costs: cost of allocated components (should be minimal)
- latency: due to scheduling / resource sharing (parallelize)
- constraints give maximal parameters, try to find a solution

Cost functions: quantitative performance measurement
- system cost, latency, power consumption, weight, size
- try to minimize a function consisting of all variables
- a linear cost function weights & sums the individual costs

Partitioning methods

Exact methods: get an optimal solution with minimal costs
- enumeration: iterate through all solutions & compare
- integer linear program (ILP)

Heuristic methods: get a good solution with high probability
- Constructive: random mapping, hierarchical clustering
- Iterative: Kernighan-Lin algorithm, simulated annealing, evolutionary algorithms

4.1 Integer Linear Program (ILP) (4-10)

An integer programming model requires two ingredients:
- objective function: linear cost expression of integer variables
- constraints: limit the design space & optimization

ILP problem: minimize the objective function under the constraints

For partitioning, we can set up the following ILP:
- x_{i,k} ∈ {0,1}: determines whether object o_i is in block p_k

Partitioning: assign n objects to m blocks such that
- all objects are assigned / mapped (uniquely / once)
- costs are minimized & all constraints are kept

Load balancing: the maximal sum of all durations should be minimized (minimize T, where T ≥ processing time P_i)

Additional constraints: can e.g. bound the maximal number of objects in a single block

Maximizing the cost function: minimize the negative function

ILPs are very popular for synthesis problems
- acceptable run-time with guaranteed quality
(might be sub-optimal, as only searching integer values)
- scheduling can be integrated as well
- can add arbitrary constraints (however, hard to find)
- NP-complete (can take a long time if too complex)
- good starting point for designing heuristic optimizations
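The binary-assignment formulation above can be sketched with the enumeration exact method standing in for a real ILP solver; the cost matrix and function name are hypothetical, and side constraints such as load balancing or block capacity are omitted:

```python
from itertools import product

def partition_bruteforce(cost):
    """Exhaustive solution of the 0/1 assignment x[i][k]:
    each object i is mapped to exactly one block k (uniqueness
    constraint enforced by construction) while minimizing the
    summed mapping costs cost[i][k].  Only feasible for tiny
    instances; a real design would call an ILP solver."""
    n, m = len(cost), len(cost[0])
    best, best_assign = float("inf"), None
    # enumerate every assignment vector: object i -> block a[i]
    for a in product(range(m), repeat=n):
        total = sum(cost[i][a[i]] for i in range(n))
        if total < best:
            best, best_assign = total, a
    return best, best_assign

cost = [[3, 1],     # cost[i][k]: cost of object i on block k
        [2, 5],
        [4, 4]]
print(partition_bruteforce(cost))  # → (7, (1, 0, 0))
```

With n objects and m blocks the enumeration visits m^n assignments, which is why ILP solvers and heuristics take over beyond toy sizes.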

4.2 Heuristic methods (4-17)

Constructive methods

Try to find a good solution in a single computation

Random mapping: each object is randomly assigned to a block

Hierarchical clustering: stepwise group objects (i.e. assign them to the same blocks) by evaluating a closeness function
- always merge the two “closest” objects (maximal value)
- can stop after reaching the desired level of clustering

Iterative methods

Start at one point and try to improve in steps:
1. Start with an initial configuration
2. Search the neighbourhood (similar partitions) and select one as a candidate (slightly modified)
3. Evaluate the fitness function of the candidate
4. Stop when a criterion is fulfilled / after some time

Hill climbing: always take the neighbour with higher fitness
- if no neighbour is better, stop the execution
- local optimum as the best result (depends on initialization)
- can start at various points to get a good (& quick) estimate

KL: use information of previous runs to find the global optimum
Simulated annealing: use a complex acceptance rule (jumps)
Evolutionary algorithms: complex strategy to add entropy

Kernighan-Lin algorithm (4-29)

From all possible pairs of objects, virtually regroup the best
- from the remaining objects, continue until all are regrouped
- after n/2 turns, take the lowest-cost result & actually perform it

External costs E_i: from a node to nodes in the other partition
Internal costs I_i: from a node to nodes in the same partition
Desirability to move: D_i = E_i − I_i
Gain: g = D_x + D_y − 2 · c(x, y)

Simulated annealing: vary (randomly), always take better-cost neighbours but with some probability also take worse-cost ones
- gradual cooling: slowly decrease the probability of accepting worse solutions

5. Multi-Criteria Optimization

Network processor: executes communication workloads
- high performance for network packet processing

For optimization, we require:
- task model: specification of the task structure
- flow model: different usage scenarios

The implementation defines architecture, task mapping & scheduling while considering objectives & constraints
- objectives: maximize performance, minimize costs
- constraints: memory, delay, costs, size (conflicting)
- results in a performance model of the system

Black-box optimization: can only give inputs, observe the output and use the objective function to optimize the input

Constraints: only take feasible solutions, add a penalty otherwise

5.1 Multiobjective Optimization (5-12)

Different objectives are often not comparable
- however, there are clearly inferior solutions

Can use classical single-objective optimization methods
- simulated annealing, Kernighan-Lin, ILP
- decision making is done before the optimization
- map all dimensions to one using some cost function

Decision space: feasible set of alternatives
Objective space: image of the decision space under the objective function (“evaluated performance”)

Pareto-dominated: another solution is better or equal in all objectives
Pareto-optimal: not dominated by any other solution
Pareto-optimal front: set of pareto-optimal points

Population-based optimization:
Use evolutionary algorithms to get a set of solutions
- decision making is done after the optimization
- a function can then weight & map the different image points

5.2 Multiobjective Evolutionary Algos (5-24)

Evaluate a set of solutions simultaneously
- black-box optimization using randomization (no local minima)
- assumption: better solutions are found near good ones

1. Choose a set of initial solutions (parent set)
2. Mating selection: select some out of the parent set
3. Variation: use neighbourhood operators to generate a new children set
4. Determine the union of the children & parent sets
5. Environmental selection: eliminate bad solutions

Environmental selection

Criteria to choose which new solutions to keep:
- Optimality: take the ones close to the (unknown) front
- Diversity: should cover a large part of the objective space

Hypervolume indicator: should be maximized
- reference point: should be far away from the optimal point
(can strongly influence the chosen set by weighting the area)
- corresponds to the region dominated by the pareto-optimal front
- using dominated points will not increase the area
- all additional pareto-optimal solutions increase the area
- a better set (dominating the other) always has a larger area

Choose the solution which increases the hypervolume the least and throw it out of the parent set for the next round

Neighbourhood operators (5-37)

Work on representations of solutions (e.g. integer vectors)
Completeness: each solution has an encoding
Uniformity: all solutions are represented equally often
(else biased towards solutions with many encodings)
Feasibility: each encoding maps to a feasible solution
(e.g. make it a priority, not an absolute requirement)

Crossover: take 2 solutions & exchange properties
Mutation: randomly vary a property of a solution
(reorder, flip, replace with a different part)
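The iterative scheme with simulated annealing's acceptance rule and gradual cooling can be sketched as follows; the toy objective, the neighbourhood operator and the parameter values are assumptions chosen for illustration:

```python
import math, random

def simulated_annealing(cost, neighbour, x0, t0=10.0, cooling=0.95, steps=500):
    """Always accept a better neighbour; accept a worse one with
    probability exp(-delta / T); cool T gradually so that worse
    jumps become increasingly unlikely over time."""
    x, t, best = x0, t0, x0
    for _ in range(steps):
        y = neighbour(x)
        delta = cost(y) - cost(x)
        if delta <= 0 or random.random() < math.exp(-delta / t):
            x = y                 # jump: may move to a worse solution
        if cost(x) < cost(best):
            best = x              # remember the best solution seen
        t *= cooling              # gradual cooling
    return best

# toy objective with local minima; the neighbour perturbs x slightly
random.seed(0)                    # deterministic run for the example
f = lambda x: (x - 3) ** 2 + 2 * math.sin(5 * x)
result = simulated_annealing(f, lambda x: x + random.uniform(-0.5, 0.5), x0=0.0)
```

Because the best solution seen so far is tracked separately, the returned result is never worse than the starting point, even though the walk itself may temporarily climb uphill.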
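Pareto dominance and the hypervolume indicator discussed above can be sketched for two minimization objectives; the function names and sample points are illustrative:

```python
def dominates(a, b):
    """True if point a pareto-dominates b (minimization): a is no
    worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated points."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def hypervolume_2d(front, ref):
    """Area dominated by a 2-D minimization front with respect to a
    reference point ref, which must lie beyond every front point."""
    front = sorted(front)               # ascending in the first objective
    area, prev_y = 0.0, ref[1]
    for x, y in front:
        area += (ref[0] - x) * (prev_y - y)   # sweep slab by slab
        prev_y = y
    return area

pts = [(1, 4), (2, 2), (3, 3), (4, 1)]
front = pareto_front(pts)               # (3, 3) is dominated by (2, 2)
print(front, hypervolume_2d(front, ref=(5, 5)))  # → [(1, 4), (2, 2), (4, 1)] 11.0
```

Dropping the dominated point (3, 3) leaves the hypervolume unchanged, while removing any front point would shrink it — the property environmental selection exploits.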
6. System Simulation

6.1 System classification

System: combination of components to perform a function not possible with the individual parts

Model: formal description of the system (abstraction)

State: contains all information necessary to determine the output together with the input for all future times

Discrete state models: countable number of states
- e.g. processors with registers
Continuous state models: actual analogue signals
Discrete time models: change only at discrete times
Continuous time models: time advances the system

Events: tuple of a value v and a tag t
- t = time: timed event
- t = sequence number: untimed event

Time-driven simulation: time is partitioned into intervals
- a simulation step is performed even if nothing happens
Event-driven simulation: discrete or continuous time base
- evaluation & state changes only at occurrences of events

Discrete Event System (DES): event-driven system
- the state evolution depends entirely on the occurrence of discrete events over time (not on the evolution of time)
- signals / streams: ordered and/or timed events
- processes: functions which act on signals or streams

6.2 Discrete event simulation (6-16)

Modules describe the entire system & allow separation:
- Behaviour: described using logic and algebraic expressions
- State: persistent variables inside these modules
- Communication: done through ports via signals
- Synchronization: done through events and signals

Event list: queue of events, processed in order
- organized as a priority queue (may include time)

Simulation time: represents the current value of time
- during a discrete event, the clock advances to the next event time

System modules: model subsystems of the simulated system
- process events, manipulate the event queue & system state
- a sensitivity list indicates whether a module is concerned

Zero-duration virtual time interval: delta-cycle (δ)
- prevents cause & effect events from coinciding at the same time instance (if they occurred instantly, the outcome would depend on the ordering & race conditions would occur)
- orders “simultaneous” events within a simulation cycle

SystemC (6-23)

System-level modelling language to simulate concurrent executions in embedded systems (HW, communication)
- event-driven simulation kernel for discrete-event models

Processes are the basic units of functionality:
- SC_THREAD: called once, runs forever (blocks if no input), can be suspended using wait() / resumed with notify()
- SC_METHOD: event-triggered, requires a sensitivity list, executes repeatedly when called without being suspended

Channel: container for communication & synchronization
- can have state (FIFOs), implements one or more interfaces
- check that modules have sufficient initial tokens!

6.3 Simulation at high abstraction levels

(Untimed) functional level: model functionality
- C/C++, Matlab: shared variables & messages
Transaction level: early SW development, timing
- SystemC: method calls to channels
Register transfer / pin level: HW design & verification
- Verilog, VHDL: wires and registers

7. Design Space Exploration

Optimal design criteria

Mappings: all possible bindings of tasks to the architecture
Request: operational cost of executing a given task
Binding: subset of mappings so that every task is bound to exactly one allocated resource (actual implementation)

Design constraints

Delay constraints: maximal time a packet can be processed
Throughput maximization: maximize packets per second
Cost minimization: implement only the needed resources
Conflicting usage scenarios: should be good for a mixture

Simple analysis model

Optimize to minimize the maximal processor & bus load
- want to spread the load over all CPUs, but not too much
- get numbers using static parameters, functional simulation & instruction-set simulation (using benchmarks)
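The event-list mechanism of discrete event simulation can be sketched with a priority queue; this minimal simulator (names invented for the sketch, delta-cycles omitted) lets the clock jump directly from event to event instead of stepping through empty intervals:

```python
import heapq

class DiscreteEventSim:
    """Minimal event-driven simulator: the event list is a priority
    queue ordered by timestamp; processing an event advances the
    simulation clock straight to that event's time."""
    def __init__(self):
        self.now = 0.0
        self._events = []   # heap of (time, seq, action)
        self._seq = 0       # tie-breaker: FIFO order at equal times

    def schedule(self, delay, action):
        heapq.heappush(self._events, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self):
        log = []
        while self._events:
            self.now, _, action = heapq.heappop(self._events)
            log.append((self.now, action()))
        return log

sim = DiscreteEventSim()
sim.schedule(2.0, lambda: "timer")
sim.schedule(1.0, lambda: "packet")
trace = sim.run()
print(trace)  # → [(1.0, 'packet'), (2.0, 'timer')]
```

The sequence counter is a simple substitute for delta-cycle ordering: events scheduled for the same instant are processed in a deterministic order rather than racing.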
8. Performance Estimation

High-level estimation: just look at the functional behaviour
- short estimation time, implementation details irrelevant
- limited accuracy, e.g. no information about timing

Low-level estimation: simulate all physical layers
- higher accuracy, deeper analysis possible
- long estimation time, everything needs to be defined exactly

We use performance estimation to check:
- Validation of non-functional aspects: verification
- Design space exploration: allows optimization

Exploration: reconfigure the system and evaluate the performance

Performance metric: function giving a quantitative indication of the system execution, should be representative
- time, power, temperature, area, cost, SNR, processing

Evaluation difficulties

Non-determinism: computation (parallel), communication (interference), memory (shared resources), interactions

Cyclic timing dependencies: internal streams interact on computation & communication, influencing the characteristics

Uncertain environment: different scenarios (cached, preemption enabled), worst-case vs. best-case inputs

Varying resource availability & demands: requests depend on the precise circumstances, where & when they are executed
- the run-time of functions can vary depending on the state

Estimation methods (8-19)

Measurements: use a prototype to measure performance
Simulation: develop a program which runs a system model
Statistics: develop a statistical abstraction & derive statistics
Formal analysis: mathematical abstraction to compute formulas which describe the system performance

Analytic models: abstract the system & derive its characteristics
- performance measures are stochastic values (e.g. averages)
- can be used for worst-case/best-case evaluation (bracketing)

Static analytic models: use algebraic equations & relations between components to describe properties
- fast & simple, but generally inaccurate modelling
(scheduling, overhead, resource sharing neglected)

Dynamic analytic models: extend static models
- implement non-determinism in run-time & processing
- describe e.g. resource sharing (scheduling & arbitration)

Simulation: implement a model of the system (SW, HW)
- include precise allocation, mapping, scheduling
- combines functional simulation & performance analysis
- performance evaluation by running the entire program
- difficult to focus on a part, but enables detailed debugging
- one run only evaluates a single simulation scenario
(specific input trace & initial system state)

Trace-based simulation: separate functional & timing behaviour
- the trace is only determined by the functional application
- disregards timing (only focuses on functional behaviour)
- faster than low-level simulation, as it abstracts to events
- allows evaluation on multiple architectures using the same event graph on “virtual machines”

9. WCET Analysis

Hard Real-Time Systems: embedded controllers are expected to finish their tasks reliably within time bounds

Worst-Case Execution Time (WCET): upper bound on the (worst-case) execution time of a task, should be kept minimal
- the upper bound consists of all pessimistic assumptions
- can be approximated with exhaustive measurements

Usually, try to compute it by analysing the program structure
- modern processors exploit parallelism, therefore the execution time is not simply the sum of the single instructions
- out-of-order execution by leveraging independence
- caches, pipelines, branch prediction, speculation
- the difference between BCET and WCET can be gigantic

Timing accident: cause for an increase of execution time
- the execution time is increased by a timing penalty
- causes: cache misses, pipeline stalls, branch mispredictions, bus collisions, DRAM memory refreshes, TLB misses

Micro-architecture analysis: use abstract interpretation
- exclude as many timing accidents as possible
- determine the WCET for basic blocks by analysing the HW

Worst-case path determination: maps the control flow graph to an ILP to determine the upper bound & the associated path
- complex setup & extensive runtimes, but accurate
9.1 Program Path Analysis (9-18)

Determine the sequence of instructions which is executed in the worst-case scenario (i.e. resulting in the longest runtime)
- we know the WCET of each basic block from static analysis
- the number of loop iterations must be bounded

Basic block: sequence of instructions where the control flow enters at the beginning and exits at the end without stopping in-between or branching (just a linear sequence, SISO)

Determining the first instructions of basic blocks:
- the first instruction of the program
- targets of (un-)conditional jumps
- instructions that follow (un-)conditional jumps

The WCET can then be calculated as a sum over the blocks:

WCET = Σ_{i=1}^{N} c_i · x_i,  where c_i is the WCET of block i and x_i the number of executions of block i

The number of executions x_i of a block is given by:
- structural constraints: given by the flow equations
- additional constraints: extracted from the program code
(e.g. number of loop iterations, logical connections)

The entire ILP then maximizes Σ c_i · x_i under these constraints.

9.2 Value Analysis (9-29)

Abstract Interpretation (AI): don’t work on the actual variables, but consider possible variable intervals (abstract values)
- could give an exact WCET by considering all possible inputs
- supports correctness proofs

Value analysis is used to provide:
- access information for the data-cache/pipeline analysis
- detection of infeasible paths
- derivation of loop bounds

9.3 Caches (9-35)

Provide fast access to stored data without accessing main memory, as the speed gap between CPU & memory is large

Assume local correlation between data accesses:
- the program will use similar data soon (many hits)
- the program will reuse items (instructions, data)
- access patterns are evenly distributed across the cache

4-way set associative cache (9-39): stores 4 tags per cache line
Least Recently Used (LRU): replace the oldest block (ages)

We can distinguish two static cache content analyses:

Must analysis: worst case, “At which position at least?”
- each predicted cache hit reduces the WCET (always hit)
- “union + maximal age”: where is my worst position?

May analysis: best case, “At which position at most?”
- each predicted cache miss increases the BCET (always miss)
- “union + minimal age”: where is my best position?

Loop unrolling: improve the analysis by limiting the influence of the state before the loop by executing the first iteration separately as an if
- more optimistic result for the WCET, pessimistic for the BCET

9.4 Pipelines (9-54)

Ideal case: finish 1 instruction per cycle

Instruction execution is split into several stages
- multiple instructions can be executed in parallel
- may execute instructions out-of-order

Pipeline hazards

Data hazards: operands not yet available
(data dependencies, cache misses; solve with pipeline stalls & forwarding)
Resource hazards: consecutive instructions use the same resource
Control hazards: conditional branches (require a flush)
Instruction-cache hazards: instruction fetch causes a miss

Cache analysis: prediction of cache hits (data / instructions)
Dependence analysis: analysis of data/control hazards
Resource reservation tables: analysis of resource hazards

Simulation

Processor: consider the CPU as a big state machine with initial state s, instruction stream b and trace t
Abstract pipeline: limit the simulation to the pipeline
- may lack information, e.g. about cache contents

Assuming the local worst case at every step leads to the global worst-case result (might be longer than in the real world)
- timing anomalies might however counteract this assumption (execution may be faster if delayed in the beginning)
- can also join sets of states under this assumption
(always keep the most pessimistic view, as it is always safe)
- always assume cache misses where they are not excluded
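The block-based WCET formula can be illustrated on a toy control flow graph; the block names, per-block costs and loop bound below are invented for the example, and plain enumeration stands in for the ILP solver:

```python
# Hypothetical CFG with three basic blocks and one loop bounded to
# 10 iterations; c[b] is the per-block WCET from static analysis.
c = {"entry": 5, "loop_body": 20, "exit": 3}

def wcet_bound(max_iter):
    """WCET bound = max Σ c_i * x_i: entry and exit execute exactly
    once (structural constraints), the loop body at most max_iter
    times (additional constraint from the loop bound)."""
    best = 0
    for n in range(max_iter + 1):       # enumerate the free variable
        x = {"entry": 1, "loop_body": n, "exit": 1}
        best = max(best, sum(c[b] * x[b] for b in c))
    return best

print(wcet_bound(10))  # → 5 + 10 * 20 + 3 = 208
```

Without the loop bound the objective would be unbounded, which is why program path analysis insists that the number of loop iterations is known.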

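The "union + maximal age" / "union + minimal age" join rules of the cache analyses can be sketched as follows; representing an abstract cache state as a block→age dictionary is an assumption of this sketch, not lecture notation:

```python
def must_join(a, b):
    """Must-analysis join ("union + maximal age"): a block is
    guaranteed to be cached only if it is in both abstract states,
    and its age is the worse (larger) of the two."""
    return {blk: max(a[blk], b[blk]) for blk in a.keys() & b.keys()}

def may_join(a, b):
    """May-analysis join ("union + minimal age"): a block may be
    cached if it is in either state, at its better (smaller) age."""
    return {blk: min(d[blk] for d in (a, b) if blk in d)
            for blk in a.keys() | b.keys()}

# abstract LRU states after two converging paths (age 0 = youngest)
path1 = {"A": 0, "B": 2}
path2 = {"A": 3, "C": 1}
print(must_join(path1, path2))  # only A is a guaranteed hit, at age 3
print(may_join(path1, path2))   # A, B and C are all possible hits
```

A reference that is not in the must state cannot be classified "always hit", so the analysis pessimistically charges a miss — safe, as the notes put it, even when the real execution would hit.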
10. Performance Analysis of Distributed Embedded Systems

Embedded system (ES)
- Computation, Communication, Resource interaction
- build a system from subsystems meeting the requirements

10.1 Real-time calculus (10-11)

Abstract the system to calculate an evaluation for all possible executions at once (entire behaviour in one analysis)

Min-plus algebra
- used for interval arithmetic (abstract values)

Computation: task instances R(t), computing resource C(t)
Communication: data packets R(t), bandwidth C(t)

Infimum: greatest element (not necessarily in the set) which is less than or equal to all other elements of the set

Metrics (10-17)

Data streams R(t): number of events in [0, t)
Resource streams C(t): available resources in [0, t)

Arrival curves α = [α^l, α^u]
- min / max arriving events in any interval of length Δ
Service curves β = [β^l, β^u]
- min / max available service in any interval of length Δ

Common event patterns: specified by a parameter triple
- p: period
- j: jitter
- d: minimum inter-arrival distance of events

Ex. Periodic with jitter: (10-21)
Ex. TDMA resource: (10-24)

Greedy Processing Component (GPC)

If tasks are available & resources are ready, always use them
- assumes preemptable tasks (can stop anytime if C(t) = 0)
- processes are only restricted by the limited resources
- tasks are processed one after the other in FIFO order

With u the last time the buffer was completely empty (R′(u) = R(u)), all available resources are used constantly for u ≤ t′ ≤ t:

R′(t) − R′(u) ≤ C(t) − C(u)

Using (min-plus) convolution, we can describe the relation as:

R′(t) = inf_{0 ≤ u ≤ t} { R(u) + C(t) − C(u) }
C′(t) = sup_{0 ≤ u ≤ t} { C(u) − R(u) }

Conservation law: R′(t) ≤ R(t) ∀t

Delay & Backlog (10-37):
- backlog: max. number of events in the queue / vertical difference between the curves
- delay: max. time in the queue / horizontal difference

Time domain: cumulative functions
Time-interval domain: variability curves

Time-interval domain relations (10-32)

10.2 Modular Performance Analysis (10-40)

Different scheduling mechanisms: (10-41)
- Fixed priority / rate monotonic, EDF, Round Robin, TDMA
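The GPC relation R′(t) = inf { R(u) + C(t) − C(u) } can be evaluated directly on sampled cumulative functions; the arrival and resource traces below are invented for illustration:

```python
def gpc_output(R, C):
    """Greedy processing component in the cumulative time domain:
    R'(t) = min over 0 <= u <= t of ( R(u) + C(t) - C(u) ),
    where R[t] / C[t] are cumulative arrivals / resources in [0, t)."""
    return [min(R[u] + C[t] - C[u] for u in range(t + 1))
            for t in range(len(R))]

R = [0, 3, 3, 4, 6]     # bursty arrivals: 3 events right at the start
C = [0, 1, 2, 3, 4]     # one unit of service per time step
Rp = gpc_output(R, C)
backlog = max(r - rp for r, rp in zip(R, Rp))   # vertical difference
print(Rp, backlog)      # → [0, 1, 2, 3, 4] 2
```

The burst is smoothed to one event per step (the output is limited by the service), the conservation law R′(t) ≤ R(t) holds pointwise, and the maximal vertical difference between R and R′ gives the backlog of 2 events.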

11. Various

Application-Specific Instruction Set processor
- specialized, but still programmable (efficiently)
- HW with a custom instruction set & operations

Translation Look-aside Buffer (TLB): stores recent translations of virtual to physical memory addresses

“Traffic shaper”: guarantees a minimum delay between tasks
- spreads bursts to minimize the influence on other tasks