
System Programming

System programming involves designing and writing computer
programs that allow the computer hardware to interface with
the programmer and the user, leading to the effective
execution of application software on the computer system.
• This raises several questions:
– What is software?
– How many different types of software are there?
– What are readymade software and user-defined software?
– What is system software?
Cont.
• A collection of programs is known as software.
• There are three types of software:
– Readymade software
• Examples: MS Office, Photoshop, Tally
– User-defined software
• Defined by the user, for example:
– An accounting package
– An automated software package for an organization or company
– ERP
– System software
• An operating system is system software, for example:
– Windows operating system
– Unix/Linux operating system
– MS-DOS operating system
Some System Software Concepts
• Language
– High-level language (HLL) and low-level language (LL)
– Machine-level and assembly-level language
– Loader and linker
– Translator
Design of a Translator or compiler can be
said to have the following Phases
• Lexical analysis phase
• Syntax analysis phase
• Semantic analysis phase
• Code generation phase
• Code optimization phase
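As a rough illustration of the first phase only, the sketch below splits a source line into tokens. The token classes (NUMBER, IDENT, OP) and their regular expressions are assumptions chosen for this example and do not come from any particular compiler.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=()]"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Lexical analysis: turn a character string into (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":        # whitespace is recognized but discarded
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("total = count + 42"))
```

The later phases (syntax analysis, semantic analysis, and so on) would consume this token list rather than the raw characters.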
Operating System

• An operating system is a kind of system software.
• The operating system manages the interaction between the hardware
and the user.
• It monitors and controls how programs use the hardware; such a
monitoring mechanism is termed an operating system.
Cont.
• An operating system has several important functions:

– Resource management
– Device management
– Memory management
– Scheduling
– Concurrent programming
System
• A system is a collection of items, entities, or subsystems.
• The entities of a system are interconnected and interact with
each other by accepting an input and producing the
corresponding output.

[Figure: Input → System → Output]
Cont.

[Figure: a system made up of Entity 1, Entity 2, Entity 3, and a subsystem]
• How many bytes are in 1 KB? 1024 bytes (8192 bits)
• 1024 KB = 1 MB
• 1024 MB = 1 GB
• 1024 GB = 1 TB
• Assignment
• Prove the relations above.
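A short sketch of how these relations can be checked, assuming the binary (power-of-two) meaning of KB, MB, GB, and TB used on this slide. The constant names are chosen for this example.

```python
# Binary (power-of-two) storage units, as used on the slide.
BITS_PER_BYTE = 8
KB = 1024            # bytes in 1 KB = 2**10
MB = 1024 * KB       # bytes in 1 MB = 2**20
GB = 1024 * MB       # bytes in 1 GB = 2**30
TB = 1024 * GB       # bytes in 1 TB = 2**40

print(KB * BITS_PER_BYTE)                       # bits in 1 KB
print(MB == 2**20, GB == 2**30, TB == 2**40)    # each step multiplies by 2**10
```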
NUMBER SYSTEM
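What is the radix of each number system?
Decimal: 10
Binary: 2
Octal: 8
Hexadecimal: 16
Can you convert decimal to binary?
64, 128, 256, 1024
1000000, 10000000, 100000000, 10000000000
Convert from binary:
100011 (base 2) = 35 (base 10)
111000 (base 2) = 56 (base 10)
111011001001 (base 2) = EC9 (base 16)
111000101 (base 2) = 705 (base 8)
110011.01 (base 2) = 51.25 (base 10)
11001110011 (base 2) = 3163 (base 8)
1111100001000100 (base 2) = F844 (base 16)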
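The conversions above can be checked with a small sketch. The helper names to_base and binary_fraction_to_decimal are made up for this example; int(text, 2) is Python's built-in way to read a binary string.

```python
def to_base(n, base):
    """Convert a non-negative integer to a digit string by repeated division."""
    digits = "0123456789ABCDEF"
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, remainder = divmod(n, base)   # the remainder is the next digit, right to left
        out.append(digits[remainder])
    return "".join(reversed(out))

def binary_fraction_to_decimal(text):
    """Evaluate a binary string such as '110011.01' as a decimal number."""
    whole, _, frac = text.partition(".")
    value = int(whole, 2) if whole else 0
    for i, bit in enumerate(frac, start=1):
        value += int(bit) * 2 ** -i      # bits after the point weigh 1/2, 1/4, ...
    return value

for n in (64, 128, 256, 1024):
    print(n, "->", to_base(n, 2))
print(int("100011", 2), int("111000", 2))
print(to_base(int("111011001001", 2), 16))
print(to_base(int("111000101", 2), 8))
print(binary_fraction_to_decimal("110011.01"))
```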
BCD
• Binary coded decimal
If the given decimal number is
1.
5 3 1 9
convert it into BCD, using one 8-bit group per decimal digit:
00000101 00000011 00000001 00001001
2.
6 2 3 5
00000110 00000010 00000011 00000101
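A minimal sketch of the encoding used above, with one 8-bit group per digit (often called unpacked BCD), next to the 4-bit-per-digit packed form for comparison. The function names are chosen for this example.

```python
def to_unpacked_bcd(number):
    """Encode each decimal digit as its own 8-bit binary group (unpacked BCD)."""
    return " ".join(format(int(digit), "08b") for digit in str(number))

def to_packed_bcd(number):
    """Encode each decimal digit as a 4-bit group (packed BCD)."""
    return " ".join(format(int(digit), "04b") for digit in str(number))

print(to_unpacked_bcd(5319))
print(to_unpacked_bcd(6235))
print(to_packed_bcd(5319))
```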
Classification of system
• Systems are classified into two types.
• 1. Closed System:
– It is a system that interacts with the entities of the system itself, but not
with its external environment.
– In the real world, a closed system does not exist, because a system cannot
survive for a long period of time without interacting with its environment.
• 2. Open System: It is a system that interacts with the entities of the system
itself and also with its external environment.
– In the real world, all systems are open systems.
– An open system can be further classified as follows:
– A. Deterministic System: The system accepts the input and gives a definite
output.
– B. Probabilistic System: The system processes the input, but it may not
give a definite output every time.
Software Hierarchy
• Highest Level
• Application Program
• High level language ; programs and compilers
• Operating system
• Assembly Programming Language
• Lowest Level
– Machine Language Programs (MLP)
Types of System Programs
• Status Information
• The status information system programs provide required data on the current or past status
of the system. This may include the system date, system time, available memory in system,
disk space, logged in users etc.
• Communications
• These system programs are needed for system communications such as web browsers. Web
browsers allow systems to communicate and access information from the network as
required.
• File Manipulation
• These system programs are used to manipulate system files. This can be done using various
commands like create, delete, copy, rename, print etc. These commands can create files,
delete files, copy the contents of one file into another, rename files, print them etc.
• Program Loading and Execution
• The system programs that deal with program loading and execution make sure that programs
can be loaded into memory and executed correctly. Loaders and linkers are prime examples
of this type of system program.
• File Modification
• System programs that are used for file modification change the data in a file or
modify it in some other way. Text editors are a prominent example of file modification
system programs.
Assembler, Loader, Linker
• An assembler translates the assembly
program into machine code (an object file).
• A linker tool is used to link all the parts of the
program together for execution (executable
machine code).
• A loader loads all of them into memory and
then the program is executed.
Assembler
• An assembler translates assembly language
programs into machine code.
• The output of an assembler is called an object
file, which contains a combination of machine
instructions as well as the data required to
place these instructions in memory.
Cont.
• Assemblers need to:
– Translate assembly instructions and pseudo-instructions into machine instructions
– Convert decimal numbers, etc. specified by the programmer into binary
• Typically, assemblers make two passes over the assembly file:
– First pass: reads each line and records labels in a symbol table
– Second pass: uses the information in the symbol table to produce the
actual machine code
Object file format
• Object file header describes the size and position
of the other pieces of the file
• Text segment contains the machine instructions
• Data segment contains binary representation of
data in assembly file
• Relocation info identifies instructions and data
that depend on absolute addresses
• Symbol table associates addresses with external
labels and lists unresolved references
Cont.
• Object file header
• Text segment
• Data segment
• Relocation information
• Symbol table
• Debugging information
Assembler
• Compiler
• A compiler translates high-level programming language
code into machine-level code and creates an executable
program. The compiler checks the program for errors and
reports them. All errors must be removed; otherwise the
code will not be compiled and executed.
• Assembler
• An assembler translates assembly-level code into
machine-readable code. The assembler also checks the
correctness of each instruction and reports diagnostics.
Assemblers and Linkers
Cont.
• This document contains very brief examples of assembly language programs
for the x86. The topic of x86 assembly language programming is messy
because:
• There are many different assemblers out there: MASM, NASM, gas, as86,
TASM, a86, Terse, etc. All use radically different assembly languages.
• There are differences in the way you have to code for Linux, OS X, Windows,
etc.
• Many different object file formats exist: ELF, COFF, Win32, OMF, a.out for
Linux, a.out for FreeBSD, rdf, IEEE-695, as86, etc.
• You generally will be calling functions residing in the operating system or other
libraries so you will have to know some technical details about how libraries
are linked, and not all linkers work the same way.
• Modern x86 processors run in either 32 or 64-bit mode; there are quite a few
differences between these.
• We’ll give examples written for NASM, MASM and gas for both Win32 and
Linux. We will even include a section on DOS assembly language programs for
historical interest. These notes are not intended to be a substitute for the
documentation that accompanies the processor and the assemblers, nor is it
intended to teach you assembly language. Its only purpose is to show how to
assemble and link programs using different assemblers and linkers.
Cont.
• Each assembly language file is assembled into an "object file" and
the object files are linked with other object files to form an
executable. A "static library" is really nothing more than a collection
of (probably related) object files. Application programmers
generally make use of libraries for things like I/O and math.
• Assemblers you should know about include
• MASM, the Microsoft Assembler. It outputs OMF files (but
Microsoft’s linker can convert them to win32 format). It supports a
massive and clunky assembly language. Memory addressing is not
intuitive. The directives required to set up a program make
programming unpleasant.
• GAS, the GNU assembler. This uses the rather ugly AT&T-style
syntax so many people do not like it; however, you can configure it
to use and understand the Intel-style. It was designed to be part of
the back end of the GNU compiler collection (gcc).
• NASM, the "Netwide Assembler." It is free, small, and best of all it
can output zillions of different types of object files. The language is
much more sensible than MASM in many respects.
Cont.
There are many object file formats. Some you should know
about include
• OMF: used in DOS but has 32-bit extensions for Windows.
Old.
• AOUT: used in early Linux and BSD variants
• COFF: "Common object file format"
• Win, Win32: Microsoft’s version of COFF, not exactly the
same! Replaces OMF.
• Win64: Microsoft’s format for Win64.
• ELF, ELF32: Used in modern 32-bit Linux and elsewhere
• ELF64: Used in 64-bit Linux and elsewhere
• macho32: NeXTstep/OpenStep/Rhapsody/Darwin/OS X 32-
bit
• macho64: NeXTstep/OpenStep/Rhapsody/Darwin/OS X 64-
bit
Cont.
• The NASM documentation has great
descriptions of these.
• You’ll need to get a linker that (1) understands
the object file formats you produce, and (2)
can write executables for the operating
systems you want to run code on. Some
linkers out there include
– LINK.EXE, for Microsoft operating systems.
– ld, which exists on all Unix systems; Windows
programmers get this in any gcc distribution.
Programming Using System Calls

• 64-bit Linux installations use the processor’s
SYSCALL instruction to jump into the portion of
memory where operating system services are
stored. To use SYSCALL, first put the system call
number in RAX, then the arguments, if any, in
RDI, RSI, RDX, R10, R8, and R9, respectively.
• In our first example we will use system calls for
writing to a file (call number 1) and exiting a
process (call number 60). Here it is in the NASM
assembly language:
hello.asm
; ----------------------------------------------------------------------------------------
; Writes "Hello, World" to the console using only system calls. Runs on 64-bit Linux only.
; To assemble and run:
;
;     nasm -felf64 hello.asm && ld hello.o && ./a.out
; ----------------------------------------------------------------------------------------

          global    _start

          section   .text
_start:   mov       rax, 1                  ; system call for write
          mov       rdi, 1                  ; file handle 1 is stdout
          mov       rsi, message            ; address of string to output
          mov       rdx, 13                 ; number of bytes
          syscall                           ; invoke operating system to do the write
          mov       rax, 60                 ; system call for exit
          xor       rdi, rdi                ; exit code 0
          syscall                           ; invoke operating system to exit

          section   .data
message:  db        "Hello, World", 10      ; note the newline at the end
The same program in gas:
hello.s
# ----------------------------------------------------------------------------------------
# Writes "Hello, World" to the console using only system calls. Runs on 64-bit Linux only.
# To assemble and run:
#
#     gcc -c hello.s && ld hello.o && ./a.out
#
# or
#
#     gcc -nostdlib hello.s && ./a.out
# ----------------------------------------------------------------------------------------

        .global _start

        .text
_start:
        # write(1, message, 13)
        mov     $1, %rax                # system call 1 is write
        mov     $1, %rdi                # file handle 1 is stdout
        mov     $message, %rsi          # address of string to output
        mov     $13, %rdx               # number of bytes
        syscall                         # invoke operating system to do the write

        # exit(0)
        mov     $60, %rax               # system call 60 is exit
        xor     %rdi, %rdi              # we want return code 0
        syscall                         # invoke operating system to exit
message:
        .ascii  "Hello, World\n"
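For comparison, here is a hedged sketch of the same write call made from Python, whose os.write function is a thin wrapper over the operating system's write service. To keep the result checkable, this example writes to a pipe instead of file handle 1; the pipe is an assumption of the sketch, not something the assembly programs use.

```python
import os

message = b"Hello, World\n"            # 12 characters plus the newline = 13 bytes

# os.pipe() returns a (read_fd, write_fd) pair; write_fd stands in here for
# file handle 1 (stdout) in the assembly versions.
read_fd, write_fd = os.pipe()
written = os.write(write_fd, message)  # returns the number of bytes written
os.close(write_fd)

echoed = os.read(read_fd, 64)          # read back what the write call delivered
os.close(read_fd)

print(written, echoed)
```

The byte count returned by os.write corresponds to the length that the assembly programs place in RDX.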
System Calls in 32-bit Linux
• Some systems still run 32-bit builds of Linux. On these
systems you invoke operating system services through an INT
instruction (INT 0x80) and use different registers for system
call arguments (specifically EAX for the call number and EBX,
ECX, EDX, ESI, and EDI for the arguments).
• Although it might be interesting to show some examples for
historical reasons, this introduction is probably better kept
short.
Programming with a C Library
• You might like to use your favorite C library functions in
your assembly code. This should be trivial because the C
library functions are all stored in a C library, such as libc.a.
• Technically the code is probably in a dynamic library, like
libc.so, and libc.a just has calls into the dynamic library.
Still, all we have to do is place calls to C functions in our
assembly language program, and link with the static C
library and we are set.
• Before looking at an example, note that the C library
already defines _start, which does some initialization, calls
a function named main, does some clean up, then calls the
system function exit! So if we link with a C library, all we
have to do is define main and end with a ret instruction!
Here is a simple example in NASM, which illustrates calling
puts.
Programming for Win32
• Win32 is the primary operating system API
found in most of Microsoft’s 32-bit operating
systems including Windows 9x, NT, 2000 and
XP. We will follow the plan of the previous
section and first look at programs that just use
system calls and then programs that use a C
library.
Calling the Win32 API Directly
• Win32 defines thousands of functions! The code for these
functions is spread out in many different dynamic libraries,
but the majority of them are in KERNEL32.DLL, USER32.DLL
and GDI32.DLL (which exist on all Windows installations).
• The interrupt to execute system calls on the x86 processor
is hex 2E, with EAX containing the system call number and
EDX pointing to the parameter table in memory. However,
according to z0mbie, the actual system call numbers are
not consistent across different operating systems, so to
write portable code you should stick to the API calls in the
various system DLLs.
• Here is the "Hello, World" program in NASM, using only
Win32 calls.
Cont.
• "Hello, World" program in NASM, using only Win32 calls.
hello.asm
; ----------------------------------------------------------------------------
; hello.asm
;
; This is a Win32 console program that writes "Hello, World" on one line and
; then exits. It uses only plain Win32 system calls from kernel32.dll, so it
; is very instructive to study since it does not make use of a C library.
; Because system calls from kernel32.dll are used, you need to link with
; an import library. You also have to specify the starting address yourself.
;
; Assembler: NASM
; OS: Any Win32-based OS
; Other libraries: Use gcc's import library libkernel32.a
; Assemble with "nasm -fwin32 hello.asm"
; Link with "ld -e go hello.obj -lkernel32"
; ----------------------------------------------------------------------------

        global  go
        extern  _ExitProcess@4
        extern  _GetStdHandle@4
        extern  _WriteConsoleA@20

        section .data
msg:    db      'Hello, World', 10
handle: dd      0                       ; 32-bit handle, stored from eax below
written:
        dd      0                       ; 32-bit count filled in by WriteConsoleA

        section .text
go:
        ; handle = GetStdHandle(-11)
        push    dword -11
        call    _GetStdHandle@4
        mov     [handle], eax

        ; WriteConsole(handle, &msg[0], 13, &written, 0)
        push    dword 0
        push    written
        push    dword 13
        push    msg
        push    dword [handle]
        call    _WriteConsoleA@20

        ; ExitProcess(0)
        push    dword 0
        call    _ExitProcess@4
• Here you can see that the Win32 calls we are using are GetStdHandle,
WriteConsoleA, and ExitProcess, and that parameters are passed to these
calls on the stack. The comments instruct us to assemble into an object
format of "win32" (not "coff"!) and then link with the linker ld.
• Of course you can use any linker you want, but ld comes with
gcc and you can download a whole Win32 port of gcc for free.
We pass the starting address to the linker, and specify the
static library libkernel32.a to link with.
• This static library is part of the Win32 gcc distribution, and it
contains the right calls into the system DLLs.
Differences between NASM, MASM, and GAS
• The complete syntactic specification of each
assembly language can be found elsewhere,
but you can learn 99% of what you need to
know by looking at a comparison table:
Operation | NASM | MASM | GAS
Move contents of esi into ebx | mov ebx, esi | same as NASM | movl %esi, %ebx
Move contents of si into dx | mov dx, si | same as NASM | movw %si, %dx
Clear the eax register | xor eax, eax | same as NASM | xorl %eax, %eax
Move immediate value 10 into register al | mov al, 10 | same as NASM | movb $10, %al
Move contents of address 10 into register ecx | mov ecx, [10] | (not given) | movl 10, %ecx
Move contents of variable dog into register eax | mov eax, [dog] | mov eax, dog | movl dog, %eax
Move address of variable dog into register eax | mov eax, dog | (not given) | movl $dog, %eax
Move immediate byte value 10 into memory pointed to by edx | mov byte [edx], 10 | mov byte ptr [edx], 10 | movb $10, (%edx)
Move immediate 16-bit value 10 into memory pointed to by edx | mov word [edx], 10 | mov word ptr [edx], 10 | movw $10, (%edx)
Move immediate 32-bit value 10 into memory pointed to by edx | mov dword [edx], 10 | mov dword ptr [edx], 10 | movl $10, (%edx)
Compare eax to the contents of memory 8 bytes past the cell pointed to by ebp | cmp eax, [ebp+8] | same as NASM | cmpl 8(%ebp), %eax
Add into esi the value in memory ecx quadwords past the cell pointed to by eax | add esi, [eax+ecx*8] | same as NASM | addl (%eax,%ecx,8), %esi
Add into esi the value in memory ecx doublewords past 128 bytes past the cell pointed to by eax | add esi, [eax+ecx*4+128] | same as NASM | addl 128(%eax,%ecx,4), %esi
Add into esi the value in memory ecx doublewords past eax bytes past the beginning of the variable named array | add esi, [eax+ecx*4+array] | same as NASM | addl array(%eax,%ecx,4), %esi
Add into esi the value in memory ecx words past the beginning of the variable named array | add esi, [ecx*2+array] | same as NASM | addl array(,%ecx,2), %esi
Move the immediate value 4 into the ... | mov byte [fs:eax], 4 | mov byte ptr fs:eax, ... | movb $4, ...
Assembly Process
• An assembly line is a manufacturing process (often called a
progressive assembly) in which parts (usually interchangeable parts) are
added as the semi-finished assembly moves from workstation to
workstation where the parts are added in sequence until the
final assembly is produced.
• The assembly process can be done temporarily with fasteners or
permanently by welding or gluing. If the assembled part requires some
kind of service, it is better to connect temporarily. During the assembly
process, the order should also be taken into account during the design
stage.
• The assembly process also exists in electronics. At the prototyping
level it is done by hand, but at the commercial level it should be
automated, because a commercial product should carry a warranty of at
least 2 years. Moreover, for impact and vibration resistance, it is
important that the electronic components are assembled well.
• It is also very important, from the electric signal perspective, that the
soldering of the electronic components is uniform, so that the connection
points will not show resistance and heat up.
Cont.
Assembly of Components
• Assembly processes are involved in at least
two stages of the overall manufacturing flow
for optoelectronic systems.
• The individual components, such as integrated
circuits, are assembled into packages, such as
small outline integrated circuit or quad flat
pack, and then the packaged components are
assembled into a module such as a printed
circuit board.
Assembly Process
• Assembling the source code into an object file
• Linking the object file with other modules or
libraries into an executable program
• Loading the program into memory
• Running the program
Cont.
• Figure 1. Assembly Process
Cont.
Assembly Sample Program
; add_16_bytes.asm
;
        .586P
; Flat memory model, standard calling convention:
        .MODEL FLAT, STDCALL
;
; Data segment
_DATA   SEGMENT
values  db 16 DUP( 5 )          ; 16 bytes of values "5"
_DATA   ENDS
; Code segment
_TEXT   SEGMENT
START:  mov eax, 0              ; clear result
        mov bl, 16              ; init loop counter
        lea esi, values         ; init data pointer
addup:  add al, [esi]           ; add byte to sum
        inc esi                 ; increment data pointer
        dec bl                  ; decrement loop counter
        jnz addup               ; if BL not zero, continue
        mov [esi], al           ; save sum (ESI now points just past values)
        ret
_TEXT   ENDS                    ; close the code segment
        END START               ; end of source, entry point is START
Cont.
• At assembly time, the assembler:
• Evaluates conditional-assembly directives, assembling if the
conditions are true.
• Expands macros and macro functions.
• Evaluates constant expressions such as MYFLAG AND 80H,
substituting the calculated value for the expression.
• Encodes instructions and nonaddress operands. For
example, mov cx, 13; can be encoded at assembly time
because the instruction does not access memory.
• Saves memory offsets as offsets from their segments.
• Places segments and segment attributes in the object file.
• Saves placeholders for offsets and segments (relocatable
addresses).
• Outputs a listing if requested.
• Passes messages (such as INCLUDELIB) directly to the linker.
Passes of the Assembler
• An assembler is a program that converts instructions written in
low-level assembly code into relocatable machine code and
generates accompanying information for the loader.
• It generates instructions by evaluating the mnemonics
(symbols) in the operation field and finding the values of symbols
and literals to produce machine code.
• If the assembler does all this work in one scan, it is called a
single-pass assembler; if it does it in multiple scans, it is
called a multiple-pass assembler. Here the assembler divides
these tasks into two passes:
Cont.
• Pass-1:
– Define symbols and literals and remember them in
symbol table and literal table respectively.
– Keep track of location counter
– Process pseudo-operations
• Pass-2:
– Generate object code by converting symbolic op-code
into respective numeric op-code
– Generate data for literals and look for values of
symbols
• First, we will take a small assembly language
program to understand the work done in each
pass. Assembly language statement
format:
Cont.
• [Label] [Opcode] [Operand]
• Example:
• M ADD R1, ='3' where M is the label, ADD is the symbolic opcode, R1 is the
symbolic register operand, and ='3' is a literal
• Assembly Program:
Label | Op-code | Operand   | LC value (location counter)
JOHN  | START   | 200       |
      | MOVER   | R1, ='3'  | 200
      | MOVEM   | R1, X     | 201
L1    | MOVER   | R2, ='2'  | 202
      | LTORG   |           | 203
X     | DS      | 1         | 204
      | END     |           | 205
Cont.
• START: This instruction starts the execution of the program
from location 200, and the label on START provides the name
of the program (JOHN is the name of the program).
• MOVER: It moves the content of the literal (='3') into register
operand R1.
• MOVEM: It moves the content of the register into memory
operand (X).
• MOVER: It again moves the content of a literal (='2') into
register operand R2, and its label is specified as L1.
• LTORG: It assigns an address to literals (the current LC value).
• DS (Data Space): It assigns a data space of 1 to symbol X.
• END: It finishes the program execution.
Working of Pass-1
Define the symbol and literal tables with their addresses.
Note: A literal's address is specified by LTORG or END.
• Step-1: START 200 (no symbol or literal is found here, so both
tables are empty)
• Step-2: MOVER R1, ='3' 200 (='3' is a literal, so the literal table is
created)
• LITERAL ADDRESS
• ='3' –––
• Step-3: MOVEM R1, X 201
X is a symbol referred to prior to its declaration, so it is stored in the
symbol table with a blank address field.
• SYMBOL ADDRESS
• X –––
Cont.

• Step-4: L1 MOVER R2, ='2' 202
L1 is a label and ='2' is a literal, so store them in their respective tables.
• SYMBOL ADDRESS
• X –––
• L1 202
• LITERAL ADDRESS
• ='3' –––
• ='2' –––
• Step-5: LTORG 203
Assign the address given by the LC value, i.e., 203, to the first literal.
• LITERAL ADDRESS
• ='3' 203
• ='2' –––
Cont.
• Step-6: X DS 1 204
This is a data declaration statement, i.e., X is assigned a data space of 1. But X is a symbol
that was referred to earlier, in Step-3, and is only defined in Step-6.
• This condition is called the forward reference problem, where a variable is referred to prior
to its declaration; it can be solved by back-patching.
• So the assembler now assigns X the address given by the LC value of the current step.
• SYMBOL ADDRESS
• X 204
• L1 202
• Step-7: END 205
The program ends, and the remaining literal gets the address given by the LC value of the
END instruction. Here are the complete symbol and literal tables built by Pass-1 of the
assembler.
• SYMBOL ADDRESS
• X 204
• L1 202
• LITERAL ADDRESS
• ='3' 203
• ='2' 205
The tables generated by Pass-1, along with their LC values, now go to Pass-2 of the assembler
for further processing of pseudo-opcodes and machine op-codes.
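The Pass-1 walk-through can be sketched in a few lines. This is only an illustration under stated assumptions: the tuple layout of PROGRAM, the REGISTERS set, and the function name pass_one are invented for this example, every statement is assumed to advance the location counter by one, and LTORG follows the rule used in the steps above (it places the oldest pending literal, and END places the rest). Many textbook designs instead place every pending literal at LTORG.

```python
# Source program from the worked example: (label, opcode, operands)
PROGRAM = [
    ("JOHN", "START", ["200"]),
    (None,   "MOVER", ["R1", "='3'"]),
    (None,   "MOVEM", ["R1", "X"]),
    ("L1",   "MOVER", ["R2", "='2'"]),
    (None,   "LTORG", []),
    ("X",    "DS",    ["1"]),
    (None,   "END",   []),
]
REGISTERS = {"R1", "R2"}     # assumed register names, not treated as symbols

def pass_one(program):
    """Build the symbol table and literal table, tracking the location counter."""
    symbols = {}     # symbol  -> address (None while only forward-referenced)
    literals = {}    # literal -> address (None until LTORG or END places it)
    lc = 0
    for label, opcode, operands in program:
        if opcode == "START":
            lc = int(operands[0])            # JOHN names the program; LC starts at 200
            continue
        if label is not None:
            symbols[label] = lc              # definition fills in any blank address
        if opcode in ("LTORG", "END"):
            pending = [lit for lit, addr in literals.items() if addr is None]
            # Rule from the steps above: LTORG places one literal, END the rest.
            to_place = pending[:1] if opcode == "LTORG" else pending
            for offset, lit in enumerate(to_place):
                literals[lit] = lc + offset
        elif opcode != "DS":
            for operand in operands:
                if operand.startswith("="):
                    literals.setdefault(operand, None)
                elif operand not in REGISTERS:
                    symbols.setdefault(operand, None)   # forward reference
        lc += 1                              # assumption: one location per statement
    return symbols, literals

symbols, literals = pass_one(PROGRAM)
print(symbols)
print(literals)
```

Storing None for X at Step-3 and overwriting it at Step-6 is the back-patching idea in its simplest form.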
PASS-2
• Pass-2 of the assembler generates machine code by converting
symbolic machine opcodes into their respective bit
configurations (machine-understandable form).
• All machine opcodes are stored in the MOT (machine opcode table)
with their symbolic code, length, and bit configuration.
• The flow chart is given on the next slide.
Cont.
• Two-pass translation consists of Pass I and Pass II.
• Generally, LC processing is performed in the first pass, and
symbols defined in the program are entered into the
symbol table; hence the first pass performs analysis of the
source program.
• As a result, two-pass translation of an assembly language
program can handle forward references easily.
• The second pass synthesizes the target form using the
address information found in the symbol table.
• Moreover, the first pass constructs an intermediate
representation (IR) of the source program, which is
used by the second pass.
• The IR consists of two main components: data structures and IC
(intermediate code).
Cont.
• Flow Chart
Cont.
BEGIN {generation of object module}
  Write assembler report headings and any leading comment lines
    (Note: as each source line is processed, it is written to the assembler report)
  Process the START statement, if present, setting Locctr to the operand's value (default is 0)
  Initialize the object module:
    1. Locctr value is the initial load point
    2. END val from Pass 1 is the tentative "execute next"
  Loop through the source lines until the END statement is reached or the source runs out
  BEGIN
    Skip over any comment lines (but write them to the assembler report)
    Extract Opcode and Operand, increment Locctr, then if Opcode is
      1. RESW or RESB, start a new module:
         a. ! delimiter to end the prior module
         b. loader address replaces END val in the prior module as "execute next"
         c. Locctr value is the next load point
         d. END val from Pass 1 is this module's tentative "execute next"
      2. WORD or BYTE, Operand gives the storage value(s) to write to the object module
      3. an assembler directive, process as specified
      4. an instruction, build the object version using the nixbpe bits, Locctr, and the
         Operand value from the symbol table
  END {of loop}
  Append the ! delimiter to end the final module
  Output the object module(s) as the object code file if no errors were encountered in Pass 1 or 2
END {of Pass 2}
Cont.
Databases required by Pass-2:
1. MOT (machine opcode table)
2. POT (pseudo opcode table)
3. Base table (stores the value of the base register)
4. LC (location counter)
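A minimal sketch of the core Pass-2 step, looking up each mnemonic in the MOT and each operand in the tables built by Pass-1. The numeric opcodes and register codes below are hypothetical values invented for this example, as is the tuple layout of the output; only the addresses come from the worked example earlier in these slides.

```python
# Tables assumed to come from Pass-1 of the worked example.
SYMBOLS = {"X": 204, "L1": 202}
LITERALS = {"='3'": 203, "='2'": 205}

# Hypothetical machine opcode table (MOT): mnemonic -> numeric opcode.
MOT = {"MOVER": 4, "MOVEM": 5}
# Hypothetical register codes.
REGS = {"R1": 1, "R2": 2}

# Machine instructions only: (location counter, mnemonic, register, operand).
INSTRUCTIONS = [
    (200, "MOVER", "R1", "='3'"),
    (201, "MOVEM", "R1", "X"),
    (202, "MOVER", "R2", "='2'"),
]

def pass_two(instructions):
    """Replace each mnemonic, register, and operand by its numeric form."""
    object_code = []
    for lc, mnemonic, register, operand in instructions:
        # Literals and symbols live in separate tables built by Pass-1.
        address = LITERALS[operand] if operand.startswith("=") else SYMBOLS[operand]
        object_code.append((lc, MOT[mnemonic], REGS[register], address))
    return object_code

for line in pass_two(INSTRUCTIONS):
    print(line)
```

Because X already has its address by the time Pass-2 runs, the forward reference at location 201 needs no special handling here.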
Assembler Directives
• Assembler directives are pseudo instructions
– They will not be translated into machine instructions.
– They only provide instruction/direction/information to the
assembler.
• Basic assembler directives :
– START : Specify name and starting address for the program
– END : Indicate the end of the source program.
– EQU : The EQU directive is used to replace a number by a
symbol. For example: MAXIMUM EQU 99. After using this
directive, every appearance of the label “MAXIMUM” in
the program will be interpreted by the assembler as the
number 99 (MAXIMUM = 99).
– Symbols may be defined this way only once in the
program. The EQU directive is mostly used at the
beginning of the program.
Macros & Macro processors
• A macro represents a group of commonly used
statements in the source programming language.
The macro processor replaces each macro
instruction with the corresponding group of
source language statements. This is known as
expansion of macros. Macro processing involves
definition, invocation, and expansion.
[Figure: Source code (with macros) → Macro processor → Expanded code → Compiler or assembler → Object code]
Cont.
• Macros are used to provide a program generation
facility through macro expansion.
• Many languages provide built-in facilities for
writing macros, like PL/I, C, Ada, and C++.
• Assembly languages also provide such facilities.
• What is to be done when a language does not
support built-in facilities for writing macros?
• A programmer may achieve an equivalent effect by
using generalized preprocessors or software tools like
Awk of Unix.
Cont.
A macro is a unit of specification for program
generation through expansion.
• A macro consists of
• a name, a set of formal parameters and a
body of code.
• The use of a macro name with a set of actual
parameters is replaced by some code generated
from its body. This is called macro expansion.
Two kinds of expansion can be identified.
CLASSIFICATION OF MACROS:
• Lexical expansion:
• Lexical expansion implies replacement of a character string by
another character string during program generation.
• Lexical expansion is to replace occurrences of formal
parameters by corresponding actual parameters.
• Semantic expansion:
• Semantic expansion implies generation of instructions
tailored to the requirements of a specific usage.
• Semantic expansion is characterized by the fact that different
uses of a macro can lead to codes which differ in the number,
sequence and opcodes of instructions.
• Eg: Generation of type specific instructions for manipulation
of byte and word operands.
EXAMPLE
• The following sequence of instructions is used to increment the
value in a memory word by a constant.
• 1. Move the value from the memory word into a machine
register.
• 2. Increment the value in the machine register.
• 3. Move the new value into the memory word.
• Since the instruction sequence MOVE-ADD-MOVE may be used
a number of times in a program, it is convenient to define a macro
named INCR.
• Using lexical expansion, the macro call INCR A, B, AREG can lead to the
generation of a MOVE-ADD-MOVE instruction sequence to increment A
by the value of B, using AREG to perform the arithmetic.
• Semantic expansion can enable the instruction sequence to be
adapted to the types of A and B.
• For example, an INC instruction could be generated if A is a byte
operand and B has the value '1'.
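To make the idea of semantic expansion concrete, here is a small sketch in which the generated code depends on the operands. The function name expand_incr, the mem_type argument, and the operand format of INC are assumptions made for this example; the MOVER, ADD, and MOVEM mnemonics are the ones used elsewhere in these slides.

```python
def expand_incr(mem, inc, reg, mem_type="word"):
    """Semantic expansion of INCR: choose instructions based on the operands."""
    if mem_type == "byte" and inc == "1":
        # Special case described above: a single INC instruction is enough.
        return [f"INC {mem}"]
    # General case: the MOVE-ADD-MOVE sequence.
    return [
        f"MOVER {reg}, {mem}",
        f"ADD {reg}, {inc}",
        f"MOVEM {reg}, {mem}",
    ]

print(expand_incr("A", "B", "AREG"))
print(expand_incr("A", "1", "AREG", mem_type="byte"))
```

The two calls produce code that differs in the number and opcodes of instructions, which is what separates semantic expansion from plain string replacement.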
HOW DOES MACRO DIFFER FROM
SUBROUTINE?
Macros differ from subroutines in one fundamental
respect.
• Use of a macro name in the mnemonic field of an
assembly statement leads to its expansion,
• whereas use of subroutine name in a call instruction
leads to its execution.
• So they differ in program size and execution efficiency.
• Macros can be said to trade program size for
execution efficiency. More differences will be
discussed along with macro expansion.
MACRO DEFINITION AND CALL
• MACRO DEFINITION
• A macro definition is enclosed between a macro header statement
and a macro end statement.
• Macro definitions are typically located at the start of a program.
• A macro definition consists of.
• A macro prototype statement
• One or more model statements
• Macro preprocessor statements
• The macro prototype statement declares the name of a macro and
the names and kinds of its parameters.
• It has the following syntax: <macro name> [<formal parameter spec> [,..]]
• where <macro name> appears in the mnemonic field of an assembly statement
and
• <formal parameter spec> is of the form &<parameter name> [<parameter kind>]
Macros using AIF, AGO,
REPT.
• AIF
• An AIF statement has the syntax: AIF (<expression>) <sequencing symbol>
• where <expression> is a relational expression
involving ordinary strings, formal parameters
and their attributes, and expansion time variables.
MACRO
• Expansion time variable (EV) names are entered in EVNTAB
while processing EV declarations.
• Sequencing symbol (SS) names are entered in SSNTAB while
processing an SS reference or definition,
whichever occurs earlier.
• Eac
• REPT statement
• Syntax: REPT <expression>
• <expression> should evaluate to a numerical
value during macro expansion.
Types of Parameters
• Positional parameters
• Keyword parameters
• Default specification of parameter
• Macro with mixed parameter lists
• Other uses of parameter
Positional parameters
• A positional formal parameter is written as &<parameter name>.
The <actual parameter spec> in a call on a macro using positional
parameters is simply an <ordinary string>.
• The value of a positional formal parameter XYZ is determined
by the rule of positional association:
• Step-1: Find the ordinal position of XYZ in the list
of formal parameters in the macro prototype
statement.
• Step-2: Find the actual parameter specification
occupying the same ordinal position in the list of
actual parameters in the macro call statement.
Example
• INCR A, B, AREG
• By the rule of positional association, the values of the
formal parameters are:
• Formal parameter    Value
• MEM_VAL             A
• INCR_VAL            B
• REG                 AREG
• Lexical expansion of the model statements now leads
to the code
• + MOVER AREG, A
• + ADD AREG, B
• + MOVEM AREG, A
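The rule of positional association followed by lexical expansion can be sketched in Python. This is a minimal sketch: the function name `expand_positional` is hypothetical, and the INCR model statements are assumed to be the MOVER-ADD-MOVEM sequence shown above written with formal parameters.

```python
import re


def expand_positional(formals, actuals, model_statements):
    """Bind formals to actuals by ordinal position, then substitute into the model statements."""
    if len(formals) != len(actuals):
        raise ValueError("number of actual parameters does not match the prototype")
    # Step-1 and Step-2: the i-th formal parameter takes the i-th actual parameter.
    values = dict(zip(formals, actuals))
    # Lexical expansion: replace every &NAME by the value bound to NAME.
    return [
        re.sub(r"&(\w+)", lambda m: values[m.group(1)], statement)
        for statement in model_statements
    ]


# Assumed model statements for the macro INCR.
INCR_FORMALS = ["MEM_VAL", "INCR_VAL", "REG"]
INCR_MODEL = ["MOVER &REG, &MEM_VAL", "ADD &REG, &INCR_VAL", "MOVEM &REG, &MEM_VAL"]
```

Calling `expand_positional(INCR_FORMALS, ["A", "B", "AREG"], INCR_MODEL)` yields the three statements listed above, without the leading `+` marker.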
Keyword parameters
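• For keyword parameters, <parameter name> is an ordinary string
and <parameter kind> is the string '=' in the syntax rule
for <formal parameter spec>.
• The <actual parameter spec> is written as
<formal parameter name>=<ordinary string>.
• The value of a formal parameter XYZ is determined by
the rule of keyword association:
• Step-1: Find the actual parameter
specification which has the form XYZ=<ordinary string>.
• Step-2: Let <ordinary string> in the specification be the string
ABC. Then the value of formal parameter XYZ
is ABC.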
Default specification of parameters
• A default is a standard assumption in the
absence of an explicit specification by
the programmer.
• Default specification of parameters is useful
in situations where a parameter has the same
value in most calls.
• When desired value is different from the
default value, the desired value can be
specified explicitly in a macro call.
Example
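• Calls on the macro INCR_D:
    INCR_D MEM_VAL=A, INCR_VAL=B
    INCR_D INCR_VAL=B, MEM_VAL=A
    INCR_D INCR_VAL=B, MEM_VAL=A, REG=BREG
• The first two calls use the default REG=AREG and differ only in
the order of the keyword parameters. The third call overrides
the default with BREG.
• Macro definition:
    MACRO
    INCR_D &MEM_VAL=, &INCR_VAL=, &REG=AREG
    MOVER &REG, &MEM_VAL
    ADD &REG, &INCR_VAL
    MOVEM &REG, &MEM_VAL
    MEND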
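Keyword association with default values can be sketched as follows. The function name `associate_keywords` and the use of `None` to mark a parameter without a default are assumptions of this sketch, not part of the macro language.

```python
def associate_keywords(formal_defaults, actual_specs):
    """Bind keyword parameters by name, falling back to defaults from the prototype."""
    values = dict(formal_defaults)  # start from the defaults (None means no default)
    for spec in actual_specs:
        name, sep, value = spec.partition("=")
        name = name.strip()
        if not sep or name not in values:
            raise ValueError(f"not a keyword parameter of this macro: {spec!r}")
        values[name] = value.strip()  # an explicit specification overrides the default
    missing = [name for name, value in values.items() if value is None]
    if missing:
        raise ValueError(f"no value for parameters: {missing}")
    return values


# Prototype of INCR_D: only REG has a default value.
INCR_D_FORMALS = {"MEM_VAL": None, "INCR_VAL": None, "REG": "AREG"}
```

With this sketch, the calls `MEM_VAL=A, INCR_VAL=B` and `INCR_VAL=B, MEM_VAL=A` give the same bindings with REG bound to AREG, while adding `REG=BREG` binds REG to BREG.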
Macro with mixed parameter lists
• A macro may be defined to use both
positional and keyword parameters.
• All positional parameters must precede all
keyword parameters.
• Example:
• SUMUP A,B,G=20,H=X
• Here A and B are positional parameters, while
G and H are keyword parameters.
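The ordering rule for mixed parameter lists can be checked while the actual parameters of a call are being separated. The sketch below is illustrative only; `split_actuals` is a hypothetical helper name.

```python
def split_actuals(actual_specs):
    """Separate positional and keyword actual parameters, enforcing the ordering rule."""
    positional = []
    keyword = {}
    for spec in actual_specs:
        if "=" in spec:
            name, _, value = spec.partition("=")
            keyword[name.strip()] = value.strip()
        elif keyword:
            # A positional parameter may not follow a keyword parameter.
            raise ValueError(f"positional parameter {spec!r} follows a keyword parameter")
        else:
            positional.append(spec.strip())
    return positional, keyword
```

For the call SUMUP A,B,G=20,H=X this separates the positional parameters A and B from the keyword parameters G and H.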
Other uses of parameters
• So far, the model statements have used formal
parameters only in operand fields.
• Formal parameters can also appear in the
label and opcode fields of model statements.
Nested Macro Call
• A model statement in a macro may constitute a
call on another macro. Such calls are known as
nested macro calls.
• The macro containing the nested call is
called the outer macro.
• The called macro is called the inner macro.
• Expansion of nested macro calls follows the
last-in-first-out (LIFO) rule.
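The LIFO rule for nested calls can be sketched with an explicit stack of pending expansions. This is a minimal sketch under assumptions: only positional parameters are handled, and the macros COMPUTE and INCR with the bodies below are hypothetical examples.

```python
import re


def expand_call(statement, macros):
    """Expand a statement, finishing each inner macro before resuming the outer one."""
    output = []
    stack = [[statement]]  # pending statements; the innermost expansion is on top
    while stack:
        pending = stack[-1]
        if not pending:
            stack.pop()  # this expansion is complete, resume the enclosing one
            continue
        current = pending.pop(0)
        mnemonic, _, operand_text = current.partition(" ")
        if mnemonic in macros:
            formals, body = macros[mnemonic]
            actuals = [a.strip() for a in operand_text.split(",")]
            values = dict(zip(formals, actuals))
            expanded = [re.sub(r"&(\w+)", lambda m: values[m.group(1)], s) for s in body]
            stack.append(expanded)  # the most recent call is expanded first
        else:
            output.append(current)  # an ordinary assembly statement
    return output


# Hypothetical macro table: name -> (formal parameters, model statements).
MACROS = {
    "INCR": (["MEM_VAL", "INCR_VAL", "REG"],
             ["MOVER &REG, &MEM_VAL", "ADD &REG, &INCR_VAL", "MOVEM &REG, &MEM_VAL"]),
    "COMPUTE": (["FIRST", "SECOND"],
                ["MOVEM BREG, TMP", "INCR &FIRST, &SECOND, BREG", "MOVER BREG, TMP"]),
}
```

Expanding `COMPUTE X, Y` with this table emits the first model statement of COMPUTE, then the complete expansion of the inner INCR call, and only then the last statement of COMPUTE.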
Advanced Macro Facilities
• Advanced macro facilities are aimed at
supporting semantic expansion. They include:
• Facilities for alteration of flow of control
during expansion
• Expansion time variables
• Attributes of parameters
AIF
• An AIF statement has the syntax
AIF (<expression>) <sequencing symbol>
• where <expression> is a relational expression involving
ordinary strings, formal parameters and their
attributes, and expansion time variables.
• If the relational expression evaluates to true,
expansion time control is transferred to the
statement containing <sequencing symbol> in its label field.
AGO
• An AGO statement has the syntax AGO <sequencing symbol>
• It unconditionally transfers expansion time
control to the statement containing <sequencing symbol>
in its label field.
• An ANOP statement is written as <sequencing symbol> ANOP
• It simply has the effect of defining the
sequencing symbol.
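The effect of AIF, AGO and ANOP on the flow of control during expansion can be sketched with a small interpreter. Everything here is an assumption made for illustration: statements are modelled as (label, opcode, argument) tuples, the AIF condition is given as a Python function rather than parsed from text, and the INC instruction and the symbols .ONE and .DONE are hypothetical.

```python
import re


def expand_with_control(body, values):
    """Expand a macro body, obeying the AIF, AGO and ANOP preprocessor statements."""
    # Map each sequencing symbol to the index of the statement that carries it.
    labels = {label: i for i, (label, _, _) in enumerate(body) if label}
    output = []
    index = 0  # the statement currently being processed
    while index < len(body):
        _, opcode, argument = body[index]
        if opcode == "AIF":
            condition, target = argument
            if condition(values):
                index = labels[target]  # conditional transfer of expansion time control
                continue
        elif opcode == "AGO":
            index = labels[argument]  # unconditional transfer
            continue
        elif opcode == "ANOP":
            pass  # only defines the sequencing symbol
        else:
            operands = re.sub(r"&(\w+)", lambda m: values[m.group(1)], argument)
            output.append(f"{opcode} {operands}")  # ordinary model statement
        index += 1
    return output


# Hypothetical body of an INCR macro that adapts to the value of the increment.
INCR_BODY = [
    ("", "AIF", (lambda v: v["INCR_VAL"] == "1", ".ONE")),
    ("", "MOVER", "&REG, &MEM_VAL"),
    ("", "ADD", "&REG, &INCR_VAL"),
    ("", "MOVEM", "&REG, &MEM_VAL"),
    ("", "AGO", ".DONE"),
    (".ONE", "INC", "&MEM_VAL"),
    (".DONE", "ANOP", None),
]
```

With INCR_VAL bound to 1 the AIF condition holds and only the INC statement is generated. Otherwise the MOVER-ADD-MOVEM sequence is generated and AGO skips the INC statement.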
Expansion Time Variable
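• Expansion time variables (EVs) are variables which
can only be used during the expansion of
macro calls.
• A local EV is created for use only during a
particular macro call.
• A global EV exists across all macro calls
situated in a program and can be used in any
macro which has a declaration for it.
• Local and global EVs are created through declaration statements:
LCL <EV specification> [, <EV specification> ..]
GBL <EV specification> [, <EV specification> ..]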
Linker and Loader
• Linker
Loader
• In computer systems a loader is the part of an
operating system that is responsible for loading
programs and libraries.
• It is one of the essential stages in the process of
starting a program, as it places programs into
memory and prepares them for execution.
Cont.
Direct Linking Loader
• Direct Linking Loaders
• A direct linking loader is a general relocating loader and is the
most popular loading scheme presently used.
• This scheme has the advantage that it allows
the programmer to use multiple procedure segments and multiple data
segments.