Assembly Language

An assembly language is a low-level programming language designed for a specific
type of processor. It may be produced by compiling source code from a high-level
programming language (such as C/C++) but can also be written from scratch. Assembly
code can be converted to machine code using an assembler.

Since most compilers convert source code directly to machine code, software
developers often create programs without using assembly language. However, in some
cases, assembly code can be used to fine-tune a program. For example, a programmer
may write a specific process in assembly language to make sure it functions as
efficiently as possible.

While assembly languages differ between processor architectures, they often
include similar instructions and operators. Below are some examples of instructions
supported by x86 processors.

• MOV - move data from one location to another

• ADD - add two values

• SUB - subtract a value from another value

• PUSH - push data onto a stack

• POP - pop data from a stack

• JMP - jump to another location

• INT - interrupt a process

The following assembly code adds the numbers 3 and 4. The x86 ADD instruction takes
two operands and stores the sum in the first one:

mov eax, 3   ; loads 3 into the register "eax"
mov ebx, 4   ; loads 4 into the register "ebx"
add eax, ebx ; adds "ebx" to "eax" and stores the result (7) in "eax"
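
To make the effect of these instructions concrete, the following is a minimal Python sketch that simulates them on a toy register file. It assumes the two-operand form of ADD used by x86, where the sum is written back into the first operand. The `run` helper, the tuple-based program format, and the dictionary of registers are illustrative assumptions, not part of any real assembler or CPU.

```python
def run(program):
    """Execute a list of (mnemonic, dest, src) tuples on a toy register file."""
    regs = {"eax": 0, "ebx": 0, "ecx": 0, "edx": 0}

    def value(operand):
        # An operand is either a register name (str) or an integer immediate.
        return regs[operand] if isinstance(operand, str) else operand

    for mnemonic, dest, src in program:
        if mnemonic == "mov":
            regs[dest] = value(src)
        elif mnemonic == "add":
            # Results wrap around at 32 bits, like a real 32-bit register.
            regs[dest] = (regs[dest] + value(src)) & 0xFFFFFFFF
        elif mnemonic == "sub":
            regs[dest] = (regs[dest] - value(src)) & 0xFFFFFFFF
        else:
            raise ValueError(f"unsupported instruction: {mnemonic}")
    return regs


registers = run([("mov", "eax", 3), ("mov", "ebx", 4), ("add", "eax", "ebx")])
print(registers["eax"])  # 7
```

After the three instructions, "eax" holds 7 and "ebx" still holds 4, because ADD only overwrites its first operand.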

Writing assembly language is a tedious process since each operation must be
performed at a very basic level. While it may not be necessary to use assembly code to
create a computer program, learning assembly language is often part of a Computer
Science curriculum since it provides useful insight into the way processors work.

List of Useful and Frequently Used DOS Commands

This list of DOS commands is useful when repairing Windows after a system crash,
when Windows does not load and the only option available is a DOS command
prompt. Use the “help” command to find the usage and details of any particular
command, e.g. C:\>help copy

CHDIR – Displays the name of or changes the current directory.

CHKDSK – Checks a disk and displays a status report.

CLS – Clears the screen.

COMP – Compares two groups of files to find information that does not match.

COPY – Copies and appends files.

DATE – Displays and/or sets the system date.

DEFRAG – Optimizes disk performance by reorganizing the files on the disk.

DEL – Deletes files from disk.

DELTREE – Deletes a directory including all files and subdirectories that are in it.

DIR – Displays directory of files and directories stored on disk.

DISKCOMP – Compares the contents of two diskettes.

ECHO – Displays messages or turns on or off the display of commands in a batch file.

EDIT – Starts the MS-DOS editor, a text editor used to create and edit ASCII text files.

EXIT – Exits a secondary command processor.

EXPAND – Expands a compressed file.

FASTHELP – Displays a list of DOS commands with a brief explanation of each.

FIND – Finds and reports the location of a specific string of text characters in one or
more files.

FOR – Performs repeated execution of commands (for both batch processing and
interactive processing).

FORMAT – Formats a disk to accept DOS files.

GRAPHICS – Provides a way to print contents of a graphics screen display.

IF – Allows for conditional operations in batch processing.


LABEL – Creates, changes, or deletes a volume label for a disk.

MEM – Displays the amount of installed and available memory, including extended,
expanded, and upper memory.

MKDIR – Creates a new subdirectory.

MORE – Sends output to console, one screen at a time.

MOVE – Moves one or more files to the location you specify. Can also be used to
rename directories.

PATH – Sets or displays directories that will be searched for programs not in the
current directory.

RENAME – Changes the filename under which a file is stored.

RMDIR – Removes a subdirectory.

SORT – Sorts input and sends it to the screen or to a file.

XCOPY – Copies directories, subdirectories, and files.

Assembler
An assembler is a program that converts assembly language into machine code. It takes
the basic commands and operations from assembly code and converts them
into binary code that can be recognized by a specific type of processor.
Assemblers are similar to compilers in that they produce executable code. However,
assemblers are simpler since they only convert low-level code (assembly
language) to machine code. Since each assembly language is designed for a specific
processor, assembling a program is performed using a largely one-to-one mapping from
assembly instructions to machine instructions. Compilers, on the other hand, must
convert generic high-level source code into machine code for a specific processor.

Most programs are written in high-level programming languages and are compiled
directly to machine code using a compiler. However, in some cases, assembly code may
be used to customize functions and ensure they perform in a specific way.
Therefore, IDEs often include assemblers so they can build programs from both
high-level and low-level languages.
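
The one-to-one mapping described above can be sketched in a few lines of Python. This is only a toy: it handles a handful of operand-free 8086 instructions whose single-byte opcodes are well known (for example NOP = 0x90 and HLT = 0xF4), and the `OPCODES` table and `assemble` function are hypothetical names chosen for illustration. A real assembler also has to deal with operands, labels, addressing modes, and instructions that have several encodings.

```python
# Single-byte, operand-free 8086 instructions and their opcodes.
OPCODES = {
    "NOP": 0x90, "HLT": 0xF4, "CMC": 0xF5,
    "CLC": 0xF8, "STC": 0xF9, "CLI": 0xFA,
    "STI": 0xFB, "CLD": 0xFC, "STD": 0xFD,
}


def assemble(source):
    """Translate one mnemonic per line into one byte of machine code."""
    machine_code = bytearray()
    for line_number, line in enumerate(source.splitlines(), start=1):
        text = line.split(";", 1)[0].strip()  # drop comments and padding
        if not text:
            continue  # blank or comment-only line
        mnemonic = text.upper()
        if mnemonic not in OPCODES:
            raise ValueError(f"line {line_number}: unknown instruction {text!r}")
        machine_code.append(OPCODES[mnemonic])
    return bytes(machine_code)


program = """
cli        ; disable interrupts
cld        ; clear the direction flag
nop
hlt
"""
print(assemble(program).hex())  # fafc90f4
```

Each source line becomes exactly one byte here, which is why an assembler can be so much simpler than a compiler.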

Preprocessor Directives
Preprocessor directives are lines in a program that begin with the character #,
which distinguishes them from ordinary source code text. They are processed
before the rest of the program is compiled. In C and C++, the preprocessor
changes the text of the source code, and the result is new source code without
these directives.
Although preprocessing in C# is conceptually similar to that in C/C++, it differs
in two respects. First, preprocessing in C# does not involve a separate
preprocessor step before compilation; directives are processed as part of the
lexical analysis phase. Second, directives cannot be used to create macros. In
addition, the new directives #region and #endregion have been added in C#, and
some directives used earlier have been dropped (#include is a notable example,
whose role is taken over by "using" to import namespaces from referenced assemblies).
Java does not support preprocessor directives.

A preprocessor directive is usually placed at the top of the source code on a separate
line beginning with the character "#", followed by the directive name; white space is
allowed before the "#" and between the "#" and the directive name. Because a directive
must fit on one line and cannot continue onto the following line, any comment on the
same line must be a single-line comment (//); delimited comments (/* */) cannot be
used. A preprocessor directive must not end with a semicolon (;). Conditional
compilation symbols can be defined in source code or on the command line as an
argument during compilation.

Examples of preprocessor directives that can be used in C# include:

• #define and #undef: To define and undefine conditional compilation
symbols, respectively. These symbols can be checked during compilation
so that only the required section of source code is compiled. The scope of a
symbol is the file in which it is defined.

• #if, #elif, #else, and #endif: To skip part of source code based on conditions.
Conditional sections may be nested, with directives forming complete sets.

• #line: To control the line numbers generated for errors and warnings. This is
mostly used by meta-programming tools that generate C# source code from
some text input. It is generally used to modify the line numbers and source
file names reported by the compiler in its output.

• #error and #warning: To generate errors and warnings, respectively.
#error is used to stop compilation, while #warning is used to continue
compilation with messages in the console.

• #region and #endregion: To explicitly mark sections of source code. These
allow expansion and collapse inside Visual Studio for better readability and
reference.
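
The way #define, #if, #else, and #endif select which lines reach the compiler can be illustrated with a small Python sketch. This is not how the C# compiler is implemented (it handles directives during lexical analysis, as noted above); it is a simplified text filter under stated assumptions: `preprocess` is a hypothetical helper, #if accepts only a single symbol name rather than a full expression, and #elif is not supported.

```python
def preprocess(source, predefined=()):
    """Toy conditional-compilation filter for #define, #undef, #if, #else, #endif."""
    symbols = set(predefined)
    active_stack = []  # one boolean per open #if: is its current branch selected?
    output = []
    for line in source.splitlines():
        words = line.split()
        directive = words[0] if words and words[0].startswith("#") else None
        active = all(active_stack)  # a line is live only if every enclosing branch is
        if directive == "#define":
            if active:
                symbols.add(words[1])
        elif directive == "#undef":
            if active:
                symbols.discard(words[1])
        elif directive == "#if":
            active_stack.append(words[1] in symbols)
        elif directive == "#else":
            active_stack[-1] = not active_stack[-1]
        elif directive == "#endif":
            active_stack.pop()
        elif active:
            output.append(line)  # ordinary code (or an unhandled directive) passes through
    return "\n".join(output)


source = """#define DEBUG
#if DEBUG
log("debug build");
#else
log("release build");
#endif
run();"""
print(preprocess(source))
```

With DEBUG defined, only the first log line and run() survive; removing the #define line selects the #else branch instead.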

Original 8086/8088 instruction set

| Instruction | Meaning | Notes | Opcode |
|---|---|---|---|
| AAA | ASCII adjust AL after addition | used with unpacked binary coded decimal | 0x37 |
| AAD | ASCII adjust AX before division | 8086/8088 datasheet documents only base 10 version of the AAD instruction (opcode 0xD5 0x0A), but any other base will work. Later Intel's documentation has the generic form too. NEC V20 and V30 (and possibly other NEC V-series CPUs) always use base 10, and ignore the argument, causing a number of incompatibilities | 0xD5 |
| AAM | ASCII adjust AX after multiplication | Only base 10 version (operand is 0xA) is documented, see notes for AAD | 0xD4 |
| AAS | ASCII adjust AL after subtraction | | 0x3F |
| ADC | Add with carry | destination := destination + source + carry_flag | 0x10…0x15, 0x80/2…0x83/2 |
| ADD | Add | (1) r/m += r/imm; (2) r += m/imm; | 0x00…0x05, 0x80/0…0x83/0 |
| AND | Logical AND | (1) r/m &= r/imm; (2) r &= m/imm; | 0x20…0x25, 0x80/4…0x83/4 |
| CALL | Call procedure | push eip; eip points to the instruction directly after the call | 0x9A, 0xE8, 0xFF/2, 0xFF/3 |
| CBW | Convert byte to word | | 0x98 |
| CLC | Clear carry flag | CF = 0; | 0xF8 |
| CLD | Clear direction flag | DF = 0; | 0xFC |
| CLI | Clear interrupt flag | IF = 0; | 0xFA |
| CMC | Complement carry flag | | 0xF5 |
| CMP | Compare operands | | 0x38…0x3D, 0x80/7…0x83/7 |
| CMPSB | Compare bytes in memory | | 0xA6 |
| CMPSW | Compare words | | 0xA7 |
| CWD | Convert word to doubleword | | 0x99 |
| DAA | Decimal adjust AL after addition | (used with packed binary coded decimal) | 0x27 |
| DAS | Decimal adjust AL after subtraction | | 0x2F |
| DEC | Decrement by 1 | | 0x48…0x4F, 0xFE/1, 0xFF/1 |
| DIV | Unsigned divide | DX:AX = DX:AX / r/m; resulting DX == remainder | 0xF6/6, 0xF7/6 |
| ESC | Used with floating-point unit | | 0xD8…0xDF |
| HLT | Enter halt state | | 0xF4 |
| IDIV | Signed divide | DX:AX = DX:AX / r/m; resulting DX == remainder | 0xF6/7, 0xF7/7 |
| IMUL | Signed multiply | (1) DX:AX = AX * r/m; (2) AX = AL * r/m | 0x69, 0x6B (both since 80186), 0xF6/5, 0xF7/5, 0x0FAF (since 80386) |
| IN | Input from port | (1) AL = port[imm]; (2) AL = port[DX]; (3) AX = port[imm]; (4) AX = port[DX]; | 0xE4, 0xE5, 0xEC, 0xED |
| INC | Increment by 1 | | 0x40…0x47, 0xFE/0, 0xFF/0 |
| INT | Call to interrupt | | 0xCC, 0xCD |
| INTO | Call to interrupt if overflow | | 0xCE |
| IRET | Return from interrupt | | 0xCF |
| Jcc | Jump if condition | (JA, JAE, JB, JBE, JC, JE, JG, JGE, JL, JLE, JNA, JNAE, JNB, JNBE, JNC, JNE, JNG, JNGE, JNL, JNLE, JNO, JNP, JNS, JNZ, JO, JP, JPE, JPO, JS, JZ) | 0x70…0x7F, 0x0F80…0x0F8F (since 80386) |
| JCXZ | Jump if CX is zero | | 0xE3 |
| JMP | Jump | | 0xE9…0xEB, 0xFF/4, 0xFF/5 |
| LAHF | Load FLAGS into AH register | | 0x9F |
| LDS | Load pointer using DS | | 0xC5 |
| LEA | Load Effective Address | | 0x8D |
| LES | Load ES with pointer | | 0xC4 |
| LOCK | Assert BUS LOCK# signal | (for multiprocessing) | 0xF0 |
| LODSB | Load string byte | if (DF==0) AL = *SI++; else AL = *SI--; | 0xAC |
| LODSW | Load string word | if (DF==0) AX = *SI++; else AX = *SI--; | 0xAD |
| LOOP/LOOPx | Loop control | (LOOPE, LOOPNE, LOOPNZ, LOOPZ) if (x && --CX) goto lbl; | 0xE0…0xE2 |
| MOV | Move | copies data from one location to another, (1) r/m = r; (2) r = r/m; | 0xA0…0xA3 |
| MOVSB | Move byte from string to string | if (DF==0) *(byte*)DI++ = *(byte*)SI++; else *(byte*)DI-- = *(byte*)SI--; | 0xA4 |
| MOVSW | Move word from string to string | if (DF==0) *(word*)DI++ = *(word*)SI++; else *(word*)DI-- = *(word*)SI--; | 0xA5 |
| MUL | Unsigned multiply | (1) DX:AX = AX * r/m; (2) AX = AL * r/m; | 0xF6/4…0xF7/4 |
| NEG | Two's complement negation | r/m *= -1; | 0xF6/3…0xF7/3 |
| NOP | No operation | opcode equivalent to XCHG EAX, EAX | 0x90 |
| NOT | Negate the operand, logical NOT | r/m ^= -1; | 0xF6/2…0xF7/2 |
| OR | Logical OR | (1) r/m \|= r/imm; (2) r \|= m/imm; | 0x08…0x0D, 0x80…0x83/1 |
| OUT | Output to port | (1) port[imm] = AL; (2) port[DX] = AL; (3) port[imm] = AX; (4) port[DX] = AX; | 0xE6, 0xE7, 0xEE, 0xEF |
| POP | Pop data from stack | r/m = *SP++; POP CS (opcode 0x0F) works only on 8086/8088. Later CPUs use 0x0F as a prefix for newer instructions. | 0x07, 0x0F (8086/8088 only), 0x17, 0x1F, 0x58…0x5F, 0x8F/0 |
| POPF | Pop FLAGS register from stack | FLAGS = *SP++; | 0x9D |
| PUSH | Push data onto stack | *--SP = r/m; | 0x06, 0x0E, 0x16, 0x1E, 0x50…0x57, 0x68, 0x6A (both since 80186), 0xFF/6 |
| PUSHF | Push FLAGS onto stack | *--SP = FLAGS; | 0x9C |
| RCL | Rotate left (with carry) | | 0xC0…0xC1/2 (since 80186), 0xD0…0xD3/2 |
| RCR | Rotate right (with carry) | | 0xC0…0xC1/3 (since 80186), 0xD0…0xD3/3 |
| REPxx | Repeat MOVS/STOS/CMPS/LODS/SCAS | (REP, REPE, REPNE, REPNZ, REPZ) | 0xF2, 0xF3 |
| RET | Return from procedure | Not a real instruction. The assembler will translate these to a RETN or a RETF depending on the memory model of the target system. | |
| RETN | Return from near procedure | | 0xC2, 0xC3 |
| RETF | Return from far procedure | | 0xCA, 0xCB |
| ROL | Rotate left | | 0xC0…0xC1/0 (since 80186), 0xD0…0xD3/0 |
| ROR | Rotate right | | 0xC0…0xC1/1 (since 80186), 0xD0…0xD3/1 |
| SAHF | Store AH into FLAGS | | 0x9E |
| SAL | Shift Arithmetically left (signed shift left) | (1) r/m <<= 1; (2) r/m <<= CL; | 0xC0…0xC1/4 (since 80186), 0xD0…0xD3/4 |
| SAR | Shift Arithmetically right (signed shift right) | (1) (signed) r/m >>= 1; (2) (signed) r/m >>= CL; | 0xC0…0xC1/7 (since 80186), 0xD0…0xD3/7 |
| SBB | Subtraction with borrow | alternative 1-byte encoding of SBB AL, AL is available via undocumented SALC instruction | 0x18…0x1D, 0x80…0x83/3 |
| SCASB | Compare byte string | | 0xAE |
| SCASW | Compare word string | | 0xAF |
| SHL | Shift left (unsigned shift left) | | 0xC0…0xC1/4 (since 80186), 0xD0…0xD3/4 |
| SHR | Shift right (unsigned shift right) | | 0xC0…0xC1/5 (since 80186), 0xD0…0xD3/5 |
| STC | Set carry flag | CF = 1; | 0xF9 |
| STD | Set direction flag | DF = 1; | 0xFD |
| STI | Set interrupt flag | IF = 1; | 0xFB |
| STOSB | Store byte in string | if (DF==0) *ES:DI++ = AL; else *ES:DI-- = AL; | 0xAA |
| STOSW | Store word in string | if (DF==0) *ES:DI++ = AX; else *ES:DI-- = AX; | 0xAB |
| SUB | Subtraction | (1) r/m -= r/imm; (2) r -= m/imm; | 0x28…0x2D, 0x80…0x83/5 |
| TEST | Logical compare (AND) | (1) r/m & r/imm; (2) r & m/imm; | 0x84, 0x85, 0xA8, 0xA9, 0xF6/0, 0xF7/0 |
| WAIT | Wait until not busy | Waits until BUSY# pin is inactive (used with floating-point unit) | 0x9B |
| XCHG | Exchange data | r :=: r/m; A spinlock typically uses xchg as an atomic operation. (coma bug). | 0x86, 0x87, 0x91…0x97 |
| XLAT | Table look-up translation | behaves like MOV AL, [BX+AL] | 0xD7 |
| XOR | Exclusive OR | (1) r/m ^= r/imm; (2) r ^= m/imm; | 0x30…0x35, 0x80…0x83/6 |
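
The Notes column describes the string instructions in C-like pseudocode: MOVSB copies one byte from the address in SI to the address in DI, then moves both pointers forward or backward depending on the direction flag (DF), and the REP prefix repeats this CX times. The Python sketch below imitates that behaviour. It is a simplification under stated assumptions: memory is a single flat `bytearray`, segment registers (DS and ES) are ignored, and `rep_movsb` is a hypothetical helper name.

```python
def rep_movsb(memory, si, di, cx, df=0):
    """Copy cx bytes from memory[si] to memory[di], one MOVSB per iteration.

    Returns the final (si, di, cx) values, as the registers would hold them.
    """
    step = 1 if df == 0 else -1  # DF == 0 walks upward, DF == 1 walks downward
    while cx != 0:
        memory[di] = memory[si]
        si = (si + step) & 0xFFFF  # 16-bit registers wrap around
        di = (di + step) & 0xFFFF
        cx -= 1
    return si, di, cx


memory = bytearray(16)
memory[0:5] = b"HELLO"
si, di, cx = rep_movsb(memory, si=0, di=8, cx=5)
print(memory[8:13].decode())  # HELLO
print(si, di, cx)  # 5 13 0
```

With DF set to 1, the same loop starts at the last byte of each buffer and walks downward, which is how overlapping regions can be copied safely when the destination lies above the source.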
