Assembly Introduction

NOTE: This page is currently Under Construction =)



If you're new to assembly, coming back to it, or refreshing or expanding your knowledge, a good place to begin is with the basics.

Notes

Before you begin, please take note of a few conventions used in this text. When we are talking about numbers, we may precede the number with (bin), (dec), or (hex), which indicates a BINary, DECimal, or HEXadecimal value.

Assembly numbers

Assembly code is usually written and read with numbers in hexadecimal format. Hexadecimal is a base 16 numbering system. Each digit ranges from 0 to F: 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F. For example, (dec)15 = (hex)F; (dec)16 = (hex)10, which is 1 'group' of 16 and no remainder; (dec)17 = (hex)11; and (dec)31 = (hex)1F.
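The conversions above can be checked with a few lines of Python. This is only a sketch for experimenting with the notation; the helper names to_hex and from_hex are made up for this example and are not part of any assembler.

```python
def to_hex(value):
    """Return the hexadecimal digits of a non-negative integer (uppercase, no prefix)."""
    return format(value, "X")


def from_hex(digits):
    """Parse a string of hexadecimal digits into an integer."""
    return int(digits, 16)


# Reproduce the examples from the text.
for dec in (15, 16, 17, 31):
    print(f"(dec){dec} = (hex){to_hex(dec)}")
# (dec)15 = (hex)F
# (dec)16 = (hex)10
# (dec)17 = (hex)11
# (dec)31 = (hex)1F
```

Each hexadecimal digit stands for exactly four binary digits, which is why hexadecimal is a convenient shorthand for register and memory contents.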

Registers

Registers are internal memory locations within the processor itself. Rather than forcing all variables to be loaded from RAM, processors provide registers, which are small chunks of internal memory that can be accessed very quickly, usually without any performance penalty. The drawback of registers is that there are only a few of them. IA-32 processors provide only eight generic 32-bit registers. There are more registers, but they exist for specific purposes and should not be used as generic storage. Registers are used mostly for local variables or for caching frequently used data within a local function block or procedure; long-term storage is handled in RAM, which we'll address later.

Generic IA-32 Registers

EAX, EBX, EDX: Generic registers that can be used for any integer, boolean, logical or memory operation.
ECX: Generic register, commonly used by repetitive instructions that require counting.
ESI / EDI: Generic, frequently used as source or destination pointers in instructions that copy memory. (SI stands for Source Index, DI stands for Destination Index.)
EBP: Stack base pointer, used to define a stack frame in conjunction with the stack pointer. A stack frame is the current function's stack zone, which is used for access to local variables and parameters passed to the current function. It is the area of allocated memory between the stack pointer (ESP) and the base pointer (EBP). The base pointer usually refers to the stack position right after the return address for the current function.
ESP: Stack pointer that stores the current position in the stack for pushing variables onto the stack.

Flags

IA32 processors have a special register called EFLAGS. EFLAGS contains status and system flags. We're interested in the status flags, which will be discussed in depth later on.

Instruction Format

Instructions in IA-32 assembly generally consist of an opcode and one or more parameters, although some opcodes take no parameters. Opcodes are operation codes, which instruct the processor to perform a task.

Before we delve into all the various opcodes, it is a good idea to take a look at the aforementioned flags. As mentioned, EFLAGS contains system and status flags; we're interested in the status flags.

Arithmetic Flags

The overflow flags are CF (carry flag) and OF (overflow flag). There are two overflow flags to differentiate between signed and unsigned operations, because signed integers have one bit less for the magnitude than their equivalent-sized unsigned counterparts (1 bit is used to hold the sign). OF represents overflows in signed operands and CF represents overflows in unsigned operands.
ZF (zero flag) is set when the result of an arithmetic operation is zero.
SF (sign flag) receives the value of the most significant bit of the result of an arithmetic operation. For signed integers, this represents the sign of the result; 1 denotes negative, and 0 denotes positive (or zero).
PF (parity flag) is a rarely used flag that denotes the binary parity of the lower 8 bits of a given arithmetic result. Binary parity means that the flag reports the parity of the number of bits set, as opposed to the actual numeric parity of the result. For example, (bin)101 = (dec)5 is numerically odd, but it has two set bits, so the parity flag is set to 1. 1 denotes an even number of set bits while 0 denotes an odd number of set bits in the lower 8 bits of the result.
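To make the flag descriptions concrete, here is a small Python model of how a 32-bit ADD would set them. It is a sketch, not processor documentation: the function name add32_flags and the dictionary layout are invented for this example, and only the five flags described above are modeled.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within 32 bits


def add32_flags(a, b):
    """Add two 32-bit values; return (result, flags) as a 32-bit ADD would leave them."""
    a &= MASK32
    b &= MASK32
    full = a + b                 # Python integers do not wrap, so the carry stays visible
    result = full & MASK32
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    flags = {
        "CF": int(full > MASK32),                           # unsigned overflow
        "OF": int(sign_a == sign_b and sign_r != sign_a),   # signed overflow
        "ZF": int(result == 0),                             # result is zero
        "SF": sign_r,                                       # most significant bit
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),  # even count of set bits in low byte
    }
    return result, flags


# Unsigned overflow only: (hex)FFFFFFFF + 1 wraps to 0, so CF and ZF are set.
print(add32_flags(0xFFFFFFFF, 1))
# Signed overflow only: the largest positive value + 1 turns negative, so OF and SF are set.
print(add32_flags(0x7FFFFFFF, 1))
```

The two calls show why both overflow flags exist: the same addition can overflow when the operands are read as unsigned but not as signed, and the other way around.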

Instructions (Opcodes [operation codes])

Now it's time to look at some instructions.

| OpCode | Param 1 | Param 2 | Description / other params |
| --- | --- | --- | --- |
| MOV | Destination | Source | Move opcode. Instructs the processor to copy the source into the destination (e.g., MOV EAX, 0Ch copies (hex)C, which is (dec)12, into the EAX register). |
| CMP | Operand1 | Operand2 | Comparison opcode. Instructs the processor to subtract Operand2 from Operand1, discard the result, and set the EFLAGS to indicate the result of the operation. |
| JMP | Label | N/A | Instructs the processor to jump execution to the specified label. |
| [Label]: | N/A | N/A | Specifies a label for a jump point. |
| Jxx | Label | N/A | Conditional jump codes; jumps to the label only if the condition holds. Refer to the tables below for all jump codes. |
| SETxx | Operand | N/A | Set Byte on Condition; tests the EFLAGS as per the conditional jump, except that instead of jumping, the result (1 if the condition holds, 0 otherwise) is stored in the byte-sized operand. Refer to the tables below for all condition codes. |
| CMOVxx | Operand1 | Operand2 | Conditional Move; tests the EFLAGS as per the conditional jump, and copies data from Operand2 to Operand1 if the condition holds. Refer to the tables below for all condition codes. |
| ADD | Operand1 | Operand2 | Adds two signed or unsigned integers. The result is stored in Operand1. |
| ADC | Operand1 | Operand2 | Adds two signed or unsigned integers plus the value of the CF flag. The result is stored in Operand1.* |
| INC | Operand | N/A | Increments the operand by 1. |
| SUB | Operand1 | Operand2 | Subtracts the value at Operand2 from the value at Operand1. The result is stored in Operand1. Applies to signed and unsigned operands. |
| SBB | Operand1 | Operand2 | Subtracts the value at Operand2 and the value of the CF flag from the value at Operand1. The result is stored in Operand1. Applies to signed and unsigned operands.* |
| MUL | Operand | N/A | Multiplies the unsigned operand by EAX and stores the result in a 64-bit value in EDX:EAX. This means that the low (least significant) 32 bits are stored in EAX and the high (most significant) 32 bits are stored in EDX. |
| DIV | Operand | N/A | Divides the unsigned 64-bit value stored in EDX:EAX by the unsigned operand. Stores the quotient in EAX and the remainder in EDX. |
| IMUL | Operand | N/A | Multiplies the signed operand by EAX and stores the result in a 64-bit value in EDX:EAX. |
| IDIV | Operand | N/A | Divides the signed 64-bit value stored in EDX:EAX by the signed operand. Stores the quotient in EAX and the remainder in EDX. |
| MOVZX | Destination | Source | Copies the smaller source operand into the larger destination and zero extends it on the way. (Zero extending means that the source operand is copied into the destination operand and the remaining most significant bits are set to zero, regardless of the source operand's value.)* ** |
| MOVSX | Destination | Source | Same as above, except that instead of zero extending it performs sign extending when enlarging the integer.* ** |
| CDQ | N/A | N/A | Copies a signed 32-bit integer in EAX to a 64-bit sign-extended integer in EDX:EAX.* *** |

* indicates 64-bit arithmetic operations.
** indicates integer conversion, from 8-bit to 16-bit or 32-bit; 16-bit to 32-bit.
*** indicates converting a signed 32-bit integer to a signed 64-bit integer.
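The EDX:EAX pairing and the two kinds of extension are easier to follow with concrete numbers. The Python functions below are a rough model of the behavior described in the table, written for this example only; the function names and the tuple return values are assumptions of the sketch, not assembler syntax. On a real processor, DIV raises a divide error when the divisor is zero or the quotient does not fit in EAX; the model raises a Python exception in those cases.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within 32 bits


def mul32(eax, operand):
    """Model MUL: unsigned EAX * operand, returned as (EDX, EAX)."""
    product = (eax & MASK32) * (operand & MASK32)
    return (product >> 32) & MASK32, product & MASK32


def div32(edx, eax, operand):
    """Model DIV: unsigned EDX:EAX / operand, returned as (quotient, remainder)."""
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, operand & MASK32)  # ZeroDivisionError on 0
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in EAX")
    return quotient, remainder


def movzx(value, bits):
    """Model MOVZX: widen a bits-wide value to 32 bits, filling the top with zeros."""
    return value & ((1 << bits) - 1)


def movsx(value, bits):
    """Model MOVSX: widen a bits-wide value to 32 bits, copying the sign bit upward."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):                    # sign bit of the small operand is set
        value |= MASK32 & ~((1 << bits) - 1)   # fill the upper bits with ones
    return value


def cdq(eax):
    """Model CDQ: sign-extend EAX into EDX:EAX, returned as (EDX, EAX)."""
    eax &= MASK32
    return (MASK32 if eax >> 31 else 0), eax


def add64(lo_a, hi_a, lo_b, hi_b):
    """Model a 64-bit addition: ADD on the low halves, then ADC on the high halves."""
    low = (lo_a & MASK32) + (lo_b & MASK32)
    carry = low >> 32                          # what ADD would leave in CF
    high = ((hi_a & MASK32) + (hi_b & MASK32) + carry) & MASK32
    return low & MASK32, high


print(mul32(0xFFFFFFFF, 2))             # high half lands in EDX, low half in EAX
print(hex(movzx(0x80, 8)), hex(movsx(0x80, 8)))
print(add64(0xFFFFFFFF, 0, 1, 0))       # the carry moves from the low half to the high half
```

The add64 function shows why ADC and SBB are marked as 64-bit arithmetic: the carry produced by the low 32 bits has to be fed into the operation on the high 32 bits.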
Conditional Code Constructs (use the code in parentheses to replace xx in Jxx above).

Signed Conditional Codes for CMP and SUB Instructions.
Note: The logical operators used in the EFLAGS conditions are listed below.

In each entry, Operand1 is X and Operand2 is Y.

Mnemonics: Greater (G), Not Less or Equal (NLE)
EFLAGS: ZF=0 && ((OF=0 && SF=0) || (OF=1 && SF=1))
Mathematical representation: X > Y
Comments: ZF confirms the operands are unequal; SF and OF together check for a positive result without an overflow, or a negative result with an overflow.

Mnemonics: Greater or Equal (GE)
EFLAGS: (OF=0 && SF=0) || (OF=1 && SF=1)
Mathematical representation: X >= Y
Comments: Same as above, except ZF is not tested to allow for equal operands.

Mnemonics: Less (L), Not Greater or Equal (NGE)
EFLAGS: (OF=1 && SF=0) || (OF=0 && SF=1)
Mathematical representation: X < Y
Comments: SF and OF differ: either we have a positive result with an overflow, or a negative result with no overflow. In both cases X was lower than Y.

Mnemonics: Less or Equal (LE), Not Greater (NG)
EFLAGS: ZF=1 || ((OF=1 && SF=0) || (OF=0 && SF=1))
Mathematical representation: X <= Y
Comments: Same as above, except ZF is tested to allow for equal operands.
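The signed conditions all reduce to one question: do SF and OF agree? The Python sketch below models CMP on 32-bit operands and evaluates the four signed conditions from the resulting flags. The names cmp_flags, signed_conditions, and to_signed are invented for this example, and the model covers only the flags needed here.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within 32 bits


def cmp_flags(x, y):
    """Model CMP x, y: compute x - y in 32 bits and keep only ZF, SF, and OF."""
    x &= MASK32
    y &= MASK32
    result = (x - y) & MASK32
    sign_x, sign_y, sign_r = x >> 31, y >> 31, result >> 31
    return {
        "ZF": int(result == 0),
        "SF": sign_r,
        # Signed overflow on subtraction: the operands have different signs
        # and the result's sign differs from the first operand's sign.
        "OF": int(sign_x != sign_y and sign_r != sign_x),
    }


def signed_conditions(flags):
    """Evaluate the signed condition codes from a set of flags."""
    same = flags["SF"] == flags["OF"]
    return {
        "G": flags["ZF"] == 0 and same,
        "GE": same,
        "L": not same,
        "LE": flags["ZF"] == 1 or not same,
    }


def to_signed(value):
    """Interpret a 32-bit pattern as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value >> 31 else value


# (hex)FFFFFFFF is -1 when read as signed, so it is Less than 1.
print(signed_conditions(cmp_flags(0xFFFFFFFF, 1)))
```

Comparing (hex)7FFFFFFF with (hex)FFFFFFFF is a useful second test: the subtraction overflows and the result looks negative, yet SF and OF agree, so the Greater condition still holds.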



Unsigned Conditional Codes for CMP and SUB Instructions.

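In each entry, Operand1 is X and Operand2 is Y.

Mnemonics: Above (A), Not Below or Equal (NBE)
EFLAGS: CF=0 && ZF=0
Mathematical representation: X > Y
Comments: CF confirms no carry and ZF confirms unequal.

Mnemonics: Above or Equal (AE), Not Below (NB), Not Carry (NC)
EFLAGS: CF=0
Mathematical representation: X >= Y
Comments: Same as above, except ZF is not tested to allow for equal operands.

Mnemonics: Below (B), Not Above or Equal (NAE), Carry (C)
EFLAGS: CF=1
Mathematical representation: X < Y
Comments: CF confirms a carry (borrow), so X was below Y.

Mnemonics: Below or Equal (BE), Not Above (NA)
EFLAGS: CF=1 || ZF=1
Mathematical representation: X <= Y
Comments: Same as above, except ZF is tested to allow for equal operands.

Mnemonics: Equal (E), Zero (Z)
EFLAGS: ZF=1
Mathematical representation: X = Y
Comments: ZF confirms the operands are equal.

Mnemonics: Not Equal (NE), Not Zero (NZ)
EFLAGS: ZF=0
Mathematical representation: X != Y
Comments: ZF confirms the operands are unequal.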
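The unsigned conditions need only CF and ZF. The Python sketch below models the two flags after CMP on 32-bit operands and evaluates each unsigned condition code from them. As before, this is an illustration under stated assumptions: the names cmp_flags and unsigned_conditions are made up for this example.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within 32 bits


def cmp_flags(x, y):
    """Model CMP x, y on 32-bit operands; return the CF and ZF it would set."""
    x &= MASK32
    y &= MASK32
    difference = x - y   # Python integers do not wrap, so a borrow shows up as a negative value
    return {
        "CF": int(difference < 0),                # borrow: X was below Y
        "ZF": int((difference & MASK32) == 0),    # zero result: X equals Y
    }


def unsigned_conditions(flags):
    """Evaluate the unsigned condition codes from CF and ZF."""
    cf, zf = flags["CF"], flags["ZF"]
    return {
        "A": cf == 0 and zf == 0,
        "AE": cf == 0,
        "B": cf == 1,
        "BE": cf == 1 or zf == 1,
        "E": zf == 1,
        "NE": zf == 0,
    }


# Read as unsigned, (hex)FFFFFFFF is the largest 32-bit value, so 1 is Below it.
print(unsigned_conditions(cmp_flags(1, 0xFFFFFFFF)))
```

The same bit pattern (hex)FFFFFFFF compares as the largest value here but as -1 under the signed codes, which is why picking the matching family of condition codes after CMP matters.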

Logical Operators

= Assignment operator. In the EFLAGS conditions above, it states the value of a flag (e.g., ZF=1 means ZF is set).
== Equality operator.
!= Non-equality operator.
> Greater than operator.
>= Greater than or equal to operator.
< Less than operator.
<= Less than or equal to operator.
&& Logical AND operator.
|| Logical OR operator.

Whew. All that, and guess what... we haven't done any MASM yet! Understanding the above principles is paramount to understanding the few abstractions made in MASM32, so that you can read your code in a debugger. This page will be updated soon, so check back for more!