I have learned assembly mostly from reading various tutorials online, reading books, and asking questions in newsgroups and on IRC.
This tutorial will focus on x86 assembly, which is going to help you in the reverse engineering tutorials I will be releasing after this.
Knowledge of higher level programming languages and basic knowledge of computer architecture are assumed.
Why assembly?
Because when we reverse any binary, the code we are going to get will be in assembly.
Who uses assembly?
I don’t know. If you know, do share it with me, because as far as I can tell we now have higher level languages like C, C++ and a ton of others, so why use assembly?
Although assembly generation is automated, there might be some places where we need handwritten assembly code.
Positives and negatives of assembly:
Positives: it can be very fast, it is powerful, and the resulting code is small.
Negatives: it is hardware dependent, not easy to debug, and takes much more time to write than high level languages.
An assembler is used to convert assembly to machine language.
Often, it will come with a linker that links the assembled files and produces an
executable from them. Windows executables have the .exe extension. Here are some of the
popular assemblers:
1. MASM – This is the assembler this tutorial is geared towards, and you should
use this while going through this tutorial. Originally by Microsoft, it’s now
included in the MASM32v8 package, which includes other tools as well. You
can get it from here.
2. TASM – Another popular assembler. Made by Borland, it is still a
commercial product, so you cannot get it for free.
3. NASM – A free, open source assembler, which is also available for other
platforms. It is available at link. Note that
NASM can’t assemble most MASM programs and vice versa.
There are some commands to learn (very basic, of course). They will help us understand reversed code easily and efficiently.
Registers are special memory locations on the CPU.
This tutorial assumes you are using a 32-bit x86 processor (386 or later).
There are 8 32-bit general purpose registers.
The first 4, eax, ebx, ecx, and edx can also be accessed using 16 or 8-bit names.
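ax is the lowest 16 bits of eax, al is the lowest 8 bits, and ah is bits 9-16 (the high byte of ax).
bx is the lowest 16 bits of ebx, with bl and bh as its low and high bytes.
ecx and edx can be accessed in the same way (cx, cl, ch and dx, dl, dh). The remaining four registers only have 16-bit names (si, di, sp, bp).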
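The sub-register naming can be sketched in Python with plain bit masks. This is only a model, not real assembly: the helper names (ax, al, ah, set_al) are made up for illustration, and the register is represented by an ordinary integer.

```python
# Model of how ax, al and ah alias parts of eax.
# The function names are hypothetical helpers, not assembler syntax.

def ax(eax):
    """Low 16 bits of eax."""
    return eax & 0xFFFF

def al(eax):
    """Low 8 bits of eax."""
    return eax & 0xFF

def ah(eax):
    """Bits 9-16 of eax (the high byte of ax)."""
    return (eax >> 8) & 0xFF

def set_al(eax, value):
    """Writing al changes only the low 8 bits of eax."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)

eax = 0x12345678
print(hex(ax(eax)))            # 0x5678
print(hex(al(eax)))            # 0x78
print(hex(ah(eax)))            # 0x56
print(hex(set_al(eax, 0xFF)))  # 0x123456ff
```

Note that writing to al leaves the upper 24 bits of eax untouched, which is why the last line still starts with 0x123456.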
We can use these registers for anything, although most have a special use:
Register  Name                  Description
EAX       Accumulator Register  calculations for operations and results data
EBX       Base Register         pointer to data in the DS segment
ECX       Count Register        counter for string and loop operations
EDX       Data Register         input/output pointer
ESI       Source Index          source pointer for string operations
EDI       Destination Index     destination pointer for string operations
ESP       Stack Pointer         stack pointer, should not be used
EBP       Base Pointer          pointer to data on the stack
NOTE: In Windows programming, only EAX, ECX and EDX may be overwritten freely. The other registers must be saved and restored if you use them.
There are 6 16-bit segment registers. They define segments in memory:
Register        Name           Description
CS              Code Segment   where the instructions being executed are stored
DS, ES, FS, GS  Data Segment   where data is stored
SS              Stack Segment  where the stack for the current program is stored
Two 32-bit registers that don’t fit anywhere:
Register  Name                 Description
EFLAGS    Flags                status, control, and system flags
EIP       Instruction Pointer  offset for the next instruction to be executed
Basic Instruction Set:
There are many more instructions than the ones I am going to cover here.
I will cover the others when we come across them.
ADD: reg/memory, reg/memory/constant Adds the two operands and stores the result into the first operand. If the addition produces a carry, the carry flag (CF) is set.
SUB: reg/memory, reg/memory/constant Subtracts the second operand from the first and stores the result in the first operand.
AND: reg/memory, reg/memory/constant Performs the bitwise logical AND operation on the operands and stores the result in the first operand.
OR: reg/memory, reg/memory/constant Performs the bitwise logical OR operation on the operands and stores the result in the first operand.
XOR: reg/memory, reg/memory/constant Performs the bitwise logical XOR operation on the operands and stores the result in the first operand. Note that you can not XOR two memory operands.
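MUL: reg/memory Multiplies the operand with the Accumulator Register and
stores the result in the Accumulator Register (for 32-bit operands, the high half of the product goes to EDX).
DIV: reg/memory Divides the Accumulator Register by the operand and stores
the quotient in the Accumulator Register (for 32-bit operands, the dividend is EDX:EAX and the remainder goes to EDX).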
INC: reg/memory Increases the value of the operand by 1 and stores the result in
the operand.
DEC: reg/memory Decreases the value of the operand by 1 and stores the result
in the operand.
NEG: reg/memory Negates the operand and stores the result in the operand.
NOT: reg/memory Performs the bitwise logical NOT operation on the operand and
stores the result in the operand.
PUSH: reg/memory/constant Pushes the value of the operand on to the top of the stack.
POP: reg/memory Pops the value of the top item of the stack in to the operand.
MOV: reg/memory, reg/memory/constant Stores the second operand’s value in the first operand.
CMP: reg/memory, reg/memory/constant Subtracts the second operand from the first operand and sets the respective flags without storing the result. Usually followed by a conditional jump, such as JE or JNE, that tests those flags.
JMP: label Jumps to label.
LEA: reg, memory Takes the offset part of the address of the second operand and
stores the result in the first operand.
CALL: subroutine Calls another procedure and leaves control to it until it returns.
RET: Returns to the caller.
INT: constant Calls the interrupt specified by the operand.
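To make a few of the descriptions above concrete, here is a rough Python model of ADD, CMP, MUL and DIV on 32-bit values. It is a sketch, not an emulator: the function names are invented for this example, only the carry (CF) and zero (ZF) flags are modeled, and the use of EDX for the high half of a product and for the remainder is a detail of the 32-bit forms of MUL and DIV.

```python
MASK32 = 0xFFFFFFFF  # registers hold 32 bits

def add32(a, b):
    """ADD: returns (result, CF). CF is set when the sum does not fit in 32 bits."""
    total = a + b
    return total & MASK32, int(total > MASK32)

def cmp32(a, b):
    """CMP: subtracts b from a, keeps only the flags, discards the result."""
    zf = int(a == b)   # ZF is set when the operands are equal
    cf = int(a < b)    # CF is set when an unsigned borrow is needed
    return {"ZF": zf, "CF": cf}

def mul32(eax, operand):
    """MUL (32-bit form): unsigned eax * operand, 64-bit product in EDX:EAX."""
    product = eax * operand
    return product & MASK32, (product >> 32) & MASK32   # (eax, edx)

def div32(eax, edx, operand):
    """DIV (32-bit form): divides EDX:EAX by operand; quotient -> eax, remainder -> edx."""
    dividend = (edx << 32) | eax
    quotient, remainder = divmod(dividend, operand)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in eax")
    return quotient, remainder   # (eax, edx)

print(add32(0xFFFFFFFF, 1))   # (0, 1)
print(cmp32(5, 5))            # {'ZF': 1, 'CF': 0}
print(mul32(0x80000000, 4))   # (0, 2)
print(div32(7, 0, 2))         # (3, 1)
```

The first call shows the carry case: the 32-bit result wraps around to 0 and CF records that a bit was lost.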
You can grab latest complete instruction set reference at:
Push and Pop
Push and Pop are operations that manipulate the stack.
Push takes a value and adds it on top of the stack. Pop takes the value at the top of the stack, removes it, and stores it
in the operand. Thus, the stack works in Last In First Out (LIFO) order. Stacks are common data structures in computers.
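The LIFO behaviour can be sketched with a small Python class. This is an illustrative model only: the class and method names are made up, values are 32 bits wide, the starting address is arbitrary, and the fact that the x86 stack grows toward lower addresses (a 32-bit push decrements ESP by 4, a pop increments it by 4) is extra detail beyond the description above.

```python
class Stack:
    """Toy model of the x86 stack for 32-bit pushes and pops."""

    def __init__(self, top=0x1000):
        self.esp = top      # stack pointer; the starting address is arbitrary
        self.memory = {}    # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                        # the stack grows downward
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory.pop(self.esp)    # read the item at the top
        self.esp += 4                        # release its slot
        return value

stack = Stack()
stack.push(1)
stack.push(2)
stack.push(3)
print(stack.pop(), stack.pop(), stack.pop())  # 3 2 1
print(hex(stack.esp))                          # 0x1000
```

The values come back in the reverse of the order they were pushed, and ESP ends up where it started once every push has been matched by a pop.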
This much is enough for now. If you have any questions, or if you find something wrong, do tell me.