From 7fdc20eb453ce242c11b65f6b5d4b78a23cb2d52 Mon Sep 17 00:00:00 2001
From: Crystal
Date: Wed, 10 Apr 2024 21:05:54 +0100
Subject: Me when the when me me

---
 blog/asm/1.html | 154 +++++++++++++++++++++++++++++++-------------------------
 1 file changed, 86 insertions(+), 68 deletions(-)

diff --git a/blog/asm/1.html b/blog/asm/1.html
index 31491a2..4ed3574 100644
--- a/blog/asm/1.html
+++ b/blog/asm/1.html
@@ -3,7 +3,7 @@
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-
+
 x86 Assembly from my understanding
@@ -23,9 +23,9 @@

Soooo this article (or maybe even a series of articles, who knows ?) will be about x86 assembly, or rather, what I understood from it, and my road from the bottom up, hopefully reaching a good level of understanding.

-
-

Memory :

-
+
+

Memory :

+

Memory is a sequence of octets (aka bytes, 8 bits each), and each octet has a unique integer assigned to it, called the Effective Address (EA). In this particular CPU architecture (the i8086), an octet is designated by a pair: a segment number and an offset within that segment.

@@ -40,9 +40,9 @@ Memory is a sequence of octets (Aka 8bits) that each have a unique integer assig
The offset and segment are encoded in 16 bits, so they take a value between 0 and 65535

-
-

Important :

-
+
+

Important :

+

The relation between the Effective Address and the Segment & Offset is as follows :

@@ -52,8 +52,8 @@ The relation between the Effective Address and the Segment & Offset is as fo

    -
  • Example :
    -
    +
  • Example :
    +

    Take the physical address 12345h (the h stands for hexadecimal; it can also be written 0x12345). This article uses "Effective Address" interchangeably with "physical address", although Intel's manuals reserve "effective address" for the offset part alone. With the register DS = 1230h and the register SI = 0045h, the CPU calculates the physical address by multiplying the content of the segment register DS by 10h (16) and adding the content of the register SI. So we get : 1230h x 10h + 45h = 12345h
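
    To make the arithmetic concrete, here is a minimal Python sketch of the same calculation (Python only because it runs anywhere). The function name physical_address is my own invention for illustration, and the 20-bit mask assumes the 8086's 20-bit address bus, where sums past 1 MiB wrap around.

```python
def physical_address(segment: int, offset: int) -> int:
    """Model of the 8086 segment:offset translation (hypothetical helper)."""
    # Segment and offset are both 16-bit values.
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must fit in 16 bits")
    # Multiplying by 10h is the same as shifting left by 4 bits.
    # Assumption: 20-bit address bus, so the sum wraps at 1 MiB.
    return ((segment << 4) + offset) & 0xFFFFF


# DS = 1230h, SI = 0045h, as in the example above
print(hex(physical_address(0x1230, 0x0045)))  # 0x12345
```

    A side effect worth noticing: many different segment:offset pairs land on the same physical address, for example 1234h:0005h gives 12345h too.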

    @@ -66,16 +66,16 @@ Now if you are a clever one ( I know you are, since you are reading this <3 )
-
-

Registers

-
+
+

Registers

+

The 8086 CPU has 14 registers, each 16 bits wide. From the POV of the user, the 8086 has 3 groups of 4 16-bit registers, one status (flags) register in which 9 bits are used, and a 16-bit program counter (the Instruction Pointer, IP) that the user can't read or write directly: it only changes through jumps, calls and returns.

-
-

General Registers

-
+
+

General Registers

+

General registers take part in arithmetic, logic, and addressing too.

@@ -126,28 +126,28 @@ Now here are the Registers we can find in this section:
-
-

Addressing and registers…again

-
+
+

Addressing and registers…again

+
-
-

I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?

-
+
+

I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?

+

Well, let’s take a step back to the notion of effective addresses vs. relative ones.

-
-

Effective = 10h x Segment + Offset . Part1

-
+
+

Effective = 10h x Segment + Offset . Part1

+

When trying to access a specific memory location, we use the notation [Segment:Offset]. For example, assume DS = 0100h and we want to write the value 0x0005 to the memory location at physical address 1234h. What do we do ?

    -
  • Answer :
    -
    +
  • Answer :
    +
    MOV [DS:0234h], 0x0005
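
    One way to double-check that answer is to solve the formula for the offset. A rough Python sketch, where the helper name offset_for is made up for illustration and the 1 MiB wrap-around is deliberately ignored:

```python
def offset_for(segment: int, physical: int) -> int:
    """Offset that reaches `physical` from `segment` (hypothetical helper)."""
    base = segment << 4  # segment x 10h
    offset = physical - base
    # The offset must fit in 16 bits, otherwise the address
    # cannot be reached from this segment (wrap-around ignored here).
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("address not reachable from this segment")
    return offset


# DS = 0100h, target physical address = 1234h
print(f"{offset_for(0x0100, 0x1234):04X}h")  # 0234h
```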
     
    @@ -159,7 +159,7 @@ Why ? Let’s break it down : -
    +

    lain-dance.gif

    @@ -177,9 +177,9 @@ Simple, right ?, now for another example
-
-

Another example :

-
+
+

Another example :

+

What if we now have this instruction ?

@@ -192,9 +192,9 @@ What does it do ? You might or might not be surprised that it does the exact sam

-
-

Segment + Register <3

-
+
+

Segment + Register <3

+

Consider DS = 0100h and BX = BP = 0234h and this code snippet:

@@ -230,8 +230,8 @@ The General rule of thumb is as follows :
    -
  • Note
    -
    +
  • Note
    +

    The values of the registers CS, DS and SS are automatically initialized by the OS when launching the program, so these segments are implicit. AKA : if we want to access a specific piece of data in memory, we just need to specify its offset. Also, you can’t write an immediate value directly into the DS or CS segment registers, so something like

    @@ -246,9 +246,9 @@ The values of the registers CS DS and SS are automatically initialized by the OS
-
-

The ACTUAL thing :

-
+
+

The ACTUAL thing :

+

Enough technical rambling, and now we shall go to the fun part, the ACTUAL CODE. But first, some names you should be familiar with :

@@ -258,9 +258,9 @@ Enough technical rambling, and now we shall go to the fun part, the ACTUAL CODE.
  • Operands : These are the arguments passed to an instruction, like dst and src in MOV dst, src. They can be anything from a register or a memory location to a variable or an immediate value.
  • -
    -

    Structure of an assembly program :

    -
    +
    +

    Structure of an assembly program :

    +

    While there is no “standard” structure, I prefer to go with this one :

    @@ -276,9 +276,9 @@ While there is no “standard” structure, i prefer to go with this one
    -
    -

    MOV dst, src

    -
    +
    +

    MOV dst, src

    +

    The MOV instruction copies the second operand (src) to the first operand (dst). The source can be a memory location, an immediate value, or a general-purpose register (AX, BX, CX, DX). As for the destination, it can be a general-purpose register or a memory location.

    @@ -327,13 +327,13 @@ for segment registers only these types of MOV are supported:
    memory: [BX], [BX+SI+7], variable

    -
    -

    Note : The MOV instruction cannot be used to set the value of the CS and IP registers

    +
    +

    Note : The MOV instruction cannot be used to set the value of the CS and IP registers

    -
    -

    Variables :

    -
    +
    +

    Variables :

    +

    Let’s say you want to use a specific value multiple times in your code, do you prefer to call it using something like var1 or E4F9:0011 ? If your answer is the second option, you can gladly skip this section, or even better, seek therapy.

    @@ -353,9 +353,9 @@ Anyways, we have two types of variables, bytes and words(which are two
    value - can be any numeric value in any supported numbering system (hexadecimal, binary, or decimal), or the “?” symbol for variables that are not initialized.

    -
    -

    Example code :

    -
    +
    +

    Example code :

    +
        org 100h
         .data
    @@ -369,9 +369,9 @@ Anyways, we have two types of variables, bytes and words(which are two
     
    -
    -

    Arrays :

    -
    +
    +

    Arrays :

    +

    We can also define arrays instead of single values, using comma-separated values, like this for example :

    @@ -432,9 +432,9 @@ Of course, you can use DW instead of DB if it’s required to keep values la

    -
    -

    LEA

    -
    +
    +

    LEA

    +

    LEA (Load Effective Address) is an instruction used to get the offset of a specific variable. We will see later how it’s used, but first, here is something we will need :

    @@ -457,18 +457,18 @@ For example: assembler supports shorter prefixes as well:

    -
      -
    1. - for BYTE PTR
    2. -
    3. - for WORD PTR
    4. -
    +
      +
    • b. - for BYTE PTR
    • +
    • w. - for WORD PTR
    • +

    in certain cases the assembler can calculate the data type automatically.

      -
    • Example :
      -
      +
    • Example :
      +
          org 100h
           .data
      @@ -489,9 +489,9 @@ in certain cases the assembler can calculate the data type automatically.
       
    -
    -

    Constants :

    -
    +
    +

    Constants :

    +

    Constants in Assembly only exist until the code is assembled, meaning that if you disassemble your code later, you won’t see your constant definitions.

    @@ -510,11 +510,29 @@ Of course constants cant be changed, and aren’t stored in memory. So they
    +
    +

    ⚐ :

    +
    +

    +Now comes the notion of Flags: bits in the Status register that are updated by logical and arithmetic instructions, each of which can take a value of 1 or 0. Here are 8 of the 9 flags of the 8086 CPU (the ninth, the Trap Flag (TF), used for single-step debugging, is left out here) : +

    +
      +
    • Carry Flag (CF): Set to 1 when there is an unsigned overflow, for example when you add the bytes 255 + 1 (the result is not in the range [0, 255]). By default it’s set to 0.
    • +
    • Overflow Flag (OF): Set to 1 when there is a signed overflow, for example when you add the bytes 100 + 50 (the result is not in the range [-128, 127]). By default it’s set to 0.
    • +
    • Zero Flag (ZF): Set to 1 when the result is 0, and to 0 otherwise. By default it’s set to 0.
    • +
    • Auxiliary Flag (AF): Set to 1 when there is an unsigned overflow out of the low nibble (4 bits), or in human words : when there is a carry inside the number. For example, when you add 29h + 4Ch, the low nibbles give 9h + Ch = 15h, so we keep the 5, carry the 1 over to 2 + 4, and AF is set to 1.
    • +
    • Parity Flag (PF): Set to 1 when the result has an even number of 1 bits, and to 0 when it has an odd number of 1 bits. Even if the result is a word, only the low 8 bits are analyzed.
    • +
    • Sign Flag (SF): Self-explanatory, set to 1 if the result is negative and 0 if it’s positive or zero (it is a copy of the most significant bit of the result).
    • +
    • Interrupt Enable Flag (IF): When it’s set to 1, the CPU reacts to interrupts from external devices.
    • +
    • Direction Flag (DF): Used by the string instructions. When this flag is set to 0, the processing is done forward (addresses increase); if it’s set to 1, it’s done backward (addresses decrease).
    • +
    +
    +
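
    To see the arithmetic flags in action, here is a small Python sketch that works out what an 8-bit ADD would leave in CF, OF, ZF, AF, PF and SF. It is only a model built from the definitions above, not an emulator, and the name add8_flags is made up for illustration. IF and DF are left out because arithmetic doesn't touch them.

```python
def add8_flags(a: int, b: int) -> dict:
    """Flags after an 8-bit ADD a, b (hypothetical helper, not an emulator)."""
    a &= 0xFF
    b &= 0xFF
    total = a + b          # full sum, may need 9 bits
    result = total & 0xFF  # what actually fits in the byte
    return {
        "result": result,
        # Unsigned overflow: the sum did not fit in 8 bits.
        "CF": int(total > 0xFF),
        # Signed overflow: both operands have a sign that differs
        # from the sign of the result.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        "ZF": int(result == 0),
        # Carry out of the low nibble (bit 3 into bit 4).
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),
        # Even number of 1 bits in the low 8 bits.
        "PF": int(bin(result).count("1") % 2 == 0),
        # Copy of the most significant bit.
        "SF": result >> 7,
    }


print(add8_flags(255, 1))      # CF = 1 and ZF = 1, OF stays 0
print(add8_flags(100, 50))     # OF = 1 and SF = 1, CF stays 0
print(add8_flags(0x29, 0x4C))  # AF = 1, result is 75h
```

    Note how 255 + 1 and 100 + 50 overflow in different ways: the first one only makes sense as an unsigned problem (CF), the second only as a signed one (OF).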

    Author: Crystal

    -

    Created: 2024-03-23 Sat 15:57

    +

    Created: 2024-04-10 Wed 21:05

    -- cgit 1.4.1-2-gfad0