From 2461686f0e37055df3dc9a3b5ce85076c98f51cb Mon Sep 17 00:00:00 2001
From: Crystal
Date: Thu, 7 Mar 2024 20:49:32 +0100
Subject: Add stuff

---
 blog/asm/1.html             | 157 ++++++++++++++++++++++++++------------------
 src/gifs/lain-dance.gif     | Bin 0 -> 55181 bytes
 src/org/blog/assembly/1.org |  68 ++++++++++++-------
 3 files changed, 137 insertions(+), 88 deletions(-)
 create mode 100644 src/gifs/lain-dance.gif

diff --git a/blog/asm/1.html b/blog/asm/1.html
index f283a63..702bc47 100644
--- a/blog/asm/1.html
+++ b/blog/asm/1.html
@@ -3,7 +3,7 @@
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-
+
 x86 Assembly from my understanding
@@ -23,9 +23,9 @@

Soooo this article (or maybe even a series of articles, who knows ?) will be about x86 assembly, or rather, what I understood from it and my road from the bottom-up hopefully reaching a good level of understanding

-
-

Memory :

-
+
+

Memory :

+

Memory is a sequence of octets (Aka 8bits) that each have a unique integer assigned to them called The Effective Address (EA), in this particular CPU Architecture (the i8086), the octet is designated by a couple (A segment number, and the offset in the segment)

@@ -40,9 +40,9 @@ Memory is a sequence of octets (Aka 8bits) that each have a unique integer assig
The offset and segment are encoded in 16bits, so they take a value between 0 and 65535

-
-

Important :

-
+
+

Important :

+

The relation between the Effective Address and the Segment & Offset is as follow :

@@ -52,8 +52,8 @@ The relation between the Effective Address and the Segment & Offset is as fo

    -
  • Example :
    -
    +
  • Example :
    +

    Let the Physical address (Or Effective Address, these two terms are enterchangeable) 12345h (the h refers to Hexadecimal, which can also be written like this 0x12345), the register DS = 1230h and the register SI = 0045h, the CPU calculates the physical address by multiplying the content of the segment register DS by 10h (or 16) and adding the content of the register SI. so we get : 1230h x 10h + 45h = 12345h
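
    The calculation above can be double-checked with a few lines of Python, used here purely as a calculator. This is only a sketch: the function name `physical_address` is made up for illustration, and it assumes the 8086's 20-bit address bus, so results wrap around at 100000h.

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit 8086 physical address for segment:offset."""
    # Segment and offset are both 16-bit values.
    segment &= 0xFFFF
    offset &= 0xFFFF
    # Multiply the segment by 10h (16), add the offset,
    # and keep only the 20 bits the 8086 address bus can carry.
    return (segment * 0x10 + offset) & 0xFFFFF


# DS = 1230h and SI = 0045h, as in the example above
print(hex(physical_address(0x1230, 0x0045)))  # 0x12345
```

    Masking to 20 bits is a design choice that mirrors the real chip: FFFFh:0010h wraps back to address 0 instead of reaching 100000h.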

    @@ -66,16 +66,16 @@ Now if you are a clever one ( I know you are, since you are reading this <3 )
-
-

Registers

-
+
+

Registers

+

The 8086 CPU has 14 registers of 16bits of size. From the POV of the user, the 8086 has 3 groups of 4 registers of 16bits. One state register of 9bits and a counting program of 16bits inaccessible to the user (whatever this means).

-
-

General Registers

-
+
+

General Registers

+

General registers contribute to arithmetic’s and logic and addressing too.

@@ -125,97 +125,126 @@ Now here are the Registers we can find in this section:
-
-

Offset/Address Registers

-
+
+
+

Addressing and registers…again

+
+
+
+

I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?

+
+

+Well, let’s take a step back to the notion of effective addresses versus relative ones.
+

+
+
+
+

Effective = 10h x Segment + Offset . Part1

+

-SP: This is the stack pointer. It is of 16 bits. It points to the topmost item of the stack. If the stack is empty the stack pointer will be (FFFE)H (or 65534 in decimal). Its offset address is relative to the stack segment(SS).
+When trying to access a specific memory location, we use the notation [Segment:Offset]. For example, assume DS = 0100h and that we want to write the value 0x0005 to the memory location at physical address 1234h. What do we do ?

+
+
    +
  • Answer :
    +
    +
    +
    MOV [DS:0234h], 0x0005
    +
    +

    -BP: This is the base pointer. It is of 16 bits. It is primarily used in accessing parameters passed by the stack. Its offset address is relative to the stack segment(SS). +Why ? Let’s break it down : +lain-dance.gif

    + +

    -SI: This is the source index register. It is of 16 bits. It is used in the pointer addressing of data and as a source in some string-related operations. Its offset is relative to the data segment(DS).
    +We already know that Effective = 10h x Segment + Offset, so here we have : 1234h = 10h x DS + Offset. Since DS = 0100h, we end up with the simple equation 1234h = 1000h + Offset, therefore the Offset is 0234h

    +
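
    The same reasoning, run backwards from the physical address to the offset, can be sketched in Python. The helper name `offset_for` is hypothetical, and the sketch deliberately ignores the wrap-around at 1 MiB to stay short.

```python
def offset_for(physical: int, segment: int) -> int:
    """Return the offset that reaches `physical` from `segment`.

    Raises ValueError when the address lies outside the 64 KiB window
    that starts at segment * 10h.
    """
    base = segment * 0x10       # where the segment starts in memory
    offset = physical - base    # Offset = Effective - 10h x Segment
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("address not reachable from this segment")
    return offset


# DS = 0100h, target physical address 1234h
print(f"{offset_for(0x1234, 0x0100):04X}h")  # 0234h
```

    The range check is there because an offset is only 16 bits wide: an address below the segment start, or more than FFFFh above it, simply cannot be expressed with that segment.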

    -DI: This is the destination index register. It is of 16 bits. It is used in the pointer addressing of data and as a destination in some string-related operations. Its offset is relative to the extra segment(ES).
    +Simple, right ? Now for another example

    +
  • +
-
-

Segment Registers

-
+
+

Another example :

+

-CS: Code Segment, it defines the start of the program memory, and the different addresses of the different instructions relative to CS. +What if we now have this instruction ?

- +
+
    MOV [0234h], 0x0005
+
+

-DS: Data Segment, defines the start of the data memory where we store all data processed by the program.
+What does it do ? You might or might not be surprised that it does the exact same thing as the other snippet of code. Why though ? Because apparently, and for some odd reason I don’t know, the assembler implicitly assumes that the segment used is DS. So if you don’t specify a register (we will get to this later) or a segment, the offset is taken relative to the DS segment.

- +
+
+
+

Segment + Register <3

+

-SS: Stack Segment, or the start of the pile. The pile is a memory zone that is managed in a particular way, it’s like a pile of plates, where we can only remove and add plates on top of the pile. Only one address register is enough to manage it, its the stack pointer SP. We say that this pile is a LIFO pile (Last IN, First OUT) +Consider DS = 0100h and BX = BP = 0234h and this code snippet:

+
+
    MOV [BX], 0x0005 ; NOTE : IT'S NOT THE SAME AS MOV BX, 0x0005. Refer to earlier paragraphs
+
+
+

-EX: The start of an auxiliary segment for data +Well you guessed it right, it also does the same thing, but now consider this :

+
+
    MOV [BP], 0x0005
+
-
-
-
-

The format of an address:

-
+

-An Address must have this fellowing form [RS : RO] with the following possibilities:
+If you answered that it’s the same one, you are wrong. This is because the implicit segment changes according to the register used as the offset. Here is the explicit equivalent of the two instructions above:

- -
    -
  • A value : Nothing
  • -
  • ES : DI
  • -
  • CS : SI
  • -
  • ES : BP
  • -
  • DS : BX
  • -
+
+
    MOV [DS:BX], 0x0005
+    MOV [SS:BP], 0x0005
+
-
-

Note 1 :

-
+

-When the register isn’t specified. the CPU adds it depending on the offset used :
+The general rule of thumb is as follows :

-
  • If the offset is : DI SI or BX, the Segment used is DS.
  • -
  • If its BP, then the segment is SS.
  • +
  • If it’s BP or SP, then the segment is SS.
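
The rule of thumb can be written down as a small lookup table. This is a rough Python sketch rather than a model of the real CPU: `DEFAULT_SEGMENT` and `resolve` are made-up names, and SS = 0200h is an arbitrary value picked for the demo (the article only gives DS, BX and BP).

```python
# Implicit segment for each register that can appear inside [...] on the 8086.
# SP is left out on purpose: it cannot be written as [SP] on the 8086,
# it is only used (together with SS) by stack instructions like PUSH and POP.
DEFAULT_SEGMENT = {"BX": "DS", "SI": "DS", "DI": "DS", "BP": "SS"}


def resolve(base, regs, override=None):
    """Return the physical address reached by [base] or [override:base]."""
    # An explicit segment wins; otherwise fall back to the implicit one.
    segment = override if override is not None else DEFAULT_SEGMENT[base]
    return (regs[segment] * 0x10 + regs[base]) & 0xFFFFF


# DS and BX = BP come from the article; SS = 0200h is a made-up value.
regs = {"DS": 0x0100, "SS": 0x0200, "BX": 0x0234, "BP": 0x0234}

print(hex(resolve("BX", regs)))                 # 0x1234 -> [DS:BX]
print(hex(resolve("BP", regs)))                 # 0x2234 -> [SS:BP]
print(hex(resolve("BP", regs, override="DS")))  # 0x1234 -> [DS:BP]
```

With the same offset in BX and BP, the two accesses land on different bytes as soon as DS and SS differ, which is why MOV [BX] and MOV [BP] are not interchangeable.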
-
-
-

Note 2 :

-
+
    +
  • Note
    +

    -Apparently we will assume that we are in the DS segment and only access to memory using the offset.
    +The values of the registers CS, DS and SS are automatically initialized by the OS when launching the program, so these segments are implicit. In other words, if we want to access specific data in memory, we just need to specify its offset. Also, you can’t load an immediate value directly into a segment register such as DS; it has to come from a general register. For example :

    +
    +
    MOV DS, 0x0005 ; Is INVALID
    +MOV DS, AX ; This one is VALID
    +
    -
    -

    Note 3 :

    -
    -

    -The values of the registers CS DS and SS are automatically initialized by the OS when launching the program. So these segments are implicit. AKA : If we want to access a specific data in memory, we just need to specify its offset. -

    -
    +
  • +

Author: Crystal

-

Created: 2024-02-24 Sat 18:22

+

Created: 2024-03-07 Thu 20:48

diff --git a/src/gifs/lain-dance.gif b/src/gifs/lain-dance.gif
new file mode 100644
index 0000000..aeb56be
Binary files /dev/null and b/src/gifs/lain-dance.gif differ
diff --git a/src/org/blog/assembly/1.org b/src/org/blog/assembly/1.org
index daa4976..fa77e49 100644
--- a/src/org/blog/assembly/1.org
+++ b/src/org/blog/assembly/1.org
@@ -68,42 +68,62 @@ LOOP
 #+BEGIN_SRC asm
 MUL BX (DX, AX = AX * BX)
 #+END_SRC
+** Addressing and registers...again
+*** I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?
-*** Offset/Address Registers
-*SP*: This is the stack pointer. It is of 16 bits. It points to the topmost item of the stack. If the stack is empty the stack pointer will be (FFFE)H (or 65534 in decimal). Its offset address is relative to the stack segment(SS).
+Well, let's take a step back to the notion of effective addresses versus relative ones.
+*** Effective = 10h x Segment + Offset . Part1
+When trying to access a specific memory location, we use the notation *[Segment:Offset]*. For example, assume *DS = 0100h* and that we want to write the value *0x0005* to the memory location at physical address *1234h*. What do we do ?
+**** Answer :
+#+BEGIN_SRC asm
+MOV [DS:0234h], 0x0005
+#+END_SRC
-*BP*: This is the base pointer. It is of 16 bits. It is primarily used in accessing parameters passed by the stack. Its offset address is relative to the stack segment(SS).
+Why ? Let's break it down :
+[[../../../gifs/lain-dance.gif]]
-*SI*: This is the source index register. It is of 16 bits. It is used in the pointer addressing of data and as a source in some string-related operations. Its offset is relative to the data segment(DS).
-*DI*: This is the destination index register. It is of 16 bits. It is used in the pointer addressing of data and as a destination in some string-related operations. Its offset is relative to the extra segment(ES).
-*** Segment Registers
-*CS*: Code Segment, it defines the start of the program memory, and the different addresses of the different instructions relative to CS.
+We already know that *Effective = 10h x Segment + Offset*, so here we have : *1234h = 10h x DS + Offset*. Since *DS = 0100h*, we end up with the simple equation *1234h = 1000h + Offset*, therefore the Offset is *0234h*
-*DS*: Data Segment, defines the start of the data memory where we store all data processed by the program.
-*SS*: Stack Segment, or the start of the pile. The pile is a memory zone that is managed in a particular way, it's like a pile of plates, where we can only remove and add plates on top of the pile. Only one address register is enough to manage it, its the stack pointer SP. We say that this pile is a LIFO pile (Last IN, First OUT)
+Simple, right ? Now for another example
+*** Another example :
+What if we now have this instruction ?
+#+BEGIN_SRC asm
+ MOV [0234h], 0x0005
+#+END_SRC
+What does it do ? You might or might not be surprised that it does the exact same thing as the other snippet of code. Why though ? Because apparently, and for some odd reason I don't know, the assembler implicitly assumes that the segment used is *DS*. So if you don't specify a register (we will get to this later) or a segment, the offset is taken relative to the DS segment.
+
-*EX*: The start of an auxiliary segment for data
-** The format of an address:
-An Address must have this fellowing form [RS : RO] with the following possibilities:
+*** Segment + Register <3
-- A value : Nothing
-- ES : DI
-- CS : SI
-- ES : BP
-- DS : BX
+Consider *DS = 0100h* and *BX = BP = 0234h* and this code snippet:
+#+BEGIN_SRC asm
+ MOV [BX], 0x0005 ; NOTE : IT'S NOT THE SAME AS MOV BX, 0x0005. Refer to earlier paragraphs
+#+END_SRC
-*** Note 1 :
-When the register isn't specified. the CPU adds it depending on the offset used :
+Well you guessed it right, it also does the same thing, but now consider this :
+#+BEGIN_SRC asm
+ MOV [BP], 0x0005
+#+END_SRC
+
+If you answered that it's the same one, you are wrong. This is because the implicit segment changes according to the register used as the offset. Here is the explicit equivalent of the two instructions above:
+#+BEGIN_SRC asm
+ MOV [DS:BX], 0x0005
+ MOV [SS:BP], 0x0005
+#+END_SRC
+
+The general rule of thumb is as follows :
 - If the offset is : DI SI or BX, the Segment used is DS.
-- If its BP, then the segment is SS.
+- If it's BP or SP, then the segment is SS.
-*** Note 2 :
-Apparently we will assume that we are in the DS segment and only access to memory using the offset.
-*** Note 3 :
-The values of the registers CS DS and SS are automatically initialized by the OS when launching the program. So these segments are implicit. AKA : If we want to access a specific data in memory, we just need to specify its offset.
+**** Note
+The values of the registers CS, DS and SS are automatically initialized by the OS when launching the program, so these segments are implicit. In other words, if we want to access specific data in memory, we just need to specify its offset. Also, you can't load an immediate value directly into a segment register such as DS; it has to come from a general register. For example :
+#+BEGIN_SRC asm
+MOV DS, 0x0005 ; Is INVALID
+MOV DS, AX ; This one is VALID
+#+END_SRC
-- 
cgit 1.4.1-2-gfad0