From edbb6a1c58b75a4be494a268d02240e3f4720b77 Mon Sep 17 00:00:00 2001
From: Crystal
Date: Fri, 22 Mar 2024 14:08:37 +0100
Subject: New

---
 blog/asm/1.html             | 260 ++++++++++++++++++++++++++++++++++++++------
 src/org/blog/assembly/1.org | 113 +++++++++++++++++++
 2 files changed, 337 insertions(+), 36 deletions(-)

diff --git a/blog/asm/1.html b/blog/asm/1.html
index fcaae55..124d81d 100644
--- a/blog/asm/1.html
+++ b/blog/asm/1.html
@@ -3,7 +3,7 @@
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-
+
 x86 Assembly from my understanding
@@ -23,9 +23,9 @@

Soooo this article (or maybe even a series of articles, who knows ?) will be about x86 assembly, or rather, what I understood from it and my road from the bottom-up hopefully reaching a good level of understanding

-
-

Memory :

-
+
+

Memory :

+

Memory is a sequence of octets (aka 8 bits each), and each octet has a unique integer assigned to it called the Effective Address (EA). In this particular CPU architecture (the i8086), an octet is designated by a pair: a segment number and an offset within that segment.

@@ -40,9 +40,9 @@ Memory is a sequence of octets (Aka 8bits) that each have a unique integer assig The offset and segment are encoded in 16bits, so they take a value between 0 and 65535

-
-

Important :

-
+
+

Important :

+

The relation between the Effective Address and the Segment & Offset is as follows :

@@ -52,8 +52,8 @@ The relation between the Effective Address and the Segment & Offset is as fo

    -
  • Example :
    -
    +
  • Example :
    +

    Take the physical address 12345h (this article also calls it the Effective Address, although Intel’s manuals use “effective address” for the offset part only). The h suffix means hexadecimal, so the same value can also be written 0x12345. With the register DS = 1230h and the register SI = 0045h, the CPU computes the physical address by multiplying the content of the segment register DS by 10h (or 16) and adding the content of the register SI, so we get : 1230h x 10h + 45h = 12345h
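
If you want to double-check this arithmetic yourself, here is a small sketch in Python (not assembly, just a calculator). The function name `physical_address` is made up for this example, and the 20-bit mask is an assumption based on the 8086 having 20 address lines, which this article does not cover.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a physical address."""
    # Multiply the segment by 10h (a shift left by 4 bits), add the offset,
    # and keep 20 bits (assumption: the address wraps around at 1 MiB).
    return ((segment << 4) + offset) & 0xFFFFF


# DS = 1230h and SI = 0045h, the values used in the example
print(hex(physical_address(0x1230, 0x0045)))  # prints 0x12345
```

The shift by 4 is the same thing as multiplying by 10h, it just makes the "add a zero in hexadecimal" trick visible.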

    @@ -66,16 +66,16 @@ Now if you are a clever one ( I know you are, since you are reading this <3 )
-
-

Registers

-
+
+

Registers

+

The 8086 CPU has 14 registers, each 16 bits wide. From the POV of the user, the 8086 has 3 groups of 4 registers of 16 bits, one state (flags) register in which 9 bits are meaningful, and a 16-bit program counter (the instruction pointer, IP) that the user cannot read or write directly: it only changes through jumps, calls and returns.

-
-

General Registers

-
+
+

General Registers

+

General registers take part in arithmetic and logic operations, and in addressing too.

@@ -126,28 +126,28 @@ Now here are the Registers we can find in this section:
-
-

Addressing and registers…again

-
+
+

Addressing and registers…again

+
-
-

I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?

-
+
+

I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?

+

Well, let’s take a step back to the notion of effective addresses vs. relative ones.

-
-

Effective = 10h x Segment + Offset . Part1

-
+
+

Effective = 10h x Segment + Offset . Part1

+

To access a specific memory location, we use the notation [Segment:Offset]. For example, assume DS = 0100h and that we want to write the value 0x0005 to the memory location at physical address 1234h. What do we do ?

    -
  • Answer :
    -
    +
  • Answer :
    +
    MOV [DS:0234h], 0x0005
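
To see where that write ends up, here is a rough simulation in Python. Everything in it is hypothetical scaffolding: `memory` is a plain bytearray standing in for RAM, `write_word` is a made-up helper, and it assumes the MOV stores a 16-bit word with the low byte first (little-endian), which the instruction as written does not spell out.

```python
def physical_address(segment, offset):
    # The rule from this section: segment x 10h + offset, kept to 20 bits
    return ((segment << 4) + offset) & 0xFFFFF


memory = bytearray(1 << 20)  # 1 MiB of zeroed bytes standing in for RAM


def write_word(segment, offset, value):
    """Store a 16-bit value at segment:offset, low byte first."""
    addr = physical_address(segment, offset)
    memory[addr] = value & 0xFF                         # low byte
    memory[(addr + 1) & 0xFFFFF] = (value >> 8) & 0xFF  # high byte
    return addr


DS = 0x0100
addr = write_word(DS, 0x0234, 0x0005)
print(hex(addr))  # prints 0x1234
```

With DS = 0100h the segment contributes 1000h, so the offset 0234h is exactly what is missing to reach 1234h.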
     
    @@ -159,7 +159,7 @@ Why ? Let’s break it down : -
    +

    lain-dance.gif

    @@ -177,9 +177,9 @@ Simple, right ?, now for another example
-
-

Another example :

-
+
+

Another example :

+

What if we now have this instruction ?

@@ -192,9 +192,9 @@ What does it do ? You might or might not be surprised that it does the exact sam

-
-

Segment + Register <3

-
+
+

Segment + Register <3

+

Consider DS = 0100h and BX = BP = 0234h and this code snippet:

@@ -230,8 +230,8 @@ The General rule of thumb is as follows :
    -
  • Note
    -
    +
  • Note
    +

    The values of the registers CS, DS and SS are automatically initialized by the OS when launching the program, so these segments are implicit. AKA : if we want to access specific data in memory, we just need to specify its offset. Also, you can’t load an immediate value directly into the DS segment register (and CS can’t be the destination of a MOV at all), so something like

    @@ -246,10 +246,198 @@ The values of the registers CS DS and SS are automatically initialized by the OS
+
+

The ACTUAL thing :

+
+

+Enough technical rambling, and now we shall go to the fun part, the ACTUAL CODE. But first, some names you should be familiar with : +

+ +
    +
  • Mnemonics : Or Instructions, are the…well…instructions executed by the CPU, like MOV, ADD, MUL…etc. They are case insensitive, but I like them better in UPPERCASE.
  • +
  • Operands : These are the arguments passed to the instructions, like MOV dst, src, and they can be anything from a register or a memory location, to a variable, to an immediate value.
  • +
+
+
+

Structure of an assembly program :

+
+

+While there is no “standard” structure, I prefer to go with this one : +

+ +
+
    org 100h
+.data
+                                ; variables and constants
+
+.code
+                                ; instructions
+
+
+
+
+
+

MOV dst, src

+
+

+The MOV instruction copies the second operand (src) to the first operand (dst). The source can be a memory location, an immediate value, or a general-purpose register (AX BX CX DX). As for the destination, it can be a general-purpose register or a memory location. Both operands must be the same size (a byte or a word). +

+ + +

+These types of operands are supported: +

+
+
MOV REG, memory
+MOV memory, REG
+MOV REG, REG
+MOV memory, immediate
+MOV REG, immediate
+
+
+

+REG: AX, BX, CX, DX, AH, AL, BL, BH, CH, CL, DH, DL, DI, SI, BP, SP. +

+ +

+memory: [BX], [BX+SI+7], variable +

+ +

+immediate: 5, -24, 3Fh, 10001101b +

+ + +

+For segment registers, only these types of MOV are supported: +

+
+
MOV SREG, memory
+MOV memory, SREG
+MOV REG, SREG
+MOV SREG, REG
+; SREG: DS, ES, SS, and only as second operand: CS.
+
+
+

+REG: AX, BX, CX, DX, AH, AL, BL, BH, CH, CL, DH, DL, DI, SI, BP, SP. +

+ +

+memory: [BX], [BX+SI+7], variable +

+
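
As a way of keeping the two lists straight, here is a small lookup table in Python. It is only a sketch: it encodes the operand combinations listed in this section and nothing else, and the names `GENERAL_FORMS`, `SEGMENT_FORMS` and `mov_is_listed` are made up for this example (they are not part of any assembler).

```python
# (destination kind, source kind) pairs listed for general MOVs
GENERAL_FORMS = {
    ("REG", "memory"),
    ("memory", "REG"),
    ("REG", "REG"),
    ("memory", "immediate"),
    ("REG", "immediate"),
}

# (destination kind, source kind) pairs listed for segment registers
SEGMENT_FORMS = {
    ("SREG", "memory"),
    ("memory", "SREG"),
    ("REG", "SREG"),
    ("SREG", "REG"),
}


def mov_is_listed(dst_kind, src_kind, dst_name=""):
    """Return True if the operand kinds appear in one of the two lists."""
    if dst_kind == "SREG" and dst_name.upper() == "CS":
        return False  # CS is only allowed as the second (source) operand
    return (dst_kind, src_kind) in GENERAL_FORMS | SEGMENT_FORMS


print(mov_is_listed("SREG", "REG", "DS"))        # prints True
print(mov_is_listed("SREG", "immediate", "DS"))  # prints False
print(mov_is_listed("memory", "memory"))         # prints False
```

The two False results are the combinations that trip people up most often: an immediate cannot go straight into a segment register, and there is no memory-to-memory form in either list.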
+
+

Note : The MOV instruction cannot be used to set the value of the CS and IP registers

+
+
+
+

Variables :

+
+

+Let’s say you want to use a specific value multiple times in your code, do you prefer to call it using something like var1 or E4F9:0011 ? If your answer is the second option, you can gladly skip this section, or even better, seek therapy. +

+ +

+Anyways, we have two types of variables, bytes and words (which are two bytes), and to define a variable, we use the following syntax : +

+ +
+
name DB value ; To Define a Byte
+name DW value ; To Define a Word
+
+
+ +

+name - can be any letter or digit combination, though it should start with a letter. It’s possible to declare unnamed variables by not specifying the name (this variable will have an address but no name). +value - can be any numeric value in any supported numbering system (hexadecimal, binary, or decimal), or “?” symbol for variables that are not initialized. +

+
+
+

Example code :

+
+
+
    org 100h
+    .data
+    x db 33
+    y dw 1350h
+
+    .code
+    MOV AL, x
+    MOV BX, y
+
+
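
To picture what these two definitions look like in memory, here is a sketch in Python using the standard `struct` module. It assumes the assembler places x and y back to back in declaration order and stores the word with its low byte first (little-endian); the names `data`, `AL` and `BX` are just Python variables standing in for the data area and the registers.

```python
import struct

x = struct.pack("<B", 33)      # x db 33    -> one byte
y = struct.pack("<H", 0x1350)  # y dw 1350h -> two bytes, low byte first

data = x + y                   # assumed layout: x at offset 0, y at offset 1
print(data.hex(" "))  # prints 21 50 13

# What the two MOVs would read back from that layout
AL = data[0]                              # MOV AL, x
BX = struct.unpack_from("<H", data, 1)[0]  # MOV BX, y
print(AL, hex(BX))  # prints 33 0x1350
```

Note how 1350h shows up as 50 13 in memory: the word is stored low byte first, but reading it back as a word gives 1350h again.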
+
+
+
+

Arrays :

+
+

+We can also define arrays instead of single values, using comma-separated values, like this for example : +

+
+
    a db 48h, 65h, 6Ch, 6Ch, 6Fh, 00h
+    b db 'Hello', 0
+
+
+ +

+Surprise surprise, the arrays a and b are identical: characters are first converted to their ASCII values and then stored in memory!!! Wonderful, right ? And guess what, accessing values in assembly LOOKS THE SAME AS IN C (just keep in mind that the index is a byte offset, so in a DW array element i lives at offset 2*i) !!! +

+
+
    MOV AL, a[0] ; Copies 48h to AL
+    MOV BL, b[0] ; Also Copies 48h to BL
+
+
+

+You can also use any of the memory index registers BX, SI, DI, BP, for example: +

+
+
MOV SI, 3
+MOV AL, a[SI]
+
+
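
Here is a quick check in Python (again, not assembly) that a string literal really is stored as the ASCII codes of its characters. The five letters of Hello are 48h, 65h, 6Ch, 6Ch, 6Fh, and the Python names `a`, `b` and `SI` simply mirror the ones used in this section.

```python
a = bytes([0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x00])  # the letters of Hello, then 0
b = "Hello".encode("ascii") + bytes([0])          # b db 'Hello', 0

print(a == b)      # prints True
print(hex(a[0]))   # prints 0x48

SI = 3
print(hex(a[SI]))  # prints 0x6c
```

So with SI = 3, the indexed read picks up the fourth byte of the array, which is the second l of Hello.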
+ +

+If you need to declare a large array you can use the DUP operator. +The syntax for DUP: +

+ +

+number DUP ( value(s) ) +number - number of duplicates to make (any constant value). +value - expression that DUP will duplicate. +

+ +

+for example: +

+
+
c DB 5 DUP(9)
+;is an alternative way of declaring:
+c DB 9, 9, 9, 9, 9
+
+
+

+one more example: +

+
+
d DB 5 DUP(1, 2)
+;is an alternative way of declaring:
+d DB 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
+
+
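
If DUP still feels abstract, it behaves like list repetition. The sketch below mimics it in Python; the helper `dup` is hypothetical and only models the resulting sequence of values, not how an assembler lays them out.

```python
def dup(number, *values):
    """Mimic `number DUP(values)`: repeat the value list `number` times."""
    return list(values) * number


c = dup(5, 9)
d = dup(5, 1, 2)
print(c)  # prints [9, 9, 9, 9, 9]
print(d)  # prints [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
```

Note that the count applies to the whole value list, so 5 DUP(1, 2) produces ten values, not five.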
+

+Of course, you can use DW instead of DB if it’s required to keep values larger than 255, or smaller than -128. DW cannot be used to declare strings. +

+
+
+
+

Author: Crystal

-

Created: 2024-03-17 Sun 21:28

+

Created: 2024-03-22 Fri 14:08

diff --git a/src/org/blog/assembly/1.org b/src/org/blog/assembly/1.org
index e268824..3fd21e4 100644
--- a/src/org/blog/assembly/1.org
+++ b/src/org/blog/assembly/1.org
@@ -128,3 +128,116 @@ The values of the registers CS DS and SS are automatically initialized by the OS
 MOV DS, 0x0005 ; Is INVALID
 MOV DS, AX ; This one is VALID
 #+END_SRC
+
+* The ACTUAL thing :
+Enough technical rambling, and now we shall go to the fun part, the ACTUAL CODE. But first, some names you should be familiar with :
+
+- *Mnemonics* : Or *Instructions*, are the...well...instructions executed by the CPU, like *MOV*, *ADD*, *MUL*...etc. They are case *insensitive*, but I like them better in UPPERCASE.
+- *Operands* : These are the arguments passed to the instructions, like *MOV dst, src*, and they can be anything from a register or a memory location, to a variable, to an immediate value.
+
+** Structure of an assembly program :
+While there is no "standard" structure, I prefer to go with this one :
+
+#+BEGIN_SRC asm
+    org 100h
+.data
+                                ; variables and constants
+
+.code
+                                ; instructions
+#+END_src
+** MOV dst, src
+The MOV instruction copies the second operand (src) to the first operand (dst). The source can be a memory location, an immediate value, or a general-purpose register (AX BX CX DX). As for the destination, it can be a general-purpose register or a memory location. Both operands must be the same size (a byte or a word).
+
+
+These types of operands are supported:
+#+BEGIN_SRC asm
+MOV REG, memory
+MOV memory, REG
+MOV REG, REG
+MOV memory, immediate
+MOV REG, immediate
+#+END_SRC
+*REG*: AX, BX, CX, DX, AH, AL, BL, BH, CH, CL, DH, DL, DI, SI, BP, SP.
+
+*memory*: [BX], [BX+SI+7], variable
+
+*immediate*: 5, -24, 3Fh, 10001101b
+
+
+For segment registers, only these types of MOV are supported:
+#+BEGIN_SRC asm
+MOV SREG, memory
+MOV memory, SREG
+MOV REG, SREG
+MOV SREG, REG
+; SREG: DS, ES, SS, and only as second operand: CS.
+#+END_SRC
+*REG*: AX, BX, CX, DX, AH, AL, BL, BH, CH, CL, DH, DL, DI, SI, BP, SP.
+
+*memory*: [BX], [BX+SI+7], variable
+
+*** Note : The MOV instruction *cannot* be used to set the value of the CS and IP registers
+** Variables :
+Let's say you want to use a specific value multiple times in your code, do you prefer to call it using something like *var1* or *E4F9:0011* ? If your answer is the second option, you can gladly skip this section, or even better, seek therapy.
+
+Anyways, we have two types of variables, *bytes* and *words (which are two bytes)*, and to define a variable, we use the following syntax :
+
+#+BEGIN_SRC asm
+name DB value ; To Define a Byte
+name DW value ; To Define a Word
+#+END_SRC
+
+*name* - can be any letter or digit combination, though it should start with a letter. It's possible to declare unnamed variables by not specifying the name (this variable will have an address but no name).
+*value* - can be any numeric value in any supported numbering system (hexadecimal, binary, or decimal), or "?" symbol for variables that are not initialized.
+
+*** Example code :
+#+BEGIN_SRC asm
+    org 100h
+    .data
+    x db 33
+    y dw 1350h
+
+    .code
+    MOV AL, x
+    MOV BX, y
+#+END_SRC
+
+*** Arrays :
+We can also define arrays instead of single values, using comma-separated values, like this for example :
+#+BEGIN_SRC asm
+    a db 48h, 65h, 6Ch, 6Ch, 6Fh, 00h
+    b db 'Hello', 0
+#+END_SRC
+
+Surprise surprise, the arrays a and b are identical: characters are first converted to their ASCII values and then stored in memory!!! Wonderful, right ? And guess what, accessing values in assembly LOOKS THE SAME AS IN C (just keep in mind that the index is a byte offset, so in a DW array element i lives at offset 2*i) !!!
+#+BEGIN_SRC asm
+    MOV AL, a[0] ; Copies 48h to AL
+    MOV BL, b[0] ; Also Copies 48h to BL
+#+END_SRC
+You can also use any of the memory index registers BX, SI, DI, BP, for example:
+#+BEGIN_SRC asm
+MOV SI, 3
+MOV AL, a[SI]
+#+END_SRC
+
+If you need to declare a large array you can use the DUP operator.
+The syntax for *DUP*:
+
+number DUP ( value(s) )
+*number* - number of duplicates to make (any constant value).
+*value* - expression that DUP will duplicate.
+
+for example:
+#+BEGIN_SRC asm
+c DB 5 DUP(9)
+;is an alternative way of declaring:
+c DB 9, 9, 9, 9, 9
+#+END_SRC
+one more example:
+#+BEGIN_SRC asm
+d DB 5 DUP(1, 2)
+;is an alternative way of declaring:
+d DB 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
+#+END_SRC
+Of course, you can use DW instead of DB if it's required to keep values larger than 255, or smaller than -128. DW cannot be used to declare strings.
-- 
cgit 1.4.1-2-gfad0