-rw-r--r-- blog/asm/1.html 157
-rw-r--r-- src/gifs/lain-dance.gif bin 0 -> 55181 bytes
-rw-r--r-- src/org/blog/assembly/1.org 68
3 files changed, 137 insertions, 88 deletions
diff --git a/blog/asm/1.html b/blog/asm/1.html
index f283a63..702bc47 100644
--- a/blog/asm/1.html
+++ b/blog/asm/1.html
@@ -3,7 +3,7 @@
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
 <head>
-<!-- 2024-02-24 Sat 18:22 -->
+<!-- 2024-03-07 Thu 20:48 -->
 <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1" />
 <title>x86 Assembly from my understanding</title>
@@ -23,9 +23,9 @@
 <p>
 Soooo this article (or maybe even a series of articles, who knows ?) will be about x86 assembly, or rather, what I understood from it and my road from the bottom-up hopefully reaching a good level of understanding
 </p>
-<div id="outline-container-org0804bec" class="outline-2">
-<h2 id="org0804bec">Memory :</h2>
-<div class="outline-text-2" id="text-org0804bec">
+<div id="outline-container-orgf540874" class="outline-2">
+<h2 id="orgf540874">Memory :</h2>
+<div class="outline-text-2" id="text-orgf540874">
 <p>
 Memory is a sequence of octets (Aka 8bits) that each have a unique integer assigned to them called <b>The Effective Address (EA)</b>, in this particular CPU Architecture (the i8086), the octet is designated by a couple (A segment number, and the offset in the segment)
 </p>
@@ -40,9 +40,9 @@ Memory is a sequence of octets (Aka 8bits) that each have a unique integer assig
 The offset and segment are encoded in 16bits, so they take a value between 0 and 65535
 </p>
 </div>
-<div id="outline-container-org91745ea" class="outline-4">
-<h4 id="org91745ea">Important :</h4>
-<div class="outline-text-4" id="text-org91745ea">
+<div id="outline-container-orgdfb155c" class="outline-4">
+<h4 id="orgdfb155c">Important :</h4>
+<div class="outline-text-4" id="text-orgdfb155c">
 <p>
 The relation between the Effective Address and the Segment &amp; Offset is as follows :
 </p>
@@ -52,8 +52,8 @@ The relation between the Effective Address and the Segment &amp; Offset is as fo
 </p>
 </div>
 <ul class="org-ul">
-<li><a id="orge330b02"></a>Example :<br />
-<div class="outline-text-5" id="text-orge330b02">
+<li><a id="org1aab4ca"></a>Example :<br />
+<div class="outline-text-5" id="text-org1aab4ca">
 <p>
 Take the Physical address (or Effective Address, these two terms are interchangeable) <b>12345h</b> (the h refers to Hexadecimal, which can also be written like this <b>0x12345</b>), with the register <b>DS = 1230h</b> and the register <b>SI = 0045h</b>. The CPU calculates the physical address by multiplying the content of the segment register <b>DS</b> by 10h (or 16) and adding the content of the register <b>SI</b>, so we get : <b>1230h x 10h + 45h = 12345h</b>
 </p>
@@ -66,16 +66,16 @@ Now if you are a clever one ( I know you are, since you are reading this &lt;3 )
 </li>
 </ul>
 </div>
-<div id="outline-container-org90039d0" class="outline-3">
-<h3 id="org90039d0">Registers</h3>
-<div class="outline-text-3" id="text-org90039d0">
+<div id="outline-container-org86f9b8f" class="outline-3">
+<h3 id="org86f9b8f">Registers</h3>
+<div class="outline-text-3" id="text-org86f9b8f">
 <p>
 The 8086 CPU has 14 registers, each 16 bits wide. From the POV of the user, the 8086 has 3 groups of 4 registers of 16 bits, one status (flags) register that uses 9 bits, and a 16-bit program counter (the instruction pointer, IP) that the user can&rsquo;t read or write directly; only jumps, calls and returns change it.
 </p>
 </div>
-<div id="outline-container-org758d630" class="outline-4">
-<h4 id="org758d630">General Registers</h4>
-<div class="outline-text-4" id="text-org758d630">
+<div id="outline-container-org9fb78cf" class="outline-4">
+<h4 id="org9fb78cf">General Registers</h4>
+<div class="outline-text-4" id="text-org9fb78cf">
 <p>
 General registers are used for arithmetic, logic, and addressing too.
 </p>
@@ -125,97 +125,126 @@ Now here are the Registers we can find in this section:
 </div>
 </div>
 </div>
-<div id="outline-container-org810d22b" class="outline-4">
-<h4 id="org810d22b">Offset/Address Registers</h4>
-<div class="outline-text-4" id="text-org810d22b">
+</div>
+<div id="outline-container-orgbdc7488" class="outline-3">
+<h3 id="orgbdc7488">Addressing and registers&#x2026;again</h3>
+<div class="outline-text-3" id="text-orgbdc7488">
+</div>
+<div id="outline-container-org81d6e8a" class="outline-4">
+<h4 id="org81d6e8a">I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?</h4>
+<div class="outline-text-4" id="text-org81d6e8a">
+<p>
+Well, let&rsquo;s take a step back to the notion of effective addresses vs. relative ones.
+</p>
+</div>
+</div>
+<div id="outline-container-org7dee427" class="outline-4">
+<h4 id="org7dee427">Effective = 10h x Segment + Offset . Part1</h4>
+<div class="outline-text-4" id="text-org7dee427">
 <p>
-<b>SP</b>: This is the stack pointer. It is of 16 bits. It points to the topmost item of the stack. If the stack is empty the stack pointer will be (FFFE)H (or 65534 in decimal). Its offset address is relative to the stack segment(SS).
+To access a specific memory location, we use the notation <b>[Segment:Offset]</b>. For example, assuming <b>DS = 0100h</b>, we want to write the value <b>0x0005</b> to the memory location at physical address <b>1234h</b>. What do we do ?
 </p>
+</div>
+<ul class="org-ul">
+<li><a id="org0f12415"></a>Answer :<br />
+<div class="outline-text-5" id="text-org0f12415">
+<div class="org-src-container">
+<pre class="src src-asm"><span style="color: #89b4fa;">MOV</span> [DS:0234h], 0x0005
+</pre>
+</div>
 
 <p>
-<b>BP</b>: This is the base pointer. It is of 16 bits. It is primarily used in accessing parameters passed by the stack. Its offset address is relative to the stack segment(SS).
+Why ? Let&rsquo;s break it down :
+<img src="../../../gifs/lain-dance.gif" alt="lain-dance.gif" />
 </p>
 
+
+
 <p>
-<b>SI</b>: This is the source index register. It is of 16 bits. It is used in the pointer addressing of data and as a source in some string-related operations. Its offset is relative to the data segment(DS).
+We already know that <b>Effective = 10h x Segment + Offset</b>, so here we have : <b>1234h = 10h x DS + Offset</b>. Since <b>DS = 0100h</b>, we end up with this simple equation : <b>1234h = 1000h + Offset</b>, therefore the Offset is <b>0234h</b>.
 </p>
 
+
 <p>
-<b>DI</b>: This is the destination index register. It is of 16 bits. It is used in the pointer addressing of data and as a destination in some string-related operations. Its offset is relative to the extra segment(ES).
+Simple, right ? Now for another example.
 </p>
 </div>
+</li>
+</ul>
 </div>
-<div id="outline-container-orgfd6556c" class="outline-4">
-<h4 id="orgfd6556c">Segment Registers</h4>
-<div class="outline-text-4" id="text-orgfd6556c">
+<div id="outline-container-org757ac64" class="outline-4">
+<h4 id="org757ac64">Another example :</h4>
+<div class="outline-text-4" id="text-org757ac64">
 <p>
-<b>CS</b>: Code Segment, it defines the start of the program memory, and the different addresses of the different instructions relative to CS.
+What if we now have this instruction ?
 </p>
-
+<div class="org-src-container">
+<pre class="src src-asm">    <span style="color: #cba6f7;">MOV</span> [0234h], 0x0005
+</pre>
+</div>
 <p>
-<b>DS</b>: Data Segment, defines the start of the data memory where we store all data processed by the program.
+What does it do ? You might or might not be surprised that it does the exact same thing as the other snippet of code. Why though ? Because when no segment is given, the CPU implicitly uses <b>DS</b> as the default segment for data accesses. So if you don&rsquo;t specify a segment or a register (we will get to this later), the offset is taken relative to the DS segment.
 </p>
-
+</div>
+</div>
+<div id="outline-container-org2f959c2" class="outline-4">
+<h4 id="org2f959c2">Segment + Register &lt;3</h4>
+<div class="outline-text-4" id="text-org2f959c2">
 <p>
-<b>SS</b>: Stack Segment, or the start of the pile. The pile is a memory zone that is managed in a particular way, it&rsquo;s like a pile of plates, where we can only remove and add plates on top of the pile. Only one address register is enough to manage it, its the stack pointer SP. We say that this pile is a LIFO pile (Last IN, First OUT)
+Consider <b>DS = 0100h</b> and <b>BX = BP = 0234h</b> and this code snippet:
 </p>
+<div class="org-src-container">
+<pre class="src src-asm">    <span style="color: #cba6f7;">MOV</span> [BX], 0x0005 <span style="color: #6c7086;">; </span><span style="color: #a6e3a1; font-weight: bold;">NOTE</span><span style="color: #6c7086;"> : ITS NOT THE SAME AS MOV BX, 0x0005. Refer to earlier paragraphs</span>
+</pre>
+</div>
+
 
 <p>
-<b>EX</b>: The start of an auxiliary segment for data
+Well you guessed it right, it also does the same thing, but now consider this :
 </p>
+<div class="org-src-container">
+<pre class="src src-asm">    <span style="color: #cba6f7;">MOV</span> [BP], 0x0005
+</pre>
 </div>
-</div>
-</div>
-<div id="outline-container-orgb663ae9" class="outline-3">
-<h3 id="orgb663ae9">The format of an address:</h3>
-<div class="outline-text-3" id="text-orgb663ae9">
+
 <p>
-An Address must have this fellowing form [RS : RO] with the following possibilities:
+If you answered that it&rsquo;s the same one, you are wrong. This is because the implicit segment depends on the register used as the offset. Here is the explicit equivalent of the two instructions above:
 </p>
-
-<ul class="org-ul">
-<li>A value : Nothing</li>
-<li>ES : DI</li>
-<li>CS : SI</li>
-<li>ES : BP</li>
-<li>DS : BX</li>
-</ul>
+<div class="org-src-container">
+<pre class="src src-asm">    <span style="color: #cba6f7;">MOV</span> [DS:BX], 0x0005
+    <span style="color: #cba6f7;">MOV</span> [SS:BP], 0x0005
+</pre>
 </div>
-<div id="outline-container-orgc26de48" class="outline-4">
-<h4 id="orgc26de48">Note 1 :</h4>
-<div class="outline-text-4" id="text-orgc26de48">
+
 <p>
-When the register isn&rsquo;t specified. the CPU adds it depending on the offset used :
+The General rule of thumb is as follows :
 </p>
-
 <ul class="org-ul">
 <li>If the offset is : DI SI or BX, the Segment used is DS.</li>
-<li>If its BP, then the segment is SS.</li>
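+<li>If it&rsquo;s BP, then the segment is SS. SP is also relative to SS, but on the 8086 it is only used implicitly (by PUSH, POP, CALL and RET), not inside brackets.</li>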
 </ul>
 </div>
-</div>
-<div id="outline-container-orgc918fef" class="outline-4">
-<h4 id="orgc918fef">Note 2 :</h4>
-<div class="outline-text-4" id="text-orgc918fef">
+<ul class="org-ul">
+<li><a id="org0e8010b"></a>Note<br />
+<div class="outline-text-5" id="text-org0e8010b">
 <p>
-Apparently we will assume that we are in the DS segment and only access to memory using the offset.
+The values of the registers CS, DS and SS are automatically initialized by the OS when launching the program, so these segments are implicit. AKA : if we want to access specific data in memory, we just need to specify its offset. Also, you can&rsquo;t load an immediate value directly into a segment register such as DS; the value has to go through a general register first, so:
 </p>
+<div class="org-src-container">
+<pre class="src src-asm"><span style="color: #89b4fa;">MOV</span> <span style="color: #cba6f7;">DS</span>, 0x0005 <span style="color: #6c7086;">; </span><span style="color: #6c7086;">Is INVALID</span>
+<span style="color: #89b4fa;">MOV</span> <span style="color: #cba6f7;">DS</span>, AX <span style="color: #6c7086;">; </span><span style="color: #6c7086;">This one is VALID</span>
+</pre>
 </div>
 </div>
-<div id="outline-container-org4affc44" class="outline-4">
-<h4 id="org4affc44">Note 3 :</h4>
-<div class="outline-text-4" id="text-org4affc44">
-<p>
-The values of the registers CS DS and SS are automatically initialized by the OS when launching the program. So these segments are implicit. AKA : If we want to access a specific data in memory, we just need to specify its offset.
-</p>
-</div>
+</li>
+</ul>
 </div>
 </div>
 </div>
 </div>
 <div id="postamble" class="status">
 <p class="author">Author: Crystal</p>
-<p class="date">Created: 2024-02-24 Sat 18:22</p>
+<p class="date">Created: 2024-03-07 Thu 20:48</p>
 </div>
 </body>
 </html>
diff --git a/src/gifs/lain-dance.gif b/src/gifs/lain-dance.gif
new file mode 100644
index 0000000..aeb56be
--- /dev/null
+++ b/src/gifs/lain-dance.gif
Binary files differ
diff --git a/src/org/blog/assembly/1.org b/src/org/blog/assembly/1.org
index daa4976..fa77e49 100644
--- a/src/org/blog/assembly/1.org
+++ b/src/org/blog/assembly/1.org
@@ -68,42 +68,62 @@ LOOP
 #+BEGIN_SRC asm
 MUL BX (DX, AX = AX * BX)
 #+END_SRC
+** Addressing and registers...again
+*** I realized what I wrote here before was almost gibberish, sooo here we go again I guess ?
 
-*** Offset/Address Registers
-*SP*: This is the stack pointer. It is of 16 bits. It points to the topmost item of the stack. If the stack is empty the stack pointer will be (FFFE)H (or 65534 in decimal). Its offset address is relative to the stack segment(SS).
+Well, let's take a step back to the notion of effective addresses vs. relative ones.
+*** Effective = 10h x Segment + Offset . Part1
+To access a specific memory location, we use the notation *[Segment:Offset]*. For example, assuming *DS = 0100h*, we want to write the value *0x0005* to the memory location at physical address *1234h*. What do we do ?
+**** Answer :
+#+BEGIN_SRC asm
+MOV [DS:0234h], 0x0005
+#+END_SRC
 
-*BP*: This is the base pointer. It is of 16 bits. It is primarily used in accessing parameters passed by the stack. Its offset address is relative to the stack segment(SS).
+Why ? Let's break it down :
+[[../../../gifs/lain-dance.gif]]
 
-*SI*: This is the source index register. It is of 16 bits. It is used in the pointer addressing of data and as a source in some string-related operations. Its offset is relative to the data segment(DS).
 
-*DI*: This is the destination index register. It is of 16 bits. It is used in the pointer addressing of data and as a destination in some string-related operations. Its offset is relative to the extra segment(ES).
 
-*** Segment Registers
-*CS*: Code Segment, it defines the start of the program memory, and the different addresses of the different instructions relative to CS.
+We already know that *Effective = 10h x Segment + Offset*, so here we have : *1234h = 10h x DS + Offset*. Since *DS = 0100h*, we end up with this simple equation : *1234h = 1000h + Offset*, therefore the Offset is *0234h*.
 
-*DS*: Data Segment, defines the start of the data memory where we store all data processed by the program.
 
-*SS*: Stack Segment, or the start of the pile. The pile is a memory zone that is managed in a particular way, it's like a pile of plates, where we can only remove and add plates on top of the pile. Only one address register is enough to manage it, its the stack pointer SP. We say that this pile is a LIFO pile (Last IN, First OUT)
+Simple, right ? Now for another example.
+*** Another example :
+What if we now have this instruction ?
+#+BEGIN_SRC asm
+    MOV [0234h], 0x0005
+#+END_SRC
+What does it do ? You might or might not be surprised that it does the exact same thing as the other snippet of code. Why though ? Because when no segment is given, the CPU implicitly uses *DS* as the default segment for data accesses. So if you don't specify a segment or a register (we will get to this later), the offset is taken relative to the DS segment.
+
 
-*EX*: The start of an auxiliary segment for data
 
-** The format of an address:
-An Address must have this fellowing form [RS : RO] with the following possibilities:
+*** Segment + Register <3
 
-- A value : Nothing
-- ES : DI
-- CS : SI
-- ES : BP
-- DS : BX
+Consider *DS = 0100h* and *BX = BP = 0234h* and this code snippet:
+#+BEGIN_SRC asm
+    MOV [BX], 0x0005 ; NOTE : ITS NOT THE SAME AS MOV BX, 0x0005. Refer to earlier paragraphs
+#+END_SRC
 
-*** Note 1 :
-When the register isn't specified. the CPU adds it depending on the offset used :
 
+Well you guessed it right, it also does the same thing, but now consider this :
+#+BEGIN_SRC asm
+    MOV [BP], 0x0005
+#+END_SRC
+
+If you answered that it's the same one, you are wrong. This is because the implicit segment depends on the register used as the offset. Here is the explicit equivalent of the two instructions above:
+#+BEGIN_SRC asm
+    MOV [DS:BX], 0x0005
+    MOV [SS:BP], 0x0005
+#+END_SRC
+
+The General rule of thumb is as follows :
 - If the offset is : DI SI or BX, the Segment used is DS.
-- If its BP, then the segment is SS.
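+- If it's BP, then the segment is SS. SP is also relative to SS, but on the 8086 it is only used implicitly (by PUSH, POP, CALL and RET), not inside brackets.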
 
-*** Note 2 :
-Apparently we will assume that we are in the DS segment and only access to memory using the offset.
 
-*** Note 3 :
-The values of the registers CS DS and SS are automatically initialized by the OS when launching the program. So these segments are implicit. AKA : If we want to access a specific data in memory, we just need to specify its offset.
+**** Note
+The values of the registers CS, DS and SS are automatically initialized by the OS when launching the program, so these segments are implicit. AKA : if we want to access specific data in memory, we just need to specify its offset. Also, you can't load an immediate value directly into a segment register such as DS; the value has to go through a general register first, so:
+#+BEGIN_SRC asm
+MOV DS, 0x0005 ; Is INVALID
+MOV DS, AX ; This one is VALID
+#+END_SRC
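
The relation *Effective = 10h x Segment + Offset* that this change leans on can be checked by hand with a small sketch. This is only an illustration in Python, not part of the commit; the helper names `physical_address` and `offset_for` are made up for this example. It assumes the plain 8086 behaviour of a 20-bit address bus, so results wrap at 1 MiB.

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit 8086 physical address for segment:offset (hypothetical helper)."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must fit in 16 bits")
    # Shifting left by 4 bits is the same as multiplying by 10h.
    # The 8086 only has 20 address lines, so the sum wraps at 1 MiB.
    return ((segment << 4) + offset) & 0xFFFFF


def offset_for(segment: int, physical: int) -> int:
    """Solve physical = 10h * segment + offset for the offset (hypothetical helper)."""
    offset = physical - (segment << 4)
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("address is not reachable from this segment")
    return offset


# Example from the post: DS = 1230h, SI = 0045h
print(hex(physical_address(0x1230, 0x0045)))  # 0x12345
# Example from the post: DS = 0100h, physical address 1234h
print(hex(offset_for(0x0100, 0x1234)))        # 0x234
```

The second call reproduces the reasoning behind `MOV [DS:0234h], 0x0005`: with DS = 0100h, the offset that reaches physical address 1234h is 0234h.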