An Introduction to Systems Programming in Rust
In this chapter, we'll emulate our own CPU and memory and learn about concepts like a CPU's endianness.
Endianness
Endianness is simply the order in which a CPU stores the bytes of a multi-byte value. There are two conventions: big endian, where the most significant byte comes first, and little endian, where the least significant byte comes first. x86 is little endian, ARM is bi-endian (it can operate in either mode), and many network protocols use big endian, which is why it's also called network byte order.
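To see the two byte orders side by side, we can ask Rust for both representations of the same integer. This is a minimal sketch using the standard library's `to_be_bytes`, `to_le_bytes`, and `to_ne_bytes` methods:

```rust
fn main() {
    let n: u32 = 0xAABBCCDD;

    // Big endian: most significant byte first.
    assert_eq!(n.to_be_bytes(), [0xAA, 0xBB, 0xCC, 0xDD]);

    // Little endian: least significant byte first -- how x86 lays it out in memory.
    assert_eq!(n.to_le_bytes(), [0xDD, 0xCC, 0xBB, 0xAA]);

    // Native endianness: whichever order the CPU running this code prefers.
    println!("native byte order: {:02X?}", n.to_ne_bytes());
}
```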
Demystifying floating point numbers and their quirkiness
Inspiration
Floating-point numbers were created by computer scientists who took inspiration from scientific notation, the standard way of representing very large (and very small) numbers. Here is what the mass of the Sun looks like in scientific notation: roughly 1.989 × 10^30 kg.
Inspired by this, computer scientists came up with a fixed-width format that encodes a wide range of numbers. A floating-point value is a container with three fields:
- A sign bit
- An exponent
- A mantissa
Looking inside 32-bit floats
sign | exponent | mantissa
0    | 10000100 | 01010011010111000010100
This is how one would represent the float 42.42. Now, how do you actually go about reconstructing 42.42 from these encoded components?
You can use this neat little formula:
n = (-1)^sign × 2^(exponent − 127) × mantissa
where the mantissa includes the implicit leading 1 described later in this section.
One quirk of floating-point numbers is that the sign bit allows for both 0 and −0: two values with different bit patterns that compare as equal. Conversely, NaN values with identical bit patterns compare as unequal.
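A quick sketch of both quirks, using `to_bits` to expose the underlying bit patterns:

```rust
fn main() {
    // 0 and -0 have different bit patterns...
    assert_ne!((0.0_f32).to_bits(), (-0.0_f32).to_bits());
    // ...yet compare as equal.
    assert_eq!(0.0_f32, -0.0_f32);

    // Two NaN values can share the exact same bit pattern...
    let (a, b) = (f32::NAN, f32::NAN);
    assert_eq!(a.to_bits(), b.to_bits());
    // ...yet NaN never compares equal to anything, including itself.
    assert!(a != b);
}
```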
Tearing apart 32-bit floats in Rust
```rust
const BIAS: i32 = 127;   // the standard exponent bias for f32
const RADIX: f32 = 2.0;  // because binary

fn main() {
    let n: f32 = 42.42;

    let (sign, exp, frac) = to_parts(n);
    let (sign_, exp_, mant) = decode(sign, exp, frac);
    let n_ = from_parts(sign_, exp_, mant);

    println!("{} -> {}", n, n_);
    println!("field | as bits | as real number");
    println!("sign | {:01b} | {}", sign, sign_);
    println!("exponent | {:08b} | {}", exp, exp_);
    println!("mantissa | {:023b} | {}", frac, mant);
}

/// Extracts the raw bit fields of an f32 without interpreting them.
fn to_parts(n: f32) -> (u32, u32, u32) {
    let bits = n.to_bits();
    let sign = (bits >> 31) & 1;        // bit 31
    let exponent = (bits >> 23) & 0xff; // bits 23..=30
    let fraction = bits & 0x7fffff;     // bits 0..=22
    (sign, exponent, fraction)
}

/// Interprets the raw fields as the real numbers they encode.
fn decode(sign: u32, exponent: u32, fraction: u32) -> (f32, f32, f32) {
    let signed_1 = (-1.0_f32).powf(sign as f32);

    let exponent = (exponent as i32) - BIAS;
    let exponent = RADIX.powf(exponent as f32);

    // Sum the weights of the set fraction bits, starting from the implicit 1.0.
    let mut mantissa: f32 = 1.0;
    for i in 0..23 {
        let mask = 1 << i;
        let one_at_bit_i = fraction & mask;
        if one_at_bit_i != 0 {
            let weight = 2_f32.powf(i as f32 - 23.0);
            mantissa += weight;
        }
    }

    (signed_1, exponent, mantissa)
}

fn from_parts(sign: f32, exponent: f32, mantissa: f32) -> f32 {
    sign * exponent * mantissa
}
```
This nifty little script separates a float into its three building blocks. Here is what the output looks like:
```
42.42 -> 42.42
field | as bits | as real number
sign | 0 | 1
exponent | 10000100 | 32
mantissa | 01010011010111000010100 | 1.325625
```
Isolating the sign bit
To isolate the sign bit, shift the other bits out of the way. For f32, this involves a right shift of 31 places (>> 31).
Isolating the exponent
To isolate the exponent, two bit manipulations are required. First, perform a right shift to overwrite the mantissa’s bits (>> 23). Then use an AND mask (& 0xff) to exclude the sign bit.
The exponent's bits also need to go through a decoding step. To decode the exponent, interpret its 8 bits as an unsigned integer, then subtract 127 from the result (a constant known as the BIAS).
Isolating the mantissa
I think the book really does a great job explaining this
To isolate the mantissa's 23 bits, you can use an AND mask to remove the sign bit and the exponent (& 0x7fffff). However, it's actually not necessary to do so, because the following decoding steps can simply ignore irrelevant bits. Unfortunately, the mantissa's decoding step is significantly more complex than the exponent's. To decode the mantissa's bits, multiply each bit by its weight and sum the results. The first bit's weight is 0.5, and each subsequent bit's weight is half of the previous one; for example, 0.5 (2^−1), 0.25 (2^−2), …, 0.00000011920928955078125 (2^−23). An implicit 24th bit that represents 1.0 (2^0) is always considered to be on, except when special cases are triggered.
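As a sanity check, we can multiply the three decoded fields back together. This small worked example uses the decoded values 1, 32, and 1.325625 from the output table earlier in this section:

```rust
fn main() {
    let sign = 1.0_f64;      // (-1)^0
    let exponent = 32.0_f64; // 2^(132 - 127) = 2^5
    let mantissa = 1.325625_f64;

    // (-1)^sign * 2^(exponent - 127) * mantissa recovers the original value.
    let n = sign * exponent * mantissa;
    assert!((n - 42.42).abs() < 1e-9);
}
```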
Fixed-point number formats
In addition to representing real numbers with floating-point formats, fixed point is also available. Fixed-point formats can be useful for representing fractions and are an option for performing calculations on CPUs without a floating-point unit (FPU), such as microcontrollers. Unlike floating-point numbers, the decimal point does not move to dynamically accommodate different ranges.

In our case, we'll be using a fixed-point number format to compactly represent values between −1..=1. Although it loses accuracy, it saves significant space.

The Q format is a fixed-point number format that uses a single byte. It was created by Texas Instruments for embedded computing devices. The specific version of the Q format that we will implement is called Q7, which indicates that there are 7 bits available for the represented number, plus 1 sign bit. We'll disguise the decimal nature of the type by hiding the 7 bits within an i8. That means the Rust compiler will be able to assist us in keeping track of the value's sign, and we will also be able to derive traits such as PartialEq and Eq, which provide comparison operators for our type, for free.
Implementing Q7 in Rust
```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Q7(i8);

impl From<f64> for Q7 {
    fn from(n: f64) -> Self {
        if n >= 1.0 {
            Q7(127)
        } else if n <= -1.0 {
            Q7(-128)
        } else {
            Q7((n * 128.0) as i8)
        }
    }
}

impl From<Q7> for f64 {
    fn from(n: Q7) -> f64 {
        (n.0 as f64) * 2f64.powf(-7.0)
    }
}

impl From<f32> for Q7 {
    fn from(n: f32) -> Self {
        Q7::from(n as f64)
    }
}

impl From<Q7> for f32 {
    fn from(n: Q7) -> f32 {
        f64::from(n) as f32
    }
}
```
```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn out_of_bounds() {
        assert_eq!(Q7::from(10.), Q7::from(1.));
        assert_eq!(Q7::from(-10.), Q7::from(-1.));
    }

    #[test]
    fn f32_to_q7() {
        let n1: f32 = 0.7;
        let q1 = Q7::from(n1);

        let n2 = -0.4;
        let q2 = Q7::from(n2);

        let n3 = 123.0;
        let q3 = Q7::from(n3);

        assert_eq!(q1, Q7(89));
        assert_eq!(q2, Q7(-51));
        assert_eq!(q3, Q7(127));
    }

    #[test]
    fn q7_to_f32() {
        let q1 = Q7::from(0.7);
        let n1 = f32::from(q1);
        assert_eq!(n1, 0.6953125);

        let q2 = Q7::from(n1);
        let n2 = f32::from(q2);
        assert_eq!(n1, n2);
    }
}
```
Conversion Magic
The real complexity lies in our conversion implementations. Let's break down the From&lt;f64&gt; implementation:
```rust
impl From<f64> for Q7 {
    fn from(n: f64) -> Self {
        if n >= 1.0 {
            Q7(127)          // Maximum positive value
        } else if n <= -1.0 {
            Q7(-128)         // Minimum negative value
        } else {
            Q7((n * 128.0) as i8)
        }
    }
}
```
What's happening here is pure numerical alchemy. We're taking a potentially unbounded floating-point number and squeezing it into our tight 8-bit representation:
- Values above 1 get clamped to 127
- Values below -1 get clamped to -128
- Values in between get scaled by multiplying by 128
The reverse conversion is equally elegant:
```rust
impl From<Q7> for f64 {
    fn from(n: Q7) -> f64 {
        (n.0 as f64) * 2f64.powf(-7.0)
    }
}
```
Here, we're essentially undoing our previous scaling, converting back to a floating-point representation by multiplying by 2^-7.
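Tracing one value through the round trip makes the precision loss concrete. This sketch applies the same scaling as the `From` implementations above, using an exact division by 128 for the decode step:

```rust
fn main() {
    // Encode: 0.7 * 128 = 89.6, truncated to 89 by the `as i8` cast.
    let stored = (0.7_f64 * 128.0) as i8;
    assert_eq!(stored, 89);

    // Decode: 89 * 2^-7 = 89 / 128 = 0.6953125 -- close to 0.7, but not equal.
    let back = (stored as f64) / 128.0;
    assert_eq!(back, 0.6953125);
}
```

The decoded value is the nearest multiple of 1/128 below the input, which is exactly the granularity a 7-bit fraction can express.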
Let's test this out
The tests are where the real validation happens. We're not just implementing a type; we're proving its behavior:
```rust
#[test]
fn out_of_bounds() {
    assert_eq!(Q7::from(10.), Q7::from(1.));
    assert_eq!(Q7::from(-10.), Q7::from(-1.));
}
```
This test confirms our clamping behavior. No matter how far beyond our range a number goes, it gets neatly tucked into our [-1, 1] boundary.
Why Fixed-Point Matters
- Embedded systems with limited computational resources
- Digital signal processing
- Graphics and game development
- Any scenario where memory efficiency trumps absolute precision
By sacrificing some floating-point flexibility, we gain predictable, compact number representation.
In the next chapter, we'll implement our very own CPU. Don't miss it!