Integers

Overview

Integers are typically encoded using either unsigned encoding or two’s-complement. The following table highlights how the min and max of these encodings behave:

Value	$w = 8$	$w = 16$	$w = 32$
$U M i n_{w}$	`0x00`	`0x0000`	`0x00000000`
$U M a x_{w}$	`0xFF`	`0xFFFF`	`0xFFFFFFFF`
$TM i n_{w}$	`0x80`	`0x8000`	`0x80000000`
$TM a x_{w}$	`0x7F`	`0x7FFF`	`0x7FFFFFFF`

The width of an integer type refers to the number of bits used in the type’s binary representation. It’s precision refers to the number of bits used in the type’s binary representation, excluding those used for signedness.

Unsigned Encoding

Always represents nonnegative numbers. Given an integral type $x$ of $w$ bits, we convert binary to its unsigned encoding with: $B 2 U_{w} (x) = 2^{w - 1} x_{w - 1} + \sum_{i = 0}^{w - 2} 2^{i} x_{i}$

Note we unfold the summation on the RHS by one term to make it’s relationship to $T 2 U_{w}$ clearer.

Two’s-Complement

Represents negative numbers along with nonnegative ones. Given an integral type $x$ of $w$ bits, we convert binary to its twos’-complement encoding with: $B 2 T_{w} (x) = - 2^{w - 1} x_{w - 1} + \sum_{i = 0}^{w - 2} 2^{i} x_{i}$

Casting

Most implementations of C cast an object of signed type to unsigned type and vice versa, most implementations simply re-interpret the object’s binary representation. This casting may happen implicitly if comparing or operating on signed and unsigned objects in the same expression. $T 2 U$ and $U 2 T$ reflect this method of casting:

x + 2^w & x < 0 \\ x & x \geq 0 \end{cases}$$ $$U2T_w(x) = \begin{cases} x & x \leq TMax_w \\ x - 2^w & x > TMax_w \end{cases}$$ ### Expansion For unsigned encoding, use **zero extension** to convert numbers to larger types. For example, $1010_2$ can be expanded to 8-bit $00001010_2$. For two's-complement, use **sign extension** to convert numbers to larger types. This means the additional leftmost bits are set to match the sign bit of the original number. For example, $1010_2$ can be expanded to 8-bit $11111010_2$. ### Truncation Let $$\begin{align*} x & = \langle x_{w-1}, \ldots, x_1, x_0 \rangle \\ x' & = \langle x_{k-1}, \ldots, x_1, x_0 \rangle \end{align*}$$ Then in unsigned encoding, truncating $x$ to $k$ bits is equal to $x \bmod 2^k$. This is because $x_i \bmod 2^k = 0$ for all $i \geq k$ meaning $$B2U_k(x') = B2U_w(x) \bmod 2^k$$ In two's-complement encoding, truncating $x$ to $k$ bits is equal to $U2T_k(T2U_w(x) \bmod 2^k)$. Like with unsigned truncation, $B2U_k(x') = B2U_w(x) \bmod 2^k$. Therefore $$U2T_k(B2U_k(x')) = U2T_k(B2U_w(x) \bmod 2^k)$$ ## Arithmetic ### Addition Addition of two unsigned or two two's-complement numbers operate in much the same way as grade-school arithmetic. Digits are added one-by-one and overflows "carried" to the next summation. Overflows are truncated; the final carry bit is discarded in the underlying bit adder. Unsigned addition of $w$-bit integral types, denoted $+_w^u$, behaves like so: $$x +_w^u y = \begin{cases} x + y - 2^w & \text{if } x + y \geq 2^w \\ x + y & \text{otherwise} \end{cases}$$ This is more simply expressed as $x +_w^u y = (x + y) \bmod 2^w$. Two's-complement addition, denoted $+_w^t$ operates similarly: $$x +_w^u y = \begin{cases} x + y - 2^w & \text{if } x + y \geq 2^{w-1} \\ x + y + 2^w & \text{if } x + y < -2^{w-1} \\ x + y & \text{otherwise} \end{cases}$$ Unlike with unsigned addition, there is no simpler modulus operation that can be applied. ### Shifting Left shift operations (`<<`) drop the `k` most significant bits and fills the right end of the result with `k` zeros. Right shift operations (`>>`) are classified in two ways: * **Logical** * Drops the `k` least significant bits and fills the left end of the result with `k` zeros. * This mode is always used when calling `>>` on unsigned data. * Sometimes denoted as `>>>` to disambiguate from arithmetic right shifts. * **Arithmetic** * Drops the `k` least significant bits and fills the left end of the result with `k` copies of the most significant bit. * This mode is usually used when calling `>>` on signed data. In C, it is undefined behavior to shift by more than the width $w$ of an integral type or by a negative value. ### Multiplication Unsigned multiplication, denoted with the $*_w^u$ operator, is defined as follows: $$x *_w^u y = (x \cdot y) \bmod 2^w$$ Similarly, two's-complement multiplication is defined as follows: $$x *_w^t y = U2T_w((x \cdot y) \bmod 2^w)$$ ### Division Integer division divides the result and discards any fractional result. This has the same effect as rounding toward zero. ## Bibliography * Bryant, Randal E., and David O'Hallaron. *Computer Systems: A Programmer's Perspective*. Third edition, Global edition. Always Learning. Pearson, 2016. * Finley, Thomas. “Two’s Complement,” April 2000. [https://www.cs.cornell.edu/~tomf/notes/cps104/twoscomp.html](https://www.cs.cornell.edu/~tomf/notes/cps104/twoscomp.html). * Ronald L. Graham, Donald Ervin Knuth, and Oren Patashnik, *Concrete Mathematics: A Foundation for Computer Science*, 2nd ed (Reading, Mass: Addison-Wesley, 1994). * “Two’s-Complement.” In *Wikipedia*, January 9, 2024. [https://en.wikipedia.org/w/index.php?title=Two%27s_complement&oldid=1194543561](https://en.wikipedia.org/w/index.php?title=Two%27s_complement&oldid=1194543561).

My Notebook

Explorer

Explorer

Integers

Overview

Unsigned Encoding

Two’s-Complement

Casting

Graph View

Table of Contents

Backlinks