A gathering of notes related to bits, bytes, endianness, etc.
Number Systems
Decimal
The numbering system we use in everyday life is called decimal (base 10). Each column holds a digit between 0 and 9 (10 possibilities).
| 10^3 | 10^2 | 10^1 | 10^0 |
|---|---|---|---|
| 1 | 8 | 2 | 5 |
The top row is the value of the column and the bottom row is a decimal number. For anyone who's a little rusty on their maths: x^y means y copies of x multiplied together. So, 10^4 = 10 * 10 * 10 * 10.
The decimal number 1825 is one thousand, eight hundred and twenty five or
(1 * 10^3) + (8 * 10^2) + (2 * 10^1) + (5 * 10^0)
= (1 * 10 * 10 * 10) + (8 * 10 * 10) + (2 * 10) + (5 * 1)
= (1 * 1000) + (8 * 100) + (2 * 10) + (5 * 1)
= 1000 + 800 + 20 + 5 = 1825

Remember that any number to the power of 0 is 1, so 10^0 = 1.
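The column arithmetic above can be sketched in Java. This is only an illustration: the class name PositionalValue and the method valueOf are made up for this example and are not part of any library.

```java
// Hypothetical helper: evaluates a row of digits column by column.
final class PositionalValue {
    /** Sums digit * base^position, with the rightmost digit at position 0. */
    static long valueOf(int[] digits, int base) {
        long total = 0;
        long columnValue = 1; // base^0, the value of the rightmost column
        for (int i = digits.length - 1; i >= 0; i--) {
            total += digits[i] * columnValue;
            columnValue *= base; // move one column to the left
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(valueOf(new int[] {1, 8, 2, 5}, 10)); // prints 1825
    }
}
```

The same method works for any base, because only the column values change.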
Binary
Computers use the binary system (base 2), where each column can have a 0 or a 1 (2 possibilities). This works well for machinery, as two distinct electrical states (for example, a high and a low voltage) can be used to represent 1 and 0.
| 2^3 | 2^2 | 2^1 | 2^0 |
|---|---|---|---|
| 1 | 0 | 1 | 1 |
The binary number 1011 is equivalent to 11 in decimal.
(1 * 2^3) + (0 * 2^2) + (1 * 2^1) + (1 * 2^0)
= (1 * 2 * 2 * 2) + (0 * 2 * 2) + (1 * 2) + (1 * 1)
= (1 * 8) + (0 * 4) + (1 * 2) + (1 * 1)
= 8 + 0 + 2 + 1 = 11

As you can see, you need a lot more columns to represent a number in binary than you do in decimal. If you had to convert 1825 into binary it would be

011100100001
= 2^10 + 2^9 + 2^8 + 2^5 + 2^0
= 1024 + 512 + 256 + 32 + 1 = 1825
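As a rough sketch, the same conversion can be done in Java either by hand or with the standard Integer.parseInt method, which accepts a radix. The class name BinaryToDecimal and the helper fromBinary are invented for this example, and the helper assumes its input contains only the characters 0 and 1.

```java
final class BinaryToDecimal {
    /** Reads bits left to right: double the running total, then add the next bit. */
    static int fromBinary(String bits) {
        int total = 0;
        for (char c : bits.toCharArray()) {
            total = total * 2 + (c - '0'); // assumes c is '0' or '1'
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(fromBinary("1011"));          // prints 11
        System.out.println(Integer.parseInt("1011", 2)); // prints 11
    }
}
```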
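Going from decimal to binary can be sketched as repeated division by 2. The class DecimalToBinary and the method toBinary are hypothetical names for this example, and the method assumes a non-negative input; Integer.toBinaryString is the standard library equivalent. Neither prints the leading zero shown above, which is only padding.

```java
final class DecimalToBinary {
    /** Repeatedly divides by 2; each remainder is the next bit, least significant first. */
    static String toBinary(int value) {
        if (value == 0) {
            return "0";
        }
        StringBuilder bits = new StringBuilder();
        for (int n = value; n > 0; n /= 2) {
            bits.insert(0, n % 2); // prepend, so the last remainder ends up leftmost
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(1825));               // prints 11100100001
        System.out.println(Integer.toBinaryString(1825)); // prints 11100100001
    }
}
```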
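When talking about binary, each column is referred to as a bit (so each 0 or 1 is a bit) and each collection of 8 bits is referred to as a byte. A kilobyte (KB) is traditionally 1024 bytes. A megabyte (MB) is 1024 kilobytes or 1048576 bytes. (Strictly, the IEC names for these 1024-based units are kibibyte, KiB, and mebibyte, MiB, and the lowercase-b abbreviations kb and Mb usually denote bits rather than bytes.) Now that's a lot of 1s and 0s!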
Hexadecimal
Binary is a very verbose format, so it's common for binary data to be represented using hexadecimal (hex for short), which is base 16. That means every column can contain one of 16 unique values. As we only have 10 numerical digits, hex uses some letters from the alphabet to give a full set of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F.
| 16^3 | 16^2 | 16^1 | 16^0 |
|---|---|---|---|
| 1 | b | 3 | a |
The hex number 1b3a is equivalent to 6970 in decimal.
For the following calculations the alphabetical characters map to decimal equivalents: a = 10, b = 11, c = 12, d = 13, e = 14 and f = 15.

(1 * 16^3) + (11 * 16^2) + (3 * 16^1) + (10 * 16^0)
= (1 * 16 * 16 * 16) + (11 * 16 * 16) + (3 * 16) + (10 * 1)
= (1 * 4096) + (11 * 256) + (3 * 16) + (10 * 1)
= 4096 + 2816 + 48 + 10 = 6970
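Here is a small Java sketch of the same calculation. The class HexToDecimal and the method fromHex are made-up names for this example; Character.digit, Integer.parseInt and Integer.toHexString are standard library methods.

```java
final class HexToDecimal {
    /** Reads hex digits left to right: multiply the running total by 16, then add the digit. */
    static int fromHex(String hex) {
        int total = 0;
        for (char c : hex.toCharArray()) {
            int digit = Character.digit(c, 16); // 'a' -> 10 ... 'f' -> 15, -1 if not a hex digit
            if (digit < 0) {
                throw new IllegalArgumentException("Not a hex digit: " + c);
            }
            total = total * 16 + digit;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(fromHex("1b3a"));              // prints 6970
        System.out.println(Integer.parseInt("1b3a", 16)); // prints 6970
        System.out.println(Integer.toHexString(6970));    // prints 1b3a
    }
}
```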
So, how can hex make binary look more attractive? Every 4 bits from a binary number can be represented as 1 column from a hex number! Let's take 011100100001 from earlier and break it up:

011100100001
0111 0010 0001 (in binary)
7 2 1 (in hex)

So 011100100001 in binary becomes 721 in hex. Another example:

1110011011010010
1110 0110 1101 0010 (in binary)
e 6 d 2 (in hex)
So 1110011011010010 in binary becomes e6d2 in hex.
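The 4-bits-per-hex-column grouping can be sketched in Java as follows. NibbleGrouping and binaryToHex are hypothetical names for this example, and the method assumes the input contains only 0 and 1 characters.

```java
final class NibbleGrouping {
    /** Converts a binary string to hex by translating each group of 4 bits separately. */
    static String binaryToHex(String bits) {
        // Left-pad with zeros so the length is a multiple of 4.
        StringBuilder padded = new StringBuilder(bits);
        while (padded.length() % 4 != 0) {
            padded.insert(0, '0');
        }
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < padded.length(); i += 4) {
            int group = Integer.parseInt(padded.substring(i, i + 4), 2); // 0..15
            hex.append(Character.forDigit(group, 16));                   // 0-9, a-f
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(binaryToHex("011100100001"));     // prints 721
        System.out.println(binaryToHex("1110011011010010")); // prints e6d2
    }
}
```

Each group is converted on its own, which is why no arithmetic on the whole number is needed.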
Endianness
Endianness refers to which end of a multi-byte value is stored first: the most significant byte or the least significant byte. SPARC processors use the big endian storage format, which stores the most significant byte first. Intel processors (the common PC) store data in little endian format, with the least significant byte first.
Java uses big endian format (its data streams and NIO buffers default to it), whereas a lot of C code I've come across uses little endian. If you need to read something in little endian using Java, it's best to use the NIO suite.
ByteBuffer.wrap( buffer ).order( ByteOrder.LITTLE_ENDIAN ).getLong();
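To make the one-liner above concrete, here is a minimal, self-contained sketch. The class name EndianRead, the helper readLong and the sample bytes are made up for illustration; ByteBuffer and ByteOrder are the standard java.nio classes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class EndianRead {
    /** Interprets the first 8 bytes of the buffer as a long in the given byte order. */
    static long readLong(byte[] buffer, ByteOrder order) {
        return ByteBuffer.wrap(buffer).order(order).getLong();
    }

    public static void main(String[] args) {
        byte[] buffer = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08};

        // Big endian: the first byte is the most significant.
        System.out.println(String.format("%016x", readLong(buffer, ByteOrder.BIG_ENDIAN)));
        // prints 0102030405060708

        // Little endian: the first byte is the least significant.
        System.out.println(String.format("%016x", readLong(buffer, ByteOrder.LITTLE_ENDIAN)));
        // prints 0807060504030201
    }
}
```

The same eight bytes give two different numbers, which is why the byte order of a file or network format has to be known before reading it.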
Decimal (like the binary and hexadecimal notation discussed above) is written big endian style, as it puts the most significant digit first (on the left-hand side). A binary example of the difference, applied to the order of the bits:

Big Endian: 00110110 = 32 + 16 + 4 + 2 = 54
Little Endian: 00110110 = 4 + 8 + 32 + 64 = 108

There is a more thorough description, and a good C example highlighting the problems caused by endianness differences, on Wikipedia.
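The two readings of 00110110 can be reproduced with a short Java sketch. BitOrder, mostSignificantFirst and leastSignificantFirst are invented names for this illustration. It treats endianness at the level of individual bits only to mirror the example; in practice, processors apply it to the order of whole bytes.

```java
final class BitOrder {
    /** Leftmost bit carries the largest weight (the usual way of writing binary). */
    static int mostSignificantFirst(String bits) {
        return Integer.parseInt(bits, 2);
    }

    /** Leftmost bit carries the smallest weight: reverse the string, then read it as usual. */
    static int leastSignificantFirst(String bits) {
        String reversed = new StringBuilder(bits).reverse().toString();
        return Integer.parseInt(reversed, 2);
    }

    public static void main(String[] args) {
        System.out.println(mostSignificantFirst("00110110"));  // prints 54
        System.out.println(leastSignificantFirst("00110110")); // prints 108
    }
}
```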
Hex Editing
A good Guide to Hex Editing