Binary encoding

CSE101 fall 2010

Modern computer systems represent all possible meanings using binary encodings, which are encodings with only two symbols. These symbols can be anything, as long as there are two of them: five volts and zero volts, a pit a bump, magnetic north and magnetic south, charged and discharged, light and dark, quiet and loud, high-pitched and low-pitched. Conventionally, these symbols are referred to as 0 and 1 — but be careful! These aren't actually the numbers zero and one, they're just symbols. If it helps, think of them as “donut” and “stick”.

A single symbol in this system is called a binary digit or bit for short.

An obvious question is why use only two symbols, rather than one, ten, twenty-six, or some other number?

One symbol turns out to be not quite enough. Yes, we can represent different values using strings of one symbol, which is called a unary encoding. For example, if we choose • for our symbol then the possible symbol strings are •, ••, •••, ••••, and so on. However, each meaning in a unary system has to be encoded using a string of a different length since there is only one unary string of each length. And it turns out that being able to design systems using fixed-length symbol strings is extremely useful.

So, we need at least two symbols. In a binary encoding system, a bit string of length n has 2n possible values, so there are four different bit strings of length two, (22 = 4, and the strings are 00, 01, 10, and 11), eight different bit strings of length three (23 = 8), and 256 different bit strings of length eight (28 = 256).

A sequence of eight bits is called as a byte. A sequence of four bits is called a nibble.

Two symbols may not seem like a lot, but if we make longer sequences of them we can represent a very large number of meanings with not very many bits. For example, using strings of just 32 bits we can encode 232 = 4,294,967,296 different meanings. If the set of interesting meanings is larger, we can simply use longer strings of bits. For instance, the set of all possible stereo audio recordings of 74 minutes length or less can be nicely encoded using strings of about 5,455,872,000 bits, which is the data capacity of a standard audio CD.

Of course, if we use more than two symbols we can encode our meanings in shorter strings. For example, if we use ten different symbols like the familiar decimal digits then we can encode one thousand different meanings in a string of length three, 000 through 999, but we need a ten-bit string to do the same thing in a binary encoding (actually, 210 = 1024, but close enough!)

However, it's actually quite difficult to build real machines based on encodings with more than two symbols, for reasons that are a bit beyond the scope of this course.

Greg Phillips

 

This web site is not an official publication of the Royal Military College of Canada nor of the Department of National Defence. Ce site web n’est pas une publication officielle du Collége militaire royal du Canada ni du Ministère de la défense nationale.