Characters
The use of binary codes to represent characters
Computers work in binary, a number system that contains two symbols, 0 and 1 (also known as base 2). As a result, all characters, whether they are letters, punctuation or digits, are stored as binary numbers. All of the characters that a computer can use are called a character set. A character set is a table of data that links each character to a number, which allows the computer system to convert text into binary. Examples are ASCII and Unicode.
Two standard character sets in common use are:
- ASCII (American Standard Code for Information Interchange): a 7-bit character set used for representing English keyboard characters.
- Unicode: a system of encoding text in computing, widely used on the internet.
ASCII code
ASCII uses seven bits, giving a character set of 128 characters, since seven bits can form 128 different patterns. A bit is the smallest unit of data in computing, represented by a 1 or a 0 in binary. The characters are represented in a table, called the ASCII table. The 128 characters include:
- 33 control codes (mainly to do with printing)
- 32 punctuation marks and symbols, plus the space character
- 26 upper case letters
- 26 lower case letters
- 10 numeric digits (0 to 9)
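The 128 codes can be tallied with a short Python sketch. It assumes the usual convention that codes 0 to 31 and code 127 (delete) count as control codes, and it relies on the constants in Python's standard `string` module; the variable names are only illustrative.

```python
import string

# Assumption: control codes are 0-31 plus 127 (delete), the usual convention.
control = [code for code in range(128) if code < 32 or code == 127]

counts = {
    "control codes": len(control),
    "space": 1,
    "punctuation and symbols": len(string.punctuation),
    "upper case letters": len(string.ascii_uppercase),
    "lower case letters": len(string.ascii_lowercase),
    "numeric digits": len(string.digits),
}

for name, number in counts.items():
    print(f"{name}: {number}")

# The categories should account for every pattern seven bits can form.
print("total:", sum(counts.values()))
print("patterns in seven bits:", 2 ** 7)
```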
We tend to say that the letter ‘A’ is the first letter of the alphabet, ‘B’ is the second and so on, all the way up to ‘Z’, which is the 26th letter. In ASCII, each character has its own assigned number. For example:
| Character | Denary | Binary | Hexadecimal |
|---|---|---|---|
| A | 65 | 1000001 | 41 |
| Z | 90 | 1011010 | 5A |
| a | 97 | 1100001 | 61 |
| z | 122 | 1111010 | 7A |
| 0 | 48 | 0110000 | 30 |
| 9 | 57 | 0111001 | 39 |
| Space | 32 | 0100000 | 20 |
| ! | 33 | 0100001 | 21 |
‘A’ is represented by the denary number 65 (binary 1000001, hex 41), ‘B’ by 66 (binary 1000010, hex 42) and so on up to ‘Z’, which is represented by the denary number 90 (binary 1011010, hex 5A).
Similarly, lowercase letters start at denary 97 (binary 1100001, hex 61) and end at denary 122 (binary 1111010, hex 7A).
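These values can be looked up in Python, whose built-in `ord` function returns a character's code number. The sketch below is one possible way to reproduce the denary, binary and hexadecimal columns; the helper name `describe` is made up for this example.

```python
def describe(ch):
    """Return (denary, 7-bit binary, hexadecimal) for one ASCII character."""
    code = ord(ch)                      # the character's code number
    binary = format(code, "07b")        # pad to seven binary digits
    hexadecimal = format(code, "X")     # upper case hexadecimal
    return code, binary, hexadecimal

# The same characters as in the table: A, Z, a, z, 0, 9, space and !
for ch in "AZaz09 !":
    denary, binary, hexadecimal = describe(ch)
    print(repr(ch), denary, binary, hexadecimal)
```

Going the other way, `chr(66)` gives back the character 'B'.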
When text is stored or transmitted, it is each character's ASCII or Unicode number that is used, not the character itself.
For example, in binary, the word "Computer" would be represented as:
1000011 1101111 1101101 1110000 1110101 1110100 1100101 1110010
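A conversion like this can be automated. The following is a minimal sketch, assuming the text contains only 7-bit ASCII characters; the function name `to_ascii_binary` is hypothetical.

```python
def to_ascii_binary(text):
    """Convert text to space-separated 7-bit ASCII binary codes."""
    groups = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            # Outside the 128 characters that seven bits can represent.
            raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
        groups.append(format(code, "07b"))
    return " ".join(groups)

print(to_ascii_binary("Computer"))
```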
Question
What would this message say?
1001000 1100101 1101100 1101100 1101111 0100001
Answer: Hello!
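Decoding works the same way in reverse: each group of seven bits is read as a number and looked up in the character set. A small sketch, assuming the groups are separated by spaces (the function name `from_ascii_binary` is made up for this example):

```python
def from_ascii_binary(bits):
    """Convert space-separated binary codes back into text."""
    characters = []
    for group in bits.split():
        code = int(group, 2)            # read the group as a base 2 number
        characters.append(chr(code))    # look up the matching character
    return "".join(characters)

print(from_ascii_binary("1001000 1100101 1101100 1101100 1101111 0100001"))
```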
Extended ASCII
Extended ASCII uses eight bits, giving a character set of 256 characters. The extra 128 codes leave room for special characters, such as the accented letters used in French and Spanish.
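There are several different 8-bit extensions of ASCII. As an illustration only, the sketch below assumes one of them, ISO 8859-1 (called `latin-1` in Python), and shows that an accented letter needs the eighth bit.

```python
# Assumption: ISO 8859-1 (latin-1) is used as the example 8-bit character set.
ch = "\u00e9"  # the letter e with an acute accent

code = ch.encode("latin-1")[0]      # the single byte that stores the character
print(code, format(code, "08b"))    # denary value and 8-bit binary pattern

# The same character has no code in 7-bit ASCII.
try:
    ch.encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False
print("fits in 7-bit ASCII:", fits_in_ascii)
```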
Unicode
While suitable for representing English characters, 256 characters is far too few to hold every character in other languages, such as Chinese or Arabic. Unicode was originally designed around 16 bits, giving a range of over 65,000 characters, and it has since been extended to allow more than one million code numbers. This makes it more suitable for those situations.
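Python strings use Unicode, so `ord` can show why eight bits are not enough. This sketch prints the code number of a few sample characters and the number of bits needed to store it; the sample characters and the helper name `bits_needed` are choices made for this example.

```python
def bits_needed(ch):
    """Return the number of bits needed to hold a character's code number."""
    return ord(ch).bit_length()

# Sample characters, written as escapes so the code is plain ASCII.
samples = {
    "Latin A": "A",
    "e with acute accent": "\u00e9",
    "Arabic letter ain": "\u0639",
    "Chinese character zhong": "\u4e2d",
}

for name, ch in samples.items():
    print(name, ord(ch), bits_needed(ch))
```

Only the first sample fits in seven bits and only the first two fit in eight, while all four fit within 16 bits.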