Binary Representation of Characters
When characters are transmitted or stored, each character is represented as a binary string. The number of bits used to represent each character differs from one system to another. On punched cards there were 12 hole positions for each character, and on some paper tapes there were only 5. In most modern systems, seven, eight or nine bits are used.
In general, the number of different characters which can be encoded is 2^n, where n is the number of bits used for each character.
e.g. The number of different characters you can have with an eight-bit code is 2^8 = 256.
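The 2^n rule can be checked with a few lines of Python. This is only a minimal sketch; the function name code_count is made up for illustration and does not come from the text.

```python
def code_count(bits: int) -> int:
    """Return how many distinct bit patterns fit in `bits` bits (2 ** bits)."""
    if bits < 0:
        raise ValueError("bits must be non-negative")
    return 2 ** bits


# Code lengths mentioned in the text, from 5-hole paper tape to 12-position cards.
# These are upper limits on the number of patterns, not the size of any real character set.
for n in (5, 7, 8, 9, 12):
    print(n, "bits ->", code_count(n), "possible codes")
```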
Worked question
Eight-bit storage locations are used to store coded characters. One bit is a parity bit. 0000 0000 and 1111 1111 both have special uses and cannot be used to code characters. How many different characters can be represented?
Seven bits are used for the actual code.
Seven bits give 2^7 = 128 characters.
Two codes cannot be used.
No. of possible characters = 128 - 2 = 126
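The same arithmetic can be written as a small Python function. This is a sketch under the assumptions of the worked question (one parity bit, two reserved codes); the name usable_characters and its parameters are illustrative only.

```python
def usable_characters(total_bits: int, parity_bits: int, reserved_codes: int) -> int:
    """Count the character codes left after removing parity bits and reserved patterns."""
    data_bits = total_bits - parity_bits      # bits that actually carry the code
    return 2 ** data_bits - reserved_codes    # all patterns minus the special ones


# Eight-bit location, one parity bit, two patterns with special uses.
print(usable_characters(total_bits=8, parity_bits=1, reserved_codes=2))
```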
COMMONLY USED CODES
There are many different character codes in use. Which one is used in a particular situation depends on:
1- The size of the character set being represented.
2- The number of bits available for the codes.
3- The medium being used.
Three codes commonly used in computing (Fig. 4) are:
1- ASCII-American Standard Code for Information Interchange
2- BCD-Binary Coded Decimal
3- EBCDIC-Extended Binary Coded Decimal Interchange Code
Characteristic \ Name of code          | ASCII                            | BCD                                    | EBCDIC
Number of bits (excluding parity)      | 7                                | 6                                      | 8
Maximum possible size of character set | 128                              | 64                                     | 256
Examples of where it is used           | (i) Data transmission            | (i) Seven-track magnetic tape          | Nine-track magnetic tape
                                       | (ii) Main store of microcomputers | (ii) Main store of some large computers |
Example codes:                         |                                  |                                        |
Letter A                               | 1000001                          | 110001                                 | 11000001
Letter B                               | 1000010                          | 110010                                 | 11000010
Digit 1                                | 0110001                          | 000001                                 | 11110001
Fig. 4 Character codes
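The ASCII and EBCDIC example codes in Fig. 4 can be reproduced with Python's built-in codecs. This is a sketch only: it assumes code page 037 ("cp037"), which is one common EBCDIC variant, and the helper names are made up for illustration. Python has no standard codec for six-bit BCD, so that column is not reproduced here.

```python
def to_bits(value: int, width: int) -> str:
    """Format an integer as a binary string padded with zeros to `width` digits."""
    return format(value, f"0{width}b")


def ascii_bits(ch: str) -> str:
    """Seven-bit ASCII code of a single character."""
    return to_bits(ch.encode("ascii")[0], 7)


def ebcdic_bits(ch: str) -> str:
    """Eight-bit EBCDIC code of a single character (assumption: code page 037)."""
    return to_bits(ch.encode("cp037")[0], 8)


for ch in "AB1":
    print(ch, ascii_bits(ch), ebcdic_bits(ch))
```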
EXAMPLE-CODES ON MAGNETIC TAPE
1- Audio cassettes
Standard tape cassettes are often used as a backing store for microcomputers.
Characteristics:
(a) Often only one track is used, one bit lasting a given time interval, bits being stored one after another along the tape.
(b) Usually 1 and 0 are represented by sounds of two different frequencies.
(c) The code used is often ASCII, with ‘start’ bits and ‘stop’ bits in between the characters to show where each character begins and ends.
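Point (c) can be illustrated with a short Python sketch. The text does not say how the start and stop bits are arranged, so the layout below is an assumption: one start bit (0), then the seven ASCII bits with the least significant bit first, then one stop bit (1). Real cassette formats vary, and the function names are hypothetical.

```python
def frame_character(ch: str) -> str:
    """Wrap one ASCII character in a start bit and a stop bit (assumed layout)."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a seven-bit ASCII character")
    data = format(code, "07b")[::-1]   # assumption: least significant bit is sent first
    return "0" + data + "1"            # assumption: start bit 0, stop bit 1


def frame_text(text: str) -> str:
    """Produce the serial bit stream for a string, one frame after another."""
    return "".join(frame_character(ch) for ch in text)


print(frame_character("A"))
print(frame_text("AB"))
```

Each character occupies nine bit times in this layout, so the receiver can find the start of every character even though the bits arrive one after another on a single track.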
2- Standard 1/2-inch computer magnetic tape
This is often supplied in reels 732 metres (2400 ft) long and is used as:
(a) a seven-track tape using BCD code and a parity bit; or
(b) a nine-track tape using EBCDIC and a parity bit.
Small areas of the tape are magnetized to produce a situation comparable with paper tape (Fig. 5).
Fig. 5 Seven-track magnetic tape
Note: For further details of storage of data on magnetic tapes, and for details of data storage on discs see Unit 9.2.
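The parity bit mentioned for both tape formats can be sketched in Python. The text does not say whether even or odd parity is used or where the parity bit sits, so the code assumes even parity by default and appends the parity bit after the data bits; both choices and the function names are illustrative.

```python
def add_parity(data_bits: str, even: bool = True) -> str:
    """Append a parity bit so the total number of 1s is even (or odd)."""
    ones = data_bits.count("1")
    parity = ones % 2 if even else (ones + 1) % 2
    return data_bits + str(parity)   # assumption: parity bit is written last


def parity_ok(frame: str, even: bool = True) -> bool:
    """Check a frame (data bits plus parity bit) for a single-bit error."""
    return frame.count("1") % 2 == (0 if even else 1)


bcd_a = add_parity("110001")        # six-bit BCD 'A' from Fig. 4 -> seven tracks
ebcdic_a = add_parity("11000001")   # eight-bit EBCDIC 'A' from Fig. 4 -> nine tracks
print(bcd_a, parity_ok(bcd_a))
print(ebcdic_a, parity_ok(ebcdic_a))
```

Six data bits plus parity fill the seven tracks, and eight data bits plus parity fill the nine tracks.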
Worked questions
1- Suggest, with reasons, two codes which could be used, one for each of the following situations.
(a) A language with a character set of 47 characters is to be used in a computer with a 24-bit word.
(b) In a microcomputer 100 different characters are used and the code must include information as to whether a character is to be printed on the screen normally or in reverse.
(a) 6 bits can hold 2^6 = 64 different characters.
5 bits can only hold 2^5 = 32 different characters.
∴ 6 bits are necessary to code 47 characters.
BCD code could be used, with 4 characters in each 24-bit word (4 x 6 = 24).
(b) 7 bits can code 2^7 = 128 different characters.
6 bits can code 2^6 = 64.
∴ 7 bits are needed to code 100 characters.
Seven-bit ASCII could be used. An eighth bit could be 1 for normal printing, 0 for reverse.
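The reasoning in both parts comes down to finding the smallest n with 2^n at least the size of the character set. A minimal Python sketch follows; the names bits_needed and with_display_flag are hypothetical, and placing the normal/reverse flag in the top bit is an assumption, since the text only says "an eighth bit".

```python
def bits_needed(n_chars: int) -> int:
    """Smallest number of bits n such that 2 ** n >= n_chars."""
    if n_chars < 1:
        raise ValueError("character set must contain at least one character")
    return max(1, (n_chars - 1).bit_length())


def with_display_flag(code: int, normal: bool) -> int:
    """Add an eighth bit to a seven-bit code: 1 = normal, 0 = reverse (assumed top bit)."""
    return (1 << 7) | code if normal else code


# Part (a): 47 characters, 24-bit word.
n = bits_needed(47)
print(n, "bits per character,", 24 // n, "characters per 24-bit word")

# Part (b): 100 characters plus a display flag.
print(bits_needed(100), "bits per character")
print(format(with_display_flag(0b1000001, normal=True), "08b"))
```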
2- A computer has a store of 20K 16-bit words. How many characters can be stored in it using an eight-bit character code?
Number of words = 20K
∴ Number of characters (stored two to a word) = 20K x 2 = 40K
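The calculation can be checked in Python. This sketch assumes K means 1024, the usual convention for store sizes; the function name is illustrative.

```python
K = 1024  # assumption: 1K = 1024, as is usual for store sizes


def characters_in_store(words: int, word_bits: int, char_bits: int) -> int:
    """Number of whole characters that fit when each word is packed with characters."""
    per_word = word_bits // char_bits   # whole characters per word
    return words * per_word


total = characters_in_store(words=20 * K, word_bits=16, char_bits=8)
print(total, "characters =", total // K, "K")
```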
3- When printed in decimal a seven-bit code gives the value 65 for letter A and 67 for letter C.
Suggest values for B and D.
65₁₀ = 1000001₂; 67₁₀ = 1000011₂
This is probably ASCII. In any case the right-hand digits seem to be the binary for the position in the alphabet (1 for A, 3 for C).
The first bit is not a parity bit. Assume it is 1 for all letters.
Probable codes are B = 1000010 (66 in decimal); D = 1000100 (68 in decimal)
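The pattern used in this answer (a leading 1 followed by the letter's position in the alphabet) can be compared with real ASCII in Python. The function letter_code is a hypothetical helper that implements only the pattern assumed in the answer, for capital letters A to Z.

```python
def letter_code(letter: str) -> int:
    """Seven-bit code built as: leading bit 1, then the position in the alphabet."""
    position = ord(letter.upper()) - ord("A") + 1   # A = 1, B = 2, ...
    if not 1 <= position <= 26:
        raise ValueError("expected a letter A to Z")
    return 0b1000000 | position


for letter in "ABCD":
    code = letter_code(letter)
    # ord() gives the actual ASCII value, for comparison with the guessed pattern.
    print(letter, code, format(code, "07b"), code == ord(letter))
```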