■ SUMMARY
In this chapter, we introduced the ARC ISA, and studied some general properties of ISAs. In the design of an instruction set, a balance must be struck between system performance and the characteristics of the technology in which the processor is implemented. Interaction between the CPU and the memory is a key consideration.
When a memory access is made, the way in which the address is calculated is called the memory addressing mode. We examined the sequence of computations that can be combined to make up an addressing mode. We also looked at some specific cases which are commonly identified by name.
We also looked at several parts of a computer system that play a role in the execution of a program. We learned that programs are made up of sequences of instructions, which are taken from the instruction set of the CPU. In the next chapter, we will study how these sequences of instructions are translated into object code.
■ FURTHER READING
The material in this chapter is for the most part a collection of the historical experience gained in fifty years of stored-program computer designs. Although each generation of computer systems is typically identified by a specific hardware technology, there have also been historically important instruction set architectures. In the first generation systems of the 1950s, such as von Neumann’s EDVAC, Eckert and Mauchly’s UNIVAC and the IBM 701, programming was performed by hand in machine language. Although simple, these instruction set architectures defined the fundamental concepts surrounding opcodes and operands.
The concept of an instruction set architecture as an identifiable entity can be traced to the designers of the IBM S/360 in the 1960s. The VAX architecture from Digital Equipment Corporation can also trace its roots to this period, when the PDP-4 and PDP-8 minicomputers were being developed. Both the 360 and the VAX are two-address architectures. Significant one-address architectures include the Intel 8080, which is the predecessor to the modern 80x86, and its contemporary, the Zilog Z-80. As a zero-address architecture, the Burroughs B5000 is also of historical significance.
There are a host of references that cover the various machine languages in existence, too many to enumerate here, and so we mention only a few of the more celebrated cases. The machine languages of Babbage’s machines are covered in (Bromley, 1987). The machine language of the early Institute for Advanced Study (IAS) computer is covered in (Stallings, 1996). The IBM 360 machine language is covered in (Struble, 1975). The machine language of the 68000 can be found in (Gill et al., 1987) and the machine language of the SPARC can be found in (SPARC, 1992). A full description of the JVM and the Java class file format can be found in (Meyer & Downing, 1997).
Bromley, A. G., “The Evolution of Babbage’s Calculating Engines,” Annals of the History of Computing, 9, pp. 113-138, (1987).
Gill, A., E. Corwin, and A. Logar, Assembly Language Programming for the 68000, Prentice-Hall, Englewood Cliffs, New Jersey, (1987).
Meyer, J. and T. Downing, Java Virtual Machine, O’Reilly & Associates, Sebastopol, California, (1997).
SPARC International, Inc., The SPARC Architecture Manual: Version 8, Prentice Hall, Englewood Cliffs, New Jersey, (1992).
Stallings, W., Computer Organization and Architecture, 4/e, Prentice Hall, Upper Saddle River, New Jersey, (1996).
Struble, G. W., Assembler Language Programming: The IBM System/360 and 370, 2/e, Addison-Wesley, Reading, Massachusetts, (1975).
■ PROBLEMS
4.1 A memory has 2^24 addressable locations. What is the smallest width in bits that the address can be while still being able to address all 2^24 locations?
4.2 What are the lowest and highest addresses in a 2^20-byte memory, in which a four-byte word is the smallest addressable unit?
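The arithmetic behind Problems 4.1 and 4.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `address_bits` and `address_range` are hypothetical, addresses are assumed to start at zero, and each address is assumed to name exactly one addressable unit. The sizes used in the usage note are deliberately different from those in the problems.

```python
def address_bits(num_locations):
    """Smallest address width, in bits, that can name every location.

    Assumes locations are numbered 0 .. num_locations - 1.
    """
    if num_locations < 1:
        raise ValueError("a memory needs at least one location")
    # The highest address is num_locations - 1; its bit length is the width.
    return max(1, (num_locations - 1).bit_length())


def address_range(memory_bytes, unit_bytes):
    """Lowest and highest address when one address names one unit.

    Assumes the memory size is a whole multiple of the unit size.
    """
    units = memory_bytes // unit_bytes
    return 0, units - 1
```

For instance, `address_bits(2**16)` gives 16, `address_bits(1000)` gives 10, and `address_range(2**12, 2)` gives `(0, 2047)` for a 2^12-byte memory whose smallest addressable unit is a two-byte word.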
4.3 The memory map for the ARC is shown in Figure 4-20.
(a) How much memory (in bytes) is available for each of the add-in video memory modules? (Give your answer as powers of two or sums of powers of two, e.g. 2^10.)
(b) When a finger is drawn across the touchscreen, the horizontal (x) and vertical (y) positions of the finger are updated in registers that are accessed at locations (FFFFF0)16 and (FFFFF4)16, respectively. When the number ‘1’ is written to the register at memory location (FFFFEC)16 the screen flashes, and then location (FFFFEC)16 is automatically cleared to zero by the hardware (the software does not have to clear it). Write an ARC program that flashes the screen every time the user’s position changes. Use the skeleton program shown below.
4.4 Write an ARC subroutine that performs a swap operation on the 32-bit operands x = 25 and y = 50, which are stored in memory. Use as few registers as you can.
4.5 A section of ARC assembly code is shown below. What does it do? Express your answer in terms of the actions it goes through. Does it add up numbers, or clear something out? Does it simulate a for loop, a while loop, or something else? Assume that a and b are memory locations that are defined elsewhere in the code.
4.6 A pocket pager contains a small processor with 2^7 8-bit words of memory. The ISA has four registers: R0, R1, R2, and R3. The instruction set is shown in Figure 4-27, as well as the bit patterns that correspond to each register, the instruction format, and the modes, which determine if the operand is a register (mode bit = 0) or the operand is a memory location (mode bit = 1). Either or both of the operands can be registers, but both operands cannot be memory locations. If the source or destination is a memory location, then the corresponding source or destination field in the instruction is not used since the address field is used instead.
(a) Write a program using object code (not assembly code) that swaps the contents of registers R0 and R1. You are free to use the other registers as necessary, but do not use memory. Use no more than four lines of code (fewer lines are possible). Place 0’s in any positions where the value does not matter.
(b) Write a program using object code that swaps the contents of memory locations 12 and 13. As in part (a), you are free to use the other registers as necessary, but do not use other memory locations. Place 0’s in any positions where the value does not matter.
4.7 An ARC program calls the subroutine foo, passing it three arguments, a, b, and c. The subroutine has two local variables, m and n. Show the position of the stack pointer and the contents of the relevant stack elements for a stack-based calling convention at the points in the program shown below. Note that subroutine foo does not return anything.
4.8 Why does sethi only load the high 22 bits of a register? It would be more useful if sethi loaded all 32 bits of a register. What is the problem with having sethi load all 32 bits?
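As background for Problem 4.8, the following Python sketch shows the bit arithmetic of splitting a 32-bit constant into a 22-bit upper field and a 10-bit lower field, and of reassembling it. It is a minimal illustration, not ARC code: the helper names `split_constant`, `sethi`, and `rebuild` are hypothetical, and the sketch assumes (following the SPARC convention) that sethi places its 22-bit field in the upper bits of the register and clears the low 10 bits.

```python
MASK32 = 0xFFFFFFFF


def split_constant(value):
    """Split a 32-bit constant into its high 22 bits and low 10 bits."""
    value &= MASK32
    hi22 = value >> 10      # the 22-bit field a sethi instruction would carry
    lo10 = value & 0x3FF    # the remaining 10 bits, supplied separately
    return hi22, lo10


def sethi(imm22):
    """Model of sethi (assumed behavior): field goes high, low 10 bits are zero."""
    return ((imm22 & 0x3FFFFF) << 10) & MASK32


def rebuild(hi22, lo10):
    """Combine the sethi result with the low 10 bits, as a following OR would."""
    return sethi(hi22) | (lo10 & 0x3FF)
```

For the constant 0xABCDEF01, `split_constant` yields `(0x2AF37B, 0x301)`, and `rebuild(0x2AF37B, 0x301)` recovers the original value. The sketch only shows how a full 32-bit constant can be formed in two steps; it leaves the question posed in the problem to the reader.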
4.9 Which of the three subroutine linkage conventions covered in this chapter (registers, data link area, stack) is used in Figure 4-14?
4.10 A program compiled for a SPARC ISA writes the 32-bit unsigned integer 0xABCDEF01 to a file, and reads it back correctly. The same program compiled for a Pentium ISA also works correctly. However, when the file is transferred between machines, the program incorrectly reads the integer from the file as 0x01EFCDAB. What is going wrong?
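One way to experiment with the situation in Problem 4.10 is to serialize a 32-bit word under an explicit byte order and read it back under the same or a different one. The Python sketch below uses the standard `struct` module; the helper names `write_word` and `read_word` are hypothetical, and the pairing of "big" with the SPARC and "little" with the Pentium is an assumption stated here for illustration.

```python
import struct


def _format(byte_order):
    """struct format for one unsigned 32-bit word in the given byte order."""
    if byte_order == "big":
        return ">I"
    if byte_order == "little":
        return "<I"
    raise ValueError("byte_order must be 'big' or 'little'")


def write_word(value, byte_order):
    """Return the four bytes that would be written to the file."""
    return struct.pack(_format(byte_order), value)


def read_word(data, byte_order):
    """Interpret four bytes from the file as an unsigned 32-bit word."""
    return struct.unpack(_format(byte_order), data)[0]
```

Writing and reading with the same byte order returns the original value, whereas `read_word(write_word(0xABCDEF01, "big"), "little")` returns 0x01EFCDAB, the value reported in the problem.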
4.11 Refer to Figure 4-25. Show the Java assembly language instructions for the code shown in locations 0x011e – 0x0122. Use the syntax format shown in locations 0x00e3 – 0x00ef of that same figure.
You will need to make use of the following Java instructions:
invokespecial n (opcode 0xb7) – Invoke a method with index n into the constant pool. Note that n is a 16-bit (two-byte) index that follows the invokespecial opcode.
aload_0 (opcode 0x2a) – Push local variable 0 onto the stack.
4.12 Is the JVM a little-endian or big-endian machine? Hint: Examine the first line of the bytecode program in Figure 4-24.
4.13 Write an ARC program that implements the bytecode program shown in Figure 4-26. Assume that, analogous to the code in the figure, the arguments are passed on a stack, and that the return value is placed on the top of the stack.
4.14 A JVM is implemented using the ARC ISA.
(a) How much memory traffic will be generated when the program of Figure 4-26 executes?
(b) Compute the memory traffic that your program from Exercise 4.13 will generate, and compare it with the traffic computed in part (a). If most of the execution time of a program is due to its memory accesses, how much faster will your program be compared to the program in Figure 4-26?
4.15 Can a Java bytecode program ever run as fast as a program written in the native language of the processor? Defend your answer in one or two paragraphs.
4.16 (a) Write three-address, two-address, and one-address programs to compute the function A = (B-C)*(D-E). Assume 8-bit opcodes, 16-bit operands and addresses, and that data is moved to and from memory in 16-bit chunks. (Also assume that the opcode must be transferred from memory by itself.) Your code should not overwrite any of the operands. Use any temporary registers needed.
(b) Compute the size of your program in bytes.
(c) Compute the memory traffic your program will generate at execution time, including instruction fetches.
4.17 Repeat Exercise 4.16 above, using ARC assembly language. Note that the subtract mnemonic is subcc and that the multiplication mnemonic is smul.