Arithmetic and Logic Instructions
INTRODUCTION
In this chapter, we examine the arithmetic and logic instructions. The arithmetic instructions include addition, subtraction, multiplication, division, comparison, negation, increment, and decrement. The logic instructions include AND, OR, Exclusive-OR, NOT, shifts, rotates, and the logical compare (TEST). This chapter also presents the 80386 through the Core2 instructions XADD, SHRD, SHLD, bit tests, and bit scans. The chapter concludes with a discussion of string comparison instructions, which are used for scanning tabular data and for comparing sections of memory data. Both comparison tasks are performed efficiently with the string scan (SCAS) and string compare (CMPS) instructions.
If you are familiar with an 8-bit microprocessor, you will recognize that the 8086 through the Core2 instruction set is superior to most 8-bit microprocessors because most of the instructions have two operands instead of one. Even if this is your first microprocessor, you will quickly learn that this microprocessor possesses a powerful and easy-to-use set of arithmetic and logic instructions.
CHAPTER OBJECTIVES
Upon completion of this chapter, you will be able to:
1. Use arithmetic and logic instructions to accomplish simple binary, BCD, and ASCII arithmetic.
2. Use AND, OR, and Exclusive-OR to accomplish binary bit manipulation.
3. Use the shift and rotate instructions.
4. Explain the operation of the 80386 through the Core2 exchange and add, compare and exchange, double-precision shift, bit test, and bit scan instructions.
5. Check the contents of a table for a match with the string instructions.
ADDITION, SUBTRACTION, AND COMPARISON
The bulk of the arithmetic instructions found in any microprocessor include addition, subtraction, and comparison. In this section, addition, subtraction, and comparison instructions are illustrated. Also shown are their uses in manipulating register and memory data.
Addition (ADD) appears in many forms in the microprocessor. This section details the use of the ADD instruction for 8-, 16-, and 32-bit binary addition. A second form of addition, called add-with-carry, is introduced with the ADC instruction. Finally, the increment instruction (INC) is presented. Increment is a special type of addition that adds 1 to a number. In Section 5–3, other forms of addition are examined, such as BCD and ASCII. Also described is the XADD instruction, found in the 80486 through the Pentium 4.
Table 5–1 illustrates the addressing modes available to the ADD instruction. (These addressing modes include almost all those mentioned in Chapter 3.) However, because there are more than 32,000 variations of the ADD instruction in the instruction set, it is impossible to list them all in this table. The only types of addition not allowed are memory-to-memory and segment register. The segment registers can only be moved, pushed, or popped. Note that, as with all other instructions, the 32-bit registers are available only with the 80386 through the Core2. In the 64-bit mode of the Pentium 4 and Core2, the 64-bit registers are also used for addition.
Register Addition. Example 5–1 shows a simple sequence of instructions that uses register addition to add the contents of several registers. In this example, the contents of AX, BX, CX, and DX are added to form a 16-bit result stored in the AX register.
Whenever arithmetic and logic instructions execute, the contents of the flag register change. Note that the contents of the interrupt, trap, and other flags do not change due to arithmetic and logic instructions. Only the flags located in the rightmost 8 bits of the flag register and the overflow flag change. These rightmost flags denote the result of the arithmetic or a logic operation. Any ADD instruction modifies the contents of the sign, zero, carry, auxiliary carry, parity, and overflow flags. The flag bits never change for most of the data transfer instructions presented in Chapter 4.
Immediate Addition. Immediate addition is employed whenever constant or known data are added. An 8-bit immediate addition appears in Example 5-2. In this example, DL is first loaded with 12H by using an immediate move instruction. Next, 33H is added to the 12H in DL by an immediate addition instruction. After the addition, the sum (45H) moves into register DL and the flags change, as follows:
Memory-to-Register Addition. Suppose that an application requires memory data to be added to the AL register. Example 5–3 shows an example that adds two consecutive bytes of data, stored at the data segment offset locations NUMB and NUMB + 1, to the AL register.
The first instruction loads the destination index register (DI) with offset address NUMB. The DI register, used in this example, addresses data in the data segment beginning at memory location NUMB. After clearing the sum to zero, the ADD AL,[DI] instruction adds the contents of memory location NUMB to AL. Finally, the ADD AL,[ DI + I] instruction adds the contents of memory location NUMB plus 1 byte to the AL register. After both ADD instructions execute, the result appears in the AL register as the sum of the contents of NUMB plus the contents of NUMB + 1.
Array Addition. Memory arrays are sequential lists of data. Suppose that an array of data (ARRAY) contains 10 bytes, numbered from element 0 through element 9. Example 5–4 shows how to add the contents of array elements 3, 5, and 7 together.
This example first clears AL to 0, so it can be used to accumulate the sum. Next, register SI is loaded with a 3 to initially address array element 3. The ADD AL,ARRAY[SI] instruction adds the contents of array element 3 to the sum in AL. The instructions that follow add array elements 5 and 7 to the sum in AL, using a 3 in SI plus a displacement of 2 to address element 5, and a displacement of 4 to address element 7.
assembler program cannot determine if, for example, the INC [DI] instruction is a byte-, word-, or doubleword-sized increment. The INC BYTE PTR [DI] instruction clearly indicates byte- sized memory data; the INC WORD PTR [DI] instruction unquestionably indicates a word-sized memory data; and the INC DWORD PTR [DI] instruction indicates doubleword-sized data. In 64-bit mode operation of the Pentium 4 and Core2, the INC QWORD PTR [RSI] instruction indicates quadword-sized data.
Example 5–6 shows how to modify Example 5–3 to use the increment instruction for addressing NUMB and NUMB + 1. Here, an INC DI instruction changes the contents of register DI from offset address NUMB to offset address NUMB + 1. Both program sequences shown in Examples 5–3 and 5–6 add the contents of NUMB and NUMB + 1. The difference between them is the way that the address is formed through the contents of the DI register using the increment instruction.
Increment instructions affect the flag bits, as do most other arithmetic and logic operations. The difference is that increment instructions do not affect the carry flag bit. Carry doesn’t change because we often use increments in programs that depend upon the contents of the carry flag. Note that increment is used to point to the next memory element in a byte-sized array of data only. If word-sized data are addressed, it is better to use an ADD DI,2 instruction to modify the DI pointer in place of two INC DI instructions. For doubleword arrays, use the ADD DI,4 instruction to modify the DI pointer. In some cases, the carry flag must be preserved, which may mean that two or four INC instructions might appear in a program to modify a pointer.
Addition-with-Carry. An addition-with-carry instruction (ADC) adds the bit in the carry flag (C) to the operand data. This instruction mainly appears in software that adds numbers that are wider than 16 bits in the 8086–80286 or wider than 32 bits in the 80386–Core2.
Table 5–3 lists several add-with-carry instructions, with comments that explain their operation. Like the ADD instruction, ADC affects the flags after the addition.
Suppose that a program is written for the 8086–80286 to add the 32-bit number in BX and AX to the 32-bit number in DX and CX. Figure 5–1 illustrates this addition so that the placement and function of the carry flag can be understood. This addition cannot be easily performed with- out adding the carry flag bit because the 8086–80286 only adds 8- or 16-bit numbers. Example 5–7 shows how the contents of registers AX and CX add to form the least significant 16 bits of the sum. This addition may or may not generate a carry. A carry appears in the carry flag if the sum is greater than FFFFH. Because it is impossible to predict a carry, the most significant 16 bits of this addition are added with the carry flag using the ADC instruction. The ADC instruction adds the 1 or the 0 in the carry flag to the most significant 16 bits of the result. This program adds BX–AX to DX–CX, with the sum appearing in BX–AX.
Suppose the same software is rewritten for the 80386 through the Core2, but modified to add two 64-bit numbers in the 32-bit mode. The changes required for this operation are the use of the extended registers to hold the data and modifications of the instructions for the 80386 and above. These changes are shown in Example 5–8, which adds two 64-bit numbers. In the 64-bit mode of the Pentium 4 and Core2, this addition is handled with a single ADD instruction if the location of the operands is changed to RAX and RBX as in the instruction ADD RAX,RBX, which adds RBX to RAX.
Exchange and Add for the 80486–Core2 Processors. A new type of addition called exchange and add (XADD) appears in the 80486 instruction set and continues through the Core2. The XADD instruction adds the source to the destination and stores the sum in the destination, as with any addition. The difference is that after the addition takes place, the original value of the destination is copied into the source operand. This is one of the few instructions that change the source.
For example, if BL = 12H and DL = 02H, and the XADD BL,DL instruction executes, the BL register contains the sum of 14H and DL becomes 12H. The sum of 14H is generated and the original destination of 12H replaces the source. This instruction functions with any register size and any memory operand, just as with the ADD instruction.
Subtraction
Many forms of subtraction (SUB) appear in the instruction set. These forms use any addressing mode with 8-, 16-, or 32-bit data. A special form of subtraction (decrement, or DEC) subtracts 1 from any register or memory location. Section 5–3 shows how BCD and ASCII data subtract. As with addition, numbers that are wider than 16 bits or 32 bits must occasionally be subtracted. The subtract-with-borrow instruction (SBB) performs this type of subtraction. In the 80486 through the Core2 processors, the instruction set also includes a compare and exchange instruction. In the 64-bit mode for the Pentium 4 and Core2, a 64-bit subtraction is also available.
Table 5–4 lists some of the many addressing modes allowed with the subtract instruction (SUB). There are well over 1000 possible subtraction instructions, far too many to list here. About the only types of subtraction not allowed are memory-to-memory and segment register subtractions. Like other arithmetic instructions, the subtract instruction affects the flag bits.
Register Subtraction. Example 5–9 shows a sequence of instructions that perform register sub- traction. This example subtracts the 16-bit contents of registers CX and DX from the contents of register BX. After each subtraction, the microprocessor modifies the contents of the flag register. The flags change for most arithmetic and logic operations.
Immediate Subtraction. As with addition, the microprocessor also allows immediate operands for the subtraction of constant data. Example 5–10 presents a short sequence of instructions that subtract 44H from 22H. Here, we first load the 22H into CH using an immediate move
instruction. Next, the SUB instruction, using immediate data 44H, subtracts 44H from the 22H. After the subtraction, the difference (0DEH) moves into the CH register. The flags change as follows for this subtraction:
Both carry flags (C and A) hold borrows after a subtraction instead of carries, as after an addition. Notice in this example that there is no overflow. This example subtracted 44H ( + 68) from 22H ( + 34), resulting in a 0DEH ( – 34). Because the correct 8-bit signed result is – 34, there is no overflow in this example. An 8-bit overflow occurs only if the signed result is greater than + 127 or less than – 128.
Decrement Subtraction. Decrement subtraction (DEC) subtracts 1 from a register or the con- tents of a memory location. Table 5–5 lists some decrement instructions that illustrate register and memory decrements.
The decrement indirect memory data instructions require BYTE PTR, WORD PTR, DWORD PTR, or QWORD PTR because the assembler cannot distinguish a byte from a word or doubleword when an index register addresses memory. For example, DEC [SI] is vague because the assembler cannot determine whether the location addressed by SI is a byte, word, or double- word. Using DEC BYTE PTR[SI], DEC WORD PTR[DI], or DEC DWORD PTR[SI] reveals
the size of the data to the assembler. In the 64-bit mode, a DEC QWORD PTR[RSI] decrement the 64-bit number stored at the address pointed to by the RSI register.
Subtraction-with-Borrow. A subtraction-with-borrow (SBB) instruction functions as a regular subtraction, except that the carry flag (C), which holds the borrow, also subtracts from the differ- ence. The most common use for this instruction is for subtractions that are wider than 16 bits in the 8086–80286 microprocessors or wider than 32 bits in the 80386–Core2. Wide subtractions require that borrows propagate through the subtraction, just as wide additions propagate the carry.
Table 5–6 lists several SBB instructions with comments that define their operations. Like the SUB instruction, SBB affects the flags. Notice that the immediate subtract from memory instruction in this table requires a BYTE PTR, WORD PTR, DWORD PTR, or QWORD PTR directive.
When the 32-bit number held in BX and AX is subtracted from the 32-bit number held in SI and DI, the carry flag propagates the borrow between the two 16-bit subtractions. The carry flag holds the borrow for subtraction. Figure 5–2 shows how the borrow propagates through the carry flag (C) for this task. Example 5–11 shows how this subtraction is performed by a program. With wide subtraction, the least significant 16- or 32-bit data are subtracted with the SUB
instruction. All subsequent and more significant data are subtracted by using the SBB instruction. The example uses the SUB instruction to subtract DI from AX, then uses SBB to subtract- with-borrow SI from BX.
Comparison
The comparison instruction (CMP) is a subtraction that changes only the flag bits; the destination operand never changes. A comparison is useful for checking the entire contents of a register or a memory location against another value. A CMP is normally followed by a conditional jump instruction, which tests the condition of the flag bits.
Table 5–7 lists a variety of comparison instructions that use the same addressing modes as the addition and subtraction instructions already presented. Similarly, the only disallowed forms of compare are memory-to-memory and segment register compares.
Example 5–12 shows a comparison followed by a conditional jump instruction. In this example, the contents of AL are compared with 10H. Conditional jump instructions that often follow the comparison are JA (jump above) or JB (jump below). If the JA follows the comparison, the jump occurs if the value in AL is above 10H. If the JB follows the comparison, the jump occurs if the value in AL is below 10H. In this example, the JAE instruction follows the comparison. This instruction causes the program to continue at memory location SUBER if the value in AL is 10H or above. There is also a JBE (jump below or equal) instruction that could follow the comparison to jump if the outcome is below or equal to 10H. Later chapters provide additional detail on the comparison and conditional jump instructions.
Compare and Exchange (80486–Core2 Processors Only). The compare and exchange instruction (CMPXCHG), found only in the 80486 through the Core2 instruction sets, compares the destination operand with the accumulator. If they are equal, the source operand is copied into the destination; if they are not equal, the destination operand is copied into the accumulator. This instruction functions with 8-, 16-, or 32-bit data.
The CMPXCHG CX,DX instruction is an example of the compare and exchange instruction. This instruction first compares the contents of CX with AX. If CX equals AX, DX is copied into AX; if CX is not equal to AX, CX is copied into AX. This instruction also compares AL with 8-bit data and EAX with 32-bit data if the operands are either 8- or 32-bit.
In the Pentium–Core2 processors, a CMPXCHG8B instruction is available that compares two quadwords. This is the only new data manipulation instruction provided in the Pentium–Core2 when they are compared with prior versions of the microprocessor. The compare-and-exchange- 8-bytes instruction compares the 64-bit value located in EDX:EAX with a 64-bit number located in memory. An example is CMPXCHG8B TEMP. If TEMP equals EDX:EAX, TEMP is replaced with the value found in ECX:EBX; if TEMP does not equal EDX:EAX, the number found in TEMP is loaded into EDX:EAX. The Z (zero) flag bit indicates that the values are equal after the comparison.
This instruction has a bug that will cause the operating system to crash. More information about this flaw can be obtained at www.intel.com. There is also a CMPXCHG16B instruction available to the Pentium 4 when operated in 64-bit mode.