Hardware watchdog
Context
● You are developing an embedded application using one or more members of the 8051 family of microcontrollers.
● You are designing an appropriate software foundation for your application.
Problem
How can you ensure that – if your application ‘hangs’ due to an unexpected software or hardware error – the system will automatically reset itself?
Background
See the introduction to this chapter for an explanation of the watchdog analogy.
Solution
Working with HARDWARE WATCHDOG means using either an internal or external (hardware) timer.
We have seen in many previous cases that, where available, the use of on-chip components is to be preferred to the use of equivalent off-chip components. Specifically, on-chip components generally offer the following benefits:
● Reduced hardware complexity, which tends to result in increased system reliability.
● Reduced application cost.
● Reduced application size.
In the case of watchdog timers, the situation is more complex, because external watchdog chips typically provide some useful facilities that are not available in most on-chip versions.
For example, the popular ‘1232’ watchdogs (available, in various versions, from Dallas Semiconductor, Maxim, Linear Technology and Analog Devices) are low-cost, low-power devices. In addition to functioning as a watchdog timer, they also provide power system monitoring capabilities (see ROBUST RESET [page 77] for details of this). If, as in many designs, you intend to use an external ‘robust reset’ circuit anyway, then the 1232 chips allow you to incorporate an external watchdog facility for minimal additional cost and only a very minor increase in hardware complexity.
Another beneficial feature of external watchdogs is that they are inherently portable: you can generally use the same external watchdog with any member of the 8051 family. By contrast, code written to work with an internal watchdog will generally have to be rewritten for use with different hardware.
One situation in which on-chip watchdogs (such as those in the Infineon c515x devices) can be beneficial is where they allow you to determine whether the system has undergone a normal reset or a reset caused by a watchdog overflow. This may allow you to modify the system behaviour to match these circumstances. Without this information (which is not generally available through external watchdogs without some complex coding) your system may be continually reset by the watchdog timer overflow.
We can summarize by saying that, if you require watchdog facilities, you need to consider both internal and external solutions carefully. There is no single ‘ideal’ solution and – considering the issues mentioned earlier – you need to find the best match to your requirements.
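The following is a minimal sketch of how reset-cause information might be used at start-up. It assumes that the start-up code can read a 'watchdog reset' status flag and that a small counter can be held in storage that survives a reset; both are passed in as parameters because the details are device-specific. The names `SYSTEM_Choose_Mode`, `MAX_WATCHDOG_RESETS` and the 'limp-home' mode are hypothetical and are not taken from any data sheet.

```c
#include <stdbool.h>

/* Hypothetical operating modes (names are illustrative only). */
typedef enum { MODE_NORMAL, MODE_LIMP_HOME } system_mode_t;

/* Hypothetical limit on consecutive watchdog resets before we stop retrying. */
#define MAX_WATCHDOG_RESETS 3u

/*
 * Decide how to start up.
 *   watchdog_reset: true if the last reset was caused by a watchdog overflow
 *                   (on real hardware, read from a device-specific status bit).
 *   reset_count:    counter held in storage that survives a reset
 *                   (an assumption: for example EEPROM or uninitialised RAM).
 */
system_mode_t SYSTEM_Choose_Mode(bool watchdog_reset, unsigned char *reset_count)
{
    if (!watchdog_reset) {
        *reset_count = 0u;      /* Normal reset: clear the history. */
        return MODE_NORMAL;
    }

    if (*reset_count < MAX_WATCHDOG_RESETS) {
        (*reset_count)++;       /* Count this watchdog reset and try again. */
        return MODE_NORMAL;
    }

    /* Repeated overflows: the fault is probably sustained, so stop retrying. */
    return MODE_LIMP_HOME;
}
```

With this arrangement a single watchdog reset leads to a normal restart, while a run of consecutive watchdog resets moves the system into a restricted mode rather than allowing it to reset indefinitely.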
Reliability and safety implications
Before using either an internal or external watchdog, you need to be sure that the use of such a timer will increase (rather than decrease) the reliability of your application.
The first thing to bear in mind is that watchdog behaviour should be for disaster recovery. In a well-designed system the occurrence of a watchdog reset should be a noteworthy event that occurs rarely. If you think of the use of watchdogs in terms of ‘if all else fails, then we will have to let the watchdog reset the system’, then you are taking a realistic view of the capabilities of this approach.
Used without due care at the design phase and/or adequate testing, watchdogs can reduce the system reliability dramatically. A particular problem with a badly designed watchdog can occur in the presence of sustained hardware faults. In these circumstances, a badly implemented watchdog can mean that your system constantly resets itself. This can be extremely dangerous.
You also need to appreciate that watchdogs are unsuitable for many applications, because the time taken to react to an error is too long. Suppose, for example, the braking system in an automotive application uses a 500 ms watchdog and the vehicle encounters a problem when it is travelling at 70 miles per hour (110 km per hour). In these circumstances, the vehicle and its passengers will have travelled some 16 yards / 15 metres – right into the car in front – before the vehicle even begins to reset the braking system. In short, where fast recovery is required, watchdogs are rarely the best solution.
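The distance quoted above follows directly from the vehicle speed and the watchdog period, assuming the worst case in which the full 500 ms elapses between the fault and the reset:

```latex
d = v\,t \approx \frac{110\,000\ \text{m}}{3600\ \text{s}} \times 0.5\ \text{s}
  \approx 30.6\ \text{m/s} \times 0.5\ \text{s} \approx 15\ \text{m}
```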
Portability
As already noted, internal watchdogs are based on hardware that is not part of the 8051/52 core. As a result, different forms of watchdog now exist on the various 8051 derivatives and code written for one on-chip watchdog will generally need to be adapted for use with a different device. By contrast, software written for external watchdogs can be more portable.
Overall strengths and weaknesses
Watchdogs can provide a ‘last resort’ form of error recovery. If you think of the use of watchdogs in terms of ‘if all else fails, then reset the system’, then you are taking a realistic view of the capabilities of this approach.
In the presence of intermittent faults, e.g. rare bursts of EMI, watchdogs can be very effective.
Watchdogs with long timeout periods are unsuitable for many applications.
Used without due care at the design phase and / or adequate testing, watchdogs can reduce the system reliability dramatically.
In the presence of sustained hardware faults, badly implemented watchdogs can mean that your system constantly resets itself. This can be very dangerous.
Related patterns and alternative solutions
In certain restricted circumstances, a software watchdog may also be useful.
This can be created from two components:
● A timer ISR
● A refresh function
Essentially, we set a timer to overflow in (say) 60 ms. Under normal circumstances, this timer will never overflow, because we will call the refresh function regularly and thereby restart the timer. If, however, the program is ‘jammed’, the refresh function will not be called. When the timer overflows, the ISR will be called: this will implement an ‘appropriate’ error recovery strategy.
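The two components can be sketched as follows. Note that this is a variant of the scheme described above: instead of reloading a hardware timer, it counts periodic ticks (assumed here to be 1 ms apart) and treats 60 ticks without a refresh as an overflow. All names are hypothetical, and the 'ISR' is written as an ordinary C function so that the logic can be run on any compiler; on an 8051 it would be attached to a timer interrupt.

```c
#include <stdbool.h>

/* Timeout in ticks: 60 ticks of 1 ms matches the 60 ms example
   (the 1 ms tick length is an assumption). */
#define SW_WATCHDOG_TIMEOUT_TICKS 60u

static volatile unsigned int sw_watchdog_count;   /* Ticks since last refresh. */
static volatile bool sw_watchdog_tripped;         /* Set when recovery has run. */

/* Refresh function: called regularly by healthy application code. */
void SW_WATCHDOG_Refresh(void)
{
    sw_watchdog_count = 0u;
}

/* Placeholder for the application-specific error recovery strategy. */
static void SW_WATCHDOG_Recover(void)
{
    sw_watchdog_tripped = true;
}

/* Body of the timer ISR: called once per tick. */
void SW_WATCHDOG_Tick(void)
{
    sw_watchdog_count = sw_watchdog_count + 1u;

    if (sw_watchdog_count >= SW_WATCHDOG_TIMEOUT_TICKS) {
        sw_watchdog_count = 0u;
        SW_WATCHDOG_Recover();
    }
}

/* Lets the rest of the program (or a test) see whether recovery was triggered. */
bool SW_WATCHDOG_Has_Tripped(void)
{
    return sw_watchdog_tripped;
}
```

As long as `SW_WATCHDOG_Refresh()` is called at least once every 59 ticks, the recovery function never runs; if the refresh calls stop, recovery is triggered on the 60th tick.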
We have used software watchdogs in several applications. The main problem with this approach is that some software errors (for example, those induced by EMI) can disrupt the watchdog timer as well as the main application code: this rarely happens with hardware watchdogs, which tend to be more robust.
The main advantage of software watchdogs is that different forms of error recovery (not just a complete chip reset) are possible. However, use of an on-chip hardware watchdog can provide flexible reset behaviour and is, in many circumstances, a more reliable solution.
Example: Using the ‘1232’ external watchdog timer
In this example we assume that we will be developing a simple central-heating control system and will be using an external ‘1232’ watchdog chip to improve the reliability of the application.
The use of the 1232 is very straightforward:
● We wire up the watchdog to the microcontroller reset pin, as illustrated in Figure 12.2.
● We choose from one of three (nominal) possible timeout periods, and connect the TD pin on the 1232 to select an appropriate period (see Table 12.1).
● We pulse the ST line on the 1232 regularly, with a pulse interval less than the timeout period.
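The last step above, pulsing the ST line, might look like the following sketch. The port pin is replaced by an ordinary variable so that the code can be compiled and checked on a desktop machine; on a real 8051 it would be a port bit (with the Keil compiler, typically declared using `sbit`), and the choice of pin is an assumption. The sketch also assumes that the 1232 restarts its timeout on a falling edge of ST, which should be confirmed against the data sheet for the particular device. All function and variable names are hypothetical.

```c
/* Stand-in for the port pin wired to the 1232 ST input (idle high). */
static volatile unsigned char ST_line = 1u;

/* Counts falling edges on ST so the behaviour can be checked off-target. */
static unsigned int st_falling_edges;

/* Write a level to the (simulated) ST pin, recording any falling edge. */
static void ST_Write(unsigned char level)
{
    if ((ST_line != 0u) && (level == 0u)) {
        st_falling_edges++;
    }
    ST_line = level;
}

/* Pulse ST: drive the line low, then return it to its idle (high) state.
   This must be called at intervals shorter than the selected timeout. */
void WATCHDOG_1232_Feed(void)
{
    ST_Write(0u);
    ST_Write(1u);
}
```

In a time-triggered design, `WATCHDOG_1232_Feed()` would normally be called from one place only, such as a regularly scheduled task, so that a failure of the scheduler stops the pulses and allows the watchdog to reset the system.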
Example: Using the internal watchdog timer on the Atmel 89S53
The Atmel 89S53 is an example of a Standard 8051 microcontroller with a good on-chip watchdog timer.
A key feature of this timer is that it operates from an independent oscillator: as a result, it allows the system to respond to (intermittent) failures of the main crystal oscillator or resonator.
The key register used to control the watchdog timer is the WCON register, shown in Table 12.2.
The prescaler bits, PS0, PS1 and PS2 in SFR WCON are used to set the period of the Watchdog Timer from 16 ms to 2048 ms. The available timer periods are shown in Table 12.4 and the actual timer periods (at Vcc = 5V) are within ±30% of the nominal.
The WDT is disabled by power-on reset and during power-down. It is enabled by setting the WDTEN bit in SFR WCON (address = 96H). The WDT is reset by setting the WDTRST bit in WCON. When the WDT times out without being reset or disabled, an internal RST pulse is generated to reset the CPU.
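The register operations described above can be sketched as follows. WCON is modelled here as an ordinary variable so that the code runs on any compiler; on the target it would be a special function register at address 96H (with the Keil compiler, typically `sfr WCON = 0x96;`). The bit positions used for WDTEN, WDTRST and PS0 to PS2 are assumptions and must be checked against Table 12.2 or the data sheet. The doubling of the period at each prescaler step is likewise an assumption consistent with the stated 16 ms to 2048 ms range, and the function names are hypothetical.

```c
/* Stand-in for the WCON SFR (address 96H on the target). */
static volatile unsigned char WCON;

/* Assumed bit positions: verify against the data sheet before use. */
#define WCON_WDTEN     0x01u   /* Watchdog enable       (assumed bit 0)    */
#define WCON_WDTRST    0x02u   /* Watchdog reset (feed) (assumed bit 1)    */
#define WCON_PS_SHIFT  5u      /* PS0..PS2              (assumed bits 5-7) */
#define WCON_PS_MASK   0xE0u

/* Nominal period in ms for prescaler values 0..7 (16 ms up to 2048 ms). */
unsigned int WATCHDOG_89S53_Period_ms(unsigned char prescaler)
{
    return 16u << (prescaler & 0x07u);
}

/* Shortest period to design for, allowing for the -30% tolerance. */
unsigned int WATCHDOG_89S53_Min_Period_ms(unsigned char prescaler)
{
    return (WATCHDOG_89S53_Period_ms(prescaler) * 7u) / 10u;
}

/* Select the prescaler and enable the watchdog. */
void WATCHDOG_89S53_Init(unsigned char prescaler)
{
    unsigned char value = WCON;

    value &= (unsigned char)~WCON_PS_MASK;
    value |= (unsigned char)((prescaler & 0x07u) << WCON_PS_SHIFT);
    value |= WCON_WDTEN;
    WCON = value;
}

/* Restart the watchdog period by setting WDTRST. */
void WATCHDOG_89S53_Feed(void)
{
    WCON |= WCON_WDTRST;
}
```

Because the actual period may be up to 30% shorter than nominal, the feed interval should be chosen against `WATCHDOG_89S53_Min_Period_ms()` rather than the nominal value: for a nominal 512 ms setting, for example, the watchdog should be fed at intervals of less than about 358 ms.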
Listings 12.3 to 12.7 show how we might use this watchdog in the simple central-heating system discussed in the previous example.