Overview of Electrical Maintenance and Testing Strategies
Much of the essence of effective electrical equipment preventive maintenance (PM) can be summarized by four rules:
• Keep it dry.
• Keep it clean.
• Keep it cool.
• Keep it tight.
More specifically, most electrical power and control equipment is susceptible to a relatively small number of degradation mechanisms, and the purpose of most electrical preventive maintenance (EPM) activities is to prevent them, retard them, or mitigate their effects. There are a number of traditional philosophical approaches to electrical maintenance, such as run-to-failure (RTF), maintain as necessary, perform maintenance on fixed time schedules, and predictive maintenance, which are briefly summarized in the following sections. The reliability-centered maintenance (RCM) program is gaining favor because it combines the strengths of reactive, preventive, predictive, and proactive maintenance strategies. The RCM approach to electrical equipment is discussed in greater detail than the other maintenance strategies because it is becoming a maintenance program of choice. However, most power utilities, manufacturing firms, and owners of plant facilities utilize a combination of these programs. The decision as to which approach to adopt depends largely on the scope of the system and equipment, as well as on how management views the costs and benefits of maintenance.
In the run-to-failure (RTF) approach, EPM per se is not performed at all. Degraded equipment is only repaired or replaced when the effect of degradation on process output becomes unacceptable. (For most types of electric power equipment, this coincides with catastrophic failure.) No explicit attempt is made to monitor performance or to avert failure, and the risks associated with ultimate failure are accepted. Because of the generally high reliability of electric power equipment installed in a benign environment, the RTF approach often provides satisfactory power reliability and availability in noncritical applications.
Small organizations that lack dedicated maintenance staffs often utilize this approach by default, and larger and more sophisticated organizations in the manufacturing sector also frequently apply it to noncritical equipment and systems. This maintenance strategy is also referred to as reactive maintenance.
Inspect and service as necessary
This approach is an advance beyond RTF wherein plant operating or maintenance personnel inspect electrical equipment on a more or less regular schedule (often during regular rounds of the plant). Under this approach, incipient failures are usually corrected before they become catastrophic, especially if the impact of a failure is considered unacceptable, and there is often some informal monitoring of performance to predict future failures. Many industrial manufacturing plants use this approach and find it satisfactory.
The time-based maintenance (TBM) strategy is also known as scheduled PM.
In this approach, established EPM activities are performed at fixed intervals of calendar time, operating hours, or operating cycles. Both procedures and schedules are usually based on manufacturers’ recommendations or industry standards. While the scheduled EPM approach ensures that equipment gets periodic attention, it does not necessarily prioritize EPM according to safety or productivity significance, nor does it optimize the application of limited EPM resources or take advantage of lessons learned from plant and industry experience. Scheduled EPM is currently the predominant approach among relatively sophisticated operators of plants where productivity and safety are serious concerns.
The condition-based maintenance (CBM) strategy is also called predictive maintenance. It is an extension of the TBM strategy and uses nonintrusive testing techniques to assess equipment condition. It uses planned maintenance tasks that are based on equipment’s previous operating history, and trending of the maintenance data. It is most effective when combined with a PM program because it prioritizes EPM based on criticality of equipment, productivity, resources, or lessons learned from experience.
RCM is a maintenance strategy in which equipment condition, criticality, failure history, and life cycle cost are integrated to logically develop the most effective maintenance methods for each system, subsystem, and component.
RCM capitalizes on the respective strengths of reactive, preventive, predictive, and proactive maintenance methods to maximize equipment reliability and availability. It is an ongoing process that continuously refines and redefines each maintenance activity.
The RCM process reduces the uncertainty inherently associated with the operational reliability of equipment by managing the risk through the periodic assessment of equipment condition. With the proper instrumentation, the current equipment condition, changes from the baseline, and margin-to-failure limits are readily determined. This allows the maintenance and operations staff to quantify the risk associated with continued operation or maintenance deferment, and to identify the most probable cause of the problem down to the component level. In the majority of cases, condition testing is nonintrusive, allowing equipment condition assessments to be performed with the equipment operating under normal, loaded conditions.
The concept of RCM has evolved considerably over time as it has been applied to facility maintenance. Historically, there was an intuitive belief that because mechanical parts wear out over time, equipment reliability is directly related to operating age. The belief was that the more frequently equipment was overhauled, the better protected it would be against failure. Industry increased PM to include nearly everything.
In the 1970s, the airline industry found that many types of failure could not be prevented regardless of the intensity of maintenance. Actuarial analysis of failure data suggested that PM was ineffective by itself in controlling failure rates. For many items, failure rates did not increase with increased operational use. In the 1980s, early forms of condition monitoring devices came on the market and coincided with microprocessors and a new computer literacy. RCM theory was refined and adopted by the US Navy’s submarine fleet. It was shown that in many cases, scheduled overhaul increases the overall failure rate by introducing new infant mortality probability into an otherwise stable system.
What has evolved is a complementary program, in rigorous and streamlined forms, whose most appropriate applications depend on the consequences of failure, the probability of failure, historical data, and the amount of risk the organization is willing to tolerate.
Rigorous RCM in its original concept involves a heavy reliance on detailed failure modes and effects analyses, mathematically calculated probabilities of failure, model development, and accumulation of historical data. It provides the most detailed knowledge of a specific system and component and the most detailed documentation. Because of the detail involved, it is highly labor intensive, time-consuming, and comparatively expensive. The most appropriate applications of rigorous RCM are those in which the consequences of failure would pose a catastrophic risk to personal safety and health or to the environment, or could result in complete economic failure of an organization.
Plant managers adopted a streamlined RCM approach, recognizing its benefits while realizing that few building mechanical and electrical systems carry the catastrophic risk addressed in the rigorous RCM process. A lower intensity, more in line with the scale of a facility’s infrastructure, also meant lower costs. Streamlined RCM targets systems and components in order of criticality. It relies heavily on condition-based tasks and eliminates low-value maintenance tasks altogether based on maintenance and operations staff input and historical data. It minimizes extensive analysis in favor of finding the most obvious, costly problems early on, capitalizes on the early successes, and then expands outward in a continuous fashion.
Common applications of maintenance strategies for RCM program. (From St. Germain, E. and Pride, A., NASA Facilities RCM Guide, 1996, p. 1-1. With permission.)
Streamlined RCM requires a thorough understanding of condition monitoring technologies as well as analytical techniques, including root cause failure analysis (RCFA), trend analysis, and failure modes and effects analysis (FMEA). With some exceptions, streamlined RCM is the philosophy of choice in plant maintenance programs.
Failure: RCM defines failure as any unsatisfactory condition. It may be a loss of function, where a system or component stops running altogether, or it may be a loss of acceptable quality, where operation continues, but at a substandard or inadequate quality. A failure may be catastrophic or merely out of tolerance. As stated, RCM seeks the optimum mix of four maintenance strategies: reactive (RTF), preventive (time-directed), predictive (condition-directed), and proactive (failure-finding). The most common elements of each maintenance strategy are illustrated in Figure 1.1. The application of the various elements of the four maintenance strategies for an automobile RCM program is shown in Figure 1.2.
The mix: Maintenance activity at facilities typically runs about 80%–85% reactive (service requests, trouble calls, repairs), 15% preventive, 1% predictive, and 1% proactive. Goals for effective maintenance programs should be in the range of 30%–35% for reactive maintenance, 30%–35% for PM, 25% for predictive maintenance, and 10% for proactive maintenance.
In addition to improving reliability, this maintenance mix will have a sizeable impact on the cost of maintenance: breakdowns and repairs typically cost about $17–18 per installed horsepower (hp)/year, preventive maintenance costs about $11–13 per installed hp/year, and predictive maintenance costs about $7–9 per installed hp/year.
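As a rough illustration of the arithmetic behind these figures, the sketch below blends the per-horsepower costs by maintenance mix. It is only a sketch under several assumptions that are not in the text: it uses the midpoint of each quoted cost range, treats cost as scaling linearly with each strategy's share of activity, and leaves proactive maintenance out (renormalizing the remaining shares) because no cost figure is quoted for it. The function name and the specific mix values are illustrative.

```python
# Midpoints of the cost ranges quoted above, in $ per installed hp per year.
COST_PER_HP_YEAR = {"reactive": 17.5, "preventive": 12.0, "predictive": 8.0}


def blended_cost_per_hp(mix):
    """Share-weighted cost per installed hp per year.

    mix maps strategy name -> share of maintenance activity (any scale).
    Strategies without a quoted cost (e.g., proactive) are ignored and the
    remaining shares are renormalized; this is an assumption of the sketch.
    """
    costed = {name: share for name, share in mix.items() if name in COST_PER_HP_YEAR}
    total_share = sum(costed.values())
    if total_share <= 0:
        raise ValueError("mix contains no strategy with a known cost")
    weighted = sum(COST_PER_HP_YEAR[name] * share for name, share in costed.items())
    return weighted / total_share


# Illustrative mixes taken from the percentages quoted above.
typical_mix = {"reactive": 83, "preventive": 15, "predictive": 1, "proactive": 1}
target_mix = {"reactive": 32.5, "preventive": 32.5, "predictive": 25, "proactive": 10}

typical_cost = blended_cost_per_hp(typical_mix)
target_cost = blended_cost_per_hp(target_mix)
print(f"typical mix: ${typical_cost:.2f}/hp/year, target mix: ${target_cost:.2f}/hp/year")
```

Under these assumptions the target mix comes out cheaper per installed horsepower than the typical mix, which is the direction the text describes; the absolute numbers should not be read as more than an order-of-magnitude guide.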
An example of proactive maintenance applied to an automobile. (From St. Germain, E. and Pride, A., NASA Facilities RCM Guide, 1998, p. 1-7. With permission.)
A decision logic tree shown in Figure 1.3 may be used as a starting point to determine the appropriate maintenance strategy for a given system or component. Various maintenance strategies are discussed in the following sections.
Reactive maintenance: It involves repair or replacement only when deterioration of the condition causes a functional failure, that is, when the unit breaks down. Reactive maintenance assumes that failure is equally likely to occur in any part, component, or system. If an item fails and parts are not available, delay will occur. Management has no influence on when the failure will occur (usually at the most inopportune time), and a premium will be paid for urgent attention. When this is the sole type of maintenance practiced, there is typically a high percentage of unplanned maintenance, a large replacement parts inventory must be maintained, and the workforce is used inefficiently. An appropriate application of reactive maintenance is when a failure of the system or component poses little risk to operations, is inconsequential, and the costs of maintenance outweigh the item’s replacement cost. Examples include the replacement of failed fuses and incandescent lamps, and the repair of equipment when it breaks down. The reactive maintenance strategy is similar to the RTF strategy discussed earlier.
[Figure 1.3, decision logic tree, first question: Will failure have a direct and adverse impact on environment, health, security, and safety?]
PM: It consists of the regularly scheduled inspection, adjustments, cleaning, lubrication, parts replacement, and repair of components. It is performed on an arbitrary time basis without regard for equipment condition. Maintenance intervals are normally predefined by the manufacturer (who may have a protective self-interest at stake and a lesser regard for costs to the plant). Regularly scheduled PM can result in unnecessary, even damaging, maintenance. Maintenance-induced failures and high maintenance costs typify this strategy. An example is overhauling a properly functioning motor generator set based on a manufacturer recommended timetable. PM strategy is the same as scheduled PM discussed earlier.
Predictive maintenance or condition monitoring: It uses nonintrusive testing techniques, visual inspection, performance data, and data analysis to assess equipment condition. It replaces arbitrarily scheduled maintenance tasks with maintenance tasks that are driven by the item’s condition. Trending analysis of data is used for planning and to establish schedules. Since the technology is not applicable to all types of equipment or possible failure modes, it should not be the sole maintenance strategy employed. It is most effective when used in conjunction with a preventive program. Examples are detection of high-resistance electrical connections by infrared thermography, bearing deficiencies by vibration analysis, and motor winding problems by motor signature analysis.
Vibration monitoring: It is perhaps the most familiar and most beneficial of the mainline techniques for rotating apparatus such as motors. It should be applied to all large (> 7.5 hp), high-cost, and critical rotating equipment to monitor wear, imbalance, misalignment, mechanical looseness, bearing damage, belt flaws, sheave and pulley flaws, gear damage, flow turbulence, cavitation, structural resonance, mounting deficiencies, and fatigue. It can provide several weeks or months of warning before failure occurs, thereby allowing the remedial task to be planned for a convenient time and logistically prepared. It has an accuracy rate as high as 92% when applied correctly and a false alarm rate of about 8%. Vibration analysis can be performed in-house by technicians who have a good understanding of vibration theory and adequate equipment, or it can be outsourced.
Infrared thermography: It has numerous applications in checking electrical systems (connections, unbalanced loads, and overheating), mechanical systems (blocked flow, binding, bearings, fluid levels, and thermal efficiency), and structural systems (roof leaks, building envelope integrity, and insulation). Equipment varies from contact devices to imaging infrared cameras, coupled with appropriate analysis software. Analysis can be a challenge, in part because environmental factors influence the data, so a technician with level I or II thermography certification should be employed to perform this survey. These services can be outsourced. Thermographic findings are invaluable from a safety perspective and typically result in a cost recovery within 1 year.
Passive airborne ultrasonic: It is a low-cost tool for detecting pressure and vacuum leaks in piping, steam traps, pressure vessels, and valves; bearing, lubrication, and mechanical rubbing problems in mechanical systems; and arcing and corona in electrical systems. Ultrasonic devices are becoming increasingly popular with technicians performing lubrication tasks to determine appropriate lubrication levels. Operators require little training or prior experience, and scanners cost as little as $1000.
Lubrication oil analysis: It is often performed on large or critical machines to determine their mechanical wear, the condition of the lubricant, whether the lubricant has become contaminated, and the condition and appropriateness of the lubricant additives. Lube oil packages include checking for visual condition and odor, viscosity, water content, acidity, alkalinity, and metallic and nonmetallic contamination. Precise procedures must be followed in obtaining clean, representative samples; however, analysis is performed in a laboratory at reasonable cost ($10–$100 per test). A single failure detected could pay for the program for several years.
Electrical condition monitoring techniques: These should be applied to electrical distribution cabling, panels, and connections; switchgear and controllers; transformers; electric motors; and generators. It is estimated that 95% of all electrical problems are due to connections (loose, corroded, undersized, and overtightened), unbalanced load, inductive heating, spiral heating in multistrand wires, slip rings, commutators, and brush riggings. Condition monitoring detects abnormal temperature, voltage, current, resistance, complex impedance, insulation integrity, phase imbalance, mechanical binding, and the presence of arcing. The most common predictive tests are:
• Infrared thermography: Detects temperature differences and the overheating of circuits (see Chapter 8 for more detail)
• Insulation power factor (PF): Measures power loss through insulation to ground (see Chapter 3 for more detail)
• Insulation oil analysis: Detects the condition and contamination of transformer, switch, and breaker insulating oil (see Chapter 4 for more detail)
• Dissolved gas analysis: Trends the amounts of nine gases in transformer oil formed by transformer age and stress: carbon monoxide (CO) and carbon dioxide (CO2) to detect overheating of windings; CO, CO2, and methane (CH4) to detect hot spots in insulation; hydrogen (H2), ethane (C2H6), ethylene (C2H4), and methane (CH4) to detect overheating of oil and/or corona discharge; and acetylene (C2H2) to detect internal arcing (see Chapter 4 for more detail)
• Megohmmeter testing: Measures insulation resistance phase to phase or phase to ground (see Chapters 2 and 3 for more detail)
• High-potential (hi-pot) testing: Go/no-go test of the insulation
• Airborne ultrasonic noise: Detects electrical arcing and corona discharges
• Battery impedance: Checks impedance between terminals; compares the same battery against previous readings (should be within 5%) and against others in the bank (within 10%); and indicates an internal short (impedance approaching 0), an open circuit (impedance approaching infinity), or premature aging due to heat/discharges (fast rise in capacity loss) (see Chapter 8 for more detail)
• Surge testing: Go/no-go test of winding insulation
• Motor circuit analysis (MCA): Measures motor circuit resistance, capacitance, imbalance, and rotor influence (see Chapter 10 for details)
• Motor current signature analysis (MCSA): Provides signatures of motor current variations (see Chapter 10 for details)
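The battery impedance comparison in the list above lends itself to a simple screening check. The following is a minimal sketch, assuming impedance readings in milliohms and assuming that "others in the bank" is represented by the bank average; the function name, the finding labels, and the use of the mean are illustrative choices rather than a prescribed procedure.

```python
from statistics import mean


def check_battery_impedance(current, previous, bank_readings,
                            own_limit=0.05, bank_limit=0.10):
    """Screen one battery's impedance reading (all values in milliohms).

    Returns a list of findings; an empty list means both checks passed.
    own_limit and bank_limit default to the 5% and 10% figures in the text.
    """
    findings = []

    # Compare against the same battery's previous reading (within 5%).
    if abs(current - previous) / previous > own_limit:
        findings.append("drift_vs_previous")

    # Compare against the rest of the bank (within 10% of the bank average;
    # using the average is an assumption of this sketch).
    bank_average = mean(bank_readings)
    if abs(current - bank_average) / bank_average > bank_limit:
        findings.append("deviation_vs_bank")

    return findings


# Hypothetical readings for illustration only.
healthy = check_battery_impedance(1.00, 0.98, [1.00, 1.02, 0.99])
suspect = check_battery_impedance(1.20, 1.00, [1.00, 1.02, 0.99, 1.20])
print(healthy, suspect)
```

A reading that trips either check is a prompt for closer inspection, not a verdict; the limits are passed as parameters so that site-specific acceptance criteria can be substituted.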
Electric motor phase voltage unbalances produce phase current unbalances, cause motors to run hotter, and reduce the motor’s ability to produce torque. For every 10°F increase in operating temperature, it is estimated that the life of the equipment is reduced by half (H.W. Penrose, White Paper, Test methods for determining the impact of motor condition on motor efficiency and reliability). Some of these electrical tests require the circuits to be energized, and others do not. Some tests require specific initial conditions, such as normal operating temperature. High loads tend to amplify problems, whereas low loads can allow them to go undetected.
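The halving rule quoted above can be written as a simple exponential. The sketch below assumes the rule applies continuously between the 10-degree steps, which the text does not state; the function name and the `halving_step` parameter are illustrative.

```python
def remaining_life_fraction(temperature_rise, halving_step=10.0):
    """Fraction of expected life remaining after a sustained temperature rise.

    halving_step is the rise that halves equipment life. The default follows
    the 10-degree figure quoted in the text; temperature_rise must be given
    in the same units as halving_step.
    """
    if halving_step <= 0:
        raise ValueError("halving_step must be positive")
    return 0.5 ** (temperature_rise / halving_step)


# A sustained rise of two halving steps leaves one quarter of the expected life.
print(remaining_life_fraction(20.0))
```

Keeping the step as a parameter makes it easy to apply a different halving interval where the equipment data support one.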
Electricians, technicians, and electrical engineers trained in electrical predictive techniques can perform the testing. A comprehensive testing program toolbox would include an infrared camera, ultrasonic detector, multimeter/voltohmmeter, clamp-on current transformer, an insulation and PF test set, battery impedance test set, MCSA test set, and MCA tester.
Proactive maintenance: It improves equipment condition and rate of degradation through better design, installation procedures, failure analysis, workmanship, and scheduling. Its procedures and technologies are used during forensic evaluations to determine the cause of failure. Proactive maintenance uses feedback to ensure that changes from lessons learned and best practices are incorporated in future designs and procedures. It employs a life-cycle view of maintenance, ensures that nothing affecting maintenance is done in isolation, and integrates maintenance support functions into maintenance planning. It uses RCFA and predictive technologies to maximize maintenance effectiveness. Common proactive techniques are:
RCM specifications: Specifications that incorporate RCM philosophy and techniques are prepared for new and rebuilt equipment. These specifications include vibration, alignment, and balance standards; electrical testing criteria; lube oil testing requirements; and commissioning and acceptance testing requirements. Operator and maintenance feedback and RCM analysis documentation provide designers with justification for equipment upgrades and modernization. The design of new and replacement units should reflect lessons learned and best practices for improvements in operability, maintainability, and reliability.
Failed part analysis: Involves visually inspecting failed parts to identify the root cause of the failure. It looks at forensic scoring, color, and pitting, particularly of bearings, which are generally the equipment’s weakest components and achieve only 10%–20% of their design life.
RCFA: Maintenance technicians usually repair symptoms, although recurring problems are symptomatic of more severe problems. The end result is high cost, questionable mission reliability, strained user goodwill, and safety hazards. RCFA seeks to find the cause, not just the effect, quickly, efficiently, and economically. Predictive maintenance techniques detect and correct problems before failure, but do not act on the root cause. RCFA provides the information to eliminate the recurrence and instill the mentality of “fix forever.”
FMEA: Similar to RCFA, but performed prior to failure. Its goal is to identify potential failures and failure modes in order to take action to prevent the failure, detect the failure earlier, and reduce the consequences of failure. For each affected piece of equipment, it describes the function, identifies failure modes and the effects of failure, estimates the probability and criticality of failure, and suggests a maintenance approach.
Reliability engineering: It involves the redesign, modification, and replacement of components with superior components, such as sealed bearings, upgraded metal, and lubricant additives.
Age exploration: Determines the optimal maintenance frequency. Starts with the manufacturer’s recommendations, then adjusts the frequency based on equipment histories and observations and condition assessments during PMs and “open and inspects.”
Recurrence control: A repetitive failure is the recurring inability of a system, subsystem, structure, or component to perform the required function. The process analyzes the repeated failure of the same component, repeated failure of various components of the same system, and the repeated failure of the same component of various systems. Historical maintenance and trend data would be monitored to determine if recurring component problems might be symptomatic of possible generic problems and/or procedures of system aging, corrosion, wear, design, operations, the work environment, or maintenance application (or misapplication).
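The three views of repetitive failure described above can be extracted from a work-order history with simple grouping. The following is a minimal sketch, assuming each failure record reduces to a (system, component) pair; the record layout, the asset names, and the threshold of two occurrences are hypothetical.

```python
from collections import Counter, defaultdict


def recurrence_report(failures, threshold=2):
    """Group failure records into the three recurrence views.

    failures: iterable of (system, component) tuples, one per failure event.
    threshold: minimum count that qualifies as "repeated" (an assumption).
    """
    records = list(failures)

    # View 1: the same component in the same system fails repeatedly.
    pair_counts = Counter(records)
    same_component = sorted(pair for pair, n in pair_counts.items() if n >= threshold)

    components_per_system = defaultdict(set)
    systems_per_component = defaultdict(set)
    for system, component in records:
        components_per_system[system].add(component)
        systems_per_component[component].add(system)

    # View 2: various components of the same system fail.
    same_system = sorted(
        s for s, comps in components_per_system.items() if len(comps) >= threshold
    )
    # View 3: the same component type fails across various systems.
    across_systems = sorted(
        c for c, systems in systems_per_component.items() if len(systems) >= threshold
    )

    return {
        "same_component": same_component,
        "same_system": same_system,
        "across_systems": across_systems,
    }


# Hypothetical failure history for illustration only.
history = [
    ("AHU-1", "bearing"), ("AHU-1", "bearing"), ("AHU-1", "belt"),
    ("Pump-2", "bearing"), ("Chiller-1", "contactor"),
]
report = recurrence_report(history)
print(report)
```

Each view points to a different line of inquiry: a repeat offender, a troubled system, or a component type that may share a common cause across systems.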
Program implementation: The planning of a maintenance program should include considerations for proper test equipment, tools, trained personnel to carry out the maintenance tasks, and the time required to perform inspections, tests, and maintenance routines. Also, consideration should be given to record-keeping systems, which range from computerized maintenance management systems (CMMSs) to manual file systems. There are a number of companies that offer computerized maintenance management programs, either as stand-alone programs or for incorporation into the facility operational programs. The reader is encouraged to look into these programs since they are not fully covered in this book.
The following are the steps in implementing an effective maintenance program:
1. Determine the objectives and long-range goals of the maintenance program.
2. Survey and consolidate data on equipment breakdowns.
3. Determine equipment criticalities.
4. Determine the risk and the amount of risk that you are willing to tolerate.
5. Establish metrics and key performance indicators (KPIs) to track and trend performance.
6. Establish the best maintenance techniques within your resources to mitigate the risk. Determine the maintenance procedures and frequencies.
7. Schedule and implement the program, starting with the most critical systems and those with the fastest, most beneficial paybacks first.
8. Publicize successes; provide trends, metrics, and KPIs to top management to gain management support.
9. Repeat the cycle.
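Steps 3, 4, and 7 above can be sketched as a simple risk ranking. The example assumes the conventional definition of risk as probability of failure multiplied by consequence, which the text does not spell out; the asset names, probabilities, costs, and tolerance threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    failure_probability: float  # estimated failures per year (hypothetical)
    consequence_cost: float     # estimated cost of one failure (hypothetical)


def rank_by_risk(assets, tolerated_risk):
    """Return (name, risk) pairs above the tolerated risk, highest risk first.

    Risk is taken as failure probability times consequence cost, which is an
    assumption of this sketch rather than a formula from the text.
    """
    scored = [(a.failure_probability * a.consequence_cost, a.name) for a in assets]
    above_tolerance = [(risk, name) for risk, name in scored if risk > tolerated_risk]
    above_tolerance.sort(reverse=True)
    return [(name, risk) for risk, name in above_tolerance]


assets = [
    Asset("lighting panel", 0.10, 2_000),
    Asset("main switchgear", 0.02, 500_000),
    Asset("process motor", 0.05, 80_000),
]
ranked = rank_by_risk(assets, tolerated_risk=1_000)
for name, risk in ranked:
    print(f"{name}: expected annual loss about {risk:,.0f}")
```

Assets that fall below the tolerated risk drop out of the list and become candidates for RTF, while the top of the list indicates where to start the program.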
Maintenance analyst: The quality of the maintenance program is reflective of the skill of the maintenance technicians, their workmanship, quality of the supporting documentation, procedures, and the technologies used.
A position for a maintenance analyst should be included in an RCM program. This person should be able to detect the equipment condition, must have the skill to analyze the condition, must be able to diagnose the machine or system operation and develop a course of action, and must take the action needed to prevent failure (or allow RTF). The analyst would be responsible for monitoring and analyzing data for the mechanical systems. He or she would receive all work orders, trouble calls, KPIs, and test results and would provide continuous oversight and analysis.
Plant databases: CMMSs, building management systems (BMSs), and energy management systems (EMSs) provide invaluable historical data to the maintenance analyst. Historical data from these systems provide information on age–reliability relationships, data to trend and forecast impending failure, test results, performance data, and feedback to improve performance and to document condition.
RCM involves specifying and scheduling EPM activities in accordance with the statistical failure rate and/or life expectancy of the equipment being maintained and its criticality and productivity, and continually updating EPM procedures and schedules to reflect actual maintenance experience in the plant. RCM is the most cost-effective of the alternative approaches because it improves plant safety, reliability, and availability while reducing maintenance costs by concentrating limited maintenance resources on the items that are most important and/or troublesome, and by reducing or eliminating unnecessary maintenance on items that are of little significance and/or highly reliable. A comprehensive RCM program also incorporates structured provisions for failure root cause investigation and correction and for performance monitoring to predict failures. RCM is used extensively in the military and is gaining acceptance among both nuclear utilities and manufacturing plant operators as its advantages are increasingly recognized.
Key Factors in EPM Optimization Decisions
The optimum EPM approach for any specific plant, system, and/or piece of equipment depends on a variety of factors, including the following:
• Safety impact of equipment failure
• Productivity and profitability impact of equipment failure (including costs of lost production as well as failed equipment repair or replacement)
• Cost of PM
• Failure rate and/or anticipated life of equipment
• Predictability of failure (either from accumulated operating time or cycles or from discernible clues to impending failure)
• Likelihood of inducing equipment damage or system problems during maintenance and testing
• Technical sophistication of the plant maintenance staff
• Availability of equipment reliability data to support RCM
General Criteria for an Effective EPM and Testing Program
Effective electrical equipment and subsystem PM and testing programs should satisfy the criteria listed below.
First and most fundamental, a structured EPM program should actually exist. That is, EPM should be performed as follows:
• Under formal management control
• In accordance with defined practices and schedules
• By clearly designated persons
Management should assign a high priority to EPM. As a corollary, adequate resources (personnel, facilities, tools, test equipment, training, engineering, and administrative support) should be devoted to EPM. Adequate support from design engineering and operations is especially important.
EPM activities should be prioritized according to the criticality of the systems and equipment involved, with the highest resource intensity and scheduling priority assigned to equipment, subsystems, and systems important to safety.
EPM should be performed according to unambiguous written procedures based on specific consideration of equipment, application, and environmental characteristics.
EPM procedures and schedules should be maintained and reviewed in order to ensure engineering review of procedural changes and the incorporation of plant modifications.
The EPM program should have provisions to take effective advantage of actual experience accumulated both in the plant and elsewhere (e.g., through professional society and industry association publications, and informal communications with other interested organizations).
The EPM program should incorporate effective provisions for failure root cause analysis, correction, and recurrence control.
Information systems should be in place to record and update the plant maintenance, testing, and operating history, and to facilitate trending of test data, in support of the previous two criteria.
EPM should be performed only by appropriately qualified personnel. (See Section 1.3.3.)
Management should continually monitor and reevaluate the effectiveness of the EPM program, and make appropriate changes in response to identified programmatic problems and advances in maintenance technology.
By clear implication, the “RTF” and “inspect and service as necessary” philosophies described earlier fail to provide enough structure, direction, and monitoring to satisfy the criteria for a sound EPM approach. These philosophies are not acceptable for important equipment and systems. At a minimum, a scheduled EPM program is clearly necessary.
Qualifications of EPM Personnel
The minimum acceptable qualifications for personnel assigned to perform EPM depend on the type of maintenance and the type of equipment to be maintained. It is normally acceptable for nonspecialist personnel to perform superficial inspections and other undemanding EPM tasks when guided by defined procedures and acceptance criteria. However, effective administrative controls should be in place to ensure that critical PM tasks on important equipment and systems are performed only by, or at least under the immediate and active supervision of, appropriately trained and experienced maintenance technicians. Such tasks typically include internal inspection, testing, calibration, and refurbishment.
Training for critical EPM work on important equipment and systems should include at least the following:
• The fundamentals of electrical power technology
• General electrical maintenance techniques
• Electrical safety methods and practices
• The design and operation of the equipment and system to be maintained
• The applicable maintenance and testing procedures required for the maintenance and testing of the equipment
For critical tasks, technicians’ experience should include similar work on the same or closely comparable equipment, preferably in an operational environment, although experience acquired in a training environment under direct supervision of experienced instructors is acceptable.
With regard to electrical safety methods and practices, the National Fire Protection Association (NFPA) and the Occupational Safety and Health Administration (OSHA) have promulgated new guidelines and requirements to protect workers from shock and flash hazards. NFPA 70E, Article 110.8 (B) (1) requires safety-related work practices to be used to protect employees who might be exposed to the electrical hazards involved when working on live parts operating at 50 V or more. Appropriate safety-related work practices shall be determined before any person approaches exposed live parts within the limited approach boundary by using both shock hazard and flash hazard analyses. Similarly, OSHA 1910.335(a)(1)(i) requires employees working in areas where there are potential electrical hazards to be provided with, and to use, electrical protective equipment that is appropriate for the specific parts of the body to be protected and for the work to be performed. Also, in accordance with OSHA 1910.132(d), the employer is required to assess the workplace hazards to determine the personal protective equipment (PPE) required to protect the worker from shock and flash hazards. The NFPA 70E and OSHA requirements for shock and arc-flash hazards and guidelines for performing such an analysis are covered in more detail in Chapter 13, Sections 13.2 and 13.3. The maintenance of protective devices and its impact on arc-flash hazard are covered in Section 1.7 of Chapter 1.
Optimization of PM Intervals
Experience in a variety of industries demonstrates that performing PM on an absolutely fixed schedule rarely results in the optimum balance among the costs of preventive and corrective maintenance and the safety and productivity benefits of equipment reliability and availability. Given an adequate historical failure and maintenance database, reasonably straightforward methods can be used to optimize the PM cycle.
Also, several industry standards, such as NFPA 70B (Recommended Practice for Electrical Equipment Maintenance) and the National Electrical Testing Association (NETA) maintenance specifications, together with manufacturers’ recommendations, provide guidelines on the frequency of maintenance of electrical equipment that can be used to establish the EPM cycle.
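As an illustration of how a PM cycle might be optimized from failure history, the following minimal sketch uses the classical age-replacement cost model. This model is an assumption chosen for illustration and is not a method prescribed by the text. It assumes that equipment life follows a Weibull distribution whose shape (beta) and scale (eta) have already been fitted to the historical failure data, and that the average costs of one planned PM action and of one in-service failure are known. All function names, parameters, and numerical values are hypothetical.

```python
import math


def weibull_reliability(t, beta, eta):
    """Probability that a unit survives to age t under a Weibull life model."""
    return math.exp(-((t / eta) ** beta))


def cost_rate(interval, beta, eta, cost_pm, cost_failure, steps=2000):
    """Expected cost per unit time if PM is performed at age `interval`.

    Age-replacement model: a cycle ends either with a planned PM at
    `interval` or with a failure before it, whichever comes first.
    """
    dt = interval / steps
    # Trapezoidal integral of R(t) over [0, interval] = expected cycle length.
    expected_length = 0.0
    previous = 1.0
    for i in range(1, steps + 1):
        current = weibull_reliability(i * dt, beta, eta)
        expected_length += 0.5 * (previous + current) * dt
        previous = current
    survive = previous  # R(interval)
    expected_cost = cost_pm * survive + cost_failure * (1.0 - survive)
    return expected_cost / expected_length


def best_interval(candidates, beta, eta, cost_pm, cost_failure):
    """Return the candidate interval with the lowest expected cost rate."""
    return min(
        candidates,
        key=lambda t: cost_rate(t, beta, eta, cost_pm, cost_failure),
    )


if __name__ == "__main__":
    # Hypothetical inputs: wear-out behaviour (beta > 1), a failure costing
    # ten times as much as a planned PM, candidate intervals in months.
    months = list(range(6, 121, 6))
    print(best_interval(months, beta=2.5, eta=60.0, cost_pm=1.0, cost_failure=10.0))
```

With a shape parameter above 1 (wear-out), the cost rate has a minimum at an intermediate interval. With a shape parameter of 1 (constant failure rate), the model favors the longest interval, which agrees with the observation that time-based PM does little for equipment that does not wear out.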
Trending of Test Results
Systematic trending of EPM test results is a key element of a high-quality electrical maintenance program. This is true because the magnitudes (pass or fail value) of many of the parameters measured during EPM tests on equipment are poor predictors of future failures, unless they are so far out of the normal range that they indicate imminent and probably irretrievable failure. Examples include insulation resistance, leakage current, capacitance, PF, and dissipation factor (DF); bearing temperature and vibration; and winding temperature. However, a degrading trend in these parameters strongly indicates impending trouble, especially if the trend is accelerating. A sound trending program can often alert the maintenance and operations staff of the plant in time to arrest the degradation and avert the failure, or at least to minimize the effect of the failure on safety and productivity.
To provide meaningful information, the trending program must be structured to screen out the effects of external factors that affect the measured results but are irrelevant to the actual health and reliability of the equipment. Test procedures should mandate precautions to ensure that the external conditions that can affect the test results remain the same from test to test, or to correct the results when this is impractical. (For example, insulation resistance readings taken at varying temperatures are corrected to a common base temperature.) Typical irrelevant external conditions that affect electrical test results include temperature, humidity, and load.
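The sketch below illustrates the two ideas above for insulation resistance: correct each reading to a common base temperature, then trend the corrected values rather than the raw ones. The correction uses the widely quoted rule of thumb that insulation resistance roughly halves for every 10 °C rise in temperature. This is an assumption for illustration only; actual correction factors depend on the insulation type and should be taken from the applicable standard or the manufacturer. The function names, the 20 °C base temperature, and the sample readings are hypothetical.

```python
import math


def correct_to_base(resistance_megohm, temp_c, base_temp_c=20.0, halving_step_c=10.0):
    """Correct an insulation resistance reading to the base temperature.

    Assumption: resistance halves for each `halving_step_c` rise in
    temperature, so a reading taken hot is scaled up and one taken cold
    is scaled down.
    """
    return resistance_megohm * 2.0 ** ((temp_c - base_temp_c) / halving_step_c)


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def yearly_change(readings):
    """Fractional change per year in temperature-corrected resistance.

    `readings` is a list of (years_since_first_test, megohms, temp_c).
    The line is fitted to the logarithm of the corrected values, so the
    result is a percentage-style rate (for example, -0.2 means -20 %/year).
    """
    years = [t for t, _, _ in readings]
    logs = [math.log(correct_to_base(r, temp)) for _, r, temp in readings]
    return math.exp(fit_slope(years, logs)) - 1.0


if __name__ == "__main__":
    # Hypothetical history: raw readings differ only because of temperature.
    stable = [(0, 1000.0, 20.0), (1, 500.0, 30.0), (2, 2000.0, 10.0)]
    # Hypothetical history: genuine degradation at constant temperature.
    degrading = [(0, 1000.0, 20.0), (1, 800.0, 20.0), (2, 640.0, 20.0)]
    print(yearly_change(stable), yearly_change(degrading))
```

In the first history the raw readings vary by a factor of four, yet the corrected trend is flat; in the second the corrected values fall steadily. Without the correction, the two cases could easily be confused.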
Systematic Failure Analysis Approach
Failure analysis and root cause investigation should be an integral part of any EPM program. The steps to be taken after a failure is observed are
1. Use a failure cause analysis to determine the proximate cause of the failure. The proximate cause is expressed in terms of the piece-part-level failure, e.g., relay XX failed to transfer due to corroded contacts.
2. Compare the proximate cause to past failures or conditions on the same and similar equipment to determine if the problem has a systematic root cause, e.g., a chemically active environment in the example cited above.
3. If there appears to be no systematic root cause, correct the failure, resume operation, and continue performance monitoring. If there is a discernible root cause, initiate a structured root cause investigation.
4. If the problem is generic, contact other affected plants and manufacturers of the equipment to determine if they have taken any effective corrective actions. If so, adapt these actions to the specific circumstances of the affected equipment; if not, proceed to the next step.
5. If the problem is plant-specific, or if it is generic but no effective solution has been developed elsewhere, determine if it is attributable to a unique system design, to application or environmental factors, or to operational factors such as maintenance, testing, and operations practices.
6. If the problem is determined to be related to system design, equipment application, or environment, determine the specific deficiency (through special tests, performance monitoring, environmental monitoring, etc.), and make appropriate corrections.
7. If the problem is related to faulty operations, identify and correct the specific procedures involved.
8. Determine whether the root cause of the problem is a programmatic deficiency, e.g., in procedure writing, training, supervision, or adequacy of resources, and make appropriate corrections.
9. Perform the necessary postcorrection testing and monitoring to close out the problem and ensure that it is corrected.
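The decision logic in the steps above can be sketched as a small triage function. This is one possible reading of the steps, not a definitive procedure: the record fields and action strings are hypothetical, and it is assumed here that postcorrection testing (step 9) applies to every path that goes through a root cause investigation.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical record of what the investigation has established so far."""
    proximate_cause: str                   # step 1, e.g. "corroded relay contacts"
    systematic: bool = False               # step 2: recurring on same/similar equipment?
    generic: bool = False                  # step 4: seen at other plants as well?
    external_fix_available: bool = False   # step 4: proven fix exists elsewhere?
    cause_category: str = ""               # step 5: design/application/environment/operations
    programmatic_deficiency: bool = False  # step 8


def plan_actions(finding):
    """Return the ordered follow-up actions implied by steps 3 to 9."""
    if not finding.systematic:
        # Step 3: isolated failure, no root cause investigation needed.
        return ["correct the failure", "resume operation",
                "continue performance monitoring"]

    actions = ["initiate structured root cause investigation"]
    if finding.generic:
        actions.append("contact other affected plants and manufacturers")

    if finding.generic and finding.external_fix_available:
        actions.append("adapt proven corrective actions to this equipment")
    elif finding.cause_category in ("design", "application", "environment"):
        # Steps 5 and 6.
        actions.append("determine the specific deficiency and correct it")
    elif finding.cause_category == "operations":
        # Steps 5 and 7.
        actions.append("identify and correct the procedures involved")
    else:
        raise ValueError("cause_category must be set for this path")

    if finding.programmatic_deficiency:
        actions.append("correct the programmatic deficiency")  # step 8
    actions.append("perform postcorrection testing and monitoring")  # step 9
    return actions


if __name__ == "__main__":
    relay = Finding("corroded relay contacts", systematic=True,
                    cause_category="environment")
    for action in plan_actions(relay):
        print(action)
```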
Postmaintenance testing provides the best assurance that maintenance actions were accomplished correctly and that the system or component was returned to functional condition. Postmaintenance testing is heavily emphasized in the better-performing plants. In these organizations, postmaintenance tests are performed following any action that potentially affects the operability of a component/subsystem/system, and the scope of the testing is broad enough to confirm all of the potentially affected functions. Associated systems, subsystems, or components are tested along with the systems, subsystems, or components which initiated the process if an engineering analysis indicates that the maintenance action could have a significant impact on these associated items.
Engineering support is intended to ensure that the PM program properly addresses the engineering and logistical aspects of maintenance. In view of this broad objective, engineering support of maintenance encompasses much of the engineering and management activity that takes place in a plant. This includes at least the following functions:
• Maintenance engineering
• System engineering
• Design engineering
• Spare parts and materials management
• Quality assurance
• Quality control
There are, of course, many other areas of maintenance involvement with engineering support groups. The intent here is to show areas which stand out in the better-performing plants and which tend to be missing or underdeveloped in other organizations.
Maintenance engineering is the engineering support activity most directly involved with PM. This function is present in all of the better-performing plants, although its name and where it fits into the organization vary widely from plant to plant. Its purpose is to optimize the maintenance program through planning, feedback, continual evaluation, and periodic updating of policies and procedures. The functions of a maintenance engineering group typically include
• Maintenance procedure development and control
• Periodic review and updating of maintenance practices and procedures
• Maintenance recordkeeping
• In-service inspection and testing (ISI/IST) program development
• Providing guidance to the training staff on maintenance training
• Collecting and trending equipment failure, reliability, availability, and maintainability data
• Tracking and trending the corrective- to preventive-maintenance ratio
• Failure root cause analysis
• Tracking, trending, and analysis of nonconformances
• Identifying and monitoring maintenance-related equipment performance parameters, especially failure precursors
• Identifying and monitoring maintenance performance indicators
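As a small illustration of one item in the list above, the corrective- to preventive-maintenance ratio can be tracked per reporting period from work-order records. The sketch below assumes the ratio is taken on work-order counts (labor hours or cost could equally be used) and that each record carries a period label and a type code of "CM" or "PM"; the record layout and function name are hypothetical.

```python
from collections import Counter


def cm_to_pm_ratio(work_orders):
    """Return {period: corrective/preventive work-order ratio}.

    `work_orders` is a list of (period, kind) tuples, where kind is
    "CM" (corrective) or "PM" (preventive). A period with no PM work
    orders maps to None because the ratio is undefined.
    """
    counts = Counter(work_orders)
    periods = sorted({period for period, _ in counts})
    ratios = {}
    for period in periods:
        preventive = counts[(period, "PM")]
        corrective = counts[(period, "CM")]
        ratios[period] = corrective / preventive if preventive else None
    return ratios


if __name__ == "__main__":
    # Hypothetical work-order history for two years.
    orders = (
        [("2023", "PM")] * 4 + [("2023", "CM")] * 2
        + [("2024", "PM")] * 5 + [("2024", "CM")] * 1
    )
    print(cm_to_pm_ratio(orders))
```

A ratio that falls from one period to the next, as in this hypothetical history, suggests that less corrective work is being needed relative to the preventive work performed.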
The foregoing has been a brief look at the features of the EPM program. There are many ways to effect improvements in an organization, but probably the dominant cause of failing to improve is resistance to change. In the plants that have outstanding maintenance organizations, upper management has overcome this resistance by direct, long-term involvement in establishing and implementing policies leading to improved maintenance. Perceptible improvements in reliability, availability, and thermal efficiency have generally resulted; the indirect results have been both greater safety and higher profits. The changes in these organizations were not easy and required both time and dedication to implement. Effective management appears to be the key to an effective overall maintenance organization, not the number of programs management has in place.