OR/MS Today - October 2003
"Power" of OR
Can Operations Research Light Up Power Industry?
The massive blackout that crippled North America points out the need to modernize the electrical power grid. A team from Bruce Power near Toronto seeks to optimize power generation and distribution maintenance by capturing the expertise of veteran experts before it's too late.
By E. Kevin Doyle, Thierry Duchesne and Danny I. Cho
At 4:09 p.m. on a hot, humid day in August, more than 50 million people suddenly found themselves hurled back in time during the largest and most expensive blackout in North American history. The blackout cascaded across the Northeast from Cleveland to Toronto to New York in a matter of minutes, left a significant portion of the United States and Canada without electricity for over 24 hours, and left government and power officials wondering what went wrong and how to prevent it from happening again. Time magazine estimated the cost of the blackout at $5 billion.
Clearly, this historic electrical outage emphasizes the need for both old-hand knowledge and new systems to analyze our aging generating and distribution facilities. Bruce Power, a private sector electricity producer located about 120 miles north of Toronto on the shores of Lake Huron and the largest nuclear generating site in North America, faced a complete lack of grid access at 4:10 p.m. However, the Bruce nuclear reactors were able to lead the way in rebuilding the grid by supplying 2,000 badly needed megawatts within hours.
In terms of old-hand knowledge, both the popular press and many journal publications have recently been alerting the world to the tremendous loss of capability and expertise that all large industries, including the electrical generation and distribution sectors, will suffer as a result of the mass retirements of the huge post-war "baby-boom" generation. In some instances, these retirements are already taking place. This article illustrates a method of preserving some of that knowledge and using it in stochastic optimization systems to improve the performance of, initially, the maintenance function of nuclear generation stations and, eventually, most areas of power generation and distribution.
Our team is developing a multidisciplinary elicitation protocol to elicit component lifetime distributions (in other words, the failure rate of equipment). The customized protocol will be capable of eliciting expertise from the baby-boom demographic in all areas of power generation and distribution. The team, consisting of people from Bruce Power, the University of Toronto, Laval University and Brock University, observed a very gratifying correlation during this, the initial case study of the program.
As with any complete case study of this scale, it will take several years to prove that the distributions can be worked down via stochastic optimization methods into actual year-end operating savings. However, the limited window of opportunity available for the technique requires it to be brought to the attention of industry as soon as possible. For those in power generation and others in large industries, the expert knowledge of the baby-boomer bulge is rapidly passing into retirement.
Statistical Maintenance Optimization
The Bruce site (BNGS) has eight Candu reactors with an installed capacity of approximately 6,400 megawatts. The Darlington site (DNGS) has four reactors of approximately 900 megawatts each.
Candu is a Canadian-designed and Canadian-built power reactor that is fueled at full power with natural uranium. The use of various maintenance optimization techniques, such as modified reliability-centered maintenance (RCM) and expert panel evaluations, has resulted in cost-effective preventive maintenance procedures. The Bruce site has initiated several modifications to the RCM process that accelerated the analysis while maintaining its accuracy [10]. The present work continues this refinement of the maintenance process by introducing a more statistical aspect to preventive maintenance.
A lack of field implementation has been identified industry-wide as one of the major problems for the statistical maintenance optimization area of operations research. Part of the problem is recognized as a lack of failure data and/or failure distributions. We conducted relatively random, and by no means exhaustive, reviews of the OR literature to determine whether far-flung areas of OR show a different degree of real-world implementation than the maintenance optimization area, where Bruce Power is attempting to acquire a degree of sophistication. The overall thought was that if successful implementation techniques or protocols were available, they could possibly be modified and used to expedite programs in the maintenance area. We identified the following promising paths forward:
We have emphasized the subject of demographics throughout the paper because a truly unique opportunity is available now to capture the expertise of the large, fast-retiring, post-war generation. This opportunity allows the use of well-seasoned experts in all three areas of the "knowledge acquisition" sequence: the knowledge engineering/project leader area, the domain expert area and the end-user area. While the use of expert opinion is not new, a systematic procedure that uses knowledge-based system techniques to determine subjective component lifetime distributions via 30-year veterans is. We have been developing a protocol customized to the age of the participants, along with techniques to offset bias, participant fatigue, etc.
In the competitive environment of the new global economy, the fiscal advantages have to be clearly demonstrated at each stage of the endeavor. The current atmosphere of deregulation that is sweeping the industry in North America dictates that every competitive advantage be thoroughly evaluated and developed.
Developing the Protocol
Having identified that the first and most important step in this maintenance optimization program was to obtain estimates from experts, we initiated work on establishing a protocol to obtain estimates by knowledge elicitation. This would appear to be the area where the most effort should be expended. If things are done right the first time, then less effort will be spent down the road on dubious data manipulations, non-repeatable experimental results and general procedural confusion.
Other researchers have considered such areas as combining the opinions of several experts. This could create several problem areas, not the least of which is the demand on experts' time. The more expert time a project eats up, the lower the chances of senior management approving the large-scale project.
Experts rarely agree 100 percent on anything, so a researcher is left with the near-impossible task of trying to rate experts against each other. Here we placed our emphasis on determining the best possible expert for a given evaluation, carefully aligning him or her with the needs and directions of the project, and correctly implementing the precisely determined responses.
Other disciplines, such as psychological and psychometric studies [11, 13], have evaluated the idea of determining subjective probabilities for more than 40 years. Several of their quite valid concerns were taken into consideration in the design of the present elicitation technique. As we reviewed work in this area, we obtained many indications of trouble spots to watch for and verified the previously mentioned theme that it is essential to have very experienced users/problem owners/maintenance engineers who can validate calculations, conclusions or recommendations.
The overriding observations of these studies are best summarized by saying that subjective probability is a highly individual matter, and people may vary considerably in their judgment as well as in their ability to form and encode their judgments accurately and reliably. In general, it can be said that a probability encoding is reliable when it is relatively free of random error, and it is valid when it accurately represents the opinion of the person from whom it is elicited. Both reliability and validity are obviously innate characteristics of the person being interviewed. In addition, we noted that non-experts appeared to exhibit traits of overconfidence when expressing their knowledge in probabilistic terms. In sum, we believe that a person's opinion is not an existing entity waiting to be measured; rather, it is a quantity that develops while the question and context are being presented.
A careful decision analyst (or risk analyst or project leader) has to be able to set context so that bias and inconsistencies are minimized and the encoding closely represents the interviewee's carefully considered opinion. While setting context, the following should be considered:
Over a period of several months we have developed an easily portable computer program that permits the expert to vary both the time interval and the failures in each interval of the histogram in an iterative fashion. This visual feedback loop speeds up the process and increases the expert's satisfaction with it. It also enables more constructive and uniform input by the overseeing project leader to each domain expert.
We conducted the interviews in each respondent's office at the plant so that he or she would feel most comfortable and in control. The sessions started out with a general discussion of the overall project of recording knowledge for posterity. Teaching/motivating appeared to be the most important and most time-consuming phase of the process. The current effort was emphasized as a pilot project to help get some of the bugs worked out of the larger program. Context was set and an attempt was made to remove any bias that might affect the data. The respondents were eventually asked for subjective component lifetime distributions based on their years of experience on a particular piece of equipment. As a starting point, the participants were asked to imagine "x" pieces of rebuilt equipment at time zero and to report how many would still be operating at the end of various intervals of their own choosing. In essence, they were being asked for histograms.
Results
We conducted two separate fuel-handling compressor failure data elicitations: one for Bruce NGS, shown in Figure 1, and one for Darlington NGS, shown in Figure 2. The elicitation respondents were very experienced system engineers, both at Bruce NGS and Darlington NGS.
Figure 1.
Figure 2.
As can be seen by comparing the figures, there is a significant difference between the two distributions. The relatively low failure rate of the DNGS machines can be traced to the short preventive rebuilding cycle at that station.
The relatively high failure rate at BNGS in the first two years leaves very few machines to fail subsequently. Obviously, an increase in the maintenance effort is indicated.
All respondents expressed an extreme reluctance to use failure percentages [12]. They felt more comfortable with the actual number of compressors, pumps, valves, etc. that failed in a particular service. There is some logic to this, as they would be speaking of the numbers that they had been using daily for more than 25 years. At the same time, it accentuates the fact that care must be taken not to impose protocols that statisticians may think are reasonable and easy to understand but that non-statistically trained experts do not.
When we changed to the actual number of installed compressors, a very different distribution was produced, as indicated by Figures 3 and 4. In addition, the BNGS and DNGS distributions became quite similar, as would be expected. Again, the slightly better performance of DNGS can be attributed to the relatively high level of preventive overhauls at Darlington.
Figure 3.
Figure 4.
The parallel here with previous work [8] is striking. In the previous work, Weibull-based tools were used to evaluate an optimal maintenance interval based on failure history and financial constraints. Costs for failure (corrective) repairs and for preventive repairs were $2,400 and $1,600, respectively. The overhaul time was 15,000 hours. The hazard function, the cumulative distribution function and the reliability function curves all showed a significant change for the worse in the 6,000 to 7,500 hour range. The subjective histograms of Figure 3 and Figure 4 show the largest percentage of failures occurring in the 5,000 to 9,000 hour interval, which correlates very well with the previous effort. The relatively close correlation of the responses of the coached experts lends credence to the universal concept of properly developed protocols being used by very experienced people.
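To make the link between an elicited histogram and a maintenance interval concrete, the following is a minimal Python sketch, not the program used in this study or the Weibull tools of [8]. It assumes the expert reports survivor counts at the end of self-chosen intervals, as in the protocol described above, and it applies the textbook age-replacement cost rate (expected cost per cycle divided by expected operating hours per cycle), which is a swapped-in illustration. The class and function names, the survivor counts and the interval ends are all hypothetical; only the $2,400 and $1,600 repair costs come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Elicitation:
    """One expert's answers. All field names here are hypothetical."""
    n_start: int                  # "x" rebuilt units imagined at time zero
    interval_ends_h: List[float]  # end of each expert-chosen interval, in hours
    survivors: List[int]          # units still operating at each interval end


def failure_histogram(e: Elicitation) -> List[int]:
    """Failures per interval, obtained by differencing the survivor counts."""
    if len(e.interval_ends_h) != len(e.survivors):
        raise ValueError("one survivor count is needed per interval")
    failures, previous = [], e.n_start
    for s in e.survivors:
        if s < 0 or s > previous:
            raise ValueError("survivor counts must be non-negative and non-increasing")
        failures.append(previous - s)
        previous = s
    return failures


def cost_rate(e: Elicitation, k: int, c_fail: float, c_prev: float) -> float:
    """Expected cost per operating hour if every unit is overhauled at the
    end of interval k (0-based) unless it fails first.

    Assumption: failures are spread evenly inside each interval, so the
    reliability curve is linear between interval ends.
    """
    times = [0.0] + list(e.interval_ends_h[: k + 1])
    rel = [1.0] + [s / e.n_start for s in e.survivors[: k + 1]]
    # Expected operating hours per cycle: area under the reliability curve
    # up to the overhaul age (trapezoid rule).
    hours = sum(
        (times[i + 1] - times[i]) * (rel[i] + rel[i + 1]) / 2
        for i in range(len(times) - 1)
    )
    # A cycle ends either in a failure repair or in a preventive overhaul.
    expected_cost = c_fail * (1.0 - rel[-1]) + c_prev * rel[-1]
    return expected_cost / hours


def best_overhaul_age(e: Elicitation, c_fail: float, c_prev: float) -> float:
    """Interval end (hours) with the lowest expected cost per operating hour."""
    k_best = min(
        range(len(e.survivors)),
        key=lambda k: cost_rate(e, k, c_fail, c_prev),
    )
    return e.interval_ends_h[k_best]


if __name__ == "__main__":
    # Illustrative numbers only; they are NOT read from Figures 1 to 4.
    example = Elicitation(
        n_start=20,
        interval_ends_h=[2500.0, 5000.0, 7500.0, 10000.0, 12500.0, 15000.0],
        survivors=[19, 17, 9, 4, 2, 0],
    )
    print("failures per interval:", failure_histogram(example))
    # Repair costs quoted in the text: $2,400 corrective, $1,600 preventive.
    print("lowest-cost overhaul age (h):", best_overhaul_age(example, 2400.0, 1600.0))
```

Working from unit counts instead of percentages mirrors the respondents' stated preference. The candidate overhaul ages are restricted to the interval ends the expert chose, so a finer answer would need either more intervals or a fitted distribution such as the Weibull.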
Conclusions
We have developed a successful preliminary protocol and found a gratifying correlation in this initiating case study with previous work done using Weibull analysis [8]. Modification of the protocols to eliminate the use of percentages produced a quick increase in accuracy when run through the compressor experts. A larger scale program would produce rapid increases in efficiencies as the protocols were fine-tuned.
A very interesting theme noted through most of the early literature was that of intuitive reasoning with respect to statistics and the need to acquire an extremely in-depth knowledge of the equipment before embarking on an analysis. Both seem to be directions that demand pursuing. A combination of these two areas via the host of long-experienced personnel produced by the post-war, baby-boom demographics will allow maintenance optimization/operational research to attain a much higher profile in large industries if the situation is managed properly. It will also provide the "nuts and bolts" preliminary demonstration studies that are needed to convince senior executives not trained in stochastic methods of the efficacy of implementing such methods. But time is of the essence: once these experts retire, their expertise will be gone forever.
In addition, a further opportunity is available during the next several years. Computer-based maintenance systems set up in nuclear plants and large industries across North America since about 1999 will at long last provide the quality of data required to implement the many statistical models that have been developed over the past 20 years. Efforts like the present endeavor, which uses expert-opinion data sampling, problem modeling and result interpretation, will be able to obtain the executive-level buy-in required for the day when the new computer-based maintenance systems' data banks become large enough to be useful.
The Road Ahead
In the future we will be expending significant effort in refining the subjective lifetime probability distribution protocols to improve the accuracy and reduce the demands on experts' time. This would, of course, reduce the overall project cost. We will eventually use the distributions in optimization formulae to optimize maintenance intervals, replacement times, switchover times, etc. Comparison of year-over-year maintenance and operating costs will quantify the extent of the financial benefit.
Maclean's magazine reports that it could cost up to $100 billion to modernize the North American electrical power grid. Social costs of that magnitude demand that OR practitioners and OR theorists work together with experienced people who have operated the actual field equipment for decades.
References
E. Kevin Doyle is a senior engineer with Bruce Power. Thierry Duchesne is an assistant professor of Statistics at the Département de Mathématiques et de Statistique, Université Laval. Danny I. Cho is an associate professor of Operations Management & Information Systems at Brock University. The authors thank assistant professor C-G Lee, of the Department of Mechanical and Industrial Engineering, University of Toronto, for his advice and discussions in this area.
OR/MS Today copyright © 2003 by the Institute for Operations Research and the Management Sciences. All rights reserved.