Paul Sorensen - Cheryl Chambers - Ken Anderson - Brent Sorensen
Neural Engineering Research and Development
Universal Synaptics Corporation
Visit us at our web page: www.usynaptics.com
For those in the avionics repair and maintenance business, the acronyms NFF (No Fault Found) and CND (Cannot Duplicate) are, unfortunately, all too familiar. After several decades of frustration, this elusive phenomenon continues to consume an enormous amount of test and diagnostic effort and is the source of considerable cost and discomfort within the multi-level avionics repair model.
There are undoubtedly many causes of NFF and all of them should be addressed. The question is: Where do you start and which solution will be the most beneficial?
Our particular efforts have focused on the statistical analysis of NFF, recognizing that if a system’s MTBF (Mean Time Between Failures) has decreased, or if a device's NFF rate has increased with age and deterioration, a physical fault is most likely present. If that fault isn’t found during conventional testing, then it probably fails only intermittently. And because the failure mode is intermittent, it in all probability cannot be detected or diagnosed at test time because of known and demonstrated limitations in the conventional measurement equipment used to perform the tests.
In this paper we will outline the problem of intermittence and its testing difficulties. More importantly, we will describe the unique equipment and process that have produced overwhelming success in intermittence / NFF resolution and MTBF extension. Working with Total Quality Systems (TQS) of Ogden, Utah, we implemented our team-developed overhaul system called IFDIS (Intermittent Fault Detection and Isolation System), which incorporates all the testing procedures and technological capabilities that are proving critical to resolving the chronic intermittent / NFF problem.
THE TESTING PROBLEM
Intermittence occurs randomly in time, place, amplitude and duration. The very nature of the failure mode suggests that the ability to detect and further isolate the intermittence root cause is based on detection SENSITIVITY and PROBABILITY rather than conventional methods concentrating on ohmic measurement accuracy. Simply put, you can’t detect an intermittent event until it occurs, and then you might have limited opportunities to catch it on the specific circuit when it does. Trying to measure fractions of a milliohm, scanning one circuit at a time, is ineffective for this particular failure mode.
Through extensive hands-on failure analysis and repair of NFF avionics and other aging electronics, our research revealed that nearly all NFF failures are caused by underlying intermittence in the circuit path interconnections, not the electrical components. The electrical components generally fail “hard” and are, by comparison, easy to troubleshoot and repair. In contrast, the interconnecting devices mostly fail intermittently. These “devices” are the connectors, crimps, splices, circuit board traces and vias, solder joints, bulkhead connectors, backplanes, switches, circuit breakers, fuse receptacles, etc. In short, they are all the electromechanical devices that mechanically tie the circuit components together.
Just like machinery, these particular devices wear gradually, or contamination builds up over a period of time. Rarely, unless damaged, will they be working perfectly one minute and the next become a repeatable, testable, hard failure. Instead, the electromechanical devices go into a long and frustrating period of low-level intermittency as their mechanical tolerances change depending on their age, wear and the current environmental conditions such as temperature, humidity, vibration and contamination.
When a particular circuit device’s electromechanical intermittence reaches sufficient magnitude, its overall electrical function will begin to malfunction, resulting in increasing intermittent-type system failures, which, when subsequently tested on the ground in a static environment, may perform sufficiently well as to avoid detection.
It is important to note here that an intermittent of sufficient amplitude and duration as to cause a system malfunction during extremes of the operating environment is likely to manifest itself at a much smaller amplitude and duration during ground-based testing, unless environmental stimulus is applied. The amount of stimulus required to expose an intermittent is inversely proportional to the sensitivity of the testing equipment used to detect the intermittent.
It’s at this point that NFF’s circular logic and confusion begin. When a malfunction is reported but is no longer evident or easily detectable with conventional test equipment, the technician has only two expedient diagnostic choices: the intermittence is either in the aircraft or it is in the box. It’s highly unlikely that the pilot was mistaken about, imagined, or fabricated the original in-flight malfunction. Consequently, line technicians are often left to take a “shotgun” approach to the repair in an attempt to address the original write-up in a timely manner. Unfortunately, when system elements are removed before the root of the intermittent has been located, the removed item may not have been the source of the problem. Suggestions that the technician simply pulled the wrong item because of inadequate training, tech orders, inexperience, etc., ignore the original reported malfunction and ensure that the defect remains undetected somewhere in the system. If it positively is not in the box, then it’s more than likely still in the aircraft, hazarding flight operations.
Since intermittence occurs primarily in the electromechanical devices, when the “most likely” opportunity is calculated, the Line Replaceable Unit (LRU) becomes the most prominent suspect. There are hundreds and in many cases thousands of potential failure points in a typical avionics box, whereas the aircraft circuits and connections leading into the box may be just two to three hundred.
THE TESTING SOLUTION
Once intermittent failure modes are clearly understood, it becomes quite evident why the vast array of conventional test equipment cannot efficiently or effectively test for, or isolate the root-cause of this elusive problem.
In a typical avionics system, there are thousands of internal and external circuit paths moving electrons through thousands more physical interconnection points, all of which are aging to some degree and will fail intermittently long before they fail permanently. It only takes one of these devices reaching this condition to render the unit unreliable. It is virtually impossible to manually probe such a system, and even if it were attempted, the probability of measuring that specific line, at just the right moment, looking for the right signal, would be infinitesimal and the effort futile.
By any reasonable scientific explanation of the problem, to catch intermittents on the ground you need phenomenal testing speed (sensitivity) and 100% bandwidth. In other words, the proper technology for the task must be able to test all of the failing system’s lines, all of the time, simultaneously and continuously. Conventional test equipment does just the opposite. Most continuity testing devices employ digital sampling and averaging techniques to achieve higher levels of parametric accuracy, and most will completely “average” a short-duration, ohmic, intermittent event right out of existence. Likewise, virtually all continuity testing devices use a scanning methodology, which measures only one circuit at a time and then only briefly. A continuity test ONLY verifies that the unit under test is wired correctly and is stable at that specific moment. These devices are limited to measurement speeds in the 100–200 millisecond range, which leaves massive holes in intermittence test coverage even when testing just a single line. Across the hundreds or thousands of interconnections found in typical avionics systems, event detection is nearly impossible.
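The averaging effect described above can be illustrated with a small calculation. This is only a sketch, not a model of any particular tester: the function name, the 0.1 ohm baseline, the 1,000 ohm glitch amplitude and the 100 ms averaging window are all assumptions chosen for illustration. It computes the time-weighted mean resistance that an idealized averaging instrument would report over one measurement window containing a single 50 ns glitch.

```python
def averaged_reading(baseline_ohms, glitch_ohms, glitch_s, window_s):
    """Time-weighted mean resistance over one averaging window.

    Assumes (hypothetically) an ideal averaging meter and a single
    rectangular glitch that lies entirely inside the window.
    """
    glitch_fraction = glitch_s / window_s
    return baseline_ohms * (1.0 - glitch_fraction) + glitch_ohms * glitch_fraction


# Illustrative values only: a 0.1 ohm contact, a 1,000 ohm glitch lasting
# 50 ns, averaged over a 100 ms measurement window.
reading = averaged_reading(0.1, 1000.0, 50e-9, 100e-3)
shift_milliohms = (reading - 0.1) * 1000.0
print(f"averaged reading: {reading:.6f} ohm, shift: {shift_milliohms:.2f} milliohm")
```

Under these assumed numbers the averaged reading moves by roughly half a milliohm, so a ten-thousand-fold jump in resistance would be nearly indistinguishable from ordinary measurement noise.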
To address all these limitations, the Intermittent Fault Detector (IFD) was developed specifically with intermittence requirements in mind. It uses highly sensitive analog detection technology on the front end and digital reporting and data processing technology on the back end, and it does it all with efficient, parallel circuitry. The IFD consistently detects intermittent circuit events on an unlimited number of circuits, simultaneously, at ohmic glitch durations as short as 50 nanoseconds.
What does this mean in the overall scope of intermittence detection probabilities? It means everything! It means success or failure, reliability or unreliability, integrity of a test or no integrity whatsoever.
While certainly not comparing ourselves to Albert Einstein, his formula, E=mc², which explained the force unleashed by the atomic bomb, is very similar to the probability gains derived from the IFD technology to catch random intermittents. To explain and demonstrate this enhanced capability in a system of simultaneous circuit paths under test we use a similar formula that we affectionately, with respect to Mr. Einstein, call:
Universal Synaptics’ Law of Intermittent Fault Detection Effectiveness, or
E = SC²
In our formula, E is the Effectiveness that the IFD technology provides in detecting the most evasive of intermittent malfunctions (those causing NFF) in a given Unit Under Test (UUT) device versus any other comparable piece of test equipment (measured in a ratio:1).
S is the single-circuit intermittence detection Speed advantage that the IFD has over the single-circuit intermittence detection speed of any comparable testing technology. For the IFD, use 50 ns* (50 nanoseconds, or 0.00000005 seconds).
Simply stated, what is the ratio of the shortest glitch detectable by any two pieces of test equipment on just a single circuit?
Example: 100 µs divided by 50 ns = 2,000:1, or 100 ms divided by 50 ns = 2,000,000:1
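The two ratios in the example above can be checked with a few lines of Python. This is a minimal sketch; the function name and the choice to express every duration in whole nanoseconds (which keeps the arithmetic exact) are our own assumptions, not part of the original formula.

```python
NS_PER_US = 1_000        # nanoseconds per microsecond
NS_PER_MS = 1_000_000    # nanoseconds per millisecond


def speed_advantage(other_min_glitch_ns, ifd_min_glitch_ns=50):
    """Ratio S of the shortest glitch each tester can detect on one circuit."""
    return other_min_glitch_ns / ifd_min_glitch_ns


s_vs_100us = speed_advantage(100 * NS_PER_US)  # 100 us tester vs. 50 ns IFD
s_vs_100ms = speed_advantage(100 * NS_PER_MS)  # 100 ms tester vs. 50 ns IFD
print(f"{s_vs_100us:,.0f}:1 and {s_vs_100ms:,.0f}:1")
```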
C is the number of circuits in the device that require testing.
Note: The number one question that arises when explaining and using the Intermittent Fault Detection Probability formula is: “Why do you square the number of circuits to be tested (C) in the comparison formula?” Since this is the key to the entire solution, let’s take a moment to fully understand it.
The reason the number of circuits under test is squared is that while other single point or scanning-type testers are measuring one circuit at a time, the IFD is simultaneously testing all of the other circuits at the same time, for the same duration. As the conventional tester moves on to test a new circuit, the IFD continues to test all the other connected circuits at the same time, for the same period and so forth and so on. Intermittence by its very definition is random in time, place, amplitude and duration. Therefore, the detection of intermittence is a condition of probabilities and the ability to detect it is measured in test-coverage.
The following is a simple explanation of the squaring effect of simultaneous and continuous testing for intermittence.
Using an easy example of a 3 by 3 matrix of circuits (9 total circuits to be tested), like a simple 9-pin cable, let’s compare. Conventional scanning test equipment, while connected to all the circuits, only measures one circuit at a time. So, while this technology might measure test-point-1 for one second, the IFD’s all-lines, all-the-time technology simultaneously and continuously tests all 9 of the circuits for that same second, for 9 total seconds of intermittence test coverage. When conventional equipment then moves (scans) to measure test-point-2, also for one second, the IFD tests all 9 circuits for another second, giving you 9 more seconds of intermittence test coverage. Conventional equipment then moves on to test-point-3 for one second, and the IFD again tests all 9 circuits for that same second. When conventional testers have finally completed testing each of the 9 circuits for just one second each (9 seconds total), the IFD has simultaneously tested all 9 circuits for 9 seconds each (9 x 9), or 81 total seconds.
It doesn’t matter if you have a 9-pin cable or a 10,000 test point avionics box: with the IFD’s simultaneous and continuous test technology, you square the number of circuits to be tested for the test coverage calculation.
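The tally in the 9-pin example can be reproduced step by step. The sketch below simply follows the bookkeeping described above; the function name and the one-second default dwell are assumptions for illustration. At each scan step the scanning tester is credited with one circuit observed for the dwell time, and the parallel tester with every circuit observed for the same dwell time.

```python
def coverage_totals(num_circuits, dwell_s=1.0):
    """Tally test coverage the way the 9-pin example does.

    Returns (scanner_total, parallel_total) in seconds of coverage.
    """
    scanner_total = 0.0
    parallel_total = 0.0
    for _ in range(num_circuits):                  # one scan step per circuit
        scanner_total += dwell_s                   # one circuit observed
        parallel_total += dwell_s * num_circuits   # all circuits observed
    return scanner_total, parallel_total


nine_pin = coverage_totals(9)
hundred_circuit = coverage_totals(100)
print(nine_pin, hundred_circuit)
```

For the 9-pin cable this gives 9 seconds for the scanner against 81 seconds for the parallel tester, matching the 9 x 9 figure in the text.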
Using the E = SC² formula of test coverage or probability gain of the IFD technology, you can begin to see why the IFD works and other technologies simply don’t.
For example, let’s consider a state-of-the-art scanning continuity tester that claims to test continuity at the rate of 3,500 test points a minute. The single-circuit intermittent discontinuity detection speed could then be computed to be approximately 17 ms (60 seconds / 3,500 ≈ 0.017 seconds).
If you were testing just one wire or circuit, then the IFD at 50ns (nanoseconds) is 340,000 times more sensitive at catching intermittence on a single circuit.
S = 0.017 divided by 0.00000005 = 340,000, meaning the IFD is 340,000 times more likely to detect NFF intermittence on a single circuit.
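The derivation of S from the scanner's advertised throughput can be sketched as follows. The names are hypothetical, and the rounding of the dwell time to 0.017 seconds mirrors the rounding used in the text; without that rounding the ratio would come out slightly higher.

```python
IFD_MIN_GLITCH_S = 50e-9  # 50 ns, the IFD figure quoted in the text


def scanner_dwell_s(points_per_minute):
    """Time a scanning tester spends on each test point, in seconds."""
    return 60.0 / points_per_minute


dwell = scanner_dwell_s(3500)        # a little over 0.017 s per test point
dwell_rounded = round(dwell, 3)      # the text rounds this to 0.017 s
s_factor = round(dwell_rounded / IFD_MIN_GLITCH_S)
print(f"dwell = {dwell_rounded} s, S = {s_factor:,}")
```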
Now, take a 100-circuit chassis or cable.
Using the formula E = SC²:
E = 340,000 x 100 x 100 = 3,400,000,000
In this example, the IFD is 3.4 billion times more sensitive than the scanning continuity tester for detecting intermittent / NFF at 50ns on a 100-circuit chassis or cable.
Next, take a 1,000 test point coverage requirement, such as the Modular Low Power Radio Frequency (MLPRF) LRU chassis in the AN/APG-68 radar used on the F-16 Fighting Falcon:
E = 340,000 x 1,000 x 1,000 = 340,000,000,000
In this example, the IFD is 340 billion times more sensitive than the scanning continuity tester for detecting intermittent / NFF at 50ns on a 1,000-circuit chassis or cable.
Similarly, take a 3,000 test point coverage requirement, such as the Radar Receiver (RR) WRA chassis in the AN/APG-73 Radar used on the F/A-18 Hornet:
E = 340,000 x 3,000 x 3,000 = 3,060,000,000,000
In this example, the IFD is 3.06 trillion times more sensitive than the scanning continuity tester for detecting intermittent / NFF at 50ns on a 3,000-circuit chassis.
In a final example, consider the 10,000 test point coverage requirement for the Programmable Signal Processor (PSP) LRU chassis in the AN/APG-68 Radar used on the F-16 Fighting Falcon:
E = 340,000 x 10,000 x 10,000 = 34,000,000,000,000
In this example, the IFD is 34 trillion times more sensitive for detecting intermittent / NFF at 50ns on a 10,000-circuit chassis.
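The four worked examples above all apply the same formula with S = 340,000, so they can be recomputed in one short loop. This is a sketch only; the function and variable names are ours, and integer arithmetic is used so the results are exact.

```python
S = 340_000  # single-circuit speed advantage from the scanning-tester example


def effectiveness(speed_advantage, num_circuits):
    """E = S * C^2, the comparison formula used in the text."""
    return speed_advantage * num_circuits ** 2


results = {c: effectiveness(S, c) for c in (100, 1_000, 3_000, 10_000)}
for circuits, e_value in results.items():
    print(f"C = {circuits:>6,}  E = {e_value:,}")
```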
These demonstrated advantages in detection probability are why IFD technology is actively reducing the intermittent / NFF problem to a 5-minute test in a typical avionics system such as those outlined above. These simple-to-compute metrics also show why IFD technology works so well for resolving the intermittent / NFF problem. This technology sees real intermittent circuit occurrences that conventional test equipment cannot see and was not designed to detect. Given this “explosion” in test coverage, it becomes clear why the IFD is the only applicable technology designed specifically for, and capable of, detecting, resolving, and gauging the overall problem and levels of intermittent / NFF.
*Intermittent fault detection specifications measured using Hewlett-Packard 8111A Pulse/Function Generator
IFDIS is THE RIGHT STUFF!
Intermittent Fault Detection and Isolation Systems (IFDIS) can best be described as a 3-pillared approach to resolving NFF / Intermittence.
These pillars are:
1) The implementation and use of serialized data tracking to identify bad actors and repeat offender problems by Aircraft and Line Replaceable Units (LRU) / Weapon Replaceable Assemblies (WRA).
2) The application of light environmental stimuli to duplicate the operational environment and rapidly expose even the "lowest amplitude & shortest duration" intermittent circuits during test time.
3) The use of precise intermittence testing technology, the IFD (Intermittent Fault Detector) developed by Universal Synaptics Corp., specifically designed to detect and isolate the underlying intermittent causes at levels of sensitivity and probability never before possible, together with form-and-fit Interface Test Adaptation (ITA) to ensure that all of the potentially failing circuit interconnects in the suspect devices are tested simultaneously and continuously while closely simulating the aircraft’s operational environment.
© Universal Synaptics Corporation 2008-2016 All Rights Reserved