IEEE Reliability Society Newsletter     Vol. 61, No. 2, May 2015

Table of Contents

Front page:

President's Message

From the Editor

 

Society News:

Call for AdCom Candidates!

RS Tutorial Update

 

Members & Chapters

Changsha Chapter

Taiwan Chapter Outreach

Dallas Chapter

Boston Chapter

 

Meetings & Conferences

PHM Conference

ICCE Announcement

IRPS Report

 

Letters in Reliability

Nihal Sinnadurai Shares Lifetime Achievement Award Ideas

Sam Keene's "Lessons Learned in Testing Components"

 


Lessons Learned in Testing Components

Dr. Samuel J. Keene, Jr.

My purpose is to share some lessons learned in testing and evaluating parts to qualify their operational reliability for a given application.  These qualification-testing insights are augmented here by companion lessons gleaned from failure analysis.

1. First, we need to establish the repeatability and reproducibility of our device measurement system.  This seems simple, but it is not always a simple matter; it is a vital step, one that Six Sigma processes key on under the label Gauge R&R.  “Gauge R&R measures the amount of variability induced in measurements by the measurement system itself, and compares it to the total variability observed to determine the viability of the measurement system”:

http://en.wikipedia.org/wiki/ANOVA_gauge_R%26R .

This is explained further in section 3.

On one occasion, we were experiencing a high number of failures in a motor drive circuit.  A team of concerned engineers and technicians investigated the problem, peering over the oscilloscope screen to observe the voltage pulse from gate to cathode on an operating triac driver.  We observed 350-volt spikes across the triac’s gate to cathode during regular operation.  These voltage spikes were at insanely high levels for a semiconductor device, leaving us puzzled as to why we weren’t “smoking” the triacs.  No one could explain the phenomenon.  Eventually, we set both test probes on the same point, the cathode terminal, and still observed the phantom 350-volt spike.  With both probes on one node, there should be no voltage difference at all.  Conclusion: the high-voltage spike was a measurement aberration, an artifact of the differential probes at high frequency; the differential probe measurement system had not been properly calibrated at the operating frequency we were observing.  The first corrective step was to calibrate our differential test probe at the high frequency where we were examining the triac driver’s performance.

2. Measurement data can be classified into three categories, each with increasing informational power:

  • Binary data: the lowest informational power (e.g., pass/fail);
  • Ordinal data: rankings (e.g., A, B, C, D, E); and
  • Ratio data: numerical values (e.g., 3.1228) that can always be compared to a true zero.  This data type has the highest informational value and is, by far, the preferred type of test measurement data.

We might test and qualify components for mechanical strength, e.g., to meet a 10-pound tensile strength requirement.  Suppose all components under test pass, i.e., none fail.  The binary data then show only that every component meets the tensile requirement.  Alternatively, suppose we test all units to failure.  The latter case provides ratio data and is thus more informative: it allows estimation of the design safety factor, or margin of strength, for the components under test.  The ratio data also provide further confirmation that the tester is working properly and is actually stressing the tensile strength of the components.  One can also calculate the statistical probability, or fraction of parts in the production population, expected to fall below the 10-pound limit, i.e., estimate a failure rate of production parts with respect to the tensile strength requirement, as sketched below.
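To make the ratio-data advantage concrete, here is a minimal Python sketch; the breaking-strength values are hypothetical, and a normal strength distribution is assumed purely for illustration:

    # Estimate the below-requirement fraction from test-to-failure (ratio) data.
    # Strength values are hypothetical; normality is assumed for illustration.
    import numpy as np
    from scipy import stats

    requirement_lb = 10.0
    # Hypothetical breaking strengths (pounds) from testing all units to failure
    strengths = np.array([14.2, 13.8, 15.1, 14.7, 13.5, 14.9, 15.4, 14.0, 13.9, 14.6])

    mean, std = strengths.mean(), strengths.std(ddof=1)
    margin_sigma = (mean - requirement_lb) / std        # strength margin in sigmas
    frac_below = stats.norm.cdf(requirement_lb, loc=mean, scale=std)

    print(f"mean strength = {mean:.2f} lb, margin = {margin_sigma:.1f} sigma")
    print(f"estimated fraction below {requirement_lb} lb: {frac_below:.2e}")

Binary pass/fail data from the same test would support none of these estimates.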

3. In Six Sigma, we are taught to conduct a Gage Repeatability and Reproducibility (Gage R&R) study to assure that our measurement system is accurate and adequate.  The individual performing the test measures a sample of the product to be tested, then retests the same samples, this time in a random order, to see how closely the measurement values can be repeated.  Randomizing the test order removes any parasitic measurement-drift component.

Then a second operator measures the same samples, again in a new random order vis-à-vis the first set of measurements, and this data is compared with the first tester’s data.  This establishes the measurement variability and reproducibility.  Six Sigma sets bounds on the variability allowed for an acceptable gage R&R; see http://www.minitab.com/en-us/Support/Tutorials/Fundamentals-of-Gage-R-R/.  A variance-component sketch follows.
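As a minimal sketch of the variance-component arithmetic behind a crossed gage R&R study (the readings and the 5-part, 2-operator, 2-replicate layout below are hypothetical), in Python:

    # Crossed Gauge R&R via two-way ANOVA variance components.
    # measurements[p, o, r] = part p, operator o, replicate r (hypothetical data)
    import numpy as np

    measurements = np.array([
        [[9.8, 9.9], [9.7, 9.8]],
        [[10.1, 10.2], [10.0, 10.1]],
        [[10.4, 10.3], [10.5, 10.4]],
        [[9.6, 9.5], [9.6, 9.7]],
        [[10.0, 10.1], [9.9, 10.0]],
    ])
    p, o, r = measurements.shape
    grand = measurements.mean()
    part_means = measurements.mean(axis=(1, 2))
    op_means = measurements.mean(axis=(0, 2))
    cell_means = measurements.mean(axis=2)

    ss_part = o * r * np.sum((part_means - grand) ** 2)
    ss_op = p * r * np.sum((op_means - grand) ** 2)
    ss_cell = r * np.sum((cell_means - grand) ** 2)
    ss_inter = ss_cell - ss_part - ss_op
    ss_repeat = np.sum((measurements - grand) ** 2) - ss_cell  # within-cell

    ms_part = ss_part / (p - 1)
    ms_op = ss_op / (o - 1)
    ms_inter = ss_inter / ((p - 1) * (o - 1))
    ms_repeat = ss_repeat / (p * o * (r - 1))

    var_repeat = ms_repeat                                   # repeatability
    var_reprod = max((ms_op - ms_inter) / (p * r), 0.0) \
                 + max((ms_inter - ms_repeat) / r, 0.0)      # reproducibility
    var_part = max((ms_part - ms_inter) / (o * r), 0.0)
    var_total = var_repeat + var_reprod + var_part

    pct_grr = 100 * np.sqrt((var_repeat + var_reprod) / var_total)
    print(f"%GRR (of total study variation): {pct_grr:.1f}%")

A common rule of thumb (see the Minitab tutorial above) is that a %GRR under roughly 10% indicates an acceptable measurement system.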

4. Set aside control samples that can be tested and retested at any time before measurements are taken.  Preferably, these samples’ values will span the range of the variable of interest.  This assures that measurement repeatability is being maintained over the sample measurement range and provides continuing, real-time confidence in your measurement system.  You can then demonstrably rely on the data you are observing.  A simple check is sketched below.
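One simple form this check can take, sketched here with hypothetical baseline readings, is a 3-sigma band test on a control sample before each measurement session:

    # Verify the measurement system against an archived control sample.
    # Baseline readings and limits are hypothetical.
    import numpy as np

    baseline = np.array([10.02, 9.98, 10.01, 9.97, 10.03, 10.00])
    mean, std = baseline.mean(), baseline.std(ddof=1)
    lower, upper = mean - 3 * std, mean + 3 * std   # simple 3-sigma control limits

    def measurement_system_ok(control_reading: float) -> bool:
        """Flag control readings that drift outside the 3-sigma band."""
        return lower <= control_reading <= upper

    print(measurement_system_ok(10.01))   # True: gauge agrees with its baseline
    print(measurement_system_ok(10.35))   # False: investigate before trusting data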

5. Brainstorm to identify the possible variables that could impact the performance of the unit under test, i.e., ask a lot of questions about the device and what could affect its performance.  See http://rs.ieee.org/images/files/newsletters/2014/2_2014/Letters_in_Reliability/onquestions.html

For example, we initially identified 25 variables potentially impacting the life, performance, and reliability of the first HeNe lasers we were using in an early industrial application of lasers.  The variables included cathode material type (Al 2024 vs. Al 6061), brushed or etched cathode finish, tube fill pressure, etc.  A cross-functional development team representing the laser supplier, applications, test, and research personnel then reduced the key process input variables (KPIVs) to be tested to 13.  The first test run was a “screening test” to reduce the number of variables down to the “key variables.”  Further Design of Experiments (DOE) tests were designed and run considering only those variables found significant in the initial test, and subsequent DOE testing modeled the laser reliability as a function of the reduced set of KPIVs.

The DOE test strategy was used to assess the influence of the individual KPIVs as well as their potential interactions.  Traditional one-factor-at-a-time (OFAT) testing is inefficient and potentially ineffective, since it does not reveal possible factor interactions.  This test effort to develop and qualify the laser for printer and scanner applications took over one year, from first testing prototype lasers through early production samples.  The initial laser samples were chaotic in their performance and life; this application was among the first industrial uses of HeNe lasers.  More on DOE: http://asq.org/learn-about-quality/data-collection-analysis-tools/overview/design-of-experiments-tutorial.html.  The DOE test was bold, and expensive, but it took the HeNe laser from chaos to never causing a significant field problem.  A minimal screening-design sketch follows.
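As an illustration of screening for key variables, here is a minimal Python sketch; the three coded factors and the life values are hypothetical, and a small 2^3 full factorial stands in for the larger fractional designs such a program would actually use:

    # Two-level screening DOE: estimate main effects and one interaction
    # by contrast averages. Factors and responses are hypothetical.
    import itertools
    import numpy as np

    factors = ["cathode_alloy", "cathode_finish", "fill_pressure"]  # coded -1/+1
    design = np.array(list(itertools.product([-1, 1], repeat=3)))   # 8 runs

    # Hypothetical laser life (hours) for the 8 runs, in design-row order
    life = np.array([1210, 1225, 1190, 1205, 1510, 1540, 1495, 1530])

    for i, name in enumerate(factors):
        # Main effect: mean response at the +1 level minus mean at the -1 level
        effect = life[design[:, i] == 1].mean() - life[design[:, i] == -1].mean()
        print(f"{name:15s} main effect: {effect:+.1f} h")

    # A two-factor interaction: contrast on the product of two coded columns
    ap = design[:, 0] * design[:, 2]
    print(f"alloy x pressure interaction: "
          f"{life[ap == 1].mean() - life[ap == -1].mean():+.1f} h")

Factors whose effects stand well clear of the noise are retained as the key variables for the follow-on DOE.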

The basic rule in DOE is to block the variables you can control and then randomize the remaining ones.  In the laser case, all brushed-cathode lasers would undergo the same test stressor level at the same time; this heightens the contrast.  But if you are commingling the test samples in the same oven, their locations in the oven should be assigned in a random manner.  This helps neutralize the effect of any temperature gradients on the KPOV contrast.  A DOE test design and analysis example is shown in the ASQ tutorial linked above, and a blocking-and-randomization sketch follows.
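A toy Python sketch of that rule, with hypothetical sample IDs and an assumed 8-position oven: blocks stay intact, while shelf positions within the shared oven are randomized:

    # Block on cathode finish; randomize oven shelf positions within the oven.
    # Sample IDs and the 8-position oven layout are hypothetical.
    import random

    random.seed(42)                          # reproducible test plan

    blocks = {
        "brushed": ["B1", "B2", "B3", "B4"],
        "etched":  ["E1", "E2", "E3", "E4"],
    }
    positions = list(range(1, 9))            # 8 shelf positions in one oven
    random.shuffle(positions)                # randomize placement

    slot = iter(positions)
    assignment = {s: next(slot) for finish in blocks for s in blocks[finish]}
    for sample, position in sorted(assignment.items()):
        print(f"sample {sample} -> oven position {position}")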

6. Archive test samples.  The time and effort spent testing these samples has added a great deal of knowledge value to them.  There is a temptation, once we deem the evaluation finished, to discard them and move on.  But parts may subsequently exhibit failure problems or performance changes in production, and it is then most useful to have the “gold standard” units that were used to initially qualify the components: they help answer what has changed.  For example, there was an early concern about the warm-up time a HeNe laser needed before it performed at its best in our scanner application.  Our test lab developed a heater jacket for the laser that made a ten-fold improvement in laser turn-on time, and we filed a patent application for this concept.  The patent application was rejected.  A year later, our technology office had renewed interest in the concept and asked to see our prototype.  We had discarded it, and worse, we could not repeat the results.  So we never knew what had changed.
