I will present several reasons why we might pull test equipment. I selected these reasons because they illustrate several important concepts. These are not the only possible reasons. You should easily be able to think of others. I encourage you to do so, and then examine them with a very critical eye.
The key takeaway is that you need to analyze the results differently if your reason changes.
This is the simplest case. Find a way to pull something until it breaks, wear proper personal protective equipment (PPE), put up some guards to avoid getting injured, and pull to destruction. Conclusion: Wow, that was cool! No harm, no foul, but don’t draw any conclusions that mean anything. Let’s go break something else!
This is another very simple case. Do what the standard says. Maybe it makes sense, maybe it doesn’t; that doesn’t matter. Pay your money and do what the standard says, the way it says to do it. If you meet all the acceptance criteria, you can say that your device meets the standard.
The manufacturer does this type of testing. It is complicated and a full-time job, so they hire an expert to oversee the testing and to interpret the results.
The reason for statistical quality control is not to show that something is good, but to show that manufacturing produces a consistent product. Changes are bad. Treat results saying that something is getting stronger the same way you treat results saying that something is getting weaker: investigate the inconsistency and take appropriate corrective action.
Showing something is good is an entirely different activity.
Maybe, for some unfathomable-to-me reason, you just want to know the breaking strength. Test a number of samples and calculate the average. You won’t know how much you can trust this number unless you make some assumptions. The most common assumption is that the breaking strengths have a Normal distribution (they definitely do not[1]). Then do one or both of the following:
The Interpreting Strength Tests page discusses these in more detail.
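Footnotes [2] and [3] point to the Student-t and χ² distributions. As a minimal sketch of how they are typically used under the Normal assumption, the following computes 95% confidence intervals for the mean and for the standard deviation. The ten breaking strengths are made-up numbers for illustration, not test data, and the critical values are standard table entries that are only valid for 10 samples (9 degrees of freedom).

```python
import math
import statistics

# Hypothetical breaking strengths in kN (made-up numbers, not test data).
strengths_kn = [22.1, 23.4, 21.8, 22.9, 23.0, 22.4, 21.5, 23.7, 22.6, 22.2]

# Two-sided 95% critical values from standard tables.
# Valid ONLY for n = 10 samples (9 degrees of freedom).
T_975_DF9 = 2.262      # Student-t, upper 97.5% point
CHI2_025_DF9 = 2.700   # chi-squared, lower 2.5% point
CHI2_975_DF9 = 19.023  # chi-squared, upper 97.5% point


def mean_interval(samples, t_crit):
    """Confidence interval for the mean, assuming Normal data."""
    n = len(samples)
    mean = statistics.mean(samples)
    half_width = t_crit * statistics.stdev(samples) / math.sqrt(n)
    return mean - half_width, mean + half_width


def stdev_interval(samples, chi2_low, chi2_high):
    """Confidence interval for the standard deviation, assuming Normal data."""
    df = len(samples) - 1  # degrees of freedom is n - 1, not n
    sum_sq = df * statistics.variance(samples)
    return math.sqrt(sum_sq / chi2_high), math.sqrt(sum_sq / chi2_low)


mean_lo, mean_hi = mean_interval(strengths_kn, T_975_DF9)
sd_lo, sd_hi = stdev_interval(strengths_kn, CHI2_025_DF9, CHI2_975_DF9)
print(f"mean: {mean_lo:.2f} to {mean_hi:.2f} kN")
print(f"standard deviation: {sd_lo:.2f} to {sd_hi:.2f} kN")
```

Note how wide the interval for the standard deviation is with only ten samples. Both intervals are only as trustworthy as the Normal assumption behind them.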
Warning: Do not use the mean and standard deviation to estimate the probability of a breaking strength X if X lies outside the range observed in your test results[4].
This is an interesting scenario. Maybe you are like me and care more about how likely something is to break under your weight than about the average breaking strength. For example, I want to know that breaking under my weight has less than a one-in-a-million chance. Showing this is not practical. Let’s see why.
Some people may think that we can just subtract 6σ from the average breaking strength. This would work if we knew that breaking strength followed a Normal distribution[5] all the way out past the 6σ limit. We don’t. Can we test whether the distribution is Normal? Sort of, but the usual statistical tests assume the distribution is Normal unless proven otherwise. We want the opposite.
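To see what the Normal assumption would promise if it held that far out, here is a small sketch using Python's standard library. It shows only what the assumption implies, not that the assumption is valid; the function names are mine.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def tail_probability(k_sigma):
    """Chance of a result more than k_sigma below the mean, IF Normal."""
    return standard_normal.cdf(-k_sigma)


def sigmas_for_probability(p):
    """How many sigma below the mean gives lower-tail probability p, IF Normal."""
    return -standard_normal.inv_cdf(p)


print(f"P(below mean - 6 sigma) = {tail_probability(6):.2e}")
print(f"one-in-a-million sits at {sigmas_for_probability(1e-6):.2f} sigma")
```

Under a true Normal distribution, 6σ below the mean corresponds to roughly one in a billion, and one in a million sits a little under 5σ. The arithmetic is easy. The problem is that nothing in a handful of test results tells us the real distribution behaves this way in its far tail.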
The best (and really only) way to proceed is to break enough samples to determine where the cumulative distribution crosses one-in-a-million. In other words, you need to break a few million samples and estimate the crossing point. I cannot afford to do that, and you don't have the time.
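A rough way to see where "a few million" comes from, as a sketch under a simplifying assumption: suppose each sample is simply loaded to my weight and every one survives. If the true failure probability is p, the chance of seeing zero failures in n tests is (1 - p)^n, so we need enough tests to make that chance small. The function name and the 95% confidence level are my choices for illustration.

```python
import math


def samples_needed(max_failure_prob, confidence=0.95):
    """Failure-free tests needed to claim the failure probability is below
    max_failure_prob at the given confidence level.

    Solves (1 - p)^n <= 1 - confidence for the smallest whole number n.
    """
    n = math.log(1.0 - confidence) / math.log1p(-max_failure_prob)
    return math.ceil(n)


print(f"{samples_needed(1e-6):,} tests with zero failures")
```

For one in a million at 95% confidence this comes out at about three million tests, and a single failure along the way means starting the argument over with an even larger number.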
You can’t always get what you want.
[1] Assuming a Normal distribution means assuming that some samples will have negative breaking strengths and that some could be arbitrarily strong. Both are physically untenable. Despite this, most results will be near the mean, so the assumption is useful.
[2] Remember to thank the Guinness Brewery for giving us the Student-t distribution needed to analyze this case.
[3] Use χ² statistics, but observe the subtleties of the number of degrees of freedom needed in the calculation.
[4] You have no data outside that range to support any conclusions about anything outside that range. You can only draw conclusions there if you make assumptions. If your conclusion is only based on assumptions and not supported by data, all that you have done is assume the conclusion.
[5] More precisely, we need to know that the cumulative distribution of breaking strengths and the cumulative Normal distribution agree at this point. This is extremely unlikely.