# Hyperlink 13. Measuring and Interpreting Sampling Variability

Measuring sampling variability (error) is not difficult. The necessary measurements are made as part of routine QC for environmental contaminant analysis. For this explanation, suppose the analytical portion of the sampling and analysis process has negligible error (i.e., is very precise). This is rarely true, but it is stipulated here to simplify the illustration. In real life, the variability on the analytical side can be determined from laboratory QC, so the analytical and sampling errors can be separated. Here we assume that the measurement variation comes only from subsampling error caused by within-sample heterogeneity.

Further suppose that the initial result of a laboratory duplicate pair was 12.3 mg/kg and the duplicate analytical subsample gave 9.6 mg/kg. One way to measure this variation is the relative percent difference (RPD), calculated here as the absolute difference between the two results divided by their average and expressed as a percentage. The difference is 2.7 mg/kg and the average is 10.95 mg/kg, so the RPD between 12.3 and 9.6 is about 25%. The RPD is the most common measure of precision when duplicates are involved. How do we know what an acceptable RPD is? Often it is set arbitrarily at the beginning of a project. But there is another way to look at it.
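The RPD arithmetic described above can be sketched in a few lines of Python. This is only an illustration; the function name `relative_percent_difference` is a label chosen here, not part of any standard library or laboratory software.

```python
def relative_percent_difference(result_a: float, result_b: float) -> float:
    """Return the RPD (in percent) of a duplicate pair.

    RPD = |a - b| / mean(a, b) * 100, as defined in the text.
    """
    mean = (result_a + result_b) / 2.0
    if mean == 0:
        raise ValueError("RPD is undefined when the mean of the pair is zero")
    return abs(result_a - result_b) / mean * 100.0


# Duplicate pair from the example, in mg/kg.
rpd = relative_percent_difference(12.3, 9.6)
print(f"RPD = {rpd:.1f}%")  # prints "RPD = 24.7%", which rounds to 25%
```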

An investigator may ask whether the level of data variability indicated by an RPD of 25% (the value in this example) could cause a decision error. As described in other sections, whether a decision error is likely depends on (a) what the decision threshold is, (b) what the true mean is, (c) how much variability is present, and (d) the strategy for making decisions (i.e., whether decisions are based on a single sample result or on multiple results using the calculated mean or an upper confidence limit [UCL] on the mean). Suppose for this example that the decision threshold is 100 mg/kg and the true mean is 12.3 mg/kg. If the variability in the subsampling process corresponds to an RPD of 25%, repeated analyses of subsamples will produce results that vary mostly between 9.6 and 15.8 mg/kg, the two values whose RPD relative to 12.3 mg/kg is 25% (although some individual results could fall outside these boundaries). Can that level of data variability cause a decision error if a decision is made on the result of a single analysis?
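The 9.6 to 15.8 mg/kg range can be reproduced by asking which two values have an RPD of 25% when paired with 12.3 mg/kg. The sketch below assumes that reading of the range; the helper name `rpd_bounds` is hypothetical. Solving |x - m| / ((x + m) / 2) = r for x gives x = m(2 + r)/(2 - r) on the high side and x = m(2 - r)/(2 + r) on the low side. Note that this band is not the same as simply taking 12.3 plus or minus 25%.

```python
from typing import Tuple


def rpd_bounds(reference: float, rpd_percent: float) -> Tuple[float, float]:
    """Return the (lower, upper) values whose RPD against `reference`
    equals `rpd_percent`.

    Assumption: the range quoted in the text is defined this way.
    """
    r = rpd_percent / 100.0
    if not 0.0 <= r < 2.0:
        raise ValueError("RPD must be at least 0% and below 200%")
    lower = reference * (2.0 - r) / (2.0 + r)
    upper = reference * (2.0 + r) / (2.0 - r)
    return lower, upper


# True mean of 12.3 mg/kg and an RPD of 25%, as in the example.
lower, upper = rpd_bounds(12.3, 25.0)
print(f"{lower:.1f} to {upper:.1f} mg/kg")  # prints "9.6 to 15.8 mg/kg"
```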

If the RPD is 25% around a true mean of 12.3 mg/kg and the decision threshold is 100 mg/kg, a decision error is unlikely (although not impossible) because there is little chance that any single result could be higher than 100 mg/kg. On the other hand, if the decision threshold is 15 mg/kg (rather than 100 mg/kg), the true mean is 12.3 mg/kg, and the RPD is 25%, it is quite possible for any single result to exceed the threshold and cause a decision error. If the consequences of a decision error were dire, an investigator might guard against the error by making multiple analyses and calculating the average of those analytical results. Alternatively, the investigator could reduce within-sample heterogeneity, and thus the RPD, by reducing the sample particle size (e.g., milling the sample) so that any single subsample analysis is more likely to give a result close to the true mean of the sample. For a threshold of 15 mg/kg and a true mean of 12.3 mg/kg, the RPD needs to be reduced (i.e., the precision needs to be improved) to about 20% or better, because the RPD between 12.3 and 15 mg/kg is approximately 20%.
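The comparison of the two thresholds can be sketched as follows. The code assumes the simplified picture used in this example, in which single results mostly stay within the band defined by the RPD around the true mean. It is a screening calculation, not a probability of decision error, and the variable names are illustrative only.

```python
def relative_percent_difference(result_a: float, result_b: float) -> float:
    """Return the RPD (in percent): |a - b| / mean(a, b) * 100."""
    mean = (result_a + result_b) / 2.0
    if mean == 0:
        raise ValueError("RPD is undefined when the mean of the pair is zero")
    return abs(result_a - result_b) / mean * 100.0


true_mean = 12.3      # mg/kg, from the example
observed_rpd = 25.0   # percent, from the duplicate pair

# Largest RPD for which a result at the edge of the band stays below
# each decision threshold (mg/kg).
max_rpd = {
    threshold: relative_percent_difference(threshold, true_mean)
    for threshold in (100.0, 15.0)
}

for threshold, limit in max_rpd.items():
    at_risk = observed_rpd >= limit
    print(f"threshold {threshold:g} mg/kg: RPD limit {limit:.1f}%, "
          f"single-result decision at risk: {at_risk}")
# prints:
# threshold 100 mg/kg: RPD limit 156.2%, single-result decision at risk: False
# threshold 15 mg/kg: RPD limit 19.8%, single-result decision at risk: True
```

The 19.8% limit for the 15 mg/kg threshold is the source of the "20% or better" target stated above.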