4.1 Factors that Complicate Estimating the Mean Concentration

ISM sampling produces an estimate of the mean contaminant concentration in soil within a specified volume (i.e., a DU). As with any estimate derived from sampling, ISM results are subject to error, the components of which are described in Section 2.5. Understanding error introduced by sampling is squarely in the domain of statistical analysis. Rigorous statistical analyses of the extent to which various ISM sampling strategies provide accurate estimates of the mean contaminant concentration have not yet been published. Such analyses are necessary to understand how factors such as the number of increments, the number of replicates, and the distribution of contaminants across the site influence the reliability of ISM estimates of the mean contaminant concentration. An evaluation of the reliability of ISM based on statistical principles is vital to widespread acceptance of this sampling method for regulatory purposes.

*Collecting more than 3 replicates will increase certainty in the estimates of the mean and UCL and is recommended in these cases. Collecting more than 10 replicates has diminishing value.
**The number of increments depends on heterogeneity (highly variable sites require more increments) and on size (a small site may require fewer increments).

Figure 4-1. ISM decision tree.



The statistical analysis presented in this document evaluates how ISM field sampling procedures may influence the error in the estimate of the mean concentration. Statistical evaluation of ISM is a new area, and a thorough evaluation is a substantial undertaking, well beyond the scope of this document. Thus, the findings presented here should be viewed as the best available to date but incomplete in terms of addressing all of the points and questions that might be asked. It is also important to note that the analyses described in this document have focused on the extent to which ISM field samples represent the true mean of the DU, assuming that the concentration within those samples can be measured with complete accuracy. Statistical evaluation of subsampling methods in the laboratory is also important (see example in Gerlach and Nocerino 2003) but is not addressed due to time and resource constraints.

Data on chemical concentrations in environmental media present challenges for estimating the mean concentration. This problem applies to both ISM and discrete sampling. If a DU is perfectly homogeneous, meaning that the contaminant exists in the same concentration everywhere across the DU, developing a sampling strategy to accurately estimate the concentration is simple. For that case, all sampling approaches, from a single discrete sample to the most expansive set of ISM samples, would yield the same average concentration (within the limits of laboratory error), and thus any of them can provide a reliable estimate of the mean. Unfortunately, this ideal situation is never encountered in soils. Site concentrations typically exhibit some degree of heterogeneity, and the greater the heterogeneity, the more difficult it is to accurately estimate the mean concentration through sampling. As discussed in the next section, this difficulty gives rise to error in the estimation of the mean, and different sampling approaches yield different values for the mean. This error can be managed so that reliable estimates of the mean can be produced, but management requires an understanding of how the number of discrete samples, or the number of increments and replicates in ISM sampling, affects estimates of the mean. Simulation studies to develop this understanding are described later in this section.
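
The contrast between a homogeneous and a heterogeneous DU can be illustrated with a small simulation. The sketch below is not one of the simulation studies referred to in this document. It is a minimal illustration under stated assumptions: the DU is represented by 10,000 hypothetical candidate increment locations, the heterogeneous concentrations are drawn from an arbitrarily chosen lognormal distribution, increments have equal mass and are selected at random, and laboratory measurement is error-free, so each simulated ISM result is the arithmetic mean of its increments. The function name and all parameter values are illustrative only.

```python
import random
import statistics


def simulate_ism_results(population, n_increments, n_trials, rng):
    """Simulate n_trials ISM samples from one DU.

    Each simulated result is the arithmetic mean of n_increments
    concentrations drawn at random (with replacement) from `population`.
    Assumes equal-mass increments and error-free laboratory measurement.
    """
    return [
        statistics.fmean(rng.choices(population, k=n_increments))
        for _ in range(n_trials)
    ]


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical DUs, each with 10,000 candidate increment locations.
homogeneous_du = [10.0] * 10_000
# Lognormal parameters are arbitrary, chosen only to give a skewed DU.
heterogeneous_du = [rng.lognormvariate(2.0, 1.0) for _ in range(10_000)]

# Homogeneous DU: every sampling design returns the same concentration.
homogeneous_results = simulate_ism_results(homogeneous_du, 1, 200, rng)

# Heterogeneous DU: spread of the estimated mean for several designs.
# n_increments = 1 corresponds to a single discrete sample.
spread = {}
for n_increments in (1, 30, 100):
    results = simulate_ism_results(heterogeneous_du, n_increments, 2000, rng)
    spread[n_increments] = statistics.stdev(results)

print("Distinct results, homogeneous DU:", len(set(homogeneous_results)))
for n_increments, sd in spread.items():
    print(f"{n_increments:>3} increments: std. dev. of estimated mean = {sd:.2f}")
```

In this idealized setting, the homogeneous DU yields the same value regardless of the design, while for the heterogeneous DU the spread of the estimated mean shrinks as the number of increments grows. Real sites can depart from these assumptions (for example, through spatial patterns in contamination or laboratory subsampling error), so the numbers are illustrative rather than predictive.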