Hyperlink 1. Estimates of the Mean in Risk Assessment

Often, a small set of discrete samples is used to represent a volume of soil and to determine the mean concentration of that volume. However, because short-scale and microscale heterogeneities are typically not addressed, the variability in a discrete soil data set is often high, causing considerable uncertainty about whether the calculated mean is close to the true mean. To accommodate the uncertainty in estimating the mean from a small group of samples, an upper confidence limit (UCL) on the mean is used to add conservatism to the estimate. The greater the variability in the data set, the greater the distance between the calculated mean and the UCL. USEPA risk assessment guidance (USEPA 1992) recommends the 95% UCL because it accounts for uncertainties due to limited sampling data and provides reasonable confidence that site means will not be underestimated.
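For illustration only, one common form of the UCL is the one-sided Student's t limit, which assumes approximately normally distributed data. It is shown here as a sketch of a single method, not as the calculation prescribed by the guidance cited above; other methods may be more appropriate for skewed data. The symbols are introduced for this example: \bar{x} is the calculated sample mean, s is the sample standard deviation, n is the number of samples, and t_{0.95, n-1} is the 95th percentile of the t distribution with n-1 degrees of freedom.

```latex
\mathrm{UCL}_{95} = \bar{x} + t_{0.95,\,n-1}\,\frac{s}{\sqrt{n}}
```

Under this form, the amount added to the calculated mean is proportional to s, which is why a more variable data set places the UCL farther above the calculated mean.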

Risk assessment is concerned with the mean concentration over an exposure unit. Risk Assessment Guidance for Superfund (USEPA 1989b) states that the concentration term in the exposure equation is the mean concentration contacted at the exposure point or points over the exposure period. This point is reiterated in Calculating Upper Confidence Limits for Exposure Point Concentrations at Hazardous Waste Sites (USEPA 2002a), which states that the concentration, commonly termed the “exposure point concentration” (EPC), is to be a conservative estimate of the mean chemical concentration in an environmental medium.

To adequately determine a mean concentration, more discrete samples are needed than are commonly collected. Because the 95% UCL on the mean is used to accommodate the uncertainty in estimating the mean from a small group of samples, too few sampling points produce a wide interval between the calculated mean and the UCL. USEPA guidance states that data sets with fewer than 10 samples per exposure area provide poor estimates of the mean concentration (i.e., there is a large difference between the calculated sample mean and the 95% UCL), while data sets with 20–30 samples provide fairly consistent estimates of the mean (i.e., the 95% UCL is closer to the calculated sample mean). In general, the distance between the UCL and the calculated mean decreases as more samples are included in the calculation (USEPA 1992).
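The effect of sample size can be sketched numerically. The example below assumes a one-sided 95% UCL based on Student's t distribution (one common method, suited to roughly normal data, and not necessarily the method a given site would use) and a hypothetical sample standard deviation of 50 mg/kg held fixed across sample sizes. The function name `ucl_gap` and the constant names are illustrative. The critical values are standard one-sided 95% t-table entries for n - 1 degrees of freedom.

```python
import math

# One-sided 95% Student's t critical values, keyed by sample size n
# (degrees of freedom = n - 1), taken from a standard t table.
T_CRIT_95 = {5: 2.132, 10: 1.833, 20: 1.729, 30: 1.699}

# Hypothetical sample standard deviation (mg/kg), held fixed for illustration.
SAMPLE_SD = 50.0


def ucl_gap(sample_sd, n):
    """Return the distance between the sample mean and the t-based 95% UCL."""
    return T_CRIT_95[n] * sample_sd / math.sqrt(n)


# Compute the gap for each tabulated sample size, smallest n first.
gaps = {n: ucl_gap(SAMPLE_SD, n) for n in sorted(T_CRIT_95)}

for n, gap in gaps.items():
    print(f"n = {n:2d}: UCL exceeds the calculated mean by {gap:.1f} mg/kg")
```

Under these assumptions, the gap shrinks as n grows for two reasons: the critical value decreases, and the standard deviation is divided by a larger square root of n. In practice the sample standard deviation would also change from one data set to the next, so real results will not follow this pattern exactly.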

As discussed throughout this document, ISM offers denser sampling coverage. ISM meets risk assessment goals for reliable estimation of the true DU mean, usually with more precision and less bias than typical discrete sample data sets, which have far fewer sampling points.