# 2.3.2 Representative Soil Samples

The best laboratory cannot produce good data if the sample is not representative of the soil being assessed or of the intended decision (e.g., assessment, exposure, or remedial decisions). A representative sample is one that contains a subset of all the contaminants of a population in exactly the same proportion as they are present in the target population. In other words, the contaminant concentration in a representative sample provides an accurate and precise estimate of the true contaminant concentration in the target population. The population is the "whole" from which samples are taken to measure properties of interest. Hyperlink 10 provides further discussion of the concept of "representativeness" as it has been discussed in existing USEPA and ASTM International (ASTM) guidance.

For most soil sampling scenarios, a single sample or even several discrete samples do not represent the population of interest well because soil populations are too heterogeneous. As discussed in Hyperlink 11, even testing a lawn for nutrient status requires more than one sample. If discrete samples are used, a __set__ of them is needed to capture the diversity of the population so that a mean can be estimated mathematically for the population. This is not the case for incremental samples, because each sample is composed of increments from across the entire population. A well-designed incremental sampling plan can yield a single sample for analysis that has physically captured the population diversity such that it is representative of the mean of the target population.
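The contrast described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the lognormal "soil" population, the function names, and the choice of 30 increments are hypothetical and are not taken from this guidance. It compares how widely single discrete results and single incremental results scatter around the true population mean.

```python
import random
import statistics


def make_population(n_cells=10_000, seed=0):
    """Hypothetical heterogeneous soil: one concentration (mg/kg) per grid cell.

    A lognormal distribution is an assumption chosen to mimic skewed soil data.
    """
    rng = random.Random(seed)
    return [rng.lognormvariate(3.0, 1.0) for _ in range(n_cells)]


def discrete_sample(population, rng):
    """One discrete sample: the concentration at a single random location."""
    return rng.choice(population)


def incremental_sample(population, rng, n_increments=30):
    """One incremental sample: equal-mass increments from random locations,
    physically combined, so the analyzed concentration is their average.

    n_increments=30 is an illustrative assumption, not a recommendation.
    """
    return statistics.fmean(rng.sample(population, n_increments))


def spread_of_estimates(sampler, population, n_trials=2000, seed=1):
    """Repeat the sampling many times; return (mean, std dev) of the results."""
    rng = random.Random(seed)
    results = [sampler(population, rng) for _ in range(n_trials)]
    return statistics.fmean(results), statistics.stdev(results)


if __name__ == "__main__":
    population = make_population()
    print("true mean:", round(statistics.fmean(population), 1))
    print("discrete (mean, sd):", spread_of_estimates(discrete_sample, population))
    print("incremental (mean, sd):", spread_of_estimates(incremental_sample, population))
```

In this simplified model, both approaches are centered on the true mean, but any one discrete result can fall far from it, whereas a single incremental result scatters much less. The sketch ignores sample processing, subsampling, and analytical error, all of which affect real results.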

__enter the population characteristic of interest to the decision__)."

If a sample or set of samples intended to represent the population does not properly do so, a "sampling error" is said to have occurred. This is why systematic planning must be done before developing the sampling design. Otherwise, it is impossible to know what a sample is supposed to represent and how to collect it so that it is "representative." Unfortunately, it is common for sampling designs to be developed without a clear picture of how the data will be used. Inadequate sampling designs commonly state that "representative samples" will be collected but give no indication of what the samples are supposed to represent. By contrast, a statement such as the following unambiguously identifies the population of interest: "Samples will provide estimates of the true mean concentration of arsenic within the < 2 mm soil fraction of the upper 6 inches of soil for each residential lot."

The most representative soil sample is one that captures the characteristic(s) of interest for the targeted population with the least amount of error. Procedures must be in place to manage the various types of heterogeneity and the errors they cause. Notably, USEPA's *Applicability of Superfund Data Categories to the Removal Program* (USEPA 2006a) emphasizes that documenting total measurement error, which includes sampling errors, is a feature of definitive data. For data to be definitive, either analytical or total measurement error must be determined. Traditional QA/QC programs address only analytical error and ignore sampling error. However, as discussed previously, analytical error is often only a small fraction of the total measurement error. Obtaining a representative sample is the first requirement, and determining sampling error provides a quantitative measure of representativeness. No data can be truly definitive without knowing that the sample was selected, collected, and processed properly.
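The relationship between analytical error and total measurement error can be sketched numerically. The example below assumes that sampling and analytical errors are independent, so that their variances add; the function names and the relative standard deviation (RSD) values are hypothetical illustrations, not figures from this guidance or from USEPA 2006a.

```python
import math


def sampling_rsd(total_rsd, analytical_rsd):
    """Estimate the sampling RSD from total and analytical RSDs (same units).

    Assumes independent errors, so total^2 = sampling^2 + analytical^2.
    """
    diff = total_rsd ** 2 - analytical_rsd ** 2
    if diff < 0:
        raise ValueError("analytical RSD cannot exceed total RSD under this model")
    return math.sqrt(diff)


def analytical_share(total_rsd, analytical_rsd):
    """Fraction of the total measurement variance due to analytical error."""
    if total_rsd <= 0:
        raise ValueError("total RSD must be positive")
    return analytical_rsd ** 2 / total_rsd ** 2


if __name__ == "__main__":
    # Hypothetical values: 35% RSD among field replicates (total error),
    # 5% RSD among laboratory replicate analyses (analytical error).
    print("sampling RSD (%):", sampling_rsd(35.0, 5.0))
    print("analytical share of variance:", analytical_share(35.0, 5.0))
```

With these illustrative inputs, analytical error accounts for only a few percent of the total variance, which is why a QA/QC program that checks analytical error alone says little about how representative the data are.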