# A.1.2.1 Rationale for the M-3 simulations

A major portion of this ISM technical and regulatory guidance discusses Gy’s increment sampling methodology, Gy sampling errors, bulk material heterogeneities, sample support, and sampling patterns. For any statistical data distribution, a simple random sample (discrete or composite) yields an unbiased (representative) estimate of the population mean. For bulk materials, it is the correct sample support and sampling scheme that matter in obtaining an unbiased estimate of the mean of a bulk material DU. In bulk material sampling, we are sampling bulk material (e.g., soils), not values from a data set following some known or unknown statistical distribution. Consequently, the concentration distribution does not determine whether a sample is representative, that is, whether it yields an unbiased estimate of the DU mean. However, the concentration distribution does play a role in computing a defensible UCL that provides the desired coverage of the DU mean.

Section A.5 evaluates ISM with heterogeneities, sample support, and sampling patterns taken into account. The examples discussed in Section A.5 confirm Gy’s findings, and the simulation results described there lead to the conclusion that an ISM sample is representative (yields an unbiased estimate of the DU mean), provided that increments of appropriate sample support are collected using the simple random sampling scheme.

One of the main objectives of this document is to evaluate the capability of Gy’s increment sampling methodology to produce unbiased estimates of DU means for environmental bulk material DUs. To address CH, small-scale DH, and GSE, the concept of sample support is introduced in the M-3 DU simulations. To demonstrate the importance of sample support and sampling pattern in obtaining an unbiased estimate of the population (DU) mean, ISM increments of specified sample support were collected from M-3 DUs using the three sampling patterns. The M-3 simulations thus show how an appropriate sample support can address small-scale DH and GSE, resulting in unbiased estimates of the DU mean.

Most of the simulations from bulk material sampling maps represent idealized scenarios in which the DU is relatively homogeneous with respect to bulk material particle mass. For example, maps are used in some simulations to represent a distribution of concentrations throughout a DU, but without specifically noting the scale (sample mass or volume) that each coordinate location on the map actually represents (see Section A.3). Other simulations conducted with probability distributions (instead of maps) are equivalent to sampling from a “smoothed” surface with homogeneous concentrations at a small scale (see Section A.2). These simulations implicitly assume that the “bulk material” within the DU is homogeneously distributed, with one and only one point (particle) at each sampling location, although concentration values within the DU can be highly skewed and may follow spatial patterns. Under that assumption, GSE is not present within the DU. It should be noted that Gy proposed the use of incremental sampling to address GSE. The set of simulations for Scenarios M3-A and M3-C (see Section A.5.1) attempted to introduce the concept of how differential particle mass and concentration can be addressed through ISM.

To evaluate the performance of ISM in producing unbiased estimates of the means of "bulk material" DUs, hypothetical homogeneous and heterogeneous DUs mimicking bulk material (e.g., soils) DUs were generated using the "MIS Module" of the software Scout 1.1 (USEPA n.d. "Scout 1.1"). In addition to bulk material particulates of varying sizes and shapes, a typical DU also contains uncontaminated items (e.g., trash, twigs, rocks, dead creatures) that are discarded before submission for lab analysis. Moreover, some locations within a DU are inaccessible (e.g., construction, trash, trees, bushes, boulders, ponds). All these factors also contribute to CH and DH within a DU. Because of the CH and DH present in a bulk material DU, a given location of the DU may contain anywhere from no particulates of the bulk material (e.g., buildings, bushes, and trees representing inaccessible locations) to multiple particulates (e.g., a training range used multiple times).

In M-3 bulk material DUs, locations with no points (empty spaces) are considered to represent inaccessible locations that cannot be sampled. With these practical scenarios in mind, the generation and sampling of M-3 DUs does not assume that each location (e.g., [x,y] location) of the DU consists of one and only one value (particle, point). This phenomenon is best illustrated by increment samples collected using a pogo stick (e.g., Hewitt et al. 2009). Some increments may contain trash, twigs, and other materials that will be discarded during the ISM sample preparation process (e.g., drying, sieving) before the incremental samples (ISs) are submitted for lab analysis. As a result, bulk material ISM increments may not all contain the same mass of contaminated material.

A typical IS replicate of a specified number of increments (e.g., 36, 64) is collected using a sample support of specified radius (e.g., 0.01, 0.05 units). The size of the desired sample support is determined based on the CSM and particle size distributions (also see Section A.6 for details). Using the selected increment collection location as the center, all points within the circle of the chosen radius (e.g., 0.05 units, representing the chosen tool or sample support) are included in that increment. The average (mass) of all points in that sample support constitutes an increment. An increment sampling location without any points represents an inaccessible location. When an increment lands on an empty location, the "MIS Module" of Scout 1.1 moves to the nearest neighboring accessible location to collect an increment of the specified sample support (a field crew likewise moves to the nearest sampling location when the chosen sampling location is inaccessible). Using one of the three sampling patterns (e.g., simple random sampling), 36 (or 64) increments of the same sample support (mass) are collected in this manner. Results for 3 and 5 replicates based on 36 and 64 increments are presented in Section A.5.
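The increment collection rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Scout 1.1 implementation: the function names `collect_increment` and `ism_replicate`, the representation of the DU as a list of `(x, y, value)` points in a unit square, the re-centering on the nearest point when a location is empty, and the equal weighting of increments within a replicate are all assumptions made here for illustration.

```python
import math
import random

def collect_increment(points, center, radius):
    """Average the values of all points inside a circular sample support.

    points: list of (x, y, value) tuples describing the DU (assumed layout).
    center: (x, y) location selected by the sampling pattern.
    radius: radius of the sample support, in the same units as x and y.
    """
    cx, cy = center
    inside = [v for (x, y, v) in points if math.hypot(x - cx, y - cy) <= radius]
    if not inside:
        # Empty (inaccessible) location: assume the sampler re-centers the
        # support on the nearest point, mimicking a move to the nearest
        # accessible location.
        nx, ny, _ = min(points, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
        inside = [v for (x, y, v) in points if math.hypot(x - nx, y - ny) <= radius]
    return sum(inside) / len(inside)

def ism_replicate(points, n_increments, radius, rng):
    """One replicate from simple random locations in the unit square.

    Assumes every increment contributes equally to the replicate result.
    """
    increments = [
        collect_increment(points, (rng.random(), rng.random()), radius)
        for _ in range(n_increments)
    ]
    return sum(increments) / len(increments)

if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical DU: 2,000 points with skewed (lognormal) values.
    du = [(rng.random(), rng.random(), rng.lognormvariate(0, 1)) for _ in range(2000)]
    print(ism_replicate(du, n_increments=36, radius=0.05, rng=rng))
```

A larger `radius` averages more points into each increment, which is how the sketch mirrors the role of sample support in damping small-scale heterogeneity.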

As with M-1 and M-2 simulations, the "true" mean must be defined for evaluation of statistical performance from the repeated simulated sampling of the DU (Monte Carlo simulations). For the M-1 and M-2 cases the "true" mean is defined as the average of all increments (i.e., grid nodes). For the M-3 cases the "true" mean is defined as the average of all discrete point values (all particulates with measurable concentration values). All empty spaces can be viewed as representing inaccessible locations and/or trash that will be sieved out.
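The role of the "true" mean in a Monte Carlo evaluation can be illustrated with a minimal Python sketch. The names `true_mean`, `monte_carlo_bias`, and `srs_mean`, the lognormal test population, and the use of relative bias as the performance measure are assumptions introduced here for illustration. The simple random sample of point values is only a stand-in for an ISM replicate estimator.

```python
import random
import statistics

def true_mean(point_values):
    """M-3 style "true" mean: the average of all discrete point values."""
    return statistics.fmean(point_values)

def monte_carlo_bias(point_values, estimator, n_trials, seed=0):
    """Repeat the sampling n_trials times and compare with the true mean.

    estimator: callable(point_values, rng) returning one estimate of the mean.
    Returns (true mean, mean of estimates, relative bias).
    """
    rng = random.Random(seed)
    mu = true_mean(point_values)
    estimates = [estimator(point_values, rng) for _ in range(n_trials)]
    mean_est = statistics.fmean(estimates)
    return mu, mean_est, (mean_est - mu) / mu

def srs_mean(point_values, rng, n=36):
    """Stand-in estimator: mean of a simple random sample of n point values."""
    return statistics.fmean(rng.sample(point_values, n))

if __name__ == "__main__":
    gen = random.Random(42)
    # Hypothetical skewed population of 5,000 point values.
    population = [gen.lognormvariate(0, 1) for _ in range(5000)]
    mu, mean_est, rel_bias = monte_carlo_bias(population, srs_mean, n_trials=2000)
    print(f"true mean={mu:.3f}  mean of estimates={mean_est:.3f}  relative bias={rel_bias:+.2%}")
```

Any sampling scheme can be evaluated the same way by passing a different `estimator`. A relative bias that stays near zero as the number of trials grows is what an unbiased scheme would be expected to show.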