# A.6.6 Sampling Patterns

Schematics of the sampling patterns considered in the simulation experiments are shown below. Figure A-21 shows the serpentine pattern, Figure A-22 shows the systematic random within grid pattern with 16 increments (one from each grid cell), and Figure A-23 shows the simple random sampling pattern with 16 random increments.

Figure A-21. Serpentine pattern applied to one quadrant of a DU.

Figure A-22. Systematic random grid pattern with a random start.

Figure A-23. Simple random sampling pattern.
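As a rough illustration of how two of these patterns differ, the sketch below generates increment locations for a rectangular DU. It is not the code used in the simulation experiments. The function names, the unit-square DU, and the reading of "random start" as a single random offset repeated in every grid cell are assumptions made for this example. The serpentine pattern is omitted because its path geometry is not specified in the text.

```python
import random


def simple_random_increments(n, width=1.0, height=1.0, rng=None):
    """Return n increment locations drawn uniformly over the whole DU."""
    rng = rng or random.Random()
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]


def systematic_random_within_grid(rows, cols, width=1.0, height=1.0, rng=None):
    """Return one increment per grid cell.

    Assumption: a single random offset (the "random start") is drawn once
    and reused in every cell, so the increments form a regular lattice.
    """
    rng = rng or random.Random()
    cell_w, cell_h = width / cols, height / rows
    dx, dy = rng.uniform(0, cell_w), rng.uniform(0, cell_h)
    return [(c * cell_w + dx, r * cell_h + dy)
            for r in range(rows) for c in range(cols)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is repeatable
    print(simple_random_increments(16, rng=rng)[:3])
    print(systematic_random_within_grid(4, 4, rng=rng)[:3])
```

With 16 increments, the simple random pattern can leave parts of the DU unsampled or cluster increments, whereas the 4 x 4 grid version places exactly one increment in each cell.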

The simple random sampling pattern yields an estimate that is consistent. As a statistical term, consistency implies that performance measures computed from simple random samples (e.g., MSE, RMSE, and SD [FE]) decrease as the sample size (here, the number of replicates) increases. The coverage provided by a *t*-95% UCL based on simple random sampling decreases as the sample size increases, and the coverage provided by a Chebyshev 95% UCL increases as the sample size increases. However, because the mean and standard deviation based on simple random sampling are consistent estimates, the values of both the *t*-UCL and the Chebyshev UCL decrease as the sample size (number of replicates for ISM) increases.
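The consistency property can be illustrated with a small Monte Carlo sketch. The setup is hypothetical: each replicate result is modeled as an independent draw from a lognormal distribution, and the parameters `mu`, `sigma`, `trials`, and `seed` are arbitrary choices for this example, not values from the simulation experiments.

```python
import math
import random
import statistics


def rmse_of_mean(r, trials=2000, mu=0.0, sigma=1.0, seed=1):
    """Monte Carlo RMSE of the sample mean of r simulated replicate results.

    Assumption: replicate results are independent lognormal draws, so the
    true mean is exp(mu + sigma**2 / 2).
    """
    rng = random.Random(seed)
    true_mean = math.exp(mu + sigma ** 2 / 2)
    sq_errors = []
    for _ in range(trials):
        sample = [rng.lognormvariate(mu, sigma) for _ in range(r)]
        sq_errors.append((statistics.fmean(sample) - true_mean) ** 2)
    return math.sqrt(statistics.fmean(sq_errors))


if __name__ == "__main__":
    for r in (3, 10, 30):
        print(r, round(rmse_of_mean(r), 3))
```

Under these assumptions the printed RMSE shrinks as `r` grows, which is the behavior that consistency describes.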

Some known properties of sampling patterns and UCLs are noted below:

- For samples collected using a simple random sampling pattern (e.g., replicates based on increments collected by simple random sampling), the properties of the Student’s-*t*-statistic-based 95% UCL (Student’s-*t*-95% UCL) and the Chebyshev inequality–based 95% UCL (Chebyshev 95% UCL) are well established. Specifically, the coverage provided by a *t*-95% UCL based on simple random sampling is nonincreasing as the sample size (e.g., number of replicates) increases, and the coverage provided by a Chebyshev 95% UCL is nondecreasing as the sample size (number of replicates) increases (e.g., USEPA 2010c; Singh, Singh, and Engelhardt 1997; Singh, Singh, and Iaci 2002; Dudewicz and Mishra 1988).
- For normally distributed data sets, a *t*-95% UCL based on simple random sampling provides approximately 95% coverage for the DU mean, and a Chebyshev 95% UCL tends to provide higher coverage for the DU mean than the nominal 95%.
- For moderately skewed to highly skewed data, a *t*-95% UCL fails to provide 95% coverage for the DU mean. For such data sets, the use of a Chebyshev 95% UCL is preferred to address uncertainties associated with the estimate of the DU mean (Singh, Singh, and Engelhardt 1997; Singh, Singh, and Iaci 2002; USEPA 2010b).
- For serpentine and systematic random sampling patterns, the properties of the *t*-95% UCL and Chebyshev 95% UCL are not well established. However, it is noted that for heterogeneous DUs, in addition to yielding biased estimates of the DU mean, the use of serpentine and systematic random within grid sampling patterns tends to yield ISM replicates with lower variability (e.g., Singh, Singh, and Murphy 2009). Therefore, the coverage provided by a 95% UCL (e.g., *t*-95% UCL and Chebyshev 95% UCL) of the mean based on ISM increments collected using the serpentine and systematic random within grid sampling patterns is lower than the nominal 0.95 coverage.

Note: If a *t*-95% UCL based on r replicates (with increments collected using simple random sampling) does not provide the nominal 95% coverage for the DU mean, then a *t*-95% UCL based on a larger (>r) number of replicates also does not provide 95% coverage for the DU mean.
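The two UCLs discussed above can be sketched in a few lines. The sketch uses the standard forms: mean plus t(0.95, r - 1) times s/sqrt(r) for the *t*-95% UCL, and mean plus sqrt(1/0.05 - 1) times s/sqrt(r) for the Chebyshev 95% UCL. The `coverage` helper is a hypothetical Monte Carlo check that models replicate results as independent lognormal draws. Its parameters are arbitrary illustrations, not the settings of the simulation experiments. The critical values are hardcoded from a standard *t* table so that only the Python standard library is needed.

```python
import math
import random
import statistics

# One-sided 95% Student's t critical values, keyed by degrees of freedom (r - 1).
T_95 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015,
        6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833}


def t_ucl95(values):
    """Student's-t 95% UCL of the mean (supports 2 to 10 replicates)."""
    r = len(values)
    se = statistics.stdev(values) / math.sqrt(r)
    return statistics.fmean(values) + T_95[r - 1] * se


def chebyshev_ucl95(values):
    """Chebyshev inequality-based 95% UCL of the mean."""
    r = len(values)
    se = statistics.stdev(values) / math.sqrt(r)
    return statistics.fmean(values) + math.sqrt(1 / 0.05 - 1) * se


def coverage(ucl, r, trials=5000, mu=0.0, sigma=1.5, seed=7):
    """Fraction of simulated data sets whose UCL is at or above the true mean.

    Assumption: replicate results are independent lognormal draws
    (a skewed distribution), with true mean exp(mu + sigma**2 / 2).
    """
    rng = random.Random(seed)
    true_mean = math.exp(mu + sigma ** 2 / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.lognormvariate(mu, sigma) for _ in range(r)]
        hits += ucl(sample) >= true_mean
    return hits / trials


if __name__ == "__main__":
    for r in (3, 5, 10):
        print(r, coverage(t_ucl95, r), coverage(chebyshev_ucl95, r))
```

Because both calls to `coverage` use the same seed, the two UCLs are evaluated on identical simulated data sets, so any difference in coverage comes from the multiplier alone. For three or more replicates the Chebyshev multiplier (about 4.36) exceeds the *t* critical value, so its coverage in this sketch cannot be lower than that of the *t*-95% UCL.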

Documents dealing with MIS methodology (e.g., Hewitt et al. 2007, 2009; USACE 2009) suggest the use of a serpentine pattern (Figure A-21) or a systematic random grid pattern (Figure A-22) to collect the increments that make up an ISM sample. These patterns are suggested because they are easier to implement in the field; however, statistical sampling theory suggests that data based on simple random sampling (Figure A-23) yield unbiased (representative) estimates. Therefore, increments (of equal mass) should be collected in a completely random, unbiased manner from the DU consisting of the bulk material. The use of simple random sampling gives each location of the DU an equal chance of being selected in the sample (discrete or ISM) used to estimate the DU mean. These three sampling patterns were evaluated by Singh, Singh, and Murphy (2009). Based on simulation experiments, as expected, they observed that the relative bias (FE) in an estimate of the DU mean is smallest when increments are collected in a random manner, an observation also supported by statistical sampling theory (Elder, Thompson, and Myers 1980).