For close election contests, or situations where errors are clustered in only a few precincts, an audit based on a small random sample of precincts (or other large batches of ballots) might run a large risk of letting an incorrect outcome go undetected. To understand why, it may help to consider tasting soup. If you put salt into one part of the pot and taste in another part of the pot, you might not be able to assess how salty the soup is. On the other hand, if you stir thoroughly between the salting and the tasting, the soup in any part of the pot will give a good idea of the saltiness of the soup.
Or consider jelly beans grouped into bags (as ballots are grouped into precincts). Suppose there are 100 bags of 100 jelly beans each, with some bags having a mixture of flavors and others consisting of a single flavor only. Suppose also that each bag is covered with aluminum foil, so that nobody can tell which is which by looking at the bags. I hate coconut jelly beans, and I want to estimate the number of coconut beans in all 100 bags (just as I hate mistabulated ballots and want to estimate how many of them there are).
One option would be to choose a bag at random, open it, and count the coconut beans in it. I could then estimate the total number of coconut beans by multiplying the number in that bag by 100. If I chose a bag that contained only coconut beans, I would estimate that all 10,000 beans were coconut; if the bag consisted entirely of a different flavor, I would estimate that none of the 10,000 beans was coconut; and if I picked a mixed bag, I would assume that the proportion of coconut beans among all 10,000 was the same as in the bag I had picked.
Suppose instead the jelly bean bags are all opened by someone else, dumped into a large pot, and stirred well. Suppose I then choose 100 beans at random from the large pot and count the number of coconut beans in that group. The estimate I get in this case will be far more reliable than the estimate I would get by looking at the contents of a single bag, even though in both cases I’m examining 100 jelly beans. To get a similarly reliable estimate of the number of coconut jelly beans in all the bags by drawing entire bags at random, I would have to open far more bags and count many more jelly beans.
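The difference in reliability can be checked with a small simulation. The sketch below is only an illustration under assumed numbers: it supposes that 10 of the 100 bags are pure coconut and the other 90 contain none (the text does not specify the mix), and the function names are made up for this example. It repeats each way of estimating many times and compares how widely the estimates scatter.

```python
import random
import statistics

N_BAGS = 100
BEANS_PER_BAG = 100


def make_bags(n_coconut_bags=10):
    # Assumed layout (not from the text): a few bags are pure coconut,
    # the rest contain no coconut at all. True means "coconut".
    return [[i < n_coconut_bags] * BEANS_PER_BAG for i in range(N_BAGS)]


def estimate_from_one_bag(bags, pot, rng):
    # Open one bag at random and scale its coconut count up to all bags.
    bag = rng.choice(bags)
    return sum(bag) * N_BAGS


def estimate_from_stirred_pot(bags, pot, rng):
    # Draw 100 individual beans at random from the pooled beans and
    # scale the coconut count up to the size of the whole pot.
    sample = rng.sample(pot, BEANS_PER_BAG)
    return sum(sample) * (len(pot) // BEANS_PER_BAG)


def spread(estimator, bags, pot, trials=2000, seed=1):
    # Standard deviation of the estimate over many repeated draws.
    rng = random.Random(seed)
    estimates = [estimator(bags, pot, rng) for _ in range(trials)]
    return statistics.pstdev(estimates)


bags = make_bags()
pot = [bean for bag in bags for bean in bag]  # the "stirred" pot
true_total = sum(pot)

bag_spread = spread(estimate_from_one_bag, bags, pot)
pot_spread = spread(estimate_from_stirred_pot, bags, pot)

print("true number of coconut beans:", true_total)
print("spread of single-bag estimates:", round(bag_spread))
print("spread of stirred-pot estimates:", round(pot_spread))
```

Under this assumed layout the single-bag estimate is always either 0 or 10,000, so it is never close to the true total, while the stirred-pot estimates cluster much more tightly around it. Both methods examine exactly 100 beans per trial.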
In the same way, audits that pick individual ballots at random will be more efficient than audits that pick entire precincts at random. Physically stirring or shuffling ballots is not recommended! The desired “declustering” can be achieved by creating a detailed catalog of the physical ballots — including their storage locations — and choosing randomly from that catalog. Such a catalog is called a “ballot manifest.”
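As a rough sketch of how choosing randomly from such a catalog might work, the example below numbers every ballot across a small, made-up manifest and maps each randomly drawn number back to a storage location and a position within it. The manifest entries, the function names, and the seed are all hypothetical. Python's `random.Random` stands in for the source of randomness purely for illustration; a real audit would need a publicly verifiable procedure, and its sampling rules (for example, with or without replacement) may differ.

```python
import random
from bisect import bisect_right
from itertools import accumulate

# Hypothetical manifest: (storage location, number of ballots stored there).
MANIFEST = [
    ("Box A", 250),
    ("Box B", 400),
    ("Box C", 350),
]


def locate(manifest, ballot_index):
    """Map a 0-based index over all ballots to (location, 1-based position)."""
    # Running totals: ends[k] is the number of ballots in entries 0..k.
    ends = list(accumulate(count for _, count in manifest))
    if not 0 <= ballot_index < ends[-1]:
        raise IndexError("ballot index outside the manifest")
    entry = bisect_right(ends, ballot_index)
    start = ends[entry] - manifest[entry][1]
    return manifest[entry][0], ballot_index - start + 1


def draw_sample(manifest, sample_size, seed):
    """Pick distinct ballots uniformly at random and say where to find them."""
    total = sum(count for _, count in manifest)
    rng = random.Random(seed)  # illustrative stand-in, see note above
    indices = rng.sample(range(total), sample_size)
    # Sorting groups the picks by location, which makes retrieval easier.
    return [locate(manifest, i) for i in sorted(indices)]


for location, position in draw_sample(MANIFEST, sample_size=5, seed=2024):
    print(f"retrieve ballot {position} from {location}")
```

Because every ballot in the manifest is equally likely to be drawn regardless of which box it sits in, the sample behaves like beans drawn from the stirred pot, without anyone having to move or shuffle the ballots.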
This material is based on “Risk-limiting vote-tabulation audits: The importance of cluster size” by Stark.