**PROBABILITY OF AT LEAST ONE SUCCESSFUL TEST WELL**

If the purpose of an aquifer exploration task is to estimate how many test wells would be required in an aquifer to complete at least one successful test well, then exceedance probability might be used. Exceedance probability (P_{e}) is the probability that one value obtained at random from a large distribution of values will equal or exceed a certain value. The probability of the value being less than that value is (1-P_{e}). If the values in the distribution are independent (no value is affected by any other value), then if n values are obtained at random:

P(ALO) = 1-[P(LTC)]^{n} (Equation 1)

where:

P(ALO) is the probability that at least one value equals or exceeds a certain critical value, and

P(LTC) is the probability that a value obtained at random is less than the critical value.

The term [P(LTC)]^{n} is the probability that all of the values obtained were less than the critical value, and P(ALO) is its complement. That is: [P(LTC)]^{n}+P(ALO)=1.
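Equation 1 can be evaluated directly. The following is a minimal sketch; the function name `prob_at_least_one` and the input value 0.5 are illustrative assumptions, not values from this article.

```python
def prob_at_least_one(p_ltc: float, n: int) -> float:
    """Equation 1: probability that at least one of n independent values
    equals or exceeds the critical value."""
    if not 0.0 <= p_ltc <= 1.0:
        raise ValueError("p_ltc must be a probability between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # p_ltc ** n is the probability that all n values fall below the
    # critical value; P(ALO) is its complement.
    return 1.0 - p_ltc ** n


# Illustrative input only (assumed): P(LTC) = 0.5
for n in (1, 2, 3, 10):
    print(n, round(prob_at_least_one(0.5, n), 4))
```

With P(LTC) held fixed, P(ALO) rises toward 1 as n grows, which is why a target P(ALO) can be converted into a required number of values.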

Solving Equation 1 for **n** yields:

**n** = log[1-P(ALO)]/log[P(LTC)] (Equation 2)
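The algebra between Equation 1 and Equation 2 can be written out step by step. This sketch assumes 0 < P(LTC) < 1 and P(ALO) < 1, so that both logarithms are defined and the denominator is not zero. Any logarithm base gives the same ratio.

```latex
\begin{aligned}
P(\mathrm{ALO}) &= 1 - [P(\mathrm{LTC})]^{n} \\
[P(\mathrm{LTC})]^{n} &= 1 - P(\mathrm{ALO}) \\
n \,\log[P(\mathrm{LTC})] &= \log[1 - P(\mathrm{ALO})] \\
n &= \frac{\log[1 - P(\mathrm{ALO})]}{\log[P(\mathrm{LTC})]}
\end{aligned}
```

Both logarithms are negative because their arguments are less than one, so the ratio is positive.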

This form of the equation may be useful in some aquifer exploration projects. For example, if sufficient data on hydraulic conductivity at aquifer test sites scattered randomly in a certain geologic terrane are available, then one can select a value for P(ALO) that would provide an acceptable level of confidence of finding a certain critical value of hydraulic conductivity. Then a cumulative-distribution polygon for hydraulic conductivity can be constructed from the existing data, and P(LTC) can be read from the cumulative distribution. Once one has values for P(ALO) and P(LTC), a value for **n** may be calculated from Equation 2. The value for **n**, rounded up to the next whole number, provides an estimate of the number of test wells that would be required to find one or more sites with hydraulic conductivity at or above the critical value.
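The workflow described above can be sketched in a few lines. This is a simplified illustration, not the author's procedure: it reads P(LTC) as the plain fraction of observed values below the critical value instead of interpolating on a cumulative-distribution polygon. The function names and the ten conductivity values are hypothetical and invented for the example.

```python
import math
from bisect import bisect_left


def p_less_than(values, critical):
    """Fraction of observed values strictly below the critical value."""
    ordered = sorted(values)
    return bisect_left(ordered, critical) / len(ordered)


def wells_required(p_alo, p_ltc):
    """Equation 2, rounded up to a whole number of wells."""
    if not 0.0 < p_ltc < 1.0:
        raise ValueError("P(LTC) must be strictly between 0 and 1")
    if not 0.0 <= p_alo < 1.0:
        raise ValueError("P(ALO) must be at least 0 and less than 1")
    return math.ceil(math.log(1.0 - p_alo) / math.log(p_ltc))


# Hypothetical hydraulic conductivities in ft/day (made up for illustration).
k_values = [5, 12, 20, 35, 50, 80, 120, 150, 300, 650]

p_ltc = p_less_than(k_values, 100)   # 6 of the 10 values are below 100
n = wells_required(0.90, p_ltc)
print(p_ltc, n)
```

A real data set would need far more than ten values for P(LTC) to be reliable.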

As
an example, consider the frequency histogram of hydraulic
conductivity in a certain geologic terrane shown in Figure 1. This
histogram is based on 1539 values. The corresponding
cumulative-distribution polygon is shown in Figure 2.

Figure 1. Example of a frequency histogram.

Figure 2. Example of a cumulative-distribution polygon.

If one were to need a hydraulic conductivity of 500 ft/day for a successful well completion, then P(LTC) would be about 0.96. If one wanted to be 90% confident of finding this critical value with at least one test well, then P(ALO) would be set at 0.90. From Equation 2:

**n** = log[1-0.90]/log[0.96] = 56.4.

So one would be 90% confident of constructing at least one test well yielding a value of hydraulic conductivity of 500 ft/day or more if 57 test wells were constructed (56.4 rounded up to the next whole well). These wells would have to be located sufficiently far apart that the hydraulic conductivity at one would not be affected by any other; that is, the values would have to be independent.
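The rounding step can be checked by substituting whole numbers of wells back into Equation 1. The sketch below uses only the values given in the text (P(ALO) = 0.90 and P(LTC) = 0.96); the variable names are illustrative.

```python
import math

p_alo = 0.90   # desired probability of at least one success (from the text)
p_ltc = 0.96   # probability a well falls below 500 ft/day (from the text)

n_exact = math.log(1.0 - p_alo) / math.log(p_ltc)   # Equation 2
n_wells = math.ceil(n_exact)                        # whole number of wells

# Equation 1 evaluated on either side of the rounded value
p_with_56 = 1.0 - p_ltc ** 56
p_with_57 = 1.0 - p_ltc ** 57

print(f"n = {n_exact:.1f}, rounded up to {n_wells}")
print(f"P(ALO) with 56 wells = {p_with_56:.3f}")
print(f"P(ALO) with 57 wells = {p_with_57:.3f}")
```

With 56 wells the probability falls just short of 0.90, and with 57 wells it just exceeds it, so rounding up is the appropriate choice.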

However, if 100 ft/day would suffice, then P(LTC) would be about 0.68. If P(ALO) were kept at 0.90, then

**n** = log[1-0.90]/log[0.68] = 5.97.

So
one would be 90% confident of constructing
at least one test well yielding a value of hydraulic conductivity of
100 ft/day or more if 6
test wells were constructed.

Such exceedance probability calculations might be used at the planning stage of a water resource investigation to provide information on the possible cost of a test well program given a desired likelihood of success. In practice, one likely would not need to construct all **n** test wells before a successful one is completed. However, there is also a probability [1-P(ALO)] that even **n** test wells will not be successful.
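Both remarks can be put into numbers under one added assumption that the text does not state as a rule: drilling stops at the first successful well, or after n wells if none succeeds. With independent wells, the expected number of wells drilled is then the finite geometric sum 1 + P(LTC) + ... + [P(LTC)]^{n-1}. The function names below are hypothetical.

```python
def expected_wells_drilled(p_ltc: float, n_max: int) -> float:
    """Expected number of wells drilled when drilling stops at the first
    success or after n_max wells, whichever comes first (assumed rule)."""
    if not 0.0 <= p_ltc < 1.0:
        raise ValueError("P(LTC) must be at least 0 and less than 1")
    if n_max < 1:
        raise ValueError("n_max must be at least 1")
    # Closed form of the sum of p_ltc**k for k = 0 .. n_max - 1
    return (1.0 - p_ltc ** n_max) / (1.0 - p_ltc)


def prob_all_fail(p_ltc: float, n_max: int) -> float:
    """1 - P(ALO): probability that none of the n_max wells succeeds."""
    return p_ltc ** n_max


# The two cases from the text: (P(LTC), n)
for p_ltc, n in ((0.96, 57), (0.68, 6)):
    print(p_ltc, n,
          round(expected_wells_drilled(p_ltc, n), 1),
          round(prob_all_fail(p_ltc, n), 3))
```

Under this assumed stopping rule, the expected number of wells is well below n in both cases, while the chance that all n wells fail stays just under 10 percent.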

The example above is based on a large number of data values, so the cumulative distribution polygon is likely a good representation of the entire population of values. However, if additional data were available, the cumulative distribution polygon would not likely be exactly the same. Some judgment is required regarding the effect of sample size. The effect of sample size may be examined by using Kolmogorov's D-statistic. This statistic is the maximum absolute value of the difference between the sample cumulative distribution and the population distribution. Its distribution is known. Consequently, it gives a probabilistic estimate of the maximum difference between the cumulative distribution of a data set and the cumulative distribution of the entire population. For a sample size (**n**, here the number of data values rather than the number of test wells) greater than 35, the value of D is 1.22/**n**^{1/2}, 1.36/**n**^{1/2}, and 1.63/**n**^{1/2} at significance levels of 0.1, 0.05, and 0.01, respectively. These significance levels correspond to confidence levels of 90, 95, and 99 percent. In the case of the cumulative distribution polygon shown in Figure 2, where **n** = 1539, D = 1.63/1539^{1/2} = 0.04 at the 99 percent confidence level. So this distribution is a good representation of the actual distribution of hydraulic conductivity in that geologic terrane: with a very high degree of confidence, the population distribution would differ from the sample distribution by no more than 0.04 in cumulative probability (4 percentage points).
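The large-sample values of D quoted above are simple to compute. The sketch below uses only the three coefficients given in the text and refuses sample sizes of 35 or fewer, where tabulated values should be used instead. The function and constant names are illustrative.

```python
import math

# Large-sample coefficients quoted in the text (sample size > 35),
# keyed by significance level.
KS_COEFFICIENTS = {0.10: 1.22, 0.05: 1.36, 0.01: 1.63}


def kolmogorov_d(sample_size: int, significance: float) -> float:
    """Approximate critical value of Kolmogorov's D for large samples."""
    if sample_size <= 35:
        raise ValueError("approximation applies to sample sizes above 35")
    if significance not in KS_COEFFICIENTS:
        raise ValueError("significance must be 0.10, 0.05, or 0.01")
    return KS_COEFFICIENTS[significance] / math.sqrt(sample_size)


# The data set behind Figure 2 has 1539 values.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(kolmogorov_d(1539, alpha), 3))
```

Because D shrinks only with the square root of the sample size, quadrupling the number of data values halves D.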

Dixon and Massey (__127__) provide a table of percentiles of the distribution of Kolmogorov's D-statistic and a relatively simple description of its use. More complete tables (and abstruse mathematical disquisitions) are available elsewhere. Values of Kolmogorov's D-statistic for selected confidence levels and sample sizes are plotted in Figure 3. Figure 3 shows that as the sample size (for example, hydraulic conductivities from previous test wells) decreases and the selected confidence level increases, D increases. In the extreme, if the sample size is one (1) and the confidence level is 99 percent, D is 0.995. Hence, a sample consisting of a single value yields virtually no information on the population distribution. For small sample sizes, the D-statistic could be used to provide more conservative estimates of n, because its value at a selected confidence level could be added to P(LTC). However, the adjusted P(LTC) must be less than one (1) to avoid division by zero in Equation 2.
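The conservative adjustment described above can be sketched as follows. The scenario is hypothetical: it assumes P(LTC) = 0.68 was read from a data set of only 100 values, for which the large-sample value of D at the 95 percent confidence level is 1.36/100^{1/2} = 0.136. The function name is illustrative.

```python
import math


def conservative_wells_required(p_alo, p_ltc, d):
    """Equation 2 with D added to P(LTC), rounded up to whole wells.

    Returns None when the adjusted P(LTC) reaches 1, because the
    logarithm in the denominator would be zero or positive and
    Equation 2 gives no usable answer.
    """
    if not 0.0 < p_ltc < 1.0:
        raise ValueError("P(LTC) must be strictly between 0 and 1")
    if not 0.0 <= p_alo < 1.0:
        raise ValueError("P(ALO) must be at least 0 and less than 1")
    adjusted = p_ltc + d
    if adjusted >= 1.0:
        return None
    return math.ceil(math.log(1.0 - p_alo) / math.log(adjusted))


# Hypothetical sample size of 100 values, 95 percent confidence level
d = 1.36 / math.sqrt(100)

print(conservative_wells_required(0.90, 0.68, 0.0))   # no adjustment
print(conservative_wells_required(0.90, 0.68, d))     # adjusted by D
print(conservative_wells_required(0.90, 0.96, d))     # adjusted P(LTC) >= 1
```

In this hypothetical case the adjustment doubles the estimate from 6 to 12 wells, and for the 500 ft/day case no finite estimate is available because the adjusted P(LTC) exceeds one.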

Figure 3. Graph of selected values of Kolmogorov's D-Statistic.

One
can increase the chance of success if test wells can be placed
at geologically advantageous locations. Identifying such locations
may involve aerial photograph interpretation, interpretation of
satellite images, geophysical investigations, stratigraphic facies
mapping, outcrop studies of fracture density and orientation,
geomorphology investigation, calibration of groundwater flow models,
and other geologic studies, as appropriate for a particular terrane.

The concept of exceedance probability is also potentially useful in petroleum exploration and mineral exploration. It is widely used in flood hazard reporting.

**Addendum**

For questions or comments (especially regarding errors and omissions), send an email to ddunn@dunnhydrogeo.com.

If you need a consulting hydrogeologist to provide services related to aquifer exploration, send an email to the address above.