
CF and probability


Generally speaking, the CF is the difference between belief and disbelief:

CF(h, e) = MB(h, e) - MD(h, e), expressed in per cent.

Here CF is the confidence factor in the hypothesis h due to evidence e, MB is the measure of increased belief in h due to e, and MD is the measure of increased disbelief in h due to e.

In most cases disbelief is simply the opposite of belief, and therefore using CFs is equivalent to using probabilities, for which the following is always true:

P(h) = 1 - P(¬h),

where P(¬h) is the probability of any hypothesis other than h.

For a hypothesis conditioned on evidence e, the same relationship reads

 P(h| e) = 1 - P(¬h| e).

The fundamental problem here is that while P(h| e) implies a cause-and-effect relationship between e and h, there may be no cause-and-effect relationship between e and ¬h. For example, it is known that El Niño events tend to be associated with anomalously warm winters in the Great Lakes region (Rodionov and Assel, 2003). The probability of a cold winter, however, does not appear to be affected by the processes in the equatorial Pacific, so that the above equation does not hold. This type of nonlinear, or asymmetric, relationship between climatic variables is attracting increasing attention in climate research (e.g., Wu and Hsieh, 2004).

When many pieces of evidence are combined, the sum of the CFs for all categories of the forecast variable is not necessarily equal to 100. Therefore, CFs are not probabilities, although in most cases for individual rules they can be converted into probabilities, and vice versa.

The measures of belief and disbelief defined in terms of probabilities also include the prior, or climatological, probability P(h). If P(h) = 1, then MB(h, e) = 1, and if P(h) = 0, then MD(h, e) = 1; otherwise

MB(h, e) = (max [P(h| e), P(h)] - P(h))/ (1 - P(h)),

MD(h, e)= (min [P(h| e), P(h)] - P(h))/ (- P(h)).
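The two formulas above can be sketched in Python as follows. This is a minimal illustration, not part of the original software: the function names are hypothetical, the degenerate priors P(h) = 0 and P(h) = 1 are deliberately rejected rather than handled, and the prior of 0.5 and posterior of 0.75 in the usage lines are illustrative inputs.

```python
def _check_prior(prior: float) -> None:
    # Assumption of this sketch: only priors strictly between 0 and 1 are
    # supported, so that neither denominator below can be zero.
    if not 0.0 < prior < 1.0:
        raise ValueError("prior P(h) must lie strictly between 0 and 1")


def measure_of_belief(prior: float, posterior: float) -> float:
    """MB(h, e) = (max[P(h|e), P(h)] - P(h)) / (1 - P(h))."""
    _check_prior(prior)
    return (max(posterior, prior) - prior) / (1.0 - prior)


def measure_of_disbelief(prior: float, posterior: float) -> float:
    """MD(h, e) = (min[P(h|e), P(h)] - P(h)) / (-P(h))."""
    _check_prior(prior)
    # Written as (P(h) - min[...]) / P(h), which is algebraically the same
    # and avoids returning a negative zero.
    return (prior - min(posterior, prior)) / prior


def confidence_factor(prior: float, posterior: float) -> float:
    """CF(h, e) = MB(h, e) - MD(h, e), expressed in per cent."""
    mb = measure_of_belief(prior, posterior)
    md = measure_of_disbelief(prior, posterior)
    return 100.0 * (mb - md)


# Illustrative inputs: P(h) = 0.5 and P(h|e) = 0.75.
print(measure_of_belief(0.5, 0.75))     # 0.5
print(measure_of_disbelief(0.5, 0.75))  # 0.0
print(confidence_factor(0.5, 0.75))     # 50.0
```

Only one of MB and MD can be non-zero for a given pair of prior and posterior, because the max and min terms cannot both differ from P(h).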

For example, let’s assume that in some region the climatological probabilities of a cold and warm winter are P(cold) = 0.4 and P(warm) = 0.6. Suppose there is a factor e that may or may not affect the probability of occurrence of cold or warm winters. The data shows that out of 16 winters when e was observed, 8 were warm and 8 cold, so that P(warm| e) = 0.5. Then, according to the above formulas,

MB(warm, e) = (0.6 - 0.6)/0.4 = 0,

MD(warm, e) = (0.5 - 0.6)/(-0.6) ≈ 0.17, and

CF ≈ -17.

In this example, the equal split between warm and cold winters under e increases our disbelief in h: winter = warm. In our analysis, however, we do not use negative CFs. Instead, we calculate CFs for the opposite event, i.e., CF (cold, e) ≈ 17.

In the next example, let’s assume that the climatological probabilities of a cold or warm winter are the same, i.e., P(cold) = P(warm) = 0.5. Suppose that the data shows that out of 16 winters when e was observed, 12 were warm and 4 cold, so that P(warm| e) = 0.75. Then

MB(warm, e) = (0.75 – 0.5)/0.5 = 0.5,

MD(warm, e) = 0, and 

CF = 50.

Since the evidence e does not support the hypothesis “winter = cold”, CF (cold, e) = 0. Because P(h) = 0.5 in this example, the value for MB(warm, e) can also be obtained as

MB(warm, e) = P(h| e) - P(¬h| e) = 0.75 - 0.25 = 0.5.
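The convention of reporting only non-negative CFs for each category can be sketched as below. This is one possible reading of the procedure, not the original implementation: the function name `category_cfs` is hypothetical, posteriors are estimated as simple relative frequencies, and priors are assumed to lie strictly between 0 and 1.

```python
def category_cfs(counts: dict, priors: dict) -> dict:
    """Return a non-negative CF (in per cent) for every category.

    counts: number of cases in each category when evidence e was observed.
    priors: climatological probability P(h) of each category (0 < P(h) < 1).
    """
    total = sum(counts.values())
    cfs = {}
    for category, n in counts.items():
        prior = priors[category]
        posterior = n / total  # relative-frequency estimate of P(h|e)
        mb = (max(posterior, prior) - prior) / (1.0 - prior)
        md = (prior - min(posterior, prior)) / prior
        # A negative CF is not reported; the category simply gets 0 and the
        # support shows up as a positive CF for the opposite category.
        cfs[category] = max(0.0, 100.0 * (mb - md))
    return cfs


# Second example from the text: 12 warm and 4 cold winters, equal priors.
print(category_cfs({"warm": 12, "cold": 4}, {"warm": 0.5, "cold": 0.5}))
# {'warm': 50.0, 'cold': 0.0}
```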

The data also shows that the odds (O) of warm versus cold winter are 12:4 or 3:1. Odds can be converted into probability using the following formula:

P(warm| e) = O/(1+O) = 3/4 = 0.75.
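The conversion between odds and probability can be checked with a few lines of Python. The helper names are hypothetical, and the inverse conversion is added only for illustration.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds O (e.g. 3 for 3:1) into a probability O / (1 + O)."""
    return odds / (1.0 + odds)


def probability_to_odds(p: float) -> float:
    """Inverse conversion, P / (1 - P); assumes p < 1."""
    return p / (1.0 - p)


warm, cold = 12, 4            # winters observed when e was present
odds = warm / cold            # 3.0, i.e. odds of 3:1
print(odds_to_probability(odds))   # 0.75
print(probability_to_odds(0.75))   # 3.0
```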

In its forecasts, the U.S. Climate Prediction Center uses the so-called “probability of exceedance” (Pexc), which in our example will be

Pexc = P(warm| e) - P(warm) = 0.75 - 0.5 = 0.25.

All this gives us a better understanding of what CF = 50 means in more familiar probabilistic measures of uncertainty. This value of CF, however, needs to be adjusted for the number of observations in the same manner as discussed above for the CF values based on the correlation. It may be further adjusted to reflect the quality of data, reliability of sources of information, accuracy of the analysis, etc. All in all, the CF reflects both the objective and subjective confidence in h.