
Forecast skill


The forecast skill score (SS) is defined as the accuracy of the test forecasts relative to the accuracy of forecasts produced by a reference method:

SS = (1 - MSE/MSEr) × 100%,

where MSE is the mean squared error of the test forecasts and MSEr is that of the reference forecasts (NOAA Forecast Verification Glossary). MSE measures the mean squared difference between the forecasts and observations:

MSE = (1/N) Σ (Fi - Oi)^2,

where Fi and Oi are the forecast and observed values for year i, the sum runs over i = 1, ..., N, and N is the number of years of forecasts. When calculating the skill score for an entire region, instead of the MSE we use the normalized MSE (NMSE) defined as

NMSE = MSE/VARk,

where VARk is the observed variance for station k in the region.
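The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the function names and the sample numbers are hypothetical, VARk is taken as the population variance of the observations over the verification years, and the regional NMSE is assumed to be a simple average of the station values (the text above does not specify how stations are combined).

```python
from statistics import pvariance


def mse(forecasts, observations):
    """Mean squared difference between paired forecasts and observations."""
    if not forecasts or len(forecasts) != len(observations):
        raise ValueError("series must be non-empty and of equal length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / len(forecasts)


def skill_score(error_test, error_ref):
    """Skill score in percent: 100 * (1 - error_test / error_ref)."""
    return 100.0 * (1.0 - error_test / error_ref)


def regional_nmse(forecasts_by_station, observations_by_station):
    """Average of MSE / VARk over stations (the averaging is an assumption)."""
    ratios = []
    for station, observed in observations_by_station.items():
        var_k = pvariance(observed)  # assumed: population variance of the observations
        ratios.append(mse(forecasts_by_station[station], observed) / var_k)
    return sum(ratios) / len(ratios)


# Hypothetical data: two stations, four forecast years each.
observed = {"A": [10.0, 12.0, 14.0, 16.0], "B": [20.0, 24.0, 28.0, 32.0]}
test = {"A": [11.0, 13.0, 13.0, 15.0], "B": [22.0, 26.0, 26.0, 30.0]}
reference = {"A": [12.0, 14.0, 12.0, 14.0], "B": [24.0, 28.0, 24.0, 28.0]}

station_score = skill_score(mse(test["A"], observed["A"]), mse(reference["A"], observed["A"]))
regional_score = skill_score(regional_nmse(test, observed), regional_nmse(reference, observed))
print(station_score, regional_score)
```

In this made-up case the test forecast has one quarter of the reference error at both stations, so the station and regional scores both come out at about 75.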

Three reference methods are used here:

  1. Climatology, or climate normals. Climatology is obtained by averaging the corresponding weather derivative values over the 30-year base period, 1971-2000. Together with persistence, it is one of the two most widely used standards of reference in forecast verification.
  2. Persistence. This simple method uses the most recent observation as the forecast. Despite its simplicity, persistence often provides a hard-to-beat forecast.
  3. Optimal climate normals (OCN). The OCN is one of the main forecasting tools used by the Climate Prediction Center (CPC). It predicts a climate variable using its average value for a given month or season over the last 10 years (for temperature). The tool, as it is used at the CPC, is simply a measure of the trend. In the absence of a clearly defined ENSO phase, the OCN output dominates the forecast maps produced by the CPC, as shown in this Seasonal Forecast Guidance.
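The three reference methods can be sketched as follows. This is an illustration under stated assumptions: the function names and the year-to-value dictionary are hypothetical, "most recent observation" is read as the same season of the previous year, and the synthetic series is a plain linear trend rather than real data.

```python
def climatology(history, base_start=1971, base_end=2000):
    """Mean of the seasonal values over the base period (inclusive)."""
    values = [v for year, v in history.items() if base_start <= year <= base_end]
    return sum(values) / len(values)


def persistence(history, target_year):
    """Most recent observation, assumed here to be the previous year's value."""
    return history[target_year - 1]


def optimal_climate_normal(history, target_year, window=10):
    """Mean of the last `window` years before the target year."""
    values = [history[year] for year in range(target_year - window, target_year)]
    return sum(values) / len(values)


# Synthetic series with a steady upward trend: 1971 -> 1.0, ..., 2007 -> 37.0.
history = {year: float(year - 1970) for year in range(1971, 2008)}

clim = climatology(history)
pers = persistence(history, 2008)
ocn = optimal_climate_normal(history, 2008)
print(clim, pers, ocn)
```

With a steady trend, climatology (15.5) lags far behind, while the OCN (32.5) and persistence (37.0) sit much closer to the recent values, which is why the OCN behaves as a measure of the trend.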

A positive (negative) skill score (presented here in percent) means that the test forecast is better (worse) than the reference forecast. A score of 0 indicates that both forecasts are equally skillful. Note that the skill score is asymmetric relative to zero. For a perfect forecast, the skill score reaches a maximum of 100, but it can go to negative infinity for a badly missed forecast if the reference forecast is close to the observation. Therefore, the skill score can be unstable for small sample sizes.
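The asymmetry is easy to see with a single forecast. The sketch below uses hypothetical numbers: the reference forecast happens to land within 0.1 of the observation, so even a moderate miss by the test forecast produces a hugely negative score, whereas a perfect test forecast can do no better than 100.

```python
def skill_score(mse_test, mse_ref):
    """Skill score in percent; undefined when the reference error is zero."""
    if mse_ref == 0:
        raise ZeroDivisionError("reference forecast is perfect; score is undefined")
    return 100.0 * (1.0 - mse_test / mse_ref)


observation = 5.0
reference_forecast = 5.1  # hypothetical: reference is almost exact
reference_error = (reference_forecast - observation) ** 2

perfect = skill_score((5.0 - observation) ** 2, reference_error)  # test forecast hits exactly
missed = skill_score((7.0 - observation) ** 2, reference_error)   # test forecast off by 2

print(perfect, missed)
```

Here the perfect forecast scores 100, while the missed one scores close to -40,000. Averaging squared errors over more years or stations before forming the ratio damps this effect.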

At Climate Logic, skill scores are calculated for weather derivatives in the cities that are traded on the Chicago Mercantile Exchange (currently 18 in the U.S., 6 in Canada, and 9 in Europe). Some of those cities are close to each other, so their temperature variations are highly correlated. This further reduces the effective sample size, because the forecasts for those cities cannot be considered independent. Therefore, to get a better sense of the forecast accuracy, we recommend reading our post-mortem analyses, which describe how successful (or otherwise) the forecast was in capturing the essential aspects of climatic processes during the forecast season.
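As a rough illustration of how correlation shrinks the effective sample size, the sketch below uses a common textbook approximation for the mean of n equally correlated values, n_eff = n / (1 + (n - 1) r). This formula, the function name, and the correlation value of 0.5 are assumptions made for the example; they are not part of our scoring procedure.

```python
def effective_sample_size(n, mean_correlation):
    """Approximate number of independent samples among n equally correlated ones."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= mean_correlation <= 1.0:
        raise ValueError("mean_correlation must lie between 0 and 1")
    return n / (1.0 + (n - 1) * mean_correlation)


independent = effective_sample_size(18, 0.0)  # uncorrelated cities
correlated = effective_sample_size(18, 0.5)   # hypothetical mean correlation of 0.5
identical = effective_sample_size(18, 1.0)    # perfectly correlated cities

print(independent, correlated, identical)
```

Under this approximation, 18 cities with a mean pairwise correlation of 0.5 carry about as much independent information as two cities.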

It should also be noted that, since we currently track skill scores only for the calendar seasons (Dec-Feb for winter and Jun-Aug for summer), they may occasionally fail to reflect the accuracy of our forecasts. For example, our winter 2008 forecast for Europe accurately predicted the anomalously cold November-December 2007 and anomalously warm January-March 2008, significantly outperforming all our reference methods for those months. However, the skill score for December-February 2008 was slightly negative relative to the OCN.

An example of a performance analysis for the CPC's climate outlooks is presented by Klaus Wolter (NOAA). It shows that temperature forecast skill (relative to climatology) averages just under +20 for non-CL forecasts and under +10 for all forecasts, with wild swings from season to season and from year to year. There has been no improvement in forecast skill since 1995, when the first climate outlook was issued. Given the strong warming trend in recent decades, it would be interesting to see how the CPC's climate outlooks score relative to persistence or the OCN.

Another way to estimate the accuracy of the forecasts is to see how profitable they can be when used for weather derivative trading on WeatherBill.com. See the results of betting on heating degree days for the winter (Nov-Mar) of 2008.