Handling outliers
Outliers can make the arithmetic average unrepresentative of the mean value of a regime, and this may significantly affect the results of the regime shift detection. Ideally, the weight assigned to a data value should be small if that value is considered an outlier. To handle outliers, the program uses Huber's weight function (Huber, 2005), which is calculated here as
weight = min(1, parameter / |anomaly|),
where anomaly is the deviation from the expected mean value of the new regime, normalized by the standard deviation averaged over all consecutive sections of the cut-off length in the series. If the absolute value of an anomaly is less than or equal to the value of the parameter, its weight is equal to one. Otherwise, the weight is inversely proportional to the distance from the expected mean value of the new regime.
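The weight formula above can be sketched in a few lines of Python. This is only an illustration, not the program's own code: the function name `huber_weight` and its arguments are hypothetical, and the normalizing standard deviation `sigma` is assumed to have been computed already (the averaging over sections of the cut-off length is not shown).

```python
def huber_weight(value, expected_mean, sigma, parameter):
    """Illustrative Huber weight: min(1, parameter / |anomaly|).

    All names here are assumptions made for this sketch. `sigma` stands in
    for the standard deviation averaged over sections of the cut-off length.
    """
    # Normalized deviation from the expected mean of the new regime.
    anomaly = (value - expected_mean) / sigma
    if anomaly == 0:
        # A value exactly at the expected mean gets full weight
        # (this also avoids dividing by zero).
        return 1.0
    return min(1.0, parameter / abs(anomaly))


# Usage: a value three standard deviations away from the expected mean.
w_strict = huber_weight(16.0, expected_mean=10.0, sigma=2.0, parameter=1.0)
w_loose = huber_weight(16.0, expected_mean=10.0, sigma=2.0, parameter=6.0)
```

With a parameter of 1 the value is down-weighted, while with a parameter of 6 it keeps full weight, because its normalized anomaly does not exceed the parameter.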
After the timing of the regime shifts is determined, the mean values of the regimes are calculated using the following iterative procedure. First, a simple unweighted arithmetic mean is calculated as the initial estimate of the mean value of the regime. Then a weighted mean is calculated, with the weights determined by the distance from that first estimate. The procedure is repeated one more time with the new estimate of the regime mean.
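The iterative procedure can be sketched as follows. This is a minimal illustration under stated assumptions rather than the program's implementation: the name `regime_mean`, the argument `sigma` (the normalizing standard deviation, taken as given) and the default of two weighted passes (one initial weighted mean plus one repetition, as described above) are all choices made for this example.

```python
def regime_mean(values, sigma, parameter, passes=2):
    """Illustrative iteratively re-weighted mean of one regime.

    Hypothetical helper: `sigma` is assumed to be the already-computed
    normalizing standard deviation, and `passes` is the number of
    weighted-mean steps after the initial unweighted estimate.
    """
    # Step 1: simple unweighted arithmetic mean as the initial estimate.
    estimate = sum(values) / len(values)

    for _ in range(passes):
        weights = []
        for v in values:
            # Normalized distance from the current estimate of the mean.
            anomaly = abs(v - estimate) / sigma
            # Huber weight: 1 inside the threshold, parameter/|anomaly| outside.
            weights.append(1.0 if anomaly <= parameter else parameter / anomaly)
        # Step 2 (and its repetition): weighted mean using those weights.
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)

    return estimate


# Usage: four similar values and one outlier.
data = [1.0, 1.0, 1.0, 1.0, 11.0]
robust = regime_mean(data, sigma=1.0, parameter=1.0)
plain = regime_mean(data, sigma=1.0, parameter=100.0)
```

With a small parameter the outlier is down-weighted and the estimate moves toward the bulk of the data. With a very large parameter every weight equals one, so the result is the ordinary arithmetic mean.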
The figure below illustrates the effect of outliers on the timing of regime shifts in mean winter (DJF) temperature in central England for the period 1900-1933. The top graph shows that if Huber's weight parameter is set to 6 (i.e., all temperature values whose normalized anomalies are within six standard deviations have equal weights), a regime shift is detected in 1920. The temperature value for 1917, however, appears to be an outlier. Reducing Huber's weight parameter to 1 (bottom graph) changes the regime shift year to 1911.

The results of the regime shift detection for the winter (DJF) surface air temperature in central England using two different Huber's weight parameters: 6 (top panel) and 1 (bottom panel). Note changes in the onset and termination of the second regime.