Cumulative Distribution Function, Probability Density Function

Cumulative Distribution Function (CDF)

The Cumulative Distribution Function (CDF) gives the probability that a continuous random variable has a value less than or equal to a given value.  Each member of the ENS gives a different forecast value (e.g. of temperature) for a given time and location, so these values can be used to define a CDF in which the x-axis is the forecast variable (e.g. temperature) and the y-axis is the proportion of ENS members forecasting a value less than or equal to a given threshold.  The median is the value at which the CDF is 50%.
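As a minimal sketch of the definition above, the following Python builds an empirical CDF from a set of ensemble values. The ten temperature values and the function name `empirical_cdf` are hypothetical, chosen only for illustration; a real ENS has more members.

```python
from bisect import bisect_right
from statistics import median

def empirical_cdf(members, threshold):
    """Proportion of ensemble members with a value <= threshold."""
    ordered = sorted(members)
    return bisect_right(ordered, threshold) / len(ordered)

# Hypothetical 10-member ensemble of temperature forecasts (deg C).
members = [17.2, 18.1, 18.4, 19.0, 19.3, 19.8, 20.5, 21.1, 21.6, 23.0]

# Probability of not exceeding 20 deg C: 6 of the 10 members are <= 20.
p_not_exceeding_20 = empirical_cdf(members, 20.0)

# The median is the value at which the CDF reaches 50%.
median_value = median(members)
print(p_not_exceeding_20, median_value, empirical_cdf(members, median_value))
```

Evaluating `empirical_cdf` over a range of thresholds and plotting the result against the threshold would give a stepped curve of the kind sketched in the figures of this section.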


Fig8.1.4.1: The CDF shows the probability of not exceeding a threshold value (e.g. not exceeding 20°C).  The figure is a schematic explanation of the principle behind the Extreme Forecast Index (EFI), which is measured by the area between the cumulative distribution functions (CDFs) of the M-climate (blue) and the ENS member (red) forecast temperatures.  In this example almost all the ENS forecast temperatures are above the M-climate median and about 15% are above the M-climate maximum.  The blue line shows the cumulative probability of temperatures evaluated by the M-climate for a given location, time of year and forecast lead time.  The red line shows the corresponding cumulative probability of temperatures evaluated by the ENS.  In this case the EFI is positive (the red line lies to the right of the blue line), indicating higher than normal probabilities of warm anomalies.
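The "area between the CDFs" idea can be illustrated with a short Python sketch. This is only the plain, unweighted signed area between two empirical CDFs, not the operational EFI formula, which this section does not state. The sample values and the names `ecdf` and `signed_area_between_cdfs` are assumptions made for the example.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return a function x -> proportion of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

def signed_area_between_cdfs(climate, forecast, steps=2000):
    """Midpoint-rule integral of F_climate(x) - F_forecast(x) over the data range.

    Positive when the forecast CDF lies to the right of (below) the climate CDF,
    i.e. when the forecast is shifted towards higher values.
    """
    f_clim, f_fc = ecdf(climate), ecdf(forecast)
    lo = min(min(climate), min(forecast))
    hi = max(max(climate), max(forecast))
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += (f_clim(x) - f_fc(x)) * dx
    return total

# Hypothetical temperature samples (deg C): climate reference and a warmer forecast.
m_climate = [10, 12, 14, 15, 16, 17, 18, 20, 22, 24]
forecast = [19, 20, 21, 22, 22, 23, 24, 25, 26, 27]

area = signed_area_between_cdfs(m_climate, forecast)
print(area > 0)  # forecast CDF is to the right of the climate CDF
```

Swapping the two samples reverses the sign, which mirrors the way a forecast CDF to the left of the M-climate CDF corresponds to a cold anomaly.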

Probability Density Function (PDF)

The Probability Density Function (PDF) is the first derivative of the CDF.
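Because an empirical CDF built from a finite set of members is a step function, its derivative is usually approximated over bins rather than point by point. The sketch below, in which the member values, bin edges and the name `binned_density` are hypothetical, estimates the PDF as the change in CDF across each bin divided by the bin width.

```python
from bisect import bisect_right

def binned_density(members, edges):
    """Finite-difference estimate of the PDF: (F(b) - F(a)) / (b - a) per bin."""
    ordered = sorted(members)
    n = len(ordered)
    cdf = [bisect_right(ordered, e) / n for e in edges]
    return [
        (cdf[i + 1] - cdf[i]) / (edges[i + 1] - edges[i])
        for i in range(len(edges) - 1)
    ]

# Hypothetical ensemble temperatures (deg C) and 2-degree bins covering them.
members = [17.2, 18.1, 18.4, 19.0, 19.3, 19.8, 20.5, 21.1, 21.6, 23.0]
edges = [16.0, 18.0, 20.0, 22.0, 24.0]

density = binned_density(members, edges)

# The bin where the CDF rises fastest is the bin with the highest density.
peak_bin = density.index(max(density))
print(edges[peak_bin], edges[peak_bin + 1])
```

The choice of bin width is a trade-off: narrow bins follow the steps of the CDF too closely, while wide bins smooth out real structure.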

Fig8.1.4.2A left: Example CDF.

Fig8.1.4.2B right: The PDF is defined as the first derivative of the CDF and the graphs correspond to the example CDF curves in Fig8.1.4.2A with the temperature M-climate (blue) and the forecast distribution (red). 

Dotted lines show the medians of the M-climate and the forecast.

From a CDF curve it is easy to determine the median, or any other percentile, by drawing a horizontal line at the required probability and reading off the x-axis value where it intersects the curve.  The most likely values are those where the CDF is steepest.  Similarly, the PDF shows peaks at the intervals of highest probability.  The EFI can be understood and interpreted with both the CDF and PDF in mind; the former relates to the EFI value, the latter clarifies the connection to probabilities.  A steep slope of the CDF, or equivalently a narrow peak of the PDF, implies high confidence in the forecast.
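Reading a percentile off a CDF amounts to inverting it. The sketch below uses one simple convention (the smallest member value at which the empirical CDF reaches the requested probability; other percentile conventions interpolate between members) and uses the interquartile range as a rough measure of how steep the CDF is. Both ensembles and all function names are hypothetical.

```python
import math

def percentile_from_cdf(members, p):
    """Smallest member value at which the empirical CDF reaches probability p (0 < p <= 1)."""
    ordered = sorted(members)
    k = max(1, math.ceil(p * len(ordered)))
    return ordered[k - 1]

def interquartile_range(members):
    """Width between the 25th and 75th percentiles; small when the CDF is steep."""
    return percentile_from_cdf(members, 0.75) - percentile_from_cdf(members, 0.25)

# Two hypothetical ensembles (deg C): one spread out, one tightly clustered.
broad = [17.2, 18.1, 18.4, 19.0, 19.3, 19.8, 20.5, 21.1, 21.6, 23.0]
sharp = [19.0, 19.2, 19.3, 19.4, 19.5, 19.5, 19.6, 19.7, 19.8, 20.0]

print(percentile_from_cdf(broad, 0.5), interquartile_range(broad))
print(percentile_from_cdf(sharp, 0.5), interquartile_range(sharp))
```

The tightly clustered ensemble has the smaller interquartile range, corresponding to a steeper CDF and a narrower PDF peak.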

In the upper frames of Fig8.1.4.2 the peak of the forecast PDF (red) is to the right of the peak of the M-climate PDF (blue), indicating that the forecast predicts warmer than normal conditions and the sharpness of the peak indicates fairly high probability.

In the lower frames of Fig8.1.4.2 the peak of the forecast PDF (red) is to the left of the peak of the M-climate PDF (blue), indicating that the forecast predicts colder than normal conditions and the sharpness of the peak indicates high probability.

In certain situations the distribution of possible outcomes can have two favoured solutions.  We call this "bimodality".  On a PDF this is clearly shown by two peaks.  On a CDF curve it appears as a step: a flatter section between two steeper sections.  A scenario in which bimodal solutions are sometimes seen is for the maximum wind gust parameter, close to the track of an active, small-scale frontal wave cyclone.  North of the track relatively light winds are favoured, whilst south of the track very strong winds are favoured.  Values in between may be less likely overall.
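A simple way to look for bimodality in a set of ensemble values is to bin them and count the peaks. The sketch below is only illustrative: the gust values, bin settings and function names are hypothetical, and the strict peak test would miss a peak spread evenly over two adjacent bins.

```python
from itertools import accumulate

def histogram_counts(values, lo, width, nbins):
    """Count values falling in each of nbins bins of equal width starting at lo."""
    counts = [0] * nbins
    for v in values:
        i = int((v - lo) // width)
        if 0 <= i < nbins:
            counts[i] += 1
    return counts

def local_peaks(counts):
    """Indices of bins strictly higher than both neighbours (edges padded with 0)."""
    padded = [0] + counts + [0]
    return [
        i - 1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    ]

# Hypothetical 20-member ensemble of maximum wind gusts (m/s) with two clusters.
gusts = [8, 9, 10, 10, 11, 11, 12, 13, 15, 24,
         26, 27, 28, 28, 29, 29, 30, 31, 32, 34]

counts = histogram_counts(gusts, lo=5, width=5, nbins=6)  # bins 5-10, 10-15, ..., 30-35
peaks = local_peaks(counts)

# Cumulative proportion at the top of each bin: the CDF rises slowly between the peaks.
cdf_at_bin_tops = [c / len(gusts) for c in accumulate(counts)]
print(counts, peaks, cdf_at_bin_tops)
```

Two peak bins separated by sparsely populated bins correspond to the two-peaked PDF, and the slow rise of the cumulative proportion across those sparse bins corresponds to the step in the CDF.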

Fig8.1.4.3: The example PDF diagram indicates that the ENS members are widely distributed but cluster around two distinct, more likely wind speeds: one set suggests a most probable wind speed centred around the peak at W1 and a second set suggests a most probable wind speed centred around the peak at W2.  The associated example CDF shows the probability (i.e. the proportion of ENS members) of not exceeding a given wind speed.  The CDF increases until the first peak of the PDF is reached at W1, then flattens out as few additional ENS members show slightly higher wind speeds, before becoming steeper again with the increasing number of ENS members forecasting the higher wind speeds at W2.  This pattern can occur, for example, when there is uncertainty whether a depression will pass one side or the other of the location in question.  The diagrams say nothing about the direction of the winds (e.g. they may be moderate easterlies to the north of the location or strong westerlies to the south, in the Northern Hemisphere) nor about the timing of the depression (e.g. it may be slower or faster).  The diagrams only give information on the variation among the ENS member solutions.

The third diagram shows the forecast and M-climate CDFs for maximum wind gusts at 45.9°N 45.28°W, valid for the 24 hours from Saturday 24 March 2018 00 UTC to Sunday 25 March 2018 00 UTC.  The traces show CDFs at this location from a series of recent ensemble forecasts for this period, and the black line is the M-climate.  The red (last) trace shows a flat interval (at about 57% probability of not exceeding 20 m/s gusts), indicating a bimodal structure of the PDF.
