Warning

Any macro that computes a difference to analyses needs to move to the last exercise on 'Forecast differences'.

Any macro that uses analysis to plot could use HRES forecast instead?

Available plot types

Warning

these need updating


Panel

For these exercises please use the Metview icons in the row labelled 'ENS'.

ens_rmse.mv : this is similar to the hres_rmse.mv in the previous exercise. It will plot the root-mean-square-error growth for the ensemble forecasts.

ens_to_an.mv : this will plot (a) the mean of the ensemble forecast, (b) the ensemble spread, (c) the HRES deterministic forecast and (d) the analysis for the same date.

ens_to_an_runs_spag.mv : this plots a 'spaghetti map' for a given parameter for the ensemble forecasts compared to the analysis. Another way of visualizing ensemble spread.

stamp.mv : this plots all of the ensemble forecasts for a particular field and lead time. Each forecast is shown in a stamp sized map. Very useful for a quick visual inspection of each ensemble forecast.

stamp_diff.mv : similar to stamp.mv except that for each forecast it plots a difference map from the analysis. Very useful for quick visual inspection of the forecast differences of each ensemble forecast.

 

Additional plots for further analysis:

pf_to_cf_diff.mv : this useful macro allows two individual ensemble forecasts to be compared to the control forecast. As well as plotting the forecasts from the members, it also shows a difference map for each.

ens_to_an_diff.mv : this will plot the difference between the ensemble control, ensemble mean or an individual ensemble member and the analysis for a given parameter.

...

Panel
borderColorred

Q. How do the ensemble mean MSLP and Z500 fields compare to the HRES forecast and analysis?
Q. Examine the initial diversity in the ensemble and how the ensemble spread and error growth develop.  What do the extreme forecasts look like?
Q. Are there any members that consistently provide a better forecast?
Q. Comparing the two ensembles, ens_oper and ens_2016, which is the better ensemble for this case study?

Task 1:

...

This is similar to task 1 in exercise 2, except the RMSE curves for all the ensemble members from a particular forecast will be plotted.

Right-click the ens_rmse.mv icon, select 'Edit' and plot the curves for 'mslp' and 'z500'.

Change 'expID' for your choice of ensemble.

Code Block
languagebash
titleMake sure 'clustering' is off for this task!
clustersId="off"

Clustering will be used in later tasks.
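As a rough illustration of what the RMSE curves represent, the following Python sketch computes one curve per ensemble member. It is not the code inside ens_rmse.mv: the function names and the toy values are invented for illustration, and the RMSE is unweighted (no area weighting), which is an assumption.

```python
import math

def rmse(forecast, analysis):
    """Unweighted root-mean-square error between two flattened fields."""
    if len(forecast) != len(analysis):
        raise ValueError("fields must have the same number of grid points")
    squared = [(f - a) ** 2 for f, a in zip(forecast, analysis)]
    return math.sqrt(sum(squared) / len(squared))

def rmse_curves(members, analyses):
    """members[m][t] and analyses[t] are flattened fields.

    Returns one RMSE curve (a list over forecast steps) per member.
    """
    return [[rmse(field, analyses[t]) for t, field in enumerate(member)]
            for member in members]

# Toy data (invented): 2 members, 2 forecast steps, 2 grid points, in hPa.
analyses = [[1000.0, 1002.0], [998.0, 1001.0]]
members = [
    [[1000.0, 1002.0], [1001.0, 1005.0]],  # hypothetical member 1
    [[1001.0, 1001.0], [998.0, 1001.0]],   # hypothetical member 2
]
curves = rmse_curves(members, analyses)
```

Each inner list in `curves` corresponds to one line in the plume plot; members whose curve stays below the control forecast's curve are the ones with the lower RMS error.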

Panel
borderColorred

Q. How do the HRES, ensemble control forecast and ensemble mean compare?
Q. How do the ensemble members behave, do they give better or worse forecasts?

There might be some evidence of clustering in the ensemble plumes.

There might be some individual forecasts that give a lower RMS error than the control forecast.

If time:

  • Explore the plumes from other variables.
  • Do you see the same amount of spread in RMSE from other pressure levels in the atmosphere?

Task 2: Ensemble spread

In the previous task, we saw that uncertainty in the forecast, arising from the different initial conditions and the stochastic parameterizations, can result in significant differences in the RMSE (for this particular case and geographical region).


Ensemble spread

Warning

need to remove plotting the analysis from ens_to_an

This task will explore the 'ensemble spread'.

Use the ens_to_an.mv icon and plot the MSLP and z500. This will produce plots showing: the mean of  all the ensemble forecasts, the spread of the ensemble forecasts and the operational HRES deterministic forecast.

Change 'expId' if required to select either the 2012 ensemble expId="ens_oper" or the reforecast ensemble expId="ens_2016".

Animate this plot to see how the spread grows.
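The ensemble mean and spread shown in these plots can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spread is taken here as the population standard deviation of the members about the ensemble mean at each grid point, which may differ from what ens_to_an.mv actually does, and the function name and values are invented.

```python
import statistics

def ensemble_mean_and_spread(members):
    """members[m][i] is the value of member m at grid point i.

    Returns (mean, spread) as two lists over grid points. The spread is the
    population standard deviation across members (an assumption).
    """
    points = list(zip(*members))  # regroup values by grid point
    mean = [statistics.fmean(p) for p in points]
    spread = [statistics.pstdev(p) for p in points]
    return mean, spread

# Toy MSLP values (invented), 3 members at 2 grid points, in hPa.
members = [
    [1000.0, 1010.0],
    [1004.0, 1010.0],
    [1008.0, 1010.0],
]
mean, spread = ensemble_mean_and_spread(members)
```

At the second grid point all members agree, so the spread is zero; at the first the members differ and the spread is non-zero. Spread growing with forecast step is what the animation shows.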

This macro can also be used to look at clusters of ensemble members. It will be used later in the clustering tasks. For this task, make sure all the members of the ensemble are used.

Code Block
languagebash
titleUse all ensemble members in this task:
#ENS members (use ["all"] or a list of members like [1,2,3])
members=["all"]        #[1,2,3,4,5] or ["all"] or ["cl.example.1"]


Panel
borderColorred

Q. How does the mean of the ensemble forecasts compare to the HRES & analysis?
Q. Does the ensemble spread capture the error in the forecast?
Q. What other comments can you make about the ensemble spread?

Task 3: Spaghetti plots - another way to visualise spread


Warning

remove analysis from the spaghetti plots, plot HRES instead


A "spaghetti" plot is where a single contour of a parameter is plotted for all ensemble members. It is another way of visualizing the differences between the ensemble members and focussing on features.

Use the ens_to_an_runs_spag.mv icon. Plot and animate the MSLP and z500 fields using a suitable contour level of your choice. Find a value that highlights the low pressure centres. Note that not all members may reach the low pressure value set by the contour.

The red contour line shows the control forecast of the ensemble.

Note that this macro may animate slowly because of the computations required.

Experiment with changing the contour value and (if time) plotting other fields.
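To see why some members may be missing from a spaghetti plot, the short Python sketch below lists which members reach a chosen MSLP contour level. The member numbers, the minimum pressures and the function name are all invented for illustration; the macro itself works on the full fields.

```python
def members_reaching_contour(min_mslp_by_member, contour_level):
    """Return the member numbers whose minimum MSLP is at or below the level."""
    return sorted(m for m, value in min_mslp_by_member.items()
                  if value <= contour_level)

# Hypothetical minimum MSLP (hPa) found in the plotted area for each member.
min_mslp = {1: 992.5, 2: 1001.0, 3: 996.0, 4: 1003.2}

reached = members_reaching_contour(min_mslp, 1000.0)
```

Lowering the contour level reduces the number of members that draw a contour, which is one reason to experiment with the value.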

Task 4: Visualise ensemble members and differences

So far we have been looking at reducing the information in some way to visualise the ensemble.

Stamp maps are used to visualise all the ensemble members as normal maps. These are small, stamp sized contour maps plotted for each ensemble member using a small set of contours.

There are two icons to use, stamp.mv and stamp_diff.mv.

Use stamp.mv to plot the MSLP and z500 fields in the ensemble.

The stamp map is slow to plot as it reads a lot of data. Rather than animate each forecast step, a particular date can be set by changing the 'steps' variable.

Code Block
languagebash
titleSet date/time to 24-09-2012 00Z
#Define forecast steps
steps=[2012-09-24 00:00,"to",2012-09-24 00:00,"by",6]

Make sure clustersId="off" for this task.

Precipitation over France

Use stamp.mv and plot total precipitation ('tp') over France (mapType=2) for 00Z 24-09-2012 (compare with Figure 2 in Pantillon et al.).

Note, stamp_diff.mv cannot be used for 'tp' as there is no precipitation data in the analyses.

Difference stamp maps

Use the stamp_diff.mv plot to look at the differences between the ensemble members and the analysis. It can be easier to understand the difference in the ensembles by using difference stamp maps.

Panel
borderColorred

Q. Using the stamp and stamp difference maps, study the ensemble. Identify which ensemble members produce "better" forecasts.
Q. Can you see any distinctive patterns in the difference maps?

Compare ensemble members to analysis

After visualizing the stamp maps, it can be useful to animate a comparison of individual ensemble members to the analyses.

ens_to_an_diff.mv and pf_to_cf_diff.mv can be used to compare ensemble members.

Panel
titleUse ens_to_an_diff to compare an ensemble member to the analysis

To animate the MSLP difference between ensemble member 30 and the analysis, edit the lines:

Code Block
param="mslp"
ensType="pf30"

To compare the control forecast:

Code Block
ensType="cf"


Further analysis using ensembles
Panel
titleUse pf_to_cf_diff.mv to compare two ensemble members to the control forecast

This will show the forecasts from the ensemble members and also their difference with the ensemble control forecast.

To animate the difference in MSLP with ensemble members '30' and '50', set:

Code Block
param="mslp"
pf=[30,50]



Panel
titleSea-surface temperature

Compare the SST parameter used for the ens_oper and ens_2016 ensemble forecasts. The 2016 reforecast of this case study used a coupled ocean model, unlike the 2012 ensemble and HRES forecasts, which used climatology for the first 5 days.


Panel
titleCross-sections of ensemble members

To show a cross-section of a particular ensemble member, use the macro 'ens_xs.mv'.

This works in the same way as the an_xs.mv and hres_xs.mv macros.


Panel
titleIdentifying sensitive region for better forecasts

Find ensemble members that appear to produce a better forecast and look to see how the initial development in these members differs.

  • Select 'better' forecasts using the stamp plots and modify the list of ensemble members plotted by ens_to_an.mv accordingly.
  • Use pf_to_cf_diff.mv and ens_to_an_diff.mv to plot the differences between these perturbed ensemble members and the control forecast and the analyses.
Panel
borderColorred

Q. Can you tell which area is more sensitive for the forecast?


 

Task 5:  Cumulative distribution function

Recap

The probability density function of the normal (Gaussian) distribution. The probabilities, expressed as a percentage for various widths of standard deviation (σ), represent the area under the curve.

Image Added

Figure from Wikipedia.


Cumulative distribution function for a normal distribution with varying standard deviation (σ)

Image Added

Figure from Wikipedia.

Cumulative distribution function (CDF)

The figures above illustrate the relationship between a normal distribution and its associated cumulative distribution function. The CDF is constructed from the area under the probability density function.

The CDF gives the probability that a value drawn from the distribution is less than or equal to the corresponding value on the x-axis. For example, in the figure, the probability for values less than or equal to X=0 is 50%.

The shape of the CDF curve is related to the shape of the normal distribution. The width of the CDF curve is directly related to the standard deviation of the probability density function.

For an ensemble, the width is therefore related to the 'ensemble spread'.

For a forecast ensemble where all values were the same, the CDF would be a vertical straight line.
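For an ensemble, the CDF at a point is built from a finite set of member values rather than a smooth curve. The Python sketch below shows one simple way to construct such an empirical CDF. It is an illustration only and is not taken from cdf.mv; the function names and the MSLP values are invented.

```python
def empirical_cdf(values):
    """Return the sorted member values and the fraction of members <= each."""
    ordered = sorted(values)
    n = len(ordered)
    probs = [(i + 1) / n for i in range(n)]
    return ordered, probs

def prob_at_or_below(values, x):
    """Fraction of ensemble members with a value less than or equal to x."""
    return sum(1 for v in values if v <= x) / len(values)

# Invented MSLP values (hPa) from a 4-member ensemble at one location.
mslp = [1002.0, 998.0, 1005.0, 1000.0]
ordered, probs = empirical_cdf(mslp)

# An ensemble with no spread: the CDF jumps from 0 to 1 at a single value.
identical = [1000.0] * 4
```

With `identical`, the probability is 0 just below 1000 hPa and 1 at 1000 hPa, which corresponds to the vertical straight line described above. The wider the range between the first and last entries of `ordered`, the larger the ensemble spread.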

Plot the CDFs

Image Added

This exercise uses the cdf.mv icon. Right-click, select 'Edit' and then:

  • Plot the CDF of MSLP for Toulouse for your choice of ensemble
  • Find a latitude/longitude point in the area of intense precipitation at 12Z on 24/9/2012 (see Figure 2(c) in Pantillon et al.) and plot the CDF for MSLP (set station=[lat,lon] in the macro cdf.mv)

Note that only MSLP, 2m temperature (t2) and 10m wind-speed (speed10) are available for the CDF.

Make sure useClusters='off'.

Panel
borderColorred

Q. Compare the CDF from the different forecast ensembles; what can you say about the spread?

Exercise 4: Cluster analysis

The paper by Pantillon et al. describes the use of clustering to identify the main scenarios among the ensemble members.

This exercise repeats some of the plots from the previous one but this time with clustering enabled.

Using clustering will highlight the ensemble members in each cluster in the plots.

In this exercise you will:

  • Construct your own qualitative clusters by choosing members for two clusters
  • Generate clusters using principal component analysis (similar to Pantillon et al).

Task 1: Create your own clusters

Clusters can be created manually from lists of the ensemble members.

Choose members for two clusters. The stamp maps are useful for this task.

From the stamp map of z500 at 24/9/2012 (t+96), identify ensemble members that represent the two most likely forecast scenarios.

It is usual to create clusters from z500 as it represents the large-scale flow and is not a noisy field. However, for this particular case study, the stamp map of 'tp' (total precipitation) over France is also very indicative of the distinct forecast scenarios.

Panel
titleCreate your own clusters

Right-click 'ens_oper_cluster.example.txt' and select Edit (or make a duplicate)

The file contains two example lines:

Code Block
1#   2  3  4  9  22 33 40
2#   10 11 12 31 49

The first line defines the list of members for 'Cluster 1': in this example, members 2, 3, 4, 9, 22, 33, 40.

The second line defines the list of members for 'Cluster 2': in this example, members 10, 11, 12, 31, 49.
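The two-line format above is simple enough to read by hand, but a small Python sketch makes the structure explicit: a cluster number, a '#', then the member numbers. The parser below is hypothetical and is not part of the course macros; it is only meant to show how the file is interpreted.

```python
def parse_cluster_definitions(text):
    """Parse lines such as '1#   2  3  4' into {cluster_number: [members]}."""
    clusters = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        label, _, members = line.partition("#")
        clusters[int(label)] = [int(m) for m in members.split()]
    return clusters

# The two example lines from the cluster definition file.
example = "1#   2  3  4  9  22 33 40\n2#   10 11 12 31 49\n"
clusters = parse_cluster_definitions(example)

# A member should normally belong to one cluster only.
overlap = set(clusters[1]) & set(clusters[2])
```

Checking that `overlap` is empty is a quick way to catch a member number typed into both lines by mistake.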

Change these two lines!
Put your choice of ensemble member numbers for clusters 1 and 2 (lines 1 and 2 respectively).

You can create multiple cluster definitions by using the 'Duplicate' menu option to make copies of the file for use in the plotting macros.

The filename is important!
The first part of the name 'ens_oper' refers to the ensemble dataset and must match the name used in the plotting macro. 
The 'example' part of the filename can be changed to your choice and should match the 'clustersId'.
As an example a filename of: ens_both_cluster.fred.txt would require 'expId=ens_both', 'clustersId=fred' in the macro.
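The naming rule can be summarised as `<expId>_cluster.<clustersId>.txt`. The Python sketch below splits a filename into the two settings that must be used in the macro. The helper is hypothetical and only illustrates the convention described here.

```python
def cluster_settings_from_filename(filename):
    """Split '<expId>_cluster.<clustersId>.txt' into (expId, clustersId)."""
    suffix = ".txt"
    marker = "_cluster."
    if not filename.endswith(suffix) or marker not in filename:
        raise ValueError("not a cluster definition filename: " + filename)
    exp_id, clusters_id = filename[:-len(suffix)].split(marker, 1)
    return exp_id, clusters_id

settings = cluster_settings_from_filename("ens_both_cluster.fred.txt")
```

For the example file supplied with the exercise, 'ens_oper_cluster.example.txt', the same rule gives expId 'ens_oper' and clustersId 'example'.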


Panel
titlePlot ensembles with your cluster definitions

Use the clusters of ensemble members you have created in ens_oper_cluster.example.txt.

Set clustersId='example' in each of the ensemble plotting macros to enable cluster highlighting.

Replot ensembles:

RMSE: plot the RMSE curves using ens_rmse.mv. This will colour the curves differently according to which cluster they are in.

Stamp maps: the stamp maps will be reordered such that the ensemble members are grouped according to their cluster. This applies to stamp.mv and stamp_diff.mv, and makes it easier to see the forecast scenarios according to your clustering.

Spaghetti maps: with clusters enabled, two additional maps are produced which show the contour lines for each cluster. The spaghetti maps are similar to Figure 10. in Pantillon et al.


Panel
titlePlot maps of parameters as clusters

The macro cluster_to_an.mv can be used to plot maps of parameters as clusters and compare them to the analysis and HRES forecasts.

Use cluster_to_an.mv to plot z500 maps of your two clusters (equivalent to Figure 7 in Pantillon et al.)

If your cluster definition file is called 'ens_oper_cluster.example.txt', then edit cluster_to_an.mv and set:

Code Block
languagebash
#ENS members (use ["all"] or a list of members like [1,2,3])
members_1=["cl.example.1"]
members_2=["cl.example.2"]

If your cluster definition file has another name, e.g. ens_oper_cluster.fred.txt, then set members_1=["cl.fred.1"].

Plot other parameters:

Plot total precipitation for France (mapType=2). Compare with Figure 8. in Pantillon et al.


Panel
borderColorred

Q. Experiment with the choice of members in each cluster and plot z500 at t+96 (Figure 7 in Pantillon et al.). How similar are your cluster maps?
Q. What date/time does the impact of the different clusters become apparent?
Q. Are two clusters enough? Where do the extreme forecasts belong?

Task 2: Empirical orthogonal functions / Principal component analysis

A quantitative way of clustering an ensemble is by computing empirical orthogonal functions from the differences between the ensemble members and the control forecast.

Although geopotential height at 500hPa at 00 24/9/2012 is used in the paper by Pantillon et al., the steps described below can be used for any parameter at any step.
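The idea can be sketched with NumPy as follows. This is an illustration under stated assumptions and is not the code in eof.mv: the anomalies are taken as member minus control with no further mean removal or area weighting, the EOFs are obtained from a singular value decomposition, and the function name and toy data are invented.

```python
import numpy as np

def eofs_from_control(members, control, n_modes=2):
    """members: (n_members, n_points) array; control: (n_points,) array.

    Returns the leading EOF patterns, the principal components of each
    member, and the fraction of variance explained by each mode.
    """
    anomalies = np.asarray(members, dtype=float) - np.asarray(control, dtype=float)
    # Rows of vt are orthonormal spatial patterns (the EOFs);
    # u * s gives the projection of each member onto each pattern.
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    pcs = u * s
    return vt[:n_modes], pcs[:, :n_modes], explained[:n_modes]

# Toy ensemble (invented): 10 members built from two orthogonal patterns.
rng = np.random.default_rng(0)
control = np.zeros(6)
pattern_a = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
pattern_b = np.array([1.0, -1.0, 0.0, 0.0, 1.0, -1.0])
weights = rng.normal(size=(10, 2))
members = control + weights[:, :1] * pattern_a + weights[:, 1:] * pattern_b

eofs, pcs, explained = eofs_from_control(members, control)
```

Because the toy ensemble contains only two patterns, the first two modes explain all of the variance. In a real ensemble the explained fraction is smaller, and the rows of `pcs` are the coordinates that a clustering step would then group.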

Image Added

The eof.mv macro computes the EOFs and the clustering.

Warning

Always run eof.mv first for a given parameter, step and ensemble forecast (e.g. ens_oper or ens_2016) to create the cluster file.

Otherwise cluster_to_an.mv and other plots with clustering enabled will fail or plot with the wrong clustering of ensemble members.

If you change step or ensemble, recompute the EOFs and cluster definitions using eof.mv. Note however, that once a cluster has been computed, it can be used for all steps with any parameter.


Panel
titleCompute EOFs and clusters

Edit 'eof.mv'

Set the parameter to use, choice of ensemble and forecast step required for the EOF computation:

Code Block
param="z500"
expId="ens_oper"
steps=[2012-09-24 00:00]

Run the macro.

The above example will compute the EOFs of geopotential height anomaly at 500hPa using the 2012 operational ensemble at forecast step 00Z on 24/09/2012.

A plot will appear showing the first two EOFs (similar to Figure 5 in Pantillon et al.)

The geographical area for the EOF computation is: 35-55N, 10W-20E (same as in Pantillon et al). If desired it can be changed in eof.mv.


Panel
titleEOF cluster definition file

The eof.mv macro will create a text file with the cluster definitions, in the same format as described above in the previous task.

The filename will be different: it will contain 'eof' to indicate that it was created using empirical orthogonal functions.

Code Block
languagebash
titleCluster filename created for ensemble 'ens_oper' using eof.mv
ens_oper_cluster.eof.txt

If a different ensemble forecast is used, for example ens_2016, the filename will be: ens_2016_cluster.eof.txt

This cluster definition file can then be used to plot any variable at all steps (as for task 1).



Panel
borderColorred

Q. What do the EOFs plotted by eof.mv show?
Q. Change the parameter used for the EOF (try the 'total precipitation' field). How does the cluster change?

 

Panel
titlePlot ensemble and cluster maps

Use the cluster definition file computed by eof.mv to plot the ensembles and maps with clusters enabled (as described for task 1, but this time with the 'eof' cluster file).

The macro cluster_to_an.mv can be used to plot maps of parameters as clusters and compare them to the analysis and HRES forecasts.

Use cluster_to_an.mv to plot z500 and MSLP maps of the two clusters created by the EOF/PCA analysis (equivalent to Figure 7 in Pantillon et al.)

Edit cluster_to_an.mv and set:

Code Block
languagebash
#ENS members (use ["all"] or a list of members like [1,2,3])
members_1=["cl.eof.1"]
members_2=["cl.eof.2"]

Run the macro.

If time, also look at the total precipitation (tp) over France and PV/320K.

 

 

From Figure 7 in Pantillon et al. we see that cluster 1 corresponds to a cut-off low moving eastward over Europe and cluster 2 to a weak ridge over western Europe. Cluster 1 exhibits a weak interaction between Nadine and the cut-off low over Europe. In cluster 2, there is a strong interaction between the cut-off low and Nadine, in which Nadine makes landfall over the Iberian peninsula.
Panel
borderColorred

Q. How similar are the PCA-computed clusters to your manual clustering?
Q. Which cluster best represents the analysis?
Q. How useful is the cluster analysis as an aid to forecasting for HyMEX?
Q. Change the date/time used to compute the clusters. How does the variance explained by the first two clusters change?  Is geopotential the best parameter to use?


Panel
titleCluster method code

For those interested:

The code that computes the clusters can be found in the Python script aux/cluster.py.

This uses the 'ward' cluster method from SciPy. Other cluster algorithms are available. See http://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html#scipy.cluster.hierarchy.linkage

The Python code can be changed to use a different algorithm, or the more adventurous can write their own cluster algorithm!
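As a rough illustration of the 'ward' method mentioned above, the following is a minimal sketch using SciPy's linkage and fcluster functions. It is not the content of aux/cluster.py: the array pcs and its values are hypothetical stand-ins for the principal-component coordinates of six ensemble members, and the choice of two clusters simply mirrors the two clusters discussed in this exercise.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical principal-component coordinates (one row per ensemble member).
# These values are invented for illustration only.
pcs = np.array([
    [-2.0, 0.1],
    [-1.8, -0.2],
    [-2.1, 0.0],
    [1.9, 0.2],
    [2.2, -0.1],
    [2.0, 0.1],
])

# Build the hierarchical cluster tree with Ward's minimum-variance criterion.
tree = linkage(pcs, method="ward")

# Cut the tree so that at most two clusters remain; labels start at 1.
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

Changing the method argument of linkage (for example to "average" or "complete") is one way to try a different algorithm.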

Exercise 5. Percentiles and probabilities

To further compare the 2012 and 2016 ensemble forecasts, plots showing the percentile amount and probabilities above a threshold can be made for total precipitation.

Use the percentile_tp_compare.mv and prob_tp_compare.mv icons.

Both these macros will use the 6-hourly total precipitation for forecast steps at 90, 96 and 102 hours, plotted over France.

Task 1: Plot percentiles of total precipitation

Edit the percentile_tp_compare.mv icon.

Set the percentile for the total precipitation to 75%:

Code Block
languagebash
#The percentile of ENS precipitation forecast
perc=75

Run the macro and compare the percentiles from both forecasts. Change the percentile to see how the forecasts differ.
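To make the idea of an ensemble percentile concrete, here is a minimal Python sketch for a single grid point. It assumes linear interpolation between sorted member values, which is one common convention and not necessarily the one used by percentile_tp_compare.mv; the function name and the precipitation values are hypothetical.

```python
def percentile(values, perc):
    """Percentile of a list of values using linear interpolation
    between the sorted values (an assumed convention)."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= perc <= 100:
        raise ValueError("perc must lie between 0 and 100")
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted list
    rank = (perc / 100.0) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

# Hypothetical 6-hourly precipitation (mm) from five ensemble members
members_tp = [0.0, 40.0, 10.0, 30.0, 20.0]
perc = 75
print(percentile(members_tp, perc))
```

With perc=75, three quarters of the way through the sorted member values gives the amount that a quarter of the members exceed.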

Task 2: Plot probabilities of total precipitation

This macro will produce maps showing the probability of 6-hourly precipitation for the same area as in Task 1.

In this case, the maps show the probability that total precipitation exceeds a threshold expressed in mm.

Edit prob_tp_compare.mv and set the threshold to 20 mm:

Code Block
languagebash
#The probability of precipitation greater than
prob=20

Run the macro and view the map. Try changing the threshold value and rerunning the macro.
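The probability shown on such a map can be thought of as the fraction of ensemble members that exceed the threshold at each grid point. The Python sketch below illustrates this for one grid point; it is an assumption about the general approach rather than the code of prob_tp_compare.mv, and the function name and values are hypothetical.

```python
def exceedance_probability(values, threshold):
    """Percentage of ensemble members with a value greater than the threshold."""
    if not values:
        raise ValueError("values must not be empty")
    exceeding = sum(1 for v in values if v > threshold)
    return 100.0 * exceeding / len(values)

# Hypothetical 6-hourly precipitation (mm) from four ensemble members
members_tp = [5.0, 25.0, 30.0, 10.0]
prob = 20  # threshold in mm, named after the macro variable
print(exceedance_probability(members_tp, prob))
```

Raising the threshold can only lower the probability, since fewer members exceed it.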

Panel
borderColorred

Q. Using these two macros, compare the 2012 and 2016 forecast ensembles. Which was the better forecast for HyMEX flight planning?


Exercise 6. Assessment of forecast errors

This exercise introduces various methods for presenting the forecast error.

Panel

hres_rmse.mv : this plots the root-mean-square-error growth curves for the operational HRES forecast compared to the ECMWF analyses.

hres_to_an_diff.mv : this plots a single parameter as a difference map between the operational HRES forecast and the ECMWF analysis. Use this to understand the forecast errors.

Task 1: Forecast error

In this task, we'll look at the difference between the forecast and the analysis by using "root-mean-square error" (RMSE) curves as a way of summarising the performance of the forecast.

Root-mean-square error curves are a standard measure of forecast error relative to the analysis, and several of the exercises use them. The RMSE is computed by taking the square root of the mean of the squared differences between the HRES forecast and the analysis over the chosen region. RMSE of the 500hPa geopotential is a standard measure for assessing forecast model performance at ECMWF (for more information see: http://www.ecmwf.int/en/forecasts/quality-our-forecasts).
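For reference, the RMSE calculation can be sketched in a few lines of Python. This is a simplified, unweighted version given for illustration only: it treats every grid point equally, whereas the macro may weight points by area, and the function name and field values are hypothetical.

```python
import math

def rmse(forecast, analysis):
    """Root-mean-square error between two equally sized lists of grid-point values."""
    if len(forecast) != len(analysis) or not forecast:
        raise ValueError("forecast and analysis must be non-empty and the same size")
    # Square each grid-point difference, average, then take the square root
    squared_diffs = [(f - a) ** 2 for f, a in zip(forecast, analysis)]
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))

# Hypothetical values of one field at four grid points
forecast = [5500.0, 5520.0, 5480.0, 5510.0]
analysis = [5490.0, 5530.0, 5480.0, 5500.0]
print(rmse(forecast, analysis))
```

Repeating the calculation for each forecast step gives one point per step on the RMSE curve.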

Right-click the hres_rmse.mv icon, select 'Edit' and plot the RMSE curve for z500.

Repeat for the mean-sea-level pressure (mslp).

Repeat for both geographical regions: mapType=1 (Atlantic) and mapType=2 (France).

Panel
borderColorred

Q. What do the RMSE curves show?
Q. Why are the curves different between the two regions?

Task 2: Compare forecast to analysis

Use the hres_to_an_diff.mv icon and plot the difference map between the HRES forecast and the analysis for z500 and mslp.

Panel
borderColorred

Q. What differences can be seen?
Q. How well did the forecast position the hurricane and the cut-off North Atlantic low?

If time: look at other fields to study the behaviour of the forecast.

Task 3: RMSE "plumes" for the ensemble

This is similar to task 1 in exercise 2, except the RMSE curves for all the ensemble members from a particular forecast will be plotted.

Right-click the ens_rmse.mv icon, select 'Edit' and plot the curves for 'mslp' and 'z500'.

Change 'expID' for your choice of ensemble.

Code Block
languagebash
titleMake sure 'clustering' is off for this task!
clustersId="off"

Clustering will be used in later tasks.

Repeat for both geographical regions: mapType=1 (Atlantic) and mapType=2 (France).


...

Q. How do the HRES, ensemble control forecast and ensemble mean compare?
Q. How do the ensemble members behave? Do they give better or worse forecasts?

There might be some evidence of clustering in the ensemble plumes.

There might be some individual forecasts that give a lower RMS error than the control forecast.

If time:

  • Explore the plumes from other variables.
  • Do you see the same amount of spread in RMSE from other pressure levels in the atmosphere?


Appendix

Further reading

For more information on the stochastic physics scheme in (Open)IFS, see the article:

...