Figure II.2.8 The problem of interpolation vs. extrapolation when using a spatially biased dataset.
Such frequency curves easily reveal the bias usually found in mountainous areas. In flat areas
such as Denmark this problem seldom occurs: the variability of the terrain is low, and the
station density is high. The highest station is located at 166 m a.s.l., a few metres higher
than the highest grid cell. A station above the highest grid cell is also quite common in
Alpine countries, where observatories are located on mountain peaks.
Another way to decide whether the sample is representative is to compare the actual station
distribution with that of a spatially stationary random process following a multidimensional
Poisson law.
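As an illustration, such a comparison can be sketched as a Monte-Carlo test of the mean nearest-neighbour distance against repeated simulations of a spatially uniform process. The Python sketch below is not from the report: the domain, station count and clustered coordinates are synthetic assumptions.

```python
# Sketch: test whether a station network deviates from complete spatial
# randomness (a homogeneous Poisson process) using the mean nearest-
# neighbour distance and a Monte-Carlo reference distribution.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(42)

def mean_nn_distance(points):
    """Mean distance from each point to its nearest neighbour."""
    tree = cKDTree(points)
    d, _ = tree.query(points, k=2)      # k=2: the first hit is the point itself
    return d[:, 1].mean()

# Hypothetical station coordinates (km) in a 500 x 500 km domain,
# clustered towards the lower-left corner to mimic a biased network.
stations = rng.random((80, 2)) ** 2 * 500.0

observed = mean_nn_distance(stations)

# Reference distribution under a spatially uniform model with the same
# number of points in the same domain.
simulated = np.array([
    mean_nn_distance(rng.random((len(stations), 2)) * 500.0)
    for _ in range(999)
])
p_value = (np.sum(simulated <= observed) + 1) / (len(simulated) + 1)
print(f"mean NN distance: {observed:.1f} km, "
      f"one-sided p (clustering): {p_value:.3f}")
```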
II.2.7.2 Data representativity and atmospheric processes.
So far we have discussed spatial representativity, second-order spatial stationarity, the spatial
intrinsic hypothesis, etc. Many spatialisation applications in climatology and meteorology are,
however, based on a pre-defined linear model. Applying such models to temporal data raises
other concerns. One example is the question of whether a model established on data from a
period with a certain climate remains valid for future events, when the climate could be in a
different phase. Different atmospheric circulation patterns will, for example, have a significant
effect on the spatial structure functions in several regions of Europe. One specific problem is
the frequency of temperature inversions, which is strongly related to certain circulation
types. The occurrence of extreme climatic episodes will also be strongly affected by certain
atmospheric conditions.
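How such circulation-dependent spatial structure could be examined is sketched below: empirical semivariograms are computed separately for two hypothetical circulation types, one with a smooth field and one with strong small-scale variability. Stations, fields and distance bins are all synthetic placeholders.

```python
# Sketch: compare empirical semivariograms of a variable (e.g. daily mean
# temperature) under two circulation types. In practice the days would be
# grouped by a circulation classification; here the data are synthetic.
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
xy = rng.random((60, 2)) * 300.0            # hypothetical station coords (km)
h = pdist(xy)                               # pairwise station distances

def empirical_variogram(values, h, bins):
    """Binned semivariance 0.5 * mean((z_i - z_j)^2) per distance class."""
    g = 0.5 * pdist(values[:, None], metric="sqeuclidean")
    idx = np.digitize(h, bins)
    return np.array([g[idx == k].mean() for k in range(1, len(bins))])

bins = np.linspace(0.0, 300.0, 11)

# Synthetic fields: a "zonal" type with a gentle gradient, and an
# "inversion-prone" type with much stronger small-scale variability.
z_zonal = 0.02 * xy[:, 0] + rng.normal(0.0, 0.3, 60)
z_inv = 0.02 * xy[:, 0] + rng.normal(0.0, 2.0, 60)

print("gamma(h), zonal type:    ",
      np.round(empirical_variogram(z_zonal, h, bins), 2))
print("gamma(h), inversion type:",
      np.round(empirical_variogram(z_inv, h, bins), 2))
```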
Systematic changes in the observation networks should also be a concern. If the “universe”
differs between the calibration and the estimation period, the resulting model will differ as
well. This problem is also related to spatial homogeneity, a relatively new issue that
appeared when spatialisation methods began to be used to derive gridded climatological
datasets.
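A simple way to expose such a difference between calibration and estimation “universes” is a split-period check, sketched below with a synthetic temperature-elevation regression: the model is fitted in two periods and the coefficients are compared. All numbers are illustrative assumptions.

```python
# Sketch: check whether a pre-defined linear model (here a simple
# temperature-elevation regression) is stable between a calibration
# period and a later estimation period. Data are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(1)
elev = rng.uniform(0.0, 1500.0, 50)                 # station elevations (m)

def fit_lapse_rate(temps, elev):
    """Least-squares slope and intercept of temperature vs elevation."""
    slope, intercept = np.polyfit(elev, temps, deg=1)
    return slope, intercept

# Period A: well-mixed conditions, lapse rate about -6.5 K/km.
t_a = 15.0 - 0.0065 * elev + rng.normal(0.0, 0.5, 50)
# Period B: frequent inversions flatten (or reverse) the relationship.
t_b = 5.0 + 0.0010 * elev + rng.normal(0.0, 0.5, 50)

for name, t in [("calibration (A)", t_a), ("estimation (B)", t_b)]:
    slope, icpt = fit_lapse_rate(t, elev)
    print(f"{name}: lapse rate {slope * 1000:+.2f} K/km, intercept {icpt:.1f} C")
```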
Figure II.2.7 The upper panel shows the cumulative frequency functions of the altitude of
the meteorological network of the Nordic countries (red curve) and of the 1 × 1 km DEM
(GTOPO30, black curve). The lower panel shows the same for Denmark only.
Figure II.2.6 The left panel shows the areas of the Thiessen polygons of the meteorological
network used for the NORDGRID project. The right panel shows the standard deviation of the
terrain model (red colours indicate high standard deviation, yellow low), indicating the
variability of the terrain. Ideally, areas with high variability should have a higher station density.
That means that many of the assumptions underlying the spatialisation methods described in
the previous chapters are not fully fulfilled:
- Station networks are usually biased towards lower altitudes and populated areas.
- Station density varies, and uncertainty is a function of station density.
- In a sparse network, changes in the network will have consequences for the homogeneity of
gridded time series.
Uncertainty will therefore not be a fixed property in spatial analysis, but will vary in space
and time.
In meteorology and climatology, terrain characteristics are among the main explanatory
variables, and the input data should represent the same “universe” as the one to be
estimated. In an ideal world this means that the frequency distributions of the input data and
of the explanatory variables should coincide. Whether this is true can be investigated by
comparing the distribution functions directly, e.g. the altitude distribution of the terrain model
with the distribution of station elevations (Figure II.2.7). In mountainous areas, observations
are usually biased towards the lower elevations.
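A minimal sketch of such a comparison is given below, using a two-sample Kolmogorov-Smirnov test and a few quantiles. The elevation samples are synthetic stand-ins for the station metadata and the DEM cells.

```python
# Sketch: compare the elevation distribution of the station network with
# that of the DEM, as in Figure II.2.7. The elevations below are synthetic;
# in practice they would come from station metadata and the DEM grid.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
dem = rng.gamma(shape=2.0, scale=400.0, size=100_000)   # hypothetical DEM cells (m)
stations = rng.gamma(shape=2.0, scale=250.0, size=300)  # low-biased network (m)

# Two-sample Kolmogorov-Smirnov test: are the two samples drawn from the
# same elevation distribution?
ks = stats.ks_2samp(stations, dem)
print(f"KS statistic {ks.statistic:.3f}, p-value {ks.pvalue:.2e}")

# Quantiles make the bias explicit: the network under-samples high terrain.
for q in (0.5, 0.9, 0.99):
    print(f"q{int(q * 100):02d}: stations {np.quantile(stations, q):7.0f} m, "
          f"DEM {np.quantile(dem, q):7.0f} m")
```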
The consequence of such biased input data is a high risk of performing extrapolation instead
of interpolation, especially when using a model parameterised on a biased network. This
effect may even be amplified by the use of external predictors: defining the trend by e.g.
linear regression analysis will easily lead to estimates in the extrapolation domain, outside
the valid range of the model (Figure II.2.8).
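The sketch below illustrates the risk with a hypothetical lapse-rate regression fitted on a network confined to low elevations and then applied to the full elevation range of a DEM; every grid cell above the highest station is, strictly speaking, estimated by extrapolation.

```python
# Sketch: a lapse-rate regression fitted on a low-biased network and then
# applied to the full DEM elevation range. Grid cells above the highest
# station are estimated by extrapolation. Numbers are illustrative only.
import numpy as np

rng = np.random.default_rng(3)
stn_elev = rng.uniform(0.0, 800.0, 40)            # biased network: <= 800 m
stn_temp = 12.0 - 0.0065 * stn_elev + rng.normal(0.0, 0.4, 40)

slope, intercept = np.polyfit(stn_elev, stn_temp, deg=1)

grid_elev = np.linspace(0.0, 2500.0, 6)           # DEM reaches 2500 m
estimate = intercept + slope * grid_elev
outside = grid_elev > stn_elev.max()              # extrapolation domain

for e, t, flag in zip(grid_elev, estimate, outside):
    print(f"{e:6.0f} m: {t:6.2f} C {'(extrapolated)' if flag else ''}")
```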
II.2.7 Assessment of uncertainties in spatialisation.
In spatialisation, like in all other types of modelling, assessment of errors and uncertainties is
of major importance. There are several types of errors and uncertainties to consider, and also
several ways to address these.
First of all, it is necessary to distinguish between what is uncertainty and what is error. In this
section this distinction is discussed, and different procedures to address uncertainties and
errors are presented.
II.2.7.1 Data representativity, quality and reliability
Traditionally we differentiate between several types of errors. Meteorology and climatology
depend on observations made by some kind of measurement, direct or indirect. These
observations contain systematic and random measurement errors. Such errors will not be
discussed in depth in this report, as they are not part of the spatialisation itself, but they
should be considered by the individual researchers in order to:
- exclude data of insufficient quality;
- address the accuracy of measurements in order to understand the natural variability of the
data entered into the spatialisation algorithms.
Systematic errors include, for example, instruments that are not calibrated and therefore give
wrong readings, observations not carried out according to the guidelines, and errors
introduced by the computer programmes used in data processing. Such errors should ideally
be easy to detect in a state-of-the-art data quality control system.
Random errors are more difficult to detect. They are usually one-off errors for which the
reason is not clear; misreading by an observer is the classical example. Such errors are not
easy to discover when they lie within the range of the natural variability of the observed
element, since it is then hard to tell whether a value is natural or erroneous.
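One common screening approach for such errors is a spatial consistency check against neighbouring stations, sketched below. The network, the injected misreading and the z-score threshold are illustrative assumptions, not a prescription from this report.

```python
# Sketch: flag a possible random error by comparing each observation with
# the values at its nearest neighbouring stations (spatial consistency
# check). Thresholds and data are illustrative assumptions.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(4)
xy = rng.random((100, 2)) * 200.0                   # station coords (km)
obs = 10.0 + 0.01 * xy[:, 0] + rng.normal(0.0, 0.5, 100)
obs[17] += 8.0                                      # inject a gross misreading

tree = cKDTree(xy)
_, idx = tree.query(xy, k=6)                        # self + 5 neighbours
neigh = obs[idx[:, 1:]]                             # drop the station itself

resid = obs - neigh.mean(axis=1)
z = (resid - resid.mean()) / resid.std()
for i in np.flatnonzero(np.abs(z) > 4.0):
    print(f"station {i}: value {obs[i]:.1f} deviates {resid[i]:+.1f} "
          f"from neighbour mean (z = {z[i]:+.1f})")
```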
Representativity of the observation network is probably the most serious problem within
meteorology and climatology. Such networks usually have an irregular spatial distribution,
both in two and three dimensions. To complicate the issue further, station networks change
over time, both with respect to the locations of the individual stations and to the spatial
density of measurement sites.
There are several ways of addressing the irregularity of station networks. The two-dimensional
representativity of stations can, for example, be described by calculating Voronoï diagrams
(Thiessen polygons) around each station. One example is given in Figure II.2.6, where the
variation of the area surrounding each station is shown. This example is taken from the
NORDGRID project (Jansson et al., 2007) and shows quite clearly the different problems that
arise with station networks. Here the data networks of four countries are merged, revealing,
for instance, a tremendously dense network in the smallest country, Denmark, and a rather
sparse network in Finland. In addition, the variance of the terrain is smallest in the areas with
the densest station network (Figure II.2.6).
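The Voronoi computation itself is straightforward, as the sketch below shows for a synthetic network. Cells on the edge of the network are unbounded and are simply skipped here; a real analysis would clip them to the domain boundary.

```python
# Sketch: describe the 2-D representativity of a network with Voronoi
# (Thiessen) polygons, as in Figure II.2.6. Station coordinates are
# synthetic; regions touching the convex hull are unbounded and skipped.
import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(5)
xy = rng.random((40, 2)) * 300.0        # hypothetical station coords (km)
vor = Voronoi(xy)

def polygon_area(vertices):
    """Shoelace formula for a simple polygon given as an (n, 2) array."""
    x, y = vertices[:, 0], vertices[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

areas = []
for region_idx in vor.point_region:
    region = vor.regions[region_idx]
    if not region or -1 in region:      # unbounded cell at the network edge
        continue
    areas.append(polygon_area(vor.vertices[region]))

areas = np.array(areas)
print(f"{len(areas)} bounded cells, median {np.median(areas):.0f} km^2, "
      f"max/min area ratio {areas.max() / areas.min():.1f}")
```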
II.2.6.3 Dynamical Downscaling (RCM)
As the name implies, a regional climate model (RCM) does not attempt to simulate the entire
globe but only a portion thereof. Regional models use the same laws of physics, described in
terms of mathematical equations, as global models do. The technique of nested regional
climate modelling consists of using output from global model simulations to provide initial
conditions and time-dependent lateral meteorological boundary conditions to drive limited-area
model simulations for selected time slices of the global model run (Dickinson et al. 1989,
Giorgi et al. 1990, 1991). The technique essentially originated from numerical weather
prediction; however, the RCM is adapted to climate time scales, and compromises have thus
been made between horizontal and temporal resolution. For the commonly used 30-year time
slices (1961–1990 and 2071–2100), RCMs reach a horizontal resolution of about 20–50 km.
Nested regional climate modelling techniques can be used not only to downscale GCMs but
also to dynamically downscale reanalyses such as ERA40. Such runs are called “perfect
boundary condition runs” and are, among other applications, used to test the performance of
the RCM.
The main theoretical weakness of RCMs is that systematic errors of the GCM are handed
down to the RCM. Furthermore, depending on the domain size and the resolution, RCM
simulations can be as computationally demanding as GCMs are.
An RCM is the right choice if changes in variability and extremes are required as impact
input, and if the simulation is significantly more realistic at high resolution or is even only
available at high resolution. Furthermore, the use of an RCM is preferable for regional and
local impact assessment, especially in complex topography, where coastlines are important,
and in regions with highly heterogeneous land surface cover.
The physical modelling approach is mentioned here because it also has future possibilities
within climatological monitoring. There are also applications in which output fields from
such models form the first-guess field for objective interpolation (e.g. the Swedish
mesoscale analysis system MESAN; Häggmark et al., 1997).
II.2.6.1 Downscaling methods
Physically based Global Climate Models (GCMs) are among the most important tools for
providing climate scenario information. However, for many impact applications the
horizontal resolution of a GCM is much too coarse (about 300 km). Therefore, downscaling
methods are required to provide climate information at the regional to local scale. The
methods can be divided into two types: (1) statistical downscaling and (2) dynamical
downscaling.
II.2.6.2 Statistical Downscaling (SDS)
The concept of statistical downscaling (SDS) is based on the assumption that the regional
climate is conditioned by two factors: (a) the large-scale climatic state and (b) regional/local
physiographic features such as topography, land-sea distribution and land use (von Storch,
1995, 1999). Therefore, a statistical model has to be found that relates large-scale climate
variables (“predictors”) to regional and local variables (“predictands”). By feeding the
large-scale output of a GCM into the statistical model, the corresponding local and regional
climate characteristics are estimated.
The main advantages of SDS are:
- SDS is computationally inexpensive;
- SDS can be used to provide site-specific information;
- SDS can be rapidly applied to multiple GCMs.
The main disadvantages/weaknesses of SDS are:
- the basic assumption is not verifiable, which means that possible future changes in the
statistical relationship are not taken into account;
- SDS requires long time series of surface and upper-air observations.
In the following, the main SDS techniques are briefly described. The categorisation is similar
to that used by IPCC WG1 (Giorgi et al. 2001).
I) Weather classification schemes
Typically, weather states are defined by applying cluster analysis to atmospheric fields (Huth,
2000; Hewitson et al., 2002) or by using subjective circulation classification schemes (Jones et
al. 1993). In both cases the weather patterns are grouped according to their similarity.
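A minimal sketch of the cluster-analysis variant is given below: synthetic daily sea-level-pressure fields are grouped into weather types with k-means. The grid size, the number of types and the fields themselves are assumptions for illustration.

```python
# Sketch: a weather classification by k-means clustering of daily
# sea-level-pressure fields, in the spirit of the cluster-analysis
# approaches cited above. The pressure fields here are synthetic.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(6)
n_days, n_grid = 1000, 15 * 20          # e.g. a 15 x 20 grid, flattened

# Synthetic SLP anomalies (hPa): three underlying patterns plus noise.
patterns = rng.normal(0.0, 8.0, (3, n_grid))
labels_true = rng.integers(0, 3, n_days)
slp = patterns[labels_true] + rng.normal(0.0, 2.0, (n_days, n_grid))

# Group days into weather types according to field similarity.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(slp)
for k in range(3):
    print(f"type {k}: {np.sum(km.labels_ == k)} days")
```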
II) Regression Models
Here, a linear or nonlinear relationship between the predictands and the large-scale
atmospheric forcing is established. Commonly applied methods include multiple regression
(Murphy, 1999), canonical correlation analysis (CCA) (von Storch et al., 1993) and artificial
neural networks, which are akin to nonlinear regression (Crane et al. 1998).
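A minimal sketch of this approach is given below: a multiple linear regression is calibrated on one half of a synthetic record of large-scale predictors and validated on the other half. The predictors, coefficients and predictand are placeholders, not data from any of the cited studies.

```python
# Sketch: a multiple-regression downscaling model relating large-scale
# predictors (e.g. SLP anomalies, 850 hPa temperature) to a local
# predictand (station temperature). All data are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(7)
n_days = 2000
X = rng.normal(0.0, 1.0, (n_days, 3))             # standardised predictors
beta_true = np.array([1.2, -0.5, 0.3])
y = X @ beta_true + rng.normal(0.0, 0.8, n_days)  # local predictand

# Calibrate on the first half, validate on the second half.
half = n_days // 2
A = np.column_stack([np.ones(half), X[:half]])
coef, *_ = np.linalg.lstsq(A, y[:half], rcond=None)

A_val = np.column_stack([np.ones(n_days - half), X[half:]])
y_hat = A_val @ coef
r = np.corrcoef(y[half:], y_hat)[0, 1]
print(f"coefficients: {np.round(coef, 2)}, validation correlation r = {r:.2f}")
```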
III) Weather generators
Weather generators are statistical models of observed sequences of weather variables (Wilks
et al., 1999). They replicate the statistical attributes of a local climate variable, such as its
mean and variance, but not the observed sequences of events. Most of them focus on the
daily time scale, as required by many impact models, but sub-daily models are also available
(e.g. Katz et al., 1995).
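A minimal sketch of such a generator is given below: a two-state first-order Markov chain controls precipitation occurrence, and a gamma distribution supplies wet-day amounts, in the spirit of classical daily generators. All parameter values are illustrative assumptions.

```python
# Sketch: a daily precipitation generator with a two-state first-order
# Markov chain for occurrence and a gamma distribution for wet-day
# amounts. Parameter values are illustrative only.
import numpy as np

rng = np.random.default_rng(8)

P_WW, P_DW = 0.65, 0.25        # P(wet | wet yesterday), P(wet | dry yesterday)
SHAPE, SCALE = 0.8, 6.0        # gamma parameters for wet-day amounts (mm)

def generate(n_days):
    """Simulate a sequence of daily precipitation amounts (mm)."""
    amounts = np.zeros(n_days)
    wet = False
    for t in range(n_days):
        p = P_WW if wet else P_DW
        wet = rng.random() < p
        if wet:
            amounts[t] = rng.gamma(SHAPE, SCALE)
    return amounts

sim = generate(365 * 30)       # thirty synthetic years
print(f"wet-day frequency {np.mean(sim > 0):.2f}, "
      f"mean wet-day amount {sim[sim > 0].mean():.1f} mm")
```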