2. Data handling

Rasmus E. Benestad edited this page Mar 16, 2020 · 49 revisions

An important basic principle of esd is to use 'smart data', where data comes with its metadata. The metadata is attached to the data through attributes, which tag the data without getting in the way of operations. In the esd package, the functions are tailored to make use of this metadata when processing and analysing the data. We also place an emphasis on the FAIR principles - see 'Climate Data You Can Trust'.
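
The idea of metadata travelling with the data as attributes can be sketched in base R. This is only an illustration: the attribute names `unit` and `location` and the values are assumptions made for this sketch, not taken from an actual esd object.

```r
# Synthetic monthly temperatures (made-up values, for illustration only)
x <- c(-4.3, -4.0, -0.2, 4.5, 10.8, 15.2)

# Attach metadata as attributes; the names 'unit' and 'location' are
# chosen for this sketch and need not match the ones esd uses
attr(x, "unit") <- "degC"
attr(x, "location") <- "Oslo"

# Arithmetic on the values leaves the attributes in place
y <- x + 273.15
attr(y, "unit") <- "K"   # update the tag to reflect the new unit

print(attributes(y))
```

The values can be processed as an ordinary numeric vector, while the tags remain available to any function that chooses to read them.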

The most commonly used R esd functions here are

Function        Description
select.station  Find or select one (or several) specific station(s) from existing metadata
station         Retrieve data for a weather station from a specific dataset
retrieve        Retrieve field data from a netcdf file

This section presents examples of how to create and/or retrieve (freely available global and/or national) data sets using the 'esd' library. Note that the library includes two global datasets, the MET Norway archive, and data from Nordic programs. These are mainly:

  • MET Norway archive (KDVH) for daily and monthly time steps referred to as 'METNOD' and 'METNOM', respectively (works within MET Norway firewall only).
  • Global Historical Climatology Network at daily and monthly time steps, referred to as 'GHCND' and 'GHCNM', respectively.
  • European climate data sets, referred to as 'ECAD'.
  • Nordic monthly data sets from the 'NACD' and 'NARP' programs.

Weather Station data (national and global datasets) from the web (Basic users)

Select weather stations from existing 'esd' meta data

Retrieve data from predefined datasets

Create a station object from scratch (advanced users)

Retrieve field data from ncdf/netcdf files, e.g. reanalysis, GCMs, or RCMs

Select weather stations from existing meta-data

To select weather stations from one or several of the data sets mentioned previously, type

> ss <- select.station()

The empty parentheses '()' can hold arguments for any search criterion from the following list.

Search argument  Description
loc              Select stations by location name(s).
lon              Select stations by longitude.
lat              Select stations by latitude.
alt              Select stations by altitude (numeric value, in meters a.s.l.). For a positive value, select all stations above alt; for a negative value, select all stations below the alt value.
param            Select stations by recorded parameter or variable identifier.
src              Limit the search to a specific data source ("NARP", "NACD", "NORDKLIMA", "GHCNM", "METNOM", "ECAD", "GHCND" and "METNOD").
stid             Select stations by the identifier of the weather/climate station.
cntr             Select stations by country name.
it               Select stations for a specific date or a range of dates: an integer in the range [1:12] for months, a four-digit integer for years (e.g. 2014), or a vector of dates (e.g. "2014-01-01").
nmin             Select only stations with at least nmin years, months or days of data, depending on the class of object x (e.g. 30 years).
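
The sign convention of the alt argument can be illustrated on a made-up metadata table in base R. This is a sketch, not the esd implementation: the data frame `meta` and the helper `select_by_alt` are hypothetical, and it assumes that a negative value means "below the absolute value of alt".

```r
# Made-up station metadata (hypothetical names and altitudes in meters)
meta <- data.frame(
  location = c("A", "B", "C", "D"),
  altitude = c(12, 94, 480, 1210)
)

# Hypothetical helper mimicking the sign convention described for 'alt':
# positive -> stations above alt; negative -> stations below |alt|.
# Using |alt| for negative values, and excluding the boundary itself,
# are assumptions of this sketch.
select_by_alt <- function(meta, alt) {
  if (alt > 0) {
    meta[meta$altitude > alt, ]
  } else {
    meta[meta$altitude < abs(alt), ]
  }
}

high <- select_by_alt(meta, 400)    # stations C and D
low  <- select_by_alt(meta, -100)   # stations A and B
```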

The following map shows all available weather stations recording 2m-temperature within the spatial domain covering Scandinavian regions. This is obtained by typing

> ss <- select.station(param='t2m',lon=c(-15,45),lat=c(55,80))
> map(ss,cex=.5,col="darkred",bg="red")

The following map shows all available weather stations recording precipitation within the spatial domain covering Scandinavian regions. This is obtained by typing

> ss <- select.station(param='precip',lon=c(-15,45),lat=c(55,80))
> map(ss,cex=.5,col="darkgreen",bg="green")

Retrieve data from predefined datasets

e.g. 1 METNO

To get the daily mean temperature for the "Oslo" station ("18700") from the MET Norway archive (works only within the MET Norway firewall!), type

> t2m.dly <- station(stid='18700',param='t2m',src='metnod')

which is equivalent to

> t2m.dly <- station.metnod(stid='18700',param='t2m')

For monthly data, type

> t2m.mon <- station.metnom(stid="18700",param="t2m")

You can aggregate the data into mean annual values and plot the values by typing

> t2m.ann <- as.annual(t2m.dly, FUN = "mean")
> plot(t2m.ann,ylim=c(2,10))

and finally, add the linear trend on top using

> lines(trend(t2m.ann),col="red",lwd=2)
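
For readers outside the MET Norway firewall, the two steps above (annual means, then a linear trend) can be sketched conceptually in base R on synthetic data. This is not how esd implements as.annual() or trend(); it swaps in `tapply` and an ordinary least-squares fit with `lm`, and the series itself is made up.

```r
set.seed(1)

# Synthetic daily series for 1981-2000: seasonal cycle + weak warming + noise
dates <- seq(as.Date("1981-01-01"), as.Date("2000-12-31"), by = "day")
doy   <- as.numeric(format(dates, "%j"))
yrs   <- as.numeric(format(dates, "%Y"))
t2m   <- 6 - 10 * cos(2 * pi * doy / 365.25) +
         0.03 * (yrs - 1981) + rnorm(length(dates))

# Annual means: average all daily values that share a calendar year
ann.tab <- tapply(t2m, yrs, mean)
year    <- as.numeric(names(ann.tab))
ann     <- as.numeric(ann.tab)

# Linear trend: least-squares fit of the annual mean against the year
fit <- lm(ann ~ year)
print(coef(fit))
```

The fitted values, `fitted(fit)`, play the role of the red trend line drawn in the esd example.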

Create a station object from scratch (Advanced users)

The user can create a station object from scratch using the as.station() function as follows

> s <- as.station(x=data,stid="18700", loc="Oslo-Blindern", lon=10.7, lat=59.9, alt=94,
                  param=c("t2m","precip"))

where data is a data.frame object containing ordered values recorded at the 'Oslo-Blindern' weather station.

e.g. 1 / Monthly data

> data <- round(matrix(rnorm(20*12),20,12),2)
> colnames(data) <- month.abb
> x <- data.frame(year=1981:2000,data)
> X <- as.station(x,loc="",param="noise",unit="none")

e.g. 2 / Daily or any indexed data from a text file

> x <- read.table(file.name,header=TRUE,skip=20,sep=",")
## Suppose that the file contains columns (header names) called 'dates' and 'data',
## then create a zoo object from x (as.Date assumes dates written as e.g. "2014-01-01")
> z <- zoo(x$data, order.by = as.Date(x$dates)) # see ?zoo for additional info
## Finally, create a station object from z by adding attributes as named arguments
> y <- as.station(z,stid=stid,lon=lon,lat=lat,alt=alt,param=param,calendar=calendar,quality=quality,cntr=cntr,loc=loc,src=src,url=url,unit=unit,longname=longname,reference=reference,info=info)
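
The file-reading step can be tried without real data. The sketch below writes a small synthetic file whose layout (20 free-text lines, then a header with 'dates' and 'data') is an assumption chosen to match the read.table() call above; all values are made up. The zoo step runs only if the zoo package is installed.

```r
# Build a small synthetic text file: 20 free-text lines to skip,
# then a header row and five comma-separated records (made-up values)
file.name <- tempfile(fileext = ".txt")
preamble  <- paste("station description line", 1:20)
dates     <- seq(as.Date("2000-01-01"), by = "day", length.out = 5)
records   <- paste(format(dates), c(1.2, 0.8, -0.5, 2.1, 3.0), sep = ",")
writeLines(c(preamble, "dates,data", records), file.name)

# skip = 20 drops the free-text lines; header = TRUE uses the next line as names
x <- read.table(file.name, header = TRUE, skip = 20, sep = ",")

# Convert the text dates to class Date so they can serve as a time index
x$dates <- as.Date(x$dates)
str(x)

# With the zoo package installed, the indexed series is built as in the text
if (requireNamespace("zoo", quietly = TRUE)) {
  z <- zoo::zoo(x$data, order.by = x$dates)
}
```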

Retrieve field data from 'ncdf' (NetCDF) files (e.g. Reanalysis, GCMs, RCMs, ...)

The main function here is retrieve(), which reads data from any 'netcdf' file and returns a zoo field object with attributes. retrieve() can handle data on a regular or irregular (rotated) lon-lat grid.

Arguments  Description
ncfile     Full path of the netcdf file, or any object of class 'ncdf' or 'ncdf4'.
lon        Select a specific grid cell or a subregion by longitude.
lat        Select a specific grid cell or a subregion by latitude.
lev        Select a specific vertical level.
time       Select a specific date or time span.
param      Climate parameter or variable (e.g. tas: surface temperature).
plot       Plot the returned object if set to TRUE.
greenwich  Convert longitudes to -180E/180E, or center maps on the Greenwich meridian (0 deg. E).
verbose    If TRUE, displays extra information on progress.
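
The two longitude conventions mentioned for the greenwich argument (0 to 360 versus -180 to 180) can be related with a short base-R sketch. It only shows the coordinate conversion on made-up values; which setting of greenwich selects which convention is not stated here, and this is not the code retrieve() uses.

```r
# Longitudes on a 0..360 grid (every 90 degrees, for illustration)
lon <- c(0, 90, 180, 270)
# Made-up values along those longitudes
val <- c(10, 20, 30, 40)

# Map longitudes above 180 to their negative (western) equivalent
lon180 <- ifelse(lon > 180, lon - 360, lon)

# Re-order so that longitudes increase from west to east,
# and apply the same ordering to the data
ord    <- order(lon180)
lon180 <- lon180[ord]
val180 <- val[ord]

print(rbind(lon180, val180))
```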

e.g. 1 Retrieve ERA40 reanalysis

Download the ERA40 reanalysis air surface temperature (tas) from the climate explorer website and store it locally in the destination file using

# mode = "wb" writes the netCDF file in binary mode (needed on Windows)
download.file(url="http://climexp.knmi.nl/NCEPNCAR40/air.2m.mon.mean.nc", 
              destfile="/tmp/air.2m.mon.mean.nc",
              method = "auto", quiet = FALSE,mode = "wb",cacheOK = TRUE)

then, read the data into the eraint object and plot the result using

> eraint <- retrieve('/tmp/air.2m.mon.mean.nc',plot=TRUE)
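
Since these files can be large, it may be convenient to download them only once. A minimal sketch, assuming a hypothetical helper named `fetch_once` (not part of esd) that skips the download when the destination file already exists:

```r
# Hypothetical helper: download 'url' to 'destfile' unless the file already exists
fetch_once <- function(url, destfile) {
  if (!file.exists(destfile)) {
    # mode = "wb" keeps binary files such as netCDF intact on all platforms
    download.file(url = url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

# Example call (needs network access; URL as given in the text):
# ncfile <- fetch_once("http://climexp.knmi.nl/NCEPNCAR40/air.2m.mon.mean.nc",
#                      "/tmp/air.2m.mon.mean.nc")
```

The returned path can be passed directly as the ncfile argument of retrieve().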

e.g. 2 CMIP3/5 RCP Scenarios

Download the air surface temperature (tas) for the RCP 4.5 scenario and the NorESM1-ME model from the climate explorer and store it locally in the destination file using

# mode = "wb" writes the netCDF file in binary mode (needed on Windows)
download.file(url="http://climexp.knmi.nl/CMIP5/monthly/tas/tas_Amon_NorESM1-ME_rcp45_000.nc", 
              destfile="/tmp/tas_Amon_NorESM1-ME_rcp45_000.nc",
              method = "auto", quiet = FALSE,mode = "wb",cacheOK = TRUE)

Then, read the data into the object "gcm" by typing

gcm <- retrieve(ncfile="/tmp/tas_Amon_NorESM1-ME_rcp45_000.nc",param="tas",plot=TRUE)

Finally, map the results using

map(gcm,projection="lonlat")