Within Europe, the Water Framework Directive (WFD) sets EU-wide standards for how the quality of surface waters and groundwaters is assessed and classified. Assessment of quality under the WFD is based on a range of elements that vary depending on the type of water being assessed. These elements are combined to give an overall classification of waterbodies into five classes (High, Good, Moderate, Poor and Bad) for surface waters and two classes (Good or Poor) for groundwaters.
In the UK the Environment Agency (EA) is the competent authority responsible for monitoring and assessment of water quality within England. The EA have made the reporting data relating to the requirements of the WFD available via the Catchment Data Explorer (CDE) website, https://environment.data.gov.uk/catchment-planning/.
cde is a package for R which facilitates searching and download of the WFD reporting data for all waterbodies from the EA CDE website.
The types of data that can be downloaded are: WFD status classification data, Reasons for Not Achieving Good (RNAG) status, objectives set for waterbodies and details of associated protected areas.
The CDE data are made available under the Open Government Licence v3.0 and use of the data accessed by and contained within this package implies acceptance of these licence conditions.
You can install the current development version from GitHub with:
The search_names function allows you to search for waterbodies,
Operational or Management Catchments or River Basin Districts that
contain a match or partial match for a specified search string (which is
case-sensitive). There is a hierarchical relationship between
waterbodies, catchments and River Basin Districts (RBD) as shown here.
As an example, we will search for waterbodies containing the name
“Lark”.
# load the package
library(cde)
# search for waterbodies containing the name "Lark"
lark_wb<-search_names(string="Lark", column="name")
The dataframe returned contains details of all the waterbodies containing the string “Lark” in their name. The details returned include waterbody id codes (WBID), type of waterbody, Operational and Management Catchment names and River Basin District.
# show the top 6 rows of the 'name' column
head(lark_wb$name)
#> [1] "Lark (US Hawstead)"
#> [2] "Lark downstream of Mill Street Bridge"
#> [3] "Lark (Hawstead to Abbey Gardens)"
#> [4] "Lark (Abbey Gardens to Mildenhall)"
#> [5] "Lark"
#> [6] "Lark - Fynn (d/s confluence)"
To search for Operational Catchments containing the same string we would use the following code.
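A minimal sketch of that search, assuming search_names accepts column="OC" in the same way that get_status does elsewhere in this document (the object name lark_oc is illustrative):

```r
# load the package
library(cde)
# search for Operational Catchments containing the name "Lark"
# (assumption: column="OC" selects the Operational Catchment column)
lark_oc <- search_names(string="Lark", column="OC")
# show the first few matches
head(lark_oc)
```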
Having located a waterbody, catchment or River Basin District that we
want to retrieve data for, we can use the get_status function to
retrieve the status classification information from the CDE website. We
can extract the data for a specific year, or a range of years. For
Operational/Management Catchment or River Basin District level
downloads, we can also extract information just for a specific waterbody
type (such as rivers) or for all waterbody types. In addition it is
possible to extract classification data relating to a specific element
of the classification.
The overall classification is made up of a number of different
elements in a hierarchy. Details of the hierarchy of classification
levels can be found here.
By default get_status retrieves the “Overall Water Body” classification
status, but by specifying level, information on a specific level of
classification can be retrieved. The possible values are:
Level 1 | Level 2 | Level 4 |
---|---|---|
Ecological | Biological quality elements | Overall Water Body |
Chemical | Chemical Status element | - |
Quantitative | Hydromorphological Supporting Elements | - |
- | Other Substances | - |
- | Physico-chemical quality elements | - |
- | Priority hazardous substances | - |
- | Priority substances | - |
- | Quantitative Status element | - |
- | Specific pollutants | - |
- | Supporting elements | - |
The function returns an object of class cde_df (basically a dataframe
with custom print and plot methods) containing the status (and other
details) for the specified combination of column,
value, level and dates. Note that during 2013 and 2014 waterbodies were
classified under both Cycle 1 and Cycle 2 methodologies. The status
information extracted for these years is just for the Cycle 2
classification, to avoid double counting. There was also a change in
some of the environmental standards applied to chemical aspects of
status assessment between cycles, so there may be some noticeable
changes in status between these years. See here
for more details.
For details of the meaning of the different columns returned, see the output reference list.
# extract overall waterbody status classification data for a single
# waterbody in all years
# first decide which waterbody, we can use one from the first search
# above (need the WBID information)
head(lark_wb)
#> WBID name type OC
#> 1911 GB105033042920 Lark (US Hawstead) River Lark
#> 1912 GB105033043052 Lark downstream of Mill Street Bridge River Lark
#> 1914 GB105033042940 Lark (Hawstead to Abbey Gardens) River Lark
#> 1918 GB105033043051 Lark (Abbey Gardens to Mildenhall) River Lark
#> 2197 GB105035040360 Lark River Deben
#> 2200 GB105035040300 Lark - Fynn (d/s confluence) River Deben
#> MC RBD
#> 1911 Cam and Ely Ouse Anglian
#> 1912 Cam and Ely Ouse Anglian
#> 1914 Cam and Ely Ouse Anglian
#> 1918 Cam and Ely Ouse Anglian
#> 2197 East Suffolk Anglian
#> 2200 East Suffolk Anglian
# we will get data for the first waterbody here (WBID: GB105033042920,
# name: Lark (US Hawstead))
lark_hawstead<-get_status(ea_name="GB105033042920", column="WBID")
# the dataframe returned contains all of the data for this site in all
# years (we did not specify year/year range).
lark_hawstead
#> river_basin_district management_catchment operational_catchment
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> With an additional 17 columns of data.
#> Row values may be truncated to fit console.
# just a quick look at the actual status data
table(lark_hawstead$status)
#>
#> Good Moderate
#> 1 9
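The six search results printed above span two Operational Catchments (Lark and Deben). As a rough base R sketch of narrowing a search result to one catchment before picking WBID codes, the rows shown by head(lark_wb) are typed in by hand below so that the example runs without downloading anything; the object names lark_wb_head and lark_only are illustrative.

```r
# the first six search results as printed by head(lark_wb), typed in by hand
lark_wb_head <- data.frame(
  WBID = c("GB105033042920", "GB105033043052", "GB105033042940",
           "GB105033043051", "GB105035040360", "GB105035040300"),
  name = c("Lark (US Hawstead)", "Lark downstream of Mill Street Bridge",
           "Lark (Hawstead to Abbey Gardens)",
           "Lark (Abbey Gardens to Mildenhall)", "Lark",
           "Lark - Fynn (d/s confluence)"),
  OC = c("Lark", "Lark", "Lark", "Lark", "Deben", "Deben"),
  stringsAsFactors = FALSE
)
# keep only the waterbodies in the Lark Operational Catchment
lark_only <- lark_wb_head[lark_wb_head$OC == "Lark", ]
# the WBID codes that can be passed to get_status with column="WBID"
lark_only$WBID
```

The same subsetting should work on the full lark_wb dataframe, since it carries the same WBID, name and OC columns.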
The following is an example of a higher-level download, specifying a year range and waterbody type (in this case rivers).
# download status data for a given year range and type of waterbody
lark_OC_rivers<-get_status(ea_name="Lark", column="OC", startyr=2013, endyr=2015, type="River")
# print out the results
lark_OC_rivers
#> river_basin_district management_catchment operational_catchment
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> Anglian Cam and Ely Ouse Lark
#> With an additional 26 rows and 17 columns of data.
#> Row values may be truncated to fit console.
To get information about status classification in relation to a
specific level in the classification, we can specify level as well (see
table above for options and here for more details on the classification
levels used).
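A sketch of such a download, assuming the argument is named level and takes one of the values listed in the table above (here "Chemical"); the object name lark_OC_rivers_chem is illustrative:

```r
# load the package
library(cde)
# download chemical status for rivers in the Lark Operational Catchment
# between 2013 and 2015
# (assumption: level="Chemical" is passed as a character string)
lark_OC_rivers_chem <- get_status(ea_name="Lark", column="OC", startyr=2013,
                                  endyr=2015, level="Chemical", type="River")
# print out the results
lark_OC_rivers_chem
```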
The get_status function, along with other get_... functions, has a plot
method which provides quick overview plots of status classes, giving a
plot of percentages of waterbodies in different status classes for the
combination of criteria specified. Plotting is only possible for
Operational/Management Catchment or River Basin District downloads.
# get overall waterbody status information for the Lark OC between 2013 and 2015
lark_OC_2013_15 <- get_status(ea_name="Lark", column="OC", startyr=2013, endyr=2015)
# plot the data
plot(lark_OC_2013_15)
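To make clear what such a plot summarises, the percentages can be worked out by hand. The following is a rough base R sketch on a small made-up dataframe, not the package's own plotting code; the year column name and all values are illustrative assumptions rather than CDE data.

```r
# made-up status data: one row per waterbody per year (illustrative only)
status_df <- data.frame(
  year = c(2013, 2013, 2013, 2013, 2014, 2014, 2014, 2014),
  status = c("Good", "Moderate", "Moderate", "Poor",
             "Good", "Good", "Moderate", "Poor")
)
# count waterbodies in each status class (rows) for each year (columns)
counts <- table(status_df$status, status_df$year)
# convert the counts to percentages within each year
percentages <- prop.table(counts, margin = 2) * 100
percentages
# stacked barplot of the percentages, one bar per year
barplot(percentages, legend.text = TRUE, ylab = "Percentage of waterbodies")
```

Each column of percentages sums to 100, which is why the stacked bars for different years all reach the same height.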
For plots, the colour scheme used is based on the viridis palette. For
get_status and get_objectives an alternative colour scheme, based on the
WFD-defined status class colours, can be used instead by setting
scheme="wfd" within a plot call. Also if a single year is specified, a
standard (as opposed to stacked) barplot is produced as shown below.
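A sketch of the single-year case using the WFD colour scheme. It assumes that supplying startyr without endyr restricts the download to that one year; the object name lark_OC_2015 is illustrative.

```r
# load the package
library(cde)
# get overall waterbody status for the Lark OC for a single year
# (assumption: startyr on its own returns data for that year only)
lark_OC_2015 <- get_status(ea_name="Lark", column="OC", startyr=2015)
# plot the data using the WFD-defined status class colours
plot(lark_OC_2015, scheme="wfd")
```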
Not all waterbodies in the Lark Operational Catchment example above
have achieved Good status. The get_rnag function downloads Reasons for
Not Achieving Good (RNAG) data, which allow us to find out more detail
on the pressures on the waterbodies that have been assessed to be
driving the failure. The RNAG data can be extracted for specific
classification levels, as per the status data above.
For details of the meaning of the different columns returned, see the output reference list.
Plots of RNAG data are given as frequency histograms of the occurrence
of information in the pressure_tier_3 column. For details of this, see
the reference list.
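A sketch of an RNAG download and plot for the Lark Operational Catchment. It assumes that get_rnag takes the same ea_name, column and startyr arguments as get_status; the object name lark_OC_rnag_2015 is illustrative.

```r
# load the package
library(cde)
# download RNAG data for the Lark Operational Catchment for 2015
# (assumption: get_rnag shares its argument names with get_status)
lark_OC_rnag_2015 <- get_rnag(ea_name="Lark", column="OC", startyr=2015)
# count how often each pressure occurs in the pressure_tier_3 column
sort(table(lark_OC_rnag_2015$pressure_tier_3), decreasing=TRUE)
# the plot method shows these frequencies as a histogram
plot(lark_OC_rnag_2015)
```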
For those waterbodies that are at less than Good status, objectives
are set to indicate what status is aimed for in the longer term. The
objectives are set in relation to what is determined to be achievable in
the given timescale. Therefore objectives have been set in relation to
the 6-year cycle of assessment (so years 2015, 2021 and 2027, then also
2040 and 2050 for long-term objectives). Using the get_objectives
function, we can download objectives for waterbodies, catchments or
River Basin Districts. Objectives can be downloaded for a specific year
(2015, 2021, 2027, 2040 or 2050), level of classification and waterbody
type as per the get_status function. Note however that not all
waterbodies have objectives set for all years, levels or types. If no
objectives are set for the criteria specified, a message is given.
For details of the meaning of the different columns returned, see the output reference list.
# download the objectives set for 2015 for the Lark Operational Catchment
lark_OC_obj_2015<-get_objectives(ea_name="Lark", column="OC", year=2015)
Plotting of objectives is similar to that of get_status data, except
the status classes represent the target objectives predicted to be
achieved by the date specified.
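A sketch of plotting the objectives, repeating the download so that the block runs on its own. The scheme="wfd" setting is the optional WFD colour scheme described earlier for get_status and get_objectives plots.

```r
# load the package
library(cde)
# download the objectives set for 2015 for the Lark Operational Catchment
lark_OC_obj_2015 <- get_objectives(ea_name="Lark", column="OC", year=2015)
# plot the target status classes using the WFD-defined colours
plot(lark_OC_obj_2015, scheme="wfd")
```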
The get_pa function downloads details of the protected areas
associated with a waterbody, catchment or River Basin District. The
protected areas listed include those designated for conservation
reasons, such as SACs (Habitats and Species Directive), for pollution
reduction, such as Nitrate Vulnerable Zones (Nitrates Directive), or for
human use (Bathing Water Directive).
For details of the meaning of the different columns returned, see the output reference list.
# get details of the protected areas within the Lark Operational Catchment
lark_OC_pa<-get_pa(ea_name="Lark", column="OC")
Plotting the output of get_pa produces a frequency histogram of the
protected_area_type column within the area specified.
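A sketch of that plot, with the underlying counts tabulated first. The download is repeated so that the block runs on its own.

```r
# load the package
library(cde)
# get details of the protected areas within the Lark Operational Catchment
lark_OC_pa <- get_pa(ea_name="Lark", column="OC")
# count the protected areas of each type
table(lark_OC_pa$protected_area_type)
# the plot method shows the same counts as a frequency histogram
plot(lark_OC_pa)
```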