Title: | Convert European Regional Data |
---|---|
Description: | Motivated by changing administrative boundaries over time, the 'nuts' package can convert European regional data with NUTS codes between versions (2006, 2010, 2013, 2016 and 2021) and levels (NUTS 1, NUTS 2 and NUTS 3). The package uses spatial interpolation as in Lam (1983) <doi:10.1559/152304083783914958> based on granular (100m x 100m) area, population and land use data provided by the European Commission's Joint Research Center. |
Authors: | Moritz Hennicke [aut, cre, cph] , Werner Krause [aut, cph] , Pueyo-Ros Josep [rev] (Josep reviewed the package for rOpenSci, see https://github.com/ropensci/software-review/issues/623#issuecomment-1951446662), Le Meur Nolwenn [rev] (Nolwenn reviewed the package for rOpenSci, see https://github.com/ropensci/software-review/issues/623#issuecomment-1961501137) |
Maintainer: | Moritz Hennicke <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.1.0 |
Built: | 2025-01-02 05:44:00 UTC |
Source: | https://github.com/ropensci/nuts |
The data frame stores all NUTS codes in hierarchical levels 1, 2 and 3 by NUTS classification versions 2006, 2010, 2013, 2016 and 2021.
all_nuts_codes
A data frame with 8,896 rows and 3 columns:
NUTS code
NUTS versions
Country name
https://urban.jrc.ec.europa.eu/tools/nuts-converter?lng=en#/
The table contains population, area and surface flows between two NUTS regions and different NUTS code classifications. NUTS regions are at 1st, 2nd and 3rd level. NUTS versions are 2006, 2010, 2013, 2016 and 2021.
cross_walks
A data frame with 47,340 rows and 11 columns:
Departing NUTS code
Desired NUTS code
Departing NUTS version
Desired NUTS version
NUTS division level
Country name
Area size flow
2018 population flow
2011 population flow
2018 artificial surfaces flow
2012 artificial surfaces flow
https://urban.jrc.ec.europa.eu/tools/nuts-converter?lng=en#/
The data frame contains the number of different manure storage facilities from the Farm Structure Survey in all current and former EU member states, as well as Iceland, Norway, Switzerland and Montenegro, at the national level and at NUTS levels 1, 2 and 3. Please see the link indicated below for more information.
manure
A data frame with 17,151 rows and 4 columns:
9 indicators: All manure storage facilities, solid dung, liquid manure slurry, slurry: tank, slurry: lagoon; covered facilities with either dung, liquid manure, slurry
NUTS 1, 2, 3 or National level
Years 2000, 2003 and 2010
Number
https://ec.europa.eu/eurostat/databrowser/view/aei_fm_ms/default/table?lang=en
nuts_aggregate() transforms regional NUTS data between NUTS levels.
    nuts_aggregate(
      data,
      to_level,
      variables,
      weight = NULL,
      missing_rm = FALSE,
      missing_weights_pct = FALSE,
      multiple_versions = c("error", "most_frequent")
    )
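data | A nuts.classified object returned by nuts_classify(). |
to_level | Number corresponding to the desired NUTS level to be aggregated to, for example 2 for NUTS 2. |
variables | Named character vector specifying variable names and variable types, for example c('values' = 'absolute'). |
weight | String with name of the weight used for conversion. Can be area size, population (2011 or 2018) or artificial surfaces (2012 or 2018), for example 'pop18' for 2018 population. |
missing_rm | Boolean that is FALSE by default. TRUE removes regional flows that depart from missing NUTS codes. |
missing_weights_pct | Boolean that is FALSE by default. TRUE computes the percentage of missing weights due to missing departing NUTS regions for each variable. |
multiple_versions | By default equal to 'error', which throws an error when a group contains multiple NUTS versions. 'most_frequent' selects the most frequent version within groups. |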
Console messages can be controlled with rlang::local_options(nuts.verbose = "quiet") to silence messages and nuts.verbose = "verbose" to switch messages back on.
A tibble containing NUTS codes, aggregated variable values, and possibly grouping variables.
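The idea behind aggregating to a higher NUTS level can be illustrated with a small base-R sketch. It assumes that count-like variables are summed within the parent region and that rate-like variables are combined as a weighted mean. The toy codes, values and the `weight` column are invented for illustration, and the sketch does not reproduce the package internals.

```r
# Toy NUTS 3 data; codes, values and weights are invented for illustration.
toy <- data.frame(
  geo    = c("DE111", "DE112", "DE121"),
  count  = c(10, 30, 5),         # count-like variable
  share  = c(0.20, 0.40, 0.10),  # rate-like variable
  weight = c(100, 300, 50)       # hypothetical weight, e.g. population
)

# The parent NUTS 2 code is the first four characters of a NUTS 3 code.
toy$nuts2 <- substr(toy$geo, 1, 4)

# Count-like variables: sum within each parent region.
count_agg <- tapply(toy$count, toy$nuts2, sum)

# Rate-like variables: weighted mean within each parent region.
share_agg <- sapply(
  split(toy, toy$nuts2),
  function(d) weighted.mean(d$share, d$weight)
)

count_agg
share_agg
```

The weighted mean keeps a large region from being outvoted by a small one, which is why a weight such as population or area is needed for rate-like variables but not for counts.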
    library(dplyr)
    library(nuts)

    # Load EUROSTAT data of manure storage deposits
    data(manure)

    # Data varies at the NUTS level x indicator x year x country x NUTS code level
    head(manure)

    # Aggregate from NUTS 3 to 2 by indicator x year
    manure %>%
      filter(nchar(geo) == 5) %>%
      nuts_classify(nuts_code = "geo", group_vars = c('indic_ag', 'time')) %>%
      # Group vars are automatically passed on
      nuts_aggregate(to_level = 2, variables = c('values' = 'absolute'), weight = 'pop18')
nuts_classify() can identify the NUTS version year and level from a variable containing NUTS codes.
    nuts_classify(
      data,
      nuts_code,
      group_vars = NULL,
      ties = c("most_recent", "oldest")
    )
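data | A data frame or tibble that contains a variable with NUTS codes. |
nuts_code | Variable name containing NUTS codes. |
group_vars | Variable name(s) for classification within groups. |
ties | Picks the 'most_recent' (default) or 'oldest' version when the overlap is identical across multiple NUTS versions. |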
Console messages can be controlled with rlang::local_options(nuts.verbose = "quiet") to silence messages and nuts.verbose = "verbose" to switch messages back on.
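As a minimal sketch, the same option can also be set with base R's `options()`. Unlike `rlang::local_options()`, which restores the previous value when the calling function exits, `options()` changes the setting for the rest of the session, so the sketch restores it by hand.

```r
# Silence messages; options() returns the previous values invisibly.
old <- options(nuts.verbose = "quiet")

# Inspect the current setting.
current <- getOption("nuts.verbose")
current

# Restore whatever was set before.
options(old)
```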
A list of three tibbles. The first tibble contains the original data with the classified NUTS version, level, and country. The second tibble lists the group-specific overlap with each NUTS version. The third tibble shows missing NUTS codes for each group.
The output can be passed to nuts_convert_version() to convert data across NUTS versions and nuts_aggregate() to aggregate across NUTS levels.
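The level of a NUTS code follows from its length: a two-letter country prefix is followed by one, two or three characters for NUTS 1, 2 and 3. The base-R sketch below illustrates only this rule with invented input. It is not how `nuts_classify()` detects the version, and the names `level` and `country_code` are chosen for illustration.

```r
# Example codes; the last entry is a bare country code with no NUTS level.
codes <- c("DE1", "DE11", "DE111", "FR10", "AT")

# Level = number of characters after the two-letter country prefix.
level <- nchar(codes) - 2L
level[!level %in% 1:3] <- NA_integer_

# The country prefix is the first two characters.
country_code <- substr(codes, 1, 2)

data.frame(code = codes, country_code = country_code, level = level)
```

This is the same rule the examples rely on when they filter with `nchar(geo) == 4` for NUTS 2 and `nchar(geo) == 5` for NUTS 3.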
    library(dplyr)
    library(nuts)

    # Load EUROSTAT data of manure storage deposits
    data(manure)

    # Data varies at the NUTS level x indicator x year x country x NUTS code level
    head(manure)

    # Classify version of NUTS 2 codes in Germany
    manure %>%
      filter(nchar(geo) == 4) %>%
      filter(indic_ag == 'I07A_EQ_Y') %>%
      filter(grepl('^DE', geo)) %>%
      filter(time == 2003) %>%
      select(-indic_ag, -time) %>%
      # Data varies at the NUTS code level
      nuts_classify(nuts_code = 'geo')

    # Classify version of NUTS 3 codes within country and year
    manure %>%
      filter(nchar(geo) == 5) %>%
      filter(indic_ag == 'I07A_EQ_Y') %>%
      select(-indic_ag) %>%
      # Data varies at the year x country x NUTS code level. The country grouping
      # is always used by default.
      nuts_classify(nuts_code = 'geo', group_vars = 'time')
nuts_convert_version() transforms regional NUTS data between NUTS versions.
    nuts_convert_version(
      data,
      to_version,
      variables,
      weight = NULL,
      missing_rm = FALSE,
      missing_weights_pct = FALSE,
      multiple_versions = c("error", "most_frequent")
    )
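data | A nuts.classified object returned by nuts_classify(). |
to_version | String with desired NUTS version the function should convert to. Possible versions: '2006', '2010', '2013', '2016' and '2021'. |
variables | Named character vector specifying variable names and variable types, for example c('values' = 'absolute'). |
weight | String with name of the weight used for conversion. Can be area size, population (2011 or 2018) or artificial surfaces (2012 or 2018), for example 'pop18' for 2018 population. |
missing_rm | Boolean that is FALSE by default. TRUE removes regional flows that depart from missing NUTS codes. |
missing_weights_pct | Boolean that is FALSE by default. TRUE computes the percentage of missing weights due to missing departing NUTS regions for each variable. |
multiple_versions | By default equal to 'error', which throws an error when a group contains multiple NUTS versions. 'most_frequent' selects the most frequent version within groups. |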
Console messages can be controlled with rlang::local_options(nuts.verbose = "quiet") to silence messages and nuts.verbose = "verbose" to switch messages back on.
A tibble containing NUTS codes, converted variable values, and possibly grouping variables.
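The weighting step behind a version conversion can be sketched in base R with a toy crosswalk. The sketch assumes a count-like variable that is split across target regions in proportion to a flow weight and then summed. All codes, values, flows and column names (`from_code`, `to_code`, `flow`) are hypothetical, and the sketch is not the package's implementation.

```r
# Toy data in an "old" version; codes and values are invented.
old_data <- data.frame(from_code = c("XX11", "XX12"), value = c(100, 60))

# Toy crosswalk: XX11 is split between YY11 and YY12, XX12 maps fully to YY12.
# 'flow' plays the role of a weight such as population.
crosswalk <- data.frame(
  from_code = c("XX11", "XX11", "XX12"),
  to_code   = c("YY11", "YY12", "YY12"),
  flow      = c(30, 70, 50)
)

m <- merge(old_data, crosswalk, by = "from_code")

# Share of each departing region's weight that flows to each target region.
m$share <- m$flow / ave(m$flow, m$from_code, FUN = sum)

# Distribute the value along the shares, then sum by target region.
m$part <- m$value * m$share
converted <- aggregate(part ~ to_code, data = m, FUN = sum)
names(converted)[2] <- "value"

converted
```

Because the shares of each departing region sum to one, the total of the variable is the same before and after the conversion.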
    library(dplyr)
    library(nuts)

    # Load EUROSTAT data of manure storage deposits
    data(manure)

    # Data varies at the NUTS level x indicator x year x country x NUTS code level
    head(manure)

    # Convert NUTS 2 codes in Germany from 2006 to 2021 version
    manure %>%
      filter(nchar(geo) == 4) %>%
      filter(indic_ag == 'I07A_EQ_Y') %>%
      filter(grepl('^DE', geo)) %>%
      filter(time == 2003) %>%
      select(-indic_ag, -time) %>%
      # Data now only varies at the NUTS code level
      nuts_classify(nuts_code = "geo") %>%
      nuts_convert_version(to_version = '2021', weight = 'pop18', variables = c('values' = 'absolute'))

    # Convert NUTS 3 codes by country x year, classifying version first
    manure %>%
      filter(nchar(geo) == 5) %>%
      filter(indic_ag == 'I07A_EQ_Y') %>%
      select(-indic_ag) %>%
      # Data now varies at the year x NUTS code level
      nuts_classify(nuts_code = 'geo', group_vars = c('time')) %>%
      nuts_convert_version(to_version = '2021', weight = 'pop18', variables = c('values' = 'absolute'))
nuts_get_data() returns the classified data after running nuts_classify().
    nuts_get_data(data)
data | A nuts.classified object returned by nuts_classify(). |
Console messages can be controlled with rlang::local_options(nuts.verbose = "quiet") to silence messages and nuts.verbose = "verbose" to switch messages back on.
A tibble containing the original data with the classified NUTS version, level, and country.
    library(dplyr)
    library(nuts)

    # Load EUROSTAT data of manure storage deposits
    data(manure)

    # Classify version of NUTS 2 codes in Germany
    classified <- manure %>%
      filter(nchar(geo) == 4) %>%
      filter(indic_ag == 'I07A_EQ_Y') %>%
      filter(grepl('^DE', geo)) %>%
      filter(time == 2003) %>%
      select(-indic_ag, -time) %>%
      # Data varies at the NUTS code level
      nuts_classify(nuts_code = 'geo')

    nuts_get_data(classified)
nuts_get_missing() returns the missing NUTS codes for each group after running nuts_classify().
    nuts_get_missing(data)
data | A nuts.classified object returned by nuts_classify(). |
Console messages can be controlled with rlang::local_options(nuts.verbose = "quiet") to silence messages and nuts.verbose = "verbose" to switch messages back on.
A tibble listing missing NUTS codes for each group.
    library(dplyr)
    library(nuts)

    # Load EUROSTAT data of manure storage deposits
    data(manure)

    # Classify version of NUTS 2 codes in Germany
    classified <- manure %>%
      filter(nchar(geo) == 4) %>%
      filter(indic_ag == 'I07A_EQ_Y') %>%
      filter(grepl('^DE', geo)) %>%
      filter(time == 2003) %>%
      select(-indic_ag, -time) %>%
      # Data varies at the NUTS code level
      nuts_classify(nuts_code = 'geo')

    nuts_get_missing(classified)
nuts_get_version() returns the group-specific overlap with each NUTS version after running nuts_classify().
    nuts_get_version(data)
data | A nuts.classified object returned by nuts_classify(). |
Console messages can be controlled with rlang::local_options(nuts.verbose = "quiet") to silence messages and nuts.verbose = "verbose" to switch messages back on.
A tibble that lists the group-specific overlap with each NUTS version.
    library(dplyr)
    library(nuts)

    # Load EUROSTAT data of manure storage deposits
    data(manure)

    # Classify version of NUTS 2 codes in Germany
    classified <- manure %>%
      filter(nchar(geo) == 4) %>%
      filter(indic_ag == 'I07A_EQ_Y') %>%
      filter(grepl('^DE', geo)) %>%
      filter(time == 2003) %>%
      select(-indic_ag, -time) %>%
      # Data varies at the NUTS code level
      nuts_classify(nuts_code = 'geo')

    nuts_get_version(classified)
nuts_test_multiple_versions() is called from either nuts_convert_version() or nuts_aggregate() to select the most frequent version within groups or throw an error.
    nuts_test_multiple_versions(group_vars, multiple_versions, data_versions, data)
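group_vars | Variable name(s) for classification within groups. Always computes overlap within country. |
multiple_versions | By default equal to 'error', which throws an error when a group contains multiple NUTS versions. 'most_frequent' selects the most frequent version within groups. |
data_versions | Data versions, for example the versions_data element of a nuts.classified object. |
data | A nuts.classified object returned by nuts_classify(). |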
A tibble containing NUTS codes, the potential number of rows dropped and a message with the results of the test.
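What "most frequent version within groups" means can be sketched in base R. The sketch assumes that rows whose version differs from the modal version of their group are dropped. The source does not state how ties are resolved; here the first of the tied versions wins. The toy data and the helper `most_frequent()` are hypothetical and only illustrate the idea.

```r
# Toy classification result: one row per region with its detected version.
# Groups, codes and versions are invented for illustration.
toy <- data.frame(
  group   = c("a", "a", "a", "b", "b"),
  geo     = c("XX11", "XX12", "XX13", "XX11", "XX12"),
  version = c("2016", "2016", "2021", "2021", "2021")
)

# Hypothetical helper: the version that occurs most often in a vector.
most_frequent <- function(x) {
  counts <- table(x)
  names(counts)[which.max(counts)]
}

# Modal version per group, recycled to every row of the group.
toy$modal <- ave(toy$version, toy$group, FUN = most_frequent)

# Keep rows that match the modal version and count the rest as dropped.
kept <- toy[toy$version == toy$modal, ]
n_dropped <- nrow(toy) - nrow(kept)

kept
n_dropped
```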
    library(dplyr)
    library(nuts)

    df <- manure %>%
      filter(nchar(geo) == 5) %>%
      select(geo, indic_ag, values) %>%
      distinct(geo, .keep_all = TRUE) %>%
      nuts_classify(nuts_code = "geo", group_vars = "indic_ag", data = .)

    nuts_test_multiple_versions(
      group_vars = "indic_ag",
      multiple_versions = "most_frequent",
      data_versions = df$versions_data,
      data = df$data
    )
The data frame contains information on patent applications to the European Patent Office by year and NUTS 3 regions.
patents
A data frame with 104,106 rows and 4 columns:
4 indicators: Number, Nominal GDP in billion euro, Per million inhabitants, Per million of population in the labor force
NUTS 1, 2, 3 or National level
Years 2008, 2009, 2010, 2011 and 2012
Values
https://ec.europa.eu/eurostat/databrowser/view/PAT_EP_RTOT/default/table