Title: | 'NOAA' Weather Data from R |
---|---|
Description: | Client for many 'NOAA' data sources including the 'NCDC' climate 'API' at <https://www.ncdc.noaa.gov/cdo-web/webservices/v2>, with functions for each of the 'API' 'endpoints': data, data categories, data sets, data types, locations, location categories, and stations. In addition, we have an interface for 'NOAA' sea ice data, the 'NOAA' severe weather inventory, 'NOAA' Historical Observing 'Metadata' Repository ('HOMR') data, 'NOAA' storm data via 'IBTrACS', tornado data via the 'NOAA' storm prediction center, and more. |
Authors: | Scott Chamberlain [aut] , Daniel Hocking [aut, cre] , Brooke Anderson [ctb], Maëlle Salmon [ctb], Adam Erickson [ctb], Nicholas Potter [ctb], Joseph Stachelek [ctb], Alex Simmons [ctb], Karthik Ram [ctb], Hart Edmund [ctb], rOpenSci [fnd] (https://ropensci.org) |
Maintainer: | Daniel Hocking <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.4.0 |
Built: | 2024-09-28 06:29:29 UTC |
Source: | https://github.com/ropensci/rnoaa |
rnoaa is an R interface to NOAA climate data.
Many functions in this package interact with the National Climatic Data
Center application programming interface (API) at
https://www.ncdc.noaa.gov/cdo-web/webservices/v2, all of
which functions start with ncdc_
. An access token, or API key, is
required to use all the ncdc_
functions. The key is required by NOAA,
not us. Go to the link given above to get an API key.
More NOAA data sources are being added through time. Data sources and their function prefixes are:
buoy_*
- NOAA Buoy data from the National Buoy Data Center
ghcnd_*
/meteo_*
- GHCND daily data from NOAA
isd_*
- ISD/ISH data from NOAA
homr_*
- Historical Observing Metadata Repository (HOMR)
vignette
ncdc_*
- NOAA National Climatic Data Center (NCDC) vignette
(examples)
sea_ice
- Sea ice vignette
storm_
- Storms (IBTrACS) vignette
swdi
- Severe Weather Data Inventory (SWDI) vignette
tornadoes
- From the NOAA Storm Prediction Center
coops_search
- NOAA CO-OPS - tides and currents data
cpc_prcp
- rainfall data from the NOAA Climate
Prediction Center (CPC)
arc2
- rainfall data from Africa Rainfall Climatology
version 2
bsw
- Blended sea winds (BSW)
ersst
- NOAA Extended Reconstructed Sea Surface
Temperature (ERSST) data
lcd
- Local Climitalogical Data from NOAA
Government shutdowns can greatly affect data sources in this package. The following is a breakdown of the functions that fetch data by HTTP vs. FTP - done this way as we've noticed that during the ealry 2019 border wall shutdown most FTP services were up, while those that were down were HTTP; though not all HTTP services were down.
HTTP info: https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol
FTP info: https://en.wikipedia.org/wiki/File_Transfer_Protocol
HTTP services (whether service is/was up or down during early 2019 shutdown)
buoy_*
- Up
homr_*
- Up
ncdc_*
- Down
swdi
- Down
tornadoes
- Down
coops_search
- Up
ersst
- Down
lcd
- Down
se_*
- Down
FTP services (whether service is/was up or down during early 2019 shutdown)
ghcnd_*
- Up
isd_*
- Up
sea_ice
- Up
storm_
- Up
cpc_prcp
- Up
arc2
- Up
bsw
- Up
We've tried to whenever possible detect whether a service is error due to a government shutdown and give a message saying so. If you know a service is down that rnoaa interacts with but we don't fail well during a shutdown let us know.
Some functions use netcdf files, including:
ersst
buoy
bsw
You'll need the ncdf4
package for those functions, and those only.
ncdf4
is in Suggests in this package, meaning you only need
ncdf4
if you are using any of the functions listed above. You'll get
an informative error telling you to install ncdf4
if you don't have
it and you try to use the those functions. Installation of ncdf4
should be straightforward on any system.
meteo
family of functionsThe meteo
family of functions are prefixed with meteo_
and
provide a set of helper functions to:
Identify candidate stations from a latitude/longitude pair
Retrieve complete data for one or more stations (meteo_coverage()
)
Arc2 - Africa Rainfall Climatology version 2
arc2(date, box = NULL, ...)
arc2(date, box = NULL, ...)
date |
(character/date) one or more dates of the form YYYY-MM-DD |
box |
(numeric) vector of length 4, of the form
|
... |
curl options passed on to crul::verb-GET |
a list of tibbles with columns:
date: date (YYYY-MM-DD)
lon: longitude
lat: latitude
precip: precipitation (mm)
The box
parameter filters the arc2 data to a bounding box you supply.
The function that does the cropping to the bounding box is dplyr::filter
.
You can do any filtering you want on your own if you do not supply
box
and then use whatever tools you want to filter the data by
lat/lon, date, precip values.
See arc2_cache for managing cached files
docs: https://ftp.cpc.ncep.noaa.gov/fews/fewsdata/africa/arc2/ARC2_readme.txt
## Not run: x = arc2(date = "1983-01-01") arc2(date = "2017-02-14") # many dates arc2(date = c("2019-05-27", "2019-05-28")) arc2(seq(as.Date("2019-04-21"), by = "day", length.out = 5)) ## combine outputs x <- arc2(seq(as.Date("2019-05-20"), as.Date("2019-05-25"), "days")) dplyr::bind_rows(x) # bounding box filter box <- c(xmin = 9, ymin = 4, xmax = 10, ymax = 5) arc2(date = "2017-02-14", box = box) arc2(date = c("2019-05-27", "2019-05-28"), box = box) arc2(seq(as.Date("2019-05-20"), as.Date("2019-05-25"), "days"), box = box) ## End(Not run)
## Not run: x = arc2(date = "1983-01-01") arc2(date = "2017-02-14") # many dates arc2(date = c("2019-05-27", "2019-05-28")) arc2(seq(as.Date("2019-04-21"), by = "day", length.out = 5)) ## combine outputs x <- arc2(seq(as.Date("2019-05-20"), as.Date("2019-05-25"), "days")) dplyr::bind_rows(x) # bounding box filter box <- c(xmin = 9, ymin = 4, xmax = 10, ymax = 5) arc2(date = "2017-02-14", box = box) arc2(date = c("2019-05-27", "2019-05-28"), box = box) arc2(seq(as.Date("2019-05-20"), as.Date("2019-05-25"), "days"), box = box) ## End(Not run)
autoplot method for meteo_coverage objects
autoplot_meteo_coverage(meteo_object, old_style = FALSE)
autoplot_meteo_coverage(meteo_object, old_style = FALSE)
meteo_object |
the object returned from |
old_style |
(logical) create the old style of plots, which is faster, but does not plot gaps to indicate missing data |
see meteo_coverage()
for examples.
A ggplot2 plot
The Blended Sea Winds dataset contains globally gridded, high-resolution ocean surface vector winds and wind stresses on a global 0.25° grid, and multiple time resolutions of six-hourly, daily, monthly, and 11-year (1995–2005) climatological monthlies.
bsw(date = NULL, uv_stress = "uv", resolution = "6hrly", ...)
bsw(date = NULL, uv_stress = "uv", resolution = "6hrly", ...)
date |
(date/character) date, in the form YYYY-MM-DD if resolution is 6hrly or daily, or in the form YYYY-MM if resolution is monthly. For resolution=clm can be left NULL. If given, must be in the range 1987-07-09 to today-1 (yesterday) |
uv_stress |
(character) one of uv or stresss, not sure what these mean exactly yet. Default: uv |
resolution |
(character) temporal resolution. one of 6hrly, clm, daily, or monthly. See Details. |
... |
curl options passed on to crul::verb-GET |
Products are available from July 9th, 1987 - present.
Uses ncdf4
under the hood to read NetCDF files
an object of class ncdf4
Message from NOAA: "We also ask you to acknowledge us in your use of the data to help us justify continued service. This may be done by including text such as: The wind data are acquired from NOAA's National Climatic Data Center, via their website We would also appreciate receiving a copy of the relevant publication."
6hrly: 6-hourly, 4 global snapshots (u,v) at UTC 00, 06, 12 and 18Z
clm: climatological monthlies; also provided is the scalar mean (u,v,w)
daily: averages of the 6hrly time points, thus with a center time 09Z; also provided is the scalar mean, (u,v,w)
monthly: averages of daily data; also provided is the scalar mean (u,v,w)
See bsw_cache for managing cached files
We only handle the netcdf files for now, we're avoiding the ieee files, see https://www.cpc.ncep.noaa.gov/products/wesley/wgrib2/ieee.html
https://www.ncdc.noaa.gov/data-access/marineocean-data/blended-global/blended-sea-winds
## Not run: # 6hrly data ## uv x <- bsw(date = "2017-10-01") ## stress y <- bsw(date = "2011-08-01", uv_stress = "stress") # daily z <- bsw(date = "2017-10-01", resolution = "daily") # monthly w <- bsw(date = "2011-08", resolution = "monthly") # clm # x <- bsw(resolution = "clm") ## End(Not run)
## Not run: # 6hrly data ## uv x <- bsw(date = "2017-10-01") ## stress y <- bsw(date = "2011-08-01", uv_stress = "stress") # daily z <- bsw(date = "2017-10-01", resolution = "daily") # monthly w <- bsw(date = "2011-08", resolution = "monthly") # clm # x <- bsw(resolution = "clm") ## End(Not run)
Get NOAA buoy data from the National Buoy Data Center
buoy(dataset, buoyid, year = NULL, datatype = NULL, ...) buoys(dataset) buoy_stations(refresh = FALSE, ...)
buoy(dataset, buoyid, year = NULL, datatype = NULL, ...) buoys(dataset) buoy_stations(refresh = FALSE, ...)
dataset |
(character) Dataset name to query. See below for Details. Required |
buoyid |
Buoy ID, can be numeric/integer/character. Required |
year |
(integer) Year of data collection. Optional. Note there is
a special value |
datatype |
(character) Data type, one of 'c', 'cc', 'p', 'o'. Optional |
... |
Curl options passed on to crul::verb-GET
Optional. A number of different HTTP requests are made internally, but
we only pass this on to the request to get the netcdf file in the internal
function |
refresh |
(logical) Whether to use cached data ( |
Functions:
buoy_stations - Get buoy stations. A cached version of the dataset
is available in the package. Beware, takes a long time to run if you
do refresh = TRUE
buoys - Get available buoys given a dataset name
buoy - Get data given some combination of dataset name, buoy ID, year, and datatype
Options for the dataset parameter. One of:
adcp - Acoustic Doppler Current Profiler data
adcp2 - MMS Acoustic Doppler Current Profiler data
cwind - Continuous Winds data
dart - Deep-ocean Assessment and Reporting of Tsunamis data
mmbcur - Marsh-McBirney Current Measurements data
ocean - Oceanographic data
pwind - Peak Winds data
stdmet - Standard Meteorological data
swden - Spectral Wave Density data with Spectral Wave Direction data
wlevel - Water Level data
If netcdf data has lat/lon variables, then we'll parse into a tidy data.frame. If not, we'll give back the ncdf4 object for the user to parse (in which case the data.frame will be empty).
http://www.ndbc.noaa.gov/, http://dods.ndbc.noaa.gov/
## Not run: if (crul::ok("https://dods.ndbc.noaa.gov/thredds", timeout_ms = 1000)) { # Get buoy station information x <- buoy_stations() # refresh stations as needed, takes a while to run # you shouldn't need to update very often # x <- buoy_stations(refresh = TRUE) if (interactive() && requireNamespace("leaflet")){ library("leaflet") z <- leaflet(data = na.omit(x)) z <- leaflet::addTiles(z) leaflet::addCircles(z, ~lon, ~lat, opacity = 0.5) } # year=9999 to get most current data - not always available buoy(dataset = "swden", buoyid = 46012, year = 9999) # Get available buoys buoys(dataset = 'cwind') # Get data for a buoy ## if no year or datatype specified, we get the first file buoy(dataset = 'cwind', buoyid = 46085) # Including specific year buoy(dataset = 'cwind', buoyid = 41001, year = 1999) # Including specific year and datatype buoy(dataset = 'cwind', buoyid = 45005, year = 2008, datatype = "c") buoy(dataset = 'cwind', buoyid = 41001, year = 1997, datatype = "c") # Other datasets buoy(dataset = 'ocean', buoyid = 41029) # curl debugging buoy(dataset = 'cwind', buoyid = 46085, verbose = TRUE) # some buoy ids are character, case doesn't matter, we'll account for it buoy(dataset = "stdmet", buoyid = "VCAF1") buoy(dataset = "stdmet", buoyid = "wplf1") buoy(dataset = "dart", buoyid = "dartu") } ## End(Not run)
## Not run: if (crul::ok("https://dods.ndbc.noaa.gov/thredds", timeout_ms = 1000)) { # Get buoy station information x <- buoy_stations() # refresh stations as needed, takes a while to run # you shouldn't need to update very often # x <- buoy_stations(refresh = TRUE) if (interactive() && requireNamespace("leaflet")){ library("leaflet") z <- leaflet(data = na.omit(x)) z <- leaflet::addTiles(z) leaflet::addCircles(z, ~lon, ~lat, opacity = 0.5) } # year=9999 to get most current data - not always available buoy(dataset = "swden", buoyid = 46012, year = 9999) # Get available buoys buoys(dataset = 'cwind') # Get data for a buoy ## if no year or datatype specified, we get the first file buoy(dataset = 'cwind', buoyid = 46085) # Including specific year buoy(dataset = 'cwind', buoyid = 41001, year = 1999) # Including specific year and datatype buoy(dataset = 'cwind', buoyid = 45005, year = 2008, datatype = "c") buoy(dataset = 'cwind', buoyid = 41001, year = 1997, datatype = "c") # Other datasets buoy(dataset = 'ocean', buoyid = 41029) # curl debugging buoy(dataset = 'cwind', buoyid = 46085, verbose = TRUE) # some buoy ids are character, case doesn't matter, we'll account for it buoy(dataset = "stdmet", buoyid = "VCAF1") buoy(dataset = "stdmet", buoyid = "wplf1") buoy(dataset = "dart", buoyid = "dartu") } ## End(Not run)
Get NOAA co-ops data
coops_search( begin_date = NULL, end_date = NULL, station_name = NULL, product, datum = NULL, units = "metric", time_zone = "gmt", application = "rnoaa", ... )
coops_search( begin_date = NULL, end_date = NULL, station_name = NULL, product, datum = NULL, units = "metric", time_zone = "gmt", application = "rnoaa", ... )
begin_date |
(numeric) Date in yyyymmdd format. Required |
end_date |
(numeric) Date in yyyymmdd format. Required |
station_name |
(numeric) a station name. Required |
product |
(character) Specify the data type. See below for Details. Required |
datum |
(character) See below for Details. Required for all water level products. |
units |
(character) Specify metric or english (imperial) units, one of 'metric', 'english'. |
time_zone |
(character) Time zone, one of 'gmt', 'lst', 'lst_ldt'.
For GMT, we convert time stamps to GMT. For LST, we look up the time zone
of the station with its lat/lon values, and assign that time zone. When
|
application |
(character) If called within an external package, set to the name of your organization. Optional |
... |
Curl options passed on to crul::verb-GET Optional |
Options for the product paramter. One of:
water_level - Preliminary or verified water levels, depending on availability
air_temperature - Air temperature as measured at the station
water_temperature - Water temperature as measured at the station
wind - Wind speed, direction, and gusts as measured at the station
air_pressure - Barometric pressure as measured at the station
air_gap - Air Gap (distance between a bridge and the water's surface) at the station
conductivity - The water's conductivity as measured at the station
visibility - Visibility from the station's visibility sensor. A measure of atmospheric clarity
humidity - Relative humidity as measured at the station
salinity - Salinity and specific gravity data for the station
one_minute_water_level - One minute water level data for the station
predictions - 6 minute predictions water level data for the station
hourly_height - Verified hourly height water level data for the station
high_low - Verified high/low water level data for the station
daily_mean - Verified daily mean water level data for the station
monthly_mean - Verified monthly mean water level data for the station
datums - datums data for the stations
currents - Currents data for currents stations
Maximum Durations in a Single Call:
Products water_level through predictions allow requests for up to
Products hourly_height and high_low allow requests for up to
Products daily_mean and monthly_mean allow requests for up to
Options for the datum parameter. One of:
MHHW - Mean higher high water
MHW - Mean high water
MTL - Mean tide level
MSL - Mean sea level
MLW - Mean low water
MLLW - Mean lower low water
NAVD - North American Vertical Datum
STND - Station datum
List, of length one or two.
metadata A list of metadata with slots id, name, lat, lon
data A data.frame with data
Scott Chamberlain, Joseph Stachelek, Tom Philippi
https://tidesandcurrents.noaa.gov/api/ https://tidesandcurrents.noaa.gov/map/
## Not run: # Get monthly mean sea level data at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20120301, end_date = 20141001, datum = "stnd", product = "monthly_mean") # Get verified water level data at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, datum = "stnd", product = "water_level") # Get daily mean water level data at Fairport, OH (9063053) coops_search(station_name = 9063053, begin_date = 20150927, end_date = 20150928, product = "daily_mean", datum = "stnd", time_zone = "lst") # Get air temperature at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, product = "air_temperature") # Get water temperature at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, product = "water_temperature") # Get air pressure at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, product = "air_pressure") # Get wind at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, product = "wind") # Get hourly water level height at Key West (8724580) coops_search(station_name = 8724580, begin_date = 20140927, end_date = 20140928, product = "hourly_height", datum = "stnd") # Get high-low water level at Key West (8724580) coops_search(station_name = 8724580, begin_date = 20140927, end_date = 20140928, product = "high_low", datum = "stnd") # Get currents data at Pascagoula Harbor (ps0401) coops_search(station_name = "ps0401", begin_date = 20151221, end_date = 20151222, product = "currents") # Get one-minute water level at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, datum = "stnd", product = "one_minute_water_level") # Get datums at Fort Myers, FL (8725520) coops_search(station_name = 8725520, product = "datums") # Get water level predictions at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140928, end_date = 20140929, datum = "stnd", product = "predictions") ## End(Not run)
## Not run: # Get monthly mean sea level data at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20120301, end_date = 20141001, datum = "stnd", product = "monthly_mean") # Get verified water level data at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, datum = "stnd", product = "water_level") # Get daily mean water level data at Fairport, OH (9063053) coops_search(station_name = 9063053, begin_date = 20150927, end_date = 20150928, product = "daily_mean", datum = "stnd", time_zone = "lst") # Get air temperature at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, product = "air_temperature") # Get water temperature at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, product = "water_temperature") # Get air pressure at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, product = "air_pressure") # Get wind at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, product = "wind") # Get hourly water level height at Key West (8724580) coops_search(station_name = 8724580, begin_date = 20140927, end_date = 20140928, product = "hourly_height", datum = "stnd") # Get high-low water level at Key West (8724580) coops_search(station_name = 8724580, begin_date = 20140927, end_date = 20140928, product = "high_low", datum = "stnd") # Get currents data at Pascagoula Harbor (ps0401) coops_search(station_name = "ps0401", begin_date = 20151221, end_date = 20151222, product = "currents") # Get one-minute water level at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, datum = "stnd", product = "one_minute_water_level") # Get datums at Fort Myers, FL (8725520) coops_search(station_name = 8725520, product = "datums") # Get water level predictions at Vaca Key (8723970) coops_search(station_name = 8723970, begin_date = 20140928, end_date = 20140929, datum = "stnd", product = "predictions") ## End(Not run)
Precipitation data from NOAA Climate Prediction Center (CPC)
cpc_prcp(date, us = FALSE, drop_undefined = FALSE, ...)
cpc_prcp(date, us = FALSE, drop_undefined = FALSE, ...)
date |
(date/character) date in YYYY-MM-DD format |
us |
(logical) US data only? default: |
drop_undefined |
(logical) drop undefined precipitation
values (values in the |
... |
curl options passed on to crul::verb-GET |
Rainfall data for the world (1979-present, resolution 50 km), and the US (1948-present, resolution 25 km).
a data.frame, with columns:
lon - longitude (0 to 360)
lat - latitude (-90 to 90)
precip - precipitation (in mm) (see Details for more information)
Internally we multiply all precipitation measurements by 0.1 as per the CPC documentation.
Values of -99.0 are classified as "undefined". These values can be
removed by setting drop_undefined = TRUE
in the cpc_prcp
function call. These undefined values are not dropped by default -
so do remember to set drop_undefined = TRUE
to drop them; or
you can easily do it yourself by e.g., subset(x, precip >= 0)
See cpc_cache for managing cached files
https://www.cpc.ncep.noaa.gov/ https://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP https://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_CONUS/DOCU/PRCP_CU_GAUGE_V1.0CONUS_0.25deg.README https://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/DOCU/PRCP_CU_GAUGE_V1.0GLB_0.50deg_README.txt https://psl.noaa.gov/data/gridded/data.unified.daily.conus.html
## Not run: x = cpc_prcp(date = "2017-01-15") cpc_prcp(date = "2015-06-05") cpc_prcp(date = "2017-01-15") cpc_prcp(date = "2005-07-09") cpc_prcp(date = "1979-07-19") # United States data only cpc_prcp(date = "2005-07-09", us = TRUE) cpc_prcp(date = "2009-08-03", us = TRUE) cpc_prcp(date = "1998-04-23", us = TRUE) # drop undefined values (those given as -99.0) cpc_prcp(date = "1998-04-23", drop_undefined = TRUE) ## End(Not run)
## Not run: x = cpc_prcp(date = "2017-01-15") cpc_prcp(date = "2015-06-05") cpc_prcp(date = "2017-01-15") cpc_prcp(date = "2005-07-09") cpc_prcp(date = "1979-07-19") # United States data only cpc_prcp(date = "2005-07-09", us = TRUE) cpc_prcp(date = "2009-08-03", us = TRUE) cpc_prcp(date = "1998-04-23", us = TRUE) # drop undefined values (those given as -99.0) cpc_prcp(date = "1998-04-23", drop_undefined = TRUE) ## End(Not run)
NOAA Extended Reconstructed Sea Surface Temperature (ERSST) data
ersst(year, month, overwrite = TRUE, version = "v5", ...)
ersst(year, month, overwrite = TRUE, version = "v5", ...)
year |
(numeric) A year. Must be > 1853. The max value is whatever the current year is. Required |
month |
A month, character or numeric. If single digit (e.g. 8), we add a zero in front (e.g., 08). Required |
overwrite |
(logical) To overwrite the path to store files in or not,
Default: |
version |
(character) ERSST version. one of "v5" (default) or "v4" |
... |
Curl options passed on to crul::verb-GET |
See ersst_cache for managing cached files
ersst()
currently defaults to use ERSST v5 - you can set v4 or v5
using the version
parameter
If a request is unsuccesful, the file written to disk is deleted before the function exits.
If you use this data in your research please cite rnoaa
(citation("rnoaa")
), and cite NOAA ERSST (see citations at link above)
An ncdf4
object. See ncdf4 for parsing the output
https://www.ncdc.noaa.gov/data-access/marineocean-data/extended-reconstructed-sea-surface-temperature-ersst-v5
## Not run: # October, 2015 ersst(year = 2015, month = 10) # May, 2015 ersst(year = 2015, month = 5) ersst(year = 2015, month = "05") # February, 1890 ersst(year = 1890, month = 2) # Process data library("ncdf4") res <- ersst(year = 1890, month = 2) ## varibles names(res$var) ## get a variable ncdf4::ncvar_get(res, "ssta") ## End(Not run)
## Not run: # October, 2015 ersst(year = 2015, month = 10) # May, 2015 ersst(year = 2015, month = 5) ersst(year = 2015, month = "05") # February, 1890 ersst(year = 1890, month = 2) # Process data library("ncdf4") res <- ersst(year = 1890, month = 2) ## varibles names(res$var) ## get a variable ncdf4::ncvar_get(res, "ssta") ## End(Not run)
A dataset containing the FIPS codes for 51 US states and territories. The variables are as follows:
A data frame with 3142 rows and 5 variables
state. US state name.
county. County name.
fips_state. Numeric value, from 1 to 51.
fips_county. Numeric value, from 1 to 840.
fips. Numeric value, from 1001 to 56045.
This function uses ftp to access the Global Historical Climatology Network daily weather data from NOAA's FTP server for a single weather site. It requires the site identification number for that site and will pull the entire weather dataset for the site.
ghcnd(stationid, refresh = FALSE, ...) ghcnd_read(path, ...)
ghcnd(stationid, refresh = FALSE, ...) ghcnd_read(path, ...)
stationid |
(character) A character vector giving the identification of
the weather stations for which the user would like to pull data. To get a full
and current list of stations, the user can use the |
refresh |
(logical) If |
... |
In the case of |
path |
(character) a path to a file with a |
This function saves the full set of weather data for the queried
site locally in the directory specified by the path
argument.
You can access the path for the cached file via attr(x, "source")
You can access the last modified time for the cached file via
attr(x, "file_modified")
Messages are printed to the console about file path and file last modified time
which you can suppress with suppressMessages()
For those station ids that are not found, we will delete the file locally so that a bad station id file is not cached. The returned data for a bad station id will be an empty data.frame and the attributes are empty strings.
A tibble (data.frame) which contains data pulled from NOAA's FTP server for the queried weather site. A README file with more information about the format of this file is available from NOAA (https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt). This file is formatted so each line of the file gives the daily weather observations for a single weather variable for all days of one month of one year. In addition to measurements, columns are included for certain flags, which add information on observation sources and quality and are further explained in NOAA's README file for the data.
The base url for data requests can be changed. The allowed urls are: https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/all (default), https://ncei.noaa.gov/pub/data/ghcn/daily/all
You can set the base url using the RNOAA_GHCND_BASE_URL
environment
variable; see example below.
The reason for this is that sometimes one base url source is temporarily down, but another base url may work. It doesn't make sense to allow an arbitrary base URL; open an issue if there is another valid base URL for GHNCD data that we should add to the allowed set of base urls.
See ghcnd_cache for managing cached files
Scott Chamberlain [email protected], Adam Erickson [email protected]
To generate a weather dataset for a single weather site that has
been cleaned to a tidier weather format, the user should use the
ghcnd_search()
function, which calls ghcnd()
and then
processes the output, or meteo_tidy_ghcnd()
, which wraps the
ghcnd_search()
function to output a tidy dataframe. To pull
GHCND data from multiple monitors, see meteo_pull_monitors()
## Not run: # Get data ghcnd(stationid = "AGE00147704") stations <- ghcnd_stations() ghcnd(stations$id[40]) library("dplyr") ghcnd(stations$id[80300]) %>% select(id, element) %>% slice(1:3) # manipulate data ## using built in fxns dat <- ghcnd(stationid = "AGE00147704") (alldat <- ghcnd_splitvars(dat)) ## using dplyr library("dplyr") dat <- ghcnd(stationid = "AGE00147704") filter(dat, element == "PRCP", year == 1909) # refresh the cached file ghcnd(stationid = "AGE00147704", refresh = TRUE) # Read in a .dly file you've already downloaded path <- system.file("examples/AGE00147704.dly", package = "rnoaa") ghcnd_read(path) # change the base url for data requests Sys.setenv(RNOAA_GHCND_BASE_URL = "https://ncei.noaa.gov/pub/data/ghcn/daily/all") ghcnd(stations$id[45], verbose = TRUE) ## must be in the allowed set of urls # Sys.setenv(RNOAA_GHCND_BASE_URL = "https://google.com") # ghcnd(stations$id[58], verbose = TRUE) ## End(Not run)
## Not run: # Get data ghcnd(stationid = "AGE00147704") stations <- ghcnd_stations() ghcnd(stations$id[40]) library("dplyr") ghcnd(stations$id[80300]) %>% select(id, element) %>% slice(1:3) # manipulate data ## using built in fxns dat <- ghcnd(stationid = "AGE00147704") (alldat <- ghcnd_splitvars(dat)) ## using dplyr library("dplyr") dat <- ghcnd(stationid = "AGE00147704") filter(dat, element == "PRCP", year == 1909) # refresh the cached file ghcnd(stationid = "AGE00147704", refresh = TRUE) # Read in a .dly file you've already downloaded path <- system.file("examples/AGE00147704.dly", package = "rnoaa") ghcnd_read(path) # change the base url for data requests Sys.setenv(RNOAA_GHCND_BASE_URL = "https://ncei.noaa.gov/pub/data/ghcn/daily/all") ghcnd(stations$id[45], verbose = TRUE) ## must be in the allowed set of urls # Sys.setenv(RNOAA_GHCND_BASE_URL = "https://google.com") # ghcnd(stations$id[58], verbose = TRUE) ## End(Not run)
This function uses ftp to access the Global Historical Climatology Network daily weather data from NOAA's FTP server for a single weather monitor site. It requires the site identification number for that site and will pull the entire weather dataset for the site. It will then clean this data to convert it to a tidier format and will also, if requested, filter it to a certain date range and to certain weather variables.
ghcnd_search( stationid, date_min = NULL, date_max = NULL, var = "all", refresh = FALSE, ... )
ghcnd_search( stationid, date_min = NULL, date_max = NULL, var = "all", refresh = FALSE, ... )
stationid |
(character) A character vector giving the identification of
the weather stations for which the user would like to pull data. To get a full
and current list of stations, the user can use the |
date_min |
A character string giving the earliest date of the daily weather time series that the user would like in the final output. This character string should be formatted as "yyyy-mm-dd". If not specified, the default is to keep all daily data for the queried weather site from the earliest available date. |
date_max |
A character string giving the latest date of the daily weather time series that the user would like in the final output. This character string should be formatted as "yyyy-mm-dd". If not specified, the default is to keep all daily data for the queried weather site through the most current available date. |
var |
A character vector specifying either
A full list of possible weather variables is available in NOAA's README file for the GHCND data (https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt). Most weather stations will only have a small subset of all the possible weather variables, so the data generated by this function may not include all of the variables the user specifies through this argument. |
refresh |
(logical) If |
... |
In the case of |
Messages are printed to the console about file path, file last modified time
which you can suppress with suppressMessages()
A list object with slots for each of the available specified weather variables. Each element in the list is a separate time series dataframe with daily observations, as well as flag values, for one of the weather variables. The flag values give information on the quality and source of each observation; see the NOAA README file linked above for more information. Each data.frame is sorted by date, with the earliest date first.
This function calls ghcnd()
, which will download and save
data from all available dates and weather variables for the queried
weather station. The step of limiting the dataset to only certain dates
and / or weather variables, using the date_min
, date_max
,
and var
arguments, does not occur until after the full data has
been pulled.
Scott Chamberlain [email protected], Adam Erickson [email protected]
meteo_pull_monitors()
, meteo_tidy_ghcnd()
## Not run: # Search based on variable and/or date ghcnd_search("AGE00147704", var = "PRCP") ghcnd_search("AGE00147704", var = "PRCP", date_min = "1920-01-01") ghcnd_search("AGE00147704", var = "PRCP", date_max = "1915-01-01") ghcnd_search("AGE00147704", var = "PRCP", date_min = "1920-01-01", date_max = "1925-01-01") ghcnd_search("AGE00147704", date_min = "1920-01-01", date_max = "1925-01-01") ghcnd_search("AGE00147704", var = c("PRCP","TMIN")) ghcnd_search("AGE00147704", var = c("PRCP","TMIN"), date_min = "1920-01-01") ghcnd_search("AGE00147704", var = "adfdf") # refresh the cached file ghcnd_search("AGE00147704", var = "PRCP", refresh = TRUE) ## End(Not run)
## Not run: # Search based on variable and/or date ghcnd_search("AGE00147704", var = "PRCP") ghcnd_search("AGE00147704", var = "PRCP", date_min = "1920-01-01") ghcnd_search("AGE00147704", var = "PRCP", date_max = "1915-01-01") ghcnd_search("AGE00147704", var = "PRCP", date_min = "1920-01-01", date_max = "1925-01-01") ghcnd_search("AGE00147704", date_min = "1920-01-01", date_max = "1925-01-01") ghcnd_search("AGE00147704", var = c("PRCP","TMIN")) ghcnd_search("AGE00147704", var = c("PRCP","TMIN"), date_min = "1920-01-01") ghcnd_search("AGE00147704", var = "adfdf") # refresh the cached file ghcnd_search("AGE00147704", var = "PRCP", refresh = TRUE) ## End(Not run)
ghcnd
This function is a helper function for ghcnd_search()
. It helps
with cleaning up the data returned from ghcnd()
, to get it in a
format that is easier to work with.
ghcnd_splitvars(x)
ghcnd_splitvars(x)
x |
An object returned from |
See ghcnd()
examples
Scott Chamberlain, Adam Erickson, Elio Campitelli
These function allow you to pull the current versions of certain meta-datasets for the GHCND, including lists of country and state abbreviations used in some of the weather station IDs and information about the current version of the data.
ghcnd_states(...) ghcnd_countries(...) ghcnd_version(...)
ghcnd_states(...) ghcnd_countries(...) ghcnd_version(...)
... |
In the case of |
Functions:
ghcnd_version
: Get current version of GHCND data
ghcnd_states
: Get US/Canada state names and 2-letter codes
ghcnd_countries
: Get country names and 2-letter codes
Scott Chamberlain [email protected], Adam Erickson [email protected]
## Not run: ghcnd_states() ghcnd_countries() ghcnd_version() ## End(Not run)
## Not run: ghcnd_states() ghcnd_countries() ghcnd_version() ## End(Not run)
This function returns an object with a dataframe with meta-information about all available GHCND weather stations.
ghcnd_stations(refresh = FALSE, ...)
ghcnd_stations(refresh = FALSE, ...)
refresh |
(logical) If |
... |
In the case of |
This function returns a tibble (dataframe) with a weather station on each row with the following columns:
id
: The weather station's ID number. The first two letters
denote the country (using FIPS country codes).
latitude
: The station's latitude, in decimal degrees.
Southern latitudes will be negative.
longitude
: The station's longitude, in decimal degrees.
Western longitudes will be negative.
elevation
: The station's elevation, in meters.
name
: The station's name.
gsn_flag
: "GSN" if the monitor belongs to the GCOS Surface
Network (GSN). Otherwise either blank or missing.
wmo_id
: If the station has a WMO number, this column gives
that number. Otherwise either blank or missing.
element
: A weather variable recorded at some point during
that station's history. See the link below in "References" for
definitions of the abbreviations used for this variable.
first_year
: The first year of data available at that station
for that weather element.
last_year
: The last year of data available at that station
for that weather element.
If a weather station has data on more than one weather variable, it will be represented in multiple rows of this output dataframe.
Since this function is pulling a large dataset by ftp, it may take a while to run.
For more documentation on the returned dataset, see http://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt
## Not run: # Get stations, ghcnd-stations and ghcnd-inventory merged (stations <- ghcnd_stations()) library(dplyr) # filter by state stations %>% filter(state == "IL") stations %>% filter(state == "OR") # those without state values stations %>% filter(state == "") # filter by element stations %>% filter(element == "PRCP") # filter by id prefix stations %>% filter(grepl("^AF", id)) stations %>% filter(grepl("^AFM", id)) # filter by station long name stations %>% filter(name == "CALLATHARRA") ## End(Not run)
## Not run: # Get stations, ghcnd-stations and ghcnd-inventory merged (stations <- ghcnd_stations()) library(dplyr) # filter by state stations %>% filter(state == "IL") stations %>% filter(state == "OR") # those without state values stations %>% filter(state == "") # filter by element stations %>% filter(element == "PRCP") # filter by id prefix stations %>% filter(grepl("^AF", id)) stations %>% filter(grepl("^AFM", id)) # filter by station long name stations %>% filter(name == "CALLATHARRA") ## End(Not run)
Historical Observing Metadata Repository (HOMR) station metadata
homr( qid = NULL, qidMod = NULL, station = NULL, state = NULL, county = NULL, country = NULL, name = NULL, nameMod = NULL, platform = NULL, date = NULL, begindate = NULL, enddate = NULL, headersOnly = FALSE, phrData = NULL, combine = FALSE, ... )
homr( qid = NULL, qidMod = NULL, station = NULL, state = NULL, county = NULL, country = NULL, name = NULL, nameMod = NULL, platform = NULL, date = NULL, begindate = NULL, enddate = NULL, headersOnly = FALSE, phrData = NULL, combine = FALSE, ... )
qid |
One of COOP, FAA, GHCND, ICAO, NCDCSTNID, NWSLI, TRANS, WBAN, or
WMO, or any of those plus |
qidMod |
(character) One of: is, starts, ends, contains. Specifies how the ID portion of the qid parameter should be applied within the search. If a qid is passed but the qidMod parameter is not used, qidMod is assumed to be IS. |
station |
(character) A station id. |
state |
(character) A two-letter state abbreviation. Two-letter code for US states, Canadian provinces, and other Island areas. |
county |
(character) A two letter county code. US county names, best used with a state identifier. |
country |
(character) A two letter country code. See here for a list of valid country names. |
name |
(character) One of |
nameMod |
(character) |
platform |
(character) (aka network) |
date |
(character) |
begindate , enddate
|
|
headersOnly |
(logical) Returns only minimal information for each station found (NCDC Station ID, Preferred Name, Station Begin Date, and Station End Date), but is much quicker than a full query. If you are performing a search that returns a large number of stations and intend to choose only one from that list to examine in detail, headersOnly may give you enough information to find the NCDC Station ID for the station that you actually want. |
phrData |
(logical) The HOMR web service now includes PHR (element-level) data when available, in an elements section. Because of how this data is structured, it can substantially increase the size of any result which includes it. If you don't need this data you can omit it by including phrData=false. If the parameter is not set, it will default to phrData=true. |
combine |
(logical) Combine station metadata or not. |
... |
Curl options passed on to crul::verb-GET (optional) |
Since the definitions for variables are always the same, we don't
include the ability to get description data in this function. Use
homr_definitions()
to get descriptions information.
A list, with elements named by the station ids.
https://www.ncdc.noaa.gov/homr/api
## Not run: homr(qid = 'COOP:046742') homr(qid = ':046742') homr(qidMod='starts', qid='COOP:0467') homr(headersOnly=TRUE, state='DE') homr(headersOnly=TRUE, country='GHANA') homr(headersOnly=TRUE, state='NC', county='BUNCOMBE') homr(name='CLAYTON') res <- homr(state='NC', county='BUNCOMBE', combine=TRUE) res$id res$head res$updates homr(nameMod='starts', name='CLAY') homr(headersOnly=TRUE, platform='ASOS') homr(qid='COOP:046742', date='2011-01-01') homr(qid='COOP:046742', begindate='2005-01-01', enddate='2011-01-01') homr(state='DE', headersOnly=TRUE) homr(station=20002078) homr(station=20002078, date='all', phrData=FALSE) # Optionally pass in curl options homr(headersOnly=TRUE, state='NC', county='BUNCOMBE', verbose = TRUE) ## End(Not run)
## Not run: homr(qid = 'COOP:046742') homr(qid = ':046742') homr(qidMod='starts', qid='COOP:0467') homr(headersOnly=TRUE, state='DE') homr(headersOnly=TRUE, country='GHANA') homr(headersOnly=TRUE, state='NC', county='BUNCOMBE') homr(name='CLAYTON') res <- homr(state='NC', county='BUNCOMBE', combine=TRUE) res$id res$head res$updates homr(nameMod='starts', name='CLAY') homr(headersOnly=TRUE, platform='ASOS') homr(qid='COOP:046742', date='2011-01-01') homr(qid='COOP:046742', begindate='2005-01-01', enddate='2011-01-01') homr(state='DE', headersOnly=TRUE) homr(station=20002078) homr(station=20002078, date='all', phrData=FALSE) # Optionally pass in curl options homr(headersOnly=TRUE, state='NC', county='BUNCOMBE', verbose = TRUE) ## End(Not run)
Historical Observing Metadata Repository (HOMR) station metadata - definitions
homr_definitions(...)
homr_definitions(...)
... |
Curl options passed on to crul::verb-GET optional |
## Not run: head( homr_definitions() ) ## End(Not run)
## Not run: head( homr_definitions() ) ## End(Not run)
Get and parse NOAA ISD/ISH data
isd( usaf, wban, year, overwrite = TRUE, cleanup = TRUE, additional = TRUE, parallel = FALSE, cores = getOption("cl.cores", 2), progress = FALSE, force = FALSE, ... )
isd( usaf, wban, year, overwrite = TRUE, cleanup = TRUE, additional = TRUE, parallel = FALSE, cores = getOption("cl.cores", 2), progress = FALSE, force = FALSE, ... )
usaf , wban
|
(character) USAF and WBAN code. Required |
year |
(numeric) One of the years from 1901 to the current year. Required. |
overwrite |
(logical) To overwrite the path to store files in or not,
Default: |
cleanup |
(logical) If |
additional |
(logical) include additional and remarks data sections
in output. Default: |
parallel |
(logical) do processing in parallel. Default: |
cores |
(integer) number of cores to use: Default: 2. We look in your option "cl.cores", but use default value if not found. |
progress |
(logical) print progress - ignored if |
force |
(logical) force download? Default: |
... |
Curl options passed on to crul::verb-GET |
isd
saves the full set of weather data for the queried
site locally in the directory specified by the path
argument. You
can access the path for the cached file via attr(x, "source")
We use isdparser internally to parse ISD files. They are relatively complex to parse, so a separate package takes care of that.
This function first looks for whether the data for your specific
query has already been downloaded previously in the directory given by
the path
parameter. If not found, the data is requested form NOAA's
FTP server. The first time a dataset is pulled down we must a) download the
data, b) process the data, and c) save a compressed .rds file to disk. The
next time the same data is requested, we only have to read back in the
.rds file, and is quite fast. The benfit of writing to .rds files is that
data is compressed, taking up less space on your disk, and data is read
back in quickly, without changing any data classes in your data, whereas
we'd have to jump through hoops to do that with reading in csv. The
processing can take quite a long time since the data is quite messy and
takes a bunch of regex to split apart text strings. We hope to speed
this process up in the future. See examples below for different behavior.
A tibble (data.frame).
Note that when you get an error similar to Error: download failed for https://ftp.ncdc.noaa.gov/pub/data/noaa/1955/011490-99999-1955.gz
, the
file does not exist on NOAA's servers. If your internet is down,
you'll get a different error.
There are now no transformations (scaling, class changes, etc.)
done on the output data. This may change in the future with parameters
to toggle transformations, but none are done for now. See
isdparser::isd_transform()
for transformation help.
Comprehensive transformations for all variables are not yet available
but should be available in the next version of this package.
See isd_cache for managing cached files
https://ftp.ncdc.noaa.gov/pub/data/noaa/ https://www1.ncdc.noaa.gov/pub/data/noaa
Other isd:
isd_read()
,
isd_stations_search()
,
isd_stations()
## Not run: # Get station table (stations <- isd_stations()) ## plot stations ### remove incomplete cases, those at 0,0 df <- stations[complete.cases(stations$lat, stations$lon), ] df <- df[df$lat != 0, ] ### make plot library("leaflet") leaflet(data = df) %>% addTiles() %>% addCircles() # Get data (res <- isd(usaf='011490', wban='99999', year=1986)) (res <- isd(usaf='011690', wban='99999', year=1993)) (res <- isd(usaf='109711', wban=99999, year=1970)) # "additional" and "remarks" data sections included by default # can toggle that parameter to not include those in output, saves time (res1 <- isd(usaf='011490', wban='99999', year=1986, force = TRUE)) (res2 <- isd(usaf='011490', wban='99999', year=1986, force = TRUE, additional = FALSE)) # The first time a dataset is requested takes longer system.time( isd(usaf='782680', wban='99999', year=2011) ) system.time( isd(usaf='782680', wban='99999', year=2011) ) # Plot data ## get data for multiple stations res1 <- isd(usaf='011690', wban='99999', year=1993) res2 <- isd(usaf='782680', wban='99999', year=2011) res3 <- isd(usaf='008415', wban='99999', year=2016) res4 <- isd(usaf='109711', wban=99999, year=1970) ## combine data library(dplyr) res_all <- bind_rows(res1, res2, res3, res4) # add date time library("lubridate") dd <- sprintf('%s %s', as.character(res_all$date), res_all$time) res_all$date_time <- ymd_hm(dd) ## remove 999's res_all <- filter(res_all, temperature < 900) ## plot if (interactive()) { library(ggplot2) ggplot(res_all, aes(date_time, temperature)) + geom_line() + facet_wrap(~usaf_station, scales = 'free_x') } # print progress ## note: if the file is already on your system, you'll see no progress bar (res <- isd(usaf='011690', wban='99999', year=1993, progress=TRUE)) # parallelize processing # (res <- isd(usaf=172007, wban=99999, year=2016, parallel=TRUE)) ## End(Not run)
## Not run: # Get station table (stations <- isd_stations()) ## plot stations ### remove incomplete cases, those at 0,0 df <- stations[complete.cases(stations$lat, stations$lon), ] df <- df[df$lat != 0, ] ### make plot library("leaflet") leaflet(data = df) %>% addTiles() %>% addCircles() # Get data (res <- isd(usaf='011490', wban='99999', year=1986)) (res <- isd(usaf='011690', wban='99999', year=1993)) (res <- isd(usaf='109711', wban=99999, year=1970)) # "additional" and "remarks" data sections included by default # can toggle that parameter to not include those in output, saves time (res1 <- isd(usaf='011490', wban='99999', year=1986, force = TRUE)) (res2 <- isd(usaf='011490', wban='99999', year=1986, force = TRUE, additional = FALSE)) # The first time a dataset is requested takes longer system.time( isd(usaf='782680', wban='99999', year=2011) ) system.time( isd(usaf='782680', wban='99999', year=2011) ) # Plot data ## get data for multiple stations res1 <- isd(usaf='011690', wban='99999', year=1993) res2 <- isd(usaf='782680', wban='99999', year=2011) res3 <- isd(usaf='008415', wban='99999', year=2016) res4 <- isd(usaf='109711', wban=99999, year=1970) ## combine data library(dplyr) res_all <- bind_rows(res1, res2, res3, res4) # add date time library("lubridate") dd <- sprintf('%s %s', as.character(res_all$date), res_all$time) res_all$date_time <- ymd_hm(dd) ## remove 999's res_all <- filter(res_all, temperature < 900) ## plot if (interactive()) { library(ggplot2) ggplot(res_all, aes(date_time, temperature)) + geom_line() + facet_wrap(~usaf_station, scales = 'free_x') } # print progress ## note: if the file is already on your system, you'll see no progress bar (res <- isd(usaf='011690', wban='99999', year=1993, progress=TRUE)) # parallelize processing # (res <- isd(usaf=172007, wban=99999, year=2016, parallel=TRUE)) ## End(Not run)
Read NOAA ISD/ISH local file
isd_read( path, additional = TRUE, parallel = FALSE, cores = getOption("cl.cores", 2), progress = FALSE )
isd_read( path, additional = TRUE, parallel = FALSE, cores = getOption("cl.cores", 2), progress = FALSE )
path |
(character) path to the file. required. |
additional |
(logical) include additional and remarks data sections
in output. Default: |
parallel |
(logical) do processing in parallel. Default: |
cores |
(integer) number of cores to use: Default: 2. We look in your option "cl.cores", but use default value if not found. |
progress |
(logical) print progress - ignored if |
isd_read
- read a .gz
file as downloaded
from NOAA's website
A tibble (data.frame)
https://ftp.ncdc.noaa.gov/pub/data/noaa/
isd()
, isd_stations()
, isd_stations_search()
Other isd:
isd_stations_search()
,
isd_stations()
,
isd()
## Not run: file <- system.file("examples", "011490-99999-1986.gz", package = "rnoaa") isd_read(file) isd_read(file, additional = FALSE) ## End(Not run)
## Not run: file <- system.file("examples", "011490-99999-1986.gz", package = "rnoaa") isd_read(file) isd_read(file, additional = FALSE) ## End(Not run)
Get NOAA ISD/ISH station data from NOAA FTP server.
isd_stations(refresh = FALSE)
isd_stations(refresh = FALSE)
refresh |
(logical) Download station data from NOAA ftp server again.
Default: |
The data table is cached, but you can force download of data from
NOAA by setting refresh=TRUE
a tibble (data.frame) with the columns:
usaf - USAF number, character
wban - WBAN number, character
station_name - station name, character
ctry - Country, if given, character
state - State, if given, character
icao - ICAO number, if given, character
lat - Latitude, if given, numeric
lon - Longitude, if given, numeric
elev_m - Elevation, if given, numeric
begin - Begin date of data coverage, of form YYYYMMDD, numeric
end - End date of data coverage, of form YYYYMMDD, numeric
See isd_cache for managing cached files
https://ftp.ncdc.noaa.gov/pub/data/noaa/
Other isd:
isd_read()
,
isd_stations_search()
,
isd()
## Not run: # Get station table (stations <- isd_stations()) ## plot stations ### remove incomplete cases, those at 0,0 df <- stations[complete.cases(stations$lat, stations$lon), ] df <- df[df$lat != 0, ] ### make plot library("leaflet") leaflet(data = df) %>% addTiles() %>% addCircles() ## End(Not run)
## Not run: # Get station table (stations <- isd_stations()) ## plot stations ### remove incomplete cases, those at 0,0 df <- stations[complete.cases(stations$lat, stations$lon), ] df <- df[df$lat != 0, ] ### make plot library("leaflet") leaflet(data = df) %>% addTiles() %>% addCircles() ## End(Not run)
Search for NOAA ISD/ISH station data from NOAA FTP server.
isd_stations_search(lat = NULL, lon = NULL, radius = NULL, bbox = NULL)
isd_stations_search(lat = NULL, lon = NULL, radius = NULL, bbox = NULL)
lat |
(numeric) Latitude, in decimal degree |
lon |
(numeric) Latitude, in decimal degree |
radius |
(numeric) Radius (in km) to search from the lat,lon coordinates |
bbox |
(numeric) Bounding box, of the form: min-longitude, min-latitude, max-longitude, max-latitude |
We internally call isd_stations()
to get the data.frame
of ISD stations, which is quite fast as long as it's not the first time
called since we cache the table. Before searching, we clean up the
data.frame, removing stations with no lat/long coordinates, those with
impossible lat/long coordinates, and those at 0,0.
When lat/lon/radius input we use meteo_distance()
to search
for stations, while when bbox is input, we simply use
dplyr::filter()
a data.frame with the columns:
usaf - USAF number, character
wban - WBAN number, character
station_name - station name, character
ctry - Country, if given, character
state - State, if given, character
icao - ICAO number, if given, character
lat - Latitude, if given, numeric
lon - Longitude, if given, numeric
elev_m - Elevation, if given, numeric
begin - Begin date of data coverage, of form YYYYMMDD, numeric
end - End date of data coverage, of form YYYYMMDD, numeric
distance - distance (km) (only present if using lat/lon/radius parameter combination)
https://ftp.ncdc.noaa.gov/pub/data/noaa/
Other isd:
isd_read()
,
isd_stations()
,
isd()
## Not run: ## lat, long, radius isd_stations_search(lat = 38.4, lon = -123, radius = 250) x <- isd_stations_search(lat = 60, lon = 18, radius = 200) if (requireNamespace("leaflet")) { library("leaflet") leaflet() %>% addTiles() %>% addCircles(lng = x$lon, lat = x$lat, popup = x$station_name) %>% clearBounds() } ## bounding box bbox <- c(-125.0, 38.4, -121.8, 40.9) isd_stations_search(bbox = bbox) ## End(Not run)
## Not run: ## lat, long, radius isd_stations_search(lat = 38.4, lon = -123, radius = 250) x <- isd_stations_search(lat = 60, lon = 18, radius = 200) if (requireNamespace("leaflet")) { library("leaflet") leaflet() %>% addTiles() %>% addCircles(lng = x$lon, lat = x$lat, popup = x$station_name) %>% clearBounds() } ## bounding box bbox <- c(-125.0, 38.4, -121.8, 40.9) isd_stations_search(bbox = bbox) ## End(Not run)
Local Climatological Data from NOAA
lcd(station, year, col_types = NULL, ...)
lcd(station, year, col_types = NULL, ...)
station |
(character) station code, e.g., "02413099999". we will allow integer/numeric passed here, but station ids can have leading zeros, so it's a good idea to keep stations as character class. required |
year |
(integer) year, e.g., 2017. required |
col_types |
(named character vector) defaults to NULL. Use this argument to change the returned column type. For example,"character" instead of "numeric". See or use lcd_columns to create a named vector with allowed column names. If the user specified type is not compatible, the function will choose a type automatically and raise a message. optional |
... |
curl options passed on to crul::verb-GET |
a data.frame with many columns. the first 10 are metadata:
station
date
latitude
longitude
elevation
name
report_type
source
And the rest should be all data columns. The first part of many column names is the time period, being one of:
hourly
daily
monthly
shortduration
So the variable you are looking for may not be the first part of the column name
See lcd_cache for managing cached files
Docs: https://www.ncei.noaa.gov/data/local-climatological-data/doc/LCD_documentation.pdf Data comes from: https://www.ncei.noaa.gov/data/local-climatological-data/access
## Not run: x = lcd(station = "01338099999", year = 2017) lcd(station = "01338099999", year = 2015) lcd(station = "02413099999", year = 2009) lcd(station = "02413099999", year = 2001) # pass curl options lcd(station = "02413099999", year = 2002, verbose = TRUE) ## End(Not run)
## Not run: x = lcd(station = "01338099999", year = 2017) lcd(station = "01338099999", year = 2015) lcd(station = "02413099999", year = 2009) lcd(station = "02413099999", year = 2001) # pass curl options lcd(station = "02413099999", year = 2002, verbose = TRUE) ## End(Not run)
The meteo functions use an aplication
meteo_clear_cache(force = FALSE)
meteo_clear_cache(force = FALSE)
force |
(logical) force delete. default: |
This function will clear all cached meteo files.
Other meteo:
meteo_show_cache()
Call this function after pulling down observations for a set of stations
to retrieve the "coverage" (i.e. how complete each field is). If either
or both obs_start_date
or obs_end_date
are specified,
the coverage test will be limited to that date range.
meteo_coverage( meteo_df, obs_start_date = NULL, obs_end_date = NULL, verbose = FALSE )
meteo_coverage( meteo_df, obs_start_date = NULL, obs_end_date = NULL, verbose = FALSE )
meteo_df |
a meteo |
obs_start_date |
specify either or both (obs_start_date, obs_end_date)
to constrain coverate tests. These should be |
obs_end_date |
specify either or both (obs_start_date, obs_end_date)
to constrain coverate tests. These should be |
verbose |
if |
a list
containing 2 data.frame
s named 'summary' and 'detail'.
The 'summary' data.frame
contains columns:
$ id (chr) $ start_date (time) $ end_date (time) $ total_obs (int)
with additional fields (and their coverage percent) depending on which
weather variables were queried and available for the weather station. The
data.frame
named 'detail' contains the same columns as the meteo_df
input
data, but expands the rows to contain NA
s for days without data.
## Not run: monitors <- c("ASN00095063", "ASN00024025", "ASN00040112", "ASN00041023", "ASN00009998", "ASN00066078", "ASN00003069", "ASN00090162", "ASN00040126", "ASN00058161") obs <- meteo_pull_monitors(monitors) obs_covr <- meteo_coverage(obs) ## End(Not run)
## Not run: monitors <- c("ASN00095063", "ASN00024025", "ASN00040112", "ASN00041023", "ASN00009998", "ASN00066078", "ASN00003069", "ASN00090162", "ASN00040126", "ASN00058161") obs <- meteo_pull_monitors(monitors) obs_covr <- meteo_coverage(obs) ## End(Not run)
This function will identify all weather stations with a specified radius of
a location. If no radius is given, the function will return a dataframe
of all available monitors, sorted by distance to the location. The
limit
argument can be used to limit the output dataframe to the x
closest monitors to the location.
meteo_distance( station_data, lat, long, units = "deg", radius = NULL, limit = NULL )
meteo_distance( station_data, lat, long, units = "deg", radius = NULL, limit = NULL )
station_data |
The output of |
lat |
Latitude of the location. Southern latitudes should be given as negative values. |
long |
Longitude of the location. Western longitudes should be given as negative values. |
units |
Units of the latitude and longitude values. Possible values are:
|
radius |
A numeric vector giving the radius (in kilometers) within which to search for monitors near a location. |
limit |
An integer giving the maximum number of monitors to include for
each location. The |
A dataframe of weather stations near the location. This is the
single-location version of the return value for
meteo_nearby_stations()
Alex Simmons [email protected], Brooke Anderson [email protected]
## Not run: station_data <- ghcnd_stations() meteo_distance(station_data, -33, 151, radius = 10, limit = 10) meteo_distance(station_data, -33, 151, radius = 10, limit = 3) # FIXME - units param is ignored #meteo_distance(station_data, -33, 151, units = 'rad', radius = 10, limit = 3) ## End(Not run)
## Not run: station_data <- ghcnd_stations() meteo_distance(station_data, -33, 151, radius = 10, limit = 10) meteo_distance(station_data, -33, 151, radius = 10, limit = 3) # FIXME - units param is ignored #meteo_distance(station_data, -33, 151, units = 'rad', radius = 10, limit = 3) ## End(Not run)
This function inputs a dataframe with latitudes and longitudes of locations
and creates a dataframe with monitors within a certain radius of those
locations. The function can also be used, with the limit
argument, to
pull a certain number of the closest weather monitors to each location.
The weather monitor IDs in the output dataframe can be used with other
rnoaa functions to pull data from all available weather stations near
a location (e.g., meteo_pull_monitors()
).
meteo_nearby_stations( lat_lon_df, lat_colname = "latitude", lon_colname = "longitude", station_data = ghcnd_stations(), var = "all", year_min = NULL, year_max = NULL, radius = NULL, limit = NULL )
meteo_nearby_stations( lat_lon_df, lat_colname = "latitude", lon_colname = "longitude", station_data = ghcnd_stations(), var = "all", year_min = NULL, year_max = NULL, radius = NULL, limit = NULL )
lat_lon_df |
A dataframe that contains the latitude, longitude, and
a unique identifier for each location ( |
lat_colname |
A character string giving the name of the latitude column
in the |
lon_colname |
A character string giving the name of the longitude column
in the |
station_data |
The output of |
var |
A character vector specifying either
A full list of possible weather variables is available in NOAA's README file for the GHCND data (https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt). Most weather stations will only have a small subset of all the possible weather variables, so the data generated by this function may not include all of the variables the user specifies through this argument. |
year_min |
A numeric value giving the earliest year from which you ultimately want weather data (e.g., 2013, if you only are interested in data from 2013 and later). |
year_max |
A numeric value giving the latest year from which you ultimately want weather data. |
radius |
A numeric vector giving the radius (in kilometers) within which to search for monitors near a location. |
limit |
An integer giving the maximum number of monitors to include for
each location. The |
Great circle distance is used to determine whether a weather monitor is within the required radius.
A list containing dataframes with the sets of unique weather stations within the search radius for each location. Site IDs for the weather stations given in this dataframe can be used in conjunction with other functions in the rnoaa package to pull weather data for the station. The dataframe for each location includes:
id
: The weather station ID, which can be used in other
functions to pull weather data from the station;
name
: The weather station name;
latitude
: The station's latitude, in decimal degrees.
Southern latitudes will be negative;
longitude
: The station's longitude, in decimal degrees.
Western longitudes will be negative;
distance
: The station's distance, in kilometers, from the
location.
By default, this function will pull the full station list from NOAA
to use to identify nearby locations. If you will be creating lists of
monitors nearby several stations, you can save some time by using the
ghcnd_stations()
function separately to create an object
with all stations and then use the argument station_data
in
this function to reference that object, rather than using this function's
defaults (see examples).
Alex Simmons [email protected], Brooke Anderson [email protected]
The weather monitor IDs generated by this function can be used in
other functions in the rnoaa package, like
meteo_pull_monitors()
and meteo_tidy_ghcnd()
, to
pull weather data from weather monitors near a location.
## Not run: station_data <- ghcnd_stations() # Takes a while to run lat_lon_df <- data.frame(id = c("sydney", "brisbane"), latitude = c(-33.8675, -27.4710), longitude = c(151.2070, 153.0234)) nearby_stations <- meteo_nearby_stations(lat_lon_df = lat_lon_df, station_data = station_data, radius = 10) miami <- data.frame(id = "miami", latitude = 25.7617, longitude = -80.1918) # Get all stations within 50 kilometers meteo_nearby_stations(lat_lon_df = miami, station_data = station_data, radius = 50, var = c("PRCP", "TMAX"), year_min = 1992, year_max = 1992) # Get the closest 10 monitors meteo_nearby_stations(lat_lon_df = miami, station_data = station_data, limit = 10, var = c("PRCP", "TMAX"), year_min = 1992, year_max = 1992) ## End(Not run)
## Not run: station_data <- ghcnd_stations() # Takes a while to run lat_lon_df <- data.frame(id = c("sydney", "brisbane"), latitude = c(-33.8675, -27.4710), longitude = c(151.2070, 153.0234)) nearby_stations <- meteo_nearby_stations(lat_lon_df = lat_lon_df, station_data = station_data, radius = 10) miami <- data.frame(id = "miami", latitude = 25.7617, longitude = -80.1918) # Get all stations within 50 kilometers meteo_nearby_stations(lat_lon_df = miami, station_data = station_data, radius = 50, var = c("PRCP", "TMAX"), year_min = 1992, year_max = 1992) # Get the closest 10 monitors meteo_nearby_stations(lat_lon_df = miami, station_data = station_data, limit = 10, var = c("PRCP", "TMAX"), year_min = 1992, year_max = 1992) ## End(Not run)
This function takes a single location and a dataset of available weather stations and calculates the distance between the location and each of the stations, using the great circle method. A new column is added to the dataset of available weather stations giving the distance between each station and the input location. The station dataset is then sorted from closest to furthest distance to the location and returned as the function output.
meteo_process_geographic_data(station_data, lat, long, units = "deg")
meteo_process_geographic_data(station_data, lat, long, units = "deg")
station_data |
The output of |
lat |
Latitude of the location. Southern latitudes should be given as negative values. |
long |
Longitude of the location. Western longitudes should be given as negative values. |
units |
Units of the latitude and longitude values. Possible values are:
|
The station_data
dataframe that is input, but with a
distance
column added that gives the distance to the location
(in kilometers), and re-ordered by distance between each station and
the location (closest weather stations first).
Alex Simmons [email protected], Brooke Anderson [email protected]
## Not run: station_data <- ghcnd_stations() meteo_process_geographic_data(station_data, lat=-33, long=151) ## End(Not run)
## Not run: station_data <- ghcnd_stations() meteo_process_geographic_data(station_data, lat=-33, long=151) ## End(Not run)
This function takes a vector of one or more weather station IDs. It will pull the weather data from the Global Historical Climatology Network's daily data (GHCND) for each of the stations and join them together in a single tidy dataframe. For any weather stations that the user calls that are not available by ftp from GHCND, the function will return a warning giving the station ID.
meteo_pull_monitors( monitors, keep_flags = FALSE, date_min = NULL, date_max = NULL, var = "all" )
meteo_pull_monitors( monitors, keep_flags = FALSE, date_min = NULL, date_max = NULL, var = "all" )
monitors |
A character vector listing the station IDs for all
weather stations the user would like to pull. To get a full and
current list of stations, the user can use the |
keep_flags |
TRUE / FALSE for whether the user would like to keep all the flags for each weather variable. The default is to not keep the flags (FALSE). See the note below for more information on these flags. |
date_min |
A character string giving the earliest date of the daily weather time series that the user would like in the final output. This character string should be formatted as "yyyy-mm-dd". If not specified, the default is to keep all daily data for the queried weather site from the earliest available date. |
date_max |
A character string giving the latest date of the daily weather time series that the user would like in the final output. This character string should be formatted as "yyyy-mm-dd". If not specified, the default is to keep all daily data for the queried weather site through the most current available date. |
var |
A character vector specifying either
A full list of possible weather variables is available in NOAA's README file for the GHCND data (https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt). Most weather stations will only have a small subset of all the possible weather variables, so the data generated by this function may not include all of the variables the user specifies through this argument. |
A data frame of daily weather data for multiple weather monitors, converted to a tidy format. All weather variables may not exist for all weather stations. Examples of variables returned are:
id
: Character string with the weather station site id
date
: Date of the observation
prcp
: Precipitation, in tenths of mm
tavg
: Average temperature, in tenths of degrees Celsius
tmax
: Maximum temperature, in tenths of degrees Celsius
tmin
: Minimum temperature, in tenths of degrees Celsius
awnd
: Average daily wind speed, in meters / second
wsfg
: Peak gust wind speed, in meters / second
There are other possible weather variables in the Global Historical
Climatology Network; see
http://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt for a full
list. If the var
argument is something other than "all", then
only variables included in that argument will be included in the output
data frame. All variables are in the units specified in the linked file
(note that, in many cases, measurements are given in tenths of the units
more often used, e.g., tenths of degrees for temperature). All column names
correspond to variable names in the linked file, but with all uppercase
letters changed to lowercase.
The weather flags, which are kept by specifying
keep_flags = TRUE
are:
*_mflag
: Measurement flag, which gives some information on how
the observation was measured.
*_qflag
: Quality flag, which gives quality information on the
measurement, like if it failed to pass certain quality checks.
*_sflag
: Source flag. This gives some information on the
weather collection system (e.g., U.S. Cooperative Summary of the Day,
Australian Bureau of Meteorology) the weather observation comes from.
More information on the interpretation of these flags can be found in the README file for the NCDC's Daily Global Historical Climatology Network's data at http://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt
This function converts any value of -9999 to a missing value for the variables "prcp", "tmax", "tmin", "tavg", "snow", and "snwd". However, for some weather observations, there still may be missing values coded using a series of "9"s of some length. You will want to check your final data to see if there are lurking missing values given with series of "9"s.
This function may take a while to run.
Brooke Anderson [email protected]
For more information about the data pulled with this function, see:
Menne, M.J., I. Durre, R.S. Vose, B.E. Gleason, and T.G. Houston, 2012: An overview of the Global Historical Climatology Network-Daily Database. Journal of Atmospheric and Oceanic Technology, 29, 897-910, doi:10.1175/JTECH-D-11-00103.1.
## Not run: monitors <- c("ASN00003003", "ASM00094299", "ASM00094995", "ASM00094998") all_monitors_clean <- meteo_pull_monitors(monitors) ## End(Not run)
## Not run: monitors <- c("ASN00003003", "ASM00094299", "ASM00094995", "ASM00094998") all_monitors_clean <- meteo_pull_monitors(monitors) ## End(Not run)
Displays the full path to the meteo
cache directory
meteo_show_cache()
meteo_show_cache()
Other meteo:
meteo_clear_cache()
This function uses the haversine formula to calculate the great circle distance between two locations, identified by their latitudes and longitudes.
meteo_spherical_distance(lat1, long1, lat2, long2, units = "deg")
meteo_spherical_distance(lat1, long1, lat2, long2, units = "deg")
lat1 |
Latitude of the first location. |
long1 |
Longitude of the first location. |
lat2 |
Latitude of the second location. |
long2 |
Longitude of the second location. |
units |
Units of the latitude and longitude values. Possible values are:
|
A numeric value giving the distance (in kilometers) between the pair of locations.
This function assumes an earth radius of 6,371 km.
Alex Simmons [email protected], Brooke Anderson [email protected]
meteo_spherical_distance(lat1 = -27.4667, long1 = 153.0217, lat2 = -27.4710, long2 = 153.0234)
meteo_spherical_distance(lat1 = -27.4667, long1 = 153.0217, lat2 = -27.4710, long2 = 153.0234)
This function inputs an object created by ghcnd
and cleans up
the data into a tidy form.
meteo_tidy_ghcnd( stationid, keep_flags = FALSE, var = "all", date_min = NULL, date_max = NULL )
meteo_tidy_ghcnd( stationid, keep_flags = FALSE, var = "all", date_min = NULL, date_max = NULL )
stationid |
(character) A character vector giving the identification of
the weather stations for which the user would like to pull data. To get a full
and current list of stations, the user can use the |
keep_flags |
TRUE / FALSE for whether the user would like to keep all the flags for each weather variable. The default is to not keep the flags (FALSE). See the note below for more information on these flags. |
var |
A character vector specifying either
A full list of possible weather variables is available in NOAA's README file for the GHCND data (https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt). Most weather stations will only have a small subset of all the possible weather variables, so the data generated by this function may not include all of the variables the user specifies through this argument. |
date_min |
A character string giving the earliest date of the daily weather time series that the user would like in the final output. This character string should be formatted as "yyyy-mm-dd". If not specified, the default is to keep all daily data for the queried weather site from the earliest available date. |
date_max |
A character string giving the latest date of the daily weather time series that the user would like in the final output. This character string should be formatted as "yyyy-mm-dd". If not specified, the default is to keep all daily data for the queried weather site through the most current available date. |
A data frame of daily weather data for a single weather monitor, converted to a tidy format. All weather variables may not exist for all weather stations. Examples of variables returned are:
id
: Character string with the weather station site id
date
: Date of the observation
prcp
: Precipitation, in tenths of mm
tavg
: Average temperature, in degrees Celsius
tmax
: Maximum temperature, in degrees Celsius
tmin
: Minimum temperature, in degrees Celsius
awnd
: Average daily wind speed, in meters / second
wsfg
: Peak gust wind speed, in meters / second
There are other possible weather variables in the Global Historical
Climatology Network; see
http://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt for a full
list. The variables prcp
, tmax
, tmin
, and tavg
have all been converted from tenths of their metric to the metric (e.g.,
from tenths of degrees Celsius to degrees Celsius). All other variables
are in the units specified in the linked file.
The weather flags, which are kept by specifying
keep_flags = TRUE
are:
*_mflag
: Measurement flag, which gives some information on how
the observation was measured.
*_qflag
: Quality flag, which gives quality information on the
measurement, like if it failed to pass certain quality checks.
*_sflag
: Source flag. This gives some information on the
weather collection system (e.g., U.S. Cooperative Summary of the Day,
Australian Bureau of Meteorology) the weather observation comes from.
More information on the interpretation of these flags can be found in the README file for the NCDC's Daily Global Historical Climatology Network's data at http://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt
Brooke Anderson [email protected]
## Not run: # One station in Australia is ASM00094275 meteo_tidy_ghcnd(stationid = "ASN00003003") meteo_tidy_ghcnd(stationid = "ASN00003003", var = "tavg") meteo_tidy_ghcnd(stationid = "ASN00003003", date_min = "1989-01-01") ## End(Not run)
## Not run: # One station in Australia is ASM00094275 meteo_tidy_ghcnd(stationid = "ASN00003003") meteo_tidy_ghcnd(stationid = "ASN00003003", var = "tavg") meteo_tidy_ghcnd(stationid = "ASN00003003", date_min = "1989-01-01") ## End(Not run)
This function restructures the output of ghcnd_search()
to add a column giving the variable name (key
) and change the
name of the variable column to value
. These changes facilitate
combining all elements from the list created by ghcnd_search()
,
to create a tidy dataframe of the weather observations from the station.
meteo_tidy_ghcnd_element(x, keep_flags = FALSE)
meteo_tidy_ghcnd_element(x, keep_flags = FALSE)
x |
A dataframe with daily observations for a single monitor for a
single weather variable. This dataframe is one of the elements returned
by |
keep_flags |
TRUE / FALSE for whether the user would like to keep all the flags for each weather variable. The default is to not keep the flags (FALSE). See the note below for more information on these flags. |
A dataframe reformatted to allow easy aggregation of all weather variables for a single monitor.
Brooke Anderson [email protected]
Search for and get NOAA NCDC data
ncdc( datasetid = NULL, datatypeid = NULL, stationid = NULL, locationid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, includemetadata = TRUE, add_units = FALSE, ... )
ncdc( datasetid = NULL, datatypeid = NULL, stationid = NULL, locationid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, includemetadata = TRUE, add_units = FALSE, ... )
datasetid |
(required) Accepts a single valid dataset id. Data
returned will be from the dataset specified, see |
datatypeid |
Accepts a valid data type id or a vector or list of data type ids. (optional) |
stationid |
Accepts a valid station id or a vector or list of station ids |
locationid |
Accepts a valid location id or a vector or list of location ids (optional) |
startdate |
(character/date) Accepts valid ISO formated date (yyyy-mm-dd) or date time (YYYY-MM-DDThh:mm:ss). Data returned will have data after the specified date. The date range must be less than 1 year. required. |
enddate |
(character/date) Accepts valid ISO formated date (yyyy-mm-dd) or date time (YYYY-MM-DDThh:mm:ss). Data returned will have data before the specified date. The date range must be less than 1 year. required. |
sortfield |
The field to sort results by. Supports id, name, mindate, maxdate, and datacoverage fields (optional) |
sortorder |
Which order to sort by, asc or desc. Defaults to asc (optional) |
limit |
Defaults to 25, limits the number of results in the response. Maximum is 1000 (optional) |
offset |
Defaults to 0, used to offset the resultlist (optional) |
token |
This must be a valid token token supplied to you by NCDC's Climate Data Online access token generator. (required) See Authentication section below for more details. |
includemetadata |
Used to improve response time by preventing the calculation of result metadata. Default: TRUE. This does not affect the return object, in that the named part of the output list called "meta" is still returned, but is NULL. In practice, I haven't seen response time's improve, but perhaps they will for you. |
add_units |
(logical) whether to add units information or not.
default: |
... |
Curl options passed on to |
Note that NOAA NCDC API calls can take a long time depending on the call. The NOAA API doesn't perform well with very long timespans, and will time out and make you angry - beware.
Keep in mind that three parameters, datasetid, startdate, and enddate are required.
Note that the default limit (no. records returned) is 25. Look at the
metadata in $meta
to see how many records were found. If more were
found than 25, you could set the parameter limit
to something
higher than 25.
An S3 list of length two, a slot of metadata (meta), and a slot
for data (data). The meta slot is a list of metadata elements, and the
data slot is a data.frame, possibly of length zero if no data is found. Note
that values in the data slot don't indicate their units by default, so you
will want to either use the add_units
parameter (experimental, see Adding
units) or consult the documentation for each dataset to ensure you're using
the correct units.
Get an API key (aka, token) at https://www.ncdc.noaa.gov/cdo-web/token You can pass your token in as an argument or store it one of two places:
your .Rprofile file with the entry
options(noaakey = "your-noaa-token")
your .Renviron file with the entry
NOAA_KEY=your-noaa-token
See Startup
for information on how to create/find your
.Rrofile and .Renviron files
The attributes, or "flags", for each row of the output for data may have
a flag with it. Each datasetid
has it's own set of flags. The
following are flag columns, and what they stand for. fl_
is the
beginning of each flag column name, then one or more characters to describe
the flag, keeping it short to maintain a compact data frame. Some of
these fields are the same across datasetids. See the vignette
vignette("rnoaa_attributes", "rnoaa")
for description of possible
values for each flag.
fl_c completeness
fl_d day
fl_m measurement
fl_q quality
fl_s source
fl_t time
fl_cmiss consecutive missing
fl_miss missing
fl_u units
Note that flags are different for GSOM and GSOY datasets. They have their
own set of flags per data class. See
system.file("extdata/gsom.json", package = "rnoaa")
for GSOM
and system.file("extdata/gsom.json", package = "rnoaa")
for GSOY.
Those are JSON files. The system.file()
call gives you then path,
then read in with jsonlite::fromJSON()
which will give a data.frame
of the metadata. For more detailed info but plain text, open
system.file("extdata/gsom_readme.txt", package = "rnoaa")
and system.file("extdata/gsoy_readme.txt", package = "rnoaa")
in a text editor.
The add_units
parameter is experimental - USE WITH CAUTION!
If add_units=TRUE
we pull data from curated lists of data
used by matching by datasetid and data type.
We've attempted to gather as much information as possible on the many, many data types across the many different NOAA data sets. However, we may have got some things wrong, so make sure to double check data you get if you do add units.
Get in touch if you find some units that are wrong or missing, and if you are able to help correct information.
Other ncdc:
ncdc_combine()
,
ncdc_datacats()
,
ncdc_datasets()
,
ncdc_datatypes()
,
ncdc_locs_cats()
,
ncdc_locs()
,
ncdc_plot()
,
ncdc_stations()
## Not run: # GHCN-Daily (or GHCND) data, for a specific station ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', startdate = '2013-10-01', enddate = '2013-12-01') ### also accepts dates as class Date ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', startdate = as.Date('2013-10-01'), enddate = as.Date('2013-12-01')) # GHCND data, for a location by FIPS code ncdc(datasetid='GHCND', locationid = 'FIPS:02', startdate = '2010-05-01', enddate = '2010-05-10') # GHCND data from October 1 2013 to December 1 2013 ncdc(datasetid='GHCND', startdate = '2013-10-01', enddate = '2013-10-05') # GHCN-Monthly (or GSOM) data from October 1 2013 to December 1 2013 ncdc(datasetid='GSOM', startdate = '2013-10-01', enddate = '2013-12-01') ncdc(datasetid='GSOM', startdate = '2013-10-01', enddate = '2013-12-01', stationid = "GHCND:AE000041196") # Normals Daily (or NORMAL_DLY) GHCND:USW00014895 dly-tmax-normal data ncdc(datasetid='NORMAL_DLY', stationid='GHCND:USW00014895', startdate = '2010-05-01', enddate = '2010-05-10') # Dataset, and location in Australia ncdc(datasetid='GHCND', locationid='FIPS:AS', startdate = '2010-05-01', enddate = '2010-05-31') # Dataset, location and datatype for PRECIP_HLY data ncdc(datasetid='PRECIP_HLY', locationid='ZIP:28801', datatypeid='HPCP', startdate = '2010-05-01', enddate = '2010-05-10') # multiple datatypeid's ncdc(datasetid='PRECIP_HLY', datatypeid = 'HPCP', startdate = '2010-05-01', enddate = '2010-05-10') # multiple locationid's ncdc(datasetid='PRECIP_HLY', locationid=c("FIPS:30103", "FIPS:30091"), startdate = '2010-05-01', enddate = '2010-05-10') # Dataset, location, station and datatype ncdc(datasetid='PRECIP_HLY', locationid='ZIP:28801', stationid='COOP:310301', datatypeid='HPCP', startdate = '2010-05-01', enddate = '2010-05-10') # Dataset, location, and datatype for GHCND ncdc(datasetid='GHCND', locationid='FIPS:BR', datatypeid='PRCP', startdate = '2010-05-01', enddate = '2010-05-10') # Normals Daily GHCND dly-tmax-normal data ncdc(datasetid='NORMAL_DLY', datatypeid='dly-tmax-normal', startdate = '2010-05-01', enddate = '2010-05-10') # Normals Daily GHCND:USW00014895 dly-tmax-normal ncdc(datasetid='NORMAL_DLY', stationid='GHCND:USW00014895', datatypeid='dly-tmax-normal', startdate = '2010-05-01', enddate = '2010-05-10') # Hourly Precipitation data for ZIP code 28801 ncdc(datasetid='PRECIP_HLY', locationid='ZIP:28801', datatypeid='HPCP', startdate = '2010-05-01', enddate = '2010-05-10') # 15 min Precipitation data for ZIP code 28801 ncdc(datasetid='PRECIP_15', datatypeid='QPCP', startdate = '2010-05-01', enddate = '2010-05-02') # Search the NORMAL_HLY dataset ncdc(datasetid='NORMAL_HLY', stationid = 'GHCND:USW00003812', startdate = '2010-05-01', enddate = '2010-05-10') # Search the GSOY dataset ncdc(datasetid='ANNUAL', locationid='ZIP:28801', startdate = '2010-05-01', enddate = '2010-05-10') # Search the NORMAL_ANN dataset ncdc(datasetid='NORMAL_ANN', datatypeid='ANN-DUTR-NORMAL', startdate = '2010-01-01', enddate = '2010-01-01') # Include metadata or not ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', startdate = '2013-10-01', enddate = '2013-12-01') ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', startdate = '2013-10-01', enddate = '2013-12-01', includemetadata=FALSE) # Many stationid's stat <- ncdc_stations(startdate = "2000-01-01", enddate = "2016-01-01") ## find out what datasets might be available for these stations ncdc_datasets(stationid = stat$data$id[10]) ## get some data ncdc(datasetid = "GSOY", stationid = stat$data$id[1:10], startdate = "2010-01-01", enddate = "2011-01-01") ## End(Not run) ## Not run: # NEXRAD2 data ## doesn't work yet ncdc(datasetid='NEXRAD2', startdate = '2013-10-01', enddate = '2013-12-01') ## End(Not run)
## Not run: # GHCN-Daily (or GHCND) data, for a specific station ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', startdate = '2013-10-01', enddate = '2013-12-01') ### also accepts dates as class Date ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', startdate = as.Date('2013-10-01'), enddate = as.Date('2013-12-01')) # GHCND data, for a location by FIPS code ncdc(datasetid='GHCND', locationid = 'FIPS:02', startdate = '2010-05-01', enddate = '2010-05-10') # GHCND data from October 1 2013 to December 1 2013 ncdc(datasetid='GHCND', startdate = '2013-10-01', enddate = '2013-10-05') # GHCN-Monthly (or GSOM) data from October 1 2013 to December 1 2013 ncdc(datasetid='GSOM', startdate = '2013-10-01', enddate = '2013-12-01') ncdc(datasetid='GSOM', startdate = '2013-10-01', enddate = '2013-12-01', stationid = "GHCND:AE000041196") # Normals Daily (or NORMAL_DLY) GHCND:USW00014895 dly-tmax-normal data ncdc(datasetid='NORMAL_DLY', stationid='GHCND:USW00014895', startdate = '2010-05-01', enddate = '2010-05-10') # Dataset, and location in Australia ncdc(datasetid='GHCND', locationid='FIPS:AS', startdate = '2010-05-01', enddate = '2010-05-31') # Dataset, location and datatype for PRECIP_HLY data ncdc(datasetid='PRECIP_HLY', locationid='ZIP:28801', datatypeid='HPCP', startdate = '2010-05-01', enddate = '2010-05-10') # multiple datatypeid's ncdc(datasetid='PRECIP_HLY', datatypeid = 'HPCP', startdate = '2010-05-01', enddate = '2010-05-10') # multiple locationid's ncdc(datasetid='PRECIP_HLY', locationid=c("FIPS:30103", "FIPS:30091"), startdate = '2010-05-01', enddate = '2010-05-10') # Dataset, location, station and datatype ncdc(datasetid='PRECIP_HLY', locationid='ZIP:28801', stationid='COOP:310301', datatypeid='HPCP', startdate = '2010-05-01', enddate = '2010-05-10') # Dataset, location, and datatype for GHCND ncdc(datasetid='GHCND', locationid='FIPS:BR', datatypeid='PRCP', startdate = '2010-05-01', enddate = '2010-05-10') # Normals Daily GHCND dly-tmax-normal data ncdc(datasetid='NORMAL_DLY', datatypeid='dly-tmax-normal', startdate = '2010-05-01', enddate = '2010-05-10') # Normals Daily GHCND:USW00014895 dly-tmax-normal ncdc(datasetid='NORMAL_DLY', stationid='GHCND:USW00014895', datatypeid='dly-tmax-normal', startdate = '2010-05-01', enddate = '2010-05-10') # Hourly Precipitation data for ZIP code 28801 ncdc(datasetid='PRECIP_HLY', locationid='ZIP:28801', datatypeid='HPCP', startdate = '2010-05-01', enddate = '2010-05-10') # 15 min Precipitation data for ZIP code 28801 ncdc(datasetid='PRECIP_15', datatypeid='QPCP', startdate = '2010-05-01', enddate = '2010-05-02') # Search the NORMAL_HLY dataset ncdc(datasetid='NORMAL_HLY', stationid = 'GHCND:USW00003812', startdate = '2010-05-01', enddate = '2010-05-10') # Search the GSOY dataset ncdc(datasetid='ANNUAL', locationid='ZIP:28801', startdate = '2010-05-01', enddate = '2010-05-10') # Search the NORMAL_ANN dataset ncdc(datasetid='NORMAL_ANN', datatypeid='ANN-DUTR-NORMAL', startdate = '2010-01-01', enddate = '2010-01-01') # Include metadata or not ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', startdate = '2013-10-01', enddate = '2013-12-01') ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', startdate = '2013-10-01', enddate = '2013-12-01', includemetadata=FALSE) # Many stationid's stat <- ncdc_stations(startdate = "2000-01-01", enddate = "2016-01-01") ## find out what datasets might be available for these stations ncdc_datasets(stationid = stat$data$id[10]) ## get some data ncdc(datasetid = "GSOY", stationid = stat$data$id[1:10], startdate = "2010-01-01", enddate = "2011-01-01") ## End(Not run) ## Not run: # NEXRAD2 data ## doesn't work yet ncdc(datasetid='NEXRAD2', startdate = '2013-10-01', enddate = '2013-12-01') ## End(Not run)
Coerce multiple outputs to a single data.frame object.
ncdc_combine(...)
ncdc_combine(...)
... |
Objects from another ncdc_* function. |
A data.frame
Other ncdc:
ncdc_datacats()
,
ncdc_datasets()
,
ncdc_datatypes()
,
ncdc_locs_cats()
,
ncdc_locs()
,
ncdc_plot()
,
ncdc_stations()
,
ncdc()
## Not run: # data out1 <- ncdc(datasetid='GHCND', locationid = 'FIPS:02', startdate = '2010-05-01', enddate = '2010-05-31', limit=10) out2 <- ncdc(datasetid='GHCND', locationid = 'FIPS:02', startdate = '2010-07-01', enddate = '2010-07-31', limit=10) ncdc_combine(out1, out2) # data sets out1 <- ncdc_datasets(datatypeid='TOBS') out2 <- ncdc_datasets(datatypeid='PRCP') ncdc_combine(out1, out2) # data types out1 <- ncdc_datatypes(datatypeid="ACMH") out2 <- ncdc_datatypes(datatypeid='PRCP') ncdc_combine(out1, out2) # data categories out1 <- ncdc_datacats(datacategoryid="ANNAGR") out2 <- ncdc_datacats(datacategoryid='PRCP') ncdc_combine(out1, out2) # data locations out1 <- ncdc_locs(locationcategoryid='ST', limit=52) out2 <- ncdc_locs(locationcategoryid='CITY', sortfield='name', sortorder='desc') ncdc_combine(out1, out2) # data locations out1 <- ncdc_locs_cats(startdate='1970-01-01') out2 <- ncdc_locs_cats(locationcategoryid='CLIM_REG') ncdc_combine(out1, out2) # stations out1 <- ncdc_stations(datasetid='GHCND', locationid='FIPS:12017', stationid='GHCND:USC00084289') out2 <- ncdc_stations(stationid='COOP:010008') out3 <- ncdc_stations(datasetid='PRECIP_HLY', startdate='19900101', enddate='19901231') out4 <- ncdc_stations(datasetid='GHCND', locationid='FIPS:12017') ncdc_combine(out1, out2, out3, out4) # try to combine two different classes out1 <- ncdc_locs_cats(startdate='1970-01-01') out2 <- ncdc_stations(stationid='COOP:010008') out3 <- ncdc_locs_cats(locationcategoryid='CLIM_REG') ncdc_combine(out1, out2, out3) ## End(Not run)
## Not run: # data out1 <- ncdc(datasetid='GHCND', locationid = 'FIPS:02', startdate = '2010-05-01', enddate = '2010-05-31', limit=10) out2 <- ncdc(datasetid='GHCND', locationid = 'FIPS:02', startdate = '2010-07-01', enddate = '2010-07-31', limit=10) ncdc_combine(out1, out2) # data sets out1 <- ncdc_datasets(datatypeid='TOBS') out2 <- ncdc_datasets(datatypeid='PRCP') ncdc_combine(out1, out2) # data types out1 <- ncdc_datatypes(datatypeid="ACMH") out2 <- ncdc_datatypes(datatypeid='PRCP') ncdc_combine(out1, out2) # data categories out1 <- ncdc_datacats(datacategoryid="ANNAGR") out2 <- ncdc_datacats(datacategoryid='PRCP') ncdc_combine(out1, out2) # data locations out1 <- ncdc_locs(locationcategoryid='ST', limit=52) out2 <- ncdc_locs(locationcategoryid='CITY', sortfield='name', sortorder='desc') ncdc_combine(out1, out2) # data locations out1 <- ncdc_locs_cats(startdate='1970-01-01') out2 <- ncdc_locs_cats(locationcategoryid='CLIM_REG') ncdc_combine(out1, out2) # stations out1 <- ncdc_stations(datasetid='GHCND', locationid='FIPS:12017', stationid='GHCND:USC00084289') out2 <- ncdc_stations(stationid='COOP:010008') out3 <- ncdc_stations(datasetid='PRECIP_HLY', startdate='19900101', enddate='19901231') out4 <- ncdc_stations(datasetid='GHCND', locationid='FIPS:12017') ncdc_combine(out1, out2, out3, out4) # try to combine two different classes out1 <- ncdc_locs_cats(startdate='1970-01-01') out2 <- ncdc_stations(stationid='COOP:010008') out3 <- ncdc_locs_cats(locationcategoryid='CLIM_REG') ncdc_combine(out1, out2, out3) ## End(Not run)
Data Categories represent groupings of data types.
ncdc_datacats( datasetid = NULL, datacategoryid = NULL, stationid = NULL, locationid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, ... )
ncdc_datacats( datasetid = NULL, datacategoryid = NULL, stationid = NULL, locationid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, ... )
datasetid |
Accepts a valid dataset id or a vector or list of dataset id's. Data returned will be from the dataset specified, see datasets() (required) |
datacategoryid |
A valid data category id. Data types returned will be associated with the data category(ies) specified |
stationid |
Accepts a valid station id or a vector or list of station ids (optional) |
locationid |
Accepts a valid location id or a vector or list of location id's. (optional) |
startdate |
Accepts valid ISO formated date (yyyy-mm-dd). Data returned will have data after the specified date. Paramater can be use independently of enddate (optional) |
enddate |
Accepts valid ISO formated date (yyyy-mm-dd). Data returned will have data before the specified date. Paramater can be use independently of startdate (optional) |
sortfield |
The field to sort results by. Supports id, name, mindate, maxdate, and datacoverage fields (optional) |
sortorder |
Which order to sort by, asc or desc. Defaults to asc (optional) |
limit |
Defaults to 25, limits the number of results in the response. Maximum is 1000 (optional) |
offset |
Defaults to 0, used to offset the resultlist (optional) |
token |
This must be a valid token token supplied to you by NCDC's Climate Data Online access token generator. (required) See Authentication section below for more details. |
... |
Curl options passed on to |
Note that calls with both startdate and enddate don't seem to work, though specifying one or the other mostly works.
A data.frame
for all datasets, or a list of length two,
each with a data.frame.
Get an API key (aka, token) at https://www.ncdc.noaa.gov/cdo-web/token You can pass your token in as an argument or store it one of two places:
your .Rprofile file with the entry
options(noaakey = "your-noaa-token")
your .Renviron file with the entry
NOAA_KEY=your-noaa-token
See Startup
for information on how to create/find your
.Rrofile and .Renviron files
https://www.ncdc.noaa.gov/cdo-web/webservices/v2
Other ncdc:
ncdc_combine()
,
ncdc_datasets()
,
ncdc_datatypes()
,
ncdc_locs_cats()
,
ncdc_locs()
,
ncdc_plot()
,
ncdc_stations()
,
ncdc()
## Not run: ## Limit to 10 results ncdc_datacats(limit=10) ## by datasetid ncdc_datacats(datasetid="ANNUAL") ncdc_datacats(datasetid=c("ANNUAL", "PRECIP_HLY")) ## Single data category ncdc_datacats(datacategoryid="ANNAGR") ## Fetch data categories for a given set of locations ncdc_datacats(locationid='CITY:US390029') ncdc_datacats(locationid=c('CITY:US390029', 'FIPS:37')) ## Data categories for a given date ncdc_datacats(startdate = '2013-10-01') # Get data categories with data for a series of the same parameter arg, in this case # stationid's ncdc_datacats(stationid='COOP:310090') ncdc_datacats(stationid=c('COOP:310090','COOP:310184','COOP:310212')) ## Curl debugging ncdc_datacats(limit=10, verbose = TRUE) ## End(Not run)
## Not run: ## Limit to 10 results ncdc_datacats(limit=10) ## by datasetid ncdc_datacats(datasetid="ANNUAL") ncdc_datacats(datasetid=c("ANNUAL", "PRECIP_HLY")) ## Single data category ncdc_datacats(datacategoryid="ANNAGR") ## Fetch data categories for a given set of locations ncdc_datacats(locationid='CITY:US390029') ncdc_datacats(locationid=c('CITY:US390029', 'FIPS:37')) ## Data categories for a given date ncdc_datacats(startdate = '2013-10-01') # Get data categories with data for a series of the same parameter arg, in this case # stationid's ncdc_datacats(stationid='COOP:310090') ncdc_datacats(stationid=c('COOP:310090','COOP:310184','COOP:310212')) ## Curl debugging ncdc_datacats(limit=10, verbose = TRUE) ## End(Not run)
From the NOAA API docs: All of our data are in datasets. To retrieve any data from us, you must know what dataset it is in.
ncdc_datasets( datasetid = NULL, datatypeid = NULL, stationid = NULL, locationid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, ... )
ncdc_datasets( datasetid = NULL, datatypeid = NULL, stationid = NULL, locationid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, ... )
datasetid |
(optional) Accepts a single valid dataset id. Data returned will be from the dataset specified. |
datatypeid |
Accepts a valid data type id or a vector or list of data type ids. (optional) |
stationid |
Accepts a valid station id or a vector or list of station ids |
locationid |
Accepts a valid location id or a vector or list of location ids (optional) |
startdate |
(optional) Accepts valid ISO formated date (yyyy-mm-dd) or date time (YYYY-MM-DDThh:mm:ss). Data returned will have data after the specified date. The date range must be less than 1 year. |
enddate |
(optional) Accepts valid ISO formated date (yyyy-mm-dd) or date time (YYYY-MM-DDThh:mm:ss). Data returned will have data before the specified date. The date range must be less than 1 year. |
sortfield |
The field to sort results by. Supports id, name, mindate, maxdate, and datacoverage fields (optional) |
sortorder |
Which order to sort by, asc or desc. Defaults to asc (optional) |
limit |
Defaults to 25, limits the number of results in the response. Maximum is 1000 (optional) |
offset |
Defaults to 0, used to offset the resultlist (optional) |
token |
This must be a valid token token supplied to you by NCDC's Climate Data Online access token generator. (required) See Authentication section below for more details. |
... |
Curl options passed on to |
A data.frame for all datasets, or a list of length two, each with a data.frame.
Get an API key (aka, token) at https://www.ncdc.noaa.gov/cdo-web/token You can pass your token in as an argument or store it one of two places:
your .Rprofile file with the entry
options(noaakey = "your-noaa-token")
your .Renviron file with the entry
NOAA_KEY=your-noaa-token
See Startup
for information on how to create/find your
.Rrofile and .Renviron files
https://www.ncdc.noaa.gov/cdo-web/webservices/v2
Other ncdc:
ncdc_combine()
,
ncdc_datacats()
,
ncdc_datatypes()
,
ncdc_locs_cats()
,
ncdc_locs()
,
ncdc_plot()
,
ncdc_stations()
,
ncdc()
## Not run: # Get a table of all datasets ncdc_datasets() # Get details from a particular dataset ncdc_datasets(datasetid='ANNUAL') # Get datasets with Temperature at the time of observation (TOBS) data type ncdc_datasets(datatypeid='TOBS') ## two datatypeid's ncdc_datasets(datatypeid=c('TOBS', "ACMH")) # Get datasets with data for a series of the same parameter arg, in this case # stationid's ncdc_datasets(stationid='COOP:310090') ncdc_datasets(stationid=c('COOP:310090','COOP:310184','COOP:310212')) # Multiple datatypeid's ncdc_datasets(datatypeid=c('ACMC','ACMH','ACSC')) ncdc_datasets(datasetid='ANNUAL', datatypeid=c('ACMC','ACMH','ACSC')) ncdc_datasets(datasetid='GSOY', datatypeid=c('ACMC','ACMH','ACSC')) # Multiple locationid's ncdc_datasets(locationid="FIPS:30091") ncdc_datasets(locationid=c("FIPS:30103", "FIPS:30091")) ## End(Not run)
## Not run: # Get a table of all datasets ncdc_datasets() # Get details from a particular dataset ncdc_datasets(datasetid='ANNUAL') # Get datasets with Temperature at the time of observation (TOBS) data type ncdc_datasets(datatypeid='TOBS') ## two datatypeid's ncdc_datasets(datatypeid=c('TOBS', "ACMH")) # Get datasets with data for a series of the same parameter arg, in this case # stationid's ncdc_datasets(stationid='COOP:310090') ncdc_datasets(stationid=c('COOP:310090','COOP:310184','COOP:310212')) # Multiple datatypeid's ncdc_datasets(datatypeid=c('ACMC','ACMH','ACSC')) ncdc_datasets(datasetid='ANNUAL', datatypeid=c('ACMC','ACMH','ACSC')) ncdc_datasets(datasetid='GSOY', datatypeid=c('ACMC','ACMH','ACSC')) # Multiple locationid's ncdc_datasets(locationid="FIPS:30091") ncdc_datasets(locationid=c("FIPS:30103", "FIPS:30091")) ## End(Not run)
From the NOAA API docs: Describes the type of data, acts as a label. For example: If it's 64 degrees out right now, then the data type is Air Temperature and the data is 64.
ncdc_datatypes( datasetid = NULL, datatypeid = NULL, datacategoryid = NULL, stationid = NULL, locationid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, ... )
ncdc_datatypes( datasetid = NULL, datatypeid = NULL, datacategoryid = NULL, stationid = NULL, locationid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, ... )
datasetid |
(optional) Accepts a valid dataset id or a vector or list of them. Data returned will be from the dataset specified. |
datatypeid |
Accepts a valid data type id or a vector or list of data type ids. (optional) |
datacategoryid |
Optional. Accepts a valid data category id or a vector or list of data category ids (although it is rare to have a data type with more than one data category) |
stationid |
Accepts a valid station id or a vector or list of station ids |
locationid |
Accepts a valid location id or a vector or list of location ids (optional) |
startdate |
(optional) Accepts valid ISO formated date (yyyy-mm-dd) or date time (YYYY-MM-DDThh:mm:ss). Data returned will have data after the specified date. The date range must be less than 1 year. |
enddate |
(optional) Accepts valid ISO formated date (yyyy-mm-dd) or date time (YYYY-MM-DDThh:mm:ss). Data returned will have data before the specified date. The date range must be less than 1 year. |
sortfield |
The field to sort results by. Supports id, name, mindate, maxdate, and datacoverage fields (optional) |
sortorder |
Which order to sort by, asc or desc. Defaults to asc (optional) |
limit |
Defaults to 25, limits the number of results in the response. Maximum is 1000 (optional) |
offset |
Defaults to 0, used to offset the resultlist (optional) |
token |
This must be a valid token token supplied to you by NCDC's Climate Data Online access token generator. (required) See Authentication section below for more details. |
... |
Curl options passed on to |
A data.frame
for all datasets, or a list of length two,
each with a data.frame
Get an API key (aka, token) at https://www.ncdc.noaa.gov/cdo-web/token You can pass your token in as an argument or store it one of two places:
your .Rprofile file with the entry
options(noaakey = "your-noaa-token")
your .Renviron file with the entry
NOAA_KEY=your-noaa-token
See Startup
for information on how to create/find your
.Rrofile and .Renviron files
https://www.ncdc.noaa.gov/cdo-web/webservices/v2
Other ncdc:
ncdc_combine()
,
ncdc_datacats()
,
ncdc_datasets()
,
ncdc_locs_cats()
,
ncdc_locs()
,
ncdc_plot()
,
ncdc_stations()
,
ncdc()
## Not run: # Fetch available data types ncdc_datatypes() # Fetch more information about the ACMH data type id, or the ACSC ncdc_datatypes(datatypeid="ACMH") ncdc_datatypes(datatypeid="ACSC") # datasetid, one or many ## ANNUAL should be replaced by GSOY, but both exist and give ## different answers ncdc_datatypes(datasetid="ANNUAL") ncdc_datatypes(datasetid="GSOY") ncdc_datatypes(datasetid=c("ANNUAL", "PRECIP_HLY")) # Fetch data types with the air temperature data category ncdc_datatypes(datacategoryid="TEMP", limit=56) ncdc_datatypes(datacategoryid=c("TEMP", "AUPRCP")) # Fetch data types that support a given set of stations ncdc_datatypes(stationid='COOP:310090') ncdc_datatypes(stationid=c('COOP:310090','COOP:310184','COOP:310212')) # Fetch data types that support a given set of loncationids ncdc_datatypes(locationid='CITY:AG000001') ncdc_datatypes(locationid=c('CITY:AG000001','CITY:AG000004')) ## End(Not run)
## Not run: # Fetch available data types ncdc_datatypes() # Fetch more information about the ACMH data type id, or the ACSC ncdc_datatypes(datatypeid="ACMH") ncdc_datatypes(datatypeid="ACSC") # datasetid, one or many ## ANNUAL should be replaced by GSOY, but both exist and give ## different answers ncdc_datatypes(datasetid="ANNUAL") ncdc_datatypes(datasetid="GSOY") ncdc_datatypes(datasetid=c("ANNUAL", "PRECIP_HLY")) # Fetch data types with the air temperature data category ncdc_datatypes(datacategoryid="TEMP", limit=56) ncdc_datatypes(datacategoryid=c("TEMP", "AUPRCP")) # Fetch data types that support a given set of stations ncdc_datatypes(stationid='COOP:310090') ncdc_datatypes(stationid=c('COOP:310090','COOP:310184','COOP:310212')) # Fetch data types that support a given set of loncationids ncdc_datatypes(locationid='CITY:AG000001') ncdc_datatypes(locationid=c('CITY:AG000001','CITY:AG000004')) ## End(Not run)
From the NOAA NCDC API docs: Locations can be a specific latitude/longitude point such as a station, or a label representing a bounding area such as a city.
ncdc_locs( datasetid = NULL, locationid = NULL, locationcategoryid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, ... )
ncdc_locs( datasetid = NULL, locationid = NULL, locationcategoryid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, ... )
datasetid |
A valid dataset id or a vector or list of dataset id's. Data returned will be from the dataset specified, see datasets() (required) |
locationid |
A valid location id or a vector or list of location ids. |
locationcategoryid |
A valid location id or a vector or list of location category ids |
startdate |
A valid ISO formatted date (yyyy-mm-dd). Data returned will have data after the specified date. Paramater can be use independently of enddate (optional) |
enddate |
Accepts valid ISO formatted date (yyyy-mm-dd). Data returned will have data before the specified date. Paramater can be use independently of startdate (optional) |
sortfield |
The field to sort results by. Supports id, name, mindate, maxdate, and datacoverage fields (optional) |
sortorder |
Which order to sort by, asc or desc. Defaults to asc (optional) |
limit |
Defaults to 25, limits the number of results in the response. Maximum is 1000 (optional) |
offset |
Defaults to 0, used to offset the resultlist (optional) |
token |
This must be a valid token token supplied to you by NCDC's Climate Data Online access token generator. (required) See Authentication section below for more details. |
... |
Curl options passed on to |
A list containing metadata and the data, or a single data.frame.
Get an API key (aka, token) at https://www.ncdc.noaa.gov/cdo-web/token You can pass your token in as an argument or store it one of two places:
your .Rprofile file with the entry
options(noaakey = "your-noaa-token")
your .Renviron file with the entry
NOAA_KEY=your-noaa-token
See Startup
for information on how to create/find your
.Rrofile and .Renviron files
https://www.ncdc.noaa.gov/cdo-web/webservices/v2
Other ncdc:
ncdc_combine()
,
ncdc_datacats()
,
ncdc_datasets()
,
ncdc_datatypes()
,
ncdc_locs_cats()
,
ncdc_plot()
,
ncdc_stations()
,
ncdc()
## Not run: # All locations, first 25 results ncdc_locs() # Fetch more information about location id FIPS:37 ncdc_locs(locationid='FIPS:37') # Fetch available locations for the GHCND (Daily Summaries) dataset ncdc_locs(datasetid='GHCND') ncdc_locs(datasetid=c('GHCND', 'ANNUAL')) ncdc_locs(datasetid=c('GSOY', 'ANNUAL')) ncdc_locs(datasetid=c('GHCND', 'GSOM')) # Fetch all U.S. States ncdc_locs(locationcategoryid='ST', limit=52) # Many locationcategoryid's ## this apparently works, but returns nothing often with multiple ## locationcategoryid's ncdc_locs(locationcategoryid=c('ST', 'ZIP')) # Fetch list of city locations in descending order ncdc_locs(locationcategoryid='CITY', sortfield='name', sortorder='desc') ## End(Not run)
## Not run: # All locations, first 25 results ncdc_locs() # Fetch more information about location id FIPS:37 ncdc_locs(locationid='FIPS:37') # Fetch available locations for the GHCND (Daily Summaries) dataset ncdc_locs(datasetid='GHCND') ncdc_locs(datasetid=c('GHCND', 'ANNUAL')) ncdc_locs(datasetid=c('GSOY', 'ANNUAL')) ncdc_locs(datasetid=c('GHCND', 'GSOM')) # Fetch all U.S. States ncdc_locs(locationcategoryid='ST', limit=52) # Many locationcategoryid's ## this apparently works, but returns nothing often with multiple ## locationcategoryid's ncdc_locs(locationcategoryid=c('ST', 'ZIP')) # Fetch list of city locations in descending order ncdc_locs(locationcategoryid='CITY', sortfield='name', sortorder='desc') ## End(Not run)
Location categories are groupings of similar locations.
ncdc_locs_cats( datasetid = NULL, locationcategoryid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, ... )
ncdc_locs_cats( datasetid = NULL, locationcategoryid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, token = NULL, ... )
datasetid |
A valid dataset id or a vector or list of dataset id's. Data returned will be from the dataset specified, see datasets() (required) |
locationcategoryid |
A valid location id or a vector or list of location category ids |
startdate |
A valid ISO formatted date (yyyy-mm-dd). Data returned will have data after the specified date. Paramater can be use independently of enddate (optional) |
enddate |
Accepts valid ISO formatted date (yyyy-mm-dd). Data returned will have data before the specified date. Paramater can be use independently of startdate (optional) |
sortfield |
The field to sort results by. Supports id, name, mindate, maxdate, and datacoverage fields (optional) |
sortorder |
Which order to sort by, asc or desc. Defaults to asc (optional) |
limit |
Defaults to 25, limits the number of results in the response. Maximum is 1000 (optional) |
offset |
Defaults to 0, used to offset the resultlist (optional) |
token |
This must be a valid token token supplied to you by NCDC's Climate Data Online access token generator. (required) See Authentication section below for more details. |
... |
Curl options passed on to |
Locations can be a specific latitude/longitude point such as a station, or a label representing a bounding area such as a city.
A list containing metadata and the data, or a single data.frame.
Get an API key (aka, token) at https://www.ncdc.noaa.gov/cdo-web/token You can pass your token in as an argument or store it one of two places:
your .Rprofile file with the entry
options(noaakey = "your-noaa-token")
your .Renviron file with the entry
NOAA_KEY=your-noaa-token
See Startup
for information on how to create/find your
.Rrofile and .Renviron files
https://www.ncdc.noaa.gov/cdo-web/webservices/v2
Other ncdc:
ncdc_combine()
,
ncdc_datacats()
,
ncdc_datasets()
,
ncdc_datatypes()
,
ncdc_locs()
,
ncdc_plot()
,
ncdc_stations()
,
ncdc()
## Not run: # All location categories, first 25 results ncdc_locs_cats() # Find locations with category id of CLIM_REG ncdc_locs_cats(locationcategoryid='CLIM_REG') # Displays available location categories within GHCN-Daily dataset ncdc_locs_cats(datasetid='GHCND') ncdc_locs_cats(datasetid='GSOY') ncdc_locs_cats(datasetid='ANNUAL') # multiple datasetid's ncdc_locs_cats(datasetid=c('GHCND', 'GSOM')) # Displays available location categories from start date 1970-01-01 ncdc_locs_cats(startdate='1970-01-01') ## End(Not run)
## Not run: # All location categories, first 25 results ncdc_locs_cats() # Find locations with category id of CLIM_REG ncdc_locs_cats(locationcategoryid='CLIM_REG') # Displays available location categories within GHCN-Daily dataset ncdc_locs_cats(datasetid='GHCND') ncdc_locs_cats(datasetid='GSOY') ncdc_locs_cats(datasetid='ANNUAL') # multiple datasetid's ncdc_locs_cats(datasetid=c('GHCND', 'GSOM')) # Displays available location categories from start date 1970-01-01 ncdc_locs_cats(startdate='1970-01-01') ## End(Not run)
Plot NOAA climate data.
ncdc_plot(..., breaks = NULL, dateformat = "%d/%m/%y")
ncdc_plot(..., breaks = NULL, dateformat = "%d/%m/%y")
... |
Input noaa object or objects. |
breaks |
Regularly spaced date breaks for x-axis. See examples for
usage. See date_breaks. Default: |
dateformat |
Date format using standard POSIX specification for labels
on x-axis. See |
This function accepts directly output from the ncdc()
function,
not other functions.
This is a simple wrapper function around some ggplot2 code. There is indeed a lot you can modify in your plots, so this function just does some basic stuff. Look at the internals for what the function does.
ggplot2 plot
Other ncdc:
ncdc_combine()
,
ncdc_datacats()
,
ncdc_datasets()
,
ncdc_datatypes()
,
ncdc_locs_cats()
,
ncdc_locs()
,
ncdc_stations()
,
ncdc()
## Not run: # Search for data first, then plot out <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-05-01', enddate = '2010-10-31', limit=500) ncdc_plot(out) ncdc_plot(out, breaks="14 days") ncdc_plot(out, breaks="1 month", dateformat="%d/%m") ncdc_plot(out, breaks="1 month", dateformat="%d/%m") # Combine many calls to ncdc function out1 <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-03-01', enddate = '2010-05-31', limit=500) out2 <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-09-01', enddate = '2010-10-31', limit=500) df <- ncdc_combine(out1, out2) ncdc_plot(df) ## or pass in each element separately ncdc_plot(out1, out2, breaks="45 days") ## End(Not run)
## Not run: # Search for data first, then plot out <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-05-01', enddate = '2010-10-31', limit=500) ncdc_plot(out) ncdc_plot(out, breaks="14 days") ncdc_plot(out, breaks="1 month", dateformat="%d/%m") ncdc_plot(out, breaks="1 month", dateformat="%d/%m") # Combine many calls to ncdc function out1 <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-03-01', enddate = '2010-05-31', limit=500) out2 <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-09-01', enddate = '2010-10-31', limit=500) df <- ncdc_combine(out1, out2) ncdc_plot(df) ## or pass in each element separately ncdc_plot(out1, out2, breaks="45 days") ## End(Not run)
From the NOAA NCDC API docs: Stations are where the data comes from (for most datasets) and can be considered the smallest granual of location data. If you know what station you want, you can quickly get all manner of data from it
ncdc_stations( stationid = NULL, datasetid = NULL, datatypeid = NULL, locationid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, datacategoryid = NULL, extent = NULL, token = NULL, dataset = NULL, station = NULL, location = NULL, locationtype = NULL, page = NULL, ... )
ncdc_stations( stationid = NULL, datasetid = NULL, datatypeid = NULL, locationid = NULL, startdate = NULL, enddate = NULL, sortfield = NULL, sortorder = NULL, limit = 25, offset = NULL, datacategoryid = NULL, extent = NULL, token = NULL, dataset = NULL, station = NULL, location = NULL, locationtype = NULL, page = NULL, ... )
stationid |
A single valid station id, with datasetid namespace, e.g., GHCND:USW00014895 |
datasetid |
(optional) Accepts a valid dataset id or a vector or list of them. Data returned will be from the dataset specified. |
datatypeid |
Accepts a valid data type id or a vector or list of data type ids. (optional) |
locationid |
Accepts a valid location id or a vector or list of location ids (optional) |
startdate |
(optional) Accepts valid ISO formated date (yyyy-mm-dd) or date time (YYYY-MM-DDThh:mm:ss). Data returned will have data after the specified date. The date range must be less than 1 year. |
enddate |
(optional) Accepts valid ISO formated date (yyyy-mm-dd) or date time (YYYY-MM-DDThh:mm:ss). Data returned will have data before the specified date. The date range must be less than 1 year. |
sortfield |
The field to sort results by. Supports id, name, mindate, maxdate, and datacoverage fields (optional) |
sortorder |
Which order to sort by, asc or desc. Defaults to asc (optional) |
limit |
Defaults to 25, limits the number of results in the response. Maximum is 1000 (optional) |
offset |
Defaults to 0, used to offset the resultlist (optional) |
datacategoryid |
(character, optional) Accepts a valid data category id or a vector or list of data category ids. |
extent |
(numeric, optional) The geographical extent for which you want to
search. Give four values that defines a bounding box, lat and long for the
southwest corner, then lat and long for the northeast corner. For example:
|
token |
This must be a valid token token supplied to you by NCDC's Climate Data Online access token generator. (required) See Authentication section below for more details. |
dataset |
THIS IS A DEPRECATED ARGUMENT. See datasetid. |
station |
THIS IS A DEPRECATED ARGUMENT. See stationid. |
location |
THIS IS A DEPRECATED ARGUMENT. See locationid. |
locationtype |
THIS IS A DEPRECATED ARGUMENT. There is no equivalent argument in v2 of the NOAA API. |
page |
THIS IS A DEPRECATED ARGUMENT. There is no equivalent argument in v2 of the NOAA API. |
... |
Curl options passed on to |
A list of metadata.
Get an API key (aka, token) at https://www.ncdc.noaa.gov/cdo-web/token You can pass your token in as an argument or store it one of two places:
your .Rprofile file with the entry
options(noaakey = "your-noaa-token")
your .Renviron file with the entry
NOAA_KEY=your-noaa-token
See Startup
for information on how to create/find your
.Rrofile and .Renviron files
https://www.ncdc.noaa.gov/cdo-web/webservices/v2
Other ncdc:
ncdc_combine()
,
ncdc_datacats()
,
ncdc_datasets()
,
ncdc_datatypes()
,
ncdc_locs_cats()
,
ncdc_locs()
,
ncdc_plot()
,
ncdc()
## Not run: # Get metadata on all stations ncdc_stations() ncdc_stations(limit=5) # Get metadata on a single station ncdc_stations(stationid='COOP:010008') # For many stations use lapply or similar lapply(c("COOP:010008", "COOP:010063", "COOP:010116"), function(z) { ncdc_stations( startdate = "2013-01-01", enddate = "2014-11-01", stationid = z) }$data) # Displays all stations within GHCN-Daily (100 Stations per page limit) ncdc_stations(datasetid = 'GHCND') ncdc_stations(datasetid = 'ANNUAL') ncdc_stations(datasetid = 'GSOY') # Station ncdc_stations(datasetid='NORMAL_DLY', stationid='GHCND:USW00014895') # datatypeid ncdc_stations(datatypeid="ANN-HTDD-NORMAL") ncdc_stations(datatypeid=c("ANN-HTDD-NORMAL", "ACSC")) # locationid ncdc_stations(locationid="CITY:AG000001") ncdc_stations(locationid="FIPS:30091") ncdc_stations(locationid=c("FIPS:30103", "FIPS:30091")) # datacategoryid ncdc_stations(datacategoryid="ANNPRCP") ncdc_stations(datacategoryid="AUAGR") ncdc_stations(datacategoryid=c("ANNPRCP", "AUAGR")) # Displays all stations within GHCN-Daily (Displaying page 10 of the results) ncdc_stations(datasetid='GHCND') # Specify datasetid and locationid ncdc_stations(datasetid='GHCND', locationid='FIPS:12017') # Specify datasetid, locationid, and station ncdc_stations(datasetid='GHCND', locationid='FIPS:12017', stationid='GHCND:USC00084289') # Specify datasetid, locationidtype, locationid, and station ncdc_stations(datasetid='GHCND', locationid='FIPS:12017', stationid='GHCND:USC00084289') # Displays list of stations within the specified county ncdc_stations(datasetid='GHCND', locationid='FIPS:12017') # Displays list of Hourly Precipitation locationids between 01/01/1990 and 12/31/1990 ncdc_stations(datasetid='PRECIP_HLY', startdate='19900101', enddate='19901231') # Search for stations by spatial extent ## Search using a bounding box, w/ lat/long of the SW corner, then of NE corner ncdc_stations(extent=c(47.5204,-122.2047,47.6139,-122.1065)) ## End(Not run)
## Not run: # Get metadata on all stations ncdc_stations() ncdc_stations(limit=5) # Get metadata on a single station ncdc_stations(stationid='COOP:010008') # For many stations use lapply or similar lapply(c("COOP:010008", "COOP:010063", "COOP:010116"), function(z) { ncdc_stations( startdate = "2013-01-01", enddate = "2014-11-01", stationid = z) }$data) # Displays all stations within GHCN-Daily (100 Stations per page limit) ncdc_stations(datasetid = 'GHCND') ncdc_stations(datasetid = 'ANNUAL') ncdc_stations(datasetid = 'GSOY') # Station ncdc_stations(datasetid='NORMAL_DLY', stationid='GHCND:USW00014895') # datatypeid ncdc_stations(datatypeid="ANN-HTDD-NORMAL") ncdc_stations(datatypeid=c("ANN-HTDD-NORMAL", "ACSC")) # locationid ncdc_stations(locationid="CITY:AG000001") ncdc_stations(locationid="FIPS:30091") ncdc_stations(locationid=c("FIPS:30103", "FIPS:30091")) # datacategoryid ncdc_stations(datacategoryid="ANNPRCP") ncdc_stations(datacategoryid="AUAGR") ncdc_stations(datacategoryid=c("ANNPRCP", "AUAGR")) # Displays all stations within GHCN-Daily (Displaying page 10 of the results) ncdc_stations(datasetid='GHCND') # Specify datasetid and locationid ncdc_stations(datasetid='GHCND', locationid='FIPS:12017') # Specify datasetid, locationid, and station ncdc_stations(datasetid='GHCND', locationid='FIPS:12017', stationid='GHCND:USC00084289') # Specify datasetid, locationidtype, locationid, and station ncdc_stations(datasetid='GHCND', locationid='FIPS:12017', stationid='GHCND:USC00084289') # Displays list of stations within the specified county ncdc_stations(datasetid='GHCND', locationid='FIPS:12017') # Displays list of Hourly Precipitation locationids between 01/01/1990 and 12/31/1990 ncdc_stations(datasetid='PRECIP_HLY', startdate='19900101', enddate='19901231') # Search for stations by spatial extent ## Search using a bounding box, w/ lat/long of the SW corner, then of NE corner ncdc_stations(extent=c(47.5204,-122.2047,47.6139,-122.1065)) ## End(Not run)
Manage data caches
To get the cache directory for a data source, see the method
x$cache_path_get()
cache_delete
only accepts 1 file name, while
cache_delete_all
doesn't accept any names, but deletes all files.
For deleting many specific files, use cache_delete
in a lapply()
type call
Note that cached files will continue to be used until they are deleted. It's possible to run into problems when changes happen in your R setup. For example, at least one user reported changing versions of this package and running into problems because a cached data file from a previous version of rnoaa did not work with the newer version of rnoaa. You should occasionally delete all cached files.
Assuming x is a HoardClient
class object, e.g., lcd_cache
x$cache_path_get()
get cache path
x$cache_path_set()
set cache path
x$list()
returns a character vector of full path file names
x$files()
returns file objects with metadata
x$details()
returns files with details
x$delete()
delete specific files
x$delete_all()
delete all files, returns nothing
isd()
/isd_stations()
: isd_cache
cpc_prcp()
: cpc_cache
arc2()
: arc2_cache
lcd()
: lcd_cache
bsw()
: bsw_cache
ersst()
: ersst_cache
tornadoes()
: torn_cache
ghcnd()
/ghcnd_search()
: ghcnd_cache
se_data()
/se_files()
: stormevents_cache
rnoaa_options()
for managing whether you see messages
about cached files when you request data
rnoaa options
rnoaa_options(cache_messages = TRUE)
rnoaa_options(cache_messages = TRUE)
cache_messages |
(logical) whether to emit messages with information
on caching status for function calls that can cache data. default: |
rnoaa package level options; stored in an internal
package environment roenv
rnoaa_caching for managing cached files
## Not run: rnoaa_options(cache_messages = FALSE) ## End(Not run)
## Not run: rnoaa_options(cache_messages = FALSE) ## End(Not run)
noaa
: Function name changed, prefixed with ncdc now
noaa_datacats
: Function name changed, prefixed with ncdc now
noaa_datasets
: Function name changed, prefixed with ncdc now
noaa_datatypes
: Function name changed, prefixed with ncdc now
noaa_locs
: Function name changed, prefixed with ncdc now
noaa_locs_cats
: Function name changed, prefixed with ncdc now
noaa_stations
: Function name changed, prefixed with ncdc now
noaa_plot
: Function name changed, prefixed with ncdc now
noaa_combine
: Function name changed, prefixed with ncdc now
noaa_seaice
: Function name changed to seaice
erddap_data
: See package rerddap
erddap_clear_cache
: See package rerddap
erddap_datasets
: Moved to package rerddap
erddap_grid
: Moved to package rerddap
erddap_info
: Moved to rerddap::info()
erddap_search
: Moved to rerddap::ed_search
erddap_table
: Moved to rerddap::tabledap
ncdc_leg_variables
: Removed. See NCDC Legacy
below
ncdc_leg_sites
: Removed. See NCDC Legacy
below
ncdc_leg_site_info
: Removed. See NCDC Legacy
below
ncdc_leg_data
: Removed. See NCDC Legacy
below
seaice
: Replaced with sea_ice()
lcd_cleanup
: No longer available. See lcd
docs
ghcnd_clear_cache
: No longer available. See rnoaa_caching
storm_shp
: Function defunct.
storm_shp_read
: Function defunct.
storm_data
: Function defunct.
storm_meta
: Function defunct.
The functions for working with GEFS ensemble forecast data (prefixed with "gefs") are defunct, but may come back to rnoaa later:
The NCDC legacy API is too unreliable and slow. Use the newer NCDC API via
the functions ncdc()
, ncdc_datacats()
, ncdc_datasets()
,
ncdc_datatypes()
, ncdc_locs()
, ncdc_locs_cats()
, ncdc_stations()
,
ncdc_plot()
, and ncdc_combine()
Get sea ice data.
sea_ice(year = NULL, month = NULL, pole = NULL, format = "shp", ...)
sea_ice(year = NULL, month = NULL, pole = NULL, format = "shp", ...)
year |
(numeric) a year |
month |
(character) a month, as character abbrevation of a month |
pole |
(character) one of S (south) or N (north) |
format |
(character) one of shp (default), geotiff-extent (for geotiff extent data), or geotiff-conc (for geotiff concentration data) |
... |
Further arguments passed on to |
data.frame if format="shp"
(a fortified sp object);
raster::raster()
if not
See the "User Guide" pdf at https://nsidc.org/data/g02135
## Not run: if (requireNamespace("raster")) { ## one year, one moth, one pole sea_ice(year = 1990, month = "Apr", pole = "N") sea_ice(year = 1990, month = "Apr", pole = "N", format = "geotiff-extent") sea_ice(year = 1990, month = "Apr", pole = "N", format = "geotiff-conc") ## one year, one month, many poles sea_ice(year = 1990, month = "Apr") ## one year, many months, many poles sea_ice(year = 1990, month = c("Apr", "Jun", "Oct")) ## many years, one month, one pole sea_ice(year = 1990:1992, month = "Sep", pole = "N") # get geotiff instead of shp data. x <- sea_ice(year = 1990, month = "Apr", format = "geotiff-extent") y <- sea_ice(year = 1990, month = "Apr", format = "geotiff-conc") } ## End(Not run)
## Not run: if (requireNamespace("raster")) { ## one year, one moth, one pole sea_ice(year = 1990, month = "Apr", pole = "N") sea_ice(year = 1990, month = "Apr", pole = "N", format = "geotiff-extent") sea_ice(year = 1990, month = "Apr", pole = "N", format = "geotiff-conc") ## one year, one month, many poles sea_ice(year = 1990, month = "Apr") ## one year, many months, many poles sea_ice(year = 1990, month = c("Apr", "Jun", "Oct")) ## many years, one month, one pole sea_ice(year = 1990:1992, month = "Sep", pole = "N") # get geotiff instead of shp data. x <- sea_ice(year = 1990, month = "Apr", format = "geotiff-extent") y <- sea_ice(year = 1990, month = "Apr", format = "geotiff-conc") } ## End(Not run)
Collects .csv
files from NOAA, and binds them together into
a single data.frame. Data across years, with extent and area of ice.
sea_ice_tabular(...)
sea_ice_tabular(...)
... |
Curl options passed on to crul::verb-GET - beware that curl options are passed to each http request, for each of 24 requests. |
An example file, for January, North pole:
https://sidads.colorado.edu/DATASETS/NOAA/G02135/north/monthly/data/N_01_extent_v3.0.csv
a value in any cell of -9999 indicates missing data
A data.frame with columns:
year (integer)
mo (integer)
data.type (character)
region (character)
extent (numeric)
area (numeric)
## Not run: df <- sea_ice_tabular() df ## End(Not run)
## Not run: df <- sea_ice_tabular() df ## End(Not run)
NOAA Storm Events data
se_data(year, type, overwrite = TRUE, ...) se_files(...)
se_data(year, type, overwrite = TRUE, ...) se_files(...)
year |
(numeric) a four digit year. see output of |
type |
(character) one of details, fatalities, locations, or legacy. required. |
overwrite |
(logical) To overwrite the path to store files in or not,
Default: |
... |
Curl options passed on to crul::verb-GET (optional) |
A tibble (data.frame)
See stormevents_cache for managing cached files
https://www.ncdc.noaa.gov/stormevents/
## Not run: # get list of files and their urls res <- se_files() res tail(res) # get data x <- se_data(year = 2013, type = "details") x z <- se_data(year = 1988, type = "fatalities") z w <- se_data(year = 2003, type = "locations") w leg <- se_data(year = 2003, type = "legacy") leg ## End(Not run)
## Not run: # get list of files and their urls res <- se_files() res tail(res) # get data x <- se_data(year = 2013, type = "details") x z <- se_data(year = 1988, type = "fatalities") z w <- se_data(year = 2003, type = "locations") w leg <- se_data(year = 2003, type = "legacy") leg ## End(Not run)
Get NOAA data for the Severe Weather Data Inventory (SWDI)
swdi( dataset = NULL, format = "xml", startdate = NULL, enddate = NULL, limit = 25, offset = NULL, radius = NULL, center = NULL, bbox = NULL, tile = NULL, stat = NULL, id = NULL, filepath = NULL, ... )
swdi( dataset = NULL, format = "xml", startdate = NULL, enddate = NULL, limit = 25, offset = NULL, radius = NULL, center = NULL, bbox = NULL, tile = NULL, stat = NULL, id = NULL, filepath = NULL, ... )
dataset |
Dataset to query. See below for details. |
format |
File format to download. One of xml, csv, shp, or kmz. |
startdate |
Start date. See details. |
enddate |
End date. See details. |
limit |
Number of results to return. Defaults to 25. Any number from 1 to 10000000. Time out issues likely to occur at higher limits. |
offset |
Any number from 1 to 10000000. Default is NULL, no offset, start from 1. |
radius |
Search radius in miles (current limit is 15 miles). BEWARE: As far as we know, this parameter doesn't do anything, or at least does not in fact limit the search to the given radius. DO NOT USE. |
center |
Center coordinate in lon,lat decimal degree format, e.g.: c(-95.45,36.88) |
bbox |
Bounding box in format of minLon,minLat,maxLon,maxLat, e.g.: c(-91,30,-90,31) |
tile |
Coordinate in lon,lat decimal degree format, e.g.: c(-95.45,36.88). The lat/lon values are rounded to the nearest tenth of degree. For the above example, the matching tile would contain values from -95.4500 to -95.5499 and 36.8500 to 36.9499 |
stat |
One of count or tilesum:$longitude,$latitude. Setting stat='count' returns number of results only (no actual data). stat='tilesum:$longitude,$latitude' returns daily feature counts for a tenth of a degree grid centered at the nearest tenth of a degree to the supplied values. |
id |
An identifier, e.g., 533623. Not sure how you find these ids? |
filepath |
If kmz or shp chosen the file name and optionally path to write to. Ignored format=xml or format=csv (optional) |
... |
Curl options passed on to crul::verb-GET (optional) |
Options for the dataset parameter. One of (and their data formats):
nx3tvs NEXRAD Level-3 Tornado Vortex Signatures (point)
nx3meso NEXRAD Level-3 Mesocyclone Signatures (point)
nx3hail NEXRAD Level-3 Hail Signatures (point)
nx3structure NEXRAD Level-3 Storm Cell Structure Information (point)
plsr Preliminary Local Storm Reports (point)
warn Severe Thunderstorm, Tornado, Flash Flood and Special Marine warnings (polygon)
nldn Lightning strikes from Vaisala. Available to government and military users only. If you aren't one of those, you'll get a 400 status stop message if you request data from this dataset (point)
For startdate and enddate, the date range syntax is 'startDate:endDate' or special option of 'periodOfRecord'. Note that startDate is inclusive and endDate is exclusive. All dates and times are in GMT. The current limit of the date range size is one year.
All latitude and longitude values for input parameters and output data are in the WGS84 datum.
If xml or csv chosen, a list of length three, a slot of metadata (meta), a slot for data (data), and a slot for shape file data with a single column 'shape'. The meta slot is a list of metadata elements, and the data slot is a data.frame, possibly of length zero if no data is found.
If kmz or shp chosen, the file is downloaded to your machine and a message is printed.
https://www.ncdc.noaa.gov/ncei-severe-weather-data-inventory https://www.ncdc.noaa.gov/swdiws/
## Not run: # Search for nx3tvs data from 5 May 2006 to 6 May 2006 swdi(dataset='nx3tvs', startdate='20060505', enddate='20060506') # Get all 'nx3tvs' near latitude = 32.7 and longitude = -102.0 swdi(dataset='nx3tvs', startdate='20060506', enddate='20060507', center=c(-102.0,32.7)) # use an id swdi(dataset='warn', startdate='20060506', enddate='20060507', id=533623) # Get all 'plsr' within the bounding box (-91,30,-90,31) swdi(dataset='plsr', startdate='20060505', enddate='20060510', bbox=c(-91,30,-90,31)) # Get all 'nx3tvs' within the tile -102.1/32.6 (-102.15,32.55,-102.25,32.65) swdi(dataset='nx3tvs', startdate='20060506', enddate='20060507', tile=c(-102.12,32.62)) # Counts ## Note: stat='count' will only return metadata, nothing in the data or shape slots ## Note: stat='tilesum:...' returns counts in the data slot for each date for that tile, ## and shape data ## Get number of 'nx3tvs' near latitude = 32.7 and longitude = -102.0 swdi(dataset='nx3tvs', startdate='20060505', enddate='20060516', center=c(-102.0,32.7), stat='count') ## Get daily count nx3tvs features on .1 degree grid centered at latitude = 32.7 ## and longitude = -102.0 swdi(dataset='nx3tvs', startdate='20060505', enddate='20090516', stat='tilesum:-102.0,32.7') # CSV format swdi(dataset='nx3tvs', startdate='20060505', enddate='20060506', format='csv') # SHP format swdi(dataset='nx3tvs', startdate='20060505', enddate='20060506', format='shp', filepath='myfile') # KMZ format swdi(dataset='nx3tvs', startdate='20060505', enddate='20060506', format='kmz', filepath='myfile.kmz') # csv output to SpatialPointsDataFrame res <- swdi(dataset='nx3tvs', startdate='20060505', enddate='20060506', format="csv") library('sp') coordinates(res$data) <- ~lon + lat res$data class(res$data) ## End(Not run)
## Not run: # Search for nx3tvs data from 5 May 2006 to 6 May 2006 swdi(dataset='nx3tvs', startdate='20060505', enddate='20060506') # Get all 'nx3tvs' near latitude = 32.7 and longitude = -102.0 swdi(dataset='nx3tvs', startdate='20060506', enddate='20060507', center=c(-102.0,32.7)) # use an id swdi(dataset='warn', startdate='20060506', enddate='20060507', id=533623) # Get all 'plsr' within the bounding box (-91,30,-90,31) swdi(dataset='plsr', startdate='20060505', enddate='20060510', bbox=c(-91,30,-90,31)) # Get all 'nx3tvs' within the tile -102.1/32.6 (-102.15,32.55,-102.25,32.65) swdi(dataset='nx3tvs', startdate='20060506', enddate='20060507', tile=c(-102.12,32.62)) # Counts ## Note: stat='count' will only return metadata, nothing in the data or shape slots ## Note: stat='tilesum:...' returns counts in the data slot for each date for that tile, ## and shape data ## Get number of 'nx3tvs' near latitude = 32.7 and longitude = -102.0 swdi(dataset='nx3tvs', startdate='20060505', enddate='20060516', center=c(-102.0,32.7), stat='count') ## Get daily count nx3tvs features on .1 degree grid centered at latitude = 32.7 ## and longitude = -102.0 swdi(dataset='nx3tvs', startdate='20060505', enddate='20090516', stat='tilesum:-102.0,32.7') # CSV format swdi(dataset='nx3tvs', startdate='20060505', enddate='20060506', format='csv') # SHP format swdi(dataset='nx3tvs', startdate='20060505', enddate='20060506', format='shp', filepath='myfile') # KMZ format swdi(dataset='nx3tvs', startdate='20060505', enddate='20060506', format='kmz', filepath='myfile.kmz') # csv output to SpatialPointsDataFrame res <- swdi(dataset='nx3tvs', startdate='20060505', enddate='20060506', format="csv") library('sp') coordinates(res$data) <- ~lon + lat res$data class(res$data) ## End(Not run)
This function gets spatial paths of tornadoes from NOAA's National Weather Service Storm Prediction Center Severe Weather GIS web page.
tornadoes(...)
tornadoes(...)
... |
Curl options passed on to crul::verb-GET (optional) |
A Spatial object is returned of class SpatialLinesDataFrame.
See torn_cache for managing cached files
https://www.spc.noaa.gov/gis/svrgis/
## Not run: shp <- tornadoes() library('sp') if (interactive()) { # may take 10 sec or so to render plot(shp) } ## End(Not run)
## Not run: shp <- tornadoes() library('sp') if (interactive()) { # may take 10 sec or so to render plot(shp) } ## End(Not run)
Gives you an at-a-glance ggplot of the missingness inside a dataframe,
colouring cells according to missingness, where black indicates a present
cell and grey indicates a missing cell. As it returns a ggplot
object,
it is very easy to customize and change labels, and so on.
vis_miss(x, cluster = FALSE, sort_miss = FALSE)
vis_miss(x, cluster = FALSE, sort_miss = FALSE)
x |
a data.frame |
cluster |
logical |
sort_miss |
logical |
vis_miss
visualises a data.frame to display missingness. This is
taken from the visdat package, currently only available on github:
https://github.com/tierneyn/visdat
## Not run: monitors <- c("ASN00003003", "ASM00094299") weather_df <- meteo_pull_monitors(monitors) vis_miss(weather_df) ## End(Not run)
## Not run: monitors <- c("ASN00003003", "ASM00094299") weather_df <- meteo_pull_monitors(monitors) vis_miss(weather_df) ## End(Not run)