Title: | Querying and Managing Large Biodiversity Occurrence Datasets |
---|---|
Description: | Facilitates the gathering of biodiversity occurrence data from disparate sources. Metadata is managed throughout the process to facilitate reporting and enhanced ability to repeat analyses. |
Authors: | Hannah L. Owens [aut, cre] , Cory Merow [aut] , Brian Maitner [aut] , Jamie M. Kass [aut] , Vijay Barve [aut] , Robert P. Guralnick [aut] , Damiano Oldoni [rev] (<https://orcid.org/0000-0003-3445-7562>, Damiano reviewed the package (v. 0.5.2) for rOpenSci, see <https://github.com/ropensci/software-review/issues/407>) |
Maintainer: | Hannah L. Owens <[email protected]> |
License: | GPL-3 |
Version: | 0.5.9 |
Built: | 2024-12-27 03:59:18 UTC |
Source: | https://github.com/ropensci/occCite |
A class for managing GBIF login data.
username
A vector of type character specifying a GBIF username.
email
A vector of type character specifying the email associated with a GBIF username.
pwd
A vector of type character containing the user's password for logging in to GBIF.
GBIFLogin <- GBIFLoginManager(
  user = "occCiteTester",
  email = "****@yahoo.com",
  pwd = "12345"
)
Takes a user's GBIF login particulars and turns them into a GBIFLogin object for use in downloading data from GBIF. You must already have an account at GBIF.
GBIFLoginManager(user = NULL, email = NULL, pwd = NULL)
user |
A vector of type character specifying a GBIF username. |
email |
A vector of type character specifying the email associated with a GBIF username. |
pwd |
A vector of type character containing the user's password for logging in to GBIF. |
An object of class GBIFLogin containing the user's GBIF login data.
## Inputting user particulars
## Not run:
myLogin <- GBIFLoginManager(
  user = "theWoman",
  email = "[email protected]",
  pwd = "sh3r"
)
## End(Not run)

## Not run:
## Can also be mined from your system environment
myLogin <- GBIFLoginManager(
  user = NULL,
  email = NULL,
  pwd = NULL
)
## End(Not run)
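The second example relies on login details being "mined from your system environment". As a rough illustration of what that lookup might involve, the sketch below reads three environment variables in base R and fails early if any is missing. The helper `readGBIFEnv()` is hypothetical and not part of occCite, and the variable names `GBIF_USER`, `GBIF_EMAIL`, and `GBIF_PWD` are an assumption based on the rgbif convention; check the package source before relying on them.

```r
# Hypothetical helper, not part of occCite: collect GBIF credentials
# from environment variables (names assumed, following rgbif's convention).
readGBIFEnv <- function() {
  creds <- list(
    user  = Sys.getenv("GBIF_USER",  unset = NA_character_),
    email = Sys.getenv("GBIF_EMAIL", unset = NA_character_),
    pwd   = Sys.getenv("GBIF_PWD",   unset = NA_character_)
  )
  # Flag any credential that is unset or an empty string
  isAbsent <- vapply(
    creds,
    function(v) is.na(v) || !nzchar(v),
    logical(1)
  )
  if (any(isAbsent)) {
    stop(
      "Missing GBIF credentials in environment: ",
      paste(names(creds)[isAbsent], collapse = ", ")
    )
  }
  creds
}
```

The returned list could then be passed on explicitly, for example `GBIFLoginManager(user = creds$user, email = creds$email, pwd = creds$pwd)`, which keeps passwords out of scripts.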
Downloads occurrence points and useful related information for processing within other occCite functions
getBIENpoints(taxon)
taxon |
A single plant species or vector of plant species |
'getBIENpoints' returns all BIEN records, including non-native and cultivated occurrences.
A list containing
a data frame of occurrence data;
a list containing: (i) notes on usage, (ii) BibTeX citations, and (iii) acknowledgment information;
a data frame containing the raw results of a query to 'BIEN::BIEN_occurrence_species()'.
## Not run:
getBIENpoints(taxon = "Protea cynaroides")
## End(Not run)
Downloads GBIF occurrence points and useful related information for processing within other occCite functions
getGBIFpoints( taxon, GBIFLogin = GBIFLogin, GBIFDownloadDirectory = NULL, checkPreviousGBIFDownload = T )
taxon |
A string with a single species name |
GBIFLogin |
An object of class |
GBIFDownloadDirectory |
An optional argument that specifies the local directory where GBIF downloads will be saved. If this is not specified, the downloads will be saved to your current working directory. |
checkPreviousGBIFDownload |
A logical operator specifying whether the user wishes to check their existing prepared downloads on the GBIF website. |
'getGBIFpoints' only returns records from GBIF that have coordinates, aren't flagged as having geospatial issues, and have an occurrence status flagged as "PRESENT".
A list containing
a data frame of occurrence data;
GBIF search metadata;
a data frame containing the raw results of a query to 'rgbif::occ_download_get()'.
## Not run:
getGBIFpoints(
  taxon = "Gadus morhua",
  GBIFLogin = myGBIFLogin,
  GBIFDownloadDirectory = NULL
)
## End(Not run)
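To make the three record criteria noted for 'getGBIFpoints' concrete (coordinates present, no geospatial issues, occurrence status "PRESENT"), here is a minimal base-R sketch applied to a made-up data frame. The column names are assumed Darwin Core-style names and the records are invented; this only illustrates which rows would survive and is not a description of how the package applies the criteria internally.

```r
# Invented toy records; column names are assumptions for illustration.
records <- data.frame(
  species = rep("Gadus morhua", 4),
  decimalLatitude = c(60.1, NA, 58.3, 62.0),
  decimalLongitude = c(5.2, 4.9, 6.1, 7.3),
  hasGeospatialIssues = c(FALSE, FALSE, TRUE, FALSE),
  occurrenceStatus = c("PRESENT", "PRESENT", "PRESENT", "ABSENT"),
  stringsAsFactors = FALSE
)

# Keep rows with both coordinates, no geospatial issue flag,
# and an occurrence status of "PRESENT"
keep <- !is.na(records$decimalLatitude) &
  !is.na(records$decimalLongitude) &
  !records$hasGeospatialIssues &
  records$occurrenceStatus == "PRESENT"

filtered <- records[keep, ]
```

In this toy table only the first row passes: the second lacks a latitude, the third is flagged for a geospatial issue, and the fourth is an absence record.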
Results of an occCite search for *Protea cynaroides*
myOccCiteObject
An 'occCiteData' object with the following slots:
What kind of query was made
A vector of taxonomic sources specified
A data frame with results of taxonomic cleanup
A vector of which databases were queried (i.e. GBIF and BIEN)
When the search was made
A list of length 1 named "Protea cynaroides". Contains a list of length 2 with results from each database, GBIF and BIEN
Global Biodiversity Information Facility, GBIF (https://www.gbif.org/) and Botanical Information and Ecology Network, BIEN (https://bien.nceas.ucsb.edu/bien/) data aggregators.
myOccCiteObject
Harvests citations for occurrence data
occCitation(x = NULL)
x |
An object of class |
An object of class occCiteCitation. It is a named list of the same length as the number of species included in your occCiteData object. Each item in the list has citation information for occurrences.
## Not run:
data(myOccCiteObject)
myCitations <- occCitation(x = myOccCiteObject)
## End(Not run)
A class for managing citations generated from occCite queries.
occCitationResults
The results of performing occCitation on an occCiteData object, stored as a named list, each of the items named after a searched taxon and containing a data frame with occurrence information.
A class for managing metadata associated with occCite queries and data manipulation.
userQueryType
A vector of type character specifying whether the user made their original taxonomic query based on a vector of taxon names or a phylogeny.
userSpecTaxonomy
A vector of type character that presents a list of taxonomic sources for cleaning taxonomy of queries. This can be user-specified or default.
cleanedTaxonomy
A data frame containing input taxon names, the closest match according to taxize::gnr_resolve, and a list of taxonomic data sources that contain the matching name, generated by studyTaxonList.
occSources
A vector of class "character" containing a list of occurrence data sources, generated when passing an occCiteData object through occQuery.
occCiteSearchDate
The date on which the occurrence search query was conducted via occCite.
occResults
The results of an occQuery search, stored as a named list, each of the items named after a searched taxon and containing a data frame with occurrence information.
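As a rough picture of the shape of `occResults`, the sketch below builds a plain nested list by hand: one element per searched taxon, each holding one table per database, as described for the example dataset. The element names `GBIF` and `BIEN`, the column names, and all values are invented placeholders; a real object produced by occQuery will carry more information.

```r
# Hand-built stand-in for the occResults slot; all values are invented.
occResults <- list(
  "Protea cynaroides" = list(
    GBIF = data.frame(
      longitude = c(18.4, 19.1),
      latitude = c(-34.0, -33.9)
    ),
    BIEN = data.frame(
      longitude = 18.9,
      latitude = -34.1
    )
  )
)

# One entry per searched taxon
taxa <- names(occResults)

# Number of records per database for the first taxon
recordCounts <- vapply(occResults[[taxa[1]]], nrow, integer(1))
```

Indexing by taxon name first and database second keeps each source's records separate, which is what allows them to be cited separately later.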
Makes maps for each individual species in an occCiteData object.
occCiteMap( occCiteData, species_map = "all", species_colors = NULL, ds_map = c("GBIF", "BIEN"), map_limit = 1000, awesomeMarkers = TRUE, cluster = FALSE )
occCiteData |
An object of class |
species_map |
Character; either the default "all" to map all species
in |
species_colors |
Character; the default NULL will choose random colors from those available (see Details), or those specified by the user as a character or character vector (the number of colors must match the number of species mapped). |
ds_map |
Character; specifies which data service records will be mapped, with the default being GBIF, BIEN, and GBIF_BIEN (records with the same coordinates in both databases). |
map_limit |
Numeric; the number of points to map per species, set at a default of 1000 randomly selected records; users can specify a higher number, but be aware that leaflet can lag or crash when too many points are plotted. |
awesomeMarkers |
Logical; if 'TRUE' (default), mapped points will be 'awesomeMarkers' attributed with an icon for a globe for GBIF, a leaf for BIEN, or a database if records from both databases have the same coordinates; if 'FALSE', mapped points will be leaflet 'circleMarkers' |
cluster |
Logical; if 'TRUE' (default is 'FALSE') turns on marker clustering, which does not preserve color differences between species |
When mapping using 'awesomeMarkers' (default), the parameter species_colors must match those in a specified color library, currently: c("red", "lightred", "orange", "beige", "green", "lightgreen", "blue", "lightblue", "purple", "pink", "cadetblue", "white", "gray", "lightgray"). When 'awesomeMarkers' is 'FALSE' and species_colors are not specified, random colors from the 'RColorBrewer' Set1 palette are used.
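Because 'awesomeMarkers' accepts only the colors listed above, it can help to check a color vector before mapping. The sketch below is one way to do that in base R; `validateSpeciesColors()` is a hypothetical helper, not an occCite function, and it encodes only the two constraints stated in this documentation (allowed color names, and one color per mapped species).

```r
# Colors accepted when awesomeMarkers = TRUE, copied from the list above
awesomeColors <- c(
  "red", "lightred", "orange", "beige", "green", "lightgreen",
  "blue", "lightblue", "purple", "pink", "cadetblue", "white",
  "gray", "lightgray"
)

# Hypothetical helper: stop with a message if the colors are unusable
validateSpeciesColors <- function(species_colors, n_species) {
  bad <- setdiff(species_colors, awesomeColors)
  if (length(bad) > 0) {
    stop("Unsupported marker colors: ", paste(bad, collapse = ", "))
  }
  if (length(species_colors) != n_species) {
    stop("Number of colors must match the number of species mapped")
  }
  invisible(TRUE)
}

# Two species, two supported colors: passes silently
validateSpeciesColors(c("red", "cadetblue"), n_species = 2)
```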
A leaflet map
## Not run:
data(myOccCiteObject)
occCiteMap(myOccCiteObject, cluster = FALSE)
## End(Not run)
Takes a rectified list of specimens from studyTaxonList and returns point data from rgbif with metadata.
occQuery( x = NULL, datasources = c("gbif", "bien"), GBIFLogin = NULL, GBIFDownloadDirectory = NULL, loadLocalGBIFDownload = F, checkPreviousGBIFDownload = T, options = NULL )
x |
An object of class |
datasources |
A vector of occurrence data sources to search. This is currently limited to GBIF and BIEN, but may expand in the future. |
GBIFLogin |
An object of class |
GBIFDownloadDirectory |
An optional argument that specifies the local directory where GBIF downloads will be saved. If this is not specified, the downloads will be saved to your current working directory. |
loadLocalGBIFDownload |
If |
checkPreviousGBIFDownload |
If |
options |
A vector of options to pass to |
If you are querying GBIF, note that 'occQuery()' only returns records from GBIF that have coordinates, aren't flagged as having geospatial issues, and have an occurrence status flagged as "PRESENT".
The object of class occCiteData supplied by the user as an argument, with occurrence data search results, as well as metadata on the occurrence sources queried.
## Not run:
## If you have already created an occCite object, and have not previously
## downloaded GBIF data.
occQuery(
  x = myOccCiteObject,
  datasources = c("gbif", "bien"),
  GBIFLogin = myLogin,
  GBIFDownloadDirectory = "./Desktop",
  loadLocalGBIFDownload = FALSE
)

## If you don't have an occCite object yet
occQuery(
  x = c("Buteo buteo", "Protea cynaroides"),
  datasources = c("gbif", "bien"),
  GBIFLogin = myLogin,
  GBIFDownloadDirectory = "./Desktop",
  loadLocalGBIFDownload = FALSE
)

## If you have previously downloaded occurrence data from GBIF
## and saved it in a folder called "GBIFDownloads".
occQuery(
  x = c("Buteo buteo", "Protea cynaroides"),
  datasources = c("gbif", "bien"),
  GBIFLogin = myLogin,
  GBIFDownloadDirectory = "./Desktop/GBIFDownloads",
  loadLocalGBIFDownload = TRUE
)
## End(Not run)
Generates up to three different kinds of plots: a histogram of occurrence records by year, a waffle::waffle plot of primary data sources, and a waffle::waffle plot of data aggregators. Toggles determine whether plots are made for individual species or for all species aggregated.
## S3 method for class 'occCiteData'
plot(x, ...)
x |
An object of class |
... |
Additional arguments affecting how the plots are produced. 'bySpecies': Logical; setting to 'TRUE' generates the desired plots for each species. 'plotTypes': The type of plot to be generated; "yearHistogram", "source", and/or "aggregator". |
A list containing the desired plots.
data(myOccCiteObject)
plot(
  x = myOccCiteObject,
  bySpecies = FALSE,
  plotTypes = c("yearHistogram", "source", "aggregator")
)
Searches the list of a user's most recent 1000 downloads on the GBIF servers and returns the data set key for the most recently prepared download.
prevGBIFdownload(taxonKey, GBIFLogin)
taxonKey |
A taxon key as returned from 'rgbif::name_suggest()'. |
GBIFLogin |
An object of class |
A GBIF download key, if one is available
## Not run:
myGBIFLogin <- GBIFLoginManager(
  user = "theWoman",
  email = "[email protected]",
  pwd = "sh3r"
)
taxKey <- rgbif::name_suggest(
  q = "Protea cynaroides",
  rank = "species"
)$key[1]
prevGBIFdownload(
  taxonKey = taxKey,
  GBIFLogin = myGBIFLogin
)
## End(Not run)
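The selection step described for prevGBIFdownload (find the most recently prepared download for a taxon) can be pictured with a small base-R sketch over an invented download table. The table layout, the column names `key`, `taxonKey`, `status`, and `created`, the status value "SUCCEEDED", and the helper `latestDownloadKey()` are all assumptions made for illustration; the real function queries the GBIF servers instead.

```r
# Invented stand-in for a user's download history on GBIF
downloads <- data.frame(
  key = c("key-A", "key-B", "key-C", "key-D"),
  taxonKey = c(101, 101, 101, 202),
  status = c("SUCCEEDED", "SUCCEEDED", "RUNNING", "SUCCEEDED"),
  created = as.POSIXct(
    c("2023-01-10", "2023-06-02", "2023-07-15", "2023-03-01"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: newest finished download for one taxon key,
# or NULL when nothing matches
latestDownloadKey <- function(downloads, taxonKey) {
  hits <- downloads[
    downloads$taxonKey == taxonKey & downloads$status == "SUCCEEDED",
  ]
  if (nrow(hits) == 0) {
    return(NULL)
  }
  hits$key[order(hits$created, decreasing = TRUE)[1]]
}

latestKey <- latestDownloadKey(downloads, taxonKey = 101)
```

Here `key-C` is newer than `key-B` but is skipped because it has not finished, so the helper falls back to `key-B`.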
Prints formatted citations for occurrences and main packages used (i.e. base, occCite, rgbif, and/or BIEN).
## S3 method for class 'occCiteCitation'
print(x, ...)
x |
An object of class |
... |
Additional arguments affecting how the formatted citation document is produced |
A text string with formatted citations
# Print citations for all species together
data(myOccCiteObject)
print(myOccCiteObject)

# Print citations for each species individually
data(myOccCiteObject)
print(myOccCiteObject, bySpecies = TRUE)
Takes input phylogenies or vectors of taxon names, checks them against a taxonomic database, and returns a vector of cleaned taxonomic names (using taxize::gnr_resolve()) for use in spocc queries, as well as warnings if there are invalid names.
studyTaxonList(x = NULL, datasources = "GBIF Backbone Taxonomy")
x |
A phylogeny of class 'phylo' or a vector of class 'character' containing the names of taxa of interest |
datasources |
A vector of taxonomic data sources implemented in
|
An object of class occCiteData containing the type of inquiry the user has made (a phylogeny or a vector of names) and a data frame containing input taxon names, the closest match according to taxize::gnr_resolve, and a list of taxonomic data sources that contain the matching name.
## Inputting a vector of taxon names
studyTaxonList(
  x = c(
    "Buteo buteo",
    "Buteo buteo hartedi",
    "Buteo japonicus"
  ),
  datasources = c("National Center for Biotechnology Information")
)

## Inputting a phylogeny
phylogeny <- ape::read.nexus(
  system.file("extdata/Fish_12Tax_time_calibrated.tre",
    package = "occCite"
  )
)
phylogeny <- ape::extract.clade(phylogeny, 18)
studyTaxonList(
  x = phylogeny,
  datasources = c("GBIF Backbone Taxonomy")
)
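Since studyTaxonList accepts either a 'phylo' object or a character vector, the first step is presumably to work out which one it received and pull out the names. The sketch below shows one way that dispatch could look in base R. `extractTaxonNames()` is a hypothetical helper, the query-type labels are invented, and replacing underscores in tip labels with spaces is an assumption about how tip labels are commonly written; none of this is taken from the package source.

```r
# Hypothetical helper: get taxon names from a phylogeny or a name vector
extractTaxonNames <- function(x) {
  if (inherits(x, "phylo")) {
    # Objects of class 'phylo' store taxon names in $tip.label
    list(
      queryType = "phylogeny",
      taxa = gsub("_", " ", x$tip.label)
    )
  } else if (is.character(x)) {
    list(queryType = "name vector", taxa = x)
  } else {
    stop("x must be a 'phylo' object or a character vector")
  }
}

# Minimal mock of a 'phylo' object, so the sketch runs without ape
mockTree <- structure(
  list(tip.label = c("Buteo_buteo", "Buteo_japonicus")),
  class = "phylo"
)

fromTree <- extractTaxonNames(mockTree)
fromNames <- extractTaxonNames(c("Buteo buteo", "Buteo japonicus"))
```

Both calls yield the same `taxa` vector, differing only in the recorded `queryType`.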
Displays a summary of relevant stats about a query
## S3 method for class 'occCiteData'
summary(object, ...)
object |
An object of class |
... |
Additional arguments affecting the summary produced |
data(myOccCiteObject)
summary(myOccCiteObject)
A function that takes an input taxonomic name, checks it against a taxonomic database, and returns a vector for use in database queries, as well as warnings if the name is invalid.
taxonRectification(taxName = NULL, datasources = NULL, skipTaxize = FALSE)
taxName |
A string that, ideally, is a taxonomic name |
datasources |
A vector of taxonomic data sources implemented in
|
skipTaxize |
If |
A string with the closest match according to taxize::gnr_resolve(), and a list of taxonomic data sources that contain the matching name.
# Inputting taxonomic name and specifying what taxonomic sources to search
taxonRectification(
  taxName = "Buteo buteo hartedi",
  datasources = "National Center for Biotechnology Information",
  skipTaxize = TRUE
)