Title: | Taxonomic Information from 'Wikipedia' |
---|---|
Description: | 'Taxonomic' information from 'Wikipedia', 'Wikicommons', 'Wikispecies', and 'Wikidata'. Functions included for getting taxonomic information from each of the sources just listed, as well as performing taxonomic search. |
Authors: | Scott Chamberlain [aut], Ethan Welty [aut], Grzegorz Sapijaszko [aut], Zachary Foster [aut, cre] |
Maintainer: | Zachary Foster <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.4.0.91 |
Built: | 2025-01-09 06:11:55 UTC |
Source: | https://github.com/ropensci/wikitaxa |
Taxonomic Information from Wikipedia
Scott Chamberlain [email protected]
Ethan Welty
data.frame of 295 rows, with 3 columns:
language - language
language_local - language in local name
wiki - language code for the wiki
From https://meta.wikimedia.org/wiki/List_of_Wikipedias
Wikidata taxonomy data
wt_data(x, property = NULL, ...)
wt_data_id(x, language = "en", limit = 10, ...)
x | (character) a taxonomic name |
property | (character) a property id, e.g., P486 |
... | curl options passed on to |
language | (character) two letter language code |
limit | (integer) records to return. Default: 10 |
Note that wt_data can take a while to run, because claims are fetched one at a time.
You can also search for things other than taxonomic names with wt_data if you like.
wt_data searches Wikidata, and returns a list with elements:
labels - data.frame with columns: language, value
descriptions - data.frame with columns: language, value
aliases - data.frame with columns: language, value
sitelinks - data.frame with columns: site, title
claims - data.frame with columns: claims, property_value, property_description, value (comma separated values in string)
wt_data_id gets the Wikidata ID for the searched term, and returns the ID as character
## Not run:
# search by taxon name
# wt_data("Mimulus alsinoides")
# choose which properties to return
wt_data(x = "Mimulus foliatus", property = c("P846", "P815"))
# get a taxonomic identifier
wt_data_id("Mimulus foliatus")
# the id can be passed directly to wt_data()
# wt_data(wt_data_id("Mimulus foliatus"))
## End(Not run)
Supports both static page urls and their equivalent API calls.
wt_wiki_page(url, ...)
url | (character) MediaWiki page url. |
... | Arguments passed to |
If the given URL points to a human-readable HTML page, it is converted to the equivalent API call; if the URL is already an API call, it is used as is.
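The conversion of a static page URL into an API call can be sketched in base R. This is an illustration only, not wikitaxa's implementation: `to_api_url()` is a hypothetical helper, and the `action=parse` and `format=json` query parameters are assumed because they match the defaults shown for wt_wiki_url_build().

```r
# Hypothetical helper, not part of wikitaxa: turn a static MediaWiki page URL
# into a parse-API URL, and leave URLs that already target api.php untouched.
to_api_url <- function(url) {
  # Already an API call: use it unchanged
  if (grepl("/w/api\\.php", url)) {
    return(url)
  }
  # Capture the host part and the page title of a ".../wiki/<page>" URL
  m <- regmatches(url, regexec("^(https?://[^/]+)/wiki/(.+)$", url))[[1]]
  if (length(m) != 3) {
    stop("not a recognised MediaWiki page URL: ", url)
  }
  # Assumed query parameters (action=parse, format=json)
  paste0(m[2], "/w/api.php?action=parse&format=json&page=", m[3])
}

to_api_url("https://en.wikipedia.org/wiki/Malus_domestica")
```

The sketch does no URL escaping, so it is only suitable for titles that are already URL-safe.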
an HttpResponse response object from crul
Other MediaWiki functions: wt_wiki_page_parse(), wt_wiki_url_build(), wt_wiki_url_parse()
## Not run:
wt_wiki_page("https://en.wikipedia.org/wiki/Malus_domestica")
## End(Not run)
Parses common properties from the result of a MediaWiki API page call.
wt_wiki_page_parse(
  page,
  types = c("langlinks", "iwlinks", "externallinks"),
  tidy = FALSE
)
page | (crul::HttpResponse) Result of |
types | (character) List of properties to parse. |
tidy | (logical) tidy output to data.frames when possible. Default: FALSE |
Available properties currently not parsed: title, displaytitle, pageid, revid, redirects, text, categories, links, templates, images, sections, properties, ...
a list
Other MediaWiki functions: wt_wiki_page(), wt_wiki_url_build(), wt_wiki_url_parse()
## Not run:
pg <- wt_wiki_page("https://en.wikipedia.org/wiki/Malus_domestica")
wt_wiki_page_parse(pg)
## End(Not run)
Builds a MediaWiki page url from its component parts (wiki name, wiki type, and page title). Supports both static page urls and their equivalent API calls.
wt_wiki_url_build(
  wiki,
  type = NULL,
  page = NULL,
  api = FALSE,
  action = "parse",
  redirects = TRUE,
  format = "json",
  utf8 = TRUE,
  prop = c("text", "langlinks", "categories", "links", "templates", "images",
    "externallinks", "sections", "revid", "displaytitle", "iwlinks", "properties")
)
wiki | (character | list) Either the wiki name or a list with |
type | (character) Wiki type. |
page | (character) Wiki page title. |
api | (boolean) Whether to return an API call or a static page url (default). If |
action | (character) See https://en.wikipedia.org/w/api.php for supported actions. This function currently only supports "parse". |
redirects | (boolean) If the requested page is set to a redirect, resolve it. |
format | (character) See https://en.wikipedia.org/w/api.php for supported output formats. |
utf8 | (boolean) If |
prop | (character) Properties to retrieve, either as a character vector or pipe-delimited string. See https://en.wikipedia.org/w/api.php?action=help&modules=parse for supported properties. |
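Because prop accepts either a character vector or a pipe-delimited string, both forms presumably end up as one pipe-delimited value in the query string. A minimal sketch of that normalisation, assuming a hypothetical helper `normalize_prop()` that is not part of wikitaxa:

```r
# Hypothetical helper, not part of wikitaxa: accept a character vector and/or
# pipe-delimited strings and return one pipe-delimited string.
normalize_prop <- function(prop) {
  # Split every element on literal "|" and flatten to a single vector
  parts <- unlist(strsplit(prop, "|", fixed = TRUE))
  # Trim whitespace, then drop duplicates and empty entries
  parts <- unique(trimws(parts))
  paste(parts[nzchar(parts)], collapse = "|")
}

normalize_prop(c("text", "langlinks"))
normalize_prop("text|langlinks")
```

Both calls return the same string, so a caller can pass whichever form is more convenient.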
a URL (character)
Other MediaWiki functions: wt_wiki_page_parse(), wt_wiki_page(), wt_wiki_url_parse()
wt_wiki_url_build(wiki = "en", type = "wikipedia", page = "Malus domestica")
wt_wiki_url_build(
  wt_wiki_url_parse("https://en.wikipedia.org/wiki/Malus_domestica"))
wt_wiki_url_build("en", "wikipedia", "Malus domestica", api = TRUE)
Parse a MediaWiki page url into its component parts (wiki name, wiki type, and page title). Supports both static page urls and their equivalent API calls.
wt_wiki_url_parse(url)
url | (character) MediaWiki page url. |
a list with elements:
wiki - wiki language
type - wikipedia type
page - page name
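To illustrate the shape of the returned list, here is a minimal base R sketch that splits a static page URL with a regular expression. `parse_static_url()` is a hypothetical helper that handles only the `https://<wiki>.<type>.org/wiki/<page>` form; wt_wiki_url_parse() itself also accepts API URLs and may normalise the page title differently.

```r
# Hypothetical helper, not part of wikitaxa: split a static MediaWiki page URL
# into wiki, type, and page components.
parse_static_url <- function(url) {
  pattern <- "^https?://([^.]+)\\.([^.]+)\\.org/wiki/(.+)$"
  m <- regmatches(url, regexec(pattern, url))[[1]]
  # m holds the full match plus three capture groups when the URL matches
  if (length(m) != 4) {
    stop("not a recognised static MediaWiki page URL: ", url)
  }
  # The page title is returned as found in the URL (underscores are kept)
  list(wiki = m[2], type = m[3], page = m[4])
}

str(parse_static_url("https://en.wikipedia.org/wiki/Malus_domestica"))
```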
Other MediaWiki functions: wt_wiki_page_parse(), wt_wiki_page(), wt_wiki_url_build()
wt_wiki_url_parse(url = "https://en.wikipedia.org/wiki/Malus_domestica")
wt_wiki_url_parse("https://en.wikipedia.org/w/api.php?page=Malus_domestica")
WikiCommons
wt_wikicommons(name, utf8 = TRUE, ...)
wt_wikicommons_parse(
  page,
  types = c("langlinks", "iwlinks", "externallinks", "common_names", "classification"),
  tidy = FALSE
)
wt_wikicommons_search(query, limit = 10, offset = 0, utf8 = TRUE, ...)
name | (character) Wiki name - as a page title, must be length 1 |
utf8 | (logical) If |
... | curl options, passed on to |
page | ( |
types | (character) List of properties to parse |
tidy | (logical) tidy output to data.frames if possible. Default: FALSE |
query | (character) query terms |
limit | (integer) number of results to return. Default: 10 |
offset | (integer) record to start at. Default: 0 |
wt_wikicommons returns a list, with slots:
langlinks - language page links
externallinks - external links
common_names - a data.frame with name and language columns
classification - a data.frame with rank and name columns
wt_wikicommons_parse returns a list
wt_wikicommons_search returns a list with slots continue and query; query holds the results, with the search results in the query$search slot
https://www.mediawiki.org/wiki/API:Search for help on search
## Not run:
# high level
wt_wikicommons(name = "Malus domestica")
wt_wikicommons(name = "Pinus contorta")
wt_wikicommons(name = "Ursus americanus")
wt_wikicommons(name = "Balaenoptera musculus")
wt_wikicommons(name = "Category:Poeae")
wt_wikicommons(name = "Category:Pinaceae")
# low level
pg <- wt_wiki_page("https://commons.wikimedia.org/wiki/Malus_domestica")
wt_wikicommons_parse(pg)
# search wikicommons
# FIXME: utf=FALSE for now until curl::curl_escape fix
# https://github.com/jeroen/curl/issues/228
wt_wikicommons_search(query = "Pinus", utf8 = FALSE)
## use search results to dig into pages
res <- wt_wikicommons_search(query = "Pinus", utf8 = FALSE)
lapply(res$query$search$title[1:3], wt_wikicommons)
## End(Not run)
Wikipedia
wt_wikipedia(name, wiki = "en", utf8 = TRUE, ...)
wt_wikipedia_parse(
  page,
  types = c("langlinks", "iwlinks", "externallinks", "common_names", "classification"),
  tidy = FALSE
)
wt_wikipedia_search(query, wiki = "en", limit = 10, offset = 0, utf8 = TRUE, ...)
name | (character) Wiki name - as a page title, must be length 1 |
wiki | (character) wiki language. default: en. See wikipedias for language codes. |
utf8 | (logical) If |
... | curl options, passed on to |
page | ( |
types | (character) List of properties to parse |
tidy | (logical) tidy output to data.frames if possible. Default: FALSE |
query | (character) query terms |
limit | (integer) number of results to return. Default: 10 |
offset | (integer) record to start at. Default: 0 |
wt_wikipedia returns a list, with slots:
langlinks - language page links
externallinks - external links
common_names - a data.frame with name and language columns
classification - a data.frame with rank and name columns
synonyms - a character vector with taxonomic names
wt_wikipedia_parse returns a list with the same slots, as determined by the types parameter
wt_wikipedia_search returns a list with slots continue and query; query holds the results, with the search results in the query$search slot
https://www.mediawiki.org/wiki/API:Search for help on search
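The limit and offset arguments allow paging through search results. The sketch below shows one way to do that under stated assumptions: `collect_titles()` and its `search_fun` and `max_pages` parameters are hypothetical, and the loop assumes that a short page (fewer than `limit` titles) marks the end of the results. Titles are read from query$search$title, as in the package examples.

```r
# Hypothetical helper, not part of wikitaxa: page through a search function
# that takes query, limit, and offset, and gather the result titles.
collect_titles <- function(search_fun, query, limit = 10, max_pages = 3, ...) {
  titles <- character(0)
  for (i in seq_len(max_pages)) {
    # Each page starts where the previous one ended
    res <- search_fun(query = query, limit = limit,
                      offset = (i - 1) * limit, ...)
    page_titles <- res$query$search$title
    titles <- c(titles, page_titles)
    # Assumption: fewer than `limit` results means there are no more pages
    if (length(page_titles) < limit) break
  }
  titles
}
```

With the package loaded, this could be called as `collect_titles(wt_wikipedia_search, "Pinus", utf8 = FALSE)`; the cap on `max_pages` keeps the number of requests small.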
## Not run:
# high level
wt_wikipedia(name = "Malus domestica")
wt_wikipedia(name = "Malus domestica", wiki = "fr")
wt_wikipedia(name = "Malus domestica", wiki = "da")
# low level
pg <- wt_wiki_page("https://en.wikipedia.org/wiki/Malus_domestica")
wt_wikipedia_parse(pg)
wt_wikipedia_parse(pg, tidy = TRUE)
# search wikipedia
# FIXME: utf=FALSE for now until curl::curl_escape fix
# https://github.com/jeroen/curl/issues/228
wt_wikipedia_search(query = "Pinus", utf8 = FALSE)
wt_wikipedia_search(query = "Pinus", wiki = "fr", utf8 = FALSE)
wt_wikipedia_search(query = "Pinus", wiki = "br", utf8 = FALSE)
## curl options
# wt_wikipedia_search(query = "Pinus", verbose = TRUE, utf8 = FALSE)
## use search results to dig into pages
res <- wt_wikipedia_search(query = "Pinus", utf8 = FALSE)
lapply(res$query$search$title[1:3], wt_wikipedia)
## End(Not run)
WikiSpecies
wt_wikispecies(name, utf8 = TRUE, ...)
wt_wikispecies_parse(
  page,
  types = c("langlinks", "iwlinks", "externallinks", "common_names", "classification"),
  tidy = FALSE
)
wt_wikispecies_search(query, limit = 10, offset = 0, utf8 = TRUE, ...)
name | (character) Wiki name - as a page title, must be length 1 |
utf8 | (logical) If |
... | curl options, passed on to |
page | ( |
types | (character) List of properties to parse |
tidy | (logical) tidy output to data.frames if possible. Default: FALSE |
query | (character) query terms |
limit | (integer) number of results to return. Default: 10 |
offset | (integer) record to start at. Default: 0 |
wt_wikispecies returns a list, with slots:
langlinks - language page links
externallinks - external links
common_names - a data.frame with name and language columns
classification - a data.frame with rank and name columns
wt_wikispecies_parse returns a list
wt_wikispecies_search returns a list with slots continue and query; query holds the results, with the search results in the query$search slot
https://www.mediawiki.org/wiki/API:Search for help on search
## Not run:
# high level
wt_wikispecies(name = "Malus domestica")
wt_wikispecies(name = "Pinus contorta")
wt_wikispecies(name = "Ursus americanus")
wt_wikispecies(name = "Balaenoptera musculus")
# low level
pg <- wt_wiki_page("https://species.wikimedia.org/wiki/Abelmoschus")
wt_wikispecies_parse(pg)
# search wikispecies
# FIXME: utf=FALSE for now until curl::curl_escape fix
# https://github.com/jeroen/curl/issues/228
wt_wikispecies_search(query = "pine tree", utf8 = FALSE)
## use search results to dig into pages
res <- wt_wikispecies_search(query = "pine tree", utf8 = FALSE)
lapply(res$query$search$title[1:3], wt_wikispecies)
## End(Not run)