Title: | Interface to the arXiv API |
---|---|
Description: | An interface to the API for 'arXiv', a repository of electronic preprints for computer science, mathematics, physics, quantitative biology, quantitative finance, and statistics. |
Authors: | Karthik Ram [aut], Karl Broman [aut, cre] |
Maintainer: | Karl Broman <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.10 |
Built: | 2025-01-12 05:21:49 UTC |
Source: | https://github.com/ropensci/aRxiv |
arXiv subject classifications: their abbreviations and corresponding descriptions.
data(arxiv_cats)
A data frame with five columns: the abbreviations of the subject classifications (category), the field of study, subfield of study (within Physics; NA otherwise), a short description, and a longer description.
https://arxiv.org/category_taxonomy
arxiv_cats
Count the number of results for a given search. Useful to check before attempting to pull down a very large number of records.
arxiv_count(query = NULL, id_list = NULL)
query | Search pattern as a string; a vector of such strings is also allowed, in which case the elements are combined with AND. |
id_list | arXiv doc IDs, as a comma-delimited string or a vector of such strings. |
Number of results (integer). An attribute "search_info" contains information about the search parameters and the time at which it was performed.
arxiv_search(), query_terms(), arxiv_cats()
# count papers in category stat.AP (applied statistics)
arxiv_count(query = "cat:stat.AP")

# count papers by Peter Hall in any stat category
arxiv_count(query = 'au:"Peter Hall" AND cat:stat*')

# count papers for a range of dates
# here, everything in 2013
arxiv_count("submittedDate:[2013 TO 2014]")
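The check-before-download pattern mentioned in the description can be sketched as follows. This is only an illustration: the cap `max_records` is an arbitrary value chosen here, not a package default, and the whole block is skipped when the aRxiv package is not installed or the API cannot be reached.

```r
query <- "cat:stat.AP"
max_records <- 500  # arbitrary cap chosen for this illustration

# Run only if aRxiv is installed and the arXiv API is reachable
if (requireNamespace("aRxiv", quietly = TRUE) && aRxiv::can_arxiv_connect()) {
  n <- aRxiv::arxiv_count(query)
  if (n <= max_records) {
    # Small enough: fetch every matching record
    res <- aRxiv::arxiv_search(query, limit = n)
  } else {
    message(n, " records match; narrow the query before downloading.")
  }
} else {
  message("aRxiv not installed or arXiv API unreachable; skipping.")
}
```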
Open, in a web browser, the abstract pages for each of a set of arXiv search results.
arxiv_open(search_results, limit = 20)
search_results | Data frame of search results, as returned from arxiv_search(). |
limit | Maximum number of abstracts to open in one call. |
There is a delay between calls to utils::browseURL(), with the amount taken from the R option "aRxiv_delay" (in seconds); if missing, the default is 3 sec.
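The delay can be adjusted through R's standard options mechanism. The sketch below sets the option for the session and then looks it up with a fallback of 3 seconds; the `getOption()` call is only an illustration of such a lookup and is not taken from the package source.

```r
# Set the delay between browser calls to 1 second, remembering the old value
old <- options(aRxiv_delay = 1)

# Illustrative lookup: use the option if set, otherwise fall back to 3 seconds
delay <- getOption("aRxiv_delay", default = 3)
print(delay)

# Restore the previous setting
options(old)
```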
(Invisibly) Vector of character strings with URLs of abstracts opened.
z <- arxiv_search('au:"Peter Hall" AND ti:deconvolution')
arxiv_open(z)
Allows for programmatic searching of the arXiv preprint repository.
arxiv_search(
  query = NULL,
  id_list = NULL,
  start = 0,
  limit = 10,
  sort_by = c("submitted", "updated", "relevance"),
  ascending = TRUE,
  batchsize = 100,
  force = FALSE,
  output_format = c("data.frame", "list"),
  sep = "|"
)
query | Search pattern as a string; a vector of such strings is also allowed, in which case the elements are combined with AND. |
id_list | arXiv doc IDs, as a comma-delimited string or a vector of such strings. |
start | An offset for the start of the search. |
limit | Maximum number of records to return. |
sort_by | How to sort the results (ignored if id_list is provided). |
ascending | If TRUE, sort in ascending order; else descending (ignored if id_list is provided). |
batchsize | Maximum number of records to request at one time. |
force | If TRUE, force the search request even if it seems extreme. |
output_format | Indicates whether output should be a data frame or a list. |
sep | String to use to separate multiple authors, affiliations, DOI links, and categories, in the case that output_format="data.frame". |
If output_format="data.frame", the result is a data frame with each row being a manuscript and columns being the various fields.
If output_format="list", the result is a list parsed from the XML output of the search, closer to the raw output from arXiv.
The data frame format has the following columns.
[,1] | id | arXiv ID |
[,2] | submitted | date first submitted |
[,3] | updated | date last updated |
[,4] | title | manuscript title |
[,5] | summary | abstract |
[,6] | authors | author names |
[,7] | affiliations | author affiliations |
[,8] | link_abstract | hyperlink to abstract |
[,9] | link_pdf | hyperlink to pdf |
[,10] | link_doi | hyperlink to DOI |
[,11] | comment | authors' comment |
[,12] | journal_ref | journal reference |
[,13] | doi | published DOI |
[,14] | primary_category | primary category |
[,15] | categories | all categories |
The contents are all strings; missing values are empty strings ("").
The columns authors, affiliations, link_doi, and categories may have multiple entries separated by sep (by default, "|").
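Multi-valued columns can be split back into their individual entries with base R. The sketch below works on a small mock data frame shaped like the result described above; the IDs, author names, and categories in it are made-up placeholders, not real search output.

```r
# Mock result with the same column layout (all values are placeholders)
z <- data.frame(
  id = c("0000.0001v1", "0000.0002v1"),
  authors = c("A. Author|B. Author", "C. Author"),
  categories = c("stat.ME|stat.AP", "math.ST"),
  stringsAsFactors = FALSE
)

# Split on the literal separator; fixed = TRUE is needed because "|" is a
# regular-expression metacharacter
author_list <- strsplit(z$authors, "|", fixed = TRUE)
n_authors <- lengths(author_list)

# Long format: one row per (paper, category) pair
cats <- strsplit(z$categories, "|", fixed = TRUE)
long <- data.frame(
  id = rep(z$id, lengths(cats)),
  category = unlist(cats),
  stringsAsFactors = FALSE
)
```

If a different `sep` was passed to the search, the same string should be used as the split argument.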
The result includes an attribute "search_info" that includes information about the details of the search parameters, including the time at which it was completed. Another attribute "total_results" is the total number of records that match the query.
arxiv_count(), arxiv_open(), query_terms(), arxiv_cats()
# search for author Peter Hall with deconvolution in title
z <- arxiv_search(query = 'au:"Peter Hall" AND ti:deconvolution', limit=2)
attr(z, "total_results") # total no. records matching query
z$title

# search for a set of documents by arxiv identifiers
z <- arxiv_search(id_list = c("0710.3491v1", "0804.0713v1", "1003.0315v1"))
# can also use a comma-separated string
z <- arxiv_search(id_list = "0710.3491v1,0804.0713v1,1003.0315v1")
# Journal references, if available
z$journal_ref

# search for a range of dates (in this case, one day)
z <- arxiv_search("submittedDate:[199701010000 TO 199701012400]", limit=2)
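Date-range queries like the last example are easy to mistype by hand. A small helper can assemble them from Date objects. `date_range_query()` is a hypothetical function written for this illustration, not part of aRxiv, and it assumes the YYYYMMDDHHMM form used in the example, covering whole days from 0000 to 2400.

```r
# Hypothetical helper (not part of aRxiv): build a submittedDate range query
# spanning whole days, from 0000 on `from` to 2400 on `to`
date_range_query <- function(from, to) {
  stopifnot(inherits(from, "Date"), inherits(to, "Date"), from <= to)
  paste0(
    "submittedDate:[",
    format(from, "%Y%m%d"), "0000",
    " TO ",
    format(to, "%Y%m%d"), "2400]"
  )
}

q <- date_range_query(as.Date("1997-01-01"), as.Date("1997-01-01"))
print(q)
```

The resulting string can be passed as the query argument of arxiv_search() or arxiv_count().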
Check for connection to arXiv API
can_arxiv_connect(max_time = 5)
max_time | Maximum wait time in seconds. |
Returns TRUE if connection is established and FALSE otherwise.
can_arxiv_connect(2)
Possible terms that correspond to different fields in arXiv searches.
data(query_terms)
A data frame with two columns: the term and corresponding description.
Karl W Broman
https://arxiv.org/help/api/user-manual.html
query_terms
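Query strings pair a field term with a value, as in the `au:`, `ti:`, and `cat:` prefixes used in the examples for arxiv_search() and arxiv_count(). The sketch below assembles such a string from named arguments. `build_query()` is a hypothetical helper written for this illustration, not part of aRxiv; it quotes values that contain spaces and joins the pieces with AND.

```r
# Hypothetical helper (not part of aRxiv): turn named field/value pairs into
# a single query string
build_query <- function(...) {
  terms <- c(...)
  # Every value must be named with a field term such as "au" or "ti"
  stopifnot(!is.null(names(terms)), all(nzchar(names(terms))))
  # Wrap multi-word values in double quotes so they are kept together
  needs_quotes <- grepl(" ", terms, fixed = TRUE)
  values <- ifelse(needs_quotes, paste0('"', terms, '"'), terms)
  paste(paste0(names(terms), ":", values), collapse = " AND ")
}

q <- build_query(au = "Peter Hall", ti = "deconvolution")
print(q)
```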