Title: | An API Client for the Internet Archive |
---|---|
Description: | Search the Internet Archive (<https://archive.org>), retrieve metadata, and download files. |
Authors: | Lincoln Mullen [aut, cre] |
Maintainer: | Lincoln Mullen <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.6 |
Built: | 2024-10-28 06:07:16 UTC |
Source: | https://github.com/ropensci/internetarchive |
Open an Internet Archive item in the browser
ia_browse(item_id, type = c("details", "stream"))
item_id |
The item identifier. If multiple item identifiers are passed in, only the first will be opened. |
type |
Which page to open: "details" or "stream". |
Returns the item ID(s) passed to the function.
# Distinguished Converts to Rome in America
ia_browse("distinguishedcon00scanuoft")
Download files for Internet Archive items.
ia_download(files, dir = ".", extended_name = TRUE, overwrite = FALSE, silence = FALSE)
files |
A data frame of files returned by ia_files. You should filter this data frame to download only the files that you actually want. |
dir |
The directory in which to save the downloaded files. |
extended_name |
If this argument is |
overwrite |
If |
silence |
If false, print the item IDs as they are downloaded. |
A data frame including the file names of the downloaded files.
## Not run:
if (require(dplyr)) {
  dir <- tempdir()
  ia_get_items("thedamnationofth00133gut") %>%
    ia_files() %>%
    filter(type == "txt") %>% # download only the files we want
    ia_download(dir = dir, extended_name = FALSE)
}
## End(Not run)
Access the list of files associated with an Internet Archive item
ia_files(items)
items |
A list describing Internet Archive items returned from the API. |
A list containing the files as a list of character vectors.
## Not run:
ats_query <- c("publisher" = "american tract society")
ids <- ia_search(ats_query, num_results = 3)
items <- ia_get_items(ids)
files <- ia_files(items)
files
## End(Not run)
Get the metadata for Internet Archive items
ia_get_items(item_id, silence = FALSE)
item_id |
A character vector containing the ID for an Internet Archive item. This argument is vectorized, so you can retrieve multiple items at once. |
silence |
If false, print the item IDs as they are retrieved. |
A list containing the metadata returned by the API. List names correspond to the item IDs.
## Not run:
ia_get_items("thedamnationofth00133gut")

ats_query <- c("publisher" = "american tract society")
ids <- ia_search(ats_query, num_results = 2)
ia_get_items(ids)
## End(Not run)
Access the item IDs from Internet Archive items
ia_item_id(item)
item |
A list describing Internet Archive items returned from the API. This argument is vectorized. |
A character vector containing the item IDs.
ats_query <- c("publisher" = "american tract society")
ids <- ia_search(ats_query, num_results = 3)
items <- ia_get_items(ids)
ia_item_id(items)
Perform a simple keyword search of the Internet Archive.
ia_keyword_search(keywords, num_results = 5, page = 1, print_total = TRUE)
keywords |
The keywords to search for. |
num_results |
The number of results to return per page. |
page |
When results are paged, which page of results to return. |
print_total |
Should the total number of results for this query be printed as a message? |
A character vector of Internet Archive item IDs.
ia_keyword_search("isaac hecker", num_results = 20)
List accepted metadata fields
ia_list_fields()
A list of the accepted metadata fields
ia_list_fields()
Access the item metadata from an Internet Archive item
ia_metadata(items)
items |
A list object describing Internet Archive items returned from the API. |
A data frame containing the metadata, with columns id for the item identifier, field for the name of the metadata field, and value for the metadata values.
ats_query <- c("publisher" = "american tract society")
ids <- ia_search(ats_query, num_results = 3)
items <- ia_get_items(ids)
metadata <- ia_metadata(items)
metadata
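Because the result has one row per item and field (columns id, field, and value), pulling out a single field is a row filter. The sketch below shows this in base R. The data frame is a hypothetical stand-in for the output of ia_metadata; its identifiers and values are invented for illustration and are not real API results.

```r
# Hypothetical stand-in for the output of ia_metadata(); the identifiers and
# values are invented for illustration and are not real API results.
metadata <- data.frame(
  id    = c("item_a", "item_a", "item_b", "item_b"),
  field = c("title", "publisher", "title", "publisher"),
  value = c("First Title", "Some Publisher", "Second Title", "Another Publisher"),
  stringsAsFactors = FALSE
)

# Keep only the rows for one metadata field
titles <- metadata[metadata$field == "title", c("id", "value")]

# Look up every field recorded for a single item
item_a <- metadata[metadata$id == "item_a", ]
```

The same filters can be written with dplyr's filter() if you prefer the piped style used elsewhere in these examples.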
Perform an advanced search of the Internet Archive, specifying which metadata fields to search. All searches are "contains" matches: for example, a title search returns items whose title contains the search term.
ia_search(terms, num_results = 5, page = 1, print_url = FALSE, print_total = TRUE)
terms |
A set of metadata fields and corresponding values to search. These should take the form of a named character vector. |
num_results |
The number of results to return per page. |
page |
When results are paged, which page of results to return. |
print_url |
Should the URL used for the query be printed as a message? |
print_total |
Should the total number of results for this query be printed as a message? |
A character vector of Internet Archive item IDs.
See the documentation on the Internet Archive's advanced search page.
query1 <- c("title" = "damnation of theron ware")
ia_search(query1)

query2 <- c("title" = "damnation of theron ware", "contributor" = "gutenberg")
ia_search(query2)
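Since results are returned one page at a time, collecting more than num_results IDs means calling the search once per page. The following is a minimal sketch of that loop. collect_pages is a hypothetical helper, not part of the package, and it assumes that a page past the end of the results comes back as a zero-length vector. The search function is passed in as an argument so the helper can be tried without network access.

```r
# Hypothetical helper (not part of the package): gather IDs across pages.
# `search_fn` is any function with the (terms, num_results, page) arguments
# shown in the usage above, such as ia_search.
collect_pages <- function(search_fn, terms, pages = 1:3, num_results = 5) {
  ids <- character(0)
  for (p in pages) {
    page_ids <- search_fn(terms, num_results = num_results, page = p)
    # Assumption: an exhausted query yields a zero-length result
    if (length(page_ids) == 0) break
    ids <- c(ids, page_ids)
  }
  unique(ids)
}

## Not run:
# ats_query <- c("publisher" = "american tract society")
# ids <- collect_pages(ia_search, ats_query, pages = 1:2)
## End(Not run)
```

With print_total left at its default, each call would print the total number of results, so you may want to wrap the search in a function that sets print_total = FALSE.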
This client permits you to search (ia_search), retrieve item metadata (ia_metadata) and associated files (ia_files), and download files (ia_download) in a pipeable interface.