Package: robotstxt 0.7.15.9000
robotstxt: A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker
Provides functions to download and parse 'robots.txt' files. The package makes it easy to check whether bots (spiders, crawlers, scrapers, ...) are allowed to access specific resources on a domain.
Authors:
robotstxt_0.7.15.9000.tar.gz
robotstxt_0.7.15.9000.zip (r-4.5), robotstxt_0.7.15.9000.zip (r-4.4), robotstxt_0.7.15.9000.zip (r-4.3)
robotstxt_0.7.15.9000.tgz (r-4.5-any), robotstxt_0.7.15.9000.tgz (r-4.4-any), robotstxt_0.7.15.9000.tgz (r-4.3-any)
robotstxt_0.7.15.9000.tar.gz (r-4.5-noble), robotstxt_0.7.15.9000.tar.gz (r-4.4-noble)
robotstxt_0.7.15.9000.tgz (r-4.4-emscripten), robotstxt_0.7.15.9000.tgz (r-4.3-emscripten)
robotstxt.pdf | robotstxt.html
robotstxt/json (API)
NEWS
# Install 'robotstxt' in R:
install.packages('robotstxt', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
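Once installed, a permissions check might look like the sketch below. It is a minimal illustration rather than the package's canonical usage. The robots.txt content and the paths are made up for this example, and the rules are passed through the `text` argument of `robotstxt()` so that no file has to be downloaded.

```r
library(robotstxt)

# Hypothetical robots.txt content, invented for this illustration
txt <- "User-agent: *\nDisallow: /private/\n"

# Build a robotstxt object from the text instead of downloading from a domain
rt <- robotstxt(text = txt)

# Ask whether any bot ("*") may access each path; one logical value per path
allowed <- rt$check(paths = c("/", "/private/data.html"), bot = "*")
print(allowed)
```

To check a live site instead, `paths_allowed()` accepts a `domain` argument and fetches the file itself, which requires network access.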
Reviews: rOpenSci Software Review #25
Bug tracker: https://github.com/ropensci/robotstxt/issues
Pkgdown site: https://docs.ropensci.org
On CRAN: robotstxt-0.7.15 (2024-08-29)
Topics: crawler, peer-reviewed, robotstxt, scraper, spider, webscraping
Last updated 4 months ago from: d3d0a4d525 (on main). Checks: 9 OK. Indexed: yes.
Target | Result | Latest binary |
---|---|---|
Doc / Vignettes | OK | Mar 14 2025 |
R-4.5-win | OK | Mar 14 2025 |
R-4.5-mac | OK | Mar 14 2025 |
R-4.5-linux | OK | Mar 14 2025 |
R-4.4-win | OK | Mar 14 2025 |
R-4.4-mac | OK | Mar 14 2025 |
R-4.4-linux | OK | Mar 14 2025 |
R-4.3-win | OK | Mar 14 2025 |
R-4.3-mac | OK | Mar 14 2025 |
Exports: %>%, get_robotstxt, get_robotstxt_http_get, get_robotstxts, is_valid_robotstxt, on_client_error_default, on_domain_change_default, on_file_type_mismatch_default, on_not_found_default, on_redirect_default, on_server_error_default, on_sub_domain_change_default, on_suspect_content_default, parse_robotstxt, paths_allowed, request_handler_handler, robotstxt, rt_last_http, rt_request_handler
Dependencies: askpass, cli, codetools, curl, digest, future, future.apply, globals, glue, httr, jsonlite, lifecycle, listenv, magrittr, mime, openssl, parallelly, R6, Rcpp, rlang, spiderbar, stringi, stringr, sys, vctrs