Package: robotstxt 0.7.15.9000
robotstxt: A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker
Provides functions to download and parse 'robots.txt' files. Ultimately, the package makes it easy to check whether bots (spiders, crawlers, scrapers, ...) are allowed to access specific resources on a domain.
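A minimal usage sketch of the permission check described above. The domain below is illustrative; `paths_allowed()` fetches the site's live robots.txt, so the result depends on the rules published there at the time of the call:

```r
library(robotstxt)

# Check whether a bot may fetch specific paths on a domain.
# The domain's robots.txt is downloaded (and cached) under the hood.
paths_allowed(
  paths  = c("/images", "/private"),
  domain = "www.example.com"
)
# returns a logical vector, one element per path
```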
Authors:
robotstxt_0.7.15.9000.tar.gz
robotstxt_0.7.15.9000.zip (r-4.5) | robotstxt_0.7.15.9000.zip (r-4.4) | robotstxt_0.7.15.9000.zip (r-4.3)
robotstxt_0.7.15.9000.tgz (r-4.4-any) | robotstxt_0.7.15.9000.tgz (r-4.3-any)
robotstxt_0.7.15.9000.tar.gz (r-4.5-noble) | robotstxt_0.7.15.9000.tar.gz (r-4.4-noble)
robotstxt_0.7.15.9000.tgz (r-4.4-emscripten) | robotstxt_0.7.15.9000.tgz (r-4.3-emscripten)
robotstxt.pdf | robotstxt.html
robotstxt/json (API)
NEWS
# Install 'robotstxt' in R:
install.packages('robotstxt', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/ropensci/robotstxt/issues
Pkgdown site: https://docs.ropensci.org
crawler, peer-reviewed, robotstxt, scraper, spider, webscraping
Last updated 2 months ago from: d3d0a4d525 (on main). Checks: 7 OK. Indexed: yes.
Target | Result | Latest binary |
---|---|---|
Doc / Vignettes | OK | Jan 14 2025 |
R-4.5-win | OK | Jan 14 2025 |
R-4.5-linux | OK | Jan 14 2025 |
R-4.4-win | OK | Jan 14 2025 |
R-4.4-mac | OK | Jan 14 2025 |
R-4.3-win | OK | Jan 14 2025 |
R-4.3-mac | OK | Jan 14 2025 |
Exports: %>%, get_robotstxt, get_robotstxt_http_get, get_robotstxts, is_valid_robotstxt, on_client_error_default, on_domain_change_default, on_file_type_mismatch_default, on_not_found_default, on_redirect_default, on_server_error_default, on_sub_domain_change_default, on_suspect_content_default, parse_robotstxt, paths_allowed, request_handler_handler, robotstxt, rt_last_http, rt_request_handler
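Among the exports, `parse_robotstxt()` works entirely offline: it takes the raw text of a robots.txt file and splits it into its component fields. A small sketch, using made-up rules:

```r
library(robotstxt)

# an inline robots.txt (illustrative content, not from a real site)
txt <- paste(
  "User-agent: *",
  "Disallow: /private/",
  "Sitemap: https://example.com/sitemap.xml",
  sep = "\n"
)

rules <- parse_robotstxt(txt)
rules$permissions   # data frame of Allow/Disallow rules per user agent
rules$sitemap       # any Sitemap: entries found in the file
```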
Dependencies: askpass, cli, codetools, curl, digest, future, future.apply, globals, glue, httr, jsonlite, lifecycle, listenv, magrittr, mime, openssl, parallelly, R6, Rcpp, rlang, spiderbar, stringi, stringr, sys, vctrs