Package: robotstxt 0.7.15.9000
robotstxt: A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker
Provides functions to download and parse 'robots.txt' files. Ultimately, the package makes it easy to check whether bots (spiders, crawlers, scrapers, ...) are allowed to access specific resources on a domain.
Authors:
Downloads:
robotstxt_0.7.15.9000.tar.gz
robotstxt_0.7.15.9000.zip (r-4.5, r-4.4, r-4.3)
robotstxt_0.7.15.9000.tgz (r-4.4-any, r-4.3-any)
robotstxt_0.7.15.9000.tar.gz (r-4.5-noble, r-4.4-noble)
robotstxt_0.7.15.9000.tgz (r-4.4-emscripten, r-4.3-emscripten)
robotstxt.pdf | robotstxt.html
robotstxt/json (API)
NEWS
# Install 'robotstxt' in R:
install.packages('robotstxt', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
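
After installing, a minimal sketch of the core permission check (the domain and paths below are illustrative, and argument defaults may differ between package versions):

library(robotstxt)

# Ask whether the generic bot '*' may fetch specific paths on a domain;
# returns one logical value per path
paths_allowed(
  paths  = c("/api/", "/images/"),
  domain = "wikipedia.org",
  bot    = "*"
)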
Bug tracker: https://github.com/ropensci/robotstxt/issues
crawler, peer-reviewed, robotstxt, scraper, spider, webscraping
Last updated 23 hours ago from d3d0a4d525 (on main). Checks: OK: 7. Indexed: yes.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Nov 15 2024 |
R-4.5-win | OK | Nov 15 2024 |
R-4.5-linux | OK | Nov 15 2024 |
R-4.4-win | OK | Nov 15 2024 |
R-4.4-mac | OK | Nov 15 2024 |
R-4.3-win | OK | Nov 15 2024 |
R-4.3-mac | OK | Nov 15 2024 |
Exports: %>%, get_robotstxt, get_robotstxt_http_get, get_robotstxts, is_valid_robotstxt, on_client_error_default, on_domain_change_default, on_file_type_mismatch_default, on_not_found_default, on_redirect_default, on_server_error_default, on_sub_domain_change_default, on_suspect_content_default, parse_robotstxt, paths_allowed, request_handler_handler, robotstxt, rt_last_http, rt_request_handler
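
The exported functions can also be combined for finer-grained work: get_robotstxt() downloads the file, parse_robotstxt() turns its text into structured data, and robotstxt() wraps both in an object with a check() method. A sketch under the same assumptions as above (illustrative domain, network access available):

library(robotstxt)

# Download a robots.txt file and parse it into its components
# (user agents, permission rules, sitemaps, ...)
txt    <- get_robotstxt(domain = "wikipedia.org")
parsed <- parse_robotstxt(txt)

# Alternatively, build a robotstxt object that caches the parsed file
# and exposes check() for repeated permission lookups
rt <- robotstxt(domain = "wikipedia.org")
rt$check(paths = c("/api/", "/images/"), bot = "*")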
Dependencies: askpass, cli, codetools, curl, digest, future, future.apply, globals, glue, httr, jsonlite, lifecycle, listenv, magrittr, mime, openssl, parallelly, R6, Rcpp, rlang, spiderbar, stringi, stringr, sys, vctrs