Package: robotstxt 0.7.15.9000
robotstxt: A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker
Provides functions to download and parse 'robots.txt' files. The package makes it easy to check whether bots (spiders, crawlers, scrapers, ...) are allowed to access specific resources on a domain.
Authors:
robotstxt_0.7.15.9000.tar.gz
robotstxt_0.7.15.9000.zip (r-4.5), robotstxt_0.7.15.9000.zip (r-4.4), robotstxt_0.7.15.9000.zip (r-4.3)
robotstxt_0.7.15.9000.tgz (r-4.4-any), robotstxt_0.7.15.9000.tgz (r-4.3-any)
robotstxt_0.7.15.9000.tar.gz (r-4.5-noble), robotstxt_0.7.15.9000.tar.gz (r-4.4-noble)
robotstxt_0.7.15.9000.tgz (r-4.4-emscripten), robotstxt_0.7.15.9000.tgz (r-4.3-emscripten)
robotstxt.pdf | robotstxt.html
robotstxt/json (API)
NEWS
# Install 'robotstxt' in R:
install.packages('robotstxt', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
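Once the package is installed, the permission check named in the description can be sketched roughly as follows. This is a minimal illustration, not the package's documented workflow: the robots.txt rules and the paths are made up for the example, and the rules are passed in as text so that nothing is downloaded.

```r
library(robotstxt)

# Hypothetical robots.txt content, supplied as text so no download is needed.
rules <- paste(
  "User-agent: *",
  "Disallow: /private/",
  sep = "\n"
)

# Build a robotstxt object from the text rather than from a live domain.
rt <- robotstxt(text = rules)

# Ask whether any bot ("*") may fetch each of the two example paths.
allowed <- rt$check(paths = c("/", "/private/data.html"), bot = "*")
print(allowed)
```

With these rules the first path should be reported as allowed and the second as disallowed. To check a live site instead, `paths_allowed(paths = "/", domain = "example.com", bot = "*")` downloads and evaluates that domain's robots.txt; `example.com` is only a placeholder here.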
Bug tracker: https://github.com/ropensci/robotstxt/issues
Pkgdown: https://docs.ropensci.org
crawler, peer-reviewed, robotstxt, scraper, spider, webscraping
Last updated 30 days ago from: d3d0a4d525 (on main). Checks: OK: 7. Indexed: yes.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Dec 15 2024 |
R-4.5-win | OK | Dec 15 2024 |
R-4.5-linux | OK | Dec 15 2024 |
R-4.4-win | OK | Dec 15 2024 |
R-4.4-mac | OK | Dec 15 2024 |
R-4.3-win | OK | Dec 15 2024 |
R-4.3-mac | OK | Dec 15 2024 |
Exports: %>%, get_robotstxt, get_robotstxt_http_get, get_robotstxts, is_valid_robotstxt, on_client_error_default, on_domain_change_default, on_file_type_mismatch_default, on_not_found_default, on_redirect_default, on_server_error_default, on_sub_domain_change_default, on_suspect_content_default, parse_robotstxt, paths_allowed, request_handler_handler, robotstxt, rt_last_http, rt_request_handler
Dependencies: askpass, cli, codetools, curl, digest, future, future.apply, globals, glue, httr, jsonlite, lifecycle, listenv, magrittr, mime, openssl, parallelly, R6, Rcpp, rlang, spiderbar, stringi, stringr, sys, vctrs