Package: textreuse 1.0.1
textreuse: Detect Text Reuse and Document Similarity
Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
Authors:
Source: textreuse_1.0.1.tar.gz
Windows binaries: textreuse_1.0.1.zip (r-4.7), textreuse_1.0.1.zip (r-4.6), textreuse_1.0.1.zip (r-4.5)
macOS binaries: textreuse_1.0.1.tgz (r-4.6-x86_64), textreuse_1.0.1.tgz (r-4.6-arm64), textreuse_1.0.1.tgz (r-4.5-x86_64), textreuse_1.0.1.tgz (r-4.5-arm64)
Linux binaries: textreuse_1.0.1.tar.gz (r-4.7-arm64), textreuse_1.0.1.tar.gz (r-4.7-x86_64), textreuse_1.0.1.tar.gz (r-4.6-arm64), textreuse_1.0.1.tar.gz (r-4.6-x86_64)
WebAssembly binary: textreuse_1.0.1.tgz (r-4.6-emscripten)
manual.pdf | manual.html
card.svg | card.png
textreuse/json (API)
NEWS
| # Install 'textreuse' in R: |
| install.packages('textreuse', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org')) |
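Once the package is installed, the tokenizers and similarity functions named in the description can be tried on two short strings. The following is a minimal sketch rather than an excerpt from the package documentation: the two sentences and the variable names are made up for illustration, and it assumes the exported functions `tokenize_ngrams()`, `jaccard_similarity()`, and `jaccard_dissimilarity()` behave as their names suggest.

```r
library(textreuse)

# Two illustrative sentences that differ by a single word (made-up data).
a <- "The quick brown fox jumps over the lazy dog"
b <- "The quick brown fox leaps over the lazy dog"

# Shingled n-grams: overlapping runs of n = 3 consecutive words.
tokens_a <- tokenize_ngrams(a, n = 3)
tokens_b <- tokenize_ngrams(b, n = 3)

# Jaccard similarity: shared n-grams divided by all distinct n-grams.
sim <- jaccard_similarity(tokens_a, tokens_b)

# Jaccard dissimilarity is the complement of the similarity.
dis <- jaccard_dissimilarity(tokens_a, tokens_b)

print(sim)
print(dis)
```

Changing one word alters every trigram that contains it, so shorter shingles (a smaller `n`) will generally score near-duplicates as more similar than longer ones.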
Reviews: rOpenSci Software Review #20
Bug tracker: https://github.com/ropensci/textreuse/issues
Pkgdown/docs site: https://docs.ropensci.org
Last updated from: 6f8cbe3802 (on master). Checks: 14 OK. Indexed: yes.
| Target | Result | Time |
|---|---|---|
| linux-devel-arm64 | OK | 162 |
| linux-devel-x86_64 | OK | 180 |
| pkgdown docs | OK | 170 |
| source / vignettes | OK | 200 |
| linux-release-arm64 | OK | 181 |
| linux-release-x86_64 | OK | 154 |
| macos-release-arm64 | OK | 94 |
| macos-release-x86_64 | OK | 239 |
| macos-oldrel-arm64 | OK | 106 |
| macos-oldrel-x86_64 | OK | 246 |
| windows-devel | OK | 146 |
| windows-release | OK | 134 |
| windows-oldrel | OK | 120 |
| wasm-release | OK | 133 |
Exports: align_local, as_sparse_matrix, content, content<-, count_matches, filenames, has_content, has_hashes, has_minhashes, has_tokens, hash_string, hashes, hashes<-, is.TextReuseCorpus, is.TextReuseTextDocument, jaccard_bag_similarity, jaccard_dissimilarity, jaccard_similarity, lsh, lsh_add, lsh_candidates, lsh_compare, lsh_probability, lsh_query, lsh_subset, lsh_threshold, matching_tokens, meta, meta<-, minhash_generator, minhashes, minhashes<-, pairwise_candidates, pairwise_compare, ratio_of_matches, rehash, skipped, TextReuseCorpus, TextReuseTextDocument, token_index, token_index_candidates, tokenize, tokenize_ngrams, tokenize_sentences, tokenize_skip_ngrams, tokenize_words, tokens, tokens<-, wordcount
Dependencies: assertthat, BH, cli, cpp11, digest, dplyr, generics, glue, lattice, lifecycle, magrittr, Matrix, NLP, pillar, pkgconfig, purrr, R6, Rcpp, RcppProgress, rlang, stringi, stringr, tibble, tidyr, tidyselect, utf8, vctrs, withr
Introduction to the textreuse package
Rendered from textreuse-introduction.Rmd using knitr::rmarkdown on Jun 05 2026. Last update: 2026-05-05
Started: 2015-10-22
Minhash and locality-sensitive hashing
Rendered from textreuse-minhash.Rmd using knitr::rmarkdown on Jun 05 2026. Last update: 2026-05-05
Started: 2015-10-22
Pairwise comparisons for document similarity
Rendered from textreuse-pairwise.Rmd using knitr::rmarkdown on Jun 05 2026. Last update: 2026-05-05
Started: 2015-10-22
Text Alignment
Rendered from textreuse-alignment.Rmd using knitr::rmarkdown on Jun 05 2026. Last update: 2026-05-05
Started: 2015-10-22
