Package: textreuse 0.1.5
textreuse: Detect Text Reuse and Document Similarity
Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
Authors:
textreuse_0.1.5.tar.gz
textreuse_0.1.5.zip (r-4.5), textreuse_0.1.5.zip (r-4.4), textreuse_0.1.5.zip (r-4.3)
textreuse_0.1.5.tgz (r-4.5-x86_64), textreuse_0.1.5.tgz (r-4.5-arm64), textreuse_0.1.5.tgz (r-4.4-x86_64), textreuse_0.1.5.tgz (r-4.4-arm64), textreuse_0.1.5.tgz (r-4.3-x86_64), textreuse_0.1.5.tgz (r-4.3-arm64)
textreuse_0.1.5.tar.gz (r-4.5-noble), textreuse_0.1.5.tar.gz (r-4.4-noble)
textreuse_0.1.5.tgz (r-4.4-emscripten), textreuse_0.1.5.tgz (r-4.3-emscripten)
textreuse.pdf | textreuse.html
textreuse/json (API)
NEWS
# Install 'textreuse' in R:
install.packages('textreuse', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
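Once the package is installed, the shingled n-gram tokenizer and the similarity functions named in the description can be tried on two short strings. This is a minimal sketch, not taken from the package documentation: the two sentences are invented for illustration, and the variable names `a`, `b`, `ngrams_a`, `ngrams_b`, and `sim` are assumptions of this example.

```r
library(textreuse)

# Two invented example sentences that differ by a single word
a <- "The quick brown fox jumps over the lazy dog"
b <- "The quick brown fox leaps over the lazy dog"

# Shingled word trigrams (n = 3); the tokenizer lowercases by default
ngrams_a <- tokenize_ngrams(a, n = 3)
ngrams_b <- tokenize_ngrams(b, n = 3)

# Jaccard similarity: shared shingles divided by all distinct shingles
sim <- jaccard_similarity(ngrams_a, ngrams_b)
print(sim)
```

For larger collections, comparing every pair of documents this way becomes expensive, which is the case the minhash and locality-sensitive hashing functions in the exports list are meant to address.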
Reviews: rOpenSci Software Review #20
Bug tracker: https://github.com/ropensci/textreuse/issues
Pkgdown site: https://docs.ropensci.org
On CRAN: textreuse-0.1.5 (2020-05-15)
Last updated 1 month ago from: 895b5ff299 (on master). Checks: 1 OK, 11 NOTE. Indexed: yes.
Target | Result | Latest binary |
---|---|---|
Doc / Vignettes | OK | Mar 16 2025 |
R-4.5-win-x86_64 | NOTE | Mar 16 2025 |
R-4.5-mac-x86_64 | NOTE | Mar 16 2025 |
R-4.5-mac-aarch64 | NOTE | Mar 16 2025 |
R-4.5-linux-x86_64 | NOTE | Mar 16 2025 |
R-4.4-win-x86_64 | NOTE | Mar 16 2025 |
R-4.4-mac-x86_64 | NOTE | Mar 16 2025 |
R-4.4-mac-aarch64 | NOTE | Mar 16 2025 |
R-4.4-linux-x86_64 | NOTE | Mar 16 2025 |
R-4.3-win-x86_64 | NOTE | Mar 16 2025 |
R-4.3-mac-x86_64 | NOTE | Mar 16 2025 |
R-4.3-mac-aarch64 | NOTE | Mar 16 2025 |
Exports: align_local, content, content<-, filenames, has_content, has_hashes, has_minhashes, has_tokens, hash_string, hashes, hashes<-, is.TextReuseCorpus, is.TextReuseTextDocument, jaccard_bag_similarity, jaccard_dissimilarity, jaccard_similarity, lsh, lsh_candidates, lsh_compare, lsh_probability, lsh_query, lsh_subset, lsh_threshold, meta, meta<-, minhash_generator, minhashes, minhashes<-, pairwise_candidates, pairwise_compare, ratio_of_matches, rehash, skipped, TextReuseCorpus, TextReuseTextDocument, tokenize, tokenize_ngrams, tokenize_sentences, tokenize_skip_ngrams, tokenize_words, tokens, tokens<-, wordcount
Dependencies: assertthat, BH, cli, cpp11, digest, dplyr, fansi, generics, glue, lifecycle, magrittr, NLP, pillar, pkgconfig, purrr, R6, Rcpp, RcppProgress, rlang, stringi, stringr, tibble, tidyr, tidyselect, utf8, vctrs, withr
Introduction to the textreuse package
Rendered from textreuse-introduction.Rmd using knitr::rmarkdown on Mar 16 2025. Last update: 2020-05-12. Started: 2015-10-22
Minhash and locality-sensitive hashing
Rendered from textreuse-minhash.Rmd using knitr::rmarkdown on Mar 16 2025. Last update: 2015-10-31. Started: 2015-10-22
Pairwise comparisons for document similarity
Rendered from textreuse-pairwise.Rmd using knitr::rmarkdown on Mar 16 2025. Last update: 2015-10-31. Started: 2015-10-22
Text Alignment
Rendered from textreuse-alignment.Rmd using knitr::rmarkdown on Mar 16 2025. Last update: 2015-10-22. Started: 2015-10-22