Changes in version 1.0.1                        

This release brings together several years of maintenance and feature
work to make textreuse easier to use on current R installations and more
practical for larger document collections.

This is a CRAN resubmission that fixes a moved README URL reported by
CRAN incoming checks.

Text input and corpus construction

  - TextReuseTextDocument() and TextReuseCorpus() now accept an encoding
    argument, making it easier to read source files whose text encoding
    is known or differs from the platform default.
  - TextReuseCorpus() now keeps skipped-document bookkeeping
    deterministic. Skipped documents are reported consistently, and skip
    metadata is available even when skip_short = FALSE.
  - Very short documents are handled more predictably when skip n-grams
    are used, avoiding assertion failures and making corpus construction
    easier to diagnose.

Alignment and match inspection

  - align_local() now returns an empty local alignment instead of
    throwing an error when two texts have no matching words. This makes
    batch alignment workflows easier to run because no-match pairs can
    be represented directly.
  - align_local() gains preserve_punctuation, allowing displayed
    alignments to keep punctuation from the original texts when that
    context is useful.
  - New count_matches() and matching_tokens() helpers expose absolute
    match counts and the matched tokens themselves, so users can inspect
    what drove a similarity score rather than relying only on a ratio.

Candidate generation and comparison

  - New token-index helpers find candidate document pairs from shared
    n-grams, giving users another way to identify likely reuse pairs
    before running more expensive comparisons.
  - pairwise_candidates() and matrix conversion now preserve all
    document IDs, including documents without returned candidate pairs.
  - as_sparse_matrix() provides a sparse matrix representation of
    candidate results, which is more convenient for downstream modeling,
    graph analysis, and workflows with many documents.

Locality-sensitive hashing

  - lsh_add() can add new documents to an existing LSH bucket cache, so
    users can extend an index without rebuilding it from scratch.
  - lsh_compare() can run comparisons in parallel on non-Windows
    platforms when options(mc.cores) is set.
  - Long-running C++ hashing and n-gram loops now check for user
    interrupts, so expensive jobs can be stopped more cleanly from R.

Compatibility and documentation

  - Compatibility with current dplyr and tidyr releases has been
    refreshed.
  - README, vignette, reference, and pkgdown examples were regenerated
    against current package output.
  - Stale external links and documentation badges were updated so
    package checks and the public documentation site are cleaner.

                 Changes in version 0.1.4 (2016-11-28)                  

  - Preventative maintenance release to avoid failing tests when new
    version of BH is released.

                 Changes in version 0.1.3 (2016-03-28)                  

  - Preventative maintenance release to avoid failing tests when new
    versions of the dplyr and testthat packages are released.

                 Changes in version 0.1.2 (2015-11-06)                  

  - Fix memory error in shingle_ngrams()
  - Fix tests for retokenizing on Windows
  - More informative error message if using lsh() on corpora without
    minhashes

                 Changes in version 0.1.1 (2015-11-04)                  

  - Fix progress bars in vignettes

                 Changes in version 0.1.0 (2015-10-31)                  

  - Initial release