Package: tabulapdf 1.0.5-5

Mauricio Vargas Sepulveda

tabulapdf: Extract Tables from PDF Documents

Bindings for the 'Tabula' <https://tabula.technology/> 'Java' library, which can extract tables from PDF files. This tool can reduce the time and effort spent on data extraction in fields like investigative journalism. It supports both automatic and manual table extraction; the manual mode uses a 'Shiny' interface in which the user selects the areas to extract with a computer mouse.

Authors: Thomas J. Leeper [aut], Mauricio Vargas Sepulveda [aut, cre], Tom Paskhalis [aut], Manuel Aristaran [ctb], David Gohel [ctb], Lincoln Mullen [ctb], Munk School of Global Affairs and Public Policy [fnd]

tabulapdf_1.0.5-5.tar.gz
tabulapdf_1.0.5-5.zip (r-4.5), tabulapdf_1.0.5-5.zip (r-4.4), tabulapdf_1.0.5-5.zip (r-4.3)
tabulapdf_1.0.5-5.tgz (r-4.5-any), tabulapdf_1.0.5-5.tgz (r-4.4-any), tabulapdf_1.0.5-5.tgz (r-4.3-any)
tabulapdf_1.0.5-5.tar.gz (r-4.5-noble), tabulapdf_1.0.5-5.tar.gz (r-4.4-noble)
tabulapdf_1.0.5-5.tgz (r-4.4-emscripten), tabulapdf_1.0.5-5.tgz (r-4.3-emscripten)
tabulapdf.pdf | tabulapdf.html
tabulapdf/json (API)
NEWS

# Install 'tabulapdf' in R:
install.packages('tabulapdf', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/ropensci/tabulapdf/issues

Pkgdown site: https://docs.ropensci.org

Uses libs:
  • openjdk: OpenJDK Java runtime, using Hotspot JIT

java, pdf, pdf-document, peer-reviewed, ropensci, tabula, tabular-data, openjdk

10.07 score, 552 stars, 1 package, 159 scripts, 1.6k downloads, 11 exports, 27 dependencies

Last updated 2 months ago from: 03cabea1c4 (on main). Checks: 9 OK. Indexed: yes.

Target            Result   Latest binary
Doc / Vignettes   OK       Mar 04 2025
R-4.5-win         OK       Mar 04 2025
R-4.5-mac         OK       Feb 02 2025
R-4.5-linux       OK       Mar 04 2025
R-4.4-win         OK       Mar 04 2025
R-4.4-mac         OK       Mar 04 2025
R-4.4-linux       OK       Mar 04 2025
R-4.3-win         OK       Mar 04 2025
R-4.3-mac         OK       Mar 04 2025

Exports: extract_areas, extract_metadata, extract_tables, extract_text, get_n_pages, get_page_dims, locate_areas, make_thumbnails, merge_pdfs, split_pdf, stop_logging

Dependencies: bit, bit64, cli, clipr, cpp11, crayon, fansi, glue, hms, lifecycle, magrittr, pillar, pkgconfig, png, prettyunits, progress, R6, readr, rJava, rlang, tibble, tidyselect, tzdb, utf8, vctrs, vroom, withr

Introduction to tabulapdf

Rendered from tabulapdf.Rmd using knitr::rmarkdown on Mar 04 2025.

Last update: 2024-09-19
Started: 2024-04-11

Readme and manuals

Help Manual

Help page                    Topics
tabulapdf                    tabulapdf-package, tabulapdf
extract_metadata             extract_metadata
extract_tables               extract_tables
extract_text                 extract_text
Page length and dimensions   get_n_pages, get_page_dims
extract_areas                extract_areas, locate_areas
make_thumbnails              make_thumbnails
Split and merge PDFs         merge_pdfs, split_pdf
rJava logging                stop_logging