Package: git2rdata 0.5.2

Thierry Onkelinx

git2rdata: Store and Retrieve Data.frames in a Git Repository

The git2rdata package is an R package for writing and reading dataframes as plain text files. A metadata file stores important information.

1) Storing metadata allows git2rdata to maintain the classes of variables. By default, git2rdata optimizes the data for file storage. The optimization is most effective on data containing factors, but it makes the data less human readable. Users can turn the optimization off when they prefer a human readable format over smaller files. Details on the implementation are available in vignette("plain_text", package = "git2rdata").

2) Storing metadata also allows smaller row based diffs between two consecutive commits. This is a useful feature when storing data as plain text files under version control. Details on this part of the implementation are available in vignette("version_control", package = "git2rdata"). Although we envisioned git2rdata with a git workflow in mind, you can use it in combination with other version control systems like subversion or mercurial.

3) git2rdata is a useful tool in a reproducible and traceable workflow. vignette("workflow", package = "git2rdata") gives a toy example.

4) vignette("efficiency", package = "git2rdata") provides some insight into the efficiency of file storage, git repository size and speed for writing and reading.
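As a minimal sketch of the write and read cycle described above, the following stores a small dataframe once with the default optimization and once in the verbose format, then reads both back. The dataframe, the file names `demo` and `demo_verbose`, and the temporary directory are assumptions made for illustration; only `write_vc()`, `read_vc()` and their `root`, `sorting` and `optimize` arguments come from the package.

```r
library(git2rdata)

# Hypothetical scratch directory used as the root of the data store.
root <- tempfile("git2rdata-demo")
dir.create(root)

# Illustrative data: an integer key, a factor and an integer count.
df <- data.frame(
  id = 1:4,
  group = factor(c("a", "b", "a", "b")),
  count = c(10L, 20L, 30L, 40L)
)

# Default: optimized storage (smaller files, less human readable).
# `sorting` fixes the row order so that diffs between versions stay small.
write_vc(df, file = "demo", root = root, sorting = "id")

# Verbose storage: trades file size for a more human readable file.
write_vc(
  df, file = "demo_verbose", root = root, sorting = "id", optimize = FALSE
)

# Both variants restore the variable classes from the metadata file.
optimized <- read_vc("demo", root = root)
verbose <- read_vc("demo_verbose", root = root)
str(optimized)

# Inspect which data and metadata files were written.
list.files(root)
```

Comparing the two data files in a text editor shows the trade-off between file size and readability; in both cases `group` comes back as a factor.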

Authors: Thierry Onkelinx [aut, cre], Floris Vanderhaeghe [ctb], Peter Desmet [ctb], Els Lommelen [ctb], Research Institute for Nature and Forest [cph, fnd]

git2rdata_0.5.2.tar.gz
git2rdata_0.5.2.zip (r-4.7) | git2rdata_0.5.2.zip (r-4.6) | git2rdata_0.5.2.zip (r-4.5)
git2rdata_0.5.2.tgz (r-4.6-any) | git2rdata_0.5.2.tgz (r-4.5-any)
git2rdata_0.5.2.tar.gz (r-4.7-any) | git2rdata_0.5.2.tar.gz (r-4.6-any)
git2rdata_0.5.2.tgz (r-4.6-emscripten)
manual.pdf | manual.html
DESCRIPTION | NEWS
card.svg | card.png
git2rdata/json (API)

# Install 'git2rdata' in R:
install.packages('git2rdata', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))

Reviews: rOpenSci Software Review #263

Bug tracker: https://github.com/ropensci/git2rdata/issues

Pkgdown/docs site: https://ropensci.github.io

reproducible-research | version-control

10.30 score | 104 stars | 4 packages | 249 scripts | 660 downloads | 21 exports | 3 dependencies

Last updated from: 513d2363d5 (on main). Checks: 10 OK. Indexed: yes.

Target | Result | Time
linux-devel-x86_64 | OK | 184
pkgdown docs | OK | 222
source / vignettes | OK | 221
linux-release-x86_64 | OK | 148
macos-release-arm64 | OK | 78
macos-oldrel-arm64 | OK | 98
windows-devel | OK | 89
windows-release | OK | 89
windows-oldrel | OK | 90
wasm-release | OK | 113

Exports: commit, data_package, display_metadata, is_git2rdata, is_git2rmeta, list_data, meta, prune_meta, pull, push, read_vc, recent_commit, relabel, rename_variable, repository, rm_data, status, update_metadata, upgrade_data, verify_vc, write_vc

Dependencies: assertthat, git2r, yaml

Creating data packages
Introduction | Basic usage | Package contents | Schema information | Important notes | CSV format required | Metadata integration | Recursive search | Use cases | Data sharing | Data validation | Data catalogs | See also

Last update: 2026-04-09
Started: 2026-04-09

Using the convert argument
Introduction | Basic usage | Example: case conversion | Multiple columns | Use cases | Unsupported data type | Storage optimization | Data standardization | Important notes | Limitations

Last update: 2026-04-09
Started: 2026-04-09

Adding metadata
Introduction | Reading metadata | Updating the optional metadata

Last update: 2025-12-10
Started: 2024-09-06

Efficiency Relative to Storage and Time
Introduction | Data Storage | On a File System | In Git Repositories | Timings | Writing Data | Reading Data

Last update: 2025-12-10
Started: 2019-02-26

Suggested Workflow for Storing a Variable Set of Dataframes under Version Control
Introduction | Setup | Structuring Git2rdata Objects Within a Project | Storing Dataframes ad Hoc into a Git Repository | First Commit | Second Commit | Third Commit | Scripted Workflow for Storing Dataframes | R Package Workflow for Storing Dataframes | Analysis Workflow with Reproducible Data | Long running analysis

Last update: 2025-12-10
Started: 2019-02-26

Getting Started Storing Dataframes as Plain Text
Introduction | Maintaining Variable Classes | Efficiency Relative to Storage and Time | Optimizing File Storage | Optimized for Version Control | Basic Usage | Storing Optimized | Storing Verbose | Efficiency Relative to File Storage | Reading Data | Missing Values

Last update: 2024-09-06
Started: 2019-02-26

Optimizing Storage for Version Control
Introduction | Setup | Assumptions | Sorting Observations | Sorting Variables | Handling Factors Optimized | Relabelling a Factor

Last update: 2024-09-06
Started: 2019-02-26

Storing Large Dataframes
Introduction | When to Split the Dataframe

Last update: 2022-03-17
Started: 2021-01-13