Package: pangoling 1.0.1

Bruno Nicenboim

pangoling: Access to Large Language Model Predictions

Provides access to word predictability estimates using large language models (LLMs) based on 'transformer' architectures via integration with the 'Hugging Face' ecosystem. The package interfaces with pre-trained neural networks and supports both causal/auto-regressive LLMs (e.g., 'GPT-2'; Radford et al., 2019) and masked/bidirectional LLMs (e.g., 'BERT'; Devlin et al., 2019, <doi:10.48550/arXiv.1810.04805>) to compute the probability of words, phrases, or tokens given their linguistic context. By enabling a straightforward estimation of word predictability, the package facilitates research in psycholinguistics, computational linguistics, and natural language processing (NLP).
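To illustrate the interface, here is a minimal sketch using two of the exported functions listed further down (`causal_words_pred` and `masked_targets_pred`). The argument names shown are assumptions based on the export names; consult the package documentation for the actual signatures. Running this requires a configured Python backend and downloads a pre-trained model on first use.

```r
library(pangoling)

# Causal (GPT-2-style) predictability: log-probability of each word
# given the words that precede it.
sentence <- c("The", "apple", "doesn't", "fall", "far", "from", "the", "tree.")
causal_words_pred(x = sentence, model = "gpt2")

# Masked (BERT-style) predictability of a target word given its
# bidirectional context.
masked_targets_pred(
  prev_contexts  = "The apple doesn't fall far from the",
  targets        = "tree",
  after_contexts = ".",
  model          = "bert-base-uncased"
)
```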

Authors: Bruno Nicenboim [aut, cre], Chris Emmerly [ctb], Giovanni Cassani [ctb], Lisa Levinson [rev], Utku Turk [rev]

Downloads:
  • Source: pangoling_1.0.1.tar.gz
  • Windows binaries: pangoling_1.0.1.zip (r-4.5, r-4.4, r-4.3)
  • macOS binaries: pangoling_1.0.1.tgz (r-4.5-any, r-4.4-any, r-4.3-any)
  • Linux binaries: pangoling_1.0.1.tar.gz (r-4.5-noble, r-4.4-noble)
  • WebAssembly: pangoling_1.0.1.tgz (r-4.4-emscripten, r-4.3-emscripten)

Documentation: pangoling.pdf | pangoling.html
API: pangoling/json
NEWS

# Install 'pangoling' in R:
install.packages('pangoling', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
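Because the package interfaces with the 'Hugging Face' ecosystem through Python, a Python environment with the required libraries is also needed. The exports list below includes an `install_py_pangoling()` helper, which presumably sets this up; the call shown here is a sketch, and any arguments it accepts are documented in the package itself.

```r
# One-time setup of the Python dependencies used by pangoling
# (see ?install_py_pangoling for the actual options).
library(pangoling)
install_py_pangoling()

# installed_py_pangoling() is also exported, presumably to check
# whether the Python backend is already available.
installed_py_pangoling()
```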

Reviews: rOpenSci Software Review #575

Bug tracker: https://github.com/ropensci/pangoling/issues

Pkgdown site: https://docs.ropensci.org

Datasets:
  • df_jaeger14 - Self-Paced Reading Dataset on Chinese Relative Clauses
  • df_sent - Example dataset: Two word-by-word sentences

Keywords: nlp, psycholinguistics, transformers

4.90 score | 8 stars | 24 exports | 26 dependencies

Last updated 4 hours ago from 967d98b74e (on main). Checks: 4 OK, 5 NOTE. Indexed: yes.

Target           Result  Latest binary
Doc / Vignettes  OK      Mar 11 2025
R-4.5-win        OK      Mar 11 2025
R-4.5-mac        OK      Mar 11 2025
R-4.5-linux      OK      Mar 11 2025
R-4.4-win        NOTE    Mar 11 2025
R-4.4-mac        NOTE    Mar 11 2025
R-4.4-linux      NOTE    Mar 11 2025
R-4.3-win        NOTE    Mar 11 2025
R-4.3-mac        NOTE    Mar 11 2025

Exports: causal_config, causal_lp, causal_lp_mats, causal_next_tokens_pred_tbl, causal_next_tokens_tbl, causal_pred_mats, causal_preload, causal_targets_pred, causal_tokens_lp_tbl, causal_tokens_pred_lst, causal_words_pred, install_py_pangoling, installed_py_pangoling, masked_config, masked_lp, masked_preload, masked_targets_pred, masked_tokens_pred_tbl, masked_tokens_tbl, ntokens, perplexity_calc, set_cache_folder, tokenize_lst, transformer_vocab

Dependencies: cachem, cli, data.table, fastmap, glue, here, jsonlite, lattice, lifecycle, magrittr, Matrix, memoise, pillar, png, rappdirs, Rcpp, RcppTOML, reticulate, rlang, rprojroot, rstudioapi, tidyselect, tidytable, utf8, vctrs, withr

Troubleshooting the use of Python in R

Rendered from troubleshooting.Rmd using knitr::rmarkdown on Mar 11 2025.

Last update: 2025-03-11
Started: 2025-03-11

Using a BERT model to get the predictability of words in their context

Rendered from intro-bert.Rmd using knitr::rmarkdown on Mar 11 2025.

Last update: 2025-03-11
Started: 2025-03-11

Using a GPT-2 transformer model to get word predictability

Rendered from intro-gpt2.Rmd using knitr::rmarkdown on Mar 11 2025.

Last update: 2025-03-11
Started: 2025-03-11

Worked-out example: Surprisal from a causal (GPT) model as a cognitive processing bottleneck in reading

Rendered from example.Rmd using knitr::rmarkdown on Mar 11 2025.

Last update: 2025-03-11
Started: 2025-03-11