| Title: | Recodes Sex/Gender Descriptions into a Standard Set |
|---|---|
| Description: | Provides dictionary-based tools for recoding free-text gender responses into consistent categories while preserving gender diversity where possible. The package standardises spelling, capitalization, whitespace, and common variants through curated named character-vector dictionaries, supports either detailed or collapsed output categories, and can retain original unmatched responses for manual review. It also includes helpers for creating custom dictionaries from approximate string matches and a local interactive application for recoding uploaded data files. |
| Authors: | Yaoxiang Li [aut, cre] (ORCID: <https://orcid.org/0000-0001-9200-1016>), Jennifer Beaudry [aut] (ORCID: <https://orcid.org/0000-0003-1596-6708>), Emily Kothe [aut] (ORCID: <https://orcid.org/0000-0003-1210-0554>), Felix Singleton Thorn [aut] (ORCID: <https://orcid.org/0000-0002-0237-6146>), Rhydwyn McGuire [aut], Nicholas Tierney [aut] (ORCID: <https://orcid.org/0000-0003-1460-8722>), Mathew Ling [aut] (ORCID: <https://orcid.org/0000-0002-0940-2538>), Julia Silge [rev] (Julia reviewed the package (v. 0.0.0.9000) for rOpenSci, see <https://github.com/ropensci/software-review/issues/435>), Elin Waring [rev] (Elin reviewed the package (v. 0.0.0.9000) for rOpenSci, see <https://github.com/ropensci/software-review/issues/435>) |
| Maintainer: | Yaoxiang Li <[email protected]> |
| License: | GPL-2 |
| Version: | 0.1.1 |
| Built: | 2026-05-12 23:00:21 UTC |
| Source: | https://github.com/ropensci/gendercoder |
gender_create_dictionary suggests dictionary entries for gender
responses that are not already matched exactly. The returned named character
vector is intended to be reviewed before it is combined with a built-in
dictionary and passed to recode_gender().
gender_create_dictionary(
  gender,
  dictionary = gendercoder::manylevels_en,
  max_distance = 1
)
| Argument | Description |
|---|---|
| gender | a character vector of gender responses for recoding |
| dictionary | a character vector whose names are known gender responses and whose values are replacement values |
| max_distance | maximum edit distance allowed for a suggested match |
a named character vector of suggested replacement values
suggested <- gender_create_dictionary(
  c("maile", "unknown"),
  dictionary = manylevels_en,
  max_distance = 1
)
suggested
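To make the idea of an edit-distance suggestion concrete, here is a minimal base R sketch. It is not the package's implementation: `toy_dictionary` and `suggest_entries()` are hypothetical names introduced only for illustration, and the lowercasing and whitespace trimming are assumptions about how responses might be normalised before matching. It relies only on `utils::adist()`, which computes generalized Levenshtein distances.

```r
# Toy dictionary: names are known responses, values are replacements.
# Illustrative stand-in only, not gendercoder::manylevels_en.
toy_dictionary <- c(male = "man", female = "woman", enby = "non-binary")

# Hypothetical helper: for each response without an exact match, suggest the
# replacement of the closest dictionary name within max_distance edits.
suggest_entries <- function(gender, dictionary, max_distance = 1) {
  responses <- unique(tolower(trimws(gender)))
  unmatched <- setdiff(responses, names(dictionary))
  suggestions <- character(0)
  for (response in unmatched) {
    # Edit distance from this response to every known dictionary name
    distances <- utils::adist(response, names(dictionary))[1, ]
    best <- which.min(distances)
    if (distances[best] <= max_distance) {
      suggestions[response] <- unname(dictionary[best])
    }
  }
  suggestions
}

suggested <- suggest_entries(c("maile", "unknown"), toy_dictionary)
suggested  # "maile" is one edit from "male"; "unknown" has no close match

# After manual review, append the accepted suggestions to the dictionary.
combined <- c(toy_dictionary, suggested)
```

Returning a named character vector keeps the suggestions in the same shape as the dictionary itself, so reviewed entries can be appended with `c()` before recoding.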
Provides dictionaries and recode_gender() to allow for easy automatic coding of common variations in free-text responses to the question "What is your gender?"
Maintainer: Yaoxiang Li (ORCID)
Authors:
Jennifer Beaudry (ORCID)
Emily Kothe (ORCID)
Felix Singleton Thorn (ORCID)
Rhydwyn McGuire
Nicholas Tierney (ORCID)
Mathew Ling (ORCID)
Other contributors:
Julia Silge (Julia reviewed the package (v. 0.0.0.9000) for rOpenSci, see <https://github.com/ropensci/software-review/issues/435>) [reviewer]
Elin Waring (Elin reviewed the package (v. 0.0.0.9000) for rOpenSci, see <https://github.com/ropensci/software-review/issues/435>) [reviewer]
Useful links:
Report bugs at https://github.com/ropensci/gendercoder/issues
Code data interactively in a Shiny app that runs locally in RStudio or a web browser using a bs4Dash interface. The app supports CSV, Stata, SPSS, RDS, and R data files. Stata and SPSS files require the optional haven package.
gendercoder_app(...)
| Argument | Description |
|---|---|
| ... | arguments to pass to |
Called for its side effect of launching a Shiny app.
if (interactive()) {
  gendercoder_app()
}
recode_gender matches uncleaned gender responses to a cleaned list using
a built-in or custom dictionary.
recode_gender(
  gender,
  dictionary = gendercoder::manylevels_en,
  retain_unmatched = FALSE
)
| Argument | Description |
|---|---|
| gender | a character vector of gender responses for recoding |
| dictionary | a list that contains gender responses and their replacement values. A built-in dictionary |
| retain_unmatched | logical indicating whether gender responses that are not found in dictionary should be filled with the uncleaned values during recoding |
a character vector of recoded genders
df <- data.frame(
  stringsAsFactors = FALSE,
  gender = c("male", "MALE", "mle", "I am male", "femail", "female", "enby"),
  age = c(34L, 37L, 77L, 52L, 68L, 67L, 83L)
)
df$recoded_gender <- recode_gender(
  df$gender,
  dictionary = manylevels_en,
  retain_unmatched = TRUE
)
df
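The effect of retain_unmatched can be illustrated with a small base R sketch of a named-vector lookup. This is a hedged illustration rather than the package's code: `toy_dictionary` and `recode_sketch()` are hypothetical, the dictionary values are invented for the example, and normalising with `tolower()` and `trimws()` is an assumption about how capitalization and whitespace might be handled.

```r
# Toy dictionary: names are cleaned responses, values are output categories.
# Illustrative stand-in only, not gendercoder::manylevels_en.
toy_dictionary <- c(
  male = "man", mle = "man",
  female = "woman", femail = "woman",
  enby = "non-binary"
)

# Hypothetical helper mimicking a dictionary lookup with optional retention
# of the original text for responses that have no dictionary entry.
recode_sketch <- function(gender, dictionary, retain_unmatched = FALSE) {
  cleaned <- tolower(trimws(gender))
  # Indexing by name yields NA for names missing from the dictionary
  recoded <- unname(dictionary[cleaned])
  if (retain_unmatched) {
    unmatched <- is.na(recoded)
    recoded[unmatched] <- gender[unmatched]
  }
  recoded
}

responses <- c("male", "MALE", "mle", "I am male", "femail", "female", "enby")
dropped <- recode_sketch(responses, toy_dictionary)
kept <- recode_sketch(responses, toy_dictionary, retain_unmatched = TRUE)

dropped[4]  # NA: "I am male" has no dictionary entry
kept[4]     # "I am male": the original response is kept for manual review
```

Keeping the original text for unmatched responses makes them easy to find afterwards, for example to add them to a custom dictionary.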