labelmachine is an R package that helps you assign meaningful labels to data sets. You can also manage your labels in so-called lama-dictionary files, which are yaml files. This makes it easy to reuse the same label translations across multiple projects that share a similar data structure.

Labeling your data can be easy!

## Installation

# Install release version from CRAN
install.packages("labelmachine")

# Install development version from GitHub
devtools::install_github('a-maldet/labelmachine', build_vignettes = TRUE)

## Concept

The label assignments are given in so-called translations (named character vectors), which work like recipes: they specify which original value is mapped onto which new label. The translations are collected in so-called lama_dictionary objects. These lama_dictionary objects are then used to translate your data frame variables.
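To make the idea of a translation concrete, here is a minimal sketch in plain base R. It does not use labelmachine and is not how the package is implemented; `apply_translation` is a hypothetical helper written only for illustration. It assumes the convention used in this README that the name `NA_` stands for missing values.

```r
# A translation: names are the original values, elements are the new labels.
# The name "NA_" is used for missing values, as in the examples in this README.
translation <- c(en = "English", ma = "Mathematics", NA_ = "other subjects")

# Hypothetical helper (not part of labelmachine): apply one translation
# to a vector with a simple lookup by name.
apply_translation <- function(x, translation) {
  keys <- as.character(x)
  keys[is.na(keys)] <- "NA_"           # route missing values to the NA_ entry
  labels <- unname(translation[keys])  # look up each original value by name
  # The factor levels follow the order of the translation
  factor(labels, levels = unique(unname(translation)))
}

subject <- c("en", "ma", "ma", "en", NA)
subject_new <- apply_translation(subject, translation)
print(subject_new)
```

A lama_dictionary can then be thought of as a named collection of such vectors, one per translation.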

## Usage

Let df be a data frame with marks and subjects that should be translated:

df <- data.frame(
  pupil_id = c(1, 1, 2, 2, 3),
  subject = c("en", "ma", "ma", "en", "en"),
  result = c(2, 1, 3, 2, NA),
  stringsAsFactors = FALSE
)
df
##   pupil_id subject result
## 1        1      en      2
## 2        1      ma      1
## 3        2      ma      3
## 4        2      en      2
## 5        3      en     NA

Create a lama_dictionary object holding the translations:

library(labelmachine)
dict <- new_lama_dictionary(
  subjects = c(en = "English", ma = "Mathematics", NA_ = "other subjects"),
  results = c("1" = "Excellent", "2" = "Satisfying", "3" = "Failed", NA_ = "Missed")
)
dict
##
## --- lama_dictionary ---
## Variable 'subjects':
##               en               ma              NA_
##        "English"    "Mathematics" "other subjects"
##
## Variable 'results':
##            1            2            3          NA_
##  "Excellent" "Satisfying"     "Failed"     "Missed"

Translate the data frame variables:

df_new <- lama_translate(
  df,
  dict,
  subject_new = subjects(subject),
  result_new = results(result)
)
str(df_new)
## 'data.frame':    5 obs. of  5 variables:
##  $ pupil_id   : num  1 1 2 2 3
##  $ subject    : chr  "en" "ma" "ma" "en" ...
##  $ result     : num  2 1 3 2 NA
##  $ subject_new: Factor w/ 3 levels "English","Mathematics",..: 1 2 2 1 1
##  $ result_new : Factor w/ 4 levels "Excellent","Satisfying",..: 2 1 3 2 4

## Highlights

labelmachine offers the following features:

• All types of variables can be translated: Logical, Numeric, Character, Factor
• When translating your variables, you may choose between keeping the current ordering or applying a new factor ordering to your variable.
• Assigning meaningful labels to missing values (NA) is no problem.
• Assigning NA to existing values is no problem.
• Merging two values into a single label is no problem.
• Transforming a data frame holding label assignment lists into a lama_dictionary is no problem.
• Manage your translations in yaml files in order to use the same translations in different projects sharing similar data.
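
Two of the features above, merging values into a single label and assigning NA to existing values, can be illustrated with a rough base R sketch. It deliberately avoids labelmachine functions, so it only shows the intended effect, not the package's actual syntax or implementation. The translation, the code `9`, and the variable names are hypothetical.

```r
# Hypothetical translation (for illustration only): two codes share one
# label, and one code is mapped to NA.
translation <- c(
  "1" = "Passed",
  "2" = "Passed",        # "1" and "2" are merged into a single label
  "3" = "Failed",
  "9" = NA_character_    # the existing value 9 becomes missing
)

result <- c(2, 1, 3, 9, 1)

# Look up each value by name, then build a factor whose levels follow
# the order of the translation (duplicates and NA dropped).
labels <- unname(translation[as.character(result)])
lvls <- unique(unname(translation[!is.na(translation)]))
result_new <- factor(labels, levels = lvls)

print(result_new)
```

In this sketch the merged codes end up in one factor level, and the code mapped to NA is treated as missing in the result.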