Add a random categorical variable to a dataset

Usage

add_var_random_cat(
  data,
  name = "random_cat",
  cat = LETTERS[1:6],
  prob,
  overwrite = TRUE,
  seed
)

Arguments

data

A dataset (data frame) to which the new variable is added

name

Name of new variable (as string)

cat

Vector of categories to draw from (default: the letters A to F)

prob

Vector of probabilities, one per category in cat (optional)

overwrite

Whether the new random variable may overwrite an existing variable of the same name in the dataset (TRUE or FALSE)

seed

Seed for random number generation (integer)

Value

Dataset containing new random variable
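
The arguments above map naturally onto base R's sample(). The following is a minimal sketch of that idea, not the package's actual implementation: add_random_cat_sketch is a hypothetical helper, it assumes categories are drawn with replacement (one per row), it uses NULL defaults for prob and seed instead of missing arguments, and it simply stops with an error when overwrite = FALSE and the variable already exists. The real function may behave differently in each of these respects.

```r
# Hypothetical helper for illustration only; not the package's implementation.
add_random_cat_sketch <- function(data, name = "random_cat",
                                  cat = LETTERS[1:6], prob = NULL,
                                  overwrite = TRUE, seed = NULL) {
  # Refuse to replace an existing column unless overwriting is allowed
  # (assumption: the real function may handle this case differently)
  if (!overwrite && name %in% names(data)) {
    stop("variable '", name, "' already exists in data")
  }
  # Fix the random number generator state so the draw is reproducible
  if (!is.null(seed)) set.seed(seed)
  # Draw one category per row, with replacement;
  # prob = NULL gives every category the same weight
  data[[name]] <- sample(cat, size = nrow(data), replace = TRUE, prob = prob)
  data
}

# Usage: weight "Version A" at 0.8 and "Version B" at 0.2, with a fixed seed
d1 <- add_random_cat_sketch(iris, cat = c("Version A", "Version B"),
                            prob = c(0.8, 0.2), seed = 42)
d2 <- add_random_cat_sketch(iris, cat = c("Version A", "Version B"),
                            prob = c(0.8, 0.2), seed = 42)

identical(d1$random_cat, d2$random_cat)  # same seed, same draw
table(d1$random_cat)                     # observed counts per category
```

Passing the same seed twice yields the same column, which is useful when a random grouping has to be reproducible across runs. Without a seed, each call produces a different draw.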

Examples

library(magrittr)
iris %>% add_var_random_cat() %>% head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species random_cat
#> 1          5.1         3.5          1.4         0.2  setosa          C
#> 2          4.9         3.0          1.4         0.2  setosa          E
#> 3          4.7         3.2          1.3         0.2  setosa          D
#> 4          4.6         3.1          1.5         0.2  setosa          A
#> 5          5.0         3.6          1.4         0.2  setosa          B
#> 6          5.4         3.9          1.7         0.4  setosa          B
iris %>% add_var_random_cat(name = "my_cat") %>% head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species my_cat
#> 1          5.1         3.5          1.4         0.2  setosa      F
#> 2          4.9         3.0          1.4         0.2  setosa      B
#> 3          4.7         3.2          1.3         0.2  setosa      B
#> 4          4.6         3.1          1.5         0.2  setosa      E
#> 5          5.0         3.6          1.4         0.2  setosa      C
#> 6          5.4         3.9          1.7         0.4  setosa      D
iris %>% add_var_random_cat(cat = c("Version A", "Version B")) %>% head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species random_cat
#> 1          5.1         3.5          1.4         0.2  setosa  Version A
#> 2          4.9         3.0          1.4         0.2  setosa  Version A
#> 3          4.7         3.2          1.3         0.2  setosa  Version A
#> 4          4.6         3.1          1.5         0.2  setosa  Version B
#> 5          5.0         3.6          1.4         0.2  setosa  Version A
#> 6          5.4         3.9          1.7         0.4  setosa  Version A
iris %>% add_var_random_cat(cat = c(1, 2, 3, 4, 5)) %>% head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species random_cat
#> 1          5.1         3.5          1.4         0.2  setosa          4
#> 2          4.9         3.0          1.4         0.2  setosa          2
#> 3          4.7         3.2          1.3         0.2  setosa          1
#> 4          4.6         3.1          1.5         0.2  setosa          2
#> 5          5.0         3.6          1.4         0.2  setosa          2
#> 6          5.4         3.9          1.7         0.4  setosa          1