Package 'mutagen'

Title: Extensions to dplyr's mutate
Description: Extensions to dplyr's mutate.
Authors: Gustavo Velásquez [aut, cre]
Maintainer: Gustavo Velásquez
License: MIT + file LICENSE
Version: 0.2.0
Built: 2025-10-26 01:40:20 UTC
Source: https://github.com/gvelasq/mutagen

Help Index


Generate list or list-column with NULLs replaced with NAs

Description

This function takes a list and replaces all NULL values with NA. It is useful for working with list-columns in a data frame.

Usage

gen_na_listcol(x)

Arguments

x

A list or list-column to modify.

Details

Parallelization is supported via purrr::in_parallel().

Value

A list with all NULL values replaced with NA.
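
The replacement described above can be sketched in base R. This is an illustration only, not mutagen's implementation, and the helper name `null_to_na` is hypothetical. It also shows why the replacement matters: NULL elements disappear when a list is flattened, while NA elements keep their position.

```r
# Hypothetical helper (not part of mutagen): replace NULL elements with NA.
null_to_na <- function(x) {
  stopifnot(is.list(x))
  lapply(x, function(el) if (is.null(el)) NA else el)
}

listcol <- list(NULL, "b", "c")
filled <- null_to_na(listcol)

# NULL elements vanish when a list is flattened; NA elements keep their slot.
length(unlist(listcol))  # 2
length(unlist(filled))   # 3
```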

Examples

library(dplyr, warn.conflicts = FALSE)
a <-
  mtcars %>%
  select(cyl, vs, am) %>%
  slice(1:6) %>%
  as_tibble() %>%
  mutate(listcol = list(NULL, "b", "c", "d", "e", "f"))
glimpse(a)
b <-
  a %>%
  mutate(across(starts_with("listcol"), gen_na_listcol))
glimpse(b)

Generate column percent

Description

This function calculates a column percent. The by argument calculates column percents within unique categories of grouping columns. The prop argument calculates a proportion rather than a percent.

Usage

gen_percent(data, col, by, prop = FALSE)

Arguments

data

A data frame.

col

<tidy-select> A single column with which to calculate a column percent.

by

An optional character vector of columns to group by.

prop

If TRUE, the result is returned as a proportion between 0 and 1 rather than a percent between 0 and 100. Default is FALSE.

Value

A double vector totaling 100 within col. If grouping columns are specified with by, the percents total 100 within each unique category of the grouping columns. If prop = TRUE, the vector instead totals 1 within col (or 1 within each unique category of the grouping columns specified with by).
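
The arithmetic behind this return value can be sketched in base R, assuming that a column percent is each value divided by the column total (or by the group total when grouping columns are given). The helper `col_percent` is hypothetical and is not mutagen's implementation. Unlike the documented function, it takes `col` and `by` as plain column-name strings rather than tidy-select expressions.

```r
# Hypothetical helper (not part of mutagen). `col` and `by` are plain
# column-name strings here, not tidy-select.
col_percent <- function(data, col, by = NULL, prop = FALSE) {
  x <- as.numeric(data[[col]])
  total <- if (is.null(by)) {
    rep(sum(x), length(x))
  } else {
    # ave() gives each row the sum of x within that row's group.
    ave(x, interaction(data[by], drop = TRUE), FUN = sum)
  }
  scale <- if (prop) 1 else 100
  scale * x / total
}

df <- data.frame(g = c("a", "a", "b", "b"), v = c(1, 3, 2, 2))
col_percent(df, "v")               # 12.5 37.5 25.0 25.0
col_percent(df, "v", by = "g")     # 25 75 50 50
col_percent(df, "v", prop = TRUE)  # 0.125 0.375 0.250 0.250
```

In this sketch the ungrouped result sums to 100, the grouped result sums to 100 within each level of `g`, and the `prop = TRUE` result sums to 1.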

Examples

library(dplyr, warn.conflicts = FALSE)
a <- as_tibble(mtcars)
gen_percent(a, gear)
b <-
  a %>%
  select(gear, cyl, carb) %>%
  arrange(gear, cyl, carb) %>%
  mutate(
    pct1 = gen_percent(., gear),
    pct2 = gen_percent(., gear, by = "cyl"),
    pct3 = gen_percent(., gear, by = c("cyl", "carb")),
    prop1 = gen_percent(., gear, prop = TRUE)
  )
b

Generate rowwise match of a set of values

Description

This function performs a rowwise match of a set of supplied values across columns in a data frame. If any of the row values equal one of the supplied values, this function returns an integer 1 (1L) for that row, otherwise it returns an integer 0 (0L).

Usage

gen_rowany(data, cols, values)

Arguments

data

A data frame.

cols

<tidy-select> Columns to search across.

values

A list of values to match.

Details

Parallelization is supported via purrr::in_parallel().

Value

A binary integer vector: 1L for each row in which any supplied value was matched, and 0L otherwise.
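
The 1L/0L encoding can be sketched in base R under simplifying assumptions: the hypothetical helper `row_any` below handles only columns of a single type and takes `values` as an atomic vector, whereas the documented function accepts a list and mixed-type columns. It relies on `%in%`, which treats NA as a matchable value. How mutagen itself performs the comparison is not stated here.

```r
# Hypothetical helper (not part of mutagen): same-type columns only,
# with `values` given as an atomic vector.
row_any <- function(data, values) {
  # One logical vector per column: TRUE where the cell equals a supplied value.
  hits <- lapply(data, function(col) col %in% values)
  # Combine the columns with OR, then encode TRUE/FALSE as 1L/0L.
  as.integer(Reduce(`|`, hits))
}

df <- data.frame(x = c(1, 2, 3), y = c(9, NA, 7))
row_any(df, values = c(1, NA))  # 1 1 0
row_any(df, values = c(5, 8))   # 0 0 0
```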

Examples

library(dplyr, warn.conflicts = FALSE)
a <- tibble(
  x = 1:3,
  y = rep(NA, 3),
  z = letters[1:3],
  aa = rep(FALSE, 3)
)
val <- list(1, NA, "a", FALSE)
val2 <- list(5, NaN, "d", Inf)
gen_rowany(a, values = val)
b <- a %>%
  mutate(
    q = gen_rowany(., values = val),
    r = gen_rowany(., values = val2)
  )
b

Generate rowwise count of columns matching a set of values

Description

This function performs a rowwise count of columns in a data frame that match a set of supplied values.

Usage

gen_rowcount(data, cols, values)

Arguments

data

A data frame.

cols

<tidy-select> Columns to search across.

values

A list of values to match.

Details

Parallelization is supported via purrr::in_parallel().

Value

An integer vector with the number of matched values.
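
A rowwise count of this kind can be sketched in base R by summing per-column match indicators. The helper `row_count` is hypothetical, is not mutagen's implementation, and is limited to same-type columns with `values` given as an atomic vector.

```r
# Hypothetical helper (not part of mutagen): same-type columns only,
# with `values` given as an atomic vector.
row_count <- function(data, values) {
  # One logical vector per column: TRUE where the cell equals a supplied value.
  hits <- lapply(data, function(col) col %in% values)
  # Adding logical vectors counts the TRUEs in each row.
  as.integer(Reduce(`+`, hits))
}

df <- data.frame(x = c(1, 2, 3), y = c(9, NA, 7), z = c(1, NA, 3))
counts <- row_count(df, values = c(1, NA))
counts                   # 2 2 0
as.integer(counts > 0)   # 1 1 0
```

The last line shows how a count relates to the any-match indicator described for gen_rowany(): a row matches if its count is greater than zero.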

Examples

library(dplyr, warn.conflicts = FALSE)
a <- tibble(
  x = 1:3,
  y = rep(NA, 3),
  z = letters[1:3],
  aa = rep(FALSE, 3)
)
val <- list(1, NA, "a", FALSE)
gen_rowcount(a, values = val)
gen_rowcount(a, everything(), values = val)
gen_rowcount(a, starts_with(letters[25:26]), values = val)
b <- a %>% mutate(q = gen_rowcount(., values = val))
b

Generate rowwise first nonmissing value

Description

This function returns the rowwise first nonmissing value in a data frame.

Usage

gen_rowfirst(data, cols)

Arguments

data

A data frame.

cols

<tidy-select> Columns to search across.

Details

Parallelization is supported via purrr::in_parallel().

Value

A vector of the rowwise first nonmissing values. The vector's type is the common type of all rowwise nonmissing values.
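
For columns of one type, a rowwise first nonmissing value is a left-to-right coalesce. The base R sketch below uses a hypothetical helper, `row_first`, and is not mutagen's implementation. It does not attempt the common-type conversion described above for mixed-type columns.

```r
# Hypothetical helper (not part of mutagen): coalesce columns left to right.
row_first <- function(data) {
  # Keep the accumulated value unless it is NA, then take the next column.
  Reduce(function(acc, col) ifelse(is.na(acc), col, acc), data)
}

a <- data.frame(
  x = c(1, NA, 2),
  y = c(NA, 3, NA),
  z = c(4, NA, 5)
)
row_first(a)  # 1 3 2
```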

Examples

library(dplyr, warn.conflicts = FALSE)
a <- tibble(
  x = c(1, NA, 2),
  y = c(NA, 3, NA),
  z = c(4, NA, 5)
)
gen_rowfirst(a)
gen_rowfirst(a, all_of(letters[25:26]))
b <- a %>% mutate(q = gen_rowfirst(.))
b
c <-
  a %>%
  mutate(w = c("a", TRUE, NA), .before = "x") %>%
  mutate(q = gen_rowfirst(.))
c # note that q is of type <chr>

Generate rowwise last nonmissing value

Description

This function returns the rowwise last nonmissing value in a data frame.

Usage

gen_rowlast(data, cols)

Arguments

data

A data frame.

cols

<tidy-select> Columns to search across.

Details

Parallelization is supported via purrr::in_parallel().

Value

A vector of the rowwise last nonmissing values. The vector's type is the common type of all rowwise nonmissing values.
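
One way to picture a rowwise last nonmissing value, for columns of one type, is to locate the last non-NA column in each row and index the data with it. The helper `row_last` is hypothetical and is not mutagen's implementation.

```r
# Hypothetical helper (not part of mutagen): numeric columns only.
row_last <- function(data) {
  m <- as.matrix(data)
  # Index of the last non-NA column in each row. A row that is entirely NA
  # resolves to the last column, whose value is NA.
  idx <- max.col(!is.na(m), ties.method = "last")
  m[cbind(seq_len(nrow(m)), idx)]
}

a <- data.frame(
  x = c(1, NA, 2),
  y = c(NA, 3, NA),
  z = c(4, NA, 5)
)
row_last(a)  # 4 3 5
```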

Examples

library(dplyr, warn.conflicts = FALSE)
a <- tibble(
  x = c(1, NA, 2),
  y = c(NA, 3, NA),
  z = c(4, NA, 5)
)
gen_rowlast(a)
gen_rowlast(a, all_of(letters[24:25]))
b <- a %>% mutate(q = gen_rowlast(.))
b
c <-
  a %>%
  mutate(aa = c("a", TRUE, NA), .after = "z") %>%
  mutate(q = gen_rowlast(.))
c # note that q is of type <chr>

Generate rowwise maximum value

Description

This function returns the rowwise maximum value in a data frame.

Usage

gen_rowmax(data, cols)

Arguments

data

A data frame.

cols

<tidy-select> Columns to search across.

Details

Parallelization is supported via purrr::in_parallel().

Value

A vector of the rowwise maximum values.
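
In base R, a rowwise maximum over the columns of a data frame can be computed with pmax(). The sketch below assumes missing values are skipped (`na.rm = TRUE`). This page does not state how gen_rowmax() treats them, so that choice is an assumption.

```r
a <- data.frame(x = c(1, NA, 2), y = c(NA, 3, NA), z = c(4, NA, 5))

# pmax() takes the elementwise (here: rowwise) maximum across its arguments;
# each column of `a` is passed as one argument.
row_max <- do.call(pmax, c(unname(as.list(a)), na.rm = TRUE))
row_max  # 4 3 5
```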

Examples

library(dplyr, warn.conflicts = FALSE)
a <- tibble(
  x = c(1, NA, 2),
  y = c(NA, 3, NA),
  z = c(4, NA, 5)
)
gen_rowmax(a)
gen_rowmax(a, everything())
gen_rowmax(a, starts_with(letters[24:25]))
b <- a %>% mutate(q = gen_rowmax(.))
b

Generate rowwise mean

Description

This function returns the rowwise arithmetic mean value in a data frame.

Usage

gen_rowmean(data, cols)

Arguments

data

A data frame.

cols

<tidy-select> Columns to search across.

Details

Parallelization is supported via purrr::in_parallel().

Value

A double vector of the rowwise arithmetic mean value. Missing values are ignored.
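
The statement that missing values are ignored corresponds to what `na.rm = TRUE` does in base R's rowMeans(). The sketch below is a base R comparison point, not mutagen's implementation.

```r
a <- data.frame(x = c(1, NA, 2), y = c(NA, 3, NA), z = c(4, NA, 5))

# Mean of the nonmissing values in each row.
row_mean <- unname(rowMeans(a, na.rm = TRUE))
row_mean  # 2.5 3.0 3.5

# Without na.rm = TRUE, every row containing an NA yields NA.
unname(rowMeans(a))  # NA NA NA
```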

Examples

library(dplyr, warn.conflicts = FALSE)
a <- tibble(
  x = c(1, NA, 2),
  y = c(NA, 3, NA),
  z = c(4, NA, 5)
)
gen_rowmean(a)
gen_rowmean(a, everything())
gen_rowmean(a, all_of(letters[25:26]))
b <- a %>% mutate(q = gen_rowmean(.))
b

Generate rowwise median

Description

This function returns the rowwise median value in a data frame.

Usage

gen_rowmedian(data, cols)

Arguments

data

A data frame.

cols

<tidy-select> Columns to search across.

Details

Parallelization is supported via purrr::in_parallel().

Value

A double vector of the rowwise median value. Missing values are ignored.
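
A rowwise median that ignores missing values can be sketched in base R by applying median() to each row. This is a comparison point only, not mutagen's implementation.

```r
a <- data.frame(x = c(1, NA, 2), y = c(2, 3, 2), z = c(4, NA, 5))

# Apply median() to each row, dropping NAs before the median is taken.
row_median <- unname(apply(as.matrix(a), 1, median, na.rm = TRUE))
row_median  # 2 3 2
```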

Examples

library(dplyr, warn.conflicts = FALSE)
a <- tibble(
  x = c(1, NA, 2),
  y = c(2, 3, 2),
  z = c(4, NA, 5)
)
gen_rowmedian(a)
gen_rowmedian(a, everything())
gen_rowmedian(a, all_of(letters[25:26]))
b <- a %>% mutate(q = gen_rowmedian(.))
b

Generate rowwise minimum value

Description

This function returns the rowwise minimum value in a data frame.

Usage

gen_rowmin(data, cols)

Arguments

data

A data frame.

cols

<tidy-select> Columns to search across.

Details

Parallelization is supported via purrr::in_parallel().

Value

A vector of the rowwise minimum values.
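
In base R, pmin() gives a rowwise minimum across columns. The sketch below assumes missing values are skipped and includes a row that is entirely missing to show what base R returns in that case. Whether gen_rowmin() behaves the same way for such rows is not stated on this page.

```r
d <- data.frame(x = c(1, NA, NA), y = c(NA, 3, NA), z = c(4, 2, NA))

# pmin() with na.rm = TRUE skips NAs; a row with no nonmissing values stays NA.
row_min <- do.call(pmin, c(unname(as.list(d)), na.rm = TRUE))
row_min  # 1 2 NA
```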

Examples

library(dplyr, warn.conflicts = FALSE)
a <- tibble(
  x = c(1, NA, 2),
  y = c(NA, 3, NA),
  z = c(4, NA, 5)
)
gen_rowmin(a)
gen_rowmin(a, everything())
gen_rowmin(a, starts_with(letters[25:26]))
b <- a %>% mutate(q = gen_rowmin(.))
b

Generate rowwise count of missing values

Description

This function returns the rowwise count of missing values in a data frame.

Usage

gen_rowmiss(data, cols)

Arguments

data

A data frame.

cols

<tidy-select> Columns to search across.

Details

Parallelization is supported via purrr::in_parallel().

Value

An integer vector of the rowwise count of missing values.
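
The count described above is equivalent to summing is.na() across each row. A base R sketch, not mutagen's implementation:

```r
a <- data.frame(x = c(1, NA, 2), y = c(NA, 3, NA), z = c(4, NA, 5))

# is.na() yields a logical matrix; rowSums() counts the TRUEs in each row.
row_miss <- as.integer(rowSums(is.na(a)))
row_miss  # 1 2 1
```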

Examples

library(dplyr, warn.conflicts = FALSE)
a <- tibble(
  x = c(1, NA, 2),
  y = c(NA, 3, NA),
  z = c(4, NA, 5)
)
gen_rowmiss(a)
gen_rowmiss(a, all_of(letters[25:26]))
b <- a %>% mutate(q = gen_rowmiss(.))
b

Generate rowwise nth nonmissing value

Description

This function returns the rowwise nth nonmissing value in a data frame.

Usage

gen_rownth(data, cols, n)

Arguments

data

A data frame.

cols

<tidy-select> Columns to search across.

n

An integer vector of length 1 that specifies the position of the rowwise nth nonmissing value to search for. A negative integer will index from the end.

Details

Parallelization is supported via purrr::in_parallel().

Value

A vector of the rowwise nth nonmissing values. The vector's type is the common type of all rowwise nonmissing values.
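
The indexing rule for n, including negative values that count from the end, can be sketched in base R for numeric columns. The helper `row_nth` is hypothetical and is not mutagen's implementation. Returning NA when a row has fewer than n nonmissing values is an assumption of this sketch.

```r
# Hypothetical helper (not part of mutagen): numeric columns only.
row_nth <- function(data, n) {
  stopifnot(length(n) == 1, n != 0)
  m <- as.matrix(data)
  vapply(seq_len(nrow(m)), function(i) {
    v <- m[i, ][!is.na(m[i, ])]              # nonmissing values in row i
    k <- if (n > 0) n else length(v) + n + 1 # negative n counts from the end
    # Assumption: NA when the row has fewer than |n| nonmissing values.
    if (k >= 1 && k <= length(v)) v[[k]] else NA_real_
  }, numeric(1))
}

a <- data.frame(
  x = c(1, NA, 2),
  y = c(NA, 3, NA),
  z = c(4, NA, 5)
)
row_nth(a, n = 1)   # 1 3 2
row_nth(a, n = 2)   # 4 NA 5
row_nth(a, n = -1)  # 4 3 5
```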

Examples

library(dplyr, warn.conflicts = FALSE)
a <- tibble(
  x = c(1, NA, 2),
  y = c(NA, 3, NA),
  z = c(4, NA, 5)
)
gen_rownth(a, n = 1)
gen_rownth(a, n = 2)
gen_rownth(a, all_of(letters[25:26]), n = 1)
b <- a %>% mutate(q = gen_rownth(., n = 1), r = gen_rownth(., n = 2))
b
c <-
  a %>%
  mutate(w = c("a", TRUE, NA), .before = "x") %>%
  mutate(q = gen_rownth(., n = 1), r = gen_rownth(., n = 2))
c # note that q and r are of type <chr>