Execute a dplyr operation only if the column exists

Drawing on the discussion of conditional dplyr evaluation, I would like to conditionally execute a step in a pipeline depending on whether the referenced column exists in the data frame passed to it.

Example

The results created with 1) and 2) should be identical.

Existing column

# 1)
mtcars %>% 
  filter(am == 1) %>%
  filter(cyl == 4)

# 2)
mtcars %>%
  filter(am == 1) %>%
  {
    if("cyl" %in% names(.)) filter(cyl == 4) else .
  }

Unavailable column

# 1)
mtcars %>% 
  filter(am == 1)

# 2)    
mtcars %>%
  filter(am == 1) %>%
  {
    if("absent_column" %in% names(.)) filter(absent_column == 4) else .
  }

Problem

When the column is available, the object passed on does not correspond to the initial data frame. The code above returns the error message:

Error in filter(cyl == 4) : object 'cyl' not found

I tried an alternative syntax (with no luck):

mtcars %>%
  filter(am == 1) %>%
  {
    if("cyl" %in% names(.)) filter(.$cyl == 4) else .
  }

Error in UseMethod("filter_") : 
  no applicable method for 'filter_' applied to an object of class "logical"

Follow-up

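I would like to extend the question to also cover the comparison on the right-hand side of the == in the filter call. For instance, with a column that does not exist:

mtcars %>%
  filter({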
    if ("does_not_ex" %in% names(.))
      does_not_ex
    else
      NULL
  } == {
    if ("does_not_ex" %in% names(.))
      unique(.[['does_not_ex']])
    else
      NULL
  })

As expected, the call fails with an error message:

Error in filter_impl(.data, quo) : Result must have length 32, not 0

When applied to an existing column:

mtcars %>%
  filter({
    if ("mpg" %in% names(.))
      mpg
    else
      NULL
  } == {
    if ("mpg" %in% names(.))
      unique(.[['mpg']])
    else
      NULL
  })

it returns a single row together with a warning message:

  mpg cyl disp  hp drat   wt  qsec vs am gear carb
1  21   6  160 110  3.9 2.62 16.46  0  1    4    4

Warning message:
In { : longer object length is not a multiple of shorter object length

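Is there a neat way to extend the existing syntax so that the right-hand side of the filter call is also evaluated conditionally, ideally while staying within the dplyr workflow?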


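Because of the way scoping works here, the data frame is not passed implicitly to the filter call inside your if statement. Fortunately, you don't need it to be.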

Try:

mtcars %>%
  filter(am == 1) %>%
  filter({if("cyl" %in% names(.)) cyl else NULL} == 4)

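Here the . object is available inside the condition, so you can check whether the column exists and, if it does, return the column to filter.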

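EDIT: as docendo discimus pointed out in the comments, you can access the data frame inside the braces, just not implicitly: you have to reference it explicitly with the dot, e.g. filter(., cyl == 4).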
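
Passing the data frame explicitly as . inside the braces can be sketched as follows. This is a minimal sketch, assuming dplyr is attached together with the magrittr pipe; filter_if_present is a hypothetical helper introduced here for illustration and is not part of dplyr.

```r
library(dplyr)

# Inside braces, the pipe does not insert the data frame as the first
# argument of nested calls, so it is passed explicitly as `.`.
present <- mtcars %>%
  filter(am == 1) %>%
  {
    if ("cyl" %in% names(.)) filter(., cyl == 4) else .
  }

# Same pattern with a column that does not exist: the data passes through.
absent <- mtcars %>%
  filter(am == 1) %>%
  {
    if ("absent_column" %in% names(.)) filter(., absent_column == 4) else .
  }

# Hypothetical helper (an assumption, not a dplyr function): filter on
# `column == value` only when `column` is present in `data`.
filter_if_present <- function(data, column, value) {
  if (column %in% names(data)) {
    filter(data, .data[[column]] == value)
  } else {
    data
  }
}

helper_result <- mtcars %>%
  filter(am == 1) %>%
  filter_if_present("cyl", 4)
```

Wrapping the check in a function keeps the pipeline readable when several optional columns have to be handled, at the cost of passing the column name as a string.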


Source: https://habr.com/ru/post/1681620/

