Column selection based on multiple attribute conditions in dplyr 0.7.0

I am trying to figure out how to efficiently select columns using dplyr::select_if. The starwars dataset in dplyr 0.7.0 is a good example for this:

> starwars
# A tibble: 87 x 13
                 name height  mass    hair_color  skin_color eye_color birth_year gender homeworld species     films  vehicles starships
                <chr>  <int> <dbl>         <chr>       <chr>     <chr>      <dbl>  <chr>     <chr>   <chr>    <list>    <list>    <list>
 1     Luke Skywalker    172    77         blond        fair      blue       19.0   male  Tatooine   Human <chr [5]> <chr [2]> <chr [2]>
 2              C-3PO    167    75          <NA>        gold    yellow      112.0   <NA>  Tatooine   Droid <chr [6]> <chr [0]> <chr [0]>
 3              R2-D2     96    32          <NA> white, blue       red       33.0   <NA>     Naboo   Droid <chr [7]> <chr [0]> <chr [0]>
 4        Darth Vader    202   136          none       white    yellow       41.9   male  Tatooine   Human <chr [4]> <chr [0]> <chr [1]>
 5        Leia Organa    150    49         brown       light     brown       19.0 female  Alderaan   Human <chr [5]> <chr [1]> <chr [0]>
 6          Owen Lars    178   120   brown, grey       light      blue       52.0   male  Tatooine   Human <chr [3]> <chr [0]> <chr [0]>
 7 Beru Whitesun lars    165    75         brown       light      blue       47.0 female  Tatooine   Human <chr [3]> <chr [0]> <chr [0]>
 8              R5-D4     97    32          <NA>  white, red       red         NA   <NA>  Tatooine   Droid <chr [1]> <chr [0]> <chr [0]>
 9  Biggs Darklighter    183    84         black       light     brown       24.0   male  Tatooine   Human <chr [1]> <chr [0]> <chr [1]>
10     Obi-Wan Kenobi    182    77 auburn, white        fair blue-gray       57.0   male   Stewjon   Human <chr [6]> <chr [1]> <chr [5]>

Now say that I would like to select the columns that are numeric. This works well:

library(dplyr)

starwars %>%
  select_if(is.numeric)

But what should I do if I want to choose based on several criteria? For example, maybe I need both numeric and character columns:

starwars %>%
  select_if(c(is.numeric, is.character))

Or maybe I want all numeric columns plus the name column:

starwars %>%
  select_if(name, is.character)

Neither of the two examples above works, so I wonder how I could accomplish what I have outlined here.

+4
3 answers

For the first example, combine both tests in a single predicate function:

starwars %>%
  select_if(function(col) {is.numeric(col) | is.character(col)})

See the select_if entry on RDocumentation.

For the second example:

# Logical vector: TRUE for each numeric column
toKeep <- sapply(starwars, is.numeric)
starwars %>%
  select("name", names(toKeep)[toKeep])


+4

You can define the combined test as a named predicate:

to_keep <- function(x) is.numeric(x) | is.character(x)
starwars %>% select_if(to_keep)

Or write it inline, "quosure-style", with funs():

starwars %>% select_if(funs(is.numeric(.) | is.character(.)))

For the second example (the name column plus all numeric columns), one option, though not an elegant one, is:

starwars %>%
  select("name") %>%
  bind_cols(select_if(starwars, is.numeric))
+2

For the second part (getting the numeric columns AND the name column):

to_keep <- c(starwars %>% select_if(is.numeric) %>% names,"name")
starwars %>% select(one_of(to_keep))  
0

Source: https://habr.com/ru/post/1679377/

