Bulk changes to dataset columns by numeric

I have imported an excel dataset and want to set almost all columns (over 90) to numeric when they are originally characters. What is the best way to achieve this, because importing and changing each of the numbers one at a time is not the most efficient method?

+4
source share
2 answers

This should do as you wish:

# Random data frame for illustration (100 columns wide)
df <- data.frame(replicate(100,sample(0:1,1000,rep=TRUE)))

# Check column names / return column number (just encase you wanted to check)
colnames(df)

# Specify columns
cols <- c(1:length(df))   # length(df) is useful as if you ever add more columns at later date

# Or if only want to specify specific column numbers: 
# cols <- c(1:100) 

#With help of magrittr pipe function change all to numeric
library(magrittr)
df[,cols] %<>% lapply(function(x) as.numeric(as.character(x)))

# Check our columns are numeric
str(df)
0
source

Assuming your data has already been imported with all columns of characters, you can convert the corresponding columns to numeric, using mutate_atby position or name:

suppressPackageStartupMessages(library(tidyverse))  

# Assume the imported excel file has 5 columns a to e
df <- tibble(a = as.character(1:3),
             b = as.character(5:7),
             c = as.character(8:10),
             d = as.character(2:4),
             e = as.character(2:4))

# select the columns by position (convert all except 'b')
df %>% mutate_at(c(1, 3:5), as.numeric)
#> # A tibble: 3 x 5
#>       a     b     c     d     e
#>   <dbl> <chr> <dbl> <dbl> <dbl>
#> 1     1     5     8     2     2
#> 2     2     6     9     3     3
#> 3     3     7    10     4     4

# or drop the columns that shouldn't be used ('b' and 'd' should stay as chr)
df %>% mutate_at(-c(2, 4), as.numeric)
#> # A tibble: 3 x 5
#>       a     b     c     d     e
#>   <dbl> <chr> <dbl> <chr> <dbl>
#> 1     1     5     8     2     2
#> 2     2     6     9     3     3
#> 3     3     7    10     4     4

# select the columns by name
df %>% mutate_at(c("a", "c", "d", "e"), as.numeric)
#> # A tibble: 3 x 5
#>       a     b     c     d     e
#>   <dbl> <chr> <dbl> <dbl> <dbl>
#> 1     1     5     8     2     2
#> 2     2     6     9     3     3
#> 3     3     7    10     4     4
0
source

Source: https://habr.com/ru/post/1685816/


All Articles