In R, merge rows where a column has the same value but different case

I have data where many values of x were split because of inconsistent capitalisation. I would like to combine these rows, ignoring case, and sum the values in the other columns (y and z).

I have a dataframe like:

x     y   z
rain  2   40
Rain  4   50
RAIN  7   25
Wind  8   10
Snow  3   9
SNOW  11  25

I need a Dataframe:

x     y   z
Rain  13  115
Wind  8   10
Snow  14  34
3 answers

You can convert the first column to lowercase and then aggregate.

Option 1: base R aggregate()

with(df, aggregate(list(y = y, z = z), list(x = tolower(x)), sum))
#      x  y   z
# 1 rain 13 115
# 2 snow 14  34
# 3 wind  8  10

Alternatively, the formula method can be used.

aggregate(. ~ x, transform(df, x = tolower(x)), sum)
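Both aggregate() calls return lowercase labels sorted alphabetically, whereas the question shows capitalised labels in their original order. A minimal base R sketch of one way to get that shape follows. The factor trick and the capitalisation step are illustrative choices, not part of the answer above.

```r
df <- data.frame(
  x = c("rain", "Rain", "RAIN", "Wind", "Snow", "SNOW"),
  y = c(2L, 4L, 7L, 8L, 3L, 11L),
  z = c(40L, 50L, 25L, 10L, 9L, 25L)
)

# Lowercase key used for grouping.
key <- tolower(df$x)

# Factor levels in order of first appearance, so groups come out as
# rain, wind, snow rather than alphabetically.
key <- factor(key, levels = unique(key))

res <- aggregate(df[c("y", "z")], by = list(x = key), FUN = sum)

# Capitalise the first letter of each label (illustrative helper logic).
lab <- as.character(res$x)
res$x <- paste0(toupper(substring(lab, 1, 1)), substring(lab, 2))
res
```

The grouping itself is still done on the lowercase key. Only the labels are changed afterwards.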

Option 2: data.table. This also preserves the row order shown in your expected result.

library(data.table)
as.data.table(df)[, lapply(.SD, sum), by = .(x = tolower(x))]
#       x  y   z
# 1: rain 13 115
# 2: wind  8  10
# 3: snow 14  34

If you want the result sorted by group instead, use keyby in place of by.
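The difference between by and keyby can be sketched as follows, assuming the data.table package is installed. The object names dt, by_first and by_sorted are made up for this example.

```r
library(data.table)

dt <- data.table(
  x = c("rain", "Rain", "RAIN", "Wind", "Snow", "SNOW"),
  y = c(2L, 4L, 7L, 8L, 3L, 11L),
  z = c(40L, 50L, 25L, 10L, 9L, 25L)
)

# by: groups appear in order of first appearance (rain, wind, snow).
by_first <- dt[, lapply(.SD, sum), by = .(x = tolower(x))]

# keyby: groups are sorted and the result is keyed on x (rain, snow, wind).
by_sorted <- dt[, lapply(.SD, sum), keyby = .(x = tolower(x))]

by_first
by_sorted
```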

Option 3: base R xtabs()

xtabs(cbind(y = y, z = z) ~ tolower(x), df)
#           
# tolower(x)   y   z
#       rain  13 115
#       snow  14  34
#       wind   8  10 

Note that this returns a table rather than a data frame, with the lowercase values of x as row names.
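If a data frame is needed from the xtabs() result, one possible conversion is sketched below. It uses as.data.frame.matrix() so that y and z stay as separate columns. The names tab and wide are illustrative.

```r
df <- data.frame(
  x = c("rain", "Rain", "RAIN", "Wind", "Snow", "SNOW"),
  y = c(2L, 4L, 7L, 8L, 3L, 11L),
  z = c(40L, 50L, 25L, 10L, 9L, 25L)
)

tab <- xtabs(cbind(y = y, z = z) ~ tolower(x), df)

# as.data.frame() on a table would give long format (one row per cell);
# as.data.frame.matrix() keeps the two-column layout.
wide <- as.data.frame.matrix(tab)

# Move the row names into a regular x column.
wide <- data.frame(x = rownames(wide), wide, row.names = NULL)
wide
```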

Data:

df <- structure(list(x = structure(c(1L, 2L, 3L, 6L, 4L, 5L), .Label = c("rain", 
"Rain", "RAIN", "Snow", "SNOW", "Wind"), class = "factor"), y = c(2L, 
4L, 7L, 8L, 3L, 11L), z = c(40L, 50L, 25L, 10L, 9L, 25L)), .Names = c("x", 
"y", "z"), class = "data.frame", row.names = c(NA, -6L))

Try:

library(dplyr)
df %>%
  group_by(x = tolower(x)) %>%
  summarise_each(funs(sum))

Which gives:

#Source: local data frame [3 x 3]
#
#      x     y     z
#  (chr) (int) (int)
#1  rain    13   115
#2  snow    14    34
#3  wind     8    10
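summarise_each() and funs() have since been deprecated in dplyr. Assuming dplyr 1.0.0 or later is installed, a roughly equivalent sketch using across() would be:

```r
library(dplyr)

df <- data.frame(
  x = c("rain", "Rain", "RAIN", "Wind", "Snow", "SNOW"),
  y = c(2L, 4L, 7L, 8L, 3L, 11L),
  z = c(40L, 50L, 25L, 10L, 9L, 25L)
)

res <- df %>%
  group_by(x = tolower(x)) %>%          # group on the lowercase label
  summarise(across(everything(), sum))  # sum every non-grouping column

res
```

across(everything(), sum) skips the grouping column, so only y and z are summed.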

If you want the labels in title case, you can use stringr together with dplyr's group_by and summarise.

library(dplyr)
summarise_each(group_by(df, x = stringr::str_to_title(x)), funs(sum))

Here df is your data frame.

      x     y     z
  (chr) (int) (int)
1  Rain    13   115
2  Snow    14    34
3  Wind     8    10

Source: https://habr.com/ru/post/1608906/

