Row selection and ordering of the result in R

Apparently, the trivial task of selecting rows in a data frame and then ordering them eludes me, and it is driving me crazy. For example, take a trivial data frame:

    country = c("US", "US", "CA", "US")
    company = c("Apple", "Google", "RIM", "MS")
    vals = c(100, 70, 50, 90)
    df <- data.frame(country, company, vals)

Let's order it by vals:

    > df[order(vals),]
      country company vals
    3      CA     RIM   50
    2      US  Google   70
    4      US      MS   90
    1      US   Apple  100

Works great. Now I try to select only the US companies and order them by value. I get a bogus result:

    > df[country=="US", ][order(vals),]
       country company vals
    4       US      MS   90
    2       US  Google   70
    NA    <NA>    <NA>   NA
    1       US   Apple  100

Let's do the ordering first and then select. Again, a bogus result:

    > df[order(vals),][country=="US", ]
      country company vals
    3      CA     RIM   50
    2      US  Google   70
    1      US   Apple  100

How do I get a data frame that includes only US companies and is sorted by vals?

+4
3 answers

I'm not sure this can be done cleanly by chaining calls to [, because the second call needs to refer to the already ordered or reduced data frame, whereas the free-standing vectors country and vals still describe the original four rows. One way is to order the data and pass the result to subset() to select rows from this ordered data frame:

    > with(df, subset(df[order(vals),], subset = country == "US"))
      country company vals
    2      US  Google   70
    4      US      MS   90
    1      US   Apple  100
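The same idea can also be spelled out in two explicit steps, which makes it visible that the ordering has to be computed on the reduced data frame and not on the original vectors. This is only a minimal sketch; the names us and us_sorted are introduced here for illustration and do not appear in the question.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  country = c("US", "US", "CA", "US"),
  company = c("Apple", "Google", "RIM", "MS"),
  vals    = c(100, 70, 50, 90)
)

# Step 1: keep only the US rows (`us` is a hypothetical helper name)
us <- df[df$country == "US", ]

# Step 2: order by the vals column of the reduced data frame, so the
# index vector has exactly as many elements as `us` has rows
us_sorted <- us[order(us$vals), ]
print(us_sorted)
```

Because order() is applied to us$vals, it returns three positions for three rows, so no NA row can appear.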
+6

I have always found it strange that base R has no convenience function for reordering a data frame, comparable to subset(). So I wrote my own:

    library(plyr)
    arrange(subset(df, country == "US"), vals)
+6

    > df[df$country=="US",][order(df[df$country=="US","vals"]),]
      country company vals
    2      US  Google   70
    4      US      MS   90
    1      US   Apple  100

I think it is a good habit to remove the standalone source vectors and work only with the data frame (so df$country instead of country).
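As a rough sketch of that habit, assuming the vectors were created exactly as in the question: once they are removed with rm(), a bare vals no longer resolves silently to a vector of the wrong length, and every reference has to go through df. The names bare_lookup, is_us and result are hypothetical and used only for this illustration.

```r
# Build the data frame from standalone vectors, as in the question
country <- c("US", "US", "CA", "US")
company <- c("Apple", "Google", "RIM", "MS")
vals <- c(100, 70, 50, 90)
df <- data.frame(country, company, vals)

# Remove the standalone vectors; the columns stored in df are unaffected
rm(country, company, vals)

# A bare `vals` is now an error rather than a silently mismatched index
bare_lookup <- try(order(vals), silent = TRUE)
print(inherits(bare_lookup, "try-error"))

# Select and order using only columns of df
is_us <- df$country == "US"
result <- df[is_us, ][order(df$vals[is_us]), ]
print(result)
```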

+1

Source: https://habr.com/ru/post/1337952/

