To select a row with the minimum number "Number" for each group of "ID" we can use one of the functions of aggregation by groups. A base R - aggregate . Using aggregate we can either use the "formula" method or specify a list groupings of elements / variables with the argument by . Using the formula method, we get the value of min "Number" for each "ID".
aggregate(Number~ID, df1, FUN=min)
Or we can use a faster option with data.table . Here we convert "data.frame" to "data.table" ( setDT(df1) ), grouped by "ID", we get the value min "Number".
library(data.table) setDT(df1)[, list(Number=min(Number)), by = ID]
Or you can also do this with setorder up to the order column "Number" and use unique with the by parameter to select the first row not duplicated by "ID". (from comments by @David Arenburgs)
unique(setorder(setDT(df1), Number), by = "ID")
Or using dplyr , we group by 'ID' and get the subset rows with summarise .
library(dplyr) df1 %>% group_by(ID) %>% summarise(Number= min(Number))
Or we can use the sqldf syntax to get a subset of the data.
library(sqldf) sqldf('select ID, min(Number) as Number from df1 group by ID')
Update
If there are multiple columns, and you want to get a row based on the minimum Number for each ID, you can use which.min . Using .I will get the row index and can be used for a subset of the rows.
setDT(df1)[df1[, .I[which.min(Number)], by = ID]$V1]
Or with dplyr we use slice to filter strings that have a min value for "Number" for each "ID"
df1 %>% group_by(ID) %>% slice(which.min(Number))