The difference between Distinct and Unique

What is the difference between distinct and unique in R, when using dplyr, with respect to:

  • Speed
  • Capabilities (valid inputs, parameters, etc.) and usage
  • Output

For instance:

library(dplyr)
data(iris)

# creating data with duplicates
iris_dup <- bind_rows(iris, iris)

d <- distinct(iris_dup)
u <- unique(iris_dup)

all(d == u) # returns TRUE

In this example, distinct and unique perform the same function. Are there cases where you should use one and not the other? Are there any tricks or common uses for either? I am new to R but come with a good SQL background. This question has been touched on several times already, but I'm still looking for a more complete answer.

1 answer

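For removing duplicate rows from a whole data frame, the two functions return the same rows. They differ in interface, output format, and speed.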

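distinct() is a function from the dplyr package and can be customized. For example, you can pass the dataframe together with the columns to deduplicate on: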

distinct(iris_dup, Petal.Width, Species)
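
As a small sketch of that flexibility, assuming the same iris_dup data as in the question: distinct also accepts a .keep_all argument, which keeps the remaining columns of the first row found for each combination. The names key_only and first_rows are made up for illustration.

```r
library(dplyr)

iris_dup <- bind_rows(iris, iris)

# Only the two key columns are returned
key_only <- distinct(iris_dup, Petal.Width, Species)

# .keep_all = TRUE keeps every column, taking the first row per combination
first_rows <- distinct(iris_dup, Petal.Width, Species, .keep_all = TRUE)

ncol(key_only)    # 2
ncol(first_rows)  # 5
nrow(key_only) == nrow(first_rows)  # TRUE
```

This is roughly what a SQL user would reach for when "SELECT DISTINCT a, b" is not enough and the other columns of one representative row are needed as well.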

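unique() is a base R function. It takes no column arguments, so it returns the unique rows of whatever data frame it is given.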

Note: to get the same result with unique(), select the columns first:

unique(iris_dup[c("Petal.Width", "Species")])

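Both calls return the same rows (the output below is from distinct). distinct renumbers the rows, while unique keeps the original row names.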

     Petal.Width    Species
1          0.2     setosa
2          0.4     setosa
3          0.3     setosa
4          0.1     setosa
5          0.5     setosa
6          0.6     setosa
7          1.4 versicolor
8          1.5 versicolor
9          1.3 versicolor
10         1.6 versicolor
11         1.0 versicolor
12         1.1 versicolor
13         1.8 versicolor
14         1.2 versicolor
15         1.7 versicolor
16         2.5  virginica
17         1.9  virginica
18         2.1  virginica
19         1.8  virginica
20         2.2  virginica
21         1.7  virginica
22         2.0  virginica
23         2.4  virginica
24         2.3  virginica
25         1.5  virginica
26         1.6  virginica
27         1.4  virginica
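
The row-name difference can be checked directly. This is a minimal sketch, again assuming the iris_dup data from the question; d and u are just illustrative names for the two results.

```r
library(dplyr)

iris_dup <- bind_rows(iris, iris)

d <- distinct(iris_dup, Petal.Width, Species)
u <- unique(iris_dup[c("Petal.Width", "Species")])

# Same values, row by row
all(d == u)

# distinct numbers the rows 1..n
head(rownames(d))

# unique keeps the positions the rows had in iris_dup
head(rownames(u))

# Resetting the row names of u removes the difference in labels
rownames(u) <- NULL
head(rownames(u))
```

If the original row positions matter later (for example, to look the rows up again in iris_dup), unique preserves them; with distinct they would have to be stored in a column beforehand.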

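As for speed, the dplyr documentation describes distinct as considerably faster than unique.data.frame, so distinct is usually the better choice on large data frames.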
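
A rough way to compare the speed yourself is sketched below. The data frame big is a hypothetical test set built by resampling iris, and no timings are quoted because they depend on the machine and on the R and dplyr versions.

```r
library(dplyr)

# Hypothetical test data: 1e5 rows drawn from iris, so most rows are duplicates
set.seed(1)
big <- iris[sample(nrow(iris), 1e5, replace = TRUE), ]

# Time both approaches on the same input
system.time(d <- distinct(big))
system.time(u <- unique(big))

# Both approaches keep the same number of rows
nrow(d) == nrow(u)
```

For a more careful comparison, repeat each call several times and compare the elapsed times rather than relying on a single run.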


Source: https://habr.com/ru/post/1681814/

