RelationalGroupedDataset vs. KeyValueGroupedDataset: when should you use each?

When grouping a Dataset in Spark, there are two methods: groupBy and groupByKey[K].

groupBy returns a RelationalGroupedDataset, and groupByKey[K] returns a KeyValueGroupedDataset.
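
To make the two calls concrete, here is a minimal sketch of what I mean. The Sale case class, the GroupingSketch object, and the column names are hypothetical and only for illustration; it assumes Spark SQL is on the classpath and a local master.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.sum

// Hypothetical record type, used only for illustration.
final case class Sale(region: String, amount: Double)

object GroupingSketch {

  // groupBy takes column names (or Column expressions) and yields a
  // RelationalGroupedDataset; aggregating it produces an untyped DataFrame.
  def untypedTotals(sales: Dataset[Sale]): DataFrame =
    sales.groupBy("region").agg(sum("amount").as("total"))

  // groupByKey takes a function from the element type to a key and yields a
  // KeyValueGroupedDataset[String, Sale]; mapGroups keeps the result typed.
  def typedTotals(spark: SparkSession, sales: Dataset[Sale]): Dataset[(String, Double)] = {
    import spark.implicits._ // encoders for String and (String, Double)
    sales
      .groupByKey(_.region)
      .mapGroups((region: String, rows: Iterator[Sale]) => (region, rows.map(_.amount).sum))
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for this sketch.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("grouping-sketch")
      .getOrCreate()
    import spark.implicits._

    val sales = Seq(Sale("EU", 10.0), Sale("EU", 5.0), Sale("US", 7.0)).toDS()

    untypedTotals(sales).show()
    typedTotals(spark, sales).show()

    spark.stop()
  }
}
```

Both versions compute a per-region total, but the first refers to columns by name and returns Rows, while the second works on Sale objects and returns a typed Dataset.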

What are the differences between the two?

Under what circumstances should I choose one over the other?


Why is my question marked as a duplicate of those Dataset vs. DataFrame questions? I do not understand. These are completely different things! My question is very specific, not general.


Source: https://habr.com/ru/post/1275031/

