With Spark and Java, I am trying to add an Integer id column to an existing Dataset<Row> that already has n columns.
I have successfully added an id with zipWithUniqueId(), with zipWithIndex(), and even with monotonically_increasing_id(). But none of them is satisfactory.
Example: I have a dataset with 195 rows. When I use one of these three methods, I get identifiers such as 1584156487 or 12036. In addition, these identifiers are not consecutive.
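As far as I understand, the gaps with monotonically_increasing_id() come from how the value is built: the Spark documentation says the current implementation puts the partition ID in the upper 31 bits and the record number within each partition in the lower 33 bits. Below is a minimal plain-Java sketch of that layout. The class and method names are my own, for illustration only, and this is not Spark code.

```java
// Illustrative sketch only: mimics the documented bit layout of
// monotonically_increasing_id(); it does not call Spark.
class MonotonicIdSketch {

    // Assumed layout: partition ID in the upper 31 bits,
    // record number within the partition in the lower 33 bits.
    static long idFor(int partitionId, long rowInPartition) {
        return ((long) partitionId << 33) + rowInPartition;
    }

    public static void main(String[] args) {
        System.out.println(idFor(0, 0)); // 0
        System.out.println(idFor(0, 1)); // 1
        System.out.println(idFor(1, 0)); // 8589934592
    }
}
```

So ids are consecutive only within one partition, and the first row of the next partition jumps by 2^33, which would explain the non-adjacent values once the data is spread over several partitions.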
What I need is pretty simple: an Integer id column whose values run from 1 to dataset.count(), one per row, so that id = 1 is followed by id = 2, and so on.
How can I do this in Java / Spark?