Is Kryo serialization required when working with the Dataset API?
Because Datasets use Encoders for both serialization and deserialization:
- Does Kryo serialization even work for Datasets? (Assuming the correct configuration is passed to Spark and the classes are registered correctly.)
- If it does work, how much of a performance improvement would it provide? Thanks.