I have a few abstract questions. I have been using Apache Spark (including Streaming and SQL) with Scala quite a lot lately. Most of my Spark jobs basically move RDDs / DataFrames from one class to another, where each class performs some input conversion.
Recently, I also read about Domain-Driven Design, which made me think about how I would model my Spark programs using DDD. I have to say that it's much harder for me to model Spark code with DDD concepts than non-Spark code (perhaps because it mostly performs transformations or I/O). I can work out how to create a ubiquitous language, but not how to apply it in the Spark code itself.
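For context, here is roughly the shape my jobs take (a simplified, made-up sketch: plain Scala collections stand in for Datasets/DataFrames so it runs without a SparkSession, and all names are hypothetical):

```scala
// Hypothetical raw record, as it arrives from the source.
case class RawEvent(userId: String, amountCents: String)

// Cleaned record after input conversion.
case class Event(userId: String, amount: BigDecimal)

// Each class owns one conversion step, as described above.
// In the real jobs these methods take and return Dataset[_] / DataFrame;
// Seq is used here only to keep the sketch self-contained.
class InputConverter {
  def convert(raw: Seq[RawEvent]): Seq[Event] =
    raw.flatMap { r =>
      // Drop rows whose amount fails to parse.
      scala.util.Try(BigDecimal(r.amountCents) / 100)
        .toOption
        .map(a => Event(r.userId, a))
    }
}

// The next class in the chain aggregates the converted records.
class Aggregator {
  def totalsByUser(events: Seq[Event]): Map[String, BigDecimal] =
    events.groupBy(_.userId).map { case (u, es) => u -> es.map(_.amount).sum }
}

object Pipeline {
  def main(args: Array[String]): Unit = {
    val raw = Seq(RawEvent("a", "150"), RawEvent("a", "50"), RawEvent("b", "oops"))
    val events = new InputConverter().convert(raw)  // "oops" row is dropped
    val totals = new Aggregator().totalsByUser(events)
    println(totals)
  }
}
```

So the "domain" is mostly data shapes and the transformations between them, which is where I struggle to see entities, aggregates, and repositories fitting in.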
I tried a Google search on how to use Spark with DDD but couldn't find anything about it, so I was wondering:
- Am I just missing something about how to apply DDD concepts to Spark code?
- Or are Spark jobs so focused on ETL that they really don't require DDD? If that's not the case, can someone explain how they use DDD concepts with Spark code? Some examples would be helpful.
I hope this is a legitimate question; if not, I'm sorry.
Thank you in advance.