I have a Luigi Python task that uses some PySpark libraries, and I would like to submit it to Mesos using spark-submit. How do I launch it? Below is my code skeleton:
import luigi
from luigi.contrib.spark import SparkSubmitTask
from pyspark.sql import functions as F
from pyspark import SparkContext

class myClass(SparkSubmitTask):
    date = luigi.Parameter()  # exposed as --date on the Luigi command line

    def output(self):
        pass

    def input(self):
        pass

    def run(self):
        pass

if __name__ == "__main__":
    luigi.run()
Without Luigi, I submit this task with the following command line:
/opt/spark/bin/spark-submit --master mesos://host:port --deploy-mode cluster --total-executor-cores 1 --driver-cores 1 --executor-memory 1G --driver-memory 1G my_module.py
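As an aside, if the task builds on luigi.contrib.spark.SparkSubmitTask, the path to spark-submit and the cluster flags are normally read from Luigi's configuration file rather than hard-coded. A sketch of a [spark] section mirroring the command above, with values taken from that command line and key names as I understand them from the luigi.contrib.spark documentation:

```ini
[spark]
spark-submit: /opt/spark/bin/spark-submit
master: mesos://host:port
deploy-mode: cluster
total-executor-cores: 1
driver-cores: 1
executor-memory: 1G
driver-memory: 1G
```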
Now the problem is how to submit the Luigi task so that it also carries the Luigi command line, for example:
luigi --module my_module myClass --local-scheduler --date 2016-01
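For context on what submission involves: SparkSubmitTask essentially shells out to spark-submit, composing the argument list from its properties plus the application's own arguments. A minimal self-contained sketch of that composition (an illustration, not Luigi's actual internals; all names and defaults here are assumptions):

```python
# Illustrative sketch: compose a spark-submit argv from task-style
# properties. Names and defaults are assumptions for illustration,
# not Luigi's real implementation.
def build_spark_submit_cmd(app, app_args,
                           spark_submit="/opt/spark/bin/spark-submit",
                           master="mesos://host:port",
                           deploy_mode="cluster",
                           total_executor_cores=1, driver_cores=1,
                           executor_memory="1G", driver_memory="1G"):
    """Return the argv list that would launch `app` through spark-submit."""
    cmd = [
        spark_submit,
        "--master", master,
        "--deploy-mode", deploy_mode,
        "--total-executor-cores", str(total_executor_cores),
        "--driver-cores", str(driver_cores),
        "--executor-memory", executor_memory,
        "--driver-memory", driver_memory,
        app,
    ]
    cmd.extend(app_args)  # application arguments, e.g. the date
    return cmd

cmd = build_spark_submit_cmd("my_module.py", ["--date", "2016-01"])
print(" ".join(cmd))
```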
Another question: if my_module.py has a required task that must complete first, do I need to do anything extra for that, or can I set it up the same way as the current command line?
I really appreciate any hints or suggestions about this. Thank you so much.