I am using Spark Catalyst to represent query plans for the openCypher query engine ingraph. During query planning, I would like to convert from one logical plan (Plan1) to another logical plan (Plan2). (I tried to keep the question simple, so I skipped some details here. The project is fully open source, so if necessary, I would gladly provide more background on why this is needed.)
The best approach I could find was to use transformDown recursively. Here is a small example that converts from Plan1Node to Plan2Node by replacing each instance of OpA1 with OpA2 and each instance of OpB1 with OpB2.
import org.apache.spark.sql.catalyst.expressions.Attribute
import org.apache.spark.sql.catalyst.plans.logical.{LeafNode, LogicalPlan, UnaryNode}

trait Plan1Node extends LogicalPlan

case class OpA1() extends LeafNode with Plan1Node {
  override def output: Seq[Attribute] = Seq()
}

case class OpB1(child: Plan1Node) extends UnaryNode with Plan1Node {
  override def output: Seq[Attribute] = Seq()
}

trait Plan2Node extends LogicalPlan

case class OpA2() extends LeafNode with Plan2Node {
  override def output: Seq[Attribute] = Seq()
}

case class OpB2(child: Plan2Node) extends UnaryNode with Plan2Node {
  override def output: Seq[Attribute] = Seq()
}

object Plan1ToPlan2 {
  def transform(plan: Plan1Node): Plan2Node =
    plan.transformDown {
      case OpA1()      => OpA2()
      case OpB1(child) => OpB2(transform(child))
    }.asInstanceOf[Plan2Node]
}
This approach does the job. This code:
val p1 = OpB1(OpA1())
val p2 = Plan1ToPlan2.transform(p1)
results in:
p1: OpB1 = OpB1
+- OpA1
p2: Plan2Node = OpB2
+- OpA2
However, using asInstanceOf[Plan2Node] is definitely a code smell. I also considered using the Strategy class to define the conversion rules, but that class is designed for converting logical plans to physical plans.
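For comparison, the only cast-free alternative I can see is an explicit recursive pattern match with a return type of Plan2Node. Here is a standalone sketch of that idea, with plain case classes standing in for the Catalyst nodes so it compiles without a Spark dependency:

```scala
// Standalone stand-ins for the Plan1/Plan2 node hierarchies above.
sealed trait P1
case object A1 extends P1
case class B1(child: P1) extends P1

sealed trait P2
case object A2 extends P2
case class B2(child: P2) extends P2

// A total, typed conversion: the return type is P2, so no asInstanceOf
// is needed, and the compiler checks exhaustiveness over the sealed trait.
def convert(plan: P1): P2 = plan match {
  case A1        => A2
  case B1(child) => B2(convert(child))
}
```

The trade-off is losing transformDown's generic traversal: every Plan1 node type has to be enumerated in the match by hand.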
Is there a more elegant way to define transformations between logical plans? Or is using multiple logical plan representations considered an anti-pattern?