Defining new semantics for expressions in Python

I want to define a Python-based constraint specification language. For instance:

    x = IntVar()
    c = Constraint(x < 19)
    c.solve()

Here IntVar is a class that describes a variable that can take any integer value, while Constraint is a class that represents constraints. To implement this, I can simply overload the < operator by specifying the __lt__ method for the IntVar class.
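A minimal sketch of that overloading, assuming a hypothetical `Comparison` node class (not part of the original question) that simply records the operator and its operands instead of evaluating to True or False:

```python
class Comparison:
    """Hypothetical node: records an operator and its two operands."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def __repr__(self):
        return f"({self.left!r} {self.op} {self.right!r})"


class IntVar:
    """Sketch of an integer variable; the name argument is an assumption."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name

    def __lt__(self, other):
        # Build a constraint node instead of returning a bool.
        return Comparison("<", self, other)

    def __gt__(self, other):
        return Comparison(">", self, other)


x = IntVar("x")
print(x < 19)   # (x < 19)
print(10 < x)   # int defers to the reflected method x.__gt__(10): (x > 10)
```

A `Constraint` class as in the question could then accept such a node and hand it to a solver.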

Suppose now that I want to say that 10 < x < 19. I would like to write something like:

 c = Constraint(x > 10 and x < 19) 

Unfortunately, I cannot do this because and cannot be overloaded in Python. Using & instead of and is not an option because of its precedence and because bitwise & has its own meaning in a constraint language, for example (x & 0x4) == 1.
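To make the problem concrete, here is a small sketch (the `IntVar` and `Comparison` classes are hypothetical stand-ins) showing what `and` and a chained comparison actually produce when the comparison operators return objects:

```python
class Comparison:
    """Hypothetical constraint node that only remembers its text."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return self.text


class IntVar:
    def __init__(self, name):
        self.name = name

    def __lt__(self, other):
        return Comparison(f"{self.name} < {other}")

    def __gt__(self, other):
        return Comparison(f"{self.name} > {other}")


x = IntVar("x")

# `and` tests the truthiness of the left operand (objects are truthy by
# default) and then returns the right operand, so the left half is lost.
both = x > 10 and x < 19
print(both)      # x < 19

# A chained comparison is evaluated as (10 < x) and (x < 19): same result.
chained = 10 < x < 19
print(chained)   # x < 19
```

In both cases the constraint `x > 10` is silently dropped, which is arguably worse than an error.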

What solution can you offer?

As a workaround, I use quoted expressions for constraints:

 c = Constraint("x < 19") 

But this requires parsing a constraint language, which I would prefer to avoid, and, more importantly, syntactic correctness can only be checked when the parsing is actually performed. Thus, the user may spend several hours before discovering that there is a syntax error in the definition of a constraint.
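One way to soften the late-error problem, sketched under the assumption that the quoted constraints are syntactically valid Python expressions: parse the string with the standard `ast` module in the constructor, so a malformed constraint is rejected when it is defined rather than when it is solved. The `Constraint` class below is a hypothetical stand-in, not the real one.

```python
import ast


class Constraint:
    """Hypothetical constraint that checks its syntax at construction time."""

    def __init__(self, text):
        # ast.parse raises SyntaxError immediately for malformed input.
        self.tree = ast.parse(text, mode="eval")
        self.text = text


ok = Constraint("x > 10 and x < 19")

try:
    Constraint("x > 10 and and x < 19")
except SyntaxError as exc:
    # The exact message depends on the Python version.
    print("rejected at definition time:", exc.msg)
```

This only checks syntax; names such as `x` are still resolved later, when the tree is interpreted.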

Another option I examined uses the form of a lambda expression to define a constraint:

 c = Constraint(lambda: x < 19) 

but I cannot access the syntax tree of the lambda object.

3 answers

Using &, | and ~ is actually a pretty good option. You just need to document that parentheses are required because of the different operator precedence.

SQLAlchemy does it this way, for example. For people who dislike this kind of abuse of the bitwise operators, it also provides the functions and_(*args), or_(*args) and not_(arg), which do the same thing as their operator counterparts. However, this forces you into prefix notation (and_(foo, bar)), which is less readable than infix notation (foo & bar).
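A rough sketch of how both styles can coexist. The classes `Expr`, `Atom` and `BoolExpr` are made up for illustration and are not SQLAlchemy's implementation; only the function names `and_`, `or_` and `not_` mirror the ones mentioned above.

```python
class Expr:
    """Hypothetical base class: bitwise operators build boolean nodes."""

    def __and__(self, other):
        return BoolExpr("and", self, other)

    def __or__(self, other):
        return BoolExpr("or", self, other)

    def __invert__(self):
        return BoolExpr("not", self)


class Atom(Expr):
    """A leaf constraint, stored as text to keep the sketch short."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return self.text


class BoolExpr(Expr):
    def __init__(self, op, *operands):
        self.op, self.operands = op, operands

    def __repr__(self):
        if self.op == "not":
            return f"(not {self.operands[0]!r})"
        return "(" + f" {self.op} ".join(map(repr, self.operands)) + ")"


# Prefix-style helpers that build the same nodes as the operators.
def and_(*args):
    return BoolExpr("and", *args)


def or_(*args):
    return BoolExpr("or", *args)


def not_(arg):
    return BoolExpr("not", arg)


a, b = Atom("x > 10"), Atom("x < 19")
print(a & b)        # (x > 10 and x < 19)
print(and_(a, b))   # (x > 10 and x < 19)
print(~a | b)       # ((not x > 10) or x < 19)
```

The parentheses matter with real comparisons: & binds more tightly than < and >, so `x > 10 & x < 19` is parsed as `x > (10 & x) < 19`, whereas `(x > 10) & (x < 19)` does what is intended.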


The lambda approach is also a good idea (apart from the ugliness introduced by lambda itself). Unfortunately, the AST is not available from a function object without its source code. But wait: you do have the source code, it just isn't attached to the function object!

Imagine this code:

    import ast
    import inspect

    def evaluate(constraint):
        # Parse the source line(s) that define the constraint and dump the AST.
        print(ast.dump(ast.parse(inspect.getsource(constraint))))

    evaluate(lambda x: x < 5 and x > -5)

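On Python 2, which the dump below comes from, this gives you the following AST. Recent Python 3 versions produce the same overall structure with slightly different node names, for example arg instead of Name(..., ctx=Param()) and Constant instead of Num: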

    Module(
        body=[
            Expr(
                value=Call(
                    func=Name(id='evaluate', ctx=Load()),
                    args=[
                        Lambda(
                            args=arguments(
                                args=[Name(id='x', ctx=Param())],
                                vararg=None,
                                kwarg=None,
                                defaults=[]
                            ),
                            body=BoolOp(
                                op=And(),
                                values=[
                                    Compare(
                                        left=Name(id='x', ctx=Load()),
                                        ops=[Lt()],
                                        comparators=[Num(n=5)]
                                    ),
                                    Compare(
                                        left=Name(id='x', ctx=Load()),
                                        ops=[Gt()],
                                        comparators=[Num(n=-5)]
                                    )
                                ]
                            )
                        )
                    ],
                    keywords=[],
                    starargs=None,
                    kwargs=None
                )
            )
        ]
    )

The disadvantage is that you get the whole source line, not just the lambda. But you can easily walk the AST until you reach the lambda expression (the first one inside the call to your evaluate function) and then work only with that part.

To avoid having to evaluate the expression yourself, you can now simply rewrite the AST to use bitwise operators instead, then compile the new AST into a function that uses the overloaded operators.
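Picking out the lambda could look roughly like this. To keep the sketch runnable anywhere, the source line is given as a string literal; that string is an assumption standing in for whatever `inspect.getsource()` returns for the calling line.

```python
import ast

# Assumption: this is the source line inspect.getsource() would hand back.
source = "evaluate(lambda x: x < 5 and x > -5)"

tree = ast.parse(source)

# Take the first Lambda node found while walking the tree.
lambda_node = next(
    node for node in ast.walk(tree) if isinstance(node, ast.Lambda)
)

# Wrap just that node in an Expression so it can be compiled on its own.
expr = ast.fix_missing_locations(ast.Expression(body=lambda_node))
func = eval(compile(expr, "<constraint>", "eval"))

print(func(3), func(7))   # True False
```

If several lambdas appear on the same source line, "the first one" may not be the one that was passed in, so a real implementation would need a stricter way to match them.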

Let's look at the AST of ((x < 5) & (x > -5)):

    body=BinOp(
        left=Compare(
            left=Name(id='x', ctx=Load()),
            ops=[Lt()],
            comparators=[Num(n=5)]
        ),
        op=BitAnd(),
        right=Compare(
            left=Name(id='x', ctx=Load()),
            ops=[Gt()],
            comparators=[Num(n=-5)]
        )
    )

As you can see, the difference is pretty slight. You just need to rewrite your BoolOp to use BinOp!
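The BoolOp-to-BinOp rewrite can be sketched with `ast.NodeTransformer`. The names `BoolOpToBitOp`, `compile_constraint`, `Var` and `Term` are hypothetical, and the constraint source is passed as a string here rather than recovered with `inspect.getsource()`.

```python
import ast
import functools


class BoolOpToBitOp(ast.NodeTransformer):
    """Rewrite `a and b` / `a or b` into `a & b` / `a | b`."""

    def visit_BoolOp(self, node):
        self.generic_visit(node)  # rewrite nested boolean operators first
        op_type = ast.BitAnd if isinstance(node.op, ast.And) else ast.BitOr
        # A BoolOp can hold more than two values (a and b and c),
        # so fold them left to right into nested BinOp nodes.
        return functools.reduce(
            lambda left, right: ast.BinOp(left=left, op=op_type(), right=right),
            node.values,
        )


def compile_constraint(source):
    """Parse an expression, rewrite its boolean operators, and evaluate it."""
    tree = ast.parse(source, mode="eval")
    tree = ast.fix_missing_locations(BoolOpToBitOp().visit(tree))
    return eval(compile(tree, "<constraint>", "eval"))


class Term:
    """Hypothetical constraint node overloading & and |."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        return Term(f"({self.text} & {other.text})")

    def __or__(self, other):
        return Term(f"({self.text} | {other.text})")


class Var:
    def __init__(self, name):
        self.name = name

    def __lt__(self, other):
        return Term(f"{self.name} < {other}")

    def __gt__(self, other):
        return Term(f"{self.name} > {other}")


func = compile_constraint("lambda x: x < 5 and x > -5")
print(func(Var("x")).text)   # (x < 5 & x > -5)
```

Note that the rewritten expression no longer short-circuits: both operands of & are always evaluated, which is what a constraint builder wants here.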

The AST of and_(x < 5, x > -5) will look like this:

    body=Call(
        func=Name(id='and_', ctx=Load()),
        args=[
            Compare(
                left=Name(id='x', ctx=Load()),
                ops=[Lt()],
                comparators=[Num(n=5)]
            ),
            Compare(
                left=Name(id='x', ctx=Load()),
                ops=[Gt()],
                comparators=[Num(n=-5)]
            )
        ],
        keywords=[],
        starargs=None,
        kwargs=None
    )

Rewriting a BoolOp into this form is not too difficult either.


For what it's worth, and, or and not cannot be overloaded in Python, since they are not ordinary operators. They are control-flow constructs with short-circuit evaluation.

That said, as a Python developer, I find using & to implement a logical "and" rather confusing and probably error-prone.

Does your "constraint language" necessarily have to be embedded in Python? If so, perhaps you should consider preprocessing files that mix Python and constraints.

As for parsing the constraint language, the following options come to mind:

  • Look at PLY. It lets you define a complete language with its own grammar. This may not be the best option for embedded languages.
  • Another option is to use ast. To quote the documentation: "The ast module helps Python applications process Python abstract syntax grammar trees." This lets you parse Python-like syntax while providing your own semantics (see ast.NodeTransformer).

Are you familiar with the Infix pattern (hack)?

Here is how you can apply it:

    class Infix:
        def __init__(self, function):
            self.function = function

        def __ror__(self, other):
            return Infix(lambda x, self=self, other=other: self.function(other, x))

        def __or__(self, other):
            return self.function(other)

        def __rlshift__(self, other):
            return Infix(lambda x, self=self, other=other: self.function(other, x))

        def __rshift__(self, other):
            return self.function(other)

        def __call__(self, value1, value2):
            return self.function(value1, value2)

    andalso = Infix(lambda x, y: x.and_impl(y))
    orelse = Infix(lambda x, y: x.or_impl(y))

    # and then
    c = Constraint(((x > 10) |andalso| (x < 19)) |orelse| (y < 0))

Unfortunately, you cannot specify operator precedence when using Infix, and, as you have already noticed, this leads to excessive parenthesization.

In general, I doubt that you will find solutions that accurately mimic the behavior of and and or and do not have flaws.


Source: https://habr.com/ru/post/974262/