Well, I could not help adding my own implementation using some of the ideas we discussed in response to Ray. I approached several things differently than Ray.
I added some handling of the probability of a fall of each operator. Operators are biased so that lower priority operators (higher priority values) are more common than higher ones.
I also implemented parentheses only when priority is required. Since integers have the highest priority (minimum priority value), they never wrap in parentheses. No parenthesized expression is required as a node in the expression tree.
The probability of using the operator is biased towards the initial levels (using a quadratic function) to get a more pleasant distribution of operators. Choosing a different exhibitor gives greater potential for controlling the quality of output, but I did not play very much with the possibilities.
I also added an evaluator for fun, as well as for filtering vague expressions.
import sys import random # dictionary of operator precedence and incidence probability, with an # evaluator added just for fun. operators = { '^': {'prec': 10, 'prob': .1, 'eval': lambda a, b: pow(a, b)}, '*': {'prec': 20, 'prob': .2, 'eval': lambda a, b: a*b}, '/': {'prec': 20, 'prob': .2, 'eval': lambda a, b: a/b}, '+': {'prec': 30, 'prob': .25, 'eval': lambda a, b: a+b}, '-': {'prec': 30, 'prob': .25, 'eval': lambda a, b: ab}} max_levels = 3 integer_range = (-100, 100) random.seed() # A node in an expression tree class expression(object): def __init__(self): super(expression, self).__init__() def precedence(self): return -1 def eval(self): return 0 @classmethod def create_random(cls, level): if level == 0: is_op = True elif level == max_levels: is_op = False else: is_op = random.random() <= 1.0 - pow(level/max_levels, 2.0) if is_op: return binary_expression.create_random(level) else: return integer_expression.create_random(level) class integer_expression(expression): def __init__(self, value): super(integer_expression, self).__init__() self.value = value def __str__(self): return self.value.__str__() def precedence(self): return 0 def eval(self): return self.value @classmethod def create_random(cls, level): return integer_expression(random.randint(integer_range[0], integer_range[1])) class binary_expression(expression): def __init__(self, symbol, left_expression, right_expression): super(binary_expression, self).__init__() self.symbol = symbol self.left = left_expression self.right = right_expression def eval(self): f = operators[self.symbol]['eval'] return f(self.left.eval(), self.right.eval()) @classmethod def create_random(cls, level): symbol = None # Choose an operator based on its probability distribution r = random.random() cumulative = 0.0 for k, v in operators.items(): cumulative += v['prob'] if r <= cumulative: symbol = k break assert symbol != None left = expression.create_random(level + 1) right = expression.create_random(level + 1) return binary_expression(symbol, left, right) def precedence(self): return operators[self.symbol]['prec'] def __str__(self): left_str = self.left.__str__() right_str = self.right.__str__() op_str = self.symbol # Use precedence to determine if we need to put the sub expressions in # parentheses if self.left.precedence() > self.precedence(): left_str = '('+left_str+')' if self.right.precedence() > self.precedence(): right_str = '('+right_str+')' # Nice to have space around low precedence operators if operators[self.symbol]['prec'] >= 30: op_str = ' ' + op_str + ' ' return left_str + op_str + right_str max_result = pow(10, 10) for i in range(10): expr = expression.create_random(0) try: value = float(expr.eval()) except: value = 'indeterminate' print expr, '=', value
I got the following results:
(4 + 100)*41/46 - 31 - 18 - 2^-83 = -13.0 (43 - -77)/37^-94 + (-66*67)^(-24*49) = 3.09131533541e+149 -32 - -1 + 74 + 74 - 15 + 64 - -22/98 = 37.0 (-91*-4*45*-55)^(-9^2/(82 - -53)) = 1.0 -72*-85*(75 - 65) + -100*19/48*22 = 61198.0 -57 - -76 - -54*76 + -38 - -23 + -17 - 3 = 4088.0 (84*-19)^(13 - 87) - -10*-84*(-28 + -49) = 64680.0 -69 - -8 - -81^-51 + (53 + 80)^(99 - 48) = 2.07220963807e+108 (-42*-45)^(12/87) - -98 + -23 + -67 - -37 = 152.0 -31/-2*-58^-60 - 33 - -49 - 46/12 = -79.0
There are several things that a program does that, although they are valid, a person would not do. For instance:
- It can create long lines of consecutive divisions (e.g. 1/2/3/4/5).
- +/- of a negative number is common (e.g. 1 -2)
They can be fixed with a cleaning pass.
In addition, there is no guarantee that an answer will be determined. Dividings by 0 and 0 ^ 0 are possible, although they can be filtered with exception handling.