This is an interesting problem. There are several different things here. One of the problems is how to describe the sequence of operations and operands that are included in the arithmetic expression. Using parentheses to establish the order of operations is rather messy, so instead I suggest thinking of an expression as a stack of operations and operands, for example - 4 4 for 4-4, + 4 * 4 4 for (4 * 4) +4, * 4 + 4 4 for (4 + 4) * 4, etc. It's like the reverse polish notation on an HP calculator. Then you don’t need to worry about parentheses, as the data structure for expressions will help below when we create larger and larger expressions.
Now let's move on to the algorithm for constructing expressions. Dynamic programming in this situation does not work, in my opinion, because (for example) to build some numbers in the range from 0 to 100, you may need to temporarily go beyond this range.
The best way to conceptualize the problem, I think, is like the first width search (BFS) on a chart. Technically, the graph will be infinite (all positive integers or all integers or all rational numbers, depending on how much you want to get), but at any time you will have only the final part of the graph. A sparse graph structure may be required.
Each node (number) on the graph will have the weight associated with it, the minimum number 4 required to achieve this node, as well as an expression that achieves this result. First you start with node (4), with the number 1 associated with it (it takes 4 to make 4), and the simple expression is “4”. You can also drop (44) with a weight of 2, (444) with a weight of 3 and (4444) with a weight of 4.
To create larger expressions, apply all the different operations that you have to those original nodes. For example, unary negation, factorial, square root; binary operations, such as * 4 at the bottom of your stack, for multiplication by 4, + 4 , - 4 , / 4 , ^ 4 for exponentiality, as well as + 44 , etc. The weight of the operation is the amount of 4s required for this operation; unary operations would have a weight of 0, + 4 would have a weight of 1, * 44 would have a weight of 2, etc. You would add the weight of the operation to the weight of the node it is running on to get the new weight, so for example + 4 acting on node (44) with a weight of 2, and the expression “44” will result in a new node (48) with weight 3 and the expression "+ 4 44". If the result for 48 has a better weight than the existing result for 48, replace the new node with (48).
When applying functions, you have to use a certain meaning. factor (4444) would be a very large number; it would be wise to set a domain for your factorial function, which would prevent you from getting too big a result or go beyond. Same thing with functions like / 4; if you do not want to deal with fractions, say that multiples of 4 are not outside the region / 4 and in this case do not use the operator.
The resulting algorithm is very similar to Dijkstra's algorithm for calculating the distance on the graph, although not quite the same.