There are 3 popular methods for calculating the derivative:
- Numerical differentiation
- Symbolic differentiation
- Automatic differentiation
Numerical differentiation relies on the definition of the derivative:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

where, in practice, you pick a very small h and evaluate the function in two places. This is the most basic formula; in practice, people use other formulas that give a smaller estimation error. This way of calculating the derivative is suitable mainly when you do not know your function and can only sample it. It also requires a lot of computation for a high-dimensional function, because every input dimension needs its own extra function evaluation.
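As a rough illustration of the finite-difference idea described above, here is a minimal sketch in plain Python. The function names (`forward_difference`, `central_difference`, `numerical_gradient`) and the step size `h=1e-4` are my own choices for the example, not part of any library.

```python
import math

def forward_difference(f, x, h=1e-4):
    # Basic formula: evaluate f in two places, x and x + h.
    return (f(x + h) - f(x)) / h

def central_difference(f, x, h=1e-4):
    # A common alternative with a smaller estimation error for smooth f.
    return (f(x + h) - f(x - h)) / (2 * h)

def numerical_gradient(f, xs, h=1e-4):
    # One extra evaluation of f per input dimension, which is why this
    # gets expensive for high-dimensional functions.
    base = f(xs)
    grad = []
    for i in range(len(xs)):
        shifted = list(xs)
        shifted[i] += h
        grad.append((f(shifted) - base) / h)
    return grad

exact = math.cos(1.0)  # true derivative of sin at x = 1
fwd_error = abs(forward_difference(math.sin, 1.0) - exact)
ctr_error = abs(central_difference(math.sin, 1.0) - exact)

def sum_of_squares(xs):
    return sum(x * x for x in xs)

grad = numerical_gradient(sum_of_squares, [1.0, 2.0, 3.0])
print(fwd_error, ctr_error, grad)
```

The central difference should come out noticeably more accurate than the forward difference here, and the gradient of the sum of squares should be close to `[2, 4, 6]`.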
Symbolic differentiation manipulates mathematical expressions. If you have ever used MATLAB or Mathematica, you have seen it in action: you enter an expression and get back the expression for its derivative.
For each elementary expression, these systems know the derivative, and they apply differentiation rules (product rule, chain rule) to build the derivative of the whole expression. They then simplify the result to obtain the final expression.
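To make the rule-based process concrete, here is a toy sketch of a symbolic differentiator in plain Python. The expression encoding (nested tuples such as `("mul", "x", ("sin", "x"))`) and the functions `diff` and `simplify` are hypothetical and invented for this example; real systems such as MATLAB or Mathematica are far more elaborate.

```python
def diff(expr, var="x"):
    # Constants and variables are the base cases.
    if isinstance(expr, (int, float)):
        return 0
    if isinstance(expr, str):
        return 1 if expr == var else 0
    op = expr[0]
    if op == "add":  # sum rule
        return ("add", diff(expr[1], var), diff(expr[2], var))
    if op == "mul":  # product rule
        u, v = expr[1], expr[2]
        return ("add", ("mul", diff(u, var), v), ("mul", u, diff(v, var)))
    if op == "sin":  # chain rule: (sin u)' = cos(u) * u'
        u = expr[1]
        return ("mul", ("cos", u), diff(u, var))
    raise ValueError("no differentiation rule for " + repr(op))

def simplify(expr):
    # Remove the trivial terms (x + 0, x * 1, x * 0) that the rules produce.
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [simplify(a) for a in args]
    if op == "add":
        a, b = args
        if a == 0:
            return b
        if b == 0:
            return a
    if op == "mul":
        a, b = args
        if a == 0 or b == 0:
            return 0
        if a == 1:
            return b
        if b == 1:
            return a
    return (op, *args)

# d/dx [x * sin(x)] = sin(x) + x * cos(x)
derivative = simplify(diff(("mul", "x", ("sin", "x"))))
print(derivative)
```

Note that the output is again an expression, not a number. Without the `simplify` step, the raw result would still contain factors such as `1 * sin(x)`.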
Automatic differentiation manipulates blocks of computer programs. The differentiator has rules for taking the derivative of each program element (when you define any op in core TF, you need to register a gradient for that op). It also uses the chain rule to break complex expressions into simpler ones. Here is a good example of how it works in real TF programs, with some explanations.
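The idea of registering a gradient per op and then chaining them can be sketched in a few lines of plain Python. This is only a minimal reverse-mode sketch under my own assumptions: `GRADIENTS`, `register_gradient`, `Node`, and `backward` are hypothetical names made up for the example and are not TensorFlow's actual API.

```python
import math

GRADIENTS = {}  # op name -> function returning the local partial derivatives

def register_gradient(op_name):
    def decorator(fn):
        GRADIENTS[op_name] = fn
        return fn
    return decorator

@register_gradient("add")
def _add_grad(inputs):
    return [1.0, 1.0]

@register_gradient("mul")
def _mul_grad(inputs):
    a, b = inputs
    return [b, a]

@register_gradient("sin")
def _sin_grad(inputs):
    (a,) = inputs
    return [math.cos(a)]

class Node:
    def __init__(self, value, op=None, parents=()):
        self.value = value      # result of the forward computation
        self.op = op            # name of the op that produced this node
        self.parents = parents  # input nodes of that op
        self.grad = 0.0         # d(output)/d(this node), filled by backward()

def add(a, b): return Node(a.value + b.value, "add", (a, b))
def mul(a, b): return Node(a.value * b.value, "mul", (a, b))
def sin(a): return Node(math.sin(a.value), "sin", (a,))

def backward(output):
    # Order the nodes so that every node comes after its inputs.
    order, seen = [], set()
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent in node.parents:
            visit(parent)
        order.append(node)
    visit(output)
    output.grad = 1.0
    # Chain rule: push each node's gradient back to its inputs.
    for node in reversed(order):
        if node.op is None:
            continue
        local = GRADIENTS[node.op]([p.value for p in node.parents])
        for parent, partial in zip(node.parents, local):
            parent.grad += node.grad * partial

x = Node(2.0)
y = add(mul(x, x), sin(x))  # y = x^2 + sin(x)
backward(y)
print(x.grad)               # should match 2*x + cos(x) at x = 2
```

The differentiator never sees a formula for `y` as a whole. It only needs a gradient rule for each op that was actually executed.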
You might think that automatic differentiation is the same as symbolic differentiation (one works on a mathematical expression, the other on a computer program). And yes, they are sometimes very similar. But for control flow statements (`if`, `while`, loops), the results can be very different:
Symbolic differentiation leads to inefficient code (unless done carefully) and faces the difficulty of converting a computer program into a single expression.
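To illustrate the control flow point, here is a small forward-mode sketch using dual numbers in plain Python. The `Dual` class and the `program` function are hypothetical, written only for this example. The loop runs a data-dependent number of times, so there is no single fixed expression to hand to a symbolic differentiator, yet the derivative is simply carried along with the execution.

```python
class Dual:
    """A value together with its derivative with respect to the input."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __mul__(self, other):
        if not isinstance(other, Dual):
            other = Dual(other)  # plain numbers are constants (derivative 0)
        # Product rule applied to the values that were actually computed.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def program(x):
    y = x
    while y.value < 100.0:  # iteration count depends on the input value
        y = y * y
    if y.value > 1000.0:    # branch also depends on the input value
        y = y * 0.5
    return y

# For x = 3 the loop squares three times, so the program computes 0.5 * x**8.
out = program(Dual(3.0, 1.0))
print(out.value, out.deriv)
```

For this input the result should agree with differentiating `0.5 * x**8` by hand, which gives `4 * x**7`. A different input may take a different path through the loop and the branch, and the same code still returns the derivative of the path that was executed.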