What is the difference between backpropagation and reverse-mode autodiff?

Going through this book, I came across the following description:

For each training instance, the backpropagation algorithm first makes a prediction (forward pass) and measures the error, then goes through each layer in reverse to measure the error contribution from each connection (reverse pass), and finally tweaks the connection weights slightly to reduce the error.
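
For concreteness, here is a minimal sketch of those three steps, not taken from the book: a hypothetical one-hidden-layer network with sigmoid activations and squared error, with arbitrary shapes and learning rate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical tiny network: 3 inputs -> 4 hidden units -> 1 output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
lr = 0.1

x = rng.normal(size=(1, 3))   # one training instance
y = np.array([[1.0]])         # its target

# Forward pass: make a prediction
h = sigmoid(x @ W1 + b1)
y_hat = sigmoid(h @ W2 + b2)

# Measure the error (squared error here)
loss = 0.5 * np.sum((y_hat - y) ** 2)

# Reverse pass: go back through the layers to get each
# connection's contribution to the error (the partial derivatives)
d_yhat = y_hat - y                      # dL/dy_hat
d_z2 = d_yhat * y_hat * (1 - y_hat)     # through the output sigmoid
dW2, db2 = h.T @ d_z2, d_z2.sum(axis=0)
d_h = d_z2 @ W2.T
d_z1 = d_h * h * (1 - h)                # through the hidden sigmoid
dW1, db1 = x.T @ d_z1, d_z1.sum(axis=0)

# Finally, tweak the weights slightly to reduce the error
W2 -= lr * dW2; b2 -= lr * db2
W1 -= lr * dW1; b1 -= lr * db1
```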

However, I'm not sure how this differs from TensorFlow's reverse-mode autodiff implementation.

As far as I know, reverse-mode autodiff first goes through the graph in the forward direction, and then in a second pass computes all the partial derivatives of the outputs with respect to the inputs. This sounds very similar to the backpropagation algorithm described above.
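
In TensorFlow terms, that two-pass picture roughly corresponds to recording the forward computation and then asking for the gradients; a minimal GradientTape sketch (the variable names and the loss are my own, just for illustration):

```python
import tensorflow as tf

w = tf.Variable([[0.5], [-0.3]])    # hypothetical parameters
x = tf.constant([[1.0, 2.0]])
y = tf.constant([[1.0]])

with tf.GradientTape() as tape:
    # Forward pass: the tape records the operations as they run
    y_hat = tf.matmul(x, w)
    loss = tf.reduce_mean(tf.square(y_hat - y))

# Second pass: walk the recorded computation backwards to get
# the partial derivatives of the loss with respect to w
grad = tape.gradient(loss, w)
print(grad)
```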

How is backpropagation different from reverse-mode autodiff?

+4
2 answers

Automatic differentiation differs from the method taught in standard calculus classes in how gradients are computed, as well as in some of its capabilities, such as its native ability to take the gradient of a data structure and not just of a well-defined mathematical function (see the sketch after the links below). I don't know the details well enough, but here is a great link that explains it in much more depth:

https://alexey.radul.name/ideas/2013/introduction-to-automatic-differentiation/

Here is another tutorial I just found that also looks pretty good:

https://rufflewind.com/2016-12-30/reverse-mode-automatic-differentiation
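
To illustrate the "gradient of a data structure" point mentioned above, here is a small sketch of my own (not from either link): TensorFlow's GradientTape accepts a nested structure such as a dict of variables and returns the gradients in the same structure.

```python
import tensorflow as tf

# Parameters held in a dict, not a flat vector
params = {"w": tf.Variable(2.0), "b": tf.Variable(-1.0)}
x, y = tf.constant(3.0), tf.constant(7.0)

with tf.GradientTape() as tape:
    y_hat = params["w"] * x + params["b"]
    loss = tf.square(y_hat - y)

# The gradients come back in the same dict structure as the parameters
grads = tape.gradient(loss, params)
print(grads["w"].numpy(), grads["b"].numpy())
```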

As for backprop, it is essentially reverse-mode autodiff applied to a neural network: the quantity being differentiated is the network's error, and the partial derivatives are taken with respect to the weights. Backprop then uses those derivatives to adjust the weights. In other words, reverse-mode autodiff is the general gradient-computation technique, and backpropagation is its application to training neural networks.

And here is a good explanation of backpropagation itself:

https://brilliant.org/wiki/backpropagation/

+2

As far as I understand it, the short answer is this:

Backpropagation refers to the whole process of training a neural network with multiple backpropagation steps, each of which computes the gradients and uses them to perform a Gradient Descent step. Reverse-mode autodiff, by contrast, is simply a technique for computing gradients efficiently, and it happens to be the one used by backpropagation.
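
That split can be made explicit in code: reverse-mode autodiff produces the gradients, and the Gradient Descent update that uses them is the part that the term "backpropagation", in this sense, adds on top. A minimal sketch with assumed variable names:

```python
import tensorflow as tf

w = tf.Variable(1.0)
b = tf.Variable(0.0)
x, y = tf.constant(2.0), tf.constant(5.0)
lr = 0.1

for step in range(100):
    # Reverse-mode autodiff: record the forward pass, then compute gradients
    with tf.GradientTape() as tape:
        loss = tf.square(w * x + b - y)
    dw, db = tape.gradient(loss, [w, b])

    # Gradient Descent step: the weight update that "backpropagation"
    # (in the sense above) adds on top of the gradient computation
    w.assign_sub(lr * dw)
    b.assign_sub(lr * db)
```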

+1

Source: https://habr.com/ru/post/1696327/
