I am trying to implement linear gradient-descent Sarsa based on the Sutton and Barto book; see the algorithm in the figure below.
However, I am trying to understand something in the algorithm:
I hope someone can help clarify this for me :)
[The answer text did not survive extraction. Its remaining fragments mention w, Q(s,a), z, and the numbers 12 and 10.1.]
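For anyone trying to see how w, Q(s,a) and z fit together, here is a minimal sketch of one update step. It assumes the algorithm in question is Sarsa(λ) with a linear approximator and accumulating traces, where w is the weight vector, Q(s,a) is the dot product of w with a feature vector x(s,a), and z is a trace vector of the same length as w. The function names, the feature vectors, and the step-size values are all hypothetical and chosen only for illustration; the figure in the book may differ in details such as the trace type.

```python
from typing import List, Tuple


def q_value(w: List[float], x: List[float]) -> float:
    """Linear action-value estimate: Q(s, a) = w . x(s, a)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def sarsa_lambda_step(
    w: List[float],
    z: List[float],
    x: List[float],
    reward: float,
    x_next: List[float],
    alpha: float,
    gamma: float,
    lam: float,
    terminal: bool = False,
) -> Tuple[List[float], List[float]]:
    """One update of (w, z) for a transition (s, a) -> (s', a').

    x and x_next are the feature vectors of (s, a) and (s', a').
    Assumption: accumulating traces; other trace variants update z differently.
    """
    # Decay the trace, then add the gradient of Q with respect to w.
    # For a linear Q that gradient is simply the feature vector x.
    z = [gamma * lam * zi + xi for zi, xi in zip(z, x)]

    # TD error; the value of the next pair is taken as 0 at episode end.
    q_next = 0.0 if terminal else q_value(w, x_next)
    delta = reward + gamma * q_next - q_value(w, x)

    # Move every weight in proportion to its trace.
    w = [wi + alpha * delta * zi for wi, zi in zip(w, z)]
    return w, z


# Hypothetical two-feature example: start with zero weights and zero traces.
w, z = [0.0, 0.0], [0.0, 0.0]
w, z = sarsa_lambda_step(
    w, z, x=[1.0, 0.0], reward=1.0, x_next=[0.0, 1.0],
    alpha=0.5, gamma=0.9, lam=0.8,
)
print(w, z)
```

The point of the sketch is that w is the only thing being learned, Q(s,a) is never stored but recomputed from w whenever it is needed, and z only records how much each weight contributed to recent estimates. In this sketch z would be reset to zeros at the start of every episode.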
Source: https://habr.com/ru/post/1661473/