Episodic semi-gradient Sarsa with neural network

Question

When trying to implement episodic semi-gradient Sarsa with a neural network as the approximator, I wondered how to choose the optimal action based on the currently learned weights of the network. If the action space is discrete, I can simply calculate the estimated value of each action in the current state and choose the one with the maximum value. But this does not seem to be the best way to solve the problem. In addition, it does not work if the action space is continuous (for example, the acceleration of a self-driving car).

So basically I'm wondering how to handle the 10th line, "Choose A' as a function of q(S', ·, w)", in Sutton's pseudo-code for this algorithm.

How are these problems typically solved? Can anyone recommend a good example of this algorithm using Keras?

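Edit: Do I need to modify the pseudo-code when using a network as the approximator? So that I simply minimize the MSE between the network's prediction and the reward R, for example?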

reinforcement-learning neural-network sarsa

FlashTek · Jul 28 '17 at 15:35

Neil Slater · Accepted Answer · 2017-07-29T16:38:13+0000

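"I wondered how to choose the optimal action based on the currently learned weights of the network."

You have three basic choices: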

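1. Run the network multiple times, once for each possible value of A' to go with the S' value that you are considering. Take the action with the maximum value as the predicted optimal action (with probability 1-ε, otherwise choose randomly, for the ε-greedy policy typically used in SARSA).
2. Design the network to estimate all action values at once, i.e. to have |A(s)| outputs (perhaps padded to cover "impossible" actions that you need to filter out). This alters the gradient calculation slightly: zero gradient should be applied to the inactive outputs of the last layer (i.e. anything not matching the A of (S, A)). Again, just take the maximum valid output as the estimated optimal action. This can be more efficient than running the network multiple times. It is also the approach used by the DQN Atari game-playing agent and by AlphaGo's policy networks.
3. Use a policy-gradient method, which uses samples to estimate a gradient that would improve a policy estimator. See the chapter on policy gradient methods in Sutton and Barto's Reinforcement Learning: An Introduction (second edition) for more details. Policy-gradient methods become attractive when there are large numbers of possible actions, and they can cope with continuous action spaces (by estimating the distribution function of the optimal policy, e.g. choosing the mean and standard deviation of a normal distribution, which you then sample from to take your action). You can also combine policy gradient with a state-value approach in "actor-critic" methods, which can learn more efficiently than pure policy-gradient approaches.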
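A minimal sketch of ε-greedy selection over estimated action values, where the values come either from one estimator call per action or from a single call that returns one value per action. The names `q_hat`, `q_all`, `toy_q_hat` and `toy_q_all` are hypothetical stand-ins for a trained network, not part of any library, and the toy linear estimators are assumptions made only so the example runs.

```python
import random

def epsilon_greedy(action_values, epsilon, rng=random):
    """Choose from a dict {action: estimated value} with an epsilon-greedy rule."""
    actions = list(action_values)
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore: uniform over the valid actions
    best = max(action_values.values())
    # exploit: pick a maximising action, breaking ties at random
    return rng.choice([a for a in actions if action_values[a] == best])

def values_per_action(q_hat, state, actions, w):
    """One estimator call per candidate action: q_hat(state, action, w)."""
    return {a: q_hat(state, a, w) for a in actions}

def values_all_at_once(q_all, state, valid_actions, w):
    """One estimator call that returns a value per action index; keep valid ones only."""
    outputs = q_all(state, w)
    return {a: outputs[a] for a in valid_actions}

# Hypothetical toy estimators standing in for a neural network (assumption).
def toy_q_hat(state, action, w):
    return w[action] * state

def toy_q_all(state, w):
    return [wi * state for wi in w]

w = [0.1, 0.7, 0.3]
a1 = epsilon_greedy(values_per_action(toy_q_hat, 2.0, [0, 1, 2], w), epsilon=0.0)
a2 = epsilon_greedy(values_all_at_once(toy_q_all, 2.0, [0, 2], w), epsilon=0.0)
```

With `epsilon=0.0` the choice is purely greedy, so `a1` is the action with the largest weight, and `a2` is the best action among the two that were declared valid. Filtering invalid actions before the ε-greedy step keeps the exploratory branch from picking them.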

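Note that if your action space is continuous, you don't have to use a policy-gradient method; you could just quantise the action. Also, in some cases, even when actions are continuous in theory, you may find that the optimal policy only uses extreme values (the classic mountain car example falls into this category: the only useful actions are maximum forward acceleration and maximum backward acceleration).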
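Quantising a continuous action can be sketched as follows: the range is cut into a fixed set of levels, and the greedy action is simply the best level under the current estimator. The function names and the quadratic toy estimator are hypothetical and chosen only for illustration.

```python
def quantise(low, high, n_levels):
    """Return n_levels evenly spaced action values covering [low, high]."""
    if n_levels < 2:
        raise ValueError("need at least two levels")
    step = (high - low) / (n_levels - 1)
    return [low + i * step for i in range(n_levels)]

def greedy_quantised(q_hat, state, w, levels):
    """Greedy action among the quantised levels under q_hat(state, action, w)."""
    return max(levels, key=lambda a: q_hat(state, a, w))

# Hypothetical estimator whose value peaks at action 0.4 (assumption).
def toy_q_hat(state, action, w):
    return -(action - 0.4) ** 2

levels = quantise(-1.0, 1.0, 5)
best = greedy_quantised(toy_q_hat, None, None, levels)
```

Using only two levels, `quantise(-1.0, 1.0, 2)`, corresponds to the case where just the extreme actions are useful.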

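"Do I need to modify the pseudo-code when using a network as the approximator? So that I simply minimize the MSE between the network's prediction and the reward R, for example?"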

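No. There is no separate loss function in the pseudo-code, such as the MSE you would see in supervised learning. The error term (often called the TD error) is given by the part in square brackets, and achieves a similar effect. Literally, the term ∇q(S, A, w) (sorry for the missing hat, no LaTeX on SO) means the gradient of the estimator itself, not the gradient of any loss function.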
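As a sketch of what the update line does, the semi-gradient Sarsa rule w ← w + α[R + γ q(S', A', w) − q(S, A, w)] ∇q(S, A, w) can be written out for a linear estimator, where the gradient with respect to w is just the feature vector. The linear form and the names `q_hat`, `grad_q_hat` and `sarsa_update` are assumptions for illustration; with a neural network the gradient would come from backpropagation instead.

```python
def q_hat(x, w):
    """Linear estimator: x is the feature vector of a (state, action) pair."""
    return sum(wi * xi for wi, xi in zip(w, x))

def grad_q_hat(x, w):
    """Gradient of the estimator itself w.r.t. w; for a linear q_hat this is x."""
    return list(x)

def sarsa_update(w, x, reward, x_next, alpha, gamma, terminal=False):
    """One semi-gradient Sarsa step; returns the updated weight list."""
    # The bootstrapped target is treated as a constant: no gradient flows through it.
    target = reward if terminal else reward + gamma * q_hat(x_next, w)
    td_error = target - q_hat(x, w)  # the part in square brackets
    grad = grad_q_hat(x, w)
    return [wi + alpha * td_error * gi for wi, gi in zip(w, grad)]

w_new = sarsa_update([0.5, 2.0], [1.0, 0.0], 1.0, [0.0, 1.0], alpha=0.5, gamma=0.5)
```

Here the target is 1.0 + 0.5 · 2.0 = 2.0, the TD error is 2.0 − 0.5 = 1.5, and only the weight of the active feature moves. The same step is what gradient descent on half the squared error between q(S, A, w) and the target would produce if the target is held fixed, which is why the target has to be R + γ q(S', A', w) rather than R alone.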
