I am currently reading Sutton's book Reinforcement Learning: An Introduction. After reading Chapter 6.1, I wanted to implement the TD(0) RL algorithm for this example:

To do this, I tried to implement the pseudocode presented here:

Having done this, I wondered how to perform the step A <- action given by π for S: can I simply choose the optimal action A for my current state S? Since the value function V(s) depends only on the state and not on the action, I don't see how to do this.
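For reference, here is a minimal sketch of what I have so far. It assumes the 5-state random walk from Example 6.2 as the environment and an equiprobable random policy standing in for π (all names and the environment setup are my own illustration, not from the book's pseudocode):

```python
import random

ALPHA, GAMMA = 0.1, 1.0
NUM_EPISODES = 1000

def policy(state):
    # Stand-in for "action given by pi for S": here a fixed
    # equiprobable random policy (move left or right).
    return random.choice([-1, +1])

def step(state, action):
    # 5-state random walk: states 1..5 are non-terminal, 0 and 6 are
    # terminal; reward is +1 only when terminating on the right.
    next_state = state + action
    reward = 1.0 if next_state == 6 else 0.0
    done = next_state in (0, 6)
    return next_state, reward, done

V = {s: 0.0 for s in range(7)}  # value estimates; terminals stay 0

for _ in range(NUM_EPISODES):
    s = 3  # each episode starts in the centre state
    done = False
    while not done:
        a = policy(s)                 # A <- action given by pi for S
        s_next, r, done = step(s, a)  # take action A, observe R, S'
        # TD(0) update: V(S) <- V(S) + alpha * [R + gamma*V(S') - V(S)]
        V[s] += ALPHA * (r + GAMMA * V[s_next] - V[s])
        s = s_next
```

With enough episodes, V(1)..V(5) should approach the true values 1/6..5/6 for the random policy. My uncertainty is exactly the `policy` function above: sampling at random works, but I don't see how to derive an action from V itself.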
I found this question (which is where I got the images from) that deals with the same exercise, but there the action is simply selected at random rather than by a policy π.
So my question is: do I have to use the action-value function Q(s, a) instead?