Why is the episode done after 200 time steps (MountainCar Gym)?

When using MountainCar-v0 from OpenAI Gym in Python, the done value becomes True after 200 time steps. Why is that? Since the goal state has not been reached, the episode should not be done.

import gym

env = gym.make('MountainCar-v0')
env.reset()
for t in range(300):
    env.render()
    # step() returns (observation, reward, done, info)
    res = env.step(env.action_space.sample())
    print(t)
    print(res[2])  # the done flag

I want to keep calling the step method until the car reaches the flag, and then break out of the loop. Is that possible? Something like this:

n_episodes = 10
for i in range(n_episodes):
    env.reset()
    done = False  # reset the flag at the start of every episode
    while not done:
        env.render()
        state, reward, done, _ = env.step(env.action_space.sample())
2 answers

The current newest version of gym forcibly ends the episode after 200 steps, even if you are not using env.monitor. To avoid this, use env = gym.make("MountainCar-v0").env
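To illustrate why done turns True at step 200 even though the goal was not reached, here is a minimal sketch of a step-limit wrapper. ToyCar and StepLimit are hypothetical stand-ins written for this example, not Gym's actual classes. The sketch only assumes what the answer above describes: gym.make returns the environment wrapped in something that enforces the limit, and the inner environment is reachable as .env.

```python
import random


class ToyCar:
    """Hypothetical stand-in for an environment; not Gym's MountainCarEnv."""

    def __init__(self, goal=1000):
        self.goal = goal
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # Toy dynamics: the action (0 or 1) is added to the position.
        self.position += action
        done = self.position >= self.goal  # done only when the goal is reached
        return self.position, -1.0, done, {}


class StepLimit:
    """Hypothetical wrapper that ends an episode after max_episode_steps."""

    def __init__(self, env, max_episode_steps):
        self.env = env  # the unwrapped environment, reachable as .env
        self.max_episode_steps = max_episode_steps
        self.elapsed = 0

    def reset(self):
        self.elapsed = 0
        return self.env.reset()

    def step(self, action):
        state, reward, done, info = self.env.step(action)
        self.elapsed += 1
        if self.elapsed >= self.max_episode_steps:
            done = True  # forced by the step limit, not by reaching the goal
        return state, reward, done, info


def run_episode(env, rng):
    """Step with random actions until done; return the number of steps taken."""
    env.reset()
    steps, done = 0, False
    while not done:
        _, _, done, _ = env.step(rng.choice([0, 1]))
        steps += 1
    return steps


rng = random.Random(0)
wrapped = StepLimit(ToyCar(goal=1000), max_episode_steps=200)
print(run_episode(wrapped, rng))               # 200: the wrapper cuts the episode off
print(run_episode(wrapped.env, rng) >= 1000)   # True: runs until the goal is reached
```

Stepping wrapped.env bypasses the limit entirely, so with a random policy on the real MountainCar an episode may run for a very long time before the car reaches the flag.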


https://github.com/openai/gym/wiki/FAQ:

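If you want to experiment with a variant of an environment that behaves differently, give it a new name so that results on the variant are not mistaken for results on the original. For example, the MountainCar environment is hard partly because of its limit of 200 time steps, after which the episode is reset. Successful agents must solve it in fewer than 200 time steps. For testing purposes, you can create a new environment, MountainCarMyEasyVersion-v0, with different parameters by adapting one of the register calls found in gym/gym/envs/__init__.py: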

import gym

# Register a variant under a new id so it is not confused with MountainCar-v0.
gym.envs.register(
    id='MountainCarMyEasyVersion-v0',
    entry_point='gym.envs.classic_control:MountainCarEnv',
    max_episode_steps=250,      # MountainCar-v0 uses 200
    reward_threshold=-110.0,
)
env = gym.make('MountainCarMyEasyVersion-v0')


Source: https://habr.com/ru/post/1672252/

