Why is the episode done after 200 time steps (MountainCar Gym)?

Question

When using MountainCar-v0 from OpenAI Gym in Python, done becomes True after 200 time steps. Why is this? Since the goal state has not been reached, the episode should not be over.

import gym

env = gym.make('MountainCar-v0')
env.reset()
for t in range(300):
    env.render()
    # step() returns (observation, reward, done, info)
    res = env.step(env.action_space.sample())
    print(t)
    print(res[2])  # done flag: turns True after 200 steps

I want to keep calling the step method until the car reaches the flag, and only then break out of the loop. Is that possible? Something like this:

n_episodes = 10
for i in range(n_episodes):
    env.reset()
    done = False  # reset the flag at the start of every episode
    while not done:
        env.render()
        state, reward, done, _ = env.step(env.action_space.sample())

+4

python openai-gym

needRhelp Mar 14 '17 at 13:55

2 answers

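From the FAQ at https://github.com/openai/gym/wiki/FAQ:

For example, the MountainCar environment is hard partly because of a limit of 200 time steps, after which the episode is reset. A successful agent has to solve it in fewer than 200 steps. For testing purposes, you can create a new environment, MountainCarMyEasyVersion-v0, with different parameters by adapting one of the register calls found in gym/gym/envs/__init__.py: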

gym.envs.register(
    id='MountainCarMyEasyVersion-v0',
    entry_point='gym.envs.classic_control:MountainCarEnv',
    max_episode_steps=250,      # MountainCar-v0 uses 200
    reward_threshold=-110.0,
)
env = gym.make('MountainCarMyEasyVersion-v0')

+3

catherio 15 . '17 23:35

Scitator · Accepted Answer · 2017-03-15T06:09:33+0000

The current version of gym stops the environment after 200 steps, even if you are not using env.monitor. To avoid this, use env = gym.make("MountainCar-v0").env, which gives you the underlying environment without the step-limit wrapper.
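
To see why both answers work, here is a minimal pure-Python sketch of a step-limit wrapper. It is not gym's own code: NeverEndingEnv, ToyTimeLimit and steps_until_done are hypothetical names made up for this illustration, and the toy environment simply never reaches its goal. The sketch assumes the 4-tuple step() return value used in the question.

```python
class NeverEndingEnv:
    """Hypothetical stand-in for an environment whose goal is never reached."""

    def reset(self):
        return 0  # dummy observation

    def step(self, action):
        # (observation, reward, done, info), as in the question's code
        return 0, -1.0, False, {}


class ToyTimeLimit:
    """Simplified illustration of a step-limit wrapper (not gym's own code)."""

    def __init__(self, env, max_episode_steps):
        self.env = env  # the wrapped environment stays reachable as .env
        self.max_episode_steps = max_episode_steps
        self._elapsed_steps = 0

    def reset(self):
        self._elapsed_steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._elapsed_steps += 1
        if self._elapsed_steps >= self.max_episode_steps:
            done = True  # cut the episode off even though the goal was not reached
        return obs, reward, done, info


def steps_until_done(env, cap=300):
    """Step env until it reports done; give up after cap steps."""
    env.reset()
    for t in range(1, cap + 1):
        _, _, done, _ = env.step(0)
        if done:
            return t
    return None  # never finished within cap steps


wrapped = ToyTimeLimit(NeverEndingEnv(), max_episode_steps=200)
print(steps_until_done(wrapped))      # 200: the wrapper ends the episode
print(steps_until_done(wrapped.env))  # None: the inner env has no step limit
```

In this sketch, raising max_episode_steps plays the role of registering a variant with a larger limit, while stepping wrapped.env directly plays the role of calling .env on the result of gym.make.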
