Is the learning rate good for the Adam method?

I am training my network and got the result shown below. Is this a good learning rate? If not, is it too high or too low? This is my result:

[image: training loss plot]

My solver configuration:

lr_policy: "step"
gamma: 0.1
stepsize: 10000
power: 0.75
# lr for unnormalized softmax
base_lr: 0.001
# high momentum
momentum: 0.99
# no gradient accumulation
iter_size: 1
max_iter: 100000
weight_decay: 0.0005
snapshot: 4000
snapshot_prefix: "snapshot/train"
type:"Adam"

This is from a linked article:

With low learning rates the improvements will be linear. With high learning rates they will start to look more exponential. Higher learning rates will decay the loss faster, but they get stuck at worse values of loss.

[image: loss curves for different learning rates]
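
To judge your own curve against that description, it helps to smooth the raw minibatch loss first, since it is noisy. Below is a minimal, self-contained sketch; the synthetic raw series is only for illustration, so feed in whatever losses you actually log:

    import random

    def smooth(losses, beta=0.98):
        """Bias-corrected exponential moving average of a loss series."""
        avg, out = 0.0, []
        for t, loss in enumerate(losses, start=1):
            avg = beta * avg + (1 - beta) * loss
            out.append(avg / (1 - beta ** t))
        return out

    # Synthetic example: a fast early drop that plateaus at a high value,
    # the shape the quote associates with a learning rate that is too high.
    raw = [0.5 + 2.0 / (1 + 0.05 * i) + random.gauss(0, 0.1) for i in range(1000)]
    trend = smooth(raw)
    print(f"start ~ {trend[0]:.2f}, end ~ {trend[-1]:.2f}")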

2 answers

[The text of this answer was lost in translation; only the values 0.0005 and 0.0001 survive in the source.]


[The text of this answer was also lost in translation; only the values 0.1 and 100 survive in the source.]


Source: https://habr.com/ru/post/1015954/

