How to use TensorBoard to analyze results and reduce root-mean-square error

In Tensorflow, I try to create a model for performing superresolution of an image (i.e. a regression task) and analyze the results using TensorBoard. During training, I found that the standard error (MSE) in most cases bounced from 100 to 200 (even from the very beginning) and never converged. I was hoping to add the following variables to tf.summaryand analyze what the problem is causing this.

graph_loss = get_graph_mean_square_error()
tf.summary.scalar('graph_loss', graph_loss)

regularization_loss = tf.add_n([tf.nn.l2_loss(weight) for weight in weights]) * regularization_param
tf.summary.scalar('reg_loss', regularization_loss)

tf.summary.scalar('overall_loss', regularization_loss + graph_loss)

for index in range(len(weights)):
    tf.summary.histogram("weight[%02d]" % index, weights[index])

optimizer = tf.train.AdamOptimizer()
capped_grad_and_vars = [(tf.clip_by_value(grad, -clip_value, clip_value), var) for grad, var in grad_and_vars if grad is not None]
train_optimizer = optimizer.apply_gradients(capped_grad_and_vars, global_step)

for grad, var in grad_and_vars:
    tf.summary.histogram(var.name + '/gradient', grad)

for grad, var in capped_grad_and_vars:
    tf.summary.histogram(var.name + '/capped_gradient', grad)

The model is a ResNET with a missing connection, which contains several repeating layers [convolution → batch normalization → ReLU]. On the Distribution tab, I see that several graphs have been added with the following template:

  • BatchNorm_ []/beta0/capped_gradient
  • BatchNorm_ []/beta0/
  • BatchNorm_ []/gamma0/capped_gradient
  • BatchNorm_ []/gamma0/
  • [] _0/capped_gradient
  • [] _0/
  • weight_ [] _
  • weight_ [] _0/capped_gradient
  • weight_ [] _0/

, , , - :

L2

regularization_param 0,0001, reg_loss , 1,5 ( ) 3,5. graph_loss 100 200, reg_loss - 1,5 3,5.

  • reg_loss, ( )?
  • reg_loss , (100-200 1.5-3.5)?
  • , regularization_param?

, MSE . , ResNET, , (clip_by_value 0,05) . , , , 22 20K , ( TensorBoard , , , / ):

First 20K weight distribution

. 66K :

66K weight distribution

, 20K - , weight_36_ weight_37_ . 50K , weight_36_ ( ) weight_39_ ( ) .

( , capped_gradient clip_by_value 0,05), , , :

66K batch normalization

  1. , - , ? ( , , , , ).
  2. MSE ?

:)

+4
2
  • reg_loss, ( > )?

, .

  1. reg_loss , (100-200 1.5-3.5)?
  2. , regularization_param?
0,001 0,1 ( ) , MSE reg_loss. reg_loss.
  1. , - , ? ( , , , , ).
  2. MSE ?

, , MSE . . MSE , , .

0

:

  • : 0,05. , = (0,05 * ) , . ( ) 0,05, , . , , , , -.

  • l2-: , MSE, .

+2

Source: https://habr.com/ru/post/1694023/


All Articles