TensorFlow - Loss starts high and doesn't decrease

I started writing neural networks with TensorFlow, and there is one problem I run into in each of my projects.

My loss always starts at something like 50 or higher and does not decrease, or if it does, it decreases so slowly that after all my epochs I do not even come close to an acceptable loss.

Things I have already tried (none of which noticeably affected the result):

  • checked for overfitting, but in the following example you can see that I have 15,000 training and 15,000 test data sets and only something like 900 neurons
  • tried various optimizers and optimizer settings
  • tried to increase the training data by using the test data as training data as well
  • tried increasing and decreasing the batch size

I built the network using the knowledge from this video: https://youtu.be/vq2nnJ4g6N0

But let's look at one of my test projects:

I have a list of names and I want to predict the gender for each, so my raw data looks like this:

names=["Maria","Paul","Emilia",...]

genders=["f","m","f",...]

To feed this to the network, I convert each name to an array of charCodes (zero-padded to a maximum length of 30) and each gender to a one-hot bit array:

names=[[77.,97. ,114.,105.,97. ,0. ,0.,...],
       [80.,97. ,117.,108.,0.  ,0. ,0.,...],
       [69.,109.,105.,108.,105.,97.,0.,...]]

genders=[[1.,0.],
         [0.,1.],
         [1.,0.]]
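
A minimal sketch of this conversion (the helper names name_to_charcodes and gender_to_onehot are illustrative, not my exact code):

import numpy as np

MAX_LEN = 30

def name_to_charcodes(name):
    # Map each character to its code and zero-pad to a fixed length of 30
    codes = [float(ord(c)) for c in name[:MAX_LEN]]
    return codes + [0.0] * (MAX_LEN - len(codes))

def gender_to_onehot(g):
    # "f" -> [1, 0], "m" -> [0, 1]
    return [1.0, 0.0] if g == "f" else [0.0, 1.0]

inputData = np.array([name_to_charcodes(n) for n in ["Maria", "Paul", "Emilia"]])
outputData = np.array([gender_to_onehot(g) for g in ["f", "m", "f"]])
print(inputData.shape, outputData.shape)  # (3, 30) (3, 2)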

I built a network with three hidden layers of shapes [30,20], [20,10] and [10,10], plus a [10,2] output layer. All hidden layers use ReLU as their activation function; the output layer uses softmax.

# Input Layer
x = tf.placeholder(tf.float32, shape=[None, 30])
y_ = tf.placeholder(tf.float32, shape=[None, 2])

# Hidden Layers
# H1
W1 = tf.Variable(tf.truncated_normal([30, 20], stddev=0.1))
b1 = tf.Variable(tf.zeros([20]))
y1 = tf.nn.relu(tf.matmul(x, W1) + b1)

# H2
W2 = tf.Variable(tf.truncated_normal([20, 10], stddev=0.1))
b2 = tf.Variable(tf.zeros([10]))
y2 = tf.nn.relu(tf.matmul(y1, W2) + b2)

# H3
W3 = tf.Variable(tf.truncated_normal([10, 10], stddev=0.1))
b3 = tf.Variable(tf.zeros([10]))
y3 = tf.nn.relu(tf.matmul(y2, W3) + b3)

# Output Layer
W = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
b = tf.Variable(tf.zeros([2]))
y = tf.nn.softmax(tf.matmul(y3, W) + b)

Now the loss, accuracy and training operations:

# Loss
cross_entropy = -tf.reduce_sum(y_*tf.log(y))

# Accuracy
is_correct = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))

# Training
train_operation = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

I train the network in batches of 100:

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(150):
    bs = 100
    index = i*bs
    inputBatch = inputData[index:index+bs]
    outputBatch = outputData[index:index+bs]

    sess.run(train_operation, feed_dict={x: inputBatch, y_: outputBatch})
    accuracyTrain, lossTrain = sess.run([accuracy, cross_entropy], feed_dict={x: inputBatch, y_: outputBatch})

    if i%(bs/10) == 0:
        print("step %d loss %.2f accuracy %.2f" % (i, lossTrain, accuracyTrain))

And I get the following result:

step 0 loss 68.96 accuracy 0.55
step 10 loss 69.32 accuracy 0.50
step 20 loss 69.31 accuracy 0.50
step 30 loss 69.31 accuracy 0.50
step 40 loss 69.29 accuracy 0.51
step 50 loss 69.90 accuracy 0.53
step 60 loss 68.92 accuracy 0.55
step 70 loss 68.99 accuracy 0.55
step 80 loss 69.49 accuracy 0.49
step 90 loss 69.25 accuracy 0.52
step 100 loss 69.39 accuracy 0.49
step 110 loss 69.32 accuracy 0.47
step 120 loss 67.17 accuracy 0.61
step 130 loss 69.34 accuracy 0.50
step 140 loss 69.33 accuracy 0.47


What am I doing wrong?

Why does it start at ~69 in my project, and not lower?


Thanks so much guys!


Your loss of ~69 over a batch of 100 samples is about 0.69 nats per sample, and that number is telling: ln(2) ≈ 0.693 is exactly the cross entropy of a binary classifier that always predicts 50/50. In other words, the network has learned nothing.

Converted to base 2, 0.69/log(2) ≈ 1 bit per sample, i.e. the model is exactly as uncertain about each name as a coin flip.

So the starting value itself is not surprising; the problem is that the loss never moves away from it.
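
A quick sanity check of that arithmetic in plain Python:

import math

per_sample = math.log(2)       # cross entropy of a pure 50/50 guess: ~0.693 nats
batch_sum = 100 * per_sample   # reduce_sum over a batch of 100 samples
print(per_sample)              # 0.6931...
print(batch_sum)               # 69.31... -- matches the logged loss values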

One concrete issue: your hand-written loss takes the log of the softmax output, which is numerically unstable (log(0) gives -inf as soon as the softmax saturates). Compute the loss from the raw logits instead, with a built-in such as tf.nn.sigmoid_cross_entropy_with_logits.

The Adam optimizer is also usually a better default than plain gradient descent. A sketch combining both changes is below.
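
A minimal sketch, reusing x, y_ and y3 from the question above (the 1e-3 learning rate is just a common default, not from the original answer):

# Output layer: produce raw logits; the loss op applies the nonlinearity itself
W = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
b = tf.Variable(tf.zeros([2]))
logits = tf.matmul(y3, W) + b

# Numerically stable loss, averaged (not summed) over the batch
cross_entropy = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y_, logits=logits))

# Adam instead of plain gradient descent
train_operation = tf.train.AdamOptimizer(1e-3).minimize(cross_entropy)

# Softmax is still fine for reading out probabilities and accuracy
y = tf.nn.softmax(logits)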

As for why the network does not learn at all, two suggestions:

1) Raw character codes are a poor input representation: the values are large and unnormalized, and the numeric order of the codes means nothing for the task. One-hot encode each character instead; with a 26-letter alphabet and a maximum length of 30 that gives 26x30 = 780 sparse binary inputs (see the sketch after this list).

2) Think about which features actually predict gender. For example, of the most popular baby names of 2015, 6 of the top 10 girls' names end in "a", while 0 of the top 10 boys' names do. A single feature like "the last letter is a" already gets you far above chance, so choose an encoding in which the network can easily pick up such patterns.
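
Here is a minimal sketch of the one-hot encoding from point 1 (the lowercase 26-letter alphabet and the helper name encode_name are assumptions for illustration):

import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumption: plain lowercase a-z
MAX_LEN = 30

def encode_name(name):
    # One row per position, one column per letter: 30 x 26 = 780 inputs
    vec = np.zeros((MAX_LEN, len(ALPHABET)), dtype=np.float32)
    for pos, ch in enumerate(name.lower()[:MAX_LEN]):
        idx = ALPHABET.find(ch)
        if idx >= 0:  # ignore characters outside a-z
            vec[pos, idx] = 1.0
    return vec.reshape(-1)

print(encode_name("Maria").shape)  # (780,)
print(encode_name("Maria").sum())  # 5.0 -- one active bit per letter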

