I created an MLP with Google's TensorFlow. The network runs, but for some reason it refuses to learn properly. It always converges to an output of almost 1.0, no matter what the input actually is.
The full code can be seen here.
Any ideas?
The input and output (batch size 4) are as follows:
input_data = [[0., 0.], [0., 1.], [1., 0.], [1., 1.]]  # XOR input
output_data = [[0.], [1.], [1.], [0.]]  # XOR output

n_input = tf.placeholder(tf.float32, shape=[None, 2], name="n_input")
n_output = tf.placeholder(tf.float32, shape=[None, 1], name="n_output")
Hidden layer configuration:
# hidden layer bias neuron
b_hidden = tf.Variable(0.1, name="hidden_bias")
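For context, the hidden layer itself is presumably built along these lines (a sketch, not quoted from the linked code; hidden_nodes = 5 is inferred from the shape of W_hidden in the debug output further down):

hidden_nodes = 5
# hidden layer weight matrix, initialized like the output weights below
W_hidden = tf.Variable(tf.random_uniform([2, hidden_nodes], -1.0, 1.0), name="hidden_weights")
# hidden layer activation: sigmoid of the weighted input plus the bias neuron
hidden = tf.sigmoid(tf.matmul(n_input, W_hidden) + b_hidden)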
Output layer configuration:
W_output = tf.Variable(tf.random_uniform([hidden_nodes, 1], -1.0, 1.0), name="output_weights")
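The network output that the cross entropy below operates on would then be something like this (again a sketch; whether there is a separate output bias is not shown here):

# output layer activation: sigmoid squashes the result into (0, 1)
output = tf.sigmoid(tf.matmul(hidden, W_output))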
My training setup is as follows:
loss = tf.reduce_mean(cross_entropy)
I tried both of the following settings for the cross entropy:
cross_entropy = -tf.reduce_sum(n_output * tf.log(output))
and
cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(n_output, output)
where n_output is the target value, as given in output_data, and output is the predicted / calculated value of my network.
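The train op that shows up in the loop below is a plain minimization step on that loss, roughly like this (the actual optimizer and learning rate are assumptions on my part; the real ones are in the linked code):

optimizer = tf.train.GradientDescentOptimizer(0.1)  # learning rate is a guess
train = optimizer.minimize(loss)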
Training inside the loop (for n epochs) looks like this:
cvalues = sess.run([train, loss, W_hidden, b_hidden, W_output], feed_dict={n_input: input_data, n_output: output_data})
I save the results in cvalues for debug printing of loss, W_hidden, ...
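Put together, the surrounding loop is roughly this (a sketch; the session setup and print cadence are my assumptions, and 2000 epochs matches the output below):

sess = tf.Session()
sess.run(tf.initialize_all_variables())  # tf.global_variables_initializer() in newer TF 1.x

for epoch in range(2000):
    cvalues = sess.run([train, loss, W_hidden, b_hidden, W_output],
                       feed_dict={n_input: input_data, n_output: output_data})
    if epoch % 400 == 0:
        print("step: %d loss: %s" % (epoch, cvalues[1]))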
No matter what I try, when I test the network to verify the result, it always produces something like this:
(...)
step: 2000
loss: 0.0137040186673
b_hidden: 1.3272010088
W_hidden: [[ 0.23195425  0.53248233 -0.21644847 -0.54775208  0.52298909]
           [ 0.73933059  0.51440752 -0.08397482 -0.62724304 -0.53347367]]
W_output: [[ 1.65939867]
           [ 0.78912479]
           [ 1.4831928 ]
           [ 1.28612828]
           [ 1.12486529]]

(--- finished with 2000 epochs ---)

(Test input for validation:)

input: [0.0, 0.0] | output: [[ 0.99339396]]
input: [0.0, 1.0] | output: [[ 0.99289012]]
input: [1.0, 0.0] | output: [[ 0.99346077]]
input: [1.0, 1.0] | output: [[ 0.99261558]]
So it does not learn correctly; it always converges to almost 1.0 no matter which input is fed in.