After completing the MNIST / CIFAR tutorials, I thought I'd experiment with TensorFlow by creating my own "big" dataset. For simplicity, I settled on a black-and-white oval shape whose height and width each vary independently on a 0.0-1.0 scale, in images 28x28 pixels in size (of which I have 5000 training images and 1000 test images).
My code uses the MNIST "expert" tutorial as its basis (shortened for speed), but I switched to a squared-error loss and, based on tips here, changed the final activation to a sigmoid, given that this is not classification but rather a "best fit" between two tensors, y_ and y_conv.
However, over >100 thousand iterations, the reported loss quickly settles between 400 and 900 (or, equivalently, 0.2-0.3 off any given label, averaged over the 2 labels in a batch of 50), so I assume I'm just producing noise. Maybe I'm wrong, but I was hoping to use TensorFlow to convolve images down to maybe 10 or more independently labeled variables. Did I miss something fundamental here?
def train(images, labels):
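Only the signature of my training function survived above, so as a hedged sketch of the setup described (sigmoid final activation, summed squared-error loss over two regression targets in [0, 1]), here is the loss math in plain NumPy; the names `sigmoid` and `squared_error` are mine, and the random values just stand in for an untrained network:

```python
import numpy as np

def sigmoid(z):
    # Final-layer activation: squashes logits into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def squared_error(y_pred, y_true):
    # Summed squared difference, like tf.reduce_sum(tf.square(y_ - y_conv))
    return np.sum((y_pred - y_true) ** 2)

rng = np.random.default_rng(0)
y_true = rng.random((50, 2))                 # batch of 50, two targets in [0, 1]
y_pred = sigmoid(rng.normal(size=(50, 2)))   # untrained sigmoid outputs

loss = squared_error(y_pred, y_true)
per_label_rmse = np.sqrt(loss / y_true.size) # average distance per label
```

Note that because the loss is a *sum*, its magnitude scales with batch size, which matters when interpreting raw loss numbers like 400-900.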
What bothers me most is that TensorBoard seems to show the weights practically unchanged, even after hours and hours of training and a decaying learning rate (although this is not shown in the code). My understanding of machine learning is that when you convolve images, the layers effectively become edge-detection layers... so I'm confused why they barely change.
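One possible cause of near-frozen weights (this is a guess about the described setup, not a diagnosis of code I can't see): with a sigmoid output and squared error, the gradient passes through the sigmoid's derivative s(1-s), which vanishes when the pre-activation is far from zero. A quick NumPy check of that derivative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    # d/dz sigmoid(z) = s * (1 - s)
    s = sigmoid(z)
    return s * (1.0 - s)

g_center = sigmoid_grad(0.0)     # maximum gradient, exactly 0.25
g_saturated = sigmoid_grad(8.0)  # tiny: the unit is saturated
```

If the final layer's inputs drift into the saturated region, the weights feeding it receive almost no gradient and will look flat in TensorBoard's histograms even though training is "running".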
My current theories are:
1. I missed / misunderstood something regarding the loss function.
2. I misunderstand how the weights are initialized / updated.
3. I greatly underestimated how long the process should take... although, again, the loss just fluctuates.
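On theory 1, one thing worth checking (hedged suggestion, with made-up numbers): a summed squared error grows with batch size, so its raw value is hard to interpret, while a mean squared error is directly comparable to the 0.0-1.0 label scale:

```python
import numpy as np

# Two hypothetical predictions vs. targets, both on the 0.0-1.0 label scale
y_true = np.array([[0.2, 0.7], [0.5, 0.4]])
y_pred = np.array([[0.3, 0.6], [0.7, 0.1]])

sum_loss  = np.sum((y_pred - y_true) ** 2)   # scales with batch size
mean_loss = np.mean((y_pred - y_true) ** 2)  # per-element, batch-size independent
rmse = np.sqrt(mean_loss)                    # average miss per label, same units as labels
```

In TensorFlow terms this is the difference between `tf.reduce_sum` and `tf.reduce_mean` over the squared errors; the optimizer minimizes either, but the mean version makes the printed loss directly readable against the labels.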
Any help would be greatly appreciated, thanks!