Your question has two parts:
- How to extend this problem to a higher-dimensional space.
- How to move from batch gradient descent to stochastic gradient descent.
To get the higher-dimensional setting, you can define your linear model as y = <x, w>. Then you just need to resize your `W` variable so that it matches the dimension of `w`, and replace the multiplication `W*x_data` with the matrix product `tf.matmul(x_data, W)`; your code should then work fine.
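As a minimal sketch of that change (assuming the TensorFlow 1.x API used in the rest of this answer; the dimension `d`, the weight vector `w_true`, and the sample count are arbitrary illustrative choices):

```python
import numpy as np
import tensorflow as tf  # TensorFlow 1.x-style API, as in the question

d = 3  # hypothetical input dimension

# Synthetic training data drawn from a known linear model y = <x, w_true>
x_data = np.random.rand(100, d).astype(np.float32)
w_true = np.array([[1.0], [2.0], [3.0]], dtype=np.float32)
y_data = x_data.dot(w_true)

# W is now a (d, 1) vector instead of a scalar
W = tf.Variable(tf.random_uniform([d, 1], -1.0, 1.0))
y = tf.matmul(x_data, W)  # replaces the scalar multiplication W * x_data
```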
To change the learning method to stochastic gradient descent, you need to abstract the input of the cost function using `tf.placeholder`. Once you have defined `X` and `y_` to hold the input at each step, you can build the same cost function. You then run your training step repeatedly, feeding it the correct mini-batch of your data each time.
Here is an example of how you can implement this behavior; it should show that `W` quickly converges to `w`.
```python
import tensorflow as tf
import numpy as np
```
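A minimal sketch of such a mini-batch training loop (assuming the TF1 API; the dimension `d`, `n_samples`, `mini_batch_size`, and the learning rate are arbitrary illustrative values, not prescribed by the question):

```python
import tensorflow as tf
import numpy as np

# Synthetic data: y = <x, w_true> with a known 3-dimensional weight vector
d = 3
n_samples = 1000
w_true = np.array([[1.0], [2.0], [3.0]], dtype=np.float32)
x_all = np.random.rand(n_samples, d).astype(np.float32)
y_all = x_all.dot(w_true)

mini_batch_size = 10  # set to 1 for plain stochastic gradient descent

# Placeholders abstract the input so each step can receive a different mini-batch
X = tf.placeholder(tf.float32, shape=[None, d])
y_ = tf.placeholder(tf.float32, shape=[None, 1])

W = tf.Variable(tf.random_uniform([d, 1], -1.0, 1.0))
y = tf.matmul(X, W)

loss = tf.reduce_mean(tf.square(y - y_))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        # Pick a random mini-batch and feed it through the placeholders
        idx = np.random.choice(n_samples, mini_batch_size, replace=False)
        sess.run(train_step, feed_dict={X: x_all[idx], y_: y_all[idx]})
        if step % 100 == 0:
            print(step, sess.run(W).ravel())

    # W should now be close to w_true
    print("final W:", sess.run(W).ravel())
```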
Two side notes:
The implementation above is called mini-batch gradient descent, because at each step the gradient is computed on a subset of the data of size `mini_batch_size`. This is a variant of stochastic gradient descent that is usually used to stabilize the gradient estimate at each step. Plain stochastic gradient descent can be obtained by setting `mini_batch_size = 1`.
The dataset can be shuffled at each epoch to bring the implementation closer to the theoretical setting. Some recent work also uses only a single pass through the dataset, as it prevents overfitting. For a more detailed mathematical treatment, see Bottou12. This can easily be adapted to your problem setting and the statistical properties you are looking for.
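For instance, a per-epoch shuffle could look roughly like the following sketch, which reuses the names `x_all`, `y_all`, `X`, `y_`, `train_step`, `n_samples`, and the open session from the example above (`n_epochs` is an arbitrary illustrative choice):

```python
n_epochs = 20  # illustrative number of passes over the data

for epoch in range(n_epochs):
    # Reshuffle once per epoch so every pass visits the data in a fresh order
    perm = np.random.permutation(n_samples)
    x_shuffled, y_shuffled = x_all[perm], y_all[perm]
    for start in range(0, n_samples, mini_batch_size):
        batch = slice(start, start + mini_batch_size)
        sess.run(train_step, feed_dict={X: x_shuffled[batch], y_: y_shuffled[batch]})
```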