Implementation of the cost function of a neural network (Week No. 5 Coursera) using Python

Based on a Coursera machine learning course, I am trying to implement the cost function for a neural network in Python. There is a similar question with an accepted answer, but the code in that answer is written in Octave. So as not to be lazy, I tried to adapt the corresponding concepts of the answer to my case, and as far as I can tell I have implemented the function correctly. However, the cost I output differs from the expected value, so I am doing something wrong.

Here is a small reproducible example:

The following link leads to an .npz file that can be downloaded to obtain the corresponding data (it is loaded as shown below). Please rename the file to "arrays.npz" if you use it.

http://www.filedropper.com/arrays_1

import numpy as np

with np.load("arrays.npz") as data:

    thrLayer = data['thrLayer'] # The final layer post activation; you
    # can derive this final layer, if verification is needed, using the weights below

    thetaO = data['thetaO'] # The weight array between layers 1 and 2
    thetaT = data['thetaT'] # The weight array between layers 2 and 3

    Ynew = data['Ynew'] # The output array with a 1 in position i and 0s elsewhere
    # class i is the class that the data described by X[i,:] belongs to

    X = data['X'] # Raw data with 1s appended to the first column
    Y = data['Y'] # One-dimensional column vector; entry i contains the class of entry i


m = len(thrLayer)
k = thrLayer.shape[1]
cost = 0

for i in range(m):
    for j in range(k):
        cost += -Ynew[i,j]*np.log(thrLayer[i,j]) - (1 - Ynew[i,j])*np.log(1 - thrLayer[i,j])
print(cost)
cost /= m

'''
Regularized Cost Component
'''

regCost = 0

for i in range(len(thetaO)):
    for j in range(1,len(thetaO[0])):
        regCost += thetaO[i,j]**2

for i in range(len(thetaT)):
    for j in range(1,len(thetaT[0])):
        regCost += thetaT[i,j]**2

regCost *= lam/(2*m) # lam is the regularization parameter (lambda)


print(cost)
print(regCost)

In fact, cost should be 0.287629, and cost + regCost should be 0.383770.

This is the cost function posted in the question above for reference:


[image: the regularized cost function J(Θ) for a neural network]
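In case the image does not load, this is the regularized cost function from the course (for a network with $L$ layers, $K$ output classes, and $s_l$ units in layer $l$):

$$
J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\Big[\, y_k^{(i)}\log\big((h_\Theta(x^{(i)}))_k\big) + \big(1-y_k^{(i)}\big)\log\big(1-(h_\Theta(x^{(i)}))_k\big)\Big] + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\big(\Theta_{j,i}^{(l)}\big)^2
$$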

1 answer

The problem is that you are using the wrong class labels. When calculating the cost function, you need to use the ground truth, i.e. the true class labels.

The Ynew array, as supplied, does not match the labels in Y. If you construct the one-hot labels from Y instead of using Ynew, you get the expected result:

import numpy as np

with np.load("arrays.npz") as data:

    thrLayer = data['thrLayer'] # The final layer post activation; you
    # can derive this final layer, if verification needed, using weights below

    thetaO = data['thetaO'] # The weight array between layers 1 and 2
    thetaT = data['thetaT'] # The weight array between layers 2 and 3

    Ynew = data['Ynew'] # The output array with a 1 in position i and 0s elsewhere

    #class i is the class that the data described by X[i,:] belongs to

    X = data['X'] #Raw data with 1s appended to the first column
    Y = data['Y'] #One dimensional column vector; entry i contains the class of entry i


m = len(thrLayer)
k = thrLayer.shape[1]
cost = 0

# Build the one-hot ground-truth labels from Y (classes in Y are 1-indexed)
Y_arr = np.zeros(Ynew.shape)
for i in range(m):
    Y_arr[i,int(Y[i,0])-1] = 1

for i in range(m):
    for j in range(k):
        cost += -Y_arr[i,j]*np.log(thrLayer[i,j]) - (1 - Y_arr[i,j])*np.log(1 - thrLayer[i,j])
cost /= m

'''
Regularized Cost Component
'''

regCost = 0

for i in range(len(thetaO)):
    for j in range(1,len(thetaO[0])):
        regCost += thetaO[i,j]**2

for i in range(len(thetaT)):
    for j in range(1,len(thetaT[0])):
        regCost += thetaT[i,j]**2
lam = 1 # regularization parameter (lambda)
regCost *= lam/(2.*m) # note the float division: 2.*m, not 2*m


print(cost)
print(cost + regCost)

This outputs:

0.287629165161
0.383769859091

Edit: make sure you use regCost *= lam/(2.*m) rather than regCost *= lam/(2*m); with integer division (Python 2), the latter zeroes out regCost.
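As a follow-up, the same computation can be vectorized with NumPy, which avoids the double loops. This is only a sketch under the same assumptions as above (the arrays come from arrays.npz, lam is the regularization parameter, and the bias column of each weight matrix is excluded from regularization); the function and variable names here are my own:

import numpy as np

def nn_cost(thrLayer, Y, thetaO, thetaT, lam=1.0):
    """Return (unregularized cost, regularized cost) for the arrays above."""
    m = thrLayer.shape[0]

    # One-hot encode the ground-truth labels (classes in Y are 1-indexed)
    Y_arr = np.zeros_like(thrLayer)
    Y_arr[np.arange(m), Y[:, 0].astype(int) - 1] = 1

    # Cross-entropy cost, averaged over the m examples
    cost = -np.sum(Y_arr * np.log(thrLayer)
                   + (1 - Y_arr) * np.log(1 - thrLayer)) / m

    # Regularization term; column 0 (the bias weights) is skipped
    regCost = lam / (2.0 * m) * (np.sum(thetaO[:, 1:] ** 2)
                                 + np.sum(thetaT[:, 1:] ** 2))
    return cost, cost + regCost

with np.load("arrays.npz") as data:
    cost, total = nn_cost(data['thrLayer'], data['Y'],
                          data['thetaO'], data['thetaT'], lam=1.0)
print(cost)   # should be about 0.287629
print(total)  # should be about 0.383770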


Source: https://habr.com/ru/post/1649464/

