Confused about tensor and batch sizes in PyTorch

So, I'm very new to PyTorch and Neural Networks in general, and I'm having trouble creating a neural network that classifies names by gender.
I based this on a PyTorch tutorial for RNNs that classify names by nationality, but I decided not to go with a recurrent approach ... Stop me right here if that was a bad idea!
However, when I try to run a name through the network, it tells me:

RuntimeError: matrices expected, got 3D, 2D tensors at /py/conda-bld/pytorch_1493681908901/work/torch/lib/TH/generic/THTensorMath.c:1232 

I know this is due to the way PyTorch always expects a batch dimension or something like that, and the way I have my tensor set up, but you can probably already tell that I have no idea what I'm talking about. Here is my code:

    from __future__ import unicode_literals, print_function, division
    from io import open
    import glob
    import unicodedata
    import string
    import torch
    import torchvision
    import torch.nn as nn
    import torch.optim as optim
    import random
    from torch.autograd import Variable
    import matplotlib.pyplot as plt
    import matplotlib.ticker as ticker

    """------GLOBAL VARIABLES------"""
    all_letters = string.ascii_letters + " .,;'"
    num_letters = len(all_letters)
    all_names = {}
    genders = ["Female", "Male"]

    """-------DATA EXTRACTION------"""
    def findFiles(path):
        return glob.glob(path)

    def unicodeToAscii(s):
        return ''.join(
            c for c in unicodedata.normalize('NFD', s)
            if unicodedata.category(c) != 'Mn'
            and c in all_letters
        )

    # Read a file and split into lines
    def readLines(filename):
        lines = open(filename, encoding='utf-8').read().strip().split('\n')
        return [unicodeToAscii(line) for line in lines]

    for file in findFiles("/home/andrew/PyCharm/PycharmProjects/CantStop/data/names/*.txt"):
        gender = file.split("/")[-1].split(".")[0]
        names = readLines(file)
        all_names[gender] = names

    """-----DATA INTERPRETATION-----"""
    def nameToTensor(name):
        tensor = torch.zeros(len(name), 1, num_letters)
        for index, letter in enumerate(name):
            tensor[index][0][all_letters.find(letter)] = 1
        return tensor

    def outputToGender(output):
        gender, gender_index = output.data.topk(1)
        if gender_index[0][0] == 0:
            return "Female"
        return "Male"

    """------NETWORK SETUP------"""
    class Net(nn.Module):
        def __init__(self, input_size, output_size):
            super(Net, self).__init__()
            # Layer 1
            self.Lin1 = nn.Linear(input_size, int(input_size/2))
            self.ReLu1 = nn.ReLU()
            self.Batch1 = nn.BatchNorm1d(int(input_size/2))
            # Layer 2
            self.Lin2 = nn.Linear(int(input_size/2), output_size)
            self.ReLu2 = nn.ReLU()
            self.Batch2 = nn.BatchNorm1d(output_size)
            self.softMax = nn.LogSoftmax()

        def forward(self, input):
            output1 = self.Batch1(self.ReLu1(self.Lin1(input)))
            output2 = self.softMax(self.Batch2(self.ReLu2(self.Lin2(output1))))
            return output2

    NN = Net(num_letters, 2)
"""------TRAINING------""" def getRandomTrainingEx(): gender = genders[random.randint(0, 1)] name = all_names[gender][random.randint(0, len(all_names[gender])-1)] gender_tensor = Variable(torch.LongTensor([genders.index(gender)])) name_tensor = Variable(nameToTensor(name)) return gender_tensor, name_tensor, gender def train(input, target): loss_func = nn.NLLLoss() optimizer = optim.SGD(NN.parameters(), lr=0.0001, momentum=0.9) optimizer.zero_grad() output = NN(input) loss = loss_func(output, target) loss.backward() optimizer.step() return output, loss all_losses = [] current_loss = 0 for i in range(100000): gender_tensor, name_tensor, gender = getRandomTrainingEx() output, loss = train(name_tensor, gender_tensor) current_loss += loss if i%1000 == 0: print("Guess: %s, Correct: %s, Loss: %s" % (outputToGender(output), gender, loss.data[0])) if i%100 == 0: all_losses.append(current_loss/10) current_loss = 0 # plt.figure() # plt.plot(all_losses) # plt.show() 

Please help the newbie!

2 answers
  • Debugging your error:

PyCharm is a useful Python debugger that lets you set breakpoints and watch the dimensions of your tensors.
To make debugging easier, do not chain the forward pass into a single expression like this

 output1 = self.Batch1(self.ReLu1(self.Lin1(input))) 

Instead, write it step by step:

    h1 = self.ReLu1(self.Lin1(input))
    h2 = self.Batch1(h1)

As for the stack trace, PyTorch also provides Pythonic stack-trace errors. I believe that before

 RuntimeError: matrices expected, got 3D, 2D tensors at /py/conda-bld/pytorch_1493681908901/work/torch/lib/TH/generic/THTensorMath.c:1232 

there are some Python stack-trace lines that point directly to your code. To simplify debugging, as I said, do not chain the forward calls.

Use PyCharm to set a breakpoint inside forward. Then, in the debugger's watch window, feed a dummy tensor such as Variable(torch.rand(dim1, dim2)) through each layer to check the input and output dimensions, and compare them with the dimension of your actual input. You can call input.size() in the watch window.

For example, evaluate self.ReLu1(self.Lin1(Variable(torch.rand(10, 20)))).size() . If red error text is displayed, the input size is wrong; otherwise it shows the size of the output.
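Here is a standalone sketch of that kind of shape check (the layer sizes are made up for illustration, and recent PyTorch versions no longer need the Variable wrapper):

```python
import torch
import torch.nn as nn

# hypothetical layer: expects the last dimension to be 20
lin = nn.Linear(20, 10)

x = torch.rand(3, 20)        # (batch, features) -- the 2D shape nn.Linear expects
print(x.size())              # torch.Size([3, 20])
print(lin(x).size())         # torch.Size([3, 10])
```

If the tensor you pass in has a different last dimension (or, on old PyTorch versions like the one in the question, an extra dimension), the same call raises a RuntimeError instead of printing a size.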

(screenshot: PyCharm debugger watch window)

  • Read the docs

The PyTorch docs define the input and output sizes of each layer. They also include example code snippets.

    >>> rnn = nn.RNN(10, 20, 2)
    >>> input = Variable(torch.randn(5, 3, 10))
    >>> h0 = Variable(torch.randn(2, 3, 20))
    >>> output, hn = rnn(input, h0)

You can run such snippets in the PyCharm debugger to examine the input and output sizes of whichever layer interests you (RNN, Linear, BatchNorm1d).
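For instance, running the docs snippet and printing the sizes shows the (seq_len, batch, features) convention (plain tensors used here; recent PyTorch no longer needs Variable):

```python
import torch
import torch.nn as nn

rnn = nn.RNN(10, 20, 2)        # input_size=10, hidden_size=20, num_layers=2
inp = torch.randn(5, 3, 10)    # (seq_len=5, batch=3, input_size=10)
h0 = torch.randn(2, 3, 20)     # (num_layers=2, batch=3, hidden_size=20)

output, hn = rnn(inp, h0)
print(output.size())           # torch.Size([5, 3, 20])
print(hn.size())               # torch.Size([2, 3, 20])
```

Comparing these printed sizes against your own tensors is the quickest way to see where a dimension mismatch comes from.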


Firstly, regarding your error: as the other answer says, and as your exception suggests, it is probably because your input tensors are not shaped correctly. You can try debugging to isolate the line that raises the error, and then edit your question with it, so we know exactly what causes the problem and can fix it (without a full stack trace, it's harder to tell what the problem is).

Now, you are trying to implement a neural network that classifies names by gender, as you indicated. This task requires you to somehow input a name (which has variable size) and output a gender (a binary variable: male, female). However, neural networks are generally built and trained to classify fixed-size inputs (vectors), as mentioned in the pytorch docs:

Parameters: input_size – the number of expected features in the input x

...

Looking at the tutorial you mentioned, they handle this situation: in their case the input to the network is a single letter, converted to a "one-hot vector", as they state:

To run a step of this network we need to pass an input (in our case, the tensor for the current letter) and a previous hidden state (which we initialize as zeros at first). We'll get back the output (the probability of each language) and a next hidden state (which we keep for the next step).

And they even give an example (remember that in pytorch, tensors are wrapped in Variable):

    input = Variable(letterToTensor('A'))
    hidden = Variable(torch.zeros(1, n_hidden))
    output, next_hidden = rnn(input, hidden)

Note: That said, there are other steps you can take to adapt your implementation to variable-sized inputs. Based on my experience, and complemented by this and this other great question, you could:

  • Pre-process your data to extract new features and convert them to fixed-size inputs. This is usually the most common approach, but requires experience and patience to engineer good features. Some methods used: PCA (Principal Component Analysis) and LDA (Latent Dirichlet Allocation).

    For example, you could extract features from your data such as: the length of the name, the number of letter a's in the name (female names tend to have more of them), the number of letter e's in the name (the same with male names, maybe?), and others... so you would create new feature vectors like [name_length, a_found, e_found, ...] . Then you could follow the regular approach with your new fixed-size vectors. Note that these features should actually make sense; I just made these up (although they might work).

  • Split your input names into fixed-size substrings (or iterate over them with a sliding window), so that you can classify each piece with a network designed for that size and combine the outputs in an ensemble to get the final classification.
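To make the first option concrete, here is a toy version of that kind of feature extraction, using the made-up features from above (not a recommendation of these particular features):

```python
def name_to_features(name):
    """Turn a variable-length name into a fixed-size feature vector."""
    name = name.lower()
    return [
        len(name),                         # name_length
        name.count('a'),                   # a_found
        name.count('e'),                   # e_found
        1 if name[-1] in 'aeiou' else 0,   # ends in a vowel?
    ]

print(name_to_features("Andrew"))   # [6, 1, 1, 0]
print(name_to_features("Maria"))    # [5, 2, 0, 1]
```

Every name now maps to a vector of the same length, so a plain feed-forward network like your Net can consume it directly.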
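And a minimal sketch of the second option, the sliding-window idea (the window size and padding character are arbitrary choices here):

```python
def sliding_windows(name, size=3, pad='#'):
    """Split a name into fixed-size overlapping substrings, padding short names."""
    name = name.lower().ljust(size, pad)
    return [name[i:i + size] for i in range(len(name) - size + 1)]

print(sliding_windows("Anna"))   # ['ann', 'nna']
print(sliding_windows("Al"))     # ['al#']
```

Each fixed-size window could then be one-hot encoded and classified separately, with the per-window outputs averaged or voted on to get the final gender prediction.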


Source: https://habr.com/ru/post/1269681/
