I have the following dataset for a chemical process, in which 5 consecutive input vectors produce 1 output. The inputs are sampled every minute, while the output is sampled every 5 minutes.
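
To make the data layout concrete, here is a minimal sketch of how the samples could be grouped, assuming non-overlapping windows of 5 one-minute readings per output. The feature count of 3, the helper names `make_windows` and `shape_of`, and the toy values are all hypothetical and only serve to illustrate the layout; my real data has its own number of features.

```python
from typing import List, Tuple

Vector = List[float]


def make_windows(
    inputs: List[Vector], outputs: List[float], timesteps: int = 5
) -> Tuple[List[List[Vector]], List[float]]:
    """Group minute-level input vectors into non-overlapping windows of
    `timesteps` readings and pair each window with one output value."""
    n_samples = min(len(inputs) // timesteps, len(outputs))
    X = [inputs[i * timesteps:(i + 1) * timesteps] for i in range(n_samples)]
    y = outputs[:n_samples]
    return X, y


def shape_of(X: List[List[Vector]]) -> Tuple[int, int, int]:
    """Return (samples, timesteps, features) for a non-empty nested list."""
    return (len(X), len(X[0]), len(X[0][0]))


# Hypothetical toy data: 20 minutes of readings with 3 features each,
# and one output for every 5 minutes (4 outputs in total).
n_features = 3
inputs = [[float(t + f) for f in range(n_features)] for t in range(20)]
outputs = [float(k) for k in range(4)]

X, y = make_windows(inputs, outputs)
print(shape_of(X))  # (4, 5, 3)
```

In this sketch the first axis counts the windows (one per output), the second the 5 time steps, and the third the features in each input vector.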

Since I believe that the output depends on the 5 previous input vectors, I decided to use an LSTM for my design. After much research on what my LSTM architecture should look like, I came to the conclusion that I should mask the output sequence with zeros at every time step except the last, keeping only the final output. The final architecture for my dataset is shown below:

My question is: what should the dimensions of the three-dimensional input tensor be? For instance, [5, 5, ?]? And what should my "batch size" be? Is it the number of my samples?