Cross-validation in Keras

I am building a multi-layer perceptron in Keras and using scikit-learn to do cross-validation. For this, I was inspired by the code found in the cross-validation issue in the Keras repository:

from sklearn.model_selection import StratifiedKFold

def load_data():
    # load your data using this function
    ...

def create_model():
    # create your model using this function
    ...

def train_evaluate(model, x_train, y_train, x_test, y_test):
    # fit and evaluate here
    ...

if __name__ == "__main__":
    X, Y = load_data()
    kFold = StratifiedKFold(n_splits=10)
    for train, test in kFold.split(X, Y):
        model = None
        model = create_model()
        train_evaluate(model, X[train], Y[train], X[test], Y[test])

In my research on neural networks, I learned that a neural network's knowledge lives in its synaptic weights, and that during training the weights are updated so as to reduce the network's error rate and improve its performance. (In my case, I use supervised learning.)

For better training and evaluation of a neural network's performance, a commonly used method is cross-validation, which produces splits (folds) of the data set for training and evaluating the model.
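
A minimal illustration of what such a split yields (the toy arrays here are purely illustrative):

import numpy as np
from sklearn.model_selection import StratifiedKFold

# toy data: 12 samples with 2 features each, two balanced classes
X = np.arange(24).reshape(12, 2)
Y = np.array([0, 1] * 6)

kFold = StratifiedKFold(n_splits=3)
for train, test in kFold.split(X, Y):
    # each iteration yields disjoint index arrays: one fold for testing,
    # the remaining folds for training
    print("train indices:", train, "test indices:", test)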

My question is this:

In this code snippet:

for train, test in kFold.split(X, Y):
    model = None
    model = create_model()
    train_evaluate(model, X[train], Y[train], X[test], Y[test])

Do we instantiate, train, and evaluate a new neural network for each of the created folds?

If my goal is to fine-tune the network on the entire data set, why is it wrong to instantiate a single neural network and train it across all the generated folds?

That is, why is the code written like this:

for train, test in kFold.split(X, Y):
    model = None
    model = create_model()
    train_evaluate(model, X[train], Y[train], X[test], Y[test])

and not like this?

model = None
model = create_model()
for train, test in kFold.split(X, Y):
    train_evaluate(model, X[train], Y[train], X[test], Y[test])

Am I understanding how the code works correctly? And is my theory right?

Thanks!

+9
5 answers

If my goal is to fine-tune the network on the entire data set

It is unclear what you mean by “fine tuning” or even for what purpose you perform cross-validation (CV); in general, CV serves one of the following purposes:

  • Model selection (select hyperparameter values)
  • Model evaluation (estimate the expected performance)

Since you do not define any search grid for hyperparameter selection in your code, it seems that you are using CV to estimate the expected performance of your model (error, accuracy, etc.).

In any case, whatever your reason for using CV, the first snippet is the correct one; your second snippet

model = None
model = create_model()
for train, test in kFold.split(X, Y):
    train_evaluate(model, X[train], Y[train], X[test], Y[test])

will train your model sequentially over the different folds (i.e., train on fold #1, then continue training on fold #2, and so on), which in essence is just training on your entire data set, and it is certainly not cross-validation...

However, the last step after CV, which is often only implied (and often missed by beginners), is that once you are satisfied with your chosen hyperparameters and/or model performance, as indicated by your CV procedure, you go back and train your model again, this time on all the available data.
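
A minimal sketch of that final step, reusing the (hypothetical) load_data / create_model helpers from the question; the training settings are placeholders, not values prescribed by the answer:

# after CV has settled the hyperparameters / performance estimate,
# fit a fresh model on the full data set for actual use
X, Y = load_data()
final_model = create_model()
final_model.fit(X, Y, epochs=100, batch_size=32)  # illustrative settings only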

+8

I think many of your questions will be answered if you read about nested cross-validation. This is a good way to fine-tune your model's hyperparameters. There is a thread on this here:

https://stats.stackexchange.com/questions/65128/nested-cross-validation-for-model-selection

The biggest problem to be aware of is "peeking", or circular logic. Essentially, you want to make sure that none of the data used to evaluate the model's accuracy is seen during training.

One example where this can go wrong is if you use something like PCA or ICA to extract features. If you do, you should fit the PCA on your training set only, and then apply the transformation learned from the training set to the test set, as in the sketch below.
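
A minimal sketch of this inside a fold loop, assuming generic X, Y arrays as in the question; the PCA size and the simple estimator are illustrative assumptions, not part of the original answer:

from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

kFold = StratifiedKFold(n_splits=10)
for train, test in kFold.split(X, Y):
    pca = PCA(n_components=5)             # fit the transformation on the training fold only
    X_train_p = pca.fit_transform(X[train])
    X_test_p = pca.transform(X[test])     # reuse the training-fold transformation on the test fold

    clf = LogisticRegression()            # illustrative estimator; a Keras model is used the same way
    clf.fit(X_train_p, Y[train])
    print(clf.score(X_test_p, Y[test]))

The same effect can be obtained more compactly with a scikit-learn Pipeline, which refits the transformer inside every fold automatically.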

+2

The stubbed-out functions make this a little less obvious, but the idea is to track the performance of your model as you iterate over the folds, and at the end to report either those fold-level performance metrics or an averaged global metric. For instance:

The train_evaluate function should ideally output an accuracy score for each split, which can then be combined at the end:

import numpy as np
from sklearn.model_selection import StratifiedKFold

def train_evaluate(model, x_train, y_train, x_test, y_test):
    # assumes the model exposes a scikit-learn-style fit/score interface
    # (e.g. via the Keras scikit-learn wrappers)
    model.fit(x_train, y_train)
    return model.score(x_test, y_test)

X, Y = load_data()  # load_data / create_model as defined in the question
kFold = StratifiedKFold(n_splits=10)
scores = np.zeros(10)
idx = 0
for train, test in kFold.split(X, Y):
    model = create_model()
    scores[idx] = train_evaluate(model, X[train], Y[train], X[test], Y[test])
    idx += 1

print(scores)
print(scores.mean())

So yes, you want to create a new model for each fold, because the goal of this exercise is to determine how your model, as it is designed, performs on all segments of the data, not just on one particular segment that may or may not happen to let the model perform well.

This type of approach becomes especially powerful when combined with a grid search over hyperparameters. In that approach, you train a model with various hyperparameter combinations using cross-validation, and track the performance on each split and overall. In the end, you get a much better idea of which hyperparameters let the model perform best. For a more detailed explanation, see the sklearn model selection documentation, paying particular attention to the Cross-validation and Grid Search sections.
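
A minimal sketch of combining grid search with cross-validation, using the Keras scikit-learn wrapper (see the last answer below); the architecture and the parameter grid are illustrative assumptions:

from sklearn.model_selection import GridSearchCV, StratifiedKFold
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

def build_model(units=10):
    # hypothetical architecture, only here to have something to tune
    model = Sequential([
        Dense(units, activation="relu"),
        Dense(1, activation="sigmoid")
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

estimator = KerasClassifier(build_fn=build_model, verbose=0)
param_grid = {"units": [5, 10, 20], "epochs": [50, 100], "batch_size": [10, 32]}

# every hyperparameter combination is scored with the same 10-fold CV scheme
search = GridSearchCV(estimator, param_grid, cv=StratifiedKFold(n_splits=10))
search.fit(X, Y)
print(search.best_params_, search.best_score_)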

+1

The main idea of ​​testing the performance of your model is to follow these steps:

  • Train the model on a training set.
  • Evaluate the model on data that was not used during training, to simulate the arrival of new data.

So, basically, the data you ultimately test on should simulate the first batch of data you will receive from your client/application once you deploy your model.

This is why cross-validation is so powerful: it lets every data point in the entire data set play the role of unseen, new data at some point.

And now, to answer your question: every cross-validation should follow this pattern:

for train, test in kFold.split(X, Y):
    model = training_procedure(train, ...)
    score = evaluation_procedure(model, test, ...)

because in the end you will first train your model and then use it on new data. In your second approach, you cannot treat it as a simulation of that training process because, for example, by the second fold your model already carries information learned from the first fold, which is not equivalent to your actual training procedure.

Of course, you can use a training procedure that trains on the 10 folds in succession in order to fine-tune the network. But then that is no longer cross-validation, and you would still need to evaluate the procedure with one of the schemes above.

+1

You can use the scikit-learn API wrappers with Keras models.

Given inputs x and y, here is an example of repeated 5-fold cross-validation:

from sklearn.model_selection import RepeatedKFold, cross_val_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor

def buildmodel():
    model = Sequential([
        Dense(10, activation="relu"),
        Dense(5, activation="relu"),
        Dense(1)
    ])
    model.compile(optimizer='adam', loss='mse', metrics=['mse'])
    return model

estimator = KerasRegressor(build_fn=buildmodel, epochs=100, batch_size=10, verbose=0)
kfold = RepeatedKFold(n_splits=5, n_repeats=100)
results = cross_val_score(estimator, x, y, cv=kfold, n_jobs=2)  # 2 CPUs
results.mean()  # mean MSE
0

Source: https://habr.com/ru/post/1274481/

