Other Validation Functionalities

Splitting Datasets With scikit-learn and train_test_split() Darren Jones 02:09

00:00 Other validation functionalities.

00:04 scikit-learn’s model_selection module offers many tools for model selection and validation, including cross-validation, learning curves, and hyperparameter tuning.

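00:51 In k-fold cross-validation, each of the k folds serves once as the test set while the remaining folds are used for training. This provides k measures of predictive performance, and you can then analyze their mean and standard deviation.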

01:00 You can implement cross-validation with KFold, StratifiedKFold, LeaveOneOut, and a few other classes and functions from scikit-learn’s model_selection module.
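
As a rough illustration of k-fold cross-validation, here is a minimal sketch that is not part of the lesson. The synthetic dataset from make_regression(), the LinearRegression model, and the choice of five folds are all assumptions made for the example.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic data, used here only for illustration (an assumption).
X, y = make_regression(n_samples=100, n_features=3, noise=10.0, random_state=0)

# Five folds: each fold serves once as the test set.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

# One score per fold. For a regressor, the default score is R^2.
scores = cross_val_score(LinearRegression(), X, y, cv=kfold)

print("Scores per fold:", scores)
print("Mean:", scores.mean(), "Standard deviation:", scores.std())
```

With k set to 5, scores holds five values, one for each fold, and their mean and standard deviation summarize how stable the model's performance is across splits.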

01:13 A learning curve, sometimes called a training curve, shows how the prediction score of training and validation sets depends on the number of training samples.

01:22 You can use learning_curve() to get this dependency, which can help you find the optimal size of the training set, choose hyperparameters, compare models, and so on.
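
The following sketch shows one way learning_curve() might be called. The synthetic classification data, the LogisticRegression model, and the five training-set sizes are assumptions chosen for the example, not something taken from the lesson.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Synthetic data, used here only for illustration (an assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Evaluate the model at five training-set sizes, from 20% to 100%
# of the data available for training in each of the 5 CV splits.
train_sizes, train_scores, valid_scores = learning_curve(
    LogisticRegression(max_iter=1000),
    X,
    y,
    train_sizes=np.linspace(0.2, 1.0, 5),
    cv=5,
)

# Each row corresponds to one training-set size, each column to one CV split.
for size, train, valid in zip(
    train_sizes, train_scores.mean(axis=1), valid_scores.mean(axis=1)
):
    print(f"{size:>4} samples: train={train:.3f}, validation={valid:.3f}")
```

Plotting the averaged training and validation scores against train_sizes gives the learning curve itself. If the validation score has stopped improving at the larger sizes, adding more training data is unlikely to help much.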

01:33 Hyperparameter tuning, also called hyperparameter optimization, is the process of determining the best set of hyperparameters to define your machine learning model. scikit-learn’s model_selection module provides you with several options for this purpose, including GridSearchCV, RandomizedSearchCV, validation_curve(), and others.
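
Here is a minimal sketch of a grid search with GridSearchCV. The SVC model, the parameter grid, and the synthetic dataset are assumptions made for illustration. The sketch also holds out a test set with train_test_split() so that the tuned model is scored on data the search never saw.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic data, used here only for illustration (an assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hold out a test set that the search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical grid: 3 values of C times 2 kernels gives 6 candidates.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# Each candidate is evaluated with 5-fold cross-validation on the training set.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
test_score = search.score(X_test, y_test)  # accuracy of the refit best model

print("Best hyperparameters:", best_params)
print("Test accuracy:", test_score)
```

RandomizedSearchCV follows the same fit-and-score pattern but samples a fixed number of candidates instead of trying every combination, which can be cheaper when the grid is large.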

01:56 Splitting your data is also important for hyperparameter tuning. Now that you’ve covered all the elements of this course, let’s take some time to look back at what you’ve learned.
