timator with different values of a specified parameter. This is similar to
grid search with one parameter. However, this will also compute training
scores and is merely a utility for plotting the results.

Read more in the :ref:`User Guide `.

Parameters
----------
estimator : object type that implements the "fit" method
    An object of that type which is cloned for each validation. It must
    also implement "predict" unless `scoring` is a callable that doesn't
    rely on "predict" to compute a score.

X : {array-like, sparse matrix} of shape (n_samples, n_features)
    Training vector, where `n_samples` is the number of samples and
    `n_features` is the number of features.

y : array-like of shape (n_samples,) or (n_samples, n_outputs) or None
    Target relative to X for classification or regression;
    None for unsupervised learning.

param_name : str
    Name of the parameter that will be varied.

param_range : array-like of shape (n_values,)
    The values of the parameter that will be evaluated.

groups : array-like of shape (n_samples,), default=None
    Group labels for the samples used while splitting the dataset into
    train/test set. Only used in conjunction with a "Group" :term:`cv`
    instance (e.g., :class:`GroupKFold`).

    .. versionchanged:: 1.6
        ``groups`` can only be passed if metadata routing is not enabled
        via ``sklearn.set_config(enable_metadata_routing=True)``. When
        routing is enabled, pass ``groups`` alongside other metadata via
        the ``params`` argument instead. E.g.:
        ``validation_curve(..., params={'groups': groups})``.

cv : int, cross-validation generator or an iterable, default=None
    Determines the cross-validation splitting strategy.
    Possible inputs for cv are:

    - None, to use the default 5-fold cross validation,
    - int, to specify the number of folds in a `(Stratified)KFold`,
    - :term:`CV splitter`,
    - An iterable yielding (train, test) splits as arrays of indices.

    For int/None inputs, if the estimator is a classifier and ``y`` is
    either binary or multiclass, :class:`StratifiedKFold` is used. In all
    other cases, :class:`KFold` is used. These splitters are instantiated
    with `shuffle=False` so the splits will be the same across calls.

    Refer to the :ref:`User Guide ` for the various cross-validation
    strategies that can be used here.

    .. versionchanged:: 0.22
        ``cv`` default value if None changed from 3-fold to 5-fold.

scoring : str or callable, default=None
    A str (see :ref:`scoring_parameter`) or a scorer callable object /
    function with signature ``scorer(estimator, X, y)``.

n_jobs : int, default=None
    Number of jobs to run in parallel. Training the estimator and
    computing the score are parallelized over the combinations of each
    parameter value and each cross-validation split.
    ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
    ``-1`` means using all processors. See :term:`Glossary ` for more
    details.

pre_dispatch : int or str, default='all'
    Number of predispatched jobs for parallel execution (default is
    all). The option can reduce the allocated memory. The str can
    be an expression like '2*n_jobs'.

verbose : int, default=0
    Controls the verbosity: the higher, the more messages.

error_score : 'raise' or numeric, default=np.nan
    Value to assign to the score if an error occurs in estimator fitting.
    If set to 'raise', the error is raised.
    If a numeric value is given, FitFailedWarning is raised.

    .. versionadded:: 0.20

fit_params : dict, default=None
    Parameters to pass to the fit method of the estimator.

    .. deprecated:: 1.6
        This parameter is deprecated and will be removed in version 1.8.
        Use ``params`` instead.

params : dict, default=None
    Parameters to pass to the estimator, scorer and cross-validation
    object.

    - If `enable_metadata_routing=False` (default): Parameters directly
      passed to the `fit` method of the estimator.

    - If `enable_metadata_routing=True`: Parameters safely routed to the
      `fit` method of the estimator, to the scorer and to the
      cross-validation object. See :ref:`Metadata Routing User Guide `
      for more details.

    .. versionadded:: 1.6

Returns
-------
train_scores : array of shape (n_ticks, n_cv_folds)
    Scores on training sets.

test_scores : array of shape (n_ticks, n_cv_folds)
    Scores on test set.

Notes
-----
See :ref:`sphx_glr_auto_examples_model_selection_plot_train_error_vs_test_error.py`

Examples
--------
>>> import numpy as np
>>> from sklearn.datasets import make_classification
>>> from sklearn.model_selection import validation_curve
>>> from sklearn.linear_model import LogisticRegression
>>> X, y = make_classification(n_samples=1_000, random_state=0)
>>> logistic_regression = LogisticRegression()
>>> param_name, param_range = "C", np.logspace(-8, 3, 10)
>>> train_scores, test_scores = validation_curve(
...     logistic_regression, X, y, param_name=param_name, param_range=param_range
... )
>>> print(f"The average train accuracy is {train_scores.mean():.2f}")
The average train accuracy is 0.81
>>> print(f"The average test accuracy is {test_scores.mean():.2f}")
The average test accuracy is 0.81
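
The example above averages over every parameter value and fold at once. Because both returned arrays have shape (n_ticks, n_cv_folds), a per-value summary is usually more informative. The following is a minimal sketch, reusing the same data and estimator and assuming the default 5-fold split; the names ``train_mean``, ``test_mean``, ``test_std`` and ``best_C`` are illustrative and not part of the API.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import validation_curve

X, y = make_classification(n_samples=1_000, random_state=0)
param_range = np.logspace(-8, 3, 10)

train_scores, test_scores = validation_curve(
    LogisticRegression(), X, y, param_name="C", param_range=param_range, cv=5
)

# Each row corresponds to one value of C, each column to one CV fold,
# so reducing over axis=1 gives one summary number per parameter value.
train_mean = train_scores.mean(axis=1)
test_mean = test_scores.mean(axis=1)
test_std = test_scores.std(axis=1)

# Illustrative selection rule: the value of C with the highest mean test score.
best_index = int(np.argmax(test_mean))
best_C = param_range[best_index]

for C, tr, te, sd in zip(param_range, train_mean, test_mean, test_std):
    print(f"C={C:.1e}  train={tr:.3f}  test={te:.3f} (+/- {sd:.3f})")
```

Plotting ``train_mean`` and ``test_mean`` against ``param_range`` (typically on a logarithmic x-axis for ``C``) gives the validation curve itself.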
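
The ``groups``, ``cv`` and ``scoring`` parameters can be combined. The sketch below assumes metadata routing is left disabled (the default), so ``groups`` is passed directly. The group labels and the ``balanced_accuracy`` scorer function are made up for illustration; the scorer only follows the ``scorer(estimator, X, y)`` signature described above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import GroupKFold, validation_curve

X, y = make_classification(n_samples=200, random_state=0)

# Hypothetical grouping: 8 groups of 25 consecutive samples each.
groups = np.repeat(np.arange(8), 25)


def balanced_accuracy(estimator, X, y):
    """Custom scorer with the signature scorer(estimator, X, y)."""
    return balanced_accuracy_score(y, estimator.predict(X))


param_range = [0.01, 1.0, 100.0]
train_scores, test_scores = validation_curve(
    LogisticRegression(),
    X,
    y,
    param_name="C",
    param_range=param_range,
    groups=groups,  # passed directly because metadata routing is disabled
    cv=GroupKFold(n_splits=4),  # keeps every group within a single fold
    scoring=balanced_accuracy,
)

print(train_scores.shape, test_scores.shape)
```

With routing enabled, the same labels would instead go through ``params={'groups': groups}``, as noted in the ``groups`` parameter description.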