    Available internal optimizers are: `{'fmin_l_bfgs_b'}`.

n_restarts_optimizer : int, default=0
    The number of restarts of the optimizer for finding the kernel's
    parameters which maximize the log-marginal likelihood. The first run
    of the optimizer is performed from the kernel's initial parameters,
    the remaining ones (if any) from thetas sampled log-uniformly at
    random from the space of allowed theta-values. If greater than 0, all
    bounds must be finite. Note that `n_restarts_optimizer == 0` implies
    that one run is performed.

normalize_y : bool, default=False
    Whether or not to normalize the target values `y` by removing the mean
    and scaling to unit variance. This is recommended for cases where
    zero-mean, unit-variance priors are used. Note that, in this
    implementation, the normalization is reversed before the GP
    predictions are reported.

    .. versionchanged:: 0.23

copy_X_train : bool, default=True
    If True, a persistent copy of the training data is stored in the
    object. Otherwise, just a reference to the training data is stored,
    which might cause predictions to change if the data is modified
    externally.

n_targets : int, default=None
    The number of dimensions of the target values. Used to decide the
    number of outputs when sampling from the prior distributions (i.e.
    calling :meth:`sample_y` before :meth:`fit`). This parameter is
    ignored once :meth:`fit` has been called.

    .. versionadded:: 1.3

random_state : int, RandomState instance or None, default=None
    Determines the random number generation used to draw the initial
    theta values for the optimizer restarts. Pass an int for reproducible
    results across multiple function calls.
    See :term:`Glossary <random_state>`.

Attributes
----------
X_train_ : array-like of shape (n_samples, n_features) or list of object
    Feature vectors or other representations of training data (also
    required for prediction).

y_train_ : array-like of shape (n_samples,) or (n_samples, n_targets)
    Target values in training data (also required for prediction).

kernel_ : kernel instance
    The kernel used for prediction. The structure of the kernel is the
    same as the one passed as parameter but with optimized hyperparameters.

L_ : array-like of shape (n_samples, n_samples)
    Lower-triangular Cholesky decomposition of the kernel in ``X_train_``.

alpha_ : array-like of shape (n_samples,)
    Dual coefficients of training data points in kernel space.

log_marginal_likelihood_value_ : float
    The log-marginal-likelihood of ``self.kernel_.theta``.

n_features_in_ : int
    Number of features seen during :term:`fit`.

    .. versionadded:: 0.24

feature_names_in_ : ndarray of shape (`n_features_in_`,)
    Names of features seen during :term:`fit`. Defined only when `X`
    has feature names that are all strings.

    .. versionadded:: 1.0

See Also
--------
GaussianProcessClassifier : Gaussian process classification (GPC)
    based on Laplace approximation.

References
----------
.. [RW2006] Carl E. Rasmussen and Christopher K.I. Williams,
   "Gaussian Processes for Machine Learning",
   MIT Press, 2006.

Examples
--------
>>> from sklearn.datasets import make_friedman2
>>> from sklearn.gaussian_process import GaussianProcessRegressor
>>> from sklearn.gaussian_process.kernels import DotProduct, WhiteKernel
>>> X, y = make_friedman2(n_samples=500, noise=0, random_state=0)
>>> kernel = DotProduct() + WhiteKernel()
>>> gpr = GaussianProcessRegressor(kernel=kernel,
...         random_state=0).fit(X, y)
>>> gpr.score(X, y)
0.3680...
>>> gpr.predict(X[:2,:], return_std=True)
(array([653.0..., 592.1...]), array([316.6..., 316.6...]))
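
As a further illustrative sketch (the kernel, data and parameter values
below are arbitrary choices, and `n_targets` requires scikit-learn >= 1.3),
`n_restarts_optimizer`, `normalize_y` and `n_targets` can be combined, and
:meth:`sample_y` can be called before :meth:`fit` to draw from the prior;
the expected output follows the multi-target sample layout
``(n_samples_X, n_targets, n_samples)``:

>>> import numpy as np
>>> from sklearn.gaussian_process import GaussianProcessRegressor
>>> from sklearn.gaussian_process.kernels import RBF
>>> gpr = GaussianProcessRegressor(
...     kernel=RBF(length_scale=1.0),  # illustrative kernel choice
...     n_restarts_optimizer=5,        # 1 run from the initial theta + 5 restarts
...     normalize_y=True,              # standardize y internally during fit
...     n_targets=2,                   # only used when sampling the prior
...     random_state=0,
... )
>>> X_plot = np.linspace(0, 1, 5).reshape(-1, 1)
>>> gpr.sample_y(X_plot, n_samples=3).shape
(5, 2, 3)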