ubtracted for centering data to a
    zero mean. Set to np.zeros(n_features) otherwise.

X_scale_ : ndarray of shape (n_features,)
    Set to np.ones(n_features).

n_features_in_ : int
    Number of features seen during :term:`fit`.

    .. versionadded:: 0.24

feature_names_in_ : ndarray of shape (`n_features_in_`,)
    Names of features seen during :term:`fit`. Defined only when `X`
    has feature names that are all strings.

    .. versionadded:: 1.0

See Also
--------
ARDRegression : Bayesian ARD regression.

Notes
-----
There exist several strategies to perform Bayesian ridge regression. This
implementation is based on the algorithm described in Appendix A of
(Tipping, 2001) where updates of the regularization parameters are done as
suggested in (MacKay, 1992). Note that according to A New
View of Automatic Relevance Determination (Wipf and Nagarajan, 2008) these
update rules do not guarantee that the marginal likelihood is increasing
between two consecutive iterations of the optimization.

References
----------
D. J. C. MacKay, Bayesian Interpolation, Computation and Neural Systems,
Vol. 4, No. 3, 1992.

M. E. Tipping, Sparse Bayesian Learning and the Relevance Vector Machine,
Journal of Machine Learning Research, Vol. 1, 2001.

Examples
--------
>>> from sklearn import linear_model
>>> clf = linear_model.BayesianRidge()
>>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
BayesianRidge()
>>> clf.predict([[1, 1]])
array([1.])
r