      random.

    - `'random'`: non-negative random matrices, scaled with:
      `sqrt(X.mean() / n_components)`

    - `'nndsvd'`: Nonnegative Double Singular Value Decomposition (NNDSVD)
      initialization (better for sparseness)

    - `'nndsvda'`: NNDSVD with zeros filled with the average of X
      (better when sparsity is not desired)

    - `'nndsvdar'`: NNDSVD with zeros filled with small random values
      (generally faster, less accurate alternative to NNDSVDa
      for when sparsity is not desired)

    - `'custom'`: Use custom matrices `W` and `H` which must both be provided.

    .. versionchanged:: 1.1
        When `init=None` and `n_components` is less than `n_samples` and
        `n_features`, `init` defaults to `nndsvda` instead of `nndsvd`.

solver : {'cd', 'mu'}, default='cd'
    Numerical solver to use:

    - 'cd' is a Coordinate Descent solver.
    - 'mu' is a Multiplicative Update solver.

    .. versionadded:: 0.17
       Coordinate Descent solver.

    .. versionadded:: 0.19
       Multiplicative Update solver.

beta_loss : float or {'frobenius', 'kullback-leibler', \
        'itakura-saito'}, default='frobenius'
    Beta divergence to be minimized, measuring the distance between X
    and the dot product WH. Note that values different from 'frobenius'
    (or 2) and 'kullback-leibler' (or 1) lead to significantly slower
    fits. Note that for beta_loss <= 0 (or 'itakura-saito'), the input
    matrix X cannot contain zeros. Used only in 'mu' solver.

    .. versionadded:: 0.19

tol : float, default=1e-4
    Tolerance of the stopping condition.

max_iter : int, default=200
    Maximum number of iterations before timing out.

random_state : int, RandomState instance or None, default=None
    Used for initialisation (when ``init`` == 'nndsvdar' or
    'random'), and in Coordinate Descent. Pass an int for reproducible
    results across multiple function calls.
    See :term:`Glossary <random_state>`.

alpha_W : float, default=0.0
    Constant that multiplies the regularization terms of `W`. Set it to zero
    (default) to have no regularization on `W`.

    .. versionadded:: 1.0

alpha_H : float or "same", default="same"
    Constant that multiplies the regularization terms of `H`. Set it to zero to
    have no regularization on `H`. If "same" (default), it takes the same value as
    `alpha_W`.

    .. versionadded:: 1.0

l1_ratio : float, default=0.0
    The regularization mixing parameter, with 0 <= l1_ratio <= 1.
    For l1_ratio = 0 the penalty is an elementwise L2 penalty
    (aka Frobenius Norm).
    For l1_ratio = 1 it is an elementwise L1 penalty.
    For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.

    .. versionadded:: 0.17
       Regularization parameter *l1_ratio* used in the Coordinate Descent
       solver.

verbose : int, default=0
    Whether to be verbose.

shuffle : bool, default=False
    If true, randomize the order of coordinates in the CD solver.

    .. versionadded:: 0.17
       *shuffle* parameter used in the Coordinate Descent solver.

Attributes
----------
components_ : ndarray of shape (n_components, n_features)
    Factorization matrix, sometimes called 'dictionary'.

n_components_ : int
    The number of components. It is the same as the `n_components` parameter
    if it was given. Otherwise, it is the same as the number of
    features.

reconstruction_err_ : float
    Frobenius norm of the matrix difference, or beta-divergence, between
    the training data ``X`` and the reconstructed data ``WH`` from
    the fitted model.

n_iter_ : int
    Actual number of iterations.

n_features_in_ : int
    Number of features seen during :term:`fit`.

    .. versionadded:: 0.24

feature_names_in_ : ndarray of shape (`n_features_in_`,)
    Names of features seen during :term:`fit`. Defined only when `X`
    has feature names that are all strings.

    .. versionadded:: 1.0

See Also
--------
DictionaryLearning : Find a dictionary that sparsely encodes data.
MiniBatchSparsePCA : Mini-batch Sparse Principal Components Analysis.
PCA : Principal component analysis.
SparseCoder : Find a sparse representation of data from a fixed,
    precomputed dictionary.
SparsePCA : Sparse Principal Components Analysis.
TruncatedSVD : Dimensionality reduction using truncated SVD.

References
----------
.. [1] :doi:`"Fast local algorithms for large scale nonnegative matrix and tensor
   factorizations" <10.1587/transfun.E92.A.708>`
   Cichocki, Andrzej, and Anh-Huy Phan. IEICE transactions on fundamentals
   of electronics, communications and computer sciences 92.3: 708-721, 2009.

.. [2] :doi:`"Algorithms for nonnegative matrix factorization with the
   beta-divergence" <10.1162/NECO_a_00168>`
   Fevotte, C., & Idier, J. (2011). Neural Computation, 23(9).

Examples
--------
>>> import numpy as np
>>> X = np.array([[1, 1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]])
>>> from sklearn.decomposition import NMF
>>> model = NMF(n_components=2, init='random', random_state=0)
>>> W = model.fit_transform(X)
>>> H = model.components_

r
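
The following is a minimal sketch, not part of the original docstring, of how the `solver` and `beta_loss` parameters described above can be combined. It fits the same toy matrix with the Multiplicative Update solver and the Kullback-Leibler divergence. The choices of `init="nndsvda"` and `max_iter=500` are assumptions made for illustration only, and the variable name `X_hat` is hypothetical.

```python
import numpy as np
from sklearn.decomposition import NMF

# Same toy data as in the example above.
X = np.array([[1, 1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]])

# beta_loss is only used by the 'mu' solver, so both are set together.
# init and max_iter are illustrative choices, not recommended defaults.
model = NMF(
    n_components=2,
    init="nndsvda",
    solver="mu",
    beta_loss="kullback-leibler",
    max_iter=500,
    random_state=0,
)

W = model.fit_transform(X)   # shape (n_samples, n_components)
H = model.components_        # shape (n_components, n_features)

# Reconstruct the data from the two non-negative factors.
X_hat = W @ H

# With beta_loss='kullback-leibler', reconstruction_err_ reports a
# beta-divergence based value rather than the plain Frobenius norm.
err = model.reconstruction_err_
```

Comparing `X_hat` with `X`, or inspecting `model.n_iter_` against `max_iter`, is one way to judge whether the fit stopped on the tolerance or on the iteration limit.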