    the complete dataset before timing out.

alpha_W : float, default=0.0
    Constant that multiplies the regularization terms of `W`. Set it to
    zero (default) to have no regularization on `W`.

alpha_H : float or "same", default="same"
    Constant that multiplies the regularization terms of `H`. Set it to
    zero to have no regularization on `H`. If "same" (default), it takes
    the same value as `alpha_W`.

l1_ratio : float, default=0.0
    The regularization mixing parameter, with 0 <= l1_ratio <= 1.
    For l1_ratio = 0 the penalty is an elementwise L2 penalty
    (aka Frobenius norm).
    For l1_ratio = 1 it is an elementwise L1 penalty.
    For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.

forget_factor : float, default=0.7
    Amount of rescaling of past information. Its value could be 1 with
    finite datasets. Choosing values < 1 is recommended with online
    learning, as more recent batches will weigh more than past batches.

fresh_restarts : bool, default=False
    Whether to completely solve for W at each step. Doing fresh restarts
    will likely lead to a better solution for the same number of
    iterations, but it is much slower.

fresh_restarts_max_iter : int, default=30
    Maximum number of iterations when solving for W at each step. Only
    used when doing fresh restarts. These iterations may be stopped early
    based on a small change of W controlled by `tol`.

transform_max_iter : int, default=None
    Maximum number of iterations when solving for W at transform time.
    If None, it defaults to `max_iter`.

random_state : int, RandomState instance or None, default=None
    Used for initialisation (when ``init`` == 'nndsvdar' or 'random'),
    and in Coordinate Descent. Pass an int for reproducible results
    across multiple function calls.
    See :term:`Glossary <random_state>`.

verbose : bool, default=False
    Whether to be verbose.

Attributes
----------
components_ : ndarray of shape (n_components, n_features)
    Factorization matrix, sometimes called 'dictionary'.

n_components_ : int
    The number of components. It is the same as the `n_components`
    parameter if it was given. Otherwise, it will be the same as the
    number of features.

reconstruction_err_ : float
    Frobenius norm of the matrix difference, or beta-divergence, between
    the training data `X` and the reconstructed data `WH` from the fitted
    model.

n_iter_ : int
    Actual number of started iterations over the whole dataset.

n_steps_ : int
    Number of mini-batches processed.

n_features_in_ : int
    Number of features seen during :term:`fit`.

feature_names_in_ : ndarray of shape (`n_features_in_`,)
    Names of features seen during :term:`fit`. Defined only when `X` has
    feature names that are all strings.

See Also
--------
NMF : Non-negative matrix factorization.
MiniBatchDictionaryLearning : Finds a dictionary that can best be used to
    represent data using a sparse code.

References
----------
.. [1] :doi:`"Fast local algorithms for large scale nonnegative matrix and
   tensor factorizations" <10.1587/transfun.E92.A.708>`
   Cichocki, Andrzej, and P. H. A. N. Anh-Huy. IEICE Transactions on
   Fundamentals of Electronics, Communications and Computer Sciences,
   92.3: 708-721, 2009.

.. [2] :doi:`"Algorithms for nonnegative matrix factorization with the
   beta-divergence" <10.1162/NECO_a_00168>`
   Fevotte, C., & Idier, J. (2011). Neural Computation, 23(9).

.. [3] :doi:`"Online algorithms for nonnegative matrix factorization with
   the Itakura-Saito divergence" <10.1109/ASPAA.2011.6082314>`
   Lefevre, A., Bach, F., Fevotte, C. (2011). WASPAA.
Examples
--------
>>> import numpy as np
>>> X = np.array([[1, 1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]])
>>> from sklearn.decomposition import MiniBatchNMF
>>> model = MiniBatchNMF(n_components=2, init='random', random_state=0)
>>> W = model.fit_transform(X)
>>> H = model.components_
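The `alpha_W`, `alpha_H` and `l1_ratio` parameters described above add a
mixed L1/L2 penalty on the factors. A minimal sketch of this usage on the
same toy data (the values below are arbitrary illustrations, not
recommendations):

>>> reg_model = MiniBatchNMF(n_components=2, init='random',
...                          alpha_W=0.1, l1_ratio=0.5, random_state=0)
>>> W_reg = reg_model.fit_transform(X)  # H penalized too, via alpha_H="same"
>>> W_reg.shape
(6, 2)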
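For out-of-core or streaming settings, `partial_fit` consumes one
mini-batch per call, and a `forget_factor` < 1 makes recent batches weigh
more than older ones. A sketch on the same toy data (the three-way batch
split and the forget_factor value are arbitrary choices):

>>> online_model = MiniBatchNMF(n_components=2, init='random',
...                             forget_factor=0.5, random_state=0)
>>> for batch in np.array_split(X, 3):
...     _ = online_model.partial_fit(batch)  # each call is one mini-batch
>>> H_online = online_model.components_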