d)), [shape=(M, N)]) where ``data``, ``row_ind`` and ``col_ind`` satisfy the relationship ``a[row_ind[k], col_ind[k]] = data[k]``. csr_array((data, indices, indptr), [shape=(M, N)]) is the standard CSR representation where the column indices for row i are stored in ``indices[indptr[i]:indptr[i+1]]`` and their corresponding values are stored in ``data[indptr[i]:indptr[i+1]]``. If the shape parameter is not supplied, the array dimensions are inferred from the index arrays. Attributes ---------- dtype : dtype Data type of the array shape : 2-tuple Shape of the array ndim : int Number of dimensions (this is always 2) nnz size data CSR format data array of the array indices CSR format index array of the array indptr CSR format index pointer array of the array has_sorted_indices has_canonical_format T Notes ----- Sparse arrays can be used in arithmetic operations: they support addition, subtraction, multiplication, division, and matrix power. Advantages of the CSR format - efficient arithmetic operations CSR + CSR, CSR * CSR, etc. - efficient row slicing - fast matrix vector products Disadvantages of the CSR format - slow column slicing operations (consider CSC) - changes to the sparsity structure are expensive (consider LIL or DOK) Canonical Format - Within each row, indices are sorted by column. - There are no duplicate entries. Examples -------- >>> import numpy as np >>> from scipy.sparse import csr_array >>> csr_array((3, 4), dtype=np.int8).toarray() array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], dtype=int8) >>> row = np.array([0, 0, 1, 2, 2, 2]) >>> col = np.array([0, 2, 2, 0, 1, 2]) >>> data = np.array([1, 2, 3, 4, 5, 6]) >>> csr_array((data, (row, col)), shape=(3, 3)).toarray() array([[1, 0, 2], [0, 0, 3], [4, 5, 6]]) >>> indptr = np.array([0, 2, 3, 6]) >>> indices = np.array([0, 2, 2, 0, 1, 2]) >>> data = np.array([1, 2, 3, 4, 5, 6]) >>> csr_array((data, indices, indptr), shape=(3, 3)).toarray() array([[1, 0, 2], [0, 0, 3], [4, 5, 6]]) Duplicate entries are summed together: >>> row = np.array([0, 1, 2, 0]) >>> col = np.array([0, 1, 1, 0]) >>> data = np.array([1, 2, 4, 8]) >>> csr_array((data, (row, col)), shape=(3, 3)).toarray() array([[9, 0, 0], [0, 2, 0], [0, 4, 0]]) As an example of how to construct a CSR array incrementally, the following snippet builds a term-document array from texts: >>> docs = [["hello", "world", "hello"], ["goodbye", "cruel", "world"]] >>> indptr = [0] >>> indices = [] >>> data = [] >>> vocabulary = {} >>> for d in docs: ... for term in d: ... index = vocabulary.setdefault(term, len(vocabulary)) ... indices.append(index) ... data.append(1) ... indptr.append(len(indices)) ... >>> csr_array((data, indices, indptr), dtype=int).toarray() array([[2, 1, 0, 0], [0, 1, 1, 1]]) r]