I am using Python's scikit-learn to cluster documents, and my sparse document-term matrix is stored in a dict object, for instance:
    doc_term_dict = {
        ('d1', 't1'): 12,
        ('d2', 't3'): 10,
        ('d3', 't2'): 5,
    }  # from mysql data table, <type 'dict'>
I want to use scikit-learn for clustering, where the input matrix has to be of type scipy.sparse.csr.csr_matrix, for example:
    (0, 2164)    0.245793088885
    (0, 2076)    0.205702177467
    (0, 2037)    0.193810934784
    (0, 2005)    0.14547028437
    (0, 1953)    0.153720023365
    ...
    <class 'scipy.sparse.csr.csr_matrix'>
I cannot find a way to convert the dict into this CSR matrix (I have never used SciPy).
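To make the goal concrete, here is a rough sketch of the kind of conversion I have in mind. It assumes every key is a (document, term) pair and that sorting the labels is an acceptable way to assign row and column numbers; the names docs, terms, doc_index and term_index are made up for illustration and do not come from scikit-learn or SciPy.

```python
from scipy.sparse import csr_matrix

doc_term_dict = {
    ('d1', 't1'): 12,
    ('d2', 't3'): 10,
    ('d3', 't2'): 5,
}

# Assumption: rows are documents and columns are terms, numbered in sorted order.
docs = sorted({doc for doc, _ in doc_term_dict})
terms = sorted({term for _, term in doc_term_dict})
doc_index = {doc: i for i, doc in enumerate(docs)}
term_index = {term: j for j, term in enumerate(terms)}

# Build parallel lists of row indices, column indices and values.
rows = [doc_index[doc] for doc, _ in doc_term_dict]
cols = [term_index[term] for _, term in doc_term_dict]
vals = list(doc_term_dict.values())

# csr_matrix accepts (data, (row_indices, col_indices)) plus an explicit shape.
matrix = csr_matrix((vals, (rows, cols)), shape=(len(docs), len(terms)))
```

The doc_index and term_index dicts would need to be kept around to map cluster results back to the original document and term labels.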