I am trying to use the hcluster library in Python. I don't know Python well enough to use a sparse matrix with hcluster. Please help me. Here is what I do:
import os.path
import numpy
import scipy
import scipy.io
from hcluster import squareform, pdist, linkage, complete
from hcluster.hierarchy import from_mlab_linkage
from numpy import savetxt
from StringIO import StringIO
data.dmp contains the matrix:
A B C D
A 0 1 0 1
B 1 0 0 1
C 0 0 0 0
D 1 1 0 0
and contains only the upper-right part of the matrix, that is, the upper triangle: all the numbers above the main diagonal. So data.dmp contains: 1 0 1, 0 1, 0
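To illustrate the layout described above, here is a minimal sketch of how the upper-triangle values map back to the full symmetric matrix. It assumes SciPy's `scipy.spatial.distance.squareform` (used here in place of hcluster's function of the same name) and uses a hard-coded string as a stand-in for the contents of data.dmp.

```python
import numpy as np
from scipy.spatial.distance import squareform

# Stand-in for the single line stored in data.dmp (assumption: the
# upper triangle is written row by row, rows separated by commas).
line = "1 0 1, 0 1, 0"

# Flatten the comma-separated rows into one condensed vector:
# (A,B), (A,C), (A,D), (B,C), (B,D), (C,D)
condensed = np.array([float(tok) for tok in line.replace(",", " ").split()])

# squareform expands a condensed vector into the full symmetric
# matrix with zeros on the main diagonal.
full = squareform(condensed)
print(full)
```

The condensed vector has n(n-1)/2 entries, so 6 values correspond to a 4 x 4 matrix.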
f = file('data.dmp','r')
s = StringIO(f.readline()).getvalue()
f.close()
matrix = numpy.asarray(eval("["+s+"]"))
For a reason unknown to me, hcluster uses inverted values; for example, I use 0 if A != C and 1 if A == D.
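The "inverted values" are probably a matter of distance versus similarity: hierarchical clustering routines work on distances, where 0 means "identical" and larger values mean "further apart". A minimal sketch of one possible conversion, assuming the 0/1 values above are similarities and that `1 - similarity` is an acceptable distance for this data:

```python
import numpy as np

# Similarities in condensed (upper-triangle) order:
# (A,B), (A,C), (A,D), (B,C), (B,D), (C,D)
similarity = np.array([1, 0, 1, 0, 1, 0], dtype=float)

# Assumed conversion: identical pairs get distance 0,
# unrelated pairs get distance 1.
distance = 1.0 - similarity
print(distance)
```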
sqfrm = squareform(matrix)
Y = pdist(sqfrm, metric="cosine")
print Y
Z = linkage(Y, method="complete")
So the Z matrix is what I need (assuming I used hcluster correctly).
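One thing worth checking in the steps above: pdist treats each row of its input as an observation vector, so passing it a matrix that already holds pairwise values computes new distances between rows instead of reusing those values. A minimal sketch of the alternative, assuming SciPy's `scipy.cluster.hierarchy.linkage` (used here in place of hcluster's) and the assumed `1 - similarity` conversion, feeds the condensed distances to linkage directly:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Condensed similarities: (A,B), (A,C), (A,D), (B,C), (B,D), (C,D)
similarity = np.array([1, 0, 1, 0, 1, 0], dtype=float)

# Assumed conversion to distances (0 = identical, 1 = unrelated).
distance = 1.0 - similarity

# linkage accepts a condensed distance vector, so no pdist call is needed.
Z = linkage(distance, method="complete")

# Each row of Z is one merge: the two cluster indices, the merge
# distance, and the number of original items in the new cluster.
print(Z)
```

With this input, A, B and D merge at distance 0 and C joins last at distance 1.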
But I have the following problems: