NumPy: understanding how NumPy arrays handle string names

This may be a vague question, but reading through the NumPy documentation didn't help me.

I need to compute a similarity matrix and then run hierarchical clustering on a binary array that looks like this:

name    val1    val2    val3    val4    val5
comp1   0   0   1   0   1
comp2   1   0   0   0   0
comp3   0   0   1   0   0
comp4   1   1   0   0   0
comp5   0   0   1   0   0

I do not understand how string names work in NumPy. I can read the file like this:

import numpy as np

test = np.genfromtxt('test.b', delimiter='\t', names=True, dtype=None)
print(type(test[0]))   # numpy.void
print(test[0])         # ('comp1', 0, 0, 1, 0, 1)

But how do I take the string names into account (this information is very important)? Is it possible?

I suspect numpy.void is not the right way to store a binary array for the subsequent affinity matrix calculation?

1 answer

NumPy supports structured arrays, in which each field has its own type. You could use a dtype such as dtype=[('name', object), ('val1', int), ...].
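As a minimal sketch of the structured-array approach, the snippet below reads the table with an explicit dtype and then separates the names from the numeric values. The inline `raw` string is a hypothetical stand-in for the file `test.b`, and the fixed-width `'U10'` string field is an assumption used here in place of `object`.

```python
import io
import numpy as np

# Hypothetical stand-in for the tab-separated file 'test.b'.
raw = (
    "name\tval1\tval2\tval3\tval4\tval5\n"
    "comp1\t0\t0\t1\t0\t1\n"
    "comp2\t1\t0\t0\t0\t0\n"
    "comp3\t0\t0\t1\t0\t0\n"
    "comp4\t1\t1\t0\t0\t0\n"
    "comp5\t0\t0\t1\t0\t0\n"
)

value_cols = ["val1", "val2", "val3", "val4", "val5"]
# One string field for the name, one integer field per value column.
# 'U10' (text of up to 10 characters) is an assumption, not from the answer.
record_dtype = [("name", "U10")] + [(col, int) for col in value_cols]

records = np.genfromtxt(
    io.StringIO(raw), delimiter="\t", skip_header=1, dtype=record_dtype
)

names = records["name"]  # 1-D array of row labels
# Stack the value fields into a plain 2-D integer array for numeric work.
matrix = np.column_stack([records[col] for col in value_cols])

print(names)
print(matrix.shape)
```

Row i of `matrix` corresponds to `names[i]`, so the labels stay available after the numeric part is handed to a similarity or clustering routine.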

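By default, genfromtxt does not use object for the name column; it stores the names as fixed-length strings rather than Python string objects.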
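To see what genfromtxt infers on its own, it may help to inspect the dtype of the result. This is only a sketch: the short `raw` string stands in for the real file, and `encoding="utf-8"` is an assumption that requires NumPy 1.14 or later.

```python
import io
import numpy as np

# Hypothetical, shortened stand-in for the tab-separated input file.
raw = "name\tval1\tval2\ncomp1\t0\t1\ncomp2\t1\t0\n"

# dtype=None asks genfromtxt to infer a type for each column.
inferred = np.genfromtxt(
    io.StringIO(raw), delimiter="\t", names=True, dtype=None, encoding="utf-8"
)

name_dtype = inferred.dtype["name"]
print(inferred.dtype.names)
# Kind 'U' or 'S' means a fixed-length string field; 'O' would mean object.
print(name_dtype.kind)
```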

Finally, consider pandas (which is built on top of NumPy). pandas.read_table can read a file like this.
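A rough sketch of the pandas route, carried through to the similarity matrix and clustering mentioned in the question, might look like the following. The `raw` string is a hypothetical stand-in for `test.b`; the choice of Jaccard distance, average linkage, and a cut into two clusters are all assumptions, since the question does not say which measure or method is wanted. SciPy is assumed to be installed.

```python
import io
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform

# Hypothetical stand-in for the tab-separated file 'test.b'.
raw = (
    "name\tval1\tval2\tval3\tval4\tval5\n"
    "comp1\t0\t0\t1\t0\t1\n"
    "comp2\t1\t0\t0\t0\t0\n"
    "comp3\t0\t0\t1\t0\t0\n"
    "comp4\t1\t1\t0\t0\t0\n"
    "comp5\t0\t0\t1\t0\t0\n"
)

# index_col=0 turns the 'name' column into row labels.
df = pd.read_table(io.StringIO(raw), index_col=0)

# Assumption: Jaccard distance between the boolean rows.
distances = pdist(df.values.astype(bool), metric="jaccard")
similarity = pd.DataFrame(
    1.0 - squareform(distances), index=df.index, columns=df.index
)

# Assumption: average linkage, cut into two flat clusters.
tree = linkage(distances, method="average")
labels = pd.Series(fcluster(tree, t=2, criterion="maxclust"), index=df.index)

print(similarity.round(2))
print(labels)
```

Because the names live in the index, both the similarity matrix and the cluster labels stay keyed by `comp1` ... `comp5` without any separate bookkeeping.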


Source: https://habr.com/ru/post/1606554/

