Find eigenvalues of a subset of a DataFrame in Python

I have a matrix in the form of a DataFrame

   df=     6M         1Y         2Y         4Y         5Y        10Y        30Y
      6M   n/a        n/a        n/a        n/a        n/a        n/a        n/a
      1Y   n/a          1  0.9465095   0.869504  0.8124711    0.64687  0.5089244
      2Y   n/a  0.9465095          1  0.9343177  0.8880676  0.7423546  0.6048189
      4Y   n/a   0.869504  0.9343177          1  0.9762842  0.8803984  0.7760753
      5Y   n/a  0.8124711  0.8880676  0.9762842          1  0.9117788  0.8404656
      10Y  n/a    0.64687  0.7423546  0.8803984  0.9117788          1  0.9514033
      30Y  n/a  0.5089244  0.6048189  0.7760753  0.8404656  0.9514033          1

I read the values of the matrix (real numbers), and whenever there is no data I insert 'n/a' (this format must be supported for other reasons). I would like to calculate the eigenvalues of the subset of the DataFrame that contains float values (essentially the block from '1Y' to '30Y').

I can extract a subset with iloc

tmp = df.iloc[1:df.shape[0], 1:df.shape[1]]

and this extracts the correct values (I checked the types of the values and they are float). But when I try to calculate the eigenvalues of tmp with np.linalg.eigvalsh, I get an error:

TypeError: No loop matching the specified signature and casting
was found for ufunc eigvalsh_lo

I could replace 'n/a' with '0.0' (with a row and column of 0.0, the eigenvalues should be the same, plus an extra 0).

How can I calculate the eigenvalues of this subset?


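IIUC, you can use pd.to_numeric with errors='coerce' to convert the 'n/a' values to NaN, then fillna() to replace them with 0, and then pass the result to np.linalg.eigvals: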

In [348]: df.apply(pd.to_numeric, errors='coerce')
Out[348]:
     6M        1Y        2Y        4Y        5Y       10Y       30Y
6M  NaN       NaN       NaN       NaN       NaN       NaN       NaN
1Y  NaN  1.000000  0.946509  0.869504  0.812471  0.646870  0.508924
2Y  NaN  0.946509  1.000000  0.934318  0.888068  0.742355  0.604819
4Y  NaN  0.869504  0.934318  1.000000  0.976284  0.880398  0.776075
5Y  NaN  0.812471  0.888068  0.976284  1.000000  0.911779  0.840466
10Y NaN  0.646870  0.742355  0.880398  0.911779  1.000000  0.951403
30Y NaN  0.508924  0.604819  0.776075  0.840466  0.951403  1.000000

In [350]: df.apply(pd.to_numeric, errors='coerce').fillna(0)
Out[350]:
     6M        1Y        2Y        4Y        5Y       10Y       30Y
6M    0  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
1Y    0  1.000000  0.946509  0.869504  0.812471  0.646870  0.508924
2Y    0  0.946509  1.000000  0.934318  0.888068  0.742355  0.604819
4Y    0  0.869504  0.934318  1.000000  0.976284  0.880398  0.776075
5Y    0  0.812471  0.888068  0.976284  1.000000  0.911779  0.840466
10Y   0  0.646870  0.742355  0.880398  0.911779  1.000000  0.951403
30Y   0  0.508924  0.604819  0.776075  0.840466  0.951403  1.000000

In [351]: np.linalg.eigvals(df.apply(pd.to_numeric, errors='coerce').fillna(0))
Out[351]:
array([ 5.11329285,  0.7269089 ,  0.07770957,  0.01334893,  0.02909796,
        0.03964179,  0.        ])

pd.to_numeric converts the columns to float:

In [352]: df.apply(pd.to_numeric, errors='coerce').dtypes
Out[352]:
6M     float64
1Y     float64
2Y     float64
4Y     float64
5Y     float64
10Y    float64
30Y    float64
dtype: object

Note that pd.to_numeric requires pandas version >= 0.17.0.
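The dtype check above also suggests why the error in the question appears: slicing with iloc keeps the object dtype of the mixed columns, even when every remaining cell is a Python float. A minimal sketch, assuming a hypothetical 3x3 stand-in built from the first values of the question's matrix:

```python
import numpy as np
import pandas as pd

# Hypothetical small stand-in for the question's DataFrame (assumption:
# missing cells hold the string 'n/a', the rest are Python floats).
labels = ["6M", "1Y", "2Y"]
df = pd.DataFrame(
    [["n/a", "n/a", "n/a"],
     ["n/a", 1.0, 0.9465095],
     ["n/a", 0.9465095, 1.0]],
    index=labels, columns=labels,
)

# The slice drops the 'n/a' row and column, but the columns stay object dtype.
tmp = df.iloc[1:, 1:]
print(tmp.dtypes)

# On an object array NumPy is expected to raise the TypeError from the question.
try:
    np.linalg.eigvalsh(tmp.to_numpy())
except TypeError as exc:
    print("TypeError:", exc)

# Casting the subset to float gives eigvalsh a dtype it can work with.
eig = np.linalg.eigvalsh(tmp.astype(float).to_numpy())
print(eig)
```

The individual values being float is not enough; it is the dtype of the underlying array that np.linalg looks at.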

If 'n/a' is the only non-numeric value, you can also use replace and astype(float):

df.replace('n/a', 0).astype(float)

In [364]: df.replace('n/a', 0).astype(float)
Out[364]:
     6M        1Y        2Y        4Y        5Y       10Y       30Y
6M    0  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
1Y    0  1.000000  0.946510  0.869504  0.812471  0.646870  0.508924
2Y    0  0.946510  1.000000  0.934318  0.888068  0.742355  0.604819
4Y    0  0.869504  0.934318  1.000000  0.976284  0.880398  0.776075
5Y    0  0.812471  0.888068  0.976284  1.000000  0.911779  0.840466
10Y   0  0.646870  0.742355  0.880398  0.911779  1.000000  0.951403
30Y   0  0.508924  0.604819  0.776075  0.840466  0.951403  1.000000

In [365]: np.linalg.eigvals(df.replace('n/a', 0).astype(float))
Out[365]:
array([ 5.11329285,  0.7269089 ,  0.07770957,  0.01334893,  0.02909796,
        0.03964179,  0.        ])
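Filling with 0 adds one extra zero eigenvalue for the empty '6M' row and column. If only the eigenvalues of the populated block are wanted, as the question asks, one option is to drop the all-NaN rows and columns instead of filling them. The following is a rough sketch under assumptions: the 4x4 frame is a hypothetical stand-in using the first few values from the question, and np.linalg.eigvalsh is used because the matrix is symmetric (it returns the eigenvalues in ascending order).

```python
import numpy as np
import pandas as pd

# Hypothetical reduced version of the question's matrix.
labels = ["6M", "1Y", "2Y", "4Y"]
raw = [
    ["n/a", "n/a", "n/a", "n/a"],
    ["n/a", 1, 0.9465095, 0.869504],
    ["n/a", 0.9465095, 1, 0.9343177],
    ["n/a", 0.869504, 0.9343177, 1],
]
df = pd.DataFrame(raw, index=labels, columns=labels)

# Convert every column to float; 'n/a' becomes NaN.
num = df.apply(pd.to_numeric, errors="coerce")

# Drop rows and columns that contain no data at all.
sub = num.dropna(axis=0, how="all").dropna(axis=1, how="all")

# Eigenvalues of the populated block only.
eig_sub = np.linalg.eigvalsh(sub.to_numpy())

# For comparison: the zero-filled full matrix has the same eigenvalues plus a 0.
eig_full = np.linalg.eigvalsh(num.fillna(0).to_numpy())

print(eig_sub)
print(eig_full)
```

This keeps the 'n/a' format in the original DataFrame and only converts a copy for the calculation.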

Source: https://habr.com/ru/post/1624411/

