pandas.Series() built from a DataFrame column returns NaN values

I am trying to convert a DataFrame into a Series using code that, simplified, looks like this:

    import pandas as pd

    dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
    values = [i for i in range(20)]
    data = {'Date': dates, 'Value': values}
    df = pd.DataFrame(data)
    df['Date'] = pd.to_datetime(df['Date'])
    ts = pd.Series(df['Value'], index=df['Date'])
    print(ts)

However, the print output is as follows:

    Date
    2016-01-01   NaN
    2016-01-02   NaN
    2016-01-03   NaN
    2016-01-04   NaN
    2016-01-05   NaN
    2016-01-06   NaN
    2016-01-07   NaN
    2016-01-08   NaN
    2016-01-09   NaN
    2016-01-10   NaN
    2016-01-11   NaN
    2016-01-12   NaN
    2016-01-13   NaN
    2016-01-14   NaN
    2016-01-15   NaN
    2016-01-16   NaN
    2016-01-17   NaN
    2016-01-18   NaN
    2016-01-19   NaN
    2016-01-20   NaN
    Name: Value, dtype: float64

Where do the NaN values come from? Is a view on a DataFrame object not valid input for the Series class?

I found the to_series function for Index objects. Is there something similar for a DataFrame?

+5
3 answers

I think you can use .values, which converts the Value column to a NumPy array:

 ts = pd.Series(df['Value'].values, index=df['Date']) 
    import pandas as pd

    dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
    values = [i for i in range(20)]
    data = {'Date': dates, 'Value': values}
    df = pd.DataFrame(data)
    df['Date'] = pd.to_datetime(df['Date'])

    print(df['Value'].values)
    [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

    ts = pd.Series(df['Value'].values, index=df['Date'])
    print(ts)
    Date
    2016-01-01     0
    2016-01-02     1
    2016-01-03     2
    2016-01-04     3
    2016-01-05     4
    2016-01-06     5
    2016-01-07     6
    2016-01-08     7
    2016-01-09     8
    2016-01-10     9
    2016-01-11    10
    2016-01-12    11
    2016-01-13    12
    2016-01-14    13
    2016-01-15    14
    2016-01-16    15
    2016-01-17    16
    2016-01-18    17
    2016-01-19    18
    2016-01-20    19
    dtype: int64

Or you can use:

    ts1 = pd.Series(data=values, index=pd.to_datetime(dates))
    print(ts1)
    2016-01-01     0
    2016-01-02     1
    2016-01-03     2
    2016-01-04     3
    2016-01-05     4
    2016-01-06     5
    2016-01-07     6
    2016-01-08     7
    2016-01-09     8
    2016-01-10     9
    2016-01-11    10
    2016-01-12    11
    2016-01-13    12
    2016-01-14    13
    2016-01-15    14
    2016-01-16    15
    2016-01-17    16
    2016-01-18    17
    2016-01-19    18
    2016-01-20    19
    dtype: int64

Thanks to @ajcr for a better explanation of why you get NaN:

When you give pd.Series a Series or a DataFrame column as data, it reindexes that data using the index you specify. Since your DataFrame column has an integer index (not a date index), none of the requested labels exist and you get nothing but missing values.
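The reindexing described above can be reproduced on a three-row example. This is only a sketch: the names `col`, `aligned`, `reindexed`, and `positional` are made up for illustration and do not appear in the question.

```python
import pandas as pd

# Three dates stand in for the twenty used in the question.
dates = pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-03'])

# A column as it comes out of a DataFrame: integer labels 0, 1, 2.
col = pd.Series([0, 1, 2], name='Value')

# Series data + index= means "look up these labels in the data".
# None of the dates are labels of col, so every entry is missing.
aligned = pd.Series(col, index=dates)

# The same lookup written explicitly.
reindexed = col.reindex(dates)

# A bare array carries no labels, so values are assigned by position.
positional = pd.Series(col.to_numpy(), index=dates)

print(aligned)
print(positional)
```

Here `to_numpy()` plays the same role as `.values` in the answer: it drops the integer labels so that there is nothing left to align on.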

+9

If you only want to create a series from the values, you could do:

    pd.Series(range(20), index=pd.date_range('2016-01-01', periods=20, freq='D'))
0

You can simply do:

 s = df.set_index('Date') 

This is now a single-column DataFrame indexed by Date.

If you really want this to be a series:

 s = df.set_index('Date').Value 
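As a rough, self-contained version of this approach: the sketch below builds the frame with `pd.date_range` instead of parsing the question's date strings (an assumption made only to keep the example short) and selects the column with `['Value']`, which behaves like `.Value` here.

```python
import pandas as pd

# Stand-in for the question's DataFrame: 20 daily dates and the values 0..19.
df = pd.DataFrame({
    'Date': pd.date_range('2016-01-01', periods=20, freq='D'),
    'Value': list(range(20)),
})

# Move Date into the index, then pick the single remaining column.
# The result is a Series that keeps its values because the rows never
# get re-aligned against a different index.
ts = df.set_index('Date')['Value']

print(ts.head())
```

Bracket selection is used instead of attribute access because it also works when a column name is not a valid Python identifier or clashes with a DataFrame method.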

By the way, NaN is NumPy's Not-a-Number value.

Using your method, you can use:

 ts = pd.Series(df['Value'].values, name='Value', index=df['Date']) 

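You get NaN because you are passing a Series to the Series constructor together with a different index: pandas looks up the new date labels in the existing integer index instead of copying the values by position, and finds none of them.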

0

Source: https://habr.com/ru/post/1244495/

