How to use PyCall in Julia to convert Python output to Julia DataFrame

I would like to get some data from Quandl and analyze it in Julia. Unfortunately, there is no official API for this (yet). I know of an existing solution (Quandl.jl), but it is still very limited in functionality and does not follow the same syntax as the original Python API.

I thought it would be wise to use PyCall to retrieve the data through the official Python API from within Julia. This does return a result, but I'm not sure how to convert it to a format that I can use in Julia (ideally a DataFrame).

I have tried the following.

using PyCall, DataFrames
@pyimport quandl

data = quandl.get("WIKI/AAPL", returns = "pandas");

Julia converts this output to Dict{Any,Any}. When I use returns = "numpy" instead of returns = "pandas", I get a PyObject rec.array.

How can I get data as a Julia DataFrame, the way Quandl.jl would return it? Please note that Quandl.jl is not an option for me, since it does not support automatic retrieval of several assets and lacks several other features, so it is important that I can use the Python API.

Thanks for any suggestions!

3 answers

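You are running into a difference between Python/Pandas versions. I have two configurations at hand: Pandas 0.18.0 on Python 2 and Pandas 0.19.1 on Python 3. The answer from @niczky12 works in one of them, while the other shows the Dict{Any,Any} behavior you describe. It appears that, depending on the configuration, PyCall detects a mapping-like interface on Pandas objects and automatically converts them to a dictionary. There are two options: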

  • Work with the dictionary interface directly:

    data = quandl.get("WIKI/AAPL", returns = "pandas")
    cols = keys(data)
    df = DataFrame(Any[collect(values(data[c])) for c in cols], map(Symbol, cols))
    
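  • Explicitly disable PyCall's automatic conversion and extract the columns as niczky12 shows in the other answer. Note that data[:Open] triggers the automatic conversion, whereas data["Open"] simply returns a PyObject.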

    data = pycall(quandl.get, PyObject, "WIKI/AAPL", returns = "pandas")
    cols = data[:columns]
    df = DataFrame(Any[Array(data[c]) for c in cols], map(Symbol, cols))
    

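In both cases, note that the date index is not included in the resulting DataFrame. You will most likely want to add it as a column: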

df[:Date] = collect(data[:index])
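
One caveat with the dictionary-interface option: a Julia Dict has no guaranteed iteration order, so collecting values column by column does not by itself promise that rows come out in date order. The following is a minimal sketch of extracting every column through one shared, sorted index. It assumes each auto-converted column behaves like a Dict keyed by the index; the sample data and the helper name aligned_columns are made up for illustration and are not part of PyCall or quandl.

```julia
# Stand-in for the auto-converted result: column name => (index => value).
# The real object would come from quandl.get; these numbers are invented.
data = Dict(
    "Open"  => Dict("2017-01-03" => 1.0, "2017-01-04" => 2.0, "2017-01-05" => 3.0),
    "Close" => Dict("2017-01-03" => 1.5, "2017-01-04" => 2.5, "2017-01-05" => 3.5),
)

# Extract all columns using one shared, sorted index so rows stay aligned.
function aligned_columns(data)
    cols = sort(collect(keys(data)))                  # stable column order
    index = sort(collect(keys(data[first(cols)])))    # stable row order
    columns = [[data[c][i] for i in index] for c in cols]
    return index, cols, columns
end

index, cols, columns = aligned_columns(data)
```

The resulting columns and map(Symbol, cols) can then be passed to the DataFrame constructor as in the snippets above, with index serving as the Date column.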

Here is one option. First, get the column names of your data object:

julia> colnames = map(Symbol, data[:columns])
12-element Array{Symbol,1}:
 :Open                
 :High                
 :Low                 
 :Close               
 :Volume              
 Symbol("Ex-Dividend")
 Symbol("Split Ratio")
 Symbol("Adj. Open")  
 Symbol("Adj. High")  
 Symbol("Adj. Low")   
 Symbol("Adj. Close") 
 Symbol("Adj. Volume")

Then put all the columns into a DataFrame:

julia> y = DataFrame(Any[Array(data[c]) for c in colnames], colnames)

6×12 DataFrames.DataFrame
│ Row │ Open  │ High  │ Low   │ Close │ Volume   │ Ex-Dividend │ Split Ratio │
├─────┼───────┼───────┼───────┼───────┼──────────┼─────────────┼─────────────┤
│ 1   │ 28.75 │ 28.87 │ 28.75 │ 28.75 │ 2.0939e6 │ 0.0         │ 1.0         │
│ 2   │ 27.38 │ 27.38 │ 27.25 │ 27.25 │ 785200.0 │ 0.0         │ 1.0         │
│ 3   │ 25.37 │ 25.37 │ 25.25 │ 25.25 │ 472000.0 │ 0.0         │ 1.0         │
│ 4   │ 25.87 │ 26.0  │ 25.87 │ 25.87 │ 385900.0 │ 0.0         │ 1.0         │
│ 5   │ 26.63 │ 26.75 │ 26.63 │ 26.63 │ 327900.0 │ 0.0         │ 1.0         │
│ 6   │ 28.25 │ 28.38 │ 28.25 │ 28.25 │ 217100.0 │ 0.0         │ 1.0         │

│ Row │ Adj. Open │ Adj. High │ Adj. Low │ Adj. Close │ Adj. Volume │
├─────┼───────────┼───────────┼──────────┼────────────┼─────────────┤
│ 1   │ 0.428364  │ 0.430152  │ 0.428364 │ 0.428364   │ 1.17258e8   │
│ 2   │ 0.407952  │ 0.407952  │ 0.406015 │ 0.406015   │ 4.39712e7   │
│ 3   │ 0.378004  │ 0.378004  │ 0.376216 │ 0.376216   │ 2.6432e7    │
│ 4   │ 0.385453  │ 0.38739   │ 0.385453 │ 0.385453   │ 2.16104e7   │
│ 5   │ 0.396777  │ 0.398565  │ 0.396777 │ 0.396777   │ 1.83624e7   │
│ 6   │ 0.420914  │ 0.422851  │ 0.420914 │ 0.420914   │ 1.21576e7   │

Thanks to @Matt B. for suggestions that simplified this code.

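The problem with the above is that every column in the DataFrame has element type Any. To make this more efficient, here are a few functions that do the job: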

# first, guess the Julia equivalent of type of the object
function guess_type(x::PyCall.PyObject)
  string_dtype = x[:dtype][:name]
  julia_string = string(uppercase(string_dtype[1]), string_dtype[2:end])

  return eval(parse("$julia_string"))
end

# convert an individual column, falling back to Any array if the guess was wrong
function convert_column(x)
  y = try Array{guess_type(x)}(x) catch Array(x) end
  return y
end

# put everything together into a single function
function convert_pandas(df)
  # use the argument df here, not the global variable data
  colnames = map(Symbol, df[:columns])
  y = DataFrame(Any[convert_column(df[c]) for c in colnames], colnames)

  return y
end
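
The guess_type function above builds a type name from the dtype string and evaluates it, which only works when the capitalized dtype name happens to be a Julia type. An explicit lookup table is one possible alternative. The sketch below is written for current Julia and uses plain vectors instead of PyObjects so that it runs on its own; the names DTYPE_TO_JULIA, julia_eltype and typed_column are hypothetical, and the list of dtypes is only a small, assumed sample.

```julia
# Hypothetical lookup from NumPy dtype names to Julia element types.
DTYPE_TO_JULIA = Dict(
    "float64" => Float64,
    "float32" => Float32,
    "int64"   => Int64,
    "int32"   => Int32,
    "bool"    => Bool,
)

# Return the Julia type for a dtype name, or Any when the name is unknown.
julia_eltype(dtype_name::AbstractString) = get(DTYPE_TO_JULIA, dtype_name, Any)

# Convert a vector of untyped values to the guessed element type,
# returning the original vector unchanged if the conversion fails.
function typed_column(vals::AbstractVector, dtype_name::AbstractString)
    T = julia_eltype(dtype_name)
    try
        return convert(Vector{T}, vals)
    catch
        return vals
    end
end
```

For example, typed_column(Any[1.0, 2.0], "float64") yields a Vector{Float64}, while an unknown dtype such as "object" leaves the column as it was.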

Applied to your data, this gives the same column names as before, but with the correct Float64 column types:

y = convert_pandas(data);
showcols(y)
9147×12 DataFrames.DataFrame
│ Col # │ Name        │ Eltype  │ Missing │
├───────┼─────────────┼─────────┼─────────┤
│ 1     │ Open        │ Float64 │ 0       │
│ 2     │ High        │ Float64 │ 0       │
│ 3     │ Low         │ Float64 │ 0       │
│ 4     │ Close       │ Float64 │ 0       │
│ 5     │ Volume      │ Float64 │ 0       │
│ 6     │ Ex-Dividend │ Float64 │ 0       │
│ 7     │ Split Ratio │ Float64 │ 0       │
│ 8     │ Adj. Open   │ Float64 │ 0       │
│ 9     │ Adj. High   │ Float64 │ 0       │
│ 10    │ Adj. Low    │ Float64 │ 0       │
│ 11    │ Adj. Close  │ Float64 │ 0       │
│ 12    │ Adj. Volume │ Float64 │ 0       │

There is an API. Just use Quandl.jl: https://github.com/milktrader/Quandl.jl

using Quandl
data = quandlget("WIKI/AAPL")

This has the added advantage of returning the data in a useful Julia format (TimeArray), which has appropriate methods defined for working with such data.


Source: https://habr.com/ru/post/1671550/

