Reading JSON array in Julia DataFrame format

Question

Given a JSON file, the JSON package parses it successfully. But if I want to use the result as a DataFrame (or any other columnar data structure), what would be a good way to convert it?

Currently, for example, I have:

 using JSON
 using DataFrames

 json_str = """
 [{ "color": "red", "value": "#f00" },
  { "color": "green", "value": "#0f0" },
  { "color": "blue", "value": "#00f" },
  { "color": "cyan", "value": "#0ff" },
  { "color": "magenta", "value": "#f0f" },
  { "color": "yellow", "value": "#ff0" },
  { "color": "black", "value": "#000" } ]
 """

 function jsontodf(a)
     ka = union([keys(r) for r in a]...)
     df = DataFrame(;Dict(Symbol(k)=>get.(a,k,NA) for k in ka)...)
     return df
 end

 a = JSON.Parser.parse(json_str)
 jsontodf(a)

that leads to:

 7×2 DataFrames.DataFrame
 │ Row │ color     │ value  │
 ├─────┼───────────┼────────┤
 │ 1   │ "red"     │ "#f00" │
 │ 2   │ "green"   │ "#0f0" │
 │ 3   │ "blue"    │ "#00f" │
 │ 4   │ "cyan"    │ "#0ff" │
 │ 5   │ "magenta" │ "#f0f" │
 │ 6   │ "yellow"  │ "#ff0" │
 │ 7   │ "black"   │ "#000" │

It also handles missing fields by filling them with NA. Is there anything cleaner or faster (Julia v0.6+)?

dataframe julia-lang julia

Dan Getz, 10 Sept. '17 at 18:13

1 answer

Bogumił Kamiński · Answer 1 · 2019-04-29T13:26:58+0000

I dug up this old question because we now have a better solution for it, starting with DataFrames.jl 0.18.0.

If all the entries in the JSON array have the same fields, you can write:

 reduce(vcat, DataFrame.(a))
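As a minimal end-to-end sketch of the same-fields case, here is how the one-liner might be used on a shortened version of the color data from the question. It assumes the JSON.jl and DataFrames.jl packages are installed and that the parsed result is a vector of dictionaries, which is what `JSON.parse` returned in the versions I am aware of:

```julia
using JSON
using DataFrames

# Shortened version of the JSON from the question; every record has the same fields.
json_str = """
[{ "color": "red",   "value": "#f00" },
 { "color": "green", "value": "#0f0" },
 { "color": "blue",  "value": "#00f" }]
"""

# Parse into a vector of dictionaries, one per JSON object.
a = JSON.parse(json_str)

# Turn each dictionary into a one-row DataFrame, then stack the rows.
df = reduce(vcat, DataFrame.(a))
```

Each dictionary becomes a one-row DataFrame, so `df` ends up with one row per JSON object and one column per field.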

If different records may have different fields, write:

 vcat(DataFrame.(a)..., cols=:union)
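To illustrate what `cols=:union` does, here is a small sketch with records whose fields differ. The data (including the `alpha` field) is made up for illustration, and the same package assumptions as in the question apply:

```julia
using JSON
using DataFrames

# Hypothetical data: the second record lacks "value", the third has an extra "alpha".
json_str = """
[{ "color": "red",   "value": "#f00" },
 { "color": "green" },
 { "color": "blue",  "value": "#00f", "alpha": 0.5 }]
"""

a = JSON.parse(json_str)

# cols=:union keeps every column seen in any record and
# fills the gaps with `missing`.
df = vcat(DataFrame.(a)...; cols = :union)
```

The result has the columns `color`, `value` and `alpha`; fields absent from a record show up as `missing` rather than the old `NA`.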

This can be a bit problematic if a has a lot of records, since it relies on splatting. I have just opened a PR so that you will soon also be able to write:

 reduce(vcat, DataFrame.(a), cols=:union)
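In DataFrames.jl releases where `reduce(vcat, ...)` accepts the `cols` keyword (as far as I know, current releases do), the non-splatting form might be used like this. The records below are generated by hand as a stand-in for a large parsed JSON array, so the field names `id` and `odd` are purely hypothetical:

```julia
using DataFrames

# Stand-in for a large parsed JSON array: 1000 records,
# where only odd-numbered records carry the "odd" field.
records = [isodd(i) ? Dict{String,Any}("id" => i, "odd" => true) :
                      Dict{String,Any}("id" => i)
           for i in 1:1000]

# No splatting: reduce receives the vector of DataFrames as a single argument.
df = reduce(vcat, DataFrame.(records); cols = :union)
```

Because the one-row DataFrames are passed as a single vector instead of a thousand separate arguments, this avoids the splatting that the `vcat(...)` form relies on.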
