Convert Pandas DataFrame to JSON as an element of a larger data structure

I have been working with pandas DataFrame objects on my server, converting them to CSV for transfer to a browser, where the table values are displayed using d3. Although CSV is fine as far as it goes, I really need more than just a 2D data table. If nothing else, I would like to return some metadata about the data.

So I started looking at JSON, thinking that I could build a dictionary containing some metadata and my DataFrame. As an absurdly simple example:

    >>> z = numpy.zeros(10)
    >>> df = pandas.DataFrame(z)
    >>> df
       0
    0  0
    1  0
    2  0
    3  0
    4  0
    5  0
    6  0
    7  0
    8  0
    9  0
    >>> result = {
    ...     "name": "Simple Example",
    ...     "data": df,
    ... }

Not surprisingly, this cannot be directly serialized using the json module. I found the jsonext module and tried it. It "works", but gives incomplete results:

    >>> jsonext.dumps(result)
    '{"data": ["0"], "name": "Simple Example"}'

Looking at the methods that the DataFrame itself provides for this kind of thing, I found to_dict() and to_json(). The first produces a dictionary of dictionaries:

    >>> df.to_dict()
    {0: {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0}}

but, as you can see, it cannot be serialized to JSON, since the keys are not strings.

df.to_json() looked like it might work, although I would end up with a JSON string embedded in another JSON string. Something like this:

    >>> json.dumps({"name": "Simple Example", "data": df.to_json()})
    '{"data": "{\"0\": {\"0\": 0.0, \"1\": 0.0, \"2\": 0.0, \"3\": 0.0, \"4\": 0.0, \"5\": 0.0, \"6\": 0.0, \"7\": 0.0, \"8\": 0.0, \"9\": 0.0}}", "name": "Simple Example"}'

In other words, a bit of a mess.
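To make the mess concrete, here is a minimal sketch (assuming Python 3 and a smaller three-row frame; the variable names are illustrative, not from the original session) showing that a consumer of such a payload would have to decode the table a second time:

```python
import json

import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros(3))

# Embedding the *string* returned by to_json() encodes the table twice.
payload = json.dumps({"name": "Simple Example", "data": df.to_json()})

outer = json.loads(payload)
# After one decode, "data" is still JSON text rather than a nested object.
inner_is_string = isinstance(outer["data"], str)

# A second decode is needed to reach the actual table values.
table = json.loads(outer["data"])
```

On the browser side this would mean calling the JSON parser twice, once for the envelope and once for the `data` field.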

Any suggestions on how to handle such a nested structure where some of the elements cannot be serialized directly? I think I could get jsonext to work, but its Dict mixin expects to find a proper (in my mind) to_dict() method, and DataFrame.to_dict() doesn't seem to return the right thing. (Although I will keep playing with it.)

I figure this has to be a cat that has already been skinned; I just haven't found it. For now I would be happy with something no more hierarchical than my example (albeit with many more key/value pairs), although I would not turn up my nose at a more general solution.

3 answers

The default function (passed to json.dumps) is called for every object that the encoder cannot serialize on its own. It can return any object that the encoder does know how to encode, for example a dict.

df.to_json() returns a string. json.loads(df.to_json()) returns a dict whose keys are strings. Therefore, if we set default=lambda df: json.loads(df.to_json()), the DataFrame will be serialized as if it were a dict.

    import json
    import numpy as np
    import pandas as pd

    z = np.zeros(10)
    df = pd.DataFrame(z)
    result = {
        "name": "Simple Example",
        "data": df,
    }

    # default= is called for each object json cannot serialize by itself
    jstr = json.dumps(result, default=lambda df: json.loads(df.to_json()))
    newresult = json.loads(jstr)
    print(newresult)
    # {u'data': {u'0': {u'0': 0.0,
    #                   u'1': 0.0,
    #                   u'2': 0.0,
    #                   u'3': 0.0,
    #                   u'4': 0.0,
    #                   u'5': 0.0,
    #                   u'6': 0.0,
    #                   u'7': 0.0,
    #                   u'8': 0.0,
    #                   u'9': 0.0}},
    #  u'name': u'Simple Example'}

    print(pd.DataFrame(newresult['data']))

gives

       0
    0  0
    1  0
    2  0
    3  0
    4  0
    5  0
    6  0
    7  0
    8  0
    9  0
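One caveat with the lambda is that it calls to_json() on anything the encoder cannot handle, so an unrelated unsupported object would fail with a confusing AttributeError. A slightly more defensive variant, sketched here under the assumption of Python 3 (the helper name `encode_extra` is made up for illustration), checks the type explicitly and raises TypeError otherwise:

```python
import json

import numpy as np
import pandas as pd


def encode_extra(obj):
    """Fallback for json.dumps; only called for objects json cannot encode."""
    if isinstance(obj, pd.DataFrame):
        # Round-trip through pandas' own serializer to get plain dicts
        # with string keys.
        return json.loads(obj.to_json())
    # Anything else is still an error, reported the way json itself does.
    raise TypeError(
        f"Object of type {type(obj).__name__} is not JSON serializable"
    )


df = pd.DataFrame(np.zeros(3))
result = {"name": "Simple Example", "data": df}

jstr = json.dumps(result, default=encode_extra)
decoded = json.loads(jstr)
```

The same function can be extended with further isinstance branches if other non-serializable types need to travel in the same payload.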

I think a closer reading of the jsonext docs was warranted. It looks like I can create my own mixin that knows how to encode DataFrame objects correctly and then call jsonext.dumps(result). I was seduced by the existing to_dict() and to_json() methods of DataFrame objects, which do not actually solve the problem.
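For comparison, the same idea can be expressed with only the standard library by subclassing json.JSONEncoder. This is not jsonext's mixin API, just a rough stdlib analogue; the class name `DataFrameEncoder` is an assumption made for this sketch:

```python
import json

import numpy as np
import pandas as pd


class DataFrameEncoder(json.JSONEncoder):
    """Encoder that knows how to turn a DataFrame into plain JSON types."""

    def default(self, obj):
        if isinstance(obj, pd.DataFrame):
            # Let pandas serialize the frame, then hand back plain dicts.
            return json.loads(obj.to_json())
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(obj)


df = pd.DataFrame(np.zeros(3))
result = {"name": "Simple Example", "data": df}

jstr = json.dumps(result, cls=DataFrameEncoder)
decoded = json.loads(jstr)
```

A class is mainly worth the extra lines when the encoder is reused in several places or needs to cover more than one custom type.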


One way would be to convert the index and columns to strings:

    In [355]: df.index = df.index.astype(str)

    In [356]: df.columns = df.columns.astype(str)

Then you can build the dict and pass it to json.dumps:

    In [357]: result = {
       .....:     "name": "Simple Example",
       .....:     "data": df.to_dict(),
       .....: }

    In [359]: print(json.dumps(result, indent=4))
    {
        "data": {
            "0": {
                "1": 0.0,
                "0": 0.0,
                "3": 0.0,
                "2": 0.0,
                "5": 0.0,
                "4": 0.0,
                "7": 0.0,
                "6": 0.0,
                "9": 0.0,
                "8": 0.0
            }
        },
        "name": "Simple Example"
    }
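A related option that avoids touching the index at all, sketched here as an assumption about a reasonably recent pandas and Python 3 rather than something the answer above proposes, is to ask to_dict() for a list-based layout with orient="split". Labels and values then come back as lists, so no dictionary keys need converting:

```python
import json

import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros(3))

# orient="split" returns {"index": [...], "columns": [...], "data": [[...], ...]},
# so row and column labels are list items instead of dict keys.
result = {
    "name": "Simple Example",
    "data": df.to_dict(orient="split"),
}

jstr = json.dumps(result)
decoded = json.loads(jstr)

# The frame can be rebuilt on the Python side from the three parts.
rebuilt = pd.DataFrame(
    decoded["data"]["data"],
    index=decoded["data"]["index"],
    columns=decoded["data"]["columns"],
)
```

This layout keeps integer labels as integers in the JSON, which may or may not suit the d3 code on the receiving end.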

Source: https://habr.com/ru/post/1204203/

