JSON for pandas DataFrame

Question


What I'm trying to do is extract elevation data from the Google Maps API along a path specified by latitude and longitude coordinates, as follows:

    from urllib2 import Request, urlopen
    import json

    path1 = '42.974049,-81.205203|42.974298,-81.195755'
    request = Request('http://maps.googleapis.com/maps/api/elevation/json?locations=' + path1 + '&sensor=false')
    response = urlopen(request)
    elevations = response.read()

This gives me data that looks like this:

    elevations.splitlines()

    ['{',
     ' "results" : [',
     ' {',
     ' "elevation" : 243.3462677001953,',
     ' "location" : {',
     ' "lat" : 42.974049,',
     ' "lng" : -81.205203',
     ' },',
     ' "resolution" : 19.08790397644043',
     ' },',
     ' {',
     ' "elevation" : 244.1318664550781,',
     ' "location" : {',
     ' "lat" : 42.974298,',
     ' "lng" : -81.19575500000001',
     ' },',
     ' "resolution" : 19.08790397644043',
     ' }',
     ' ],',
     ' "status" : "OK"',
     '}']

When I read this into a DataFrame, this is what I get:

    pd.read_json(elevations)

[image: DataFrame returned by pd.read_json]

And here is what I want:

[image: desired DataFrame]

I'm not sure if this is possible, but mainly what I'm looking for is a way to put the elevation, latitude and longitude data together in a pandas DataFrame (it doesn't need to have fancy multi-line headers).

If someone can help or give advice on working with this data, that would be great! If you can't tell, I haven't worked much with JSON data before...

EDIT:

This method isn't all that attractive, but it seems to work:

    data = json.loads(elevations)
    lat, lng, el = [], [], []
    for result in data['results']:
        lat.append(result[u'location'][u'lat'])
        lng.append(result[u'location'][u'lng'])
        el.append(result[u'elevation'])
    df = pd.DataFrame([lat, lng, el]).T

This ends up with a DataFrame whose columns hold latitude, longitude and elevation:

[image: resulting DataFrame]

+88

json python pandas google-maps

pbreach Jan 14 '14 at 1:32


9 answers

Check out this snippet.

    import json
    import pandas as pd

    # reading the JSON data using json.load()
    file = 'data.json'
    with open(file) as train_file:
        dict_train = json.load(train_file)

    # converting json dataset from dictionary to dataframe
    train = pd.DataFrame.from_dict(dict_train, orient='index')
    train.reset_index(level=0, inplace=True)

Hope this helps :)
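To make it clearer what `orient='index'` does, here is a minimal sketch with an in-memory dictionary standing in for the contents of `data.json`. The keys `"a"`/`"b"` and the column name `"id"` are made up for illustration; they are not from the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the dict loaded from data.json:
# each top-level key maps to one record.
dict_train = {
    "a": {"lat": 1.0, "lng": 2.0},
    "b": {"lat": 3.0, "lng": 4.0},
}

# orient='index' turns each top-level key into a row label.
train = pd.DataFrame.from_dict(dict_train, orient="index")

# Move the row labels into an ordinary column and give it a name.
train = train.reset_index().rename(columns={"index": "id"})
print(train)
```

This only fits JSON whose top level is a mapping from keys to flat records; deeper nesting would still leave dicts inside cells.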

+18

Rishu Jun 17 '17 at 17:04

You can first load your JSON data into a Python dictionary:

 data = json.loads(elevations)

Then modify the data on the fly:

    for result in data['results']:
        result[u'lat'] = result[u'location'][u'lat']
        result[u'lng'] = result[u'location'][u'lng']
        del result[u'location']

Rebuild the JSON string:

 elevations = json.dumps(data)

Finally:

 pd.read_json(elevations)

You can probably also avoid dumping the data back to a string; I assume pandas can create a DataFrame directly from a dictionary (I haven't used it in a long time :p).
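The guess in the last paragraph holds: pandas can build a DataFrame straight from a list of dicts, so the round trip through `json.dumps` is not needed. A minimal sketch, using a hard-coded copy of the response from the question in place of a live API call (the name `rows` is illustrative):

```python
import json
import pandas as pd

# Hard-coded sample mirroring the response shown in the question (no network call).
elevations = """
{"results": [
  {"elevation": 243.3462677001953,
   "location": {"lat": 42.974049, "lng": -81.205203},
   "resolution": 19.08790397644043},
  {"elevation": 244.1318664550781,
   "location": {"lat": 42.974298, "lng": -81.19575500000001},
   "resolution": 19.08790397644043}
 ],
 "status": "OK"}
"""

data = json.loads(elevations)

# Build one flat dict per result instead of mutating the parsed data in place.
rows = [
    {
        "lat": r["location"]["lat"],
        "lng": r["location"]["lng"],
        "elevation": r["elevation"],
    }
    for r in data["results"]
]

# A list of flat dicts maps directly onto rows and named columns.
df = pd.DataFrame(rows)
print(df)
```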

+10

Raphaël Braud Jan 14 '14 at 2:19

The problem is that you have several columns in the data frame that contain dicts with smaller dicts inside them. Useful JSON is often heavily nested. I have been writing small functions that pull the info I want out into a new column. That way I have it in the format that I want to use.

    for row in range(len(data)):
        # First I load the dict (one at a time)
        n = data.loc[row, 'dict_column']
        # Now I make a new column that pulls out the data that I want.
        data.loc[row, 'new_column'] = n.get('key')
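The same extraction can be written without an explicit row loop. This is a sketch of an alternative, not the answer's own method; it uses `Series.apply` on a made-up frame whose column and key names match the placeholders above.

```python
import pandas as pd

# Illustrative data: a column whose cells are dicts.
data = pd.DataFrame({
    "dict_column": [
        {"key": "a", "other": 1},
        {"other": 2},  # "key" is missing in this row
    ]
})

# dict.get returns None when the key is absent, so missing keys do not raise.
data["new_column"] = data["dict_column"].apply(lambda d: d.get("key"))
print(data)
```

Unlike `data.loc[row, ...]` with `range(len(data))`, this does not depend on the index being the default 0..n-1 integers.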

+4

billmanH Oct 20 '14 at 4:54


BillmanH's solution helped me, but didn't work until I switched from:

 n = data.loc[row,'json_column']

to:

 n = data.iloc[[row]]['json_column']

Here is the rest of it; converting to a dictionary is helpful for working with JSON data.

    import json

    for row in range(len(data)):
        # .item() extracts the single JSON string from the one-row selection
        n = data.iloc[[row]]['json_column'].item()
        jsonDict = json.loads(n)
        if ('mykey' in jsonDict):
            display(jsonDict['mykey'])
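When every cell of the column holds a JSON string, the whole column can also be parsed and flattened in one pass. A minimal sketch, assuming pandas 1.0 or later (where `pd.json_normalize` is available at the top level); the sample strings and the `nested` key are made up for illustration.

```python
import json
import pandas as pd

# Illustrative data: each cell is a JSON document stored as a string.
data = pd.DataFrame({
    "json_column": [
        '{"mykey": 1, "nested": {"a": 10}}',
        '{"mykey": 2, "nested": {"a": 20}}',
    ]
})

# Parse every string into a dict, then flatten nested keys into dotted columns.
parsed = data["json_column"].apply(json.loads)
flat = pd.json_normalize(parsed.tolist())
print(flat)
```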

+1

niltoid Dec 11 '18 at 22:14


Just a new version of the accepted answer, since Python 3.x does not support urllib2:

    from requests import request
    from pandas.io.json import json_normalize

    path1 = '42.974049,-81.205203|42.974298,-81.195755'
    response = request(url='http://maps.googleapis.com/maps/api/elevation/json?locations=' + path1 + '&sensor=false', method='get')
    # response.json() already returns parsed data, so no json.loads() call is needed
    data = response.json()
    json_normalize(data['results'])

+1

AB Abhi Feb 15 '19 at 16:40


    # Use the small trick to make the data json interpret-able
    # Since your data is not directly interpreted by json.loads()
    >>> import json
    >>> f=open("sampledata.txt","r+")
    >>> data = f.read()
    >>> for x in data.split("\n"):
    ...     strlist = "["+x+"]"
    ...     datalist=json.loads(strlist)
    ...     for y in datalist:
    ...         print(type(y))
    ...         print(y)
    ...
    ...
    <type 'dict'>
    {u'0': [[10.8, 36.0], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'1': [[10.8, 36.1], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'2': [[10.8, 36.2], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'3': [[10.8, 36.300000000000004], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'4': [[10.8, 36.4], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'5': [[10.8, 36.5], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'6': [[10.8, 36.6], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'7': [[10.8, 36.7], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'8': [[10.8, 36.800000000000004], {u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'9': [[10.8, 36.9], {u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}

0

MIKHIL NAGARALE May 03 '19 at 10:26


Here is a small helper class that converts JSON to a DataFrame and back. Hope you find it useful.

    # -*- coding: utf-8 -*-
    from pandas.io.json import json_normalize

    class DFConverter:

        # Converts the input JSON to a DataFrame
        def convertToDF(self, dfJSON):
            return(json_normalize(dfJSON))

        # Converts the input DataFrame to JSON
        def convertToJSON(self, df):
            resultJSON = df.to_json(orient='records')
            return(resultJSON)

0

Siva Jun 02 '19 at 16:09

Once you have the flattened DataFrame produced by the accepted answer, you can turn the columns into a MultiIndex ("fancy multi-line header") like this:

 df.columns = pd.MultiIndex.from_tuples([tuple(c.split('.')) for c in df.columns])
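A minimal sketch of that idea on a hand-built frame with the dotted column names the flattening step produces. One deviation from the one-liner above, which is my own assumption rather than part of the answer: columns without a dot (such as `elevation`) are padded with empty strings so that every tuple has the same length.

```python
import pandas as pd

# Hand-built stand-in for the flattened frame (values taken from the question).
df = pd.DataFrame({
    "elevation": [243.3462677001953, 244.1318664550781],
    "location.lat": [42.974049, 42.974298],
    "location.lng": [-81.205203, -81.19575500000001],
})

# Split each dotted name into its parts and pad shorter ones with ''.
parts = [c.split(".") for c in df.columns]
depth = max(len(p) for p in parts)
df.columns = pd.MultiIndex.from_tuples(
    [tuple(p + [""] * (depth - len(p))) for p in parts]
)

print(df)
# Selecting the top-level key returns the sub-frame with 'lat' and 'lng'.
print(df["location"])
```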

0

loganbvh Jun 16 '19 at 15:20


pbreach · Accepted Answer · 2014-01-21 18:17

I found a quick and easy solution to what I wanted using the json_normalize function included in pandas 0.13, the latest release at the time.

    from urllib2 import Request, urlopen
    import json
    from pandas.io.json import json_normalize

    path1 = '42.974049,-81.205203|42.974298,-81.195755'
    request = Request('http://maps.googleapis.com/maps/api/elevation/json?locations=' + path1 + '&sensor=false')
    response = urlopen(request)
    elevations = response.read()
    data = json.loads(elevations)
    json_normalize(data['results'])

This gives a nice flattened DataFrame with the JSON data that I got from the Google Maps API.
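For readers on current versions, here is a sketch of the same flattening that runs offline. It assumes pandas 1.0 or later, where the function is exposed as `pd.json_normalize`, and it uses a hard-coded copy of the response from the question rather than calling the API.

```python
import json
import pandas as pd

# Hard-coded sample mirroring the response shown in the question (no network call).
elevations = """
{"results": [
  {"elevation": 243.3462677001953,
   "location": {"lat": 42.974049, "lng": -81.205203},
   "resolution": 19.08790397644043},
  {"elevation": 244.1318664550781,
   "location": {"lat": 42.974298, "lng": -81.19575500000001},
   "resolution": 19.08790397644043}
 ],
 "status": "OK"}
"""

data = json.loads(elevations)

# Nested keys become dotted column names such as 'location.lat'.
df = pd.json_normalize(data["results"])
print(df)
```

The column order can differ between pandas versions, so it is safer to select columns by name than by position.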
