JSON to pandas DataFrame

What I'm trying to do is extract elevation data from the Google Maps API along a path specified by latitude and longitude coordinates, as follows:

    from urllib2 import Request, urlopen
    import json

    path1 = '42.974049,-81.205203|42.974298,-81.195755'
    request = Request('http://maps.googleapis.com/maps/api/elevation/json?locations=' + path1 + '&sensor=false')
    response = urlopen(request)
    elevations = response.read()

This gives me data that looks like this:

    elevations.splitlines()

    ['{',
     ' "results" : [',
     ' {',
     ' "elevation" : 243.3462677001953,',
     ' "location" : {',
     ' "lat" : 42.974049,',
     ' "lng" : -81.205203',
     ' },',
     ' "resolution" : 19.08790397644043',
     ' },',
     ' {',
     ' "elevation" : 244.1318664550781,',
     ' "location" : {',
     ' "lat" : 42.974298,',
     ' "lng" : -81.19575500000001',
     ' },',
     ' "resolution" : 19.08790397644043',
     ' }',
     ' ],',
     ' "status" : "OK"',
     '}']

When I read this into a DataFrame, this is what I get:

    pd.read_json(elevations)

[image: output of pd.read_json(elevations)]

And here is what I want:

[image: desired DataFrame]

I'm not sure if this is possible, but basically what I'm looking for is a way to combine the elevation, latitude and longitude data in a single pandas DataFrame (it doesn't need to have fancy multi-line headers).

If someone can help or give advice on working with this data, that would be great! In case you can't tell, I haven't worked much with JSON data before...

EDIT:

This method isn't pretty, but it seems to work:

    data = json.loads(elevations)
    lat, lng, el = [], [], []
    for result in data['results']:
        lat.append(result[u'location'][u'lat'])
        lng.append(result[u'location'][u'lng'])
        el.append(result[u'elevation'])
    df = pd.DataFrame([lat, lng, el]).T

This ends up with a DataFrame having the columns latitude, longitude and elevation.

[image: resulting DataFrame]

+88
json python pandas google-maps
Jan 14 '14 at 1:32
9 answers

I found a quick and easy solution to what I wanted using the json_normalize function, included in the latest release of pandas (0.13).

    from urllib2 import Request, urlopen
    import json
    from pandas.io.json import json_normalize

    path1 = '42.974049,-81.205203|42.974298,-81.195755'
    request = Request('http://maps.googleapis.com/maps/api/elevation/json?locations=' + path1 + '&sensor=false')
    response = urlopen(request)
    elevations = response.read()
    data = json.loads(elevations)
    json_normalize(data['results'])

This gives a nicely flattened DataFrame built from the JSON data returned by the Google Maps API.
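To try the flattening without calling the API, here is a minimal sketch. It assumes pandas 1.0 or later, where the function is also exposed as `pd.json_normalize`, and it uses the response text from the question pasted inline as `sample` (a name introduced here for illustration).

```python
import json

import pandas as pd

# Response text copied from the question, so no API call is needed.
sample = """
{
  "results": [
    {"elevation": 243.3462677001953,
     "location": {"lat": 42.974049, "lng": -81.205203},
     "resolution": 19.08790397644043},
    {"elevation": 244.1318664550781,
     "location": {"lat": 42.974298, "lng": -81.19575500000001},
     "resolution": 19.08790397644043}
  ],
  "status": "OK"
}
"""

data = json.loads(sample)

# Keys of nested dictionaries are joined with "." to form column names,
# e.g. "location.lat" and "location.lng".
df = pd.json_normalize(data["results"])
print(df.columns.tolist())
```

Each entry of `results` becomes one row, and the nested `location` dictionary is spread over two columns.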

+122
Jan 21 '14 at 18:17

Check out this snippet.

    import json
    import pandas as pd

    # reading the JSON data using json.load()
    file = 'data.json'
    with open(file) as train_file:
        dict_train = json.load(train_file)

    # converting json dataset from dictionary to dataframe
    train = pd.DataFrame.from_dict(dict_train, orient='index')
    train.reset_index(level=0, inplace=True)
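As a rough illustration of what `orient='index'` does, the sketch below replaces the file with a made-up dictionary (the contents of `dict_train` here are hypothetical, standing in for whatever `data.json` holds). Each top-level key becomes a row label, and `reset_index` turns those labels into an ordinary column.

```python
import pandas as pd

# Hypothetical stand-in for the parsed contents of data.json.
dict_train = {
    "a": {"lat": 42.974049, "lng": -81.205203},
    "b": {"lat": 42.974298, "lng": -81.195755},
}

# Top-level keys become the index; inner keys become the columns.
train = pd.DataFrame.from_dict(dict_train, orient="index")

# Move the index into a regular column, which pandas names "index".
train = train.reset_index()
print(train)
```

This layout suits JSON that is a dictionary of records; for a list of nested records, such as the `results` list in the question, flattening is still needed.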

Hope this helps :)

+18
Jun 17 '17 at 17:04

You can first load your JSON data into a Python dictionary:

    data = json.loads(elevations)

Then modify the data in place:

    for result in data['results']:
        result[u'lat'] = result[u'location'][u'lat']
        result[u'lng'] = result[u'location'][u'lng']
        del result[u'location']

Rebuild the JSON string:

    elevations = json.dumps(data)

Finally:

    pd.read_json(elevations)

You can probably also avoid dumping the data back to a string; I assume pandas can create a DataFrame directly from a dictionary (I haven't used it in a long time :p).
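That assumption holds for a list of dictionaries: `pd.DataFrame` accepts one directly, so the round trip through `json.dumps` can be skipped. A minimal sketch, using a hard-coded `data` dictionary in place of the API response:

```python
import pandas as pd

# Hard-coded stand-in for json.loads(elevations).
data = {
    "results": [
        {"elevation": 243.3462677001953,
         "location": {"lat": 42.974049, "lng": -81.205203},
         "resolution": 19.08790397644043},
        {"elevation": 244.1318664550781,
         "location": {"lat": 42.974298, "lng": -81.19575500000001},
         "resolution": 19.08790397644043},
    ],
    "status": "OK",
}

# Lift lat/lng out of the nested "location" dictionary and drop it.
for result in data["results"]:
    location = result.pop("location")
    result["lat"] = location["lat"]
    result["lng"] = location["lng"]

# Each dictionary in the list becomes one row of the DataFrame.
df = pd.DataFrame(data["results"])
print(df)
```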

+10
Jan 14 '14 at 2:19

The problem is that you have several columns in the DataFrame that contain dicts with smaller dicts inside them. Useful JSON is often heavily nested. I have been writing small functions that pull the information I want out into a new column. That way I have it in the format that I want to use.

    for row in range(len(data)):
        # First I load the dict (one at a time)
        n = data.loc[row, 'dict_column']
        # Now I make a new column that pulls out the data that I want.
        data.loc[row, 'new_column'] = n.get('key')
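
The same extraction can usually be written without an explicit loop. The sketch below assumes a column whose cells are Python dictionaries; the names `dict_column`, `new_column` and `key` follow the placeholders above, and the values are invented for illustration.

```python
import pandas as pd

# Invented data: one column whose cells are dictionaries.
data = pd.DataFrame({
    "dict_column": [
        {"key": 1, "other": "x"},
        {"key": 2},
        {"other": "y"},  # "key" is missing in this row
    ]
})

# dict.get returns None when the key is absent, so no row raises KeyError.
data["new_column"] = data["dict_column"].apply(lambda d: d.get("key"))
print(data["new_column"])
```

Rows where the key is missing end up with a missing value in `new_column` instead of stopping the loop with an error.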
+4
Oct 20 '14 at 4:54

BillmanH's solution helped me, but it didn't work until I switched from:

    n = data.loc[row,'json_column']

to:

    n = data.iloc[[row]]['json_column']

Here's the rest of it; converting to a dictionary is helpful for working with JSON data.

    import json

    for row in range(len(data)):
        n = data.iloc[[row]]['json_column'].item()
        jsonDict = json.loads(n)
        if ('mykey' in jsonDict):
            display(jsonDict['mykey'])
+1
Dec 11 '18 at 22:14

Just an updated version of the accepted answer, since Python 3.x does not support urllib2:

    from requests import request
    from pandas.io.json import json_normalize

    path1 = '42.974049,-81.205203|42.974298,-81.195755'
    response = request(url='http://maps.googleapis.com/maps/api/elevation/json?locations=' + path1 + '&sensor=false', method='get')
    # response.json() already returns the parsed dictionary, so json.loads() is not needed
    data = response.json()
    json_normalize(data['results'])
+1
Feb 15 '19 at 16:40
    # Use the small trick to make the data json interpret-able
    # Since your data is not directly interpreted by json.loads()
    >>> import json
    >>> f=open("sampledata.txt","r+")
    >>> data = f.read()
    >>> for x in data.split("\n"):
    ...     strlist = "["+x+"]"
    ...     datalist=json.loads(strlist)
    ...     for y in datalist:
    ...         print(type(y))
    ...         print(y)
    ...
    ...
    <type 'dict'>
    {u'0': [[10.8, 36.0], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'1': [[10.8, 36.1], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'2': [[10.8, 36.2], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'3': [[10.8, 36.300000000000004], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'4': [[10.8, 36.4], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'5': [[10.8, 36.5], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'6': [[10.8, 36.6], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'7': [[10.8, 36.7], {u'10': 0, u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'8': [[10.8, 36.800000000000004], {u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
    <type 'dict'>
    {u'9': [[10.8, 36.9], {u'1': 0, u'0': 0, u'3': 0, u'2': 0, u'5': 0, u'4': 0, u'7': 0, u'6': 0, u'9': 0, u'8': 0}]}
0
May 03 '19 at 10:26

Here is a small helper class that converts JSON to a DataFrame and vice versa. Hope you find it useful.

    # -*- coding: utf-8 -*-
    from pandas.io.json import json_normalize

    class DFConverter:

        # Converts the input JSON to a DataFrame
        def convertToDF(self, dfJSON):
            return(json_normalize(dfJSON))

        # Converts the input DataFrame to JSON
        def convertToJSON(self, df):
            resultJSON = df.to_json(orient='records')
            return(resultJSON)
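
A short round trip shows what the two conversions do. This is only a sketch with invented records; it calls `pd.json_normalize` directly (available in pandas 1.0 and later) instead of going through the class.

```python
import json

import pandas as pd

# Invented records with one level of nesting.
records = [
    {"elevation": 243.5, "location": {"lat": 42.5, "lng": -81.25}},
    {"elevation": 244.0, "location": {"lat": 43.0, "lng": -81.5}},
]

# JSON-like records -> flat DataFrame.
df = pd.json_normalize(records)

# DataFrame -> JSON string, one object per row.
as_json = df.to_json(orient="records")

# Parse the string again to inspect what was written.
round_trip = json.loads(as_json)
print(round_trip[0])
```

Note that the round trip keeps the flattened column names such as `location.lat`; the original nesting is not restored.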
0
Jun 02 '19 at 16:09

Once you have the flattened DataFrame produced by the accepted answer, you can turn its columns into a MultiIndex ("fancy multi-line header") like this:

 df.columns = pd.MultiIndex.from_tuples([tuple(c.split('.')) for c in df.columns]) 
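
One thing to watch for: columns such as `elevation` contain no dot, so splitting them gives shorter tuples than `location.lat`. The sketch below is one possible way around that, padding the short tuples with empty strings so every column has the same number of levels. The `df` here is a small hand-built stand-in for the flattened frame.

```python
import pandas as pd

# Hand-built stand-in for the flat frame produced by json_normalize.
df = pd.DataFrame({
    "elevation": [243.3462677001953, 244.1318664550781],
    "resolution": [19.08790397644043, 19.08790397644043],
    "location.lat": [42.974049, 42.974298],
    "location.lng": [-81.205203, -81.19575500000001],
})

# Split each name on "." and pad with "" so all tuples have equal length.
parts = [c.split(".") for c in df.columns]
depth = max(len(p) for p in parts)
tuples = [tuple(p + [""] * (depth - len(p))) for p in parts]

df.columns = pd.MultiIndex.from_tuples(tuples)

# Selecting a top-level label returns the columns nested under it.
print(df["location"])
```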
0
Jun 16 '19 at 15:20


