Pandas: latitude-longitude distance between consecutive rows

I have the following in a Pandas DataFrame in Python 2.7:

Ser_Numb        LAT      LONG
       1  74.166061 30.512811
       2  72.249672 33.427724
       3  67.499828 37.937264
       4  84.253715 69.328767
       5  72.104828 33.823462
       6  63.989462 51.918173
       7  80.209112 33.530778
       8  68.954132 35.981256
       9  83.378214 40.619652
      10  68.778571  6.607066

I want to calculate the distance between consecutive rows of the DataFrame. The result should look something like this:

Ser_Numb          LAT        LONG   Distance
       1    74.166061   30.512811          0
       2    72.249672   33.427724          d_between_Ser_Numb2 and Ser_Numb1
       3    67.499828   37.937264          d_between_Ser_Numb3 and Ser_Numb2
       4    84.253715   69.328767          d_between_Ser_Numb4 and Ser_Numb3
       5    72.104828   33.823462          d_between_Ser_Numb5 and Ser_Numb4
       6    63.989462   51.918173          d_between_Ser_Numb6 and Ser_Numb5
       7    80.209112   33.530778   .
       8    68.954132   35.981256   .
       9    83.378214   40.619652   .
      10    68.778571    6.607066   .

Attempt

This post looks somewhat similar, but it calculates the distance between fixed points. I need the distance between consecutive points.

I tried to adapt this as follows:

df['LAT_rad'], df['LON_rad'] = np.radians(df['LAT']), np.radians(df['LONG'])
df['dLON'] = df['LON_rad'] - np.radians(df['LON_rad'].shift(1))
df['dLAT'] = df['LAT_rad'] - np.radians(df['LAT_rad'].shift(1))
df['distance'] = 6367 * 2 * np.arcsin(np.sqrt(np.sin(df['dLAT']/2)**2 + math.cos(df['LAT_rad'].astype(float).shift(-1)) * np.cos(df['LAT_rad']) * np.sin(df['dLON']/2)**2))

However, I get the following error:

Traceback (most recent call last):
  File "C:\Python27\test.py", line 115, in <module>
    df['distance'] = 6367 * 2 * np.arcsin(np.sqrt(np.sin(df['dLAT']/2)**2 + math.cos(df['LAT_rad'].astype(float).shift(-1)) * np.cos(df['LAT_rad']) * np.sin(df['dLON']/2)**2))
  File "C:\Python27\lib\site-packages\pandas\core\series.py", line 78, in wrapper
    "{0}".format(str(converter)))
TypeError: cannot convert the series to <type 'float'>
[Finished in 2.3s with exit code 1]

This error was fixed thanks to a comment from MaxU. With the correction, however, the output of the calculation still does not make sense: the distances come out at about 8000 km:

   Ser_Numb        LAT       LONG   LAT_rad   LON_rad      dLON      dLAT     distance
0         1  74.166061  30.512811  1.294442  0.532549       NaN       NaN          NaN
1         2  72.249672  33.427724  1.260995  0.583424  0.574129  1.238402  8010.487211
2         3  67.499828  37.937264  1.178094  0.662130  0.651947  1.156086  7415.364469
3         4  84.253715  69.328767  1.470505  1.210015  1.198459  1.449943  9357.184623
4         5  72.104828  33.823462  1.258467  0.590331  0.569212  1.232802  7992.087820
5         6  63.989462  51.918173  1.116827  0.906143  0.895840  1.094862  7169.812123
6         7  80.209112  33.530778  1.399913  0.585222  0.569407  1.380421  8851.558260
7         8  68.954132  35.981256  1.203477  0.627991  0.617777  1.179044  7559.609520
8         9  83.378214  40.619652  1.455224  0.708947  0.697986  1.434220  9194.371978
9        10  68.778571   6.607066  1.200413  0.115315  0.102942  1.175014          NaN

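According to:

  • this calculator: with Latitude1 = 74.166061, Longitude1 = 30.512811, Latitude2 = 72.249672, Longitude2 = 33.427724 I get 233 km
  • the haversine function: print haversine(30.512811, 74.166061, 33.427724, 72.249672) gives 232.55 km

The answer should be about 233 km, but my approach gives ~8000 km.

Question: is there a way to do this in Pandas?

Additional info:

To create the DF above, copy it to the clipboard. Then: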

import pandas as pd
df = pd.read_clipboard()
print df
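If the clipboard route is not convenient, the same frame can presumably be built by hand. A minimal sketch, using only the values from the table in the question (the explicit columns argument is just there to pin the column order):

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'Ser_Numb': list(range(1, 11)),
    'LAT': [74.166061, 72.249672, 67.499828, 84.253715, 72.104828,
            63.989462, 80.209112, 68.954132, 83.378214, 68.778571],
    'LONG': [30.512811, 33.427724, 37.937264, 69.328767, 33.823462,
             51.918173, 33.530778, 35.981256, 40.619652, 6.607066],
}, columns=['Ser_Numb', 'LAT', 'LONG'])

print(df)
```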

You can use this vectorized haversine solution (c) @ballsatballsdotballs ;-) :

import numpy as np

def haversine_np(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)

    All args must be of equal length.    

    """
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2

    c = 2 * np.arcsin(np.sqrt(a))  # central angle in radians
    km = 6367 * c                  # 6367 km is the Earth radius used here
    return km

df['dist'] = \
    haversine_np(df.LONG.shift(), df.LAT.shift(),
                 df.loc[1:, 'LONG'], df.loc[1:, 'LAT'])

Result:

In [566]: df
Out[566]:
   Ser_Numb        LAT       LONG         dist
0         1  74.166061  30.512811          NaN
1         2  72.249672  33.427724   232.549785
2         3  67.499828  37.937264   554.905446
3         4  84.253715  69.328767  1981.896491
4         5  72.104828  33.823462  1513.397997
5         6  63.989462  51.918173  1164.481327
6         7  80.209112  33.530778  1887.256899
7         8  68.954132  35.981256  1252.531365
8         9  83.378214  40.619652  1606.340727
9        10  68.778571   6.607066  1793.921854
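The desired output in the question has 0 rather than NaN in the first row. One way to get that, as a sketch only: pass the full columns as the second point and fill the single NaN afterwards. The function below repeats the same formula in compact form so the snippet runs on its own, and the three-row frame is just the first three rows of the sample data.

```python
import numpy as np
import pandas as pd

def haversine_np(lon1, lat1, lon2, lat2):
    """Great-circle distance in km (same formula as haversine_np above)."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 6367 * 2 * np.arcsin(np.sqrt(a))

# First three rows of the sample data from the question.
df = pd.DataFrame({'LAT': [74.166061, 72.249672, 67.499828],
                   'LONG': [30.512811, 33.427724, 37.937264]})

# shift() moves each value down one row, so row i is paired with row i-1.
# The first row has no predecessor; its NaN is replaced by 0.
df['Distance'] = haversine_np(df['LONG'].shift(), df['LAT'].shift(),
                              df['LONG'], df['LAT']).fillna(0)
print(df)
```

Whether 0 or NaN is the better marker for "no previous point" depends on what is done with the column later; NaN is skipped by sums and means, while 0 lowers a mean.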

UPDATE: this may help to understand the logic, it shows which values get paired:

In [573]: pd.concat([df['LAT'].shift(), df.loc[1:, 'LAT']], axis=1, ignore_index=True)
Out[573]:
           0          1
0        NaN        NaN
1  74.166061  72.249672
2  72.249672  67.499828
3  67.499828  84.253715
4  84.253715  72.104828
5  72.104828  63.989462
6  63.989462  80.209112
7  80.209112  68.954132
8  68.954132  83.378214
9  83.378214  68.778571
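The same pairing also hints at what seems to go wrong in the column-based attempt from the question. There, np.radians is applied a second time to the shifted columns, which are already in radians, and the cosine term uses shift(-1) (the next row) instead of the previous row. math.cos also accepts only scalars, which explains the original TypeError. A hedged sketch of that attempt with these three points changed, on the first three rows of the sample data:

```python
import numpy as np
import pandas as pd

# First three rows of the sample data from the question.
df = pd.DataFrame({'LAT': [74.166061, 72.249672, 67.499828],
                   'LONG': [30.512811, 33.427724, 37.937264]})

# Convert to radians exactly once.
lat = np.radians(df['LAT'])
lon = np.radians(df['LONG'])

# Differences to the previous row; shift(1) is already in radians.
dlat = lat - lat.shift(1)
dlon = lon - lon.shift(1)

# np.cos works element-wise on a Series, unlike math.cos.
a = (np.sin(dlat / 2) ** 2
     + np.cos(lat.shift(1)) * np.cos(lat) * np.sin(dlon / 2) ** 2)
df['distance'] = 6367 * 2 * np.arcsin(np.sqrt(a))
print(df)
```

With these changes the second row should agree with the 232.55 km that haversine_np gives for the same pair of points.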

Source: https://habr.com/ru/post/1659998/

