Import a data frame from one Jupyter Notebook file to another

I have 3 separate Jupyter notebook files that deal with separate data frames. I clean and manipulate the data for each df in its own notebook. Is there any way to reference the cleaned / final data in a separate notebook?

My concern is that if I work with all 3 dfs in one notebook and then do more with them afterwards (merge / join), it will be a mile long. I also do not want to rewrite a bunch of code to get the data ready for use in my new notebook.

1 answer

If you are using pandas data frames, then one approach is to use pandas.DataFrame.to_csv() and pandas.read_csv() to save and load the cleaned data between each step.

  • Notebook1 loads the input and saves result1.
  • Notebook2 loads result1 and saves result2.
  • Notebook3 loads result2 and saves result3.

If this is your data:

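import pandas as pd

# Sample data; named input_df to avoid shadowing the built-in input()
raw_data = {'id': [10, 20, 30],
            'name': ['foo', 'bar', 'baz']}
input_df = pd.DataFrame(raw_data, columns=['id', 'name'])
# Write it to disk so that notebook1 can read it as input.csv
input_df.to_csv('input.csv')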

Then in notebook1.ipynb process it like this:

import pandas as pd

# load
df = pd.read_csv('input.csv', index_col=0)
# manipulate frame here
# ...
# save
df.to_csv('result1.csv')

... and repeat this process for each step in the chain, for example in notebook2.ipynb:

import pandas as pd

# load
df = pd.read_csv('result1.csv', index_col=0)
# manipulate frame here
# ...
# save
df.to_csv('result2.csv')
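
The index_col=0 argument in the snippets above matters because to_csv() writes the frame's index as the first column by default. A minimal sketch of the round trip, using the sample data from this answer and a temporary directory (an assumption made only so the example leaves no files behind):

```python
import os
import tempfile

import pandas as pd

# Small frame standing in for a cleaned result (same sample data as above).
df = pd.DataFrame({'id': [10, 20, 30], 'name': ['foo', 'bar', 'baz']})

# Temporary directory: an assumption for this sketch, not part of the workflow.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'result1.csv')
    df.to_csv(path)                            # index is written as the first column
    restored = pd.read_csv(path, index_col=0)  # read that column back as the index
    with_extra = pd.read_csv(path)             # no index_col: index becomes a data column

print(list(restored.columns))
print(list(with_extra.columns))
```

Without index_col=0 the saved index comes back as an extra column named 'Unnamed: 0', and it would accumulate with each save/load step. Passing index=False to to_csv() is another way to avoid this if the index carries no information.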

In the end, your notebook collection will look like this:

  • input.csv
  • notebook1.ipynb
  • notebook2.ipynb
  • notebook3.ipynb
  • result1.csv
  • result2.csv
  • result3.csv
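
The question describes three independent data frames that are merged afterwards, rather than a linear chain. The same save/load idea should carry over: each cleaning notebook saves its own CSV, and a separate notebook loads all three and merges them. The sketch below is only an illustration; the file names clean1.csv, clean2.csv, clean3.csv, the extra columns, and the shared 'id' key are hypothetical, and the files are created in a temporary directory to keep the example self-contained.

```python
import os
import tempfile

import pandas as pd

# Hypothetical cleaned outputs, one per cleaning notebook, sharing an 'id' key.
cleaned = {
    'clean1.csv': pd.DataFrame({'id': [10, 20, 30], 'name': ['foo', 'bar', 'baz']}),
    'clean2.csv': pd.DataFrame({'id': [10, 20, 30], 'score': [1.5, 2.5, 3.5]}),
    'clean3.csv': pd.DataFrame({'id': [10, 20], 'group': ['a', 'b']}),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    # Stand-in for what the three cleaning notebooks would have saved.
    for filename, frame in cleaned.items():
        frame.to_csv(os.path.join(tmp_dir, filename))

    # What the combining notebook would do: load each cleaned file.
    df1 = pd.read_csv(os.path.join(tmp_dir, 'clean1.csv'), index_col=0)
    df2 = pd.read_csv(os.path.join(tmp_dir, 'clean2.csv'), index_col=0)
    df3 = pd.read_csv(os.path.join(tmp_dir, 'clean3.csv'), index_col=0)

# Merge on the shared key; how='left' keeps ids that are missing from df3.
merged = df1.merge(df2, on='id', how='inner').merge(df3, on='id', how='left')
print(merged)
```

The combining notebook then contains only the loading and merging code, so it stays short, and none of the cleaning code has to be repeated in it.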

Documentation: see pandas.DataFrame.to_csv and pandas.read_csv in the pandas API reference.


Source: https://habr.com/ru/post/1687258/

