Read a CSV into a DataFrame in Google Colab

I am trying to read a CSV file that I saved locally on my machine. (For reference only, this is the Titanic data from Kaggle.)

From this question and answer, I learned that you can import data using this code, which works well for me.

from google.colab import files
uploaded = files.upload()

Where I am lost is how to convert it to a dataframe. The example page from Google linked in the answer above does not cover that.

I am trying to convert the uploaded dictionary to a dataframe using from_dict, but cannot make it work. There is some discussion about converting a dict to a dataframe here, but the solutions do not seem applicable to my case.

So, in summary, my question is:

 How do I convert a CSV file stored locally on my machine to a pandas dataframe in Google Colaboratory?
3 answers

Pandas read_csv should do the trick. The values in uploaded are raw bytes, so you will want to decode them and wrap the result in io.StringIO, since read_csv expects a path or a file-like object.

Here is a complete example: https://colab.research.google.com/notebook#fileId=1JmwtF5OmSghC-y3-BkvxLan0zYXqCJJf

The key piece is

 import pandas as pd
 import io
 df = pd.read_csv(io.StringIO(uploaded['train.csv'].decode('utf-8')))
 df
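As a variation on the snippet above, read_csv also accepts a bytes buffer, so the decode step can be skipped by wrapping the bytes in io.BytesIO. The sketch below runs outside Colab by using a stand-in `uploaded` dict. The sample rows and the helper name `uploaded_to_dataframes` are made up for illustration; in a real notebook `uploaded` would come from files.upload().

```python
import io

import pandas as pd

# Stand-in for the dict returned by files.upload() in Colab:
# keys are file names, values are the raw file contents as bytes.
# These rows are invented sample data, not the real Titanic file.
uploaded = {"train.csv": b"PassengerId,Survived,Pclass\n1,0,3\n2,1,1\n"}


def uploaded_to_dataframes(uploaded):
    """Parse every uploaded CSV into a DataFrame, keyed by file name."""
    return {
        name: pd.read_csv(io.BytesIO(content))
        for name, content in uploaded.items()
    }


frames = uploaded_to_dataframes(uploaded)
df = frames["train.csv"]
print(df.head())
```

This also hints at why from_dict is not the right tool here: the dict maps file names to file contents, not column names to values, so each value has to be parsed as a CSV first.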

Alternatively, you can use GitHub to import files. You can take this as an example: https://drive.google.com/file/d/1D6ViUx8_ledfBqcxHCrFPcqBvNZitwCs/view?usp=sharing

Also, Google does not keep the files for long, so you may have to re-run the GitHub import snippets each time you start a new session.
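One way the GitHub route might look, as a rough sketch: read_csv can read directly from a URL, and public GitHub files are served as raw text under raw.githubusercontent.com. The user, repository, branch and path below are placeholders, and both helper names are hypothetical; the linked example may do this differently.

```python
import pandas as pd


def github_raw_url(user, repo, branch, path):
    """Build the raw-content URL for a file in a public GitHub repository."""
    return f"https://raw.githubusercontent.com/{user}/{repo}/{branch}/{path}"


def load_csv_from_github(user, repo, branch, path):
    """Download and parse a CSV from GitHub (needs network access)."""
    return pd.read_csv(github_raw_url(user, repo, branch, path))


# Placeholder coordinates; replace them with a real public repository.
url = github_raw_url("some-user", "some-repo", "main", "data/train.csv")
print(url)
# df = load_csv_from_github("some-user", "some-repo", "main", "data/train.csv")
```

Because the file is fetched from the URL on every run, nothing has to be uploaded again when the Colab session is reset.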


Google Colab: uploading a CSV from your PC

I had the same problem with an Excel file (*.xlsx). I solved it as follows, and I think you can do the same with CSV files. If you have a file on your PC called file.xlsx, then:

1- Upload it from your hard drive using this simple code:

 from google.colab import files
 uploaded = files.upload()

Click the file-selection button and choose your file. It is uploaded to the Colab runtime, not to Google Drive.

2- Then:

 import io
 # The key must match the uploaded file name exactly, including case
 data = io.BytesIO(uploaded['file.xlsx'])

3- Finally, read your file:

 import pandas as pd
 # sheet_name, header and skiprows are specific to this workbook
 df = pd.read_excel(data, sheet_name='1min', header=0, skiprows=2)
 df.head()

4- Please change the parameter values to read your own file. I think that this could be generalized to read other types of files!
Enjoy it!
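A minimal sketch of that generalization, assuming the reader is chosen from the file extension. The helper name `read_uploaded` and the sample bytes are hypothetical, and the stand-in `uploaded` dict replaces the one files.upload() would return in Colab.

```python
import io
import os

import pandas as pd


def read_uploaded(uploaded, name, **kwargs):
    """Read one uploaded file into a DataFrame, picking the reader by extension.

    Extra keyword arguments (sheet_name, skiprows, ...) go to the pandas reader.
    """
    ext = os.path.splitext(name)[1].lower()
    buffer = io.BytesIO(uploaded[name])
    if ext == ".csv":
        return pd.read_csv(buffer, **kwargs)
    if ext in (".xlsx", ".xls"):
        return pd.read_excel(buffer, **kwargs)
    raise ValueError(f"Unsupported file type: {ext!r}")


# Stand-in for the dict returned by files.upload(); the data is invented.
uploaded = {"file.csv": b"a,b\n1,2\n3,4\n"}
df = read_uploaded(uploaded, "file.csv")
print(df.head())
```

For the Excel case above, the call would be something like read_uploaded(uploaded, 'file.xlsx', sheet_name='1min', header=0, skiprows=2).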


Source: https://habr.com/ru/post/1274813/

