Purpose. To read a CSV file that has been uploaded to a Google Cloud Storage bucket.
Environment. Jupyter notebook launched over SSH on the master node. Using Python in the Jupyter notebook, I am trying to access a simple CSV file stored in a Google Cloud Storage bucket.
Approaches -
First approach. Write a simple Python program.
I wrote the following program:
import csv
# Try to open the file directly by its gs:// path
f = open('gs://python_test_hm/train.csv', 'rb')
csv_f = csv.reader(f)
for row in csv_f:
    print row
Results - "No such file or directory" error message
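For reference, here is a minimal sketch of what appears to happen in the first approach, using only the standard library. It is written for Python 3 (the code above is Python 2), and the helper name `split_gs_uri` is hypothetical. The built-in `open()` only understands local filesystem paths, so the `gs://` string is looked up on the local disk, while the bucket and object names have to be extracted separately and handed to a storage client.

```python
from urllib.parse import urlsplit


def split_gs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket, object_name).

    Hypothetical helper for illustration; not part of any Google library.
    """
    parts = urlsplit(uri)
    if parts.scheme != "gs" or not parts.netloc:
        raise ValueError("not a gs:// URI: %r" % uri)
    return parts.netloc, parts.path.lstrip("/")


uri = "gs://python_test_hm/train.csv"

# open() passes the string to the operating system as a local path,
# so it fails unless a matching local file happens to exist.
try:
    with open(uri, "rb"):
        opened = True
except OSError as exc:  # FileNotFoundError is a subclass of OSError
    opened = False
    print("open() failed with", type(exc).__name__)

# The bucket and object names can still be recovered from the URI.
bucket_name, object_name = split_gs_uri(uri)
print(bucket_name, object_name)
```

The two names printed at the end are what a Cloud Storage client library would need in place of the raw `gs://` string.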
Second approach. I tried to access the train.csv file using the gcloud package. Sample code is shown below; it is not my actual code. In my version, the file in Google Cloud Storage was referenced as "gs: ///Filename.csv". Results - "No such file or directory" error message
# Load data from a CSV file into a BigQuery table
import csv
from gcloud import bigquery
from gcloud.bigquery import SchemaField

client = bigquery.Client()
dataset = client.dataset('dataset_name')
dataset.create()

SCHEMA = [
    SchemaField('full_name', 'STRING', mode='required'),
    SchemaField('age', 'INTEGER', mode='required'),
]
table = dataset.table('table_name', SCHEMA)
table.create()

with open('csv_file', 'rb') as readable:
    table.upload_from_file(
        readable, source_format='CSV', skip_leading_rows=1)
Third approach. Fetch the file over HTTP with urllib:
import csv
import urllib
url = 'https://storage.cloud.google.com/<bucket>/train.csv'
response = urllib.urlopen(url)
cr = csv.reader(response)
print cr
for row in cr:
    print row
Results. The above code does not raise an error, but instead of the CSV rows it prints the HTML of a Google page, as shown below. What I want to see is the CSV data in train.csv.
['<!DOCTYPE html>']
['<html lang="en">']
[' <head>']
[' <meta charset="utf-8">']
[' <meta content="width=300', ' initial-scale=1" name="viewport">']
[' <meta name="google-site-verification" content="LrdTUW9psUAMbh4Ia074- BPEVmcpBxF6Gwf0MSgQXZs">']
[' <title>Sign in - Google Accounts</title>']
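The title in the output above ("Sign in - Google Accounts") suggests the URL returned a login page rather than the file, and `csv.reader` simply split that HTML on commas. A minimal sketch of a guard that would catch this before parsing, written for Python 3 with only the standard library; the function names and the sample CSV content are hypothetical and used only for illustration.

```python
import csv
import io


def looks_like_html(text):
    """Return True if the text starts like an HTML document."""
    head = text.lstrip()[:200].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


def parse_csv_or_fail(text):
    """Parse CSV text, refusing input that is really an HTML page."""
    if looks_like_html(text):
        raise ValueError("response is an HTML page (possibly a sign-in page), not CSV")
    return list(csv.reader(io.StringIO(text)))


# First lines of the response shown above.
sample_html = '<!DOCTYPE html>\n<html lang="en">\n <head>\n'
# Hypothetical CSV content, only to exercise the happy path.
sample_csv = "full_name,age\nAda,36\n"

print(looks_like_html(sample_html))
print(parse_csv_or_fail(sample_csv))
```

A check like this does not fix the access problem, but it turns the silent "HTML parsed as CSV" result into an explicit error.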
Can someone shed light on what may be wrong here, and how can I achieve my goal? Your help is much appreciated.
Many thanks for your help!