Python GAE - How to check if a file exists in Google Cloud Storage

I have a script where I want to check whether a file exists in a bucket and, if it does not, create it.

I tried using os.path.exists(file_path) with file_path = "/gs/testbucket", but I get a file not found error.

I know that I can use the files.listdir() API function to list all the files located on the path, and then check if the file I want is one of them. But I was wondering if there is another way to check if the file exists.

10 answers

I don't think there is a function to directly check whether a file exists given its path.
I created a function that uses the files.listdir() API function to list all the files in the bucket and match them against the desired file name. It returns True if found and False if not.
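
A minimal sketch of that approach, assuming the now-deprecated App Engine Files API, where files.listdir('/gs/bucket_name') lists the objects in a bucket; the function and variable names here are illustrative, and the comparison allows for listdir returning either bare names or full /gs/ paths:

    # Illustrative sketch only; assumes the deprecated google.appengine.api.files API.
    from google.appengine.api import files

    def file_exists(bucket_path, filename):
        # List everything in the bucket and look for the desired object,
        # matching either the bare name or the full /gs/bucket/object path.
        target = bucket_path.rstrip('/') + '/' + filename
        return any(entry in (filename, target) for entry in files.listdir(bucket_path))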


This post is old, but you can now check whether a file exists in GCS using the Blob class. Since it took me a while to find the answer, I'm adding it here for others looking for a solution.

    from google.cloud import storage

    name = 'file_i_want_to_check.txt'
    storage_client = storage.Client()
    bucket_name = 'my_bucket_name'
    bucket = storage_client.bucket(bucket_name)
    stats = storage.Blob(bucket=bucket, name=name).exists(storage_client)
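
Since exists() returns a boolean, it can also cover the "create it if it is missing" part of the question. A possible follow-up, assuming an empty placeholder object is acceptable and reusing the variables from the snippet above:

    # Follow-up sketch (not from the original answer): create the object if it is missing.
    if not stats:
        bucket.blob(name).upload_from_string('')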

Documentation here

Hope this helps!


You can use the stat function to get file information. In practice this makes a HEAD request to Google Cloud Storage instead of a GET, which is slightly less resource-intensive.

    import cloudstorage as gcs
    from cloudstorage import errors as gcs_errors

    def is_file_available(filepath):
        # Return the stat record if the file exists (a stat record is truthy),
        # otherwise return False.
        try:
            return gcs.stat(filepath)
        except gcs_errors.NotFoundError:
            return False
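
Example usage, with a placeholder /bucket/object path:

    # Hypothetical path; replace with a real /bucket/object path.
    if is_file_available('/testbucket/testme.txt'):
        print('file exists')
    else:
        print('file not found')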

You can use a custom function (shown below) to check if a file exists or not.

    from google.appengine.api import files

    def is_file_available(filepath):
        # Check whether the file can be opened for reading.
        fileavability = 'yes'
        try:
            fp = files.open(filepath, 'r')
            fp.close()
        except Exception:
            fileavability = 'no'
        return fileavability
Use the above function as follows:

    filepath = '/gs/test/testme.txt'
    fileavability = is_file_available(filepath)

Note: the above function can also return 'no' if the application trying to read the file has not been granted read permission.


It is as simple as using the existing exists() method on a Blob object:

    from google.cloud import storage

    def blob_exists(projectname, credentials, bucket_name, filename):
        client = storage.Client(projectname, credentials=credentials)
        bucket = client.get_bucket(bucket_name)
        blob = bucket.blob(filename)
        return blob.exists()
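
One possible way to call it, assuming credentials come from a service-account key file; the project, key file, bucket, and object names below are placeholders:

    # Hypothetical usage; all names below are placeholders.
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file('key.json')
    print(blob_exists('my-project', credentials, 'my_bucket_name', 'file_i_want_to_check.txt'))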

A small variation of Amit's answer from a few years ago, updated for the cloudstorage API.

    import cloudstorage as gcs

    def GCSExists(gcs_file):
        '''True if the file exists; pass the complete /bucket/file path.'''
        try:
            f = gcs.open(gcs_file, 'r')
            f.close()
            status = True
        except:
            status = False
        return status

Yes, it is possible, using Blob.exists().

And this is my code:

    def get_by_signed_url(self, object_name, bucket_name=GCLOUD_BUCKET_NAME):
        bucket = self.client_storage.bucket(bucket_name)
        blob = bucket.blob(object_name)

        # This checks whether the file exists or not.
        stats = blob.exists(self.client_storage)
        if not stats:
            raise NotFound(messages.ERROR_NOT_FOUND)

        url_lifetime = self.expiration  # Seconds in an hour
        serving_url = blob.generate_signed_url(url_lifetime)
        return self.session.get(serving_url)

The file I'm looking for in Google Cloud Storage: init.sh

Full path: gs://cw-data/spark_app_code/init.sh

    >>> from google.cloud import storage
    >>> def is_exist(bucket_name, object):
    ...     client = storage.Client()
    ...     bucket = client.bucket(bucket_name)
    ...     blob = bucket.get_blob(object)
    ...     try:
    ...         return blob.exists(client)
    ...     except:
    ...         return False
    ...
    >>> is_exist('cw-data', 'spark_app_code')
    False
    >>> is_exist('cw-data', 'spark_app_code/')
    True
    >>> is_exist('cw-data', 'init.sh')
    False
    >>> is_exist('cw-data', 'spark_app_code/init.sh')
    True
    >>> is_exist('cw-data', '/init.sh')
    False

Here, files are not stored as in a local file system but as keys. So when looking for a file in Google Cloud Storage, use the full object path, not just the file name.


If you are working with GCS files in a service like ML Engine, you can use TensorFlow to check whether the file exists:

    import tensorflow as tf

    file_exists = tf.gfile.Exists('gs://your-bucket-name/your-file.txt')
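
If you are on TensorFlow 2.x, the same check lives under tf.io.gfile rather than tf.gfile:

    import tensorflow as tf

    # TensorFlow 2.x exposes the gfile API under tf.io.gfile.
    file_exists = tf.io.gfile.exists('gs://your-bucket-name/your-file.txt')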

If you are looking for a solution in NodeJS, then here it is:

    var storage = require('@google-cloud/storage')();
    var myBucket = storage.bucket('my-bucket');
    var file = myBucket.file('my-file');

    file.exists(function(err, exists) {});

    // If the callback is omitted, this function returns a Promise instead.
    file.exists().then(function(data) {
      var exists = data[0];
    });

If you need more help, you can refer to this document: https://cloud.google.com/nodejs/docs/reference/storage/1.5.x/File#exists

