Google App Engine and Google Sheets exceed memory limit

I am writing a simple service that takes data from several sources, combines it, and uses the Google API client to send it to a Google Sheet. That part is easy and works well; the data is not very large.

The problem is that calling .spreadsheets() after creating the API service (i.e. build('sheets', 'v4', http=auth).spreadsheets()) causes a memory jump of about 30 megabytes (I did some profiling to isolate where the memory was allocated). When deployed on GAE, these spikes persist for long periods (sometimes hours), creep upward, and after several requests trigger the GAE error "Exceeded soft private memory limit".
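
For readers who want to reproduce this kind of measurement, here is a minimal sketch of wrapping a call and reporting how much memory it allocated. It is not the profiling the author ran. It assumes Python 3 (tracemalloc is not available on the Python 2.7 runtime), and measure_allocation is a hypothetical helper name. tracemalloc only sees allocations made through Python's allocator, so the numbers may differ from what GAE reports.

```python
import tracemalloc


def measure_allocation(func, *args, **kwargs):
    """Run func and return (result, net_bytes, peak_bytes).

    net_bytes is the memory still allocated after func returns;
    peak_bytes is the highest usage seen while it ran.
    Hypothetical helper, for illustration only.
    """
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        result = func(*args, **kwargs)
        after, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, after - before, peak - before


# Assumed usage with the API client (requires google-api-python-client):
# service, net, peak = measure_allocation(
#     lambda: build('sheets', 'v4', http=auth).spreadsheets())
```

A large net value that stays high across requests would suggest the object is still referenced somewhere, not just awaiting garbage collection.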

I use memcache for the discovery document and urlfetch to fetch the data, but those are the only other services I use.

I have tried collecting garbage manually, changing threadsafe in app.yaml, and even moving the point at which .spreadsheets() is called, but I cannot shake the problem. It is also possible that I have misunderstood something about the GAE architecture, but I know that the spike is caused by calling .spreadsheets(), and I do not store anything in local caches.

Is there a way to 1) reduce the size of the memory spike caused by calling .spreadsheets(), or 2) keep the spikes from persisting in memory (or, preferably, both)? Below is a very simplified gist to give an idea of the API calls and the request handler; I can provide more complete code if necessary. I know that similar questions have been asked before, but I have not been able to fix it.

https://gist.github.com/chill17/18f1caa897e6a20201232165aca05239

1 answer

I came across this when using the Sheets API on a small processor with 20 MB of usable RAM. The problem is that the Google API client pulls in the entire API description as a string and stores it as a resource object in memory.

If free memory is a concern, you should create your own http object and make the desired request manually. See my Spreadsheet class below for an example of how to create a new spreadsheet using this method.

    import json
    import os

    import httplib2
    from oauth2client import client, tools
    from oauth2client.file import Storage

    # Command-line flags for the OAuth2 flow, as in the Sheets quickstart
    try:
        import argparse
        flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
    except ImportError:
        flags = None

    SCOPES = 'https://www.googleapis.com/auth/spreadsheets'
    CLIENT_SECRET_FILE = 'client_secret.json'
    APPLICATION_NAME = 'Google Sheets API Python Quickstart'


    class Spreadsheet:

        def __init__(self, title):
            # Get credentials from locally stored JSON file.
            # If the file does not exist, create it.
            self.credentials = self.getCredentials()

            # HTTP service that will be used to push/pull data
            self.service = httplib2.Http()
            self.service = self.credentials.authorize(self.service)
            self.headers = {'content-type': 'application/json',
                            'accept-encoding': 'gzip, deflate',
                            'accept': 'application/json',
                            'user-agent': 'google-api-python-client/1.6.2 (gzip)'}
            print("CREDENTIALS: " + str(self.credentials))

            self.baseUrl = "https://sheets.googleapis.com/v4/spreadsheets"
            self.spreadsheetInfo = self.create(title)
            self.spreadsheetId = self.spreadsheetInfo['spreadsheetId']

        def getCredentials(self):
            """Gets valid user credentials from storage.

            If nothing has been stored, or if the stored credentials are
            invalid, the OAuth2 flow is completed to obtain the new credentials.

            Returns:
                Credentials, the obtained credential.
            """
            home_dir = os.path.expanduser('~')
            credential_dir = os.path.join(home_dir, '.credentials')
            if not os.path.exists(credential_dir):
                os.makedirs(credential_dir)
            credential_path = os.path.join(
                credential_dir, 'sheets.googleapis.com-python-quickstart.json')

            store = Storage(credential_path)
            credentials = store.get()
            if not credentials or credentials.invalid:
                flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
                flow.user_agent = APPLICATION_NAME
                if flags:
                    credentials = tools.run_flow(flow, store, flags)
                else:  # Needed only for compatibility with Python 2.6
                    credentials = tools.run(flow, store)
                print('Storing credentials to ' + credential_path)
            return credentials

        def create(self, title):
            # Only put the title in the request body;
            # we don't need anything else for now.
            requestBody = {
                "properties": {
                    "title": title
                },
            }
            print("BODY: " + str(requestBody))

            url = self.baseUrl
            # Serialize with json.dumps: str() of a dict is not valid JSON
            response, content = self.service.request(url,
                                                     method="POST",
                                                     headers=self.headers,
                                                     body=json.dumps(requestBody))
            print("\n\nRESPONSE\n" + str(response))
            print("\n\nCONTENT\n" + str(content))
            return json.loads(content)
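
The same manual approach should also cover the question's actual goal of sending rows to a sheet. The sketch below is not part of the original answer. It assumes Python 3 and an authorized httplib2-style object whose request() returns (response, content). It calls the Sheets v4 values append endpoint directly; append_rows and its parameter names are hypothetical.

```python
import json
from urllib.parse import quote

BASE_URL = "https://sheets.googleapis.com/v4/spreadsheets"


def append_rows(http, spreadsheet_id, cell_range, rows):
    """Append rows to a sheet without building the discovery-based client.

    http: any object with an httplib2-style request() method (assumed
          to be already authorized).
    rows: list of lists, one inner list per row.
    """
    url = "{}/{}/values/{}:append?valueInputOption=RAW".format(
        BASE_URL,
        quote(spreadsheet_id, safe=""),
        quote(cell_range, safe=""),
    )
    headers = {"content-type": "application/json",
               "accept": "application/json"}
    body = json.dumps({"values": rows})

    response, content = http.request(url, method="POST",
                                     headers=headers, body=body)
    # httplib2 exposes the HTTP status as a string under the "status" key
    if int(response["status"]) >= 400:
        raise RuntimeError("Sheets API error {}: {}".format(
            response["status"], content))
    return json.loads(content)
```

With the Spreadsheet class above this could be called as append_rows(sheet.service, sheet.spreadsheetId, "Sheet1!A1", [["a", 1], ["b", 2]]). On the Python 2.7 runtime, quote lives in urllib, not urllib.parse.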

Source: https://habr.com/ru/post/1257661/

