How to upload compressed files to BigQuery

I want to load the Wikipedia pageview dumps from http://dumps.wikimedia.org/other/pagecounts-raw/ into BigQuery. What's the fastest way?

1 answer

This is a classic demo I do to show how easy it is to load data into BigQuery.

To get one hour of Wikipedia pageviews, just download the file:

wget http://dumps.wikimedia.org/other/pagecounts-raw/2014/2014-06/pagecounts-20140602-180000.gz
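If you want more than one hour, the hourly dumps follow the same naming pattern (pagecounts-YYYYMMDD-HH0000.gz), so a short shell loop can fetch a whole day. A sketch, assuming every hourly file for that date is present on the server:

for h in $(seq -w 0 23); do
  wget http://dumps.wikimedia.org/other/pagecounts-raw/2014/2014-06/pagecounts-20140602-${h}0000.gz
done

The rest of this answer sticks to the single hourly file.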

Then load it into BigQuery (it takes about 5 minutes or less):

bq load -F" " --quote "" fh-bigquery:wikipedia.pagecounts_20140602_18 pagecounts-20140602-180000.gz language,title,requests:integer,content_size:integer

Note that this file is about 100 MB compressed (gz), and you don't need to decompress files of this size to load them into BigQuery. It contains about 8 million rows of hourly pageview counts.
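To double-check the load, bq show prints the table's schema and row count (using the table name from the load step above):

bq show fh-bigquery:wikipedia.pagecounts_20140602_18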

  • -F" ": ,
  • --quote "":
  • language,title,requests:integer,content_size:integer: . , , ( ).

(The bq command-line tool ships with the Google Cloud SDK.)
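You can't write into the fh-bigquery project, so to replicate this load in your own project, create a dataset first. A sketch, where mydataset is a placeholder name:

bq mk mydataset
bq load -F" " --quote "" mydataset.pagecounts_20140602_18 pagecounts-20140602-180000.gz language,title,requests:integer,content_size:integer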

Once loaded, you can browse the table at https://bigquery.cloud.google.com/table/fh-bigquery:wikipedia.pagecounts_20140602_18.
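For example, here is a quick query against that table, using the column names from the schema above (legacy SQL with bracketed table names, as BigQuery used at the time):

SELECT title, requests
FROM [fh-bigquery:wikipedia.pagecounts_20140602_18]
WHERE language = 'en'
ORDER BY requests DESC
LIMIT 10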

For a whole month of data, see https://bigquery.cloud.google.com/table/fh-bigquery:wikipedia.wikipedia_views_201308 (53 billion pageviews; try SELECT SUM(requests) FROM [fh-bigquery:wikipedia.wikipedia_views_201308]).


Source: https://habr.com/ru/post/1543927/

