How can I train from BigQuery instead of csv files in Cloud ML?

Question


My training data is in BigQuery. How can I use it to train a model in Cloud ML?

+4

google-bigquery google-cloud-ml

rhaertel80 Sep 29 '16 at 16:22


1 answer

rhaertel80 · Answer 1 · 2016-09-29T16:22:23+0000

Change the preprocessing pipeline to read from BigQuerySource instead of CSV files (use the same Features class as in the CSV samples). Here is an example:

import apache_beam as beam

# `pipeline`, `ml`, and CsvFeatures are defined as in the CSV samples.
feature_set = CsvFeatures()
train_query = "SELECT …"
valid_query = "SELECT …"
# Read each split from a BigQuery query instead of from CSV files.
train = pipeline | 'read_train' >> beam.Read(beam.io.BigQuerySource(query=train_query))
eval = pipeline | 'read_valid' >> beam.Read(beam.io.BigQuerySource(query=valid_query))
(metadata, train_features, eval_features) = ((train, eval) |
    ml.Preprocess('Preprocess', feature_set))
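
One thing to watch for: in Beam's Python SDK, BigQuerySource yields each row as a dict keyed by column name, while a text source yields raw lines. If some later step in your pipeline still expects CSV-formatted lines, a small conversion step might help. The sketch below is only an illustration under that assumption; `COLUMNS` and `row_to_csv_line` are hypothetical names, not part of the Cloud ML SDK, and the column list must match what your query selects.

```python
import csv
import io

# Hypothetical column order; replace with the columns your query selects.
COLUMNS = ["label", "feature_a", "feature_b"]


def row_to_csv_line(row, columns=COLUMNS):
    """Serialize one BigQuery row (a dict) as a single CSV line.

    Missing or NULL columns become empty fields. Values containing
    commas or quotes are quoted by the csv module.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="")
    writer.writerow(
        ["" if row.get(name) is None else row[name] for name in columns]
    )
    return buffer.getvalue()


# Example row shaped like the dicts a BigQuery read produces.
example_row = {"label": 1, "feature_a": 0.5, "feature_b": "red, dark"}
line = row_to_csv_line(example_row)
```

In a pipeline this could be applied with something like `train | beam.Map(row_to_csv_line)`, though whether you need it at all depends on what the downstream transform accepts.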
