I am trying to write a simple Dataflow job that uses the query parameter of the BigQuerySource class, but a simple query from Dataflow through BigQuerySource raises an exception.
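In the simplest terms: I can read a BigQuery table with the BigQuerySource class and then filter the rows inside the pipeline. What I cannot do is use BigQuerySource to query/filter the BigQuery table directly.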
Here is some code. With the filter applied inside my Dataflow pipeline, this works fine:
import argparse
import apache_beam as beam
parser = argparse.ArgumentParser()
parser.add_argument('--output', required=True)
known_args, pipeline_args = parser.parse_known_args(None)
p = beam.Pipeline(argv=pipeline_args)
source = 'bigquery-public-data:samples.shakespeare'
rows = p | 'read'>>beam.io.Read(beam.io.BigQuerySource(source))
f = rows | 'filter' >> beam.Map(lambda row: 1 if (row['word_count'] > 1) else 0)
f | 'write' >> beam.io.WriteToText(known_args.output)
p.run()
Replacing the middle part of that snippet with a one-line query gives an error:
f = p | 'read' >> beam.io.Read(beam.io.BigQuerySource('SELECT 1 FROM ' \
+ 'bigquery-public-data:samples.shakespeare where word_count > 1'))
The returned error looks like a syntax error:
(a29eabc394a38f62): Workflow failed. Causes:
(a29eabc394a38cfa): S04:read+write/Write/WriteImpl/WriteBundles+write/Write/WriteImpl/Pair+write/Write/WriteImpl/WindowInto(WindowIntoFn)+write/Write/WriteImpl/GroupByKey/Reify+write/Write/WriteImpl/GroupByKey/Write failed.,
(fb6d0643d7f13886): BigQuery execution failed.,
(fb6d0643d7f13b03): Error: Message: Encountered " "-" "- "" at line 1, column 59. Was expecting: <EOF>
Do I need to escape the - character in the BigQuery project name?
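Does "the whole table reference" mean the project, dataset, and table? When I try that, it tells me the table is invalid, i.e. 'Message: Invalid table name: \'bigquery-public-data:samples.shakespeare \''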
Also check https://cloud.google.com/bigquery/docs/reference/standard-sql/migrating-from-legacy-sql#project-qualified_table_names - in standard SQL you should replace ':' with '.'.
Perfect, replacing ':' with '.' solved the problem for me. Thanks!
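For anyone landing here later, here is a minimal sketch of the rewrite discussed in the comments: turning the legacy 'project:dataset.table' reference into the dotted form before building the query string. The helper names to_standard_sql_table and build_query are made up for illustration and are not part of Beam or BigQuery. Wrapping the name in backticks is my own addition (the project name contains hyphens), and the selected columns are just an example. It uses only the standard library, so it can be tried without a Dataflow job.

```python
def to_standard_sql_table(legacy_ref):
    """Turn 'project:dataset.table' into a backtick-quoted 'project.dataset.table'.

    Hypothetical helper, for illustration only.
    """
    project, sep, rest = legacy_ref.partition(':')
    if not sep:
        # No 'project:' prefix, so the reference is already dotted.
        return '`{}`'.format(legacy_ref)
    return '`{}.{}`'.format(project, rest)


def build_query(legacy_ref, min_word_count=1):
    """Build the filter query from the question against the converted table name."""
    table = to_standard_sql_table(legacy_ref)
    return 'SELECT word, word_count FROM {} WHERE word_count > {}'.format(
        table, min_word_count)


query = build_query('bigquery-public-data:samples.shakespeare')
print(query)

# Assumption: in a Beam SDK version whose BigQuerySource accepts the
# use_standard_sql flag, the string would be passed roughly like this:
#   beam.io.Read(beam.io.BigQuerySource(query=query, use_standard_sql=True))
```

Whether use_standard_sql is available depends on the SDK version you are running, so treat the commented-out call as something to check against your own install rather than a guaranteed signature.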