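Error streaming from Pub/Sub into BigQuery (Python)
I'm having trouble creating a DataflowRunner job that connects a Pub/Sub source to a BigQuery sink by plugging these two classes:
apache_beam.io.gcp.pubsub.PubSubSource
apache_beam.io.gcp.bigquery.BigQuerySink
into lines 59 and 74, respectively, of the beam/sdks/python/apache_beam/examples/streaming_wordcount.py example on GitHub (https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/streaming_wordcount.py). After removing lines 61-70 and specifying the correct Pub/Sub and BigQuery arguments, the script runs without errors, but no pipeline is built.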
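Side note: the script says that streaming pipeline support is not available in Python. However, the Beam docs say that apache_beam.io.gcp.pubsub.PubSubSource is only available for streaming (first sentence under the "apache_beam.io.gcp.pubsub module" heading: https://beam.apache.org/documentation/sdks/pydoc/2.0.0/apache_beam.io.gcp.html#module-apache_beam.io.gcp.pubsub).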
Can't wait until it does! :) That will be a great feature. –
@FilipeHoffa, is it possible to batch into BigQuery with Python? – Evan
@Evan, you can certainly use Python to batch-process messages from Pub/Sub into BigQuery; see the example provided by Google here - https://github.com/GoogleCloudPlatform/kubernetes-bigquery-python/blob/master/pubsub/pubsub-pipe-image/pubsub-to-bigquery.py – andre622
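For illustration, here is a minimal sketch of the batching step only. It is not taken from the linked example and it uses only the standard library. It assumes each Pub/Sub message payload is a UTF-8 JSON object that maps directly onto one BigQuery row. The helper name batch_rows and the sample fields (word, count) are hypothetical, and the actual Pub/Sub pull and BigQuery insert calls are deliberately left out.

```python
import json
from typing import Dict, Iterable, Iterator, List


def batch_rows(payloads: Iterable[bytes], batch_size: int) -> Iterator[List[Dict]]:
    """Decode JSON payloads and group them into lists of at most batch_size rows.

    Hypothetical helper: each returned list is meant to be handed to whatever
    BigQuery insert call you use, one request per batch.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[Dict] = []
    for payload in payloads:
        # Assumption: the message body is a UTF-8 encoded JSON object.
        batch.append(json.loads(payload.decode("utf-8")))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


if __name__ == "__main__":
    # Made-up sample payloads standing in for pulled Pub/Sub messages.
    sample = [
        json.dumps({"word": w, "count": 1}).encode("utf-8")
        for w in ("a", "b", "c")
    ]
    for rows in batch_rows(sample, batch_size=2):
        print(rows)
```

Grouping rows before inserting keeps the number of insert requests small. A real job would also need to acknowledge messages only after their batch has been written.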