BigQuery: creating a table (native or external) linked to Google Cloud Storage

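I have some files (CSV and JSON) uploaded to Google Cloud Storage.

I can create BigQuery tables, native or external, linked to these files in Google Cloud Storage.

While creating a BigQuery table, I can check "Schema: Automatically detect".

"Schema: Automatically detect" works for newline-delimited JSON files. But for CSV files whose first row contains the column names, BigQuery's schema auto-detection does not pick them up: it treats the first row as data, and the resulting schema has columns named string_field_1, string_field_2, and so on.

Is there anything I need to do to my CSV file to make BigQuery's schema auto-detection work?

The CSV file I have is a "Microsoft Excel Comma Separated Values File".

Update:

If the first column is empty, BigQuery autodetect does not detect the header: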

custom id,asset id,related isrc,iswc,title,hfa song code,writers,match policy,publisher name,sync ownership share,sync ownership territory,sync ownership restriction 
,A123,,,Medley of very old Viennese songs,,,,,,, 
,A234,,,Suite de pièces No. 3 en Ré Mineur HWV 428 - Allemande,,,,,,,

But if the first column is not empty, it works:

custom id,asset id,related isrc,iswc,title,hfa song code,writers,match policy,publisher name,sync ownership share,sync ownership territory,sync ownership restriction 
1,A123,,,Medley of very old Viennese songs,,,,,,, 
2,A234,,,Suite de pièces No. 3 en Ré Mineur HWV 428 - Allemande,,,,,,,

Shouldn't this be a feature improvement request for BigQuery?


2017-03-22 searain
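
The two samples in the question differ only in the first column, and that is also the only place where a non-string value appears. As far as I understand the BigQuery documentation, header detection works by comparing the first row with the remaining rows: if the first row holds only strings and later rows hold other types, the first row is taken as a header. The sketch below is a toy approximation of that idea, not BigQuery's actual implementation; the function names and the "parses as a number" test are assumptions made purely for illustration.

```python
import csv
import io

HEADER = ("custom id,asset id,related isrc,iswc,title,hfa song code,writers,"
          "match policy,publisher name,sync ownership share,"
          "sync ownership territory,sync ownership restriction")

# Sample from the question with an empty first column.
EMPTY_FIRST = "\n".join([
    HEADER,
    ",A123,,,Medley of very old Viennese songs,,,,,,,",
    ",A234,,,Suite de pièces No. 3 en Ré Mineur HWV 428 - Allemande,,,,,,,",
])

# Same sample with integer values in the first column.
FILLED_FIRST = "\n".join([
    HEADER,
    "1,A123,,,Medley of very old Viennese songs,,,,,,,",
    "2,A234,,,Suite de pièces No. 3 en Ré Mineur HWV 428 - Allemande,,,,,,,",
])


def looks_like_string(value: str) -> bool:
    """Return True if the cell cannot be parsed as a number."""
    try:
        float(value)
        return False
    except ValueError:
        return True


def header_is_distinguishable(csv_text: str) -> bool:
    """Toy heuristic (an assumption, not BigQuery's code): the first row
    counts as a header only if all of its non-empty cells are strings and
    at least one later row contains a non-empty, non-string cell."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    first, rest = rows[0], rows[1:]
    first_all_strings = all(looks_like_string(c) for c in first if c.strip())
    rest_has_non_string = any(
        not looks_like_string(c) for row in rest for c in row if c.strip()
    )
    return first_all_strings and rest_has_non_string


if __name__ == "__main__":
    print("empty first column: ", header_is_distinguishable(EMPTY_FIRST))
    print("filled first column:", header_is_distinguishable(FILLED_FIRST))
```

Under this simplified model, the file with the empty first column contains nothing but strings and empty cells, so the header row looks like any other data row. Once the first column carries integers, the header stands out.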

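CSV autodetect does detect the header line in CSV files, so there must be something special about your data. It would be great if you could provide a real snippet of the data and the actual commands you used. Here is my example, which demonstrates how it works: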

~$ cat > /tmp/people.csv 
Id,Name,DOB 
1,Bill Gates,1955-10-28 
2,Larry Page,1973-03-26 
3,Mark Zuckerberg,1984-05-14 
~$ bq load --source_format=CSV --autodetect dataset.people /tmp/people.csv 
Upload complete. 
Waiting on bqjob_r33dc9ca5653c4312_0000015af95f6209_1 ... (2s) Current status: DONE 
~$ bq show dataset.people 
Table project:dataset.people 

   Last modified        Schema        Total Rows   Total Bytes   Expiration   Labels
 ----------------- ----------------- ------------ ------------- ------------ --------
  22 Mar 21:14:27   |- Id: integer    3            89
                    |- Name: string
                    |- DOB: date


2017-03-23 04:17:20

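I tried some other CSV files. They work. So this has to do with the CSV file itself. – searain

If you could share this CSV file, or just a snippet of it that reproduces the problem, that would be really helpful. –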

custom id,asset id,related isrc,iswc,title,hfa song code,writers,match policy,publisher name,sync ownership share,sync ownership territory,sync ownership restriction 
,A123,,,Medley of very old Viennese songs,,,,,,, 
,A234,,,Suite de pièces No. 3 en Ré Mineur HWV 428 - Allemande,,,,,,,

If the first column is empty, Google BigQuery cannot detect the schema.

custom id,asset id,related isrc,iswc,title,hfa song code,writers,match policy,publisher name,sync ownership share,sync ownership territory,sync ownership restriction 
1,A123,,,Medley of very old Viennese songs,,,,,,, 
2,A234,,,Suite de pièces No. 3 en Ré Mineur HWV 428 - Allemande,,,,,,,

If I add values to the first column, Google BigQuery can detect the schema.

Should this be a feature improvement request for BigQuery?


2017-03-23 18:21:36 searain

Yes, please file a BigQuery feature request in the issue tracker at https://issuetracker.google.com/savedsearches/559654 –
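
Until autodetect handles such files, one possible workaround is to skip detection entirely: supply an explicit schema and tell BigQuery to skip the header row. The sketch below derives an all-STRING schema from the CSV header using only the Python standard library. The function name, the choice of STRING for every column, and the name-cleaning rule (anything other than letters, digits and underscores becomes an underscore) are assumptions for illustration, not something the thread prescribes.

```python
import csv
import io
import json
import re

# Header and first data row from the question.
SAMPLE = (
    "custom id,asset id,related isrc,iswc,title,hfa song code,writers,"
    "match policy,publisher name,sync ownership share,"
    "sync ownership territory,sync ownership restriction \n"
    ",A123,,,Medley of very old Viennese songs,,,,,,,\n"
)


def schema_from_header(csv_text: str) -> list:
    """Build an all-STRING, all-NULLABLE schema from the first CSV row."""
    header = next(csv.reader(io.StringIO(csv_text)))
    schema = []
    for name in header:
        # Replace every character outside [A-Za-z0-9_] with an underscore.
        clean = re.sub(r"[^A-Za-z0-9_]", "_", name.strip())
        # Prefix an underscore if the name is empty or starts with a digit.
        if not clean or clean[0].isdigit():
            clean = "_" + clean
        schema.append({"name": clean, "type": "STRING", "mode": "NULLABLE"})
    return schema


if __name__ == "__main__":
    # Print the schema as JSON; redirect this to a file such as schema.json.
    print(json.dumps(schema_from_header(SAMPLE), indent=2))
```

With the output saved to a file (here a hypothetical schema.json), a load along the lines of `bq load --source_format=CSV --skip_leading_rows=1 dataset.assets /tmp/assets.csv ./schema.json` should import the data with the header names intact. The table name and file paths in that command are placeholders. Columns can be cast to more specific types afterwards in a query.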
