2012-10-23

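BigQuery JSON import internal error?

I'm having trouble importing JSON into BigQuery. We have created a service account and use a custom .NET 4 library for all communication between our servers and BQ. Queries work, listing jobs works, and basically all retrieval works, but uploading data in JSON format does not.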

Here is what fetching the job returns:

{
  "kind": "bigquery#job",
  "etag": "\"WgwoVdnmFVq0E0riaWM5H0QXabs/R_b3J5b4GjwliMH_X8kjPNLVYsI\"",
  "id": "dot-metrics:job_f7eea1449bb24dffb0a0de1637f31abb",
  "selfLink": "https://www.googleapis.com/bigquery/v2/projects/dot-metrics/jobs/job_f7eea1449bb24dffb0a0de1637f31abb",
  "jobReference": {
    "projectId": "dot-metrics",
    "jobId": "job_f7eea1449bb24dffb0a0de1637f31abb"
  },
  "configuration": {
    "load": {
      "schema": {
        "fields": [
          {
            "name": "word",
            "type": "STRING",
            "mode": "REQUIRED"
          },
          {
            "name": "word_count",
            "type": "INTEGER",
            "mode": "REQUIRED"
          },
          {
            "name": "corpus",
            "type": "STRING",
            "mode": "REQUIRED"
          },
          {
            "name": "corpus_date",
            "type": "INTEGER",
            "mode": "REQUIRED"
          }
        ]
      },
      "destinationTable": {
        "projectId": "dot-metrics",
        "datasetId": "DotMetric_TEST",
        "tableId": "TestTable"
      },
      "writeDisposition": "WRITE_APPEND",
      "allowQuotedNewlines": true,
      "sourceFormat": "NEWLINE_DELIMITED_JSON"
    }
  },
  "status": {
    "state": "DONE",
    "errorResult": {
      "reason": "internalError",
      "message": "Backend error. Job aborted."
    }
  },
  "statistics": {
    "startTime": "1350998303355",
    "endTime": "1350998337446",
    "load": {
      "inputFiles": "1",
      "inputFileBytes": "7359"
    }
  }
}

The data is newline-delimited JSON, like this:

{"Word":"blah_139","WordCount":6615,"Corpus":"Corpus_678","CorpusDate": 6088201915056} 
{"Word":"blah_602","WordCount":2978,"Corpus":"Corpus_493","CorpusDate": 6088201915056} 
{"Word":"blah_50","WordCount":8315,"Corpus":"Corpus_360","CorpusDate": 6088201915056} 
{"Word":"blah_736","WordCount":8971,"Corpus":"Corpus_751","CorpusDate": 6088201915056} 
{"Word":"blah_243","WordCount":2362,"Corpus":"Corpus_174","CorpusDate": 6088201915056} 
{"Word":"blah_643","WordCount":765,"Corpus":"Corpus_315","CorpusDate": 6088201915056} 

The job runs for a while (about 10 seconds) but then dies. Please help!

Answer

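OK, it looks like you copied the Shakespeare sample table and are appending to it. The Shakespeare sample's schema has some warts, because it was imported from source data inside Google using an older version of BigQuery. Those warts are what cause the problem when we import into the table: specifically, we think the corpus_date field should be an int32 field rather than an int64, even though BigQuery only supports int64 for new data.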
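
To make the width mismatch concrete, here is a small sketch that checks a CorpusDate value from the question's sample rows against the signed 32-bit and 64-bit ranges. The helper names are made up for illustration, and this only shows that the value cannot fit in an int32; it does not reproduce what the backend actually does.

```python
# Bounds of signed 32-bit and 64-bit integers.
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def fits_int32(value: int) -> bool:
    """Return True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def fits_int64(value: int) -> bool:
    """Return True if value is representable as a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


# CorpusDate value taken from the sample rows in the question.
corpus_date = 6088201915056

print(fits_int32(corpus_date))  # False
print(fits_int64(corpus_date))  # True
```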

If you do a WRITE_TRUNCATE instead of an append and pass the new schema, or import into a new table, you shouldn't hit this problem.
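
As a rough sketch of that suggestion, the snippet below builds the `configuration` part of a load job with `WRITE_TRUNCATE` and an explicit schema. The field names mirror the job resource posted in the question. The function name `build_load_config` is hypothetical, and how the body is actually sent depends on your own client library.

```python
import json


def build_load_config(project_id, dataset_id, table_id,
                      write_disposition="WRITE_TRUNCATE"):
    """Return the `configuration` section of a load job as a plain dict."""
    # Schema passed explicitly so the table is recreated with it on truncate.
    schema_fields = [
        {"name": "word", "type": "STRING", "mode": "REQUIRED"},
        {"name": "word_count", "type": "INTEGER", "mode": "REQUIRED"},
        {"name": "corpus", "type": "STRING", "mode": "REQUIRED"},
        {"name": "corpus_date", "type": "INTEGER", "mode": "REQUIRED"},
    ]
    return {
        "load": {
            "schema": {"fields": schema_fields},
            "destinationTable": {
                "projectId": project_id,
                "datasetId": dataset_id,
                "tableId": table_id,
            },
            "writeDisposition": write_disposition,
            "sourceFormat": "NEWLINE_DELIMITED_JSON",
        }
    }


# Same project, dataset and table as in the question.
config = build_load_config("dot-metrics", "DotMetric_TEST", "TestTable")
print(json.dumps(config, indent=2))
```

To try the other option from the answer, pass a different `table_id` so the load goes into a new table instead.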

I tried it with `WRITE_TRUNCATE`, and now the job has been `PENDING` for about 10 minutes. Is that normal?

It worked. Thanks :)

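Good to know. If a lot of large imports are in progress, you can get stuck in pending. Our worker pool should resize dynamically when that happens, but it sometimes takes a few minutes to bring additional capacity online.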
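
Since a job can sit in `PENDING` for a few minutes, a client would typically poll until the state is `DONE` and then look at `status.errorResult`, as in the job resource from the question. The sketch below assumes a hypothetical `fetch_job` callable that returns the job resource as a dict; it is stubbed here so the example runs without any network access.

```python
import time


def wait_for_job(fetch_job, poll_seconds=5.0, timeout_seconds=900.0,
                 sleep=time.sleep):
    """Poll fetch_job() until the job state is DONE and return the resource.

    fetch_job is a hypothetical callable returning the job resource dict.
    Raises RuntimeError if the job finished with status.errorResult and
    TimeoutError if it is still not DONE after timeout_seconds of waiting.
    """
    waited = 0.0
    while True:
        job = fetch_job()
        status = job.get("status", {})
        if status.get("state") == "DONE":
            error = status.get("errorResult")
            if error:
                raise RuntimeError(
                    f"{error.get('reason')}: {error.get('message')}")
            return job
        if waited >= timeout_seconds:
            raise TimeoutError(f"job not DONE after {waited:.0f} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Stub standing in for a real "get job" call: PENDING twice, RUNNING, DONE.
states = iter(["PENDING", "PENDING", "RUNNING", "DONE"])


def fake_fetch():
    return {"status": {"state": next(states)}}


job = wait_for_job(fake_fetch, poll_seconds=0.0)
print(job["status"]["state"])  # DONE
```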