2012-09-20 50 views
4

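Summary of all bq jobs

Is there a way to list all job IDs within a given time range using the bq command-line tool? What I need to do is loop through all the IDs and find out whether any of them has errors.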

I use the web interface to find the job IDs and then run the command:

​​

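Afterwards I manually copy and paste the "errors" part of the output. It takes a long time to put together a summary of the jobs for a given date this way.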

+1

bq ls -j `bq show | grep ^Project | awk '{print $2}'` | grep "`date +'%d %b'`" | awk '{print $1}' # This is a poor man's solution and not as elegant as the one shown below – shantanuo

+0

Actually, piping the results into a stream processor can be quite elegant :-) –

Answers

4

Sure, you can list up to the last 1,000 jobs for a project you have access to by running:

bq ls -j --max_results=1000 project_number 

If you have more than 1,000 jobs, you can also write a Python script that lists all of them by paging through the results in batches of 1,000, like this:

import httplib2

from apiclient.discovery import build
from apiclient.errors import HttpError

from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client.tools import run


# Enter your Google Developer Project number
PROJECT_NUMBER = 'XXXXXXXXXXXX'

FLOW = flow_from_clientsecrets('client_secrets.json',
                               scope='https://www.googleapis.com/auth/bigquery')


def main():
    # Reuse stored credentials, or run the OAuth flow if none are valid
    storage = Storage('bigquery_credentials.dat')
    credentials = storage.get()

    if credentials is None or credentials.invalid:
        credentials = run(FLOW, storage)

    http = httplib2.Http()
    http = credentials.authorize(http)

    bigquery_service = build('bigquery', 'v2', http=http)
    jobs = bigquery_service.jobs()

    page_token = None
    count = 0

    while True:
        response = list_jobs_page(jobs, page_token)
        if response is None:
            break  # the request failed; the error has already been printed
        # A page may carry no 'jobs' key at all, so default to an empty list
        for job in response.get('jobs', []):
            count += 1
            error = job.get('errorResult')
            print('%d. %s\t%s\t%s' % (count,
                                      job['jobReference']['jobId'],
                                      job['state'],
                                      error['reason'] if error else ''))
        # Keep going only while the API hands back a token for the next page
        page_token = response.get('nextPageToken')
        if not page_token:
            break


def list_jobs_page(jobs, page_token=None):
    # Fetch one page of up to 1000 jobs; returns None on an HTTP error
    try:
        return jobs.list(projectId=PROJECT_NUMBER,
                         projection='minimal',
                         allUsers=True,
                         maxResults=1000,
                         pageToken=page_token).execute()
    except HttpError as err:
        print('Error: %s' % err.content)


if __name__ == '__main__':
    main()
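
The question asks for jobs within a given time range, which the listing script does not filter on. A minimal sketch of one way to do that on the client side follows, written for Python 3. It assumes each job entry returned by the listing carries a `statistics.creationTime` field holding milliseconds since the epoch as a string; the helper names `to_millis`, `jobs_in_range` and `summarize_errors` are made up for this illustration and are not part of any library.

```python
from datetime import datetime, timezone


def to_millis(dt):
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    return int(dt.timestamp() * 1000)


def jobs_in_range(jobs, start, end):
    """Yield the jobs created within [start, end).

    Assumes each job dict has statistics.creationTime as a string of
    milliseconds since the epoch; jobs without it are skipped.
    """
    lo, hi = to_millis(start), to_millis(end)
    for job in jobs:
        created = job.get('statistics', {}).get('creationTime')
        if created is not None and lo <= int(created) < hi:
            yield job


def summarize_errors(jobs):
    """Return (jobId, reason, message) for every job that has an errorResult."""
    rows = []
    for job in jobs:
        error = job.get('errorResult')
        if error:
            rows.append((job['jobReference']['jobId'],
                         error.get('reason', ''),
                         error.get('message', '')))
    return rows
```

To use it, collect the job entries from every page into a list, then call something like `summarize_errors(jobs_in_range(all_jobs, datetime(2012, 9, 20, tzinfo=timezone.utc), datetime(2012, 9, 21, tzinfo=timezone.utc)))` to get the failed jobs for that day.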
+0

I get an error: oauth2client.clientsecrets.InvalidClientSecretsError: File not found: "client_secrets.json" – shantanuo

+0

Yes, the script above requires a client ID and secret because it uses OAuth... see: https://developers.google.com/bigquery/docs/authorization#clientsecrets –

+0

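You can also write these scripts using server-to-server service account authentication (in fact, this is the better strategy for automated scripts that live on a server)... see https://developers.google.com/bigquery/docs/authorization#service-accounts-server –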

1

The script below comes close to the report I need.

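#!/bin/sh
# Resolve the default project once instead of on every loop iteration
project=$(bq show | grep ^Project | awk '{print $2}')

# Save the IDs of the jobs whose listing line contains today's date (e.g. "20 Sep")
bq ls -j "$project" | grep "$(date +'%d %b')" | awk '{print $1}' > tosave.txt

for myjob in $(cat tosave.txt)
do
    # Print the listing line for this job, then the lines around its error message
    bq ls -j "$project" | grep "$myjob"
    bq show --format=prettyjson -j "$myjob" | grep -C2 "message" | head
done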
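
Grepping the pretty-printed JSON for "message" works, but it can pick up unrelated lines. As a rough alternative, the same `bq show --format=prettyjson -j` output can be parsed as JSON. The sketch below is written for Python 3 and assumes the output is a single job resource with `jobReference.jobId`, `status.state` and, for failed jobs, `status.errorResult.message`; the function names `extract_error` and `show_job` are made up for this illustration.

```python
import json
import subprocess


def extract_error(job_json):
    """Return (jobId, state, message) from the JSON text of one job.

    The message is an empty string when the job has no errorResult.
    """
    job = json.loads(job_json)
    status = job.get('status', {})
    error = status.get('errorResult') or {}
    return (job.get('jobReference', {}).get('jobId', ''),
            status.get('state', ''),
            error.get('message', ''))


def show_job(job_id):
    """Run `bq show` for one job and return its raw JSON output as text.

    Requires the bq command-line tool to be installed and authorized.
    """
    output = subprocess.check_output(
        ['bq', 'show', '--format=prettyjson', '-j', job_id])
    return output.decode('utf-8')
```

For each job ID saved in tosave.txt, `extract_error(show_job(job_id))` would then give one tuple per job that can be printed as a line of the daily report.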