Parsing Stackdriver LogEntry JSON in a Dataflow pipeline

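I am building a Dataflow pipeline to process Stackdriver logs. The data is read from Pub/Sub and the results are written to BigQuery. When I read from Pub/Sub I get JSON strings of LogEntry objects, but what I am really interested in is the protoPayload.line records, which contain the user log messages. To get at those I need to parse the LogEntry JSON object, and I found a two-year-old Google example of how to do this: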

import java.io.IOException;

import com.google.api.client.json.JsonParser;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.logging.model.LogEntry;

try {
    // Deserialize the Pub/Sub message body into the API client's LogEntry model.
    JsonParser parser = new JacksonFactory().createJsonParser(entry);
    LogEntry logEntry = parser.parse(LogEntry.class);
    logString = logEntry.getTextPayload();
} catch (IOException e) {
    LOG.error("IOException parsing entry: " + e.getMessage());
} catch (NullPointerException e) {
    LOG.error("NullPointerException parsing entry: " + e.getMessage());
}

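Unfortunately this did not work for me: logEntry.getTextPayload() returns null. I am not even sure it is supposed to work, because the com.google.api.services.logging library is not mentioned anywhere in the Google Cloud documentation, and the current logging library appears to be google-cloud-logging.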
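
One likely explanation, offered as an assumption rather than something the example confirms: a LogEntry carries only one payload field (textPayload, jsonPayload, or protoPayload), so a text-payload getter has nothing to return when the entry was written with protoPayload. The sketch below checks which payload field an already-parsed entry contains. It uses a plain Map as a stand-in for the parsed JSON, and the PayloadKind class and its detect method are hypothetical names, not part of any Google library.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of any Google library. */
final class PayloadKind {
    // The three payload field names a LogEntry can carry.
    private static final List<String> PAYLOAD_FIELDS =
            List.of("textPayload", "jsonPayload", "protoPayload");

    /** Returns the name of the payload field present in the entry, or "none". */
    static String detect(Map<String, ?> entry) {
        for (String field : PAYLOAD_FIELDS) {
            if (entry.get(field) != null) {
                return field;
            }
        }
        return "none";
    }
}
```

If detect reports protoPayload for the entries coming off Pub/Sub, the log messages have to be read from that field rather than from textPayload.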

So, can anyone suggest the correct or simplest way to parse LogEntry objects?

Source

2017-10-09 dmitryb

I ended up parsing the LogEntry JSON manually with the gson library, specifically using its tree-traversal approach. Here is a small snippet:

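import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
// Beam SDK import assumed; the original snippet omitted its imports.
import org.apache.beam.sdk.transforms.DoFn;

static class ProcessLogMessages extends DoFn<String, String> {
    @ProcessElement
    public void processElement(ProcessContext c) {
        String entry = c.element();

        JsonElement element = new JsonParser().parse(entry);
        if (!element.isJsonObject()) {
            return; // skip JSON null and any other non-object input
        }
        JsonObject root = element.getAsJsonObject();
        // Not every LogEntry carries protoPayload.line, so check before traversing.
        JsonElement payload = root.get("protoPayload");
        if (payload == null || !payload.isJsonObject()) {
            return;
        }
        JsonElement lines = payload.getAsJsonObject().get("line");
        if (lines == null || !lines.isJsonArray()) {
            return;
        }
        for (JsonElement item : lines.getAsJsonArray()) {
            JsonObject line = item.getAsJsonObject();
            String logMessage = line.get("logMessage").getAsString();

            // Do what you need with the logMessage here
            c.output(logMessage);
        }
    }
}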

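This was simple and worked well for me because I was only interested in the protoPayload.line.logMessage values. But I suppose it is not an ideal way to parse LogEntry objects if you need to handle many attributes.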
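
When more attributes are needed, one option is to deserialize the entry into plain Map and List values and walk those with type checks, so that a missing or unexpected field is skipped instead of throwing. The following is only a minimal sketch under that assumption: it expects the entry to be parsed already (for instance with Gson's fromJson(json, Map.class), which is an assumption about how the map is obtained), and LogMessages and extract are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: collects protoPayload.line[].logMessage from a parsed entry. */
final class LogMessages {
    static List<String> extract(Map<String, ?> entry) {
        List<String> messages = new ArrayList<>();
        Object payload = entry.get("protoPayload");
        if (!(payload instanceof Map)) {
            return messages; // entry has no structured protoPayload
        }
        Object lines = ((Map<?, ?>) payload).get("line");
        if (!(lines instanceof List)) {
            return messages; // no log lines attached to this entry
        }
        for (Object line : (List<?>) lines) {
            if (line instanceof Map) {
                Object message = ((Map<?, ?>) line).get("logMessage");
                if (message instanceof String) {
                    messages.add((String) message);
                }
            }
        }
        return messages;
    }
}
```

Inside a DoFn, each string returned by extract could be passed to c.output in the same way as in the snippet above. Other attributes could be read from the same maps with the same instanceof checks.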

Source

2017-10-12 10:31:25 dmitryb
