2013-12-09 66 views
23

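What's causing these ParseError exceptions when reading off an AWS SQS queue in my Storm cluster?

I'm using Storm 0.8.1 to read incoming messages off an Amazon SQS queue, and I'm getting consistent exceptions when doing so: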

2013-12-02 02:21:38 executor [ERROR] 
java.lang.RuntimeException: com.amazonaws.AmazonClientException: Unable to unmarshall response (ParseError at [row,col]:[1,1] 
Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.) 
     at REDACTED.spouts.SqsQueueSpout.handleNextTuple(SqsQueueSpout.java:219) 
     at REDACTED.spouts.SqsQueueSpout.nextTuple(SqsQueueSpout.java:88) 
     at backtype.storm.daemon.executor$fn__3976$fn__4017$fn__4018.invoke(executor.clj:447) 
     at backtype.storm.util$async_loop$fn__465.invoke(util.clj:377) 
     at clojure.lang.AFn.run(AFn.java:24) 
     at java.lang.Thread.run(Thread.java:701) 
Caused by: com.amazonaws.AmazonClientException: Unable to unmarshall response (ParseError at [row,col]:[1,1] 
Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.) 
     at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:524) 
     at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:298) 
     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:167) 
     at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:812) 
     at com.amazonaws.services.sqs.AmazonSQSClient.receiveMessage(AmazonSQSClient.java:575) 
     at REDACTED.spouts.SqsQueueSpout.handleNextTuple(SqsQueueSpout.java:191) 
     ... 5 more 
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1] 
Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK. 
     at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.setInputSource(XMLStreamReaderImpl.java:219) 
     at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.<init>(XMLStreamReaderImpl.java:189) 
     at com.sun.xml.internal.stream.XMLInputFactoryImpl.getXMLStreamReaderImpl(XMLInputFactoryImpl.java:277) 
     at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLStreamReader(XMLInputFactoryImpl.java:129) 
     at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLEventReader(XMLInputFactoryImpl.java:78) 
     at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:85) 
     at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:41) 
     at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:503) 
     ... 10 more 

I've debugged the data on the queue and everything looks fine. I can't figure out why the API's XML response would cause these problems. Any ideas?

Answers

47

Answering my own question here for the ages.

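There is currently a bug in how both Oracle's and OpenJDK's Java handle the XML entity expansion limit: a counter is shared across parses, so it eventually hits the default upper bound when many XML documents are parsed.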

  1. https://blogs.oracle.com/joew/entry/jdk_7u45_aws_issue_123
  2. https://bugs.openjdk.java.net/browse/JDK-8028111
  3. https://github.com/aws/aws-sdk-java/issues/123
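
To check whether a given JVM is affected, something along these lines may help. It is only a minimal sketch based on my reading of the bug report, not the sample code from the report itself: the class name `EntityLimitCheck`, the test document, and the figure of 70,000 parses (picked only because it exceeds the 64000 limit in the error message) are all assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical helper: parses many small XML documents through one shared
// StAX factory and counts how many parses fail with an XMLStreamException.
class EntityLimitCheck {

    static int countFailures(int documents) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Illustrative payload with one entity reference; not a real SQS response.
        String xml = "<?xml version=\"1.0\"?><msg>a &amp; b</msg>";
        int failures = 0;
        for (int i = 0; i < documents; i++) {
            try {
                XMLEventReader reader =
                        factory.createXMLEventReader(new StringReader(xml));
                // Drain the document so the parser processes every event.
                while (reader.hasNext()) {
                    reader.nextEvent();
                }
                reader.close();
            } catch (XMLStreamException e) {
                // On an affected JDK this is where the JAXP00010001 error would surface.
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // 70,000 is an assumption: just above the 64000 default limit.
        System.out.println("failed parses: " + countFailures(70000));
    }
}
```

If the count is non-zero and the exception message mentions JAXP00010001, the JVM is most likely hitting the same limit as in the stack trace above; a JDK with the fix should not report failures for a document this small.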

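Although I thought our version (6b27-1.12.6-1ubuntu0.12.04.4) was not affected, running the sample code given in the OpenJDK bug report confirmed that we were in fact susceptible to this bug.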

To work around the issue, I needed to pass jdk.xml.entityExpansionLimit=0 to the Storm workers. Adding the following to storm.yaml across my cluster mitigated the problem.

supervisor.childopts: "-Djdk.xml.entityExpansionLimit=0" 
worker.childopts: "-Djdk.xml.entityExpansionLimit=0" 
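
To confirm that the option actually reached a worker JVM, the property can be read back at startup, for example from a spout's initialization code. This is a rough sketch only; the class name `EntityLimitSetting` and the message wording are made up for illustration and are not part of Storm or the AWS SDK.

```java
// Hypothetical helper: reports the value of the entity expansion limit
// system property as seen by the current JVM.
class EntityLimitSetting {

    static final String PROPERTY = "jdk.xml.entityExpansionLimit";

    static String describe() {
        String value = System.getProperty(PROPERTY);
        if (value == null) {
            // Nothing was passed with -D, so the JDK's built-in default is used.
            return PROPERTY + " is not set; the JDK default applies";
        }
        return PROPERTY + "=" + value;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Logging the result of `describe()` once per worker makes it easy to spot a supervisor that was not restarted after storm.yaml changed.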

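I should note that this technically opens you up to a denial-of-service attack, but since our XML documents come only from SQS, I'm not worried about someone forging malicious XML to kill our workers.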

+0

There may be more to it. I get the same error using Java 6. I don't have Java 7 installed on my machine. – BrianC

+0

P.S. Just mentioning it, by the way. – BrianC

+0

Never mind. I found that this also affects specific versions of Java 5, 6, 7, and 8. For details, see https://bugs.openjdk.java.net/browse/JDK-8028111 – BrianC