2013-10-08

Nutch: fetch fails with java.net.SocketException: Connection reset

I am running Nutch on my Hadoop cluster. When the job reaches the fetch step, I get java.net.SocketException: Connection reset. Here is the full stack trace:

2013-10-09 00:34:05,922 INFO org.apache.nutch.fetcher.Fetcher: fetch of Url error : xxxxxxx failed with: java.net.SocketException: Connection reset 
2013-10-09 00:34:05,923 ERROR org.apache.nutch.protocol.httpclient.Http: Failed to get protocol output 
java.net.SocketException: Connection reset 
    at java.net.SocketInputStream.read(SocketInputStream.java:189) 
    at java.net.SocketInputStream.read(SocketInputStream.java:121) 
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) 
    at java.io.BufferedInputStream.read(BufferedInputStream.java:254) 
    at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:77) 
    at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:105) 
    at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1115) 
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1373) 
    at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1832) 
    at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1590) 
    at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:995) 
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:397) 
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170) 
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396) 
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:324) 
    at org.apache.nutch.protocol.httpclient.HttpResponse.<init>(HttpResponse.java:94) 
    at org.apache.nutch.protocol.httpclient.Http.getResponse(Http.java:154) 
    at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:140) 
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:703) 

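Can you access the target URL (possibly several times in a row) with a browser, curl, or wget from the host where Hadoop is running? For an explanation of the exception itself, see http://stackoverflow.com/questions/62929/java-net-socketexception-connection-reset. – harpun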

Answer


You must specify the protocol of each URL in the seed list! For example:

http://stackoverflow.com/ 
https://google.com 
ftp://foo.bar 
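
To catch seed entries without a protocol before injecting them, a small check along these lines might help. This is only a sketch: the function name `find_bad_seeds` and the `ALLOWED_SCHEMES` set are hypothetical, and it assumes that blank lines and lines starting with `#` can be skipped and that anything after the first whitespace on a line is metadata rather than part of the URL.

```python
from urllib.parse import urlparse

# Assumption: the schemes your enabled protocol plugins can handle.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def find_bad_seeds(lines):
    """Return (line_number, entry) pairs for seed entries without an allowed protocol."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        entry = raw.strip()
        # Assumption: blank lines and '#' comment lines are not seed URLs.
        if not entry or entry.startswith("#"):
            continue
        # Assumption: the URL is the first whitespace-separated field on the line.
        url = entry.split()[0]
        parsed = urlparse(url)
        # A usable seed needs both an allowed scheme and a host part.
        if parsed.scheme.lower() not in ALLOWED_SCHEMES or not parsed.netloc:
            bad.append((number, entry))
    return bad


if __name__ == "__main__":
    seeds = [
        "http://stackoverflow.com/",
        "https://google.com",
        "ftp://foo.bar",
        "stackoverflow.com/questions",  # no protocol, so it gets reported
    ]
    for number, entry in find_bad_seeds(seeds):
        print(f"line {number}: missing or unsupported protocol: {entry}")
```

To check a real seed file, pass its lines instead of the sample list, for example `find_bad_seeds(open("urls/seed.txt"))`, where the path is a placeholder for wherever your seed list lives.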