2015-09-08 44 views
4

I'm having some problems with my custom classes in Giraph. I wrote a VertexInput and an output format, but I always get the following error:

java.io.IOException: ensureRemaining: Only * bytes remaining, trying to read * 

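with different values in place of each "*".

This was tested on a single-node cluster.

The problem occurs when the vertexIterator calls next() and there are no more vertices left. The iterator is invoked from the flush method, but I don't understand why the next() method fails. Here are some logs and classes...

My log: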

15/09/08 00:52:21 INFO bsp.BspService: BspService: Connecting to ZooKeeper with job giraph_yarn_application_1441683854213_0001, 1 on localhost:22181 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:host.name=localhost 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.version=1.7.0_79 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-openjdk-amd64/jre 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.class.path=.:${CLASSPATH}:./**/ 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/l$ 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA> 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:os.version=3.13.0-62-generic 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:user.name=hduser 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/hduser 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Client environment:user.dir=/app/hadoop/tmp/nm-local-dir/usercache/hduser/appcache/application_1441683854213_0001/container_1441683854213_0001_01_000003 
15/09/08 00:52:21 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:22181 sessionTimeout=60000 [email protected] 
15/09/08 00:52:21 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:22181. Will not attempt to authenticate using SASL (unknown error) 
15/09/08 00:52:21 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:22181, initiating session 
15/09/08 00:52:21 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:22181, sessionid = 0x14fab0de0bb0002, negotiated timeout = 40000 
15/09/08 00:52:21 INFO bsp.BspService: process: Asynchronous connection complete. 
15/09/08 00:52:21 INFO netty.NettyServer: NettyServer: Using execution group with 8 threads for requestFrameDecoder. 
15/09/08 00:52:21 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 
15/09/08 00:52:21 INFO netty.NettyServer: start: Started server communication server: localhost/127.0.0.1:30001 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 
15/09/08 00:52:21 INFO netty.NettyClient: NettyClient: Using execution handler with 8 threads after request-encoder. 
15/09/08 00:52:21 INFO graph.GraphTaskManager: setup: Registering health of this worker... 
15/09/08 00:52:21 INFO yarn.GiraphYarnTask: [STATUS: task-1] WORKER_ONLY starting... 
15/09/08 00:52:22 INFO bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/giraph_yarn_application_1441683854213_0001/_masterJobState) 
15/09/08 00:52:22 INFO bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir already exists! 
15/09/08 00:52:22 INFO bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir already exists! 
15/09/08 00:52:22 INFO worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir/0/_superstepD$ 
15/09/08 00:52:22 INFO netty.NettyServer: start: Using Netty without authentication. 
15/09/08 00:52:22 INFO bsp.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned) 
15/09/08 00:52:22 INFO worker.BspServiceWorker: startSuperstep: Master(hostname=localhost, MRtaskID=0, port=30000) 
15/09/08 00:52:22 INFO worker.BspServiceWorker: startSuperstep: Ready for computation on superstep -1 since worker selection and vertex range assignments are done in /_hadoopBsp/giraph_yarn_application_1441683854$ 
15/09/08 00:52:22 INFO yarn.GiraphYarnTask: [STATUS: task-1] startSuperstep: WORKER_ONLY - Attempt=0, Superstep=-1 
15/09/08 00:52:22 INFO netty.NettyClient: Using Netty without authentication. 
15/09/08 00:52:22 INFO netty.NettyClient: Using Netty without authentication. 
15/09/08 00:52:22 INFO netty.NettyClient: connectAllAddresses: Successfully added 2 connections, (2 total connected) 0 failed, 0 failures total. 
15/09/08 00:52:22 INFO netty.NettyServer: start: Using Netty without authentication. 
15/09/08 00:52:22 INFO handler.RequestDecoder: decode: Server window metrics MBytes/sec received = 0, MBytesReceived = 0.0001, ave received req MBytes = 0.0001, secs waited = 1.44168435E9 
15/09/08 00:52:22 INFO worker.BspServiceWorker: loadInputSplits: Using 1 thread(s), originally 1 threads(s) for 1 total splits. 
15/09/08 00:52:22 INFO worker.InputSplitsHandler: reserveInputSplit: Reserved input split path /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_vertexInputSplitDir/0, overall roughly 0.0% input splits rese$ 
15/09/08 00:52:22 INFO worker.InputSplitsCallable: getInputSplit: Reserved /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_vertexInputSplitDir/0 from ZooKeeper and got input split 'hdfs://hdnode01:54310/u$ 
15/09/08 00:52:22 INFO worker.InputSplitsCallable: loadFromInputSplit: Finished loading /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_vertexInputSplitDir/0 (v=6, e=10) 
15/09/08 00:52:22 INFO worker.InputSplitsCallable: call: Loaded 1 input splits in 0.16241108 secs, (v=6, e=10) 36.94329 vertices/sec, 61.572155 edges/sec 
15/09/08 00:52:22 ERROR utils.LogStacktraceCallable: Execution of callable failed 

java.lang.IllegalStateException: next: IOException 
     at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:101) 
     at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99) 
     at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115) 
     at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466) 
     at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412) 
     at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241) 
     at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60) 
     at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51) 
     at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
     at java.lang.Thread.run(Thread.java:745) 
Caused by: java.io.IOException: ensureRemaining: Only 0 bytes remaining, trying to read 1 
     at org.apache.giraph.utils.UnsafeReads.ensureRemaining(UnsafeReads.java:77) 
     at org.apache.giraph.utils.UnsafeArrayReads.readByte(UnsafeArrayReads.java:123) 
     at org.apache.giraph.utils.UnsafeReads.readLine(UnsafeReads.java:100) 
     at pruebas.TextAndDoubleComplexWritable.readFields(TextAndDoubleComplexWritable.java:37) 
     at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:540) 
     at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98) 
     ... 11 more 
15/09/08 00:52:22 ERROR worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/giraph_yarn_application_1441683854213_0001/_applicationAttemptsDir/0/_superstepDir/-1/_workerHea$ 
15/09/08 00:52:22 ERROR yarn.GiraphYarnTask: GiraphYarnTask threw a top-level exception, failing task 
java.lang.RuntimeException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for [email protected]0 
     at org.apache.giraph.yarn.GiraphYarnTask.run(GiraphYarnTask.java:104) 
     at org.apache.giraph.yarn.GiraphYarnTask.main(GiraphYarnTask.java:183) 
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for [email protected]0 
     at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193) 
     at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151) 
     at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136) 
     at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99) 
     at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233) 
     at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316) 
     at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409) 
     at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629) 
     at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284) 
     at org.apache.giraph.yarn.GiraphYarnTask.run(GiraphYarnTask.java:92) 
     ... 1 more 
Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: next: IOException 
     at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
     at java.util.concurrent.FutureTask.get(FutureTask.java:202) 
     at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312) 
     at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185) 
     ... 10 more 
Caused by: java.lang.IllegalStateException: next: IOException 
     at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:101) 
     at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99) 
     at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115) 
     at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466) 
     at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412) 
     at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241) 
     at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60) 
     at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51) 
     at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
     at java.lang.Thread.run(Thread.java:745) 
Caused by: java.io.IOException: ensureRemaining: Only 0 bytes remaining, trying to read 1 
     at org.apache.giraph.utils.UnsafeReads.ensureRemaining(UnsafeReads.java:77) 
     at org.apache.giraph.utils.UnsafeArrayReads.readByte(UnsafeArrayReads.java:123) 
     at org.apache.giraph.utils.UnsafeReads.readLine(UnsafeReads.java:100) 
     at pruebas.TextAndDoubleComplexWritable.readFields(TextAndDoubleComplexWritable.java:37) 
     at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:540) 
     at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98) 
     ... 11 more 

My input format:

package pruebas; 

import org.apache.giraph.edge.Edge; 
import org.apache.giraph.edge.EdgeFactory; 
import org.apache.giraph.io.formats.AdjacencyListTextVertexInputFormat; 
import org.apache.hadoop.io.DoubleWritable; 
import org.apache.hadoop.io.Text; 
import org.apache.hadoop.mapreduce.InputSplit; 
import org.apache.hadoop.mapreduce.TaskAttemptContext; 

/** 
* @author hduser 
* 
*/ 
public class IdTextWithComplexValueInputFormat 
     extends 
     AdjacencyListTextVertexInputFormat<Text, TextAndDoubleComplexWritable, DoubleWritable> { 

    @Override 
    public AdjacencyListTextVertexReader createVertexReader(InputSplit split, 
      TaskAttemptContext context) { 
     return new TextComplexValueDoubleAdjacencyListVertexReader(); 
    } 

    protected class TextComplexValueDoubleAdjacencyListVertexReader extends 
      AdjacencyListTextVertexReader { 

     /** 
     * Constructor with 
     * {@link AdjacencyListTextVertexInputFormat.LineSanitizer}. 
     * 
     * @param lineSanitizer 
     *   the sanitizer to use for reading 
     */ 
     public TextComplexValueDoubleAdjacencyListVertexReader() { 
      super(); 
     } 

     @Override 
     public Text decodeId(String s) { 
      return new Text(s); 
     } 

     @Override 
     public TextAndDoubleComplexWritable decodeValue(String s) { 
      TextAndDoubleComplexWritable valorComplejo = new TextAndDoubleComplexWritable(); 
      valorComplejo.setVertexData(Double.valueOf(s)); 
      valorComplejo.setIds_vertices_anteriores(""); 
      return valorComplejo; 
     } 

     @Override 
     public Edge<Text, DoubleWritable> decodeEdge(String s1, String s2) { 
      return EdgeFactory.create(new Text(s1), 
        new DoubleWritable(Double.valueOf(s2))); 
     } 
    } 

} 

TextAndDoubleComplexWritable:

package pruebas; 

import java.io.DataInput; 
import java.io.DataOutput; 
import java.io.IOException; 

import org.apache.hadoop.io.Writable; 

public class TextAndDoubleComplexWritable implements Writable { 

    private String idsVerticesAnteriores; 

    private double vertexData; 

    public TextAndDoubleComplexWritable() { 
     super(); 
     this.idsVerticesAnteriores = ""; 
    } 

    public TextAndDoubleComplexWritable(double vertexData) { 
     super(); 
     this.vertexData = vertexData; 
    } 

    public TextAndDoubleComplexWritable(String ids_vertices_anteriores, 
      double vertexData) { 
     super(); 
     this.idsVerticesAnteriores = ids_vertices_anteriores; 
     this.vertexData = vertexData; 
    } 

    public void write(DataOutput out) throws IOException { 
     out.writeUTF(idsVerticesAnteriores); 
    } 

    public void readFields(DataInput in) throws IOException { 
     idsVerticesAnteriores = in.readLine(); 
    } 

    public String getIds_vertices_anteriores() { 
     return idsVerticesAnteriores; 
    } 

    public void setIds_vertices_anteriores(String ids_vertices_anteriores) { 
     this.idsVerticesAnteriores = ids_vertices_anteriores; 
    } 

    public double getVertexData() { 
     return vertexData; 
    } 

    public void setVertexData(double vertexData) { 
     this.vertexData = vertexData; 
    } 
} 

My input file:

Portada 0.0  Sugerencias  1.0 
Sugerencias  3.0  Portada 1.0 

I run it with this command:

$HADOOP_HOME/bin/yarn jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-for-hadoop-2.4.0-jar-with-dependencies.jar org.apache.giraph.GiraphRunner lectura_de_grafo.BusquedaDeCaminosNavegacionalesWikiquote -vif pruebas.IdTextWithComplexValueInputFormat -vip /user/hduser/input/wiki-graph-chiquito.txt -op /user/hduser/output/caminosNavegacionales -w 2 -yh 250 

Any help would be greatly appreciated!


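UPDATE: My input file was wrong. Giraph (or my example) does not cope well with outgoing edges that point to vertices not listed in the file.

But the problem persists. I updated the file data in the original question.

UPDATE 2: The OutputFormat is never used, and the computation algorithm is never executed either. I removed both to help clarify the problem.

Update 3, 19/11/2015: The problem is not in the input format; it works fine and reads the data completely. The problem is in the TextAndDoubleComplexWritable class, which I have added to my original question to better explain the final solution (I also added an answer).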

Answers

0

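The problem was in the TextAndDoubleComplexWritable class. When I implemented the Writable interface, I wasn't aware of how important the readFields and write methods are. They are crucial, because they are the methods that let us send and receive information in Giraph. My write method was only writing the (empty) string, when it should have written both values of my vertex, and readFields did not read the data back the same way it was written. I updated the two methods as follows: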

public void write(DataOutput out) throws IOException { 
     // Serialize both fields, always in the same order.
     out.writeDouble(this.vertexData); 
     // writeUTF(null) throws, so fall back to an empty string.
     out.writeUTF(this.idsVerticesAnteriores == null ? "" 
       : this.idsVerticesAnteriores); 
} 

public void readFields(DataInput in) throws IOException { 
    // Read the fields back in exactly the order write() emitted them.
    this.vertexData = in.readDouble(); 
    this.idsVerticesAnteriores = in.readUTF(); 
} 

And this finally works!
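A quick way to check that write and readFields mirror each other is a round trip through an in-memory buffer. The following is only a minimal sketch: `VertexValue` is a hypothetical stand-in for TextAndDoubleComplexWritable that avoids the Hadoop dependency, and `roundTrip` is an illustrative helper, not part of Giraph.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WritableRoundTrip {

    /** Hypothetical stand-in for TextAndDoubleComplexWritable (no Hadoop needed). */
    static final class VertexValue {
        double vertexData;
        String previousIds = "";

        void write(DataOutput out) throws IOException {
            out.writeDouble(vertexData);
            out.writeUTF(previousIds);
        }

        void readFields(DataInput in) throws IOException {
            // Same fields, same order, same encoding as write().
            vertexData = in.readDouble();
            previousIds = in.readUTF();
        }
    }

    /** Serializes a value, reads it back into a fresh object, and returns the copy. */
    static VertexValue roundTrip(double vertexData, String previousIds) {
        try {
            VertexValue original = new VertexValue();
            original.vertexData = vertexData;
            original.previousIds = previousIds;

            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));

            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            VertexValue copy = new VertexValue();
            copy.readFields(in);

            // A symmetric pair consumes exactly the bytes that were written.
            if (in.available() != 0) {
                throw new IllegalStateException("readFields left unread bytes");
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        VertexValue copy = roundTrip(3.0, "Portada");
        System.out.println(copy.vertexData + " " + copy.previousIds);
    }
}
```

Running the same kind of check against the real class (with its real write and readFields) would probably have exposed the mismatch before the job was submitted to the cluster.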

3

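The root cause of the exception is raised in org.apache.giraph.utils.UnsafeReads.ensureRemaining. Note that this is called from the Giraph utils.

The exception means that the reader insists on getting more input from the input stream, but the stream does not have that much input left (i.e. it hit EOF).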
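To illustrate how a reader can run off the end of its input, here is a small sketch using only `java.io`. It writes strings with writeUTF, as the write method in the question does, and then reads with readLine, as the question's readFields does. The class and method names are made up for the illustration. Note one difference: `java.io.DataInputStream.readLine` simply returns when it reaches EOF, whereas the stack trace in the question shows Giraph's UnsafeReads.readLine failing in ensureRemaining at that point.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class MismatchedReadDemo {

    /** Writes `count` empty strings with writeUTF. */
    static byte[] writeEmptyStrings(int count) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (int i = 0; i < count; i++) {
                // An empty string is encoded as two length bytes (0x00 0x00).
                // No line terminator is ever written.
                out.writeUTF("");
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns how many bytes a single readLine() call consumes from `data`. */
    @SuppressWarnings("deprecation")
    static int bytesConsumedByReadLine(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int before = in.available();
            // readLine() scans for '\n' or '\r'; finding none, it runs to EOF.
            in.readLine();
            return before - in.available();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = writeEmptyStrings(2);
        System.out.println("written:  " + data.length + " bytes");
        System.out.println("consumed: " + bytesConsumedByReadLine(data) + " bytes");
    }
}
```

Because the written data contains no newline, one readLine call swallows every record in the buffer instead of just its own, which is consistent with a reader that asks for one more byte after the input is exhausted.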

+0

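Thanks for your answer, but if you read my whole question, the problem happens when the next() method is called, and that method should stop iterating once the reader hits EOF, right? But that is not what happens. I don't know why, and that's why I'm asking here! ;) – chomp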

+0

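Could you please reply, @ash or the upvoters? If you can give me some pointers on what additional information would help solve my problem, I'll gladly update my question and try to provide it. – chomp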

+0

Did you solve this? Without knowing the input format Giraph expects, it's hard for me to help any further. – ash

1

Just a shot in the dark, but have you tried checking whether next() returns null when it reaches the end of the input?

if (method != null) { 
    // Continue 
} 
else { 
    // It's null 
} 
+0

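Hi @thomasjcf21, thanks for your answer. The next() method is not returning null, I'm pretty sure. However, in any iteration, next() should not be called unless hasNext() returns true... but in my case it is, and I don't understand why... – chomp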
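One possible explanation for next() failing even though hasNext() returned true is sketched below. It assumes an iterator whose hasNext() only checks whether unread bytes remain in a buffer, while next() deserializes a whole record. Whether Giraph's VertexIterator works exactly like this is an assumption, not something confirmed in this thread, and `RecordIterator` is a hypothetical class written for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Iterator;

public class ByteBackedIteratorDemo {

    /** Hypothetical iterator: hasNext() looks at remaining bytes, next() deserializes. */
    static final class RecordIterator implements Iterator<String> {
        private final DataInputStream in;

        RecordIterator(byte[] data) {
            this.in = new DataInputStream(new ByteArrayInputStream(data));
        }

        @Override
        public boolean hasNext() {
            try {
                // True as long as at least one byte is left, complete record or not.
                return in.available() > 0;
            } catch (IOException e) {
                throw new IllegalStateException("hasNext: IOException", e);
            }
        }

        @Override
        public String next() {
            try {
                return in.readUTF();
            } catch (IOException e) {
                throw new IllegalStateException("next: IOException", e);
            }
        }
    }

    /** Returns true if hasNext() reported more data but next() still failed. */
    static boolean failsDespiteHasNext(byte[] data) {
        RecordIterator it = new RecordIterator(data);
        while (it.hasNext()) {
            try {
                it.next();
            } catch (IllegalStateException e) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] complete = {0, 1, 'a'};     // one well-formed record: "a"
        byte[] trailing = {0, 1, 'a', 0};  // same record plus one stray byte
        System.out.println(failsDespiteHasNext(complete));
        System.out.println(failsDespiteHasNext(trailing));
    }
}
```

Under that assumption, any disagreement between how many bytes were written and how many are read per record leaves the buffer in a state where hasNext() is true and next() cannot complete.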

+0

Could you give us more details about your iteration? – AlbertFG

+0

Problem solved, please check my latest answer. – chomp