2014-07-17 49 views

OutOfOrderScannerNextException when filtering results in HBase

I am trying to filter results in HBase this way:

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Range filters on cf:source and cf:target; rows missing the column are skipped.
List<Filter> andFilterList = new ArrayList<>();
SingleColumnValueFilter sourceLowerFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("source"), CompareFilter.CompareOp.GREATER, Bytes.toBytes(lowerLimit));
sourceLowerFilter.setFilterIfMissing(true);
SingleColumnValueFilter sourceUpperFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("source"), CompareFilter.CompareOp.LESS_OR_EQUAL, Bytes.toBytes(upperLimit));
sourceUpperFilter.setFilterIfMissing(true);
SingleColumnValueFilter targetLowerFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("target"), CompareFilter.CompareOp.GREATER, Bytes.toBytes(lowerLimit));
targetLowerFilter.setFilterIfMissing(true);
SingleColumnValueFilter targetUpperFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("target"), CompareFilter.CompareOp.LESS_OR_EQUAL, Bytes.toBytes(upperLimit));
targetUpperFilter.setFilterIfMissing(true);

andFilterList.add(sourceUpperFilter);
andFilterList.add(targetUpperFilter);

FilterList andFilter = new FilterList(FilterList.Operator.MUST_PASS_ALL, andFilterList);

List<Filter> orFilterList = new ArrayList<>();
orFilterList.add(sourceLowerFilter);
orFilterList.add(targetLowerFilter);
FilterList orFilter = new FilterList(FilterList.Operator.MUST_PASS_ONE, orFilterList);

// Overall filter: (sourceUpper && targetUpper) && (sourceLower || targetLower)
FilterList fl = new FilterList(FilterList.Operator.MUST_PASS_ALL);
fl.addFilter(andFilter);
fl.addFilter(orFilter);

Scan edgeScan = new Scan();
edgeScan.setFilter(fl);
ResultScanner edgeScanner = table.getScanner(edgeScan);
Result edgeResult;
logger.info("Writing edges...");
while ((edgeResult = edgeScanner.next()) != null) {
    // Some code
}

This code throws the following error:

org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout? 
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:402) 
    at org.deustotech.internet.phd.framework.rdf2subdue.RDF2Subdue.writeFile(RDF2Subdue.java:150) 
    at org.deustotech.internet.phd.framework.rdf2subdue.RDF2Subdue.run(RDF2Subdue.java:39) 
    at org.deustotech.internet.phd.Main.main(Main.java:32) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 178 number_of_rows: 100 close_scanner: false next_call_seq: 0 
    at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3098) 
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497) 
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012) 
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98) 
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168) 
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39) 
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111) 
    at java.lang.Thread.run(Thread.java:745) 

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) 
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526) 
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) 
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) 
    at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285) 
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204) 
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59) 
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114) 
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90) 
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:354) 
    ... 9 more 
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException): org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 178 number_of_rows: 100 close_scanner: false next_call_seq: 0 
    at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3098) 
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497) 
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012) 
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98) 
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168) 
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39) 
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111) 
    at java.lang.Thread.run(Thread.java:745) 

    at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1453) 
    at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1657) 
    at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1715) 
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29900) 
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174) 
    ... 13 more 

The RPC timeout is set to 600000. I have tried removing some of the filters, with these results:

  • sourceUpperFilter && (sourceLowerFilter || targetLowerFilter) → success
  • targetUpperFilter && (sourceLowerFilter || targetLowerFilter) → success
  • (sourceUpperFilter && targetUpperFilter) && (sourceLowerFilter) → fail
  • (sourceUpperFilter && targetUpperFilter) && (targetLowerFilter) → fail

Any help would be appreciated. Thank you.

Hi Mikel, did you ever find a solution to this problem? –

Unfortunately, no... I left HBase and started using Hypertable... –

Please change the RPC timeout from 600000 to 1800000; see my answer. –

Answers

Reason: the scan is looking for only a few matching rows in a big region. Filling the requested number of rows on the client side takes time, and before the batch is filled the client hits the RPC timeout. The client then retries the call on the same scanner. Remember that a next call means "give me the next N rows from where you are." The old, failed call is still in progress on the server and has already advanced past some rows, so the retried call would miss those rows. To avoid this, and to distinguish this situation, we have the scan sequence number and this exception. On seeing it, the client closes the scanner and creates a new one with the proper start row. But this retry happens only once, and that call may also time out.

So we have to tune the timeout and/or the scan caching value.
The heartbeat mechanism avoids timeouts on long-running scans.

In our case the data in HBase is huge, so we used rpc timeout = 1800000 and lease period = 1800000, together with fuzzy row filters and scan.setCaching(xxxx) // value needs to be adjusted;
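For reference, the two timeout values quoted above would typically go into the client-side hbase-site.xml. This is a sketch: the scanner lease property has changed names across HBase versions, so check which one your release uses.

```xml
<!-- Client-side hbase-site.xml sketch: raise both timeouts to 30 minutes,
     matching the 1800000 ms values quoted above. -->
<property>
  <name>hbase.rpc.timeout</name>
  <value>1800000</value>
</property>
<property>
  <!-- Older releases call this hbase.regionserver.lease.period;
       newer ones use hbase.client.scanner.timeout.period. -->
  <name>hbase.client.scanner.timeout.period</name>
  <value>1800000</value>
</property>
```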

Note: value filters are slower than row filters, because they require a full table scan, which takes a long time to execute.

With all of the above precautions, we successfully queried huge amounts of data from HBase via MapReduce.

Hope this explanation helps.


I solved this problem by setting hbase.client.scanner.caching to a lower value.

See also:

Client and RS maintain a nextCallSeq number during the scan. Every next() call from client to server will increment this number in both sides. Client passes this number along with the request and at RS side both the incoming nextCallSeq and its nextCallSeq will be matched. In case of a timeout this increment at the client side should not happen. If at the server side fetching of next batch of data was over, there will be mismatch in the nextCallSeq number. Server will throw OutOfOrderScannerNextException and then client will reopen the scanner with startrow as the last successfully retrieved row.
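The handshake described above can be sketched as a toy simulation in plain Java, with no HBase dependency. All class and method names here are illustrative, not real HBase client code; the point is only to show why a retried next() with a stale sequence number is rejected.

```java
// Toy model of the scanner nextCallSeq protocol: the server advances its
// counter on every successful next(), and rejects calls whose sequence
// number does not match.
class ScannerServer {
    long nextCallSeq = 0;

    String[] next(long clientCallSeq, int rows) {
        if (clientCallSeq != nextCallSeq) {
            throw new IllegalStateException(
                "OutOfOrderScannerNext: expected " + nextCallSeq
                + " but got " + clientCallSeq);
        }
        nextCallSeq++;            // server-side increment
        return new String[rows];  // pretend we fetched `rows` rows
    }
}

public class SeqDemo {
    public static void main(String[] args) {
        ScannerServer server = new ScannerServer();
        long clientSeq = 0;

        // First call succeeds: both counters advance to 1.
        server.next(clientSeq, 100);
        clientSeq++;

        // Second call: the server processes it (its counter goes to 2),
        // but the reply times out, so the client-side increment is lost.
        server.next(clientSeq, 100);
        // clientSeq stays at 1

        try {
            server.next(clientSeq, 100);  // retry with the stale seq
        } catch (IllegalStateException e) {
            // The mismatch is detected, just like the exception in the question.
            System.out.println(e.getMessage());
        }
    }
}
```

Running this prints the expected/got mismatch, after which a real client would reopen the scanner at the last successfully retrieved row.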

Since the problem is caused by a client-side timeout, the fix is either to reduce the client scanner cache size (hbase.client.scanner.caching) or to increase the RPC timeout (hbase.rpc.timeout).
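The trade-off can be made concrete with a back-of-the-envelope check: one next() batch must be filled within the RPC timeout, so caching × per-row cost has to stay below hbase.rpc.timeout. This is a toy calculation, and the per-row cost figure is purely an assumption for illustration:

```java
public class CachingBudget {
    // Rough upper bound on scanner caching so that one next() batch
    // can be filled inside the RPC timeout.
    static long maxCaching(long rpcTimeoutMs, double msPerRow) {
        return (long) (rpcTimeoutMs / msPerRow);
    }

    public static void main(String[] args) {
        long rpcTimeoutMs = 600_000;  // 10 min, as in the question
        double msPerRow = 10_000;     // assumed: heavy filter, sparse matches
        // With 10 s per returned row, at most 60 rows fit in the timeout,
        // so the default caching of 100 rows cannot be filled in time.
        System.out.println("max safe caching ~ "
            + maxCaching(rpcTimeoutMs, msPerRow));
    }
}
```

Under these assumed numbers the batch of 100 rows cannot be delivered before the timeout fires, which is exactly the retry scenario that triggers OutOfOrderScannerNextException.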

I hope this answer helps solve the problem.

Please copy the essential parts of the linked content into your answer here. – miraclefoxx