2016-12-01

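Filtering a specific row in Kudu using a Kudu scanner

The target table in Kudu is very large. I have the following in Scala, and I want to check whether a given row exists in Kudu. These four columns form the primary key of the Kudu table, but when I define an upper bound, I seem to get all of the rows back.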

How do I select one specific row in Kudu? Here, I expect only a single row to be returned.

val table2: KuduTable = kuduClient.openTable("event-sets")
val eventColumns: util.List[String] = List(
  OccurrenceSchema.SetId.name,
  OccurrenceSchema.Period.name,
  OccurrenceSchema.Event.name,
  OccurrenceSchema.Date.name).asJava

val end: PartialRow = table2.getSchema.newPartialRow()
end.addInt(OccurrenceSchema.Period.name, 1476)
end.addInt(OccurrenceSchema.SetId.name, 82)
end.addInt(OccurrenceSchema.Event.name, 3195167)
end.addLong(OccurrenceSchema.Date.name, 1367922840000L)

val kuduScanner: KuduScanner = kuduClient.newScannerBuilder(table2)
  .setProjectedColumnNames(eventColumns)
  .lowerBound(end)
  .exclusiveUpperBound(end)
  .build()

assert(kuduScanner.hasMoreRows)
while (kuduScanner.hasMoreRows) {
  val resultIterator: RowResultIterator = kuduScanner.nextRows()
  while (resultIterator.hasNext) {
    val result: RowResult = resultIterator.next()
    assert(result != null)
    logger.info(" : SetId Value -- " + result.getInt(OccurrenceSchema.SetId.name))
    logger.info(" : Period Value -- " + result.getInt(OccurrenceSchema.Period.name))
    logger.info(" : Event Value -- " + result.getInt(OccurrenceSchema.Event.name))
    logger.info(" : Date Value -- " + result.getLong(OccurrenceSchema.Date.name))
  }
}

Answer

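As I understand it, you are looking for exactly one record in the table. Using a scanner with bounds and/or a limit did not work for me either. Instead, I solved the problem by defining one KuduPredicate per primary key column. My solution is below.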

import org.apache.kudu.client.{KuduPredicate, KuduScanner, RowResult, RowResultIterator}
import org.apache.kudu.client.KuduScanner.KuduScannerBuilder

val builder: KuduScannerBuilder = kuduClient.newScannerBuilder(table2)
// define the columns you want to select
builder.setProjectedColumnNames(eventColumns)

// add predicates to select a record by primary key;
// newComparisonPredicate expects a ColumnSchema, so each column is looked up in the table schema
val schema = table2.getSchema
val pkPeriod: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.Period.name), KuduPredicate.ComparisonOp.EQUAL, 1476L)
builder.addPredicate(pkPeriod)
val pkSetId: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.SetId.name), KuduPredicate.ComparisonOp.EQUAL, 82L)
builder.addPredicate(pkSetId)
val pkEvent: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.Event.name), KuduPredicate.ComparisonOp.EQUAL, 3195167L)
builder.addPredicate(pkEvent)
val pkDate: KuduPredicate = KuduPredicate.newComparisonPredicate(schema.getColumn(OccurrenceSchema.Date.name), KuduPredicate.ComparisonOp.EQUAL, 1367922840000L)
builder.addPredicate(pkDate)

val kuduScanner: KuduScanner = builder.build()

while (kuduScanner.hasMoreRows) {
  val resultIterator: RowResultIterator = kuduScanner.nextRows()
  while (resultIterator.hasNext) {
    val result: RowResult = resultIterator.next()

    // do whatever you have to do with the selected record
    logger.info(" : SetId Value -- " + result.getInt(OccurrenceSchema.SetId.name))
  }
}

I am new to Kudu, so I am not sure whether this is the most efficient solution. At least it returns the expected result.

My original code was written and tested in Java. I ported it to Scala by hand, but I have not tested the Scala version yet!
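
Since the question is really "does this row exist?", the predicate approach could be wrapped in a small helper along the following lines. This is only a sketch, not tested against a cluster: `KuduLookup`, `equalityPredicates` and `rowExists` are hypothetical names, and it assumes every key column is an integer type (INT32/INT64), as the `addInt`/`addLong` calls in the question suggest. It also uses `scala.collection.JavaConverters`, which matches the `.asJava` style of the question but is deprecated on newer Scala versions.

```scala
import org.apache.kudu.Schema
import org.apache.kudu.client.{KuduClient, KuduPredicate, KuduTable}
import org.apache.kudu.client.KuduPredicate.ComparisonOp

import scala.collection.JavaConverters._

object KuduLookup {

  // Builds one EQUAL predicate per (column name, value) pair.
  // Assumption: all key columns are integer-typed, so the `long`
  // overload of newComparisonPredicate applies to each of them.
  def equalityPredicates(schema: Schema, key: Seq[(String, Long)]): Seq[KuduPredicate] =
    key.map { case (column, value) =>
      KuduPredicate.newComparisonPredicate(schema.getColumn(column), ComparisonOp.EQUAL, value)
    }

  // Returns true if at least one row matches all of the key predicates.
  def rowExists(client: KuduClient, table: KuduTable, key: Seq[(String, Long)]): Boolean = {
    val builder = client.newScannerBuilder(table)
    // Project only the key columns; the values themselves are not needed.
    builder.setProjectedColumnNames(key.map(_._1).asJava)
    equalityPredicates(table.getSchema, key).foreach(p => builder.addPredicate(p))

    val scanner = builder.build()
    try {
      var found = false
      // Stop at the first non-empty batch.
      while (!found && scanner.hasMoreRows) {
        found = scanner.nextRows().getNumRows > 0
      }
      found
    } finally {
      scanner.close()
    }
  }
}
```

With the values from the question, a call might look like `KuduLookup.rowExists(kuduClient, table2, Seq(OccurrenceSchema.SetId.name -> 82L, OccurrenceSchema.Period.name -> 1476L, OccurrenceSchema.Event.name -> 3195167L, OccurrenceSchema.Date.name -> 1367922840000L))`.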