2017-02-23 98 views

Combining Kafka with Apache Calcite

I am trying to integrate Apache Calcite with Kafka, using CsvStreamableTable as a reference.

Each ConsumerRecord is converted to an Object[] using the following code:

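import java.util.List;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.util.Utf8;
import org.apache.kafka.clients.consumer.ConsumerRecord;

static class ArrayRowConverter extends RowConverter<Object[]> {
    private final List<Schema.Field> fields;

    public ArrayRowConverter(List<Schema.Field> fields) {
        this.fields = fields;
    }

    @Override
    Object[] convertRow(ConsumerRecord<String, GenericRecord> consumerRecord) {
        // One extra slot: column 0 holds the Kafka record timestamp (rowtime).
        Object[] objects = new Object[fields.size() + 1];
        int i = 0;
        objects[i++] = consumerRecord.timestamp();
        for (Schema.Field field : this.fields) {
            Object obj = consumerRecord.value().get(field.name());
            if (obj instanceof Utf8) {
                // Avro strings arrive as Utf8; convert them to java.lang.String.
                objects[i++] = obj.toString();
            } else {
                objects[i++] = obj;
            }
        }
        return objects;
    }
}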

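The enumerator is implemented as follows. A separate thread continuously polls records from Kafka and puts them into a queue, and the getRecord() method polls from that queue: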

public E current() {
    return current;
}

public boolean moveNext() {
    for (;;) {
        if (cancelFlag.get()) {
            return false;
        }
        ConsumerRecord<String, GenericRecord> record = getRecord();
        if (record == null) {
            // Queue is empty: wait briefly, then check again.
            try {
                Thread.sleep(200L);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and end the stream instead of swallowing it.
                Thread.currentThread().interrupt();
                return false;
            }
            continue;
        }
        current = rowConvert.convertRow(record);
        return true;
    }
}
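For comparison, here is a minimal, Kafka-free sketch of the same queue-backed pattern. It assumes the queue is a java.util.concurrent.BlockingQueue, so a timed poll can replace the sleep-and-retry loop. QueueCursorDemo, QueueCursor and the converter function are hypothetical names used only for illustration; they are not part of Calcite or of the code above.

```java
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;

class QueueCursorDemo {
    /** Hypothetical cursor that mirrors the current()/moveNext() pair above. */
    static class QueueCursor<R, E> {
        private final BlockingQueue<R> queue;
        private final Function<R, E> converter;
        private final AtomicBoolean cancelFlag;
        private E current;

        QueueCursor(BlockingQueue<R> queue, Function<R, E> converter, AtomicBoolean cancelFlag) {
            this.queue = queue;
            this.converter = converter;
            this.cancelFlag = cancelFlag;
        }

        E current() {
            return current;
        }

        boolean moveNext() {
            while (!cancelFlag.get()) {
                final R record;
                try {
                    // Wait up to 200 ms for a record, then re-check the cancel flag.
                    record = queue.poll(200L, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
                if (record != null) {
                    current = converter.apply(record);
                    return true;
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicBoolean cancel = new AtomicBoolean(false);
        QueueCursor<String, Object[]> cursor =
            new QueueCursor<>(queue, s -> new Object[] {s.length(), s}, cancel);

        // Stand-in for the thread that polls Kafka and fills the queue.
        new Thread(() -> {
            queue.add("click-1");
            queue.add("click-2");
        }).start();

        // Read exactly two rows, then cancel so moveNext() returns false.
        for (int i = 0; i < 2 && cursor.moveNext(); i++) {
            System.out.println(Arrays.toString(cursor.current()));
        }
        cancel.set(true);
        System.out.println(cursor.moveNext());
    }
}
```

The timed poll wakes up as soon as a record arrives, while still checking the cancel flag at least every 200 ms.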

I tested SELECT STREAM * FROM Kafka.clicks and it works fine. rowtime is explicitly added as the first column, and its value is the Kafka record timestamp.

But when I run

SELECT STREAM FLOOR(rowtime TO HOUR) AS rowtime,ip,COUNT(*) AS c FROM KAFKA.clicks GROUP BY FLOOR(rowtime TO HOUR), ip

it throws an exception:

java.sql.SQLException: Error while executing SQL "SELECT STREAM FLOOR(rowtime TO HOUR) AS rowtime,ip,COUNT(*) AS c FROM KAFKA.clicks GROUP BY FLOOR(rowtime TO HOUR), ip": From line 1, column 85 to line 1, column 119: Streaming aggregation requires at least one monotonic expression in GROUP BY clause 
    at org.apache.calcite.avatica.Helper.createException(Helper.java:56) 
    at org.apache.calcite.avatica.Helper.createException(Helper.java:41) 

Answer


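You need to declare that the "ROWTIME" column is monotonic. In MockCatalogReader, note how "ROWTIME" is declared monotonic in the "ORDERS" and "SHIPMENTS" streams. That is why some queries in SqlValidatorTest.testStreamGroupBy() are valid and others are not. The key method the validator relies on is SqlValidatorTable.getMonotonicity(String columnName).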
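To illustrate why the declaration matters, here is a simplified, self-contained model of the check. It is not Calcite's implementation: the Monotonicity enum, StreamModel and canStreamGroupBy are hypothetical stand-ins, and the only idea taken from the answer is that the validator asks the table for each column's monotonicity. The model also assumes that FLOOR(col TO unit) preserves ordering and so inherits the monotonicity of its column.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class MonotonicityDemo {
    /** Simplified stand-in for Calcite's monotonicity values; not the real enum. */
    enum Monotonicity { INCREASING, NOT_MONOTONIC }

    /** Hypothetical model of a stream whose columns may declare a monotonicity. */
    static class StreamModel {
        private final Map<String, Monotonicity> declared;

        StreamModel(Map<String, Monotonicity> declared) {
            this.declared = declared;
        }

        /** Plays the role of getMonotonicity(String columnName). */
        Monotonicity getMonotonicity(String columnName) {
            return declared.getOrDefault(columnName, Monotonicity.NOT_MONOTONIC);
        }

        /** Assumption: FLOOR keeps the ordering of its argument. */
        Monotonicity floorOf(String columnName) {
            return getMonotonicity(columnName);
        }

        /** A streaming GROUP BY is accepted only if at least one key is monotonic. */
        boolean canStreamGroupBy(List<Monotonicity> groupKeys) {
            for (Monotonicity m : groupKeys) {
                if (m != Monotonicity.NOT_MONOTONIC) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Models GROUP BY FLOOR(rowtime TO HOUR), ip from the question. */
    static boolean check(StreamModel clicks) {
        return clicks.canStreamGroupBy(
            Arrays.asList(clicks.floorOf("rowtime"), clicks.getMonotonicity("ip")));
    }

    public static void main(String[] args) {
        StreamModel declared = new StreamModel(
            Collections.singletonMap("rowtime", Monotonicity.INCREASING));
        StreamModel undeclared = new StreamModel(
            Collections.<String, Monotonicity>emptyMap());

        System.out.println(check(declared));   // true
        System.out.println(check(undeclared)); // false
    }
}
```

The second case corresponds to the error in the question: no column declares a monotonicity, so no GROUP BY key qualifies. In Calcite itself, one route that may work is reporting an ascending collation on the rowtime column through the table's getStatistic(); treat that as an assumption to verify against the Calcite version in use.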


Thanks, Julian. Is there a simple way to declare a column monotonic, or should I implement it the way MockTable does? – user2283216