Lily with Morphline and HBase

I am trying to follow Cloudera's tutorial on Lily with Morphline and HBase (http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/search_hbase_batch_indexer.html).

I have code that inserts Avro-formatted objects into HBase, and I want to index them into Solr, but nothing gets indexed.

I have been looking at the logs:

15/06/12 00:45:00 TRACE morphline.ExtractHBaseCellsBuilder$ExtractHBaseCells: beforeNotify: {lifecycle=[START_SESSION]} 
15/06/12 00:45:00 TRACE morphline.ExtractHBaseCellsBuilder$ExtractHBaseCells: beforeProcess: {_attachment_body=[keyvalues={0Name178721/data:avroUser/1434094131495/Put/vlen=237/seqid=0}], _attachment_mimetype=[application/java-hbase-result]} 
15/06/12 00:45:00 DEBUG indexer.Indexer$RowBasedIndexer: Indexer _default_ will send to Solr 0 adds and 0 deletes 
15/06/12 00:45:00 TRACE morphline.ExtractHBaseCellsBuilder$ExtractHBaseCells: beforeNotify: {lifecycle=[START_SESSION]} 
15/06/12 00:45:00 TRACE morphline.ExtractHBaseCellsBuilder$ExtractHBaseCells: beforeProcess: {_attachment_body=[keyvalues={1Name134339/data:avroUser/1434094131495/Put/vlen=237/seqid=0}], _attachment_mimetype=[application/java-hbase-result]}
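
When the log is long, it can help to total the "adds" and "deletes" the indexer reports, to confirm that rows are being read but no documents are produced. The following is a minimal sketch that assumes the summary line has exactly the wording shown in the log above; the function name `solr_totals` is made up for illustration.

```python
import re

# Matches the per-batch summary line, using the wording seen in the log above.
SUMMARY = re.compile(r"will send to Solr (\d+) adds and (\d+) deletes")


def solr_totals(log_text):
    """Return (total_adds, total_deletes) summed over all summary lines."""
    adds = deletes = 0
    for match in SUMMARY.finditer(log_text):
        adds += int(match.group(1))
        deletes += int(match.group(2))
    return adds, deletes


sample = (
    "15/06/12 00:45:00 DEBUG indexer.Indexer$RowBasedIndexer: "
    "Indexer _default_ will send to Solr 0 adds and 0 deletes\n"
)
print(solr_totals(sample))
```

A total of zero adds while `beforeProcess` lines keep appearing suggests that the records are dropped somewhere inside the morphline rather than never reaching it.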

So I am reading them, but I don't know why nothing is being indexed in Solr. I suspect my morphline.conf is wrong.

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "org.apache.solr.**", "com.ngdata.**"]
    commands : [
      {
        extractHBaseCells {
          mappings : [
            {
              inputColumn : "data:avroUser"
              outputField : "_attachment_body"
              type : "byte[]"
              source : value
            }
          ]
        }
      }

      # for avro use with type : "byte[]" in extractHBaseCells mapping above
      { readAvroContainer {} }
      {
        extractAvroPaths {
          flatten : true
          paths : {
            name : /name
          }
        }
      }
      { logTrace { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]

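I don't know whether I need an "_attachment_body" field in Solr, but it does not seem to be necessary, so I suspect that readAvroContainer or extractAvroPaths is wrong. I have a "name" field in Solr, and my avroUser also has a "name" field.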
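
One thing worth checking here, offered as a possibility rather than a diagnosis: readAvroContainer expects the Avro object container format, which embeds the schema and begins with the four bytes `O`, `b`, `j`, `0x01`. If the code that writes to HBase uses a plain binary encoder instead, the cell value has no such header, and a command that takes the writer schema explicitly (Kite's readAvro, as far as I know) would be the one to try. The sketch below only tests for the header; the sample payloads are hypothetical stand-ins for the bytes stored in data:avroUser.

```python
# Avro object container files start with these four magic bytes.
AVRO_CONTAINER_MAGIC = b"Obj\x01"


def looks_like_avro_container(cell_value: bytes) -> bool:
    """Return True if the bytes start with the Avro container file header."""
    return cell_value[:4] == AVRO_CONTAINER_MAGIC


# Hypothetical payloads for illustration only.
# A container-style value: magic bytes followed by placeholder content.
container_like = AVRO_CONTAINER_MAGIC + b"\x00" * 8
# A raw binary-encoded User("Alice", null, null): length-prefixed string,
# then union branch index 1 (null) for each of the two optional fields.
raw_binary_like = b"\x0aAlice\x02\x02"

print(looks_like_avro_container(container_like))
print(looks_like_avro_container(raw_binary_like))
```

Running the same check on a value fetched from the table shows which of the two formats the writer actually produced.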

{"namespace": "example.avro", 
"type": "record", 
"name": "User", 
"fields": [ 
    {"name": "name", "type": "string"}, 
    {"name": "favorite_number", "type": ["int", "null"]}, 
    {"name": "favorite_color", "type": ["string", "null"]} 
] 
}
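
As a quick sanity check that the `/name` path used in extractAvroPaths refers to a field that exists in this schema, the schema can be parsed with the standard library. This is a sketch only: it handles single-segment paths such as `/name`, and the helper names are invented for illustration.

```python
import json

# The schema from the question, reproduced verbatim.
SCHEMA_JSON = """
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number", "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
"""


def top_level_fields(schema_text):
    """Return the names of the record's top-level fields, in schema order."""
    schema = json.loads(schema_text)
    return [field["name"] for field in schema["fields"]]


def path_matches_field(path, schema_text):
    """Check a single-segment path like "/name" against the top-level fields."""
    return path.lstrip("/") in top_level_fields(schema_text)


print(top_level_fields(SCHEMA_JSON))
print(path_matches_field("/name", SCHEMA_JSON))
```

Since `/name` does match a field, the path mapping itself looks consistent with the schema, which points the search toward the step that decodes the bytes.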


2015-06-12 Guille

I have this working fine here. These are the steps I followed:

1) Install hbase-solr-indexer as a service: first of all, you have to install hbase-solr-indexer. See: installing hbase-solr-indexing as a service

Add the Cloudera repos to your yum repos, then type:

sudo yum install hbase-solr-indexer

2) Create the morphline file: well, you have already done that.

3) Set the replication scope on each column family and register the HBase indexer configuration

See: Using the Lily HBase NRT Indexer Service

$ hbase shell 
hbase shell> disable 'record' 
hbase shell> alter 'record', {NAME => 'data', REPLICATION_SCOPE => 1} 
hbase shell> enable 'record'
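
The registration part of this step is not shown above. As a rough sketch, the `hbase-indexer add-indexer` command can be assembled as below. The indexer name, file name, ZooKeeper hosts and collection name are all placeholders I made up, so substitute the values for your cluster, and check the flags against the tutorial for your version.

```python
import shlex


def add_indexer_command(name, indexer_conf, solr_zk, collection, hbase_zk):
    """Build the argument list for registering an indexer with hbase-indexer."""
    return [
        "hbase-indexer", "add-indexer",
        "--name", name,
        "--indexer-conf", indexer_conf,
        "--connection-param", f"solr.zk={solr_zk}",
        "--connection-param", f"solr.collection={collection}",
        "--zookeeper", hbase_zk,
    ]


# Placeholder values for illustration only; none of these come from the question.
cmd = add_indexer_command(
    name="myIndexer",
    indexer_conf="morphline-hbase-mapper.xml",
    solr_zk="zk-host:2181/solr",
    collection="hbase-collection1",
    hbase_zk="zk-host:2181",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` on a host where hbase-indexer is installed.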

Try following the other tutorial above. ;) I had problems with the NRT solution, but once I followed the whole tutorial step by step, it worked.

I hope this helps someone.


2016-06-16 19:46:43
