7

Hierarchical faceting with Elasticsearch

I am using Elasticsearch and need to implement faceted search over hierarchical objects, like this:

  • Category 1 (10)
    • Subcategory 1 (4)
    • Subcategory 2 (6)
  • Category 2 (X)
    • ...

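So I need facets over two related fields. The documentation says this kind of facet is possible for numeric values, but I need it for strings: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-stats-facet.html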

Here is another interesting thread, but unfortunately it is old: http://elasticsearch-users.115913.n3.nabble.com/Pivot-facets-td2981519.html

Is this possible with Elasticsearch? If so, how do I do it?

Answers

3

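Currently, Elasticsearch does not support hierarchical faceting out of the box. However, the upcoming 1.0 release includes a new aggregations module that can be used to get this kind of facet (more like pivot facets than hierarchical facets). Version 1.0 is currently in beta; you can download the second beta and try aggregations yourself. Your example might look like this: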

curl -XPOST 'localhost:9200/_search?pretty' -d '
{
    "aggregations": {
        "main category": {
            "terms": {
                "field": "cat_1",
                "order": {"_term": "asc"}
            },
            "aggregations": {
                "sub category": {
                    "terms": {
                        "field": "cat_2",
                        "order": {"_term": "asc"}
                    }
                }
            }
        }
    }
}'

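The idea is to have a separate field for each facet level and to bucket your facets by the terms of the first level (cat_1). Each of these buckets then has sub-buckets based on the terms of the second level (cat_2). The result might look like this: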

{
    "aggregations" : {
        "main category" : {
            "buckets" : [ {
                "key" : "category 1",
                "doc_count" : 10,
                "sub category" : {
                    "buckets" : [ {
                        "key" : "subcategory 1",
                        "doc_count" : 4
                    }, {
                        "key" : "subcategory 2",
                        "doc_count" : 6
                    } ]
                }
            }, {
                "key" : "category 2",
                "doc_count" : 7,
                "sub category" : {
                    "buckets" : [ {
                        "key" : "subcategory 1",
                        "doc_count" : 3
                    }, {
                        "key" : "subcategory 2",
                        "doc_count" : 4
                    } ]
                }
            } ]
        }
    }
}
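
To turn a response shaped like the one above into the facet list from the question, one could walk the buckets recursively. This is only a minimal Python sketch and not part of the original answer: the function name `facet_lines` is hypothetical, and it assumes every sub-aggregation is a bucket aggregation such as `terms`.

```python
def facet_lines(aggregations, depth=0):
    """Return indented 'key (doc_count)' lines for nested bucket aggregations."""
    lines = []
    for agg in aggregations.values():
        # Skip plain values such as "key" or "doc_count"; only dicts that
        # carry a "buckets" list are treated as bucket aggregations.
        if not isinstance(agg, dict) or "buckets" not in agg:
            continue
        for bucket in agg["buckets"]:
            lines.append("  " * depth + f"{bucket['key']} ({bucket['doc_count']})")
            # Sub-aggregations sit next to the bucket's key and doc_count.
            lines.extend(facet_lines(bucket, depth + 1))
    return lines


# The same data as the sample response above.
response = {
    "aggregations": {
        "main category": {
            "buckets": [
                {"key": "category 1", "doc_count": 10,
                 "sub category": {"buckets": [
                     {"key": "subcategory 1", "doc_count": 4},
                     {"key": "subcategory 2", "doc_count": 6}]}},
                {"key": "category 2", "doc_count": 7,
                 "sub category": {"buckets": [
                     {"key": "subcategory 1", "doc_count": 3},
                     {"key": "subcategory 2", "doc_count": 4}]}},
            ]
        }
    }
}

print("\n".join(facet_lines(response["aggregations"])))
```

Because the function recurses into every bucket, it should handle deeper hierarchies without changes, as long as each level is a bucket aggregation.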

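Thanks! I also found a bug on GitHub and related posts saying it will be fixed in ES 1.0. The implementation is already available in beta 2. Playing with it now :) Thanks! – zonder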

5

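The previous solution works really well until you have more than one multi-level tag on a single document. In that case a simple aggregation does not work, because the flat structure of Lucene fields mixes the results of the inner aggregation. See the example below: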

DELETE /test_category 
POST /test_category 

# Insert a doc with 2 hierarchical tags 
POST /test_category/test/1 
{ 
    "categories": [ 
    { 
     "cat_1": "1", 
     "cat_2": "1.1" 
    }, 
    { 
     "cat_1": "2", 
     "cat_2": "2.2" 
    } 
    ] 
} 

# Simple two-levels aggregations query 
GET /test_category/test/_search?search_type=count 
{
    "aggs": {
        "main_category": {
            "terms": {
                "field": "categories.cat_1"
            },
            "aggs": {
                "sub_category": {
                    "terms": {
                        "field": "categories.cat_2"
                    }
                }
            }
        }
    }
}

This is the wrong response I got on ES 1.4, where the values of the inner aggregation's field are mixed at the document level:

{
    ...
    "aggregations": {
        "main_category": {
            "buckets": [
                {
                    "key": "1",
                    "doc_count": 1,
                    "sub_category": {
                        "buckets": [
                            {
                                "key": "1.1",
                                "doc_count": 1
                            },
                            {
                                "key": "2.2", <= WRONG
                                "doc_count": 1
                            }
                        ]
                    }
                },
                {
                    "key": "2",
                    "doc_count": 1,
                    "sub_category": {
                        "buckets": [
                            {
                                "key": "1.1", <= WRONG
                                "doc_count": 1
                            },
                            {
                                "key": "2.2",
                                "doc_count": 1
                            }
                        ]
                    }
                }
            ]
        }
    }
}
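
Why the mixing happens can be illustrated without a cluster. The Python sketch below only imitates the effect: it is a simplified model, not how Lucene or Elasticsearch is actually implemented, and the helper names `flatten`, `flat_sub_buckets` and `separate_sub_buckets` are made up for illustration. The assumption is that an array of plain objects is indexed roughly as one list of values per field, so the pairing between `cat_1` and `cat_2` is lost.

```python
from collections import Counter, defaultdict

doc = {"categories": [{"cat_1": "1", "cat_2": "1.1"},
                      {"cat_1": "2", "cat_2": "2.2"}]}


def flatten(document):
    """Collect the values of each inner field into one list per document."""
    flat = defaultdict(list)
    for obj in document["categories"]:
        for field, value in obj.items():
            flat["categories." + field].append(value)
    return dict(flat)


def flat_sub_buckets(documents):
    """Imitate terms(cat_1) > terms(cat_2) over flattened documents."""
    buckets = defaultdict(Counter)
    for flat in map(flatten, documents):
        for main in set(flat["categories.cat_1"]):
            # Every cat_2 value of the document lands in every cat_1 bucket.
            for sub in set(flat["categories.cat_2"]):
                buckets[main][sub] += 1
    return {main: dict(subs) for main, subs in buckets.items()}


def separate_sub_buckets(documents):
    """Imitate the same aggregation when each inner object is kept apart."""
    buckets = defaultdict(Counter)
    for document in documents:
        for obj in document["categories"]:
            buckets[obj["cat_1"]][obj["cat_2"]] += 1
    return {main: dict(subs) for main, subs in buckets.items()}


print(flat_sub_buckets([doc]))      # both sub-keys appear under both main keys
print(separate_sub_buckets([doc]))  # each sub-key stays with its own main key
```

In this model the flattened version reproduces the two entries marked WRONG above, while keeping the inner objects apart yields one sub-bucket per category.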

The solution is to use nested objects. These are the steps:

1) Define a new type in the mapping with nested objects

POST /test_category/test2/_mapping 
{
    "test2": {
        "properties": {
            "categories": {
                "type": "nested",
                "properties": {
                    "cat_1": {
                        "type": "string"
                    },
                    "cat_2": {
                        "type": "string"
                    }
                }
            }
        }
    }
}

# Insert a single document 
POST /test_category/test2/1 
{"categories":[{"cat_1":"1","cat_2":"1.1"},{"cat_1":"2","cat_2":"2.2"}]} 

2) Run the nested aggregation query:

GET /test_category/test2/_search?search_type=count 
{
    "aggs": {
        "categories": {
            "nested": {
                "path": "categories"
            },
            "aggs": {
                "main_category": {
                    "terms": {
                        "field": "categories.cat_1"
                    },
                    "aggs": {
                        "sub_category": {
                            "terms": {
                                "field": "categories.cat_2"
                            }
                        }
                    }
                }
            }
        }
    }
}

This is the response I got, which is now correct:

{
    ...
    "aggregations": {
        "categories": {
            "doc_count": 2,
            "main_category": {
                "buckets": [
                    {
                        "key": "1",
                        "doc_count": 1,
                        "sub_category": {
                            "buckets": [
                                {
                                    "key": "1.1",
                                    "doc_count": 1
                                }
                            ]
                        }
                    },
                    {
                        "key": "2",
                        "doc_count": 1,
                        "sub_category": {
                            "buckets": [
                                {
                                    "key": "2.2",
                                    "doc_count": 1
                                }
                            ]
                        }
                    }
                ]
            }
        }
    }
}

The same solution can be extended to facets with more than two levels.
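
As a rough illustration of that extension, the request body could be generated for any number of levels. This is a sketch under assumptions rather than a tested recipe: the function name `nested_facet_query`, the aggregation names `level_1`, `level_2`, ... and the third field `cat_3` are all hypothetical and do not appear in the mapping above.

```python
import json


def nested_facet_query(path, fields):
    """Build a nested aggregation with one terms level per field, outermost first."""
    if not fields:
        raise ValueError("at least one field is required")
    inner = None
    # Build from the deepest level outwards so each level wraps the next one.
    for level, field in reversed(list(enumerate(fields, start=1))):
        agg = {"terms": {"field": f"{path}.{field}"}}
        if inner is not None:
            agg["aggs"] = inner
        inner = {f"level_{level}": agg}
    # Wrap everything in a nested aggregation on the given path.
    return {"aggs": {path: {"nested": {"path": path}, "aggs": inner}}}


# cat_3 is a hypothetical third level, used here only for illustration.
query = nested_facet_query("categories", ["cat_1", "cat_2", "cat_3"])
print(json.dumps(query, indent=2))
```

The printed JSON has the same shape as the two-level query in step 2, with one more `terms` aggregation nested inside; the mapping would also need a matching `cat_3` property inside the nested `categories` object.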