2013-10-01 21 views
3

Elasticsearch data aggregation with facets on an array

I am indexing some companies whose countries attribute is an array of country codes:

curl -XPUT 'http://localhost:9200/test/company/10' -d '{"countries" : ["CH", "CN"], "name" : "company10"}' 
curl -XPUT 'http://localhost:9200/test/company/11' -d '{"countries" : ["AT", "CH", "CN", "DE", "EN", "FR"], "name" : "company11"}' 
curl -XPUT 'http://localhost:9200/test/company/12' -d '{"countries" : ["AT", "CN", "EN", "FR"], "name" : "company12"}' 
curl -XPUT 'http://localhost:9200/test/company/13' -d '{"countries" : ["CH", "CN", "HU"], "name" : "company13"}' 
curl -XPUT 'http://localhost:9200/test/company/14' -d '{"countries" : ["CH", "CN", "EN", "FR"], "name" : "company14"}' 
curl -XPUT 'http://localhost:9200/test/company/15' -d '{"countries" : ["AT", "CN", "DE", "EN", "FR", "HU"], "name" : "company15"}' 
curl -XPUT 'http://localhost:9200/test/company/16' -d '{"countries" : ["AT", "BE", "CH", "DE", "EN", "FR", "HU"], "name" : "company16"}' 
curl -XPUT 'http://localhost:9200/test/company/17' -d '{"countries" : ["BE", "CN", "EN"], "name" : "company17"}' 
curl -XPUT 'http://localhost:9200/test/company/18' -d '{"countries" : ["AT", "CH", "CN", "DE"], "name" : "company18"}' 
curl -XPUT 'http://localhost:9200/test/company/19' -d '{"countries" : ["AT", "CH", "CN", "DE", "EN", "FR", "HU"], "name" : "company19"}' 
curl -XPUT 'http://localhost:9200/test/company/20' -d '{"countries" : ["EN", "FR"], "name" : "company20"}' 
curl -XPUT 'http://localhost:9200/test/company/21' -d '{"countries" : ["AT", "BE", "DE", "FR", "HU"], "name" : "company21"}' 
curl -XPUT 'http://localhost:9200/test/company/22' -d '{"countries" : ["AT", "BE", "CH", "DE", "EN", "FR", "HU"], "name" : "company22"}' 
curl -XPUT 'http://localhost:9200/test/company/23' -d '{"countries" : ["AT", "BE", "CH", "CN", "DE", "EN", "HU"], "name" : "company23"}' 
curl -XPUT 'http://localhost:9200/test/company/24' -d '{"countries" : ["AT", "BE", "CH", "CN", "DE", "EN", "FR"], "name" : "company24"}' 
curl -XPUT 'http://localhost:9200/test/company/25' -d '{"countries" : ["AT", "BE", "CH", "DE", "EN", "FR"], "name" : "company25"}' 
curl -XPUT 'http://localhost:9200/test/company/26' -d '{"countries" : ["AT", "BE", "CH", "CN", "DE", "EN", "FR", "HU"], "name" : "company26"}' 
curl -XPUT 'http://localhost:9200/test/company/27' -d '{"countries" : ["AT", "EN", "FR"], "name" : "company27"}' 
curl -XPUT 'http://localhost:9200/test/company/28' -d '{"countries" : ["CN"], "name" : "company28"}' 
curl -XPUT 'http://localhost:9200/test/company/29' -d '{"countries" : ["BE", "CH", "CN", "EN", "FR"], "name" : "company29"}' 
curl -XPUT 'http://localhost:9200/test/company/30' -d '{"countries" : ["CN"], "name" : "company30"}' 

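I want to aggregate the companies by country code (the countries attribute), that is, count how many companies there are for each country.

Sadly, even this (a count for the AT code alone) does not work: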

curl -XGET 'http://localhost:9200/test/company/_search?pretty=true' -d ' 
{"query" : { "match_all" : {} }, 
"facets" : { 
    "foo" : { 
     "filter" : { 
     "term" : { "countries" : "AT" } 
     } 
    } 
    } 
} 
' 

I am getting:

...

"facets" : { 
    "foo" : { 
    "_type" : "filter", 
    "count" : 0 
    } 
} 

What am I doing wrong?

+0

Is it only 'AT' that doesn't work? Have you tried 'CN'? –

+0

I have now tried CN as well; the facet part of the response is the same as with AT – astropanic

+1

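Hmm ok, I thought this might be because ES does not index stop words (http://stackoverflow.com/questions/17883936/is-there-a-way-to-escape-elasticsearch-stop-words). But if CN doesn't work either, that can't be the whole story. –

Answers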

5

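I think this is because filters are not analyzed: the term filter looks up your input exactly as written, while the countries field was analyzed when it was indexed. AT is a stop word, so it is not indexed at all. You can check this with the _analyze API: http://localhost:9200/test/_analyze?text=AT&field=countries

You can also check a non-stop word such as CN, but that one gets lowercased: http://localhost:9200/test/_analyze?text=CN&field=countries. So cn (which is what is actually stored in the index) does not match the CN in your facet filter.

You can try changing your search to use the lowercase country abbreviation: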
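
To make the mismatch concrete, here is a rough shell sketch of what happens to a single country code. It is only a stand-in, not Elasticsearch code: the function names analyze and term_filter_matches are hypothetical, and it assumes the default analyzer of this Elasticsearch version lowercases each token and drops the standard Lucene English stop words.

```shell
# Assumed stop-word list (the standard Lucene English set).
STOP_WORDS="a an and are as at be but by for if in into is it no not of on or such that the their then there these they this to was will with"

# Hypothetical stand-in for index-time analysis of one token:
# lowercase it, then print nothing if it is a stop word.
analyze() {
  token=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  for w in $STOP_WORDS; do
    [ "$token" = "$w" ] && return 0   # stop word: nothing is indexed
  done
  printf '%s\n' "$token"
}

# A term filter is not analyzed, so its value ($2) is compared
# verbatim with whatever was indexed for the document value ($1).
term_filter_matches() {
  [ "$(analyze "$1")" = "$2" ]
}
```

Under these assumptions, analyze CN prints cn and analyze AT prints nothing, so term_filter_matches CN CN fails while term_filter_matches CN cn succeeds, and no filter value can match AT.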

curl -XGET 'http://localhost:9200/test/company/_search?pretty=true' -d ' 
{"query" : { "match_all" : {} }, 
"facets" : { 
    "foo" : { 
     "filter" : { 
     "term" : { "countries" : "cn" } 
     } 
    } 
    } 
}' 

which returns

"facets" : { 
    "foo" : { 
     "_type" : "filter", 
     "count" : 15 
    } 
    } 

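But I think you should define a mapping for countries with "index": "not_analyzed" to avoid this altogether (both the stop-word removal and the lowercasing):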

# Delete index 
# 
curl -XDELETE 'http://localhost:9200/test' 

# Create with mapping 
# 
curl -XPUT 'http://localhost:9200/test/' -d '{ 
    "mappings": { 
    "company": { 
     "properties": { 
     "countries": { "type": "string", "index" : "not_analyzed" } 
     } 
    } 
    } 
}' 


# Index documents 
# 
curl -XPUT 'http://localhost:9200/test/company/10' -d '{"countries" : ["CH", "CN"], "name" : "company10"}' 
curl -XPUT 'http://localhost:9200/test/company/11' -d '{"countries" : ["AT", "CH", "CN", "DE", "EN", "FR"], "name" : "company11"}' 
curl -XPUT 'http://localhost:9200/test/company/12' -d '{"countries" : ["AT", "CN", "EN", "FR"], "name" : "company12"}' 
curl -XPUT 'http://localhost:9200/test/company/13' -d '{"countries" : ["CH", "CN", "HU"], "name" : "company13"}' 
curl -XPUT 'http://localhost:9200/test/company/14' -d '{"countries" : ["CH", "CN", "EN", "FR"], "name" : "company14"}' 
curl -XPUT 'http://localhost:9200/test/company/15' -d '{"countries" : ["AT", "CN", "DE", "EN", "FR", "HU"], "name" : "company15"}' 
curl -XPUT 'http://localhost:9200/test/company/16' -d '{"countries" : ["AT", "BE", "CH", "DE", "EN", "FR", "HU"], "name" : "company16"}' 
curl -XPUT 'http://localhost:9200/test/company/17' -d '{"countries" : ["BE", "CN", "EN"], "name" : "company17"}' 
curl -XPUT 'http://localhost:9200/test/company/18' -d '{"countries" : ["AT", "CH", "CN", "DE"], "name" : "company18"}' 
curl -XPUT 'http://localhost:9200/test/company/19' -d '{"countries" : ["AT", "CH", "CN", "DE", "EN", "FR", "HU"], "name" : "company19"}' 
curl -XPUT 'http://localhost:9200/test/company/20' -d '{"countries" : ["EN", "FR"], "name" : "company20"}' 
curl -XPUT 'http://localhost:9200/test/company/21' -d '{"countries" : ["AT", "BE", "DE", "FR", "HU"], "name" : "company21"}' 
curl -XPUT 'http://localhost:9200/test/company/22' -d '{"countries" : ["AT", "BE", "CH", "DE", "EN", "FR", "HU"], "name" : "company22"}' 
curl -XPUT 'http://localhost:9200/test/company/23' -d '{"countries" : ["AT", "BE", "CH", "CN", "DE", "EN", "HU"], "name" : "company23"}' 
curl -XPUT 'http://localhost:9200/test/company/24' -d '{"countries" : ["AT", "BE", "CH", "CN", "DE", "EN", "FR"], "name" : "company24"}' 
curl -XPUT 'http://localhost:9200/test/company/25' -d '{"countries" : ["AT", "BE", "CH", "DE", "EN", "FR"], "name" : "company25"}' 
curl -XPUT 'http://localhost:9200/test/company/26' -d '{"countries" : ["AT", "BE", "CH", "CN", "DE", "EN", "FR", "HU"], "name" : "company26"}' 
curl -XPUT 'http://localhost:9200/test/company/27' -d '{"countries" : ["AT", "EN", "FR"], "name" : "company27"}' 
curl -XPUT 'http://localhost:9200/test/company/28' -d '{"countries" : ["CN"], "name" : "company28"}' 
curl -XPUT 'http://localhost:9200/test/company/29' -d '{"countries" : ["BE", "CH", "CN", "EN", "FR"], "name" : "company29"}' 
curl -XPUT 'http://localhost:9200/test/company/30' -d '{"countries" : ["CN"], "name" : "company30"}' 

# Refresh index 
# 
curl -XPOST 'http://localhost:9200/test/_refresh' 

# Search 
# 
curl -XGET 'http://localhost:9200/test/company/_search?pretty=true' -d ' 
{"query" : { "match_all" : {} }, 
"facets" : { 
    "foo" : { 
     "filter" : { 
     "term" : { "countries" : "AT" } 
     } 
    } 
    } 
} 
' 
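
Since the original goal was a count for every country rather than for a single code, a terms facet on the not_analyzed countries field may be a better fit than one filter facet per code. The following is a minimal sketch, assuming the facets API of this Elasticsearch generation; the facet name by_country, the size of 20, and the function name count_by_country are illustrative choices, not part of the original answer.

```shell
# Request body: a terms facet on the not_analyzed "countries" field.
# "by_country" and the size of 20 are illustrative choices.
FACET_BODY='{
  "query":  { "match_all": {} },
  "facets": {
    "by_country": {
      "terms": { "field": "countries", "size": 20 }
    }
  }
}'

# Wrapped in a function so that nothing is sent until it is called.
count_by_country() {
  curl -XGET 'http://localhost:9200/test/company/_search?pretty=true' -d "$FACET_BODY"
}
```

Calling count_by_country against the index created above should list each country code together with the number of companies that contain it.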
+0

Awesome explanation of the problem, many thanks. It works like a charm ;) – astropanic

+0

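I have the same problem. I have added index: not_analyzed to the field 'country_code', and I see that some countries ('is', etc.) are still excluded from the facets. I'll keep checking. For now I have just prefixed the country codes with '_', so it stores _at, _be, etc. – Alex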

+0

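Or rather, @vhyza is right. I just had 2 types: one with the provided mapping, and another one without it that was handled automatically (not_indexed). My fault. So yes, index: not_analyzed solves the problem. – Alex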