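elasticsearch: wrong counts in terms facet
I don't know whether this is a bug or whether I am missing something, but the terms facet is returning wrong counts for terms.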
I have a field that is analyzed with str_tag_analyzer.
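I want to build a tag cloud from this field: the top 20 tags together with their counts (how many times each one occurs).
The terms facet looks like the right tool for this. My understanding is that the size parameter of a terms facet query controls how many tags are returned.
When I run the terms facet query with different sizes, I get unexpected results. Here are some of my queries and their results.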
Query 1
curl -XGET 'http://server:9200/stage_profiles/wrapper_0/_search?pretty=1' -d '
{
query : {
"nested" : {
"query" : {
"field" : {
"gsid" : 222
}
},
"path" : "medals"
}
}, from: 0, size: 0
,
facets: {
"tags" : { "terms" : {"field" : "field_val_t", size: 1} }
}
}'
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"failed" : 0
},
"hits" : {
"total" : 189,
"max_score" : 1.0,
"hits" : [ ]
},
"facets" : {
"tags" : {
"_type" : "terms",
"missing" : 57,
"total" : 331,
"other" : 316,
"terms" : [ {
"term" : "hyderabad",
"count" : 15
} ]
}
}
}
Query 2
curl -XGET 'http://server:9200/stage_profiles/wrapper_0/_search?pretty=1' -d '
{
query : {
"nested" : {
"query" : {
"field" : {
"gsid" : 222
}
},
"path" : "medals"
}
}, from: 0, size: 0
,
facets: {
"tags" : { "terms" : {"field" : "field_val_t", size: 3} }
}
}'
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"failed" : 0
},
"hits" : {
"total" : 189,
"max_score" : 1.0,
"hits" : [ ]
},
"facets" : {
"tags" : {
"_type" : "terms",
"missing" : 57,
"total" : 331,
"other" : 282,
"terms" : [ {
"term" : "playing",
"count" : 20
}, {
"term" : "hyderabad",
"count" : 15
}, {
"term" : "pune",
"count" : 14
} ]
}
}
}
Query 3
curl -XGET 'http://server:9200/stage_profiles/wrapper_0/_search?pretty=1' -d '
{
query : {
"nested" : {
"query" : {
"field" : {
"gsid" : 222
}
},
"path" : "medals"
}
}, from: 0, size: 0
,
facets: {
"tags" : { "terms" : {"field" : "field_val_t", size: 10} }
}
}'
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"failed" : 0
},
"hits" : {
"total" : 189,
"max_score" : 1.0,
"hits" : [ ]
},
"facets" : {
"tags" : {
"_type" : "terms",
"missing" : 57,
"total" : 331,
"other" : 198,
"terms" : [ {
"term" : "playing",
"count" : 20
}, {
"term" : "hyderabad",
"count" : 19
}, {
"term" : "bangalore",
"count" : 18
}, {
"term" : "pune",
"count" : 16
}, {
"term" : "chennai",
"count" : 16
}, {
"term" : "games",
"count" : 13
}, {
"term" : "testing",
"count" : 11
}, {
"term" : "cricket",
"count" : 9
}, {
"term" : "singing",
"count" : 6
}, {
"term" : "movies",
"count" : 5
} ]
}
}
}
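I have the following concerns:
1. The first query returns a tag with a count of 15, but there is another tag with a count of 20 (as can be seen in queries 2 and 3). So it should have returned the "playing" tag with a count of 20.
2. The second query returns a count of 15 for the "hyderabad" tag, but the third query returns a count of 19 for the same tag.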
If you need any other information, such as the mapping or the data in ES, please let me know. Thanks
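One plausible reading of both observations, assuming the terms facet asks each of the 3 shards only for its own top `size` terms and then merges those partial lists, can be sketched as a small simulation. The per-shard counts below are invented for illustration (they are not the real index data), and `merged_terms_facet` is a hypothetical helper, not an Elasticsearch API.

```python
from collections import Counter


def shard_top_terms(shard_counts, shard_size):
    """Return one shard's top `shard_size` (term, count) pairs."""
    # Sort by count descending, then by term so ties are deterministic.
    ranked = sorted(shard_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:shard_size]


def merged_terms_facet(shards, size, shard_size=None):
    """Merge per-shard top lists the way a coordinating node might.

    Assumption: each shard reports only its local top `shard_size` terms
    (defaulting to `size`), so counts from shards where a term did not
    make the local cut are silently dropped.
    """
    if shard_size is None:
        shard_size = size
    merged = Counter()
    for shard_counts in shards:
        for term, count in shard_top_terms(shard_counts, shard_size):
            merged[term] += count
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:size]


# Invented data: true totals are playing=20, hyderabad=19, pune=16.
SHARDS = [
    {"hyderabad": 15, "playing": 7, "pune": 6},
    {"playing": 7, "pune": 8, "hyderabad": 2},
    {"playing": 6, "hyderabad": 2, "pune": 2},
]

if __name__ == "__main__":
    for size in (1, 2, 3):
        print(size, merged_terms_facet(SHARDS, size))
    # Asking each shard for more terms than the final size.
    print("shard_size=3", merged_terms_facet(SHARDS, 1, shard_size=3))
```

With this made-up data, `size=1` yields "hyderabad" with 15 even though "playing" has the highest true total, and `size=2` still undercounts "hyderabad", which mirrors the pattern in the three queries. Whether the real index behaves this way depends on how the tags are distributed across the shards.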
As of version 0.90.6 you can also use [`shard_size`](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-facet.html#_accuracy_control). – Sonson123
This is not the best way to achieve it. Using a single shard may hurt performance. – eliasah
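As a rough sketch of the `shard_size` suggestion from the comments, the request body for the top 20 tags could be built as below. The query and field names are copied from the question. The helper `build_tag_cloud_request` and the value `shard_size=100` are assumptions for illustration, and the option is only expected to work on 0.90.6 or later, per the comment.

```python
import json


def build_tag_cloud_request(size=20, shard_size=100):
    """Build a terms facet request body (hypothetical helper).

    `shard_size` asks every shard for more candidate terms than the
    final `size`; 100 is an arbitrary illustrative value, not a
    recommendation from the original thread.
    """
    body = {
        "query": {
            "nested": {
                "path": "medals",
                "query": {"field": {"gsid": 222}},
            }
        },
        "from": 0,
        "size": 0,  # no hits needed, only the facet
        "facets": {
            "tags": {
                "terms": {
                    "field": "field_val_t",
                    "size": size,
                    "shard_size": shard_size,
                }
            }
        },
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    # Paste the printed JSON after -d in the curl command from the question.
    print(build_tag_cloud_request())
```

A larger `shard_size` trades extra per-shard work and network transfer for more accurate merged counts, without forcing the index onto a single shard.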