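Elasticsearch: span_near and substring matching
I am new to Elasticsearch. I want to implement span-style matching that, after exact phrase matches and exact term-sequence matches, also takes substring matches into account.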
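For example, these are the documents I have indexed:
- men's cream
- men's wrinkle cream
- men's advanced wrinkle cream
- women's cream
- women's wrinkle cream
- women's advanced wrinkle cream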
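If I search for "men cream", I want the results in the same order as listed above. Expected search results:
- men's cream -> exact phrase match
- men's wrinkle cream -> search term sequence with slop 1
- men's advanced wrinkle cream -> search term sequence with slop 2
- women's cream -> substring exact phrase match
- women's wrinkle cream -> substring search term sequence with slop 1
- women's advanced wrinkle cream -> substring search term sequence with slop 2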
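To make the expected ordering concrete, here is a small Python sketch of the ranking I have in mind. It is not Elasticsearch code and says nothing about how Elasticsearch scores documents. It assumes whitespace tokenization with a trailing possessive 's stripped (my assumption, not the behaviour of any particular analyzer), and the helper names `tokenize`, `min_slop` and `rank` are made up for illustration.

```python
import re
from itertools import combinations

DOCS = [
    "men's cream",
    "men's wrinkle cream",
    "men's advanced wrinkle cream",
    "women's cream",
    "women's wrinkle cream",
    "women's advanced wrinkle cream",
]


def tokenize(text):
    # Assumption: lowercase, split on whitespace, drop a trailing possessive 's.
    return [re.sub(r"'s$", "", tok) for tok in text.lower().split()]


def min_slop(tokens, terms, substring=False):
    """Smallest number of extra tokens between the terms when they occur
    in order, or None if they never occur in order."""

    def hit(token, term):
        return term in token if substring else term == token

    best = None
    # combinations() yields strictly increasing position tuples, so the
    # terms are always checked in query order.
    for positions in combinations(range(len(tokens)), len(terms)):
        if all(hit(tokens[p], t) for p, t in zip(positions, terms)):
            slop = positions[-1] - positions[0] + 1 - len(terms)
            if best is None or slop < best:
                best = slop
    return best


def rank(docs, query, max_slop=2):
    """Whole-term matches first, then substring matches, each by slop."""
    terms = tokenize(query)
    scored = []
    for doc in docs:
        tokens = tokenize(doc)
        exact = min_slop(tokens, terms)
        if exact is not None and exact <= max_slop:
            scored.append(((0, exact), doc))
            continue
        partial = min_slop(tokens, terms, substring=True)
        if partial is not None and partial <= max_slop:
            scored.append(((1, partial), doc))
    return [doc for _, doc in sorted(scored, key=lambda item: item[0])]


if __name__ == "__main__":
    for doc in rank(DOCS, "men cream"):
        print(doc)
```

With the six documents above, `rank(DOCS, "men cream")` returns them in exactly the order of the expected results list: the three "men's" documents by increasing slop, then the three "women's" documents by increasing slop.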
I can get the first 3 results with span_near containing nested span_term clauses, with slop = 2 and in_order = true.
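As a rough sketch of that query, the request body can be built in Python as below. The field name `genre` is taken from my mapping further down; the terms `men` and `cream` are assumptions about what the analyzer produces, and the helper name `span_near_body` is only for illustration. The printed JSON is what would be posted to the `_search` endpoint.

```python
import json


def span_near_body(field, terms, slop, in_order=True):
    """Build a search body with one span_term clause per term, wrapped in span_near."""
    return {
        "query": {
            "span_near": {
                # One span_term per query term, in query order.
                "clauses": [{"span_term": {field: term}} for term in terms],
                "slop": slop,
                "in_order": in_order,
            }
        }
    }


# Assumed terms; the real tokens depend on the analyzer used for the field.
body = span_near_body("genre", ["men", "cream"], slop=2)
print(json.dumps(body, indent=2))
```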
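I cannot get the remaining results, 4 to 6, because span_near with nested span_term clauses does not support wildcards, in this example "men cream" OR "men's cream". Is there any way to achieve this with Elasticsearch?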
UPDATE
My index:
{
  "bluray": {
    "settings": {
      "index": {
        "uuid": "4jofvNfuQdqbhfaF2ibyhQ",
        "number_of_replicas": "1",
        "number_of_shards": "5",
        "version": {
          "created": "1000199"
        }
      }
    }
  }
}
Mapping:
{
  "bluray": {
    "mappings": {
      "movies": {
        "properties": {
          "genre": {
            "type": "string"
          }
        }
      }
    }
  }
}
I ran the following query:
POST /bluray/movies/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "span_near": {
            "clauses": [
              {
                "span_term": {
                  "genre": "women"
                }
              },
              {
                "span_term": {
                  "genre": "cream"
                }
              }
            ],
            "collect_payloads": false,
            "slop": 12,
            "in_order": true
          }
        },
        {
          "custom_boost_factor": {
            "query": {
              "match_phrase": {
                "genre": "women cream"
              }
            },
            "boost_factor": 4.1
          }
        },
        {
          "match": {
            "genre": {
              "query": "women cream",
              "analyzer": "standard",
              "minimum_should_match": "99%"
            }
          }
        }
      ]
    }
  }
}
This gives me the following result:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 0.011612939,
    "hits": [
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "u9aNkZAoR86uAiW9SX8szQ",
        "_score": 0.011612939,
        "_source": {
          "genre": "men's cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "cpTyKrL6TWuJkXvliibVBQ",
        "_score": 0.009290351,
        "_source": {
          "genre": "men's wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "rn_SFvD4QBO6TJQJNuOh5A",
        "_score": 0.009290351,
        "_source": {
          "genre": "men's advanced wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "9a31_bRpR2WfWh_4fgsi_g",
        "_score": 0.004618556,
        "_source": {
          "genre": "women's cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "q-DoBBl2RsON_qwLRSoh9Q",
        "_score": 0.0036948444,
        "_source": {
          "genre": "women's advanced wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "TxzCP8B_Q8epXtIcfgEw3Q",
        "_score": 0.0036948444,
        "_source": {
          "genre": "women's wrinkle cream"
        }
      }
    ]
  }
}
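This is not correct. Why are the men's products returned first when I search for women?
Note: searching for "men cream" still returns better results, but they do not follow the search term sequence.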
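I tried setting up the index as described here: http://stackoverflow.com/questions/9421358/filename-search-with-elasticsearch, but it still did not return the substring results in search-term order. I also used the gist provided here -> http://sense.qbox.io/gist/db82c3fca956c8bffae19559b1fe3108c101e851, and that did not give me the desired results either.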
Did you find a solution? I have the same problem. – letalumil