2014-04-09 74 views

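Elasticsearch: span_near and substring match

I am new to Elasticsearch. I want to implement span_near-style matching that, after exact phrase matches and exact word-sequence matches, also takes substring matches into account.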

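For example:

Documents I have in the index:

  1. men's cream
  2. men's wrinkle cream
  3. men's advanced wrinkle cream
  4. women's cream
  5. women's wrinkle cream
  6. women's advanced wrinkle cream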

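If I search for "men's cream", I want the results in the same order as listed above. Expected search results:

  1. men's cream -> exact phrase match
  2. men's wrinkle cream -> search term sequence with slop 1
  3. men's advanced wrinkle cream -> search term sequence with slop 2
  4. women's cream -> substring exact phrase match
  5. women's wrinkle cream -> substring search term sequence with slop 1
  6. women's advanced wrinkle cream -> substring search term sequence with slop 2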

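I can get the first 3 results with a span_near query that nests span_term clauses, using slop = 2 and in_order = true.
I cannot get results 4 to 6, because span_near with nested span_term clauses does not support wildcards (in this example, "men's cream" OR "men"). Is there any way to achieve this with Elasticsearch?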
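
The slop values in the expected results can be sanity-checked with a small model. The Python sketch below is not Elasticsearch code. It assumes a simplified tokenizer (lowercase, split on whitespace, so `men's` stays a single token) and treats slop as the number of unmatched positions between the first and last matched term. The function name `ordered_span_slop` is hypothetical.

```python
def ordered_span_slop(tokens, terms):
    """Smallest slop at which `terms` appear in order in `tokens`, or None."""
    best = None
    for start, token in enumerate(tokens):
        if token != terms[0]:
            continue
        pos = start
        for term in terms[1:]:
            try:
                # earliest occurrence of the next term after the previous one
                pos = tokens.index(term, pos + 1)
            except ValueError:
                break  # a later term is missing, so no match from this start
        else:
            # positions spanned minus matched terms = unmatched positions between
            slop = (pos - start + 1) - len(terms)
            best = slop if best is None else min(best, slop)
    return best


docs = [
    "men's cream",
    "men's wrinkle cream",
    "men's advanced wrinkle cream",
    "women's cream",
    "women's wrinkle cream",
    "women's advanced wrinkle cream",
]

# Assumption: whole-token comparison, as a span_term clause would do.
query = ["men's", "cream"]
slops = [ordered_span_slop(d.lower().split(), query) for d in docs]
```

Under this model, documents 1 to 3 need slop 0, 1 and 2, while documents 4 to 6 never match, because `men's` is not a whole token in them. That is the substring gap described above.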

Update
My index settings:

{
  "bluray": {
    "settings": {
      "index": {
        "uuid": "4jofvNfuQdqbhfaF2ibyhQ",
        "number_of_replicas": "1",
        "number_of_shards": "5",
        "version": {
          "created": "1000199"
        }
      }
    }
  }
}

Mapping:

{
  "bluray": {
    "mappings": {
      "movies": {
        "properties": {
          "genre": {
            "type": "string"
          }
        }
      }
    }
  }
}

I ran the following query:

POST /bluray/movies/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "span_near": {
            "clauses": [
              {
                "span_term": {
                  "genre": "women"
                }
              },
              {
                "span_term": {
                  "genre": "cream"
                }
              }
            ],
            "collect_payloads": false,
            "slop": 12,
            "in_order": true
          }
        },
        {
          "custom_boost_factor": {
            "query": {
              "match_phrase": {
                "genre": "women cream"
              }
            },
            "boost_factor": 4.1
          }
        },
        {
          "match": {
            "genre": {
              "query": "women cream",
              "analyzer": "standard",
              "minimum_should_match": "99%"
            }
          }
        }
      ]
    }
  }
}

It gives me the following result:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 0.011612939,
    "hits": [
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "u9aNkZAoR86uAiW9SX8szQ",
        "_score": 0.011612939,
        "_source": {
          "genre": "men's cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "cpTyKrL6TWuJkXvliibVBQ",
        "_score": 0.009290351,
        "_source": {
          "genre": "men's wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "rn_SFvD4QBO6TJQJNuOh5A",
        "_score": 0.009290351,
        "_source": {
          "genre": "men's advanced wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "9a31_bRpR2WfWh_4fgsi_g",
        "_score": 0.004618556,
        "_source": {
          "genre": "women's cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "q-DoBBl2RsON_qwLRSoh9Q",
        "_score": 0.0036948444,
        "_source": {
          "genre": "women's advanced wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "TxzCP8B_Q8epXtIcfgEw3Q",
        "_score": 0.0036948444,
        "_source": {
          "genre": "women's wrinkle cream"
        }
      }
    ]
  }
}

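This is not correct. Why are the men's documents returned first when I search for women?

Note: searching for "men's cream" still returns better results, but they do not follow the search-term sequence.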


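I tried applying the index settings described here: http://stackoverflow.com/questions/9421358/filename-search-with-elasticsearch, but it still does not return substring results in search-term order. I also used the gist provided here -> http://sense.qbox.io/gist/db82c3fca956c8bffae19559b1fe3108c101e851, which did not give me the desired results either. –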


Did you find a solution? I have the same problem. – letalumil

Answer

POST /bluray/movies/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "span_near": {
            "clauses": [
              {
                "span_term": {
                  "genre": "women's"
                }
              },
              {
                "span_term": {
                  "genre": "cream"
                }
              }
            ],
            "collect_payloads": false,
            "slop": 12,
            "in_order": true
          }
        },
        {
          "match": {
            "genre": {
              "query": "women's cream",
              "analyzer": "standard",
              "minimum_should_match": "99%"
            }
          }
        }
      ]
    }
  }
}

This gives the following output, which matches what you expected:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 0.7841132,
    "hits": [
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "4",
        "_score": 0.7841132,
        "_source": {
          "genre": "women's cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "5",
        "_score": 0.56961054,
        "_source": {
          "genre": "women's wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "6",
        "_score": 0.35892165,
        "_source": {
          "genre": "women's advanced wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "3",
        "_score": 0.2876821,
        "_source": {
          "genre": "men's advanced wrinkle cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "1",
        "_score": 0.25811607,
        "_source": {
          "genre": "men's cream"
        }
      },
      {
        "_index": "bluray",
        "_type": "movies",
        "_id": "2",
        "_score": 0.11750762,
        "_source": {
          "genre": "men's wrinkle cream"
        }
      }
    ]
  }
}
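
A possible explanation of why the term change matters, and why the men's documents still appear at the bottom of the output, can be sketched in Python. This is a simplified model and not Elasticsearch code. It assumes the standard analyzer keeps `women's` as one token (modelled here by lowercasing and splitting on whitespace) and that a percentage `minimum_should_match` is rounded down. The helper names are hypothetical.

```python
import math


def tokenize(text):
    # Assumption: stand-in for the standard analyzer; apostrophes stay inside tokens.
    return text.lower().split()


def required_optional_clauses(clause_count, percent):
    # Assumption: the count computed from the percentage is rounded down.
    return math.floor(clause_count * percent / 100)


docs = [
    "men's cream",
    "men's wrinkle cream",
    "men's advanced wrinkle cream",
    "women's cream",
    "women's wrinkle cream",
    "women's advanced wrinkle cream",
]


def matching_docs(query, percent):
    """Documents containing at least the required number of query terms."""
    terms = tokenize(query)
    need = required_optional_clauses(len(terms), percent)
    return [d for d in docs if sum(t in tokenize(d) for t in terms) >= need]


# "women" is not an indexed token, so only "cream" can match.
women_is_token = any("women" in tokenize(d) for d in docs)
hits_for_women_cream = matching_docs("women cream", 99)
```

In this model, "99%" of two terms rounds down to one required term, so every document matches on `cream` alone and the term `women` contributes nothing. That would leave the ranking to factors such as field length, which is consistent with the shorter men's documents scoring first in the question. With `women's` as the term, the span_near and match clauses can both hit the women's documents and lift them above the rest.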