2015-04-03 36 views

Elasticsearch: edge_ngram filter and not_analyzed to match search

I have the following Elasticsearch configuration:

PUT /my_index
{
  "settings": {
    "number_of_shards": 1,
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        },
        "snow_filter": {
          "type": "snowball",
          "language": "English"
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "snow_filter",
            "autocomplete_filter"
          ]
        }
      }
    }
  }
}

PUT /my_index/_mapping/my_type
{
  "my_type": {
    "properties": {
      "name": {
        "type": "multi_field",
        "fields": {
          "name": {
            "type": "string",
            "index_analyzer": "autocomplete",
            "search_analyzer": "snowball"
          },
          "not": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}


POST /my_index/my_type/_bulk
{ "index": { "_id": 1 } }
{ "name": "Brown foxes" }
{ "index": { "_id": 2 } }
{ "name": "Yellow furballs" }
{ "index": { "_id": 3 } }
{ "name": "my discovery" }
{ "index": { "_id": 4 } }
{ "name": "myself is fun" }
{ "index": { "_id": 5 } }
{ "name": ["foxy", "foo"] }
{ "index": { "_id": 6 } }
{ "name": ["foo bar", "baz"] }

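I'm trying to write a search that returns only item 6, the one whose name contains "foo bar", and I'm not quite sure how. This is what I'm doing right now: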

GET /my_index/my_type/_search
{
  "query": {
    "match": {
      "name": {
        "query": "foo b"
      }
    }
  }
}

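I know this comes down to how the tokenizer splits the terms, but I'm a bit lost on how to be flexible and still strict enough to match this combination. I'm guessing I need to do more with the multiple fields on my name mapping, but I'm not sure. How can I fix the query and/or my mapping to meet my needs?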

Answer

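You're already close. Your edge_ngram analyzer produces tokens with a minimum length of 1, the query is tokenized into "foo" and "b", and the default match query operator is "or". The query therefore matches every document that has a term starting with "b" or a term starting with "foo", which gives three matching documents (1, 5 and 6).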
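To make that concrete, here is a rough Python simulation of the index side and of the default "or" matching. It is only a sketch, not Elasticsearch itself: the regex split stands in for the standard tokenizer, snowball stemming is left out (an assumption that is harmless for the terms "foo" and "b"), and the helper names are made up for the illustration.

```python
import re

# The six documents from the bulk request (single values wrapped in lists).
DOCS = {
    1: ["Brown foxes"],
    2: ["Yellow furballs"],
    3: ["my discovery"],
    4: ["myself is fun"],
    5: ["foxy", "foo"],
    6: ["foo bar", "baz"],
}

def words(text):
    # Rough stand-in for the standard tokenizer plus the lowercase filter.
    return re.findall(r"\w+", text.lower())

def edge_ngrams(word, min_gram=1, max_gram=20):
    # Prefixes of the word, the way an edge n-gram filter emits them.
    return {word[:n] for n in range(min_gram, min(len(word), max_gram) + 1)}

def index_terms(values):
    # All terms stored for a (possibly multi-valued) name field.
    return {g for value in values for w in words(value) for g in edge_ngrams(w)}

INDEX = {doc_id: index_terms(values) for doc_id, values in DOCS.items()}

def match_or(query):
    # "or" semantics: a document matches if ANY query term is indexed for it.
    terms = words(query)
    return sorted(d for d, indexed in INDEX.items() if any(t in indexed for t in terms))

print(match_or("foo b"))  # [1, 5, 6]
```

Document 1 gets in only because "brown" was indexed with the one-letter prefix "b", and document 5 only because of "foo".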

Using the "and" operator seems to do what you want:

POST /my_index/my_type/_search
{
  "query": {
    "match": {
      "name": {
        "query": "foo b",
        "operator": "and"
      }
    }
  }
}
... 
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.4451914,
    "hits": [
      {
        "_index": "test_index",
        "_type": "my_type",
        "_id": "6",
        "_score": 1.4451914,
        "_source": {
          "name": [
            "foo bar",
            "baz"
          ]
        }
      }
    ]
  }
}
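For what it's worth, the effect of "and" can be approximated in a few lines of Python. Again this is a sketch under assumptions (regex tokenizing, no stemming, hypothetical helper names), not what Elasticsearch runs internally: every query term has to appear among the prefixes indexed for the document.

```python
import re

def words(text):
    # Rough stand-in for the standard tokenizer plus the lowercase filter.
    return re.findall(r"\w+", text.lower())

def prefixes(word, max_gram=20):
    # Edge n-grams with min_gram 1, as in the question's filter settings.
    return {word[:n] for n in range(1, min(len(word), max_gram) + 1)}

def matches_and(query, values):
    # "and" semantics: ALL query terms must be indexed for the document.
    indexed = {p for v in values for w in words(v) for p in prefixes(w)}
    return all(term in indexed for term in words(query))

print(matches_and("foo b", ["foo bar", "baz"]))  # True  (document 6)
print(matches_and("foo b", ["foxy", "foo"]))     # False (document 5, nothing starts with "b")
print(matches_and("foo b", ["Brown foxes"]))     # False (document 1, no "foo")
print(matches_and("foo b", ["foo", "baz"]))      # True  (terms may come from different values)
```

The last line is worth keeping in mind: "and" only requires that both terms are present in the field, not that they are adjacent or in the same array element. If that matters, a phrase-style query is probably the next thing to look at.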

Here is the code I used to test it:

http://sense.qbox.io/gist/4f6fb7c1fdc6942023091ee1433d7490e04e7dea

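As a follow-up, if I add a new record with name: ["work", "performance"] and search for "process", the new record comes back. Why does that happen? (I suppose I could set a min_score to try to stop it from matching.) – RyanHirsch 2015-04-04 22:58:29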

I just tried that and did not get that record back (http://sense.qbox.io/gist/61216d3dd894c50212503df931310d24f790e15f). So something else must be going on. – 2015-04-04 23:04:04

I had simply set the analyzer to autocomplete instead of having separate index and search analyzers, and that caused the problem. – RyanHirsch 2015-04-04 23:16:02
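A small Python sketch of why a single autocomplete analyzer would cause this: if the query text is also cut into edge n-grams, any two words that share a first letter end up sharing the one-letter term. The two words below are illustrative stand-ins chosen to share only their first letter, and the helper names are hypothetical; stemming is ignored.

```python
def prefixes(word, max_gram=20):
    # Edge n-grams with min_gram 1, as in the autocomplete filter.
    return {word[:n] for n in range(1, min(len(word), max_gram) + 1)}

def shared_terms(query_word, indexed_word, ngram_at_search_time):
    # Terms the query and the indexed word have in common.
    indexed = prefixes(indexed_word)
    query_terms = prefixes(query_word) if ngram_at_search_time else {query_word}
    return sorted(query_terms & indexed)

# Same analyzer at index and search time: the lone "p" is enough for a hit.
print(shared_terms("process", "performance", ngram_at_search_time=True))   # ['p']

# Separate search analyzer without n-grams: nothing in common, no hit.
print(shared_terms("process", "performance", ngram_at_search_time=False))  # []
```

That is the reason the mapping in the question keeps index_analyzer and search_analyzer separate: the n-grams belong on the index side only.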