2017-05-05 42 views
0

Why is Elasticsearch search case-insensitive?

I have an index like this:

"analysis" : { "filter" : { "meeteor_ngram" : { "type" : "nGram", "min_gram" : "2", "max_gram" : "15" } }, "analyzer" : { "meeteor" : { "filter" : [ "meeteor_ngram" ], "tokenizer" : "standard" } } },

And this document:

{ "_index" : "test_global_search", "_type" : "meeting", "_id" : "1", "_version" : 1, "found" : true, "_source" : { "name" : "LightBulb Innovation", "purpose" : "The others should listen the Innovators and also improve the current process.", "location" : "Projector should be set up.", "meeting_notes" : [ { "meeting_note_text" : "The immovator proposed to change the Bulb to Led." } ], "agenda_items" : [ { "text" : "Discuss The Lightning" } ] } }

Even though I use neither a lowercase filter nor a lowercase tokenizer, both of these queries return the document:

curl -XGET 'localhost:9200/global_search/meeting/_search?pretty' -H 'Content-Type: application/json' -d' 
{ 
    "query": { 
     "match": { 
      "name": "lightbulb" 
     } 
    } 
} 
' 

curl -XGET 'localhost:9200/global_search/meeting/_search?pretty' -H 'Content-Type: application/json' -d' 
{ 
    "query": { 
     "match": { 
      "name": "Lightbulb" 
     } 
    } 
} 
' 

Here is the mapping:

→ curl -XGET 'localhost:9200/global_search/_mapping/meeting?pretty' 
{ 
    "global_search" : { 
    "mappings" : { 
     "meeting" : { 
     "properties" : { 
      "agenda_items" : { 
      "properties" : { 
       "text" : { 
       "type" : "text", 
       "analyzer" : "meeteor" 
       } 
      } 
      }, 
      "location" : { 
      "type" : "text", 
      "analyzer" : "meeteor" 
      }, 
      "meeting_notes" : { 
      "properties" : { 
       "meeting_note_text" : { 
       "type" : "text", 
       "analyzer" : "meeteor" 
       } 
      } 
      }, 
      "name" : { 
      "type" : "text", 
      "analyzer" : "meeteor" 
      }, 
      "purpose" : { 
      "type" : "text", 
      "analyzer" : "meeteor" 
      } 
     } 
     } 
    } 
    } 
} 
+0

Where is your mapping? – RoiHatam

+0

I added it, @RoiHatam – Boti

+0

@Boti Which index contains the document above? Is it 'test_global_search' or 'global_search'? Do both indices have the same mapping? – avr

Answers

0

Please make the name field non-analyzed ("index" : "not_analyzed" before 5.0, "type" : "keyword" from 5.0 on):

"name" : { 
     "type" : "keyword", 
     "index" : true 
} 
+0

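I got: [400] {"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"Failed to parse mapping [meeting]: The [string] type is removed in 5.0 and auto upgrade failed because parameters [analyzer] are not supported for automatic upgrade. You should now use either a [text] or [keyword] field instead for field [name]"}],"type":"mapper_parsing_exception","reason":"Failed to parse mapping [meeting]: The [string] type is removed in 5.0 and auto upgrade failed because parameters [analyzer] are not supported for automatic upgrade ... – Boti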

+0

@Boti Sorry, I updated the mapping for the newer version: 'keyword' instead of 'string', and 'true' instead of 'not_analyzed'. – RoiHatam

+0

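It still finds the document for both 'lightbulb' and 'Lightbulb'. Also, with keyword I won't be able to do "search as you type", so I need the custom analyzer. I still don't understand why it is case-insensitive there. – Boti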

3

Both LightBulb and lightBulb return your document because of the custom analyzer you created.

Check how the analyzer tokenizes your data:

GET global_search/_analyze?analyzer=meeteor 
{ 
    "text" : "LightBulb Innovation" 
} 

You will see the following output:

{ 
"tokens": [ 
    { 
    "token": "Li", 
    "start_offset": 0, 
    "end_offset": 9, 
    "type": "word", 
    "position": 0 
    }, 
    { 
    "token": "Lig", 
    "start_offset": 0, 
    "end_offset": 9, 
    "type": "word", 
    "position": 0 
    }, 
    { 
    "token": "Ligh", 
    "start_offset": 0, 
    "end_offset": 9, 
    "type": "word", 
    "position": 0 
    }, 
    { 
    "token": "Light", 
    "start_offset": 0, 
    "end_offset": 9, 
    "type": "word", 
    "position": 0 
    }, 
.... other terms starting from Light 

    { 
    "token": "ig",  ======> tokens below this get matched when you run your query 
    "start_offset": 0, 
    "end_offset": 9, 
    "type": "word", 
    "position": 0 
    }, 
    { 
    "token": "igh", 
    "start_offset": 0, 
    "end_offset": 9, 
    "type": "word", 
    "position": 0 
    }, 
    { 
    "token": "ight", 
    "start_offset": 0, 
    "end_offset": 9, 
    "type": "word", 
    "position": 0 
    }, 
    .... other tokens. 

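Now, when you run a match query, the same custom analyzer is applied to the text you search for and tokenizes it in the same way. Case-independent tokens such as 'ig', 'igh' and several more then match the indexed tokens. That is why match seems to ignore case: the full word never has to match, only one of its n-grams.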
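To make the overlap concrete, here is a minimal Python sketch that imitates the meeteor analyzer outside Elasticsearch. It is an approximation under stated assumptions: the standard tokenizer is replaced by a plain whitespace split, and the helper names `ngrams` and `analyze` are made up for this illustration. Since a match query uses the OR operator by default, a single shared token is enough for a hit.

```python
def ngrams(token, min_gram=2, max_gram=15):
    """Return every substring of `token` whose length is in [min_gram, max_gram]."""
    return {
        token[start:start + size]
        for size in range(min_gram, max_gram + 1)
        for start in range(len(token) - size + 1)
    }


def analyze(text):
    """Rough stand-in for the meeteor analyzer: whitespace split, then n-gram each word.

    Assumption: the real analyzer uses the standard tokenizer, which behaves
    like a whitespace split for this simple input. No lowercasing happens.
    """
    tokens = set()
    for word in text.split():
        tokens |= ngrams(word)
    return tokens


# Tokens stored in the index for the "name" field of the document.
indexed = analyze("LightBulb Innovation")

# Tokens each query shares with the index; any non-empty result means a hit.
for query in ("lightbulb", "Lightbulb"):
    shared = sorted(analyze(query) & indexed)
    print(query, "->", shared)
```

Both queries share lowercase fragments such as "ig" and "ulb" with the indexed tokens, so both return the document even though nothing is lowercased.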

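With a term query, no analyzer is applied at search time. It looks up the term exactly as given. If you search for LightBulb, it is found among the tokens, but lightBulb is not.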
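The term behaviour can be sketched the same way in Python. This is only an illustration: `ngrams` and `term_matches` are hypothetical helpers, and the indexed terms are rebuilt by hand from the two words of the name field rather than read from Elasticsearch.

```python
def ngrams(token, min_gram=2, max_gram=15):
    """Return every substring of `token` whose length is in [min_gram, max_gram]."""
    return {
        token[start:start + size]
        for size in range(min_gram, max_gram + 1)
        for start in range(len(token) - size + 1)
    }


# Terms the index would hold for "LightBulb Innovation" (case is preserved).
indexed_terms = ngrams("LightBulb") | ngrams("Innovation")


def term_matches(term):
    """Imitate a term query: the search text is not analyzed, only looked up as-is."""
    return term in indexed_terms


print(term_matches("LightBulb"))  # the whole word is itself a 9-character n-gram
print(term_matches("lightBulb"))  # lowercase "l" never occurs in the indexed terms
print(term_matches("ightBulb"))   # an exact fragment of the original word
```

The first and third lookups succeed and the second fails, which is the case-sensitive behaviour described above.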

Hope this clarifies your question.

Read up on the term and match queries.

+0

Well, I need this: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyzer.html – Boti

+0

@boti Yes, you can use a different analyzer at search time. – Richa
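As a rough illustration of the search-time analyzer idea from these comments, the Python sketch below keeps n-grams at index time but only splits the query into words at search time. Several things here are assumptions, not part of the thread: the analyzer name meeteor_search is hypothetical, it is assumed to consist of the standard tokenizer with no filters (the built-in standard analyzer would lowercase the query and so would not match the case-preserving index), and the whitespace split only approximates the standard tokenizer.

```python
def ngrams(token, min_gram=2, max_gram=15):
    """Return every substring of `token` whose length is in [min_gram, max_gram]."""
    return {
        token[start:start + size]
        for size in range(min_gram, max_gram + 1)
        for start in range(len(token) - size + 1)
    }


def index_analyze(text):
    """Index-time analysis: split into words, then n-gram each word."""
    tokens = set()
    for word in text.split():
        tokens |= ngrams(word)
    return tokens


def search_analyze(text):
    """Search-time analysis (assumed): split into words only, no n-grams, no lowercasing."""
    return set(text.split())


# Hypothetical mapping fragment: "search_analyzer" is the mapping parameter the
# linked page describes; "meeteor_search" is an invented analyzer name.
name_mapping = {
    "type": "text",
    "analyzer": "meeteor",
    "search_analyzer": "meeteor_search",
}

indexed = index_analyze("LightBulb Innovation")


def matches(query):
    """A match hit needs at least one query token among the indexed tokens."""
    return bool(search_analyze(query) & indexed)


for query in ("LightBulb", "Light", "lightbulb"):
    print(query, matches(query))
```

With this split, partial input such as Light still matches (search as you type), while lightbulb no longer does, because the query is no longer broken into case-independent fragments.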