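ElasticSearch - Searching for human names
I have a large database of names, mostly from Scotland. We are currently building a prototype to replace an existing piece of search software. It is still a work in progress, and our goal is to get our results as close as possible to those of the current search.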
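I was hoping someone could help me. I ran a search in Elasticsearch with the query "Michael Heaney" and got some wild results. The current search returns two main surnames, "Heaney" and "Heavey", all with the forename "Michael". I can get the "Heaney" results in Elasticsearch, but I cannot get "Heavey". ES also returns people whose forename is not "Michael", though I appreciate that this is because the forename is only one part of the fuzzy query. I know this is a narrow use case, since it is a single search, but getting this result and understanding how to get it would help.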
Thanks.
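For reference, "Heaney" and "Heavey" are a single substitution apart, so a fuzzy match based on edit distance should in principle be able to reach one from the other. The sketch below is a plain Levenshtein implementation in Python, not Elasticsearch's own code, and the function name `levenshtein` is only illustrative. It also shows that a difference in case counts as an edit, which may matter here because, as far as I understand, the `fuzzy` query does not analyze its input, while the indexed terms have been lowercased, stemmed, and phonetically encoded.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("heaney", "heavey"))  # prints 1 (n -> v)
    print(levenshtein("Heaney", "heavey"))  # prints 2 (H -> h, n -> v)
```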
Mapping
{
  "jr": {
    "_all": {
      "enabled": true,
      "index_analyzer": "index_analyzer",
      "search_analyzer": "search_analyzer"
    },
    "properties": {
      "pty_forename": {
        "type": "string",
        "index": "analyzed",
        "boost": 2,
        "index_analyzer": "index_analyzer",
        "search_analyzer": "search_analyzer",
        "store": "yes"
      },
      "pty_full_name": {
        "type": "string",
        "index": "analyzed",
        "boost": 4,
        "index_analyzer": "index_analyzer",
        "search_analyzer": "search_analyzer",
        "store": "yes"
      },
      "pty_surname": {
        "type": "string",
        "index": "analyzed",
        "boost": 4,
        "index_analyzer": "index_analyzer",
        "search_analyzer": "search_analyzer",
        "store": "yes"
      }
    }
  }
}
Index settings
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "index_analyzer": {
          "tokenizer": "standard",
          "filter": [
            "standard",
            "my_delimiter",
            "lowercase",
            "stop",
            "asciifolding",
            "porter_stem",
            "my_metaphone"
          ]
        },
        "search_analyzer": {
          "tokenizer": "standard",
          "filter": [
            "standard",
            "my_metaphone",
            "synonym",
            "lowercase",
            "stop",
            "asciifolding",
            "porter_stem"
          ]
        }
      },
      "filter": {
        "synonym": {
          "type": "synonym",
          "synonyms_path": "synonyms/synonyms.txt"
        },
        "my_delimiter": {
          "type": "word_delimiter",
          "generate_word_parts": true,
          "catenate_words": false,
          "catenate_numbers": false,
          "catenate_all": false,
          "split_on_case_change": false,
          "preserve_original": false,
          "split_on_numerics": false,
          "stem_english_possessive": false
        },
        "my_metaphone": {
          "type": "phonetic",
          "encoder": "metaphone",
          "replace": false
        }
      }
    }
  }
}
Fuzzy query
{
  "from": 0, "size": 100,
  "query": {
    "bool": {
      "should": [
        {
          "fuzzy": {
            "pty_surname": {
              "min_similarity": 0.2,
              "value": "Heaney",
              "prefix_length": 0,
              "boost": 5
            }
          }
        },
        {
          "fuzzy": {
            "pty_forename": {
              "min_similarity": 1,
              "value": "Michael",
              "prefix_length": 0,
              "boost": 1
            }
          }
        }
      ]
    }
  }
}
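In a `bool` query that contains only `should` clauses, a document qualifies when any one clause matches, which would explain hits whose forename is not "Michael". One possible restructuring, shown here as an untested sketch rather than a known fix, moves both clauses into `must` and swaps the surname clause for a `match` query with `fuzziness`, so that the input goes through the search analyzer before the fuzzy comparison. The helper name `build_name_query` and the `max_edits` parameter are hypothetical. It also assumes an Elasticsearch version in which `fuzziness` is available in place of `min_similarity`. Whether it actually returns "Heavey" depends on how the stemming and metaphone filters rewrite the terms.

```python
import json


def build_name_query(forename: str, surname: str, max_edits: int = 1) -> dict:
    """Build a search body: forename required, surname matched fuzzily.

    Field names come from the mapping above; the query shape is an assumption.
    """
    return {
        "from": 0,
        "size": 100,
        "query": {
            "bool": {
                # Every "must" clause has to match, unlike "should" clauses.
                "must": [
                    {"match": {"pty_forename": forename}},
                    {
                        "match": {
                            "pty_surname": {
                                "query": surname,
                                # Allow up to max_edits edits per analyzed term.
                                "fuzziness": max_edits,
                                "prefix_length": 0,
                            }
                        }
                    },
                ]
            }
        },
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into a search request.
    print(json.dumps(build_name_query("Michael", "Heaney"), indent=2))
```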
Thanks, Alex. Let me gather all of this information and I'll report back. The answer looks very thorough. – Nate
We just published an article about fuzzy search, which may also be of interest: https://www.found.no/foundation/fuzzy-search/ –
Bookmarked. Thank you very much for your help, I learned a lot. – Nate