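There is really only one way to do this: index your data with a keyword analyzer and search it with a shingle analyzer.
See this reproduction:
First, we create two custom analyzers, one keyword-based and one shingle-based: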
PUT test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [
            "asciifolding",
            "lowercase"
          ]
        },
        "my_analyzer_shingle": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "asciifolding",
            "lowercase",
            "shingle"
          ]
        }
      }
    }
  },
  "mappings": {
    "your_type": {
      "properties": {
        "keyword": {
          "type": "string",
          "index_analyzer": "my_analyzer_keyword",
          "search_analyzer": "my_analyzer_shingle"
        }
      }
    }
  }
}
Now let's index some sample data, using the keywords you gave us:
POST /test/your_type/1
{
  "id": 1,
  "keyword": "thousand eyes"
}
POST /test/your_type/2
{
  "id": 2,
  "keyword": "facebook"
}
POST /test/your_type/3
{
  "id": 3,
  "keyword": "superdoc"
}
POST /test/your_type/4
{
  "id": 4,
  "keyword": "quora"
}
POST /test/your_type/5
{
  "id": 5,
  "keyword": "your story"
}
POST /test/your_type/6
{
  "id": 6,
  "keyword": "Surgery"
}
POST /test/your_type/7
{
  "id": 7,
  "keyword": "lending club"
}
POST /test/your_type/8
{
  "id": 8,
  "keyword": "ad roll"
}
POST /test/your_type/9
{
  "id": 9,
  "keyword": "the honest company"
}
POST /test/your_type/10
{
  "id": 10,
  "keyword": "Draft kings"
}
Finally, the query that runs the search:
POST /test/your_type/_search
{
  "query": {
    "match": {
      "keyword": "I saw the news of lending club on facebook, your story and quora"
    }
  }
}
And this is the result:
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0.009332742,
    "hits": [
      {
        "_index": "test",
        "_type": "your_type",
        "_id": "2",
        "_score": 0.009332742,
        "_source": {
          "id": 2,
          "keyword": "facebook"
        }
      },
      {
        "_index": "test",
        "_type": "your_type",
        "_id": "7",
        "_score": 0.009332742,
        "_source": {
          "id": 7,
          "keyword": "lending club"
        }
      },
      {
        "_index": "test",
        "_type": "your_type",
        "_id": "4",
        "_score": 0.009207102,
        "_source": {
          "id": 4,
          "keyword": "quora"
        }
      },
      {
        "_index": "test",
        "_type": "your_type",
        "_id": "5",
        "_score": 0.0014755741,
        "_source": {
          "id": 5,
          "keyword": "your story"
        }
      }
    ]
  }
}
So what is happening behind the scenes?
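- It indexes your documents as whole keywords (the keyword tokenizer emits the entire string as a single token). I also added the asciifolding filter, so letters are normalized (i.e. é becomes e), and the lowercase filter (for case-insensitive search). So, for example, Draft kings is indexed as draft kings.
- The search analyzer uses the same logic, except that its tokenizer emits word tokens and shingles (combinations of adjacent tokens) are built on top of them. Those shingles are what match your keywords as indexed in the first step.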
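To make the two steps above concrete, here is a rough Python simulation of what the two analyzers do. It is only a sketch, not how Elasticsearch is implemented: the function names (fold, keyword_analyze, shingle_analyze, search) are made up for illustration, the regex stands in for the standard tokenizer, Unicode decomposition stands in for asciifolding, and I'm assuming the shingle filter's default of two-word shingles with single words also kept.

```python
import re
import unicodedata


def fold(text):
    """Rough stand-in for asciifolding + lowercase: strip accents, then lowercase."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def keyword_analyze(text):
    """Index side: the whole string is emitted as one token."""
    return [fold(text)]


def shingle_analyze(text, max_size=2):
    """Search side: single words plus shingles of up to max_size adjacent words."""
    words = re.findall(r"\w+", fold(text))  # crude stand-in for the standard tokenizer
    tokens = []
    for size in range(1, max_size + 1):
        for start in range(len(words) - size + 1):
            tokens.append(" ".join(words[start:start + size]))
    return tokens


def search(docs, query):
    """Return keywords whose single indexed token appears among the query tokens."""
    query_tokens = set(shingle_analyze(query))
    return [kw for kw in docs if keyword_analyze(kw)[0] in query_tokens]


docs = ["thousand eyes", "facebook", "superdoc", "quora", "your story",
        "Surgery", "lending club", "ad roll", "the honest company", "Draft kings"]
query = "I saw the news of lending club on facebook, your story and quora"
matches = search(docs, query)
print(matches)
```

With the sample data this selects facebook, quora, your story and lending club, the same four documents as the result above (scoring is not simulated, so the order differs). Note that under the two-word assumption a three-word keyword such as the honest company would never match; as far as I know you would need to raise max_shingle_size on the shingle filter for that.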
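Is anyone able to run this on Elasticsearch 5.x? It seems the mapping type should be changed from string to text, and index_analyzer to just analyzer, but I tried to execute a search – mac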
@mac I get a too_many_clauses error when I try this. Did you get it working for you? –
@mac I was able to run the queries, but they did not bring back any data. I have logged the issue on GitHub: https://github.com/elastic/elasticsearch/issues/26989 –