2016-10-09 101 views
3

Elasticsearch - Query based on text length

I am using the official Elasticsearch client library for NodeJS to query the following index structure:

{ 
    "_index": "articles", 
    "_type": "context", 
    "_id": "1", 
    "_version": 1, 
    "found": true, 
    "_source": { 
        "article": "this is a paragraph",
        "topic": "topic A"
    } 
} 

{ 
    "_index": "articles", 
    "_type": "context", 
    "_id": "2", 
    "_version": 1, 
    "found": true, 
    "_source": { 
        "article": "this is a paragraph this is a paragraph this is a paragraph",
        "topic": "topic B"
    } 
} 

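I want to query my index with the term "this is a paragraph" and boost the results whose text length is most similar to it, i.e. the document with _id: 1.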

Can I do this without reindexing and adding a field to my index (as described here)?

+0

If you can't change the mapping or reindex, then maybe a Groovy script at query time?

+0

Thanks for the reply. I'm new to Elasticsearch... could you elaborate?

+0

Hmm... let me post a sample query as an example...

Answer

1

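The following query uses Groovy to look at the length of the actual text indexed into ES (using _source.article.length()) and the length of the text being searched for. As a very simple base query I use match_phrase, and then re-score the documents by comparing the length of the search text with the length of the original text. Because the script returns the absolute difference between the two lengths, the results are sorted by _score in ascending order so that the closest match in length comes first.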

GET /articles/context/_search
{
    "query": {
        "function_score": {
            "query": {
                "match_phrase": {
                    "article": "this is a paragraph"
                }
            },
            "functions": [
                {
                    "script_score": {
                        "script": {
                            "inline": "text_to_search_length=text_to_search.length(); compared_length=_source.article.length();return (compared_length-text_to_search_length).abs()",
                            "params": {
                                "text_to_search": "this is a paragraph"
                            }
                        }
                    }
                }
            ]
        }
    },
    "sort": [
        {
            "_score": {
                "order": "asc"
            }
        }
    ]
}
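
Since the question mentions the NodeJS client, here is a rough sketch of how the same request body could be built in JavaScript for any search text, together with a local emulation of what the script computes. The helper names buildLengthQuery and lengthDistance are made up for illustration, and the emulation only mirrors the script part (absolute difference in character length), not the relevance score Elasticsearch combines it with.

```javascript
// Hypothetical helper: builds the same function_score body as the query
// above, with the search text passed in once and reused as a script param.
function buildLengthQuery(text) {
  return {
    query: {
      function_score: {
        query: { match_phrase: { article: text } },
        functions: [
          {
            script_score: {
              script: {
                // Same Groovy script as in the query above.
                inline:
                  "text_to_search_length=text_to_search.length(); " +
                  "compared_length=_source.article.length();" +
                  "return (compared_length-text_to_search_length).abs()",
                params: { text_to_search: text }
              }
            }
          }
        ]
      }
    },
    // Smallest length difference first.
    sort: [{ _score: { order: "asc" } }]
  };
}

// Local emulation of the script: absolute difference in character length.
function lengthDistance(article, text) {
  return Math.abs(article.length - text.length);
}

// The two sample documents from the question.
const docs = [
  { id: "1", article: "this is a paragraph" },
  { id: "2", article: "this is a paragraph this is a paragraph this is a paragraph" }
];

const searchText = "this is a paragraph";

// Rank the sample documents the way the ascending sort would order them.
const ranked = docs
  .map(d => ({ id: d.id, distance: lengthDistance(d.article, searchText) }))
  .sort((a, b) => a.distance - b.distance);

console.log(ranked); // id "1" has distance 0, id "2" has distance 40
console.log(JSON.stringify(buildLengthQuery(searchText), null, 2));
```

Assuming the legacy elasticsearch npm package is the client in use, the body could then be passed along the lines of client.search({ index: "articles", type: "context", body: buildLengthQuery("this is a paragraph") }); treat that call as an assumption to check against the client version you have installed.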