Getting an aggregation of term frequency

Here is my ES setup and query:

===Create the index===

PUT /sample

===Insert data===

PUT /sample/docs/1 
{"data": "And the world said, 'Disarm, disclose, or face serious consequences'—and therefore, we worked with the world, we worked to make sure that Saddam Hussein heard the message of the world."} 
PUT /sample/docs/2 
{"data": "Never give in — never, never, never, never, in nothing great or small, large or petty, never give in except to convictions of honour and good sense. Never yield to force; never yield to the apparently overwhelming might of the enemy"}

===Query===

POST sample/docs/_search 
{ 
    "query": { 
    "match": { 
     "data": "never" 
    } 
    }, 
    "highlight": { 
    "fields": { 
     "data":{} 
    } 
    } 
}

===Search result===

... 
     "highlight": { 
      "data": [ 
      "<em>Never</em> give in — <em>never</em>, <em>never</em>, <em>never</em>, <em>never</em>, in nothing great or small, large or petty, <em>never</em> give", 
      " in except to convictions of honour and good sense. <em>Never</em> yield to force; <em>never</em> yield to the apparently overwhelming might of the enemy" 
      ] 
     }

===Desired result===

I need the frequency of the search term in each matching document, as in the following example:

Doc Id: 2 
Term Frequency :{ 
    "never": 8 
}
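For reference, the expected count can be checked outside Elasticsearch. The sketch below is only an approximation: it assumes the `data` field uses the default standard analyzer and imitates it by lowercasing and splitting on non-word characters, which is not the same code path Elasticsearch uses. The helper name `term_frequencies` is made up for this illustration.

```python
import re
from collections import Counter

def term_frequencies(text):
    """Lowercase the text, split on non-word characters, and count each token.

    Rough stand-in for the standard analyzer (an assumption, not Elasticsearch itself).
    """
    return Counter(re.findall(r"\w+", text.lower()))

# The two documents indexed above, keyed by document id.
docs = {
    "1": "And the world said, 'Disarm, disclose, or face serious consequences'—and "
         "therefore, we worked with the world, we worked to make sure that Saddam "
         "Hussein heard the message of the world.",
    "2": "Never give in — never, never, never, never, in nothing great or small, "
         "large or petty, never give in except to convictions of honour and good "
         "sense. Never yield to force; never yield to the apparently overwhelming "
         "might of the enemy",
}

for doc_id, text in docs.items():
    # Counter returns 0 for terms that do not occur in the document.
    print("Doc Id:", doc_id, {"never": term_frequencies(text)["never"]})
```

With these two documents the count for "never" is 8 in document 2 and 0 in document 1, which matches the desired output above.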

I have already tried bucket aggregations, terms aggregations, and other aggregations, but none of them gave me this result.

Thanks for your help!

Source

2017-09-23 Callisto

You should use the Term Vectors API, which can return the frequency of specific terms in a document.

https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-termvectors.html

In this case, your query would be:

GET /sample/docs/_termvectors 
{ 
    "doc": { 
     "data": "never" 
    }, 
    "term_statistics" : true, 
    "field_statistics" : true, 
    "positions": false, 
    "offsets": false, 
    "filter" : { 
     "min_term_freq" : 8 
    } 
}
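A request that passes `doc` is an artificial-document request, so term vectors are computed for the text supplied in the body rather than for a stored document. To read the count for an indexed document, one option is to request term vectors by document id and pick `term_freq` out of the response. The Python sketch below only builds the request and parses a response; it does not contact a cluster. The helper names are hypothetical, the path follows the typed URL style used in this thread (newer Elasticsearch versions use a different path), and `sample_response` is a hand-written fragment in the documented response shape, not real output.

```python
import json

def termvectors_request(index, doc_type, doc_id, field):
    """Build the path and body of a term vectors request for a stored document."""
    path = f"/{index}/{doc_type}/{doc_id}/_termvectors"
    body = {
        "fields": [field],
        "term_statistics": False,   # index-wide statistics are not needed here
        "field_statistics": False,
        "positions": False,
        "offsets": False,
    }
    return path, body

def term_frequency(response, field, term):
    """Return term_freq for one term from a term vectors response, or 0 if absent."""
    terms = response.get("term_vectors", {}).get(field, {}).get("terms", {})
    return terms.get(term, {}).get("term_freq", 0)

# Hand-written fragment for illustration only (assumed, not captured from a cluster).
sample_response = {
    "_id": "2",
    "term_vectors": {"data": {"terms": {"never": {"term_freq": 8}}}},
}

path, body = termvectors_request("sample", "docs", "2", "data")
print("GET", path)
print(json.dumps(body, indent=2))
print("Doc Id:", sample_response["_id"],
      {"never": term_frequency(sample_response, "data", "never")})
```

Terms appear in the response in their analyzed form, so with the standard analyzer the key to look up is the lowercased "never".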

Source

2017-09-23 20:40:38

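I get the following error when I execute your suggested query: '{ "error": { "root_cause": [ { "type": "illegal_state_exception", "reason": "Something is wrong with the field statistics of the term vector request: Values are \nsum_doc_freq 0\ndoc_count 0\nsum_ttf 0" } ], "type": "illegal_state_exception", "reason": "Something is wrong with the field statistics of the term vector request: Values are \nsum_doc_freq 0\ndoc_count 0\nsum_ttf 0" }, "status": 500 }' – Callisto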

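Also, my requirement is different: your suggested query would return results whose term frequency is 8, but what I want in the result is the term frequency count itself. – Callisto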
