Why is queryWeight included in some result scores, but not others, in the same query?

I'm running a single-term query_string query across multiple fields, _all and tags.name, and trying to understand the scoring. The query: {"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}. Here are the documents the query returns:

Document 1 has an exact match on tags.name, but not on _all.
Document 8 has an exact match on both tags.name and _all.

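Document 8 should win, and it does, but I'm confused by how the scores are computed. It looks like document 1 is penalized by having its tags.name score multiplied by the IDF twice, whereas document 8's tags.name score is multiplied by the IDF only once. In short: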

Both have a component weight(tags.name:animal in 0) [PerFieldSimilarity].
In document 1, we have weight = score = queryWeight x fieldWeight.
In document 8, we have weight = fieldWeight!

Since queryWeight contains idf, document 1 ends up being penalized by its idf twice.
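The arithmetic behind this claim can be reproduced from the numbers in the explain output. The following is only a sketch: it assumes Lucene's classic TF-IDF formula, idf = 1 + ln(maxDocs / (docFreq + 1)), and takes tf, queryNorm and fieldNorm straight from the explain output rather than deriving them. The variable and function names are made up for illustration.

```python
import math


def classic_idf(doc_freq, max_docs):
    """idf as in Lucene's classic TF-IDF similarity (assumed here)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# idf(docFreq=1, maxDocs=1) from the explain output, ~0.30685282
idf = classic_idf(doc_freq=1, max_docs=1)

# Values copied from the explain output
tf = 1.0                 # tf(freq=1.0)
query_norm = 1.0         # queryNorm
field_norm_doc1 = 0.625  # fieldNorm for tags.name in document 1
field_norm_doc8 = 0.5    # fieldNorm for tags.name in document 8

# Document 1: weight = queryWeight x fieldWeight, so idf appears twice
query_weight = idf * query_norm
field_weight_doc1 = tf * idf * field_norm_doc1
score_doc1 = query_weight * field_weight_doc1

# Document 8: weight = fieldWeight only, so idf appears once
score_doc8 = tf * idf * field_norm_doc8

print(f"doc 1 fieldWeight: {field_weight_doc1:.8f}")
print(f"doc 1 score:       {score_doc1:.8f}")
print(f"doc 8 score:       {score_doc8:.8f}")
```

Note that document 1's fieldWeight on its own is larger than document 8's tags.name weight, so the extra idf factor is what pushes document 1's tags.name score below document 8's.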

Can anyone make sense of this?

Additional info

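If I remove _all from the query's fields, queryWeight disappears from the explain entirely.
Adding "use_dis_max":true as an option has no effect.
- However, additionally adding "tie_breaker":0.7 (or any value) does affect document 8, by giving it the more complicated formula we see in document 1.
- Thoughts: it's plausible that a boolean query (which this is) might do this to give more weight to matches on more than one sub-query. However, this makes no sense for a dis_max query, which is supposed to return just the maximum of the sub-queries.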
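For reference, the way a dis_max query combines sub-query scores can be sketched as below. This follows the documented behaviour of Lucene's DisjunctionMaxQuery (the best score plus tie_breaker times the sum of the remaining scores); the function name dis_max_score is hypothetical, and the sketch says nothing about how each sub-score is itself computed, which is the actual puzzle here.

```python
def dis_max_score(sub_scores, tie_breaker=0.0):
    """Combine sub-query scores the way a dis_max query does.

    The best sub-score counts fully; every other matching sub-score
    is scaled by tie_breaker before being added.
    """
    if not sub_scores:
        return 0.0
    best = max(sub_scores)
    return best + tie_breaker * (sum(sub_scores) - best)


# Sub-scores of document 8 as reported under "max of:" in its explain
doc8_sub_scores = [0.033902764, 0.15342641]

# With no tie_breaker, only the best sub-query counts
plain = dis_max_score(doc8_sub_scores)
print(plain)

# Hypothetical round numbers, only to show the effect of tie_breaker
example = dis_max_score([1.0, 2.0], tie_breaker=0.5)
print(example)
```

With tie_breaker left at 0 the result equals the top-level value in document 8's explain. The document 8 sub-scores are not reused with tie_breaker 0.7 because, as noted above, adding a tie_breaker reportedly changes the sub-scores themselves.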

Below are the relevant explain requests. Look for the embedded comments.

Document 1 (match only on tags.name):

curl -XGET 'http://localhost:9200/questions/question/1/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}'

{
  "ok" : true,
  "_index" : "questions_1390104463",
  "_type" : "question",
  "_id" : "1",
  "matched" : true,
  "explanation" : {
    "value" : 0.058849156,
    "description" : "max of:",
    "details" : [ {
      "value" : 0.058849156,
      "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
      // weight = score = queryWeight x fieldWeight
      "details" : [ {
        // score and queryWeight are NOT a part of the other explain!
        "value" : 0.058849156,
        "description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
        "details" : [ {
          "value" : 0.30685282,
          "description" : "queryWeight, product of:",
          "details" : [ {
            // This idf is NOT a part of the other explain!
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 1.0,
            "description" : "queryNorm"
          } ]
        }, {
          "value" : 0.19178301,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "tf(freq=1.0), with freq of:",
            "details" : [ {
              "value" : 1.0,
              "description" : "termFreq=1.0"
            } ]
          }, {
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 0.625,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      } ]
    } ]
  }
}

Document 8 (match on both _all and tags.name):

curl -XGET 'http://localhost:9200/questions/question/8/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}'

{
  "ok" : true,
  "_index" : "questions_1390104463",
  "_type" : "question",
  "_id" : "8",
  "matched" : true,
  "explanation" : {
    "value" : 0.15342641,
    "description" : "max of:",
    "details" : [ {
      "value" : 0.033902764,
      "description" : "btq, product of:",
      "details" : [ {
        "value" : 0.033902764,
        "description" : "weight(_all:anim in 0) [PerFieldSimilarity], result of:",
        "details" : [ {
          "value" : 0.033902764,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 0.70710677,
            "description" : "tf(freq=0.5), with freq of:",
            "details" : [ {
              "value" : 0.5,
              "description" : "phraseFreq=0.5"
            } ]
          }, {
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 0.15625,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      }, {
        "value" : 1.0,
        "description" : "allPayload(...)"
      } ]
    }, {
      "value" : 0.15342641,
      "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
      // weight = fieldWeight
      // No score or queryWeight in sight!
      "details" : [ {
        "value" : 0.15342641,
        "description" : "fieldWeight in 0, product of:",
        "details" : [ {
          "value" : 1.0,
          "description" : "tf(freq=1.0), with freq of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "termFreq=1.0"
          } ]
        }, {
          "value" : 0.30685282,
          "description" : "idf(docFreq=1, maxDocs=1)"
        }, {
          "value" : 0.5,
          "description" : "fieldNorm(doc=0)"
        } ]
      } ]
    } ]
  }
}
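When comparing explain outputs like the two above, it can help to check mechanically which nodes appear in each tree instead of reading them by eye. The following is a minimal sketch with hypothetical helper names (iter_nodes, has_query_weight). The two sample trees are hand-trimmed from the outputs above, since the pasted JSON contains // comments and would not parse as-is.

```python
def iter_nodes(explanation):
    """Yield an explain node and all of its nested "details" nodes."""
    yield explanation
    for child in explanation.get("details", []):
        yield from iter_nodes(child)


def has_query_weight(explanation):
    """Return True if any node in the tree is a queryWeight node."""
    return any(
        node.get("description", "").startswith("queryWeight")
        for node in iter_nodes(explanation)
    )


# Trimmed tags.name subtree from document 1's explain
doc1_tags = {
    "value": 0.058849156,
    "description": "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
    "details": [{
        "value": 0.058849156,
        "description": "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
        "details": [
            {"value": 0.30685282, "description": "queryWeight, product of:"},
            {"value": 0.19178301, "description": "fieldWeight in 0, product of:"},
        ],
    }],
}

# Trimmed tags.name subtree from document 8's explain
doc8_tags = {
    "value": 0.15342641,
    "description": "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
    "details": [
        {"value": 0.15342641, "description": "fieldWeight in 0, product of:"},
    ],
}

print("doc 1 has queryWeight:", has_query_weight(doc1_tags))
print("doc 8 has queryWeight:", has_query_weight(doc8_tags))
```

The same helpers should work on a full explain response by passing its "explanation" object, assuming the response has been parsed into a dict first.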

Source

2014-01-19 tmandry

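Hi, did you ever find the answer yourself? Or do you have any sources to learn from? I'm suffering from the same lack of understanding. In our case this negatively affects some hits, and I need to understand why and how to adjust our query. – Jakub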

No, I never found an answer, unfortunately. Curious to see what you hear back. – tmandry

I don't have an answer. I just wanted to mention that I posted the question to the Elasticsearch forum: https://groups.google.com/forum/#!topic/elasticsearch/xBKlFkq0SP0 I'll post an update here when I get an answer.

Source

2015-04-17 13:12:19 Jakub
