2014-01-19

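Why is queryWeight included for some result scores, but not others, in the same query?

I'm executing a query_string query with a single term on multiple fields, _all and tags.name, and trying to understand the scoring. Query: {"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}. Here are the documents returned by the query: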

  • Document 1 has an exact match on tags.name, but not on _all.
  • Document 8 has an exact match on both tags.name and _all.

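Document 8 should win, and it does, but I'm confused by how the scoring works out. It looks like Document 1 is penalized by having its tags.name score multiplied by the IDF twice, whereas Document 8's tags.name score is only multiplied by the IDF once. In short: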

  • Both have a component weight(tags.name:animal in 0) [PerFieldSimilarity].
  • In Document 1, we have weight = score = queryWeight x fieldWeight.
  • In Document 8, we have weight = fieldWeight.

Since queryWeight contains the idf, and fieldWeight contains it as well, Document 1 ends up being penalized by its idf twice.
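As a sanity check, the numbers in the explain output further down can be reproduced by hand. The following is a minimal Python sketch, assuming Lucene's classic TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are made up for illustration and are not part of any Elasticsearch API. The inputs (freq, docFreq, maxDocs, fieldNorm, queryNorm) are copied from the explain output.

```python
import math


def idf(doc_freq, max_docs):
    # Classic Lucene idf (assumed): 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    return math.sqrt(freq) * idf(doc_freq, max_docs) * field_norm


def query_weight(doc_freq, max_docs, query_norm=1.0):
    # queryWeight = idf * queryNorm
    return idf(doc_freq, max_docs) * query_norm


# Document 1: score = queryWeight * fieldWeight, so idf is applied twice.
doc1_field_weight = field_weight(1.0, 1, 1, 0.625)
doc1_score = query_weight(1, 1) * doc1_field_weight

# Document 8: each clause is just a fieldWeight, so idf is applied once.
doc8_tags = field_weight(1.0, 1, 1, 0.5)
doc8_all = field_weight(0.5, 1, 1, 0.15625)
doc8_score = max(doc8_all, doc8_tags)  # "max of:" in the explain output

print("doc 1 fieldWeight:", doc1_field_weight)
print("doc 1 score:      ", doc1_score)
print("doc 8 _all:       ", doc8_all)
print("doc 8 tags.name:  ", doc8_tags)
print("doc 8 score:      ", doc8_score)
```

Up to float rounding, these agree with the explain values (0.19178301, 0.058849156, 0.033902764 and 0.15342641), which supports the reading that the only difference between the two tags.name clauses is the extra queryWeight factor.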

Can anyone make sense of this?

Additional info

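  • If I remove _all from the fields of the query, queryWeight disappears from the explain output completely.
  • Adding "use_dis_max":true as an option has no effect.
    • However, additionally adding "tie_breaker":0.7 (or any value) does affect Document 8, by giving it the more complicated formula we see in Document 1.
    • Thoughts: it's plausible that a boolean query (which this is) might do this on purpose to give more weight to queries that match more than one sub-query. However, this doesn't make any sense for a dis_max query, which is supposed to just return the maximum of the sub-queries.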
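For reference, the variants described in the list above can be written out as request bodies. This is only a sketch in Python that builds the JSON; the helper name query_string_body and the labels are invented for illustration, while the option names (fields, use_dis_max, tie_breaker) are the ones mentioned above.

```python
import json

# Base query_string clause from the question.
BASE = {"query": "animal", "fields": ["_all", "tags.name"]}


def query_string_body(**options):
    # Merge extra query_string options into the base clause.
    return {"query": {"query_string": {**BASE, **options}}}


variants = {
    "both fields": query_string_body(),
    "tags.name only": query_string_body(fields=["tags.name"]),
    "use_dis_max": query_string_body(use_dis_max=True),
    "use_dis_max + tie_breaker": query_string_body(use_dis_max=True, tie_breaker=0.7),
}

for label, body in variants.items():
    print(label + ":", json.dumps(body))
```

Each printed body can be passed as the -d argument of the _explain requests shown further down to compare the resulting explanations.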

Here are the relevant explain requests. Look for the embedded comments.

Document 1 (match only on tags.name):

curl -XGET 'http://localhost:9200/questions/question/1/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}'

{
  "ok" : true,
  "_index" : "questions_1390104463",
  "_type" : "question",
  "_id" : "1",
  "matched" : true,
  "explanation" : {
    "value" : 0.058849156,
    "description" : "max of:",
    "details" : [ {
      "value" : 0.058849156,
      "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
      // weight = score = queryWeight x fieldWeight
      "details" : [ {
        // score and queryWeight are NOT a part of the other explain!
        "value" : 0.058849156,
        "description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
        "details" : [ {
          "value" : 0.30685282,
          "description" : "queryWeight, product of:",
          "details" : [ {
            // This idf is NOT a part of the other explain!
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 1.0,
            "description" : "queryNorm"
          } ]
        }, {
          "value" : 0.19178301,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "tf(freq=1.0), with freq of:",
            "details" : [ {
              "value" : 1.0,
              "description" : "termFreq=1.0"
            } ]
          }, {
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 0.625,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      } ]
    } ]
  }
}

Document 8 (match on both _all and tags.name):

curl -XGET 'http://localhost:9200/questions/question/8/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}'

{
  "ok" : true,
  "_index" : "questions_1390104463",
  "_type" : "question",
  "_id" : "8",
  "matched" : true,
  "explanation" : {
    "value" : 0.15342641,
    "description" : "max of:",
    "details" : [ {
      "value" : 0.033902764,
      "description" : "btq, product of:",
      "details" : [ {
        "value" : 0.033902764,
        "description" : "weight(_all:anim in 0) [PerFieldSimilarity], result of:",
        "details" : [ {
          "value" : 0.033902764,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 0.70710677,
            "description" : "tf(freq=0.5), with freq of:",
            "details" : [ {
              "value" : 0.5,
              "description" : "phraseFreq=0.5"
            } ]
          }, {
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 0.15625,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      }, {
        "value" : 1.0,
        "description" : "allPayload(...)"
      } ]
    }, {
      "value" : 0.15342641,
      "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
      // weight = fieldWeight
      // No score or queryWeight in sight!
      "details" : [ {
        "value" : 0.15342641,
        "description" : "fieldWeight in 0, product of:",
        "details" : [ {
          "value" : 1.0,
          "description" : "tf(freq=1.0), with freq of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "termFreq=1.0"
          } ]
        }, {
          "value" : 0.30685282,
          "description" : "idf(docFreq=1, maxDocs=1)"
        }, {
          "value" : 0.5,
          "description" : "fieldNorm(doc=0)"
        } ]
      } ]
    } ]
  }
}
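When comparing many explain responses, scanning for queryWeight by eye gets tedious. Below is a small Python sketch that walks an explanation tree and collects every node whose description contains a given string. The helper name find_nodes is hypothetical, and the two trees are hand-trimmed copies of the tags.name clauses above (the // comments in the output above are annotations and would have to be removed before parsing a real response as JSON).

```python
def find_nodes(explanation, needle):
    """Return the descriptions of all nodes that mention `needle`."""
    hits = []
    if needle in explanation.get("description", ""):
        hits.append(explanation["description"])
    # Recurse into nested "details" entries, if there are any.
    for child in explanation.get("details", []):
        hits.extend(find_nodes(child, needle))
    return hits


# Trimmed tags.name clause of Document 1.
doc1_clause = {
    "value": 0.058849156,
    "description": "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
    "details": [{
        "value": 0.058849156,
        "description": "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
        "details": [
            {"value": 0.30685282, "description": "queryWeight, product of:"},
            {"value": 0.19178301, "description": "fieldWeight in 0, product of:"},
        ],
    }],
}

# Trimmed tags.name clause of Document 8.
doc8_clause = {
    "value": 0.15342641,
    "description": "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
    "details": [
        {"value": 0.15342641, "description": "fieldWeight in 0, product of:"},
    ],
}

print("doc 1:", find_nodes(doc1_clause, "queryWeight"))
print("doc 8:", find_nodes(doc8_clause, "queryWeight"))
```

On these two trees the search finds a queryWeight node only under Document 1, which is exactly the asymmetry the question is about.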

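Hi, did you find the answer yourself? Or do you have any sources to learn from? I'm struggling with the same lack of understanding. In our case this adversely affects some hits, and I need to understand why and how to tune our query. – Jakub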


No, unfortunately I never found an answer. Curious to see what you hear back. – tmandry
