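I am using Solr to search for institutions... My Solr DB has about 400k documents, and each document has multiple fields such as ("name", "id", "city", ...)... Solr gives different queryNorm values for the same query
A document in my database looks like this: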
"docs":
{
"id": "91348",
"p_code": "71637",
"name": "University of Toronto - Mississauga",
"ext_name": "",
"city": "Mississauga",
"country": "CA",
"state": "ON",
"type": "academic/campus",
"alt_name": "",
"ext_city": "",
"zip": "L5L 1C6",
"alt_ext_city": "",
}
I write the query like this: {name: (university of toronto)}...
The top two matches for the query are:
"docs":
{
"id": "91348",
"p_code": "71637",
"name": "University of Toronto - Mississauga",
"ext_name": "",
"city": "Mississauga",
"country": "CA",
"state": "ON",
"type": "academic/campus",
"alt_name": "",
"ext_city": "",
"zip": "L5L 1C6",
"alt_ext_city": "",
"_version_": 1473710223400108000,
"score": 1.499069
},
{
"id": "10624",
"p_code": "7938",
"name": "University of Toronto",
"ext_name": "",
"city": "Toronto",
"country": "CA",
"state": "ON",
"type": "academic",
"alt_name": "Saint George Downtown Campus",
"ext_city": "",
"zip": "M5S 1A1",
"alt_ext_city": "",
"_version_": 1473710220148473900,
"score": 1.4967358
}
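I was really surprised to see that "University of Toronto - Mississauga" scores higher than "University of Toronto". Intuitively, the field containing "University of Toronto - Mississauga" should get a lower score, because it is longer than the other one.
I was also very surprised to find that Solr reports different queryNorm values: (0.03198291 = queryNorm) for the top document and (0.03203078 = queryNorm) for the second document. I assumed queryNorm should be exactly the same for all documents, since it is a function of the query only.
I am not sure whether this is how Solr is supposed to work, or whether something is wrong with my index or configuration. Has anyone run into the same problem?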
What does your complete query string look like? ..and are we talking about a single server, or is sharding or SolrCloud involved? – MatsLindh
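To illustrate why sharding matters here: under Lucene's classic TF-IDF similarity, queryNorm is 1/sqrt(sumOfSquaredWeights), and those weights contain the idf of each query term, which is computed from index statistics. If each shard uses its own local statistics, the same query can produce slightly different queryNorm values. The sketch below assumes that similarity is in use (the post does not say so), and the shard names and all document counts are hypothetical numbers chosen only for illustration.

```python
import math

def idf(doc_freq, num_docs):
    # Classic Lucene TF-IDF: idf = 1 + ln(numDocs / (docFreq + 1))
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def query_norm(term_idfs, query_boost=1.0):
    # queryNorm = 1 / sqrt(sumOfSquaredWeights), with term boosts taken as 1.0
    sum_of_squared_weights = (query_boost ** 2) * sum(w * w for w in term_idfs)
    return 1.0 / math.sqrt(sum_of_squared_weights)

def shard_query_norm(stats):
    # Every idf comes from the statistics of the shard that scores the doc
    return query_norm([idf(df, stats["num_docs"]) for df in stats["doc_freq"]])

# Hypothetical per-shard statistics for the terms: university, of, toronto
shard_a = {"num_docs": 200_000, "doc_freq": [40_000, 60_000, 300]}
shard_b = {"num_docs": 200_000, "doc_freq": [40_500, 59_000, 310]}

norm_a = shard_query_norm(shard_a)
norm_b = shard_query_norm(shard_b)
print(norm_a, norm_b)  # close to each other, but not identical
```

If the two documents in the question came from different shards, a small gap like 0.03198291 versus 0.03203078 would be consistent with this, though only the full explain output and the cluster layout can confirm it.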
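As for why the shorter field does not get a higher score, my best guess is that you have 'omitNorms = true'. Taking the shorter field length into account at scoring time, as you describe, depends on norms being stored. – femtoRgon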
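A minimal sketch of the length effect this comment refers to, again assuming Lucene's classic TF-IDF similarity: the length norm is 1/sqrt(number of terms in the field), so a shorter field gets a larger factor, but only when norms are stored. The function name and the token counts are illustrative assumptions. The real counts depend on the field's analyzer, for example on whether the hyphen is dropped.

```python
import math

def length_norm(num_terms, omit_norms=False):
    # With omitNorms=true no norm is stored, so the factor is effectively 1.0
    if omit_norms:
        return 1.0
    # Classic Lucene TF-IDF: lengthNorm = 1 / sqrt(numTerms)
    return 1.0 / math.sqrt(num_terms)

# Assumed token counts (they depend on the analyzer):
short_terms = 3  # "University of Toronto"
long_terms = 4   # "University of Toronto - Mississauga", hyphen dropped

with_norms = (length_norm(short_terms), length_norm(long_terms))
without_norms = (length_norm(short_terms, omit_norms=True),
                 length_norm(long_terms, omit_norms=True))
print(with_norms, without_norms)
```

With norms stored, the shorter name gets the larger factor. With norms omitted, both fields get the same factor, and the ranking is decided by the remaining parts of the score.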