2017-05-02

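Elasticsearch: changing the similarity does not work

Changing the similarity algorithm of my index does not work. I want to compare BM25 and TF-IDF, but I always get the same results. I am using Elasticsearch 5.x.

I have tried literally everything: setting the property's similarity to classic, to BM25, or not setting it at all.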

"properties": { 
      "content": { 
       "type": "text", 
       "similarity": "classic" 
      }, 

I also tried setting the default similarity of my index, both in settings and in properties:

"settings": { 
    "index": { 
     "number_of_shards": "5", 
     "provided_name": "test", 
     "similarity": { 
      "default": { 
       "type": "classic" 
      } 
     }, 
     "creation_date": "1493748517301", 
     "number_of_replicas": "1", 
     "uuid": "sNuWcT4AT82MKsfAB9JcXQ", 
     "version": { 
      "created": "5020299" 
     } 
    } 

The query I am testing with looks like this:

{
    "query": {
        "match": {
            "content": "some search query"
        }
    }
}

Answers


I created an example below:

DELETE test 
PUT test 
{
    "mappings": {
        "book": {
            "properties": {
                "content": {
                    "type": "text",
                    "similarity": "BM25"
                },
                "subject": {
                    "type": "text",
                    "similarity": "classic"
                }
            }
        }
    }
}

POST test/book/1 
{ 
    "subject": "A neutron star is the collapsed core of a large (10–29 solar masses) star. Neutron stars are the smallest and densest stars known to exist.[1] Though neutron stars typically have a radius on the order of 10 km, they can have masses of about twice that of the Sun.", 
    "content": "A neutron star is the collapsed core of a large (10–29 solar masses) star. Neutron stars are the smallest and densest stars known to exist.[1] Though neutron stars typically have a radius on the order of 10 km, they can have masses of about twice that of the Sun." 
} 
POST test/book/2 
{ 
    "subject": "A quark star is a hypothetical type of compact exotic star composed of quark matter, where extremely high temperature and pressure forces nuclear particles to dissolve into a continuous phase consisting of free quarks. These are ultra-dense phases of degenerate matter theorized to form inside neutron stars exceeding a predicted internal pressure needed for quark degeneracy.", 
    "content": "A quark star is a hypothetical type of compact exotic star composed of quark matter, where extremely high temperature and pressure forces nuclear particles to dissolve into a continuous phase consisting of free quarks. These are ultra-dense phases of degenerate matter theorized to form inside neutron stars exceeding a predicted internal pressure needed for quark degeneracy." 
} 

GET test/_search?explain
{
    "query": {
        "match": {
            "subject": "neutron"
        }
    }
}
GET test/_search?explain
{
    "query": {
        "match": {
            "content": "neutron"
        }
    }
}
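
When a search is run with ?explain, each hit carries an "_explanation" tree of nested "description" and "details" entries. As a rough sketch, the idf descriptions can be pulled out of such a tree to see which similarity actually scored a field. The helper name collect_descriptions and the sample dictionary are hypothetical; the sample is hand-written for illustration and is not real Elasticsearch output.

```python
def collect_descriptions(explanation, needle="idf"):
    """Return every description in an explanation tree that contains `needle`."""
    found = []
    if needle in explanation.get("description", ""):
        found.append(explanation["description"])
    # Recurse into nested explanations, if any.
    for child in explanation.get("details", []):
        found.extend(collect_descriptions(child, needle))
    return found


# Hand-written stand-in for one hit's "_explanation" (placeholder values).
sample = {
    "value": 0.0,
    "description": "weight(content:neutron in 0) [PerFieldSimilarity], result of:",
    "details": [
        {
            "value": 0.0,
            "description": "idf, computed as log(1 + (docCount - docFreq + 0.5)/(docFreq + 0.5)) from:",
            "details": [],
        }
    ],
}

for description in collect_descriptions(sample):
    print(description)
```

If the printed idf description is the same for two fields that were supposed to use different similarities, the mapping was probably not applied as intended.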

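The subject and content fields are defined with different similarities, but in the two documents I provided (taken from Wikipedia) both fields contain the same text. Run the two queries and you will see something like the following in the explanations, and you will also get different scores:

  • From the first query: "description": "idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:"

  • From the second query: "description": "idf, computed as log(1 + (docCount - docFreq + 0.5)/(docFreq + 0.5)) from:"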
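
To see why the two idf descriptions quoted above lead to different scores, the formulas can be evaluated directly. This is only a sketch: the function names are made up for illustration, the natural logarithm is assumed, and docCount = 2 and docFreq = 2 are assumed values (both example documents contain "neutron"). Real statistics are gathered per shard, so the numbers reported by explain may differ.

```python
import math


def classic_idf(doc_count, doc_freq):
    # Formula from the "classic" explain description.
    return math.log((doc_count + 1) / (doc_freq + 1)) + 1


def bm25_idf(doc_count, doc_freq):
    # Formula from the "BM25" explain description.
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Assumed statistics: two documents, both containing the term.
doc_count, doc_freq = 2, 2

print("classic idf:", classic_idf(doc_count, doc_freq))
print("BM25 idf:   ", bm25_idf(doc_count, doc_freq))
```

With the same statistics the two formulas give different values, so identical scores for both fields suggest that the same similarity is being used for both.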
    Perfect, thanks. Now I know what I did wrong. The explain output did it: I was calculating the wrong measure. I also found out that BM25 is the default!

    Cool. Thanks for the follow-up.