2017-01-18

Elasticsearch: adding weight or boost to a query with multiple term clauses

My sample index and document structure are as follows:

http://localhost:9200/testindex/ 
    { 
     "settings": { 
     "analysis": { 
      "analyzer": { 
      "autocomplete": { 
       "tokenizer": "whitespace", 
       "filter": [ 
       "lowercase", 
       "autocomplete" 
       ] 
      }, 
      "autocomplete_search": { 
       "tokenizer": "whitespace", 
       "filter": [ 
       "lowercase" 
       ] 
      } 
      }, 
      "filter": { 
      "autocomplete": { 
       "type": "nGram", 
       "min_gram": 2, 
       "max_gram": 40 
      } 
      } 
     } 
     }, 
     "mappings": { 
     "table1": { 
      "properties": { 
      "title": { 
       "type": "string", 
       "index": "not_analyzed" 
      }, 
      "type": { 
       "type": "string", 
       "index": "not_analyzed" 
      }, 
      "type1": { 
       "type": "string", 
       "index": "not_analyzed" 
      }, 
      "id": { 
       "type": "string", 
       "analyzer": "autocomplete", 
       "search_analyzer": "autocomplete_search" 
      } 
      } 
     } 
     } 
    } 



http://localhost:9200/testindex/table1/1 
{ 
    "title": "mumbai", 
    "type": "efg", 
    "type1": "efg1", 
    "id": "Caryle management" 
} 


http://localhost:9200/testindex/table1/2 
{ 
    "title": "canada", 
    "type": "abc", 
    "type1": "abc1", 
    "id": "labson series 2014" 
} 



http://localhost:9200/testindex/table1/3/ 
{ 
    "title": "ny", 
    "type": "abc", 
    "type1": "abc1", 
    "id": "labson series 2012" 
} 


http://localhost:9200/testindex/table1/4/ 
{ 
    "title": "pune", 
    "type": "abc", 
    "type1": "abc1", 
    "id": "hybrid management" 
} 




This is the query I use to get all documents where type is in ("abc", "efg") and id matches labson or management.


{ 
     "query": { 
     "bool": { 
      "filter": { 
      "query": { 
       "terms": { 
       "type": [ 
        "abc", 
        "efg" 
       ] 
       } 
      } 
      }, 
      "minimum_should_match": 1, 
      "should": [ 
      { 
       "query": { 
       "bool": { 
        "must": [ 
        { 
         "term": { 
         "_type": "table1" 
         } 
        }, 
        { 
         "bool": { 
         "should": [ 
          { 
          "match": { 
           "id": { 
           "query": "labson ", 
           "operator": "and" 
           } 
          } 
          }, 
          { 
          "match": { 
           "id": { 
           "query": "management", 
           "operator": "and" 
           } 
          } 
          } 
         ] 
         } 
        } 
        ] 
       } 
       } 
      } 
      ] 
     } 
     } 
    } 






    "hits": [ 
    { 
    "_index": "testindex", 
    "_type": "table1", 
    "_id": "2", 
    "_score": 1, 
    "_source": { 
    "title": "canada", 
    "type": "abc", 
    "type1": "abc1", 
    "id": "labson series 2014" 
    } 
    } 
    , 
    { 
    "_index": "testindex", 
    "_type": "table1", 
    "_id": "4", 
    "_score": 1, 
    "_source": { 
    "title": "pune", 
    "type": "abc", 
    "type1": "abc1", 
    "id": "hybrid management" 
    } 
    } 
    , 
    { 
    "_index": "testindex", 
    "_type": "table1", 
    "_id": "1", 
    "_score": 1, 
    "_source": { 
    "title": "mumbai", 
    "type": "efg", 
    "type1": "efg1", 
    "id": "Caryle management" 
    } 
    } 
    , 
    { 
    "_index": "testindex", 
    "_type": "table1", 
    "_id": "3", 
    "_score": 1, 
    "_source": { 
    "title": "ny", 
    "type": "abc", 
    "type1": "abc1", 
    "id": "labson series 2012" 
    } 
    } 
    ] 

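So I need help with the following questions about this output.

  1. Why is "labson series 2012" the last document in the results, even though my search criteria should consider labson first and then management? How can I add a boost or weight to the labson keyword over management, so that the output gives me all documents matching labson first and then those matching management, following the order in which the match clauses were entered?
  2. How can I add a filter at the top that gives me all documents with type in ("abc", "efg") and type1 in ("abc")? Right now I only search for type in ("abc", "efg"); how do I modify the query to include an IN clause on the type1 field?

Please provide some pseudocode for solutions to the two questions above. I am new to ES, so this would help me a lot.

Thanks in advance.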

Answer


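I want to clarify this statement of yours: "though my search criteria should consider labson first and then management". Elasticsearch does not take the order of the query clauses into account when generating scores. Each sub-query clause produces its score independently of its position, and those scores are then combined into the final score.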
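The order-independence described above can be illustrated with a small sketch. It is a simplified model, not Elasticsearch's actual scoring: it assumes each should clause contributes a fixed score equal to its boost when it matches, and that the contributions are simply summed. The helper names `matches`, `score`, and `rank` are hypothetical, and the real engine may apply additional normalization factors.

```python
# Simplified, assumed scoring model: sum of the boosts of the matching clauses.
DOCS = {
    "1": "Caryle management",
    "2": "labson series 2014",
    "3": "labson series 2012",
    "4": "hybrid management",
}

def matches(text, term):
    # Imitates the n-gram analyzer on "id": a lowercase search term matches
    # when it is a substring of any whitespace-separated, lowercased token.
    return any(term in token for token in text.lower().split())

def score(text, clauses):
    # clauses is a list of (term, boost) pairs; their order never enters the sum.
    return sum(boost for term, boost in clauses if matches(text, term))

def rank(clauses):
    # Sort by score descending, then by document id so ties are deterministic.
    return sorted(DOCS, key=lambda doc_id: (-score(DOCS[doc_id], clauses), doc_id))

labson_first = [("labson", 2), ("management", 1)]
management_first = [("management", 1), ("labson", 2)]

print(rank(labson_first))      # ['2', '3', '1', '4']
print(rank(management_first))  # ['2', '3', '1', '4']
```

Both clause orders give the same ranking in this model; only the boost values move the labson documents ahead of the management ones.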

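Please refer to the query below for your use case. For the score calculation, you can add the boost param in the match query to increase the score of documents that match. I used a constant_score query to ignore TF/IDF. To reduce the effect of normalization on scoring, you can turn off norms on the field before indexing documents. Use the following mapping, which disables norms on the id field: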

{ 
     "settings": { 
      "analysis": { 
       "analyzer": { 
        "autocomplete": { 
         "tokenizer": "whitespace", 
         "filter": [ 
          "lowercase", 
          "autocomplete" 
         ] 
        }, 
        "autocomplete_search": { 
         "tokenizer": "whitespace", 
         "filter": [ 
          "lowercase" 
         ] 
        } 
       }, 
       "filter": { 
        "autocomplete": { 
         "type": "nGram", 
         "min_gram": 2, 
         "max_gram": 40 
        } 
       } 
      } 
     }, 
     "mappings": { 
      "table1": { 
       "properties": { 
        "title": { 
         "type": "string", 
         "index": "not_analyzed" 
        }, 
        "type": { 
         "type": "string", 
         "index": "not_analyzed" 
        }, 
        "type1": { 
         "type": "string", 
         "index": "not_analyzed" 
        }, 
        "id": { 
         "type": "string", 
         "analyzer": "autocomplete", 
         "search_analyzer": "autocomplete_search", 
         "norms": { 
          "enabled": false 
         } 
        } 
       } 
      } 
     } 
    } 

A few discussion threads cover similar scoring use cases.

GitHub issue on query norm

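You also mentioned that you want a filter on top for type in ("abc", "efg") and type1 in ("abc"). To support this, I added a must clause with two sub-filters: a term filter on type1 and a terms filter on type.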

{ 
    "query": { 
     "filtered": { 
      "query": { 
       "bool": { 
        "should": [{ 
         "constant_score": { 
          "query": { 
           "match": { 
            "id": { 
             "query": "management", 
             "operator": "and" 
            } 
           } 
          }, 
          "boost": 1 
         } 
        }, { 
         "constant_score": { 
          "query": { 
           "match": { 
            "id": { 
             "query": "labson", 
             "operator": "and" 
            } 
           } 
          }, 
          "boost": 2 
         } 
        }], 
        "must": [{ 
         "term": { 
          "type1": { 
           "value": "abc" 
          } 
         } 
        }, { 
         "terms": { 
          "type": [ 
           "abc", 
           "efg" 
          ] 
         } 
        }] 
       } 
      } 
     } 
    } 
} 

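Given your requirement of type in ("abc", "efg") and type1 in ("abc"), no document actually matches the criteria, so you will get 0 hits if you run this query against the 4 documents mentioned above. If you want to change the AND clause to an OR clause, you can do so by making the appropriate changes to the query.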
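The zero-hit outcome can be checked against the four sample documents with a small in-memory sketch. It only imitates exact matching on the not_analyzed fields and does not call Elasticsearch; the predicate names are hypothetical. The `or_filter` dictionary is one assumed way to write the OR variant as a bool query with should clauses, shown as a request fragment rather than a tested query.

```python
import json

# Only the fields relevant to the filter, copied from the sample documents.
DOCS = [
    {"_id": "1", "type": "efg", "type1": "efg1"},
    {"_id": "2", "type": "abc", "type1": "abc1"},
    {"_id": "3", "type": "abc", "type1": "abc1"},
    {"_id": "4", "type": "abc", "type1": "abc1"},
]

def type_in(doc):
    return doc["type"] in ("abc", "efg")

def type1_in(doc):
    # Exact comparison, as on a not_analyzed field: "abc1" is not equal to "abc".
    return doc["type1"] in ("abc",)

and_hits = [d["_id"] for d in DOCS if type_in(d) and type1_in(d)]
or_hits = [d["_id"] for d in DOCS if type_in(d) or type1_in(d)]
print(and_hits)  # []
print(or_hits)   # ['1', '2', '3', '4']

# Assumed OR form of the filter: at least one of the two clauses must match.
or_filter = {
    "bool": {
        "should": [
            {"terms": {"type": ["abc", "efg"]}},
            {"terms": {"type1": ["abc"]}},
        ],
        "minimum_should_match": 1,
    }
}
print(json.dumps(or_filter, indent=2))
```

If the intent is to match the sample data with the AND form, the type1 list would need the exact stored values such as "abc1", because the field is not_analyzed.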

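Further, you can tune the ranking by giving each match query a different boost parameter; the final score is evaluated by combining the scores of the individual match queries.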
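How far apart the boosts need to be can be explored with a simplified additive model. The assumption here is that each matching clause adds exactly its boost and nothing else, which is not guaranteed to be how the engine scores in practice. Under that model, a document matching several low-priority clauses can tie with one matching a single high-priority clause unless each boost exceeds the sum of all lower boosts. The function names and the three-clause setup are hypothetical.

```python
def total(boosts, matched):
    # Sum the boosts of the clauses a document matched (given by clause index).
    return sum(boosts[i] for i in matched)

def strictly_prioritized(boosts):
    # True when every boost is larger than the sum of all boosts after it,
    # so matching clause i alone always beats matching every later clause.
    return all(boosts[i] > sum(boosts[i + 1:]) for i in range(len(boosts)))

linear = [3, 2, 1]    # descending by one
doubling = [4, 2, 1]  # each boost exceeds the sum of the lower ones

# Document A matches only the first clause; document B matches the other two.
print(total(linear, [0]), total(linear, [1, 2]))      # 3 3
print(total(doubling, [0]), total(doubling, [1, 2]))  # 4 3
print(strictly_prioritized(linear), strictly_prioritized(doubling))  # False True
```

With only two clauses, as in the query above, any boost greater than the lower one is enough in this model; the gap matters once a document can match more than one lower-priority clause.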

Hope this works for you. Thanks.


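So if I have to assign boost parameters from top to bottom, I should give the first match the highest value and then go in descending order, down to a minimum of 1 for the last match. – baiduXiu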


In the same query, if I have to specify type NOT IN ("abc", "efg") instead of type IN ("abc", "efg"), how do we do that in ES? – baiduXiu
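For the NOT IN case asked about above, one possible form is a bool query with a must_not clause holding the terms query. The sketch below only builds the request body as a Python dictionary and checks the same logic against the sample type values in memory; it is an assumed shape, not a query tested against this index.

```python
import json

excluded = ["abc", "efg"]

# Assumed request body: exclude documents whose "type" is one of the listed values.
not_in_query = {
    "query": {
        "bool": {
            "must_not": [
                {"terms": {"type": excluded}}
            ]
        }
    }
}
print(json.dumps(not_in_query, indent=2))

# In-memory check against the "type" values of the four sample documents.
sample_types = {"1": "efg", "2": "abc", "3": "abc", "4": "abc"}
remaining = [doc_id for doc_id, t in sorted(sample_types.items()) if t not in excluded]
print(remaining)  # []
```

Every sample document has a type of "abc" or "efg", so this exclusion leaves nothing from the four documents shown.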


Also, regarding the value 4: I tried 2 and 3, but they did not work, while it did work with 4. So how do we decide on the boost factor? – baiduXiu