Searching multi-word index terms with dismax
My Solr schema is as follows (only the important parts):
<fieldType name="bagofwords_expertfinding" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords_en.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^[0-9-/_,\.]+$" replacement="" replace="all"/>
    <!-- remove letters repeated more than two times -->
    <filter class="solr.PatternReplaceFilterFactory" pattern="^.*(([aA-zZ])\\2)\\2+.*$" replacement=""/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="3" max="100"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords_en.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^[0-9-/_,\.]+$" replacement="" replace="all"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="3" max="100"/>
  </analyzer>
</fieldType>
<fieldType name="namedentities_expertfinding" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- strip whitespace around commas, then split on commas: each comma-separated entity becomes one token -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\s," replacement=","/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern=",\s" replacement=","/>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="," />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords_en.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^[0-9-/_,\.]+$" replacement="" replace="all"/>
    <filter class="solr.LengthFilterFactory" min="3" max="100"/>
  </analyzer>
</fieldType>
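In namedentities I index multi-word terms, such as "diego alberto milito" and "diego armando maradona". I am trying to search across both fields with a dismax query and boost them.
But when I try this query:
localhost:8080/solr/select/?q="maradona"&defType=dismax&qf=namedentities^100 bagofwords^1&fl=*,score&debugQuery=true&mm=0
Solr finds nothing. Maybe I don't understand the correct use of the " symbol.
I also don't understand this passage from the Solr wiki:
"In Solr 1.4 and prior, you should basically set mm=0 if you want the equivalent of q.op=OR, and mm=100% if you want the equivalent of q.op=AND. In 3.x and trunk the default value of mm is dictated by the q.op param (q.op=AND => mm=100%; q.op=OR => mm=0%). Keep in mind the default operator is affected by your schema.xml entry. In older versions of Solr the default value is 100% (all clauses must match)."
Given that defaultOperator is OR in my schema, why do I get a default mm value of 100% when I don't set mm=0?
Thanks in advance!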
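On the mm question, one way to take the version-dependent default out of the picture is to set mm explicitly, either on the request (as in the query above) or as a handler default in solrconfig.xml. The following is only a minimal sketch: the handler name /select and the default values shown are assumptions for illustration, not taken from the original configuration.

```xml
<!-- solrconfig.xml: hypothetical handler; its name and default values are assumptions -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- use the dismax parser unless the request says otherwise -->
    <str name="defType">dismax</str>
    <!-- same field boosts as in the query above -->
    <str name="qf">namedentities^100 bagofwords^1</str>
    <!-- explicit mm: OR-like matching, independent of the Solr version's default -->
    <str name="mm">0</str>
  </lst>
</requestHandler>
```

A request can still override these defaults by passing its own mm parameter.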
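The output of the debug version of the parsed query would also be useful. I suspect that, because of how you tokenize the field, your exact search will not match, since neither of the entries is the string you are searching for in quotes. – MatsLindh 2012-02-13 21:46:17
Thanks. I finally found out that quotes do not mean an exact match but a phrase search: a run of consecutive tokens, so I changed my schema analyzers. But there is no way to deal with multi-word tokens... so I search for phrases in a single-word index. – Tywnil 2012-02-13 21:56:15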
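The approach described in the last comment (index single words, then search with phrases) could look roughly like the sketch below. It is not the asker's actual final schema: the field type name namedentities_words is hypothetical, and using one shared analyzer for index and query time is an assumption made so that the tokens on both sides line up.

```xml
<!-- hypothetical replacement for namedentities_expertfinding; the name is an assumption -->
<fieldType name="namedentities_words" class="solr.TextField" positionIncrementGap="100">
  <!-- a single analyzer (no type attribute) is applied at both index and query time -->
  <analyzer>
    <!-- turn the comma separators into spaces before tokenizing -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="," replacement=" "/>
    <!-- one token per word: diego / armando / maradona -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With tokens indexed per word, q="maradona" and q="diego armando maradona" are both phrase queries over tokens that exist in the index. One caveat: if all entities are stored in a single comma-separated value, a phrase can match across two neighbouring entities. Indexing each entity as a separate value of a multiValued field lets positionIncrementGap="100" keep them apart.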