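In my project I have implemented full-text index search using Lucene. While doing this, I got stuck on the logic for telling the Lucene Boolean operators apart from the ordinary words and, or, not. How can I distinguish the Lucene.net Boolean operators AND, OR, NOT from the normal words and, or, not?
Suppose, for example, that we are searching for "I want a pen and pencil". By default Lucene.net combines terms with the OR operator, so it searches for something like "I OR want OR a OR pen OR pencil", rather than what I want, which is "I OR want OR a OR pen OR and OR pencil". So how do we distinguish a normal and, or, not from the Lucene operators?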
To handle this, I wrote a helper method, which looks like this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

/// <summary>
/// Method to get search predicates
/// </summary>
/// <param name="searchTerm">Search term</param>
/// <returns>List of predicates: exact, AND, OR, NOT (in that order)</returns>
public static IList<string> GetPredicates(string searchTerm)
{
    //// Remove unwanted characters
    //searchTerm = Regex.Replace(searchTerm, "[<(.|\n)*?!'`>]", string.Empty);

    //// Exact search term: the whole input as a quoted phrase
    string exactSearchTerm = "\"" + searchTerm.Trim() + "\"";

    //// Search term without keywords
    string searchTermWithOutKeywords = Regex.Replace(
        searchTerm, " and not | and | or ", " ", RegexOptions.IgnoreCase);

    //// Split keywords
    string[] splittedKeywords = searchTermWithOutKeywords.Trim().Split(
        new char[] { ' ', ',' }, StringSplitOptions.RemoveEmptyEntries);

    //// Or search term
    string keywordOrSearchTerm = string.Join(" OR ", splittedKeywords);

    //// And search term
    string andSearchTerm = string.Join(" AND ", splittedKeywords);

    //// Not search term: take each piece that follows " and not ",
    //// cut off at the next " and " / " or "
    List<string> searchTerms = Regex.Split(searchTerm, " and not ", RegexOptions.IgnoreCase)
        .Skip(1) // the first piece precedes any " and not "
        .Select(term => Regex.Split(term, " and | or ", RegexOptions.IgnoreCase).First())
        .ToList();

    string notSearchTerm = searchTerms.Count > 0 ? string.Join(" , ", searchTerms) : "\"\"";

    return new List<string> { exactSearchTerm, andSearchTerm, keywordOrSearchTerm, notSearchTerm };
}
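This helper method returns four results, so I have to loop through my index four times, which seems very expensive. Can anyone suggest a way to handle this in a single pass?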
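One possible direction, offered only as a rough sketch: fold the four predicates into a single query string so the index is searched once. This assumes the classic Lucene query parser syntax (parentheses for grouping, `^` for boosts, and uppercase `AND`, `OR`, `NOT` as the only spellings treated as operators). The class name `QueryCombiner`, the method name `CombinePredicates`, and the boost values are hypothetical and untuned, and the input is assumed to be the four-element list returned by `GetPredicates`, in the same order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryCombiner
{
    // Hypothetical helper (not part of Lucene.net): merges the four
    // predicates from GetPredicates into one query string.
    // Assumes classic query parser syntax; boost values are illustrative.
    public static string CombinePredicates(IList<string> predicates)
    {
        string exact = predicates[0]; // quoted phrase, e.g. "\"pen and pencil\""
        string all = predicates[1];   // e.g. "pen AND pencil"
        string any = predicates[2];   // e.g. "pen OR pencil"
        string not = predicates[3];   // e.g. "ink , paper", or "\"\"" when empty

        // Rank exact-phrase hits above all-terms hits above any-term hits.
        string positive = exact + "^4 OR (" + all + ")^2 OR (" + any + ")";

        // GetPredicates uses the literal "" to signal "no excluded terms".
        if (not == "\"\"")
        {
            return positive;
        }

        // Quote each excluded piece so a multi-word piece stays a phrase.
        IEnumerable<string> excluded = not
            .Split(new[] { " , " }, StringSplitOptions.RemoveEmptyEntries)
            .Select(t => "\"" + t.Trim() + "\"");

        return "(" + positive + ") NOT (" + string.Join(" OR ", excluded) + ")";
    }
}
```

The resulting string would be parsed once and run as a single search. The sketch does not escape query-syntax characters in user input and does not handle an input made up only of keywords, so both would need attention before real use.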
Good suggestion. +1