2016-03-06 113 views
-2

Efficient way to search for phrases that contain specific words: I need to do this in C# as efficiently as possible.

Suppose:

  1. Collection1:{"I am good", "He is best", "They are poor", "Mostly they are average", "All are very nice"}
  2. Collection2:{"good", "best", "nice"}

I want to search Collection1 for every item in Collection2 and store the matching phrases in Collection3, so Collection3 would look like this:

Collection3:{"I am good", "He is best", "All are very nice"}

+0

It looks like you need an inverted index; see how Lucene.NET does it, or just use that library. –

+0

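I wonder whether you have done any research on this topic. For example, the Related column on the right lists [What .NET collection provides the fastest search?](http://stackoverflow.com/questions/1009107/what-net-collection-provides-the-fastest-search) – Steve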

+0

@EugenePodskal: Each Collection1 item is a phrase. The Collection2 items are the words to search for in the Collection1 phrases; the matching phrases then go into Collection3. – p0iz0neR

Answers

0

The best way:

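// Requires: using System; using System.Linq;
string[] Collection1 = { "I am good", "He is best", "They are poor", "Mostly they are average", "All are very nice" };
string[] Collection2 = { "good", "best", "nice" };

// Keep every phrase that contains at least one search word.
// The comparison ignores case but leaves the phrase's own casing intact.
var Collection3 = Collection1
    .Where(x => Collection2.Any(y => x.IndexOf(y, StringComparison.OrdinalIgnoreCase) >= 0))
    .ToArray();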
+0

This does not compile with C# 2010. Could you please write the correct code? It looks simple and good. – p0iz0neR

+0

Add using System.Linq; and it should work. –

+1

You win, bro. Thanks a lot. – p0iz0neR
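
For anyone hitting the same compile error, here is a minimal sketch of the LINQ approach as a complete console program, using only syntax available in C# 4 (Visual Studio 2010). The `PhraseSearch` class and `FindMatches` method are names made up for illustration, and the case-insensitive substring test is an assumption about what kind of match is wanted.

```csharp
using System;
using System.Linq; // supplies Where, Any and ToArray for arrays

public static class PhraseSearch
{
    // Returns the phrases that contain at least one of the words,
    // in their original order and with their original casing.
    public static string[] FindMatches(string[] phrases, string[] words)
    {
        return phrases
            .Where(p => words.Any(w => p.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0))
            .ToArray();
    }

    public static void Main()
    {
        string[] collection1 =
        {
            "I am good", "He is best", "They are poor",
            "Mostly they are average", "All are very nice"
        };
        string[] collection2 = { "good", "best", "nice" };

        string[] collection3 = FindMatches(collection1, collection2);

        // Prints: I am good; He is best; All are very nice
        Console.WriteLine(string.Join("; ", collection3));
    }
}
```

Because this is a substring test, "good" would also match a phrase containing "goodness"; splitting phrases into words avoids that if whole-word matches are required.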

0
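// Requires: using System.Collections.Generic;
IList<string> Collection3 = new List<string>();

foreach (string str in Collection1)
{
    foreach (string word in Collection2)
    {
        if (str.Contains(word))
        {
            Collection3.Add(str);
            break; // add each phrase once, even if it contains several search words
        }
    }
}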
+0

This uses nested loops. I want to avoid recursive loops. – p0iz0neR

+0

In that case, you need to use LINQ. –

+0

I have used a Dictionary, but its ContainsValue function only returns TRUE or FALSE, not an index. – p0iz0neR
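
`ContainsValue` only reports whether a value exists. If the positions are needed, one option is to key the dictionary by word, store the indexes of the phrases containing that word, and read them back with `TryGetValue`. A rough sketch, assuming whitespace-separated words and case-insensitive matching; `WordIndex`, `Build` and `Find` are hypothetical names, not part of any library:

```csharp
using System;
using System.Collections.Generic;

public static class WordIndex
{
    // Maps each word to the indexes of the phrases that contain it.
    public static Dictionary<string, List<int>> Build(IList<string> phrases)
    {
        var index = new Dictionary<string, List<int>>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < phrases.Count; i++)
        {
            // A null separator array makes Split break on any whitespace.
            string[] words = phrases[i].Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
            foreach (string word in words)
            {
                List<int> positions;
                if (!index.TryGetValue(word, out positions))
                {
                    positions = new List<int>();
                    index[word] = positions;
                }
                // Do not record the same phrase twice when a word repeats in it.
                if (positions.Count == 0 || positions[positions.Count - 1] != i)
                {
                    positions.Add(i);
                }
            }
        }
        return index;
    }

    // Returns the distinct phrase indexes, in ascending order, that match any word.
    public static List<int> Find(Dictionary<string, List<int>> index, IEnumerable<string> words)
    {
        var hits = new SortedSet<int>();
        foreach (string word in words)
        {
            List<int> positions;
            if (index.TryGetValue(word, out positions))
            {
                hits.UnionWith(positions);
            }
        }
        return new List<int>(hits);
    }
}
```

With the question's data, `WordIndex.Find(WordIndex.Build(Collection1), Collection2)` yields the indexes 0, 1 and 4. The index is built once, and each search word then costs a hash lookup instead of a scan over every phrase.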

0

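Assuming that your Collection2 items are words in the usual meaning of the word [no pun intended], you can use LINQ ToLookup. It gives you a proper MultiValueDictionary analogue, and with it you can try something like: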

var phrases = new[] { "I am good", "He is best", "They are poor", "Mostly they are average", "All are very nice", "Not so\tgood \t", }; 

var lookup = phrases 
    .Select((phrase, index) => 
     new 
     { 
      phrase, 
      index, 
      words = phrase.Split((Char[])null, StringSplitOptions.RemoveEmptyEntries) 
     }) 
    .SelectMany(item => 
     item 
      .words 
      .Select(word => 
       new 
       { 
        word, 
        item.index, 
        item.phrase 
       })) 
    .ToLookup(
     keySelector: item => item.word, 
     elementSelector: item => new { item.phrase, item.index }); 

var wordsToSearch = new[] { "good", "best", "nice" }; 

var searchResults = wordsToSearch 
    .Select(word => 
     new 
     { 
      word, 
      phrases = lookup[word].ToArray() 
     }); 

foreach (var result in searchResults) 
{ 
    Console.WriteLine(
     "Word '{0}' can be found in phrases : {1}", 
     result.word, 
     String.Join(
      ", ", 
      result 
       .phrases 
       .Select(phrase => 
        String.Format("{0}='{1}'", phrase.index, phrase.phrase)))); 
}  

It gives you both the indexes and the phrases, so you can adapt it as needed.
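
As one possible adaptation, the sketch below flattens the per-word results into the Collection3 from the question: distinct phrases in their original order. It builds a simpler lookup (word to phrase index) so that it stands alone; `LookupSearch` and `MatchingPhrases` are hypothetical names, and matching is case-sensitive on whole whitespace-separated words.

```csharp
using System;
using System.Linq;

public static class LookupSearch
{
    public static string[] MatchingPhrases(string[] phrases, string[] wordsToSearch)
    {
        // word -> indexes of the phrases that contain that word
        var lookup = phrases
            .SelectMany((phrase, index) =>
                phrase.Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
                      .Select(word => new { word, index }))
            .ToLookup(item => item.word, item => item.index);

        return wordsToSearch
            .SelectMany(word => lookup[word]) // a missing key yields an empty sequence
            .Distinct()                       // a phrase may match several words
            .OrderBy(index => index)          // restore the original phrase order
            .Select(index => phrases[index])
            .ToArray();
    }
}
```

Calling `LookupSearch.MatchingPhrases(Collection1, Collection2)` with the question's collections returns "I am good", "He is best" and "All are very nice".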

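However, if your Collection2 consists of phrases rather than single words, you will need something more powerful, such as Lucene.NET, which handles full-text search properly.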
