2015-06-25 74 views
0

Remove stop words from a string in ASP.NET (C#)

I can't get my code to remove stop words from a string. Here is my code:

String Review="The portfolio is fine except for the fact that the last movement of sonata #6 is missing. What should one expect?"; 

string[] arrStopword = new string[] {"a", "i", "it", "am", "at", "on", "in", "to", "too", "very","of", "from", "here", "even", "the", "but", "and", "is","my","them", "then", "this", "that", "than", "though", "so", "are"}; 
StringBuilder sbReview = new StringBuilder(Review); 
foreach (string word in arrStopword){ 
sbReview.Replace(word, "");} 
Label1.Text = sbReview.ToString(); 
Running this gives Label1.Text = "The portfolo s fne except for fct tht lst movement st #6 s mssng. Wht should e expect? "

I expect it to return "portofolio fine except for fact last movement sonata #6 is missing. what should one expect?"

Does anyone know how to solve this?

Answers

0

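You could use " a ", " i ", and so on, so that the program only removes these words when they appear as whole words (that is, with spaces around them). Replace each match with a single space so the rest of the text keeps its spacing.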
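A minimal sketch of the space-padding idea, assuming a hypothetical helper class named `PaddedStopWords` (not part of the original post). It pads the text so the first and last words also have spaces around them, and repeats each replacement because two adjacent stop words share a space. The match is case-sensitive, so "The" is left alone when the list contains only "the", and a word followed by punctuation is not matched.

```csharp
using System;

public static class PaddedStopWords
{
    // Hypothetical helper: removes stop words only where spaces surround them.
    public static string Remove(string text, string[] stopWords)
    {
        // Pad both ends so the first and last words are also space-delimited.
        string padded = " " + text + " ";
        foreach (string word in stopWords)
        {
            string target = " " + word + " ";
            // Loop until stable: in " a a " one pass removes only the first match,
            // because the two matches share the middle space.
            while (padded.Contains(target))
            {
                padded = padded.Replace(target, " ");
            }
        }
        return padded.Trim();
    }
}
```

For example, `PaddedStopWords.Remove("this is", new[] { "is" })` leaves "this" intact, because the "is" inside "this" has no space in front of it.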

1

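The problem is that you are comparing substrings, not words. You need to split the original text, remove the stop words, and then rejoin it.

Try this: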

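List<string> words = Review.Split(' ').ToList(); // ToList() needs "using System.Linq;"
foreach (string stopWord in arrStopword)
    words.RemoveAll(w => w == stopWord); // Remove() would drop only the first occurrence
string result = String.Join(" ", words);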

The only problem I can see with this is that it doesn't handle punctuation very well, but you get the general idea.
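One possible way around the punctuation problem is to match whole words with a regular expression instead of splitting on spaces. This is a sketch under assumptions, not something proposed in the answer: the class name `RegexStopWords` is hypothetical, and the case-insensitive matching is a choice made here. The `\b` anchors match at word boundaries, so "is" is removed but the "is" inside "missing" or "this" is not.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexStopWords
{
    // Hypothetical helper: removes whole-word matches, ignoring case.
    public static string Remove(string text, string[] stopWords)
    {
        // Builds a pattern such as \b(?:the|of|is)\b; Escape guards against
        // regex metacharacters in the word list.
        string alternation = string.Join("|", stopWords.Select(w => Regex.Escape(w)));
        string pattern = @"\b(?:" + alternation + @")\b";

        string stripped = Regex.Replace(text, pattern, "", RegexOptions.IgnoreCase);

        // Removing a word leaves its surrounding spaces behind; collapse them.
        return Regex.Replace(stripped, @"\s+", " ").Trim();
    }
}
```

Punctuation attached to a kept word stays where it is. If a removed word was directly followed by punctuation, the space before it remains, so some extra cleanup may still be needed.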

2

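You can use LINQ to solve this. First, split your string on " " (space) with the Split function to get a sequence of words, then use Except to keep only the words that are not in the stop list, and finally apply string.Join to the result.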

var newString = string.Join(" ", Review.Split(' ').Except(arrStopword)); 
+1

That's a spicy meatball. Wouldn't have thought of Except. – LocEngineer

+0

Lovely, elegant solution. It could be enhanced by using an Except overload, for example to ignore case: .Except(arrStopword, StringComparer.InvariantCultureIgnoreCase) – getsetcode
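One caveat with Except: it is a set operation, so it also drops repeated words from the text, not just the stop words. If repeats matter, a Where filter against a HashSet is one alternative. The sketch below assumes a hypothetical helper class `FilterStopWords` and reuses the case-insensitive comparer from the comment above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterStopWords
{
    // Hypothetical helper: filters out stop words but keeps repeated words.
    public static string Remove(string text, IEnumerable<string> stopWords)
    {
        // A case-insensitive set, so "The" and "the" are treated alike.
        var stopSet = new HashSet<string>(stopWords, StringComparer.InvariantCultureIgnoreCase);

        // RemoveEmptyEntries skips the empty strings produced by double spaces.
        IEnumerable<string> kept = text
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(w => !stopSet.Contains(w));

        return string.Join(" ", kept);
    }
}
```

With the input "The very very good sonata is good" and the stop words "the" and "is", this keeps both occurrences of "very" and "good", whereas Except would return each of them only once.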