2015-11-25 63 views
0

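I made a simple program in which you type a phrase and it displays the videos matching the individual words. Say I enter "I go to school". Here it should remove the word "to" from the sentence and return three words. This is the code I have tried. It works fine, but when I enter certain phrases it removes the helping verbs and, beyond that, replaces them with an empty string, which causes problems. Any suggestions for removing helping words from a sentence?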

Code

class MyPlayer
{
    string complete_name;
    string root;
    string[] supportedExtensions;
    string videoname;

    public MyPlayer(string snt)
    {
        videoname = snt;
    }

    public List<VideosDetail> test()
    {
        complete_name = videoname.ToLower() + ".wmv";
        root = System.IO.Path.GetDirectoryName(@"C:\Users\Administrator\Desktop\VideosFrame\VideosFrame\Model\");
        supportedExtensions = new[] { ".wmv" };
        var files = Directory.GetFiles(Path.Combine(root, "Videos"), "*.*").Where(s => supportedExtensions.Contains(Path.GetExtension(s).ToLower()));

        List<VideosDetail> videos = new List<VideosDetail>();
        VideosDetail id;
        bool flagefilefound = false;
        foreach (var file in files)
        {
            id = new VideosDetail()
            {
                Path = file,
                FileName = Path.GetFileName(file),
                Extension = Path.GetExtension(file)
            };
            FileInfo fi = new FileInfo(file);
            if (id.FileName == complete_name)
            {
                id.FileName = fi.Name;
                id.Size = fi.Length;
                videos.Add(id);
                flagefilefound = true;
            }

            if (flagefilefound)
                break;
        }

        if (!flagefilefound)
        {
            MessageBox.Show("no such video is available. ");
        }
        return videos;
    }
}

private void play_Click(object sender, RoutedEventArgs e)
{
    List<string> chk = new List<string>();
    chk.Add("is");
    chk.Add("am");
    chk.Add("are");
    chk.Add("were");
    chk.Add("was");
    chk.Add("do");
    chk.Add("does");
    chk.Add("has");
    chk.Add("have");

    chk.Add("an");
    chk.Add("the");
    chk.Add("to");
    chk.Add("of");
    string sen = vdo.Text;
    List<string> tmp = new List<string>();
    string[] split = sen.Split(' ');
    foreach (var item in split)
    {
        tmp.Add(item);
    }
    foreach (var item in chk)
    {
        if (sen.Contains(item))
        {
            int index = sen.IndexOf(item);
            sen = sen.Remove(index, item.Length);
        }
    }
    foreach (var i in tmp)
    {
        MyPlayer player = new MyPlayer(i);
        VideoList.ItemsSource = player.test();
    }
}
+0

I don't see your question... – MajkeloDev

+0

So what does this code produce? And what is your question? –

+0

@MohitShrivastava I have edited my question – tabia

Answer

3

What you are actually doing is eliminating so-called stop words and, possibly, creating a bag of words:

private static HashSet<String> s_StopWords = 
    new HashSet<String>(StringComparer.OrdinalIgnoreCase) { 
    "is", "am", "are", "were", "was", "do", "does", "to", "from", // etc. 
}; 

private static Char[] s_Separators = new Char[] { 
    '\r', '\n', ' ', '\t', '.', ',', '!', '?', '"', //TODO: check this list 
}; 

... 

String source = "I go to school"; 

// ["I", "go", "school"] - "to" being a stop word is removed 
String[] words = source 
    .Split(s_Separators, StringSplitOptions.RemoveEmptyEntries) 
    .Where(word => !s_StopWords.Contains(word)) 
    .ToArray(); 

// Combine back: "I go school" 
String result = String.Join(" ", words); 
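
To make the difference from the question's code concrete, here is a minimal, self-contained sketch. The class and method names (`StopWordFilter`, `RemoveStopWords`, `RemoveBySubstring`) are hypothetical and chosen only for illustration, and the stop-word list simply reuses the words from the question. `RemoveBySubstring` imitates the `Contains`/`IndexOf`/`Remove` loop from the question, which cuts a stop word out even when it is part of a longer word; `RemoveStopWords` splits first and then compares whole tokens.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of the original post.
public static class StopWordFilter
{
    // Stop words taken from the question's list; extend as needed.
    private static readonly string[] StopWordList =
    {
        "is", "am", "are", "were", "was", "do", "does", "has", "have",
        "an", "the", "to", "of"
    };

    // Case-insensitive lookup, so "To" and "to" are treated alike.
    private static readonly HashSet<string> StopWords =
        new HashSet<string>(StopWordList, StringComparer.OrdinalIgnoreCase);

    private static readonly char[] Separators =
        { '\r', '\n', ' ', '\t', '.', ',', '!', '?', '"' };

    // Token-based removal: split into words, then drop whole words
    // that appear in the stop-word set.
    public static List<string> RemoveStopWords(string sentence)
    {
        return sentence
            .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => !StopWords.Contains(word))
            .ToList();
    }

    // Substring-based removal in the style of the question's loop:
    // removes the first occurrence of each stop word anywhere in the
    // text, including inside longer words.
    public static string RemoveBySubstring(string sentence)
    {
        foreach (var stop in StopWordList)
        {
            int index = sentence.IndexOf(stop, StringComparison.Ordinal);
            if (index >= 0)
            {
                sentence = sentence.Remove(index, stop.Length);
            }
        }
        return sentence;
    }
}
```

With this sketch, `StopWordFilter.RemoveBySubstring("this school")` yields `"th school"`, because "is" is cut out of "this", while `StopWordFilter.RemoveStopWords("this school")` keeps both words and `StopWordFilter.RemoveStopWords("I go to school")` returns `I`, `go`, `school`. Note also that the question's final loop iterates over `tmp`, which was filled before any removal, so the videos should be looked up from the filtered list instead.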
+0

This helped!! Thanks :) – tabia

+0

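As we said [yesterday](http://stackoverflow.com/questions/33896376/how-to-find-longest-sentence-in-text), you cannot parse sentences by simple string splitting. Sure, it may satisfy this homework requirement, but in the real world you would use a natural language processing library. – CodeCaster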