2016-07-31 101 views
3

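Remove duplicate words from a text file

I have a text file containing nearly 45,000 words, one word per line. Thousands of these words appear more than 10 times. I want to create a new file with no repeated words. I am using a StreamReader, but it only reads through the file once. How can I get rid of the duplicate words? Please help. Thanks. My code looks like this: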

    ' Check up front that the file can be opened, and release the handle right away.
    Try
        File.OpenText(TextBox1.Text).Close()
    Catch ex As Exception
        MsgBox(ex.Message)
        Exit Sub
    End Try

    Dim line As String = String.Empty
    Dim OldLine As String = String.Empty
    Dim sr = File.OpenText(TextBox1.Text)

    ' The first line is read here but never written to the output file.
    line = sr.ReadLine
    OldLine = line

    ' Each line is compared only with the line before it,
    ' so a repeat is skipped only when it is adjacent to the previous occurrence.
    Do While sr.Peek <> -1
        Application.DoEvents()
        line = sr.ReadLine
        If OldLine <> line Then
            My.Computer.FileSystem.WriteAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\Splitted File without Repeats.txt", line & vbCrLf, True)
        End If

        OldLine = line
    Loop

    sr.Close()
    System.Diagnostics.Process.Start(My.Computer.FileSystem.SpecialDirectories.Desktop & "\Splitted File without Repeats.txt")
    MsgBox("Loop terminated. Stream Reader Closed." & vbCrLf)

Answers

2

You can use LINQ's Distinct() method for this.

This will work for small files:

Dim lines As String() = File.ReadAllLines("yourfile.txt") 
File.WriteAllLines("yourfile.txt", lines.Distinct().ToArray()) 
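
For files too large to load comfortably with ReadAllLines, one possible variation is to stream the input and remember the lines already seen in a HashSet. This is only a sketch, not part of the original answer: the module name DedupSketch and the helper WriteDistinctLines are hypothetical, and it assumes the input and output are different files, because File.ReadLines keeps the input open while it is being enumerated. The set of unique lines still has to fit in memory.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO

Module DedupSketch
    ' Hypothetical helper: copies inputPath to outputPath, keeping only the
    ' first occurrence of each line. The two paths must be different files.
    Sub WriteDistinctLines(inputPath As String, outputPath As String)
        ' Case-sensitive comparison; pass StringComparer.OrdinalIgnoreCase
        ' instead if "Word" and "word" should count as the same word.
        Dim seen As New HashSet(Of String)(StringComparer.Ordinal)

        Using writer As New StreamWriter(outputPath)
            ' File.ReadLines yields one line at a time instead of loading the whole file.
            For Each line As String In File.ReadLines(inputPath)
                ' HashSet.Add returns False when the line was already seen.
                If seen.Add(line) Then
                    writer.WriteLine(line)
                End If
            Next
        End Using
    End Sub
End Module
```

With the names from the question, it could be called as WriteDistinctLines(TextBox1.Text, Path.Combine(My.Computer.FileSystem.SpecialDirectories.Desktop, "Splitted File without Repeats.txt")). The output is opened once rather than appended to on every line, which avoids reopening the file for each word.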
+1

Of course, dear Rakitić. You nailed it. – gulmaily