2013-12-20

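C# program to read multiple text files and store the content in an array

I wrote a C# program that reads data from 5 text files and counts the words in them according to some given keywords.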

string[] word_1 = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\D1_H1.txt").Split(' ');
string[] word_2 = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\D2_H1.txt").Split(' ');
string[] word_3 = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\D3_H2.txt").Split(' ');
string[] word_4 = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\D4_H2.txt").Split(' ');
string[] word_5 = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\D5_H2.txt").Split(' ');
string[] given_doc = File.ReadAllText(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment\Given_doc.txt").Split(' ');

This is how I read the text files. After reading them, I use a for loop with if statements to count each keyword in those files:

for (int i = 0; i < word_1.Length; i++)
{
    string s = word_1[i];

    if ("Red".Equals(word_1[i]))
    {
        //Console.WriteLine(word[i]);
        h1_r++;
    }
    if ("Green".Equals(word_1[i]))
    {
        h1_g++;
    }
    if ("Blue".Equals(word_1[i]))
    {
        h1_b++;
    }
}

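This is the loop I use to count the keywords in one file, and it works fine. After that, I repeated the same process 5 times to read all the files. My question is: how can I read these 5 files using a loop and store the counts for each keyword in an array?

Thanks in advance!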

+0

Are the file names important, or are you just reading all the files in that directory? – germi

+0

Does your first code block compile? ReadAllText() returns a string, not an array. –

+0

Actually, it is the number of text files that matters, not the file names. I want to get data from multiple text files. –

Answers

1

Copy-pasting code is usually a bad idea. It leads to code that violates the Don't Repeat Yourself (DRY) rule. Adjust your code like this:

const string path = @"C:\Users\Niyomal N\Desktop\Assignment\Assignment"; 
string[] files = new string[] { "D1_H1.txt", "D2_H1.txt", "D3_H2.txt", /* ... */ };

foreach (string file in files) { 
    string fullPath = Path.Combine(path, file); 
    //TODO: count words of file `fullPath` 
} 

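Storing the word counts in an array is not optimal, because you would have to iterate through the array for every word you encounter in a file. Use a dictionary instead, which has constant lookup time. That is much faster.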

var wordCount = new Dictionary<string, int>(); 

Then you can count the words like this:

int count; 
if (wordCount.TryGetValue(word, out count)) { 
    wordCount[word] = count + 1; 
} else { 
    wordCount[word] = 1; 
} 
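
Putting the file loop and the dictionary together, a minimal sketch might look like the following. The names `WordCounter`, `CountWords` and `CountWordsInFile` are hypothetical, and splitting on any whitespace (rather than only on single spaces, as in the question) is an assumption about the file format.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class WordCounter
{
    // Hypothetical helper: tallies every word in the sequence.
    public static Dictionary<string, int> CountWords(IEnumerable<string> words)
    {
        var wordCount = new Dictionary<string, int>();
        foreach (string word in words)
        {
            int count;
            // TryGetValue leaves count at 0 when the word has not been seen yet.
            wordCount.TryGetValue(word, out count);
            wordCount[word] = count + 1;
        }
        return wordCount;
    }

    // Hypothetical helper: reads one file and counts its words.
    public static Dictionary<string, int> CountWordsInFile(string fullPath)
    {
        // A null separator array makes Split break on any whitespace character.
        string[] words = File.ReadAllText(fullPath)
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return CountWords(words);
    }
}
```

Inside the `foreach` loop over `files`, a call such as `WordCounter.CountWordsInFile(fullPath)` could then take the place of the TODO comment.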

UPDATE

You can test for keywords like this:

var keywords = new HashSet<string> { "Red", "Green", "Blue" }; 

string word = "Green"; 
if (keywords.Contains(word)) { 
    ... 
} 

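HashSets are just as fast as dictionaries.

Be careful with the casing of the words. HashSets are case-sensitive by default. If "red", "Red" and "RED" should all be found, initialize the HashSet like this: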

var keywords = new HashSet<string>(StringComparer.InvariantCultureIgnoreCase) 
    { "Red", "Green", "Blue" }; 
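
To count only the keywords and ignore every other word, the keyword test can be combined with the dictionary counting shown earlier. This is a sketch under assumptions: `KeywordCounter` and `CountKeywords` are hypothetical names, and the dictionary reuses the set's comparer so that "red" and "RED" share one counter.

```csharp
using System;
using System.Collections.Generic;

static class KeywordCounter
{
    // Hypothetical helper: counts only the words contained in the keyword set.
    public static Dictionary<string, int> CountKeywords(
        IEnumerable<string> words, HashSet<string> keywords)
    {
        // Same comparer as the set, so casing is treated consistently.
        var counts = new Dictionary<string, int>(keywords.Comparer);
        foreach (string word in words)
        {
            if (!keywords.Contains(word))
            {
                continue; // not a keyword, skip it
            }
            int count;
            counts.TryGetValue(word, out count); // 0 if not counted yet
            counts[word] = count + 1;
        }
        return counts;
    }
}
```

A keyword that never occurs in the input gets no entry in the result, so read the counts with `TryGetValue` rather than the indexer.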
+0

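How can I use this to count only those particular keywords? I mean, I want to count only the "Red", "Green" and "Blue" keywords in each document. The documents can contain other words, and I want to filter them out so that only those keywords are counted. –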

1
// Needs: using System; using System.Collections.Generic; using System.IO; using System.Linq;
List<KeyValuePair<string, string>> completeList = new List<KeyValuePair<string, string>>();
const string dir = @"C:\Users\Niyomal N\Desktop\Assignment\Assignment";

// List<T>.AddRange takes a single sequence, so pair each word with its file name first.
foreach (string name in new[] { "D1_H1.txt", "D2_H1.txt", "D3_H2.txt", "D4_H2.txt", "D5_H2.txt", "Given_doc.txt" })
{
    completeList.AddRange(File.ReadAllText(Path.Combine(dir, name)).Split(' ')
        .Select(word => new KeyValuePair<string, string>(name, word)));
}

// == is case-sensitive, so the keywords are spelled as in the question.
var result = completeList.GroupBy(r => r.Key).Select(r => new { File = r.Key, Red = r.Count(s => s.Value == "Red"), Green = r.Count(s => s.Value == "Green"), Blue = r.Count(s => s.Value == "Blue") });
foreach (var itm in result)
{
    Console.WriteLine(itm.File);
    Console.WriteLine(itm.Red);
    Console.WriteLine(itm.Green);
    Console.WriteLine(itm.Blue);
}
+0

I want to get the count of each keyword in each file separately and store them in an array. –

2

A LINQ query is the simplest solution here:

var filenames = new[] { "D1_H1.txt", "D2_H1.txt", "D3_H2.txt" };
var words = new[] { "Red", "Green", "Blue" };
var counters =
    filenames.Select(filename => Path.Combine(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment", filename))
        .SelectMany(filepath => File.ReadAllLines(filepath))
        .SelectMany(line => line.Split(new[] { ' ' }))
        .Where(word => words.Contains(word))
        .GroupBy(word => word, (key, values) => new
        {
            Word = key,
            Count = values.Count()
        })
        .ToDictionary(g => g.Word, g => g.Count);

You then have a dictionary of word counters across all the files:

int redCount = counters["Red"]; 

If you want to store the counters for each file separately, you can use a slightly modified query:

var filenames = new[] { "D1_H1.txt", "D2_H1.txt", "D3_H2.txt" };
var words = new[] { "Red", "Green", "Blue" };
var counters =
    filenames.Select(filename => Path.Combine(@"C:\Users\Niyomal N\Desktop\Assignment\Assignment", filename))
        .Select(filepath => new
        {
            Filepath = filepath,
            Count = File.ReadAllLines(filepath)
                .SelectMany(line => line.Split(new[] { ' ' }))
                .Where(word => words.Contains(word))
                .GroupBy(word => word, (key, values) => new
                {
                    Word = key,
                    Count = values.Count()
                })
                .ToDictionary(g => g.Word, g => g.Count)
        })
        .ToDictionary(g => g.Filepath, g => g.Count);

Then use it accordingly:

int redCount = counters[@"C:\Users\(...)\D1_H1.txt"]["Red"]; 
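
Note that the dictionary indexer throws a `KeyNotFoundException` when a keyword never occurs in a file, because `GroupBy` only produces groups for words that were actually found. If the counts are wanted as an array per file, as the question asks, one possible sketch is the helper below. `CounterTable` and `ToCountArray` are hypothetical names, and the array order is assumed to follow the `words` array.

```csharp
using System.Collections.Generic;
using System.Linq;

static class CounterTable
{
    // Hypothetical helper: turns one file's dictionary into an array ordered like `words`.
    public static int[] ToCountArray(IDictionary<string, int> fileCounts, string[] words)
    {
        return words.Select(word =>
        {
            int count;
            // A keyword that never occurs has no key; report it as 0 instead of throwing.
            return fileCounts.TryGetValue(word, out count) ? count : 0;
        }).ToArray();
    }
}
```

With the query above, `CounterTable.ToCountArray(counters[filepath], words)` would give an array in which index 0 holds the "Red" count, index 1 the "Green" count and index 2 the "Blue" count for that file.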