2014-05-03 43 views
0

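Process a comma-delimited CSV file into multiple files using ASP.NET

Can anyone point me in the right direction? Thanks in advance.

I'm looking to build a small application that processes a CSV file by writing its rows out to multiple CSV files, one per distinct value in a given column (say Header1), but I don't know where to start. FYI: the list of values in Header1 will change all the time.

I have been able to read the file into an array with this code: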

[Read From Comma-Delimited Text Files in Visual Basic][1] 

Now I want to process the data based on the first column. For example:

Input:

input.csv 

"Header1","Header2","Header3","Header4" 
"apple","pie","soda","beer" 
"apple","cake","milk","wine" 
"pear","pie","soda","beer" 
"pear","pie","soda","beer" 
"orange","pie","soda","beer" 
"orange","pie","soda","beer" 

OUTPUT:

output1.csv 

"Header1","Header2","Header3","Header4" 
"apple","pie","soda","beer" 
"apple","cake","milk","wine" 

output2.csv 

"Header1","Header2","Header3","Header4" 
"pear","pie","soda","beer" 
"pear","pie","soda","beer" 

output3.csv 

"Header1","Header2","Header3","Header4" 
"orange","pie","soda","beer" 
"orange","pie","soda","beer" 

Your link is broken. –

Answers

0

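A more suitable data structure than an array for holding the data is a Dictionary. It makes it easy to check whether you already have an entry for a particular category (say "apple" or "pear"). You then either add a new entry to the dictionary or append to the existing one.

To create the output files, iterate over each entry in the dictionary (one entry per file), and within that iterate over each item in the entry's value (the lines of that file).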

Option Infer On

Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module Module1

    Sub SeparateCsvToFiles(srcFile As String)

        Dim d As New Dictionary(Of String, List(Of String))
        Dim headers As String()

        Using tfp As New TextFieldParser(srcFile)
            tfp.HasFieldsEnclosedInQuotes = True
            tfp.SetDelimiters(",")
            Dim currentRow As String()

            ' Get the headers
            Try
                headers = tfp.ReadFields()
            Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
                Throw New FormatException(String.Format("Could not read header line in ""{0}"".", srcFile))
            End Try

            ' Read the data
            Dim lineNumber As Integer = 1

            While Not tfp.EndOfData
                Try
                    currentRow = tfp.ReadFields()

                    'TODO: Possibly handle the wrong number of entries more gracefully.
                    If currentRow.Count = headers.Count Then
                        ' assume column to sort on is the zeroth one
                        Dim category = currentRow(0)
                        Dim values = String.Join(",", currentRow.Skip(1).Select(Function(s) """" & s & """"))

                        If d.ContainsKey(category) Then
                            d(category).Add(values)
                        Else
                            Dim valuesList As New List(Of String)
                            valuesList.Add(values)
                            d.Add(category, valuesList)
                        End If

                    Else
                        Throw New FormatException(String.Format("Wrong number of entries in line {0} in ""{1}"".", lineNumber, srcFile))
                    End If

                Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
                    Throw New FormatException(String.Format("Could not read data line {0} in ""{1}"".", lineNumber, srcFile))
                End Try

                lineNumber += 1

            End While
        End Using

        ' Output the data
        'TODO: Write code to output files to a different directory.
        Dim destDir = Path.GetDirectoryName(srcFile)

        Dim fileNumber As Integer = 1
        Dim headerLine = String.Join(",", headers.Select(Function(s) """" & s & """"))

        'TODO: think up more meaningful names instead of x and y.
        For Each x In d
            Dim destFile = Path.Combine(destDir, "output" & fileNumber.ToString() & ".csv")

            Using sr As New StreamWriter(destFile)
                sr.WriteLine(headerLine)
                For Each y In x.Value
                    sr.WriteLine(String.Format("""{0}"",{1}", x.Key, y))
                Next
            End Using

            fileNumber += 1

        Next

    End Sub

    Sub Main()
        SeparateCsvToFiles("C:\temp\input.csv")
        Console.WriteLine("Done.")
        Console.ReadLine()

    End Sub

End Module

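Thank you very much for the quick reply! So far this works great. I was wondering whether there is a way to modify the sr.WriteLine part in case I want to selectively write only certain columns, say only Header1, Header3 and/or Header4. – user3599851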

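The line to modify would be `Dim values = String.Join(",", currentRow.Skip(1).Select(Function(s) """" & s & """"))`, which is where the part of the output line after the category is built. I'm sure you can do that yourself with a For..Next loop. –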
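
A minimal sketch of what that For..Next loop could look like, assuming the wanted columns are given as zero-based indices. The `JoinSelectedColumns` helper and its `wantedColumns` parameter are hypothetical names for illustration, not part of the answer's code:

    Imports System
    Imports System.Collections.Generic

    Module ColumnSelectionSketch

        ' Hypothetical helper: quotes and joins only the fields at the given zero-based indices.
        Function JoinSelectedColumns(row As String(), wantedColumns As Integer()) As String
            Dim parts As New List(Of String)
            For i As Integer = 0 To wantedColumns.Length - 1
                parts.Add("""" & row(wantedColumns(i)) & """")
            Next
            Return String.Join(",", parts.ToArray())
        End Function

        Sub Main()
            Dim row As String() = {"apple", "pie", "soda", "beer"}
            ' Keep Header3 and Header4 (indices 2 and 3); the category in column 0 is written separately.
            Console.WriteLine(JoinSelectedColumns(row, New Integer() {2, 3}))
        End Sub

    End Module

In the answer's code, a call like this would stand in for the `String.Join(...)` expression assigned to `values`. The header line built from `headers` would presumably need the same selection (here indices 0, 2 and 3) so that the columns still line up.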

@user3599851 Sorry, I forgot to include the @ in my previous comment to make sure you got notified. –

0

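What you can do is:

  • Read the key column into a list q
  • Create a list dist of its distinct items
  • Compare each value in q against dist and write the row to the file numbered by that value's index in dist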

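' Requires Imports System.IO at the top of the file.
' Note: Split is a naive split and breaks on fields that contain commas.
Dim lines As String() = File.ReadAllLines("input.csv")
Dim q = (From line In lines
         Let x = line.Split(","c)
         Select x(0)).ToList()
Dim dist = q.Distinct().ToList()

' dist(0) is the header value, so the categories start at index 1.
' Create one output file per category and write the header line to it.
For j As Integer = 1 To dist.Count - 1
    ' FileMode.Create truncates any file left over from a previous run.
    Using sw As New StreamWriter(File.Open("output" & j & ".csv", FileMode.Create))
        sw.WriteLine(lines(0))
    End Using
Next

' Append each data row to the file numbered by its category's index in dist.
For i As Integer = 1 To q.Count - 1
    ' Debug output: the key value and the file number it maps to.
    Console.WriteLine(q(i))
    Console.WriteLine(dist.IndexOf(q(i)))

    Using sw As New StreamWriter(File.Open("output" & dist.IndexOf(q(i)) & ".csv", FileMode.Append))
        sw.WriteLine(lines(i))
    End Using
Next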

If the key column is not the first one, change the index in x(0).
