2010-04-16 48 views
1

What is the best way to convert delimited text to fixed width?

FirstName,LastName,Title,BirthDate,HireDate,City,Region 
Nancy,Davolio,Sales Representative,1948-12-08,1992-05-01,Seattle,WA 
Andrew,Fuller,Vice President Sales,1952-02-19,1992-08-14,Tacoma,WA 
Janet,Leverling,Sales Representative,1963-08-30,1992-04-01,Kirkland,WA 
Margaret,Peacock,Sales Representative,1937-09-19,1993-05-03,Redmond,WA 
Steven,Buchanan,Sales Manager,1955-03-04,1993-10-17,London,NULL 
Michael,Suyama,Sales Representative,1963-07-02,1993-10-17,London,NULL 
Robert,King,Sales Representative,1960-05-29,1994-01-02,London,NULL 
Laura,Callahan,Inside Sales Coordinator,1958-01-09,1994-03-05,Seattle,WA 
Anne,Dodsworth,Sales Representative,1966-01-27,1994-11-15,London,NULL 

to this:

FirstName  LastName             Title                          BirthDate   HireDate   City            Region
---------- -------------------- ------------------------------ ----------- ---------- --------------- ---------------
Nancy      Davolio              Sales Representative           1948-12-08  1992-05-01 Seattle         WA
Andrew     Fuller               Vice President, Sales          1952-02-19  1992-08-14 Tacoma          WA
Janet      Leverling            Sales Representative           1963-08-30  1992-04-01 Kirkland        WA
Margaret   Peacock              Sales Representative           1937-09-19  1993-05-03 Redmond         WA
Steven     Buchanan             Sales Manager                  1955-03-04  1993-10-17 London          NULL
Michael    Suyama               Sales Representative           1963-07-02  1993-10-17 London          NULL
Robert     King                 Sales Representative           1960-05-29  1994-01-02 London          NULL
Laura      Callahan             Inside Sales Coordinator       1958-01-09  1994-03-05 Seattle         WA
Anne       Dodsworth            Sales Representative           1966-01-27  1994-11-15 London          NULL
+1

By BEST, do you mean most concise, most readable, or most performant? – 2010-04-16 16:27:58

+0

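I mean useful and flexible in real-world scenarios. Anyone who has been around long enough knows that the layout of a delimited file may eventually change :) – ehosca 2010-04-17 00:00:48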

Answers

1

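You have two problems here. Consider them separately and you will find a good solution more easily.

  1. Parse your CSV-formatted input data into a useful format.

  2. Present your data in some way.

Don't write your own CSV parser. The rules are a bit tricky, but the format is well known, and getting it wrong will hurt in the long run. There are existing CSV libraries you can call from the .NET Framework, but I don't know much about them. However, this problem is a perfect fit for the new dynamic feature in C#. This looks promising: http://tonikielo.blogspot.com/2010/01/c-40-dynamic-linq-to-csvhe.html

I assume printing the data is a trivial problem that you don't need our help with. If it isn't, you need to give us more information, such as how you want to decide the column widths.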
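For the parsing half, one existing option that ships with .NET is `TextFieldParser` in the `Microsoft.VisualBasic.FileIO` namespace. The sketch below is only an illustration, not something this answer recommends by name; the `CsvReading` class and `ReadRows` method are made-up names, and on .NET Framework the project is assumed to reference Microsoft.VisualBasic.dll.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvReading
{
    // Hypothetical helper: reads CSV text into rows of fields and lets
    // TextFieldParser deal with quoted fields that contain commas.
    public static List<string[]> ReadRows(string csvText)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

With this, a field such as `"Vice President, Sales"` comes back as a single value without the surrounding quotes, which a plain `Split(',')` would not give you.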

+0

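I agree with you. Dynamic looks very promising, and I will definitely look into it once we get the green light for 4.0. I ran into this problem because I am writing a SQL Query Analyzer type of tool that, instead of a database, lets you write OQL queries against a GemFire distributed cache and display the results. Query Analyzer allows the output to be shown in multiple views: in a grid, as text, and so on. I guess that is not really enough to solve the problem. The rule for determining column width is: the width of the widest single item in the column + 1 – ehosca 2010-04-25 13:33:23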
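The width rule stated in the comment above (widest item in a column, plus one) can be sketched with LINQ roughly as follows. This assumes the rows are already split into fields and that there is at least one row; `ColumnWidths`, `Measure`, and `Format` are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnWidths
{
    // Width of each column = length of its widest field + 1.
    // Rows shorter than the widest row count as empty in the missing columns.
    public static int[] Measure(IList<string[]> rows)
    {
        int columnCount = rows.Max(r => r.Length);
        return Enumerable.Range(0, columnCount)
            .Select(c => rows.Max(r => c < r.Length ? r[c].Length : 0) + 1)
            .ToArray();
    }

    // Pads every field to its column width and trims the trailing padding
    // of the last column on each line.
    public static string Format(IList<string[]> rows)
    {
        int[] widths = Measure(rows);
        string[] lines = rows
            .Select(r => string.Concat(
                r.Select((field, i) => field.PadRight(widths[i])).ToArray()).TrimEnd())
            .ToArray();
        return string.Join(Environment.NewLine, lines);
    }
}
```

Because the widths are measured from the data on every call, a change in the delimited layout (extra columns, longer values) needs no code change, at the cost of reading all rows before writing the first line.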

3

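I would create a custom class to hold the information, then loop over each line in the CSV file, split it on commas, and populate the custom object. Then put them all into a List or IEnumerable and bind that to a repeater/data grid.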

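using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Title { get; set; }
    public DateTime BirthDate { get; set; }
    public DateTime HireDate { get; set; }
    public string City { get; set; }
    public string Region { get; set; }
}

public static class PersonParser
{
    public static List<Person> Parse(string csv)
    {
        // Split on either line-ending character and drop the empty
        // entries that "\r\n" would otherwise produce.
        string[] lines = csv.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        List<Person> persons = new List<Person>();

        // Skip the header row. Note: a plain Split(',') does not handle quoted fields.
        foreach (string line in lines.Skip(1))
        {
            string[] values = line.Split(',');

            Person p = new Person();

            p.FirstName = values[0].Trim();
            p.LastName = values[1].Trim();
            p.Title = values[2].Trim();
            p.BirthDate = DateTime.ParseExact(values[3].Trim(), "yyyy-MM-dd", CultureInfo.InvariantCulture);
            p.HireDate = DateTime.ParseExact(values[4].Trim(), "yyyy-MM-dd", CultureInfo.InvariantCulture);
            p.City = values[5].Trim();
            p.Region = values[6].Trim();

            persons.Add(p);
        }

        return persons;
    }
}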
+0

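+1, this is a good start, especially if you need to validate the source file and catch possible data errors. (But I would put the 'Parse' method, which is responsible for the string splitting and parsing logic, in the 'Person' class) – Regent 2010-04-16 17:00:16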

+0

I'm not asking how to parse delimited values into an object :) – ehosca 2010-04-16 18:02:53

+0

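@ehosca - Once the data is in an object, though, you can write a ToString() method that prints out the whole row. Then write a simple foreach loop to handle the printing. – Matt 2010-04-16 19:57:57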
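A minimal sketch of the ToString() idea from the comment above, using the alignment component of .NET composite formatting (a negative alignment left-justifies and pads on the right). The column widths here are an assumption, copied from the dashed header row in the question, and the class is a condensed copy of the answer's Person.

```csharp
using System;
using System.Globalization;

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Title { get; set; }
    public DateTime BirthDate { get; set; }
    public DateTime HireDate { get; set; }
    public string City { get; set; }
    public string Region { get; set; }

    // Each placeholder is {index,-width}: left-aligned and padded to 'width'.
    // Widths (10, 20, 30, 11, 10, 15, 15) are assumed from the question's header.
    public override string ToString()
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "{0,-10} {1,-20} {2,-30} {3,-11:yyyy-MM-dd} {4,-10:yyyy-MM-dd} {5,-15} {6,-15}",
            FirstName, LastName, Title, BirthDate, HireDate, City, Region);
    }
}
```

Printing is then `foreach (Person p in persons) Console.WriteLine(p);`. The trade-off is that the widths are hard-coded, so values longer than their column push the rest of the line to the right instead of being truncated.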

3

This meets your stated requirements and uses LINQ (since your question is tagged LINQ), but it is not necessarily the best approach.

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        List<string> inputs = new List<string>
        {
            "FirstName,LastName,Title,BirthDate,HireDate,City,Region",
            "Nancy,Davolio,Sales Representative,1948-12-08,1992-05-01,Seattle,WA",
            "Andrew,Fuller,Vice President Sales,1952-02-19,1992-08-14,Tacoma,WA",
            "Janet,Leverling,Sales Representative,1963-08-30,1992-04-01,Kirkland,WA",
            "Margaret,Peacock,Sales Representative,1937-09-19,1993-05-03,Redmond,WA",
            "Steven,Buchanan,Sales Manager,1955-03-04,1993-10-17,London,NULL",
            "Michael,Suyama,Sales Representative,1963-07-02,1993-10-17,London,NULL",
            "Robert,King,Sales Representative,1960-05-29,1994-01-02,London,NULL",
            "Laura,Callahan,Inside Sales Coordinator,1958-01-09,1994-03-05,Seattle,WA",
            "Anne,Dodsworth,Sales Representative,1966-01-27,1994-11-15,London,NULL"
        };

        // TODO: These widths would presumably be configurable
        List<int> widths = new List<int> { 12, 22, 32, 13, 12, 17, 8 };

        List<string> outputs = inputs.Select(s => ToFixedWidths(s, ',', widths)).ToList();

        // Writes to the debugger's output window; swap in Console.WriteLine to print to the console
        outputs.ForEach(s => System.Diagnostics.Debug.WriteLine(s));

        Console.ReadLine();
    }

    private static string ToFixedWidths(string s, char separator, List<int> widths)
    {
        List<string> split = s.Split(separator).ToList();

        // TODO: Error handling - what if there are more/less separators in
        // string s than we have width values?

        // Pad each field on the right to its column width, then concatenate
        return string.Join(String.Empty, split.Select((ss, i) => ss.PadRight(widths[i], ' ')).ToArray());
    }
}

In a production scenario, though, I would expect to read this data into a proper Person class, as Matt suggests in his answer.
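One possible way to resolve the error-handling TODO in ToFixedWidths is sketched below. The policy is an assumption chosen for illustration, not something the answer specifies: missing fields are rendered as blanks, extra fields raise a FormatException, and over-long values are truncated to the column width. The class name `FixedWidth` is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FixedWidth
{
    // Assumed policy: too few fields -> blank columns, too many -> error,
    // value longer than its column -> truncated.
    public static string ToFixedWidths(string line, char separator, IList<int> widths)
    {
        string[] fields = line.Split(separator);
        if (fields.Length > widths.Count)
        {
            throw new FormatException(
                "Line has " + fields.Length + " fields but only " + widths.Count + " widths are defined.");
        }

        // Iterate over the widths (not the fields) so every column is always emitted.
        return string.Concat(widths.Select((width, i) =>
        {
            string value = i < fields.Length ? fields[i] : string.Empty;
            if (value.Length > width)
            {
                value = value.Substring(0, width);
            }
            return value.PadRight(width);
        }).ToArray());
    }
}
```

Truncation keeps every output line exactly the same length, which matters if a downstream consumer reads the file by character offset; if losing data silently is unacceptable, throwing on over-long values would be the safer choice.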

+0

Why not use 'ss.PadRight(widths[i], ' ')' in your 'ToFixedWidths' function? – Regent 2010-04-16 16:47:54

0
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace StringParsingWithLinq
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            var inputs = new List<string>
            {
                "FirstName,LastName,Title,BirthDate,HireDate,City,Region",
                "Nancy,Davolio,Sales Representative,1948-12-08,1992-05-01,Seattle,WA",
                "Andrew,Fuller,\"Vice President, Sales\",1952-02-19,1992-08-14,Tacoma,WA",
                "Janet,Leverling,Sales Representative,1963-08-30,1992-04-01,Kirkland,WA",
                "Margaret,Peacock,Sales Representative,1937-09-19,1993-05-03,Redmond,WA",
                "Steven,Buchanan,Sales Manager,1955-03-04,1993-10-17,London,NULL",
                "Michael,Suyama,Sales Representative,1963-07-02,1993-10-17,London,NULL",
                "Robert,King,Sales Representative,1960-05-29,1994-01-02,London,NULL",
                "Laura,Callahan,Inside Sales Coordinator,1958-01-09,1994-03-05,Seattle,WA",
                "Anne,Dodsworth,Sales Representative,1966-01-27,1994-11-15,London,NULL"
            };

            Console.Write(FixedWidthHelper.ReadLines(inputs)
                .ToFixedLengthString());
            Console.ReadLine();
        }

        #region Nested type: FixedWidthHelper

        public class FixedWidthHelper
        {
            private readonly Regex _csvRegex = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
            private readonly List<string[]> _data = new List<string[]>();
            private List<int> _fieldLen;

            public static FixedWidthHelper ReadLines(List<string> lines)
            {
                var fw = new FixedWidthHelper();
                lines.ForEach(fw.AddDelimitedLine);
                return fw;
            }

            private void AddDelimitedLine(string line)
            {
                string[] fields = _csvRegex.Split(line);

                if (_fieldLen == null)
                    _fieldLen = new List<int>(fields.Select(f => f.Length));

                for (int i = 0; i < fields.Length; i++)
                {
                    if (fields[i].Length > _fieldLen[i])
                        _fieldLen[i] = fields[i].Length;
                }

                _data.Add(fields);
            }

            public string ToFixedLengthString()
            {
                var sb = new StringBuilder();
                foreach (var list in _data)
                {
                    for (int i = 0; i < list.Length; i++)
                    {
                        sb.Append(list[i].PadRight(_fieldLen[i] + 1, ' '));
                    }
                    sb.AppendLine();
                }

                return sb.ToString();
            }
        }

        #endregion
    }
}


2

The easiest way I know of to do this is in PowerShell:

 
PS > Import-Csv .\x.csv | Format-Table -AutoSize

FirstName LastName  Title                    BirthDate  HireDate   City     Region
--------- --------  -----                    ---------  --------   ----     ------
Nancy     Davolio   Sales Representative     1948-12-08 1992-05-01 Seattle  WA
Andrew    Fuller    Vice President Sales     1952-02-19 1992-08-14 Tacoma   WA
Janet     Leverling Sales Representative     1963-08-30 1992-04-01 Kirkland WA
...

+0

I'd love to see the implementation of the -AutoSize option :) – ehosca 2010-04-25 13:34:16

0

I just wrote tablify for exactly this purpose. Install it with

[sudo -H] pip3 install tablify 

and then

tablify input.dat 

will give you

FirstName , LastName  , Title                    , BirthDate  , HireDate   , City     , Region
Nancy     , Davolio   , Sales Representative     , 1948-12-08 , 1992-05-01 , Seattle  , WA
Andrew    , Fuller    , Vice President Sales     , 1952-02-19 , 1992-08-14 , Tacoma   , WA
Janet     , Leverling , Sales Representative     , 1963-08-30 , 1992-04-01 , Kirkland , WA
Margaret  , Peacock   , Sales Representative     , 1937-09-19 , 1993-05-03 , Redmond  , WA
Steven    , Buchanan  , Sales Manager            , 1955-03-04 , 1993-10-17 , London   , NULL
Michael   , Suyama    , Sales Representative     , 1963-07-02 , 1993-10-17 , London   , NULL
Robert    , King      , Sales Representative     , 1960-05-29 , 1994-01-02 , London   , NULL
Laura     , Callahan  , Inside Sales Coordinator , 1958-01-09 , 1994-03-05 , Seattle  , WA
Anne      , Dodsworth , Sales Representative     , 1966-01-27 , 1994-11-15 , London   , NULL

From there, it should be easy to adapt the file to your needs.