2012-02-03

Remove quotes in FileHelpers

I have a .csv file (I have no control over the data), and for some reason it has everything in quotes.

"Date","Description","Original Description","Amount","Type","Category","Name","Labels","Notes" 
"2/02/2012","ac","ac","515.00","a","b","","javascript://" 
"2/02/2012","test","test","40.00","a","d","c",""," " 

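I am using FileHelpers, and I am wondering what the best way to remove all these quotes would be. Is there something that says "if I see quotes, remove them; if no quotes are found, do nothing"?

This messes with the data, because I end up with "\"515.00\"" carrying extra quotes I don't need (especially since in this case I want the value to be a decimal, not a string).

I am also not sure what the "javascript://" value is about or why it was generated, but it comes from a service I have no control over.

Edit: This is how I consume the CSV file.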

using (TextReader textReader = new StreamReader(stream))
{
    engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue;

    object[] transactions = engine.ReadStream(textReader);
}

Can we see the code? – 2012-02-03 18:51:27

Answers

6

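You can use the FieldQuoted attribute, which is best described on the FileHelpers attributes page. Note that the attribute can be applied to any FileHelpers field (even one typed as Decimal). (Remember that the FileHelpers class describes the spec of your import file. So when you mark a Decimal field as FieldQuoted, you are saying that this field will be quoted in the file.)

You can even specify whether the quotes are optional: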

[FieldQuoted('"', QuoteMode.OptionalForBoth)] 

Here is a console application that works with your data:

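// Assumes the FileHelpers library is referenced and that Assert comes from NUnit
// (the original does not say which test framework is used).
using System;
using FileHelpers;
using NUnit.Framework;

class Program 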
{ 
    [DelimitedRecord(",")] 
    [IgnoreFirst(1)] 
    public class Format1 
    { 
     [FieldQuoted] 
     [FieldConverter(ConverterKind.Date, "d/M/yyyy")] 
     public DateTime Date; 
     [FieldQuoted] 
     public string Description; 
     [FieldQuoted] 
     public string OriginalDescription; 
     [FieldQuoted] 
     public Decimal Amount; 
     [FieldQuoted] 
     public string Type; 
     [FieldQuoted] 
     public string Category; 
     [FieldQuoted] 
     public string Name; 
     [FieldQuoted] 
     public string Labels; 
     [FieldQuoted] 
     [FieldOptional] 
     public string Notes; 
    } 

    static void Main(string[] args) 
    { 
     var engine = new FileHelperEngine(typeof(Format1)); 

     // read in the data 
     object[] importedObjects = engine.ReadString(@"""Date"",""Description"",""Original Description"",""Amount"",""Type"",""Category"",""Name"",""Labels"",""Notes"" 
""2/02/2012"",""ac"",""ac"",""515.00"",""a"",""b"","""",""javascript://"" 
""2/02/2012"",""test"",""test"",""40.00"",""a"",""d"",""c"","""","" """); 

     // check that 2 records were imported 
     Assert.AreEqual(2, importedObjects.Length); 

     // check the values for the first record 
     Format1 customer1 = (Format1)importedObjects[0]; 
     Assert.AreEqual(DateTime.Parse("2/02/2012"), customer1.Date); 
     Assert.AreEqual("ac", customer1.Description); 
     Assert.AreEqual("ac", customer1.OriginalDescription); 
     Assert.AreEqual(515.00m, customer1.Amount); // decimal literal, so the types match
     Assert.AreEqual("a", customer1.Type); 
     Assert.AreEqual("b", customer1.Category); 
     Assert.AreEqual("", customer1.Name); 
     Assert.AreEqual("javascript://", customer1.Labels); 
     Assert.AreEqual("", customer1.Notes); 

     // check the values for the second record 
     Format1 customer2 = (Format1)importedObjects[1]; 
     Assert.AreEqual(DateTime.Parse("2/02/2012"), customer2.Date); 
     Assert.AreEqual("test", customer2.Description); 
     Assert.AreEqual("test", customer2.OriginalDescription); 
     Assert.AreEqual(40.00m, customer2.Amount); // decimal literal, so the types match
     Assert.AreEqual("a", customer2.Type); 
     Assert.AreEqual("d", customer2.Category); 
     Assert.AreEqual("c", customer2.Name); 
     Assert.AreEqual("", customer2.Labels); 
     Assert.AreEqual(" ", customer2.Notes); 
    } 
} 

(Note that your first line of data seems to have 8 fields instead of 9, so I marked the Notes field with FieldOptional.)

0

Here is one way of doing it:

string[] lines = new string[] 
{ 
    "\"Date\",\"Description\",\"Original Description\",\"Amount\",\"Type\",\"Category\",\"Name\",\"Labels\",\"Notes\"", 
    "\"2/02/2012\",\"ac\",\"ac\",\"515.00\",\"a\",\"b\",\"\",\"javascript://\"", 
    "\"2/02/2012\",\"test\",\"test\",\"40.00\",\"a\",\"d\",\"c\",\"\",\" \"", 
}; 

string[][] values = 
    lines.Select(line => 
     line.Trim('"') 
      .Split(new string[] { "\",\"" }, StringSplitOptions.None) 
      .ToArray() 
     ).ToArray(); 

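The lines array represents the lines in your sample. In a C# string literal, each " character must be escaped as \".

For each line, we first trim the leading and trailing " characters, then split the line into an array of substrings, using the "," character sequence as the delimiter.

Note that the code above will not work if " characters occur naturally within your values (even if escaped).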
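
For values that do contain quotes or commas, a small quote-aware splitter avoids that limitation. It also sidesteps a second edge case: Trim('"') removes every trailing quote, so a line whose last field is empty ("") loses one quote too many. The following is only a sketch using the framework alone; CsvLineSplitter and SplitLine are hypothetical names, and it assumes the common CSV convention that a quote inside a quoted field is written as two quotes and that a field never spans several lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvLineSplitter
{
    // Splits one CSV line into fields. Quotes around a field are optional;
    // inside a quoted field, commas are literal and "" stands for one quote.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];

            if (inQuotes)
            {
                if (c != '"')
                {
                    current.Append(c);      // ordinary character inside quotes
                }
                else if (i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"');    // doubled quote: one literal quote
                    i++;
                }
                else
                {
                    inQuotes = false;       // closing quote
                }
            }
            else if (c == '"')
            {
                inQuotes = true;            // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // delimiter ends the field
                current.Clear();
            }
            else
            {
                current.Append(c);          // unquoted character
            }
        }

        fields.Add(current.ToString());     // last field on the line
        return fields;
    }
}
```

Calling CsvLineSplitter.SplitLine on each element of lines gives the same kind of result as the Select above, with the surrounding quotes already removed.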

Edit: If your CSV is read from a stream, all you need to do is:

var lines = new List<string>(); 
using (var streamReader = new StreamReader(stream)) 
    while (!streamReader.EndOfStream) 
     lines.Add(streamReader.ReadLine()); 

The rest of the code above will work as is.

Edit: Given your new code, check whether you are looking for something like this:

for (int i = 0; i < transactions.Length; ++i) 
{ 
    object oTrans = transactions[i]; 
    string sTrans = oTrans as string; 
    if (sTrans != null && 
     sTrans.StartsWith("\"") && 
     sTrans.EndsWith("\"")) 
    { 
     transactions[i] = sTrans.Substring(1, sTrans.Length - 2); 
    } 
} 
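
If the goal is a decimal rather than a string, the same start/end check can be wrapped in a helper and followed by a culture-invariant parse. This is a minimal sketch, not part of FileHelpers; QuotedValue, Unquote and ParseAmount are hypothetical names. The length check is an addition: without it, a value consisting of a single " would pass both StartsWith and EndsWith and make Substring throw.

```csharp
using System;
using System.Globalization;

public static class QuotedValue
{
    // Removes one pair of surrounding quotes if present; otherwise returns the value unchanged.
    public static string Unquote(string value)
    {
        if (value != null && value.Length >= 2 &&
            value[0] == '"' && value[value.Length - 1] == '"')
        {
            return value.Substring(1, value.Length - 2);
        }

        return value;
    }

    // Parses an optionally quoted amount such as "515.00" with '.' as the decimal separator,
    // regardless of the machine's regional settings.
    public static decimal ParseAmount(string raw)
    {
        return decimal.Parse(Unquote(raw), NumberStyles.Number, CultureInfo.InvariantCulture);
    }
}
```

For example, QuotedValue.ParseAmount("\"515.00\"") and QuotedValue.ParseAmount("515.00") both yield the decimal 515.00.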

The code I gave is an example of a .csv file; it would be uploaded and read from a stream. – chobo2 2012-02-03 19:18:25

Well, they have a built-in "engine" method that returns an object array. See the changes. – chobo2 2012-02-03 19:54:46

0

I had the same predicament, and I replaced the quotes when loading the values into my list object:

using System; 
using System.Collections.Generic; 
using System.IO; 
using System.Windows.Forms; 

namespace WindowsFormsApplication6 
{ 
    public partial class Form1 : Form 
    { 
     public Form1() 
     { 
      InitializeComponent(); 
     } 

     private void Form1_Load(object sender, EventArgs e) 
     { 
      LoadCSV(); 
     } 

     private void LoadCSV() 
     { 
      List<string> Rows = new List<string>(); 
      string m_CSVFilePath = "<Path to CSV File>"; 

      using (StreamReader r = new StreamReader(m_CSVFilePath)) 
      { 
       string row; 

       while ((row = r.ReadLine()) != null) 
       { 
        Rows.Add(row.Replace("\"", "")); 
       } 

       foreach (var Row in Rows) 
       { 
        if (Row.Length > 0) 
        { 
         string[] RowValue = Row.Split(','); 

         //Do something with values here 
        } 
       } 
      } 
     } 

    } 
} 
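
I have been looking at the options, and it seems this field attribute might do the trick: FieldQuoted(QuoteMode.OptionalForBoth). I think they are missing an option (one that would simply ignore the quotes on both read and write). – chobo2 2012-02-03 20:33:30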

@chobo2 - That's fine, but if you use FileHelpers, won't you still need to install a DLL on the client machine? My solution just uses the framework, without any extra files. – 2012-02-03 20:41:23

0

This code helped me in my development:

// Requires: using System; using System.IO; using System.Text;
using (StreamReader r = new StreamReader("C:\\Projects\\Mactive\\Audience\\DrawBalancing\\CSVFiles\\Analytix_ABC_HD.csv"))
{
    string row;

    while ((row = r.ReadLine()) != null)
    {
        int outCount = row.Length;
        StringBuilder line = new StringBuilder();

        for (int innerCount = 0; innerCount <= outCount - 1; innerCount++)
        {
            char chr = row[innerCount];

            if (chr != '"')
            {
                // Outside quotes: copy the character (including delimiter commas) unchanged.
                line.Append(chr);
            }
            else
            {
                // Opening quote: collect everything up to the closing quote.
                string token = "";
                innerCount = innerCount + 1;
                for (; innerCount < outCount - 1; innerCount++)
                {
                    chr = row[innerCount];
                    if (chr == '"')
                    {
                        break;
                    }

                    token = token + chr.ToString();
                }

                // Drop commas inside the quoted value so they are not mistaken for delimiters.
                token = token.Replace(",", "");
                line.Append(token);
            }
        }

        string Eachline = line.ToString();
        Console.WriteLine(Eachline);
    }
}