2012-02-20 38 views
2

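I have the following code, and it loads very slowly the first time. The CSV file is about 4 MB with 16,000 rows. How can I improve the performance of building a DataTable in VB.NET?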

    If Session("tb") Is Nothing Then
        Dim str As String()
        If (IsNothing(Cache("csvdata"))) Then
            str = File.ReadAllLines(Server.MapPath("~/test/feed.csv"))
            Cache.Insert("csvdata", str, Nothing, DateTime.Now.AddHours(12), TimeSpan.Zero)
        Else
            str = CType(Cache("csvdata"), Array)
        End If
        Dim dt As New DataTable
        dt.Columns.Add("Shape", GetType(System.String))
        dt.Columns.Add("Weight", GetType(System.Double))
        dt.Columns.Add("Color", GetType(System.String))
        dt.Columns.Add("Clarity", GetType(System.String))
        dt.Columns.Add("Price", GetType(System.Int32))
        dt.Columns.Add("CutGrade", GetType(System.String))

        For i As Integer = 1 To str.Length - 1
            Dim pattern As String = ",(?=([^""]*""[^""]*"")*[^""]*$)"
            Dim rgx As New Regex(pattern)
            Dim t As String = rgx.Replace(str(i), "\")
            Dim s As String() = t.Split("\"c)
            Dim pr As Int32 = CType(s(5), Int32)
            Dim fpr As Int32
            Dim rate As Double
            Select Case pr
                Case Is < 300
                    rate = 2
                Case 301 To 600
                    rate = 1.7
                Case Is > 600
                    rate = 1.16
            End Select
            fpr = Math.Round(pr * rate)
            Dim a As String() = {s(1), s(2), s(3), s(4), fpr, s(40)}
            dt.Rows.Add(a)
        Next

        Session("tb") = dt
        ListView1.DataSource = dt
        ListView1.DataBind()
    Else
        Dim x As DataTable = CType(Session("tb"), DataTable)
        ListView1.DataSource = x
        ListView1.DataBind()
    End If

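The CSV file is cached, which I believe is shared across all users (so one person pays the load cost once every 12 hours). Once the session has been created, the page also loads quickly. So building the DataTable appears to be the slow part. This is my first time working with a DataTable, and I'm sure someone can point out what I'm doing wrong.

Thanks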

UPDATE:

I have changed the cache to hold the DataTable itself instead of the CSV file. It now loads quickly, but I'm wondering whether this is a bad idea.

Cache.Insert("csvdata", dt, Nothing, DateTime.Now.AddHours(12), TimeSpan.Zero) 

Once it is stored in the cache, I can run queries against it with LINQ.
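As a rough illustration of what querying the cached table with LINQ could look like, here is a minimal sketch. The module name, the `BuildSampleTable` and `RowsUpTo` helpers, and the two sample rows are made up for the example; `AsEnumerable` and `Field(Of T)` come from System.Data.DataSetExtensions, which the project is assumed to reference.

```vbnet
Imports System
Imports System.Data
Imports System.Linq

Module LinqOverDataTable
    ' Hypothetical helper: builds a tiny table with the same columns as the question.
    Function BuildSampleTable() As DataTable
        Dim dt As New DataTable()
        dt.Columns.Add("Shape", GetType(String))
        dt.Columns.Add("Weight", GetType(Double))
        dt.Columns.Add("Color", GetType(String))
        dt.Columns.Add("Clarity", GetType(String))
        dt.Columns.Add("Price", GetType(Integer))
        dt.Columns.Add("CutGrade", GetType(String))
        ' Illustrative rows only, not real feed data.
        dt.Rows.Add("Round", 1.74, "F", "VVS1", 15834, "Very Good")
        dt.Rows.Add("Round", 1.0, "I", "VVS1", 7028, "Very Good")
        Return dt
    End Function

    ' Returns the rows priced at or below maxPrice, cheapest first.
    ' The query only reads from the table; it does not modify it.
    Function RowsUpTo(dt As DataTable, maxPrice As Integer) As DataRow()
        Dim query = From row In dt.AsEnumerable()
                    Where row.Field(Of Integer)("Price") <= maxPrice
                    Order By row.Field(Of Integer)("Price")
                    Select row
        Return query.ToArray()
    End Function

    Sub Main()
        Dim hits As DataRow() = RowsUpTo(BuildSampleTable(), 10000)
        Console.WriteLine(hits.Length)
    End Sub
End Module
```

Because the cached table is shared by every request, read-only queries like this are the safer pattern: DataTable is documented as safe for multithreaded reads, but writes need to be synchronized.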

Sample CSV (first 3 lines):

Supplier ID,Shape,Weight,Color,Clarity,Price/Carat,Lot Number,Stock Number,Lab,Cert #,Certificate Image,2nd Image,Dimension,Depth %,Table %,Crown Angle,Crown %,Pavilion Angle,Pavilion %,Girdle Thinnest,Girdle Thickest,Girdle %,Culet Size,Culet Condition,Polish,Symmetry,Fluor Color,Fluor Intensity,Enhancements,Remarks,Availability,Is Active,FC-Main Body,FC- Intensity,FC- Overtone,Matched Pair,Separable,Matching Stock #,Pavilion,Syndication,Cut Grade,External Url 
9349,Round,1.74,F,VVS1,13650.00,,IM-95-188-243,ABC,11228,,,7.81|7.85|4.62,59.00,62.00,34.00,13.00,,,Medium,,0,None,,Excellent,Very Good,Blue,Medium,,"",Not Specified,Y,,,,False,True,,,,Very Good,http://www.test/teste. 
9949,Round,1.00,I,VVS1,6059.00,,IM-95-189-C021,ABC,212197,,,6.37|6.42|3.96,61.90,54.00,34.50,16.00,,,Thin,Slightly Thick,0,None,,Excellent,Good,,None,,"Additional pinpoints are not shown.",Guaranteed Available,Y,,,,False,True,,,,Very Good,http://www.test/test. 
+2

"Slow?" – 2012-02-20 22:37:46

+0

The first load takes about 7-8 seconds. My local test server has 10 GB of RAM and a quad Xeon at 1.86 GHz – shinya 2012-02-20 23:28:55

Answers

0

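Consider using a TextFieldParser to read the CSV instead of splitting the strings yourself. Also, if you use a List(Of CustomClass), where CustomClass has properties such as Shape, Weight, Color, and so on, you avoid the unnecessary overhead of a DataTable and can still run LINQ queries against the list.

Please forgive my C#; I don't have VB.NET installed on this box.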

    using System.Collections.Generic;
    using System.Globalization;
    using Microsoft.VisualBasic.FileIO; // requires a reference to Microsoft.VisualBasic.dll

    public class Gemstone
    {
        public string Shape { get; set; }
        public double Weight { get; set; }
        public string Color { get; set; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // allocate the list with your best calculated guess of its final size
            List<Gemstone> list = new List<Gemstone>(16000);
            using (TextFieldParser textFieldParser = new TextFieldParser("data.txt"))
            {
                textFieldParser.Delimiters = new string[] { "," };
                textFieldParser.ReadLine(); // skip header line
                while (!textFieldParser.EndOfData)
                {
                    string[] fields = textFieldParser.ReadFields();
                    Gemstone gemstone = new Gemstone();
                    gemstone.Shape = fields[1];
                    // parse with the invariant culture so "1.74" reads the same on any server locale
                    gemstone.Weight = double.Parse(fields[2], CultureInfo.InvariantCulture);
                    gemstone.Color = fields[3];
                    list.Add(gemstone);
                }
            }
        }
    }
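Since the question is about VB.NET, here is a rough VB.NET sketch of the same idea. It is not the answerer's code: the `GemstoneLoader` module, the `LoadGemstones` function, and the two-line sample text are assumptions made for illustration. It reads from a `TextReader` so it can run without a file, and it discards the header row with a single `ReadFields` call before the loop.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Public Class Gemstone
    Public Property Shape As String
    Public Property Weight As Double
    Public Property Color As String
End Class

Module GemstoneLoader
    ' Hypothetical helper: parses comma-delimited text whose first line is a header row.
    Function LoadGemstones(source As TextReader) As List(Of Gemstone)
        Dim list As New List(Of Gemstone)()
        Using parser As New TextFieldParser(source)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            ' Lets quoted fields contain commas without being split.
            parser.HasFieldsEnclosedInQuotes = True
            ' Read and discard the header row before the loop starts.
            If Not parser.EndOfData Then parser.ReadFields()
            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                list.Add(New Gemstone With {
                    .Shape = fields(1),
                    .Weight = Double.Parse(fields(2), CultureInfo.InvariantCulture),
                    .Color = fields(3)
                })
            End While
        End Using
        Return list
    End Function

    Sub Main()
        ' Made-up sample trimmed to the first four columns.
        Dim csv As String = "Supplier ID,Shape,Weight,Color" & Environment.NewLine &
                            "9349,Round,1.74,""F""" & Environment.NewLine
        Dim gems As List(Of Gemstone) = LoadGemstones(New StringReader(csv))
        Console.WriteLine(gems.Count)
    End Sub
End Module
```

In a page, the function could be handed `New StreamReader(Server.MapPath("~/test/feed.csv"))` instead of the `StringReader`, and the resulting list could be cached the same way as the DataTable.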
+0

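I'm trying TextFieldParser now... how do I get rid of the field names? I can't seem to keep them out of the results. I tried Dim i = 0 While Not MyReader.EndOfData If i = 1 Then "process the line" Else i = 1 End If End While, but it still parses the first line... – shinya 2012-02-21 19:19:26

+0

@shinya Can you post a sample line of the CSV data you want to use? – 2012-02-22 01:40:43

+0

I posted sample CSV data in the question – shinya 2012-02-23 19:25:35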

0

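FYI, I just discovered this whole TextFieldParser thing. I do a lot of text-file parsing, so I tested it....

On an 11 MB file with about 5,200 rows and 300 columns.

TextFieldParser ran at 25% of the speed of ReadAllLines plus Split when I used a DataTable. It ran at about 15% of the speed when I removed the DataTable code: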

    ' Test 1: TextFieldParser filling a DataTable
    Dim DataTable As New DataTable()
    Dim StartTime As Long = Now.Ticks
    Using Reader As New FileIO.TextFieldParser("file.txt")
        Reader.TextFieldType = FileIO.FieldType.Delimited
        Reader.SetDelimiters(vbTab)
        Reader.HasFieldsEnclosedInQuotes = False
        Dim Header As Boolean = True
        While Not Reader.EndOfData
            Dim Fields() As String = Reader.ReadFields()
            If Header Then
                ' First row: create the columns and discard the header values
                For I As Integer = 1 To 320
                    DataTable.Columns.Add("Col" & I)
                Next
                Header = False
            Else
                ' Skip comment rows that start with "#"
                If Mid(Fields(0), 1, 1) <> "#" Then DataTable.Rows.Add(Fields)
            End If
        End While
    End Using
    Debug.Print((Now.Ticks - StartTime) / 10000 & "ms") ' 10,000 ticks = 1 ms

    ' Test 2: File.ReadAllLines + Split filling a second DataTable
    Dim DataTable2 As New DataTable()
    StartTime = Now.Ticks
    For I As Integer = 1 To 320
        DataTable2.Columns.Add("Col" & I)
    Next
    For Each line As String In System.IO.File.ReadAllLines("file.txt")
        Dim NVP() As String = Split(line, vbTab)
        If Mid(line, 1, 1) <> "#" Then DataTable2.Rows.Add(NVP)
    Next
    Debug.Print((Now.Ticks - StartTime) / 10000 & "ms")

With the DataTable code removed:

[code block not preserved]

That kind of surprises me, but I guess the table has more features. Yet another new thing I've found that I will never use :(
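For anyone who wants to repeat this kind of comparison without a DataTable in the way, here is a minimal sketch that times both parsing approaches with `Stopwatch` over the same in-memory text. Everything in it is an assumption made for illustration (the `ParseTiming` module, the helper names, and the generated data); it does not reproduce the file or the numbers above, and results will vary by machine.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic
Imports Microsoft.VisualBasic.FileIO

Module ParseTiming
    ' Builds tab-delimited test text in memory (generated data, not a real feed).
    Function BuildSample(rows As Integer, cols As Integer) As String
        Dim sb As New StringBuilder()
        For r As Integer = 1 To rows
            For c As Integer = 1 To cols
                If c > 1 Then sb.Append(ControlChars.Tab)
                sb.Append("v").Append(r).Append("_").Append(c)
            Next
            sb.Append(ControlChars.Lf)
        Next
        Return sb.ToString()
    End Function

    ' Counts fields by splitting each line on the tab character.
    Function CountWithSplit(text As String) As Integer
        Dim total As Integer = 0
        For Each line As String In text.Split({ControlChars.Lf}, StringSplitOptions.RemoveEmptyEntries)
            total += line.Split(ControlChars.Tab).Length
        Next
        Return total
    End Function

    ' Counts fields by reading the same text through TextFieldParser.
    Function CountWithParser(text As String) As Integer
        Dim total As Integer = 0
        Using parser As New TextFieldParser(New StringReader(text))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(vbTab)
            parser.HasFieldsEnclosedInQuotes = False
            While Not parser.EndOfData
                total += parser.ReadFields().Length
            End While
        End Using
        Return total
    End Function

    ' Runs the given work once and returns the elapsed wall-clock milliseconds.
    Function TimeMs(work As Action) As Long
        Dim sw As Stopwatch = Stopwatch.StartNew()
        work.Invoke()
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    Sub Main()
        Dim text As String = BuildSample(2000, 50)
        Console.WriteLine("Split:  " & TimeMs(Sub() CountWithSplit(text)) & " ms")
        Console.WriteLine("Parser: " & TimeMs(Sub() CountWithParser(text)) & " ms")
    End Sub
End Module
```

Both functions return the same field count, which gives a quick sanity check that the two code paths are doing equivalent work before their timings are compared.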