I have a CSV file that looks like the one below. How can I parse an unstructured CSV file?
Processname:;ABC Buying
ID:;31
Message Date:;08-02-2012
Receiver (code):;12345
Object code:
Location (code):;12345
Date;time
2012.02.08;00:00;0;0,00
2012.02.08;00:15;0;0,00
2012.02.08;00:30;0;0,00
2012.02.08;00:45;0;0,00
2012.02.08;01:00;0;0,00
2012.02.08;01:15;0;0,00
The file can contain one or more occurrences of the block above. For example, with two occurrences the CSV file looks like this:
Processname:;ABC Buying
ID:;31
Message Date:;08-02-2012
Receiver (code):;12345
Object code:
Location (code):;12345
Date;time
2012.02.08;00:00;0;0,00
2012.02.08;00:15;0;0,00
2012.02.08;00:30;0;0,00
2012.02.08;00:45;0;0,00
2012.02.08;01:00;0;0,00
2012.02.08;01:15;0;0,00
Processname:;ABC Buying
ID:;41
Message Date:;08-02-2012
Receiver (code):;12345
Object code:
Location (code):;12345
Date;time
2012.02.08;00:00;0;17,00
2012.02.08;00:15;0;1,00
2012.02.08;00:30;0;15,00
2012.02.08;00:45;0;0,00
2012.02.08;01:00;0;0,00
2012.02.08;01:15;0;9,00
What is the best way to parse this CSV file?
Pseudocode for my approach:
// Read the complete file
var lines = File.ReadAllLines(filePath);
// Split the lines at each occurrence of "Processname:;ABC Buying"
var blocks = lines.SplitAtTheOccuranceOf("Processname:;ABC Buying");
// The results will go here
var results = new List<Result>();
// Loop through the collection
foreach (var b in blocks)
{
    var result = new Result();
    foreach (var l in b.lines)
    {
        // If the line contains "Processname", assign its value to result.ProcessName
        // If the line contains "ID", assign its value to result.ID
        // If the line contains "Object code", assign its value to result.ObjectCode
        // Ignore empty strings
        // If the line contains "Location (code)", assign its value to result.LocationCode
        // Parse all other rows by splitting on ';': the first part is the date, the 2nd part is the time, the 3rd part is the value
    }
    results.Add(result);
}
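To make the idea concrete, here is a minimal sketch of how the pseudocode above could be fleshed out in a single pass, without a separate split step. It is only one possible reading of the format, and it rests on some assumptions: every line that starts with "Processname:" opens a new block, header lines are the ones whose first field ends with ":", data rows always have four fields, and "," is the decimal separator. The names `Block`, `Reading`, `Flag`, and `BlockParser` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical types for illustration only.
public sealed class Reading
{
    public DateTime Timestamp { get; set; }
    public int Flag { get; set; }       // third column; its meaning is not stated in the file
    public decimal Value { get; set; }  // fourth column, written with a decimal comma
}

public sealed class Block
{
    // Header fields keyed by label without the trailing colon, e.g. "ID" -> "31".
    public Dictionary<string, string> Header { get; } = new Dictionary<string, string>();
    public List<Reading> Readings { get; } = new List<Reading>();
}

public static class BlockParser
{
    // Assumption: values such as "17,00" use ',' as the decimal separator.
    private static readonly NumberFormatInfo DecimalComma =
        new NumberFormatInfo { NumberDecimalSeparator = ",", NumberGroupSeparator = "." };

    public static List<Block> Parse(IEnumerable<string> lines)
    {
        var blocks = new List<Block>();
        Block current = null;

        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line.Length == 0) continue; // ignore empty lines

            // Assumption: every "Processname:" line starts a new block.
            if (line.StartsWith("Processname:", StringComparison.Ordinal))
            {
                current = new Block();
                blocks.Add(current);
            }
            if (current == null) continue; // skip anything before the first block

            var parts = line.Split(';');

            if (parts[0].EndsWith(":", StringComparison.Ordinal))
            {
                // Header line such as "ID:;31"; "Object code:" has no value at all.
                var key = parts[0].TrimEnd(':');
                current.Header[key] = parts.Length > 1 ? parts[1] : string.Empty;
            }
            else if (parts.Length >= 4 &&
                     DateTime.TryParseExact(parts[0] + " " + parts[1], "yyyy.MM.dd HH:mm",
                         CultureInfo.InvariantCulture, DateTimeStyles.None, out var timestamp))
            {
                // Data row such as "2012.02.08;00:15;0;1,00".
                current.Readings.Add(new Reading
                {
                    Timestamp = timestamp,
                    Flag = int.Parse(parts[2], CultureInfo.InvariantCulture),
                    Value = decimal.Parse(parts[3], NumberStyles.Number, DecimalComma)
                });
            }
            // Anything else (the "Date;time" column header) is skipped.
        }

        return blocks;
    }
}
```

It could be called as `var blocks = BlockParser.Parse(File.ReadLines(filePath));` (with `using System.IO;`), which streams the file instead of loading it all at once. Keeping the header in a dictionary avoids hard-coding line positions, so a block with a missing or reordered header line would still parse.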
What is the best way to do this?
This doesn't look like CSV. – Lloyd
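A CSV is a 'relatively' structured document, and for one of those the Microsoft Jet Engine would do the work for you. This isn't one, so it's definitely custom-code time! Link to the RFC for CSV: http://tools.ietf.org/html/rfc4180 –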
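For what it's worth, I would describe it as a "text file with a complex structure": it does have structure, it's just that not all the lines are alike. If it were completely unstructured, code would have almost no chance. ;-) – Chris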