2017-08-13 57 views
2

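String after separatedBy

I am trying to split strings like the ones below so that only the numbers are returned. The output for the first file looks like this: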

Output for file 1:

contents: 2017-07-31 16:29:53,0.10109999,9.74414271,0.98513273,0.15%,42302999779,-0.98513273,9.72952650 
2017-07-31 16:29:53,0.10109999,0.25585729,0.02586716,0.25%,42302999779,-0.02586716,0.25521765 


rows: ["2017-07-31 16:29:53,0.10109999,9.74414271,0.98513273,0.15%,42302999779,-0.98513273,9.72952650", "2017-07-31 16:29:53,0.10109999,0.25585729,0.02586716,0.25%,42302999779,-0.02586716,0.25521765", "", ""] 

Output for file 2:

contents: 40.75013313,0.00064825,5/18/2017 7:17:01 PM 

19.04004820,0.00059900,5/19/2017 9:17:03 PM 

rows: ["4\00\0.\07\05\00\01\03\03\01\03\0,\00\0.\00\00\00\06\04\08\02\05\0,\05\0/\01\08\0/\02\00\01\07\0 \07\0:\01\07\0:\00\01\0 \0P\0M\0", "\0", "1\09\0.\00\04\00\00\04\08\02\00\0,\00\0.\00\00\00\05\09\09\00\00\0,\0\05\0/\01\09\0/\02\00\01\07\0 \09\0:\01\07\0:\00\03\0 \0P\0M\0", "\0", "\0", "\0"] 

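So both files are readable as strings, because print(content) works. But as soon as the string is split, the second file becomes unreadable. I tried different encodings, but none of them worked. Does anyone have an idea how to make the string from the second file stay readable after splitting?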

+1

It must be related to the encoding. Can you upload your original CSV file? – nathan

+2

Is this related to your previous, now deleted question https://stackoverflow.com/questions/45662712/problems-with-csv-file-type? Did you try 'CSVReader(stream:stream,codecType:UTF16.self,endian:.big/.little)' as I suggested? –

+1

See https://stackoverflow.com/questions/18851558/ios-whats-the-best-way-to-detect-a-files-encoding for detecting the encoding automatically. –

Answers

2

Your file is evidently encoded as UTF-16 (little-endian):

 
$ hexdump fullorders4.csv 
0000000 4f 00 72 00 64 00 65 00 72 00 55 00 75 00 69 00 
0000010 64 00 2c 00 45 00 78 00 63 00 68 00 61 00 6e 00 
0000020 67 00 65 00 2c 00 54 00 79 00 70 00 65 00 2c 00 
0000030 51 00 75 00 61 00 6e 00 74 00 69 00 74 00 79 00 
... 

For ASCII characters, the first byte of the UTF-16 little-endian encoding is the ASCII code, and the second byte is zero.

If the file is read as UTF-8, each zero byte is decoded as an ASCII NUL character, and that is the \0 you see in your output.
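The effect can be reproduced without the original file. The following is a minimal sketch, assuming a made-up sample string (`sample`) that is encoded in memory instead of being read from `fullorders4.csv`; the variable names are illustrative only.

```swift
import Foundation

// Illustrative sample, not taken from the real file.
let sample = "40.75"

// Encode as UTF-16 little-endian: each ASCII character becomes
// its ASCII code followed by a zero byte (no BOM is added here).
let utf16leBytes = sample.data(using: .utf16LittleEndian)!
print(Array(utf16leBytes))  // [52, 0, 48, 0, 46, 0, 55, 0, 53, 0]

// Interpreting the same bytes as UTF-8 is valid, but every zero byte
// turns into a NUL character between the visible characters.
let misread = String(decoding: utf16leBytes, as: UTF8.self)
print(misread.debugDescription)  // NULs show up as escapes

// Decoding with the matching encoding recovers the original text.
let correct = String(data: utf16leBytes, encoding: .utf16LittleEndian)!
print(correct)
```

The misread string has twice as many scalars as the original, which is why printing it still looks fine while the split rows, printed with escapes, do not.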

Specifying the encoding as utf16LittleEndian therefore works in your case:

let contents = try NSString(contentsOfFile: path, encoding: String.Encoding.utf16LittleEndian.rawValue) 
// or: 
let contents = try String(contentsOfFile: path, encoding: .utf16LittleEndian) 
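Once the contents are decoded with the right encoding, splitting behaves as expected. Here is a rough sketch of that step, assuming comma-separated fields and two numeric leading columns as in the second file; `parseRows` is a hypothetical helper, and the sample data is built in memory so the snippet does not depend on a file path.

```swift
import Foundation

// Hypothetical helper: decode UTF-16 little-endian CSV data
// and split it into rows of comma-separated fields.
func parseRows(_ data: Data) -> [[String]] {
    guard let contents = String(data: data, encoding: .utf16LittleEndian) else {
        return []
    }
    return contents
        .components(separatedBy: .newlines)
        .filter { !$0.isEmpty }                    // drop blank lines
        .map { $0.components(separatedBy: ",") }
}

// Sample mirroring the second file from the question.
let text = "40.75013313,0.00064825,5/18/2017 7:17:01 PM\n"
         + "19.04004820,0.00059900,5/19/2017 9:17:03 PM\n"
let data = text.data(using: .utf16LittleEndian)!

let rows = parseRows(data)

// Keep only the first two columns, which are assumed to be numeric.
let numbers = rows.map { row in row.prefix(2).compactMap { Double($0) } }
print(rows)
print(numbers)
```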

There are also methods that try to detect the encoding that was used (compare iOS: What's the best way to detect a file's encoding). In Swift that would be

var enc: UInt = 0 
let contents = try NSString(contentsOfFile: path, usedEncoding: &enc) 
// or: 
var enc = String.Encoding.ascii 
let contents = try String(contentsOfFile: path, usedEncoding: &enc) 

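However, in your particular case this would again read the file as UTF-8, because its bytes are valid UTF-8. Prepending a byte order mark (BOM) to the file (FF FE for UTF-16 little-endian) would solve the problem reliably.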
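To illustrate how a BOM removes the ambiguity, the sketch below checks the first two bytes by hand before decoding. `decodeText` and its `fallback` parameter are hypothetical names, and this is only one possible way to act on a BOM, not what `usedEncoding:` does internally.

```swift
import Foundation

// Hypothetical helper: pick the UTF-16 variant from a leading BOM,
// otherwise decode with the caller-supplied fallback encoding.
func decodeText(_ data: Data, fallback: String.Encoding) -> String? {
    let prefix = [UInt8](data.prefix(2))
    if prefix == [0xFF, 0xFE] {
        // FF FE: UTF-16 little-endian; strip the BOM before decoding.
        return String(data: Data(data.dropFirst(2)), encoding: .utf16LittleEndian)
    }
    if prefix == [0xFE, 0xFF] {
        // FE FF: UTF-16 big-endian.
        return String(data: Data(data.dropFirst(2)), encoding: .utf16BigEndian)
    }
    return String(data: data, encoding: fallback)
}

// Illustrative data: the same text with and without a BOM.
let body = "19.04,0.000599".data(using: .utf16LittleEndian)!
let withBOM = Data([0xFF, 0xFE]) + body

let fromBOM = decodeText(withBOM, fallback: .utf8)
let fromFallback = decodeText(body, fallback: .utf16LittleEndian)
print(fromBOM ?? "nil", fromFallback ?? "nil")
```

Without the BOM the caller still has to know the encoding, which is the situation described in the answer.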