2015-09-18 79 views

Numpy recarray writes byte-literal tags to my CSV file?

I am using the following test code:

import numpy as np 
import csv 

data = np.zeros((3,),dtype=("S24,int,float")) 
with open("testtest.csv", 'w', newline='') as f: 
    writer = csv.writer(f,delimiter=',') 
    for row in data: 
        writer.writerow(row)

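The data in the CSV file has b'' tags (byte-literal markers) on the string component of the record array. What is the correct way to write such record arrays to CSV, and what is the best way to keep the byte-literal markers out of the CSV file?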


This looks like [open issue #4543](https://github.com/numpy/numpy/issues/4543) – askewchan

Answers

0

I think you are working with Python 3, which uses Unicode as its default string type. Bytestrings are then displayed with the special b marker.
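The effect can be reproduced without numpy. The following is a minimal sketch, assuming (as the csv module documentation describes) that csv.writer converts non-string values with str(); io.StringIO stands in for the output file:

```python
import csv
import io

# csv.writer stringifies non-string values with str(). In Python 3,
# str() of a bytes object is its repr, which includes the b'' marker.
buf = io.StringIO()
writer = csv.writer(buf, delimiter=',')
writer.writerow([b'xxx', 0, 0.0])   # bytes field
writer.writerow(['xxx', 0, 0.0])    # str field

lines = buf.getvalue().splitlines()
print(lines[0])   # row written from the bytes value
print(lines[1])   # row written from the str value
```

Only the row built from the bytes value carries the marker, which is why switching the field to unicode or decoding it removes the problem.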

If I generate the data with a Unicode field instead of a bytes field, this works:

In [654]: data1 = np.zeros((3,),dtype=("U24,int,float")) 
In [655]: data1['f0']='xxx' # more interesting string field 
In [656]: with open('test.csv','w') as f: 
    writer=csv.writer(f,delimiter=',') 
    for row in data1: 
        writer.writerow(row)
In [658]: cat test.csv 
xxx,0,0.0 
xxx,0,0.0 
xxx,0,0.0 

np.savetxt does the same thing:

In [668]: np.savetxt('test.csv',data1,fmt='%s',delimiter=',') 
In [669]: cat test.csv 
xxx,0,0.0 
xxx,0,0.0 
xxx,0,0.0 

The question is, can I work around this while keeping the S24 field? For example, by opening the file as wb?

https://stackoverflow.com/a/27513196/901925 Trying to strip b' ' from my Numpy array

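I explored this issue earlier; it looks like my solution was either to decode the bytes field or to write directly to a bytes file. Since your array mixes string and numeric fields, the decode solution is a bit more tedious.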

data1 = data.astype('U24,i,f') # convert bytestring field to unicode 
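As a fuller illustration of the astype route, here is a minimal sketch that converts the array and writes it through csv.writer. The sample strings, the io.StringIO buffer (standing in for the file), and the choice of 'int' and 'float' instead of 'i' and 'f' are assumptions made here; the latter avoids narrowing the numeric fields to 32 bits.

```python
import csv
import io
import numpy as np

# Structured array with a bytes field, as in the question.
data = np.zeros((3,), dtype="S24,int,float")
data['f0'] = [b'xxx', b'yyy', b'zzz']

# Convert only the string field to unicode; the numeric fields keep
# their original widths (an assumption, see the note above).
data_u = data.astype("U24,int,float")

buf = io.StringIO()   # stands in for open('test.csv', 'w', newline='')
writer = csv.writer(buf, delimiter=',')
for row in data_u:
    # tolist() turns the record into a plain tuple of Python objects
    writer.writerow(row.tolist())

csv_text = buf.getvalue()
print(csv_text)
```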

A helper function can be used to decode the bytestrings on the fly:

In [147]: fn = lambda row: [j.decode() if isinstance(j,bytes) else j for j in row] 
In [148]: with open('test.csv','w') as f: 
    writer=csv.writer(f,delimiter=',') 
    for row in data: 
        writer.writerow(fn(row))
In [149]: cat test.csv 
xxx,0,0.0 
yyy,0,0.0 
zzz,0,0.0 
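The other option mentioned above, writing directly to a bytes file, might look like the following minimal sketch. The helper row_to_bytes is hypothetical, io.BytesIO stands in for a file opened with mode 'wb', and no CSV quoting or escaping is applied, so it only suits fields that contain no delimiter or newline.

```python
import io
import numpy as np

data = np.zeros((3,), dtype="S24,int,float")
data['f0'] = [b'xxx', b'yyy', b'zzz']

def row_to_bytes(row, delimiter=b','):
    """Hypothetical helper: join one record's fields into a bytes line."""
    fields = [v if isinstance(v, bytes) else str(v).encode('ascii')
              for v in row.tolist()]
    return delimiter.join(fields) + b'\n'

buf = io.BytesIO()   # stands in for open('test.csv', 'wb')
for row in data:
    buf.write(row_to_bytes(row))

raw = buf.getvalue()
print(raw.decode('ascii'))
```

Here the S24 field is kept as bytes throughout, and only the numeric fields are encoded.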

Another numpy bytestring formatting question: http://stackoverflow.com/questions/32207420/numpy-string-encoding/32208336. Apart from a custom `format` method, it still recommends `decode`. – hpaulj

0

Do you need the data in these three dtypes? Consider using numpy.savetxt() on a numpy float or integer array.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html

data = np.zeros((3,3)) 
filename='foo' 
np.savetxt(filename+".csv",data,fmt='%1.6e',delimiter=",") 
#fmt='%1.6e' controls how the numbers are written to the text file. 
#E.g. use fmt='%d' for integers
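If the mixed dtypes are needed after all, savetxt also accepts a sequence of format strings, one per column. The sketch below assumes a structured array with a unicode string field, as in the earlier answer; the temporary directory and the file name foo.csv are only there to keep the example self-contained.

```python
import os
import tempfile
import numpy as np

# Structured array with a unicode string field (assumed, see above).
data1 = np.zeros((3,), dtype="U24,int,float")
data1['f0'] = 'xxx'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'foo.csv')   # hypothetical file name
    # One format per field: string, integer, float in exponent notation.
    np.savetxt(path, data1, fmt=['%s', '%d', '%1.6e'], delimiter=',')
    with open(path) as f:
        text = f.read()

print(text)
```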