2014-01-29 41 views
1

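Python: pandas, parsing math operations

Someone advised me to use pandas to do calculations on the values in my CSV files, and provided the following code: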

# original code 
import pandas 

cmf = pandas.read_csv('CMF_MA68II.csv', names=['wavelength', 'x', 'y', 'z']) 
d65 = pandas.read_csv('D65_MA68II_10nm.csv', names=['wavelength', 'a', 'b']) 
data = pandas.read_csv('spectral_data.csv', names=['serialNumber', 'wavelength', 'measurement', 'name']) 

lookup = pandas.merge(cmf, d65, on='wavelength') 
merged = pandas.merge(data, lookup, on='wavelength') 

totals = ((lookup[['x', 'y', 'z']].T*lookup['a']).T).sum() 
wps = 100 * totals/totals['y'] 

print(totals['y'])
print("D65_CMF_2006_10_deg white point = ")
print(wps)
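
The `totals` line uses a double transpose to scale each row of the `x`, `y`, `z` columns by the matching illuminant value `a` before summing. A minimal sketch on made-up numbers (the toy `lookup` frame below is an assumption, not the real data) suggests that `DataFrame.mul(..., axis=0)` gives the same result without transposing:

```python
import pandas

# Toy stand-in for `lookup` (made-up numbers, two wavelengths only)
lookup = pandas.DataFrame({
    'x': [1.0, 2.0],
    'y': [3.0, 4.0],
    'z': [5.0, 6.0],
    'a': [10.0, 100.0],
})

# Form used in the original code: transpose, multiply, transpose back, sum
totals_t = ((lookup[['x', 'y', 'z']].T * lookup['a']).T).sum()

# Row-wise multiplication along axis=0, then the same column sums
totals_mul = lookup[['x', 'y', 'z']].mul(lookup['a'], axis=0).sum()

# White point normalised so that the y component is 100
wps = 100 * totals_mul / totals_mul['y']
print(totals_mul)
print(wps)
```

Both forms weight every wavelength row by `a`; the second one is probably easier to read.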

I added this part at the end:

# here's my crappy part: 

i = 0 

for i in range(i, i+1), data['serialNumber']: 
    x = ((merged.x * merged.a * merged.measurement).sum()/(merged.y * merged.a * 100).sum())  
    y = ((merged.y * merged.a * merged.measurement).sum()/(merged.y * merged.a * 100).sum())  
    z = ((merged.z * merged.a * merged.measurement).sum()/(merged.y * merged.a * 100).sum())   
    print(x, y, z)

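However, these lines perform the operation on all rows of my file, regardless of the name associated with them, so the result is the average of all the individual measurements.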
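
One likely reason the loop ignores `name`: the `for` header iterates over a two-element tuple (the `range` object and the `serialNumber` column), and the loop body never filters `merged`. A minimal sketch, using a plain list as a hypothetical stand-in for `data['serialNumber']` so that it runs without the CSV files:

```python
# Stand-in for data['serialNumber'] (hypothetical values, not the real column)
serial_numbers = [0, 0, 1, 1]

i = 0
# The comma builds a tuple: (range(0, 1), serial_numbers)
items = [item for item in (range(i, i + 1), serial_numbers)]

# The loop body would therefore run once per tuple element, i.e. twice,
# and each `item` is a whole container, not a single serial number.
print(len(items))
print(items[1])
```

Since the body only refers to `merged` as a whole, every pass computes the same sums over all rows.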

As you can see, the file 'spectral_data.csv' has the structure names=['serialNumber', 'wavelength', 'measurement', 'name'].

What I would like to do is perform this operation:

merged['X'] = (merged.x * merged.a * merged.measurement).sum()/totals['y'] 
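on each series of data, as defined by its own name. That is, my file 'spectral_data.csv' contains multiple series of values, and I want to obtain the result for each of them and store them with the structure ['serialNumber', 'X', 'Y', 'Z', 'name'].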

Does anyone have a solution for this?

Thanks

Examples of the files: 'CMF_MA68II.csv'

400,1.879338E-02,2.589775E-03,8.508254E-02 
410,8.277331E-02,1.041303E-02,3.832822E-01 
420,2.077647E-01,2.576133E-02,9.933444E-01 
430,3.281798E-01,4.698226E-02,1.624940E+00 
440,4.026189E-01,7.468288E-02,2.075946E+00 
450,3.932139E-01,1.039030E-01,2.128264E+00 
460,3.013112E-01,1.414586E-01,1.768440E+00 
470,1.914176E-01,1.999859E-01,1.310576E+00 
480,7.593120E-02,2.682271E-01,7.516389E-01 
490,1.400745E-02,3.554018E-01,3.978114E-01 
500,5.652072E-03,4.780482E-01,2.078158E-01 
510,3.778185E-02,6.248296E-01,8.852389E-02 
520,1.201511E-01,7.788199E-01,3.784916E-02 
530,2.380254E-01,8.829552E-01,1.539505E-02 
540,3.841856E-01,9.665325E-01,6.083223E-03 
550,5.374170E-01,9.907500E-01,2.323578E-03 
560,7.123849E-01,9.944304E-01,8.779264E-04 
570,8.933408E-01,9.640545E-01,3.342429E-04 
580,1.034327E+00,8.775360E-01,1.298230E-04 
590,1.147304E+00,7.869950E-01,5.207245E-05 
600,1.148163E+00,6.629035E-01,2.175998E-05 
610,1.048485E+00,5.282296E-01,9.530130E-06 
620,8.629581E-01,3.950755E-01,0.000000E+00 
630,6.413984E-01,2.751807E-01,0.000000E+00 
640,4.323126E-01,1.776882E-01,0.000000E+00 
650,2.714900E-01,1.083996E-01,0.000000E+00 
660,1.538163E-01,6.033976E-02,0.000000E+00 
670,8.281010E-02,3.211852E-02,0.000000E+00 
680,4.221473E-02,1.628841E-02,0.000000E+00 
690,2.025590E-02,7.797457E-03,0.000000E+00 
700,9.816228E-03,3.776140E-03,0.000000E+00 

'D65_MA68II_10nm.csv'

400,82.7549,14.708 
410,91.486,17.6753 
420,93.4318,20.995 
430,86.6823,24.6709 
440,104.865,28.7027 
450,117.008,33.0859 
460,117.812,37.8121 
470,114.861,42.8693 
480,115.923,48.2423 
490,108.811,53.9132 
500,109.354,59.8611 
510,107.802,66.0635 
520,104.79,72.4959 
530,107.689,79.1326 
540,104.405,85.947 
550,104.046,92.912 
560,100,100 
570,96.3342,107.184 
580,95.788,114.436 
590,88.6856,121.731 
600,90.0062,129.043 
610,89.5991,136.346 
620,87.6987,143.618 
630,83.2886,150.836 
640,83.6992,157.979 
650,80.0268,165.028 
660,80.2146,171.963 
670,82.2778,178.769 
680,78.2842,185.429 
690,69.7213,191.931 
700,71.6091,198.261 

'spectral_data.csv'

0,400,12.73,"a" 
0,410,12.41,"a" 
0,420,12.55,"a" 
0,430,13.42,"a" 
0,440,15.07,"a" 
0,450,17.31,"a" 
0,460,19.20,"a" 
0,470,20.96,"a" 
0,480,22.11,"a" 
0,490,23.45,"a" 
0,500,24.62,"a" 
0,510,25.42,"a" 
0,520,24.51,"a" 
0,530,22.43,"a" 
0,540,20.94,"a" 
0,550,21.59,"a" 
0,560,22.36,"a" 
0,570,21.54,"a" 
0,580,22.03,"a" 
0,590,28.86,"a" 
0,600,37.02,"a" 
0,610,42.00,"a" 
0,620,44.79,"a" 
0,630,46.57,"a" 
0,640,47.56,"a" 
0,650,48.70,"a" 
0,660,49.90,"a" 
0,670,50.75,"a" 
0,680,51.53,"a" 
0,690,52.24,"a" 
0,700,53.00,"a" 
1,400,2.31,"b" 
1,410,2.33,"b" 
1,420,2.33,"b" 
1,430,2.30,"b" 
1,440,2.29,"b" 
1,450,2.30,"b" 
1,460,2.27,"b" 
1,470,2.26,"b" 
1,480,2.24,"b" 
1,490,2.23,"b" 
1,500,2.22,"b" 
1,510,2.21,"b" 
1,520,2.20,"b" 
1,530,2.19,"b" 
1,540,2.18,"b" 
1,550,2.18,"b" 
1,560,2.18,"b" 
1,570,2.16,"b" 
1,580,2.15,"b" 
1,590,2.14,"b" 
1,600,2.14,"b" 
1,610,2.13,"b" 
1,620,2.12,"b" 
1,630,2.11,"b" 
1,640,2.11,"b" 
1,650,2.11,"b" 
1,660,2.10,"b" 
1,670,2.08,"b" 
1,680,2.07,"b" 
1,690,2.06,"b" 
1,700,2.04,"b" 
+1

I might be able to help if you post some sample data from the CSV files. I also think your code (i, i + 1), data['serialNumber']:' –

+0

I just added the files –

Answers

1

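This will do it: calculate the three new columns, then group by name and serial number (you could actually group by either one in this case, but this way you get both in the final result):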

# First calculate the new columns 
cols = ['x', 'y', 'z'] 
uppercols = ['X', 'Y', 'Z'] 
for uppercol, col in zip(uppercols, cols): 
    merged[uppercol] = (merged[col] * merged.a * merged.measurement)/totals['y'] 

# Now group and sum 
sums = merged.groupby(['serialNumber', 'name'])[uppercols].sum() 

To write to a CSV file, just do

sums.to_csv('test.csv') 
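
Because the group keys end up in the index of `sums`, `to_csv` writes them as the first columns. If the layout ['serialNumber', 'X', 'Y', 'Z', 'name'] from the question matters, one possible sketch is below. The toy `merged` frame holds made-up numbers and already contains the `X`, `Y`, `Z` columns (both are assumptions for illustration):

```python
import pandas

# Toy stand-in for `merged` (made-up numbers, two series of two rows each)
merged = pandas.DataFrame({
    'serialNumber': [0, 0, 1, 1],
    'name': ['a', 'a', 'b', 'b'],
    'X': [1.0, 2.0, 3.0, 4.0],
    'Y': [0.5, 0.5, 1.0, 1.0],
    'Z': [0.1, 0.2, 0.3, 0.4],
})

sums = merged.groupby(['serialNumber', 'name'])[['X', 'Y', 'Z']].sum()

# Move the group keys out of the index and put the columns in the requested order
table = sums.reset_index()[['serialNumber', 'X', 'Y', 'Z', 'name']]

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = table.to_csv(index=False)
print(csv_text)
```

Passing a filename such as 'test.csv' as the first argument writes the same text to disk instead.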
1

You can group by and apply a user-defined function:

res = merged.groupby(['serialNumber', 'name']).apply(
    lambda g: pandas.Series([(g[c] * g.a * g.measurement).sum() / totals['y'] for c in "xyz"],
                            index=['X', 'Y', 'Z']))
print(res)
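
As a rough check that the per-group function and the calculate-then-sum approach agree, here is a sketch on made-up numbers. The toy `merged` frame, the `total_y` constant and the helper name `tristimulus` are all hypothetical stand-ins, not the real data:

```python
import pandas

# Toy stand-ins (made-up numbers) for `merged` and `totals['y']`
merged = pandas.DataFrame({
    'serialNumber': [0, 0, 1, 1],
    'name': ['a', 'a', 'b', 'b'],
    'x': [1.0, 2.0, 1.0, 2.0],
    'y': [2.0, 4.0, 2.0, 4.0],
    'z': [3.0, 6.0, 3.0, 6.0],
    'a': [1.0, 1.0, 1.0, 1.0],
    'measurement': [10.0, 20.0, 30.0, 40.0],
})
total_y = 2.0  # hypothetical normalisation constant
keys = ['serialNumber', 'name']

def tristimulus(g):
    # Weighted sum of each colour-matching column within one group
    return pandas.Series(
        [(g[c] * g['a'] * g['measurement']).sum() / total_y for c in 'xyz'],
        index=['X', 'Y', 'Z'])

# Approach 1: apply a function to each group (value columns selected explicitly)
value_cols = ['x', 'y', 'z', 'a', 'measurement']
by_apply = merged.groupby(keys)[value_cols].apply(tristimulus)

# Approach 2: compute the weighted columns first, then group and sum
weighted = merged[['x', 'y', 'z']].mul(merged['a'] * merged['measurement'], axis=0) / total_y
weighted.columns = ['X', 'Y', 'Z']
by_sum = pandas.concat([merged[keys], weighted], axis=1).groupby(keys).sum()

print(by_apply)
print(by_sum)
```

On larger files the second form is usually the faster of the two, since it avoids calling a Python function once per group.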