2017-05-26 57 views

How do I convert CSV columns into normalized np arrays?

I have data in the following column format:

1495573445.162, 0, 0.021973, 0.012283, -0.995468, 1 
1495573445.172, 0, 0.021072, 0.013779, -0.994308, 1 
1495573445.182, 0, 0.020157, 0.015717, -0.995575, 1 
1495573445.192, 0, 0.017883, 0.012756, -0.993927, 1 
1495573445.202, 0, 0.021194, 0.012161, -0.994705, 1 

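There is no header row. There are about 1000 similar rows.

I want to normalize the third, fourth, and fifth columns into np arrays.

I have the following code.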

import numpy as np 

Acc1_x = np.genfromtxt('Accelerometer1.csv', delimiter=',') 
Acc1_y = np.genfromtxt('Accelerometer1.csv', delimiter=',') 
Acc1_z = np.genfromtxt('Accelerometer1.csv', delimiter=',') 

Acc2_x = np.genfromtxt('Accelerometer2.csv', delimiter=',') 
Acc2_y = np.genfromtxt('Accelerometer2.csv', delimiter=',') 
Acc2_z = np.genfromtxt('Accelerometer2.csv', delimiter=',') 

Acc3_x = np.genfromtxt('Accelerometer3.csv', delimiter=',') 
Acc3_y = np.genfromtxt('Accelerometer3.csv', delimiter=',') 
Acc3_z = np.genfromtxt('Accelerometer3.csv', delimiter=',') 

Acc1_x_normed = (Acc1_x - Acc1_x.min())/Acc1_x.ptp() 
Acc1_y_normed = (Acc1_y - Acc1_y.min())/Acc1_y.ptp() 
Acc1_z_normed = (Acc1_z - Acc1_y.min())/Acc1_y.ptp() 

Acc2_x_normed = (Acc2_x - Acc2_x.min())/Acc2_x.ptp() 
Acc2_y_normed = (Acc2_y - Acc2_y.min())/Acc2_y.ptp() 
Acc2_z_normed = (Acc2_z - Acc2_z.min())/Acc2_z.ptp() 

Acc3_x_normed = (Acc3_x - Acc3_x.min())/Acc3_x.ptp() 
Acc3_y_normed = (Acc3_y - Acc3_y.min())/Acc3_y.ptp() 
Acc3_z_normed = (Acc3_z - Acc3_z.min())/Acc3_z.ptp() 

print Acc1_x_normed 
print Acc1_y_normed 
print Acc1_z_normed 

print Acc2_x_normed 
print Acc2_y_normed 
print Acc2_z_normed 

print Acc3_x_normed 
print Acc3_y_normed 
print Acc3_z_normed 

However, it prints out:

[ 1.00000000e+00 6.65681116e-10 6.79158889e-10 6.76190128e-10 
    0.00000000e+00 1.33432096e-09] 
[ 1.00000000e+00 6.64579197e-10 6.76536483e-10 6.73108367e-10 
    0.00000000e+00 1.33321904e-09] 
[ 1.00000000e+00 6.64579197e-10 6.78750350e-10 6.72710526e-10 
    -5.20201801e-13 1.33321904e-09] 
[ 1.00000000e+00 6.64916187e-10 6.79567423e-10 6.72057929e-10 
    0.00000000e+00 1.33355603e-09] 
[ 1.00000000e+00 6.65568779e-10 6.81056484e-10 6.73282209e-10 
    0.00000000e+00 1.33420862e-09] 
[ 1.00000000e+00 6.64252896e-10 6.78771073e-10 6.71313064e-10 
    0.00000000e+00 1.33289274e-09] 
[ 1.00000000e+00 6.61436566e-10 6.71241501e-10 6.69088480e-10 
    0.00000000e+00 1.33007639e-09] 
[ 1.00000000e+00 6.70966021e-10 6.84606942e-10 6.79750611e-10 
    0.00000000e+00 1.33960584e-09] 
[ 1.00000000e+00 6.70894477e-10 6.84147587e-10 6.82066111e-10 
    0.00000000e+00 1.33953430e-09] 

I need it to print all of the roughly 1000 values from each column of the CSV file, but it prints only 6 values in each array.

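`Acc1_x` is no different from `Acc1_y` in your code, and so on. This problem runs through the rest of your code; you need to reference the specific columns somehow, whether by index or by name. Maybe start with [`pandas.read_csv`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html)? – roganjosh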

Do you realize that `Acc1_x` (and the others) is an (n, 6) 2D array? Do you know enough `numpy` to index and operate on rows, columns, and/or the whole array? – hpaulj

@roganjosh Would I index `genfromtxt` with the correct column number? – dirtysocks45
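To illustrate the point raised in these comments, here is a minimal sketch of two ways to pick out single columns. It assumes the file layout shown in the question; the `csv_text` string and `io.StringIO` only stand in for 'Accelerometer1.csv' so the snippet runs on its own, and the names `full`, `acc_x`, `acc_y`, `acc_z` are illustrative.

```python
import io
import numpy as np

# Three sample rows copied from the question; StringIO stands in for the real file.
csv_text = (
    "1495573445.162, 0, 0.021973, 0.012283, -0.995468, 1\n"
    "1495573445.172, 0, 0.021072, 0.013779, -0.994308, 1\n"
    "1495573445.182, 0, 0.020157, 0.015717, -0.995575, 1\n"
)

# Without usecols, genfromtxt returns every column: one row per line, 6 columns.
full = np.genfromtxt(io.StringIO(csv_text), delimiter=',')
print(full.shape)   # (3, 6)

# Option 1: read only the third, fourth and fifth columns (zero-based 2, 3, 4).
# unpack=True transposes the result so it can be split into one array per column.
acc_x, acc_y, acc_z = np.genfromtxt(
    io.StringIO(csv_text), delimiter=',', usecols=(2, 3, 4), unpack=True
)
print(acc_x.shape)  # (3,)

# Option 2: read everything once and slice the columns out by index.
acc_x_sliced = full[:, 2]
```

With the real files, the same `usecols` call on 'Accelerometer1.csv' should give three 1-D arrays of about 1000 values each, which can then be normalized one at a time as in the question.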

Answers

1

You are very close; you just need to add axis=0, so

Acc1_x_normed = (Acc1_x - Acc1_x.min())/Acc1_x.ptp() 

becomes

Acc1_x_normed = (Acc1_x - Acc1_x.min(axis=0))/Acc1_x.ptp(axis=0) 
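As a rough illustration of why axis=0 matters: without it, min() and ptp() reduce over the whole (n, 6) array, so the timestamp column (about 1.5e9) sets the range and every other value collapses to roughly 1e-10, which matches the output in the question. The sketch below applies the per-column version to the three accelerometer columns only. The `data` array holds rows from the question, the names `xyz`, `col_min`, and `col_range` are illustrative, and it uses the function form `np.ptp(...)`, which computes the same range as the `.ptp()` method.

```python
import numpy as np

# Three sample rows from the question, as an (n_rows, 6) array.
data = np.array([
    [1495573445.162, 0, 0.021973, 0.012283, -0.995468, 1],
    [1495573445.172, 0, 0.021072, 0.013779, -0.994308, 1],
    [1495573445.182, 0, 0.020157, 0.015717, -0.995575, 1],
])

# Keep only the x, y, z columns (zero-based indices 2, 3, 4).
xyz = data[:, 2:5]

# axis=0 reduces down the rows, giving one minimum and one range per column.
col_min = xyz.min(axis=0)
col_range = np.ptp(xyz, axis=0)

# Broadcasting applies each column's own min and range to that column.
xyz_normed = (xyz - col_min) / col_range
print(xyz_normed)
```

Each column of `xyz_normed` then runs from 0 to 1 independently of the others.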

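It works, but I am getting a lot of NaNs and warnings about invalid division. I believe I may be dividing by zero in some cases. Also, @roganjosh is right: all my Acc_x, Acc_y, and Acc_z arrays are the same. I don't know how to fix this. – dirtysocks45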

After importing numpy but before the rest of your code, you can call np.seterr(divide='ignore', invalid='ignore') to suppress the warning messages. – James

To convert NaN (not a number) values to zero, add Acc1_x_normed = np.nan_to_num(Acc1_x_normed) – James
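The NaNs most likely come from the constant columns: in the sample rows the second column is always 0 and the sixth is always 1, so their range is 0 and the normalization computes 0/0. As an alternative to silencing the warnings, here is a sketch that skips the division for zero-range columns. The helper name `min_max_normalize` is hypothetical, and filling constant columns with zeros is an assumption about the desired result, mirroring what np.nan_to_num would produce.

```python
import numpy as np

def min_max_normalize(arr):
    """Scale each column to [0, 1]; columns with zero range become all zeros."""
    arr = np.asarray(arr, dtype=float)
    col_min = arr.min(axis=0)
    col_range = np.ptp(arr, axis=0)
    out = np.zeros_like(arr)
    # Divide only where the column range is non-zero; elsewhere `out` keeps its 0.0.
    np.divide(arr - col_min, col_range, out=out, where=col_range != 0)
    return out

# Sample rows from the question; columns 1 and 5 are constant.
data = np.array([
    [1495573445.162, 0, 0.021973, 0.012283, -0.995468, 1],
    [1495573445.172, 0, 0.021072, 0.013779, -0.994308, 1],
    [1495573445.182, 0, 0.020157, 0.015717, -0.995575, 1],
])

normed = min_max_normalize(data)
print(np.isnan(normed).any())
```

Because the division is never attempted for the constant columns, no warning is raised and no NaN appears, so neither np.seterr nor np.nan_to_num is needed in this variant.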