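I'm trying to reduce the dimensionality of a dataset with PCA. Then I assign each datapoint a "class/category" according to some criteria (depending on the number in the name of the file the datapoint was obtained from) and plot all datapoints as a scatter plot with a legend. My problem concerns the additional information attached to the points of that scatter plot.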
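I have some additional information for each datapoint stored in a separate list, and I want every datapoint to be pickable so that I can read that information in the terminal. When plotting my scatter plot, the order gets messed up (I assume because I plot the data subset by subset). The indices delivered by the pick event no longer match the array holding the additional information.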
I tried to rearrange the information array while plotting, but somehow it still doesn't work. Here is my code:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

targets = []
trainNames = []


# React to a click on a datapoint.
def onPick(event):
    indexes = event.ind
    xy = event.artist.get_offsets()
    for index in indexes:
        print(trainNames[index])


# Load the additional information for each datapoint. It's stored in the
# same order as the datapoints in 'trainingfile.csv'.
with open("training-names.csv") as modelNamesFile:
    for line in modelNamesFile:
        # Save target for datapoint. It's the class of the object, separated
        # into "rectangular", "cylindrical", "irregular", depending on the
        # object's file number.
        objnum = int(line.split(",")[-1].split("/")[-1].split(".")[0])
        if objnum <= 16:
            objnum = 0
        elif 17 <= objnum <= 34:
            objnum = 1
        else:
            objnum = 2
        targets.append(objnum)

        # Save name description for datapoint.
        sceneName = line.split(",")[0].split("/")[-1]
        modelName = line.split(",")[-1].split("/")[-1].split(".")[0]
        trainNames.append(sceneName + ", " + modelName)

target_names = ["rectangular", "cylindrical", "irregular"]

# Load the actual data.
tData = []
with open("trainingfile.csv") as f:
    for line in f:
        tData.append([float(feature) for feature in line.split(",")])
data = np.array(tData)

# Transform it into 2D with PCA.
y = np.array(targets)
X = np.delete(data, data.shape[1] - 1, 1)  # Strip class.
pipeline = Pipeline([('scaling', StandardScaler()), ('pca', PCA(n_components=2))])
X_reduced = pipeline.fit_transform(X)  # Fit on X, the data without the class column.

# Create plot.
trainNames = np.array(trainNames)
tmpTrainNames = np.array([])
fig = plt.figure()
for c, i, target_name in zip("rgb", [0, 1, 2], target_names):
    plt.scatter(X_reduced[y == i, 0], X_reduced[y == i, 1], c=c, label=target_name, picker=True)
    # Here I try to rearrange the additional information into the order
    # in which the points were plotted.
    tmpTrainNames = np.append(tmpTrainNames, trainNames[y == i])
trainNames = tmpTrainNames

plt.legend()
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
fig.canvas.mpl_connect('pick_event', onPick)
plt.show()
If it's too complicated, I can try to simplify it. Just tell me.
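A likely cause of the mismatch is that each `plt.scatter` call creates its own artist, and `event.ind` counts from 0 within the artist that was clicked, not within one concatenated array. The following is only a minimal sketch of one possible workaround under that assumption: it stores the names of each subset per artist and looks them up via `event.artist`. The helper names `scatter_by_class` and `picked_names`, the made-up data, and the `Agg` backend line are illustrative assumptions, not part of the code above.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere; drop it for interactive use
import matplotlib.pyplot as plt


def scatter_by_class(ax, points, labels, names, class_names, colors="rgb"):
    """Plot one scatter artist per class and remember the names of each artist's points."""
    names_by_artist = {}
    for color, class_id, class_name in zip(colors, range(len(class_names)), class_names):
        mask = labels == class_id
        artist = ax.scatter(points[mask, 0], points[mask, 1],
                            c=color, label=class_name, picker=True)
        # event.ind indexes into exactly this subset, so keep the subset per artist.
        names_by_artist[artist] = names[mask]
    return names_by_artist


def picked_names(event, names_by_artist):
    """Translate a pick event into the names of the picked points."""
    subset = names_by_artist[event.artist]
    return [str(subset[i]) for i in event.ind]


# Made-up stand-in data (hypothetical, not the real training files).
points = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5], [3.0, 2.0], [4.0, 1.5]])
labels = np.array([2, 0, 1, 0, 2])
names = np.array(["scene_a, 40", "scene_b, 3", "scene_c, 20", "scene_d, 9", "scene_e, 51"])
class_names = ["rectangular", "cylindrical", "irregular"]

fig, ax = plt.subplots()
names_by_artist = scatter_by_class(ax, points, labels, names, class_names)
ax.legend()
fig.canvas.mpl_connect(
    "pick_event",
    lambda event: print("\n".join(picked_names(event, names_by_artist))),
)
# plt.show()  # uncomment when running with an interactive backend
```

With this layout no rearranging of the names array is needed, because every lookup is made relative to the artist that was actually clicked.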