Second iteration adds an extra character in Pandas/Numpy

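I am running the code below. The first iteration runs fine, but as soon as the second iteration starts it raises a KeyError. I noticed that when the second iteration starts, the character "L" is automatically appended to the key. A link to my code is below:

A link to the data I am using is below: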

I don't know why this is happening. Can someone tell me what is causing this problem? Any help is much appreciated!

Traceback (most recent call last):
  File "C:/Python27/myScripts/KNN.py", line 114, in <module>
    pred_lst.append(predict_output_of_query(10.0, features_train, df_housePrice_train, features_test[i]))
  File "C:/Python27/myScripts/KNN.py", line 96, in predict_output_of_query
    avg1 += output_train["price"][i]
  File "C:\Python27\lib\site-packages\pandas\core\series.py", line 557, in __getitem__
    result = self.index.get_value(self, key)
  File "C:\Python27\lib\site-packages\pandas\core\index.py", line 1790, in get_value
    return self._engine.get_value(s, k)
  File "pandas\index.pyx", line 103, in pandas.index.IndexEngine.get_value (pandas\index.c:3204)
  File "pandas\index.pyx", line 111, in pandas.index.IndexEngine.get_value (pandas\index.c:2903)
  File "pandas\index.pyx", line 157, in pandas.index.IndexEngine.get_loc (pandas\index.c:3843)
  File "pandas\hashtable.pyx", line 303, in pandas.hashtable.Int64HashTable.get_item (pandas\hashtable.c:6525)
  File "pandas\hashtable.pyx", line 309, in pandas.hashtable.Int64HashTable.get_item (pandas\hashtable.c:6463)
KeyError: 6818L

Source

2016-04-06 user1122534

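If you dump your entire code and data here, I don't think you will get many responses. Try to isolate your problem and make it concise. Then you will get more responses. – Hun

@Hun I provided the data along with the code so that someone can run the code directly and see the error. The earlier steps have to be executed to reach the step where I hit the error. – user1122534

The code is still long and the data set is large. Look at other questions to see how other people get multiple answers. – Hun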

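So far I have only looked at your definition of get_numpy_data, and I don't think it works the way you expect. For example, the line

features_train, output_train = get_numpy_data(df_housePrice_train, feature_list, 'price')

seems to modify df_housePrice_train, and output_train becomes a NumPy array containing the string "price".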

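Update:

The line distances = [] should really be inside the function compute_distances. As written, the function appends elements to the same distances list every time it is called. The indices (positions) of some of those elements are then used to look up rows in the dataframe. On the first call everything works, but afterwards the list keeps growing and some indices become larger than the size of the dataframe.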
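The effect described above can be reproduced with a minimal sketch. The original KNN.py is not shown here, so the function bodies, the Euclidean distance, and the toy data below are assumptions made for illustration; only the names compute_distances and distances come from the thread.

```python
import numpy as np

shared_distances = []  # module-level list: it survives between calls (the bug)

def compute_distances_buggy(features_instances, features_query):
    # Appends to the shared list, so results from earlier calls pile up.
    for row in features_instances:
        shared_distances.append(float(np.sqrt(np.sum((row - features_query) ** 2))))
    return shared_distances

def compute_distances_fixed(features_instances, features_query):
    distances = []  # fresh list on every call
    for row in features_instances:
        distances.append(float(np.sqrt(np.sum((row - features_query) ** 2))))
    return distances

# Hypothetical toy data: 3 training rows and one query point.
features = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
query = np.array([0.0, 0.0])

buggy_first_len = len(compute_distances_buggy(features, query))
buggy_second_len = len(compute_distances_buggy(features, query))
fixed_first_len = len(compute_distances_fixed(features, query))
fixed_second_len = len(compute_distances_fixed(features, query))

# Positions taken from the overgrown list can exceed the number of rows.
largest_position = int(np.max(np.argsort(shared_distances)))
```

With three training rows, the buggy version returns a list of length 3 on the first call and 6 on the second, so a position such as 5 no longer corresponds to any row of the training data.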

Update 2

For completeness: KeyError: 6818L means that the long integer 6818 (the L only denotes the type here) is not a valid key of df_housePrice_train.
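As a rough illustration of that reading of the error, the sketch below looks up a label that is not in a Series index. The Series contents are made up for the example. Note that the L suffix is specific to Python 2, where the repr of a long integer ends in L; Python 3 has a single int type, so the same error would print as KeyError: 6818.

```python
import pandas as pd

# Hypothetical price column with an integer index of labels 0, 1, 2.
prices = pd.Series([100.0, 200.0, 300.0], index=[0, 1, 2], name="price")

caught = False
try:
    prices.loc[6818]  # label-based lookup; 6818 is not a label in the index
except KeyError:
    caught = True  # pandas reports the missing label, not a mangled key

label_present = 6818 in prices.index
```

The key itself is unchanged between iterations; the lookup fails because no row carries that label.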

The required code change:

## KNN.py, line 61: 
#distances = []  # <- delete this line 

def compute_distances(features_instances, features_query): 
    distances = []  # <-- add here 
    # rest of the function body...

Source

2016-04-06 04:36:04 ptrj

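Yes, thanks for pointing that out. However, that does not seem to be the problem (although I have replaced it with df_housePrice_train['price']). The code is updated on Git. What happens is that if the original key is 6818, on the second iteration it looks for 6818L for some reason and throws an error. – user1122534

Found it. I think this answers the question. But I have not checked the whole code, so I cannot guarantee its correctness. – ptrj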

Hi, that did not answer my question – user1122534
