Read a CSV and filter it based on a column


reader = pd.read_csv(csvfile, sep=';', header=0) 
list1=[] 
list2=[]

Here I read the CSV file row by row:

for row in reader.itertuples():
    list1.append(row)

Then I loop over the new list and filter it based on a condition:

for i in range(len(list1)):
    if list1[i][5] == highestpointheight:
        list2.append(list1[i])

Now I have a list filtered on that condition.

Is there a more efficient way to do this, so that I don't need two for loops?

Source

2017-05-03 User193452

Why wouldn't you use pandas' '.sort' (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort.html) on your 'reader' object? –

If you are using pandas, you should learn to work with DataFrames instead of converting them to Python lists or numpy arrays. – xvan

@xvan: I need output like this so that my application (a GraphQL resolver) can read it: [Pandas(Index=5, ullid=1, sheetid=2, highestpointheight=332)]. Can I get this output from a pandas DataFrame? – User193452

# row[5] from itertuples() is the 5th column (row[0] is the index), hence iloc[:, 4]
pd.read_csv(csvfile, sep=';', header=0).loc[lambda df: df.iloc[:, 4] == highestpointheight, :]

See http://pandas.pydata.org/pandas-docs/stable/indexing.html#selection-by-callable
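
To make the callable selection concrete, here is a minimal sketch on a small in-memory CSV. The sample rows are made up for illustration, and selecting by the column name highestpointheight (taken from the namedtuple shown in the comments) instead of by position is an assumption about the real file.

```python
import io

import pandas as pd

# Hypothetical sample data: column names follow the comment thread,
# the values are invented for this example.
csv_text = """ullid;sheetid;highestpointheight
1;2;332
2;2;120
3;4;332
"""
highestpointheight = 332

filtered = (
    pd.read_csv(io.StringIO(csv_text), sep=";", header=0)
    # .loc calls the lambda with the DataFrame and keeps the rows
    # where the returned boolean Series is True.
    .loc[lambda df: df["highestpointheight"] == highestpointheight, :]
)
print(filtered)
```

Because the lambda receives the freshly read DataFrame, the read and the filter can be chained without an intermediate variable.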

Source

2017-05-03 13:15:00 xvan

Nice solution! – MaxU

You may want to reverse the order of the two operations, filtering first and iterating afterwards.

Try:

# keep rows whose 5th column equals highestpointheight
# (row[5] in itertuples() is the 5th column, because row[0] is the index)
reader = reader[reader.iloc[:, 4] == highestpointheight]
for row in reader.itertuples():
    list1.append(row)
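
If the goal is a list of Pandas(...) namedtuples like the one the GraphQL resolver expects, the remaining loop can probably be dropped as well. This is only a sketch: the sample data is invented, and filtering by the column name highestpointheight is an assumption based on the output shown in the comments.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real CSV file.
csv_text = """ullid;sheetid;highestpointheight
1;2;332
2;2;120
3;4;332
"""
reader = pd.read_csv(io.StringIO(csv_text), sep=";", header=0)
highestpointheight = 332

# Filter with a boolean mask first, then materialise the surviving rows.
mask = reader["highestpointheight"] == highestpointheight
list2 = list(reader[mask].itertuples())
print(list2)
```

itertuples() yields namedtuples called Pandas by default, with the original row label in the Index field, so the filtering happens inside pandas and no explicit for loop is needed.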

Source

2017-05-03 13:08:57
