Conditional loop: pandas Python

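A question about conditional loops over a pandas DataFrame. The DataFrame of interest is huge. We have student names and their test scores at different times (see below). A student is considered to have failed if his/her score on any test is below 75, and to have passed otherwise. I have not been able to do this efficiently. The DataFrame: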

import numpy as np
import pandas as pd

score = {'student_name': ['Jiten', 'Jac', 'Ali', 'Steve', 'Dave', 'James'],
         'test_quiz_1': [74, 81, 84, 67, 59, 96],
         'test_quiz_2': [76, np.nan, 99, 77, 53, 69],
         'test_mid_term': [76, 88, 84, 67, 58, np.nan],
         'test_final_term': [76, 78, 89, 67, 58, 96]}

df = pd.DataFrame(score, columns=['student_name', 'test_quiz_1', 'test_quiz_2', 'test_mid_term', 'test_final_term'])

My approach (modified based on Jacques Kvam's answer):

df.test_quiz_1 > 70

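This (^) shows me the positions where a specific student failed that test. The same can be repeated for the other tests (df.test_quiz_2, ...). Finally, I need to combine all of these into one final column that marks a student as failed if he/she failed any of the tests.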

Edit: I know very little about Python and pandas. Here is pseudocode for how I would implement this in C/C++.

for student in student_list:
    value = 0
    for i in range(no_of_test):
        if (score < 75):
            value = value + 1
        else:
            continue
    if (value > 0):
        student[status] = fail
    else:
        student[status] = pass

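The above is just pseudocode. I do not want to create any additional columns to flag whether a student failed any of the tests. Can something similar be done in Python with pandas?

Please advise.

Source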

2017-07-26 Xingfang Lee

I think this fits your needs:

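# All columns except the name column hold test scores.
cols = df.columns.drop("student_name").tolist()
# True means the student scored below 75 on at least one test
# (missing scores are counted as 0). Recent pandas needs axis by keyword.
df["PassOrFail"] = df[cols].fillna(0).lt(75).any(axis=1)

# One extra boolean column per test: True where that score is below 75.
for i in cols:
    df[i + "_"] = df[i].fillna(0).lt(75)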

Explanation

First we create a list of the relevant columns:

['test_quiz_1', 'test_quiz_2', 'test_mid_term', 'test_final_term']

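Then we create a new column, "PassOrFail", which checks whether any of the relevant columns in the DataFrame contains a value below 75 (with np.nan treated as 0).

Finally, we create a new column of True or False values for each relevant column.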
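The steps above can be tried end to end on the question's data. This is only a minimal sketch: it keeps the intermediate results in variables instead of new columns, and the names `below_75` and `failed_any` are illustrative, not part of the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "student_name": ["Jiten", "Jac", "Ali", "Steve", "Dave", "James"],
    "test_quiz_1": [74, 81, 84, 67, 59, 96],
    "test_quiz_2": [76, np.nan, 99, 77, 53, 69],
    "test_mid_term": [76, 88, 84, 67, 58, np.nan],
    "test_final_term": [76, 78, 89, 67, 58, 96],
})

# Step 1: every column except the name column is a test column.
cols = df.columns.drop("student_name").tolist()

# Step 2: element-wise comparison; a missing score is treated as 0 here.
below_75 = df[cols].fillna(0).lt(75)

# Step 3: reduce across columns, True if any test in the row is below 75.
failed_any = below_75.any(axis=1)

print(below_75)
print(failed_any.tolist())  # [True, True, False, True, True, True]
```

Only Ali has no score below 75 once the missing scores are counted as 0.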

Update

Let's say we are only interested in getting True or False; then the following code should be sufficient:

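cols = df.columns.drop("student_name").tolist()
# True where the student scored below 75 on any test (NaN counted as 0).
results = df[cols].fillna(0).lt(75).any(axis=1).tolist()
# ~ inverts the mask, so True now means the student passed every test.
(~pd.Series(results, index=df["student_name"])).to_dict()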

Output:

{'Ali': True, 
'Dave': False, 
'Jac': False, 
'James': False, 
'Jiten': False, 
'Steve': False}
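If "pass"/"fail" labels are preferred over booleans, as in the question's pseudocode, one possible sketch applies `np.where` to the same mask. The names `failed` and `status` are assumptions made for illustration; missing scores are again counted as 0, and `df` itself gets no new columns.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "student_name": ["Jiten", "Jac", "Ali", "Steve", "Dave", "James"],
    "test_quiz_1": [74, 81, 84, 67, 59, 96],
    "test_quiz_2": [76, np.nan, 99, 77, 53, 69],
    "test_mid_term": [76, 88, 84, 67, 58, np.nan],
    "test_final_term": [76, 78, 89, 67, 58, 96],
})

cols = df.columns.drop("student_name").tolist()

# True where the student scored below 75 on at least one test.
failed = df[cols].fillna(0).lt(75).any(axis=1)

# Build a separate Series keyed by student name; df is left unchanged.
status = pd.Series(np.where(failed, "fail", "pass"), index=df["student_name"])

print(status.to_dict())
```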

Source

2017-07-26 06:34:52

Thanks. Please check the edited part of the question.

@XingfangLee I updated my answer. Is this what you asked for?

Yes. Thanks for your reply.

You should use pandas' vectorized operations, which it inherits from NumPy, instead of loops. For example, to flag the people who passed test_quiz_1:

df.test_quiz_1 > 70

which gives:

0  True 
1  True 
2  True 
3 False 
4 False 
5  True 
Name: test_quiz_1, dtype: bool

Edit: Continuing, let's say you have 3 tests for 5 students, represented as a boolean DataFrame:

       0      1      2
0   True   True  False
1   True   True   True
2   True  False  False
3   True  False   True
4   True  False  False

You can check whether each student passed all of the tests with df.all(axis=1), which gives:

0 False 
1  True 
2 False 
3 False 
4 False 
dtype: bool

Only student 1 passes in this case.
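The boolean example above can be reproduced with a small sketch. The frame is built by hand from the values shown, and the names `passed`, `passed_all` and `failed_any` are chosen here for illustration.

```python
import pandas as pd

# Rows are students, columns are tests; True means the test was passed.
passed = pd.DataFrame([
    [True, True, False],
    [True, True, True],
    [True, False, False],
    [True, False, True],
    [True, False, False],
])

# True only if every test in the row is True.
passed_all = passed.all(axis=1)

# The complement: the student failed at least one test.
failed_any = ~passed_all

print(passed_all.tolist())  # [False, True, False, False, False]
```

Reducing with `all(axis=1)` yields a standalone Series, so nothing has to be stored back into the DataFrame.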

Source

2017-07-26 04:48:56

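Thanks. I have modified the question based on your answer. Please advise.

Thanks again. Can't we avoid creating extra columns in the DataFrame (see columns 0, 1, 2 in the boolean DataFrame above)?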

df.set_index('student_name').lt(75).any(axis=1)
# `lt` is the method version of `<`
# this identifies students that received
# a score less than 75 on any of the tests.

student_name 
Jiten  True 
Jac  False 
Ali  False 
Steve  True 
Dave  True 
James  True 
dtype: bool
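One difference from the `fillna(0)` approach seems worth noting: a comparison against NaN evaluates to False, so in this version a missing score is not counted as a failure. A small sketch contrasting the two on the question's data, with `skip_missing` and `missing_as_zero` as illustrative names:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "student_name": ["Jiten", "Jac", "Ali", "Steve", "Dave", "James"],
    "test_quiz_1": [74, 81, 84, 67, 59, 96],
    "test_quiz_2": [76, np.nan, 99, 77, 53, 69],
    "test_mid_term": [76, 88, 84, 67, 58, np.nan],
    "test_final_term": [76, 78, 89, 67, 58, 96],
})
scores = df.set_index("student_name")

# NaN < 75 is False, so a missing score does not count as a failure.
skip_missing = scores.lt(75).any(axis=1)

# Filling NaN with 0 first turns every missing score into a failure.
missing_as_zero = scores.fillna(0).lt(75).any(axis=1)

# Students whose result depends on how missing scores are handled.
print(skip_missing[skip_missing != missing_as_zero].index.tolist())  # ['Jac']
```

Which behaviour is right depends on whether a missing test should count against the student; the question does not say.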

Source

2017-07-26 04:53:07 piRSquared
