2015-10-06 145 views

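Problem with isin in pandas

Sorry, I just asked this question: Pythonic Way to have multiple Or's when conditioning in a dataframe, but I marked it as answered prematurely, because the answer passed my overly simple test case but does not work more generally. (If it is possible to merge the two and reopen that question, that would be great...)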

Here is the full problem:

sum(data['Name'].isin(eligible_players)) 
> 0 

sum(data['Name'] == "Antonio Brown") 
> 68 

"Antonio Brown" in eligible_players 
> True 

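Basically, if I understand correctly, I have shown that Antonio Brown is in eligible_players and that he is in the dataframe. However, for some reason .isin() does not find him.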

As I said in my previous question, I am looking for a way to check many ORs in order to select the right rows.
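
For reference, the "many ORs" selection can be sketched as follows. The four-row frame and the two-name list are made up for illustration (they are not the real data, only names and points borrowed from it); the point is just that chaining == comparisons with | and calling .isin() on a plain list should select the same rows.

```python
import pandas as pd

# Hypothetical miniature of the real data; names and points borrowed from this question.
data = pd.DataFrame({
    "Name": ["Antonio Brown", "Adam Thielen", "Dez Bryant", "Cole Beasley"],
    "Pts": [25, 15, 25, 11],
})

# Long-hand version: one comparison per name, joined with |.
mask_or = (data["Name"] == "Antonio Brown") | (data["Name"] == "Dez Bryant")

# Compact version: membership in a plain Python list of names.
wanted = ["Antonio Brown", "Dez Bryant"]
mask_isin = data["Name"].isin(wanted)

# Boolean indexing keeps only the rows where the mask is True.
selected = data[mask_isin]
print(selected)
```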

____EDIT____

In[14]: 
eligible_players 
Out[14]: 
Name 
Antonio Brown  378 
Demaryius Thomas 334 
Jordy Nelson  319 
Dez Bryant   309 
Emmanuel Sanders 293 
Odell Beckham  289 
Julio Jones   288 
Randall Cobb  284 
Jeremy Maclin  267 
T.Y. Hilton   255 
Alshon Jeffery  252 
Golden Tate   250 
Mike Evans   236 
DeAndre Hopkins  223 
Calvin Johnson  220 
Kelvin Benjamin  218 
Julian Edelman  213 
Anquan Boldin  213 
Steve Smith   213 
Roddy White   208 
Brandon LaFell  205 
Mike Wallace  205 
A.J. Green   203 
DeSean Jackson  200 
Jordan Matthews  194 
Eric Decker   194 
Sammy Watkins  190 
Torrey Smith  186 
Andre Johnson  186 
Jarvis Landry  178 
Eddie Royal   176 
Brandon Marshall 175 
Vincent Jackson  175 
Rueben Randle  174 
Marques Colston  173 
Mohamed Sanu  171 
Keenan Allen  170 
James Jones   168 
Malcom Floyd  168 
Kenny Stills  167 
Greg Jennings  162 
Kendall Wright  162 
Doug Baldwin  160 
Michael Floyd  159 
Robert Woods  158 
Name: Pts, dtype: int64 

In [31]: 
data.tail(110) 
Out[31]: 
Name Pts year week pos Team 
28029 Dez Bryant 25 2014 17 WR DAL 
28030 Antonio Brown 25 2014 17 WR PIT 
28031 Jordan Matthews 24 2014 17 WR PHI 
28032 Randall Cobb 23 2014 17 WR GB 
28033 Rueben Randle 21 2014 17 WR NYG 
28034 Demaryius Thomas 19 2014 17 WR DEN 
28035 Calvin Johnson 19 2014 17 WR DET 
28036 Torrey Smith 18 2014 17 WR BAL 
28037 Roddy White 17 2014 17 WR ATL 
28038 Steve Smith 17 2014 17 WR BAL 
28039 DeSean Jackson 16 2014 17 WR WAS 
28040 Mike Evans 16 2014 17 WR TB 
28041 Anquan Boldin 16 2014 17 WR SF 
28042 Adam Thielen 15 2014 17 WR MIN 
28043 Cecil Shorts 15 2014 17 WR JAC 
28044 A.J. Green 15 2014 17 WR CIN 
28045 Jordy Nelson 14 2014 17 WR GB 
28046 Brian Hartline 14 2014 17 WR MIA 
28047 Robert Woods 13 2014 17 WR BUF 
28048 Kenny Stills 13 2014 17 WR NO 
28049 Emmanuel Sanders 13 2014 17 WR DEN 
28050 Eddie Royal 13 2014 17 WR SD 
28051 Marques Colston 13 2014 17 WR NO 
28052 Chris Owusu 12 2014 17 WR NYJ 
28053 Brandon LaFell 12 2014 17 WR NE 
28054 Dontrelle Inman 12 2014 17 WR SD 
28055 Reggie Wayne 11 2014 17 WR IND 
28056 Paul Richardson 11 2014 17 WR SEA 
28057 Cole Beasley 11 2014 17 WR DAL 
28058 Jarvis Landry 10 2014 17 WR MIA 

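Please make an [MCVE] so that others can confirm the problem (and so that we can verify it is not just 'eligible_players' having somehow changed, or something else trivial.) – DSM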


Yes, and again, as above: on my toy example this works perfectly fine –


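@DSM I tried to create a minimal, complete, verifiable example in the earlier question that I linked, which is why I accepted the answer there. When I ran it on my actual data it did not work, so I am trying to find out where the problem comes from... – qwertylpc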

Answer


(Aside: once you posted what you were actually using, it took only seconds to see the problem.)

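Series.isin(something) iterates over something to determine the set of things to test membership against. But your eligible_players is not a list, it is a Series. And iterating over a Series iterates over its values, even though membership (in) is tested against the index: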

In [72]: eligible_players = pd.Series([10,20,30], index=["A","B","C"]) 

In [73]: list(eligible_players) 
Out[73]: [10, 20, 30] 

In [74]: "A" in eligible_players 
Out[74]: True 

So in your case, you could pass eligible_players.index instead, so that the names are what gets compared:

In [75]: df = pd.DataFrame({"Name": ["A","B","C","D"]}) 

In [76]: df 
Out[76]: 
    Name 
0 A 
1 B 
2 C 
3 D 

In [77]: df["Name"].isin(eligible_players) # remember, this will be [10, 20, 30] 
Out[77]: 
0 False 
1 False 
2 False 
3 False 
Name: Name, dtype: bool 

In [78]: df["Name"].isin(eligible_players.index) 
Out[78]: 
0  True 
1  True 
2  True 
3 False 
Name: Name, dtype: bool 

In [79]: df["Name"].isin(eligible_players.index).sum() 
Out[79]: 3
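
The same effect can be reproduced with names from the question. This is only a minimal sketch on made-up miniature data (three players instead of the full Series, and a three-row frame), assuming the real eligible_players has names in its index and points as its values, as the posted output suggests.

```python
import pandas as pd

# Miniature stand-in for the asker's Series: names in the index, points as values.
eligible_players = pd.Series(
    [378, 334, 319],
    index=pd.Index(["Antonio Brown", "Demaryius Thomas", "Jordy Nelson"], name="Name"),
    name="Pts",
)

# Hypothetical three-row frame; Adam Thielen is not an eligible player here.
data = pd.DataFrame({"Name": ["Antonio Brown", "Adam Thielen", "Jordy Nelson"]})

# `in` on a Series checks the index labels, so the name is found.
name_in_series = "Antonio Brown" in eligible_players

# isin() iterates over the values (378, 334, 319), so no name matches.
hits_values = data["Name"].isin(eligible_players).sum()

# Passing the index compares names with names.
hits_index = data["Name"].isin(eligible_players.index).sum()

print(name_in_series, hits_values, hits_index)
```

If a plain list is preferred, list(eligible_players.index) should work just as well as the index itself.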