pandas DataFrame (selection)

I created a DataFrame by reading a CSV file. Printing it gives:

<class 'pandas.core.frame.DataFrame'> 
    Int64Index: 176 entries, 0 to 175 
    Data columns (total 8 columns): 
    ID   176 non-null values 
    study   176 non-null values 
    center  176 non-null values 
    initials  176 non-null values 
    age   147 non-null values 
    sex   133 non-null values 
    lesion age 35 non-null values 
    group   35 non-null values 
    dtypes: float64(2), int64(1), object(5)

Why do I get an error when I try to select rows from the DataFrame by a condition?

SUBJECTS[SUBJECTS.study=='NO2' and SUBJECTS.center=='Hermann']

The error message is:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Thank you very much in advance.

Source

2014-06-15 Hello lad

Use & instead of and:

SUBJECTS[(SUBJECTS.study=='NO2') & (SUBJECTS.center=='Hermann')]

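and causes Python to evaluate SUBJECTS.study=='NO2' and SUBJECTS.center=='Hermann' in a boolean context (as either True or False).

In your case, you do not want either one evaluated as a single boolean value. Instead, you want the element-wise logical and. That is specified by &, not and.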
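A minimal sketch of the difference, using a small made-up frame. The rows are hypothetical; only the column names study and center and the values 'NO2' and 'Hermann' come from the question. The parentheses in the expression above matter because & binds more tightly than ==, so here each comparison is stored in its own variable first.

```python
import pandas as pd

# Hypothetical stand-in for SUBJECTS; the rows are invented for illustration.
SUBJECTS = pd.DataFrame({
    "study": ["NO2", "NO2", "XX1", "NO2"],
    "center": ["Hermann", "Other", "Hermann", "Hermann"],
})

# Each comparison yields a boolean Series with one value per row.
is_no2 = SUBJECTS.study == "NO2"
is_hermann = SUBJECTS.center == "Hermann"

# & combines the two Series element by element.
selected = SUBJECTS[is_no2 & is_hermann]
print(selected)

# `and` asks for a single True/False from a whole Series, which raises.
try:
    SUBJECTS[is_no2 and is_hermann]
    raised = False
except ValueError:
    raised = True
print("and raised ValueError:", raised)
```

Only the rows where both conditions hold survive the & version; the and version never gets as far as indexing.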

The error

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

is raised whenever you try to evaluate a NumPy array or pandas NDFrame in a boolean context. Consider

bool(np.array([True, False]))

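Some users might expect this to return True because the array is non-empty. Others might expect True because at least one element of the array is True. Still others might expect False because not all elements of the array are True. Since there are several equally valid expectations for what a boolean context should return, the designers of NumPy and pandas decided to force the user to be explicit: use .all(), .any(), or len().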
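The three readings can be spelled out explicitly. This is only an illustrative sketch; the variable names are made up.

```python
import numpy as np

arr = np.array([True, False])

any_true = arr.any()       # at least one element is True
all_true = arr.all()       # every element is True
non_empty = len(arr) > 0   # the array has at least one element

# Asking for the truth value of the whole array is refused as ambiguous.
try:
    bool(arr)
    ambiguous = False
except ValueError:
    ambiguous = True

print(any_true, all_true, non_empty, ambiguous)
```

Each explicit form gives a single, unambiguous answer, which is why the bare bool() call is rejected rather than silently choosing one of them.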

Source

2014-06-15 21:22:13 unutbu

Welcome to SO. The error comes from how NumPy, which pandas is built on, behaves. Consider these examples:

In [158]: 
a=np.array([1,2,1,1,1,1,2]) 
b=np.array([1,1,1,2,2,2,1]) 

In [159]: 
#Array Boolean operation 
a==1 
Out[159]: 
array([ True, False, True, True, True, True, False], dtype=bool) 

In [160]: 
#Array Boolean operation 
b==1 
Out[160]: 
array([ True, True, True, False, False, False, True], dtype=bool) 

In [161]: 
#and is not an array Boolean operation 
(a==1) and (b==1) 
--------------------------------------------------------------------------- 
ValueError        Traceback (most recent call last) 
<ipython-input-161-271ddf20f621> in <module>() 
----> 1 (a==1) and (b==1) 

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() 

In [162]: 
#But & operates on arrays 
(a==1) & (b==1) 
Out[162]: 
array([ True, False, True, False, False, False, False], dtype=bool) 

In [163]: 
#Or * 
(a==1) * (b==1) 
Out[163]: 
array([ True, False, True, False, False, False, False], dtype=bool) 

In [164]: 
df=pd.DataFrame({'a':a, 'b':b}) 
In [166]: 
#Therefore this is a good approach 
df[(df.a==1) & (df.b==1)] 
Out[166]: 
   a  b
0  1  1
2  1  1
2 rows × 2 columns

In [167]: 
#This will also get you there, but it is not preferred. 
df[df.a==1][df.b==1] 
C:\Anaconda\lib\site-packages\pandas\core\frame.py:1686: UserWarning: Boolean Series key will be reindexed to match DataFrame index. 
    "DataFrame index.", UserWarning) 
Out[167]: 
   a  b
0  1  1
2  1  1
2 rows × 2 columns
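One likely reason the combined mask is preferred: it is built against the original frame in a single step, whereas the chained form filters once and then applies a mask computed from the unfiltered frame, hence the reindexing warning. Below is a sketch with the same a and b as in the session. The .loc variant is an addition for illustration, not part of the session itself.

```python
import numpy as np
import pandas as pd

a = np.array([1, 2, 1, 1, 1, 1, 2])
b = np.array([1, 1, 1, 2, 2, 2, 1])
df = pd.DataFrame({"a": a, "b": b})

# Build the mask once; it holds one boolean per row of df.
mask = (df.a == 1) & (df.b == 1)

# Plain indexing and .loc both accept the mask.
rows = df[mask]
rows_loc = df.loc[mask]

# .loc can also pick a column in the same step.
only_b = df.loc[mask, "b"]

print(rows)
print(only_b)
```

Keeping the mask in a variable also makes it easy to reuse or to inspect with mask.sum() when a selection returns fewer rows than expected.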

Source

2014-06-15 21:29:38

Thank you very much, CT Zhu. I went through all your code and it helped me understand a lot :) @CT Zhu
