2017-03-07 41 views
-1
def age_range(age):
    if age <= 18:
        return 'Minors'
    elif age >= 19 & age < 63:
        return 'Adults'
    elif age >= 63 & age < 101:
        return 'Senior Citizen'
    else:
        return 'Age Unknown'

titanic_data_df["PassengerType"] = titanic_data_df[['Age']].apply(age_range, axis = 1) 

titanic_data_df.head() 

I get the following error when I try to add a new column to an existing DataFrame (titanic_data_df): a ValueError raised while using the apply() method.

--------------------------------------------------------------------------- 
ValueError        Traceback (most recent call last) 
<ipython-input-466-741f5646101e> in <module>() 
     1 #create a new df with just age and distinguish each passenger as minor, adult or senior citizen 
----> 2 titanic_data_df["PassengerType"] =  titanic_data_df[['Age']].apply(age_range, axis = 1) 
     3 
     4 titanic_data_df.head() 

C:\Users\test\Anaconda2\envs\py27DAND\lib\site-packages\pandas\core\frame.pyc in apply(self, func, axis, broadcast, raw, reduce, args, **kwds) 
    4161      if reduce is None: 
    4162       reduce = True 
-> 4163      return self._apply_standard(f, axis, reduce=reduce) 
    4164    else: 
    4165     return self._apply_broadcast(f, axis) 

C:\Users\test\Anaconda2\envs\py27DAND\lib\site-packages\pandas\core\frame.pyc in _apply_standard(self, func, axis, ignore_failures, reduce) 
    4257    try: 
    4258     for i, v in enumerate(series_gen): 
    -> 4259      results[i] = func(v) 
    4260      keys.append(v.name) 
    4261    except Exception as e: 

<ipython-input-465-e62ccbeee80e> in age_range(age) 
     1 def age_range(age): 
----> 2  if age <= 18: 
     3   return 'Minors' 
     4  elif age >= 19 & age < 63: 
     5   return 'Adults' 

C:\Users\test\Anaconda2\envs\py27DAND\lib\site-packages\pandas\core\generic.pyc in __nonzero__(self) 
    915   raise ValueError("The truth value of a {0} is ambiguous. " 
    916       "Use a.empty, a.bool(), a.item(), a.any() or a.all()." 
--> 917       .format(self.__class__.__name__)) 
    918 
    919  __bool__ = __nonzero__ 

ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', u'occurred at index 0') 

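From what I have read so far, it has something to do with the if ... else statements in the function above, but I can't figure out what exactly. Any help is appreciated. Thanks.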

+1

Can you add an MCVE (including the traceback) to your question? If we can't reproduce the error, it's hard to figure out what's going on. – MSeifert

+0

Is this a pandas question? The question tags seem incomplete. –

+1

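I don't know much about pandas, but I do know that the bitwise operator `&` is different from the logical operator `and`, so that is probably what's causing this. Actually, never mind: that would produce incorrect results, not an error. – TigerhawkT3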

Answers

1

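When you select a column as titanic_data_df[['Age']] (note the double square brackets), you actually get a DataFrame containing a single column. With axis=1, apply() then passes each row to age_range as a single-element Series, so `age <= 18` produces a Series rather than a plain True or False, and the `if` raises the ValueError about an ambiguous truth value.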
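To make the difference concrete, here is a minimal sketch that records what `apply` hands to the function in each case. The three-row frame is a made-up stand-in for the real titanic_data_df, not the actual data.

```python
import pandas as pd

# Hypothetical stand-in for titanic_data_df; only the 'Age' column matters here.
df = pd.DataFrame({'Age': [13, 38, 72]})

# Double brackets give a one-column DataFrame; with axis=1, each row arrives as a Series.
row_types = df[['Age']].apply(lambda row: type(row).__name__, axis=1)

# Single brackets give a Series; each element arrives as a scalar.
value_is_series = df['Age'].apply(lambda age: isinstance(age, pd.Series))

print(row_types.tolist())        # ['Series', 'Series', 'Series']
print(value_is_series.tolist())  # [False, False, False]

# Using a Series comparison in a boolean context reproduces the error from the question.
try:
    bool(df[['Age']].iloc[0] <= 18)
except ValueError as exc:
    message = str(exc)
    print(message)
```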

Try this:

titanic_data_df["PassengerType"] = titanic_data_df['Age'].apply(age_range) 
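Once the function receives scalars, the `&` mentioned in the comments is the next thing to watch. `&` binds more tightly than the comparison operators, so `age >= 19 & age < 63` is parsed as `age >= (19 & age) < 63`. The sketch below keeps the question's thresholds but uses chained comparisons instead. The small `ages` Series is made up for illustration, and the handling of missing values simply falls out of the `else` branch, because comparisons with NaN are False.

```python
import pandas as pd

def age_range(age):
    """Classify a single age value (a scalar, not a Series)."""
    if age <= 18:
        return 'Minors'
    elif 19 <= age < 63:
        return 'Adults'
    elif 63 <= age < 101:
        return 'Senior Citizen'
    else:
        # Reached for NaN, for ages of 101 or more,
        # and for values strictly between 18 and 19.
        return 'Age Unknown'

# Illustrative values only, including one missing age.
ages = pd.Series([13, 38, 72, float('nan')], name='Age')
passenger_types = ages.apply(age_range)
print(passenger_types.tolist())
# ['Minors', 'Adults', 'Senior Citizen', 'Age Unknown']
```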
+0

Thanks for the explanation. That makes sense. Also, it seems I could use applymap() if I wanted to keep working with a DataFrame instead of a Series. – pyuser181

0

The pandas cut function will make this much easier for you. First, I'll build a DataFrame to demonstrate it.

import pandas as pd

titanic_data_df = pd.DataFrame(data=[[13, 'Male'], [14, 'Female'], [38, 'Female'], [72, 'Male'], [33, 'Female'], [80, 'Male'], [34, 'Male'], [15, 'Female'], [27, 'Female'],[23, 'Male'], [64, 'Female'], [38, 'Female'], [12, 'Male'], [32, 'Female'], [21, 'Male'], [66, 'Male'], [73, 'Female'], [22, 'Female']], columns=['Age', 'Sex'])
print(titanic_data_df)
    Age  Sex 
0 13 Male 
1 14 Female 
2 38 Female 
3 72 Male 
4 33 Female 
5 80 Male 
6 34 Male 
7 15 Female 
8 27 Female 
9 23 Male 
10 64 Female 
11 38 Female 
12 12 Male 
13 32 Female 
14 21 Male 
15 66 Male 
16 73 Female 
17 22 Female 

Then I simply apply the cut function:

labels = ['Minors', 'Adults', 'Senior Citizen']
titanic_data_df["PassengerType"] = pd.cut(titanic_data_df.Age, [0, 18, 63, 101], labels=labels)
print(titanic_data_df) 
    Age  Sex  PassengerType 
0 13 Male   Minors 
1 14 Female   Minors 
2 38 Female   Adults 
3 72 Male Senior Citizen 
4 33 Female   Adults 
5 80 Male Senior Citizen 
6 34 Male   Adults 
7 15 Female   Minors 
8 27 Female   Adults 
9 23 Male   Adults 
10 64 Female Senior Citizen 
11 38 Female   Adults 
12 12 Male   Minors 
13 32 Female   Adults 
14 21 Male   Adults 
15 66 Male Senior Citizen 
16 73 Female Senior Citizen 
17 22 Female   Adults 
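One detail worth checking: by default `pd.cut` builds right-closed intervals, so the edges [0, 18, 63, 101] mean (0, 18], (18, 63] and (63, 101]. That puts an age of exactly 63 in the adult group, whereas the function in the question treats 63 as a senior citizen. If the question's boundaries are the ones wanted, one option is `right=False` with a lower edge of 19. This is a sketch assuming whole-number ages, and the sample values are made up for illustration.

```python
import pandas as pd

# Illustrative boundary values, not taken from the Titanic data.
ages = pd.Series([18, 19, 62, 63, 100], name='Age')
labels = ['Minors', 'Adults', 'Senior Citizen']

# Default (right=True): (0, 18], (18, 63], (63, 101]
default_cut = pd.cut(ages, bins=[0, 18, 63, 101], labels=labels)

# right=False makes the intervals left-closed: [0, 19), [19, 63), [63, 101)
left_closed_cut = pd.cut(ages, bins=[0, 19, 63, 101], right=False, labels=labels)

print(default_cut.tolist())
# ['Minors', 'Adults', 'Adults', 'Adults', 'Senior Citizen']
print(left_closed_cut.tolist())
# ['Minors', 'Adults', 'Adults', 'Senior Citizen', 'Senior Citizen']
```

Ages outside the bin edges, and missing ages, come back as NaN from `pd.cut` rather than as an 'Age Unknown' label.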
+0

Thanks for the explanation. This is great! – pyuser181