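pandas value_counts() works in a for loop but fails as a lambda
I have a DataFrame with three variables, and I want to create a dictionary holding the relative count of each label for each variable.
I can easily write a for loop that outputs exactly what I want, but my lambda produces a much stranger result.
Here is the data: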
In [3]:
import pandas as pd
raw_data = {
'category1': ['Red', 'Red', 'Red', 'Green'],
'category2': ['Plane', 'Plane', 'Plane', 'Car'],
'category3': ['Orange', 'Orange', 'Orange', 'Banana'],
}
df = pd.DataFrame(raw_data)
df
Out[3]:
  category1 category2 category3
0       Red     Plane    Orange
1       Red     Plane    Orange
2       Red     Plane    Orange
3     Green       Car    Banana
The for loop produces exactly the output I want:
In [4]:
forloop = {}
for column in df:
    forloop[column] = df[column].value_counts(normalize=True).to_dict()
forloop
Out[4]:
{'category1': {'Green': 0.25, 'Red': 0.75},
'category2': {'Car': 0.25, 'Plane': 0.75},
'category3': {'Banana': 0.25, 'Orange': 0.75}}
However, this lambda fails for a reason I cannot figure out:
In [6]:
ratio = lambda x: x.value_counts(normalize=True).to_dict()
output_lambda = df.apply(ratio)
output_lambda
Out[6]:
category1    <built-in method values of dict object at 0x10...
category2    <built-in method values of dict object at 0x10...
category3    <built-in method values of dict object at 0x10...
dtype: object
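
For comparison, here is a minimal sketch of the same computation written as a dict comprehension, which keeps the one-expression feel of the lambda but does not go through df.apply at all. The name ratios is only illustrative. The `<built-in method values of dict object ...>` entries above suggest that apply is picking up the dict's own `.values` method instead of its contents when the function returns a dict; that is an assumption on my part, not something I have confirmed.

```python
import pandas as pd

raw_data = {
    'category1': ['Red', 'Red', 'Red', 'Green'],
    'category2': ['Plane', 'Plane', 'Plane', 'Car'],
    'category3': ['Orange', 'Orange', 'Orange', 'Banana'],
}
df = pd.DataFrame(raw_data)

# Iterating over a DataFrame yields its column labels, exactly as in the for loop.
# Each value is a plain dict of label -> relative frequency for that column.
ratios = {
    column: df[column].value_counts(normalize=True).to_dict()
    for column in df
}
print(ratios)
```

This builds the same nested dictionary as forloop, so it can serve as a drop-in replacement while the apply behaviour remains unexplained.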