How can I count empty fields in a DataFrame?

I have a DataFrame:

| city | field2 | field3 | field4 | field5 | 
| 1 | a |  | b | b | 
| 2 |  |  | c |  | 
| 3 |  | a |  |  | 
| 4 | a |  |  |  | 
| 1 |  | a |  | b | 
| 2 | b |  | c |  | 
| 4 |  | a |  |  | 
| 3 |  |  | a |  | 
| 2 | b |  |  |  | 
| 1 |  | a |  | b | 
| 2 |  |  | a |  | 
| 3 | a |  |  | b | 
| 1 |  |  | b |  | 
| 1 | b | a |  |  | 
| 2 |  |  | b | b | 
| 1 | b | a |  | b |

What I need is the number of blank cells in each field, grouped by the "city" field:

| city | field2 | field3 | field4 | field5 | 
| 1 | 3 | 2 | 4 | 2 | 
| 2 | 3 | 5 | 1 | 4 | 
| 3 | 2 | 2 | 2 | 2 | 
| 4 | 1 | 1 | 2 | 2 |

How can I do this with Python pandas?


2015-09-25 NCNecros

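How are you determining the values to fill in? – rurp

@rurp It is the number of blank cells in each field for that city. For example, city 1 has 3 blank cells in field2, 2 blank cells in field3, and so on. – NCNecros

What do you mean by a blank cell? Does that mean NaN or something else? – rurp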

import pandas as pd 
import numpy as np 

df = pd.DataFrame({ 
    "city": [1,2,1,2,1,2], 
    "field2": [np.nan, "a", np.nan, np.nan, "b", np.nan], 
    "field3": [np.nan, np.nan, np.nan, "b", "a", "b"], 
    }) 
df

This is my sample data:

city field2 field3 
0 1 NaN NaN 
1 2 a NaN 
2 1 NaN NaN 
3 2 NaN b 
4 1 b a 
5 2 NaN b

Now the logic:

# define a function that counts the number of `nan` in a series. 
def count_nan(col): 
    return col.isnull().sum() 

# group by city and count the number of `nan` per city 
df.groupby("city").agg({"field2": count_nan, "field3": count_nan})

This is the output:

field2 field3 
city   
1 2 2 
2 2 1
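
If listing every column in the `agg` dictionary gets tedious, one possible variant is to build the null mask first and then group that mask by city. This is only a sketch on the same sample data, not part of the original answer; the names `mask` and `nan_counts` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "city": [1, 2, 1, 2, 1, 2],
    "field2": [np.nan, "a", np.nan, np.nan, "b", np.nan],
    "field3": [np.nan, np.nan, np.nan, "b", "a", "b"],
})

# Boolean mask of missing values for every column except the grouping key.
mask = df.drop(columns="city").isnull()

# Summing booleans per group counts the True values, i.e. the NaN cells.
nan_counts = mask.groupby(df["city"]).sum()
print(nan_counts)
```

This should give the same per-city counts as the `agg` version, and it picks up any additional field columns automatically.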


2015-09-25 06:16:45 cel

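I would prefer to use `col.isnull().sum()`. –

Thanks. What would the condition be if the blank cells are not null, for example `== ""`? – NCNecros

@AndyHayden, your version is more readable and even faster. – cel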
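
For the `== ""` case raised in the comments, a minimal sketch follows. It assumes the blanks are stored as empty strings rather than NaN; the frame `df_str` and the helper `count_empty` are hypothetical names used only for this illustration.

```python
import pandas as pd

# Assumed sample data: same layout as the answer, but blanks are "" not NaN.
df_str = pd.DataFrame({
    "city": [1, 2, 1, 2, 1, 2],
    "field2": ["", "a", "", "", "b", ""],
    "field3": ["", "", "", "b", "a", "b"],
})

# Count the cells in a column that are exactly the empty string.
def count_empty(col):
    return (col == "").sum()

# Group by city and count the empty strings per field.
empty_counts = df_str.groupby("city").agg(
    {"field2": count_empty, "field3": count_empty}
)
print(empty_counts)
```

If a column could contain both NaN and empty strings, one option would be to convert the empty strings to NaN first and then reuse the `isnull()` approach.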
