2016-08-15 71 views

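Split a pandas DataFrame by a column variable

I have a DataFrame that I want to split by the values of a column, as in the example below: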

gender height weight 
male  42.8 157.5 
male  41.3 165.6 
female 48.4 144.2 

The result I want is:

df_male

gender height weight 
male  42.8 157.5 
male  41.3 165.6 

df_female

gender height weight 
female 48.4 144.2 

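The catch is that I want to be able to do this with variables that have anywhere from 5 to 25 categories.

My thinking is that there should be a way to iterate over the original DataFrame and spit out multiple DataFrames, but I am open to any solution.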


Selecting rows by a (logical) condition on them is called *filtering* – smci

Answers


The following produces a list containing one DataFrame for each value in the gender column:

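import io
import pandas as pd

# Example data as whitespace-separated text
data = io.StringIO('''\
gender height weight
male  42.8 157.5
male  41.3 165.6
female 48.4 144.2
''')
# sep=r'\s+' splits on runs of whitespace (delim_whitespace=True is deprecated in recent pandas)
df = pd.read_csv(data, sep=r'\s+')

# Iterating over a groupby yields (key, sub-frame) pairs; keep only the sub-frames
dfs = [rows for _, rows in df.groupby('gender')]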

dfs is a list of length 2, with the following elements:

print(dfs[0])

#    gender  height  weight
# 2  female    48.4   144.2

print(dfs[1])

#    gender  height  weight
# 0    male    42.8   157.5
# 1    male    41.3   165.6
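
As an aside, iterating over a groupby object yields (key, sub-frame) pairs, and with the default sort=True the keys come out in sorted order, which is why female precedes male in the list above. A minimal sketch that rebuilds the same example data inline; the names keys and sizes are only illustrative:

```python
import pandas as pd

# Same example data as above, built directly instead of parsed from text
df = pd.DataFrame({
    'gender': ['male', 'male', 'female'],
    'height': [42.8, 41.3, 48.4],
    'weight': [157.5, 165.6, 144.2],
})

keys = []   # group keys in iteration order
sizes = []  # number of rows in each sub-frame
for key, rows in df.groupby('gender'):
    keys.append(key)
    sizes.append(len(rows))

print(keys)
print(sizes)
```

Passing sort=False to groupby should instead keep the groups in order of first appearance, if that ordering matters for the list.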

It may be better to create a dictionary whose keys are the distinct values in the gender column and whose values are the DataFrames:

dfs = {gender: rows for gender, rows in df.groupby('gender')}

This results in the following dictionary:

{'female':  gender height weight 
      2 female 48.4 144.2, 
'male':  gender height weight 
      0 male 42.8 157.5 
      1 male 41.3 165.6}
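
With 5 to 25 categories, the dictionary has the advantage that each sub-frame can be looked up by its category value rather than by list position. A small sketch on the same example data, rebuilt inline; df_male and df_female are just the names used in the question, and get_group is an alternative when only a single group is needed:

```python
import pandas as pd

# Same example data as above, built directly instead of parsed from text
df = pd.DataFrame({
    'gender': ['male', 'male', 'female'],
    'height': [42.8, 41.3, 48.4],
    'weight': [157.5, 165.6, 144.2],
})

grouped = df.groupby('gender')

# One entry per distinct value, however many categories there are
dfs = {gender: rows for gender, rows in grouped}

# Look up a sub-frame by its category value
df_male = dfs['male']

# Alternative: pull a single group without building the whole dictionary
df_female = grouped.get_group('female')

print(df_male)
print(df_female)
```

Each sub-frame keeps the row labels of the original DataFrame; calling reset_index(drop=True) on a sub-frame renumbers it from zero if that is preferred.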