Python pandas groupby method not working properly


I have a text file with one record per line, and every record has a timestamp. I read the data into a DataFrame like this:

table = pd.read_table(file, sep='|', skiprows=[1], usecols = columns, parse_dates = dateColumns, date_parser = parsedate, converters=columnsFormat)

So far, so good.

The result is a DataFrame like the following example:

Name Local Code Date  Value 
A1 Here 01 01-01-1990 1.2 
A1 Here 01 01-02-1990 0.8 
A1 Here 01 01-03-1990 1.6 
... 
A2 There 02 01-01-1990 1.1 
A2 There 02 01-02-1990 0.7 
A2 There 02 01-03-1990 1.3 
... 
An Where n 12-31-2013 2.1

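The dates are in chronological order, but I have several groups, and they do not all have the same number of elements.

What I want to do is group the DataFrame by Name, Local and Code, so that those values become the index and Date and Value are the columns of each group.

Something like the following example: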

(Index)   Date  Value 
(A1 Here 01) 01-01-1990 1.2 
        01-02-1990 0.8 
        01-03-1990 1.6 
... 
(A2 There 02) 01-01-1990 1.1 
        01-02-1990 0.7 
        01-03-1990 1.3 
... 
(An Where n) 12-31-2013 2.1

However, instead of getting groups like these, when I run

table = table.groupby(['Name', 'Local', 'Code'])

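I end up with groups like the following. The first group contains all the data for the first day, the second group all the data for the second day, and so on.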

Name Local Code Date  Value 
A1 Here 01 01-01-1990 1.2 
A2 There 02 01-01-1990 1.1 
... 
A1 Here 01 01-02-1990 0.8 
A2 There 02 01-02-1990 0.7 
... 
A1 Here 01 01-03-1990 1.6 
A2 There 02 01-03-1990 1.3 
... 
An Where n 12-31-2013 2.1

Any ideas on how I can group the data the way I described?

If I use table = table.groupby(['Name', 'Local', 'Code', 'Date']) I get a group like this:

Name Local Code Date  Value 
A1 Here 01 01-01-1990 1.2 
       01-02-1990 0.8 
       01-03-1990 1.6 
... 
A2 There 02 01-01-1990 1.1 
       01-02-1990 0.7 
       01-03-1990 1.3 
... 
An Where n 12-31-2013 2.1

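This is almost what I want, but I would still have to split it into several groups by Name, Local and Code. Is that possible?

When reading the table, do parse_dates and converters change anything in the index?

I hope it is clear now. Thanks.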
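
On that last point, a quick check along the following lines suggests that parse_dates and converters only change column values and leave the default row index alone, unless index_col is also passed. This is a minimal sketch: the inline sample text, the ISO-formatted dates and the use of read_csv instead of read_table are assumptions made for illustration, not the original file.

```python
import io

import pandas as pd

# Hypothetical pipe-separated sample standing in for the real file.
raw = """Name|Local|Code|Date|Value
A1|Here|01|1990-01-01|1.2
A2|There|02|1990-01-01|1.1
A1|Here|01|1990-01-02|0.8
"""

table = pd.read_csv(
    io.StringIO(raw),
    sep="|",
    parse_dates=["Date"],      # convert the Date column to datetime64
    converters={"Code": str},  # keep the leading zero in Code
)

print(table.index)   # still the default RangeIndex
print(table.dtypes)  # Date is a datetime column, Code holds strings
```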

Source

2014-03-19 Lucas

http://stackoverflow.com/questions/17027470/pandas-groupby-and-multiindex – grasshopper

Possible duplicate. How do you want to split the table? – grasshopper

One DataFrame per 'Name Local Code', each with two columns: Date and Value. – Lucas

To answer your last question:

If you iterate over

groups = df.groupby(['Name', 'Local', 'Code'])

you should get a separate DataFrame for each group, i.e.:

for g, grp in groups:
    print(grp)
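
To get one DataFrame per 'Name Local Code' with only Date and Value, the loop above could be turned into a dictionary keyed by the group tuple. This is a minimal sketch; the sample data is made up to mirror the question, and the name frames is only illustrative.

```python
import pandas as pd

# Hypothetical sample data, interleaved by date as in the question.
df = pd.DataFrame({
    "Name":  ["A1", "A2", "A1", "A2", "A1", "A2"],
    "Local": ["Here", "There", "Here", "There", "Here", "There"],
    "Code":  ["01", "02", "01", "02", "01", "02"],
    "Date":  pd.to_datetime(["1990-01-01", "1990-01-01", "1990-01-02",
                             "1990-01-02", "1990-01-03", "1990-01-03"]),
    "Value": [1.2, 1.1, 0.8, 0.7, 1.6, 1.3],
})

# One DataFrame per (Name, Local, Code) key, keeping only Date and Value.
frames = {
    key: grp[["Date", "Value"]].reset_index(drop=True)
    for key, grp in df.groupby(["Name", "Local", "Code"])
}

for key, frame in frames.items():
    print(key)
    print(frame)
```

Each key is a tuple such as ('A1', 'Here', '01'), so frames[('A1', 'Here', '01')] would return that group's rows.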

Source

2014-03-19 17:11:24 grasshopper

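I think the OP already has this, but the problem is that these groups are not indexed by [Name, Local, Code] (i.e. at this stage it does not respect as_index), which is what the OP is asking about. –

As a workaround, you can set_index and then groupby the index: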

In [11]: df1 = df.set_index(['Name', 'Local', 'Code']) 

In [12]: g = df1.groupby(df1.index) 

In [13]: for i in g: print(i)
(('A1', 'Here', 1), 
         Date Value 
Name Local Code     
A1 Here 1  01-01-1990 1.2 
      1  01-02-1990 0.8 
      1  01-03-1990 1.6)
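
The same idea can be written against the index levels by name, which may read more clearly than grouping on df1.index. This is a minimal sketch with made-up sample data; sorting by the key columns plus Date first is an assumption added here so that each key's rows sit together in date order.

```python
import pandas as pd

# Hypothetical sample data, interleaved by date as in the question.
df = pd.DataFrame({
    "Name":  ["A1", "A2", "A1", "A2", "A1", "A2"],
    "Local": ["Here", "There", "Here", "There", "Here", "There"],
    "Code":  ["01", "02", "01", "02", "01", "02"],
    "Date":  pd.to_datetime(["1990-01-01", "1990-01-01", "1990-01-02",
                             "1990-01-02", "1990-01-03", "1990-01-03"]),
    "Value": [1.2, 1.1, 0.8, 0.7, 1.6, 1.3],
})

keys = ["Name", "Local", "Code"]

# Sort so each key's rows are contiguous, then move the keys into the index.
indexed = df.sort_values(keys + ["Date"]).set_index(keys)
print(indexed)  # Date and Value remain as columns under a three-level index

# Group on the index levels by name rather than on the index object.
grouped = indexed.groupby(level=keys)
a1 = grouped.get_group(("A1", "Here", "01"))
print(a1)
```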

Source

2014-03-19 17:41:42

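Sorry, I thought this solved the problem, but it does not. – Lucas

@Lucas Then you will have to elaborate. In any case, as I usually comment, it is generally better to use the groupby methods, such as apply. –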
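
As an illustration of that last comment, a per-group computation can often be expressed with apply or agg instead of splitting the table by hand. This is a minimal sketch; the sample data and the statistics chosen (value spread, count, mean) are assumptions for illustration, not something the OP asked for.

```python
import pandas as pd

# Hypothetical sample data, interleaved by date as in the question.
df = pd.DataFrame({
    "Name":  ["A1", "A2", "A1", "A2", "A1", "A2"],
    "Local": ["Here", "There", "Here", "There", "Here", "There"],
    "Code":  ["01", "02", "01", "02", "01", "02"],
    "Date":  pd.to_datetime(["1990-01-01", "1990-01-01", "1990-01-02",
                             "1990-01-02", "1990-01-03", "1990-01-03"]),
    "Value": [1.2, 1.1, 0.8, 0.7, 1.6, 1.3],
})

values_by_key = df.groupby(["Name", "Local", "Code"])["Value"]

# apply runs a function once per group; here, the spread of Value.
spread = values_by_key.apply(lambda s: s.max() - s.min())
print(spread)

# agg covers the common built-in summaries in one call.
summary = values_by_key.agg(["count", "mean"])
print(summary)
```

Both results are indexed by (Name, Local, Code), which is the index layout the question was after.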
