Python groupby itertools list methods

I have a list like this: # [year, day, value1, value2, value3]

[[2014, 1, 10, 20, 30], 
[2014, 1, 3, 7, 4], 
[2014, 2, 14, 43, 5], 
[2014, 2, 33, 1, 6], 
... 
[2013, 1, 34, 54, 3], 
[2013, 2, 23, 33, 2], 
...]

and I need to group it by year and day to get something like:

[[2014, 1, sum(all values1 with day=1), sum(all values2 with day=1), avg(all values3 with day=1)], 
[2014, 2, sum(all values1 with day=2), sum(all values2 with day=2), avg(all values3 with day=2)], 
.... 
[2013, 1, sum(all values1 with day=1), sum(all values2 with day=1), avg(all values3 with day=1)], 
[2013, 2, sum(all values1 with day=2), sum(all values2 with day=2), avg(all values3 with day=2)], 
....]

How can I do this with itertools? I can't use pandas or numpy because my system doesn't support them. Thank you very much for your help.

Source

2016-05-04 Madmartigan

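It's not clear what you want to group by. Do you want to group by year? By day? By year and day? Something else? – mgilson

Try to provide a small piece of usable data. The code above isn't valid Python, which doesn't help much when someone tries to experiment with your problem. –

Is your data already sorted by "(year, day)"? –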

import itertools 
import operator 

key = operator.itemgetter(0,1) 
my_list.sort(key=key) 
for (year, day), records in itertools.groupby(my_list, key): 
    print("Records on", year, day, ":") 
    for record in records: print(record)
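The loop above only prints each group. As a minimal sketch of how the same grouping could produce the sums and average the question asks for, the block below uses the sample rows from the question; the names `summary`, `rows`, `total1`, `total2` and `mean3` are illustrative and not part of the original answer.

```python
import itertools
import operator

# Sample rows from the question: [year, day, value1, value2, value3]
my_list = [
    [2014, 1, 10, 20, 30],
    [2014, 1, 3, 7, 4],
    [2014, 2, 14, 43, 5],
    [2014, 2, 33, 1, 6],
    [2013, 1, 34, 54, 3],
    [2013, 2, 23, 33, 2],
]

key = operator.itemgetter(0, 1)  # group on (year, day)

summary = []
for (year, day), records in itertools.groupby(sorted(my_list, key=key), key):
    rows = list(records)  # a group iterator can only be consumed once
    total1 = sum(r[2] for r in rows)
    total2 = sum(r[3] for r in rows)
    mean3 = sum(r[4] for r in rows) / len(rows)
    summary.append([year, day, total1, total2, mean3])

for row in summary:
    print(row)
```

Because the rows are sorted first, the 2013 groups come out before the 2014 ones.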

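itertools.groupby doesn't work like SQL's GROUP BY. It groups consecutive elements. This means that if you have an unsorted list of elements, you may get several groups for the same key. So, let's say you want to group a list of integers by their parity (even vs. odd); then you might do this: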

L = [1,2,3,4,5,7,8] # notice that there's no 6 in the list 
itertools.groupby(L, lambda i:i%2)

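Now, if you come from the SQL world, you might expect this to give you two groups: one for the even numbers and one for the odd numbers. While that would make sense, it isn't how Python does it. It considers each element in turn and checks whether it belongs to the same group as the previous element. If so, the element is added to that group; otherwise, a new group is started for it.

So, with the list above, we get: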

key: 1 
elements: [1] 

key: 0 
elements: [2] 

key: 1 
elements: [3] 

key: 0 
elements: [4] 

key: 1 
elements: [5,7] # see what happened here? 

key: 0 
elements: [8]
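To reproduce this yourself, here is a small sketch that materializes each group into a list so it can be printed. The name `groups` is only for illustration.

```python
import itertools

L = [1, 2, 3, 4, 5, 7, 8]  # no 6, so 5 and 7 are neighbours

# Each group is a lazy iterator, so turn it into a list before moving on.
groups = [(key, list(group)) for key, group in itertools.groupby(L, lambda i: i % 2)]

for key, elements in groups:
    print("key:", key, "elements:", elements)
```

The key 1 shows up four times and the key 0 three times, because a new group starts whenever the key changes from one element to the next.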

So, if you want grouping like in SQL, you have to sort the list beforehand by the key (criterion) you want to group on:

L = [1,2,3,4,5,7,8] # notice that there's no 6 in the list 
L.sort(key=lambda i:i%2) # now L looks like this: [2,4,8,1,3,5,7] - the evens and the odds stick together 
itertools.groupby(L, lambda i:i%2) # this gives two groups containing all the elements that belong to each group
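As a quick check of the sort-then-group idea, the sketch below collects the groups into a dictionary. The helper `parity` and the name `grouped` are illustrative; using one named function for both the sort and the grouping is just a way to make sure the two keys agree.

```python
import itertools

def parity(i):
    """Key function shared by the sort and the grouping."""
    return i % 2

L = [1, 2, 3, 4, 5, 7, 8]

# sorted() is stable, so elements keep their relative order within each parity.
grouped = {key: list(group)
           for key, group in itertools.groupby(sorted(L, key=parity), parity)}

print(grouped)
```

After sorting, each key appears exactly once, which is the SQL-like behaviour.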

Source

2016-05-04 19:19:13 inspectorG4dget

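Could you add some context? – ppperry

@ppperry: Check it out – inspectorG4dget

I tried to come up with a short answer and didn't succeed, but I did manage to involve a lot of Python's built-in modules: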

import itertools 
import operator 
import functools

I would use functools.reduce to do the sums, but it needs a custom function:

def sum_sum_sum_counter(res, array): 
    # Unpack the values of the array 
    year, day, val1, val2, val3 = array 
    res[0] += val1 
    res[1] += val2 
    res[2] += val3 
    res[3] += 1 # counter 
    return res

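This function keeps a counter because computing the mean from a sum and a count is more intuitive than maintaining a running mean.

Now the interesting part: I group by the first two elements (assuming the list is already sorted by them; otherwise one needs something like lst = sorted(lst, key=operator.itemgetter(0,1)) beforehand):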

result = [] 
for i, values in itertools.groupby(lst, operator.itemgetter(0,1)): 
    # Now let's use the reduce function with a start list containing zeros 
    calc = functools.reduce(sum_sum_sum_counter, values, [0, 0, 0, 0]) 
    # Append year, day and the results. 
    result.append([i[0], i[1], calc[0], calc[1], calc[2]/calc[3]])

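calc[2]/calc[3] is the average of value3. Remember that the last element of the list built up in the reduce function is a counter, and the sum divided by the count is the average.

This gives me the result: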

[[2014, 1, 13, 27, 17.0], 
[2014, 2, 47, 44, 5.5], 
[2013, 1, 34, 54, 3.0], 
[2013, 2, 23, 33, 2.0]]

That is using just the values you gave.
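If sorting the input is not desirable, a different technique that avoids itertools.groupby altogether is to accumulate into a dictionary keyed on (year, day). This is only a sketch of that alternative and not the approach used above; `totals` and `acc` are illustrative names, and the output order relies on dicts preserving insertion order (Python 3.7+).

```python
from collections import defaultdict

# Sample rows from the question: [year, day, value1, value2, value3]
lst = [
    [2014, 1, 10, 20, 30],
    [2014, 1, 3, 7, 4],
    [2014, 2, 14, 43, 5],
    [2014, 2, 33, 1, 6],
    [2013, 1, 34, 54, 3],
    [2013, 2, 23, 33, 2],
]

# Per key: [sum of value1, sum of value2, sum of value3, row count]
totals = defaultdict(lambda: [0, 0, 0, 0])
for year, day, val1, val2, val3 in lst:
    acc = totals[(year, day)]
    acc[0] += val1
    acc[1] += val2
    acc[2] += val3
    acc[3] += 1

result = [[year, day, s1, s2, s3 / n]
          for (year, day), (s1, s2, s3, n) in totals.items()]

print(result)
```

Rows with the same key do not have to be adjacent here, so no sort is needed; groups appear in the order their key is first seen.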

Source

2016-05-04 21:18:47 MSeifert
