2014-06-05 33 views
0

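I need to split a sorted list of probabilities into groups. The first group contains probabilities in (0.5, 1), the second those in (0.25, 0.5), and so on: dynamically changing the group key

I have written some code that splits a list of powers of two less than 1 into two lists: one holding the members at or above 0.5, the other holding the members of the (original) list below 0.5.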

from itertools import groupby
from operator import itemgetter
import doctest
N = 10

# Powers of two below 1: 0.5, 0.25, ..., 2**-10
twos = [2**(-(i+1)) for i in range(0,N)]

def split_by_prob(items, cutoff):
    """
    (list of double) -> list of (lists) of double
    Splits a set into subsets based on probability
    >>> split_by_prob(twos, 0.5)
    [[0.5], [0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
    """
    groups = []
    # groupby starts a new group whenever the key (x < cutoff) changes
    for k, g in groupby(enumerate(items), lambda pair: pair[1] < cutoff):
        groups.append(list(map(itemgetter(1), g)))
    return groups

Calling this code from the command line does exactly that:

>>> g = split_by_prob(twos, 0.5)
>>> g 
[[0.5], [0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]] 

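My question: how can I change the cutoff on each iteration? That is, if I pass the function a list of cutoffs (e.g. cutoffs = [0.5, 0.125, 0.0625]), I would like to get back a list of lists in which each member of the original list is grouped into the correct category. In this case the returned groups would be something like [[0.5], [0.25, 0.125], [0.0625], [0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]

Answers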

1

If I understand correctly, you can loop over the list of cutoffs and use x < i as the grouping key for each i in cutoffs.

cutoffs = [0.5, 0.125, 0.0625]
def split_by_prob(items, cutoffs):
    """
    (list of double) -> list of (lists) of double
    Splits a set into subsets based on probability
    # >>> split_by_prob(twos, 0.5)
    [[0.5], [0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
    """
    groups = []

    # Regroup the whole list once per cutoff
    for i in cutoffs:
        for k, g in groupby(enumerate(items), lambda pair: pair[1] < i):
            groups.append(list(map(itemgetter(1), g)))
    return groups

print(split_by_prob(twos, cutoffs))


[[0.5], [0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625], [0.5, 0.25, 0.125], [0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625], [0.5, 0.25, 0.125, 0.0625], [0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
+0

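Wow, I spent over an hour thinking about this! It does not do quite what I need, since there are some extra groups (for example, once [0.5] has been put in a group of its own, I don't want it to show up in any other group), but I'm sure I can figure it out.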
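The extra groups arise because the loop above regroups the whole list once per cutoff. One possible way around that, sketched here as an illustration rather than as either poster's method, is to give groupby a single key that changes with the value itself: the number of cutoffs the value falls below. The names split_by_cutoffs and bucket are made up for this sketch, and it assumes the items are sorted in descending order, as twos is.

```python
from itertools import groupby

def split_by_cutoffs(items, cutoffs):
    """Group a descending-sorted list by how many cutoffs each value is below."""
    def bucket(x):
        # Hypothetical helper: 0 for x >= every cutoff, 1 once x drops
        # below the largest cutoff, and so on.
        return sum(1 for c in cutoffs if x < c)
    # Consecutive items with the same bucket number end up in the same group.
    return [list(g) for _, g in groupby(items, key=bucket)]

twos = [2 ** -(i + 1) for i in range(10)]
print(split_by_cutoffs(twos, [0.5, 0.125, 0.0625]))
```

With the cutoffs from the question this prints the four groups asked for, with 0.5 appearing only once. Note that a range containing no items simply produces no group, which may or may not be what is wanted.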

0

I have figured out what I needed to do; the complete code is below. I'm not sure how efficient or Pythonic it is, though:

from itertools import groupby
from operator import itemgetter
import doctest
N = 10

twos = [2**(-(i+1)) for i in range(0,N)]
cutoffs = [0.5, 0.125, 0.03125]

def split_by_prob(items, cutoff, groups):
    """
    (list of double) -> list of (lists) of double
    Splits a set into subsets based on probability
    >>> split_by_prob(twos, 0.5, [])
    [[0.5], [0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
    """
    for k, g in groupby(enumerate(items), lambda pair: pair[1] < cutoff):
        groups.append(list(map(itemgetter(1), g)))
    return groups

def split_into_groups(items, cutoffs):
    """
    (list of double) -> list of (lists) of double
    Splits a set into subsets based on probability
    >>> split_into_groups(twos, cutoffs)
    [[0.5], [0.25, 0.125], [0.0625, 0.03125], [0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
    """
    groups = items
    final = []
    for i in cutoffs:
        # Split the remaining items at this cutoff
        groups = split_by_prob(groups, i, [])
        # Keep the part at or above the cutoff ...
        final.append(groups[0])
        # ... and carry the part below it on to the next cutoff
        groups = groups.pop()
    final.append(groups)
    return final
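On the efficiency question: the code above seems to rely on every cutoff having at least one item at or above it. If a split yields only one group, groups[0] and groups.pop() are the same list, so those items are recorded under the wrong cutoff and then carried forward as well. A minimal sketch of an alternative that walks the list once with a moving index is shown below; split_sorted_by_cutoffs is a made-up name, and the sketch assumes both items and cutoffs are sorted in descending order.

```python
def split_sorted_by_cutoffs(items, cutoffs):
    """Slice a descending-sorted list at each cutoff (cutoffs also descending)."""
    groups = []
    start = 0
    for cutoff in cutoffs:
        end = start
        # Advance while the items are still at or above this cutoff.
        while end < len(items) and items[end] >= cutoff:
            end += 1
        groups.append(items[start:end])  # may be empty if nothing qualifies
        start = end
    # Whatever is left lies below the smallest cutoff.
    groups.append(items[start:])
    return groups

twos = [2 ** -(i + 1) for i in range(10)]
print(split_sorted_by_cutoffs(twos, [0.5, 0.125, 0.03125]))
```

For twos and these cutoffs the result matches the doctest of split_into_groups. Unlike the groupby version, a range with no members shows up as an empty list, so the output always has len(cutoffs) + 1 groups.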