2012-10-12 91 views
1

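Pythonic way to build a data structure

I have implemented the code below, and it works without any problems. But I am not satisfied with it, because it does not look pretty. More than anything, I feel it is not the Pythonic way to do this.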

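So I thought I would ask the Stack Overflow community for suggestions. This method takes the result of a SQL query, which another method returns as a dictionary, and does pattern matching and counting on the data in that dictionary. I would like to do this in a Pythonic way and return a better data structure.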

Here is the code:

import re  # needed at module level for re.compile below

def getLaguageUserCount(self):
    bots = self.getBotUsers()
    user_template_dic = self.getEnglishTemplateUsers()
    print user_template_dic
    user_by_language = {}
    # One list and one counter per English proficiency level
    en1Users = []
    en2Users = []
    en3Users = []
    en4Users = []
    en5Users = []
    en_N_Users = []
    en1 = 0
    en2 = 0
    en3 = 0
    en4 = 0
    en5 = 0
    enN = 0
    lang_regx = re.compile(r'User_en-([1-5n])', re.M|re.I)
    for userId, langCode in user_template_dic.iteritems():
        if userId not in bots:
            print 'printing key value'
            for item in langCode:
                item = item.replace('--', '-')
                match_lang_obj = lang_regx.match(item)
                if match_lang_obj is not None:
                    # re.I lets the pattern match both 'n' and 'N',
                    # so normalise the case before comparing
                    level = match_lang_obj.group(1).upper()
                    if level == '1':
                        en1 += 1
                        en1Users.append(userId)
                    elif level == '2':
                        en2 += 1
                        en2Users.append(userId)
                    elif level == '3':
                        en3 += 1
                        en3Users.append(userId)
                    elif level == '4':
                        en4 += 1
                        en4Users.append(userId)
                    elif level == '5':
                        en5 += 1
                        en5Users.append(userId)
                    elif level == 'N':
                        enN += 1
                        en_N_Users.append(userId)
                else:
                    print "Group didn't match our regex: " + item
        else:
            print userId + ' is a bot'
    # Collect the user lists and the counts in a single dictionary
    user_by_language['en-1-users'] = en1Users
    user_by_language['en-2-users'] = en2Users
    user_by_language['en-3-users'] = en3Users
    user_by_language['en-4-users'] = en4Users
    user_by_language['en-5-users'] = en5Users
    user_by_language['en-N-users'] = en_N_Users
    user_by_language['en-1'] = en1
    user_by_language['en-2'] = en2
    user_by_language['en-3'] = en3
    user_by_language['en-4'] = en4
    user_by_language['en-5'] = en5
    user_by_language['en-n'] = enN
    return user_by_language
+1

This is a better fit for http://codereview.stackexchange.com

+0

How do I move this to the place you suggested? Just copy and paste it, or is there a way to "flag it to be moved"?

Answers

3

You can avoid all of those lists and add the data directly to the user_by_language dictionary.

I would define it as:

user_by_language = collections.defaultdict(list) 

After the regex match, just do this:

user_by_language['en-%s-users' % match_lang_obj.group(1)].append(userId) 
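To make the suggestion concrete, here is a minimal sketch of the grouping step as a standalone function. It is written for Python 3 (so `items()` rather than `iteritems()`), and the names `group_users_by_level`, `LANG_REGEX` and the sample data are made up for illustration; they are not from the original code. It also upper-cases the captured level, which is an assumption about the intended behaviour, so that `n` and `N` end up under the same key.

```python
import collections
import re

# Same pattern as in the question; re.I lets it match both 'n' and 'N'.
LANG_REGEX = re.compile(r'User_en-([1-5n])', re.I)


def group_users_by_level(user_templates, bots):
    """Map 'en-<level>-users' to the list of non-bot user ids at that level."""
    user_by_language = collections.defaultdict(list)
    for user_id, templates in user_templates.items():
        if user_id in bots:
            continue  # skip bot accounts entirely
        for template in templates:
            match = LANG_REGEX.match(template.replace('--', '-'))
            if match is not None:
                # Upper-case the captured level so 'n' and 'N' share one key.
                level = match.group(1).upper()
                user_by_language['en-%s-users' % level].append(user_id)
    return user_by_language


# Hypothetical sample input: user id -> list of template names.
sample = {
    'alice': ['User_en-1'],
    'bob': ['User_en--3', 'User_fr-2'],
    'carol': ['User_en-n'],
    'robot': ['User_en-5'],
}
grouped = group_users_by_level(sample, bots={'robot'})
```

Because `defaultdict(list)` creates the empty list on first access, there is no need to declare one list per level up front. One difference from the original: a level that no user has will simply have no key, rather than an empty list.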

Finally, you take the length of each of those entries and save it as en-1, en-2 ...
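The counting step could look roughly like the sketch below. The helper name `add_level_counts` and the sample dictionary are hypothetical; the only thing assumed from the thread is the `en-<level>-users` key format. The code is Python 3.

```python
def add_level_counts(user_by_language):
    """Return a copy with an 'en-<level>' count next to each 'en-<level>-users' list."""
    result = dict(user_by_language)
    suffix = '-users'
    for key, users in user_by_language.items():
        if key.endswith(suffix):
            # 'en-1-users' -> 'en-1', and the count is just the list length.
            result[key[:-len(suffix)]] = len(users)
    return result


# Hypothetical grouped data in the shape built by the answer's defaultdict.
grouped = {'en-1-users': ['alice', 'dave'], 'en-N-users': ['carol']}
with_counts = add_level_counts(grouped)
```

Deriving the counts from the lists means they cannot drift apart, which the separate counters in the question could in principle do. Note that this sketch keeps the level's case from the list key, so the count for native speakers lands under `en-N`, not `en-n` as in the original code.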