2013-10-07 51 views
-1

Say I have the following strings, and for each one I want to count the tokens and the characters:

"cat,dog,fish"   (first row) 
"turtle,charzard,pikachu,lame" (second row) 
"232.34,23.4,242.12%"   (third row) 

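My question is: how do I count the tokens in each row, so that the first row has 3, the second has 4, and the third has 3? After that, how do I count the characters, and then decide for each row which token has the most characters? The output should look like this: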

token count = 3, character count = 10, fish has the most characters 
token count = 4, character count = 25, charzard has the most characters 
token count = 3, character count = 17, 242.12% has the most characters 

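I want to use only simple list methods such as len(), with a comma as the delimiter. Thanks, I'm really lost, because every time I try to use strip(",") I get an error.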

+2

I think you need to look at the 'split' method – shyam
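
To illustrate the comment above, here is a minimal sketch of how `strip` and `split` differ on one of the rows from the question. The variable names are only for illustration.

```python
row = "cat,dog,fish"

# strip(",") only trims commas from the two ends of the string.
# This row has none there, so the result is the same string, not a list.
stripped = row.strip(",")

# split(",") cuts the string at every comma and returns a list of tokens.
tokens = row.split(",")

print(stripped)
print(tokens)
print(len(tokens))
```

Because `split` returns a list, `len()` on the result gives the token count directly.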

Answers

-2

To count the number of tokens in each row, try this:

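import re 
line = "232.34,23.4,242.12%" 
# [^,]+ matches each run of non-comma characters; \w+ would also break "232.34" at the dot 
print(len(re.findall(r'[^,]+', line))) 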


4

Try this. It works on both Python 2 and Python 3.

rows = [ "cat,dog,fish", "turtle,charzard,pikachu,lame", "232.34,23.4,242.12%" ] 
for row in rows: 
    tokens = row.split(',') 
    token_cnt = len(tokens) 
    char_cnt = sum([len(token) for token in tokens]) 
    longest_token = max(tokens, key=len) 
    print("token count = %d, character count = %d, %s has the most characters" %(token_cnt, char_cnt, longest_token)) 

This now uses 'max', rather than my silly choice of 'sort', to find the longest word, inspired by @inspectorG4dget's answer.

Result:

token count = 3, character count = 10, fish has the most characters 
token count = 4, character count = 25, charzard has the most characters 
token count = 3, character count = 17, 242.12% has the most characters 

(Answer edited.)
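
If `max(..., key=len)` feels beyond the "simple list methods such as len()" that the question asks for, the same numbers can be computed with a plain loop. This is only a sketch; the helper name `describe` is made up for the example.

```python
def describe(row):
    """Return (token count, character count, longest token) for one row."""
    tokens = row.split(",")
    char_count = 0
    longest = tokens[0]
    for token in tokens:
        char_count += len(token)
        # Strict > keeps the earliest token when two lengths tie.
        if len(token) > len(longest):
            longest = token
    return len(tokens), char_count, longest

rows = ["cat,dog,fish", "turtle,charzard,pikachu,lame", "232.34,23.4,242.12%"]
for row in rows:
    print("token count = %d, character count = %d, %s has the most characters"
          % describe(row))
```

Commas are not counted as characters here, which matches the counts in the expected output from the question.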

1

Given a list of strings:

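from functools import reduce  # built in on Python 2, lives in functools on Python 3 

def my_output(string_of_tokens): 
    tokens = string_of_tokens.split(",") 
    # Keep the longer token of each pair; on a tie the later token wins. 
    longest = reduce(lambda a, b: a if len(a) > len(b) else b, tokens) 
    print("token count = %s, character count = %s, %s has the most characters" % 
          (len(tokens), sum(map(len, tokens)), longest)) 

# Named 'rows' rather than 'list' to avoid shadowing the built-in. 
rows = ["cat,dog,fish", "turtle,charzard,pikachu,lame", "232.34,23.4,242.12%"] 
for row in rows: 
    my_output(row) 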
+0

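+1 for 'reduce' :) In my answer I foolishly used 'sort' to find the longest word. In Python 3, though, 'reduce' was moved to 'functools', which sometimes makes me too lazy to use it >< – starrify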
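
One difference between the `reduce` and `max` approaches shows up when several tokens tie for the longest: `max(tokens, key=len)` returns the first of them, while the `reduce` lambda in this answer returns the last. None of the rows in the question have a tie, so the sketch below uses a made-up row to show it.

```python
from functools import reduce

# A made-up row in which every token has the same length.
tokens = "cat,dog,emu".split(",")

# max returns the first token that reaches the maximum length.
first_longest = max(tokens, key=len)

# With a strict > comparison, reduce moves on to b whenever the lengths tie.
last_longest = reduce(lambda a, b: a if len(a) > len(b) else b, tokens)

# To report every tied token, filter on the maximum length instead.
all_longest = [t for t in tokens if len(t) == len(first_longest)]

print(first_longest, last_longest, all_longest)
```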

1

Assuming you have a file of comma-separated lines:

with open('path/to/input') as infile: 
    for i, line in enumerate(infile, 1): 
        # Drop the trailing newline so it is not counted as part of the last token. 
        toks = line.rstrip('\n').split(',') 
        print("row %d: token_count=%d character_count=%d '%s' has the most characters" 
              % (i, len(toks), sum(len(t) for t in toks), max(toks, key=len))) 
+0

+1 for 'max' >< – starrify
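
A point worth checking when the rows come from a file: each line keeps its trailing newline, which inflates the character count of the last token by one unless it is removed. The sketch below uses `io.StringIO` as a stand-in for a real file (an assumption made so the example runs on its own, on Python 3).

```python
import io

# io.StringIO stands in for an opened text file in this demo.
infile = io.StringIO("cat,dog,fish\nturtle,charzard,pikachu,lame\n")

raw_counts = []    # character counts with the newline left in place
clean_counts = []  # character counts after removing the newline
for line in infile:
    raw_counts.append(sum(len(t) for t in line.split(",")))
    clean_counts.append(sum(len(t) for t in line.rstrip("\n").split(",")))

print(raw_counts)
print(clean_counts)
```

Only the counts taken after `rstrip("\n")` match the 10 and 25 expected in the question.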