Counting the characters in a string, excluding special characters

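I need to count the characters in a given file. The problem is that I am not splitting the file correctly. If my input file contains "The! dog-ate ##### the,cat", I don't want the special characters in the output. o/p: t: 4 h: 2 e: 3 !: 1 d: 1 o: 1 g: 1 -: 1 #: 5 .... In addition, I need to remove the "-" symbol and make sure the words it separates are not joined.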

from collections import Counter
import sys

filename = sys.argv[1]
reg = '[^a-zA-Z+]'
f = open(filename, 'r')
x = f.read().strip()
lines = []
for line in x:
    line = line.strip().upper()
    if line:
        lines.append(line)
print(Counter(lines))

Can someone help me?

Source

2017-09-23 sangeetha ramesh

If your question has been answered, please accept the most helpful answer (https://stackoverflow.com/help/someone-answers).

Just delete the entries you don't want:

c = Counter(lines) 
del c['#'] 
del c['-'] 
del c[','] 
print(c)
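
When the unwanted characters are not known in advance, the same idea can be generalized by keeping only the keys that are ASCII letters. This is a minimal sketch, assuming the counter was built one character at a time as in the question; the helper name `drop_non_letters` is hypothetical and not part of the original answer.

```python
import string
from collections import Counter

# Set of allowed keys (assumption: only ASCII letters should survive).
LETTERS = set(string.ascii_letters)

def drop_non_letters(counter):
    """Return a new Counter that keeps only ASCII-letter keys (hypothetical helper)."""
    return Counter({ch: n for ch, n in counter.items() if ch in LETTERS})

c = Counter("THE! DOG-ATE ##### THE,CAT")
print(drop_non_letters(c))
```

Building a new counter avoids listing every symbol by hand, at the cost of one extra pass over the keys.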

Source

2017-09-23 04:48:10

Use re.sub to remove the special characters.

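import re
from collections import Counter

# filename is the path taken from sys.argv[1] in the question
with open(filename) as f:
    # drop every character that is not an ASCII letter
    content = re.sub('[^a-zA-Z]', '', f.read())
counts = Counter(content)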

Demo:

In [1]: re.sub('[^a-zA-Z]', '', "The! dog-ate #####the,cat") 
Out[1]: 'Thedogatethecat' 

In [2]: Counter(_) 
Out[2]: 
Counter({'T': 1, 
     'a': 2, 
     'c': 1, 
     'd': 1, 
     'e': 3, 
     'g': 1, 
     'h': 2, 
     'o': 1, 
     't': 3})

Note that if you want uppercase and lowercase letters counted together, you can convert content to lowercase:

counts = Counter(content.lower())
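
The question also asks that words separated by "-" not end up joined. One possible way to get that, sketched here under the assumption that any run of non-letters should act as a word break, is to substitute a space rather than the empty string and to leave the spaces out only when counting:

```python
import re
from collections import Counter

text = "The! dog-ate #####the,cat"

# Replace each run of non-letters with a single space so words stay separate.
cleaned = re.sub(r'[^a-zA-Z]+', ' ', text).strip()
words = cleaned.split()

# Count letters only (spaces excluded), treating upper and lower case alike.
counts = Counter(cleaned.replace(' ', '').lower())

print(words)
print(counts)
```

Here `words` keeps "dog" and "ate" apart, while `counts` holds the same letter totals as the lowercase variant above.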

Source

2017-09-23 04:49:33

foo.txt

asdas 

[email protected]#[email protected] 


asdljh 


12j3l1k23j

Source:

https://docs.python.org/3/library/string.html#string.ascii_letters

import string 
from collections import Counter 

with open('foo.txt') as f: 
    text = f.read() 

filtered_text = [char for char in text if char in string.ascii_letters]
counted = Counter(filtered_text) 
print(counted.most_common())

Output

[('a', 3), ('j', 3), ('s', 3), ('d', 2), ('l', 2), ('h', 1), ('k', 1)]

Source

2017-09-23 04:57:14

Converting 'ascii_letters' to a set beforehand would improve efficiency. Lookup in a string is linear time.
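
The suggestion in this comment could look roughly like the sketch below. The names `LETTERS` and `count_letters` are hypothetical; the only change from the answer is that membership is tested against a set built once, which takes constant time on average instead of scanning the 52-character string.

```python
import string
from collections import Counter

# Built once up front; set membership is O(1) on average.
LETTERS = frozenset(string.ascii_letters)

def count_letters(text):
    """Count only the ASCII letters in text (hypothetical helper)."""
    return Counter(ch for ch in text if ch in LETTERS)

print(count_letters("asdas\n12j3l1k23j").most_common())
```

For a 52-character alphabet the gain is small, but it becomes noticeable on large files.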
