Retrieving and displaying UTF-8 from a .CSV in Python

Basically, I have been playing with this all day. I have a data file named test.csv, which is encoded as UTF-8:
"Nguyễn", 0.500
"Trần", 0.250
"Lê", 0.250
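Now I am trying to read it with this code, and it displays all funny, like this: Trần
I have gone through all the Python documentation for 2.6, which is what I am using, and I cannot get the wrapper to work, nor any of the ideas from the internet, which I assume are all quite correct and just not being applied properly by yours truly. On the plus side, I have learned that not all fonts will display those characters correctly, something I had never thought about before, and I have learned a lot about Unicode, so it was certainly not wasted time.
If anyone could point out where I went wrong, I would be most grateful.
Here is the code, updated as requested below, which returns this error: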
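For what it is worth, the garbled text above is consistent with UTF-8 bytes being decoded with a single-byte Windows code page. Below is a minimal sketch, written for Python 3 rather than the 2.6 used in the question. It assumes the code page is cp1252 and that the name is stored in a partly decomposed form; both are guesses based on the garbled output, not something the question states.

```python
# Assumption: the name is stored as "a with circumflex" (U+00E2) followed by a
# combining grave accent (U+0300), which is what the garbled output suggests.
original = "Tr\u00e2\u0300n"

# The bytes as they would sit in a UTF-8 encoded file.
utf8_bytes = original.encode("utf-8")

# Decoding those bytes with the wrong codec (assumed here to be cp1252)
# turns each byte into its own character, which gives the garbled text.
garbled = utf8_bytes.decode("cp1252")

# Reversing the mistake recovers the original string.
recovered = garbled.encode("cp1252").decode("utf-8")

print(recovered == original)
```

If this guess is right, the file itself is fine and only the decoding step at read or display time is off.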
Traceback (most recent call last):
  File "surname_generator.py", line 39, in <module>
    probfamilynames = [(familyname,float(prob)) for familyname,prob in unicode_csv_reader(open(familynamelist))]
  File "surname_generator.py", line 27, in unicode_csv_reader
    for row in csv_reader:
  File "surname_generator.py", line 33, in utf_8_encoder
    yield line.encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)
from random import random
import csv

class ChooseFamilyName(object):
    def __init__(self, probs):
        self._total_prob = 0.
        self._familyname_levels = []
        for familyname, prob in probs:
            self._total_prob += prob
            self._familyname_levels.append((self._total_prob, familyname))
        return

    def pickfamilyname(self):
        pickfamilyname = self._total_prob * random()
        for level, familyname in self._familyname_levels:
            if level >= pickfamilyname:
                return familyname
        print "pickfamilyname error"
        return

def unicode_csv_reader(unicode_csv_data, dialect=csv.excel, **kwargs):
    csv_reader = csv.reader(utf_8_encoder(unicode_csv_data),
                            dialect=dialect, **kwargs)
    for row in csv_reader:
        # decode UTF-8 back to Unicode, cell by cell:
        yield [unicode(cell, 'utf-8') for cell in row]

def utf_8_encoder(unicode_csv_data):
    for line in unicode_csv_data:
        yield line.encode('utf-8')

familynamelist = 'familyname_vietnam.csv'
a = 0
while a < 10:
    a = a + 1
    probfamilynames = [(familyname,float(prob)) for familyname,prob in unicode_csv_reader(open(familynamelist))]
    familynamepicker = ChooseFamilyName(probfamilynames)
    print(familynamepicker.pickfamilyname())
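A note on the traceback: in Python 2, `open()` yields byte strings, so `line.encode('utf-8')` first has to decode those bytes implicitly with the ASCII codec, and that implicit step is what raises `UnicodeDecodeError`. Byte 0xef is also the first byte of the UTF-8 byte order mark (EF BB BF), so the file may well start with one. Below is a minimal sketch of one way around both issues. It is written for Python 3 (an assumption, since the question uses 2.6), where the `csv` module works on text directly. `read_family_names` and the sample rows are hypothetical and made up for illustration.

```python
import codecs
import csv
import os
import tempfile


def read_family_names(path):
    """Return a list of (family_name, probability) tuples from a UTF-8 CSV."""
    rows = []
    # 'utf-8-sig' decodes UTF-8 and drops a leading byte order mark if present.
    # newline='' is what the csv documentation recommends for file objects.
    with open(path, encoding="utf-8-sig", newline="") as handle:
        for row in csv.reader(handle):
            if not row:  # skip blank lines
                continue
            name, prob = row
            rows.append((name, float(prob)))
    return rows


# Illustrative sample data, written with a BOM to mimic the suspected input.
sample = '"Nguy\u1ec5n",0.500\n"Tr\u1ea7n",0.250\n"L\u00ea",0.250\n'
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(codecs.BOM_UTF8 + sample.encode("utf-8"))
    demo_path = tmp.name

try:
    family_names = read_family_names(demo_path)
finally:
    os.remove(demo_path)

print(len(family_names), "rows read")
```

On Python 2.6 itself, the analogous change would be to pass the opened file straight to `csv.reader` and drop `utf_8_encoder`, since the lines coming out of the file are already UTF-8 bytes.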
This works great. However, I realized I could improve what I did and make it cleaner. – MDA1973 2009-10-14 12:13:34
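On making it cleaner: the cumulative-probability lookup in `ChooseFamilyName` can also be written with the standard library's `bisect` module. This is only a sketch, again assuming Python 3. `FamilyNamePicker` and the optional `u` argument are hypothetical names introduced here so the choice can be made deterministic when testing, and the sample names are illustrative.

```python
import bisect
import itertools
import random


class FamilyNamePicker(object):
    """Pick a name with probability proportional to its weight."""

    def __init__(self, weighted_names):
        pairs = list(weighted_names)
        if not pairs:
            raise ValueError("at least one (name, weight) pair is required")
        self._names = [name for name, _ in pairs]
        # Running totals, e.g. weights 0.5, 0.25, 0.25 become 0.5, 0.75, 1.0.
        self._cumulative = list(itertools.accumulate(w for _, w in pairs))

    def pick(self, u=None):
        # u is a number in [0, 1); a fresh random one is drawn when omitted.
        if u is None:
            u = random.random()
        target = u * self._cumulative[-1]
        # Index of the first running total strictly greater than target.
        index = bisect.bisect_right(self._cumulative, target)
        # Guard against floating-point rounding pushing index past the end.
        return self._names[min(index, len(self._names) - 1)]


# Illustrative sample rows with the weights from the question.
picker = FamilyNamePicker(
    [("Nguy\u1ec5n", 0.5), ("Tr\u1ea7n", 0.25), ("L\u00ea", 0.25)]
)
print(picker.pick() in ("Nguy\u1ec5n", "Tr\u1ea7n", "L\u00ea"))
```

Building the picker once, outside the loop, also avoids re-reading the CSV on every iteration, which the posted loop does ten times.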