2014-02-13 25 views
import sets  # Python 2 only; needed for sets.Set below

class Keeper(object):

    def __init__(self, keep):
        self.keep = sets.Set(map(ord, keep))

    def __getitem__(self, n):
        if n not in self.keep:
            return None
        return unichr(n)

    def __call__(self, s):
        return unicode(s).translate(self)

makefilter = Keeper

if __name__ == '__main__':
    just_vowels = makefilter('aeiouy')

    print just_vowels(u'four score and seven years ago')

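It prints "ouoeaeeyeaao". How does 'translate(self)' work?

I know that 'translate' is supposed to take a table argument created by string.maketrans().

But why is 'self' passed to translate?

And how does that end up calling the __getitem__ method?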

Put 'print n' in the '__getitem__' method and see what happens. – Matthias

Answer

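Before we get to your snippet, let me explain when __getitem__ is called.

This is what the documentation says about __getitem__:

__getitem__: object.__getitem__(self, key) is called to implement evaluation of self[key].

For sequence types, the accepted keys should be integers and slice objects. Note that the special interpretation of negative indexes (if the class wishes to emulate a sequence type) is up to the __getitem__() method. If key is of an inappropriate type, TypeError may be raised; if of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised. For mapping types, if key is missing (not in the container), KeyError should be raised.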
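To make those rules concrete, here is a minimal sketch of a sequence-like class that follows them. The Squares class is a made-up example, not something from your code; it handles integer keys only (no slices) and does its own interpretation of negative indexes.

```python
class Squares(object):
    """Read-only sequence of the first `size` square numbers (hypothetical example)."""

    def __init__(self, size):
        self.size = size

    def __getitem__(self, key):
        # Inappropriate key type: raise TypeError.
        if not isinstance(key, int):
            raise TypeError('index must be an integer')
        # Negative indexes are interpreted by __getitem__ itself.
        if key < 0:
            key += self.size
        # Outside the set of valid indexes: raise IndexError.
        if not 0 <= key < self.size:
            raise IndexError('index out of range')
        return key * key


squares = Squares(5)
print(squares[3])     # 9
print(squares[-1])    # 16
print(list(squares))  # [0, 1, 4, 9, 16]
```

The last line works because Python falls back to calling __getitem__ with 0, 1, 2, ... until IndexError is raised when a class defines no __iter__.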

So, let's look at the following snippet:

class Keeper(object):
    def __init__(self, keep):
        self.keep = set(map(ord, keep))

if __name__ == '__main__':
    just_vowels = Keeper('aeiouy')
    print just_vowels[1]

Output: an error, because the class does not define a __getitem__ method and therefore does not support indexing.

Traceback (most recent call last): 
    File "tran.py", line 15, in <module> 
    print just_vowels[1] 
TypeError: 'Keeper' object does not support indexing 

Now let's change the snippet and add __getitem__ so that the object can be indexed:

class Keeper(object):
    def __init__(self, keep):
        self.keep = set(map(ord, keep))

    def __getitem__(self, n):
        if n in self.keep:
            return unichr(n)
        else:
            return 'Not Found in %s' % self.keep

if __name__ == '__main__':
    just_vowels = Keeper('aeiouy')
    for i in range(97, 103):
        print just_vowels[i]

Output:

a 
Not Found in set([97, 101, 105, 111, 117, 121]) 
Not Found in set([97, 101, 105, 111, 117, 121]) 
Not Found in set([97, 101, 105, 111, 117, 121]) 
e 
Not Found in set([97, 101, 105, 111, 117, 121]) 

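So, finally, back to your snippet. When you pass your own object as the translation table in place of a dictionary, translate indexes it with the ordinal of each character in the string, and that indexing calls your __getitem__ method. The method checks whether the ordinal is one of [97, 101, 105, 111, 117, 121]. If it is, the character itself is returned; if it is not in that set, the method returns None, which tells translate to delete that character from your unicode string.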
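If you want to watch this happen, here is a sketch of a tracing variant of your class. TracingKeeper and its seen list are names I made up for illustration, and the try/except shim is only there so the snippet also runs on Python 3, where chr takes the place of unichr.

```python
try:
    unichr            # exists on Python 2
except NameError:
    unichr = chr      # Python 3: chr already returns a Unicode character


class TracingKeeper(object):
    """Like Keeper, but records every ordinal that translate() looks up."""

    def __init__(self, keep):
        self.keep = set(map(ord, keep))
        self.seen = []

    def __getitem__(self, n):
        self.seen.append(n)      # translate() passes ord(character) here
        if n not in self.keep:
            return None          # None deletes the character
        return unichr(n)         # keep the character unchanged


table = TracingKeeper('aeiouy')
result = u'four'.translate(table)
print(result)        # ou
print(table.seen)    # the ordinals translate() asked about
```

The recorded values are integers such as 102 for 'f', never the characters themselves. How often a repeated character is looked up is an implementation detail, so I would not rely on the length of seen.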

Here is which built-in Python types define __getitem__ and therefore support indexing:

>>> '__getitem__' in dir(dict) 
True 
>>> '__getitem__' in dir(list) 
True 
>>> '__getitem__' in dir(set) 
False 
>>> '__getitem__' in dir(tuple) 
True 
>>> '__getitem__' in dir(str) 
True 
>>> 

An example of trying to index a set:

>>> s 
set([1, 2]) 
>>> s[0] 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
TypeError: 'set' object does not support indexing 
>>> 

Now let me explain the unicode translate part. I hope you already know this, but it is here for those who don't.

This is what unicode.translate says:

>>> help(unicode.translate) 
Help on method_descriptor: 

translate(...) 
    S.translate(table) -> unicode 
    Return a copy of the string S, where all characters have been mapped 
    through the given translation table, which must be a mapping of 
    Unicode ordinals to Unicode ordinals, Unicode strings or None. 
    Unmapped characters are left untouched. Characters mapped to None 
    are deleted. 
>>> 

It takes a table, which can be a dictionary (that is, a mapping) from Unicode ordinals to Unicode ordinals, Unicode strings, or None.
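As a quick sketch of the three kinds of values such a table can hold, here is a small dictionary of my own (the sample string and replacements are arbitrary). It runs on Python 2 and, since str.translate behaves the same way there, on Python 3 as well.

```python
table = {
    ord(u'h'): ord(u'J'),   # ordinal -> ordinal: replace with one character
    ord(u'o'): u'0o',       # ordinal -> string: replace with a whole string
    ord(u'l'): None,        # ordinal -> None: delete the character
}

# Characters with no entry in the table ('e', ' ', 'w', 'r', 'd') are left untouched.
translated = u'hello world'.translate(table)
print(translated)   # Je0o w0ord
```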

Let's take an example: removing punctuation from a Unicode string:

>>> uni_string = unicode('String with [email protected]!."##') 
>>> uni_string 
u'String with [email protected]!."##' 
>>> 

Let's create a dictionary that maps each punctuation character to None:

>>> punc = '!"#$.' 
>>> punc_map = {ord(x):None for x in punc } 
>>> punc_map 
{33: None, 34: None, 35: None, 36: None, 46: None} 
>>> 

Let's use this punc_map to translate the Unicode string and remove the punctuation:

>>> uni_string 
u'String with [email protected]!."##' 
>>> uni_string.translate(punc_map) 
u'String with [email protected]' 
>>>
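
The same idea can be wrapped in a small helper. This is only a sketch: strip_chars and the sample string are my own names and data, and on Python 2 the text argument has to be a unicode string, because the plain str.translate there takes a different kind of table.

```python
def strip_chars(text, chars):
    """Return text with every character listed in chars removed."""
    # dict.fromkeys gives every key the value None, i.e. "delete this character".
    table = dict.fromkeys(map(ord, chars))
    return text.translate(table)


sample = u'Hello, "world"! #tag $5.00'
stripped = strip_chars(sample, u'!"#$.')
print(stripped)   # Hello, world tag 500
```

Note that the comma survives because it is not in the list of characters to strip.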