2017-02-02 53 views
1

Python 3: gzip.open() and modes (https://docs.python.org/3/library/gzip.html)

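I am considering using gzip.open(), and I am a little confused by the mode argument:

The mode argument can be any of 'r', 'rb', 'a', 'ab', 'w', 'wb', 'x' or 'xb' for binary mode, or 'rt', 'at', 'wt', or 'xt' for text mode. The default is 'rb'.

So what is the difference between 'w' and 'wb'?

The documentation states that both are binary modes.

So does that mean there is no difference between 'w' and 'wb'?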

+0

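Minor question: shouldn't the python tag be applied here in addition to python-3.x? I'm asking the experts: the question does mention Python 3, but it is still Python, and some people might miss it... I think I have seen a similar case, but I forget which one. – fedepad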

Answers

3

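It means that r defaults to rb, and if you want text you have to specify it using rt.

(This is unlike the behavior of the built-in open, where r means rt, not rb.)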
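To make the contrast concrete, here is a minimal sketch. The scratch files and variable names are made up for illustration; only gzip.open and the built-in open come from the discussion above.

```python
import gzip
import os
import tempfile

# Hypothetical scratch files, created only for this illustration.
tmpdir = tempfile.mkdtemp()
gz_path = os.path.join(tmpdir, "example.txt.gz")
plain_path = os.path.join(tmpdir, "plain.txt")

with gzip.open(gz_path, "wt", encoding="utf-8") as f:
    f.write("hello\n")
with open(plain_path, "w", encoding="utf-8") as f:
    f.write("hello\n")

# gzip.open: "r" is binary, exactly like "rb"; text needs an explicit "rt".
with gzip.open(gz_path, "r") as f:
    data_r = f.read()
with gzip.open(gz_path, "rb") as f:
    data_rb = f.read()
with gzip.open(gz_path, "rt", encoding="utf-8") as f:
    data_rt = f.read()

# Built-in open: "r" is text mode, i.e. the same as "rt".
with open(plain_path, "r", encoding="utf-8") as f:
    plain_r = f.read()

print(type(data_r).__name__, type(data_rb).__name__,
      type(data_rt).__name__, type(plain_r).__name__)
```

With gzip.open, both "r" and "rb" yield bytes, while "rt" yields str; with the built-in open, plain "r" already yields str.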

+0

I like this explanation. I was worried that 'r' was a binary read and that 'rb' was somehow a "more binary" read than 'r'. – jeff00seattle

2

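Exactly, as already covered by @Jean-François Fabre's answer.
I just wanted to show some code, because it is interesting.
Let's look at the gzip.py source code in the Python library to see what actually happens.
gzip.open() can be found here: https://github.com/python/cpython/blob/master/Lib/gzip.py and I reproduce it below.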

def open(filename, mode="rb", compresslevel=9,
         encoding=None, errors=None, newline=None):
    """Open a gzip-compressed file in binary or text mode.

    The filename argument can be an actual filename (a str or bytes object), or
    an existing file object to read from or write to.

    The mode argument can be "r", "rb", "w", "wb", "x", "xb", "a" or "ab" for
    binary mode, or "rt", "wt", "xt" or "at" for text mode. The default mode is
    "rb", and the default compresslevel is 9.

    For binary mode, this function is equivalent to the GzipFile constructor:
    GzipFile(filename, mode, compresslevel). In this case, the encoding, errors
    and newline arguments must not be provided.

    For text mode, a GzipFile object is created, and wrapped in an
    io.TextIOWrapper instance with the specified encoding, error handling
    behavior, and line ending(s).
    """
    if "t" in mode:
        if "b" in mode:
            raise ValueError("Invalid mode: %r" % (mode,))
    else:
        if encoding is not None:
            raise ValueError("Argument 'encoding' not supported in binary mode")
        if errors is not None:
            raise ValueError("Argument 'errors' not supported in binary mode")
        if newline is not None:
            raise ValueError("Argument 'newline' not supported in binary mode")

    gz_mode = mode.replace("t", "")
    if isinstance(filename, (str, bytes, os.PathLike)):
        binary_file = GzipFile(filename, gz_mode, compresslevel)
    elif hasattr(filename, "read") or hasattr(filename, "write"):
        binary_file = GzipFile(None, gz_mode, compresslevel, filename)
    else:
        raise TypeError("filename must be a str or bytes object, or a file")

    if "t" in mode:
        return io.TextIOWrapper(binary_file, encoding, errors, newline)
    else:
        return binary_file
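The argument checks in this function can be exercised directly. The following is a small sketch, not part of gzip.py: try_open is a hypothetical helper written for this illustration, and in-memory io.BytesIO objects stand in for real files.

```python
import gzip
import io


def try_open(target, mode, **kwargs):
    """Hypothetical helper: return "ok" or the name of the raised exception."""
    try:
        with gzip.open(target, mode, **kwargs):
            return "ok"
    except (ValueError, TypeError) as exc:
        return type(exc).__name__


results = {
    # "t" and "b" in the same mode string are rejected.
    "wtb": try_open(io.BytesIO(), "wtb"),
    # encoding is only accepted in text mode.
    "wb + encoding": try_open(io.BytesIO(), "wb", encoding="utf-8"),
    # Neither a path nor an object with read()/write().
    "int as filename": try_open(12345, "rb"),
    # Valid: binary and text mode on an existing file object.
    "wb": try_open(io.BytesIO(), "wb"),
    "wt + encoding": try_open(io.BytesIO(), "wt", encoding="utf-8"),
}

for label, outcome in results.items():
    print(f"{label}: {outcome}")
```

The exact error messages may differ between Python versions, so the sketch only looks at the exception types.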

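A few things we notice:

  • The default mode is rb, as the documentation you quoted says.
  • When opening in binary mode, it does not care whether the mode is, for example, "r" or "rb", or "w" or "wb".
    We can see this in the following lines: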

    gz_mode = mode.replace("t", "") 
    if isinstance(filename, (str, bytes, os.PathLike)): 
        binary_file = GzipFile(filename, gz_mode, compresslevel) 
    elif hasattr(filename, "read") or hasattr(filename, "write"): 
        binary_file = GzipFile(None, gz_mode, compresslevel, filename) 
    else: 
        raise TypeError("filename must be a str or bytes object, or a file") 
    
    if "t" in mode: 
        return io.TextIOWrapper(binary_file, encoding, errors, newline) 
    else: 
        return binary_file 
    

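    Basically, the binary file binary_file is built whether or not there is an additional b, because gz_mode may or may not contain b at this point.
    The class GzipFile(_compression.BaseStream) is then called to build binary_file.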
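The two return branches can be observed from the outside. This is a minimal sketch that uses in-memory io.BytesIO targets instead of real files; the variable names are made up for illustration.

```python
import gzip
import io

# Open three handles on throwaway in-memory buffers.
w = gzip.open(io.BytesIO(), "w")
wb = gzip.open(io.BytesIO(), "wb")
wt = gzip.open(io.BytesIO(), "wt")

kinds = (type(w).__name__, type(wb).__name__, type(wt).__name__)
# In text mode, the GzipFile is still there, wrapped by the TextIOWrapper.
wraps_gzipfile = isinstance(wt.buffer, gzip.GzipFile)

for handle in (w, wb, wt):
    handle.close()

print(kinds)           # ('GzipFile', 'GzipFile', 'TextIOWrapper')
print(wraps_gzipfile)  # True
```

So "w" and "wb" both take the binary branch and hand back a plain GzipFile, while only a mode containing "t" adds the io.TextIOWrapper layer.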

In the constructor, the following lines are the important ones:

if mode and ('t' in mode or 'U' in mode):
    raise ValueError("Invalid mode: {!r}".format(mode))
if mode and 'b' not in mode:
    mode += 'b'
if fileobj is None:
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
if filename is None:
    filename = getattr(fileobj, 'name', '')
    if not isinstance(filename, (str, bytes)):
        filename = ''
else:
    filename = os.fspath(filename)
if mode is None:
    mode = getattr(fileobj, 'mode', 'rb')

Here it can clearly be seen that if 'b' is not present in mode, it gets added:

if mode and 'b' not in mode:
    mode += 'b'
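One way to watch this normalization happen is to look at the mode of the underlying file that GzipFile opens. This is a sketch under an assumption: it reads the fileobj attribute of GzipFile, which is set in the constructor but is best treated as an implementation detail. The scratch paths are hypothetical.

```python
import gzip
import os
import tempfile

# Hypothetical scratch directory for this illustration.
tmpdir = tempfile.mkdtemp()

underlying_modes = {}
for requested in ("w", "wb"):
    path = os.path.join(tmpdir, f"demo_{requested}.gz")
    with gzip.GzipFile(path, requested) as gz:
        # gz.fileobj is the file created by builtins.open(filename, mode)
        # after 'b' has been appended to the mode where it was missing.
        underlying_modes[requested] = gz.fileobj.mode

print(underlying_modes)
```

Both entries should show the same binary mode string, regardless of whether the b was passed in or appended by the constructor.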

So, as already discussed, there is no difference between the two modes.
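A quick round trip can confirm this in practice. The following is a minimal sketch with hypothetical scratch files: the same payload is written with "w" and with "wb", read back, and compared. It also checks that "w" really is binary by trying to write a str.

```python
import gzip
import os
import tempfile

payload = b"same bytes either way\n"
tmpdir = tempfile.mkdtemp()  # hypothetical scratch directory

round_trips = {}
for mode in ("w", "wb"):
    path = os.path.join(tmpdir, f"out_{mode}.gz")
    with gzip.open(path, mode) as f:
        f.write(payload)
    with gzip.open(path) as f:  # default mode is "rb"
        round_trips[mode] = f.read()

# "w" is binary, so writing a str is rejected.
try:
    with gzip.open(os.path.join(tmpdir, "text.gz"), "w") as f:
        f.write("a str, not bytes")
    str_error = None
except TypeError as exc:
    str_error = type(exc).__name__

print(round_trips["w"] == round_trips["wb"] == payload)
print(str_error)
```

The decompressed contents are compared rather than the raw .gz bytes, because the gzip header carries a timestamp and the file name, which can differ between the two files.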