2011-07-27 42 views
8

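So I've been playing around with raw WSGI, cgi.FieldStorage and file uploads, and I can't work out how it deals with file uploads. How does cgi.FieldStorage store the file?

At first it seemed that it just stores the whole file in memory. I thought, hm, that should be easy to test - a big file should clog up the memory!.. And it didn't. Still, when I request the file, I get a string, not an iterator, a file object or anything like that.

I've tried reading the cgi module's source and found some things about temporary files, but it returns a freaking string, not a file(-like) object! So... how on earth does it work?!

Here's the code I've used: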

import cgi 
from wsgiref.simple_server import make_server 

def app(environ,start_response): 
    start_response('200 OK',[('Content-Type','text/html')]) 
    output = """ 
    <form action="" method="post" enctype="multipart/form-data"> 
    <input type="file" name="failas" /> 
    <input type="submit" value="Varom" /> 
    </form> 
    """ 
    fs = cgi.FieldStorage(fp=environ['wsgi.input'],environ=environ) 
    f = fs.getfirst('failas') 
    print type(f) 
    return [output]


if __name__ == '__main__' : 
    httpd = make_server('',8000,app) 
    print 'Serving' 
    httpd.serve_forever() 

Thanks in advance! :)

Answers

6

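Check the cgi module description; there is a paragraph that discusses how to handle file uploads.

If a field represents an uploaded file, accessing the value via the value attribute or the getvalue() method reads the entire file in memory as a string. This may not be what you want. You can test for an uploaded file by testing either the filename attribute or the file attribute. You can then read the data at leisure from the file attribute: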

fileitem = form["userfile"] 
if fileitem.file: 
    # It's an uploaded file; count lines 
    linecount = 0 
    while 1: 
        line = fileitem.file.readline()
        if not line: break
        linecount = linecount + 1

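Regarding your example, getfirst() is just a variant of getvalue(), so it also returns the contents as a string. Try replacing

f = fs.getfirst('failas')

with

f = fs['failas'].file

This returns a file-like object that can be read "at leisure".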
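
As a rough sketch of what reading "at leisure" can look like, the snippet below copies a file-like object to disk in fixed-size chunks, so the upload is never held in memory as one string. The `save_upload` helper, the chunk size and the `io.BytesIO` stand-in (used here in place of `fs['failas'].file`) are assumptions made for illustration; none of them come from the cgi module.

```python
import io
import os
import shutil
import tempfile


def save_upload(upload, dest_path, chunk_size=64 * 1024):
    """Copy a file-like object to dest_path chunk by chunk (hypothetical helper)."""
    with open(dest_path, 'wb') as dest:
        # copyfileobj reads at most chunk_size bytes at a time.
        shutil.copyfileobj(upload, dest, chunk_size)
    return os.path.getsize(dest_path)


# Stand-in for fs['failas'].file; a real upload would come from FieldStorage.
fake_upload = io.BytesIO(b'line one\nline two\n')
target = os.path.join(tempfile.mkdtemp(), 'upload.bin')
saved_bytes = save_upload(fake_upload, target)
```

In the WSGI app from the question, the same helper would presumably be called with `fs['failas'].file` as its first argument.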

+0

Thanks :) I mostly use Django, but sometimes I like to play with this low-level stuff :) – Justinas

5

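The best way is NOT to read the file at all (not even line by line, as gimel suggested).

You can subclass FieldStorage and override its make_file method. make_file is called when the FieldStorage instance represents a file.

For reference, the default make_file looks like this: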

def make_file(self, binary=None): 
    """Overridable: return a readable & writable file. 

    The file will be used as follows: 
    - data is written to it 
    - seek(0) 
    - data is read from it 

    The 'binary' argument is unused -- the file is always opened 
    in binary mode. 

    This version opens a temporary file for reading and writing, 
    and immediately deletes (unlinks) it. The trick (on Unix!) is 
    that the file can still be used, but it can't be opened by 
    another process, and it will automatically be deleted when it 
    is closed or when the current process terminates. 

    If you want a more permanent file, you derive a class which 
    overrides this method. If you want a visible temporary file 
    that is nevertheless automatically deleted when the script 
    terminates, try defining a __del__ method in a derived class 
    which unlinks the temporary files you have created. 

    """ 
    import tempfile 
    return tempfile.TemporaryFile("w+b") 

Instead of creating a temporary file, create a permanent file wherever you want.
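
A minimal sketch of such an override, written as a mixin so it can be tried without a real request. `PermanentFileMixin`, `upload_dir` and `saved_path` are hypothetical names, and the destination directory is an assumption; the only thing taken from cgi is the `make_file(self, binary=None)` signature shown above.

```python
import os
import tempfile


class PermanentFileMixin(object):
    """Hypothetical mixin: supplies a make_file() that keeps the file on disk."""

    upload_dir = tempfile.gettempdir()  # assumed destination directory

    def make_file(self, binary=None):
        # mkstemp creates a uniquely named file that is NOT deleted on close.
        fd, self.saved_path = tempfile.mkstemp(dir=self.upload_dir)
        # Same mode as the default: readable and writable, binary.
        return os.fdopen(fd, 'w+b')


# Exercised on its own here; in an app it is meant to be listed before
# cgi.FieldStorage in the bases of a subclass.
holder = PermanentFileMixin()
handle = holder.make_file()
handle.write(b'payload')
handle.close()
still_there = os.path.exists(holder.saved_path)
```

Unlike the default temporary file, the file created here survives being closed, so the application is responsible for deleting it later.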

2

Building on @hasanatkazmi's answer (used in a Twisted app), I ended up with something like this:

#!/usr/bin/env python2 
# -*- coding: utf-8 -*- 
# -*- indent: 4 spc -*- 
import sys 
import cgi 
import tempfile 


class PredictableStorage(cgi.FieldStorage):
    def __init__(self, *args, **kwargs):
        # Optional destination path; popped so FieldStorage never sees it.
        self.path = kwargs.pop('path', None)
        cgi.FieldStorage.__init__(self, *args, **kwargs)

    def make_file(self, binary=None):
        if not self.path:
            # No path given: keep a named temporary file and remember its name.
            tmp = tempfile.NamedTemporaryFile("w+b", delete=False)
            self.path = tmp.name
            return tmp
        return open(self.path, 'w+b')

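Be warned that the file is not always created by the cgi module. According to these lines from cgi.py, it is only created once the content exceeds 1000 bytes: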

if self.__file.tell() + len(line) > 1000: 
    self.file = self.make_file('') 

So you have to check whether the file was actually created by querying the custom class's path field, like so:

if file_field.path:
    # The cgi module already wrote the upload to file_field.path.
    pass
else:
    # Small upload kept in memory: write it to a named temporary file.
    import tempfile
    with tempfile.NamedTemporaryFile("w+b", delete=False) as f:
        f.write(file_field.value)
        # You can save the 'f.name' field for later usage.

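If Content-Length is also set for the field, which seems to be rare, the file should be created by cgi as well.

That's it. This way you can store the file predictably and reduce the memory footprint of your app.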
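
To make the 1000-byte threshold easier to picture, here is a small imitation of the behaviour: data stays in an in-memory buffer until the next write would push it past the limit, and only then is a real file created. `SpillingBuffer` and `on_disk` are hypothetical names, and this is a simplified stand-in, not the code cgi actually runs.

```python
import io
import tempfile

THRESHOLD = 1000  # the limit quoted from cgi.py above


class SpillingBuffer(object):
    """Hypothetical imitation of the buffering; not cgi's real class."""

    def __init__(self):
        self.file = io.BytesIO()  # starts out in memory
        self.on_disk = False

    def write(self, line):
        # Same test as the quoted lines: would this write exceed the limit?
        if not self.on_disk and self.file.tell() + len(line) > THRESHOLD:
            disk_file = tempfile.TemporaryFile("w+b")
            disk_file.write(self.file.getvalue())  # move buffered data over
            self.file = disk_file
            self.on_disk = True
        self.file.write(line)


small = SpillingBuffer()
small.write(b"x" * 500)   # stays in memory

large = SpillingBuffer()
large.write(b"x" * 600)
large.write(b"x" * 600)   # 1200 bytes in total: spills to a real file
```

This is why an upload under the threshold never triggers make_file, and why the path check above is needed.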