2010-08-01 55 views
1

I made the simple web server shown below. How can I find the uploaded file's name with Python's cgi module?

import BaseHTTPServer, os, cgi 
import cgitb; cgitb.enable() 

html = """ 
<html> 
<body> 
<form action="" method="POST" enctype="multipart/form-data"> 
File upload: <input type="file" name="upfile"> 
<input type="submit" value="upload"> 
</form> 
</body> 
</html> 
""" 
class Handler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("content-type", "text/html;charset=utf-8")
        self.end_headers()
        self.wfile.write(html)

    def do_POST(self):
        ctype, pdict = cgi.parse_header(self.headers.getheader('content-type'))
        if ctype == 'multipart/form-data':
            query = cgi.parse_multipart(self.rfile, pdict)
            upfilecontent = query.get('upfile')
            if upfilecontent:
                # I don't know how to get the file name, so I named it 'tmp.dat'
                fout = open(os.path.join('tmp', 'tmp.dat'), 'wb')
                fout.write(upfilecontent[0])
                fout.close()
        self.do_GET()

if __name__ == '__main__': 
    server = BaseHTTPServer.HTTPServer(("127.0.0.1", 8080), Handler) 
    print('web server on 8080..') 
    server.serve_forever() 

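In the do_POST method of BaseHTTPRequestHandler, I successfully get the uploaded file's data.

But I can't figure out how to get the original name of the uploaded file. self.rfile.name is just 'socket'. How can I get the uploaded file's name?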

Answers

2

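The code you're using as a starting point is pretty broken (for example, look at that global rootnode, where the name rootnode is used nowhere: clearly half-edited source, and badly edited at that).

Anyway, what form are you using "client-side" for the POST? How does it set the upfile field?

Why aren't you using the normal FieldStorage approach, as documented in Python's docs? That way, you can use the .file attribute of the relevant field to get a file-like object to read from, or its .value attribute to read the whole thing into memory and get it as a string, plus the field's .filename attribute to learn the uploaded file's name. More detailed, though concise, documentation on FieldStorage is here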
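To illustrate where a value like .filename comes from: each part of a multipart/form-data body carries its own Content-Disposition header, and the browser puts the file name in that header's filename parameter. The following is a minimal sketch, not the FieldStorage implementation. It assumes Python 3 and uses the standard email package instead of cgi (which was removed in Python 3.13); parse_upload is a hypothetical helper name, and the approach has only been reasoned through for small uploads.

```python
from email.parser import BytesParser
from email.policy import default


def parse_upload(content_type, body):
    """Return (filename, data) pairs for every file part in a multipart body.

    content_type is the request's Content-Type header value (it must include
    the boundary parameter); body is the raw request body as bytes.
    """
    # The email parser needs the Content-Type header in front of the body
    # so that it can find the multipart boundary.
    prefix = (b"Content-Type: " + content_type.encode("ascii") +
              b"\r\nMIME-Version: 1.0\r\n\r\n")
    message = BytesParser(policy=default).parsebytes(prefix + body)

    uploads = []
    for part in message.iter_parts():
        # get_filename() reads the filename parameter of Content-Disposition.
        filename = part.get_filename()
        if filename:
            uploads.append((filename, part.get_payload(decode=True)))
    return uploads
```

In a request handler, content_type would come from the request headers and body from reading Content-Length bytes off the request stream.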

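Edit: now that the OP has edited the Q to clarify, I see the problem: BaseHTTPServer does not set up the environment according to the CGI specs, so the cgi module isn't very usable with it. Unfortunately, the only simple approach to setting the environment is to steal and hack a big chunk of code from CGIHTTPServer.py (it wasn't designed for reuse, hence the need for, sigh, copy-and-paste coding), e.g.: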

import os, urllib

def populenv(self):
    path = self.path
    dir, rest = '.', 'ciao'

    # find an explicit query string, if present.
    i = rest.rfind('?')
    if i >= 0:
        rest, query = rest[:i], rest[i+1:]
    else:
        query = ''

    # dissect the part after the directory name into a script name &
    # a possible additional path, to be stored in PATH_INFO.
    i = rest.find('/')
    if i >= 0:
        script, rest = rest[:i], rest[i:]
    else:
        script, rest = rest, ''

    # Reference: http://hoohoo.ncsa.uiuc.edu/cgi/env.html
    # XXX Much of the following could be prepared ahead of time!
    env = {}
    env['SERVER_SOFTWARE'] = self.version_string()
    env['SERVER_NAME'] = self.server.server_name
    env['GATEWAY_INTERFACE'] = 'CGI/1.1'
    env['SERVER_PROTOCOL'] = self.protocol_version
    env['SERVER_PORT'] = str(self.server.server_port)
    env['REQUEST_METHOD'] = self.command
    uqrest = urllib.unquote(rest)
    env['PATH_INFO'] = uqrest
    env['SCRIPT_NAME'] = 'ciao'
    if query:
        env['QUERY_STRING'] = query
    host = self.address_string()
    if host != self.client_address[0]:
        env['REMOTE_HOST'] = host
    env['REMOTE_ADDR'] = self.client_address[0]
    authorization = self.headers.getheader("authorization")
    if authorization:
        authorization = authorization.split()
        if len(authorization) == 2:
            import base64, binascii
            env['AUTH_TYPE'] = authorization[0]
            if authorization[0].lower() == "basic":
                try:
                    authorization = base64.decodestring(authorization[1])
                except binascii.Error:
                    pass
                else:
                    authorization = authorization.split(':')
                    if len(authorization) == 2:
                        env['REMOTE_USER'] = authorization[0]
    # XXX REMOTE_IDENT
    if self.headers.typeheader is None:
        env['CONTENT_TYPE'] = self.headers.type
    else:
        env['CONTENT_TYPE'] = self.headers.typeheader
    length = self.headers.getheader('content-length')
    if length:
        env['CONTENT_LENGTH'] = length
    referer = self.headers.getheader('referer')
    if referer:
        env['HTTP_REFERER'] = referer
    accept = []
    for line in self.headers.getallmatchingheaders('accept'):
        if line[:1] in "\t\n\r ":
            accept.append(line.strip())
        else:
            accept = accept + line[7:].split(',')
    env['HTTP_ACCEPT'] = ','.join(accept)
    ua = self.headers.getheader('user-agent')
    if ua:
        env['HTTP_USER_AGENT'] = ua
    co = filter(None, self.headers.getheaders('cookie'))
    if co:
        env['HTTP_COOKIE'] = ', '.join(co)
    # XXX Other HTTP_* headers
    # Since we're setting the env in the parent, provide empty
    # values to override previously set values
    for k in ('QUERY_STRING', 'REMOTE_HOST', 'CONTENT_LENGTH',
              'HTTP_USER_AGENT', 'HTTP_COOKIE', 'HTTP_REFERER'):
        env.setdefault(k, "")
    os.environ.update(env)

This could be simplified substantially further, but not without spending some time and energy on the task :-(
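As a rough idea of what a simplified version might look like: for parsing a form upload, the variables that matter most are the request method, the content type, and the content length. The sketch below builds only those three and returns them as a plain dict instead of updating os.environ. The function name minimal_cgi_environ and its headers argument (any mapping with a .get() method) are assumptions made for illustration, and whether three variables are enough for a given use has not been verified here.

```python
def minimal_cgi_environ(method, headers):
    """Build the few CGI variables typically needed to parse a form upload.

    method is the HTTP method string; headers is any mapping-like object
    with a .get() method (hypothetical input for this sketch).
    """
    environ = {"REQUEST_METHOD": method.upper()}

    content_type = headers.get("Content-Type")
    if content_type:
        environ["CONTENT_TYPE"] = content_type

    content_length = headers.get("Content-Length")
    if content_length:
        # CGI variables are always strings.
        environ["CONTENT_LENGTH"] = str(content_length)

    return environ
```

A dict like this could be passed as the environ argument of cgi.FieldStorage, which avoids mutating the process-wide environment on every request.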

With this populenv function at hand, we can recode:

def do_POST(self):
    populenv(self)
    form = cgi.FieldStorage(fp=self.rfile)
    upfilecontent = form['upfile'].value
    if upfilecontent:
        fout = open(os.path.join('tmp', form['upfile'].filename), 'wb')
        fout.write(upfilecontent)
        fout.close()
    self.do_GET()

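...and live happily ever after ;-). (Of course, using any decent WSGI server, or even the demo one, would be much easier, but this exercise is instructive about CGI and its internals ;-).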
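For comparison, here is a minimal sketch of the WSGI route, assuming Python 3 and the standard library's wsgiref server. With WSGI, the server hands the application an environ dict that already contains CONTENT_TYPE, CONTENT_LENGTH and the input stream, so no environment has to be built by hand. The names upload_info_app and run are hypothetical, and the app only reports what it received rather than saving a file.

```python
from wsgiref.simple_server import make_server


def upload_info_app(environ, start_response):
    """Report the size and content type of the request body."""
    # The WSGI server has already filled in these CGI-style variables.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)
    content_type = environ.get("CONTENT_TYPE") or "unknown type"

    text = "received %d bytes of %s\n" % (len(body), content_type)
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [text.encode("utf-8")]


def run(host="127.0.0.1", port=8080):
    """Serve upload_info_app until interrupted."""
    make_server(host, port, upload_info_app).serve_forever()
```

Calling run() starts the server on the same address the question uses; the request body read in upload_info_app is what a multipart parser would then be given.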

+0

Thanks, Alex. I tried cgi.FieldStorage() in the do_POST() method, but it returns an empty object. Should I use CGIHTTPRequestHandler with a separate py file to get the FieldStorage information? – 2010-08-01 17:21:48

+0

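@tk, you can do that, but there's no reason to put it in a separate '.py' file. If you edit your Q to show the minimal amount of code that fails, rather than pointing to that known-broken example, it will be much easier for anybody to help you; also **please** edit your Q to show the form you're using for the upload, as I already mentioned, because unless you _show_ it, there's simply no way to guess what might be wrong with it! – 2010-08-01 17:31:05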

+0

Alex, thanks again. I've fixed my question to include the Python code and the form HTML. – 2010-08-02 00:26:37

1

By using cgi.FieldStorage, you can easily extract the file name. See the example below:

def do_POST(self):
    ctype, pdict = cgi.parse_header(self.headers.getheader('content-type'))
    if ctype == 'multipart/form-data':
        form = cgi.FieldStorage(
            fp=self.rfile,
            headers=self.headers,
            environ={'REQUEST_METHOD': 'POST',
                     'CONTENT_TYPE': self.headers['Content-Type']})
        filename = form['upfile'].filename
        data = form['upfile'].file.read()
        open("./%s" % filename, "wb").write(data)
    self.do_GET()
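One caution when writing to a path built from the uploaded name: the file name is supplied by the client, so it may contain directory components such as "../". A minimal sketch of one way to guard against that follows; safe_upload_path is a hypothetical helper, and the choice to reject rather than rename unusable names is an assumption.

```python
import os


def safe_upload_path(upload_dir, client_filename):
    """Join upload_dir with only the final component of a client file name."""
    # Treat backslashes as separators too, since a client may send a
    # Windows-style path regardless of the server's platform.
    name = os.path.basename(client_filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError("unusable upload filename: %r" % (client_filename,))
    return os.path.join(upload_dir, name)
```

For example, safe_upload_path('tmp', filename) could replace the "./%s" % filename expression when opening the output file.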
+0

In newer Python versions, you should use 'self.headers.get_params()' instead of 'self.headers.getheader()' – maciek 2017-05-22 12:16:51
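A short sketch of what header access could look like on Python 3, assuming the handler comes from http.server, where self.headers is an http.client.HTTPMessage (a subclass of email.message.Message) and getheader() no longer exists. The helper names content_type_info and example_headers are hypothetical; example_headers only builds a stand-in header object so the snippet can run outside a server.

```python
import io
from http.client import parse_headers


def content_type_info(headers):
    """Return (media type, boundary) from a Python 3 header object."""
    # get_content_type() lowercases the media type and drops its parameters;
    # get_boundary() returns the boundary parameter, or None if absent.
    return headers.get_content_type(), headers.get_boundary()


def example_headers():
    """Build a header object of the kind http.server stores in self.headers."""
    raw = (b"Content-Type: multipart/form-data; boundary=XBOUNDARY\r\n"
           b"Content-Length: 42\r\n"
           b"\r\n")
    return parse_headers(io.BytesIO(raw))
```

Inside a handler, content_type_info(self.headers) would take the place of the cgi.parse_header(self.headers.getheader('content-type')) call, and self.headers.get_params() returns the same parameters as a list of key/value pairs.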