2013-03-12
1

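Accessing other form fields in a custom Django upload handler

I wrote a custom Django file upload handler for my current project. It is a proof of concept that lets you compute the hash of an uploaded file without storing that file on disk. It is only a proof of concept, to be sure, but if I can get it working I can move on to the real purpose of my work.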

Essentially, this is what I have so far, and it works fine with one major exception:

from hashlib import sha256

from django.core.files.uploadhandler import FileUploadHandler, StopFutureHandlers

from myproject.upload.files import MyProjectUploadedFile


class MyProjectUploadHandler(FileUploadHandler):
    def __init__(self, *args, **kwargs):
        super(MyProjectUploadHandler, self).__init__(*args, **kwargs)

    def handle_raw_input(self, input_data, META, content_length, boundary,
                         encoding=None):
        # Returning None leaves the multipart parsing to Django itself.
        self.activated = True

    def new_file(self, *args, **kwargs):
        super(MyProjectUploadHandler, self).new_file(*args, **kwargs)

        # Start a fresh digest for this file and keep later handlers out of it.
        self.digester = sha256()
        raise StopFutureHandlers()

    def receive_data_chunk(self, raw_data, start):
        # Hash the chunk; returning None keeps it from reaching later handlers.
        self.digester.update(raw_data)

    def file_complete(self, file_size):
        return MyProjectUploadedFile(self.digester.hexdigest())

The custom upload handler works great. The hash is accurate, no uploaded file is ever written to disk, and only 64 KB of memory is in use at any one time.
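The constant-memory behaviour comes from feeding the digest one chunk at a time. Here is a minimal, Django-free sketch of that idea, assuming 64 KB chunks (the default `chunk_size` of `FileUploadHandler`). The names `iter_chunks`, `streaming_sha256` and the sample payload are made up for illustration:

```python
from hashlib import sha256
from io import BytesIO

CHUNK_SIZE = 64 * 2 ** 10  # 64 KB, assumed to match the handler's chunk size


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive chunks from a file-like object until it is empty."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def streaming_sha256(stream):
    """Hash a stream while holding at most one chunk in memory."""
    digester = sha256()
    for chunk in iter_chunks(stream):
        digester.update(chunk)
    return digester.hexdigest()


# Sample data that spans several chunks (hypothetical payload).
payload = b"x" * (3 * CHUNK_SIZE + 123)
chunked_digest = streaming_sha256(BytesIO(payload))
whole_digest = sha256(payload).hexdigest()
```

Both digests should agree, since `update()` over consecutive chunks is equivalent to hashing the concatenated data in one call.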

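The only problem I'm having is that I need to access another field from the POST request (a text salt entered by the user) before the file is processed. My form looks like this:

<form id="myForm" method="POST" enctype="multipart/form-data" action="/upload/">
    <fieldset>
        <input name="salt" type="text" placeholder="Salt">
        <input name="uploadfile" type="file">
        <input type="submit">
    </fieldset>
</form>

The "salt" POST variable is only made available to me after the request has been processed and the file has been uploaded, which doesn't work for my use case. I can't seem to find a way to access this variable in any way, shape, or form from within my upload handler.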

Is there a way for me to access each multipart variable as it comes across, instead of only the uploaded files?
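For context, here is a small sketch of what such a form submission could look like on the wire, and of the fact that the parts arrive in order. It uses the standard library `email` parser purely as a stand-in for Django's own multipart parser. The boundary string, the file name and the field values are invented for illustration:

```python
from email import message_from_bytes

BOUNDARY = "XxBoundaryxX"  # hypothetical boundary; browsers generate their own

# A hand-written multipart/form-data body: the salt field, then the file.
body = (
    "--{b}\r\n"
    'Content-Disposition: form-data; name="salt"\r\n'
    "\r\n"
    "pepper\r\n"
    "--{b}\r\n"
    'Content-Disposition: form-data; name="uploadfile"; filename="a.txt"\r\n'
    "Content-Type: text/plain\r\n"
    "\r\n"
    "hello world\r\n"
    "--{b}--\r\n"
).format(b=BOUNDARY).encode("ascii")

# The email parser needs the Content-Type header in front of the body.
header = "Content-Type: multipart/form-data; boundary={b}\r\n\r\n".format(b=BOUNDARY)
message = message_from_bytes(header.encode("ascii") + body)

# Collect (field name, raw bytes) for each part, in the order it was sent.
parts = [
    (part.get_param("name", header="content-disposition"),
     part.get_payload(decode=True))
    for part in message.get_payload()
]
```

Since the salt input precedes the file input in the form, its part would be expected to come first in the body, which is what makes reading it before the file data plausible at all.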

Answers

2

My solution didn't come easy, but here it is:

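import base64

# These imports match the Django API of the time; force_text and
# unescape_entities were removed in Django 4.0.
from django.core.files.uploadhandler import (
    FileUploadHandler, SkipFile, StopUpload)
from django.http.multipartparser import (
    FIELD, FILE, ChunkIter, LazyStream, MultiPartParserError, Parser, exhaust)
from django.utils.datastructures import MultiValueDict
from django.utils.encoding import force_text
from django.utils.text import unescape_entities


class IntelligentUploadHandler(FileUploadHandler):
    """
    An upload handler which overrides the default multipart parser to allow
    simultaneous parsing of fields and files... intelligently. Subclass this
    for real and true awesomeness.
    """

    def __init__(self, *args, **kwargs):
        super(IntelligentUploadHandler, self).__init__(*args, **kwargs)

    def field_parsed(self, field_name, field_value):
        """
        A callback method triggered when a non-file field has been parsed
        successfully by the parser. Use this to listen for new fields being
        parsed.
        """
        pass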

    def handle_raw_input(self, input_data, META, content_length, boundary,
                         encoding=None):
        """
        Parse the raw input from the HTTP request and split items into fields
        and files, executing callback methods as necessary.

        Shamelessly adapted and borrowed from
        django.http.multipartparser.MultiPartParser.
        """
        # following suit from the source class, this is imported here to avoid
        # a potential circular import
        from django.http import QueryDict

        # create return values
        self.POST = QueryDict('', mutable=True)
        self.FILES = MultiValueDict()

        # initialize the parser and stream
        stream = LazyStream(ChunkIter(input_data, self.chunk_size))

        # name of the file field whose completion must be signalled at the
        # beginning of the next loop iteration, if any
        old_field_name = None
        counter = 0

        try:
            for item_type, meta_data, field_stream in Parser(stream, boundary):
                if old_field_name:
                    # we run this test at the beginning of the next loop since
                    # we cannot be sure a file is complete until we hit the next
                    # boundary/part of the multipart content.
                    file_obj = self.file_complete(counter)

                    if file_obj:
                        # if we return a file object, add it to the files dict
                        self.FILES.appendlist(
                            force_text(old_field_name, encoding, errors='replace'),
                            file_obj)

                    # wipe it out to prevent havoc
                    old_field_name = None

                try:
                    disposition = meta_data['content-disposition'][1]
                    field_name = disposition['name'].strip()
                except (KeyError, IndexError, AttributeError):
                    continue

                transfer_encoding = meta_data.get('content-transfer-encoding')

                if transfer_encoding is not None:
                    transfer_encoding = transfer_encoding[0].strip()

                field_name = force_text(field_name, encoding, errors='replace')

                if item_type == FIELD:
                    # this is a POST field
                    if transfer_encoding == "base64":
                        raw_data = field_stream.read()
                        try:
                            data = base64.b64decode(raw_data)
                        except (TypeError, ValueError):
                            # not valid base64, so fall back to the raw bytes
                            data = raw_data
                    else:
                        data = field_stream.read()

                    self.POST.appendlist(
                        field_name, force_text(data, encoding, errors='replace'))

                    # trigger listener
                    self.field_parsed(field_name, self.POST.get(field_name))
                elif item_type == FILE:
                    # this is a file
                    file_name = disposition.get('filename')

                    if not file_name:
                        continue

                    # transform the file name
                    file_name = force_text(file_name, encoding, errors='replace')
                    file_name = self.IE_sanitize(unescape_entities(file_name))

                    content_type = meta_data.get('content-type', ('',))[0].strip()

                    try:
                        charset = meta_data.get('content-type', (0, {}))[1].get('charset', None)
                    except Exception:
                        charset = None

                    try:
                        file_content_length = int(meta_data.get('content-length')[0])
                    except (IndexError, TypeError, ValueError):
                        file_content_length = None

                    counter = 0

                    # now, do the important file stuff
                    try:
                        # alert on the new file
                        self.new_file(field_name, file_name, content_type,
                                      file_content_length, charset)

                        # chubber-chunk it
                        for chunk in field_stream:
                            if transfer_encoding == "base64":
                                # base 64 decode it if need be
                                over_bytes = len(chunk) % 4

                                if over_bytes:
                                    over_chunk = field_stream.read(4 - over_bytes)
                                    chunk += over_chunk

                                try:
                                    chunk = base64.b64decode(chunk)
                                except Exception as e:
                                    # since this is only a chunk, any error is an unfixable error
                                    raise MultiPartParserError("Could not decode base64 data: %r" % e)

                            chunk_length = len(chunk)
                            self.receive_data_chunk(chunk, counter)
                            counter += chunk_length
                            # ... and we're done
                    except SkipFile:
                        # just eat the rest
                        exhaust(field_stream)
                    else:
                        # handle file upload completions on next iteration
                        old_field_name = field_name

        except StopUpload as e:
            # if we get a request to stop the upload, exhaust it if no con reset
            if not e.connection_reset:
                exhaust(input_data)
        else:
            # make sure that the request data is all fed
            exhaust(input_data)

        # signal the upload has been completed
        self.upload_complete()

        return self.POST, self.FILES

    def IE_sanitize(self, filename):
        """Cleanup filename from Internet Explorer full paths."""
        return filename and filename[filename.rfind("\\")+1:].strip()

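Essentially, by subclassing this class, you can have a more... intelligent upload handler. Fields are announced to subclasses through the field_parsed method, which is what I needed.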
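As a rough sketch of how a subclass might use the field_parsed callback for the salted hash from the question: the class below is hypothetical and deliberately free of Django imports so it can be run on its own. It assumes the salt is simply prepended to the file data, which the question does not specify, and that the salt field precedes the file field in the form.

```python
from hashlib import sha256


class SaltedHashCallbacks(object):
    """Hypothetical callbacks, meant to be mixed into an
    IntelligentUploadHandler subclass."""

    salt_field = "salt"  # assumed name of the text input in the form

    def field_parsed(self, field_name, field_value):
        # Remember the salt as soon as the parser announces it.
        if field_name == self.salt_field:
            self.salt = field_value

    def new_file(self, *args, **kwargs):
        # Seed the digest with the salt before any file data arrives.
        # Assumption: the salt is prepended; an empty salt is used if the
        # field has not been seen yet.
        self.digester = sha256()
        self.digester.update(getattr(self, "salt", "").encode("utf-8"))

    def receive_data_chunk(self, raw_data, start):
        self.digester.update(raw_data)

    def file_complete(self, file_size):
        return self.digester.hexdigest()


# Drive the callbacks by hand, in the order the parser would call them.
handler = SaltedHashCallbacks()
handler.field_parsed("salt", "pepper")
handler.new_file("uploadfile", "a.txt", "text/plain", 11, None)
handler.receive_data_chunk(b"hello world", 0)
salted_digest = handler.file_complete(11)
```

In a real project the mixin would presumably be listed before IntelligentUploadHandler in the bases, new_file would also call super(), and file_complete would wrap the digest in an uploaded-file object such as the question's MyProjectUploadedFile.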

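I've filed a feature request with the Django team, in the hope that this functionality becomes part of Django's regular toolbox instead of requiring the source code to be monkey-patched as above.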

0

Based on the code for FileUploadHandler, found here at line 62:

https://github.com/django/django/blob/master/django/core/files/uploadhandler.py

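It looks like the request object is passed into the handler and stored as self.request.

In that case, you should be able to access the salt anywhere in your upload handler: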

salt = self.request.POST.get('salt') 

Unless I've misunderstood your question.

+0

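That's the problem: your code doesn't work. If I print self.request.REQUEST, I get '{}' in the console until after the handler has finished, so basically no variables are available. – 2013-03-13 00:34:22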

+0

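If I get my hands dirty with 'handle_raw_input', I can read the actual request body, variable data and all. However, I'd like to know whether there is a better solution, since manually processing the data, parsing boundaries, extracting the variable information and so on is a bit of a hassle. – 2013-03-13 00:39:10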
