2012-11-10 46 views
6

這是我的第一個問題,如果在代碼示例方面有點長,我很抱歉。Python - 此代碼是否缺乏列表理解和生成器

作爲工作應用程序的一部分,我被要求編寫一個暴露一些字段的Bit Torrent文件解析器。我做了代碼,並被告知我的代碼「不太符合我們從團隊領導所需的級別」。哎喲!

這很好,自從我編碼並列出解析之後,發生器在當天就不存在了(我開始使用COBOL,但用C,C++等編碼)。對我來說,下面的代碼非常乾淨。有時候有沒有需要使用更復雜的結構,語法或模式 - 「保持簡單」。

請問一些Python大師可以批評這個代碼嗎?我相信其他人可以看到代碼可以改進的地方。有更多的意見,等等(在bencode.py爲http://wiki.theory.org/Decoding_bencoded_data_with_python

我能想到的領域:在display_

  1. *方法使用列表內涵避免的「如果」字符串更好
  2. 列表理解/發電機使用
  3. 不好用全局
  4. 標準輸入/輸出/管道?這是一個簡單的任務,所以我認爲這是沒有必要的。

我個人對此代碼感到自豪,所以想知道我需要改進的地方。謝謝。

#!/usr/bin/env python2 
    """Bit Torrent Parsing 

    Parses a Bit Torrent file. 


    A basic parser for Bit Torrent files. Visit http://wiki.theory.org/BitTorrentSpecification for the BitTorrent specification. 

    """ 

    __author__ = "...." 
    __version__ = "$Revision: 1.0 $" 
    __date__ = "$Date: 2012/10/26 11:08:46 $" 
    __copyright__ = "Enjoy & Distribute" 
    __license__ = "Python" 

    import bencode 
    import argparse 
    from argparse import RawTextHelpFormatter 
    import binascii 
    import time 
    import os 
    import pprint 

    torrent_files = 0 
    torrent_pieces = 0 

    def display_root(filename, root): 
     """prints main (root) information on torrent""" 
     global torrent_files 
     global torrent_pieces 

     print 
     print "Bit Torrent Metafile Structure root nodes:" 
     print "------------------------------------------" 
     print "Torrent filename: ", filename 
     print " Info:   %d file(s), %d pieces, ~%d kb/pc" % (
        torrent_files, 
        torrent_pieces, 
        root['info']['piece length']/1024) 

     if 'private' in root['info']: 
      if root['info']['private'] == 1: 
       print "      Publish presence: Private" 

     print " Announce:  ", root['announce'] 

     if 'announce-list' in root: 
      print " Announce List: " 
      for i in root['announce-list']: 
       print "     ", i[0] 

     if 'creation date' in root: 
      print " Creation Date: ", time.ctime(root['creation date']) 

     if 'comment' in root: 
      print " Comment:  ", root['comment'] 

     if 'created-by' in root: 
      print " Created-By: ", root['created-by'] 

     print " Encoding:  ", root['encoding'] 
     print 



    def display_torrent_file(info): 
     """prints file information (single or multifile)""" 
     global torrent_files 
     global torrent_pieces 

     if 'files' in info: 
      # multipart file mode 
      # directory, followed by filenames 

      print "Files:" 

      max_files = args.maxfiles 
      display = max_files if (max_files < torrent_files) else torrent_files 
      print " %d File %d shown: " % (torrent_files, display) 
      print " Directory: ", info['name'] 
      print " Filenames:" 

      i = 0 
      for files in info['files']: 

       if i < max_files: 

        prefix = '' 
        if len(files['path']) > 1: 
         prefix = './' 
        filename = prefix + '/'.join(files['path']) 

        if args.filehash: 
         if 'md5sum' in files: 
          md5hash = binascii.hexlify(files['md5sum']) 
         else: 
          md5hash = 'n/a' 
         print '  %s [hash: %s]' % (filename, md5hash) 
        else: 
         print '  %s ' % filename 
        i += 1 
       else: 
        break 

     else: 
      # single file mode 
      print "Filename: ", info['name'] 

     print 


    def display_pieces(pieceDict): 
     """prints SHA1 hash for pieces, limited by arg pieces""" 
     global torrent_files 
     global torrent_pieces 
     # global pieceDict 

     # limit since a torrent file can have 1,000's of pieces 
     max_pieces = args.pieces if args.pieces else 10 

     print "Pieces:" 
     print " Torrent contains %s pieces, %d shown."% (
       torrent_pieces, max_pieces) 

     print " piece : sha1" 
     i = 0   
     while i < max_pieces and i < torrent_pieces: 
      # print SHA1 hash in readable hex format 
      print ' %5d : %s' % (i+1, binascii.hexlify(pieceDict[i])) 
      i += 1 


    def parse_pieces(root): 
     """create dictionary [ piece-num, hash ] from info's pieces 

     Returns the pieces dictionary. key is the piece number, value is the 
     SHA1 hash value (20-bytes) 

     Keyword arguments: 
     root -- a Bit Torrent Metafile root dictionary 

     """ 

     global torrent_pieces 

     pieceDict = {} 
     i = 0 
     while i < torrent_pieces: 
      pieceDict[i] = root['info']['pieces'][(20*i):(20*i)+20] 
      i += 1 

     return pieceDict 

    def parse_root_str(root_str): 
     """create dictionary [ piece-num, hash ] from info's pieces 

     Returns the complete Bit Torrent Metafile Structure dictionary with 
     relevant Bit Torrent Metafile nodes and their values. 

     Keyword arguments: 
     root_str -- a UTF-8 encoded string with root-level nodes (e.g., info) 

     """ 

     global torrent_files 
     global torrent_pieces 


     try: 
      torrent_root = bencode.decode(root_str) 
     except StandardError: 
      print 'Error in torrent file, likely missing separators like ":"' 

     if 'files' in torrent_root['info']: 
      torrent_files = len(torrent_root['info']['files']) 
     else: 
      torrent_files = 1 

     torrent_pieces = len(torrent_root['info']['pieces'])/20 
     torrent_piece = parse_pieces(torrent_root) 

     return torrent_root, torrent_piece 

    def readfile(filename): 
     """read file and return file's data""" 

     global torrent_files 
     global torrent_pieces 

     if os.path.exists(filename): 
      with open(filename, mode='rb') as f: 
       filedata = f.read() 
     else: 
      print "Error: filename: '%s' does not exist." % filename 
      raise IOError("Filename not found.") 

     return filedata 


    if __name__ == "__main__": 

     parser = argparse.ArgumentParser(formatter_class=RawTextHelpFormatter, 
      description= 
        "A basic parser for Bit Torrent files. Visit " 
        "http://wiki.theory.org/BitTorrentSpecification for " 
        "the BitTorrent specification.", 
      epilog= 
        "The keys for the Bit Torrent MetaInfo File Structure " 
        "are info, announce, announce-list, creation date, comment, " 
        "created by and encoding. \n" 
        "The Info Dictionary (info) is dependant on whether the torrent " 
        "file is a single or multiple file. The keys common to both " 
        "are piece length, pieces and private.\nFor single files, the " 
        "additional keys are name, length and md5sum.For multiple files " 
        "the keys are, name and files. files is also a dictionary with " 
        "keys length, md5sum and path.\n\n" 
        "Examples:\n" 
        "torrentparse.py --string 'l4:dir14:dir28:file.exte'\n" 
        "torrentparse.py --filename foo.torrent\n" 
        "torrentparse.py -f foo.torrent -f bar.torrent " 
        "--maxfiles 2 --filehash --pieces 2 -v")   
     filegroup = parser.add_argument_group('Input File or String') 
     filegroup.add_argument("-f", "--filename", 
        help="name of torrent file to parse", 
        action='append') 
     filegroup.add_argument("-fh", "--filehash", 
        help="display file's MD5 hash", 
        action = "store_true") 
     filegroup.add_argument("-maxf", "--maxfiles", 
        help="display X filenames (default=20)", 
        metavar = 'X', 
        type=int, default=20) 

     piecegroup = parser.add_argument_group('Torrent Pieces') 
     piecegroup.add_argument("-p", "--pieces", 
        help = "display X piece's SHA1 hash (default=10)", 
        metavar = 'X', 
        type = int) 

     parser.add_argument("-s", "--string", 
        help="string for bencoded dictionary item") 


     parser.add_argument("-v", "--verbose", 
        help = "Display MetaInfo file to stdout", 
        action = "store_true") 

     args = parser.parse_args() 



     if args.string: 
      print 
      text = bencode.decode(args.string) 
      print text 
     else:  
      for fn in args.filename: 

       try: 
        filedata = readfile(fn) 
        torrent_root, torrent_piece = parse_root_str(filedata) 
       except IOError: 
        print "Please enter a valid filename" 
        raise 

       if torrent_root: 
        display_root(fn, torrent_root) 
        display_torrent_file(torrent_root['info']) 

        if args.pieces: 
         display_pieces(torrent_piece) 

       verbose = True if args.verbose else False 
       if verbose: 
        print 
        print "Verbose Mode: \nPrinting root and info dictionaries" 
        # remove pieces as its long. display it afterwards 
        pieceless_root = torrent_root 
        del pieceless_root['info']['pieces'] 
        pp = pprint.PrettyPrinter(indent=4) 
        pp.pprint(pieceless_root) 
        print 

        print "Print info's piece information: " 
        pp.pprint(torrent_piece) 
        print 
       print "\n" 
+4

我會先把它變成一個班。 – Blender

+1

缺少面向對象。我敢打賭,這就是我們所期待的。另外'if __name__ ...'下的部分太長了。這應該是一個main()函數。 – Keith

+4

我注意到的第一件事是全局變量。 – icktoofay

回答

3

我注意到的第一件事就是你有很多全局變量。這不好;對於一個問題,您的代碼不再是線程安全的。 (現在我看到你注意的是,在你的問題,但是這是應該被改變。)

這看起來有點奇怪:

i = 0 
for files in info['files']: 
    if i < max_files: 
     # ... 
    else: 
     break 

相反,你可能只是這樣做:

for file in info['files'][:max_files]: 
    # ... 

我還注意到,你解析的文件剛好足以輸出所有漂亮的數據。你可能想把它放到合適的結構中。例如,有Torrent,PieceFile類。

+0

謝謝大家 - 我同意你的所有意見。我想我更關注結果而不是風格。公平不夠,再次感謝。 –

4

下面的代碼片段:

i = 0 
while i < torrent_pieces: 
    pieceDict[i] = root['info']['pieces'][(20*i):(20*i)+20] 
    i += 1 

應改爲:

for i in range(torrent_pieces): 
    pieceDict[i] = root['info']['pieces'][(20*i):(20*i)+20] 

這可能是他們希望看到的那種東西。一般來說,Python代碼不應該在for循環中非常需要顯式索引變量操作。

+0

同意。這肯定是我學習Python的一個薄弱環節。學習經驗... –