2016-11-29

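I'm reading the Python 3 docs, and I must be blind or something... where does it say how to get the body of a message? How do I get the body (or bodies) of a message from the Message object returned by email.parser.Parser?

What I want to do is open a message and loop over its text-based bodies, skipping binary attachments. Pseudocode: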

def read_all_bodies(local_email_file):
    email = Parser().parse(open(local_email_file, 'r'))
    for pseudo_body in email.pseudo_bodies:
        if pseudo_body.pseudo_is_binary():
            continue
        # Pseudo-parse the body here

How do I do this? Is the Message class even the right class? Isn't it only for headers?

Answer


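This is best done with two functions:

  1. One that opens the file. If the message is single-part, get_payload returns the message body as a string. If the message is multipart, it returns a list of sub-messages.
  2. A second one that handles the text/payload of each part.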
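To make the difference in the list above concrete, here is a minimal sketch of what is_multipart and get_payload return in each case. The two sample messages are made up for illustration and are not from the original question.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical single-part message: the payload is the body text itself.
single = message_from_string("Subject: hi\n\nplain body\n")
print(single.is_multipart())       # False
print(repr(single.get_payload()))  # a str holding the body

# Hypothetical multipart message: the payload is a list of sub-messages,
# each of which is a Message object with its own get_payload().
multi = MIMEMultipart()
multi.attach(MIMEText("first part"))
multi.attach(MIMEText("second part"))
print(multi.is_multipart())        # True
for part in multi.get_payload():
    print(part.get_content_type(), repr(part.get_payload()))
```

So the same Message class covers both headers and bodies; what changes is only the type of the payload.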

This is how it can be done:

from email.parser import Parser

def parse_file_bodies(filename):
    # Open the file and parse the email
    with open(filename, 'r') as f:
        email = Parser().parse(f)
    # For multipart emails, each part is handled in a loop
    if email.is_multipart():
        for msg in email.get_payload():
            parse_single_body(msg)
    else:
        # A single-part message is passed directly
        parse_single_body(email)

def parse_single_body(email):
    # decode=True undoes the transfer encoding and returns bytes
    # (or None if this part is itself multipart)
    payload = email.get_payload(decode=True)
    if payload is None:
        return
    # The bytes must be converted to a Python string. The charset
    # may vary from message to message; UTF-8 is assumed here.
    try:
        text = payload.decode("utf-8")
        # Now you can work with text as with any other string:
        ...
    except UnicodeDecodeError:
        print("Error: cannot parse message as UTF-8")
        return
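The code above only looks at the top-level parts and does not itself skip binary attachments. As a rough sketch of one way to get closer to what the question asks for, the variant below uses Message.walk() to visit nested parts and get_content_maintype() to skip anything that is not text. The helper name iter_text_bodies and the sample message are hypothetical, and falling back to UTF-8 when no charset is declared is an assumption.

```python
from email import message_from_string
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def iter_text_bodies(msg):
    """Yield the decoded text of every text/* part, skipping everything else."""
    for part in msg.walk():
        # Containers (multipart/*) have no body of their own.
        if part.is_multipart():
            continue
        # Skip binary parts such as images or application/* attachments.
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64/quoted-printable and returns bytes.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # Assumption: fall back to UTF-8 when the part declares no charset.
        charset = part.get_content_charset() or "utf-8"
        yield payload.decode(charset, errors="replace")


# Hypothetical sample: one text part plus one binary attachment.
sample = MIMEMultipart()
sample.attach(MIMEText("Hello, world", "plain", "utf-8"))
sample.attach(MIMEApplication(b"\x00\x01\x02"))

# Round-trip through a string to mimic a message parsed from a file.
parsed = message_from_string(sample.as_string())
bodies = list(iter_text_bodies(parsed))
print(bodies)
```

Note that a text file sent as an attachment (for example text/plain with a Content-Disposition of attachment) would still be yielded here; filtering on get_content_disposition() is one possible refinement if that matters.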