2017-07-16 70 views
2

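Unable to extract the body of an email file in Python

I am reading email files stored on my computer. I can extract the email headers, but I cannot extract the body.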

    # The following part works: it opens a file and reads the headers.
    import email
    from email.parser import HeaderParser

    # passedArgument1 and filename are set earlier in the script.
    with open(passedArgument1 + filename, "r", encoding="ISO-8859-1") as f:
        msg = email.message_from_file(f)
        print('message', msg.as_string())
        parser = HeaderParser()
        h = parser.parsestr(msg.as_string())
        print(h.keys())

        # The following line raises an error
        msgBody = msg.get_body('text/plain')

Is there a proper way to extract only the message body? I am stuck at this point.

FYI, the email file can be downloaded from:

https://drive.google.com/file/d/0B3XlF206d5UrOW5xZ3FmV3M3Rzg/view

+0

What is the error message? – Fabien

+0

AttributeError: 'Message' object has no attribute 'get_body' – Sumanth

+0

It seems this method does not exist. Did you check the documentation? – Fabien

Answers

4

Update

If the error you are getting is AttributeError: 'Message' object has no attribute 'get_body', you may want to read what follows.

I ran some tests, and the documentation does appear to be wrong compared with the current library implementation (July 2017).

The function you are probably looking for is get_payload(), which seems to do what you want to achieve:

The conceptual model provided by an EmailMessage object is that of an ordered dictionary of headers coupled with a payload that represents the RFC 5322 body of the message, which might be a list of sub-EmailMessage objects

get_payload() is not in the current (July 2017) documentation, but help() says the following:

get_payload(i=None, decode=False) method of email.message.Message instance 
    Return a reference to the payload. 

The payload will either be a list object or a string. If you mutate the list object, you modify the message's payload in place. Optional i returns that index into the payload.

Optional decode is a flag indicating whether the payload should be decoded or not, according to the Content-Transfer-Encoding header (default is False).

When True and the message is not a multipart, the payload will be decoded if this header's value is 'quoted-printable' or 'base64'. If some other encoding is used, or the header is missing, or if the payload has bogus data (i.e. bogus base64 or uuencoded data), the payload is returned as-is.

If the message is a multipart and the decode flag is True , then None is returned.
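
As a rough illustration of the help text above, here is a minimal sketch that walks a message and returns the first text/plain payload as a string. The sample message and the helper name `extract_plain_body` are assumptions made for this example; they are not taken from the asker's file.

```python
import email

# Hypothetical sample message (not the asker's file); the body is
# "Hello, world!" encoded as base64.
RAW = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Test\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=\"utf-8\"\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGVsbG8sIHdvcmxkIQ==\n"
)


def extract_plain_body(msg):
    """Return the first text/plain part of msg as str, or None if there is none."""
    # walk() yields the message itself and, for multipart messages, every sub-part.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            # decode=True undoes base64 / quoted-printable and returns bytes.
            raw = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return raw.decode(charset, errors="replace")
    return None


msg = email.message_from_string(RAW)
body = extract_plain_body(msg)
print(body)
```

Walking the parts avoids the case described above where `get_payload(decode=True)` returns None for a multipart container, because only leaf parts of type text/plain are decoded.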

+0

Let me try it quickly and get back to you. – Sumanth

+0

It gives me the following message: msgBody = msg.EmailMessage.get_body('text/plain') AttributeError: 'Message' object has no attribute 'EmailMessage' – Sumanth

+0

Yes, get_payload() works. Thank you for your answer. – Sumanth

3

In Python 3.6, the email library defaults to an API that is compatible with Python 3.2, and that is what causes this problem.

Note the default policy in the following declaration from the documentation:

email.message_from_file(fp, _class=None, *, policy=policy.compat32)
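
The effect of that default can be checked with a small sketch like the one below. It parses the same text twice, once with the default policy and once with `policy.default`, and prints which class comes back and whether it has `get_body`. The sample message is a made-up assumption, and the sketch assumes Python 3.6 or later.

```python
import email
from email import policy

# Hypothetical two-line sample message used only for illustration.
RAW = "Subject: Test\n\nHello\n"

# No policy argument: the parser falls back to policy.compat32.
legacy = email.message_from_string(RAW)

# Explicit modern policy.
modern = email.message_from_string(RAW, policy=policy.default)

print(type(legacy).__name__, hasattr(legacy, "get_body"))
print(type(modern).__name__, hasattr(modern, "get_body"))
```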

If you want to use the "new" API that you see in the 3.6 documentation, you have to create the message with a different policy.

    import email
    from email import policy
    ...
    msg = email.message_from_file(f, policy=policy.default)

This gives you the new API that you see in the documentation, including the very useful get_body() method.
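
A minimal sketch of how get_body() might then be used, assuming Python 3.6 or later. The multipart sample message and its boundary string are invented for this example. Note that `get_body()` takes a `preferencelist` of subtype names such as `'plain'` and `'html'`, not a full MIME type like `'text/plain'`.

```python
import email
from email import policy

# Hypothetical multipart/alternative message with a plain and an HTML part.
RAW = (
    "From: alice@example.com\n"
    "Subject: Test\n"
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/alternative; boundary=\"BOUND\"\n"
    "\n"
    "--BOUND\n"
    "Content-Type: text/plain; charset=\"utf-8\"\n"
    "\n"
    "Hello in plain text\n"
    "--BOUND\n"
    "Content-Type: text/html; charset=\"utf-8\"\n"
    "\n"
    "<p>Hello in HTML</p>\n"
    "--BOUND--\n"
)

msg = email.message_from_string(RAW, policy=policy.default)

# Ask only for a text/plain body; get_body() returns None if there is none.
part = msg.get_body(preferencelist=("plain",))

# get_content() decodes the transfer encoding and charset and returns str.
text = part.get_content() if part is not None else None
print(text)
```

For a file on disk, the same calls should apply to the object returned by `email.message_from_file(f, policy=policy.default)`.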
