使用mail
紅寶石我得到這個消息從電子郵件正文解析阿拉伯語:UndefinedConversionError試圖
mail.rb:22:in `encode': "\xC7" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)
from mail.rb:22:in `<main>'
如果我刪除編碼,我得到一個消息紅寶石
/var/lib/gems/1.9.1/gems/bson-1.7.0/lib/bson/bson_ruby.rb:63:in `rescue in to_utf8_binary': String not valid utf-8: "<div dir=\"ltr\"><div class=\"gmail_quote\">l<br><br><br><div dir=\"ltr\"><div class=\"gmail_quote\"><br><br><br><div dir=\"ltr\"><div class=\"gmail_quote\"><br><br><br><div dir=\"ltr\"><div dir=\"rtl\">\xC7\xE1\xE4\xD5 \xC8\xC7\xE1\xE1\xDB\xC9 \xC7\xE1\xDA\xD1\xC8\xED\xC9</div></div>\r\n</div><br></div>\r\n</div><br></div>\r\n</div><br></div>" (BSON::InvalidStringEncoding)
這是我的代碼:
require 'mail'
require 'mongo'
connection = Mongo::Connection.new
db = connection.db("DB")
db = Mongo::Connection.new.db("DB")
newsCollection = db["news"]
Mail.defaults do
retriever_method :pop3, :address => "pop.gmail.com",
:port => 995,
:user_name => 'my_username',
:password => '*****',
:enable_ssl => true
end
emails = Mail.last
#Checks if email is multipart and decods accordingly. Put to extract UTF8 from body
plain_part = emails.multipart? ? (emails.text_part ? emails.text_part.body.decoded : nil) : emails.body.decoded
html_part = emails.html_part ? emails.html_part.body.decoded : nil
mongoMessage = {"date" => emails.date.to_s , "subject" => emails.subject , "body" => plain_part.encode('UTF-8') }
msgID = newsCollection.insert(mongoMessage) #add the document to the database and returns it's ID
puts msgID
對於英語和希伯來語它的作品完美,但它似乎gmail發送阿拉伯語w不同的編碼。用ASCII-8BIT
代替UTF-8
給出了類似的錯誤。
對於普通電子郵件使用plain_part
時,我會得到相同的結果。我正在處理來自一個特定來源的電子郵件,所以我可以放心地將html_part置於不引起錯誤的位置。 爲了讓它更怪異阿拉伯語的主題完美呈現。 我應該使用什麼編碼?
謝謝,但輸出沒有變化。仍輸出base64或MARK-8 – raam86