2012-04-22

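Downloading a subfolder from Gmail

I have a problem with code I found in the comments on this blog, posted there as an extension of the code given in the post.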

mailSoc <- function(login, 
        pass, 
        serv = "imap.gmail.com", #specify IMAP server 
        ntore = 50, #ignore if addressed to more than 
        todow = -1, #how many to download 
        begin = -1, #from which to start 
        folder = ''){ #folder to download (default:inbox) 

    #load rJython and Python libraries 
    require(rJython) 
    rJython <- rJython(modules = "imaplib") 
    rJython$exec("import imaplib") 

    #connect to server 
    rJython$exec(paste("mymail = imaplib.IMAP4_SSL('", 
        serv, "')", sep = "")) 
    rJython$exec(paste("mymail.login(\'", 
        login, "\',\'", 
        pass, "\')", sep = "")) 

    #get number of available messages 
    rJython$exec(paste("sel = mymail.select(\"", folder,"\")", sep="")) 
    rJython$exec("number = sel[1]") 
    nofmsg <- .jstrVal(rJython$get("number")) 
    nofmsg <- as.numeric(unlist(strsplit(nofmsg, "'"))[2]) 

    #if 'begin' not specified begin from the newest 
    if(begin == -1) 
    { 
    begin <- nofmsg 
    } 

    #if 'todow' not specified download all 
    if(todow == -1) 
    { 
    end <- 1 
    } 
    else 
    { 
    end <- begin - todow 
    } 

    #give a little bit of information 
    todownload <- begin - end 
    print(paste("Found", nofmsg, "emails")) 
    print(paste("I will download", todownload, "messages.")) 
    print("It can take a while") 

    data <- data.frame() 

    #fetching emails 
    for (i in begin:end) { 
    nr <- as.character(i) 

    #get sender 
    rJython$exec(paste("typ, fro = mymail.fetch(\'", nr, "\', \'(BODY[HEADER.FIELDS (from)])\')", sep = "")) 
    rJython$exec("fro = fro[0][1]") 
    from <- .jstrVal(rJython$get("fro")) 
    from <- unlist(strsplit(from, "[\r\n, \"]")) 
    from <- sub("from: ", "", from, ignore.case = TRUE) 
    from <- grep("@", from, value = TRUE) 

    #get addressees 
    rJython$exec(paste("typ, to = mymail.fetch(\'", nr, "\', \'(BODY[HEADER.FIELDS (to)])\')", sep = "")) 
    rJython$exec("to = to[0][1]") 
    to <- .jstrVal(rJython$get("to")) 
    to <- unlist(strsplit(to, "[\r\n, \"]")) 
    to <- sub("to: ", "", to, ignore.case = TRUE) 
    from <- sub("\"", "", from, ignore.case = TRUE) 
    to <- grep("@", to, value = TRUE) 

    #get dates: 
    rJython$exec(paste("typ, date = mymail.fetch(\'", nr, "\', \'(BODY[HEADER.FIELDS (date)])\')", sep = "")) 
    rJython$exec("date = date[0][1]") 
    date <- .jstrVal(rJython$get("date")) 

    #add to data frame 
    #vec <- rep(from, length(to)) 
    if(length(to)==0) 
    to <- 'NA' 
    if(length(from)==0) 
    from <- 'NA' 
    data <- rbind(data, data.frame(from, to, date)) 

    #give some information about progress 
    #print(i) 
    if((i - begin) %% 100 == 0) 
    { 
     print(paste((i - begin)*(-1), "/", todownload, 
        " Downloading...", sep = "")) 
    } 
    } 
    names(data) <- c("from", "to", "date") 
    data$from <- tolower(data$from) 
    data$to <- tolower(data$to) 

    #close connection 
    rJython$exec("mymail.shutdown()") 
    return(data) 
} 

After I specify the folder from which I want to download my emails

maild <- mailSoc("login", "password", serv = "imap.gmail.com", 
       ntore = 20, todow = 200, folder='anywhere') 

I get this error message:

[1] "Found NA emails" 
[1] "I will download NA messages." 
[1] "It can take a while" 
Error in begin:end : NA/NaN argument 
In addition: Warning message: 
In mailSoc("xyz", "xyz", serv = "imap.gmail.com", : 
  NAs introduced by coercion 

Do you know what I should do? I want to be able to choose which Gmail folder or subfolder to download from.
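The `NA` in the output comes from the line that parses `sel[1]`: when the folder cannot be selected, that value holds an error text rather than a count, and `as.numeric` silently turns it into `NA`. Below is a minimal sketch of a stricter parser. The helper name `parse_msg_count` is hypothetical (it is not in the blog code), and the failure text in the usage line is made up for illustration.

```r
# Hypothetical helper: extract the message count from the string form of
# sel[1], e.g. "['17976']", and fail loudly when it is not a number.
parse_msg_count <- function(reply) {
  # same split as in mailSoc: the count sits between the single quotes
  token <- unlist(strsplit(reply, "'"))[2]
  if (is.na(token) || !grepl("^[0-9]+$", token)) {
    stop("could not select folder; server replied: ", reply)
  }
  as.numeric(token)
}

parse_msg_count("['17976']")            # a successful select
# parse_msg_count("['Unknown Mailbox']")  # made-up failure text; raises an error
```

Inside `mailSoc` this could stand in for the `as.numeric(unlist(strsplit(...)))` line, so that a wrong folder name stops the function before the `begin:end` loop.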

Answer


I found a solution to my problem. What I needed to do was use

rJython$exec("sel = mymail.select('[Gmail]/All Mail')") 

in place of

rJython$exec(paste("sel = mymail.select(\"", folder,"\")", sep="")) 
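Hard-coding the folder name works, but it leaves the `folder` argument unused. Below is a minimal sketch of an alternative that keeps the argument. The helper `build_select_cmd` is hypothetical (not part of the blog code); it assumes that an empty `folder` should mean the inbox and that Gmail's system folders are addressed with the `[Gmail]/` prefix, as in the line above.

```r
# Hypothetical helper: build the Python statement that selects a folder.
build_select_cmd <- function(folder = "") {
  # assumption: an empty name means the default mailbox
  if (!nzchar(folder)) folder <- "INBOX"
  # escape single quotes so the name stays a valid Python string literal
  folder <- gsub("'", "\\'", folder, fixed = TRUE)
  paste0("sel = mymail.select('", folder, "')")
}

build_select_cmd("[Gmail]/All Mail")
build_select_cmd("")
```

Inside `mailSoc` the result would be passed on as `rJython$exec(build_select_cmd(folder))`. Whether the server accepts a given name still depends on the exact folder names in the account.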

But then I ran into another problem: I cannot download more than 2500 emails. Maybe you can find a solution to this one...

[1] "Found 17976 emails" 
[1] "I will download 2500 messages." 
[1] "It can take a while" 
[1] "0/2500 Downloading..." 
[1] "100/2500 Downloading..." 
[1] "200/2500 Downloading..." 
[1] "300/2500 Downloading..." 
....MORE LINES.... 
[1] "2400/2500 Downloading..." 
Error in data.frame(vec, to) : 
  arguments imply differing number of rows: 0, 1
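The error message says that one of the arguments passed to `data.frame` had zero rows, so at least one message produced an empty vector. A plausible cause, though the post does not confirm it, is a message whose From or To header contains no parsable address. Below is a minimal sketch of a guard against that case. The helpers `or_na` and `make_rows` are hypothetical names, and keeping only the first sender address is an assumption made for illustration.

```r
# Hypothetical helper: replace an empty vector with a single missing value
or_na <- function(x) if (length(x) == 0) NA_character_ else x

# Hypothetical helper: one row per addressee, never a zero-length column
make_rows <- function(from, to, date) {
  data.frame(from = or_na(from)[1],   # assumption: keep only the first sender
             to   = or_na(to),
             date = or_na(date)[1],
             stringsAsFactors = FALSE)
}

# a message with no parsable sender still yields a row
make_rows(character(0), "someone@example.com", "Date: Sun, 22 Apr 2012")
```

In the download loop this would be used as `data <- rbind(data, make_rows(from, to, date))`, so a single odd message no longer aborts the whole run.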