2012-09-12
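Parsing a data frame back into the "messy" API structure

I fetch data from an online database (REDCap) through its API, and the data arrive as a comma-separated string like this: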

RAW.API <- structure("id,event_arm,name,dob,pushed_text,pushed_calc,complete\n\"01\",\"event_1_arm_1\",\"John\",\"1979-05-01\",\"\",\"\",2\n\"01\",\"event_2_arm_1\",\"John\",\"2012-09-02\",\"abc\",\"123\",1\n\"01\",\"event_3_arm_1\",\"John\",\"2012-09-10\",\"\",\"\",2\n\"02\",\"event_1_arm_1\",\"Mary\",\"1951-09-10\",\"def\",\"456\",2\n\"02\",\"event_2_arm_1\",\"Mary\",\"1978-09-12\",\"\",\"\",2\n", "`Content-Type`" = structure(c("text/html", "utf-8"), .Names = c("", "charset"))) 

I have this script, which parses it nicely into a data frame:

(df <- read.table(file = textConnection(RAW.API), header = TRUE,
                  sep = ",", na.strings = "", stringsAsFactors = FALSE))
  id     event_arm name        dob pushed_text pushed_calc complete
1  1 event_1_arm_1 John 1979-05-01        <NA>          NA        2
2  1 event_2_arm_1 John 2012-09-02         abc         123        1
3  1 event_3_arm_1 John 2012-09-10        <NA>          NA        2
4  2 event_1_arm_1 Mary 1951-09-10         def         456        2
5  2 event_2_arm_1 Mary 1978-09-12        <NA>          NA        2
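
As an aside on why the round trip is not automatic: `read.table()` guesses column types, so a quoted `"01"` becomes the integer `1` and empty strings become `NA`. The sketch below shows this on a tiny made-up string; `tiny`, `parsed`, and the `value` column are hypothetical and not part of the API data.

```r
## Hypothetical two-row sample in the same quoted, comma-separated style
tiny <- "id,value\n\"01\",\"\"\n\"02\",\"7\"\n"

parsed <- read.table(text = tiny, header = TRUE, sep = ",",
                     na.strings = "", stringsAsFactors = FALSE)

## Inspect the guessed column classes and the parsed values:
## the leading zero in `id` and the empty string in `value` are gone
sapply(parsed, class)
parsed
```

Anything that writes the data back out therefore has to restore both the zero padding and the empty strings.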

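I then do some calculations and write the results to pushed_text and pushed_calc. After that I need to format the data back into the messy comma-separated structure it came in.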

I imagine something like this:

API.back <- `some magic command`(df, ...) 

identical(RAW.API, API.back) 
[1] TRUE 

That is, some command that can take the data frame I made, df, and format it back into the structure the original API object, RAW.API, came in.

Any help would be much appreciated.

Answer

This seems to work:

some_magic <- function(df) {
    ## Replace NA with "", converting column types as needed
    df[] <- lapply(df, function(X) {
        if (any(is.na(X))) {X[is.na(X)] <- ""; X} else {X}
    })

    ## Print integers in first column as 2-digit character strings
    ## (DO NOTE: Hardwiring the number of printed digits here is probably
    ## inadvisable, though needed to _exactly_ reconstitute RAW.API.)
    df[[1]] <- sprintf("%02.0f", df[[1]])

    ## Separately build header and table body, then suture them together
    l1 <- paste(names(df), collapse = ",")
    l2 <- capture.output(write.table(df, sep = ",", col.names = FALSE,
                                     row.names = FALSE))
    out <- paste0(c(l1, l2, ""), collapse = "\n")

    ## Reattach attributes
    att <- list("`Content-Type`" = structure(c("text/html", "utf-8"),
                                             .Names = c("", "charset")))
    attributes(out) <- att
    out
}

identical(some_magic(df), RAW.API) 
# [1] TRUE 
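
If hardwiring the two-digit width is a concern, one possible variation (a sketch, not part of the answer above) is to keep `id` as character when parsing, so that `"01"` never loses its leading zero, and to copy the attributes from the original object instead of retyping them. The names `df2`, `to_api_string`, and its `template` argument are hypothetical, introduced only for this illustration.

```r
RAW.API <- structure("id,event_arm,name,dob,pushed_text,pushed_calc,complete\n\"01\",\"event_1_arm_1\",\"John\",\"1979-05-01\",\"\",\"\",2\n\"01\",\"event_2_arm_1\",\"John\",\"2012-09-02\",\"abc\",\"123\",1\n\"01\",\"event_3_arm_1\",\"John\",\"2012-09-10\",\"\",\"\",2\n\"02\",\"event_1_arm_1\",\"Mary\",\"1951-09-10\",\"def\",\"456\",2\n\"02\",\"event_2_arm_1\",\"Mary\",\"1978-09-12\",\"\",\"\",2\n",
                     "`Content-Type`" = structure(c("text/html", "utf-8"),
                                                  .Names = c("", "charset")))

## Read `id` as character so "01" is kept as-is; other columns are guessed
df2 <- read.table(text = RAW.API, header = TRUE, sep = ",",
                  na.strings = "", stringsAsFactors = FALSE,
                  colClasses = c(id = "character"))

## Hypothetical helper: `template` is the original API string whose
## attributes are copied onto the rebuilt string
to_api_string <- function(df, template) {
    ## Replace NA with "" (columns containing NA become character)
    df[] <- lapply(df, function(x) { if (anyNA(x)) x[is.na(x)] <- ""; x })

    ## Header line, then the quoted table body, then a trailing newline
    header <- paste(names(df), collapse = ",")
    body <- capture.output(write.table(df, sep = ",", col.names = FALSE,
                                       row.names = FALSE))
    out <- paste0(c(header, body, ""), collapse = "\n")

    ## Reuse whatever attributes the original object carried
    attributes(out) <- attributes(template)
    out
}

identical(to_api_string(df2, RAW.API), RAW.API)
```

This relies on `write.table()` quoting character columns and leaving numeric ones bare. If the calculations turn `complete` into a character column, it would come out quoted and no longer match the original.
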

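I'm very impressed by the magic, but something seems to be missing. It looks as if a '}' or a quote character got lost somewhere. I'm currently trying to figure it out.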

@EricFail - That was it! It's fixed now and should work for you too.