library(data.table)
df <- structure(list(
continuousNumericOne = c(3.82495116149284, 0.915662542284416, 0.751001771620762, NA, NA, 8.07583989184169, 4.57303752008246, 4.02747047825306, 2.79953011697721, 4.28614794390785),
catagoricalFactorOne = structure(c(3L, 3L, 3L, NA, 3L, NA, 2L, 2L, 2L, NA), .Label = c("blue", "green", "red"), class = "factor"),
continuousNumericTwo = c(NA, NA, 2.58285715825289, -2.71316582700148, 3.95645652249594, 1.96862094118233, 4.96960533647993, 6.15199683070215, 3.98091405116921, NA),
catagoricalFactorTwo = structure(c(3L, 3L, 3L, NA, 3L, 3L, 2L, 2L, 2L, 1L), .Label = c("blue", "orange", "red"), class = "factor"),
continuousNumericThree = c(3.43332616062442, 2.21448227693603, 2.31889349781533, NA, NA, 3.57539465909581, 3.28076535012702, NA, 3.15063300766727, 2.9556632429251),
continuousNumericFour = c(7.77131807052585, NA, 6.5830522592014, NA, 7.36003333388333, 8.25217350122047, 7.18282902739316, 8.60641407074177, 4.87689328481095, NA)),
.Names = c("continuousNumericOne", "catagoricalFactorOne", "continuousFactorTwo", "catagoricalFactorTwo", "continuousNumericThree", "continuousNumericFour"),
row.names = c(NA, -10L),
class = c("data.table", "data.frame"))
> df
    continuousNumericOne catagoricalFactorOne continuousFactorTwo catagoricalFactorTwo continuousNumericThree continuousNumericFour
 1:            3.8249512                  red                  NA                  red               3.433326              7.771318
 2:            0.9156625                  red                  NA                  red               2.214482                    NA
 3:            0.7510018                  red            2.582857                  red               2.318893              6.583052
 4:                   NA                   NA           -2.713166                   NA                     NA                    NA
 5:                   NA                  red            3.956457                  red                     NA              7.360033
 6:            8.0758399                   NA            1.968621                  red               3.575395              8.252174
 7:            4.5730375                green            4.969605               orange               3.280765              7.182829
 8:            4.0274705                green            6.151997               orange                     NA              8.606414
 9:            2.7995301                green            3.980914               orange               3.150633              4.876893
10:            4.2861479                   NA                  NA                 blue               2.955663                    NA
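Best way to process column data in R 3 with a custom function: how can one write a custom function that processes the data as follows?
If a column is categorical (a factor), replace every NA with a 'blank'.
If a column is continuous (numeric), allow extra flexibility to process the data further, for example first scale the data from 0 to 1, then replace NA if needed, perhaps with -1.1.
I have already spent a lot of time working with lists, trying to keep track of the column names and of whether a given column is a factor or not, and trying to apply different functions through the apply family, still with no luck.
If there is a better way, I'm all ears.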
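The numeric rule in the question (scale to the range 0 to 1, then replace NA, perhaps with -1.1) could be sketched as below. This is only a minimal illustration: the function name `rescale01` and its `na_fill` argument are hypothetical, and min-max rescaling is assumed to be what "scale from 0 to 1" means here.

```r
# Hypothetical helper (not from the question): min-max rescale a numeric
# vector to [0, 1], then replace missing values with a fill value.
rescale01 <- function(x, na_fill = -1.1) {
  rng  <- range(x, na.rm = TRUE)   # min and max, ignoring NA
  span <- rng[2] - rng[1]
  # A constant column has zero span; map its non-missing values to 0.
  scaled <- if (span == 0) x - rng[1] else (x - rng[1]) / span
  scaled[is.na(scaled)] <- na_fill # NA stays NA through the arithmetic above
  scaled
}

rescale01(c(2, NA, 4, 6))
```

Note that base R's `scale()` centers and standardizes a column rather than mapping it onto 0 to 1, so a helper of this kind is one way to get the range the question asks for.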
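What if there are other columns that are neither factor nor numeric? `process.default <- function(x) x`? – Frank
`process.default` is a catch-all method for objects that have no `process.foo` method created for them. You can write `process.character`, `process.raw`, or whatever else you need, and `process` itself can stay as a call to `UseMethod`. Edit: added `process.default` to the answer, since that is the right thing to do. –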
So you may have to use something like `process.factor <- function(x) { levels(x) <- c(levels(x), ""); x[is.na(x)] <- ""; x }` and `process.numeric <- function(x) { x <- c(scale(x)); x[is.na(x)] <- -1.1; x }` – akrun
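Put together, the S3 approach discussed in these comments might look like the sketch below. It is not a definitive implementation: the method name `process.numeric` and the data frame `toy` with its columns are assumptions made for illustration, and a plain `data.frame` is used so the example runs in base R.

```r
# S3 generic: dispatches on the class of the column it receives.
process <- function(x, ...) UseMethod("process")

# Factor columns: add "" as a level, then use it for missing values.
process.factor <- function(x, ...) {
  levels(x) <- c(levels(x), "")
  x[is.na(x)] <- ""
  x
}

# Numeric columns: standardize with scale(), then replace NA with -1.1.
process.numeric <- function(x, ...) {
  x <- c(scale(x))        # c() drops the matrix attributes scale() returns
  x[is.na(x)] <- -1.1
  x
}

# Catch-all: any other column type is returned unchanged.
process.default <- function(x, ...) x

# Hypothetical toy data with one column of each kind.
toy <- data.frame(
  num = c(1, NA, 3),
  fac = factor(c("red", NA, "blue")),
  chr = c("a", "b", NA),
  stringsAsFactors = FALSE
)

# Apply the generic column by column, keeping the data.frame structure.
toy[] <- lapply(toy, process)
toy
```

With a data.table such as `df` from the question, the same generic could be applied with `df[, lapply(.SD, process)]`. Dispatching on class avoids tracking by hand which column names are factors.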