2017-07-31 72 views
0

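How to extract a single element from a data frame using magrittr?

This is probably a simple question, but I cannot figure out the answer. Consider a simple data frame: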

library(dplyr) 
library(purrr) 
library(magrittr) 
dataframe <- data_frame(id = c(1,2,3,4), 
         text = c("this is a this", "this is another",'hello','what???')) 

> dataframe 
# A tibble: 4 x 2 
     id            text 
  <dbl>           <chr> 
1     1  this is a this 
2     2 this is another 
3     3           hello 
4     4         what??? 

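Here, I would like to write a pipe expression that extracts the element in row 4 of the column text: what???

I tried

dataframe %>% pull(text)[[4]] 

but it does not work. What can I do here?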

+2

Well, you can always do something like 'dataframe %>% pull(text) %>% .[4]'. –

+0

@AndreyKolyadin That works! Thanks, but how are people supposed to know these things? Where is it documented? –

+1

Or maybe 'dataframe %>% pull(text) %>% last()' – Sotos
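
One reading of why the original attempt fails, offered as a sketch rather than as documented behaviour: in R, indexing binds more tightly than %>%, so 'pull(text)[[4]]' as a whole becomes the right-hand side of the pipe, and 'pull(text)' is evaluated without the data frame. The example below assumes dplyr and magrittr are installed and uses tibble() in place of data_frame(); the names attempt, fixed and fixed_parens are illustrative only.

```r
library(dplyr)
library(magrittr)

# Same data as in the question; tibble() is used here instead of data_frame().
dataframe <- tibble(id = c(1, 2, 3, 4),
                    text = c("this is a this", "this is another", "hello", "what???"))

# The original attempt, wrapped in tryCatch() so the script keeps running.
attempt <- tryCatch(dataframe %>% pull(text)[[4]], error = function(e) e)
inherits(attempt, "error")

# Braces make the whole expression the pipe target, with `.` standing for the data.
fixed <- dataframe %>% { pull(., text)[[4]] }

# Alternatively, finish the pipe first and index the result afterwards.
fixed_parens <- (dataframe %>% pull(text))[[4]]
```

Both fixed and fixed_parens hold the fourth element of the text column.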

Answers

3

This works:

dataframe %>% select(text) %>% unlist() %>% .[4] 

Edit:

Not that it really matters here, but there are faster options (from Moody's list):

library(microbenchmark) 
microbenchmark(
    dataframe %$% text[4], 
    dataframe %>% {.$text[4]}, 
    dataframe %>% .[[4,"text"]], 
    dataframe %>% `[[`(4,"text"), 
    dataframe %>% extract2(4,"text"), 
    dataframe %$% text %>% extract(4), 
    dataframe %>% extract2("text") %>% extract(4), 
    dataframe %>% use_series(text) %>% extract(4), 
    dataframe %>% pull(text) %>% .[4], # @andrey-kolyadin in the comments 
    dataframe %>% select(text) %>% unlist() %>% .[4], # @stackTon's solution 
    dataframe %>% filter(row_number() == 4) %>% pull(text) # Aramis7d's solution 
) 

Unit: microseconds 
                                                  expr      min        lq       mean    median        uq      max neval
                                 dataframe %$% text[4]   49.014   58.0065   74.18069   66.8210   76.5185  256.353   100
                           dataframe %>% { .$text[4] }   92.739  102.7880  119.06888  112.6615  124.1220  290.205   100
                          dataframe %>% .[[4, "text"]]   65.235   70.5240   90.02727   79.5155   92.9155  344.507   100
                         dataframe %>% `[[`(4, "text")   69.466   76.8710   93.45829   85.6865  101.0250  224.618   100
                     dataframe %>% extract2(4, "text")   68.761   77.4005   90.49983   82.6890   99.6150  166.789   100
                     dataframe %$% text %>% extract(4)   81.455   87.6255  108.64541   99.9675  116.3640  332.519   100
         dataframe %>% extract2("text") %>% extract(4)   98.733  106.8440  120.75439  114.6010  125.3560  256.000   100
         dataframe %>% use_series(text) %>% extract(4)  137.521  147.3940  165.11001  156.7390  172.0780  409.741   100
                     dataframe %>% pull(text) %>% .[4] 1984.177 2042.0055 2189.99915 2076.0335 2172.6505 5512.815   100
      dataframe %>% select(text) %>% unlist() %>% .[4] 3241.256 3362.9095 3644.73124 3425.4990 3567.9555 8855.978   100
dataframe %>% filter(row_number() == 4) %>% pull(text) 3542.039 3635.4820 3941.44085 3767.7140 3980.3415 8704.705   100

I like this one (not in the list):

dataframe %>% .$text %>% .[4] 

Mean: 162 microseconds.

+0

Thanks! Do you know where the '.[4]' syntax is documented? –

+1

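'.' is just a placeholder for 'dataframe' as it is passed through the pipe. You could write the code above like this and get the same result: 'dataframe %>% select(., text) %>% unlist(.) %>% .[4]'. The final bracket is just standard R subsetting. – Tunn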
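
A small sketch of the point made in this comment, assuming dplyr and magrittr are installed and using tibble() in place of data_frame(). It compares the pipeline with the implicit placeholder, the same pipeline with every '.' written out, and the equivalent nested base-style call; the names implicit, explicit and nested are illustrative only.

```r
library(dplyr)
library(magrittr)

dataframe <- tibble(id = c(1, 2, 3, 4),
                    text = c("this is a this", "this is another", "hello", "what???"))

# The pipe passes the left-hand side as the first argument implicitly...
implicit <- dataframe %>% select(text) %>% unlist() %>% .[4]

# ...which is the same as writing the `.` placeholder out at every step.
explicit <- dataframe %>% select(., text) %>% unlist(.) %>% .[4]

# The trailing .[4] is ordinary subsetting applied to the piped value.
nested <- unlist(select(dataframe, text))[4]

identical(implicit, explicit)
identical(implicit, nested)
```

Note that unlist() on a one-column data frame returns a named vector, so the result carries a name in addition to the value; unname() drops it if only the string is wanted.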

3

You can try:

dataframe %>% 
    filter(row_number() == 4) %>% 
    pull(text) 
+1

Nice trick, too. –

1

For a magrittr-only solution, you want:

dataframe %>% magrittr::use_series(text) %>% magrittr::extract(4) 
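
As a hedged illustration of why this works: use_series() and extract() are magrittr's aliases for `$` and `[`, so the pipeline should match plain base subsetting. The sketch below needs only magrittr and uses a base data.frame for the same data; via_alias and via_base are illustrative names.

```r
library(magrittr)

# Base data.frame version of the question's data, so that only magrittr is needed.
dataframe <- data.frame(id = c(1, 2, 3, 4),
                        text = c("this is a this", "this is another", "hello", "what???"),
                        stringsAsFactors = FALSE)

# use_series(text) plays the role of `$text`, extract(4) the role of `[4]`.
via_alias <- dataframe %>% magrittr::use_series(text) %>% magrittr::extract(4)
via_base  <- dataframe$text[4]

identical(via_alias, via_base)
```

Writing the magrittr:: prefix, as the answer does, avoids picking up an extract() from another attached package such as tidyr.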
1

Some short possibilities:

dataframe %$% text[4] 
dataframe %>% {.$text[4]} 
dataframe %>% .[[4,"text"]] 
dataframe %>% `[[`(4,"text") 

Or these, if you want to use only magrittr aliases:

dataframe %>% extract2(4,"text") 
dataframe %$% text %>% extract(4) 
dataframe %>% extract2("text") %>% extract(4) 
dataframe %>% use_series(text) %>% extract(4) # @Brian's solution 

The other proposed solutions are not pure magrittr (they use dplyr):

dataframe %>% pull(text) %>% .[4] # @andrey-kolyadin in the comments 
dataframe %>% select(text) %>% unlist() %>% .[4] # @stackTon's solution 
dataframe %>% filter(row_number() == 4) %>% pull(text) # Aramis7d's solution 
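
A quick consistency check, sketched under the assumption that dplyr and magrittr are installed and with tibble() standing in for data_frame(): it runs a selection of the variants listed in this answer and confirms that each one yields the same string. The names results and all_same are illustrative; unname() is added to the unlist() variant because that one returns a named vector.

```r
library(dplyr)
library(magrittr)

dataframe <- tibble(id = c(1, 2, 3, 4),
                    text = c("this is a this", "this is another", "hello", "what???"))

results <- list(
  dataframe %$% text[4],
  dataframe %>% {.$text[4]},
  dataframe %>% .[[4, "text"]],
  dataframe %>% magrittr::extract2(4, "text"),
  dataframe %>% magrittr::use_series(text) %>% magrittr::extract(4),
  dataframe %>% pull(text) %>% .[4],
  dataframe %>% select(text) %>% unlist() %>% unname() %>% .[4],
  dataframe %>% filter(row_number() == 4) %>% pull(text)
)

# TRUE when every variant returns exactly the string "what???".
all_same <- all(vapply(results, identical, logical(1), "what???"))
all_same
```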