2016-07-17 19 views
4

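How to select columns in a data frame that share a specific name

I apologize for asking what is probably a very simple question. I have a data frame, which I have reproduced below.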

mydf <- structure(list(br.Id = c(1992.0001, 1992.0002, 1992.0003, 1992.0004, 
1992.0005, 1992.0006, 1992.0007, 1992.0008, 1992.0009, 1992.001 
), si.month = c(4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), br.day = c(23L, 
23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L), br.year = c(1992L, 
1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L 
), branch = 1:10, br.location = c(160170L, 160170L, 160170L, 160170L, 
160170L, 160170L, 160170L, 160170L, 160170L, 160170L), si.length = c(90L, 
128L, 112L, 68L, 56L, 58L, 111L, 111L, 115L, 65L), si.weight = c(9.3, 
32.5, 19, 4.4, 2.1, 2.8, 16.1, 17.9, 22.7, 3.4), si.sex = structure(c(2L, 
1L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 1L), .Label = c("female", "male", 
"unknown"), class = "factor"), maturity = structure(c(7L, 7L, 
7L, 7L, 10L, 7L, 7L, 7L, 7L, 2L), .Label = c("developing", "immature", 
"mature", "nearly.ripe", "nearly.spent", "recovering", "ripe", 
"running", "spent", "unknown", "yoy"), class = "factor"), age = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, 1L)), .Names = c("br.Id", "si.month", 
"br.day", "br.year", "branch", "br.location", "si.length", "si.weight", 
"si.sex", "maturity", "age"), row.names = c(NA, 10L), class = "data.frame") 

What I want to do is select the columns whose names share the "br" prefix. The output should look like this:

 br.Id br.day br.year branch br.location 
1 1992.000  23 1992  1  160170 
2 1992.000  23 1992  2  160170 
3 1992.000  23 1992  3  160170 
4 1992.000  23 1992  4  160170 
5 1992.001  23 1992  5  160170 
6 1992.001  23 1992  6  160170 
7 1992.001  23 1992  7  160170 
8 1992.001  23 1992  8  160170 
9 1992.001  23 1992  9  160170 
10 1992.001  23 1992  10  160170 

I thought grep might be able to pick out these columns, but I could not figure out how to use it. I appreciate any help.

Answers

4

Use this:

mydf[,grep(colnames(mydf),pattern="br.",fixed = TRUE)] 
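
One detail worth checking against the desired output: with fixed = TRUE the pattern "br." is treated as a literal string, so a name such as branch (no dot) is not matched, while the default regular-expression mode treats "." as "any character". The sketch below illustrates this on a toy data frame, df, that reuses the question's column names with made-up values; it is an illustration, not part of the original answer.

```r
# Toy data frame reusing the question's column names (values are made up).
df <- data.frame(br.Id = 1:2, si.month = 4L, br.day = 23L, br.year = 1992L,
                 branch = 1:2, br.location = 160170L, si.length = c(90L, 128L))

# fixed = TRUE: "br." is a literal string, so "branch" is NOT matched.
literal_cols <- grep("br.", colnames(df), fixed = TRUE, value = TRUE)

# Default (regex) mode: "." matches any character, so "branch" IS matched.
regex_cols <- grep("br.", colnames(df), value = TRUE)

print(literal_cols)
print(regex_cols)
```

If branch should be included, as in the question's expected output, dropping fixed = TRUE is one option.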
+0

Thanks, this helped me understand how to use grep on my data.

5

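Can you use this?
One way is to use the dplyr package. After loading it with library(dplyr), use select(), dplyr's function for choosing variables, together with the contains() helper to get the columns whose names contain a given string.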

library(dplyr)
mydf %>% select(contains("br")) 

#br.Id br.day br.year branch br.location 
#1 1992.000  23 1992  1  160170 
#2 1992.000  23 1992  2  160170 
#3 1992.000  23 1992  3  160170 
#4 1992.000  23 1992  4  160170 
#5 1992.001  23 1992  5  160170 
#6 1992.001  23 1992  6  160170 
#7 1992.001  23 1992  7  160170 
#8 1992.001  23 1992  8  160170 
#9 1992.001  23 1992  9  160170 
#10 1992.001  23 1992  10  160170 
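
Note that contains() matches a literal substring anywhere in the name. If only names that begin with the prefix are wanted, dplyr's starts_with() helper is a possible alternative. A minimal sketch, assuming a toy data frame df with a hypothetical extra column abbr.code that is not in the question's data:

```r
library(dplyr)

# Toy data; "abbr.code" is a hypothetical column added to show the difference.
df <- data.frame(br.Id = 1:2, si.month = 4L, branch = 1:2,
                 abbr.code = c("a", "b"))

# contains(): "br" anywhere in the name.
anywhere <- names(select(df, contains("br")))

# starts_with(): "br" only at the start of the name.
prefix <- names(select(df, starts_with("br")))

# Both helpers match literally, so "br." requires an actual dot.
prefix_dot <- names(select(df, starts_with("br.")))

print(anywhere)
print(prefix)
print(prefix_dot)
```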
+0

Thanks. I can't upvote your answer because I'm new here, but thank you for showing me something I couldn't even have imagined!

5

Or use grepl:

mydf[,grepl("br.",colnames(mydf))] 

Or use regexpr:

mydf[,regexpr("br.",colnames(mydf))>0] 

Or use str_detect from stringr:

library(stringr) 
mydf[,str_detect(colnames(mydf),"br.")] 
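
The base R variants above all build a logical mask over the column names. A small sketch on a toy data frame (df, with made-up values) shows that grepl and regexpr(...) > 0 agree, and illustrates one caveat of this indexing style: when the mask selects a single column, df[, mask] returns a plain vector unless drop = FALSE is passed.

```r
# Toy data frame reusing some of the question's column names (values made up).
df <- data.frame(br.Id = 1:2, si.month = 4L, br.day = 23L, branch = 1:2)

# Two ways to build the same logical mask over the column names.
mask_grepl   <- grepl("br.", colnames(df))
mask_regexpr <- regexpr("br.", colnames(df)) > 0
same_mask    <- all(mask_grepl == mask_regexpr)

# Only one column starts with "si", so the result collapses to a vector...
one_col <- df[, grepl("^si", colnames(df))]

# ...unless drop = FALSE is used to keep it as a data frame.
kept <- df[, grepl("^si", colnames(df)), drop = FALSE]

print(same_mask)
print(is.data.frame(one_col))
print(is.data.frame(kept))
```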
+1

I like regexpr, and I like your solution! – Learner

+0

@m0h3n Thanks. I can't upvote your answer because I'm new here, but thank you for showing me other ways.

2
mydf[, grep("^br.", names(mydf))] 
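
The "^" anchors the match to the start of the name, which matters when "br" can also appear mid-name. The sketch below compares the unanchored, anchored, and dot-escaped patterns on a toy data frame; the column abbr.code is hypothetical and only there to show the difference.

```r
# Toy data; "abbr.code" is a hypothetical column, not from the question.
df <- data.frame(br.Id = 1:2, branch = 1:2, abbr.code = c("a", "b"),
                 si.month = 4L)

# Unanchored: "br" followed by any character, anywhere in the name.
unanchored <- grep("br.", names(df), value = TRUE)

# Anchored: the name must start with "br" plus one more character.
anchored <- grep("^br.", names(df), value = TRUE)

# Anchored with an escaped dot: the name must start with a literal "br.".
escaped <- grep("^br\\.", names(df), value = TRUE)

print(unanchored)
print(anchored)
print(escaped)
```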
+0

Note that names is equivalent to colnames: http://stackoverflow.com/questions/24799153/what-is-the-difference-between-names-and-colnames – snoram

+2

'names()' is equivalent to 'colnames()' **for data frames**

+0

@RichardScriven Thanks. – snoram
