2015-10-16

Remove duplicate strings from a character vector

How can I efficiently remove duplicates from this character vector?

> dput(data[1:30]) 
c("AT2G27020 AT3G26340", "AT1G56450 AT3G26340", "AT1G13060 AT3G26340", 
"AT3G22630 AT3G26340", "AT3G22110 AT3G26340", "AT2G05840 AT3G26340", 
"AT1G47250 AT3G26340", "AT1G79210 AT3G26340", "AT2G27020 AT5G40580", 
"AT3G27430 AT5G40580", "AT4G31300 AT5G40580", "AT3G14290 AT5G40580", 
"AT3G22630 AT5G40580", "AT3G22110 AT5G40580", "AT5G35590 AT5G40580", 
"AT2G05840 AT5G40580", "AT3G60820 AT5G40580", "AT1G79210 AT5G40580", 
"AT2G27020 AT3G27430", "AT2G27020 AT4G31300", "AT1G53850 AT2G27020", 
"AT2G27020 AT5G66140", "AT2G27020 AT3G51260", "AT1G21720 AT2G27020", 
"AT1G56450 AT2G27020", "AT1G13060 AT2G27020", "AT2G27020 AT3G22630", 
"AT2G27020 AT4G14800", "AT2G27020 AT3G22110", "AT2G27020 AT5G35590" 
) 

I tried simple functions such as 'unique' and 'duplicated', but unfortunately they did not work.

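My mistake. By duplicates I mean identical AGIs, so it does not matter that some of them are stored together inside a single quoted string. I want each "ATXG..." to appear only once in my vector. At first I did not realize that the vector stores them as pairs... sorry.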

What exactly did you do (code), and what did not work? 'unique' and 'duplicated' work on the _entire_ string. Which "duplicates" do you want to remove? – hrbrmstr

Your example does not contain any duplicates... – Cath

Your strings have the format "text1 text2". Do you want to check whether the two values are equal, i.e. 'text1 == text2'? –

Answers

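# x is the character vector of space-separated pairs (called data in the question).
# Split each element on the space, flatten the resulting list into one vector,
# then keep only the first occurrence of each ID.
unique(unlist(strsplit(x, " ")))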
#[1] "AT2G27020" "AT3G26340" "AT1G56450" "AT1G13060" "AT3G22630" "AT3G22110" 
#[7] "AT2G05840" "AT1G47250" "AT1G79210" "AT5G40580" "AT3G27430" "AT4G31300" 
#[13] "AT3G14290" "AT5G35590" "AT3G60820" "AT1G53850" "AT5G66140" "AT3G51260" 
#[19] "AT1G21720" "AT4G14800"
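
To see why 'unique' on the whole strings drops nothing while splitting first does, here is a minimal sketch using four of the pairs from the question. The variable names pairs, ids and all_ids are made up for illustration, and fixed = TRUE is an optional addition (not part of the answer above) that treats the space as a literal string rather than a regular expression.

```r
# Four of the pairs from the question; `pairs` is a hypothetical name
pairs <- c("AT2G27020 AT3G26340", "AT1G56450 AT3G26340",
           "AT2G27020 AT5G40580", "AT1G56450 AT2G27020")

# Every whole string is distinct, so unique() on the vector itself removes nothing
length(unique(pairs)) == length(pairs)

# Split on a literal space, flatten the list, then deduplicate the single IDs
all_ids <- unlist(strsplit(pairs, " ", fixed = TRUE))
ids <- unique(all_ids)
ids

# duplicated() gives the same result through logical indexing
identical(ids, all_ids[!duplicated(all_ids)])
```

Both 'unique' and 'duplicated' keep the first occurrence of each value, so the IDs come back in the order in which they first appear in the vector.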