2012-11-16 24 views

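Remove mirrored rows from a data frame by comparing two columns

This is one of my particular nightmares when I try to merge the results of different gene expression analyses based on conditions over pairs of genes. Here is my merged data frame: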

knowngene1 Logfold1  Gene1 knowngene2 Logfold2  Gene2 
uc001ezv.3 5.1167021111 NA uc001ezu.1 5.6262305191 FLG 
uc001ihe.4 4.1338871783 LOC100216001 uc001ihg.3 3.9475325801 NA 
uc001iki.4 9.9902455211 CELF2 uc001ikn.2 9.3321964303 NA 
uc001ikk.2 10.3059806111 CELF2 uc001ikn.2 9.3321964303 NA 
uc001ikl.4 9.9890468379 CELF2 uc001ikn.2 9.3321964303 NA 
uc001ikn.2 9.8293484977 NA uc001iki.4 9.4401488053 CELF2 
uc001ikn.2 9.8293484977 NA uc001ikk.2 9.2887954663 CELF2 
uc001ikn.2 9.8293484977 NA uc001ikl.4 9.4401488053 CELF2 
uc001ikn.2 9.8293484977 NA uc010qbi.2 8.6399349792 CELF2 
uc001ikn.2 9.8293484977 NA uc010qbj.1 9.2887954663 CELF2 
uc001ezu.1 5.6262305191 FLG uc001ezv.3 5.1167021111 NA 
uc001ihg.3 3.9475325801 NA uc001ihe.4 4.1338871783 LOC100216001 
uc001iki.4 9.4401488053 CELF2 uc001ikn.2 9.8293484977 NA 
uc001ikk.2 9.2887954663 CELF2 uc001ikn.2 9.8293484977 NA 
uc001ikl.4 9.4401488053 CELF2 uc001ikn.2 9.8293484977 NA 
uc001ikn.2 9.3321964303 NA uc001iki.4 9.9902455211 CELF2 
uc001ikn.2 9.3321964303 NA uc001ikk.2 10.3059806111 CELF2 
uc001ikn.2 9.3321964303 NA uc001ikl.4 9.9890468379 CELF2 
uc001ikn.2 9.3321964303 NA uc010qbi.2 10.3865530025 CELF2 
uc001ikn.2 9.3321964303 NA uc010qbj.1 10.3072927485 CELF2 
uc001iot.1 6.9068905956 NA uc001iou.2 8.4040043896 VIM 
uc001iou.2 10.4420548632 VIM uc001iot.1 5.8235197903 NA 
uc001ipd.3 4.4693510978 ST8SIA6 uc001ipf.1 5.1931857169 NA 
uc001kgd.3 3.5469561781 NA uc009xts.3 4.0607448636 IFIT2 
uc001kgf.3 3.3975573789 IFIT3 uc001kgd.3 3.2512633588 NA 

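The problem is that the rows I want to remove are not exact duplicates. I want to remove the rows where the known-gene accessions are swapped between knowngene1 and knowngene2. Let me show an example. This first row is the one I want to keep: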

uc001ikn.2 9.8293484977 NA uc001iki.4 9.4401488053 CELF2 

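The next rows are, for me, the same row as the one above. In fact the first of them is a mirror image of the row I want to keep, even though the expression values differ slightly (they are more or less in the same range):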

uc001iki.4 9.4401488053 CELF2 uc001ikn.2 9.8293484977 NA 
uc001ikn.2 9.3321964303 NA uc001ikl.4 9.9890468379 CELF2 

So the idea is to keep only the first row I see and remove the ones that follow. Do you have any ideas?

Answers


So you want to remove all the rows where uc001ikn.2 shows up again? If so, I think this will work:

Rgames> foo 
    [,1] [,2] 
[1,] 1 7 
[2,] 2 8 
[3,] 3 9 
[4,] 2 3 
[5,] 4 1 
[6,] 3 10 
[7,] 5 11 
[8,] 6 12 
Rgames> foo[!duplicated(foo[,1]) & !(foo[,2] %in% foo[,1]),] 
    [,1] [,2] 
[1,] 1 7 
[2,] 2 8 
[3,] 3 9 
[4,] 5 11 
[5,] 6 12 

In your case, you would operate on the columns df$knowngene1 and df$knowngene2.
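
If "same row" is taken to mean the same unordered pair of accessions, one possible alternative is to build an order-independent key and keep the first row per key. This is only a sketch under that assumption: the small `df` below is a hypothetical three-row excerpt of the question's data, and `pair_key` and `dedup` are names made up for illustration.

```r
# Hypothetical excerpt of the merged data frame (three rows from the question).
df <- data.frame(
  knowngene1 = c("uc001ikn.2", "uc001iki.4", "uc001ikn.2"),
  Logfold1   = c(9.8293484977, 9.4401488053, 9.3321964303),
  knowngene2 = c("uc001iki.4", "uc001ikn.2", "uc001ikl.4"),
  Logfold2   = c(9.4401488053, 9.8293484977, 9.9890468379),
  stringsAsFactors = FALSE
)

# Order-independent key: the smaller ID (by string comparison) always goes
# first, so the pairs (A, B) and (B, A) produce the same key.
pair_key <- paste(pmin(df$knowngene1, df$knowngene2),
                  pmax(df$knowngene1, df$knowngene2),
                  sep = "|")

# Keep only the first row seen for each unordered pair.
dedup <- df[!duplicated(pair_key), ]
```

Under this rule the mirrored second row is dropped, but the third row (uc001ikn.2 paired with uc001ikl.4) is kept because it is a different pair of accessions. If rows like that should also go, the key would have to be built from something else, for example the gene symbol columns.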


No doubt some better-educated folks would use 'melt' and 'recast' :-)