2015-02-11 43 views
0

Removing duplicate values within groups in R

I have the following data.frame:

dataFrame <- data.frame(
  sent = c(1,1,1,2,2,3,3,3,4,5),
  word = c("good printer","bad printer","wireless easy","amazing friendly",
           "great friendly","quite vibrant nice","well vibrant","no vibrant",
           "great notebook","nice car"),
  val = c(1,-1,1,1,1,1,1,1,1,1),
  extract = c("printer","printer","wireless","friendly","friendly",
              "vibrant","vibrant","vibrant","notebook","car"))

It looks like this:

sent               word val  extract
   1       good printer   1  printer
   1        bad printer  -1  printer
   1      wireless easy   1 wireless
   2   amazing friendly   1 friendly
   2     great friendly   1 friendly
   3 quite vibrant nice   1  vibrant
   3       well vibrant   1  vibrant
   3   horrible vibrant  -1  vibrant
   4     great notebook   1 notebook
   5           nice car   1      car

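I am struggling to remove the duplicates within each subgroup. That is, if rows have the same value of sent, the same value of val, and the same value of extract, I need to remove the duplicates.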

Here is the desired output:

sent               word val  extract
   1       good printer   1  printer
   1        bad printer  -1  printer
   1      wireless easy   1 wireless
   2   amazing friendly   1 friendly
   3       well vibrant   1  vibrant
   3   horrible vibrant  -1  vibrant
   4     great notebook   1 notebook
   5           nice car   1      car

I would appreciate any help or advice. Many thanks in advance.

+1

Please check 'duplicated' or 'unique' – akrun 2015-02-11 11:04:26

+1

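Basically @akrun has already given you all the information you need, although the data you provided does not match the data frame you actually display. Not to mention that the desired output is inconsistent: sometimes you remove the first duplicate, and on other occasions the second. – 2015-02-11 11:08:11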

+0

OK, yes, I appreciate your thoughts. The !duplicated syntax is what I was looking for. Many thanks, Miff and guys. – martinkabe 2015-02-11 12:21:26

Answers

0

Note that the first and second code blocks appear to contradict each other, so I have gone with the first:

> dataFrame <- data.frame(sent = c(1,1,1,2,2,3,3,3,4,5), 
          word = c("good printer","bad printer","wireless easy","amazing friendly", 
            "great friendly","quite vibrant nice","well vibrant","no vibrant", 
            "great notebook","nice car"), 
          val = c(1,-1,1,1,1,1,1,1,1,1), 
          extract = c("printer","printer","wireless","friendly","friendly","vibrant", 
              "vibrant","vibrant","notebook","car")) 

> dataFrame[!duplicated(dataFrame[,-2]),] 
   sent               word val  extract
1     1       good printer   1  printer
2     1        bad printer  -1  printer
3     1      wireless easy   1 wireless
4     2   amazing friendly   1 friendly
6     3 quite vibrant nice   1  vibrant
9     4     great notebook   1 notebook
10    5           nice car   1      car

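duplicated returns a Boolean vector indicating whether each row has already occurred earlier. Here it is applied to every column except word (dataFrame[,-2]); negating the result with ! and using it as a row index then extracts the non-duplicated rows from the original data frame.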
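To make the logical vector visible, and to address the comment about which duplicate is kept, here is a minimal sketch on a small made-up data frame. The names df, grp, txt and key are hypothetical and are not the asker's data. It assumes you want to compare rows only on the grouping columns. The fromLast argument of duplicated reverses the scan direction, so the last row of each group is kept instead of the first.

```r
# Hypothetical toy data, not the asker's data frame
df <- data.frame(
  grp = c(1, 1, 1, 2),
  txt = c("a x", "b x", "c x", "d y"),
  key = c("x", "x", "x", "y"),
  stringsAsFactors = FALSE
)

# TRUE marks a row whose (grp, key) pair already appeared in an earlier row
dup <- duplicated(df[, c("grp", "key")])
print(dup)

# Keep the first row of each (grp, key) pair
first_kept <- df[!dup, ]
print(first_kept)

# fromLast = TRUE scans from the bottom, so the last row of each pair is kept
last_kept <- df[!duplicated(df[, c("grp", "key")], fromLast = TRUE), ]
print(last_kept)
```

Note that neither variant alone reproduces the desired output in the question, since that output keeps the first duplicate in one group and a later one in another.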

+0

Many thanks, Miff, for the !duplicated syntax. – martinkabe 2015-02-11 12:23:15