SAS/PROC SQL - Delete all observations in a BY group whenever duplicates exist

I am new to SAS, and I am trying to delete a group if two conditions are met. I currently have this data set:

ID ID_2 ID_3; 

A 1 1; 

A 1 1; 

A 1 1; 

A 2 0; 

A 2 1; 

B 3 0; 

B 3 0;

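I am grouping by ID and then by ID_2.

I want to delete every entry in a group whenever (1) there are duplicates across all three variables (I don't just want to remove the duplicates, I want to remove the whole group) AND (2) the duplication involves the value '1' in ID_3 in every row of the group.

In other words, the result I want is: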

ID ID_2 ID_3; 

A 2 0; 

A 2 1; 

B 3 0; 

B 3 0;

I have spent at least 5 hours on this, and I have tried several approaches:

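first. and last. (this does not guarantee that all observations across the group match)
nodup (this only removes the extra copies; I want to remove even the first row of the group)
lag (again, the first row of the group stays, which is not what I want)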

I am open to using PROC SQL as well. I would really appreciate any input. Thanks in advance!

Source

2016-11-11 LovetoLearn

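What have you tried? –

I tried the three approaches mentioned above. – LovetoLearn

Please post the code you tried and format your data, ideally as a data step, but at least remove the spaces and semicolons so it can be read into SAS. Either a DoW loop or an SQL step could work. You could probably get first./last. to work, but you would need to check both ID and ID3. – Reeza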

I believe this does what you want. The logic could probably be tweaked to be a little clearer, but it worked when I tested it.

data x; 
    input id $ id_2 id_3; 
cards; 
A 1 1 
A 1 1 
A 1 1 
A 2 0 
A 2 1 
B 3 0 
B 3 0 
; 
run; 

* I realize the data are already sorted, but I think it is better 
* not to assume they are.; 
proc sort data=x; 
    by id id_2 id_3; 
run; 

* It is helpful to create a dataset for the duplicates as well as the 
* unduplicated observations.; 
data nodups 
    dups 
    ; 

    set x; 
    by id id_2 id_3; 

    * When FIRST.ID_3 and LAST.ID_3 at the same time, there is only 
    * one obs in the group, so keep it; 
    if first.id_3 and last.id_3 
    then output nodups; 

    * Otherwise, we know we have more than one obs. According to 
    * the OP, we keep them, too, unless ID_3 = 1; 
    else do;
        if id_3 = 1
        then output dups;
        else output nodups;
    end;

run;
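
Since the question asks about PROC SQL and a comment mentions a DoW loop, here is a rough sketch of both. It is not part of the tested code above. It assumes that a "group" is one ID / ID_2 combination and that a group should be dropped only when it has more than one row and every row in it has ID_3 = 1. The output names want_sql and want_dow are made up for illustration.

```sas
/* PROC SQL version: keep a group unless it has several rows that are all ID_3 = 1. */
/* (id_3 = 1) evaluates to 1 or 0 in SAS, so SUM counts the rows where ID_3 is 1.   */
/* Selecting * together with GROUP BY makes SAS remerge the summary statistics      */
/* with the detail rows; the log shows a NOTE about that.                           */
proc sql;
    create table want_sql as
    select *
    from x
    group by id, id_2
    having count(*) = 1
        or sum(id_3 = 1) < count(*)
    order by id, id_2, id_3;
quit;

/* Double DoW loop version: assumes x is already sorted by id id_2. */
data want_dow;
    n = 0;
    all_ones = 1;

    /* First pass over the group: count rows and check ID_3. */
    do until (last.id_2);
        set x;
        by id id_2;
        n = n + 1;
        if id_3 ne 1 then all_ones = 0;
    end;

    /* Second pass over the same rows: output unless the group is flagged. */
    do until (last.id_2);
        set x;
        by id id_2;
        if not (n > 1 and all_ones) then output;
    end;

    drop n all_ones;
run;
```

On the sample data both steps should leave the A 2 and B 3 rows and drop the three A 1 1 rows. They differ from the data step above for a mixed group such as A 4 0, A 4 1, A 4 1: the data step above would send the two A 4 1 rows to dups, while this sketch keeps the whole group because not every row has ID_3 = 1. Which behavior is right depends on how the second condition in the question is read.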

Source

2016-11-13 02:44:18 vknowles
