謝謝誰能幫助我。我有如下數據集:基於隨後使用的第一個標記值。保留等
data smp;
infile datalines dlm=',';
informat identifier $7. trx_date $9. transaction_id $13. product_description $50. ;
input identifier $ trx_date transaction_id $ product_description $ ;
datalines;
Cust1,11Aug2016,20-0030417313,ONKEN BIOPOT F/FREE STRAWBERRY
Cust1,11Aug2016,20-0030417313,ONKEN BIOPOT F/FREE STRAWBERRY
Cust1,11Aug2016,20-0030417313,ONKEN BIOPOT FULL STRAWB/GRAIN
Cust1,11Aug2016,20-0030417313,RACHELS YOG GREEK NAT F/F/ORG
Cust1,03Nov2016,23-0040737060,RACHELS YOG GREEK NAT F/F/ORG
Cust3,13Feb2016,39-0070595440,COLLECT YOG LEMON
Cust3,21Jun2016,34-0050769524,AF YOG FARMHOUSE STRAWB/REDCUR
Cust3,21Jun2016,34-0050769524,Y/VALLEY GREEK HONEY ORGANIC
Cust3,21Jun2016,34-0050769524,Y/VALLEY THICK LEMON CURD ORG
Cust3,21Jun2016,34-0050769524,Y/VALLEY THICK YOG FRUITY FAVS
Cust3,21Jun2016,34-0050769524,Y/VALLEY THICK YOG STRAWB ORG
Cust3,26Jun2016,39-0430106897,TOTAL GREEK YOGURT 0%
Cust3,14Aug2016,54-0040266755,M/BUNCH SQUASHUMS STRAW/RASP
Cust3,14Aug2016,54-0040266755,MULLER CORNER STRAWBERRY
Cust3,14Aug2016,54-0040266755,TOTAL GREEK YOGURT 0%
Cust3,22Aug2016,54-0050447336,M/BUNCH SQUASHUMS STRAW/RASP
;
對於每個客戶(以及基於TRANSACTION_ID他們的每一個購買的),我想標誌,該標誌將在下次訪問期間回購各產品(只有自己的未來訪問)。因此,在上述數據集中,正確的標誌位於第4,12和13行,因爲這些產品是在下次客戶拜訪時購買的(我們只看下一次拜訪)。
我想用下面的程序做到這一點:
proc sort data = smp out = td;
by descending identifier transaction_id product_description;
run;
DATA TD2(DROP=tmp_product);
SET td;
BY identifier transaction_id product_description;
RETAIN tmp_product;
IF FIRST.product_description and first.transaction_id THEN DO;
tmp_product = product_description;
END;
ATTRIB repeat_flag FORMAT=$1.;
IF NOT FIRST.product_description THEN DO;
IF tmp_product EQ product_description THEN repeat_flag ='Y';
ELSE repeat_flag = 'N';
END;
RUN;
proc sort data = td2;
by descending identifier transaction_id product_description;
run;
但它不工作?如果有人能夠幫助它將是fab。 祝福
您的前兩行是相同的產品和相同的日期。這可能會造成麻煩。 – Tom