2015-08-18

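SAS - Scanning a subset of a dataset and populating an array with unique values

I have a set of customer IDs and purchases, and I need to reduce them to one observation per customer that contains each of that customer's unique purchases.

For example: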

CustID Purchase1 Purchase2 Purchase3 Purchase4 
J Bike Shoes Shirt Pants 
J Shirt Pants null null 
J Bike Helmet Pants null 
K Shoes Helmet null null 
L Basketball Shoes Shirt null 
L Bike Helmet null null 

I would like my output to look like:

CustID P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 PN 
J Bike Shoes Shirt Pants Helmet null null null null null null null 
K Shoes Helmet null null ........ null 
L Basketball Shoes Shirt Bike Helmet null .... null 

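I'm fine with simply setting the maximum P to a very large value so that I never hit it, but bonus points if someone can show me how to scan the dataset and set the maximum for P (the largest number of unique purchases made by any one customer).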

You may also find example 3 from [this paper](http://support.sas.com/resources/papers/proceedings12/052-2012.pdf) interesting if you want to transpose multiple variables in a single transpose. – user667489

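Answer

How about something like this? Put every purchase in a single column, remove each customer's repeated purchases with nodupkey, then go back to a row-based layout (PROC TRANSPOSE names the new columns COL1, COL2, and so on automatically).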

/*sample dataset*/
data want;
    infile datalines delimiter=' ';
    /* default $ length is 8, which would truncate "Basketball" */
    length CustID $8 Purchase1-Purchase4 $20;
    input CustID $ Purchase1 $ Purchase2 $ Purchase3 $ Purchase4 $;
    datalines;
J Bike Shoes Shirt Pants
J Shirt Pants null null
J Bike Helmet Pants null
K Shoes Helmet null null
L Basketball Shoes Shirt null
L Bike Helmet null null
;


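/*every purchase in the same column*/
data want01;
    length purchall $200;               /* long enough for any purchase name */
    set want;
    array purc[*] Purchase1-Purchase4;  /* only the four purchase columns */
    do i=1 to dim(purc);
        purchall=purc[i];
        output;                         /* one row per customer/purchase pair */
    end;
    keep custid purchall;
run;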

/*delete repeated purchases and blanks*/ 
proc sort data=want01 out=want02 nodupkey;
    where purchall not in ('' 'null');
    by custid purchall;
run;

/*returning on a row based dataset*/ 
proc transpose data=want02 out=want03; 
by custid; 
var purchall; 
run; 
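
To get the P1, P2, ... names used in the question rather than COL1, COL2, ..., a minimal sketch along these lines may help. It assumes the WANT02 dataset from the sort step above; the output name WANT03P is made up for this example. PROC TRANSPOSE creates only as many columns as the customer with the most unique purchases needs, so no maximum has to be set by hand.

```sas
/* Assumes WANT02 exists: sorted by custid, one unique purchase per row. */
/* WANT03P is a hypothetical output name used only for illustration.     */
proc transpose data=want02 out=want03p(drop=_name_) prefix=P;
    by custid;
    var purchall;
run;

/* Inspect the result: one row per customer, columns P1, P2, ... */
proc print data=want03p noobs;
run;
```

Because WANT02 is sorted by custid and purchall, the purchases within each customer appear in alphabetical order rather than in the order they were first seen.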

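If you only want the maximum number of unique purchases, just run PROC FREQ on the WANT02 dataset (the one that contains the unique purchases, with blanks and nulls removed).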

proc freq data=want02 noprint; 
table custid /out=want04; 
run; 

WANT04 will be (PERCENT column omitted):

CUSTID | COUNT |
----------------
J      |     5 |
K      |     2 |
L      |     5 |
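
For the "bonus" part of the question, one way to turn these per-customer counts into a single maximum is to read it into a macro variable. This is only a sketch: it assumes WANT04 came from the PROC FREQ step above (where the count variable is named COUNT) and a SAS release that supports the TRIMMED keyword (9.3 or later). The names maxp and WANT05 are made up for this example.

```sas
/* Largest number of unique purchases made by any one customer. */
proc sql noprint;
    select max(count) into :maxp trimmed
    from want04;
quit;

%put NOTE: maximum unique purchases per customer = &maxp;

/* Example use: size an array P1-P&maxp and fill it per customer. */
/* WANT02 must be sorted by custid (it is, after the PROC SORT).  */
data want05;
    set want02;
    by custid;
    array p[&maxp] $200;
    retain p1-p&maxp;
    if first.custid then do;
        n = 0;                   /* restart the slot counter        */
        call missing(of p[*]);   /* clear the previous customer     */
    end;
    n + 1;                       /* sum statement: n is retained    */
    p[n] = purchall;
    if last.custid then output;  /* one row per customer            */
    keep custid p1-p&maxp;
run;
```

The DATA step does roughly what PROC TRANSPOSE does, but with an explicitly sized array, which is closer to what the question title asks for.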

I believe this will work, thank you! – Brian

You're welcome. Let me know if you need further clarification. – stat