2017-01-12

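SAS: merging datasets on multiple columns with different column names

I have two datasets that I want to merge by territory number. The first dataset holds the territory information, including the territory number. The second dataset also has territory numbers, but they are spread across four columns: drug_terr1, drug_terr2, drug_terr3 and drug_Terr4. I need to merge on all four columns, because each one holds a different territory number, and I want every one of those numbers included when I merge with the territory-information dataset. I tried renaming, but that did not work because it only changed the first column. Is there a way to combine these four columns under a single territory-number name so that I can do the merge?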

Ultimately I want it to look like the code below, but the four columns from 'terrfile' need to become one column named territory_nbr so that I can merge.

%let output = E:\Horizon\Adhoc\AH\; 
%let terrs =\\uslsasas1\E$\Horizon\IMS Processing\Weekly Data\20161230\Demo\; 
libname terrs "&terrs."; 
%let curr_process_wk = '12-30-2016'; 
%let curr_quarter =_q1; 
**0 Grab pskw; 
data pskw_data; 
set PSKW.PSKWMaster ; 
where week in ('12-16-2016','12-23-2016','12-30-2016','01-06-2017') and CopayType ="FBD" and FNRX=1 and pme_id in (46,42,55,38) and product in ('DUEXIS','VIMOVO','PENNSAID') 
and 
(COBPrimaryRejectCode1 in ('75','76') or COBPrimaryRejectCode2 in ('75', '76') or COBPrimaryRejectCode3 in ('75' , '76')); 
run; 
proc sort data=pskw_data; 
by imsid; 
run; 

** 01 Grab tbl HCP; 
proc sort data=ims.tblhcp (where = (week = &curr_process_wk.) keep = week imsid first_name last_name address1 address2 city state zip spec) 
      out = IMS_demo (drop = week); 
     by IMSID; 
run; 

** 02 Grab tbl terrs_by_imsid; 
data terrfile; 
set terrs.wd2_terrs_by_imsid&curr_quarter.; 
run; 

proc sort data = terrfile; 
by imsid; 
run; 
** 03 Grab tbl roster; 
data roster (keep = territorycode repname territoryname teamname); 
set ims.tblRoster; 
    repname = trim(left(FirstName))||" "||trim(left(LastName)); 
run; 
**04 link ; 
data combine_dbs; 
merge pskw_data (in=in1) 
ims.tblhcp (in=in2); 
by imsid; 
if in1; 
run; 
data territories; ***can't merge because territory code is not in terrfile, just 4 columns as I mentioned above***; 
merge terrfile (in=in1) 
roster (in=in2); 
by territorycode; 
if in2; 
run; 
Can you show what your data looks like at the moment?

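Which fields do you want to pick up from the territory master file (the file with one record per territory)? Since you want to combine it with your fact table four times, once for each of the up to four territory codes, you will need four names for each field so that you can store up to four different values. – Tom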

I want to pull something called IMS_ID from terrfiles so that I can eventually add it to my roster dataset. – SQUISH

Answer

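You need to join the fact table to the lookup table four times. Let's assume the territory identifier is called ID in your lookup table and that IMS_ID is the field you want to pick up from it. Let's also assume the four fields in your fact table are named ID1-ID4: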

proc sql ; 
    create table want as 
    select a.* 
      , b.ims_id as ims_id1 
      , c.ims_id as ims_id2 
      , d.ims_id as ims_id3 
      , e.ims_id as ims_id4 
    from FACT a 
    left join LU b on a.id1=b.id 
    left join LU c on a.id2=c.id 
    left join LU d on a.id3=d.id 
    left join LU e on a.id4=e.id 
    ; 
quit; 

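In your example, it looks like ROSTER is your FACT table and TERRFILES is your LU table. Your ID variable appears to be named TERRITORYCODE, at least in your lookup file. It is hard to tell what the four variables are named in ROSTER.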
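
If you would rather have the single key column the question asks for, another option is to reshape the four columns into one before merging. The following is only a sketch under assumptions taken from the question: that `terrfile` carries `imsid` plus `drug_terr1`-`drug_terr4`, that those four columns have the same type as `territorycode` in `roster`, and that `roster` has one row per `territorycode`. The dataset name `terr_long` is made up for illustration, and the new column is called `territorycode` (rather than `territory_nbr`) only so that it matches the BY variable in `roster`.

```sas
/* Sketch only: stack the four territory columns into one key column. */
/* Assumes terrfile contains imsid and drug_terr1-drug_terr4.         */
data terr_long (keep = imsid territorycode);
    set terrfile;
    /* SAS variable names are case-insensitive, so drug_Terr4 is covered */
    array terr{4} drug_terr1-drug_terr4;
    do i = 1 to dim(terr);
        /* write one output row per non-missing territory number */
        if not missing(terr{i}) then do;
            territorycode = terr{i};
            output;
        end;
    end;
run;

/* Drop repeated imsid/territory pairs and sort by the merge key */
proc sort data = terr_long nodupkey;
    by territorycode imsid;
run;

proc sort data = roster;
    by territorycode;
run;

/* Many-to-one match merge: keeps every territory found in roster */
data territories;
    merge terr_long (in = in_terr)
          roster    (in = in_roster);
    by territorycode;
    if in_roster;
run;
```

With this layout each `imsid` appears once per territory it belongs to, so no `ims_id1`-`ims_id4` style suffixes are needed. If `roster` can contain several rows for the same `territorycode`, a data step merge will not pair the rows the way a SQL join does, and the `proc sql` approach above is the safer choice.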