2015-01-10 38 views
0

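Stata: loop over observations with years stored as string variables

I am trying to write a loop that generates and fills in a dummy variable indicating whether a person was a member of a particular party in a given year. My data are in long format, with each observation being one person-year. The party information looks like this: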

X1     X2     X3     
AR, 1972-1981  PDC, 1982-1986  PFL, 1986-. 
MD, 1966-1980  PMDB, 1980-1988  PSB, 1988-. 
MD, 1966-1968  AR, 1968-1980  PDS, 1980-1985 

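The party comes before the comma; the years during which the person was a member of that party come after it. Any help would be greatly appreciated!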

My code so far is:

rename X1 XA 
rename X2 XB 
rename X3 XC 

foreach var of varlist XA XB XC{ 
    split `var', parse (,) 
} 
tabulate XA1, gen(p) 
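
Please share what you have tried that doesn't work.

foreach var of varlist X1 X2 X3 { split `var', parse(,) } tabulate X1, gen() – user4438802

Oh, sorry, I thought this was a Python question. Still, I suggest updating your question with your code; that usually helps in getting answers.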

Answers

2

Here is one way to do this. I had to make an assumption about what the missing end year in X3 corresponds to, so you will need to change that.

/* Enter Data */ 
clear 

input str20 X1 str20 X2 str20 X3     
"AR, 1972-1981"  "PDC, 1982-1986"  "PFL, 1986-." 
"MD, 1966-1980"  "PMDB, 1980-1988"  "PSB, 1988-." 
"MD, 1966-1968"  "AR, 1968-1980"  "PDS, 1980-1985" 
end 

compress 

/* Split X1,X2,X3 into party, start year and end year and create 3 ID variables that we need later */ 
forvalues v=1/3 { 
    split X`v', parse(", " "-") 
    gen id`v'=_n 
} 

/* Make the years numeric and get rid of the messy original data */ 
destring X12 X13 X22 X23 X32 X33, replace 
replace X33 = 1990 if missing(X33) // enter your survey year here 
drop X1 X2 X3 

/* stack the spells on top of each other */ 
stack (id1 X11 X12 X13) (id2 X21 X22 X23) (id3 X31 X32 X33), into(id party year1 year2) clear 
drop _stack 

/* Put the data into long format and fill in the gaps */ 
reshape long year, i(id party) j(p) 
drop p 
/* need this b/c people can be in more than one party in a given year */ 
egen idparty = group(id party), label 
xtset idparty year 
tsfill 
carryforward id party, replace // user-written command: ssc install carryforward 
drop idparty 

/* create party dummies */ 
tab party, gen(DD_) 

/* rename the dummies to have party affiliation at the end instead of numbers */ 
foreach var of varlist DD_* { 
    levelsof party if `var'==1, local(party) clean 
    rename `var' ind_`party' 
} 

drop party 

/* get back down to one person-year observation */ 
collapse (max) ind_*, by(id year) 

list id year ind_*, sepby(id) noobs 
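
Because a person can belong to two parties in the same year (for example, when one spell ends in the year the next one begins), it can be worth checking how many indicators are switched on per person-year after the collapse. The following is only a minimal sketch on a small hand-typed toy dataset; the rows and the variable name nparty are made up for illustration and are not part of the answer above.

```stata
* Toy data shaped like the output above (rows typed in by hand for illustration)
clear
input id year ind_MD ind_PMDB
2 1979 1 0
2 1980 1 1
2 1981 0 1
end

* Count the parties each person belongs to in each year
* (nparty is a hypothetical helper variable)
egen nparty = rowtotal(ind_*)

* Show the person-years with overlapping memberships
list id year ind_* if nparty > 1, noobs
```

If overlaps are not wanted, you would need a rule for which party wins in a transition year; the data alone do not say.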
1

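Following Dimitriy's lead (and explanations), here is a slightly different way. I make a different assumption about the missing end points: I truncate the series at the last known year.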

clear 
set more off 

input /// 
str15 (XA     XB     XC)     
"AR, 1972-1981"  "PDC, 1982-1986"  "PFL, 1986-." 
"MD, 1966-1980"  "PMDB, 1980-1988" "PSB, 1988-." 
"MD, 1966-1968"  "AR, 1968-1980" "PDS, 1980-1985" 
end 

list 

*----- what you want? ----- 

// main 
stack X*, into(X) clear 
bysort _stack: gen id = _n 
order id, first 

split X, parse (, -) 
rename (X1 X2 X3) (party sdate edate) 

destring ?date, replace 
gen diff = edate - sdate + 1 
expand diff 

bysort id party: replace sdate = sdate[1] + _n - 1 

drop _stack X edate diff 

// create indicator variables 
tabulate party, gen(y) 

// fix years with two or more parties 
levelsof party, local(lp) clean 
collapse (sum) y*, by(id sdate) 

// rename 
unab ly: y* 
rename (`ly') (`lp') 

list, sepby(id) 
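
If you would rather let open-ended spells run through a fixed year instead of truncating them, the same expand approach can be adapted by filling in the missing end year before computing the spell length. This is only a sketch under assumptions: the cutoff of 1990 is the placeholder used in the first answer, not a value from the question, and the toy data below are typed in already split into party, sdate and edate.

```stata
* Toy spells, already split into party, start year and end year
clear
input str4 party int sdate int edate
"PFL" 1986 .
"PDS" 1980 1985
end
gen id = _n

* Assumption: open-ended spells last through a cutoff year you choose
local cutoff 1990
replace edate = `cutoff' if missing(edate)

* One row per person-party-year
gen diff = edate - sdate + 1
expand diff
bysort id party: gen year = sdate + _n - 1

drop sdate edate diff
list, sepby(id)
```

Without the replace line, diff is missing for the open-ended spell and expand leaves that observation as a single row, which is the truncation behaviour described above.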