Stata: comparing the variables that differ between two datasets

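I have two large datasets (more than 1000 variables each). One of them contains all the variables of the second, plus some additional variables. I would like to get a list of all these additional variables, then drop them and append one dataset to the other. I tried the command dta_equal, but ran into the same problem described here: http://www.stata.com/statalist/archive/2011-08/msg00308.html

I don't think append, keep() can do what I want directly, that is, append one dataset while dropping the extra variables of the other. I would have to type the variables into the keep() option by hand, which is not realistic for datasets this large.

Is there a way to solve this?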


2015-09-08 Chen

Check out 'ssc describe cfvars'.

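Several Stata commands are useful here.

The first and third examples use the unab command to build a list of the variables in the dataset with fewer variables, which must be in memory at that point. The second example and the final part use the describe command to obtain the variable list of a dataset that is not currently in memory.

The final part shows how to use the extended macro list functions to obtain the list of variables common to both datasets, as well as the variables found in only one of them.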

* simulate 2 datasets, one has more variables than the other 
sysuse auto, clear 
save "data1.dta", replace 
gen x = _n 
gen y = -_n 
save "data2.dta", replace 

* example 1: drop after append 
use "data1.dta", clear 
unab vcommon : * 
gen source = 1 
append using "data2.dta" 
replace source = 2 if mi(source) 
keep `vcommon' source 

* example 2: drop first then append 
clear 
describe using "data1.dta", varlist short 
local vcommon `r(varlist)' 
use `vcommon' using "data2.dta", clear 
gen source = 2 
append using "data1.dta" 
replace source = 1 if mi(source) 

* example 3: append and keep on the fly 
use "data1.dta", clear 
unab vcommon : * 
gen source = 1 
append using "data2.dta", keep(`vcommon') 
replace source = 2 if mi(source) 

* use extended macro list functions to manipulate variable list 
clear 
describe using "data1.dta", varlist short 
local vlist1 `r(varlist)' 
describe using "data2.dta", varlist short 
local vlist2 `r(varlist)' 
local vcommon : list vlist1 & vlist2 
local vinonly1 : list vlist1 - vlist2 
local vinonly2 : list vlist2 - vlist1 
dis "common variables = `vcommon'" 
dis "variables in data1 not found in data2 = `vinonly1'" 
dis "variables in data2 not found in data1 = `vinonly2'"
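
Putting these pieces together for the workflow described in the question (list the additional variables, drop them, then append), a minimal sketch might look like the following. It assumes the same simulated data1.dta and data2.dta as in the examples above and that data2 contains every variable in data1 plus some extras. The local macro name extra is only illustrative.

```stata
* recreate the two simulated files so the sketch runs on its own
sysuse auto, clear
save "data1.dta", replace
gen x = _n
gen y = -_n
save "data2.dta", replace

* variable names of the smaller dataset, read without loading it
describe using "data1.dta", varlist short
local vlist1 `r(varlist)'

* load the larger dataset and work out which variables are extra
use "data2.dta", clear
unab vlist2 : *
local extra : list vlist2 - vlist1
display "additional variables in data2: `extra'"

* drop the extras (if there are any), then append the smaller dataset
if "`extra'" != "" {
    drop `extra'
}
append using "data1.dta"
```

With the simulated files, the extra macro holds x and y, and the appended dataset ends up with only the variables that were in data1.dta.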


2015-09-09 14:52:10

Thanks @RobertPicard. This is really useful. – Chen
