This question is related to Stata: select the minimum of each observation. Selecting the lowest value within each group
I have data as follows:
clear
input str4 id int eventdate byte dia_bp_copy int sys_bp_copy
"pat" 15698 100 140
"pat" 16183 80 120
"pat" 19226 98 155
"pat" 19375 80 130
"sue" 14296 80 120
"sue" 14334 88 127
"sue" 14334 96 158
"sue" 14334 84 136
"sue" 14403 86 124
"sue" 14403 88 134
"sue" 14403 90 156
"sue" 14403 86 134
"sue" 14403 90 124
"sue" 14431 80 120
"sue" 14431 80 140
"sue" 14431 80 130
"sue" 15456 80 130
"sue" 15501 80 120
"sue" 15596 80 120
"mary" 14998 90 154
"mary" 15165 91 179
"mary" 15280 91 156
"mary" 15386 81 154
"mary" 15952 77 133
"mary" 15952 80 144
"mary" 16390 91 159
end
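Some people have more than one reading on a single day; see, for example, sue on 31 March 1999. I want to select the lowest reading for each day.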
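As a quick check on the date mentioned above: the eventdate values look like Stata daily dates (days since 1 January 1960), and under that assumption the value 14334 can be displayed in readable form. This is only a small sketch for confirming which rows belong to that day.

```stata
* Assumption: eventdate is a Stata daily date (days since 01jan1960)
display %td 14334

* Optionally show the whole variable as dates instead of integers
* (run this with the data above in memory)
* format eventdate %td
```

If the assumption holds, sue has three rows with that eventdate.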
Here is my code, which gets me part of the way. It is clunky and awkward, and I am looking for help doing what I want in a more direct way.
*make flag for repeat observations on same day
sort id eventdate
by id: gen flag =1 if eventdate==eventdate[_n-1]
by id: gen flag2=1 if eventdate==eventdate[_n+1]
by id: gen flag3 =1 if flag==1 | flag2==1
drop flag flag2
* group repeat observations together
egen group = group(id flag3 eventdate)
* find lowest sys_bp_copy value per group
bys group (eventdate flag3): egen low_sys=min(sys_bp_copy)
* remove the observations where sys_bp_copy is not the lowest value in the group
bys group: gen remove =1 if low_sys!=sys_bp_copy
drop if remove==1 & group !=.
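**This is where I would like help**
The problem with the above method is that, for sue, two of her repeat readings on the same day (eventdate 14403) share the same lowest value of sys_bp_copy, so my method above still leaves her with multiple readings for that day.
In that case, I would like to fall back on dia_bp_copy and select the lowest value, so that I end up with a single row per reading date instead of several. The code is below, but surely there must be a simpler way to do this?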
drop flag3 remove group
sort id eventdate
by id: gen flag =1 if eventdate==eventdate[_n-1]
by id: gen flag2=1 if eventdate==eventdate[_n+1]
by id: gen flag3 =1 if flag==1 | flag2==1
egen group = group(id flag3 eventdate)
bys group (eventdate flag3): egen low_dia=min(dia_bp_copy)
bys group: gen remove =1 if low_dia!=dia_bp_copy
drop if remove==1 & group !=.
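One possible simplification, offered as a sketch rather than a definitive answer: sort within each id and eventdate by sys_bp_copy and then by dia_bp_copy, and keep only the first row. This handles single and repeated readings in one step, with no flag or group variables. The inline data below is a small subset copied from the data above, used only to keep the example self-contained.

```stata
* Small subset of the data above, including sue's tied readings on 14403
clear
input str4 id int eventdate byte dia_bp_copy int sys_bp_copy
"sue" 14403 86 124
"sue" 14403 88 134
"sue" 14403 90 156
"sue" 14403 86 134
"sue" 14403 90 124
"sue" 14431 80 120
"sue" 14431 80 140
"sue" 14431 80 130
"mary" 15952 77 133
"mary" 15952 80 144
end

* Within each id/eventdate, order rows by systolic, then diastolic,
* and keep the first row (lowest systolic; ties broken by lowest diastolic)
bysort id eventdate (sys_bp_copy dia_bp_copy): keep if _n == 1

* Check that id and eventdate now uniquely identify each row
isid id eventdate
list, sepby(id)
```

On this subset, sue's 14403 readings reduce to the row with sys_bp_copy 124 and dia_bp_copy 86. If two readings tie on both variables, they are identical on these measures, so it does not matter which one is kept.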
OK, I'll edit to make it more succinct. Hang on. – user2363642