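SAS: How do I find the nth instance of a character or group of characters within a string?
I'm trying to find a function that returns the position of the nth instance of a character (or group of characters).
For example, if I have the string ABABABBABSSSDDEE and I want to find the third instance of A, how do I do that? What if I want to find the fourth instance of AB?
In ABABABBABSSSDDEE, the third A is at position 5 and the fourth AB starts at position 8.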
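data HAVE;
  length STRING $50;  /* without this, list input defaults to 8 characters and truncates the value */
  input STRING $;
datalines;
ABABABBABSSSDDEE
;
RUN;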
data _null_;
  findThis = 'A';             *** substring to find;
  findIn = 'ADABAACABAAE';    *** the string to search;
  instanceOf = 1;             *** and the instance of the substring we want to find;
  pos = 0;
  len = 0;
  startHere = 1;
  endAt = length(findIn);
  n = 0;                      *** count occurrences of the pattern;
  *** findThis is used as a regular expression so metacharacters in it are not matched literally;
  pattern = '/' || findThis || '/';
  rx = prxparse(pattern);
  *** each call returns the next match and moves startHere past it;
  CALL PRXNEXT(rx, startHere, endAt, findIn, pos, len);
  do while (pos gt 0);
    n + 1;
    if n eq instanceOf then leave;
    CALL PRXNEXT(rx, startHere, endAt, findIn, pos, len);
  end;
  if n eq instanceOf then do;
    put 'Found instance ' instanceOf 'of ' findThis 'at position ' pos 'in ' findIn;
  end;
  else if n eq 0 then do;
    put 'Could not find ' findThis 'in ' findIn;
  end;
  else do;
    put 'No instance ' instanceOf 'of ' findThis 'found';
  end;
run;
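The same PRXNEXT loop could also be run against each row of a dataset rather than a hard-coded string. The following is only a minimal sketch: it assumes a dataset HAVE with a character variable STRING wide enough to hold the whole value, and the pattern /AB/ and the variable instanceOf are illustrative choices rather than part of the answer above.

```sas
data _null_;
  set have;                               /* assumes HAVE with a character variable STRING */
  retain rx;
  if _n_ = 1 then rx = prxparse('/AB/');  /* compile the pattern once */
  instanceOf = 4;                         /* hypothetical: which occurrence to report */
  startHere = 1;
  endAt = length(string);
  pos = 0;
  len = 0;
  do n = 1 to instanceOf;
    call prxnext(rx, startHere, endAt, string, pos, len);
    if pos le 0 then leave;               /* fewer than instanceOf matches */
  end;
  put string= instanceOf= pos=;           /* pos is 0 when the occurrence does not exist */
run;
```

Because PRXNEXT resumes the search after the end of the previous match, this counts non-overlapping matches. For a pattern that can overlap itself (for example AA in AAAA) it may give a different count than a loop that advances one character at a time.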
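Here is a solution that uses the find() function and a do loop within a data step. I then took that code and put it into a proc fcmp procedure to create my own function, find_n(). This should greatly simplify any task that needs this lookup and allows the code to be reused.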
Define the data:
data have;
length string $50;
input string $;
datalines;
ABABABBABSSSDDEE
;
run;
Do loop solution:
data want;
  set have;
  search_term = 'AB';
  nth_time = 4;
  counter = 0;
  last_find = 0;
  start = 1;
  pos = find(string,search_term,'',start);
  do while (pos gt 0 and nth_time gt counter);
    last_find = pos;
    start = pos + 1;        /* resume one character past the last match */
    counter = counter + 1;
    pos = find(string,search_term,'',start);  /* start, not start+1, so adjacent matches are not skipped */
  end;
  if nth_time eq counter then do;
    put "The nth occurrence was found at position " last_find;
  end;
  else do;
    put "Could not find the nth occurrence";
  end;
run;
Defining the proc fcmp function:
Note: the function returns 0 if the nth occurrence cannot be found.
options cmplib=work.temp;  /* cmplib takes libref.dataset; outlib takes libref.dataset.package */
proc fcmp outlib=work.temp.temp;
  function find_n(string $, search_term $, nth_time) ;
    counter = 0;
    last_find = 0;
    start = 1;
    pos = find(string,search_term,'',start);
    do while (pos gt 0 and nth_time gt counter);
      last_find = pos;
      start = pos + 1;        /* resume one character past the last match */
      counter = counter + 1;
      pos = find(string,search_term,'',start);  /* start, not start+1, so adjacent matches are not skipped */
    end;
    result = ifn(nth_time eq counter, last_find, 0);
    return (result);
  endsub;
run;
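Example proc fcmp usage:
Note that this calls the function twice. The first call shows the solution to the original request. The second shows what happens when no match is found.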
data want;
set have;
nth_position = find_n(string, "AB", 4);
put nth_position =;
nth_position = find_n(string, "AB", 5);
put nth_position =;
run;
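The other half of the question (the third instance of A) could be handled by the same function. A minimal sketch, assuming find_n() has been compiled and made available through cmplib as above, and that HAVE holds the sample string ABABABBABSSSDDEE; the variable name third_a is only illustrative:

```sas
data _null_;
  set have;
  /* third occurrence of the single character A */
  third_a = find_n(string, "A", 3);
  put third_a=;
run;
```

For the sample string this should report position 5, since A occurs at positions 1, 3, 5 and 8.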
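Why the downvote? –
What have you tried so far? Have you read up on 'regex'? 'SAS' uses the 'PERL' regex engine. – gwillie
You could also use FIND in a loop for simple cases, though regex is the better approach for complex ones... – kl78
@gwillie Nothing with 'regex' so far, but I'll look into it..... In the past I've used 'index' and 'find' in combination with 'substr', but this next level of complexity may call for regex. TY. –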