Extracting strings and substituting special characters with awk/gsub
I have a file containing many lines like the ones below:
<li><img src="img/tt_potato-30x30.png" alt="ew_inactive"> <img src="img/in-event-40x40.png" alt="event"> - dep[(0:0)(0:0)]ref[(3:0)(0:0)]srch[?] - <a href "tcc_1111.html">XX:The quick brown fox jumped over the lazy </a> -<img src= "img/config-40x40.png" alt="config"><img src="img/validate-40x50.png" alt="validate"> - user
<li><img src="img/tt_potato-30x30.png" alt="ew_inactive"> <img src="img/in-event-40x40.png" alt="event"> - dep[(0:0)(0:0)]ref[(3:0)(0:0)]srch[?] - <a href "tcc_1111.html">YY:Jack and Jill went up the hill </a> -<img src= "img/config-40x40.png" alt="config"><img src="img/validate-40x50.png" alt="validate"> - user
<li><img src="img/tt_potato-30x30.png" alt="ew_inactive"> <img src="img/in-event-40x40.png" alt="event"> - dep[(0:0)(0:0)]ref[(3:0)(0:0)]srch[?] - <a href "tcc_1111.html">ZZ: Mary had a little lamb </a> -<img src= "img/config-40x40.png" alt="config"><img src="img/validate-40x50.png" alt="validate"> - user
I want to extract the following strings and discard everything else.
XX: The quick brown fox jumped over the lazy
YY: Jack and Jill went up the hill
ZZ: Mary had a little lamb
So far I have tried the following awk command, but it is limited: the hard-coded XX has to be changed by hand to handle YY and ZZ.
awk '{gsub(/^.*XX:/,"XX:"); gsub(/[<\a>].*$/,"[</a>].");print}'
Can anyone suggest another standard Linux tool for this? Thanks.
How generic are XX/YY/ZZ? If they are just two-letter codes, you can match them with '[XYZ]{2}' in most regex engines. – stevesliva
@stevesliva, I think the problem is more (or also) that the OP has to change the replacement string as well as which letters the regex matches. – jas
Hi, Jas is correct: the string before the ':' varies, so handling that is a requirement. Thanks for the replies. – niknak
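
Since the prefix before the ':' varies, one possible approach is to key on the surrounding `<a ...>` and `</a>` tags rather than on XX/YY/ZZ. The following is only a minimal sketch under some assumptions: each line holds at most one anchor element, the link text contains no nested tags, and the first ':' in the link text is the one separating the prefix. The function name `extract_link_text` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the text of the <a ...>...</a> element on each
# input line, whatever prefix appears before the colon.
extract_link_text() {
    awk '
    # Only act on lines that contain an anchor element.
    match($0, /<a [^>]*>[^<]*<\/a>/) {
        s = substr($0, RSTART, RLENGTH)   # the whole <a ...>text</a>
        sub(/^<a [^>]*>/, "", s)          # drop the opening tag
        sub(/[ \t]*<\/a>$/, "", s)        # drop the closing tag and trailing blanks
        sub(/:[ \t]*/, ": ", s)           # normalise spacing after the first colon
        print s
    }'
}
```

It reads standard input, so it could be used as `extract_link_text < input.html`, where `input.html` is a placeholder for the actual file. Lines without an anchor element produce no output.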