awk or sed to remove text in a file before a character and then after a character

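I have a file in which I am trying to use awk to remove the text before the () while keeping the text inside the (). I am also trying to remove the space and the text after the _#, and then output the whole line. Maybe sed is a better choice, but I am not sure how.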

File

chr4 100009839 100009851 426_1201_128(ADH5)_1 0 - 
chr4 100006265 100006367 426_1202_128(ADH5)_2 0 - 
chr4 100003125 100003267 426_1203_128(ADH5)_3 0 -

Desired output

chr4 100009839 100009851 ADH5_1 
chr4 100006265 100006367 ADH5_2 
chr4 100003125 100003267 ADH5_3

AWK

awk -F'()_*' '{print $1,$2,$3,$4}' file


2016-03-15 Chris

awk -F'[\t()]' '{OFS="\t"; print $1, $2, $3, $5 $6}' file

Output:

 
chr4 100009839  100009851  ADH5_1 
chr4 100006265  100006367  ADH5_2 
chr4 100003125  100003267  ADH5_3
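
The field separator `[\t()]` in the command above relies on the file being tab-delimited. If the columns are separated by spaces instead (an assumption, since the sample as pasted does not show which), a variant that keeps awk's default whitespace splitting and breaks up only the fourth column may be easier to reason about. This is a minimal sketch; the variable names `input`, `result`, and `parts` are illustrative and not from the original answers.

```shell
# Sample rows from the question, written here with single spaces (an assumption).
input='chr4 100009839 100009851 426_1201_128(ADH5)_1 0 -
chr4 100006265 100006367 426_1202_128(ADH5)_2 0 -
chr4 100003125 100003267 426_1203_128(ADH5)_3 0 -'

# Default field splitting handles spaces or tabs.
# split() cuts column 4 at "(" and ")":
#   parts[1] = text before "(", parts[2] = text inside "()", parts[3] = "_N" suffix.
result=$(printf '%s\n' "$input" |
  awk '{ split($4, parts, /[()]/); print $1, $2, $3, parts[2] parts[3] }')

printf '%s\n' "$result"
```

The output fields are joined with a single space; setting `OFS="\t"` in a `BEGIN` block would give tab-separated output instead.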


2016-03-15 17:56:33 Cyrus

Using sed with a substitution:

$ sed 's/[^ ]*(\([^)]*\))\(_[^ ]*\).*$/\1\2/' infile 
chr4 100009839 100009851 ADH5_1 
chr4 100006265 100006367 ADH5_2 
chr4 100003125 100003267 ADH5_3

Breaking down the regular expression:

[^ ]*(  # Non-spaces up to and including opening parenthesis 
\(   # Start first capture group 
    [^)]* # Content between parentheses: everything but a closing parenthesis 
\)   # End of first capture group 
)   # Closing parenthesis, not captured 
\(   # Start second capture group 
    _[^ ]* # Underscore and non-spaces, '_1' etc. 
\)   # End of second capture group 
.*$   # Rest of line, not captured
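
For comparison, the same substitution can be written with extended regular expressions, where the escaping is inverted: plain `(` and `)` form capture groups and `\(` and `\)` match literal parentheses. The sketch below assumes a sed that accepts `-E` (GNU and BSD sed both do) and space-separated input; the variable names `input` and `result` are illustrative.

```shell
# Sample rows from the question, written here with single spaces (an assumption).
input='chr4 100009839 100009851 426_1201_128(ADH5)_1 0 -
chr4 100006265 100006367 426_1202_128(ADH5)_2 0 -
chr4 100003125 100003267 426_1203_128(ADH5)_3 0 -'

# ERE version of the substitution:
#   [^ ]*\(     non-spaces up to and including a literal "("
#   ([^)]*)     group 1: content between the parentheses
#   \)          literal ")", not captured
#   (_[^ ]*)    group 2: underscore and following non-spaces, e.g. "_1"
#   .*$         rest of the line, dropped
result=$(printf '%s\n' "$input" |
  sed -E 's/[^ ]*\(([^)]*)\)(_[^ ]*).*$/\1\2/')

printf '%s\n' "$result"
```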


2016-03-16 06:19:47
