Using a regular expression to retrieve the string between characters

I would like to use the grep command, or just know the regex, to get the following string between the ">" and "<" characters.

String:

<f id=mos-title>demo-break-1</f>

I want to get back:

demo-break-1

[This is the regex you need.](http://en.wikipedia.org/wiki/XPath) – 2013-03-14 22:10:17

Another approach: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – 2013-03-14 22:13:45

Assuming the file foo contains:

<f id=mos-title>demo-break-1</f> 
<f id=mos-title>demo-break-2</f> 
<f id=mos-title>demo-break-3</f> 
<a>foo testing</a>

You can do something like this:

perl -ne 'print "$1\n" if /<.+id=mos-title>(.+?)<\/f>/' foo

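Keep in mind that this is strict: a match only succeeds when the whole element sits on a single line. Also, since this is not a real HTML parser, you have to account for any deviation in formatting yourself.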

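Still strict, and still not 100% HTML-compliant, but here is a more lenient approach. It tolerates extra attributes after the id and trims whitespace around the content: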

perl -ne 'print "$1\n" if /<.+id=mos-title\b.*?>\s*(.+?)\s*<\/f>/' foo

The output will look like this:

demo-break-1 
demo-break-2 
demo-break-3
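Since the question asked about grep specifically, here is a minimal sketch of the same extraction without perl. It assumes a grep that supports the -o option (GNU and BSD grep do) and the sample file foo shown above. The function name extract_title is made up for illustration.

```shell
# Recreate the sample file foo used in this answer.
printf '%s\n' \
  '<f id=mos-title>demo-break-1</f>' \
  '<f id=mos-title>demo-break-2</f>' \
  '<f id=mos-title>demo-break-3</f>' \
  '<a>foo testing</a>' > foo

# extract_title is a hypothetical helper name, not part of the original answer.
extract_title() {
  # grep -o prints only the matched part of each line, here
  # "id=mos-title>TEXT</f>"; sed then strips everything up to the
  # first ">" and the trailing "</f>".
  grep -o 'id=mos-title>[^<]*</f>' "$1" | sed 's/^[^>]*>//; s/<\/f>$//'
}

extract_title foo
```

On the sample file this prints demo-break-1, demo-break-2 and demo-break-3, one per line. The <a> line is skipped because it carries no id=mos-title. Like the perl versions, this only works when the element sits on one line.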

2013-03-15 00:08:22 cmevoli

If you have a proper XML file like this:

<root> 
    <f id="mos-title">demo-break-1</f> 
</root>

You can use a proper parser:

xmllint --xpath "/root/f[@id='mos-title']" input.xml | \
     sed 's/[^>]*>\([^<]*\)<[^>]*>/\1\n/g'

If you are sure that your input format is consistent (i.e. machine-generated), you can use sed directly on it:

sed 's/[^>]*>\([^<]*\)<[^>]*>/\1/g' input
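Note that the command above matches any tag pair, so on a file that also contains a line such as <a>foo testing</a> it prints foo testing as well. If only the mos-title elements are wanted, one possible variant is sketched below. It is not part of the original answer; the function name extract_with_sed and the sample contents written to input are assumptions for illustration.

```shell
# Sample input, assumed to mirror the file foo from the earlier answer.
printf '%s\n' \
  '<f id=mos-title>demo-break-1</f>' \
  '<f id=mos-title>demo-break-2</f>' \
  '<f id=mos-title>demo-break-3</f>' \
  '<a>foo testing</a>' > input

# extract_with_sed is a hypothetical helper name.
extract_with_sed() {
  # -n suppresses sed's default printing; the trailing "p" flag prints
  # a line only when the substitution succeeded, so lines without
  # id=mos-title are dropped instead of being echoed unchanged.
  sed -n 's/.*id=mos-title>\([^<]*\)<\/f>.*/\1/p' "$1"
}

extract_with_sed input
```

The leading and trailing .* also discard anything else on the line, including trailing whitespace after the closing tag.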

2013-03-15 00:36:39 perreal

It is usually best to use an XML parser, but you can try this awk:

awk '$1==s{print $2}' s="f id=mos-title" RS=\< FS=\> file
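For readers unfamiliar with variable assignments on the awk command line, the one-liner can be spelled out as follows. This is an annotated sketch of the same idea, with the separators set in a BEGIN block instead. The function name extract_with_awk and the sample contents written to file are assumptions for illustration.

```shell
# Sample input, assumed to mirror the file foo from the first answer.
printf '%s\n' \
  '<f id=mos-title>demo-break-1</f>' \
  '<f id=mos-title>demo-break-2</f>' \
  '<f id=mos-title>demo-break-3</f>' \
  '<a>foo testing</a>' > file

# extract_with_awk is a hypothetical helper name.
extract_with_awk() {
  awk -v s='f id=mos-title' '
    # Split the input into records at every "<" and into fields at
    # every ">", so an opening tag becomes $1 and its text becomes $2.
    BEGIN { RS = "<"; FS = ">" }
    # Print the text only when the tag (name plus attributes) matches s exactly.
    $1 == s { print $2 }
  ' "$1"
}

extract_with_awk file
```

Because the comparison on $1 is exact, a tag with different spacing, quoting, or extra attributes will not match.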

2013-03-16 11:56:35 Scrutinizer
