Regex to get <a href> from a string in Java

Suppose I have:

<img class="size-full wp-image-10225" alt="animals" src="abc.jpg"> blah blah blah&nbsp; 
<a href="http://en.wikipedia.org/wiki/Elephant">elephant is an animal</a>&nbsp;blah

I want a regex to give me the output :

blah blah blah <a href="http://en.wikipedia.org/wiki/Elephant">elephant is an animal</a> blah

without the &nbsp;. I can do str.replace("&nbsp;", "") separately, but how do I get the string starting from blah blah... until blah (which includes the link tag)?

Source

2014-03-28 user3298846

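You have to remove the 'img' tag separately. Do you only need the a-tag? That works with a regex. If you also want the other text before and after the tag, you run into problems here. Why don't you simply remove the tags you don't need? –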

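I do need the text before the tag. So basically I can't use StringUtils.removeHTMLTags(), because that would remove all the tags, and I want to keep the HTML tag. So what I'm thinking is to find the first ">" before the a href, and then capture the text from there until (inclusive) – user3298846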
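The idea in the comment above can be sketched without a regex. This is only a sketch under assumptions: "the first > before the a href" is read as the nearest ">" preceding the anchor tag, the capture runs to the end of the string (the comment does not say where it should stop), and no attribute value contains a ">". The class and method names are hypothetical.

```java
class TextFromPrecedingTag {
    // Hypothetical helper: returns the text that follows the nearest '>' before
    // the first "<a " in the input. Returns the input unchanged if there is no
    // "<a " or no '>' in front of it.
    static String fromPrecedingTag(String html) {
        int anchor = html.indexOf("<a ");
        if (anchor < 0) {
            return html; // no anchor tag found
        }
        // Search backwards from the anchor for the '>' that closes the previous tag.
        int close = html.lastIndexOf('>', anchor);
        return html.substring(close + 1); // close == -1 yields the whole string
    }

    public static void main(String[] args) {
        String input = "<img src=\"abc.jpg\"> blah blah blah "
                + "<a href=\"http://en.wikipedia.org/wiki/Elephant\">elephant is an animal</a> blah";
        System.out.println(fromPrecedingTag(input).trim());
    }
}
```

The &nbsp; entities would still have to be removed in a separate step.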

_Sees regex and HTML in title_ http://stackoverflow.com/a/1732454/2846923 –

Maybe something like this?

^<[^>]*>\s*|&nbsp;

Java escaped:

^<[^>]*>\\s*|&nbsp;

regex101 demo

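^<[^>]*>\\s* matches the first img tag along with any whitespace that follows it. The second alternative then matches each &nbsp;. The replacement string is "".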
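As a minimal sketch of how the escaped pattern might be applied: the example below uses String.replaceAll, because String.replace treats its first argument as literal text and not as a regex. The class and method names are hypothetical, and the input is assumed to start with the tag to be removed.

```java
class StripLeadingTag {
    // Java-escaped form of ^<[^>]*>\s*|&nbsp;
    static final String PATTERN = "^<[^>]*>\\s*|&nbsp;";

    // Removes a tag at the very start of the string (plus trailing whitespace)
    // and every &nbsp; entity, replacing each match with the empty string.
    static String clean(String html) {
        return html.replaceAll(PATTERN, "");
    }

    public static void main(String[] args) {
        String input = "<img class=\"size-full wp-image-10225\" alt=\"animals\" src=\"abc.jpg\"> "
                + "blah blah blah&nbsp; "
                + "<a href=\"http://en.wikipedia.org/wiki/Elephant\">elephant is an animal</a>&nbsp;blah";
        System.out.println(clean(input));
    }
}
```

Because each &nbsp; is replaced with nothing, the final "blah" ends up directly after </a>. Replacing &nbsp; with a space in a separate call would preserve the spacing shown in the question.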

Although you may want to use a proper HTML parser, since it is less likely to break.

Source

2014-03-28 19:16:45 Jerry

Hey, thanks Jerry. I didn't get the Java-escaped part, though. So should I do this: str.replace(^<[^>]*>\s*|&nbsp;, "") – user3298846

@user3298846 Use the escaped version in Java. :) – Jerry

Yes, sorry, I didn't read the whole question. – Andres
