Regex to get HTML opening tags

I only want to get the opening HTML tags. Say I have HTML like this:

<div class="some">Here is a sample text<br /><p>A paragraph here</p></div> 
<ul><li>List Item</li></ul>

From the HTML above I want to extract this:

<div 
<br 
<p 
<ul 
<li

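Note that I don't need the closing ">".

Source

2012-01-20 coure2011

I suck at regex, so whenever I need a simple regular expression I use this site to help me build it. I came up with an answer to your question in 10 seconds, even though I only know the basics: http://gskinner.com/RegExr/ – gsingh2011

Try the regex /<[a-zA-Z]+[1-6]?/g. I added the [1-6] for the heading tags, which I think are the only tags with digits in their names. If you want to be sure, you can use /<[a-zA-Z0-9]+/g, because in HTML a < is always the start of a tag (unless it is a comment, <!--), since a literal < in text gets converted to &lt;.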
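To make the difference between the two patterns concrete, here is a minimal JavaScript sketch. The helper name `openingTags`, the constants `STRICT` and `LOOSE`, and the sample string are assumptions made up for illustration; they do not come from the answer itself.

```javascript
// Pattern from the answer: letters, then an optional single digit 1-6 (h1..h6).
const STRICT = /<[a-zA-Z]+[1-6]?/g;
// Looser alternative from the answer: any mix of letters and digits.
const LOOSE = /<[a-zA-Z0-9]+/g;

// Hypothetical helper: String.prototype.match returns null when nothing
// matches, so fall back to an empty array.
function openingTags(html, pattern) {
  return html.match(pattern) || [];
}

// Hypothetical sample input with a heading tag.
const sample = '<div class="some">text<br /><h2>Title</h2></div>';

console.log(openingTags(sample, STRICT)); // ['<div', '<br', '<h2']
console.log(openingTags(sample, LOOSE));  // ['<div', '<br', '<h2']

// The two patterns differ on input that starts with a digit:
console.log(openingTags('<6>', STRICT)); // []
console.log(openingTags('<6>', LOOSE));  // ['<6']
```

Closing tags such as </div> are skipped by both patterns because the character after the < is a slash, not a letter or digit.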

Source

2012-01-20 05:43:36

The following returns an array of the matches you want from the HTML body.

'<div class="some">Here is a sample text<br /><p>A paragraph here</p></div><ul><li>List Item</li></ul>'.match(/<\w+/g)
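As a rough sketch of how that one-liner behaves on a few inputs, the hypothetical function `matchOpeningTags` below wraps the same /<\w+/g pattern. The function name and the extra test strings are assumptions for illustration only. One thing worth keeping in mind is that \w means letters, digits, and underscore, so it is looser than a letters-only pattern and stops at a hyphen.

```javascript
// Hypothetical wrapper around the /<\w+/g pattern from the answer.
// match() returns null when there is no match, so default to [].
function matchOpeningTags(html) {
  return html.match(/<\w+/g) || [];
}

const questionHtml =
  '<div class="some">Here is a sample text<br /><p>A paragraph here</p></div>' +
  '<ul><li>List Item</li></ul>';

console.log(matchOpeningTags(questionHtml));
// ['<div', '<br', '<p', '<ul', '<li']

// Comments and closing tags are skipped: "!" and "/" are not word characters.
console.log(matchOpeningTags('<!-- note --></p>')); // []

// \w does not include "-", so a hyphenated name is cut short.
console.log(matchOpeningTags('<my-widget>')); // ['<my']
```

If hyphenated element names matter for your input, the character class would need to be widened; that is outside what the answer covers.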

Source

2012-01-20 05:50:26

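How about:

import java.util.Scanner;

String input = "<div class=\"some\">Here is a sample text<br /><p>A paragraph here</p></div><ul><li>List Item</li></ul><6>";
Scanner scanner = new Scanner(input);
String result;
// findInLine returns null once the rest of the line has no further match.
// Note that \w also matches digits, so the trailing <6> is reported as "<6".
while ((result = scanner.findInLine("<\\w+")) != null) {
    System.out.println(result);
}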

Source

2012-01-20 08:48:37 Eugene

The solution is in Java – coure2011

@course2011 It really doesn't matter; the regex is what counts – Eugene
