How can I use RegEx to extract values from HTML?

Consider the following HTML:
<p><span class="xn-location">OAK RIDGE, N.J.</span>, <span class="xn-chron">March 16, 2011</span> /PRNewswire/ -- Lakeland Bancorp, Inc. (Nasdaq: <a href='http://studio-5.financialcontent.com/prnews?Page=Quote&Ticker=LBAI' target='_blank' title='LBAI'> LBAI</a>), the holding company for Lakeland Bank, today announced that it redeemed <span class="xn-money">$20 million</span> of the Company's outstanding <span class="xn-money">$39 million</span> in Fixed Rate Cumulative Perpetual Preferred Stock, Series A that was issued to the U.S. Department of the Treasury under the Capital Purchase Program on <span class="xn-chron">February 6, 2009</span>, thereby reducing Treasury's investment in the Preferred Stock to <span class="xn-money">$19 million</span>. The Company paid approximately <span class="xn-money">$20.1 million</span> to the Treasury to repurchase the Preferred Stock, which included payment for accrued and unpaid dividends for the shares.  This second repayment, or redemption, of Preferred Stock will result in annualized savings of <span class="xn-money">$1.2 million</span> due to the elimination of the associated preferred dividends and related discount accretion.  A one-time, non-cash charge of <span class="xn-money">$745 thousand</span> will be incurred in the first quarter of 2011 due to the acceleration of the Preferred Stock discount accretion.  The warrant previously issued to the Treasury to purchase 997,049 shares of common stock at an exercise price of <span class="xn-money">$8.88</span>, adjusted for stock dividends and subject to further anti-dilution adjustments, will remain outstanding.</p>
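I want to get the values inside the <span> elements. I also want to get the value of the class attribute on each <span> element.

Ideally, I could run some HTML through a function and get back a dictionary of the extracted entities (based on the <span> parsing described above).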
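The code above is a snippet from a larger source HTML file, which cannot be parsed with an XML parser. So I am looking for a regular expression that would help extract the information of interest.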
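For illustration, here is a minimal sketch of the kind of function described above. It is written in Python rather than .NET purely to keep the example short, and the names `SPAN_RE` and `extract_spans` are hypothetical. It assumes the spans are never nested and carry only a class attribute, as in the snippet; because a class such as xn-money occurs several times, each dictionary key maps to a list of values. The pattern relies only on a backreference, a lazy quantifier, and named groups, so it should carry over to other regex engines with minor syntax changes.

```python
import re
from collections import defaultdict

# Matches <span class="...">text</span>.
# Assumption: spans are not nested and class is the only attribute.
SPAN_RE = re.compile(
    r"""<span\s+class=(["'])(?P<cls>[^"']*)\1\s*>(?P<text>.*?)</span>""",
    re.IGNORECASE | re.DOTALL,
)


def extract_spans(html):
    """Return {class value: [span texts in document order]}."""
    entities = defaultdict(list)
    for match in SPAN_RE.finditer(html):
        entities[match.group("cls")].append(match.group("text").strip())
    return dict(entities)


# Shortened version of the snippet from the question.
sample = (
    '<p><span class="xn-location">OAK RIDGE, N.J.</span>, '
    '<span class="xn-chron">March 16, 2011</span> ... '
    '<span class="xn-money">$20 million</span></p>'
)
print(extract_spans(sample))
```

A regex like this will miss spans with extra attributes or nested markup, which is the usual reason to prefer a tolerant HTML parser when one is available.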
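What programming language are you using? There are libraries that accept HTML that is not valid XML and still let you query it with XPath expressions and the like. – 2011-03-16 15:26:37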
Programming language = .NET – 2011-03-16 15:32:40
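As a rough sketch of the approach the first comment hints at, a tolerant HTML parser can collect the spans without any regex. The example below uses Python's standard `html.parser` module only as an illustration; it is not a .NET library and does not support XPath. The names `SpanCollector` and `extract_spans_with_parser` are hypothetical, and the code assumes spans are not nested.

```python
from html.parser import HTMLParser


class SpanCollector(HTMLParser):
    """Collects the text of each <span> that has a class attribute."""

    def __init__(self):
        super().__init__()
        self.entities = {}   # class value -> list of span texts
        self._cls = None     # class of the span currently open, if any
        self._buf = []       # text chunks seen inside that span

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._cls = dict(attrs).get("class")
            self._buf = []

    def handle_data(self, data):
        if self._cls is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._cls is not None:
            text = "".join(self._buf).strip()
            self.entities.setdefault(self._cls, []).append(text)
            self._cls = None


def extract_spans_with_parser(html):
    """Return {class value: [span texts]} for possibly malformed HTML."""
    parser = SpanCollector()
    parser.feed(html)
    parser.close()
    return parser.entities


# Deliberately not well-formed XML: unclosed <p> and <br>.
messy = ('<p>Paid <span class="xn-money">$8.88</span><br>'
         'on <span class="xn-chron">February 6, 2009</span>')
print(extract_spans_with_parser(messy))
```

The parser version tolerates unclosed tags and extra attributes on the span, at the cost of a little more code than the single pattern.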