2017-07-10 74 views
<div class="school_row_cell_content"> 
<div class="center_div"> 
    <img src="/assets/isbiimages/c1504.jpg" alt="School Crest" width="190"></div> 
       Shelburne Road, 
<br class="">Cheltenham, 
<br class="">Gloucestershire, 
<br class="">United Kingdom, 
<br class="">GL51 6HE 
<br class=""> 
<br class=""><strong>Tel:</strong> +44 1242 258000 
<br class=""><strong>Fax:</strong> +44 1242 258004 
<br class=""><br class=""><strong><a href="http://www.deanclose.org.uk" rel="nofollow" target="_blank" id="154" title="opens in new window" class="school_website_btn">Visit School Website</a></strong> 
<br class=""> 
<br class=""><strong>Founded:</strong>1886<br class=""><br class=""><strong>Headmaster:</strong> 
<br class=""><a href="/assets/isbiimages/ph1504.jpg" class="iframe_popups">Mr Bradley Salisbury</a> 
<br class=""><br class=""><strong>Registrar:</strong> 
<br class="">Mrs Kelly Serjeant 
<br class=""> 
<br class="">This school offers flexi-boarding. 
<br class=""> 
<br class=""><strong>Accreditations and affiliations:</strong> 
<br class="">ISBA, HMC, BSA, AGBIS 
<br class=""><strong>Religious affiliation:</strong> 
<br class="">Church of England<br class=""><strong>Teaching languages:</strong> 
<br class="">English 
<br class="">           
</div> 

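I want to grab certain fields based on their labels. For example, I want to grab "Founded", which in this example would be 1886, and also "Registrar", which here is Mrs Kelly Serjeant. In short: how do I use XPath to grab the text that follows a strong tag?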

I tried this variation, with no luck:

//strong[starts-with(., 'Registrar:')]//text()[not(parent::strong)] 

I'm not sure what I'm doing wrong.

Answers


Try:

//div[@class="school_row_cell_content"]//text()[.="Registrar:"]/following::text()[string-length()>0][1] 

This returns Mrs Kelly Serjeant.

Replace "Registrar:" with "Founded:", as in:

//div[@class="school_row_cell_content"]//text()[.="Founded:"]/following::text()[string-length()>0][1] 

This returns 1886.
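
The idea behind this answer is to find the label text, then take the first non-empty text that follows it in document order. As a rough cross-check, the same idea can be sketched without XPath using only Python's standard-library `html.parser`. This is not an XPath engine, just an illustration of the logic. `SAMPLE_HTML`, `LabelValueParser` and `extract_fields` are hypothetical names, and the markup is a trimmed copy of the snippet from the question. The sketch skips whitespace-only text explicitly, because the markup puts line breaks between a label and its value.

```python
from html.parser import HTMLParser

# Trimmed copy of the markup from the question (assumption: the real page
# has the same label/value layout).
SAMPLE_HTML = """
<div class="school_row_cell_content">
<br class=""><strong>Founded:</strong>1886<br class=""><br class=""><strong>Headmaster:</strong>
<br class=""><a href="/assets/isbiimages/ph1504.jpg" class="iframe_popups">Mr Bradley Salisbury</a>
<br class=""><br class=""><strong>Registrar:</strong>
<br class="">Mrs Kelly Serjeant
<br class="">
</div>
"""


class LabelValueParser(HTMLParser):
    """Map each <strong> label to the first non-blank text that follows it."""

    def __init__(self):
        super().__init__()
        self.in_strong = False
        self.pending_label = None  # label still waiting for its value
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == "strong":
            self.in_strong = True

    def handle_endtag(self, tag):
        if tag == "strong":
            self.in_strong = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return  # skip whitespace-only text between tags
        if self.in_strong:
            # Text inside <strong> is a label; drop the trailing colon.
            self.pending_label = text.rstrip(":")
        elif self.pending_label is not None:
            # First non-blank text after the label is taken as its value.
            self.fields[self.pending_label] = text
            self.pending_label = None


def extract_fields(html):
    parser = LabelValueParser()
    parser.feed(html)
    parser.close()
    return parser.fields


fields = extract_fields(SAMPLE_HTML)
print(fields["Founded"])    # 1886
print(fields["Registrar"])  # Mrs Kelly Serjeant
```

Because the value is whatever text comes next in document order, the headmaster's name is picked up even though it sits inside an `<a>` element. This matches what the `following::text()` axis does, since it is not limited to siblings.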


You can also try the XPath below, replacing only the text inside the contains() part:

//*[contains(text(),'Founded')]/following-sibling::text()[string-length()>1][position()<2]