2014-03-24 76 views

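Jsoup: getting values from elements that all share the same class names

This is my HTML. I want to extract two details from it:

Publisher: Springer-Verlag, Price: $7,284

The problem is that the outer and inner elements all use the same class names. Please suggest how to use jsoup to get the two values above from the HTML below.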

<div class="details"> 
    <div class="fullname">ANALYTICAL AND BIOANALYTICAL CHEMISTRY (2011)</div> 
    <div class="catbox"> 
     <div class="catcontents"> 
      <div class="contents_ct1">Eigenfactor Category:</div> 
      <div class="contents_ct2" style="margin-left: -5px;">ANALYTIC CHEMISTRY</div> 
     </div> 
     <div class="catcontents"> 
      <div class="contents_ct1">ISI Category:</div> 
      <div class="contents_ct2" style="margin-left: -49px;">CO EA</div> 
     </div> 
     <div class="catcontents"> 
      <div class="contents_ct1">Group:</div> 
      <div class="contents_ct2" style="margin-left: -80px;">Sci</div> 
     </div> 
     <div class="catcontents"> 
      <div class="contents_ct1">Total Articles (5yrs):</div> 
      <div class="contents_ct2" style="margin-left: -12px;">3,544</div> 
     </div> 
    </div> 
    <div class="catbox" style="margin-left: 20px"> 
     <div class="catcontents"> 
      <div class="contents_ct1">Publisher:</div> 
      <div class="contents_ct2" style="margin-left: -55px;">Springer-Verlag</div> 
     </div> 
     <div class="catcontents"> 
      <div class="contents_ct1">First Published:</div> 
      <div class="contents_ct2" style="margin-left: -35px;">2001</div> 
     </div> 
     <div class="catcontents"> 
      <div class="contents_ct1"><a href="http://journalprices.com/" title="Prices provided by JournalPrices.com" target="_blank" style="font-size: 11px">Price:</a></div> 
      <div class="contents_ct2" style="margin-left: -80px;">$7,284</div> 
     </div> 
     <div class="catcontents"> 
      <div class="contents_ct1">Cost Effectiveness:</div> 
      <div class="contents_ct2" style="margin-left: -18px;">1.0302</div> 
     </div> 
    </div> 
    <div class="tgraph"> 
     <div class="plotB"> 
      <iframe src="plot1.php?issn=1618-2642" width="370px" height="220px" frameborder=0 scrolling="no"></iframe> 
     </div> 
     <div class="plotB" style="margin-left: 10px"> 
      <iframe src="plot2.php?issn=1618-2642" width="340px" height="220px" frameborder=0 scrolling="no"></iframe> 
     </div> 
    </div> 
</div> 

Do you know whether the strings 'Publisher:' and 'Price:' ever change? If they don't, you can find the elements based on those strings. – DangerDan

Answers


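Static HTML structure

Assuming the layout always follows the structure of the source you provided, you can use plain CSS selector syntax to pick out the elements to parse. Note that :eq(n) matches on the zero-based index among sibling elements, so div.catbox:eq(2) is the second catbox (the fullname div has index 0).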

Element publisher = doc.select("div.catbox:eq(2) div.catcontents div.contents_ct2").first(); 
Element price = doc.select("div.catbox:eq(2) div.catcontents:eq(2) div.contents_ct2").first(); 
System.out.println("Publisher: " + publisher.text() + "\nPrice: " + price.text()); 

This prints:

run: 
Publisher: Springer-Verlag 
Price: $7,284 

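Dynamic HTML structure

If the structure is not always the same, the following code should produce the same result, but it checks each element's text to identify the right ones.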

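Elements content = doc.select("div.catcontents");
Element publisher = null;
Element price = null;
for (Element element : content) {
    // text() joins the label and value cells, e.g. "Publisher: Springer-Verlag"
    String text = element.text();
    if (text.startsWith("Publisher")) {
        publisher = element;
    } else if (text.startsWith("Price")) {
        price = element;
    }
}
// Guard against a missing row; otherwise text() would throw a NullPointerException
if (publisher != null && price != null) {
    System.out.println(publisher.text() + "\n" + price.text());
}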
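
A variation on the loop above, in case you want only the value rather than the label and value together: match the label cell (contents_ct1) exactly and read the cell that follows it. This is a minimal sketch, not a definitive implementation. The class LabelLookup, the helper valueFor, and the constant SAMPLE_HTML are hypothetical names introduced for illustration, and the markup is a trimmed copy of the HTML from the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LabelLookup {

    // Trimmed copy of the question's markup (assumption: the real page keeps this shape)
    static final String SAMPLE_HTML =
          "<div class=\"details\">"
        + "<div class=\"catbox\">"
        + "<div class=\"catcontents\">"
        + "<div class=\"contents_ct1\">Publisher:</div>"
        + "<div class=\"contents_ct2\">Springer-Verlag</div>"
        + "</div>"
        + "<div class=\"catcontents\">"
        + "<div class=\"contents_ct1\">First Published:</div>"
        + "<div class=\"contents_ct2\">2001</div>"
        + "</div>"
        + "<div class=\"catcontents\">"
        + "<div class=\"contents_ct1\"><a href=\"http://journalprices.com/\">Price:</a></div>"
        + "<div class=\"contents_ct2\">$7,284</div>"
        + "</div>"
        + "</div>"
        + "</div>";

    // Hypothetical helper (not part of jsoup): returns the text of the
    // contents_ct2 cell directly after the matching label cell, or null.
    static String valueFor(Document doc, String label) {
        for (Element labelCell : doc.select("div.contents_ct1")) {
            // text() includes nested elements, so the <a>Price:</a> link is covered
            if (labelCell.text().trim().equals(label)) {
                Element valueCell = labelCell.nextElementSibling();
                if (valueCell != null && valueCell.hasClass("contents_ct2")) {
                    return valueCell.text();
                }
            }
        }
        return null; // label not present
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        System.out.println("Publisher: " + valueFor(doc, "Publisher:"));
        System.out.println("Price: " + valueFor(doc, "Price:"));
    }
}
```

Matching the label exactly, instead of with startsWith, keeps "Publisher:" from being confused with similar labels, and returning null for a missing label leaves the decision about what to do to the caller.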

If this turns out to answer your question, please remember to mark it as the accepted answer! –
