2014-02-05

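Parsing table data with jsoup

I'm using jsoup in my Android app to parse HTML, but now I need to parse table data and I can't get it to work. I've tried many approaches without success, so I'm trying my luck here in case someone has experience with this.

Here is part of my HTML: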

<div id="editacia_jedla"> 
    <h2>My header</h2> 
    <h3>My sub header</h3> 

    <table border="0" class="jedalny_listok_tabulka" cellpadding="2" cellspacing="1"> 
    <tr> 
     <td width="100" class="menu_nazov neparna" align="left">Food Menu 1</td> 
     <td class="jedlo neparna" align="left">vegetable and beef 
     <div class="jedlo_box_alergeny">Allergens: <a href="#" class="alergen_1">1</a>, <a href="#" class="alergen_3">3</a></div> 
     </td> 
    </tr> 
    <tr> 
     <td width="100" class="menu_nazov parna" align="left">Food Menu 2</td> 
     <td class="jedlo parna" align="left">Potato salad and pork 
     <div class="jedlo_box_alergeny">Allergens: <a href="#" class="alergen_6">6</a></div> 
     </td> 
    </tr> 
    </table> 
    etc 
</div> 

My Java/Android code:

try {
    String tableHtmlCode = "";
    Document fullHtmlDocument = Jsoup.connect(urlOfFoodDay).get();
    Element elm1 = fullHtmlDocument.select("#editacia_jedla").first();
    for (Element element : elm1.children()) {
        // meant to pick out the table content, because index 0 = h2 and 1 = h3
        tableHtmlCode += element.getElementsByIndexEquals(2);
    }
    Document parsedTableDocument = Jsoup.parse(tableHtmlCode);
    // Element th = parsedTableDocument.select("td[class=jedlo neparna]").first(); // THIS IS BAD
    String foodContent = "";
    String foodAllergens = "";
} catch (IOException e) {
    e.printStackTrace();
}

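So now I want to extract the text "vegetable and beef" and save it to the string foodContent, and extract the numbers 1, 3 (together) from the div with class jedlo_box_alergeny and save them to the string foodAllergens. Can anyone help? I'd be very grateful for any ideas.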

Answer

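Go to the parent table with class jedalny_listok_tabulka in the document and loop over its td tags.

Each td tag is the parent of the a (href) tags that contain the allergen values. So iterate over all the a elements inside the cell to get your numbers, something like this: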

Elements myElements = doc.getElementsByClass("jedalny_listok_tabulka")
        .first().getElementsByTag("td");
for (Element element : myElements) {
    // only the cells whose class contains "jedlo" hold the dish text
    if (element.className().contains("jedlo")) {
        // ownText() returns the td's own text, without the nested div
        String foodContent = element.ownText();
        String foodAllergen = "";

        // each <a> inside the cell carries one allergen number
        for (Element href : element.getElementsByTag("a")) {
            foodAllergen += " " + href.text();
        }

        System.out.println(foodContent + " : " + foodAllergen);
    }
}

Output:

vegetable and beef : 1 3 
Potato salad and pork : 6 
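
As a variation on the loop above, the same extraction can probably be written with jsoup CSS selectors. This is only a sketch, not the asker's code: the class name AllergenTableSketch, the method extractMenu and the inlined SAMPLE_HTML (a trimmed copy of the table from the question) are assumptions made for illustration, and the allergen numbers are joined with ", " rather than a space.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class, not part of the original question or answer.
public class AllergenTableSketch {

    // Trimmed copy of the question's table, inlined so no network call is needed.
    static final String SAMPLE_HTML =
        "<div id=\"editacia_jedla\">"
        + "<table class=\"jedalny_listok_tabulka\">"
        + "<tr><td class=\"menu_nazov neparna\">Food Menu 1</td>"
        + "<td class=\"jedlo neparna\">vegetable and beef "
        + "<div class=\"jedlo_box_alergeny\">Allergens: "
        + "<a href=\"#\" class=\"alergen_1\">1</a>, "
        + "<a href=\"#\" class=\"alergen_3\">3</a></div></td></tr>"
        + "<tr><td class=\"menu_nazov parna\">Food Menu 2</td>"
        + "<td class=\"jedlo parna\">Potato salad and pork "
        + "<div class=\"jedlo_box_alergeny\">Allergens: "
        + "<a href=\"#\" class=\"alergen_6\">6</a></div></td></tr>"
        + "</table></div>";

    // Maps each dish text to its allergen numbers, in table order.
    static Map<String, String> extractMenu(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, String> menu = new LinkedHashMap<>();
        // td.jedlo matches both "jedlo neparna" and "jedlo parna" cells
        for (Element cell : doc.select("table.jedalny_listok_tabulka td.jedlo")) {
            String foodContent = cell.ownText(); // text of the td without the nested div
            StringBuilder foodAllergens = new StringBuilder();
            for (Element link : cell.select("div.jedlo_box_alergeny a")) {
                if (foodAllergens.length() > 0) {
                    foodAllergens.append(", ");
                }
                foodAllergens.append(link.text());
            }
            menu.put(foodContent, foodAllergens.toString());
        }
        return menu;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : extractMenu(SAMPLE_HTML).entrySet()) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}
```

Selecting by class (td.jedlo) avoids depending on the position of the table among its siblings, which is what the index-based attempt in the question relied on. In the Android app, the Document would come from Jsoup.connect(urlOfFoodDay).get() instead of the inlined string.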

Perfect, thank you very much.