Jsoup table parsing


I'm new to jsoup and to this parsing thingy, so if you need more information to be able to answer my question, please let me know!

I have this table that I want to parse with Jsoup in Java. I just want to get the text:

"B S Computer Science, CS (2012-2014)"

from this part of the table:

<h3>Fahran S Kamili (fsk226)</h3> 
     <div> 
      10 Degree Audit Requests Returned. 
     </div> 
     <table> 
      <thead> 
       <tr> 
<!-- *nrfkh - 9/2012: [degaudt-634]* --> 
         <th colspan="8">Degree Audits Requested</th> 

<!-- *end nrfkh - 9/2012: [degaudt-634]* --> 

       </tr> 
       <tr> 
        <th>Rerun</th> 

<!-- *nrfkh - 9/2012: [degaudt-634]* --> 

<!-- *end nrfkh - 9/2012: [degaudt-634]* --> 
        <th>Request Created</th> 
<!-- *nrfkh - 9/2012: [degaudt-634]* --> 

<!-- *end nrfkh - 9/2012: [degaudt-634]* --> 
        <th>Audit Type</th> 
        <th>Program</th> 
        <th>Courses Requested</th> 
        <th>Request Status</th> 
        <th>Audit ID</th> 
        <th>Delete Option</th> 
       </tr> 
      </thead> 
        <tbody><tr> 
         <td> 
            <a href="https://utdirect.utexas.edu/apps/degree/audits/requests/student_individual/?form-0-eid=fsk226&form-0-name=Fahran%20S%20Kamili&form-0-begin_ccyy=2012&form-0-degree_plan=ESC%20SS%20CS&form-0-minor=&current=X&future=&planned=&form-TOTAL_FORMS=20&form-INITIAL_FORMS=0&form-MAX_NUM_FORMS=&rerun=" target="_blank">Rerun</a> 
         </td> 
<!-- *nrfkh - 9/2012: [degaudt-634]* --> 
<!-- *end nrfkh - 9/2012: [degaudt-634]* --> 
         <td> 
          12/20/2013 
          05:06 PM 
         </td> 
<!-- *nrfkh - 9/2012: [degaudt-634]* --> 
<!-- *end nrfkh - 9/2012: [degaudt-634]* --> 
         <td> 
           Normal 

         </td> 
         <td> 
          B S Computer Science, CS 
          (2012-2014) 
         </td>

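The table actually continues well beyond this excerpt, but the remaining cells are just siblings of the ones shown (so I assume that if I can get this text, I can easily get the other text too).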


2014-01-25 user3134067

"So if you need more information..." Yes: for example, what have you tried so far, and how did it not work? And what in particular confuses you?

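If I were to save your HTML fragment to a file and parse it with jsoup, I would try printing every td element encountered, on the assumption that this is what you are after: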

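import java.io.File;
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PrintTableCells {
    public static void main(String... args) throws IOException {
        File input = new File("C:/users/XYZ/desktop/input.html");
        Document doc = Jsoup.parse(input, "UTF-8", "");
        // Print the whitespace-normalized text of every <td> in the document.
        for (Element td : doc.getElementsByTag("td")) {
            System.out.println(td.text());
        }
    }
}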

Output:

Rerun 
12/20/2013 05:06 PM 
Normal 
B S Computer Science, CS (2012-2014)
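
To get only the Program cell rather than every td, one option is to look the column up by its header text and then read that cell from each body row. The following is a minimal sketch rather than tested code from the answer above: the class name ProgramColumn, the helper columnTexts, and the shortened inline HTML are made up for illustration, and it assumes the last header row has exactly one th per body column.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class ProgramColumn {

    // Hypothetical helper: returns the text of the cell under the given
    // header for every row in the table body.
    static List<String> columnTexts(String html, String header) {
        List<String> result = new ArrayList<>();
        Document doc = Jsoup.parse(html);

        // Assumption: the last <tr> in <thead> holds the real column headers.
        Element headerRow = doc.select("table thead tr").last();
        if (headerRow == null) return result;

        // Find the position of the header we are interested in.
        int column = -1;
        Elements headers = headerRow.children();
        for (int i = 0; i < headers.size(); i++) {
            if (headers.get(i).text().equals(header)) { column = i; break; }
        }
        if (column < 0) return result;

        // children() returns only element children, so HTML comments
        // between the cells do not shift the index.
        for (Element row : doc.select("table tbody tr")) {
            Elements cells = row.children();
            if (cells.size() > column) result.add(cells.get(column).text());
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the markup in the question.
        String html = "<table><thead>"
            + "<tr><th colspan=\"8\">Degree Audits Requested</th></tr>"
            + "<tr><th>Rerun</th><th>Request Created</th><th>Audit Type</th><th>Program</th></tr>"
            + "</thead><tbody><tr>"
            + "<td><a href=\"#\">Rerun</a></td>"
            + "<td> 12/20/2013 05:06 PM </td>"
            + "<td> Normal </td>"
            + "<td> B S Computer Science, CS\n (2012-2014) </td>"
            + "</tr></tbody></table>";
        // Prints: B S Computer Science, CS (2012-2014)
        for (String text : columnTexts(html, "Program")) System.out.println(text);
    }
}
```

Because the rest of the table consists of sibling rows with the same layout, the same call should return one entry per row, and passing a different header name such as "Audit Type" would pick a different column.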


2014-01-25 19:53:01 PopoFibo
