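Scraping data in URL format via Android/jsoup
I'm trying to scrape the URLs from an HTML table, but every time I get the link's title text instead of the URL in its href. Can anyone solve or work around this?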
<table class="datagrid">
<tr>
<th>Number</th>
<th>Name</th>
<th>Sex</th>
<th>Location</th>
</tr>
<tr>
<td><a href="redirector.cfm?ID=93bd5121-7a3b-4a56-a576-f432e542047a&page=1&&lname=&fname=" title="501207593">501207593 </a></td>
<td>AARON, JUSTIN COLBY </td>
<td>M </td>
<td>Facility 1</td>
</tr>
<tr>
<td><a href="redirector.cfm?ID=c5629a92-7113-487c-ba9b-1e62203ab08d&page=1&&lname=&fname=" title="501302750">501302750 </a></td>
<td>AARONSON, CARY HOWARD </td>
<td>M </td>
<td>Facility 2</td>
</tr>
<tr>
<td><a href="redirector.cfm?ID=66d01768-5686-44eb-ac6a-16eb783f52d0&page=1&&lname=&fname=" title="501306284">501306284 </a></td>
<td>ABBOTT, LAUREA </td>
<td>F </td>
<td>Facility 3</td>
</tr>
Source:
public class MainActivity extends Activity {
    TextView tv;
    String url = "http://google.com";
    String tr;
    Document doc;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        tv = (TextView) findViewById(R.id.TextView01);
        new MyTask().execute(url);
    }

    private class MyTask extends AsyncTask<String, Void, String> {
        ProgressDialog prog;
        String title = "";

        @Override
        protected void onPreExecute() {
            prog = new ProgressDialog(MainActivity.this);
            prog.setMessage("Loading....");
            prog.show();
        }

        @Override
        protected String doInBackground(String... params) {
            try {
                doc = Jsoup.connect(params[0]).get();
                Element tableElement = doc.select(".datagrid").first();
                Elements tableRows = tableElement.select("tr");
                for (Element row : tableRows) {
                    Elements cells = row.select("td");
                    if (cells.size() > 0) {
                        title = cells.get(0).text() + "; "
                                + cells.get(1).text() + "; "
                                + cells.get(2).text() + "; "
                                + cells.get(3).text();
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
            return title;
        }

        @Override
        protected void onPostExecute(String title) {
            super.onPostExecute(title);
            prog.dismiss();
            tv.setText(title);
        }
    }
}
Current result:
501306284; ABBOTT, LAUREA; F; Facility 3
Desired result:
redirector.cfm?ID=66d01768-5686-44eb-ac6a-16eb783f52d0&page=1&&lname=&fname=" title="501306284; ABBOTT, LAUREA; F; Facility 3
Or better yet...
Desired effect:
Click here for more information (<- URL); ABBOTT, LAUREA; F; Facility 3
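I tried the following, but it doesn't seem to work... title = cells.get(0).attr("href") + "; " + cells.get(0).text() + "; " + cells.get(1).text() + "; " + cells.get(2).text() + "; " + cells.get(3).text(); – HelloMojo
Oh, that's because you are iterating over the `td` elements, and the href attribute is on the `a` inside the cell, not on the `td` itself. So you have to ask for the first child of the `td`. The updated answer should work. –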
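For anyone landing here later, the suggestion in the last comment can be sketched roughly as follows. This is only a minimal sketch, not the answerer's exact code: the class name `RowLinks`, the helper `formatRows`, and the base URI `http://example.com/` are made up for illustration, and it parses an HTML string instead of fetching a page so it can run outside Android. The point is that the href is read from the `a` element inside the first `td` rather than from the `td`.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class RowLinks {

    // Hypothetical helper: builds one "url; name; sex; location" line
    // per data row of the table with class "datagrid".
    public static List<String> formatRows(String html, String baseUri) {
        // The base URI is what lets jsoup turn a relative href into an absolute URL.
        Document doc = Jsoup.parse(html, baseUri);
        List<String> lines = new ArrayList<>();
        for (Element row : doc.select("table.datagrid tr")) {
            Elements cells = row.select("td");
            if (cells.size() < 4) {
                continue; // header row (th cells only) or an incomplete row
            }
            // The href attribute belongs to the <a> inside the first cell, not to the <td>.
            Element link = cells.get(0).select("a[href]").first();
            String href = (link != null) ? link.absUrl("href") : "";
            lines.add(href + "; "
                    + cells.get(1).text() + "; "
                    + cells.get(2).text() + "; "
                    + cells.get(3).text());
        }
        return lines;
    }

    public static void main(String[] args) {
        // One row copied from the question's table, wrapped in a closed <table>.
        String html = "<table class=\"datagrid\">"
                + "<tr><th>Number</th><th>Name</th><th>Sex</th><th>Location</th></tr>"
                + "<tr><td><a href=\"redirector.cfm?ID=66d01768-5686-44eb-ac6a-16eb783f52d0"
                + "&page=1&&lname=&fname=\" title=\"501306284\">501306284 </a></td>"
                + "<td>ABBOTT, LAUREA </td><td>F </td><td>Facility 3</td></tr>"
                + "</table>";
        for (String line : formatRows(html, "http://example.com/")) {
            System.out.println(line);
        }
    }
}
```

In the AsyncTask from the question, the document returned by `Jsoup.connect(params[0]).get()` already carries the page URL as its base URI, so `absUrl("href")` should resolve against the real site. If only the raw attribute value is wanted, `link.attr("href")` returns it unchanged.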