2017-04-21 52 views

Creating a loop to parse table data in Scrapy

I want to loop over the table rows in the HTML below. I am using the XPath selector //*[@id="employee-table"]/tbody/tr, but it does not work.

<table id="employee-table" class="table table-striped table-bordered responsive-table dataTable no-footer" role="grid" aria-describedby="employee-table_info" style="width: 882px;"> 
<thead> 
<tr role="row"><th class="sorting_asc" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-sort="ascending" aria-label=" Name : activate to sort column descending" style="width: 174px;"> Name </th><th class="sorting" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-label=" Year : activate to sort column ascending" style="width: 36px;"> Year </th><th class="sorting" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-label=" Title : activate to sort column ascending" style="width: 82px;"> Title </th><th class="sorting" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-label=" Agency : activate to sort column ascending" style="width: 192px;"> Agency </th><th class="sorting" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-label=" Location : activate to sort column ascending" style="width: 115px;"> Location </th><th class="sorting" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-label=" Salary : activate to sort column ascending" style="width: 50px;"> Salary </th></tr> 
</thead> 
<tbody> 
<tr role="row" class="odd"><td class="sorting_1"><a href="/employees/veterans-health-administration/bharatkumar-a-g">A G. Bharatkumar</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Wisconsin</td><td>$335,000</td></tr><tr role="row" class="even"><td class="sorting_1"><a href="/employees/veterans-health-administration/roure-a-rafael">A Rafael Roure</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Florida</td><td>$333,634</td></tr><tr role="row" class="odd"><td class="sorting_1"><a href="/employees/veterans-health-administration/dumont-aaron-s">Aaron S. Dumont</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Louisiana</td><td>$330,302</td></tr><tr role="row" class="even"><td class="sorting_1"><a href="/employees/veterans-health-administration/andrews-aaron-t">Aaron T. Andrews</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Florida</td><td>$350,000</td></tr><tr role="row" class="odd"><td class="sorting_1"><a href="/employees/veterans-health-administration/elmi-abdolali">Abdolali Elmi</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>West Virginia</td><td>$325,056</td></tr><tr role="row" class="even"><td class="sorting_1"><a href="/employees/veterans-health-administration/haleem-abdul-a">Abdul A. 
Haleem</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Missouri</td><td>$351,056</td></tr><tr role="row" class="odd"><td class="sorting_1"><a href="/employees/veterans-health-administration/ward-abner-m">Abner M. Ward</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Hawaii</td><td>$337,756</td></tr><tr role="row" class="even"><td class="sorting_1"><a href="/employees/veterans-health-administration/cohen-adam-c">Adam C. Cohen</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Indiana</td><td>$340,000</td></tr><tr role="row" class="odd"><td class="sorting_1"><a href="/employees/veterans-health-administration/bakker-adam-j">Adam J. Bakker</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Minnesota</td><td>$325,980</td></tr><tr role="row" class="even"><td class="sorting_1"><a href="/employees/veterans-health-administration/bracha-adam-s">Adam S. Bracha</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Florida</td><td>$335,000</td></tr></tbody> 
</table> 

It works with lxml: r = tree.xpath('//*[@id="employee-table"]/tbody/tr')


Try this selector: //*[@id="employee-table"]/tbody/tr[@role="row"]

Answers


Try //*[@id="employee-table"]/tr

Your XPath does not work because of the tbody step. Remove it and check whether you get the result you want.

As the Scrapy documentation puts it: http://doc.scrapy.org/en/0.14/topics/firefox.html

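Firefox, in particular, is known for adding <tbody> elements to tables. Scrapy, on the other hand, does not modify the original page HTML, so you won't be able to extract any data if you use <tbody> in your XPath expressions.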
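
To see the effect without running a crawl, here is a minimal sketch that uses only Python's standard-library xml.etree.ElementTree. The markup is a trimmed, hypothetical two-row version of the table, written the way a server might send it (no <tbody>); the names RAW_HTML and rows are made up for the illustration. ElementTree supports only a subset of XPath and needs a leading dot for relative paths, so the expressions start with .// instead of //.

```python
import xml.etree.ElementTree as ET

# Hypothetical raw markup as a server might send it: there is no <tbody>
# wrapper, because that element is often added later by the browser.
RAW_HTML = """
<html><body>
<table id="employee-table">
  <thead><tr><th>Name</th><th>Year</th><th>Salary</th></tr></thead>
  <tr><td><a href="/employees/veterans-health-administration/bharatkumar-a-g">A G. Bharatkumar</a></td><td>2015</td><td>$335,000</td></tr>
  <tr><td><a href="/employees/veterans-health-administration/roure-a-rafael">A Rafael Roure</a></td><td>2015</td><td>$333,634</td></tr>
</table>
</body></html>
""".strip()

root = ET.fromstring(RAW_HTML)


def rows(xpath):
    """Return every matched row as a list of stripped cell texts."""
    return [
        ["".join(td.itertext()).strip() for td in tr.findall("td")]
        for tr in root.findall(xpath)
    ]


# The browser-derived selector finds nothing in the raw markup...
with_tbody = rows(".//*[@id='employee-table']/tbody/tr")
# ...while the same selector without the tbody step finds the data rows.
without_tbody = rows(".//*[@id='employee-table']/tr")

print(len(with_tbody), len(without_tbody))
for name, year, salary in without_tbody:
    print(name, year, salary)
```

In a Scrapy callback the equivalent loop would presumably iterate over response.xpath('//*[@id="employee-table"]/tr') and read each cell with a relative expression. If the page you download does contain <tbody>, //*[@id="employee-table"]//tr matches the rows in both cases, at the cost of also matching the header row.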