Recently I have been trying to parse an HTML table from a web page using lxml and requests. The Python code I ran looks like this:
>>> from lxml to html
>>> import requests
>>> page = requests.get('http://www.bigpaisa.com/candlestick-stock-screener-result/nse/bearish-evening-star-candlestick-pattern')
>>> tree = html.fromstring(page.text)
Then I want to use lxml's xpath() function to get lists by parsing the following repeated block of data:
<TR>
<TD style="font-size: 11px;"><!-- <a href="/company-technical-details/<%=sr.getExchange()%>/<%=sr.getSymbol()%>/<%=sr.getName()%>" ><%= sr.getSymbol() %></a> -->
AMTEKINDIA </TD>
<TD style="font-size: 11px; max-width: 135px;">AMTEK INDIA LIMITED</TD>
<TD> nse </TD>
<TD style="min-width: 60px; max-width: 60px;">02-01-2015</TD>
<TD>78</TD>
<TD>78.3</TD>
<TD>72.25</TD>
<TD>73.9</TD>
But I could not get what I wanted and got an error instead. For example:
>>> symbol=tree.xpath('//TD[@style="font-size: 11px;"][@!-- [@a href="/company-technical-details/[@%=sr.getExchange()%]/[@%=sr.getSymbol()%]/[@%=sr.getName()%]"][@%= sr.getSymbol() %][@/a] --]/text()')
gives an XPath evaluation error, and
>>> prices=tree.xpath('//TD/text()')
returns a list with no values.
'from lxml to html' is not valid Python. Did you mean 'from lxml import html'? – 2015-01-04 15:09:08
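
One possible explanation for the empty list, offered as a sketch and not a confirmed diagnosis: lxml's HTML parser lowercases element names, and XPath matching is case-sensitive, so '//TD' finds nothing while '//td' does. The snippet below runs against an inline copy of the row from the question instead of the live page, so it says nothing about what the site actually returns. The SAMPLE string, its wrapping TABLE tags, and the simplified comment inside the first cell are assumptions made for illustration.

```python
from lxml import html

# Inline copy of the row from the question (hypothetical stand-in for page.text).
# The JSP comment in the first cell is shortened; the TABLE wrapper is added here.
SAMPLE = """
<TABLE>
<TR>
<TD style="font-size: 11px;"><!-- <a href="#">commented-out link</a> -->
AMTEKINDIA </TD>
<TD style="font-size: 11px; max-width: 135px;">AMTEK INDIA LIMITED</TD>
<TD> nse </TD>
<TD style="min-width: 60px; max-width: 60px;">02-01-2015</TD>
<TD>78</TD>
<TD>78.3</TD>
<TD>72.25</TD>
<TD>73.9</TD>
</TR>
</TABLE>
"""

tree = html.fromstring(SAMPLE)

# Uppercase tag name: no match, because the parsed elements are named 'td'.
upper = tree.xpath('//TD/text()')

# Lowercase tag names: one list of cleaned cell strings per table row.
# text_content() skips the HTML comment and keeps the text that follows it.
rows = [
    [cell.text_content().strip() for cell in tr.xpath('./td')]
    for tr in tree.xpath('//tr')
]

print(upper)
print(rows)
```

The first predicate in the failing expression tries to address the HTML comment as if it were an attribute, which is not valid XPath. Selecting the cell by position within its row, as above, avoids having to match the comment at all.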