Port BeautifulSoup to HtmlAgilityPack
I want to port this working program from Python to C#:
from __future__ import print_function
import requests
from bs4 import BeautifulSoup
r = requests.get('http://www.forexfactory.com/calendar.php?day=nov18.2016')
soup = BeautifulSoup(r.text, 'lxml')
tables = soup.findAll("table", {'class': 'calendar__table'})
for table in tables:
    for row in table.findAll("tr"):
        for cell in row.findAll("td"):
            print(cell.text, end=" ")
        print()
This is my attempt in C# using HtmlAgilityPack, but it does not work:
HtmlWeb browser = new HtmlWeb();
string URI = "http://www.forexfactory.com/calendar.php?day=nov18.2016";
ServicePointManager.ServerCertificateValidationCallback += (sender, cert, chain, sslPolicyErrors) => true;
ServicePointManager.SecurityProtocol = SecurityProtocolType.Ssl3 | SecurityProtocolType.Tls | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls12;
HtmlDocument document = browser.Load(URI);
foreach (HtmlNode row in document.DocumentNode.Descendants("table").FirstOrDefault(_ => _.Id.Equals("calendar__table")).Descendants("tr"))
    Console.WriteLine(row);
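The Python code looks for 'table' elements with a given *class* attribute, while your C# code seems to look for a given *id* ('_.Id.Equals'). I am not familiar with HTML parsing in C#, but I would look for the equivalent of a class lookup.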
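If the HTML in question assigns multiple classes to the same table, HTML Agility Pack will not try to resolve that for you. So it may take some work to split the class attribute on whitespace and do a contains check. – jessehouwing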
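The two comments above can be illustrated with a small standard-library Python sketch: match on the `class` attribute rather than the `id`, and treat the attribute as a whitespace-separated list of tokens. The names `has_class`, `TableFinder`, and the `SAMPLE` markup are hypothetical and made up for illustration; the sample is not copied from the real page. The same token test should carry over to C# once you have the class attribute as a string, though that part is not shown here.

```python
from html.parser import HTMLParser


def has_class(class_attr, wanted):
    """Return True if `wanted` is one of the whitespace-separated class tokens."""
    # A missing attribute arrives as None, so fall back to an empty string.
    return wanted in (class_attr or "").split()


class TableFinder(HTMLParser):
    """Collects the attributes of every <table> whose class list contains `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag != "table":
            return
        attr_map = dict(attrs)
        if has_class(attr_map.get("class"), self.wanted):
            self.matches.append(attr_map)


# Hypothetical markup (an assumption): the first table carries two classes,
# the second one only has an id with the same name.
SAMPLE = (
    '<table class="calendar__table big"><tr><td>x</td></tr></table>'
    '<table id="calendar__table"></table>'
)

finder = TableFinder("calendar__table")
finder.feed(SAMPLE)
print(len(finder.matches))
```

Splitting into tokens instead of using a plain substring test avoids false positives on classes that merely start with the same text, such as a hypothetical `calendar__table-wide`.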