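Scraping HTML into a CSV file
The code below scrapes data from the following page: http://www.gbgb.org.uk/resultsMeeting.aspx?id=136005
It scrapes all the relevant fields and prints them to the screen. However, I would like to print the data in tabular form and convert it to a CSV file that I can import into a spreadsheet or database.
In the site's source HTML, the track, date, datetime (race time), grade, distance and prizes come from the div class "resultsBlockheader" and form the top area of the race card on the web page.
The race body in the source HTML comes from the div class "resultsBlock". It includes the finishing position (Fin), greyhound, trap, SP, time/sec and time distance.
The end result would look like this: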
track,date,datetime,grade,distance,prize,fin,greyhound,trap,SP,timeSec,time distance
Is this possible, or do I have to get it printing to the screen in tabular form before I can export it to CSV?
# Python 3: urlopen lives in urllib.request (Python 2 used "from urllib import urlopen").
from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://www.gbgb.org.uk/resultsMeeting.aspx?id=136005")
bsObj = BeautifulSoup(html, 'lxml')

# (tag, class) pairs to extract, in the order they are printed.
selectors = [
    ("div", "track"),
    ("div", "distance"),
    ("div", "prizes"),
    ("li", "first essential fin"),
    ("li", "essential greyhound"),
    ("li", "trap"),
    ("li", "sp"),
    ("li", "timeSec"),
    ("li", "timeDistance"),
    ("li", "essential trainer"),
    ("li", "first essential comment"),
    ("div", "resultsBlockFooter"),
    ("li", "first essential"),
]

# Print every match for each selector, one value per line.
for tag, css_class in selectors:
    nameList = bsObj.find_all(tag, {"class": css_class})
    for name in nameList:
        print(name.get_text())
This just prints a whole bunch of things on their own lines. If you want a table or CSV format, you need to restructure this whole code –
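One way to read "restructure this whole code" is to collect each column into a list, transpose the columns into rows with zip(), and pass the rows to the standard csv module. The following is only a minimal sketch under that assumption. The column values are invented placeholders rather than real results, and write_rows is a hypothetical helper name.

```python
import csv
import io

# Placeholder columns standing in for the text pulled out of the
# find_all() results; these values are invented for illustration.
fin = ["1st", "2nd", "3rd"]
greyhound = ["Dog A", "Dog B", "Dog C"]
trap = ["4", "1", "6"]


def write_rows(out, header, *columns):
    """Write a header line, then one CSV row per runner (hypothetical helper)."""
    writer = csv.writer(out)
    writer.writerow(header)
    # zip() pairs up the i-th item of every column, turning columns into rows.
    writer.writerows(zip(*columns))


# An in-memory buffer keeps the sketch self-contained; to write a real file,
# pass open("results.csv", "w", newline="") instead.
buffer = io.StringIO()
write_rows(buffer, ["fin", "greyhound", "trap"], fin, greyhound, trap)
csv_text = buffer.getvalue()
print(csv_text)
```

Note that zip() stops at the shortest column, so a missing value in one column would silently drop rows. Checking that all columns have the same length first is probably worthwhile.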
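Hi cricket_007. Thanks for your reply. How would I get the things on the screen to print side by side? (Still very new to all of this) :) – moonshadow
'print(1, 2)' will print on the same line. 'print(1)' then 'print(2)' will print on separate lines. It's that simple. You have to put each value into a list to print them on one line. Right now you are focusing on columns rather than rows. –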
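The difference between the two print styles, and the idea of one list per row, can be shown in a few lines. This is a small sketch with placeholder values, and format_row is a hypothetical helper, not part of any library.

```python
# Separate print() calls: each value lands on its own line (column-first output).
print("1st")
print("Dog A")

# One print() call with several arguments: the values share a line.
print("1st", "Dog A")


def format_row(values, sep=", "):
    """Join one row's values into a single line (hypothetical helper)."""
    return sep.join(str(v) for v in values)


# Build one list per row, then print each row on its own line.
rows = [
    ["1st", "Dog A", 4],
    ["2nd", "Dog B", 1],
]
lines = [format_row(row) for row in rows]
for line in lines:
    print(line)
```

Once the data is held as rows like this, switching from screen output to a CSV file should mostly be a matter of passing the same rows to csv.writer.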