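Remove extra tables from Python web-scraping results

My code produces extra tables that I want to remove. I want to keep only the first table and drop all the others.

My code: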
import csv
from bs4 import BeautifulSoup
import requests
import pandas as pd
import telnetlib as tn
import os
#import sys
cwd = os.getcwd()
print(cwd)  # show the current working directory
os.chdir(r'c:\Users\STaiwo\Desktop\My R code')  # raw string keeps the backslashes literal
page = requests.get("https://www.flyingblue.com/earn-and-spend-miles/airlines/partner/180/china-eastern.html", verify=False)
print(page.content)  # raw HTML content of the page
soup = BeautifulSoup(page.content, 'html.parser')
print(soup.prettify())  # pretty-printed version of the parsed HTML
for table in soup.findAll('tbody'):
    print('Table')
    list_of_rows = []
    for row in table.findAll('tr')[1:]:  # skip the header row
        list_of_cells = []
        for cell in row.findAll('td'):
            text = cell.text.replace(' ', '')
            list_of_cells.append(text)
        list_of_rows.append(list_of_cells)
    print(list_of_rows)

The result I currently get:

Table
[['First Class', 'F,U', '150%'], ['P', '125%'], ['Business Class', 'J,C,D,I', '125%'], ['Premium Economy Class', 'W', '110%'], ['Economy Class', 'Y,B', '100%'], ['E,H,M', '75%'], ['L,N,R,S,V,K', '50%'], ['T', '30%'], ['Not eligible for accrual', 'Z,Q,G', '0%']]
Table
[]
Table
[]
Table
[['Distance in miles:6,482', 'Total'], ['Booking sub-class:125%', '8,103'], ['8,103']]
Table
[['Distance in miles:6,482', 'Total'], ['Booking sub-class:125%', 'Elite bonus:75%', '12,965'], ['8,103', '4,862']]
Table
[['Distance in miles:6,482', 'Total'], ['Booking sub-class:50%', '3,241'], ['3,241']]
Table
[['Distance in miles:6,482', 'Total'], ['Booking sub-class:50%', 'Elite bonus:N/A', '3,241'], ['3,241', '0']]

The result I want:

Table
[['First Class', 'F,U', '150%'], ['P', '125%'], ['Business Class', 'J,C,D,I', '125%'], ['Premium Economy Class', 'W', '110%'], ['Economy Class', 'Y,B', '100%'], ['E,H,M', '75%'], ['L,N,R,S,V,K', '50%'], ['T', '30%'], ['Not eligible for accrual', 'Z,Q,G', '0%']]
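
One possible way to get only that first table, sketched under some assumptions: the wanted table is always the first `tbody` on the page, so `soup.find('tbody')` (which returns only the first match) can stand in for the loop over `soup.findAll('tbody')`. The helper name `first_table_rows` and the `SAMPLE_HTML` string are hypothetical, used here only so the sketch runs without hitting the site. It also uses `get_text(strip=True)` to trim whitespace, which is a swap for the `replace` call in the code above.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for page.content, so the sketch runs offline.
SAMPLE_HTML = """
<table><tbody>
  <tr><th>Cabin</th><th>Booking class</th><th>Miles</th></tr>
  <tr><td>First Class</td><td>F,U</td><td>150%</td></tr>
  <tr><td>P</td><td>125%</td></tr>
</tbody></table>
<table><tbody>
  <tr><td>header</td></tr>
  <tr><td>Distance in miles:6,482</td><td>Total</td></tr>
</tbody></table>
"""


def first_table_rows(html):
    """Return the data rows of the first <tbody> only (hypothetical helper)."""
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('tbody')  # find() returns the first match, or None
    if table is None:
        return []
    rows = []
    for row in table.find_all('tr')[1:]:  # skip the header row, as in the question
        rows.append([cell.get_text(strip=True) for cell in row.find_all('td')])
    return rows


print('Table')
print(first_table_rows(SAMPLE_HTML))
```

In the script above, this would presumably be called as `first_table_rows(page.content)`. If the wanted table is not guaranteed to come first, picking it by position will not be reliable, and selecting it by an `id`, a `class`, or its header text would be safer.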