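Scraping table data into a .csv

I can print the table, but I can't extract it to a .csv file. I'm new here, scraping and learning every day.

How do I screen-scrape this data into a CSV file?

# Standard library modules
import os
import sys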

# The wget module 
import wget 

# The BeautifulSoup module 
from bs4 import BeautifulSoup 

# The selenium module 
from selenium import webdriver 
from selenium.webdriver.common.keys import Keys 
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC 
from selenium.webdriver.common.by import By 

driver = webdriver.Chrome() # if you want to use chrome, replace Firefox() with Chrome() 
driver.get("https://www.arcountydata.com/county.asp?county=Benton") # load the web page 
search_begin = driver.find_element_by_xpath("//*[@id='Assessor']/div/div[2]/a/i").click() 
# fill in the owner-name search field
elem = driver.find_element_by_id("OwnerName") # Find the owner name input field of the search form
elem.send_keys("Roth Family Inc") # Send the owner name to search for

search_exeute = driver.find_element_by_xpath("//*[@id='Search']").click() 


src = driver.page_source # gets the html source of the page 

parser = BeautifulSoup(src,"lxml") # initialize the parser and parse the source "src" 


table = parser.find("table", attrs={"class" : "table table-striped-yellow  table-hover table-bordered table-condensed table-responsive"}) # attrs: the attributes that you want to check in a tag

f = open('/output.csv', 'w') 

parcel="" 
owner="" 
ptype="" 
site_address="" 
location="" 
acres="" 

summons =[] 
#print table 

list_of_rows = [] 
for row in table.findAll('tr')[1:]: 
    list_of_cells = [] 
    for cell in row.findAll('td'): 
     text = cell.text.replace("&nbsp;", "") 
     list_of_cells.append(text) 
    list_of_rows.append(list_of_cells) 
print list_of_rows 

driver.close() # closes the driver


2017-04-24 user3691781

Nice to see a new member, and good luck, sir! You're doing a great job. You commented your code well, which means you understand it. Excellent!

import csv

This is a Python module that makes it easy to read and write CSV files, so let's import it first.

with open(name_csv + '.csv', 'w+') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    spamwriter.writerow(list_of_rows)
# Used to write 1 row; each element in the array will be separated by a comma
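One thing to watch with the snippet above: `writerow` treats its argument as a single row, so handing it the whole `list_of_rows` puts every table row on one CSV line. Here is a minimal sketch of the difference, assuming Python 3. The sample data is made up, `to_csv_text` is just a helper name I picked for illustration, and it writes to an in-memory buffer instead of a real file.

```python
import csv
import io

# Made-up sample data standing in for list_of_rows from the question.
rows = [
    ["parcel-1", "owner A", "1.5"],
    ["parcel-2", "owner B", "0.75"],
]


def to_csv_text(rows, one_call_per_table):
    """Return the CSV text that csv.writer produces for `rows`."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=",")
    if one_call_per_table:
        # The whole table is treated as ONE row; each inner list
        # becomes a single field holding the list's string form.
        writer.writerow(rows)
    else:
        # One CSV line per inner list.
        writer.writerows(rows)
    return buffer.getvalue()


print(to_csv_text(rows, one_call_per_table=True))
print(to_csv_text(rows, one_call_per_table=False))
```

So if the goal is one line per table row, either call `writerows` once or call `writerow` inside the loop over the rows.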

Edit:

with open('somecsv.csv', 'w+') as csvfile:
    # This opens the CSV file and we set some additional parameters
    spamwriter = csv.writer(csvfile, delimiter='|')  # Changed the delimiter (way of separating)
    for row in table.findAll('tr')[::2]:
        list_of_cell = []
        for cell in row.findAll('td')[:5]:
            text = cell.text.replace("&nbsp;", "").strip()
            text = text.replace('''...\n\n\n''', " |")  # Replaces the string before Lot with the delimiter
            text = text.replace(''':\n''', ':')  # Added so a newline after ':' is not turned into a delimiter
            text = text.replace('''\n''', '|')  # Puts the delimiter before Block
            list_of_cell.append(text)
        print(list_of_cell)
        spamwriter.writerow(list_of_cell)
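If a single cell still ends up spread over several lines in the file, the likely cause is line breaks inside the cell text itself. As an alternative to swapping them for the delimiter, here is a rough sketch that collapses all whitespace inside each cell before writing. `clean_cell` and `write_rows` are hypothetical helper names, the sample strings are made up, and the code assumes Python 3. Also note that BeautifulSoup normally hands back `&nbsp;` as the character `\xa0`, so replacing the literal string `"&nbsp;"` usually does nothing; `str.split()` with no arguments treats `\xa0` as whitespace.

```python
import csv
import io


def clean_cell(text):
    """Collapse runs of whitespace (newlines, tabs, non-breaking spaces) into single spaces."""
    return " ".join(text.split())


def write_rows(rows, csvfile):
    """Write one CSV line per row, cleaning every cell first."""
    writer = csv.writer(csvfile)
    for row in rows:
        writer.writerow([clean_cell(cell) for cell in row])


# Made-up cell texts standing in for the cell.text values from the table.
sample = [["Lot 1\n\n\nBlock 2", "\xa0ROTH FAMILY INC\n", "1.50"]]

out = io.StringIO(newline="")
write_rows(sample, out)
print(out.getvalue())
```

With a real file in Python 3 you would pass something like `open('somecsv.csv', 'w', newline='')` instead of the in-memory buffer.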


2017-04-24 01:24:03

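Thanks for the breakdown. It helped, because I can now parse the data into a csv. However, the whole data set is parsed into 1 row.. what should I do so that it is parsed row by row, as in the html table... – user3691781

This should work, don't forget to import csv. I would consider adding Selenium. –

Thank you very much, sir.. it helped me learn something new today.. I'll try to clean up the code further... for some of the data, one column is parsed into multiple rows.. – user3691781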
