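ruby/nokogiri scraping: export to multiple CSVs, then take a column from each CSV and combine them into a final CSV
Ruby n00b here. I am scraping the same page twice, each time in a slightly different way, and exporting each result to a separate CSV file. I then want to combine the first column of CSV no. 1 with the second column of CSV no. 2 to create CSV no. 3.
The code that scrapes CSVs no. 1 and 2 works. But when I add my attempt to combine the two CSVs into a third (commented out at the bottom), I get the error below. The first two CSVs populate fine, but the third stays empty and the script appears to be stuck in an infinite loop. I know that block shouldn't be at the bottom, but I can't see where else it would go...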
alts.rb:45:in `block in <main>': undefined local variable or method `scrapedURLs1' for main:Object (NameError)
from /Users/JammyStressford/.rvm/rubies/ruby-2.0.0-p451/lib/ruby/2.0.0/csv.rb:1266:in `open'
from alts.rb:44:in `<main>'
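The NameError above is consistent with Ruby's block scoping: a local variable first assigned inside a `do ... end` block is not visible once that block ends, so `scrapedURLs1` no longer exists by the time the third `CSV.open` runs. A minimal sketch of that rule, using hypothetical method and variable names that are not part of the script:

```ruby
# Hypothetical names, for illustration only.

# A variable first assigned inside a block is local to that block.
def visible_outside_block?
  [1, 2, 3].each do |n|
    inside_only = n * 2   # exists only while the block runs
  end
  inside_only             # out of scope here, so Ruby raises NameError
  true
rescue NameError
  false
end

# A variable assigned before the block is shared with the block.
def collect_outside_block
  collected = []
  [1, 2, 3].each do |n|
    collected << n * 2
  end
  collected
end

puts visible_outside_block?   # prints false
p collect_outside_block       # prints [2, 4, 6]
```

The second method suggests one possible way out: declare an array before the scraping loops, push values into it inside the blocks, and use it afterwards.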
The code itself:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'csv'
url = "http://www.example.com/page"
page = Nokogiri::HTML(open(url))
CSV.open("results1.csv", "wb") do |csv|
  page.css('img.product-card-image').each do |scrape|
    product1 = scrape['alt']
    page.css('a.product-card-image-link').each do |scrape|
      link1 = scrape['href']
      scrapedProducts1 = "#{product1}"[0..-7]
      scrapedURLs1 = "#{link1}"
      csv << [scrapedProducts1, scrapedURLs1]
    end
  end
end
CSV.open("Results2.csv", "wb") do |csv|
  page.css('a.product-card-image-link').each do |scrape|
    link2 = scrape['href']
    page.css('img.product-card-image').each do |scrape|
      product2 = scrape['alt']
      scrapedProducts2 = "#{product2}"[0..-7]
      scrapedURLs2 = "http://www.lyst.com#{link2}"
      csv << [scrapedURLs2, scrapedProducts2]
    end
  end
end
## Here is where I am trying to combine the two columns into a new CSV. ##
## It doesn't work. I suspect that this part should be further up... ##
# CSV.open("productResults3.csv", "wb") do |csv|
#   csv << [scrapedURLs1, scrapedProducts2]
# end
puts "upload complete!"
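One way the combining step could be approached is to read the two finished files back in and pair their rows by position, rather than reaching for variables from inside the scraping blocks. The sketch below is only an illustration under assumptions: `combine_columns` is a hypothetical helper, the sample rows are made up, and the column indices (0 from the first file, 1 from the second) follow the description in the post.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper: pair row i of one CSV with row i of another and
# write one chosen column from each into a new file.
def combine_columns(path_a, col_a, path_b, col_b, out_path)
  rows_a = CSV.read(path_a)
  rows_b = CSV.read(path_b)
  CSV.open(out_path, "wb") do |csv|
    # zip pairs rows by position; if rows_b is shorter, row_b is nil
    rows_a.zip(rows_b).each do |row_a, row_b|
      csv << [row_a[col_a], row_b && row_b[col_b]]
    end
  end
end

# Made-up sample data standing in for the two scraped files.
Dir.mktmpdir do |dir|
  first  = File.join(dir, "results1.csv")
  second = File.join(dir, "Results2.csv")
  third  = File.join(dir, "productResults3.csv")

  CSV.open(first, "wb") do |csv|
    csv << ["product A", "/a"]
    csv << ["product B", "/b"]
  end
  CSV.open(second, "wb") do |csv|
    csv << ["http://www.example.com/a", "alt text A"]
    csv << ["http://www.example.com/b", "alt text B"]
  end

  combine_columns(first, 0, second, 1, third)
  p CSV.read(third)
  # prints [["product A", "alt text A"], ["product B", "alt text B"]]
end
```

Pairing by position only makes sense if both files hold one row per product in the same order. The nested `each` loops in the script write a row for every image/link combination, so the row counts may be much larger than expected.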
Thanks for reading.
Thanks for your answer, Dan. Re: your points: – JammyStressford