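If you want to find the href parameter of a tags, use the right tool, which is rarely a regular expression. More likely you should use an HTML/XML parser.
Nokogiri is the parser of choice for Ruby: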
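require 'nokogiri'
require 'open-uri'
# Fetch and parse the page. On Ruby 3.0+, Kernel#open no longer opens URLs, so use URI.open.
doc = Nokogiri::HTML(URI.open('http://www.example.org/index.html'))
# Collect the href attribute of every <a> element and pretty-print the list.
pp doc.search('a').map { |a| a['href'] }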
# => [
# => "/",
# => "/domains/",
# => "/numbers/",
# => "/protocols/",
# => "/about/",
# => "/go/rfc2606",
# => "/about/",
# => "/about/presentations/",
# => "/about/performance/",
# => "/reports/",
# => "/domains/",
# => "/domains/root/",
# => "/domains/int/",
# => "/domains/arpa/",
# => "/domains/idn-tables/",
# => "/protocols/",
# => "/numbers/",
# => "/abuse/",
# => "http://www.icann.org/",
# => "mailto:[email protected]?subject=General%20website%20feedback"
# => ]
No idea, without knowing what kind of structure 'url' has, or what your error is. – Thilo