Extracting data stored at a URL

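I'm new here and would like to know how to fetch data from a specific URL, store it in a database, and then access it with Rails.

So far I can fetch the data from the URL, receive it in XML format, and display it, but this is all done manually. What I want to know is how to pull the data from the URL programmatically, because it arrives as a hash and contains many attributes.

I need to store these attributes in the database and retrieve the values directly from the URL.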

2013-10-07 user2853473

You can use Mechanize to fetch the page and Nokogiri to parse its content, then either build XML from the received data with Nokogiri::XML::Builder (or Builder) or store it in the database.
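As a rough illustration of the parse-then-store step, here is a minimal sketch that turns an XML payload into an array of attribute hashes. To stay dependency-free it uses REXML from Ruby's standard distribution rather than Nokogiri; the sample XML, the `extract_records` helper, and the `Product` model mentioned in the comments are all hypothetical and only stand in for whatever the real URL returns.

```ruby
require 'rexml/document'

# Hypothetical sample payload. In practice this string would be the response
# body fetched from the URL (for example with Mechanize or Net::HTTP).
SAMPLE_XML = <<~XML
  <products>
    <product id="1">
      <name>Widget</name>
      <price>9.99</price>
    </product>
    <product id="2">
      <name>Gadget</name>
      <price>19.50</price>
    </product>
  </products>
XML

# Turn every element matching record_path into a flat hash made of its
# XML attributes plus the text of its direct child elements.
def extract_records(xml, record_path)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, record_path).map do |node|
    record = {}
    node.attributes.each { |name, value| record[name] = value }
    node.elements.each { |child| record[child.name] = child.text.to_s.strip }
    record
  end
end

records = extract_records(SAMPLE_XML, '//product')
records.each { |record| puts record.inspect }

# In a Rails app, each hash could then be handed to a model, assuming a
# hypothetical Product model whose columns match the keys:
#   records.each { |attrs| Product.create!(attrs) }
```

Keeping the extraction in a plain function that returns hashes means the database write stays a one-liner and the parsing can be checked without a running Rails app.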

2013-10-07 06:31:17 ck3g

For that, use the Nokogiri gem. For more information, see http://nokogiri.org/tutorials/parsing_an_html_xml_document.html

Here are some of the Nokogiri commands you are likely to need:

require 'nokogiri'
require 'open-uri'

url = "your site url"  # placeholder: put the address of your site here
doc = Nokogiri::HTML(URI.open(url))

# Get all elements matching a selector
doc.css("div")

# Get the first element matching a selector
doc.at_css("div")

# Get an element by id ("tag#id"), e.g. the input whose id is ResultsCount
doc.at_css("input#ResultsCount")

# Get an element by class name ("tag.class"), e.g. the div with class results
doc.at_css("div.results")

# Write the matched markup to a file (the path assumes a Rails app)
File.write("#{Rails.root}/public/aa.txt", doc.css("div#search-result-listings").to_html)

# Read a field's data, e.g. the value of the input whose id is ResultsCount:
# <input type="hidden" name="ResultsCount" id="ResultsCount" value="12321" />
doc.at_css("input#ResultsCount")["value"]

# Get all results
search_results = doc.at_css("div#search-result-listings").css("div.result.clearfix")

# Find a tag (<ul>) inside the listings container and walk its elements and children
dc = doc.at_css("div#search-result-listings")
# All child elements of the <ul>, such as its <li> items
dc.at_css("ul").elements
# Only the children of those elements
dc.at_css("ul").elements.children
# To print the value of one child, use its "text" method
dc.at_css("ul").elements.children[0].text
# To get the text of all children, use any of the following
dc.at_css("ul").elements.children.text
dc.at_css("ul").elements.text
dc.at_css("ul").text


2013-10-07 13:33:37 HarsHarI
