2013-10-29 69 views
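I built a feature into the nessus-xmlrpc gem to allow downloading HTML reports. All I need now is to parse the HTML response for the fileName value. How do I parse the Nessus HTTP response with Nokogiri?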

Here is a sample response from Nessus that I am working with:

<!doctype html> 
<html> 
<head> 
<meta charset="utf-8"> 
<title>Formatting the report</title><meta http-equiv="refresh" 
content="5;url=/file/xslt/download/?fileName=Windows_-_Main___Media_kr1sjb.html"> 
</head> 
<body bgcolor="#2b4e67"> 
<link type="text/css" href="jqueryui18.css" rel="stylesheet" /> 
<script type="text/javascript" src="jqueryui18.js"></script> 
<div id="main"></div> 

When I parse it with Nokogiri:

doc = Nokogiri::HTML(http_content) 

the document ends up looking like this:

#<Nokogiri::HTML::Document:0x4758128 name="document" children=[#<Nokogiri::XML::DTD:0x4757f48 name="html">, #<Nokogiri::XML::Element:0x47578ea name="html" children=[#<Nokogiri::XML::Element:0x47577aa name="head" children=[#<Nokogiri::XML::Element:0x475767e name="meta" attributes=[#<Nokogiri::XML::Attr:0x475764c name="charset" value="utf-8">]>, #<Nokogiri::XML::Element:0x4757368 name="title" children=[#<Nokogiri::XML::Text:0x475723c "Formatting the report">]>, #<Nokogiri::XML::Element:0x475af9a name="meta" attributes=[#<Nokogiri::XML::Attr:0x475ac48 name="http-equiv" value="refresh">, #<Nokogiri::XML::Attr:0x475ac2a name="content" value="5;url=/file/xslt/download/?fileName=Windows_-_Main___Media_kr1sjb.html">]>]>, #<Nokogiri::XML::Element:0x470b968 name="body" attributes=[#<Nokogiri::XML::Attr:0x470b7ec name="bgcolor" value="#2b4e67">] children=[#<Nokogiri::XML::Element:0x46fb216 name="link" attributes=[#<Nokogiri::XML::Attr:0x46fb18a name="type" value="text/css">, #<Nokogiri::XML::Attr:0x46fb176 name="href" value="jqueryui18.css">, #<Nokogiri::XML::Attr:0x46fb16c name="rel" value="stylesheet">]>, #<Nokogiri::XML::Element:0x46fa1d6 name="script" attributes=[#<Nokogiri::XML::Attr:0x46fa0e6 name="type" value="text/javascript">, #<Nokogiri::XML::Attr:0x46fa0c8 name="src" value="jqueryui18.js">]>, #<Nokogiri::XML::Element:0x46dd43c name="div" attributes=[#<Nokogiri::XML::Attr:0x46dd37e name="id" value="main">]>]>]>]> 

I can't figure out how to get the fileName value, "Windows_-_Main___Media_kr1sjb.html".

Any help would be great. I'll push these changes as soon as it works.

Answer

I would do it as follows:

require 'nokogiri' 

doc = Nokogiri::HTML.parse <<-eol 
<html> 
<head> 
<meta charset="utf-8"> 
<title>Formatting the report</title><meta http-equiv="refresh" 
content="5;url=/file/xslt/download/?fileName=Windows_-_Main___Media_kr1sjb.html"> 
</head> 
<body bgcolor="#2b4e67"> 
<link type="text/css" href="jqueryui18.css" rel="stylesheet" /> 
<script type="text/javascript" src="jqueryui18.js"></script> 
<div id="main"></div> 
eol 

str = doc.at_css('meta[http-equiv="refresh"]')['content'] 
# => "5;url=/file/xslt/download/?fileName=Windows_-_Main___Media_kr1sjb.html" 
str[/\?fileName=(.*)/,1] 
# => "Windows_-_Main___Media_kr1sjb.html" 
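
The regex above works for the sample response. As an alternative, here is a minimal sketch that treats the `content` value as a refresh delay plus a relative URL and reads `fileName` out of the query string with Ruby's standard `uri` library. The method name `file_name_from_refresh` is hypothetical, and the sketch assumes the URL is already valid (for example, no unescaped spaces); otherwise `URI.parse` raises `URI::InvalidURIError`.

```ruby
require 'uri'

# Hypothetical helper: takes the value of the meta refresh "content"
# attribute and returns the fileName query parameter, or nil if absent.
def file_name_from_refresh(content)
  # "5;url=/path?fileName=x" -> ["5", "url=/path?fileName=x"]
  _delay, target = content.split(';', 2)
  return nil unless target

  # Drop the leading "url=" (case-insensitive) to get the relative URL.
  url = target.strip.sub(/\Aurl=/i, '')

  # Parse the URL and decode its query string into a Hash.
  query = URI.parse(url).query
  return nil unless query

  URI.decode_www_form(query).to_h['fileName']
end

content = '5;url=/file/xslt/download/?fileName=Windows_-_Main___Media_kr1sjb.html'
puts file_name_from_refresh(content)
```

Compared with the regex, this also stops at a following `&` if Nessus ever appends further query parameters, and it decodes percent-escapes in the value.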

OK, I have to capture the response, so I have the HTML document above in a variable named http_content. How would I adapt your approach to work with that? – user2933606

@user2933606 Write it like this: `Nokogiri::HTML.parse(http_response)`. –

Thanks for the help – user2933606