2012-08-28 52 views

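Submitting 'aspnetForm' with Ruby Mechanize does not work as expected

I am using the Ruby Mechanize gem to automate a form submission and scrape the results. I have the following code: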

require 'mechanize' 
require 'logger' 

url = "http://www.cebupacificair.com/Pages/default.aspx" 
agent = Mechanize.new do |agent| 
    agent.log = Logger.new(STDOUT) 
    agent.follow_meta_refresh = true 
end 
page = agent.get(url) 

search_results = page.form_with(:name => 'aspnetForm') do |form| 
    form['__EVENTARGUMENT'] = '' 
    form['__EVENTTARGET'] = 'ControlGroupSearchView$AvailabilitySearchInputSearchView$LinkButtonNewSearch' 
    form.radiobutton_with(:value => "RoundTrip").check 
    # option_with returns one option; options_with returns an Array, and Array#select would not select anything 
    form.field_with(:name => "ddOrigin").option_with(:value => "MNL").select 
    form.field_with(:name => "ddDestination").option_with(:value => "SGN").select 
    form.field_with(:name => "_depmonthyear").option_with(:value => "2013-02").select 
    form.field_with(:name => "_depday").option_with(:value => "9").select 
    form.field_with(:name => "_retmonthyear").option_with(:value => "2013-02").select 
    form.field_with(:name => "_retday").option_with(:value => "11").select 
    form.field_with(:name => "_adults").option_with(:value => "1").select 
    form.field_with(:name => "_children").option_with(:value => "0").select 
    form.field_with(:name => "_infants").option_with(:value => "0").select 
end.click_button 

puts search_results.body 

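When I run the code above, it submits the form, but I do not end up on the page I am redirected to when I submit the form manually. Here is the log: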

Starting run ... 
I, [2012-08-27T08:37:08.010661 #4] INFO -- : Net::HTTP::Get: /Pages/default.aspx 
D, [2012-08-27T08:37:08.010794 #4] DEBUG -- : request-header: accept => */* 
D, [2012-08-27T08:37:08.010825 #4] DEBUG -- : request-header: user-agent => Mechanize/2.3 Ruby/1.9.2...more 
D, [2012-08-27T08:37:08.010849 #4] DEBUG -- : request-header: accept-encoding => gzip,deflate,identity 
D, [2012-08-27T08:37:08.010873 #4] DEBUG -- : request-header: accept-charset => ISO-8859-1,utf-8;q=0.7,*;q=0.7 
D, [2012-08-27T08:37:08.010898 #4] DEBUG -- : request-header: accept-language => en-us,en;q=0.5 
D, [2012-08-27T08:37:08.010922 #4] DEBUG -- : request-header: host => www.cebupacificair.com 
I, [2012-08-27T08:37:11.354481 #4] INFO -- : status: Net::HTTPOK 1.1 200 OK 
D, [2012-08-27T08:37:11.354778 #4] DEBUG -- : response-header: cache-control => private, max-age=0 
D, [2012-08-27T08:37:11.354869 #4] DEBUG -- : response-header: content-type => text/html; charset=utf-8 
D, [2012-08-27T08:37:11.354945 #4] DEBUG -- : response-header: content-encoding => gzip 
D, [2012-08-27T08:37:11.355017 #4] DEBUG -- : response-header: expires => Sun, 12 Aug 2012 08:37:09 GMT 
D, [2012-08-27T08:37:11.355089 #4] DEBUG -- : response-header: last-modified => Mon, 27 Aug 2012 08:37:09 GMT 
D, [2012-08-27T08:37:11.355161 #4] DEBUG -- : response-header: vary => Accept-Encoding 
D, [2012-08-27T08:37:11.355232 #4] DEBUG -- : response-header: server => Microsoft-IIS/7.0 
D, [2012-08-27T08:37:11.355313 #4] DEBUG -- : response-header: x-aspnet-version => 2.0.50727 
D, [2012-08-27T08:37:11.355386 #4] DEBUG -- : response-header: x-powered-by => ASP.NET 
D, [2012-08-27T08:37:11.355457 #4] DEBUG -- : response-header: microsoftsharepointteamservices => 12.0.0.6420 
D, [2012-08-27T08:37:11.355528 #4] DEBUG -- : response-header: date => Mon, 27 Aug 2012 08:37:09 GMT 
D, [2012-08-27T08:37:11.355599 #4] DEBUG -- : response-header: content-length => 19031 
D, [2012-08-27T08:37:11.355668 #4] DEBUG -- : response-header: connection => close 
D, [2012-08-27T08:37:11.355739 #4] DEBUG -- : response-header: set-cookie => MyCookie=8npRAWS5i10kju...more 
D, [2012-08-27T08:37:11.356068 #4] DEBUG -- : Read 6698 bytes (6698 total) 
D, [2012-08-27T08:37:11.356328 #4] DEBUG -- : Read 12333 bytes (19031 total) 
D, [2012-08-27T08:37:11.356890 #4] DEBUG -- : gzip response 
D, [2012-08-27T08:37:11.375285 #4] DEBUG -- : saved cookie: MyCookie=8npRAWS5i10kjuDl8xX/01gRq0obDLa...more 
I, [2012-08-27T08:37:11.385797 #4] INFO -- : form encoding: utf-8 
D, [2012-08-27T08:37:11.388509 #4] DEBUG -- : query: "MSOWebPartPage_PostbackSource=&MSOTlPn_Selecte...more 
I, [2012-08-27T08:37:11.390797 #4] INFO -- : Net::HTTP::Post: /Pages/default.aspx 
D, [2012-08-27T08:37:11.390897 #4] DEBUG -- : request-header: accept => */* 
D, [2012-08-27T08:37:11.390927 #4] DEBUG -- : request-header: user-agent => Mechanize/2.3 Ruby/1.9.2...more 
D, [2012-08-27T08:37:11.390966 #4] DEBUG -- : request-header: accept-encoding => gzip,deflate,identity 
D, [2012-08-27T08:37:11.390991 #4] DEBUG -- : request-header: accept-charset => ISO-8859-1,utf-8;q=0.7,*;q=0.7 
D, [2012-08-27T08:37:11.391015 #4] DEBUG -- : request-header: accept-language => en-us,en;q=0.5 
D, [2012-08-27T08:37:11.391039 #4] DEBUG -- : request-header: cookie => MyCookie=8npRAWS5i10kjuDl8xX...more 
D, [2012-08-27T08:37:11.391063 #4] DEBUG -- : request-header: host => www.cebupacificair.com 
D, [2012-08-27T08:37:11.391095 #4] DEBUG -- : request-header: referer => http://www.cebupacificair.c...more 
D, [2012-08-27T08:37:11.391123 #4] DEBUG -- : request-header: content-type => application/x-www-form...more 
D, [2012-08-27T08:37:11.391146 #4] DEBUG -- : request-header: content-length => 13733 
D, [2012-08-27T08:37:11.391170 #4] DEBUG -- : request-header: if-modified-since => Mon, 27 Aug 2012 ...more 
I, [2012-08-27T08:37:19.039011 #4] INFO -- : status: Net::HTTPOK 1.1 200 OK 
D, [2012-08-27T08:37:19.039252 #4] DEBUG -- : response-header: cache-control => private 
D, [2012-08-27T08:37:19.039348 #4] DEBUG -- : response-header: content-type => text/html; charset=utf-8 
D, [2012-08-27T08:37:19.039428 #4] DEBUG -- : response-header: content-encoding => gzip 
D, [2012-08-27T08:37:19.039501 #4] DEBUG -- : response-header: vary => Accept-Encoding 
D, [2012-08-27T08:37:19.039580 #4] DEBUG -- : response-header: server => Microsoft-IIS/7.0 
D, [2012-08-27T08:37:19.039652 #4] DEBUG -- : response-header: x-aspnet-version => 2.0.50727 
D, [2012-08-27T08:37:19.039725 #4] DEBUG -- : response-header: x-powered-by => ASP.NET 
D, [2012-08-27T08:37:19.039798 #4] DEBUG -- : response-header: microsoftsharepointteamservices => 12.0.0.6420 
D, [2012-08-27T08:37:19.039895 #4] DEBUG -- : response-header: date => Mon, 27 Aug 2012 08:37:17 GMT 
D, [2012-08-27T08:37:19.040003 #4] DEBUG -- : response-header: content-length => 20962 
D, [2012-08-27T08:37:19.040074 #4] DEBUG -- : response-header: connection => close 
D, [2012-08-27T08:37:19.040338 #4] DEBUG -- : Read 6906 bytes (6906 total) 
D, [2012-08-27T08:37:19.040605 #4] DEBUG -- : Read 14056 bytes (20962 total) 
D, [2012-08-27T08:37:19.041137 #4] DEBUG -- : gzip response 
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html __...more 
Finished: 12.393 seconds elapsed 
runfinished 

Does anyone know what I am missing here? Do I need different settings for this to work with an ASP.NET form?

Any ideas?

Thanks.

+1

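To me it looks like the form is not being submitted at all (i.e. search_results.body is still the initial page). The problem may be that the search button has an onclick event. Unfortunately, Mechanize does not handle JavaScript, so you need to either figure out what the onclick event does and replicate it, or use a gem that supports JavaScript, such as Watir or Watir-Webdriver.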
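On ASP.NET pages an onclick or href handler often ends in a call to `__doPostBack(eventTarget, eventArgument)`, which fills the hidden `__EVENTTARGET` and `__EVENTARGUMENT` fields and then submits the form. Whether this particular button does that is an assumption. The sketch below shows one way to pull the two values out of such a handler string so they can be replicated by hand. The helper name `parse_do_postback` and the sample attribute value are hypothetical.

```ruby
# Extract the two arguments of an ASP.NET __doPostBack('target','argument')
# call from an onclick/href string. Returns nil if no such call is present.
def parse_do_postback(js)
  match = js.match(/__doPostBack\(\s*'([^']*)'\s*,\s*'([^']*)'\s*\)/)
  return nil unless match

  { '__EVENTTARGET' => match[1], '__EVENTARGUMENT' => match[2] }
end

# Hypothetical attribute value, for illustration only; read the real one
# from the page source of the search button.
sample = %q{javascript:__doPostBack('ControlGroupSearchView$AvailabilitySearchInputSearchView$LinkButtonNewSearch','')}

fields = parse_do_postback(sample)
puts fields.inspect
```

If the real handler does more than call `__doPostBack` (for example, copying the visible inputs into other fields first), those extra steps have to be replicated as well.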

+0

Thanks @justin. The button does have a JS onclick event. I will look at the page source to see what the event does, and post an update afterwards...

Answers


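Try a different approach.

Look at the POST request the browser sends, and make that request manually instead of going through the Mechanize form handler.

The params it submits are: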

__EVENTTARGET=ControlGroupSearchView%24AvailabilitySearchInputSearchView%24LinkButtonNewSearch 
__EVENTARGUMENT= 
__VIEWSTATE= 
ControlGroupSearchView%24AvailabilitySearchInputSearchView%24RadioButtonMarketStructure=RoundTrip 
ControlGroupSearchView%24AvailabilitySearchInputSearchView%24TextBoxMarketOrigin1=BWN 
ControlGroupSearchView%24AvailabilitySearchInputSearchView%24TextBoxMarketDestination1=PEK 
ControlGroupSearchView%24AvailabilitySearchInputSearchView%24DropDownListMarketDay1=11 
ControlGroupSearchView%24AvailabilitySearchInputSearchView%24DropDownListMarketMonth1=2013-06 
ControlGroupSearchView%24AvailabilitySearchInputSearchView%24DropDownListMarketDay2=11 
ControlGroupSearchView%24AvailabilitySearchInputSearchView%24DropDownListMarketMonth2=2013-06 
ControlGroupSearchView%24AvailabilitySearchInputSearchView%24DropDownListPassengerType_ADT=1 
ControlGroupSearchView%24AvailabilitySearchInputSearchView%24DropDownListPassengerType_CHD=0 
ControlGroupSearchView%24AvailabilitySearchInputSearchView%24DropDownListPassengerType_INFANT=0 
ControlGroupSearchView%24AvailabilitySearchInputSearchView%24promoCodeID= 
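The names above are shown URL-encoded, as captured on the wire (`%24` is `$`). As a small sketch, Ruby's standard `URI.decode_www_form` can turn such a captured body back into plain name/value pairs. Only three of the captured pairs are repeated here to keep the example short.

```ruby
require 'uri'

# A few of the captured pairs, exactly as they appear URL-encoded above.
captured = [
  '__EVENTTARGET=ControlGroupSearchView%24AvailabilitySearchInputSearchView%24LinkButtonNewSearch',
  '__EVENTARGUMENT=',
  'ControlGroupSearchView%24AvailabilitySearchInputSearchView%24TextBoxMarketOrigin1=BWN'
]

# Join them like a form body and decode into a Hash with literal '$' names.
params = URI.decode_www_form(captured.join('&')).to_h

params.each { |name, value| puts "#{name} => #{value.inspect}" }
```

The decoded form is the one to hand to an HTTP client that encodes parameters itself; passing the `%24` form to such a client would encode the percent sign a second time.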

So what you can do is:

require 'mechanize'

url = "http://www.cebupacificair.com/Pages/default.aspx"
# 1. Get the front page so the cookie gets set: 
agent = Mechanize.new 
page = agent.get(url) 
# 2. Then use the .search helper to extract any inputs you might need 
# 3. Submit the form by sending the POST params manually. 
#    Use the literal '$' in names: Mechanize URL-encodes params itself, so '%24' would be encoded twice. 
result_page = agent.post("http://book.cebupacificair.com/Search.aspx?culture=en-us", {'__EVENTTARGET' => 'ControlGroupSearchView$AvailabilitySearchInputSearchView$LinkButtonNewSearch'}) # And of course add the remaining params... 

I am sure this approach will work for you...
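A slightly fuller sketch of the same idea follows. The field names and the search URL are taken from the params listed in this answer. The helper names `build_search_params` and `run_search`, the keyword arguments, and the choice to leave `__VIEWSTATE` and `promoCodeID` empty are assumptions. Names use a literal `$`, on the understanding that `Mechanize#post` URL-encodes the hash it is given.

```ruby
# Prefix shared by the field names in the captured request.
PREFIX = 'ControlGroupSearchView$AvailabilitySearchInputSearchView$'
SEARCH_URL = 'http://book.cebupacificair.com/Search.aspx?culture=en-us'

# Build the POST params for a round-trip search (hypothetical helper).
def build_search_params(origin:, destination:, depart_day:, depart_month:,
                        return_day:, return_month:,
                        adults: 1, children: 0, infants: 0)
  {
    '__EVENTTARGET'   => "#{PREFIX}LinkButtonNewSearch",
    '__EVENTARGUMENT' => '',
    '__VIEWSTATE'     => '',
    "#{PREFIX}RadioButtonMarketStructure"       => 'RoundTrip',
    "#{PREFIX}TextBoxMarketOrigin1"             => origin,
    "#{PREFIX}TextBoxMarketDestination1"        => destination,
    "#{PREFIX}DropDownListMarketDay1"           => depart_day.to_s,
    "#{PREFIX}DropDownListMarketMonth1"         => depart_month,
    "#{PREFIX}DropDownListMarketDay2"           => return_day.to_s,
    "#{PREFIX}DropDownListMarketMonth2"         => return_month,
    "#{PREFIX}DropDownListPassengerType_ADT"    => adults.to_s,
    "#{PREFIX}DropDownListPassengerType_CHD"    => children.to_s,
    "#{PREFIX}DropDownListPassengerType_INFANT" => infants.to_s,
    "#{PREFIX}promoCodeID"                      => ''
  }
end

# Fetch the landing page first so cookies are set, then POST the search.
# `agent` is expected to respond to #get and #post like a Mechanize instance.
def run_search(agent, landing_url, params)
  agent.get(landing_url)
  agent.post(SEARCH_URL, params)
end

params = build_search_params(origin: 'MNL', destination: 'SGN',
                             depart_day: 9, depart_month: '2013-02',
                             return_day: 11, return_month: '2013-02')
puts params.size
```

With the gem installed, usage would look like `run_search(Mechanize.new, "http://www.cebupacificair.com/Pages/default.aspx", params)`, followed by inspecting the returned page's `body`.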

+0

Thanks Niels. This worked for me. Mechanize does not handle JavaScript, so JavaScript links and the like need to be handled the way you described.