2015-10-12 42 views
3

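How could I query the result without Selenium in Python or Ruby?

I'm trying to track ticket trends.

Currently, I use Selenium to simulate submitting the form.

As you know, Selenium is slow and consumes a lot of memory.

However, when you submit the form, it redirects you to a new URL: http://makeabooking.flyscoot.com/Flight/Select

So I have no idea how to do this without Selenium.

The problem is that I can't rewrite the query into a form such as http://makeabooking.flyscoot.com/Flight/from={TPE}&to={NYK}&date={2015-10-12} to get the results.

Any ideas on how to do this in Ruby or Python, with SSL proxy and HTTP proxy support?

Sample site: http://www.flyscoot.com/index.php/en/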

+0

You're putting the cart before the horse. We help you debug code you've written, but you haven't given us any. Please read "[ask]" and "[mcve]". –

Answers

0

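I think this answer is what you're looking for: https://stackoverflow.com/a/1196151/1033953

You need to inspect the form's parameters to make sure the correct values are posted, but after that you can simply send an HTTP POST with Ruby's net/http.

I believe Python has something similar. Or you can POST with curl, as shown in this answer: https://superuser.com/a/149335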
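For the Python side, here is a minimal sketch using only the standard library's urllib, which plays roughly the role net/http does in Ruby. The field names in `EXAMPLE_FIELDS` and the choice of `FORM_ACTION_URL` are assumptions for illustration only; the real ones have to be read from the form's HTML. `build_post_request`, `build_opener` and `fetch` are hypothetical helper names, not part of any library.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Assumed placeholders: take the real action URL and field names from the form.
FORM_ACTION_URL = "http://www.flyscoot.com/index.php/en/"
EXAMPLE_FIELDS = {"from": "TPE", "to": "NYK", "date": "2015-10-12"}


def build_post_request(url, fields):
    """Return a POST request whose body is the URL-encoded form fields."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,  # supplying data makes urllib issue a POST
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def build_opener(proxies=None):
    """Return an opener that keeps cookies and optionally routes via proxies."""
    jar = http.cookiejar.CookieJar()
    handlers = [urllib.request.HTTPCookieProcessor(jar)]
    if proxies:
        # Example shape: {"http": "http://127.0.0.1:8080",
        #                 "https": "http://127.0.0.1:8080"}
        handlers.append(urllib.request.ProxyHandler(proxies))
    return urllib.request.build_opener(*handlers)


def fetch(opener, request, timeout=30):
    """Send the request and return (final_url, body_bytes)."""
    with opener.open(request, timeout=timeout) as response:
        # geturl() reports the URL reached after any redirects were followed.
        return response.geturl(), response.read()
```

Calling `fetch(build_opener(), build_post_request(FORM_ACTION_URL, EXAMPLE_FIELDS))` would send the form. urllib follows HTTP redirects by default, including redirects to a different host, and the cookie jar carries any session cookies set along the way.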

+0

Hi, I'll try it later, but won't the redirect to another domain break 'net/http'? Whatever I query, I get redirected to another domain, 'http://makeabooking.flyscoot.com/Flight/Select' – user3675188

2

You can easily copy the request from Chrome as a cURL command and use it:

F12 > Network > request > Right Click > Copy As cURL 

curl 'http://makeabooking.flyscoot.com/Flight/Select' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en-US,en;q=0.8,tr;q=0.6' -H 'Upgrade-Insecure-Requests: 1' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Referer: http://www.flyscoot.com/index.php/en/' -H 'Cookie: optimizelyEndUserId=oeu1444666692081r0.12463579000905156; __utmt=1; [email protected]=1444666699786; ASP.NET_SessionId=lql5yzv1l3yatkh1lcumg2e5; dotrez=1209262602.20480.0000; optimizelySegments=%7B%222335550040%22%3A%22gc%22%2C%222344180004%22%3A%22referral%22%2C%222354350067%22%3A%22false%22%2C%222355380121%22%3A%22none%22%7D; optimizelyBuckets=%7B%223025070068%22%3A%223020800213%22%7D; __utma=185425846.733949751.1444666694.1444666694.1444666694.1; __utmb=185425846.2.10.1444666694; __utmc=185425846; __utmz=185425846.1444666694.1.1.utmcsr=stackoverflow.com|utmccn=(referral)|utmcmd=referral|utmcct=/questions/33084039/how-could-i-query-the-result-without-selenium-on-python-or-ruby; granify.uuid=68b0d8e8-d068-40d8-9068-3098e870b858; [email protected]=1444666699786; [email protected]=8; _gr_ep_sent=1; _gr_er_sent=1; [email protected]=2; optimizelyPendingLogEvents=%5B%5D' -H 'Connection: keep-alive' -H 'X-FirePHP-Version: 0.0.6' -H 'Cache-Control: max-age=0' --compressed

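If you set the headers and cookies correctly, you can use Python requests. If you want to convert the cURL command to Python requests, you can use this link. This way you can simulate the browser. Here is the Python requests version: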

import requests

cookies = {
    'optimizelyEndUserId': 'oeu1444666692081r0.12463579000905156', 
    '__utmt': '1', 
    '[email protected]': '1444666699786', 
    'ASP.NET_SessionId': 'lql5yzv1l3yatkh1lcumg2e5', 
    'dotrez': '1209262602.20480.0000', 
    'optimizelySegments': '%7B%222335550040%22%3A%22gc%22%2C%222344180004%22%3A%22referral%22%2C%222354350067%22%3A%22false%22%2C%222355380121%22%3A%22none%22%7D', 
    'optimizelyBuckets': '%7B%223025070068%22%3A%223020800213%22%7D', 
    '__utma': '185425846.733949751.1444666694.1444666694.1444666694.1', 
    '__utmb': '185425846.2.10.1444666694', 
    '__utmc': '185425846', 
    '__utmz': '185425846.1444666694.1.1.utmcsr=stackoverflow.com|utmccn=(referral)|utmcmd=referral|utmcct=/questions/33084039/how-could-i-query-the-result-without-selenium-on-python-or-ruby', 
    'granify.uuid': '68b0d8e8-d068-40d8-9068-3098e870b858', 
    '[email protected]': '1444666699786', 
    '[email protected]': '8', 
    '_gr_ep_sent': '1', 
    '_gr_er_sent': '1', 
    '[email protected]': '2', 
    'optimizelyPendingLogEvents': '%5B%5D', 
} 

headers = { 
    'Accept-Encoding': 'gzip, deflate, sdch', 
    'Accept-Language': 'en-US,en;q=0.8,tr;q=0.6', 
    'Upgrade-Insecure-Requests': '1', 
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36', 
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8', 
    'Referer': 'http://www.flyscoot.com/index.php/en/', 
    'Connection': 'keep-alive', 
    'X-FirePHP-Version': '0.0.6', 
    'Cache-Control': 'max-age=0', 
} 

requests.get('http://makeabooking.flyscoot.com/Flight/Select', headers=headers, cookies=cookies) 

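If you save the response to a file, you can view the result in a browser (open stack1.html):

r = requests.get('http://makeabooking.flyscoot.com/Flight/Select', headers=headers, cookies=cookies)
# r.content is bytes, so the file is opened in binary mode
with open("stack1.html", "wb") as f:
    f.write(r.content)

+0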

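Hi, it seems the query needs a "random session value" in its headers, so I have to get that 'random session value' from the form page first and then build the query with the headers and cookies. Is that right? – user3675188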

+0

It depends on the page; you may have to, because some cookies can expire. –
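If the session value turns out to be embedded in the form page as a hidden input (an assumption; it could just as well live only in a cookie), one way to pick it up is to parse the form page before building the query. The sketch below uses only the standard library. `HiddenInputCollector`, `extract_hidden_fields`, and the field names in `SAMPLE_FORM` are hypothetical and not taken from the real site.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; values do not.
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


def extract_hidden_fields(html_text):
    """Return a dict of all hidden form fields found in html_text."""
    parser = HiddenInputCollector()
    parser.feed(html_text)
    parser.close()
    return parser.fields


# Made-up markup standing in for the real form page.
SAMPLE_FORM = (
    '<form method="post">'
    '<input type="hidden" name="session_token" value="abc123">'
    '<input type="text" name="from" value="TPE">'
    '</form>'
)
print(extract_hidden_fields(SAMPLE_FORM))  # {'session_token': 'abc123'}
```

The returned dict can be merged with the visible form fields before posting, while a cookie-aware client keeps the matching session cookie from the same page load.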