2016-11-10 59 views

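Calling a function with PhantomJS gives a different result than calling it from the console

I am trying to get NBA player statistics from this page. There is a UI button that converts a data table to CSV, and I am trying to automate that process. Under the hood, the button calls the function get_csv_output().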

In the browser's inspector console, get_csv_output("per_game") and get_csv_output("advanced") output the #per_game and #advanced tables in CSV format, respectively.

However, when I call get_csv_output() through PhantomJS, it only extracts the CSV data for the "per_game" table and fails for the "advanced" table.

var page = require('webpage').create();
page.open('http://www.basketball-reference.com/players/a/abdulka01.html', function() {
    var result = page.evaluate(function() {
        return get_csv_output("per_game");
    });
    console.log(result);
    phantom.exit();
});

This outputs the per_game table in CSV format, as expected. However, when I change the call to get_csv_output("advanced"), the output is:

Converting from PRE-Formatted to CSV does not work, please <span class=tooltip onClick="window.location.reload()">Reload</span> and then click CSV

I tried passing the IDs of some other tables as input, and per_game seems to be the only one that works.
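
To check several table IDs in one run, the same call can be looped and each result classified. This is only a minimal sketch: it assumes a failed conversion always begins with the error message quoted above, and `isCsvFailure` is a hypothetical helper name introduced for illustration. Only the `per_game` and `advanced` IDs come from this question.

```javascript
// Hypothetical helper: treats the site's error message (quoted above) as a failure marker.
// Assumption: every failed conversion starts with this exact text.
function isCsvFailure(output) {
    return typeof output !== 'string' ||
        output.indexOf('Converting from PRE-Formatted to CSV does not work') === 0;
}

// The scraping part only runs under PhantomJS; the helper above is plain JavaScript.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var tableIds = ['per_game', 'advanced']; // IDs taken from the question
    page.open('http://www.basketball-reference.com/players/a/abdulka01.html', function (status) {
        if (status !== 'success') {
            console.log('Failed to load page');
            phantom.exit(1);
            return;
        }
        tableIds.forEach(function (id) {
            // page.evaluate forwards extra arguments to the function run inside the page
            var result = page.evaluate(function (tableId) {
                return get_csv_output(tableId);
            }, id);
            console.log(id + ': ' + (isCsvFailure(result) ? 'FAILED' : 'ok'));
        });
        phantom.exit();
    });
}
```

Passing the ID as an argument to page.evaluate avoids editing the script for every table, since the evaluated function cannot see variables from the outer PhantomJS scope.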

Answer


The problem is solved; this version works:

// Make PhantomJS look like a regular desktop browser before any page script runs.
function on_init(page) {
    page.viewportSize = {width: 1600, height: 900};
    page.evaluate(function() {
        window.screen = {width: 1600, height: 900, availWidth: 1600, availHeight: 900};
        window.innerWidth = 1600; window.innerHeight = 900; window.outerWidth = 1600; window.outerHeight = 900;
        // Replace the navigator object with one that mimics Chrome 41 on Linux.
        window.navigator = {
            plugins: {length: 2, 'Shockwave Flash': {name: 'Shockwave Flash', filename: '/usr/lib/flashplugin-nonfree/libflashplayer.so', description: 'Shockwave Flash 11.2 r202', version: '11.2.202.440'}},
            mimeTypes: {length: 2, "application/x-shockwave-flash": {description: "Shockwave Flash", suffixes: "swf", type: "application/x-shockwave-flash", enabledPlugin: {name: 'Shockwave Flash', filename: '/usr/lib/flashplugin-nonfree/libflashplayer.so', description: 'Shockwave Flash 11.2 r202', version: '11.2.202.440'}}},
            appCodeName: "Mozilla",
            appName: "Netscape",
            appVersion: "5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36",
            cookieEnabled: 1,
            languages: "en-US,en",
            language: "en",
            onLine: 1,
            doNotTrack: null,
            platform: "Linux x86_64",
            product: "Gecko",
            vendor: "Google Inc.",
            vendorSub: "",
            productSub: 20030107,
            userAgent: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36",
            geolocation: {getCurrentPosition: function getCurrentPosition(){}, watchPosition: function watchPosition(){}, clearWatch: function clearWatch(){}},
            javaEnabled: function javaEnabled() { return 0; }
        };
    });
}

var page = require('webpage').create();
// onInitialized fires after the page object is created but before its scripts execute.
page.onInitialized = function() { on_init(page); };
page.open('http://www.basketball-reference.com/players/a/abdulka01.html', function() {
    var result = page.evaluate(function() {
        return get_csv_output("advanced");
    });
    console.log(result);
    phantom.exit();
});

./phantomjs test.js >>/dev/stdout


Could you please explain how you knew to make these changes, and why they are necessary? – Mahir


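Yes, we need to change at least the 'UserAgent' for this script to work. With the changes I made, the page sees a fake navigator object that looks like a regular browser. – 2016-11-10 01:11:49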


An extended version of the navigator object from this example: http://pastebin.com/kSndS8jX – 2016-11-10 01:16:44
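
The earlier comment says that at least the user agent has to change. A lighter variant to try first is PhantomJS's own page.settings.userAgent setting instead of replacing the whole navigator object. This is a sketch under assumptions, not a confirmed fix: whether the user agent alone is enough for this site is not verified here, and `browserLikeSettings` is a hypothetical helper name. The user agent string and viewport size are copied from the answer.

```javascript
// Hypothetical helper: collects the browser-like settings in one place.
// The values are copied from the answer above.
function browserLikeSettings() {
    return {
        userAgent: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36',
        viewportSize: { width: 1600, height: 900 }
    };
}

// The scraping part only runs under PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var settings = browserLikeSettings();
    // Both properties must be set before page.open so they apply to the request.
    page.settings.userAgent = settings.userAgent;
    page.viewportSize = settings.viewportSize;
    page.open('http://www.basketball-reference.com/players/a/abdulka01.html', function (status) {
        if (status !== 'success') {
            console.log('Failed to load page');
            phantom.exit(1);
            return;
        }
        var result = page.evaluate(function () {
            return get_csv_output("advanced");
        });
        console.log(result);
        phantom.exit();
    });
}
```

If this still prints the "Converting from PRE-Formatted" message, the page likely depends on more than the user agent, and the full navigator override from the answer is the fallback.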
