2012-09-16 162 views

Get web page source code, including JavaScript-generated content

I am trying to create a Google Keyword Tool scraper with AutoIt. I am using the code below:

#include <IE.au3> 
$oIE = _IECreate ("https://adwords.google.com/o/KeywordTool") 
sleep(20000) 
$source = _IEDocReadHTML ($oIE) 

MsgBox(0,'',$source) 

(The sleep is there to give me time to type a query and click Search in the IE window. In the future I will automate this.)

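The HTML source it outputs does not contain the results table, even though I can see the table in Firebug. Below is a single row that I extracted with Firebug.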

<tr __gwt_row="19" __gwt_subrow="0" class="sCT"><td class="sBS sDT sES" align="left"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1059"><div id="gwt-debug-column-SELECTION-row-19-0"><input type="checkbox" class="sML"></div></div></td><td class="sBS sDT" align="left"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1060"><div id="gwt-debug-column-KEYWORD-row-19-1"><span style="white-space:nowrap"><span></span><span><a class="sOL" gwtuirendered="gwt-uid-1089"><b>windows</b> live</a></span><span></span></span></div></div></td><td class="sBS sDT" align="left"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1062"><div id="gwt-debug-column-COMPETITION-row-19-2"><div title="0,04">Bassa</div></div></div></td><td class="sBS sDT" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1063"><div id="gwt-debug-column-GLOBAL_MONTHLY_SEARCHES-row-19-3">20.400.000</div></div></td><td class="sBS sDT" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1064"><div id="gwt-debug-column-AVERAGE_TARGETED_MONTHLY_SEARCHES-row-19-4">20.400.000</div></div></td><td class="sBS sDT" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1065"><div id="gwt-debug-column-SUGGESTED_BID-row-19-5">€&nbsp;0,40</div></div></td><td class="sBS sDT aw-ti-advertiser-specific-cell" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1066"><div id="gwt-debug-column-AD_SHARE-row-19-6">-</div></div></td><td class="sBS sDT" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1067"><div id="gwt-debug-column-AVERAGE_MONTHLY_SEARCHES_WITH_AFS-row-19-7">-</div></div></td><td class="sBS sDT aw-ti-advertiser-specific-cell" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1068"><div id="gwt-debug-column-SEARCH_SHARE-row-19-8">-</div></div></td><td class="sBS sDT" align="right"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1069"><div 
id="gwt-debug-column-TARGETED_MONTHLY_SEARCHES-row-19-9"><div style="width: 108px; white-space: nowrap" dir="ltr"><div style="width: 8px;height: 16px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 16px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 13px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 13px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 16px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 13px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 13px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 10px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 10px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 10px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 
8px;height: 10px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div><div style="width: 1px;" class="goog-inline-block"></div><div style="width: 8px;height: 10px; background-color: #A4D0BB; vertical-align:bottom;" class="goog-inline-block" title=""></div></div></div></div></td><td class="sBS sDT sOS" align="left"><div style="outline-style:none;" __gwt_cell="cell-gwt-uid-1070"><div id="gwt-debug-column-EXTRACTED_FROM_WEBPAGE-row-19-10">-</div></div></td></tr> 

Is there a way to get the complete source with AutoIt, including the content generated by JavaScript?

Answer


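I would use an HTTP request, because it is the most direct way to do this. It seems to give a 404 status, by the way. Edit: the URL was missing its last letter, which caused the 404 status.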

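; Fetch a URL with the WinHTTP COM object and show the response body.
; ObjCreate, IsObj and MsgBox are built in, so no #include is required.
MsgBox(0, "Response", get_url("https://adwords.google.com/o/KeywordTool"))

Func get_url($url)
    ; Keep the COM object local to the function instead of declaring a Global.
    Local $oHTTP = ObjCreate("WinHttp.WinHttpRequest.5.1")
    If Not IsObj($oHTTP) Then Return "ooops... could not create the WinHTTP object"

    $oHTTP.Open("GET", $url, False) ; False = synchronous request
    $oHTTP.Send()

    ; Use = for the numeric comparison; == is a case-sensitive string comparison in AutoIt.
    If $oHTTP.Status = 200 Then
        Return $oHTTP.ResponseText
    Else
        Return "ooops... status: " & $oHTTP.Status
    EndIf
EndFunc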
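
A plain HTTP response contains only what the server sends, so rows that the page's own scripts build afterwards may not be in it. A quick way to check whether a given source string actually contains the results table is to look for the element ids seen in the Firebug row. The sketch below is only an illustration: has_results and $RESULT_MARKER are hypothetical names, the marker is assumed from the ids in the question (gwt-debug-column-KEYWORD-row-...), and the sample strings are shortened stand-ins rather than real responses.

```autoit
; Marker assumed from the element ids in the Firebug row shown in the question.
Global Const $RESULT_MARKER = "gwt-debug-column-KEYWORD"

; has_results is a hypothetical helper, not part of AutoIt or any UDF.
; Returns True when the given page source contains at least one results cell.
Func has_results($sHtml)
    ; StringInStr returns the 1-based position of the match, or 0 if not found.
    Return StringInStr($sHtml, $RESULT_MARKER) > 0
EndFunc

; Shortened sample fragment based on the question, and an empty page for contrast.
Local $sWithRow = '<div id="gwt-debug-column-KEYWORD-row-19-1"><b>windows</b> live</div>'
Local $sWithout = '<html><body></body></html>'

ConsoleWrite("with row: " & has_results($sWithRow) & @CRLF)
ConsoleWrite("without:  " & has_results($sWithout) & @CRLF)
```

The same check can be applied to either source, for example has_results(get_url("https://adwords.google.com/o/KeywordTool")) for the HTTP response or has_results($source) for the string returned by _IEDocReadHTML in the question.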