2011-07-15 105 views
3

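I want to monitor a website (www.bidcactus.com). With Firebug open on the site, I go to the Net tab and click the XHR tab. How can I capture the requests the site makes and retrieve their responses?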

I want to take the response to such a request and save it to a MySQL database (I have a local one running on my computer under XAMPP).

I've been told to do all sorts of things, mostly using jQuery or JavaScript, but I have no experience with either, so I was wondering if anyone could help me here.

Someone suggested this link: Using Greasemonkey and jQuery to intercept JSON/AJAX data from a page, and process it

It uses Greasemonkey, which I also don't know much about...

Thanks in advance for any help.

Example / more details:
While monitoring the requests being sent (via Firebug), I see the following:

http://www.bidcactus.com/CactusWeb/ItemUpdates?rnd=1310684278585 
The response of this link is the following: 
{"s":"uk5c","a":[{"w":"MATADORA","t":944,"p":5,"a":413173,"x":10}, 
{"w":"1000BidsAintEnough","t":6,"p":863,"a":413198,"x":0}, 
{"w":"YourBidzWillBeWastedHere","t":4725,"p":21,"a":413200,"x":8}, 
{"w":"iwillpay2much","t":344,"p":9,"a":413201,"x":9}, 
{"w":"apcyclops84","t":884,"p":3,"a":413213,"x":14}, 
{"w":"goin_postal","t":165,"p":5,"a":413215,"x":12}, 
{"w":"487951","t":825,"p":10,"a":413218,"x":6}, 
{"w":"mishmash","t":3225,"p":3,"a":413222,"x":7}, 
{"w":"CrazyKatLady2","t":6464,"p":1,"a":413224,"x":2}, 
{"w":"BOSS1","t":224,"p":102,"a":413230,"x":4}, 
{"w":"serbian48","t":62,"p":2,"a":413232,"x":11}, 
{"w":"Tuffenough","t":1785,"p":1,"a":413234,"x":1}, 
{"w":"apcyclops84","t":1970,"p":1,"a":413240,"x":13}, 
{"w":"Tuffenough","t":3524,"p":1,"a":413244,"x":5}, 
{"w":"Cdm17517","t":1424,"p":1,"a":413252,"x":3}],"tau":"0"} 

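I understand what this information is, and I think I can format it myself, but the site keeps creating new requests on its own with a different rnd value each time.
Example: http://www.bidcactus.com/CactusWeb/ItemUpdates?rnd=XXXXXXXXXXXX
I'm not sure how it creates them.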

So I need to capture every ItemUpdates request and send the data from its response to the MySQL database.
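On the changing `rnd` value: the sample `rnd=1310684278585` looks like a millisecond Unix timestamp, which pages commonly append to a URL as a cache-buster. That is an inference from the value, not something the site documents. Under that assumption, a minimal sketch that recognises any ItemUpdates request regardless of `rnd`, and decodes `rnd` as a date (the helper names are hypothetical):

```javascript
// Assumption: `rnd` is a millisecond Unix timestamp used as a cache-buster.
// Both helper names below are made up for this illustration.
const SITE_BASE = "http://www.bidcactus.com/";

// True for any ItemUpdates request, whatever its rnd value is.
// Relative URLs are resolved against SITE_BASE.
function isItemUpdatesUrl(url) {
  return new URL(url, SITE_BASE).pathname === "/CactusWeb/ItemUpdates";
}

// Returns the rnd value as an ISO date string, or null if it is not numeric.
function rndAsDate(url) {
  const rnd = new URL(url, SITE_BASE).searchParams.get("rnd");
  const millis = Number(rnd);
  if (!rnd || !Number.isFinite(millis)) return null;
  return new Date(millis).toISOString();
}

const sampleUrl = "http://www.bidcactus.com/CactusWeb/ItemUpdates?rnd=1310684278585";
console.log(isItemUpdatesUrl(sampleUrl)); // true
console.log(rndAsDate(sampleUrl));        // 2011-07-14T22:57:58.585Z
```

If the assumption holds, the `rnd` value itself can be ignored; only the path identifies the request.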

+0

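This is possible with Greasemonkey, but it doesn't get any simpler than the link you referenced. [More details would help](http://stackoverflow.com/questions/how-to-ask). For example, save the source of the page you want to monitor to pastebin.com, then point out the parts you want to monitor and post. Consider breaking the problem into smaller chunks. PS: The target site does not appear to use jQuery, but it does use the [YUI library](http://developer.yahoo.com/yui/).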

+0

I've edited the first post with as much information as I can.

+0

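Thanks for the additional information. This task is not too hard, but it can be involved; so unless someone else beats me to it, it may be a day or two before I can post an answer. In the meantime, people have posted [plenty about intercepting Ajax calls](http://stackoverflow.com/q/629671/331508). Try some of that code and see how it goes. ;)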

Answers

3

OK, here's working code, somewhat tuned for that site (front page only, without an account).

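Usage instructions:

  1. Install the Greasemonkey (GM) script. Note that it's Firefox only, for now.

  2. Watch it run in Firebug's console, and tune the filter section (clearly marked) to target the data you are interested in. (Maybe the whole `a` array?)

    Note that it can take several seconds after "Script Start" is printed for the ajax intercepts to start.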

  3. Set up your web application and server to receive the data. The script POSTs JSON, so PHP, for example, would grab the data like so:

    $jsonData = json_decode (file_get_contents ('php://input'));
    
  4. Point the script to your server.

  5. Voilà. She is done.
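Step 3 stops at decoding the JSON; getting a row into MySQL is left to the server. As a rough sketch of that last step, here is one posted row being turned into a parameterized INSERT. It is written in JavaScript only to match the userscript; the same shape carries over to a PHP prepared statement. The table name `item_updates` and the helper `rowToInsert` are hypothetical, and the columns simply mirror the JSON keys because the meaning of each key is not documented.

```javascript
// Hypothetical column list: mirrors the keys seen in the sample response.
const COLUMNS = ["a", "w", "t", "p", "x"];

// Turns the raw POST body (one JSON row) into SQL text plus bound values.
// Values are never spliced into the SQL string; they are passed separately.
function rowToInsert(jsonText, table = "item_updates") {
  const row = JSON.parse(jsonText);
  const params = COLUMNS.map((key) => (key in row ? row[key] : null));
  const placeholders = COLUMNS.map(() => "?").join(", ");
  const sql =
    "INSERT INTO " + table + " (" + COLUMNS.join(", ") + ") VALUES (" + placeholders + ")";
  return { sql, params };
}

const posted = '{"w":"MATADORA","t":944,"p":5,"a":413173,"x":10}';
const insert = rowToInsert(posted);
console.log(insert.sql);
console.log(insert.params);
```

Keeping the values in a separate parameter list matters here because `w` is text supplied by other users of the site.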


/****************************************************************************** 
******************************************************************************* 
** This script intercepts ajaxed data from the target web pages. 
** There are 4 main phases: 
**  1) Intercept XMLHttpRequest's made by the target page. 
**  2) Filter the data to the items of interest. 
**  3) Transfer the data from the page-scope to the GM scope. 
**   NOTE: This makes it technically possible for the target page's 
**     webmaster to hack into GM's slightly elevated scope and 
**     exploit any XSS or zero-day vulnerabilities, etc. The risk 
**     is probably zero as long as you don't start any feuds. 
**  4) Use GM_xmlhttpRequest() to send the data to our server. 
******************************************************************************* 
******************************************************************************* 
*/ 
// ==UserScript== 
// @name   _Record ajax, JSON data. 
// @namespace  stackoverflow.com/users/331508/ 
// @description  Intercepts Ajax data, filters it and then sends it to our server. 
// @include   http://www.bidcactus.com/* 
// ==/UserScript== 

DEBUG = true; 
if (DEBUG) console.log ('***** Script Start *****'); 


/****************************************************************************** 
******************************************************************************* 
** PHASE 1 starts here, this is the XMLHttpRequest intercept code. 
** Note that it will not work in GM's scope. We must inject the code to the 
** page scope. 
******************************************************************************* 
******************************************************************************* 
*/ 
funkyFunc = ((<><![CDATA[ 

    DEBUG   = false; 
    //--- This is where we will put the data we scarf. It will be a FIFO queue. 
    payloadArray = []; //--- PHASE 3a 

    (function (open) { 
     XMLHttpRequest.prototype.open = function (method, url, async, user, pass) 
     { 
      this.addEventListener ("readystatechange", function (evt) 
      { 
       if (this.readyState == 4 && this.status == 200) //-- Done, & status "OK". 
       { 
        var jsonObj = null; 
        try { 
         jsonObj = JSON.parse (this.responseText); // FF code. Chrome?? 
        } 
        catch (err) { 
         //if (DEBUG) console.log (err); 
        } 
        //if (DEBUG) console.log (this.readyState, this.status, this.responseText); 

        /****************************************************************************** 
        ******************************************************************************* 
        ** PHASE 2: Filter as much as possible, at this stage. 
        **    For this site, jsonObj should be an object like so: 
        **     { s="1bjqo", a=[15], tau="0"} 
        **    Where a is an array of objects, like: 
        **     a 417387 
        **     p 1 
        **     t 826 
        **     w "bart69" 
        **     x 7 
        ******************************************************************************* 
        ******************************************************************************* 
        */ 
        //if (DEBUG) console.log (jsonObj); 
        if (jsonObj && jsonObj.a && jsonObj.a.length > 1) { 
         /*--- For demonstration purposes, we will only get the 2nd row in 
          the `a` array. (Probably stands for "auction".) 
         */ 
         payloadArray.push (jsonObj.a[1]); 
         if (DEBUG) console.log (jsonObj.a[1]); 
        } 
        //--- Done at this stage! Rest is up to the GM scope. 
       } 
      }, false); 

      open.call (this, method, url, async, user, pass); 
     }; 
    }) (XMLHttpRequest.prototype.open); 
]]></>).toString()); 


function addJS_Node (text, s_URL) 
{ 
    var scriptNode      = document.createElement ('script'); 
    scriptNode.type      = "text/javascript"; 
    if (text) scriptNode.textContent = text; 
    if (s_URL) scriptNode.src   = s_URL; 

    var targ = document.getElementsByTagName('head')[0] || document.body || document.documentElement; 
    targ.appendChild (scriptNode); 
} 

addJS_Node (funkyFunc); 


/****************************************************************************** 
******************************************************************************* 
** PHASE 3b: 
** Set up a timer to check for data from our ajax intercept. 
** Probably best to make it slightly faster than the target's 
** ajax frequency (about 1 second?). 
******************************************************************************* 
******************************************************************************* 
*/ 
timerHandle = setInterval (function() { SendAnyResultsToServer(); }, 888); 

function SendAnyResultsToServer() 
{ 
    if (unsafeWindow.payloadArray) { 
     var payload  = unsafeWindow.payloadArray; 
     while (payload.length) { 
      var dataRow = JSON.stringify (payload[0]); 
      payload.shift(); //--- pop measurement off the bottom of the stack. 
      if (DEBUG) console.log ('GM script, pre Ajax: ', dataRow); 

      /****************************************************************************** 
      ******************************************************************************* 
      ** PHASE 4: Send the data, one row at a time, to our server. 
      ** The server would grab the data with: 
      **  $jsonData = json_decode (file_get_contents ('php://input')); 
      ******************************************************************************* 
      ******************************************************************************* 
      */ 
      GM_xmlhttpRequest ({ 
       method:  "POST", 
       url:  "http://localhost/db_test/ShowJSON_PostedData.php", 
       data:  dataRow, 
       headers: {"Content-Type": "application/json"}, 
       onload:  function (response) { 
           if (DEBUG) console.log (response.responseText); 
          } 
      }); 
     } 
    } 
} 


//--- EOF 



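Miscellaneous notes:

  1. I tested this on the site's main page, without being logged in (I'm not going to create an account there).

  2. I tested with AdBlock, FlashBlock, NoScript, and RequestPolicy all in full effect. JS was turned on for bidcactus.com (it has to be) but for no other sites. Turning all that stuff back on should not cause side effects; but if it does, I'm not going to debug it.

  3. Code like this has to be tuned for the site and for how you browse that site. That's up to you. Hopefully the code is self-documenting enough.

  4. Enjoy!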



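The main things to tune are: the `@include` and `@exclude` directives, the JSON data selection and filtering, and whether iFrames need to be blocked. Also, it is recommended that the two `DEBUG` variables (one for the GM scope and one for the page scope) be set to `false` once the tuning is done.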
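For the JSON selection and filtering part, the script as posted only queues the 2nd row of the `a` array. A minimal sketch of a more general selector follows; `selectRows` is a hypothetical helper, not part of the script, and the `t < 100` predicate is an arbitrary example since the meaning of `t` is not documented.

```javascript
// Hypothetical helper: picks which rows of the `a` array to queue.
// With no predicate it returns a copy of the whole array.
function selectRows(jsonObj, wanted) {
  if (!jsonObj || !Array.isArray(jsonObj.a)) return [];
  if (typeof wanted !== "function") return jsonObj.a.slice();
  return jsonObj.a.filter(wanted);
}

// Three rows taken from the sample response in the question.
const sampleResponse = {
  s: "uk5c",
  a: [
    { w: "MATADORA", t: 944, p: 5, a: 413173, x: 10 },
    { w: "1000BidsAintEnough", t: 6, p: 863, a: 413198, x: 0 },
    { w: "serbian48", t: 62, p: 2, a: 413232, x: 11 },
  ],
  tau: "0",
};

const everyRow = selectRows(sampleResponse);
const lowT = selectRows(sampleResponse, (row) => row.t < 100);
console.log(everyRow.length, lowT.length);
```

In the intercept, `payloadArray.push (jsonObj.a[1]);` could then become `payloadArray.push.apply (payloadArray, selectRows (jsonObj));`, with `selectRows` defined inside the injected page-scope code.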

+0

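This worked well for me, but I had to replace the CDATA construct, since I believe E4X support has been removed from recent browsers. I used a plain string and scriptNode.innerHTML instead of scriptNode.textContent. – AlvaHenrik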

+0

@AlvaHenrik Yes, E4X has been dead for a while now; I'll update this answer at some point. (Oddly, Tampermonkey may still provide special support for CDATA used this way.)
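One common way to avoid E4X is to write the page-scope code as an ordinary function and inject its source text wrapped in an immediately-invoked call. The sketch below shows only that wrapping step. `pageScopeCode` and `toInjectableSource` are hypothetical names, the function body is a stand-in for the real intercept code, and this is not the author's promised update.

```javascript
// Stand-in for the code that must run in the page's scope.
// In the real script this would hold the XMLHttpRequest intercept.
function pageScopeCode() {
  window.payloadArray = [];
}

// Wraps a function's source so that it runs as soon as it is evaluated.
function toInjectableSource(fn) {
  return "(" + fn.toString() + ")();";
}

const injectable = toInjectableSource(pageScopeCode);
console.log(injectable);

// In the userscript (browser only), the string would be handed to the
// existing helper:  addJS_Node (injectable);
```

Because only the source text crosses over, the injected function cannot see variables from the GM scope; anything it needs has to be defined inside it.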

-3

This cannot be done with JavaScript/jQuery Ajax requests, because of the Same origin policy.

I don't have any experience with Greasemonkey, though.

+0

This is entirely possible with Greasemonkey, which is not bound by the same-origin restriction.
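To make the same-origin point concrete, here is a small sketch comparing the origins involved, using the URLs from this thread. `isSameOrigin` is a hypothetical helper and this is a simplification of the browser's actual policy checks.

```javascript
// Two URLs share an origin when scheme, host and port all match.
function isSameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

const pageUrl = "http://www.bidcactus.com/";
const updatesUrl = "http://www.bidcactus.com/CactusWeb/ItemUpdates?rnd=1310684278585";
const serverUrl = "http://localhost/db_test/ShowJSON_PostedData.php";

// The page polling its own ItemUpdates URL stays within one origin.
console.log(isSameOrigin(pageUrl, updatesUrl)); // true

// Posting the captured rows to the local server crosses origins.
console.log(isSameOrigin(pageUrl, serverUrl)); // false
```

The first case is why the injected code can read the site's own responses; the second is the request the accepted answer hands to `GM_xmlhttpRequest()` rather than to a page-scope XMLHttpRequest.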