2017-07-16 122 views
0

How to make an XHR POST request in Python

So, I'm trying to scrape a website that needs a POST request to retrieve its data, but I've had no luck. My latest attempt is this:

    from requests import Session
    from bs4 import BeautifulSoup

    session = Session()

    # HEAD requests ask for *just* the headers, which is all you need to grab the
    # session cookie
    session.head('http://www.betrebels.gr/sports')

    response = session.post(
     #url = "https://sports-itainment.biahosted.com/WebServices/SportEvents.asmx/GetEvents", 
     url='http://www.betrebels.gr/sports', 
     data={ 
       'champIds':   '["1191783","1191784","1191785","939911","939912","939913","939914","175","190686","198881","542378","217750","91","201","2","38","201614","454","63077","60920","384","49251","61873","87095","110401","111033","122008","122019","342","343","344","430","213","95","10","1240912","1237673","1239055","339","340","124","1381","260549","1071542","437","271","510","1241462","72","277","137","308","488","2131","59178","433","434","347","203","348","349","92420","148716","322","184","127983","321","88173","417","418","284","2688","103419","618","487","56029","214640","215229","514","92","302","1084811","1084813","1084831","68739","81852","406","100","70","172","351","541730","541732","541733","548965","552442","554615","554616","554617","361","136","519","279","65","319","364","75","220","194676","149","121443","110902","171694","152501","568313","126998","758","740","1264928"]', 
       'dateFilter':'All', 
       'eventIds':'[]', 
       'marketsId':'-1', 
       'skinId':"betrebels" 
      }, 

     headers={'Accept':'application/json, text/javascript, */*; q=0.01', 
      'Accept-Encoding':'gzip, deflate, br', 
      'Accept-Language':'el-GR,el;q=0.8', 
      'Connection':'keep-alive', 
      'Content-Length':'701', 
      'Content-Type':'application/json; charset=UTF-8', 
      'Cookie':'Language=el-gr;   ASP.NET_SessionId=kp0b2xwf2vzuci4uwn33uh1o; IsBetApp=False; _ga=GA1.2.1005994943.1499255280; _gid=GA1.2.1197736989.1500201903; _gat=1; ParentUrl=ParentUrl is not need', 
      'DNT':'1', 
      'Host':'sports-itainment.biahosted.com', 
      'Origin':'https://sports-itainment.biahosted.com', 
      'Referer':'https://sports-itainment.biahosted.com/generic/prelive.aspx?token=&clientTimeZoneOffset=-180&lang=el-gr&walletcode=508729&skinid=betrebels&parentUrl=https%3A//ps.equalsystem.com/ps/game/BIASportbook.action', 
      'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36', 
      'X-Requested-With':'XMLHttpRequest'   
      } 
     ) 

    print(response.text)

    soup = BeautifulSoup(response.content, "html.parser")

    leagues = soup.find_all("div", {"class": "championship-header"})
    links = soup.find_all("a")

    # Print every link's target and text
    for link in links:
        print(link.get("href"), link.text)

    # Print each league's header text and its first <span>
    for item in leagues:
        print(item.find_all("div", {"class": "header"})[0].text)
        print(item.find_all("span")[0].text)

Any ideas? I want to scrape all the football leagues from betrebels.com.

Answer

0

The actual data is cleaner and easier to get from the real source, which you can find by digging through the requests your browser makes. Here is the URL: https://s5.sir.sportradar.com/betinaction/en/1

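It also supports JSON natively, which means you can cut this down to just the requests module, plus the json module if you need it, although requests can already hand you the raw JSON parsed into a dictionary.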
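A minimal sketch of that idea, not the answerer's own code: `fetch_json` is a hypothetical helper that takes any object with a requests-style `get` method (for example a `requests.Session`) and returns the body already decoded by `Response.json()`. The `Accept` header and the timeout value are assumptions, and nothing here presumes what the feed's JSON actually looks like.

```python
def fetch_json(session, url, timeout=10):
    """GET `url` with a requests-style session and return the decoded JSON body."""
    response = session.get(
        url,
        headers={"Accept": "application/json"},  # assumption: ask for JSON explicitly
        timeout=timeout,
    )
    # Fail loudly on 4xx/5xx instead of trying to parse an error page.
    response.raise_for_status()
    # Response.json() decodes the body into dicts/lists, so the separate
    # json module is only needed for re-serialising or pretty-printing.
    return response.json()


# Usage (needs network access and the third-party requests package):
#   import requests
#   tree = fetch_json(requests.Session(), "<feed URL from the answer>")
```

Passing the session in keeps cookies and connection reuse in one place if several feed URLs are fetched in a row.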

All of this means you can radically simplify the scraping process and still get what you want.

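You can find all the leagues for all the countries here: https://ls.sportradar.com/ls/feeds/?/betinaction/en/Europe:Berlin/gismo/config_tree/41/0/1 You just need to grab all the _id fields, then loop over them and build a URL for each one in a format such as https://s5.sir.sportradar.com/betinaction/en/1/category/ + _id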
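The loop over `_id` fields could be sketched as follows. The layout of the config_tree feed is not shown in this thread, so the code makes no assumption about nesting and simply walks the whole decoded JSON looking for `_id` keys. `collect_ids`, `category_urls`, and the `sample` data are all hypothetical, and whether every `_id` in the real feed maps to a category page is not confirmed here.

```python
CATEGORY_BASE = "https://s5.sir.sportradar.com/betinaction/en/1/category/"


def collect_ids(node):
    """Yield every scalar value stored under an "_id" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "_id" and isinstance(value, (int, str)):
                yield value
            else:
                yield from collect_ids(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_ids(item)


def category_urls(tree):
    """Build one category URL per distinct _id, keeping first-seen order."""
    seen = set()
    urls = []
    for _id in collect_ids(tree):
        key = str(_id)
        if key not in seen:
            seen.add(key)
            urls.append(CATEGORY_BASE + key)
    return urls


# Made-up data for illustration only; the real feed's shape may differ.
sample = {"doc": [{"data": [{"_id": 1, "name": "A", "children": [{"_id": 31}]},
                            {"_id": "7"}]}]}
for url in category_urls(sample):
    print(url)
```

Walking the structure recursively trades precision for robustness: it will pick up `_id` values that are not categories, so the resulting URLs are worth spot-checking in a browser before scraping them.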

But if you inspect the requests, you should be able to grab the original URL for that as well...

I'm leaving the rest to you, but everything you want is there, and it's easier to read and access.

+0

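What exactly do you mean? The link you posted has all the leagues and teams but no odds. I really want to scrape the odds and the leagues, so how does this link help me? Also, why don't I get the data back from the POST when I run my code? – Geraki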

+0

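Did you actually go to the first URL I posted and click on the team links? There are tons of statistics there, but I'm not going to do it for you... I just offered a simpler solution, and since you haven't specified what data you need besides "teams", I can't read your mind. –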