2017-08-02 45 views

Trying to scrape this page with Python, but it returns gibberish

I am following this tutorial: https://www.youtube.com/watch?v=XQgXKtPSzUI&list=WL&index=93

This is the script I wrote to try to scrape steemit:

from urllib.request import urlopen as uReq 
from bs4 import BeautifulSoup as soup 


my_url = 'https://steemit.com/test/@bitcoinfree/test-4' 

uClient = uReq(my_url) 
page_html = uClient.read() 
uClient.close() 
page_soup = soup(page_html,'html.parser') 
print(page_soup.prettify("utf-8")) 

Currently the code outputs gibberish.

I don't know how to get the plain HTML source. What am I doing wrong? :(

Check your accept headers for the gzip option.
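
The gzip suggestion can be sketched with the standard library alone. This assumes the server may send a gzip-compressed body, which `urlopen` does not decompress on its own. `decode_body` and `fetch_html` are hypothetical helper names, and the UTF-8 charset is an assumption, not something confirmed for this page.

```python
import gzip
from urllib.request import Request, urlopen


def decode_body(raw, content_encoding=None, charset="utf-8"):
    """Return the response body as text, gunzipping it first if needed."""
    # Only decompress when the server says the body is gzip-encoded.
    if (content_encoding or "").lower() == "gzip":
        raw = gzip.decompress(raw)
    # Assumption: the page is UTF-8; undecodable bytes are replaced, not fatal.
    return raw.decode(charset, errors="replace")


def fetch_html(url):
    """Fetch a page with urllib and pass the body through decode_body."""
    request = Request(url, headers={"Accept-Encoding": "gzip"})
    with urlopen(request) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)
```

Calling `fetch_html(my_url)` would then give a `str` that can be passed to BeautifulSoup, if the gzip assumption holds for this site.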

Answer

Got it.

import requests 
from bs4 import BeautifulSoup 

url = 'https://steemit.com/test/@bitcoinfree/test-4' 
r = requests.get(url) 
soup = BeautifulSoup(r.content, "html.parser") 

print(soup.prettify())
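
One likely reason this version prints readable HTML: `prettify()` with no encoding argument returns a `str`, while `prettify("utf-8")` in the question returns `bytes`, and `print` on `bytes` shows the escaped `b'...'` form. The sketch below illustrates that difference with plain Python bytes rather than BeautifulSoup. The sample markup is made up for illustration.

```python
# Made-up sample markup standing in for a downloaded page.
html_text = "<p>\n caf\u00e9\n</p>"
html_bytes = html_text.encode("utf-8")

# print() on a bytes object shows its repr: escapes instead of real
# newlines and accented characters.
shown_for_bytes = repr(html_bytes)

# Decoding the bytes first yields readable text.
shown_for_text = html_bytes.decode("utf-8")

print(shown_for_bytes)
print(shown_for_text)
```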