We can define a custom formatter and pass it to the BeautifulSoup.prettify
method (in your case the problem is probably Unicode escape sequences; the solution is taken from here):
import requests
from bs4 import BeautifulSoup
r = requests.get('https://m.facebook.com/pages_reaction_units/more/?page_id=126721934083985&cursor={%22timeline_cursor%22:%22page_photos%22,%22timeline_section_cursor%22:null,%22has_next_page%22:true}&surface=mobile_page_home&unit_count=3')
def formatter(string):
    # Python 3: turn literal escapes such as \u0412 into real characters
    formatted = bytes(string, 'ascii').decode('unicode-escape')
    # for Python 2
    # formatted = string.decode('unicode-escape')
    return formatted
res = r.content.decode()
soup = BeautifulSoup(res, 'lxml')
print(soup.prettify(formatter=formatter))
which gives us
<html>
<body>
<p>
for (;;);{"actions":[{"cmd":"replace","target":"m_pages_reaction_see_more_unit","html":"<div class="_3-8x"><div class="_55wo _gui" id="unit_id_688092377989036"><div><div class="_3w7e"><div class="_55wo _56bf _5rgl" data-ft="{"top_level_post_id":"1396231463799686","tl_objid":"1396231463799686","page_id":"126721934083985"}" id="u_0_4"><div><h3 class="_52ja _52jh"><span><strong><a href="\/tjournal\/?__tn__=C">TJournal<\/a><\/strong><\/span><\/h3><div class="_5rgn" style=""><span><p>В московском метро оператор Wi-Fi объяснил наличие рекламы шутками про котят и гимнастику для глаз.<\/p><\/span><\/div><div class="_5rgo _27x0"><a class="_5tip _5tfq" href="https:\/\/lm.facebook.com\/l.php?u=https%3A%2F%2Fvc.ru%2Fp%2Fmosmetro-wifi&h=ATOEzvb7duNxYzFzO8Y_WNBP24h3OXswMOwrGBf1UOpOgSMNW0A8saNQTZxNVw2_hzp3bRyNC8rInRxtEbxx5ubADIB0GFnFadbwXzCKOCSq2k66bzaM2H-XTN8Y&enc=AZPdwIbpMFV3FWXB3fcldNTj5RGMsvmR3tL1UhSg7fx6yOErLQUUIu5_DtOs5sw7v7oUxaeuc7rfgXDJ66XN1n68KN53bmSqkpFfSTTOJK650QG9uJPCNnlHZJl0zgV5OIn1nwpLZfkxn2WHCFb9z5DpTz3k-hXMYvBg9Di00D1Wqg&s=1" id="u_0_5" target="_blank"><div class="_59e9 _55wr _5til"><table class="_4g33 _52wc"><tbody><tr><td class="_5s61 _3lsd"><img src="https:\/\/external.xx.fbcdn.net\/safe_image.php?d=AQCYmQ6aqZ7CRrUU&w=149&h=149&url=https%3A%2F%2Fvc.ru%2Fcover%2F24313%2Ffb%2Fcover.jpg&cfs=1&jq=75&ext=jpg&_nc_hash=AQCFGypsajWMdGm0" width="112" height="112" class="img" \/><\/td><td class="_4g34"><h1 class="_52jd _52jh _ing">«В мире успеет родиться 273 котёнка»: оператор Wi-Fi в московском метро решил объяснить наличие рекламы шуточными цитатами<\/h1><div class="_52jc _2b-u">vc.ru<div class="_52jc"><\/div><\/div><\/td><\/tr><\/tbody><\/table><\/div><\/a><\/div><\/div><div class="_5sq4"><div class="mfss fcg"><abbr>27 мин<\/abbr><span> · <\/span><span class="_26zb"><span><span class="mfss">Доступно всем<\/span><\/span><\/span><\/div><div class="mfss fcg"><a href="\/story.php?story_fbid=1396231463799686&id=126721934083985#footer_action_list">Новость 
целиком<\/a><\/div><\/div><\/div><\/div><\/div><\/div><div class="_55wo _gui" id="unit_id_688092377989036"><div><div class="_3w7e"><div class="_55wo _56bf _5rgl" data-ft="{"top_level_post_id":"1396169803805852","tl_objid":"1396169803805852","page_id":"126721934083985"}" id="u_0_0"><div><h3 class="_52ja _52jh"><span><strong><a href="\/tjournal\/?__tn__=C">TJournal<\/a><\/strong><\/span><\/h3><div class="_5rgn" style=""><span><p>iOS 11, выходящая осенью, не поддерживает 32-битные приложения. Это значит, что пользователи потеряют доступ примерно к 200 тысячам программ. <\/p><p> Включая оригинальную Flappy Bird.<\/p><\/span><\/div><div class="_5rgo _27x0"><a class="_5tip _5tfq" href="https:\/\/lm.facebook.com\/l.php?u=https%3A%2F%2Fdtf.ru%2F7120-kak-uznat-kakie-prilozheniya-perestanut-rabotat-v-ios-11&h=ATPDoxCwMQLom_0MvLZd0sk90kReFLK9MtPs-flcXUdNA2IWlIuKBfnHqSQbNpveb8dW6gC32vxQCRtmUtBZQuy06LVxmW-gjvdi6C971XUvVhuyRCWlOdDWvgdj&enc=AZOcj4YClTNdw0ZHVMWhll-L-jVGpHnNtuLk9Su1t1GT4KnnMguG1EukLu57simsZJkFgnqQssiyu-wNtVVbXPYCIcVuYCfOcxWyWZs2j3gtpTJctaVpbpUlV4Na-zOn2PqlP8IqxHlC4uTH0jxH2qszc5trX-cCA-Mu5kuT2BDqeQ&s=1" id="u_0_1" target="_blank"><div class="_59e9 _55wr _5til"><table class="_4g33 _52wc"><tbody><tr><td class="_5s61 _3lsd"><img src="https:\/\/external.xx.fbcdn.net\/safe_image.php?d=AQAgAf90LRsVJMNs&w=149&h=149&url=https%3A%2F%2Fdtf.ru%2Fcover%2F7120%2Ffb%2Fcover.jpg&cfs=1&jq=75&ext=jpg&_nc_hash=AQAREWDHqfZjs47i" width="112" height="112" class="img" \/><\/td><td class="_4g34"><h1 class="_52jd _52jh _ing">Как узнать, какие приложения перестанут работать в iOS 11<\/h1><div class="_52jc _2b-u">dtf.ru<div class="_52jc"><\/div><\/div><\/td><\/tr><\/tbody><\/table><\/div><\/a><\/div><\/div><div class="_5sq4"><div class="mfss fcg"><abbr>1 ч<\/abbr><span> · <\/span><span class="_26zb"><span><span class="mfss">Доступно всем<\/span><\/span><\/span><\/div><div class="mfss fcg"><a href="\/story.php?story_fbid=1396169803805852&id=126721934083985#footer_action_list">Новость 
целиком<\/a><\/div><\/div><\/div><\/div><\/div><\/div><div class="_55wo _gui" id="unit_id_688092377989036"><div><div class="_3w7e"><div class="_55wo _56bf _5rgl" data-ft="{"top_level_post_id":"1396120047144161","tl_objid":"1396120047144161","page_id":"126721934083985"}" id="u_0_2"><div><h3 class="_52ja _52jh"><span><strong><a href="\/tjournal\/?__tn__=C">TJournal<\/a><\/strong><\/span><\/h3><div class="_5rgn" style=""><span><p>«Когда я начал писать песни, я использовал единственный лексикон, которым владел, — народный говор».<\/p><p> Боб Дилан спустя полгода молчания всё же записал свою нобелевскую лекцию и прислал её по почте. В ней он рассказал о трёх книгах, изменивших его жизнь, и песнях, вдохновивших его на творчество.<\/p><\/span><\/div><div class="_5rgo _27x0"><a class="_5tip _5tfq" href="https:\/\/lm.facebook.com\/l.php?u=https%3A%2F%2Ftjournal.ru%2F45066-nashi-pesni-zhivi-na-zemle-zhivih&h=ATPEMvJ7GQTe04oluO0JF3NLS22d-lRvx6KKD66V_nqi9fsI1IXKolS42cYwhhi84uwDJNdjx2guKZGZLG1tDdU2Z8hdcEJCAPOhvNmzbYCQVFb1jYIQSDbGzDxe&enc=AZNVVbLvwHAh7vlAmvmJ1MPucduzaj6RF3CqoH-RDxZsiY929_44hm3bFI9T3qviVsZH4m2uipObgAZC7o8Mi2TuSW3sCSJLDDM87FFtr_ykoSiHSVoe6dJ6Ofp_Ccdp9ZC1awnV8G5hk2y7mslIMKmDu5V6HVbYuJil4PPv3nkqqw&s=1" id="u_0_3" target="_blank"><div class="_59e9 _55wr _5til"><table class="_4g33 _52wc"><tbody><tr><td class="_5s61 _3lsd"><img src="https:\/\/external.xx.fbcdn.net\/safe_image.php?d=AQAfW-DGqBho4DTZ&w=149&h=149&url=https%3A%2F%2Fgif.cmtt.space%2F3%2Fclub%2F04%2F44%2Fea%2F9a32486b17242f.jpg&cfs=1&jq=75&sx=186&sy=0&sw=847&sh=847&ext=jpg&_nc_hash=AQDaECn9dkAl-dAu" width="112" height="112" class="img" \/><\/td><td class="_4g34"><h1 class="_52jd _52jh _ing">«Наши песни живы на земле живых»<\/h1><div class="_52jc _2b-u">tjournal.ru<div class="_52jc"><\/div><\/div><\/td><\/tr><\/tbody><\/table><\/div><\/a><\/div><\/div><div class="_5sq4"><div class="mfss fcg"><abbr>2 ч<\/abbr><span> · <\/span><span class="_26zb"><span><span class="mfss">Доступно 
всем<\/span><\/span><\/span><\/div><div class="mfss fcg"><a href="\/story.php?story_fbid=1396120047144161&id=126721934083985#footer_action_list">Новость целиком<\/a><\/div><\/div><\/div><\/div><\/div><\/div><div class="apm" id="m_pages_reaction_see_more_unit"><a href="\/pages_reaction_units\/more\/?page_id=126721934083985&cursor=%7B%22timeline_cursor%22%3A%22timeline_unit%3A1%3A00000000001496760847%3A04611686018427387904%3A09223372036854775805%3A04611686018427387904%22%2C%22timeline_section_cursor%22%3A%7B%7D%2C%22has_next_page%22%3Atrue%7D&surface=mobile_page_home&unit_count=3"><span>Еще<\/span><\/a><\/div><\/div>","replaceifexists":false,"allownull":false}],"css_hashes":["3HuCe","B4neB","UZuNp","9WMeb","xf2NL","9cXdw","JlaHi","QOT77","i6YK6","dkD1x","hInrP","PZ1UP","RnoEd","ol6iG","gfKii","1PJ9n","9U1Wz","IF96k"],"inline_css":"._4qba{font-style:normal}._4qbb,._4qbc,._4qbd{background:none;font-style:normal;padding:0;width:auto}._4qbd{border-bottom:1px solid #f99}._4qbb,._4qbc{border-bottom:1px solid #999}._4qbb:hover,._4qbc:hover,._4qbd:hover{background-color:#fcc;border-top:1px solid #ccc;cursor:help}
._3-8h{margin:4px}._3-8i{margin:8px}._3-8j{margin:12px}._3-8k{margin:16px}._3-8l{margin:20px}._3-8m{margin-bottom:4px;margin-top:4px}._3-8n{margin-bottom:8px;margin-top:8px}._3-8o{margin-bottom:12px;margin-top:12px}._3-8p{margin-bottom:16px;margin-top:16px}._3-8q{margin-bottom:20px;margin-top:20px}._3-8r{margin-left:4px;margin-right:4px}._3-8s{margin-left:8px;margin-right:8px}._3-8t{margin-left:12px;margin-right:12px}._3-8u{margin-left:16px;margin-right:16px}._3-8v{margin-left:20px;margin-right:20px}._3-8w{margin-top:4px}._3-8x{margin-top:8px}._3-8y{margin-top:12px}._3-8z{margin-top:16px}._3-8-{margin-top:20px}._3-8_{margin-right:4px}._3-90{margin-right:8px}._3-91{margin-right:12px}._3-92{margin-right:16px}._3-93{margin-right:20px}._3-94{margin-bottom:4px}._3-95{margin-bottom:8px}._3-96{margin-bottom:12px}._3-97{margin-bottom:16px}._3-98{margin-bottom:20px}._3-99{margin-left:4px}._3-9a{margin-left:8px}._3-9b{margin-left:12px}._3-9c{margin-left:16px}._3-9d{margin-left:20px}
._gui{border-bottom:1px solid #bec2c9;border-top:1px solid #ccc;display:block;margin:0 0 8px}._3g-o{display:block;margin:8px 0}html ._5tp0{display:block;margin:auto;max-height:32px;max-width:32px}.nontouch td._1r7u{padding-left:4px}._17lh{color:#4b4f56;display:block;font-size:12px;line-height:16px;max-height:16px;overflow:hidden;text-overflow:ellipsis;white-space:nowrap}._17li{display:inline-block;font-size:12px;line-height:16px;vertical-align:middle}._3w7e{position:relative}._3w7f{position:absolute;right:28px;top:0}._j5t{padding:6px 0}._j5t textarea{resize:none}._4bdb{width:320px}._4bdc{padding:4px 0}
._59e9{background:#f6f7f9}._55wm{background:#dddfe2}._5s6y{background:#000}._55wn{background:#4080ff}._55wo{background:#fff}._5oxw{background:#e9ebee}._50zt{background:#e9eaed}._25sz{background:#ff7f50}
._5rgl{margin:0 6px 6px;padding:6px}._gdx{margin:0 0 6px}.nontouch ._5rgl ._5rgl{border-color:#e9ebee;margin:6px 0 0}._5rgm{color:gray;margin-bottom:5px}._32a6{border-bottom:1px solid #e9ebee;padding-bottom:5px}._26zb{display:inline-block}._5sq4{margin-top:5px}._289a{display:block;margin:8px auto}._5rgn{margin:5px 0}._5rgl a,._5rgl a:visited{color:#2b55ad}._5rgl a:hover,._5rgl a:focus{background:#2b55ad;color:#fff}._4f23{border-bottom:1px solid #e9ebee;display:block;margin:0 -6px 5px -6px;padding:3px}._5nxi{font-size:small}._3ckq._2v9s{background-color:#e9ebee}._5d1 ._5rgl{font-size:small;padding:4px}._5d1 ._5rgm{margin-bottom:6px}._5d1 ._5sq4{margin-top:6px}._5d1 ._5rgn,._5d1 ._5rgo{margin:6px 0}
._56bf{border-color:#ccced3 #c4c6ca #b4b6ba;border-style:solid;border-width:1px}
._5rgr{margin:0 10px 15px;position:relative;user-select:none;word-wrap:break-word}._5gh8._5rgr{margin:0 0 8px}._3f50>._5rgr{margin:0}.embedded>._5rgr{margin:10px}.messageAttachments>._5rgr{margin:0}._5rgr ._5rgr{border-top:1px solid rgba(0, 0, 0, .07);margin:0}._5rgr ._5rgr._5ysf{border-top:0}._3f_b ._5rgu{position:relative;z-index:12}._5rgr ._4hkg{margin:0 10px 10px}._37w7 ._5rgr,._5rgr ._5s1m{margin:0}._37w7 ._5rgr::before{border-style:none}._5rgs._5rgs a{color:#1d2129}._5rgs._5jk3{color:#1d2129;font-weight:normal}._5rgs{border-bottom:1px solid rgba(0, 0, 0, .1);color:#999;font-size:13px;font-weight:bold;line-height:17px;padding:8px 10px}._5x45._5x45 a{color:#4b4f56;font-weight:normal}._5rgt,._5rgu{padding:0 10px}._5pes{margin-bottom:-10px}._u1r ._27x0{margin-bottom:-10px}._5rgt,._5t8z{margin:8px 0}._5rgt p,._5t8z p{margin-bottom:6px}._5rgt p:last-child,._5t8z p:last-child{margin-bottom:0}._5t8z{display:block}._5pes ._5t8z{margin-top:0}._5rgt a{color:#576b95;font-weight:normal}._37w7 ._5rgt{height:68px}._37w7 ._x0v ._5rgt{height:100%}._37w7 ._x0v ._5rgu{padding:0}._5rgt{position:relative}._4vbp{color:#7f7f7f;font-family:'HelveticaNeue-Medium', 'Helvetica Neue Medium', 'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:14px;margin:15px 0 9px 0}._5msi a,._5msi button,._5msi div[data-sigil*="more"]{position:relative;z-index:1}._5msi a._5msj{bottom:0;left:0;position:absolute;right:0;top:0;z-index:0}._4hkg ._4hkg ._4hkg .widePic{margin-left:-35px;margin-right:-35px}._4hkg ._4hkg .widePic{margin-left:-25px;margin-right:-25px}._4hkg .widePic{margin-left:-15px;margin-right:-15px}._5rgr .centeredPic{text-align:center}._5rgr ._1p6-{background:#f6f7f9;border-top:1px #dadde1 solid;color:#5e5e5e;font-size:12px;font-weight:bold;line-height:36px;text-align:center}._5rgr ._1p6-::before{background-image:url(https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yM\/r\/ujDtH-5KmUB.png);background-repeat:no-repeat;background-size:32px 54px;background-position:0 
-32px;content:'';display:inline-block;height:20px;margin:0 4px 4px 0;vertical-align:middle;width:20px}._4vbo{background-color:#000;border-bottom:1px solid rgba(255, 255, 255, .17);margin:0 0 26px 0}._4vbo._5cyz ._54m,._4vbo._5cyz ._22rc{opacity:.25}._1sme{margin-left:0;margin-right:0}._4vbo ._5rgs{color:#7f7f7f}._4vbo ._yff{opacity:.7}._21nd{padding:0 0 10px}._21nd ._5t8z{margin:8px 8px 0 8px}._kw0{margin-left:-10px;margin-right:-10px}._3zhw{background-color:#edf2fa;border-bottom:solid 1px #dddfe2;border-top:solid 1px #dddfe2;padding:8px 10px}._3zhx{color:#5e5e5e;font-size:12px;font-weight:bold}._3zhy{color:#5e5e5e;font-size:12px}._3zi1{margin-left:8px}._5rgr ._w54{border-top:1px solid #dddfe2;padding:10px 10px 2px 10px}._5rgr ._26ti{border:none;float:left;height:0;overflow:hidden;padding:0}._5rgr ._w55{color:#90949c;padding-right:10px}._5rgr._5-8v ._5rgt{font-size:16px;line-height:20px}._5rgr._5-8w ._5rgt{font-size:17px;line-height:21px}
._5til{border:1px solid #ccc;margin:5px 0;word-wrap:break-word}._1s56{margin-bottom:5px}._5tim ._5til,._5tim ._1s56{margin-bottom:10px}._5til._1s58,._5tim ._5til._1s58{margin-bottom:0}.nontouch ._5til ._3lsd{padding-right:4px}._5tip{display:block}.nontouch ._5tip,.nontouch ._5tip:visited{color:#3e4350}.nontouch ._5tip:focus,.nontouch ._5tip:hover,.nontouch ._5tip:focus ._5til,.nontouch ._5tip:hover ._5til{background:#3b5998;color:#fff}._1prt ._5tip{visibility:hidden}._5nxi ._ing{font-size:small}
.nontouch a,.nontouch a:visited{color:#3b5998;text-decoration:none}.nontouch .sub,.nontouch .sub:visited{color:gray}.nontouch .sec,.nontouch .sec:visited{color:#6d84b4}.nontouch .inv,.nontouch .inv:visited{color:#fff}
.nontouch a:focus,.nontouch a:hover,.nontouch .sub:focus,.nontouch .sub:hover,.nontouch .sec:focus,.nontouch .sec:hover{background-color:#3b5998;color:#fff}.nontouch .inv:focus,.nontouch .inv:hover,.nontouch .inv:hover .fcy,.nontouch .inv:focus .fcy{background-color:#fff;color:#3b5998}.operaMini.nontouch a:focus,.operaMini.nontouch .sub:focus,.operaMini.nontouch .sec:focus,.operaMini.nontouch .inv:focus{background-color:inherit}.operaMini.nontouch a:focus{color:#365899}.operaMini.nontouch .sub:focus{color:gray}.operaMini.nontouch .sec:focus{color:#4267b2}.operaMini.nontouch .inv:focus{color:#fff}
.nontouch ._55wp{padding:0}.nontouch ._55wq{padding:2px}.nontouch ._55wr{padding:4px}.nontouch ._55ws{padding:6px}.nontouch ._56hq{padding:8px}
.nontouch ._4g33{border:0;border-collapse:collapse;margin:0;padding:0;width:100%}.nontouch ._4g33._4zc6{width:auto}.nontouch ._4g33 tbody,.nontouch ._52wc>tr>td,.nontouch ._52wc>tbody>tr>td,.nontouch ._4g33 td._52wc,.nontouch ._52wf>tr>td,.nontouch ._52wf>tbody>tr>td,.nontouch ._4g33 td._52wf{vertical-align:top}.nontouch ._52wd>tr>td,.nontouch ._52wd>tbody>tr>td,.nontouch ._4g33 td._52wd{vertical-align:bottom}.nontouch ._52we>tr>td,.nontouch ._52we>tbody>tr>td,.nontouch ._4g33 td._52we{vertical-align:middle}.nontouch ._4g33 td{padding:0}.nontouch ._4g33 td._55wq{padding:2px}.nontouch ._4g33 td._55wr{padding:4px}.nontouch ._4g33 td._55ws{padding:6px}.nontouch ._4g33 td._56hq{padding:8px}.nontouch ._4g34{width:100%}
.img{border:0;display:inline-block;vertical-align:top}i.img u{position:absolute;width:0;height:0;overflow:hidden}
._52j9{color:#90949c}._52ja{color:#4b4f56}._52jb{color:#1d2129}.touched ._592p ._52j9,.touched ._592p._52j9,.touched._592p ._52j9,.touched._592p._52j9,.touched ._592p ._52ja,.touched ._592p._52ja,.touched._592p ._52ja,.touched._592p._52ja,.touched ._592p._52jb,.touched._592p ._52jb,.touched ._592p ._52jb,.touched._592p._52jb,.touched ._592p,.touched._592p{color:#fff}._56bq{font-size:11px;line-height:16px;text-transform:uppercase}._52jc{font-size:12px;line-height:16px}._52jd{font-size:14px;line-height:20px}._52je{font-size:16px;line-height:20px}._52jf{font-size:18px;line-height:24px}._52jg{font-weight:normal}._52jh{font-weight:bold}._52ji{text-align:left}._52jj{text-align:center}._52jk{text-align:right}
.fcb{color:#000}.fcg{color:gray}.fcw{color:#fff}.fcl{color:#3b5998}.fcs{color:#6d84b4}
.mfsxs{font-size:x-small}.mfss{font-size:small}body,tr,input,textarea,.mfsm{font-size:medium}.mfsl{font-size:large}
.acw{background-color:#fff}.acbk{background-color:#000}.acb{background-color:#3b5998}.aclb{background-color:#eceff5}.acdb{background-color:#31394a}.acg{background-color:#f2f2f2}.acy{background-color:#fffbe2;color:#7f7212}.acr{background-color:#ffebe8;color:#6d220d}
.aps{padding:2px 3px}.apm{padding:4px 3px}.apl{padding:6px 3px}"}
</p>
</body>
</html>
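To see what the formatter does without hitting the network, here is a minimal sketch for Python 3 only. The sample string is made up for illustration and is not taken from the real response; the function body is the same decoding step used above.

```python
def formatter(string):
    # Same decoding step as the formatter above: interpret literal
    # backslash escapes such as \u041c as the characters they name.
    # Note: bytes(string, 'ascii') raises UnicodeEncodeError if the
    # input already contains non-ASCII characters.
    return bytes(string, 'ascii').decode('unicode-escape')


# Hypothetical sample text (an assumption, not real response data).
sample = r'\u041c\u0435\u0442\u0440\u043e Wi-Fi'
decoded = formatter(sample)
print(decoded)
```

Text that contains no escape sequences passes through unchanged, so the formatter is safe to apply to every string that prettify hands it, as long as the input is pure ASCII.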
In Python 3 I still get encoded symbols: \u00257B, \u002522, etc.
@KonstantinRusanov: that is strange, I tested this on **Python 2.7** and **Python 3.6** with 'beautifulsoup4==4.5.3'
Are you using this line in the function: formatted = bytes(string, 'ascii').decode('unicode-escape')?