How to correctly encode possibly Chinese-encoded text in Python?

I scraped the following link:
http://www.footballcornersta.com/en/league.php?select=all&league=%E8%8B%B1%E8%B6%85&year=2014&month=1&Submit=Submit
The page contains the following string, which holds all the league options available in the menu:
ls_main = [['E','ENG PR','英超'],['E','ENG FAC','英足總杯'],['E','ENG Champ','英冠'],['E','ENG D1','英甲'],['I','ITA D1','意甲'],['I','ITA D2','意乙'],['S','SPA D1','西甲'],['S','SPA D2','西乙'],['G','GER D1','德甲'],['G','GER D2','德乙'],['F','FRA D1','法甲'],['F','FRA D2','法乙'],['S','SCO PR','蘇超'],['R','RUS PR','俄超'],['T','TUR PR','土超'],['B','BRA D1','巴西甲'],['U','USA MLS','美職聯'],['A','ARG D1','阿根甲'],['J','JP D1','日職業'],['J','JP D2','日職乙'],['A','AUS D1','澳A聯'],['K','KOR D1','韓K聯'],['C','CHN PR','中超'],['E','EURO Cup','歐洲盃'],['I','Italy Supe','意超杯'],['K','KOR K3','K3聯'],['C','CHN D1','中甲'],['D','DEN D2-E','丹乙東'],['D','DEN D2-W','丹乙西'],['D','DEN D1','丹甲'],['D','DEN PR','丹超'],['U','UKR U21','烏克蘭U21'],['U','UD2','烏克甲'],['U','UKR D1','烏克超'],['U','Uzber D1','烏茲超'],['U','URU D1','烏拉甲'],['U','UZB D2','烏茲甲'],['I','ISR D2','以色列乙'],['I','ISR D1','以色列甲'],['I','ISR PR','以色列超'],['I','Iraq L','伊拉聯'],['I','Ira D1','伊朗甲'],['I','IRA P','伊朗聯'],['R','RUS D2C','俄乙中'],['R','RUS D2U','俄乙烏'],['R','RUS D2S','俄乙南'],['R','RUS D2W','俄乙西'],['R','RUS RL','俄後賽'],['R','RUS D1','俄甲'],['R','RUS PR','俄超'],['B','BUL D1','保甲'],['C','CRO D1','克甲'],['I','ICE PR','冰島超'],['G','GHA PL','加納超'],['H','Hun U19','匈U19'],['H','HUN D2E','匈乙東'],['H','HUN D2W','匈乙西'],['H','HUN D1','匈甲'],['N','NIR IFAC','北愛冠'],['N','NIRE PR','北愛超'],['S','SAfrica D1','南非甲'],['S','SAfrica NSLP','南非超'],['L','LUX D1','盧森甲'],['I','IDN PR','印尼超'],['I','IND D1','印度甲'],['G','GUAT D1','危地甲'],['E','ECU D1','厄甲'],['F','Friendly','友誼賽'],['K','KAZ D1','哈薩超'],['C','COL D2','哥倫乙'],['C','COL C','哥倫杯'],['C','COL D1','哥倫甲'],['C','COS D1','哥斯甲'],['T','TUR U23','土A2青'],['T','TUR D3L1','土丙1'],['T','TUR D3L2','土丙2'],['T','TUR D3L3','土丙3'],['T','TUR2BK','土乙白'],['T','TUR2BB','土乙紅'],['T','TUR D1','土甲'],['E','EGY PR','埃及超'],['S','Serbia D2','塞爾乙'],['S','Serbia 1','塞爾聯'],['C','CYP D2','塞浦乙'],['C','CYP D1','塞浦甲'],['M','MEX U20','墨西U20'],['M','Mex D2','墨西乙'],['M','MEX D1','墨西聯'],['A','AUT D3E','奧丙東'],['A','AUT D3C','奧丙中'],['A','AUT D3W','奧丙西'],['A','AUT D2','奧乙'],['A','AUT 
D1','奧甲'],['V','VEN D1','委超'],['W','WAL D2','威甲'],['W','WAL D2CA','威聯盟'],['W','WAL D1','威超'],['A','Ang D1','安哥甲'],['N','NIG P','尼日超'],['P','PAR D1','巴拉甲'],['B','BRA D2','巴西乙'],['B','BRA CP','巴錦賽'],['G','GRE D3N','希丙北'],['G','GRE D3S','希丙南'],['G','GRE D2','希乙'],['G','GRE D1','希甲'],['G','GER U17','德U17'],['G','GER U19','德U19'],['G','GER D3','德丙'],['G','GER RN','德北聯'],['G','GER RS','德南聯'],['G','GER RW','德西聯'],['I','ITA D3A','意丙A'],['I','ITA D3B','意丙B'],['I','ITA D3C1','意丙C1'],['I','ITA D3C2','意丙C2'],['I','ITA CP U20','意青U20'],['E','EST D3','愛沙丙'],['N','NOR D2-A','挪乙A'],['N','NOR D2-B','挪乙B'],['N','NOR D2-C','挪乙C'],['N','NOR D2-D','挪乙D'],['N','NORC','挪威杯'],['N','NOR D1','挪甲'],['N','NOR PR','挪超'],['C','CZE D3','捷丙'],['C','CZE MSFL','捷丙M'],['C','CZE D2','捷乙'],['C','CZE U19','捷克U19'],['C','CZE D1','捷克甲'],['M','Mol D2','摩爾乙'],['M','MOL D1','摩爾甲'],['M','MOR D2','摩洛哥乙'],['M','MOR D1','摩洛超'],['S','Slovakia D3E','斯丙東'],['S','Slovakia D3W','斯丙西'],['S','Slovakia D2','斯伐乙'],['S','Slovakia D1','斯伐甲'],['S','Slovenia D1','斯洛甲'],['S','SIN D1','新加聯'],['J','JL3','日丙聯'],['C','CHI D2','智乙'],['C','CHI D1','智甲'],['G','Geo','格魯甲'],['G','GEO PR','格魯超'],['U','UEFA CL','歐冠杯'],['U','UEFA SC','歐霸杯'],['B','BEL D3A','比丙A'],['B','BEL D3B','比丙B'],['B','BEL D2','比乙'],['B','BEL W1','比女甲'],['B','BEL C','比杯'],['B','BEL D1','比甲'],['S','SAU D2','沙地甲'],['S','SAU D1','沙地聯'],['F','FRA D4A','法丁A'],['F','FRA D4B','法丁B'],['F','FRA D4C','法丁C'],['F','FRA D4D','法丁D'],['F','FRA D3','法丙'],['F','FRA U19','法國U19'],['F','FRA C','法國杯'],['P','POL D2E','波乙東'],['P','POL D2W','波乙西'],['P','POL D2','波蘭乙'],['P','POL D1','波蘭甲'],['B','BOS D1','波斯甲'],['P','POL YL','波青聯'],['T','THA D1','泰甲'],['T','THA PL','泰超'],['H','HON D1','洪都甲'],['A','Aus BP','澳布超'],['E','EST D1','愛沙甲'],['I','IRE D1','愛甲'],['I','IRE PR','愛超'],['B','BOL D1','玻利甲'],['F','Friendly','球會賽'],['S','SWI D1','瑞士甲'],['S','SWI PR','瑞士超'],['S','SWE D2','瑞甲'],['S','SWE D1','瑞超'],['B','BLR D2','白俄甲'],['B','BLR D1','白俄超'],['P','Peru D1','祕魯甲'],['T','TUN D2','突尼乙'],['T','Tun 
D1','突尼甲'],['R','ROM D2G1','羅乙1'],['R','ROM D2G2','羅乙2'],['R','ROM D1','羅甲'],['L','LIBERT C','自由杯'],['F','FIN D2','芬甲'],['F','FIN D1','芬超'],['S','SCO D3','蘇丙'],['S','SUD PL','蘇丹超'],['S','SCO D2','蘇乙'],['S','SCO D1','蘇甲'],['S','SCO HL','蘇高聯'],['E','ENG D2','英乙'],['E','ENG RyPR','英依超'],['E','ENG UP','英北超'],['E','ENG SP','英南超'],['E','ENG Trophy','英挑杯'],['E','ENG Con','英非'],['E','ENG CN','英非北'],['H','HOL D2','荷乙'],['H','HOL Yl','荷青甲'],['S','SV D1','薩爾超'],['P','POR U19','葡U19'],['P','POR D1','葡甲'],['P','POR PR','葡超'],['S','SPA D3B1','西丙1'],['S','SPA D3B2','西丙2'],['S','SPA D3B3','西丙3'],['S','SPA D3B4','西丙4'],['S','SPA Futsal','西內足'],['S','SPA W1','西女超'],['B','BRA CC','裏州賽'],['A','Arg D2M1','阿乙M1'],['A','Arg D2M2','阿乙M2'],['A','Arg D2M3','阿乙M3'],['A','ALG D2','阿及乙'],['A','ALG D1','阿及甲'],['A','AZE D1','阿塞甲'],['A','ALB D1','阿巴超'],['A','ARG D2','阿根乙'],['U','UAE D2','阿聯乙'],['K','KOR NL','韓聯盟'],['F','FYRM D2','馬其乙'],['M','MacedoniaFyr','馬其甲'],['M','MAS D1','馬來超'],['M','MON D2','黑山乙'],['M','MON D1','黑山甲'],['F','FCWC','世冠杯'],['W','World Cup','世界盃'],['F','FIFAWYC','世青杯'],['C','CWPL','中女超'],['C','CFC','中足協盃'],['D','DEN C','丹麥杯'],['A','Asia CL','亞冠杯'],['A','AFC','亞洲盃'],['R','Rus Cup','俄羅斯杯'],['H','HUN C','匈杯'],['N','NIR C','北愛杯'],['T','TUR C','土杯'],['T','Tenno Hai','天皇杯'],['W','WWC','女世杯'],['I','ITA Cup','意杯'],['G','GER C','德國杯'],['J','JPN LC','日聯杯'],['S','SCO FAC','蘇足總杯'],['E','ENG JPT','英錦賽'],['E','ENG FAC','足總杯'],['C','CAF NC','非洲杯'],['K','K-LC','韓聯杯'],['H','HK D1','香港甲']];
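The link of the page I scraped contains the third element of an entry (the Chinese name), but when I copy the link it turns into the percent-encoded form shown above.
I am not sure about the encoding.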
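import re
html = 'source of page'  # placeholder for the scraped page source
# Grab the text from "ls_main = [[" up to the first semicolon.
matches = re.findall(r'ls_main = \[\[.*?;', html)[0]
# 'unknown encoding' is a placeholder: the page's encoding is what I don't know.
matches = matches.decode('unknown encoding').encode('utf-8')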
How do I put the original characters into the link string?
I am using Python 2.7.
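For illustration, here is a minimal sketch of how such a link could be built from the scraped list. It assumes the page source has already been decoded to text, and that the `league` parameter is simply the percent-encoded UTF-8 bytes of the Chinese name. The sample `html` value and the helper name `league_url` are hypothetical; only the URL layout comes from the link above.

```python
# -*- coding: utf-8 -*-
import re

try:
    from urllib.parse import quote   # Python 3
except ImportError:
    from urllib import quote         # Python 2.7

# Hypothetical stand-in for the page source, already decoded to text.
html = u"var ls_main = [['E','ENG PR','\u82f1\u8d85'],['I','ITA D1','\u610f\u7532']];"

# Each entry looks like ['letter','short code','Chinese name'].
entries = re.findall(r"\['([^']*)','([^']*)','([^']*)'\]", html)


def league_url(name, year=2014, month=1):
    """Hypothetical helper: percent-encode the UTF-8 bytes of the league name."""
    return ('http://www.footballcornersta.com/en/league.php'
            '?select=all&league=%s&year=%d&month=%d&Submit=Submit'
            % (quote(name.encode('utf-8')), year, month))


# A list of pairs rather than a dict, because ls_main repeats some short codes.
urls = [(code, league_url(name)) for _, code, name in entries]
print(urls[0][1])
```

For the first entry this should reproduce the link quoted at the top of the question, which suggests the Chinese name only needs to be UTF-8 encoded and percent-escaped.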
Which version of Python are you using? – ashwinjv 2014-09-19 01:02:56
The URL already contains UTF-8 encoded text. Or am I misunderstanding your question? – 2014-09-19 01:03:40
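The claim in the last comment can be checked with a short sketch: percent-decode the `league` value from the link and decode the resulting bytes as UTF-8. The variable names are illustrative only; the import shim is an assumption made so the same lines run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
try:
    from urllib.parse import unquote_to_bytes       # Python 3
except ImportError:
    from urllib import unquote as unquote_to_bytes  # Python 2.7: returns a byte string

encoded = '%E8%8B%B1%E8%B6%85'       # the league value taken from the link
raw = unquote_to_bytes(encoded)      # six raw bytes
name = raw.decode('utf-8')           # text: the two characters of the first ls_main entry

print(repr(raw))
print(name == u'\u82f1\u8d85')
```

If the decode succeeds and the comparison is true, the link is just the percent-encoded UTF-8 form of the Chinese name, so no other encoding needs to be guessed for the URL itself.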