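Python: replace URLs with their titles
I wrote this code to replace URLs with the titles of the pages they point to. It does replace each URL with a title as intended, but the title ends up on the next line.
twfile.txt contains these lines: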
link1 http://t.co/HvKkwR1c
no link line
Output in tw2file.txt:
link1
Instagram
no link line
But I want the output in this form:
link1 Instagram
no link line
What should I do?
My code:
from bs4 import BeautifulSoup
import urllib

output = open('tw2file.txt','w')
with open('twfile.txt','r') as inputf:
    for line in inputf:
        try:
            list1 = line.split(' ')
            for i in range(len(list1)):
                if "http" in list1[i]:
                    ##print list1[i]
                    response = urllib.urlopen(list1[i])
                    html = response.read()
                    soup = BeautifulSoup(html)
                    list1[i] = soup.html.head.title
                    ##print list1[i]
                    list1[i] = ''.join(ch for ch in list1[i])
                else:
                    list1[i] = ''.join(ch for ch in list1[i])
            line = ' '.join(list1)
            print line
            output.write(line)
        except:
            pass
inputf.close()
output.close()
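It doesn't affect the output –
Why are you printing twice, with both print line and output.write(line)? – Gio
'print' seems to be for the console. The other one seems to be for the file. – emeth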
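A likely cause, judging from the output shown: the text inside the page's title tag probably starts (and ends) with a newline, so joining it back into the line pushes the title onto its own row. In addition, because the line is split on ' ' only, the last token still carries the line's own "\n", and that newline is lost once the token is replaced by the title. Below is a minimal sketch of one way around both issues, assuming Python 3 and bs4. The names fetch_title, replace_urls and convert are made up for illustration, and unlike the question's code it keeps the original URL when a fetch fails instead of dropping the whole line.

```python
import urllib.request


def fetch_title(url):
    """Download a page and return its <title> text with whitespace collapsed."""
    # Imported lazily so replace_urls() can be tried with a stub, without bs4.
    from bs4 import BeautifulSoup

    with urllib.request.urlopen(url) as response:
        soup = BeautifulSoup(response.read(), "html.parser")
    # Fall back to the URL itself if the page has no <title> tag.
    title = soup.title.get_text() if soup.title else url
    # split() + join drops leading/trailing newlines and collapses inner runs.
    return " ".join(title.split())


def replace_urls(line, get_title=fetch_title):
    """Return `line` with each http(s) token replaced by its page title."""
    words = []
    # split() with no argument also discards the trailing "\n" of the line.
    for word in line.split():
        if word.startswith(("http://", "https://")):
            try:
                word = get_title(word)
            except Exception:
                pass  # on any fetch/parse failure, keep the original URL
        words.append(word)
    return " ".join(words)


def convert(src="twfile.txt", dst="tw2file.txt"):
    """Rewrite every line of `src` into `dst`, one output line per input line."""
    with open(src) as inputf, open(dst, "w") as output:
        for line in inputf:
            # The newline is added back explicitly, exactly once per line.
            output.write(replace_urls(line) + "\n")
```

Calling convert() would then process twfile.txt into tw2file.txt. Passing a stub as get_title makes it possible to check the line handling without touching the network.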