I can't get this script to process all the URLs in the list in a single run.
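Save your code in a function with one parameter, *args (or any name you like, just don't forget the *). In a function definition, the * collects all positional arguments into a tuple; used at the call site, it unpacks a list into separate arguments. The * has no official name, but some people (including me) like to call it the splat operator.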
import urllib2

def start_download(*args):
    for value in args:
        ## for debugging purposes
        ## print value
        response = urllib2.urlopen(value).read()
        ## put the rest of your code here

if __name__ == '__main__':
    links = ['http://guardsmanbob.com/media/playlist.php?char=' +
             chr(i) for i in range(97, 123)]
    ## unpack the list so that each URL is passed as its own argument
    start_download(*links)
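To make the difference between passing a list and unpacking it concrete, here is a minimal sketch in Python 3. The function `count_args` and the placeholder values are hypothetical and only there for illustration. Passing the list itself hands the function a single argument, which is the likely cause of the AttributeError reported in the comments, since `urlopen` would then receive a list rather than a URL string.

```python
def count_args(*args):
    """Return how many positional arguments were received."""
    return len(args)


links = ['a', 'b', 'c']  # placeholder values standing in for URLs

# Passing the list itself: the function receives ONE argument (the list).
print(count_args(links))   # prints 1

# Unpacking with * at the call site: each element becomes its own argument.
print(count_args(*links))  # prints 3
```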
Edit: Or you can simply loop over your list of links and download each one.
links = ['http://guardsmanbob.com/media/playlist.php?char=' +
         chr(i) for i in range(97, 123)]

for link in links:
    response = urllib2.urlopen(link).read()
    ## put the rest of your code here
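For readers on Python 3, the same loop could be sketched roughly as follows. This assumes `urllib.request` as the replacement for `urllib2`. The names `build_links`, `fetch_url` and `download_all` are hypothetical, and the `fetch` parameter is only there so the loop can be tried out without touching the network.

```python
import string
from urllib.request import urlopen

BASE_URL = 'http://guardsmanbob.com/media/playlist.php?char='


def build_links():
    """Return one playlist URL per lowercase letter, a to z."""
    return [BASE_URL + letter for letter in string.ascii_lowercase]


def fetch_url(url):
    """Download the raw bytes of a page (performs a real network request)."""
    with urlopen(url) as response:
        return response.read()


def download_all(links, fetch=fetch_url):
    """Fetch every link in turn and return a {url: body} dictionary."""
    pages = {}
    for link in links:
        pages[link] = fetch(link)
    return pages
```

Calling `download_all(build_links())` would request all 26 pages one after another; passing a stand-in such as `fetch=lambda url: b''` exercises the loop offline.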
Edit 2:
To get all the links and then save them to a file, here is the complete code with specific comments:
import urllib2
from bs4 import BeautifulSoup, SoupStrainer

links = ['http://guardsmanbob.com/media/playlist.php?char=' +
         chr(i) for i in range(97, 123)]

for link in links:
    response = urllib2.urlopen(link).read()
    ## gets all <a> tags
    soup = BeautifulSoup(response, parse_only=SoupStrainer('a'))
    ## unnecessary link texts to be removed
    not_included = ['News', 'FAQ', 'Stream', 'Chat', 'Media',
                    'League of Legends', 'Forum', 'Latest', 'Wallpapers',
                    'Links', 'Playlist', 'Sessions', 'BobRadio', 'All',
                    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
                    'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T',
                    'U', 'V', 'W', 'X', 'Y', 'Z', 'Misc', 'Play',
                    'Learn more about me', 'Chat info', 'Boblights',
                    'Music Playlist', 'Official Facebook',
                    'Latest Music Played', 'Muppets - Closing Theme',
                    'Billy Joel - The River Of Dreams',
                    'Manic Street Preachers - If You Tolerate This Your Children Will Be Next',
                    'The Bravery - An Honest Mistake',
                    'The Black Keys - Strange Times',
                    'View whole playlist', 'View latest sessions',
                    'Referral Link', 'Donate to BoB',
                    'Guardsman Bob', 'Website template',
                    'Arcsin']
    ## create a file named "test.txt"
    ## write to file and close afterwards
    with open("test.txt", 'w') as output:
        for hyperlink in soup:
            if hyperlink.text:
                if hyperlink.text not in not_included:
                    ## print hyperlink.text
                    output.write("%s\n" % hyperlink.text.encode('utf-8'))
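Here is the output that is saved in test.txt:
I suggest you change test.txt to a different file name (e.g. S Titles), because each time the loop moves on to the next link in your list it overwrites the output of the previous one.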
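One way to follow that suggestion is to write one file per letter, so that no iteration overwrites another. The sketch below is in Python 3 and rests on assumptions: the `<letter>_titles.txt` naming scheme, the helper names `output_name` and `save_titles`, and the sample titles are all made up for illustration. In the real script the titles would come from the filtered link texts.

```python
import os
import tempfile


def output_name(letter, directory='.'):
    """Build a per-letter file name such as 's_titles.txt' (hypothetical scheme)."""
    return os.path.join(directory, '%s_titles.txt' % letter)


def save_titles(titles_by_letter, directory='.'):
    """Write each letter's titles to its own file, one title per line."""
    written = []
    for letter, titles in sorted(titles_by_letter.items()):
        path = output_name(letter, directory)
        with open(path, 'w', encoding='utf-8') as output:
            for title in titles:
                output.write('%s\n' % title)
        written.append(path)
    return written


if __name__ == '__main__':
    # Made-up titles, only to show that nothing gets overwritten.
    sample = {'a': ['Song A1', 'Song A2'], 's': ['Song S1']}
    target = tempfile.mkdtemp()
    for path in save_titles(sample, target):
        print(path)
```

Opening a single file in append mode (`'a'`) would be an alternative if everything should end up in one file.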
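When I use the first method I get this: AttributeError: 'list' object has no attribute 'timeout'. And when I use the second method I only get the first song title from each URL. How can I fix this? – user1628593
OK, so you can use the second method to iterate over your list of links. Then you want to get all the song titles from each URL, am I correct? –
Yes, you're right. – user1628593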