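pickle.dump not dumping when appending to a file

A user may pass a bunch of URLs as command-line arguments. All URLs given in the past are serialized with pickle. The script checks all the given URLs, and if they are unique, they are serialized and appended to a file. At least that is what should happen. Nothing is being appended. However, when I open the file in write mode, the new unique URLs are written. So what gives? The code is: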
def get_new_urls():
    if(len(urls.URLs) != 0): # check if empty
        with open(urlFile, 'rb') as f:
            try:
                cereal = pickle.load(f)
                print(cereal)
                toDump = []
                for arg in urls.URLs:
                    if (arg in cereal):
                        print("Duplicate URL {0} given, ignoring it.".format(arg))
                    else:
                        toDump.append(arg)
            except Exception as e:
                print("Holy bleep something went wrong: {0}".format(e))
        return(toDump)

urlsToDump = get_new_urls()
print(urlsToDump)
# TODO: append new URLs
if(urlsToDump):
    with open(urlFile, 'ab') as f:
        pickle.dump(urlsToDump, f)

# TODO check HTML of each page against the serialized copy
with open(urlFile, 'rb') as f:
    try:
        cereal = pickle.load(f)
        print(cereal)
    except EOFError: # your URL file is empty, bruh
        pass
While originality is nice, please remember that this is a kid-friendly site ;-( –
"Ain't dumpin' nothing" is just **wrong** – mentalita
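
One thing worth checking here: each call to `pickle.load` reads exactly one pickled object from the file. If the file is opened in `'ab'` mode and a second list is dumped, that list lands after the first one, so a single `pickle.load` will keep returning only the first list and the append can look like it never happened. The sketch below illustrates this under the assumption that the file holds one pickled list of URLs per run. The helper names `load_all` and `append_new_urls`, the `demo_path` file, and the example URLs are hypothetical and not part of the original script.

```python
import os
import pickle
import tempfile


def load_all(path):
    """Return every object pickled into `path`, in the order written."""
    objects = []
    with open(path, 'rb') as f:
        while True:
            try:
                objects.append(pickle.load(f))  # one object per call
            except EOFError:  # no more pickled objects in the file
                break
    return objects


def append_new_urls(path, candidates):
    """Append URLs not already stored; return the list that was appended."""
    seen = set()
    if os.path.exists(path):
        for batch in load_all(path):  # each batch is one pickled list
            seen.update(batch)
    new = []
    for url in candidates:
        if url not in seen:
            seen.add(url)
            new.append(url)
    if new:
        with open(path, 'ab') as f:  # append a new pickled list
            pickle.dump(new, f)
    return new


# Demo with a throwaway file (hypothetical path and URLs)
demo_path = os.path.join(tempfile.mkdtemp(), 'urls.pkl')
first = append_new_urls(demo_path, ['http://a.example', 'http://b.example'])
second = append_new_urls(demo_path, ['http://b.example', 'http://c.example'])

with open(demo_path, 'rb') as f:
    first_only = pickle.load(f)  # a single load sees only the first batch

everything = load_all(demo_path)  # looping until EOFError sees every batch
```

An alternative design, if the file is small, is to load the existing list, extend it in memory, and rewrite the whole file in `'wb'` mode, so that a single `pickle.load` always returns the complete list.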