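Extracting IP addresses that follow a specific delimiter
I want to extract only the IP addresses from a file, sort them numerically, and write the result to another file. The data looks like this: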
The Spammer (and all his/her info):
Username: user
User ID Number: 0
User Registration IP Address: 77.123.134.132
User IP Address for Selected Post: 177.43.168.35
User Email: [email protected]
Here is my code. It does not sort the IP addresses correctly (i.e. it lists 177.43.168.35 before 77.123.134.132):
import re
spammers = open('spammers.txt', "r")
ips = []
for text in spammers.readlines():
    text = text.rstrip()
    print text
    regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})$', text)
    if regex is not None and regex not in ips:
        ips.append(regex)
for ip in ips:
    OrganizedIPs = open("Organized IPs.txt", "a")
    addy = "".join(ip)
    if addy is not '':
        print "IP: %s" % (addy)
        OrganizedIPs.write(addy)
        OrganizedIPs.write("\n")
spammers.close()
OrganizedIPs.close()
organize = open("Organized IPs.txt", "r")
ips = organize.readlines()
ips = list(set(ips))
print ips
for i in range(len(ips)):
    ips[i] = ips[i].replace('\n', '')
print ips
ips.sort()
finish = open('organized IPs.txt', 'w')
finish.write('\n'.join(ips))
finish.close()
clean = open('spammers.txt', 'w')
clean.close()
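The ordering described above is most likely a consequence of `ips.sort()` comparing the addresses as plain strings, character by character, so that '1' sorts before '7'. A minimal illustration using the two addresses from the sample data (the variable names are only for this sketch):

```python
# Strings are compared lexicographically: '1' < '7', so "177..." comes first.
ips = ["77.123.134.132", "177.43.168.35"]
string_sorted = sorted(ips)
print(string_sorted)  # ['177.43.168.35', '77.123.134.132']
```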
I have tried using this IP sorter code, but it needs a string, whereas the regex returns a list.
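One way around the string-versus-list mismatch is to unpack each `re.findall` result as it is produced, so the collected values are plain strings. The sketch below assumes the same "address at end of line" pattern as the question; `extract_ips` and `IP_AT_END` are hypothetical names introduced here, and the sample lines stand in for the contents of spammers.txt.

```python
import re

# Same idea as the pattern in the question: four dot-separated
# groups of 1-3 digits at the end of the line.
IP_AT_END = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$')

def extract_ips(lines):
    """Return a flat list of unique IP strings, in order of first appearance."""
    found = []
    for line in lines:
        # findall returns a list (possibly empty) of matched strings,
        # so iterate over it instead of appending the list itself.
        for ip in IP_AT_END.findall(line.rstrip()):
            if ip not in found:
                found.append(ip)
    return found

sample = [
    "Username: user",
    "User ID Number: 0",
    "User Registration IP Address: 77.123.134.132",
    "User IP Address for Selected Post: 177.43.168.35",
]
flat_ips = extract_ips(sample)
print(flat_ips)  # ['77.123.134.132', '177.43.168.35']
```

Because every element of `flat_ips` is a string, it can be passed to code that expects strings without any `"".join(...)` step.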
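Maybe there is a clever way, but why not just split on '.', map int over the list you get, and sort the resulting lists of ints? – deinonychusaur
@deinonychusaur That is exactly what I was going to do! –
Don't use real IP addresses in your examples. –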
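The suggestion in the comments (split each address on the dots, convert the parts to integers, and sort on those) can be sketched as follows. This is one possible reading of that idea, not a definitive implementation; `ip_key` is a hypothetical helper name, and the addresses are taken from the ranges reserved for documentation in RFC 5737 rather than from the sample data.

```python
def ip_key(ip):
    """Turn '192.0.2.35' into (192, 0, 2, 35) so parts compare as numbers."""
    return tuple(int(part) for part in ip.split('.'))

# Placeholder addresses from the RFC 5737 documentation ranges.
ips = ["198.51.100.7", "192.0.2.35", "203.0.113.4", "192.0.2.4"]

# The original strings are kept; only the comparison uses the integer tuples.
ordered = sorted(ips, key=ip_key)
print(ordered)
# ['192.0.2.4', '192.0.2.35', '198.51.100.7', '203.0.113.4']
```

Sorting with a key function leaves the list as strings, so the result can still be written out with `'\n'.join(ordered)` as in the question. A plain string sort of the same list would put 192.0.2.35 before 192.0.2.4.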