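I am trying to implement the Lempel-Ziv-Welch algorithm in Python, but I am having trouble writing my output as a binary file. The compressor reads the input and writes the compressed data byte by byte: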
    # Python 2 code, as in the original attempt
    import os
    import sys

    action = sys.argv[3]
    if action == "compress":
        # initialize dictionary
        dictionary = {}
        for i in range(0, 256):
            # for single characters, the value is the same as the key
            # in the compressed file, these would appear as is
            dictionary[chr(i)] = i
        input_file = open(sys.argv[1], 'rb')
        output_file = open(sys.argv[2], 'wb')
        data = input_file.read()
        i = 0
        j = 1
        # current_data starts out as one byte
        current_data = data[i:j]
        # look for the shortest string not in the dictionary
        while i < len(data) - 2:
            while current_data in dictionary:
                if j < len(data) + 1:
                    j = j + 1
                    current_data = data[i:j]
                else:
                    break
            # once the shortest string is found, add it to the dictionary
            if current_data not in dictionary:
                dictionary[current_data] = len(dictionary)
                thing_to_write = dictionary[current_data[:-1]]
                i = j - 1
                current_data = data[i:j]
            else:
                thing_to_write = dictionary[current_data]
                i = i + 1
                j = i + 1
            # then write to the output file the found string minus its last
            # character (the longest string that is in the dictionary)
            mylist = []
            thing_to_write = format(thing_to_write, 'x')
            for char in thing_to_write:
                # each hex digit becomes two ASCII characters here, so the
                # output is text rather than packed binary
                mylist.append(char.encode('hex'))
            for elem in mylist:
                output_file.write(elem)
        input_file.close()
        output_file.close()
        print >> sys.stderr, "The size of " + sys.argv[1] + " is " + str(os.path.getsize(sys.argv[1])) + " bytes." + "\n" + "The size of " + sys.argv[2] + " is " + str(os.path.getsize(sys.argv[2])) + " bytes."
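For comparison, here is a minimal sketch of the same dictionary-building loop, written for Python 3 and operating on `bytes`. It is not the code from the question: the function name `lzw_compress` is hypothetical, and it assumes the codes are first collected as plain integers so that the choice of on-disk encoding can be made separately.

```python
def lzw_compress(data: bytes) -> list:
    """Return the LZW code sequence for data as a list of ints (sketch)."""
    # Codes 0..255 stand for the single byte values themselves.
    dictionary = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            # Keep extending the match while it is still a known string.
            current = candidate
        else:
            # Emit the longest known prefix, then register the new string.
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)
            current = bytes([value])
    if current:
        # Flush whatever is still matched at the end of the input.
        codes.append(dictionary[current])
    return codes
```

Walking the input one byte at a time avoids the separate `i`/`j` index bookkeeping and makes sure the final match is flushed at the end of the data.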
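I have tried writing the output in many different formats, such as hexadecimal and binary, but I think I am just writing them out as 8-bit characters. How can I write raw binary data to the file?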
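One possible way to get raw binary output is to pack each integer code into a fixed number of bytes with the standard `struct` module instead of formatting it as hex text. The sketch below is written for Python 3 and assumes every code fits in 16 bits, which means the dictionary would have to be capped at 65536 entries. That cap, and the helper names `pack_codes` and `unpack_codes`, are assumptions for illustration and are not part of the question.

```python
import struct


def pack_codes(codes):
    """Pack integer codes as 2-byte big-endian unsigned values (sketch)."""
    # ">H" = big-endian unsigned 16-bit; struct.error is raised above 65535.
    return b"".join(struct.pack(">H", code) for code in codes)


def unpack_codes(raw):
    """Inverse of pack_codes: turn the raw bytes back into a list of ints."""
    return [code for (code,) in struct.iter_unpack(">H", raw)]
```

The packed bytes can then be written with `open(path, "wb").write(pack_codes(codes))`. Fixed 16-bit codes are simple but wasteful for short inputs; variable-width codes need bit-level packing, which this sketch does not attempt.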
What do you mean by "I am having trouble"? Do you get an error message? If so, add the full message to the question. – furas
[How to create a Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) – wwii