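Huffman coding: how to write binary data in Python

I have already tried an approach using the struct module, as the commented-out lines in my code show, but it did not solve the problem. Basically I have two options: I can write the binary data code by code (my codes are bit sequences between 3 and 13 bits long), or I can convert the entire string (n = 25000+ in this example) to binary data. But I do not know how to implement either approach. Code: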
import heapq
import binascii
import struct

def createFrequencyTupleList(inputFile):
    frequencyDic = {}
    intputFile = open(inputFile, 'r')
    for line in intputFile:
        for char in line:
            if char in frequencyDic.keys():
                frequencyDic[char] += 1
            else:
                frequencyDic[char] = 1
    intputFile.close()
    tupleList = []
    for myKey in frequencyDic:
        tupleList.append((frequencyDic[myKey],myKey))
    return tupleList

def createHuffmanTree(frequencyList):
    heapq.heapify(frequencyList)
    n = len(frequencyList)
    for i in range(1,n):
        left = heapq.heappop(frequencyList)
        right = heapq.heappop(frequencyList)
        newNode = (left[0] + right[0], left, right)
        heapq.heappush(frequencyList, newNode)
    return frequencyList[0]

def printHuffmanTree(myTree, someCode,prefix=''):
    if len(myTree) == 2:
        someCode.append((myTree[1] + "@" + prefix))
    else:
        printHuffmanTree(myTree[1], someCode,prefix + '0')
        printHuffmanTree(myTree[2], someCode,prefix + '1')

def parseCode(char, myCode):
    for k in myCode:
        if char == k[0]:
            return k[2:]

if __name__ == '__main__':
    myList = createFrequencyTupleList('input')
    myHTree = createHuffmanTree(myList)
    myCode = []
    printHuffmanTree(myHTree, myCode)
    inputFile = open('input', 'r')
    outputFile = open('encoded_file2', "w+b")
    asciiString = ''
    n=0
    for line in inputFile:
        for char in line:
            #outputFile.write(parseCode(char, myCode))
            asciiString += parseCode(char, myCode)
            n += len(parseCode(char, myCode))
    #values = asciiString
    #print n
    #s = struct.Struct('25216s')
    #packed_data = s.pack(values)
    #print packed_data
    inputFile.close()
    #outputFile.write(packed_data)
    outputFile.close()
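For the second option (converting the whole bit string at once), one possible approach is sketched below. It is only an illustration under stated assumptions: `pack_bits` and `unpack_bits` are hypothetical helper names that do not appear in the code above, the input is assumed to contain only '0' and '1' characters (as `asciiString` does), and returning the padding count so a decoder can discard the filler bits is a choice made for this sketch, not something the question specifies.

```python
def pack_bits(bit_string):
    """Pack a string of '0'/'1' characters into bytes.

    The last byte is zero-padded on the right. Returns (data, padding),
    where padding is the number of filler bits that were appended.
    """
    padding = (8 - len(bit_string) % 8) % 8
    padded = bit_string + "0" * padding
    # Every group of 8 characters becomes one integer in the range 0..255.
    data = bytearray(
        int(padded[i:i + 8], 2) for i in range(0, len(padded), 8)
    )
    return bytes(data), padding


def unpack_bits(data, padding):
    """Inverse of pack_bits: rebuild the '0'/'1' string and drop the padding."""
    bits = "".join("{0:08b}".format(byte) for byte in bytearray(data))
    return bits[:len(bits) - padding]


# A 13-bit stand-in for asciiString (hypothetical sample value).
sample = "1011001110001"
data, padding = pack_bits(sample)
print(repr(data), padding)
print(unpack_bits(data, padding) == sample)
```

The resulting `data` could then be passed to `outputFile.write(...)` on a file opened in binary mode. The decoder would also need the padding count, for example stored as an extra byte at the start of the file.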
Good catch, thanks. – agf
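I think it is an improvement. In Python 3 you don't use the `b''` type to store text, you use it to store binary data, so it makes good sense. Why would you call `ord` on binary data? You wouldn't; you call it on a character. A single byte of binary data is most naturally represented as an integer, and that is what Python 3 does. – agf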
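A small illustration of the point about bytes and integers, assuming Python 3. The variable names and sample values are made up for this sketch.

```python
data = b"\xb3\x88"              # two bytes of binary data

first = data[0]                 # indexing a bytes object gives an int
as_ints = list(data)            # iterating gives ints as well

code_point = ord("A")           # ord() is meant for a one-character text string
one_byte = bytes([first])       # a bytes object can be built back from ints

print(first, as_ints, code_point, one_byte)
```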
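Do all versions of Python that support `'{0}'.format()` also support the nested version? I know `'{}'.format()` isn't accepted in all the versions where `'{0}'.format()` is. – agf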
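For reference, the three forms mentioned in the comment above look like this. The sketch was written for Python 3 and does not settle which older versions accept the nested form; the values 5 and 8 are arbitrary examples.

```python
explicit = "{0}".format("a")         # explicitly numbered field
automatic = "{}".format("a")         # auto-numbered field (Python 2.7 / 3.1 and later)
nested = "{0:0{1}b}".format(5, 8)    # the width comes from a nested field

print(explicit, automatic, nested)
```

The nested form is handy for turning an integer into a fixed-width bit string when the width is only known at run time.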