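Reading 4-byte integers from a binary file in Python

I have a set of binary files containing 4-byte integers (some of them may be large, around 100 MB).

Can anyone provide a code snippet showing how to extract each 4-byte integer until the end of the file is reached? I am using Python 2.7.

Thanks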

2014-03-06 bzo

You can use struct.unpack():

import struct

with open(filename, 'rb') as fileobj:
    for chunk in iter(lambda: fileobj.read(4), b''):
        integer_value = struct.unpack('<I', chunk)[0]

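This uses <I to interpret the bytes as a little-endian unsigned integer. Adjust the format as needed: > means big-endian, and i is a signed integer.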
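To make the effect of the format characters concrete, here is a small sketch that decodes the same four bytes three ways. The byte string `raw` is made up for illustration and does not come from the question.

```python
import struct

# Four example bytes (hypothetical data, chosen so the three results differ).
raw = b'\xfe\xff\xff\xff'

little_unsigned = struct.unpack('<I', raw)[0]  # little-endian, unsigned
big_unsigned = struct.unpack('>I', raw)[0]     # big-endian, unsigned
little_signed = struct.unpack('<i', raw)[0]    # little-endian, signed

print(little_unsigned, big_unsigned, little_signed)
```

Note that struct.unpack() raises struct.error if a chunk is shorter than 4 bytes, which can happen on the last read when the file size is not a multiple of 4.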

If you need to read a large number of integer values in one go and you know how many you need to read, take a look at the array module as well:

from array import array

arr = array('I')  # 'I' is normally 4 bytes; 'L' is 8 bytes on many 64-bit systems, so check arr.itemsize
with open(filename, 'rb') as fileobj:
    arr.fromfile(fileobj, number_of_integers_to_read)

You may need to use array.byteswap() if the file's byte order does not match your system's:

import sys

if sys.byteorder != 'little':  # the file is little-endian but this machine is not
    arr.byteswap()
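If the count is not known in advance, one option is to derive it from the file size. The following is only a sketch under stated assumptions: the helper name read_all_uint32_le is made up for illustration, the file is assumed to hold little-endian unsigned values, and the platform's 'I' typecode is assumed to be 4 bytes wide (the code checks this rather than trusting it).

```python
import os
import sys
from array import array


def read_all_uint32_le(path):
    """Read every little-endian unsigned 4-byte integer stored in the file at path."""
    arr = array('I')
    if arr.itemsize != 4:
        # The C 'unsigned int' type is 4 bytes on common platforms, but this is not guaranteed.
        raise RuntimeError("array('I') is not 4 bytes wide on this platform")
    # Whole items only; any trailing partial item (1 to 3 bytes) is ignored.
    count = os.path.getsize(path) // arr.itemsize
    with open(path, 'rb') as fileobj:
        arr.fromfile(fileobj, count)
    if sys.byteorder != 'little':
        arr.byteswap()
    return arr
```

Because the count is computed before reading, fromfile() never runs past the end of the file, so no EOFError handling is needed here.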

2014-03-06 15:47:41

+1 for 'array.fromfile'. You could put it inside a 'while True:' loop, inside 'try: .fromfile .. except EOFError: pass', to avoid having to know 'number_of_integers_to_read' beforehand. – jfs

You can avoid 'number_of_integers_to_read' by reading the byte stream from the file with 'data = fileobj.read()' and then calling 'arr.frombytes(data)': –

@AlexeyPolonsky: Or you can use 'arr.fromfile(fileobj, sys.maxsize)' and just catch the 'EOFError' exception. –

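Check out NumPy's fromfile function. You provide a simple type annotation describing the data to be read, and the function reads it efficiently into a NumPy ndarray object.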

import numpy as np 
np.fromfile(file_name, dtype='<i4')

You can also change the dtype to reflect the size and byte order. See here for some examples.
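As a rough illustration of how the dtype string controls size, signedness, and byte order, the sketch below writes a small scratch file and reads it back in a few ways. The file contents and variable names are invented for the example; only np.fromfile with its dtype and count parameters comes from NumPy itself.

```python
import os
import struct
import tempfile

import numpy as np

# Write three big-endian signed 4-byte integers to a temporary scratch file.
fd, path = tempfile.mkstemp()
os.write(fd, struct.pack('>3i', -1, 2, 300))
os.close(fd)

big_signed = np.fromfile(path, dtype='>i4')          # whole file, big-endian signed
first_two = np.fromfile(path, dtype='>i4', count=2)  # only the first two values
as_unsigned = np.fromfile(path, dtype='>u4')         # same bytes read as unsigned

os.remove(path)
```

In the dtype string, the first character gives the byte order (< or >), the letter gives the kind (i for signed, u for unsigned), and the digit gives the size in bytes.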

2014-03-06 16:00:42 ely
