If you are interested in post-processing the data, you will probably want to work with it as a numpy array.
>>> import wave
>>> import numpy as np
>>> f = wave.open('911.wav', 'r')
>>> data = f.readframes(f.getnframes())
>>> data[:10] # just to show it is a string of bytes
'"5AMj\x88\x97\xa6\xc0\xc9'
>>> numeric_data = np.frombuffer(data, dtype=np.uint8) # np.fromstring is deprecated for binary data
>>> numeric_data
array([ 34, 53, 65, ..., 128, 128, 128], dtype=uint8)
>>> 10e-3*f.getframerate() # how many frames per 10ms?
110.25
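That is not an integer, so unless you want to interpolate your data, you need to pad it with zeros to get nice 110-frame segments (about 10 ms at this frame rate).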
>>> numeric_data.shape, f.getnframes() # there are just as many samples in the numpy array as there were frames
((186816,), 186816)
>>> padding_length = -numeric_data.shape[0] % 110 # zeros needed to reach a multiple of 110 (0 if already aligned)
>>> padded = np.hstack((numeric_data, np.zeros(padding_length)))
>>> segments = padded.reshape(-1, 110)
>>> segments
array([[  34.,   53.,   65., ...,  216.,  222.,  228.],
       [ 230.,  227.,  224., ...,   72.,   61.,   45.],
       [  34.,   33.,   32., ...,  147.,  158.,  176.],
       ...,
       [ 128.,  128.,  128., ...,  128.,  128.,  128.],
       [ 127.,  128.,  128., ...,  128.,  129.,  129.],
       [ 129.,  129.,  128., ...,    0.,    0.,    0.]])
>>> segments.shape
(1699, 110)
So now each row of the segments array is about 10 ms long.
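
The padding and reshaping steps above could be wrapped in a small helper, roughly like the sketch below. The function name split_into_segments and its parameters (frame_rate, segment_ms, pad_value) are made up for illustration and are not part of wave or numpy. It assumes a 1-D array of mono samples, and it uses np.pad instead of np.hstack so the original dtype is kept rather than being promoted to float.

```python
import numpy as np

def split_into_segments(samples, frame_rate, segment_ms=10.0, pad_value=0):
    """Split a 1-D sample array into rows of roughly segment_ms each.

    Hypothetical helper: assumes mono data, one sample per frame.
    """
    # Whole number of frames per segment (110 for 11025 Hz and 10 ms).
    frames_per_segment = int(frame_rate * segment_ms / 1000.0)
    if frames_per_segment < 1:
        raise ValueError("segment is shorter than one frame")
    # Frames missing to reach the next multiple; 0 if already aligned.
    padding_length = -len(samples) % frames_per_segment
    # Pad only at the end; np.pad keeps the dtype of the input array.
    padded = np.pad(samples, (0, padding_length),
                    mode="constant", constant_values=pad_value)
    return padded.reshape(-1, frames_per_segment)
```

With the array from the session above, split_into_segments(numeric_data, f.getframerate()) should again give 1699 rows of 110 frames. For unsigned 8-bit samples like these, silence sits at 128 rather than 0, so passing pad_value=128 may be the better choice if the padded tail is meant to be silent.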