Speeding up reading WAV files in Python

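I am working on a project that requires me to read multichannel WAV files as 32-bit floats. When I read one particular file (1 minute long, 6 channels, 48 kHz sampling rate) into Matlab and time it with tic/toc, the file is parsed in 2.456482 seconds.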

Matlab code for measuring the file read speed:

tic
wavread('C:/data/testData/6ch.wav');
toc

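When I do the same in Python (please note that I am very new to Python), it takes 18.1655315617 seconds! It seems to me that the way I am doing it is inefficient (I got it down from 28 to 18 seconds, but that is still far too much...).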

I stripped the code down to the part that matters for this question:

Python code for measuring the file read speed:

import wave32 
import struct 
import time 
import numpy as np 

def getWavData(inFile):
    wavFile = wave32.open(inFile, 'r')
    wavParams = wavFile.getparams()
    nChannels = wavParams[0]
    byteDepth = wavParams[1]
    nFrames = wavParams[3]
    wavData = np.empty([nFrames, nChannels], np.float32)
    frames = wavFile.readframes(nFrames)
    # Unpack one little-endian float per (frame, channel) pair.
    for i in range(nFrames):
        for j in range(nChannels):
            start = (i * nChannels + j) * byteDepth
            stop = start + byteDepth
            wavData[i][j] = struct.unpack('<f', frames[start:stop])[0]
    return wavData

inFile = 'C:/data/testData/6ch.wav' 
start = time.clock() 
data2 = getWavData(inFile) 
elapsed = time.clock() 
elapsedNew = elapsed - start 
print str(elapsedNew)

Please note that wave32 is a small hack I had to apply to wave.py in order to make it read 32-bit floats.

"""Stuff to parse WAVE files. 

Usage. 

Reading WAVE files: 
     f = wave.open(file, 'r') 
where file is either the name of a file or an open file pointer. 
The open file pointer must have methods read(), seek(), and close(). 
When the setpos() and rewind() methods are not used, the seek() 
method is not necessary. 

This returns an instance of a class with the following public methods: 
     getnchannels() -- returns number of audio channels (1 for 
         mono, 2 for stereo) 
     getsampwidth() -- returns sample width in bytes 
     getframerate() -- returns sampling frequency 
     getnframes() -- returns number of audio frames 
     getcomptype() -- returns compression type ('NONE' for linear samples) 
     getcompname() -- returns human-readable version of 
         compression type ('not compressed' linear samples) 
     getparams()  -- returns a tuple consisting of all of the 
         above in the above order 
     getmarkers() -- returns None (for compatibility with the 
         aifc module) 
     getmark(id)  -- raises an error since the mark does not 
         exist (for compatibility with the aifc module) 
     readframes(n) -- returns at most n frames of audio 
     rewind()  -- rewind to the beginning of the audio stream 
     setpos(pos)  -- seek to the specified position 
     tell()   -- return the current position 
     close()   -- close the instance (make it unusable) 
The position returned by tell() and the position given to setpos() 
are compatible and have nothing to do with the actual position in the 
file. 
The close() method is called automatically when the class instance 
is destroyed. 

Writing WAVE files: 
     f = wave.open(file, 'w') 
where file is either the name of a file or an open file pointer. 
The open file pointer must have methods write(), tell(), seek(), and 
close(). 

This returns an instance of a class with the following public methods: 
     setnchannels(n) -- set the number of channels 
     setsampwidth(n) -- set the sample width 
     setframerate(n) -- set the frame rate 
     setnframes(n) -- set the number of frames 
     setcomptype(type, name) 
         -- set the compression type and the 
         human-readable compression type 
     setparams(tuple) 
         -- set all parameters at once 
     tell()   -- return current position in output file 
     writeframesraw(data) 
         -- write audio frames without patching up the 
         file header 
     writeframes(data) 
         -- write audio frames and patch up the file header 
     close()   -- patch up the file header and close the 
         output file 
You should set the parameters before the first writeframesraw or 
writeframes. The total number of frames does not need to be set, 
but when it is set to the correct value, the header does not have to 
be patched up. 
It is best to first set all parameters, perhaps possibly the 
compression type, and then write audio frames using writeframesraw. 
When all frames have been written, either call writeframes('') or 
close() to patch up the sizes in the header. 
The close() method is called automatically when the class instance 
is destroyed. 
""" 

import __builtin__ 

__all__ = ["open", "openfp", "Error"] 

class Error(Exception): 
    pass 

WAVE_FORMAT_PCM = 0x0001 
WAVE_FORMAT_IEEE_FLOAT = 0x0003 

_array_fmts = None, 'b', 'h', None, 'l' 

# Determine endian-ness 
import struct 
if struct.pack("h", 1) == "\000\001": 
    big_endian = 1 
else: 
    big_endian = 0 

from chunk import Chunk 

class Wave_read: 
    """Variables used in this class: 

    These variables are available to the user though appropriate 
    methods of this class: 
    _file -- the open file with methods read(), close(), and seek() 
       set through the __init__() method 
    _nchannels -- the number of audio channels 
       available through the getnchannels() method 
    _nframes -- the number of audio frames 
       available through the getnframes() method 
    _sampwidth -- the number of bytes per audio sample 
       available through the getsampwidth() method 
    _framerate -- the sampling frequency 
       available through the getframerate() method 
    _comptype -- the AIFF-C compression type ('NONE' if AIFF) 
       available through the getcomptype() method 
    _compname -- the human-readable AIFF-C compression type 
       available through the getcomptype() method 
    _soundpos -- the position in the audio stream 
       available through the tell() method, set through the 
       setpos() method 

    These variables are used internally only: 
    _fmt_chunk_read -- 1 iff the FMT chunk has been read 
    _data_seek_needed -- 1 iff positioned correctly in audio 
       file for readframes() 
    _data_chunk -- instantiation of a chunk class for the DATA chunk 
    _framesize -- size of one frame in the file 
    """ 

    def initfp(self, file): 
     self._convert = None 
     self._soundpos = 0 
     self._file = Chunk(file, bigendian = 0) 
     if self._file.getname() != 'RIFF': 
      raise Error, 'file does not start with RIFF id' 
     if self._file.read(4) != 'WAVE': 
      raise Error, 'not a WAVE file' 
     self._fmt_chunk_read = 0 
     self._data_chunk = None 
     while 1: 
      self._data_seek_needed = 1 
      try: 
       chunk = Chunk(self._file, bigendian = 0) 
      except EOFError: 
       break 
      chunkname = chunk.getname() 
      if chunkname == 'fmt ': 
       self._read_fmt_chunk(chunk) 
       self._fmt_chunk_read = 1 
      elif chunkname == 'data': 
       if not self._fmt_chunk_read: 
        raise Error, 'data chunk before fmt chunk' 
       self._data_chunk = chunk 
       self._nframes = chunk.chunksize // self._framesize 
       self._data_seek_needed = 0 
       break 
      chunk.skip() 
     if not self._fmt_chunk_read or not self._data_chunk: 
      raise Error, 'fmt chunk and/or data chunk missing' 

    def __init__(self, f): 
     self._i_opened_the_file = None 
     if isinstance(f, basestring): 
      f = __builtin__.open(f, 'rb') 
      self._i_opened_the_file = f 
     # else, assume it is an open file object already 
     try: 
      self.initfp(f) 
     except: 
      if self._i_opened_the_file: 
       f.close() 
      raise 

    def __del__(self): 
     self.close() 
    # 
    # User visible methods. 
    # 
    def getfp(self): 
     return self._file 

    def rewind(self): 
     self._data_seek_needed = 1 
     self._soundpos = 0 

    def close(self): 
     if self._i_opened_the_file: 
      self._i_opened_the_file.close() 
      self._i_opened_the_file = None 
     self._file = None 

    def tell(self): 
     return self._soundpos 

    def getnchannels(self): 
     return self._nchannels 

    def getnframes(self): 
     return self._nframes 

    def getsampwidth(self): 
     return self._sampwidth 

    def getframerate(self): 
     return self._framerate 

    def getcomptype(self): 
     return self._comptype 

    def getcompname(self): 
     return self._compname 

    def getparams(self): 
     return self.getnchannels(), self.getsampwidth(), \ 
       self.getframerate(), self.getnframes(), \ 
       self.getcomptype(), self.getcompname() 

    def getmarkers(self): 
     return None 

    def getmark(self, id): 
     raise Error, 'no marks' 

    def setpos(self, pos): 
     if pos < 0 or pos > self._nframes: 
      raise Error, 'position not in range' 
     self._soundpos = pos 
     self._data_seek_needed = 1 

    def readframes(self, nframes): 
     if self._data_seek_needed: 
      self._data_chunk.seek(0, 0) 
      pos = self._soundpos * self._framesize 
      if pos: 
       self._data_chunk.seek(pos, 0) 
      self._data_seek_needed = 0 
     if nframes == 0: 
      return '' 
     if self._sampwidth > 1 and big_endian: 
      # unfortunately the fromfile() method does not take 
      # something that only looks like a file object, so 
      # we have to reach into the innards of the chunk object 
      import array 
      chunk = self._data_chunk 
      data = array.array(_array_fmts[self._sampwidth]) 
      nitems = nframes * self._nchannels 
      if nitems * self._sampwidth > chunk.chunksize - chunk.size_read: 
       nitems = (chunk.chunksize - chunk.size_read)/self._sampwidth 
      data.fromfile(chunk.file.file, nitems) 
      # "tell" data chunk how much was read 
      chunk.size_read = chunk.size_read + nitems * self._sampwidth 
      # do the same for the outermost chunk 
      chunk = chunk.file 
      chunk.size_read = chunk.size_read + nitems * self._sampwidth 
      data.byteswap() 
      data = data.tostring() 
     else: 
      data = self._data_chunk.read(nframes * self._framesize) 
     if self._convert and data: 
      data = self._convert(data) 
     self._soundpos = self._soundpos + len(data) // (self._nchannels * self._sampwidth) 
     return data 

    # 
    # Internal methods. 
    # 
    def _read_fmt_chunk(self, chunk): 
     wFormatTag, self._nchannels, self._framerate, dwAvgBytesPerSec, wBlockAlign = struct.unpack('<hhllh', chunk.read(14)) 
     if wFormatTag == WAVE_FORMAT_PCM or wFormatTag==WAVE_FORMAT_IEEE_FLOAT: 
      sampwidth = struct.unpack('<h', chunk.read(2))[0] 
      self._sampwidth = (sampwidth + 7) // 8 
     else: 
      #sampwidth = struct.unpack('<h', chunk.read(2))[0] 
      #self._sampwidth = (sampwidth + 7) // 8 
      raise Error, 'unknown format: %r' % (wFormatTag,) 
     self._framesize = self._nchannels * self._sampwidth 
     self._comptype = 'NONE' 
     self._compname = 'not compressed' 

class Wave_write: 
    """Variables used in this class: 

    These variables are user settable through appropriate methods 
    of this class: 
    _file -- the open file with methods write(), close(), tell(), seek() 
       set through the __init__() method 
    _comptype -- the AIFF-C compression type ('NONE' in AIFF) 
       set through the setcomptype() or setparams() method 
    _compname -- the human-readable AIFF-C compression type 
       set through the setcomptype() or setparams() method 
    _nchannels -- the number of audio channels 
       set through the setnchannels() or setparams() method 
    _sampwidth -- the number of bytes per audio sample 
       set through the setsampwidth() or setparams() method 
    _framerate -- the sampling frequency 
       set through the setframerate() or setparams() method 
    _nframes -- the number of audio frames written to the header 
       set through the setnframes() or setparams() method 

    These variables are used internally only: 
    _datalength -- the size of the audio samples written to the header 
    _nframeswritten -- the number of frames actually written 
    _datawritten -- the size of the audio samples actually written 
    """ 

    def __init__(self, f): 
     self._i_opened_the_file = None 
     if isinstance(f, basestring): 
      f = __builtin__.open(f, 'wb') 
      self._i_opened_the_file = f 
     try: 
      self.initfp(f) 
     except: 
      if self._i_opened_the_file: 
       f.close() 
      raise 

    def initfp(self, file): 
     self._file = file 
     self._convert = None 
     self._nchannels = 0 
     self._sampwidth = 0 
     self._framerate = 0 
     self._nframes = 0 
     self._nframeswritten = 0 
     self._datawritten = 0 
     self._datalength = 0 
     self._headerwritten = False 

    def __del__(self): 
     self.close() 

    # 
    # User visible methods. 
    # 
    def setnchannels(self, nchannels): 
     if self._datawritten: 
      raise Error, 'cannot change parameters after starting to write' 
     if nchannels < 1: 
      raise Error, 'bad # of channels' 
     self._nchannels = nchannels 

    def getnchannels(self): 
     if not self._nchannels: 
      raise Error, 'number of channels not set' 
     return self._nchannels 

    def setsampwidth(self, sampwidth): 
     if self._datawritten: 
      raise Error, 'cannot change parameters after starting to write' 
     if sampwidth < 1 or sampwidth > 4: 
      raise Error, 'bad sample width' 
     self._sampwidth = sampwidth 

    def getsampwidth(self): 
     if not self._sampwidth: 
      raise Error, 'sample width not set' 
     return self._sampwidth 

    def setframerate(self, framerate): 
     if self._datawritten: 
      raise Error, 'cannot change parameters after starting to write' 
     if framerate <= 0: 
      raise Error, 'bad frame rate' 
     self._framerate = framerate 

    def getframerate(self): 
     if not self._framerate: 
      raise Error, 'frame rate not set' 
     return self._framerate 

    def setnframes(self, nframes): 
     if self._datawritten: 
      raise Error, 'cannot change parameters after starting to write' 
     self._nframes = nframes 

    def getnframes(self): 
     return self._nframeswritten 

    def setcomptype(self, comptype, compname): 
     if self._datawritten: 
      raise Error, 'cannot change parameters after starting to write' 
     if comptype not in ('NONE',): 
      raise Error, 'unsupported compression type' 
     self._comptype = comptype 
     self._compname = compname 

    def getcomptype(self): 
     return self._comptype 

    def getcompname(self): 
     return self._compname 

    def setparams(self, params): 
     nchannels, sampwidth, framerate, nframes, comptype, compname = params 
     if self._datawritten: 
      raise Error, 'cannot change parameters after starting to write' 
     self.setnchannels(nchannels) 
     self.setsampwidth(sampwidth) 
     self.setframerate(framerate) 
     self.setnframes(nframes) 
     self.setcomptype(comptype, compname) 

    def getparams(self): 
     if not self._nchannels or not self._sampwidth or not self._framerate: 
      raise Error, 'not all parameters set' 
     return self._nchannels, self._sampwidth, self._framerate, \ 
       self._nframes, self._comptype, self._compname 

    def setmark(self, id, pos, name): 
     raise Error, 'setmark() not supported' 

    def getmark(self, id): 
     raise Error, 'no marks' 

    def getmarkers(self): 
     return None 

    def tell(self): 
     return self._nframeswritten 

    def writeframesraw(self, data): 
     self._ensure_header_written(len(data)) 
     nframes = len(data) // (self._sampwidth * self._nchannels) 
     if self._convert: 
      data = self._convert(data) 
     if self._sampwidth > 1 and big_endian: 
      import array 
      data = array.array(_array_fmts[self._sampwidth], data) 
      data.byteswap() 
      data.tofile(self._file) 
      self._datawritten = self._datawritten + len(data) * self._sampwidth 
     else: 
      self._file.write(data) 
      self._datawritten = self._datawritten + len(data) 
     self._nframeswritten = self._nframeswritten + nframes 

    def writeframes(self, data): 
     self.writeframesraw(data) 
     if self._datalength != self._datawritten: 
      self._patchheader() 

    def close(self): 
     if self._file: 
      self._ensure_header_written(0) 
      if self._datalength != self._datawritten: 
       self._patchheader() 
      self._file.flush() 
      self._file = None 
     if self._i_opened_the_file: 
      self._i_opened_the_file.close() 
      self._i_opened_the_file = None 

    # 
    # Internal methods. 
    # 

    def _ensure_header_written(self, datasize): 
     if not self._headerwritten: 
      if not self._nchannels: 
       raise Error, '# channels not specified' 
      if not self._sampwidth: 
       raise Error, 'sample width not specified' 
      if not self._framerate: 
       raise Error, 'sampling rate not specified' 
      self._write_header(datasize) 

    def _write_header(self, initlength): 
     assert not self._headerwritten 
     self._file.write('RIFF') 
     if not self._nframes: 
      self._nframes = initlength/(self._nchannels * self._sampwidth) 
     self._datalength = self._nframes * self._nchannels * self._sampwidth 
     self._form_length_pos = self._file.tell() 
     self._file.write(struct.pack('<l4s4slhhllhh4s', 
      36 + self._datalength, 'WAVE', 'fmt ', 16, 
      WAVE_FORMAT_PCM, self._nchannels, self._framerate, 
      self._nchannels * self._framerate * self._sampwidth, 
      self._nchannels * self._sampwidth, 
      self._sampwidth * 8, 'data')) 
     self._data_length_pos = self._file.tell() 
     self._file.write(struct.pack('<l', self._datalength)) 
     self._headerwritten = True 

    def _patchheader(self): 
     assert self._headerwritten 
     if self._datawritten == self._datalength: 
      return 
     curpos = self._file.tell() 
     self._file.seek(self._form_length_pos, 0) 
     self._file.write(struct.pack('<l', 36 + self._datawritten)) 
     self._file.seek(self._data_length_pos, 0) 
     self._file.write(struct.pack('<l', self._datawritten)) 
     self._file.seek(curpos, 0) 
     self._datalength = self._datawritten 

def open(f, mode=None): 
    if mode is None: 
     if hasattr(f, 'mode'): 
      mode = f.mode 
     else: 
      mode = 'rb' 
    if mode in ('r', 'rb'): 
     return Wave_read(f) 
    elif mode in ('w', 'wb'): 
     return Wave_write(f) 
    else: 
     raise Error, "mode must be 'r', 'rb', 'w', or 'wb'" 

openfp = open # B/W compatibility

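Sorry for the long code, by the way :)

So my question is: is the wave.py module inherently slow, or am I doing something inefficient? (Are there any alternative ways to solve this?)

I suppose I could read the WAV header in a custom function and read the file in a different way, but that seems like a lot of work, especially since I don't know much about 1) Python and 2) file handling.

Kind regards,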

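Edit: I tried unutbu's suggestion, but it does not work because SciPy does not accept more than 16 bits.

When I try to parse the WAV file with SciPy's wav reader, I get this message: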

C:\Users\King Broos\AppData\Local\Enthought\Canopy32\System\lib\site-packages\scipy\io\wavfile.py:31: WavFileWarning: Unfamiliar format bytes 
    warnings.warn("Unfamiliar format bytes", WavFileWarning) 

C:\Users\King Broos\AppData\Local\Enthought\Canopy32\System\lib\site-packages\scipy\io\wavfile.py:121: WavFileWarning: chunk not understood 
    warnings.warn("chunk not understood", WavFileWarning)

Looking at the code of wavfile.py, this is the line that emits the message:

if (comp != 1 or size > 16): 
    warnings.warn("Unfamiliar format bytes", WavFileWarning)

I really need 24 or 32 bits, so I guess SciPy is not an option?

2013-06-20 kbroos

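The "error" you think you are getting from 'scipy.io.wavfile' is merely a warning: it doesn't know what it is doing, but it does it anyway. So you may have to do some extra work, but if you know what you are doing you should be able to make it work: I just read a 64-bit float WAV file from [here](http://www-mmsp.ece.mcgill.ca/documents/AudioFormats/WAVE/Samples.html) using 'rate, data = scipy.io.wavfile.read(filename)', and the only catch is that 'data' has type 'np.int64'. If you view the same data as 64-bit floats, i.e. 'data = data.view(np.float64)', you should get the right format. – Jaime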
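The difference between reinterpreting and converting, which the comment above relies on, can be shown with a short sketch. It uses synthetic data rather than a real file, and the variable names are only illustrative:

```python
import numpy as np

# Stand-in for what a reader might hand back: float64 samples whose raw
# bits were labelled as int64 (synthetic data, not read from a WAV file).
original = np.array([0.0, 0.5, -1.0, 0.25], dtype=np.float64)
mislabelled = original.view(np.int64)

# view() reinterprets the same bytes: no copy and no numeric conversion.
recovered = mislabelled.view(np.float64)

# astype() converts the integer values instead, which is not wanted here.
converted = mislabelled.astype(np.float64)

print(np.array_equal(recovered, original))  # True
```

Because view() shares memory with the array it was taken from, it costs almost nothing even for long multichannel recordings.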

If you can install scipy, or already have it, then use wavfile.read:

import scipy.io.wavfile as wavfile 
sample_rate, x = wavfile.read(filename)

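You may also want to study its source code.

Note that scipy.io.wavfile does not use Python's wave module. I don't know whether it reads your IEEE_FLOAT format or not, but it does not do the same check that wave.py does: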

if wFormatTag == WAVE_FORMAT_PCM or wFormatTag == WAVE_FORMAT_IEEE_FLOAT:
    sampwidth = struct.unpack('<h', chunk.read(2))[0]
    self._sampwidth = (sampwidth + 7) // 8
else:
    #sampwidth = struct.unpack('<h', chunk.read(2))[0]
    #self._sampwidth = (sampwidth + 7) // 8
    raise Error, 'unknown format: %r' % (wFormatTag,)

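So maybe it will work out of the box.

By the way, instead of making your own module wave32.py, which is almost exactly the same as wave.py from the standard library, you could use monkey-patching: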

import wave
import struct

WAVE_FORMAT_IEEE_FLOAT = 0x0003

def _read_fmt_chunk(self, chunk):
    wFormatTag, self._nchannels, self._framerate, dwAvgBytesPerSec, wBlockAlign = struct.unpack('<hhllh', chunk.read(14))
    # WAVE_FORMAT_PCM and Error live in the wave module, so qualify them here.
    if wFormatTag == wave.WAVE_FORMAT_PCM or wFormatTag == WAVE_FORMAT_IEEE_FLOAT:
        sampwidth = struct.unpack('<h', chunk.read(2))[0]
        self._sampwidth = (sampwidth + 7) // 8
    else:
        raise wave.Error('unknown format: %r' % (wFormatTag,))
    self._framesize = self._nchannels * self._sampwidth
    self._comptype = 'NONE'
    self._compname = 'not compressed'

# Replace the stock method so that wave.open() also accepts IEEE float files.
wave.Wave_read._read_fmt_chunk = _read_fmt_chunk
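Once the reader accepts float files, the remaining cost is most likely the nested Python loop in the question, which calls struct.unpack once per sample. A minimal sketch of decoding the whole buffer in one step with NumPy follows. It assumes the bytes returned by readframes() are interleaved little-endian 32-bit floats, and the helper name frames_to_array is hypothetical, not part of wave or NumPy:

```python
import struct
import numpy as np

def frames_to_array(frames, n_channels):
    """Decode interleaved little-endian float32 bytes.

    Hypothetical helper for illustration: `frames` is a byte string such as
    the one returned by readframes(); the result has shape
    (n_frames, n_channels).
    """
    samples = np.frombuffer(frames, dtype='<f4')  # one pass, no Python loop
    return samples.reshape(-1, n_channels)

# Small synthetic check: 3 frames of 2 channels, packed the same way the
# per-sample struct.unpack('<f', ...) loop in the question expects.
values = [0.0, 0.5, -0.25, 1.0, -1.0, 0.125]
frames = struct.pack('<6f', *values)
data = frames_to_array(frames, 2)
print(data.shape)  # (3, 2)
```

In getWavData this would stand in for the two for loops, e.g. wavData = frames_to_array(wavFile.readframes(nFrames), nChannels). The array returned by np.frombuffer is read-only because it shares memory with the byte string, so call .copy() on it if the samples need to be modified.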

2013-06-20 14:16:07 unutbu

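Installing scipy as a dependency is not a problem. The problem with reading WAV files with scipy is that it does not support audio with more than 16 bits, which is unacceptable for the work I am doing, since I am evaluating the least significant bits of a 24-bit audio codec. – kbroos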
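For the 24-bit case raised in this comment, one possible approach, sketched here under assumptions rather than taken from the thread, is to obtain the raw frame bytes and assemble the samples with NumPy. The sketch assumes packed 3-byte little-endian signed PCM with interleaved channels, and the helper name pcm24_to_int32 is hypothetical:

```python
import numpy as np

def pcm24_to_int32(frames, n_channels):
    """Decode interleaved little-endian 24-bit PCM bytes into int32.

    Hypothetical helper for illustration; the result has shape
    (n_frames, n_channels) and keeps all 24 bits of every sample.
    """
    raw = np.frombuffer(frames, dtype=np.uint8).reshape(-1, 3).astype(np.uint32)
    # Place the three bytes in the top 24 bits of a 32-bit word ...
    words = (raw[:, 0] << 8) | (raw[:, 1] << 16) | (raw[:, 2] << 24)
    # ... then an arithmetic right shift restores the sign.
    return (words.view(np.int32) >> 8).reshape(-1, n_channels)

# Synthetic check covering zero, the extremes, and the lowest bits.
samples = [0, 1, -1, 8388607, -8388608, 256]
frames = b''.join((s & 0xFFFFFF).to_bytes(3, 'little') for s in samples)
data = pcm24_to_int32(frames, 2)
print(data.tolist())  # [[0, 1], [-1, 8388607], [-8388608, 256]]
```

Keeping the samples as integers avoids any rounding, which matters when the lowest bits are what is being evaluated.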
