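Determining the bit depth of a wav file

I'm looking for a fast, preferably standard-library mechanism to determine the bit depth of a wav file, e.g. '16-bit' or '24-bit'.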
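I'm using a subprocess call to Sox to get a lot of audio metadata, but the subprocess call is very slow, and at the moment the only piece of information I can reliably get only from Sox is the bit depth.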
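The built-in wave module has no function like getbitdepth() and is also incompatible with 24-bit wav files. I could use a 'try except' to access the file metadata with the wave module (if it works, manually record the file as 16-bit) and, in the except branch, call sox instead (sox would perform the analysis and record the bit depth accurately). My concern is that this approach amounts to guessing. What if an 8-bit file is read? I would be manually assigning 16-bit to a file that isn't.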
scipy.io.wavfile is also incompatible with 24-bit audio, so it leads to a similar problem.
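This tutorial is really interesting and even includes some very low-level (low-level for Python, at least) script examples that extract information from wav file headers. Unfortunately these scripts don't work for 16-bit audio.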
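Is there any way to simply (and without calling sox) determine the bit depth of the wav file I'm checking?
The wave header parser script I'm using is as follows: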
import struct

def print_wave_header(f):
    '''
    Function takes an audio file path as a parameter and
    returns a dictionary of metadata parsed from the header.
    Assumes a canonical 44-byte header where "fmt " is the first chunk.
    '''
    r = {}  # the results of the header parse
    r['path'] = f
    with open(f, "rb") as fin:  # "r" flag - read, "b" flag - binary
        ChunkID = fin.read(4)  # First four bytes are ChunkID which must be "RIFF" in ASCII
        r["ChunkID"] = ChunkID
        ChunkSizeString = fin.read(4)  # Total size of file in bytes - 8 bytes
        ChunkSize = struct.unpack('<I', ChunkSizeString)  # '<I' treats the 4 bytes as a little-endian unsigned 32-bit integer
        TotalSize = ChunkSize[0] + 8  # The subscript is used because struct.unpack returns a tuple
        r["TotalSize"] = TotalSize
        DataSize = TotalSize - 44  # Number of data bytes, only if the header is exactly 44 bytes
        r["DataSize"] = DataSize
        Format = fin.read(4)  # "WAVE" in ASCII
        r["Format"] = Format
        SubChunk1ID = fin.read(4)  # "fmt " in ASCII
        r["SubChunk1ID"] = SubChunk1ID
        SubChunk1SizeString = fin.read(4)  # Should be 16 (PCM, Pulse Code Modulation)
        SubChunk1Size = struct.unpack("<I", SubChunk1SizeString)  # unsigned 32-bit integer
        r["SubChunk1Size"] = SubChunk1Size[0]  # was storing the whole tuple
        AudioFormatString = fin.read(2)  # Should be 1 (PCM)
        AudioFormat = struct.unpack("<H", AudioFormatString)  # '<H' is a little-endian unsigned 16-bit integer
        r["AudioFormat"] = AudioFormat[0]
        NumChannelsString = fin.read(2)  # Should be 1 for mono, 2 for stereo
        NumChannels = struct.unpack("<H", NumChannelsString)  # unsigned 16-bit integer
        r["NumChannels"] = NumChannels[0]
        SampleRateString = fin.read(4)  # e.g. 44100 (CD sampling rate)
        SampleRate = struct.unpack("<I", SampleRateString)
        r["SampleRate"] = SampleRate[0]
        ByteRateString = fin.read(4)  # SampleRate*NumChan*BitsPerSample/8 (16-bit CD: 88200 mono, 176400 stereo)
        ByteRate = struct.unpack("<I", ByteRateString)  # unsigned 32-bit integer
        r["ByteRate"] = ByteRate[0]
        BlockAlignString = fin.read(2)  # NumChan*BitsPerSample/8 (16-bit: 2 mono, 4 stereo)
        BlockAlign = struct.unpack("<H", BlockAlignString)  # unsigned 16-bit integer
        r["BlockAlign"] = BlockAlign[0]
        BitsPerSampleString = fin.read(2)  # e.g. 16 (CD has 16 bits per sample for each channel)
        BitsPerSample = struct.unpack("<H", BitsPerSampleString)  # unsigned 16-bit integer
        r["BitsPerSample"] = BitsPerSample[0]
        SubChunk2ID = fin.read(4)  # "data" in ASCII
        r["SubChunk2ID"] = SubChunk2ID
        SubChunk2SizeString = fin.read(4)  # Number of data bytes, same as DataSize
        SubChunk2Size = struct.unpack("<I", SubChunk2SizeString)
        r["SubChunk2Size"] = SubChunk2Size[0]
        # Read the first five samples as S1..S5, assuming 16-bit data
        # (each a number between -32768 and 32767)
        for i in range(1, 6):
            SampleString = fin.read(2)
            r["S%d" % i] = struct.unpack("<h", SampleString)[0]
    return r
Every wav file has the bit depth in its header (the first 44 bytes) ... every wav library has to parse the header ... it is easy to do this header parsing yourself
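Using the tutorial I linked in my example I have already been able to parse the header, but the bit depth is not always clear, e.g.:
    ChunkID = b'RIFF'
    TotalSize = 602914
    DataSize = 602870
    Format = b'WAVE'
    SubChunk1ID = b'JUNK'
    SubChunk1Size = 92
    AudioFormat = 0
    NumChannels = 0
    SampleRate = 0
    ByteRate = 0
    BlockAlign = 0
    BitsPerSample = 0
    SubChunk2ID = b'\x00\x00\x00\x00'
    SubChunk2Size = 0
    S1 = 0
    S2 = 0
    S3 = 0
    S4 = 0
    S5 = 0
Depending on the file compression the header may or may not be readable, but I'd like to be able to read it regardless of file format/compression, without any conversion process. – user3535074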
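It's a red flag that all of those header fields are 0: either the file is corrupt or the library is wrong ... even if the wav file is compressed (I have never seen a compressed wav file), the header is certainly not compressed ... here is a concise summary of the WAV spec http://soundfile.sapp.org/doc/WaveFormat/ ... if you write your own header parser, pay particular attention to the byte order of the header fields and the data section ... you can do it in two pages of code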