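Python - Downsampling a wav audio file
I have to downsample a wav file from 44100 Hz to 16000 Hz without using any external Python libraries, so preferably with wave and/or audioop. I tried changing the wav file's frame rate to 16000 with the setframerate function, but that only slows down the whole recording. How can I downsample the audio file to 16 kHz and keep the same audio length?
Thank you very much in advance.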
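A small sketch of why changing only the frame rate stretches the recording: playback duration is the number of frames divided by the frame rate, so keeping all 44100 frames per second of audio and labelling them as 16000 Hz makes the file 44100/16000 times longer. The helper names `duration_seconds` and `relabel_rate` are made up for this illustration, and the input is one second of silence generated on the spot.

```python
import os
import tempfile
import wave


def duration_seconds(path):
    """Playback duration = number of frames / frame rate."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / float(w.getframerate())


def relabel_rate(src, dst, new_rate):
    """Copy the samples unchanged and only change the frame rate in the header."""
    with wave.open(src, "rb") as r:
        nchannels = r.getnchannels()
        sampwidth = r.getsampwidth()
        frames = r.readframes(r.getnframes())
    with wave.open(dst, "wb") as w:
        w.setnchannels(nchannels)
        w.setsampwidth(sampwidth)
        w.setframerate(new_rate)
        w.writeframes(frames)


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    src = os.path.join(tmp, "one_second_44k.wav")
    dst = os.path.join(tmp, "relabelled_16k.wav")
    with wave.open(src, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(44100)
        w.writeframes(b"\x00\x00" * 44100)  # one second of 16-bit silence
    relabel_rate(src, dst, 16000)
    print(duration_seconds(src))  # 1.0
    print(duration_seconds(dst))  # 2.75625
```

Keeping the duration therefore means producing fewer frames (16000 per second), not just changing the header.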
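You can use resample from scipy. It is a bit of a headache, because some type conversion is needed between the bytestring native to Python and the arrays that scipy expects. There is a second headache: the wave module in Python gives no way to tell whether the data is signed (only whether it is 8 or 16 bits). It might (should) work for both widths, but I have not tested it. (In standard PCM WAV files, 8-bit samples are unsigned and 16-bit samples are signed.)
Here is a small program that converts 8-bit and 16-bit mono audio from 44.1 kHz to 16 kHz. If you have stereo or another format, it should not be hard to adapt. Edit the input/output names at the start of the code; I never got around to using command-line arguments.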
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# downsample.py
#
# Copyright 2015 John Coppens <[email protected]>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
# MA 02110-1301, USA.
#
#
inwave = "sine_44k.wav"
outwave = "sine_16k.wav"

import wave
import numpy as np
import scipy.signal as sps


class DownSample():
    def __init__(self):
        self.in_rate = 44100.0
        self.out_rate = 16000.0

    def open_file(self, fname):
        try:
            self.in_wav = wave.open(fname)
        except (IOError, EOFError, wave.Error):
            print("Cannot open wav file (%s)" % fname)
            return False

        if self.in_wav.getframerate() != self.in_rate:
            print("Frame rate is not %d (it's %d)" %
                  (self.in_rate, self.in_wav.getframerate()))
            return False

        self.in_nframes = self.in_wav.getnframes()
        print("Frames: %d" % self.in_nframes)

        # PCM WAV stores 8-bit samples unsigned and 16-bit samples signed
        if self.in_wav.getsampwidth() == 1:
            self.nptype = np.uint8
        elif self.in_wav.getsampwidth() == 2:
            self.nptype = np.int16
        else:
            print("Unsupported sample width: %d bytes" %
                  self.in_wav.getsampwidth())
            return False

        return True

    def resample(self, fname):
        self.out_wav = wave.open(fname, "w")
        self.out_wav.setframerate(int(self.out_rate))
        self.out_wav.setnchannels(self.in_wav.getnchannels())
        self.out_wav.setsampwidth(self.in_wav.getsampwidth())
        self.out_wav.setnframes(1)

        print("Nr output channels: %d" % self.out_wav.getnchannels())

        audio = self.in_wav.readframes(self.in_nframes)
        samples = np.frombuffer(audio, self.nptype)
        # Count samples, not bytes: 16-bit audio has two bytes per sample
        nroutsamples = int(round(len(samples) * self.out_rate / self.in_rate))
        print("Nr output samples: %d" % nroutsamples)

        audio_out = sps.resample(samples, nroutsamples)
        # Round and clip before casting so overshoot does not wrap around
        info = np.iinfo(self.nptype)
        audio_out = np.clip(np.rint(audio_out), info.min, info.max)
        audio_out = audio_out.astype(self.nptype)

        self.out_wav.writeframes(audio_out.tobytes())
        self.out_wav.close()


def main():
    ds = DownSample()
    if not ds.open_file(inwave):
        return 1
    ds.resample(outwave)
    return 0


if __name__ == '__main__':
    main()
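The answer mentions that adapting the program to stereo should not be hard. One possible way to do that is sketched below; it is not the answer's own code. It swaps in `scipy.signal.resample_poly` (a polyphase resampler) in place of `sps.resample`, assumes 16-bit PCM input only, and uses a made-up function name, `resample_wav_16bit`. Interleaved frames are reshaped to one column per channel so that each channel is resampled separately along the time axis.

```python
import math
import wave

import numpy as np
from scipy.signal import resample_poly


def resample_wav_16bit(src, dst, out_rate=16000):
    """Resample a 16-bit PCM WAV file (mono or stereo) to out_rate.

    Returns the number of frames written.
    """
    with wave.open(src, "rb") as win:
        nchannels = win.getnchannels()
        in_rate = win.getframerate()
        if win.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        raw = win.readframes(win.getnframes())

    # Frames interleave the channels, so reshape to (frames, channels)
    # and resample along axis 0 (time).
    samples = np.frombuffer(raw, dtype=np.int16).reshape(-1, nchannels)

    # Reduce the rate ratio to lowest terms, e.g. 16000/44100 -> 160/441.
    g = math.gcd(int(out_rate), int(in_rate))
    up, down = int(out_rate) // g, int(in_rate) // g
    resampled = resample_poly(samples.astype(np.float64), up, down, axis=0)

    # Round and clip so filter overshoot cannot wrap around in int16.
    out = np.clip(np.rint(resampled), -32768, 32767).astype(np.int16)

    with wave.open(dst, "wb") as wout:
        wout.setnchannels(nchannels)
        wout.setsampwidth(2)
        wout.setframerate(int(out_rate))
        wout.writeframes(out.tobytes())
    return out.shape[0]
```

A call such as `resample_wav_16bit("sine_44k.wav", "sine_16k.wav")` would then replace the class above for 16-bit files; 8-bit input would need its own unsigned handling.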
Thank you all for your answers. I have found a solution that works very well. Here is the whole function.
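import os
import wave
import audioop  # standard library up to Python 3.12; removed in Python 3.13


def downsampleWav(src, dst, inrate=44100, outrate=16000, inchannels=2, outchannels=1):
    if not os.path.exists(src):
        print('Source not found!')
        return False

    # dirname is empty when dst has no directory part
    dst_dir = os.path.dirname(dst)
    if dst_dir and not os.path.exists(dst_dir):
        os.makedirs(dst_dir)

    try:
        s_read = wave.open(src, 'r')
        s_write = wave.open(dst, 'w')
    except (IOError, EOFError, wave.Error):
        print('Failed to open files!')
        return False

    n_frames = s_read.getnframes()
    data = s_read.readframes(n_frames)

    try:
        # ratecv returns (converted_fragment, new_state); width 2 = 16-bit samples
        converted, _ = audioop.ratecv(data, 2, inchannels, inrate, outrate, None)
        # Only mix down stereo input: tomono on mono data would halve its length.
        # Factors (1, 0) keep the left channel and drop the right one.
        if outchannels == 1 and inchannels == 2:
            converted = audioop.tomono(converted, 2, 1, 0)
    except audioop.error:
        print('Failed to downsample wav')
        return False

    try:
        s_write.setparams((outchannels, 2, outrate, 0, 'NONE', 'Uncompressed'))
        s_write.writeframes(converted)
    except (IOError, wave.Error):
        print('Failed to write wav')
        return False

    try:
        s_read.close()
        s_write.close()
    except (IOError, wave.Error):
        print('Failed to close wav files')
        return False

    return True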
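On interpreters where audioop is not available (it was removed from the standard library in Python 3.13), a similar rate conversion can be sketched with only wave and array. This is an illustrative fallback under stated assumptions, not the method used above: it handles 16-bit mono input only, the names `resample_linear` and `downsample_mono16` are made up, and it uses plain linear interpolation with no low-pass filter, so content above half the output rate will alias.

```python
import array
import wave


def resample_linear(samples, in_rate, out_rate):
    """Linearly interpolate a sequence of mono samples to a new rate."""
    n_in = len(samples)
    n_out = int(round(n_in * out_rate / float(in_rate)))
    out = array.array("h")
    for i in range(n_out):
        pos = i * in_rate / float(out_rate)  # position in the input signal
        j = int(pos)
        frac = pos - j
        a = samples[min(j, n_in - 1)]
        b = samples[min(j + 1, n_in - 1)]  # repeat the last sample at the end
        out.append(int(round(a + (b - a) * frac)))
    return out


def downsample_mono16(src, dst, out_rate=16000):
    """Resample a 16-bit mono WAV file; returns the number of frames written."""
    with wave.open(src, "rb") as r:
        if r.getnchannels() != 1 or r.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono WAV files")
        in_rate = r.getframerate()
        samples = array.array("h")
        samples.frombytes(r.readframes(r.getnframes()))

    out = resample_linear(samples, in_rate, out_rate)

    with wave.open(dst, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(out_rate)
        w.writeframes(out.tobytes())
    return len(out)
```

The pure-Python loop is slow on long recordings, so this is mainly useful for short clips or as a reference for what the rate conversion does.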
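I know this is old, but I just had the same problem, so I tried the code and I think it has a subtle bug. If my inchannels = 1 and outchannels = 1, the tomono function gets called anyway, which messes up my audio signal (the length is halved). Also, when writing the frames, shouldn't you write only converted[0] (depending on whether tomono was called), since the new state returned by ratecv is irrelevant? – user667804
If you were going to 11025 Hz it would be easier: just low-pass filter and then take every 4th sample – samgak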
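The idea in the comment above (an integer ratio of 44100/11025 = 4) can be sketched in a few lines. The low-pass filter here is assumed to be a plain 4-sample average, which is a crude choice picked only to keep the example short; the function name `decimate_by_4` is made up. Averaging each group of 4 samples and emitting one value per group is the same as running a 4-tap moving average and keeping every 4th result.

```python
def decimate_by_4(samples):
    """Average each group of 4 consecutive samples (a crude low-pass
    filter) and keep one integer output value per group."""
    out = []
    # Stop before a trailing group that has fewer than 4 samples.
    for start in range(0, len(samples) - 3, 4):
        block = samples[start:start + 4]
        out.append(sum(block) // 4)  # floor division keeps integer samples
    return out


print(decimate_by_4([0, 4, 8, 12, 100, 100, 100, 100, 7]))  # [6, 100]
```

This shortcut does not apply to 16000 Hz, because 44100/16000 is not an integer.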
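Is audioop's ratecv what you are after? https://docs.python.org/2/library/audioop.html#audioop.ratecv –
It needs to be 16 kHz, because our pipeline tool needs to export it for Unity projects. Would you mind giving me an example of using the audioop.ratecv function? I am confused by the fragment parameter of that function. How do I get it? @JimJeffries – d3cr1pt0r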