2015-06-03 121 views
2

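Python - Downsampling wav audio file

I have to downsample a wav file from 44100 Hz to 16000 Hz without using any external Python libraries, so preferably with wave and/or audioop. I tried changing the wav file's frame rate to 16000 with the setframerate function, but that just slows down the entire recording. How can I downsample the audio file to 16 kHz and keep the same audio length?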

Thank you very much in advance.
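The slowdown described in the question follows from how playback duration is derived: frame count divided by frame rate. A minimal sketch using only the standard library is below. The helper names `make_wav` and `duration_seconds` are hypothetical, and the files are silent in-memory buffers rather than real recordings.

```python
import io
import wave


def make_wav(n_frames, rate):
    """Return an in-memory mono 16-bit WAV holding n_frames of silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * n_frames)
    buf.seek(0)
    return buf


def duration_seconds(wav_file):
    """Duration is simply frames divided by the frame rate in the header."""
    with wave.open(wav_file, "rb") as w:
        return w.getnframes() / w.getframerate()


# One second of audio recorded at 44100 Hz.
d_original = duration_seconds(make_wav(44100, 44100))
# Same 44100 samples, but the header now claims 16000 Hz (setframerate only).
d_relabelled = duration_seconds(make_wav(44100, 16000))
# Properly resampled: fewer samples, matching header.
d_resampled = duration_seconds(make_wav(16000, 16000))

print(d_original)    # 1.0
print(d_relabelled)  # 2.75625
print(d_resampled)   # 1.0
```

Changing only the header stretches one second to about 2.76 seconds. Keeping the length means reducing the number of samples by the same ratio, 16000/44100.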

+0

If you go to 11025 Hz it will be easier: just low-pass filter and then take every 4th sample – samgak

+0

Is audioop's ratecv what you're after? https://docs.python.org/2/library/audioop.html#audioop.ratecv –

+0

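It needs to be 16 kHz because our pipeline tool has to export it for Unity projects. Would you mind giving me an example of using the audioop.ratecv function? I'm confused by that function's fragment parameter. How do I get it? @JimJeffries – d3cr1pt0r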
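The 11025 Hz suggestion in the comments works because 44100 / 4 is a whole number, so decimation reduces to "filter, then keep every 4th sample". The target of 16000 Hz has no such integer factor (the ratio is 160/441). Below is a rough pure-Python sketch of integer decimation. The function names, the 63-tap Hamming-windowed sinc filter and the 0.9 cutoff margin are all assumptions chosen for illustration, not something from the thread.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc low-pass FIR.

    cutoff is a fraction of the sample rate (between 0 and 0.5).
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC


def decimate(samples, factor, num_taps=63):
    """Low-pass below the new Nyquist frequency, then keep every factor-th sample."""
    taps = lowpass_taps(num_taps, 0.5 / factor * 0.9)
    half = num_taps // 2
    padded = [0.0] * half + list(samples) + [0.0] * half
    out = []
    # Only evaluate the filter at the positions that are kept.
    for start in range(0, len(samples), factor):
        segment = padded[start:start + num_taps]
        out.append(sum(s * t for s, t in zip(segment, taps)))
    return out


rate = 44100
low = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
high = [math.sin(2 * math.pi * 10000 * n / rate) for n in range(rate)]

low_out = decimate(low, 4)    # 440 Hz is below the new Nyquist and survives
high_out = decimate(high, 4)  # 10 kHz is above it and is filtered out

print(len(low_out))  # 11025
```

Skipping the filter and simply taking every 4th sample would fold the 10 kHz tone back into the audible band as an alias.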

Answers

1

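You can use resample in scipy. It is a bit of a headache, because some type conversion is needed between the bytestring that is native to Python and the arrays that scipy needs. There is another headache: the wave module in Python gives you no way to tell whether the data is signed or not (only whether it is 8 or 16 bits). It might (should) work for both, but I haven't tested it.

Here is a small program that converts 8-bit and 16-bit mono from 44.1 kHz to 16 kHz. If you have stereo or use other formats, it shouldn't be hard to adapt. Edit the input/output names at the start of the code. I never got around to using command-line arguments.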

#!/usr/bin/env python 
# -*- coding: utf-8 -*- 
# 
# downsample.py 
# 
# Copyright 2015 John Coppens <[email protected]> 
# 
# This program is free software; you can redistribute it and/or modify 
# it under the terms of the GNU General Public License as published by 
# the Free Software Foundation; either version 2 of the License, or 
# (at your option) any later version. 
# 
# This program is distributed in the hope that it will be useful, 
# but WITHOUT ANY WARRANTY; without even the implied warranty of 
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 
# GNU General Public License for more details. 
# 
# You should have received a copy of the GNU General Public License 
# along with this program; if not, write to the Free Software 
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, 
# MA 02110-1301, USA. 
# 
# 

import sys
import wave

import numpy as np
import scipy.signal as sps

inwave = "sine_44k.wav"
outwave = "sine_16k.wav"


class DownSample:
    def __init__(self):
        # Kept as floats so the ratio below is not truncated by integer division.
        self.in_rate = 44100.0
        self.out_rate = 16000.0

    def open_file(self, fname):
        try:
            self.in_wav = wave.open(fname, "rb")
        except (IOError, EOFError, wave.Error):
            print("Cannot open wav file (%s)" % fname)
            return False

        if self.in_wav.getframerate() != self.in_rate:
            print("Frame rate is not %d (it's %d)" %
                  (self.in_rate, self.in_wav.getframerate()))
            return False

        # Resampling the interleaved samples of a stereo file would mix channels.
        if self.in_wav.getnchannels() != 1:
            print("Only mono input is supported")
            return False

        self.in_nframes = self.in_wav.getnframes()
        print("Frames: %d" % self.in_nframes)

        # WAV PCM stores 8-bit samples unsigned and 16-bit samples as
        # signed little-endian integers.
        if self.in_wav.getsampwidth() == 1:
            self.nptype = np.dtype(np.uint8)
        elif self.in_wav.getsampwidth() == 2:
            self.nptype = np.dtype("<i2")
        else:
            print("Unsupported sample width")
            return False

        return True

    def resample(self, fname):
        self.out_wav = wave.open(fname, "wb")
        self.out_wav.setframerate(int(self.out_rate))
        self.out_wav.setnchannels(self.in_wav.getnchannels())
        self.out_wav.setsampwidth(self.in_wav.getsampwidth())

        print("Nr output channels: %d" % self.out_wav.getnchannels())

        raw = self.in_wav.readframes(self.in_nframes)
        samples = np.frombuffer(raw, dtype=self.nptype)

        # Count samples, not bytes: len(raw) is twice the sample count
        # for 16-bit audio.
        nroutsamples = int(round(len(samples) * self.out_rate / self.in_rate))
        print("Nr output samples: %d" % nroutsamples)

        audio_out = sps.resample(samples.astype(np.float64), nroutsamples)

        # Clip before casting back so that filter overshoot does not wrap around.
        info = np.iinfo(self.nptype)
        audio_out = np.clip(np.round(audio_out), info.min, info.max)
        audio_out = audio_out.astype(self.nptype)

        self.out_wav.writeframes(audio_out.tobytes())

        self.out_wav.close()
        self.in_wav.close()


def main():
    ds = DownSample()
    if not ds.open_file(inwave):
        return 1
    ds.resample(outwave)
    return 0


if __name__ == '__main__':
    sys.exit(main())
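
The type conversion and signedness issue that this answer calls a headache can be shown in isolation. In WAV PCM, 8-bit samples are unsigned (silence is 128) and 16-bit samples are signed little-endian. A small sketch with the standard `struct` module follows. The helper names `pcm_to_floats` and `floats_to_pcm` are hypothetical and only cover these two widths.

```python
import struct


def pcm_to_floats(raw, sampwidth):
    """Convert raw WAV PCM bytes to floats in the range [-1.0, 1.0)."""
    if sampwidth == 1:
        # 8-bit WAV is unsigned, centred on 128.
        return [(b - 128) / 128.0 for b in raw]
    if sampwidth == 2:
        # 16-bit WAV is signed little-endian.
        count = len(raw) // 2
        return [s / 32768.0 for s in struct.unpack("<%dh" % count, raw)]
    raise ValueError("unsupported sample width: %d" % sampwidth)


def floats_to_pcm(values, sampwidth):
    """Convert floats back to raw PCM bytes, clipping out-of-range values."""
    if sampwidth == 1:
        return bytes(max(0, min(255, int(round(v * 128.0)) + 128)) for v in values)
    if sampwidth == 2:
        ints = [max(-32768, min(32767, int(round(v * 32768.0)))) for v in values]
        return struct.pack("<%dh" % len(ints), *ints)
    raise ValueError("unsupported sample width: %d" % sampwidth)


raw16 = struct.pack("<4h", -32768, -1, 0, 32767)
raw8 = bytes([0, 128, 255])

print(pcm_to_floats(raw8, 1))                            # [-1.0, 0.0, 0.9921875]
print(floats_to_pcm(pcm_to_floats(raw16, 2), 2) == raw16)  # True
```

Clipping matters after resampling, because filter overshoot can push values slightly past full scale and a plain integer cast would wrap them around.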
3

Thank you all for your answers. I have already found a solution, and it works very well. Here is the whole function.

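import os
import wave

# audioop was deprecated in Python 3.11 and removed from the standard library in 3.13.
import audioop


def downsampleWav(src, dst, inrate=44100, outrate=16000, inchannels=2, outchannels=1):
    if not os.path.exists(src):
        print('Source not found!')
        return False

    # dirname is empty when dst has no directory part; makedirs('') would fail.
    dst_dir = os.path.dirname(dst)
    if dst_dir and not os.path.exists(dst_dir):
        os.makedirs(dst_dir)

    try:
        s_read = wave.open(src, 'rb')
        s_write = wave.open(dst, 'wb')
    except (IOError, EOFError, wave.Error):
        print('Failed to open files!')
        return False

    n_frames = s_read.getnframes()
    data = s_read.readframes(n_frames)

    try:
        # The fragment argument is the raw bytes from readframes; width 2
        # assumes 16-bit samples. ratecv returns (new_fragment, new_state),
        # and only the fragment is audio data.
        converted, _ = audioop.ratecv(data, 2, inchannels, inrate, outrate, None)
        # Downmix only when the input really is stereo: tomono pairs up
        # samples, so calling it on mono data halves the length.
        if inchannels == 2 and outchannels == 1:
            converted = audioop.tomono(converted, 2, 1, 0)
    except audioop.error:
        print('Failed to downsample wav')
        return False

    try:
        s_write.setparams((outchannels, 2, outrate, 0, 'NONE', 'Uncompressed'))
        s_write.writeframes(converted)
    except (IOError, wave.Error):
        print('Failed to write wav')
        return False

    try:
        s_read.close()
        s_write.close()
    except (IOError, wave.Error):
        print('Failed to close wav files')
        return False

    return True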
+1

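I know this is old, but I just had the same problem, so I tried the code and I think it has a subtle bug. If my inchannels = 1 and outchannels = 1, the tomono function gets called anyway, which messes up my audio signal (the length is halved). Also, when writing the frames, shouldn't you write only converted[0] (depending on whether tomono was explicitly called), since the new state returned by ratecv is irrelevant? – user667804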
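
The length halving described in this comment follows from how a stereo-to-mono conversion reads its input: it takes samples in left/right pairs and emits one sample per pair. The sketch below uses a hypothetical pure-Python helper, `downmix_pairs`, as a stand-in for audioop.tomono with 16-bit samples. It is an illustration of the pairing, not the library's implementation.

```python
import struct


def downmix_pairs(raw, lfactor, rfactor):
    """Treat 16-bit little-endian data as interleaved L/R pairs and
    mix each pair into a single sample."""
    count = len(raw) // 2
    samples = struct.unpack("<%dh" % count, raw)
    mixed = [int(left * lfactor + right * rfactor)
             for left, right in zip(samples[0::2], samples[1::2])]
    return struct.pack("<%dh" % len(mixed), *mixed)


# Four stereo frames (L, R, L, R, ...) and four mono frames.
stereo = struct.pack("<8h", 100, 200, 300, 400, 500, 600, 700, 800)
mono = struct.pack("<4h", 100, 300, 500, 700)

# Factors 1 and 0 keep only the left channel, as in the answer's tomono call.
from_stereo = downmix_pairs(stereo, 1, 0)  # 4 frames in, 4 frames out
from_mono = downmix_pairs(mono, 1, 0)      # 4 frames in, only 2 frames out

print(struct.unpack("<4h", from_stereo))  # (100, 300, 500, 700)
print(struct.unpack("<2h", from_mono))    # (100, 500)
```

With mono input, every second sample is mistaken for a right channel and dropped, so the result plays back at half the length.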