2017-01-09

Creating a suitable WAV file for the Google Speech API

I am using pyaudio to record my voice as a WAV file, with the following code:

import pyaudio
import wave

def voice_recorder():
    FORMAT = pyaudio.paInt16
    CHANNELS = 2
    RATE = 22050
    CHUNK = 1024
    RECORD_SECONDS = 4
    WAVE_OUTPUT_FILENAME = "first.wav"

    audio = pyaudio.PyAudio()

    # start recording
    stream = audio.open(format=FORMAT, channels=CHANNELS,
                        rate=RATE, input=True,
                        frames_per_buffer=CHUNK)
    print("konusun...")  # prompt the user to speak
    frames = []

    # read the input in CHUNK-sized blocks for RECORD_SECONDS seconds
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)
    # print("finished recording")

    # stop recording
    stream.stop_stream()
    stream.close()
    audio.terminate()

    # write the captured frames to a WAV file
    waveFile = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
    waveFile.setnchannels(CHANNELS)
    waveFile.setsampwidth(audio.get_sample_size(FORMAT))
    waveFile.setframerate(RATE)
    waveFile.writeframes(b''.join(frames))
    waveFile.close()

To convert the speech in the WAV file to text, I use Google's sample code for the Speech API: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/api-client/transcribe.py

When I pass the WAV file produced by pyaudio to Google's code, I get the following error:

googleapiclient.errors.HttpError: <HttpError 400 when requesting https://speech.googleapis.com/v1beta1/speech:syncrecognize?alt=json returned "Invalid Configuration, Does not match Wav File Header. 
Wav Header Contents: 
Encoding: LINEAR16 
Channels: 2 
Sample Rate: 22050. 
Request Contents: 
Encoding: linear16 
Channels: 1 
Sample Rate: 22050."> 
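
The error reports that the file header declares 2 channels while the request declares 1. One quick way to check what a WAV header actually contains is Python's standard `wave` module. This is only a minimal sketch; `describe_wav` is a hypothetical helper name, not part of pyaudio or of Google's sample.

```python
import wave

def describe_wav(path):
    """Return (channels, sample_width_in_bytes, sample_rate) read from a WAV header."""
    with wave.open(path, 'rb') as wav_file:
        return (wav_file.getnchannels(),
                wav_file.getsampwidth(),
                wav_file.getframerate())
```

Calling `describe_wav("first.wav")` on the recording above should report 2 channels and a 22050 Hz sample rate, in line with the "Wav Header Contents" part of the error.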

I worked around this as follows: I convert the WAV file to MP3 with ffmpeg, and then convert the MP3 file back to WAV with sox:

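import os
import subprocess

def wav_to_mp3():
    # downmix to mono (-ac 1), resample to 16 kHz (-ar 16000), silence ffmpeg's output
    with open(os.devnull, 'w') as FNULL:
        subprocess.call(['ffmpeg', '-i', 'first.wav', '-ac', '1', '-ab', '6400', '-ar', '16000', 'second.mp3', '-y'], stdout=FNULL, stderr=subprocess.STDOUT)

def mp3_to_wav():
    # decode the MP3 back to a 16 kHz WAV file
    subprocess.call(['sox', 'second.mp3', '-r', '16000', 'son.wav'])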

Google's API accepts this WAV output, but recognition is poor because the conversion degrades the audio quality too much.

So how can I create a Google-compatible WAV file with pyaudio in the first step?

Answer


Converting the WAV file to a FLAC file with avconv and sending that to the Google Speech API solved the problem:

subprocess.call(['avconv', '-i', 'first.wav', '-y', '-ar', '48000', '-ac', '1', 'last.flac'])
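
If the channel count is the only mismatch, as the error message suggests, recording in mono from the start (setting CHANNELS = 1 in voice_recorder) may avoid the conversion step altogether; that is untested here. For a file that has already been recorded in stereo, a rough standard-library sketch of a lossless-format downmix is shown below. `stereo_to_mono` is a hypothetical helper name, and the code assumes 16-bit samples, which is what `pyaudio.paInt16` produces.

```python
import array
import sys
import wave

def stereo_to_mono(src_path, dst_path):
    """Downmix a 16-bit stereo WAV to mono by averaging the left and right samples."""
    with wave.open(src_path, 'rb') as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected a 16-bit stereo WAV file")
        rate = src.getframerate()
        samples = array.array('h', src.readframes(src.getnframes()))

    # WAV sample data is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == 'big':
        samples.byteswap()

    # Stereo samples are interleaved: L0, R0, L1, R1, ...
    mono = array.array('h', ((samples[i] + samples[i + 1]) // 2
                             for i in range(0, len(samples), 2)))

    if sys.byteorder == 'big':
        mono.byteswap()

    with wave.open(dst_path, 'wb') as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)  # sample rate is left unchanged
        dst.writeframes(mono.tobytes())
```

The sample rate is not changed, so the rate declared in the recognition request would still need to match the file (22050 in the question's recording).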