Offline speech recognition in Python

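I am building an application that does the following:

1: If the microphone detects some noise, it starts recording audio and keeps recording until no more noise is detected. The audio is then saved to a WAV file.

2: I have to detect some words in the recording. There are only 5 to 10 words to detect.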
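The first step above (keep the audio between the first and last loud moment, then save it as a WAV file) could be sketched roughly as follows. This is a hypothetical illustration, not the asker's code: it works on an in-memory list of 16-bit samples instead of a live microphone, and `FRAME_SIZE` and `RMS_THRESHOLD` are assumed values that would need tuning for a real microphone and room.

```python
import math
import struct
import wave

FRAME_SIZE = 160        # samples per frame (10 ms at 16 kHz); assumed value
RMS_THRESHOLD = 500.0   # loudness above which a frame counts as "noise"; assumed value


def frame_rms(frame):
    """Root-mean-square amplitude of one frame of integer samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def extract_noisy_span(samples, frame_size=FRAME_SIZE, threshold=RMS_THRESHOLD):
    """Return the samples from the first loud frame through the last loud frame."""
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    loud = [i for i, f in enumerate(frames) if f and frame_rms(f) >= threshold]
    if not loud:
        return []
    start = loud[0] * frame_size
    end = min((loud[-1] + 1) * frame_size, len(samples))
    return samples[start:end]


def write_wav(path, samples, sample_rate=16000):
    """Write mono 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

A fixed energy threshold is the simplest possible trigger. In a noisy environment it would probably need an adaptive threshold or a short "hangover" period before recording stops.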

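So far, my code only does the first part (detecting noise and recording audio). I now have a list of words: help, please, yes, no, could, you, after, tomorrow. I need an offline way to detect whether my speech contains these words. Is this possible? How can I do it? I am using Linux and cannot switch my operating system to Windows or use a virtual machine.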

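I was thinking of using the spectrogram of the sound, building a training database, and using some classifier to make predictions. For example, this is the spectrogram of a word. Is this a good technique?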
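To make the spectrogram-plus-classifier idea concrete, here is a deliberately tiny sketch using only the standard library. Everything in it is an assumption chosen for illustration: the frame size, the Hann window, averaging the spectrogram over time, and the nearest-neighbour rule. In practice NumPy or SciPy would replace the naive DFT.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of the first half of a naive DFT (fine for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def spectrogram(samples, frame_size=64, hop=32):
    """List of magnitude spectra, one per Hann-windowed, overlapping frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / (frame_size - 1))
              for t in range(frame_size)]
    return [
        magnitude_spectrum([s * w for s, w in zip(samples[i:i + frame_size], window)])
        for i in range(0, len(samples) - frame_size + 1, hop)
    ]


def features(samples):
    """Average the spectrogram over time to get one fixed-length vector."""
    spec = spectrogram(samples)
    return [sum(column) / len(spec) for column in zip(*spec)]


def nearest_label(train, samples):
    """Return the label of the training vector closest to the input (1-NN).

    `train` is a list of (label, feature_vector) pairs.
    """
    query = features(samples)
    return min(train, key=lambda item: math.dist(item[1], query))[0]
```

Averaging over time throws away the order of sounds within a word, so this toy version can only separate signals whose overall spectra differ. Telling real words apart would need time-aware features and many recordings per word.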

Thanks.

2016-02-06 Caaarlos

http://stackoverflow.com/questions/3644129/how (a question on recognizing speech with the Python module Dragonfly) - maybe this helps – timgeb

Thanks, I forgot to mention that I am using Linux. I read that Dragonfly only works on Windows. Am I right? – Caaarlos

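You have not shown any code, nor demonstrated any research effort. As it stands, your question looks like a thinly veiled request for a software recommendation, which is off-topic; if it is not, it is certainly too broad. My earlier close vote cites "too broad". If you can [edit] your question to avoid both problems, you are also more likely to receive genuinely useful answers. Please see the [help] for more information about Stack Overflow. – tripleee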

You can use pocketsphinx from Python; install it with pip install pocketsphinx. The code looks like this:

import os
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

modeldir = "../../../model"
datadir = "../../../test/data"

# Create a decoder with a specific acoustic model, dictionary, and keyword list
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'en-us/en-us'))
config.set_string('-dict', os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))
config.set_string('-kws', 'command.list')

# Open the audio file to read the data
stream = open(os.path.join(datadir, "goforward.raw"), "rb")

# Alternatively, you can read from the microphone
# import pyaudio
#
# p = pyaudio.PyAudio()
# stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
# stream.start_stream()

# Process audio chunk by chunk. When a keyword is detected, perform an action and restart the search
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
    if decoder.hyp() is not None:
        print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
        print("Detected keyword, restarting search")
        decoder.end_utt()
        decoder.start_utt()
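The example above reads a headerless .raw file, while the question saves its recording as a WAV file. One possible way to bridge the two is to strip the WAV header with the standard `wave` module and hand the PCM bytes to the decoder in chunks. This is a sketch under assumptions: the helper name `iter_pcm_chunks` and the file name `recording.wav` are hypothetical, and the format check (mono, 16-bit, 16000 Hz) simply mirrors the settings in the commented-out microphone code.

```python
import wave


def iter_pcm_chunks(wav_path, frames_per_chunk=1024):
    """Yield raw PCM byte chunks from a mono, 16-bit, 16 kHz WAV file."""
    with wave.open(wav_path, "rb") as wav_file:
        fmt = (wav_file.getnchannels(), wav_file.getsampwidth(), wav_file.getframerate())
        if fmt != (1, 2, 16000):
            raise ValueError("expected mono 16-bit 16 kHz audio, got %r" % (fmt,))
        while True:
            chunk = wav_file.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk


# Hypothetical usage with the decoder from the answer above:
# for buf in iter_pcm_chunks("recording.wav"):
#     decoder.process_raw(buf, False, False)
```

Rejecting unexpected formats up front seems safer than decoding silently, since audio at the wrong sample rate tends to produce no detections at all rather than an error.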

The keyword list should look like this:

forward /1e-1/
down /1e-1/
other phrase /1e-20/

The numbers are the detection thresholds.
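For the eight words listed in the question, the keyword file could be generated rather than typed by hand. The sketch below is only an illustration: the function names are hypothetical, and using a single threshold of 1e-1 for every word is an assumption copied from the single-word entries above. Each threshold would most likely need tuning separately to balance missed detections against false alarms.

```python
WORDS = ["help", "please", "yes", "no", "could", "you", "after", "tomorrow"]


def format_keyword_list(words, threshold="1e-1"):
    """Build keyword-list text: one 'phrase /threshold/' entry per line."""
    return "".join("%s /%s/\n" % (word, threshold) for word in words)


def write_keyword_list(path, words, threshold="1e-1"):
    """Write the keyword list to the file named in the '-kws' option."""
    with open(path, "w") as handle:
        handle.write(format_keyword_list(words, threshold))


# Hypothetical usage: create the 'command.list' file referenced in the config
# write_keyword_list("command.list", WORDS)
```

Each word should also have an entry in the pronunciation dictionary passed with '-dict', since the decoder needs a pronunciation for every keyword.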

2016-02-07 09:00:13
