2013-02-12 63 views
0

I want to run speech recognition on audio files, but recognition ends prematurely when I use SetInputToWaveFile.

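My code is very basic and derived from here. The problem is that recognition stops after only a few seconds on every wave file, even though some of the files are hours long.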

How can I make it scan the entire file?

namespace Stimmenerkennung 
{ 
    public partial class Form1 : Form 
    { 
     //... 
     Thread erkennung; 
     bool completed; 

     private void Form1_Load(object sender, EventArgs e) 
     { 
      erkennung = new Thread(erkennen); 
      erkennung.Start(); 
     } 

     void erkennen() 
     { 
      using (SpeechRecognitionEngine recognizer = 
       new SpeechRecognitionEngine()) 
      { 

       // Create and load a grammar. 
       Grammar dictation = new DictationGrammar(); 
       dictation.Name = "Dictation Grammar"; 

       recognizer.LoadGrammar(dictation); 

       // Configure the input to the recognizer. 
       recognizer.SetInputToWaveFile(@"REC01.wav"); 


       // Attach event handlers for the results of recognition. 
       recognizer.SpeechRecognized += 
        new EventHandler<SpeechRecognizedEventArgs>(recognizer_SpeechRecognized); 
       recognizer.RecognizeCompleted += 
        new EventHandler<RecognizeCompletedEventArgs>(recognizer_RecognizeCompleted); 

       // Perform recognition on the entire file. 
       db("Starting asynchronous recognition..."); 
       recognizer.RecognizeAsync(); 
       while (!completed) 
       { 
        //fs((int)(100/recognizer.AudioPosition.TotalSeconds * recognizer.AudioPosition.Seconds)); 
        db(recognizer.AudioState.ToString()); 
        Thread.Sleep(100); 
       } 
      } 
     } 

     // Handle the SpeechRecognized event. 
     void recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e) 
     { 
      if (e.Result != null && e.Result.Text != null) 
      { 
       db(e.Result.Text); 
      } 
      else 
      { 
       db(" Recognized text not available."); 
      } 
     } 

     // Handle the RecognizeCompleted event. 
     void recognizer_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e) 
     { 
      if (e.Cancelled) 
      { 
       db(" Operation cancelled."); 
      } 
      if (e.InputStreamEnded) 
      { 
       db(" End of stream encountered."); 
      } 
      completed = true; 
     } 

     void db(string t) 
     { 
      this.textBox1.Invoke((MethodInvoker)delegate 
      { 
       textBox1.Text = textBox1.Text + Environment.NewLine + t; 
       //textBox1.Text = t; 
      }); 
     } 
    } 
} 
+1

The most basic mistake here is that you are not calling RecognizeAsync() with the RecognizeMode.Multiple argument. – 2013-02-12 05:23:00

+0

That solved it. Good work. – 2013-02-12 11:51:55
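For readers landing here, the fix from the comment above can be sketched as a small console program. This is a minimal sketch, not the asker's final code: it assumes a reference to System.Speech, reuses the file name REC01.wav from the question, and replaces the polling loop with a ManualResetEvent (my choice, not something the thread prescribes).

```csharp
using System;
using System.Speech.Recognition;
using System.Threading;

class WaveFileDictation
{
    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine())
        using (var done = new ManualResetEvent(false))
        {
            recognizer.LoadGrammar(new DictationGrammar());
            recognizer.SetInputToWaveFile(@"REC01.wav");

            // Print each recognized phrase as it arrives.
            recognizer.SpeechRecognized += (s, e) => Console.WriteLine(e.Result.Text);

            // Signal the main thread once recognition has finished.
            recognizer.RecognizeCompleted += (s, e) => done.Set();

            // RecognizeAsync() without arguments performs a single recognition
            // and then stops. RecognizeMode.Multiple keeps recognizing until
            // the input ends or the operation is cancelled.
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            done.WaitOne();
        }
    }
}
```

In the Windows Forms code from the question, the only change needed is the argument to RecognizeAsync; the rest of the structure can stay as it is.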

Answers

0

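You can split the file at silences into chunks of a few seconds each and feed each chunk to the recognizer separately. You can then merge the results into a single string.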

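You can use any voice activity detection (VAD) implementation to perform the split; a simple energy-based VAD that computes the energy of each frame is sufficient.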

You can find some existing VAD implementations in CMUSphinx.
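To illustrate what "a simple energy-based VAD" could look like, here is a rough sketch in C#. It is not taken from CMUSphinx; the class name EnergyVad, the 20 ms frame length, and the threshold value are all illustrative assumptions that would need tuning on real recordings. It works on 16-bit PCM samples that have already been read from the wave file.

```csharp
using System;
using System.Collections.Generic;

static class EnergyVad
{
    // Returns (start, end) times in seconds of regions whose mean squared
    // amplitude per frame exceeds the threshold. Names and defaults are
    // hypothetical; tune frameSeconds and threshold for your audio.
    public static List<Tuple<double, double>> FindSpeech(
        short[] samples, int sampleRate,
        double frameSeconds = 0.02, double threshold = 1e6)
    {
        var regions = new List<Tuple<double, double>>();
        int frameLen = Math.Max(1, (int)(sampleRate * frameSeconds));
        int frames = samples.Length / frameLen;
        int start = -1; // frame index where the current region began; -1 means silence

        for (int f = 0; f < frames; f++)
        {
            double energy = 0;
            for (int i = f * frameLen; i < (f + 1) * frameLen; i++)
                energy += (double)samples[i] * samples[i];
            energy /= frameLen; // mean squared amplitude of this frame

            bool voiced = energy > threshold;
            if (voiced && start < 0)
            {
                start = f; // silence -> speech transition
            }
            else if (!voiced && start >= 0)
            {
                // speech -> silence transition: close the region
                regions.Add(Tuple.Create(
                    start * frameLen / (double)sampleRate,
                    f * frameLen / (double)sampleRate));
                start = -1;
            }
        }

        // Close a region that runs to the end of the input.
        if (start >= 0)
            regions.Add(Tuple.Create(
                start * frameLen / (double)sampleRate,
                frames * frameLen / (double)sampleRate));
        return regions;
    }

    static void Main()
    {
        // Synthetic test signal: 1 s silence, 1 s of a 440 Hz tone, 1 s silence.
        int rate = 16000;
        var samples = new short[3 * rate];
        for (int i = rate; i < 2 * rate; i++)
            samples[i] = (short)(10000 * Math.Sin(2 * Math.PI * 440 * i / rate));

        foreach (var r in FindSpeech(samples, rate))
            Console.WriteLine("{0:F2} s - {1:F2} s", r.Item1, r.Item2);
    }
}
```

On the synthetic signal this should report a single region covering the second in which the tone plays. A practical splitter would usually also merge regions separated by very short pauses, so that words are not cut in the middle, before handing each chunk to the recognizer.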

+0

Neither of those works with Python or C#. Which VAD can be used with C# or Python? – 2013-02-12 11:55:18

+0

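Even better: how can I use the VAD of the Microsoft speech recognizer? The goal is to feed in a wav file and have the program output TotalSeconds whenever it detects speech, or better still, report the speech regions, i.e. the start time and end time of each speech chunk. – 2013-02-12 13:33:53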
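One possible way to get timing information out of System.Speech is sketched below. This is an assumption about what would satisfy the request, not an answer given in the thread: it uses the SpeechDetected event (whose arguments carry an AudioPosition) and the RecognizedAudio object attached to each result (AudioPosition and Duration). It only reports regions for phrases the engine actually recognizes, so speech that is rejected will not appear in the output.

```csharp
using System;
using System.Speech.Recognition;
using System.Threading;

class SpeechRegions
{
    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine())
        using (var done = new ManualResetEvent(false))
        {
            recognizer.LoadGrammar(new DictationGrammar());
            recognizer.SetInputToWaveFile(@"REC01.wav");

            // Raised when the engine detects input that it identifies as speech.
            recognizer.SpeechDetected += (s, e) =>
                Console.WriteLine("Speech detected at {0:F1} s",
                    e.AudioPosition.TotalSeconds);

            // Each recognized phrase carries its position in the input stream.
            recognizer.SpeechRecognized += (s, e) =>
            {
                RecognizedAudio audio = e.Result.Audio;
                if (audio == null) return; // no audio attached to this result

                TimeSpan start = audio.AudioPosition;
                TimeSpan end = start + audio.Duration;
                Console.WriteLine("{0:F1} s - {1:F1} s: {2}",
                    start.TotalSeconds, end.TotalSeconds, e.Result.Text);
            };

            recognizer.RecognizeCompleted += (s, e) => done.Set();

            // Multiple mode so that the whole file is processed.
            recognizer.RecognizeAsync(RecognizeMode.Multiple);
            done.WaitOne();
        }
    }
}
```

If regions are needed for all speech rather than only for recognized phrases, a separate VAD pass over the samples is probably the more reliable route.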