2009-11-12

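Are the SpeechSynthesizer's SpeakProgressEventArgs inaccurate?

Using the System.Speech.Synthesis.SpeechSynthesizer class in .NET 3.5, the AudioPosition property of SpeakProgressEventArgs appears to be inaccurate.

The code below produces the output that follows it.

Code: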

using System; 
using System.Speech.Synthesis; 
using System.Threading; 

namespace SpeechTest 
{ 
    class Program
    {
        // Signaled once the asynchronous speak operation has completed.
        static ManualResetEvent speechDoneEvent = new ManualResetEvent(false);

        static void Main(string[] args)
        {
            SpeechSynthesizer synthesizer = new SpeechSynthesizer();

            synthesizer.SpeakProgress += new EventHandler<SpeakProgressEventArgs>(synthesizer_SpeakProgress);
            synthesizer.SpeakCompleted += new EventHandler<SpeakCompletedEventArgs>(synthesizer_SpeakCompleted);

            // No format is specified here, so the synthesizer's default output format is used.
            synthesizer.SetOutputToWaveFile("Test.wav");

            synthesizer.SpeakAsync("This holiday season, support the music you love by shopping at Made in Washington, online and at one of five local stores. Made in Washington chocolates, bountiful gift baskets and ornaments are the perfect holiday gifts for family, friends and co-workers.");

            // Block until SpeakCompleted fires.
            speechDoneEvent.WaitOne();
        }

        static void synthesizer_SpeakCompleted(object sender, SpeakCompletedEventArgs e)
        {
            speechDoneEvent.Set();
        }

        static void synthesizer_SpeakProgress(object sender, SpeakProgressEventArgs e)
        {
            Console.WriteLine("SpeakProgress: AudioPosition=" + e.AudioPosition + ",\tCharacterPosition=" + e.CharacterPosition + ",\tCharacterCount=" + e.CharacterCount + ",\tText=" + e.Text);
        }
    }
} 

Output:

SpeakProgress: AudioPosition=00:00:00.0043750, CharacterPosition=0, CharacterCount=4,  Text=This 
SpeakProgress: AudioPosition=00:00:00.2925625, CharacterPosition=5, CharacterCount=7,  Text=holiday 
SpeakProgress: AudioPosition=00:00:00.9086250, CharacterPosition=13, CharacterCount=6,  Text=season 
SpeakProgress: AudioPosition=00:00:01.9421250, CharacterPosition=21, CharacterCount=7,  Text=support 
SpeakProgress: AudioPosition=00:00:02.5621250, CharacterPosition=29, CharacterCount=3,  Text=the 
SpeakProgress: AudioPosition=00:00:02.6760625, CharacterPosition=33, CharacterCount=5,  Text=music 
SpeakProgress: AudioPosition=00:00:03.2648125, CharacterPosition=39, CharacterCount=3,  Text=you 
SpeakProgress: AudioPosition=00:00:03.5199375, CharacterPosition=43, CharacterCount=4,  Text=love 
SpeakProgress: AudioPosition=00:00:03.8435625, CharacterPosition=48, CharacterCount=2,  Text=by 
SpeakProgress: AudioPosition=00:00:04.0701875, CharacterPosition=51, CharacterCount=8,  Text=shopping 
SpeakProgress: AudioPosition=00:00:04.6840625, CharacterPosition=60, CharacterCount=2,  Text=at 
SpeakProgress: AudioPosition=00:00:04.8036250, CharacterPosition=63, CharacterCount=4,  Text=Made 
SpeakProgress: AudioPosition=00:00:05.0698125, CharacterPosition=68, CharacterCount=2,  Text=in 
SpeakProgress: AudioPosition=00:00:05.2521250, CharacterPosition=71, CharacterCount=10,  Text=Washington 
SpeakProgress: AudioPosition=00:00:06.2961875, CharacterPosition=83, CharacterCount=6,  Text=online 
SpeakProgress: AudioPosition=00:00:07.0540625, CharacterPosition=90, CharacterCount=3,  Text=and 
SpeakProgress: AudioPosition=00:00:07.3331250, CharacterPosition=94, CharacterCount=2,  Text=at 
SpeakProgress: AudioPosition=00:00:07.6818750, CharacterPosition=97, CharacterCount=3,  Text=one 
SpeakProgress: AudioPosition=00:00:08.0598750, CharacterPosition=101, CharacterCount=2,  Text=of 
SpeakProgress: AudioPosition=00:00:08.2163750, CharacterPosition=104, CharacterCount=4,  Text=five 
SpeakProgress: AudioPosition=00:00:08.5971875, CharacterPosition=109, CharacterCount=5,  Text=local 
SpeakProgress: AudioPosition=00:00:09.0243750, CharacterPosition=115, CharacterCount=6,  Text=stores 
SpeakProgress: AudioPosition=00:00:10.5325625, CharacterPosition=123, CharacterCount=4,  Text=Made 
SpeakProgress: AudioPosition=00:00:10.7700625, CharacterPosition=128, CharacterCount=2,  Text=in 
SpeakProgress: AudioPosition=00:00:10.9377500, CharacterPosition=131, CharacterCount=10,  Text=Washington 
SpeakProgress: AudioPosition=00:00:11.6708125, CharacterPosition=142, CharacterCount=10,  Text=chocolates 
SpeakProgress: AudioPosition=00:00:12.9798750, CharacterPosition=154, CharacterCount=9,  Text=bountiful 
SpeakProgress: AudioPosition=00:00:13.6303125, CharacterPosition=164, CharacterCount=4,  Text=gift 
SpeakProgress: AudioPosition=00:00:14.0959375, CharacterPosition=169, CharacterCount=7,  Text=baskets 
SpeakProgress: AudioPosition=00:00:14.7848125, CharacterPosition=177, CharacterCount=3,  Text=and 
SpeakProgress: AudioPosition=00:00:15.0507500, CharacterPosition=181, CharacterCount=9,  Text=ornaments 
SpeakProgress: AudioPosition=00:00:15.7195000, CharacterPosition=191, CharacterCount=3,  Text=are 
SpeakProgress: AudioPosition=00:00:15.9872500, CharacterPosition=195, CharacterCount=3,  Text=the 
SpeakProgress: AudioPosition=00:00:16.1488750, CharacterPosition=199, CharacterCount=7,  Text=perfect 
SpeakProgress: AudioPosition=00:00:16.7275000, CharacterPosition=207, CharacterCount=7,  Text=holiday 
SpeakProgress: AudioPosition=00:00:17.3336875, CharacterPosition=215, CharacterCount=5,  Text=gifts 
SpeakProgress: AudioPosition=00:00:17.9813125, CharacterPosition=221, CharacterCount=3,  Text=for 
SpeakProgress: AudioPosition=00:00:18.2216875, CharacterPosition=225, CharacterCount=6,  Text=family 
SpeakProgress: AudioPosition=00:00:19.0973750, CharacterPosition=233, CharacterCount=7,  Text=friends 
SpeakProgress: AudioPosition=00:00:19.7726250, CharacterPosition=241, CharacterCount=3,  Text=and 
SpeakProgress: AudioPosition=00:00:19.9655625, CharacterPosition=245, CharacterCount=10,  Text=co-workers 
SpeakProgress: AudioPosition=00:00:20.2518750, CharacterPosition=245, CharacterCount=10,  Text=co-workers 

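However, the resulting .wav file is only 15.69 seconds long, while the last reported AudioPosition is 20.25 seconds. The same behavior occurs when the output goes to a Stream or to null.

The documentation for the property says it is "a TimeSpan object indicating the time location of the event in the audio output stream".

Is it supposed to be an accurate time at which the word starts or finishes being spoken in the output file, or am I misunderstanding it?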


Which voice is selected? – Ahmad 2015-12-10 19:18:03

Answer


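AudioPosition depends on the voice selected for the speech synthesizer. For some Microsoft voices, such as Anna, Zira, David, and Hazel, the supported audio format is 16000 Hz PCM, as far as I know. So the following change corrects the audio position: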

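// 16000 Hz, 16-bit, mono PCM: 32000 bytes per second, block align of 2 bytes.
var format = new System.Speech.AudioFormat.SpeechAudioFormatInfo(
    System.Speech.AudioFormat.EncodingFormat.Pcm,
    16000, 16, 1, 32000, 2, null);
synthesizer.SetOutputToWaveFile("Test.wav", format);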

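Note that the default sample rate of SetOutputToWaveFile is 22050 Hz, and the ratio of the correct duration (15.69 s) to the time reported by AudioPosition (20.25 s) is about 0.77. Multiplying 22050 by that ratio gives roughly 17000, which is close to 16000, the correct sample rate.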
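As a rough check of that reasoning, here is a minimal sketch that rescales the last reported AudioPosition from 22050 Hz down to 16000 Hz and prints the duration ratio. The names AudioPositionCheck and Rescale are made up for illustration. It assumes the reported position is off purely by the ratio of the two sample rates, which is what this answer suggests but does not prove.

```csharp
using System;

public static class AudioPositionCheck
{
    // Hypothetical helper: converts a position that was computed as if the
    // audio ran at assumedRate into the position at actualRate.
    public static TimeSpan Rescale(TimeSpan reported, int assumedRate, int actualRate)
    {
        return TimeSpan.FromTicks(reported.Ticks * actualRate / assumedRate);
    }

    public static void Main()
    {
        // Last AudioPosition from the question's output.
        TimeSpan lastReported = TimeSpan.Parse("00:00:20.2518750");

        // Assumption: events were timed at 22050 Hz while the voice renders at 16000 Hz.
        Console.WriteLine("Rescaled position: " + Rescale(lastReported, 22050, 16000));

        // Ratio of the real file duration (15.69 s) to the last reported position.
        Console.WriteLine("Duration ratio: " + (15.69 / lastReported.TotalSeconds));
    }
}
```

If the assumption holds, the rescaled value should land near the real length of the file, allowing for any trailing silence after the last word. Setting the output format explicitly, as shown above, is still the cleaner fix because it avoids the mismatch in the first place.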