iPhone: Accurately recognizing a human voice

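I am developing an application in which I need to recognize a human voice (a baby's cry, to be precise). I referred to the following articles for recording sound from the iPhone microphone and sampling it: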

http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/
http://developer.apple.com/library/ios/#samplecode/aurioTouch/Introduction/Intro.html
http://developer.apple.com/library/ios/#samplecode/SpeakHere/Introduction/Intro.html

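...but I can't figure out how to accurately distinguish a human voice from any other sound. Any help or sample code would be much appreciated.

So far I have written the following code: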

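- (void)levelTimerCallback:(NSTimer *)timer {
    // Refresh the recorder's metering values before reading them.
    [recorder updateMeters];
    const double ALPHA = 0.05; // smoothing factor: smaller values smooth more
    // peakPowerForChannel: returns decibels (0 dB = full scale); convert to a linear level in 0..1.
    double peakPowerForChannel = pow(10, (0.05 * [recorder peakPowerForChannel:0]));
    lowPassResults = ALPHA * peakPowerForChannel + (1.0 - ALPHA) * lowPassResults;
    // Despite the log label, lowPassResults is a smoothed loudness level, not a frequency.
    NSLog(@"frequency: %f", lowPassResults);
    NSLog(@"Average input: %f Peak input: %f", [recorder averagePowerForChannel:0], [recorder peakPowerForChannel:0]);
    // Fires while the smoothed level is below 0.95; use > instead to trigger on loud input.
    if (lowPassResults < 0.95) {
        [self playSound];
    }
}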

Thanks.

Source

2011-04-21 applefreak

Aha... I forgot to attach my code. :) It is the same levelTimerCallback: method shown in the question above. – applefreak 2011-04-26 10:17:27

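I am basically using the code given at http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/. Based on Andrew's response, I don't think I will be able to recognize a baby's cry. Is there any way to find out what value lowPassResults in my code above takes for a baby's cry? Is there any document that describes different sounds in terms of their frequency/amplitude? – applefreak 2011-04-26 10:23:01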

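This is a very hard problem. Speech recognition is a complex subject that not even large companies have gotten right. One suggestion is to sample the sound and check whether it falls within a certain high pitch range. Beyond that, you will need to read up on speech recognition theory.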

As this answer shows, it is not something the iPhone SDK provides, so there won't be a simple answer.
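The suggestion in this answer, sampling the sound and checking its pitch range, could be sketched roughly as follows. This is a minimal sketch and not a cry detector: it assumes you already have raw mono float samples (for instance from an Audio Queue or Audio Unit callback, as in the SpeakHere and aurioTouch samples linked in the question), and it uses a naive time-domain autocorrelation, which is only one of several ways to estimate pitch. The names autocorr_at and estimate_pitch_hz and the 0.5 periodicity threshold are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of samples[i] * samples[i + lag] over the first `window` samples. */
static double autocorr_at(const float *samples, size_t window, size_t lag)
{
    double sum = 0.0;
    for (size_t i = 0; i < window; i++) {
        sum += (double)samples[i] * (double)samples[i + lag];
    }
    return sum;
}

/* Estimate the pitch of a buffer by finding the shortest lag, within the
   lags corresponding to [min_hz, max_hz], at which the autocorrelation
   (divided by the energy of the unshifted window) has a local peak above
   a threshold. Returns the pitch in Hz, or 0.0 if the buffer is too
   short, silent, or shows no clear periodicity in that range. */
static double estimate_pitch_hz(const float *samples, size_t count,
                                double sample_rate,
                                double min_hz, double max_hz)
{
    assert(samples != NULL && sample_rate > 0.0);
    assert(min_hz > 0.0 && max_hz > min_hz);

    const double min_score = 0.5; /* assumed threshold; tune on real data */
    size_t min_lag = (size_t)(sample_rate / max_hz);
    size_t max_lag = (size_t)(sample_rate / min_hz);
    if (min_lag < 1) {
        min_lag = 1;
    }
    /* Need at least three candidate lags and two periods of signal. */
    if (max_lag < min_lag + 2 || count < 2 * max_lag) {
        return 0.0;
    }

    size_t window = count - max_lag; /* same number of terms for every lag */
    double energy = autocorr_at(samples, window, 0);
    if (energy <= 0.0) {
        return 0.0; /* silence */
    }

    double prev = autocorr_at(samples, window, min_lag) / energy;
    double cur = autocorr_at(samples, window, min_lag + 1) / energy;
    for (size_t lag = min_lag + 1; lag < max_lag; lag++) {
        double next = autocorr_at(samples, window, lag + 1) / energy;
        if (cur >= min_score && cur > prev && cur >= next) {
            return sample_rate / (double)lag; /* first strong local peak */
        }
        prev = cur;
        cur = next;
    }
    return 0.0;
}
```

A caller would then compare the returned value against a band measured from its own recordings of the target sound. Taking the first strong peak rather than the global maximum is a deliberate choice to reduce octave-down errors, but on noisy input it can still misfire, and a pitch match alone does not distinguish a cry from other sounds with a similar pitch.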

Source

2011-04-21 14:40:07 OrangeAlmondSoap

Thanks, Andrew. I don't need exact voice recognition; if I could find out the frequency/amplitude of a baby's cry, that would be fine too. – applefreak 2011-04-26 10:38:21
