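Does the MFCC feature extraction result matrix have negative values?
I am implementing a speech recognizer using MFCC feature extraction, and I am stuck on the HMM implementation. I am using the Kevin Murphy Toolbox for HMM. My MFCC result matrix contains negative values. Is that expected, or could my MFCC code be wrong? This is the error I am getting: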
Attempted to access obsmat(:,-39.5403); index must be a positive integer or logical.
Error in multinomial_prob (line 19)
B(:,t) = obsmat(:, data(t));
Error in dhmm_em>compute_ess_dhmm (line 103)
obslik = multinomial_prob(obs, obsmat);
Error in dhmm_em (line 47)
[loglik, exp_num_trans, exp_num_visits1, exp_num_emit] = ...
Error in speechreco (line 77)
[LL, prior2, transmat2, obsmat2] = dhmm_em(dtr{1}, prior, A, B, 'max_iter', 5);
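For context on the trace above: the failing line `B(:,t) = obsmat(:, data(t))` uses each observation as a column index into `obsmat`, so the discrete-HMM routine needs positive-integer symbols rather than real-valued features such as -39.5403. One common way to get such symbols is to map every feature vector to the index of its nearest codeword. The following is only a minimal sketch of that idea. The matrix `feat`, the `codebook`, and all values are made up for illustration and are not taken from the toolbox or from the code in this post.

```matlab
% Sketch: turn real-valued feature columns into positive-integer symbols.
% 'feat' stands in for an MFCC matrix (coefficients x frames); values are made up.
feat = [-39.5 -12.0  3.2  40.1;
          1.5  -0.7  2.2  -3.0];

% Hypothetical codebook: two of the frames are reused as codewords purely for
% illustration; a real codebook would be trained on the training data.
codebook = feat(:, [1 4]);

nFrames = size(feat, 2);
symbols = zeros(1, nFrames);
for t = 1:nFrames
    % Squared Euclidean distance from frame t to every codeword
    diffs = codebook - repmat(feat(:, t), 1, size(codebook, 2));
    [~, symbols(t)] = min(sum(diffs.^2, 1));   % index of the nearest codeword
end
disp(symbols);   % every entry is an integer in 1..size(codebook, 2)
```

A symbol sequence of this kind can be used as an index in the way the trace shows, whereas the raw MFCC matrix cannot.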
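Also, if anyone knows of a link to any MATLAB source code for HMMs, please share it, as I am stuck on my final project. I want to implement speech recognition and do not know what to do after extracting the feature vectors.
This is the whole MATLAB code (I am using the Kevin Murphy HMM toolbox, and the error occurs in the dhmm_em function):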
function []=speechreco()
vtr = cell(1, 8); fstr = cell(1, 8); nbtr = cell(1, 8);
ctr = cell(1, 8);
for i = 1:8
    % Read audio data from train folder for performing operations
    st = strcat('train\s', num2str(i), '.wav');
    % st is the filename; s1 is the sample data, fs1 is the sampling rate in Hz,
    % nb1 is the number of bits per sample
    [s1, fs1, nb1] = wavread(st);
    vtr{i} = s1; fstr{i} = fs1; nbtr{i} = nb1;
    ctr{i} = mfcc(vtr{i}, fstr{i});
end
display(ctr{1}); %MFCC matrix 20*129
W1 = transpose(ctr{1});
ch1=menu('Mel Space:','Signal 1','Signal 2','Signal 3',...
'Signal 4','Signal 5','Signal 6','Signal 7','Signal 8','Exit');
if ch1~=9
plot(linspace(0, (fstr{ch1}/2), 129), (melfb(20, 256, fstr{ch1})));
title('Mel-Spaced-Filterbank');
xlabel('Frequency[Hz]');
end
% The error is raised by this call (prior, A and B are not defined in this listing)
[LL, prior2, transmat2, obsmat2] = dhmm_em(ctr{1}, prior, A, B, 'max_iter', 5);
plot(LL);
end
%%mfcc
%old one MFCC now
function r = mfcc(s, fs)
m = 100;
n = 256;
frame = blockFrames(s, fs, m, n); % windowed FFT of each frame
m = melfb(20, n, fs);
n2 = 1 + floor(n/2);
z = m * abs(frame(1:n2, :)).^2; % apply the triangular mel filterbank to the power spectrum
r = dct(log(z)); % take the log, then the DCT
end
%% blockFrames Function
% blockFrames: Puts the signal into frames
%
% Inputs: s contains the signal to analyze
% fs is the sampling rate of the signal
% m is the distance between the beginnings of two frames
% n is the number of samples per frame
%
% Output: M3 is a matrix containing all the frames
function M3 = blockFrames(s, fs, m, n)
l = length(s);
nbFrame = floor((l - n)/m) + 1;
M = zeros(n, nbFrame); % preallocate: one frame per column
for i = 1:n
    for j = 1:nbFrame
        M(i, j) = s(((j - 1) * m) + i);
    end
end
h = hamming(n);
M2 = diag(h) * M; % apply a Hamming window to every frame
M3 = fft(M2); % fft works column-wise, so this is one FFT per frame
end
%--------------------------------------------------------------------------
function m = melfb(p, n, fs) %used for graph plot of power spectra
% MELFB Determine matrix for a mel-spaced filterbank
%
% Inputs: p number of filters in filterbank
% n length of fft
% fs sample rate in Hz
%
% Outputs: m a (sparse) matrix containing the filterbank amplitudes
% size(m) = [p, 1+floor(n/2)]
%
% Usage: For example, to compute the mel-scale spectrum of a
% column-vector signal s, with length n and sample rate fs:
%
% f = fft(s);
% m = melfb(p, n, fs);
% n2 = 1 + floor(n/2);
% z = m * abs(f(1:n2)).^2;
%
% z would contain p samples of the desired mel-scale spectrum
%%%%%%%%%%%%%%%%%%
%
f0 = 700/fs;
fn2 = floor(n/2);
lr = log(1 + 0.5/f0)/(p+1);
% convert to fft bin numbers with 0 for DC term
bl = n * (f0 * (exp([0 1 p p+1] * lr) - 1));
b1 = floor(bl(1)) + 1;
b2 = ceil(bl(2));
b3 = floor(bl(3));
b4 = min(fn2, ceil(bl(4))) - 1;
pf = log(1 + (b1:b4)/n/f0)/lr;
fp = floor(pf);
pm = pf - fp;
r = [fp(b2:b4) 1+fp(1:b3)];
c = [b2:b4 1:b3] + 1;
v = 2 * [1-pm(b2:b4) pm(1:b3)];
m = sparse(r, c, v, p, 1+fn2);
end
%----------------------------------------------------------------------
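On the question of negative values: the mfcc function above computes `r = dct(log(z))`, and the natural log of any filterbank energy below 1 is negative. Negative entries on their own therefore do not show that the MFCC code is wrong. The following minimal sketch illustrates only the log step, with made-up energies rather than real data:

```matlab
% Made-up filterbank energies (rows: filters, columns: frames); not real data.
z = [0.5  2.0;
     0.01 4.0];
logz = log(z);                    % same log step as in mfcc, before the DCT
hasNegative = any(logz(:) < 0);   % true whenever some energy is below 1
disp(logz);
disp(hasNegative);
```

Because the DCT is a linear combination with both positive and negative weights, its output can be negative even when every log energy is positive.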
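I have now added the whole code. Please let me know if you have any questions. Could you also share MATLAB code for word recognition with HMMs, so that I can use it as a reference and develop this discrete isolated-word model code further into a continuous model? – user3489201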