2012-11-20 61 views
2

所以,我在Matlab中實現了EM算法,但是我的矩陣很快就會被NaNInf值所污染。我認爲這可能是由矩陣反演造成的,但我不確定這是唯一的原因。Matlab NaN和Inf問題

下面是代碼:

function [F, Q, R, x_T, P_T] = em_algo(y, G) 
    % y_t = G_t'*x_t + v_t 1*1 = 1*p p*1 
    % x_t = F*x_t-1 + w_t  p*1 = p*p p*1 
    % G is T*p 
    p = size(G,2); % p = nb assets ; G = T*p 
    q = size(y,2); % q = nb observations ; y = T*q 
    T = size(y,1); % y is T*1 
    F = eye(p); % = Transition matrix p*p 
    Q = eye(p); % innovation (v) covariance matrix p*p 
    R = eye(q); % noise (w) covariance matrix q x q 
    x_T_old = zeros(p,T); 
    mu0 = zeros(p,1); 
    Sigma = eye(p); % Initial state covariance matrix p*p 
    converged = 0; 
    i = 0; 
    max_iter = 60; % only for testing purposes 
    while ~converged 
     if i > max_iter 
      break; 
     end 
     % E step = smoothing 
     fprintf('Iteration %d\n',i); 
     [x_T,P_T,P_Tm2] = smoother(G,F,Q,R,mu0,Sigma,y); 
     %x_T 

     % M step 
     A = zeros(p,p); 
     B = zeros(p,p); 
     C = zeros(p,p); 
     R = eye(q); 

     for t = 2:T % eq (9) in EM paper 
      A = A + (P_T(:,:,t-1) + (x_T(:,t-1)*x_T(:,t-1)')); 
     end 

     for t = 2:T % eq (10) 
      %B = B + (P_Tm2(:,:,t-1) + (x_T(:,t)*x_T(:,t-1)')); 
      B = B + (P_Tm2(:,:,t) + (x_T(:,t)*x_T(:,t-1)')); 
     end 

     for t = 1:T %eq (11) 
      C = C + (P_T(:,:,t) + (x_T(:,t)*x_T(:,t)')); 
     end  

     F = B*inv(A); %eq (12) 
     Q = (1/T)*(C - (B*inv(A)*B')); % eq (13) pxp 

     for t = 1:T 
      bias = y(t) - (G(t,:)*x_T(:,t)); 
      R = R + ((bias*bias') + (G(t,:)*P_T(:,:,t)*G(t,:)')); 
     end 
     R = (1/T)*R; 

     if i>1    
      err = norm(x_T-x_T_old)/norm(x_T_old); 
      if err < 1e-4 
       converged = 1; 
      end    
     end 
     x_T_old = x_T; 
     i = i+1; 
    end 
    fprintf('EM algorithm iterated %d times\n',i); 
end 

這個迭代,直到收斂在每次迭代(從未因發生在我的問題),並調用smoother.m

function [x_T, P_T, P_Tm2] = smoother(G,F,Q,R,mu0,Sigma,y) 
    % G is T*p 
    p = size(mu0,1); % mu0 is p*1 
    T = size(y,1); % y is T*1 
    J = zeros(p,p,T); 
    K = zeros(p,T); % gain matrix 
    x = zeros(p,T); 
    x(:,1) = mu0; 
    x_m1 = zeros(p,T); 
    x_T = zeros(p,T); % x values when we know all the data 
    % Notation : x = xt given t ; x_m1 = xt given t-1 (m1 stands for minus 
    % one) 
    P = zeros(p,p,T);% array of cov(xt|y1...yt), eq (6) in Shumway & Stoffer 1982 
    P(:,:,1) = Sigma; 
    P_m1 = zeros(p,p,T); % Same notation ; = cov(xt, xt-1|y1...yt) , eq (7) 
    P_T = zeros(p,p,T); 
    P_Tm2 = zeros(p,p,T); % cov(xT, xT-1|y1...yT) 

    for t = 2:T %starts at t = 2 because at each time t we need info about t-1 
     x_m1(:,t) = F*x(:,t-1); % eq A3 ; pxp * px1 = px1 
     P_m1(:,:,t) = (F*P(:,:,t-1)*F') + Q; % A4 ; pxp * pxp = pxp 

     if nnz(isnan(P_m1(:,:,t))) 
      error('NaNs in P_m1 at time t = %d',t); 
     end 
     if nnz(isinf(P_m1(:,:,t))) 
      error('Infs in P_m1 at time t = %d',t); 
     end 

     K(:,t) = P_m1(:,:,t)*G(t,:)'*pinv((G(t,:)*P_m1(:,:,t)*G(t,:)') + R); %A5 ; pxp * px1 * 1*1 = p*1 
     %K(:,t) = P_m1(:,:,t)*G(t,:)'/((G(t,:)*P_m1(:,:,t)*G(t,:)') + R); %A5 ; pxp * px1 * 1*1 = p*1 

     % The matrix inversion seems to generate NaN values which quickly 
     % contaminate all the other matrices. There is no warning about 
     % (close to) singular matrices or whatever. The use of pinv() 
     % instead of inv() seems to solve the problem... but I don't think 
     % it's the appropriate way to deal with it, there must be something 
     % wrong elsewhere 

     if nnz(isnan(K(:,t))) 
      error('NaNs in K at time t = %d',t); 
     end 


     x(:,t) = x_m1(:,t) + (K(:,t)*(y(t)-(G(t,:)*x_m1(:,t)))); %A6 
     P(:,:,t) = P_m1(:,:,t) - (K(:,t)*G(t,:)*P_m1(:,:,t)); %A7 
    end 

    x_T(:,T) = x(:,T); 
    P_T(:,:,T) = P(:,:,T); 

    for t = T:-1:2 % we stop at 2 since we need to use t-1 
     %P_m1 seem to get really huge (x10^22...), might lead to "Inf" 
     %values which in turn might screw pinv() 

     %% inv() caused NaN value to appear, pinv seems to solve the issue 

     J(:,:,t-1) = P(:,:,t-1)*F'*pinv(P_m1(:,:,t)); % A8 pxp * pxp * pxp 
     %J(:,:,t-1) = P(:,:,t-1)*F'/(P_m1(:,:,t)); % A8 pxp * pxp * pxp 
     x_T(:,t-1) = x(:,t-1) + J(:,:,t-1)*(x_T(:,t)-(F*x(:,t-1))); %A9 % Becomes NaN during 8th iteration! 
     P_T(:,:,t-1) = P(:,:,t-1) + J(:,:,t-1)*(P_T(:,:,t)-P_m1(:,:,t))*J(:,:,t-1)'; %A10 

     nans = [nnz(isnan(J)) nnz(isnan(P_m1)) nnz(isnan(F)) nnz(isnan(x_T)) nnz(isnan(x_m1))]; 
     if nnz(nans) 
      error('NaN invasion at time t = %d',t); 
     end 
    end 

    P_Tm2(:,:,T) = (eye(p) - K(:,T)*G(T,:))*F*P(:,:,T-1); % %A12 

    for t = T:-1:3 % stop at 3 because use of t-2 
     P_Tm2(:,:,t-1) = P_m1(:,:,t-1)*J(:,:,t-2)' + J(:,:,t-1)*(P_Tm2(:,:,t)-F*P(:,:,t-1))*J(:,:,t-2)'; % A11 
    end 
end 

NaN S和Inf S啓動突然出現在〜8 th迭代中。

我想在那裏我正在做一些邪惡的東西與我的矩陣,但我真的不知道什麼是錯的。我相信你的專業知識。

在此先感謝您的幫助。


羅迪: 這是我如何生成的數據(這不是「真實世界」的數據還沒有,只是產生的檢查,沒有什麼一些測試數據去wront):

T = 500; 
nbassets = 3; 
G = .1 + randn(T,nbassets); % random walk trajectories 
y = (1:T).'; 
y = 1.01.^y; % 1 * T % Exponential 1% returns curve 

丹: 你是對的。我確實缺乏數學背景來真正理解公式是如何派生的。我知道這沒有幫助,但我不確定我可以暫時補救。 :/


Rody:的確,我得出了同樣的結論。但我真的不知道是什麼讓它出現這樣的錯誤。

下面是本文的鏈接: http://www.stat.pitt.edu/stoffer/em.pdf

爲平滑的公式都在最後的附錄。感謝您的時間到目前爲止。

+2

你可以也應該直接在你的問題中嵌入代碼。 – slayton

+7

你可能想用'dbstop'運行它,如果naninf' 這樣你可以很快看到它第一次發生的情況,並且引用它爲什麼發生。 –

+4

編輯你的問題時,我看到一個'inv(A)'彈出幾次。永遠不會*,永遠不會*,永遠**,永遠不會***直接使用'inv()'!它很慢,不準確,不健壯等等。使用'LU'分解和/或反斜槓運算符。 –

回答

1

當用戶似乎已經插入到回答他的問題,我將它張貼在這裏:

正如@Rody提到的問題的原因是,使用inv創建NaNInf值。

用戶'解決'通過使用pinv來代替。