2017-05-29 171 views
2

我對線性模型y = X * beta + eps進行了大小(X)= [n d]的模擬研究。 我基於兩種方法考慮維度d的影響。 我運行10個模擬數據並獲得相應的beta估計,然後我想計算10個模擬數據中beta的平均值。matlab中的單元陣列均值

我的玩具MATLAB代碼如下:

 nsim=10; %iteration number 
     dd=[4 6]; %two dimension cases,\beta=(\beta_1,\cdots,\beta_d)^T 
     ddlen=length(dd); 
     nmethod=2; %two methods 
     seednum=0; 

     BH = cell(nsim,ddlen,nmethod); %estimation of beta w.r.t two dimension cases and two methods 

     for di = 1:ddlen 
      d = dd(di); 
      for ni = 1:nsim 
       seednum = seednum + di*ni; 
       randn('seed', seednum); 
       betahat=randn(d,1); 
       for method = 1:nmethod 
        if method==1 
         BH{ni,di,method} = betahat; 
        else 
         BH{ni,di,method} = 10*betahat; 
        end 
       end 
      end 
     end 

然後我們就可以得到

BH(:,:,1) = 

    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 


BH(:,:,2) = 

    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 
    [4x1 double] [6x1 double] 

我想在10個行平均值(NSIM = 10),並得到類似

mean(BH(:,:,1))= 

    [4x1 double] [6x1 double] 

mean(BH(:,:,2)) = 

    [4x1 double] [6x1 double] 

有沒有想法?謝謝!

+0

感謝@ EBH。但你的回答並不是我想要的。回報應該是兩個向量,一個是[4x1 double],另一個是[6x1 double],換句話說,分別是10 [4x1 double]的平均值和10 [6x1 double]的平均值。 –

+3

如果所有數組的大小相同,爲什麼使用單元陣列? – beaker

+0

爲什麼你在每個循環中設置一個新的隨機種子? – EBH

回答

0

我不知道這是否是最有效的方式不這樣做,但你可以使用arrayfun爲:

% generate random array 
BH = repmat({rand(4,1),rand(6,1)},[10 1 2]); 
% generate indexes for the 2nd and 3rd dimensions 
[n2,n1] = meshgrid(1:size(BH,2),1:size(BH,3)); 
% get mean across 1st (cell) dimension 
[res] = arrayfun(@(n1,n2)mean([BH{:,n1,n2}],2),n1(:),n2(:),'UniformOutput',false); 
% reshape to desired output 
res = reshape(res,[1 size(BH,2) size(BH,3)]); 

如果你想推廣到N維單元陣列:

% generate random array 
BH = repmat({rand(4,1),rand(6,1)},[10,1,2,2,5]); 
sz = size(BH); 
% generate indexes for the 2nd and 3rd dimensions 
n = cell(1,numel(sz) - 1); 
[n{:}] = ndgrid(1:sz(2),1:sz(3),1:sz(4),1:sz(5)); 
n = cell2mat(cellfun(@(x) {x(:)},n)); 
idx = 1:size(n,1); 
% get mean across 1st (cell) dimension 
[res] = arrayfun(@(idx)mean([BH{:,n(idx,1),n(idx,2),n(idx,3),n(idx,4)}],2),... 
    idx,'UniformOutput',false); 
% reshape to desired output 
res = reshape(res,[1 sz(2:end)]); 
+0

它的工作原理!感謝@ user2999345。好吧,我覺得按照你的方法做一些泛化是有點複雜的。考慮一下'BH = repmat({rand(4,1),rand(6,1)},[10,1,2,2,5]);'即10次迭代,2個維數,2個噪聲水平和5線性模型的候選方法y = X * beta + eps,其中eps〜N(0,\ sigma^2 * I)和\ sigma是噪音水平 –

+0

檢查我的編輯,我希望它回答您的要求 – user2999345

+0

哇,你的新編輯答案太棒了!非常感謝! @ user2999345 –

0

如果我明白你的意思,你想對所有在向量中相同位置的元素取平均值。因此,從BH(:,1,1)中的所有向量中,我們獲得4個平均值的一個向量,每個向量用於向量中的一個位置。這同樣適用於BH(:,1,2)。對於BH(:,2,1)BH(:,2,1),我們做同樣的事情,但在向量中有6個元素。

您可以使用下面的代碼:

% split BH to 2 arrays: 
bh4 = reshape(cell2mat(BH(:,1,:)),[],nsim,2); % all the 4 elements vectors 
bh6 = reshape(cell2mat(BH(:,2,:)),[],nsim,2); % all the 6 elements vectors 
meanBH4 = squeeze(mean(bh4,2)); % mean over all 4 element vectors 
meanBH6 = squeeze(mean(bh6,2)); % mean over all 6 element vectors 

然而,一步一步在做的正確方法是定義兩個數組,每個方法:

BH1 = zeros(nsim,ddlen,dd(1)); 
BH2 = zeros(nsim,ddlen,dd(2)); 

然後在你的循環中爲它們賦值:

if method==1 
    BH1(ni,di,:) = betahat; 
else 
    BH2(ni,di,:) = 10*betahat; 
end 

並且最後只取其中的平均值:

meanBH1 = mean(BH1,3) 
meanBH2 = mean(BH1,3) 

編輯:

爲了寫這一切,更多的 'Matlabish' 的方式,我建議如下:

nsim = 10; % iteration number 
dd = [4 6]; % two dimension cases,\beta=(\beta_1,\cdots,\beta_d)^T 
methods = 2; % two methods 

% preapering random seeds 
s = bsxfun(@times,1:numel(dd),(1:nsim).'); 
seednum = cumsum(s(:)); 

% initialize results array 
BH = nan(max(dd),nsim,numel(dd),methods); 
counter = 1; 
for k = 1:numel(dd) 
    for n = 1:nsim 
     % set a new random seed from the list: 
     rng(seednum(counter)); 
     % compute all betahats with this seed: 
     betahat = randn(max(dd),2).*repmat([1 10],[max(dd) 1]); 
     % assign the values to BH by dimension: 
     for m = 1:methods 
      BH(1:dd(k),n,k,m) = betahat(1:dd(k),m); 
     end 
     counter = counter+1; 
    end 
end 
% compute the means over iterations: 
means = squeeze(mean(BH,2,'omitnan')) 

,並讓你獲得means爲你的結果。


P.S.我不知道爲什麼你每次迭代,besides that's not a recommended syntax打電話randn('seed', seednum),但如果你能刪除它,那麼你可以向量化大部分的循環,你的代碼擠壓到:

% compute all betahats: 
betahat = randn(nsim,max(dd),numel(dd),2); 
% apply dimensions: 
for k = dd 
    betahat(:,k+1:end,1,:) = nan; 
end 
% apply methos 2: 
betahat(:,:,:,2) = betahat(:,:,:,2)*10; 

% compute the means over iterations: 
means = squeeze(mean(betahat,1,'omitnan')) 

希望事情看起來更清晰了。 ..

+0

Thanks @ EBH。但你的回答並不是我想要的。回報應該是兩個向量,一個是[4x1 double],另一個是[6x1 double],換句話說,分別是10 [4x1 double]的平均值和10 [6x1 double]的平均值。 –

+0

@JohnStone你在問題中描述的是1 * 2 * 2輸出,而不是4 * 1和6 * 1輸出的向量。不過,請參閱我的編輯。 – EBH

+0

謝謝@ EBH。那麼,如果我考慮一個更復雜的設置,比如'BH = repmat({rand(4,1),rand(6,1)},[10,1,2,2,5]);'即10次迭代,線性模型y = X * beta + eps,其中eps_N(0,\ sigma^2 * I)和\ sigma是噪聲水平的2個參數,2個噪聲水平和5個候選方法。我該做什麼? –

0

另外,

% split into seperate cell arrays 
BH_1 = BH(:,:,1); 
BH_2 = BH(:,:,2); 

% create matrix of compatible vectors, and take mean and put result back into cell array 
BH_1_mean = cat(2,{mean(cell2mat(BH_1(:,1)'),2)}, {mean(cell2mat(BH_1(:,2)'),2)}); 
BH_2_mean = cat(2,{mean(cell2mat(BH_2(:,1)'),2)}, {mean(cell2mat(BH_2(:,2)'),2)}); 
+0

謝謝@ kedarps,它的工作原理! –