2013-08-27 86 views
0

I have two columns of data, as shown below, and I want a running mean (binned average) of the data.

ABC = 

4.1103 25.5932 
5.0852 31.2679 
6.0021 15.9020 
5.8495 21.4804 
4.3245 19.9674 
5.9378 38.3452 
6.9460 8.8233 
7.4568 44.7429 
5.7358 32.7608 
5.3510 35.2645 
5.1657 54.6566 
5.1381 44.1870 
4.1566 101.8947 
5.7310 -3.0565 
5.5496 28.3637 
4.5672 -1.7736 
4.5805 11.8384 
4.7948 33.7640 
3.9901 6.0607 
4.4203 17.7308 
4.2712 -1.5834 
4.8808 -2.3123 
5.9004 -0.4623 
5.3929 1.1477 
5.6594 6.9741 
5.5114 11.3982 
5.4715 5.9189 
5.0021 6.2561 
4.1576 10.3207 
6.1025 3.4654 
3.9960 6.6892 
5.6938 3.8429 
5.2416 7.7513 
7.0922 2.6871 
5.3277 14.0617 
6.1350 4.0316 
6.0211 -20.3587 
6.7399 14.0224 
5.0818 102.6360 
5.6444 24.3167 
6.2542 19.8522 
6.2862 24.3430 
5.6452 -6.4020 
5.4561 14.7813 
4.7934 9.4639 
3.8523 32.0766 
3.9878 8.5313 
4.5232 42.0309 
4.2489 -12.0325 
6.0413 -5.5464 
4.9334 -3.2520 
4.1349 20.9038 
4.2329 20.6303 
4.2009 31.8840 
4.0624 48.5402 
4.7674 28.6595 
4.0767 4.7767 
4.0971 34.8460 
3.8442 24.0209 
5.2471 38.8815 
6.0241 59.3785 
6.9743 6.5027 
7.8732 4.5422 
4.3094 68.4340 
4.5601 -4.2946 
4.6140 109.4510 
4.5862 71.8387 
5.2210 66.1310 
4.3835 32.7592 
6.1432 36.3832 
5.4624 13.7891 
5.2129 40.1301 
3.8987 67.2705 
6.6328 15.0286 
8.0786 -7.3078 
4.8968 -6.7754 
4.1200 4.5333 
4.1098 -3.3204 
4.0373 26.4890 
3.8467 48.8121 
7.7795 -2.3606 
6.9553 21.3609 
6.2635 24.4985 
6.1518 -1.4200 
4.9115 11.5784 
5.5908 13.1351 
7.0117 -2.8297 
5.2193 38.6937 
6.0786 16.9453 
6.8229 14.0907 
8.0385 13.6228 
8.6596 -1.4478 
6.3257 8.0361 
6.9223 -14.2179 
3.8337 15.5773 
4.0039 -24.1494 
4.6332 17.9308 
6.3684 11.3398 
5.8592 4.0367 
6.9040 12.1495 
7.8524 -0.0432 
8.3545 10.8865 
9.3946 20.4614 
4.3015 25.9674 
4.4782 21.9045 
4.1994 39.2286 
4.3499 22.1004 
4.3652 33.6220 
4.2026 -5.8153 
5.1330 6.4996 
5.3118 33.7835 
4.2002 -3.1917 
3.8285 32.1016 
3.9485 21.6358 
3.8688 21.7830 
4.0494 24.7914 
4.0869 10.6577 
4.6699 8.4756 
5.1199 11.1885 
5.1831 8.6163 
4.5560 8.2806 
4.4886 4.8017 
4.5618 5.9434 
4.1135 12.8942 
4.1377 22.1423 

I made equal-width bins from 'x' and computed the corresponding binned averages 'yy', as shown below:

x=ABC(:,1); 
y=ABC(:,2); 
counter=1; 
for i=min(x):0.3:max(x) 
    bin= x>i & x<= i+0.3; 
    xbin(counter,1) = mean(x(bin)); 
    yy(counter,1) = mean(y(bin)); 
    counter = counter+1; 
end 

plot(x,y,'ro'); hold on 
plot(xbin,yy,'bo-'); 

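Here, a 'bin' is defined over a fixed range of 'x' (see the for loop). The output contains 'xbin', the mean of 'x' in each bin, and 'yy', the mean of the 'y' data corresponding to 'xbin'. My concern is that each mean 'yy' should come from approximately the same number of data points. If a 'bin' does not contain enough 'y' data points, the mean 'yy' should be NaN. Could someone please help with this? Thanks.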

+0

You should take a look at the histc function. http://www.mathworks.com/help/matlab/ref/histc.html – PeterM

Answers

1

On each iteration of your for loop, check the number of 1s in bin. If that number is below some threshold, assign NaN to yy:

x=ABC(:,1); 
y=ABC(:,2); 
counter=1; 

nbinmin = 5; % this is the threshold 

for i=min(x):0.3:max(x) 
    bin= x>i & x<= i+0.3; 
    xbin(counter,1) = mean(x(bin)); 

    % check if the number of 1s in bin is less than the threshold 
    if nnz(bin) < nbinmin 
     yy(counter,1) = NaN; 
    else 
     yy(counter,1) = mean(y(bin)); 
    end 
    counter = counter+1; 
end 
+0

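@Schorch, thanks. This is what I asked for. I applied it to my problem and it works fine, but after a while there is an error. I'm not good at MATLAB; can you tell me what the problem is? Without the 'threshold' you mentioned the function runs, but with the 'threshold' it gives the error below. It is caused by NaN in the 'x' data. ''Attempted to access xbin(NaN,1); index must be a positive integer or logical. Error in testingPSSwithRO2_Original (line 117) xbin(counter,1)= nanmean(x(bin));'' – Umar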

+0

@user1949014 You should consider removing those rows first: 'x = ABC(:,1); y = ABC(:,2); x = x(~isnan(ABC(:,1))); y = y(~isnan(ABC(:,1)));' Note that you have to use the first column in both 'isnan' calls. – Schorsch
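For what it's worth, the NaN filtering and the threshold check can be combined as in the minimal sketch below. The small matrix is hypothetical stand-in data (not the ABC from the question), the threshold of 2 is chosen only so that the tiny example shows both outcomes, and the names valid, width and edges are introduced here for illustration.

```matlab
% Hypothetical stand-in for ABC; row 3 has a NaN in the first column
ABC = [4.1 25.6; 5.1 31.3; NaN 15.9; 4.3 20.0; 5.9 38.3; 4.2 10.0];

% Use the same mask (built from column 1) for both columns
valid = ~isnan(ABC(:,1));
x = ABC(valid,1);
y = ABC(valid,2);

nbinmin = 2;                  % threshold, arbitrary for this small example
width = 0.3;                  % bin width, as in the question
edges = min(x):width:max(x);  % left edge of each bin

% Preallocate with NaN so bins that fail the check stay NaN
xbin = NaN(numel(edges),1);
yy = NaN(numel(edges),1);

for k = 1:numel(edges)
    % Same rule as the original loop: left edge excluded, right edge included
    bin = x > edges(k) & x <= edges(k) + width;
    xbin(k) = mean(x(bin));
    if nnz(bin) >= nbinmin
        yy(k) = mean(y(bin));
    end
end
```

Here the first bin holds two points, so it keeps its mean, while the bin around 5.1 holds only one point and stays NaN. Note that the 'x > edge' rule leaves out the smallest 'x' value itself, exactly as in the loop from the question.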

1

The question isn't entirely clear, but have you tried the histogram function, hist? It looks like it could do a lot of the work for you:

% choose the bin locations 
xcenters = min(x):0.3:max(x); 

% compute counts in each bin (bin x, since the centers are derived from x) 
[counts, ctrs] = hist(x, xcenters); 

% set any with too few samples to NaN 
count_min = 3; 
counts(counts < count_min) = NaN; 

% plot -- either as a histogram, 
figure(1) 
bar(ctrs, counts) 
%or as a line plot (note that the line won't join up if too many NaN segments) 
figure(2) 
plot(ctrs, counts) 

Here you specify the bin centers as input; to define the bin edges instead, see histc.
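As a rough sketch of the edge-based variant: the second output of histc gives the bin index of every sample, which accumarray can use to average 'y' per bin. The data, the 0.5 bin width and the names idx and ymean are assumptions made up for this example. Newer MATLAB releases recommend histcounts or discretize over histc, but histc is what this thread refers to.

```matlab
% Hypothetical sample data (not the ABC matrix from the question)
x = [4.0 4.1 4.2 4.5 4.6 5.0 5.1 5.2 5.25 6.0]';
y = [10 20 30 40 50 60 70 80 90 100]';

% Bin k covers edges(k) <= x < edges(k+1); the last entry counts x == edges(end)
edges = 4.0:0.5:6.0;
[counts, idx] = histc(x, edges);   % idx(j) is the bin number of x(j)

% Mean of y in each bin; bins without samples are filled with NaN
ymean = accumarray(idx, y, [numel(edges) 1], @mean, NaN);

% Blank out bins with too few samples
count_min = 3;
ymean(counts < count_min) = NaN;
```

With this data only the first and third bins reach three samples, so only those two keep a mean. This assumes every 'x' lies within the edges; histc returns index 0 for values outside, which accumarray would reject.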

1

You are basically looking for a histogram with non-uniform bins and equal counts.

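The simplest case of a non-uniform histogram is to sort the N values of x and split the sorted vector into k bins, so that each bin holds N/k samples (you can also fix the number of samples per bin, c, with N = ck).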

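Instead of linearly spacing the range of x, you split the sorted vector linearly (which gives a non-linear, non-uniform partition of the original range).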

In your case it looks like this:

[sortedX, indeX] = sort(x); 
nVals = length(x); % N 
nBins = nVals/10; % k = N/c 

% linear split of the sorted vector 
stepX = (1:nVals/nBins:nVals); 
stepX = [stepX nVals+1]; % always close the last bin, so the largest sample is never dropped 

% counting and bining on the indexed vector 
for i = 1 : length(stepX)-1  
    bin = indeX(stepX(i):stepX(i+1)-1); 
    xbin(i,1) = mean(x(bin)); 
    yy(i,1) = mean(y(bin)); 

end 

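To compute the actual ranges (i.e. the histogram edges), you can use the midpoint between the max of bin i and the min of bin i+1. You can add something like the following inside your loop: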

% calculate the range 
maxX(i) = max(x(bin)); 
minX(i) = min(x(bin)); 

The desired (non-linear) range is then:

rangeX = [min(x) maxX(1:end-1) + (minX(2:end) - maxX(1:end-1))/2 max(x)]; 

whereas the original (linear) range is:

rangeX_OP = min(x):0.3:max(x); 

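You can use histc to verify the equal counts (for rangeX) and the unequal counts (for rangeX_OP). This is what the counts look like (for a random x with a range similar to yours, and c = 10 counts per bin). The top is the linearly spaced range, the bottom the non-linear one.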

[Figure: bin counts for the linearly spaced range (top) and the non-linear range (bottom)]
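The check with histc could be sketched as follows. The data is a deterministic, hypothetical stand-in (50 distinct values clustered towards the low end), not the data from the question or from the figure, and countsEqual and countsLinear are names introduced here.

```matlab
% Hypothetical data: 50 distinct values in [4, 9), denser near 4
u = mod((1:50)' * 0.6180339887, 1);
x = 4 + 5*u.^2;

c = 10;                          % samples per bin
[~, indeX] = sort(x);
nVals = numel(x);
stepX = [1:c:nVals, nVals+1];    % start index of each bin, plus a closing index
nBins = numel(stepX) - 1;

% Smallest and largest x in each equal-count bin
minX = zeros(1, nBins);
maxX = zeros(1, nBins);
for i = 1:nBins
    bin = indeX(stepX(i):stepX(i+1)-1);
    minX(i) = min(x(bin));
    maxX(i) = max(x(bin));
end

% Edges: midpoints between neighbouring bins, closed off by min(x) and max(x)
rangeX = [min(x), maxX(1:end-1) + (minX(2:end) - maxX(1:end-1))/2, max(x)];
rangeX_OP = min(x):0.3:max(x);

countsEqual = histc(x, rangeX);      % counts for the non-linear edges
countsLinear = histc(x, rangeX_OP);  % counts for the original linear edges
```

One detail to expect: the last entry returned by histc counts only values equal to the last edge, so the largest sample shows up in its own entry and the final bin reports c-1 instead of c.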