Merging two matrices in MATLAB

I have two matrices. One is 1,000,000 x 9 and the other is 500,000 x 9.

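The columns have the same meaning in both matrices: the first 7 columns act as a key, and the last two columns hold the data. Many key values overlap between the two matrices, and I would like one big matrix so that I can compare the values. This big matrix should have dimension 1,000,000 x 11.

For example: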

A = [0 0 0 0 0 0 0 10 20; 0 0 0 0 0 0 1 30 40]; 
B = [0 0 0 0 0 0 0 50 60];

The merged matrix looks like this:

C = [0 0 0 0 0 0 0 10 20 50 60; 0 0 0 0 0 0 1 30 40 0 0];

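As you can see, in the first row of C, columns 8 and 9 come from matrix A and columns 10 and 11 come from matrix B (its columns 8 and 9). The second row takes columns 8 and 9 from matrix A and uses 0, 0 for the last two columns, because matrix B has no corresponding entry.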

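I have accomplished this task in theory, but it is very, very slow. I use loops a lot. In any other programming language, I would sort both tables and iterate over both of them in one big loop, keeping two pointers.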

Is there a more efficient algorithm available in MATLAB using vectorization, or at least one that is efficient enough and idiomatic/short?

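(Additional note: my biggest problem seems to be the search function. Given my matrix, I would like to throw in a 7x1 column vector, let's name it key, to find the corresponding row. Right now, I use bsxfun for that: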

targetRow = data(min(bsxfun(@eq, data(:, 1:7), key), [], 2) == 1, :);

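I use min because the result of bsxfun is a vector of 7 match flags, and of course I want all of them to be true. It seems to me that this might be a bottleneck of the MATLAB algorithm.)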

Source

2017-02-08 IceFire

Semantically, I would always prefer 'all(X, 2)' over 'min(X, [], 2) == 1' for logical arrays. I don't know whether it is faster, though. – Florian
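The two forms mentioned in this comment can be compared directly. The following is only a sketch on made-up data: the matrix sizes, the random values and the choice of row 42 are assumptions for illustration, and key is taken as a 1x7 row vector so that its size lines up with data(:, 1:7). The timings will vary by machine and release, so nothing here claims that one form is faster.

```matlab
% Hypothetical stand-in data: 7 key columns plus 2 data columns
data = [randi([0 2], 100000, 7), rand(100000, 2)];
key  = data(42, 1:7);                      % assumed 1x7 row vector

% N-by-7 logical array of per-column match flags
match = bsxfun(@eq, data(:, 1:7), key);

% Form used in the question
tic; rowsMin = min(match, [], 2) == 1; tMin = toc;

% Form suggested in the comment
tic; rowsAll = all(match, 2); tAll = toc;

% Both forms select the same rows
assert(isequal(rowsMin, rowsAll));
fprintf('min: %.4f s, all: %.4f s\n', tMin, tAll);
```

Since both expressions yield the same logical mask, either can be used to index data; all(match, 2) simply states the intent more directly.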

*"This might be a bottleneck"* - you can actually check whether it is! Try using the 'profiler' to show which lines/parts of your code use the most time. https://uk.mathworks.com/help/matlab/ref/profile.html i.e. 'profile on; ; profile viewer;' – Wolfie
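As a rough illustration of the profiling suggestion, the lookup from the question could be wrapped as follows. The data, the loop count and the variable name stats are made up for this sketch, and key is assumed to be a 1x7 row vector.

```matlab
% Hypothetical stand-in data: 7 key columns plus 2 data columns
data = [randi([0 3], 5000, 7), rand(5000, 2)];
key  = data(1, 1:7);          % assumed 1x7 row vector

profile on                    % start collecting timing data
for k = 1:200                 % repeat the lookup so it shows up in the report
    targetRow = data(min(bsxfun(@eq, data(:, 1:7), key), [], 2) == 1, :);
end
profile off                   % stop collecting

stats = profile('info');      % collected data as a struct
% profile viewer              % uncomment to open the interactive report
```

The report lists time per function and per line, which should show whether the bsxfun lookup really dominates or whether the surrounding loops do.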

Maybe with ismember and some indexing:

% locates in B the last occurrence of each key in A. idxA has logicals of 
% those keys found, and idxB tells us where in B. 
[idxA, idxB] = ismember(A(:,1:7), B(:,1:7),'rows'); 
C = [ A zeros(size(A, 1), 2) ]; 
C(idxA, 10:11) = B(idxB(idxA), 8:9); % idxB(idxA) are the idxB != 0

Source

2017-02-08 10:58:39 ibancg

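This is more concise than my suggestion. 'ismember' is a good shout. Does this allow B to be larger than A? Perhaps a check is needed at the start, with a variable swap if required. Since the code is information dense, commenting it might be helpful. – Wolfie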

It can handle a larger 'B'; the only thing is to specify how duplicate keys should be handled. In my case, it is the last occurrence. – ibancg
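Which occurrence ismember reports for a duplicated key has differed between MATLAB releases (the current documentation describes the second output as the lowest index, while the 'legacy' flag restores the older behavior), so it may be safer to make the choice explicit before the lookup. A minimal sketch, assuming the last occurrence is wanted; the duplicated row in B and the names keep and Blast are invented for illustration:

```matlab
A = [0 0 0 0 0 0 0 10 20];
B = [0 0 0 0 0 0 0 50 60;
     0 0 0 0 0 0 0 70 80];   % hypothetical duplicate key in B

% Keep only the last occurrence of every key in B
[~, keep] = unique(B(:, 1:7), 'rows', 'last');
Blast = B(keep, :);

% Same lookup as in the answer, now against duplicate-free keys
[idxA, idxB] = ismember(A(:, 1:7), Blast(:, 1:7), 'rows');
C = [A zeros(size(A, 1), 2)];
C(idxA, 10:11) = Blast(idxB(idxA), 8:9);
```

Replacing 'last' with 'first' would select the first occurrence instead, so the result no longer depends on how ismember resolves duplicates.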

But 'C' is initialised with the same number of rows as 'A'? – Wolfie
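If keys that occur only in B should also appear in the result, one possible variant is to build the key list with union and look up both matrices against it. This is a sketch rather than the answerer's method: the second row of B and the names keys, inA, locA, inB and locB are made up for illustration, and duplicate keys within one matrix are not handled.

```matlab
A = [0 0 0 0 0 0 0 10 20;
     0 0 0 0 0 0 1 30 40];
B = [0 0 0 0 0 0 0 50 60;
     0 0 0 0 0 0 2 70 80];   % hypothetical row whose key is only in B

% All distinct keys from both matrices, sorted by row
keys = union(A(:, 1:7), B(:, 1:7), 'rows');

% One output row per key; data columns default to 0
C = [keys, zeros(size(keys, 1), 4)];

% Fill columns 8:9 from A and columns 10:11 from B where the key exists
[inA, locA] = ismember(keys, A(:, 1:7), 'rows');
[inB, locB] = ismember(keys, B(:, 1:7), 'rows');
C(inA, 8:9)   = A(locA(inA), 8:9);
C(inB, 10:11) = B(locB(inB), 8:9);
```

For the question as asked, where the result should have as many rows as A, the shorter ismember version in the answer is sufficient.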

I think this does what you want; it has only been tested with your simple example.

% Initial matrices 
A = [0 0 0 0 0 0 0 10 20; 
    0 0 0 0 0 0 1 30 40]; 
B = [0 0 0 0 0 0 0 50 60]; 

% Stack matrices with common key columns, 8&9 or 10&11 for data columns 
C = [[A, zeros(size(A,1),2)]; [B(:,1:7), zeros(size(B,1),2), B(:,8:9)]]; 
% Sort C so that matching key rows will be consecutive 
C = sortrows(C,1:7); 

% Loop through rows 
curRow = 1; 
lastRow = size(C,1); % total row count, so the final pair of rows is compared too 
while curRow < lastRow 
    if all(C(curRow,1:7) == C(curRow+1,1:7)) 
     % If first 7 cols of 2 rows match, take max values (override 0s) 
     % It may be safer to initialise the 0 columns to NaNs, as max will 
     % choose a numeric value over NaN, and it allows your data to be 
     % negative values. 
     C(curRow,8:11) = max(C(curRow:curRow+1, 8:11)); 
     % Remove merged row 
     C(curRow+1,:) = []; 
     % Decrease size counter for matrix 
     lastRow = lastRow - 1; 
    else 
     % Increase row counter 
     curRow = curRow + 1;   
    end 
end

Output:

C = [0  0  0  0  0  0  0 10 20 50 60 
    0  0  0  0  0  0  1 30 40  0  0]
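The NaN suggestion in the code comments above can be combined with a grouped maximum, which avoids deleting rows one at a time. This is only a sketch under the same assumptions as the answer (at most one row per key in each matrix); the names S, keys, g and vals are invented here.

```matlab
A = [0 0 0 0 0 0 0 10 20;
     0 0 0 0 0 0 1 30 40];
B = [0 0 0 0 0 0 0 50 60];

% Stack as in the answer, but pad the missing data columns with NaN
S = [A, nan(size(A, 1), 2);
     B(:, 1:7), nan(size(B, 1), 2), B(:, 8:9)];

% Group identical keys; g(i) is the group number of row i of S
[keys, ~, g] = unique(S(:, 1:7), 'rows');

% Column-wise maximum within each group; max prefers a number over NaN
vals = nan(size(keys, 1), 4);
for c = 1:4
    vals(:, c) = accumarray(g, S(:, 7 + c), [size(keys, 1) 1], @max);
end
C = [keys, vals];
```

Entries with no counterpart stay NaN instead of 0, so negative data values are not overridden; C(isnan(C)) = 0 would restore the zero convention used in the question.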

Source

2017-02-08 11:07:29 Wolfie

Yes, this is a good idea, thanks! I still gave the accepted mark to ibancg because that answer is shorter. Have an upvote, though. – IceFire

@IceFire No worries, thanks – Wolfie
