2014-02-11
7

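How to quickly get the array of multiplicities

What is the fastest way to take an array A and output both unique(A) [i.e. the set of unique elements of A] and the multiplicity array, whose i-th entry is the number of times the i-th entry of unique(A) occurs in A?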

That's a mouthful, so here is an example. Given A=[1 1 3 1 4 5 3], I want:

  1. unique(A)=[1 3 4 5]
  2. mult = [3 2 1 1]

This can be done with a tedious for loop, but I would like to know whether there is a way that exploits MATLAB's array nature.
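For reference, the loop-based baseline the question alludes to might look like the sketch below. The question does not show its loop, so the structure and the variable names here are assumptions for illustration only.

```matlab
% Loop-based baseline (illustrative sketch, not taken from the question)
A = [1 1 3 1 4 5 3];
uA = unique(A);                 % sorted unique values of A
mult = zeros(size(uA));         % preallocate one count per unique value
for k = 1:numel(uA)
    mult(k) = sum(A == uA(k));  % number of occurrences of uA(k) in A
end
```

Each pass compares all of A against one unique value, so the work grows with numel(A) times numel(uA).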

Answers

7
uA = unique(A); 
mult = histc(A,uA); 

Or:

uA = unique(A); 
mult = sum(bsxfun(@eq, uA(:).', A(:))); 

Benchmark

N = 100; 
A = randi(N,1,2*N); %// size 1 x 2*N 

%// Luis Mendo, first approach 
tic 
for iter = 1:1e3; 
    uA = unique(A); 
    mult = histc(A,uA); 
end 
toc 

%// Luis Mendo, second approach  
tic 
for iter = 1:1e3; 
    uA = unique(A); 
    mult = sum(bsxfun(@eq, uA(:).', A(:))); 
end 
toc 

%'// chappjc 
tic 
for iter = 1:1e3; 
    [uA,~,ic] = unique(A); % uA(ic) == A 
    mult = accumarray(ic(:),1); % ic(:) is a column whatever orientation unique returns
end 
toc 

Results with N = 100

Elapsed time is 0.096206 seconds. 
Elapsed time is 0.235686 seconds. 
Elapsed time is 0.154150 seconds. 

Results with N = 1000

Elapsed time is 0.481456 seconds. 
Elapsed time is 4.534572 seconds. 
Elapsed time is 0.550606 seconds. 
+0

Do you have any opinion as to which of those two is faster? – Lepidopterist

+0

@GregorianFunk I don't know... Also, it may depend on the size of `A`. Sometimes one solution is the fastest for small sizes but not for large ones. Give them a try! –

+1

@GregorianFunk I did some tests (see the edited answer). The first one is clearly faster. chappjc's answer comes very close –

2
[uA,~,ic] = unique(A); % uA(ic) == A 
mult = accumarray(ic(:),1); % ic(:) is a column whatever orientation unique returns

accumarray is very fast. Unfortunately, unique gets slower when called with 3 outputs.
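To make the role of the third output concrete, here is a small worked sketch on the example array from the question. It forces the index vector into a column with ic(:), on the assumption that the orientation of ic can differ between MATLAB releases; accumarray expects each row of its first argument to be one subscript.

```matlab
A = [1 1 3 1 4 5 3];
[uA, ~, ic] = unique(A);      % ic(k) is the position of A(k) within uA
mult = accumarray(ic(:), 1);  % add 1 to bin ic(k) for every element of A
mult = mult.';                % optional: row vector, to match the shape of uA
```

Here ic holds the values 1 1 2 1 3 4 2, so bin 1 receives three increments, bin 2 two, and bins 3 and 4 one each.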


Late addition:

uA = unique(A); 
mult = nonzeros(accumarray(A(:),1,[],@sum,0,true)) 
2
S = sparse(A,1,1); 
[uA,~,mult] = find(S); 

I found this elegant solution in an old Newsgroup thread.

Tested with the benchmark of Luis Mendo, N = 1000:

Elapsed time is 0.228704 seconds. % histc 
Elapsed time is 1.838388 seconds. % bsxfun 
Elapsed time is 0.128791 seconds. % sparse 

(On my machine, accumarray results in Error: Maximum variable size allowed by the program is exceeded.)
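As a quick sanity check, the sparse approach can be compared against the histc approach on the example from the question. This is a minimal sketch with illustrative variable names. It assumes A contains only positive integers, because sparse uses the values of A as row indices.

```matlab
A = [1 1 3 1 4 5 3];

% sparse approach: repeated (row, column) index pairs are summed
S = sparse(A, 1, 1);               % sparse column vector; S(v) counts value v
[uA_s, ~, mult_s] = find(S);       % row indices = values present, nonzeros = counts

% histc approach, for comparison
uA_h = unique(A);
mult_h = histc(A, uA_h);

% find returns column vectors here, so transpose before comparing
assert(isequal(uA_s.', uA_h) && isequal(mult_s.', mult_h));
```

For non-integer or non-positive data the sparse call would fail, whereas the unique-based approaches do not have that restriction.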