How do I generate a matrix?

I have the following data:

gene strain 
A1 S1 
A1 S4 
A1 S8 
A2 S5 
A2 S4 
A2 S9 
A3 S4 
A3 S1 
A3 S10

I need to generate a gene vs. strain matrix, i.e. I need to show which genes are present in which strains, so the matrix will look like this:

S1 S4 S5 S8 S9 S10 
A1 
A2 
A3

Can anyone guide me through the best and quickest way to do this in Ruby? I have an array of strains and genes.

Source

2012-02-14 Mark

Check out the [Matrix](http://www.ruby-doc.org/stdlib-1.9.3/libdoc/matrix/rdoc/Matrix.html) class. – 2012-02-14 14:04:08

Are you expecting a matrix with binary entries (present/absent)? – chl 2012-02-14 14:05:05

Yes, I am expecting a matrix with binary entries (present/absent). – Mark 2012-02-14 15:28:30
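Following up on the Matrix suggestion and the binary-entries clarification, here is a minimal sketch of how a 0/1 presence matrix could be built with the standard library's `Matrix.build`. The variable names (`pairs`, `presence`) and the numeric ordering of strain labels are assumptions for illustration, not something stated in the question.

```ruby
require 'matrix'

# Sample (gene, strain) pairs copied from the question.
pairs = [
  %w[A1 S1], %w[A1 S4], %w[A1 S8],
  %w[A2 S5], %w[A2 S4], %w[A2 S9],
  %w[A3 S4], %w[A3 S1], %w[A3 S10]
]

# Row and column labels. Strains are ordered by their numeric suffix so
# that S10 comes after S9 (an assumption about the label format).
genes   = pairs.map(&:first).uniq.sort
strains = pairs.map(&:last).uniq.sort_by { |s| s[/\d+/].to_i }

# 1 where the (gene, strain) pair occurs in the data, 0 otherwise.
presence = Matrix.build(genes.size, strains.size) do |row, col|
  pairs.include?([genes[row], strains[col]]) ? 1 : 0
end

# Print one labelled row per gene.
presence.to_a.each_with_index do |row, i|
  puts "#{genes[i]}: #{row.join(' ')}"
end
```

The `pairs.include?` scan is linear per cell, which is fine for a handful of rows; for larger inputs a Set or Hash lookup would likely be a better fit.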

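There are many ways to represent the gene-strain matrix you need. The best one depends on what you want to do with the matrix. Do you want to compare which strains are present across different genes? Or compare which genes occur in a particular strain? Or do you just want to be able to look up whether a given gene is present in a given strain?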

One simple way would be a Hash whose values are Sets:

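require 'set'
# Each gene maps to the Set of strains it occurs in; a missing key starts out as an empty Set.
h = Hash.new { |hash, key| hash[key] = Set.new }
# assuming you already have the data in an array of arrays, e.g. [["A1", "S1"], ["A1", "S4"]]
data.each do |gene, strain|
    h[gene] << strain
end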
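To make the Hash-of-Sets idea concrete, here is a small self-contained sketch using the data from the question. The name `strains_by_gene` is hypothetical, chosen for this example only.

```ruby
require 'set'

# Sample (gene, strain) pairs copied from the question.
data = [
  %w[A1 S1], %w[A1 S4], %w[A1 S8],
  %w[A2 S5], %w[A2 S4], %w[A2 S9],
  %w[A3 S4], %w[A3 S1], %w[A3 S10]
]

strains_by_gene = Hash.new { |hash, gene| hash[gene] = Set.new }
data.each { |gene, strain| strains_by_gene[gene] << strain }

# Is gene A1 present in strain S4?
puts strains_by_gene["A1"].include?("S4")

# Which strains carry both A1 and A3? Set intersection answers this directly.
shared = strains_by_gene["A1"] & strains_by_gene["A3"]
puts shared.to_a.sort.inspect
```

One thing to keep in mind: because of the default block, reading an unknown gene with `strains_by_gene["A9"]` adds an empty Set under that key. Checking `strains_by_gene.key?("A9")` first avoids that.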

If you just want to print the matrix to the screen, here is a small script that will do it:

require 'set' 
genes, strains = Set.new, Set.new 
h = Hash.new { |h,k| h[k] = Set.new } 
# again assuming you already have the data in an array of arrays 
data.each { |g,s| h[g] << s; genes << g; strains << s } 
genes, strains = genes.sort, strains.sort # lexicographic order, so "S10" sorts before "S4"

FIELD_WIDTH = 5  
BLANK  = " "*FIELD_WIDTH 
X   = "X" + (" " * (FIELD_WIDTH - 1)) 
def print_fixed_width(str) 
    str = str[0,FIELD_WIDTH] 
    print str 
    print " "*(FIELD_WIDTH-str.length) 
end 

# now print the matrix 
print BLANK 
strains.each { |s| print_fixed_width(s) } 
puts 

genes.each do |g| 
    print_fixed_width(g) 
    strains.each { |s| print(h[g].include?(s) ? X : BLANK) } # parenthesized: a bare `print X` is not valid inside a ternary
    puts 
end
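As a variation on the script above, here is a sketch that returns the table as an array of lines instead of printing cell by cell, and orders the strain columns by their numeric suffix so that S10 lands after S9. The helper name `matrix_lines` and the numeric-suffix assumption are mine, not part of the answer. Unlike `print_fixed_width`, it does not truncate labels longer than the field width.

```ruby
require 'set'

# Hypothetical helper: renders the gene/strain table as an array of strings.
def matrix_lines(pairs, width = 5)
  present = pairs.to_set
  genes   = pairs.map(&:first).uniq.sort
  # Assumes every strain label ends in a number (S1, S4, S10).
  strains = pairs.map(&:last).uniq.sort_by { |s| s[/\d+/].to_i }

  header = " " * width + strains.map { |s| s.ljust(width) }.join
  rows = genes.map do |gene|
    cells = strains.map { |s| (present.include?([gene, s]) ? "X" : "").ljust(width) }
    gene.ljust(width) + cells.join
  end
  [header] + rows
end

# Sample (gene, strain) pairs copied from the question.
pairs = [
  %w[A1 S1], %w[A1 S4], %w[A1 S8],
  %w[A2 S5], %w[A2 S4], %w[A2 S9],
  %w[A3 S4], %w[A3 S1], %w[A3 S10]
]

lines = matrix_lines(pairs)
puts lines
```

Returning the lines keeps the formatting separate from the output, so the same table could be written to a file or checked in a test.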

Please post more details about what you want to do with the matrix and, if necessary, I will suggest a more suitable option.

Source

2012-02-14 14:36:43

Thanks. I need to check which genes are present in which strains. – Mark 2012-02-14 15:22:59

So do you need to take a given gene and see which strains have it? Or do you need to take a given strain and see which genes are present in it? Or both? – 2012-02-14 15:49:23

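As a side note, if you plan on checking gene -> strain membership on a large amount of data, I would consider using a Hash instead of a Set to check membership. It is [faster](https://gist.github.com/1827686) than both `Set#include?` and `Array#include?`. In @AlexD's code, this would translate into `Hash.new { |h,k| h[k] = {} }` followed by `h[gene][strain] = true`. Then you can test membership simply with `h[gene][strain]` rather than `h[gene].include?(strain)`. – user2398029 2012-02-14 15:52:31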
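The nested-Hash variant described in the comment above could look roughly like this. The helper `present?` is a hypothetical addition for this sketch; it uses `fetch` so that asking about an unknown gene does not create an empty entry through the default block.

```ruby
# Sample (gene, strain) pairs copied from the question.
data = [
  %w[A1 S1], %w[A1 S4], %w[A1 S8],
  %w[A2 S5], %w[A2 S4], %w[A2 S9],
  %w[A3 S4], %w[A3 S1], %w[A3 S10]
]

# Outer Hash: gene => inner Hash; inner Hash: strain => true.
h = Hash.new { |hash, gene| hash[gene] = {} }
data.each { |gene, strain| h[gene][strain] = true }

# Direct lookup as in the comment: true when present, nil when absent.
puts h["A1"]["S4"].inspect
puts h["A1"]["S5"].inspect

# Hypothetical helper that always returns true/false and never adds keys.
def present?(table, gene, strain)
  table.fetch(gene, {}).key?(strain)
end

puts present?(h, "A9", "S1")
```

No timings are claimed here; whether the Hash lookup is measurably faster for a given data set is best checked with a benchmark on that data.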

You can represent it as a two-dimensional array:

arr = [[1,1],[1,4],[1,8],[2,5],[2,4],[2,9],[3,4],[3,1],[3,10]]

A quick and dirty table:

s = "  1234567890\n" # two leading spaces line the header up with the "1 " row prefix
(1..3).each do |i|
    s << i.to_s << ' '
    (1..10).each do |j|
        s << (arr.include?([i,j]) ? 'x' : ' ')
    end
    s << "\n"
end
puts s

  1234567890
1 x  x   x
2    xx   x
3 x  x     x
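The integer pairs in `arr` do not have to be typed by hand. Assuming every label is a letter prefix followed by a number (A1, S10), a sketch like the following derives them from the original gene/strain labels, along with the row and column ranges that the table loop hard-codes as `1..3` and `1..10`.

```ruby
# Sample (gene, strain) pairs copied from the question.
pairs = [
  %w[A1 S1], %w[A1 S4], %w[A1 S8],
  %w[A2 S5], %w[A2 S4], %w[A2 S9],
  %w[A3 S4], %w[A3 S1], %w[A3 S10]
]

# Keep only the numeric part of each label: ["A1", "S4"] becomes [1, 4].
arr = pairs.map { |gene, strain| [gene[/\d+/].to_i, strain[/\d+/].to_i] }

# Ranges for the table loops, derived from the data rather than hard-coded.
rows = 1..arr.map(&:first).max
cols = 1..arr.map(&:last).max

p arr
p rows, cols
```

Note that the single-character header row only lines up while column numbers have one digit each (with 10 shown as 0), so this layout is best suited to small tables.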

Source

2012-02-14 14:52:33 seph

If you "need to check which genes are present in which strains", then a Hash is sufficient:

str = <<DOC 
A1 S1 
A1 S4 
A1 S8 
A2 S5 
A2 S4 
A2 S9 
A3 S4 
A3 S1 
A3 S10 
DOC 

ar = str.lines.map{|line| line.split(/\s+/) } #string to array of arrays 
genes_from_strain = Hash.new{|h,k| h[k]=[] } #This hash will give an empty array if key is not present 
ar.each{|pair| genes_from_strain[pair.last] << pair.first } 
p genes_from_strain['S1'] #=>["A1", "A3"]
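Since an earlier comment asks whether lookups are needed by gene, by strain, or both, here is a sketch that fills two Hashes in one pass so either direction can be queried. `strains_from_gene` is a name introduced for this example; `genes_from_strain` mirrors the code above.

```ruby
# Sample (gene, strain) pairs copied from the question.
pairs = [
  %w[A1 S1], %w[A1 S4], %w[A1 S8],
  %w[A2 S5], %w[A2 S4], %w[A2 S9],
  %w[A3 S4], %w[A3 S1], %w[A3 S10]
]

genes_from_strain = Hash.new { |hash, strain| hash[strain] = [] }
strains_from_gene = Hash.new { |hash, gene| hash[gene] = [] }

# One pass over the data fills both directions.
pairs.each do |gene, strain|
  genes_from_strain[strain] << gene
  strains_from_gene[gene]   << strain
end

p genes_from_strain["S4"]  #=> ["A1", "A2", "A3"]
p strains_from_gene["A2"]  #=> ["S5", "S4", "S9"]
```

The arrays keep the order in which pairs appear in the input; if the input can contain duplicate pairs, a Set in place of each array would avoid repeated entries.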

Source

2012-02-14 16:12:03 steenslag
