2014-07-21

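Python: How to compare a list of lists with a 2D array?

I am new to arrays and to using for loops in this context, so I hope someone can point me toward a way to solve this.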

I have a list of lists that looks like this:

[[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1], [0, 1, 1], [1, 1, 1]] 

and a 2D array that looks like this:

[[1 0 0] 
[1 0 1] 
[1 1 1] 
[1 0 0] 
[1 1 0] 
[1 0 1] 
[1 1 1] 
[0 0 1] 
[0 0 1]] 

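The array will eventually have close to 420K records, and I would like to count how many times each combination in the list appears in it. I tried using a for loop, like this: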

from matplotlib import pyplot as plt 
import numpy as np 
from matplotlib_venn import venn3, venn3_circles 
import os 
import sys 
from itertools import islice 

input_file= "/home/ruchik/bodyMap_Data/bodyMap_Files/final.txt"; 
col0_idx = 6 
col1_idx = 1 
col2_idx = 2 

print "number of sys arg", len(sys.argv) 
print "sys arg list", sys.argv 
input_file = sys.argv[1] 
col1_idx = int(sys.argv[2]) 
col2_idx = int(sys.argv[3]) 

## keep it real ;) 
col1_idx -= 1 
col2_idx -= 1 


print >> sys.stderr,'File is {file} and used columns are {col1} and {col2}.'.format(file=input_file, col1=col0_idx+col1_idx+1, col2=col2_idx+col0_idx+1) 

## Opening and reading the file 
#f = open(input_file, "r") 
#g = open("final_fixed.txt", "w") 
#print "Opened File Handle", f 
# 
#for line in f: 
# if line.strip(): 
#  g.write("\t".join(line.split()[7:]) + "\n") 

#f.close() 
#g.close() 

#print "File created." 

#f = open("final_fixed.txt", "r") 
f = open(input_file, "r") 

# header_all is a list of the content of the 1st line from position col0_idx-th to last-column-th 
header_all_list = [] 
header_all_list = f.readline().rstrip("\n").split('\t')[col0_idx:] 
header_reduced = [header_all_list[col1_idx], header_all_list[col2_idx], 'others'] 



# data_all is a (line-wise) list of (column-wise) list 
# with the content of each line but the 1st one from position col0_idx-th to last-column-th 
data_all_lol = [] 
for line in f: 
     data_all_lol.append(line.rstrip("\n").split('\t')[col0_idx:]) 

# just print the data_all list of list ... to make sure it is all fine up to there 
#for i in range(len(data_all)): 
#  for j in range(len(data_all[i])): 
#    print >> sys.stderr, 'all data {col_i} , {col_j} : {val_ij}'.format(col_i=i+1, col_j=j+1+col0_idx, val_ij = data_all[i][j]) 

op_lol = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1], [0, 1, 1], [1, 1, 1]] 
count = [0, ] * len(op_lol) 
for i in range(len(op_lol)): 
    for j in range(len(data_reduced_transposed_npa)): 
     if list(data_reduced_transposed_npa[j]) == op_lol[i]: 
      count[i] += 1 

op = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1], [0, 1, 1], [1, 1, 1]] 
count2 = [0, ] * len(op_lol) 
for column in data_reduced_npa: 
     for j in range(len(op_lol)): 
       count2[j] += 1 

#for k in range(len(op)): 
# print str(op[k]) + ': ' + str(count[k]) 
#header_venn3 = header_venn 
#data_venn3 = data_venn 
#print >> sys.stderr,"\nvenn3 order :" 
#print >> sys.stderr,"(Abc, aBc, ABc, abC, AbC, aBC, ABC)" 
#print >> sys.stderr,"venn3 header :" 
#print >> sys.stderr, header_venn3 
#print >> sys.stderr,"venn3 data :" 
#for i in range(len(data_venn3)): 
# for j in range(len(data_venn3[i])): 
#  print >> sys.stderr, 'venn3 data {col_i} , {col_j} : {val_ij}'.format(col_i=i, col_j=j, val_ij = data_venn3[i][j]) 




## Making the venn' 
plt.figure(figsize=(4,4)) 
v = venn3(subsets=count, set_labels = ('Introns', 'Heart', 'Others')) 
v.get_patch_by_id('100').set_alpha(1.0) 
v.get_patch_by_id('100').set_color('white') 
v.get_label_by_id('100').set_text('Unknown') 
plt.show() 

But this just transposes the 2D array and prints it. What exactly am I doing wrong?

+2

I'm not sure I understand the difference between a 2D list and a 2D array. – alfasin

Answers

1

You can't do this:

count = [0, ] * len(op_lol); 

or this:

count2 = [0, ] * len(op_lol); 

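Both of these create shallow copies of the value 0 in memory, so when you index into the list count and assign new values, you only overwrite a single location in memory. You need to actually instantiate the list, either with a for loop over range, with map, or with copy.deepcopy().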
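How list repetition behaves can be checked directly. The sketch below is a minimal illustration under standard Python list semantics, not a diagnosis of the code in the question; the names shared, independent, template and rows are made up for the example. As far as this check goes, the sharing matters when the repeated element is itself mutable (for instance an inner list), whereas in a flat list of integers such as [0] * n each slot is rebound on its own by +=.

```python
import copy

n = 7

# Flat list of ints: += rebinds only the indexed slot to a new integer.
count = [0] * n
count[2] += 1

# Nested list built by repetition: every slot refers to the SAME inner list,
# so a change made through one slot is visible through all of them.
shared = [[0]] * 3
shared[0][0] += 1

# A comprehension creates a separate inner list for each slot.
independent = [[0] for _ in range(3)]
independent[0][0] += 1

# copy.deepcopy called once per slot also yields independent inner lists.
template = [0]
rows = [copy.deepcopy(template) for _ in range(3)]
rows[1][0] += 1
```

After running this, count has a single 1 at index 2, shared shows the increment in all three inner lists, and independent and rows show it only in the slot that was modified.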

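Also, you don't say where data_reduced_npa and data_reduced_transposed_npa come from, or what they are, so it isn't really possible to say what is causing the output. But at the very least, you should take a look at copy.deepcopy():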

https://docs.python.org/2/library/copy.html
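For the counting itself, one possible approach is to tally every row once and then look up each combination. This is only a sketch under assumptions: the question does not show how data_reduced_npa is built, so the sample rows here are copied from the 2D array printed in the question, and count_patterns and sample_rows are hypothetical names. Values are passed through int() so that rows of strings read from a tab-separated file, or NumPy integer rows, compare equal to the 0/1 patterns.

```python
from collections import Counter


def count_patterns(rows, patterns):
    """Return how many rows equal each pattern, in the order of patterns."""
    # Tuples are hashable, so all rows can be tallied in a single pass.
    tally = Counter(tuple(int(v) for v in row) for row in rows)
    # A Counter returns 0 for combinations that never occur.
    return [tally[tuple(p)] for p in patterns]


op_lol = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1],
          [1, 0, 1], [0, 1, 1], [1, 1, 1]]

# Sample rows copied from the 2D array shown in the question.
sample_rows = [[1, 0, 0], [1, 0, 1], [1, 1, 1], [1, 0, 0], [1, 1, 0],
               [1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 0, 1]]

count = count_patterns(sample_rows, op_lol)
# count == [2, 0, 1, 2, 2, 0, 2]
```

The single pass over the rows avoids the nested loop over every combination, which may matter at around 420K records. The resulting list follows the order of op_lol, which matches the (Abc, aBc, ABc, abC, AbC, aBC, ABC) order noted in the question's comments for venn3.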

+0

I have added the complete code. –