Checking for specific amino acids at specific positions in a multiple sequence alignment

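There is a similar question on Stack Overflow, but it uses the Linux terminal (Search for specific characters in specific positions of line). I want to do something similar in Python, and I can't quite figure out the Pythonic way to do it without writing out every membership check by hand.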

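I want to search for specific amino acids at specific positions in a multiple sequence alignment. I have defined the alignment positions of interest as a list of indices,

e.g. Index = [1, 100, 235, 500].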

I have also defined the amino acids I want at each of these positions.

Res1 = ["A","G"] 
Res2 = ["T","F"] 
Res3 = ["S","W"] 
Res4 = ["H","J"]

Currently I am doing something like this:

for m in records_dict:
    seq = records_dict[m].seq
    # Every position must hold one of the residues allowed there
    if (seq[Index[0]] in Res1 and seq[Index[1]] in Res2
            and seq[Index[2]] in Res3 and seq[Index[3]] in Res4):
        print(m)

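Now, suppose I have 40 residue lists to check. I know I could write out the membership check for each list by hand, but surely there is a simpler way to do these membership checks, using a while loop or something similar.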

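Also, is there a way to build in a fallback so that, if no sequence passes all 40 membership checks, I get the 5 sequences that come closest to passing all 40, with output such as: sequence "m" has 30/40 matches, along with a list of those 30 matches and of the 10 checks that did not match?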

Source

2015-10-09 user1998510

I'll assume you want to check that the residue at Index[0] is in Res1, the residue at Index[1] is in Res2, and so on.

res = [Res1, Res2, Res3, Res4]
for m in records_dict:
    match = 0
    match_log = []
    # Pair each position with its own list of allowed residues
    for i, allowed in zip(Index, res):
        if records_dict[m].seq[i] in allowed:
            match += 1
            match_log.append(i)

With this small snippet, you can count the number of matches and keep track of the indices that matched for each value in records_dict.
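As a rough, self-contained illustration of the loop above, the sketch below uses plain strings in place of records_dict[m].seq and also records which positions failed. The names sequences, positions, allowed and score_sequence, and the toy data, are made up for this example.

```python
# Toy data: plain strings stand in for records_dict[m].seq (hypothetical sequences)
sequences = {
    "seq1": "MATSH",
    "seq2": "MGFQH",
    "seq3": "MKLPQ",
}
positions = [1, 2, 3]                            # 0-based alignment columns
allowed = [["A", "G"], ["T", "F"], ["S", "W"]]   # allowed residues per column


def score_sequence(seq, positions, allowed):
    """Return (matched, unmatched) position lists for one sequence."""
    matched, unmatched = [], []
    for pos, residues in zip(positions, allowed):
        if seq[pos] in residues:
            matched.append(pos)
        else:
            unmatched.append(pos)
    return matched, unmatched


results = {name: score_sequence(seq, positions, allowed)
           for name, seq in sequences.items()}
for name, (matched, unmatched) in results.items():
    print(name, len(matched), "of", len(positions), matched, unmatched)
```

Keeping the unmatched positions alongside the matched ones makes it easy to report later which checks a sequence failed.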

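If you want to check ResX at several positions, or you don't want to keep the index list in the same order as the res list, I would define a dictionary of lists, where the key is ResX and the value is the list of indices: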

to_check = {}
# Lists are not hashable, so convert each ResX to a tuple to use it as a key
to_check[tuple(Res1)] = [index1, index2]
to_check[tuple(Res2)] = [index1, ..., indexN]
...
to_check[tuple(ResX)] = [indexI, ..., indexJ]

Then, use:

match_log = {}
for m in records_dict:
    match_log[m] = {}
    # .items() yields (key, value) pairs; iterating the dict alone yields only keys
    for res, indexes in to_check.items():
        match_log[m][res] = []
        for i in indexes:
            if records_dict[m].seq[i] in res:
                match_log[m][res].append(i)
        nb_match = len(match_log[m][res])

Or, in a more Pythonic way, using filter:

match_log = {}
for m in records_dict:
    match_log[m] = {}
    for res, indexes in to_check.items():
        # list() is needed on Python 3, where filter returns an iterator
        match_log[m][res] = list(filter(lambda i: records_dict[m].seq[i] in res, indexes))
        nb_match = len(match_log[m][res])
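For the second part of the question (the five closest sequences when nothing passes every check), one possible approach is to score every sequence and keep the top few with heapq.nlargest. This is only a sketch: the checks dictionary maps each position to its accepted residues (the reverse of to_check above), and sequences, match_report and best_matches are hypothetical names with toy data.

```python
import heapq

# Hypothetical toy alignment; with Biopython records, str(record.seq) could be used instead
sequences = {
    "seqA": "MATSH",
    "seqB": "MGLQH",
    "seqC": "MKLPQ",
    "seqD": "MATQH",
}
# Map each 0-based column to the set of residues accepted there
checks = {1: {"A", "G"}, 2: {"T", "F"}, 3: {"S", "W"}}


def match_report(seq, checks):
    """Split the checked columns into matched and unmatched lists."""
    matched = [pos for pos, ok in sorted(checks.items()) if seq[pos] in ok]
    unmatched = [pos for pos in sorted(checks) if pos not in matched]
    return matched, unmatched


def best_matches(sequences, checks, n=5):
    """Return the n sequences that satisfy the most checks, best first."""
    reports = {name: match_report(seq, checks) for name, seq in sequences.items()}
    top = heapq.nlargest(n, reports, key=lambda name: len(reports[name][0]))
    return [(name,) + reports[name] for name in top]


for name, matched, unmatched in best_matches(sequences, checks, n=2):
    print("%s: %d/%d matched %s, unmatched %s"
          % (name, len(matched), len(checks), matched, unmatched))
```

With 40 checks, only the checks dictionary grows; the functions stay the same, and n=5 gives the five best candidates together with the positions each one failed.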

Source

2015-10-09 09:51:20 NiziL

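Thanks NiziL, that helps. But how would you do something similar if the indices are not in the same order as the Res lists? I assume you would have to enter them manually. For example, if index[0] == ResAsp and index[2] == ResGlu (rather than Res1 and Res2). Would I put all the ResX lists in another list and iterate over both at the same time? – user1998510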

@user1998510 Using a dictionary should be more effective; I've updated my answer ;) – NiziL
