Storing the lines of a file as a dictionary

I have a file that looks like this:

>Organism1
ETTGDMND
>Organism2
PDELMESPEER
>Organism3
YERLLRRAQ
>Organism1
EDLTEVSGIGC

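I want to create a dictionary in which the uppercase strings (= the amino acid sequences) are the keys and the organism names are the values. So far I have: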

dict1 = {}
for line in file.readlines():
    line = line.rstrip()
    if ">" not in line:  # no '>' in the line = amino acid sequence
        key = line  # assign the line to a variable 'key'
        dict1[key] = []  # make this variable a key of dict1
    else:  # '>' is in the line = organism
        value = line
        dict1[key] = value
print(dict1)

It raises an error saying that 'key' is not defined. But I thought I had defined it by saying key = line ..?

A related question, using the same input file. To pull out only the amino acid sequences from this file (for another purpose), I did:

my_sequences = []
for line in file:
    line = line.rstrip()
    if ">" not in line:
        my_sequences = [line]  # add these dna sequences to the list "my_sequences"
print(my_sequences)

But this prints only one sequence instead of all of them. Can anyone help me? Thanks!

2017-04-08 ccaarroo

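Which comes first, the key or the value? –

Your first line is _>Organism1_. That means the code takes the 'else' branch, where 'key' has not been defined yet. – CristiFati

Ah, that makes sense! – ccaarroo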

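Since your value always appears before your key, the straightforward approach is to "remember" the value in another variable, which you can then use when you reach the key. So the following should work: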

dict1 = {}
with open("somedata.dat") as infile:  # closes the file automatically
    for line in infile:  # note you can leave out readlines() here
        line = line.rstrip()
        if not line:  # skip blank lines
            continue
        if line.startswith(">"):  # safer to check just the first character
            value = line[1:]  # use [1:] to drop the ">" from the value
        else:
            dict1[line] = value
print(dict1)

If several amino acid key lines follow a single value line, the same value is used for all of those keys.
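To try the approach without a file on disk, here is a minimal Python 3 sketch that runs on the sample data from the question, held in a list. The function name `sequences_to_organisms` and the `sample` list are made up for this illustration, and skipping blank lines is an added assumption.

```python
def sequences_to_organisms(lines):
    """Map each sequence line to the most recent '>' header seen before it."""
    mapping = {}
    organism = None  # no header has been seen yet
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no data
        if line.startswith(">"):
            organism = line[1:]  # remember the name, minus the leading ">"
        else:
            mapping[line] = organism  # sequence is the key, organism the value
    return mapping


# Sample data from the question (hypothetical in-memory stand-in for the file)
sample = [
    ">Organism1",
    "ETTGDMND",
    ">Organism2",
    "PDELMESPEER",
    ">Organism3",
    "YERLLRRAQ",
    ">Organism1",
    "EDLTEVSGIGC",
]

result = sequences_to_organisms(sample)
print(result)
```

Because the sequences are the keys, a sequence that occurs under two different organisms would keep only the organism seen last.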

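Regarding the second question, the problem is that this line:

my_sequences = [line]

always replaces my_sequences, regardless of its previous value, so you end up with a one-item list containing only the last sequence processed. Replace it with:

my_sequences.append(line)

which adds an item to the end of the list and will do what you want.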
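The difference can be seen in a short Python 3 sketch that again uses the question's sample data from a list instead of a file. The helper name `collect_sequences` and the `sample` list are hypothetical, chosen only for this example.

```python
def collect_sequences(lines):
    """Return every non-header, non-blank line, in the order it appears."""
    sequences = []
    for raw in lines:
        line = raw.strip()
        if line and not line.startswith(">"):
            sequences.append(line)  # grow the list instead of replacing it
    return sequences


# Sample data from the question (hypothetical in-memory stand-in for the file)
sample = [
    ">Organism1",
    "ETTGDMND",
    ">Organism2",
    "PDELMESPEER",
    ">Organism3",
    "YERLLRRAQ",
    ">Organism1",
    "EDLTEVSGIGC",
]

my_sequences = collect_sequences(sample)
print(my_sequences)
```

Assigning `[line]` inside the loop would instead leave only the last sequence in the list.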

2017-04-08 23:21:53

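OK, this gave me what I needed. Thanks! – ccaarroo

Another question, apart from creating a dictionary: to pull out only the amino acid sequences from this file (for another purpose), I did my_sequences = [] for line in file: line = line.rstrip() if ">" not in line: my_sequences = [line] # add these dna sequences to the list "my_sequences" print my_sequences # but that gives me only one sequence – ccaarroo

Oh, the code shows up badly here, sorry – ccaarroo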
