2015-03-02 16 views

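I have a problem that I can't seem to track down and fix. Building a dictionary inside a function does not work, but building it outside a function does.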

FASTA = >header1 
     ATCGATCGATCCCGATCGACATCAGCATCGACTAC 
     ATCGACTCAAGCATCAGCTACGACTCGACTGACTACGACTCGCT 
     >header2 
     ATCGATCGCATCGACTACGACTACGACTACGCTTCGTATCAGCATCAGCT 
     ATCAGCATCGACGACGACTAGCACTACGACTACGACGATCCCGATCGATCAGCT 

def dnaSequence(): 
    ''' 
    This function makes a dict called DNAseq by reading the fasta file 
    given as first argument on the command line 
    INPUT: Fasta file containing strings 
    OUTPUT: key is header and value is sequence 
    ''' 

    DNAseq = {} 
    for line in FASTA: 
     line = line.strip() 
     if line.startswith('>'): 
      header = line 
      DNAseq[header] = "" 
     else: 
      seq = line 
      DNAseq[header] = seq 

    return DNAseq 



def digestFragmentsWithOneEnzyme(dnaSequence): 
    ''' 
    This function digests the sequence from DNAseq into smaller parts 
    by using the enzymes listed in the MODES. 
    INPUT: DNAseq and the enzymes from sys.argv[2:] 
    OUTPUT: The DNAseq is updated with the segments gained from the 
    digesting 
    ''' 
    enzymes = sys.argv[2:] 

    updated_list = [] 
    for enzyme in enzymes: 
     pattern = MODES(enzyme) 
     p = re.compile(pattern) 
     for dna in DNAseq.keys(): 
      matchlist = re.findall(p,dna) 
      updated_list = re.split(MODES, DNAseq) 
      DNAseq.update((key, updated_list.index(k)) for key in 
      d.iterkeys()) 
    return DNAseq 


def getMolecularWeight(dnaSequence): 
    ''' 
    This function calculates the molWeight of the sequence in DNAseq 
    INPUT: the updated DNAseq from the previous function as a dict 
    OUTPUT: The DNAseq is updated with the molweight of the digested fragments 
    ''' 

    results = [] 
    for seq in DNAseq.keys(): 
     results = sum((dnaMass[base]) for base in DNAseq[seq]) 
     DNAseq.update((key, results.index(k)) for key in 
     d.iterkeys()) 
    return DNAseq 


def main(argv=None): 
    ''' 
    This function prints the results of the digested DNA sequence on in the terminal. 
    INPUT: The DNAseq from the previous function as a dict 
    OUTPUT: name  weight weight weight 
      name2 weight weight weight 
    ''' 
    if argv == None: 
     argv = sys.argv 
    if len(argv) <2: 
     usage() 
     return 1 

    digestFragmentsWithOneEnzyme(dnaSequence()) 
    Genes = getMolecularWeight(digestFragmentsWithOneEnzyme()) 
    print ({header},{seq}).format(**DNAseq) 
    return 0 



if __name__ == '__main__': 
    sys.exit(main()) 

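In the first function I am trying to build a dict from the FASTA file. The second function uses the same dict and slices the sequences with regular expressions, and the last function calculates the molweight of the resulting fragments.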

My problem is that, for some reason, Python does not recognize my dict, and I get an error:

name error DNAseq is not defined

If I create the dict outside the functions, the dict is available.


Please fix your code blocks. – Kevin 2015-03-02 13:56:23

Answers


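You pass the dictionary into both functions under the parameter name dnaSequence, not DNAseq, so the name DNAseq is never defined inside them.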
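The error follows from how Python scopes names: a variable assigned inside one function is local to that function, and an argument is visible only under the name given on the `def` line. Here is a minimal sketch of this, where `make_dict`, `broken`, and `fixed` are made-up names for illustration and are not part of the question's code:

```python
def make_dict():
    # DNAseq is local to make_dict; only the returned object survives the call.
    DNAseq = {">header1": "ATCG"}
    return DNAseq


def broken(dnaSequence):
    # The argument arrives as dnaSequence, so DNAseq is undefined in this scope.
    return len(DNAseq)


def fixed(DNAseq):
    # The body and the parameter now agree on the name.
    return len(DNAseq)


try:
    broken(make_dict())
    error_message = None
except NameError as exc:
    error_message = str(exc)

fixed_result = fixed(make_dict())
print(error_message)
print(fixed_result)
```

Either renaming the parameter or using the parameter's name in the body resolves the error; what matters is that the two match.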

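Note that this is also a very strange way to call the functions. You pass the sequence to the first call of digestFragmentsWithOneEnzyme but completely ignore its result. You then call it a second time to pass its result to getMolecularWeight, but you do not actually pass the sequence in that call, so it would raise an error there if you got that far.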

I think what you are trying to do is this:

sequence = dnaSequence() 
fragments = digestFragmentsWithOneEnzyme(sequence) 
genes = getMolecularWeight(fragments) 
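To see the chaining run end to end, here is a rough Python 3 sketch with simplified stand-ins for the three functions. Everything in it is an assumption for illustration: `CUT_PATTERN` and `BASE_MASS` are hypothetical replacements for the question's MODES and dnaMass (the masses are placeholder numbers, not real molecular weights), and `re.split` drops the matched site, which a real digest would not do.

```python
import re

# Hypothetical stand-ins for the question's MODES and dnaMass tables.
CUT_PATTERN = "GAATTC"
BASE_MASS = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}  # placeholder values


def dna_sequence():
    # Stand-in for reading the FASTA file: header -> sequence.
    return {">header1": "ATGAATTCGG"}


def digest(sequence):
    # Map each header to the fragments left after splitting at the pattern.
    return {header: re.split(CUT_PATTERN, seq) for header, seq in sequence.items()}


def molecular_weight(fragments):
    # Map each header to one summed mass per fragment.
    return {
        header: [sum(BASE_MASS[base] for base in frag) for frag in frags]
        for header, frags in fragments.items()
    }


# Each step receives the previous step's return value explicitly.
sequence = dna_sequence()
fragments = digest(sequence)
genes = molecular_weight(fragments)
print(genes)
```

Each function works only on what it is handed and returns a new dict, so no function depends on a name defined inside another.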

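You should also avoid giving the parameter of those two functions the same name as a separate function, because the parameter then hides the function name inside the body. Choose a new name instead: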

def digestFragmentsWithOneEnzyme(sequence): 
    ... 
    for dna in sequence: 

(You don't need to call keys(): iterating over a dictionary always yields its keys.)
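Both points can be checked with a small Python 3 sketch. The names `dna_sequence`, `shadowing`, and `headers` are made up for illustration:

```python
def dna_sequence():
    return {">header1": "ATCG", ">header2": "GGCC"}


def shadowing(dna_sequence):
    # Inside this body, dna_sequence refers to the argument (a dict),
    # so the function of the same name cannot be called from here.
    return callable(dna_sequence)


def headers(sequence):
    # Iterating over the dict directly yields its keys.
    return [header for header in sequence]


shadowed_is_callable = shadowing(dna_sequence())
header_list = headers(dna_sequence())
print(shadowed_is_callable)
print(header_list)
```

Within `shadowing`, the name is bound to the dict that was passed in, which is why a distinct parameter name such as `sequence` is less error-prone.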