Joining strings of different lengths in a loop

I am converting a file from this format:

# SampleNamea seq1a seq2a 
# SampleNameb seq1b seq2b 
# SampleNamec seq1c seq2c 
# SampleNamed seq1d seq2d

to this format:

# SampleNamea SampleNameb 0 0 0 0 s s e e q q 1 1 a b s s e e q q 2 2 a b 
# SampleNamec SampleNamed 0 0 0 0 s s e e q q 1 1 c d s s e e q q 2 2 c d

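My current script works if seq1a, seq1b, etc. are all the same length. In my dataset, however, the strings have different lengths. When I run the script on my dataset, I get the message IndexError: string index out of range.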

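This is the relevant part of the script. It works out the length of the strings (i.e. seq1aseq2a, seq1bseq2b), appends them to InputMasterList, and adds the SampleNames plus the extra zeros to OutputMasterList. It then appends the strings to OutputMasterList by taking each successive element, starting at element [0], from the InputMasterList[LineEven] string (seq1aseq2a) and the InputMasterList[LineOdd] string (seq1bseq2b) and combining them. So the result would be (s s e e q q 1 1 a b s s e e q q 2 2 a b).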
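For readers who want to see the intended interleaving in isolation, here is a minimal sketch. It assumes each record has already been split into a sample name and one concatenated sequence string; the function name `interleave_pair` is hypothetical and not part of the original script.

```python
def interleave_pair(name_a, seq_a, name_b, seq_b):
    """Build one tab-separated output line from two records (hypothetical helper)."""
    # Header: both sample names followed by four zero columns.
    fields = [name_a, name_b, "0", "0", "0", "0"]
    # zip pairs characters position by position and stops at the shorter string,
    # so this version is only complete when both strings have the same length.
    for char_a, char_b in zip(seq_a, seq_b):
        fields.extend([char_a, char_b])
    return "\t".join(fields)


line = interleave_pair("SampleNamea", "seq1aseq2a", "SampleNameb", "seq1bseq2b")
print(line.replace("\t", " "))
# SampleNamea SampleNameb 0 0 0 0 s s e e q q 1 1 a b s s e e q q 2 2 a b
```

Collecting the fields in a list and joining once at the end avoids the repeated string concatenation and the trailing tab.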

How can I make this script work with strings of different lengths?

LineEven = 0
LineOdd = 1
RecordNum = 1

while RecordNum < (NumofLinesInFile/2):
    for i in range(len(InputMasterList[LineEven])):
        if i == 0:
            OutputMasterList.append(SampleList[LineEven]+'\t'+ SampleList[LineEven]+'\t'+'0'+'\t'+'0'+'\t'+'0'+'\t'+'0'+'\t')
        OutputMasterList[RecordNum] = InputMasterList[LineEven][i]+'\t'+InputMasterList[LineOdd][i]+'\t'
    RecordNum = RecordNum + 1
    LineEven = LineEven + 2
    LineOdd = LineOdd + 2
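The IndexError arises because the loop runs over the length of the LineEven string while also indexing the LineOdd string at the same position, which fails as soon as the LineOdd string is shorter. One possible way around this, sketched below, is `itertools.zip_longest`, which pads the shorter string. Padding with 'N' is an assumption about what the output should contain, and `interleave_padded` is a hypothetical name.

```python
from itertools import zip_longest


def interleave_padded(seq_a, seq_b, pad="N"):
    """Interleave two strings, padding the shorter one with `pad` (assumed filler)."""
    columns = []
    # zip_longest keeps going until the longer string is exhausted and
    # substitutes fillvalue for the missing characters of the shorter one.
    for char_a, char_b in zip_longest(seq_a, seq_b, fillvalue=pad):
        columns.extend([char_a, char_b])
    return "\t".join(columns)


print(interleave_padded("ACGT", "AC").split("\t"))
# ['A', 'A', 'C', 'C', 'G', 'N', 'T', 'N']
```

Whether padding at the end is appropriate depends on the data; if a whole sequence is missing rather than truncated, a placeholder for that sequence is probably the better fit.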

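I am very much a beginner, so I know this code is rather clumsy, but any help would be greatly appreciated. If you need clarification on what I am trying to do with this script, please don't hesitate to ask.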

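Update: Thank you for the prompt replies. Your feedback made me realize that I have to change the nature of my question. My dataset has missing sequences, which my script does not handle, and I need to account for this missing data with a placeholder that has the same length as its counterpart.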

Old format:

# SampleNamea seq1a seq2a 

# SampleNameb '.'  seq2b

New format:

# SampleNamea seq1a seq2a 

# SampleNameb NNNNN seq2b

Then, I believe, my script will work!

TL;DR - Based on your feedback, I now have an idea of what my next step should be.

Source

2016-07-27 Kiera Alexandria

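Can you make sure your indentation is correct? Right now there is an infinite loop, because 'RecordNum' is not incremented inside the while loop – Greg

Your example does not make clear how the format changes. Please describe in words how the new format is derived from the old one. –

If seq1a and sequence2a have different lengths, what should the output look like? –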

Based on your new update, the InputMasterList[LineOdd] string might look like (.seq2b).

So before you go on to append anything, run a check on InputMasterList:

if '.' in InputMasterList[LineOdd]: 
    InputMasterList[LineOdd] = InputMasterList[LineOdd].replace('.', 'NNNNN', 1)

You can do this for both LineOdd and LineEven.

Note: this is based on the new input.
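The fixed 'NNNNN' replacement above assumes the missing sequence is five characters long. A variant that sizes the placeholder from the paired record could look like the sketch below. It assumes each record has already been split into a list of sequence fields; `fill_missing` is a hypothetical helper, not part of the original script.

```python
def fill_missing(record, partner, missing=".", pad="N"):
    """Replace missing sequences with pad characters sized from the partner record."""
    # Each field is compared with the field at the same position in the partner.
    # If both fields are missing, the placeholder is only one character long,
    # because the partner's '.' is all there is to measure.
    return [
        pad * len(other) if seq == missing else seq
        for seq, other in zip(record, partner)
    ]


seqs_a = ["seq1a", "seq2a"]
seqs_b = [".", "seq2b"]
print(fill_missing(seqs_b, seqs_a))
# ['NNNNN', 'seq2b']
```

Calling it once in each direction, `fill_missing(seqs_a, seqs_b)` and `fill_missing(seqs_b, seqs_a)`, covers both LineEven and LineOdd.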

Source

2016-07-28 03:24:06
