2013-05-20 54 views
-1

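Splitting a string into different line lengths

I'm trying to split a variable-length string across different but predefined line lengths. I threw together the code below, and when I ran it in Python Tutor (I don't have access to a proper Python IDE right now) it failed with KeyError: 6. I guess this means my while loop isn't working properly and keeps incrementing lineNum, but I'm not quite sure why. Is there a better way to do this, or is it an easy fix?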

Code:

import re 

#Dictionary containing the line number as key and the max line length 
lineLengths = {
    1: 9,
    2: 11,
    3: 12,
    4: 14,
    5: 14,
}

inputStr = "THIS IS A LONG DESC 7X7 NEEDS SPLITTING"  #Test string, should be split on the spaces and around the "X" 

splitted = re.split(r"(?:\s|((?<=\d)X(?=\d)))", inputStr)  #splits inputStr on white space and where X is surrounded by numbers eg. dimensions

lineNum = 1       #initialises the line number at 1 

lineStr1 = ""       #initialises each line as a string 
lineStr2 = "" 
lineStr3 = "" 
lineStr4 = "" 
lineStr5 = "" 

#Dictionary creating dynamic line variables 
lineNumDict = {
    1: lineStr1,
    2: lineStr2,
    3: lineStr3,
    4: lineStr4,
    5: lineStr5,
}

if len(inputStr) > 40:
    print("The short description is longer than 40 characters")
else:
    while lineNum <= 5:
        for word in splitted:
            if word != None:
                if len(lineNumDict[lineNum]+word) <= lineLengths[lineNum]:
                    lineNumDict[lineNum] += word
                else:
                    lineNum += 1
            else:
                if len(lineNumDict[lineNum])+1 <= lineLengths[lineNum]:
                    lineNumDict[lineNum] += " "
                else:
                    lineNum += 1

lineOut1 = lineStr1.strip() 
lineOut2 = lineStr2.strip() 
lineOut3 = lineStr3.strip() 
lineOut4 = lineStr4.strip() 
lineOut5 = lineStr5.strip() 

I've had a look at this answer, but I don't have any real understanding of C#: Split large text string into variable length strings without breaking words and keeping linebreaks and spaces

+0

What should the output be for the given example input? –

+0

In this case I should get: "THIS IS A", "LONG DESC 7", "X7 NEEDS", "SPLITTING" – ydaetskcoR

+0

Is splitting '7X7' a hard requirement? If you only split on word boundaries, you could get away with a much simpler expression. –

Answers

1

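It doesn't work because the for loop over the words in splitted sits inside the while loop that checks lineNum. The while condition is only tested again after the for loop has gone through every word, so lineNum can pass 5 in the meantime. You have to do it like this: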

if len(inputStr) > 40:
    print("The short description is longer than 40 characters")
else:
    for word in splitted:
        if lineNum > 5:
            break
        if word != None:
            if len(lineNumDict[lineNum]+word) <= lineLengths[lineNum]:
                lineNumDict[lineNum] += word
            else:
                lineNum += 1
        else:
            if len(lineNumDict[lineNum])+1 <= lineLengths[lineNum]:
                lineNumDict[lineNum] += " "
            else:
                lineNum += 1

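Also, lineStr1, lineStr2 and so on will never change: strings are immutable, so += on a dictionary entry stores a new string in the dictionary and leaves the original variable untouched. You have to read the results from the dictionary directly. I tried it and got a working result: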

print("Lines: %s" % lineNumDict) 

gives:

Lines: {1: 'THIS IS A', 2: 'LONG DESC 7', 3: '7 NEEDS ', 4: '', 5: ''} 
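
The point about immutable strings can be seen in isolation. This is a minimal sketch using a cut-down, hypothetical version of the question's setup (one line variable instead of five):

```python
# Cut-down version of the question's setup: one line variable, one dict entry.
lineStr1 = ""
lineNumDict = {1: lineStr1}

# += on a str builds a new string and stores it in the dict entry.
# The name lineStr1 still refers to the original empty string.
lineNumDict[1] += "THIS"

print(repr(lineStr1))        # ''
print(repr(lineNumDict[1]))  # 'THIS'
```

This is why the final lineOut1 = lineStr1.strip() lines in the question would always produce empty strings.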
+0

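This is good, but it seems to drop the "X" and "SPLITTING". I've changed the else part of the nested if so that it then tries to add the word to the line, and if the word is too long it prints that the word needs splitting, and that seems to work perfectly. Thanks for your help – ydaetskcoR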

+0

Yes, the else case just drops the word, so that's no surprise. To be honest, I only looked at the loop. – Chris
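
One way to carry a word over to the next line instead of dropping it is sketched below. This is not the asker's final code (which isn't shown); fill_lines and its parameters are hypothetical names, and the line lengths are passed as a plain list. A token that does not fit anywhere raises ValueError rather than being printed.

```python
import re

def fill_lines(text, max_lengths):
    """Greedily pack tokens into lines; a token that does not fit moves to the next line."""
    # None marks a whitespace separator; an "X" between digits is kept as its own token.
    tokens = re.split(r"(?:\s|((?<=\d)X(?=\d)))", text)
    lines = [""]
    need_space = False
    for token in tokens:
        if token is None:
            need_space = True
            continue
        if not token:
            continue  # leading, trailing or doubled separators yield empty strings
        while True:
            # Only emit the pending space if the current line already has content.
            sep = " " if (need_space and lines[-1]) else ""
            if len(lines[-1]) + len(sep) + len(token) <= max_lengths[len(lines) - 1]:
                lines[-1] += sep + token
                break
            if len(lines) == len(max_lengths):
                raise ValueError("no room left for token %r" % token)
            lines.append("")  # start the next line and retry the same token
        need_space = False
    return lines

print(fill_lines("THIS IS A LONG DESC 7X7 NEEDS SPLITTING", [9, 11, 12, 14, 14]))
# ['THIS IS A', 'LONG DESC 7', 'X7 NEEDS', 'SPLITTING']
```

Retrying the same token after starting a new line is what keeps "X" and "SPLITTING" from being lost.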

0
for word in splitted: 
    ... 
    lineNum += 1 

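The code can increment lineNum on any pass through this loop, and the loop runs once per element of splitted, i.e. 17 times for this input.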
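
To see how many iterations the for loop actually makes, it helps to look at what re.split returns for the question's pattern. A small sketch, reusing the question's names:

```python
import re

inputStr = "THIS IS A LONG DESC 7X7 NEEDS SPLITTING"

# Because the pattern contains a capturing group, re.split also returns the
# group's value for every separator: None for whitespace, "X" for the X.
splitted = re.split(r"(?:\s|((?<=\d)X(?=\d)))", inputStr)

print(splitted)
# ['THIS', None, 'IS', None, 'A', None, 'LONG', None, 'DESC', None,
#  '7', 'X', '7', None, 'NEEDS', None, 'SPLITTING']
print(len(splitted))  # 17
```

Each of those elements is one chance for lineNum to be incremented before the while condition is checked again.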

+0

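I'm not sure I've understood you correctly, but I only want it to increment lineNum (and so move on to the next line) if adding a word would exceed the lineLengths limit. Unless I'm missing something, the if block should take care of that? – ydaetskcoR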

+0

Yes, but nothing in your code prevents lineNum from exceeding 5 inside the for loop. – xuanji

+0

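I'm still not following you. lineNum should only increase if the word can't fit on that line. Looking at it now, though, I don't think it adds that word to the next line; it skips to the next word instead. That needs changing too. – ydaetskcoR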

0

I wonder whether a properly commented regular expression wouldn't be easier to understand?

import re

lineLengths = {1:9,2:11,3:12,4:14,5:14}
inputStr = "THIS IS A LONG DESC 7X7 NEEDS SPLITTING"

# raw string, so the regex escapes reach the re module untouched
pat = r"""
(?:                     # non-capture around the line as we want to drop leading spaces
    \s*                 # drop leading spaces
    (.{{1,{max_len}}})  # up to max_len characters, will be added through 'format'
    (?=[\sX]|$)         # and using whitespace, X and string ending as terminators
                        # but without capturing as we need X to go into the next match
)?                      # and ignoring missing matches if not all lines are necessary
"""

# build a pattern matching up to 5 lines with the corresponding max lengths
pattern = ''.join(pat.format(max_len=x) for x in lineLengths.values())

re.match(pattern, inputStr, re.VERBOSE).groups()
# Out: ('THIS IS A', 'LONG DESC 7', 'X7 NEEDS', 'SPLITTING', None)

Also, there is no real point in using a dict for lineLengths; a list would do just fine.
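
A list-based variant of the same idea might look like the sketch below. split_lines and LINE_PATTERN are hypothetical names, and the pattern is written on one line without re.VERBOSE. Like the pattern above, it treats any X as a possible break point, not only an X between digits.

```python
import re

# One optional group per output line; {max_len} is filled in by str.format,
# and the doubled braces survive as the literal {1,n} quantifier.
LINE_PATTERN = r"(?:\s*(.{{1,{max_len}}})(?=[\sX]|$))?"

def split_lines(text, max_lengths):
    """Split text into at most len(max_lengths) lines, one regex group per line."""
    pattern = "".join(LINE_PATTERN.format(max_len=n) for n in max_lengths)
    groups = re.match(pattern, text).groups()
    # Unused trailing lines come back as None; drop them.
    return [g for g in groups if g is not None]

print(split_lines("THIS IS A LONG DESC 7X7 NEEDS SPLITTING", [9, 11, 12, 14, 14]))
# ['THIS IS A', 'LONG DESC 7', 'X7 NEEDS', 'SPLITTING']
```

With a list, the order of the lines is explicit and does not depend on how the dict happens to order its values.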
