Splitting a string into different line lengths


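I am trying to split a variable-length string at different but predefined line lengths. I threw the code below together, and when I ran it in Python Tutor (I don't have access to a proper Python IDE right now) it failed with KeyError: 6. I guess this means my while loop isn't working properly and keeps incrementing lineNum, but I'm not sure why. Is there a better way to do this, or is this easy to fix?

Code: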

import re 

#Dictionary containing the line number as key and the max line length 
lineLengths = { 
     1:9, 
     2:11, 
     3:12, 
     4:14, 
     5:14 
       } 

inputStr = "THIS IS A LONG DESC 7X7 NEEDS SPLITTING"  #Test string, should be split on the spaces and around the "X" 

splitted = re.split("(?:\s|((?<=\d)X(?=\d)))",inputStr)  #splits inputStr on white space and where X is surrounded by numbers eg. dimensions 

lineNum = 1       #initialises the line number at 1 

lineStr1 = ""       #initialises each line as a string 
lineStr2 = "" 
lineStr3 = "" 
lineStr4 = "" 
lineStr5 = "" 

#Dictionary creating dynamic line variables 
lineNumDict = { 
     1:lineStr1, 
     2:lineStr2, 
     3:lineStr3, 
     4:lineStr4, 
     5:lineStr5 
     } 

if len(inputStr) > 40: 
    print "The short description is longer than 40 characters" 
else: 
    while lineNum <= 5: 
     for word in splitted: 
      if word != None: 
       if len(lineNumDict[lineNum]+word) <= lineLengths[lineNum]: 
        lineNumDict[lineNum] += word 
       else: 
        lineNum += 1 
      else: 
       if len(lineNumDict[lineNum])+1 <= lineLengths[lineNum]: 
        lineNumDict[lineNum] += " " 
       else: 
        lineNum += 1 

lineOut1 = lineStr1.strip() 
lineOut2 = lineStr2.strip() 
lineOut3 = lineStr3.strip() 
lineOut4 = lineStr4.strip() 
lineOut5 = lineStr5.strip()

I have also had a look at this answer, but I don't have any real understanding of C#: Split large text string into variable length strings without breaking words and keeping linebreaks and spaces


2013-05-20 ydaetskcoR

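What should the output be for the given example input? –

In this case I should get: "THIS IS A" "LONG DESC 7" "X7 NEEDS" "SPLITTING" – ydaetskcoR

Is splitting '7X7' a hard requirement? If you only split on word boundaries, you could get by with a simpler expression. –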

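It doesn't work because the for loop over the words in splitted sits inside the while loop that checks lineNum, so lineNum is not checked again while the for loop is still running. You have to do it like this: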

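if len(inputStr) > 40:
    print("The short description is longer than 40 characters")
else:
    for word in splitted:
        if lineNum > 5:
            break  # all five lines are used up
        if word is not None:
            # a real token: append it if it still fits on the current line
            if len(lineNumDict[lineNum] + word) <= lineLengths[lineNum]:
                lineNumDict[lineNum] += word
            else:
                lineNum += 1  # move to the next line (this word is not carried over)
        else:
            # None marks a whitespace separator: append a space if it fits
            if len(lineNumDict[lineNum]) + 1 <= lineLengths[lineNum]:
                lineNumDict[lineNum] += " "
            else:
                lineNum += 1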

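Also, lineStr1, lineStr2 and so on will not change; you have to read the results from the dictionary directly (strings are immutable). I tried it and got a working result: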

print("Lines: %s" % lineNumDict)

This gives:

Lines: {1: 'THIS IS A', 2: 'LONG DESC 7', 3: '7 NEEDS ', 4: '', 5: ''}
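The point about immutable strings can be checked with a minimal reproduction. This is only a sketch that reuses the names from the question; the single-entry dictionary is a simplification for illustration.

```python
# Minimal reproduction of the immutability point (names mirror the question).
lineStr1 = ""
lineNumDict = {1: lineStr1}

# += on a str builds a new string object and rebinds only the dict entry.
lineNumDict[1] += "THIS"

print(repr(lineStr1))        # the original variable is untouched
print(repr(lineNumDict[1]))  # the dict entry holds the new string
```

So the final lineOut values would have to be read from lineNumDict rather than from lineStr1 to lineStr5.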


2013-05-20 10:49:07 Chris

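This is great, but it seems to drop the "X" and "SPLITTING". I have changed the 'else' part of the nested 'if's so that it then tries to add the word to the next line and, if the word is too long, 'print's that the word needs splitting, and this seems to work perfectly. Thanks for your help – ydaetskcoR

Yes, the else case simply drops the word, so that's no surprise. Actually, I only looked at the loop. – Chris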
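One way to carry a word over to the next line instead of dropping it is sketched below. This is not the asker's actual fix, which is not shown in the thread; fit_lines and its parameters are hypothetical names, the lengths are passed as a list, and the zero-width split assumes Python 3.7 or later.

```python
import re

def fit_lines(text, max_lengths):
    """Greedily pack text into lines of the given maximum lengths (sketch)."""
    lines = [""] * len(max_lengths)
    current = 0
    for word in text.split():
        # Allow a break before an X that sits between digits, e.g. 7X7 -> 7 | X7.
        pieces = re.split(r"(?<=\d)(?=X\d)", word)
        for k, piece in enumerate(pieces):
            while current < len(lines):
                # A space is needed only before a new word on a non-empty line.
                sep = " " if k == 0 and lines[current] else ""
                if len(lines[current]) + len(sep) + len(piece) <= max_lengths[current]:
                    lines[current] += sep + piece
                    break
                current += 1  # piece does not fit: retry it on the next line
            else:
                # Ran out of lines without placing the piece.
                raise ValueError("text does not fit: %r" % piece)
    return lines

result = fit_lines("THIS IS A LONG DESC 7X7 NEEDS SPLITTING", [9, 11, 12, 14, 14])
print(result)
```

Retrying the same piece on the next line is what keeps "X7" and "SPLITTING" from being lost; a piece longer than every remaining line raises an error rather than being silently skipped.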

for word in splitted: 
    ... 
    lineNum += 1

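This code can increment lineNum on every pass of the for loop, i.e. up to once per element of splitted (17 elements for this input), while the while condition is only checked again after the whole for loop has finished.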
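To see what the for loop iterates over, it helps to print the list itself. The snippet below reuses the question's regular expression (written as a raw string) and input; nothing else is assumed.

```python
import re

inputStr = "THIS IS A LONG DESC 7X7 NEEDS SPLITTING"

# The capturing group makes re.split return the separator slot as well:
# None where the split was on whitespace, 'X' where it was on the X.
splitted = re.split(r"(?:\s|((?<=\d)X(?=\d)))", inputStr)

print(splitted)
print(len(splitted))
```

With this many elements and only five lines, nothing inside the for loop stops lineNum from reaching 6, which is where the KeyError comes from.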


2013-05-20 10:39:02 xuanji

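I'm not sure I understood you correctly, but I only want it to increment 'lineNum' (and so move to the next line) if adding a word would exceed the 'lineLength' limit. Unless I'm missing something, the 'if' block should take care of that? – ydaetskcoR

Yes, but nothing in your code prevents 'lineNum' from exceeding 5 inside the for loop. – xuanji

I'm still not following you. lineNum should only increase if the word can't fit on that line. Looking at it now, though, I don't think it adds the word to the next line; instead it skips to the next word. That needs changing too. – ydaetskcoR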

I wonder whether a properly commented regular expression wouldn't be easier to understand?

import re

lineLengths = {1:9, 2:11, 3:12, 4:14, 5:14}
inputStr = "THIS IS A LONG DESC 7X7 NEEDS SPLITTING"

# raw string so that \s reaches the regex engine unchanged
# (\b inside a character class means backspace, not a word boundary, so it is left out)
pat = r"""
(?:                     # non-capturing group around the line, as we want to drop leading spaces
    \s*                 # drop leading spaces
    (.{{1,{max_len}}})  # up to max_len characters, filled in through 'format'
    (?=[\sX]|$)         # whitespace, X or end of string as terminators,
                        # in a lookahead, as we need X to go into the next match
)?                      # optional, in case not all lines are necessary
"""

# build a pattern matching up to 5 lines with the corresponding max lengths
pattern = ''.join(pat.format(max_len=lineLengths[n]) for n in sorted(lineLengths))

print(re.match(pattern, inputStr, re.VERBOSE).groups())
# Out: ('THIS IS A', 'LONG DESC 7', 'X7 NEEDS', 'SPLITTING', None)

Also, there is no real point in using a dict for line_lengths; a list would do just as well.
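A list-based variant of the same idea might look like the sketch below. The names line_lengths, piece and lines are illustrative, the pattern is written on one line instead of in verbose mode, and unused trailing groups are filtered out at the end.

```python
import re

line_lengths = [9, 11, 12, 14, 14]
input_str = "THIS IS A LONG DESC 7X7 NEEDS SPLITTING"

# One optional group per line: skip leading spaces, take up to n characters,
# and require whitespace, an X or the end of the string right after them.
piece = r"(?:\s*(.{{1,{n}}})(?=[\sX]|$))?"
pattern = "".join(piece.format(n=n) for n in line_lengths)

match = re.match(pattern, input_str)
# Groups for lines that were not needed are None; drop them.
lines = [g for g in match.groups() if g is not None]
print(lines)
```

Because the list is ordered by construction, the pattern no longer depends on how a dict happens to order its values.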


2013-05-20 11:59:13
