str.startswith() not working as I intended

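I'm trying to test for a \t or a space character, and I can't understand why this code isn't working. What I'm doing is reading in a file, counting the LOC (lines of code) of the file, and then recording the name of each function in the file along with its lines of code. The code below is where I try to count the LOC for a function.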

import re 

... 
    else: 
      loc += 1 
      for line in infile: 
       line_t = line.lstrip() 
       if len(line_t) > 0 \ 
       and not line_t.startswith('#') \ 
       and not line_t.startswith('"""'): 
        if not line.startswith('\s'): 
         print ('line = ' + repr(line)) 
         loc += 1 
         return (loc, name) 
        else: 
         loc += 1 
       elif line_t.startswith('"""'): 
        while True: 
         if line_t.rstrip().endswith('"""'): 
          break 
         line_t = infile.readline().rstrip() 

      return(loc,name)

Output:

Enter the file name: test.txt 
line = '\tloc = 0\n' 

There were 19 lines of code in "test.txt" 

Function names: 

    count_loc -- 2 lines of code

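As you can see, my test print of the line shows a \t, but the if statement explicitly says (or so I thought) that it should only execute when no whitespace character is present.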

Here is the complete test file I have been using:

def count_loc(infile): 
    """ Receives a file and then returns the amount 
     of actual lines of code by not counting commented 
     or blank lines """ 

    loc = 0 
    for line in infile: 
     line = line.strip() 
     if len(line) > 0 \ 
     and not line.startswith('//') \ 
     and not line.startswith('/*'): 
      loc += 1 
      func_loc, func_name = checkForFunction(line); 
     elif line.startswith('/*'): 
      while True: 
       if line.endswith('*/'): 
        break 
       line = infile.readline().rstrip() 

    return loc 

if __name__ == "__main__": 
    print ("Hi") 
    Function LOC = 15 
    File LOC = 19

2009-05-29 Justen

Don't post duplicates (http://stackoverflow.com/questions/929169/str-startswith-not-working-as-i-intended). – 2009-05-30 07:02:26

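\s means whitespace only to the re package, when it is doing pattern matching.

For startswith, an ordinary method of ordinary strings, \s is nothing special. It is not a pattern, just characters.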
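The difference can be seen in a minimal sketch, assuming a line shaped like the one in the question's output. The two helper names are hypothetical and exist only for illustration.

```python
import re

def starts_with_whitespace_re(line):
    """True if the first character is whitespace, as judged by the re module."""
    return re.match(r"\s", line) is not None

def starts_with_backslash_s(line):
    """True only if the line literally begins with a backslash and an 's'."""
    return line.startswith("\\s")

line = "\tloc = 0\n"
print(starts_with_whitespace_re(line))       # True: re treats \s as a pattern
print(starts_with_backslash_s(line))         # False: the line starts with a tab
print(starts_with_backslash_s("\\section"))  # True: literal backslash + 's'
```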

2009-05-29 19:03:30

I am using import re - I'll add it to the original post – Justen 2009-05-29 19:08:33

@Justen You are importing re, but you are only using basic string methods – JimB 2009-05-29 19:17:50

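Ah, I see. As in my comment on the post below, I tried \t and ' ', but it isn't detecting the 'i' in if __name__ == ..., so it keeps counting the function's LOC until it reaches the end of the file. (By the way, I use regular expressions elsewhere in the program, so the import is still needed.) – Justen 2009-05-29 19:31:40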

Your string literals aren't what you think they are. You can specify a space or a TAB like so:

space = ' ' 
tab = '\t'
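Because startswith also accepts a tuple of prefixes, both literals can be tested in a single call. A small sketch follows; is_indented is a hypothetical helper name, not something from the question's code.

```python
def is_indented(line):
    """True if the line begins with a literal space or tab character."""
    return line.startswith((" ", "\t"))

print(len("\t"))   # 1: a single tab character
print(len("\\s"))  # 2: a backslash followed by 's'
print(is_indented("\tloc = 0\n"))                   # True
print(is_indented('if __name__ == "__main__":\n'))  # False
```

A blank line also returns False here, so blank lines need to be skipped before this check, as the question's code already does with len(line_t) > 0.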

2009-05-29 19:05:08 JimB

Your question has already been answered and this is slightly off-topic, but...

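If you want to parse code, it is often easier and less error-prone to use a parser. If your code is Python code, Python comes with a couple of parsers (tokenize, ast, parser). For other languages, you can find a lot of parsers on the internet. ANTLR is a well-known one with Python bindings.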

As an example, the following few lines of code print all lines of a Python module that are neither comments nor docstrings:

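import tokenize

# Token types that do not count as code: comments, strings (including
# docstrings), and purely structural tokens.
ignored_tokens = [tokenize.NEWLINE, tokenize.COMMENT, tokenize.N_TOKENS,
                  tokenize.STRING, tokenize.ENDMARKER, tokenize.INDENT,
                  tokenize.DEDENT, tokenize.NL]
with open('test.py', 'r') as f:
    g = tokenize.generate_tokens(f.readline)
    line_num = 0
    for a_token in g:
        # a_token[2] is the (row, column) where the token starts; print only
        # the first significant token found on each line.
        if a_token[2][0] != line_num and a_token[0] not in ignored_tokens:
            line_num = a_token[2][0]
            print(a_token)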

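Since a_token above is already parsed, you can easily check for function definitions as well. You can also keep track of where a function ends by looking at the column at which the current token starts, a_token[2][1]. If you want to do more complex things, you should use ast.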
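One way the column idea could be used is sketched below. This is an illustration under stated assumptions, not the answer's own code: function_loc is a hypothetical name, only top-level def statements are recognized, and a function is assumed to end at the next line whose first significant token starts in column 0.

```python
import io
import tokenize

# Token types that never count as code (same idea as ignored_tokens above).
IGNORED = {tokenize.NEWLINE, tokenize.COMMENT, tokenize.STRING,
           tokenize.ENDMARKER, tokenize.INDENT, tokenize.DEDENT, tokenize.NL}

def function_loc(source):
    """Return {function name: code lines} for top-level functions in source."""
    counts = {}
    current = None       # name of the function being counted, if any
    last_row = 0         # row of the last line already accounted for
    expect_name = False  # True right after a top-level 'def' keyword
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in IGNORED:
            continue
        row, col = tok.start
        if row != last_row:
            # First significant token on a new line.
            last_row = row
            if col == 0:
                # Column 0 means any previous function body has ended.
                current = None
                if tok.type == tokenize.NAME and tok.string == "def":
                    expect_name = True
                    continue
            if current is not None:
                counts[current] += 1
        elif expect_name and tok.type == tokenize.NAME:
            # The token right after 'def' is the function name; the def
            # line itself counts as the function's first line of code.
            current = tok.string
            counts[current] = 1
            expect_name = False
    return counts
```

Called on the text of a module, this returns a dictionary mapping each top-level function name to the number of lines in it that hold code, skipping comment-only lines, blank lines, and docstrings. Nested functions and methods are counted as part of the enclosing top-level block.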

2009-05-29 21:04:44 stephan
