How to skip multiple header lines using Python

I'm new to Python. I'm trying to write a script that uses the numeric columns from a file that also contains a header. Here is an example of such a file:

@File_Version: 4 
PROJECTED_COORDINATE_SYSTEM 
#File_Version____________-> 4 
#Master_Project_______-> 
#Coordinate_type_________-> 1 
#Horizon_name____________-> 
sb+ 
#Horizon_attribute_______-> STRUCTURE 
474457.83994 6761013.11978 
474482.83750 6761012.77069 
474507.83506 6761012.42160 
474532.83262 6761012.07251 
474557.83018 6761011.72342 
474582.82774 6761011.37433 
474607.82530 6761011.02524

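I want to skip the header. Here is my attempt. Of course, it works when I know which characters appear in the header, such as "#" and "@". But how can I skip every line that contains any alphabetic character?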

import re

in_file1 = open(input_file1_short, 'r')
out_file1 = open(output_file1_short, "w")
lines = in_file1.readlines()
for line in lines:
    # Skip header lines, recognised here only by the '#' and '@' markers.
    if "#" not in line and "@" not in line:
        strip_line = line.strip()
        # Split on runs of spaces, commas, pipes, semicolons, quotes or tabs.
        replace_split = re.split(r'[ ,|;"\t]+', strip_line)
        x = replace_split[0]
        y = replace_split[1]
        out_file1.write("%s\t%s\n" % (str(x), str(y)))
in_file1.close()
out_file1.close()

Thanks a lot!

Source

2015-11-05 emin

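Can you simply check the leading character, or does your header detection need to be more general than that? If a line can have digits at the front but letters later on, then perhaps I can write you a reduce function. – Prune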

This checks the first character of each line and skips every line that does not start with a digit:

for line in lines:
    if line[0].isdigit():
        # we've got a line starting with a digit
        ...
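A minimal, self-contained sketch of the first-character check, run against a few lines copied from the sample file in the question. The helper name `starts_with_digit` and the `sample` list are illustrative assumptions, not part of the original answer, and the `lstrip()` call is an added guard for lines with leading whitespace.

```python
def starts_with_digit(line):
    """Return True if the first non-blank character of line is a digit."""
    # line.lstrip()[:1] is '' for blank lines, and ''.isdigit() is False,
    # so blank lines are skipped without raising IndexError.
    return line.lstrip()[:1].isdigit()


# A few lines copied from the sample file in the question.
sample = [
    "@File_Version: 4 \n",
    "PROJECTED_COORDINATE_SYSTEM \n",
    "#Horizon_name____________-> \n",
    "sb+ \n",
    "474457.83994 6761013.11978 \n",
    "474482.83750 6761012.77069 \n",
]

data_lines = [line for line in sample if starts_with_digit(line)]
for line in data_lines:
    print(line.strip())
```

One caveat with this approach: a data line that begins with a sign, such as "-12.5 3.0", does not start with a digit and would be skipped as well.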

Source

2015-11-05 20:35:52 razzak

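Use a generator pipeline to filter your input stream. It takes the lines from your original input, but checks each whole line for letters before passing it on.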

from functools import reduce

# Keep only the lines in which no character is alphabetic.
input_stream = (line for line in lines if
                reduce((lambda x, y: (not y.isalpha()) and x), line, True))

for line in input_stream:
    strip_line = ...
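To make the filter above concrete, here is a small runnable sketch that compares the `reduce` fold with an equivalent `any()` test. The function names and the sample `lines` list are assumptions made for illustration. The extra `line.strip()` check is also an addition: a blank line contains no letters, so it would otherwise pass the filter.

```python
from functools import reduce


def no_letters_reduce(line):
    # Fold over the characters; the accumulator stays True only while
    # every character seen so far is non-alphabetic.
    return reduce(lambda acc, ch: (not ch.isalpha()) and acc, line, True)


def no_letters_any(line):
    # Same result, but any() stops at the first letter it finds.
    return not any(ch.isalpha() for ch in line)


# Illustrative input: two header lines, one blank line, one data line.
lines = [
    "#Horizon_attribute_______-> STRUCTURE \n",
    "sb+ \n",
    "\n",
    "474457.83994 6761013.11978 \n",
]

# Drop blank lines as well as lines that contain letters.
filtered = [line for line in lines if line.strip() and no_letters_any(line)]
for line in filtered:
    print(line.strip())
```

The `any()` form is usually easier to read and short-circuits, while the `reduce` form always scans the whole line.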

Source

2015-11-05 20:46:26 Prune

I think you can use some built-ins, like this:

import string
for line in lines:
    if any(letter in line for letter in string.ascii_letters):
        print("there is an ascii letter somewhere in this line")

This only looks for ASCII letters, though.
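A short sketch of what that limitation means in practice. The helper names and the Cyrillic header line are made up for illustration; the point is only that a header written with non-ASCII letters slips past the `string.ascii_letters` test but is caught by `str.isalpha()`.

```python
import string


def has_ascii_letter(line):
    # True only for the 52 letters a-z and A-Z.
    return any(ch in string.ascii_letters for ch in line)


def has_any_letter(line):
    # str.isalpha() also recognises letters outside the ASCII range.
    return any(ch.isalpha() for ch in line)


header = "# Глубина -> 1"            # hypothetical non-ASCII header line
data = "474457.83994 6761013.11978"  # data line from the question

for text in (header, data):
    print(repr(text), has_ascii_letter(text), has_any_letter(text))
```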

You could also do:

import unicodedata
for line in lines:
    # In Python 3 every str character is already Unicode.
    if any(unicodedata.category(letter).startswith('L') for letter in line):
        print("there is a unicode letter somewhere in this line")

but only if I understand my Unicode categories correctly....

Even cleaner (using the suggestion from the other answers, so it works for both Unicode lines and plain strings):

for line in lines:
    if any(letter.isalpha() for letter in line):
        print("there is a letter somewhere in this line")

Interestingly, though, if you do this:

In [57]: u'\u2161'.isdecimal()
Out[57]: False

In [58]: u'\u2161'.isdigit()
Out[58]: False

In [59]: u'\u2161'.isalpha()
Out[59]: False

The Unicode character for the Roman numeral "Two" is none of these, but unicodedata.category(u'\u2161') does return 'Nl', which denotes a number (and u'\u2161'.isnumeric() is True).
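The observation about the Roman numeral can be checked with a few lines of Python 3. This is only a sketch: the `report` dictionary is an illustrative name, and every call used is from the standard `unicodedata` module or a built-in `str` method.

```python
import unicodedata

ch = "\u2161"  # ROMAN NUMERAL TWO

report = {
    "name": unicodedata.name(ch),
    "category": unicodedata.category(ch),  # 'Nl' means "Number, letter"
    "isdecimal": ch.isdecimal(),
    "isdigit": ch.isdigit(),
    "isalpha": ch.isalpha(),
    "isnumeric": ch.isnumeric(),
    "numeric_value": unicodedata.numeric(ch),
}

for key, value in report.items():
    print(key, value)
```

For the header-skipping problem this means a character like this one would not mark a line as a header under an `isalpha()` test, even though it is not a plain decimal digit either.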

Source

2015-11-05 20:49:31 rkh

Thank you very much for the suggestions. Much appreciated! It works using .isalpha, by omitting the lines that contain letters. – emin
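Putting the thread together, an end-to-end version of the script might look like the sketch below. It is not the asker's final code, which was not posted; the function names `extract_xy` and `convert` are assumptions, and the split pattern is taken from the attempt in the question.

```python
import re


def extract_xy(lines):
    """Yield (x, y) string pairs from lines that contain no letters."""
    for line in lines:
        if any(ch.isalpha() for ch in line):
            continue  # header line: contains at least one letter
        fields = re.split(r'[ ,|;"\t]+', line.strip())
        if len(fields) < 2 or not fields[0]:
            continue  # blank or incomplete line
        yield fields[0], fields[1]


def convert(in_path, out_path):
    """Copy the first two columns of every data line to a tab-separated file."""
    with open(in_path, "r") as src, open(out_path, "w") as dst:
        for x, y in extract_xy(src):
            dst.write("%s\t%s\n" % (x, y))


# Lines copied from the sample file in the question.
sample = [
    "@File_Version: 4 \n",
    "PROJECTED_COORDINATE_SYSTEM \n",
    "#Horizon_name____________-> \n",
    "sb+ \n",
    "#Horizon_attribute_______-> STRUCTURE \n",
    "474457.83994 6761013.11978 \n",
    "474482.83750 6761012.77069 \n",
]

pairs = list(extract_xy(sample))
for x, y in pairs:
    print(x, y)
```

The `with` statement closes both files even if an error occurs part-way through. Note that a header line containing no letters at all would still pass the `isalpha()` filter, so this relies on every header line having at least one letter, as in the sample.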
