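Python - How do I nest file-read loops?
I was first introduced to Python (and to programming in general) two days ago. Today I got stuck. I have spent hours looking for the answer to what I suspect is a problem so trivial that nobody else has ever been stuck on it :)
My boss wants me to manually clean up huge .xml files into something more human-readable. I am trying to write a script to do it for me. Below is a sample of the .xml file and the output I need.
Input (File.xml):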
<IssueTracking>
<Issue>
<SequenceNum>123</SequenceNum>
<Subject>Subject of Ticket 123</Subject>
<Description>Line 1 in Description field of Ticket 123.
Line 2 in Description field of Ticket 123.
Line 3 in Description field of Ticket 123.</Description>
</Issue>
<Issue>
<SequenceNum>124</SequenceNum>
<Subject>Subject of Ticket 124</Subject>
<Description>Line 1 in Description field of Ticket 124.
Line 2 in Description field of Ticket 124.
Line 3 in Description field of Ticket 124.</Description>
</Issue>
</IssueTracking>
Desired output:
123 Subject of Ticket 123
Line 1 in Description field of Ticket 123.
Line 2 in Description field of Ticket 123.
Line 3 in Description field of Ticket 123.
124 Subject of Ticket 124
Line 1 in Description field of Ticket 124.
Line 2 in Description field of Ticket 124.
Line 3 in Description field of Ticket 124.
Here is what I have so far.
with open("File.xml", 'r') as SourceFile:  # Opens the file
    while 1:  # Keep going through the file to the end
        SourceFileLine = SourceFile.readline()  # Reads the next line of the source file
        if not SourceFileLine:  # readline() returns an empty string at end of file
            break
        SourceFileLine = SourceFileLine.strip()  # Strips the whitespace
        if "<SequenceNum>" in SourceFileLine:
            SequenceNum = SourceFileLine[13:-14]  # Trims the tags, saves the field.
            continue
        if "<Subject>" in SourceFileLine:
            Subject = SourceFileLine[9:-10]
            continue
        #if "<Description>" in SourceFileLine:
        #    last_pos = SourceFile.tell()
        #    while "</Description>" not in SourceFileLine:
        #        SourceFile.seek(last_pos)
        #        ?????
        #
        #    Description = Description[22:]
        #    continue
        if "</Issue>" in SourceFileLine:
            print(SequenceNum, end = "\t")
            print(Subject)
            # print(Description)
            print("\n")
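I am stuck on identifying the three lines between the <Description> tags and joining them into a single string that I can print before continuing through the source file. Having scanned many other examples of line-by-line file-read loops, I suspect that what I need is to mark the point where I reach the target field and nest another read loop at that point in the file. But I have not found another example that does this, so I think I am either missing something basic or there is a better way. Thanks in advance for your help!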
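For what it's worth, the nested-read idea described above can be sketched without tell() or seek(): a file object is its own iterator, so an inner loop can keep pulling lines from the same position the outer loop has reached. This is only a sketch under assumptions: the function name format_issues and the SAMPLE_XML string are made up for illustration, every tag is assumed to start its own line as in the sample, and every <Description> is assumed to have a closing tag.

```python
import io

# Hypothetical sample, shaped like the File.xml excerpt in the question.
SAMPLE_XML = """<IssueTracking>
<Issue>
<SequenceNum>123</SequenceNum>
<Subject>Subject of Ticket 123</Subject>
<Description>Line 1 in Description field of Ticket 123.
Line 2 in Description field of Ticket 123.
Line 3 in Description field of Ticket 123.</Description>
</Issue>
</IssueTracking>
"""


def format_issues(source):
    """Return cleaned-up text for every <Issue> found in an iterable of lines."""
    lines = iter(source)  # one shared iterator for the outer and inner loops
    output = []
    sequence_num = subject = ""
    description = []
    for line in lines:  # outer loop: one line at a time
        line = line.strip()
        if line.startswith("<SequenceNum>"):
            sequence_num = line[len("<SequenceNum>"):-len("</SequenceNum>")]
        elif line.startswith("<Subject>"):
            subject = line[len("<Subject>"):-len("</Subject>")]
        elif line.startswith("<Description>"):
            line = line[len("<Description>"):]
            # inner loop: consume lines from the same iterator until the closing tag
            while "</Description>" not in line:
                description.append(line)
                line = next(lines).strip()
            description.append(line[:line.index("</Description>")])
        elif line.startswith("</Issue>"):
            output.append(f"{sequence_num}\t{subject}")
            output.extend(description)
            description = []  # reset for the next issue
    return "\n".join(output)


print(format_issues(io.StringIO(SAMPLE_XML)))
```

With a real file, the same function could be called as format_issues(SourceFile) inside the with block, since an open text file yields its lines when iterated.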
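Python has a built-in XML parser: http://docs.python.org/library/pyexpat.html – 2012-07-20 19:24:14
+1 for including the input, the desired output, and what you tried. – 2012-07-20 19:58:12
You should probably use a human-friendly serializer such as YAML to write the data out once you have extracted it. You never know when you will need to process this data again. – 2012-07-20 20:05:14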
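To illustrate the parser suggestion, here is a rough sketch that uses the standard library's xml.etree.ElementTree rather than pyexpat directly (ElementTree is a higher-level module whose default parser is built on expat, so this is a swap, not what the linked page shows). The function name issues_to_text and the SAMPLE_XML string are assumptions made for the example; the tag names come from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shaped like the File.xml excerpt in the question.
SAMPLE_XML = """<IssueTracking>
<Issue>
<SequenceNum>123</SequenceNum>
<Subject>Subject of Ticket 123</Subject>
<Description>Line 1 in Description field of Ticket 123.
Line 2 in Description field of Ticket 123.</Description>
</Issue>
</IssueTracking>"""


def issues_to_text(xml_text):
    """Return 'number<TAB>subject' followed by the description for each Issue."""
    root = ET.fromstring(xml_text)
    blocks = []
    for issue in root.iter("Issue"):
        # findtext returns the element's text, or the default if the tag is absent
        number = issue.findtext("SequenceNum", default="")
        subject = issue.findtext("Subject", default="")
        description = issue.findtext("Description", default="")
        blocks.append(f"{number}\t{subject}\n{description}")
    return "\n".join(blocks)


print(issues_to_text(SAMPLE_XML))
```

The parser hands back the multi-line Description as one string, so no nested read loop is needed. For a file on disk, ET.parse("File.xml").getroot() would replace ET.fromstring, and for very large files ET.iterparse can process one element at a time.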
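A small sketch of the "serialize it so you can read it back" idea. YAML support is not in Python's standard library (it needs a third-party package such as PyYAML), so this example substitutes the built-in json module as a stand-in; with PyYAML installed, yaml.safe_dump and yaml.safe_load would play the same roles. The RECORDS list and the function names dump_records and load_records are made up for illustration.

```python
import json

# Hypothetical extracted data: one dictionary per ticket.
RECORDS = [
    {
        "SequenceNum": "123",
        "Subject": "Subject of Ticket 123",
        "Description": "Line 1 in Description field of Ticket 123.\n"
                       "Line 2 in Description field of Ticket 123.",
    },
]


def dump_records(records):
    """Serialize the extracted tickets to indented, readable text."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def load_records(text):
    """Read the tickets back from the serialized text."""
    return json.loads(text)


print(dump_records(RECORDS))
```

The point is the round trip: whatever is written out can be loaded again later without another hand-written parser.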