2014-01-29

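lxml: calling etree.parse does not work

I have a script that parses an HTML file. It worked fine until I changed it slightly so that it can be run from the terminal, like this: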

python myscript.py filename 

When I give the file name directly in the call to parse, it works:

tree = etree.parse("folder/filename.html") 
places = [] 

def f1():
    for dfn in tree.getiterator('dfn'):
        ...
    return places

def main():
    f1()
    file_places = open('list_places.txt', 'w')
    for x in sorted(places):
        print>>file_places, x

Then, instead of the exact file name, I used a variable that should be filled from the command-line argument:

args=sys.argv[1:] 
filename = sys.argv[0] 
tree = etree.parse(filename) 
places = [] 

def extract_places():
    for dfn in tree.getiterator('dfn'):
        ...
    return places

def main():
    if len(args) < 1:
        print 'usage: extract.py [file ...]'
        sys.exit(1)
    else:
        extract_places()
        file_places = open('list_places.txt', 'w')
        for x in sorted(places):
            print>>file_places, x

Here is the error I get:

Traceback (most recent call last):

File "extract.py", line 15, in <module> 
tree = etree.parse(filename) 
File "lxml.etree.pyx", line 2957, in lxml.etree.parse (src/lxml/lxml.etree.c:56299) 
File "parser.pxi", line 1533, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:82382) 
File "parser.pxi", line 1562, in lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:82675) 
File "parser.pxi", line 1462, in lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:81714) 
File "parser.pxi", line 1002, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:78623) 
File "parser.pxi", line 569, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:74567) 
File "parser.pxi", line 650, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:75458) 
File "parser.pxi", line 590, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:74791) 
lxml.etree.XMLSyntaxError: Start tag expected, '<' not found, line 1, column 1 

Answer

filename = sys.argv[0] 

There's your problem. I suspect you meant:

filename = args[0] 
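
To spell out why: sys.argv[0] is the path of the script itself, so etree.parse was handed extract.py, and Python source does not begin with '<', which would explain the "Start tag expected" error. Here is a minimal sketch of the argument handling, written so it also runs on Python 3. The helper name pick_filename and the main(argv) signature are my own, not from your script, and the parse step simply mirrors your etree.parse call.

```python
import sys


def pick_filename(argv):
    """Return the first real command-line argument, or None if there is none.

    argv[0] is the script path, so the file to parse is argv[1].
    """
    args = argv[1:]
    if len(args) < 1:
        return None
    return args[0]


def main(argv):
    filename = pick_filename(argv)
    if filename is None:
        print('usage: extract.py [file ...]')
        return 1
    # Imported here so the usage check runs before lxml is needed.
    from lxml import etree
    tree = etree.parse(filename)
    print(tree.getroot().tag)
    return 0


# Quick check of the helper with a hypothetical argument list.
print(pick_filename(['extract.py', 'folder/filename.html']))
```

In a real script you would finish with sys.exit(main(sys.argv)). Doing the parse inside main, after the length check, also means the usage message is shown when no file is given, rather than a traceback from module-level code.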

Thanks! I somehow missed that.. – user3241376