2013-09-30 55 views

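Trouble creating an lxml Element subclass

I am trying to create a subclass of the Element class, but I am running into trouble right from the start.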

from lxml import etree 
try: 
    import docx 
except ImportError: 
    from docx import docx 

class File(etree.ElementBase):
    def _init(self):
        etree.ElementBase._init(self)
        self.body = self.append(docx.makeelement('body'))

f = File() 
relationships = docx.relationshiplist() 
title = 'File' 
subject = 'A very special File' 
creator = 'Me' 
keywords = ['python', 'Office Open XML', 'Word'] 
coreprops = docx.coreproperties(title=title, subject=subject, creator=creator, 
    keywords=keywords) 
appprops = docx.appproperties() 
contenttypes = docx.contenttypes() 
websettings = docx.websettings() 
wordrelationships = docx.wordrelationships(relationships) 
docx.savedocx(f, coreprops, appprops, contenttypes, websettings,
    wordrelationships, 'file.docx')

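When I try to open the document this code produces, my version of Word (2003 with the Compatibility Pack) gives me the following error: "This file was created by a pre-release version of Word 2007 and cannot be opened in this version." When I replace the File object with an element created by docx.newdocument(), the document opens fine. Any ideas/suggestions?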


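Did you mean for your constructor to use '__init__' rather than '_init'? Also, you could check the [source code](https://github.com/mikemaccana/python-docx/blob/master/docx.py) of 'docx.newdocument()' to see what the output should look like. It looks like you are missing a "document" tag, although that is just my guess. – Michael0x2a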
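One way to check that guess is to look at what the question's class actually produces. The following is a minimal sketch, reduced so that it needs only lxml: plain `etree.Element('body')` stands in for `docx.makeelement('body')`, which is an assumption made for the sake of a self-contained example.

```python
from lxml import etree


class File(etree.ElementBase):
    def _init(self):
        # Same shape as the question's code; etree.Element stands in for
        # docx.makeelement (an assumption so the sketch needs only lxml).
        self.body = self.append(etree.Element('body'))


f = File()

# With no TAG class attribute, lxml names the element after the class,
# so the root is <File> rather than a WordprocessingML <w:document>.
print(f.tag)

# append() returns None, so self.body never points at the new child.
print(f.body)

# The child itself was still added to the tree.
print(etree.tostring(f))
```

If this reduction is faithful, the saved file has `File` as its root element, which would be consistent with the missing "document" tag suspected above.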

Answer


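I don't really understand why you want a separate class called File.

As Michael0x2a said, you didn't include a document tag, so it won't work (I don't think Word 2007 could read your file either).

But here is the corrected code: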

from lxml import etree 
try: 
    import docx 
except ImportError: 
    from docx import docx 

class File(object):
    @staticmethod
    def makeelement(tagname, tagtext=None, nsprefix='w', attributes=None,
                    attrnsprefix=None):
        '''Create an element & return it'''
        # nsprefixes is the namespace table defined in the docx module
        nsprefixes = docx.nsprefixes
        # Deal with list of nsprefix by making namespacemap
        namespacemap = None
        if isinstance(nsprefix, list):
            namespacemap = {}
            for prefix in nsprefix:
                namespacemap[prefix] = nsprefixes[prefix]
            # FIXME: rest of code below expects a single prefix
            nsprefix = nsprefix[0]
        if nsprefix:
            namespace = '{' + nsprefixes[nsprefix] + '}'
        else:
            # For when namespace = None
            namespace = ''
        newelement = etree.Element(namespace + tagname, nsmap=namespacemap)
        # Add attributes with namespaces
        if attributes:
            # If they haven't bothered setting attribute namespace, use an empty
            # string (equivalent of no namespace)
            if not attrnsprefix:
                # Quick hack: it seems every element that has a 'w' nsprefix for
                # its tag uses the same prefix for its attributes
                if nsprefix == 'w':
                    attributenamespace = namespace
                else:
                    attributenamespace = ''
            else:
                attributenamespace = '{' + nsprefixes[attrnsprefix] + '}'

            for tagattribute in attributes:
                newelement.set(attributenamespace + tagattribute,
                               attributes[tagattribute])
        if tagtext:
            newelement.text = tagtext
        return newelement

    def __init__(self):
        super(File, self).__init__()
        # The root must be a 'document' element that wraps 'body'
        self.document = self.makeelement('document')
        self.document.append(self.makeelement('body'))


f = File() 
relationships = docx.relationshiplist() 
title = 'File' 
subject = 'A very special File' 
creator = 'Me' 
keywords = ['python', 'Office Open XML', 'Word'] 
coreprops = docx.coreproperties(title=title, subject=subject, creator=creator, 
    keywords=keywords) 
appprops = docx.appproperties() 
contenttypes = docx.contenttypes() 
websettings = docx.websettings() 
wordrelationships = docx.wordrelationships(relationships) 
docx.savedocx(f.document, coreprops, appprops, contenttypes, websettings,
    wordrelationships, 'file.docx')

Thanks, but the problem seems to come from the fact that the Element class doesn't like __init__, so a plain constructor method doesn't seem to work. http://lxml.de/element_classes.html –
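For reference, the lxml page linked in that comment describes `_init()` as the hook to override in place of `__init__`, along with `TAG` and `NAMESPACE` class attributes that control the element name. Here is a rough sketch of an `ElementBase` subclass built that way. `WordDocument` and `W_NS` are illustrative names, not part of docx or lxml, and the sketch only builds the element tree; whether Word accepts the saved result is not verified here.

```python
from lxml import etree

# WordprocessingML main namespace (the one docx maps to the 'w' prefix).
W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
BODY = '{%s}body' % W_NS


class WordDocument(etree.ElementBase):  # hypothetical name
    # lxml reads these when the class is instantiated directly; without
    # them the element would be named after the class.
    TAG = 'document'
    NAMESPACE = W_NS

    def _init(self):
        # _init() runs whenever lxml creates a proxy for the element,
        # so only add the body if it is not there yet.
        if self.find(BODY) is None:
            etree.SubElement(self, BODY)

    @property
    def body(self):
        # Look the child up in the tree instead of storing it on the
        # proxy object, which lxml advises against.
        return self.find(BODY)


doc = WordDocument(nsmap={'w': W_NS})
print(etree.tostring(doc, pretty_print=True).decode())
```

Passing `doc` to `docx.savedocx(...)` in place of `f` would be the natural next step, though that part is an untested assumption.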


Why do you want to extend etree.ElementBase? – edi9999