Making BeautifulSoup play nice with Handlebars

I am writing a script to refactor template-enhanced HTML.

I would like BeautifulSoup to print any code inside {{ }} verbatim, without converting > or other symbols into HTML entities.

The script needs to split a file containing many templates into many files, each containing one template.

Spec: splitTemplates templates.html templateDir must:

Read templates.html
For each <template name="xxx">contents {{ > directive }}</template>, write that template to the file templateDir/xxx
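
For illustration only, the spec above can be sketched without any HTML parser at all, which sidesteps entity conversion because each block is copied verbatim. This is not BeautifulSoup; it is a plain regular-expression scan, and it assumes templates are never nested and that the name attribute comes first and is double-quoted. The function name split_templates and the ext parameter are hypothetical.

```python
import os
import re

# Assumption: name="..." is the first attribute, double-quoted, and
# <template> elements are never nested inside one another.
TEMPLATE_RE = re.compile(
    r'<template\s+name="([^"]+)"[^>]*>.*?</template>', re.DOTALL
)


def split_templates(source_path, template_dir, ext=""):
    """Copy each <template name="xxx"> block, unchanged, to template_dir/xxx + ext."""
    with open(source_path, "r", encoding="utf-8") as f:
        text = f.read()
    os.makedirs(template_dir, exist_ok=True)
    written = []
    for match in TEMPLATE_RE.finditer(text):
        out_path = os.path.join(template_dir, match.group(1) + ext)
        with open(out_path, "w", encoding="utf-8") as out:
            # match.group(0) is the raw source text, so {{ > directive }} is untouched
            out.write(match.group(0))
        written.append(out_path)
    return written
```

A regex is much less robust than a real parser on messy HTML, so this is only a reasonable trade-off when the template file is machine-generated or tightly controlled.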

Code snippet:

import bs4

soup = bs4.BeautifulSoup(open(filename, "r"))
for t in soup.children:
    if t.name == "template":
        newfname = dirname + "/" + t["name"] + ext
        f = open(newfname, "w")
        # note: f.write(t) fails because t is not a string; prettify() returns one
        f.write(t.prettify())
        f.close()

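Unexpected behavior:

{{ > directive }} becomes {{ &gt; directive }} but needs to be preserved as {{ > directive }}

Update: f.write(t.prettify(formatter=None)) sometimes preserves the > and sometimes changes it to &gt;. That seems to be the most obvious change to try, but I don't know why it converts some > and not others.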

Solved: thanks to Hai Vu, and also https://stackoverflow.com/a/663128/103081

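import bs4
import HTMLParser  # Python 2 module; in Python 3 use html.unescape instead

# U converts HTML entities such as &gt; back into literal characters
U = HTMLParser.HTMLParser().unescape
soup = bs4.BeautifulSoup(open(filename, "r"))
for t in soup.children:
    if t.name == "template":
        newfname = dirname + "/" + t["name"] + ext
        f = open(newfname, "w")
        f.write(U(t.prettify(formatter=None)))
        f.close()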
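
One caveat with unescaping the whole prettified string is that entities outside {{ }} (for example an intentional &amp; in ordinary markup) are decoded as well. A possible refinement, sketched here for Python 3 with only the standard library, is to unescape just the text inside handlebars expressions. The helper name restore_handlebars is hypothetical, and the pattern assumes expressions do not themselves contain "}}".

```python
import html
import re

# Matches one handlebars expression, e.g. {{ &gt; directive }}.
# Assumption: the expression body never contains a literal "}}".
MUSTACHE_RE = re.compile(r"\{\{.*?\}\}", re.DOTALL)


def restore_handlebars(markup):
    """Decode HTML entities only inside {{ ... }}; leave other markup as-is."""
    return MUSTACHE_RE.sub(lambda m: html.unescape(m.group(0)), markup)
```

It would be applied to the serialized template before writing, e.g. f.write(restore_handlebars(t.prettify())).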

See also: https://gist.github.com/DrPaulBrewer/9104465

Source

2014-02-19 Paul