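I have the following code that parses some HTML. I need to save the output (the resulting HTML) as a single line that contains escaped character sequences such as \n. In other words, I want to keep the literal \n in the string content and write everything on one line.
However, I either get a representation from repr() that I can't use because of the surrounding single quotes,
or the output is written across multiple lines like this (with the escape sequences interpreted):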
<section class="prog__container">
<span class="prog__sub">Title</span>
<p>PEP 336 - Make None Callable</p>
<span class="prog__sub">Description</span>
<p>
<p>
<code>
None
</code>
should be a callable object that when called with any
arguments has no side effect and returns
<code>
None
</code>
.
</p>
</p>
</section>
What I need (including the escape sequences):
<section class="prog__container">\n <span class="prog__sub">Title</span>\n <p>PEP 336 - Make None Callable</p>\n <span class="prog__sub">Description</span>\n <p>\n <p>\n <code>\n None\n </code>\n should be a callable object that when called with any\n arguments has no side effect and returns\n <code>\n None\n </code>\n .\n </p>\n </p>\n </section>
My code:
import re

from bs4 import BeautifulSoup

# `html` holds the source HTML string.
soup = BeautifulSoup(html, "html.parser")
# Remove the <div> and <a> wrappers but keep their children.
for match in soup.findAll(['div']):
    match.unwrap()
for match in soup.findAll(['a']):
    match.unwrap()
html = soup.contents[0]
html = str(html)
html = html.splitlines(True)  # keep the line endings
html = " ".join(html)
html = re.sub(re.compile("\n"), "\\n", html)
html = repr(html) # my current solution works, but unusable
The above is my attempt, but the object representation is no good; I need the string representation. How can I achieve this?
This works. Accepting it as the simplest solution. – lkdjf0293
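For reference, here is a minimal sketch of one way to get the single-line form, assuming the goal is exactly the output shown in the question (lines joined with a space, each newline replaced by a literal backslash followed by n). The helper name to_single_line and the sample string are hypothetical and not from the original post. It also shows why the two attempts in the question likely fall short: re.sub processes backslash escapes in its replacement string, so "\\n" is turned back into a real newline, and repr() wraps the text in quotes.

```python
import re


def to_single_line(text: str) -> str:
    """Hypothetical helper: join lines with a space and escape newlines."""
    joined = " ".join(text.splitlines(True))  # same join as in the question
    # str.replace does no escape processing, so "\\n" stays backslash + n.
    return joined.replace("\n", "\\n")


sample = '<p>\n<code>\nNone\n</code>\n</p>'

# re.sub processes escapes in the replacement template: "\\n" becomes a
# real newline again, so this call returns the text unchanged.
unchanged = re.sub("\n", "\\n", sample)

# repr() does escape the newlines, but it also adds surrounding quotes.
quoted = repr(sample)

result = to_single_line(sample)
print(result)  # prints: <p>\n <code>\n None\n </code>\n </p>
```

If a regular expression is still preferred, re.sub("\n", r"\\n", text) should give the same result, because the replacement template then contains an escaped backslash. The resulting string can be written to a file as-is with the usual write() call.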