find_all with camelCase tag names in BeautifulSoup 4

I'm trying to scrape an XML file whose tag names are in camelCase using BeautifulSoup 4.4.0, and find_all doesn't seem to be able to find them. Example code:

from bs4 import BeautifulSoup

# Lowercase tag name: found as expected.
xml = """
<hello>
    world
</hello>
"""
soup = BeautifulSoup(xml, "lxml")

for x in soup.find_all("hello"):
    print(x)

# camelCase tag name: find_all returns nothing.
xml2 = """
<helloWorld>
    :-)
</helloWorld>
"""
soup = BeautifulSoup(xml2, "lxml")

for x in soup.find_all("helloWorld"):
    print(x)

The output I get is:

$ python soup_test.py 
<hello> 
    world 
</hello>

What is the correct way to find camelCase/uppercase tag names?

2015-07-21 Paul Johnson

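For any case-sensitive parsing with BeautifulSoup, parse in "xml" mode. The default mode (HTML parsing) ignores case because HTML tag names are case-insensitive, so the HTML parsers, including "lxml", convert tag names to lowercase and <helloWorld> is stored as <helloworld>. In your case, switch from "lxml" to "xml" (this parser also requires lxml to be installed):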

from bs4 import BeautifulSoup

xml2 = """
<helloWorld>
    :-)
</helloWorld>
"""
# The "xml" parser preserves the case of tag names.
soup = BeautifulSoup(xml2, "xml")

for x in soup.find_all("helloWorld"):
    print(x)
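
To see the difference between the two modes side by side, here is a small sketch that parses the same string with both parsers and counts the matches. It assumes lxml is installed; the variable names (html_soup, xml_soup and so on) are only illustrative and not part of the original question.

```python
from bs4 import BeautifulSoup

xml2 = "<helloWorld>:-)</helloWorld>"

# "lxml" is an HTML parser: tag names are converted to lowercase.
html_soup = BeautifulSoup(xml2, "lxml")
# "xml" is lxml's XML parser: tag names keep their original case.
xml_soup = BeautifulSoup(xml2, "xml")

html_camel = html_soup.find_all("helloWorld")  # search with the original case
html_lower = html_soup.find_all("helloworld")  # search with the lowercased name
xml_camel = xml_soup.find_all("helloWorld")

print(len(html_camel), len(html_lower), len(xml_camel))
```

Searching the HTML-parsed soup for the lowercased name does find the tag, but the original capitalization is already lost at that point, so "xml" mode is the safer choice when case matters.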

2015-07-21 23:18:36 heinst
