Beautiful Soup: getting the contents of every <tag> in an XML-ish document

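I have an XML-ish document that I am trying to parse with BeautifulSoup. Say there is an unknown number of tags nested inside another tag's tree. Everything works fine, at least for the first tag I extract from a set of nested tags. This is not really HTML or XML, but it is close.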

Given this format:

<data> 
<type> 
    <type_attribute_1>1</type_attribute_1> 
    <type_attribute_2>2</type_attribute_2> 
</type> 
<type> 
    <type_attribute_1>3</type_attribute_1> 
    <type_attribute_2>4</type_attribute_2> 
</type> 
</data>

How can I extract the values of type_attribute_1 and type_attribute_2 for both <type> tags and assign each one to a variable, i.e. "Type_1_attribute_1", "Type_1_attribute_2", "Type_2_attribute_1" and "Type_2_attribute_2"?

I use code like this, but it only works for the first <type> inside <data>:

Type_1_Attribute_1 = soup.data.type.type_attribute_1.text 
Type_1_Attribute_2 = soup.data.type.type_attribute_2.text

UPDATE

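I think it may help to say that what I am looking for is slightly different. I cannot hard-code a variable name such as Type_1_Attribute_1, because I do not know how many Type siblings there are, so the suffix on "Type" ("_1", "_2", "_3", ...) has to be generated for each sibling node.
Assume: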

Types = soup.select('type')  # every <type> tag, in document order
parseables = len(Types)
for i in range(parseables):
    j = i + 1  # 1-based position of this <type> block
    Type = Types[i]
    Attribute_1 = Type.type_attribute_1.text
    print(Attribute_1)

This prints the value of Attribute_1 for each type. How would I prefix Attribute_1 with "Type_j", with the numeric value of j filled in?

Source

2016-02-03 Mark Brown

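It might have something to do with the fact that lines 2 and 3 of the data have nothing between the brackets, just the brackets.

That was just a mistake in the post, and it is fixed now.

Show an accurate example of your input document (what you have shown _is_ XML, as far as I can see). Show a complete example of your Python code, and state more explicitly what you need as output. Help: http://stackoverflow.com/help/mcve.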

What about this:

from bs4 import BeautifulSoup as bs

data = """<data>
<type>
    <type_attribute_1>1</type_attribute_1>
    <type_attribute_2>2</type_attribute_2>
</type>
<type>
    <type_attribute_1>3</type_attribute_1>
    <type_attribute_2>4</type_attribute_2>
</type>
</data>"""

soup = bs(data, 'lxml')

# select() returns every matching tag, not just the first one
Type_1_Attribute_1 = [i.text.strip() for i in soup.select('type_attribute_1')]
Type_1_Attribute_2 = [i.text.strip() for i in soup.select('type_attribute_2')]

# filter(bool, ...) drops empty strings; list() is needed to print the result on Python 3
print(list(filter(bool, Type_1_Attribute_1)))
print(list(filter(bool, Type_1_Attribute_2)))

Output:

['1', '3']
['2', '4']
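
The code in the question stops at the first <type> because using a tag name as an attribute (soup.data.type) is shorthand for find(), which returns only the first match. If the values need to stay grouped per <type> block rather than per attribute name, something along these lines may help. This is only a sketch under a few assumptions: it uses the same sample data, it swaps in the standard-library 'html.parser' so that lxml is not required, and the names first_only and rows are made up for illustration.

```python
from bs4 import BeautifulSoup

data = """<data>
<type>
    <type_attribute_1>1</type_attribute_1>
    <type_attribute_2>2</type_attribute_2>
</type>
<type>
    <type_attribute_1>3</type_attribute_1>
    <type_attribute_2>4</type_attribute_2>
</type>
</data>"""

# Assumption: 'html.parser' is used here only to avoid the lxml dependency.
soup = BeautifulSoup(data, "html.parser")

# find() returns only the first matching tag, which is what the
# attribute-style access in the question does behind the scenes.
first_only = soup.find("data").find("type").find("type_attribute_1").get_text(strip=True)

# find_all() returns every <type> block, in document order.
rows = []  # hypothetical name: one dict per <type> block
for type_tag in soup.find_all("type"):
    rows.append({
        "type_attribute_1": type_tag.find("type_attribute_1").get_text(strip=True),
        "type_attribute_2": type_tag.find("type_attribute_2").get_text(strip=True),
    })

print(first_only)
print(rows)
```

Each entry in rows corresponds to one <type> block, so rows[0] holds the values of the first block and rows[1] those of the second.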

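Edit: I do not follow. Why do you need separate variables when you can iterate over the list itself? The loop variable already acts as the variable, for example:

for i in Type_1_Attribute_1:
    print(i)  # i is itself a variable; it takes a new value on each iteration

Prints: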

1 
3

So if you need to use each item in the list, just iterate and pass the loop variable to a function, the same way I pass it to print here.
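
If numbered names such as "Type_1_attribute_1" really are needed, a dictionary keyed by generated names is usually easier to work with than creating variables dynamically. The following is a minimal sketch, assuming the same sample data and assuming every child tag name starts with "type_"; the names values and types are hypothetical, and 'html.parser' is used only so the example runs without lxml.

```python
from bs4 import BeautifulSoup

data = """<data>
<type>
    <type_attribute_1>1</type_attribute_1>
    <type_attribute_2>2</type_attribute_2>
</type>
<type>
    <type_attribute_1>3</type_attribute_1>
    <type_attribute_2>4</type_attribute_2>
</type>
</data>"""

soup = BeautifulSoup(data, "html.parser")

types = soup.find_all("type")  # len(types) is the number of <type> blocks
values = {}                    # hypothetical name: generated key -> text

for j, type_tag in enumerate(types, start=1):
    # find_all(True, recursive=False) yields only the direct child tags,
    # skipping the whitespace strings between them.
    for child in type_tag.find_all(True, recursive=False):
        # "type_attribute_1" -> "attribute_1" (assumes the "type_" prefix)
        suffix = child.name.replace("type_", "", 1)
        values["Type_{}_{}".format(j, suffix)] = child.get_text(strip=True)

print(len(types))
print(values)
```

A value can then be looked up by its generated name, for example values["Type_2_attribute_1"], and len(types) gives the count without knowing it in advance.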

Source

2016-02-03 08:30:05 SIslam

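Very helpful. This returns a list of values that I can access with var = Type_1_Attribute_1[i] and assign to a variable. Now, is there a good way to count the number of data tags so that I can assign values to numbered variables? I.e. var1 = Type_1_Attribute_1[0], var2 = Type_2_Attribute_1[0], foo1 = Type_1_Attribute_2[0], foo2 = Type_2_Attribute_2[1]?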
