2015-05-11 56 views
How do I parse this XML response in Python?

This is my XML file:

<?xml version="1.0" ?> 
<Items> 
    <Item> 
     <ASIN>3570102769</ASIN> 
     <DetailPageURL>http://www.amazon.de/Inside-IS-Tage-Islamischen-Staat/dp/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D165953%26creativeASIN%3D3570102769</DetailPageURL> 
     <ItemLinks> 
      <ItemLink> 
       <Description>Add To Wishlist</Description> 
       <URL>http://www.amazon.de/gp/registry/wishlist/add-item.html%3Fasin.0%3D3570102769%26SubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL> 
      </ItemLink> 
      <ItemLink> 
       <Description>Tell A Friend</Description> 
       <URL>http://www.amazon.de/gp/pdp/taf/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL> 
      </ItemLink> 
      <ItemLink> 
       <Description>All Customer Reviews</Description> 
       <URL>http://www.amazon.de/review/product/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL> 
      </ItemLink> 
      <ItemLink> 
       <Description>All Offers</Description> 
       <URL>http://www.amazon.de/gp/offer-listing/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL> 
      </ItemLink> 
     </ItemLinks> 
     <ItemAttributes> 
      <Author>Jürgen Todenhöfer</Author> 
      <Binding>Gebundene Ausgabe</Binding> 
      <EAN>9783570102763</EAN> 
      <EANList> 
       <EANListElement>9783570102763</EANListElement> 
      </EANList> 
      <ISBN>3570102769</ISBN> 
      <IsEligibleForTradeIn>1</IsEligibleForTradeIn> 
      <ItemDimensions> 
       <Height Units="hundredths-inches">874</Height> 
       <Length Units="hundredths-inches">575</Length> 
       <Width Units="hundredths-inches">126</Width> 
      </ItemDimensions> 
      <Label>C. Bertelsmann Verlag</Label> 
      <Languages> 
       <Language> 
        <Name>Deutsch</Name> 
        <Type>Published</Type> 
       </Language> 
       <Language> 
        <Name>Deutsch</Name> 
        <Type>Original</Type> 
       </Language> 
       <Language> 
        <Name>Deutsch</Name> 
        <Type>Unbekannt</Type> 
       </Language> 
      </Languages> 
      <ListPrice> 
       <Amount>1799</Amount> 
       <CurrencyCode>EUR</CurrencyCode> 
       <FormattedPrice>EUR 17,99</FormattedPrice> 
      </ListPrice> 
      <Manufacturer>C. Bertelsmann Verlag</Manufacturer> 
      <ManufacturerMinimumAge Units="months">192</ManufacturerMinimumAge> 
      <NumberOfPages>288</NumberOfPages> 
      <PackageDimensions> 
       <Height Units="hundredths-inches">118</Height> 
       <Length Units="hundredths-inches">567</Length> 
       <Weight Units="hundredths-pounds">93</Weight> 
       <Width Units="hundredths-inches">252</Width> 
      </PackageDimensions> 
      <PackageQuantity>1</PackageQuantity> 
      <ProductGroup>Book</ProductGroup> 
      <ProductTypeName>ABIS_BOOK</ProductTypeName> 
      <PublicationDate>2015-04-27</PublicationDate> 
      <Publisher>C. Bertelsmann Verlag</Publisher> 
      <Studio>C. Bertelsmann Verlag</Studio> 
      <Title>Inside IS - 10 Tage im 'Islamischen Staat'</Title> 
      <TradeInValue> 
       <Amount>930</Amount> 
       <CurrencyCode>EUR</CurrencyCode> 
       <FormattedPrice>EUR 9,30</FormattedPrice> 
      </TradeInValue> 
     </ItemAttributes> 
     <OfferSummary> 
      <LowestNewPrice> 
       <Amount>1799</Amount> 
       <CurrencyCode>EUR</CurrencyCode> 
       <FormattedPrice>EUR 17,99</FormattedPrice> 
      </LowestNewPrice> 
      <LowestUsedPrice> 
       <Amount>1390</Amount> 
       <CurrencyCode>EUR</CurrencyCode> 
       <FormattedPrice>EUR 13,90</FormattedPrice> 
      </LowestUsedPrice> 
      <LowestCollectiblePrice> 
       <Amount>4999</Amount> 
       <CurrencyCode>EUR</CurrencyCode> 
       <FormattedPrice>EUR 49,99</FormattedPrice> 
      </LowestCollectiblePrice> 
      <TotalNew>56</TotalNew> 
      <TotalUsed>8</TotalUsed> 
      <TotalCollectible>1</TotalCollectible> 
      <TotalRefurbished>0</TotalRefurbished> 
     </OfferSummary> 
     <Offers> 
      <TotalOffers>1</TotalOffers> 
      <TotalOfferPages>1</TotalOfferPages> 
      <MoreOffersUrl>http://www.amazon.de/gp/offer-listing/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</MoreOffersUrl> 
      <Offer> 
       <OfferAttributes> 
        <Condition>New</Condition> 
       </OfferAttributes> 
       <OfferListing> 
        <OfferListingId>9KHCZj9qtL6ucVBPASfXaryQjU8tWbc0n%2F3F4F7GraOKW6Csji2OxpD93%2FkoHwgIGQctlnrtx4RWIeJULAcvvsFhiopFi08JdsZ%2FeO3u6g0%3D</OfferListingId> 
        <Price> 
         <Amount>1799</Amount> 
         <CurrencyCode>EUR</CurrencyCode> 
         <FormattedPrice>EUR 17,99</FormattedPrice> 
        </Price> 
        <Availability>Gewöhnlich versandfertig in 24 Stunden</Availability> 
        <AvailabilityAttributes> 
         <AvailabilityType>now</AvailabilityType> 
         <MinimumHours>0</MinimumHours> 
         <MaximumHours>0</MaximumHours> 
        </AvailabilityAttributes> 
        <IsEligibleForSuperSaverShipping>1</IsEligibleForSuperSaverShipping> 
       </OfferListing> 
      </Offer> 
     </Offers> 
    </Item> 
    <Item> 
     <ASIN>3813506479</ASIN> 
     <DetailPageURL>http://www.amazon.de/Altes-Land-Roman-D%C3%B6rte-Hansen/dp/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D165953%26creativeASIN%3D3813506479</DetailPageURL> 
     <ItemLinks> 
      <ItemLink> 
       <Description>Add To Wishlist</Description> 
       <URL>http://www.amazon.de/gp/registry/wishlist/add-item.html%3Fasin.0%3D3813506479%26SubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL> 
      </ItemLink> 
      <ItemLink> 
       <Description>Tell A Friend</Description> 
       <URL>http://www.amazon.de/gp/pdp/taf/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL> 
      </ItemLink> 
      <ItemLink> 
       <Description>All Customer Reviews</Description> 
       <URL>http://www.amazon.de/review/product/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL> 
      </ItemLink> 
      <ItemLink> 
       <Description>All Offers</Description> 
       <URL>http://www.amazon.de/gp/offer-listing/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL> 
      </ItemLink> 
     </ItemLinks> 
     <ItemAttributes> 
      <Author>Dörte Hansen</Author> 
      <Binding>Gebundene Ausgabe</Binding> 
      <EAN>9783813506471</EAN> 
      <EANList> 
       <EANListElement>9783813506471</EANListElement> 
      </EANList> 
      <ISBN>3813506479</ISBN> 
      <IsEligibleForTradeIn>1</IsEligibleForTradeIn> 
      <ItemDimensions> 
       <Height Units="hundredths-inches">870</Height> 
       <Length Units="hundredths-inches">567</Length> 
       <Width Units="hundredths-inches">114</Width> 
      </ItemDimensions> 
      <Label>Albrecht Knaus Verlag</Label> 
      <Languages> 
       <Language> 
        <Name>Deutsch</Name> 
        <Type>Published</Type> 
       </Language> 
       <Language> 
        <Name>Deutsch</Name> 
        <Type>Original</Type> 
       </Language> 
      </Languages> 
      <ListPrice> 
       <Amount>1999</Amount> 
       <CurrencyCode>EUR</CurrencyCode> 
       <FormattedPrice>EUR 19,99</FormattedPrice> 
      </ListPrice> 
      <Manufacturer>Albrecht Knaus Verlag</Manufacturer> 
      <NumberOfPages>288</NumberOfPages> 
      <PackageDimensions> 
       <Height Units="hundredths-inches">118</Height> 
       <Length Units="hundredths-inches">858</Length> 
       <Weight Units="hundredths-pounds">101</Weight> 
       <Width Units="hundredths-inches">559</Width> 
      </PackageDimensions> 
      <ProductGroup>Book</ProductGroup> 
      <ProductTypeName>ABIS_BOOK</ProductTypeName> 
      <PublicationDate>2015-02-16</PublicationDate> 
      <Publisher>Albrecht Knaus Verlag</Publisher> 
      <Studio>Albrecht Knaus Verlag</Studio> 
      <Title>Altes Land: Roman</Title> 
      <TradeInValue> 
       <Amount>965</Amount> 
       <CurrencyCode>EUR</CurrencyCode> 
       <FormattedPrice>EUR 9,65</FormattedPrice> 
      </TradeInValue> 
     </ItemAttributes> 
     <OfferSummary> 
      <LowestNewPrice> 
       <Amount>1999</Amount> 
       <CurrencyCode>EUR</CurrencyCode> 
       <FormattedPrice>EUR 19,99</FormattedPrice> 
      </LowestNewPrice> 
      <LowestUsedPrice> 
       <Amount>1599</Amount> 
       <CurrencyCode>EUR</CurrencyCode> 
       <FormattedPrice>EUR 15,99</FormattedPrice> 
      </LowestUsedPrice> 
      <TotalNew>72</TotalNew> 
      <TotalUsed>8</TotalUsed> 
      <TotalCollectible>0</TotalCollectible> 
      <TotalRefurbished>0</TotalRefurbished> 
     </OfferSummary> 
     <Offers> 
      <TotalOffers>1</TotalOffers> 
      <TotalOfferPages>1</TotalOfferPages> 
      <MoreOffersUrl>http://www.amazon.de/gp/offer-listing/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</MoreOffersUrl> 
      <Offer> 
       <OfferAttributes> 
        <Condition>New</Condition> 
       </OfferAttributes> 
       <OfferListing> 
        <OfferListingId>aeRv5KPt26T8S0hLrgV8Bv9UPYABYOMijGRxffbNJXUZSN4XfeeOZZpCZ28EURzmgMLlcYEBSRlMXS%2F8Z0pN1JbYerndME%2B2VK3RosfdQJA%3D</OfferListingId> 
        <Price> 
         <Amount>1999</Amount> 
         <CurrencyCode>EUR</CurrencyCode> 
         <FormattedPrice>EUR 19,99</FormattedPrice> 
        </Price> 
        <Availability>Gewöhnlich versandfertig in 24 Stunden</Availability> 
        <AvailabilityAttributes> 
         <AvailabilityType>now</AvailabilityType> 
         <MinimumHours>0</MinimumHours> 
         <MaximumHours>0</MaximumHours> 
        </AvailabilityAttributes> 
        <IsEligibleForSuperSaverShipping>1</IsEligibleForSuperSaverShipping> 
       </OfferListing> 
      </Offer> 
     </Offers> 
    </Item> 
</Items> 

I want to get the ASIN element of each item, so I tried this:

from lxml import etree 
doc = etree.fromstring(xmlstring) 
items = doc.xpath('//Items/Item') 
for a in items: 
    asin = a.xpath('//ASIN/text()') 
    print(asin) 

What I get is this:

['3570102769', '3813506479'] 
['3570102769', '3813506479'] 

But I want this:

['3570102769'] 
['3813506479'] 

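I don't understand what is wrong here. I thought I was iterating over the Item elements, and that each of them is a single item with a single ASIN. Why does every iteration return both ASINs?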

The given XML only has 2 ASIN elements? Are you expecting two 2-element lists?

Answers

When you call a.xpath('//ASIN/text()'), you are searching the entire document tree. To quote the XML Path Language specification:

//para selects all the para descendants of the document root and thus selects all para elements in the same document as the context node

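So what you are doing is iterating over the matched Item nodes and, for each one, saying "give me all the ASIN nodes in this document, please". The context node (the Item) is used only to locate the document root, so it does not narrow the search.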
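To make this concrete, here is a minimal sketch, assuming lxml is installed and using a trimmed-down stand-in for the response (only the two ASIN values from the question are kept). It checks that running '//ASIN/text()' from each Item gives the same result as running it from the root element:

```python
from lxml import etree

# Trimmed-down stand-in for the response in the question (ASINs only).
xmlstring = b"""<?xml version="1.0" ?>
<Items>
  <Item><ASIN>3570102769</ASIN></Item>
  <Item><ASIN>3813506479</ASIN></Item>
</Items>"""

doc = etree.fromstring(xmlstring)

# Query evaluated with the root element as context node.
from_root = doc.xpath('//ASIN/text()')

# The same absolute query evaluated with each Item as context node.
per_item = [item.xpath('//ASIN/text()') for item in doc.xpath('//Items/Item')]

# Every per-Item result equals the document-wide result.
same_as_root = all(result == from_root for result in per_item)
print(from_root)
print(same_as_root)
```

Because the path starts with //, the Item it is called on makes no difference to what is matched.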

What you should do instead is select the ASIN child nodes directly. Keeping your original implementation, that could look like this:

from lxml import etree 

doc = etree.fromstring(xmlstring) 
items = doc.xpath('//Items/Item') 
for a in items: 
    # Relative path: only ASIN children of the current Item are matched.
    asin = a.xpath('ASIN/text()') 
    print(asin) 

This gives the output you want:

['3570102769'] 
['3813506479'] 

Alternatively, if you are not sure where inside the Item node your ASIN appears, you can use .//ASIN/text().
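The difference between the two relative forms only shows up when the ASIN is not a direct child. The sketch below assumes a hypothetical input in which the second ASIN is wrapped in a made-up Details element (this element does not appear in the real response):

```python
from lxml import etree

# Hypothetical input: the second ASIN is nested inside a made-up <Details> element.
xmlstring = b"""<Items>
  <Item><ASIN>3570102769</ASIN></Item>
  <Item><Details><ASIN>3813506479</ASIN></Details></Item>
</Items>"""

doc = etree.fromstring(xmlstring)
items = doc.xpath('/Items/Item')

# 'ASIN' matches direct children of the current Item only.
direct = [item.xpath('ASIN/text()') for item in items]

# './/ASIN' matches ASIN elements at any depth below the current Item.
anywhere = [item.xpath('.//ASIN/text()') for item in items]

print(direct)
print(anywhere)
```

With this input, the direct-child query finds nothing for the second Item, while the .// form still finds the nested ASIN without leaking into the other Item.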

Thanks! This works perfectly. It is clear to me now.

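@JulianBaehr You can use the current node ('.') as the context, i.e. 'asin = a.xpath('.//ASIN/text()')', but if '<ASIN>' is a direct child of '<Item>' that is not really necessary. '//foo' is an absolute path that traverses the whole tree starting from the root (it is short for '/descendant-or-self::node()/child::foo'). Prefixing it with a dot makes it a relative path, but it still traverses the whole subtree below the context node. Avoid it if you don't want to traverse the whole tree. – Tomalak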
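The abbreviation mentioned in this comment can be checked directly. This is a small sketch, assuming lxml and the same trimmed-down two-item stand-in for the response; the unabbreviated paths are written out using the standard XPath 1.0 axes:

```python
from lxml import etree

# Trimmed-down stand-in for the response in the question (ASINs only).
xmlstring = b"""<Items>
  <Item><ASIN>3570102769</ASIN></Item>
  <Item><ASIN>3813506479</ASIN></Item>
</Items>"""

doc = etree.fromstring(xmlstring)
first_item = doc.xpath('/Items/Item')[0]

# Absolute path: abbreviated and unabbreviated forms, both start at the root.
absolute_short = first_item.xpath('//ASIN/text()')
absolute_long = first_item.xpath(
    '/descendant-or-self::node()/child::ASIN/text()')

# Relative path: the leading dot anchors the search at the context node.
relative_short = first_item.xpath('.//ASIN/text()')
relative_long = first_item.xpath(
    'self::node()/descendant-or-self::node()/child::ASIN/text()')

print(absolute_short == absolute_long)
print(relative_short == relative_long)
print(relative_short)
```

Both pairs agree, and only the relative forms stay inside the first Item.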