2017-02-19 166 views

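Node selection in an XML document

I am working on an unstructured XML document so that it can be converted into a structured one. The unstructured document looks like this: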

<?xml version="1.0" encoding="UTF-8"?> 
<CustomerInformation> 
    <CustomerPurchaseID>String</CustomerPurchaseID> 
    <MemberAddress>String</MemberAddress> 
    <MemberID>String</MemberID> 
    <MemberCity>String</MemberCity> 
    <MemberName>String</MemberName> 
    <MemberType>String</MemberType> 
    <MemberState>String</MemberState> 
    <MemberSince>String</MemberSince> 
    <PurchaseDate>String</PurchaseDate> 
    <CreditCardName></CreditCardName> 
    <CreditCardExpirration></CreditCardExpirration> 
    <Orders> 
     <LineItemCode>String</LineItemCode> 
     <LineItemID>String</LineItemID> 
     <LineItemDescription>String</LineItemDescription> 
     <DiscountCode>String</DiscountCode> 
    </Orders> 
    <Orders> 
     <LineItemCode>String</LineItemCode> 
     <LineItemID>String</LineItemID> 
     <LineItemDescription>String</LineItemDescription> 
     <DiscountCode>String</DiscountCode> 
    </Orders> 
    <ShipToAddress>String</ShipToAddress> 
    <ShipToCity>String</ShipToCity> 
    <ShipToFirstName>String</ShipToFirstName> 
    <ShipToLastName>String</ShipToLastName> 
    <ShipToState>String</ShipToState> 
    <ShipToZIPCode>String</ShipToZIPCode> 
    <CustomerAddressLine1>String</CustomerAddressLine1> 
    <CustomerAddressLine2>String</CustomerAddressLine2> 
    <CustomerID>String</CustomerID> 
    <CustomerCity>String</CustomerCity> 
    <CustomerEmail>String</CustomerEmail> 
    <CustomerFirstName>String</CustomerFirstName> 
    <CustomerLastName>String</CustomerLastName> 
    <CustomerHomePhone>String</CustomerHomePhone> 
    <CustomerState>String</CustomerState> 
    <CustomerZIP>String</CustomerZIP> 
    <Status>String</Status> 
    <OrderedFromName>String</OrderedFromName> 
    <CustomerIdentification></CustomerIdentification> 
    <PrimaryCustomerIndicator>String</PrimaryCustomerIndicator> 
    <OrderedFromAddressLine1Text>String</OrderedFromAddressLine1Text> 
    <OrderedFromAddressLine2Text>String</OrderedFromAddressLine2Text> 
    <OrderedFromCityName>String</OrderedFromCityName> 
    <OrderedFromStateCode>String</OrderedFromStateCode> 
    <OrderedFromZip5Code>String</OrderedFromZip5Code> 
    <OrderedFromZip4Code>String</OrderedFromZip4Code> 
    </CustomerInformation> 

It should be converted into something like this:

<?xml version="1.0" encoding="UTF-8"?> 
<xmlns:evt="http://www.metadata..com/Management/"> 
    <Identifier>3442=000-MNNN</Identifier> 
    <TypeCode>Purchase History</TypeCode> 
    <TypeDescription>Order Summary</TypeDescription> 
    <PurposeCode>Invoice</PurposeCode> 
    <Member> 
     <Email>String</Email> 
     <MemberSince>03/23/2000</MemberSince> 
     <MemberType> 
      <MemberShipTypeCode>String</MemberShipTypeCode> 
      <TypeDescription>String</TypeDescription> 
     </MemberType> 
     <Address> 
      <AddressLine1Text>String</AddressLine1Text> 
      <AddressLine2Text>String</AddressLine2Text> 
      <CityName>String</CityName> 
      <StateCode>String</StateCode> 
      <Zip5Code>String</Zip5Code> 
      <Zip4Code>String</Zip4Code> 
     </Address> 
     <Telephone> 
      <AreaCode>String</AreaCode> 
      <TelephoneNumber>String</TelephoneNumber> 
     </Telephone> 
    </Member> 
    <Company> 
     <CompanyName>String</CompanyName> 
     <CustomerIdentification>0.0</CustomerIdentification> 
     <PrimaryCustomerIndicator>String</PrimaryCustomerIndicator> 
     <CompanyAddress> 
      <CompanyAddressLine1Text>String</CompanyAddressLine1Text> 
      <CompanyAddressLine2Text>String</CompanyAddressLine2Text> 
      <CompanyCityName>String</CompanyCityName> 
      <CompanyStateCode>String</CompanyStateCode> 
      <CompanyZip5Code>String</CompanyZip5Code> 
      <CompanyZip4Code>String</CompanyZip4Code> 
     </CompanyAddress> 
    </Company> 
    <Orders> 
    <CreditCard> 
      <CardName>String</CardName> 
      <CardExpirationDate>1967-08-13</CardExpirationDate> 
    </CreditCard> 
    <Order> 
     <Discount>String</Discount> 
     <ShippingVendorName>String</ShippingVendorName> 
     <ShipmentTrackingNumber>String</ShipmentTrackingNumber> 
     <ShipmentTrackingLinkText>String</ShipmentTrackingLinkText> 
     <CustomerName>String</CustomerName> 
     <CustomerEmailAddressText>String</CustomerEmailAddressText> 
     <Telephone> 
      <AreaCode>String</AreaCode> 
      <TelephoneNumber>String</TelephoneNumber> 
     </Telephone> 
     <ShippingAddress> 
      <ShippingAddressLine1Text>String</ShippingAddressLine1Text> 
      <ShippingAddressLine2Text>String</ShippingAddressLine2Text> 
      <ShippingCareOfText>String</ShippingCareOfText> 
      <ShippingCityName>String</ShippingCityName> 
      <ShippingStateCode>String</ShippingStateCode> 
      <ShippingZip5Code>String</ShippingZip5Code> 
      <ShippingZip4Code>String</ShippingZip4Code> 
     </ShippingAddress> 
     <LineItem> 
      <LineItemNumber>String</LineItemNumber> 
      <LineItemQuantityCount>0</LineItemQuantityCount> 
      <ItemOrderedIndicator>String</ItemOrderedIndicator> 
      <Discount>String</Discount> 
     </LineItem> 
    </Order> 
    </Orders> 

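I was able to generate the XML by writing out the structured format by hand and simply extracting the node values of the relevant fields with the following XSLT: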

<xsl:value-of select=.../> 

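But I feel there may be a better way to do this. I would like to control how the structure is generated while navigating the unstructured (flat) document. Is there a way to group the elements for all the MemberAddress fields, for example? If I could do that, I could create the Member section of the output, and I could do the same for the other elements. My concern with hard-coding the structured document is that it may change in the future. If possible, I would rather be able to control the output.

All member information in the source document should map to the Member element in the target document. Elements in the source document that start with OrderedFrom should map to the Company fields in the target document. The ShipTo elements in turn map to the shipping information in the Orders section of the target document, and so on. Please help!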


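<xmlns:evt="http://www.metadata..com/Management/"> is not a valid start tag. And <xsl:value-of select=.../> is not a valid XSLT instruction. –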

Answer


"My concern with hard-coding the structured document is that it may change in the future."

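An XSLT stylesheet transforms data from one XML schema to another. It is unrealistic to expect that a change in either schema will not require rewriting the stylesheet.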
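A change in either schema still means editing the stylesheet, but the size of each edit can be kept small by giving every source element its own template rule. The following is only a minimal sketch of that idea for part of the Member section. The target names chosen for MemberAddress, MemberCity and MemberState (AddressLine1Text, CityName, StateCode) are assumptions read off the expected output, not a mapping confirmed in the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- The target structure is laid out once, here -->
  <xsl:template match="/CustomerInformation">
    <Member>
      <MemberSince><xsl:value-of select="MemberSince"/></MemberSince>
      <Address>
        <!-- Selected nodes are processed in document order -->
        <xsl:apply-templates select="MemberAddress | MemberCity | MemberState"/>
      </Address>
    </Member>
  </xsl:template>

  <!-- One rule per source element; each target name is an assumption -->
  <xsl:template match="MemberAddress">
    <AddressLine1Text><xsl:value-of select="."/></AddressLine1Text>
  </xsl:template>

  <xsl:template match="MemberCity">
    <CityName><xsl:value-of select="."/></CityName>
  </xsl:template>

  <xsl:template match="MemberState">
    <StateCode><xsl:value-of select="."/></StateCode>
  </xsl:template>
</xsl:stylesheet>
```

With this layout, renaming a target element touches one template rule, and moving a field to another section touches one apply-templates selection.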

"Is there a way to group the elements for all the MemberAddress fields, for example?"

Yes, if you have some way to identify them. For example, with CustomerInformation as the context node, you could do:

<Member> 
    <xsl:for-each select="*[starts-with(name(), 'Member')]"> 
     <xsl:element name="{substring-after(name(), 'Member')}"> 
      <xsl:value-of select="." /> 
     </xsl:element> 
    </xsl:for-each> 
</Member> 

This produces:

<Member> 
    <Address>String</Address> 
    <ID>String</ID> 
    <City>String</City> 
    <Name>String</Name> 
    <Type>String</Type> 
    <State>String</State> 
    <Since>String</Since> 
</Member> 

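However, this does not match your expected output. By the way, your output shows a lot of data that is not in your input, such as the member's email.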
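The same prefix test can be reused for the other groups named in the question (OrderedFrom and ShipTo) by moving it into a named template. This is a rough sketch in XSLT 1.0, not a finished solution: the root element name Invoice and the template name group-by-prefix are hypothetical, and the output element names are simply the source names with the prefix removed, so they will not match the expected output exactly.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/CustomerInformation">
    <!-- "Invoice" is a hypothetical root name, used only for this sketch -->
    <Invoice>
      <Member>
        <xsl:call-template name="group-by-prefix">
          <xsl:with-param name="prefix" select="'Member'"/>
        </xsl:call-template>
      </Member>
      <Company>
        <xsl:call-template name="group-by-prefix">
          <xsl:with-param name="prefix" select="'OrderedFrom'"/>
        </xsl:call-template>
      </Company>
      <ShippingAddress>
        <xsl:call-template name="group-by-prefix">
          <xsl:with-param name="prefix" select="'ShipTo'"/>
        </xsl:call-template>
      </ShippingAddress>
    </Invoice>
  </xsl:template>

  <!-- Copies every child whose name starts with $prefix, dropping the prefix.
       call-template keeps the current node, so the children are still
       those of CustomerInformation. -->
  <xsl:template name="group-by-prefix">
    <xsl:param name="prefix"/>
    <xsl:for-each select="*[starts-with(name(), $prefix)]">
      <xsl:element name="{substring-after(name(), $prefix)}">
        <xsl:value-of select="."/>
      </xsl:element>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

One caveat: an element whose name is exactly the prefix (for example a child named Member) would yield an empty element name and fail, so the predicate would need an extra length check if such elements can occur.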


Yes, the file was trimmed because it is very verbose. Thanks a lot – BreenDeen