我想使用Anydata-0.12將XML文件轉換爲CSV。 XML文件如下所示:將XML轉換爲CSV時出現內存不足錯誤
<FIXML r="20030618" s="20040109" v="4.4" xr="FIA" xv="1" xmlns="http://www.fixprotocol.org/FIXML-4-4">
<Batch>
<MktDataFull RptID="23520135" BizDt="2016-12-09"><Instrmt Sym="OEF" MMY="20171215" MatDt="2017-12-15" CFI="OCASPS" StrkPx="100" StrkMult="1" StrkValu="100" Mult="100" StrkCcy="USD"/><Full Typ="5" Px="5.7367" Ccy="USD" PxDelta="0.5" Dt="2016-12-09"/><Full Typ="D" Px="100.15" Dt="2016-12-09"/></MktDataFull>
<MktDataFull RptID="30818621" BizDt="2016-12-09"><Instrmt Sym="OEF" MMY="20180615" MatDt="2018-06-15" CFI="OCASPS" StrkPx="100" StrkMult="1" StrkValu="100" Mult="100" StrkCcy="USD"/><Full Typ="5" Px="7.3603" Ccy="USD" PxDelta="0.52" Dt="2016-12-09"/><Full Typ="D" Px="100.15" Dt="2016-12-09"/></MktDataFull>
<MktDataFull RptID="31165289" BizDt="2016-12-09"><Instrmt Sym="OEF" MMY="20170317" MatDt="2017-03-17" CFI="OCASPS" StrkPx="101" StrkMult="1" StrkValu="100" Mult="100" StrkCcy="USD"/><Full Typ="5" Px="1.7973" Ccy="USD" PxDelta="0.46" Dt="2016-12-09"/><Full Typ="D" Px="100.15" Dt="2016-12-09"/></MktDataFull>
<MktDataFull RptID="31165443" BizDt="2016-12-09"><Instrmt Sym="OEF" MMY="20170317" MatDt="2017-03-17" CFI="OCASPS" StrkPx="102" StrkMult="1" StrkValu="100" Mult="100" StrkCcy="USD"/><Full Typ="5" Px="1.2775" Ccy="USD" PxDelta="0.35" Dt="2016-12-09"/><Full Typ="D" Px="100.15" Dt="2016-12-09"/></MktDataFull>
<MktDataFull RptID="31165368" BizDt="2016-12-09"><Instrmt Sym="OEF" MMY="20170317" MatDt="2017-03-17" CFI="OCASPS" StrkPx="103" StrkMult="1" StrkValu="100" Mult="100" StrkCcy="USD"/><Full Typ="5" Px="0.8861" Ccy="USD" PxDelta="0.25" Dt="2016-12-09"/><Full Typ="D" Px="100.15" Dt="2016-12-09"/></MktDataFull>
<MktDataFull RptID="31165483" BizDt="2016-12-09"><Instrmt Sym="OEF" MMY="20170317" MatDt="2017-03-17" CFI="OCASPS" StrkPx="104" StrkMult="1" StrkValu="100" Mult="100" StrkCcy="USD"/><Full Typ="5" Px="0.5858" Ccy="USD" PxDelta="0.25" Dt="2016-12-09"/><Full Typ="D" Px="100.15" Dt="2016-12-09"/></MktDataFull>
<MktDataFull RptID="25807539" BizDt="2016-12-09"><Instrmt Sym="OEF" MMY="20170616" MatDt="2017-06-16" CFI="OCASPS" StrkPx="105" StrkMult="1" StrkValu="100" Mult="100" StrkCcy="USD"/><Full Typ="5" Px="1.321" Ccy="USD" PxDelta="0.26" Dt="2016-12-09"/><Full Typ="D" Px="100.15" Dt="2016-12-09"/></MktDataFull>
<MktDataFull RptID="30818579" BizDt="2016-12-09"><Instrmt Sym="OEF" MMY="20180615" MatDt="2018-06-15" CFI="OCASPS" StrkPx="105" StrkMult="1" StrkValu="100" Mult="100" StrkCcy="USD"/><Full Typ="5" Px="4.7838" Ccy="USD" PxDelta="0.4" Dt="2016-12-09"/><Full Typ="D" Px="100.15" Dt="2016-12-09"/></MktDataFull>
<MktDataFull RptID="32444397" BizDt="2016-12-09"><Instrmt Sym="OEF" MMY="20170616" MatDt="2017-06-16" CFI="OCASPS" StrkPx="106" StrkMult="1" StrkValu="100" Mult="100" StrkCcy="USD"/><Full Typ="5" Px="1.0134" Ccy="USD" PxDelta="0.26" Dt="2016-12-09"/><Full Typ="D" Px="100.15" Dt="2016-12-09"/></MktDataFull>
<MktDataFull RptID="32868839" BizDt="2016-12-09"><Instrmt Sym="OEF" MMY="20170120" MatDt="2017-01-20" CFI="OCASPS" StrkPx="107" StrkMult="1" StrkValu="100" Mult="100" StrkCcy="USD"/><Full Typ="5" Px="0.0079" Ccy="USD" PxDelta="0" Dt="2016-12-09"/><Full Typ="D" Px="100.15" Dt="2016-12-09"/></MktDataFull>
<MktDataFull RptID="32444384" BizDt="2016-12-09"><Instrmt Sym="OEF" MMY="20170616" MatDt="2017-06-16" CFI="OCASPS" StrkPx="109" StrkMult="1" StrkValu="100" Mult="100" StrkCcy="USD"/><Full Typ="5" Px="0.4888" Ccy="USD" PxDelta="0.11" Dt="2016-12-09"/><Full Typ="D" Px="100.15" Dt="2016-12-09"/></MktDataFull>
....
....
</Batch>
</FIXML>
CSV文件包含部分XML。它應該在XML文件中使用的列標題是這樣的:
RptID,BizDt,StrkMult,Sym,StrkValu,Mult,MatDt,CFI,StrkCcy,MMY,StrkPx
23520135,2016-12-09,1,OEF,100,100,2017-12-15,OCASPS,USD,20171215,100
30818621,2016-12-09,1,OEF,100,100,2018-06-15,OCASPS,USD,20180615,100
31165289,2016-12-09,1,OEF,100,100,2017-03-17,OCASPS,USD,20170317,101
31165443,2016-12-09,1,OEF,100,100,2017-03-17,OCASPS,USD,20170317,102
31165368,2016-12-09,1,OEF,100,100,2017-03-17,OCASPS,USD,20170317,103
31165483,2016-12-09,1,OEF,100,100,2017-03-17,OCASPS,USD,20170317,104
...
我運行這段代碼:
use AnyData;
my $input_xml = "oc170120.xml"; #name of the XML file
my $output_csv = "test3.csv"; #name of the output file
$flags->{record_tag} = 'Instrmt';
my $table = adTie('XML', $input_xml, 'r', $flags);
....
它正在與用於測試目的的小文件,一切都很好。但過了一段時間,我越來越
內存不足!
as adtie()
試圖將整個文件讀入內存,並且該XML文件具有超過400000條記錄。
我在64位系統上使用Perl 5.24.1。
創建'$ table'後你在做什麼?它是否給出了*內存不足*在這一行中的錯誤或者在創建了對數據進行操作的表之後? – AbhiNickz
如果XML文件很大,則無法將其加載到內存中。搜索XML的流處理,即[XML :: LibXML :: Reader](http://p3rl.org/XML::LibXML::Reader)或[XML :: SAX](http://p3rl.org/) XML :: SAX)。我不確定已棄用的AnyData是否可以使用。 – choroba
XML :: Twig在分塊處理方面也很出色。 – simbabque