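Can anyone tell me what approaches are possible with Python multithreading? I have an XML file (163 MB), and my task is to insert its data into a database. How can I do this with Python multithreading? The job has three steps:
- Read the XML file
- Insert the data into a database (multiple tables)
- Record the number of rows inserted in a log file
I already have Python code that reads the XML file and performs the three steps above, including logging the number of rows inserted. What I actually want is to use multithreading to speed up the process, but I don't know how to get started.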
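One possible way to start with threads is a producer/consumer split: the main thread produces rows while a single writer thread inserts them in batches. The following is only a minimal sketch under assumptions: it uses `sqlite3` as a stand-in for the real database, and the `content` table, its columns, and the `load` and `writer` names are hypothetical. Because of the GIL, threads mainly help by overlapping parsing with database I/O, so a speed-up is not guaranteed.

```python
import logging
import queue
import sqlite3
import threading

SENTINEL = None  # placed on the queue to signal "no more rows"

INSERT_SQL = "INSERT INTO content (id, title) VALUES (?, ?)"  # hypothetical table


def writer(db_path, work_queue, batch_size):
    """Consume (id, title) rows from the queue and insert them in batches."""
    conn = sqlite3.connect(db_path)  # the connection lives only in this thread
    conn.execute(
        "CREATE TABLE IF NOT EXISTS content (id INTEGER PRIMARY KEY, title TEXT)"
    )
    batch, total = [], 0
    while True:
        row = work_queue.get()
        if row is SENTINEL:
            break
        batch.append(row)
        if len(batch) >= batch_size:
            conn.executemany(INSERT_SQL, batch)
            conn.commit()
            total += len(batch)
            batch.clear()
    if batch:  # flush the final partial batch
        conn.executemany(INSERT_SQL, batch)
        conn.commit()
        total += len(batch)
    conn.close()
    logging.info("inserted %d rows", total)


def load(rows, db_path, batch_size=500):
    """Feed rows to a writer thread so producing and inserting can overlap."""
    work_queue = queue.Queue(maxsize=10000)  # bounded, so memory use stays flat
    thread = threading.Thread(target=writer, args=(db_path, work_queue, batch_size))
    thread.start()
    for row in rows:
        work_queue.put(row)
    work_queue.put(SENTINEL)
    thread.join()
```

Here `rows` can be any iterable, for example a generator that parses the XML lazily. Batching the inserts and committing once per batch may matter more for speed than the threading itself.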
This is the XML structure.
<Content id="359366">
<Title>This title</Title>
<SortTitle>sorting</SortTitle>
<PublisherEntity id="2003">ABC Publishing Group</PublisherEntity>
<Publisher>ABC Publishing Group</Publisher>
<Imprint>Revell</Imprint>
<Language code = "en">English</Language>
<GeoRight>
<GeoCountry code = "WW" model = "Distribution">World</GeoCountry>
</GeoRight>
<Format type = "Adobe EPUB eBook">
<Identifier type = "DRMID">xxx-xxx-xx</Identifier>
<Identifier type = "ISBN">1234567</Identifier>
<SRP currency = "SGD">18.89</SRP>
<WholesaleCost currency = "SGD">11.14</WholesaleCost>
<OnSaleDate>01 Sep 2010</OnSaleDate>
<MinimumSoftwareVersion number="1.x">Adobe Digital Editions</MinimumSoftwareVersion>
<DownloadFileName>HouseonMalcolmStreet9781441213877</DownloadFileName>
<SecurityLevel value="ACS4">Adobe Content Server 4</SecurityLevel>
<ContentFileSize>473923</ContentFileSize>
<DownloadUrl>http://xxx.xx.com/</DownloadUrl>
<DownloadIDType>CRID</DownloadIDType>
<DrmInfo>
<Copy>
<Enabled>1</Enabled>
<Selections>2</Selections>
<Interval type = "Days">7</Interval>
</Copy>
<Print>
<Enabled>1</Enabled>
<Selections>20</Selections>
<Interval type = "Days">7</Interval>
</Print>
<Lend>
<Enabled>0</Enabled>
</Lend>
<ReadAloud>
<Enabled>0</Enabled>
</ReadAloud>
<Expires>
<Enabled>0</Enabled>
<Interval type = "Days">-1</Interval>
</Expires>
</DrmInfo>
</Format>
<Creator rank="1" id="923710">
<Name>name</Name>
<FileAs>Kelly, Leisha</FileAs>
<Role id="aut">Author</Role>
</Creator>
<SubTitle>A Novel</SubTitle>
<Edition></Edition>
<Series></Series>
<Coverage></Coverage>
<AgeGroup></AgeGroup>
<ContentType></ContentType>
<PublicationDate>09/01/2010</PublicationDate>
<ShortDescription>description</ShortDescription>
<FullDescription>full desc</FullDescription>
<Image type = "Cover Image">http://xxx.xx.jpg</Image>
<Image type = "Thumbnail Image">http://xxx.xx.jpg</Image>
<Subject code="FIC000000">Fiction</Subject>
<Subject code="FIC014000">Historical Fiction</Subject>
</Content>
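For a file of this size, it may help to stream the records instead of loading the whole tree. Below is a rough sketch using `xml.etree.ElementTree.iterparse`, assuming the `<Content>` elements sit under some wrapper root (the `<Contents>` root in the sample is an assumption, as are the `iter_content` name and the choice of fields). It is not the code from the question.

```python
import io
import xml.etree.ElementTree as ET


def iter_content(source):
    """Yield one dict per <Content> element, clearing each element after use."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "Content":
            continue
        yield {
            "id": int(elem.get("id")),
            "title": elem.findtext("Title"),
            "publisher": elem.findtext("Publisher"),
            "isbn": elem.findtext("Format/Identifier[@type='ISBN']"),
            "subjects": [s.get("code") for s in elem.findall("Subject")],
        }
        elem.clear()  # drop the children so memory does not grow with file size


# A trimmed copy of the structure above, wrapped in an assumed root element.
SAMPLE = b"""<Contents>
<Content id="359366">
<Title>This title</Title>
<Publisher>ABC Publishing Group</Publisher>
<Format type = "Adobe EPUB eBook">
<Identifier type = "DRMID">xxx-xxx-xx</Identifier>
<Identifier type = "ISBN">1234567</Identifier>
</Format>
<Subject code="FIC000000">Fiction</Subject>
<Subject code="FIC014000">Historical Fiction</Subject>
</Content>
</Contents>"""

records = list(iter_content(io.BytesIO(SAMPLE)))
print(records[0]["title"], records[0]["isbn"])
```

`iterparse` also accepts a file name, so the real file could be passed directly. Each yielded dict can then be split into rows for the separate tables.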
Here is the existing Python code (download).
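I suggest you profile your current code and work out how much time the different parts take. I don't think multithreading will necessarily speed up the task. – MattH
Thanks. I have posted the XML structure and the current code file. – hhkhaing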
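The profiling suggested in the comment above could be started with something as small as the following sketch, which accumulates wall-clock time per stage. The `timed` helper and the stage labels are hypothetical, and the two loops are placeholders standing in for the real parsing and insert code.

```python
import time
from contextlib import contextmanager

timings = {}  # stage label -> accumulated seconds


@contextmanager
def timed(label):
    """Add the wall-clock time spent inside the with-block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


# Placeholder work; replace with the real parse and insert steps.
with timed("parse"):
    parsed = [str(i) for i in range(100000)]

with timed("insert"):
    stored = {value: len(value) for value in parsed}

for label, seconds in timings.items():
    print("%-8s %.4f s" % (label, seconds))
```

If the insert stage dominates, batching and fewer commits are probably worth trying before threads. For a function-level breakdown, the standard-library profiler can be run as `python -m cProfile -s cumulative your_script.py`, where `your_script.py` stands for the existing script.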