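Perl - Parsing a text file with tags into new text files
I have been given data in a .txt file that I need to format into something that can be uploaded into a database. The text is not anchored in any way. Depending on the tag, the data needs to be dumped into a specific txt file, tab-delimited. I have done very little Perl in my life, but I know Perl can handle this type of application easily; I'm just lost as to where to start. Outside of Java, SQL, and R, I'm useless. Here is an example of one entry (I have close to 1000 of these to process):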
<PaperTitle>True incidence of all complications following immediate and delayed breast reconstruction.</PaperTitle>
<Abstract>BACKGROUND: Improved self-image and psychological well-being after breast reconstruction are well documented. To determine methods that optimized results with minimal morbidity, the authors examined their results and complications based on reconstruction method and timing. METHODS: The authors reviewed all breast reconstructions after mastectomy for breast cancer performed under the supervision of a single surgeon over a 6-year period at a tertiary referral center. Reconstruction method and timing, patient characteristics, and complication rates were reviewed. RESULTS: Reconstruction was performed on 240 consecutive women (94 bilateral and 146 unilateral; 334 total reconstructions). Reconstruction timing was evenly split between immediate (n = 167) and delayed (n = 167). Autologous tissue (n = 192) was more common than tissue expander/implant reconstruction (n = 142), and the free deep inferior epigastric perforator was the most common free flap (n = 124). The authors found no difference in the complication incidence with autologous reconstruction, whether performed immediately or delayed. However, there was a significantly higher complication rate following immediate placement of a tissue expander when compared with delayed reconstruction (p = 0.008). Capsular contracture was a significantly more common late complication following immediate (40.4 percent) versus delayed (17.0 percent) reconstruction (p < 0.001; odds ratio, 5.2; 95 percent confidence interval, 2.3 to 11.6). CONCLUSIONS: Autologous reconstruction can be performed immediately or delayed, with optimal aesthetic outcome and low flap loss risk. However, the overall complication and capsular contracture incidence following immediate tissue expander/implant reconstruction was much higher than when performed delayed. Thus, tissue expander placement at the time of mastectomy may not necessarily save the patient an extra operation and may compromise the final aesthetic outcome.</Abstract>
<BookTitle>Book1</BookTitle>
<Publisher>Publisher01, Boston</Publisher>
<Edition>1st</Edition>
<EditorList>
<Editor>
<LastName>Lewis</LastName>
<ForeName>Philip M</ForeName>
<Initials>PM</Initials>
</Editor>
<Editor>
<LastName>Kiffer</LastName>
<ForeName>Michael</ForeName>
<Initials>M</Initials>
</Editor>
</EditorList>
<Page>19-28</Page>
<Year>2008</Year>
<AuthorList>
<Author ValidYN="Y">
<LastName>Sullivan</LastName>
<ForeName>Stephen R</ForeName>
<Initials>SR</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Fletcher</LastName>
<ForeName>Derek R D</ForeName>
<Initials>DR</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Isom</LastName>
<ForeName>Casey D</ForeName>
<Initials>CD</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Isik</LastName>
<ForeName>F Frank</ForeName>
<Initials>FF</Initials>
</Author>
</AuthorList>
//
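PaperTitle, Abstract, and Page need to go into the Papers.txt file
PaperTitle, BookTitle, Edition, Publisher, and Year need to go into the Book.txt file
PaperTitle and all editor data (LastName, ForeName, Initials) need to go into Editors.txt
PaperTitle and all author information (LastName, ForeName, Initials) need to go into Authors.txt
`//` marks the end of an entry. All files need to be tab-delimited. While I won't turn down finished code, I'm hoping for at least some ideas that point me in the right direction for parsing out one of the files (such as Book.txt); I can most likely figure out the rest from there. Thanks very much.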
I would start by looking at the Config::General module to handle the parsing and the Text::CSV_XS module to generate the output files. – 2014-11-21 22:57:11
It sounds like you need `XML::Twig`. Please show the file contents that this data should produce. – Borodin 2014-11-21 22:58:34
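The mapping described in the question (split the input on `//`, pull single-valued tags for Book.txt, and emit one row per `<Editor>` or `<Author>` block) can be sketched as follows. This is not Perl and not one of the modules suggested in the comments; it is a minimal Python sketch of the logic, which should port to Perl regular expressions fairly directly. The function names (`split_entries`, `field`, `book_row`, `person_rows`) are hypothetical, and the sketch assumes the tags are well-formed, that `//` sits on a line of its own, and that no field needs entity decoding. The abstract in the sample is shortened here for brevity.

```python
import re

# Shortened copy of the entry from the question (abstract elided, two authors kept).
SAMPLE = """<PaperTitle>True incidence of all complications following immediate and delayed breast reconstruction.</PaperTitle>
<Abstract>BACKGROUND: Improved self-image ...</Abstract>
<BookTitle>Book1</BookTitle>
<Publisher>Publisher01, Boston</Publisher>
<Edition>1st</Edition>
<EditorList>
<Editor>
<LastName>Lewis</LastName>
<ForeName>Philip M</ForeName>
<Initials>PM</Initials>
</Editor>
<Editor>
<LastName>Kiffer</LastName>
<ForeName>Michael</ForeName>
<Initials>M</Initials>
</Editor>
</EditorList>
<Page>19-28</Page>
<Year>2008</Year>
<AuthorList>
<Author ValidYN="Y">
<LastName>Sullivan</LastName>
<ForeName>Stephen R</ForeName>
<Initials>SR</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Isik</LastName>
<ForeName>F Frank</ForeName>
<Initials>FF</Initials>
</Author>
</AuthorList>
//
"""


def split_entries(text):
    """Split the raw text into entries; a line holding only // ends an entry."""
    parts = re.split(r"^//\s*$", text, flags=re.MULTILINE)
    return [p for p in parts if p.strip()]


def field(chunk, tag):
    """Text of the first <tag ...>...</tag> in chunk, or '' if the tag is absent."""
    # \b keeps <Author ...> from matching <AuthorList>.
    m = re.search(r"<%s\b[^>]*>(.*?)</%s>" % (tag, tag), chunk, re.DOTALL)
    if not m:
        return ""
    # Collapse tabs and newlines so a value cannot break the tab-delimited row.
    return re.sub(r"\s+", " ", m.group(1)).strip()


def book_row(entry):
    """One tab-delimited line for Book.txt."""
    tags = ["PaperTitle", "BookTitle", "Edition", "Publisher", "Year"]
    return "\t".join(field(entry, t) for t in tags)


def person_rows(entry, tag):
    """One tab-delimited line per <Editor> or <Author> block, prefixed by PaperTitle."""
    title = field(entry, "PaperTitle")
    blocks = re.findall(r"<%s\b[^>]*>(.*?)</%s>" % (tag, tag), entry, re.DOTALL)
    names = ("LastName", "ForeName", "Initials")
    return ["\t".join([title] + [field(b, n) for n in names]) for b in blocks]


if __name__ == "__main__":
    for entry in split_entries(SAMPLE):
        print(book_row(entry))               # would be appended to Book.txt
        for row in person_rows(entry, "Editor"):
            print(row)                       # would be appended to Editors.txt
        for row in person_rows(entry, "Author"):
            print(row)                       # would be appended to Authors.txt
```

A Papers.txt row would follow the same pattern as `book_row` with the tags PaperTitle, Abstract, and Page. A regex approach like this is only reasonable because the sample is flat and regular; if real entries contain nested or escaped markup, a proper XML parser is the safer route.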