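Advice on text-processing software

I need to process text files to extract relevant information, which will later be fed into R for statistical analysis. The contents of the text files typically look like the sample extract shown below. Can the board suggest any software or programming language I could use for this purpose? The key requirements are:
- ease of use / clarity of programming syntax for extracting the relevant information from each line (note: not every line will contain relevant information)
- free / open source
- able to run on both Linux and Windows, working through many separate text files contained in one folder/directory while producing just a single (CSV/text) output file
- ability to loop over files
Example: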
Full Tilt Poker Game #19911608402: Table Buggy - $0.01/$0.02 - No Limit Hold'em - 4:05:58 ET - 2010/04/08
Seat 2: BAD BeAts02 ($1.74)
Seat 3: VIVIVIVIV ($1.20)
Seat 4: pipelis ($2.87), is sitting out
Seat 5: trichinosis ($2.54)
Seat 6: Syrenski ($2)
Seat 9: evil-bunny1 ($1.20)
BAD BeAts02 posts the small blind of $0.01
VIVIVIVIV posts the big blind of $0.02
handrici sits down
pipelis stands up
Syrenski posts $0.02
The button is in seat #9
*** HOLE CARDS ***
Dealt to Syrenski [6d 3s]
handrici adds $2
trichinosis calls $0.02
Syrenski checks
pkmyers sits down
evil-bunny1 folds
BAD BeAts02 raises to $0.08
VIVIVIVIV folds
VIVIVIVIV adds $0.02
pkmyers adds $1.34
trichinosis calls $0.06
Syrenski folds
*** FLOP *** [Js 5s 8s]
pipelis sits down
BAD BeAts02 has 15 seconds left to act
BAD BeAts02 bets $0.18
AntHraX85 sits down
pipelis stands up
trichinosis folds
Uncalled bet of $0.18 returned to BAD BeAts02
BAD BeAts02 mucks
AntHraX85 adds $2
BAD BeAts02 wins the pot ($0.19)
*** SUMMARY ***
Total pot $0.20 | Rake $0.01
Board: [Js 5s 8s]
Seat 2: BAD BeAts02 (small blind) collected ($0.19), mucked
Seat 3: VIVIVIVIV (big blind) folded before the Flop
Seat 4: pipelis is sitting out
Seat 5: trichinosis folded on the Flop
Seat 6: Syrenski folded before the Flop
Seat 9: evil-bunny1 (button) didn't bet (folded)
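To make the requirements concrete, here is a rough sketch of the kind of processing I have in mind, written in Python purely as an illustration (it is free and runs on both Linux and Windows). It assumes each hand-history file has one event per line, that the files end in `.txt`, and that "relevant" means an action taken by one named player. The names `ACTION_RE`, `parse_action` and `folder_to_csv` are made up for this example, and the pattern only covers the simple actions that appear in the sample above; lines it does not recognise are skipped.

```python
import csv
import re
from pathlib import Path

# Hypothetical pattern: "<player> <action> [... $amount]".
# Only handles the simple actions seen in the sample hand history.
ACTION_RE = re.compile(
    r"^(?P<player>.+?) (?P<action>folds|checks|calls|bets|raises to|posts)"
    r"(?: .*?\$(?P<amount>[\d.]+))?$"
)


def parse_action(line, player):
    """Return (action, amount) if `line` is an action by `player`, else None."""
    m = ACTION_RE.match(line.strip())
    if m is None or m.group("player") != player:
        return None
    amount = m.group("amount")
    return m.group("action"), (float(amount) if amount else None)


def folder_to_csv(folder, player, out_csv):
    """Loop over every .txt file in `folder` and write one CSV row per
    recognised action by `player`. Returns the number of rows written."""
    rows = 0
    with open(out_csv, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["file", "line_no", "action", "amount"])
        for path in sorted(Path(folder).glob("*.txt")):
            with open(path, encoding="utf-8", errors="replace") as fh:
                for line_no, line in enumerate(fh, start=1):
                    parsed = parse_action(line, player)
                    if parsed is not None:
                        # A missing amount (None) is written as an empty field.
                        writer.writerow([path.name, line_no, *parsed])
                        rows += 1
    return rows
```

Calling something like `folder_to_csv("hands", "Syrenski", "syrenski.csv")` (both paths are placeholders) would then give a single CSV file that could be loaded into R with `read.csv`.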
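Which information is relevant? Will the user decide what is relevant? Is there a pattern? – pablosaraiva 2010-10-13 19:27:49
The relevant information is everything relating to a specific, designated player (i.e. Syrenski), plus information about the structure of the hand. – babelproofreader 2010-10-13 21:07:09
'handrici sits down' ... in which seat number? – Kaz 2013-12-18 21:09:36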