python訪問文件結構中的數據

我想以最有效的方式訪問存儲在目錄（〜20）中的.txt文件（〜1000）中的每個值（〜10000）。當抓取數據時，我想將它們放在HTML字符串中。我這樣做是爲了爲每個文件顯示一個包含表格的HTML頁面。僞：python訪問文件結構中的數據

fh=open('MyHtmlFile.html','w') 
    fh.write('''<head>Lots of tables</head><body>''') 
    for eachDirectory in rootFolder: 

     for eachFile in eachDirectory: 
      concat='' 

      for eachData in eachFile: 
       concat=concat+<tr><td>eachData</tr></td> 
      table=''' 
        <table>%s</table> 
        '''%(concat) 
     fh.write(table) 
    fh.write('''</body>''') 
    fh.close()

必須有一個更好的方法（我想這將需要永遠）！我已經檢查了set（）並讀了一些關於hashtables的內容，而是在漏洞被挖掘之前詢問專家。

謝謝你的時間！ /卡爾

來源

2011-08-18 ckarlbe

只是一個提示：連接字符串+ =絕對不鼓勵大量的字符串。 –

@jellybean如何提供連接字符串的替代方法？ – Raz

追加他們所有的列表mylist'和'「」.join（mylist）'他們之後 –

import os, os.path 
# If you're on Python 2.5 or newer, use 'with' 
# needs 'from __future__ import with_statement' on 2.5 
fh=open('MyHtmlFile.html','w') 
fh.write('<html>\r\n<head><title>Lots of tables</title></head>\r\n<body>\r\n') 
# this will recursively descend the tree 
for dirpath, dirname, filenames in os.walk(rootFolder): 
    for filename in filenames: 
     # again, use 'with' on Python 2.5 or newer 
     infile = open(os.path.join(dirpath, filename)) 
     # this will format the lines and join them, then format them into the table 
     # If you're on Python 2.6 or newer you could use 'str.format' instead 
     fh.write('<table>\r\n%s\r\n</table>' % 
        '\r\n'.join('<tr><td>%s</tr></td>' % line for line in infile)) 
     infile.close() 
fh.write('\r\n</body></html>') 
fh.close()

來源

2011-08-18 11:37:28 agf

你爲什麼「想象它會永遠」？您正在閱讀該文件，然後將其打印出來 - 這幾乎是您作爲要求提供的唯一一件事 - 而這正是您所做的一切。您可以通過幾種方式調整腳本（讀取塊不是行，調整緩衝區，打印出來而不是連接等），但是如果您不知道現在需要多少時間，您怎麼知道什麼更好/更糟？

配置文件首先，然後找到腳本是否太慢，然後找到一個緩慢的地方，然後才進行優化（或詢問優化）。

來源

2011-08-18 11:26:38 viraptor

我不尋找優化的代碼，我正在尋找一種不同的（更高效或優雅的）解決方案。我現在已經實現了僞代碼，耗時215秒。 – ckarlbe

答案的要點是隻有一個解決方案（直到開始分析）：您想要讀取所有文件中的所有數據，唯一的方法是讀取所有文件中的所有數據。沒有做後者，沒有辦法做前者。 –

python訪問文件結構中的數據

回答

相關問題