How to save data from Python to a CSV file


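I have a program that ends up printing "match", and I want to save the data in this "match" to a CSV file. How can I do that? I have already written some code to save this variable, but it does not write anything. Here is my code: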

import shlex
import subprocess
import os
import platform
from bs4 import BeautifulSoup
import re
import csv
import pickle
def rename_files():
    file_list = os.listdir(r"C:\\PROJECT\\pdfs")
    print(file_list)
    saved_path = os.getcwd()
    print('Current working directory is '+saved_path)
    os.chdir(r'C:\\PROJECT\\pdfs')
    for file_name in file_list:
        os.rename(file_name, file_name.translate(None, " "))
    os.chdir(saved_path)
rename_files()

def run(command):
    if platform.system() != 'Windows':
        args = shlex.split(command)
    else:
        args = command
    s = subprocess.Popen(args,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    output, errors = s.communicate()
    return s.returncode == 0, output, errors

# Change this to your PDF file base directory
base_directory = 'C:\\PROJECT\\pdfs'
if not os.path.isdir(base_directory):
    print "%s is not a directory" % base_directory
    exit(1)
# Change this to your pdf2htmlEX executable location
bin_path = 'C:\\Python27\\pdfminer-20140328\\tools\\pdf2txt.py'
if not os.path.isfile(bin_path):
    print "Could not find %s" % bin_path
    exit(1)
for dir_path, dir_name_list, file_name_list in os.walk(base_directory):
    for file_name in file_name_list:
        # If this is not a PDF file
        if not file_name.endswith('.pdf'):
            # Skip it
            continue
        file_path = os.path.join(dir_path, file_name)
        # Convert your PDF to HTML here
        args = (bin_path, file_name, file_path)
        success, output, errors = run("python %s -o %s.html %s " %args)
        if not success:
            print "Could not convert %s to HTML" % file_path
            print "%s" % errors
htmls_path = 'C:\\PROJECT'
for dir_path, dir_name_list, file_name_list in os.walk(htmls_path):
    for file_name in file_name_list:
        if not file_name.endswith('.html'):
            continue
        with open(file_name) as markup:
            soup = BeautifulSoup(markup.read())
            text = soup.get_text()
            match = re.findall("PA/(\S*)\s*(\S*)", text)
            print(match)
with open ('score.csv', 'w') as f:
    writer = csv.writer(f)
    writer.writerows('%s' %match)

The last 3 lines are the part of the code where I try to save it to a CSV file. Here is a printout showing the format of "match": https://gyazo.com/930f9dad12109bc50825c91b51fb31f3

Source

2017-04-27 fsgdfgsd

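Please read [ask] carefully and consider providing an [MCVE]. In its current state, the question neither gives an accurate description of the result you expect (format, ...) nor shows your own effort to solve the problem.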

Are the last 3 lines indented here the same way as in your file? If so, I think that is where the problem lies. – Guillaume

Yes, should I indent them with a tab? – fsgdfgsd

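The way your code is structured, you iterate over the matches inside the for loop and then, once the loop has finished, save only the last match to the CSV. You probably need to write each match to the CSV inside the for loop.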

Try replacing the last lines of your code (starting from the last for loop) with:

with open('score.csv', 'wt') as f:
    writer = csv.writer(f)
    for dir_path, dir_name_list, file_name_list in os.walk(htmls_path):
        for file_name in file_name_list:
            if not file_name.endswith('.html'):
                continue
            with open(file_name) as markup:
                soup = BeautifulSoup(markup.read())
                text = soup.get_text()
                match = re.findall("PA/(\S*)\s*(\S*)", text)
                print(match)
                writer.writerow(match)
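
If the goal is one CSV row per match with each captured group in its own column, a variant of the loop above could call writerows instead of writerow. This is only a minimal sketch under stated assumptions: it targets Python 3 (the question's code is Python 2), it uses made-up sample strings in place of the text extracted from the HTML files, and the names write_matches, sample_texts and out_path are hypothetical.

```python
import csv
import os
import re
import tempfile

# Same pattern as in the question, written as a raw string
PATTERN = re.compile(r"PA/(\S*)\s*(\S*)")


def write_matches(texts, csv_path):
    """Write one CSV row per match found in each text, one column per group."""
    # newline="" is what the csv module expects for files opened in Python 3
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        for text in texts:
            # findall returns a list of 2-tuples because the pattern has two groups
            rows = PATTERN.findall(text)
            # writerows writes each tuple as its own row
            writer.writerows(rows)


# Made-up stand-ins for the text extracted from each HTML file
sample_texts = ["PA/123 45", "PA/abc 67 and PA/def 89"]
out_path = os.path.join(tempfile.mkdtemp(), "score_demo.csv")
write_matches(sample_texts, out_path)
```

With writerow(match), the whole list becomes a single row and each tuple ends up as one cell, which may be why the output did not look like a table.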

Source

2017-04-27 07:57:53 Guillaume

OK, it saved, but not laid out the way it would be if it were a table. – fsgdfgsd

It should now be fixed to write separate columns, using 'match.groups()' to get the matched groups as a list. – Guillaume

It returned an error; here it is: https://gyazo.com/dae08c866b8a453eb523803913ba27c7 – fsgdfgsd
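
The screenshot is not reproduced here, so the exact error is unknown. One plausible cause, offered only as an assumption, is that re.findall returns a plain list of tuples rather than match objects, so there is nothing to call .groups() on. The sketch below (Python 3, with a made-up sample string) contrasts findall with finditer, which does yield match objects.

```python
import re

pattern = re.compile(r"PA/(\S*)\s*(\S*)")
text = "PA/123 45 PA/abc 67"  # made-up sample text

# findall: a list of tuples, one tuple per match; a list has no .groups()
found = pattern.findall(text)

# finditer: yields match objects, each of which does have .groups()
rows = [m.groups() for m in pattern.finditer(text)]
```

Under this assumption both approaches produce the same rows, so the result of findall can be passed to writerows directly.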

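Assuming you already have "match", you can use the csv module in Python. Its writer should do the job for you.

It would be more helpful if you could describe the format of your data in detail.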
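
As a rough illustration of the csv writer, the sketch below assumes "match" is a list of 2-tuples of strings (the shape re.findall gives for a pattern with two groups). The sample values and the header names are made up, and io.StringIO stands in for a real file so the result can be inspected. It also shows what can happen when a formatted string is passed to writerows, as in the last line of the question's code.

```python
import csv
import io

match = [("123", "45"), ("abc", "67")]  # assumed shape, made-up values

# Intended use: one row per tuple, one column per captured group
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["code", "score"])  # hypothetical header names
writer.writerows(match)
table_csv = buf.getvalue()

# Pitfall: '%s' % match is a single string, so writerows iterates over
# its characters and writes each character as its own row
as_text = '%s' % match
bad_buf = io.StringIO()
csv.writer(bad_buf).writerows(as_text)
bad_csv = bad_buf.getvalue()
```

Here table_csv holds a header line followed by one line per tuple, while bad_csv holds one line per character of the formatted string.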

Source

2017-04-27 07:50:31

Here is a printout showing the format of "match": https://gyazo.com/930f9dad12109bc50825c91b51fb31f3 – fsgdfgsd
