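I have two types of files, Excel and CSV, from which I read data with two permanent columns, Question and Answer, and two temporary columns, Word and Replacement, which may or may not be present. How can I read the data from the Excel or CSV file depending on which columns are available?
I have written separate functions for reading from CSV and Excel files; the appropriate one is called based on the file extension.
Is there a way to read the temporary columns (Word and Replacement) when they exist and skip them when they do not? Please see the function definitions below:
1) Function to read CSV data: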
import string
import pandas

def read_csv_file(path):
    asciiIgnoreQues = []
    # Only the two permanent columns are named here
    colnames = ['Question', 'Answer']
    data = pandas.read_csv(path, names=colnames)
    quesData = data.Question.tolist()
    ansData = data.Answer.tolist()
    # Strip punctuation from every question
    qWithoutPunctuation = [''.join(c for c in s if c not in string.punctuation) for s in quesData]
    # Drop any non-ASCII characters
    for x in qWithoutPunctuation:
        asciiIgnoreQues.append(x.encode('ascii', 'ignore'))
    return asciiIgnoreQues, ansData, quesData
2) Function to read Excel data:
import string
from xlrd import open_workbook  # assumed: the calls below match xlrd's API

def read_excel_file(path):
    book = open_workbook(path)
    sheet = book.sheet_by_index(0)
    quesData = []
    ansData = []
    asciiIgnoreQues = []
    # Start at row 1 to skip the header row
    for row in range(1, sheet.nrows):
        quesData.append(sheet.cell(row, 0).value)
        ansData.append(sheet.cell(row, 1).value)
    # Strip punctuation from every question
    qWithoutPunctuation = [''.join(c for c in s if c not in string.punctuation) for s in quesData]
    # Drop any non-ASCII characters
    for x in qWithoutPunctuation:
        asciiIgnoreQues.append(x.encode('ascii', 'ignore'))
    return asciiIgnoreQues, ansData, quesData
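Have you considered 'pandas.read_csv' and 'pandas.read_excel'? They read whichever columns are present automatically. – tmrlvi
@tmrlvi, I did use pandas.read_csv in the CSV function, but the column headers have to be supplied in colnames. What if the Word and Replacement columns are not there? –
You don't have to supply them. If you don't, 'pandas' infers the names from the file. Or does your data not contain a header row? – tmrlvi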
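To illustrate the suggestion in the comments, here is a minimal sketch. It assumes that the first row of each file is a header containing the column names Question, Answer and, optionally, Word and Replacement. The helper name read_qa_table, its is_excel flag, and the choice of None to mark an absent column are hypothetical and not part of the original code.

```python
import io
import pandas as pd

REQUIRED = ["Question", "Answer"]
OPTIONAL = ["Word", "Replacement"]

def read_qa_table(source, is_excel=False):
    """Hypothetical helper: read a table whose first row is a header."""
    # Without a `names` argument, pandas takes the column names from the header row.
    df = pd.read_excel(source) if is_excel else pd.read_csv(source)
    missing = [c for c in REQUIRED if c not in df.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    result = {c: df[c].tolist() for c in REQUIRED}
    for c in OPTIONAL:
        # Assumption: None marks a temporary column that is absent from this file.
        result[c] = df[c].tolist() if c in df.columns else None
    return result

# In-memory CSV text stands in for real files here.
with_extra = io.StringIO("Question,Answer,Word,Replacement\nHi?,Hello,hi,hello\n")
without_extra = io.StringIO("Question,Answer\nHi?,Hello\n")
print(read_qa_table(with_extra)["Word"])     # ['hi']
print(read_qa_table(without_extra)["Word"])  # None
```

The caller can then check whether result["Word"] is None before applying any replacements. If the files have no header row, this approach does not apply as written, because pandas would have no names to infer.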