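I have a very large df and use `chunksize` to split it into chunks. I then loop over the chunks and, inside each one, loop over date intervals to apply some conditions. After that I want to merge all of the resulting dfs. I tried `concat(df)`, but it returns an error. The `join` method is inconvenient because I have 400 dfs. How can I concatenate them? This is the code I use to merge many dfs with pandas: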
import dateutil.relativedelta
import pandas as pd
from openpyxl import load_workbook
from pandas import DataFrame, concat

el = pd.read_csv('df2.csv', iterator=True, chunksize=100000)
buys = pd.read_excel('smartphone.xlsx')
buys['date'] = pd.to_datetime(buys['date'])
dates1 = buys['date']
ids1 = buys['id']
for chunk in el:
    chunk['used_at'] = pd.to_datetime(chunk['used_at'])
    df = chunk.sort_values(['ID', 'used_at'])
    dates = df['used_at']
    ids = df['ID']
    urls = df['url']
    for i, (id, date, url, id1, date1) in enumerate(zip(ids, dates, urls, ids1, dates1)):
        # Keep rows for this buyer from the first day of the previous month
        # up to midnight five days after the purchase date.
        df1 = df[(df['ID'] == ids1[i]) & (df['used_at'] < (dates1[i] + dateutil.relativedelta.relativedelta(days=5)).replace(hour=0, minute=0, second=0)) & (df['used_at'] > (dates1[i] - dateutil.relativedelta.relativedelta(months=1)).replace(day=1, hour=0, minute=0, second=0))]
        df1 = DataFrame(df1)
        if df1.empty:
            continue
        else:
            # This is the concat call that returns the error.
            df_upd = concat(df1, ignore_index=True)
book = load_workbook('report_buy2.xlsx')
writer = pd.ExcelWriter('report_buy2.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df_upd.to_excel(writer, "Main")
writer.save()
Please show some of the code you tried, and post the full error message. – Jeff
@JeffL. I've added the code. –
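One likely cause of the error is that `concat` is being given a single DataFrame, while `pd.concat` expects a list (or other iterable) of DataFrames. A minimal sketch of the usual pattern is to append every non-empty filtered piece to a list and call `pd.concat` once at the end. The sample data, the `window_for` helper, and the name `pieces` below are hypothetical stand-ins, since `df2.csv` and `smartphone.xlsx` are not shown. The sketch also assumes the intent is to check every purchase against every chunk, which differs from the `zip` in the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for df2.csv (the real file is not shown).
csv_text = """ID,used_at,url
1,2016-01-10 12:00:00,a.com
1,2016-03-01 09:00:00,b.com
2,2016-01-20 08:30:00,c.com
2,2016-02-03 10:00:00,d.com
3,2016-01-15 11:00:00,e.com
"""

# Hypothetical stand-in for smartphone.xlsx.
buys = pd.DataFrame({
    "id": [1, 2],
    "date": pd.to_datetime(["2016-01-12", "2016-02-01"]),
})


def window_for(buy_date):
    """Return (start, end) for one purchase date.

    start: midnight on the first day of the previous month
    end:   midnight five days after the purchase
    """
    start = (buy_date - pd.DateOffset(months=1)).replace(day=1).normalize()
    end = (buy_date + pd.DateOffset(days=5)).normalize()
    return start, end


pieces = []  # every non-empty filtered frame is collected here

# chunksize=2 only keeps the demo small; the question uses 100000.
reader = pd.read_csv(io.StringIO(csv_text), chunksize=2, parse_dates=["used_at"])
for chunk in reader:
    for buy_id, buy_date in zip(buys["id"], buys["date"]):
        start, end = window_for(buy_date)
        mask = (
            (chunk["ID"] == buy_id)
            & (chunk["used_at"] > start)
            & (chunk["used_at"] < end)
        )
        piece = chunk[mask]
        if not piece.empty:
            pieces.append(piece)

# One concat over the whole list, instead of one call per DataFrame.
if pieces:
    df_upd = pd.concat(pieces, ignore_index=True)
else:
    df_upd = pd.DataFrame(columns=["ID", "used_at", "url"])

print(df_upd)
```

With the combined frame built this way, the Excel report only needs to be written once, after the loops finish, rather than once per filtered piece.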