2016-09-19

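Load these CSV files from Sean Lahman's baseball database. For this assignment, we will use the 'Salaries.csv' and 'Teams.csv' tables. Read these tables into pandas DataFrames and display the head of each table. How do I request a zip file, unzip it, and then create pandas DataFrames from the CSV files?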

# Here's the code I have so far:
import requests
import io
import zipfile
import pandas as pd

url = 'http://seanlahman.com/files/database/lahman-csv_2014-02-14.zip'
r = requests.get(url, auth=('user', 'pass'))

# These are lines of code I looked up but am not sure how to use:
# with zipfile.ZipFile('/path/to/file', 'r') as z:
#     f = z.open('member.csv')
#     table = pd.io.parsers.read_table(f, ...)
# salariesData = pd.read_csv('Salaries.csv')
# teamsData = pd.read_csv('Teams.csv')

Homework questions are generally discouraged here.

Answer


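The request returns the file as raw bytes, so first wrap those bytes in an in-memory file object and open it as a zip archive: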

mlz = zipfile.ZipFile(io.BytesIO(r.content)) 
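Before handing the bytes to zipfile.ZipFile, it may be worth confirming that the download really is a zip archive, since a failed request can return an HTML error page instead. The following is only a minimal sketch: open_zip_from_bytes is a hypothetical helper name, and the demo bytes are built in memory with made-up data as a stand-in for r.content.

```python
import io
import zipfile


def open_zip_from_bytes(content):
    """Return a ZipFile over in-memory bytes, or raise ValueError if they are not a zip."""
    buffer = io.BytesIO(content)
    if not zipfile.is_zipfile(buffer):
        raise ValueError("downloaded content is not a zip archive")
    buffer.seek(0)
    return zipfile.ZipFile(buffer)


# Stand-in for r.content: a tiny archive built in memory (made-up data).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("Salaries.csv", "yearID,teamID,salary\n1985,ATL,870000\n")
demo_bytes = demo.getvalue()

mlz = open_zip_from_bytes(demo_bytes)
print(mlz.namelist())
```

With a real download, calling r.raise_for_status() before this step would surface HTTP errors early.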

To see what is inside the zip file, type:

mlz.namelist() 
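Archives sometimes store their files inside a folder, in which case the names returned by namelist() include that folder as a prefix. I have not checked the layout of this particular archive, so the sketch below uses a made-up one; find_member is a hypothetical helper that looks a member up by its base filename.

```python
import io
import posixpath
import zipfile


def find_member(zf, filename):
    """Return the full archive path of the first member whose basename is `filename`."""
    for name in zf.namelist():
        if posixpath.basename(name) == filename:
            return name
    raise KeyError(filename + " not found in archive")


# Made-up archive with the CSVs nested in a folder, to illustrate the lookup.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("some-folder/Salaries.csv", "yearID,salary\n1985,870000\n")
    zf.writestr("some-folder/Teams.csv", "yearID,teamID\n1985,ATL\n")

mlz = zipfile.ZipFile(io.BytesIO(demo.getvalue()))
print(find_member(mlz, "Teams.csv"))
```

The returned path can then be passed to mlz.open() in place of a hard-coded name.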

Then you can open the member at a given index of that list and read it as CSV:

df1 = pd.read_csv(mlz.open(mlz.namelist()[0])) 
df2 = pd.read_csv(mlz.open(mlz.namelist()[1])) 

In your particular case, this would probably be:

salariesData = pd.read_csv(mlz.open('Salaries.csv')) 
teamsData = pd.read_csv(mlz.open('Teams.csv')) 

(All of the above ^ assumes you are using Python 3.x.)
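Putting the steps together, including the head() display the assignment asks for, might look roughly like the sketch below. read_csvs_from_zip is a hypothetical helper name, and the archive is built in memory with made-up rows so the example runs without a network connection; a real run would pass r.content instead.

```python
import io
import zipfile

import pandas as pd


def read_csvs_from_zip(content, filenames):
    """Read the named CSV members of a zip archive (given as bytes) into DataFrames."""
    frames = {}
    with zipfile.ZipFile(io.BytesIO(content)) as zf:
        for filename in filenames:
            with zf.open(filename) as member:
                frames[filename] = pd.read_csv(member)
    return frames


# Stand-in for r.content, with made-up rows.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("Salaries.csv", "yearID,teamID,salary\n1985,ATL,870000\n1985,BAL,550000\n")
    zf.writestr("Teams.csv", "yearID,teamID,W,L\n1985,ATL,66,96\n")

tables = read_csvs_from_zip(demo.getvalue(), ["Salaries.csv", "Teams.csv"])
salariesData = tables["Salaries.csv"]
teamsData = tables["Teams.csv"]

# Display the first rows of each table.
print(salariesData.head())
print(teamsData.head())
```

Reading straight from the in-memory archive avoids writing the extracted files to disk; if the files are needed on disk as well, ZipFile.extractall() is the usual alternative.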
