Read a URL into a pandas DataFrame with column names (Python 3)

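I have read several questions on this topic, but none of the answers seems to work for me.

I want to retrieve the data from this page, "http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/heart.dat", and assign names to its columns.

My code is below. It does not let me name the data columns, because everything ends up in a single column: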

import pandas as pd 
import io 
import requests 
url="http://archive.ics.uci.edu/ml/machine-learningdatabases/statlog/heart/heart.dat" 
s=requests.get(url).content 
header_row = ['age','sex','chestpain','restBP','chol','sugar','ecg','maxhr','angina','dep','exercise','fluor','thal','diagnosis'] 
c=pd.read_csv(io.StringIO(s.decode('utf-8')), names=header_row) 
print(c)

The output is:

 age sex chestpain \ 
0 70.0 1.0 4.0 130.0 322.0 0.0 2.0 109.0 0.0 2.4... NaN  NaN 
1 67.0 0.0 3.0 115.0 564.0 0.0 2.0 160.0 0.0 1.6... NaN  NaN 
2 57.0 1.0 2.0 124.0 261.0 0.0 0.0 141.0 0.0 0.3... NaN  NaN 
3 64.0 1.0 4.0 128.0 263.0 0.0 0.0 105.0 1.0 0.2... NaN  NaN

What do I need to do to achieve my goal?

Thank you very much!


2017-03-09 Gabriela Martinez

Are you sure about the URL? I get a 404 error when I open it.

The correct URL is https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/heart.dat

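The link in your code is missing a hyphen ("machine-learningdatabases" instead of "machine-learning-databases"); I corrected that in my answer. The values in the file are separated by whitespace, not commas, which is why read_csv put each whole row into one column. Basically, you need to decode the string s as UTF-8, split it on \n to get each row, and then split each row on whitespace to get the individual values. This gives you a nested-list representation of the dataset, which you can convert to a pandas DataFrame and assign column names to.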

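import pandas as pd
import requests

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/heart.dat"
s = requests.get(url).content
s = s.decode('utf-8')
# Split into rows, skipping blank lines (e.g. the one after a trailing newline)
s_rows = [line for line in s.splitlines() if line.strip()]
# Split each row on whitespace; the values are still strings at this point
s_rows_cols = [each.split() for each in s_rows]
header_row = ['age','sex','chestpain','restBP','chol','sugar','ecg','maxhr','angina','dep','exercise','fluor','thal','diagnosis']
c = pd.DataFrame(s_rows_cols, columns=header_row)
# Convert the string values to numbers
c = c.astype(float)
c.head()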
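
As an alternative to splitting the text by hand, pandas can probably do the whitespace splitting itself if you pass sep=r"\s+" to read_csv. Below is a minimal sketch of that idea. The helper name read_heart and the inline sample string are only for illustration: the first ten values in each sample row are taken from the output in the question, and the last four are placeholders, not real data.

```python
import io

import pandas as pd

header_row = ['age', 'sex', 'chestpain', 'restBP', 'chol', 'sugar', 'ecg',
              'maxhr', 'angina', 'dep', 'exercise', 'fluor', 'thal', 'diagnosis']

# Illustrative sample in the same whitespace-separated layout as heart.dat.
# The last four values of each row are placeholders, not real data.
sample = (
    "70.0 1.0 4.0 130.0 322.0 0.0 2.0 109.0 0.0 2.4 0.0 0.0 0.0 0\n"
    "67.0 0.0 3.0 115.0 564.0 0.0 2.0 160.0 0.0 1.6 0.0 0.0 0.0 0\n"
)


def read_heart(source):
    """Parse whitespace-separated text from a path, URL, or file-like object."""
    # sep=r"\s+" treats any run of whitespace as a single delimiter;
    # header=None says the file has no header row, so names supplies the columns.
    return pd.read_csv(source, sep=r"\s+", header=None, names=header_row)


df = read_heart(io.StringIO(sample))
print(df.dtypes)
```

read_csv also accepts a URL string, so with the corrected URL something like read_heart(url) should work without requests, assuming the server is reachable. A side benefit is that the columns come back as numeric types rather than strings.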


2017-03-09 11:20:45

Thank you very much! This is exactly what I needed! Best wishes!
