2014-09-29 174 views
2

IOError when using datasets.fetch_mldata() in sklearn

I import fetch_mldata with from sklearn.datasets import fetch_mldata and call:

dataset = fetch_mldata('MNIST original') 

But I get the following:

> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "C:\Users\Jacob\Development\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 540, in runfile
>     execfile(filename, namespace)
>   File "C:/Users/Jacob/Documents/Dropbox/Technion/Semester 8/Machine learning/Demo3/Demo3.py", line 75, in <module>
>     dataset = fetch_mldata('MNIST original')
>   File "C:\Users\Jacob\Development\Anaconda\lib\site-packages\sklearn\datasets\mldata.py", line 158, in fetch_mldata
>     matlab_dict = io.loadmat(matlab_file, struct_as_record=True)
>   File "C:\Users\Jacob\Development\Anaconda\lib\site-packages\scipy\io\matlab\mio.py", line 126, in loadmat
>     matfile_dict = MR.get_variables(variable_names)
>   File "C:\Users\Jacob\Development\Anaconda\lib\site-packages\scipy\io\matlab\mio5.py", line 288, in get_variables
>     res = self.read_var_array(hdr, process)
>   File "C:\Users\Jacob\Development\Anaconda\lib\site-packages\scipy\io\matlab\mio5.py", line 248, in read_var_array
>     return self._matrix_reader.array_from_header(header, process)
>   File "mio5_utils.pyx", line 616, in scipy.io.matlab.mio5_utils.VarReader5.array_from_header (scipy\io\matlab\mio5_utils.c:5903)
>   File "mio5_utils.pyx", line 645, in scipy.io.matlab.mio5_utils.VarReader5.array_from_header (scipy\io\matlab\mio5_utils.c:5332)
>   File "mio5_utils.pyx", line 713, in scipy.io.matlab.mio5_utils.VarReader5.read_real_complex (scipy\io\matlab\mio5_utils.c:6323)
>   File "mio5_utils.pyx", line 417, in scipy.io.matlab.mio5_utils.VarReader5.read_numeric (scipy\io\matlab\mio5_utils.c:3873)
>   File "mio5_utils.pyx", line 353, in scipy.io.matlab.mio5_utils.VarReader5.read_element (scipy\io\matlab\mio5_utils.c:3595)
>   File "streams.pyx", line 324, in scipy.io.matlab.streams.FileStream.read_string (scipy\io\matlab\streams.c:4343)
> IOError: could not read bytes

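I tried downloading a newer version of sklearn, but it didn't help. I found another thread about this problem, but the solution offered there didn't help me: How to use datasets.fetch_mldata() in sklearn?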

Any ideas?

Answers

3

For your reference and that of others: I got almost the same error (on Ubuntu), including the "IOError: could not read bytes" message.

I just posted a solution at:

How to use datasets.fetch_mldata() in sklearn?

Short answer: use the following:

from sklearn.datasets.mldata import fetch_mldata
data = fetch_mldata('mnist-original')

dataset = fetch_mldata('mnist-original', data_home='***') 

Replace *** (keeping the quotes) with your preferred location (the data directory).

-1

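In my case, the root cause was a corrupted mnist-original.mat file. The file was corrupted because I terminated Python before the download had finished. That left a partially downloaded mnist-original.mat in C:\user\Taimi\scikit_learn_data\mldata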

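The solution above worked for me because it simply fetches a fresh copy into a new location. A more direct fix is to find the corrupted mnist-original.mat file, delete it, and run the code again. Running the code downloads mnist-original.mat again. The complete mnist-original.mat is 54,142 KB, so on a slow connection fetch_mldata() takes a few minutes to finish.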
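
The "find it and delete it" step can be sketched in a few lines of standard-library Python. This is only an illustration, not part of scikit-learn: the helper name remove_partial_download and the min_bytes threshold are my own assumptions. The threshold is set a little below the 54,142 KB figure quoted above, so adjust it if your complete file has a different size.

```python
import os


def remove_partial_download(data_home, filename="mnist-original.mat",
                            min_bytes=54000 * 1024):
    """Delete <data_home>/mldata/<filename> if it looks truncated.

    Hypothetical helper: min_bytes is an assumed threshold, chosen
    slightly below the 54,142 KB size reported for the complete file.
    Returns True if a file was deleted, False otherwise.
    """
    path = os.path.join(os.path.expanduser(data_home), "mldata", filename)
    if not os.path.isfile(path):
        return False  # nothing cached, so nothing to clean up
    if os.path.getsize(path) >= min_bytes:
        return False  # at least the expected size; leave it alone
    os.remove(path)   # partial download: remove it so it is fetched again
    return True
```

Calling remove_partial_download("~/scikit_learn_data") checks the default cache directory; pass your own data_home if you set one. After it returns True, running fetch_mldata() again should download a fresh copy. A size check cannot catch every kind of corruption, so deleting the file by hand remains the fallback.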