2016-08-02
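Segmentation fault writing xarray dataset to netCDF or dataframe

I get a segmentation fault when working with an xarray dataset created from multiple grib2 files. The fault occurs when writing to netCDF as well as when writing to a dataframe. Any suggestions on what is going wrong are appreciated. Files (from http://dd.weather.gc.ca/model_hrdps/west/grib2/00/000/): 'CMC_hrdps_west_RH_TGL_2_ps2.5km_2016072800_P015-00.grib2', ...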

files = os.listdir(download_dir) 

Example: 'CMC_hrdps_west_TMP_TGL_2_ps2.5km_2016072800_P011-00.grib2'
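
Note that os.listdir returns bare file names, not paths, so a list built this way only resolves when the script runs from inside download_dir. A minimal sketch of building full paths instead, assuming the directory may also contain non-GRIB files; the helper name list_grib2_paths is hypothetical and not part of the original post:

```python
import os


def list_grib2_paths(download_dir):
    """Return full paths of the .grib2 files found in download_dir."""
    # os.listdir yields names only, so join each one back onto the directory.
    # The result is sorted only to make the file order deterministic.
    return sorted(
        os.path.join(download_dir, name)
        for name in os.listdir(download_dir)
        if name.endswith(".grib2")
    )
```

The returned list could then be passed in place of files in the open_mfdataset call.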

# import and combine all grib2 files 
ds = xr.open_mfdataset(files, concat_dim='time', engine='pynio')

<xarray.Dataset> 
Dimensions:    (time: 48, xgrid_0: 685, ygrid_0: 485)
Coordinates:
    gridlat_0  (ygrid_0, xgrid_0) float32 44.6896 44.6956 44.7015 44.7075 ...
  * ygrid_0    (ygrid_0) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 ...
  * xgrid_0    (xgrid_0) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 ...
  * time       (time) datetime64[ns] 2016-07-28T01:00:00 2016-07-28T02:00:00 ...
    gridlon_0  (ygrid_0, xgrid_0) float32 -129.906 -129.879 -129.851 ...
Data variables:
    u          (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    gridrot_0  (time, ygrid_0, xgrid_0) float32 nan nan nan nan nan nan nan ...
    Qli        (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    Qsi        (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    p          (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    rh         (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    press      (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    t          (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    vw_dir     (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...

Writing out to netCDF:

ds.to_netcdf('test.nc') 

Segmentation fault (core dumped)

Answer

PyNIO does not play well with multi-threading. Try adding lock=True to open_mfdataset (we should be setting this by default).

Try adding preprocess=lambda x: x.load() to the open_mfdataset call. This ensures that each dataset is fully loaded into memory before the next one is processed.
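
The preprocess suggestion can be sketched as follows. This is a minimal sketch rather than a definitive implementation: the helper names load_eagerly and open_combined are hypothetical, the keyword arguments mirror the call used in this thread, and newer xarray releases may expect different arguments.

```python
def load_eagerly(ds):
    """Read one file's dataset fully into memory and return it."""
    # Dataset.load() pulls the data into memory and returns the dataset,
    # so no lazy reads from the file remain to be done at write time.
    return ds.load()


def open_combined(paths):
    """Open each GRIB2 file, load it, then concatenate along 'time'."""
    # Imported inside the function so load_eagerly has no dependencies.
    # Assumes xarray and PyNIO are installed, as in the original question.
    import xarray as xr

    return xr.open_mfdataset(
        paths,
        concat_dim="time",
        engine="pynio",
        preprocess=load_eagerly,  # applied to each file's dataset in turn
    )
```

Calling ds = open_combined(files) and then ds.to_netcdf('test.nc') writes from data already held in memory, at the cost of holding every file's data in memory at once.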

Thanks for the suggestion, but even with lock=True I still get the segfault... – nicway

I added another alternative suggestion. – shoyer

That worked, thanks! The final call was: ds = xr.open_mfdataset(files, concat_dim='time', engine='pynio', preprocess=lambda x: x.load()) – nicway