從.npy文件讀取數組到Fortran 90

我正在使用Python生成一些初始數據，例如'X'等二維數組的形式，然後使用Fortran對它們進行一些計算。最初當數組大小約爲10,000 x 10,000時，np.savetxt在速度方面表現良好。但是一旦我開始增加陣列的尺寸，savetxt的速度就會明顯變慢。所以我嘗試了np.save，這樣可以節省很多的速度，但是文件保存爲.npy格式。我將如何閱讀Fortran中的這樣一個文件來重建原始數組？從我讀過的內容來看，二進制通常會導致最低的空間消耗和最快的速度。從.npy文件讀取數組到Fortran 90

Fortran 90中，

open(10,file='/home/X.npy') 

read(10,*)X

我收到以下錯誤：Fortran運行時錯誤：錯誤的實數列表中輸入第1項

編輯：在Python中我做到這一點，

import numpy as np 
x = np.linspace(-50,50,10) 
y = np.linspace(-50,50,10) 
X,Y = np.meshgrid(x,y) 
np.save('X',X)

然後在Fortran中，我這樣做，

program read_unformatted_binary_from_python 
    use iso_c_binding 
    implicit none 

    integer, parameter :: DPR = selected_real_kind(p=15) 
    character(len=*), parameter :: filename = 'X.npy' 

    integer, parameter :: N = 10 
    real(DPR), allocatable, dimension(:, :) :: X 

    allocate(X(N, N)) 


    open(40, file=filename, status='old', access='stream', form='unformatted') 
    read(40) X 
    close(40) 

    write(*,*) X 

end program read_unformatted_binary_from_python

輸出開始與1.8758506894003703E-309 1.1711999948023422E + 171 5.2274167985502976E-037 8.4474009688929314E + 252 2.345660E + 180 9.9215260506210473E + 247 2.1620996769994603E + 233 7.5805790251297605E-096 3.4756671925988047E-152 6.5549091408576423E-260 -50.000000000000000 - 38.888888888888886 -27.777777777777779等

使用.bin格式時不會發生此錯誤，僅在使用導致npy的np.save時纔會發生。

來源

2017-09-20 ThunderFlash

Fortran不會自動讀取並正確解釋任意二進制文件。您將不得不尋找或編寫專門針對該格式的例程。這是相對良好的文件和您最喜愛的搜索引擎將幫助您追蹤其描述。順帶一提，錯誤消息表明文件的前4個（或可能8個）字節不能被解釋爲Fortran「真實」。 –

更糟糕的是，它已經在代碼片段中的格式化閱讀中打開，因此它試圖以文本形式讀取一個數字。這是行不通的。您正在談論二進制文件，但是您沒有打開二進制文件（未格式化）的I/O。 –

並始終使用標籤[tag：fortran]，以便更多人可以看到您的問題。 Fortran 90只是一箇舊版本 - 現在已過時。 Fortran 90不適合執行此任務，您需要從Fortran 2003中流式I/O。 –

Vladimir F是正確的，您想要在Fortran中「流」訪問「原始二進制文件」文件。這裏有一個MWE：

的Python

import numpy as np 
A = np.random.rand(10000, 10000) 
print(A.sum()) 
A.tofile('data.bin')

的Fortran

我Fortran的例子是不是需要，還可以多一點的時間，因爲我用許多不同的系統和編譯器套件，並且也很討厭大靜數組（我畢竟是Fortran用戶）。

我在MacBook Pro上用Python 2.7.x，Numpy 13.x和Homebrew GCC 6.3.0_1上的gfortran實現了快速編碼，但這應該適用於所有系統。

更新：這裏需要特別注意陣列的形狀和大小。如果dat被分配爲大於文件中的內容，那麼流式傳輸read應該嘗試填充整個陣列，點擊EOF符號，併發出錯誤。在Python中，np.fromfile()方法將讀取多達EOF，然後返回具有適當長度的一維數組，即使A最初是多維的。這是因爲原始二進制文件沒有元數據，並且只是一個連續的來自RAM的字節串。

其結果是，在Python以下行產生相同的文件：

A = np.random.rand(10000, 10000) 
A.tofile('file.whatever') 
A.ravel().tofile('file.whatever') 
A.reshape((100, 1000, 1000)).tofile('file.whatever')

並且該文件可以被讀取，並再成形爲：

B = np.fromfile('file.whatever').reshape(A.shape) 
B = np.fromfile('file.whatever').reshape((100, 1000, 100, 10)) 
# or something like 
B = np.fromfile('file.whatever') # just a 1D array 
B.resize(A.shape) # resized in-place

在Fortran中很容易讀整個原始文件使用流訪問而不知道其大小提前，雖然你顯然需要某種用戶輸入，以重塑數據：

program read_unformatted_binary_from_python 
    use iso_c_binding 
    implicit none 

    integer, parameter :: DPR = selected_real_kind(p=15) 
    character(len=*), parameter :: filename = 'data.bin' 
    integer :: N = 10000, bytes, reals, M 
    real(DPR), allocatable :: A(:,:), D(:, :), zeros(:) 
    real(DPR), allocatable, target :: B(:) 
    real(DPR), pointer :: C(:, :) 

    allocate(A(N, N)) 

    open(40, file=filename, status='old', access='stream', form='unformatted') 

    read(40) A 
    write(*,*) 'sum of A', sum(A) 

    inquire(unit=40, size=bytes) 
    reals = bytes/8 
    allocate(B(reals)) 

    read(40, pos=1) B 
    write(*,*) 'sum of B', sum(B) 

    ! now reshape B in-place assuming the user wants the first dimension 
    ! (which would be the outer dimension in Python) to be length 100 
    N = 100 
    if(mod(reals, N) == 0) then 
     M = reals/N 
     call C_F_POINTER (C_LOC(B), C, [N, M]) 
     write(*, *) 'sum of C', sum(C) 
     write(*, *) 'shape of C', shape(C) 
    else 
     write(*,*) 'file size is not divisible by N!, did not point C to B' 
    end if 

    ! now reshape B out-of-place with first dimension length 9900, and 
    ! pad the result so that there is no size mismatch error 
    N = 9900 
    M = reals/N 
    if(mod(reals, N) > 0) M=M+1 

    allocate(D(N, M)) 
    allocate(zeros(N), source=real(0.0, DPR)) 
    D = reshape(B, [N, M], pad=zeros) 

    write(*,*) 'sum of D', sum(D) 
    write(*,*) 'shape of D', shape(D) 

    ! obviously you can also work with shape(A) in fortran the same way you 
    ! would use A.shape() in Python, if you already knew that A was the 
    ! correct shape of the data 
    deallocate(D) 
    allocate(D, mold=A) 
    D = reshape(B, shape(A)) 
    write(*,*) 'sum of D', sum(D) 
    write(*,*) 'shape of D', shape(D) 

    ! or, just directly read it in, skipping the whole reshape B part 
    read(40, pos=1) D 
    write(*,*) 'sum of D', sum(D) 

    close(40) 

end program read_unformatted_binary_from_python

來源

2017-09-21 00:02:23 CAZT

顯然，在Fortran中使用numpy庫來讀取.npy文件。查看本食譜的底部，其中包括libnpy的下載鏈接： http://scipy-cookbook.readthedocs.io/items/InputOutput.html – CAZT

好東西！我不在我的電腦上試用這個，但這是向後兼容？如在「F90」中還是僅在2003年及以上，「流」會工作嗎？ libnpy看起來是一個很好的工具，它會檢查出 – ThunderFlash

Stream是Fortran 2003.我已經說過了，所以它在Fortran 90中不起作用.Fortran 90已經過時，沒有人使用它。有些人使用Fortran 95，是的，但通常他們還必須使用許多非標準擴展。 Fortran 90已經死了。除了一些非常舊的版本外，沒有Fortran 90，所以不要擔心。 –

從.npy文件讀取數組到Fortran 90

回答

相關問題