Azure Data Lake Store - An existing connection was forcibly closed by the remote host

I use the DataLakeStoreFileSystemManagementClient class to read files from Data Lake Store. We open a stream for a file with code like the following, then read it byte by byte and process it. This is a special case where we cannot use U-SQL for the data processing.
m_adlsFileSystemClient = new DataLakeStoreFileSystemManagementClient(…);
return m_adlsFileSystemClient.FileSystem.OpenAsync(m_connection.AccountName, path);
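For context, the download path looks roughly like this. This is a hypothetical reconstruction based on the `CopyStream` and `DownloadFileAsync` frames in the stack trace below; the buffer size, method signatures, and the local output stream are illustrative assumptions, not the exact production code:

```csharp
// Hypothetical sketch of the code path named in the stack trace.
private async Task DownloadFileAsync(string path, Stream output)
{
    // OpenAsync returns the response stream for the remote file.
    using (var input = await m_adlsFileSystemClient.FileSystem.OpenAsync(
        m_connection.AccountName, path))
    {
        CopyStream(input, output);
    }
}

private static void CopyStream(Stream input, Stream output)
{
    var buffer = new byte[81920]; // illustrative buffer size
    int bytesRead;
    // The IOException below is thrown from this Read call when the
    // remote host resets the connection mid-transfer.
    while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        output.Write(buffer, 0, bytesRead);
    }
}
```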
The process can take up to 60 minutes to read and process a file. The problem: during the stream read I frequently get an "An existing connection was forcibly closed by the remote host" exception, especially when reading takes 20 minutes or more. It should not be a timeout, because I created the DataLakeStoreFileSystemManagementClient with the proper client timeout settings. You can find the exception details below. The exception appears to be random and is hard to predict; it can occur at the 15th minute of processing just as well as the 50th.
Is this a normal way to read files from Data Lake Store? Is there any limit (or recommendation) on the total time a stream to a Data Lake Store file can stay open?
Exception:
System.AggregateException: One or more errors occurred. ---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.Http.HttpClientHandler.WebExceptionWrapperStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.Net.Http.DelegatingStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at DataLake.Timeout.Research.FileDownloader.CopyStream(Stream input, Stream output) in C:\TFS-SED\Main\Platform\DataNode\DataLake\DataLake.Timeout.Research\FileDownloader.cs:line 107
   at DataLake.Timeout.Research.FileDownloader.<DownloadFileAsync>d__6.MoveNext() in C:\TFS-SED\Main\Platform\DataNode\DataLake\DataLake.Timeout.Research\FileDownloader.cs:line 96