Get-Content in chunks of data

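I have large files of about 3GB each. These files have information sections at the top and at the bottom, and the number of lines in those sections varies from file to file, i.e.: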

infostart1 
infostart2 
START-OF-DATA 
line1 
line2 
... 
... 
... 
linen 
END-OF-DATA 
infoend1 
infoend2

and so on. I want to create a data file that contains only the lines between START-OF-DATA and END-OF-DATA.

$DataStartLineNumber = (Select-String $File -Pattern 'START-OF-DATA' | Select-Object -ExpandProperty 'LineNumber')[0] 
$DataEndLineNumber = (Select-String $File -Pattern 'END-OF-DATA' | Select-Object -ExpandProperty 'LineNumber')[-1]

I have tried:

Get-Content -Path $File | Select-Object -Index ($DataStartLineNumber..($DataEndLineNumber-2)) | Add-Content $Destination

But Get-Content fails because of its memory usage.

I have also tried:

Get-Content -Path $File -ReadCount 10000 | Select-Object -Index ($DataStartLineNumber..$DataEndLineNumber) | Add-Content $Destination

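However, as expected, this does not work: with -ReadCount each pipeline object is a batch of lines rather than a single line, so the line numbers no longer match the indexes.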
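To see why the -ReadCount attempt picks the wrong content, here is a small demonstration. The ten-line temporary file is made up purely for illustration; it is not part of the original question.

# Hypothetical 10-line sample file, created only for this demonstration
$tmp = [System.IO.Path]::GetTempFileName()
1..10 | ForEach-Object { "line$_" } | Set-Content -Path $tmp

# Without -ReadCount: each pipeline object is one line,
# so index 2 selects the third line
$single = @(Get-Content -Path $tmp | Select-Object -Index 2)

# With -ReadCount 4: each pipeline object is an array of up to 4 lines,
# so index 2 selects the third batch (lines 9 and 10), not the third line
$batched = Get-Content -Path $tmp -ReadCount 4 | Select-Object -Index 2

Remove-Item -Path $tmp

So the indexes passed to Select-Object would have to be batch numbers, not the line numbers returned by Select-String.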

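I don't want to read the file line by line, because that takes a long time. Is there a way to read chunks of data from the file and apply a filter that eliminates everything before the start of the data and everything after the end of the data? Or to copy the file as it is and then efficiently delete everything before the start of the data and after the end of the data?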

2017-02-22 yasemin

http://stackoverflow.com/questions/4192072/how-to-process-a-file-in-powershell-line-by-line-as-a-stream and http://stackoverflow.com/questions/32336756/alternative-to-get-content – Matt

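Get-Content does poorly with large files. A stream reader would be the way to go here. Keep a couple of flags/booleans so that you know when to start and stop processing the lines in the file. – Matt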
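A minimal sketch of the flag-based idea from the comment above, assuming each marker occurs once and that the data should be written straight to a destination file rather than held in memory. The function name Copy-DataSection and its parameter names are hypothetical, chosen only for this illustration.

# Hypothetical helper: streams the lines between two marker lines to a new file
function Copy-DataSection {
    param(
        [string]$Path,                        # source file (absolute path recommended)
        [string]$Destination,                 # output file (absolute path recommended)
        [string]$StartBoundary = 'START-OF-DATA',
        [string]$EndBoundary   = 'END-OF-DATA'
    )

    $reader = New-Object System.IO.StreamReader $Path
    $writer = New-Object System.IO.StreamWriter $Destination
    $inData = $false   # flag: are we currently between the two markers?

    try {
        # ReadLine() returns $null at the end of the stream
        while ($null -ne ($line = $reader.ReadLine())) {
            if (-not $inData) {
                # Still in the header section: wait for the start marker
                if ($line -match $StartBoundary) { $inData = $true }
            }
            elseif ($line -match $EndBoundary) {
                # End marker reached: skip the rest of the file
                break
            }
            else {
                $writer.WriteLine($line)
            }
        }
    }
    finally {
        # Close both files even if an error occurs part-way through
        $reader.Dispose()
        $writer.Dispose()
    }
}

It could then be called as Copy-DataSection -Path $File -Destination $Destination. Only one line is held in memory at a time, and reading stops at the end marker, so the trailing information section is never processed.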

Thank you Matt, I will look into it. I hope I can find an efficient way. – yasemin

As Matt mentions in the comments, you can read the file line by line yourself with a StreamReader.

I would suggest "skipping ahead" to the start with one loop, and then collecting the relevant lines with another:

$Reader = New-Object System.IO.StreamReader 'C:\Path\to\file.txt' 
$StartBoundary = 'START-OF-DATA' 
$EndBoundary = 'END-OF-DATA' 

# Skip ahead to the starting boundary 
while(-not($Reader.EndOfStream) -and ($line = $Reader.ReadLine()) -notmatch $StartBoundary){ <#nothing to be done here#> } 

# Output all lines until we hit the end boundary 
$lines = while(-not($Reader.EndOfStream) -and ($line = $Reader.ReadLine()) -notmatch $EndBoundary){ $line } 

# $lines now contains the data; release the file handle when done
$Reader.Dispose()

2017-02-22 18:01:19

Thanks Mathias, I used your method to find the start and end lines. It works :) – yasemin

I don't know whether this will solve your out-of-memory problem, but try this:

$template=@"
{Content*:START-OF-DATA 
line1 
END-OF-DATA} 
{Content*:START-OF-DATA 
line2 
Line3 
END-OF-DATA} 
"@ 

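Get-ChildItem "C:\temp\test" -File | ForEach-Object {

    # Parse each file against the example-driven template
    $Data = Get-Content $_.FullName | ConvertFrom-String -TemplateContent $template

    # Emit a result only for files where the template matched
    if ($null -ne $Data)
    {
        [pscustomobject]@{FullName=$_.FullName; Content=$Data}
    }

} | Format-Table -Wrap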

2017-02-22 17:54:04 Esperento57

I haven't had a chance to try this solution, but thank you Esperento57 – yasemin
