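I'm trying to load a 160 GB CSV file into SQL Server using a PowerShell script I got from GitHub, and I get this error: "Input array is longer than the number of columns in this table."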
Exception calling "Add" with "1" argument(s): "Input array is longer than the number of columns in this table."
At C:\b.ps1:54 char:26
+ [void]$datatable.Rows.Add <<<< ($line.Split($delimiter))
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : DotNetMethodException
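So I ran the same code against a small 3-line CSV. All the columns match, the first row contains the headers, and there are no extra delimiters, so I don't know why I'm getting this error.
The code is below: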
<# 8-faster-runspaces.ps1 #>
# Set CSV attributes
$csv = "M:\d\s.txt"
$delimiter = "`t"
# Set connstring
$connstring = "Data Source=.;Integrated Security=true;Initial Catalog=PresentationOptimized;PACKET SIZE=32767;"
# Set batchsize to 2000
$batchsize = 2000
# Create the datatable
$datatable = New-Object System.Data.DataTable
# Add generic columns
$columns = (Get-Content $csv -First 1).Split($delimiter)
foreach ($column in $columns) {
    [void]$datatable.Columns.Add()
}
# Setup runspace pool and the scriptblock that runs inside each runspace
$pool = [RunspaceFactory]::CreateRunspacePool(1,5)
$pool.ApartmentState = "MTA"
$pool.Open()
$runspaces = @()
# Setup scriptblock. This is the workhorse. Think of it as a function.
$scriptblock = {
    Param (
        [string]$connstring,
        [object]$dtbatch,
        [int]$batchsize
    )
    $bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connstring,"TableLock")
    $bulkcopy.DestinationTableName = "abc"
    $bulkcopy.BatchSize = $batchsize
    $bulkcopy.WriteToServer($dtbatch)
    $bulkcopy.Close()
    $dtbatch.Clear()
    $bulkcopy.Dispose()
    $dtbatch.Dispose()
}
# Start timer
$time = [System.Diagnostics.Stopwatch]::StartNew()
# Open the text file from disk and process.
$reader = New-Object System.IO.StreamReader($csv)
Write-Output "Starting insert.."
while ((($line = $reader.ReadLine()) -ne $null))
{
    [void]$datatable.Rows.Add($line.Split($delimiter))
    if ($datatable.rows.count % $batchsize -eq 0)
    {
        $runspace = [PowerShell]::Create()
        [void]$runspace.AddScript($scriptblock)
        [void]$runspace.AddArgument($connstring)
        [void]$runspace.AddArgument($datatable) # <-- Send datatable
        [void]$runspace.AddArgument($batchsize)
        $runspace.RunspacePool = $pool
        $runspaces += [PSCustomObject]@{ Pipe = $runspace; Status = $runspace.BeginInvoke() }
        # Overwrite object with a shell of itself
        $datatable = $datatable.Clone() # <-- Create new datatable object
    }
}
# Close the file
$reader.Close()
# Wait for runspaces to complete
while ($runspaces.Status.IsCompleted -notcontains $true) {}
# End timer
$secs = $time.Elapsed.TotalSeconds
# Cleanup runspaces
foreach ($runspace in $runspaces) {
    [void]$runspace.Pipe.EndInvoke($runspace.Status) # EndInvoke method retrieves the results of the asynchronous call
    $runspace.Pipe.Dispose()
}
# Cleanup runspace pool
$pool.Close()
$pool.Dispose()
# Cleanup SQL Connections
[System.Data.SqlClient.SqlConnection]::ClearAllPools()
# Done! Format output then display
$totalrows = 1000000
$rs = "{0:N0}" -f [int]($totalrows/$secs)
$rm = "{0:N0}" -f [int]($totalrows/$secs * 60)
$mill = "{0:N0}" -f $totalrows
Write-Output "$mill rows imported in $([math]::round($secs,2)) seconds ($rs rows/sec and $rm rows/min)"
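Typically, this error means that some rows contain unexpected embedded delimiters, in this case tabs. It's just dirty input data. You could read each line, replace the tabs with an empty string, and compare the original length with the reduced length to see which rows have more tabs than there are columns. If you have four columns, you would expect each row to shrink by three characters. –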
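The length comparison suggested in the comment above could be sketched roughly as follows. `Find-BadDelimiterLine` and its parameters are hypothetical names, not part of the original script, and the sketch assumes a single-character delimiter and that the first line of the file has the correct number of columns.

```powershell
# Hypothetical helper (not in the original script): reports lines whose
# delimiter count differs from the delimiter count of the first line.
# Assumes a single-character delimiter.
function Find-BadDelimiterLine {
    param (
        [Parameter(Mandatory)] [string]$Path,   # use a full path; .NET ignores PowerShell's current location
        [string]$Delimiter = "`t",
        [int]$MaxReport = 20                    # assumption: stop early so a huge file is not scanned to the end
    )
    $reader = New-Object System.IO.StreamReader($Path)
    try {
        $header = $reader.ReadLine()
        if ($null -eq $header) { return }
        # Number of delimiters = characters lost when the delimiter is removed
        $expected = $header.Length - $header.Replace($Delimiter, "").Length
        $lineNumber = 1
        $found = 0
        while ($null -ne ($line = $reader.ReadLine())) {
            $lineNumber++
            $actual = $line.Length - $line.Replace($Delimiter, "").Length
            if ($actual -ne $expected) {
                [PSCustomObject]@{ LineNumber = $lineNumber; Expected = $expected; Actual = $actual }
                $found++
                if ($found -ge $MaxReport) { break }
            }
        }
    }
    finally {
        $reader.Dispose()
    }
}
```

Running something like `Find-BadDelimiterLine -Path "M:\d\s.txt"` would list the first few offending line numbers. Rows with `Actual` greater than `Expected` are the ones that would make `Rows.Add` throw; rows with fewer delimiters are accepted by `Rows.Add` but may still indicate broken data.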
@Laughing Vergil Thanks for your reply. It's the same with a comma-delimited file; I get the same error. – Zack
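With some data there can also be a line-terminator problem: UNIX-style newlines in a Windows-style text file can cause two lines to come back from a single read operation. Embedded newlines or CR/LF pairs can also confuse the processing. –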
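One rough way to check for mixed line terminators is to count each kind in a sample taken from the start of the file. `Measure-LineEnding` and its `SampleChars` parameter are hypothetical names used only for this sketch, and sampling just the beginning of a 160 GB file is an assumption that may miss problems further in.

```powershell
# Hypothetical helper (not in the original script): counts CRLF pairs,
# bare LF, and bare CR characters in the first part of a text file.
function Measure-LineEnding {
    param (
        [Parameter(Mandatory)] [string]$Path,   # use a full path
        [int]$SampleChars = 1MB                 # assumption: only the start of the file is sampled
    )
    $reader = New-Object System.IO.StreamReader($Path)
    try {
        $buffer = New-Object char[] $SampleChars
        $read = $reader.ReadBlock($buffer, 0, $SampleChars)
    }
    finally {
        $reader.Dispose()
    }
    $text = New-Object System.String ($buffer, 0, $read)
    [PSCustomObject]@{
        CRLF   = [regex]::Matches($text, '\r\n').Count      # Windows-style terminators
        BareLF = [regex]::Matches($text, '(?<!\r)\n').Count # LF not preceded by CR
        BareCR = [regex]::Matches($text, '\r(?!\n)').Count  # CR not followed by LF
    }
}
```

If more than one of the three counts is non-zero, the file mixes terminator styles. Note that `StreamReader.ReadLine` treats `\r`, `\n`, and `\r\n` each as a line end, so with the script in the question a mix mainly matters when a terminator is embedded inside a field.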