2016-01-21 71 views
0

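Splitting column values in a CSV file with multiple and duplicated delimiters

I have a messy CSV file. I'm trying to use regular expressions to extract the first name and last name from the values in one column of the CSV file. The first name and the last name should each get their own column.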

CSV file (with different combinations of delimiters):

ID,Description,Number 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops ,SomeValue 
JDo,John Doe - Temp - Client Client Ops ,SomeValue 
JDo,John Doe-Temp-Client Client Ops,SomeValue 
JDo,John Doe - Temp-Client Client Ops,SomeValue 
JDo,John Doe - Temp-Client Client Ops,SomeValue 
JDo,John Doe-Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe - Temp - Client Client Ops,SomeValue 
JDo,John Doe-Temp - Client Client Ops ,SomeValue 
JDo,John Doe-Temp-Client Client Ops ,SomeValue 
JDo,John.Doe - Temp - Client Client Ops,SomeValue 
JDo,John .Doe - Temp - Client Client Ops,SomeValue 
JDo,John. Doe - Temp - Client Client Ops,SomeValue 
JDo,John . Doe - Temp - Client Client Ops,SomeValue 
JDo,John.Doe - Temp - Client Client Ops ,SomeValue 
JDo,John .Doe - Temp - Client Client Ops ,SomeValue 
JDo,John. Doe - Temp - Client Client Ops ,SomeValue 
JDo,John . Doe - Temp - Client Client Ops ,SomeValue 
JDo,John.Doe-Temp-Client Client Ops,SomeValue 
JDo,John .Doe-Temp-Client Client Ops,SomeValue 
JDo,John. Doe-Temp-Client Client Ops,SomeValue 
JDo,John . Doe-Temp-Client Client Ops,SomeValue 
JDo,John.Doe - Temp - Client Client Ops,SomeValue 
JDo,John .Doe - Temp - Client Client Ops,SomeValue 
JDo,John. Doe - Temp - Client Client Ops,SomeValue 
JDo,John . Doe - Temp - Client Client Ops,SomeValue 
JDo,John?Doe - Temp - Client Client Ops,SomeValue 
JDo,John ?Doe - Temp - Client Client Ops,SomeValue 
JDo,John? Doe - Temp - Client Client Ops,SomeValue 
JDo,John ? Doe - Temp - Client Client Ops,SomeValue 
JDo,John?Doe - Temp - Client Client Ops ,SomeValue 
JDo,John ?Doe - Temp - Client Client Ops ,SomeValue 
JDo,John? Doe - Temp - Client Client Ops ,SomeValue 
JDo,John ? Doe - Temp - Client Client Ops ,SomeValue 
JDo,John?Doe-Temp-Client Client Ops,SomeValue 
JDo,John ?Doe-Temp-Client Client Ops,SomeValue 
JDo,John? Doe-Temp-Client Client Ops,SomeValue 
JDo,John ? Doe-Temp-Client Client Ops,SomeValue 
JDo,John?Doe - Temp - Client Client Ops,SomeValue 
JDo,John ?Doe - Temp - Client Client Ops,SomeValue 
JDo,John? Doe - Temp - Client Client Ops,SomeValue 
JDo,John ? Doe - Temp - Client Client Ops,SomeValue 
JDo,"John,Doe - Temp - Client Client Ops",SomeValue 
JDo,"John ,Doe - Temp - Client Client Ops",SomeValue 
JDo,"John, Doe - Temp - Client Client Ops",SomeValue 
JDo,"John , Doe - Temp - Client Client Ops",SomeValue 
JDo," John,Doe - Temp - Client Client Ops ",SomeValue 
JDo," John ,Doe - Temp - Client Client Ops ",SomeValue 
JDo," John, Doe - Temp - Client Client Ops ",SomeValue 
JDo," John , Doe - Temp - Client Client Ops ",SomeValue 
JDo,"John,Doe-Temp-Client Client Ops",SomeValue 
JDo,"John ,Doe-Temp-Client Client Ops",SomeValue 
JDo,"John, Doe-Temp-Client Client Ops",SomeValue 
JDo,"John , Doe-Temp-Client Client Ops",SomeValue 
JDo,"John,Doe - Temp - Client Client Ops",SomeValue 
JDo,"John ,Doe - Temp - Client Client Ops",SomeValue 
JDo,"John, Doe - Temp - Client Client Ops",SomeValue 
JDo,"John , Doe - Temp - Client Client Ops",SomeValue 
JDo,John-Doe - Temp - Client Client Ops,SomeValue 
JDo,John -Doe - Temp - Client Client Ops,SomeValue 
JDo,John- Doe - Temp - Client Client Ops,SomeValue 
JDo,John - Doe - Temp - Client Client Ops,SomeValue 
JDo,John-Doe - Temp - Client Client Ops ,SomeValue 
JDo,John -Doe - Temp - Client Client Ops ,SomeValue 
JDo,John- Doe - Temp - Client Client Ops ,SomeValue 
JDo,John - Doe - Temp - Client Client Ops ,SomeValue 
JDo,John-Doe-Temp-Client Client Ops,SomeValue 
JDo,John -Doe-Temp-Client Client Ops,SomeValue 
JDo,John- Doe-Temp-Client Client Ops,SomeValue 
JDo,John - Doe-Temp-Client Client Ops,SomeValue 
JDo,John-Doe - Temp - Client Client Ops,SomeValue 
JDo,John -Doe - Temp - Client Client Ops,SomeValue 
JDo,John- Doe - Temp - Client Client Ops,SomeValue 
JDo,John - Doe - Temp - Client Client Ops,SomeValue

To add the first name and last name columns, I use the following code:

Function FixRxClaimReportAddFirstLastNameColumn { 
    Param ($csvFile) 

    Write-Host "Adding columns 'First Name' and 'Last Name' to $csvFile" 
    Import-Csv $csvFile | 
    Select-Object *, @{n='First Name'; e={if ($_.Description) { 
     $columnFirstNameValue = $($_.Description -replace '\s+', ' ').split(" ")[0] 
     if ($columnFirstNameValue -notlike "*,*" -and $columnFirstNameValue -notmatch '\?' -and $columnFirstNameValue -notlike "*.*" -and $columnFirstNameValue -notlike "*-*") { 
      $columnFirstNameValue.Trim() 
     } else { 
      $columnFirstNameValue2 = $($_.Description -replace '\s+', ' ') -split {$_ -eq "-" -or $_ -eq "- " -or $_ -eq " -" -or $_ -eq " - " -or $_ -eq "," -or $_ -eq ", " -or $_ -eq " ," -or $_ -eq " , " -or $_ -eq "." -or $_ -eq ". " -or $_ -eq " ." -or $_ -eq " . " -or $_ -eq "?" -or $_ -eq "? " -or $_ -eq " ?" -or $_ -eq " ? "} 
      $columnFirstNameValue2[0].Trim() 
     } 
     }}}, @{n='Last Name'; e={if ($_.Description) { 
     $columnLastNameValue = $($_.Description -replace '\s+', ' ').split(" ")[1] 
     if ($columnLastNameValue -notlike "*,*" -and $columnLastNameValue -notmatch '\?' -and $columnLastNameValue -notlike "*.*" -and $columnLastNameValue -notlike "*-*") { 
      $columnLastNameValue.Trim() 
     } else { 
      $columnLastNameValue2 = $($_.Description -replace '\s+', ' ') -split {$_ -eq "-" -or $_ -eq "- " -or $_ -eq " -" -or $_ -eq " - " -or $_ -eq "," -or $_ -eq ", " -or $_ -eq " ," -or $_ -eq " , " -or $_ -eq "." -or $_ -eq ". " -or $_ -eq " ." -or $_ -eq " . " -or $_ -eq "?" -or $_ -eq "? " -or $_ -eq " ?" -or $_ -eq " ? "} 
      $columnLastNameValue2[1].Trim() 
     } 
     }}} | Export-Csv "$csvFile-Results.csv" -NoTypeInformation -Force 
    Write-Host "Complete." 
    Write-Host "" 
} 

FixRxClaimReportAddFirstLastNameColumn 'C:\Scripts\Tests\Test1.csv' 

When I run this code, every first name value should be John and every last name value should be Doe. Instead, the values vary widely from row to row.

Answers

3

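You're overthinking this. Remove the additional information from the end of the Description field to get the name, then trim the name and split it into first and last name, and finally add those values as new properties to the input object.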

Try this:

Import-Csv 'C:\path\to\input.csv' | ForEach-Object { 
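    # Strip the last two hyphen-delimited fields (e.g. "- Temp - Client Client Ops").
    $rawname = $_.Description -replace '-[^-]*-[^-]*$' 
    # Split the trimmed name on one delimiter (space, ?, ., comma or hyphen) with optional spaces around it.
    $firstname, $lastname = $rawname.Trim() -split ' *[ \?\.,-] *' 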
    $_ | Add-Member -Type NoteProperty -Name FirstName -Value $firstname 
    $_ | Add-Member -Type NoteProperty -Name LastName -Value $lastname 
    $_ 
} | Export-Csv 'C:\path\to\output.csv' -NoType 
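
As a quick check of the two steps described above, the following sketch applies the same two regular expressions to a few Description values copied from the question, without reading or writing any file. The variable names $samples, $description and $results are introduced here for illustration only and are not part of the answer's code.

```powershell
# Sample Description values copied from the CSV in the question.
$samples = @(
    'John Doe - Temp - Client Client Ops',
    'John . Doe-Temp-Client Client Ops',
    ' John , Doe - Temp - Client Client Ops ',
    'John - Doe - Temp - Client Client Ops'
)

$results = foreach ($description in $samples) {
    # Step 1: drop the last two hyphen-delimited fields from the end of the value.
    $rawname = $description -replace '-[^-]*-[^-]*$'
    # Step 2: trim, then split on one delimiter with optional spaces around it.
    $firstname, $lastname = $rawname.Trim() -split ' *[ \?\.,-] *'
    [pscustomobject]@{ Raw = $rawname; FirstName = $firstname; LastName = $lastname }
}

$results | Format-Table -AutoSize
```

Each of the four samples should come out with FirstName John and LastName Doe, regardless of which delimiter sits between the two names.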

Thanks, Ansgar. You're always a great help :)

You could leave out $rawname entirely: $FirstName, $LastName, $null = $_ -split '[\s\?.,-]' | ? {$_} – xXhRQ8sD2L7Z

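@ST8Z6FR57ABE6A8RE9UF It's easier to read and understand if you don't cram everything into a single line. Also, splitting that way has the drawback that it can't handle multiple given names or names that contain hyphens.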
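
To make the drawback from the last comment concrete, here is a small sketch that runs the one-line split on a hyphenated name. The value 'Jean-Luc Picard' is a made-up example, not part of the question's data, and $description is a name introduced here for illustration.

```powershell
# Hypothetical Description value; the hyphenated name is an assumption for illustration.
$description = 'Jean-Luc Picard - Temp - Client Client Ops'

# One-line variant from the comment: split on whitespace, ?, ., comma or hyphen,
# drop empty tokens, and discard everything after the second token.
$FirstName, $LastName, $null = $description -split '[\s\?.,-]' | Where-Object { $_ }

"First: $FirstName"
"Last:  $LastName"
```

Because the hyphen inside the given name is treated as a delimiter, the second token is Luc, so the actual surname ends up in the discarded remainder.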