2017-01-05 188 views
2

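Command to PowerShell replace - special character

I am creating a script that copies a file, renames it, and then strips certain special characters from its contents. One of those special characters is some kind of ASCII apostrophe that I cannot type on the keyboard. I can copy and paste it, but the replace function does not work.

Open the file > search for the weird apostrophe ’ and replace it with nothing. I would prefer to replace it with a normal apostrophe, but I don't know how that is done, and the biggest problem at the moment is that I can't "see" this weird apostrophe, which is generated automatically in the file I am modifying. Any help is much appreciated. Thanks :)

Apostrophe in the file: ’

Normal apostrophe: '

This is a chunk of the batch file that I have isolated for testing.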

@echo off

set YYMMDD=%DATE:~-2,2%%DATE:~-7,2%%DATE:~-10,2%
set DDMMYYYY=%DATE:~-10,2%%DATE:~-7,2%%DATE:~-4,4%
set YYYY-MM-DD=%DATE:~-4,4%-%DATE:~-7,2%-%DATE:~-10,2%

powershell -Command "(gc 'C:\LOCATION\Client_List_%DDMMYYYY%.csv') -replace '’', '' | Out-File 'C:\LOCATION\Client_List_%DDMMYYYY%.csv'"

Echo Done
+0

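What is the ASCII code of the weird apostrophe? By the way, the backtick character looks a bit like a weird apostrophe, but not like the one you showed us. The backtick is used as the escape character in PS strings. –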

+0

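Does `echo ’` in the batch file print the special apostrophe (to be sure that it is not an encoding issue)? Also, you need to escape that special apostrophe inside a single-quoted string, because to PowerShell the special apostrophe is a valid single-quote character: `-replace '’’', ''`. – PetSerAl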

+0

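You could probably use a regular expression. If I copy and paste the weird apostrophe into regex101, it does recognize it as a different character. Even if you don't know what it is, that would at least let you replace it. – Nick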

Answers

1
set "fileIn=C:\LOCATION\Client_List_%DDMMYYYY%.csv" 
set "fileOu=C:\LOCATION\Client_List_%DDMMYYYY%.csv" 
powershell -c "(gc '%fileIn%').Replace('‘‘','').Replace('’’','')|Out-File '%fileOu%'" 

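The weird character is U+2019 Right Single Quotation Mark, normally a closing quote. It may be paired with a different opening quote; in the example above, that is U+2018 Left Single Quotation Mark.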

Get-Help 'about_Quoting_Rules'

Quotation marks are used to specify a literal string. You can enclose a string in single quotation marks (') or double quotation marks (").

In fact, PowerShell accepts two different sets of quotes:

  • double quotes " “ ” „
  • single quotes ' ‘ ’ ‚ ‛

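As far as I know, all of these quotation marks are present in most Windows ANSI code pages (1252, 1250, 1257, 1253, 1251, 1254, 1255, 1256, 1258), so they can be used literally in a .bat script saved as ANSI. The exception is the last one, U+201B Single High-Reversed-9 Quotation Mark. In that case, use $([char]0x201B) instead of '‛‛' as follows: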

rem  cast [char] to `[string]` ↓↓↓↓↓↓↓↓ 
powershell -c "(gc '%fileIn%').Replace([string]$([char]0x201B) , '')" 
rem            ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑ 

or as follows:

rem [char] can't be empty so specify `[string]`   ↓↓↓↓↓↓↓↓ 
powershell -c "(gc '%fileIn%').Replace($([char]0x201B) , [string]'')" 
rem          ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑ 
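
The question originally asked for a normal apostrophe rather than nothing, and the same code-point technique can be pointed at U+2019 for that. The following is only a minimal sketch: the sample text stands in for the CSV content and the variable names are hypothetical.

```powershell
# Build the character from its code point so that this script contains
# ASCII only and does not depend on the encoding it is saved in.
$weird = [string][char]0x2019          # U+2019 Right Single Quotation Mark

# Hypothetical sample line standing in for one row of the CSV.
$sample = "O${weird}Brien,Dublin"

# String.Replace(string, string): swap the typographic quote for U+0027.
$fixed = $sample.Replace($weird, "'")
$fixed                                 # O'Brien,Dublin
```

The same call could in principle be embedded in the `powershell -c "..."` one-liners above, but quoting a lone apostrophe inside that command line needs care, so a separate .ps1 file may be simpler.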

Analysis and explanation

The following PowerShell snippet shows an excerpt from the Unicode database (character names ending in Quotation Mark or containing Apostrophe):

PS D:> 0x22,0x27,0x00AB,0x00BB,0x2018,0x2019,0x201A,0x201B,0x201C,0x201D,0x201E,0x201F, 
    0x2039,0x203A,0x2E42,0x301D,0x301E,0x301F,0x055A | Get-CharInfo | Format-Table -AutoSize 

Char CodePoint Category                Description
---- --------- --------                -----------
   " U+0022    OtherPunctuation        Quotation Mark
   ' U+0027    OtherPunctuation        Apostrophe
   « U+00AB    InitialQuotePunctuation Left-Pointing Double Angle Quotation Mark
   » U+00BB    FinalQuotePunctuation   Right-Pointing Double Angle Quotation Mark
   ‘ U+2018    InitialQuotePunctuation Left Single Quotation Mark
   ’ U+2019    FinalQuotePunctuation   Right Single Quotation Mark
   ‚ U+201A    OpenPunctuation         Single Low-9 Quotation Mark
   ‛ U+201B    InitialQuotePunctuation Single High-Reversed-9 Quotation Mark
   “ U+201C    InitialQuotePunctuation Left Double Quotation Mark
   ” U+201D    FinalQuotePunctuation   Right Double Quotation Mark
   „ U+201E    OpenPunctuation         Double Low-9 Quotation Mark
   ‟ U+201F    InitialQuotePunctuation Double High-Reversed-9 Quotation Mark
   ‹ U+2039    InitialQuotePunctuation Single Left-Pointing Angle Quotation Mark
   › U+203A    FinalQuotePunctuation   Single Right-Pointing Angle Quotation Mark
   ⹂ U+2E42    OtherNotAssigned        Undefined
   〝 U+301D    OpenPunctuation         Reversed Double Prime Quotation Mark
   〞 U+301E    ClosePunctuation        Double Prime Quotation Mark
   〟 U+301F    ClosePunctuation        Low Double Prime Quotation Mark
   ՚ U+055A    OtherPunctuation        Armenian Apostrophe

(Output from a modified Get-CharInfo cmdlet.) The original Get-CharInfo module can be downloaded from http://poshcode.org/5234.
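
Get-CharInfo is a third-party module, so for a quick check of which character a file really contains, a similar listing can be approximated with built-in .NET calls. This is a rough sketch, not the module's implementation; the sample string and the property names are illustrative assumptions.

```powershell
# Hypothetical sample text; U+2019 is built from its code point so the
# script itself stays pure ASCII.
$sample = "it" + [char]0x2019 + "s a 'test'"

# Report every non-ASCII character with its code point and Unicode category.
$report = foreach ($ch in $sample.ToCharArray()) {
    if ([int]$ch -gt 127) {
        [pscustomobject]@{
            Char      = $ch
            CodePoint = 'U+{0:X4}' -f [int]$ch
            Category  = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($ch)
        }
    }
}
$report | Format-Table -AutoSize
```

For the sample above this reports a single row, U+2019 with category FinalQuotePunctuation, which matches the table entry for the Right Single Quotation Mark.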

The next PowerShell script complements the results above by showing some valid (and, in my locale, invalid) combinations of quotes:

$arrSingleQuotes = 
''' U+0027 Apostrophe '''        , 
‘‘‘ U+2018 Left Single Quotation Mark ‘‘‘    , 
’’’ U+2019 Right Single Quotation Mark ’’’    , 
‚‚‚ U+201A Single Low-9 Quotation Mark ‚‚‚    , 
‛‛‛ U+201B Single High-Reversed-9 Quotation Mark ‛‛‛  , 
‘‘‘ U+2018 (Left/Right) Single Quotation Mark U+2019 ’’’ , 
’’’ U+2019 (Right/Left) Single Quotation Mark U+2018 ‘‘‘ 
'$arrSingleQuotes (any combination)' 
$arrSingleQuotes 

$arrDoubleQoutes = 
""" U+0022 Quotation Mark """       , 
“““ U+201C Left Double Quotation Mark “““    ,
””” U+201D Right Double Quotation Mark ”””    ,
„„„ U+201E Double Low-9 Quotation Mark „„„    ,
“““ U+201C (Left/Right) Double Quotation Mark U+201D ””” ,
””” U+201D (Right/Left) Double Quotation Mark U+201C “““
'$arrDoubleQoutes (any combination)' 
$arrDoubleQoutes 

$noQuotes = @" 
« U+00AB Left-Pointing Double Angle Quotation Mark 
» U+00BB Right-Pointing Double Angle Quotation Mark 
‟ U+201F Double High-Reversed-9 Quotation Mark 
⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK 
‹ U+2039 Single Left-Pointing Angle Quotation Mark 
› U+203A Single Right-Pointing Angle Quotation Mark 
〝 U+301D Reversed Double Prime Quotation Mark 
〞U+301E Double Prime Quotation Mark 
〟U+301F Low Double Prime Quotation Mark 
՚ U+055A Armenian Apostrophe      
"@ 
'$noQuotes' 
$noQuotes 

Output

PS D:> D:\PShell\SO\41488245_quotes.ps1 

$arrSingleQuotes (any combination) 
' U+0027 Apostrophe ' 
‘ U+2018 Left Single Quotation Mark ‘ 
’ U+2019 Right Single Quotation Mark ’ 
‚ U+201A Single Low-9 Quotation Mark ‚ 
‛ U+201B Single High-Reversed-9 Quotation Mark ‛ 
‘ U+2018 (Left/Right) Single Quotation Mark U+2019 ’ 
’ U+2019 (Right/Left) Single Quotation Mark U+2018 ‘ 

$arrDoubleQoutes (any combination) 
" U+0022 Quotation Mark " 
“ U+201C Left Double Quotation Mark “
” U+201D Right Double Quotation Mark ”
„ U+201E Double Low-9 Quotation Mark „
“ U+201C (Left/Right) Double Quotation Mark U+201D ”
” U+201D (Right/Left) Double Quotation Mark U+201C “

$noQuotes 
« U+00AB Left-Pointing Double Angle Quotation Mark 
» U+00BB Right-Pointing Double Angle Quotation Mark 
‟ U+201F Double High-Reversed-9 Quotation Mark 
⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK 
‹ U+2039 Single Left-Pointing Angle Quotation Mark 
› U+203A Single Right-Pointing Angle Quotation Mark 
〝 U+301D Reversed Double Prime Quotation Mark 
〞U+301E Double Prime Quotation Mark 
〟U+301F Low Double Prime Quotation Mark 
՚ U+055A Armenian Apostrophe      

Note that ⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK exists in the Unicode database and renders correctly in PowerShell ISE.

Addendum: I found more quotation-mark candidates (only the results obtained from the script Excerpt_From_UnicodeDataTxt.ps1 are shown):

PS > $x = .\tests\Excerpt_From_UnicodeDataTxt.ps1 -SearchString "Quotation|Apostrophe" | 
    Where-Object {$_.Category -match 'Punctuation'} 

PS > $x.Count 
23 

PS > $x 

Char CodePoint Category                   Description
---- --------- --------                   -----------
   " U+0022    Po-OtherPunctuation        Quotation Mark
   ' U+0027    Po-OtherPunctuation        Apostrophe
   « U+00AB    Pi-InitialQuotePunctuation Left-Pointing Double Angle Quotation Mark
   » U+00BB    Pf-FinalQuotePunctuation   Right-Pointing Double Angle Quotation Mark
   ՚ U+055A    Po-OtherPunctuation        Armenian Apostrophe
   ‘ U+2018    Pi-InitialQuotePunctuation Left Single Quotation Mark
   ’ U+2019    Pf-FinalQuotePunctuation   Right Single Quotation Mark
   ‚ U+201A    Ps-OpenPunctuation         Single Low-9 Quotation Mark
   ‛ U+201B    Pi-InitialQuotePunctuation Single High-Reversed-9 Quotation Mark
   “ U+201C    Pi-InitialQuotePunctuation Left Double Quotation Mark
   ” U+201D    Pf-FinalQuotePunctuation   Right Double Quotation Mark
   „ U+201E    Ps-OpenPunctuation         Double Low-9 Quotation Mark
   ‟ U+201F    Pi-InitialQuotePunctuation Double High-Reversed-9 Quotation Mark
   ‹ U+2039    Pi-InitialQuotePunctuation Single Left-Pointing Angle Quotation Mark
   › U+203A    Pf-FinalQuotePunctuation   Single Right-Pointing Angle Quotation Mark
   ❮ U+276E    Ps-OpenPunctuation         Heavy Left-Pointing Angle Quotation Mark Ornament
   ❯ U+276F    Pe-ClosePunctuation        Heavy Right-Pointing Angle Quotation Mark Ornament
   ⹂ U+2E42    Ps-OpenPunctuation         Undefined
   〝 U+301D    Ps-OpenPunctuation         Reversed Double Prime Quotation Mark
   〞 U+301E    Pe-ClosePunctuation        Double Prime Quotation Mark
   〟 U+301F    Pe-ClosePunctuation        Low Double Prime Quotation Mark
   ＂ U+FF02    Po-OtherPunctuation        Fullwidth Quotation Mark
   ＇ U+FF07    Po-OtherPunctuation        Fullwidth Apostrophe
+0

Hi, thanks for such an incredibly detailed response; this seems to be on the right track. I ran the following line --- powershell -c "(gc 'Client_XXX_List_%date%.csv').Replace($([char]0x201B), '')" --- but I get the following response: https://postimg.org/image/u0h4rjruh/ --- any idea why? Thanks! – meeilz

+0

@meeilz Answer updated. Sorry, I tested `.Replace($([char]0x201B), '#')`, i.e. replacing with another _character_. It now works for replacing with an empty _string_ as well. – JosefZ

+0

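Thanks :) It now gets past the initial stage without errors, which is great. It currently searches the whole document, which looks fine, but it doesn't find the character. I narrowed it down: it is actually the "right single quotation mark", not the reversed one. I have changed the code to 0x2019, which I believe is correct, but it still doesn't find/replace any of the quotes in my csv. Very unusual! Many thanks for your help :) - `powershell -c "(gc 'Client_List_05012017.csv').Replace($([char]0x2019), [string]'A')"` – meeilz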

0

I think this is a weird backtick character. At least that is how it behaves.

If I do this:

$text = "Weird ’ Normal ' Backtick ` Weird ’ " 
$text.Replace("’","") 

it gives me this:

Weird Normal ' Backtick Weird 

So does this work?

powershell -Command "(gc 'C:\LOCATION\Client_List_%DDMMYYYY%.csv').replace('’’', '') | 
Out-File 'C:\LOCATION\Client_List_%DDMMYYYY%.csv'" 

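Doubling the normal backtick makes the script treat the character literally. Doubling the weird apostrophe seems to do the same thing; at least it worked in my tests.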
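
Whether doubling works depends on the quote characters surviving the batch file's encoding intact. One way to sidestep literal quotes altogether is a regular expression that names the code points instead. This is only a sketch under that assumption; the character range and the sample text are illustrative, not taken from the CSV in question.

```powershell
# .NET regex character class covering U+2018..U+201B, the four typographic
# single quotes PowerShell also accepts as string delimiters.
$typographic = '[\u2018-\u201B]'

# Hypothetical sample text built from code points (ASCII-only script).
$sample = 'a' + [char]0x2018 + 'b' + [char]0x2019 + 'c' + [char]0x201B + 'd'

# Replace each of them with a normal apostrophe (U+0027).
$clean = $sample -replace $typographic, "'"
$clean                                 # a'b'c'd
```

Using an empty string as the replacement instead of "'" removes the characters, which is what the one-liner above attempts.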

+0

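Thanks for your reply! Unfortunately, I am not getting any different result. It still processes and says "Done" (my echo at the end), but when I check the csv for the odd apostrophe, it is still there on several lines. Any other ideas? – meeilz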

+0

Can you give me some sample data from the CSV? – Nick

+2

This is a typographic single quote, not a backtick. –