批處理文件腳本，用於從文件名中刪除特殊字符（Windows）

我有一大組文件，其中一些包含文件名中的特殊字符（例如，ä，ö，％和其他）。我想要一個腳本文件來遍歷這些文件，並將它們重命名爲刪除特殊字符。我並不介意它的功能，但它可以用下劃線代替它們，例如，批處理文件腳本，用於從文件名中刪除特殊字符（Windows）

Störung％20.doc將被重新命名爲St_rung_20.doc

按優先順序排列：

一個Windiws批處理文件
Windows腳本文件cscript的（VBS）運行
可以從命令行運行的第三方軟件（即無需用戶交互）
另一種語言腳本文件，爲此我必須安裝additi onal腳本引擎

背景：我試圖在Windows上使用GnuPG加密這些文件，但它似乎沒有使用--encrypt-files選項處理文件名中的特殊字符。

來源

2008-11-04 njr101

感謝Tomalak誰實際上指出我在正確的方向。以爲我會在這裏發佈完整性。

問題似乎是，GPG使用的代碼頁是固定的（拉丁語I），與控制檯中配置的代碼頁無關。但一旦他指出了這一點，我想出瞭如何解決這個問題。

訣竅是在生成文件列表之前更改代碼頁。這實際上會使文件列表在控制檯中查看時顯示爲不正確。但是，當傳遞給GPG時，它工作正常。 GPG接受文件並用正確的文件名吐出加密文件。

批處理文件看起來是這樣的：

chcp 1252 
dir /b /s /a-d MyFolder >filelist.txt 
gpg -r [email protected] --encrypt-files <filelist.txt

來源

2008-11-06 14:14:01 njr101

這並不回答這個問題。這個問題問了一個批處理文件，這個問題明顯地解決了askers的問題，（因爲你是提問者），但你不應該選擇這個作爲正確的答案。 – 2010-03-17 16:26:55

在將文件名提供給gnupg之前，您是否嘗試將cmd.exe設置爲其他代碼頁？發行chcp 65001將cmd.exe預先設置爲Unicode，然後重試。

如果失敗，下面的VBScript會做到這一點：

Option Explicit 

Dim fso: Set fso = CreateObject("Scripting.FileSystemObject") 
Dim invalidChars: Set invalidChars = New RegExp 

' put all characters that you want to strip inside the brackets 
invalidChars.Pattern = "[äöüß&%]" 
invalidChars.IgnoreCase = True 
invalidChars.Global = True 

If WScript.Arguments.Unnamed.Count = 0 Then 
    WScript.Echo "Please give folder name as argument 1." 
    WScript.Quit 1 
End If 

Recurse fso.GetFolder(WScript.Arguments.Unnamed(0)) 

Sub Recurse(f) 
    Dim item 

    For Each item In f.SubFolders 
    Recurse item 
    Sanitize item 
    Next 
    For Each item In f.Files 
    Sanitize item 
    Next 
End Sub 

Sub Sanitize(folderOrFile) 
    Dim newName: newName = invalidChars.Replace(folderOrFile.Name, "_") 
    If folderOrFile.Name = newName Then Exit Sub 
    WScript.Echo folderOrFile.Name, " -> ", newName 
    folderOrFile.Name = newName 
End Sub

調用它像這樣：

cscript replace.vbs "c:\path\to\my\files"

您也可以將&在Windows資源管理器的文件夾拖放到它。

來源

2008-11-04 11:13:47 Tomalak

從http://www.robvanderwoude.com/bht.html：

使用NT的SET的字符串替換來替換或字符串中的任何位置刪除字符：

SET STRING=[ABCDEFG] 
SET STRING=%STRING:[=% 
SET STRING=%STRING:]=% 
ECHO String: %STRING% 

will display 
String: ABCDEFG 

SET STRING=[ABCDEFG] 
SET STRING=%STRING:[=(% 
SET STRING=%STRING:]=)% 
ECHO String: %STRING% 

will display 
String: (ABCDEFG) 

SET STRING=[ABCDEFG] 
SET STRING=%STRING:~1,7% 
ECHO String: %STRING% 

will display 
String: ABCDEFG

如果使用這樣的嘗試，你將不得不處理要替換的每個字符（例如Ä，Ö，Ü，ä，ö，ü，ß，還有á，à，é，è...）。

來源

2008-11-04 11:14:02 Treb

繼「RenameFilesWithAccentedAndDiacriticalLatinChars。PL」 PERL腳本文件重命名重音和變音拉丁字符：

這PERL腳本開始從參數給出的文件夾，否則，從當前文件夾。
遞歸搜索字符屬於80 - FF CP 1250，CP 1252，CP 1254和CP 1257（主要是重音拉丁字符）或帶有變音標記的拉丁字符的文件。
它僅通過從拉丁字符中刪除重音符號和變音符號來計算新文件名（例如，Été - > Ete）。
它顯示所有提出的重命名和可能的衝突，並要求用戶全球批准。
如果用戶已經批准，它將重命名所有沒有衝突的文件。

選項'--batch'避免了交互式問題。小心使用。

選項' - '避免將下一個參數解釋爲選項。

特別警告：

這個腳本最初編碼UTF-8，而應保持如此。
此腳本可能會重命名大量文件。
文件名在理論上都只用UTF-8編碼。但是可能會發現一些文件名稱也包含一些具有傳統編碼的字符。
作者已經應用了一致性檢查，魯棒性，衝突檢測和使用適當的編碼的努力。因此，此腳本應僅通過從拉丁字符中刪除重音符號和變音符號來重命名文件。
但是，此腳本僅在有限數量的OS （Windows，Mac OS X，Linux）和有限數量的終端編碼（CP 850，ISO-8859-1，UTF-8）下進行了測試。
所以，在奇怪的情況下，這個腳本可以用隨機名稱重命名許多文件。
因此，該腳本應該小心使用，並與極端護理改性（提防的內部串，輸入，輸出和命令編碼）

#!/usr/bin/perl -w 
#============================================================================= 
# 
# Copyright 2010 Etienne URBAH 
# 
# This program is free software; you can redistribute it and/or modify 
# it under the terms of the GNU General Public License as published by 
# the Free Software Foundation; either version 2 of the License, or 
# (at your option) any later version. 
# 
# This program is distributed in the hope that it will be useful, 
# but WITHOUT ANY WARRANTY; without even the implied warranty of 
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 
# GNU General Public License for more details at 
# http://www.gnu.org/licenses/gpl.html 
# 
# For usage and SPECIAL WARNING, see the 'Help' section below. 
# 
#============================================================================= 
use 5.008_000; # For correct Unicode support 
use warnings; 
use strict; 
use Encode; 

$| = 1;   # Autoflush STDOUT 

#----------------------------------------------------------------------------- 
# Function ucRemoveEolUnderscoreDash : 
# Set Uppercase, remove End of line, Underscores and Dashes 
#----------------------------------------------------------------------------- 
sub ucRemoveEolUnderscoreDash 
{ 
    local $_ = uc($_[0]); 
    chomp; 
    tr/_\-//d; 
    $_; 
} 

#----------------------------------------------------------------------------- 
# Constants 
#----------------------------------------------------------------------------- 
my $Encoding_Western = 'ISO-8859-1'; 
my $Encoding_Central = 'ISO-8859-2'; 
my $Encoding_Baltic = 'ISO-8859-4'; 
my $Encoding_Turkish = 'ISO-8859-9'; 
my $Encoding_W_Euro = 'ISO-8859-15'; 
my $Code_Page_OldWest = 850; 
my $Code_Page_Central = 1250; 
my $Code_Page_Western = 1252; 
my $Code_Page_Turkish = 1254; 
my $Code_Page_Baltic = 1257; 
my $Code_Page_UTF8 = 65001; 

my $HighBitSetChars = pack('C*', 0x80..0xFF); 

my %SuperEncodings = 
    (&ucRemoveEolUnderscoreDash($Encoding_Western), 'cp'.$Code_Page_Western, 
    &ucRemoveEolUnderscoreDash($Encoding_Central), 'cp'.$Code_Page_Central, 
    &ucRemoveEolUnderscoreDash($Encoding_Baltic), 'cp'.$Code_Page_Baltic, 
    &ucRemoveEolUnderscoreDash($Encoding_Turkish), 'cp'.$Code_Page_Turkish, 
    &ucRemoveEolUnderscoreDash($Encoding_W_Euro), 'cp'.$Code_Page_Western, 
    &ucRemoveEolUnderscoreDash('cp'.$Code_Page_OldWest), 
                'cp'.$Code_Page_Western); 

my %EncodingNames  = ('cp'.$Code_Page_Central, 'Central European', 
          'cp'.$Code_Page_Western, 'Western European', 
          'cp'.$Code_Page_Turkish, ' Turkish  ', 
          'cp'.$Code_Page_Baltic, '  Baltic  '); 

my %NonAccenChars  = ( 
          #--------------------------------# 
'cp'.$Code_Page_Central, # Central European (cp1250) # 
          #--------------------------------# 
          #€_‚_„…†‡_‰Š‹ŚŤŽŹ_‘’「」•–—_™š›śťžź# 
          'E_,_,.++_%S_STZZ_````.--_Ts_stzz'. 

          # ˇ˘Ł¤Ą¦§¨©Ş«¬®Ż°±˛ł´µ¶·¸ąş»Ľ˝ľż# 
          '_``LoAlS`CS_--RZ`+,l`uP.,as_L~lz'. 

          #ŔÁÂĂÄĹĆÇČÉĘËĚÍÎĎĐŃŇÓÔŐÖ×ŘŮÚŰÜÝŢß# 
          'RAAAALCCCEEEEIIDDNNOOOOxRUUUUYTS'. 

          #ŕáâăäĺćçčéęëěíîďđńňóôőö÷řůúűüýţ˙# 
          'raaaalccceeeeiiddnnoooo%ruuuuyt`', 

          #--------------------------------# 
'cp'.$Code_Page_Western, # Western European (cp1252) # 
          #--------------------------------# 
          #€_‚ƒ„…†‡ˆ‰Š‹Œ_Ž__‘’「」•–—˜™š›œ_žŸ# 
          'E_,f,.++^%S_O_Z__````.--~Ts_o_zY'. 

          # ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿# 
          '_!cLoYlS`Ca_--R-`+23`uP.,10_qh3_'. 

          #ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß# 
          'AAAAAAACEEEEIIIIDNOOOOOxOUUUUYTS'. 

          #àáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ# 
          'aaaaaaaceeeeiiiidnooooo%ouuuuyty', 

          #--------------------------------# 
'cp'.$Code_Page_Turkish, #  Turkish (cp1254)  # 
          #--------------------------------# 
          #€_‚ƒ„…†‡ˆ‰Š‹Œ____‘’「」•–—˜™š›œ__Ÿ# 
          'E_,f,.++^%S_O____````.--~Ts_o__Y'. 

          # ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿# 
          '_!cLoYlS`Ca_--R-`+23`uP.,10_qh3_'. 

          #ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏĞÑÒÓÔÕÖ×ØÙÚÛÜİŞß# 
          'AAAAAAACEEEEIIIIGNOOOOOxOUUUUISS'. 

          #àáâãäåæçèéêëìíîïğñòóôõö÷øùúûüışÿ# 
          'aaaaaaaceeeeiiiignooooo%ouuuuisy', 

          #--------------------------------# 
'cp'.$Code_Page_Baltic, #  Baltic (cp1257)  # 
          #--------------------------------# 
          #€_‚_„…†‡_‰_‹_¨ˇ¸_‘’「」•–—_™_›_¯˛_# 
          'E_,_,.++_%___``,_````.--_T___-,_'. 

          # �¢£¤�¦§Ø©Ŗ«¬®Æ°±²³´µ¶·ø¹ŗ»¼½¾æ# 
          '__cLo_lSOCR_--RA`+23`uP.o1r_qh3a'. 

          #ĄĮĀĆÄÅĘĒČÉŹĖĢĶĪĻŠŃŅÓŌÕÖ×ŲŁŚŪÜŻŽß# 
          'AIACAAEECEZEGKILSNNOOOOxULSUUZZS'. 

          #ąįāćäåęēčéźėģķīļšńņóōõö÷ųłśūüżž˙# 
          'aiacaaeecezegkilsnnoooo%ulsuuzz`'); 

my %AccentedChars; 
my $AccentedChars  = ''; 
my $NonAccenChars  = ''; 
for ($Code_Page_Central, $Code_Page_Western, 
     $Code_Page_Turkish, $Code_Page_Baltic) 
    { 
     $AccentedChars{'cp'.$_} = decode('cp'.$_, $HighBitSetChars); 
     $AccentedChars   .= $AccentedChars{'cp'.$_}; 
     $NonAccenChars   .= $NonAccenChars{'cp'.$_}; 
    } 
#print "\n", length($NonAccenChars), ' ', $NonAccenChars,"\n"; 
#print "\n", length($AccentedChars), ' ', $AccentedChars,"\n"; 

my $QuotedMetaNonAccenChars = quotemeta($NonAccenChars); 

my $DiacriticalChars = ''; 
for (0x0300..0x036F, 0x1DC0..0x1DFF) 
    { $DiacriticalChars .= chr($_) } 

#----------------------------------------------------------------------------- 
# Parse options and parameters 
#----------------------------------------------------------------------------- 
my $b_Help  = 0; 
my $b_Interactive = 1; 
my $b_UTF8  = 0; 
my $b_Parameter = 0; 
my $Folder; 

for (@ARGV) 
{ 
    if (lc($_) eq '--') 
     { $b_Parameter = 1 } 
    elsif ((not $b_Parameter) and (lc($_) eq '--batch')) 
     { $b_Interactive = 0 } 
    elsif ((not $b_Parameter) and (lc($_) eq '--utf8')) 
     { $b_UTF8 = 1 } 
    elsif ($b_Parameter or (substr($_, 0, 1) ne '-')) 
     { 
      if (defined($Folder)) 
       { die "$0 accepts only 1 parameter\n" } 
      else 
       { $Folder = $_ } 
     } 
    else 
     { $b_Help = 1 } 
} 

#----------------------------------------------------------------------------- 
# Help 
#----------------------------------------------------------------------------- 
if ($b_Help) 
    { 
     die << "END_OF_HELP" 

$0 [--help] [--batch] [--] [folder] 

This script renames files with accented and diacritical Latin characters : 

- This PERL script starts from the folder given in parameter, or else from 
    the current folder. 
- It recursively searches for files with characters belonging to 80 - FF of 
    CP 1250, CP 1252, CP 1254 and CP 1257 (mostly accented Latin characters) 
    or Latin characters having diacritical marks. 
- It calculates new file names by removing the accents and diacritical marks 
    only from Latin characters (For example, Été --> Ete). 
- It displays all proposed renaming and perhaps conflicts, and asks the user 
    for global approval. 
- If the user has approved, it renames all files having no conflict. 

Option '--batch' avoids interactive questions. Use with care. 

Option '--'  avoids the next parameter to be interpreted as option. 

SPECIAL WARNING : 
- This script was originally encoded in UTF-8, and should stay so. 
- This script may rename a lot of files. 
- Files names are theoretically all encoded only with UTF-8. But some file 
    names may be found to contain also some characters having legacy encoding. 
- The author has applied efforts for consistency checks, robustness, conflict 
    detection and use of appropriate encoding. 
    So this script should only rename files by removing accents and diacritical 
    marks from Latin characters. 
- But this script has been tested only under a limited number of OS 
    (Windows, Mac OS X, Linux) and a limited number of terminal encodings 
    (CP 850, ISO-8859-1, UTF-8). 
- So, under weird circumstances, this script could rename many files with 
    random names. 
- Therefore, this script should be used with care, and modified with extreme 
    care (beware encoding of internal strings, inputs, outputs and commands) 
END_OF_HELP 
    } 

#----------------------------------------------------------------------------- 
# If requested, change current folder 
#----------------------------------------------------------------------------- 
if (defined($Folder)) 
    { chdir($Folder) or die "Can NOT set '$Folder' as current folder\n" } 

#----------------------------------------------------------------------------- 
# Following instruction is MANDATORY. 
# The return value should be non-zero, but on some systems it is zero. 
#----------------------------------------------------------------------------- 
utf8::decode($AccentedChars); 
# or die "$0: '\$AccentedChars' should be UTF-8 but is NOT.\n"; 

#----------------------------------------------------------------------------- 
# Check consistency on 'tr' 
#----------------------------------------------------------------------------- 
$_ = $AccentedChars; 
eval "tr/$AccentedChars/$QuotedMetaNonAccenChars/"; 
if ([email protected]) { warn [email protected] } 
if ([email protected] or ($_ ne $NonAccenChars)) 
    { die "$0: Consistency check on 'tr' FAILED :\n\n", 
      "Translated Accented Chars : ", length($_), ' : ', $_, "\n\n", 
      "  Non Accented Chars : ", length($NonAccenChars), ' : ', 
      $NonAccenChars, "\n" } 

#----------------------------------------------------------------------------- 
# Constants depending on the OS 
#----------------------------------------------------------------------------- 
my $b_Windows = (defined($ENV{'OS'}) and ($ENV{'OS'} eq 'Windows_NT')); 

my ($Q, $sep, $sep2, $HOME, $Find, @List, $cwd, @Move); 

if ($b_Windows) 
    { 
     $Q = '"'; 
     $sep = '\\'; 
     $sep2 = '\\\\'; 
     $HOME = $ENV{'USERPROFILE'}; 
     $Find = 'dir /b /s'; 
     @List = (((`ver 2>&1` =~ m/version\s+([0-9]+)/i) and ($1 >= 6)) ? 
       ('icacls') : 
       ('cacls')); 
     $cwd = `cd`; chomp $cwd; $cwd = quotemeta($cwd); 
     @Move = ('move'); 
    } 
else 
    { 
     $Q = "'"; 
     $sep = '/'; 
     $sep2 = '/'; 
     $HOME = $ENV{'HOME'}; 
     $Find = 'find .'; 
     @List = ('ls', '-d', '--'); 
     @Move = ('mv', '--'); 
     if (-w '/bin') { die "$0: For safety reasons, ", 
           "usage is BLOCKED to administrators.\n"} 
    } 

my $Encoding; 
my $ucEncoding; 
my $InputPipe = '-|';        # Used as global variable 

#----------------------------------------------------------------------------- 
# Under Windows, associate input and output encodings to code pages : 
# - Get the original code page, 
# - If it is not UTF-8, try to set it to UTF-8, 
# - Define the input encoding as the one associated to the ACTIVE code page, 
# - If STDOUT is the console, encode output for the ORIGINAL code page. 
#----------------------------------------------------------------------------- 
my $Code_Page_Original; 
my $Code_Page_Active; 

if ($b_Windows) 
    { 
     #----------------------------------------------------------------------- 
     # Get the original code page 
     #----------------------------------------------------------------------- 
     $_ = `chcp`; 
     m/([0-9]+)$/ or die "Non numeric Windows code page : ", $_; 
     $Code_Page_Original = $1; 
     print 'Windows Original Code Page = ', $Code_Page_Original, 
      ($Code_Page_Original == $Code_Page_UTF8 ? 
       ' = UTF-8, display is perhaps correct with a true type font.' : 
       ''), "\n\n"; 
     $Code_Page_Active = $Code_Page_Original ; 

     #----------------------------------------------------------------------- 
     # The input encoding must be the same as the ACTIVE code page 
     #----------------------------------------------------------------------- 
     $Encoding = ($Code_Page_Active == $Code_Page_UTF8 ? 
        'utf8' : 
        'cp'.$Code_Page_Active) ; 
     $InputPipe .= ":encoding($Encoding)"; 
     print "InputPipe = '$InputPipe'\n\n"; 

     #----------------------------------------------------------------------- 
     # If STDOUT is the console, output encoding must be the same as the 
     # ORIGINAL code page 
     #----------------------------------------------------------------------- 
     if ($Code_Page_Original != $Code_Page_UTF8) 
      { 
      no warnings 'unopened'; 
      @_ = stat(STDOUT); 
      use warnings; 
      if (scalar(@_) and ($_[0] == 1)) 
       { binmode(STDOUT, ":encoding(cp$Code_Page_Original)") } 
      else 
       { binmode(STDOUT, ":encoding($Encoding)") } 
      } 
    } 

#----------------------------------------------------------------------------- 
# Under *nix, if the 'LANG' environment variable contains an encoding, 
# verify that this encoding is supported by the OS and by PERL. 
#----------------------------------------------------------------------------- 
elsif (defined($ENV{'LANG'}) and ($ENV{'LANG'} =~ m/\.([^\@]+)$/i)) 
     { 
     $Encoding = $1; 

     my $Kernel = `uname -s`; 
     chomp $Kernel; 
     my $ucEncoding = &ucRemoveEolUnderscoreDash($Encoding); 
     if ((lc($Kernel) ne 'darwin') and not grep {$_ eq $ucEncoding} 
         (map { ($_, &ucRemoveEolUnderscoreDash($_)) } 
         `locale -m`)) 
      { die "Encoding = '$Encoding' or '$ucEncoding' NOT supported ". 
        "by the OS\n" } 

     my $ucLocale = &ucRemoveEolUnderscoreDash($ENV{'LANG'}); 
     if (not grep {$_ eq $ucLocale} 
         (map { ($_, &ucRemoveEolUnderscoreDash($_)) } 
         `locale -a`)) 
      { die "Locale = '$ENV{LANG}' or '$ucLocale' NOT supported ". 
        "by the OS\n" } 

     if (not defined(Encode::find_encoding($Encoding))) 
      { die "Encoding = '$Encoding' or '$ucEncoding' NOT supported ". 
        "by PERL\n" } 

     print "Encoding = '$Encoding' is supported by the OS and PERL\n\n"; 
     binmode(STDOUT, ":encoding($Encoding)"); 
     } 

#----------------------------------------------------------------------------- 
# Check consistency between parameter of 'echo' and output of 'echo' 
#----------------------------------------------------------------------------- 
undef $_; 
if (defined($Encoding) ) 
    { 
     $ucEncoding = &ucRemoveEolUnderscoreDash($Encoding); 
     if (defined($SuperEncodings{$ucEncoding})) 
      { $_ = substr($AccentedChars{$SuperEncodings{$ucEncoding}}, 
          0x20, 0x60) } 
     elsif (defined($AccentedChars{$Encoding})) 
      { $_ = $AccentedChars{$Encoding} } 
     elsif ($Encoding =~ m/^utf-?8$/i) 
      { $_ = $AccentedChars } 
    } 
if (not defined($_))    # Chosen chars are same in 4 code pages 
    { $_ = decode('cp'.$Code_Page_Central, 
        pack('C*', 0xC9, 0xD3, 0xD7, 0xDC,     # ÉÓ×Ü 
          0xE9, 0xF3, 0xF7, 0xFC)) }     # éó÷ü 
#print $_, " (Parameter)\n\n"; 
#system 'echo', $_; 
utf8::decode($_); 
#print "\n", $_, " (Parameter after utf8::decode)\n\n"; 
my @EchoCommand = ($b_Windows ? 
        "echo $_" : 
        ('echo', $_)); 
#system @EchoCommand; 

open(ECHO, $InputPipe, @EchoCommand) or die 'echo $_: ', $!; 
my $Output = join('', <ECHO>); 
close(ECHO); 
chomp $Output; 
#print "\n", $Output, " (Output of 'echo')\n"; 
utf8::decode($Output); 
#print "\n", $Output, " (Output of 'echo' after utf8::decode)\n\n"; 

if ($Output ne $_) 
    { 
     warn "$0: Consistency check between parameter ", 
         "of 'echo' and output of 'echo' FAILED :\n\n", 
      "Parameter of 'echo' : ", length($_), ' : ', $_, "\n\n", 
      " Output of 'echo' : ", length($Output), ' : ', $Output, "\n"; 
     exit 1; 
    } 

#----------------------------------------------------------------------------- 
# Print the translation table 
#----------------------------------------------------------------------------- 
if (defined($Encoding)) 
{ 
    undef $_; 
    $ucEncoding = &ucRemoveEolUnderscoreDash($Encoding); 
    if (defined($SuperEncodings{$ucEncoding})) 
     { 
      $_ = $SuperEncodings{$ucEncoding}; 
      print "--------- $EncodingNames{$_} ---------\n", 
       ' ', substr($AccentedChars{$_}, 0x20, 0x20), "\n", 
       '--> ', substr($NonAccenChars{$_}, 0x20, 0x20), "\n\n", 
       ' ', substr($AccentedChars{$_}, 0x40, 0x20), "\n", 
       '--> ', substr($NonAccenChars{$_}, 0x40, 0x20), "\n\n", 
       ' ', substr($AccentedChars{$_}, 0x60, 0x20), "\n", 
       '--> ', substr($NonAccenChars{$_}, 0x60, 0x20), "\n\n" } 
    else 
    { 
    for ('cp'.$Code_Page_Central, 'cp'.$Code_Page_Western, 
      'cp'.$Code_Page_Turkish, 'cp'.$Code_Page_Baltic) 
    { 
     if (('cp'.$Encoding eq $_) or ($Encoding =~ m/^utf-?8$/i)) 
      { print "--------- $EncodingNames{$_} ---------\n", 
        ' ', substr($AccentedChars{$_}, 0, 0x20), "\n", 
        '--> ', substr($NonAccenChars{$_}, 0, 0x20), "\n\n", 
        ' ', substr($AccentedChars{$_}, 0x20, 0x20), "\n", 
        '--> ', substr($NonAccenChars{$_}, 0x20, 0x20), "\n\n", 
        ' ', substr($AccentedChars{$_}, 0x40, 0x20), "\n", 
        '--> ', substr($NonAccenChars{$_}, 0x40, 0x20), "\n\n", 
        ' ', substr($AccentedChars{$_}, 0x60, 0x20), "\n", 
        '--> ', substr($NonAccenChars{$_}, 0x60, 0x20), "\n\n" } 
    } 
    } 
} 

#----------------------------------------------------------------------------- 
# Completely optional : 
# Inside the Unison file, find the accented file names to ignore 
#----------------------------------------------------------------------------- 
my $UnisonFile = $HOME.$sep.'.unison'.$sep.'common.unison'; 
my @Ignores; 

if (open(UnisonFile, '<', $UnisonFile)) 
    { 
     print "\nUnison File '", $UnisonFile, "'\n"; 
     while (<UnisonFile>) 
     { 
     if (m/^\s*ignore\s*=\s*Name\s*(.+)/) 
      { 
       $_ = $1 ; 
       if (m/[$AccentedChars]/) 
        { push(@Ignores, $_) } 
      } 
     } 
     close(UnisonFile); 
    } 
print map(" Ignore: ".$_."\n", @Ignores); 

#----------------------------------------------------------------------------- 
# Function OutputAndErrorFromCommand : 
# 
# Execute the command given as array in parameter, and return STDOUT + STDERR 
# 
# Reads global variable $InputPipe 
#----------------------------------------------------------------------------- 
sub OutputAndErrorFromCommand 
{ 
    local $_; 
    my @Command = @_;    # Protects content of @_ from any modification 
    #--------------------------------------------------------------------------- 
    # Under Windows, fork fails, so : 
    # - Enclose into double quotes parameters containing blanks or simple 
    #  quotes, 
    # - Use piped open with redirection of STDERR. 
    #--------------------------------------------------------------------------- 
    if (defined($ENV{'OS'}) and ($ENV{'OS'} eq 'Windows_NT')) 
     { 
     for (@Command) 
      { s/^((-|.*(\s|')).*)$/$Q$1$Q/ } 
     my $Command = join(' ', @Command); 
     #print "\n", $Command; 
     open(COMMAND, $InputPipe, "$Command 2>&1") or die '$Command: ', $!; 
     } 
    #--------------------------------------------------------------------------- 
    # Under Unix, quoting is too difficult, but fork succeeds 
    #--------------------------------------------------------------------------- 
    else 
     { 
     my $pid = open(COMMAND, $InputPipe); 
     defined($pid) or die "Can't fork: $!"; 
     if ($pid == 0)       # Child process 
      { 
       open STDERR, '>&=STDOUT'; 
       exec @Command;      # Returns only on failure 
       die "Can't @Command"; 
      } 
     } 
    $_ = join('', <COMMAND>);       # Child's STDOUT + STDERR 
    close COMMAND; 
    chomp; 
    utf8::decode($_); 
    $_; 
} 

#----------------------------------------------------------------------------- 
# Find recursively all files inside the current folder. 
# Verify accessibility of files with accented names. 
# Calculate non-accented file names from accented file names. 
# Build the list of duplicates. 
#----------------------------------------------------------------------------- 
my %Olds;        # $Olds{$New} = [ $Old1, $Old2, ... ] 
my $Old; 
my $Dir; 
my $Command; 
my $ErrorMessage; 
my $New; 
my %News; 

print "\n\nFiles with accented name and the corresponding non-accented name ", 
     ":\n"; 

open(FIND, $InputPipe, $Find) or die $Find, ': ', $!; 

FILE: 
while (<FIND>) 
{ 
    chomp; 
    #--------------------------------------------------------------------------- 
    # If the file path contains UTF-8, following instruction is MANDATORY. 
    # If the file path does NOT contain UTF-8, it should NOT hurt. 
    #--------------------------------------------------------------------------- 
    utf8::decode($_); 

    if ($b_Windows) 
     { s/^$cwd$sep2// } 
    else 
     { s/^\.$sep2// } 

    #--------------------------------------------------------------------------- 
    # From now on : $_ = Dir/OldFilename 
    #--------------------------------------------------------------------------- 
    push(@{$Olds{$_}}, $_); 

    if (m/([^$sep2]+)$/ and 
     ($1 =~ m/[$AccentedChars]|([\ -\~][$DiacriticalChars])/)) 
     { 
     if ($b_Windows and m/$Q/) 
      { 
       print "\n $Q$_$Q\n*** contains quotes.\n"; 
       next; 
      } 
     for my $Ignore (@Ignores) 
     { 
      if (m/$Ignore$/) 
       { next FILE } 
     } 
     $Old = $_ ; 
     m/^(.*$sep2)?([^$sep2]+)$/; 
     $Dir = (defined($1) ? $1 : ''); 
     $_ = $2; 

     #--------------------------------------------------------------------- 
     # From now on : $Old = Dir/OldFilename 
     #     $_ = OldFilename 
     #--------------------------------------------------------------------- 
     print "\n $Q$Old$Q\n"; 
     $ErrorMessage = &OutputAndErrorFromCommand(@List, $Old); 
     if ($? != 0) 
      { print "*** $ErrorMessage\n" } 
     else 
      { 
       #--------------------------------------------------------------- 
       # Change accented Latin chars to non-accented chars. 
       # Remove all diacritical marks after Latin chars. 
       #--------------------------------------------------------------- 
       eval "tr/$AccentedChars/$QuotedMetaNonAccenChars/"; 
       s/([\ -\~])[$DiacriticalChars]+/$1/g; 
       #--------------------------------------------------------------- 
       # From now on : $Old = Dir/OldFilename 
       #     $_ = NewFilename 
       #--------------------------------------------------------------- 
       if ([email protected]) 
        { warn [email protected] } 
       else 
        { 
        $New = $Dir.$_; 
        if ($b_Windows or (not utf8::is_utf8($Dir))) # Weird 
         { utf8::decode($New) }     # but necessary 
        $News{$Old} = $New; 
        push(@{$Olds{$New}}, $Old); 
        } 
       print "--> $Q$Dir$_$Q\n"; 
      } 
     } 
} 

close(FIND); 

#----------------------------------------------------------------------------- 
# Print list of duplicate non-accented file names 
#----------------------------------------------------------------------------- 
my $b_NoDuplicate = 1; 

for my $New (sort keys %Olds) 
{ 
    if (scalar(@{$Olds{$New}}) > 1) 
     { 
     if ($b_NoDuplicate) 
      { 
       print "\n\nFollowing files would have same non-accented name ", 
        ":\n"; 
       $b_NoDuplicate = 0; 
      } 
     print "\n", map(' '.$_."\n", @{$Olds{$New}}), '--> ', $New, "\n"; 
     for (@{$Olds{$New}}) 
      { delete $News{$_} }; 
     } 
} 

#----------------------------------------------------------------------------- 
# If there are NO file to rename, then exit 
#----------------------------------------------------------------------------- 
my $Number = scalar(keys %News); 

print "\n\n"; 
if ($Number < 1) 
    { 
     print "There are NO file to rename\n"; 
     exit; 
    } 

#----------------------------------------------------------------------------- 
# Ask the user for global approval of renaming 
#----------------------------------------------------------------------------- 
if ($b_Interactive) 
    { 
     print "In order to really rename the ", $Number, 
      " files which can safely be renamed, type 'rename' : "; 
     $_ = <STDIN>; 
     sleep 1;      # Gives time to PERL to handle interrupts 
     if (not m/^rename$/i) 
      { exit 1 } 
    } 
else 
    { print $Number, " files will be renamed\n\n" } 

#----------------------------------------------------------------------------- 
# Rename accented file names sorted descending by name size 
#----------------------------------------------------------------------------- 
$Number = 0; 
my $Move = join(' ', @Move); 

for (sort {length($b) <=> length($a)} keys %News) 
{ 
    $ErrorMessage = &OutputAndErrorFromCommand(@Move, $_, $News{$_}); 
    if ($? == 0) 
     { $Number++ } 
    else 
     { print "\n$Move $Q$_$Q\n", (' ' x length($Move)), 
     " $Q$News{$_}$Q\n", ('*' x length($Move)), " $ErrorMessage\n" } 
} 
print "\n$Number files have been successfully renamed\n"; 

__END__

來源

2010-03-11 21:38:39

用戶'nobi'是正確的：我的$ AccentedChars字符串仍然存在問題。我很高興我對'tr'的一致性檢查允許我的腳本在執行可怕的重命名之前停止。我沒有安全地上傳原始PERL腳本的地方，所以我修正了上面的版本，並且我成功地進行了測試。因此，我希望以上版本現在是好的。請測試它。 – 2010-03-16 01:09:36

我使用這個批次重命名的文件夾似乎到目前爲止工作良好... 在我的情況下，代碼頁是1252，你的可能會不同。

mode con codepage select=1252 
@echo off 
Setlocal enabledelayedexpansion 
::folder only (/D option) 
for /R /D %%d in (*) do (

set an=%%~nd 
set bn=!an:.=_! 
set cn=!bn:-=_! 
set dn=!cn: =_! 
set en=!dn:Á=A! 
set fn=!en:É=E! 
set gn=!fn:Í=I! 
set hn=!gn:Ó=O! 
set in=!hn:Ú=U! 
set jn=!in:Ü=U! 
set kn=!jn:á=a! 
set ln=!kn:é=e! 
set mn=!ln:í=i! 
set nn=!mn:ó=o! 
set on=!nn:ú=u! 
set pn=!on:ü=u! 
set qn=!pn:Ñ=N! 
set zn=!on:ñ=n! 

set ax=%%~xd 
set bx=!ax:.=_! 
set cx=!bx:-=_! 
set dx=!cx: =_! 
set bx=!ax:.=_! 
set cx=!bx:-=_! 
set dx=!cx: =_! 
set ex=!dx:Á=A! 
set fx=!ex:É=E! 
set gx=!fx:Í=I! 
set hx=!gx:Ó=O! 
set ix=!hx:Ú=U! 
set jx=!ix:Ü=U! 
set kx=!jx:á=a! 
set lx=!kx:é=e! 
set mx=!lx:í=i! 
set nx=!mx:ó=o! 
set ox=!nx:ú=u! 
set px=!ox:ü=u! 
set qx=!px:Ñ=N! 
set zx=!ox:ñ=n! 


if [!an!]==[] (set zn=) 
if [!ax!]==[] (set zx=) 

set newname=!zn!!zx! 

if /i not [%%~nd%%~xd]==[!newname!] rename "%%d" !newname! 

) 

endlocal 

pause

來源

2010-03-18 13:10:48 Manolo

批處理文件腳本，用於從文件名中刪除特殊字符（Windows）

回答

相關問題