2010-05-13 25 views
1

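Linux - Want to check for possible duplicate directories (regex probably needed)

I have a directory that contains several directories, as follows: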

/Music/
/Music/JoeBlogs-Back_In_Black-1980
/Music/JoeBlogs-Back_In_Black-(Remastered)-2003
/Music/JoeBlogs-Back_In_Black-(Reissue)-1987
/Music/JoeBlogs-Thunder_Man-1947

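I'd like a script to go through it and tell me when there are 'possible' duplicates. In the example above, it would pick up the following from the directory list as possible duplicates: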

/Music/JoeBlogs-Back_In_Black-1980
/Music/JoeBlogs-Back_In_Black-(Remastered)-2003
/Music/JoeBlogs-Back_In_Black-(Reissue)-1987

1) Is this possible?
2) If so, please help!

+1

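You say you have several directories, but you only show one. What do the other directories look like? Is the second text field between the dashes in "JoeBlogs-Back_In_Black-1980" always the song name? – dawg 2010-05-13 15:51:45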

Answers

0

If your directory names follow a regular structure, for example:

foo-Name_of_Interest-bar 

then you can use a simple regular expression to strip off "foo-" and "-bar" and do a direct comparison.
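As a rough sketch of that stripping step, the same effect can be had in Bash with parameter expansion instead of a regex. The function name `name_of_interest` is made up for illustration, and the sketch assumes the middle field itself contains no dashes:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reduce "foo-Name_of_Interest-bar" to "Name_of_Interest".
# Assumes the field of interest is the second dash-separated field and
# contains no dashes itself.
name_of_interest() {
    local name=${1##*/}   # drop any leading directory path
    name=${name#*-}       # strip the leading "foo-" (up to the first dash)
    name=${name%%-*}      # strip "-bar" (everything from the next dash on)
    printf '%s\n' "$name"
}
```

Two directories are then candidates for duplicates when `name_of_interest` prints the same string for both, for example for `/Music/JoeBlogs-Back_In_Black-1980` and `/Music/JoeBlogs-Back_In_Black-(Reissue)-1987`.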

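If that's not possible, you will have to use a more expensive pattern-matching algorithm, perhaps something like longest common subsequence or Levenshtein distance. There may be other techniques that are better suited.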
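For a feel of what the Levenshtein approach involves, here is a minimal pure-Bash sketch of the standard dynamic-programming edit distance. The function name `levenshtein` is an illustrative choice, and a scripting language with a ready-made implementation would likely be faster for large collections:

```shell
#!/usr/bin/env bash
# Levenshtein (edit) distance between two strings, keeping only two
# rolling rows of the usual dynamic-programming table.
levenshtein() {
    local a=$1 b=$2
    local -i la=${#a} lb=${#b} i j cost del ins sub min
    local -a prev curr
    # Row 0: turning the empty prefix of a into b[0..j) takes j insertions.
    for ((j = 0; j <= lb; j++)); do prev[j]=$j; done
    for ((i = 1; i <= la; i++)); do
        curr=("$i")   # column 0: i deletions
        for ((j = 1; j <= lb; j++)); do
            [[ ${a:i-1:1} == "${b:j-1:1}" ]] && cost=0 || cost=1
            del=$((prev[j] + 1))
            ins=$((curr[j-1] + 1))
            sub=$((prev[j-1] + cost))
            min=$del
            ((ins < min)) && min=$ins
            ((sub < min)) && min=$sub
            curr[j]=$min
        done
        prev=("${curr[@]}")
    done
    printf '%d\n' "${prev[lb]}"
}
```

A small distance relative to the name length (the threshold is something you would have to tune yourself) flags two names as possible duplicates, e.g. `levenshtein Back_In_Black Back_in_Black` prints 1.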

A simple match in Bash (3.2 or later) might look like this snippet:

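dir='/Music/JoeBlogs-Back_In_Black-(Remastered)-2003'
regex='^([^-]*)-([^-]*)-(.*)$'
# prev_dir is assumed to hold fields 1 and 2 captured from the previous entry
if [[ $dir =~ $regex &&                         # populates BASH_REMATCH
      ${BASH_REMATCH[1]} == "${prev_dir[1]}" && # "/Music/JoeBlogs"
      ${BASH_REMATCH[2]} == "${prev_dir[2]}" ]] # "Back_In_Black"
then
    echo "we have a match"
fi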

This snippet doesn't show the find ... | while read ... loop, or how to keep track of the previous entry so that the listing can be processed.
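One way the missing loop might look, as a sketch only: the function below reads paths from standard input, so it assumes the caller has already sorted the listing so that similar names sit next to each other. The function name `report_possible_duplicates` and the output wording are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: read directory paths on stdin (one per line, already sorted) and
# report every entry whose first two dash-separated fields equal those of
# the entry just before it.
report_possible_duplicates() {
    local regex='^([^-]*)-([^-]*)-(.*)$'
    local dir prev_dir='' prev_artist='' prev_album=''
    while IFS= read -r dir; do
        if [[ $dir =~ $regex ]]; then
            if [[ -n $prev_dir &&
                  ${BASH_REMATCH[1]} == "$prev_artist" &&
                  ${BASH_REMATCH[2]} == "$prev_album" ]]; then
                printf 'possible duplicate: %s <-> %s\n' "$prev_dir" "$dir"
            fi
            # Remember this entry for the comparison with the next one.
            prev_dir=$dir
            prev_artist=${BASH_REMATCH[1]}
            prev_album=${BASH_REMATCH[2]}
        else
            # Entry does not fit the pattern; forget the previous one.
            prev_dir='' prev_artist='' prev_album=''
        fi
    done
}
```

It could be fed with something like `find /Music -mindepth 1 -maxdepth 1 -type d | sort | report_possible_duplicates`. Because only neighbouring entries are compared, three similar directories produce two reported pairs.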

2

Follow-up:

I did what I needed by writing the Perl script below. This is my first Perl script (and I had to learn Perl to write it, so don't be too hard on me :)

#!/usr/bin/perl 

# README 
# 
# Checks a folder for Albums that are similar 
# eg : 
# Artist-Back_In_Black-(Remastered)-2001-XXX 
# Artist-Back_In_Black-(Reissue)-2000-YYY 
# 
# Script prompts you for which one to "zz" (putting zz in front of the file name you can delete it later) 
# 
# CONFIG 
# 
# Put your mp3 directory path in the $mp3dirpath variable 
# 

$mp3dirpath = '/data/downloads/MP3'; 

# END CONFIG 


@txt= qx{ls $mp3dirpath}; 


@txt = sort @txt; # sort returns the sorted list; it does not sort in place 

$re1='.*?'; 
$re2='(?:[a-z][a-z0-9_]*)'; 
$re3='.*?'; 
$re4='((?:[a-z][a-z0-9_]*))'; 

$re=$re1.$re2.$re3.$re4; 

$foreach_count_before=0; # Index of the current entry 
$foreach_count_after=1; # Index of the entry that follows it 


$number_in_arry = scalar (@txt); 

while ($foreach_count_after < $number_in_arry) { # stop before the "after" index runs past the last entry 
             if ($txt[$foreach_count_before] =~ m/$re/is) 
              { 
              $var1=$1; 
              } 
             if ($txt[$foreach_count_after] =~ m/$re/is) 
              { 
              $var2=$1; 
              } 
             if ($var1 eq $var2) 
              { 
              print "-------------------------------------\n"; 
              print "$txt[$foreach_count_before] \n"; 
              print "MATCHES \n"; 
              print "\n$txt[$foreach_count_after] \n"; 
              print "Which Should I Remove? \n"; 
              print "[1] $txt[$foreach_count_before]\n"; 
              print "[2] $txt[$foreach_count_after]\n"; 
              print "[Any Other Key] Take No Action\n\n"; 

              $answer = <>;  # Get user input, assign it to the variable 
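               if ($answer =~ /^\s*1\s*$/) { # match the typed text; tolerates the trailing newline 
                 print "ZZing $txt[$foreach_count_before]"; 
                 $originalfilename = $mp3dirpath . '/' . $txt[$foreach_count_before]; 
                 $newfilename = $mp3dirpath . '/' . 'zz' . $txt[$foreach_count_before]; 
                 $originalfilename = trim($originalfilename); 
                 $newfilename = trim($newfilename); 
                 # rename() avoids the shell, so names containing "(" or spaces are safe 
                 rename($originalfilename, $newfilename) or warn "Could not rename $originalfilename: $!\n"; 
               } 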
               elsif ($answer =~ /^\s*2\s*$/) { # match the typed text; tolerates the trailing newline 
                 print "ZZing $txt[$foreach_count_after]"; 
                 $originalfilename = $mp3dirpath . '/' . $txt[$foreach_count_after]; 
                 $newfilename = $mp3dirpath . '/' . 'zz' . $txt[$foreach_count_after]; 
                 $originalfilename = trim($originalfilename); 
                 $newfilename = trim($newfilename); 
                 # rename() avoids the shell, so names containing "(" or spaces are safe 
                 rename($originalfilename, $newfilename) or warn "Could not rename $originalfilename: $!\n"; 
               } 
               else { 
                 print "Taking No Action\n"; 
               } 

              } 

              $foreach_count_before++; 
              $foreach_count_after++; 

             } 

# SubRoutine For Trimming White Space from variables 
sub trim($) 
{ 
my $string = shift; 
$string =~ s/^\s+//; 
$string =~ s/\s+$//; 
return $string; 
}