2014-09-04 102 views
0

Quantifying a captured regular expression

I am looking for file paths inside a script, so I cat the file and grep for "/".

I would rather use a Perl regular expression, and I only want to pull out the file paths.

$ cat /sbcimp/dyn/data/FOO/GSD/scripts/FOOonoff.pl | grep "/"

#!/usr/bin/perl 
my $output_file = "/sbcimp/dyn/data/stmFOO3/dailymetrics/PartRates/file6.csv"; 
my $input_file_name_ESTATE = "/sbcimp/dyn/sym/data/stmFOO3/part_rates/FOO_estate.$year$month1$day1.1630.csv"; 
my $input_file_name_ESTATE = "/sbcimp/data/stmFOO3/part_rates/FOO_estate.20140829.1630.csv"; 
my $input_file_name_ESTATE2 = "/sbcimp/part_rates/FOO_estate.$year$month1$day2.1630.csv"; 
my $input_file_name_ESTATE3 = "/sbcimp/FOO_estate.$year$month2$day3.1630.csv"; 
my $input_file_name_NEW = "/sbcimp/dyn/data/stmFOO3/dailymetrics/RiskTiers/new_terms.csv"; 
    $argVal =~ s/\s+$//; 
    $argVal =~ s/^\s+//; 
    $argVal =~ s/\"$//; 
    $argVal =~ s/^\"//; 
    $argVal =~ s/\'$//; 
    $argVal =~ s/^\'//; 

If I pipe the file into perl instead, I only get the root directory.

$ cat /sbcimp/dyn/data/FOO/GSD/scripts/FOOonoff.pl | perl -nle 'print /(\/\w+\/)/' | sort -u

/sbcimp/ 

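I understand regex quantifiers, but if I use 'print /(\/\w+\/){1,9}/' it does not give me the /\w+/ sequence repeated anywhere from 1 to 9 times. I am looking for one or more path components starting at the root. How do I quantify the whole captured expression rather than just the last character?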

+0

It should probably be '/^\/(\w+\/)+/' instead: match the leading slash, then any number of '\w+/' sequences. – raina77ow 2014-09-04 04:11:18

+0

The simplest way is to remove any trailing non-slash characters from the string (for example, with 's~[^/]+$~~'). – raina77ow 2014-09-04 04:16:10

+0

So this really cannot be quantified? – capser 2014-09-04 04:20:10
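To illustrate the point raised in these comments: in '(\/\w+\/){1,9}' every repetition has to start with its own slash, but the slash that ends one component has already been consumed, so only a single repetition can match. A quantified capture group also keeps only its last iteration. A minimal sketch of one way around both issues is to repeat a non-capturing inner group and wrap the whole run in an outer capture. The helper name extract_dir is hypothetical, and the sample line is taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return the directory part of the first absolute
# path found in $line, or undef if the line contains none.
sub extract_dir {
    my ($line) = @_;
    # The outer (...) captures the whole run. The inner (?:...) is
    # non-capturing and is the part that the + quantifier repeats.
    my ($dir) = $line =~ m{(/(?:\w+/)+)};
    return $dir;
}

my $line = 'my $output_file = "/sbcimp/dyn/data/stmFOO3/dailymetrics/PartRates/file6.csv";';
print extract_dir($line), "\n";
```

For the sample line this prints /sbcimp/dyn/data/stmFOO3/dailymetrics/PartRates/. Note that the pattern also matches the shebang line (yielding /usr/bin/), and that \w does not cover directory names containing dots, dashes, or spaces.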

Answers

3

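I would recommend not using a regular expression to parse Perl code, and using PPI instead.

The following parses the Perl lines you provided for strings, reduces them to their basic content, and then pulls out the path information: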

use strict; 
use warnings; 

use PPI; 
use File::Basename; 

my $src = do {local $/; <DATA>}; 

# Load a document 
my $doc = PPI::Document->new(\$src); 

# Find all the strings within the doc 
my $strings = $doc->find('PPI::Token::Quote'); 
for (@$strings) { 
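    # Evaluate the quoted token to get its value; strict and warnings are
    # off so undeclared variables such as $year interpolate as empty strings.
    my $str = eval 'no strict; no warnings; '. $_->content; 
    next if $@ || $str !~ /\//; 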

    my ($name, $path) = fileparse($str); 

    print "$path\n"; 
} 

__DATA__ 
#!/usr/bin/perl 
my $output_file = "/sbcimp/dyn/data/stmFOO3/dailymetrics/PartRates/file6.csv"; 
my $input_file_name_ESTATE = "/sbcimp/dyn/sym/data/stmFOO3/part_rates/FOO_estate.$year$month1$day1.1630.csv"; 
my $input_file_name_ESTATE = "/sbcimp/data/stmFOO3/part_rates/FOO_estate.20140829.1630.csv"; 
my $input_file_name_ESTATE2 = "/sbcimp/part_rates/FOO_estate.$year$month1$day2.1630.csv"; 
my $input_file_name_ESTATE3 = "/sbcimp/FOO_estate.$year$month2$day3.1630.csv"; 
my $input_file_name_NEW = "/sbcimp/dyn/data/stmFOO3/dailymetrics/RiskTiers/new_terms.csv"; 
    $argVal =~ s/\s+$//; 
    $argVal =~ s/^\s+//; 
    $argVal =~ s/\"$//; 
    $argVal =~ s/^\"//; 
    $argVal =~ s/\'$//; 
    $argVal =~ s/^\'//; 

Output:

/sbcimp/dyn/data/stmFOO3/dailymetrics/PartRates/ 
/sbcimp/dyn/sym/data/stmFOO3/part_rates/ 
/sbcimp/data/stmFOO3/part_rates/ 
/sbcimp/part_rates/ 
/sbcimp/ 
/sbcimp/dyn/data/stmFOO3/dailymetrics/RiskTiers/ 
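
As a side note on the fileparse call used above: the core File::Basename module returns the file name first, then the directory (with its trailing slash), then the suffix, which is why the code assigns ($name, $path) in that order. A small sketch, assuming you also want the extension split off via an optional suffix pattern; the path is one of the sample strings from the question.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Sample path taken from the data above.
my $file = '/sbcimp/dyn/data/stmFOO3/dailymetrics/RiskTiers/new_terms.csv';

# With a suffix pattern, the extension comes back as the third element.
my ($name, $dir, $suffix) = fileparse($file, qr/\.[^.]*/);

print "name=$name dir=$dir suffix=$suffix\n";
```

Without the suffix pattern, $name would be the full file name (new_terms.csv) and $suffix would be empty.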
+1

+1 Very neat solution. – TLP 2014-09-04 04:54:00