Perl regex to capture the string between anchor words

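I'm still in the middle of cleaning up Oracle files. I have to replace strings in files where the Oracle schema name is prepended to the function/procedure/package name, and also where the function/procedure/package name is double-quoted. Once the definition is corrected, I write the correction back to the file along with the rest of the actual code. (Note: this post is a continuation of this question.) Here is an example of what I'm trying to clean up.

Replace: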

CREATE OR REPLACE FUNCTION "TRON2000"."DC_F_DUMP_CSV_MMA" (
p_trailing_separator IN BOOLEAN DEFAULT FALSE, 
p_max_linesize IN NUMBER DEFAULT 32000, 
p_mode IN VARCHAR2 DEFAULT 'w' 
) 
RETURN NUMBER 
IS

with:

CREATE OR REPLACE FUNCTION DC_F_DUMP_CSV_MMA (
p_trailing_separator IN BOOLEAN DEFAULT FALSE, 
p_max_linesize IN NUMBER DEFAULT 32000, 
p_mode IN VARCHAR2 DEFAULT 'w' 
) 
RETURN NUMBER 
IS

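I have been trying to use the regex below to split the declaration into pieces, so that I can reassemble it later after stripping the schema name and removing the double quotes from the function/procedure/package name. I'm struggling to get each piece into its own capture buffer. Here is my latest attempt at grabbing all of the IN/OUT parameters in the middle into their own buffer: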

\b(CREATE\sOR\sREPLACE\s(PACKAGE|PACKAGE\sBODY|PROCEDURE|FUNCTION))(?:\W+\w+){1,100}?\W+(RETURN)\s*(\W+\w+)\s(AS|IS)\b

Any and all help is appreciated!

This is the script I'm currently using to evaluate the files and write the corrected versions:

#!/usr/bin/perl 
use strict; 
use warnings; 
use File::Find; 
use Data::Dumper; 

# utility to clean strings 
sub trim($) { 
    my $string = shift; 
    $string = "" if !defined($string); 

    $string =~ s/^\s+//; 
    $string =~ s/\s+$//; 

    # aggressive removal of blank lines 
    $string =~ s/\n+/\n/g; 
    return $string; 
} 

sub cleanup_packages { 
    my $file = shift; 
    my $tmp = $file . ".tmp"; 
    my $package_name; 

    open(OLD, "< $file") or die "open $file: $!"; 
    open(NEW, "> $tmp") or die "open $tmp: $!"; 

    while (my $line = <OLD>) { 

    # look for the first line of the file to contain a CREATE OR REPLACE STATEMENT 
     if ($line =~ 
m/^(CREATE\sOR\sREPLACE)\s*(PACKAGE|PACKAGE\sBODY)?\s(.+)\s(AS|IS)?/i 
     ) 
     { 

      # look ahead to next line, in case the AS/IS is next 
      my $nextline = <OLD>; 

      # from the above IF clause, the package name is in buffer 3 
      $package_name = $3; 

      # if the package name and the AS/IS is on the same line, and 
      # the package name is quoted/prepended by the TRON2000 schema name 
      if ($package_name =~ m/"TRON2000"\."(\w+)"(\s*|\S*)(AS|IS)/i) { 
       # grab just the name and the AS/IS parts 
       $package_name =~ s/"TRON2000"\."(\w+)"(\s*|\S*)(AS|IS)/$1 $3/i; 
       $package_name = trim($package_name); 
      } 
      elsif ( ($package_name =~ m/"TRON2000"\."(\w+)"/i) 
        && ($nextline =~ m/(AS|IS)/)) 
      { 

# if the AS/IS was on the next line from the name, put them together on one line 
       $package_name =~ s/"TRON2000"\."(\w+)"(\s*|\S*)/$1/i; 
       $package_name = trim($package_name) . ' ' . trim($nextline); 
       $package_name = trim($package_name); # remove trailing carriage return 
      } 

      # now put the line back together 
      $line =~ 
s/^(CREATE\sOR\sREPLACE)\s*(PACKAGE|PACKAGE\sBODY|FUNCTION|PROCEDURE)?\s(.+)\s(AS|IS)?/$1 $2 $package_name/ig; 

      # and print it to the file 
      print NEW "$line\n"; 
     } 
     else { 

      # just a normal line - print it to the temp file 
      print NEW $line or die "print $tmp: $!"; 
     } 
    } 

    # close up the files 
    close(OLD) or die "close $file: $!"; 
    close(NEW) or die "close $tmp: $!"; 

    # rename the temp file as the original file name 
    unlink($file) or die "unlink $file: $!"; 
    rename($tmp, $file) or die "can't rename $tmp to $file: $!"; 
} 

# find and clean up oracle files 
sub eachFile { 
    my $ext; 
    my $filename = $_; 
    my $fullpath = $File::Find::name; 

    if (-f $filename) { 
     ($ext) = $filename =~ /(\.[^.]+)$/; 
    } 
    else { 

     # ignore non files 
     return; 
    } 

    if ($ext =~ /(\.spp|\.sps|\.spb|\.sf|\.sp)/i) { 
     print "package: $filename\n"; 
     cleanup_packages($fullpath); 
    } 
    else { 
     print "$filename not specified for processing!\n"; 
    } 
} 

MAIN: 
{ 
    my (@files, $file); 
    my $dir = 'C:/1_atest'; 

    # grab all the files for cleanup 
    find(\&eachFile, "$dir/"); 

    #open and evaluate each 
    foreach $file (@files) 
    { 
     # skip . and .. 
     next if ($file =~ /^\.$/); 
     next if ($file =~ /^\.\.$/); 
      cleanup_packages($file); 
     }; 
}

Source

2012-05-08 T.j. Randall

Assuming the whole contents of the file are stored as a scalar in a variable, the following should do the trick.

$Str = ' 
CREATE OR REPLACE FUNCTION "TRON2000"."DC_F_DUMP_CSV_MMA" (
    p_trailing_separator IN BOOLEAN DEFAULT FALSE, 
    p_max_linesize IN NUMBER DEFAULT 32000, 
    p_mode IN VARCHAR2 DEFAULT w 
) 
RETURN NUMBER 
IS 

CREATE OR REPLACE FUNCTION "TRON2000"."DC_F_DUMP_CSV_MMA" (
    p_trailing_separator IN BOOLEAN DEFAULT FALSE, 
    p_max_linesize IN NUMBER DEFAULT 32000, 
    p_mode IN VARCHAR2 DEFAULT w 
) 
RETURN NUMBER 
IS 
'; 

$Str =~ s#^(create\s+(?:or\s+replace\s+)?\w+\s+)"[^"]+"\."([^"]+)"#$1$2#mig; 

print $Str;
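
The substitution above assumes the file is already in a scalar. A minimal sketch of how that could be wired up, assuming it is acceptable to read each file fully into memory: the helper names `strip_quoted_schema` and `clean_file` are hypothetical and not part of the original script. The pattern escapes the dot between the two quoted names and uses `$1$2` in the replacement, since `$1` already ends in whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: strip a quoted "SCHEMA"."NAME" prefix from
# CREATE [OR REPLACE] <type> headers anywhere in the text.
sub strip_quoted_schema {
    my ($text) = @_;
    $text =~ s#^(create\s+(?:or\s+replace\s+)?\w+\s+)"[^"]+"\."([^"]+)"#$1$2#mig;
    return $text;
}

# Hypothetical helper: slurp a file, clean it, and write it back in place.
sub clean_file {
    my ($path) = @_;

    open(my $in, '<', $path) or die "open $path: $!";
    my $text = do { local $/; <$in> };    # undef $/ reads the whole file at once
    close($in) or die "close $path: $!";

    my $cleaned = strip_quoted_schema($text);

    open(my $out, '>', $path) or die "open $path for writing: $!";
    print {$out} $cleaned or die "print $path: $!";
    close($out) or die "close $path: $!";
    return;
}
```

Calling `clean_file($fullpath)` from the `File::Find` callback would then replace the line-by-line loop. Note that `\w+\s+` matches a single-word object type only, so a `PACKAGE BODY` header would need the pattern extended.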

Source

2012-05-08 12:09:28 tuxuday

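Excellent, @tuxuday! Very efficient, and much simpler. Thank you! I will now try to modify it slightly for the other case, where the string I need to clean up goes from 'TRON2000.DC_F_DUMP_CSV_MMA' to 'DC_F_DUMP_CSV_MMA' (the schema name is present, but without double quotes). Thanks again!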
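
For the unquoted case mentioned in the comment, one possible variation is to make the double quotes optional. This is only a sketch under a few assumptions: the schema is always `TRON2000`, object names consist of `\w` characters only (no `$` or `#`), and the helper name `strip_schema` is hypothetical. It also accepts `PACKAGE BODY` as an object type.

```perl
use strict;
use warnings;

# Hypothetical helper: remove a TRON2000 schema prefix, quoted or not,
# and drop any double quotes around the object name.
sub strip_schema {
    my ($text) = @_;
    $text =~ s{
        ^(\s*create\s+(?:or\s+replace\s+)?   # CREATE, optionally OR REPLACE
          (?:package\s+body|\w+)\s+)         # object type, kept in capture 1
        (?:"TRON2000"|TRON2000)\s*\.\s*      # schema prefix, quoted or bare
        "?(\w+)"?                            # object name, kept in capture 2
    }{$1$2}xmig;
    return $text;
}

print strip_schema(qq{CREATE OR REPLACE FUNCTION TRON2000.DC_F_DUMP_CSV_MMA (\n});
print strip_schema(qq{CREATE OR REPLACE PACKAGE BODY "TRON2000"."DC_PKG" AS\n});
```

Headers that carry no `TRON2000` prefix are left untouched, because the schema part of the pattern is not optional.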
