2014-02-17 38 views

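Question: I need a regular expression that captures the words between two delimiters. The code below fails to capture the words inside the delimiters: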

regexp -- {/b/{(.+)/}}/b} $outputline8 - filtered 

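Goals

  1. Capture all the pin names (of the form xxx/xxx[x]) that appear between the {} following set_false_path.
  2. set_false_path may carry other options, such as -through. I still want to capture the pins that follow those options and write them to the output file as described below.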

Here is my input file, input_file.txt:

set_false_path -from [get_ports {AAAcc/BBB/CCC[1] \ 
BBB_1/CCC[1] CCC/DDD[1] \ 
DDD/EEE EEE/FFF[1] \ 
FFF/GGG[1]}] -through\ 
[get_pins {GGG/HHH[1] HHH/III[1] \ 
XXX/YYY[1] YYY/XXX[1] \ 
AAA/ZZZ[1]}] 
set_timing_derate -cell_sdada [get_cells \ 
{NONO[1]} 
set_false_path -from [get_ports {AAA/DDD[2]}] 

Here is the output file (in the format I expect), output_file.txt:

AAAcc/BBB/CCC[1] 
BBB_1/CCC[1] 
CCC/DDD[1] 
DDD/EEE 
EEE/FFF[1] 
FFF/GGG[1] 
GGG/HHH[1] 
HHH/III[1] 
XXX/YYY[1] 
YYY/XXX[1] 
AAA/ZZZ[1] 
AAA/DDD[2] 

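In general, these pin names follow no common pattern, so the only option is to capture everything between { and }.

As the input file above shows, the set_ commands in input_file.txt are not each written on a single line. So I wrote code that picks out only the contents of the set_false_path commands and joins their lines. Here is my code: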

set inputfile [open "input_file.txt" r] 
set outputfile [open "output_file.txt" w] 

set first_word "" 
set outputline1 "" 
set filtered "" 

while { [gets $inputfile line] != 1} { 
set first_word [lindex [split $line ""] 0] 
set re2 {^set_+?} 
#match any "set_ " command 
if { [regexp $re2 $first_word matched] } { 
    #if the "set_ " command is found and the outputline1 is not empty, then it's 
    # the end of the last set_ command 
    if {$outputline1 != ""} { 
    #do the splitting here and put into the outputfile later on 
    regexp -- {/b/{(.+)/}}/b} $outputline8 - filtered 
    puts "$filtered:$filtered" 
    set outputline1 "" 
    } 

    # grab content if part of set_false_path 
    if{ [regexp "set_false_path" $first_word] } { 
    # if it's the expected command set, put "command_set" flag on which will be used on 
    # the next elseif 
    set command_set 1 
    lappend outputline1 $line 
    regsub -all {\\\[} $outputline1 "\[" outputline2 
    regsub -all {\\\]} $outputline2 "\]" outputline3 
    regsub -all {\\\{} $outputline3 "\{" outputline4 
    regsub -all {\\\}} $outputline4 "\}" outputline5 
    regsub -all {\\\\} $outputline5 "\\" outputline6 
    regsub -all {\\ +} $outputline6 " " outputline7 
    regsub -all {\s+} $outputline7 " " outputline8 
    } else { 
    set command_set 0 
    # if the line isn't started with set_false_path but it's part of set_false_path command 
    } elseif {$command_set} { 
    lappend outputline1 $line 
    regsub -all {\\\[} $outputline1 "\[" outputline2 
    regsub -all {\\\]} $outputline2 "\]" outputline3 
    regsub -all {\\\{} $outputline3 "\{" outputline4 
    regsub -all {\\\}} $outputline4 "\}" outputline5 
    regsub -all {\\\\} $outputline5 "\\" outputline6 
    regsub -all {\\ +} $outputline6 " " outputline7 
    regsub -all {\s+} $outputline7 " " outputline8 
    } else { 
    } 
} 
} 

puts "outputline:outputline8" 
#do the splitting here and put into the file later on for the last grabbed line! 

close $inputfile 
close $outputfile 

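A closer look at the code:

  • I found that after I lappend the lines onto outputline1, I get unexpected output with extra spaces and backslashes: set_false_path\ -from\ \[get_ports\ \{AAA/BBB\[1\] \ ... and so on.

    This output contains a backslash (\) in front of every special character ({, [, space, and so on). So I added a series of regsub calls to strip all these unwanted additions. The final joined result ends up in $outputline8.

    The result in $outputline8: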

    set_false_path -from [get_ports {AAAcc/BBB/CCC[1] BBB_1/CCC[1] CCC/DDD[1] DDD/EEE EEE/FFF[1] FFF/GGG[1]}] -through [get_pins {GGG/HHH[1] HHH/III[1] XXX/YYY[1] YYY/XXX[1] AAA/ZZZ[1]}] 
    set_false_path -from [get_ports {AAA/DDD[2]}] 
    
  • I want to capture the groups inside {} within outputline8.
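
The backslashes described above come from lappend, which treats outputline1 as a Tcl list and backslash-quotes elements containing spaces, brackets, or unbalanced braces when the list is read back as a string. A minimal sketch, assuming the goal is only to concatenate text (line1, line2, as_list, and as_text are illustrative names, and the sample data is made up), compares lappend with append:

```tcl
# Two physical lines of one command (illustrative sample data).
set line1 "set_false_path -from \[get_ports \{AAA/BBB\[1\]"
set line2 "CCC/DDD\[1\]\}\]"

# lappend builds a list; elements with spaces, brackets, or unbalanced
# braces are backslash-quoted in the list's string form.
set as_list {}
lappend as_list $line1
lappend as_list $line2

# append concatenates plain strings, so no quoting is introduced.
set as_text ""
append as_text $line1 " " $line2

puts $as_list
puts $as_text
puts [join $as_list " "]   ;# same text as $as_text
```

If the lines are collected with append (or the list is turned back into text with join), the chain of regsub calls that strips backslashes may not be needed at all.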

Reference: process multiple lines text file to print in single line

  • Start of the latest update

    If the input file is:

    set_false_path -from [get_ports {AAAcc/BBB/CCC[1] BBB_1/CCC[1] DDD/EEE}] -through [get_pins {XXX_1[1]}] 
    

    the output file I want is:

    AAAcc/BBB/CCC[1] 
    BBB_1/CCC[1] 
    DDD/EEE 
    XXX_1[1] 
    

Thanks! End of the latest update.

Note: I am new to Tcl and to this forum, so any suggestions are really appreciated!


Shouldn't '{/b/{(.+)/}}/b}' use backslashes instead of forward slashes? '{\b\{(.+)\}}\b}' – devnull


Yes, devnull.. I'm so dumb :( I tried {/b({(.+)/}}/b} but it doesn't work either –


I tried using 'regexp -- {\{(.+)\}} $outputline8 - filtered' but I get: 'AAAcc/BBB[1] BBB_1/CCC[1] CCC/DDD[1] DDD/EEE EEE/FFF[1] FFF/GGG[1]}] -through [get_pins {GGG/HHH[1] HHH/III[1] XXX/YYY[1] YYY/XXX[1] AAA/ZZZ[1]' It seems to match from the first "{" to the last "}", but I want: 'AAAcc/BBB[1] BBB_1/CCC[1] CCC/DDD[1] DDD/EEE EEE/FFF[1] FFF/GGG[1] GGG/HHH[1] HHH/III[1] XXX/YYY[1] YYY/XXX[1] AAA/ZZZ[1]' Thanks! –

Answer


Please try the following script. I have added explanations in the code comments:

set inputfile [open "input_file.txt" r] 
set outputfile [open "output_file.txt" w] 

# This is a temp variable to store the partial lines 
set buffer "" 

while { [gets $inputfile line] != -1} {
    # Take previous line and add to current line
    set buffer "$buffer[regsub -- {\\[[:blank:]]*$} $line ""]"

    # If there is no ending \ then stop adding and process the elements to extract
    if {![regexp -- {\\[[:blank:]]*$} $line]} {
        # Skip line if not "set_false_path"
        if {[lindex [split $buffer " "] 0] ne "set_false_path"} {
            set buffer ""
            continue
        }

        # Grab each element with regexp into a list and print each to outputfile
        # m contains whole match, groups contains sub-matches
        foreach {m groups} [regexp -all -inline -- {\{([^\}]+)\}} $buffer] {
            foreach out [split $groups] {
                puts $outputfile $out
            }
        }

        # Clear the temp variable
        set buffer ""
    }
}

close $inputfile 
close $outputfile 
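
The comments on the question show why a pattern such as {\{(.+)\}} returns too much: .+ is greedy, so a single match runs from the first { to the last } on the line. A minimal sketch of the difference, using a shortened sample line (the sample string and the variable names text, greedy, and pins are illustrative, not taken from the original files):

```tcl
# Shortened, made-up sample of one joined set_false_path command.
set text {set_false_path -from [get_ports {A/B[1] C/D}] -through [get_pins {E/F[2]}]}

# Greedy: .+ extends to the last closing brace, so the capture
# swallows the text between the two brace groups as well.
regexp -- {\{(.+)\}} $text -> greedy
puts $greedy

# Negated class: [^\}]+ cannot cross a closing brace, so each
# brace group is matched separately.
set pins {}
foreach {whole group} [regexp -all -inline -- {\{([^\}]+)\}} $text] {
    foreach pin [split $group] {
        lappend pins $pin
    }
}
foreach pin $pins {
    puts $pin
}
```

With -all -inline, regexp returns a flat list that alternates the whole match and the captured group, which is why the loop reads two variables per iteration.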

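Hey Jerry, I get the error message: extra characters after close-quote. By the way, I think I'll open a new topic, because the input file has been modified. Please help me in the new topic! –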


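@AndiLee Oh? Which Tcl version are you using? I think the part causing the error is '"$buffer[regsub -- {\\[[:blank:]]*$} $line ""]"'. Could you try '$buffer[regsub -- {\\[[:blank:]]*$} $line ""]' instead? In the meantime, I'll take a look at the new question. – Jerry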


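@AndiLee Also, I don't think it's necessary to ask another question unless the input file is completely different. But in that case, what happens to this question? – Jerry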