2013-06-21

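Find and replace text using XML::LibXML

I want to find text enclosed in tildes (~) and prepend a string to it, for example replacing ~it~ with ~T1it~ in an XML file, and then save the result to another file. I know how to get the text with XPath and how to replace it, but I don't know how to put the replaced text back in place and write it out.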

Here is my input XML:

<?xml version="1.0"?> 
<chapter> 
<section> 
<para id="p001">this is<math>~rom~This is roman~normal~</math>para</para> 
<para id="p002">this is<math>~rom~This is roman~normal~</math>para</para> 
<para id="p003">this is<math>~rom~This is roman~normal~</math>para</para> 
</section> 
<abstract> 
<para id="p004">This is <math>~rom~This is roman~normal~</math>para</para> 
<para id="p005">this is<math>~rom~This is roman~normal~</math>para</para> 
<para id="p006">this is<math>~rom~This is roman~normal~</math>para</para> 
</abstract> 
</chapter> 

Here is my Perl script:

use strict; 
use warnings; 
use XML::LibXML; 
#use XML::LibXML::Text; 
use Cwd 'abs_path'; 
my $x_name=abs_path($ARGV[0]); 
my $doc = XML::LibXML->load_xml(location => $x_name, no_blanks => 1); 
my $xpath_expression='/chapter/section/para/math'; 
my @nodes = $doc->findnodes($xpath_expression); 
foreach my $node(@nodes){ 
    my $content = $node->textContent; 
    $content=~s#\~rom\~#~T1rom~#sg; 
    print $content,"\n"; 
} 

Here is the output I want:

<?xml version="1.0"?> 
<chapter> 
<section> 
<para id="p001">this is<math>~T1rom~This is roman~normal~</math>para</para> 
<para id="p002">this is<math>~T1rom~This is roman~normal~</math>para</para> 
<para id="p003">this is<math>~T1rom~This is roman~normal~</math>para</para> 
</section> 
<abstract> 
<para id="p004">This is <math>~rom~This is roman~normal~</math>para</para> 
<para id="p005">this is<math>~rom~This is roman~normal~</math>para</para> 
<para id="p006">this is<math>~rom~This is roman~normal~</math>para</para> 
</abstract> 
</chapter> 

Answer


One possibility: select the text nodes directly and use the setData method of XML::LibXML::Text:

#!/usr/bin/perl 
use warnings; 
use strict; 

use XML::LibXML;  

my $x_name = $ARGV[0]; 
my $doc = XML::LibXML->load_xml(location => $x_name, no_blanks => 1); 
my $xpath_expression = '/chapter/section/para/math/text()'; 
my @nodes = $doc->findnodes($xpath_expression); 
for my $node (@nodes) { 
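    # Read the raw text; toString would return the entity-escaped serialization.
    my $content = $node->data;
    $content =~ s#~rom~#~T1rom~#g;
    # Write the modified text back into the same text node.
    $node->setData($content);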
} 
$doc->toFile($x_name . '.new', 1); 
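For trying this without an input file, here is a minimal self-contained sketch of the same idea. It assumes a trimmed inline copy of the sample XML, and the `add_prefix` helper is a hypothetical name introduced only for illustration; it is not part of XML::LibXML. Because the XPath expression only selects text nodes under /chapter/section, the <abstract> paragraphs should stay as they are.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Inline sample standing in for the input file (trimmed from the question).
my $xml = <<'XML';
<?xml version="1.0"?>
<chapter>
<section>
<para id="p001">this is<math>~rom~This is roman~normal~</math>para</para>
</section>
<abstract>
<para id="p004">This is <math>~rom~This is roman~normal~</math>para</para>
</abstract>
</chapter>
XML

# Hypothetical helper: put $prefix in front of each listed tilde-delimited token.
sub add_prefix {
    my ($text, $prefix, @tokens) = @_;
    my $alternatives = join '|', map { quotemeta } @tokens;
    $text =~ s/~($alternatives)~/~${prefix}$1~/g;
    return $text;
}

my $doc = XML::LibXML->load_xml(string => $xml);

# Only text nodes below /chapter/section/para/math are selected,
# so the text inside <abstract> is never visited.
for my $node ($doc->findnodes('/chapter/section/para/math/text()')) {
    $node->setData(add_prefix($node->data, 'T1', 'rom'));
}

# Serialize the modified document; use toFile instead to write it to disk.
my $result = $doc->toString;
print $result;
```

Passing more token names, for example `add_prefix($node->data, 'T1', 'rom', 'it')`, would prefix each of them in one pass, while tokens that are not listed (such as ~normal~) are left alone.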

Excellent, it works like a charm, thank you very much – siva2012