2009-04-13

Answer


Somewhat related questions: "How can I get the page orientation of a PDF page?" and "How do I get character offset information from a pdf document?"

Starting from the solution to the latter question, I came up with this recipe:

use strict;
use warnings;
use CAM::PDF;

my $pdf = CAM::PDF->new('my.pdf') or die $CAM::PDF::errstr;
for my $pagenum (1 .. $pdf->numPages) {
    # Skip pages whose content stream cannot be parsed
    my $pagetree = $pdf->getPageContentTree($pagenum) or next;
    # Walk the content tree with the custom renderer defined below
    my @text = $pagetree->traverse('MyRenderer')->getTextBlocks;
    for my $textblock (@text) {
        print "text '$textblock->{str}' at ",
            "($textblock->{left},$textblock->{bottom}), angle $textblock->{angle}\n";
    }
}

package MyRenderer; 
use base 'CAM::PDF::GS'; 

sub new { 
    my ($pkg, @args) = @_; 
    my $self = $pkg->SUPER::new(@args); 
    $self->{refs}->{text} = []; 
    return $self; 
} 
sub getTextBlocks { 
    my ($self) = @_; 
    return @{$self->{refs}->{text}}; 
} 
sub renderText {
    my ($self, $string, $width) = @_;
    # Map the text-space origin and a unit step along the baseline to device space
    my ($x, $y) = $self->textToDevice(0,0);
    my ($x1, $y1) = $self->textToDevice(1,0);
    push @{$self->{refs}->{text}}, {
        str    => $string,
        left   => $x,
        bottom => $y,
        # Baseline direction in radians
        angle  => atan2($y1-$y, $x1-$x),
    };
    return;
}

which produces this output for page 565 of PDFReference15_v5.pdf:

text 'ab' at (371.324,583.7249), angle -1.5707963267949 
text 'c' at (371.324,576.63365), angle -1.5707963267949 

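Note that the angle is in radians. Divide by pi and multiply by 180 to convert it to degrees. So -1.5707963267949 is -90 degrees, which is equivalent to 270 degrees and is consistent with page 565.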
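The conversion just described can be sketched as a small helper. The name `rad_to_deg` is hypothetical (it is not part of CAM::PDF), and normalizing the result to the range 0 to 360 is an assumption made so that -90 degrees is reported as 270:

```perl
use strict;
use warnings;
use POSIX qw(fmod);

# Hypothetical helper, not part of CAM::PDF:
# convert radians to degrees, normalized to [0, 360).
sub rad_to_deg {
    my ($radians) = @_;
    my $pi = 4 * atan2(1, 1);
    my $degrees = fmod($radians * 180 / $pi, 360);
    $degrees += 360 if $degrees < 0;
    # Guard against floating-point rounding up to exactly 360
    $degrees = 0 if $degrees >= 360;
    return $degrees;
}

printf "%.0f\n", rad_to_deg(-1.5707963267949);  # prints 270
```

In the recipe, the same helper could be applied to `$textblock->{angle}` before printing.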

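Note that the printed angle is relative to the page content. If the page itself is rotated as well (as in the page orientation question above), you may need to combine the two rotations.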
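One way to combine the two rotations is sketched below. It assumes you have already obtained the page's `/Rotate` value in degrees by some other means (how to read it is covered by the page orientation question, not here), and it relies on the PDF convention that `/Rotate` is measured clockwise while the `atan2` angle above is counterclockwise. The name `displayed_angle` is hypothetical:

```perl
use strict;
use warnings;
use POSIX qw(fmod);

# Hypothetical helper, not part of CAM::PDF.
# $text_radians:        angle from the recipe (radians, counterclockwise)
# $page_rotate_degrees: the page's /Rotate value (degrees, clockwise),
#                       assumed to be obtained separately
# Returns the on-screen text angle in degrees, normalized to [0, 360).
sub displayed_angle {
    my ($text_radians, $page_rotate_degrees) = @_;
    my $pi = 4 * atan2(1, 1);
    # A clockwise page rotation subtracts from a counterclockwise angle
    my $degrees = fmod($text_radians * 180 / $pi - $page_rotate_degrees, 360);
    $degrees += 360 if $degrees < 0;
    $degrees = 0 if $degrees >= 360;
    return $degrees;
}

# Text at -90 degrees on a page with /Rotate 90
printf "%.0f\n", displayed_angle(-1.5707963267949, 90);  # prints 180
```

Whether you want the angle relative to the content or to the displayed page depends on what you do with it, so treat this as a starting point rather than a definitive calculation.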