2011-04-14

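How can I split a multi-document PDF based on bookmarks using PDF::API2?

Is it possible to split a multi-document PDF using PDF::API2? For example, if myfile.pdf contains the following bookmarks: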

  • bookmark1
  • bookmark2
  • bookmark3

Then it needs to be split into the following individual PDF files:

  • bookmark1.pdf
  • bookmark2.pdf
  • bookmark3.pdf

I can't find any mention of bookmarks in the PDF::API2 documentation. Are they what it refers to as outlines?

Thanks!

For future reference, Adobe refers to bookmarks as "outlines" in the PDF specification. – yms 2012-02-03 15:20:55

Answer

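I played with this in pure Perl for a bit, then gave up and fell back on pdftk. I still drive it from Perl, though. Here is a sample script for a PDF whose bookmarks have titles like "Chapter 1" and "Appendix 1". You can probably adapt this script, but be aware that some things in it are specific to my situation. I also use some newer features, but if you don't want to use Perl 5.13, you can easily swap those parts out: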

use 5.013; 

use Data::Dumper; 
use File::Basename; 
use File::Spec::Functions; 
use File::Path qw(make_path); 

my $pdftk = 'pdftk'; 


my $file = $ARGV[0];
say("\n$0 <FILENAME>") && exit 1 unless $file;

my $dir = dirname($file) || '.'; 
my $output_dir = $ARGV[1] || $dir; 

unless(-e $output_dir) {
    make_path $output_dir, { mode => 0755 };
    die "mkdir failed: $!" unless -e $output_dir;
    }


my $string = `$pdftk @{[quotemeta($file)]} dump_data output -`; 

my($last_page) = $string =~ m/NumberOfPages: (\d+)/; 
say "last page is $last_page"; 

my $regex = qr/ 
    BookmarkTitle:  \s+ (?<title>.*?) \s+ 
    BookmarkLevel:  \s+ (?<level>\d+) \s+ 
    BookmarkPageNumber: \s+ (?<page>\d+) 
    /x; 

my @page_numbers; 
while($string =~ /$regex/g) { 
    next unless $+{level} == 1; 
    push @page_numbers, [ @+{ qw(title page) } ]; 
    } 

say "Last index is $#page_numbers"; 

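# Example title as pdftk reports it: Chapter&#160;1.&#160;Introduction
while(my($index, $elem) = each @page_numbers) {
    # The final bookmark is handled after the loop
    last if $index == $#page_numbers;
    # pdftk writes non-breaking spaces as &#160; entities
    $page_numbers[$index]->[0] =~ s/&#160;/ /g;
    # Strip the "Chapter N." or "Appendix N." prefix from the title and
    # put the captured number or letter first, or 'XX' if nothing matched
    unshift @$elem,
        $page_numbers[$index]->[0] =~ s/(?:Chapter|Appendix)\s+(\d+|[ABC]|).?\s+//g
            ? $1
            : 'XX';
    # This section ends on the page before the next bookmark starts
    push @$elem, $page_numbers[$index+1]->[-1] - 1;
    }
# The final bookmark gets the placeholder 'XX' and runs to the last page
unshift @{ $page_numbers[-1] }, 'XX';
push @{ $page_numbers[-1] }, $last_page;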

print Dumper(\@page_numbers); 

# pdftk A=one.pdf B=two.pdf cat A1-7 B1-5 A8 output combined.pdf 
foreach my $elem (@page_numbers) { 
    my $chapter = $elem->[1] =~ s/\s+/_/rg; 
    my $filename = catfile($output_dir, "$elem->[0].$chapter.pdf"); 
    say "Splitting Chapter $elem->[0] $elem->[1]"; 
    print "Running ", join ' ', $pdftk, $file, 'cat', "$elem->[2]-$elem->[3]", 'output', $filename, "\n"; 
    system $pdftk, $file, 'cat', "$elem->[2]-$elem->[3]", 'output', $filename; 
    }
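
The step that turns bookmarks into page ranges can be pulled out and checked without running pdftk at all. What follows is only a minimal sketch, not part of the script above: the sub names `parse_bookmarks` and `page_ranges` are made up for illustration, and the sample text assumes the `NumberOfPages` / `BookmarkTitle` / `BookmarkLevel` / `BookmarkPageNumber` layout that the script's regex expects from `dump_data`.

```perl
use strict;
use warnings;

# Hypothetical helper: parse dump_data-style text into [title, first_page]
# pairs, keeping only the bookmarks at the requested level.
sub parse_bookmarks {
    my ($dump, $wanted_level) = @_;
    $wanted_level //= 1;
    my @bookmarks;
    while ($dump =~ /
        BookmarkTitle:      \s+ (?<title>.*?) \s+
        BookmarkLevel:      \s+ (?<level>\d+) \s+
        BookmarkPageNumber: \s+ (?<page>\d+)
        /xg) {
        # Copy the captures right away; a later successful match resets %+
        my ($title, $level, $page) = @+{ qw(title level page) };
        next unless $level == $wanted_level;
        $title =~ s/&#160;/ /g;    # non-breaking spaces arrive as entities
        push @bookmarks, [ $title, $page ];
    }
    return @bookmarks;
}

# Hypothetical helper: extend each pair to [title, first_page, last_page].
# A section ends on the page before the next bookmark; the final section
# runs to the last page of the document.
sub page_ranges {
    my ($last_page, @bookmarks) = @_;
    my @ranges;
    for my $i (0 .. $#bookmarks) {
        my ($title, $first) = @{ $bookmarks[$i] };
        my $last = $i < $#bookmarks ? $bookmarks[$i + 1][1] - 1 : $last_page;
        $last = $first if $last < $first;    # two bookmarks on the same page
        push @ranges, [ $title, $first, $last ];
    }
    return @ranges;
}

# Made-up sample input in the assumed dump_data layout
my $sample = <<'END';
NumberOfPages: 12
BookmarkTitle: Chapter&#160;1.&#160;Introduction
BookmarkLevel: 1
BookmarkPageNumber: 1
BookmarkTitle: A subsection
BookmarkLevel: 2
BookmarkPageNumber: 3
BookmarkTitle: Appendix A. Notes
BookmarkLevel: 1
BookmarkPageNumber: 9
END

my ($last_page) = $sample =~ /NumberOfPages: (\d+)/;
my @ranges = page_ranges($last_page, parse_bookmarks($sample));
printf "%s: pages %d-%d\n", @$_ for @ranges;
```

For this sample the two level-1 bookmarks come out as pages 1-8 and 9-12. Each range could then be handed to `pdftk $file cat "$first-$last" output $filename`, the same way the script above does it.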