2016-07-01 36 views
1

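Apache FOP: issue with Cyrillic characters

I am using the Apache FOP library to generate some PDF files in a Java 8 project. English content renders without any problem, but Russian characters come out garbled. They look like this: Ð#огÐ̧н

The problem appears to be related to encoding in some way, but how do I fix it?

Here is the class I use to generate the PDF: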

public class PdfGenerationTools implements StreamResource.StreamSource 
    { 
    String content; 

    public PdfGenerationTools(String content) { 
     this.content = content; 
    } 

    @Override 
    public InputStream getStream() 
    { 
     ByteArrayInputStream foStream = 
       new ByteArrayInputStream(content.getBytes(StringTools.UTF8)); 

     // Basic FOP configuration. You could create this object 
     // just once and keep it. 
     FopFactory fopFactory = FopFactory.newInstance(); 
     fopFactory.setStrictValidation(false); // For an example 

     // Configuration for this PDF document - mainly metadata 
     FOUserAgent userAgent = getFOUserAgent(fopFactory); 

     // Transform to PDF 
     ByteArrayOutputStream fopOut = new ByteArrayOutputStream(); 
     try { 
      Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, 
        userAgent, fopOut); 
      TransformerFactory factory = 
        TransformerFactory.newInstance(); 
      Transformer transformer = factory.newTransformer(); 
      Source src = new 
        javax.xml.transform.stream.StreamSource(foStream); 
      Result res = new SAXResult(fop.getDefaultHandler()); 
      transformer.transform(src, res); 
      fopOut.close(); 
      return new ByteArrayInputStream(fopOut.toByteArray()); 

     } catch (Exception e) { 
      e.printStackTrace(); 
     } 

     return null; 
    } 

    private FOUserAgent getFOUserAgent(FopFactory factory) 
    { 
     FOUserAgent userAgent = factory.newFOUserAgent(); 

     userAgent.setProducer("Company"); 
     userAgent.setCreationDate(new Date()); 
     userAgent.setTitle("Printing jobs"); 
     userAgent.setTargetResolution(300); // DPI 

     return userAgent; 
    } 

    public static String initDoc() 
    { 
     return "<?xml version='1.0' encoding='ISO-8859-1'?>"+ 
       "<fo:root xmlns:fo='http://www.w3.org/1999/XSL/Format'>"+ 
       "<fo:layout-master-set>"+ 
       "<fo:simple-page-master master-name='A4' margin='2cm'>"+ 
       "<fo:region-body />"+ 
       "</fo:simple-page-master>"+ 
       "</fo:layout-master-set>"+ 
       "<fo:page-sequence master-reference='A4'>"+ 
       "<fo:flow flow-name='xsl-region-body'>"; 
    } 

    public static String closeDoc() 
    { 
     return "</fo:flow>"+ 
       "</fo:page-sequence>"+ 
       "</fo:root>"; 
    } 

    public static String initTable() 
    { 
     return "<fo:block space-before.optimum=\"10pt\"></fo:block>" + 
       "<fo:table table-layout=\"fixed\" border-width=\"1mm\" border-style=\"solid\">" + 
       "<fo:table-column column-number=\"1\" column-width=\"50%\"/>" + 
       "<fo:table-column column-number=\"2\" column-width=\"50%\"/>" + 
       "<fo:table-body>"; 
    } 

    public static String closeTable() 
    { 
     return "</fo:table-body>" + 
       "</fo:table>"; 
    } 

    public static String initTableRow() 
    { 
     return "<fo:table-row keep-together.within-page=\"always\">"; 
    } 

    public static String closeTableRow() 
    { 
     return "</fo:table-row>"; 
    } 

    public static String getCell(String ... args) 
    { 
     final StringBuilder sb = new StringBuilder(); 
     sb.append("<fo:table-cell padding=\"1mm\" border-width=\"1mm\" border-style=\"double\">"); 

     for (String arg : args) 
     { 
      sb.append("<fo:block font-family=\"SansSerif\">") 
        .append(arg) 
        .append("</fo:block>"); 
     } 

     sb.append("</fo:table-cell>"); 

     return sb.toString(); 
    } 
} 

When I change the encoding from 'ISO-8859-1' to 'UTF-8', my Cyrillic text looks like this: "## ###". It seems I am missing a font here.

+1

This looks like multi-byte UTF-8 being interpreted as some single-byte ISO/Windows encoding. Beyond that, try a few small tests, for example http://www.javaranch.com/journal/200409/CreatingMultipleLanguagePDFusingApacheFOP.html
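
The effect described in this comment can be reproduced with the JDK alone. The following is a minimal sketch, not the asker's code: the class name `XmlEncodingDemo` and the helper `parseText` are made up for illustration. It serializes a small XML document to UTF-8 bytes (as `getStream()` does with the FO content) and parses it under two different declared encodings.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

public class XmlEncodingDemo {

    // Hypothetical helper: the bytes are always UTF-8, only the
    // encoding named in the XML declaration varies.
    static String parseText(String declaredEncoding, String text) throws Exception {
        String xml = "<?xml version='1.0' encoding='" + declaredEncoding + "'?>"
                + "<block>" + text + "</block>";
        byte[] utf8Bytes = xml.getBytes(StandardCharsets.UTF_8);
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(utf8Bytes))
                .getDocumentElement()
                .getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String russian = "\u041b\u043e\u0433\u0438\u043d"; // the word "Login" in Russian

        // Declaration says ISO-8859-1, so the parser decodes each UTF-8 byte
        // as a separate Latin-1 character and the text no longer matches.
        String mismatched = parseText("ISO-8859-1", russian);
        System.out.println(mismatched.equals(russian) + " (length " + mismatched.length() + ")");

        // Declaration matches the actual bytes, so the text survives intact.
        String matched = parseText("UTF-8", russian);
        System.out.println(matched.equals(russian) + " (length " + matched.length() + ")");
    }
}
```

If this reasoning applies to the class in the question, the declaration returned by `initDoc()` has to name the same encoding that `content.getBytes(...)` uses. Once they agree, any remaining "#" characters point to missing glyphs in the font rather than to an encoding problem.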

+1

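It could be a font configuration issue ([this answer of mine](http://stackoverflow.com/a/28251945/4453460) might come in handy) or an encoding issue. Adding a small FO snippet with Cyrillic characters would help you get an answer; otherwise nobody can try to reproduce your problem (see [MCVE](http://stackoverflow.com/help/mcve)). – lfurini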

+0

I have added a code snippet above to show how I generate the PDF content – user1053031

Answers

2

You have to use a FOP configuration file to tell FOP which fonts to embed in the PDF file, for example:

<?xml version="1.0" encoding="UTF-8"?> 
<fop version='1.0'> 
    <renderers> 
     <renderer mime='application/pdf'> 
      <fonts> 
       <!-- TTF fonts --> 
       <font kerning='yes' embed-url='c:\windows\fonts\arial.ttf'> 
        <font-triplet name='Arial' style='normal' weight='normal' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\arialbd.ttf'> 
        <font-triplet name='Arial' style='normal' weight='bold' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\ariali.ttf'> 
        <font-triplet name='Arial' style='italic' weight='normal' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\arialbi.ttf'> 
        <font-triplet name='Arial' style='italic' weight='bold' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\times.ttf'> 
        <font-triplet name='TimesNewRoman' style='normal' weight='normal' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\timesbd.ttf'> 
        <font-triplet name='TimesNewRoman' style='normal' weight='bold' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\timesi.ttf'> 
        <font-triplet name='TimesNewRoman' style='italic' weight='normal' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\timesbi.ttf'> 
        <font-triplet name='TimesNewRoman' style='italic' weight='bold' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\cour.ttf'> 
        <font-triplet name='CourierNew' style='normal' weight='normal' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\courbd.ttf'> 
        <font-triplet name='CourierNew' style='normal' weight='bold' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\couri.ttf'> 
        <font-triplet name='CourierNew' style='italic' weight='normal' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\courbi.ttf'> 
        <font-triplet name='CourierNew' style='italic' weight='bold' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\verdana.ttf'> 
        <font-triplet name='Verdana' style='normal' weight='normal' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\verdanab.ttf'> 
        <font-triplet name='Verdana' style='normal' weight='bold' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\verdanai.ttf'> 
        <font-triplet name='Verdana' style='italic' weight='normal' /> 
       </font> 
       <font kerning='yes' embed-url='c:\windows\fonts\verdanaz.ttf'> 
        <font-triplet name='Verdana' style='italic' weight='bold' /> 
       </font> 
      </fonts> 
     </renderer> 
    </renderers> 
</fop> 

How to use it:

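// Configure fopFactory as desired (FOP 1.x API).
// setUserConfig(File) throws SAXException and IOException.
FopFactory fopFactory = FopFactory.newInstance(); 
fopFactory.setUserConfig(new File("fop.xml")); 
// Create the user agent only after the factory has been configured.
FOUserAgent foUserAgent = fopFactory.newFOUserAgent(); 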
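
Registering a font in the configuration is only half of the fix: the FO content also has to ask for it by name. The question's `getCell` requests `font-family="SansSerif"`, which FOP resolves to one of its built-in base-14 fonts; those have no Cyrillic glyphs, and FOP prints "#" for every glyph it cannot find. The following is a minimal sketch of a cell builder that takes the font family as a parameter. The class name `FoCells` and the methods `cell` and `escapeXml` are hypothetical, and the family name must match a `font-triplet` name from the configuration above.

```java
public class FoCells {

    // Escapes the characters that would otherwise break the generated XML.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Builds one table cell; every argument becomes its own fo:block
    // rendered in the requested font family (for example "Arial").
    static String cell(String fontFamily, String... lines) {
        StringBuilder sb = new StringBuilder();
        sb.append("<fo:table-cell padding=\"1mm\" border-width=\"1mm\" border-style=\"double\">");
        for (String line : lines) {
            sb.append("<fo:block font-family=\"").append(escapeXml(fontFamily)).append("\">")
              .append(escapeXml(line))
              .append("</fo:block>");
        }
        sb.append("</fo:table-cell>");
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\u041b\u043e\u0433\u0438\u043d" is the word "Login" in Russian.
        System.out.println(cell("Arial", "\u041b\u043e\u0433\u0438\u043d", "R&D"));
    }
}
```

Escaping the cell text is an addition that the original `getCell` does not have; without it, a value containing `&` or `<` would produce FO that fails to parse.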
+0

I finally got back to this question. The problem is that I am working on Ubuntu 14, so the MS fonts are not available here :( – user1053031

+2

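You can use any font that contains Cyrillic characters. You can also install the MS fonts on Ubuntu: open the Ubuntu Software Center and search for "ttf-mscorefonts-installer", which installs Microsoft's core fonts.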
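
As an illustration of the "any font that contains Cyrillic characters" option, the sketch below generates a minimal `fop.xml` for a single TrueType font. Everything specific in it is an assumption: the class name `FopFontConfig`, the helper `configFor`, and in particular the font path, which points at DejaVu Sans (a free font that covers Cyrillic) in a location that varies between Ubuntu releases. Check the real path on your machine before using it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class FopFontConfig {

    // Escapes characters that are not allowed inside an XML attribute value.
    static String escapeAttr(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace("\"", "&quot;");
    }

    // Builds a FOP user configuration that embeds one TrueType font
    // under the given family name (normal style, normal weight only).
    static String configFor(String ttfPath, String family) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<fop version=\"1.0\">\n"
             + "  <renderers>\n"
             + "    <renderer mime=\"application/pdf\">\n"
             + "      <fonts>\n"
             + "        <font kerning=\"yes\" embed-url=\"" + escapeAttr(ttfPath) + "\">\n"
             + "          <font-triplet name=\"" + escapeAttr(family)
             + "\" style=\"normal\" weight=\"normal\"/>\n"
             + "        </font>\n"
             + "      </fonts>\n"
             + "    </renderer>\n"
             + "  </renderers>\n"
             + "</fop>\n";
    }

    public static void main(String[] args) throws IOException {
        // Assumed location of DejaVu Sans; adjust to your system.
        String config = configFor("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", "DejaVu Sans");
        Files.write(Paths.get("fop.xml"), config.getBytes(StandardCharsets.UTF_8));
    }
}
```

The FO blocks would then need `font-family="DejaVu Sans"` so that FOP actually selects the embedded font.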