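Apache FOP: issue with Cyrillic characters

I'm using the Apache FOP library in a Java 8 project to generate some PDF files. English content renders without any problems, but Russian characters come out garbled. They look like this: Ð#огÐ̧н
The problem seems to be related to encoding in some way, but how do I fix it?
Here is the class I use to generate the PDF: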
public class PdfGenerationTools implements StreamResource.StreamSource
{
    String content;

    public PdfGenerationTools(String content) {
        this.content = content;
    }
    @Override
    public InputStream getStream()
    {
        ByteArrayInputStream foStream =
            new ByteArrayInputStream(content.getBytes(StringTools.UTF8));
        // Basic FOP configuration. You could create this object
        // just once and keep it.
        FopFactory fopFactory = FopFactory.newInstance();
        fopFactory.setStrictValidation(false); // For an example
        // Configuration for this PDF document - mainly metadata
        FOUserAgent userAgent = getFOUserAgent(fopFactory);
        // Transform to PDF
        ByteArrayOutputStream fopOut = new ByteArrayOutputStream();
        try {
            Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF,
                userAgent, fopOut);
            TransformerFactory factory =
                TransformerFactory.newInstance();
            Transformer transformer = factory.newTransformer();
            Source src = new
                javax.xml.transform.stream.StreamSource(foStream);
            Result res = new SAXResult(fop.getDefaultHandler());
            transformer.transform(src, res);
            fopOut.close();
            return new ByteArrayInputStream(fopOut.toByteArray());
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }
    private FOUserAgent getFOUserAgent(FopFactory factory)
    {
        FOUserAgent userAgent = factory.newFOUserAgent();
        userAgent.setProducer("Company");
        userAgent.setCreationDate(new Date());
        userAgent.setTitle("Printing jobs");
        userAgent.setTargetResolution(300); // DPI
        return userAgent;
    }
    public static String initDoc()
    {
        return "<?xml version='1.0' encoding='ISO-8859-1'?>"+
            "<fo:root xmlns:fo='http://www.w3.org/1999/XSL/Format'>"+
            "<fo:layout-master-set>"+
            "<fo:simple-page-master master-name='A4' margin='2cm'>"+
            "<fo:region-body />"+
            "</fo:simple-page-master>"+
            "</fo:layout-master-set>"+
            "<fo:page-sequence master-reference='A4'>"+
            "<fo:flow flow-name='xsl-region-body'>";
    }

    public static String closeDoc()
    {
        return "</fo:flow>"+
            "</fo:page-sequence>"+
            "</fo:root>";
    }
    public static String initTable()
    {
        return "<fo:block space-before.optimum=\"10pt\"></fo:block>" +
            "<fo:table table-layout=\"fixed\" border-width=\"1mm\" border-style=\"solid\">" +
            "<fo:table-column column-number=\"1\" column-width=\"50%\"/>" +
            "<fo:table-column column-number=\"2\" column-width=\"50%\"/>" +
            "<fo:table-body>";
    }

    public static String closeTable()
    {
        return "</fo:table-body>" +
            "</fo:table>";
    }

    public static String initTableRow()
    {
        return "<fo:table-row keep-together.within-page=\"always\">";
    }

    public static String closeTableRow()
    {
        return "</fo:table-row>";
    }
    public static String getCell(String ... args)
    {
        final StringBuilder sb = new StringBuilder();
        sb.append("<fo:table-cell padding=\"1mm\" border-width=\"1mm\" border-style=\"double\">");
        for (String arg : args)
        {
            sb.append("<fo:block font-family=\"SansSerif\">")
                .append(arg)
                .append("</fo:block>");
        }
        sb.append("</fo:table-cell>");
        return sb.toString();
    }
}
When I change the encoding from 'ISO-8859-1' to 'UTF-8', my Cyrillic text looks like this: "## ###". It seems I'm missing a font here.
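The "#" output is consistent with the font, not the encoding, being the remaining problem. As a rough diagnostic, the sketch below checks whether a string fits a Latin-only character set. It rests on an assumption: that the font FOP ends up using for `SansSerif` is one of the standard PDF fonts limited to a WinAnsi-like repertoire, and that Java's `windows-1252` charset is a close enough stand-in for that repertoire. `GlyphCoverageCheck` and `fitsWinAnsi` are hypothetical names, not part of FOP.

```java
import java.nio.charset.Charset;

class GlyphCoverageCheck {
    // Assumption: windows-1252 approximates the character set of the
    // Latin-only standard PDF fonts. This does not query FOP itself.
    static boolean fitsWinAnsi(String text) {
        return Charset.forName("windows-1252").newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // A sample Cyrillic word, written with Unicode escapes so the
        // source file's own encoding cannot interfere.
        String cyrillic = "\u043B\u043E\u0433\u0438\u043D";
        System.out.println("Latin text fits:    " + fitsWinAnsi("Login"));
        System.out.println("Cyrillic text fits: " + fitsWinAnsi(cyrillic));
    }
}
```

If the Cyrillic check fails, the likely direction is to configure FOP with a font that actually contains Cyrillic glyphs and reference it in `font-family`. How to do that depends on the FOP version, so it is not shown here.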
This looks like multi-byte UTF-8 being interpreted as some single-byte ISO/Windows encoding. For the rest, try some small tests, for example http://www.javaranch.com/journal/200409/CreatingMultipleLanguagePDFusingApacheFOP.html
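To make that diagnosis concrete, here is a minimal sketch using only the JDK. It encodes a sample Cyrillic word as UTF-8 and decodes the bytes as ISO-8859-1, which produces the same kind of "Ð…" garbage as in the question. `MojibakeDemo`, `misdecode` and `repair` are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode as UTF-8, then (wrongly) decode the bytes as ISO-8859-1.
    static String misdecode(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8),
                          StandardCharsets.ISO_8859_1);
    }

    // Reverse the mistake: get the raw bytes back and decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1),
                          StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A sample Cyrillic word (5 letters), written with Unicode escapes.
        String word = "\u043B\u043E\u0433\u0438\u043D";
        String garbled = misdecode(word);
        // Each Cyrillic letter takes 2 bytes in UTF-8, so 5 letters become 10 characters.
        System.out.println(garbled.length());
        System.out.println(repair(garbled).equals(word));
    }
}
```

Every Cyrillic letter in this range starts with the byte 0xD0 or 0xD1 in UTF-8, which is why the garbled text is full of "Ð" and "Ñ" characters.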
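It could be a font configuration problem ([this answer of mine](http://stackoverflow.com/a/28251945/4453460) may come in handy) or an encoding problem. Adding a small FO snippet with Cyrillic characters would help you get an answer; otherwise it is not possible to try to reproduce your problem (see [MCVE](http://stackoverflow.com/help/mcve)). – lfurini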
I have added a code snippet above to show how I generate the PDF content – user1053031
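Along the lines of the small FO snippet lfurini asked for, the sketch below builds a minimal FO document containing one Cyrillic word, serializes it as UTF-8 bytes (as `getStream()` does), and parses it with the JDK's XML parser to see what text the parser actually recovers. It compares an XML declaration of `UTF-8` with the `ISO-8859-1` declaration used in `initDoc()`. It checks the XML layer only and does not involve FOP or fonts. `FoEncodingCheck`, `minimalFo` and `parsedBlockText` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class FoEncodingCheck {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    // Builds a tiny FO document; declaredEncoding goes into the XML declaration.
    static String minimalFo(String declaredEncoding, String text) {
        return "<?xml version='1.0' encoding='" + declaredEncoding + "'?>"
            + "<fo:root xmlns:fo='" + FO_NS + "'>"
            + "<fo:layout-master-set>"
            + "<fo:simple-page-master master-name='A4' margin='2cm'>"
            + "<fo:region-body/>"
            + "</fo:simple-page-master>"
            + "</fo:layout-master-set>"
            + "<fo:page-sequence master-reference='A4'>"
            + "<fo:flow flow-name='xsl-region-body'>"
            + "<fo:block>" + text + "</fo:block>"
            + "</fo:flow></fo:page-sequence></fo:root>";
    }

    // Serializes the document as UTF-8 bytes, parses it, and returns the
    // text content of the first fo:block as the parser sees it.
    static String parsedBlockText(String declaredEncoding, String text) {
        try {
            byte[] bytes = minimalFo(declaredEncoding, text)
                .getBytes(StandardCharsets.UTF_8);
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
            return doc.getElementsByTagNameNS(FO_NS, "block")
                .item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse FO snippet", e);
        }
    }

    public static void main(String[] args) {
        String word = "\u043B\u043E\u0433\u0438\u043D"; // sample Cyrillic word
        // Declaration matches the bytes: the text should survive intact.
        System.out.println(word.equals(parsedBlockText("UTF-8", word)));
        // Declaration says ISO-8859-1 but the bytes are UTF-8: the text is garbled.
        System.out.println(word.equals(parsedBlockText("ISO-8859-1", word)));
    }
}
```

If the first check passes and the PDF still shows "#" characters, the encoding side is fine and the remaining suspect is the font configuration.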