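PdfTextExtractor.GetTextFromPage is not returning the correct text
Using iTextSharp, I have the following code, which successfully pulls out the PDF text for the majority of the PDFs I am trying to read: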
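PdfReader reader = new PdfReader(fileName);
// iTextSharp page numbers are 1-based, so the loop runs from 1 to NumberOfPages inclusive
for (int i = 1; i <= reader.NumberOfPages; i++)
{
    // append the text extracted from page i
    text += PdfTextExtractor.GetTextFromPage(reader, i);
}
reader.Close();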
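However, some of my PDFs contain XFA forms (which have already been filled out), and this causes the "text" field to be populated with the following garbage: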
"Please wait... \n \nIf this message is not eventually replaced by the proper contents of the document, your PDF \nviewer may not be able to display this type of document. \n \nYou can upgrade to the latest version of Adobe Reader for Windows®, Mac, or Linux® by \nvisiting http://www.adobe.com/products/acrobat/readstep2.html. \n \nFor more assistance with Adobe Reader visit http://www.adobe.com/support/products/\nacrreader.html. \n \nWindows is either a registered trademark or a trademark of Microsoft Corporation in the United States and/or other countries. Mac is a trademark \nof Apple Inc., registered in the United States and other countries. Linux is the registered trademark of Linus Torvalds in the U.S. and other \ncountries."
How can I fix this? I tried flattening the PDF using iTextSharp's PdfStamper [1], but that did not work: the resulting stream contained the same garbage text.
[1] How to flatten already filled out PDF form using iTextSharp