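I have not used Java before. I want to read a table from a PDF file using the iText Java library, but I do not have much idea how to process PDF tables. How should I proceed? How do I read a table in a PDF using iText in Java?
Answers
My solution: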
package com.geek.tutorial.itext.table;

import java.io.FileOutputStream;
import com.lowagie.text.pdf.PdfPTable;
import com.lowagie.text.pdf.PdfPCell;
import com.lowagie.text.pdf.PdfWriter;
import com.lowagie.text.Document;
import com.lowagie.text.Paragraph;

public class SimplePDFTable
{
    public SimplePDFTable() throws Exception
    {
        Document document = new Document();
        PdfWriter.getInstance(document,
            new FileOutputStream("SimplePDFTable.pdf"));
        document.open();
        // Create a table with two columns; cells fill it row by row.
        PdfPTable table = new PdfPTable(2);
        // First row
        table.addCell("1");
        table.addCell("2");
        // Second row
        table.addCell("3");
        table.addCell("4");
        // Third row
        table.addCell("5");
        table.addCell("6");
        // Add the finished table to the document and write the file.
        document.add(table);
        document.close();
    }

    public static void main(String[] args)
    {
        try
        {
            SimplePDFTable pdfTable = new SimplePDFTable();
        }
        catch(Exception e)
        {
            System.out.println(e);
        }
    }
}
This code writes a table to a PDF... I want to read a table from a PDF. – vijaykamma
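You can extract the text from the content stream, but for ordinary PDF files the result will be plain text (without any structure). If there is a table on the page, that table will not be recognized as such. You will get the content and some white space, but that is not a tabular structure! Only if you have a tagged PDF can you obtain an XML file. If the PDF contains tags that are recognized as table tags, this will be reflected in the XML.
This is what I found here.
@soumitra Still no solution?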
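To illustrate what "content and some white space" can mean in practice, here is a minimal sketch that takes one line of already-extracted text and splits it into cells wherever two or more whitespace characters occur. The class name RowSplitter, the sample line, and the two-space threshold are assumptions made for illustration only. Extracted text often separates columns with a single space, in which case this heuristic does not work.

```java
import java.util.ArrayList;
import java.util.List;

public class RowSplitter {
    // Splits one line of extracted text into cells on runs of
    // two or more whitespace characters (a heuristic, not a table parser).
    public static List<String> splitRow(String line) {
        List<String> cells = new ArrayList<>();
        for (String cell : line.trim().split("\\s{2,}")) {
            if (!cell.isEmpty()) {
                cells.add(cell);
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        // Hypothetical sample line; real extracted text will differ.
        String line = "NORTHERN REGION   THERMAL   1234.5";
        System.out.println(splitRow(line));
    }
}
```

Single spaces inside a cell (such as "NORTHERN REGION") are preserved, which is why the threshold is two characters rather than one.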
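To read the contents of a table from a PDF file, you just need to convert the PDF into a text file using any API (I used iText's PdfTextExtractor.getTextFromPage()) and then read that text file with a Java program. Once it has been read, the main task is done. You then have to filter out the data you need, which you can do by repeatedly applying the split method of the String class until you find the record you want.
Below is my code, in which I extract part of the records from a PDF file and write them to a .csv file. You can view the PDF file at: http://www.cea.nic.in/reports/monthly/generation_rep/actual/jan13/opm_02.pdf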
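The repeated-split idea can be sketched as a small helper: split the text once on a label, then split the remainder on whitespace and pick a token by position. The class name LabelValue, the method tokenAfter, and the sample text are hypothetical and not part of the original answer. The null checks are an addition so that a missing label does not cause an ArrayIndexOutOfBoundsException.

```java
import java.util.regex.Pattern;

public class LabelValue {
    // Returns the n-th (0-based) whitespace-separated token that follows the
    // first occurrence of label in text, or null if there is no such token.
    public static String tokenAfter(String text, String label, int n) {
        // Limit 2: everything after the first occurrence stays in parts[1].
        String[] parts = text.split(Pattern.quote(label), 2);
        if (parts.length < 2) {
            return null; // label not found
        }
        String[] tokens = parts[1].trim().split("\\s+");
        if (n < 0 || n >= tokens.length || tokens[n].isEmpty()) {
            return null;
        }
        return tokens[n];
    }

    public static void main(String[] args) {
        // Hypothetical extracted text; the numbers are made up.
        String text = "REGION A THERMAL 10.5 20.1 HYDRO 3.2";
        System.out.println(tokenAfter(text, "THERMAL", 0));
        System.out.println(tokenAfter(text, "HYDRO", 0));
        System.out.println(tokenAfter(text, "NUCLEAR", 0));
    }
}
```

Pattern.quote is used because String.split treats its argument as a regular expression, so a label containing characters such as "." or "(" would otherwise be misinterpreted.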
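// Imports needed by this method (assuming iText 5.x, where
// PdfTextExtractor.getTextFromPage is a static method):
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

public static void genrateCsvMonth_Region(String pdfpath, String csvpath) {
    try {
        // Open in append mode and close at once so that the CSV file
        // exists before it is read below.
        BufferedWriter writer1 = new BufferedWriter(new FileWriter(csvpath,
                true));
        writer1.close();
        // Check whether the file is empty.
        BufferedReader br = new BufferedReader(new FileReader(csvpath));
        boolean empty = (br.readLine() == null);
        br.close();
        if (empty) {
            // Write the header row only once, into an empty file.
            BufferedWriter writer = new BufferedWriter(new FileWriter(
                    csvpath, true));
            writer.append("REGION,");
            writer.append("YEAR,");
            writer.append("MONTH,");
            writer.append("THERMAL,");
            writer.append("NUCLEAR,");
            writer.append("HYDRO,");
            writer.append("TOTAL\n");
            writer.close();
        }
        // Read the PDF file.
        PdfReader reader = new PdfReader(pdfpath);
        BufferedWriter writer = new BufferedWriter(new FileWriter(csvpath,
                true));
        // Extract the text of page 1 into a String.
        String page = PdfTextExtractor.getTextFromPage(reader, 1);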
        // Extract month and year from the text.
        String[] period1 = page.split("PEROID");
        String[] period2 = period1[0].split(":");
        String[] month = period2[1].split("-");
        String[] period3 = month[1].split("ENERGY");
        String[] year = period3[0].split("VIS");

        // Extract the northern region.
        String[] northen = page.split("NORTHEN REGION");
        String[] nthermal1 = northen[0].split("THERMAL");
        String[] nthermal2 = nthermal1[1].split(" ");
        String[] nnuclear1 = northen[0].split("NUCLEAR");
        String[] nnuclear2 = nnuclear1[1].split(" ");
        String[] nhydro1 = northen[0].split("HYDRO");
        String[] nhydro2 = nhydro1[1].split(" ");
        String[] ntotal1 = northen[0].split("TOTAL");
        String[] ntotal2 = ntotal1[1].split(" ");
        // Append the filtered data to the CSV file.
        writer.append("NORTHEN" + ",");
        writer.append(year[0] + ",");
        writer.append(month[0] + ",");
        writer.append(nthermal2[4] + ",");
        writer.append(nnuclear2[4] + ",");
        writer.append(nhydro2[4] + ",");
        writer.append(ntotal2[4] + "\n");

        // Extract the western region.
        String[] western = page.split("WESTERN");
        String[] wthermal1 = western[1].split("THERMAL");
        String[] wthermal2 = wthermal1[1].split(" ");
        String[] wnuclear1 = western[1].split("NUCLEAR");
        String[] wnuclear2 = wnuclear1[1].split(" ");
        String[] whydro1 = western[1].split("HYDRO");
        String[] whydro2 = whydro1[1].split(" ");
        String[] wtotal1 = western[1].split("TOTAL");
        String[] wtotal2 = wtotal1[1].split(" ");
        // Append the filtered data to the CSV file.
        writer.append("WESTERN" + ",");
        writer.append(year[0] + ",");
        writer.append(month[0] + ",");
        writer.append(wthermal2[4] + ",");
        writer.append(wnuclear2[4] + ",");
        writer.append(whydro2[4] + ",");
        writer.append(wtotal2[4] + "\n");

        // Extract the southern region.
        String[] southern = page.split("SOUTHERN");
        String[] sthermal1 = southern[1].split("THERMAL");
        String[] sthermal2 = sthermal1[1].split(" ");
        String[] snuclear1 = southern[1].split("NUCLEAR");
        String[] snuclear2 = snuclear1[1].split(" ");
        String[] shydro1 = southern[1].split("HYDRO");
        String[] shydro2 = shydro1[1].split(" ");
        String[] stotal1 = southern[1].split("TOTAL");
        String[] stotal2 = stotal1[1].split(" ");
        // Append the filtered data to the CSV file.
        writer.append("SOUTHERN" + ",");
        writer.append(year[0] + ",");
        writer.append(month[0] + ",");
        writer.append(sthermal2[4] + ",");
        writer.append(snuclear2[4] + ",");
        writer.append(shydro2[4] + ",");
        writer.append(stotal2[4] + "\n");

        // Extract the eastern region (no NUCLEAR value is read for it).
        String[] eastern = page.split("EASTERN");
        String[] ethermal1 = eastern[1].split("THERMAL");
        String[] ethermal2 = ethermal1[1].split(" ");
        String[] ehydro1 = eastern[1].split("HYDRO");
        String[] ehydro2 = ehydro1[1].split(" ");
        String[] etotal1 = eastern[1].split("TOTAL");
        String[] etotal2 = etotal1[1].split(" ");
        // Append the filtered data to the CSV file.
        writer.append("EASTERN" + ",");
        writer.append(year[0] + ",");
        writer.append(month[0] + ",");
        writer.append(ethermal2[4] + ",");
        writer.append(" " + ",");
        writer.append(ehydro2[4] + ",");
        writer.append(etotal2[4] + "\n");

        // Extract the north-eastern region (no NUCLEAR value is read for it).
        String[] neestern = page.split("NORTH");
        String[] nethermal1 = neestern[2].split("THERMAL");
        String[] nethermal2 = nethermal1[1].split(" ");
        String[] nehydro1 = neestern[2].split("HYDRO");
        String[] nehydro2 = nehydro1[1].split(" ");
        String[] netotal1 = neestern[2].split("TOTAL");
        String[] netotal2 = netotal1[1].split(" ");
        // Append the filtered data to the CSV file.
        writer.append("NORTH EASTERN" + ",");
        writer.append(year[0] + ",");
        writer.append(month[0] + ",");
        writer.append(nethermal2[4] + ",");
        writer.append(" " + ",");
        writer.append(nehydro2[4] + ",");
        writer.append(netotal2[4] + "\n");
        writer.close();
        // Release the PDF file once extraction is finished.
        reader.close();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
}
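The "write the header only if the CSV is empty" step in the method above could also be sketched with java.nio.file, which avoids opening the file three times. This is only an alternative sketch, not the answerer's code: the class name CsvAppender, the method appendRow, and the sample rows are assumptions for illustration, and the values in main are made up.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CsvAppender {
    public static final String HEADER =
            "REGION,YEAR,MONTH,THERMAL,NUCLEAR,HYDRO,TOTAL\n";

    // Appends one row to the CSV file, writing the header first
    // if the file does not exist yet or is empty.
    public static void appendRow(Path csv, String row) throws IOException {
        boolean needsHeader = !Files.exists(csv) || Files.size(csv) == 0;
        String text = (needsHeader ? HEADER : "") + row + "\n";
        Files.write(csv, text.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical rows written to a temporary file.
        Path csv = Files.createTempFile("regions", ".csv");
        appendRow(csv, "NORTHERN,2013,JAN,1.0,2.0,3.0,6.0");
        appendRow(csv, "WESTERN,2013,JAN,4.0,5.0,6.0,15.0");
        System.out.print(new String(Files.readAllBytes(csv), StandardCharsets.UTF_8));
    }
}
```

Because the emptiness check and the write happen in one call, each region's row can be appended without repeating the header logic.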
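It would really help if you made your question more specific and added some source code showing what you have done so far, what you have tried, and what has not worked. – brimborium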
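I have not used Java. I have used the command line (java -jar pdfbox-app-x.y.z.jar ExtractText [OPTIONS] <PDF file> [Text File]) to convert the PDF into some format from which I can process the PDF table, but I do not have much idea how to process it. I have also used the code from http://www.roseindia.net/tutorial/java/itext/convertpdfToTextFile.html.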