You can extract the metadata names as follows (my example parses an XML file; you can simply change it to a PDF parser, or use the auto-detect parser):
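import java.io.File;
import java.io.FileInputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.xml.XMLParser;
import org.apache.tika.sax.BodyContentHandler;

// Run the following inside a method that declares "throws Exception".
// Set up the content handler (-1 disables the write limit), the metadata holder, and the input
BodyContentHandler handler = new BodyContentHandler(-1);
Metadata metadata = new Metadata();
File inFile = new File("example.xml");
FileInputStream inputstream = new FileInputStream(inFile);
ParseContext pcontext = new ParseContext();
// XML parser; swap in PDFParser or AutoDetectParser for other file types
XMLParser xmlparser = new XMLParser();
xmlparser.parse(inputstream, handler, metadata, pcontext);
inputstream.close();
System.out.println("Metadata of the document:");
String[] metadataNames = metadata.names(); // now we have all the metadata tag names here
for (String name : metadataNames) {
    // Check for the particular tag names you need; use equals(), since == only compares references
    if ("Your Particular Tag".equals(name)) {
        System.out.println(name + ": " + metadata.get(name));
    }
}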
'xpdf' provides the utility 'pdfinfo', which gives metadata information for PDFs – devnull
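Put the metadata into a temp file, grep for the metadata keys of interest, and use awk to split out the values? Or something more specific / in a different language / etc? – Gagravarr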
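If you would rather stay in Java than pipe through grep and awk, the same "pick out the keys, split off the values" step can be sketched roughly like this. It assumes the metadata has been dumped as text with one `Key: value` pair per line. That layout is an assumption here, so check what your tool actually prints. The class name `PdfInfoFields` and the sample text are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PdfInfoFields {
    // Parses "Key: value" lines into an ordered map.
    // Splits on the first colon only, so values that contain colons (e.g. timestamps) stay intact.
    static Map<String, String> parse(String output) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : output.split("\\R")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // skip blank lines and lines without a key
            }
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            fields.put(key, value);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical sample text, not captured from a real run
        String sample = "Title:          Example\n"
                + "CreationDate:   Mon Jan  1 12:00:00 2018\n"
                + "Pages:          3\n";
        Map<String, String> fields = parse(sample);
        System.out.println("CreationDate = " + fields.get("CreationDate"));
    }
}
```

Looking a key up with `fields.get("Title")` then plays the role of the grep step, and the split on the first colon plays the role of awk.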