2014-07-19 25 views

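Regular expression for fully qualified Java class names

I want to use a regular expression to extract fully qualified class names from Java/JSP source code.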

There are already a few threads on this, in particular: Regular expression matching fully qualified class names

Although I am very close to a solution, I cannot get rid of the false positives.

Here are some examples. The expected value is appended at the end of each line.

Logger l = LoggerFactory.getLogger("test");   // not a FQN, because it starts with an uppercase letter ("LoggerFactory") 
if(!com.db.TFSec.isPermitted("test")) return;  // should return "com.db.TFSec" 
new java.util.concurrent.BrokenException();   // java.util.concurrent.BrokenException 
java.util.Set<Log> ls = new java.util.HashSet<>(); // java.util.Set, java.util.HashSet 
java.awt.Component c1d2 = new java.awt.List();  // java.awt.Component, java.awt.List 
com.de.tfsecurity.TFUser u;       // com.de.tfsecurity.TFUser 

I have already tried these 3 regular expressions:

// Own try. Only one false positive in line1: [oggerFactory.getLogger("test")] 
([a-z]\\w*\\.\\w+(\\.\\w+)*)[<\\(;] 

// The following two regexes are the "correct" answers from the thread mentioned above. But I get false positives. 
([a-zA-Z_$][a-zA-Z\\d_$]*\\.)*[a-zA-Z_$][a-zA-Z\\d_$]* // false positives: [Logger, LoggerFactory.getLogger, test, if, return, new, c1d2] etc. 
([a-z][a-z_0-9]*\\.)*[A-Z_]($[A-Z_]|[\\w_])*   // false positives: the same as in the previous example 
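
One likely source of the false positives from the first thread pattern: its package-prefix group is quantified with `*`, so zero package segments are allowed and every bare identifier (keywords and variable names included) matches on its own. The following is a minimal sketch that reproduces this on one of the sample lines. The class name `OptionalPrefixDemo` and the helper `findAll` are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalPrefixDemo {
    // Pattern quoted from the linked thread; note the '*' after the prefix group.
    static final Pattern THREAD_PATTERN =
            Pattern.compile("([a-zA-Z_$][a-zA-Z\\d_$]*\\.)*[a-zA-Z_$][a-zA-Z\\d_$]*");

    // Collects every match of the pattern in the given line (hypothetical helper).
    static List<String> findAll(Pattern pattern, String line) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = pattern.matcher(line);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Prints [java.awt.Component, c1d2, new, java.awt.List]:
        // "c1d2" and "new" match because the dotted prefix is optional.
        System.out.println(findAll(THREAD_PATTERN,
                "java.awt.Component c1d2 = new java.awt.List();"));
    }
}
```

Requiring at least one dotted segment (`+` instead of `*`) would remove the bare identifiers, but it would presumably still accept `LoggerFactory.getLogger`, so it is not sufficient by itself.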

Here is my source code:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FileUsageScanner {
    // My own attempt. Works most of the time, but yields a false positive for LoggerFactory.getLogger, which is not an FQN
    private final Pattern fqnPatternOwnTry = Pattern.compile("([a-z]\\w*\\.\\w+(\\.\\w+)*)[<\\(;]");
    // Solutions from https://stackoverflow.com/questions/5205339/regular-expression-matching-fully-qualified-java-classes
    // Lots of false positives like: [Logger, LoggerFactory.getLogger, test, if, return, new, c1d2] etc.
    private final Pattern fqnPatternThr = Pattern.compile("([a-zA-Z_$][a-zA-Z\\d_$]*\\.)*[a-zA-Z_$][a-zA-Z\\d_$]*");
    private final Pattern fqnPatternThr2 = Pattern.compile("([a-z][a-z_0-9]*\\.)*[A-Z_]($[A-Z_]|[\\w_])*");

    public static void main(String[] args) throws IOException {
        FileUsageScanner scan = new FileUsageScanner();
        scan.getFQClassname("Logger logger = LoggerFactory.getLogger(\"test\");"); // not a FQN
        scan.getFQClassname("if(!com.db.TFSec.isPermitted(\"test\")) return;"); // com.db.TFSec
        scan.getFQClassname("new java.util.concurrent.BrokenException();"); // java.util.concurrent.BrokenException
        scan.getFQClassname("java.util.Set<Log> loggers = new java.util.HashSet<>();"); // java.util.Set, java.util.HashSet
        scan.getFQClassname("java.awt.Component c1d2 = new java.awt.List();"); // java.awt.Component, java.awt.List
        scan.getFQClassname("com.de.tfsecurity.TFUser u;"); // com.de.tfsecurity.TFUser
    }

    // Returns all matches of the pattern under test in the line, or null if there are none
    private List<String> getFQClassname(String line) {
        if (line != null && !line.isEmpty() && line.contains(".")) {
            Matcher matcher = fqnPatternThr2.matcher(line);
            List<String> l = null;
            while (matcher.find()) {
                if (l == null) {
                    l = new ArrayList<String>();
                }
                l.add(matcher.group());
            }
            if (l != null) {
                System.out.println("Found FQN in " + line + " -> " + l);
            }
            return l;
        }
        return null;
    }
}

How can I get rid of the false positives?

Thanks for any input,

Bernhard


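The following FQNs *are* legal (convention is one thing, what Java considers valid is another): 'My.Packages.With.UpperCase', 'a.package.aClass'. You are assuming two or more lowercase package levels followed by a camel-case class name. If so, please state that explicitly in the question. – tucuxi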


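Thanks. 'My.Packages.With.UpperCase': OK, I was not aware that this is possible, because I have never seen uppercase package names (I have only browsed through a bunch of jar libraries). Yes, I am assuming a package name like low.lower.UppercaseClassname. Maybe that is only a convention, but I can live with it, even if the Java specification says otherwise. – Bernie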
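
The convention agreed on in the comments above (one or more lowercase package segments followed by a class name that starts with an uppercase letter) could be encoded roughly as follows. This is only a sketch under that assumption, not a general FQN matcher. The class name `ConventionFqnSketch`, the method `extract`, and the pattern itself are illustrative and do not come from the question or the linked thread. The negative lookbehind `(?<![\w$.])` is there so that a match cannot begin in the middle of an identifier or directly after a dot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConventionFqnSketch {
    // Assumed convention: lower.lower.UpperCaseName
    //   (?<![\w$.])            not preceded by an identifier character or a dot
    //   (?:[a-z][a-z0-9_]*\.)+ one or more lowercase package segments
    //   [A-Z][A-Za-z0-9_$]*    class name starting with an uppercase letter
    private static final Pattern FQN = Pattern.compile(
            "(?<![\\w$.])(?:[a-z][a-z0-9_]*\\.)+[A-Z][A-Za-z0-9_$]*");

    // Returns every convention-style FQN found in the line (possibly empty).
    static List<String> extract(String line) {
        List<String> result = new ArrayList<>();
        Matcher matcher = FQN.matcher(line);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String[] samples = {
            "Logger l = LoggerFactory.getLogger(\"test\");",        // []
            "if(!com.db.TFSec.isPermitted(\"test\")) return;",      // [com.db.TFSec]
            "new java.util.concurrent.BrokenException();",          // [java.util.concurrent.BrokenException]
            "java.util.Set<Log> ls = new java.util.HashSet<>();",   // [java.util.Set, java.util.HashSet]
            "java.awt.Component c1d2 = new java.awt.List();",       // [java.awt.Component, java.awt.List]
            "com.de.tfsecurity.TFUser u;"                           // [com.de.tfsecurity.TFUser]
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + extract(sample));
        }
    }
}
```

Known limits of this sketch: it does not skip string literals or comments, it stops at the first uppercase segment (so a nested type such as `java.util.Map.Entry` is reported as `java.util.Map`), and it deliberately rejects legal but unconventional names such as 'My.Packages.With.UpperCase'.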

Answers