Regular expressions and WildCard matching with Java's split method

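I know there are similar questions that have been asked before, but I want to do a custom operation and I don't know how to go about it. I want to split a string of data using something like a regular expression, but this time I know the starting characters and the ending characters: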

String myString="Google is a great search engine<as:...s>";

The <as: and s> are the start and end characters; the ... is dynamic, and I cannot predict its value.

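I want to be able to split the string on everything from the start <as: to the end s>, together with the dynamic string between them.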

Like:

myString.split("<as:/*s>");

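Something like that. I also want to get every occurrence of <as:...s> in the string. I know this can be done with regular expressions, but I have never done it. I need a simple and neat way to do this. Thanks in advance.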

Source

2017-06-07 Chukwu Remijius

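Instead of using .split(), I would just extract the text with Pattern and Matcher. This approach finds everything between <as: and s> and extracts it into a capture group. Group 1 then holds the text you want.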

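import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args)
{
    final String myString = "Google is a great search engine<as:Some stuff heres>";

    // Group 1 captures everything between "<as:" and the final "s>".
    Pattern pat = Pattern.compile("^[^<]+<as:(.*)s>$");

    Matcher m = pat.matcher(myString);
    if (m.matches()) {
        System.out.println(m.group(1));
    }
}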

Output:

Some stuff here

If you need the text at the beginning, you can put it in a capture group as well.
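A minimal sketch of that suggestion, assuming the input always has the shape "leading text, then a single <as:...s> marker at the end". The class name PrefixAndTag and the method parse are made up for illustration and are not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named here only for illustration.
class PrefixAndTag {
    // Group 1: the text before the marker.
    // Group 2: the text between "<as:" and the final "s>".
    private static final Pattern PAT = Pattern.compile("^([^<]+)<as:(.*)s>$");

    // Returns {prefix, contents}, or null when the input does not have that shape.
    static String[] parse(String input) {
        Matcher m = PAT.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] r = parse("Google is a great search engine<as:Some stuff heres>");
        System.out.println(r[0]); // Google is a great search engine
        System.out.println(r[1]); // Some stuff here
    }
}
```

Because the pattern is anchored with ^ and $ and used with matches(), it only handles one marker per string; inputs with several markers need a different approach.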

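Edit: if there are multiple <as:...s> markers in the input, the following will collect all of them. Edit 2: added more logic, including a check for empty strings.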

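import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static List<String> multiEntry(final String myString)
{
    // Everything before the first "<as:" is plain text; each later part
    // starts with the contents of one marker.
    String[] parts = myString.split("<as:");

    List<String> col = new ArrayList<>();
    if (! parts[0].trim().isEmpty()) {
        col.add(parts[0]);
    }

    // Group 1: marker contents up to the first "s>"; group 2: any text after it.
    Pattern pat = Pattern.compile("^(.*?)s>(.*)?");
    for (int i = 1; i < parts.length; ++i) {
        Matcher m = pat.matcher(parts[i]);
        if (m.matches()) {
            for (int j = 1; j <= m.groupCount(); ++j) {
                String s = m.group(j).trim();
                if (! s.isEmpty()) {
                    col.add(s);
                }
            }
        }
    }

    return col;
}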

Output:

[Google is a great search engine, Some stuff heress, Here is Facebook, More Stuff, Something else at the end]

Edit 3: this approach uses find and a loop to do the parsing. It also uses optional capture groups.

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void looping()
{
    final String myString = "Google is a great search engine"
            + "<as:Some stuff heresss>Here is Facebook<as:More Stuffs>"
            + "Something else at the end"
            + "<as:Stuffs>"
            + "<as:Yet More Stuffs>";

    // Group 1: optional plain text; group 3: optional marker contents.
    Pattern pat = Pattern.compile("([^<]+)?(<as:(.*?)s>)?");

    Matcher m = pat.matcher(myString);
    List<String> col = new ArrayList<>();

    while (m.find()) {
        String prefix = m.group(1);
        String contents = m.group(3);

        // A group that did not take part in the match is null.
        if (prefix != null) { col.add(prefix); }
        if (contents != null) { col.add(contents); }
    }

    System.out.println(col);
}

Output:

[Google is a great search engine, Some stuff heress, Here is Facebook, More Stuff, Something else at the end, Stuff, Yet More Stuff]
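For comparison, here is a sketch that stays closer to the split call in the question. It is a different technique from the answer's: one lazy pattern, <as:(.*?)s>, is used with Pattern.split to get the text outside the markers and with Matcher.find to get the text inside them. The class name AsTagSplitter and the methods outside and inside are made up for illustration, and the sketch assumes the markers never span line breaks, since . does not match line terminators by default.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named here only for illustration.
class AsTagSplitter {
    // The lazy ".*?" stops at the first "s>" that follows each "<as:".
    private static final Pattern TAG = Pattern.compile("<as:(.*?)s>");

    // Text outside the markers; blank pieces (e.g. between adjacent markers) are dropped.
    static List<String> outside(String input) {
        List<String> out = new ArrayList<>();
        for (String piece : TAG.split(input)) {
            if (!piece.trim().isEmpty()) {
                out.add(piece);
            }
        }
        return out;
    }

    // Text inside each marker, in order of appearance.
    static List<String> inside(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TAG.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "Google is a great search engine"
                + "<as:Some stuff heresss>Here is Facebook<as:More Stuffs>"
                + "Something else at the end<as:Stuffs><as:Yet More Stuffs>";
        System.out.println(outside(s));
        // [Google is a great search engine, Here is Facebook, Something else at the end]
        System.out.println(inside(s));
        // [Some stuff heress, More Stuff, Stuff, Yet More Stuff]
    }
}
```

Keeping the two lists separate loses the interleaved order that looping() preserves, so this only fits when the plain text and the marker contents are consumed independently.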

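Additional edit: wrote some quick test cases (with a super hacky helper class) to help with verification. These all pass with the (updated) multiEntry: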

public static void main(String[] args)
{
    // Each Input pairs a test string with the expected size of the result list.
    Input[] inputs = {
        new Input("Google is a great search engine<as:Some stuff heres>", 2),
        new Input("Google is a great search engine"
                + "<as:Some stuff heresss>Here is Facebook<as:More Stuffs>"
                + "Something else at the end"
                + "<as:Stuffs>"
                + "<as:Yet More Stuffs>"
                + "ending", 8),
        new Input("Google is a great search engine"
                + "<as:Some stuff heresss>Here is Facebook<as:More Stuffs>"
                + "Something else at the end"
                + "<as:Stuffs>"
                + "<as:Yet More Stuffs>", 7),
        new Input("No as here", 1),
        new Input("Here is angle < input", 1),
        new Input("Angle < plus <as:Stuff in as:s><as:Other stuff in as:s>", 3),
        new Input("Angle < plus <as:Stuff in as:s><as:Other stuff in as:s>blah", 4),
        new Input("<as:To start with anglass>Some ending", 2),
    };

    List<String> res;
    for (Input inp : inputs) {
        res = multiEntry(inp.inp);
        if (res.size() != inp.cnt) {
            System.err.println("FAIL: " + res.size()
                    + " did not match exp of " + inp.cnt
                    + " on " + inp.inp);
            System.err.println(res);
            continue;
        }
        System.out.println(res);
    }
}
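The helper class itself is not shown in the answer. A minimal sketch of what it might look like, assuming only what the test loop uses: a (String, int) constructor and the two fields inp and cnt. Everything else about it is a guess.

```java
// Hypothetical reconstruction: the original helper class is not shown.
final class Input {
    final String inp; // the string handed to multiEntry
    final int cnt;    // the expected number of entries in the result

    Input(String inp, int cnt) {
        this.inp = inp;
        this.cnt = cnt;
    }
}
```

With a class like this on the classpath next to multiEntry, the main method above should compile as written.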

Source

2017-06-08 00:36:21 KevinO

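Thank you very much @kelvinO, but I still need to split the string on <as:Some stuff heres> so that I can get my array of strings. I'm actually building a complex Lucene search application that needs to index a dynamic field, and I want to use the "Some stuff here" part as the dynamic field for the index. I need to get an array of all of them in the string, or let the document –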

@MichaelDawn, OK, added a looping approach. If needed, you can also put the '' back. – KevinO

I am getting an error on this line: List<String> col = new ArrayList<>();. Is there any class I need to import to use List<String> col = new ArrayList<>(); –
