Java regex: trying to split a string

Hi, I'm trying to split this string (it is quite long):

Library Catalogue Log off | Borrower record | Course Reading | Collections | A-Z E-Journal list | ILL Request | Help   Browse | Search | Results List | Previous Searches | My e-Shelf | Self-Issue | Feedback       Selected records:  View Selected  |  Save/Mail  |  Create Subset  |  Add to My e-Shelf  |        Whole set:  Select All  |  Deselect  |  Rank  |  Refine  |  Filter   Records 1 - 15 of 101005 (maximum display and sort is 2500 records)         1 Drower, E. S. (Ethel Stefana), Lady, b. 1879. Lady E.S. Drower’s scholarly correspondence : an intrepid English autodidact in Iraq / edited by 2012. BK Book University Library(1/ 0) 2 Kowalski, Robin M. Cyberbullying : bullying in the digital age / Robin M. Kowalski, Susan P. Limber, Patricia W. Ag 2012. BK Book University Library(1/ 0) ... 15 Ambrose, Gavin. Approach and language [electronic resource] / Gavin Ambrose, Nigel Aono-Billson. 2011. BK Book

What I want to get back is either:

1 Drower, E. S. (Ethel Stefana), Lady, b. 1879. Lady E.S. Drower’s scholarly correspondence : an intrepid English autodidact in Iraq/edited by 2012. BK Book University Library(1/ 0) 

// Or 

1 Drower, E. S. (Ethel Stefana), Lady, b. 1879. Lady E.S. Drower’s scholarly correspondence : an intrepid English autodidact in Iraq

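This is just an example; "1 Drower, E. S. ..." will not be static. The input differs every time (the details between 1 and 2), but the overall layout of the string is always the same.

Here is what I have: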

String top = ".*   (.*)"; 
String bottom = "\\(\\d/ \\d\\)\\W*"; 
Pattern p = Pattern.compile(top); //+bottom 
Matcher matcher = p.matcher(td); //td is the input String 
String items = matcher.group(); 
System.out.println(items);

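When I run it with top, which is meant to strip off the whole header, all I get back is No match found. bottom is my attempt at splitting the rest of the string.

If needed, I can post the whole input up to number 15. What I need is to split the input string so that I can process each of the 15 results individually.

Thanks for your help!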

2012-03-14 Tbuermann

This will give you both of the outputs you listed. Is this what you want?

String text = "Library Catalogue Log off ..."; // truncated text 

Pattern p = Pattern.compile("((1 Drower.+Iraq).+0\\)).+2 Kowalski"); 
Matcher m = p.matcher(text); 
if (m.find()) { 
    System.out.println(m.group(1)); 
    System.out.println(m.group(2)); 
}

2012-03-14 20:01:33 JMelnik

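In a way, yes. But the thing is, the input is not static and will change depending on the search results. Sorry, I should have mentioned that. The layout of the input string will not change, though. The number 1 is just the first search result, and it goes up to 15 results. If needed, I can post the whole input up to number 15. – Tbuermann 2012-03-14 20:14:34

So you need to split out all of the search results, as far as I understand? – JMelnik 2012-03-14 20:25:18

Yes, that's correct. For example, [1 Drower, E. S. ..] should be one String, and [2 Kowalski, Robin M. ..] up to [15 Ambrose, Gavin. ..] should each be a String of their own. The input changes depending on the search results, but the layout of the input string will always be the same. So 1, 2, 3 .. 15 will always be there, unless there are fewer than 15 results. – Tbuermann 2012-03-14 20:28:18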

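First, you need to separate the header from the result data. Assuming there will always be a block of 9 whitespace characters in front of the results, you can use: .*\s{9}(.*)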
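As a rough sketch of that header-stripping step in Java (the class name HeaderStrip, the method name stripHeader, and the sample text are made up for illustration): note that Matcher.group() can only be called after a successful match attempt such as matches() or find(). Calling it without one throws an IllegalStateException with the message "No match found", which is what the code in the question runs into.

```java
import java.util.Collections;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeaderStrip {
    // Assumption from the answer: the results start after a run of nine whitespace characters.
    private static final Pattern HEADER = Pattern.compile(".*\\s{9}(.*)");

    // Returns the text after the last nine-whitespace run, or the input unchanged if there is none.
    static String stripHeader(String input) {
        Matcher m = HEADER.matcher(input);
        // matches() performs the match attempt; only then is group(1) safe to call.
        return m.matches() ? m.group(1) : input;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened input; the gap is built programmatically to be exactly nine spaces.
        String gap = String.join("", Collections.nCopies(9, " "));
        String td = "Library Catalogue Log off | Help   Browse | Search" + gap
                  + "1 First, A. Title one. 2 Second, B. Title two.";
        System.out.println(stripHeader(td));
    }
}
```

Falling back to the unchanged input when no nine-space block is found is a choice made for this sketch; throwing an exception would be just as reasonable if a missing header means the page layout has changed.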

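Next, you need to parse the data into rows, which is harder because there is no row delimiter. The best assumption you can make is that rows are separated by a space, one or more digits, and then another space.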

((?<=(?:^|\s))\d+\s.*?(?=(?:$|\s\d+\s)))

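If you are planning to go on and parse the records into fields, don't bother unless you can change the delimiters!

A little explanation of what each bit does: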

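(?<=(?:^|\s)) Lookbehind: makes sure the character before the group is either the start of the string (for the first record) or a space (for all other records).

\d+\s.*? Capture group: one or more digits followed by a space and then the text. Because the assertions use the non-capturing group ?:, this is the only part of the expression that shows up in the output.

(?=(?:$|\s\d+\s)) Lookahead: makes sure the characters following the group are either the end-of-string marker $ or a space followed by one or more digits and another space (which marks the next record).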
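Putting the pieces together, a minimal Java sketch of the record split might look like this. The class and method names are invented for the example, and the sample data is a shortened version of the records in the question; it assumes the header has already been removed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RecordSplitter {
    // The expression from the answer, with backslashes doubled for a Java string literal.
    private static final Pattern RECORD =
        Pattern.compile("((?<=(?:^|\\s))\\d+\\s.*?(?=(?:$|\\s\\d+\\s)))");

    // Collects every record; find() advances through the input one match at a time.
    static List<String> splitRecords(String data) {
        List<String> records = new ArrayList<>();
        Matcher m = RECORD.matcher(data);
        while (m.find()) {
            records.add(m.group(1));
        }
        return records;
    }

    public static void main(String[] args) {
        // Shortened sample data, header already stripped.
        String data = "1 Drower, E. S. Scholarly correspondence 2012. BK Book University Library(1/ 0) "
                    + "2 Kowalski, Robin M. Cyberbullying 2012. BK Book University Library(1/ 0)";
        for (String record : splitRecords(data)) {
            System.out.println(record);
        }
    }
}
```

Years such as "2012." and the "(1/ 0)" holdings marker do not trigger a split here, because the digits are not both preceded and followed by whitespace.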

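This method works for the data you have provided, but it will break if one of your records contains the delimiter pattern itself (e.g. a book called "My 10 Favourite Things"). There are other ways of parsing the records that are somewhat safer, but if that is what you want to do, it goes beyond what a regular expression should be expected to do...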
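The answer does not say which safer approaches it has in mind. One possibility, offered purely as an assumption, is to accept a standalone number as a record boundary only when it is the next expected record number (1, 2, 3, ...). The sketch below uses invented names and sample data; it keeps "My 10 Favourite Things" inside record 1, although a title that happens to contain the next record number would still fool it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SequentialSplitter {
    // Candidate boundary: a number preceded by start-of-string or whitespace and followed by whitespace.
    private static final Pattern NUMBER = Pattern.compile("(?<=^|\\s)(\\d+)\\s");

    static List<String> split(String data) {
        List<Integer> starts = new ArrayList<>();
        int expected = 1;
        Matcher m = NUMBER.matcher(data);
        while (m.find()) {
            // Accept the candidate only if it carries the next record number in sequence.
            if (m.group(1).equals(String.valueOf(expected))) {
                starts.add(m.start());
                expected++;
            }
        }
        // Cut the input at the accepted boundaries.
        List<String> records = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            int end = (i + 1 < starts.size()) ? starts.get(i + 1) : data.length();
            records.add(data.substring(starts.get(i), end).trim());
        }
        return records;
    }

    public static void main(String[] args) {
        // Hypothetical data containing a number inside a title.
        String data = "1 Smith, J. My 10 Favourite Things 2012. BK Book "
                    + "2 Jones, K. Other title 2011. BK Book";
        for (String record : split(data)) {
            System.out.println(record);
        }
    }
}
```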

2012-06-14 14:31:29 KidTempo
