Regex to extract the following paragraphs from a page
I have this text, which I extracted from a PDF using iText and placed into a string variable:
(1) A a, — al'-fah; of Hebrew origin; the first letter of the alphabet;
figurative only (from its use as a numeral) the first: — Alpha.
Often used (usually ajn an, before a vowel) also in composition
(as a contraction from (427) (a]neu,)) in the sense of privation;
so in many words beginning with this letter; occasionally in the
sense of union (as a contraction of (260) (a[ma)).
(2) ÆAarw>n, — ah-ar-ohn'; of Hebrew origin [Hebrew {175}
('Aharown)]; Aaron, the brother of Moses: — Aaron.
(3) ÆAbaddw>n, — ab-ad-dohn'; of Hebrew origin [Hebrew {11}
('abaddown)]; a destroying angel: — Abaddon.
(4) ajbarh>v, — ab-ar-ace'; from (1) (a) (as a negative particle) and (922)
(ba>rov); weightless, i.e. (figurative) not burdensome: — from
being burdensome.
(5) ÆAbba~, — ab-bah'; of Chaldee origin [Hebrew {2} ('ab (Chaldee))];
father (as a vocative): — Abba.
(6) &Abel, — ab'-el; of Hebrew origin [Hebrew {1893} (Hebel)]; Abel,
the son of Adam: — Abel.
(7) ÆAbia>, — ab-ee-ah'; of Hebrew origin [Hebrew {29} ('Abiyah)];
Abijah, the name of two Israelites: — Abia.
(8) ÆAbia>qar, — ab-ee-ath'-ar; of Hebrew origin [Hebrew {54}
('Ebyathar)]; Abiathar, an Israelite: — Abiathar.
(9) ÆAbilhnh>, — ab-ee-lay-nay'; of foreign origin [compare Hebrew {58}
('abel)]; Abilene, a region of Syria: — Abilene.
(10) ÆAbiou>d, — ab-ee-ood'; of Hebrew origin [Hebrew {31}
('Abiyhuwd)]; Abihud, an Israelite: — Abiud.
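Each paragraph in the string begins with ([0-9]), such as (9) or (5). I would like to use pagestring.split("regex") to extract every paragraph that begins with this character sequence. Can anyone help?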
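One possible way to do the split described above is a zero-width lookahead, so that the "(n)" marker stays attached to its paragraph instead of being consumed by split. This is only a sketch under stated assumptions: every entry header starts at the beginning of a line, and wrapped continuation lines never begin with a number in parentheses followed by a space (this holds for the sample, where cross-references such as (427) and (922) appear mid-line or at line end, but it may not hold for the whole PDF). Note that [0-9] matches a single digit, so \d+ is used to cover entries like (10). The names EntrySplitter, splitEntries and ENTRY_START are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class EntrySplitter {
    // (?m) makes ^ match at the start of every line.
    // The lookahead (?=...) matches the position before "(digits) " without
    // consuming it, so the marker stays at the front of each piece.
    private static final Pattern ENTRY_START = Pattern.compile("(?m)^(?=\\(\\d+\\) )");

    // Hypothetical helper: returns one trimmed string per numbered entry.
    static List<String> splitEntries(String pageString) {
        List<String> entries = new ArrayList<>();
        for (String part : ENTRY_START.split(pageString)) {
            String trimmed = part.trim();
            // Skip empty pieces (e.g. a leading empty string on older Java versions).
            if (!trimmed.isEmpty()) {
                entries.add(trimmed);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Two entries from the sample; \u00C6 and \u2014 stand for the
        // non-ASCII characters in the extracted text.
        String pageString =
            "(9) \u00C6Abilhnh>, \u2014 ab-ee-lay-nay'; of foreign origin [compare Hebrew {58}\n"
            + "('abel)]; Abilene, a region of Syria: \u2014 Abilene.\n"
            + "(10) \u00C6Abiou>d, \u2014 ab-ee-ood'; of Hebrew origin [Hebrew {31}\n"
            + "('Abiyhuwd)]; Abihud, an Israelite: \u2014 Abiud.";
        for (String entry : splitEntries(pageString)) {
            System.out.println("ENTRY: " + entry);
        }
    }
}
```

The same pattern string can be passed directly to pagestring.split("(?m)^(?=\\(\\d+\\) )") if the helper class is not wanted. If a wrapped line in the real data does begin with a cross-reference such as "(922) ", the pattern would need a stricter test for what follows the number.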
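Awesome! Is there a tutorial or guide you can recommend? Regex really trips me up. – Lema 2015-03-13 09:01:14
I learned regular expressions a long time ago, so I can't really recommend a tutorial. But http://regexcrossword.com/ offers a fun way to learn. – laune 2015-03-13 09:21:59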