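How to replace a string of multiple characters between two expressions with a regex in VB.NET
I am trying to remove some citations from an XML file that I downloaded from Wikipedia. So far, the text looks like this (ignore the line breaks; they are only there to make it easier to read):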
'''Anarchism''' is a political philosophy that advocates stateless societies based on
non-hierarchical free associations.<ref name="iaf-ifa.org"/><ref>"That is why
Anarchy, when it works to destroy authority in all its aspects, when it demands
the abrogation of laws and the abolition of the mechanism that serves to
impose them, when it refuses all hierarchical organization and preaches free agreement - at the same time strives to maintain and enlarge the precious kernel of social customs without which
no human or animal society can exist." Peter Kropotkin. http://www.theanarchistlibrary.org/HTML/Petr_Kropotkin__Anarchism__its_philosophy_and_ideal.html
Anarchism: its philosophy and ideal</ref><ref>"anarchists are opposed to irrational (e.g., illegitimate)
authority, in other words, hierarchy - hierarchy being the institutionalisation of authority
within a society." http://www.theanarchistlibrary.org/HTML/The_Anarchist_FAQ_Editorial_Collective__An_Anarchist_FAQ__03_17_.html#toc2 "B.1
Why are anarchists against authority and hierarchy?" in An
Anarchist FAQ</ref><ref>"ANARCHISM, a social philosophy that rejects
authoritarian government and maintains that voluntary institutions are best
suited to express man's natural social tendencies." George Woodcock. "Anarchism" at The Encyclopedia of Philosophy</ref><ref>"In a society developed on these lines, the voluntary
associations which already now begin to cover all the fields of human activity
would take a still greater extension so as to substitute themselves for the
state in all its functions." http://www.theanarchistlibrary.org/HTML/Petr_Kropotkin___Anarchism__from_the_Encyclopaedia_Britannica.html
Peter Kropotkin. "Anarchism" from the Encyclopædia Britannica</ref> Anarchism holds the state
to be undesirable, unnecessary, or harmful
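All I want from this block of text is this:
Anarchism is a political philosophy that advocates stateless societies based on non-hierarchical free associations. Anarchism holds the state to be undesirable, unnecessary, or harmful.
It seems to me that if I delete all the text between "<ref" and "/ref>", I should be able to capture all the unwanted text and remove it. This is my current code: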
Dim temptext As String = newsrt.ToString
Dim expression As New Regex("(?<=\<ref)[^/ref>]+(?=/ref>)")
Dim resul As String = expression.Replace(temptext, "")
But this doesn't seem to work. No text between <ref and /ref> is captured and replaced with "".
Any help or suggestions would be great! Thanks.
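One likely reason nothing matches: [^/ref>] is a character class, so it means "any single character that is not /, r, e, f or >", not "anything other than the string /ref>". Almost every reference contains one of those letters, so the match fails. The lookarounds would also leave the <ref and /ref> delimiters in place even if it did match. Below is a minimal sketch of one alternative, assuming the references look like the sample above (unescaped <ref ...> tags, some of them self-closing). The module name RefStripper and the function StripRefs are hypothetical names made up for this illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RefStripper
    ' Hypothetical helper: removes <ref .../> and <ref>...</ref> from the text.
    Function StripRefs(input As String) As String
        ' Self-closing references first, e.g. <ref name="iaf-ifa.org"/>.
        ' If they stayed in, the paired pattern would treat them as opening tags.
        Dim selfClosing As New Regex("<ref\b[^>]*/>")
        ' Paired references. ".*?" is lazy, so it stops at the first </ref>.
        ' Singleline lets "." match line breaks inside a reference.
        Dim paired As New Regex("<ref\b[^>]*>.*?</ref>", RegexOptions.Singleline)
        Return paired.Replace(selfClosing.Replace(input, ""), "")
    End Function

    Sub Main()
        Dim sample As String =
            "free associations.<ref name=""iaf-ifa.org""/><ref>""quoted"" note</ref> Anarchism holds"
        ' Prints: free associations. Anarchism holds
        Console.WriteLine(StripRefs(sample))
    End Sub
End Module
```

Running the two patterns in this order is a design choice for the sketch, not the only way to do it; a single alternation would also work.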
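Hey. I'm pretty new to regex, but I think I understand what greedy means -> it will find the last position of the final (/ref>)? If so, how do I stop that, since there are a lot of these references scattered up and down the page, with the wanted text in between them. – FraserOfSmeg
I figured it out: add a ? like this: "". Thanks for your help! :D – FraserOfSmeg
@FraserOfSmeg In that case, you can make it non-greedy with '' or use ').)*/ref>' (which was your original intent). Or, better yet, use an XML parser! –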
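To illustrate the greedy versus non-greedy behaviour discussed in these comments, here is a small sketch. The three patterns are assumptions written for this example, not the exact ones from the thread, and GreedyVsLazy and RemoveWith are hypothetical names.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GreedyVsLazy
    ' Hypothetical helper: deletes every match of the pattern from the input.
    Function RemoveWith(pattern As String, input As String) As String
        Return Regex.Replace(input, pattern, "", RegexOptions.Singleline)
    End Function

    Sub Main()
        Dim sample As String = "A<ref>one</ref> B<ref>two</ref> C"

        ' Greedy: ".*" runs to the LAST /ref>, so the wanted " B" is lost.
        Console.WriteLine(RemoveWith("<ref.*/ref>", sample))              ' A C

        ' Lazy: ".*?" stops at the FIRST /ref> after each <ref.
        Console.WriteLine(RemoveWith("<ref.*?/ref>", sample))             ' A B C

        ' Negative lookahead: consume a character only if /ref> does not start there.
        Console.WriteLine(RemoveWith("<ref(?:(?!/ref>).)*/ref>", sample)) ' A B C
    End Sub
End Module
```

Note that none of these three patterns treats a self-closing tag such as <ref name="iaf-ifa.org"/> specially, so such a tag would be matched through to the next /ref>.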