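R: Remove the end of a text after a matching string
I want to delete any text that appears after a match of THE END or FINIS. I know this is very similar to other topics, but I am not yet practiced enough with regular expressions to make them work for my case.
My texts are Shakespeare books taken from Project Gutenberg. They typically look like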
txt <- "... thou hast tam'd a curst shrow. LUCENTIO. 'Tis a wonder,
by your leave, she will be tam'd so. Exeunt THE END <<THIS ELECTRONIC VERSION OF THE
COMPLETE WORKS OF WILLIAM ..."
or
txt <- "... thou hast tam'd a curst shrow. LUCENTIO. 'Tis a wonder,
by your leave, she will be tam'd so. Exeunt FINIS <<THIS ELECTRONIC VERSION OF THE
COMPLETE WORKS OF WILLIAM ..."
Ideally I would use something like gsub("^[THE END]*|^[FINIS]*", "", txt)
and get back "... thou hast tam'd a curst shrow. LUCENTIO. 'Tis a wonder, by your leave, she will be tam'd so. Exeunt"
`sub` should be enough, since there is only one replacement. – thelatemail
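Following that comment, here is a minimal sketch of one way this could work with `sub`. In the attempted pattern, `[THE END]` is a character class (any one of the characters T, H, E, space, N, D) and `^` anchors the match to the start of the string, so nothing near the marker is ever removed. The sketch uses a grouped alternation instead. The helper name `strip_after_marker` is hypothetical, and the sketch assumes the marker itself should be dropped along with everything after it, and that the first occurrence of THE END or FINIS is the one that ends the play.

```r
txt <- "... she will be tam'd so. Exeunt THE END <<THIS ELECTRONIC VERSION OF THE
COMPLETE WORKS OF WILLIAM ..."

# Hypothetical helper: remove the first THE END / FINIS marker,
# any whitespace right before it, and everything after it.
strip_after_marker <- function(x) {
  # (?s) makes "." match newlines too (PCRE, hence perl = TRUE),
  # so ".*" runs to the end of the string even across line breaks.
  sub("(?s)\\s*(THE END|FINIS).*$", "", x, perl = TRUE)
}

strip_after_marker(txt)
# [1] "... she will be tam'd so. Exeunt"
```

If the marker text can also occur inside the play itself, the pattern would cut at that earlier occurrence, so it may be worth checking a few books by hand before applying it to the whole collection.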