Ruby: split a string on multiple characters

I have a string, say "Hello_World I am Learning,Ruby". I would like to split this string into its separate words. What is the best way to do that?

Thanks! C.

2011-10-11 curious

You can use String#split with a regular expression pattern as the argument. Like this:

"Hello_World I am Learning,Ruby".split /[ _,.!?]/ 
=> ["Hello", "World", "I", "am", "Learning", "Ruby"]
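One caveat with a single-character class: when two delimiters sit next to each other (a comma followed by a space, say), split returns an empty string between them. A minimal sketch, using a variant of the question's string with a space after the comma; the variable names are only illustrative:

```ruby
# Variant of the question's string with ", " (two adjacent delimiters).
str = "Hello_World I am Learning, Ruby"

# Splitting on one delimiter at a time leaves an empty string
# between the comma and the space that follows it.
single = str.split(/[ _,.!?]/)

# Adding + treats a run of delimiters as a single separator.
grouped = str.split(/[ _,.!?]+/)

p single   # => ["Hello", "World", "I", "am", "Learning", "", "Ruby"]
p grouped  # => ["Hello", "World", "I", "am", "Learning", "Ruby"]
```

If the input may contain runs of punctuation or whitespace, the + form is probably the safer default.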


2011-10-11 09:50:14 zacsek

ruby-1.9.2-p290 :022 > str = "Hello_World I am Learning,Ruby" 
ruby-1.9.2-p290 :023 > str.split(/\s|,|_/) 
=> ["Hello", "World", "I", "am", "Learning", "Ruby"]


2011-10-11 09:53:45 Jin

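While the examples above work, when splitting a string into words I think it is probably better to split on any character that is not considered part of a word. To do that, I did this: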

str = "Hello_World I am Learning,Ruby" 
str.split(/[^a-zA-Z]/).reject(&:empty?).compact

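This statement does the following:

splits the string on any character that is not a letter of the alphabet
then rejects anything that is an empty string
and removes any nil values from the array

It should then handle most combinations of words. The examples above require you to list every character you want to match. It is far easier to specify the characters that are not considered part of a word.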
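The steps above can be sketched one at a time as follows. This is only an illustration: the intermediate variable names are made up, and compact is left out on the assumption that split with this pattern never produces nil elements.

```ruby
str = "Hello_World I am Learning,Ruby"

# Step 1: split on every character that is not an ASCII letter.
pieces = str.split(/[^a-zA-Z]/)

# Step 2: drop empty strings left by adjacent or leading separators.
words = pieces.reject(&:empty?)

# Variant: split on runs of non-letters. A leading separator can still
# leave one empty string at the front, so reject is kept.
words_alt = "  Hello,  World".split(/[^a-zA-Z]+/).reject(&:empty?)

p words      # => ["Hello", "World", "I", "am", "Learning", "Ruby"]
p words_alt  # => ["Hello", "World"]
```

Note that [^a-zA-Z] also splits on digits and on non-ASCII letters, which may or may not be what you want.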


2011-10-11 10:14:52 BlueFish

String#scan seems to be an appropriate method for this task:

irb(main):018:0> "Hello_World I am Learning,Ruby".scan(/[a-z]+/i) 
=> ["Hello", "World", "I", "am", "Learning", "Ruby"]

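Or you can use the built-in \w matcher (note that \w also matches the underscore, so Hello_World stays in one piece):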

irb(main):020:0> "Hello_World I am Learning,Ruby".scan(/\w+/) 
=> ["Hello_World", "I", "am", "Learning", "Ruby"]
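The difference between the two scan results comes from \w matching letters, digits and the underscore. A small sketch comparing \w+ with an explicit class that leaves the underscore out; the variable names are illustrative:

```ruby
str = "Hello_World I am Learning,Ruby"

# \w covers letters, digits and "_", so "Hello_World" is one match.
with_underscore = str.scan(/\w+/)

# An explicit class without "_" breaks the match at the underscore.
without_underscore = str.scan(/[a-zA-Z0-9]+/)

p with_underscore     # => ["Hello_World", "I", "am", "Learning", "Ruby"]
p without_underscore  # => ["Hello", "World", "I", "am", "Learning", "Ruby"]
```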


2011-10-11 10:34:45 Bohdan

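You can use \W, which matches any non-word character (the underscore is added explicitly because \W does not match it):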

"Hello_World I am Learning,Ruby".split /[\W_]/ 
=> ["Hello", "World", "I", "am", "Learning", "Ruby"] 

"Hello_World I am Learning, Ruby".split /[\W_]+/ 
=> ["Hello", "World", "I", "am", "Learning", "Ruby"]


2011-10-11 10:45:57 Samnang

Just for fun, a Unicode-aware version for 1.9 (or 1.8 with Oniguruma):

>> "This_µstring has words.and thing's".split(/[^\p{Word}']|\p{Connector_Punctuation}/) 
=> ["This", "µstring", "has", "words", "and", "thing's"]

Or perhaps:

>> "This_µstring has words.and thing's".split(/[^\p{Word}']|_/) 
=> ["This", "µstring", "has", "words", "and", "thing's"]

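The real problem is deciding which sequences of characters constitute a "word" in this context. You may want to look at the Oniguruma docs for the supported character properties; Wikipedia also has some notes on the properties.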
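Once you have decided what counts as a word, it can be easier to describe the word itself and collect matches with scan rather than describe the separators. A minimal sketch, assuming a "word" is a run of Unicode letters and apostrophes (that definition is an assumption for illustration, not something settled in this thread):

```ruby
# "\u00B5" is the micro sign (µ), written as an escape so the example
# does not depend on the source file's encoding.
str = "This_\u00B5string has words.and thing's"

# Assumed definition of a word: one or more Unicode letters or apostrophes.
words = str.scan(/[\p{L}']+/)

p words
```

With this definition the underscore and the period act as separators, while the apostrophe in thing's is kept.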


2011-10-11 16:40:06
