Find all indices of a substring within a string

I want to be able to find the indices of all occurrences of a substring within a larger string using Ruby. For example: all occurrences of "in" within "Einstein".

str = "Einstein" 
str.index("in") #returns only 1 
str.scan("in") #returns ["in","in"] 
#desired output would be [1, 6]

Source

2017-04-10 Mokhtar

The standard way is:

"Einstein".enum_for(:scan, /(?=in)/).map { Regexp.last_match.offset(0).first } 
#=> [1, 6]

Source

2017-04-10 17:40:02 tokland

Nice one. Note that 'nnnn'.enum_for(:scan, /nn/).map { Regexp.last_match.offset(0).first } #=> [0, 2]. If [0, 1, 2] is the desired return value, change the regex (/nn/) to /(?=nn)/. –

Good point, @Cary. I think in most cases we want the second one; updated. – tokland
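To make the difference discussed in these comments concrete, here is a minimal sketch that runs both patterns over the same string. The helper name match_offsets is made up for this illustration and is not part of the answer; it relies on String#scan setting Regexp.last_match for each match when given a block.

```ruby
# Collect the start offset of every match of `pattern` in `str`.
# `match_offsets` is a hypothetical helper name used only for this sketch.
def match_offsets(str, pattern)
  offsets = []
  # With a block, String#scan sets Regexp.last_match ($~) for each match.
  str.scan(pattern) { offsets << Regexp.last_match.begin(0) }
  offsets
end

consuming = match_offsets('nnnn', /nn/)      # each match consumes two characters
lookahead = match_offsets('nnnn', /(?=nn)/)  # zero-width, so matches may overlap

p consuming  # [0, 2]
p lookahead  # [0, 1, 2]
```

The lookahead version matches an empty string at every position that is followed by "nn", which is why overlapping occurrences are all reported.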

def indices_of_matches(str, target) 
    sz = target.size 
    (0..str.size-sz).select { |i| str[i,sz] == target } 
end 

indices_of_matches('Einstein', 'in') 
    #=> [1, 6] 
indices_of_matches('nnnn', 'nn') 
    #=> [0, 1, 2]

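The second example reflects an assumption I have made about how overlapping matches are treated. If overlapping matches are not to be counted (i.e., [0, 2] is the desired return value in the second example), this answer is obviously not appropriate.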
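If the non-overlapping behaviour is what you want instead, one possible sketch uses String#index with a start offset and jumps past each match. The method name non_overlapping_indices is an assumption made for this example, not something from the answer above.

```ruby
# Hypothetical helper: start indices of non-overlapping occurrences of `target`.
def non_overlapping_indices(str, target)
  return [] if target.empty? # avoid looping forever on an empty target

  indices = []
  pos = 0
  # String#index(substring, offset) returns nil when nothing is found
  # at or after `offset`.
  while (i = str.index(target, pos))
    indices << i
    pos = i + target.size # skip past this match so the next cannot overlap it
  end
  indices
end

p non_overlapping_indices('Einstein', 'in') # [1, 6]
p non_overlapping_indices('nnnn', 'nn')     # [0, 2]
```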

Source

2017-04-10 20:38:29

Simple and clean, I would probably use this one. – tokland

Here is a more verbose solution, which has the advantage of not relying on a global value:

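def indices(string, regex)
  Enumerator.new do |yielder|
    # Initialise inside the block so every enumeration starts at the beginning;
    # otherwise a second pass over the same Enumerator would yield nothing.
    position = 0
    # Assumes regex never matches the empty string (that would loop forever).
    while (match = regex.match(string, position))
      yielder << match.begin(0) # start index of this match
      position = match.end(0)   # resume searching after the match
    end
  end
end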

p indices("Einstein", /in/).to_a 
# [1, 6]

It returns an Enumerator, so you can also consume it lazily, or take only the first n indices.
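As a rough illustration of that point, the sketch below restates an indices method so the snippet runs on its own (with position initialised inside the block, so each enumeration starts from the beginning) and then takes only part of the results. The sample string "Einstein in Berlin" is made up for the example.

```ruby
# Restated so this snippet is self-contained.
def indices(string, regex)
  Enumerator.new do |yielder|
    position = 0
    while (match = regex.match(string, position))
      yielder << match.begin(0)
      position = match.end(0)
    end
  end
end

enum = indices("Einstein in Berlin", /in/)

p enum.first(2)                    # [1, 6]    (stops after two matches)
p enum.take_while { |i| i < 10 }   # [1, 6, 9]
p enum.to_a                        # [1, 6, 9, 16]
```

With first(2), the loop inside the Enumerator stops as soon as two indices have been yielded, so the rest of the string is never searched.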

Also, if you might need more information than just the indices, you can return an Enumerator of MatchData objects and extract the indices from them:

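def matches(string, regex)
  Enumerator.new do |yielder|
    # Initialise inside the block so every enumeration starts at the beginning.
    position = 0
    # Assumes regex never matches the empty string (that would loop forever).
    while (match = regex.match(string, position))
      yielder << match        # yield the full MatchData, not just the index
      position = match.end(0) # resume searching after the match
    end
  end
end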

p matches("Einstein", /in/).map{ |match| match.begin(0) } 
# [1, 6]

To get the behaviour @Cary describes (overlapping matches), you can replace the last line inside the block with position = match.begin(0) + 1.
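A minimal sketch of that variant follows. The name overlapping_indices is chosen here for illustration only; the body is the same loop as above with the one-line change applied.

```ruby
# Hypothetical variant: report overlapping matches by advancing only one
# character past the start of each match instead of past its end.
def overlapping_indices(string, regex)
  Enumerator.new do |yielder|
    position = 0
    while (match = regex.match(string, position))
      yielder << match.begin(0)
      position = match.begin(0) + 1 # the changed line
    end
  end
end

p overlapping_indices("nnnn", /nn/).to_a     # [0, 1, 2]
p overlapping_indices("Einstein", /in/).to_a # [1, 6]
```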

Source

2017-04-10 20:46:45
