Ruby loop prints duplicates

How do I stop this code from printing duplicate lines?

RE = /<("[^"]*"|'[^']*'|[^'">])*>/ 
TAG_RE = /<(.+?)>(.*?)<.+?>/ 

text = "<date>show</date> me the current conditions for <city> detroit <END>" 
a = [] 

text.scan(TAG_RE).map { |w| a<< w; } 

text.gsub(RE, '').split.each do |q|
  a.each_with_index do |v, i|
    if q == a[i].last.strip
      puts "#{q}\tB-#{a[i].first}"
    else
      puts "#{q}\tO"
    end
  end
end

Output

show B-date 
show O 
me O 
me O 
the O 
the O 
current O 
current O 
conditions O 
conditions O 
for O 
for O 
detroit O 
detroit B-city

I want each word to appear only once, tagged if it matches the condition.

Like this:

show B-date 
me O 
the O 
current O 
conditions O 
for O 
detroit B-city

Where in the loop should I put next?

Edit
Is this code idiomatic Ruby?

text.gsub(RE, '').split.each do |q|
  a.each_with_index do |v, i|
    @a = a[i].last.strip # save in a variable
    if @a == q
      puts "#{q}\tB-#{a[i].first}"
      break # break inner loop if match found
    end
  end
  next if @a == q # skip current outer loop if match found
  puts "#{q}\tO"
end


2017-04-15 arjun

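Shouldn't detroit end with a closing tag of its own? –

It doesn't matter. The code only checks the word inside the tag and then takes the tag name from the opening part. – arjun

The problem is that, for every word, you also iterate over a, which is effectively a hash from tags to words. Each word is therefore printed once per entry in a.

If you treat the result of scan as a hash instead of an array and look each word up once, you won't get duplicates.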

RE = /<("[^"]*"|'[^']*'|[^'">])*>/ 
TAG_RE = /<(.+?)>(.*?)<.+?>/ 

text = "<date>show</date> me the current conditions for <city> detroit <END>" 

a = text.scan(TAG_RE) 

text.gsub(RE, '').split.each do |q|
  d = a.find { |p| p.last.strip == q }
  if d
    puts "#{q}\tB-#{d.first}"
  else
    puts "#{q}\tO"
  end
end

Output:

show B-date 
me  O 
the  O 
current O 
conditions  O 
for  O 
detroit B-city
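
To make the cause of the duplicates concrete, here is a small sketch that counts how many lines the original nested loop prints. The variable names pairs, words and lines_printed are only illustrative and are not part of the question's code.

```ruby
RE = /<("[^"]*"|'[^']*'|[^'">])*>/
TAG_RE = /<(.+?)>(.*?)<.+?>/

text = "<date>show</date> me the current conditions for <city> detroit <END>"

# scan returns one [tag, word] pair per tagged word.
pairs = text.scan(TAG_RE)

# Words that remain once the tags are stripped.
words = text.gsub(RE, '').split

# The nested loop prints one line per (word, pair) combination.
lines_printed = words.size * pairs.size

puts "#{words.size} words x #{pairs.size} pairs = #{lines_printed} lines"
```

With two tagged words in the text, every word is printed twice, which matches the doubled output in the question.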

And, while we're at it, you can use a proper hash:

RE = /<("[^"]*"|'[^']*'|[^'">])*>/ 
TAG_RE = /<(.+?)>(.*?)<.+?>/ 

text = "<date>show</date> me the current conditions for <city> detroit <END>" 

map = Hash[*text.scan(TAG_RE).flatten.map(&:strip)].invert 

text.gsub(RE, '').split.each do |q|
  tag = map[q]
  if tag
    puts "#{q}\tB-#{tag}"
  else
    puts "#{q}\tO"
  end
end

This produces the same output.
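
As a possible variation, the lookup hash can be built without flatten, the splat, or invert. This is only a sketch: it assumes Ruby 2.6 or newer (for Array#to_h with a block), and tag_lines is a hypothetical helper that returns the lines instead of printing them, which makes the result easy to check.

```ruby
RE = /<("[^"]*"|'[^']*'|[^'">])*>/
TAG_RE = /<(.+?)>(.*?)<.+?>/

# Hypothetical helper: returns one "word<TAB>label" string per word.
def tag_lines(text)
  # Build { word => tag } directly from the [tag, word] pairs.
  tag_for = text.scan(TAG_RE).to_h { |tag, word| [word.strip, tag] }

  text.gsub(RE, '').split.map do |word|
    tag = tag_for[word]
    tag ? "#{word}\tB-#{tag}" : "#{word}\tO"
  end
end

puts tag_lines("<date>show</date> me the current conditions for <city> detroit <END>")
```

Like the inverted hash, this only matches tags that wrap a single word, and a word that is tagged twice keeps only its last tag.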

Edit: if you want it in a more Ruby-esque way, I would probably do something like this:

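class Text
  TAGS_RE = /<("[^"]*"|'[^']*'|[^'">])*>/
  TAGS_WORDS_RE = /<(.+?)>\s*(.*?)\s*<.+?>/

  # Removes every <tag> from the text.
  def self.strip_tags(text)
    text.gsub(TAGS_RE, '')
  end

  # Maps each tagged word to its tag name, e.g. "show" => "date".
  def self.tagged_words(text)
    matches = text.scan(TAGS_WORDS_RE)
    Hash[*matches.flatten].invert
  end
end

class Word
  def self.display(word, tag)
    # The parentheses make this a call to the class method tag, not the local variable.
    puts "#{word}\t#{tag(tag)}"
  end

  # Formats a tag name as its label; untagged words get the letter "O".
  def self.tag(tag)
    tag ? "B-#{tag}" : "O"
  end
  # A bare `private` does not hide `def self.` methods, so hide it explicitly.
  private_class_method :tag
end

text = "<date>show</date> me the current conditions for <city> detroit <END>"

words_tag = Text.tagged_words(text)
Text.strip_tags(text).split.each do |word|
  tag = words_tag[word]
  Word.display(word, tag)
end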

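Why?

I'm not that smart, and I'm lazy, so I like to write things as explicitly as possible. That is why I try to avoid loops.

Loops are easy to write but not easy to read, because you have to keep the context of what you have already read in mind while you go on reading and analysing the source.

Loops with breaks and nexts are usually harder to parse, because you have to keep track of which code paths end the loop abruptly.

Nested loops are harder still, because you have to keep track of several contexts that change at different speeds.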
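
To illustrate the point about break, here is a rough side-by-side sketch. Both method names, tag_with_loop and tag_with_find, are hypothetical and exist only for this comparison; the pairs array mirrors what scan returns for the example text.

```ruby
pairs = [["date", "show"], ["city", " detroit "]]

# Loop version: the reader has to track `found` and the early exit.
def tag_with_loop(pairs, word)
  found = nil
  pairs.each do |tag, tagged|
    if tagged.strip == word
      found = tag
      break
    end
  end
  found
end

# Enumerable version: find stops at the first match by itself.
def tag_with_find(pairs, word)
  pair = pairs.find { |_tag, tagged| tagged.strip == word }
  pair&.first # nil when no pair matched
end

puts tag_with_loop(pairs, "detroit")
puts tag_with_find(pairs, "detroit")
```

Both return the same result, but the second version has no mutable state and no explicit exit to follow.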

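I believe the proposed version is easier to read because each line can be understood on its own. There is very little context to carry from one line to the next.

The details are abstracted into methods, so if you only want the big picture, you can look at the main part of the code: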

words_tag = Text.tagged_words(text) 
Text.strip_tags(text).split.each do |word| 
    tag = words_tag[word] 
    Word.display(word, tag) 
end

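If you want to understand the details of how it is done, you look at how those methods are implemented. With this approach, implementation details don't leak into places where they are not needed.

I think this is a good habit in every programming language, not just Ruby.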


2017-04-15 21:51:39 Gaston

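Hiya. I edited the question. I used 'break' and 'next'. Is that good Ruby? _BTW, your code is delicious. Of course I should have thought of a 'hash' ;)._ – arjun

Thanks :). I updated the answer to address your new question. – Gaston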
