If your text is in a fixed format, where the phone numbers will always be the first line in the block, then simply remove the first line:
text='
(093) 123-34-56 (068) 123 45 67 (095) 123 456 78
Refresh Rate: 60Hz (Native). Backlight: LED (Full Array)
Smart Functionality: Yes - xx TV Streaming Platform
Dimensions (W x H x D): TV without stand (inches) : 28.98x17x3.18, TV with stand (inches) : 28.98x18.68x7.78'
text.strip
# => "(093) 123-34-56 (068) 123 45 67 (095) 123 456 78\n Refresh Rate: 60Hz (Native). Backlight: LED (Full Array)\n Smart Functionality: Yes - xx TV Streaming Platform\n Dimensions (W x H x D): TV without stand (inches) : 28.98x17x3.18, TV with stand (inches) : 28.98x18.68x7.78"
text.strip.lines
# => ["(093) 123-34-56 (068) 123 45 67 (095) 123 456 78\n", " Refresh Rate: 60Hz (Native). Backlight: LED (Full Array)\n", " Smart Functionality: Yes - xx TV Streaming Platform\n", " Dimensions (W x H x D): TV without stand (inches) : 28.98x17x3.18, TV with stand (inches) : 28.98x18.68x7.78"]
text.strip.lines[1..-1].join
# => " Refresh Rate: 60Hz (Native). Backlight: LED (Full Array)\n Smart Functionality: Yes - xx TV Streaming Platform\n Dimensions (W x H x D): TV without stand (inches) : 28.98x17x3.18, TV with stand (inches) : 28.98x18.68x7.78"
Or:
lines = text.strip.lines
# => ["(093) 123-34-56 (068) 123 45 67 (095) 123 456 78\n", " Refresh Rate: 60Hz (Native). Backlight: LED (Full Array)\n", " Smart Functionality: Yes - xx TV Streaming Platform\n", " Dimensions (W x H x D): TV without stand (inches) : 28.98x17x3.18, TV with stand (inches) : 28.98x18.68x7.78"]
lines.shift
# => "(093) 123-34-56 (068) 123 45 67 (095) 123 456 78\n"
lines.join
# => " Refresh Rate: 60Hz (Native). Backlight: LED (Full Array)\n Smart Functionality: Yes - xx TV Streaming Platform\n Dimensions (W x H x D): TV without stand (inches) : 28.98x17x3.18, TV with stand (inches) : 28.98x18.68x7.78"
Using a regular expression and `gsub` can work, but it is also more likely to become a maintenance problem.
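For comparison, a minimal sketch of the `gsub` route is shown here. The `sample` string and the `PHONE` constant are assumptions made for illustration, and the pattern only covers the three formats that appear in the sample text:

```ruby
# Hypothetical sample; only the first line contains phone numbers.
sample = "(093) 123-34-56 (068) 123 45 67 (095) 123 456 78\nRefresh Rate: 60Hz (Native).\n"

# Assumed pattern: "(ddd) ddd" followed by two groups of 2-3 digits,
# each separated by a space or a hyphen.
PHONE = /\(\d{3}\) \d{3}[ -]\d{2,3}[ -]\d{2,3}/

# Remove every match anywhere in the string.
cleaned = sample.gsub(PHONE, '')
```

The spaces that separated the numbers are left behind, and the pattern is applied to the whole text at once, so any other text that happens to look like a phone number is removed as well.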
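If the phone numbers will always be on one line, but not necessarily the first, then I'd still use `lines` to break the text into an array, but I'd use `reject` with a regular expression to check each line and reject any line that matches a phone-number-like pattern. Because `strip` isn't used here, the leading "\n" is retained: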
lines = text.lines
lines.reject{ |l| l[/\(\d{3}\) \d{3}[ -]\d{2,3}[ -]\d{2,3}/] }
# => ["\n", " Refresh Rate: 60Hz (Native). Backlight: LED (Full Array)\n", " Smart Functionality: Yes - xx TV Streaming Platform\n", " Dimensions (W x H x D): TV without stand (inches) : 28.98x17x3.18, TV with stand (inches) : 28.98x18.68x7.78"]
lines.reject{ |l| l[/\(\d{3}\) \d{3}[ -]\d{2,3}[ -]\d{2,3}/] }.join
# => "\n Refresh Rate: 60Hz (Native). Backlight: LED (Full Array)\n Smart Functionality: Yes - xx TV Streaming Platform\n Dimensions (W x H x D): TV without stand (inches) : 28.98x17x3.18, TV with stand (inches) : 28.98x18.68x7.78"
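Note that using `lines` to convert the text to an array helps isolate any damage, in case something else triggers the pattern match and causes text to be removed unintentionally.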
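This approach breaks down when the phone numbers are scattered throughout the text. Even so, I'd probably still reduce the text to individual lines, because that limits the possible damage if there are false positives.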
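If the numbers are scattered, one possible compromise is to keep the line-by-line structure and remove matches inside each line. This is only a sketch: `PHONE_PATTERN`, `strip_phone_numbers`, and the sample text are hypothetical names and data, and the pattern covers only the formats shown earlier:

```ruby
# Assumed pattern, limited to the formats in the sample text.
PHONE_PATTERN = /\(\d{3}\) \d{3}[ -]\d{2,3}[ -]\d{2,3}/

# Hypothetical helper: removes phone-number-like matches line by line,
# dropping lines that contained nothing else.
def strip_phone_numbers(text)
  cleaned = text.lines.map do |line|
    # Lines without a match pass through untouched.
    next line unless line.match?(PHONE_PATTERN)

    # Remove the matches and collapse the runs of spaces they leave behind.
    stripped = line.gsub(PHONE_PATTERN, '').squeeze(' ')

    # Drop the line entirely if only whitespace remains.
    stripped.strip.empty? ? nil : stripped
  end
  cleaned.compact.join
end

scattered = "Call (093) 123-34-56 today\n(068) 123 45 67\nRefresh Rate: 60Hz\n"
result = strip_phone_numbers(scattered)
```

Only lines that actually match are rewritten, so a false positive can damage at most the line it occurs on.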
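Post what you've tried. Are you using a regular expression? –
Which phone number formats do you need to remove? [There are many.](https://en.wikipedia.org/wiki/National_conventions_for_writing_telephone_numbers) –
There isn't one specific format; it can vary – user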