Why doesn't this regular expression work in R?

I have tried grep, grepl, regexpr, and gregexpr, and all of them either fail or return a non-integer result.

The object is "test", a character vector of address strings. For example:

[9972] "1350 Hwy 160 W\nFort Mill, SC 29715"                 
[9973] "Sonoran Desert Dentistry\n9220 E Raintree Dr\nSte 102\nScottsdale, AZ 85260"       
[9974] "3252 Vilas Rd\nCottage Grove, WI 53527"                
[9975] "224 W Cottage Grove Rd\nCottage Grove, WI 53527"              
[9976] "320 W Cottage Grove Rd\nCottage Grove, WI 53527"              
[9977] "7914 State Road 19\nDane, WI 53529"                 
[9978] "106 Dane St\nDane, WI 53529"

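The goal is to extract everything after the last "\n", so that I keep only the city through the ZIP code, e.g. "Cottage Grove, WI 53527".

Here is a sample of the grep call and regular expression that does not work: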

> grep("\\[^\\]+$", test) 
integer(0)

Any help would be great.

asked 2015-11-20 by frameworkgeek

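There are no backslashes in those lines of text. You need to know that the print() output of character values containing escape characters differs from the cat() output. Read ?Quotes and try cat() on some of the lines. (... I think "[^\\]" would match anything.)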
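The difference the comment describes can be checked directly. The following is a minimal sketch using one address from the question; the variable names are illustrative and not part of the original post.

```r
# One address from the question, typed as an R string literal.
x <- "106 Dane St\nDane, WI 53529"

print(x)       # shows the escaped form, with \n written out inside the quotes
cat(x, "\n")   # writes the actual characters, so the address spans two lines

# The string holds a single newline character, not a backslash followed by "n".
has_backslash <- grepl("\\", x, fixed = TRUE)   # FALSE: no literal backslash
has_newline   <- grepl("\n", x, fixed = TRUE)   # TRUE: a real newline is present
n_chars       <- nchar(x)                       # 26: the newline counts as one character
```

Because the data contains a newline rather than a backslash, a pattern that tries to match a literal backslash has nothing to find.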

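grep() does not change text. It only finds it and returns either the indices of the matching elements or the matching elements themselves. To change the matched text, you want sub() or gsub(). In this case sub() is appropriate, because you want to remove everything up to and including the last newline in each string. The following should do it.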

sub(".*\n", "", test) 
# [1] "Fort Mill, SC 29715"  "Scottsdale, AZ 85260"  
# [3] "Cottage Grove, WI 53527" "Cottage Grove, WI 53527" 
# [5] "Cottage Grove, WI 53527" "Dane, WI 53529" 
# [7] "Dane, WI 53529"

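.* is greedy and matches anything
\n is the newline we are looking for

Because .* is greedy, this removes everything up to and including the last \n.

Data: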
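To see why greediness matters here, the sketch below compares the greedy pattern with a lazy variant on the one address from the question that contains three newlines. It relies on R's default regex engine, where "." also matches a newline; with perl = TRUE that is not the default behaviour, so the result would differ. The last statement shows an alternative that extracts the final line directly with regexpr() and regmatches(), which is closer to what the question attempted. The variable names are illustrative only.

```r
x <- "Sonoran Desert Dentistry\n9220 E Raintree Dr\nSte 102\nScottsdale, AZ 85260"

# Greedy: .* extends as far as it can, so the match ends at the last newline
# and only the final line is left.
greedy <- sub(".*\n", "", x)

# Lazy: .*? stops at the first newline, so only the first line is removed.
lazy <- sub(".*?\n", "", x)

# Alternative: match the run of non-newline characters at the end of the
# string and pull it out, instead of deleting what precedes it.
last_line <- regmatches(x, regexpr("[^\n]+$", x))
```

Here greedy and last_line both hold "Scottsdale, AZ 85260", while lazy still contains the street and suite lines.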

test <- c("1350 Hwy 160 W\nFort Mill, SC 29715", "Sonoran Desert Dentistry\n9220 E Raintree Dr\nSte 102\nScottsdale, AZ 85260", 
"3252 Vilas Rd\nCottage Grove, WI 53527", "224 W Cottage Grove Rd\nCottage Grove, WI 53527", 
"320 W Cottage Grove Rd\nCottage Grove, WI 53527", "7914 State Road 19\nDane, WI 53529", 
"106 Dane St\nDane, WI 53529")

answered 2015-11-20 02:17:05

Hey, I owe you a beer. – frameworkgeek
