2013-10-17
0

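I am trying to extract the numbers from a sentence and then put them together as a numeric array. In other words: extract numbers that contain commas from a string and convert them to a numeric array.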

string<-"  The Team: $74,810 TOTAL RAISED SO FARJOIN THE TEAM Vik Muniz 
      Amount Raised: $70,560  71% Raised of $100,000 Goal CDI International, 
      Inc. Amount Raised: $2,070 Robert Goodwin Amount Raised: $1,500  
      30% Raised of $5,000 Goal Marcel Fukayama Amount Raised: 
      $210 Maitê Proença Amount Raised: $140 
      Thiago Nascimento Amount Raised: $120 
      Lydia Kroeger Amount Raised: $80 "   

To proceed, I first remove the commas so that I can extract the numbers easily:

string.nocomma <- gsub(',', '', string) 

Then I try to put the numbers together as a numeric vector:

fund.numbers <-unique(as.numeric(gsub("[^0-9]"," ",string.nocomma),""))  

And here are the problems:

  1. R throws a warning after the last command. The message is as follows:

    Warning message: 
    In unique(as.numeric(gsub("[^0-9]", " ", website.fund.nocomma), : 
    NAs introduced by coercion 
    
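  2. Even if I solve the problem above and get a numeric vector, I do not know how to convert that numeric vector into a numeric array.

    Can anyone help me? Thanks.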

Answers

2

You can do it like this:

## Extract all numbers and commas 
numbers <- unlist(regmatches(string, gregexpr("[0-9,]+", string))) 
## Delete commas 
numbers <- gsub(",", "", numbers) 
## Delete empty strings (when only one comma has been extracted) 
numbers <- numbers[numbers != ""] 
numbers 

# [1] "74810" "70560" "71"  "100000" "2070" "1500" "30"  
# [8] "5000" "210" "140" "120" "80" 
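
The result above is still a character vector, while the question asks for numbers. The following is a minimal sketch of the remaining step, assuming `numbers` holds exactly the values printed above; the name `fund.array` is hypothetical and used only for illustration. `as.numeric()` converts the strings, and `array()` wraps the vector in a one-dimensional array if an array object is really required.

```r
# Assumption: the character vector produced above (values copied from the printed output)
numbers <- c("74810", "70560", "71", "100000", "2070", "1500", "30",
             "5000", "210", "140", "120", "80")

# Convert the strings to a numeric vector
fund.numbers <- as.numeric(numbers)

# Wrap the vector in a one-dimensional array (hypothetical name)
fund.array <- array(fund.numbers)

is.numeric(fund.numbers)  # check the type of the converted vector
dim(fund.array)           # an array carries a dim attribute, a plain vector does not
```

For most purposes in R the numeric vector is already sufficient; the `array()` call is only needed when a function explicitly expects an object with a `dim` attribute.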
1

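After you apply gsub(), you have a single string made of digits and spaces, so it cannot be converted directly to a number. What you need is a vector of numbers. I think the best way to get it is with gregexpr: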

## Get a list of the substrings that contain only digits
res <- regmatches(string.nocomma, gregexpr("([0-9]+)", string.nocomma))
## Convert it to a numeric vector
res <- as.numeric(unlist(res))
res

[1] 74810 70560  71 100000 2070 1500  30 5000 210 140 120 
[12]  80 
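
The reason for the warning can be seen on a smaller input. The sketch below uses a shortened sample string `s` (an assumption, not the full string from the question) and shows that the gsub() result is one string, that coercing it as a whole gives NA, and that splitting on runs of spaces with `strsplit()` is one possible alternative to gregexpr.

```r
# Assumption: a shortened sample in the same style as the string in the question
s <- "Amount Raised: $70,560  71% Raised of $100,000 Goal"

s.nocomma <- gsub(",", "", s)
spaced <- gsub("[^0-9]", " ", s.nocomma)  # every non-digit becomes a space

length(spaced)                        # still a single string, not a vector
suppressWarnings(as.numeric(spaced))  # the whole string is not one number, so NA

# Split on runs of spaces after trimming the ends, then convert each piece
parts <- strsplit(trimws(spaced), " +")[[1]]
as.numeric(parts)
```

Calling `unique()` on the converted vector, as in the question, would then drop repeated amounts if that is desired.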