Proper use of gsub / regular expressions in R?

I have a long list of strings, for example this machine-readable one:
A <- list(c("Biology","Cell Biology","Art","Humanities, Multidisciplinary; Psychology, Experimental","Astronomy & Astrophysics; Physics, Particles & Fields","Economics; Mathematics, Interdisciplinary Applications; Social Sciences, Mathematical Methods","Geriatrics & Gerontology","Gerontology","Management","Operations Research & Management Science","Computer Science, Artificial Intelligence; Computer Science, Information Systems; Engineering, Electrical & Electronic","Economics; Mathematics, Interdisciplinary Applications; Social Sciences, Mathematical Methods; Statistics & Probability"))
So it looks like this:
> A
[[1]]
[1] "Biology"
[2] "Cell Biology"
[3] "Art"
[4] "Humanities, Multidisciplinary; Psychology, Experimental"
[5] "Astronomy & Astrophysics; Physics, Particles & Fields"
[6] "Economics; Mathematics, Interdisciplinary Applications; Social Sciences, Mathematical Methods"
[7] "Geriatrics & Gerontology"
[8] "Gerontology"
[9] "Management"
[10] "Operations Research & Management Science"
[11] "Computer Science, Artificial Intelligence; Computer Science, Information Systems; Engineering, Electrical & Electronic"
[12] "Economics; Mathematics, Interdisciplinary Applications; Social Sciences, Mathematical Methods; Statistics & Probability"
I would like to recode these terms and remove the resulting duplicates, so that I get this result:
[1] "Science"
[2] "Science"
[3] "Arts & Humanities"
[4] "Arts & Humanities; Social Sciences"
[5] "Science"
[6] "Social Sciences; Science"
[7] "Science"
[8] "Social Sciences"
[9] "Social Sciences"
[10] "Science"
[11] "Science"
[12] "Social Sciences; Science"
So far I have only got this:
stringedit <- function(A)
{
A <-gsub("Biology", "Science", A)
A <-gsub("Cell Biology", "Science", A)
A <-gsub("Art", "Arts & Humanities", A)
A <-gsub("Humanities, Multidisciplinary", "Arts & Humanities", A)
A <-gsub("Psychology, Experimental", "Social Sciences", A)
A <-gsub("Astronomy & Astrophysics", "Science", A)
A <-gsub("Physics, Particles & Fields", "Science", A)
A <-gsub("Economics", "Social Sciences", A)
A <-gsub("Mathematics", "Science", A)
A <-gsub("Mathematics, Applied", "Science", A)
A <-gsub("Mathematics, Interdisciplinary Applications", "Science", A)
A <-gsub("Social Sciences, Mathematical Methods", "Social Sciences", A)
A <-gsub("Geriatrics & Gerontology", "Science", A)
A <-gsub("Gerontology", "Social Sciences", A)
A <-gsub("Management", "Social Sciences", A)
A <-gsub("Operations Research & Management Science", "Science", A)
A <-gsub("Computer Science, Artificial Intelligence", "Science", A)
A <-gsub("Computer Science, Information Systems", "Science", A)
A <-gsub("Engineering, Electrical & Electronic", "Science", A)
A <-gsub("Statistics & Probability", "Science", A)
}
B <- lapply(A, stringedit)
But it does not work correctly:
> B
[[1]]
[1] "Science"
[2] "Cell Science"
[3] "Arts & Humanities"
[4] "Arts & Humanities; Social Sciences"
[5] "Science; Science"
[6] "Social Sciences; Science, Interdisciplinary Applications; Social Sciences"
[7] "Science"
[8] "Social Sciences"
[9] "Social Sciences"
[10] "Operations Research & Social Sciences Science"
[11] "Computer Science, Arts & Humanitiesificial Intelligence; Science; Science"
[12] "Social Sciences; Science, Interdisciplinary Applications; Social Sciences; Science"
How can I achieve the correct output shown above?
Thank you very much in advance for your consideration!
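Whenever you find yourself ending up with lots of similar lines of code, you are circumventing the lovely [DRY principle](http://en.wikipedia.org/wiki/Don%27t_repeat_yourself). So it is time for a redesign, most likely a wrapper passed to some kind of `apply` function or another loop-like helper. – aL3xa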
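One possible reading of that suggestion, as a minimal sketch and not the commenter's own code: the unwanted results in `B` appear because each `gsub` call matches substrings, so "Art" also fires inside "Artificial" and "Biology" inside "Cell Biology". Splitting each string on `;` and looking up every term by exact name avoids this, and `unique()` removes the duplicates. The names `lookup`, `recode_terms` and `sample_terms` are hypothetical, and leaving unknown terms unchanged is an assumption.

```r
# Lookup table: exact subject term -> broad category
# (pairs copied from the gsub() calls in the question).
lookup <- c(
  "Biology" = "Science",
  "Cell Biology" = "Science",
  "Art" = "Arts & Humanities",
  "Humanities, Multidisciplinary" = "Arts & Humanities",
  "Psychology, Experimental" = "Social Sciences",
  "Astronomy & Astrophysics" = "Science",
  "Physics, Particles & Fields" = "Science",
  "Economics" = "Social Sciences",
  "Mathematics" = "Science",
  "Mathematics, Applied" = "Science",
  "Mathematics, Interdisciplinary Applications" = "Science",
  "Social Sciences, Mathematical Methods" = "Social Sciences",
  "Geriatrics & Gerontology" = "Science",
  "Gerontology" = "Social Sciences",
  "Management" = "Social Sciences",
  "Operations Research & Management Science" = "Science",
  "Computer Science, Artificial Intelligence" = "Science",
  "Computer Science, Information Systems" = "Science",
  "Engineering, Electrical & Electronic" = "Science",
  "Statistics & Probability" = "Science"
)

# Hypothetical helper: recode a character vector of "; "-separated terms.
recode_terms <- function(x, lookup) {
  # Split on ";" plus optional whitespace; commas stay inside the terms.
  parts <- strsplit(x, ";\\s*")
  vapply(parts, function(p) {
    # Character indexing with `[` matches names exactly, never partially.
    mapped <- unname(lookup[p])
    # Assumption: terms missing from the table are kept unchanged.
    missing <- is.na(mapped)
    mapped[missing] <- p[missing]
    # Drop repeated categories, keeping first-seen order.
    paste(unique(mapped), collapse = "; ")
  }, character(1))
}

# Small sample taken from the question's data.
sample_terms <- c(
  "Cell Biology",
  "Humanities, Multidisciplinary; Psychology, Experimental",
  "Operations Research & Management Science"
)
print(recode_terms(sample_terms, lookup))
```

With the question's data this would be called as `B <- lapply(A, recode_terms, lookup = lookup)`. Keeping the mapping in one named vector also means a new term is one new entry instead of another `gsub` line.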