Replace character values in a data frame with numeric values
I am working on the SAT scores database: https://nycopendata.socrata.com/Education/SAT-Results/f9bf-2cp4?
This is what it looks like:
> head(SAT)
     DBN                                   SCHOOL.NAME Num.of.SAT.Test.Takers
1 01M292 HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES                     29
2 01M448           UNIVERSITY NEIGHBORHOOD HIGH SCHOOL                     91
3 01M450                    EAST SIDE COMMUNITY SCHOOL                     70
4 01M458                     FORSYTH SATELLITE ACADEMY                      7
5 01M509                       MARTA VALLE HIGH SCHOOL                     44
6 01M515       LOWER EAST SIDE PREPARATORY HIGH SCHOOL                    112
  SAT.Critical.Reading.Avg..Score SAT.Math.Avg..Score SAT.Writing.Avg..Score
1                             355                 404                    363
2                             383                 423                    366
3                             377                 402                    370
4                             414                 401                    359
5                             390                 433                    384
6                             332                 557                    316
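In the column Num.of.SAT.Test.Takers, many values are simply the character 's'. The corresponding values in the score columns are also 's', with no numeric score.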
> SATnocandidates<-SAT[SAT$Num.of.SAT=='s', ]
> head(SATnocandidates)
      DBN                                 SCHOOL.NAME Num.of.SAT.Test.Takers
23 02M392                  MANHATTAN BUSINESS ACADEMY                      s
24 02M393                   BUSINESS OF SPORTS SCHOOL                      s
26 02M399  THE HIGH SCHOOL FOR LANGUAGE AND DIPLOMACY                      s
39 02M427       MANHATTAN ACADEMY FOR ARTS & LANGUAGE                      s
41 02M437 HUDSON HIGH SCHOOL OF LEARNING TECHNOLOGIES                      s
42 02M438   INTERNATIONAL HIGH SCHOOL AT UNION SQUARE                      s
   SAT.Critical.Reading.Avg..Score SAT.Math.Avg..Score SAT.Writing.Avg..Score
23                               s                   s                      s
24                               s                   s                      s
26                               s                   s                      s
39                               s                   s                      s
41                               s                   s                      s
42                               s                   s                      s
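Questions
- In the original SAT data frame, I want to replace all the 's' values in the Num.of.SAT.Test.Takers column with 0, so that the column becomes a numeric vector.
- After that, I want to selectively replace all the 's' values in the corresponding score columns with 0.
- How do I write a general command that finds every 's' value in the data frame and replaces it with 0?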
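One possible way to cover all three steps is sketched below. It assumes the affected columns were read in as character (or factor) and that 's' is the only non-numeric marker in them. The two-row `SAT` data frame is a hypothetical stand-in built from rows shown above, not the real file, and `score_cols` is a name introduced only for this illustration.

```r
# Hypothetical stand-in for the real data: two rows copied from the output above.
SAT <- data.frame(
  DBN = c("01M292", "02M392"),
  SCHOOL.NAME = c("HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES",
                  "MANHATTAN BUSINESS ACADEMY"),
  Num.of.SAT.Test.Takers = c("29", "s"),
  SAT.Critical.Reading.Avg..Score = c("355", "s"),
  SAT.Math.Avg..Score = c("404", "s"),
  SAT.Writing.Avg..Score = c("363", "s"),
  stringsAsFactors = FALSE
)

# Columns that should end up numeric: everything except the two ID columns.
score_cols <- setdiff(names(SAT), c("DBN", "SCHOOL.NAME"))

# In each of those columns, replace 's' with "0" and convert to numeric.
SAT[score_cols] <- lapply(SAT[score_cols], function(x) {
  x <- as.character(x)   # also copes with factor columns
  x[x == "s"] <- "0"
  as.numeric(x)
})

str(SAT)
```

To change only one column, the same three lines inside the function can be applied to `SAT$Num.of.SAT.Test.Takers` directly. Restricting the replacement to `score_cols` leaves DBN and SCHOOL.NAME untouched.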
Is "s" a missing value? If so, set "s" as the value for "na.strings" when reading in the data.... – A5C1D2H2I1M1N2O1R2T1
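A minimal sketch of the `na.strings` suggestion, assuming the data comes from a CSV file. The inline `csv_text` and its header names are hypothetical stand-ins for the downloaded file, and `SAT_na` is a name introduced only for this example.

```r
# Hypothetical inline CSV standing in for the downloaded file.
csv_text <- "DBN,SCHOOL NAME,Num of SAT Test Takers
01M292,HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES,29
02M392,MANHATTAN BUSINESS ACADEMY,s"

# Fields that consist solely of "s" are read as NA, so the count column
# is parsed as a number rather than as text.
SAT_na <- read.csv(text = csv_text, na.strings = "s", stringsAsFactors = FALSE)

str(SAT_na)
```

`na.strings` is compared against whole fields, so school names that merely contain the letter s are not affected. With a real file, `text = csv_text` would be replaced by the file path.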
In fact, NA is probably preferable to 0. (0 would distort your histograms, correlations, means......) –
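A small illustration of that point, using made-up numbers: three hypothetical schools, one of which reported 's'. The vector names are introduced only for this example.

```r
# Hypothetical scores for three schools, one of which reported 's'.
scores_zero <- c(355, 383, 0)    # 's' recoded as 0
scores_na   <- c(355, 383, NA)   # 's' recoded as NA

mean_zero <- mean(scores_zero)              # the 0 is averaged in: 246
mean_na   <- mean(scores_na, na.rm = TRUE)  # the NA is skipped: 369

print(c(mean_zero = mean_zero, mean_na = mean_na))
```

Functions such as `sum()` and `mean()` accept `na.rm = TRUE`, so totals and averages can still be computed with NA in place.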
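Ananda, I am a beginner with no programming background. It may well be a missing value, but I would rather set it to the numeric 0, because eventually I need to add rows and columns and make pie charts, box plots, etc. – vagabond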