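OK, this has me absolutely baffled and worried. As part of routine work, I have been classifying individual observations of a variable as TRUE or FALSE according to whether their values are above, or below/equal to, the median. However, I keep getting behavior in R that is quite unexpected for such a simple test.
So take this set of observations: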
data=c(0.6666667, 0.8333, 0.6666667, 0.8333, 0.8333, 0.75, 0.9999, 0.7499667, 0.25, 0.6666667, 0.1667, 0.7499667, 0.5, 0.2500333, 0.3333667, 0.0834, 0.0001, 0.2500333, 0.8333, 0.9999, 0.9999, 0.2500333, 0.2500333, 0.3333667, 0.9166, 0.5, 0.2500333, 0.4166667, 0.0001, 0.1667333, 0.6666333, 0.0834, 0.1667, 0.6666333, 0.9166, 0.1667, 0.7499333, 0.9166, 0.9166, 0.9166, 0.7499667, 0.7499667, 0.4166667, 0.5, 0.2500333, 0.9166, 0.6666667, 0.1667333, 0.25, 0.0001, 0.3333667, 0.0001, 0.25, 0.0834, 0.9999, 0.0834, 0.1667, 0.5, 0.2500333, 0.3333667, 0.9166, 0.9166, 0.8333, 0.9166, 0.75, 0.0834, 0.4166667, 0.5, 0.0001, 0.9999, 0.8333, 0.6666667, 0.9166)
To classify these values, this is what I do:
data_med=median(data)
quant_data=data
quant_data[quant_data>data_med]="High"
quant_data[quant_data<=data_med]="Low"
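I know there are a million more efficient ways of doing this, but what worries me is that the output makes no sense. Since there are no NaNs in the set and the two tests are all-inclusive (> or <=), every value should satisfy exactly one of them, so I should end up with a vector containing only "High" and "Low". Instead I get: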
[1] "High" "High" "High" "High" "High" "High" "High" "High" "Low" "High" "Low" "High" "Low" "Low" "Low" "Low" "1e-04"
[18] "Low" "High" "High" "High" "Low" "Low" "Low" "High" "Low" "Low" "Low" "1e-04" "Low" "High" "Low" "Low" "High"
[35] "High" "Low" "High" "High" "High" "High" "High" "High" "Low" "Low" "Low" "High" "High" "Low" "Low" "1e-04" "Low"
[52] "1e-04" "Low" "Low" "High" "Low" "Low" "Low" "Low" "Low" "High" "High" "High" "High" "High" "Low" "Low" "Low"
[69] "1e-04" "High" "High" "High" "High"
See the "1e-04"s? Even stranger, let's pick value 69, one of those that returned the odd value:
data[69]
>1e-04
If I test this value on its own, I get what I expect:
data[69]<=data_med
TRUE
Can anyone explain this behavior? It looks downright dangerous...
Remove this line: 'quant_data = data' and use 'data' instead of 'quant_data' inside '['. – Arun 2013-04-30 17:54:07
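To illustrate why indexing by the original numeric vector matters, here is a minimal sketch on a toy vector. The names x, med, y and labels are hypothetical and are not from the question. As far as I can tell, the first assignment of "High" coerces the whole vector to character, so the second test compares strings rather than numbers, and "1e-04" does not sort at or below "0.5".

```r
# Hypothetical toy vector (not the data from the question)
x   <- c(0.75, 0.25, 1e-04, 0.5, 0.9)
med <- median(x)                 # 0.5

# Reproduce the problem: overwrite a copy in place
y <- x
y[y > med] <- "High"             # assigning a string coerces all of y to character
class(y)                         # "character"
y                                # "High" "0.25" "1e-04" "0.5" "High"

# This test is now a string comparison: med is coerced to "0.5",
# and "1e-04" starts with "1", which sorts after "0"
y <= med

# One possible fix: build the labels in a separate vector and
# always test the untouched numeric vector x
labels <- character(length(x))
labels[x > med]  <- "High"
labels[x <= med] <- "Low"
labels                           # "High" "Low" "Low" "Low" "High"
```

The exact result of the string comparison can depend on the locale's collation order, so treat the middle step as a demonstration rather than a guarantee.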
A relatively better way to accomplish this task is to use 'ifelse', for starters: 'quant_data <- ifelse(data > data_med, "High", "Low")' – Arun 2013-04-30 17:56:05
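A small sketch of the 'ifelse' suggestion on a toy vector (x, med and labels are hypothetical names, not from the question). Because the test is evaluated once on the numeric vector and the result goes into a new vector, nothing numeric is ever compared as a string.

```r
# Hypothetical toy vector (not the data from the question)
x   <- c(0.75, 0.25, 1e-04, 0.5, 0.9)
med <- median(x)                          # 0.5

# Vectorised test on the numeric data; one label per element
labels <- ifelse(x > med, "High", "Low")
labels                                    # "High" "Low" "Low" "Low" "High"
```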
Why the down-vote? – Arun 2013-04-30 17:59:01