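K-means clustering in R
I am a beginner in R, and I am following this tutorial on K-means clustering. However, I am trying to run the algorithm on real data. I chose: http://exoplanet.eu/catalog/
I have loaded the data: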
d <- read.csv2(
  "exoplanet.eu_catalog.csv",
  header = TRUE,
  sep = ","
)
With this code:
plot(
  x = log(as.numeric(as.character(d$semi_major_axis))),
  y = log(as.numeric(as.character(d$mass))),
  xlab = "Star-exoplanet distance (log(UA))",
  ylab = "Mass of exoplanets (log(M[Jupiter]))"
)
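I get the following plot:
I would like to run the K-means clustering algorithm on this graph to display three groups in different colors, but I don't know how to go about it in R. I think I first have to do this: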
y <- log(as.numeric(as.character(d$mass)))
y <- y[!is.na(y)]
x <- log(as.numeric(as.character(d$semi_major_axis)))
x <- x[!is.na(x)]
But I don't know how to format the data into a matrix so that I can run kmeans(matrix, 3, nstart = 20). Any clues?
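One possible way to build the matrix is sketched below. It relies on `cbind()` to combine the two vectors into a two-column matrix. Note that dropping the `NA` values from `x` and `y` separately can leave two vectors of different lengths whose elements no longer belong to the same planet, so this sketch filters both with a single logical mask instead. The small data frame `d` is a hypothetical stand-in for the catalog so that the example runs on its own; the values in it are made up, and only the column names `semi_major_axis` and `mass` come from the question.

```r
# Hypothetical stand-in for the catalog (made-up values, stored as text,
# with one missing entry per column).
set.seed(42)
d <- data.frame(
  semi_major_axis = c("0.05", "1.2", NA, "5.2", "0.03", "2.5", "0.1", "9.5", "0.04", "3.1"),
  mass            = c("0.8", "1.5", "2.0", NA, "0.5", "3.0", "1.1", "0.3", "0.9", "4.2"),
  stringsAsFactors = FALSE
)

x <- log(as.numeric(as.character(d$semi_major_axis)))
y <- log(as.numeric(as.character(d$mass)))

# Keep only the rows where BOTH values are finite, so x and y stay paired.
keep <- is.finite(x) & is.finite(y)

# Two-column numeric matrix, one row per planet.
m <- cbind(x = x[keep], y = y[keep])

km <- kmeans(m, centers = 3, nstart = 20)

# Color each point by its cluster and mark the cluster centers.
plot(
  m,
  col = km$cluster,
  pch = 19,
  xlab = "Star-exoplanet distance (log)",
  ylab = "Mass of exoplanets (log)"
)
points(km$centers, pch = 4, cex = 2)
```

With the real catalog, the same steps should apply to the `d` loaded by `read.csv2()` in place of the stand-in data frame. `is.finite()` is used rather than `!is.na()` so that `-Inf` values produced by `log(0)` are also dropped, since `kmeans()` rejects non-finite input.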