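Spark: fill null values in a DataFrame with a Vector

I have a DataFrame containing feature vectors created by VectorAssembler, and some of its values are null. I now want to replace the null values with a vector: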
val nil = Vectors.dense(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
df.na.fill(nil) // does not work.
What is the correct way to do this?
Edit: Thanks to the answer, I found the following solution:
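import org.apache.spark.ml.linalg.Vectors // assumed: the spark.ml vector type that VectorAssembler produces
import org.apache.spark.sql.functions.{broadcast, coalesce}
import sc.implicits._ // sc must be the SparkSession here (it is often named spark)

// Replacement vector: 20 ones
val nil = Vectors.dense(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
// One-row DataFrame holding the replacement vector
val fill = Seq(Tuple1(nil)).toDF("replacement")
// Columns to patch: every column whose name contains "1"
val dates = data.schema.fieldNames.filter(e => e.contains("1"))
// data must be declared as a var, since it is reassigned below
data = data.crossJoin(broadcast(fill))
for (e <- dates) {
  // coalesce keeps the existing vector and falls back to the replacement when it is null
  data = data.withColumn(e, coalesce(data.col(e), $"replacement"))
}
data = data.drop("replacement")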