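Common way to change the nullable property of all elements of a Spark SQL StructType in Scala

Is there a common way to change the nullable property of every element of a given StructType? The StructType may be nested.
I see that @eliasah marked this as a duplicate of "Spark Dataframe column nullable property change". The two questions are different, though: the answer there does not handle hierarchical/nested StructTypes and only works for a single level.
For example: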
root
|-- user_id: string (nullable = false)
|-- name: string (nullable = false)
|-- system_process: array (nullable = false)
| |-- element: struct (containsNull = false)
| | |-- timestamp: long (nullable = false)
| | |-- process: string (nullable = false)
|-- type: string (nullable = false)
|-- user_process: array (nullable = false)
| |-- element: struct (containsNull = false)
| | |-- timestamp: long (nullable = false)
| | |-- process: string (nullable = false)
I want to set nullable (and containsNull) to true for every element. The result should be:
root
|-- user_id: string (nullable = true)
|-- name: string (nullable = true)
|-- system_process: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- timestamp: long (nullable = true)
| | |-- process: string (nullable = true)
|-- type: string (nullable = true)
|-- user_process: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- timestamp: long (nullable = true)
| | |-- process: string (nullable = true)
For convenience when testing, here is the sample StructType as a JSON schema:
import org.apache.spark.sql.types.{DataType, StructType}
val jsonSchema="""{"type":"struct","fields":[{"name":"user_id","type":"string","nullable":false,"metadata":{}},{"name":"name","type":"string","nullable":false,"metadata":{}},{"name":"system_process","type":{"type":"array","elementType":{"type":"struct","fields":[{"name":"timestamp","type":"long","nullable":false,"metadata":{}},{"name":"process_id","type":"string","nullable":false,"metadata":{}}]},"containsNull":false},"nullable":false,"metadata":{}},{"name":"type","type":"string","nullable":false,"metadata":{}},{"name":"user_process","type":{"type":"array","elementType":{"type":"struct","fields":[{"name":"timestamp","type":"long","nullable":false,"metadata":{}},{"name":"process_id","type":"string","nullable":false,"metadata":{}}]},"containsNull":false},"nullable":false,"metadata":{}}]}"""
DataType.fromJson(jsonSchema).asInstanceOf[StructType].printTreeString()
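To make the desired transformation concrete, here is a minimal sketch of one possible approach: a recursive function that rebuilds the schema and sets every nullability flag on the way down. The name `setNullable` and the small sample schema are made up for illustration, and this is not necessarily the idiomatic or only way to do it.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper (not part of Spark): rebuild a DataType with every
// nullable / containsNull / valueContainsNull flag set to `nullable`.
def setNullable(dt: DataType, nullable: Boolean): DataType = dt match {
  case StructType(fields) =>
    // Recurse into each field's type and overwrite the field's own flag.
    StructType(fields.map { f =>
      f.copy(dataType = setNullable(f.dataType, nullable), nullable = nullable)
    })
  case ArrayType(elementType, _) =>
    ArrayType(setNullable(elementType, nullable), containsNull = nullable)
  case MapType(keyType, valueType, _) =>
    MapType(
      setNullable(keyType, nullable),
      setNullable(valueType, nullable),
      valueContainsNull = nullable)
  case other =>
    // Atomic types (string, long, ...) carry no nullability flag themselves.
    other
}

// A small nested schema in the spirit of the example above (illustrative only).
val sample = StructType(Seq(
  StructField("user_id", StringType, nullable = false),
  StructField(
    "system_process",
    ArrayType(
      StructType(Seq(
        StructField("timestamp", LongType, nullable = false),
        StructField("process", StringType, nullable = false))),
      containsNull = false),
    nullable = false)))

val relaxed = setNullable(sample, nullable = true).asInstanceOf[StructType]
relaxed.printTreeString()
```

The same call should work on the schema parsed from `jsonSchema` above. To apply the relaxed schema to an existing DataFrame, something like `spark.createDataFrame(df.rdd, relaxed)` may be an option, where `spark` and `df` are assumed to be an existing SparkSession and DataFrame.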