Spark SQL JSON boolean evaluation
My example JSON schema (truncated for size):
|-- LinearScheduleResult: struct (nullable = true)
| |-- Build: string (nullable = true)
| |-- EndTimestamp: string (nullable = true)
| |-- Errors: array (nullable = true)
| | |-- element: string (containsNull = true)
| |-- RequestId: string (nullable = true)
| |-- Schedule: struct (nullable = true)
| | |-- Airings: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- AiringTime: string (nullable = true)
| | | | |-- AiringType: string (nullable = true)
| | | | |-- CC: boolean (nullable = true)
| | | | |-- CallLetters: string (nullable = true)
| | | | |-- Category: string (nullable = true)
| | | | |-- Channel: string (nullable = true)
| | | | |-- Color: string (nullable = true)
| | | | |-- Copy: string (nullable = true)
| | | | |-- DSS: boolean (nullable = true)
| | | | |-- DVS: boolean (nullable = true)
| | | | |-- Dolby: boolean (nullable = true)
| | | | |-- Duration: long (nullable = true)
| | | | |-- DvbTriplet: string (nullable = true)
| | | | |-- EpisodeTitle: string (nullable = true)
| | | | |-- HD: boolean (nullable = true)
| | | | |-- HDLevel: string (nullable = true)
| | | | |-- IconAvailable: boolean (nullable = true)
| | | | |-- InstanceId: string (nullable = true)
| | | | |-- LetterBox: boolean (nullable = true)
| | | | |-- MovieRating: string (nullable = true)
| | | | |-- ParentNetworkId: long (nullable = true)
| | | | |-- ProgramId: string (nullable = true)
| | | | |-- SAP: boolean (nullable = true)
| | | | |-- SL: string (nullable = true)
| | | | |-- SeriesId: string (nullable = true)
| | | | |-- ServiceId: long (nullable = true)
| | | | |-- ShowingType: string (nullable = true)
| | | | |-- SourceDisplayName: string (nullable = true)
| | | | |-- SourceId: long (nullable = true)
| | | | |-- SourceLongName: string (nullable = true)
| | | | |-- Sports: boolean (nullable = true)
When I run the following:
results = sqlContext.sql("SELECT LinearScheduleResult.Schedule.Airings.Sports from tv")
It returns:
[Row(Sports=[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False])]
When I try something more complex, such as:
results = sqlContext.sql("SELECT LinearScheduleResult.Schedule.Airings from tv where LinearScheduleResult.Schedule.Airings.Sports = 'False'")
It never returns anything. I have tried 'False', False, 0, false, and many other combinations.
Any help would be greatly appreciated.
Or you could drop down to a regular RDD computation.
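A rough sketch of what that RDD route might look like, offered as an illustration and not a tested fix. In the schema, Airings is an array, so Airings.Sports yields one list of booleans per row (as the first query's output shows) and not a single boolean that a WHERE clause could compare. The sketch filters the elements of that array in plain Python. The helper name `airings_by_sports` is hypothetical, and the Spark wiring in the trailing comments assumes one JSON document per line at a placeholder path.

```python
def airings_by_sports(record, wanted=False):
    """Return the airings of one parsed JSON record whose Sports flag is `wanted`.

    `record` is a plain dict, e.g. the result of json.loads on one input line.
    Every level is nullable in the schema, so missing or null levels yield [].
    """
    result = record.get("LinearScheduleResult") or {}
    schedule = result.get("Schedule") or {}
    airings = schedule.get("Airings") or []
    return [a for a in airings if a.get("Sports") is wanted]


# Small local check with made-up data (not taken from the real feed).
sample = {
    "LinearScheduleResult": {
        "Schedule": {
            "Airings": [
                {"ProgramId": "p1", "Sports": False},
                {"ProgramId": "p2", "Sports": True},
            ]
        }
    }
}
non_sports = airings_by_sports(sample, wanted=False)

# With Spark, assuming `sc` is the SparkContext and the path is a placeholder:
#   import json
#   rdd = sc.textFile("tv.json").map(json.loads)
#   non_sports_rdd = rdd.flatMap(airings_by_sports)
```

Using flatMap here turns each record's list of matching airings into individual elements, so the resulting RDD holds one airing per element instead of one array per record.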