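Hive unix_timestamp() UDF giving multiple values
I am using HQL to extract some data from a Hive table, and I add an extra column containing the current time.
Something like: select col1, col2, col3, unix_timestamp() from myTable;
I expect every record to have the same value in the fourth column.
I expect something like this: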
col1Value, col2Value, col3Value, col4Value, timeT
col1Value, col2Value, col3Value, col4Value, timeT
col1Value, col2Value, col3Value, col4Value, timeT
col1Value, col2Value, col3Value, col4Value, timeT
col1Value, col2Value, col3Value, col4Value, timeT
col1Value, col2Value, col3Value, col4Value, timeT
But what I actually get looks like this:
col1Value, col2Value, col3Value, col4Value, timeT1
col1Value, col2Value, col3Value, col4Value, timeT1
col1Value, col2Value, col3Value, col4Value, timeT1
col1Value, col2Value, col3Value, col4Value, timeT2
col1Value, col2Value, col3Value, col4Value, timeT2
col1Value, col2Value, col3Value, col4Value, timeT2
col1Value, col2Value, col3Value, col4Value, timeT2
col1Value, col2Value, col3Value, col4Value, timeT3
col1Value, col2Value, col3Value, col4Value, timeT3
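The data set is not that large, and the query runs with a single mapper. So my question is:
On a single machine, is unix_timestamp() evaluated for every selected row (that is, for each row inside the Hive mapper), or is it evaluated once for all rows?
I am using MapR M5 / Hive 0.9.0.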
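
For reference, here is a workaround I am considering, as a rough sketch only. It assumes Hive's variable substitution is available in my version and that the script is launched with something like `hive --hiveconf now=$(date +%s) -f query.hql`. The variable name `now`, the alias `query_time`, and the file name `query.hql` are placeholders I made up for illustration. The idea is to capture the time once outside the query, so it does not depend on how often unix_timestamp() is evaluated:

```sql
-- query.hql (hypothetical file name)
-- Assumed launch command: hive --hiveconf now=$(date +%s) -f query.hql
-- ${hiveconf:now} should be replaced textually before the query is compiled,
-- so every row would carry the same literal value.
SELECT col1, col2, col3, ${hiveconf:now} AS query_time
FROM myTable;
```

This avoids the question rather than answering it, so I would still like to understand how unix_timestamp() is evaluated.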