2012-08-02 132 views
4

I am storing a table in SequenceFile format in Hive, and I set the following commands to enable BLOCK compression for the SequenceFile:

set mapred.output.compress=true; 
set mapred.output.compression.type=BLOCK; 
set mapred.output.compression.codec=org.apache.hadoop.io.compress.LzoCodec; 

But when I view the table like this:

describe extended lip_table 

the output contains a field named compressed, and it is set to false. Does that mean my data was not compressed by the three commands above?

Detailed Table Information  Table(tableName:lip_table, dbName:default, owner:uname, 
createTime:1343931235, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols: 
[FieldSchema(name:buyer_id, type:bigint, comment:null), FieldSchema(name:total_chkout, 
type:bigint, comment:null), FieldSchema(name:total_errpds, type:bigint, comment:null)], 
location:hdfs://ares-nn/apps/hdmi/uname/lip-data, 
inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
**compressed:false**, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters: 
{serialization.format= , field.delim= 
+3

Try 'describe formatted' instead of 'describe extended'. – Idr 2012-08-12 19:24:51

+0

I am on Hive 0.6, which does not support 'describe formatted', the pretty-printed version of 'describe extended'. :( – ferhan 2012-08-12 22:08:53

Answers

2

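I found this article, which I think gives a solution to your problem. Try specifying the compression-aware input and output formats at the table-definition level, either when creating the table or with an ALTER statement.

At creation time: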

CREATE EXTERNAL TABLE lip_table (
            column1 string 
            , column2 string 
           ) 
PARTITIONED BY (date string) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t" 
STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat" 
      OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat" 
LOCATION '/path/to/hive/tables/lip'; 

Using ALTER (this only affects partitions created afterwards):

ALTER TABLE lip_table 
SET FILEFORMAT 
    INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat" 
    OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"; 

http://www.mrbalky.com/2011/02/24/hive-tables-partitions-and-lzo-compression/
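As a minimal sketch of how the table-level formats above might be paired with session settings when a new partition is written: the `hive.exec.compress.output` property and the hadoop-lzo `LzopCodec` class are assumptions that do not appear in this thread, and `staging_table` and the partition value are hypothetical.

```sql
-- Assumption: the hadoop-lzo library is installed on the cluster
-- and available on Hive's classpath.
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

-- staging_table is a hypothetical source table with matching columns.
-- Newer Hive versions may require backticks around the column name `date`.
INSERT OVERWRITE TABLE lip_table PARTITION (date = '2012-08-02')
SELECT column1, column2
FROM staging_table;
```

Because the ALTER form only affects partitions created afterwards, only partitions written after the change would be expected to use the new format.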

+0

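I used the SQL above, but it failed with the following error: FAILED: ParseException line 4:73 mismatched input '' expecting SERDE near ''org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'' in file format specification – wgzhao 2015-10-08 02:57:56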

1

To avoid the SerDe exception, specify the SerDe class as well.

ALTER TABLE <<table name>> 
SET FILEFORMAT 
INPUTFORMAT "<<Input format class>>" 
OUTPUTFORMAT 
"<<Output format class>>" SERDE "<<Serde class>>";
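For illustration, the template might be filled in as follows for the table in this thread. This is a sketch under assumptions: the input and output formats are taken from the earlier answer, and the SerDe class is the one shown in the question's `describe extended` output, which may not be the right choice for every table.

```sql
-- Assumption: LazySimpleSerDe (the serializationLib reported for lip_table
-- in the question) is the SerDe the table should keep using.
ALTER TABLE lip_table
SET FILEFORMAT
INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
SERDE "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe";
```

Naming the SerDe explicitly is what the parser asks for in the error quoted in the comment above ("expecting SERDE").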