2012-05-29 68 views
1

Unix Bash script: conditions for empty files

I have the following lines in a Bash script:

for file in $diff_file_list 
    do 

     # replace any ? with $current_date and replace any % with $file 
     formatted_output_filename=$(echo $output_filename | sed "s|?|$current_date|g" | sed "s|%|_$file|g") 
     $pig_bin_dir/pig -param preceding=$hdfs_hadoop_pre_dir/$file -param current=$hdfs_hadoop_cur_dir/$file -param output_added=$hdfs_hadoop_delta_dir/${file}_added -param output_removed=$hdfs_hadoop_delta_dir/${file}_removed -param delimiter=$delimiter diff.pig 
     [ $? -ne 0 ] && die "diff of data between $previous_date and $current_date using pig failed. exiting `basename $0` script" 
     $hadoop_bin_dir/hadoop dfs -cat $hdfs_hadoop_delta_dir/${file}_added/* | gzip > $file_output_dir/${formatted_output_filename}_added.gz 
     $hadoop_bin_dir/hadoop dfs -cat $hdfs_hadoop_delta_dir/${file}_removed/* | gzip > $file_output_dir/${formatted_output_filename}_removed.gz 
     [ $? -ne 0 ] && die "there was a problem gzipping ${formatted_output_filename}. exiting `basename $0` script" 
     [ $post_diff_script ] && ./$post_diff_script $source $previous_date $current_date 

    done 

I only want it to create the _removed.gz and _added.gz files when the files are not empty. I tried doing this below, but is there something wrong with my script?

for file in $diff_file_list 
    do 

     # replace any ? with $current_date and replace any % with $file 
     formatted_output_filename=$(echo $output_filename | sed "s|?|$current_date|g" | sed "s|%|_$file|g") 
     $pig_bin_dir/pig -param preceding=$hdfs_hadoop_pre_dir/$file -param current=$hdfs_hadoop_cur_dir/$file -param output_added=$hdfs_hadoop_delta_dir/${file}_added -param output_removed=$hdfs_hadoop_delta_dir/${file}_removed -param delimiter=$delimiter diff.pig 
     [ $? -ne 0 ] && die "diff of data between $previous_date and $current_date using pig failed. exiting `basename $0` script" 
     if [[ -s $hdfs_hadoop_delta_dir/${file}_added/* ]] ; then 
     echo "$hdfs_hadoop_delta_dir/${file}_added/* has data." 
     $hadoop_bin_dir/hadoop dfs -cat $hdfs_hadoop_delta_dir/${file}_added/* | gzip > $file_output_dir/${formatted_output_filename}_added.gz 
     $hadoop_bin_dir/hadoop dfs -cat $hdfs_hadoop_delta_dir/${file}_removed/* | gzip > $file_output_dir/${formatted_output_filename}_removed.gz 
     else 
     echo "$hdfs_hadoop_delta_dir/${file}_added/*is empty." 
     fi ; 
     [ $? -ne 0 ] && die "there was a problem gzipping ${formatted_output_filename}. exiting `basename $0` script" 
     [ $post_diff_script ] && ./$post_diff_script $source $previous_date $current_date 

    done 
+10

My eyes hurt... –

+0

Yes, your script is shouting at me. – trojanfoe

+0

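A few notes. First, although `[ x ] && y` works, it is nicer to write it as `if [ x ]; then y; fi`. Also, try breaking very long lines across multiple lines. That also makes it possible to align things, for example the many `-param` options passed to `pig` (once you have split them over several lines). Finally, when you ask "is there something wrong", you should be more specific. What exactly happens? – Shahbaz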
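The first two suggestions in that comment might look roughly like the sketch below. It is only an illustration: `run_or_die` and `fake_pig` are hypothetical names, `die` is a stand-in for the asker's function (its definition is not shown in the question), and the paths are placeholders.

```shell
#!/usr/bin/env bash
# Stand-in for the asker's die function, whose definition is not shown.
die() {
    echo "$*" >&2
    exit 1
}

# Hypothetical helper: run a command and abort with a message if it fails.
# Same effect as `cmd; [ $? -ne 0 ] && die "..."`, spelled out with if.
run_or_die() {
    local message=$1
    shift
    if ! "$@"; then
        die "$message"
    fi
}

# Placeholder so the sketch runs without Pig: it just prints its arguments.
fake_pig() {
    printf '%s\n' "$@"
}

file=example   # placeholder value

# One option per line, aligned, instead of a single very long line.
# In the real script the command would be "$pig_bin_dir/pig".
run_or_die "diff using pig failed for $file" \
    fake_pig \
    -param "preceding=/pre/$file" \
    -param "current=/cur/$file" \
    -param "output_added=/delta/${file}_added" \
    -param "output_removed=/delta/${file}_removed" \
    diff.pig
```

Quoting each expansion also keeps the command intact if a path ever contains spaces.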

Answers

2
if [[ -s file ]] 
then 
    do_file_creation 
fi 

for f in dir/* 
do 
    if [[ -s $f ]] 
    then 
     do_file_creation 
    fi 
done 

Use lowercase or mixed-case variable names.

Use `if` instead of `[[ ]] &&`.

Use indentation.
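The loop is needed because Bash does not perform pathname expansion inside `[[ ]]`, so `[[ -s dir/* ]]` tests a file literally named `*`. A minimal sketch of the loop wrapped in a function follows; `has_nonempty_file` is a hypothetical name, and the sketch assumes the directory is visible on the local filesystem.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if at least one regular file directly
# inside "$1" is non-empty. -s only looks at the local filesystem.
has_nonempty_file() {
    local dir=$1 f
    for f in "$dir"/*; do
        # With no matches the pattern stays literal; -f then fails.
        if [[ -f $f && -s $f ]]; then
            return 0
        fi
    done
    return 1
}

# Small demonstration in a scratch directory.
demo_dir=$(mktemp -d)
: > "$demo_dir/part-00000"                 # create an empty file
if has_nonempty_file "$demo_dir"; then echo "has data"; else echo "is empty"; fi
echo "some row" > "$demo_dir/part-00001"   # add a non-empty file
if has_nonempty_file "$demo_dir"; then echo "has data"; else echo "is empty"; fi
rm -rf "$demo_dir"
```

The demonstration prints `is empty` and then `has data`.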

0

I tried to find a solution that does not add a loop. It is not clean, though:

if [[ -s `ls -S $hdfs_hadoop_delta_dir/${file}_added/* 2>/dev/null | head -1` ]] ; then 

`ls -S` sorts the files by size, `head -1` takes the largest one, and `if [[ -s` then tests whether that file is non-empty.

Unfortunately, the case where there are no files has to be handled. I used `2>/dev/null` for that. Does anyone have a better idea?

+0

[BashFAQ/003](http://mywiki.wooledge.org/BashFAQ/003) applies to "largest" as well as "newest/oldest". –
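Another way to avoid both the loop and the parsing of `ls` output is to test the data after fetching it rather than testing the source files. The question reads the delta directories with `hadoop dfs -cat`, which suggests they live in HDFS; a local `-s` test would not see them unless HDFS is mounted locally. The sketch below assumes that it is acceptable to buffer the data in a local temporary file. `gzip_if_nonempty` is a hypothetical helper, not part of Hadoop or the original script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read stdin and write it gzipped to "$1",
# but only if at least one byte arrived. Nothing is created otherwise.
gzip_if_nonempty() {
    local target=$1 tmp status=0
    tmp=$(mktemp) || return 1
    cat > "$tmp"                       # buffer stdin in a local temp file
    if [[ -s $tmp ]]; then
        gzip -c "$tmp" > "$target"
        status=$?
    fi
    rm -f "$tmp"
    return "$status"
}

# Demonstration with local data. In the asker's script the producer would be
# something like: "$hadoop_bin_dir/hadoop" dfs -cat "$dir/*" | gzip_if_nonempty out.gz
out_dir=$(mktemp -d)
printf ''      | gzip_if_nonempty "$out_dir/empty.gz"
printf 'row\n' | gzip_if_nonempty "$out_dir/data.gz"
ls "$out_dir"                          # lists only data.gz
rm -rf "$out_dir"
```

In a pipeline the function does not see the exit status of the producing command, so detecting a failed `hadoop` call would need something like `set -o pipefail` in addition.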