2016-07-18

Find the first and last occurrence of a string in a shell script

I have prepared a shell script that, on a 50-node Hadoop cluster:

  • lists, on every server, all the logs belonging to my application's files
  • prints the last-modified timestamp, hostname, and filename
  • sorts the log files from the 50 nodes by their modified timestamps

The current sorted output format is:

2016-07-11-01:06 server1 MY_APPLICATION-worker-6701.log.6.gz 
2016-07-12-05:23 server1 MY_APPLICATION-worker-6701.log.7.gz 
2016-07-13-08:38 server2 MY_APPLICATION-worker-6701.log 
2016-07-13-10:38 server3 MY_APPLICATION-worker-6701.log.out 
2016-07-13-10:38 server2 MY_APPLICATION-worker-6701.log.err 
2016-07-13-10:38 server5 MY_APPLICATION-worker-6701.log 
2016-07-15-10:22 server4 MY_APPLICATION-worker-6703.log.out 
2016-07-15-10:22 server3 MY_APPLICATION-worker-6703.log.err 
2016-07-15-10:22 server2 MY_APPLICATION-worker-6703.log 

totallogs="" 
for server in $(cat all-hadoop-cluster-servers.txt); do 
    logs1="$(ssh [email protected]$server 'ls /var/log/hadoop/storm/ -ltr --time-style="+%Y-%m-%d-%H:%M" | grep MY_APPLICATION | awk -v host=$HOSTNAME "{print \$6, host, \$7}"')" 
    if [ -z "${logs1}" ]; then 
        continue 
    else 
        logs1+="\n" 
        totallogs+=$logs1 
    fi 
done 
# totallogs is a plain string (not an array); expand its embedded
# "\n" escapes with printf '%b' and sort the whole listing.
printf '%b' "$totallogs" | sort 
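For reference, a minimal sketch of the same accumulate-and-sort pattern using `while read -r` instead of `for`/`cat` (as a commenter suggests below). The server file name and log line here are stand-ins, and the `ssh` pipeline is faked so the sketch runs locally:

```shell
# Sketch: robust line-by-line read of the server list. The file and
# log line are hypothetical; the ssh+ls pipeline is replaced by a
# fake log line so this runs without a cluster.
servers_file=$(mktemp)
printf 'server2\nserver1\n' > "$servers_file"

totallogs=""
while read -r server; do
    # stand-in for: ssh user@$server 'ls ... | grep ... | awk ...'
    logs1="2016-07-11-01:06 $server MY_APPLICATION-worker-6701.log"
    [ -n "$logs1" ] && totallogs+="$logs1\n"
done < "$servers_file"

# totallogs holds literal "\n" sequences; printf '%b' expands them.
printf '%b' "$totallogs" | sort
rm -f "$servers_file"
```

`while read -r` preserves each line verbatim (no word splitting or backslash mangling), which matters if log names ever contain spaces.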

How can I find the first and the last occurrence of the "unique ID" in each log file and print them alongside the output above?

The expected output format is:

TIME_STAMP HOSTNAME FILENAME FIRST-UNIQUE-ID LAST-UNIQUE-ID

2016-07-11-01:06 server1 MY_APPLICATION-worker-6701.log.6.gz 1467005065878 1467105065877 
2016-07-12-05:23 server1 MY_APPLICATION-worker-6701.log.7.gz 1467105065878 1467205065860 
2016-07-13-08:38 server2 MY_APPLICATION-worker-6701.log   1467205065861 1467305065852 
2016-07-13-10:38 server3 MY_APPLICATION-worker-6701.log.out  
2016-07-13-10:38 server2 MY_APPLICATION-worker-6701.log.err  
2016-07-13-10:38 server5 MY_APPLICATION-worker-6701.log   1467305065853 1467405065844 
2016-07-15-10:22 server4 MY_APPLICATION-worker-6703.log.out  
2016-07-15-10:22 server3 MY_APPLICATION-worker-6703.log.err  
2016-07-15-10:22 server2 MY_APPLICATION-worker-6703.log   1467405065845 1467505065853 

A sample log file:

DEBUG | 2008-09-06 10:51:44,848 | unique-ID >>>>>> 1467205065861 
DEBUG | 2008-09-06 10:51:44,817 | DefaultBeanDefinitionDocumentReader.java | 86 | Loading bean definitions 
DEBUG | 2008-09-06 10:51:44,848 | AbstractBeanDefinitionReader.java | 185 | Loaded 5 bean definitions from location pattern [samContext.xml] 
INFO | 2008-09-06 10:51:44,848 | XmlBeanDefinitionReader.java | 323 | Loading XML bean definitions from class path resource [tmfContext.xml] 
DEBUG | 2008-09-06 10:51:44,848 | DefaultDocumentLoader.java | 72 | Using JAXP provider [com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl] 
DEBUG | 2008-09-06 10:51:44,848 | BeansDtdResolver.java | 72 | Found beans DTD [http://www.springframework.org/dtd/spring-beans.dtd] in classpath: spring-beans.dtd 
DEBUG | 2008-09-06 10:51:44,848 | unique-ID >>>>>> 1467205065862 
DEBUG | 2008-09-06 10:51:44,864 | DefaultBeanDefinitionDocumentReader.java | 86 | Loading bean definitions 
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 411 | Finished creating instance of bean 'MS-SQL' 
DEBUG | 2008-09-06 10:51:45,458 | DefaultSingletonBeanRegistry.java | 213 | Creating shared instance of singleton bean 'MySQL' 
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 383 | Creating instance of bean 'MySQL' 
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 459 | Eagerly caching bean 'MySQL' to allow for resolving potential circular references 
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 411 | Finished creating instance of bean 'MySQL' 
DEBUG | 2008-09-06 10:51:45,458 | DefaultSingletonBeanRegistry.java | 213 | Creating shared instance of singleton bean 'Oracle' 
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 383 | Creating instance of bean 'Oracle' 
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 459 | Eagerly caching bean 'Oracle' to allow for resolving potential circular references 
DEBUG | 2008-09-06 10:51:45,473 | AbstractAutowireCapableBeanFactory.java | 411 | Finished creating instance of bean 'Oracle' 
DEBUG | 2008-09-06 10:51:45,473 | DefaultSingletonBeanRegistry.java | 213 | Creating shared instance of singleton bean 'PostgreSQL' 
DEBUG | 2008-09-06 10:51:45,473 | AbstractAutowireCapableBeanFactory.java | 383 | Creating instance of bean 'PostgreSQL' 
DEBUG | 2008-09-06 10:51:45,473 | AbstractAutowireCapableBeanFactory.java | 459 | Eagerly caching bean 'PostgreSQL' to allow for resolving potential circular references 
DEBUG | 2008-09-06 10:51:45,473 | AbstractAutowireCapableBeanFactory.java | 411 | Finished creating instance of bean 'PostgreSQL' 
INFO | 2008-09-06 10:51:45,473 | SQLErrorCodesFactory.java | 128 | SQLErrorCodes loaded: [DB2, Derby, H2, HSQL, Informix, MS-SQL, MySQL, Oracle, PostgreSQL, Sybase] 
DEBUG | 2008-09-06 10:52:44,817 | DefaultBeanDefinitionDocumentReader.java | 86 | Loading bean definitions 
DEBUG | 2008-09-06 10:52:44,848 | unique-ID >>>>>> 1467205065864 
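Given a file in this format, the first and last unique-ID values can be pulled out with a short awk pass. A sketch, assuming the marker text is literally `unique-ID >>>>>>` (the temp file below is a hypothetical abbreviated sample):

```shell
# Create a tiny sample in the same format (abbreviated).
log=$(mktemp)
cat > "$log" <<'EOF'
DEBUG | 2008-09-06 10:51:44,848 | unique-ID >>>>>> 1467205065861
DEBUG | 2008-09-06 10:51:44,848 | unique-ID >>>>>> 1467205065862
DEBUG | 2008-09-06 10:52:44,848 | unique-ID >>>>>> 1467205065864
EOF

# First match: print and exit early; last match: remember, print at END.
first=$(awk -F'>>>>>> ' '/unique-ID/ {print $2; exit}' "$log")
last=$(awk -F'>>>>>> ' '/unique-ID/ {id=$2} END {print id}' "$log")
echo "$first $last"
rm -f "$log"
```

The `exit` in the first pass stops awk at the first match, so large files are not read twice end-to-end for the first ID.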

You should show us a representative sample of your input along with the corresponding desired output. Also note that you shouldn't use 'for' to read lines (use 'while read -r' instead). –


@anubhava, this is a mock log file. The text "unique-ID >>>>>>" appears every few statements in the log file. The value next to "unique-ID >>>>>>" is the unique ID referred to in the expected output. –


Yes, but to build a solution we need expected output that can actually be generated from the given sample input. For example, "1467305065852" in the output doesn't even appear in the sample input. – anubhava

Answers

grep 'unique-ID' sample_log_file | sed -n '1p;$p' 
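A quick way to see how this behaves: `sed -n '1p;$p'` keeps only the first and last matching lines, and a trailing awk can reduce them to the IDs themselves. A sketch with a throwaway file (note the grep pattern must match the log's literal `unique-ID` text):

```shell
# Sketch: reduce the first/last matching lines to just the IDs.
sample=$(mktemp)
cat > "$sample" <<'EOF'
DEBUG | t1 | unique-ID >>>>>> 111
DEBUG | t2 | other line
DEBUG | t3 | unique-ID >>>>>> 222
DEBUG | t4 | unique-ID >>>>>> 333
EOF
# grep keeps matching lines; sed keeps the first and last of them;
# awk prints only the final whitespace-separated field (the ID).
ids=$(grep 'unique-ID' "$sample" | sed -n '1p;$p' | awk '{print $NF}')
echo "$ids"
rm -f "$sample"
```

One caveat: if a file contains exactly one matching line, `sed -n '1p;$p'` prints that line twice, so first and last ID come out equal.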

Since you are already using awk, you could let it do this work for you. Change your awk program from

"{print \$6, host, \$7}" 

to

"{ first=last=\"\"; path=\"/var/log/hadoop/storm/\"\$7; while ((getline var <path) > 0) if (split(var, arr, \">>>>>>\") > 1) { if (!first) first=arr[2]; last=arr[2] } print \$6, host, \$7, \"\t\", first, last }"
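A local sanity check of that getline/split logic, with the ssh layer removed and a hypothetical temp directory standing in for /var/log/hadoop/storm/. Note the `> 0` guard on getline, which avoids an endless loop when a file cannot be opened (getline then returns -1, which is truthy):

```shell
# Build a throwaway log and run the same first/last extraction in awk.
dir=$(mktemp -d)
cat > "$dir/app.log" <<'EOF'
DEBUG | t1 | unique-ID >>>>>> 100
DEBUG | t2 | other line
DEBUG | t3 | unique-ID >>>>>> 200
EOF

result=$(awk -v host=myhost -v path="$dir/app.log" 'BEGIN {
    first = last = ""
    while ((getline var < path) > 0)
        if (split(var, arr, ">>>>>>") > 1) {
            if (!first) first = arr[2]
            last = arr[2]
        }
    # arr[2] keeps the leading space left by the split; the +0
    # coerces it to a bare number for clean output.
    print host, first + 0, last + 0
}')
echo "$result"
rm -rf "$dir"
```

Because the extraction runs inside the same remote awk that formats the `ls` listing, each log file is read once on its own node and only the summary line crosses the ssh connection.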