2013-08-02 142 views
-1

Formatting text file contents with a shell script

I want to use a shell script to format the contents of my text file. The contents look like this:

http://copyright.gov.in Inlinks: 
fromUrl: http://mhrd.gov.in/ anchor: Copyright 
fromUrl: http://mhrd.gov.in/hi/home anchor: कॉपीराइट 
fromUrl: http://mhrd.gov.in/?fontsize=normal anchor: Copyright 
fromUrl: http://mhrd.gov.in/?contrast=high anchor: Copyright 
fromUrl: http://mhrd.gov.in/?fontsize=large anchor: Copyright 
fromUrl: http://mhrd.gov.in/sitemap anchor: Copyright 
fromUrl: http://mhrd.gov.in/?fontsize=small anchor: Copyright 
fromUrl: http://mhrd.gov.in/hi anchor: कॉपीराइट 
fromUrl: http://mhrd.gov.in/?contrast=normal anchor: Copyright 

I want to format it into this output:

http://copyright.gov.in -> http://mhrd.gov.in/ 
http://copyright.gov.in -> http://mhrd.gov.in/hi/home 
http://copyright.gov.in -> http://mhrd.gov.in/?fontsize=normal 

+3

So what have you tried? –

+4

'Please help me. Please--' only after you have shown that you tried to solve the problem. – devnull

+1

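You should use the right tool for the right job, and I don't think text processing is part of any shell's core business. 'perl', 'awk', etc. are better at these tasks... – user1146332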

Answers

1
$ cat foo.input 
http://copyright.gov.in Inlinks: 
fromUrl: http://mhrd.gov.in/ anchor: foo 
fromUrl: http://mhrd.gov.in/hi anchor: bar 
http://foo.acme.gov Inlinks: 
fromUrl: http://foo.acme.gov/ anchor: foo 
fromUrl: http://foo.acme.gov/about anchor: bar 

$ awk '/^http/ { host=$1; next } NF { printf "%s -> %s\n", host, $2 }' foo.input 
http://copyright.gov.in -> http://mhrd.gov.in/ 
http://copyright.gov.in -> http://mhrd.gov.in/hi 
http://foo.acme.gov -> http://foo.acme.gov/ 
http://foo.acme.gov -> http://foo.acme.gov/about
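
For anyone who wants this in a script rather than a one-liner, here is a minimal sketch of the same idea wrapped in a shell function. The name `format_inlinks` is hypothetical, chosen only for illustration. It assumes every record starts with a line beginning with `http` and every inlink line starts with `fromUrl:`. Unlike the one-liner above, which prints any non-empty line, this version matches `fromUrl:` explicitly, so stray lines are skipped instead of printed.

```shell
#!/bin/sh
# format_inlinks: hypothetical helper name, not part of the original answer.
# Reads the files given as arguments, or standard input if there are none.
format_inlinks() {
    awk '
        # A line starting with "http" opens a new record: remember its URL.
        /^http/ { target = $1; next }
        # An inlink line has the linking page URL in its second field.
        $1 == "fromUrl:" { printf "%s -> %s\n", target, $2 }
    ' "$@"
}
```

After sourcing the script, call it as `format_inlinks foo.input`, or pipe text into it with `some_command | format_inlinks`.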