Writing a shell script

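I want to write a shell script that reads a file from standard input, removes every <html> string and all empty lines, and writes the result to standard output. The file looks like this: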

#some lines that do not contain <html> in here 
<html>a<html> 
<tr><html>b</html></tr> 
#some lines that do not contain <html> in here 
<html>c</html>

So the output file should contain:

#some lines that do not contain <html> in here 
a 
<tr>b</html></tr> 
#some lines that do not contain <html> in here 
c</html>

I tried writing this shell script:

read INPUT #read file from std input 
tr -d '[:blank:]' 
grep "<html>" | sed -r 's/<html>//g' 
echo $INPUT

But this script does not work at all. Any ideas? Thanks.

Source

2013-03-19 Hanna Gabby

You might want to try this in Perl (or something other than a plain shell) if possible: [check out the answers to this question](http://stackoverflow.com/questions/3176842/strip-html-tags-with-perl) – summea 2013-03-19 19:50:02

@summea I can't. I have to use #!/usr/bin/bash – 2013-03-19 19:52:07

Should the comments be kept? – 2013-03-19 19:52:12

Pure bash:

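#!/bin/bash

# Usage: ./replace.sh FILE
while read -r line
do
    # Meant to skip comments, but this only matches a line that is exactly \#,
    # so comment lines pass through (see the output below)
    [[ "$line" = "\#" ]] && continue
    # skip empty lines
    [[ -z $line ]] && continue
    # remove every "<html>" and print the rest of the line
    printf '%s\n' "${line//<html>/}"
done < "$1"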

Output:

$ ./replace.sh input 
#some lines that do not contain in here 
a 
<tr>b</html></tr> 
#some lines that do not contain in here 
c</html>

Pure sed:

sed -e 's/<html>//g' -e '/^$/d' input

Source

2013-03-19 20:03:10

What does [[ "$line" = "\#" ]] mean? Can't this be done using only grep and sed? – 2013-03-19 20:07:53

See the comments in the source code above – 2013-03-19 20:08:57

@HannaGabby - see the update with the sed-only solution – 2013-03-19 20:18:09

This is easy to do in awk:

awk '/./ {gsub("<html>","");print}' INPUTFILE

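First, it acts only on lines that contain at least one character (so empty lines are discarded). On each such line it globally replaces "<html>" with an empty string, then prints the line.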
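
As a quick check of that description, here is a minimal sketch that wraps the same awk program in a function and feeds it a few sample lines on standard input. The function name strip_html is made up for illustration and is not part of the answer.

```shell
#!/bin/bash
# strip_html is a hypothetical helper name, not from the original answer.
# It reads standard input, skips empty lines, and removes every "<html>".
strip_html() {
    awk '/./ { gsub("<html>", ""); print }'
}

# Sample input with an empty line in the middle.
printf '%s\n' '<html>a<html>' '' '<tr><html>b</html></tr>' | strip_html
```

Since awk reads standard input when no file name is given, the function can sit at the end of any pipeline, which is what the question asks for.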

Source

2013-03-19 19:54:48

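The OP needs the comments preserved – 2013-03-19 19:56:51

I can only use grep and sed. But what does /./ mean? Does it refer to the current directory? – 2013-03-19 20:00:50

@HannaGabby - '/./' is a regular expression that means one character [any] – 2013-03-19 20:04:47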
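
To make the regex point concrete, here is a small sketch using grep, which the asker is allowed to use. In a regular expression a bare . matches any single character, so only lines with at least one character are selected; the slashes in awk's /./ are just delimiters around the pattern. The helper name drop_empty is invented for this example.

```shell
#!/bin/bash
# drop_empty is a hypothetical name used only for this illustration.
# grep '.' prints lines that contain at least one character,
# which has the same filtering effect as the /./ pattern in awk.
drop_empty() {
    grep '.'
}

# The empty second line is not printed.
printf '%s\n' 'first' '' 'second' | drop_empty
```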
