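grep/regex cannot find accented words

I am trying to put together a regex that picks out of a file the words whose letters all come from a given word pattern.

My problem is that the regex does not find words with accents, even though my text file contains many accented words.

My command line is: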

cat input/words.txt | grep '^[éra]\{1,4\}$' > output/words_era.txt 
cat input/words.txt | grep '^[carroça]\{1,7\}$' > output/words_carroca.txt

The contents of the file are:

carroça 
éra 
éssa 
roça 
roco 
rato 
onça 
orça 
roca

How can I solve this?

Source

2011-01-19 GodFather

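What is the output of `locale`? What is the encoding of "input/words.txt"? – ephemient 2011-01-19 19:07:44

It works for me, but maybe the problem is with your syntax: square brackets define a set of characters, so at least the second line is surely wrong. Try: grep '^carroça\{1,3\}$' – UncleZeiv 2011-01-19 19:11:25

@UncleZeiv, I had posted the wrong version; I have now edited it to the correct one. – GodFather 2011-01-19 19:15:29

If your file is encoded in ISO-8859-1 but your system locale is UTF-8, this will not work: grep decodes the input according to the locale, and the single ISO-8859-1 byte for an accented letter is not a valid UTF-8 character.

Either convert the file to UTF-8 or change your system locale to ISO-8859-1.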

 
# convert from ISO-8859-1 to the environmental locale before grepping 
# output will be in the current locale 
$ iconv -f 8859_1 input/words.txt | grep ... 

# run grep with an ISO-8859-1 locale 
# output will be in ISO-8859-1 encoding 
$ cat input/words.txt | env LC_ALL=en_US grep ...
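
To find out which of the two cases applies, it may help to test whether the file decodes as UTF-8 at all. The following is a minimal sketch, not part of the original answer: `is_utf8` is a hypothetical helper name, the two sample files exist only for illustration, and it assumes an `iconv` that exits non-zero on invalid input (as the glibc one does).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file decodes cleanly as UTF-8.
# Assumption: iconv exits non-zero on a byte sequence that is invalid
# in the source encoding (glibc iconv behaves this way).
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# "éra" spelled out in bytes, so the script does not depend on its own encoding:
# UTF-8 stores é as 0xC3 0xA9, ISO-8859-1 stores it as the single byte 0xE9.
printf '\303\251ra\n' > "$tmpdir/utf8.txt"
printf '\351ra\n' > "$tmpdir/latin1.txt"

for f in "$tmpdir/utf8.txt" "$tmpdir/latin1.txt"; do
    if is_utf8 "$f"; then
        echo "$(basename "$f"): valid UTF-8"
    else
        echo "$(basename "$f"): not valid UTF-8, possibly ISO-8859-1"
    fi
done
```

If `input/words.txt` fails this check, converting it with `iconv` as shown above is probably the simpler of the two fixes. Note that passing the check does not prove the file is UTF-8; pure ASCII passes as well.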

Source

2011-01-19 19:26:52 ephemient

I found a related question here with an approach that seems to work.

So if you do something like this:

cat input/words.txt | LANG=C grep '^[éra]\{1,4\}$' > output/words_era.txt

does it not produce what you are expecting?
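
One thing to keep in mind with `LANG=C`: in the C locale every byte counts as one character, so the interval `\{1,4\}` limits bytes rather than letters when the file is UTF-8. The sketch below is an illustration only (`byte_len` is a hypothetical helper name); it shows how many bytes the accented words from the question occupy in UTF-8.

```shell
#!/bin/sh
# Hypothetical helper: print the number of bytes in a string.
# wc -c counts bytes regardless of the locale; tr strips the padding some wc versions add.
byte_len() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Octal escapes spell out the UTF-8 bytes, so the script does not depend on its own encoding.
era_utf8=$(printf '\303\251ra')          # "éra": é is 0xC3 0xA9
carroca_utf8=$(printf 'carro\303\247a')  # "carroça": ç is 0xC3 0xA7

echo "era (UTF-8): $(byte_len "$era_utf8") bytes"
echo "carroca (UTF-8): $(byte_len "$carroca_utf8") bytes"
```

The first word is 4 bytes and the second is 8, so under a byte-oriented locale a limit of `\{1,7\}` would be one byte too short for "carroça" if the file is UTF-8.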

Source

2011-01-19 19:18:11 dule

Try what @dule said, but with LANG=en_US.iso88591:

cat input/words.txt | LANG=en_US.iso88591 grep '^[éra]\{1,4\}$' > output/words_era.txt
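
A locale name only takes effect if that locale is installed; on glibc systems a program given an unknown name stays in the C locale without complaint. The following is a rough sketch for checking first, assuming the `locale` utility is available. `has_locale` is a hypothetical helper, and the candidate spellings are guesses at what different systems print.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named locale appears in `locale -a`.
# -F treats the name literally, -x requires a whole-line match, -i ignores case.
has_locale() {
    locale -a 2>/dev/null | grep -qixF -- "$1"
}

# Assumption: glibc lists the locale as en_US.iso88591; other systems may spell it differently.
for name in en_US.iso88591 en_US.ISO8859-1 en_US.ISO-8859-1; do
    if has_locale "$name"; then
        echo "available: $name"
    fi
done
```

If nothing is printed, the ISO-8859-1 locale is probably not installed, and converting the file with `iconv` may be easier than generating the locale.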

Source

2011-01-19 19:24:57 UncleZeiv

Assuming everything is UTF-8, I usually just use something like

perl -CSAD -le 'print if /^carroça{1,3}$/' filenames

because that way I know what it is doing.

Source

2011-01-19 21:51:00 tchrist
