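How do I format natural-language text with respect to punctuation?
Vim's built-in gq command and command-line tools such as fmt or par break lines without regard to punctuation. Here is an example: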
fmt -w 40
gives a result that is not what I want:
we had everything before us, we had
nothing before us, we were all going
direct to Heaven, we were all going
direct the other way
smart_formatter -w 40
would give:
we had everything before us,
we had nothing before us,
we were all going direct to Heaven,
we were all going direct the other way
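Of course, sometimes no punctuation mark falls within the given text width; in that case it can fall back to standard text formatting behavior.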
The reason I want this is to get meaningful diffs of the text, where I can spot which sentences or clauses have changed.
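To make the desired behavior concrete, here is a minimal Python sketch of it. It is only an illustration under assumptions: the function name format_by_punctuation is made up, the set of break characters (, ; : . ! ?) is my own guess at a reasonable rule, and it puts every clause on its own line rather than packing several short clauses together, since that seems to suit diffing best. A clause longer than the width falls back to the standard library's textwrap.

```python
import re
import textwrap

# Assumed rule: a clause ends at , ; : . ! or ? followed by whitespace.
_CLAUSE_END = re.compile(r"(?<=[,;:.!?])\s+")


def format_by_punctuation(text, width=40):
    """Hypothetical helper: break lines after punctuation where possible."""
    # Collapse existing whitespace so old line breaks do not affect the result.
    normalized = " ".join(text.split())
    lines = []
    for clause in _CLAUSE_END.split(normalized):
        if not clause:
            continue
        if len(clause) <= width:
            # The clause fits: keep it on a line of its own.
            lines.append(clause)
        else:
            # No punctuation within the width: plain word wrapping.
            lines.extend(textwrap.wrap(clause, width=width))
    return "\n".join(lines)


sample = (
    "we had everything before us, we had nothing before us, "
    "we were all going direct to Heaven, "
    "we were all going direct the other way"
)
print(format_by_punctuation(sample, width=40))
```

On the sample text this produces the four lines I listed for smart_formatter above. What I am really looking for is an existing tool, ideally one that works with gq in Vim, that does this properly.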