sed with a negative match for a CSV file
I have a CSV file in the following format:
$ tail X.csv | sed 's/[a-zA-Z0-9]/X/g'
XXXXXXX/XXXXXXXX XXXXXXXXXXXX), XXXXXXXXXXXXXXXXXXXX, XXXXXXXXXXXXXX (X),XXXXX,,X,XXX,XXXXXXX,,,{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
XXXXXXXXX,XXXX-XX-XX XX:XX:XX.XXXXXXXXX,XX,XXXXX,X,XXXXXX,X,XXXXXX,XXXXXXXX (XXXXXXX XXXXXX),XXXXX,XX.XXX.XXX.XX,XXXXX,XXXXX XXXXXXXXX XXXXXX XXXX XXX XXXXXXXX XX XXXXXXX XXX XXXXXXXX XXXXXXX (XXXXXXXXX): XXXXXXXX X XXXXXXXXXX XXXX X XXXXXXXXXX.,XXXXX,,X,XXX,XXXXXXX,,,{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
XXXXXXXXX,XXXX-XX-XX XX:XX:XX.XXXXXXXXX,XX,XXXXX,X,XXXXXX,X,XXXXXX,XXXXXXXX (XXXXXXX XXXXXX),XXXXX,XX.XXX.XXX.XX,XXXXX,XXXXXXX XXX XXXXXXXX XXX XXXXXXXX XXXXXXX (XXXXXXXXX) (XXXXXXX XXXXXXXXXXXXXX),XXXXX,,X,XXX,XXXXXXX,,,{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
XXXXXXXXX,XXXX-XX-XX XX:XX:XX.XXXXXXXXX,XX,XXXXX,X,XXXXXX,X,XXXXX,XXXXXXXXX (XXXXXX XXXXXXX XXXXXXX),XXXXX,XX.XXX.XXX.XX,XXXX,XXXXXXXX XXXXXXXXX XXXXXXXX XXX XXXXXX XXXXXXX XXXXXXX (XXXXXXXXX).,XXXXX,,X,XXX,XXXXXXX,,,{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
XXXXXXXXX,XXXX-XX-XX XX:XX:XX.XXXXXXXXX,XX,XXXXX,X,XXXXXX,X,XXXXXX,XXXXXXXXXXXX (XXXXXX XXXXXXXXXX),XXXXX,XX.XXX.XXX.XX,XXXXX,XXXXXXXX XXXX XXXXXXX(X) XX XX/XX/XXXX XXX XXXXXXX XXXXXXXX (XXXXXXXXX).,XXXXX,,X,X,X,,,{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
XXXXXXXXX,XXXX-XX-XX XX:XX:XX.XXXXXXXXX,XX,XXXXX,X,XXXXXX,X,XXXXX,XXXXXXXXXXX (XXXXXXX XXXXXXXXX),XXXXX,XX.XXX.XXX.XX,XXXX,XXXXXXX XXX XXXXXXXX XXX XXXXXX XXXXX (XXXXXXXXX) (XXXXXXXXXXX XX XXXXX XXX XXXXXXXX-XXXX XXXXXXXXXXX): XXXXXXXXXXXXXXXXXXX (XXXXX), XXXXXXXXXXXXXXXXXX (XXXXX), XXXXXXXXXXXXXX (XXXX), XXXXXXXXXXXXXXXX (XXXXX),XXXXX,,X,XXX,XXXXXXX,,,{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
XXXXXXXXX,XXXX-XX-XX XX:XX:XX.XXXXXXXXX,XX,XXXXX,X,XXXXXX,X,XXXXXX,XXXXXXXX (XXXXXXX XXXXXX),XXXXX,XX.XXX.XXX.XX,XXXXX,XXXXXXX XXX XXXXXXXX XXX XXXXXXXX XXXXXXX (XXXXXXXXX) (XXXXXXX XXXXXXXXXXXXXX),XXXXX,,X,XXX,XXXXXXX,,,{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
XXXXXXXXX,XXXX-XX-XX XX:XX:XX.XXXXXXXXX,XX,XXXXX,X,XXXXXX,X,XXXXXX,XXXXXXXXXXXX (XXXXXX XXXXXXXXXX),XXXXX,XX.XXX.XXX.XX,XXXXX,XXXXXXXX XXXX XXXXXXX(X) XX XX/XX/XXXX XXX XXXXX XXXXXXXX (XXXXXXXXX).,XXXXX,,X,X,X,,,{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
XXXXXXXXX,XXXX-XX-XX XX:XX:XX.XXXXXXXXX,XX,XXXXX,X,XXXXXX,X,XXXXXX,XXXXXXXXXXXX (XXXXXXXX XXXXXXXXX),XXXXX,XX.XXX.XXX.XX,XXXX,XXXXXXX XXX XXXXXXXX XXX XXXXXXX XXXXX (XXXXXXXXX) (XXXXXXX XXXX): XXXXXXXXXXXXXX (XXXXX),XXXXX,,X,XXX,XXXXXXX,,,{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
XXXXXXXXX,XXXX-XX-XX XX:XX:XX.XXXXXXXXX,XX,XXXXX,X,XXXXXX,X,XXXXXX,XXXXXXX (XXXXXXXX XXXXX),XXXXX,XX.XXX.XXX.XX,XXXXX,XXXXXXX XXX XXXXXXXX XXX XXXXXX XXXXXXXX (XXXXXXXXX) (XXXXXXX XXXXXX XX XXXXXXXXXXX XXX XXXXXXXXXX XXXXX): XXXXXXXXXXXXXXXXXX, XXXXXXXXXXXXXXXXXX, XXXXXXXXXXXXXXX, XXXXXXXXXXXXXXXX, XXXXXXXXXXXXXXXXXXXXXXXX (XXXXXX XXXXX XXXXXX XXXXXXXXXXXXX XXXX XXX XXXXX. XXX XX XXXX XXXXXX.), XXXXXXXXXXXXXXXXXXXXXXXXXXXX (XXX XX XXXX XXXXX XXX XXX XXXX XXXXXXX.),XXXXX,,X,XXX,XXXXXXX,,,{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
$
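Besides the delimiter comma, the generated CSV file also contains commas as part of field values, so I need sed(1) to replace the delimiter commas with an alternative separator such as |.
Unfortunately, the file cannot be regenerated with a different delimiter.
My unsuccessful attempt: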
$ tail X.csv | sed 's/[a-zA-Z0-9]/X/g' | sed --regexp-extended '/,/!s/,%s/|/g' | tail -1
XXXXXXXXX,XXXX-XX-XX XX:XX:XX.XXXXXXXXX,XX,XXXXX,X,XXXXXX,X,XXXXXX,XXXXXXX (XXXXXXXX XXXXX),XXXXX,XX.XXX.XXX.XX,XXXXX,XXXXXXX XXX XXXXXXXX XXX XXXXXX XXXXXXXX (XXXXXXXXX) (XXXXXXX XXXXXX XX XXXXXXXXXXX XXX XXXXXXXXXX XXXXX): XXXXXXXXXXXXXXXXXX, XXXXXXXXXXXXXXXXXX, XXXXXXXXXXXXXXX, XXXXXXXXXXXXXXXX, XXXXXXXXXXXXXXXXXXXXXXXX (XXXXXX XXXXX XXXXXX XXXXXXXXXXXXX XXXX XXX XXXXX. XXX XX XXXX XXXXXX.), XXXXXXXXXXXXXXXXXXXXXXXXXXXX (XXX XX XXXX XXXXX XXX XXX XXXX XXXXXXX.),XXXXX,,X,XXX,XXXXXXX,,,{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
$
How can I fix this?
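For context on why the attempt above leaves the line unchanged: the address `/,/!` restricts the substitution to lines that contain no comma at all, and `%s` is matched as the literal characters `%s`, not as whitespace. sed also has no lookahead, so "a comma not followed by a space" has to be expressed another way. The following is only a sketch, under the assumption (suggested by the sample, but not verified for the whole file) that a comma inside a value is always followed by a space and a delimiter comma never is. The function name `commas_to_pipes` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: turn delimiter commas into '|' and leave ", " alone.
# ASSUMPTION: commas inside values are always followed by a space,
# delimiter commas never are. Check this against the real data first.
commas_to_pipes() {
    # :a / ta  loop until no more substitutions succeed, so that runs of
    #          empty fields such as ",,," are converted one comma at a time
    # 1st s    replace a comma that is followed by a non-space character
    # 2nd s    replace a comma at the very end of the line (empty last field)
    sed -e ':a' \
        -e 's/,\([^ ]\)/|\1/' \
        -e 'ta' \
        -e 's/,$/|/'
}

# Usage (X.csv is the file from the question, X.psv is an arbitrary output name):
# commas_to_pipes < X.csv > X.psv
```

Before relying on it, it may be worth confirming that `|` does not already occur in the data, for example with `grep -c '|' X.csv`.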
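1. You may want to generate the CSV file with a different field separator, or with the values in quotes. 2. If that is not an option, please provide more information: is it the **second** comma in **every** line that falls inside a field value? If not, how can we find out which lines need fixing? –
1) Unfortunately, that is not an option. 2) The file is huge, and I don't believe it happens on every line, but it is very common in this file. – alexus
@alexus, show more lines of your file; two lines are not enough. – RomanPerekhrest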
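On the question of which lines need fixing, one rough way to see how widespread the problem is would be to count comma-separated fields per line. This is a sketch that assumes one record per line and no quoted fields; `field_count_histogram` is a hypothetical name. If every clean row has the same number of columns, lines with more fields than the smallest count are the ones likely to contain embedded commas.

```shell
#!/bin/sh
# Hypothetical helper: print "<field count> <number of lines>" pairs.
# ASSUMPTION: one record per line, no quoting, so every comma counts
# as a field separator here.
field_count_histogram() {
    awk -F, '{ n[NF]++ } END { for (k in n) print k, n[k] }'
}

# Usage (output order from awk is unspecified, hence the sort):
# field_count_histogram < X.csv | sort -n
```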