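Regex to replace gibberish

I have to clean up some input from OCR, which recognizes handwriting as gibberish. Any suggestions for a regex to strip out the random characters? For example: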
Federal prosecutors on Monday charged a Miami man with the largest case of credit and debit card data theft ever in the United States, accusing the one-time government informant of swiping 130 million accounts on top of 40 million he stole previously. , ':, Ie ':... 11'1 . '(.. ~!' ': f I I . " .' I ~ I' ,11 l I I I ~ \ :' ,! .~ , .. r, 1 , ~ I . I' , .' I ,. , i I ; J . I.' ,.\) .. . : I 'I', I .' ' r," Gonzalez is a former informant for the U.S. Secret Service who helped the agency hunt hackers, authorities say. The agency later found out that he had also been working with criminals and feeding them information on ongoing investigations, even warning off at least one individual, according to authorities. eh....l ~.\O ::t e;~~~ s: ~ ~. 0 qs c::; ~ g o t/J (Ii ., ::3 (1l Il:l ~ cil~ 0 2: t:lHj~(1l . ~ ~a 0~ ~ S' N ("b t/J :s Ot/JIl:l"-<:! v'g::!t:O -....c...... VI (:ll <' 0 := - ~ < (1l ::3 (1l ~ ' t/J VJ ~ Pl ..... .... (II
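+1 because it's an interesting question, although I suspect you won't get an answer that solves it. – 2009-08-18 03:40:03
This is a great question, and word/phrase recognition (or the lack of it) is a hot topic in AI. – Russell 2009-08-18 03:41:50
I strongly feel that regex is the wrong tool for this job. – Breton 2009-08-18 05:20:00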
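As the last comment suggests, a single regex probably cannot separate prose from OCR noise. The following is only a rough sketch of one possible heuristic, not a definitive answer: it uses a regex to decide whether each whitespace-separated token looks like a word, then keeps a token only when enough of its neighbours look like words too. The names `looks_like_word`, `clean`, `radius`, and `threshold` are hypothetical, and the default values are guesses that would need tuning on real data.

```python
import re

# Assumption: a token "looks like a word" if, after trimming surrounding
# punctuation, it is a run of letters, an initialism, a number, or one of
# the common one-letter English words.
WORD_RE = re.compile(
    r"""^(?:
        [A-Za-z]{2,}(?:[-'][A-Za-z]+)*   # ordinary words: "one-time", "he's"
      | (?:[A-Za-z]\.){2,}               # initialisms: "U.S."
      | \d+(?:,\d{3})*(?:\.\d+)?         # numbers: "130", "1,000", "3.5"
      | [AaI]                            # common one-letter words
    )$""",
    re.VERBOSE,
)

# Punctuation trimmed from both ends of a token before testing it.
# The period is handled separately so that "U.S." survives intact.
EDGE_PUNCT = "\"'()[],;:!?"


def looks_like_word(token):
    """Return True if the token resembles an English word or number."""
    core = token.strip(EDGE_PUNCT)
    return bool(WORD_RE.match(core) or WORD_RE.match(core.rstrip(".")))


def clean(text, radius=2, threshold=0.5):
    """Drop tokens that are not word-like or that sit in a noisy neighbourhood."""
    tokens = text.split()
    flags = [looks_like_word(t) for t in tokens]
    kept = []
    for i, tok in enumerate(tokens):
        lo, hi = max(0, i - radius), min(len(tokens), i + radius + 1)
        # Fraction of word-like tokens in the window around position i.
        density = sum(flags[lo:hi]) / (hi - lo)
        if flags[i] and density >= threshold:
            kept.append(tok)
    return " ".join(kept)


noisy = "he stole previously. , ':, ~!' 11'1 . ~ ,! Gonzalez is a former informant"
print(clean(noisy))
```

This is imperfect by design: short letter-only fragments such as `Ie`, or clusters of stray `I` characters, can still pass the token test, so a dictionary lookup or a language model would likely do better than pattern matching alone.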