2011-03-08 40 views
9

How can I strip punctuation in PHP, except for these characters: . = $ ' - %

+2

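What do you mean? Please add more details and an example. – 2011-03-08 14:23:53

+0

It would probably be easier to just keep the characters you want. Do you only want to keep a-z (upper and lower case), 0-9 and the characters you listed? Also, what is "phpr"? – 2011-03-08 14:25:04

+0

@middaparka It's a new collaboration between PHP and [R](http://en.wikipedia.org/wiki/R_%28programming_language%29) :) – alex 2011-03-08 14:28:22

Answers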

16

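Since you need to match some Unicode characters (€), it would be sensible to use a regular expression. The pattern \p{P} matches any known punctuation character, and the negative lookahead assertion keeps your desired special characters from being removed: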

$text = preg_replace("/(?![.=$'€%-])\p{P}/u", "", $text); 
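
As a minimal sketch of how this pattern behaves, the snippet below wraps it in a helper and runs it on a sample string. The function name `stripPunctuationExcept` and the sample input are assumptions made for illustration; the pattern itself is the one shown above, written in a single-quoted string.

```php
<?php
// Hypothetical helper name; the pattern is the one from the answer above.
function stripPunctuationExcept(string $text): string
{
    // (?![.=$'€%-]) : negative lookahead, fails when the next character is one to keep.
    // \p{P}         : any Unicode punctuation character; /u treats the input as UTF-8.
    return preg_replace('/(?![.=$\'€%-])\p{P}/u', '', $text);
}

echo stripPunctuationExcept('Hello, world! It\'s 50% off: $9.99 (today-only)'), "\n";
// Hello world It's 50% off $9.99 today-only
```

Note that characters such as $, =, € and + are classified by Unicode as symbols rather than punctuation, so \p{P} leaves them alone even without the lookahead.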
+0

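Could you explain the pattern? I'm new to PCRE, and this is the first time I've seen a '(?!)' that is not a lookahead, or something I don't understand. Also, I don't understand '\p{P}' – AlexanderMP 2011-03-08 14:37:58

+1

@AlexanderMP: The negative lookahead assertion '(?!...)' is best explained here: http://www.regular-expressions.info/lookaround.html - and '\p{P}' is a bit cryptic, but there is a nice overview here too: http://www.regular-expressions.info/unicode.html – mario 2011-03-08 14:54:04

+0

This does not remove the ''' character. – mopsyd 2018-01-02 02:45:32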

1
preg_replace("/[^-\w\d\s.=$'€%]/u", '', $subject); 

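That said, it would be more correct, and easier, to specify the characters you want to strip rather than the characters (from an unknown set) that you don't want to strip.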
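
A minimal sketch of the keep-list approach, assuming the goal is to keep word characters, whitespace and the characters listed in the question. The function name `keepOnly` and the sample input are made up for illustration. Unlike \p{P}, a negated character class also removes symbols such as + and <.

```php
<?php
// Hypothetical helper: remove everything that is NOT in the keep-list.
function keepOnly(string $subject): string
{
    // \w : letters, digits, underscore; \s : whitespace.
    // The hyphen is placed last in the class so it is read literally.
    // /u makes the multi-byte € count as one character.
    return preg_replace('/[^\w\s.=$\'€%-]/u', '', $subject);
}

echo keepOnly('a+b <c> = 5€ (ok!)'), "\n";
// ab c = 5€ ok
```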

+0

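You don't need to escape '$' in a character class - but AFAIK '-' should be the first or last item. – ThiefMaster 2011-03-08 14:29:47

+0

Possibly. I feel more comfortable escaping all the special characters used by PCRE. But thanks for the tip. – AlexanderMP 2011-03-08 14:31:50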

6
<?php 
$whatToStrip = array("?","!",",",";"); // Add what you want to strip in this array 
$test = "Hi! Am I here?"; 
echo $test."\n\n"; 
echo str_replace($whatToStrip, "", $test); 

Demo here

Or, of course, shorter:

$test = str_replace(array("?","!",",",";"), "", $test); 

Source from 1st example of str_replace manual
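
If the list of characters to strip is easier to maintain as a single string, the same idea can be sketched as below. The helper name `stripChars` is hypothetical, and the approach assumes the characters are all single-byte (ASCII), because `str_split()` splits by byte.

```php
<?php
// Hypothetical helper: strip every character listed in $chars from $text.
function stripChars(string $text, string $chars): string
{
    // str_split() turns "?!,;" into ["?", "!", ",", ";"] (ASCII only).
    return str_replace(str_split($chars), '', $text);
}

echo stripChars('Hi! Am I here?', '?!,;'), "\n";
// Hi Am I here
```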

0

Try:

preg_replace("/[^\w\p{L}\p{N}\p{Pd}\$.€%'-]/u", "", 'YOUR DATA');

You didn't mention whether you want to keep spaces or not, so this strips them as well.
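
To make the effect on spaces concrete, here is a small sketch that runs a variant of this pattern (hyphen placed last, /u modifier set) with and without \s in the character class. The sample string and variable names are assumptions for illustration only.

```php
<?php
$input = 'Crème brûlée: 5 € - 10%!';

// Without \s in the class, spaces are removed along with ":" and "!".
$noSpaces = preg_replace('/[^\w\p{L}\p{N}\p{Pd}$.€%\'-]/u', '', $input);

// Adding \s to the keep-list preserves whitespace.
$withSpaces = preg_replace('/[^\w\s\p{L}\p{N}\p{Pd}$.€%\'-]/u', '', $input);

echo $noSpaces, "\n";   // Crèmebrûlée5€-10%
echo $withSpaces, "\n"; // Crème brûlée 5 € - 10%
```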

0

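Problem:

A string needs to be stored as alphanumeric plus specific punctuation, without special (for example accented) characters being discarded entirely.

Solution: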

class ClassName { 

    protected static $cleanChars = array(
    '&lt;' => '', '&gt;' => '', '&#039;' => '', '&amp;' => '', 
    '&quot;' => '', 'À' => 'A', 'Á' => 'A', 'Â' => 'A', 'Ã' => 'A', 'Ä' => 'Ae', 
    '&Auml;' => 'A', 'Å' => 'A', 'Ā' => 'A', 'Ą' => 'A', 'Ă' => 'A', 'Æ' => 'Ae', 
    'Ç' => 'C', 'Ć' => 'C', 'Č' => 'C', 'Ĉ' => 'C', 'Ċ' => 'C', 'Ď' => 'D', 'Đ' => 'D', 
    'Ð' => 'D', 'È' => 'E', 'É' => 'E', 'Ê' => 'E', 'Ë' => 'E', 'Ē' => 'E', 
    'Ę' => 'E', 'Ě' => 'E', 'Ĕ' => 'E', 'Ė' => 'E', 'Ĝ' => 'G', 'Ğ' => 'G', 
    'Ġ' => 'G', 'Ģ' => 'G', 'Ĥ' => 'H', 'Ħ' => 'H', 'Ì' => 'I', 'Í' => 'I', 
    'Î' => 'I', 'Ï' => 'I', 'Ī' => 'I', 'Ĩ' => 'I', 'Ĭ' => 'I', 'Į' => 'I', 
    'İ' => 'I', 'Ĳ' => 'IJ', 'Ĵ' => 'J', 'Ķ' => 'K', 'Ł' => 'L', 'Ľ' => 'L', 
    'Ĺ' => 'L', 'Ļ' => 'L', 'Ŀ' => 'L', 'Ñ' => 'N', 'Ń' => 'N', 'Ň' => 'N', 
    'Ņ' => 'N', 'Ŋ' => 'N', 'Ò' => 'O', 'Ó' => 'O', 'Ô' => 'O', 'Õ' => 'O', 
    'Ö' => 'Oe', '&Ouml;' => 'Oe', 'Ø' => 'O', 'Ō' => 'O', 'Ő' => 'O', 'Ŏ' => 'O', 
    'Œ' => 'OE', 'Ŕ' => 'R', 'Ř' => 'R', 'Ŗ' => 'R', 'Ś' => 'S', 'Š' => 'S', 
    'Ş' => 'S', 'Ŝ' => 'S', 'Ș' => 'S', 'Ť' => 'T', 'Ţ' => 'T', 'Ŧ' => 'T', 
    'Ț' => 'T', 'Ù' => 'U', 'Ú' => 'U', 'Û' => 'U', 'Ü' => 'Ue', 'Ū' => 'U', 
    '&Uuml;' => 'Ue', 'Ů' => 'U', 'Ű' => 'U', 'Ŭ' => 'U', 'Ũ' => 'U', 'Ų' => 'U', 
    'Ŵ' => 'W', 'Ý' => 'Y', 'Ŷ' => 'Y', 'Ÿ' => 'Y', 'Ź' => 'Z', 'Ž' => 'Z', 
    'Ż' => 'Z', 'Þ' => 'T', 'à' => 'a', 'á' => 'a', 'â' => 'a', 'ã' => 'a', 
    'ä' => 'ae', '&auml;' => 'ae', 'å' => 'a', 'ā' => 'a', 'ą' => 'a', 'ă' => 'a', 
    'æ' => 'ae', 'ç' => 'c', 'ć' => 'c', 'č' => 'c', 'ĉ' => 'c', 'ċ' => 'c', 
    'ď' => 'd', 'đ' => 'd', 'ð' => 'd', 'è' => 'e', 'é' => 'e', 'ê' => 'e', 
    'ë' => 'e', 'ē' => 'e', 'ę' => 'e', 'ě' => 'e', 'ĕ' => 'e', 'ė' => 'e', 
    'ƒ' => 'f', 'ĝ' => 'g', 'ğ' => 'g', 'ġ' => 'g', 'ģ' => 'g', 'ĥ' => 'h', 
    'ħ' => 'h', 'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i', 'ī' => 'i', 
    'ĩ' => 'i', 'ĭ' => 'i', 'į' => 'i', 'ı' => 'i', 'ĳ' => 'ij', 'ĵ' => 'j', 
    'ķ' => 'k', 'ĸ' => 'k', 'ł' => 'l', 'ľ' => 'l', 'ĺ' => 'l', 'ļ' => 'l', 
    'ŀ' => 'l', 'ñ' => 'n', 'ń' => 'n', 'ň' => 'n', 'ņ' => 'n', 'ŉ' => 'n', 
    'ŋ' => 'n', 'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'õ' => 'o', 'ö' => 'oe', 
    '&ouml;' => 'oe', 'ø' => 'o', 'ō' => 'o', 'ő' => 'o', 'ŏ' => 'o', 'œ' => 'oe', 
    'ŕ' => 'r', 'ř' => 'r', 'ŗ' => 'r', 'š' => 's', 'ù' => 'u', 'ú' => 'u', 
    'û' => 'u', 'ü' => 'ue', 'ū' => 'u', '&uuml;' => 'ue', 'ů' => 'u', 'ű' => 'u', 
    'ŭ' => 'u', 'ũ' => 'u', 'ų' => 'u', 'ŵ' => 'w', 'ý' => 'y', 'ÿ' => 'y', 
    'ŷ' => 'y', 'ž' => 'z', 'ż' => 'z', 'ź' => 'z', 'þ' => 't', 'ß' => 'ss', 
    'ſ' => 'ss', 'ый' => 'iy', 'А' => 'A', 'Б' => 'B', 'В' => 'V', 'Г' => 'G', 
    'Д' => 'D', 'Е' => 'E', 'Ё' => 'YO', 'Ж' => 'ZH', 'З' => 'Z', 'И' => 'I', 
    'Й' => 'Y', 'К' => 'K', 'Л' => 'L', 'М' => 'M', 'Н' => 'N', 'О' => 'O', 
    'П' => 'P', 'Р' => 'R', 'С' => 'S', 'Т' => 'T', 'У' => 'U', 'Ф' => 'F', 
    'Х' => 'H', 'Ц' => 'C', 'Ч' => 'CH', 'Ш' => 'SH', 'Щ' => 'SCH', 'Ъ' => '', 
    'Ы' => 'Y', 'Ь' => '', 'Э' => 'E', 'Ю' => 'YU', 'Я' => 'YA', 'а' => 'a', 
    'б' => 'b', 'в' => 'v', 'г' => 'g', 'д' => 'd', 'е' => 'e', 'ё' => 'yo', 
    'ж' => 'zh', 'з' => 'z', 'и' => 'i', 'й' => 'y', 'к' => 'k', 'л' => 'l', 
    'м' => 'm', 'н' => 'n', 'о' => 'o', 'п' => 'p', 'р' => 'r', 'с' => 's', 
    'т' => 't', 'у' => 'u', 'ф' => 'f', 'х' => 'h', 'ц' => 'c', 'ч' => 'ch', 
    'ш' => 'sh', 'щ' => 'sch', 'ъ' => '', 'ы' => 'y', 'ь' => '', 'э' => 'e', 
    'ю' => 'yu', 'я' => 'ya' 
); 

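    public static function clean($string, $allowed=array(), $base="a-zA-Z0-9 "){ 
        // Nothing to match against: refuse rather than strip everything.
        if(empty($allowed) && !$base){ return false; } 
        $ignore = ""; 
        if(is_array($allowed)){ 
            foreach($allowed as $a){ 
                // Escape regex metacharacters, including the "/" delimiter.
                $ignore .= preg_quote($a, '/'); 
            } 
        } 
        // Remove every character not in the base set, the allowed list, or whitespace.
        return preg_replace("/[^{$base}{$ignore}\s]/", "", $string); 
    } 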

    public static function alphaNum($string, $allowed=array(), $convert=false){ 
    if($convert){ 
     $string = strtr($string, self::$cleanChars); 
    } 
    return self::clean($string, $allowed, 'a-zA-Z0-9 '); 
    } 

} 

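Examples:

Strip all punctuation:

ClassName::alphaNum($string);

Strip all punctuation and also convert special characters:

ClassName::alphaNum($string, null, true);

Alphanumeric plus additional punctuation:

ClassName::alphaNum($string, array('_', '-', ','));

Alphanumeric plus additional punctuation, with conversion:

ClassName::alphaNum($string, array('_', '-', ','), true);

Conclusion: If you expect special characters and don't want to discard them entirely, you can convert them before running the alphaNum check (for example, when cleaning file names).

If discarding special characters has no real impact, and none are really expected on your system, you can call it without conversion to save processing power (for example, when setting keys from a large array of strings).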
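
The difference between the two modes can be sketched with a heavily shortened stand-in for the class. Everything here is an assumption made for illustration: the four-entry `TRANSLIT` table replaces the full `$cleanChars` array, and `alphaNumSketch` is a hypothetical free function that merges `clean()` and `alphaNum()`.

```php
<?php
// Hypothetical, shortened stand-in for the $cleanChars table.
const TRANSLIT = ['ä' => 'ae', 'é' => 'e', 'ß' => 'ss', 'Ö' => 'Oe'];

function alphaNumSketch(string $s, array $allowed = [], bool $convert = false): string
{
    if ($convert) {
        // Transliterate first, so accented letters survive as ASCII.
        $s = strtr($s, TRANSLIT);
    }
    $ignore = '';
    foreach ($allowed as $a) {
        // Escape regex metacharacters and the "/" delimiter.
        $ignore .= preg_quote($a, '/');
    }
    // Drop everything outside a-z, A-Z, 0-9, space and the allowed extras.
    return preg_replace("/[^a-zA-Z0-9 {$ignore}]/", '', $s);
}

echo alphaNumSketch('Käse_Soße.txt!', ['_', '.']), "\n";       // Kse_Soe.txt
echo alphaNumSketch('Käse_Soße.txt!', ['_', '.'], true), "\n"; // Kaese_Sosse.txt
```

Without conversion the accented letters are simply dropped; with conversion they are replaced by their ASCII equivalents before the filter runs.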

I got the cleanChars var from here (I modified it slightly): https://github.com/vanillaforums/Garden/blob/master/library/core/class.format.php

21

Here's a neat way to do it:

preg_replace("#[[:punct:]]#", "", $target); 
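
One thing to keep in mind is that [[:punct:]] covers all ASCII punctuation and symbol characters, including the ones the question wants to keep. The sketch below shows the plain call next to an assumed variant that borrows the negative lookahead idea to protect . = $ ' % and -; the sample string and variable names are made up for illustration.

```php
<?php
$target = 'Price: $5.00 (50% off) <b>now</b>!';

// Plain POSIX class: removes every ASCII punctuation or symbol character.
$all = preg_replace('#[[:punct:]]#', '', $target);

// Assumed variant: the lookahead skips the characters that should be kept.
$kept = preg_replace('#(?![.=$\'%-])[[:punct:]]#', '', $target);

echo $all, "\n";  // Price 500 50 off bnowb
echo $kept, "\n"; // Price $5.00 50% off bnowb
```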
+3

This is the correct answer – 2013-09-06 12:08:40

+0

I'm glad it helped – 2013-10-06 21:24:03

+0

I really like this one. – 2016-08-14 22:39:41