Extract any Unicode string occurrence in a string using preg_match

I have a string like this:

sample İletişim form:: aşağıdaki formu

My goal is to extract the words that contain a Unicode / non-ASCII character, using PHP's preg_match or preg_match_all.

So I expect only 2 results: the words İletişim and aşağıdaki.

Array 
(
    [0] => İletişim 
    [1] => aşağıdaki 
)

I just can't figure out the regular expression, since I'm not good at it. Any help is welcome.

Thanks a lot.

2013-06-05 Kenneth Palaganas

I think a starting point for the solution you want is here: How do I detect non-ASCII characters in a string?

With preg_match_all(), you can do something like this:

preg_match_all('/[^\s]*[^\x20-\x7f]+[^\s]*/', $string, $matches); 
print_r($matches);
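For illustration only, here is a minimal sketch of a variant of the pattern above. It assumes the input is valid UTF-8 and adds the /u modifier, so the pattern and subject are treated as UTF-8 text rather than raw bytes. The character class [^\x00-\x7F] is a swapped-in choice, not the answer's original class: it also excludes control characters such as tabs from counting as "non-ASCII".

```php
<?php
$string = 'sample İletişim form:: aşağıdaki formu';

// \S*           any leading non-whitespace characters of the word
// [^\x00-\x7F]  at least one character outside the ASCII range
// \S*           the rest of the word
// /u            treat pattern and subject as UTF-8 (assumes valid UTF-8 input)
preg_match_all('/\S*[^\x00-\x7F]\S*/u', $string, $matches);

// $matches[0] holds the full matches: İletişim and aşağıdaki
print_r($matches[0]);
```

Because the word is delimited by whitespace only, any punctuation attached to a matching word (as in "form::") would be kept in the result.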

Or, without preg_match, you can use the function mb_detect_encoding() to test the encoding of each word. In your case, you can use it this way:

$matches = array_filter(explode(' ', $string), function($item) { 
    return !mb_detect_encoding($item, 'ASCII', TRUE); 
}); 
print_r($matches);

But that last one is a bit twisted ^^

2013-06-05 09:42:56 Lebugg

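I tested the code and it only returns the non-ASCII characters, not the whole word containing that character. Maybe it is a step toward what I want, though. Thanks anyway.

I found one that works well. Try it with preg_match_all(): '/[^\s]*[^\x20-\x7f]+[^\s]*/' – Lebugg

I also edited my post to make the second solution work ;) – Lebugg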

You can use Unicode properties:

$string = 'sample İletişim form:: aşağıdaki formu'; 
preg_match_all("/(\pL+)/u", $string, $matches); 
print_r($matches);

Output:

Array 
(
    [0] => Array 
     (
      [0] => sample 
      [1] => İletişim 
      [2] => form 
      [3] => aşağıdaki 
      [4] => formu 
     ) 

    [1] => Array 
     (
      [0] => sample 
      [1] => İletişim 
      [2] => form 
      [3] => aşağıdaki 
      [4] => formu 
     ) 

)
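Since \pL+ matches every run of letters, ASCII-only words are returned as well. One possible refinement, sketched here under the assumption that the input is valid UTF-8, is to keep the same \pL+ match and then filter the words with a second, byte-level check. The variable name $nonAsciiWords is only illustrative.

```php
<?php
$string = 'sample İletişim form:: aşağıdaki formu';

// Step 1: collect every run of Unicode letters (same idea as above).
preg_match_all('/\pL+/u', $string, $matches);

// Step 2: keep only the words containing at least one byte outside
// the ASCII range, i.e. part of a multi-byte UTF-8 character.
$nonAsciiWords = array_values(array_filter($matches[0], function ($word) {
    return preg_match('/[^\x00-\x7F]/', $word) === 1;
}));

// $nonAsciiWords now holds: İletişim and aşağıdaki
print_r($nonAsciiWords);
```

array_values() is used only to reindex the result from 0, because array_filter() preserves the original keys.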

2013-06-05 11:26:52 Toto

This one also extracts the other words that don't contain any non-ASCII characters. But thanks for your contribution. Appreciate it :)
