Encoding without semicolons

I'm trying to decode text that I believe was originally in WINDOWS-1251. The string looks like this:

&#1040&#1075&#1077&#1085&#1090

This should represent "Agent" in Russian. The problems here are:

I can't convert this string unless I add a semicolon after every number.
I can't do that by hand, because I have 10000 lines of text to convert.

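So the question is: what is this encoding (without semicolons), and how can I add the semicolons automatically on every line (a regular expression, maybe?) without breaking the codes?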

So far I have been trying to do this with the following code:

Application logic

public function parseSentence(array $sentences, $sentence, $i) {
    // Split on the first "special" character found, then add semicolons per piece.
    if (strstr($sentence, '-')) {
        $sentences[$i] = $this->explodeAndSplit('-', $sentence);
    } else if (strstr($sentence, "'")) {
        $sentences[$i] = $this->explodeAndSplit("'", $sentence);
    } else if (strstr($sentence, "(")) {
        $sentences[$i] = $this->explodeAndSplit("(", $sentence);
    } else if (strstr($sentence, ")")) {
        $sentences[$i] = $this->explodeAndSplit(")", $sentence);
    } else {
        if (strstr($sentence, '#')) {
            // Each code such as "&#1040" is 6 characters long.
            $sentences[$i] = chunk_split($sentence, 6, ';');
        }
    }
    return $sentences;
}

/** 
* Explode and Split 
* @param string $explodeBy 
* @param string $string 
* 
* @return string 
*/ 
private function explodeAndSplit($explodeBy, $string) { 
    $exp = explode($explodeBy, $string); 
    for ($j = 0; $j < count($exp); $j++) { 
     $exp[$j] = chunk_split($exp[$j], 6, ';'); 
    } 
    return implode($explodeBy, $exp); 
}

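But obviously this approach is somewhat incorrect (well, completely incorrect), because I'm not accounting for many other "special" characters. So how can this be solved?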

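Update:
I'm using Lumen for the backend and AngularJS for the frontend. All the data (database, text files, etc.) is parsed in Lumen, which provides API routes that AngularJS uses to access and retrieve it. The thing is, this semicolonless encoding works fine in any browser when the route is accessed directly, but it fails to display in Angular because of the missing semicolons.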

Source

2016-06-13 Ivan Zhivolupov

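These are Russian HTML Codes (Cyrillic), that is, decimal numeric character references to Unicode code points. To make sure they display properly, you need to set the appropriate content-type: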

<meta http-equiv="content-type" content="text/html;charset=utf-8" />

To do this properly, you have to preg_split() your string of HTML codes, like so:

array_filter(preg_split("/[&#]+/", $str));

(array_filter() just removes any empty values. You could also use explode() to do the same thing.)
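As a rough sketch of the explode() variant mentioned in the note, assuming the input contains nothing but `&#`-prefixed numbers (the sample string is the one from the question; the variable names are only illustrative):

```php
<?php
// Sample input taken from the question.
$str = '&#1040&#1075&#1077&#1085&#1090';

// explode() on the literal "&#" prefix leaves an empty first element,
// which array_filter() drops; array_values() reindexes the result from 0.
// Caveat: array_filter() without a callback also drops the string "0".
$codes = array_values(array_filter(explode('&#', $str)));

print_r($codes);
```

Either way, the result is the bare list of code numbers.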

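Splitting this way returns an array of the numbers you have. From there, a simple implode() that prepends the required &# and appends ; does the job: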

echo '&#' .implode(";&#", array_filter(preg_split("/[&#]+/", $str))) . ';';

This returns the following, which is proper HTML when output:

&#1040;&#1075;&#1077;&#1085;&#1090;

Now it displays the following Russian text:

Агент

Which translates from Russian as Agent.
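If a line mixes the codes with other characters (hyphens, quotes, parentheses, spaces), splitting and imploding would also wrap that other text. One possible alternative, sketched here under the assumption that every code is `&#` followed by decimal digits, is a single preg_replace() that appends the semicolon only where it is missing. The function names `addMissingSemicolons` and `decodeToUtf8` are hypothetical, not part of any library.

```php
<?php
/**
 * Append ";" to every decimal reference such as "&#1040" that lacks one.
 * Hypothetical helper; assumes references are "&#" plus decimal digits.
 */
function addMissingSemicolons(string $text): string
{
    // \d++ is possessive: it consumes all digits and never backtracks,
    // so the lookahead (?!;) is only tested after the complete number.
    return preg_replace('/&#(\d++)(?!;)/', '&#$1;', $text);
}

/**
 * Repair the references, then decode them into real UTF-8 characters.
 */
function decodeToUtf8(string $text): string
{
    return html_entity_decode(
        addMissingSemicolons($text),
        ENT_QUOTES | ENT_HTML5,
        'UTF-8'
    );
}

// Mixed line: punctuation between the codes is left untouched.
echo addMissingSemicolons('&#1040&#1075&#1077&#1085&#1090 (&#1040-&#1075)'), PHP_EOL;
// prints: &#1040;&#1075;&#1077;&#1085;&#1090; (&#1040;-&#1075;)

echo decodeToUtf8('&#1040&#1075&#1077&#1085&#1090'), PHP_EOL;
// prints: Агент
```

One case stays ambiguous: a code followed directly by literal digits, because nothing marks where the number ends. Decoding on the server and returning plain UTF-8 from the API route might also sidestep the display problem, since AngularJS interpolation shows text as-is instead of parsing it as HTML. That is an assumption about the setup, not something the question confirms.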

Source

2016-06-13 00:59:48 Darren

Thank you very much for this amazing walkthrough and the simple solution to my problem, I really appreciate it –

@IvanZhivolupov My pleasure! I'm glad it helped you solve your problem! – Darren
