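I haven't tested this code, but I think a non-regex approach might work better for you. Basically, you split the string on spaces and then parse each piece. With this approach, the order of the parts doesn't matter.
It's a little tricky because the content and the project can each span several pieces, but I think my code should handle that. It also assumes each tweet has only one hashtag, user, project, and priority. If there can be several hashtags, for example, just store them in an array instead of a string. Finally, it has no error handling to detect or prevent odd input.
Here is my untested code: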
    $data = array(
        'hash' => '',
        'user' => '',
        'priority' => '',
        'project' => '',
        'content' => ''
    );
    $parsingProjectName = false;
    foreach(explode(' ', $tweet) as $piece)
    {
        switch(substr($piece, 0, 1))
        {
            case '#':
                $data['hash'] = substr($piece, 1);
                break;
            case '@':
                $data['user'] = substr($piece, 1);
                break;
            case '!':
                $data['priority'] = substr($piece, 1);
                break;
            case '[':
                // Check if the project name is longer than 1 word
                // (substr with -1 returns the last character; strpos would not)
                if(substr($piece, -1) == ']')
                {
                    $data['project'] = substr($piece, 1, -1);
                }
                else
                {
                    // There will be more to parse in the next piece(s)
                    $parsingProjectName = true;
                    $data['project'] = substr($piece, 1) . ' ';
                }
                break;
            default:
                if($parsingProjectName)
                {
                    // Are we at the end yet?
                    if(substr($piece, -1) == ']')
                    {
                        // Yes we are: keep the whole piece except the closing ']'
                        $data['project'] .= substr($piece, 0, -1);
                        $parsingProjectName = false;
                    }
                    else
                    {
                        // Nope, there is more: this piece has no leading '[' to strip
                        $data['project'] .= $piece . ' ';
                    }
                }
                else
                {
                    // We aren't in the middle of parsing the project name, and this piece
                    // doesn't start with one of the special chars, so assume it is content
                    $data['content'] .= $piece . ' ';
                }
        }
    }
    // There will be an extra space on the end; remove it
    $data['content'] = rtrim($data['content']);
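To make the array idea for multiple hashtags concrete, here is a minimal sketch of the same split-and-inspect approach wrapped in a function. The name `parseTweet`, the `hashes` key, and the sample tweet are made up for illustration and are not from the question. Unlike the code above, this version treats every piece between `[` and `]` as part of the project name, even if it starts with `#`, `@`, or `!`; that is an assumption about the input, not a requirement.

```php
<?php
// Hypothetical helper: same idea as above, but hashtags are collected in an array.
function parseTweet(string $tweet): array
{
    $data = ['hashes' => [], 'user' => '', 'priority' => '', 'project' => '', 'content' => []];
    $inProject = false;

    // Split on any run of whitespace and skip empty pieces
    foreach (preg_split('/\s+/', $tweet, -1, PREG_SPLIT_NO_EMPTY) as $piece) {
        if ($inProject) {
            // Still inside [ ... ]: append pieces until one ends with ']'
            if (substr($piece, -1) === ']') {
                $data['project'] .= ' ' . substr($piece, 0, -1);
                $inProject = false;
            } else {
                $data['project'] .= ' ' . $piece;
            }
            continue;
        }

        switch ($piece[0]) {
            case '#':
                $data['hashes'][] = substr($piece, 1);
                break;
            case '@':
                $data['user'] = substr($piece, 1);
                break;
            case '!':
                $data['priority'] = substr($piece, 1);
                break;
            case '[':
                if (substr($piece, -1) === ']') {
                    // One-word project name such as [Website]
                    $data['project'] = substr($piece, 1, -1);
                } else {
                    $data['project'] = substr($piece, 1);
                    $inProject = true;
                }
                break;
            default:
                $data['content'][] = $piece;
        }
    }

    // Joining at the end avoids the trailing-space cleanup
    $data['content'] = implode(' ', $data['content']);
    return $data;
}

$result = parseTweet('Fix the login bug #work #urgent @bob !high [Web Site Redesign]');
print_r($result);
```

Collecting the content pieces in an array and joining them once at the end is just a convenience; appending to a string and trimming, as in the original, works equally well.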
What do you think '\w' does? It's almost the same as '[a-zA-Z]'. – Vyktor 2012-03-03 21:30:32
Just loop over all the matches and build a string from every match that doesn't start with #, @, ! or [. – Yaniro 2012-03-03 21:44:47