2010-06-07 33 views
4

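How do I append to all URLs in a string?

How should I go about appending a query string to the end of every URL in an HTML string that is about to be sent as an email? I want to add Google Analytics campaign tracking to the links, like this:

?utm_source=email&utm_medium=email&utm_campaign=product_notify

99% of the pages won't end in ".html", and some URLs may already end with something like ?sr=1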

Answers

5

Hmm... you could do something like this:

function AppendCampaignToString($string) {
    // Capture the start of the tag, the href value, and the rest of the tag.
    $regex = '#(<a href=")([^"]*)("[^>]*?>)#i';
    return preg_replace_callback($regex, '_AppendCampaignToString', $string);
}
function _AppendCampaignToString($match) {
    $url = $match[2];
    // Start a query string if there is none, otherwise extend the existing one.
    $url .= (strpos($url, '?') === false) ? '?' : '&';
    $url .= 'utm_source=email&utm_medium=email&utm_campaign=product_notify';
    return $match[1].$url.$match[3];
}

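This should automatically find all the links on the page (even external ones, so be careful). The ? check just makes sure we are appending to a query string...

Edit: fixed an issue where the regex did not work as expected.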
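The ? check above does not cover two cases: a URL that ends in a #fragment, and a URL that already carries one of the utm_ parameters. A minimal sketch of one way to handle both is shown here. The helper name addQueryParams() and the example.com URL are assumptions made for illustration and are not part of the answer; the sketch also assumes that overwriting an existing value is the desired behaviour.

```php
<?php
// Hypothetical helper (not from the answer above): merge tracking parameters
// into a URL's query string, keeping any #fragment at the very end.
function addQueryParams($url, array $params)
{
    // Split off the fragment so parameters are not appended after it.
    $fragment = '';
    $hashPos = strpos($url, '#');
    if ($hashPos !== false) {
        $fragment = substr($url, $hashPos);
        $url = substr($url, 0, $hashPos);
    }

    // Split off and parse the existing query string, if any.
    $existing = array();
    $queryPos = strpos($url, '?');
    if ($queryPos !== false) {
        parse_str(substr($url, $queryPos + 1), $existing);
        $url = substr($url, 0, $queryPos);
    }

    // Later arrays win in array_merge(), so new values overwrite old ones.
    $query = http_build_query(array_merge($existing, $params));

    return $url . ($query === '' ? '' : '?' . $query) . $fragment;
}

$utm = array(
    'utm_source'   => 'email',
    'utm_medium'   => 'email',
    'utm_campaign' => 'product_notify',
);
echo addQueryParams('http://example.com/page?sr=1#top', $utm), "\n";
```

Note that parse_str() rewrites dots and spaces in parameter names to underscores, so this sketch is only suitable for URLs whose parameter names do not rely on those characters.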

+0

Awesome. Thanks. – Echo 2010-06-07 15:03:34

+1

Hmm, should utm_source overwrite one that is already present, be appended (PHP can't handle that in the $_GET array), or the other way around? – Wrikken 2010-06-07 15:08:51

+0

Saved me a lot of time, great work. – jbnunn 2011-04-14 22:55:01

0

You can use the following snippet to append your Google Analytics GET parameters to the existing parameters of the current script URI.

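function getQuery() {
    $url = parse_url($_SERVER['REQUEST_URI']);
    // 'query' is only set when the request URI actually has a query string.
    $query = isset($url['query']) ? $url['query'].'&' : '';
    return $query.'utm_source=email&utm_medium=email&utm_campaign=product_notify';
}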
+0

Nice, but not what I'm looking for. The URLs I want to append to are in a string of HTML. – Echo 2010-06-07 14:59:55

2
<?php
$add = array(
    'utm_source'   => 'email',
    'utm_medium'   => 'email',
    'utm_campaign' => 'product_notify');
$doc = new DOMDocument();
$doc->loadHTML('your html');
foreach ($doc->getElementsByTagName('a') as $link) {
    $url = parse_url($link->getAttribute('href'));
    // Merge the tracking parameters into any existing query parameters.
    $gets = $add;
    if (isset($url['query'])) {
        parse_str($url['query'], $existing);
        $gets = array_merge($existing, $add);
    }
    // Rebuild the URL from its parts.
    $newstring = '';
    if (isset($url['scheme'])) $newstring .= $url['scheme'].'://';
    if (isset($url['host'])) $newstring .= $url['host'];
    if (isset($url['port'])) $newstring .= ':'.$url['port'];
    if (isset($url['path'])) $newstring .= $url['path'];
    $newstring .= '?'.http_build_query($gets);
    if (isset($url['fragment'])) $newstring .= '#'.$url['fragment'];
    $link->setAttribute('href', $newstring);
}
$html = $doc->saveHTML();
?>
+1

Great. This is the way to handle it. Not a complete answer, but I was looking for something like this. – shamittomar 2014-04-08 08:07:24

0

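My solution, which I built and tested last night:

I match only links that do not already have "utm_"-style query parameters, but I still include links where "utm_" appears as part of the path before the query parameters, or as a substring of another parameter name in the query, such as "xutm_".

For this I used a combination of positive and negative regex assertions (http://php.net/manual/en/regexp.reference.assertions.php).

I also allow the tag to have other attributes before and after the href.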

$pattern = '/<a[^>]*href="(?=(.(?!(\?|&)utm_))*?>)[^"]*/i'; 

It matches every link that has neither '?utm_' nor '&utm_' in its href.
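To see which links the pattern accepts, it can be run against a few sample anchors. The sketch below copies the pattern from this answer; the sample links and the example.com host are made up for illustration.

```php
<?php
// Pattern copied from the answer; the sample links are invented for this demo.
$pattern = '/<a[^>]*href="(?=(.(?!(\?|&)utm_))*?>)[^"]*/i';

$samples = array(
    '<a href="http://example.com/page">no query</a>',
    '<a class="btn" href="http://example.com/utm_page?xutm_id=1">lookalikes</a>',
    '<a href="http://example.com/?utm_source=old">already tagged</a>',
    '<a href="http://example.com/?sr=1&utm_medium=old">already tagged</a>',
);

foreach ($samples as $html) {
    // preg_match() returns 1 when the link would be rewritten, 0 when it is skipped.
    echo preg_match($pattern, $html) ? 'match:   ' : 'skipped: ', $html, "\n";
}
```

The first two samples match, because "utm_" there is either absent or not directly preceded by ? or &. The last two are skipped, since they already carry a utm_ query parameter.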

Then I use a class-based callback so that I can pass in the query parameters to be appended (as extra parameters for the callback).

class link_params{
    private $parameters;

    function __construct($params){
        $this->parameters = $params;
    }

    // Append with '&' if the matched link already has a query string, otherwise with '?'.
    function callback($matches){
        return $matches[0] . (preg_match('/\\?[^"]/', $matches[0]) ? '&' : '?') . http_build_query($this->parameters);
    }
}

Prepare the query parameters I want to add to the links:

$params_to_add = array(
    'utm_source' => 'newsletter-sep13', 
    'utm_medium' => 'email', 
    'utm_campaign' => 'product-X' 
); 

$callback_helper = new link_params($params_to_add); 

Finally, I apply preg_replace_callback like this:

$html = preg_replace_callback($pattern, array($callback_helper, 'callback'), $html); 
4

An update to @ircmaxell's answer: the regex now also matches when other attributes come before the href, and the code is simplified.

/**
 * @param string $body
 * @param string $campaign
 * @param string $medium
 * @return mixed
 */
protected function add_analytics_tracking_to_urls($body, $campaign, $medium = 'email') {
    return preg_replace_callback('#(<a.*?href=")([^"]*)("[^>]*?>)#i', function($match) use ($campaign, $medium) {
        $url = $match[2];
        if (strpos($url, '?') === false) {
            $url .= '?';
        } else {
            $url .= '&';
        }
        $url .= 'utm_source=' . $medium . '&utm_medium=' . $medium . '&utm_campaign=' . urlencode($campaign);
        return $match[1] . $url . $match[3];
    }, $body);
}
0

Here is my solution. The problem is simple, but the solution is fairly complex:

$campaign = (object)['utm_source' => 'email', 'utm_medium' => 'email', 'utm_campaign' => 'abc'];
$host = 'www.me.com';

$html = preg_replace_callback(
    '#(<a.*?href=["\']?)(?<href>https?://[^\s"\']+)(["\']?.*?>.*?</a>)#si',
    function ($matches) use ($campaign, $host) {
        $url = parse_url($matches['href']);
        // if (isset($url['host']) && $url['host'] !== $host) return $matches[0];
        parse_str(isset($url['query']) ? $url['query'] : '', $query);
        // Copy only the utm_* values that are set and non-empty; this avoids
        // undefined-property warnings for utm_term and utm_content.
        foreach (['utm_source', 'utm_medium', 'utm_term', 'utm_content', 'utm_campaign'] as $key) {
            if (!empty($campaign->$key)) {
                $query[$key] = $campaign->$key;
            }
        }
        return $matches[1] . // anchor part before URL
            (isset($url['scheme']) ? $url['scheme'] . '://' : '') .
            (isset($url['user']) ? $url['user'] : '') .
            (isset($url['pass']) ? (isset($url['user']) ? ':' : '') . $url['pass'] : '') .
            (isset($url['user']) || isset($url['pass']) ? '@' : '') .
            (isset($url['host']) ? $url['host'] : '') .
            (isset($url['port']) ? ':' . $url['port'] : '') .
            (isset($url['path']) ? $url['path'] : '') .
            '?' . http_build_query($query) .
            (isset($url['fragment']) ? '#' . $url['fragment'] : '') .
            $matches[3]; // anchor part after URL
    },
    $html
);

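The last part (concatenating the URL so that it works for every URL type) could also be replaced with http_build_url(), but you would need the HTTP extension enabled.

Tested with the following links: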

<a href="http://www.me.com">Lorem</a> 
<a href="http://www.me.com/">ipsum</a> 
<a href="http://www.me.com/#section-2">dolor</a> 
<a href="http://www.me.com/path-to-somewhere/file.php">sit</a> 
<a href="http://www.me.com/?">amet</a> 
<a href="http://www.me.com/?foo=bar">consectetur</a> 
<a href="http://www.me.com/?foo=bar&bar=foo">consectetur</a> 
<a href="http://www.NOTME.com?utm_source=XXX&utm_medium=XXX&utm_campaign=XXX">existing utm params</a> 
<a href="http://user:[email protected]/?foo=bar#section-3">elit</a> 
<a href="http://user:@www.me.com/?foo=bar#section-3">elit</a> 
<a href="http://[email protected]?foo=bar#section-3">elit</a> 

This gives the following result:

<a href="http://www.me.com?utm_source=email&utm_medium=email&utm_campaign=abc">Lorem</a> 
<a href="http://www.me.com/?utm_source=email&utm_medium=email&utm_campaign=abc">ipsum</a> 
<a href="http://www.me.com/?utm_source=email&utm_medium=email&utm_campaign=abc#section-2">dolor</a> 
<a href="http://www.me.com/path-to-somewhere/file.php?utm_source=email&utm_medium=email&utm_campaign=abc">sit</a> 
<a href="http://www.me.com/?utm_source=email&utm_medium=email&utm_campaign=abc">amet</a> 
<a href="http://www.me.com/?foo=bar&utm_source=email&utm_medium=email&utm_campaign=abc">consectetur</a> 
<a href="http://www.me.com/?foo=bar&bar=foo&utm_source=email&utm_medium=email&utm_campaign=abc">consectetur</a> 
<a href="http://www.NOTME.com?utm_source=email&utm_medium=email&utm_campaign=abc">existing utm params</a> 
<a href="http://user:[email protected]/?foo=bar&utm_source=email&utm_medium=email&utm_campaign=abc#section-3">elit</a> 
<a href="http://user:@www.me.com/?foo=bar&utm_source=email&utm_medium=email&utm_campaign=abc#section-3">elit</a> 
<a href="http://[email protected]?foo=bar&utm_source=email&utm_medium=email&utm_campaign=abc#section-3">elit</a> 

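As you can see, my code applies to all links in the HTML (not only me.com). If you want to filter by hostname, uncomment the line after parse_url().

The code was tested on the URLs above.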
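The commented-out line after parse_url() compares host names with !==, which is case-sensitive, while host names themselves are case-insensitive. A minimal sketch of a case-insensitive check follows; the helper name isOwnHost() is an assumption made for illustration and does not appear in the answer.

```php
<?php
// Hypothetical helper: true when $href points at $host, ignoring letter case.
function isOwnHost($href, $host)
{
    $linkHost = parse_url($href, PHP_URL_HOST);
    // parse_url() returns null when there is no host and false on a malformed URL.
    return is_string($linkHost) && strcasecmp($linkHost, $host) === 0;
}

var_dump(isOwnHost('http://www.me.com/?foo=bar', 'www.me.com'));
var_dump(isOwnHost('http://WWW.ME.COM/', 'www.me.com'));
var_dump(isOwnHost('http://www.NOTME.com', 'www.me.com'));
```

Inside the callback, a check of this kind would return $matches[0] unchanged whenever isOwnHost() is false, so that external links keep their original URLs.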
