2011-05-25
1

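Google's search results contain some of our URLs with a SID in them, and we want to 301-redirect those URLs to the same pages without the SID. So we need a rewrite rule in .htaccess that strips the SID from the URL and issues a 301 redirect from this URL: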

http://www.in-due.de/hochzeitsshop/catalogsearch/result/index/?SID=8df077eea401bda0da7e9a980efe20cf&cat=148&dir=asc&limit=9&order=relevance&p=8&q=gold 

to this URL:

http://www.in-due.de/hochzeitsshop/catalogsearch/result/index/?cat=148&dir=asc&limit=9&order=relevance&p=8&q=gold 

Basically, it should remove this part:

SID=8df077eea401bda0da7e9a980efe20cf& 

Can anyone help?

Answers

3

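Log in to Google's Webmaster Tools and, under Site configuration > Parameter handling, add SID to the list of parameters. You can remove the URLs manually, but I just use this robots.txt file and let the bots drop the URLs that carry a session ID as they recrawl.

This is the robots.txt file I have been using for Magento sites. You may need to adjust it as necessary: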

# $Id: robots.txt,v magento-specific 2010/28/01 18:24:19 goba Exp $ 
# 
# robots.txt 
# 
# This file is to prevent the crawling and indexing of certain parts 
# of your site by web crawlers and spiders run by sites like Yahoo! 
# and Google. By telling these "robots" where not to go on your site, 
# you save bandwidth and server resources. 
# 
# This file will be ignored unless it is at the root of your host: 
# Used: http://example.com/robots.txt 
# Ignored: http://example.com/site/robots.txt 
# 
# For more information about the robots.txt standard, see: 
# http://www.robotstxt.org/wc/robots.html 
# 
# For syntax checking, see: 
# http://www.sxw.org.uk/computing/robots/check.html 

# Website Sitemap 
Sitemap: http://www.yourdomain.com/sitemap.xml 

# Crawlers Setup 
User-agent: * 
Crawl-delay: 10 

# Allowable Index 
Allow: /*?p= 
Allow: /index.php/blog/ 
Allow: /catalog/seo_sitemap/category/ 
Allow: /catalogsearch/result/ 

# Directories 
Disallow: /404/ 
Disallow: /app/ 
Disallow: /cgi-bin/ 
Disallow: /downloader/ 
Disallow: /includes/ 
Disallow: /js/ 
Disallow: /lib/ 
Disallow: /magento/ 
Disallow: /media/ 
Disallow: /pkginfo/ 
Disallow: /report/ 
Disallow: /skin/ 
Disallow: /stats/ 
Disallow: /var/ 

# Paths (clean URLs) 
Disallow: /index.php/ 
Disallow: /catalog/product_compare/ 
Disallow: /catalog/category/view/ 
Disallow: /catalog/product/view/ 
Disallow: /catalogsearch/ 
Disallow: /checkout/ 
Disallow: /control/ 
Disallow: /contacts/ 
Disallow: /customer/ 
Disallow: /customize/ 
Disallow: /newsletter/ 
Disallow: /poll/ 
Disallow: /review/ 
Disallow: /sendfriend/ 
Disallow: /tag/ 
Disallow: /wishlist/ 

# Files 
Disallow: /cron.php 
Disallow: /cron.sh 
Disallow: /error_log 
Disallow: /install.php 
Disallow: /LICENSE.html 
Disallow: /LICENSE.txt 
Disallow: /LICENSE_AFL.txt 
Disallow: /STATUS.txt 

# Paths (no clean URLs) 
Disallow: /*.js$ 
Disallow: /*.css$ 
Disallow: /*.php$ 
Disallow: /*?p=*& 
Disallow: /*?SID= 
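
To check which URLs a wildcard rule such as `Disallow: /*?SID=` would cover, here is a rough Python sketch. It assumes the `*` (any run of characters) and `$` (end of URL) extensions that Google documents for robots.txt; other crawlers may not honour them. `robots_pattern_matches` is a hypothetical helper name, and the sketch tests a single pattern only, without modelling how Allow and Disallow rules are prioritised against each other.

```python
import re


def robots_pattern_matches(pattern, path_and_query):
    """Return True if a robots.txt path pattern matches the given URL path.

    Assumes Google-style wildcards: "*" matches any characters and a
    trailing "$" anchors the pattern at the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        body += "$"
    # re.match anchors at the start, mirroring robots.txt prefix matching.
    return re.match(body, path_and_query) is not None


if __name__ == "__main__":
    url = "/hochzeitsshop/catalogsearch/result/index/?SID=8df077eea401bda0da7e9a980efe20cf&cat=148"
    print(robots_pattern_matches("/*?SID=", url))
```

Note that `/*?SID=` only matches when SID is the first query parameter, as it is in the URL from the question. A URL like `...?cat=148&SID=...` would not be covered by that rule.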
+0

+1 for robots.txt – Tim 2011-05-25 20:03:22

2

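I wouldn't use mod_rewrite for this, because it is overkill in this case. Sometimes the SID is needed and shouldn't be removed from the URL.

You can do what B00MER suggested and also follow the best practice laid out by Google: http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html

For example, you would add the following to the page's head: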
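
The snippet this answer refers to did not survive in this copy. The linked Google post describes a `<link rel="canonical">` element placed in the page's `<head>`, so here is a small Python sketch that builds such a tag for a SID-free URL. `canonical_link_tag` is a hypothetical helper name, and the shortened URL is only an illustration based on the one in the question.

```python
from html import escape


def canonical_link_tag(clean_url):
    """Build a <link rel="canonical"> element for the page's <head>."""
    # escape() turns "&" into "&amp;" so the URL is valid inside an attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(clean_url, quote=True))


if __name__ == "__main__":
    print(canonical_link_tag(
        "http://www.in-due.de/hochzeitsshop/catalogsearch/result/index/?cat=148&q=gold"
    ))
```

The tag would be emitted on every variant of the page, with or without a SID, always pointing at the SID-free URL.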

Combining robots.txt with canonical URLs should resolve any SEO issues you may have.

Good luck!

+0

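OK, I see your point (and thanks for the explanation), but Google has now indexed URLs with a SID that lead to 404s. I would like to redirect those to the correct pages (the ones without the SID) for as long as they still show up in search results. – perler 2011-05-27 09:42:13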
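
For the redirect asked about here, the decision logic could look roughly like the Python sketch below. It is not an .htaccess rule, only an illustration of what such a rule would need to do: redirect only when a SID is present (so the SID-free target never redirects again), and leave the other parameters untouched. `sid_redirect` and `SID_RE` are hypothetical names, and the sketch assumes the parameter is always spelled `SID` in upper case.

```python
import re

# Matches an SID parameter at the start of the query string or after an "&".
SID_RE = re.compile(r"(^|&)SID=[^&]*")


def sid_redirect(path, query_string):
    """Return (301, location) for a request that carries a SID, else None."""
    if not SID_RE.search(query_string):
        return None  # nothing to strip, so no redirect and no loop
    # Drop the SID pair, then any "&" left dangling at the front.
    cleaned = SID_RE.sub("", query_string).lstrip("&")
    location = path + ("?" + cleaned if cleaned else "")
    return 301, location


if __name__ == "__main__":
    print(sid_redirect(
        "/hochzeitsshop/catalogsearch/result/index/",
        "SID=8df077eea401bda0da7e9a980efe20cf&cat=148&dir=asc&limit=9&order=relevance&p=8&q=gold",
    ))
```

Working on the raw query string, rather than parsing and re-encoding it, keeps the remaining parameters byte-for-byte as they were requested.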