Sunday, May 20, 2012

Webmaster Crawl Errors and Joomla

If you use sh404 or the , you may find that Googlebot still gets in and crawls your non-optimised URLs, the ones containing index.php?option=com_... and so on.
To stop these from being crawled, add the following to your robots.txt file.
 
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /modules/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/
Disallow: /jupgrade/
Disallow: /*font-size*
Disallow: /*option=com*
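
Before uploading the file, it can help to check which URLs the wildcard rules would catch. The following is a minimal Python sketch, assuming Google-style matching where `*` stands for any run of characters and a trailing `$` anchors the end of the URL. The function names and sample URLs are made up for illustration, and other crawlers may interpret wildcards differently.

```python
import re


def robots_pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the URL (Google-style matching).
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(url_path, disallow_patterns):
    """Return True if any Disallow pattern matches the path plus query string."""
    return any(robots_pattern_to_regex(p).match(url_path) for p in disallow_patterns)


# A few of the rules from the robots.txt above.
DISALLOW = ["/administrator/", "/*font-size*", "/*option=com*"]

# Hypothetical URLs, for illustration only.
for path in [
    "/index.php?option=com_content&view=article&id=12",
    "/articles/my-post?font-size=larger",
    "/articles/my-post",
]:
    print(path, "->", "blocked" if is_blocked(path, DISALLOW) else "crawlable")
```

With these rules, the first two sample URLs should come out as blocked and the clean /articles/my-post URL as crawlable.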
 
You can further guide the bots by specifying exactly where they should crawl. You'll need to adjust this for each site's URL structure, but something like this:
 
Allow: /articles/
Allow: /community/
Allow: /forum/
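
If you look after several sites, each with its own URL structure, you could generate the file from a per-site list rather than editing it by hand. This is only a rough Python sketch: `build_robots_txt` and the two lists are hypothetical names, and the paths are the example ones from above, not a recommendation for any particular site.

```python
def build_robots_txt(disallow, allow, user_agent="*"):
    """Build robots.txt text from lists of Disallow and Allow paths."""
    lines = ["User-agent: " + user_agent]
    lines += ["Disallow: " + path for path in disallow]
    lines += ["Allow: " + path for path in allow]
    return "\n".join(lines) + "\n"


# Example values only; adjust both lists for each site's URL structure.
BLOCKED = ["/administrator/", "/tmp/", "/*font-size*", "/*option=com*"]
SITE_SECTIONS = ["/articles/", "/community/", "/forum/"]

robots_txt = build_robots_txt(BLOCKED, SITE_SECTIONS)
print(robots_txt, end="")
```

Save the printed output as robots.txt in the site root.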
 
Finally, be sure to install the Xmap component and submit its output to Google Webmaster Tools.