Before we learn how to set up robots.txt in Magento, let's dig into what it is. Robots.txt ("robots dot txt") is a plain-text file that tells search engines like Google and Bing which parts of a particular site should and should not be crawled and indexed.
Robots.txt is a critical file for the success of any store. Unfortunately, neither Magento Community nor Magento Enterprise ships with a robots.txt file by default, so as a developer you have to take the pain of creating this file once.
Improving Performance Using Robots.txt in Magento
There are certain areas where a robots.txt file can help; the two primary reasons for using one are listed below:
- Robots.txt helps prevent duplicate-content issues, one of the primary requirements for SEO success.
- Robots.txt also helps you hide technical details about your site, e.g. error logs, SVN files, unwanted directories, etc. Since these are blocked by robots.txt, you are left with clean URLs to be indexed by search engines.
Set Up Robots.txt in Magento Like A Pro
Before you set up a robots.txt file, you should know that a robots.txt file covers only one domain at a time, so for multiple stores you have to create a separate robots.txt file for each store. Creating robots.txt is super simple, since it is nothing but a text file and can be created with any text editor: Dreamweaver, Notepad, Vim, or your favorite code editor.
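Because each domain needs its own file, multi-store Magento installations that share a single document root typically serve a per-store file with a web-server rewrite rule. Below is a hedged sketch for Apache's mod_rewrite; the domains and the `robots_storeone.txt` / `robots_storetwo.txt` file names are hypothetical placeholders, not Magento conventions:

```apache
# .htaccess in the Magento document root (assumes mod_rewrite is enabled)
RewriteEngine on

# Serve a store-specific robots file depending on the requested host
RewriteCond %{HTTP_HOST} ^www\.storeone\.com$ [NC]
RewriteRule ^robots\.txt$ robots_storeone.txt [L]

RewriteCond %{HTTP_HOST} ^www\.storetwo\.com$ [NC]
RewriteRule ^robots\.txt$ robots_storetwo.txt [L]
```

Each store's crawler then sees its own rules at /robots.txt, while the physical files can live side by side in the same root.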
Once you have created the robots.txt file, it must reside at the root of your site. For example, if your store domain is www.mystore.com, you should put the robots.txt file under the domain root, where you also have the app directory, so that it is accessed as www.mystore.com/robots.txt. Please note that search engines look for the robots.txt file directly under your store root, not under a directory, so keeping this file in any directory or sub-directory is not wise.
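Crawlers derive the robots.txt location from the scheme and host of a page URL alone and discard the path entirely, which is why the file must sit at the domain root. A minimal sketch of that derivation in Python (the www.mystore.com URL is the placeholder domain from the example above):

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would check for a given page.

    Only the scheme and host are kept; the page's path is discarded,
    which is why robots.txt must live at the domain root.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://www.mystore.com/catalog/product/view/id/42"))
# -> http://www.mystore.com/robots.txt
```

Whatever page a crawler starts from, the lookup always resolves to the same root-level file, so a robots.txt placed in a sub-directory is simply never consulted.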
Robots.txt for Magento
The following is a well-tested version of a robots.txt file which you can use; just edit the lines that do not apply to your store's setup.
#
# Robots.txt for Magento Community and Enterprise
#
# GENERAL SETTINGS
#
# Enable robots.txt rules for all crawlers
User-agent: *

# Crawl-delay parameter: the number of seconds to wait between successive
# requests to the same server. Set a crawl delay if your server has traffic
# problems. Please note that Google ignores the crawl-delay setting in
# robots.txt; you can set it in Google Webmaster Tools instead.
# Crawl-delay: 30

# Magento sitemap: URL to your sitemap file in Magento
Sitemap: http://www.mystore.com/sitemap/sitemap.xml

#
# DEVELOPMENT SETTINGS
#
# Do not index files and folders that are required only during development:
# CVS and SVN directories, and dump files
Disallow: /CVS
Disallow: /*.svn$
Disallow: /*.idea$
Disallow: /*.sql$
Disallow: /*.tgz$

#
# GENERAL MAGENTO SETTINGS
#
# Do not index the Magento admin pages
Disallow: /admin/

# Do not index the general technical Magento directories
Disallow: /app/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /lib/
Disallow: /pkginfo/
Disallow: /shell/
Disallow: /var/

# Do not index shared Magento files
Disallow: /api.php
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /get.php
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /README.txt
Disallow: /RELEASE_NOTES.txt

#
# MAGENTO SEO IMPROVEMENTS
#
# Do not index subcategory pages that are sorted or filtered
Disallow: /*?dir*
Disallow: /*?dir=desc
Disallow: /*?dir=asc
Disallow: /*?limit=all
Disallow: /*?mode*

# Do not index the second copy of the home page (example.com/index.php/).
# Uncomment only if you have activated Magento SEO URLs.
# Disallow: /index.php/

# Do not index links with session IDs
Disallow: /*?SID=

# Do not index checkout and user account pages
Disallow: /checkout/
Disallow: /onestepcheckout/
Disallow: /customer/
Disallow: /customer/account/
Disallow: /customer/account/login/

# Do not index search pages and SEO-unfriendly category links
Disallow: /catalogsearch/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/

#
# SERVER SETTINGS
#
# Do not index general technical directories and files on the server
Disallow: /cgi-bin/
Disallow: /cleanup.php
Disallow: /apc.php
Disallow: /memcache.php
Disallow: /phpinfo.php

#
# IMAGE INDEXING SETTINGS
#
# Optional: uncomment if you do not want Google and Bing to index your images
# User-agent: Googlebot-Image
# Disallow: /
# User-agent: msnbot-media
# Disallow: /
Tools for testing Robots.txt file
There are various tools available that can help you test your robots.txt file; here are a few popular ones:
- Google Webmaster Tools (includes a robots.txt testing tool)
- Bing Webmaster Tools
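Beyond the online validators, you can also sanity-check your rules locally before deploying them. Here is a minimal sketch using Python's built-in urllib.robotparser, fed a small subset of the Magento rules from the file above:

```python
from urllib.robotparser import RobotFileParser

# A subset of the Magento rules from the robots.txt above
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /checkout/
Disallow: /catalogsearch/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Blocked paths
print(parser.can_fetch("*", "/admin/"))                        # False
print(parser.can_fetch("*", "/catalogsearch/result/?q=shoes")) # False
# A normal category page is still crawlable
print(parser.can_fetch("*", "/furniture/living-room.html"))    # True
```

Note that the standard-library parser handles only plain prefix rules and ignores wildcard patterns such as /*?dir*, so treat this as a rough check and use the search engines' own testers for the wildcard rules.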
I hope this tutorial on robots.txt helps you set up and configure robots.txt in Magento like a pro. Please leave us a comment and let us know if you face any issues implementing this on your Magento setup.