ECommerce Insights Blog

Stay on top of it all and get ahead with useful articles, how-tos, tips and tricks on e-commerce.

How to best use the robots.txt file?

The robots.txt file is a plain-text file placed at the root of a website that lists paths the site owner asks search engine crawlers not to crawl. It signals to crawlers that those paths should be skipped, but it is purely advisory: it does not actually block access to anything. Any site owner can create a robots.txt file at the root of their domain.
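As a sketch of what such a file looks like, a minimal robots.txt (the paths here are hypothetical examples, not from any real site) might read:

```
User-agent: *
Disallow: /admin/
Disallow: /private/
```

Each `Disallow` line names a path prefix that compliant crawlers are asked to avoid. Note that the file itself is publicly readable at `/robots.txt`, which is exactly what the methods below take advantage of.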

A Disallow entry in robots.txt is a path that site owners use to keep a page out of search engine results; it does not protect the page from visitors. There are several ways a user can exploit a robots.txt file.

  • The first method is simply to browse to the disallowed directories and view the page source. Webmasters often leave comments there that hint at usernames and passwords, and such hints (a favorite football team, for example) can make a password easy to guess.
  • The second method is directory traversal. When a webmaster has denied access to a directory, a user can sometimes bypass the denial as follows:
       Start from the directory that returns the access denial.
       Append a non-existent directory name to the path.
       Then append a /../ to step back up one level.
       If the server normalises the path without re-checking the access rule, the resulting URL resolves back to the disallowed directory and grants access.
  • The third way to exploit a robots.txt file is the CGI-BIN exploit. This method applies only to websites that expose a CGI-BIN directory.
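Because robots.txt is served publicly, the first step of any of these methods is trivial: fetch the file and list its `Disallow` paths. A minimal sketch in Python (the sample file and its paths are hypothetical), with a note on why the `/../` trick in the second method works:

```python
import posixpath

def disallowed_paths(robots_txt):
    """Extract the Disallow paths from the body of a robots.txt file."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:
                paths.append(path)
    return paths

# A hypothetical robots.txt body, as fetched from http://example.com/robots.txt
sample = """User-agent: *
Disallow: /admin/
Disallow: /private/reports/
"""
print(disallowed_paths(sample))  # ['/admin/', '/private/reports/']

# The directory-traversal method relies on path normalisation: a server
# that normalises "/admin/nosuchdir/../" before (or instead of) re-checking
# its access rules resolves it back to the denied directory.
print(posixpath.normpath("/admin/nosuchdir/../"))  # /admin
```

This also illustrates the underlying problem: listing a path in robots.txt both fails to protect it and advertises its existence to anyone who reads the file.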

The above are the main methods by which a user can exploit the robots.txt file. Other methods have been suggested by webmasters, but these three stand out. Used against a carelessly configured site, they can give a visitor access to content the site owner intended to hide.