Keys to a great robots.txt file.
I know this is a fairly dry search engine optimization subject, but it is important nonetheless. Setting up and managing the robots.txt file on your website could be the difference between getting ranked and not getting ranked. Below I will explain the function of the robots.txt file and give you some tips on making sure it is correct, error-free and serving its purpose.
What is a robots.txt file?
A robots.txt file is a file uploaded to your server that tells search engines which pages on your website they should not crawl or index, and restricts specific search engine robots that you do not want crawling your site. It is a very simple text file placed in the root folder of your web server so the search engines can find it, for example:
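Assuming a placeholder domain of example.com, the file would sit at:

```
http://www.example.com/robots.txt
```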
Use the Disallow directive properly.
The robots.txt file's purpose is to tell the search engines which pages NOT to index. Your file should be made up of Disallow directives only, unless there is a case where you want a search engine to reach a page inside a blocked subdirectory, for example:
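A minimal sketch, using placeholder directory and file names:

```
User-agent: *
Disallow: /private/
Allow: /private/public-page.html
```

Here everything in /private/ is blocked except the one page explicitly allowed.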
The search engines only check the robots.txt file to see what they are not supposed to do; there is no need for a long list of webpages you want indexed under Allow directives. They will find those pages naturally unless a Disallow directive says otherwise.
Know how spiders function.
The robots.txt file is read top to bottom, and if a search engine runs into an error in the file it may simply ignore the information below the error. It is therefore very important to make sure your robots.txt file has no errors. There are quite a few tools online that will check your file for errors; http://www.frobee.com/robots-txt-check is one example. The last thing you want is a simple file like robots.txt holding back your rankings because it was created with errors.
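As an illustration, a validator would flag a file like this one, where a directive is deliberately misspelled:

```
User-agent: *
# The misspelled directive below is deliberate; a crawler may ignore
# it, and a strict parser could skip everything after it as well.
Dissalow: /private/
Disallow: /tmp/
```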
Creating your robots.txt file.
Many people don’t realize how simple it is to create a robots.txt file. It can be created in WordPad, Notepad or any plain text editor you have on your computer. Start by creating an empty file and naming it “robots.txt”.
The first text that should appear in the file should read something like this:
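```
User-agent: *
```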
The * refers to all search engines. If you wanted to specify behavior for a specific search engine, you would use:
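```
User-agent: Googlebot
```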
This would tell Googlebot specifically that these are the rules for its crawling. In general, though, most websites address all the search engines at once with the * wildcard. Every page that you do not want the search engines to index or view should get a Disallow directive, for example:
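```
User-agent: *
# Placeholder paths -- substitute the directories and pages
# you actually want blocked.
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /private-page.html
```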
Or whatever pages you do not want crawled and indexed.
I hope this information has been helpful for your robots.txt file creation and optimization. Have fun creating your file, and make sure to always check it for errors when you are done creating or updating it.