Dmytro Spilka
Robots.txt is a simple text file that tells search engine crawlers which parts of your website they may access. For example, you can restrict search engine bots from crawling your entire website or individual folders and pages.
It is vital for search marketers, web designers and developers to understand the basic principles and functions of the robots.txt file. Improper use of the robots.txt file can hurt your search rankings and the overall performance of the website, and a single mistake in it can put all of your search marketing efforts at risk. Hence, it is important to understand the fundamentals of how search engines work and how to configure a robots.txt file.
Please Note: Not every website has a robots.txt file set up, and this doesn’t mean that your site is in danger. Not having a robots.txt file simply means you are not blocking any bots from accessing any of your files.
Common Robots.txt Set-Ups:
Disallow:
Disallow: /
Disallow: /wp-admin/
Disallow: /myfolder/
Disallow: /myfile
Disallow: /
Disallow: /myfolder
Allow: /myfolder/myfile
Disallow:
Sitemap: https://www.mydomain.com/sitemap.xml
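To see how a few of these directives behave, here is a minimal sketch using Python's standard-library `urllib.robotparser`. Rules in a robots.txt file belong to a `User-agent` group, so the sketch assumes `User-agent: *` (all bots) in front of each directive; the helper name `is_allowed` and the sample paths are illustrative only.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Assumption: each set-up applies to all bots via "User-agent: *".
ALLOW_ALL = "User-agent: *\nDisallow:"
BLOCK_ALL = "User-agent: *\nDisallow: /"
BLOCK_FOLDER = "User-agent: *\nDisallow: /wp-admin/"

print(is_allowed(ALLOW_ALL, "/any/page"))                # True: empty Disallow blocks nothing
print(is_allowed(BLOCK_ALL, "/any/page"))                # False: "/" matches every path
print(is_allowed(BLOCK_FOLDER, "/wp-admin/options.php")) # False: inside the blocked folder
print(is_allowed(BLOCK_FOLDER, "/blog/my-post"))         # True: outside the blocked folder
```

Note that this parser applies the first matching rule in file order, while search engines such as Google pick the most specific matching rule, so results for files that mix `Allow` and `Disallow` on the same folder can differ.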
Do you need a Robots.txt file?
Remember what we’ve said above? If you don’t have a robots.txt file, that doesn’t mean you are in trouble. In fact, in many cases, you won’t even need one.
You may need a robots.txt file in the following scenarios:
*You want to block all or some search bots from accessing and crawling your site
*You want to block all or some search bots from accessing some of your folders or files (e.g. /wp-admin/ folder)
*You are using paid advertising links or affiliate links
*You are developing a new website and do not want it to be accessed and crawled by bots yet
Check if you have a robots.txt file:
1. Type in your site address (e.g. www.mywebsite.com)
2. Add “/robots.txt” at the end of your web address so that it will look like this: www.mywebsite.com/robots.txt
If you don’t have a file there, your site will usually return a 404 (page not found) error.
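The same check can be scripted. The sketch below builds the robots.txt address for a site and treats a 404 response as "no file"; the function names are hypothetical, and it assumes the site is served over HTTPS when no scheme is given.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def robots_url(site: str) -> str:
    """Build the robots.txt address for a site such as 'www.mywebsite.com'."""
    if "://" not in site:
        site = "https://" + site  # assumption: HTTPS when no scheme is given
    parts = urlsplit(site)
    # robots.txt always lives at the root of the host, never in a subfolder.
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def has_robots_txt(site: str, timeout: float = 10.0) -> bool:
    """Return True if the site serves a robots.txt file, False on a 404."""
    try:
        with urllib.request.urlopen(robots_url(site), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other errors (403, 500, ...) need a closer look
```

Calling `has_robots_txt("www.mywebsite.com")` performs a real network request, so run it only against a site you want to check.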
Audit your robots.txt file:
If you do have a robots.txt file, make sure that it doesn’t block anything you want search engines to reach.
Most websites want to allow bots. Use the examples above to check whether your robots.txt is blocking search bots.
You can test your robots.txt file in Google Search Console (formerly Google Webmaster Tools).
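A quick local audit is also possible. The following sketch, again based on Python's `urllib.robotparser`, lists which of your important pages a given bot would be blocked from; the function name `find_blocked`, the sample rules and the sample paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def find_blocked(robots_txt: str, important_paths: list[str],
                 agent: str = "Googlebot") -> list[str]:
    """Return the paths from `important_paths` that `agent` may not crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in important_paths if not parser.can_fetch(agent, p)]


# Hypothetical robots.txt content and pages that should stay crawlable.
sample = """User-agent: *
Disallow: /wp-admin/
Disallow: /myfolder/
"""
pages = ["/", "/blog/", "/myfolder/page.html"]

print(find_blocked(sample, pages))  # ['/myfolder/page.html']
```

An empty list means none of the listed pages is blocked for that bot under this parser's interpretation; it does not replace testing in Search Console.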
How to add a Robots.txt file?
Robots.txt is a plain text file that you can create in any text editor, such as Notepad, and then upload to the root folder of your website.
If you are using the Yoast plugin on your WordPress website, simply go to Yoast > Tools > File Editor.
Edit your file and then click save. If you don’t have a robots.txt file yet, you can also create one with Yoast.
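If you prefer to generate the file rather than type it by hand, a small script can do it. This is only a sketch: `build_robots_txt` and `save_robots_txt` are hypothetical helpers, and the output still has to be uploaded to the root of your site yourself.

```python
from pathlib import Path


def build_robots_txt(disallow, user_agent="*", sitemap=None):
    """Build robots.txt text for one user-agent group.

    `disallow` is a list of paths to block; an empty list blocks nothing.
    """
    lines = [f"User-agent: {user_agent}"]
    if disallow:
        lines += [f"Disallow: {path}" for path in disallow]
    else:
        lines.append("Disallow:")  # empty Disallow = allow everything
    if sitemap:
        lines += ["", f"Sitemap: {sitemap}"]
    return "\n".join(lines) + "\n"


def save_robots_txt(content, folder="."):
    """Write the content to <folder>/robots.txt and return the file path."""
    target = Path(folder) / "robots.txt"
    target.write_text(content, encoding="utf-8")
    return target


content = build_robots_txt(
    ["/wp-admin/"], sitemap="https://www.mydomain.com/sitemap.xml"
)
print(content)
```

Call `save_robots_txt(content)` to write the file locally before uploading it.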
List of the most popular User-agents/search bots:
Search Engine | User-agent |
---|---|
Google | Googlebot |
Googlebot News | Googlebot-News |
Googlebot Images | Googlebot-Image |
Googlebot Video | Googlebot-Video |
Google Mobile | Googlebot-Mobile |
Bing | Bingbot/MSNBot |
Yandex | YandexBot |
Baidu | Baiduspider |
Ask.com | AskJeeves |
Duck Duck Go | DuckDuckBot |
Yahoo | Slurp |
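These names are what you put after `User-agent:` to target one bot instead of all of them. As a rough sketch, the hypothetical robots.txt below blocks Bingbot from the whole site while leaving every other bot unrestricted, and Python's `urllib.robotparser` is used to confirm the effect; the helper name `crawl_report` is an assumption for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block Bingbot everywhere, allow all other bots.
robots_txt = """User-agent: Bingbot
Disallow: /

User-agent: *
Disallow:
"""


def crawl_report(robots_txt: str, agents: list[str], path: str = "/") -> dict[str, bool]:
    """Map each user-agent name to whether it may crawl `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


report = crawl_report(robots_txt, ["Googlebot", "Bingbot", "Baiduspider"], "/my-page")
print(report)  # {'Googlebot': True, 'Bingbot': False, 'Baiduspider': True}
```

Bots that have no group of their own fall back to the `User-agent: *` group.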