Technical SEO: The Ultimate Beginner’s Guide to Robots.txt

July 27th, 2016. Posted by Solvid.


Robots.txt is a simple text file that tells search crawlers which parts of your website they may access. For example, you can restrict search engine bots from accessing your entire website or only individual sections or pages of it.

It is vital for search marketers, web designers and developers to understand the basic principles and functions of the robots.txt file. Improper use of a robots.txt file can have an adverse effect on your search rankings and the overall performance of the website. A single mistake in the file can put all of your search marketing efforts in danger. Hence, it is important to understand the fundamentals of how search engines work and how to configure a robots.txt file.

Please note: Not every website has a robots.txt file set up, and not having one doesn’t mean that your site is in danger. It simply means you are not blocking any bots from accessing any of your files.

Common Robots.txt Set-Ups:

User-agent: *
Disallow:
Allows all search bots to access and crawl the entire website.

User-agent: *
Disallow: /
Blocks all search crawlers from accessing your site.

User-agent: *
Disallow: /wp-admin/
Standard set-up for websites that use the WordPress CMS. We’re blocking /wp-admin/ because we don’t want search bots to try to access our backend. The first set-up is also valid, but with it search bots may try to access your WP-Admin.

User-agent: *
Disallow: /myfolder/
Blocks search bots from accessing a specific folder on the site.

User-agent: *
Disallow: /myfile
Blocks search bots from accessing a particular file. Note that the rule matches any path that begins with /myfile.

User-agent: Googlebot
Disallow: /
Blocks a specific search bot (Google, in this case) from accessing the site.

User-agent: *
Disallow: /myfolder
Allow: /myfolder/myfile
Blocks bots from accessing “myfolder”, but still allows them to access “myfile”, even though it is located inside the blocked “myfolder”.

User-agent: *
Disallow:
Sitemap: https://www.mydomain.com/sitemap.xml
Allows all bots and adds the location of the sitemap.xml file to the robots.txt file. It’s recommended to add a line with the location of your sitemap.xml to your robots.txt file, as this makes it easier for search engines to find, crawl and index all of your pages.
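
To sanity-check how set-ups like these behave, one option is Python’s standard-library urllib.robotparser module. The sketch below is only an illustration: the parse_rules helper and the example paths are our own assumptions, and Python’s parser is not guaranteed to interpret every rule exactly the way each search engine does.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(robots_txt: str) -> RobotFileParser:
    """Hypothetical helper: load robots.txt rules from a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Three of the set-ups listed above, plus the sitemap variant.
allow_all = parse_rules("User-agent: *\nDisallow:")
block_all = parse_rules("User-agent: *\nDisallow: /")
block_wp_admin = parse_rules("User-agent: *\nDisallow: /wp-admin/")
with_sitemap = parse_rules(
    "User-agent: *\nDisallow:\nSitemap: https://www.mydomain.com/sitemap.xml"
)

# can_fetch(user_agent, url) answers: may this bot crawl this URL?
print(allow_all.can_fetch("Googlebot", "/any-page"))
print(block_all.can_fetch("Googlebot", "/any-page"))
print(block_wp_admin.can_fetch("Googlebot", "/wp-admin/index.php"))
print(block_wp_admin.can_fetch("Googlebot", "/blog/"))

# site_maps() needs Python 3.8 or newer.
print(with_sitemap.site_maps())
```

Running it should show that only the "Disallow: /" and "/wp-admin/" rules turn a request away, and that the sitemap URL is picked up from the file.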

Do you need a Robots.txt file?

As noted above, not having a robots.txt file doesn’t mean you are in trouble. In fact, in many cases you won’t even need one.

You may need a robots.txt file in the following scenarios:

* You want to block all or some search bots from accessing and crawling your site
* You want to block all or some search bots from accessing some of your folders or files (e.g. the /wp-admin/ folder)
* You are using paid advertising links or affiliate links
* You are developing a new website and do not want it to be accessed and crawled by bots yet

Check if you have a robots.txt file:

1. Type in your site address (e.g. www.mywebsite.com)
2. Add “/robots.txt” at the end of your web address so that it will look like this: www.mywebsite.com/robots.txt

If you don’t have a file there, your site will usually return a 404 error page.
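
The same check can be scripted. The following is a minimal sketch using only Python’s standard library; the function names robots_url and has_robots_txt are made up for this example, and it assumes the site is served over HTTPS when no scheme is given.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def robots_url(site_address: str) -> str:
    """Build the robots.txt address for a site (hypothetical helper)."""
    if "://" not in site_address:
        # Assumption: default to HTTPS when the address has no scheme.
        site_address = "https://" + site_address
    parts = urlsplit(site_address)
    # robots.txt always sits at the root of the host, whatever page was given.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def has_robots_txt(site_address: str, timeout: float = 10.0) -> bool:
    """Return True if the site serves a robots.txt file, False on a 404."""
    try:
        with urllib.request.urlopen(robots_url(site_address), timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # other errors (403, 500, ...) need a closer look


print(robots_url("www.mywebsite.com"))
```

Calling has_robots_txt("www.mywebsite.com") performs the actual request, so it needs network access and a real domain in place of the placeholder.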

Audit your robots.txt file:

If you do have a robots.txt file, make sure that it doesn’t block anything you want crawled.
Most websites want to allow bots, so use the examples above to check whether your robots.txt is blocking search bots.
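
A quick audit can also be sketched in Python with the standard-library urllib.robotparser module. The find_blocked helper, the sample rules and the list of important pages below are assumptions made for this example. Bear in mind that Python’s parser applies rules in the order they appear in the file, whereas Google uses the most specific matching rule, so results for files that mix Allow and Disallow can differ.

```python
from urllib.robotparser import RobotFileParser


def find_blocked(robots_txt: str, paths: list[str], user_agent: str = "Googlebot") -> list[str]:
    """Return the paths that the given bot is NOT allowed to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]


# Sample file contents; paste in your own robots.txt here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /myfolder/
"""

# Hypothetical pages that should stay open to search bots.
important_pages = ["/", "/blog/", "/myfolder/landing-page"]

blocked = find_blocked(ROBOTS_TXT, important_pages)
print(blocked)
```

Any path that shows up in the printed list is being blocked and deserves a second look.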

You can test your robots.txt file within Google Search Console (formerly Google Webmaster Tools).

How to add Robots.txt?

Robots.txt is a simple text file that you can create in a plain-text editor such as Notepad and then upload to the root directory of your website.
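
Because the file is plain text, it can also be generated by a short script instead of being typed by hand. This is a sketch only: build_robots_txt and save_robots_txt are hypothetical helpers, and the paths and sitemap URL are the placeholder values used earlier in this guide.

```python
from pathlib import Path
from typing import Optional


def build_robots_txt(
    disallowed_paths: list[str],
    sitemap_url: Optional[str] = None,
    user_agent: str = "*",
) -> str:
    """Assemble the text of a simple, single-group robots.txt file."""
    lines = [f"User-agent: {user_agent}"]
    if disallowed_paths:
        lines += [f"Disallow: {path}" for path in disallowed_paths]
    else:
        # An empty Disallow line means nothing is blocked.
        lines.append("Disallow:")
    if sitemap_url:
        lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


def save_robots_txt(content: str, directory: str) -> Path:
    """Write the content to <directory>/robots.txt, ready to upload."""
    path = Path(directory) / "robots.txt"
    path.write_text(content, encoding="utf-8")
    return path


content = build_robots_txt(["/wp-admin/"], "https://www.mydomain.com/sitemap.xml")
print(content)
```

The printed text is exactly what would go into the file; save_robots_txt(content, ".") would write it to the current folder.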


If you are using the Yoast plugin on your WordPress website, simply go to Yoast > Tools > File Editor.

Edit your file and then click save. If you don’t have a robots.txt file set up yet, you can also create one with Yoast.

List of the most popular User-agents/search bots:

Search Engine        User-agent name
Google               Googlebot
Googlebot News       Googlebot-News
Googlebot Images     Googlebot-Image
Googlebot Video      Googlebot-Video
Google Mobile        Googlebot-Mobile
Bing                 Bingbot/MSNBot
Yandex               YandexBot
Baidu                Baiduspider
Ask.com              AskJeeves
Duck Duck Go         DuckDuckBot
Yahoo                Slurp
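
These names are what goes on the User-agent line when a rule should apply to one bot only. As a rough illustration, the sketch below uses Python’s standard-library urllib.robotparser to show a file that blocks only Googlebot-Image from a folder; the /photos/ folder and the crawl_permissions helper are invented for this example.

```python
from urllib.robotparser import RobotFileParser

# One group for Googlebot-Image, one catch-all group for everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /photos/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawl_permissions(path: str, user_agents: list[str]) -> dict[str, bool]:
    """Map each bot name to whether it may crawl the given path."""
    return {agent: parser.can_fetch(agent, path) for agent in user_agents}


permissions = crawl_permissions(
    "/photos/cat.jpg", ["Googlebot-Image", "Googlebot", "Bingbot"]
)
for agent, allowed in permissions.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Here only Googlebot-Image is reported as blocked, while the other bots fall through to the catch-all group.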

Dmytro Spilka
Head Wizard at Solvid & Founder of Solvid Online Tools. Contributor for The Huffington Post, The Next Web, SEMRush, Lifehack, MyCustomer and more.
