
How to fix Robots.txt test

Robots.txt is a text file that tells search engine bots whether or not they may crawl your website's content. Before crawling a site, a search engine robot checks this file to see if it has permission to do so.

(When a search engine bot starts crawling your domain, it first looks for robots.txt. If the file is found, the bot follows the instructions mentioned in it; if robots.txt is not available, the bot will crawl the entire domain, including admin URLs.)
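As a quick illustration (not from the original post), the sketch below uses Python's standard urllib.robotparser module to perform the same check a well-behaved crawler does; the domain and URLs are placeholders:

from urllib import robotparser

# Hypothetical domain, used only for illustration.
rp = robotparser.RobotFileParser()
rp.set_url("https://www.yourdomain.com/robots.txt")
rp.read()  # fetch and parse the live robots.txt

# A compliant crawler asks before fetching each URL.
print(rp.can_fetch("Googlebot", "https://www.yourdomain.com/some-page/"))
print(rp.can_fetch("Googlebot", "https://www.yourdomain.com/wp-admin/"))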

These instructions are written in a plain text editor such as Notepad and saved as a .txt file. The file looks something like this:

User-agent: * 
Allow: /
Disallow: /cgi-bin/
Disallow: /wp-admin
Sitemap: https://www.yourdomain.com/sitemap.xml

To solve the Robots.txt test error, simply follow two steps:

  • Create a robots.txt file (test the code suggested below).
  • Add the file in the cPanel –> public folder of the domain (a quick way to verify the upload is shown just after these steps).
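Once uploaded, the file must be reachable at the root of your domain (yourdomain.com/robots.txt). A minimal Python sketch to verify this, using a placeholder domain:

import urllib.request

# Hypothetical domain; replace with your own.
with urllib.request.urlopen("https://www.yourdomain.com/robots.txt") as response:
    print(response.status)           # 200 means the file is being served
    print(response.read().decode())  # should print your robots.txt rules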

How to create robots.txt using Google Webmaster?

  • Open webmaster.google.com and log in.
  • Open Old version –> Crawl –> robots.txt Tester, or go directly to https://www.google.com/webmasters/tools/robots-testing-tool
  • Paste the code you created for your domain and click the ‘Test’ button. You can also test a specific URL to check whether it is blocked by the code or not.

You can also download the robots.txt file from the Google Webmaster panel itself by clicking ‘Submit’.

In this simple code, each directive has its own meaning:

  • User-agent defines the search engine. The * sign means all search engines by default.
    For a specific search engine it can be written like:
    # Only for Google: “User-agent: Googlebot”
    # Only for Bing: “User-agent: Bingbot”
  • Allow: / means all search engine robots are allowed to crawl all the website's directories/pages (links).
  • Disallow tells the bot(s) not to crawl the listed directorie(s) or page(s). Disallow: /cgi-bin/ means the robots are not allowed to visit the cgi files. Disallow: /wp-admin means the robots are not allowed to crawl the WordPress admin URLs, so those pages will not show in SERPs. (A sketch of how a crawler interprets these rules follows this list.)
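For illustration, here is a minimal Python sketch showing how a parser interprets the Disallow rules above; the tested paths are placeholders, and the Allow: / line is left implicit because anything not disallowed is allowed by default:

from urllib import robotparser

rules = """User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "/blog/some-post/"))       # True  (not disallowed)
print(rp.can_fetch("*", "/wp-admin/options.php"))  # False (matches Disallow: /wp-admin)
print(rp.can_fetch("*", "/cgi-bin/test.cgi"))      # False (matches Disallow: /cgi-bin/)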

These URLs can be different in your case, depending on the platform you are using to build your website.

Note: some search engines (bad bots) can skip your robots.txt and crawl your entire domain, but the most popular search engines like Google, Bing, etc. respect it.
