How to use the robots.txt file
We must first understand what a robots.txt file is.
When search engines come to your site, the first thing they look for is the robots.txt file.
Behavior differs between some search engines, but this rule is generally true.
This file tells search engines which parts of your site to index and which to skip.
It can also point to your XML sitemap.
After reading the file, the search engine sends a bot (also called a robot or spider) to crawl your site according to the rules in robots.txt.
Google bots are called Googlebot, and Bing bots are called Bingbot.
Alexa, Lycos, Ask, and others also have their own bots.
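As a rough illustration of how a bot reads these rules, the sketch below uses Python's standard urllib.robotparser module. The robots.txt content, the /private/ path, and the sitemap URL are made up for the example and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and the sitemap URL are assumptions.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /tmp/

Sitemap: http://www.yoursitesdomain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from memory, no network access

base = "http://www.yoursitesdomain.com"

# Googlebot has its own group, so the /private/ rule applies to it.
googlebot_private = parser.can_fetch("Googlebot", base + "/private/page.html")

# Bingbot has no group of its own and falls back to the "*" group.
bingbot_private = parser.can_fetch("Bingbot", base + "/private/page.html")
bingbot_tmp = parser.can_fetch("Bingbot", base + "/tmp/cache.html")

# Sitemap lines are collected separately (site_maps() needs Python 3.8+).
sitemaps = parser.site_maps()

print(googlebot_private, bingbot_private, bingbot_tmp, sitemaps)
```

Each bot looks for the group that matches its own name first and only uses the "*" group when no such group exists.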
Most bots belong to search engines, although some sites send their own bots for various reasons.
For example, some services ask you to put a special code on your website to verify it.
They then send their bot to check whether you placed the verification code on your site.
Where is the robots.txt file located?
The robots.txt file belongs in the document root folder, which is usually the same as public_html.
You can create a blank file and name it robots.txt.
This reduces the errors on your site, since bots no longer get a "not found" response when they request the file, and it can help your site's rank.
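The step above can be sketched in Python as follows. The helper name create_blank_robots is hypothetical, and a temporary directory stands in for the real document root, so nothing on an actual server is touched. The last lines check, with the standard urllib.robotparser module, that a blank file blocks nothing.

```python
from pathlib import Path
from tempfile import TemporaryDirectory
from urllib.robotparser import RobotFileParser


def create_blank_robots(document_root: Path) -> Path:
    """Create an empty robots.txt in document_root if it does not exist yet."""
    robots_path = document_root / "robots.txt"
    robots_path.touch(exist_ok=True)  # leaves an existing file unchanged
    return robots_path


# A temporary directory stands in for the document root (for example public_html).
with TemporaryDirectory() as tmp:
    robots_path = create_blank_robots(Path(tmp))
    content = robots_path.read_text()

    # A blank robots.txt contains no rules, so every URL stays crawlable.
    parser = RobotFileParser()
    parser.parse(content.splitlines())
    blank_allows_all = parser.can_fetch(
        "Googlebot", "http://www.yoursitesdomain.com/index.html"
    )

print(robots_path.name, repr(content), blank_allows_all)
```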
Blocking Robots and Search Engines
If you want to prevent search engines from visiting your site and ranking it, you can use the following code.
# Code to not allow any search engines!
User-agent: *
Disallow: /
You can also prevent robots from crawling only a specific part of your site.
In the following example, we do not want bots and search engines to crawl the cgi-bin, tmp, and junk folders.
# Blocks robots from specific folders / directories
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
In the example above, http://www.yoursitesdomain.com/junk/index.html is not crawled by search engines, but http://www.yoursitesdomain.com/index.html, for example, may be crawled.
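Both rule sets above can be checked locally before uploading them. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name build_parser is hypothetical, and the parser only approximates how a real search engine bot interprets the file.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Rule set that blocks every bot from the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Rule set that blocks every bot from three folders only.
BLOCK_FOLDERS = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""

site = "http://www.yoursitesdomain.com"
block_all = build_parser(BLOCK_ALL)
block_folders = build_parser(BLOCK_FOLDERS)

results = {
    "block_all /index.html": block_all.can_fetch("Googlebot", site + "/index.html"),
    "block_folders /junk/index.html": block_folders.can_fetch("Googlebot", site + "/junk/index.html"),
    "block_folders /index.html": block_folders.can_fetch("Googlebot", site + "/index.html"),
}

for label, allowed in results.items():
    print(label, "->", "allowed" if allowed else "blocked")
```

Note that the paths are written without spaces (/junk/, not / junk /), since each Disallow value is matched against the beginning of the URL path.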
Google and the Bing Network
Today, the Google and Bing search engines do not rely much on the robots.txt file to decide how soon they visit your site.
To reduce the delay before the Google and Bing bots visit, it is better to create a webmaster account with Google and with Bing and register your site's domain there.
In this case, you have the least delay in visits to your site by these two companies' robots.
Tip: If you want to reduce your site traffic by blocking crawlers such as Yandex or Baidu, which are not actually used in Iran, you must do this through the .htaccess file.
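One possible shape for such .htaccess rules is sketched below as a small Python helper that prints them. It assumes an Apache server with mod_rewrite enabled; the helper name build_htaccess_block is hypothetical, and the user-agent tokens YandexBot and Baiduspider are assumptions about how these crawlers identify themselves, so check your own access logs before relying on them.

```python
import re


def build_htaccess_block(bot_names):
    """Return mod_rewrite rules that answer 403 Forbidden to the given user agents."""
    # Escape each name so it is matched literally inside the regular expression.
    pattern = "|".join(re.escape(name) for name in bot_names)
    return "\n".join([
        "RewriteEngine On",
        # [NC] makes the user-agent match case-insensitive.
        f"RewriteCond %{{HTTP_USER_AGENT}} ({pattern}) [NC]",
        # "-" means no rewrite; [F] returns 403 Forbidden, [L] stops rule processing.
        "RewriteRule .* - [F,L]",
    ])


rules = build_htaccess_block(["YandexBot", "Baiduspider"])
print(rules)
```

The printed lines would go into the .htaccess file in the document root. Unlike robots.txt, which bots follow voluntarily, these rules are enforced by the server itself.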