I'm sure the term robots.txt is familiar to many of you, right? One afternoon, while I was chatting with a fellow blogger, my friend suddenly asked me about robots.txt. He is just learning to blog and is very curious about it: he often hears the term but still doesn't understand what it means, what it does, or how to use it. If you are in a similar situation, this article is perfect for you :).
Understanding robots.txt
Before we look at its functions and implement it on our website, we must first know what robots.txt is. Robots.txt is a simple text file that acts as a filter, or control, over how search engines crawl our website. A website contains many files, and with robots.txt we can tell search engine robots which ones they may crawl and index and which they may not.
Basically, a search engine robot is designed to index as much information from a website as it can. By writing these text commands, we keep the robots away from files that should not be indexed. So you could say that the function of robots.txt is to limit how freely crawler robots browse our website and to keep them out of certain files that they are not allowed to access.
Why is it important to use robots.txt?
There are several reasons, but two stand out. The first is to keep sensitive parts of our website out of search results. If we open all of our data to search engines, pranksters can find confidential files with a simple search, and we end up taking a big risk without knowing it. Keep in mind that robots.txt is only a set of instructions for well-behaved robots, and the file itself is publicly readable, so truly confidential data should also be protected on the server, for example with a password.
The second reason is that allowing search engines to crawl our site without restrictions can use up a large amount of bandwidth. As a result, our site can become slower and its performance can suffer.
At first glance, robots.txt is just a small file. However, as the two reasons above show, it has real benefits for website safety and SEO. If we give crawler robots no directions or prohibitions, they can browse our folders at will, and that is something we would rather avoid.
Robots.txt tags and syntax
Robots.txt works according to the commands we give it. Some of the commands that are often used in robots.txt include:
'User-agent: *' : indicates that the rules that follow apply to the robots of all search engines
'Disallow: /config/' : prohibits those robots from browsing the 'config' folder
'Disallow: /admin/' : prohibits those robots from browsing the 'admin' folder
'Allow: /' : the opposite of 'Disallow'; it allows those robots to browse the given folder, which here is the whole site
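The four commands above can be tried out without touching a live website. The following is only a minimal sketch using Python's standard urllib.robotparser module; the robot name 'ExampleBot' and the www.example.com addresses are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The four commands from the list above, written as a robots.txt file.
RULES = """\
User-agent: *
Disallow: /config/
Disallow: /admin/
Allow: /
"""

def build_parser(text):
    """Parse robots.txt text held in a string (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

parser = build_parser(RULES)

# 'ExampleBot' and www.example.com are hypothetical names for this sketch.
for path in ("/", "/blog/hello-world", "/admin/login.php", "/config/db.ini"):
    allowed = parser.can_fetch("ExampleBot", "https://www.example.com" + path)
    print(path, "allowed" if allowed else "blocked")
```

With these rules the first two paths are reported as allowed and the last two as blocked, which matches the descriptions in the list.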
Within your robots.txt you can create 'Disallow' and 'Allow' rules according to your needs. A sample robots.txt file appears later in this article, and you can edit it to suit the instructions you want to give search engines.
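If you prefer to assemble the rules from a list rather than type them by hand, a small script can do it. This is only a sketch: the function name build_robots_txt, its parameters, and the example.com sitemap address are assumptions made for illustration, not part of any standard tool.

```python
def build_robots_txt(groups, sitemap=None):
    """Build robots.txt text.

    groups  : list of (user_agent, [(directive, path), ...]) tuples
    sitemap : optional sitemap URL written at the top of the file
    (Hypothetical helper, written for this article only.)
    """
    lines = []
    if sitemap:
        lines.append(f"Sitemap: {sitemap}")
        lines.append("")
    for user_agent, rules in groups:
        lines.append(f"User-agent: {user_agent}")
        for directive, path in rules:
            lines.append(f"{directive}: {path}")
        lines.append("")  # blank line between groups
    return "\n".join(lines).rstrip() + "\n"

text = build_robots_txt(
    [("*", [("Disallow", "/config/"), ("Disallow", "/admin/"), ("Allow", "/")])],
    sitemap="https://www.example.com/sitemap.xml",  # placeholder address
)
print(text)
```

Keeping the rules in a list like this makes it easy to add or remove a 'Disallow' line later without retyping the whole file.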
Create an SEO Friendly robots.txt file
In order to create a robots.txt file, you must have access to the root of your domain. To get there, log in to your cPanel hosting account, go to 'File Manager', and select the root folder of your website. Once you are inside the public_html root folder, select 'New File' and name the file 'robots.txt'.
After you have created the file, select 'Edit' from the row of menus above it or by right-clicking the file. For the editing options, select utf-8 as the character encoding. Here is an example of a robots.txt file with a fairly SEO Friendly composition. You can of course edit it, adding or removing rules to your liking:
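The file has to sit in the domain root because robots look for it at one fixed address, /robots.txt, directly under the host name. A minimal sketch of that lookup follows; the helper name robots_url and the example.com addresses are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt address that applies to a given page URL.

    Keeps only the scheme and host, then points at /robots.txt.
    (Hypothetical helper, written for this article only.)
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("https://www.example.com/blog/my-first-post/?ref=home"))
# https://www.example.com/robots.txt
```

However deep the page is, the answer is always the same address, so a robots.txt saved inside a subfolder would simply never be read.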
Sitemap: https://www.niadzgn.com/sitemap_index.xml

# rules for all robots
User-agent: *
# disallow all files in these directories
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /cgi-bin/
Disallow: /wp-content/
Disallow: /archives/
# disallow any URL that contains a query string
Disallow: /*?*
Disallow: *?replytocom
Disallow: /author
Disallow: /comments/feed/
Disallow: */trackback/
Disallow: /wp-*

# AdSense robot
User-agent: Mediapartners-Google*
Allow: /

# image robot may read uploaded images
User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: AdsBot-Google
Allow: /

User-agent: Googlebot-Mobile
Allow: /
For the sitemap, please replace the address with your own sitemap address. When finished, select 'Save'. Done, now you have an SEO Friendly robots.txt file for your website.
Once you have this file, you can rest easy because Googlebot will crawl your website files in a controlled manner. That's all for this article. Hopefully it will be useful for blogger friends who are confused about robots.txt. Thanks for reading.
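Several lines in the sample file use the '*' wildcard, for example '/wp-*' and '/*?*'. Major search engines such as Google read '*' as "any sequence of characters" and a trailing '$' as "end of the address". The sketch below shows one way to check which addresses such a pattern would catch; the function names and test paths are made up for illustration, and real robots may differ in the details.

```python
import re

def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a compiled regular expression.

    '*' matches any run of characters; a trailing '$' anchors the end.
    (Hypothetical helper, written for this article only.)
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def matches(pattern, path):
    """True if the pattern applies to the path (matched from the start)."""
    return pattern_to_regex(pattern).match(path) is not None

# A few Disallow patterns taken from the sample file above.
DISALLOW = ["/wp-*", "/*?*", "*?replytocom", "/author", "*/trackback/"]

def is_blocked(path):
    return any(matches(pattern, path) for pattern in DISALLOW)

for path in ("/hello-world/", "/wp-login.php", "/hello-world/?replytocom=5",
             "/author/jane/", "/hello-world/trackback/"):
    print(path, "blocked" if is_blocked(path) else "allowed")
```

In this sketch only '/hello-world/' comes out as allowed. The other four paths are each caught by at least one of the patterns.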