WHAT IS A ROBOTS.TXT FILE?



Robots.txt is a text file with instructions for search engine crawlers. It defines which areas of a website crawlers are allowed to search. However, these areas are not explicitly named in the robots.txt file. Rather, certain areas are disallowed from being searched. Using this simple text file, you can easily exclude entire domains, complete directories, one or more subdirectories, or individual files from search engine crawling. However, this file does not protect against unauthorized access.
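
For example, a minimal robots.txt with such exclusions might look like the sketch below (the directory and file names are placeholders, not values any site requires):

    # applies to all crawlers
    User-agent: *
    # exclude a complete directory
    Disallow: /admin/
    # exclude a single file
    Disallow: /private/draft.html

A Disallow line with an empty value allows everything, while Disallow: / excludes the entire domain.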

Robots.txt is stored in the root directory of a domain. Thus it is the first document that crawlers open when visiting your site. However, the file does not only control crawling. You can also include a link to your sitemap, which gives search engine crawlers an overview of all existing URLs of your domain.
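
A sitemap reference is a single additional line; the URL below is a placeholder for your own domain:

    Sitemap: https://www.example.com/sitemap.xml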
 



How robots.txt works

In 1994, a protocol called REP (Robots Exclusion Standard Protocol) was published. This protocol stipulates that all search engine crawlers (user-agents) must first look for the robots.txt file in the root directory of your site and read the instructions it contains. Only then can robots start indexing your web page. The file must be located directly in the root directory of your domain and must be written in lower case, because robots read the robots.txt file and its instructions case-sensitively. Unfortunately, not all search engine robots follow these rules. At least the file works with the most important search engines like Bing, Yahoo, and Google. Their search robots strictly follow the REP and robots.txt instructions.
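
In practice, instructions are grouped by user-agent. The sketch below uses Googlebot and Bingbot, the actual user-agent tokens of Google's and Bing's crawlers; the blocked paths are hypothetical examples:

    # rules read only by Google's crawler
    User-agent: Googlebot
    Disallow: /test/

    # rules read only by Bing's crawler
    User-agent: Bingbot
    Disallow: /beta/

    # fallback rules for all other crawlers
    User-agent: *
    Disallow: /internal/

Each crawler obeys the group with the most specific match for its name and ignores the other groups.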

In practice, robots.txt can be used for different types of files. If you use it for image files, it prevents these files from appearing in the Google search results. Unimportant resource files, such as script, style, and image files, can also be blocked easily with robots.txt. In addition, you can exclude dynamically generated web pages from crawling using appropriate commands. For example, result pages of an internal search function, pages with session IDs, or user actions such as shopping carts can be blocked. You can also control crawler access to other non-image files (web pages) by using the text file.
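
The following sketch combines these cases. Googlebot-Image is Google's real image crawler and the * wildcard is an extension supported by the major engines, but the paths and the sessionid parameter are made-up examples:

    # keep all images out of Google's image search
    User-agent: Googlebot-Image
    Disallow: /

    # rules for all other crawlers
    User-agent: *
    # block unimportant resource files
    Disallow: /scripts/
    # block internal search result pages
    Disallow: /search
    # block URLs that carry a session ID (wildcard syntax)
    Disallow: /*?sessionid=
    # block shopping cart pages
    Disallow: /cart/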
